Understanding d-separation

Yu Yang
5 min read · Feb 5, 2021


D-separation is a critical idea in Bayesian networks and causal inference. The question it addresses is: given a causal graph G, is a set of variables X independent of another set Y, given a third set Z?

At first sight, it may look intimidating. However, with a few examples, you should be able to understand it better. In this article, I mainly focus on the application side and will cover:

  1. three rules to check d-separation in the corresponding scenarios;
  2. one step-by-step algorithm to check d-separation in general;
  3. how to use R package bnlearn to check d-separation.

Three Scenarios and Three Rules

If two nodes are not d-connected, then they are d-separated. It is easier to understand d-connectedness than d-separation, so the following rules are stated in terms of d-connectedness. There is one concept you need to familiarize yourself with first.

  • Collider: if a node has two arrows pointing to it, then it is a collider. For example, in x → y ← z, y is a collider.
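
To make the collider idea concrete, here is a minimal sketch in R with the bnlearn package (assuming it is installed, and that dsep()'s conditioning set defaults to the empty set when omitted; the node names a, b, c are my own):

library(bnlearn)

# v-structure a -> b <- c, written in bnlearn's model-string syntax
g <- model2network("[a][c][b|a:c]")

dsep(g, x = "a", y = "c")           # TRUE: a and c are marginally d-separated (b is a collider)
dsep(g, x = "a", y = "c", z = "b")  # FALSE: conditioning on the collider d-connects a and c

Notice the reversal: a collider blocks a path by default but opens it once you condition on it. The rules below spell this out.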

Scenario 1:

Z is empty, namely, we don’t condition on any variables.

Rule 1: if there exists a path from x to y and there is no collider on the path, then x and y are d-connected.

Example 1: Consider the following graph.

Image Source: http://bayes.cs.ucla.edu/BOOK-2K/d-sep.html
  1. t is a collider, since there are two arrows pointing to it (s → t ← u).
  2. x and t are d-connected, since there is no collider along the path.
  3. x and u are not d-connected, and hence are d-separated, since the collider t lies along the path. The same holds for (x, v) and (x, y).
  4. u and y are d-connected, since there is no collider along the path u-v-y. Note that v has two arrows emerging from it, not pointing to it, so it is not a collider.
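
If you want to verify these conclusions programmatically, here is a hedged sketch in R with bnlearn. The model string is my reading of the figure based on the description above (x → r → s → t ← u ← v → y, plus a child p of t used in Example 3); if the actual figure differs, adjust the string accordingly.

library(bnlearn)

# One reading of the example graph: x -> r -> s -> t <- u <- v -> y, plus t -> p
dag <- model2network("[x][v][r|x][s|r][u|v][t|s:u][y|v][p|t]")

dsep(dag, x = "x", y = "t")   # FALSE: d-connected, no collider on x - r - s - t
dsep(dag, x = "x", y = "u")   # TRUE:  d-separated, the collider t blocks the path
dsep(dag, x = "u", y = "y")   # FALSE: d-connected via u - v - y (v is a fork, not a collider)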

Scenario 2:

Z is non-empty, and no collider on the path belongs to Z or has a descendant (child, grandchild, …) in Z.

Rule 2: if there exists a path from x to y and none of the nodes on the path belongs to Z, then x and y are d-connected.

Example 2: Consider the following graph, where Z = {r, v}.

Image Source: http://bayes.cs.ucla.edu/BOOK-2K/d-sep.html
  1. t is a collider, but t doesn’t belong to Z and has no descendants in Z, so Rule 2 applies to this case.
  2. Only the pairs (s, t) and (t, u) are d-connected. All the other pairs are d-separated, as either r or v falls along the path between them.
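
Reusing the dag object from the sketch in Example 1 (under the same assumptions about the figure), the calls below reproduce this reasoning:

Z <- c("r", "v")

dsep(dag, x = "s", y = "t", z = Z)   # FALSE: s and t remain d-connected
dsep(dag, x = "t", y = "u", z = Z)   # FALSE: t and u remain d-connected
dsep(dag, x = "x", y = "s", z = Z)   # TRUE:  blocked by r
dsep(dag, x = "u", y = "y", z = Z)   # TRUE:  blocked by v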

Scenario 3:

Z is non-empty, and some colliders are either inside Z or have descendants in Z.

Rule 3: colliders that fall inside Z, or that have descendants in Z, are no longer treated as colliders (they no longer block the path).

Example 3: Consider the following graph, where Z = {r, p}.

Image Source: http://bayes.cs.ucla.edu/BOOK-2K/d-sep.html
  1. There is one collider, t, in the graph. Its child p falls inside Z, so Rule 3 applies to this case. Therefore, t is no longer treated as a collider, and hence s and u are d-connected; so are (s, v), (s, q), and (s, y).
  2. Note that x is not d-connected to any other node (by Rule 2), since r lies along all possible paths linking to x.
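
With the same assumed dag as before, conditioning on Z = {r, p} opens the collider t:

Z <- c("r", "p")

dsep(dag, x = "s", y = "u", z = Z)   # FALSE: t's child p is in Z, so the collider no longer blocks
dsep(dag, x = "s", y = "y", z = Z)   # FALSE: the whole path s - t - u - v - y is now open
dsep(dag, x = "x", y = "u", z = Z)   # TRUE:  r still blocks every path out of x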

A step-by-step algorithm

Above, we checked d-separation by examining the paths in the causal graph and matching them to specific scenarios. Now, let’s take a look at an algorithm that reveals connectedness in a step-by-step manner. It is very easy to follow, so let’s begin!

  1. Ancestral Graph: select the subgraph that contains all the nodes in X, Y, and Z, as well as their ancestors (i.e., parents, parents’ parents, …), together with the edges among these nodes.
  2. Moralization: moralize the subgraph, namely, if x and y are both parents of z, then draw an edge connecting x and y. You can think of it as a marriage between the parents.
  3. Disorientation: remove the direction of the edges, namely, turn the directed graph into an undirected graph.
  4. Givens Removal: remove all the nodes in Z and the associated edges.
  5. Connectedness: if you can find a path between x and y (for example, x — y or x — u — y), then these two nodes are d-connected; otherwise, they are d-separated.
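
To make these five steps concrete, here is a base-R sketch of the procedure (the function and variable names are my own, and the DAG is passed as a named 0/1 adjacency matrix A where A[i, j] = 1 means an edge i → j):

d_separated <- function(A, X, Y, Z = character(0)) {
  nodes <- rownames(A)

  # 1. Ancestral graph: keep X, Y, Z and all of their ancestors.
  keep <- unique(c(X, Y, Z))
  repeat {
    parents <- nodes[rowSums(A[, keep, drop = FALSE]) > 0]
    grown   <- unique(c(keep, parents))
    if (length(grown) == length(keep)) break
    keep <- grown
  }
  A <- A[keep, keep, drop = FALSE]

  # 2. Moralization: "marry" every pair of parents that share a child.
  M <- A
  for (child in colnames(A)) {
    pa <- rownames(A)[A[, child] == 1]
    if (length(pa) > 1) M[pa, pa] <- 1
  }

  # 3. Disorientation: drop edge directions (make the adjacency symmetric).
  U <- pmax(M, t(M))
  diag(U) <- 0

  # 4. Givens removal: delete the conditioning nodes Z and their edges.
  rest <- setdiff(rownames(U), Z)
  U <- U[rest, rest, drop = FALSE]

  # 5. Connectedness: breadth-first search from X; if Y is never reached,
  #    X and Y are d-separated given Z.
  visited <- frontier <- intersect(X, rest)
  while (length(frontier) > 0) {
    nbrs     <- rownames(U)[rowSums(U[, frontier, drop = FALSE]) > 0]
    frontier <- setdiff(nbrs, visited)
    visited  <- union(visited, frontier)
  }
  length(intersect(visited, Y)) == 0
}

# Tiny usage example on the chain a -> b -> c
A <- matrix(0, 3, 3, dimnames = list(c("a", "b", "c"), c("a", "b", "c")))
A["a", "b"] <- 1
A["b", "c"] <- 1

d_separated(A, "a", "c", "b")   # TRUE:  b blocks the only path
d_separated(A, "a", "c")        # FALSE: a and c are d-connected marginally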

I highly recommend you walk through the example illustrated in the MIT 6.034 handout (linked below), which provides a detailed, step-by-step graphical explanation. Here are some key takeaways from that example.

http://web.mit.edu/jmn/www/6.034/d-separation.pdf
  • Are A and B conditionally independent, given a set of variables Z? <=> P(A|B, Z) =? P(A|Z)
  • Are A and B marginally independent? <=> P(A|B, {}) =? P(A|{}). (Here, the Z set is an empty set. So in the 4th step, we don’t need to remove any nodes.)
  • P(D|CEG) =? P(D|C) <=> P(D|EC) =? P(D|C) && P(D|GC) =? P(D|C), namely, “are D and E conditionally independent, given C? AND are D and G conditionally independent, given C?” (This one can be a bit tricky, but it is well worth taking some time to fully understand.)

Check d-separation in R package {bnlearn}

In bnlearn, once you have your Bayesian network (the fitted model, an object of class bn.fit, or the network structure of class bn), you can easily check d-separation with the function dsep(). For example, to check whether A and B are d-separated given C, we can run the following code, where bn is the Bayesian network object. The result will be TRUE or FALSE.

> dsep(bn, x = "A", y = "B", z = "C")
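
For a self-contained run, here is a minimal sketch; the chain A → C → B is a hypothetical network used only to make the call above executable, and the object created by model2network() is a network structure (class bn) rather than a fitted bn.fit model.

library(bnlearn)

# Hypothetical structure A -> C -> B
bn <- model2network("[A][C|A][B|C]")

dsep(bn, x = "A", y = "B", z = "C")   # TRUE: C blocks the only path from A to B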

I hope this article gives you some idea of d-separation and makes it less intimidating. Happy learning!

References

  1. Judea Pearl, d-Separation Without Tears: http://bayes.cs.ucla.edu/BOOK-2K/d-sep.html
  2. MIT 6.034 handout on d-separation: http://web.mit.edu/jmn/www/6.034/d-separation.pdf
