Bayesian networks in R

class: center, middle, inverse, title-slide

# Bayesian networks in R
### Simon Dirmeier <a href="mailto:simon.dirmeier@bsse.ethz.ch" class="email">simon.dirmeier@bsse.ethz.ch</a>
### 17. January 2020

---

h1, h2, h3 {
  font-family: 'Raleway';
}

.title-slide, .title-slide h1, .title-slide h3 {
  color: black;
  background-color: white;
  text-shadow: none;
  font-family: 'Raleway';
  text-align: left;
}

.title-slide h1 {
  margin-top: 30%;
}

.remark-slide-content {
  font-size: 30px;
  font-family: 'Raleway';
}

a, a code {
  color: maroon;
}

.left-column {
  width: 50%;
  height: 92%;
  float: left;
  color: black;
  padding-top: 0em;
}

.right-column {
  width: 50%;
  height: 92%;
  float: right;
  color: black;
  padding-top: 0em;
}

.mjx-chtml {
  font-size: 30px !important;
  color: black;
}

.onethirdcolumn, .onethirdcolumnborder {
  margin-left: 5px;
  margin-right: 5px;
  width: 30%; 
  height: 92%;
  float: left; 
  color: black;
  padding-top: 0em;
  box-sizing: border-box;
}

.onethirdcolumnborder {
  border-style: solid;
  border-width: 0px 1px 0px 1px;
}

.centeredtwo {
  margin: auto;
  width: 50%;
  height: 92%;
  color: black;
  padding-top: 0em;
  box-sizing: border-box;
  margin: auto;
  text-align: center;
}

body {
  color: black;
}

.small .remark-code {
  font-size: 50% !important;
}

.small {
  font-size: 85% !important;
}

</style>

# Before we start

- We will use `R` for learning Bayesian networks from data.
- If you want to follow the tutorials on your laptop, you need some packages. Install them using:

```r
install.packages(
  c("bnlearn", "ggraph", "igraph", "dagitty")
)
```

- If you want to do them at some point later, the `R` tutorials can be found on: [bit.ly/bayesian-networks](http://bit.ly/bayesian-networks)
- Slides: [bit.ly/bayesian-networks-slides](http://bit.ly/bayesian-networks-slides)

---

# What are Bayesian networks

.right-column[
<img src="index_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" />

`$$\small \begin{align}
P(\cdot \mid \theta, G)  = & \, P(\text{PLCG}) \cdot P(\text{RAF}) \\ & \cdot P(\text{MEK} \mid \text{RAF}) \\ & \cdot P(\text{PIP2} \mid \text{RAF}, \text{PLCG}) \\ & \cdot P(\text{PIP3} \mid \text{PIP2})
\end{align}$$`
]

.left-column[
A Bayesian network is a probability distribution `$P$` of a multivariate random variable that factorizes w.r.t. a directed acyclic graph.

- nodes: random variables `$X_i$`
- edges: conditional dependencies
- factorization: `$P(X \mid \theta, G) = \prod_i P(X_i \mid \mathbf{pa}_i)$`
]

---

# A BN of dependent genes

.right-column[
<img src="index_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />
]

.left-column[
Dependent gene expression levels

```r
head(data)
```

```
##      MEK   PIP2   PIP3   PLCG    RAF
## 1 Medium    Low   High Medium Medium
## 2   High    Low Medium Medium   High
## 3    Low    Low Medium   High Medium
## 4 Medium Medium Medium Medium Medium
## 5 Medium    Low    Low    Low Medium
## 6    Low Medium    Low   High Medium
```
]

---

# A BN of dependent genes

Usually the structure of the BN is not known. <br>We need to learn the edges and fit the parameters `$\theta$`.

.centeredtwo[
<img src="index_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />
]

---

# Let's learn a BN from data
## What do we need ?

- A research question (i.e., does MEK regulate RAF expression)
- Data 
- A method for learning the structure:
  - Score-based learning (e.g., using BIC as score)
  - Constraint-based learning (i.e., using CI)
- A method for inferring parameters: MAP or MLE
- Understanding what the learned BN *can* tell us

---

# Discrete Bayesian networks

.right-column[
<img src="index_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />
]

.left-column[
Dependent gene expression levels (multinomial variables)

```r
head(data)
```

---

# Discrete Bayesian networks

.right-column[
<img src="index_files/figure-html/unnamed-chunk-8-1.png" width="60%" style="display: block; margin: auto;" />

]

.left-column[

.small[
Local parameters of MEK:
<img src="index_files/figure-html/unnamed-chunk-9-1.png" width="75%" style="display: block; margin: auto;" />

Estimated from data:

```r
raf_high <- sum(data$RAF == "High")
mek_med_raf_high <- sum(data$MEK == "Medium" & 
                        data$RAF == "High")
mek_med_raf_high / raf_high
```

```
## [1] 0.496
```
]

]

---

# Discrete Bayesian networks

Local parameters of PIP2
<img src="index_files/figure-html/unnamed-chunk-11-1.png" width="90%" style="display: block; margin: auto;" />

---

# Gaussian Bayesian networks

.right-column[
<img src="index_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />
]

.left-column[
Dependent gene expression levels (normal variables)

```r
head(normal.data)
```

```
##           MEK        PIP2        PIP3        PLCG         RAF
## 1 -0.01704444 -0.48526067  0.30454290 -0.05768997 -0.01982261
## 2  0.28452593 -0.46043638 -0.07975671  0.03530991  0.23767098
## 3 -0.35224836 -0.35125256 -0.15358762  0.22120396  0.02566972
## 4 -0.06308748  0.13502731 -0.15457930  0.07573096  0.05395930
## 5 -0.02004423 -0.50384090 -0.55696652 -0.40868773  0.16620857
## 6 -0.69012854 -0.04767902 -0.36097671  0.63511402 -0.05855898
```
]

---

# Gaussian Bayesian networks

.right-column[
<img src="index_files/figure-html/unnamed-chunk-14-1.png" width="60%" style="display: block; margin: auto;" />

]

.left-column[

.small[
Local parameters of MEK:

```r
> bn_fit$MEK

Conditional density: MEK | RAF
Coefficients:
 (Intercept)           RAF  
-0.003345871  -0.286159395
```

Estimated from data:

```r
> lm(MEK ~ RAF, data=data)

Call: lm(formula = MEK ~ RAF, data = data)

Coefficients:
(Intercept)          RAF  
  -0.003346    -0.286159  
```
]

]

---

# Score-based learning

Search space of DAGs and find the one that maximizes a score, e.g. BIC: `$\text{score}(G) = \log P(D \mid \hat{\theta}, G) - \frac{\nu}{2} \log n$`

.onethirdcolumn[

```r
> score(graph, data)
[1] -200
```
]

.onethirdcolumnborder[
<img src="index_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" />

```r
> score(graph, data)
[1] -10
```
]

.onethirdcolumn[
<img src="index_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" />

```r
> score(graph, data)
[1] -1
```
]

---

# Constraint-based learning

Test conditional independence of pairs of variables.

.right-column[
<img src="index_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" />

]

.left-column[
Are PIP3 and PLCG marginally independent?

```r
> ci.test("PIP3", "PLCG", data)

data:  PIP3 ~ PLCG  
x2 = 22.44, df = 4, p-value = 0.0001638
```
]

---

# Constraint-based learning

Remove edges if variables are conditionally independent.

.right-column[
<img src="index_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" />
]

.left-column[
Are PIP3 and PLCG conditionally independent?

```r
> ci.test("PIP3", "PLCG", 
          c("PIP2", "RAF", "MEK"), data)

data:  PIP3 ~ PLCG | PIP2 + RAF + MEK
x2 = 90.678, df = 108, p-value = 0.8854
```
]

---

# Things to consider

Cannot (generally) learn direction of edges from observational data. All three DAGs have same score and encode same CIs!

.onethirdcolumn[
<img src="index_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />
]

.onethirdcolumn[
<img src="index_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" />
<br><br>
`$X \not\!\perp\!\!\!\perp Z$`, `$X \perp\!\!\!\perp Z \mid Y$`
]

.onethirdcolumn[
<img src="index_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />
]

---

# Things to consider

*Can* learn direction of immorality.

.centeredtwo[
<img src="index_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" />
<br><br>
`$X \perp\!\!\!\perp Z$`, `$X \not\!\perp\!\!\!\perp Z \mid Y$`
]

---

# Examples

.left-column[

```r
> score(dag1, data)
[1] -2589.816
```
]

.right-column[
<img src="index_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" />

```r
> score(dag2, data)
[1] -2589.816
```
]

---

# Markov equivalence class

Can only learn a BN up to its *Markov equivalence class*, i.e. the *completed partially directed acycled graph* (CPDAG).
.left-column[
<img src="index_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" />
]
.right-column[
**Crucial detail!** Remember this when interpreting the structure of a BN. Edges can be deceiving and turned around in some cases.
- **no** causal interpretation (generally)
- **only** conditional dependence
]

---

# Let's learn a BN from data
## R tutorial 1

- Load the data: `data/singalling_data-categorical.rds`
- Learn the structure of a BN using:
    - `bnlearn::tabu`
    - `bnlearn::pc.stable`
- Compute the equivalence class of the learned DAGs (`bnlear::cpdag`)
- Make a query: `$P(\text{MEK} = \text{high} \mid \text{RAF} = \text{low})$` (`bnlearn::cpquery`)

---

# Let's learn a BN from data
## R tutorial 2

- Load the data: `data/singalling_data.rds`
- Learn the structure of a BN using `bnlearn::tabu` and `bnlearn::pc.stable` (do we still get warnings?)
- Compute the equivalence class of the learned DAGs (`bnlear::cpdag`)
- Make a query: `$P(\text{MEK} > 0 \mid \text{RAF} < 0)$` (`bnlearn::cpquery`)
- What do the coefficients mean ?

---

# Let's learn a BN from data
## R tutorial 3

- Get some confidence in the learned edges. Use `bnlearn::boot.strength` to assess edge strength.
- Plot the network using
  - `plot(average.network(...))`
  - `plot.bootstrapped`

---

# R packages

- Structure learning: `bnlearn` and `pcalg`

- Causal reasoning: `dagitty` and `ggdag`

- Visualization: `ggraph`

---

# References

- Marco Scutari and  Jean-Baptiste Denis, *Bayesian networks with examples in R* (applied)

- David Barber, *Bayesian Reasoning and Machine Learning* (free, theory, good intro)

- Daphne Koller and Nir Friedman,  *Probabilistic Graphical Models* (very exhaustive)