class: center, middle, inverse, title-slide # Bayesian networks in R ### Simon Dirmeier
simon.dirmeier@bsse.ethz.ch
### 17. January 2020 --- <style type="text/css"> @import url(https://fonts.googleapis.com/css?family=Raleway:400,300,600); h1, h2, h3 { font-family: 'Raleway'; } .title-slide, .title-slide h1, .title-slide h3 { color: black; background-color: white; text-shadow: none; font-family: 'Raleway'; text-align: left; } .title-slide h1 { margin-top: 30%; } .remark-slide-content { font-size: 30px; font-family: 'Raleway'; } a, a code { color: maroon; } .left-column { width: 50%; height: 92%; float: left; color: black; padding-top: 0em; } .right-column { width: 50%; height: 92%; float: right; color: black; padding-top: 0em; } .mjx-chtml { font-size: 30px !important; color: black; } .onethirdcolumn, .onethirdcolumnborder { margin-left: 5px; margin-right: 5px; width: 30%; height: 92%; float: left; color: black; padding-top: 0em; box-sizing: border-box; } .onethirdcolumnborder { border-style: solid; border-width: 0px 1px 0px 1px; } .centeredtwo { margin: auto; width: 50%; height: 92%; color: black; padding-top: 0em; box-sizing: border-box; margin: auto; text-align: center; } body { color: black; } .small .remark-code { font-size: 50% !important; } .small { font-size: 85% !important; } </style> # Before we start - We will use `R` for learning Bayesian networks from data. - If you want to follow the tutorials on your laptop, you need some packages. Install them using: ```r install.packages( c("bnlearn", "ggraph", "igraph", "dagitty") ) ``` - If you want to do them at some point later, the `R` tutorials can be found on: [bit.ly/bayesian-networks](http://bit.ly/bayesian-networks) - Slides: [bit.ly/bayesian-networks-slides](http://bit.ly/bayesian-networks-slides) --- # What are Bayesian networks .right-column[ <img src="index_files/figure-html/unnamed-chunk-2-1.png" style="display: block; margin: auto;" /> `$$\small \begin{align} P(\cdot \mid \theta, G) = & \, P(\text{PLCG}) \cdot P(\text{RAF}) \\ & \cdot P(\text{MEK} \mid \text{RAF}) \\ & \cdot P(\text{PIP2} \mid \text{RAF}, \text{PLCG}) \\ & \cdot P(\text{PIP3} \mid \text{PIP2}) \end{align}$$` ] .left-column[ A Bayesian network is a probability distribution `\(P\)` of a multivariate random variable that factorizes w.r.t. a directed acyclic graph. - nodes: random variables `\(X_i\)` - edges: conditional dependencies - factorization: `\(P(X \mid \theta, G) = \prod_i P(X_i \mid \mathbf{pa}_i)\)` ] --- # A BN of dependent genes .right-column[ <img src="index_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" /> ] .left-column[ Dependent gene expression levels ```r head(data) ``` ``` ## MEK PIP2 PIP3 PLCG RAF ## 1 Medium Low High Medium Medium ## 2 High Low Medium Medium High ## 3 Low Low Medium High Medium ## 4 Medium Medium Medium Medium Medium ## 5 Medium Low Low Low Medium ## 6 Low Medium Low High Medium ``` ] --- # A BN of dependent genes Usually the structure of the BN is not known. <br>We need to learn the edges and fit the parameters `\(\theta\)`. .centeredtwo[ <img src="index_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" /> ] --- # Let's learn a BN from data ## What do we need ? - A research question (i.e., does MEK regulate RAF expression) - Data - A method for learning the structure: - Score-based learning (e.g., using BIC as score) - Constraint-based learning (i.e., using CI) - A method for inferring parameters: MAP or MLE - Understanding what the learned BN *can* tell us --- # Discrete Bayesian networks .right-column[ <img src="index_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> ] .left-column[ Dependent gene expression levels (multinomial variables) ```r head(data) ``` ``` ## MEK PIP2 PIP3 PLCG RAF ## 1 Medium Low High Medium Medium ## 2 High Low Medium Medium High ## 3 Low Low Medium High Medium ## 4 Medium Medium Medium Medium Medium ## 5 Medium Low Low Low Medium ## 6 Low Medium Low High Medium ``` ] --- # Discrete Bayesian networks .right-column[ <img src="index_files/figure-html/unnamed-chunk-8-1.png" width="60%" style="display: block; margin: auto;" /> `$$\small \begin{align} P(\cdot \mid \theta, G) = & \, P(\text{PLCG}) \cdot P(\text{RAF}) \\ & \cdot P(\text{MEK} \mid \text{RAF}) \\ & \cdot P(\text{PIP2} \mid \text{RAF}, \text{PLCG}) \\ & \cdot P(\text{PIP3} \mid \text{PIP2}) \end{align}$$` ] .left-column[ .small[ Local parameters of MEK: <img src="index_files/figure-html/unnamed-chunk-9-1.png" width="75%" style="display: block; margin: auto;" /> Estimated from data: ```r raf_high <- sum(data$RAF == "High") mek_med_raf_high <- sum(data$MEK == "Medium" & data$RAF == "High") mek_med_raf_high / raf_high ``` ``` ## [1] 0.496 ``` ] ] --- # Discrete Bayesian networks Local parameters of PIP2 <img src="index_files/figure-html/unnamed-chunk-11-1.png" width="90%" style="display: block; margin: auto;" /> --- # Gaussian Bayesian networks .right-column[ <img src="index_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" /> ] .left-column[ Dependent gene expression levels (normal variables) ```r head(normal.data) ``` ``` ## MEK PIP2 PIP3 PLCG RAF ## 1 -0.01704444 -0.48526067 0.30454290 -0.05768997 -0.01982261 ## 2 0.28452593 -0.46043638 -0.07975671 0.03530991 0.23767098 ## 3 -0.35224836 -0.35125256 -0.15358762 0.22120396 0.02566972 ## 4 -0.06308748 0.13502731 -0.15457930 0.07573096 0.05395930 ## 5 -0.02004423 -0.50384090 -0.55696652 -0.40868773 0.16620857 ## 6 -0.69012854 -0.04767902 -0.36097671 0.63511402 -0.05855898 ``` ] --- # Gaussian Bayesian networks .right-column[ <img src="index_files/figure-html/unnamed-chunk-14-1.png" width="60%" style="display: block; margin: auto;" /> `$$\small \begin{align} P(\cdot \mid \theta, G) = & \, P(\text{PLCG}) \cdot P(\text{RAF}) \\ & \cdot P(\text{MEK} \mid \text{RAF}) \\ & \cdot P(\text{PIP2} \mid \text{RAF}, \text{PLCG}) \\ & \cdot P(\text{PIP3} \mid \text{PIP2}) \end{align}$$` ] .left-column[ .small[ Local parameters of MEK: ```r > bn_fit$MEK Conditional density: MEK | RAF Coefficients: (Intercept) RAF -0.003345871 -0.286159395 ``` Estimated from data: ```r > lm(MEK ~ RAF, data=data) Call: lm(formula = MEK ~ RAF, data = data) Coefficients: (Intercept) RAF -0.003346 -0.286159 ``` ] ] --- # Score-based learning Search space of DAGs and find the one that maximizes a score, e.g. BIC: `\(\text{score}(G) = \log P(D \mid \hat{\theta}, G) - \frac{\nu}{2} \log n\)` .onethirdcolumn[ <img src="index_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" /> ```r > score(graph, data) [1] -200 ``` ] .onethirdcolumnborder[ <img src="index_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" /> ```r > score(graph, data) [1] -10 ``` ] .onethirdcolumn[ <img src="index_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" /> ```r > score(graph, data) [1] -1 ``` ] --- # Constraint-based learning Test conditional independence of pairs of variables. .right-column[ <img src="index_files/figure-html/unnamed-chunk-23-1.png" style="display: block; margin: auto;" /> ] .left-column[ Are PIP3 and PLCG marginally independent? ```r > ci.test("PIP3", "PLCG", data) data: PIP3 ~ PLCG x2 = 22.44, df = 4, p-value = 0.0001638 ``` ] --- # Constraint-based learning Remove edges if variables are conditionally independent. .right-column[ <img src="index_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" /> ] .left-column[ Are PIP3 and PLCG conditionally independent? ```r > ci.test("PIP3", "PLCG", c("PIP2", "RAF", "MEK"), data) data: PIP3 ~ PLCG | PIP2 + RAF + MEK x2 = 90.678, df = 108, p-value = 0.8854 ``` ] --- # Things to consider Cannot (generally) learn direction of edges from observational data. All three DAGs have same score and encode same CIs! .onethirdcolumn[ <img src="index_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" /> ] .onethirdcolumn[ <img src="index_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" /> <br><br> `\(X \not\!\perp\!\!\!\perp Z\)`, `\(X \perp\!\!\!\perp Z \mid Y\)` ] .onethirdcolumn[ <img src="index_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" /> ] --- # Things to consider *Can* learn direction of immorality. .centeredtwo[ <img src="index_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" /> <br><br> `\(X \perp\!\!\!\perp Z\)`, `\(X \not\!\perp\!\!\!\perp Z \mid Y\)` ] --- # Examples .left-column[ <img src="index_files/figure-html/unnamed-chunk-31-1.png" style="display: block; margin: auto;" /> ```r > score(dag1, data) [1] -2589.816 ``` ] .right-column[ <img src="index_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" /> ```r > score(dag2, data) [1] -2589.816 ``` ] --- # Markov equivalence class Can only learn a BN up to its *Markov equivalence class*, i.e. the *completed partially directed acycled graph* (CPDAG). .left-column[ <img src="index_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" /> ] .right-column[ **Crucial detail!** Remember this when interpreting the structure of a BN. Edges can be deceiving and turned around in some cases. - **no** causal interpretation (generally) - **only** conditional dependence ] --- # Let's learn a BN from data ## R tutorial 1 - Load the data: `data/singalling_data-categorical.rds` - Learn the structure of a BN using: - `bnlearn::tabu` - `bnlearn::pc.stable` - Compute the equivalence class of the learned DAGs (`bnlear::cpdag`) - Make a query: `\(P(\text{MEK} = \text{high} \mid \text{RAF} = \text{low})\)` (`bnlearn::cpquery`) --- # Let's learn a BN from data ## R tutorial 2 - Load the data: `data/singalling_data.rds` - Learn the structure of a BN using `bnlearn::tabu` and `bnlearn::pc.stable` (do we still get warnings?) - Compute the equivalence class of the learned DAGs (`bnlear::cpdag`) - Make a query: `\(P(\text{MEK} > 0 \mid \text{RAF} < 0)\)` (`bnlearn::cpquery`) - What do the coefficients mean ? --- # Let's learn a BN from data ## R tutorial 3 - Get some confidence in the learned edges. Use `bnlearn::boot.strength` to assess edge strength. - Plot the network using - `plot(average.network(...))` - `plot.bootstrapped` --- # R packages - Structure learning: `bnlearn` and `pcalg` - Causal reasoning: `dagitty` and `ggdag` - Visualization: `ggraph` --- # References - Marco Scutari and Jean-Baptiste Denis, *Bayesian networks with examples in R* (applied) - David Barber, *Bayesian Reasoning and Machine Learning* (free, theory, good intro) - Daphne Koller and Nir Friedman, *Probabilistic Graphical Models* (very exhaustive) <div> <div style="float: right; padding: 5px"> <img src="https://mitpress.mit.edu/sites/default/files/styles/large_book_cover/http/mitp-content-server.mit.edu%3A18180/books/covers/cover/%3Fcollid%3Dbooks_covers_0%26isbn%3D9780262013192%26type%3D.jpg?itok=BDzWXp-V" height="150"/> </div> <div style="float: right; padding: 5px"> <img src="http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/jacket.gif" height="150"/> </div> <div style="float: right; padding: 5px"> <img src="http://bnlearn.com/images/crc.jpg" height="150"/> </div> </div>