Last updated: 2018-07-20
workflowr checks: (Click a bullet for more information)Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: docs/.DS_Store
Ignored: docs/figure/.DS_Store
Unstaged changes:
Deleted: analysis/license.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes. File | Version | Author | Date | Message |
---|---|---|---|---|
html | 5487b70 | Jason Willwerscheid | 2018-07-20 | Build site. |
Rmd | 8911eb8 | Jason Willwerscheid | 2018-07-20 | wflow_publish(c(“analysis/obj_notes.Rmd”, |
html | de6736c | Jason Willwerscheid | 2018-07-18 | Build site. |
Rmd | 4d6f3da | Jason Willwerscheid | 2018-07-18 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | f34a82b | Jason Willwerscheid | 2018-07-18 | Build site. |
Rmd | 6c2cefd | Jason Willwerscheid | 2018-07-18 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | cc77169 | Jason Willwerscheid | 2018-07-17 | Build site. |
Rmd | 87f4d40 | Jason Willwerscheid | 2018-07-17 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | e77fc4c | Jason Willwerscheid | 2018-07-17 | Build site. |
Rmd | eb42402 | Jason Willwerscheid | 2018-07-17 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | 9678634 | Jason Willwerscheid | 2018-07-17 | Build site. |
Rmd | 3e874c6 | Jason Willwerscheid | 2018-07-17 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | 02765a9 | Jason Willwerscheid | 2018-07-17 | Build site. |
Rmd | e811943 | Jason Willwerscheid | 2018-07-17 | wflow_publish(“analysis/obj_notes.Rmd”) |
html | 0621185 | Jason Willwerscheid | 2018-07-17 | Build site. |
Rmd | a0bccf9 | Jason Willwerscheid | 2018-07-17 | wflow_publish(“analysis/obj_notes.Rmd”) |
Recall the FLASH model: Y=LF′+E
When updating loading lk, we are optimizing over glk and qlk. glk∈G is the prior on the elements of the kth column of the loadings matrix: l1k,…,lnk∼iidglk qlk is an arbitrary distribution which enters into the problem via the variational approach. For convenience, I drop the subscripts in the following.
The part of the objective that depends on g and q is F(g,q):=Eq[−12∑i(Ail2i−2Bili)]+Eqlogg(l)q(l) with Ai=∑jτijEf2j and Bi=∑jτijRijEfj, (R is the matrix of residuals (excluding factor k) and Efj and Ef2j are the expected values of fjk and f2jk with respect to the distribution qfk fitted during the factor update.)
As Lemma 2 in the paper shows (see Appendix A.2), this expression is optimized by setting s2j=Aj and xj=Bjs2j, and then solving the EBNM problem, where the EBNM model is: x=θ+e, θ1,…,θn∼iidg, ej∼N(0,s2j)
Solving the EBNM problem gives ˆg=argmax and \hat{q} = p(\theta \mid x, \hat{g})
Finally, to update the overall objective, we need to compute E_q \log \frac{g(\mathbf{l})}{q(\mathbf{l})}. FLASH uses a clever trick, noticing that E_{\hat{q}} \log \frac{\hat{g}(\mathbf{l})}{\hat{q}(\mathbf{l})} = F(\hat{g}, \hat{q}) + \frac{1}{2} \sum_j \left[ \log 2\pi s_j^2 + (1/s_j^2) E_{\hat{q}} (x_j - \theta_j)^2 \right] (See Appendix A.4.)
When using ebnm_pn
, however, it seems possible to compute E_q \log \frac{g(\mathbf{l})}{q(\mathbf{l})} directly. Since the elements l_1, \ldots, l_n are i.i.d. from g (by the FLASH model) and the posterior distributions are mutually independent (by the EBNM model), E_q \log \frac{g(\mathbf{l})}{q(\mathbf{l})}
= \sum_i E_{q_i} \log \frac{g(l_i)}{q(l_i)}
I drop the subscripts i. Write g \sim \pi_0 \delta_0 + (1 - \pi_0) N(0, 1/a) and q \sim (1 - \tilde{w}) \delta_0 + \tilde{w} N(\tilde{\mu}, \tilde{\sigma}^2) (I use different parametrizations to make the derivation cleaner and to follow the code more closely.)
Then \begin{aligned} E_q \log \frac{g(l)}{q(l)} &= (1 - \tilde{w}) \log \frac{\pi_0}{1 - \tilde{w}} + \int \tilde{w}\ \text{dnorm}(x; \tilde{\mu}, \tilde{\sigma}^2) \log \frac{(1 - \pi_0)\text{dnorm}(x; 0, 1/a)} {\tilde{w}\ \text{dnorm}(x; \tilde{\mu}, \tilde{\sigma}^2)}\ dx \\ &= (1 - \tilde{w}) \log \frac{\pi_0}{1 - \tilde{w}} + \tilde{w} \log \frac{1 - \pi_0}{\tilde{w}} \\ &\ + \int \tilde{w}\ \text{dnorm}(x; \tilde{\mu}, \tilde{\sigma}^2) \log \left( \sqrt{a \tilde{\sigma}^2} \exp \left( -\frac{ax^2}{2} + \frac{(x - \tilde{\mu})^2}{2 \tilde{\sigma}^2} \right) \right) \ dx \end{aligned}
The last integral is equal to \begin{aligned} \frac{\tilde{w}}{2} \log (a \tilde{\sigma}^2) &- \frac{\tilde{w} a}{2} E_{N(x; \tilde{\mu}, \tilde{\sigma}^2)} x^2 + \frac{\tilde{w}}{2 \tilde{\sigma}^2} E_{N(x; \tilde{\mu}, \tilde{\sigma}^2)} (x - \tilde{\mu})^2 \\ &= \frac{\tilde{w}}{2} \log (a \tilde{\sigma}^2) - \frac{\tilde{w} a}{2} (\tilde{\mu}^2 + \tilde{\sigma}^2) + \frac{\tilde{w}}{2} \end{aligned}
Thus E_q \log \frac{g(l)}{q(l)} = (1 - \tilde{w}) \log \frac{\pi_0}{1 - \tilde{w}} + \tilde{w} \log \frac{1 - \pi_0}{\tilde{w}} + \frac{\tilde{w}}{2} \left( \log (a \tilde{\sigma}^2) - a(\tilde{\mu}^2 + \tilde{\sigma}^2) + 1 \right)
This reproducible R Markdown analysis was created with workflowr 1.0.1