**Last updated:** 2017-03-04

**Code version:** 5d0fa13

You should be familiar with Bayesian inference for a continuous parameter.

Suppose we want to do inference for multiple parameters, and suppose that the data that are informative for each parameter are independent. Then provided the prior distributions on these parameters are independent, the posterior distributions are also independent. This is useful because it means we can do Bayesian inference for all the parameters by doing the inference for each parameter separately.

Suppose we have data \(D_1\) that depend on parameter \(\theta_1\), and independent data \(D_2\) that depend on a second parameter \(\theta_2\). That is, suppose that the joint distribution of the data \((D_1,D_2)\) factorizes as \[p(D_1,D_2 | \theta_1, \theta_2) = p(D_1 | \theta_1)p(D_2 | \theta_2).\]

Now assume that our prior distribution on \((\theta_1,\theta_2)\) has the property that \(\theta_1, \theta_2\) are independent. (This is sometimes said “\(\theta_1\) and \(\theta_2\) are *a priori* independent”.) Intuitively this independence assumption means that telling you \(\theta_1\) would not tell you anything about \(\theta_2\). Mathematically, the independence assumption means that the prior distribution for \(\theta_1,\theta_2\) factorizes as \[p(\theta_1,\theta_2) = p(\theta_1)p(\theta_2).\]

Applying Bayes' theorem, we have

\[\begin{align} p(\theta_1, \theta_2 | D_1,D_2) & \propto p(D_1, D_2 | \theta_1, \theta_2) p(\theta_1, \theta_2) \\ & \propto p(D_1 | \theta_1) p(D_2 | \theta_2) p(\theta_1) p(\theta_2) \\ & = p(D_1 | \theta_1)p(\theta_1) \, p(D_2 | \theta_2) p(\theta_2) \\ & \propto p(\theta_1 | D_1) \, p(\theta_2 | D_2) \end{align}\]That is, the posterior distribution on \(\theta_1,\theta_2\) factorizes into independent parts \(p(\theta_1 | D_1)\) and \(p(\theta_2 | D_2)\). We say “\(\theta_1\) and \(\theta_2\) are *a posteriori* independent”.

This result extends naturally from 2 parameters to \(J\) parameters. That is, if we have independent data sets \(D_1,\dots,D_J\) that depend on parameters \(\theta_1,\dots,\theta_J\), with \[p(D_1,\dots, D_J | \theta_1,\dots,\theta_J) = \prod_{j=1}^J p(D_j | \theta_j)\] and we assume independent priors \[p(\theta_1,\dots,\theta_J) = \prod_{j=1}^J p(\theta_j)\] then the posteriors also factorize \[p(\theta_1,\dots, \theta_J | D_1,\dots, D_J) = \prod_{j=1}^J p(\theta_j | D_j).\]
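We can check this factorization numerically. The following sketch uses made-up data (two independent binomial experiments with uniform Beta(1,1) priors, so the posterior is proportional to the likelihood): it computes the joint posterior of \((\theta_1,\theta_2)\) on a grid and verifies that it equals the product of the two per-parameter posteriors.

```r
# Two independent binomial experiments: D_j | theta_j ~ Binomial(size_j, theta_j),
# with independent Uniform(0,1) = Beta(1,1) priors on theta_1 and theta_2.
# Made-up data for illustration:
x <- c(7, 2)       # successes in experiments 1 and 2
size <- c(10, 10)  # number of trials in each experiment

grid <- seq(0.001, 0.999, length.out = 200)

# Unnormalized joint posterior p(theta_1, theta_2 | D_1, D_2) on a 2-d grid
# (with a flat prior this is just the product of the two likelihoods):
joint <- outer(dbinom(x[1], size[1], grid), dbinom(x[2], size[2], grid))
joint <- joint / sum(joint)  # normalize over the grid

# Per-parameter posteriors p(theta_1 | D_1) and p(theta_2 | D_2),
# each computed separately:
post1 <- dbinom(x[1], size[1], grid); post1 <- post1 / sum(post1)
post2 <- dbinom(x[2], size[2], grid); post2 <- post2 / sum(post2)

# The joint posterior factorizes into the product of the marginals:
max(abs(joint - outer(post1, post2)))  # zero up to floating-point error
```

The final line is (up to floating-point error) zero, reflecting that \(\theta_1\) and \(\theta_2\) are *a posteriori* independent.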

Suppose we collect genetic data on \(n\) elephants at \(J\) locations along the genome (“loci”). Suppose that at each location there are two genetic types (“alleles”) that we label “0” and “1”. Our goal is to estimate the frequency of the “1” allele, \(q_j\), at each locus \(j=1,\dots,J\).

Let \(n_{ja}\) denote the number of alleles of type \(a\) observed at locus \(j\) (\(a \in \{0,1\}\), \(j \in \{1,2,\dots,J\}\)). Let \(n_j\) denote the data at locus \(j\) (so \(n_j = (n_{j0},n_{j1})\)) and \(n\) denote the data at all \(J\) loci.

Also let \(q\) denote the vector \((q_1,\dots,q_J)\).

Thus, \(n\) denotes the data and \(q\) denotes the unknown parameters. To do Bayesian inference for \(q\) we want to compute the posterior distribution \(p(q | n)\).

To apply the above results we must assume that

1.  data at different loci are independent, so \[p(n | q) = \prod_j p(n_j | q_j),\] and

2.  the \(q_j\) are *a priori* independent. This would imply, for example, that telling you \(q_1\) (the frequency of the 1 allele at locus 1) would not tell you anything about \(q_2\) (the frequency of the 1 allele at locus 2).

In practice these are reasonable assumptions provided that the loci are well separated along the genome and the samples are taken from a well-mixed ("random-mating") population of elephants without substructure.

Under these assumptions we have that the \(q_j\) are *a posteriori* independent, with \[p(q | n ) = \prod_j p(q_j | n_j).\]

Furthermore, we know from conjugacy that if the prior distribution on \(q_j\) is a Beta distribution, say \(q_j \sim \text{Beta}(a_j,b_j)\), then the posterior \(p(q_j | n_j)\) is also a Beta distribution, with \(q_j | n_j \sim \text{Beta}(a_j + n_{j1}, b_j + n_{j0})\).
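So the full posterior computation reduces to a simple per-locus update. As a minimal sketch (with hypothetical allele counts and a uniform Beta(1,1) prior at every locus), the posterior for all loci can be computed in one vectorized step:

```r
# Hypothetical allele counts at J = 4 loci (for illustration only):
n1 <- c(30, 10, 5, 22)   # n_{j1}: count of the "1" allele at locus j
n0 <- c(70, 90, 95, 78)  # n_{j0}: count of the "0" allele at locus j

# Beta(a_j, b_j) prior at each locus; here a uniform Beta(1,1) prior:
a <- rep(1, 4)
b <- rep(1, 4)

# By conjugacy, q_j | n_j ~ Beta(a_j + n_{j1}, b_j + n_{j0}),
# and by a posteriori independence each locus is updated separately:
post_a <- a + n1
post_b <- b + n0

# Posterior mean and 95% credible interval for each q_j:
post_mean <- post_a / (post_a + post_b)
ci_lower <- qbeta(0.025, post_a, post_b)
ci_upper <- qbeta(0.975, post_a, post_b)
round(cbind(post_mean, ci_lower, ci_upper), 3)
```

Because the loci are *a posteriori* independent, no joint computation over all \(J\) parameters is ever needed; the cost is linear in \(J\).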

`sessionInfo()`

```
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidyr_0.4.1 dplyr_0.5.0 ggplot2_2.1.0 knitr_1.15.1
[5] MASS_7.3-45 expm_0.999-0 Matrix_1.2-6 viridis_0.3.4
[9] workflowr_0.3.0 rmarkdown_1.3
loaded via a namespace (and not attached):
[1] Rcpp_0.12.5 git2r_0.18.0 plyr_1.8.4 tools_3.3.0
[5] digest_0.6.9 evaluate_0.10 tibble_1.1 gtable_0.2.0
[9] lattice_0.20-33 shiny_0.13.2 DBI_0.4-1 yaml_2.1.14
[13] gridExtra_2.2.1 stringr_1.2.0 gtools_3.5.0 rprojroot_1.2
[17] grid_3.3.0 R6_2.1.2 reshape2_1.4.1 magrittr_1.5
[21] backports_1.0.5 scales_0.4.0 htmltools_0.3.5 assertthat_0.1
[25] mime_0.5 colorspace_1.2-6 xtable_1.8-2 httpuv_1.3.3
[29] labeling_0.3 stringi_1.1.2 munsell_0.4.3
```

This site was created with R Markdown