Last updated: 2022-03-17


Knit directory: ~/Documents/Winter_Quarter_2022/Fundamentals/jaurbanChicago.github.io/bin/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.



The results in this page were generated with repository version 50fe36b. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.



These are the previous versions of the repository in which changes were made to the R Markdown (bin/Genetic_Drift_Markov.Rmd) and HTML (docs/Genetic_Drift_Markov.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author          Date        Message
html  a311740  jaurbanChicago  2022-03-17  Removed all htmls
Rmd   0d8c343  jaurbanChicago  2022-03-17  Updates both rmd and html
html  0d8c343  jaurbanChicago  2022-03-17  Updates both rmd and html
html  112890c  jaurbanChicago  2022-03-17  Added Genetic_Drift_Markov.html
Rmd   6b81ff1  jaurbanChicago  2022-03-17  Added Genetic_Drift_Markov.Rmd

Pre-Requisites

Introduction to Discrete Markov Chains

Before getting into the specific details of the relationship between Markov chains and genetic drift, it is necessary to briefly explain what a discrete, finite Markov chain is.

Let's consider a discrete random variable \(X\) which, at time points \(0,1,2,3,...\), takes values in \(0,1,2,...,M\). We say that \(X\) is in state \(E_i\) when it takes the value \(i\) [2], so at any time \(t\), \(X\) is in exactly one of the states \(E_0,E_1,...,E_M\). A variable is said to be Markovian if the probability of a given outcome at the next time step \(t+1\) depends only on the present state at time \(t\); that is, it is memoryless with regard to the earlier time steps \(t-1,t-2,...\). We can therefore define a discrete Markov chain as a sequence of discrete random variables in which the probability distribution over the states at each time \(t\) depends only on the state at time \(t-1\) [1,2].

Some important mathematical notation to describe a discrete Markov chain is [2,3]:
Writing \(X_n\) for the value of \(X\) at time \(n\), the Markov property states that
\[ P(X_{n+1}=j \mid X_n=i, X_{n-1}=i_{n-1}, \ldots, X_1=i_1, X_0=i_0) = P(X_{n+1}=j \mid X_n=i) \]
A discrete Markov chain is homogeneous if the transition probabilities do not depend on \(n\):
\[ P(X_{n+1}=j \mid X_n=i) = P(X_1=j \mid X_0=i) = P_{ij} \quad \text{for all } n \]
with
\[ P_{ij} \ge 0, \quad i,j \in \{0,1,\ldots,M\}, \quad \sum_{j=0}^{M} P_{ij} = 1 \]
A transition probability matrix can be assembled to represent the probabilities \(P_{ij}\):
\[ P = \begin{pmatrix} p_{00} & p_{01} & \cdots & p_{0M} \\ p_{10} & p_{11} & \cdots & p_{1M} \\ \vdots & \vdots & \ddots & \vdots \\ p_{M0} & p_{M1} & \cdots & p_{MM} \end{pmatrix} \]
This is a stochastic matrix, so every row must sum to 1.
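The notation above can be made concrete with a small sketch in Python (standard library only). The 3-state matrix `P` and the helper names `is_stochastic` and `simulate_chain` are hypothetical and chosen purely for illustration; they are not taken from the references. The sketch checks the stochastic-matrix conditions and then simulates a homogeneous chain in which each step is drawn from the row of the current state alone.

```python
import random

# Hypothetical 3-state transition matrix (states 0, 1, 2); values are illustrative only.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]


def is_stochastic(matrix, tol=1e-12):
    """True if the matrix is square, has no negative entries, and every row sums to 1."""
    n = len(matrix)
    return all(
        len(row) == n
        and all(p >= 0 for p in row)
        and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def simulate_chain(matrix, start, steps, rng):
    """Simulate a homogeneous chain; the next state depends only on the current one."""
    path = [start]
    for _ in range(steps):
        current = path[-1]
        # Row `current` of the matrix is the distribution of the next state.
        nxt = rng.choices(range(len(matrix)), weights=matrix[current])[0]
        path.append(nxt)
    return path


rng = random.Random(1)  # fixed seed so the run is repeatable
print(is_stochastic(P))
print(simulate_chain(P, start=1, steps=10, rng=rng))
```

Because the same matrix is used at every step, the simulated chain is homogeneous; earlier entries of `path` are never consulted when drawing the next state, which is the Markov property.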

Next, we define two classes of Markov chains:

  • Irreducible: All states can communicate with one another [3].
  • Ergodic: Irreducible, recurrent, and aperiodic. Recurrent means that a chain started in state \(E_i\) returns to \(E_i\) with probability 1, i.e. \(f_{i}=1\), and therefore re-enters \(E_i\) infinitely often. Aperiodic means that every state of the Markov chain is aperiodic [3].
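For a finite chain, both properties can be checked directly from the pattern of non-zero entries in the transition matrix. The following is a minimal Python sketch under that finite-state assumption; the function names and the three example matrices are hypothetical. Irreducibility is tested by reachability, and the irreducible-and-aperiodic condition is tested with the standard fact that a finite chain has both properties exactly when some power of \(P\) is strictly positive (by Wielandt's bound, powers up to \((n-1)^2+1\) are enough for \(n\) states).

```python
def reachable(matrix, start):
    """Set of states reachable from `start` in zero or more steps."""
    seen = {start}
    stack = [start]
    while stack:
        i = stack.pop()
        for j, p in enumerate(matrix[i]):
            if p > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen


def is_irreducible(matrix):
    """True if every state can reach every other state."""
    n = len(matrix)
    return all(len(reachable(matrix, i)) == n for i in range(n))


def bool_mat_mult(a, b):
    """Boolean matrix product: entry (i, j) is True if some k links i -> k -> j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_ergodic(matrix):
    """Finite chain: irreducible and aperiodic iff some power of P is strictly positive."""
    n = len(matrix)
    adj = [[p > 0 for p in row] for row in matrix]
    power = adj  # pattern of P^1
    for _ in range((n - 1) ** 2 + 1):  # Wielandt's bound on the exponent
        if all(all(row) for row in power):
            return True
        power = bool_mat_mult(power, adj)
    return False


# Hypothetical examples, for illustration only.
drift_like = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
flip = [[0.0, 1.0], [1.0, 0.0]]        # irreducible but periodic (period 2)
absorbing = [[1.0, 0.0], [0.5, 0.5]]   # state 0 is absorbing, so not irreducible

for name, m in [("drift_like", drift_like), ("flip", flip), ("absorbing", absorbing)]:
    print(name, is_irreducible(m), is_ergodic(m))
```

The check works on booleans rather than probabilities because only the positions of the non-zero entries matter for communication and periodicity.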

For the remainder of this vignette, note that we will work only with Markov chains that have no periodicities, since periodicities hardly ever arise in genetic applications [2]. Finally, we need to define the stationary distribution of a Markov chain, which can be stated in the following mathematical terms [3]:

Consider \(P^t\) as \(t\) gets large, with \(\pi^{(0)}\) the initial distribution:
\[ \lim_{t\to\infty}(\pi^{(0)})^T P^t=\pi^T \]
The vector \(\pi\) is the limiting distribution of the Markov chain. The stationarity property of \(\pi\) is
\[ \pi^T=\pi^T P, \quad \text{that is,} \quad \pi_i=\sum_k \pi_k P_{ki} \]
For ergodic chains, the limiting distribution is also the stationary distribution, and this stationary distribution (i.e., equilibrium distribution) is unique [3]. To obtain the stationary distribution of the Markov chain, we can raise \(P\) to a large power (every row of \(P^t\) converges to \(\pi^T\)), solve the linear equations \(\pi^T=\pi^TP\) together with \(\sum_i \pi_i=1\), or use eigendecomposition (\(\pi\) is the left eigenvector of \(P\) with eigenvalue 1, normalized to sum to 1).
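The first of these approaches can be sketched in a few lines of Python (standard library only). The matrix `P` and the helper names are hypothetical, chosen only to illustrate the idea, and the sketch assumes an ergodic chain so that the limit exists. It repeatedly applies \(\pi^T \leftarrow \pi^T P\) until the vector stops changing, then compares the result with the rows of a high matrix power. The linear-equation and eigendecomposition routes would normally rely on a linear algebra library and are not shown.

```python
def vec_mat(v, matrix):
    """Row vector times matrix: result[i] = sum_k v[k] * matrix[k][i]."""
    n = len(matrix)
    return [sum(v[k] * matrix[k][i] for k in range(n)) for i in range(n)]


def mat_mat(a, b):
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(matrix, t):
    """Compute matrix^t by repeated multiplication (fine for small t and small matrices)."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mat(result, matrix)
    return result


def stationary_by_iteration(matrix, tol=1e-12, max_steps=10_000):
    """Iterate pi^T <- pi^T P from the uniform distribution until it stops changing."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_steps):
        nxt = vec_mat(pi, matrix)
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("no convergence; the chain may not be ergodic")


# Hypothetical ergodic 3-state chain, for illustration only.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

pi = stationary_by_iteration(P)
print(pi)                   # approximate stationary distribution
print(vec_mat(pi, P))       # applying P again leaves pi (numerically) unchanged
print(mat_power(P, 60)[0])  # every row of a high power of P approaches pi
```

For this particular matrix the stationarity equations can also be solved by hand, giving \(\pi = (0.25, 0.5, 0.25)\), which the iteration should reproduce up to floating-point error.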

Markov Chains in Genetic Drift

A discrete Markov chain can be quite helpful


This site was created with R Markdown