Last updated: 2019-05-22
This reproducible R Markdown analysis was created with workflowr (version 1.3.0).
These are the previous versions of the R Markdown and HTML files.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 4ce8e85 | Anthony Hung | 2019-05-21 | bandersnatch add |
html | 096760a | Anthony Hung | 2019-05-18 | Build site. |
Rmd | 193ab25 | Anthony Hung | 2019-05-18 | additions to complete mult testing |
html | 193ab25 | Anthony Hung | 2019-05-18 | additions to complete mult testing |
html | da98ae8 | Anthony Hung | 2019-05-17 | Build site. |
Rmd | 239723e | Anthony Hung | 2019-05-08 | Update learning objectives |
html | 2ec7944 | Anthony Hung | 2019-05-06 | Build site. |
Rmd | d45dca4 | Anthony Hung | 2019-05-06 | Republish |
html | d45dca4 | Anthony Hung | 2019-05-06 | Republish |
Rmd | ee75486 | Anthony Hung | 2019-05-04 | Build site. |
Markov chains are models that describe a sequence of events drawn from a countable set of possible states, in which the probability of each transition depends only on the state immediately preceding it. Markov chains are a staple in computational statistics. Our objective today is to learn the basics behind Markov chains and their long-run behavior.
The Markov assumption states that, to predict the future behavior of a system, all that is required is knowledge of the present state of the system, not of its past states. For example, given a set of times \(t_1, t_2, t_3, t_4\) and the states \(X_1, X_2, X_3, X_4\) of the system at those times, the Markov assumption (also called the Markov property) implies:
\[P(X_4=1|X_3=0, X_2=1, X_1=1) = P(X_4=1|X_3=0)\]
In other words, “the past and the future are conditionally independent given the present”. If we have knowledge about the present, then knowing the past does not give us any more information to predict what will happen in the future. Another term that is commonly used to describe Markov chains is “memorylessness.”
Question: Which distribution that we have discussed in probability is also characterized by the property of “memorylessness”?
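The Markov property stated above can also be checked empirically. The following is a minimal sketch in R, not part of the original analysis: it simulates a two-state chain and then estimates the probability of moving from state 0 to state 1, with and without conditioning on the state one step further back. The transition probabilities in `P`, the chain length `n`, and all variable names are assumptions chosen only for illustration.

```r
# Simulate a two-state (0/1) Markov chain and check the Markov property empirically.
set.seed(1)

# Hypothetical transition probabilities (illustrative values only):
# P[i, j] = P(next state = j - 1 | current state = i - 1)
P <- matrix(c(0.9, 0.1,
              0.4, 0.6),
            nrow = 2, byrow = TRUE)

n <- 100000          # length of the simulated chain (assumed)
u <- runif(n)        # one uniform draw per time step
x <- integer(n)
x[1] <- 0L           # start the chain in state 0

for (t in 2:n) {
  # The next state is drawn using only the current state x[t - 1]
  p_one <- P[x[t - 1] + 1, 2]
  x[t] <- as.integer(u[t] < p_one)
}

# Align each state with the two states that precede it
x_prev2 <- x[1:(n - 2)]
x_prev1 <- x[2:(n - 1)]
x_next  <- x[3:n]

estimates <- c(
  # P(next = 1 | present = 0), ignoring the earlier history
  present_only = mean(x_next[x_prev1 == 0] == 1),
  # The same probability, additionally conditioning on the state before that
  past_was_0 = mean(x_next[x_prev1 == 0 & x_prev2 == 0] == 1),
  past_was_1 = mean(x_next[x_prev1 == 0 & x_prev2 == 1] == 1)
)

print(round(estimates, 3))
```

Because the simulated chain is Markov by construction, all three estimates should be close to the value used in the simulation, 0.1, whichever state came before the present one.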
The central dogma of biology describes how information moves from DNA to RNA to protein.
\[DNA \rightarrow RNA \rightarrow Protein\]
The assumption under the central dogma is that information flows in only one direction and never backwards. Under a Markov chain model of the central dogma, the amount of RNA you observe in a cell is some function of the genetic variation at the DNA sequence level (in coding and noncoding regulatory regions), and the amount of protein in the cell is some function of the abundance of the RNA transcripts coding for that protein. If you know the amount of RNA in the cell, then knowing the underlying DNA sequence of the gene encoding the protein gives you no additional information for predicting the amount of protein in the cell. There are exceptions to such a simple model of biology, but in the vast majority of cases this model does a very good job of describing biological networks.
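In the notation of the Markov property above, the conditional independence in this example can be sketched as follows. Treating DNA, RNA, and Protein as random variables for the genotype, the transcript abundance, and the protein abundance is a simplifying assumption of this illustration:

\[P(\text{Protein} \mid \text{RNA}, \text{DNA}) = P(\text{Protein} \mid \text{RNA})\]

Equivalently, the joint distribution factors along the chain, with each term depending only on the step immediately before it:

\[P(\text{DNA}, \text{RNA}, \text{Protein}) = P(\text{DNA}) \, P(\text{RNA} \mid \text{DNA}) \, P(\text{Protein} \mid \text{RNA})\]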
A Markov chain can be described by two