Pre-requisites

Overview

Definition of a Markov Chain

Example: Gary’s mood

What are the expected probability states after one or two steps?

What about an abitrary number of time steps?

Session information

Last updated: 2017-03-04

Code version: 5d0fa13282db4a97dc7d62e2d704e88a5afdb824

Pre-requisites

An understanding of matrix multiplication and matrix powers.

Overview

Here we provide a quick introduction to discrete Markov Chains.

Definition of a Markov Chain

A Markov Chain is a discrete stochastic process with the Markov property : $P(X_t|X_{t-1},\ldots,X_1)= P(X_t|X_{t-1})$ . It is fully determined by a probability transition matrix $P$ which defines the transition probabilities ( $P_ij=P(X_t=j|X_{t-1}=i)$ and an initial probability distribution specified by the vector $x$ where $x_i=P(X_0=i)$ . The time-dependent random variable $X_t$ is describing the state of our probabilistic system at time-step $t$ .

Example: Gary’s mood

In Sheldon Ross’s Introduction to Probability Models, he has an example (4.3) of a Markov Chain for modeling Gary’s mood. Gary alternates between 3 state: Cheery ( $X=1$ ), So-So ( $X=2$ ), or Glum ( $X=3$ ). Here we input the $P$ matrix given by Ross and we input an arbitrary initial probability matrix.

# Define prob transition matrix 
# (note matrix() takes vectors in column form so there is a transpose here to switch col's to row's)
P=t(matrix(c(c(0.5,0.4,0.1),c(0.3,0.4,0.3),c(0.2,0.3,0.5)),nrow=3))
# Check sum across = 1
apply(P,1,sum)

[1] 1 1 1

# Definte initial probability vector
x0=c(0.1,0.2,0.7)
# Check sums to 1
sum(x0)

[1] 1

What are the expected probability states after one or two steps?

If initial prob distribution $x_0$ is $3 \times 1$ column vector, then $x_0^T P= x_1^T$ . In R, the %*% operator automatically promotes a vector to the appropriate matrix to make the arguments conformable. In the case of multiplying a length 3 vector by a $3\time 3$ matrix, it takes the vector to be a row-vector. This means our math can look simply:

# After one step
x0%*%P

     [,1] [,2] [,3]
[1,] 0.25 0.33 0.42

And after two time-steps:

## The two-step prob trans matrix
P%*%P

     [,1] [,2] [,3]
[1,] 0.39 0.39 0.22
[2,] 0.33 0.37 0.30
[3,] 0.29 0.35 0.36

## Multiplied by the initial state probability
x0%*%P%*%P

      [,1]  [,2]  [,3]
[1,] 0.308 0.358 0.334

What about an abitrary number of time steps?

To generalize to an arbitrary number of time steps into the future, we can compute a the matrix power. In R, this can be done easily with the package expm. Let’s load the library and verify the second power is the same as we saw for P%*%P above.

# Load library 
library(expm)
# Verify the second power is P%*%P
P%^%2

     [,1] [,2] [,3]
[1,] 0.39 0.39 0.22
[2,] 0.33 0.37 0.30
[3,] 0.29 0.35 0.36

And now let’s push this : Looking at the state of the chain after many steps, say 100. First let’s look at the probability transition matrix…

P%^%100

          [,1]      [,2]      [,3]
[1,] 0.3387097 0.3709677 0.2903226
[2,] 0.3387097 0.3709677 0.2903226
[3,] 0.3387097 0.3709677 0.2903226

What do you notice about the rows? And let’s see what this does for various starting distributions:

c(1,0,0) %*%(P%^%100)

          [,1]      [,2]      [,3]
[1,] 0.3387097 0.3709677 0.2903226

c(0.2,0.5,0.3) %*%(P%^%100)

          [,1]      [,2]      [,3]
[1,] 0.3387097 0.3709677 0.2903226

Note that after a large number of steps the initial state does not matter any more, the probability of the chain being in any state $j$ is independent of where we started. This is our first view of the equilibrium distribuion of a Markov Chain. These are also known as the limiting probabilities of a Markov chain or stationary distribution.

Session information

sessionInfo()

R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tidyr_0.4.1     dplyr_0.5.0     ggplot2_2.1.0   knitr_1.15.1   
 [5] MASS_7.3-45     expm_0.999-0    Matrix_1.2-6    viridis_0.3.4  
 [9] workflowr_0.3.0 rmarkdown_1.3  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.5      git2r_0.18.0     plyr_1.8.4       tools_3.3.0     
 [5] digest_0.6.9     evaluate_0.10    tibble_1.1       gtable_0.2.0    
 [9] lattice_0.20-33  shiny_0.13.2     DBI_0.4-1        yaml_2.1.14     
[13] gridExtra_2.2.1  stringr_1.2.0    gtools_3.5.0     rprojroot_1.2   
[17] grid_3.3.0       R6_2.1.2         reshape2_1.4.1   magrittr_1.5    
[21] backports_1.0.5  scales_0.4.0     htmltools_0.3.5  assertthat_0.1  
[25] mime_0.5         colorspace_1.2-6 xtable_1.8-2     httpuv_1.3.3    
[29] labeling_0.3     stringi_1.1.2    munsell_0.4.3

This site was created with R Markdown

Introduction to Discrete Markov Chains

John Novembre

2016-01-31