Last updated: 2017-03-04

Code version: 5d0fa13282db4a97dc7d62e2d704e88a5afdb824

# Pre-requisites

This document assumes basic familiarity with Markov chains and linear algebra.

# Overview

In this note, we illustrate one way of analytically obtaining the stationary distribution for a finite discrete Markov chain.

# 3x3 example

Assume our probability transition matrix is: $P = \begin{bmatrix} 0.7 & 0.2 & 0.1 \\ 0.4 & 0.6 & 0 \\ 0 & 1 & 0 \end{bmatrix}$

Since every state is accessible from every other state, this Markov chain is irreducible. Every irreducible finite state space Markov chain has a unique stationary distribution. Recall that the stationary distribution $$\pi$$ is the vector such that $\pi = \pi P$.

Therefore, we can find our stationary distribution by solving the following linear system: \begin{align*} 0.7\pi_1 + 0.4\pi_2 &= \pi_1 \\ 0.2\pi_1 + 0.6\pi_2 + \pi_3 &= \pi_2 \\ 0.1\pi_1 &= \pi_3 \end{align*} subject to $$\pi_1 + \pi_2 + \pi_3 = 1$$. Putting these four equations together and moving all of the variables to the left hand side, we get the following linear system: \begin{align*} -0.3\pi_1 + 0.4\pi_2 &= 0 \\ 0.2\pi_1 + -0.4\pi_2 + \pi_3 &= 0 \\ 0.1\pi_1 - \pi_3 &= 0 \\ \pi_1 + \pi_2 + \pi_3 &= 1 \end{align*}

We will define the linear system in matrix notation: $\underbrace{\begin{bmatrix} -0.3 & 0.4 & 0 \\ 0.2 & -0.4 & 1 \\ 0.1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}}_A \begin{bmatrix} \pi_1 \\ \pi_2 \\ \pi_3 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}}_b \\ A\pi = b$

Since this linear system has more equations than unknowns, it is an overdetermined system. Recall from linear algebra that an overdetermined system is consistent (i.e. we can solve for $$\pi$$ exactly) when $$b$$ is in the column space of $$A$$. We can check this numerically by obtaining the rank of $$A$$, then obtaining the rank of an augmented matrix with $$b$$ appended as a column of $$A$$. (Side note: it can be difficult to find the rank of an arbitrary matrix numerically, especially for large matrices. But for our small example, we are safe.)

library(Matrix)
A        <- matrix(c(-0.3, 0.2, 0.1, 1, 0.4, -0.4, 0, 1, 0, 1, -1, 1 ), ncol=3,nrow=4)
b        <- c(0,0,0, 1)
rank.A   <- as.numeric(rankMatrix(A))
rank.Ab  <- as.numeric(rankMatrix(cbind(A,b)))
print(paste("The rank of A =", rank.A, "and the rank of the augmented matrix =", rank.Ab))
[1] "The rank of A = 3 and the rank of the augmented matrix = 3"

We see that $$A$$ has full column rank, and that the rank is unchanged when we add $$b$$ as a column. Therefore, $$b$$ is in the column space of $$A$$, and this system is consistent. We can find $$\pi$$ by solving the normal equations: $A^TA\pi = A^Tb$

We use the solve function in R to solve for the stationary distribution $$\pi$$:

pi        <- drop(solve(t(A) %*% A, t(A) %*% b))
names(pi) <- c('state.1', 'state.2', 'state.3')
pi 
   state.1    state.2    state.3
0.54054054 0.40540541 0.05405405 

We find that: \begin{align*} \pi_1 \approx 0.54, \pi_2 \approx 0.41, \pi_3 \approx 0.05 \end{align*}

Therefore, under proper conditions, we expect the Markov chain to spend more time in states 1 and 2 as the chain evolves.

## The General Approach

Recall that we are attempting to find a solution to $\pi = \pi P$ such that $$\sum_i \pi_i =1$$. First we rearrange the expression above to get: \begin{align} \pi - \pi P &= 0 \\ (I - P)\pi &= 0 \end{align}

One challenge though is that we need the constrained solution which respects that $$\pi$$ describes a probability distribution (i.e. $$\sum \pi_i = 1$$). Luckily this is a linear constraint that is easily represented by adding another equation to the system. So as a small trick, we need to add a row of all 1’s to our $$(I-P)$$ (call this new matrix $$A$$) and a 1 to the last element of the zero vector on the right hand side (call this new vector $$b$$). Now we want to solve $$A\pi = b$$ which is overdetermined, so we solve the normal equations $$A^TA\pi = A^Tb$$.

# Session information

sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] tidyr_0.4.1     dplyr_0.5.0     ggplot2_2.1.0   knitr_1.15.1
[5] MASS_7.3-45     expm_0.999-0    Matrix_1.2-6    viridis_0.3.4
[9] workflowr_0.3.0 rmarkdown_1.3

loaded via a namespace (and not attached):
[1] Rcpp_0.12.5      git2r_0.18.0     plyr_1.8.4       tools_3.3.0
[5] digest_0.6.9     evaluate_0.10    tibble_1.1       gtable_0.2.0
[9] lattice_0.20-33  shiny_0.13.2     DBI_0.4-1        yaml_2.1.14
[13] gridExtra_2.2.1  stringr_1.2.0    gtools_3.5.0     rprojroot_1.2
[17] grid_3.3.0       R6_2.1.2         reshape2_1.4.1   magrittr_1.5
[21] backports_1.0.5  scales_0.4.0     htmltools_0.3.5  assertthat_0.1
[25] mime_0.5         colorspace_1.2-6 xtable_1.8-2     httpuv_1.3.3
[29] labeling_0.3     stringi_1.1.2    munsell_0.4.3   

This site was created with R Markdown