  • Pre-requisites
  • Introduction
  • Some linear algebra
    • Interpretation as a random linear combination of eigenvectors
  • Example: rank 1 covariance

Last updated: 2021-03-01


Pre-requisites

You should be familiar with the multivariate normal distribution, and with the eigen-decomposition of symmetric positive semi-definite (PSD) matrices.


Introduction

Getting an intuition for what the $p$-dimensional multivariate normal distribution, $N_p(\mu,\Sigma)$, “looks like” can be difficult. For $p=1,2$ things are not too bad: we can directly visualize a univariate normal distribution by plotting its density, and visualize a bivariate normal distribution by drawing a contour plot of the density, or by simulating samples from the distribution and displaying them in a 2d scatterplot. For example, the following code does this for $N(0,\Sigma)$ where $\Sigma = \begin{pmatrix} 1 & 0.7 \\ 0.7 & 1 \end{pmatrix}$:

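library(mvtnorm)  # provides rmvnorm()
# Covariance matrix: unit variances, covariance 0.7
Sigma = cbind(c(1,0.7),c(0.7,1))
# Draw 1000 samples from N(0, Sigma); each row of X is one sample
X = rmvnorm(1000,c(0,0),Sigma)
plot(X[,1],X[,2],main="Samples from bivariate normal with variance Sigma",asp=1)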

But in $p=100$ dimensions, or even just $p=4$ dimensions, things become much harder because direct visualization is impractical.

So how can we get intuition about the multivariate normal distribution, $N_p(\mu,\Sigma)$, when $p$ is large?

Note first that the mean $\mu$ is just a vector of $p$ numbers, and generally causes few problems in interpretation: you can just think of each number as specifying the mean in each of the $p$ coordinates, one at a time.

In contrast, the covariance matrix Σ is a p×p matrix that captures potentially more complex patterns, and creates more challenges for intuition. One possible approach is to plot a heatmap of this matrix, and this can certainly be helpful in certain situations. However, this vignette describes a more algebraic approach, based on the eigen-decomposition of Σ.

Some linear algebra

Recall that any valid $p \times p$ covariance matrix $\Sigma$ must be symmetric and positive semi-definite (PSD). Furthermore, recall that any such PSD matrix has an eigen-decomposition $\Sigma = V \Lambda V^\top$, where:

  • $\Lambda$ is a $K \times K$ diagonal matrix with the non-zero eigenvalues of $\Sigma$, $\lambda_1,\dots,\lambda_K$ say, on the diagonal ($K \le p$ is the rank of $\Sigma$).

  • $V$ is a $p \times K$ matrix with orthonormal columns ($V^\top V = I_K$), whose columns $v_1,\dots,v_K$ are the normalized eigenvectors of $\Sigma$ corresponding to the non-zero eigenvalues.
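As a minimal sketch of this decomposition, the following uses base R's `eigen()` on a small covariance matrix. The 3×3 matrix, the tolerance `tol`, and the variable names are assumptions chosen for illustration. The matrix is built to have rank 2, so one eigenvalue is numerically zero and gets dropped.

```r
# Illustrative rank-2 covariance matrix in p = 3 dimensions (an assumption):
# Sigma = B %*% t(B) for a 3 x 2 matrix B is always symmetric and PSD.
B = cbind(c(1, 1, 0), c(0, 1, 1))
Sigma = B %*% t(B)

e = eigen(Sigma, symmetric = TRUE)

# Keep only the eigenvalues that are non-zero up to a numerical tolerance
tol = 1e-8
keep = e$values > tol
K = sum(keep)                              # rank of Sigma
Lambda = diag(e$values[keep], nrow = K)    # K x K diagonal matrix
V = e$vectors[, keep, drop = FALSE]        # p x K, orthonormal columns

# Check the two identities from the text; both should be close to 0
max(abs(V %*% Lambda %*% t(V) - Sigma))
max(abs(t(V) %*% V - diag(K)))
```

Here the non-zero eigenvalues are 3 and 1, so `K` is 2 even though `p` is 3.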

Recall also that if $Z \sim N_p(0, I_p)$, $A$ is any $n \times p$ matrix, and $\mu$ is any $n$-vector, then $\mu + AZ \sim N_n(\mu, AA^\top)$.
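This result can be checked by simulation. The sketch below uses base R only; the particular `A`, `mu`, sample size `N`, and seed are arbitrary choices made for illustration. It compares the empirical mean and covariance of $\mu + AZ$ with $\mu$ and $AA^\top$.

```r
set.seed(1)
p = 2; n = 3
A = matrix(c(1, 0, 1,
             0, 2, 1), nrow = n, ncol = p)  # columns are (1,0,1) and (0,2,1)
mu = c(1, -1, 0)
N = 100000

# Each column of Z is one draw from N_p(0, I_p)
Z = matrix(rnorm(p * N), nrow = p)

# mu is recycled down each column, so each column of X is mu + A z
X = mu + A %*% Z

rowMeans(X)   # empirical mean, to compare with mu
cov(t(X))     # empirical covariance, to compare with A %*% t(A)
A %*% t(A)
```

Note that `A %*% t(A)` has rank 2 here, so the resulting 3-dimensional normal distribution is degenerate; the result still holds.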

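Now apply this last result with $A = V\Lambda^{0.5}$, where $\Lambda^{0.5}$ is the diagonal matrix with $\lambda_1^{0.5},\dots,\lambda_K^{0.5}$ on the diagonal. Since $A$ is a $p \times K$ matrix, here $Z \sim N_K(0, I_K)$. We get $\mu + V\Lambda^{0.5}Z \sim N_p(\mu, V\Lambda^{0.5}\Lambda^{0.5}V^\top)$. That is, $\mu + V\Lambda^{0.5}Z \sim N_p(\mu,\Sigma)$. We can write the matrix product $V\Lambda^{0.5}Z$ as a sum to make the structure more obvious: $$\mu + \sum_{k=1}^K \lambda_k^{0.5} z_k v_k \sim N_p(\mu,\Sigma).$$ Here $\mu$ and $v_1,\dots,v_K$ are all column vectors of length $p$, whereas the $\lambda_k$ and $z_k$ are all scalars.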
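The construction above doubles as a recipe for simulating from $N_p(\mu,\Sigma)$. Below is a minimal base-R sketch; the helper name `rmvnorm_eigen`, its arguments, and the tolerance `tol` are hypothetical choices made for illustration (this is not the `mvtnorm` API), and it assumes $\Sigma$ has at least one non-zero eigenvalue.

```r
# Hypothetical helper: draw N samples from N_p(mu, Sigma) via mu + V Lambda^0.5 Z
rmvnorm_eigen = function(N, mu, Sigma, tol = 1e-8) {
  e = eigen(Sigma, symmetric = TRUE)
  keep = e$values > tol                  # non-zero eigenvalues only
  K = sum(keep)
  V = e$vectors[, keep, drop = FALSE]    # p x K
  A = V %*% diag(sqrt(e$values[keep]), nrow = K)  # A = V Lambda^0.5
  Z = matrix(rnorm(K * N), nrow = K)     # each column is a draw from N_K(0, I_K)
  t(mu + A %*% Z)                        # N x p, one sample per row
}

set.seed(1)
Sigma = cbind(c(1, 0.7), c(0.7, 1))
X = rmvnorm_eigen(10000, c(0, 0), Sigma)
cov(X)   # should be close to Sigma
```

Returning one sample per row matches the layout used in the scatterplot example earlier, so `X[,1]` and `X[,2]` can be plotted in the same way.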

Interpretation as a random linear combination of eigenvectors

From this algebra, if $X \sim N_p(\mu,\Sigma)$, then we can think of $X$ as being generated by taking the mean $\mu$ and adding a random linear combination of the eigenvectors of $\Sigma$. Specifically, $$X = \mu + \sum_{k=1}^K b_k v_k,$$ where the weights $b_k = \lambda_k^{0.5} z_k \sim N(0,\lambda_k)$ are independent of one another.
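One way to see this numerically is to simulate $X$ by a different route and then recover the weights as $b = V^\top (X - \mu)$. The sketch below is illustrative only: the choice of `Sigma`, `mu`, the seed, and the use of a Cholesky factor to generate the samples are all assumptions made so that the check is not circular.

```r
set.seed(1)
Sigma = cbind(c(1, 0.7), c(0.7, 1))
mu = c(2, -1)
N = 100000

# Draw X ~ N(mu, Sigma) using the Cholesky factor rather than the eigen-decomposition
L = t(chol(Sigma))                             # L %*% t(L) equals Sigma
X = mu + L %*% matrix(rnorm(2 * N), nrow = 2)  # p x N, one sample per column

# Project the centered samples onto the eigenvectors to recover the weights
e = eigen(Sigma, symmetric = TRUE)
V = e$vectors
b = t(V) %*% (X - mu)                          # row k holds the weights b_k

apply(b, 1, var)     # compare with e$values (1.7 and 0.3)
cor(b[1, ], b[2, ])  # should be close to 0
```

The empirical variances of the weights should be close to the eigenvalues, and the weights should be close to uncorrelated, which for jointly normal variables corresponds to independence.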

Note that if $\lambda_k$ is small then $b_k \approx 0$, so the eigenvectors with small eigenvalues contribute little to $X$, and we can focus on the eigenvectors with large eigenvalues. Indeed, this approach provides the simplest insights when most of the $\lambda_k$ are negligible, and only one or two eigenvectors contribute meaningfully to the sum.
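To illustrate how little the small-eigenvalue terms contribute, the sketch below builds $X$ from four orthonormal eigenvectors and compares it with the sum truncated to the two largest eigenvalues. The eigenvectors, the eigenvalues in `lambda`, and the sample size are all made-up values for illustration, with $\mu = 0$.

```r
set.seed(1)

# Four orthonormal vectors in p = 4 dimensions (an illustrative choice)
V = cbind(c(1, 1, 1, 1), c(1, -1, 1, -1), c(1, 1, -1, -1), c(1, -1, -1, 1)) / 2
lambda = c(4, 1, 1e-4, 1e-4)   # two large and two negligible eigenvalues

N = 1000
Z = matrix(rnorm(4 * N), nrow = 4)
b = sqrt(lambda) * Z           # row k holds b_k = lambda_k^0.5 * z_k

X = V %*% b                    # full sum over all four eigenvectors
X_top = V[, 1:2] %*% b[1:2, ]  # sum over the two largest eigenvalues only

# Average squared error of the truncation, relative to the average squared norm of X
err = mean(colSums((X - X_top)^2))
total = mean(colSums(X^2))
err / total
```

In expectation the squared error equals the sum of the dropped eigenvalues, so here the ratio is tiny and the first two eigenvectors describe $X$ almost completely.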

Example: rank 1 covariance

To make a simple example, set $\mu = 0$ and assume $\Sigma$ is a rank 1 matrix. That is, $\Sigma$ has only one eigenvector with a non-zero eigenvalue: $\Sigma = \lambda v v^\top$ for some $p$-vector $v$.

In this case the algebra above gives the representation $X = bv$ where $b \sim N(0,\lambda)$. That is, $X$ is simply a multiple of $v$, where the multiplier is drawn from a univariate normal distribution. Thus in this case the randomness in $X$ boils down to the randomness in a single univariate normal random variable, which is easy to visualize.

To give a specific example, suppose that $v$ is the vector of all 1s, $v = (1,\dots,1)$, and $\lambda = 1$. That is, $\Sigma$ is a matrix of all 1s. Then $X = (b,b,\dots,b)$ where $b \sim N(0,1)$.

To give another specific example, if $v = (1,1,1,1,1)$ and $\lambda = 2$ then $X = (b,b,b,b,b)$ where $b \sim N(0,2)$.
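The rank 1 case is easy to simulate directly. The following base-R sketch uses the values from the last example, $v = (1,1,1,1,1)$ and $\lambda = 2$; the sample size and seed are arbitrary choices for illustration.

```r
set.seed(1)
p = 5
v = rep(1, p)
lambda = 2
Sigma = lambda * v %*% t(v)   # p x p matrix with every entry equal to 2

N = 10000
b = rnorm(N, mean = 0, sd = sqrt(lambda))  # b ~ N(0, lambda)
X = outer(b, v)                            # N x p; row i is b[i] * v

head(X, 3)    # within each row, all five coordinates are identical
qr(Sigma)$rank
cov(X)        # should be close to Sigma
```

Because every coordinate of each sample equals the same draw of $b$, a histogram of any single column of `X` shows all of the randomness in $X$.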

