
Two by Two Matrix Jacobians


This notebook emphasizes the multiple views of Jacobians with examples of 2x2 matrix functions.

In particular we will see the

  • Symbolic "vec" format producing 4x4 matrices (generally n² by n² or mn by mn)

  • Numerical formats

  • The important Linear Transformation view

  • Kronecker notation

  • An example using ForwardDiff automatic differentiation

We also emphasize that matrix factorizations are matrix functions too, just as much as the square and the cube.


Symbolic Matrices

$\begin{bmatrix} r\cos(\theta) & r\sin(\theta) \\ r\sin(\theta) & -r\cos(\theta) \end{bmatrix}$

X = $\begin{bmatrix} p & r \\ q & s \end{bmatrix}$

vec

The vec command in Julia and in standard mathematics flattens a matrix column by column.

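A quick sketch of what this means in Julia (the numeric matrix is just an illustration):

    X = [1 2; 3 4]
    vec(X)            # == [1, 3, 2, 4]: first column, then second column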

1) The matrix square function

$X^2 = \begin{bmatrix} p^2+qr & pr+rs \\ pq+qs & qr+s^2 \end{bmatrix}$

Symbolic Jacobian

The Jacobian of the (flattened) matrix function X² symbolically

jac (generic function with 1 method)
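The jac function itself is not spelled out in this export; here is a minimal sketch of what it presumably does, assuming Symbolics.jl and taking X = [p r; q s] so that vec(X) == [p, q, r, s]:

    using Symbolics
    @variables p q r s
    X = [p r; q s]                            # vec(X) == [p, q, r, s]
    jac(Y, X) = Symbolics.jacobian(vec(Y), vec(X))
    J = jac(X^2, X)                           # the 4×4 Jacobian shown below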
J = $\begin{bmatrix} 2p & r & q & 0 \\ q & p+s & 0 & q \\ r & 0 & p+s & r \\ 0 & r & q & 2s \end{bmatrix}$

Numerical Jacobian

$\begin{bmatrix} 2 & 2 & 3 & 0 \\ 3 & 5 & 0 & 3 \\ 2 & 0 & 5 & 2 \\ 0 & 2 & 3 & 8 \end{bmatrix}$
2×2 Matrix{Float64}:
 0.00140007  0.0020001
 0.00300015  0.00440022
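A sketch of the finite-difference computation behind these numbers; the perturbation dX = 0.0001·X is an assumption that reproduces the displayed values:

    X  = [1 2; 3 4]
    dX = 0.0001 * [1 2; 3 4]
    (X + dX)^2 - X^2          # ≈ [0.0014 0.002; 0.003 0.0044], first order in dX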

Linear Transformation Jacobian

Notice: there is no flattening; this is just matrix to matrix.

linear_transformation (generic function with 1 method)
2×2 Matrix{Float64}:
 0.0014  0.002
 0.003   0.0044
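A sketch of the linear-transformation view: the Jacobian is the map dX ↦ X dX + dX X with no flattening (the function name and test data are illustrative):

    linear_transformation(X, dX) = X*dX + dX*X
    linear_transformation([1 2; 3 4], 0.0001*[1 2; 3 4])   # ≈ [0.0014 0.002; 0.003 0.0044]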

Kronecker product or ⊗ notation

Notation that kind of lets you think "flattened" or "not flattened" at the same time


Notice all possible products of entries of the first matrix with entries of the second:

$\begin{bmatrix} ap & aq \\ ar & as \\ bp & bq \\ br & bs \end{bmatrix}$

$\begin{bmatrix} ap & aq & cp & cq \\ ar & as & cr & cs \\ bp & bq & dp & dq \\ br & bs & dr & ds \end{bmatrix}$
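A small numeric sketch of Julia's kron, which builds the same block pattern shown symbolically above:

    using LinearAlgebra
    kron([1 2; 3 4], [10 20; 30 40])
    # 4×4 result: each entry of the first matrix scales a full copy of the second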

[ a🍕  a👽  b🍕  b👽  c🍕  c👽
  a🐼  a😸  b🐼  b😸  c🐼  c😸
  d🍕  d👽  e🍕  e👽  f🍕  f👽
  d🐼  d😸  e🐼  e😸  f🐼  f😸 ]

[ 🍕  0  👽  0
  0  🍕  0  👽
  🐼  0  😸  0
  0  🐼  0  😸 ]

[ 🍕  👽  0  0
  🐼  😸  0  0
  0  0  🍕  👽
  0  0  🐼  😸 ]

$\begin{bmatrix} p & r & 0 & 0 \\ q & s & 0 & 0 \\ 0 & 0 & p & r \\ 0 & 0 & q & s \end{bmatrix}$

$\begin{bmatrix} p & 0 & q & 0 \\ 0 & p & 0 & q \\ r & 0 & s & 0 \\ 0 & r & 0 & s \end{bmatrix}$

It is very reasonable to express the Jacobian of the matrix square function as
$I_2 \otimes X + X^T \otimes I_2$

Key Kronecker identity

(A ⊗ B) * vec(C) = vec(BCAᵀ)

true
25×25 Matrix{Float64}:
 0.0012953    0.166326   0.0562614  0.0499152  …  0.184296   0.163508   0.589165
 0.143089     0.0619388  0.0722452  0.192885      0.236655   0.631837   0.727104
 0.0310261    0.214501   0.0969148  0.0296121     0.317465   0.0970006  0.520203
 0.0162279    0.136538   0.138209   0.0550643     0.452734   0.180375   0.0318315
 0.0896786    0.0640814  0.0263767  0.0670469     0.0864026  0.219626   0.577041
 0.000992756  0.127477   0.0431203  0.0382564  …  0.0449995  0.0399237  0.143856
 0.109667     0.0474716  0.0553708  0.147833      0.0577839  0.154276   0.177537
 ⋮                                             ⋱                        
 0.134359     0.0960087  0.0395185  0.100452      0.104482   0.265582   0.697785
 0.00143673   0.184485   0.0624041  0.0553651  …  0.200821   0.178169   0.641993
 0.158711     0.0687014  0.0801332  0.213945      0.257875   0.688491   0.792301
 0.0344136    0.237921   0.107496   0.0328452     0.345931   0.105698   0.566848
 0.0179997    0.151445   0.1533     0.0610763     0.493329   0.196548   0.0346857
 0.09947      0.0710779  0.0292566  0.0743673     0.09415    0.239319   0.628782
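A sketch of the kind of numerical check that produces the "true" above; the 5×5 sizes are an assumption consistent with the 25×25 output:

    using LinearAlgebra
    A, B, C = rand(5,5), rand(5,5), rand(5,5)
    kron(A, B) * vec(C) ≈ vec(B * C * A')   # true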

Useful Kronecker identities

  • $(A \otimes B)^T = A^T \otimes B^T$

  • $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$

  • $\det(A \otimes B) = \det(A)^m \det(B)^n$, where $A$ is $n \times n$ and $B$ is $m \times m$

  • $\operatorname{trace}(A \otimes B) = \operatorname{trace}(A)\,\operatorname{trace}(B)$

  • $A \otimes B$ is orthogonal if $A$ and $B$ are orthogonal

  • $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$

  • If $Au = \lambda u$ and $Bv = \mu v$, then with $X = vu^T$ we have $BXA^T = \lambda\mu X$ and also $AX^TB^T = \lambda\mu X^T$. Therefore $A \otimes B$ and $B \otimes A$ have the same eigenvalues, and transposed eigenvectors.

(See Wikipedia for more properties; two of these identities are spot-checked numerically below.)
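Two of the identities above, spot-checked numerically (the sizes are arbitrary):

    using LinearAlgebra
    A, B, C, D = rand(3,3), rand(2,2), rand(3,3), rand(2,2)
    kron(A, B) * kron(C, D) ≈ kron(A*C, B*D)    # true
    det(kron(A, B)) ≈ det(A)^2 * det(B)^3       # true: A is n×n = 3×3, B is m×m = 2×2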

The Jacobian in Kronecker notation


You see that (I⊗X + Xᵀ⊗I) vec(dX) = vec(X dX + dX X) = vec(d(X²)),
showing that d(X²) = (I⊗X + Xᵀ⊗I) dX.

(I feel it's okay to drop the "vec" and think of the Kronecker notation as defining a linear operator from matrices to matrices.)

Do look this over.

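A numeric sketch tying the Kronecker form to the Jacobian computed earlier (using the same X = [1 2; 3 4]):

    using LinearAlgebra
    X  = [1 2; 3 4]
    I2 = [1 0; 0 1]
    kron(I2, X) + kron(X', I2)    # == [2 2 3 0; 3 5 0 3; 2 0 5 2; 0 2 3 8]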

Automatic Differentiation (neither finite differences nor symbolic)

It comes in forward and reverse modes. Let's try forward.

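A minimal sketch of the forward-mode computation, assuming ForwardDiff.jl; the input ordering [p, q, r, s] = [1, 3, 2, 4] makes X = [1 2; 3 4]:

    using ForwardDiff
    f(x) = vec(reshape(x, 2, 2)^2)           # x is vec(X); return vec(X²)
    ForwardDiff.jacobian(f, [1, 3, 2, 4])    # == [2 2 3 0; 3 5 0 3; 2 0 5 2; 0 2 3 8]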

$\begin{bmatrix} 2p & r & q & 0 \\ q & p+s & 0 & q \\ r & 0 & p+s & r \\ 0 & r & q & 2s \end{bmatrix}$
4×4 Matrix{Int64}:
 2  2  3  0
 3  5  0  3
 2  0  5  2
 0  2  3  8

$\begin{bmatrix} 2 & 2 & 3 & 0 \\ 3 & 5 & 0 & 3 \\ 2 & 0 & 5 & 2 \\ 0 & 2 & 3 & 8 \end{bmatrix}$

$\begin{bmatrix} 2p & r & q & 0 \\ q & p+s & 0 & q \\ r & 0 & p+s & r \\ 0 & r & q & 2s \end{bmatrix}$

2) The matrix cube function

$X^3 = \begin{bmatrix} p^3+2pqr+qrs & p^2r+qr^2+prs+rs^2 \\ p^2q+q^2r+pqs+qs^2 & pqr+2qrs+s^3 \end{bmatrix}$

Symbolic Jacobian

The Jacobian of the (flattened) matrix function X³ symbolically

$\begin{bmatrix} 3p^2+2qr & 2pr+rs & 2pq+qs & qr \\ 2pq+qs & p^2+ps+s^2+2qr & q^2 & pq+2qs \\ 2pr+rs & r^2 & p^2+ps+s^2+2qr & pr+2rs \\ qr & pr+2rs & pq+2qs & 2qr+3s^2 \end{bmatrix}$

$\begin{bmatrix} 3p^2+2qr & 2pr+rs & 2pq+qs & qr \\ 2pq+qs & p^2+ps+s^2+2qr & q^2 & pq+2qs \\ 2pr+rs & r^2 & p^2+ps+s^2+2qr & pr+2rs \\ qr & pr+2rs & pq+2qs & 2qr+3s^2 \end{bmatrix}$

Linear Transformation Jacobian


dX X² + X dX X + X² dX


with numerical data:

#7 (generic function with 1 method)
2×2 Matrix{Float64}:
 0.0111011  0.0162016
 0.0243024  0.0354035
2×2 Matrix{Float64}:
 0.0111  0.0162
 0.0243  0.0354
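A sketch of the cube's linear-transformation Jacobian applied to the same data (the function name is illustrative, and dX = 0.0001·X is assumed as before):

    d_cube(X, dX) = dX*X^2 + X*dX*X + X^2*dX
    d_cube([1 2; 3 4], 0.0001*[1 2; 3 4])    # ≈ [0.0111 0.0162; 0.0243 0.0354]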

check against the symbolic answer

$\begin{bmatrix} 15 & 12 & 18 & 6 \\ 18 & 33 & 9 & 27 \\ 12 & 4 & 33 & 18 \\ 6 & 18 & 27 & 60 \end{bmatrix}$

The Jacobian in Kronecker Notation

$\begin{bmatrix} 3p^2+2qr & 2pr+rs & 2pq+qs & qr \\ 2pq+qs & p^2+ps+s^2+2qr & q^2 & pq+2qs \\ 2pr+rs & r^2 & p^2+ps+s^2+2qr & pr+2rs \\ qr & pr+2rs & pq+2qs & 2qr+3s^2 \end{bmatrix}$
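The same Jacobian in Kronecker form, I⊗X² + Xᵀ⊗X + (X²)ᵀ⊗I, sketched numerically:

    using LinearAlgebra
    X  = [1 2; 3 4]
    I2 = [1 0; 0 1]
    kron(I2, X^2) + kron(X', X) + kron((X^2)', I2)
    # == [15 12 18 6; 18 33 9 27; 12 4 33 18; 6 18 27 60], matching the check above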

3) The LU Decomposition

Recall that the LU decomposition factors a matrix into a unit lower-triangular matrix times an upper-triangular matrix:

$\begin{bmatrix} p & r \\ q & s \end{bmatrix}$

The four entries of X (p, q, r, s) are transformed into the four nontrivial entries of L and U; the Jacobian of that transformation is:

$\begin{bmatrix} -\frac{q}{p^2} & \frac{1}{p} & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \frac{qr}{p^2} & -\frac{r}{p} & -\frac{q}{p} & 1 \end{bmatrix}$
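A sketch (assuming Symbolics.jl and the same X = [p r; q s]) that reproduces the Jacobian above from the closed-form LU entries:

    using Symbolics
    @variables p q r s
    # L = [1 0; q/p 1],  U = [p r; 0 s - q*r/p]
    L21, U11, U12, U22 = q/p, p, r, s - q*r/p
    Symbolics.jacobian([L21, U11, U12, U22], [p, q, r, s])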

Exercise: Relate this to d(LU) = dL U + L dU


4) Traceless symmetric eigenproblem: an example with two parameters, not four

S = $\begin{bmatrix} p & s \\ s & -p \end{bmatrix}$

We know that the eigenvalues add to 0 (from the trace) and the eigenvectors are orthogonal (from being symmetric), so we can represent the eigenvectors and eigenvalues:

Q = $\begin{bmatrix} \cos(\tfrac{\theta}{2}) & -\sin(\tfrac{\theta}{2}) \\ \sin(\tfrac{\theta}{2}) & \cos(\tfrac{\theta}{2}) \end{bmatrix}$

Λ = $\begin{bmatrix} r & 0 \\ 0 & -r \end{bmatrix}$

QΛQᵀ = $\begin{bmatrix} r\cos(\theta) & r\sin(\theta) \\ r\sin(\theta) & -r\cos(\theta) \end{bmatrix}$

The relationship is p = r cos(θ), s = r sin(θ); here is its Jacobian with respect to (r, θ):

$\begin{bmatrix} \cos(\theta) & -r\sin(\theta) \\ \sin(\theta) & r\cos(\theta) \end{bmatrix}$

Interesting mathematical observation: these are the formulas you may remember from other classes relating Cartesian coordinates to polar coordinates in the plane.

jacobian_det = r
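A sketch of that determinant computation, assuming Symbolics.jl:

    using Symbolics
    @variables r θ
    J = Symbolics.jacobian([r*cos(θ), r*sin(θ)], [r, θ])  # [cos(θ) -r*sin(θ); sin(θ) r*cos(θ)]
    J[1,1]*J[2,2] - J[2,1]*J[1,2]                          # r*cos(θ)^2 + r*sin(θ)^2, i.e. r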

Mathematical aside: det J = r. This is the change of variables from x, y to r, θ that you may have seen in 18.02; this eigenvalue problem is the same as the Cartesian-to-polar representation of the plane, often written dx dy = r dr dθ.


5) The full 2x2 symmetric eigenproblem

We think of

$\begin{bmatrix} p & s \\ s & r \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}^T$

as the function from $\lambda_1, \lambda_2, \theta$ to $p, r, s$.

S = QΛQ'; the Jacobian of $(p, r, s)$ with respect to $(\lambda_1, \lambda_2, \theta)$:

$\begin{bmatrix} \cos^2(\theta) & \sin^2(\theta) & -2\lambda_1\cos(\theta)\sin(\theta) + 2\lambda_2\cos(\theta)\sin(\theta) \\ \sin^2(\theta) & \cos^2(\theta) & 2\lambda_1\cos(\theta)\sin(\theta) - 2\lambda_2\cos(\theta)\sin(\theta) \\ \cos(\theta)\sin(\theta) & -\cos(\theta)\sin(\theta) & \cos^2(\theta)\lambda_1 + \sin^2(\theta)\lambda_2 - \cos^2(\theta)\lambda_2 - \sin^2(\theta)\lambda_1 \end{bmatrix}$
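A numerical sketch supporting the determinant claim below; the test values λ₁ = 3, λ₂ = 1, θ = 0.7 are arbitrary:

    using LinearAlgebra, ForwardDiff
    function prs(v)
        λ1, λ2, θ = v
        Q = [cos(θ) -sin(θ); sin(θ) cos(θ)]
        S = Q * Diagonal([λ1, λ2]) * Q'
        [S[1, 1], S[2, 2], S[1, 2]]                 # (p, r, s)
    end
    det(ForwardDiff.jacobian(prs, [3.0, 1.0, 0.7])) # ≈ 2.0 == λ₁ - λ₂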

The determinant of this transformation simplifies to $\lambda_1 - \lambda_2$, which some people interpret as a kind of repulsion between the two eigenvalues: there is a tendency for the two eigenvalues not to want to be too close together. (If the two eigenvalues are equal, then for n = 2 the matrix is αI; that single condition takes the three parameters down to one.)
