Math: The gradient of the determinant
Theorem: $\nabla(\det A) = \text{cofactor}(A) = (\det A)A^{-T} = \text{adj}(A^T)$
$d(\det A) = \text{tr}(\det(A)A^{-1}\,dA) = \text{tr}(\text{adj}(A)\,dA) = \text{tr}(\text{cofactor}(A)^T\,dA)$
Note: the adjugate is defined as $(\det A)A^{-1}$; it is the transpose of the cofactor matrix.
adj (generic function with 1 method)
cofactor (generic function with 1 method)
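The two outputs above come from function definitions that are not shown in this export; a minimal sketch consistent with the formulas below would be:

```julia
using LinearAlgebra

# Sketch of the (hidden) definitions, assuming they follow the identities below:
adj(A) = det(A) * inv(A)   # adjugate
cofactor(A) = adj(A)'      # cofactor matrix = transpose of the adjugate
```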
Formulas at a glance:
$A^{-1}$ = adj($A$)/det($A$) = cofactor($A$)$^T$/det($A$)
adj($A$) = det($A$)$A^{-1}$ = cofactor($A$)$^T$
cofactor($A$) = det($A$)$A^{-T}$ = adj($A$)$^{T}$
The adjugate is the transpose of the cofactor matrix. You may remember that the cofactor matrix has as its $(i,j)$ entry $(-1)^{i+j}$ times the determinant obtained by deleting row $i$ and column $j$.
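The 2×2 displays below were presumably generated by symbolic code along these lines (a sketch using Symbolics.jl; the hand-written `detA`, `cof`, and `adjA` are assumptions chosen to reproduce the displayed matrices):

```julia
using Symbolics

@variables a b c d
A = [a c; b d]             # note the layout: matches the first matrix below
detA = a*d - b*c           # 2×2 determinant written out by hand
cof  = [d -b; -c a]        # cofactor matrix
adjA = permutedims(cof)    # adjugate = cofactor transposed: [d -c; -b a]
Ainv = adjA ./ detA        # inverse = adjugate / determinant (last display)
```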
$$\begin{equation} a, \quad b, \quad c, \quad d \end{equation}$$
$$\begin{equation} \left[ \begin{array}{cc} a & c \\ b & d \\ \end{array} \right] \end{equation}$$
$$\begin{equation} \left[ \begin{array}{cc} d & - b \\ - c & a \\ \end{array} \right] \end{equation}$$
$$\begin{equation} \left[ \begin{array}{cc} d & - c \\ - b & a \\ \end{array} \right] \end{equation}$$
$$\begin{equation} \left[ \begin{array}{cc} \frac{d}{a d - b c} & \frac{ - c}{a d - b c} \\ \frac{ - b}{a d - b c} & \frac{a}{a d - b c} \\ \end{array} \right] \end{equation}$$
Numerical Demonstration
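A sketch of the kind of code that could produce the outputs below (the names `A` and `dA` are assumptions): a random $9\times 9$ matrix is perturbed by a tiny `dA` (displayed first), and the finite difference of the determinant is compared against the trace formula (the two nearly equal numbers that follow).

```julia
using LinearAlgebra

n  = 9
A  = rand(n, n)          # hypothetical base matrix
dA = rand(n, n) * 1e-5   # tiny perturbation, displayed below
det(A + dA) - det(A)     # finite difference of the determinant
tr(adj(A) * dA)          # linearization predicted by the theorem
```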
9×9 Matrix{Float64}:
9.84997e-6 3.27962e-6 3.68515e-6 … 1.31594e-6 8.59009e-6 5.59433e-6
6.58166e-6 9.49231e-6 7.67381e-6 8.88176e-6 8.25997e-7 7.52597e-6
6.70657e-6 2.578e-6 8.35539e-6 1.98348e-6 8.75045e-6 6.93709e-6
2.44452e-6 3.84697e-6 9.332e-6 7.35059e-6 1.4179e-6 2.84029e-6
1.90982e-6 3.21393e-6 5.36157e-6 3.90574e-6 2.79935e-6 6.62397e-6
5.20741e-6 9.23777e-7 7.61258e-6 … 8.47971e-6 5.19634e-6 1.73769e-6
3.78462e-6 5.52636e-7 9.17756e-6 8.67793e-6 5.85933e-6 8.39804e-6
9.97994e-6 7.38427e-6 8.06906e-6 5.13478e-6 9.34815e-6 2.17865e-6
7.46987e-6 6.87991e-6 6.53469e-6 3.4723e-6 4.03916e-6 9.32723e-6
The finite difference $\det(A+dA)-\det(A)$ and the linearization $\text{tr}(\text{adj}(A)\,dA)$ agree to about five significant digits:
-1.01992e-6
-1.01993e-6
Forward Mode Autodiff
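Presumably something like the following, using ForwardDiff.jl, produced the two matching displays below (a sketch; `A` is the matrix from above). The second display prints as an `adjoint(::Matrix{Float64})` because `cofactor` ends with a transpose.

```julia
using ForwardDiff, LinearAlgebra

ForwardDiff.gradient(det, A)   # forward-mode gradient, first display below
cofactor(A)                    # det(A)*inv(A)' — matches entry for entry
```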
9×9 Matrix{Float64}:
-0.0254462 0.0750306 0.0256232 … 0.0128037 0.00625252 -0.0146315
0.0392959 -0.0659637 0.00465413 -0.0378384 -0.00419778 0.0321942
0.0107266 -0.0722348 -0.0137141 -0.0293705 0.000304564 0.0236944
0.00138368 -0.015074 -0.0132784 0.0103984 0.0082165 -0.00174828
0.0157866 -0.00764511 0.0334501 -0.000560729 -0.0247453 -0.00310108
0.0181448 -0.0610458 -0.00610382 … -0.0353692 0.0179075 0.0156312
-0.0554795 0.156881 -0.0144352 0.0659222 0.00133675 -0.0340218
0.0235991 -0.0512598 0.0203016 -0.0102284 -0.0143283 0.0191255
0.000760144 0.00688254 -0.0554133 0.022568 0.01269 -0.0365086
9×9 adjoint(::Matrix{Float64}) with eltype Float64:
-0.0254462 0.0750306 0.0256232 … 0.0128037 0.00625252 -0.0146315
0.0392959 -0.0659637 0.00465413 -0.0378384 -0.00419778 0.0321942
0.0107266 -0.0722348 -0.0137141 -0.0293705 0.000304564 0.0236944
0.00138368 -0.015074 -0.0132784 0.0103984 0.0082165 -0.00174828
0.0157866 -0.00764511 0.0334501 -0.000560729 -0.0247453 -0.00310108
0.0181448 -0.0610458 -0.00610382 … -0.0353692 0.0179075 0.0156312
-0.0554795 0.156881 -0.0144352 0.0659222 0.00133675 -0.0340218
0.0235991 -0.0512598 0.0203016 -0.0102284 -0.0143283 0.0191255
0.000760144 0.00688254 -0.0554133 0.022568 0.01269 -0.0365086
Reverse Mode Autodiff
Zygote.jl does reverse-mode AD.
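A sketch of the Zygote call (the gradient comes back as the first element of a tuple):

```julia
using Zygote, LinearAlgebra

Zygote.gradient(det, A)[1]   # reverse-mode gradient; same 9×9 matrix again
```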
9×9 Matrix{Float64}:
-0.0254462 0.0750306 0.0256232 … 0.0128037 0.00625252 -0.0146315
0.0392959 -0.0659637 0.00465413 -0.0378384 -0.00419778 0.0321942
0.0107266 -0.0722348 -0.0137141 -0.0293705 0.000304564 0.0236944
0.00138368 -0.015074 -0.0132784 0.0103984 0.0082165 -0.00174828
0.0157866 -0.00764511 0.0334501 -0.000560729 -0.0247453 -0.00310108
0.0181448 -0.0610458 -0.00610382 … -0.0353692 0.0179075 0.0156312
-0.0554795 0.156881 -0.0144352 0.0659222 0.00133675 -0.0340218
0.0235991 -0.0512598 0.0203016 -0.0102284 -0.0143283 0.0191255
0.000760144 0.00688254 -0.0554133 0.022568 0.01269 -0.0365086
Symbolic Demonstration
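The two identical displays below presumably compare the entrywise symbolic gradient of the determinant with the cofactor matrix (a sketch; `detA` is written out by hand since generic `det` may not simplify nicely on symbolic entries):

```julia
using Symbolics

@variables a b c d
A = [a c; b d]
detA = a*d - b*c
grad = [Symbolics.derivative(detA, x) for x in A]  # ∇(det A), entry by entry
cof  = [d -b; -c a]                                # cofactor matrix: identical
```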
$$\begin{equation} \left[ \begin{array}{cc} d & - b \\ - c & a \\ \end{array} \right] \end{equation}$$
$$\begin{equation} \left[ \begin{array}{cc} d & - b \\ - c & a \\ \end{array} \right] \end{equation}$$
Direct Proof
A direct proof, where you just differentiate the scalar with respect to every input, follows as a simple consequence of the cofactor expansion, a.k.a. the Laplace expansion, of the determinant along the $i$th row.
$\det(A) = A_{i1} C_{i1} + A_{i2}C_{i2} + \ldots + A_{in}C_{in}$
Recall that the determinant is an affine (degree-one) function of any single element, with slope equal to the corresponding cofactor.
We then readily obtain $\frac{\partial \det(A)}{\partial A_{ij}}=C_{ij}$ from which we conclude $\nabla (\det A ) = C$.
Example:
$$\begin{equation} \left[ \begin{array}{ccc} 4 & 1 & 5 \\ a & 4 & 5 \\ 6 & 7 & 8 \\ \end{array} \right] \end{equation}$$
$$\begin{equation} 18 + 5 \left( -24 + 7 a \right) - 8 a \end{equation}$$
$$\begin{equation} 27.0 \end{equation}$$
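One quick numerical confirmation of the $27$ (a sketch using ForwardDiff.jl; the symbolic entry $a$ sits in position $(2,1)$):

```julia
using ForwardDiff, LinearAlgebra

f(a) = det([4.0 1 5; a 4 5; 6 7 8])
ForwardDiff.derivative(f, 0.0)   # 27.0 — the determinant is linear in a
```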
Fancy Proof
First figure out the linearization near the identity $I$:
$\det(I+dA) - 1 = \text{tr}(dA)$ (think of the $n!$ terms in the determinant and drop the higher-order terms)
$\det(A+ A(A^{-1}dA)) - \det(A) = \det(A)\det(I+A^{-1}dA) -\det(A) =$ $\det(A) \text{tr}( A^{-1}dA) = \text{tr}(\det(A) A^{-1}dA)$
Application to derivative of the characteristic polynomial evaluated at $x$
$p(x) = \det(xI-A)$, a scalar function of $x$
Recall that $p(x)$ can be written in terms of the eigenvalues $\lambda_i$ as $p(x) = \prod_i (x-\lambda_i)$.
Direct derivative (freshman calculus):
$\frac{d}{dx} \prod_i (x-\lambda_i) \ = \ \sum_i \prod_{j\ne i} (x-\lambda _j) = \prod_i (x-\lambda_i) \left\{ \sum_i (x-\lambda_i)^{-1} \right\}$ (as long as $x \ne \lambda_i$)
Perfectly good simple proof, but if you want to show off...
With our new technology:
$d(\det (xI-A))= \det(xI-A) \text{tr}( (xI-A)^{-1} d(xI-A) )$
$\hspace{1.2in}= \det(xI-A)\, \text{tr}\left((xI-A)^{-1}\right) dx$
Note: $d(xI-A)=dx\,I$ when $A$ is constant, and $\text{tr}(A\,dx) = \text{tr}(A)\,dx$ since $dx$ is a scalar.
Check:
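A sketch of a check along these lines (the particular `A` and `x` behind the two numbers below are not shown):

```julia
using ForwardDiff, LinearAlgebra

A = randn(9, 9); x = 1.0           # hypothetical inputs
p(y) = det(y*I - A)
ForwardDiff.derivative(p, x)       # direct autodiff
det(x*I - A) * tr(inv(x*I - A))    # our trace formula — same number
```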
-5.42182
-5.42182
Application: $d(\log(\det A))$
$d(\log(\det(A))) = \frac{d(\det(A))}{\det A} = \det(A^{-1}) d(\det(A)) = \text{tr}( A^{-1} dA)$
The logarithmic derivative shows up a lot in applied mathematics. For example, the key term in Newton's method, $f(x)/f'(x)$, may be written as $1/\left\{ \log f(x) \right\}'$.
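A minimal numerical sanity check of $d(\log\det A) = \text{tr}(A^{-1}dA)$ (a sketch; the shift by $4I$ just keeps the hypothetical $A$ safely nonsingular):

```julia
using LinearAlgebra

A  = rand(4, 4) + 4I             # hypothetical well-conditioned matrix
dA = randn(4, 4) * 1e-8
log(det(A + dA)) - log(det(A))   # finite difference
tr(A \ dA)                       # tr(A⁻¹ dA): agrees to leading order
```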
Math: The Jacobian of the Inverse
$A^{-1}A= I \rightarrow d(A^{-1}A)=0=d(A^{-1})A + A^{-1}dA$ from the product rule
$\rightarrow d(A^{-1}) = -A^{-1} (dA) A^{-1}$, i.e., in vectorized form $\text{vec}(d(A^{-1})) = -(A^{-T} \otimes A^{-1})\, \text{vec}(dA)$
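The two identical $16\times16$ matrices below presumably compare the autodiff Jacobian of $X \mapsto X^{-1}$ (for a $4\times4$ input, so $\text{vec}$ has length 16) against the Kronecker formula; a sketch:

```julia
using ForwardDiff, LinearAlgebra

A = randn(4, 4)                    # hypothetical 4×4 input
J = ForwardDiff.jacobian(inv, A)   # 16×16 Jacobian of vec(A⁻¹) in vec(A)
K = -kron(inv(A)', inv(A))         # the formula −(A⁻ᵀ ⊗ A⁻¹)
J ≈ K                              # true
```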
16×16 Matrix{Float64}:
-3.88325 -3.02397 5.69787 1.98243 … 1.39286 -2.62447 -0.913117
1.94632 5.55259 -7.09678 -3.76388 -2.55755 3.26882 1.73367
-2.595 -5.94282 3.09885 4.87364 2.7373 -1.42735 -2.24483
1.78865 -0.190301 0.690036 -2.211 0.0876539 -0.317834 1.0184
-3.02397 -2.35484 4.43706 1.54376 -0.148192 0.279228 0.0971501
1.51564 4.32392 -5.52642 -2.93102 … 0.272108 -0.347783 -0.184452
-2.02078 -4.6278 2.41315 3.79522 -0.291232 0.151862 0.238836
⋮ ⋱ ⋮
3.80763 8.71986 -4.54693 -7.15107 … 1.05601 -0.550652 -0.866024
-2.62447 0.279228 -1.01248 3.24419 0.0338157 -0.122616 0.392885
1.98243 1.54376 -2.9088 -1.01204 -1.72176 3.24419 1.12873
-0.993608 -2.83463 3.62295 1.92149 3.16147 -4.04068 -2.14304
1.32476 3.03385 -1.58199 -2.48803 -3.38366 1.76439 2.7749
-0.913117 0.0971501 -0.352268 1.12873 … -0.108352 0.392885 -1.25888
16×16 Matrix{Float64}:
-3.88325 -3.02397 5.69787 1.98243 … 1.39286 -2.62447 -0.913117
1.94632 5.55259 -7.09678 -3.76388 -2.55755 3.26882 1.73367
-2.595 -5.94282 3.09885 4.87364 2.7373 -1.42735 -2.24483
1.78865 -0.190301 0.690036 -2.211 0.0876539 -0.317834 1.0184
-3.02397 -2.35484 4.43706 1.54376 -0.148192 0.279228 0.0971501
1.51564 4.32392 -5.52642 -2.93102 … 0.272108 -0.347783 -0.184452
-2.02078 -4.6278 2.41315 3.79522 -0.291232 0.151862 0.238836
⋮ ⋱ ⋮
3.80763 8.71986 -4.54693 -7.15107 … 1.05601 -0.550652 -0.866024
-2.62447 0.279228 -1.01248 3.24419 0.0338157 -0.122616 0.392885
1.98243 1.54376 -2.9088 -1.01204 -1.72176 3.24419 1.12873
-0.993608 -2.83463 3.62295 1.92149 3.16147 -4.04068 -2.14304
1.32476 3.03385 -1.58199 -2.48803 -3.38366 1.76439 2.7749
-0.913117 0.0971501 -0.352268 1.12873 … -0.108352 0.392885 -1.25888