# Schur complement

In linear algebra and the theory of matrices, the Schur complement of a matrix block (i.e., a submatrix within a larger matrix) is defined as follows. Suppose A, B, C, D are respectively p×p, p×q, q×p and q×q matrices, and D is invertible. Let

${\displaystyle M=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]}$

so that M is a (p+q)×(p+q) matrix.

Then the Schur complement of the block D of the matrix M is the p×p matrix

${\displaystyle A-BD^{-1}C.\,}$
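The definition above can be sketched numerically. The following is a minimal illustration using NumPy, not a library routine: the helper name `schur_complement` and the example matrix are assumptions made for this sketch, and the block D is assumed to be invertible.

```python
import numpy as np

def schur_complement(M, p):
    """Schur complement of the trailing block D in M = [[A, B], [C, D]].

    `p` is the size of the leading block A. Hypothetical helper for
    illustration; it assumes D is invertible.
    """
    A, B = M[:p, :p], M[:p, p:]
    C, D = M[p:, :p], M[p:, p:]
    # Solve D Z = C instead of forming D^{-1} explicitly.
    return A - B @ np.linalg.solve(D, C)

# Example with p = 2, q = 1 (values chosen only for illustration).
M = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 2.0]])
S = schur_complement(M, p=2)
print(S)
```

For this particular M the Schur complement of D works out to the 2×2 matrix with rows (2, 1) and (1, 3).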

It is named after Issai Schur, who used it to prove Schur's lemma, although it had been used previously.[1] Emilie Haynsworth was the first to call it the Schur complement.[2]

## Background

The Schur complement arises from performing a block Gaussian elimination, in which the matrix M is multiplied on the right by the block lower triangular matrix

${\displaystyle L=\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right].}$

Here $I_{p}$ denotes the p×p identity matrix. After multiplication by the matrix L, the Schur complement appears in the upper-left p×p block. The product matrix is

${\displaystyle {\begin{aligned}ML&=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right]=\left[{\begin{matrix}A-BD^{-1}C&B\\0&D\end{matrix}}\right]\\&=\left[{\begin{matrix}I_{p}&BD^{-1}\\0&I_{q}\end{matrix}}\right]\left[{\begin{matrix}A-BD^{-1}C&0\\0&D\end{matrix}}\right].\end{aligned}}}$

This is analogous to an LDU decomposition. That is, we have shown that

${\displaystyle {\begin{aligned}\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]&=\left[{\begin{matrix}I_{p}&BD^{-1}\\0&I_{q}\end{matrix}}\right]\left[{\begin{matrix}A-BD^{-1}C&0\\0&D\end{matrix}}\right]\left[{\begin{matrix}I_{p}&0\\D^{-1}C&I_{q}\end{matrix}}\right],\end{aligned}}}$

and the inverse of M may thus be expressed using only $D^{-1}$ and the inverse of the Schur complement (if it exists) as

${\displaystyle {\begin{aligned}&{}\quad \left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]^{-1}=\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right]\left[{\begin{matrix}(A-BD^{-1}C)^{-1}&0\\0&D^{-1}\end{matrix}}\right]\left[{\begin{matrix}I_{p}&-BD^{-1}\\0&I_{q}\end{matrix}}\right]\\[12pt]&=\left[{\begin{matrix}\left(A-BD^{-1}C\right)^{-1}&-\left(A-BD^{-1}C\right)^{-1}BD^{-1}\\-D^{-1}C\left(A-BD^{-1}C\right)^{-1}&D^{-1}+D^{-1}C\left(A-BD^{-1}C\right)^{-1}BD^{-1}\end{matrix}}\right].\end{aligned}}}$
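The block formula for the inverse can be checked numerically. The sketch below assembles the four blocks exactly as written above and compares the result with the identity; the function name `block_inverse` and the test matrices are assumptions for illustration, and both D and the Schur complement are assumed invertible.

```python
import numpy as np

def block_inverse(A, B, C, D):
    """Inverse of [[A, B], [C, D]] built from D^{-1} and the inverse of
    the Schur complement S = A - B D^{-1} C (hypothetical helper)."""
    Dinv = np.linalg.inv(D)
    S = A - B @ Dinv @ C
    Sinv = np.linalg.inv(S)
    return np.block([
        [Sinv,             -Sinv @ B @ Dinv],
        [-Dinv @ C @ Sinv, Dinv + Dinv @ C @ Sinv @ B @ Dinv],
    ])

# Illustrative blocks with p = 2, q = 1.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0], [0.0]])
C = np.array([[2.0, 0.0]])
D = np.array([[2.0]])

M = np.block([[A, B], [C, D]])
M_inv = block_inverse(A, B, C, D)
print(np.allclose(M_inv @ M, np.eye(3)))
```

Explicit inverses are used here only to mirror the formula term by term; in numerical work one would usually prefer linear solves.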

Cf. the matrix inversion lemma, which illustrates the relationship between the above and the equivalent derivation with the roles of A and D interchanged.

If M is a positive-definite symmetric matrix, then so is the Schur complement of D in M.

If p and q are both 1 (i.e. A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:

${\displaystyle M^{-1}={\frac {1}{AD-BC}}\left[{\begin{matrix}D&-B\\-C&A\end{matrix}}\right]}$

provided that AD − BC is non-zero.

Moreover, taking determinants in the factorization above (the two triangular factors have determinant 1) shows that the determinant of M is given by

${\displaystyle \det(M)=\det(D)\det(A-BD^{-1}C)}$

which generalizes the determinant formula for 2×2 matrices.
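As a quick numerical sanity check of the determinant identity, one might compare both sides on a small example. The matrices below are arbitrary illustrative values, and D is assumed invertible.

```python
import numpy as np

# Illustrative blocks with p = 2, q = 1.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0], [0.0]])
C = np.array([[2.0, 0.0]])
D = np.array([[2.0]])

M = np.block([[A, B], [C, D]])
S = A - B @ np.linalg.solve(D, C)   # Schur complement of D in M

det_direct = np.linalg.det(M)
det_schur = np.linalg.det(D) * np.linalg.det(S)
print(np.isclose(det_direct, det_schur))
```

Here det(D) = 2 and det(S) = 5, so both sides equal 10 up to rounding.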

## Application to solving linear equations

The Schur complement arises naturally in solving a system of linear equations such as

${\displaystyle Ax+By=a\,}$
${\displaystyle Cx+Dy=b\,}$

where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, and A, B, C, D are as above. Multiplying the bottom equation by ${\displaystyle BD^{-1}}$ and then subtracting from the top equation one obtains

${\displaystyle (A-BD^{-1}C)x=a-BD^{-1}b.\,}$

Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation ${\displaystyle Cx+Dy=b}$ one can solve for y. This reduces the problem of inverting a ${\displaystyle (p+q)\times (p+q)}$ matrix to that of inverting a p×p matrix and a q×q matrix. In practice one needs D to be well-conditioned in order for this algorithm to be numerically accurate.
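The elimination procedure just described can be sketched as follows. This is a minimal illustration under the assumption that D and the Schur complement are both invertible; the function name `solve_via_schur` and the sample data are not from the source.

```python
import numpy as np

def solve_via_schur(A, B, C, D, a, b):
    """Solve Ax + By = a, Cx + Dy = b by eliminating y (hypothetical helper)."""
    Dinv_C = np.linalg.solve(D, C)      # D^{-1} C
    Dinv_b = np.linalg.solve(D, b)      # D^{-1} b
    S = A - B @ Dinv_C                  # Schur complement of D
    # Reduced p-by-p system: S x = a - B D^{-1} b
    x = np.linalg.solve(S, a - B @ Dinv_b)
    # Back-substitute into Cx + Dy = b to recover y.
    y = np.linalg.solve(D, b - C @ x)
    return x, y

# Illustrative data with p = 2, q = 1.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0], [0.0]])
C = np.array([[2.0, 0.0]])
D = np.array([[2.0]])
a = np.array([1.0, 2.0])
b = np.array([3.0])

x, y = solve_via_schur(A, B, C, D, a, b)

# Compare with solving the full (p+q)-by-(p+q) system directly.
M = np.block([[A, B], [C, D]])
full = np.linalg.solve(M, np.concatenate([a, b]))
print(np.allclose(np.concatenate([x, y]), full))
```

The sketch solves with D three times; a practical implementation would factor D once and reuse the factorization.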

## Applications to probability theory and statistics

Suppose the random column vectors X, Y live in $\mathbb{R}^{n}$ and $\mathbb{R}^{m}$ respectively, and the vector (X, Y) in $\mathbb{R}^{n+m}$ has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

${\displaystyle \Sigma =\left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right],}$

where ${\displaystyle A\in \mathbb {R} ^{n\times n}}$ is the covariance matrix of X, ${\displaystyle C\in \mathbb {R} ^{m\times m}}$ is the covariance matrix of Y and ${\displaystyle B\in \mathbb {R} ^{n\times m}}$ is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in ${\displaystyle \Sigma }$, and the conditional mean is an affine function of Y:

${\displaystyle \operatorname {Cov} (X\mid Y)=A-BC^{-1}B^{T}.}$
${\displaystyle \operatorname {E} (X\mid Y)=\operatorname {E} (X)+BC^{-1}(Y-\operatorname {E} (Y)).}$
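The two conditioning formulas above translate directly into a few lines of NumPy. The sketch below is illustrative only: the function name `conditional_gaussian`, the means, and the covariance blocks are assumed values, chosen so that Σ is positive definite.

```python
import numpy as np

def conditional_gaussian(mu_x, mu_y, A, B, C, y):
    """Mean and covariance of X given Y = y for a jointly normal (X, Y)
    with covariance [[A, B], [B^T, C]] (hypothetical helper)."""
    # gain = B C^{-1}; C is symmetric, so solve C Z = B^T and transpose.
    gain = np.linalg.solve(C, B.T).T
    cond_mean = mu_x + gain @ (y - mu_y)
    cond_cov = A - gain @ B.T           # Schur complement of C in Sigma
    return cond_mean, cond_cov

# Illustrative values with n = 2, m = 1.
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # Cov(X)
B = np.array([[0.8], [0.3]])             # Cov(X, Y)
C = np.array([[1.0]])                    # Cov(Y)
mu_x = np.array([0.0, 0.0])
mu_y = np.array([1.0])

cond_mean, cond_cov = conditional_gaussian(mu_x, mu_y, A, B, C, y=np.array([2.0]))
print(cond_mean)
print(cond_cov)
```

Note that the conditional covariance does not depend on the observed value y, only on the blocks of Σ.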

If we take the matrix ${\displaystyle \Sigma }$ above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in ${\displaystyle \Sigma }$ also has a Wishart distribution.[citation needed]

## Schur complement condition for positive definiteness

Let X be a symmetric matrix given by

${\displaystyle X=\left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right].}$

Let S be the Schur complement of A in X, that is:

${\displaystyle S=C-B^{T}A^{-1}B.\,}$

Then

${\displaystyle X\succ 0\Leftrightarrow A\succ 0,S=C-B^{T}A^{-1}B\succ 0}$.
${\displaystyle X\succ 0\Leftrightarrow C\succ 0,A-BC^{-1}B^{T}\succ 0}$.
If ${\displaystyle A\succ 0}$, then ${\displaystyle X\succeq 0\Leftrightarrow S=C-B^{T}A^{-1}B\succeq 0}$.
If ${\displaystyle C\succ 0}$, then ${\displaystyle X\succeq 0\Leftrightarrow A-BC^{-1}B^{T}\succeq 0}$.

The first and third statements can be derived[3] by considering the minimizer of the quantity

${\displaystyle u^{T}Au+2v^{T}B^{T}u+v^{T}Cv,\,}$

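as a function of u (for fixed v): when ${\displaystyle A\succ 0}$, the minimum is attained at ${\displaystyle u=-A^{-1}Bv}$ and equals ${\displaystyle v^{T}(C-B^{T}A^{-1}B)v}$.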

Furthermore, since

${\displaystyle \left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right]\succ 0\Longleftrightarrow \left[{\begin{matrix}C&B^{T}\\B&A\end{matrix}}\right]\succ 0}$

and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third).
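The first statement suggests a simple numerical test for positive definiteness of a symmetric block matrix. The sketch below uses a Cholesky factorization as the definiteness check, which is a common choice but an assumption of this example rather than something prescribed above; the helper names and sample matrices are likewise illustrative.

```python
import numpy as np

def is_positive_definite(S):
    """True if the symmetric matrix S admits a Cholesky factorization."""
    try:
        np.linalg.cholesky(S)
        return True
    except np.linalg.LinAlgError:
        return False

def pd_via_schur(A, B, C):
    """Test X = [[A, B], [B^T, C]] > 0 via A > 0 and S = C - B^T A^{-1} B > 0."""
    if not is_positive_definite(A):
        return False
    S = C - B.T @ np.linalg.solve(A, B)
    S = (S + S.T) / 2                   # remove rounding asymmetry
    return is_positive_definite(S)

# A positive-definite example (n = 2, m = 1).
A1 = np.array([[2.0, 0.5], [0.5, 1.0]])
B1 = np.array([[0.8], [0.3]])
C1 = np.array([[1.0]])
result_pd = pd_via_schur(A1, B1, C1)

# An indefinite example: X = [[1, 2], [2, 1]] has eigenvalues 3 and -1.
A2 = np.array([[1.0]])
B2 = np.array([[2.0]])
C2 = np.array([[1.0]])
result_indef = pd_via_schur(A2, B2, C2)

print(result_pd, result_indef)
```

In the second example A is positive definite but the Schur complement is 1 − 4 = −3, so the test correctly reports that X is not positive definite.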