Schur complement

In linear algebra and the theory of matrices, the Schur complement of a matrix block (i.e., a submatrix within a larger matrix) is defined as follows. Suppose A, B, C, D are respectively p×p, p×q, q×p and q×q matrices, and D is invertible. Let

$M=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]$ so that M is a (p+q)×(p+q) matrix.

Then the Schur complement of the block D of the matrix M is the p×p matrix

$A-BD^{-1}C.\,$ It is named after Issai Schur who used it to prove Schur's lemma, although it had been used previously. Emilie Haynsworth was the first to call it the Schur complement.

Background

The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with the "block lower triangular" matrix

$L=\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right].$ Here Ip denotes a p×p identity matrix. After multiplication with the matrix L the Schur complement appears in the upper p×p block. The product matrix is

{\begin{aligned}ML&=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right]=\left[{\begin{matrix}A-BD^{-1}C&B\\0&D\end{matrix}}\right]\\&=\left[{\begin{matrix}I_{p}&BD^{-1}\\0&I_{q}\end{matrix}}\right]\left[{\begin{matrix}A-BD^{-1}C&0\\0&D\end{matrix}}\right].\end{aligned}} This is analogous to an LDU decomposition. That is, we have shown that

{\begin{aligned}\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]&=\left[{\begin{matrix}I_{p}&BD^{-1}\\0&I_{q}\end{matrix}}\right]\left[{\begin{matrix}A-BD^{-1}C&0\\0&D\end{matrix}}\right]\left[{\begin{matrix}I_{p}&0\\D^{-1}C&I_{q}\end{matrix}}\right],\end{aligned}} and inverse of M thus may be expressed involving D−1 and the inverse of Schur's complement (if it exists) only as

{\begin{aligned}&{}\quad \left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]^{-1}=\left[{\begin{matrix}I_{p}&0\\-D^{-1}C&I_{q}\end{matrix}}\right]\left[{\begin{matrix}(A-BD^{-1}C)^{-1}&0\\0&D^{-1}\end{matrix}}\right]\left[{\begin{matrix}I_{p}&-BD^{-1}\\0&I_{q}\end{matrix}}\right]\\[12pt]&=\left[{\begin{matrix}\left(A-BD^{-1}C\right)^{-1}&-\left(A-BD^{-1}C\right)^{-1}BD^{-1}\\-D^{-1}C\left(A-BD^{-1}C\right)^{-1}&D^{-1}+D^{-1}C\left(A-BD^{-1}C\right)^{-1}BD^{-1}\end{matrix}}\right].\end{aligned}} C.f. matrix inversion lemma which illustrates relationships between the above and the equivalent derivation with the roles of A and D interchanged.

If M is a positive-definite symmetric matrix, then so is the Schur complement of D in M.

If p and q are both 1 (i.e. A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:

$M^{-1}={\frac {1}{AD-BC}}\left[{\begin{matrix}D&-B\\-C&A\end{matrix}}\right]$ provided that AD − BC is non-zero.

Moreover, the determinant of M is also clearly seen to be given by

$\det(M)=\det(D)\det(A-BD^{-1}C)$ which generalizes the determinant formula for 2x2 matrices.

Application to solving linear equations

The Schur complement arises naturally in solving a system of linear equations such as

$Ax+By=a\,$ $Cx+Dy=b\,$ where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, and A, B, C, D are as above. Multiplying the bottom equation by $BD^{-1}$ and then subtracting from the top equation one obtains

$(A-BD^{-1}C)x=a-BD^{-1}b.\,$ Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation $Cx+Dy=b$ one can solve for y. This reduces the problem of inverting a $(p+q)\times (p+q)$ matrix to that of inverting a p×p matrix and a q×q matrix. In practice one needs D to be well-conditioned in order for this algorithm to be numerically accurate.

Applications to probability theory and statistics

Suppose the random column vectors X, Y live in Rn and Rm respectively, and the vector (X, Y) in Rn+m has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

$\Sigma =\left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right],$ Then the conditional covariance of X given Y is the Schur complement of C in $\Sigma$ :

$\operatorname {Cov} (X\mid Y)=A-BC^{-1}B^{T}.$ $\operatorname {E} (X\mid Y)=\operatorname {E} (X)+BC^{-1}(Y-\operatorname {E} (Y)).$ If we take the matrix $\Sigma$ above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in $\Sigma$ also has a Wishart distribution.{{ safesubst:#invoke:Unsubst||date=__DATE__ |\$B= {{#invoke:Category handler|main}}{{#invoke:Category handler|main}}[citation needed] }}

Schur complement condition for positive definiteness

Let X be a symmetric matrix given by

$X=\left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right].$ Let S be the Schur complement of A in X, that is:

$S=C-B^{T}A^{-1}B.\,$ Then

$X\succ 0\Leftrightarrow A\succ 0,S=C-B^{T}A^{-1}B\succ 0$ .
$X\succ 0\Leftrightarrow C\succ 0,A-BC^{-1}B^{T}\succ 0$ .
${\text{If}}$ $A\succ 0$ , ${\text{then}}$ $X\succeq 0\Leftrightarrow S=C-B^{T}A^{-1}B\succeq 0$ .
${\text{If}}$ $C\succ 0$ , ${\text{then}}$ $X\succeq 0\Leftrightarrow A-BC^{-1}B^{T}\succeq 0$ .

The first and third statements can be derived by considering the minimizer of the quantity

$u^{T}Au+2v^{T}B^{T}u+v^{T}Cv,\,$ as a function of v (for fixed u).

Furthermore, since

$\left[{\begin{matrix}A&B\\B^{T}&C\end{matrix}}\right]\succ 0\Longleftrightarrow \left[{\begin{matrix}C&B\\B^{T}&A\end{matrix}}\right]\succ 0$ and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third).