Schur complement

In linear algebra and the theory of matrices, the Schur complement of a matrix block (i.e., a submatrix within a larger matrix) is defined as follows. Suppose A, B, C, D are respectively p×p, p×q, q×p and q×q matrices, and D is invertible. Let

\[
M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
so that M is a (p+q)×(p+q) matrix.

Then the Schur complement of the block D of the matrix M is the p×p matrix

\[
M/D := A - B D^{-1} C .
\]
It is named after Issai Schur, who used it to prove Schur's lemma, although it had been used previously.[1] Emilie Haynsworth was the first to call it the Schur complement.[2]

Background

The Schur complement arises as the result of performing a block Gaussian elimination by multiplying the matrix M from the right with the "block lower triangular" matrix

\[
L = \begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix}.
\]
Here I_p denotes a p×p identity matrix. After multiplication with the matrix L, the Schur complement appears in the upper p×p block. The product matrix is

\[
ML = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\begin{bmatrix} I_p & 0 \\ -D^{-1}C & I_q \end{bmatrix}
= \begin{bmatrix} A - BD^{-1}C & B \\ 0 & D \end{bmatrix}.
\]
This is analogous to an LDU decomposition. That is, we have shown that

\[
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} I_p & BD^{-1} \\ 0 & I_q \end{bmatrix}
\begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix}
\begin{bmatrix} I_p & 0 \\ D^{-1}C & I_q \end{bmatrix},
\]
and the inverse of M may thus be expressed involving only D^{-1} and the inverse of the Schur complement (if it exists) as

\[
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}
= \begin{bmatrix}
\left(A - BD^{-1}C\right)^{-1} & -\left(A - BD^{-1}C\right)^{-1} B D^{-1} \\
-D^{-1} C \left(A - BD^{-1}C\right)^{-1} & D^{-1} + D^{-1} C \left(A - BD^{-1}C\right)^{-1} B D^{-1}
\end{bmatrix}.
\]
Cf. the matrix inversion lemma, which illustrates the relationships between the above and the equivalent derivation with the roles of A and D interchanged.
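
The block factorization above is easy to probe numerically. The following is a minimal sketch, assuming NumPy; the block sizes, the random seed, and the diagonal shift used to keep D and the Schur complement invertible are arbitrary illustrative choices, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 3, 2

# Random blocks; shift the diagonals so that D and the Schur complement
# are comfortably invertible for this illustration.
A = rng.standard_normal((p, p)) + 5 * np.eye(p)
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 5 * np.eye(q)

M = np.block([[A, B], [C, D]])

Dinv = np.linalg.inv(D)
S = A - B @ Dinv @ C          # Schur complement of D in M
Sinv = np.linalg.inv(S)

# Block formula for M^{-1} in terms of D^{-1} and S^{-1} only.
Minv_blocks = np.block([
    [Sinv,              -Sinv @ B @ Dinv],
    [-Dinv @ C @ Sinv,  Dinv + Dinv @ C @ Sinv @ B @ Dinv],
])

assert np.allclose(Minv_blocks, np.linalg.inv(M))
```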

If M is a positive-definite symmetric matrix, then so is the Schur complement of D in M.

If p and q are both 1 (i.e. A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:

\[
M^{-1} = \frac{1}{AD - BC} \begin{bmatrix} D & -B \\ -C & A \end{bmatrix},
\]
provided that AD − BC is non-zero.

Moreover, the determinant of M is also clearly seen to be given by

\[
\det(M) = \det(D)\,\det\!\left(A - BD^{-1}C\right),
\]

which generalizes the determinant formula for 2×2 matrices.
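
The determinant identity can be checked in the same spirit. This is a small numerical sketch under the same assumptions as above (NumPy, arbitrary sizes and seed); it is not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 4, 3

A = rng.standard_normal((p, p))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 4 * np.eye(q)   # keep D invertible

M = np.block([[A, B], [C, D]])
S = A - B @ np.linalg.solve(D, C)                 # Schur complement of D in M

# det(M) = det(D) * det(A - B D^{-1} C)
assert np.isclose(np.linalg.det(M), np.linalg.det(D) * np.linalg.det(S))
```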

Application to solving linear equations

The Schur complement arises naturally in solving a system of linear equations such as

\[
\begin{aligned}
Ax + By &= a, \\
Cx + Dy &= b,
\end{aligned}
\]
where x, a are p-dimensional column vectors, y, b are q-dimensional column vectors, and A, B, C, D are as above. Multiplying the bottom equation by BD^{-1} and then subtracting it from the top equation, one obtains

\[
\left(A - BD^{-1}C\right)x = a - BD^{-1}b.
\]
Thus if one can invert D as well as the Schur complement of D, one can solve for x, and then by using the equation Cx + Dy = b one can solve for y. This reduces the problem of inverting a (p+q)×(p+q) matrix to that of inverting a p×p matrix and a q×q matrix. In practice, one needs D to be well-conditioned in order for this algorithm to be numerically accurate.
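
As a sketch of the elimination just described (again assuming NumPy; the sizes and random data are made up for illustration), one solves the reduced p×p system for x and then back-substitutes for y:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 2

A = rng.standard_normal((p, p)) + 4 * np.eye(p)
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 4 * np.eye(q)   # D should be invertible and well-conditioned
a = rng.standard_normal(p)
b = rng.standard_normal(q)

# Reduced system: (A - B D^{-1} C) x = a - B D^{-1} b
BDinv = B @ np.linalg.inv(D)
S = A - BDinv @ C
x = np.linalg.solve(S, a - BDinv @ b)

# Back-substitute into C x + D y = b to recover y.
y = np.linalg.solve(D, b - C @ x)

# Compare with a direct solve of the full (p+q)x(p+q) system.
xy = np.linalg.solve(np.block([[A, B], [C, D]]), np.concatenate([a, b]))
assert np.allclose(np.concatenate([x, y]), xy)
```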

Applications to probability theory and statistics

Suppose the random column vectors X, Y live in R^n and R^m respectively, and the vector (X, Y) in R^{n+m} has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

\[
V = \begin{bmatrix} A & B \\ B^{\mathrm T} & C \end{bmatrix},
\]

where A ∈ R^{n×n} is the covariance matrix of X, C ∈ R^{m×m} is the covariance matrix of Y, and B ∈ R^{n×m} is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in V:

\[
\operatorname{Cov}(X \mid Y) = A - B C^{-1} B^{\mathrm T}.
\]
If we take the matrix V above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in V also has a Wishart distribution.[citation needed]
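
The following sketch (NumPy assumed, with arbitrary dimensions and data, none of which are from the original text) builds a random symmetric positive-definite covariance matrix and checks that the Schur complement A − BC^{-1}B^T, i.e. the conditional covariance of X given Y, agrees with the inverse of the X-block of the precision matrix V^{-1}, consistent with the block-inverse formula in the Background section.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2

# Random symmetric positive-definite covariance matrix V of size (n+m).
G = rng.standard_normal((n + m, n + m))
V = G @ G.T + (n + m) * np.eye(n + m)

A = V[:n, :n]        # covariance of X
B = V[:n, n:]        # cross-covariance of X and Y
C = V[n:, n:]        # covariance of Y

# Conditional covariance of X given Y: the Schur complement of C in V.
cond_cov = A - B @ np.linalg.solve(C, B.T)

# Consistency check: it equals the inverse of the X-block of the precision matrix.
precision_xx = np.linalg.inv(V)[:n, :n]
assert np.allclose(cond_cov, np.linalg.inv(precision_xx))
```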

Schur complement condition for positive definiteness

Let X be a symmetric matrix given by

\[
X = \begin{bmatrix} A & B \\ B^{\mathrm T} & C \end{bmatrix}.
\]
Let S be the Schur complement of A in X, that is:

\[
S = C - B^{\mathrm T} A^{-1} B.
\]
Then

X ≻ 0 if and only if A ≻ 0 and S ≻ 0.
X ≻ 0 if and only if C ≻ 0 and A − BC^{-1}B^T ≻ 0.
If A ≻ 0, then X ⪰ 0 if and only if S ⪰ 0.
If C ≻ 0, then X ⪰ 0 if and only if A − BC^{-1}B^T ⪰ 0.

The first and third statements can be derived[3] by considering the minimizer of the quantity

\[
u^{\mathrm T} A u + 2 v^{\mathrm T} B^{\mathrm T} u + v^{\mathrm T} C v
\]

as a function of u (for fixed v).

Furthermore, since

\[
\begin{bmatrix} A & B \\ B^{\mathrm T} & C \end{bmatrix} \succ 0
\quad\Longleftrightarrow\quad
\begin{bmatrix} C & B^{\mathrm T} \\ B & A \end{bmatrix} \succ 0,
\]
and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third).
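
These equivalences are easy to probe numerically. The sketch below (random symmetric test matrices, NumPy assumed; the sizes, seed, and eigenvalue tolerance are illustrative choices not present in the original) compares a direct eigenvalue test of X against the first and third statements, i.e. the test via A and the Schur complement S.

```python
import numpy as np

def is_pd(M, tol=1e-10):
    """Positive-definiteness test via the smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(M).min() > tol

rng = np.random.default_rng(4)
p, q = 3, 2

for _ in range(100):
    # Random symmetric X with blocks A, B, C; some samples are PD, some are not.
    G = rng.standard_normal((p + q, p + q))
    X = (G + G.T) / 2 + rng.uniform(-1.0, 3.0) * np.eye(p + q)
    A, B, C = X[:p, :p], X[:p, p:], X[p:, p:]

    if is_pd(A):
        S = C - B.T @ np.linalg.solve(A, B)   # Schur complement of A in X
        # With A > 0, the statements above give: X > 0 iff S > 0,
        # and likewise X >= 0 iff S >= 0.
        assert is_pd(X) == is_pd(S)
```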

References

  1. Zhang, Fuzhen, ed. (2005), "The Schur Complement and Its Applications", Springer.
  2. Haynsworth, E. V., "On the Schur Complement", Basel Mathematical Notes, #BNB 20, 17 pages, June 1968.
  3. Boyd, S. and Vandenberghe, L. (2004), "Convex Optimization", Cambridge University Press (Appendix A.5.5)