Square root of a matrix

In mathematics, the square root of a matrix extends the notion of square root from numbers to matrices.

Matrix Template:Mvar is said to be a square root of Template:Mvar if the matrix product Template:MvarTemplate:Mvar is equal to Template:Mvar.[1]

Properties

In general, a matrix can have several square roots. For example, the matrix ${\displaystyle {\bigl (}{\begin{smallmatrix}\\33&24\\48&57\end{smallmatrix}}{\bigr )}}$ has square roots ${\displaystyle {\bigl (}{\begin{smallmatrix}\\1&4\\8&5\end{smallmatrix}}{\bigr )}}$ and ${\displaystyle {\bigl (}{\begin{smallmatrix}\\5&2\\4&7\end{smallmatrix}}{\bigr )}}$, as well as their additive inverses.

Another example is the 2×2 identity matrix ${\displaystyle {\bigl (}{\begin{smallmatrix}\\1&0\\0&1\end{smallmatrix}}{\bigr )},}$ which has infinitely many symmetric rational square roots given by

${\displaystyle {\tfrac {1}{t}}{\bigl (}{\begin{smallmatrix}\\s&r\\r&-s\end{smallmatrix}}{\bigr )},}$ ${\displaystyle {\tfrac {1}{t}}{\bigl (}{\begin{smallmatrix}\\s&-r\\-r&-s\end{smallmatrix}}{\bigr )},}$ ${\displaystyle {\tfrac {1}{t}}{\bigl (}{\begin{smallmatrix}\\-s&r\\r&s\end{smallmatrix}}{\bigr )},}$ ${\displaystyle {\tfrac {1}{t}}{\bigl (}{\begin{smallmatrix}\\-s&-r\\-r&s\end{smallmatrix}}{\bigr )},}$ ${\displaystyle {\bigl (}{\begin{smallmatrix}\\1&0\\0&\pm 1\end{smallmatrix}}{\bigr )},}$ and ${\displaystyle {\bigl (}{\begin{smallmatrix}\\-1&0\\0&\pm 1\end{smallmatrix}}{\bigr )},}$

where (r, s, t) is any Pythagorean triple—that is, any set of positive integers such that r²+s² = t².[2]

However, a positive-semidefinite matrix has precisely one positive-semidefinite square root, which can be called its principal square root.

While the square root of a nonnegative integer is either again an integer or an irrational number, in contrast an integer matrix can have a square root whose entries are rational, yet not integral. For example, the matrix ${\displaystyle {\bigl (}{\begin{smallmatrix}\\~\;0&4\\-1&5\end{smallmatrix}}{\bigr )}}$ has the non-integer square root ${\displaystyle {\bigl (}{\begin{smallmatrix}\\~\;2/3&4/3\\-1/3&7/3\end{smallmatrix}}{\bigr )}}$, as well as the integer matrix: ${\displaystyle {\bigl (}{\begin{smallmatrix}\\2&-4\\1&-3\end{smallmatrix}}{\bigr )}}$. The 2×2 identity matrix is another example.

A 2×2 matrix with two distinct nonzero eigenvalues has four square roots.

More generally, an n×n matrix with Template:Mvar distinct nonzero eigenvalues has 2n square roots. Such a matrix, Template:Mvar, has a decomposition VDV–1 where Template:Mvar is the matrix whose columns are eigenvectors of Template:Mvar and Template:Mvar is the diagonal matrix whose diagonal elements are the corresponding Template:Mvar eigenvalues λi. Thus the square roots of Template:Mvar are given by VD½ V–1, where Template:Mvar½ is any square root matrix of Template:Mvar, which, for distinct eigenvalues, must be diagonal with diagonal elements equal to square roots of the diagonal elements of Template:Mvar; since there are two possible choices for a square root of each diagonal element of Template:Mvar, there are 2n choices for the matrix Template:Mvar½.

This also leads to a proof of the above observation, that a positive-definite matrix has precisely one positive-definite square root: a positive definite matrix has only positive eigenvalues, and each of these eigenvalues has only one positive square root; and since the eigenvalues of the square root matrix are the diagonal elements of Template:Mvar½, for the square root matrix to be itself positive definite necessitates the use of only the unique positive square roots of the original eigenvalues.

Just as with the real numbers, a real matrix may fail to have a real square root, but have a square root with complex-valued entries.

Some matrices have no square root. An example is the matrix ${\displaystyle {\bigl (}{\begin{smallmatrix}\\0&1\\0&0\end{smallmatrix}}{\bigr )}}$.

In general, a complex matrix with positive real eigenvalues has a unique square root with positive eigenvalues called the principal square root. Moreover, the operation of taking the principal square root is continuous on this set of matrices. If the matrix has real entries, then the square root also has real entries. These properties are consequences of the holomorphic functional calculus applied to matrices. The existence and uniqueness of the principal square root can be deduced directly from the Jordan normal form (see below).[3] [4]

Computation methods

Explicit formulas

For a 2 × 2 matrix, there are explicit formulas that give up to four square roots, if the matrix has any roots.

If Template:Mvar is a diagonal n × n matrix, one can obtain a square root by taking a diagonal matrix Template:Mvar, where each element along the diagonal is a square root of the corresponding element of Template:Mvar. If the diagonal elements of D are real and non-negative, and the square roots are taken with non-negative sign, the matrix Template:Mvar will be the principal root of Template:Mvar.

If a matrix is idempotent, one of its square roots is the matrix itself.

By diagonalization

An n × n matrix Template:Mvar is diagonalizable if there is a matrix Template:Mvar and a diagonal matrix Template:Mvar such that A = VDV–1. This happens if and only if Template:Mvar has n eigenvectors which constitute a basis for Cn. In this case, Template:Mvar can be chosen to be the matrix with the n eigenvectors as columns, and thus a square root of Template:Mvar is

${\displaystyle R=VSV^{-1}~,}$

where Template:Mvar is any square root of Template:Mvar. Indeed,

${\displaystyle (VD^{1/2}V^{-1})^{2}=VD^{1/2}(V^{-1}V)D^{1/2}V^{-1}=VDV^{-1}=A~.}$

For example, the matrix ${\displaystyle A={\bigl (}{\begin{smallmatrix}\\33&24\\48&57\end{smallmatrix}}{\bigr )}}$ can be diagonalized as VDV–1, where

${\displaystyle V={\bigl (}{\begin{smallmatrix}\\1&~\;1\\2&-1\end{smallmatrix}}{\bigr )}}$ and ${\displaystyle D={\bigl (}{\begin{smallmatrix}\\81&0\\~\;0&9\end{smallmatrix}}{\bigr )}}$.

Template:Mvar has principal square root

${\displaystyle D^{1/2}={\bigl (}{\begin{smallmatrix}\\9&0\\0&3\end{smallmatrix}}{\bigr )}}$,

giving the square root

${\displaystyle A^{1/2}=VD^{1/2}V^{-1}={\bigl (}{\begin{smallmatrix}\\5&2\\4&7\end{smallmatrix}}{\bigr )}}$.

When Template:Mvar is symmetric, the diagonalizing matrix Template:Mvar can be made an orthogonal matrix by suitably choosing the eigenvectors (see spectral theorem). Then the inverse of Template:Mvar is simply the transpose, so that

${\displaystyle R=VSV^{T}~.}$

By Jordan decomposition

For non-diagonalizable matrices one can calculate the Jordan normal form followed by a series expansion, similar to the approach described in logarithm of a matrix.

To see that any complex matrix with positive eigenvalues has a square root of the same form, it suffices to check this for a Jordan block. Any such block has the form λ(I + N) with λ > 0 and N nilpotent. If (1 + z)1/2 = 1 + a1 z + a2 z2 + ⋅⋅⋅⋅⋅ is the binomial expansion for the square root (valid in |z| < 1), then as a formal power series its square equals 1 + z. Substituting N for z, only finitely many terms will be non-zero and S = √λ (I + a1 N + a2 N2 + ⋅⋅⋅⋅⋅) gives a square root of the Jordan block with eigenvalue √λ.

It suffices to check uniqueness for a Jordan block with λ = 1. The square constructed above has the form S = I + L where L is polynomial in N without constant term. Any other square root T with positive eigenvalues has the form T = I + M with Template:Mvar nilpotent, commuting with N and hence L. But then 0 = S2T2 = 2(LM)(I + (L + M)/2). Since L and Template:Mvar commute, the matrix L + M is nilpotent and I + (L + M)/2 is invertible with inverse given by a Neumann series. Hence L = Template:Mvar.

If Template:Mvar is a matrix with positive eigenvalues and minimal polynomial p(t), then the Jordan decomposition into generalized eigenspaces of Template:Mvar can be deduced from the partial fraction expansion of p(t)−1. The corresponding projections onto the generalized eigenspaces are given by real polynomials in Template:Mvar. On each eigenspace, Template:Mvar has the form λ(I + N) as above. The power series expression for the square root on the eigenspace show that the principal square root of Template:Mvar has the form q(A) where q(t) is a polynomial with real coefficients.

By Denman–Beavers iteration

Another way to find the square root of an n × n matrix A is the Denman–Beavers square root iteration.[5]

Let Y0 = A and Z0 = I, where I is the n × n identity matrix. The iteration is defined by

{\displaystyle {\begin{aligned}Y_{k+1}&={\tfrac {1}{2}}(Y_{k}+Z_{k}^{-1}),\\Z_{k+1}&={\tfrac {1}{2}}(Z_{k}+Y_{k}^{-1}).\end{aligned}}}

As this uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses,

${\displaystyle X_{n+1}=2X_{n}-X_{n}BX_{n}.}$

With this, for later values of Template:Mvar one would set ${\displaystyle X_{0}=Z_{k-1}^{-1}}$ and ${\displaystyle B=Z_{k},}$ and then use ${\displaystyle Z_{k}^{-1}=X_{n}}$ for some small n (perhaps just 1), and similarly for ${\displaystyle Y_{k}^{-1}.}$

Convergence is not guaranteed, even for matrices that do have square roots, but if the process converges, the matrix ${\displaystyle Y_{k}}$ converges quadratically to a square root Template:Mvar1/2, while ${\displaystyle Z_{k}}$ converges to its inverse, Template:Mvar−1/2.

By the Babylonian method

Yet another iterative method is obtained by taking the well-known formula of the Babylonian method for computing the square root of a real number, and applying it to matrices. Let X0 = I, where I is the identity matrix. The iteration is defined by

${\displaystyle X_{k+1}={\tfrac {1}{2}}(X_{k}+AX_{k}^{-1})\,.}$

Again, convergence is not guaranteed, but if the process converges, the matrix ${\displaystyle X_{k}}$ converges quadratically to a square root A1/2. Compared to Denman–Beavers iteration, an advantage of the Babylonian method is that only one matrix inverse need be computed per iteration step. On the other hand, as Denman–Beavers iteration uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses (see Denman–Beavers iteration above); of course, the same approach can be used to get the single sequence of inverses needed for the Babylonian method. However, unlike Denman–Beavers iteration, the Babylonian method is numerically unstable and more likely to fail to converge.[1]

Square roots of positive operators

{{ safesubst:#invoke:Unsubst||$N=Refimprove |date=__DATE__ |$B= {{#invoke:Message box|ambox}} }}

In linear algebra and operator theory, given a bounded positive semidefinite operator (a non-negative operator) T on a complex Hilbert space, B is a square root of T if T = B* B, where B* denotes the Hermitian adjoint of B. {{ safesubst:#invoke:Unsubst||date=__DATE__ |$B= {{#invoke:Category handler|main}}{{#invoke:Category handler|main}}[citation needed] }} According to the spectral theorem, the continuous functional calculus can be applied to obtain an operator T½ such that T½ is itself positive and (T½)2 = T. The operator T½ is the unique non-negative square root of T. {{ safesubst:#invoke:Unsubst||date=__DATE__ |$B= {{#invoke:Category handler|main}}{{#invoke:Category handler|main}}[citation needed] }}

A bounded non-negative operator on a complex Hilbert space is self adjoint by definition. So T = (T½)* T½. Conversely, it is trivially true that every operator of the form B* B is non-negative. Therefore, an operator T is non-negative if and only if T = B* B for some B (equivalently, T = CC* for some C).

The Cholesky factorization provides another particular example of square root, which should not be confused with the unique non-negative square root.

Unitary freedom of square roots

If T is a non-negative operator on a finite-dimensional Hilbert space, then all square roots of T are related by unitary transformations. More precisely, if T = A*A = B*B, then there exists a unitary U such that A = UB.

Indeed, take B = T½ to be the unique non-negative square root of T. If T is strictly positive, then B is invertible, and so U = AB−1 is unitary:

{\displaystyle {\begin{aligned}U^{*}U&=\left((B^{*})^{-1}A^{*}\right)\left(AB^{-1}\right)=(B^{*})^{-1}T(B^{-1})\\&=(B^{*})^{-1}B^{*}B(B^{-1})=I.\end{aligned}}}

If T is non-negative without being strictly positive, then the inverse of B cannot be defined, but the Moore–Penrose pseudoinverse B+ can be. In that case, the operator B+A is a partial isometry, that is, a unitary operator from the range of T to itself. This can then be extended to a unitary operator U on the whole space by setting it equal to the identity on the kernel of T. More generally, this is true on an infinite-dimensional Hilbert space if, in addition, T has closed range. In general, if A, B are closed and densely defined operators on a Hilbert space H, and A* A = B* B, then A = UB where U is a partial isometry.

Some applications

Square roots, and the unitary freedom of square roots have applications throughout functional analysis and linear algebra.

Polar decomposition

{{#invoke:main|main}} If A is an invertible operator on a finite-dimensional Hilbert space, then there is a unique unitary operator U and positive operator P such that

${\displaystyle A=UP;\,}$

this is the polar decomposition of A. The positive operator P is the unique positive square root of the positive operator AA, and U is defined by U = AP−1.

If A is not invertible, then it still has a polar composition in which P is defined in the same way (and is unique). The unitary operator U is not unique. Rather it is possible to determine a "natural" unitary operator as follows: AP+ is a unitary operator from the range of A to itself, which can be extended by the identity on the kernel of A. The resulting unitary operator U then yields the polar decomposition of A.

Kraus operators

{{#invoke:main|main}}

By Choi's result, a linear map

${\displaystyle \Phi :C^{n\times n}\rightarrow C^{m\times m}}$

is completely positive if and only if it is of the form

${\displaystyle \Phi (A)=\sum _{i}^{k}V_{i}AV_{i}^{*}}$

where knm. Let {Ep q} ⊂ Cn × n be the n2 elementary matrix units. The positive matrix

${\displaystyle M_{\Phi }=(\Phi (E_{pq}))_{pq}\in C^{nm\times nm}}$

is called the Choi matrix of Φ. The Kraus operators correspond to the, not necessarily square, square roots of MΦ: For any square root B of MΦ, one can obtain a family of Kraus operators Vi by undoing the Vec operation to each column bi of B. Thus all sets of Kraus operators are related by partial isometries.

Mixed ensembles

{{#invoke:main|main}}

In quantum physics, a density matrix for an n-level quantum system is an n × n complex matrix ρ that is positive semidefinite with trace 1. If ρ can be expressed as

${\displaystyle \rho =\sum _{i}p_{i}v_{i}v_{i}^{*}}$

where ∑ pi = 1, the set

${\displaystyle \{p_{i},v_{i}\}\,}$

is said to be an ensemble that describes the mixed state ρ. Notice {vi} is not required to be orthogonal. Different ensembles describing the state ρ are related by unitary operators, via the square roots of ρ. For instance, suppose

${\displaystyle \rho =\sum _{j}a_{j}a_{j}^{*}.}$

The trace 1 condition means

${\displaystyle \sum _{j}a_{j}^{*}a_{j}=1.}$

Let

${\displaystyle p_{i}=a_{i}^{*}a_{i},}$

and vi be the normalized ai. We see that

${\displaystyle \{p_{i},v_{i}\}\,}$

gives the mixed state ρ.

Unscented Kalman Filter

{{#invoke:main|main}}

In the Unscented Kalman Filter (UKF),[6] the square root of the state error covariance matrix is required for the unscented transform which is the statistical linearization method used. A comparison between different matrix square root calculation methods within a UKF application of GPS/INS sensor fusion was presented, which indicated that the Cholesky decomposition method was best suited for UKF applications.[7]

Notes

1. {{#invoke:citation/CS1|citation |CitationClass=citation }}
2. Mitchell, Douglas W. "Using Pythagorean triples to generate square roots of I2". The Mathematical Gazette 87, November 2003, 499-500.
3. For analytic functions of matrices, see
4. For the holomorphic functional calculus, see:
5. {{#invoke:citation/CS1|citation |CitationClass=citation }}
6. {{#invoke:citation/CS1|citation |CitationClass=citation }}

References

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}, Chapter IV, Reisz functional calculus

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}

• {{#invoke:citation/CS1|citation

|CitationClass=citation }}