In [[mathematics]], and in particular [[linear algebra]], a '''pseudoinverse''' {{math|''A''<sup>+</sup>}} of a  [[matrix (mathematics)|matrix]] {{math|''A''}} is a [[Generalized inverse|generalization]] of the [[inverse matrix]].<ref name="IG2003">{{cite book | last=Ben-Israel | first = Adi | coauthors=[[Thomas N.E. Greville]] | title=Generalized Inverses | isbn=0-387-00293-6 | publisher=[[Springer Science+Business Media|Springer-Verlag]] | year=2003}}</ref>  The most widely known type of matrix pseudoinverse is the '''Moore–Penrose pseudoinverse''', which was independently described by [[E. H. Moore]]<ref name="Moore1920">{{cite journal | last=Moore | first=E. H. | authorlink=E. H. Moore | title=On the reciprocal of the general algebraic matrix | journal=[[Bulletin of the American Mathematical Society]] | volume=26 |issue=9| pages=394–395 | year=1920 | url =http://projecteuclid.org/euclid.bams/1183425340 | doi = 10.1090/S0002-9904-1920-03322-7 }}</ref> in 1920, [[Arne Bjerhammar]]<ref name="Bjerhammar1951">{{cite journal | last=Bjerhammar| first=Arne| authorlink=Arne Bjerhammar | title=Application of calculus of matrices to method of least squares; with special references to geodetic calculations| journal=Trans. Roy. Inst. Tech. Stockholm | year=1951 | volume = 49}}</ref> in 1951 and [[Roger Penrose]]<ref name="Penrose1955">{{cite journal | last=Penrose | first=Roger | authorlink=Roger Penrose | title=A generalized inverse for matrices | journal=[[Proceedings of the Cambridge Philosophical Society]] | volume=51 | pages=406–413 | year=1955 | doi=10.1017/S0305004100030401}}</ref> in 1955. Earlier, [[Erik Ivar Fredholm|Fredholm]] had introduced the concept of a pseudoinverse of [[integral operator]]s in 1903. When referring to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose pseudoinverse. The term [[generalized inverse]] is sometimes used as a synonym for pseudoinverse.
 
A common use of the Moore–Penrose pseudoinverse (hereafter, just pseudoinverse) is to compute a 'best fit' ([[Ordinary least squares|least squares]]) solution to a [[system of linear equations]] that lacks a unique solution (see below under [[#Applications|Applications]]).
Another use is to find the minimum ([[Euclidean norm|Euclidean]]) norm solution to a system of linear equations with multiple solutions. The pseudoinverse facilitates the statement and proof of results in linear algebra.
 
The pseudoinverse is defined and unique for all matrices whose entries are [[Real number|real]] or [[Complex number|complex]] numbers. It can be computed using the [[singular value decomposition]].
 
==Notation==
In the following discussion, the following conventions are adopted.
 
*<math>\mathbb{K}</math> will denote one of the [[field (mathematics)|fields]] of real or complex numbers, denoted <math>\mathbb{R},\,\mathbb{C}</math>, respectively. The vector space of <math>m \times n</math> matrices over <math>\mathbb{K}</math> is denoted by <math>M(m,n;\mathbb{K})</math>.
*For <math>A \in M(m,n;\mathbb{K})</math>, <math>A^T</math> and <math>A^{*}</math> denote the transpose and Hermitian transpose (also called [[conjugate transpose]]) respectively. If <math>\mathbb{K} = \mathbb{R}</math>, then <math>A^* = A^T</math>.
*For <math>A \in M(m,n;\mathbb{K})</math>, then <math>\operatorname{Im}(A)</math> denotes the [[column space|range]] (image) of <math>A</math> (the space spanned by the column vectors of <math>A</math>) and <math>\operatorname{Ker}(A)</math> denotes the [[kernel (matrix)|kernel]] (null space) of <math>A</math>.
*Finally, for any positive integer <math>n</math>, <math>I_{n} \in M(n,n;\mathbb{K})</math> denotes the <math>n \times n</math> [[identity matrix]].
 
==Definition==
For <math> A \in M(m,n;\mathbb{K}) </math>,
a Moore–Penrose pseudoinverse (hereafter, just pseudoinverse) of <math> A </math> is defined as a matrix
<math> A^+ \in M(n,m;\mathbb{K}) </math>
satisfying all of the following four criteria:<ref name="Penrose1955"/><ref name="GvL1996">{{cite book | last=Golub | first=Gene H. | authorlink=Gene H. Golub | coauthors=[[Charles F. Van Loan]] | title=Matrix computations | edition=3rd | publisher=Johns Hopkins | location=Baltimore | year=1996 | isbn=0-8018-5414-8 | pages = 257–258}}</ref>
# <math>A A^+A = A\,\!</math> &nbsp; &nbsp; &nbsp; ({{math|''AA''<sup>+</sup>}} need not be the identity matrix, but it maps each column vector of {{math|''A''}} to itself);
# <math>A^+A A^+ = A^+\,\!</math> &nbsp; &nbsp; &nbsp; ({{math|''A''<sup>+</sup>}} is a [[weak inverse]] for the multiplicative [[semigroup]]);
# <math>(AA^+)^* = AA^+\,\!</math> &nbsp; &nbsp; &nbsp; ({{math|''AA''<sup>+</sup>}} is [[Hermitian matrix|Hermitian]]); and
# <math>(A^+A)^* = A^+A\,\!</math> &nbsp; &nbsp; &nbsp; ({{math|''A''<sup>+</sup>''A''}} is also Hermitian).
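The following snippet is an illustrative sketch, not part of the formal definition: assuming [[NumPy]] is available, it checks the four criteria numerically for the pseudoinverse of a randomly generated complex matrix.
<syntaxhighlight lang="python">
import numpy as np

# A randomly generated rectangular complex test matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
A_plus = np.linalg.pinv(A)  # Moore-Penrose pseudoinverse (computed via the SVD)

# The four Penrose conditions:
assert np.allclose(A @ A_plus @ A, A)                  # 1. A A+ A  = A
assert np.allclose(A_plus @ A @ A_plus, A_plus)        # 2. A+ A A+ = A+
assert np.allclose((A @ A_plus).conj().T, A @ A_plus)  # 3. A A+ is Hermitian
assert np.allclose((A_plus @ A).conj().T, A_plus @ A)  # 4. A+ A is Hermitian
</syntaxhighlight>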
 
==Properties==
Proofs for some of these facts may be found on a separate page [[Proofs involving the Moore–Penrose pseudoinverse|here]].
 
===Existence and uniqueness===
*The Moore–Penrose pseudoinverse exists and is unique: for any matrix <math> A\,\!</math>, there is precisely one matrix <math> A^+\,\!</math> that satisfies the four properties of the definition.<ref name="GvL1996"/>
 
A matrix satisfying the first two conditions of the definition is known as a [[generalized inverse]]. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.
 
===Basic properties===
* If <math>A\,\!</math> has real entries, then so does <math> A^+\,\!</math>.
* If <math>A\,\!</math> is [[invertible matrix|invertible]], its pseudoinverse is its inverse.  That is: <math>A^+=A^{-1}\,\!</math>.<ref name="SB2002">{{Cite book | last1=Stoer | first1=Josef | last2=Bulirsch | first2=Roland | title=Introduction to Numerical Analysis | publisher=[[Springer-Verlag]] | location=Berlin, New York | edition=3rd | isbn=978-0-387-95452-3 | year=2002}}.</ref>{{rp|243}}
* The pseudoinverse of a [[zero matrix]] is its transpose.
* The pseudoinverse of the pseudoinverse is the original matrix: <math>(A^+)^+=A\,\!</math>.<ref name="SB2002" />{{rp|245}}
* Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:<ref name="SB2002"/>{{rp|245}} <!-- reference only mentions the last bit -->
::<math>(A^T)^+ = (A^+)^T,~~ \overline{A}^+ = \overline{A^+},~~ (A^*)^+ = (A^+)^*.\,\!</math>
* The pseudoinverse of a scalar multiple of {{math|''A''}} is the reciprocal multiple of {{math|''A''<sup>+</sup>}}:
::<math>(\alpha A)^+ = \alpha^{-1} A^+\,\!</math> for <math>\alpha\neq 0</math>.
 
====Identities====
The following identities can be used to cancel certain subexpressions or expand expressions involving pseudoinverses. Proofs for these properties can be found in the [[Proofs involving the Moore–Penrose pseudoinverse|proofs subpage]].
<!-- If you know how to specify that less space is to be used between columns, please do so! @{} did not work for me: -->
::<math>\begin{array}{lclll}
A^+ &=& A^+  & A^{+*} & A^*\\
A^+ &=& A^*  & A^{+*} & A^+\\
A  &=& A^{+*}& A^*    & A  \\
A  &=& A    & A^*    & A^{+*}\\
A^* &=& A^*  & A      & A^+\\
A^* &=& A^+  & A      & A^*\\
\end{array}</math>
 
===Reduction to Hermitian case===
*<math>A^+ = (A^*A)^+A^*\,\!</math>
*<math>A^+ = A^*(AA^*)^+\,\!</math>
 
===Products===
If <math> A \in M(m,n;\mathbb{K}),~B \in M(n,p;\mathbb{K})\,\!</math> and at least one of the following conditions holds:
* <math> A\,\!</math> has orthonormal columns (i.e. <math> A^*A = I_n\, </math>), or
* <math> B\,\!</math> has orthonormal rows (i.e. <math> BB^* = I_n\, </math>), or
* <math> A\,\!</math> has all columns linearly independent (full column rank) and <math> B\,\!</math> has all rows linearly independent (full row rank),
 
then <math>(AB)^+ = B^+ A^+\,\!</math>.
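As a quick numerical sketch of the third condition (assuming NumPy; the random test matrices are generically of full rank):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(12)
A = rng.standard_normal((6, 3))  # full column rank (generically)
B = rng.standard_normal((3, 5))  # full row rank (generically)

# Under these rank conditions, (AB)+ = B+ A+.
assert np.allclose(np.linalg.pinv(A @ B),
                   np.linalg.pinv(B) @ np.linalg.pinv(A))
</syntaxhighlight>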
 
===Projectors===
<math>P = AA^+\,\!</math> and <math>Q = A^+A\,\!</math> are [[projection (linear algebra)|orthogonal projection operators]]; that is, they are Hermitian (<math> P = P^*\,\!</math>, <math> Q = Q^*\,\!</math>) and idempotent (<math> P^2 = P\,\!</math> and <math> Q^2 = Q\,\!</math>). The following hold:
*<math>PA=A=AQ\,\!</math> and <math>A^+P=A^+=QA^+\,\!</math>
*<math>P\,\!</math> is the [[orthogonal projector]] onto the [[range (mathematics)|range]] of <math>A\,\!</math> (which equals the [[orthogonal complement]] of the kernel of <math> A^*\,\!</math>).
*<math>Q\,\!</math> is the orthogonal projector onto the [[range (mathematics)|range]] of <math>A^*\,\!</math> (which equals the [[orthogonal complement]] of the kernel of <math> A\,\!</math>).
*<math>(I - P)\,\!</math> is the orthogonal projector onto the [[kernel (linear algebra)|kernel]] of <math>A^*\,\!</math>.
*<math>(I - Q)\,\!</math> is the orthogonal projector onto the [[kernel (linear algebra)|kernel]] of <math>A\,\!</math>.<ref name="GvL1996"/>
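These projector properties can be checked numerically; a minimal sketch assuming NumPy and a real test matrix (so that the Hermitian transpose is the ordinary transpose):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(13)
A = rng.standard_normal((5, 3))
A_plus = np.linalg.pinv(A)

P, Q = A @ A_plus, A_plus @ A
assert np.allclose(P, P.T) and np.allclose(P @ P, P)    # P Hermitian, idempotent
assert np.allclose(Q, Q.T) and np.allclose(Q @ Q, Q)    # Q Hermitian, idempotent
assert np.allclose(P @ A, A) and np.allclose(A @ Q, A)  # PA = A = AQ
</syntaxhighlight>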
 
===Subspaces===
*<math> \operatorname{Ker}(A^+) = \operatorname{Ker}(A^*)\,\!</math>
*<math> \operatorname{Im}(A^+) = \operatorname{Im}(A^*)\,\!</math>
 
===Limit relations===
* The pseudoinverse can be computed as a limit:
:<math>A^+ = \lim_{\delta \searrow 0} (A^* A + \delta I)^{-1} A^*
          = \lim_{\delta \searrow 0} A^* (A A^* + \delta I)^{-1}</math>
:(see [[Tikhonov regularization]]). These limits exist even if <math>(AA^*)^{-1}\,\!</math>  or <math>(A^*A)^{-1}\,\!</math> do not exist.<ref name="GvL1996"/>{{rp|263}}
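A small numerical sketch of this limit (assuming NumPy): as the regularization parameter <math>\delta</math> shrinks, the Tikhonov-regularized inverse approaches the pseudoinverse.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # real, so A* is just A.T
A_plus = np.linalg.pinv(A)

for delta in (1e-1, 1e-4, 1e-8):
    tikhonov = np.linalg.inv(A.T @ A + delta * np.eye(3)) @ A.T
    print(f"delta={delta:.0e}  error={np.linalg.norm(tikhonov - A_plus):.2e}")
</syntaxhighlight>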
 
===Continuity===
* In contrast to ordinary matrix inversion, the process of taking pseudoinverses is not [[continuous function|continuous]]: if the sequence <math>(A_n)</math> converges to the matrix {{math|''A''}} (in the [[matrix norm|maximum norm or Frobenius norm]], say), then {{math|(''A<sub>n</sub>'')<sup>+</sup>}} need not converge to {{math|''A''<sup>+</sup>}}. However, if all the matrices have the same rank, {{math|(''A<sub>n</sub>'')<sup>+</sup>}} will converge to {{math|''A''<sup>+</sup>}}.<ref name="rakocevic1997">{{cite journal | last=Rakočević | first=Vladimir | title=On continuity of the Moore-Penrose and Drazin inverses | journal=Matematički Vesnik | volume=49 | pages=163–172 | year=1997 | url =http://elib.mi.sanu.ac.rs/files/journals/mv/209/mv973404.pdf }}</ref>
 
===Derivative===
The derivative of the pseudoinverse of a real-valued matrix which has constant rank at a point <math>x</math> may be calculated in terms of the derivative of the original matrix:<ref>http://mathoverflow.net/questions/25778/analytical-formula-for-numerical-derivative-of-the-matrix-pseudo-inverse</ref>
::<math>
\frac{\mathrm d}{\mathrm d x} A^+(x) =
-A^+ \left( \frac{\mathrm d}{\mathrm d x} A \right) A^+
+ A^+ (A^+)^T \left( \frac{\mathrm d}{\mathrm d x} A^T \right) \left(I - A A^+\right)
+ \left(I - A^+ A\right) \left( \frac{\mathrm d}{\mathrm d x} A^T \right) (A^+)^T A^+
</math>
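The formula can be sanity-checked against a central finite difference; the sketch below (assuming NumPy) uses a path <math>A(x) = A_0 + xB</math> of generically full-rank matrices, so the constant-rank hypothesis holds near the test point.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(6)
A0 = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3))
A = lambda x: A0 + x * B  # smooth matrix path; dA/dx = B
x, h = 0.3, 1e-6

Ax, P = A(x), np.linalg.pinv(A(x))
I_m, I_n = np.eye(5), np.eye(3)
analytic = (-P @ B @ P
            + P @ P.T @ B.T @ (I_m - Ax @ P)
            + (I_n - P @ Ax) @ B.T @ P.T @ P)
numeric = (np.linalg.pinv(A(x + h)) - np.linalg.pinv(A(x - h))) / (2 * h)
assert np.allclose(analytic, numeric, atol=1e-5)
</syntaxhighlight>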
 
==Special cases==
 
===Scalars===
It is also possible to define a pseudoinverse for scalars and vectors, by treating them as <math>1 \times 1</math> and <math>n \times 1</math> matrices, respectively. The pseudoinverse of a scalar {{math|''x''}} is zero if {{math|''x''}} is zero and the reciprocal of {{math|''x''}} otherwise:
:<math>x^+ = \left\{\begin{matrix} 0, & \mbox{if }x=0;
\\ x^{-1}, & \mbox{otherwise}. \end{matrix}\right. </math>
 
===Vectors===
The pseudoinverse of the null (all zero) vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude:
:<math>x^+ = \left\{\begin{matrix} 0^T, & \mbox{if }x = 0;
\\ {x^* \over x^* x}, & \mbox{otherwise}. \end{matrix}\right. </math>
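Both special cases follow this pattern in code; <code>scalar_pinv</code> and <code>vector_pinv</code> below are illustrative helper names, not library functions (NumPy assumed).
<syntaxhighlight lang="python">
import numpy as np

def scalar_pinv(x):
    """Pseudoinverse of a scalar: 0 if x is 0, the reciprocal otherwise."""
    return 0.0 if x == 0 else 1.0 / x

def vector_pinv(x):
    """Pseudoinverse of a column vector x, returned as a 1 x n row vector."""
    x = np.asarray(x)
    nrm2 = np.vdot(x, x).real          # squared magnitude x* x
    if nrm2 == 0:
        return np.zeros((1, x.size))   # transposed null vector
    return x.conj().reshape(1, -1) / nrm2

x = np.array([3.0, 4.0])
print(vector_pinv(x))                    # [[0.12 0.16]]
print(np.linalg.pinv(x.reshape(-1, 1)))  # agrees with NumPy's pinv
</syntaxhighlight>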
 
===Linearly independent columns===
If the '''columns''' of <math>A\,\!</math> are [[linear independence|linearly independent]]
(so that <math>m \ge n</math>), then
<math>A^*A\,\!</math> is invertible. In this case, an explicit formula is:<ref name="IG2003"/>
:<math>A^+ = (A^*A)^{-1}A^*\,\!</math>.
It follows that <math>A^+\,\!</math> is then a left inverse of
<math>A\,\!</math>: &nbsp; <math> A^+ A = I_n\,\!</math>.
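A brief sketch of this case (assuming NumPy; a random tall matrix has full column rank generically); the full-row-rank case below is entirely analogous.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))        # tall: full column rank (generically)

A_plus = np.linalg.inv(A.T @ A) @ A.T  # explicit formula A+ = (A*A)^(-1) A*
assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(A_plus @ A, np.eye(3))  # A+ is a left inverse of A
</syntaxhighlight>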
 
===Linearly independent rows===
If the '''rows''' of <math>A\,\!</math> are linearly independent (so that <math>m \le n</math>), then
<math>A A^*</math> is invertible. In this case, an explicit formula is:
:<math>A^+ = A^*(A A^*)^{-1}\,\!</math>.
It follows that <math>A^+\,\!</math> is a right inverse of
<math>A\,\!</math>: &nbsp; <math>A A^+ = I_m\,\!</math>.
 
===Orthonormal columns or rows===
This is a special case of either full column rank or full row rank (treated above).
If <math>A\,\!</math> has orthonormal columns (<math>A^*A = I_n\,\!</math>)
or orthonormal rows (<math>AA^* = I_m\,\!</math>),
then <math>A^+ = A^*\,\!</math>.
 
===Circulant matrices===
For a [[circulant matrix]] <math>C\,\!</math>, the singular value decomposition is given by the [[Fourier transform]]; that is, the singular values are the Fourier coefficients.
Let <math>\mathcal{F}</math> be the [[DFT matrix|Discrete Fourier Transform (DFT) matrix]]; then
:<math>C = \mathcal{F}\cdot\Sigma\cdot\mathcal{F}^*\,\!</math>
:<math>C^+ = \mathcal{F}\cdot\Sigma^+\cdot\mathcal{F}^*\,\!</math><ref name="Stallings1972">{{cite journal | last=Stallings | first=W. T. | authorlink=W. T. Stallings | title=The Pseudoinverse of an r-Circulant Matrix | journal=[[Proceedings of the American Mathematical Society]] | volume=34 | pages=385–388 | year=1972 | doi=10.2307/2038377 | last2=Boullion | first2=T. L.}}</ref>
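A numerical sketch of this diagonalization (assuming NumPy and SciPy; note that <code>numpy.fft</code> implements the unnormalized DFT, hence the division by <math>n</math> below, and <code>scipy.linalg.circulant</code> is used only to build the test matrix):
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import circulant

c = np.array([2.0, 1.0, 0.0, 1.0])  # first column; this circulant is singular
C = circulant(c)
n = len(c)

lam = np.fft.fft(c)                 # eigenvalues of C are the DFT of c
F = np.fft.fft(np.eye(n))           # DFT matrix
assert np.allclose(C, F.conj().T @ np.diag(lam) @ F / n)

lam_plus = np.zeros_like(lam)       # pseudo-reciprocal of the eigenvalues
nz = np.abs(lam) > 1e-12 * np.abs(lam).max()
lam_plus[nz] = 1.0 / lam[nz]
C_plus = F.conj().T @ np.diag(lam_plus) @ F / n
assert np.allclose(C_plus, np.linalg.pinv(C))
</syntaxhighlight>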
 
==Construction==
 
===Rank decomposition===
Let <math> r \le \min(m,n)</math> denote the [[rank (matrix theory)|rank]] of
<math>A \in M(m,n;\mathbb{K})\,\!</math>. Then <math>A\,\!</math> can be [[rank factorization|(rank) decomposed]] as
<math>A=BC\,\!</math> where
<math>B \in M(m,r;\mathbb{K})\,\!</math> and
<math>C \in M(r,n;\mathbb{K})\,\!</math> are of rank <math>r</math>.
Then <math>A^+ = C^+B^+ = C^*(CC^*)^{-1}(B^*B)^{-1}B^*\,\!</math>.
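A sketch of this construction (assuming NumPy; the factors of a random product are generically of full rank):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 2))  # full column rank (generically)
C = rng.standard_normal((2, 4))  # full row rank (generically)
A = B @ C                        # rank decomposition of a rank-2 matrix

A_plus = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
assert np.allclose(A_plus, np.linalg.pinv(A))
</syntaxhighlight>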
 
===The QR method===
For <math>\mathbb{K}=\mathbb{R}\,\!</math> or <math>\mathbb{K}=\mathbb{C}\,\!</math>,
explicitly computing the product <math>AA^*</math> or <math>A^*A</math> and its inverse is often a source of numerical rounding errors and of computational cost in practice.
An alternative approach using the [[QR decomposition]] of <math>A\,\!</math> may be used instead.
 
Consider the case when <math>A\,\!</math> is of full column rank, so that
<math>A^+ = (A^*A)^{-1}A^*\,\!</math>. Then the [[Cholesky decomposition]]
<math>A^*A = R^*R\,\!</math>,
where <math>R\,\!</math> is an [[upper triangular matrix]], may be used.
Multiplication by the inverse is then done easily by solving a system with multiple right-hand sides,
:<math>A^+ = (A^*A)^{-1}A^*  \qquad \Leftrightarrow \qquad  (A^*A)A^+ = A^*  \qquad \Leftrightarrow \qquad R^*RA^+ = A^* </math>
which may be solved by [[forward substitution]] followed by [[back substitution]].
 
The Cholesky decomposition may be computed without forming <math>A^*A\,\!</math> explicitly, by instead using the [[QR decomposition]] <math> A = QR\,\!</math>,
where <math>Q\,\!</math> has orthonormal columns, <math> Q^*Q = I </math>, and
<math>R\,\!</math> is upper triangular. Then
:<math> A^*A \,=\, (QR)^*(QR) \,=\, R^*Q^*QR \,=\, R^*R</math>,
so {{math|''R''}} is the Cholesky factor of <math>A^*A</math>.
 
The case of full row rank is treated analogously, using the formula
<math>A^+ = A^*(AA^*)^{-1}\,\!</math> and swapping the roles of <math>A</math> and
<math>A^*</math>.
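A sketch of the full-column-rank case (assuming NumPy and SciPy): the triangular factor from the QR decomposition stands in for an explicit Cholesky factorization of <math>A^*A</math>, and two triangular solves replace the explicit inverse.
<syntaxhighlight lang="python">
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 3))  # full column rank (generically)

Q, R = np.linalg.qr(A)  # reduced QR; R equals the Cholesky factor up to row signs
# Solve R* R A+ = A* by forward substitution (R*) then back substitution (R).
Y = solve_triangular(R.T, A.T, lower=True)
A_plus = solve_triangular(R, Y, lower=False)
assert np.allclose(A_plus, np.linalg.pinv(A))
</syntaxhighlight>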
 
===Singular value decomposition (SVD)===
A computationally simple and accurate way to compute the pseudoinverse is by using the [[singular value decomposition]].<ref name="IG2003"/><ref name="GvL1996"/><ref name="SLEandPI">[http://www.uwlax.edu/faculty/will/svd/systems/index.html Linear Systems & Pseudo-Inverse]</ref> If <math>A = U\Sigma V^*</math> is the singular value decomposition of {{math|''A''}}, then <math>A^+ = V\Sigma^+ U^*</math>. For a [[diagonal matrix]] such as <math>\Sigma</math>, we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the [[MATLAB]] or [[NumPy]] function <code>pinv</code>, the tolerance is taken to be {{math|''t'' {{=}} ε•max(''m'',''n'')•max(Σ)}}, where ε is the [[machine epsilon]].
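A minimal sketch of this procedure (assuming NumPy; <code>pinv_svd</code> is an illustrative helper name, and the tolerance mirrors the rule quoted above):
<syntaxhighlight lang="python">
import numpy as np

def pinv_svd(A):
    """Pseudoinverse via the SVD with tolerance t = eps * max(m, n) * max(s)."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    tol = np.finfo(s.dtype).eps * max(A.shape) * s.max()
    s_plus = np.zeros_like(s)
    s_plus[s > tol] = 1.0 / s[s > tol]  # invert only the "nonzero" singular values
    return Vh.conj().T @ (s_plus[:, None] * U.conj().T)

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 6))
assert np.allclose(pinv_svd(A), np.linalg.pinv(A))
</syntaxhighlight>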
 
The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than that of matrix-matrix multiplication, even if a state-of-the-art implementation (such as that of [[LAPACK]]) is used.
 
The above procedure shows why taking the pseudoinverse is not a continuous operation: if the original matrix {{math|''A''}} has a singular value 0 (a diagonal entry of the matrix <math>\Sigma</math> above), then modifying {{math|''A''}} slightly may turn this zero into a tiny positive number, thereby affecting the pseudoinverse dramatically as we now have to take the reciprocal of a tiny number.
 
===Block matrices===
[[Block matrix pseudoinverse|Optimized approaches]] exist for calculating the pseudoinverse of block structured matrices.
 
===The iterative method of Ben-Israel and Cohen===
Another method for computing the pseudoinverse uses the recursion
:<math> A_{i+1} = 2A_i - A_i A A_i, \, </math>
which is sometimes referred to as the hyper-power sequence. This recursion produces a sequence converging quadratically to the pseudoinverse of <math>A</math> if it is started with an appropriate <math>A_0</math> satisfying <math>A_0 A = (A_0 A)^*</math>. The choice <math>A_0 = \alpha A^*</math> (where <math>0 < \alpha < 2/\sigma^2_1(A)</math>, with <math>\sigma_1(A)</math> denoting the largest singular value of <math>A</math>)<ref>{{cite journal | last1=Ben-Israel | first1=Adi | last2=Cohen | first2=Dan | title=On Iterative Computation of Generalized Inverses and Associated Projections | journal=SIAM Journal on Numerical Analysis | volume=3 | pages=410–419 | year=1966 | jstor=2949637 | doi=10.1137/0703035 }}[http://benisrael.net/COHEN-BI-ITER-GI.pdf pdf]</ref> has been argued not to be competitive with the SVD-based method mentioned above, because even for moderately ill-conditioned matrices it takes a long time before <math>A_i</math> enters the region of quadratic convergence.<ref>{{cite journal | last1=Söderström | first1=Torsten | last2=Stewart | first2=G. W. | title=On the Numerical Properties of an Iterative Method for Computing the Moore–Penrose Generalized Inverse | journal=SIAM Journal on Numerical Analysis | volume=11 | pages=61–74 | year=1974 | jstor=2156431 | doi=10.1137/0711008 }}</ref> However, if started with <math>A_0</math> already close to the Moore–Penrose pseudoinverse and satisfying <math>A_0 A = (A_0 A)^*</math>, for example <math>A_0 := (A^* A + \delta I)^{-1} A^*</math>, convergence is fast (quadratic).
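A sketch of the iteration with the starting value discussed above (assuming NumPy; the iteration count is generous for a well-conditioned random test matrix):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))

sigma1 = np.linalg.norm(A, 2)  # largest singular value
X = A.T / sigma1**2            # A0 = alpha A* with alpha = 1/sigma_1^2 < 2/sigma_1^2

for _ in range(50):            # hyper-power recursion
    X = 2 * X - X @ A @ X

assert np.allclose(X, np.linalg.pinv(A))
</syntaxhighlight>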
 
===Updating the pseudoinverse===
 
For the cases where {{math|''A''}} has full row or column rank, and the inverse of the correlation matrix (<math>AA^*</math> for {{math|''A''}} with full row rank or <math>A^*A</math> for full column rank) is already known, the pseudoinverse for matrices related to <math>A</math> can be computed by applying the [[Sherman–Morrison–Woodbury formula]] to update the inverse of the correlation matrix, which may require less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms<ref name="G1992">{{Cite journal |author= Tino Gramß |title= Worterkennung mit einem künstlichen neuronalen Netzwerk |publisher= Georg-August-Universität zu Göttingen |year= 1992}}</ref><ref name="EMTIYAZ2008">Mohammad Emtiyaz, "Updating Inverse of a Matrix When a Column is Added/Removed" [http://www.cs.ubc.ca/~emtiyaz/Writings/OneColInv.pdf]</ref> exist that exploit the relationship.
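For instance, when a row is appended to a matrix of full column rank, the known inverse of <math>A^*A</math> can be updated with a rank-one Sherman–Morrison step; a sketch assuming NumPy and a real matrix:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((10, 4))
G = np.linalg.inv(A.T @ A)  # known inverse of the correlation matrix

a = rng.standard_normal(4)  # new row appended to A
Ga = G @ a                  # Sherman-Morrison rank-one update of (A*A + a a*)^-1
G_new = G - np.outer(Ga, Ga) / (1.0 + a @ Ga)

A_new = np.vstack([A, a])
assert np.allclose(G_new, np.linalg.inv(A_new.T @ A_new))
assert np.allclose(G_new @ A_new.T, np.linalg.pinv(A_new))  # updated pseudoinverse
</syntaxhighlight>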
 
Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.<ref>Meyer, Carl D., Jr. Generalized inverses and ranks of block matrices. SIAM J. Appl. Math. 25 (1973), 597–602</ref><ref>Meyer, Carl D., Jr. Generalized inversion of modified matrices. SIAM J. Appl. Math. 24 (1973), 315–323</ref>
 
===Software libraries===
The package [[NumPy]] provides a pseudoinverse calculation through its functions <code>matrix.I</code> and <code>linalg.pinv</code>; its <code>pinv</code> uses the SVD-based algorithm. [[SciPy]] adds a function <code>scipy.linalg.pinv</code> that uses a least-squares solver. High-quality implementations of SVD, QR, and back substitution are available in [[Singular_value_decomposition#Implementations|standard libraries]], such as [[LAPACK]]. Writing one's own implementation of SVD is a major programming project that requires significant [[Floating_point#Accuracy_problems|numerical expertise]]. In special circumstances, such as [[parallel computing]] or [[embedded computing]], however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.
 
==Applications==
 
===Linear least-squares===
{{See also|Linear least squares (mathematics)}}
 
The pseudoinverse provides a [[linear least squares (mathematics)|least squares]] solution to a [[system of linear equations]].<ref name="Penrose1956">{{cite journal | last=Penrose | first=Roger | title=On best approximate solution of linear matrix equations | journal=[[Proceedings of the Cambridge Philosophical Society]] | volume=52 | pages=17–19 | year=1956 | doi=10.1017/S0305004100030929}}</ref>
For <math> A \in M(m,n; \mathbb{K})\,\!</math>, given a system of linear equations
:<math>A x = b\,</math>,
in general, a vector <math>x</math> that solves the system may not exist, or, if one does exist, it may not be unique. The pseudoinverse solves the "least-squares" problem as follows:
 
*<math> \forall x \in \mathbb{K}^n\,\!</math>, we have <math>\|Ax -b\|_2 \ge \|Az -b\|_2</math> where <math>z = A^+b</math> and <math>\|\cdot\|_2</math> denotes the [[Euclidean norm]].  This weak inequality holds with equality if and only if <math>x = A^+b + (I - A^+A)w</math> for some vector ''w''; this provides an infinitude of minimizing solutions unless ''A'' has full column rank, in which case <math>(I - A^+A)</math> is the zero matrix.
 
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by
the Frobenius norm. Let <math> B \in M(m,p; \mathbb{K})\,\!</math>.
 
*<math> \forall X \in M(n,p; \mathbb{K})\,\!</math>, we have <math> \|AX - B\|_F \ge \|AZ -B\|_F</math> where <math>Z = A^+B</math> and <math>\|\cdot\|_F </math> denotes the [[Frobenius norm]].
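A sketch of the single right-hand-side statement (assuming NumPy; <code>numpy.linalg.lstsq</code> returns a least-squares solution, which coincides with <math>A^+b</math> for this generically full-column-rank test matrix):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)

z = np.linalg.pinv(A) @ b  # least-squares solution A+ b
assert np.allclose(z, np.linalg.lstsq(A, b, rcond=None)[0])

x = rng.standard_normal(4)  # any other candidate vector
assert np.linalg.norm(A @ x - b) >= np.linalg.norm(A @ z - b)
</syntaxhighlight>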
 
===Obtaining all solutions of a linear system===
 
If the linear system
 
:<math>A x = b\,</math>
 
has any solutions, they are all given by
 
:<math>x = A^+ b + [I - A^+ A]w</math>
 
for an arbitrary vector ''w''. A solution exists if and only if <math>AA^+ b = b</math>. If this condition holds, the solution is unique if and only if ''A'' has full column rank, in which case <math>[I - A^+ A]</math> is the zero matrix.
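A sketch of this parametrization for an underdetermined system (assuming NumPy; the random wide matrix has full row rank generically, so the solvability test passes):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((3, 5))  # wide: infinitely many solutions (generically)
b = rng.standard_normal(3)
A_plus = np.linalg.pinv(A)

assert np.allclose(A @ A_plus @ b, b)  # solvability criterion A A+ b = b

for _ in range(3):                     # every choice of w yields a solution
    w = rng.standard_normal(5)
    x = A_plus @ b + (np.eye(5) - A_plus @ A) @ w
    assert np.allclose(A @ x, b)
</syntaxhighlight>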
 
===Minimum-norm solution to a linear system===
For linear systems <math>A x = b,\,</math> with non-unique solutions (such as under-determined systems), the pseudoinverse may be used to construct the solution of minimum [[Euclidean norm]]
<math>\|x\|_2</math> among all solutions.
 
*If <math>A x = b\,</math> is satisfiable, the vector <math>z = A^+b</math> is a solution, and satisfies <math>\|z\|_2 \le \|x\|_2</math> for all solutions.
 
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by
the Frobenius norm. Let <math> B \in M(m,p; \mathbb{K})\,\!</math>.
 
*If <math>A X = B\,</math> is satisfiable, the matrix <math>Z = A^+B</math> is a solution, and satisfies <math>\|Z\|_F \le \|X\|_F</math> for all solutions.
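Continuing the sketch above (assuming NumPy), the pseudoinverse solution is never longer than any other solution of the same system:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(10)
A = rng.standard_normal((3, 5))  # underdetermined system
b = rng.standard_normal(3)
A_plus = np.linalg.pinv(A)

z = A_plus @ b                           # minimum-norm solution
w = rng.standard_normal(5)
x = z + (np.eye(5) - A_plus @ A) @ w     # another solution
assert np.allclose(A @ z, b) and np.allclose(A @ x, b)
assert np.linalg.norm(z) <= np.linalg.norm(x)
</syntaxhighlight>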
 
===Geometric construction===
This description suggests the following geometric construction for the result of applying the pseudoinverse of an {{math|''m''}}&times;{{math|''n''}} matrix {{math|''A''}} to a vector. To find <math>A^+b</math> for given {{math|''b''}} in {{math|'''R'''<sup>''m''</sup>}}, first project {{math|''b''}} orthogonally onto the range of {{math|''A''}}, finding a point {{math|''p''(''b'')}} in the range. Then form {{math|''A''<sup>-1</sup>({''p''(''b'')})}}, i.e. find those vectors in {{math|'''R'''<sup>''n''</sup>}} that {{math|''A''}} sends to {{math|''p''(''b'')}}. This will be an affine subspace of {{math|'''R'''<sup>''n''</sup>}} parallel to the kernel of {{math|''A''}}. The element of this subspace that has the smallest length (i.e. is closest to the origin) is the answer <math>A^+b</math> we are looking for. It can be found by taking an arbitrary member of {{math|''A''<sup>-1</sup>({''p''(''b'')}) }} and projecting it orthogonally onto the orthogonal complement of the kernel of {{math|''A''}}.
 
===Condition number===
Using the pseudoinverse and a [[matrix norm]], one can define a [[condition number]] for any matrix:
:<math>\mbox{cond}(A)=\|A\| \|A^+\|.\ </math>
A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of {{math|''A''}} can lead to huge errors in the entries of the solution.<ref name=hagen/>
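A small sketch (assuming NumPy; for the 2-norm this product agrees with <code>numpy.linalg.cond</code>):
<syntaxhighlight lang="python">
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1e-8]])  # nearly singular matrix

cond = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.pinv(A), 2)
assert np.isclose(cond, np.linalg.cond(A, 2))  # ~1e8: badly conditioned
</syntaxhighlight>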
 
==Generalizations==
In order to solve more general least-squares problems, one can define Moore–Penrose pseudoinverses for all continuous linear operators {{math|''A'' : ''H''<sub>1</sub> &rarr; ''H''<sub>2</sub>}} between two [[Hilbert space]]s {{math|''H''<sub>1</sub>}} and {{math|''H''<sub>2</sub>}}, using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudo-inverse in this sense.<ref name=hagen>Roland Hagen, Steffen Roch, Bernd Silbermann. ''C*-algebras and Numerical Analysis'', CRC Press, 2001. Section 2.1.2.</ref> Those that do are precisely the ones whose range is [[closed set|closed]] in {{math|''H''<sub>2</sub>}}.
 
In [[abstract algebra]], a Moore–Penrose pseudoinverse may be defined on a [[*-regular semigroup]]. This abstract definition coincides with the one in linear algebra.
 
==See also==
* [[Proofs involving the Moore–Penrose pseudoinverse]]
* [[Drazin inverse]]
* [[Hat matrix]]
* [[Inverse element]]
* [[Linear least squares (mathematics)]]
* [[Pseudo-determinant]]
* [[Von Neumann regular ring]]
 
==References==
{{Reflist}}
 
==External links==
* [http://planetmath.org/encyclopedia/Pseudoinverse.html Pseudoinverse on PlanetMath]
* [http://people.revoledu.com/kardi/tutorial/LinearAlgebra/MatrixGeneralizedInverse.html Interactive program & tutorial of Moore-Penrose Pseudoinverse]
* {{planetmath reference|id=6067|title=Moore-Penrose inverse}}
* {{MathWorld|urlname=Pseudoinverse|title=Pseudoinverse}}
* {{MathWorld|urlname=Moore-PenroseMatrixInverse|title=Moore-Penrose Inverse}}
 
{{Numerical linear algebra}}
 
{{DEFAULTSORT:Moore-Penrose Pseudoinverse}}
[[Category:Matrix theory]]
[[Category:Singular value decomposition]]
[[Category:Numerical linear algebra]]
