The '''Kaczmarz method''' or '''Kaczmarz's algorithm''' is an [[iterative algorithm]] for solving [[linear system]]s of equations <math> A x = b </math>. It was first discovered by the Polish mathematician [[Stefan Kaczmarz]],<ref>{{harvcoltxt|Kaczmarz|1937}}</ref> and was rediscovered in the field of image reconstruction from projections by [[Richard Gordon]], Robert Bender, and [[Gabor Herman]] in 1970, where it is called the [[Algebraic Reconstruction Technique]] (ART).<ref>{{harvcoltxt|Gordon|Bender|Herman|1970}}</ref>
It is applicable to any linear system of equations, but its computational advantage relative to other methods depends on the system being [[Sparse matrix|sparse]]. It has been demonstrated to be superior, in some biomedical imaging applications, to other methods such as the filtered backprojection method.<ref name="Herman2009">{{harvcoltxt|Herman|2009}}</ref>

It has many applications ranging from [[computed tomography]] (CT) to [[signal processing]]. It can also be obtained by applying the method of successive [[projections onto convex sets]] (POCS) to the hyperplanes described by the linear system.<ref>{{harvcoltxt|Censor|Zenios|1997}}</ref><ref>{{harvcoltxt|Aster|Borchers|Thurber|2004}}</ref>
==Algorithm 1: Randomized Kaczmarz algorithm==

Let <math> A x = b </math> be a linear system and let <math> x_{0} </math> be an arbitrary initial approximation to the solution of <math> Ax=b </math>. For <math> k=0,1,\ldots </math> compute:
:<math> x_{k+1} = x_{k} + \frac{b_{i} - \langle a_{i}, x_{k} \rangle}{\lVert a_{i} \rVert^2} a_{i} </math>
where <math> i </math> is chosen from the set <math> \{1,2,\ldots,m\} </math> at random, with probability proportional to <math> \lVert a_{i} \rVert^2 </math>.

Under such circumstances <math> x_{k} </math> converges exponentially fast to the solution of <math> Ax=b </math>, and the rate of convergence depends only on the scaled [[condition number]] <math> \kappa(A) </math>.
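The iteration is straightforward to implement. Below is a minimal sketch in Python with NumPy; the test system, iteration count, and random seed are illustrative assumptions rather than part of the method.

<syntaxhighlight lang="python">
import numpy as np

def randomized_kaczmarz(A, b, x0, n_iter, seed=0):
    """Iterate x_{k+1} = x_k + (b_i - <a_i, x_k>)/||a_i||^2 * a_i,
    choosing row i with probability proportional to ||a_i||^2."""
    rng = np.random.default_rng(seed)
    row_norms_sq = np.einsum('ij,ij->i', A, A)   # ||a_i||^2 for every row
    probs = row_norms_sq / row_norms_sq.sum()    # selection probabilities
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        i = rng.choice(A.shape[0], p=probs)      # pick a random equation
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]  # project onto its hyperplane
    return x

# Example: a small consistent overdetermined system (illustrative sizes).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
x = randomized_kaczmarz(A, b, np.zeros(10), n_iter=2000)
print(np.linalg.norm(x - x_true))  # the error should be very small
</syntaxhighlight>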
===Theorem===

Let <math> x </math> be the solution of <math> Ax=b </math>. Then Algorithm 1 converges to <math> x </math> in expectation, with the average error:

:<math> \mathbb E\,{\lVert x_{k}-x \rVert^2} \leq \left(1-\kappa(A)^{-2}\right)^{k} \cdot {\lVert x_{0}-x \rVert^2}. </math>
===Proof===

We have

:<math> \sum_{j=1}^{m}|\langle z,a_j \rangle|^2 \geq \frac{\lVert z \rVert^2}{\lVert A^{-1} \rVert^2} \qquad\qquad (1) </math>

for all <math> z \in \mathbb C^n. </math>
Using the fact that <math> {\lVert A \rVert^2}=\sum_{j=1}^{m}{\lVert a_j \rVert^2} </math> (the squared [[Matrix norm#Frobenius norm|Frobenius norm]]), we can rewrite (1) as

:<math> \sum_{j=1}^{m} \frac{{\lVert a_j \rVert^2}}{\lVert A \rVert^2}\left|\left\langle z,\frac {a_j}{\lVert a_j \rVert}\right\rangle \right|^2 \geq \kappa(A)^{-2}{\lVert z \rVert^2} \qquad\qquad (2) </math>

for all <math> z \in \mathbb C^n. </math>
The main point of the proof is to view the left-hand side of (2) as an expectation of some random variable. Namely, recall that the solution space of the <math>j</math>-th equation of <math> Ax=b </math> is the hyperplane <math> \{y : \langle y,a_j \rangle = b_j\} </math>, whose unit normal is <math> \frac{a_j}{\lVert a_j \rVert}. </math> Define a random vector <math>Z</math> whose values are the normals to all the equations of <math> Ax=b </math>, with probabilities as in our algorithm:

:<math> Z=\frac {a_j}{\lVert a_j \rVert} </math> with probability <math> \frac{\lVert a_j \rVert^2}{\lVert A \rVert^2}, \qquad j=1,\ldots,m. </math>
Then (2) says that

:<math> \mathbb E\,|\langle z,Z\rangle|^2 \geq \kappa(A)^{-2}{\lVert z \rVert^2} \qquad\qquad (3) </math>

for all <math> z \in \mathbb C^n. </math>
The orthogonal projection <math>P</math> onto the solution space of a random equation of <math> Ax=b </math> is given by <math> Pz= z-\langle z-x, Z\rangle Z.</math>

Now we are ready to analyze our algorithm. We want to show that the error <math>{\lVert x_k-x \rVert^2}</math> reduces at each step on average (conditioned on the previous steps) by at least the factor <math> (1-\kappa(A)^{-2}). </math> The next approximation <math> x_k </math> is computed from <math> x_{k-1} </math> as <math> x_k= P_kx_{k-1}, </math> where <math> P_1,P_2,\ldots </math> are independent realizations of the random projection <math> P. </math> The vector <math> x_{k-1}-x_k </math> is in the kernel of <math> P_k. </math> It is orthogonal to the solution space of the equation onto which <math> P_k </math> projects, and that solution space contains the vector <math> x_k-x </math> (recall that <math> x </math> is the solution to all equations). The orthogonality of these two vectors then gives, by the Pythagorean theorem, <math> {\lVert x_k-x \rVert^2}={\lVert x_{k-1}-x \rVert^2}-{\lVert x_{k-1}-x_k \rVert^2}. </math>

To complete the proof, we have to bound <math> {\lVert x_{k-1}-x_k \rVert^2} </math> from below. By the definition of <math> x_k </math>, we have <math> {\lVert x_{k-1}-x_k \rVert}=|\langle x_{k-1}-x,Z_k\rangle|, </math>
where <math> Z_1,Z_2,\ldots </math> are independent realizations of the random vector <math> Z. </math>

Thus <math> {\lVert x_k-x \rVert^2} \leq \left(1-\left|\left\langle\frac{x_{k-1}-x}{\lVert x_{k-1}-x \rVert},Z_k\right\rangle\right|^2\right){\lVert x_{k-1}-x \rVert^2}. </math>

Now we take the expectation of both sides conditional upon the choice of the random vectors <math> Z_1,\ldots,Z_{k-1} </math> (hence we fix the choice of the random projections <math> P_1,\ldots,P_{k-1} </math>, and thus the random vectors <math> x_1,\ldots,x_{k-1} </math>, and average over the random vector <math> Z_k </math>). Then

:<math> \mathbb E_{Z_1,\ldots,Z_{k-1}}{\lVert x_k-x \rVert^2} \leq \left(1-\mathbb E_{Z_1,\ldots,Z_{k-1}}\left|\left\langle\frac{x_{k-1}-x}{\lVert x_{k-1}-x \rVert},Z_k\right\rangle\right|^2\right){\lVert x_{k-1}-x \rVert^2}.</math>
By (3) and the independence,

:<math> \mathbb E_{Z_1,\ldots,Z_{k-1}}{\lVert x_k-x \rVert^2} \leq (1-\kappa(A)^{-2}){\lVert x_{k-1}-x \rVert^2}. </math>

Taking the full expectation of both sides, we conclude that

:<math> \mathbb E\,{\lVert x_k-x \rVert^2} \leq (1-\kappa(A)^{-2})\,\mathbb E\,{\lVert x_{k-1}-x \rVert^2}. </math>

Iterating this inequality <math> k </math> times yields the bound stated in the theorem. <math> \blacksquare </math>
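The bound can also be checked numerically. The following sketch, assuming the scaled condition number is <math>\kappa(A)=\lVert A\rVert_F\,\lVert A^{-1}\rVert_2</math> (with <math>\lVert A^{-1}\rVert_2</math> computed here as the spectral norm of the pseudoinverse), compares the empirical mean-squared error after <math>k</math> steps against the theorem's bound; the matrix size, step count, and number of trials are arbitrary illustrative choices.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
m, n, k, trials = 100, 20, 50, 2000
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true                               # consistent system

row_norms_sq = np.einsum('ij,ij->i', A, A)
probs = row_norms_sq / row_norms_sq.sum()
# Scaled condition number squared: ||A||_F^2 * ||A^{-1}||_2^2.
kappa_sq = np.linalg.norm(A, 'fro')**2 * np.linalg.norm(np.linalg.pinv(A), 2)**2

errors = []
for _ in range(trials):                      # independent runs of Algorithm 1
    x = np.zeros(n)                          # so ||x_0 - x||^2 = ||x_true||^2
    for _ in range(k):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    errors.append(np.linalg.norm(x - x_true)**2)

bound = (1 - 1/kappa_sq)**k * np.linalg.norm(x_true)**2
# The empirical mean should not exceed the worst-case expected bound.
print(np.mean(errors), '<=', bound)
</syntaxhighlight>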
==Algorithm 2: Randomized Kaczmarz algorithm with relaxation==

Given a real or complex <math> m \times n </math> matrix <math> A </math> and a real or complex vector <math> b </math>, respectively, Kaczmarz's algorithm iteratively computes an approximation of the solution of the linear system of equations <math> A x = b </math>. It does so by converging to the vector <math>x^*=A^T (AA^T )^{-1} b</math> without the need to [[Invertible matrix|invert]] the matrix <math>AA^T</math>, which is the algorithm's main advantage, especially when the matrix <math>A</math> has a large number of rows.<ref>{{harvcoltxt|Chong|Zak|2008|pp=226}}</ref> Most generally, the algorithm is defined as follows:
:<math> x_{k+1} = x_{k} + \lambda_k \frac{b_{i} - \langle a_{i}, x_{k} \rangle}{\lVert a_{i} \rVert^2} a_{i} </math>
where <math> i = k \, \bmod \, m + 1 </math>, <math> a_i^T </math> is the ''i''-th row of the matrix <math> A </math>, <math> b_i </math> is the ''i''-th component of the vector <math> b </math>, and <math> \lambda_k </math> is a relaxation parameter. The above formula gives a simple iteration routine.

There are various ways of choosing the ''i''-th equation <math> \langle a_{i}, x_{k} \rangle = b_i </math> and the relaxation parameter <math> \lambda_k </math> at the ''k''-th iteration.<ref name="Herman2009"/>
If the [[linear system]] is consistent, the ART converges to the minimum-norm solution, provided that the iterations start with the zero vector. There are versions of the ART that converge to a regularized weighted least squares solution when applied to a system of inconsistent equations and, at least as far as initial behavior is concerned, at a lesser cost than other iterative methods, such as the [[conjugate gradient method]].<ref>See {{harvcoltxt|Herman|2009}} and references therein.</ref>
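Below is a minimal sketch of this cyclic iteration in Python with NumPy, using a constant relaxation parameter; the underdetermined test system and the value of <math>\lambda</math> are illustrative assumptions. Started from the zero vector on a consistent system, the iterates should approach the minimum-norm solution <math>x^*=A^T (AA^T )^{-1} b</math>.

<syntaxhighlight lang="python">
import numpy as np

def kaczmarz_relaxed(A, b, n_sweeps, lam=1.0):
    """Cyclic Kaczmarz iteration with a constant relaxation parameter lam."""
    m, n = A.shape
    row_norms_sq = np.einsum('ij,ij->i', A, A)  # ||a_i||^2 for every row
    x = np.zeros(n)              # zero start => minimum-norm limit
    for k in range(n_sweeps * m):
        i = k % m                # 0-based form of i = k mod m + 1
        x += lam * (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x

# Underdetermined consistent system: fewer equations than unknowns.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 12))
b = rng.standard_normal(5)
x = kaczmarz_relaxed(A, b, n_sweeps=500)
x_star = A.T @ np.linalg.solve(A @ A.T, b)   # minimum-norm solution
print(np.linalg.norm(x - x_star))            # the gap should be very small
</syntaxhighlight>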
==Advances==

Recently, a randomized version of the Kaczmarz method for overdetermined linear systems was introduced by Strohmer and Vershynin<ref name="Strohmer_Vershynin_2009">{{harvcoltxt|Strohmer|Vershynin|2009}}</ref> in which the ''i''-th equation is selected with probability proportional to <math> \lVert a_{i} \rVert ^2 </math>. The superiority of this selection was illustrated with the reconstruction of a bandlimited function from its nonuniformly spaced sampling values. However, it has been pointed out<ref name="Censor_Herman_Jiang_2009">{{harvcoltxt|Censor|Herman|Jiang|2009}}</ref> that the reported success by Strohmer and Vershynin depends on the specific choices that were made there in translating the underlying problem, whose geometrical nature is to ''find a common point of a set of hyperplanes'', into a system of algebraic equations. There will always be legitimate algebraic representations of the underlying problem for which the selection method in<ref name="Strohmer_Vershynin_2009"/> will perform in an inferior manner.<ref name="Strohmer_Vershynin_2009"/><ref name="Censor_Herman_Jiang_2009"/><ref>{{harvcoltxt|Strohmer|Vershynin|2009b}}</ref>
==Notes==
{{reflist|2}}
| ==References==
| |
| * {{citation |first=Stefan |last=Kaczmarz |author-link=Stefan Kaczmarz |title=Angenäherte Auflösung von Systemen linearer Gleichungen |work=''Bulletin International de l'Académie Polonaise des Sciences et des Lettres''. Classe des Sciences Mathématiques et Naturelles. Série A, Sciences Mathématiques, |volume=35 |pages=355–357 |url=http://jasonstockmann.com/Jason_Stockmann/Welcome_files/kaczmarz_english_translation_1937.pdf |year=1937 |format=PDF}}
| |
| * {{citation |title=An Introduction to Optimization |first=Edwin K. P. |last=Chong |first2=Stanislaw H.|last2=Zak |year=2008 |publisher=John Wiley & Sons |pages=226-230 |edition=3rd}}
| |
| * {{citation |first=Richard |last=Gordon |author-link=Richard Gordon |first2=Robert |last2=Bender |author2-link=Robert Bender |first3=Gabor |last3=Herman |author3-link=Gabor Herman |title=Algebraic reconstruction techniques (ART) for threedimensional electron microscopy and x-ray photography |journal=Journal of Theoretical Biology |volume=29, |pages=471–481 |year=1970}}
| |
| *{{citation |first=Gabor |last=Herman |author-link=Gabor Herman |title=Fundamentals of computerized tomography: Image reconstruction from projection |edition=2nd |publisher=Springer |year=2009}}
| |
| * {{citation |first=Yair |last=Censor |author-link=Yair Censor |first2=S.A. |last2=Zenios |title=Parallel optimization: theory, algorithms, and applications |publisher=Oxford University Press |location=New York |year=1997}}
| |
| * {{citation |first=Richard |last=Aster |first2=Brian |last2=Borchers |first3=Clifford |last3=Thurber |title=Parameter Estimation and Inverse Problems |publisher=Elsevier |year=2004}}
| |
| * {{citation |first=Thomas |last=Strohmer |first2=Roman |last2=Vershynin |title=A randomized Kaczmarz algorithm for linear systems with exponential convergence |journal=Journal of Fourier Analysis and Applications |volume=15 |pages=262–278 |year=2009 |url=http://www.eecs.berkeley.edu/~brecht/cs294docs/week1/09.Strohmer.pdf |format=PDF}}
| |
| * {{citation |first=Yair |last=Censor |first2=Gabor |last2=Herman |author2-link=Gabor Herman |first3=M. |last3=Jiang |title=A note on the behavior of the randomized Kaczmarz algorithm of Strohmer and Vershynin |journal=Journal of Fourier Analysis and Applications |volume=15 |pages=431–436 |year=2009}}
| |
| * {{citation |first=Thomas |last=Strohmer |first2=Roman |last2=Vershynin |title=Comments on the randomized Kaczmarz method |journal=Journal of Fourier Analysis and Applications |volume=15 |pages=437–440 |year=2009b}}
| |
| * {{citation |first=Quang |last=Vinh Nguyen |first2=Ford |last2=Lumban Gaol |title=Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science |journal=Springer |volume=2 |pages=465–469 |year=2011}}
| |
==External links==
* [http://www.eecs.berkeley.edu/~brecht/cs294docs/week1/09.Strohmer.pdf A randomized Kaczmarz algorithm with exponential convergence]
* [http://www-personal.umich.edu/~romanv/papers/kaczmarz-comments.pdf Comments on the randomized Kaczmarz method]

{{Numerical linear algebra}}

[[Category:Medical imaging]]
[[Category:Numerical linear algebra]]