The '''iterative proportional fitting procedure''' ('''IPFP''', also known as '''biproportional fitting''' in statistics, '''RAS algorithm'''<ref>{{cite journal |last=Bacharach |first=M. |year=1965 |title=Estimating Nonnegative Matrices from Marginal Data |journal=International Economic Review |volume=6 |issue=3 |pages=294–310 |doi=10.2307/2525582 |jstor=2525582 |publisher=Blackwell Publishing}}</ref> in economics and '''matrix raking''' or '''matrix scaling''' in computer science) is an [[iterative algorithm]] for estimating cell values of a [[contingency table]] such that the marginal totals remain fixed and the estimated table decomposes into an [[outer product]].

First introduced by [[W. Edwards Deming|Deming]] and Stephan in 1940<ref>{{cite journal |last=Deming |first=W. E. |authorlink=W. Edwards Deming |last2=Stephan |first2=F. F. |year=1940 |title=On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals are Known |journal=[[Annals of Mathematical Statistics]] |volume=11 |issue=4 |pages=427–444 |mr=3527 |doi=10.1214/aoms/1177731829}}</ref> (they proposed IPFP as an algorithm leading to a minimizer of the [[Pearson X-squared statistic]], which it ''does not'',<ref>{{cite journal |last=Stephan |first=F. F. |year=1942 |title=Iterative method of adjusting frequency tables when expected margins are known |journal=[[Annals of Mathematical Statistics]] |volume=13 |issue=2 |pages=166–178 |mr=6674 |zbl=0060.31505 |doi=10.1214/aoms/1177731604}}</ref> and even failed to prove convergence), it has seen various extensions and related research. A rigorous proof of convergence by means of [[differential geometry]] is due to [[Stephen Fienberg|Fienberg]] (1970).<ref>{{cite journal |last=Fienberg |first=S. E. |authorlink=Stephen Fienberg |year=1970 |title=An Iterative Procedure for Estimation in Contingency Tables |journal=[[Annals of Mathematical Statistics]] |volume=41 |issue=3 |pages=907–917 |mr=266394 |zbl=0198.23401 |jstor=2239244 |doi=10.1214/aoms/1177696968}}</ref> He interpreted the family of contingency tables with constant crossproduct ratios as a particular (''IJ'' − 1)-dimensional manifold of constant interaction and showed that the IPFP is a fixed-point iteration on that manifold. However, he assumed strictly positive observations. Generalization to tables with zero entries remains a hard and only partly solved problem.

An exhaustive treatment of the algorithm and its mathematical foundations can be found in the book of Bishop et al. (1975).<ref>{{cite book |last=Bishop |first=Y. M. M. |first2=S. E. |last2=Fienberg |authorlink2=Stephen Fienberg |first3=P. W. |last3=Holland |year=1975 |title=Discrete Multivariate Analysis: Theory and Practice |publisher=MIT Press |isbn=978-0-262-02113-5 |mr=381130}}</ref> The first general proof of convergence, built on non-trivial measure-theoretic theorems and entropy minimization, is due to Csiszár (1975).<ref>{{cite journal |last=Csiszár |first=I. |authorlink=Imre Csiszár |year=1975 |title=''I''-Divergence of Probability Distributions and Minimization Problems |journal=Annals of Probability |volume=3 |issue=1 |pages=146–158 |mr=365798 |zbl=0318.60013 |jstor=2959270 |doi=10.1214/aop/1176996454}}</ref>

More recent results on convergence and error behavior were published by Pukelsheim and Simeone (2009).<ref>{{cite web |title=On the Iterative Proportional Fitting Procedure: Structure of Accumulation Points and L1-Error Analysis |url=http://opus.bibliothek.uni-augsburg.de/volltexte/2009/1368/ |publisher=Pukelsheim, F. and Simeone, B. |accessdate=2009-06-28}}</ref> They proved simple necessary and sufficient conditions for the convergence of the IPFP for arbitrary two-way tables (i.e. tables with zero entries) by analysing an <math>L_1</math>-error function.

Other general algorithms can be modified to yield the same limit as the IPFP, for instance the [[Newton–Raphson method]] and the [[EM algorithm]]. In most cases, IPFP is preferred due to its computational speed, numerical stability and algebraic simplicity.

== Algorithm 1 (classical IPFP) ==

Given a two-way (''I'' × ''J'')-table of counts <math>(x_{ij})</math>, where the cell values are assumed to be Poisson or multinomially distributed, we wish to estimate a decomposition <math>\hat{m}_{ij} = a_i b_j</math> for all ''i'' and ''j'' such that <math>(\hat{m}_{ij})</math> is the [[maximum likelihood]] estimate (MLE) of the expected values <math>(m_{ij})</math> leaving the marginals <math>\textstyle x_{i+} = \sum_j x_{ij}\,</math> and <math>\textstyle x_{+j} = \sum_i x_{ij}\,</math> fixed. The assumption that the table factorizes in such a manner is known as the ''model of independence'' (I-model). Written in terms of a [[log-linear model]], we can write this assumption as <math>\log\ m_{ij} = u + v_i + w_j + z_{ij}</math>, where <math>m_{ij} := \mathbb{E}(x_{ij})</math>, <math>\sum_i v_i = \sum_j w_j = 0</math> and the interaction term vanishes, that is <math>z_{ij} = 0</math> for all ''i'' and ''j''.

Choose initial values <math>\hat{m}_{ij}^{(0)} := 1</math> (different choices of initial values may lead to changes in convergence behavior), and for <math>\eta \geq 1</math> set

: <math>\hat{m}_{ij}^{(2\eta - 1)} = \frac{\hat{m}_{ij}^{(2\eta-2)}x_{i+}}{\sum_{k=1}^J \hat{m}_{ik}^{(2\eta-2)}},</math>

: <math>\hat{m}_{ij}^{(2\eta)} = \frac{\hat{m}_{ij}^{(2\eta-1)}x_{+j}}{\sum_{k=1}^I \hat{m}_{kj}^{(2\eta-1)}}.</math>

Notes:

* Convergence does not depend on the actual distribution. Distributional assumptions are necessary only for inferring that the limit <math>(\hat{m}_{ij}) := \lim_{\eta\rightarrow\infty} (\hat{m}^{(\eta)}_{ij})</math> is indeed an MLE.
* IPFP can be manipulated to generate any positive marginals by replacing <math>x_{i+}</math> by the desired row marginal <math>u_i</math> (analogously for the column marginals).
* IPFP can be extended to fit the ''model of quasi-independence'' (Q-model), where <math>m_{ij} = 0</math> is known a priori for <math>(i,j)\in S</math>. Only the initial values have to be changed: set <math>\hat{m}_{ij}^{(0)} = 0</math> if <math>(i,j)\in S</math> and 1 otherwise.

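The two update steps translate directly into a short program. The following is a minimal Python sketch (the function name <code>ipfp</code> and the convergence check are our own illustrative choices, not part of any library), run on the 2×2 handedness table used in the example section:

```python
def ipfp(x, tol=1e-9, max_iter=100):
    """Classical IPFP for the I-model: start from a table of ones and
    alternately rescale rows and columns to the observed marginals."""
    I, J = len(x), len(x[0])
    row = [sum(r) for r in x]                                   # x_{i+}
    col = [sum(x[i][j] for i in range(I)) for j in range(J)]    # x_{+j}
    m = [[1.0] * J for _ in range(I)]                           # m^(0) := 1
    for _ in range(max_iter):
        for i in range(I):                                      # odd step: fit row sums
            s = sum(m[i])
            m[i] = [m[i][j] * row[i] / s for j in range(J)]
        for j in range(J):                                      # even step: fit column sums
            s = sum(m[i][j] for i in range(I))
            for i in range(I):
                m[i][j] *= col[j] / s
        if max(abs(sum(m[i]) - row[i]) for i in range(I)) < tol:
            break
    return m

# the 2x2 handedness table from the example section below
fitted = ipfp([[43, 9], [44, 4]])
```

For a 2×2 table the direct estimator <math>x_{i+}x_{+j}/n</math> exists, so the loop terminates after a single row-and-column sweep.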
== Algorithm 2 (factor estimation) ==

Assume the same setting as in the classical IPFP. Alternatively, we can estimate the row and column factors separately: choose initial values <math>\hat{b}_j^{(0)} := 1</math>, and for <math>\eta \geq 1</math> set

: <math>\hat{a}_i^{(\eta)} = \frac{x_{i+}}{\sum_j \hat{b}_j^{(\eta-1)}},</math>

: <math>\hat{b}_j^{(\eta)} = \frac{x_{+j}}{\sum_i \hat{a}_i^{(\eta)}}.</math>

Setting <math>\hat{m}_{ij}^{(2\eta)} = \hat{a}_i^{(\eta)}\hat{b}_j^{(\eta)}</math>, the two variants of the algorithm are mathematically equivalent (as can be seen by formal induction).

Notes:

* In matrix notation, we can write <math>(\hat{m}_{ij}) = \hat{a}\hat{b}^T</math>, where <math>\hat{a} = (\hat{a}_1,\ldots,\hat{a}_I)^T = \lim_{\eta\rightarrow\infty} \hat{a}^{(\eta)}</math> and <math>\hat{b} = (\hat{b}_1,\ldots,\hat{b}_J)^T = \lim_{\eta\rightarrow\infty} \hat{b}^{(\eta)}</math>.
* The factorization is not unique, since <math>m_{ij} = a_i b_j = (\gamma a_i)\left(\tfrac{1}{\gamma}b_j\right)</math> for all <math>\gamma > 0</math>.
* The factor totals remain constant, i.e. <math>\sum_i \hat{a}_i^{(\eta)} = \sum_i \hat{a}_i^{(1)}</math> for all <math>\eta \geq 1</math> and <math>\sum_j \hat{b}_j^{(\eta)} = \sum_j \hat{b}_j^{(0)}</math> for all <math>\eta \geq 0</math>.
* To fit the Q-model, where <math>m_{ij} = 0</math> a priori for <math>(i,j)\in S</math>, set <math>\delta_{ij} = 0</math> if <math>(i,j)\in S</math> and <math>\delta_{ij} = 1</math> otherwise. Then

:: <math>\hat{a}_i^{(\eta)} = \frac{x_{i+}}{\sum_j \delta_{ij}\hat{b}_j^{(\eta-1)}},</math>

:: <math>\hat{b}_j^{(\eta)} = \frac{x_{+j}}{\sum_i \delta_{ij}\hat{a}_i^{(\eta)}},</math>

:: <math>\hat{m}_{ij}^{(2\eta)} = \delta_{ij}\hat{a}_i^{(\eta)}\hat{b}_j^{(\eta)}.</math>

Obviously, the I-model is a particular case of the Q-model.

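The factor updates above, including the <math>\delta_{ij}</math> mask for the Q-model, can be sketched as follows (a minimal illustration; the function name and the fixed iteration count are our own choices):

```python
def ipf_factors(x, S=frozenset(), n_iter=100):
    """Algorithm 2: estimate the row factors a_i and column factors b_j
    directly; S holds the (i, j) positions of structural zeros (Q-model)."""
    I, J = len(x), len(x[0])
    delta = [[0.0 if (i, j) in S else 1.0 for j in range(J)] for i in range(I)]
    row = [sum(r) for r in x]                                   # x_{i+}
    col = [sum(x[i][j] for i in range(I)) for j in range(J)]    # x_{+j}
    b = [1.0] * J                                               # b^(0) := 1
    for _ in range(n_iter):
        a = [row[i] / sum(delta[i][j] * b[j] for j in range(J)) for i in range(I)]
        b = [col[j] / sum(delta[i][j] * a[i] for i in range(I)) for j in range(J)]
    return a, b

a, b = ipf_factors([[43, 9], [44, 4]])
# reassemble the fitted table as m_ij = a_i * b_j
fitted = [[a[i] * b[j] for j in range(2)] for i in range(2)]
```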
== Algorithm 3 (RAS) ==

The problem: Let <math>M := (m^{(0)}_{ij}) \in \mathbb{R}^{I\times J}</math> be the initial matrix with nonnegative entries, <math>u \in \mathbb{R}^I</math> a vector of specified row marginals (i.e. row sums) and <math>v \in \mathbb{R}^J</math> a vector of column marginals. We wish to compute a matrix <math>\hat{M} = (\hat{m}_{ij}) \in \mathbb{R}^{I\times J}</math> similar to ''M'' with the predefined marginals, meaning

: <math>\hat{m}_{i+} = \sum_{j=1}^J \hat{m}_{ij} = u_i</math>

and

: <math>\hat{m}_{+j} = \sum_{i=1}^I \hat{m}_{ij} = v_j.</math>

Define the diagonalization operator <math>\operatorname{diag}: \mathbb{R}^k \longrightarrow \mathbb{R}^{k\times k}</math>, which produces a (diagonal) matrix with its input vector on the main diagonal and zeros elsewhere. Then, for <math>\eta \geq 0</math>, set

: <math>M^{(2\eta + 1)} = \operatorname{diag}(r^{(\eta+1)})\,M^{(2\eta)}</math>

: <math>M^{(2\eta + 2)} = M^{(2\eta+1)}\operatorname{diag}(s^{(\eta+1)})</math>

where

: <math>r_i^{(\eta + 1)} = \frac{u_i}{\sum_j m_{ij}^{(2\eta)}},</math>

: <math>s_j^{(\eta + 1)} = \frac{v_j}{\sum_i m_{ij}^{(2\eta+1)}}.</math>

Finally, we obtain <math>\hat{M} = \lim_{\eta\rightarrow\infty} M^{(\eta)}.</math>

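Taken literally, the RAS recursion pre- and postmultiplies by diagonal matrices. The sketch below does exactly that (deliberately inefficient, to mirror the formulas; all names are our own):

```python
def diag(v):
    """diag: R^k -> R^{k x k}, input vector on the main diagonal."""
    k = len(v)
    return [[v[i] if i == j else 0.0 for j in range(k)] for i in range(k)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def ras(M, u, v, n_iter=50):
    """RAS: premultiply by diag(r) to fix row sums, postmultiply by
    diag(s) to fix column sums, and iterate."""
    I, J = len(M), len(M[0])
    for _ in range(I and n_iter):
        r = [u[i] / sum(M[i]) for i in range(I)]                       # r^(eta+1)
        M = matmul(diag(r), M)
        s = [v[j] / sum(M[i][j] for i in range(I)) for j in range(J)]  # s^(eta+1)
        M = matmul(M, diag(s))
    return M

# starting from a matrix of ones, RAS reproduces the classical IPFP fit
M_hat = ras([[1.0, 1.0], [1.0, 1.0]], [52, 48], [87, 13])
```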
== Discussion and comparison of the algorithms ==

Although RAS seems to be the solution of an entirely different problem, it is indeed identical to the classical IPFP. In practice, one would not implement actual matrix multiplication, since diagonal matrices are involved; reducing the operations to the necessary ones, it can easily be seen that RAS does the same as IPFP. The vaguely demanded 'similarity' can be explained as follows: IPFP (and thus RAS) maintains the crossproduct ratios, i.e.

: <math>\frac{m^{(0)}_{ij}m^{(0)}_{hk}}{m^{(0)}_{ik}m^{(0)}_{hj}} = \frac{m^{(\eta)}_{ij}m^{(\eta)}_{hk}}{m^{(\eta)}_{ik}m^{(\eta)}_{hj}}\quad \forall\ \eta \geq 0\text{ and }i\neq h,\quad j\neq k,</math>

since <math>m^{(\eta)}_{ij} = a_i^{(\eta)}b_j^{(\eta)}.</math>

This property is sometimes called '''structure conservation''' and directly leads to the geometrical interpretation of contingency tables and the proof of convergence in the seminal paper of Fienberg (1970).

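Structure conservation is easy to verify numerically: row and column scalings multiply numerator and denominator of the crossproduct ratio by the same factors, so the ratio is untouched. A small check, with an arbitrary starting matrix and target marginals of our own choosing:

```python
def scale_rows_cols(M, u, v):
    """One IPFP double step: scale rows to targets u, then columns to v."""
    I, J = len(M), len(M[0])
    M = [[M[i][j] * u[i] / sum(M[i]) for j in range(J)] for i in range(I)]
    colsums = [sum(M[i][j] for i in range(I)) for j in range(J)]
    return [[M[i][j] * v[j] / colsums[j] for j in range(J)] for i in range(I)]

def crossproduct_ratio(M):
    """Crossproduct ratio of a 2x2 matrix."""
    return (M[0][0] * M[1][1]) / (M[0][1] * M[1][0])

M0 = [[1.0, 2.0], [3.0, 4.0]]              # arbitrary positive start
M1 = scale_rows_cols(M0, [52, 48], [87, 13])
# the ratio (1*4)/(2*3) = 2/3 survives the scaling step unchanged
```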
Nevertheless, direct factor estimation (algorithm 2) is under all circumstances the preferable way to perform IPF: whereas classical IPFP needs

: <math>IJ(2+J) + IJ(2+I) = I^2J + IJ^2 + 4IJ \, </math>

elementary operations in each iteration step (including a row and a column fitting step), factor estimation needs only

: <math>I(1+J) + J(1+I) = 2IJ + I + J \, </math>

operations, being at least one order of magnitude faster than classical IPFP.

== Existence and uniqueness of MLEs ==

Necessary and sufficient conditions for the existence and uniqueness of MLEs are complicated in the general case (see Haberman 1974<ref>{{cite book |last=Haberman |first=S. J. |year=1974 |title=The Analysis of Frequency Data |publisher=Univ. Chicago Press |isbn=978-0-226-31184-5}}</ref>), but sufficient conditions for 2-dimensional tables are simple:

* the marginals of the observed table do not vanish (that is, <math>x_{i+} > 0,\ x_{+j} > 0</math>) and
* the observed table is inseparable (i.e. the table does not permute to a block-diagonal shape).

If unique MLEs exist, IPFP exhibits linear convergence in the worst case (Fienberg 1970), but exponential convergence has also been observed (Pukelsheim and Simeone 2009). If a direct estimator (i.e. a closed form of <math>(\hat{m}_{ij})</math>) exists, IPFP converges after 2 iterations. If unique MLEs do not exist, IPFP converges toward the so-called ''extended MLEs'' by design (Haberman 1974), but convergence may be arbitrarily slow and often computationally infeasible.

If all observed values are strictly positive, existence and uniqueness of MLEs and therefore convergence are ensured.

== Goodness of fit ==

To check whether the assumption of independence is adequate, one uses the [[Pearson X-squared statistic]]

: <math>X^2 = \sum_{i,j}\frac{(x_{ij}-\hat{m}_{ij})^2}{\hat{m}_{ij}}</math>

or alternatively the [[likelihood-ratio test]] ([[G-test]]) statistic

: <math>G = 2\sum_{i,j} x_{ij}\log \frac{x_{ij}}{\hat{m}_{ij}}.</math>

Both statistics are asymptotically <math>\chi^2_r</math>-distributed, where <math>r = (I-1)(J-1)</math> is the number of degrees of freedom. That is, if the [[p-value]]s <math>1 - \chi^2_r(X^2)</math> and <math>1 - \chi^2_r(G)</math> are not too small (> 0.05 for instance), there is no indication to discard the hypothesis of independence.

== Interpretation ==

If the rows correspond to different values of property A, and the columns correspond to different values of property B, and the hypothesis of independence is not discarded, the properties A and B are considered independent.

== Example ==

Consider a table of observations (taken from the entry on [[contingency table]]s):

<center>
{| class="wikitable"
|-----
|
|| right-handed || left-handed || TOTAL
|-----
| male || 43 || 9 || 52
|-----
| female || 44 || 4 || 48
|-----
| TOTAL || 87 || 13 || 100
|}</center>

For executing the classical IPFP, we first initialize the matrix with ones, leaving the marginals untouched:

<center>
{| class="wikitable"
|-----
|
|| right-handed || left-handed || TOTAL
|-----
| male || 1 || 1 || 52
|-----
| female || 1 || 1 || 48
|-----
| TOTAL || 87 || 13 || 100
|}</center>

Of course, the marginal sums no longer correspond to the matrix, but this is fixed in the next two iterations of IPFP. The first iteration deals with the row sums:

<center>
{| class="wikitable"
|-----
|
|| right-handed || left-handed || TOTAL
|-----
| male || 26 || 26 || 52
|-----
| female || 24 || 24 || 48
|-----
| TOTAL || 87 || 13 || 100
|}</center>

Note that, by definition, the row sums always constitute a perfect match after odd iterations, as do the column sums after even ones. The subsequent iteration updates the matrix column-wise:

<center>
{| class="wikitable"
|-----
|
|| right-handed || left-handed || TOTAL
|-----
| male || 45.24 || 6.76 || 52
|-----
| female || 41.76 || 6.24 || 48
|-----
| TOTAL || 87 || 13 || 100
|}</center>

Now both row and column sums of the matrix match the given marginals again.

The [[p-value]] of this matrix is approximately <math>p(X^2) \approx 0.1824671</math>, meaning that gender and left-handedness/right-handedness can be considered independent.

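This p-value can be reproduced with a few lines of Python. For a 2×2 table there is <math>r = 1</math> degree of freedom, and the survival function of <math>\chi^2_1</math> reduces to <math>\operatorname{erfc}(\sqrt{X^2/2})</math>, so only the standard library is needed (a sketch; variable names are ours):

```python
from math import erfc, log, sqrt

x = [[43, 9], [44, 4]]                    # observed table
m = [[45.24, 6.76], [41.76, 6.24]]        # fitted values from the last step

# Pearson X^2 and likelihood-ratio G statistics
X2 = sum((x[i][j] - m[i][j]) ** 2 / m[i][j] for i in range(2) for j in range(2))
G = 2 * sum(x[i][j] * log(x[i][j] / m[i][j]) for i in range(2) for j in range(2))

# survival function of chi^2 with r = (2-1)(2-1) = 1 degree of freedom
p = erfc(sqrt(X2 / 2))                    # approx. 0.1825, well above 0.05
```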
== Notes ==
{{reflist}}

{{DEFAULTSORT:Iterative Proportional Fitting}}
[[Category:Categorical data]]
[[Category:Statistical algorithms]]