In [[mathematics]] and [[multivariate statistics]], the '''centering matrix'''<ref>John I. Marden, ''Analyzing and Modeling Rank Data'', Chapman & Hall, 1995, ISBN 0-412-99521-2, page 59.</ref> is a [[symmetric matrix|symmetric]] and [[idempotent]] [[Matrix (mathematics)|matrix]] which, when multiplied with a vector, has the same effect as subtracting the [[mean]] of the components of the vector from every component.
== Definition ==
The '''centering matrix''' of size ''n'' is defined as the ''n''-by-''n'' matrix
:<math>C_n = I_n - \tfrac{1}{n}\mathbb{O} </math>
where <math>I_n\,</math> is the [[identity matrix]] of size ''n'' and <math>\mathbb{O}</math> is an ''n''-by-''n'' matrix of all 1's. This can also be written as
:<math>C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top</math>
where <math>\mathbf{1}</math> is the column-vector of ''n'' ones and <math>\top</math> denotes [[matrix transpose]].
For example:
:<math>C_1 = \begin{bmatrix}
0 \end{bmatrix}
</math>,
:<math>C_2= \left[ \begin{array}{rr}
1 & 0 \\ \\
0 & 1
\end{array} \right] - \frac{1}{2}\left[ \begin{array}{rr}
1 & 1 \\ \\
1 & 1
\end{array} \right] = \left[ \begin{array}{rr}
\frac{1}{2} & -\frac{1}{2} \\ \\
-\frac{1}{2} & \frac{1}{2}
\end{array} \right]
</math>,
:<math>C_3 = \left[ \begin{array}{rrr}
1 & 0 & 0 \\ \\
0 & 1 & 0 \\ \\
0 & 0 & 1
\end{array} \right] - \frac{1}{3}\left[ \begin{array}{rrr}
1 & 1 & 1 \\ \\
1 & 1 & 1 \\ \\
1 & 1 & 1
\end{array} \right]
= \left[ \begin{array}{rrr}
\frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ \\
-\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ \\
-\frac{1}{3} & -\frac{1}{3} & \frac{2}{3}
\end{array} \right]
</math>
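The definition translates directly into a few lines of NumPy. The following is a minimal sketch, not part of the cited source; the helper name <code>centering_matrix</code> is a hypothetical choice made for illustration.

```python
import numpy as np

def centering_matrix(n: int) -> np.ndarray:
    """Return the n-by-n centering matrix C_n = I_n - (1/n) * 1 1^T.

    The function name is illustrative only (an assumption of this sketch).
    """
    # np.ones((n, n)) is the all-ones matrix; dividing by n gives (1/n) * 1 1^T.
    return np.eye(n) - np.ones((n, n)) / n

C1 = centering_matrix(1)
C2 = centering_matrix(2)
C3 = centering_matrix(3)
print(C3)
```

Printing <code>C3</code> should reproduce the entries of <math>C_3</math> given above, up to floating-point rounding.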
== Properties ==
Given a column-vector <math>\mathbf{v}\,</math> of size ''n'', the '''centering property''' of <math>C_n\,</math> can be expressed as
:<math>C_n\,\mathbf{v} = \mathbf{v}-(\tfrac{1}{n}\mathbf{1}^\top\mathbf{v})\mathbf{1}</math>
where <math>\tfrac{1}{n}\mathbf{1}^\top\mathbf{v}</math> is the mean of the components of <math>\mathbf{v}\,</math>.
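As a quick numerical illustration of the centering property, the sketch below applies the centering matrix to an arbitrary example vector (the values are chosen here for illustration only) and compares the result with direct mean subtraction.

```python
import numpy as np

n = 4
v = np.array([1.0, 2.0, 3.0, 6.0])   # example vector; its mean is 3.0

# Centering matrix C_n = I_n - (1/n) * 1 1^T
C = np.eye(n) - np.ones((n, n)) / n

centered = C @ v                      # multiplication by C_n
direct = v - v.mean()                 # subtracting the mean from every component

print(centered)
print(direct)
```

Both computations should give the same vector, whose components sum to zero.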
<math>C_n\,</math> is symmetric [[positive semi-definite]].
<math>C_n\,</math> is [[idempotent]], so that <math>C_n^k=C_n</math>, for <math>k=1,2,\ldots</math>. Once the mean has been removed, it is zero and removing it again has no effect.
<math>C_n\,</math> is [[singular matrix|singular]]. The effects of applying the transformation <math>C_n\,\mathbf{v}</math> cannot be reversed.
<math>C_n\,</math> has the [[eigenvalue]] 1 of multiplicity ''n'' − 1 and the eigenvalue 0 of multiplicity 1.
<math>C_n\,</math> has a [[kernel (matrix)|nullspace]] of dimension 1, along the vector <math>\mathbf{1}</math>.
<math>C_n\,</math> is a [[projection matrix]]. That is, <math>C_n\mathbf{v}</math> is a projection of <math>\mathbf{v}\,</math> onto the (''n'' − 1)-dimensional [[linear subspace|subspace]] that is orthogonal to the nullspace spanned by <math>\mathbf{1}</math>. (This is the subspace of all ''n''-vectors whose components sum to zero.)
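The properties listed above can be checked numerically for a particular size. The following sketch uses ''n'' = 5 as an arbitrary example; it is a sanity check for one case, not a proof.

```python
import numpy as np

n = 5
C = np.eye(n) - np.ones((n, n)) / n   # centering matrix C_n

is_symmetric = np.allclose(C, C.T)
is_idempotent = np.allclose(C @ C, C)

# eigvalsh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(C)

# Rank n - 1 means the matrix is singular with a one-dimensional nullspace.
rank = np.linalg.matrix_rank(C)

# The all-ones vector lies in the nullspace.
ones_in_nullspace = np.allclose(C @ np.ones(n), 0.0)

print(is_symmetric, is_idempotent, rank, ones_in_nullspace)
print(np.round(eigenvalues, 12))
```

The eigenvalues should come out as one value near 0 and ''n'' − 1 values near 1, which also confirms positive semi-definiteness for this case.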
== Application ==
Although multiplication by the centering matrix is not a computationally efficient way of removing the mean from a vector, it is a convenient analytical tool that succinctly expresses mean removal. It can be used not only to remove the mean of a single vector, but also of multiple vectors stored in the rows or columns of a matrix. For an ''m''-by-''n'' matrix <math>X\,</math>, the multiplication <math>C_m\,X</math> removes the means from each of the ''n'' columns, while <math>X\,C_n</math> removes the means from each of the ''m'' rows.
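The row and column cases can be illustrated with a small matrix. In this sketch the helper <code>centering_matrix</code> and the example values are assumptions made for illustration; the direct subtraction using <code>mean</code> is shown alongside as the cheaper way to get the same result in practice.

```python
import numpy as np

def centering_matrix(n: int) -> np.ndarray:
    """Hypothetical helper: C_n = I_n - (1/n) * 1 1^T."""
    return np.eye(n) - np.ones((n, n)) / n

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 6.0, 11.0]])      # m = 2 rows, n = 3 columns
m, n = X.shape

# Left-multiplication by C_m removes the mean of each column.
cols_centered = centering_matrix(m) @ X
# Right-multiplication by C_n removes the mean of each row.
rows_centered = X @ centering_matrix(n)

# Equivalent direct computations that avoid forming C at all.
cols_direct = X - X.mean(axis=0, keepdims=True)
rows_direct = X - X.mean(axis=1, keepdims=True)

print(cols_centered)
print(rows_centered)
```

Forming the ''n''-by-''n'' matrix costs memory and arithmetic quadratic in ''n'', whereas the direct subtraction is linear in the number of entries, which is why the matrix form is mainly of analytical use.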
The centering matrix provides in particular a succinct way to express the [[scatter matrix]], <math>S=(X-\mu\mathbf{1}^\top)(X-\mu\mathbf{1}^\top)^\top</math>, of a data sample <math>X\,</math> whose ''n'' columns are the observations, where <math>\mu=\tfrac{1}{n}X\mathbf{1}</math> is the [[sample mean]]. Because <math>C_n\,</math> is symmetric and idempotent, the scatter matrix can be written more compactly as
:<math>S=X\,C_n(X\,C_n)^\top=X\,C_n\,C_n\,X^\top=X\,C_n\,X^\top.</math>
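The identity for the scatter matrix can be checked on random data. This is a minimal sketch under stated assumptions: the data are synthetic, the dimensions and the seed are arbitrary, and observations are stored as columns so that the sample mean is <math>\tfrac{1}{n}X\mathbf{1}</math>.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed for reproducibility
m, n = 3, 10                          # m variables, n observations (columns)
X = rng.normal(size=(m, n))

# Sample mean as a column vector: mu = (1/n) X 1
mu = X.mean(axis=1, keepdims=True)

# Scatter matrix from the definition: (X - mu 1^T)(X - mu 1^T)^T
S_direct = (X - mu) @ (X - mu).T

# Compact form using the centering matrix: X C_n X^T
C = np.eye(n) - np.ones((n, n)) / n
S_compact = X @ C @ X.T

print(np.max(np.abs(S_direct - S_compact)))
```

The printed maximum difference should be at the level of floating-point rounding. Dividing the scatter matrix by ''n'' gives the covariance estimate that <code>np.cov(X, bias=True)</code> computes.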
<math>C_n</math> is the [[covariance matrix]] of the [[multinomial distribution]] in the special case of ''n'' trials over <math>k=n</math> equally likely categories, that is, <math>p_1=p_2=\cdots=p_n=\frac{1}{n}</math>.
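The multinomial connection can be verified from the standard covariance formula for a multinomial distribution with ''N'' trials and probability vector ''p'', namely <math>N(\operatorname{diag}(p)-pp^\top)</math>. The sketch below assumes the reading in which both the number of trials and the number of categories equal ''n''; the value ''n'' = 4 is an arbitrary example.

```python
import numpy as np

n = 4                                  # number of categories, and (by assumption) of trials
p = np.full(n, 1.0 / n)                # equal probabilities p_i = 1/n

# Standard multinomial covariance with N trials: N * (diag(p) - p p^T); here N = n.
multinomial_cov = n * (np.diag(p) - np.outer(p, p))

# Centering matrix C_n = I_n - (1/n) * 1 1^T
C = np.eye(n) - np.ones((n, n)) / n

print(np.allclose(multinomial_cov, C))
```

With these parameters each diagonal entry is 1 − 1/''n'' and each off-diagonal entry is −1/''n'', which are exactly the entries of <math>C_n</math>.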
== References ==
<references/>
[[Category:Multivariate statistics]]
[[Category:Matrices]]
[[Category:Statistical terminology]]
Revision as of 01:35, 11 April 2013