{{About|the mathematics of the chi-squared distribution|its uses in statistics|chi-squared test|the music group|Chi2 (band)}}
{{Probability distribution
 | type = density
 | pdf_image = [[File:Chi-square pdf.svg|321px]]
 | cdf_image = [[File:Chi-square cdf.svg|321px]]
 | notation = <math>\chi^2(k)\!</math> or <math>\chi^2_k\!</math>
 | parameters = <math>k \in \mathbb{N}~~</math> (known as "degrees of freedom")
 | support = ''x'' ∈ [0, +∞)
 | pdf = <math>\frac{1}{2^{\frac{k}{2}}\Gamma\left(\frac{k}{2}\right)}\; x^{\frac{k}{2}-1} e^{-\frac{x}{2}}\,</math>
 | cdf = <math>\frac{1}{\Gamma\left(\frac{k}{2}\right)}\;\gamma\left(\frac{k}{2},\,\frac{x}{2}\right)</math>
 | mean = ''k''
 | median = <math>\approx k\bigg(1-\frac{2}{9k}\bigg)^3</math>
 | mode = max{ ''k'' − 2, 0 }
 | variance = 2''k''
 | skewness = <math>\scriptstyle\sqrt{8/k}\,</math>
 | kurtosis = 12 / ''k''
 | entropy = <math>\begin{align}\frac{k}{2}&+\ln(2\Gamma(k/2)) \\ &\!+(1-k/2)\psi(k/2)\end{align}</math>
 | mgf = {{nowrap|(1 − 2 ''t'')<sup>−''k''/2</sup>}} for ''t'' < ½
 | char = {{nowrap|(1 − 2 ''i'' ''t'')<sup>−''k''/2</sup>}} <ref>{{cite web | url=http://www.planetmathematics.com/CentralChiDistr.pdf | title=Characteristic function of the central chi-squared distribution | author=M.A. Sanders | accessdate=2009-03-06}}</ref>
}}
In [[probability theory]] and [[statistics]], the '''chi-squared distribution''' (also '''chi-square''' or {{nowrap|1='''<span style="font-family:serif">''χ''</span>²-distribution'''}}) with ''k'' [[Degrees of freedom (statistics)|degrees of freedom]] is the distribution of a sum of the squares of ''k'' [[Independence (probability theory)|independent]] [[standard normal]] random variables. It is one of the most widely used [[probability distribution]]s in [[inferential statistics]], e.g., in [[hypothesis testing]] and in the construction of [[confidence interval]]s.<ref name=abramowitz>{{Abramowitz_Stegun_ref|26|940}}</ref><ref>NIST (2006). [http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm Engineering Statistics Handbook - Chi-Squared Distribution]</ref><ref>{{cite book | last = Johnson | first = N. L. | coauthors = S. Kotz, N. Balakrishnan | title = Continuous Univariate Distributions (Second Ed., Vol. 1, Chapter 18) | publisher = John Wiley and Sons | year = 1994 | isbn = 0-471-58495-9 }}</ref><ref>{{cite book | last = Mood | first = Alexander | coauthors = Franklin A. Graybill, Duane C. Boes | title = Introduction to the Theory of Statistics (Third Edition, pp. 241–246) | publisher = McGraw-Hill | year = 1974 | isbn = 0-07-042864-6 }}</ref> When there is a need to contrast it with the [[noncentral chi-squared distribution]], this distribution is sometimes called the '''central chi-squared distribution'''.
The chi-squared distribution is used in the common [[chi-squared test]]s: for [[goodness of fit]] of an observed distribution to a theoretical one, for the [[statistical independence|independence]] of two criteria of classification of [[data analysis|qualitative data]], and in [[confidence interval]] estimation for a population [[standard deviation]] of a normal distribution from a sample standard deviation. Many other statistical tests also use this distribution, such as [[Friedman test|Friedman's analysis of variance by ranks]].

The chi-squared distribution is a special case of the [[gamma distribution]].
==History and name==
This distribution was first described by the German statistician [[Helmert|Friedrich Robert Helmert]] in papers of 1875–1876,{{sfn|Hald|1998|pp=633–692|loc=27. Sampling Distributions under Normality}}<ref>F. R. [[Helmert]], "[http://gdz.sub.uni-goettingen.de/dms/load/img/?PPN=PPN599415665_0021&DMDID=DMDLOG_0018 Ueber die Wahrscheinlichkeit der Potenzsummen der Beobachtungsfehler und über einige damit im Zusammenhange stehende Fragen]", ''Zeitschrift für Mathematik und Physik'' [http://gdz.sub.uni-goettingen.de/dms/load/toc/?PPN=PPN599415665_0021 21], 1876, S. 102–219</ref> where he computed the sampling distribution of the sample variance of a normal population. Thus in German this was traditionally known as the ''Helmertsche'' ("Helmertian") or "Helmert distribution".

The distribution was independently rediscovered by the English mathematician [[Karl Pearson]] in the context of [[goodness of fit]], for which he developed his [[Pearson's chi-squared test]], published in {{Harv|Pearson|1900}}, with a computed table of values published in {{Harv|Elderton|1902}} and collected in {{Harv|Pearson|1914|pp=xxxi–xxxiii, 26–28|loc=Table XII}}. The name "chi-squared" ultimately derives from Pearson's shorthand for the exponent in a [[multivariate normal distribution]] with the Greek letter [[Chi (letter)|chi]], writing −½χ² for what would appear in modern notation as −½'''x'''<sup>T</sup>Σ<sup>−1</sup>'''x''' (Σ being the [[covariance matrix]]).<ref>R. L. Plackett, ''Karl Pearson and the Chi-Squared Test'', International Statistical Review, 1983, [http://www.jstor.org/stable/1402731?seq=3 61f.] See also Jeff Miller, [http://jeff560.tripod.com/c.html Earliest Known Uses of Some of the Words of Mathematics].</ref> The idea of a family of "chi-squared distributions", however, is not due to Pearson but arose as a further development due to Fisher in the 1920s.{{sfn|Hald|1998|pp=633–692|loc=27. Sampling Distributions under Normality}}
==Definition==
If ''Z''<sub>1</sub>, ..., ''Z''<sub>''k''</sub> are [[independence (probability theory)|independent]], [[standard normal]] random variables, then the sum of their squares,
: <math>
Q\ = \sum_{i=1}^k Z_i^2 ,
</math>
is distributed according to the '''chi-squared distribution''' with ''k'' degrees of freedom. This is usually denoted as
: <math>
Q\ \sim\ \chi^2(k)\ \ \text{or}\ \ Q\ \sim\ \chi^2_k .
</math>

The chi-squared distribution has one parameter: ''k'', a positive integer that specifies the number of [[degrees of freedom (statistics)|degrees of freedom]] (i.e. the number of ''Z''<sub>''i''</sub>'s).
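This defining sum can be checked by simulation. The following minimal Python sketch (an illustration added here, not from any cited source; the helper name <code>chi2_sample</code> and the tolerances are ours) builds chi-squared draws from squared standard normals and compares the empirical mean and variance with the values ''k'' and 2''k'' listed in the infobox.

```python
import random

random.seed(0)

def chi2_sample(k):
    # One chi-squared(k) variate: the sum of k squared standard normals.
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

k, n = 4, 100_000
draws = [chi2_sample(k) for _ in range(n)]

mean = sum(draws) / n
var = sum((x - mean) ** 2 for x in draws) / n
print(round(mean, 2), round(var, 2))  # close to k = 4 and 2k = 8
```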
==Characteristics==
Further properties of the chi-squared distribution can be found in the box at the upper right corner of this article.
===Probability density function===
The [[probability density function]] (pdf) of the chi-squared distribution is
:<math>
f(x;\,k) =
\begin{cases}
\frac{x^{(k/2)-1} e^{-x/2}}{2^{k/2} \Gamma\left(\frac{k}{2}\right)}, & x \geq 0; \\ 0, & \text{otherwise},
\end{cases}
</math>
where Γ(''k''/2) denotes the [[Gamma function]], which has [[particular values of the Gamma function|closed-form values for integer ''k'']].

For derivations of the pdf in the cases of one, two and ''k'' degrees of freedom, see [[Proofs related to chi-squared distribution]].
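As a quick numerical sanity check (an illustration only; the function name is ours), the pdf can be coded directly from the formula above; for ''k'' = 2 it reduces to the exponential density ''e''<sup>−''x''/2</sup>/2.

```python
import math

def chi2_pdf(x, k):
    # Density of the chi-squared distribution with k degrees of freedom.
    if x < 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# For k = 2 the pdf reduces to the exponential density e^{-x/2}/2.
print(chi2_pdf(1.0, 2), math.exp(-0.5) / 2)  # the two values agree
```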
===Cumulative distribution function===
[[File:Chernoff XS CDF.png|thumb|right|400px|Chernoff bound for the [[Cumulative distribution function|CDF]] and tail (1-CDF) of a chi-squared random variable with ten degrees of freedom (''k'' = 10)]]

Its [[cumulative distribution function]] is:
: <math>
F(x;\,k) = \frac{\gamma(\frac{k}{2},\,\frac{x}{2})}{\Gamma(\frac{k}{2})} = P\left(\frac{k}{2},\,\frac{x}{2}\right),
</math>
where γ(''s'',&nbsp;''t'') is the [[incomplete Gamma function|lower incomplete Gamma function]] and ''P''(''s'',&nbsp;''t'') is the [[regularized Gamma function]].

In the special case ''k'' = 2 this function has the simple form:
: <math>
F(x;\,2) = 1 - e^{-\frac{x}{2}}.
</math>

Tables of the chi-squared cumulative distribution function are widely available, and the function is included in many [[spreadsheet]]s and all [[List of statistical packages|statistical packages]].
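The ''k'' = 2 special case is simple enough to verify in a couple of lines (an illustrative sketch; the helper name is ours). The CDF reaches ½ at ''x'' = 2 ln 2, the exact median for two degrees of freedom.

```python
import math

def chi2_cdf_k2(x):
    # CDF of the chi-squared distribution with k = 2 degrees of freedom.
    return 1.0 - math.exp(-x / 2.0)

# The CDF equals 1/2 at x = 2 ln 2, the exact median for k = 2.
print(chi2_cdf_k2(2 * math.log(2)))  # ≈ 0.5
```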
Letting <math>z \equiv x/k</math>, [[Chernoff_bound#The_first_step_in_the_proof_of_Chernoff_bounds|Chernoff bounds]] on the lower and upper tails of the CDF may be obtained.<ref>{{cite journal |last1=Dasgupta |first1=Sanjoy D. A. |last2=Gupta |first2=Anupam K. |year=2002 |title=An Elementary Proof of a Theorem of Johnson and Lindenstrauss |journal=Random Structures and Algorithms |volume=22 |pages=60–65 |url=http://cseweb.ucsd.edu/~dasgupta/papers/jl.pdf |accessdate=2012-05-01 }}</ref> For the cases when <math>0 < z < 1</math> (which include all of the cases when this CDF is less than half):
: <math>
F(z k;\,k) \leq (z e^{1-z})^{k/2}.
</math>

The tail bound for the cases when <math>z > 1</math>, similarly, is
: <math>
1-F(z k;\,k) \leq (z e^{1-z})^{k/2}.
</math>

For another [[approximation]] for the CDF, modeled after the cube of a Gaussian, see [[Noncentral_chi-squared_distribution#Approximation|under Noncentral chi-squared distribution]].
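Both bounds can be spot-checked numerically. For even ''k'' the chi-squared CDF has a closed form (a Poisson partial sum), which the sketch below (helper names ours; an illustration, not from the cited paper) compares against the bound (''z e''<sup>1−''z''</sup>)<sup>''k''/2</sup> on each tail for ''k'' = 10.

```python
import math

def chi2_cdf_even(x, k):
    # Exact chi-squared CDF for even k: 1 - exp(-x/2) * sum_{i<k/2} (x/2)^i / i!
    s = sum((x / 2) ** i / math.factorial(i) for i in range(k // 2))
    return 1.0 - math.exp(-x / 2) * s

def chernoff_bound(z, k):
    # The bound (z * e^(1-z))^(k/2) from the text, valid on each tail.
    return (z * math.exp(1 - z)) ** (k / 2)

k = 10
for z in (0.3, 0.7):   # lower tail: F(zk; k) <= bound
    assert chi2_cdf_even(z * k, k) <= chernoff_bound(z, k)
for z in (1.5, 3.0):   # upper tail: 1 - F(zk; k) <= bound
    assert 1 - chi2_cdf_even(z * k, k) <= chernoff_bound(z, k)
print("bounds hold for k =", k)
```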
===Additivity===
It follows from the definition of the chi-squared distribution that the sum of independent chi-squared variables is also chi-squared distributed. Specifically, if {''X<sub>i</sub>''}<sub>''i''=1</sub><sup>''n''</sup> are independent chi-squared variables with {''k<sub>i</sub>''}<sub>''i''=1</sub><sup>''n''</sup> degrees of freedom, respectively, then {{nowrap|''Y {{=}} X''<sub>1</sub> + ⋯ + ''X<sub>n</sub>''}} is chi-squared distributed with {{nowrap|''k''<sub>1</sub> + ⋯ + ''k<sub>n</sub>''}} degrees of freedom.
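Additivity is easy to see in simulation (an illustrative sketch; names and tolerances are ours): a sum of independent χ²(3) and χ²(5) draws should match the χ²(8) mean and variance.

```python
import random

random.seed(1)

def chi2_sample(k):
    # One chi-squared(k) draw as a sum of k squared standard normals.
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

# Additivity: chi2(3) + chi2(5) behaves like chi2(8),
# so its mean is 3 + 5 = 8 and its variance is 2 * 8 = 16.
n = 50_000
sums = [chi2_sample(3) + chi2_sample(5) for _ in range(n)]
mean = sum(sums) / n
var = sum((y - mean) ** 2 for y in sums) / n
print(round(mean, 1), round(var, 1))  # near 8 and 16
```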
===Sample mean===
The sample mean of ''n'' [[Independent and identically distributed random variables|i.i.d.]] chi-squared variables of degree ''k'' is distributed according to a gamma distribution with shape ''α'' and scale ''θ'' parameters:
: <math> \bar X = \frac{1}{n} \sum_{i=1}^{n} X_i \sim \mathrm{Gamma}\left(\alpha=n\cdot k /2, \theta= 2/n \right) \qquad \mathrm{where} \quad X_i \sim \chi^2(k).</math>

Asymptotically, as the shape parameter ''α'' goes to infinity, the Gamma distribution converges towards a normal distribution with expectation ''μ'' = ''αθ'' and variance ''σ''<sup>2</sup> = ''αθ''<sup>2</sup>, so the sample mean converges towards
: <math> \bar X \xrightarrow{n \to \infty} N(k, 2\cdot k /n ). </math>

Note that we would have obtained the same result by invoking instead the [[central limit theorem]], noting that the expectation of the χ² distribution is ''k'' and its variance is 2''k'' (hence the variance of the sample mean is 2''k''/''n'').
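A simulation illustrates the two moments of the sample mean (an illustrative sketch; helper names and parameter choices are ours): for ''k'' = 5 and ''n'' = 50 the sample mean should have expectation 5 and variance 2·5/50 = 0.2.

```python
import random

random.seed(2)

def chi2_sample(k):
    # One chi-squared(k) draw as a sum of k squared standard normals.
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

# The sample mean of n chi-squared(k) variables has expectation k and
# variance 2k/n, matching Gamma(shape = n*k/2, scale = 2/n).
k, n, reps = 5, 50, 5_000
means = [sum(chi2_sample(k) for _ in range(n)) / n for _ in range(reps)]
m = sum(means) / reps
v = sum((x - m) ** 2 for x in means) / reps
print(round(m, 2), round(v, 3))  # near k = 5 and 2k/n = 0.2
```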
===Entropy===
The [[differential entropy]] is given by
: <math>
h = -\int_{-\infty}^\infty f(x;\,k)\ln f(x;\,k) \, dx
= \frac{k}{2} + \ln\!\left[2\,\Gamma\!\left(\frac{k}{2}\right)\right] + \left(1-\frac{k}{2}\right)\, \psi\!\left[\frac{k}{2}\right],
</math>
where ''ψ''(''x'') is the [[Digamma function]].

The chi-squared distribution is the [[maximum entropy probability distribution]] for a random variate ''X'' for which <math>E(X)=k</math> and <math>E(\ln(X))=\psi\left(k/2\right)+\ln(2)</math> are fixed. Since the chi-squared distribution is in the family of gamma distributions, this can be derived by substituting appropriate values in the [[gamma distribution#Logarithmic expectation|expectation of the log moment of the gamma distribution]]. For a derivation from more basic principles, see the derivation in [[exponential family#Moment generating function of the sufficient statistic|moment generating function of the sufficient statistic]].
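The closed form can be checked against direct numerical integration of −∫ ''f'' ln ''f'' d''x'' (an illustrative sketch; helper names ours, restricted to even ''k'' so the digamma value ψ(''k''/2) can be computed from the harmonic-sum identity for integer arguments).

```python
import math

def chi2_pdf(x, k):
    # Chi-squared density with k degrees of freedom (for x > 0).
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

def entropy_formula(k):
    # Closed form k/2 + ln(2*Gamma(k/2)) + (1 - k/2)*psi(k/2), for even k,
    # using psi(n) = -gamma + sum_{j=1}^{n-1} 1/j for integer n = k/2.
    n = k // 2
    psi = -0.5772156649015329 + sum(1.0 / j for j in range(1, n))
    return k / 2 + math.log(2 * math.gamma(k / 2)) + (1 - k / 2) * psi

def entropy_numeric(k, hi=100.0, steps=100_000):
    # -integral of f*ln(f) by the midpoint rule (the tail beyond hi is negligible).
    dx = hi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        f = chi2_pdf(x, k)
        if f > 0.0:
            total -= f * math.log(f) * dx
    return total

print(entropy_formula(4), entropy_numeric(4))  # the two values agree
```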
===Noncentral moments===
The moments about zero of a chi-squared distribution with ''k'' degrees of freedom are given by<ref>[http://mathworld.wolfram.com/Chi-SquaredDistribution.html Chi-squared distribution], from [[MathWorld]], retrieved Feb. 11, 2009</ref><ref>M. K. Simon, ''Probability Distributions Involving Gaussian Random Variables'', New York: Springer, 2002, eq. (2.35), ISBN 978-0-387-34657-1</ref>
: <math>
\operatorname{E}(X^m) = k (k+2) (k+4) \cdots (k+2m-2) = 2^m \frac{\Gamma(m+\frac{k}{2})}{\Gamma(\frac{k}{2})}.
</math>
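The two closed forms above can be checked against each other (an illustrative sketch; helper names ours). Note that ''m'' = 1 recovers the mean ''k'', and ''m'' = 2 gives ''k''(''k''+2) = 2''k'' + ''k''², consistent with the variance 2''k''.

```python
import math

def moment_product(k, m):
    # E[X^m] as the rising product k * (k+2) * ... * (k + 2m - 2).
    out = 1
    for j in range(m):
        out *= k + 2 * j
    return out

def moment_gamma(k, m):
    # E[X^m] as 2^m * Gamma(m + k/2) / Gamma(k/2).
    return 2 ** m * math.gamma(m + k / 2) / math.gamma(k / 2)

# The two closed forms agree for integer m >= 1 and any k >= 1.
for k in (1, 3, 8):
    for m in (1, 2, 3, 4):
        assert math.isclose(moment_product(k, m), moment_gamma(k, m))
print(moment_product(8, 2))  # 80
```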
===Cumulants===
The [[cumulant]]s are readily obtained by a (formal) power series expansion of the logarithm of the characteristic function:
: <math>
\kappa_n = 2^{n-1}(n-1)!\,k.
</math>
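From this formula the moments in the infobox follow directly; the sketch below (an illustration, names ours) reproduces the mean ''k'', variance 2''k'', skewness √(8/''k'') and excess kurtosis 12/''k'' from the first four cumulants.

```python
import math

def cumulant(n, k):
    # n-th cumulant of chi-squared(k): 2^(n-1) * (n-1)! * k.
    return 2 ** (n - 1) * math.factorial(n - 1) * k

k = 7
mean, var = cumulant(1, k), cumulant(2, k)
skew = cumulant(3, k) / var ** 1.5        # kappa_3 / kappa_2^(3/2)
excess_kurt = cumulant(4, k) / var ** 2   # kappa_4 / kappa_2^2

# These reproduce the infobox values: mean k, variance 2k,
# skewness sqrt(8/k), excess kurtosis 12/k.
assert mean == k and var == 2 * k
assert math.isclose(skew, math.sqrt(8 / k))
assert math.isclose(excess_kurt, 12 / k)
print(skew, excess_kurt)
```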
===Asymptotic properties===
By the [[central limit theorem]], because the chi-squared distribution is the sum of ''k'' independent random variables with finite mean and variance, it converges to a normal distribution for large ''k''. For many practical purposes, for ''k'' > 50 the distribution is sufficiently close to a [[normal distribution]] for the difference to be ignored.<ref>{{cite book|title=Statistics for experimenters|author=Box, Hunter and Hunter|publisher=Wiley|year=1978|isbn=0471093157|page=118}}</ref> Specifically, if ''X'' ~ ''χ''²(''k''), then as ''k'' tends to infinity, the distribution of <math>(X-k)/\sqrt{2k}</math> [[convergence of random variables#Convergence in distribution|tends]] to a standard normal distribution. However, convergence is slow, as the [[skewness]] is <math>\sqrt{8/k}</math> and the [[excess kurtosis]] is 12/''k''.

The sampling distribution of ln(''χ''<sup>2</sup>) converges to normality much faster than the sampling distribution of ''χ''<sup>2</sup>,<ref>{{cite journal |first=M. S. |last=Bartlett |first2=D. G. |last2=Kendall |title=The Statistical Analysis of Variance-Heterogeneity and the Logarithmic Transformation |journal=Supplement to the Journal of the Royal Statistical Society |volume=8 |issue=1 |year=1946 |pages=128–138 |jstor=2983618 }}</ref> as the logarithm removes much of the asymmetry.<ref>{{Cite journal |title=Fixing the F Test for Equal Variances |first=Lewis H. |last=Shoemaker |journal=[[The American Statistician]] |volume=57 |issue=2 |year=2003 |pages=105–114 |jstor=30037243 }}</ref> Other functions of the chi-squared distribution converge more rapidly to a normal distribution. Some examples are:
* If ''X'' ~ ''χ''²(''k'') then <math>\scriptstyle\sqrt{2X}</math> is approximately normally distributed with mean <math>\scriptstyle\sqrt{2k-1}</math> and unit variance (a result credited to [[R. A. Fisher]]).
* If ''X'' ~ ''χ''²(''k'') then <math>\scriptstyle\sqrt[3]{X/k}</math> is approximately normally distributed with mean <math>\scriptstyle 1-2/(9k)</math> and variance <math>\scriptstyle 2/(9k).</math><ref>{{cite journal |last=Wilson |first=E. B. |last2=Hilferty |first2=M. M. |year=1931 |title=The distribution of chi-squared |journal=[[Proceedings of the National Academy of Sciences of the United States of America|Proc. Natl. Acad. Sci. USA]] |volume=17 |issue=12 |pages=684–688 |url=http://www.pnas.org/content/17/12/684.full.pdf+html }}</ref> This is known as the Wilson–Hilferty transformation.
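The accuracy of the Wilson–Hilferty transformation can be seen by comparing the resulting approximate CDF with the exact CDF, here for even ''k'' where the latter has a closed form (an illustrative sketch; helper names and the choice ''k'' = 10 are ours).

```python
import math

def chi2_cdf_even(x, k):
    # Exact chi-squared CDF for even k (Poisson partial-sum closed form).
    s = sum((x / 2) ** i / math.factorial(i) for i in range(k // 2))
    return 1.0 - math.exp(-x / 2) * s

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chi2_cdf_wilson_hilferty(x, k):
    # Approximate CDF from (X/k)^(1/3) ~ N(1 - 2/(9k), 2/(9k)).
    mu, var = 1 - 2 / (9 * k), 2 / (9 * k)
    return std_normal_cdf(((x / k) ** (1 / 3) - mu) / math.sqrt(var))

# Even for moderate k the cube-root approximation tracks the exact CDF closely.
k = 10
for x in (5.0, 10.0, 18.0):
    print(x, round(chi2_cdf_even(x, k), 4), round(chi2_cdf_wilson_hilferty(x, k), 4))
```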
==Relation to other distributions==
{{Refimprove section|date=September 2011}}
[[File:Chi on SAS.png|thumb|right|400px|Approximate formula for median compared with numerical quantile (top), as presented in [[SAS (software)|SAS Software]]; difference between numerical quantile and approximate formula (bottom).]]
* As <math>k\to\infty</math>, <math>(\chi^2_k-k)/\sqrt{2k}\ \xrightarrow{d}\ N(0,1)</math> ([[normal distribution]]).
* <math>\chi_k^2 \sim {\chi'}^2_k(0)</math> ([[Noncentral chi-squared distribution]] with non-centrality parameter <math>\lambda = 0</math>).
* If <math>X \sim \mathrm{F}(\nu_1, \nu_2)</math> then <math>Y = \lim_{\nu_2 \to \infty} \nu_1 X</math> has the chi-squared distribution <math>\chi^2_{\nu_{1}}</math>.
* As a special case, if <math>X \sim \mathrm{F}(1, \nu_2)\,</math> then <math>Y = \lim_{\nu_2 \to \infty} X\,</math> has the chi-squared distribution <math>\chi^2_{1}</math>.
* <math>\|\boldsymbol{N}_{i=1,\ldots,k}(0,1)\|^2 \sim \chi^2_k</math> (the squared [[Norm (mathematics)|norm]] of ''k'' standard normally distributed variables is chi-squared distributed with ''k'' [[degrees of freedom (statistics)|degrees of freedom]]).
* If <math>X \sim \chi^2(\nu)\,</math> and <math>c>0\,</math>, then <math>cX \sim \Gamma(k=\nu/2, \theta=2c)\,</math> ([[gamma distribution]]).
* If <math>X \sim \chi^2_k</math> then <math>\sqrt{X} \sim \chi_k</math> ([[chi distribution]]).
* If <math>X \sim \chi^2(2)</math>, then <math>X \sim \mathrm{Exp}(1/2)</math> is an [[exponential distribution]] (see [[gamma distribution]] for more).
* If <math>X \sim \mathrm{Rayleigh}(1)\,</math> ([[Rayleigh distribution]]) then <math>X^2 \sim \chi^2(2)\,</math>.
* If <math>X \sim \mathrm{Maxwell}(1)\,</math> ([[Maxwell distribution]]) then <math>X^2 \sim \chi^2(3)\,</math>.
* If <math>X \sim \chi^2(\nu)</math> then <math>\tfrac{1}{X} \sim \mbox{Inv-}\chi^2(\nu)\,</math> ([[Inverse-chi-squared distribution]]).
* The chi-squared distribution is a special case of the type 3 [[Pearson distribution]].
* If <math>X \sim \chi^2(\nu_1)\,</math> and <math>Y \sim \chi^2(\nu_2)\,</math> are independent then <math>\tfrac{X}{X+Y} \sim {\rm Beta}(\tfrac{\nu_1}{2}, \tfrac{\nu_2}{2})\,</math> ([[beta distribution]]).
* If <math>X \sim {\rm U}(0,1)\,</math> ([[Uniform distribution (continuous)|uniform distribution]]) then <math>-2\log{(X)} \sim \chi^2(2)\,</math>.
* <math>\chi^2(6)\,</math> is a transformation of the [[Laplace distribution]].
* If <math>X_i \sim \mathrm{Laplace}(\mu,\beta)\,</math> then <math>\sum_{i=1}^n{\frac{2 |X_i-\mu|}{\beta}} \sim \chi^2(2n)\,</math>.
* The chi-squared distribution is a transformation of the [[Pareto distribution]].
* [[Student's t-distribution]] is a transformation of the chi-squared distribution.
* [[Student's t-distribution]] can be obtained from the chi-squared distribution and the [[normal distribution]].
* The [[noncentral beta distribution]] can be obtained as a transformation of the chi-squared distribution and the [[noncentral chi-squared distribution]].
* The [[noncentral t-distribution]] can be obtained from the normal distribution and the chi-squared distribution.
A chi-squared variable with ''k'' degrees of freedom is defined as the sum of the squares of ''k'' independent [[standard normal distribution|standard normal]] random variables.

If ''Y'' is a ''k''-dimensional Gaussian random vector with mean vector ''μ'' and rank-''k'' covariance matrix ''C'', then ''X'' = (''Y''−''μ'')<sup>T</sup>''C''<sup>−1</sup>(''Y''−''μ'') is chi-squared distributed with ''k'' degrees of freedom.

The sum of squares of [[statistically independent]] unit-variance Gaussian variables which do ''not'' have mean zero yields a generalization of the chi-squared distribution called the [[noncentral chi-squared distribution]].

If ''Y'' is a vector of ''k'' [[i.i.d.]] standard normal random variables and ''A'' is a ''k''×''k'' [[idempotent matrix]] with [[rank (linear algebra)|rank]] ''k''−''n'', then the [[quadratic form]] ''Y''<sup>T</sup>''AY'' is chi-squared distributed with ''k''−''n'' degrees of freedom.

The chi-squared distribution is also naturally related to other distributions arising from the Gaussian. In particular,
* ''Y'' is [[F-distribution|F-distributed]], ''Y'' ~ ''F''(''k''<sub>1</sub>,''k''<sub>2</sub>), if <math>\scriptstyle Y = \frac{X_1 / k_1}{X_2 / k_2}</math> where ''X''<sub>1</sub> ~ ''χ''²(''k''<sub>1</sub>) and ''X''<sub>2</sub> ~ ''χ''²(''k''<sub>2</sub>) are statistically independent.
* If ''X'' is chi-squared distributed, then <math>\scriptstyle\sqrt{X}</math> is [[chi distribution|chi distributed]].
* If {{nowrap|''X''<sub>1</sub> ~ ''χ''<sup>2</sup><sub>''k''<sub>1</sub></sub>}} and {{nowrap|''X''<sub>2</sub> ~ ''χ''<sup>2</sup><sub>''k''<sub>2</sub></sub>}} are statistically independent, then {{nowrap|''X''<sub>1</sub> + ''X''<sub>2</sub> ~ ''χ''<sup>2</sup><sub>''k''<sub>1</sub>+''k''<sub>2</sub></sub>}}. If ''X''<sub>1</sub> and ''X''<sub>2</sub> are not independent, then {{nowrap|''X''<sub>1</sub> + ''X''<sub>2</sub>}} is not necessarily chi-squared distributed.
==Generalizations==
The chi-squared distribution is obtained as the sum of the squares of ''k'' independent, zero-mean, unit-variance Gaussian random variables. Generalizations of this distribution can be obtained by summing the squares of other types of Gaussian random variables. Several such distributions are described below.

===Chi-squared distributions===

====Noncentral chi-squared distribution====
{{Main|Noncentral chi-squared distribution}}
The noncentral chi-squared distribution is obtained from the sum of the squares of independent Gaussian random variables having unit variance and ''nonzero'' means.

====Generalized chi-squared distribution====
{{Main|Generalized chi-squared distribution}}
The generalized chi-squared distribution is obtained from the quadratic form ''z′Az'' where ''z'' is a zero-mean Gaussian vector having an arbitrary covariance matrix, and ''A'' is an arbitrary matrix.

===Gamma, exponential, and related distributions===
The chi-squared distribution ''X'' ~ ''χ''²(''k'') is a special case of the [[gamma distribution]]: ''X'' ~ Γ(''k''/2, 1/2) using the rate parameterization of the gamma distribution (or ''X'' ~ Γ(''k''/2, 2) using the scale parameterization), where ''k'' is an integer.

Because the [[exponential distribution]] is also a special case of the gamma distribution, we also have that if ''X'' ~ ''χ''²(2), then ''X'' ~ Exp(1/2) is an [[exponential distribution]].

The [[Erlang distribution]] is also a special case of the gamma distribution, and thus we also have that if ''X'' ~ ''χ''²(''k'') with even ''k'', then ''X'' is Erlang distributed with shape parameter ''k''/2 and rate parameter 1/2.
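The gamma/Erlang correspondence can be made concrete for even ''k'', where the integer-shape gamma (Erlang) CDF has a closed form (an illustrative sketch; helper names ours). In particular, chi-squared with two degrees of freedom coincides with the exponential distribution of rate ½.

```python
import math

def erlang_cdf(x, shape, rate):
    # Erlang CDF (integer shape): 1 - exp(-rate*x) * sum_{i<shape} (rate*x)^i / i!
    s = sum((rate * x) ** i / math.factorial(i) for i in range(shape))
    return 1.0 - math.exp(-rate * x) * s

def chi2_cdf_even(x, k):
    # chi-squared(k) = Gamma(shape = k/2, rate = 1/2) for even k.
    return erlang_cdf(x, k // 2, 0.5)

# chi-squared(2) coincides with the exponential distribution Exp(rate 1/2):
x = 3.0
assert math.isclose(chi2_cdf_even(x, 2), 1 - math.exp(-x / 2))
print(chi2_cdf_even(7.0, 6))
```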
==Applications==
The chi-squared distribution has numerous applications in inferential [[statistics]], for instance in [[chi-squared test]]s and in estimating [[variance]]s. It enters the problem of estimating the mean of a normally distributed population and the problem of estimating the slope of a [[linear regression|regression]] line via its role in [[Student's t-distribution]]. It enters all [[analysis of variance]] problems via its role in the [[F-distribution]], which is the distribution of the ratio of two independent chi-squared [[random variable]]s, each divided by its respective degrees of freedom.

Following are some of the most common situations in which the chi-squared distribution arises from a Gaussian-distributed sample.
* If ''X''<sub>1</sub>, ..., ''X<sub>n</sub>'' are [[independent identically-distributed random variables|i.i.d.]] ''N''(''μ'', ''σ''<sup>2</sup>) [[random variable]]s, then <math>\sum_{i=1}^n(X_i - \bar X)^2 \sim \sigma^2 \chi^2_{n-1}</math> where <math>\bar X = \frac{1}{n} \sum_{i=1}^n X_i</math>.
* The box below shows some [[statistic]]s based on {{nowrap|''X<sub>i</sub>'' ∼ Normal(''μ<sub>i</sub>'', ''σ''<sup>2</sup><sub>''i''</sub>), ''i'' {{=}} 1, ⋯, ''k'',}} independent random variables, whose probability distributions have names starting with '''chi''':
<center>
{| class="wikitable" align="center"
|-
! Name !! Statistic
|-
| chi-squared distribution || <math>\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2</math>
|-
| [[noncentral chi-squared distribution]] || <math>\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2</math>
|-
| [[chi distribution]] || <math>\sqrt{\sum_{i=1}^k \left(\frac{X_i-\mu_i}{\sigma_i}\right)^2}</math>
|-
| [[noncentral chi distribution]] || <math>\sqrt{\sum_{i=1}^k \left(\frac{X_i}{\sigma_i}\right)^2}</math>
|}
</center>
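The first situation, the sample sum of squared deviations following σ²χ²(''n''−1), can be illustrated by simulation (a sketch with arbitrary parameter choices, not from any cited source): its expectation should be σ²(''n''−1).

```python
import random

random.seed(3)

# For i.i.d. N(mu, sigma^2) samples, sum((X_i - Xbar)^2) ~ sigma^2 * chi2(n-1),
# so its expectation is sigma^2 * (n - 1).
mu, sigma, n, reps = 10.0, 2.0, 6, 40_000
totals = []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    totals.append(sum((x - xbar) ** 2 for x in xs))
mean_total = sum(totals) / reps
print(round(mean_total, 1))  # near sigma^2 * (n - 1) = 4 * 5 = 20
```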
==Table of ''χ''<sup>2</sup> values vs ''p''-values==
The [[p-value|''p''-value]] is the probability of observing a test statistic ''at least'' as extreme in a chi-squared distribution. Accordingly, since the [[cumulative distribution function]] (CDF) for the appropriate degrees of freedom (''df'') gives the probability of having obtained a value ''less'' extreme than this point, subtracting the CDF value from 1 gives the ''p''-value. The table below gives a number of ''p''-values matching ''χ''<sup>2</sup> values for the first 10 degrees of freedom.

A low ''p''-value indicates greater [[Statistical significance|statistical significance]], i.e. greater confidence that the observed deviation from the null hypothesis is significant. A ''p''-value of 0.05 is often used as a bright-line cutoff between significant and non-significant results.
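The subtraction described above (''p'' = 1 − CDF) is straightforward to compute; the sketch below (an illustration; the helper is ours and handles only even ''df'', for which the upper tail has a closed form) reproduces a familiar table entry: for ''df'' = 2, a ''χ''² value of 5.99 corresponds to ''p'' ≈ 0.05.

```python
import math

def chi2_pvalue_even(x, k):
    # Upper-tail probability 1 - F(x; k) for even k, via the closed-form sum.
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i) for i in range(k // 2))

# For df = 2, a chi-squared value of 5.99 gives p ≈ 0.05.
print(round(chi2_pvalue_even(5.99, 2), 3))  # 0.05
```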
{| class="wikitable"
|-
! Degrees of freedom (df)
! colspan=11 | ''χ''<sup>2</sup> value<ref>[http://www2.lv.psu.edu/jxm57/irp/chisquar.html Chi-Squared Test] Table B.2. Dr. Jacqueline S. McLaughlin at The Pennsylvania State University. In turn citing: R. A. Fisher and F. Yates, Statistical Tables for Biological Agricultural and Medical Research, 6th ed., Table IV.</ref>
|-
| <div align="center">1</div> || 0.004 || 0.02 || 0.06 || 0.15 || 0.46 || 1.07 || 1.64 || 2.71 || 3.84 || 6.64 || 10.83
|-
| <div align="center">2</div> || 0.10 || 0.21 || 0.45 || 0.71 || 1.39 || 2.41 || 3.22 || 4.60 || 5.99 || 9.21 || 13.82
|-
| <div align="center">3</div> || 0.35 || 0.58 || 1.01 || 1.42 || 2.37 || 3.66 || 4.64 || 6.25 || 7.82 || 11.34 || 16.27
|-
| <div align="center">4</div> || 0.71 || 1.06 || 1.65 || 2.20 || 3.36 || 4.88 || 5.99 || 7.78 || 9.49 || 13.28 || 18.47
|-
| <div align="center">5</div> || 1.14 || 1.61 || 2.34 || 3.00 || 4.35 || 6.06 || 7.29 || 9.24 || 11.07 || 15.09 || 20.52
|-
| <div align="center">6</div> || 1.63 || 2.20 || 3.07 || 3.83 || 5.35 || 7.23 || 8.56 || 10.64 || 12.59 || 16.81 || 22.46
|-
| <div align="center">7</div> || 2.17 || 2.83 || 3.82 || 4.67 || 6.35 || 8.38 || 9.80 || 12.02 || 14.07 || 18.48 || 24.32
|-
| <div align="center">8</div> || 2.73 || 3.49 || 4.59 || 5.53 || 7.34 || 9.52 || 11.03 || 13.36 || 15.51 || 20.09 || 26.12
|-
| <div align="center">9</div> || 3.32 || 4.17 || 5.38 || 6.39 || 8.34 || 10.66 || 12.24 || 14.68 || 16.92 || 21.67 || 27.88
|-
| <div align="center">10</div> || 3.94 || 4.86 || 6.18 || 7.27 || 9.34 || 11.78 || 13.44 || 15.99 || 18.31 || 23.21 || 29.59
|-
! <div align="right">P value (Probability)</div>
| style="background: #ffa2aa" | 0.95
| style="background: #efaaaa" | 0.90
| style="background: #e8b2aa" | 0.80
| style="background: #dfbaaa" | 0.70
| style="background: #d8c2aa" | 0.50
| style="background: #cfcaaa" | 0.30
| style="background: #c8d2aa" | 0.20
| style="background: #bfdaaa" | 0.10
| style="background: #b8e2aa" | 0.05
| style="background: #afeaaa" | 0.01
| style="background: #a8faaa" | 0.001
|}
==See also==
{{Portal|Statistics}}
{{Colbegin}}
* [[Cochran's theorem]]
* [[F-distribution]]
* [[Fisher's method]] for combining [[Statistical independence|independent]] tests of significance
* [[Gamma distribution]]
* [[Generalized chi-squared distribution]]
* [[Hotelling's T-squared distribution]]
* [[Pearson's chi-squared test]]
* [[Student's t-distribution]]
* [[Wilks' lambda distribution]]
* [[Wishart distribution]]
{{Colend}}

==References==
{{Reflist}}

==Further reading==
{{refbegin}}
* {{cite isbn|0471179124}}
* {{cite doi|10.1093/biomet/1.2.155}}
{{refend}}

==External links==
* {{springer|title=Chi-squared distribution|id=p/c022100}}
* [http://calculus-calculator.com/statistics/chi-squared-distribution-calculator.html Calculator for the pdf, cdf and quantiles of the chi-squared distribution]
* [http://jeff560.tripod.com/c.html Earliest Uses of Some of the Words of Mathematics: entry on Chi squared has a brief history]
* [http://www.stat.yale.edu/Courses/1997-98/101/chigf.htm Course notes on Chi-Squared Goodness of Fit Testing] from Yale University Stats 101 class
* [http://demonstrations.wolfram.com/StatisticsAssociatedWithNormalSamples/ ''Mathematica'' demonstration showing the chi-squared sampling distribution of various statistics, e.g. Σ''x''², for a normal population]
* [http://www.jstor.org/stable/2348373 Simple algorithm for approximating cdf and inverse cdf for the chi-squared distribution with a pocket calculator]

{{ProbDistributions|continuous-semi-infinite}}
{{Common univariate probability distributions}}

{{DEFAULTSORT:Chi-Squared Distribution}}
[[Category:Continuous distributions]]
[[Category:Normal distribution]]
[[Category:Exponential family distributions]]
[[Category:Infinitely divisible probability distributions]]
[[Category:Probability distributions]]