In [[probability theory]], the [[central limit theorem]] states that, under certain circumstances, the [[probability distribution]] of the scaled [[sample mean|mean of a random sample]] [[convergence in distribution|converges]] to a [[normal distribution]] as the sample size increases to infinity. Under stronger assumptions, the '''Berry–Esseen theorem''', or '''Berry–Esseen inequality''', gives a more quantitative result, because it also specifies the rate at which this convergence takes place by giving a bound on the maximal error of [[approximation theory|approximation]] between the normal distribution and the true distribution of the scaled sample mean. The approximation is measured by the [[Kolmogorov–Smirnov distance]]. In the case of [[Independence (probability theory)|independent samples]], the convergence rate is {{math|''n''<sup>−1/2</sup>}}, where {{math|''n''}} is the sample size, and the constant is estimated in terms of the [[Skewness|third]] absolute [[Moment (mathematics)|normalized moment]].

==Statement of the theorem==

Statements of the theorem vary, as it was independently discovered by two [[mathematician]]s, [[Andrew C. Berry]] (in 1941) and [[Carl-Gustav Esseen]] (1942), who then, along with other authors, refined it repeatedly over subsequent decades.

===Identically distributed summands===

One version, sacrificing generality somewhat for the sake of clarity, is the following:

:There exists a positive [[Constant (mathematics)|constant]] ''C'' such that if ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., are [[Independent and identically-distributed random variables|i.i.d. random variables]] with E(''X''<sub>1</sub>) = 0, E(''X''<sub>1</sub><sup>2</sup>) = σ<sup>2</sup> > 0, and E(|''X''<sub>1</sub>|<sup>3</sup>) = ρ < ∞, and if we define

::<math>Y_n = {X_1 + X_2 + \cdots + X_n \over n}</math>

:the [[sample mean]], with ''F''<sub>''n''</sub> the [[cumulative distribution function]] of

::<math>{Y_n \sqrt{n} \over {\sigma}},</math><!-- please do not change this formula unless you have read and understood the relevant comments on the talk page -->

:and Φ the cumulative distribution function of the [[standard normal distribution]], then for all ''x'' and ''n'',

::<math>\left|F_n(x) - \Phi(x)\right| \le {C \rho \over \sigma^3\,\sqrt{n}}.\ \ \ \ (1)</math>

[[Image:BerryEsseenTheoremCDFGraphExample.png|thumb|250px|Illustration of the difference in cumulative distribution functions alluded to in the theorem.]]

That is: given a sequence of [[independent and identically-distributed random variables]], each having [[mean]] zero and positive [[variance]], if additionally the third absolute [[moment (mathematics)|moment]] is finite, then the [[cumulative distribution function]]s of the [[Standard score|standardized]] sample mean and the standard normal distribution differ (vertically, on a graph) by no more than the specified amount. Note that the approximation error for all ''n'' (and hence the limiting rate of convergence as ''n'' grows large) is bounded by the [[Big O notation|order]] of ''n''<sup>−1/2</sup>.
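
The bound can be illustrated numerically. The following sketch (an illustration only, not part of the theorem) simulates standardized sums of centered exponential variables, for which σ = 1 and ρ = 12/''e'' − 2 ≈ 2.41, and compares the empirical Kolmogorov distance to the standard normal cdf with the right-hand side of (1) evaluated with ''C'' = 0.4748; the choice of distribution, the sample sizes and the use of NumPy/SciPy are assumptions of this example.
<syntaxhighlight lang="python">
# Illustrative Monte-Carlo check of the Berry–Esseen bound (1); the centered
# exponential distribution, the sample sizes and the constant C = 0.4748 are
# choices made only for this example.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

sigma = 1.0              # standard deviation of X = Exp(1) - 1
rho = 12 / np.e - 2      # third absolute moment E|X|^3 of X = Exp(1) - 1 (about 2.41)
C = 0.4748               # best known upper bound on the constant (Shevtsova 2011)

for n in (4, 16, 64):
    # Standardized sums Y_n * sqrt(n) / sigma for 200,000 independent samples.
    z = (rng.exponential(1.0, size=(200_000, n)).sum(axis=1) - n) / (sigma * np.sqrt(n))
    # Approximate sup_x |F_n(x) - Phi(x)| by comparing both cdfs on a grid.
    grid = np.linspace(-4.0, 4.0, 2001)
    empirical_cdf = np.searchsorted(np.sort(z), grid, side="right") / z.size
    distance = np.abs(empirical_cdf - norm.cdf(grid)).max()
    bound = C * rho / (sigma**3 * np.sqrt(n))
    print(f"n = {n:3d}: empirical distance ≈ {distance:.4f}, Berry–Esseen bound ≈ {bound:.4f}")
</syntaxhighlight>
On a typical run the measured distance stays well below the bound and shrinks at roughly the ''n''<sup>−1/2</sup> rate.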

Calculated values of the constant ''C'' have decreased markedly over the years, from the original value of 7.59 by {{harvtxt|Esseen|1942}}, to 0.7882 by {{harvtxt|van Beek|1972}}, then 0.7655 by {{harvtxt|Shiganov|1986}}, then 0.7056 by {{harvtxt|Shevtsova|2007}}, then 0.7005 by {{harvtxt|Shevtsova|2008}}, then 0.5894 by {{harvtxt|Tyurin|2009}}, then 0.5129 by {{harvtxt|Korolev|Shevtsova|2009}}, then 0.4785 by {{harvtxt|Tyurin|2010}}. A detailed review can be found in {{harvtxt|Korolev|Shevtsova|2009}} and {{harvtxt|Korolev|Shevtsova|2010}}. The best estimate {{Asof|2012|lc=yes}}, ''C'' < 0.4748, follows from the inequality

:<math>\sup_{x\in\mathbb R}\left|F_n(x) - \Phi(x)\right| \le {0.33554 (\rho+0.415\sigma^3)\over \sigma^3\,\sqrt{n}},</math>

due to {{harvtxt|Shevtsova|2011}}, since σ<sup>3</sup> ≤ ρ and 0.33554 · 1.415 < 0.4748. However, if ρ ≥ 1.286σ<sup>3</sup>, then the estimate

:<math>\sup_{x\in\mathbb R}\left|F_n(x) - \Phi(x)\right| \le {0.3328 (\rho+0.429\sigma^3)\over \sigma^3\,\sqrt{n}},</math>

which is also proved in {{harvtxt|Shevtsova|2011}}, gives an even tighter upper bound.
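
Written out, the derivation of ''C'' < 0.4748 from the first of these bounds uses only the relation σ<sup>3</sup> ≤ ρ:

:<math>0.33554\,(\rho+0.415\sigma^3)\le 0.33554\,(1+0.415)\,\rho = 0.33554\cdot 1.415\,\rho< 0.4748\,\rho,</math>

so its right-hand side is at most <math>0.4748\,\rho/(\sigma^3\sqrt{n})</math>, which has the form of (1) with ''C'' < 0.4748.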

{{harvtxt|Esseen|1956}} proved that the constant also satisfies the lower bound

:<math>C\geq\frac{\sqrt{10}+3}{6\sqrt{2\pi}} \approx 0.40973 \approx \frac{1}{\sqrt{2\pi}} + 0.01079.</math>

===Non-identically distributed summands===

:Let ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., be independent random variables with [[expected value|E]](''X''<sub>i</sub>) = 0, E(''X''<sub>i</sub><sup>2</sup>) = σ<sub>i</sub><sup>2</sup> > 0, and E(|''X''<sub>i</sub>|<sup>3</sup>) = ρ<sub>i</sub> < ∞. Also, let

::<math>S_n = {X_1 + X_2 + \cdots + X_n \over \sqrt{\sigma_1^2+\sigma_2^2+\cdots+\sigma_n^2} }</math>

:be the normalized ''n''-th partial sum. Denote by ''F''<sub>''n''</sub> the [[cumulative distribution function|cdf]] of ''S''<sub>''n''</sub>, and by Φ the cdf of the [[standard normal distribution]]. For the sake of convenience denote

::<math>\vec{\sigma}=(\sigma_1,...,\sigma_n),\ \vec{\rho}=(\rho_1,...,\rho_n).</math>

:In 1941, [[Andrew C. Berry]] proved that for all ''n'' there exists an absolute constant ''C''<sub>1</sub> such that

::<math>\sup_{x\in\mathbb R}\left|F_n(x) - \Phi(x)\right| \le C_1\cdot\psi_1,\ \ \ \ (2)</math>

:where

::<math>\psi_1=\psi_1\big(\vec{\sigma},\vec{\rho}\big)=\Big({\textstyle\sum\limits_{i=1}^n\sigma_i^2}\Big)^{-1/2}\cdot\max_{1\le i\le n}\frac{\rho_i}{\sigma_i^2}.</math>

:Independently, in 1942, [[Carl-Gustav Esseen]] proved that for all ''n'' there exists an absolute constant ''C''<sub>0</sub> such that

::<math>\sup_{x\in\mathbb R}\left|F_n(x) - \Phi(x)\right| \le C_0\cdot\psi_0, \ \ \ \ (3)</math>

:where

::<math>\psi_0=\psi_0\big(\vec{\sigma},\vec{\rho}\big)=\Big({\textstyle\sum\limits_{i=1}^n\sigma_i^2}\Big)^{-3/2}\cdot\sum\limits_{i=1}^n\rho_i.</math>

It is easy to check that ψ<sub>0</sub> ≤ ψ<sub>1</sub>, since <math>\textstyle\sum_{i=1}^n\rho_i\le\big(\sum_{i=1}^n\sigma_i^2\big)\cdot\max_{1\le i\le n}(\rho_i/\sigma_i^2)</math>. For this reason, inequality (3) is conventionally called the Berry–Esseen inequality, and the quantity ψ<sub>0</sub> is called the Lyapunov fraction of the third order. Moreover, in the case where the summands ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> have identical distributions,

::<math>\psi_0=\psi_1=\frac{\rho_1}{\sigma_1^3\sqrt{n}},</math>

and thus the bounds stated by inequalities (1), (2) and (3) coincide.
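
As a small illustration (not part of either theorem), the following sketch evaluates ψ<sub>0</sub> and ψ<sub>1</sub> for an arbitrary, made-up set of per-summand moments and checks that ψ<sub>0</sub> ≤ ψ<sub>1</sub>; the numerical values of σ<sub>''i''</sub><sup>2</sup> and ρ<sub>''i''</sub> are assumptions of the example.
<syntaxhighlight lang="python">
# Evaluate psi_0 (from Esseen's inequality (3)) and psi_1 (from Berry's
# inequality (2)) for hypothetical per-summand moments; the numbers are
# arbitrary examples chosen to satisfy rho_i >= sigma_i^3.
import math

sigma2 = [1.0, 2.0, 0.5, 4.0]   # variances sigma_i^2 of the summands
rho = [1.2, 4.0, 0.6, 11.0]     # third absolute moments rho_i = E|X_i|^3

total_var = sum(sigma2)

# Lyapunov fraction of the third order, used in inequality (3).
psi0 = sum(rho) / total_var**1.5
# Berry's quantity, used in inequality (2).
psi1 = max(r / s2 for r, s2 in zip(rho, sigma2)) / math.sqrt(total_var)

assert psi0 <= psi1             # psi_0 <= psi_1 always holds, as noted in the text
print(f"psi_0 = {psi0:.4f}, psi_1 = {psi1:.4f}")
</syntaxhighlight>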

Regarding ''C''<sub>0</sub>, the lower bound established by {{harvtxt|Esseen|1956}} obviously remains valid:

:<math>C_0\geq\frac{\sqrt{10}+3}{6\sqrt{2\pi}} = 0.4097\ldots</math>

The upper bounds for ''C''<sub>0</sub> were subsequently lowered from the original estimate of 7.59 due to {{harvtxt|Esseen|1942}}; among the more recent results, the bound was reduced to 0.9051 by {{harvtxt|Zolotarev|1967}}, 0.7975 by {{harvtxt|van Beek|1972}}, 0.7915 by {{harvtxt|Shiganov|1986}}, and 0.6379 and 0.5606 by {{harvtxt|Tyurin|2009}} and {{harvtxt|Tyurin|2010}}, respectively. {{Asof|2011}}, the best estimate is 0.5600, obtained by {{harvtxt|Shevtsova|2010}}.

==See also==
*[[Chernoff's inequality]]
*[[Edgeworth series]]
*[[List of inequalities]]
*[[List of mathematical theorems]]

==References==
{{refbegin}}
* {{cite journal | ref=harv | last=Berry | first=Andrew C. | year=1941 | title=The Accuracy of the Gaussian Approximation to the Sum of Independent Variates | journal=Transactions of the American Mathematical Society | volume=49 | issue=1 | pages=122–136 | jstor=1990053 | doi=10.1090/S0002-9947-1941-0003498-3}}
* Durrett, Richard (1991). ''Probability: Theory and Examples''. Pacific Grove, CA: Wadsworth & Brooks/Cole. ISBN 0-534-13206-5.
* {{cite journal | ref=harv | last=Esseen | first=Carl-Gustav | year=1942 | title=On the Liapunoff limit of error in the theory of probability | journal=Arkiv för matematik, astronomi och fysik | issn=0365-4133 | volume=A28 | pages=1–19}}
* {{cite journal | ref=harv | last=Esseen | first=Carl-Gustav | year=1956 | title=A moment inequality with an application to the central limit theorem | journal=Skand. Aktuarietidskr. | volume=39 | pages=160–170}}
* Feller, William (1972). ''An Introduction to Probability Theory and Its Applications, Volume II'' (2nd ed.). New York: John Wiley & Sons. ISBN 0-471-25709-5.
* {{cite journal | ref=harv | last1=Korolev | first1=V. Yu. | last2=Shevtsova | first2=I. G. | year=2010 | title=On the upper bound for the absolute constant in the Berry–Esseen inequality | journal=Theory of Probability and its Applications | volume=54 | issue=4 | pages=638–658 | doi=10.1137/S0040585X97984449}}
* {{cite journal | ref=harv | last1=Korolev | first1=Victor | last2=Shevtsova | first2=Irina | year=2010 | title=An improvement of the Berry–Esseen inequality with applications to Poisson and mixed Poisson random sums | journal=Scandinavian Actuarial Journal | pages=1–25 | doi=10.1080/03461238.2010.485370 | url=http://www.tandfonline.com/doi/abs/10.1080/03461238.2010.485370}}
* Manoukian, Edward B. (1986). ''Modern Concepts and Theorems of Mathematical Statistics''. New York: Springer-Verlag. ISBN 0-387-96186-0.
* Serfling, Robert J. (1980). ''Approximation Theorems of Mathematical Statistics''. New York: John Wiley & Sons. ISBN 0-471-02403-1.
* {{cite journal | ref=harv | last=Shevtsova | first=I. G. | year=2007 | title=Sharpening of the upper bound of the absolute constant in the Berry–Esseen inequality | journal=Theory of Probability and its Applications | volume=51 | issue=3 | pages=549–553 | doi=10.1137/S0040585X97982591}}
* {{cite journal | ref=harv | last=Shevtsova | first=I. G. | year=2008 | title=On the absolute constant in the Berry–Esseen inequality | journal=The Collection of Papers of Young Scientists of the Faculty of Computational Mathematics and Cybernetics | issue=5 | pages=101–110}}
* {{cite journal | ref=harv | last=Shevtsova | first=I. G. | year=2010 | title=An Improvement of Convergence Rate Estimates in the Lyapunov Theorem | journal=Doklady Mathematics | volume=82 | issue=3 | pages=862–864 | doi=10.1134/S1064562410060062}}
* {{cite arXiv | ref=harv | last=Shevtsova | first=Irina | year=2011 | title=On the absolute constants in the Berry–Esseen type inequalities for identically distributed summands | eprint=1111.6554 | class=math.PR}}
* {{cite journal | ref=harv | last=Shiganov | first=I. S. | year=1986 | title=Refinement of the upper bound of a constant in the remainder term of the central limit theorem | journal=Journal of Soviet Mathematics | volume=35 | issue=3 | pages=109–115 | doi=10.1007/BF01121471}}
* {{cite journal | ref=harv | last=Tyurin | first=I. S. | year=2009 | title=On the accuracy of the Gaussian approximation | journal=Doklady Mathematics | volume=80 | issue=3 | pages=840–843 | doi=10.1134/S1064562409060155}}
* {{cite journal | ref=harv | last=Tyurin | first=I. S. | year=2010 | title=An improvement of upper estimates of the constants in the Lyapunov theorem | journal=Russian Mathematical Surveys | volume=65 | issue=3(393) | pages=201–202}}
* {{cite journal | ref=harv | last=van Beek | first=P. | year=1972 | title=An application of Fourier methods to the problem of sharpening the Berry–Esseen inequality | journal=Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete | volume=23 | issue=3 | pages=187–196 | doi=10.1007/BF00536558}}
* {{cite journal | ref=harv | last=Zolotarev | first=V. M. | year=1967 | title=A sharpening of the inequality of Berry–Esseen | journal=Z. Wahrsch. Verw. Geb. | volume=8 | issue=4 | pages=332–342 | doi=10.1007/BF00531598}}
{{refend}}

==External links==
* Gut, Allan & Holst, Lars. [http://www.stat.unipd.it/bernoulli/02a/bn_3.html Carl-Gustav Esseen], retrieved Mar. 15, 2004.
* {{springer|title=Berry-Esseen inequality|id=p/b015760}}

{{DEFAULTSORT:Berry-Esseen theorem}}
[[Category:Probability theorems]]
[[Category:Probabilistic inequalities]]
[[Category:Statistical theorems]]
[[Category:Central limit theorem]]