# Concentration inequality


In mathematics, concentration inequalities provide probability bounds on how a random variable deviates from some value (e.g. its expectation). The laws of large numbers of classical probability theory state that sums of independent random variables are, under very mild conditions, close to their expectation with large probability. Such sums are the most basic examples of random variables concentrated around their mean. Recent results show that such behavior is shared by other functions of independent random variables.

## Markov's inequality

If X is any random variable and a > 0, then

${\displaystyle \Pr(|X|\geq a)\leq {\frac {{\textrm {E}}(|X|)}{a}}.}$

The proof is short: since ${\displaystyle |X|\geq a\cdot \mathbf {1} _{|X|\geq a}}$, taking expectations of both sides gives ${\displaystyle {\textrm {E}}(|X|)\geq a\Pr(|X|\geq a)}$.

We can extend Markov's inequality to a strictly increasing and non-negative function ${\displaystyle \Phi }$. We have

${\displaystyle \Pr(X\geq a)=\Pr(\Phi (X)\geq \Phi (a))\leq {\frac {{\textrm {E}}(\Phi (X))}{\Phi (a)}}.}$
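The bound can be checked numerically. The sketch below (the exponential distribution and the values of ${\displaystyle a}$ are illustrative choices, not part of the statement) samples from an exponential distribution with mean 1, so that ${\displaystyle {\textrm {E}}(|X|)=1}$ and Markov's inequality gives ${\displaystyle \Pr(X\geq a)\leq 1/a}$:

```python
import random

# Monte Carlo check of Markov's inequality for X ~ Exponential(1),
# where E(|X|) = 1, so Pr(X >= a) <= 1/a for every a > 0.
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

for a in (1.0, 2.0, 5.0):
    empirical = sum(x >= a for x in samples) / n
    bound = 1.0 / a  # E(|X|) / a with E(|X|) = 1
    print(f"a={a}: Pr(X >= a) ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```

The true tail here is ${\displaystyle e^{-a}}$, well below ${\displaystyle 1/a}$: Markov's inequality is valid but often loose.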

## Chebyshev's inequality

Chebyshev's inequality is the special case of the generalized Markov inequality obtained by taking ${\displaystyle \Phi (x)=x^{2}}$ and applying it to the random variable ${\displaystyle |X-{\textrm {E}}(X)|}$.

If X is any random variable and a > 0, then

${\displaystyle \Pr(|X-{\textrm {E}}(X)|\geq a)\leq {\frac {{\textrm {Var}}(X)}{a^{2}}},}$

where Var(X) is the variance of X, defined as:

${\displaystyle \operatorname {Var} (X)=\operatorname {E} [(X-\operatorname {E} (X))^{2}].}$
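As a numerical illustration (the uniform distribution and thresholds are illustrative choices): for ${\displaystyle X\sim \mathrm {Uniform} (0,1)}$ we have ${\displaystyle {\textrm {E}}(X)=1/2}$ and ${\displaystyle \operatorname {Var} (X)=1/12}$, so Chebyshev's inequality gives ${\displaystyle \Pr(|X-1/2|\geq a)\leq 1/(12a^{2})}$:

```python
import random

# Monte Carlo check of Chebyshev's inequality for X ~ Uniform(0, 1):
# E(X) = 1/2 and Var(X) = 1/12, so Pr(|X - 1/2| >= a) <= 1/(12 a^2).
random.seed(0)
n = 100_000
samples = [random.random() for _ in range(n)]

for a in (0.3, 0.4, 0.45):
    empirical = sum(abs(x - 0.5) >= a for x in samples) / n
    bound = (1 / 12) / a**2
    print(f"a={a}: Pr(|X - 1/2| >= a) ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```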

## Asymptotic behavior of binomial distribution

If a random variable X follows the binomial distribution with parameters ${\displaystyle n}$ and ${\displaystyle p}$, the probability of getting exactly ${\displaystyle k}$ successes in ${\displaystyle n}$ trials is given by the probability mass function

${\displaystyle f(k;n,p)=\Pr(X=k)={n \choose k}p^{k}(1-p)^{n-k}.}$

Let ${\displaystyle S_{n}=\sum _{i=1}^{n}X_{i}}$, where the ${\displaystyle X_{i}}$'s are i.i.d. Bernoulli random variables with parameter ${\displaystyle p}$; then ${\displaystyle S_{n}}$ follows the binomial distribution with parameters ${\displaystyle n}$ and ${\displaystyle p}$. The Central Limit Theorem implies that as ${\displaystyle n\to \infty }$, ${\displaystyle S_{n}}$ is approximately normally distributed with mean ${\displaystyle np}$ and variance ${\displaystyle np(1-p)}$, and

${\displaystyle \lim _{n\to \infty }\Pr[a\sigma <S_{n}-np<b\sigma ]=\int _{a}^{b}{\frac {1}{\sqrt {2\pi }}}e^{-x^{2}/2}\,dx,}$ where ${\displaystyle \sigma ={\sqrt {np(1-p)}}}$.

For ${\displaystyle p={\frac {\lambda }{n}}}$, where ${\displaystyle \lambda }$ is a constant, the limiting distribution of the binomial distribution ${\displaystyle B(n,p)}$ is the Poisson distribution ${\displaystyle P(\lambda )}$.
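The Poisson limit can be observed directly by comparing the two probability mass functions; the choice ${\displaystyle \lambda =3}$ and the grid of ${\displaystyle n}$ values below are illustrative:

```python
from math import comb, exp, factorial

# Illustrative check that Binomial(n, lam/n) approaches Poisson(lam):
# the maximum pointwise gap between the two pmfs shrinks as n grows.
lam = 3.0

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

gaps = []
for n in (10, 100, 1000):
    p = lam / n
    gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(10))
    gaps.append(gap)
    print(f"n={n}: max pmf gap over k < 10 is {gap:.5f}")
```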

## General Chernoff inequality

A Chernoff bound gives exponentially decreasing bounds on tail distributions of sums of independent random variables.[1] Let ${\displaystyle X_{i}}$ ${\displaystyle (1\leq i\leq n)}$ be independent, but not necessarily identically distributed, random variables satisfying ${\displaystyle X_{i}\geq E(X_{i})-a_{i}-M}$, and let

${\displaystyle X=\sum _{i=1}^{n}X_{i}}$

Then we have the lower-tail inequality:

${\displaystyle \Pr[X\leq E(X)-\lambda ]\leq e^{-{\frac {\lambda ^{2}}{2(Var(X)+\sum _{i=1}^{n}a_{i}^{2}+M\lambda /3)}}}}$

If ${\displaystyle X_{i}}$ instead satisfies ${\displaystyle X_{i}\leq E(X_{i})+a_{i}+M}$, we have the upper-tail inequality:

${\displaystyle \Pr[X\geq E(X)+\lambda ]\leq e^{-{\frac {\lambda ^{2}}{2(Var(X)+\sum _{i=1}^{n}a_{i}^{2}+M\lambda /3)}}}}$
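As an illustrative special case (the parameter choices are ours, not from the statement above), take i.i.d. Bernoulli(${\displaystyle p}$) variables with ${\displaystyle a_{i}=0}$ and ${\displaystyle M=1}$ (each ${\displaystyle X_{i}\leq E(X_{i})+1}$), so the upper-tail bound becomes ${\displaystyle \exp(-\lambda ^{2}/(2(\mathrm {Var} (X)+\lambda /3)))}$:

```python
import math
import random

# Monte Carlo check of the upper-tail bound for X = sum of n i.i.d.
# Bernoulli(p) variables, with a_i = 0 and M = 1 (so X_i <= E(X_i) + 1).
random.seed(0)
n, p, trials = 200, 0.3, 5_000
mean, var = n * p, n * p * (1 - p)

for lam in (10.0, 20.0):
    hits = sum(
        sum(random.random() < p for _ in range(n)) >= mean + lam
        for _ in range(trials)
    )
    empirical = hits / trials
    bound = math.exp(-lam**2 / (2 * (var + lam / 3)))
    print(f"lam={lam}: Pr[X >= E(X) + lam] ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```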

If the ${\displaystyle X_{i}}$ are i.i.d. with ${\displaystyle |X_{i}|\leq 1}$, and ${\displaystyle \sigma ^{2}}$ is the variance of each ${\displaystyle X_{i}}$, then a typical version of the Chernoff inequality is:

${\displaystyle \Pr[|X|\geq k\sigma ]\leq 2e^{-k^{2}/4n}}$ for ${\displaystyle 0\leq k\leq 2\sigma }$

## Hoeffding's inequality

Hoeffding's inequality can be stated as follows:

Let ${\displaystyle X_{1},\dots ,X_{n}\!}$ be independent random variables. Assume that the ${\displaystyle X_{i}}$ are almost surely bounded; that is, assume for ${\displaystyle 1\leq i\leq n}$ that

${\displaystyle \Pr(X_{i}\in [a_{i},b_{i}])=1.\!}$

Then, for the empirical mean of these variables

${\displaystyle {\overline {X}}={\frac {X_{1}+\cdots +X_{n}}{n}}}$

we have the inequalities (Hoeffding 1963, Theorem 2 [2]):

${\displaystyle \Pr({\overline {X}}-\mathrm {E} [{\overline {X}}]\geq t)\leq \exp \left(-{\frac {2t^{2}n^{2}}{\sum _{i=1}^{n}(b_{i}-a_{i})^{2}}}\right),\!}$
${\displaystyle \Pr(|{\overline {X}}-\mathrm {E} [{\overline {X}}]|\geq t)\leq 2\exp \left(-{\frac {2t^{2}n^{2}}{\sum _{i=1}^{n}(b_{i}-a_{i})^{2}}}\right).\!}$
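A quick numerical check (with illustrative parameters): for ${\displaystyle X_{i}\sim \mathrm {Uniform} (0,1)}$ we have ${\displaystyle a_{i}=0,b_{i}=1}$, so the one-sided bound reduces to ${\displaystyle \Pr({\overline {X}}-1/2\geq t)\leq \exp(-2nt^{2})}$:

```python
import math
import random

# Monte Carlo check of Hoeffding's one-sided bound for X_i ~ Uniform(0, 1):
# here a_i = 0 and b_i = 1, so Pr(mean - 1/2 >= t) <= exp(-2 n t^2).
random.seed(0)
n, trials = 50, 20_000

for t in (0.1, 0.2):
    hits = sum(
        (sum(random.random() for _ in range(n)) / n - 0.5) >= t
        for _ in range(trials)
    )
    empirical = hits / trials
    bound = math.exp(-2 * n * t**2)
    print(f"t={t}: Pr(mean - 1/2 >= t) ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```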

## Bennett's inequality

Bennett's inequality was proved by George Bennett of the University of New South Wales in 1962.[3]

Let X1, … Xn be independent random variables, and assume (for simplicity but without loss of generality) they all have zero expected value. Further assume |Xi| ≤ a almost surely for all i, and let

${\displaystyle \sigma ^{2}={\frac {1}{n}}\sum _{i=1}^{n}\operatorname {Var} (X_{i}).}$

Then for any t ≥ 0,

${\displaystyle \Pr \left(\sum _{i=1}^{n}X_{i}>t\right)\leq \exp \left(-{\frac {n\sigma ^{2}}{a^{2}}}h\left({\frac {at}{n\sigma ^{2}}}\right)\right),}$

where h(u) = (1 + u)log(1 + u) − u.[4] See Fan et al. (2012)[5] for a martingale version of Bennett's inequality and its improvement.
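The bound can be checked numerically for a concrete zero-mean bounded distribution; here ${\displaystyle X_{i}\sim \mathrm {Uniform} (-1,1)}$, so ${\displaystyle a=1}$ and ${\displaystyle \operatorname {Var} (X_{i})=1/3}$ (illustrative choices):

```python
import math
import random

# Monte Carlo check of Bennett's inequality for X_i ~ Uniform(-1, 1):
# zero mean, |X_i| <= a = 1, Var(X_i) = 1/3, h(u) = (1+u)log(1+u) - u.
random.seed(0)
n, trials, a, sigma2 = 30, 20_000, 1.0, 1.0 / 3.0

def h(u):
    return (1 + u) * math.log(1 + u) - u

for t in (5.0, 10.0):
    hits = sum(
        sum(random.uniform(-1, 1) for _ in range(n)) > t for _ in range(trials)
    )
    empirical = hits / trials
    bound = math.exp(-(n * sigma2 / a**2) * h(a * t / (n * sigma2)))
    print(f"t={t}: Pr(sum > t) ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```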

## Bernstein's inequality

Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let X1, ..., Xn be independent Bernoulli random variables taking the values +1 and −1, each with probability 1/2. Then for every positive ${\displaystyle \varepsilon }$,

${\displaystyle \mathbf {P} \left\{\left|\;{\frac {1}{n}}\sum _{i=1}^{n}X_{i}\;\right|>\varepsilon \right\}\leq 2\exp \left\{-{\frac {n\varepsilon ^{2}}{2(1+\varepsilon /3)}}\right\}.}$
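This simplest case can be verified by simulation (sample sizes below are illustrative):

```python
import math
import random

# Monte Carlo check of Bernstein's inequality for +/-1 variables
# taking each value with probability 1/2.
random.seed(0)
n, trials = 100, 10_000

for eps in (0.2, 0.3):
    hits = sum(
        abs(sum(random.choice((-1, 1)) for _ in range(n))) / n > eps
        for _ in range(trials)
    )
    empirical = hits / trials
    bound = 2 * math.exp(-n * eps**2 / (2 * (1 + eps / 3)))
    print(f"eps={eps}: Pr(|mean| > eps) ~ {empirical:.4f} <= {bound:.4f}")
    assert empirical <= bound
```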

## Efron–Stein inequality

The Efron–Stein inequality (or influence inequality, or MG bound on variance) bounds the variance of a general function of independent random variables.

Suppose that ${\displaystyle X_{1},\dots ,X_{n}}$ and ${\displaystyle X_{1}',\dots ,X_{n}'}$ are independent, with ${\displaystyle X_{i}'}$ and ${\displaystyle X_{i}}$ having the same distribution for all ${\displaystyle i}$. Let ${\displaystyle X=(X_{1},\dots ,X_{n})}$ and ${\displaystyle X^{(i)}=(X_{1},\dots ,X_{i-1},X_{i}',X_{i+1},\dots ,X_{n})}$. Then

${\displaystyle \mathrm {Var} (f(X))\leq {\frac {1}{2}}\sum _{i=1}^{n}E[(f(X)-f(X^{(i)}))^{2}].}$
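A Monte Carlo illustration (the function ${\displaystyle f=\max }$ and the uniform distribution are illustrative choices): both sides of the inequality are estimated by sampling ${\displaystyle X}$ and, for each coordinate ${\displaystyle i}$, the resampled vector ${\displaystyle X^{(i)}}$:

```python
import random
from statistics import variance

# Monte Carlo illustration of the Efron-Stein bound for
# f(X) = max(X_1, ..., X_n) with X_i ~ Uniform(0, 1).
random.seed(0)
n, trials = 10, 10_000

f_vals, es_terms = [], []
for _ in range(trials):
    x = [random.random() for _ in range(n)]
    fx = max(x)
    f_vals.append(fx)
    total = 0.0
    for i in range(n):
        y = x[:]                 # X^(i): replace coordinate i by a fresh copy
        y[i] = random.random()
        total += (fx - max(y)) ** 2
    es_terms.append(total / 2)

var_f = variance(f_vals)                 # left-hand side, estimated
es_bound = sum(es_terms) / trials        # right-hand side, estimated
print(f"Var(f) ~ {var_f:.5f} <= Efron-Stein bound ~ {es_bound:.5f}")
assert var_f <= es_bound
```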

## References

1. Template:Cite web
2. Hoeffding, Wassily (1963). "Probability inequalities for sums of bounded random variables". Journal of the American Statistical Association. 58 (301): 13–30.
3. Bennett, George (1962). "Probability inequalities for the sum of independent random variables". Journal of the American Statistical Association. 57 (297): 33–45.
4. {{#invoke:citation/CS1|citation |CitationClass=book }}
5. Fan, X.; Grama, I.; Liu, Q. (2012). "Hoeffding's inequality for supermartingales". Stochastic Processes and their Applications. 122 (10): 3545–3559.