# Chebyshev's inequality

In probability theory, **Chebyshev's inequality** (also spelled **Tchebysheff's inequality**; Russian: Нера́венство Чебышева) guarantees that in any probability distribution, "nearly all" values are close to the mean. Precisely, no more than 1/*k*^{2} of the distribution's values can be more than *k* standard deviations away from the mean (equivalently, at least 1 − 1/*k*^{2} of the distribution's values lie within *k* standard deviations of the mean). In statistics the rule is often called Chebyshev's theorem. The inequality is widely useful because it applies to completely arbitrary distributions (unknown except for mean and variance); for example, it can be used to prove the weak law of large numbers.

In practical usage, in contrast to the empirical rule, which applies only to normal distributions, Chebyshev's inequality guarantees merely that at least 75% of values lie within two standard deviations of the mean and at least 88.89% within three standard deviations.^{[1]}^{[2]}

The term *Chebyshev's inequality* may also refer to Markov's inequality, especially in the context of analysis.

## History

The theorem is named after Russian mathematician Pafnuty Chebyshev, although it was first formulated by his friend and colleague Irénée-Jules Bienaymé.^{[3]}^{:98} The theorem was first stated without proof by Bienaymé in 1853^{[4]} and later proved by Chebyshev in 1867.^{[5]} His student Andrey Markov provided another proof in his 1884 Ph.D. thesis.^{[6]}

## Statement

Chebyshev's inequality is usually stated for random variables, but can be generalized to a statement about measure spaces.

### Probabilistic statement

Let *X* be an integrable random variable with finite expected value *μ* and finite non-zero variance *σ*^{2}. Then for any real number *k* > 0,

$$\Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.$$

Only the case *k* > 1 provides useful information. When *k* < 1 the right-hand side is greater than one, so the inequality is vacuous, as the probability of any event cannot be greater than one. When *k* = 1 it says only that the probability is less than or equal to one, which is always true.

As an example, using *k* = √2 shows that at least half of the values lie in the interval (*μ* − √2 *σ*, *μ* + √2 *σ*).

Because it can be applied to completely arbitrary distributions (unknown except for mean and variance), the inequality generally gives a poor bound compared to what might be deduced if more aspects are known about the distribution involved.

| *k* | Min % within *k* standard deviations of mean | Max % beyond *k* standard deviations from mean |
|---|---|---|
| 1 | 0% | 100% |
| √2 | 50% | 50% |
| 1.5 | 55.56% | 44.44% |
| 2 | 75% | 25% |
| 3 | 88.8889% | 11.1111% |
| 4 | 93.75% | 6.25% |
| 5 | 96% | 4% |
| 6 | 97.2222% | 2.7778% |
| 7 | 97.9592% | 2.0408% |
| 8 | 98.4375% | 1.5625% |
| 9 | 98.7654% | 1.2346% |
| 10 | 99% | 1% |
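
The table entries follow directly from the bound 1/*k*^{2}. As a minimal sketch (the function name `chebyshev_bounds` is illustrative and not taken from any library), they can be recomputed as follows:

```python
import math


def chebyshev_bounds(k: float) -> tuple[float, float]:
    """Return (min fraction within k SDs, max fraction beyond k SDs)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # The bound 1/k^2 exceeds 1 for k < 1, where it carries no information.
    beyond = min(1.0, 1.0 / k**2)
    return 1.0 - beyond, beyond


for k in (1, math.sqrt(2), 1.5, 2, 3, 4, 5, 10):
    within, beyond = chebyshev_bounds(k)
    print(f"k = {k:.4g}: at least {within:.4%} within, at most {beyond:.4%} beyond")
```

Capping the bound at 1 is a presentational choice here; the inequality itself is still true, only vacuous, for *k* ≤ 1.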

### Measure-theoretic statement

Let (*X*, Σ, μ) be a measure space, and let *f* be an extended real-valued measurable function defined on *X*. Then for any real number *t* > 0 and 0 < *p* < ∞,^{[7]}

$$\mu(\{x \in X : |f(x)| \geq t\}) \leq \frac{1}{t^p} \int_X |f|^p \, d\mu.$$

More generally, if *g* is an extended real-valued measurable function, nonnegative and nondecreasing on the range of *f*, with *g*(*t*) ≠ 0, then^{[citation needed]}

$$\mu(\{x \in X : f(x) \geq t\}) \leq \frac{1}{g(t)} \int_X g \circ f \, d\mu.$$

The previous statement then follows by defining *g*(*s*) as *s*^{*p*} if *s* ≥ 0 and 0 otherwise, and taking |*f*| instead of *f*.

## Example

Suppose we randomly select a journal article from a source with an average of 1000 words per article and a standard deviation of 200 words. We can then infer that the probability that it has between 600 and 1400 words (i.e. within *k* = 2 standard deviations of the mean) must be at least 75%, because by Chebyshev's inequality there is no more than a 1/*k*^{2} = 1/4 chance of being outside that range. But if we additionally know that the distribution is normal, we can say that there is a 75% chance the word count is between 770 and 1230 (a much tighter interval).
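
The figures in this example can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist` for the normal case; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd = 1000.0, 200.0
k = 2.0

# Chebyshev: interval of k standard deviations and its guaranteed coverage.
cheb_low, cheb_high = mean - k * sd, mean + k * sd
cheb_coverage = 1.0 - 1.0 / k**2

# Normal case: central interval that contains exactly 75% of the mass.
z = NormalDist().inv_cdf(0.5 + 0.75 / 2)  # quantile at 87.5%
norm_low, norm_high = mean - z * sd, mean + z * sd

print(f"Chebyshev: at least {cheb_coverage:.0%} in [{cheb_low:.0f}, {cheb_high:.0f}]")
print(f"Normal:    75% in [{norm_low:.0f}, {norm_high:.0f}]")
```

The same 75% coverage needs an interval of half-width 400 words under Chebyshev's inequality but only about 230 words under normality.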

- Note

This example should be treated with caution as the inequality is only stated for probability distributions rather than for finite sample sizes. The inequality has since been extended to apply to finite sample sizes.

## Sharpness of bounds

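As shown in the example above, the theorem typically provides rather loose bounds. However, these bounds cannot in general (remaining true for arbitrary distributions) be improved upon. The bounds are sharp for the following example: for any *k* ≥ 1,

$$X = \begin{cases} -1, & \text{with probability } \frac{1}{2k^2} \\ 0, & \text{with probability } 1 - \frac{1}{k^2} \\ 1, & \text{with probability } \frac{1}{2k^2} \end{cases}$$

For this distribution, the mean is *μ* = 0 and the standard deviation is *σ* = 1/*k*, so

$$\Pr(|X - \mu| \geq k\sigma) = \Pr(|X| \geq 1) = \frac{1}{k^2}.$$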

Chebyshev's inequality is an equality for precisely those distributions that are a linear transformation of this example.
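
As a hedged check of this sharpness claim, the sketch below assumes the three-point distribution on −1, 0 and 1 with probabilities 1/(2*k*^{2}), 1 − 1/*k*^{2} and 1/(2*k*^{2}), and verifies with exact rational arithmetic that the tail probability equals 1/*k*^{2}. The function names are illustrative.

```python
from fractions import Fraction


def extremal_distribution(k):
    """Three-point distribution assumed to attain the Chebyshev bound (k >= 1)."""
    k = Fraction(k)
    tail = 1 / (2 * k**2)
    return {-1: tail, 0: 1 - 2 * tail, 1: tail}


def check_sharpness(k):
    """Return (mean, variance, Pr(|X - mean| >= k * sd)) as exact fractions."""
    k = Fraction(k)
    dist = extremal_distribution(k)
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    # Compare squared deviations with k^2 * variance to avoid square roots.
    tail_prob = sum(p for x, p in dist.items() if (x - mean) ** 2 >= k**2 * var)
    return mean, var, tail_prob


mean, var, tail_prob = check_sharpness(3)
print(mean, var, tail_prob)
```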

## Proof (of the two-sided version)

### Probabilistic proof

Markov's inequality states that for any real-valued random variable *Y* and any positive number *a*, we have Pr(|*Y*| > *a*) ≤ E(|*Y*|)/*a*. One way to prove Chebyshev's inequality is to apply Markov's inequality to the random variable *Y* = (*X* − *μ*)^{2} with *a* = (σ*k*)^{2}.

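It can also be proved directly. For any event *A*, let *I*_{A} be the indicator random variable of *A*, i.e. *I*_{A} equals 1 if *A* occurs and 0 otherwise. Then

$$\begin{aligned} \Pr(|X - \mu| \geq k\sigma) &= \operatorname{E}\left(I_{|X - \mu| \geq k\sigma}\right) = \operatorname{E}\left(I_{[(X - \mu)/(k\sigma)]^2 \geq 1}\right) \\ &\leq \operatorname{E}\left(\left[\frac{X - \mu}{k\sigma}\right]^2\right) = \frac{1}{k^2} \frac{\operatorname{E}((X - \mu)^2)}{\sigma^2} = \frac{1}{k^2}. \end{aligned}$$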

The direct proof shows why the bounds are quite loose in typical cases: the indicator, which equals 1 whenever [(*X* − *μ*)/(*kσ*)]^{2} ≥ 1, is bounded above by [(*X* − *μ*)/(*kσ*)]^{2} itself, and in some cases the latter exceeds 1 by a very wide margin.
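
To illustrate how loose the bound can be, consider an assumed example that is not taken from the text above: an exponential distribution with rate 1, which has mean 1 and standard deviation 1. For *k* ≥ 1 its exact two-sided tail is Pr(*X* ≥ 1 + *k*) = e^{−(1 + *k*)}, since *X* cannot fall below 1 − *k* with positive probability. The sketch compares this with 1/*k*^{2}.

```python
import math


def exponential_tail(k: float) -> float:
    """Exact Pr(|X - 1| >= k) for X ~ Exp(1); formula valid for k >= 1."""
    if k < 1:
        raise ValueError("this closed form assumes k >= 1")
    return math.exp(-(1 + k))


for k in (1, 2, 3, 5):
    exact = exponential_tail(k)
    bound = 1 / k**2
    print(f"k = {k}: exact tail {exact:.5f}, Chebyshev bound {bound:.5f}")
```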

### Measure-theoretic proof

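Fix *t* and let *A*_{t} be defined as *A*_{t} = {*x* ∈ *X* : *f*(*x*) ≥ *t*}, and let 1_{*A*_{t}} be the indicator function of the set *A*_{t}. Then, it is easy to check that, for any *x*,

$$0 \leq g(t)\, 1_{A_t}(x) \leq g(f(x))\, 1_{A_t}(x),$$

since *g* is nondecreasing on the range of *f*, and therefore,

$$g(t)\, \mu(A_t) = \int_X g(t)\, 1_{A_t} \, d\mu \leq \int_{A_t} g \circ f \, d\mu \leq \int_X g \circ f \, d\mu.$$

The desired inequality follows from dividing the above inequality by *g*(*t*).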

## Extensions

Several extensions of Chebyshev's inequality have been developed.

### Asymmetric two-sided case

An asymmetric two-sided version of this inequality is also known.^{[8]}

When the distribution is asymmetric or is unknown,

$$\Pr(k_1 < X < k_2) \geq \frac{4\left[(\mu - k_1)(k_2 - \mu) - \sigma^2\right]}{(k_2 - k_1)^2},$$

where *σ*^{2} is the variance and *μ* is the mean.

### Bivariate case

A version for the bivariate case is known.^{[9]}

Let *X*_{1}, *X*_{2} be two random variables with means *μ*_{1}, *μ*_{2} and finite variances *σ*_{1}, *σ*_{2} respectively. Then

where for *i* = 1, 2,

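Berge derived an inequality for two correlated variables *X*_{1}, *X*_{2}.^{[10]} Let *ρ* be the correlation coefficient between *X*_{1} and *X*_{2} and let *σ*_{i}^{2} be the variance of *X*_{i}. Then

$$\Pr\left(\bigcap_{i=1}^{2} \left[\frac{|X_i - \mu_i|}{\sigma_i} < k\right]\right) \geq 1 - \frac{1 + \sqrt{1 - \rho^2}}{k^2}.$$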

Lal later obtained an alternative bound^{[11]}

Isii derived a further generalisation.^{[12]} Let

and define:

There are now three cases.

**Case B:** If the conditions in case A are not met but *k*_{1}*k*_{2} ≥ 1 and

**Case C:** If none of the conditions in cases A or B are satisfied then there is no universal bound other than 1.

### Multivariate case

The general case is known as the Birnbaum–Raymond–Zuckerman inequality after the authors who proved it for two dimensions.^{[13]}

where *X*_{i} is the *i*-th random variable, *μ*_{i} is the *i*-th mean and *σ*_{i}^{2} is the *i*-th variance.

If the variables are independent this inequality can be sharpened.^{[14]}

Olkin and Pratt derived an inequality for *n* correlated variables.^{[15]}

where the sum is taken over the *n* variables and

where *ρ*_{ij} is the correlation between *X*_{i} and *X*_{j}.

Olkin and Pratt's inequality was subsequently generalised by Godwin.^{[16]}

### Vector version

Ferentinos^{[9]} has shown that for a vector *X* = (*x*_{1}, *x*_{2}, ...) with mean *μ* = (*μ*_{1}, *μ*_{2}, ...), variance *σ*^{2} = (*σ*_{1}^{2}, *σ*_{2}^{2}, ...) and an arbitrary norm ‖ ⋅ ‖,

$$\Pr(\|X - \mu\| \geq k \|\sigma\|) \leq \frac{1}{k^2}.$$

A second related inequality has also been derived by Chen.^{[17]} Let *N* be the dimension of the stochastic vector *X* and let E(*X*) be the mean of *X*. Let *S* be the covariance matrix and *k* > 0. Then

$$\Pr\left((X - \operatorname{E}(X))^T S^{-1} (X - \operatorname{E}(X)) < k\right) \geq 1 - \frac{N}{k},$$

where *Y*^{T} is the transpose of *Y*.

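### Infinite dimensions

There is a straightforward extension of the vector version of Chebyshev's inequality to infinite-dimensional settings. Let *X* be a random variable which takes values in a Fréchet space 𝒳 (equipped with seminorms ‖ ⋅ ‖_{α}). This includes most common settings of vector-valued random variables, e.g., when 𝒳 is a Banach space (equipped with a single norm), a Hilbert space, or the finite-dimensional setting as described above.

Suppose that *X* is of "strong order two", meaning that

$$\operatorname{E}\left(\|X\|_\alpha^2\right) < \infty$$

for every seminorm ‖ ⋅ ‖_{α}. This is a generalization of the requirement that *X* have finite variance, and is necessary for this strong form of Chebyshev's inequality in infinite dimensions. The terminology "strong order two" is due to Vakhania.^{[18]}

Let *μ* ∈ 𝒳 be the Pettis integral of *X* (i.e., the vector generalization of the mean), and let

$$\sigma_\alpha := \sqrt{\operatorname{E}\|X - \mu\|_\alpha^2}$$

be the standard deviation with respect to the seminorm ‖ ⋅ ‖_{α}. In this setting we can state the following: for all *k* > 0,

$$\Pr\left(\|X - \mu\|_\alpha \geq k\sigma_\alpha\right) \leq \frac{1}{k^2}.$$

**Proof.** The proof is straightforward, and essentially the same as the finitary version. If *σ*_{α} = 0, then *X* is constant (and equal to *μ*) almost surely, so the inequality is trivial.

If

$$\|X - \mu\|_\alpha \geq k\sigma_\alpha$$

then ‖*X* − *μ*‖_{α} > 0, so we may safely divide by ‖*X* − *μ*‖_{α}. The crucial trick in Chebyshev's inequality is to recognize that $1 = \|X - \mu\|_\alpha^2 / \|X - \mu\|_\alpha^2$.

The following calculations complete the proof:

$$\begin{aligned}
\Pr\left(\|X - \mu\|_\alpha \geq k\sigma_\alpha\right) &= \int_\Omega \mathbf{1}_{\|X - \mu\|_\alpha \geq k\sigma_\alpha} \, d\Pr \\
&= \int_\Omega \left(\frac{\|X - \mu\|_\alpha^2}{\|X - \mu\|_\alpha^2}\right) \mathbf{1}_{\|X - \mu\|_\alpha \geq k\sigma_\alpha} \, d\Pr \\
&\leq \int_\Omega \left(\frac{\|X - \mu\|_\alpha^2}{(k\sigma_\alpha)^2}\right) \mathbf{1}_{\|X - \mu\|_\alpha \geq k\sigma_\alpha} \, d\Pr \\
&\leq \frac{1}{k^2\sigma_\alpha^2} \int_\Omega \|X - \mu\|_\alpha^2 \, d\Pr \\
&= \frac{1}{k^2\sigma_\alpha^2} \operatorname{E}\|X - \mu\|_\alpha^2 = \frac{1}{k^2}.
\end{aligned}$$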

### Higher moments

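An extension to higher moments is also possible:

$$\Pr\left(|X - \operatorname{E}(X)| \geq k \operatorname{E}\left(|X - \operatorname{E}(X)|^n\right)^{1/n}\right) \leq \frac{1}{k^n}, \qquad k > 0,\ n \geq 2.$$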

### Exponential version

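A related inequality sometimes known as the exponential Chebyshev's inequality^{[19]} is the inequality

$$\Pr(X \geq \varepsilon) \leq e^{-t\varepsilon} \operatorname{E}\left(e^{tX}\right), \qquad t > 0.$$

Let *K*(*x*, *t*) be the cumulant generating function,

$$K(x, t) = \log\left(\operatorname{E}\left(e^{tx}\right)\right).$$

Taking the Legendre–Fenchel transformation of *K*(*x*, *t*) and using the exponential Chebyshev's inequality we have

$$-\log(\Pr(X \geq \varepsilon)) \geq \sup_t \left(t\varepsilon - K(x, t)\right).$$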

This inequality may be used to obtain exponential inequalities for unbounded variables.^{[20]}

### Inequalities for bounded variables

If P(*x*) has finite support based on the interval [*a*, *b*], let *M* = max(|*a*|, |*b*|) where |*x*| is the absolute value of *x*. If the mean of P(*x*) is zero then for all *k* > 0^{[21]}

$$\frac{\operatorname{E}(|X|^r) - k^r}{M^r} \leq \Pr(|X| \geq k) \leq \frac{\operatorname{E}(|X|^r)}{k^r}.$$

The second of these inequalities with *r* = 2 is the Chebyshev bound. The first provides a lower bound for the value of P(*x*).

Sharp bounds for a bounded variate have been derived by Niemitalo^{[22]}

Let 0 ≤ *X* ≤ *M* where *M* > 0. Then

**Case 1:**

**Case 2:**

**Case 3:**

## Finite samples

Saw *et al.* extended Chebyshev's inequality to cases where the population mean and variance are not known but are instead replaced by their sample estimates.^{[23]}

where *N* is the sample size, *m* is the sample mean, *k* is a constant and *s* is the sample standard deviation. g(*x*) is defined as follows:

Let *x* ≥ 1, *Q* = *N* + 1, and *R* be the greatest integer less than *Q* / *x*. Let

Now

This inequality holds when the population moments do not exist and when the sample is weakly exchangeably distributed.

Kabán gives a somewhat less complex version of this inequality.^{[24]}

If the standard deviation is a multiple of the mean then a further inequality can be derived,^{[24]}

A table of values for the Saw–Yang–Mo inequality for finite sample sizes (*n* < 100) has been determined by Konijn.^{[25]}

For fixed *N* and large *m* the Saw–Yang–Mo inequality is approximately^{[26]}

Beasley *et al.* have suggested a modification of this inequality^{[26]}

In empirical testing this modification is conservative but appears to have low statistical power. Its theoretical basis currently remains unexplored.

### Dependence on sample size

The bounds these inequalities give on a finite sample are less tight than those the Chebyshev inequality gives for a distribution. To illustrate this let the sample size *n* = 100 and let *k* = 3. Chebyshev's inequality states that at most approximately 11.11% of the distribution will lie outside these limits. Kabán's version of the inequality for a finite sample states that at most approximately 12.05% of the sample lies outside these limits. The dependence of the confidence intervals on sample size is further illustrated below.

For *N* = 10, the 95% confidence interval is approximately ±13.5789 standard deviations.

For *N* = 100 the 95% confidence interval is approximately ±4.9595 standard deviations; the 99% confidence interval is approximately ±140.0 standard deviations.

For *N* = 500 the 95% confidence interval is approximately ±4.5574 standard deviations; the 99% confidence interval is approximately ±11.1620 standard deviations.

For *N* = 1000 the 95% and 99% confidence intervals are approximately ±4.5141 and approximately ±10.5330 standard deviations respectively.

The Chebyshev inequality for the distribution gives 95% and 99% confidence intervals of approximately ±4.472 standard deviations and ±10 standard deviations respectively.
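
The distribution-level figures quoted here come from solving 1/*k*^{2} = *α* for *k*, where *α* is one minus the confidence level. A minimal sketch (the function name is illustrative):

```python
import math


def chebyshev_half_width(confidence: float) -> float:
    """Number of standard deviations k such that 1/k^2 equals 1 - confidence."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1 - confidence
    return 1 / math.sqrt(alpha)


for level in (0.95, 0.99):
    print(f"{level:.0%} interval: +/-{chebyshev_half_width(level):.3f} standard deviations")
```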

### Comparative bounds

Although Chebyshev's inequality is the best possible bound for an arbitrary distribution, this is not necessarily true for finite samples. Samuelson's inequality states that all values of a sample will lie within √(*N* − 1) standard deviations of the mean. Chebyshev's bound improves as the sample size increases.

When *N* = 10, Samuelson's inequality states that all members of the sample lie within 3 standard deviations of the mean: in contrast Chebyshev's states that 95% of the sample lies within 13.5789 standard deviations of the mean.

When *N* = 100, Samuelson's inequality states that all members of the sample lie within approximately 9.9499 standard deviations of the mean: Chebyshev's states that 99% of the sample lies within 140.0 standard deviations of the mean.

When *N* = 500, Samuelson's inequality states that all members of the sample lie within approximately 22.3383 standard deviations of the mean: Chebyshev's states that 99% of the sample lies within 11.1620 standard deviations of the mean.

It is likely that better bounds for finite samples than these exist.
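
The Samuelson limits quoted above can be reproduced directly. The sketch below assumes the usual statement of Samuelson's inequality, in which the standard deviation is computed with divisor *N* (Python's `statistics.pstdev`), and shows an extreme sample that attains the limit. The function names are illustrative.

```python
import math
import statistics


def samuelson_limit(n: int) -> float:
    """Largest possible |x - mean| / s in a sample of size n (s uses divisor n)."""
    return math.sqrt(n - 1)


def max_standardized_deviation(sample) -> float:
    """Largest absolute deviation from the mean, in units of the divisor-n SD."""
    m = statistics.fmean(sample)
    s = statistics.pstdev(sample)
    return max(abs(x - m) for x in sample) / s


# Nine equal values and one outlier: the outlier sits exactly at the limit.
extreme = [0.0] * 9 + [1.0]
print(max_standardized_deviation(extreme), samuelson_limit(10))

for n in (10, 100, 500):
    print(n, round(samuelson_limit(n), 4))
```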

## Sharpened bounds

Chebyshev's inequality is important because of its applicability to any distribution. As a result of its generality it may not (and usually does not) provide as sharp a bound as alternative methods that can be used if the distribution of the random variable is known. To improve the sharpness of the bounds provided by Chebyshev's inequality a number of methods have been developed.

### Standardised variables

Sharpened bounds can be derived by first standardising the random variable.^{[27]}

Let *X* be a random variable with finite variance Var(*X*). Let *Z* be the standardised form defined as

$$Z = \frac{X - \operatorname{E}(X)}{\operatorname{Var}(X)^{1/2}}.$$

Cantelli's lemma is then

$$\Pr(Z \geq k) \leq \frac{1}{1 + k^2}.$$

This inequality is sharp and is attained when *Z* takes the values *k* and −1/*k* with probability 1/(1 + *k*^{2}) and *k*^{2}/(1 + *k*^{2}) respectively.

If *k* > 1 and the distribution of *X* is symmetric then we have

$$\Pr(Z \geq k) \leq \frac{1}{2k^2}.$$

Equality holds if and only if *Z* = −*k*, 0 or *k* with probabilities 1/(2*k*^{2}), 1 − 1/*k*^{2} and 1/(2*k*^{2}) respectively.^{[27]}

An extension to a two-sided inequality is also possible. Let *u*, *v* > 0. Then we have^{[27]}

$$\Pr(Z \leq -u \text{ or } Z \geq v) \leq \frac{4 + (u - v)^2}{(u + v)^2}.$$

### Semivariances

An alternative method of obtaining sharper bounds is through the use of semivariances (partial moments). The upper (*σ*_{+}^{2}) and lower (*σ*_{−}^{2}) semivariances are defined as

$$\sigma_+^2 = \frac{\sum_{x > m} (x - m)^2}{n - 1}, \qquad \sigma_-^2 = \frac{\sum_{x < m} (m - x)^2}{n - 1},$$

where *m* is the arithmetic mean of the sample, *n* is the number of elements in the sample and the sum for the upper (lower) semivariance is taken over the elements greater (less) than the mean.

The variance of the sample is the sum of the two semivariances:

$$\sigma^2 = \sigma_+^2 + \sigma_-^2.$$
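
A short sketch of the decomposition, assuming both semivariances use the divisor *n* − 1 so that their sum matches the usual sample variance (`statistics.variance`); the function name is illustrative:

```python
import statistics


def semivariances(sample):
    """Return (lower, upper) semivariances about the sample mean, divisor n - 1."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    m = statistics.fmean(sample)
    lower = sum((x - m) ** 2 for x in sample if x < m) / (n - 1)
    upper = sum((x - m) ** 2 for x in sample if x > m) / (n - 1)
    return lower, upper


data = [1.0, 2.0, 4.0, 9.0]
lower, upper = semivariances(data)
print(lower, upper, lower + upper, statistics.variance(data))
```

Observations equal to the mean contribute zero to either sum, so it does not matter which side they are assigned to.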

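In terms of the lower semivariance Chebyshev's inequality can be written^{[28]}

$$\Pr(x \leq m - a\sigma_-) \leq \frac{1}{a^2}.$$

Putting

$$a = \frac{k\sigma}{\sigma_-},$$

Chebyshev's inequality can now be written

$$\Pr(x \leq m - k\sigma) \leq \frac{1}{k^2} \frac{\sigma_-^2}{\sigma^2}.$$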

A similar result can also be derived for the upper semivariance.

If we put

Chebyshev's inequality can be written

Because *σ*_{u}^{2} ≤ *σ*^{2}, use of the semivariance sharpens the original inequality.

If the distribution is known to be symmetric, then

$$\sigma_+^2 = \sigma_-^2 = \frac{1}{2}\sigma^2$$

and

$$\Pr(x \leq m - k\sigma) \leq \frac{1}{2k^2}.$$

This result agrees with that derived using standardised variables.

- Note
- The inequality with the lower semivariance has been found to be of use in estimating downside risk in finance and agriculture.^{[28]}^{[29]}^{[30]}

### Selberg's inequality

Selberg derived an inequality for *P*(*x*) when *a* ≤ *x* ≤ *b*.^{[31]} To simplify the notation let

where

and

The result of this linear transformation is to make *P*(*a* ≤ *X* ≤ *b*) equal to *P*(|*Y*| ≤ *k*).

The mean (*μ*_{X}) and variance (*σ*_{X}) of *X* are related to the mean (*μ*_{Y}) and variance (*σ*_{Y}) of *Y*:

With this notation Selberg's inequality states that

These are known to be the best possible bounds.^{[32]}

### Cantelli's inequality

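Cantelli's inequality,^{[33]} due to Francesco Paolo Cantelli, states that for a real random variable *X* with mean *μ* and variance *σ*^{2},

$$\Pr(X - \mu \geq a) \leq \frac{\sigma^2}{\sigma^2 + a^2},$$

where *a* ≥ 0.

This inequality can be used to prove a one-tailed variant of Chebyshev's inequality with *k* > 0:^{[34]}

$$\Pr(X - \mu \geq k\sigma) \leq \frac{1}{1 + k^2}.$$

The bound on the one-tailed variant is known to be sharp. To see this, consider the random variable *X* that takes the values

$$X = 1 \text{ with probability } \frac{\sigma^2}{1 + \sigma^2}, \qquad X = -\sigma^2 \text{ with probability } \frac{1}{1 + \sigma^2}.$$

Then *E*(*X*) = 0 and *E*(*X*^{2}) = *σ*^{2} and *P*(*X* < 1) = 1 / (1 + *σ*^{2}).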
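
The stated moments can be verified with exact arithmetic. The sketch assumes the two-point distribution that takes the value 1 with probability *σ*^{2}/(1 + *σ*^{2}) and the value −*σ*^{2} with probability 1/(1 + *σ*^{2}); this choice reproduces the moments quoted above. The function names are illustrative.

```python
from fractions import Fraction


def two_point_distribution(variance):
    """Assumed extremal distribution for the one-tailed bound, as {value: probability}."""
    variance = Fraction(variance)
    p_one = variance / (1 + variance)
    return {Fraction(1): p_one, -variance: 1 - p_one}


def first_two_moments(dist):
    """Return (E[X], E[X^2]) for a finite distribution."""
    mean = sum(x * p for x, p in dist.items())
    second = sum(x * x * p for x, p in dist.items())
    return mean, second


var = Fraction(3, 2)
dist = two_point_distribution(var)
mean, second = first_two_moments(dist)
p_below_one = sum(p for x, p in dist.items() if x < 1)
print(mean, second, p_below_one)
```

With *kσ* = 1, the upper tail Pr(*X* ≥ 1) equals 1/(1 + *k*^{2}) exactly, which is what makes the one-tailed bound sharp.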

- An application – distance between the mean and the median

The one-sided variant can be used to prove the proposition that for probability distributions having an expected value and a median, the mean and the median can never differ from each other by more than one standard deviation. To express this in symbols let *μ*, *ν*, and *σ* be respectively the mean, the median, and the standard deviation. Then

$$|\mu - \nu| \leq \sigma.$$

There is no need to assume that the variance is finite because this inequality is trivially true if the variance is infinite.

The proof is as follows. Setting *k* = 1 in the statement for the one-sided inequality gives:

$$\Pr(X - \mu \geq \sigma) \leq \frac{1}{2} \implies \Pr(X \geq \mu + \sigma) \leq \frac{1}{2}.$$

Changing the sign of *X* and of *μ*, we get

$$\Pr(X \leq \mu - \sigma) \leq \frac{1}{2}.$$

Thus the median is within one standard deviation of the mean.

A proof using Jensen's inequality also exists.
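
The mean and median bound can be illustrated on an empirical distribution. The sketch below uses a made-up, strongly skewed data set and the divisor-*n* standard deviation (`statistics.pstdev`), treating the data as the whole distribution; names and data are illustrative.

```python
import statistics


def mean_median_gap(data):
    """Return (|mean - median|, population standard deviation) for the data."""
    mu = statistics.fmean(data)
    nu = statistics.median(data)
    sigma = statistics.pstdev(data)
    return abs(mu - nu), sigma


skewed = [1, 1, 1, 2, 2, 3, 50]  # hypothetical sample with one large outlier
gap, sigma = mean_median_gap(skewed)
print(f"|mean - median| = {gap:.3f}, standard deviation = {sigma:.3f}")
```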

### Bhattacharyya's inequality

Bhattacharyya^{[35]} extended Cantelli's inequality using the third and fourth moments of the distribution.

Let *μ* = 0 and *σ*^{2} be the variance. Let γ = *E*(*X*^{3}) / *σ*^{3} and κ = *E*(*X*^{4}) / *σ*^{4}.

If *k*^{2} − *k*γ − 1 > 0 then

The necessity of *k*^{2} − *k*γ − 1 > 0 requires that *k* be reasonably large.

### Mitzenmacher and Upfal's inequality

Mitzenmacher and Upfal^{[36]} note that

for any real *k* > 0 and that

is the *k*^{th} central moment. They then show that for *t* > 0

For *k* = 2 we obtain Chebyshev's inequality. For *t* ≥ 1, *k* > 2 and assuming that the *k*^{th} moment exists, this bound is tighter than Chebyshev's inequality.

## Related inequalities

Several other related inequalities are also known.

### Zelen's inequality

Zelen has shown that^{[37]}

with

where Template:Mvar is the Template:Mvar-th moment and Template:Mvar is the standard deviation.

### He, Zhang and Zhang's inequality

For any collection of Template:Mvar nonnegative independent random variables Template:Mvar^{[38]}

### Hoeffding's lemma

Let *X* be a random variable with *a* ≤ *X* ≤ *b* and E[*X*] = 0. Then for any *s* > 0, we have

$$\operatorname{E}\left[e^{sX}\right] \leq e^{\frac{1}{8} s^2 (b - a)^2}.$$
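
As a hedged numerical illustration, take the standard form of Hoeffding's lemma, E[e^{*sX*}] ≤ exp(*s*^{2}(*b* − *a*)^{2}/8), and a Rademacher variable (*X* = ±1 with probability 1/2 each), for which E[e^{*sX*}] = cosh(*s*) and the bound becomes exp(*s*^{2}/2). The function names are illustrative.

```python
import math


def hoeffding_lemma_bound(s: float, a: float, b: float) -> float:
    """Upper bound exp(s^2 (b - a)^2 / 8) on E[exp(sX)] for a <= X <= b, E[X] = 0."""
    return math.exp(s * s * (b - a) ** 2 / 8)


def rademacher_mgf(s: float) -> float:
    """Exact E[exp(sX)] when X is +1 or -1 with probability 1/2 each."""
    return math.cosh(s)


for s in (0.1, 0.5, 1.0, 2.0, 5.0):
    print(s, rademacher_mgf(s), hoeffding_lemma_bound(s, -1.0, 1.0))
```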

### Van Zuijlen's bound

Let *X*_{i} be a set of independent Rademacher random variables: Pr(*X*_{i} = 1) = Pr(*X*_{i} = −1) = 0.5. Then^{[39]}

$$\Pr\left(\left|\frac{\sum_{i=1}^{n} X_i}{\sqrt{n}}\right| \leq 1\right) \geq 0.5.$$

The bound is sharp and better than that which can be derived from the normal distribution (approximately Pr > 0.31).
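
For small *n* the probability can be computed exactly from the binomial distribution. The sketch assumes the bound in question is Pr(|*X*_{1} + … + *X*_{n}| ≤ √*n*) ≥ 1/2 and uses integer arithmetic throughout; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def prob_sum_within_sqrt_n(n: int) -> Fraction:
    """Exact Pr(|S_n| <= sqrt(n)) for a sum S_n of n independent Rademacher variables."""
    # With h variables equal to +1, the sum is 2h - n; compare squares to stay in integers.
    favourable = sum(comb(n, h) for h in range(n + 1) if (2 * h - n) ** 2 <= n)
    return Fraction(favourable, 2**n)


for n in (1, 2, 3, 4, 5, 6, 10, 50):
    print(n, float(prob_sum_within_sqrt_n(n)))
```

The case *n* = 2 gives exactly 1/2, which is consistent with the statement that the bound is sharp.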

## Unimodal distributions

A distribution is unimodal at *ν* if its cumulative distribution function *F* is convex on (−∞, *ν*) and concave on (*ν*, ∞).^{[40]} An empirical distribution can be tested for unimodality with the dip test.^{[41]}

In 1823 Gauss showed that for a unimodal distribution with a mode of zero^{[42]}

$$\Pr(|X| \geq k) \leq \frac{4 \operatorname{E}(X^2)}{9k^2} \quad \text{if} \quad k^2 \geq \frac{4}{3} \operatorname{E}(X^2),$$

$$\Pr(|X| \geq k) \leq 1 - \frac{k}{\sqrt{3 \operatorname{E}(X^2)}} \quad \text{if} \quad k^2 \leq \frac{4}{3} \operatorname{E}(X^2).$$

If the second condition holds then the second bound is always less than or equal to the first.^{[citation needed]}

If the mode (*ν*) is not zero and the mean (*μ*) and standard deviation (*σ*) are both finite then, denoting the root mean square deviation from the mode by *ω*, we have^{[citation needed]}

$$\sigma \leq \omega \leq 2\sigma$$

and

$$|\nu - \mu| \leq \sqrt{\tfrac{3}{4}}\, \omega.$$

Winkler in 1866 extended Gauss' inequality to *r*^{th} moments ^{[43]} where *r* > 0 and the distribution is unimodal with a mode of zero:

Gauss' bound has subsequently been sharpened and extended to apply to departures from the mean rather than the mode by the Vysochanskiï–Petunin inequality.

The Vysochanskiï–Petunin inequality has been extended by Dharmadhikari and Joag-Dev^{[44]}

where *s* is a constant satisfying both *s* > *r* + 1 and *s*(*s* − *r* − 1) = *r*^{r} and *r* > 0.

It can be shown that these inequalities are the best possible and that further sharpening of the bounds requires that additional restrictions be placed on the distributions.

### Unimodal symmetrical distributions

The bounds on this inequality can also be sharpened if the distribution is both unimodal and symmetrical.^{[45]} An empirical distribution can be tested for symmetry with a number of tests including McWilliam's R*.^{[46]} It is known that the variance of a unimodal symmetrical distribution with finite support [*a*, *b*] is less than or equal to ( *b* − *a* )^{2} / 12.^{[47]}

Let the distribution be supported on the finite interval [ −*N*, *N* ] and the variance be finite. Let the mode of the distribution be zero and rescale the variance to 1. Let *k* > 0 and assume *k* < 2*N*/3. Then^{[45]}

If 0 < *k* ≤ 2 / √3 the bounds are reached with the density^{[45]}

If 2 / √3 < *k* ≤ 2*N* / 3 the bounds are attained by the distribution

where *β*_{k} = 4 / 3*k*^{2}, *δ*_{0} is the Dirac delta function and where

The existence of these densities shows that the bounds are optimal. Since *N* is arbitrary these bounds apply to any value of *N*.

The Camp–Meidell inequality is a related inequality.^{[48]} For an absolutely continuous unimodal and symmetrical distribution

The second of these inequalities is the same as the Vysochanskiï–Petunin inequality.

DasGupta has shown that if the distribution is known to be normal^{[49]}

$$\Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{3k^2}.$$

### Notes

- Effects of symmetry and unimodality

Symmetry of the distribution decreases the inequality's bounds by a factor of 2 while unimodality sharpens the bounds by a factor of 4/9.

- Unimodal distributions

Because the mean and the mode in a unimodal distribution differ by at most √3 standard deviations,^{[50]} at most 5% of a symmetrical unimodal distribution lies outside (2√10 + 3√3)/3 standard deviations of the mean (approximately 3.840 standard deviations). This is sharper than the bound provided by the Chebyshev inequality (approximately 4.472 standard deviations).

These bounds on the mean are less sharp than those that can be derived from symmetry of the distribution alone, which shows that at most 5% of the distribution lies outside approximately 3.162 standard deviations of the mean. The Vysochanskiï–Petunin inequality further sharpens this bound by showing that for such a distribution at most 5% of the distribution lies outside 4√5/3 (approximately 2.981) standard deviations of the mean.

- Symmetrical unimodal distributions

For any symmetrical unimodal distribution:

- approximately 5.784% of the distribution lies outside 1.96 standard deviations of the mode
- 5% of the distribution lies outside 2√10/3 (approximately 2.11) standard deviations of the mode

DasGupta's inequality states that for a normal distribution at least 95% lies within approximately 2.582 standard deviations of the mean. This is less sharp than the true figure (approximately 1.96 standard deviations of the mean).
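
The multipliers quoted in these notes all come from solving a bound of the form *c*/*k*^{2} = 0.05 for *k*. The sketch below assumes the constants *c* implied by the figures above (1 for Chebyshev, 1/2 with symmetry, 4/9 for Vysochanskiï–Petunin, 2/9 for symmetric unimodal distributions about the mode, and 1/3 for DasGupta's normal bound); the names are illustrative.

```python
import math

ALPHA = 0.05  # tail probability used throughout the notes


def k_for_tail(alpha: float, constant: float) -> float:
    """Solve constant / k^2 = alpha for k."""
    return math.sqrt(constant / alpha)


multipliers = {
    "Chebyshev, 1/k^2": k_for_tail(ALPHA, 1.0),
    "symmetric, 1/(2k^2)": k_for_tail(ALPHA, 1 / 2),
    "Vysochanskii-Petunin, 4/(9k^2)": k_for_tail(ALPHA, 4 / 9),
    "symmetric unimodal about the mode, 2/(9k^2)": k_for_tail(ALPHA, 2 / 9),
    "DasGupta (normal), 1/(3k^2)": k_for_tail(ALPHA, 1 / 3),
}

for name, k in multipliers.items():
    print(f"{name}: k = {k:.3f}")

# Tail bound at 1.96 standard deviations from the mode, symmetric unimodal case.
tail_at_196 = (2 / 9) / 1.96**2
print(f"tail bound at 1.96: {tail_at_196:.5f}")
```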

## Bounds for specific distributions

DasGupta has determined a set of best possible bounds for a normal distribution for this inequality.^{[49]}

Steliga and Szynal have extended these bounds to the Pareto distribution.^{[8]}

## Zero means

When the mean (*μ*) is zero Chebyshev's inequality takes a simple form. Let *σ*^{2} be the variance. Then

$$\Pr(|X| \geq 1) \leq \sigma^2.$$

With the same conditions Cantelli's inequality takes the form

$$\Pr(X \geq 1) \leq \frac{\sigma^2}{1 + \sigma^2}.$$

### Unit variance

If in addition *E*( *X*^{2} ) = 1 and *E*( *X*^{4} ) = *ψ* then for any 0 ≤ *ε* ≤ 1^{[51]}

The first inequality is sharp.

It is also known that for a random variable obeying the above conditions that^{[52]}

where

It is also known that^{[52]}

The value of C_{0} is optimal and the bounds are sharp if

If

then the sharp bound is

## Integral Chebyshev inequality

There is a second (less well known) inequality also named after Chebyshev^{[53]}

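If *f*, *g* : [*a*, *b*] → **R** are two monotonic functions of the same monotonicity, then

$$\frac{1}{b - a} \int_a^b f(x) g(x) \, dx \geq \left[\frac{1}{b - a} \int_a^b f(x) \, dx\right] \left[\frac{1}{b - a} \int_a^b g(x) \, dx\right].$$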

If *f* and *g* are of opposite monotonicity, then the above inequality works in the reverse way.
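
A numerical sanity check, assuming the usual form of the integral inequality (the mean of the product is at least the product of the means over [*a*, *b*]). The midpoint-rule helper `mean_value` and the test functions are illustrative choices.

```python
def mean_value(func, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate (1 / (b - a)) * integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    total = sum(func(a + (i + 0.5) * h) for i in range(steps))
    return total * h / (b - a)


def f(x):
    return x        # increasing on [0, 1]


def g(x):
    return x * x    # increasing on [0, 1]


def g_rev(x):
    return -x * x   # decreasing on [0, 1]


same_lhs = mean_value(lambda x: f(x) * g(x), 0.0, 1.0)
same_rhs = mean_value(f, 0.0, 1.0) * mean_value(g, 0.0, 1.0)
opp_lhs = mean_value(lambda x: f(x) * g_rev(x), 0.0, 1.0)
opp_rhs = mean_value(f, 0.0, 1.0) * mean_value(g_rev, 0.0, 1.0)

print(same_lhs, same_rhs)  # same monotonicity: left side should be the larger
print(opp_lhs, opp_rhs)    # opposite monotonicity: inequality reverses
```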

This inequality is related to Jensen's inequality,^{[54]} Kantorovich's inequality,^{[55]} the Hermite–Hadamard inequality^{[55]} and Walter's conjecture.^{[56]}

### Other inequalities

There are also a number of other inequalities associated with Chebyshev

## Haldane's transformation

One use of Chebyshev's inequality in applications is to create confidence intervals for variates with an unknown distribution. Haldane noted,^{[57]} using an equation derived by Kendall,^{[58]} that if a variate (*x*) has a zero mean, unit variance and both finite skewness (*γ*) and kurtosis (*κ*) then the variate can be converted to a normally distributed standard score (*z*):

$$z = x - \frac{\gamma}{6}(x^2 - 1) + \frac{x}{72}\left[2\gamma^2(4x^2 - 7) - 3\kappa(x^2 - 3)\right] + \cdots$$

This transformation may be useful as an alternative to Chebyshev's inequality or as an adjunct to it for deriving confidence intervals for variates with unknown distributions.

While this transformation may be useful for moderately skewed and/or kurtotic distributions, it performs poorly when the distribution is markedly skewed and/or kurtotic.

## Chernoff bounds

If the random variables may also be assumed to be independently distributed it is possible to obtain sharper bounds. Let δ > 0. Then^{[59]}

With this inequality it can be shown that

where *μ* is the mean of the distribution. Further discussion may be found in the article on Chernoff bounds.

## Notes

The Environmental Protection Agency has suggested best practices for the use of Chebyshev's inequality for estimating confidence intervals.^{[60]} This caution appears to be justified, as its use in this context may be seriously misleading.

## See also

- Chernoff bound — a bound on sums of random variables
- Cornish–Fisher expansion
- Eaton's inequality
- Hoeffding's inequality — an exponential bound on the sum of a series of random variables
- Kolmogorov's inequality
- Proof of the weak law of large numbers using Chebyshev's inequality
- Le Cam's theorem
- Markov inequality
- Multidimensional Chebyshev's inequality
- Paley–Zygmund inequality
- Vysochanskiï–Petunin inequality — a stronger result applicable to unimodal probability distributions

## References

- [4] Bienaymé I.-J. (1853) Considérations à l'appui de la découverte de Laplace. Comptes Rendus de l'Académie des Sciences 37: 309–324
- [6] Markov A. (1884) On certain applications of algebraic continued fractions, Ph.D. thesis, St. Petersburg
- [9] Ferentinos K. (1982) "On Tchebycheff type inequalities". *Trabajos Estadist Investigacion Oper*, 33: 125–132
- [10] Berge P. O. (1938) A note on a form of Tchebycheff's theorem for two variables. Biometrika 29, 405–406
- [11] Lal D. N. (1955) A note on a form of Tchebycheff's inequality for two or more variables. Sankhya 15(3):317–320
- [12] Isii K. (1959) On a method for generalizations of Tchebycheff's inequality. Ann Inst Stat Math 10: 65–88
- [16] Godwin H. J. (1964) Inequalities on distribution functions. New York, Hafner Pub. Co.
- [18] Vakhania, Nikolai Nikolaevich. Probability distributions on linear spaces. New York: North Holland, 1981.
- [19] Section 2.1
- [21] Dufour (2003) Properties of moments of random variables
- [22] Niemitalo O. (2012) One-sided Chebyshev-type inequalities for bounded probability distributions.
- [23] doi:10.2307/2683249
- [30] Neave E. H., Ross M. N., Yang J. (2008) Distinguishing upside potential from downside risk. Management Research News. 32(1):26–36
- [33] Cantelli F. (1910) Intorno ad un teorema fondamentale della teoria del rischio. Bolletino dell Associazione degli Attuari Italiani
- [34] Grimmett and Stirzaker, problem 7.11.9. Several proofs of this result can be found in Chebyshev's Inequalities by A. G. McDowell.
- [37] Zelen M. (1954) Bounds on a distribution function that are functions of moments to order four. J Res Nat Bur Stand 53:377–381
- [38] He S., Zhang J., Zhang S. (2010) Bounding probability of small deviation: A fourth moment approach. Mathematics of Operations Research 35(1):208–232. doi: 10.1287/moor.1090.0438
- [39] Martien C. A. van Zuijlen (2011) On a conjecture concerning the sum of independent Rademacher random variables
- [41] Hartigan J. A., Hartigan P. M. (1985) "The dip test of unimodality". *Annals of Statistics* 13(1):70–84
- [42] Gauss C. F. Theoria Combinationis Observationum Erroribus Minimis Obnoxiae. Pars Prior. Pars Posterior. Supplementum. Theory of the Combination of Observations Least Subject to Errors. Part One. Part Two. Supplement. 1995. Translated by G. W. Stewart. Classics in Applied Mathematics Series, Society for Industrial and Applied Mathematics, Philadelphia
- [43] Winkler A. (1886) Math-Natur theorie Kl. Akad. Wiss Wien Zweite Abt 53, 6–41
- [44] Dharmadhikari S. W., Joag-Dev K. (1985) The Gauss–Tchebyshev inequality for unimodal distributions. Teor Veroyatnost i Primenen 30(4):817–820
- [46] McWilliams T. P. (1990) "A distribution-free test for symmetry based on a runs statistic". *Journal of the American Statistical Association* 85(412):1130–1133
- [49] DasGupta A. (2000) Best constants in Chebychev inequalities with various applications. Metrika 5(1):185–200
- [51] Godwin H. J. (1964) Inequalities on distribution functions. (Chapter 3) New York, Hafner Pub. Co.
- [52] Lesley F. D., Rotar V. I. (2003) Some remarks on lower bounds of Chebyshev's type for half-lines. J Inequalities Pure Appl Math 4(5) Art 96
- [57] Haldane J. B. (1952) Simple tests for bimodality and bitangentiality. *Annals of Eugenics* 16(4):359–364
- [58] Kendall M. G. (1943) The Advanced Theory of Statistics, 1. London

## Further reading

- A. Papoulis (1991), *Probability, Random Variables, and Stochastic Processes*, 3rd ed. McGraw–Hill. ISBN 0-07-100870-5. pp. 113–114.
- G. Grimmett and D. Stirzaker (2001), *Probability and Random Processes*, 3rd ed. Oxford. ISBN 0-19-857222-0. Section 7.3.

## External links

- Formal proof in the Mizar system.