Rademacher complexity


In statistics and machine learning, Rademacher complexity, named after Hans Rademacher, measures the richness of a class of real-valued functions with respect to a probability distribution.

Given a training sample $S = (z_1, z_2, \dots, z_m) \in Z^m$ and a hypothesis set $\mathcal{H}$ (a class of real-valued functions defined on a domain space $Z$), the empirical Rademacher complexity of $\mathcal{H}$ is defined as:

$$\widehat{\mathcal{R}}_S(\mathcal{H}) = \frac{2}{m}\, \mathbb{E}\left[\left.\sup_{h \in \mathcal{H}} \left|\sum_{i=1}^m \sigma_i h(z_i)\right| \;\right|\; S\right]$$

where $\sigma_1, \sigma_2, \dots, \sigma_m$ are independent random variables such that $\Pr(\sigma_i = +1) = \Pr(\sigma_i = -1) = 1/2$ for $i = 1, 2, \dots, m$. The random variables $\sigma_1, \sigma_2, \dots, \sigma_m$ are referred to as Rademacher variables.
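For a finite hypothesis class, the expectation in this definition can be approximated by Monte Carlo sampling of the Rademacher variables. The sketch below (in NumPy) uses a hypothetical class of threshold indicator functions $h_t(z) = \mathbf{1}\{z \ge t\}$ purely for illustration; the class, thresholds, and sample are all assumptions, not part of the definition above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite hypothesis class: threshold indicators
# h_t(z) = 1 if z >= t else 0, for a handful of thresholds t.
thresholds = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

def empirical_rademacher(sample, n_trials=20_000):
    """Monte Carlo estimate of (2/m) E[ sup_h |sum_i sigma_i h(z_i)| | S ]."""
    m = len(sample)
    # H[j, i] = h_{t_j}(z_i); shape (|H|, m)
    H = (sample[None, :] >= thresholds[:, None]).astype(float)
    # Draw Rademacher variables: each sigma_i is +1 or -1 with probability 1/2
    sigmas = rng.choice([-1.0, 1.0], size=(n_trials, m))
    # For each draw of (sigma_1, ..., sigma_m), take the supremum of
    # |sum_i sigma_i h(z_i)| over the (finite) class, then average.
    sups = np.abs(sigmas @ H.T).max(axis=1)
    return 2.0 / m * sups.mean()

S = rng.normal(size=50)  # a sample of m = 50 points, fixed for the estimate
print(empirical_rademacher(S))
```

Because each $h$ takes values in $\{0,1\}$, the estimate is necessarily between $0$ and $2$; for a richer class the supremum, and hence the complexity, would be larger.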

Let $P$ be a probability distribution over $Z$. The Rademacher complexity of the function class $\mathcal{H}$ with respect to $P$ for sample size $m$ is:

$$\mathcal{R}_m(\mathcal{H}) = \mathbb{E}\left[\widehat{\mathcal{R}}_S(\mathcal{H})\right]$$

where the above expectation is taken over an independent and identically distributed (i.i.d.) sample $S = (z_1, z_2, \dots, z_m)$ generated according to $P$.

One can show, for example, that there exists a constant $C$ such that any class of $\{0,1\}$-indicator functions with Vapnik–Chervonenkis dimension $d$ has Rademacher complexity upper-bounded by $C\sqrt{d/m}$.
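As an illustrative instance of this bound (the constant $C$ is left unspecified, and the values of $d$ and $m$ are chosen purely for arithmetic):

$$\mathcal{R}_m(\mathcal{H}) \;\le\; C\sqrt{\frac{d}{m}} \;=\; C\sqrt{\frac{100}{10000}} \;=\; \frac{C}{10} \qquad (d = 100,\; m = 10000),$$

so quadrupling the sample size halves the bound: the complexity decays at rate $O(\sqrt{d/m})$.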

Gaussian complexity

Gaussian complexity is a similar notion of complexity with a similar meaning, obtained from the definition above by using random variables $g_i$ in place of $\sigma_i$, where the $g_i$ are i.i.d. Gaussian random variables with zero mean and unit variance, i.e. $g_i \sim \mathcal{N}(0, 1)$.
