Main Page: Difference between revisions

Revision as of 21:09, 18 August 2014

In statistical classification, the Bayes error rate is the lowest possible error rate achievable by any classifier for a given classification problem.^[1]^[2]

A number of approaches to the estimation of the Bayes error rate exist. One method seeks to obtain analytical bounds which are inherently dependent on distribution parameters, and hence difficult to estimate. Another approach focuses on class densities, while yet another method combines and compares various classifiers.^[2]
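To illustrate why analytical bounds depend on distribution parameters, the sketch below computes the Bhattacharyya bound for two one-dimensional Gaussian classes. The choice of this particular bound is an assumption made for illustration (the text above does not name a specific bound), and the function and parameter names are hypothetical. The bound needs the class means, standard deviations and priors, which in practice have to be estimated from data.

```python
import math


def bhattacharyya_distance(mean0, std0, mean1, std1):
    """Bhattacharyya distance between N(mean0, std0^2) and N(mean1, std1^2)."""
    var_sum = std0 ** 2 + std1 ** 2
    mean_term = 0.25 * (mean0 - mean1) ** 2 / var_sum
    spread_term = 0.5 * math.log(var_sum / (2.0 * std0 * std1))
    return mean_term + spread_term


def bhattacharyya_bound(mean0, std0, mean1, std1, prior0=0.5):
    """Upper bound on the two-class Bayes error: sqrt(P0 * P1) * exp(-distance)."""
    prior1 = 1.0 - prior0
    distance = bhattacharyya_distance(mean0, std0, mean1, std1)
    return math.sqrt(prior0 * prior1) * math.exp(-distance)


# Illustrative parameters (assumed, not taken from the article).
bound = bhattacharyya_bound(mean0=0.0, std0=1.0, mean1=2.0, std1=1.0)
print(f"Bhattacharyya upper bound on the Bayes error: {bound:.4f}")
```

The result is only an upper bound; the Bayes error itself can be considerably smaller, and any error in the estimated parameters carries over into the bound.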

The Bayes error rate is an important quantity in the study of pattern recognition and machine learning techniques.

Error Determination

In machine learning and pattern classification, the instances of a data set are divided into two or more classes. Each element of the data set is called an instance, and the class it belongs to is called its label. The Bayes error rate is the probability that an instance is misclassified by the classifier that assigns each instance to its most probable class. For a classifier with n classes, it may be calculated as follows:

$1-\sum _{i=1}^{n}\int \limits _{x\in H_{i}}P(C_{i}|x)\,p(x)\,dx$

where x is an instance, $C_{i}$ is the i-th class, and $H_{i}$ is the region that the classifier function h assigns to class $C_{i}$. The sum is the probability of a correct classification, so subtracting it from 1 gives the error rate.
The Bayes error is non-zero if the class distributions overlap, i.e. if some instance x can carry more than one label with non-zero probability.
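As a rough numerical sketch of this computation, consider two classes whose instances follow one-dimensional Gaussian distributions with equal variance. At each x the optimal classifier picks the class with the larger value of P(C_i|x)p(x), so the probability mass misclassified at x is the smaller of the two terms, and the Bayes error is the integral of that minimum. The distributions, parameter values and function names below are assumptions chosen for illustration only.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_error_numeric(mean0, mean1, std, prior0=0.5,
                        lo=-20.0, hi=20.0, steps=40_000):
    """Midpoint-rule integral of min(P0 * p(x|C0), P1 * p(x|C1)) over [lo, hi]."""
    prior1 = 1.0 - prior0
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        joint0 = prior0 * gaussian_pdf(x, mean0, std)  # P(C0|x) p(x)
        joint1 = prior1 * gaussian_pdf(x, mean1, std)  # P(C1|x) p(x)
        total += min(joint0, joint1)                   # mass lost to the wrong class
    return total * width


def bayes_error_closed_form(mean0, mean1, std):
    """Equal priors and equal variance: error = Phi(-|mean1 - mean0| / (2 * std))."""
    d = abs(mean1 - mean0) / std
    return 0.5 * math.erfc(d / (2.0 * math.sqrt(2.0)))


numeric = bayes_error_numeric(0.0, 2.0, 1.0)
exact = bayes_error_closed_form(0.0, 2.0, 1.0)
print(f"numeric: {numeric:.4f}, closed form: {exact:.4f}")
```

With the means two standard deviations apart, both values come out at about 0.1587: the class densities overlap, so even the optimal classifier misclassifies roughly 16% of instances. Moving the means further apart shrinks the overlap and drives the error toward zero.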

References


[1] Fukunaga, Keinosuke (1990). Introduction to Statistical Pattern Recognition. ISBN 0122698517. pp. 3 and 97.
[2] Tumer, K. (1996). "Estimating the Bayes error rate through classifier combining". In Proceedings of the 13th International Conference on Pattern Recognition, Volume 2, pp. 695–699.


@@ Line 1: / Line 1: @@
-'''Multidimensional sampling''' is the process of converting a function of a multidimensional variable into a discrete collection of values of the function measured on a discrete set of points. This article presents the basic result due to Petersen and Middleton<ref name="petmid62">D. P. Petersen and D. Middleton, "Sampling and Reconstruction of Wave-Number-Limited Functions in N-Dimensional Euclidean Spaces", Information and Control, vol. 5, pp. 279–323, 1962.</ref> on conditions for perfectly reconstructing a [[wavenumber]]-limited function from its measurements on a discrete [[Lattice (group)|lattice]] of points. This result, also known as the '''Petersen–Middleton theorem''',  is a generalization of the [[Nyquist–Shannon sampling theorem]] for sampling one-dimensional bandlimited functions to higher-dimensional Euclidean spaces.
+In [[statistical classification]], the '''Bayes error rate''' is the lowest possible error rate achievable by any classifier for a given classification problem.<ref name=stat >Fukunaga, Keinosuke (1990). ''Introduction to Statistical Pattern Recognition''. ISBN 0122698517. pp. 3 and 97.</ref><ref name=Tumer >Tumer, K. (1996). "Estimating the Bayes error rate through classifier combining". In ''Proceedings of the 13th International Conference on Pattern Recognition'', Volume 2, pp. 695–699.</ref>
-In essence, the Petersen–Middleton theorem shows that a wavenumber-limited function can be perfectly reconstructed from its values on an infinite [[Lattice (group)|lattice]] of points, provided the lattice is fine enough. The theorem provides conditions on the lattice under which perfect reconstruction is possible.
+A number of approaches to the estimation of the Bayes error rate exist. One method seeks to obtain analytical bounds which are inherently dependent on distribution parameters, and hence difficult to estimate. Another approach focuses on class densities, while yet another method combines and compares various classifiers.<ref name=Tumer />
-As with the Nyquist–Shannon sampling theorem, this theorem also assumes an idealization of any real-world situation, as it only applies to functions that are sampled over an infinitude of points. Perfect reconstruction is mathematically possible for the idealized model but only an approximation for real-world functions and sampling techniques, albeit in practice often a very good one.
+The Bayes error rate is an important quantity in the study of pattern recognition and [[machine learning]] techniques.
-==Preliminaries==
+==Error Determination==
-[[Image:Hexagonal_sampling_lattice.png|thumb|Fig. 1: A hexagonal sampling lattice <math>\Lambda</math> and its basis vectors ''v''<sub>1</sub> and ''v''<sub>2</sub>|right|200px]]
+In machine learning and pattern classification, the instances of a data set are divided into two or more classes. Each element of the data set is called an '''instance''', and the class it belongs to is called its '''label'''.
-[[Image:Reciprocal_lattice.png|thumb|Fig. 2: The reciprocal lattice <math>\Gamma</math> corresponding to the lattice <math>\Lambda</math> of Fig. 1 and its basis vectors ''u''<sub>1</sub> and ''u''<sub>2</sub> (figure not to scale).|right|200px]]
+The Bayes error rate is the probability that an instance is misclassified by the classifier that assigns each instance to its most probable class.
-The concept of a [[Bandlimiting|bandlimited]] function in one dimension can be generalized to the notion of a wavenumber-limited function in higher dimensions. Recall that the [[Fourier transform]] of an integrable function ''ƒ(.)'' on ''n''-dimensional Euclidean space is defined as:
+For a classifier with n classes, the Bayes error rate may be calculated as follows:
-:<math>\hat{f}(\xi) = \mathcal{F}(f)(\xi) = \int_{\Re^n} f(x) e^{-2\pi i \langle x,\xi \rangle} \, dx</math>
+<br /><br />
-where ''x'' and ''ξ'' are ''n''-dimensional [[vector (mathematics)|vectors]], and <math>\langle x,\xi \rangle</math> is the [[inner product]] of the vectors. The function ''ƒ(.)'' is said to be wavenumber-limited to a set <math>\Omega</math> if the Fourier transform satisfies <math>\hat{f}(\xi) = 0</math> for <math>\xi \notin \Omega</math>.
+<math>1-\sum_{i=1}^{n}\int\limits_{x\in H_{i}}P(C_{i}|x)\,p(x)\, dx</math>
+<br /> <br />
+where x is an instance, <math>C_{i}</math> is the i-th class, and <math>H_{i}</math> is the region that the classifier function '''h''' assigns to class <math>C_{i}</math>. The sum is the probability of a correct classification, so subtracting it from 1 gives the error rate.
+<br />
+The Bayes error is non-zero if the class distributions overlap, i.e. if some instance x can carry more than one label with non-zero probability.
-Similarly, the configuration of uniformly spaced sampling points in one-dimension can be generalized to a [[Lattice (group)|lattice]] in higher dimensions. A lattice is a collection of points <math>\Lambda \subset \Re^n</math> of the form
+==See also==
-<math>
+* [[Naive Bayes classifier]]
-\Lambda = \left\{ \sum_{i=1}^n a_i v_i \; | \; a_i \in\Bbb{Z} \right\}
-</math>
-where {''v''<sub>1</sub>, ..., ''v''<sub>''n''</sub>} is a [[Basis (linear algebra)|basis]] for <math>\Re^n</math>. The [[reciprocal lattice]] <math>\Gamma</math> corresponding to <math>\Lambda</math> is defined by
-:<math>
-\Gamma = \left\{ \sum_{i=1}^n a_i u_i \; | \; a_i \in\Bbb{Z} \right\}
-</math>
-where the vectors <math>u_i</math> are chosen to satisfy <math>\langle u_i, v_j \rangle = \delta_{ij}</math>. An example of a sampling lattice is a [[hexagonal lattice]] depicted in Figure 1. The corresponding reciprocal lattice is shown in Figure 2.
-==The theorem==
-Let <math>\Lambda</math> denote a lattice in <math>\Re^n</math> and <math>\Gamma</math> the corresponding reciprocal lattice. The theorem of Petersen and Middleton<ref name="petmid62"></ref> states that a function ''f(.)'' that is wavenumber-limited to a set <math>\Omega \subset \Re^n</math> can be exactly reconstructed from its measurements on <math>\Lambda</math> provided that the set <math>\Omega</math> does not overlap with any of its shifted versions <math>\Omega + x </math> where the shift ''x'' is any nonzero element of the reciprocal lattice <math>\Gamma</math>. In other words, ''f(.)'' can be exactly reconstructed from its measurements on <math>\Lambda</math> provided that <math>\Omega \cap \{x+y:y\in\Omega\} = \phi </math> for all <math>x \in \Gamma\setminus\{0\}</math>.
-==Reconstruction==
-[[Image:Unaliased_sampled_spectrum_in_2D.png|thumb|Fig. 3: Support of the sampled spectrum <math>\hat f_s(.)</math> obtained by hexagonal sampling of a two-dimensional function wavenumber-limited to a circular disc. The blue circle represents the support <math>\Omega</math> of the original wavenumber-limited field, and the green circles represent the repetitions. In this example the spectral repetitions do not overlap and hence there is no aliasing. The original spectrum can be exactly recovered from the sampled spectrum.|right|300px]]
-The generalization of the [[Poisson summation formula]] to higher dimensions <ref name="stewei71">E. M. Stein and G. Weiss, "Introduction to Fourier Analysis on Euclidean Spaces", Princeton University Press, Princeton, 1971.</ref> can be used to show that the samples, <math>\{f(x): x \in \Lambda\} </math>, of the function ''f(.)'' on the lattice <math>\Lambda</math> are sufficient to create a [[periodic summation]] of the function <math>\hat f(.)</math>. The result is:
-{{NumBlk|:|<math>\hat f_s(\xi)\ \stackrel{\mathrm{def}}{=} \sum_{y \in \Gamma} \hat f\left(\xi - y\right) = \sum_{x \in \Lambda} |\Lambda|f(x) \ e^{-i 2\pi \langle x, \xi \rangle},</math>|{{EquationRef|Eq.1}}}}
-where <math>|\Lambda| </math> represents the volume of the [[parallelepiped]] formed by the vectors {''v''<sub>1</sub>, ..., ''v''<sub>''n''</sub>}. This periodic function is often referred to as the sampled spectrum and can be interpreted as the analogue of the [[discrete-time Fourier transform]] (DTFT) in higher dimensions. If the original wavenumber-limited spectrum <math>\hat f(.)</math> is supported on the set <math>\Omega</math> then the function <math>\hat f_s(.)</math> is supported on periodic repetitions of <math>\Omega</math> shifted by points on the reciprocal lattice <math>\Gamma</math>. If the conditions of the Petersen-Middleton theorem are met, then the function <math>\hat f_s(\xi)</math> is equal to <math>\hat f(\xi)</math> for all <math>\xi \in \Omega</math>, and hence the original field can be exactly reconstructed from the samples. In this case the reconstructed field matches the original field and can be expressed in terms of the samples as
-{{NumBlk|:|<math>f(x) = \sum_{y \in \Lambda} |\Lambda| f(y) \check \chi_\Omega(y - x)</math>,|{{EquationRef|Eq.2}}}}
-where <math>\check \chi_\Omega(.)</math> is the inverse Fourier transform of the [[Indicator function|characteristic function]] of the set <math>\Omega</math>. This interpolation formula is the higher-dimensional equivalent of the [[Whittaker–Shannon interpolation formula]].
-As an example suppose that <math>\Omega</math> is a circular disc. Figure 3 illustrates the support of <math>\hat f_s(.)</math> when the conditions of the Petersen-Middleton theorem are met. We see that the spectral repetitions do not overlap and hence the original spectrum can be exactly recovered.
-==Implications==
-===Aliasing===
-{{main|Aliasing}}
-[[Image:Aliased_sampled_spectrum_in_2D.png|thumb|Fig. 4: Support of the sampled spectrum <math>\hat f_s(.)</math> obtained by hexagonal sampling of a two-dimensional function wavenumber-limited to a circular disc. In this example, the sampling lattice is not fine enough and hence the discs overlap in the sampled spectrum. Thus the spectrum within <math>\Omega</math> represented by the blue circle cannot be recovered exactly due to the overlap from the repetitions (shown in green), thus leading to aliasing.|right|300px]]
-[[File:Moire pattern of bricks small.jpg|thumb|205px|Fig. 5: Spatial aliasing in the form of a [[Moiré pattern]].]]
-[[File:Moire pattern of bricks.jpg|thumb|205px|Fig. 6: Properly sampled image of brick wall.]]
-The theorem gives conditions on sampling lattices for perfect reconstruction of the sampled. If the lattices are not fine enough to satisfy the Petersen-Middleton condition, then the field cannot be reconstructed exactly from the samples in general. In this case we say that the samples may be [[Aliasing|aliased]]. Again, consider the example in which <math>\Omega</math> is a circular disc. If the Petersen-Middleton conditions do not hold, the support of the sampled spectrum will be as shown in Figure 4. In this case the spectral repetitions overlap leading to aliasing in the reconstruction.
-A simple illustration of aliasing can be obtained by studying low-resolution images. A gray-scale image can be interpreted as a function in two-dimensional space. An example of aliasing is shown in the images of brick patterns in Figure 5. The image shows the effects of aliasing when the sampling theorem's condition is not satisfied. If the lattice of pixels is not fine enough for the scene, aliasing occurs as evidenced by the appearance of the [[Moiré pattern]] in the image obtained. The image in Figure 6 is obtained when a smoothened version of the scene is sampled with the same lattice. In this case the conditions of the theorem are satisfied and no aliasing occurs.
-===Optimal sampling lattices===
-One of the objects of interest in designing a sampling scheme for wavenumber-limited fields is to identify the configuration of points that leads to the minimum sampling density, i.e., the density of sampling points per unit spatial volume in <math>\Re^n</math>. Typically the cost for taking and storing the measurements is proportional to the sampling density employed. Often in practice, the natural approach to sample two-dimensional fields is to sample it at points on a [[Lattice_(group)|rectangular lattice]]. However, this is not always the ideal choice in terms of the sampling density. The theorem of Petersen and Middleton can be used to identify the optimal lattice for sampling fields that are wavenumber-limited to a given set <math>\Omega \subset \Re^d</math>. For example, it can be shown that the lattice in <math>\Re^2</math> with minimum spatial density of points that admits perfect reconstructions of fields wavenumber-limited to a circular disc in <math>\Re^2</math> is the hexagonal lattice<ref name="mer79">D. R. Mersereau, “The processing of hexagonally sampled two-dimensional signals,” Proceedings of the IEEE, vol. 67, no. 6, pp. 930 – 949, June 1979.</ref>. As a consequence, hexagonal lattices are preferred for sampling [[Isotropy|isotropic fields]] in <math>\Re^2</math>.
-==Applications==
-The Petersen–Middleton theorem is useful in designing efficient sensor placement strategies in applications involving measurement of spatial phenomena such as seismic surveys, environment monitoring and spatial audio-field measurements.
 ==References==
 {{Reflist}}
+[[Category:Statistical classification]]
+[[Category:Bayesian statistics|Error rate]]
-{{DSP}}
+{{Statistics-stub}}
-[[Category:Digital signal processing]]
-[[Category:Theorems in Fourier analysis]]
