Minimum message length: Difference between revisions

From formulasearchengine
en>Yobot
m Link equal to linktext using AWB (9585)
en>Thepigdog
 
In [[probability theory]] and [[statistics]], a '''Gaussian process''' is a [[stochastic process]] whose realizations consist of [[random variable|random values]] associated with every point in a range of times (or of space) such that each such [[random variable]] has a [[normal distribution]]. Moreover, every finite collection of those random variables has a [[multivariate normal distribution]]. The concept of Gaussian processes is named after [[Carl Friedrich Gauss]] because it is based on the notion of the [[normal distribution|normal]] [[distribution (mathematics)|distribution]] which is often called the ''[[Gaussian distribution]]''. In fact, one way of thinking of a Gaussian process is as an infinite-dimensional generalization of the multivariate normal distribution.
 
Gaussian processes are important in [[statistical model]]ling because of properties inherited from the normal distribution. For example, if a random process is modelled as a Gaussian process, the distributions of various derived quantities can be obtained explicitly. Such quantities include the average value of the process over a range of times and the error in estimating that average using sample values at a small set of times.
 
==Definition==
A '''Gaussian process''' is a [[stochastic process]] ''X''<sub>''t''</sub>, ''t'' ∈ ''T'', for which any finite [[linear combination]] of [[Sampling (statistics)|samples]] has a [[multivariate normal distribution|joint Gaussian distribution]]. More accurately, any linear [[functional (mathematics)|functional]] applied to the sample function ''X''<sub>''t''</sub> will give a normally distributed result. Notation-wise, one can write ''X'' ~ GP(''m'',''K''), meaning the [[random function]] ''X'' is distributed as a GP with mean function ''m'' and covariance function ''K''.<ref>{{cite doi|10.1007/978-3-540-28650-9_4}}</ref> When the input vector ''t'' is two- or multi-dimensional a Gaussian process might be also known as a ''[[Gaussian random field]]''.<ref name="prml">{{cite book |last=Bishop |first=C.M. |title= Pattern Recognition and Machine Learning |year=2006 |publisher=[[Springer Science+Business Media|Springer]] |isbn=0-387-31073-8}}</ref>
 
Some authors<ref>{{cite book |last=Simon |first=Barry |title=Functional Integration and Quantum Physics |year=1979 |publisher=Academic Press}}</ref> assume the [[random variable]]s ''X''<sub>''t''</sub> have mean zero; this greatly simplifies calculations without loss of generality and allows the mean square properties of the process to be ''entirely'' determined by the covariance function ''K''.<ref name="seegerGPML">{{cite journal |last1= Seeger| first1= Matthias |year= 2004 |title= Gaussian Processes for Machine Learning|journal= International Journal of Neural Systems|volume= 14|issue= 2|pages= 69–104 }}</ref>
 
==Alternative definitions==
Alternatively, a process is Gaussian [[if and only if]] for every [[finite set]] of [[indexed family|indices]] <math>t_1,\ldots,t_k</math> in the index set <math>T</math>
 
:<math>{\mathbf{X}}_{t_1, \ldots, t_k} = (\mathbf{X}_{t_1}, \ldots, \mathbf{X}_{t_k}) </math>
 
is a [[multivariate normal distribution|multivariate Gaussian]] [[random variable]]. Using [[Characteristic function (probability theory)|characteristic functions]] of random variables, the Gaussian property can be formulated as follows: <math>\left\{X_t ; t\in T\right\}</math> is Gaussian if and only if, for every finite set of indices <math>t_1,\ldots,t_k</math>, there are real valued <math>\sigma_{\ell j}</math>, <math>\mu_\ell</math> with <math>\sigma_{jj} > 0</math> such that the following equality holds for all real <math>s_1,\ldots,s_k</math>
 
:<math> \operatorname{E}\left(\exp\left(i \ \sum_{\ell=1}^k s_\ell \ \mathbf{X}_{t_\ell}\right)\right) = \exp \left(-\frac{1}{2} \, \sum_{\ell, j} \sigma_{\ell j} s_\ell s_j + i \sum_\ell \mu_\ell s_\ell\right). </math>
 
The numbers <math>\sigma_{\ell j}</math> and <math>\mu_\ell</math> can be shown to be the [[covariance]]s and [[mean (mathematics)|means]] of the variables in the process.<ref>{{cite book |last=Dudley |first=R.M. |title=Real Analysis and Probability |year=1989 |publisher=Wadsworth and Brooks/Cole}}</ref>
 
==Covariance functions==
A key fact of Gaussian processes is that they can be completely defined by their second-order statistics.<ref name="prml"/> Thus, if a Gaussian process is assumed to have mean zero, defining the covariance function completely defines the process' behaviour. The covariance matrix ''K'' evaluated at all pairs of points ''x'' and ''x' '' specifies a distribution on functions and is known as the [[Gram matrix]]. Importantly, because every valid covariance function is a scalar product of vectors, by construction the matrix ''K'' is a [[non-negative definite matrix]]. Equivalently, the covariance function ''K'' is a ''non-negative definite'' function in the sense that for every finite set of points ''x''<sub>1</sub>, ..., ''x''<sub>''n''</sub> and all real coefficients ''c''<sub>1</sub>, ..., ''c''<sub>''n''</sub>, <math>\sum_{i,j} c_i c_j K(x_i,x_j) \geq 0</math>; if the sum is strictly positive whenever the points are distinct and the coefficients are not all zero, then ''K'' is called ''positive definite''. Importantly, the non-negative definiteness of ''K'' enables its spectral decomposition using the [[Karhunen-Loeve expansion]]. Basic aspects that can be defined through the covariance function are the process' [[stationary process|stationarity]], [[isotropy]], [[smoothness]] and [[periodic function|periodicity]].<ref name="brml">{{cite book |last=Barber |first=David |title=Bayesian Reasoning and Machine Learning |url=http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage |year=2012 |publisher=[[Cambridge University Press]] |isbn=0-521-51814-7}}</ref><ref name="gpml">{{cite book |last=Rasmussen |first=C.E. |coauthors=Williams, C.K.I |title=Gaussian Processes for Machine Learning |url=http://www.gaussianprocess.org/gpml/ |year=2006 |publisher=[[MIT Press]] |isbn=0-262-18253-X}}</ref>
 
Stationarity refers to the process' behaviour regarding the separation of any two points ''x'' and ''x' ''. If the process is stationary, its covariance function depends only on their separation, ''x'' - ''x' '', while if it is non-stationary the covariance depends on the actual positions of the points ''x'' and ''x' ''. An example of a stationary process is the [[Ornstein&ndash;Uhlenbeck process]]. In contrast, a [[Brownian motion]] process, which is a special case of an Ornstein&ndash;Uhlenbeck process, is non-stationary.
 
If the covariance of the process depends only on ''|x - x'|'', the Euclidean distance (not the direction) between ''x'' and ''x' '', then the process is considered isotropic. A process that is concurrently stationary and isotropic is considered to be [[homogeneous]];<ref name="PRP">{{cite book |last=Grimmett  |first=Geoffrey |coauthors= David Stirzaker|title= Probability and Random Processes| year=2001 |publisher=[[Oxford University Press]] |isbn=0198572220}}</ref> in practice these properties reflect the differences (or rather the lack of them) in the behaviour of the process given the location of the observer.
 
Ultimately, Gaussian processes amount to placing priors on functions, and the smoothness of these priors can be induced by the covariance function.<ref name ="brml"/> If we expect that for "near-by" input points ''x'' and ''x' '' the corresponding output points ''y'' and ''y' '' are "near-by" also, then the assumption of smoothness is present. If we wish to allow for significant displacement then we might choose a rougher covariance function. Extreme examples of this behaviour are the Ornstein&ndash;Uhlenbeck covariance function and the squared exponential, where the former is never differentiable and the latter is infinitely differentiable.
 
Periodicity refers to inducing periodic patterns within the behaviour of the process. Formally, this is achieved by mapping the input ''x'' to a two dimensional vector ''u(x) =(cos(x), sin(x))''.
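 
As a worked check of this mapping (a sketch that assumes, for illustration, a squared exponential covariance with length-scale <math>l</math> applied to the mapped inputs), the squared distance between two mapped points is
:<math>|u(x)-u(x')|^2 = (\cos x - \cos x')^2 + (\sin x - \sin x')^2 = 2 - 2\cos(x-x') = 4\sin^2\Big(\frac{x-x'}{2}\Big),</math>
so that
:<math>\exp\Big(-\frac{|u(x)-u(x')|^2}{2l^2}\Big) = \exp\Big(-\frac{2\sin^2(\frac{x-x'}{2})}{l^2}\Big),</math>
which depends on ''x'' - ''x' '' only through a function of period 2π and has the same form as the periodic covariance function <math>K_\text{P}</math>.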
 
===Usual covariance functions===
There are a number of common covariance functions:<ref name="gpml"/>
*Constant : <math> K_\text{C}(x,x') = C </math>
*Linear: <math> K_\text{L}(x,x') =  x^T x'</math>
*Gaussian Noise: <math> K_\text{GN}(x,x') = \sigma^2 \delta_{x,x'}</math>
*Squared Exponential: <math> K_\text{SE}(x,x') = \exp \Big(-\frac{|d|^2}{2l^2} \Big)</math>
*Ornstein&ndash;Uhlenbeck: <math> K_\text{OU}(x,x') = \exp \Big(-\frac{|d| }{l} \Big)</math>
*Matérn: <math> K_\text{Matern}(x,x') = \frac{2^{1-\nu}}{\Gamma(\nu)} \Big(\frac{\sqrt{2\nu}|d|}{l} \Big)^\nu K_{\nu}\Big(\frac{\sqrt{2\nu}|d|}{l} \Big)</math>
*Periodic: <math> K_\text{P}(x,x') = \exp\Big(-\frac{ 2\sin^2(\frac{d}{2})}{ l^2} \Big)</math>
*Rational Quadratic: <math> K_\text{RQ}(x,x') =  (1+|d|^2)^{-\alpha}, \quad \alpha \geq 0</math>
 
Here <math>d = x- x'</math>. The parameter <math>l</math> is the characteristic length-scale of the process (practically, "how far apart" two points <math>x</math> and <math>x'</math> have to be for <math>X</math> to change significantly), δ is the [[Kronecker delta]] and σ the [[standard deviation]] of the noise fluctuations. Here <math>K_\nu</math> is the [[modified Bessel function]] of order <math>\nu</math> and <math>\Gamma(\nu)</math> is the [[gamma function]] evaluated at <math>\nu</math>. Importantly, a complicated covariance function can be defined as a linear combination, with non-negative coefficients, of other simpler covariance functions in order to incorporate different insights about the data-set at hand.
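 
A few of the covariance functions above can be sketched in code for one-dimensional inputs. The following is a minimal illustration using NumPy, not a reference implementation; the function names, default parameter values and test grid are assumptions made for this example. It also builds a combined covariance as a non-negative linear combination and computes the eigenvalues of the resulting Gram matrix, which should be non-negative up to rounding.

```python
import numpy as np

def squared_exponential(x, x2, length_scale=1.0):
    """K_SE(x, x') = exp(-|d|^2 / (2 l^2)) for 1-D input arrays."""
    d = np.subtract.outer(x, x2)          # matrix of differences d = x - x'
    return np.exp(-d**2 / (2.0 * length_scale**2))

def ornstein_uhlenbeck(x, x2, length_scale=1.0):
    """K_OU(x, x') = exp(-|d| / l)."""
    d = np.subtract.outer(x, x2)
    return np.exp(-np.abs(d) / length_scale)

def periodic(x, x2, length_scale=1.0):
    """K_P(x, x') = exp(-2 sin^2(d / 2) / l^2)."""
    d = np.subtract.outer(x, x2)
    return np.exp(-2.0 * np.sin(d / 2.0)**2 / length_scale**2)

def gaussian_noise(x, x2, sigma=1.0):
    """K_GN(x, x') = sigma^2 * delta_{x, x'} (Kronecker delta)."""
    return sigma**2 * (np.subtract.outer(x, x2) == 0.0)

# Hypothetical inputs: 20 equally spaced points on [0, 5].
x = np.linspace(0.0, 5.0, 20)

# A combined covariance: non-negative linear combination of simpler ones.
K = (squared_exponential(x, x, 1.5)
     + 0.5 * periodic(x, x)
     + gaussian_noise(x, x, 0.1))

# Eigenvalues of the symmetric Gram matrix; all should be >= 0 up to rounding.
eigenvalues = np.linalg.eigvalsh(K)
```

The Matérn function is omitted here because it needs a modified Bessel function, which NumPy alone does not provide.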
 
Clearly, the inferential results are dependent on the values of the hyperparameters θ (e.g. <math>l</math> and ''σ'') defining the model's behaviour. A popular choice for θ is to provide ''[[maximum a posteriori]]'' (MAP) estimates of it by maximizing the [[marginal likelihood]] of the process; the  marginalization being done over the observed process values <math>y</math>.<ref name= "gpml"/> This approach is also known as ''maximum likelihood II'', ''evidence maximization'', or ''[[Empirical Bayes]]''.<ref name= "seegerGPML"/>
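 
The idea of choosing hyperparameters by evidence maximization can be sketched as follows. This is a minimal illustration under stated assumptions: a zero-mean model, a squared exponential covariance plus Gaussian noise, synthetic data, and a coarse grid search over the length-scale in place of a proper optimizer. The names <code>se_kernel</code>, <code>log_marginal_likelihood</code> and <code>candidates</code> are hypothetical. The score is the log density of the observations under a zero-mean multivariate normal whose covariance is the Gram matrix.

```python
import numpy as np

def se_kernel(x, x2, length_scale):
    """Squared exponential covariance for 1-D inputs."""
    d = np.subtract.outer(x, x2)
    return np.exp(-d**2 / (2.0 * length_scale**2))

def log_marginal_likelihood(x, y, length_scale, sigma):
    """Log density of y under N(0, K), with K = K_SE + sigma^2 * I."""
    n = x.size
    K = se_kernel(x, x, length_scale) + sigma**2 * np.eye(n)
    L = np.linalg.cholesky(K)                        # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    return (-0.5 * y @ alpha                         # data-fit term
            - np.sum(np.log(np.diag(L)))             # equals -0.5 * log det K
            - 0.5 * n * np.log(2.0 * np.pi))         # normalization constant

# Synthetic observations (an assumption for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 30)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

# Coarse grid search over the length-scale; sigma is held fixed at 0.1.
candidates = np.array([0.1, 0.3, 1.0, 3.0, 10.0])
scores = np.array([log_marginal_likelihood(x, y, l, 0.1) for l in candidates])
best_length_scale = candidates[np.argmax(scores)]
```

In practice a gradient-based optimizer over all hyperparameters would usually replace the grid search; the grid keeps the sketch short.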
 
==Important Gaussian processes==
The [[Wiener process]] is perhaps the most widely studied Gaussian process. It is not [[stationary process|stationary]], but it has stationary increments.
 
The [[Ornstein&ndash;Uhlenbeck process]] is a [[stationary process|stationary]] Gaussian process.
 
The [[Brownian bridge]] is a Gaussian process whose increments are not [[statistical independence|independent]].
 
The [[fractional Brownian motion]] is a Gaussian process whose covariance function is a generalisation of that of the Wiener process.
 
==Applications==
A Gaussian process can be used as a [[prior probability distribution]] over [[Function (mathematics)|functions]] in [[Bayesian inference]].<ref name="gpml"/><ref>{{cite book |last=Liu |first=W. |coauthors=Principe, J.C. and Haykin, S. |title=Kernel Adaptive Filtering: A Comprehensive Introduction |url=http://www.cnel.ufl.edu/~weifeng/publication.htm |year=2010 |publisher=[[John Wiley & Sons|John Wiley]] |isbn=0-470-44753-2}}</ref> Given any set of ''N'' points in the desired domain of your functions, take a [[multivariate Gaussian]] whose covariance [[matrix (mathematics)|matrix]] parameter is the [[Gram matrix]] of your ''N'' points with some desired [[stochastic kernel|kernel]], and [[sampling (mathematics)|sample]] from that Gaussian.
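 
The sampling recipe above can be sketched in a few lines. This is an illustration only, assuming a zero mean function, a squared exponential kernel with unit length-scale, and a small diagonal "jitter" term added for numerical stability (the jitter is a common practical device, not part of the definition); all names are hypothetical.

```python
import numpy as np

def se_kernel(x, x2, length_scale=1.0):
    """Squared exponential kernel for 1-D inputs."""
    d = np.subtract.outer(x, x2)
    return np.exp(-d**2 / (2.0 * length_scale**2))

rng = np.random.default_rng(42)

# N points in the domain of the functions.
x = np.linspace(0.0, 5.0, 50)

# Gram matrix of the N points; jitter keeps the Cholesky factorization stable.
K = se_kernel(x, x) + 1e-6 * np.eye(x.size)

# If z ~ N(0, I) and K = L L^T, then L z ~ N(0, K).
L = np.linalg.cholesky(K)
samples = L @ rng.standard_normal((x.size, 3))   # three draws, one per column
```

Each column of <code>samples</code> holds the values of one random function from the prior at the points <code>x</code>.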
 
Inference of continuous values with a Gaussian process prior is known as Gaussian process regression, or [[kriging]]; extending Gaussian process regression to multiple target variables is known as ''co-kriging''.<ref>{{cite book |last=Stein |first=M.L. |title=Interpolation of Spatial Data: Some Theory for Kriging |year=1999 |publisher = [[Springer Science+Business Media|Springer]]}}</ref> As such, Gaussian processes are useful as a powerful non-linear [[interpolation]] tool. Additionally, Gaussian process regression can be extended to address learning tasks both in a [[Supervised learning|supervised]] (e.g. probabilistic classification<ref name="gpml"/>) and an [[Unsupervised learning|unsupervised]] (e.g. [[manifold learning]]<ref name= "prml"/>) learning framework.
 
===Gaussian process prediction===
When concerned with a general Gaussian process regression problem, it is assumed that for a Gaussian process ''f'' observed at coordinates ''x'', the vector of values ''f(x)'' is just one sample from a multivariate Gaussian distribution of dimension equal to the number of observed coordinates ''|x|''. Therefore, under the assumption of a zero-mean distribution, ''f(x) ∼ N(0, K(θ,x,x'))'', where ''K(θ,x,x')'' is the covariance matrix between all possible pairs ''(x,x')'' for a given set of hyperparameters θ.<ref name= "gpml"/>
As such the log marginal likelihood is:
:<math>\log p(f(x)|\theta,x) =  -\frac{1}{2}f(x)^T K(\theta,x,x')^{-1} f(x) -\frac{1}{2} \log \det(K(\theta,x,x')) - \frac{|x|}{2} \log 2\pi </math>
and maximizing this marginal likelihood with respect to θ provides the complete specification of the Gaussian process ''f''. One can briefly note at this point that the first term corresponds to a penalty term for a model's failure to fit observed values and the second term to a penalty term that increases with a model's complexity. Having specified ''θ'', making predictions about unobserved values ''f(x*)'' at coordinates ''x*'' is then only a matter of drawing samples from the predictive distribution ''p(y*|x*,f(x),x) = N(y*|A,B)'' where the posterior mean estimate ''A'' is defined as:
:<math>A = K(\theta,x^*,x) K(\theta,x,x')^{-1} f(x)</math>
and the posterior variance estimate B is defined as:
:<math>B = K(\theta,x^*,x^*) - K(\theta,x^*,x)  K(\theta,x,x')^{-1}  K(\theta,x^*,x)^T </math>
where ''K(θ,x*,x)'' is the covariance between the new coordinate of estimation ''x*'' and all other observed coordinates ''x'' for a given hyperparameter vector θ, ''K(θ,x,x')'' and ''f(x)'' are defined as before and ''K(θ,x*,x*)'' is the variance at point ''x*'' as dictated by ''θ''. It is important to note that practically the posterior mean estimate ''A'' (the "point estimate" of ''f(x*)'') is just a linear combination of the observations ''f(x)''; in a similar manner the posterior variance ''B'' is actually independent of the observations ''f(x)''. A known bottleneck in Gaussian process prediction is that the computational complexity of prediction is cubic in the number of points ''|x|'' and as such can become infeasible for larger data sets.<ref name= "brml"/> Works on sparse Gaussian processes, which are usually based on the idea of building a ''representative set'' for the given process ''f'', try to circumvent this issue.<ref name="smolaSparse">{{cite journal |last1= Smola| first1= A.J.| last2=Schoellkopf | first2= B. |year= 2000 |title= Sparse greedy matrix approximation for machine learning |journal= Proceedings of the Seventeenth International Conference on Machine Learning| pages=911–918}}</ref><ref name="CsatoSparse">{{cite journal |last1= Csato| first1=L.| last2=Opper | first2= M. |year= 2002 |title= Sparse on-line Gaussian processes  |journal= Neural Computation |number=3| volume= 14 | pages=641–668}}</ref>
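 
The posterior mean ''A'' and posterior variance ''B'' can be computed as in the following sketch. It is a minimal illustration assuming a zero-mean process, a squared exponential covariance with fixed hyperparameters, noise-free observations, and a small jitter term for numerical stability; the function name <code>gp_predict</code> and the toy data are hypothetical. A Cholesky factorization is used instead of an explicit matrix inverse, which is the step whose cost is cubic in the number of observed points.

```python
import numpy as np

def se_kernel(x, x2, length_scale=1.0):
    """Squared exponential covariance for 1-D inputs."""
    d = np.subtract.outer(x, x2)
    return np.exp(-d**2 / (2.0 * length_scale**2))

def gp_predict(x_train, f_train, x_test, length_scale=1.0, jitter=1e-8):
    """Return posterior mean A and posterior covariance B at x_test."""
    K = se_kernel(x_train, x_train, length_scale) + jitter * np.eye(x_train.size)
    K_s = se_kernel(x_test, x_train, length_scale)    # K(theta, x*, x)
    K_ss = se_kernel(x_test, x_test, length_scale)    # K(theta, x*, x*)

    L = np.linalg.cholesky(K)                         # O(|x|^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f_train))  # K^{-1} f(x)

    A = K_s @ alpha                                   # linear in f(x)
    V = np.linalg.solve(L, K_s.T)                     # L^{-1} K(x, x*)
    B = K_ss - V.T @ V                                # does not involve f(x)
    return A, B

# Toy data (an assumption for illustration): samples of sin at four points.
x_train = np.array([0.0, 1.0, 2.5, 4.0])
f_train = np.sin(x_train)
x_test = np.array([1.0, 3.0])     # one observed and one unobserved coordinate

A, B = gp_predict(x_train, f_train, x_test)
```

At the observed coordinate the mean reproduces the observation and the variance is close to zero, while at the unobserved coordinate the variance is larger. Calling <code>gp_predict</code> with different observed values but the same coordinates returns the same ''B''.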
 
==See also==
* [[Bayes linear statistics]]
* [[Gaussian random field]]
* [[Bayesian interpretation of regularization]]
* [[Kriging]]
 
==Notes==
{{Reflist}}
 
==External links==
* [http://www.GaussianProcess.com www.GaussianProcess.com ]
* [http://www.GaussianProcess.org The Gaussian Processes Web Site, including the text of Rasmussen and Williams' Gaussian Processes for Machine Learning]
* [http://www.robots.ox.ac.uk/~mebden/reports/GPtutorial.pdf A gentle introduction to Gaussian processes]
* [http://publications.nr.no/917_Rapport.pdf A Review of Gaussian Random Fields and Correlation Functions]
 
===Video tutorials===
* [http://videolectures.net/gpip06_mackay_gpb Gaussian Process Basics by David MacKay]
* [http://videolectures.net/epsrcws08_rasmussen_lgp Learning with Gaussian Processes by Carl Edward Rasmussen]
* [http://videolectures.net/mlss07_rasmussen_bigp Bayesian inference and Gaussian processes by Carl Edward Rasmussen]
 
{{Stochastic processes}}
 
{{DEFAULTSORT:Gaussian Process}}
[[Category:Stochastic processes]]
[[Category:Kernel methods for machine learning]]
[[Category:Non-parametric Bayesian methods]]

Latest revision as of 06:11, 30 October 2014
