G (disambiguation): Difference between revisions

From formulasearchengine
en>Maiella
en>Bkonrad
claimed usage is not supported by the linked articles
{{DISPLAYTITLE:Mallows's ''C<sub>p</sub>''}}
In [[statistics]], '''Mallows's ''C<sub>p</sub>''''',<ref>{{cite journal | doi=10.2307/1267380 | last=Mallows | first=C. L. | year=1973 | title=Some Comments on ''C<sub>P</sub>'' | journal= Technometrics | volume=15 | issue=4 | pages=661–675 | jstor=1267380}}</ref><ref>{{cite journal |title=The interpretation of Mallows's ''C<sub>p</sub>''-statistic |first=Steven G. |last=Gilmour |journal=Journal of the Royal Statistical Society, Series D |volume=45 |issue=1 |year=1996 |pages=49–56 |jstor=2348411}}</ref> named for Colin Lingwood Mallows, is used to assess the [[goodness of fit|fit]] of a [[regression analysis|regression model]] that has been estimated using [[ordinary least squares]]. It is applied in the context of [[model selection]], where a number of [[dependent and independent variables|predictor variables]] are available for predicting some outcome, and the goal is to find the best model involving a subset of these predictors.
 
Mallows's ''C<sub>p</sub>'' has been shown to be equivalent to the [[Akaike Information Criterion]] in the special case of Gaussian [[linear regression]].<ref>Boisbunon A. et&nbsp;al. (2013), "[http://arxiv.org/abs/1308.2766 AIC and Cp as estimators of loss for spherically symmetric distributions]", [[arXiv]]:1308.2766.</ref>
 
==Definition and properties==
 
Mallows's ''C<sub>p</sub>'' addresses the issue of [[overfitting]], in which model selection statistics such as the residual sum of squares never increase, and typically decrease, as more variables are added to a model. Thus, if we aim to select the model giving the smallest residual sum of squares, the model including all variables would always be selected. The ''C<sub>p</sub>'' statistic calculated on a [[sample (statistics)|sample]] of data estimates the [[mean squared prediction error]] (MSPE) as its [[statistical population|population]] target
 
:<math>
E\sum_j (\hat{Y}_j - E(Y_j\mid X_j))^2/\sigma^2,
</math>
 
where <math>\hat{Y}_j</math> is the fitted value from the regression model for the ''j''th case, ''E''(''Y''<sub>''j''</sub>&nbsp;|&nbsp;''X''<sub>''j''</sub>) is the expected value for the ''j''th case, and σ<sup>2</sup> is the error variance (assumed constant across the cases).  The MSPE will not automatically get smaller as more variables are added.  The optimum model under this criterion is a compromise influenced by the sample size, the [[effect size]]s of the different predictors, and the degree of [[collinearity]] between them.
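
The overfitting issue described above can be illustrated numerically. The following is a minimal sketch, assuming simulated data and NumPy; the helper name <code>rss</code>, the sample size, and the coefficients are illustrative choices, not part of the original definition. It fits nested least-squares models and records the residual sum of squares as irrelevant predictors are added.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # only x carries signal (assumed setup)
noise = rng.normal(size=(n, 5))          # five predictors unrelated to y

# Start from intercept + x, then append one irrelevant predictor at a time.
X = np.column_stack([np.ones(n), x])
rss_path = [rss(X, y)]
for j in range(noise.shape[1]):
    X = np.column_stack([X, noise[:, j]])
    rss_path.append(rss(X, y))

for p, value in enumerate(rss_path, start=2):
    print(f"columns={p}  RSS={value:.3f}")
```

Because each model is nested in the next, the recorded residual sum of squares cannot increase, even though the added predictors carry no information about the outcome.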
 
If ''P'' [[regressor]]s are selected from a set of ''K'' > ''P'', the ''C<sub>p</sub>'' statistic for that particular set of regressors is defined as:
 
:<math> C_p={SSE_p \over S^2} - N + 2P, </math>
 
where
 
*<math>SSE_p = \sum_{i=1}^N(Y_i-Y_{pi})^2</math> is the error [[sum of squares]] for the model with ''P'' [[regressor]]s,
*''Y''<sub>''pi''</sub> is the [[predict]]ed value of the ''i''th observation of ''Y'' from the ''P'' regressors,
*''S''<sup>2</sup> is the residual mean square after [[Regression analysis|regression]] on the complete set of ''K'' [[regressor]]s and can be estimated by [[mean square error]] ''MSE'',
* and ''N'' is the [[sample size]].
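
The definition above can be sketched in code. This is an illustrative implementation rather than a reference one: it assumes NumPy, that ''P'' and ''K'' count every fitted column (including an intercept column, if present), and that ''S''<sup>2</sup> is computed as the full-model error sum of squares divided by ''N''&nbsp;−&nbsp;''K''. The function names <code>sse</code> and <code>mallows_cp</code> and the simulated data are hypothetical.

```python
import numpy as np

def sse(X, y):
    """Error sum of squares of the OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def mallows_cp(X_full, y, cols):
    """C_p = SSE_p / S^2 - N + 2P for the submodel using columns `cols`.

    Assumption: X_full is the (N, K) design matrix of the complete model,
    and S^2 is its residual mean square, SSE_K / (N - K).
    """
    n, k = X_full.shape
    s2 = sse(X_full, y) / (n - k)
    p = len(cols)
    return sse(X_full[:, cols], y) / s2 - n + 2 * p

# Simulated example: intercept plus three candidate predictors (K = 4).
rng = np.random.default_rng(0)
n = 40
Z = rng.normal(size=(n, 3))
y = 0.5 + 1.5 * Z[:, 0] + rng.normal(size=n)
X_full = np.column_stack([np.ones(n), Z])

cp_small = mallows_cp(X_full, y, [0, 1])        # intercept + first predictor
cp_full = mallows_cp(X_full, y, [0, 1, 2, 3])   # complete model
print(cp_small, cp_full)
```

Under these assumptions the complete model always has ''C<sub>p</sub>'' equal to ''K'', since ''SSE<sub>K</sub>''/''S''<sup>2</sup> = ''N''&nbsp;−&nbsp;''K''; the statistic is therefore only informative when comparing proper subsets.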
 
==Practical use==
 
The ''C<sub>p</sub>'' statistic is often used as a stopping rule for various forms of [[stepwise regression]]. Mallows proposed the statistic as a criterion for selecting among many alternative subset regressions. Under a model not suffering from appreciable lack of fit (bias), ''C<sub>p</sub>'' has expectation nearly equal to ''P''; otherwise the expectation is roughly ''P'' plus a positive bias term. Nevertheless, even though its expectation is greater than or equal to ''P'', nothing prevents ''C<sub>p</sub>'' < ''P'', or even ''C<sub>p</sub>'' < 0, in extreme cases. It is suggested that, for a list of subsets ordered by increasing ''P'', one should choose a subset whose ''C<sub>p</sub>'' approaches ''P'' from above.<ref>{{cite book |last1=Daniel |first1=C. |last2=Wood |first2=F. |year=1980 |title=Fitting Equations to Data |edition=Rev. |location=New York |publisher=Wiley & Sons, Inc.}}</ref> In practice, the positive bias can be adjusted for by selecting a model from the ordered list of subsets such that ''C<sub>p</sub>'' < 2''P''.
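
The subset comparison described above can be sketched as an exhaustive search. This is a hedged illustration, not Mallows's own procedure: it assumes NumPy, simulated data, an intercept column that is kept in every subset, and ''P'' counting all fitted columns. The function <code>rank_subsets</code> and its <code>always</code> parameter are hypothetical names.

```python
import itertools
import numpy as np

def sse(X, y):
    """Error sum of squares of the OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def rank_subsets(X_full, y, always=(0,)):
    """Return (C_p, P, columns) for every subset containing `always`, sorted by C_p."""
    n, k = X_full.shape
    s2 = sse(X_full, y) / (n - k)          # residual mean square of the full model
    optional = [j for j in range(k) if j not in always]
    rows = []
    for r in range(len(optional) + 1):
        for extra in itertools.combinations(optional, r):
            cols = tuple(always) + extra
            p = len(cols)
            cp = sse(X_full[:, list(cols)], y) / s2 - n + 2 * p
            rows.append((cp, p, cols))
    return sorted(rows)

# Simulated data: columns 1 and 2 matter, columns 3 and 4 do not (assumed setup).
rng = np.random.default_rng(1)
n = 100
Z = rng.normal(size=(n, 4))
y = 1.0 + 3.0 * Z[:, 0] + 3.0 * Z[:, 1] + rng.normal(size=n)
X_full = np.column_stack([np.ones(n), Z])

ranked = rank_subsets(X_full, y)
for cp, p, cols in ranked[:5]:
    print(f"cols={cols}  P={p}  Cp={cp:8.2f}  Cp<2P: {cp < 2 * p}")
```

Subsets that omit a strongly relevant predictor have ''C<sub>p</sub>'' far above ''P'', reflecting the bias term, while subsets containing the relevant predictors have ''C<sub>p</sub>'' close to ''P''. An exhaustive search visits 2<sup>''K''−1</sup> subsets here, so it is only practical for a small number of candidate regressors.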
 
Since the sample-based ''C<sub>p</sub>'' statistic is an estimate of the MSPE, using ''C<sub>p</sub>'' for model selection does not completely guard against overfitting.  For instance, it is possible that the selected model will be one in which the sample ''C<sub>p</sub>'' was a particularly severe underestimate of the MSPE.
 
Model selection statistics such as ''C<sub>p</sub>'' are generally not used blindly; rather, information about the field of application, the intended use of the model, and any known biases in the data is taken into account in the process of model selection.
 
==References==
*{{Cite journal | doi = 10.2307/2529336 | last1 = Hocking | first1 = R. R. | year = 1976 | title = The Analysis and Selection of Variables in Linear Regression | jstor = 2529336 | journal = Biometrics | volume = 32 | issue = 1 | pages = 1–50 }}
{{reflist}}
 
[[Category:Regression analysis]]
[[Category:Regression variable selection]]

Revision as of 03:54, 5 March 2014
