In [[statistics]], '''leverage''' is a term used in connection with [[regression analysis]] and, in particular, in analyses aimed at identifying observations whose values of the predictor variables lie far from the corresponding average predictor values. Leverage points do not necessarily have a large effect on the outcome of fitting regression models.
 
'''Leverage points''' are those observations, if any, made at extreme or outlying values of the ''independent variables'', such that the lack of neighboring observations means that the fitted regression model will pass close to that particular observation.<ref>Everitt, B. S. (2002). ''The Cambridge Dictionary of Statistics''. Cambridge University Press. ISBN 0-521-81099-X.</ref>
 
Modern computer packages for statistical analysis include, as part of their facilities for regression analysis, various quantitative measures for identifying [[influential observation]]s: among these measures is [[partial leverage]], a measure of how a variable contributes to the leverage of a datum.
 
==Definition==
The leverage score for the <math> i^{th} </math> observation is defined as:
*<math> h_{ii}=(H)_{ii} </math>,
the <math> i^{th} </math> diagonal element of the hat matrix <math> H=X(X'X)^{-1}X' </math>, where <math> X </math> is the design matrix.
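
Since the leverages are simply the diagonal entries of <math> H </math>, they are straightforward to compute from the design matrix. Below is a minimal numerical sketch, assuming NumPy and an illustrative design matrix:

<syntaxhighlight lang="python">
import numpy as np

# Illustrative design matrix: an intercept column plus one predictor;
# the last observation has an outlying predictor value.
X = np.array([[1.0, 0.1],
              [1.0, 0.5],
              [1.0, 0.9],
              [1.0, 5.0]])

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverage scores.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverages = np.diag(H)
print(leverages)  # the outlying observation has leverage close to 1
</syntaxhighlight>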
 
==Properties==
<math> 0 \leq h_{ii} \leq 1 </math>
 
===Proof===
First, note that <math> H^2=X(X'X)^{-1}X'X(X'X)^{-1}X'=X(X'X)^{-1}X'=H </math>, so <math> H </math> is idempotent. Also observe that <math> H </math> is symmetric, so that <math> h_{ij}=h_{ji} </math>.
Writing the <math> i^{th} </math> diagonal entry of <math> H=H^2 </math> as a sum, we have
*<math> h_{ii}=\sum_j h_{ij}h_{ji}=\sum_j h_{ij}^2=h_{ii}^2+\sum_{j\neq i}h_{ij}^2 \geq 0 </math>
and, since the remaining sum is non-negative,
*<math> h_{ii} \geq h_{ii}^2 \implies h_{ii}\leq 1 </math>
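
These bounds, together with the idempotence and symmetry of <math> H </math>, are easy to verify numerically. A quick sketch, assuming NumPy and an arbitrary random design matrix:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # arbitrary 20x3 design matrix
H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix

assert np.allclose(H @ H, H)          # H is idempotent
assert np.allclose(H, H.T)            # H is symmetric
h = np.diag(H)
assert np.all((h >= 0) & (h <= 1))    # every leverage lies in [0, 1]
</syntaxhighlight>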
 
 
If we are in an [[ordinary least squares]] setting with fixed <math> X </math> and:
*<math> Y=X\beta+\epsilon </math>
*<math>var(\epsilon)=\sigma^2I </math>
then  <math> var(e_i)=(1-h_{ii})\sigma^2 </math> where <math> e_i=Y_i-\hat{Y}_i </math>.
 
In other words, if the errors <math> \epsilon </math> are [[Homoscedasticity|homoscedastic]], the leverage score of an observation determines the variance of its residual: the larger the leverage, the smaller the residual variance, because the fit is pulled toward high-leverage observations.
===Proof===
First, note that the residual vector is <math> e=Y-\hat{Y}=Y-HY=(I-H)Y </math>, and that <math> I-H </math> is idempotent and symmetric because <math> H </math> is. This gives
<math> var(e)=var((I-H)Y)=(I-H)var(Y)(I-H)'=\sigma^2(I-H)(I-H)'=\sigma^2(I-H)^2=\sigma^2(I-H) </math>.
 
Taking the <math> i^{th} </math> diagonal entry gives <math> var(e_i)=(1-h_{ii})\sigma^2 </math>.
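
The identity is also easy to check by simulation: under homoscedastic errors, the empirical variance of each residual approaches <math> (1-h_{ii})\sigma^2 </math>. A sketch assuming NumPy, with the design matrix, <math> \beta </math>, and <math> \sigma </math> chosen arbitrarily for illustration:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 50, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one predictor
beta = np.array([1.0, 3.0])
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Draw many replicate data sets; each row of Y is one sample of length n.
Y = X @ beta + sigma * rng.normal(size=(20000, n))
E = Y - Y @ H  # residuals e = (I - H)y for every replicate (H is symmetric)

# Empirical residual variances should match (1 - h_ii) * sigma^2.
print(np.allclose(E.var(axis=0), (1 - np.diag(H)) * sigma**2, rtol=0.05))
</syntaxhighlight>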
 
 
 
 
==See also==
* [[Hat matrix]] — whose main diagonal entries are the leverages of the observations
* [[Mahalanobis distance]] — a measure of leverage of a datum
* [[Cook's distance]] — a measure of changes in regression coefficients when an observation is deleted
* [[DFFITS]]
* [[Outliers]] — observations with extreme Y values
 
==References==
{{reflist}}
 
[[Category:Regression analysis]]
[[Category:Statistical terminology]]
[[Category:Regression diagnostics]]
