In [[statistics]], '''leverage''' is a term used in connection with [[regression analysis]] and, in particular, in analyses aimed at identifying those observations that are far away from the corresponding average predictor values. Leverage points do not necessarily have a large effect on the outcome of fitting regression models.

'''Leverage points''' are those observations, if any, made at extreme or outlying values of the ''independent variables'', such that the lack of neighboring observations means that the fitted regression model will pass close to that particular observation.<ref>Everitt, B.S. (2002) ''Cambridge Dictionary of Statistics''. CUP. ISBN 0-521-81099-X</ref>

Modern computer packages for statistical analysis include, as part of their facilities for regression analysis, various quantitative measures for identifying [[influential observation]]s: among these measures is [[partial leverage]], a measure of how a variable contributes to the leverage of a datum.
==Definition==

The leverage score for the <math> i^{th} </math> data unit is defined as:

*<math> h_{ii}=(H)_{ii} </math>,

the <math> i^{th} </math> diagonal element of the hat matrix <math> H=X(X'X)^{-1}X' </math>.
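The definition translates directly into code. The following is a minimal [[NumPy]] sketch (the data values are arbitrary, chosen only for illustration): it forms the hat matrix explicitly and reads off its diagonal.

<syntaxhighlight lang="python">
import numpy as np

# Illustrative design matrix: an intercept column plus one predictor.
# The last observation has an extreme predictor value, so it should
# receive the largest leverage score.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
X = np.column_stack([np.ones_like(x), x])

# Hat matrix H = X (X'X)^{-1} X'
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Leverage scores are the diagonal entries h_ii.
leverage = np.diag(H)
print(leverage)        # the outlying x = 10 gets by far the largest h_ii
print(leverage.sum())  # trace of H equals the number of model parameters (here 2)
</syntaxhighlight>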
==Properties==

<math> 0 \leq h_{ii} \leq 1 </math>

===Proof===

First, note that <math> H^2=X(X'X)^{-1}X'X(X'X)^{-1}X'=X(X'X)^{-1}X'=H </math>, so <math> H </math> is idempotent. Also, observe that <math> H </math> is symmetric, so <math> h_{ij}=h_{ji} </math>. Writing out the <math> i^{th} </math> diagonal entry of <math> H=H^2 </math> then gives

*<math> h_{ii}=\sum_j h_{ij}h_{ji}=h_{ii}^2+\sum_{j\neq i}h_{ij}^2 \geq 0 </math>

and

*<math> h_{ii} \geq h_{ii}^2 \implies h_{ii}\leq 1 </math>
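Both properties are easy to verify numerically. The following sketch uses a random full-rank design matrix, an arbitrary choice for illustration; any full-rank <math> X </math> would do.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Any full-rank design matrix works; this one is random for illustration.
X = rng.normal(size=(8, 3))
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

assert np.allclose(H, H.T)           # H is symmetric
assert np.allclose(H @ H, H)         # H is idempotent: H^2 = H
assert np.all((h >= 0) & (h <= 1))   # hence 0 <= h_ii <= 1
</syntaxhighlight>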
If we are in an [[ordinary least squares]] setting with fixed design matrix <math> X </math> and:

*<math> Y=X\beta+\epsilon </math>

*<math> \operatorname{var}(\epsilon)=\sigma^2I </math>

then <math> \operatorname{var}(e_i)=(1-h_{ii})\sigma^2 </math>, where <math> e_i=Y_i-\hat{Y}_i </math> is the <math> i^{th} </math> residual.
In other words, if the <math> \epsilon </math> are [[homoscedasticity|homoscedastic]], the leverage scores determine the noise level of the residuals: the larger the leverage of an observation, the smaller the variance of its residual, because the fitted regression is pulled closer to high-leverage points.

===Proof===

First, note that <math> I-H </math> is idempotent and symmetric. This gives

<math> \operatorname{var}(e)=\operatorname{var}((I-H)Y)=(I-H)\operatorname{var}(Y)(I-H)'=\sigma^2(I-H)^2=\sigma^2(I-H), </math>

so that <math> \operatorname{var}(e_i)=(1-h_{ii})\sigma^2 </math>.
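This identity can also be checked by simulation. The sketch below (again with made-up values for <math> X </math>, <math> \beta </math>, and <math> \sigma </math>) compares the empirical variance of each residual across many simulated data sets with <math> (1-h_{ii})\sigma^2 </math>.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Fixed design with one extreme predictor value, plus made-up true parameters.
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
X = np.column_stack([np.ones_like(x), x])
beta = np.array([1.0, 0.5])
sigma = 2.0

H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Simulate many homoscedastic data sets; residuals are e = (I - H) Y.
n_sims = 100_000
eps = rng.normal(0.0, sigma, size=(n_sims, len(x)))
Y = X @ beta + eps
E = Y @ (np.eye(len(x)) - H)   # (I - H) is symmetric, so right-multiplying rows works

print(E.var(axis=0))       # empirical var(e_i) for each observation
print((1 - h) * sigma**2)  # theoretical (1 - h_ii) * sigma^2; should agree closely
</syntaxhighlight>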
==See also==
* [[Hat matrix]] — whose main diagonal entries are the leverages of the observations
* [[Mahalanobis distance]] — a measure of the leverage of a datum
* [[Cook's distance]] — a measure of the change in the regression coefficients when an observation is deleted
* [[DFFITS]]
* [[Outliers]] — observations with extreme <math> Y </math> values
==References==
{{reflist}}

[[Category:Regression analysis]]
[[Category:Statistical terminology]]
[[Category:Regression diagnostics]]