{{About|the partition of sums of squares in statistics||Sum of squares (disambiguation){{!}}Sum of squares}}
{{Expert-subject|Statistics|date=November 2008}}
The '''partition of sums of squares''' is a concept that permeates much of [[inferential statistics]] and [[descriptive statistics]]. More properly, it is the '''partitioning of sums of [[squared deviations]] or errors'''. Mathematically, the sum of squared deviations is an unscaled, or unadjusted, measure of [[statistical dispersion|dispersion]] (also called [[statistical variability|variability]]). When scaled for the number of [[Degrees of freedom (statistics)|degrees of freedom]], it estimates the [[variance]], or spread of the observations about their mean value. Partitioning of the sum of squared deviations into various components allows the overall variability in a dataset to be ascribed to different types or sources of variability, with the relative importance of each being quantified by the size of each component of the overall sum of squares.
==Background==
The distance from any point in a collection of data, to the mean of the data, is the deviation. This can be written as <math>y_i - \overline{y}</math>, where <math>y_i</math> is the ''i''th data point, and <math>\overline{y}</math> is the estimate of the mean. If all such deviations are squared, then summed, as in <math>\sum_{i=1}^n\left(y_i-\overline{y}\,\right)^2</math>, this gives the "sum of squares" for these data.
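The following minimal sketch (using a small hypothetical sample and NumPy, chosen purely for illustration and not drawn from this article's sources) computes the deviations from the mean and their sum of squares exactly as defined above:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical sample of n = 8 observations (illustration only)
y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

y_bar = y.mean()                      # estimate of the mean
deviations = y - y_bar                # y_i - y_bar for each data point
sum_of_squares = np.sum(deviations ** 2)

print(sum_of_squares)                 # 32.0 for this sample
</syntaxhighlight>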
When more data are added to the collection the sum of squares will increase, except in unlikely cases such as the new data being equal to the mean. So usually, the sum of squares will grow with the size of the data collection. That is a manifestation of the fact that it is unscaled.
In many cases, the number of [[degrees of freedom (statistics)|degrees of freedom]] is simply the number of data points in the collection, minus one. We write this as ''n'' − 1, where ''n'' is the number of data points.
Scaling (also known as normalizing) means adjusting the sum of squares so that it does not grow as the size of the data collection grows. This is important when we want to compare samples of different sizes, such as a sample of 100 people compared to a sample of 20 people. If the sum of squares were not normalized, its value would always be larger for the sample of 100 people than for the sample of 20 people. To scale the sum of squares, we divide it by the degrees of freedom, i.e., calculate the sum of squares per degree of freedom, or variance. [[Standard deviation]], in turn, is the square root of the variance.
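A minimal sketch of this scaling, again with hypothetical data and NumPy (for illustration only), divides the sum of squares by the ''n'' − 1 degrees of freedom to obtain the variance and takes its square root for the standard deviation; the <code>ddof=1</code> argument of NumPy's own functions applies the same scaling:

<syntaxhighlight lang="python">
import numpy as np

y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # hypothetical data
n = y.size

sum_of_squares = np.sum((y - y.mean()) ** 2)
variance = sum_of_squares / (n - 1)      # sum of squares per degree of freedom
std_dev = np.sqrt(variance)

# np.var and np.std with ddof=1 use the same n - 1 scaling
assert np.isclose(variance, np.var(y, ddof=1))
assert np.isclose(std_dev, np.std(y, ddof=1))
</syntaxhighlight>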
The above describes how the sum of squares is used in descriptive statistics; see the article on [[total sum of squares]] for an application of this broad principle to [[inferential statistics]].
==Partitioning the sum of squares in linear regression==
'''Theorem.''' Given a linear regression model <math> y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + \varepsilon_i </math> ''including a constant'' based on a sample <math> (y_i, x_{i1}, \ldots, x_{ip}), \, i = 1, \ldots, n </math> containing ''n'' observations, the total sum of squares <math> \sum_{i = 1}^n (y_i - \bar{y})^2 </math> (TSS) can be partitioned as follows into the [[explained sum of squares]] (ESS) and the [[residual sum of squares]] (RSS):

:<math>\mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS},</math>

where this equation is equivalent to each of the following forms:

:<math>
\begin{align}
\left\| y - \bar{y} \mathbf{1} \right\|^2 &= \left\| \hat{y} - \bar{y} \mathbf{1} \right\|^2 + \left\| \hat{\varepsilon} \right\|^2, \quad \mathbf{1} = (1, 1, \ldots, 1)^T ,\\
\sum_{i = 1}^n (y_i - \bar{y})^2 &= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n (y_i - \hat{y}_i)^2 ,\\
\sum_{i = 1}^n (y_i - \bar{y})^2 &= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n \hat{\varepsilon}_i^2 .
\end{align}
</math>
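The identity can be checked numerically. The sketch below (simulated data and plain NumPy least squares; none of it drawn from the article's sources) fits an ordinary least squares model with a constant term and verifies that TSS equals ESS + RSS:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Simulated sample: n observations, p = 2 regressors (illustration only)
n = 50
x = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * x[:, 0] - 0.5 * x[:, 1] + rng.normal(size=n)

# Design matrix with a column of ones (the constant term)
X = np.column_stack([np.ones(n), x])

# Ordinary least squares fit
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
residuals = y - y_hat

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
rss = np.sum(residuals ** 2)

assert np.isclose(tss, ess + rss)     # TSS = ESS + RSS
</syntaxhighlight>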
===Proof===
:<math>
\begin{align}
\sum_{i = 1}^n (y_i - \overline{y})^2 &= \sum_{i = 1}^n (y_i - \overline{y} + \hat{y}_i - \hat{y}_i)^2
= \sum_{i = 1}^n \left( (\hat{y}_i - \bar{y}) + \underbrace{(y_i - \hat{y}_i)}_{\hat{\varepsilon}_i} \right)^2 \\
&= \sum_{i = 1}^n \left( (\hat{y}_i - \bar{y})^2 + 2 \hat{\varepsilon}_i (\hat{y}_i - \bar{y}) + \hat{\varepsilon}_i^2 \right) \\
&= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n \hat{\varepsilon}_i^2 + 2 \sum_{i = 1}^n \hat{\varepsilon}_i (\hat{y}_i - \bar{y}) \\
&= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n \hat{\varepsilon}_i^2 + 2 \sum_{i = 1}^n \hat{\varepsilon}_i (\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_p x_{ip} - \overline{y}) \\
&= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n \hat{\varepsilon}_i^2 + 2 (\hat{\beta}_0 - \overline{y}) \underbrace{\sum_{i = 1}^n \hat{\varepsilon}_i}_0 + 2 \hat{\beta}_1 \underbrace{\sum_{i = 1}^n \hat{\varepsilon}_i x_{i1}}_0 + \cdots + 2 \hat{\beta}_p \underbrace{\sum_{i = 1}^n \hat{\varepsilon}_i x_{ip}}_0 \\
&= \sum_{i = 1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i = 1}^n \hat{\varepsilon}_i^2 = \mathrm{ESS} + \mathrm{RSS}
\end{align}
</math>
The requirement that the model include a constant, or equivalently that the design matrix contain a column of ones, ensures that <math> \sum_{i = 1}^n \hat{\varepsilon}_i = 0 </math>.
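The role of the constant can be seen by omitting it. The sketch below (hypothetical simulated data, for illustration only) fits a model without a column of ones; in general the residuals then no longer sum to zero and the partition fails:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 3.0 + 2.0 * x + rng.normal(size=n)   # data generated with a nonzero intercept

# Fit *without* a column of ones
X_no_const = x.reshape(-1, 1)
beta_hat, *_ = np.linalg.lstsq(X_no_const, y, rcond=None)
y_hat = X_no_const @ beta_hat
residuals = y - y_hat

tss = np.sum((y - y.mean()) ** 2)
ess = np.sum((y_hat - y.mean()) ** 2)
rss = np.sum(residuals ** 2)

print(residuals.sum())    # generally nonzero when the constant is omitted
print(tss - (ess + rss))  # generally nonzero: TSS != ESS + RSS
</syntaxhighlight>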
Some readers may find the following version of the proof, set in vector form, more enlightening:
:<math>
\begin{align}
SS_\text{total} = \left\| \mathbf{y} - \bar{y} \mathbf{1} \right\|^2 & = \left\| \mathbf{y} - \bar{y} \mathbf{1} + \hat{\mathbf{y}} - \hat{\mathbf{y}} \right\|^2 , \\
& = \left\| \left( \hat{\mathbf{y}} - \bar{y} \mathbf{1} \right) + \left( \mathbf{y} - \hat{\mathbf{y}} \right) \right\|^2 , \\
& = \left\| \hat{\mathbf{y}} - \bar{y} \mathbf{1} \right\|^2 + \left\| \hat{\varepsilon} \right\|^2 + 2 \hat{\varepsilon}^T \left( \hat{\mathbf{y}} - \bar{y} \mathbf{1} \right) , \\
& = SS_\text{regression} + SS_\text{error} + 2 \hat{\varepsilon}^T \left( X \hat{\beta} - \bar{y} \mathbf{1} \right) , \\
& = SS_\text{regression} + SS_\text{error} + 2 \left( \hat{\varepsilon}^T X \right) \hat{\beta} - 2 \bar{y} \hat{\varepsilon}^T \mathbf{1} , \\
& = SS_\text{regression} + SS_\text{error} .
\end{align}
</math>
The elimination of terms in the last line uses the fact that
: <math>
\hat{\varepsilon}^T X = \left( \mathbf{y} - \hat{\mathbf{y}} \right)^T X
= \mathbf{y}^T \left( I - X \left( X^T X \right)^{-1} X^T \right) X = \mathbf{y}^T \left( X - X \right) = \mathbf{0}.
</math>
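This orthogonality of the residual vector to the columns of the design matrix can also be checked directly on simulated data (a sketch using NumPy and an explicit hat matrix, for illustration only):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # includes a constant
y = X @ rng.normal(size=p + 1) + rng.normal(size=n)

# Residuals via the hat matrix H = X (X^T X)^{-1} X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T
residuals = (np.eye(n) - H) @ y

# The residual vector is orthogonal to every column of X
assert np.allclose(residuals @ X, 0.0)
</syntaxhighlight>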
===Further partitioning===
Note that the residual sum of squares can be further partitioned as the [[lack-of-fit sum of squares]] plus the sum of squares due to pure error.
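When the data contain replicated observations at the same regressor values, this further partition can be computed directly. The following sketch (hypothetical data with replicates, plain NumPy; purely illustrative) splits the residual sum of squares of a simple linear fit into a pure-error component and a lack-of-fit component:

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data with two replicates at each x value (illustration only)
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = np.array([1.1, 1.3, 2.0, 2.4, 2.9, 3.5, 3.4, 4.2])

# Simple linear fit with a constant
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = np.sum((y - X @ beta_hat) ** 2)

levels = np.unique(x)

# Pure-error SS: squared deviations of replicates about their group means
pure_error = sum(np.sum((y[x == level] - y[x == level].mean()) ** 2)
                 for level in levels)

# Lack-of-fit SS: group means versus the fitted line, weighted by replicate counts
lack_of_fit = sum(np.sum(x == level)
                  * (y[x == level].mean() - (beta_hat[0] + beta_hat[1] * level)) ** 2
                  for level in levels)

assert np.isclose(rss, lack_of_fit + pure_error)   # RSS = lack-of-fit SS + pure-error SS
</syntaxhighlight>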
==See also==
* [[Inner-product space]]
** [[Hilbert space]]
*** [[Euclidean space]]
** [[Orthogonality]]
** [[Orthonormal basis]]
*** [[Orthogonal complement]], the closed subspace orthogonal to a set (especially a subspace)
*** [[Orthomodular lattice]] of the subspaces of an inner-product space
*** [[Orthogonal projection]]
** [[Pythagorean theorem]] that the sum of the squared norms of orthogonal summands equals the squared norm of the sum.
* [[Least squares]]
* [[Mean squared error]]
* [[Squared deviations]]
==References==
* {{cite book |last=Bailey |first=R. A. |authorlink=Rosemary A. Bailey |title=Design of Comparative Experiments |publisher=[http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521683579 Cambridge University Press] |year=2008 |isbn=978-0-521-68357-9 |url=http://www.maths.qmul.ac.uk/~rab/DOEbook}} Pre-publication chapters are available on-line.
* {{cite book |title=Plane Answers to Complex Questions: The Theory of Linear Models |last=Christensen |first=Ronald |location=New York |publisher=Springer |year=2002 |edition=Third |isbn=0-387-95361-2}}
* {{cite book |title=Prediction and Regulation |last=Whittle |first=Peter |authorlink=Peter Whittle |publisher=English Universities Press |year=1963 |isbn=0-8166-1147-5}}
*: Republished as: {{cite book |title=Prediction and Regulation by Linear Least-Square Methods |last=Whittle |first=P. |publisher=University of Minnesota Press |year=1983 |isbn=0-8166-1148-3}}
* {{cite book |title=Probability Via Expectation |edition=4th |last=Whittle |first=P. |publisher=Springer |date=20 April 2000 |isbn=0-387-98955-2}}
[[Category:Analysis of variance]]
[[Category:Regression analysis]]
[[Category:Least squares]]