Linear model: Difference between revisions

In [[statistics]],
the '''likelihood principle''' is a controversial principle of [[statistical inference]] which asserts that all of the [[information]] in a [[Sampling (statistics)|sample]] is contained in the [[likelihood function]].
 
A [[likelihood function]] arises from a [[conditional probability distribution]] considered as a function of its parameter argument, with the data argument held fixed. For example, consider a model which gives the [[probability density function]] of an observable [[random variable]] ''X'' as a function of a parameter θ.
Then for a specific value ''x'' of ''X'', the function ''L''(θ | ''x'') = P(''X''=''x'' | θ) is a likelihood function of θ: it gives a measure of how "likely" any particular value of θ is, if we know that ''X'' has the value ''x''. Two likelihood functions are equivalent if one is a positive scalar multiple of the other. The '''likelihood principle''' states that all information from the data relevant to inferences about the value of θ is contained in the equivalence class to which the likelihood function belongs. The '''strong likelihood principle''' applies this same criterion to cases such as sequential experiments where the sample of data that is available results from applying a [[stopping rule]] to the observations earlier in the experiment.<ref>Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9</ref>
 
==Example==
 
Suppose
 
*''X'' is the number of successes in twelve [[statistical independence|independent]] [[Bernoulli trial]]s with probability θ of success on each trial, and
*''Y'' is the number of independent Bernoulli trials needed to get three successes, again with probability θ of success on each trial.
 
Then the observation that ''X'' = 3 induces the likelihood function
 
:<math>L(\theta|X=3)=\begin{pmatrix}12\\3\end{pmatrix}\;\theta^3(1-\theta)^9=220\;\theta^3(1-\theta)^9</math>
 
and the observation that ''Y'' = 12 induces the likelihood function
 
:<math>L(\theta|Y=12)=\begin{pmatrix}11\\2\end{pmatrix}\;\theta^3(1-\theta)^9=55\;\theta^3(1-\theta)^9.</math>
 
These are equivalent because each is a scalar multiple of the other.  The likelihood principle therefore says the inferences drawn about the value of θ should be the same in both cases.
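
The proportionality of the two likelihood functions can be checked numerically. The following is a minimal sketch in Python; the function names and the grid of θ values are illustrative choices, not part of the example above.

```python
from math import comb

def binomial_likelihood(theta, n=12, k=3):
    """P(X = k | theta): k successes in a fixed number n of trials."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def negative_binomial_likelihood(theta, r=3, n=12):
    """P(Y = n | theta): the r-th success occurs exactly on trial n."""
    return comb(n - 1, r - 1) * theta**r * (1 - theta)**(n - r)

# Illustrative parameter values; any theta strictly between 0 and 1 works.
thetas = [0.1, 0.25, 0.5, 0.9]
ratios = [binomial_likelihood(t) / negative_binomial_likelihood(t) for t in thetas]
print(ratios)  # each ratio is 220/55 = 4, up to floating-point rounding
```

Because the ratio does not depend on θ, the two functions belong to the same equivalence class.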
 
The difference between observing ''X'' = 3 and observing ''Y'' = 12 is only in the [[design of experiments|design of the experiment]]: in one case, one has decided in advance to try twelve times; in the other, to keep trying until three successes are observed.  The ''outcome'' is the same in both cases.
 
==The law of likelihood==
 
A related concept is the '''law of likelihood''', the notion that the extent to which the evidence supports one parameter value or hypothesis against another is equal to the ratio of their likelihoods.
That is,
:<math>\Lambda = {L(a|X=x) \over L(b|X=x)} = {P(X=x|a) \over P(X=x|b)}</math>
is the degree to which the observation x supports parameter value or hypothesis ''a'' against ''b''.
If this ratio is 1, the evidence is indifferent,
and if greater or less than 1, the evidence supports ''a'' against ''b'' or vice versa.  The use of [[Bayes factor]]s can extend this by taking account of the complexity of different hypotheses.
 
Combining the likelihood principle with the law of likelihood yields the consequence that the parameter value which maximizes the likelihood function is the value which is most strongly supported by the evidence.
This is the basis for the widely used [[maximum likelihood|method of maximum likelihood]].
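
As a rough illustration, the likelihood ratio and the maximizing parameter value can be computed for the data of the example above (3 successes and 9 failures). This is only a sketch: the grid search and the helper names are assumptions made here for illustration, and the constant factor of the likelihood is dropped because it cancels in any ratio.

```python
def likelihood_kernel(theta, successes=3, failures=9):
    """Likelihood up to a constant factor: theta^3 * (1 - theta)^9."""
    return theta**successes * (1 - theta)**failures

def likelihood_ratio(a, b):
    """Degree to which the data support parameter value a against b."""
    return likelihood_kernel(a) / likelihood_kernel(b)

# Support for theta = 0.25 against theta = 0.5; greater than 1 favours 0.25.
ratio = likelihood_ratio(0.25, 0.5)

# Crude grid search for the maximizing value (hypothetical resolution 0.001).
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=likelihood_kernel)

print(ratio, theta_hat)
```

The grid maximum falls at 0.25, which is the observed proportion of successes, 3/12.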
 
== Historical remarks ==
 
The likelihood principle was first identified by that name in print in 1962
(Barnard et al., Birnbaum, and Savage et al.),
but arguments for the same principle, unnamed, and the use of the principle in applications go back to the works of [[Ronald A. Fisher|R.A. Fisher]] in the 1920s.
The law of likelihood was identified by that name by [[Ian Hacking|I. Hacking]] (1965).
More recently the likelihood principle as a general principle of inference has been championed by [[A. W. F. Edwards]].
The likelihood principle has been applied to the [[philosophy of science]] by [[Richard M. Royall|R. Royall]].
 
[[Allan Birnbaum|Birnbaum]] proved that the likelihood principle follows from two more primitive and seemingly reasonable principles, the ''[[conditionality principle]]'' and the ''[[sufficiency principle]]''. The conditionality principle says that if an experiment is chosen by a random process independent of the states of nature <math>\theta</math>, then only the experiment actually performed is relevant to inferences about <math>\theta</math>. The sufficiency principle says that if <math>T(X)</math> is a [[sufficient statistic]] for <math>\theta</math>, and if in two experiments with data <math>x_1</math> and <math>x_2</math> we have  <math>T(x_1)=T(x_2)</math>, then the evidence about <math>\theta</math> given by the two experiments is the same.
 
== Arguments for and against the likelihood principle ==
Some widely used methods of conventional statistics, for example many [[statistical hypothesis testing|significance test]]s, are not consistent with the likelihood principle.
 
Let us briefly consider some of the arguments for and against the likelihood principle.
 
===The original Birnbaum argument===
 
Birnbaum's proof of the likelihood principle is not widely accepted among statisticians and has been disputed by philosophers of science such as Deborah Mayo<ref>Mayo, D. (2010) [http://www.phil.vt.edu/dmayo/personal_website/ch%207%20mayo%20birnbaum%20proof.pdf "An Error in the Argument from Conditionality and Sufficiency to the Likelihood Principle"] in ''Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science'' (D Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 305-14.</ref><ref>Mayo, Deborah (2013) [http://arxiv-web3.library.cornell.edu/pdf/1302.7021v2.pdf On the Birnbaum Argument for the Strong Likelihood Principle]</ref> and statisticians such as Michael Evans.<ref>Evans, Michael (2013) [http://arxiv.org/abs/1302.5468 What does the proof of Birnbaum's theorem prove?]</ref>
 
===Experimental design arguments on the likelihood principle===
Unrealized events do play a role in some common statistical methods.
For example, the result of a [[statistical hypothesis testing|significance test]] depends on the [[p-value|''p''-value]], the probability of a result as extreme or more extreme than the observation, and that probability may depend on the design of the experiment. Thus, to the extent that such methods are accepted, the likelihood principle is denied.
 
Some classical significance tests are not based on the likelihood.
A commonly cited example is the [[optional stopping]] problem.
Suppose I tell you that I tossed a coin 12 times and in the process observed 3 heads.
You might make some inference about the probability of heads and whether the coin was fair.
Suppose now I tell you that I tossed the coin until I observed 3 heads, and that this took 12 tosses. Will you now make some different inference?
 
The likelihood function is the same in both cases: it is proportional to
:<math>p^3 \; (1-p)^9</math>.
 
According to the likelihood principle, the inference should be the same in either case.
 
Suppose a number of scientists are assessing the probability of a certain outcome (which we shall call 'success') in experimental trials. Conventional wisdom suggests that if there is no bias towards success or failure then the success probability would be one half.  Adam, a scientist, conducted 12 trials and obtained 3 successes and 9 failures.  Then he left the lab.
 
Bill, a colleague in the same lab, continued Adam's work and published Adam's results, along with a significance test. He tested the [[null hypothesis]] that ''p'', the success probability, is equal to a half, versus ''p'' < 0.5. The probability of the observed result or a more extreme one, that is, of 3 or fewer successes (equivalently, 9 or more failures) out of 12 trials, if ''H''<sub>0</sub> is true, is
:<math>\left({12 \choose 9}+{12 \choose 10}+{12 \choose 11}+{12 \choose 12}\right)\left({1 \over 2}\right)^{12}</math>
which is 299/4096 = 7.3%. Thus the null hypothesis is not rejected at the 5% significance level.
 
Charlotte, another scientist, reads Bill's paper and writes a letter, saying that it is possible that Adam kept trying until he obtained 3 successes, in which case the probability of needing to conduct 12 or more experiments is given by
:<math>1-\left({10 \choose 2}\left({1 \over 2}\right)^{11}+{9 \choose 2}\left({1 \over 2}\right)^{10}+\cdots +{2 \choose 2}\left({1 \over 2}\right)^{3}\right)</math>
which is 134/4096 = 3.27%. Now the result ''is'' statistically significant at the 5% level. Note that there is no contradiction between these two results; both computations are correct.
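
Both tail probabilities can be reproduced with exact rational arithmetic. The following Python sketch follows the two formulas given for Bill's and Charlotte's analyses; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

half = Fraction(1, 2)  # success probability under the null hypothesis

# Bill: 12 trials fixed in advance; probability of 3 or fewer successes.
# By symmetry at p = 1/2 this equals the probability of 9 or more successes.
p_fixed = sum(comb(12, k) for k in range(0, 4)) * half**12

# Charlotte: trials continue until the 3rd success; probability that
# 12 or more trials are needed, i.e. 1 - P(3rd success on trial 3..11).
p_sequential = 1 - sum(comb(n - 1, 2) * half**n for n in range(3, 12))

print(p_fixed, float(p_fixed))            # 299/4096, about 0.073
print(p_sequential, float(p_sequential))  # 67/2048 (= 134/4096), about 0.0327
```

The same data therefore fall on opposite sides of the 5% threshold, depending only on which stopping rule is assumed.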
 
To these scientists, whether a result is significant or not depends on the design of the experiment, not on the likelihood (in the sense of the likelihood function) of the parameter value being 1/2.
 
Results of this kind are considered by some as arguments against the likelihood principle. For others they exemplify the value of the likelihood principle and are an argument against significance tests.
 
Similar themes appear when comparing [[Fisher's exact test]] with [[Pearson's chi-squared test]].
 
===The voltmeter story===
 
An argument in favor of the likelihood principle is given by Edwards in his book ''Likelihood''. He cites the following story from J.W. Pratt, slightly condensed here. Note that the likelihood function depends only on what actually happened, and not on what ''could'' have happened.
 
: An engineer draws a random sample of electron tubes and measures their voltage. The measurements range from 75 to 99 volts. A statistician computes the sample mean and a confidence interval for the true mean. Later the statistician discovers that the voltmeter reads only as far as 100, so the population appears to be 'censored'. This necessitates a new analysis, if the statistician is orthodox. However, the engineer says he has another meter reading to 1000 volts, which he would have used if any voltage had been over 100. This is a relief to the statistician, because it means the population was effectively uncensored after all. But, the next day the engineer informs the statistician that this second meter was not working at the time of the measuring. The statistician ascertains that the engineer would not have held up the measurements until the meter was fixed, and informs him that new measurements are required. The engineer is astounded. "Next you'll be asking about my oscilloscope".
 
One might proceed with this story, and consider the fact that in general the actual situation could have been different. For instance, high range voltmeters don't break at predictable moments in time, but rather at unpredictable moments. So it ''could'' have been broken, with some probability. Orthodox sampling theory, unlike the likelihood principle, then implies that the analysis of the voltage measurements depends on the probability that an instrument not used in this experiment was broken at the time.
 
This story can be translated to Adam's stopping rule above, as follows. Adam stopped immediately after 3 successes, because his boss Bill had instructed him to do so. Adam did not die. After the publication of the statistical analysis by Bill, Adam discovers that he has missed a second instruction from Bill to conduct 12 trials instead, and that Bill's paper is based on this second instruction. Adam is very glad that he got his 3 successes after exactly 12 trials, and explains to his friend Charlotte that by coincidence he executed the second instruction. Later, he is astonished to hear about Charlotte's letter explaining that now the result is significant.
 
== See also ==
* [[Likelihood-ratio test]]
* [[Conditionality principle]]
 
== References ==
<references/>
* {{cite journal|last=Barnard|first=G.A.|coauthors=G.M. Jenkins, and C.B. Winsten|authorlink=George Alfred Barnard|year=1962|title=Likelihood Inference and Time Series|jstor=2982406|journal=Journal of the Royal Statistical Society, Series A|issn=0035-9238|volume=125|issue=3|pages=321–372|doi=10.2307/2982406}}
* {{cite book|last=Berger|first=J.O.|coauthors=and Wolpert, R.L.|title=The Likelihood Principle|edition=2nd|publisher=The Institute of Mathematical Statistics|location=Hayward, CA|year=1988|isbn=0-940600-13-7 |url=http://projecteuclid.org/euclid.lnms/1215466210}}
* {{cite journal|last=Birnbaum|first=Allan|authorlink=Allan Birnbaum|year=1962|title=On the foundations of statistical inference |jstor=2281640|journal=Journal of the American Statistical Association|issn=0162-1459|volume=57|issue=298|pages=269–326|doi= 10.2307/2281640|mr=0138176}} ''(With discussion.)''
* {{cite book|last=Edwards|first=Anthony W.F.|authorlink=A. W. F. Edwards|title=Likelihood|edition=1st|publisher=Cambridge University Press|location=Cambridge|year=1972|isbn=}}
* {{cite book|last=Edwards|first=Anthony W.F.|authorlink=A. W. F. Edwards|title=Likelihood|edition= 2nd|publisher=Johns Hopkins University Press|location=Baltimore|year=1992|isbn=0-8018-4445-2}}
* {{cite journal|last=Edwards|first=Anthony W.F.|authorlink=A. W. F. Edwards|year=1974|title=The history of likelihood |jstor=1402681|journal=International Statistical Review|issn=0306-7734|volume=42|issue=1|pages=9–15|doi=10.2307/1402681|mr=0353514}}
* {{cite journal|last=Fisher|first=Ronald A.|authorlink=Ronald A. Fisher|year=1922|title=On the Mathematical Foundations of Theoretical Statistics|url=http://digital.library.adelaide.edu.au/dspace/handle/2440/15172 |format=PDF fulltext|journal=[[Philosophical Transactions of the Royal Society A]]|volume=222|issue=594–604|page=326|doi=10.1098/rsta.1922.0009|accessdate=2008-12-28}}
* {{cite book|last=Hacking|first=Ian|coauthors=|authorlink=Ian Hacking|title=Logic of Statistical Inference|edition=|publisher=Cambridge University Press|location=Cambridge|year=1965|isbn=0-521-05165-7}}
* {{cite book|last=Jeffreys|first=Harold |coauthors=|authorlink=Harold Jeffreys|title=The Theory of Probability|edition=|publisher=The Oxford University Press|year=1961|isbn=}}
* {{cite book|last=Royall|first=Richard M.|coauthors=|title=Statistical Evidence: A Likelihood Paradigm|edition=|publisher=Chapman & Hall|location=London|year=1997|isbn=0-412-04411-0}}
* Mayo, D. (2010). "An Error in the Argument from Conditionality and Sufficiency to the Likelihood Principle" in ''Error and Inference: Recent Exchanges on Experimental Reasoning, Reliability and the Objectivity and Rationality of Science'' (D Mayo and A. Spanos eds.), Cambridge: Cambridge University Press: 305-14.
 
* {{cite book|last=Savage|first=Leonard J.|coauthors= et al.|authorlink=Leonard J. Savage|title=The Foundations of Statistical Inference|edition=|publisher=Methuen|location=London|year=1962|isbn=}}
 
== External links ==
 
* Anthony W.F. Edwards. "Likelihood". http://www.cimat.mx/reportes/enlinea/D-99-10.html
* Jeff Miller. [http://jeff560.tripod.com/l.html Earliest Known Uses of Some of the Words of Mathematics (L)]
* John Aldrich. [http://www.economics.soton.ac.uk/staff/aldrich/fisherguide/prob+lik.htm Likelihood and Probability in R. A. Fisher’s Statistical Methods for Research Workers]
 
[[Category:Estimation theory]]
[[Category:Statistical principles]]
 
[[ru:Принцип максимального правдоподобия]]

Revision as of 18:12, 26 February 2014
