Multivariate Behrens–Fisher problem

In statistics, the multivariate Behrens–Fisher problem is the problem of testing for the equality of means from two multivariate normal distributions when the covariance matrices are unknown and possibly not equal. Since this is a generalization of the univariate Behrens–Fisher problem, it inherits all of the difficulties that arise in the univariate problem.

Notation and problem formulation

Let $X_{ij} \sim \mathcal{N}_p(\mu_i, \Sigma_i)$ $(j = 1, \dots, n_i;\ i = 1, 2)$ be independent random samples from two $p$-variate normal distributions with unknown mean vectors $\mu_i$ and unknown dispersion matrices $\Sigma_i$. The index $i$ refers to the first or second population, and the $j$th observation from the $i$th population is $X_{ij}$.

The multivariate Behrens–Fisher problem is to test the null hypothesis $H_0$ that the means are equal versus the alternative $H_1$ of non-equality:

$$H_0 : \mu_1 = \mu_2 \quad \text{vs} \quad H_1 : \mu_1 \neq \mu_2.$$

Define some statistics, which are used in the various attempts to solve the multivariate Behrens–Fisher problem, by

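$$\bar X_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \quad A_i = \sum_{j=1}^{n_i} (X_{ij} - \bar X_i)(X_{ij} - \bar X_i)^\top, \quad S_i = \frac{1}{n_i - 1} A_i, \quad \tilde S_i = \frac{1}{n_i} S_i, \quad \tilde S = \tilde S_1 + \tilde S_2, \quad \text{and} \quad T^2 = (\bar X_1 - \bar X_2)^\top \tilde S^{-1} (\bar X_1 - \bar X_2).$$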

The sample means $\bar X_i$ and sum-of-squares matrices $A_i$ are sufficient for the multivariate normal parameters $\mu_i, \Sigma_i$ $(i = 1, 2)$, so it suffices to base inference on these statistics alone. $\bar X_i$ and $A_i$ are independent and follow, respectively, multivariate normal and Wishart distributions:[1]

$$\bar X_i \sim \mathcal{N}_p(\mu_i, \Sigma_i / n_i), \quad A_i \sim W_p(\Sigma_i, n_i - 1).$$
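The statistics defined in this section can be computed directly from two data matrices. The following is a minimal sketch, assuming NumPy and samples stored as arrays of shape $(n_i, p)$; the function name `bf_summary`, the dictionary keys, and the simulated data are illustrative and do not come from the cited literature.

```python
import numpy as np


def bf_summary(x1, x2):
    """Return the summary statistics defined above for two (n_i, p) samples."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]

    # Sample mean vectors
    xbar1, xbar2 = x1.mean(axis=0), x2.mean(axis=0)

    # Sum-of-squares-and-products matrices A_i
    a1 = (x1 - xbar1).T @ (x1 - xbar1)
    a2 = (x2 - xbar2).T @ (x2 - xbar2)

    # S_i = A_i / (n_i - 1), S~_i = S_i / n_i, S~ = S~_1 + S~_2
    s1_t = a1 / (n1 - 1) / n1
    s2_t = a2 / (n2 - 1) / n2
    s_t = s1_t + s2_t

    # T^2 = (xbar1 - xbar2)' S~^{-1} (xbar1 - xbar2)
    xd = xbar1 - xbar2
    t2 = float(xd @ np.linalg.solve(s_t, xd))
    return {"n": (n1, n2), "xd": xd, "A": (a1, a2),
            "S_tilde_i": (s1_t, s2_t), "S_tilde": s_t, "T2": t2}


# Illustrative data (simulated, not taken from any study)
rng = np.random.default_rng(0)
sample1 = rng.normal(size=(12, 3))
sample2 = rng.normal(size=(8, 3)) * 2.0
summary = bf_summary(sample1, sample2)
print(summary["T2"])
```

Solving the linear system with `np.linalg.solve` avoids forming $\tilde S^{-1}$ explicitly, which is usually the more stable choice when only the quadratic form is needed.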

Background

In the case where the dispersion matrices are equal, the distribution of the $T^2$ statistic is known to be an F-distribution under the null and a noncentral F-distribution under the alternative.[1]
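For reference, the classical equal-covariance result is usually stated for the pooled two-sample Hotelling statistic rather than for a statistic built from $\tilde S$. The following sketch restates that standard result in the notation of this article; the symbols $S_{\text{pool}}$ and $T^2_{\text{pool}}$ are introduced here for illustration only.

```latex
S_{\text{pool}} = \frac{A_1 + A_2}{n_1 + n_2 - 2}, \qquad
T^2_{\text{pool}} = \frac{n_1 n_2}{n_1 + n_2}\,
  (\bar X_1 - \bar X_2)^\top S_{\text{pool}}^{-1} (\bar X_1 - \bar X_2),
\qquad
\frac{n_1 + n_2 - p - 1}{p\,(n_1 + n_2 - 2)}\; T^2_{\text{pool}}
  \sim F_{p,\; n_1 + n_2 - p - 1}
\quad \text{under } H_0 \text{ when } \Sigma_1 = \Sigma_2 .
```

The exact F result relies on $A_1 + A_2 \sim W_p(\Sigma, n_1 + n_2 - 2)$, which holds when both samples share the same dispersion matrix $\Sigma$.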

The main problem is that when the true dispersion matrices are unknown and possibly unequal, the probability of rejecting $H_0$ under the null hypothesis via a $T^2$ test depends on the unknown dispersion matrices.[1] In practice, this dependency harms inference when the dispersion matrices are far from each other or when the sample sizes are not large enough to estimate them accurately.[1]

Now, the sample mean vectors are independently and normally distributed,

$$\bar X_i \sim \mathcal{N}_p(\mu_i, \Sigma_i / n_i),$$

but the sum $A_1 + A_2$ does not, in general, follow the Wishart distribution,[1] which makes inference more difficult.

Proposed solutions

Proposed solutions are based on a few main strategies:[2][3]

Approaches using $T^2$ with approximate degrees of freedom

Below, tr indicates the trace operator.

Yao (1965)

(as cited by [6])

$$T^2 \sim \frac{\nu p}{\nu - p + 1} F_{p,\, \nu - p + 1},$$

where

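$$\nu = \left[ \frac{1}{n_1} \left( \frac{\bar X_d^\top \tilde S^{-1} \tilde S_1 \tilde S^{-1} \bar X_d}{\bar X_d^\top \tilde S^{-1} \bar X_d} \right)^2 + \frac{1}{n_2} \left( \frac{\bar X_d^\top \tilde S^{-1} \tilde S_2 \tilde S^{-1} \bar X_d}{\bar X_d^\top \tilde S^{-1} \bar X_d} \right)^2 \right]^{-1}, \quad \bar X_d = \bar X_1 - \bar X_2.$$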
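Yao's approximation can be sketched numerically as follows. This is an illustration only, assuming NumPy, samples of shape $(n_i, p)$, and the formula for $\nu$ exactly as written in this article (with $n_i$ the sample sizes); the names `scaled_cov` and `yao_test` are hypothetical.

```python
import numpy as np


def scaled_cov(x):
    """S~_i = S_i / n_i, with S_i the unbiased sample covariance matrix."""
    return np.atleast_2d(np.cov(x, rowvar=False)) / x.shape[0]


def yao_test(x1, x2):
    """Return T^2, Yao's nu, the scaled F statistic and its (df1, df2)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2, p = x1.shape[0], x2.shape[0], x1.shape[1]
    s1_t, s2_t = scaled_cov(x1), scaled_cov(x2)
    xd = x1.mean(axis=0) - x2.mean(axis=0)

    w = np.linalg.solve(s1_t + s2_t, xd)   # S~^{-1} X_d
    t2 = float(xd @ w)                     # X_d' S~^{-1} X_d (assumed nonzero)

    # Ratios inside the two squared terms of nu; they sum to one
    r1 = float(w @ s1_t @ w) / t2
    r2 = float(w @ s2_t @ w) / t2
    nu = 1.0 / (r1 ** 2 / n1 + r2 ** 2 / n2)

    # T^2 ~ nu p / (nu - p + 1) * F_{p, nu - p + 1}
    df2 = nu - p + 1
    f_stat = t2 * df2 / (nu * p)
    return t2, nu, f_stat, (p, df2)


# Illustrative data (simulated)
rng = np.random.default_rng(1)
sample1 = rng.normal(size=(15, 3))
sample2 = rng.normal(size=(10, 3)) * 1.5
t2, nu, f_stat, df = yao_test(sample1, sample2)
print(t2, nu, f_stat, df)
```

If SciPy is available, an approximate p-value could then be obtained as `scipy.stats.f.sf(f_stat, df[0], df[1])`.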

Johansen (1980)

(as cited by [6])

$$T^2 \sim q F_{p,\, \nu},$$

where

$$q = p + 2D - \frac{6D}{p(p-1) + 2}, \quad \nu = \frac{p(p+2)}{3D},$$

and

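$$D = \frac{1}{2} \sum_{i=1}^{2} \frac{1}{n_i} \left\{ \mathrm{tr}\left[ \left( I - (\tilde S_1^{-1} + \tilde S_2^{-1})^{-1} \tilde S_i^{-1} \right)^2 \right] + \left[ \mathrm{tr}\left( I - (\tilde S_1^{-1} + \tilde S_2^{-1})^{-1} \tilde S_i^{-1} \right) \right]^2 \right\}.$$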
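The quantities $D$, $q$ and $\nu$ in Johansen's approximation can be evaluated as in the sketch below. It assumes NumPy, samples of shape $(n_i, p)$ with $n_i > p$ so that each $\tilde S_i$ is invertible, and the formulas exactly as written in this article; the names `scaled_cov` and `johansen_test` are hypothetical.

```python
import numpy as np


def scaled_cov(x):
    """S~_i = S_i / n_i, with S_i the unbiased sample covariance matrix."""
    return np.atleast_2d(np.cov(x, rowvar=False)) / x.shape[0]


def johansen_test(x1, x2):
    """Return T^2, q, nu and the scaled statistic T^2 / q."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2, p = x1.shape[0], x2.shape[0], x1.shape[1]
    s1_t, s2_t = scaled_cov(x1), scaled_cov(x2)
    s1_inv, s2_inv = np.linalg.inv(s1_t), np.linalg.inv(s2_t)

    # (S~_1^{-1} + S~_2^{-1})^{-1}
    v = np.linalg.inv(s1_inv + s2_inv)

    # D = 1/2 * sum_i (1/n_i) { tr[M_i^2] + [tr M_i]^2 },
    # with M_i = I - (S~_1^{-1} + S~_2^{-1})^{-1} S~_i^{-1}
    d_const = 0.0
    for n_i, s_inv in ((n1, s1_inv), (n2, s2_inv)):
        m = np.eye(p) - v @ s_inv
        d_const += (np.trace(m @ m) + np.trace(m) ** 2) / n_i
    d_const *= 0.5

    q = p + 2 * d_const - 6 * d_const / (p * (p - 1) + 2)
    nu = p * (p + 2) / (3 * d_const)

    xd = x1.mean(axis=0) - x2.mean(axis=0)
    t2 = float(xd @ np.linalg.solve(s1_t + s2_t, xd))

    # T^2 ~ q F_{p, nu}, so T^2 / q is referred to F_{p, nu}
    return t2, q, nu, t2 / q


# Illustrative data (simulated)
rng = np.random.default_rng(2)
sample1 = rng.normal(size=(15, 3))
sample2 = rng.normal(size=(10, 3)) * 1.5
t2, q, nu, f_stat = johansen_test(sample1, sample2)
print(t2, q, nu, f_stat)
```

If SciPy is available, an approximate p-value could be obtained as `scipy.stats.f.sf(f_stat, 3, nu)` for this three-dimensional example.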

Nel and Van der Merwe (1986)

(as cited by [6])

$$T^2 \sim \frac{\nu p}{\nu - p + 1} F_{p,\, \nu - p + 1},$$

where

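$$\nu = \frac{\mathrm{tr}(\tilde S^2) + [\mathrm{tr}(\tilde S)]^2}{\frac{1}{n_1} \left\{ \mathrm{tr}(\tilde S_1^2) + [\mathrm{tr}(\tilde S_1)]^2 \right\} + \frac{1}{n_2} \left\{ \mathrm{tr}(\tilde S_2^2) + [\mathrm{tr}(\tilde S_2)]^2 \right\}}.$$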
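The degrees of freedom of Nel and Van der Merwe depend only on traces of $\tilde S_1$, $\tilde S_2$ and $\tilde S$, as the following sketch shows. It assumes NumPy, samples of shape $(n_i, p)$, and the formula exactly as written in this article; the names `scaled_cov`, `trace_term` and `nvdm_test` are hypothetical.

```python
import numpy as np


def scaled_cov(x):
    """S~_i = S_i / n_i, with S_i the unbiased sample covariance matrix."""
    return np.atleast_2d(np.cov(x, rowvar=False)) / x.shape[0]


def trace_term(m):
    """tr(M^2) + [tr(M)]^2, the building block of the nu formula."""
    return np.trace(m @ m) + np.trace(m) ** 2


def nvdm_test(x1, x2):
    """Return T^2, nu, the scaled F statistic and its (df1, df2)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2, p = x1.shape[0], x2.shape[0], x1.shape[1]
    s1_t, s2_t = scaled_cov(x1), scaled_cov(x2)
    s_t = s1_t + s2_t

    nu = trace_term(s_t) / (trace_term(s1_t) / n1 + trace_term(s2_t) / n2)

    xd = x1.mean(axis=0) - x2.mean(axis=0)
    t2 = float(xd @ np.linalg.solve(s_t, xd))

    # T^2 ~ nu p / (nu - p + 1) * F_{p, nu - p + 1}
    df2 = nu - p + 1
    f_stat = t2 * df2 / (nu * p)
    return t2, nu, f_stat, (p, df2)


# Illustrative data (simulated); rescaling one coordinate leaves T^2
# unchanged up to rounding but may change nu, because these traces are
# not invariant under such transformations.
rng = np.random.default_rng(3)
sample1 = rng.normal(size=(15, 3))
sample2 = rng.normal(size=(10, 3)) * 1.5
scale = np.array([100.0, 1.0, 1.0])
print(nvdm_test(sample1, sample2)[:2])
print(nvdm_test(sample1 * scale, sample2 * scale)[:2])
```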

Comments on performance

Kim (1992) proposed a solution that is based on a variant of $T^2$. Although its power is high, it is not invariant, which makes it less attractive. Simulation studies by Subramaniam and Subramaniam (1973) show that the size of Yao's test is closer to the nominal level than that of James's test. Christensen and Rencher (1997) performed numerical studies comparing several of these testing procedures and concluded that the tests of Kim and of Nel and Van der Merwe had the highest power. However, these two procedures are not invariant.

Krishnamoorthy and Yu (2004)

Krishnamoorthy and Yu (2004) proposed a procedure that modifies the approximate degrees of freedom of Nel and Van der Merwe (1986) for the denominator of $T^2$ under the null distribution, making the test invariant. They show that the approximate degrees of freedom lie in the interval $[\min\{n_1, n_2\},\ n_1 + n_2]$, which ensures that the degrees of freedom are not negative. They report numerical studies indicating that their procedure is as powerful as Nel and Van der Merwe's test for smaller dimensions and more powerful for larger dimensions. Overall, they claim that their procedure is better than the invariant procedures of Yao (1965) and Johansen (1980). On that basis, Krishnamoorthy and Yu's (2004) procedure had the best known size and power as of 2004.

The test statistic $T^2$ in Krishnamoorthy and Yu's procedure approximately follows the distribution $T^2 \sim \nu p F_{p,\, \nu - p + 1} / (\nu - p + 1)$, where

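$$\nu = \frac{p + p^2}{\frac{1}{n_1} \left\{ \mathrm{tr}[(\tilde S_1 \tilde S^{-1})^2] + [\mathrm{tr}(\tilde S_1 \tilde S^{-1})]^2 \right\} + \frac{1}{n_2} \left\{ \mathrm{tr}[(\tilde S_2 \tilde S^{-1})^2] + [\mathrm{tr}(\tilde S_2 \tilde S^{-1})]^2 \right\}}.$$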
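The modified degrees of freedom can be sketched in the same way, and the invariance and the interval $[\min\{n_1, n_2\},\ n_1 + n_2]$ described above can be checked numerically. The sketch assumes NumPy, samples of shape $(n_i, p)$, and the formula exactly as written in this article; the names `scaled_cov` and `ky_test`, the matrix `b`, and the simulated data are illustrative.

```python
import numpy as np


def scaled_cov(x):
    """S~_i = S_i / n_i, with S_i the unbiased sample covariance matrix."""
    return np.atleast_2d(np.cov(x, rowvar=False)) / x.shape[0]


def ky_test(x1, x2):
    """Return T^2, nu, the scaled F statistic and its (df1, df2)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    n1, n2, p = x1.shape[0], x2.shape[0], x1.shape[1]
    s1_t, s2_t = scaled_cov(x1), scaled_cov(x2)
    s_t = s1_t + s2_t
    s_inv = np.linalg.inv(s_t)

    # Denominator: sum_i (1/n_i) { tr[(S~_i S~^{-1})^2] + [tr(S~_i S~^{-1})]^2 }
    denom = 0.0
    for n_i, s_i in ((n1, s1_t), (n2, s2_t)):
        m = s_i @ s_inv
        denom += (np.trace(m @ m) + np.trace(m) ** 2) / n_i
    nu = (p + p ** 2) / denom

    xd = x1.mean(axis=0) - x2.mean(axis=0)
    t2 = float(xd @ s_inv @ xd)

    # T^2 ~ nu p F_{p, nu - p + 1} / (nu - p + 1)
    df2 = nu - p + 1
    f_stat = t2 * df2 / (nu * p)
    return t2, nu, f_stat, (p, df2)


# Illustrative data (simulated)
rng = np.random.default_rng(4)
sample1 = rng.normal(size=(15, 3))
sample2 = rng.normal(size=(10, 3)) * 1.5
t2, nu, f_stat, df = ky_test(sample1, sample2)

# Apply the same nonsingular linear map to both samples: T^2 and nu
# should be unchanged up to rounding error.
b = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 3.0]])
t2_b, nu_b, _, _ = ky_test(sample1 @ b, sample2 @ b)
print(nu, nu_b)
```

If SciPy is available, an approximate p-value could be obtained as `scipy.stats.f.sf(f_stat, df[0], df[1])`.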

References

  • Rodríguez-Cortés, F. J. and Nagar, D. K. (2007). Percentage points for testing equality of mean vectors. Journal of the Nigerian Mathematical Society, 26:85–95.
  • Gupta, A. K., Nagar, D. K., Mateu, J. and Rodríguez-Cortés, F. J. (2013). Percentage points of a test statistic useful in manova with structured covariance matrices. Journal of Applied Statistical Science, 20:29-41.
  1. Citation text missing in source (ref name: 2003anderson).
  2. Citation text missing in source (ref name: 1997christensen).
  3. Citation text missing in source (ref name: 2007park).
  4. Citation text missing in source (ref name: 1981olkin).
  5. Citation text missing in source (ref name: 2004gamage).
  6. Citation text missing in source (ref name: 2004krishnamoorthy).