
Power calculations in R

Outline:
- Power calculations for z tests (pair of means)
- Central and non-central χ² distributions
- Confidence intervals, hypothesis testing and power calculations for a variance
- Non-central F distributions
- Power calculations for fixed-effects ANOVA (differences in a group of means)
- Power calculations for random-effects ANOVA (differences in variances)

Normal Distribution calculations in R

Let U be a unit (standard) normal. To compute Prob(U < x), use pnorm(x).
Examples:
Prob(U < 1.0): pnorm(1.0) returns 0.8413
Prob(U > 2.3): 1-pnorm(2.3) returns 0.01072

Critical values. To find x where Prob(U < x) = p, use qnorm(p).


Examples:
Find x where Prob(U < x) = 0.999: qnorm(0.999) returns 3.0902
Find x such that Prob(|U| < x) = 0.99: qnorm(0.995) returns 2.576

Check: Prob(|U| < 2.576) = Prob(U < 2.576) - Prob(U < -2.576); pnorm(2.576)-pnorm(-2.576) returns 0.990005

Normal tests and Power


Define z(α) by Prob(U < z(α)) = α; note this is qnorm(α). Likewise, note that Pr(U > z(α)) = 1 - α = Pr(U < z(1-α)).
Example: α = 0.95, so 1 - α = 0.05. z(0.95) = qnorm(0.95), which R returns as 1.645, so Pr(U > 1.645) = 0.05.

Suppose we have n observations from a normal with mean μ0 and variance σ². Hence the sample mean x̄ is normal with mean μ0 and variance σ²/n, so that (x̄ - μ0)/(σ²/n)^(1/2) ~ U (a unit normal).
For a one-sided test with (say) α = 0.01, what is the critical value TC(α)? Namely, for what value is Pr(x̄ > TC(α)) = α (= 0.01 in our example)?
Writing s = (σ²/n)^(1/2) for the standard error of the mean,
  Pr(x̄ > TC(α)) = Pr(x̄ - μ0 > TC(α) - μ0) = Pr([x̄ - μ0]/s > [TC(α) - μ0]/s) = Pr(U > [TC(α) - μ0]/s) = α
Thus [TC(α) - μ0]/s = z(1-α), giving
  TC(α) = μ0 + s·z(1-α) = μ0 + (σ²/n)^(1/2) z(1-α)

In our case, α = 0.01, so z(1-α) = z(0.99) = qnorm(0.99) = 2.33. If our null hypothesis is that the mean is 10 and σ² = 4, the critical value for a 1% one-sided test with sample size n is just 10 + (4/n)^(1/2)·2.33.

Defining functions in R: we want to compute this critical value for different values of n, so let's define a function in R to do so:
Nc <- function(n) 10+sqrt(4/n)*2.33
More generally, let's let α vary as well:
Ncrit <- function(n,a) 10+sqrt(4/n)*qnorm(1-a)

Ncrit(15,0.01) returns 11.20
Ncrit(200,0.001) returns 10.43
Most generally, we can write a function that also allows μ0 and σ² to vary:
Ncrit <- function(m0,var,n,a) m0+sqrt(var/n)*qnorm(1-a)
Ncrit(10,4,15,0.01) returns 11.20 (as expected)
Ncrit(10,8,15,0.01) returns 11.70

Power for Means tests


Now suppose the true distribution of x has mean μ1 (with the same variance σ²). The power of this test is the probability that the sample mean exceeds the critical value TC(α) = μ0 + (σ²/n)^(1/2) z(1-α). Here the sample mean is normal with mean μ1 and variance σ²/n, and the power is
  Pr(x̄ > TC(α)) = Pr(x̄ - μ1 > TC(α) - μ1) = Pr([x̄ - μ1]/s > [TC(α) - μ1]/s) = Pr(U > [TC(α) - μ1]/(σ²/n)^(1/2))
We can program this in R:
power <- function(m0,m1,var,a,n) { temp <- (Ncrit(m0,var,n,a) - m1)/sqrt(var/n); 1-pnorm(temp) }

Example: null mean = 10, true mean = 11, σ² = 4, α = 0.01.
power(10,11,4,0.01,20) returns 0.464
Suppose we increase the sample size to 60:
power(10,11,4,0.01,60) returns 0.939
To plot power as a function of sample size:
curve(power(10,11,4,0.01,x),10,200)
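The same power function can also be used to ask how large a sample is needed to hit a target power (here 90%). The helper below is a minimal sketch added here, not part of the original notes; the name sample.size is made up, and it simply steps n upward using the power() function defined above.

sample.size <- function(m0, m1, var, a, target) {
  # smallest n whose one-sided z-test power reaches the target level
  n <- 2
  while (power(m0, m1, var, a, n) < target) n <- n + 1
  n
}
sample.size(10, 11, 4, 0.01, 0.90)   # 53 for the example above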

Central and Non-central χ² distributions

Central χ² distributions: Recall that if xi ~ N(0,1), then x1² + x2² + ... + xn² ~ χ²_n.
If xi ~ N(0,σ²), then
  Σ_{i=1}^n xi² ~ σ² χ²_n   and   Σ_{i=1}^n (xi - x̄)² ~ σ² χ²_{n-1}

Noncentral χ² distributions: If xi ~ N(μi, σ²), then
  Σ_{i=1}^n xi² ~ σ² χ²_{n,λ},   with λ = Σ_{i=1}^n μi²/σ²
a chi-square with n degrees of freedom and noncentrality parameter λ.

Likewise, for xi ~ N(μi, σ²),
  Σ_{i=1}^n (xi - x̄)² ~ σ² χ²_{n-1,λ},   with λ = Σ_{i=1}^n (μi - μ̄)²/σ²

Central & noncentral χ² in R

Pr(χ²_n < x) = pchisq(x,n)
Example: compute Pr(χ²_35 < 41.5). pchisq(41.5,35) returns 0.792
Pr(χ²_{n,λ} < x) = pchisq(x,n,l)
Example: compute Pr(χ²_{35,6} < 41.5). pchisq(41.5,35,6) returns 0.551

To find x such that Pr(χ²_n < x) = α, use qchisq(a,n)
Example: compute the 95% value for a χ²_10. qchisq(0.95,10) returns 18.31
To find x such that Pr(χ²_{n,λ} < x) = α, use qchisq(a,n,l)
Example: compute the 95% value for a χ²_{10,3.5}. qchisq(0.95,10,3.5) returns 24.27

Confidence limits on variance estimators


If xi ~ N(μ, σ²), i.e., they all have a common mean, we recover a central χ² distribution,
  S = Σ_{i=1}^n (xi - x̄)² ~ σ² χ²_{n-1}
Since S is very closely related to the sample variance, we can obtain confidence intervals, critical values, and compute power for simple variance estimates.

  Var = [1/(n-1)] Σ_{i=1}^n (xi - x̄)²  ~  [σ²/(n-1)] χ²_{n-1}

To designate appropriate χ² critical values, let χ²_{n,[α]} denote the value satisfying
  Pr( χ²_n < χ²_{n,[α]} ) = α

First, note that
  Pr( χ²_{n-1,[α/2]} < χ²_{n-1} < χ²_{n-1,[1-α/2]} ) = 1 - α

Recalling that the scaled sample variance follows a χ²,
  Pr( χ²_{n-1,[α/2]} < (n-1)Var/σ² < χ²_{n-1,[1-α/2]} ) = 1 - α

Further noting that a < x < b implies 1/a > 1/x > 1/b gives
  Pr( 1/χ²_{n-1,[α/2]} > σ²/[(n-1)Var] > 1/χ²_{n-1,[1-α/2]} ) = 1 - α

Thus our 1-α confidence interval for the true variance given the sample variance is
  Pr( (n-1)Var/χ²_{n-1,[α/2]} > σ² > (n-1)Var/χ²_{n-1,[1-α/2]} ) = 1 - α

Example: suppose n = 20 and our sample variance is 10. What is a 99% confidence interval for the true variance? Here α = 0.01.
qchisq(0.005,19) gives χ²_{19,[0.005]} = 6.84
qchisq(0.995,19) gives χ²_{19,[0.995]} = 38.58
Lower limit = 19*10/38.58 = 4.92
Upper limit = 19*10/6.84 = 27.76

R code (note that c(x,y) returns an array, 1st element = lower limit, 2nd element = upper limit):
chiCI <- function(var,n,a) { low<-qchisq(a/2,n-1); upper<-qchisq(1-a/2,n-1); c((n-1)*var/upper,(n-1)*var/low) }
chiCI(10,50,0.001) returns 5.55 21.50
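As a quick check (this call is not in the original notes), plugging the n = 20 example into chiCI reproduces the hand calculation above:

chiCI(10,20,0.01)   # returns approximately 4.92 27.76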

Critical Values and Power for One-sided tests


Consider a null of σ² = σ²0 versus the alternative σ² > σ²0. What is the critical value C(n, α) that gives a Type I error of α?
This is what we want: Pr(Var > C(n, α)) = α under the null. Now
  Pr(Var > C(n, α)) = Pr( Var·(n-1)/σ²0 > C(n, α)·(n-1)/σ²0 ) = Pr( χ²_{n-1} > C(n, α)·(n-1)/σ²0 ) = α
so that
  C(n, α)·(n-1)/σ²0 = χ²_{n-1,[1-α]}

Hence, the critical value becomes
  C(n, α) = [σ²0/(n-1)] χ²_{n-1,[1-α]}

Suppose the true variance is σ²1. What is the probability that we reject the null (the power)? This is the probability that Var exceeds C:
  Pr( Var > C(n, α) | σ²1 ) = Pr( Var·(n-1)/σ²1 > C(n, α)·(n-1)/σ²1 )

Which reduces to a very simple form:
  Pr( χ²_{n-1} > (σ²0/σ²1) χ²_{n-1,[1-α]} )

R code (var is the variance ratio σ²1/σ²0, so dividing the critical value by var supplies the σ²0/σ²1 factor; the second argument is passed to the χ² as its degrees of freedom):
chipower <- function(var,n,alpha) 1-pchisq(qchisq(1-alpha,n)/var,n)

Example: plot of power as a function of the variance ratio for n = 20, alpha = 0.01:
curve(chipower(x,20,0.01),1,20)

The critical value and power for the other one-sided test, a null of σ² = σ²0 versus the alternative σ² < σ²0:

Critical value:
  C(n, α) = [σ²0/(n-1)] χ²_{n-1,[α]}
Reject the null when Var < C(n, α).

Power:
  Pr( χ²_{n-1} < (σ²0/σ²1) χ²_{n-1,[α]} )
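A companion R function for this lower-tail test, written in the same style as chipower above, might look like the following sketch (it is not part of the original notes; as before, var is the ratio σ²1/σ²0 and the second argument is the χ² degrees of freedom):

chipowerLow <- function(var,n,alpha) pchisq(qchisq(alpha,n)/var,n)
# e.g., power to detect a true variance half the null value, with 19 df and alpha = 0.01
chipowerLow(0.5,19,0.01)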

Central and Non-central F distributions

Central F distributions:
  ( χ²_i / i ) / ( χ²_k / k ) ~ F_{i,k}
The ratio of two chi-squares, each divided by its degrees of freedom: an F with numerator i and denominator k degrees of freedom.

Non-central F distributions:
  ( χ²_{i,λ} / i ) / ( χ²_k / k ) ~ F_{i,k,λ}
The ratio of a non-central over a central chi-square: an F with numerator i and denominator k degrees of freedom and non-centrality parameter λ.

F Distribution values in R

Pr(F_{i,k} < x): pf(x,i,k)
Pr(F_{i,k,λ} < x): pf(x,i,k,lam)
Example: compute Pr(F_{20,30} > 1.5). 1-pf(1.5,20,30) returns 0.153
Example: compute Pr(F_{20,30,2} > 1.5). 1-pf(1.5,20,30,2) returns 0.215
To find x such that Pr(F_{i,k} < x) = α, use qf(a,i,k)
Example: compute the 95% critical value for an F with 12 numerator and 25 denominator df. qf(0.95,12,25) returns 2.16

Power of F-tests
The idea is the same as with normal tests: assign a critical value C based on the null, then compute the probability that the test statistic exceeds C given the assumed alternative values.

Fixed-effects ANOVA:
Null: treatment effects τi = 0, so yi ~ N(0, σ²e). Alternative: some τi not zero, so yi ~ N(τi, σ²e). Non-zero means generate a non-central F.

Random-effects ANOVA:
Null: treatment variance σ²t = 0, so yi ~ N(0, σ²e). Alternative: σ²t > 0, so yi ~ N(0, σ²e + σ²t). Non-zero treatment variance generates a scaled (central) F, i.e., a constant multiple of F.

Fixed-effects: One-way ANOVA

  yij = u + τi + eij

N fixed factors, n replicates/factor. The test statistic is

  f = MSt/MSe = [ SSt/(N-1) ] / [ SSe/(N(n-1)) ]

The distribution of the numerator follows since

  ȳi ~ N(τi, σ²e/n), which gives SSt = n Σ_{i=1}^N (ȳi - ȳ)² ~ n(σ²e/n) χ²_{N-1,λ},  with λ = Σ_{i=1}^N τi²/(σ²e/n)

We can think of this as a variance-like term: defining σ²t = [1/(N-1)] Σ_{i=1}^N τi² gives

  λ = n(N-1) σ²t/σ²e

The error sum of squares is distributed as

  SSe ~ σ²e χ²_{N(n-1)}

Giving the distribution of the test statistic f as

  f ~ [ σ²e χ²_{N-1,λ}/(N-1) ] / [ σ²e χ²_{N(n-1)}/(N(n-1)) ] ~ F_{N-1, N(n-1), λ}

Under the null hypothesis of no treatment effect, λ = 0 and f follows a central F distribution. Hence the α-level critical value under the null is F_{N-1,N(n-1),[1-α]}, and we reject when

  f > F_{N-1, N(n-1), [1-α]}

Under the alternative, λ = n(N-1)σ²t/σ²e, and the power of the test is simply

  Pr( F_{N-1, N(n-1), λ} > F_{N-1, N(n-1), [1-α]} )

Example: Suppose the treatment variance is 10% of the total variance and we have N = 5 groups. For a test with α = 0.001, what n is needed for 80% power? The critical value becomes F_{4,5(n-1),[0.999]} = qf(0.999,4,5*(n-1)).

What is the noncentrality parameter? σ²t/(σ²e + σ²t) = 1/(σ²e/σ²t + 1) = 0.1 implies σ²e/σ²t = 9, so that λ = n(N-1)σ²t/σ²e = n(4/9) = 0.444·n.

power <-function(n) { crit <- qf(0.999,4,5*(n-1)); 1-pf(crit,4,5*(n-1),0.444*n)}

Let's plot power as a function of n:
curve(power(x),20,100)
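To answer the original question (what n gives 80% power) directly rather than reading it off the curve, one can step n upward until the target is reached. This small helper is a sketch added here, not part of the original notes; it reuses the power(n) function just defined, and the name n.for.power is made up.

n.for.power <- function(target) {
  # smallest number of replicates per group reaching the target power
  n <- 2
  while (power(n) < target) n <- n + 1
  n
}
n.for.power(0.80)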


Recall λ = n(N-1)σ²t/σ²e, so the power is Pr( F_{N-1,N(n-1),λ} > F_{N-1,N(n-1),[1-α]} ).

General expression (var is the ratio σ²t/σ²e):
power <- function(N,n,var,alpha) { crit <- qf(1-alpha, N-1, N*(n-1)); ncp <- n*(N-1)*var; 1-pf(crit, N-1, N*(n-1), ncp) }

σ²t/σ²e versus σ²t/(σ²e + σ²t)

A few brief comments on the ratio of treatment variance to error variance (σ²t/σ²e) vs. the fraction of total variance accounted for by the treatment effects (σ²t/(σ²e + σ²t)), which is the r² for the simple one-way ANOVA (so for simplicity, we will refer to this fraction as r²). Letting x = σ²t/σ²e, we can express r² in terms of x, and vice versa. Specifically,

  r² = σ²t/(σ²e + σ²t) = 1/(σ²e/σ²t + 1) = 1/(1/x + 1) = x/(1+x)

Likewise, x = r²/(1-r²), giving λ = n(N-1)σ²t/σ²e = n(N-1)·[r²/(1-r²)].
Example: Suppose σ²t/σ²e = 0.2. Then r² = 0.2/(1+0.2) = 0.167.
Example: If r² = 0.05, what are σ²t/σ²e and λ? σ²t/σ²e = 0.05/(1-0.05) = 0.053, so λ = n(N-1)·0.053.
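Since this conversion comes up repeatedly, a tiny convenience function may help; it is a sketch added here (the name lambda.from.r2 is made up, not from the original notes) that turns r² into the noncentrality parameter λ = n(N-1)·r²/(1-r²).

lambda.from.r2 <- function(n, N, r2) n * (N - 1) * r2 / (1 - r2)
lambda.from.r2(20, 5, 0.05)   # lambda for n = 20 replicates, N = 5 groups, r2 = 0.05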

Power under random effects design


  yij = u + τi + eij

The f test is as before, and the distribution of the error sum of squares SSe is as before. However, since τ ~ N(0, σ²t), for the treatment sum of squares

  SSt ~ (nσ²t + σ²e) χ²_{N-1}

The resulting distribution of the test statistic becomes

  f = [ SSt/(N-1) ] / [ SSe/(N(n-1)) ] ~ [ (nσ²t + σ²e) χ²_{N-1}/(N-1) ] / [ σ²e χ²_{N(n-1)}/(N(n-1)) ] ~ (1 + n σ²t/σ²e) F_{N-1, N(n-1)}

Hence,

  f / (1 + n σ²t/σ²e) ~ F_{N-1, N(n-1)}

When the treatment variance is zero (the null hypothesis), this gives the same critical value as for the fixed-effects model. When the treatment variance is non-zero, the power is

  Pr( F_{N-1, N(n-1)} > F_{N-1, N(n-1), [1-α]} / (1 + n σ²t/σ²e) )

Note that, as with the fixed-effects case, we can replace the variance ratio σ²t/σ²e by r²/(1-r²).

Now the R code for power (again, var = σ²t/σ²e):
rpower <- function(N,n,var,alpha) { crit <- qf(1-alpha, N-1, N*(n-1)); temp <- 1+n*var; 1-pf(crit/temp, N-1, N*(n-1)) }

What about differences in power for fixed vs. random effects? For the earlier example (σ²t/σ²e = 0.111, N = 5, α = 0.001):
curve(power(5,x,0.111,0.001)-rpower(5,x,0.111,0.001),5,100)
