Hypothesis Testing 2

Uploaded by

anuj21meena
Hypothesis Testing II

Two Sample Tests

• Population means, matched pairs. Example: same group before vs. after treatment
• Population means, independent samples. Example: Group 1 vs. independent Group 2
• Population proportions. Example: Proportion 1 vs. Proportion 2
• Population variances. Example: Variance 1 vs. Variance 2
Matched Pairs
Tests Means of 2 Related Populations
• Paired or matched samples
• Repeated measures (before/after)
• Use difference between paired values:

$$d_i = x_i - y_i$$

• Assumptions:
  • Both populations are normally distributed
Test Statistic: Matched Pairs
The test statistic for the mean difference is a t value, with n – 1 degrees of freedom:

$$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$$

where
D0 = hypothesized mean difference
sd = sample standard deviation of the differences
n = the sample size (number of pairs)
Decision Rules: Matched Pairs
Paired Samples

Lower-tail test:
H0: μx – μy ≥ 0
H1: μx – μy < 0
Reject H0 if $t < -t_{n-1,\alpha}$

Upper-tail test:
H0: μx – μy ≤ 0
H1: μx – μy > 0
Reject H0 if $t > t_{n-1,\alpha}$

Two-tail test:
H0: μx – μy = 0
H1: μx – μy ≠ 0
Reject H0 if $t < -t_{n-1,\alpha/2}$ or $t > t_{n-1,\alpha/2}$

where

$$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$$

has n - 1 d.f.
Matched Pairs Example
• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints (difference = (2) - (1)):

Salesperson   Before (1)   After (2)   Difference, di
C.B.               6            4            -2
T.F.              20            6           -14
M.H.               3            2            -1
R.K.               0            0             0
M.O.               4            0            -4
Total                                       -21

$$\bar{d} = \frac{\sum d_i}{n} = \frac{-21}{5} = -4.2$$

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 5.67$$
Matched Pairs: Solution
• Has the training made a difference in the number of complaints (at the α = 0.01 level)?

H0: μx – μy = 0
H1: μx – μy ≠ 0

α = .01 (α/2 = .005 in each tail), $\bar{d}$ = -4.2
d.f. = n - 1 = 4
Critical value = ± 4.604 (reject H0 if t < -4.604 or t > 4.604)

Test Statistic:

$$t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$

Decision: Do not reject H0 (the t statistic, -1.66, is not in the rejection region)
Conclusion: There is not a significant change in the number of complaints.
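The matched-pairs calculation can be checked with a short Python sketch. It uses only the standard library and the complaint counts from the table; the critical value 4.604 is copied from the slide rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Complaint counts from the slide's table: before (1) and after (2) training.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Differences are taken as (2) - (1), matching the slide.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # sample mean of the differences
s_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 denominator)
D0 = 0                           # hypothesized mean difference

t_stat = (d_bar - D0) / (s_d / math.sqrt(n))

# Two-tail critical value for alpha = 0.01 with 4 d.f., taken from the slide.
t_crit = 4.604
reject_h0 = abs(t_stat) > t_crit

print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The statistic falls well inside ±4.604, which agrees with the slide's decision not to reject H0.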
Difference Between Two Means

Population means, independent samples

Goal: Form a confidence interval for the difference between two population means, μx – μy

• Different data sources
  • Unrelated
  • Independent
    • Sample selected from one population has no effect on the sample selected from the other population
Difference Between Two Means
(continued)

Population means, independent samples:
• σx² and σy² known: test statistic is a z value
• σx² and σy² unknown: test statistic is a value from the Student’s t distribution
  • σx² and σy² assumed equal
  • σx² and σy² assumed unequal
σx² and σy² Known

Population means, independent samples: σx² and σy² known

Assumptions:
• Samples are randomly and independently drawn
• Both population distributions are normal
• Population variances are known
σx² and σy² Known
(continued)

When σx² and σy² are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is

$$\sigma^2_{\bar{X} - \bar{Y}} = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$

…and the random variable

$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$$

has a standard normal distribution
Test Statistic,
σx² and σy² Known

Population means, independent samples: σx² and σy² known

The test statistic for μx – μy is:

$$z = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{\sigma_x^2}{n_x} + \dfrac{\sigma_y^2}{n_y}}}$$
Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples

Lower-tail test:
H0: μx ≥ μy
H1: μx < μy
i.e.,
H0: μx – μy ≥ 0
H1: μx – μy < 0

Upper-tail test:
H0: μx ≤ μy
H1: μx > μy
i.e.,
H0: μx – μy ≤ 0
H1: μx – μy > 0

Two-tail test:
H0: μx = μy
H1: μx ≠ μy
i.e.,
H0: μx – μy = 0
H1: μx – μy ≠ 0
Decision Rules
Two Population Means, Independent Samples, Variances Known

Lower-tail test:
H0: μx – μy ≥ 0
H1: μx – μy < 0
Reject H0 if $z < -z_{\alpha}$

Upper-tail test:
H0: μx – μy ≤ 0
H1: μx – μy > 0
Reject H0 if $z > z_{\alpha}$

Two-tail test:
H0: μx – μy = 0
H1: μx – μy ≠ 0
Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
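As a rough illustration of the z statistic and the three decision rules for known variances, here is a minimal Python sketch. The function name `two_sample_z_test`, its parameters, and the numbers in the usage lines are hypothetical and do not come from the slides; critical values come from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def two_sample_z_test(x_bar, y_bar, var_x, var_y, n_x, n_y,
                      d0=0.0, alpha=0.05, tail="two"):
    """Return (z, reject_h0) for a test about mu_x - mu_y with known variances."""
    z = ((x_bar - y_bar) - d0) / math.sqrt(var_x / n_x + var_y / n_y)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "lower":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    elif tail == "upper":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, reject


# Hypothetical numbers for illustration only; they are not from the slides.
z, reject = two_sample_z_test(x_bar=50.0, y_bar=48.0,
                              var_x=16.0, var_y=25.0, n_x=40, n_y=50)
print(f"z = {z:.3f}, reject H0: {reject}")
```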
σx² and σy² Unknown,
Assumed Equal

Population means, independent samples: σx² and σy² unknown, assumed equal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal
σx² and σy² Unknown,
Assumed Equal
(continued)

Population means, independent samples: σx² and σy² unknown, assumed equal

Forming interval estimates:
• The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
• Use a t value with (nx + ny – 2) degrees of freedom
Test Statistic,
σx² and σy² Unknown, Equal

σx² and σy² unknown, assumed equal

The test statistic for μx – μy is:

$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{s_p^2 \left( \dfrac{1}{n_x} + \dfrac{1}{n_y} \right)}}$$

where t has (nx + ny – 2) d.f., and

$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
Pooled Variance t Test: Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the BSE & NSE? You collect the following data:

                  BSE     NSE
Number             21      25
Sample mean      3.27    2.53
Sample std dev   1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in average yield (α = 0.05)?
Calculating the Test Statistic
The test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$$

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05 (two-tail, .025 in each tail)
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154

Test Statistic:

$$t = \frac{3.27 - 2.53}{\sqrt{1.5021 \left( \dfrac{1}{21} + \dfrac{1}{25} \right)}} = 2.040$$

Decision: Reject H0 at α = 0.05 (2.040 > 2.0154)
Conclusion: There is evidence of a difference in means.
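The pooled-variance calculation can be reproduced from the summary statistics alone. The sketch below uses only the standard library; the critical value 2.0154 is copied from the slide rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from the BSE / NSE example.
n1, mean1, s1 = 21, 3.27, 1.30   # BSE
n2, mean2, s2 = 25, 2.53, 1.16   # NSE

df = n1 + n2 - 2                 # degrees of freedom for the pooled test

# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df

# Test statistic under H0: mu1 - mu2 = 0.
t_stat = ((mean1 - mean2) - 0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Two-tail critical value for alpha = 0.05 and 44 d.f., taken from the slide.
t_crit = 2.0154
reject_h0 = abs(t_stat) > t_crit

print(f"sp2 = {sp2:.4f}, t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is only slightly beyond the critical value, so the rejection here is a close call at α = 0.05.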
σx² and σy² Unknown,
Assumed Unequal

Population means, independent samples: σx² and σy² unknown, assumed unequal

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown and assumed unequal
σx² and σy² Unknown,
Assumed Unequal
(continued)

Population means, independent samples: σx² and σy² unknown, assumed unequal

Forming interval estimates:
• The population variances are assumed unequal, so a pooled variance is not appropriate
• Use a t value with v degrees of freedom, where

$$v = \frac{\left( \dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y} \right)^2}{\left( \dfrac{s_x^2}{n_x} \right)^2 / (n_x - 1) + \left( \dfrac{s_y^2}{n_y} \right)^2 / (n_y - 1)}$$
Test Statistic,
σx² and σy² Unknown, Unequal

σx² and σy² unknown, assumed unequal

The test statistic for μx – μy is:

$$t = \frac{(\bar{x} - \bar{y}) - D_0}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}}$$

where t has v degrees of freedom:

$$v = \frac{\left( \dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y} \right)^2}{\left( \dfrac{s_x^2}{n_x} \right)^2 / (n_x - 1) + \left( \dfrac{s_y^2}{n_y} \right)^2 / (n_y - 1)}$$
Decision Rules
Two Population Means, Independent Samples, Variances Unknown

Lower-tail test:
H0: μx – μy ≥ 0
H1: μx – μy < 0
Reject H0 if $t < -t_{\nu,\alpha}$

Upper-tail test:
H0: μx – μy ≤ 0
H1: μx – μy > 0
Reject H0 if $t > t_{\nu,\alpha}$

Two-tail test:
H0: μx – μy = 0
H1: μx – μy ≠ 0
Reject H0 if $t < -t_{\nu,\alpha/2}$ or $t > t_{\nu,\alpha/2}$

where t has ν d.f.: ν = nx + ny – 2 when the variances are assumed equal, and ν = v (the degrees-of-freedom formula for the unequal-variance case) when they are assumed unequal.
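For the unequal-variance case, the test statistic and its degrees of freedom v can be sketched as follows. The helper name `unequal_variance_t` is hypothetical. Reusing the BSE / NSE summary statistics is an assumption made purely for illustration, since the slides analyze that data under equal variances. No critical value is hard-coded, because it would have to be looked up in a t table for v degrees of freedom.

```python
import math


def unequal_variance_t(mean_x, s_x, n_x, mean_y, s_y, n_y, d0=0.0):
    """Return (t, v): the unequal-variance t statistic and its degrees of freedom."""
    vx = s_x ** 2 / n_x            # estimated variance of the x sample mean
    vy = s_y ** 2 / n_y            # estimated variance of the y sample mean
    t = ((mean_x - mean_y) - d0) / math.sqrt(vx + vy)
    v = (vx + vy) ** 2 / (vx ** 2 / (n_x - 1) + vy ** 2 / (n_y - 1))
    return t, v


# Illustration only: BSE / NSE summary statistics reused from the pooled example.
t_stat, v = unequal_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
print(f"t = {t_stat:.3f}, v = {v:.1f}")
```

In practice v is usually not an integer; a common convention is to round it down before consulting a t table.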
Two Population Proportions

Population proportions

Goal: Test hypotheses for the difference between two population proportions, Px – Py

Assumptions:
Both sample sizes are large, nP(1 – P) > 9
Two Population Proportions
(continued)

• The random variable

$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (p_x - p_y)}{\sqrt{\dfrac{\hat{p}_x (1 - \hat{p}_x)}{n_x} + \dfrac{\hat{p}_y (1 - \hat{p}_y)}{n_y}}}$$

is approximately normally distributed
Test Statistic for
Two Population Proportions

The test statistic for H0: Px – Py = 0 is a z value:

$$z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\dfrac{\hat{p}_0 (1 - \hat{p}_0)}{n_x} + \dfrac{\hat{p}_0 (1 - \hat{p}_0)}{n_y}}}$$

where

$$\hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y}$$
Decision Rules: Proportions
Population proportions

Lower-tail test:
H0: px – py ≥ 0
H1: px – py < 0
Reject H0 if $z < -z_{\alpha}$

Upper-tail test:
H0: px – py ≤ 0
H1: px – py > 0
Reject H0 if $z > z_{\alpha}$

Two-tail test:
H0: px – py = 0
H1: px – py ≠ 0
Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
Example:
Two Population Proportions
Is there a significant difference between the proportion of men and the proportion of women who will vote Yes on Proposition A?

• In a random sample, 36 of 72 men and 31 of 50 women indicated they would vote Yes
• Test at the .05 level of significance
Example:
Two Population Proportions
(continued)
• The hypothesis test is:
  H0: PM – PW = 0 (the two proportions are equal)
  H1: PM – PW ≠ 0 (there is a significant difference between proportions)
• The sample proportions are:
  • Men: p̂M = 36/72 = .50
  • Women: p̂W = 31/50 = .62
• The estimate for the common overall proportion is:

$$\hat{p}_0 = \frac{n_x \hat{p}_x + n_y \hat{p}_y}{n_x + n_y} = \frac{72(36/72) + 50(31/50)}{72 + 50} = \frac{67}{122} = .549$$
Example:
Two Population Proportions
(continued)

The test statistic for PM – PW = 0 is:

$$z = \frac{\hat{p}_M - \hat{p}_W}{\sqrt{\dfrac{\hat{p}_0 (1 - \hat{p}_0)}{n_1} + \dfrac{\hat{p}_0 (1 - \hat{p}_0)}{n_2}}} = \frac{.50 - .62}{\sqrt{\dfrac{.549 (1 - .549)}{72} + \dfrac{.549 (1 - .549)}{50}}} = -1.31$$

Critical Values = ±1.96 for α = .05 (.025 in each tail)
Decision: Do not reject H0 (-1.31 lies between -1.96 and 1.96)
Conclusion: There is not significant evidence of a difference between men and women in proportions who will vote yes.
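The Proposition A example can be verified with a few lines of Python. This sketch uses the counts from the slides and takes the two-tail critical value from the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the example: Yes votes and sample sizes for men and women.
yes_m, n_m = 36, 72
yes_w, n_w = 31, 50

p_m = yes_m / n_m                      # sample proportion, men
p_w = yes_w / n_w                      # sample proportion, women
p0 = (yes_m + yes_w) / (n_m + n_w)     # pooled estimate of the common proportion

# Standard error under H0 (both populations share the proportion p0).
se = math.sqrt(p0 * (1 - p0) / n_m + p0 * (1 - p0) / n_w)
z = (p_m - p_w) / se

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tail critical value
reject_h0 = abs(z) > z_crit

print(f"p0 = {p0:.3f}, z = {z:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject_h0}")
```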
Hypothesis Tests of
one Population Variance

Population Variance

• Goal: Test hypotheses about the population variance, σ²
• If the population is normally distributed,

$$\chi^2_{n-1} = \frac{(n - 1)s^2}{\sigma^2}$$

follows a chi-square distribution with (n – 1) degrees of freedom
Confidence Intervals for the
Population Variance
(continued)

Population Variance

The test statistic for hypothesis tests about one population variance is

$$\chi^2_{n-1} = \frac{(n - 1)s^2}{\sigma_0^2}$$
Decision Rules: Variance
Population variance

Lower-tail test:
H0: σ² ≥ σ0²
H1: σ² < σ0²
Reject H0 if $\chi^2_{n-1} < \chi^2_{n-1,\,1-\alpha}$

Upper-tail test:
H0: σ² ≤ σ0²
H1: σ² > σ0²
Reject H0 if $\chi^2_{n-1} > \chi^2_{n-1,\,\alpha}$

Two-tail test:
H0: σ² = σ0²
H1: σ² ≠ σ0²
Reject H0 if $\chi^2_{n-1} > \chi^2_{n-1,\,\alpha/2}$ or $\chi^2_{n-1} < \chi^2_{n-1,\,1-\alpha/2}$
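The slides give no worked example for the one-variance test, so the following is only a minimal sketch of the upper-tail rule. The function names and the input numbers are hypothetical, and the critical value is an assumed entry from a standard chi-square table (24 d.f., upper-tail area 0.05) passed in by hand rather than computed.

```python
def variance_chi2_stat(s2, sigma0_sq, n):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a one-variance test."""
    return (n - 1) * s2 / sigma0_sq


def upper_tail_reject(chi2_stat, chi2_crit):
    """Upper-tail rule: reject H0 when the statistic exceeds the critical value."""
    return chi2_stat > chi2_crit


# Hypothetical inputs for illustration only; they are not from the slides.
n = 25              # sample size
s2 = 24.0           # sample variance
sigma0_sq = 16.0    # hypothesized population variance under H0

chi2_stat = variance_chi2_stat(s2, sigma0_sq, n)

# Assumed table value for 24 d.f. and upper-tail area 0.05.
chi2_crit = 36.415
reject_h0 = upper_tail_reject(chi2_stat, chi2_crit)

print(f"chi2 = {chi2_stat:.2f}, reject H0: {reject_h0}")
```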
Hypothesis Tests for Two Variances

Tests for Two Population Variances: F test statistic

• Goal: Test hypotheses about two population variances

Lower-tail test:
H0: σx² ≥ σy²
H1: σx² < σy²

Upper-tail test:
H0: σx² ≤ σy²
H1: σx² > σy²

Two-tail test:
H0: σx² = σy²
H1: σx² ≠ σy²

The two populations are assumed to be independent and normally distributed
Hypothesis Tests for Two Variances
(continued)

Tests for Two Population Variances: F test statistic

The random variable

$$F = \frac{s_x^2 / \sigma_x^2}{s_y^2 / \sigma_y^2}$$

has an F distribution with (nx – 1) numerator degrees of freedom and (ny – 1) denominator degrees of freedom

Denote an F value with ν1 numerator and ν2 denominator degrees of freedom by $F_{\nu_1,\nu_2}$
Test Statistic

Tests for Two Population Variances: F test statistic

The test statistic for a hypothesis test about two population variances is

$$F = \frac{s_x^2}{s_y^2}$$

where F has (nx – 1) numerator degrees of freedom and (ny – 1) denominator degrees of freedom
Decision Rules: Two Variances
Use sx² to denote the larger variance.

Upper-tail test:
H0: σx² ≤ σy²
H1: σx² > σy²
Reject H0 if $F > F_{n_x - 1,\, n_y - 1,\, \alpha}$

Two-tail test:
H0: σx² = σy²
H1: σx² ≠ σy²
The rejection region for a two-tail test is:
Reject H0 if $F > F_{n_x - 1,\, n_y - 1,\, \alpha/2}$
where sx² is the larger of the two sample variances
Example: F Test

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the BSE and NSE. You collect the following data:

            BSE     NSE
Number       21      25
Mean       3.27    2.53
Std dev    1.30    1.16

Is there a difference in the variances between the BSE & NSE at the α = 0.10 level?
F Test: Example Solution
• Form the hypothesis test:
  H0: σx² = σy² (there is no difference between variances)
  H1: σx² ≠ σy² (there is a difference between variances)
• Find the F critical value for α/2 = .10/2 = .05:
  • Numerator (BSE has the larger standard deviation): nx – 1 = 21 – 1 = 20 d.f.
  • Denominator: ny – 1 = 25 – 1 = 24 d.f.

$$F_{n_x - 1,\, n_y - 1,\, \alpha/2} = F_{20,\,24,\,0.10/2} = 2.03$$
F Test: Example Solution
(continued)

H0: σx² = σy²
H1: σx² ≠ σy²

• The test statistic is:

$$F = \frac{s_x^2}{s_y^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• α/2 = .05, so the critical value is $F_{20,\,24,\,0.10/2} = 2.03$
• F = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is not sufficient evidence of a difference in variances at α = .10
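The F test arithmetic is short enough to check directly. This sketch uses the summary statistics from the example and copies the critical value 2.03 from the slide rather than computing it; the variable names are illustrative.

```python
# Summary statistics from the BSE / NSE example.
s_x, n_x = 1.30, 21   # BSE: larger sample standard deviation, so it is the numerator
s_y, n_y = 1.16, 25   # NSE

f_stat = s_x ** 2 / s_y ** 2       # ratio of sample variances, larger on top
df_num = n_x - 1                   # numerator degrees of freedom
df_den = n_y - 1                   # denominator degrees of freedom

# Critical value F(20, 24) for alpha/2 = 0.05, taken from the slide.
f_crit = 2.03
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.3f}, d.f. = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

Failing to reject equal variances here is consistent with the pooled-variance t test that the slides apply to the same data.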
Two-Sample Tests in EXCEL

For paired samples (t test):
• Tools | data analysis… | t-test: paired two sample for means

For independent samples:
• Independent sample Z test with variances known:
  • Tools | data analysis | z-test: two sample for means

For variances…
• F test for two variances:
  • Tools | data analysis | F-test: two sample for variances