Chap 5 Module
Analysis of Variance
Outline
5.1 Introduction to Experimental Design
5.2 One-Way Analysis of Variance
5.3 Two-Way Analysis of Variance
5.4 Two-Way Analysis of Variance with Interaction
Objectives
At the end of this chapter, students should be able to:
1. Understand different experimental designs.
2. Apply One-way ANOVA technique to determine whether there is a
significant difference among 3 or more means.
3. Apply Two-way ANOVA technique to determine whether there is a
significant difference in the main effects or whether a possible
interaction effect exists.
Many studies collect data from experiments. Before data collection begins, it is wise to
plan and design the experiment properly so that the right type of data is collected for the
intended statistical analysis. This process is called experimental design. Different
experimental designs enable us to study different effects on the response in the study.
Three types of experimental design are presented here: Completely Randomized Design,
Randomized Block Design and Completely Randomized Factorial Design.
(a) Completely Randomized Design
Case 1:
A researcher designs three different diets for athletes: Diet I, Diet II and Diet III.
The researcher wants to know whether these 3 diets have a significant effect on the weight
gain of athletes. In other words, the researcher would like to see whether there is any
difference in the weight gains of athletes who follow different diets. Fourteen athletes are
randomly assigned to three groups and follow the assigned diet for 1 month.
The random assignment in this case can be done by assigning a number to each of
these 14 athletes, say the numbers 1 to 14. Using a randomizing device, random numbers are
generated and the athletes with the corresponding numbers are placed in three groups. For
example, when the first 5 random numbers are generated, the athletes associated with
these 5 numbers are assigned to Diet I. Then, another 5 random numbers are generated
and the athletes associated with these numbers are assigned to Diet II. The rest of the
athletes are assigned to Diet III.
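The random assignment described above can be sketched in Python. This is only an illustration: the function name `assign_to_diets` and the use of a seeded shuffle as the "randomizing device" are assumptions, not part of the original procedure. The group sizes (5, 5 and 4) follow the description in the text.

```python
import random


def assign_to_diets(n_athletes=14, group_sizes=(5, 5, 4), seed=None):
    """Randomly split athletes numbered 1..n_athletes into three diet groups."""
    rng = random.Random(seed)  # seed is optional; fix it only to reproduce a run
    athletes = list(range(1, n_athletes + 1))
    rng.shuffle(athletes)  # random order plays the role of the random numbers

    labels = ("Diet I", "Diet II", "Diet III")
    groups = {}
    start = 0
    for label, size in zip(labels, group_sizes):
        # The next `size` athletes in the shuffled order get this diet.
        groups[label] = sorted(athletes[start:start + size])
        start += size
    return groups


groups = assign_to_diets(seed=1)
for diet, members in groups.items():
    print(diet, members)
```

Every athlete appears in exactly one group, and each possible split is equally likely, which is what makes the design completely randomized.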
Note that the athletes are the experimental units. The weight gains of athletes (in kg)
after following the diets for one month are recorded in the following table.
             Diet
Diet I    Diet II    Diet III
   7         3          1
   8         6          2
  12         3          1
   6         2          3
   8         5
In this case, the researcher wants to know whether following different diets has any
effect on the weight gain of athletes. Thus, there is only 1 factor that affects the weight
gain in this study, which is diet. The factor (diet) is also called the independent variable,
while the weight gain is the dependent variable.
The factor in this case consists of 3 levels, or 3 different treatments. A treatment is
something the researcher administers to the experimental units. In this case, the
experimental units (athletes) are 'treated' with 3 different diets to see the effect on their
weight gains.
Analysis of Variance (ANOVA) is a procedure used to test the null hypothesis that the
means of three or more populations are equal. In this case, only 1 factor (that is, 1
independent variable) is involved, so One-way ANOVA is used to carry out this hypothesis
test.
(b) Randomized Block Design
Case 2:
However, the background of the students (the undergraduate course that they are
pursuing) might affect the English language test results. Thus, the background of the
students is a nuisance variable. To isolate the effect of this nuisance variable, a blocking
procedure is used: students with the same background are grouped into the same block.
Then, the students in each block are randomly assigned to the different teaching methods
for learning the English language. After three months, these students sit for an English
language test. The test results are presented in the following table.
                                  Student background (Blocking factor)
                                  Engineering    Science    Pharmacy
Teaching method       Method A        13            12          13
(Treatment factor)    Method B        10            15          11
                      Method C        14            11          12
                      Method D        11            14          14
The factor teaching method is known as the treatment factor, while the student
background is the blocking factor. In this study, blocking is done in order to remove the
block-to-block variability that might hide the effect of the treatment factor. Note that the
factors are also called independent variables, while the English language test result is the
dependent variable.
Ho: μA = μB = μC = μD
(There is no difference in mean result using different teaching methods)
H1: At least one mean is different from the others
(There is a difference in mean result using different teaching methods)
Ho: μE = μS = μP
(There is no difference in mean result for different student background)
H1: At least one mean is different from the others
(There is a difference in mean result for different student background)
Alternatively, we may write the hypotheses as follows:
(c) Completely Randomized Factorial Design
When the researcher is interested in investigating the effects of two factors at
different levels and in testing the interaction effect between the factors, the Completely
Randomized Factorial Design is the appropriate experimental design. Case 3 below is an
example of a Completely Randomized Factorial Design.
Case 3:
A researcher wishes to test the effects of 2 different diets and 3 different exercise
programs on the amount of weight loss. The researcher is also interested in detecting
whether an interaction exists between these two factors. To conduct this experiment, 6
treatment combinations are formed. Two adults are randomly assigned to each treatment
combination for two months. The weight loss of each adult (in kg) is recorded after two
months and is shown in the following table.
                 Exercise program
               A        B        C
Diet    I     9, 8     5, 3     2, 2
        II    6, 7     2, 2     1, 1
In this study, there are 2 factors, which are diet and exercise program. The diet
consists of 2 levels while the exercise program consists of 3 levels. The diet and the
exercise program are also called the independent variables while the weight loss is the
dependent variable.
For interaction,
Ho: There is no interaction effect between diet and exercise program
H1: There is an interaction effect between diet and exercise program
For diet,
Ho: μI = μII
(There is no difference in mean weight loss for different diets)
H1: At least one mean is different from the others
(There is a difference in mean weight loss for different diets)
For exercise program,
Ho: μA = μB = μC
(There is no difference in mean weight loss for different exercise programs)
H1: At least one mean is different from the others
(There is a difference in mean weight loss for different exercise programs)
Alternatively, we may write the hypotheses as follows:
For interaction,
For diet,
Note that the first set of hypotheses concerns the interaction effect, while the
second and third sets concern the main effects. In this case, the Two-way ANOVA with
Interaction procedure is used to carry out the hypothesis tests.
Hypothesis tests on the main effects consider whether the null hypothesis
Ho: μI = μII or Ho: μA = μB = μC should be rejected. In other words, a hypothesis test on a
main effect asks whether the mean response differs among the levels of that independent
variable.
When an interaction effect exists between two independent variables (factors), the
effect of one independent variable on the dependent variable depends on the level of the
other independent variable.
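To make the idea of interaction concrete, the following Python sketch uses the Case 3 data to compare the difference in mean weight loss between Diet I and Diet II at each exercise program. The names `weight_loss` and `cell_mean` are illustrative assumptions. If the difference between diets were the same for every program, the sample would show no interaction at all. Unequal differences only hint at interaction, and the formal ANOVA test is still needed to judge whether they are significant.

```python
# Weight loss (kg) from Case 3, keyed by (diet, exercise program).
weight_loss = {
    ("I", "A"): [9, 8], ("I", "B"): [5, 3], ("I", "C"): [2, 2],
    ("II", "A"): [6, 7], ("II", "B"): [2, 2], ("II", "C"): [1, 1],
}


def cell_mean(diet, program):
    """Mean weight loss for one treatment combination."""
    values = weight_loss[(diet, program)]
    return sum(values) / len(values)


# Effect of diet (Diet I minus Diet II) at each level of exercise program.
differences = {}
for program in ("A", "B", "C"):
    differences[program] = cell_mean("I", program) - cell_mean("II", program)
    print(f"Program {program}: Diet I - Diet II = {differences[program]:.1f} kg")
```

Here the differences are 2.0, 2.0 and 1.0 kg, so the effect of diet changes only slightly across exercise programs.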
5.2 One-Way Analysis of Variance
ANOVA (Analysis of Variance) is a procedure used to test the null hypothesis that the
means of three or more populations are equal.
One-Way ANOVA
             Diet
Diet I    Diet II    Diet III
   7         3          1
   8         6          2
  12         3          1
   6         2          3
   8         5
Can we conclude that there are differences in mean weight gain among the 3 different diets?
Use α = 0.01.
Solution:
Step 1 Hypothesis:
Total sample size = N = 14
$\sum x_i = 67$
$CF = \dfrac{(\sum x_i)^2}{N} = \dfrac{67^2}{14} = 320.643$
ANOVA Table
Source                   df                     Sum of Squares (SS)   Mean Square (MS)               Fo
Diet (Between groups)    k - 1 = 3 - 1 = 2      SSB = 100.007         MSB = SSB/(k - 1) = 50.004     Fo = MSB/MSE = 16.01
Error (Within groups)    N - k = 14 - 3 = 11    SSE = 34.35           MSE = SSE/(N - k) = 3.123
Total                    N - 1 = 14 - 1 = 13    SST = 134.357
Step 4 Decision:
Step 5 Conclusion:
There is enough evidence to conclude that there are differences in mean weight gain
among 3 different diets.
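The quantities in the ANOVA table can be checked with a short Python sketch. It uses the correction-factor formulas shown in the solution (CF, SST, SSB, SSE); the function name `one_way_anova` and the dictionary layout are assumptions made for illustration, not part of the original notes.

```python
# Weight gains (kg) for each diet, copied from the table in this section.
diets = {
    "Diet I": [7, 8, 12, 6, 8],
    "Diet II": [3, 6, 3, 2, 5],
    "Diet III": [1, 2, 1, 3],
}


def one_way_anova(groups):
    """Return the one-way ANOVA quantities using the correction-factor method."""
    values = [x for group in groups.values() for x in group]
    n_total = len(values)          # N
    k = len(groups)                # number of treatments

    cf = sum(values) ** 2 / n_total                       # CF = (sum x)^2 / N
    sst = sum(x * x for x in values) - cf                 # total sum of squares
    ssb = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cf  # between groups
    sse = sst - ssb                                       # within groups (error)

    msb = ssb / (k - 1)
    mse = sse / (n_total - k)
    return {"CF": cf, "SST": sst, "SSB": ssb, "SSE": sse,
            "MSB": msb, "MSE": mse, "F": msb / mse}


result = one_way_anova(diets)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

Rounded to the precision used in the table, the sketch reproduces CF = 320.643, SSB = 100.007, SSE = 34.35 and Fo = 16.01.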
5.3 Two-Way Analysis of Variance
                                  Student background (Blocking factor)
                                  Engineering    Science    Pharmacy
Teaching method       Method A        13            12          13
(Treatment factor)    Method B        10            15          11
                      Method C        14            11          12
                      Method D        11            14          14
i) Do the data provide enough evidence to indicate that a difference exists in the
average result using different teaching methods? Use α = 0.05.
ii) Is there enough evidence to indicate that mean result differs for different
student background? Test at 5% level of significance.
Solution:
Step 1 Hypothesis:
Ho: μA = μB = μC = μD
(There is no difference in mean result using different teaching methods)
H1: At least one mean is different from the others
(There is a difference in mean result using different teaching methods)
Ho: μE = μS = μP
(There is no difference in mean result for different student background)
H1: At least one mean is different from the others
(There is a difference in mean result for different student background)
Step 2 Test statistic:
$F_1 = \dfrac{MSR}{MSE}$ for teaching method (Row)
$F_2 = \dfrac{MSC}{MSE}$ for student background (Column)
                                  Student background (Blocking factor)
                                  Engineering    Science    Pharmacy    Total
Teaching method       Method A        13            12          13        38
(Treatment factor)    Method B        10            15          11        36
                      Method C        14            11          12        37
                      Method D        11            14          14        39
                      Total           48            52          50       150
$\sum x = 13 + 12 + \dots + 14 + 14 = 150$
$\sum x^2 = 13^2 + 12^2 + \dots + 14^2 + 14^2 = 1902$
ANOVA Table
Sources of variation           df                    Sum of squares   Mean square                           F
Teaching method (row)          r - 1 = 3             SSR = 1.667      MSR = SSR/(r - 1) = 0.556             F1 = MSR/MSE = 0.143
Student background (column)    c - 1 = 2             SSC = 2          MSC = SSC/(c - 1) = 1                 F2 = MSC/MSE = 0.257
Error                          (r - 1)(c - 1) = 6    SSE = 23.333     MSE = SSE/[(r - 1)(c - 1)] = 3.889
Total                          rc - 1 = 11           SST = 27
For teaching method: Reject Ho if $F_1 > F_{\alpha,\, r-1,\, (r-1)(c-1)} = F_{0.05,\, 3,\, 6} = 4.76$
For student background: Reject Ho if $F_2 > F_{\alpha,\, c-1,\, (r-1)(c-1)} = F_{0.05,\, 2,\, 6} = 5.14$
Step 4 Decision:
Step 5 Conclusion:
There is not enough evidence to indicate that a difference exists in the average result
using different teaching methods.
There is not enough evidence to indicate that mean result differs for different student
background.
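As a check on the table above, the sums of squares for this randomized block example can be recomputed with a small Python sketch. The function name `two_way_anova` and the list-of-rows layout are illustrative assumptions. The critical values 4.76 and 5.14 are the ones quoted in the rejection rules of this solution.

```python
# Test results: rows are teaching methods A-D, columns are
# Engineering, Science, Pharmacy (one observation per cell).
scores = [
    [13, 12, 13],  # Method A
    [10, 15, 11],  # Method B
    [14, 11, 12],  # Method C
    [11, 14, 14],  # Method D
]


def two_way_anova(table):
    """Two-way ANOVA without replication, using the correction-factor method."""
    r, c = len(table), len(table[0])
    values = [x for row in table for x in row]

    cf = sum(values) ** 2 / (r * c)
    sst = sum(x * x for x in values) - cf
    ssr = sum(sum(row) ** 2 for row in table) / c - cf          # rows (treatment)
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - cf    # columns (blocks)
    sse = sst - ssr - ssc

    mse = sse / ((r - 1) * (c - 1))
    f_row = (ssr / (r - 1)) / mse
    f_col = (ssc / (c - 1)) / mse
    return {"SSR": ssr, "SSC": ssc, "SSE": sse, "SST": sst,
            "MSE": mse, "F1": f_row, "F2": f_col}


result = two_way_anova(scores)
print(f"F1 = {result['F1']:.3f}, reject Ho: {result['F1'] > 4.76}")
print(f"F2 = {result['F2']:.3f}, reject Ho: {result['F2'] > 5.14}")
```

Both F statistics fall well below their critical values, which agrees with the conclusion that neither null hypothesis is rejected.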
5.4 Two-Way Analysis of Variance with Interaction
                 Exercise program
               A        B        C
Diet    I     9, 8     5, 3     2, 2
        II    6, 7     2, 2     1, 1
i) Is there any interaction effect between diet and exercise program? Use α = 0.05.
ii) Do the data provide sufficient evidence to indicate that there is a significant effect
on mean weight loss due to different diets? Test at 5% level of significance.
iii) Do the data provide enough evidence to indicate that there is a significant effect
on mean weight loss due to different exercise programs? Use α = 0.05.
Solution:
Step 1 Hypothesis:
For interaction,
Ho: There is no interaction effect between diet and exercise program
H1: There is an interaction effect between diet and exercise program
For diet,
Ho: μI = μII
(There is no difference in mean weight loss for different diets)
H1: At least one mean is different from the others
(There is a difference in mean weight loss for different diets)
For exercise program,
Ho: No effect on mean weight loss due to different exercise programs
H1: There is an effect on mean weight loss due to different exercise programs
Step 2 Test statistic:
$F_1 = \dfrac{MSR}{MSE}$ for diet (Row),
$F_2 = \dfrac{MSC}{MSE}$ for exercise program (Column),
$F_3 = \dfrac{MSRC}{MSE}$ for interaction
                 Exercise program
               A        B        C       Total
Diet    I     9, 8     5, 3     2, 2      29
              (17)     (8)      (4)
        II    6, 7     2, 2     1, 1      19
              (13)     (4)      (2)
Total          30       12       6        48
(Cell totals are shown in parentheses.)
$\sum x = 9 + 8 + \dots + 1 + 1 = 48$
$\sum x^2 = 9^2 + 8^2 + \dots + 1^2 + 1^2 = 282$
$CF = \dfrac{(\sum x)^2}{rcn} = \dfrac{48^2}{2(3)(2)} = 192$
For interaction:
$SSRC = \dfrac{17^2 + 8^2 + 4^2 + 13^2 + 4^2 + 2^2}{n} - SSR - SSC - CF = \dfrac{558}{2} - SSR - SSC - CF = 0.67$
ANOVA Table
Source of Variation          df                    Sum of squares   Mean Square                             F
Diet (Row)                   r - 1 = 1             SSR = 8.33       MSR = SSR/(r - 1) = 8.33                F1 = MSR/MSE = 16.66
Exercise program (Column)    c - 1 = 2             SSC = 78         MSC = SSC/(c - 1) = 39                  F2 = MSC/MSE = 78
Interaction                  (r - 1)(c - 1) = 2    SSRC = 0.67      MSRC = SSRC/[(r - 1)(c - 1)] = 0.335    F3 = MSRC/MSE = 0.67
Error                        rc(n - 1) = 6         SSE = 3          MSE = SSE/[rc(n - 1)] = 0.5
Total                        rcn - 1 = 11          SST = 90
Step 4 Decision:
For diet: Since F1 = 16.66 > 5.99, reject H0.
For exercise program: Since F2 = 78 > 5.14, reject H0.
For interaction: Since F3 = 0.67 < 5.14, do not reject H0.
Step 5 Conclusion:
There is not enough evidence to indicate that there is an interaction effect between
diet and exercise program.
There is enough evidence to indicate that there is a significant effect on mean weight
loss due to different diets.
There is enough evidence to indicate that there is a significant effect on mean weight
loss due to different exercise programs.
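The factorial calculation can also be sketched in Python as a check on the sums of squares. The function name `two_way_anova_interaction` and the nested-list layout are assumptions made for illustration. Note that the sketch works with unrounded values, so it gives F1 ≈ 16.67, whereas the table obtains 16.66 from the rounded SSR = 8.33.

```python
# Weight loss (kg): data[i][j] holds the n = 2 replicates for
# diet i (I, II) and exercise program j (A, B, C).
data = [
    [[9, 8], [5, 3], [2, 2]],  # Diet I
    [[6, 7], [2, 2], [1, 1]],  # Diet II
]


def two_way_anova_interaction(cells):
    """Two-way ANOVA with interaction, equal replicates per cell."""
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])
    values = [x for row in cells for cell in row for x in cell]

    cf = sum(values) ** 2 / (r * c * n)
    sst = sum(x * x for x in values) - cf

    row_totals = [sum(sum(cell) for cell in row) for row in cells]
    col_totals = [sum(sum(cells[i][j]) for i in range(r)) for j in range(c)]
    cell_totals = [sum(cell) for row in cells for cell in row]

    ssr = sum(t * t for t in row_totals) / (c * n) - cf
    ssc = sum(t * t for t in col_totals) / (r * n) - cf
    ssrc = sum(t * t for t in cell_totals) / n - ssr - ssc - cf
    sse = sst - ssr - ssc - ssrc

    mse = sse / (r * c * (n - 1))
    return {
        "SSR": ssr, "SSC": ssc, "SSRC": ssrc, "SSE": sse, "SST": sst,
        "F1": (ssr / (r - 1)) / mse,                 # diet (row)
        "F2": (ssc / (c - 1)) / mse,                 # exercise program (column)
        "F3": (ssrc / ((r - 1) * (c - 1))) / mse,    # interaction
    }


result = two_way_anova_interaction(data)
for name, value in result.items():
    print(f"{name} = {value:.3f}")
```

Comparing F1, F2 and F3 with the critical values 5.99, 5.14 and 5.14 used in Step 4 leads to the same decisions as above.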
Cell means:
                 Exercise program
               A        B        C
Diet I        8.5       4        2
Diet II       6.5       2        1
[Figure: plot of the cell means against exercise program (A, B, C), with one line for Diet I and one line for Diet II]
Chapter 5 Analysis of Variance
1. From time to time, unknown to its employees, the research department at a company
observes various employees for their work productivity. Recently this company wanted
to check whether the four planners at a branch of the company plan, on average, the
same number of orders. The following table gives the number of orders planned by four
planners.
At the 5% significance level, test the null hypothesis that the mean number of
orders planned by each of these four planners is the same. Assume that all
the assumptions required to apply the one-way ANOVA procedure hold true.
a) Is there a difference in the mean distance using 4 grades of gasoline? Use α = 0.05.
b) What is the type of experimental design used in this study?
3. A laboratory has many voltmeters that are used interchangeably to make voltage
measurements. A sample of three voltmeters is chosen at random. Each voltmeter
is used to measure a constant emf of 100 volts. The following voltage readings were
recorded.
      Voltmeter
  I     II    III
 48     55     84
 73     85     68
 51     70     95
 65     69     74
 87     90     67
Is there a difference in the mean voltage among these 3 voltmeters? Use α = 0.05.
4. Eight different machines are being considered for use in manufacturing bottle caps.
The tensile strength of the bottle caps produced by each machine is of interest. A
random sample of 4 bottle caps produced by each machine was selected and their
tensile strength recorded. The ANOVA table below summarizes the results
obtained.
Complete the above table. Test whether there is a difference in the mean tensile strength
of the bottle caps produced by the eight machines. Use α = 0.025.
5. The effect of three different paints on the corrosion of pipes was studied. The three
paints were randomly assigned to pipe segments so that each paint was used on four
segments of pipe. The segments were painted and were observed for a period of six
months. The accompanying corrosion readings were then obtained. Is there sufficient
evidence to indicate a difference in the mean levels of corrosion for the three paints? Use
ANOVA with α = 0.01.
Exercise 5.3 - Two-way ANOVA
                        Shift
              Morning   Afternoon   Night
Location  A     700        702       621
          B     869        823       700
a) At the 0.01 level of significance, do the data provide sufficient evidence to indicate
that there is a significant effect on output due to location?
b) At the 0.01 level of significance, do the data provide sufficient evidence to indicate
that there is a significant effect on output due to shift?
c) What is the type of experimental design used in this study? Identify the treatment
and blocking factor in this study.
d) Give a reason why blocking is carried out in an experiment.
                          Humidity
Plastic type    10%     30%     50%     70%     Total
M               39      33      33.6    32      137.6
N               36.9    27.2    29.7    28.5    122.3
P               29.6    30.2    27.7    32.5    120.0
Total           105.5   90.4    91      93      379.9
At the 1% level of significance, do the data provide sufficient evidence to indicate
that there is an effect of
i) Types of plastic (Blocking factor)
ii) Humidity (Treatment factor)
on the force required to pull apart glued plastic?
Exercise 5.4 - Two-way ANOVA with Interaction
                          Worker
             Worker 1    Worker 2    Worker 3
Machine  A   700, 660    682, 742    627, 700
         B   860, 689    813, 789    704, 711
a) What is the type of experimental design used in this study? How many treatment
combinations are there?
b) At the 0.01 level of significance, do the data provide sufficient evidence to indicate that
there is
i) a significant effect on number of rejected products due to worker?
ii) a significant effect on number of rejected products due to machine?
iii) a significant effect on number of rejected products due to interaction of worker and
machine?
[ Answer: i) F = 5.125, do not reject Ho, ii) F = 1.513, do not reject Ho, iii) F = 0.228, do not
reject Ho ]
2. The response time (in milliseconds) in an electric circuit was the subject of a recent
experiment. Three different circuits and five different testers were chosen at random.
Each tester measured the response time in each circuit three times. The results are
summarized in the following ANOVA table.
[ Answer: F1 = 4.976, reject Ho, F2 = 1.2802, do not reject Ho, F3 = 2.0393, do not reject Ho ]