Analysis of Variance
Introduction
Many statistical problems are solved by inference about population means. In this
chapter we extend the methods of inference about population means to the
comparison of more than two means. When the data have been obtained according
to specified sampling procedures, they are easy to analyze and may also contain
more information pertinent to the population means than could be obtained using
simple random sampling. The procedure for selecting sample data is called the
design of an experiment (experimental design), and a statistical procedure for
comparing the population means is called the analysis of variance (ANOVA).
Read and make notes on the types of experimental study designs.
ANOVA
ANOVA stands for analysis of variance.
It is one of the multivariate data analysis techniques.
It is defined as a technique whereby the total variation present in
a set of data is partitioned into two or more components.
Associated with each of the components is a specific source
of variation, so that in the analysis it is possible to
ascertain the magnitude of the contribution of each of these
sources to the total variation.
Application
i. Treatment variable
ii. Response variable
iii. Extraneous variables
iv. Experimental unit
The area given in the F tables is normally the area to the
right of the F-value, for the given pair of degrees of freedom
(numerator and denominator).
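Illustrative examples
Suppose we have the following information:
(i) $n_1 = 11$, $n_2 = 4$
(ii) $\alpha = 0.05$
The degrees of freedom are $n_1 - 1 = 10$ and $n_2 - 1 = 3$, so the tabulated value is
$F_{0.05,(10,3)} = 8.79$
Similarly, find from the tables:
i. $F_{0.05,(4,20)}$
ii. $F_{0.05,(20,4)}$
iii. $F_{0.1,(5,10)}$
iv. $F_{0.1,(10,5)}$
v. $F_{0.025,(12,2)}$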
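When printed F tables are not at hand, the tabulated values can be approximated numerically. The sketch below is not the method of these notes (which read the values from tables). It assumes the standard relation between the right-tail area of the F distribution and the regularized incomplete beta function, and uses only the Python standard library. The function names `f_upper_tail` and `f_critical` are hypothetical names chosen for illustration.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (Lentz's method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the continued fraction
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd step of the continued fraction
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(ln_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_upper_tail(f, dfn, dfd):
    """Area to the right of f under the F(dfn, dfd) density."""
    if f <= 0.0:
        return 1.0
    return _betainc(dfd / 2.0, dfn / 2.0, dfd / (dfd + dfn * f))

def f_critical(alpha, dfn, dfd):
    """Value with right-tail area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, dfn, dfd) > alpha:  # widen until the tail is small enough
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, dfn, dfd) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Worked case from the notes: alpha = 0.05 with (10, 3) degrees of freedom
print(round(f_critical(0.05, 10, 3), 2))  # 8.79
```

The same call can be used to check the remaining table look-ups, for example `f_critical(0.1, 5, 10)`.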
When we look at the effect of fertilizer alone, without considering soil type,
the analysis is referred to as ONE-WAY ANOVA.
When we look at the effect of both fertilizer and soil type, we have a
TWO-WAY ANOVA.
When we consider more than two factors, we are dealing with n-WAY ANOVA.
For this course, we shall only look at ONE-WAY ANOVA.
ONE-WAY ANOVA
One-way ANOVA has three components, namely:
1. SST = Total Sum of Squares
2. SSC = Sum of squares for column means
3. SSE = Error Sum of Squares
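Sum of Squares computational Formulas
$SST = \sum_{i,j} X_{ij}^2 - \frac{T_{..}^2}{N}$ ... (i)
where $T_{..}$ is the grand total of all observations and $N$ is the total number of observations.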
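The computational formula can be sketched in a few lines of Python. Only the SST formula survives in these notes. The SSC expression used below (sum of squared column totals, each divided by its column size, minus the same correction term) and the identity SSE = SST - SSC are the usual textbook forms and are assumed here. The function name `sums_of_squares` and the small data set are made up for illustration.

```python
def sums_of_squares(columns):
    """Return (SST, SSC, SSE) for data given as a list of columns (treatments)."""
    all_values = [x for col in columns for x in col]
    n_total = len(all_values)                 # N
    grand_total = sum(all_values)             # T..
    correction = grand_total ** 2 / n_total   # T..^2 / N

    # SST = sum of squared observations minus the correction term (formula (i))
    sst = sum(x * x for x in all_values) - correction
    # SSC = sum over columns of (column total)^2 / (column size), minus the correction term
    ssc = sum(sum(col) ** 2 / len(col) for col in columns) - correction
    # SSE is obtained by subtraction
    sse = sst - ssc
    return sst, ssc, sse

# Hypothetical data: three treatments with three observations each
sst, ssc, sse = sums_of_squares([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(sst, ssc, sse)  # 32.0 26.0 6.0
```

Obtaining SSE by subtraction avoids a second pass over the data, at the cost of hiding any arithmetic slip made in SST or SSC.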
Source      SS    df      MS                  V.R
Treatment   SSC   C - 1   MSC = SSC/(C - 1)   V.R = MSC/MSE
Residual    SSE   N - C   MSE = SSE/(N - C)
Total       SST   N - 1
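As a rough illustration, the table entries can be filled in from the two sums of squares, the number of treatments C, and the total number of observations N. The function name `anova_table` and the input numbers are hypothetical.

```python
def anova_table(ssc, sse, c, n_total):
    """Build the one-way ANOVA table from SSC, SSE, C treatments and N observations."""
    df_treatment = c - 1
    df_residual = n_total - c
    msc = ssc / df_treatment   # treatment mean square
    mse = sse / df_residual    # residual mean square
    return {
        "Treatment": {"SS": ssc, "df": df_treatment, "MS": msc, "VR": msc / mse},
        "Residual": {"SS": sse, "df": df_residual, "MS": mse},
        "Total": {"SS": ssc + sse, "df": n_total - 1},
    }

# Hypothetical input: SSC = 26, SSE = 6, C = 3 treatments, N = 9 observations
table = anova_table(26.0, 6.0, 3, 9)
for source, row in table.items():
    print(source, row)
```

The variance ratio in the Treatment row is the value compared with the tabulated F value.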
Steps in ANOVA
1. Hypothesis statement
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: at least two of the means are not equal
OR
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_A$: at least two of the variances are not the same
2. Critical region (values)
Reject $H_0$ if $f > f_\alpha[k-1,\ k(n-1)]$
3. Computation of test statistic
We use the formulae and summarize the results in an ANOVA table,
whose format is given below.

Source of variation   Sum of squares   Degrees of freedom   Mean square              Computed f
Column means          SSC              k - 1                $s_1^2 = SSC/(k-1)$      $f = s_1^2/s_2^2$
Error                 SSE              k(n - 1)             $s_2^2 = SSE/[k(n-1)]$
Total                 SST              nk - 1
        WKBK1   WKBK2   WKBK3   Total
        10      10      10
Total   24      54      30      108
Example two
The data in the table below represents the number of hours of pain
relief provided by 5 different brands of headache tablets
administered to 25 subjects. The 25 subjects were randomly divided
into 5 groups and each group was treated with a different brand.
Brand   1    2    3    4    5    Total
Total   26   39   20   14   33   132
Question: Perform the analysis of variance, and test the hypothesis at the $\alpha = 0.05$
level of significance that the mean number of hours of relief provided by the tablets
is the same for all five brands.
Solution:
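1. Hypothesis statement
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_1$: at least two of the means are not the same
2. Critical values
$\alpha = 0.05$, $k = 5$, $n = 5$ (25 subjects divided into 5 groups)
Reject $H_0$ if $f > f_{0.05}[4, 20] = 2.87$
3. Computation of test statistic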
Source of variation   Sum of squares   Degrees of freedom   Mean square                   Computed f
Column means          SSC = 79.440     k - 1 = 4            $s_1^2 = 79.440/4 = 19.860$   $f = s_1^2/s_2^2 = 6.90$
Error                 SSE = 57.600     k(n - 1) = 20        $s_2^2 = 57.600/20 = 2.88$
Total                 SST = 137.040    nk - 1 = 24
Decision
Since the computed $f = 6.90$ exceeds the critical value $f_{0.05}[4, 20] = 2.87$,
reject the null hypothesis and conclude that the
mean number of hours of relief provided by headache
tablets is not the same for all 5 brands.
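The arithmetic of Example two can be checked from the column totals alone, as in the sketch below. It takes the five brand totals and SST = 137.040 from the example. It assumes n = 5 subjects per brand (25 subjects in 5 groups) and the usual textbook formula for SSC in terms of column totals. The variable names are illustrative only.

```python
totals = [26, 39, 20, 14, 33]   # column totals for the five brands
n = 5                           # subjects per brand (assumed: 25 subjects / 5 groups)
sst = 137.040                   # total sum of squares, as given in the example

k = len(totals)
n_total = k * n
grand_total = sum(totals)                  # 132
correction = grand_total ** 2 / n_total    # T..^2 / N

ssc = sum(t ** 2 for t in totals) / n - correction   # sum of squares for column means
sse = sst - ssc                                      # error sum of squares
s1_sq = ssc / (k - 1)                                # mean square for column means
s2_sq = sse / (k * (n - 1))                          # error mean square
f = s1_sq / s2_sq

print(round(ssc, 3), round(sse, 3))      # 79.44 57.6
print(round(s1_sq, 3), round(s2_sq, 3))  # 19.86 2.88
print(round(f, 2))                       # 6.9
```

The computed ratio can then be compared with the tabulated value for (4, 20) degrees of freedom at the chosen level of significance.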
Note: For unequal sample sizes, the critical region is
given by