One-Way ANOVA
Why DOE?
Sound decisions require quality data.
Objective of DOE
To identify factors that affect the response variable(s).
To maximize the amount of information about the
relationship between the response and treatment.
To avoid bias.
Observational vs. Designed Sampling Experiments
Observational sampling experiment
Analyst is just an observer of the data and has no
control over the variables of the study.
Designed sampling experiment
Analyst attempts to control the levels of one or more
variables to determine their effect on the response
variable.
Motivating Example
Objective: Compare 4 brands of tires for treadwear using 16 tires
from 4 cars.
Design 1
Position  Car 2  Car 3  Car 4
LF        B      C      D
RF        B      C      D
LR        B      C      D
RR        B      C      D
Design 2
Position  Car 2  Car 3  Car 4
LF        B      A      B
RF        A      B      A
LR        C      D      C
RR        D      C      D
Design 3
Position  Car 2  Car 3  Car 4
LF        B      C      D
RF        A      C      D
LR        D      A      B
Motivating Example
Design 1 (unacceptable design)
The
differences in tire wear among the four brands would
Elements of DOE
1. Response variable (dependent variable): the variable
of interest to be measured in the experiment
SAT score; Household income
2. Factors (independent variables): variables whose
Gender
Example
Goal: To compare the mean distance traveled by four brands (A, B, C, D) of golf balls.
Principles of DOE
One-way ANOVA is used to test the claim that three or more population means are equal.
This is an extension of the two independent samples t-test.
The response variable is the variable you're comparing.
The factor is the categorical variable being used to define the groups.
The "one-way" is because each value is classified in exactly one way.
Examples include comparisons by gender, race, political party, color, etc.
then test
around
treatment means that is attributed to
sampling
error.
Assumptions
The samples are randomly selected from k treatment populations
rows:
front, middle, and back
taken
A One-Way ANOVA
Example
The summary statistics for the grades of each row are shown in the table below.
Row          Front    Middle   Back
Mean         75.71    67.11    53.50
St. Dev      17.63    10.95    8.96
Variance     310.90   119.86   80.29
Sample size
A One-Way ANOVA
Example
Variation
Variation is the sum of the squares of the deviations between each value and the mean of the values.
Sum of Squares is abbreviated SS.
Mean Square is abbreviated MS.
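As a minimal sketch of these definitions, the Python below computes a sum of squares and the matching mean square for one small sample. The function names and the data values are hypothetical, chosen only for illustration; they do not come from the lecture example.

```python
def sum_of_squares(values):
    """Variation (SS): sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def mean_square(values):
    """MS = SS / df, using df = n - 1 for a single sample."""
    return sum_of_squares(values) / (len(values) - 1)


# Hypothetical scores, not taken from the lecture example.
scores = [70, 80, 90]
ss = sum_of_squares(scores)   # (-10)^2 + 0^2 + 10^2
ms = mean_square(scores)      # SS divided by n - 1 = 2
print(ss, ms)
```

For a single sample, the mean square computed this way is the ordinary sample variance.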
One-Way ANOVA
Sum of Squares due to Error (SSE)
Are the values within each group identical?
No, there is some variation within the treatments (groups).
This is called the within-treatment variation.
It is sometimes called the error variation.
It is denoted SSE, for the Sum of Squares (variation) due to Error.
A One-Way ANOVA
Example
There are two sources of variation:
the variation between the treatments, SST, or the variation due to the treatments
the variation within the treatments (groups), SSE, or the variation that can't be explained by the factor, so it's called the error variation.
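The split into two sources can be checked numerically. The sketch below uses a small hypothetical data set (three made-up groups, not the classroom data) and confirms that the between-treatment and within-treatment variation add up to the total variation around the grand mean.

```python
# Hypothetical data: three treatments (groups) with three values each.
groups = {
    "A": [4.0, 6.0, 8.0],
    "B": [7.0, 9.0, 11.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [v for vals in groups.values() for v in vals]
grand_mean = sum(all_values) / len(all_values)

# Between-treatment variation (SST): group size times the squared distance
# of each group mean from the grand mean.
sst = sum(
    len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
    for vals in groups.values()
)

# Within-treatment variation (SSE): squared deviations of each value
# from its own group mean.
sse = sum(
    (v - sum(vals) / len(vals)) ** 2
    for vals in groups.values()
    for v in vals
)

# Total variation: squared deviations of every value from the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

print(sst, sse, ss_total)
```

Up to floating-point rounding, `sst + sse` equals `ss_total`, which is why the Total row of the ANOVA table is the sum of the other two rows.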
Source    SS        df    MS    F             p
Between   SST       k-1   MST   F = MST/MSE   Pr(F > observed F)
Within    SSE       n-k   MSE
Total     SST+SSE   n-1
One-Way ANOVA
Grand Mean for our example is 65.08
The Between Group Variation for our example is
SST=1902
The within group variation for our example is
SSE=3386
One-Way ANOVA
After filling in the sum of squares, we have
Source    SS     df   MS
Between   1902
Within    3386
Total     5288
One-Way ANOVA
Degrees of freedom
Degrees of Freedom, df
A degree of freedom occurs for each value that can
vary before the rest of the values are predetermined
For example, if you had six numbers that had an
average of 40, you would know that the total had to
be 240. Five of the six numbers could be anything,
but once the first five are known, the last one is
fixed so the sum is 240. The df would be 6-1=5
The df is often one less than the number of values
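The six-number example can be written out as a short sketch. The five "free" values below are arbitrary, hypothetical picks; any other five would work the same way.

```python
# Six numbers with a mean of 40 must sum to 6 * 40 = 240.
n, mean = 6, 40
required_total = n * mean

# Any five values may be chosen freely (arbitrary, hypothetical picks)...
free_values = [35, 50, 42, 28, 45]

# ...but the sixth is then forced by the required total.
last_value = required_total - sum(free_values)

df = n - 1
print(last_value, df)
```

Changing any of the five free values changes `last_value`, but never the count of values that were free to vary.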
One-Way ANOVA
Degrees of freedom
The df for treatment is one less than the
number of groups (treatments)
We have three groups, so df for treatment = 2
One-Way ANOVA
Filling in the degrees of freedom gives this
Source    SS     df   MS
Between   1902   2
Within    3386   21
Total     5288   23
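The three df values in the table can be reproduced with a few lines. This sketch assumes the usual formulas k - 1, n - k, and n - 1; the slides state only the treatment rule explicitly, and n = 24 is inferred here from the total df of 23.

```python
k = 3    # number of treatments: front, middle, back
n = 24   # total sample size, inferred from the total df of 23 (assumption)

df_treatment = k - 1   # between groups
df_error = n - k       # within groups
df_total = n - 1

# The treatment and error df add up to the total df.
assert df_treatment + df_error == df_total
print(df_treatment, df_error, df_total)
```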
One-Way ANOVA
Mean Squares
The mean squares are abbreviated MS.
Each is an average squared deviation from the mean, found by dividing the sum of squares by the corresponding degrees of freedom.
MS = SS / df
Variance = Variation / df
One-Way ANOVA
Mean Squares
MST= 1902 / 2= 951.0
MSE= 3386 / 21= 161.2
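The same arithmetic as a quick check, using the SS and df values from the table (the variable names are only illustrative):

```python
ss_treatment, df_treatment = 1902, 2
ss_error, df_error = 3386, 21

# MS = SS / df for each source of variation.
mst = ss_treatment / df_treatment
mse = ss_error / df_error

print(round(mst, 1), round(mse, 1))
```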
One-Way ANOVA
Completing the MS gives
Source    SS     df   MS
Between   1902   2    951.0
Within    3386   21   161.2
Total     5288   23
One-Way ANOVA
F Statistic
F test statistic
An F test statistic is the ratio of MST to
MSE
F = MST / MSE
One-Way ANOVA
Adding F to the table
Source    SS     df   MS      F
Between   1902   2    951.0   5.9
Within    3386   21   161.2
Total     5288   23
One-Way ANOVA
F test
The F test is a right-tailed test.
The F test statistic has an F distribution with numerator df being the df for treatment and denominator df being the df for error.
The p-value is the area to the right of the observed test statistic.
P(F(2, 21) > 5.9) = 0.009
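The right-tail area can be reproduced without a statistics library in this particular case. The sketch below relies on a closed form that holds only when the numerator df is 2: P(F > x) = (d / (d + 2x))^(d/2), where d is the denominator df. The helper name is hypothetical, and the formula is a shortcut for this example, not a general method.

```python
def f_sf_numerator_df2(f_value, df_den):
    """P(F > f_value) for an F distribution with 2 numerator df.

    Uses the closed form (df_den / (df_den + 2 * f_value)) ** (df_den / 2),
    which is valid only when the numerator df equals 2.
    """
    return (df_den / (df_den + 2 * f_value)) ** (df_den / 2)


mst = 1902 / 2       # mean square for treatment
mse = 3386 / 21      # mean square for error
f_stat = mst / mse   # F = MST / MSE

# Right-tail area beyond the observed F, with 2 and 21 df.
p_value = f_sf_numerator_df2(f_stat, 21)
print(round(f_stat, 1), round(p_value, 3))
```

For other numerator df, a library routine is the usual choice; for example, `scipy.stats.f.sf(x, dfn, dfd)` returns the same right-tail area if SciPy is available.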
One-Way ANOVA
Completing the table with the p-value
Source    SS     df   MS      F     p
Between   1902   2    951.0   5.9   0.009
Within    3386   21   161.2
Total     5288   23   229.9
One-Way ANOVA
Making Conclusions
The p-value is 0.009, which is less than the significance level of 0.05, so we reject the null hypothesis.
The null hypothesis is that the means of the three rows in class are the same. We reject that, so at least one row has a different mean.
There is enough evidence to support the claim that there is a difference in the mean scores of the front, middle, and back rows in class.
The ANOVA doesn't tell which row is different; you would need to run post hoc tests to determine that.
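Written out symbolically, the test in this example has the following form. The subscripted symbols are notation introduced here for illustration; the slides state the hypotheses only in words.

```latex
H_0:\ \mu_{\text{front}} = \mu_{\text{middle}} = \mu_{\text{back}}
\qquad
H_1:\ \text{at least one row mean differs}

\text{Reject } H_0 \text{ because } p = 0.009 < \alpha = 0.05 .
```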
Example 1
Objective: Determine whether temperature has an effect on yield.
Example 2
Objective: Determine whether temperature has an effect on yield.
Example 3
Objective: Compare the mean distance of the four golf ball brands.