ANOVA Study Guide and Practice Exams

1) ANOVA (analysis of variance) is a statistical test used to compare the means of two or more groups. It breaks down the total variation in a data set into treatment variation (differences between group means) and error variation (random error within groups). 2) The assumptions of ANOVA are that the populations are normally distributed, have equal variances, and independent observations. The null hypothesis is that all population means are equal. 3) Key terms in ANOVA include factors (independent variables), levels (treatment groups), and response (dependent variable). ANOVA decomposes the total sum of squares into the sum of squares for treatment and error.



Introduction to ANOVA
Education comes from within; you get it by struggle and effort and
thought. - Napoleon Hill

Are you ready to think and struggle through the next topic in the
Six Sigma Green Belt Body of Knowledge - ANOVA?

ANOVA stands for ANalysis Of VAriance, and it is a hypothesis test used to compare the means of 2 or more groups.

I know what you're thinking - WHY would we test MEAN VALUES using VARIANCE?

I thought the same thing. . .

Don't worry though.

It'll all make sense soon!

Ok, so here's what you'll learn in this chapter:

First, we cover the assumptions associated with ANOVA.

Second are the common terms & definitions within ANOVA (and DOE).

Third, I start very high level to answer the question - Why Does ANOVA use Variance to Test Mean Values?

Fourth, we go through the basics of ANOVA including the Sum of Squares, Degrees of Freedom, Mean Squares, F-value.

Then we go into an example of a One-Way ANOVA and use all of the basics we just learned.

Last, but not least I'll introduce the idea of a Two-Way ANOVA, including new terms and concepts.

Ready to get started?

Green Belt Master Class – Chapter 17


ANOVA Page 1 of 18
Assumptions of ANOVA
Ok, so first things first - ANOVA is a type of Hypothesis Test used to test hypotheses
about Mean values.

So, similar to the other Hypothesis Tests we studied, ANOVA analysis is based on the
starting assumption that the null hypothesis is true.

Within ANOVA, the Null Hypothesis is always the same - that all of our population mean
values are equal.
Ho: μa = μb = μc = . . . = μk

The Alternative Hypothesis is that at least one mean value is different than the rest.

Ha: Not all means are equal


Make Sense?

The other 3 major assumptions for ANOVA are identical to the assumptions associated with the t-test.

Remember that the t-test is used to test the hypothesis that two means are equal, while ANOVA is used to test the
hypothesis that 3+ means are equal. So, it's logical that they would share common assumptions, which include:

• The Population being studied is Normally Distributed
• The Variance is the same between the various treatments (Homogeneity of Variance)
• There is Independence between sample observations

So, your treatment groups must be normally distributed, the variances between your treatments should be equal, and
your samples should be independent from each other.
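If you want to sanity-check the first two assumptions before running an ANOVA, here's a minimal sketch using SciPy (assuming SciPy is available; the horsepower numbers are just the sample data we'll use later in this chapter). Independence isn't something a test can confirm - it comes from how the samples were collected.

```python
from scipy import stats

# Three treatment groups - the octane/horsepower readings used later in this chapter
groups = [
    [223, 224, 225],  # 87 octane
    [224, 225, 226],  # 89 octane
    [225, 226, 227],  # 91 octane
]

# Normality: Shapiro-Wilk on each treatment group (H0: the group is normally distributed)
for g in groups:
    w_stat, p_norm = stats.shapiro(g)
    print(f"Shapiro-Wilk p = {p_norm:.3f}")

# Homogeneity of variance: Levene's test across groups (H0: all variances are equal)
l_stat, p_var = stats.levene(*groups)
print(f"Levene p = {p_var:.3f}")
```

A small p-value on either test is a warning flag that the corresponding assumption may be violated.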

Terminology in ANOVA
Alright, let's go deeper with ANOVA and review the common terms and their definitions.

What you'll often find is that ANOVA Analysis is paired with a Designed Experiment to measure the effect that an
independent variable has on a dependent variable.

For example, let's say you wanted to study the effect that different octane gasses have on the horsepower of your car.

You would design an experiment where you would vary the octane of gas and measure horsepower.

In this experiment, the independent variable would be the octane of gas (87, 89, 91, etc.), and the dependent variable
would be the horsepower.

The independent variable in ANOVA is called a Factor.

In the Horsepower example there is only one factor (Octane), which makes this experiment a one-way ANOVA. If there had
been two factors (Octane and Fuel Injector Size) it would be a two-way ANOVA.

You'll notice above that I listed different octanes of gas - 87, 89, 91. These are the different levels or treatments associated
with your factor.

So, our Factor (Octane) can have multiple Levels (87, 89, 91) in a One-Way ANOVA analysis.

The Response that we're measuring in this experiment is Horsepower, which is our dependent variable.

Make Sense? Let's move on from here to review the basics of ANOVA.
Why Does ANOVA use Variance to Test Mean Values?
ANOVA takes a very unique and interesting approach to determining if 3
or more sample means all have an equal population mean.

What ANOVA Analysis does - and this is truly brilliant - is it breaks down
your data set into two different sources of variation.

When you look at the standard One-Way ANOVA table (Below), you'll see
the two "sources" of variation are called the Treatment and the Error.

The Treatment Variation is the variability within the data set that can be
attributed to the difference between the different treatment groups (difference in sample means).

The Error Variation is the variability within the data set that can be attributed to the random error associated with the
response variable. This is the variability within the different treatment groups.

Recall that our null hypothesis within ANOVA is that all population means are equal.

So, if our null hypothesis were true, then we would expect that the treatment variation (difference in sample means) could
be fully explained by the random nature of the data.

The treatment variation should be nearly equal to the error variation. This statement is the key to explaining why ANOVA
uses Variance to test the difference between Means.

By breaking down our data into the two sources of variance, we can then compare those variances against each other to
see if they are statistically significantly different (the alternative hypothesis).

What I'd like to do next is to quickly go through each of the columns of the ANOVA Table and go over each topic specifically.

This includes the Sum of Squares and Degrees of Freedom, which are combined to calculate the Mean Square Values, which
are further combined to calculate your F-Statistic or F-value.

Sum of Squares in ANOVA
Step 1 in the ANOVA process is calculating the Sum of Squares for each of the sources of variation (Treatment & Error),
which add up to the total variation within the data set.

The Variation for each of these sources is calculated using the "Sum of Squares" calculation. So, we calculate the Sum of
Squares of the Treatment (SSt), and the Sum of Squares of the Error (SSe).

These different sources of variation combine to add up to the Total Sum of Squares (SStotal) within your data set.

Total Sum of Squares (SStotal) = Sum of Squares of the Treatment (SSt) + Sum of Squares of the Error (SSe)

Sum of Squares - Example


Ok, let's go over the formulas to calculate the sum of squares for the treatment, error and then the total sum of squares.

Let's say we're back on the horsepower/octane example and we've designed an experiment where we're testing 3
treatment levels (octane levels), and we're measuring horsepower 3 times per treatment.

                         Treatment Group #1   Treatment Group #2   Treatment Group #3
                         87 Octane            89 Octane            91 Octane
Sample #1 Horsepower     223 hp               224 hp               225 hp
Sample #2 Horsepower     224 hp               225 hp               226 hp
Sample #3 Horsepower     225 hp               226 hp               227 hp
Sample Mean              224 hp               225 hp               226 hp

So, Treatment group #1 captures our 3 measurements at 87 Octane, and it had a sample mean of 224 hp. Make Sense?

Before we calculate the sum of squares, let's introduce a new topic. The Grand Mean.

So, recall that our null hypothesis is that all three sample means come from the same population mean. If that's true, we
can calculate the Grand Mean, which is an estimate of the population mean.

The Grand Mean for this example is 225 hp. You can calculate the grand mean by averaging all of the individual values
within the data set.

Sum of Squares of the Treatment


Now we can use the grand mean to calculate the Sum of Squares of the Treatment (SSt).

SSt = n(X̄1 − GM)² + n(X̄2 − GM)² + n(X̄3 − GM)² + . . . + n(X̄k − GM)²

Recall that the Treatment Variation is the variability within the data set that can be attributed to the difference between
the different treatment groups.

We do this by comparing the treatment sample means (X-bar of Treatment Group 1, etc) against the grand mean. We're
also multiplying this difference by "n", which is the number of samples per treatment group.

SSt = 3(224 − 225)² + 3(225 − 225)² + 3(226 − 225)² = 3 + 0 + 3 = 6

So, the sum of squares of the treatment is equal to 6.

Sum of Squares of the Error
Now let's move on to the Sum of Squares of the Error (SSe) which can be done using the equation below:
SSe = Σ(Xi − X̄)²

By the way, do you see why it's called the sum of squares calculation? We're squaring the difference between two values,
then summing up those differences. Hence, the sum of squares.

Remember that the Error Variation is the variability within the data set that can be attributed to the random error
associated with the response variable. This is the variability within the different treatment groups.

So, we're comparing the individual values (Xi) against that value's sample mean (X̄).

SSe = (223 − 224)² + (224 − 224)² + (225 − 224)² + (224 − 225)² + (225 − 225)² + (226 − 225)²
    + (225 − 226)² + (226 − 226)² + (227 − 226)²

I've grouped the terms of this equation so you can see how the different treatment groups are considered within it.

So, treatment group 1 comes first, where the 3 individual values (223, 224 & 225) are compared against the sample
mean of treatment group #1 (224). This continues with Treatment Groups 2 & 3.
SSe = 1 + 0 + 1 + 1 + 0 + 1 + 1 + 0 + 1 = 6

So, the sum of squares of the error is equal to 6.

Total Sum of Squares


Now let's move on to the Total Sum of Squares (SStotal).

We could simply calculate the total sum of squares by adding up the Sum of Squares of the Treatment (SSt) and the Sum of
Squares of the Error (SSe). Or we could calculate it using the following equation:
SStotal = Σ(Xi − GM)²
This looks similar to the sum of squares of the error calculation; however, we're now comparing all individual values (Xi)
against the grand mean (225).

SStotal = (223 − 225)² + (224 − 225)² + (225 − 225)² + (224 − 225)² + (225 − 225)² + (226 − 225)²
        + (225 − 225)² + (226 − 225)² + (227 − 225)²

SStotal = 4 + 1 + 0 + 1 + 0 + 1 + 0 + 1 + 4 = 12

Let's compare this to the Sum of Squares of the Treatment and Sum of Squares of the Error:

Total Sum of Squares (SStotal) = Sum of Squares of the Treatment (SSt) + Sum of Squares of the Error (SSe)

12 = 6 + 6

See how that reconciles? The total sum of squares captures the total variability within the data set by comparing all
values against the grand mean, while the Treatment & Error sums of squares are unique components of that overall
variability.
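The whole decomposition is easy to verify in a few lines of code. Here's a minimal pure-Python sketch using the octane/horsepower data from the table above:

```python
# Sum-of-squares decomposition for the octane/horsepower example
groups = {
    87: [223, 224, 225],
    89: [224, 225, 226],
    91: [225, 226, 227],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = sum(all_values) / len(all_values)   # 225.0

# SSt: n * (group mean - grand mean)^2, summed over treatment groups
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())

# SSe: (x - group mean)^2, summed over every observation
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)

# SStotal: (x - grand mean)^2, summed over every observation
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

print(sst, sse, ss_total)   # 6.0 6.0 12.0
```

Notice that SSt + SSe = SStotal, exactly as the reconciliation above shows.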

Alright, so we've knocked out the Sum of Squares portion of the ANOVA Table, let's move on to the Degrees of Freedom
column.

Degrees of Freedom in ANOVA
Step 2 in the ANOVA process is to calculate the degrees of freedom (DF) for each source of variation (Treatment & error),
which add up to the total degrees of freedom.

We will use the degrees of freedom to "normalize" the sum of squares data to convert the raw "variation" into estimations
of the population variance in step 3 (Mean Squares).

So, we will start by calculating the total degrees of freedom (DFtotal) associated with the entire sample data set.
DFtotal = DFerror + DFtreatment

Then we will break this down into the degrees of freedom for the treatment (DFtreatment), and the degrees of freedom for
the error (DFerror).

The Total Degrees of Freedom is the easiest to calculate - It's the total number of observations within your data set,
minus 1. The letter N is often used to represent the total number of observations within a data set.
DFtotal = N - 1

If we go back to the example above, there are 9 total observations within our experiment, so N = 9. Then, the total degrees
of freedom is 8.
DFtotal = N - 1 = 9 - 1 = 8
The DF of the Treatment is the next easiest to calculate - It's the number of treatments minus 1. The letter "a" is often used
to denote the number of treatments, but this can vary between textbooks.
DFtreatment = a - 1

If we go back to the example above, there are 3 treatment levels (87, 89 & 91 octane) within our experiment, so a = 3.
Then, the degrees of freedom of the treatment is 2.
DFtreatment = a - 1 = 3 - 1 = 2

The DF of the Error is often calculated by taking the total degrees of freedom, and subtracting the treatment degrees of
freedom.
DFerror = DFtotal - DFtreatment = (N-1) - (a-1) = (9 - 1) - (3 - 1) = 6

Where N equals the total number of observations, which can be calculated as n(a), which is the number of treatment
groups (a) times the number of samples per treatment group (n). Using this transformation of N=n*a, we can re-arrange
the equation to this:
DFerror = a(n-1) = 3(3-1) = 6
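The degrees-of-freedom bookkeeping above can be sketched in a few lines, using a for the number of treatments and n for the samples per treatment, as in the text:

```python
# Degrees of freedom for the octane example: a = 3 treatments, n = 3 samples each
a, n = 3, 3
N = a * n                            # total number of observations = 9

df_total = N - 1                     # 8
df_treatment = a - 1                 # 2
df_error = df_total - df_treatment   # 6

# The two forms of DFerror agree: (N-1) - (a-1) == a(n-1)
print(df_total, df_treatment, df_error, df_error == a * (n - 1))
```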

Ok, before we move on to the next section, let's take a look at our ANOVA table to see how we've incorporated all we've
learned so far.

So, for the Treatment and Error we've calculated the Sum of Squares for each and the Degrees of Freedom for each. Now
it's time to use that information to move to step 3 in the ANOVA process, calculating the Mean Squares.

Mean Squares in ANOVA
Alright, on to step 3 of the ANOVA analysis, which is the calculation of our Mean Squares, where again we'll calculate a
Mean Square of the Treatment (MST), and a Mean Square of the Error (MSE).

Mean Squares are calculated using the equation below:

Mean Square = Sum of Squares / Degrees of Freedom

We take our Sum of Squares and we divide it by our degrees of freedom. We will do this for both our treatment and error
terms.

To help explain what the Mean Squares represents I'll jump right to the calculation for the Mean Square of the Error:

Mean Square of the Error = Sum of Squares of the Error / Degrees of Freedom of the Error = Σ(Xi − X̄)² / [(N − 1) − (a − 1)]

This might look super familiar to how we calculate sample variance:

Sample Variance = s² = Σ(x − x̄)² / (n − 1) = sum of squares / degrees of freedom

I wanted to show you this to make the point that the Mean Square calculation is a calculation of variance!

Specifically, MST & MSE are both unique & different estimates of the population variance associated with our data set.

If our null hypothesis is true then these two estimates of the population variance will be approximately equal.

So, the Mean Square of the Error (MSE) is an estimate of the population variance that's based solely on the variability
within each treatment group.

The Mean Square Between Treatments (MST) is an estimate of the population variance that's based solely on the
variability between the treatment group sample means and the grand mean.

Can you see on the left of this image how MST (Mean Square Between the Treatment) is an estimate of the variability
between each treatment group and the grand mean?

It compares the sample mean of each treatment to the overall grand mean of the population.

Can you see on the right of this image how MSE (Mean Square of the Error) is an estimate of the variability within each
treatment group?

It compares the individual observations within each treatment group, to the sample mean of that treatment group.

Let's review these two Mean Squares individually.

MSE - The Variability Within the Treatment Groups


The first type of variation measured within ANOVA is the Mean Square of the Error, which represents the random
variability that is inherent to the response variable.

You'll see other textbooks call this the Within Treatment Variability because it's a reflection of the variability within each
individual treatment group.

If we go back to our original example of octane & horsepower, you can see that within the 87 octane group there is some
slight variability, and the same is true WITHIN each treatment group.

This random variation in horsepower that's inherent to each treatment group is the Error Variance.

This next statement is important!

Whether the null hypothesis is true or not, the MSE (Mean Square of the Error) is a good approximation of the population
variance.

To calculate the MSE, we use the following formula:


Mean Square of the Error = Sum of Squares of the Error / Degrees of Freedom of the Error

The Mean Square of the Error (MSE) is the Sum of Squares of the Error (SSerror) divided by the Degrees of Freedom of the
Error (DFe). In our specific example for octane and horsepower, the MSE is:

Mean Square of the Error = Sum of Squares of the Error / Degrees of Freedom of the Error = 6 / 6 = 1

Let's move on to discuss the next estimate of the population variance - the MST.

Treatment - The Variability Between the Treatment Groups
The second type of variability within ANOVA is called the MST or Mean Square Between the Treatments, and it's an estimate
of the population variance that's based on the difference between the sample means and the grand mean.

You might also see textbooks call this MSB for Mean Square Between (the treatments), or the Variance Between
Treatments because it reflects the variance caused by the different treatments (levels).

This measure of variability compares the sample mean of each treatment (X-bar), against the Grand Mean of the entire
sample space (all sample observations).

This next statement is important:

If the null hypothesis is false, then the MST will not be an accurate measure of the population variance.

To calculate the MST, we use the following formula:


Mean Square of the Treatment = Sum of Squares of the Treatment / Degrees of Freedom of the Treatment

The Mean Square of the Treatment (MST) is the Sum of Squares of the Treatment (SStreatment) divided by the Degrees of
Freedom of the Treatment (DFt). In our specific example for octane and horsepower, the MST is:

Mean Square of the Treatment = Sum of Squares of the Treatment / Degrees of Freedom of the Treatment = 6 / 2 = 3

Now that we've calculated both MST & MSE, let's see what our ANOVA table looks like before we move onto the final step,
calculating the F-value.
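In code, step 3 is just two divisions of the numbers we've already computed (SSt = 6, SSe = 6, and the degrees of freedom from the previous step):

```python
# Mean Squares for the octane example: Sum of Squares / Degrees of Freedom
ss_treatment, df_treatment = 6, 2
ss_error, df_error = 6, 6

mst = ss_treatment / df_treatment   # Mean Square of the Treatment = 3.0
mse = ss_error / df_error           # Mean Square of the Error = 1.0
print(mst, mse)   # 3.0 1.0
```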

Comparing the Two Variation Types Against Each Other
Alright, now we're on to the final step of ANOVA, which is to calculate the F-value associated with our data set, so that we
can make an accept/reject decision for our hypothesis test.

Recall that if the null hypothesis is true both MST & MSE will both approximate the population variance, and MST ≈ MSE.

We can compare MST & MSE against each other using the F-test, which is a ratio of the two variances in order to make an
accept/reject decision on our null hypothesis.

F-value = MST / MSE = Mean Square of the Treatment / Mean Square of the Error

If the null hypothesis is true and all sample means are equal, then MSE & MST will be approximately equal, and our F-value
will equal ~1.

If the null hypothesis is false, and one or more sample groups are not equal to the other sample groups, then MST >> MSE
and our F-value will be >>1.

Now, there might be slight differences between MST & MSE which makes MST > MSE, simply due to random chance. How
do we make sure this random variability doesn't cause us to reject the null hypothesis incorrectly?

This is why we're using the F-Test, which is a test used to determine the equality of two variances, such as MST & MSE.
The F-Test allows us to determine if our differences between MST & MSE are due to chance or if they are statistically
significant.

Many ANOVA tables also include the P-value associated with the F-value. The P-value represents the probability of getting
that big of a difference between MST & MSE (or bigger).

This P-value can be compared to your alpha risk to make an accept/reject decision.

If you're not using a statistical software that gives you a p-value, you can also look up the critical F-statistics in the F-table.

The critical F-statistic is based on your Alpha risk and degrees of freedom for the numerator (MST) & denominator (MSE).
Using the example above, the DF of the treatment is 2, and the degrees of freedom of the Error is 6.

The F-value can then be compared against the critical F-value (Fcrit) to determine if the calculated F-statistic is a
statistically significant result.
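If SciPy is available, scipy.stats.f_oneway runs this entire one-way ANOVA in one call, and scipy.stats.f.ppf looks up the critical F-value for you. A sketch using the octane data (where we found MST = 3 and MSE = 1):

```python
from scipy import stats

# One-way ANOVA on the octane/horsepower data: F = MST / MSE = 3 / 1
g1 = [223, 224, 225]  # 87 octane
g2 = [224, 225, 226]  # 89 octane
g3 = [225, 226, 227]  # 91 octane

f_value, p_value = stats.f_oneway(g1, g2, g3)
print(round(f_value, 3), round(p_value, 3))  # 3.0 0.125

# Critical F for alpha = 0.05 with DF = (2, 6)
f_crit = stats.f.ppf(1 - 0.05, dfn=2, dfd=6)
print(round(f_crit, 3))  # 5.143
```

Since 3.0 is less than 5.143 (and p = 0.125 is greater than 0.05), this small data set alone wouldn't let us reject the null hypothesis.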

Below is our ANOVA table with the F-Value.

Below is a quick visual example of two different situations to demonstrate how MST & MSE can be used to determine if one
sample has a different population mean.

On the left, MST & MSE will be approximately equal, as the 3 sample groups are roughly evenly distributed around the
grand mean. In this scenario MST would be approximately equal to MSE, and our null hypothesis would likely be true.

On the right, one of our sample means is far out to the right implying that it is not equal to the other sample means. This
means MST will be much greater than MSE, and our null hypothesis will likely be false.

One Way ANOVA Example
Ready for an example to practice?

Let's say you mold car bumpers, and you want to know if the injection temperature has an impact on the critical dimension
of the car bumper - the length of the bumper.

In this experiment the independent variable (Factor) is the injection temperature and the dependent variable (response) is
the length of the bumper.

Let's design an experiment where we'll study our factor at four different levels (temperatures) to see if this factor has an
effect on the response variable. Let's measure four parts at each of the four levels and analyze the following results:

            Treatment   Treatment   Treatment   Treatment
            Group #1    Group #2    Group #3    Group #4
Sample #1   58          61          61          60
Sample #2   56          62          57          60
Sample #3   57          59          59          57
Sample #4   58          59          60          56

Let's get started by reviewing the steps to an ANOVA Analysis.

Step 1 is to calculate the Sum of Squares for each of the sources of variation (Treatment & Error), which add up to the
total variation within the data set.

Step 2 is to calculate the degrees of freedom (DF) for each source of variation (Treatment & error), which add up to the
total degrees of freedom.

Step 3 is to calculate the Mean Squares for each source of variation (Treatment & Error), which is the Mean Square of the
Treatment (MST), and a Mean Square of the Error (MSE).

Step 4 is to calculate the F-value associated with your ANOVA by taking the ratio of MST to MSE.

Step 5 is to compare your F-value against the critical F-statistic; or to compare your P-value against your alpha risk. Either
of these comparisons will allow you to make an Accept/Reject decision for your null hypothesis.

You'll notice in these steps that we didn't start with a null hypothesis, because for ANOVA analysis, the null & alternative
hypotheses are always the same.

The Null Hypothesis is that all of the population means are equal, and the Alternative Hypothesis is that at least one
mean value is different than the rest.

Ho: μa = μb = μc = . . . = μk and Ha: Not all means are equal

Step 1 - The Sum of Squares
We're going to need to calculate the Total Sum of Squares (SStotal), the Sum of Squares of the Treatment (SSt), and the Sum
of Squares of the Error (SSe).

Before we calculate the sum of squares, we have to calculate the sample mean for each group and the Grand Mean of our
data set.
              Treatment   Treatment   Treatment   Treatment
              Group #1    Group #2    Group #3    Group #4
Sample #1     58          61          61          60
Sample #2     56          62          57          60
Sample #3     57          59          59          57
Sample #4     58          59          60          56
Sample Mean   57.25       60.25       59.25       58.25

The grand mean is the average of all data points which is 58.75.

Now we can use the grand mean to calculate the Sum of Squares of the Treatment (SSt).

SSt = n(X̄1 − GM)² + n(X̄2 − GM)² + n(X̄3 − GM)² + . . . + n(X̄k − GM)²

SSt = 4(57.25 − 58.75)² + 4(60.25 − 58.75)² + 4(59.25 − 58.75)² + 4(58.25 − 58.75)² = 20

Now let's move on to the Sum of Squares of the Error (SSe) which can be done using the equation below:
SSe = Σ(Xi − X̄)²
So, we're comparing all 16 individual values (Xi) against the sample mean for that treatment.

SSe = (58 − 57.25)² + (56 − 57.25)² + (57 − 57.25)² + (58 − 57.25)²
    + (61 − 60.25)² + (62 − 60.25)² + (59 − 60.25)² + (59 − 60.25)²
    + (61 − 59.25)² + (57 − 59.25)² + (59 − 59.25)² + (60 − 59.25)²
    + (60 − 58.25)² + (60 − 58.25)² + (57 − 58.25)² + (56 − 58.25)² = 31

I've grouped the terms by row so you can see how the different treatment groups are considered within this equation.

Now let's move on to the Total Sum of Squares (SStotal), which we could calculate by adding up the Sum of Squares of the
Treatment (20) and the Sum of Squares of the Error (31). Or we could calculate it using the following equation:
SStotal = Σ(Xi − GM)²

This looks similar to the sum of squares of the error calculation; however, we're now comparing all individual values (Xi)
against the grand mean (58.75).

SStotal = (58 − 58.75)² + (56 − 58.75)² + (57 − 58.75)² + (58 − 58.75)²
        + (61 − 58.75)² + (62 − 58.75)² + (59 − 58.75)² + (59 − 58.75)²
        + (61 − 58.75)² + (57 − 58.75)² + (59 − 58.75)² + (60 − 58.75)²
        + (60 − 58.75)² + (60 − 58.75)² + (57 − 58.75)² + (56 − 58.75)²

SStotal = 0.5625 + 7.5625 + 3.0625 + 0.5625 + 5.0625 + 10.5625 + 0.0625 + 0.0625
        + 5.0625 + 3.0625 + 0.0625 + 1.5625 + 1.5625 + 1.5625 + 3.0625 + 7.5625 = 51
Alright, so we've knocked out the Sum of Squares portion of the ANOVA Table, let's move on to the Degrees of Freedom
column.
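Here's a quick pure-Python check of the Step 1 results (SSt = 20, SSe = 31, SStotal = 51), using the same approach as the octane example:

```python
# Sum-of-squares decomposition for the bumper-length example
groups = [
    [58, 56, 57, 58],   # Treatment Group #1
    [61, 62, 59, 59],   # Treatment Group #2
    [61, 57, 59, 60],   # Treatment Group #3
    [60, 60, 57, 56],   # Treatment Group #4
]

all_values = [x for g in groups for x in g]
gm = sum(all_values) / len(all_values)   # grand mean = 58.75

sst = sum(len(g) * (sum(g) / len(g) - gm) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
ss_total = sum((x - gm) ** 2 for x in all_values)

print(sst, sse, ss_total)   # 20.0 31.0 51.0
```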

Step 2 - The Degrees of Freedom
Step 2 in the ANOVA process is to calculate the degrees of freedom (DF) for each source of variation (Treatment & error),
which add up to the total degrees of freedom.

So, we will start by calculating the total degrees of freedom (DFtotal) associated with the entire sample data set.

DFtotal = DFerror + DFtreatment

Then we will break this down into the degrees of freedom for the treatment (DFtreatment), and the degrees of freedom for
the error (DFerror).
DFtotal = N - 1 = 16 - 1 = 15

The DF of the Treatment is the next easiest to calculate - It's the number of treatments minus 1. The letter "a" is often used
to denote the number of treatments, but this can vary between textbooks.
DFtreatment = 4 - 1 = 3

The DF of the Error is often calculated by taking the total degrees of freedom, and subtracting the treatment degrees of
freedom.
DFerror = DFtotal - DFtreatment = (N-1) - (a-1)

DFerror = (16 - 1) - (4 - 1) = 12

So, for the Treatment and Error we've calculated the Sum of Squares for each and the Degrees of Freedom for each, let's
see what that looks like in our ANOVA table.

Now it's time to use that information to move to step 3 in the ANOVA process, calculating the Mean Squares.

Step 3 - The Mean Squares


Alright, on to step 3 of the ANOVA analysis, which is the calculation of our Mean Squares, where again we'll calculate a
Mean Square of the Treatment (MST), and a Mean Square of the Error (MSE).

Mean Square of the Error = Sum of Squares of the Error / Degrees of Freedom of the Error = 31 / 12 = 2.583

Mean Square of the Treatment = Sum of Squares of the Treatment / Degrees of Freedom of the Treatment = 20 / 3 = 6.667

Step 4 - The F - Statistic
Alright, now we're on to one of the last steps in ANOVA, which is to calculate the F-statistic so that we can make an
accept/reject decision for our hypothesis test.

Recall that if the null hypothesis is true both MST & MSE will both approximate the population variance, and MST ≈ MSE.

F-statistic = MST / MSE = Mean Square of the Treatment / Mean Square of the Error = 6.667 / 2.583 = 2.581

Here's what our ANOVA table looks like with all of the fields filled in.

Step 5 - Accept/Reject the Null Hypothesis


Step 5 is to compare your F-value against the critical F-statistic; or to compare your P-value against your alpha risk.

Either of these comparisons will allow you to make an Accept/Reject decision for your null hypothesis.

If the null hypothesis is true and all sample means are equal, then MSE & MST will be approximately equal, and our F-value
will equal ~1.

If the null hypothesis is false, and one or more sample groups are not equal to the other sample groups, then MST >> MSE
and our F-value will be much greater than 1.

What about this F-value?

Should we reject the null hypothesis?

For this example, I'll show you how to look up the critical F-value in the F-distribution, which can be found using the
degrees of freedom for MST (3) & MSE (12), and our Alpha (α) risk of 5%.

Fcrit = F.05(3,12) = 3.490

Because our calculated F-statistic (2.581) is less than our critical F-value (3.490), we must fail to reject the null hypothesis.
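The same conclusion falls out of SciPy, if it's available, in a couple of lines:

```python
from scipy import stats

# The bumper-length data from the worked example (4 temperature levels, 4 parts each)
groups = [
    [58, 56, 57, 58],   # Treatment Group #1
    [61, 62, 59, 59],   # Treatment Group #2
    [61, 57, 59, 60],   # Treatment Group #3
    [60, 60, 57, 56],   # Treatment Group #4
]

f_value, p_value = stats.f_oneway(*groups)
print(round(f_value, 3))    # 2.581

# Critical F at alpha = 0.05 with DF = (3, 12)
f_crit = stats.f.ppf(0.95, dfn=3, dfd=12)
print(round(f_crit, 2))     # 3.49

# 2.581 < 3.49, so we fail to reject the null hypothesis
```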

Two Way ANOVA
How do you feel about One-Way ANOVA?

Are you ready to move on to Two-Way ANOVA?

Before we jump into the differences between One Way and Two-Way ANOVA, it's important to note that all of the
assumptions of One-Way ANOVA apply to Two Way ANOVA.

On to the differences!

So, in One Way ANOVA we were analyzing data that was associated with one single Factor or Independent Variable.

In Two Way ANOVA, we're going to analyze data where two factors are varied and studied.

In some textbooks you'll see this as Factor A and Factor B.

Sometimes the second factor is called a "Block", because you can use it within your experiment to "block" out a certain
factor that might be contributing to the error variation.

Similar to One Way ANOVA, Two Way ANOVA allows you to analyze the "main effects" of each factor being varied, by
carving out the variation associated with each factor.

This is a new term we haven't used yet - Main Effects - and it's meant to denote the variation associated with a single
factor. For example, you can have the Main Effects of Factor A, and the Main Effects of Factor B.

When you've got two factors, you can also now study the interaction effect between factors. We will talk more about this
in the chapter on Designed Experiments.

You can see how the ANOVA table grows in complexity when we move to two factors.

I've shown the interactions above; however, it's important to note that if you're interested in studying the interactions, you
must have multiple replicates for each treatment combination. A replicate is a repeated observation taken under the same
treatment conditions.

If you only have 1x replicate (observation) per treatment combination, then there won't be enough degrees of freedom left
over to calculate the interaction effect, and any variability due to the interaction between factors will fall into the error term.

Conclusion
Alright!!! Are you glad that's done??

Let's recap quickly.

Ok, so ANOVA stands for ANalysis Of VAriance and it is a hypothesis test used to compare the means of 2 or more groups.

First, we covered the assumptions associated with ANOVA, which include:

• The Population being studied is Normally Distributed
• The Variance is the same between the various treatments (Homogeneity of Variance)
• There is Independence between sample observations

Second were the common terms & definitions within ANOVA, which included the independent variable, dependent
variable, factors, and treatments (levels).

Third, we answered - Why Does ANOVA use Variance to Test Mean Values?

Then we jumped into the basics of ANOVA including the Sum of Squares, Degrees of Freedom, Mean Squares, F-value.

For the Sum of Squares we reviewed the following equations:

Total Sum of Squares: SStotal = Σ(Xi − GM)²

Sum of Squares of the Error: SSe = Σ(Xi − X̄)²

Sum of Squares of the Treatment: SSt = Σ n(X̄i − GM)²

Total Sum of Squares (SStotal) = Sum of Square of the Treatment (SSt) + Sum of Squares of the Error (SSe)

For the Degrees of Freedom we reviewed the following equations:

DFtotal = DFerror + DFtreatment

DFtotal = a*n - 1

DFtreatment = a - 1

DFerror = (N-1) - (a-1) = a(n-1)
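Both recaps are easy to verify numerically. The sketch below uses three made-up groups of four observations each (not the chapter's data) and checks that the sums of squares and the degrees of freedom decompose exactly as the equations above state.

```python
# Verify SStotal = SSt + SSe and DFtotal = DFtreatment + DFerror
# on a small hypothetical balanced data set: a = 3 groups, n = 4 each.
groups = [
    [5.0, 6.0, 7.0, 6.0],
    [8.0, 9.0, 8.0, 9.0],
    [4.0, 5.0, 5.0, 4.0],
]
a = len(groups)
n = len(groups[0])

all_obs = [x for g in groups for x in g]
grand_mean = sum(all_obs) / len(all_obs)
group_means = [sum(g) / n for g in groups]

ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
ss_treatment = sum(n * (m - grand_mean) ** 2 for m in group_means)
ss_error = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df_total, df_treatment, df_error = a * n - 1, a - 1, a * (n - 1)

assert abs(ss_total - (ss_treatment + ss_error)) < 1e-9
assert df_total == df_treatment + df_error   # 11 == 2 + 9
```

The two assertions mirror the two partitioning identities in the recap: the treatment and error pieces of both the variation and the degrees of freedom add back up to the totals.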


We then combined the Sum of Squares and Degrees of Freedom to calculate our Mean Squares.

Mean Square of the Error (MSE) = Sum of Squares of the Error / Degrees of Freedom of the Error

Mean Square of the Treatment (MST) = Sum of Squares of the Treatment / Degrees of Freedom of the Treatment

We then combined MSE & MST into our final calculation of an F-statistic:

F-statistic = MST / MSE = Mean Square of the Treatment / Mean Square of the Error

To determine if our null hypothesis is true or false, we can compare this F-statistic against a critical F-value from tables.

Looking up this critical value will require you to have an Alpha risk for your hypothesis test and combine that with the
Degrees of Freedom for your MST & MSE.

If your F-statistic is less than your critical F-value, then you must fail to reject the null hypothesis.

If your F-statistic is greater than your critical F-value, then you reject the null hypothesis and accept the alternative
hypothesis that at least one of the sample means is not equal to the rest.

Lastly, we covered the nuances of going from a One-Way ANOVA to a Two-Way ANOVA.

In Two Way ANOVA we introduced new concepts like Main Effects of factors and Interactions between factors. You can
see how the ANOVA table grows in complexity when we move to two factors.

Problem Set For ANOVA:
1. Identify all of the statements below regarding ANOVA that are false:
A. In ANOVA, you're most likely to accept the null hypothesis when MST/MSE is high
B. Whether the null hypothesis is true or not, the MSE (Mean Square of the Error) is a good
approximation of the population variance.
C. In ANOVA Analysis, the response variable that you're measuring is known as the independent
variable.
D. If you were to run an experiment studying muffins where you vary the baking temperature to 400,
425 and 450, this experiment could be described as having 3 levels of one factor.
• A, C
• B, D
• A, B
• C, D

2. Identify all of the statements below regarding ANOVA that are true:
A. If you were to run an experiment that studied the effect on muffin moisture, based on
temperature and baking duration, the ANOVA analysis would be a One Way ANOVA
B. For an ANOVA analysis with 5 treatment groups and 3 measurements per group the degrees of
freedom of the error is 10.
C. Treatment variation is a reflection of the variation within each treatment group.
D. Homogeneity of variance is a baseline assumption required for ANOVA analysis to be accurate.
• A, C
• A, B
• C, D
• B, D

3. Identify the single statement below regarding ANOVA that is True:


• The Treatment Variation is the variability within the data set that can be attributed to the difference between
the different treatment groups.
• The error variation in ANOVA grows if the sample groups have mean values that diverge from each other.
• The dependent variable in ANOVA is called a Factor.
• In ANOVA Analysis, if the null hypothesis is false then the F-statistic will be equal or less than 1.

ANOVA Practice Quiz Page 1 of 16


4. Identify all of the statements below regarding ANOVA that are true:
A. The null hypothesis of ANOVA Analysis is situational and depends on the data being studied
B. Homogeneity of variance is a baseline assumption required for ANOVA analysis to be accurate.
C. The normality assumption associated with a One Way ANOVA is not required for a Two Way
ANOVA.
D. If the null hypothesis is true then the Error Mean Square is an approximate estimator of the
population variance.
• A, C
• B, C
• A, D
• B, D

5. Identify all of the statements below regarding ANOVA that are false:
A. In ANOVA Analysis, if the null hypothesis is false then the treatment variation can be fully
explained by the random nature of the data.
B. The alternative hypothesis of ANOVA analysis is that the means are all different
C. In ANOVA Analysis, the sum of squares is calculated by dividing the mean square by the degrees
of freedom.
D. With a Two Way ANOVA, interactions can be measured but only if there is more than one replicate
per treatment group.
E. In ANOVA analysis, the total sum of squares can be calculated by adding up the sum of squares of
the error and the treatment.
• A, B
• C, D
• A, C
• B, D

6. Identify the single statement below regarding ANOVA that is True:


• If the MSE and MSB are approximately the same, it is highly likely that the null hypothesis will be rejected.
• For an ANOVA analysis with 4 treatment groups and 5 measurements per group, the degrees of freedom of
the treatment is 19.
• The Error Variation within ANOVA is the variability that can be attributed to the random error associated with
the response variable.
• If the null hypothesis is false then the Treatment Mean Square is an approximate estimator of the population
variance.

7. Identify the statement below regarding ANOVA that is True:
• Any difference among the population means in the analysis of variance will inflate the error sum of squares
• The t-statistic in ANOVA is calculated by dividing the error mean square by the treatment mean square
• If the null hypothesis is true then MSE & MST will be approximately equal and the F-statistic will equal ~1
• A Two Way ANOVA occurs when 1 factor is varied at 2 levels

8. Which distribution is used to make the accept/reject decision for ANOVA Analysis:
• Chi-squared distribution
• T-distribution
• Normal distribution
• None of the above

9. Which statistical tool should be used to test the equality of 3 or more population means?
• ANOVA
• Chi-Squared Test
• T-Test
• None of the above

10. You're performing an ANOVA Analysis, and the total sum of squares is 36 and the treatment sum of squares is
16, what would the error sum of squares be?
• 20
• 52
• 30
• 16

11. You're performing an ANOVA Analysis, and the treatment sum of squares is 12 and the error sum of squares is
18, what is the total sum of squares?
• 6
• 36
• 30
• 18

12. You're performing an ANOVA Analysis, and the total sum of squares is 24 and the error sum of squares is 16.
What would the treatment sum of squares be?
• 40
• 23
• 15
• 8

13. In ANOVA Analysis, the degrees of freedom associated with the numerator of the F-value can be calculated
using which equation:
• n(a)-1
• n(a-1)
• (N-1) - (a-1)
• a–1

14. You're performing an ANOVA Analysis of 1 independent variable at 4 levels and you're measuring 5 samples per
level. What is the total degrees of freedom?
• 20
• 16
• 19
• 4

15. You're performing an ANOVA Analysis of 1 independent variable at 4 levels where the total degrees of freedom
is 24. What is the error degrees of freedom?
• 23
• 21
• 20
• 3

16. You're performing an ANOVA Analysis of 1 independent variable at 6 levels where the total degrees of freedom
is 24. What is the treatment degrees of freedom?
• 23
• 5
• 20
• 3

17. You're performing an ANOVA Analysis of 1 independent variable at 3 levels and you're measuring 10 samples
per level. What is the total degrees of freedom?
• 27
• 9
• 30
• 29

18. You're performing an ANOVA Analysis of 1 independent variable at 7 levels where the total degrees of freedom
is 35. What is the error degrees of freedom?
• 34
• 28
• 29
• 35

19. You're performing an ANOVA Analysis of 1 independent variable at 10 levels where the total degrees of freedom
is 20. What is the treatment degrees of freedom?
• 10
• 9
• 30
• 11

20. The one way ANOVA Analysis below has 10 treatment groups with the total degrees of freedom of 19.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)
Error (Within)          55
Total                   100    19

Calculate the Treatment Mean Square for this ANOVA Table.

• 4.5
• 5
• 5.5
• 6.1

21. The one way ANOVA Analysis below has 15 treatment groups with the total degrees of freedom of 44.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     30
Error (Within)
Total                   45     44

Calculate the Error Mean Square for this ANOVA Table.

• 1.76
• 2.14
• 1.07
• 0.50

22. The one way ANOVA Analysis below has 3 treatment groups with the total degrees of freedom of 50.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     20
Error (Within)
Total                   100    50

What is the decision for this ANOVA Analysis if we choose the alpha risk to be 10%?

• Fail to Reject the Null


• Accept the Null
• Reject the Null
• Not Enough Information

23. The one way ANOVA Analysis below has 7 treatment groups with the total degrees of freedom of 38.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)
Error (Within)          66
Total                   84     38

What is the decision for this ANOVA Analysis if we choose the alpha risk to be 10%?

• Fail to Reject the Null


• Accept the Null
• Reject the Null
• Not Enough Information

Solutions for ANOVA:
1. Identify all of the statements below regarding ANOVA that are false:
A. In ANOVA, you're most likely to accept the null hypothesis when MST/MSE is high. (False: you're
most likely to reject the null hypothesis when MST/MSE is high.)
B. Whether the null hypothesis is true or not, the MSE (Mean Square of the Error) is a good
approximation of the population variance. (True)
C. In ANOVA Analysis, the response variable that you're measuring is known as the independent
variable. (False: the response variable is the dependent variable.)
D. If you were to run an experiment studying muffins where you vary the baking temperature to 400,
425 and 450, this experiment could be described as having 3 levels of one factor. (True)
• A, C are both false
• B, D
• A, B
• C, D

2. Identify all of the statements below regarding ANOVA that are true:
A. If you were to run an experiment that studied the effect on muffin moisture, based on
temperature and baking duration, the ANOVA analysis would be a One Way ANOVA. (False: with
two factors being varied, this would be a Two Way ANOVA.)
B. For an ANOVA analysis with 5 treatment groups and 3 measurements per group the degrees of
freedom of the error is 10. (True: a(n-1) = 5(3-1) = 10)
C. Treatment variation is a reflection of the variation within each treatment group. (False: that
describes the error variation.)
D. Homogeneity of variance is a baseline assumption required for ANOVA analysis to be
accurate. (True)
• A, C
• A, B
• C, D
• B, D are both true

3. Identify the single statement below regarding ANOVA that is True:


• The Treatment Variation is the variability within the data set that can be attributed to the difference between
the different treatment groups. (True)
• The error variation in ANOVA grows if the sample groups have mean values that diverge from each other.
(False: that describes the treatment variation.)
• The dependent variable in ANOVA is called a Factor. (False: the independent variable is the Factor; the
dependent variable is the Response.)
• In ANOVA Analysis, if the null hypothesis is false then the F-statistic will be equal or less than 1. (False: it will
be much larger than 1.)

4. Identify all of the statements below regarding ANOVA that are true:
A. The null hypothesis of ANOVA Analysis is situational and depends on the data being studied (False,
the null hypothesis in ANOVA is always the same).
B. Homogeneity of variance is a baseline assumption required for ANOVA analysis to be accurate.
(True)
C. The normality assumption associated with a One Way ANOVA is not required for a Two Way
ANOVA (False, the assumption of normality also applies to Two Way ANOVA)
D. If the null hypothesis is true then the Error Mean Square is an approximate estimator of the
population variance. (True)
• A, C
• B, C
• A, D
• B, D are both true

5. Identify all of the statements below regarding ANOVA that are false:
A. In ANOVA Analysis, if the null hypothesis is false then the treatment variation can be fully
explained by the random nature of the data. (False: if the null hypothesis is false, the treatment
variation exceeds what random error alone can explain.)
B. The alternative hypothesis of ANOVA analysis is that the means are all different. (False: the
alternative hypothesis is that at least 1 mean is different.)
C. In ANOVA Analysis, the sum of squares is calculated by dividing the mean square by the degrees
of freedom. (False: the mean square is calculated by dividing the sum of squares by the degrees
of freedom.)
D. With a Two Way ANOVA, interactions can be measured but only if there is more than one replicate
per treatment group. (True)
E. In ANOVA analysis, the total sum of squares can be calculated by adding up the sum of squares of
the error and the treatment. (True)
• A, B
• C, D
• A, C
• B, D

6. Identify the single statement below regarding ANOVA that is True:


• If the MSE and MSB are approximately the same, it is highly likely that the null hypothesis will be rejected.
(False: when MSE and MSB are approximately equal, the F-statistic is near 1 and rejection is unlikely.)
• For an ANOVA analysis with 4 treatment groups and 5 measurements per group, the degrees of freedom of
the treatment is 19. (False: the DF of the treatment is the number of groups (4) minus 1, so 3.)
• The Error Variation within ANOVA is the variability that can be attributed to the random error associated with
the response variable. (True)
• If the null hypothesis is false then the Treatment Mean Square is an approximate estimator of the population
variance. (False: the Error Mean Square estimates the population variance; the Treatment Mean Square does
so only when the null hypothesis is true.)

7. Identify the single statement below regarding ANOVA that is True:
• Any difference among the population means in the analysis of variance will inflate the error sum of squares.
(False: differences among the population means inflate the treatment sum of squares.)
• The t-statistic in ANOVA is calculated by dividing the error mean square by the treatment mean square.
(False: the F-statistic is calculated by dividing the treatment mean square by the error mean square.)
• If the null hypothesis is true then MSE & MST will be approximately equal and the F-statistic will equal ~1. (True)
• A Two Way ANOVA occurs when 1 factor is varied at 2 levels. (False: a Two Way ANOVA varies 2 factors, each
at 2 or more levels.)

8. Which distribution is used to make the accept/reject decision for ANOVA Analysis:
• Chi-squared distribution
• T-distribution
• Normal distribution
• None of the above - The F-Distribution is used within ANOVA.

9. Which statistical tool should be used to test the equality of 3 or more population means?
• ANOVA
• Chi-Squared Test
• T-Test
• None of the above

10. You're performing an ANOVA Analysis, and the total sum of squares is 36 and the treatment sum of squares is
16, what would the error sum of squares be?
• 20
• 52
• 30
• 16

SSerror = SStotal - SStreatment = 36 - 16 = 20

11. You're performing an ANOVA Analysis, and the treatment sum of squares is 12 and the error sum of squares is
18, what is the total sum of squares?
• 6
• 36
• 30
• 18
SStotal = SSerror + SStreatment = 12 + 18 = 30

12. You're performing an ANOVA Analysis, and the total sum of squares is 24 and the error sum of squares is 16.
What would the treatment sum of squares be?
• 40
• 23
• 15
• 8
SStreatment = SStotal - SSerror = 24 - 16 = 8

13. In ANOVA Analysis, the degrees of freedom associated with the numerator of the F-value can be calculated
using which equation:
• n(a)-1
• n(a-1)
• (N-1) - (a-1)
• a-1

Recall that in ANOVA, F = MST / MSE, with MST being the numerator of the equation. Also the degrees of freedom
associated with the Treatment Mean Square (MST) is equal to the number of treatment groups (a) - 1.

14. You're performing an ANOVA Analysis of 1 independent variable at 4 levels and you're measuring 5 samples per
level. What is the total degrees of freedom?
• 20
• 16
• 19
• 4

Recall that in ANOVA, the total degrees of freedom = N - 1, where N = a*n, with a being the number of levels
(treatment groups) and n being the number of replicates or samples per level (treatment group).

DFtotal = 4*5 - 1 = 19

15. You're performing an ANOVA Analysis of 1 independent variable at 4 levels where the total degrees of freedom
is 24, what is the error degrees of freedom?
• 23
• 21
• 20
• 3

First, we must solve for the Treatment D.F., which is equal to 4-1, or 3. Then we can subtract 3 from 24 to get the
error degrees of freedom of 21.

16. You're performing an ANOVA Analysis of 1 independent variable at 6 levels where the total degrees of freedom
is 24. What is the treatment degrees of freedom?
• 23
• 5 (a - 1, 6 - 1 = 5)
• 20
• 3

The Treatment d.f. is the number of levels within the experiment minus 1 (a-1), 6-1 = 5

17. You're performing an ANOVA Analysis of 1 independent variable at 3 levels and you're measuring 10 samples
per level. What is the total degrees of freedom?
• 27
• 9
• 30
• 29

Recall that in ANOVA, the total degrees of freedom = N - 1, where N = a*n, with a being the number of levels
(treatment groups) and n being the number of replicates or samples per level (treatment group).

DFtotal = 3*10 - 1 = 29

18. You're performing an ANOVA Analysis of 1 independent variable at 7 levels where the total degrees of freedom
is 35. What is the error degrees of freedom?
• 34
• 28
• 29
• 35

First, we must solve for the Treatment D.F., which is equal to 7-1, or 6. Then we can subtract 6 from 35 to get the
error degrees of freedom of 29.

19. You're performing an ANOVA Analysis of 1 independent variable at 10 levels where the total degrees of freedom
is 20. What is the treatment degrees of freedom?
• 10
• 9
• 30
• 11

The Treatment D.F. is the number of levels within the experiment minus 1 (a-1), 10-1=9

20. The one-way ANOVA Analysis below has 10 treatment groups with the total degrees of freedom of 19.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)            9
Error (Within)          55
Total                   100    19

Calculate the Treatment Mean Square for this ANOVA Table.

• 4.5
• 5
• 5.5
• 6.1

First, we can solve for the treatment sum of squares by simply subtracting 55 from 100, to get a treatment sum
of squares of 45.

Then we must solve for the degrees of freedom.

The treatment degrees of freedom is equal to the number of treatment levels (10) - 1; so 9 degrees of freedom.

Then we can solve for the error degrees of freedom by subtracting 19 - 9; so, 10 degrees of freedom.

Then we can calculate the mean square for the treatment as the treatment sum of squares (45) divided by the
treatment degrees of freedom (9): 45/9 = 5.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     45     9      5      0.91
Error (Within)          55     10     5.5
Total                   100    19
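The fill-in-the-blanks arithmetic used in these solutions can be scripted. Below is a hypothetical helper (not part of the course materials) that completes a one-way ANOVA table, shown reproducing the Question 20 numbers; the unrounded F is 5/5.5 ≈ 0.91.

```python
def complete_anova_table(ss_total, df_total, n_groups,
                         ss_error=None, ss_treatment=None):
    """Fill in a one-way ANOVA table from the totals, the number of
    treatment groups, and whichever sum of squares is known."""
    df_treatment = n_groups - 1
    df_error = df_total - df_treatment
    if ss_treatment is None:
        ss_treatment = ss_total - ss_error     # SSt = SStotal - SSe
    else:
        ss_error = ss_total - ss_treatment     # SSe = SStotal - SSt
    mst = ss_treatment / df_treatment
    mse = ss_error / df_error
    return {"SSt": ss_treatment, "DFt": df_treatment, "MST": mst,
            "SSe": ss_error, "DFe": df_error, "MSE": mse, "F": mst / mse}

# Question 20: SStotal = 100, DFtotal = 19, 10 groups, SSe = 55.
t = complete_anova_table(100, 19, 10, ss_error=55)
print(t["MST"], t["MSE"], round(t["F"], 2))   # 5.0 5.5 0.91
```

The same helper solves Questions 21 through 23 by passing in whichever sum of squares the table provides.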

21. The one-way ANOVA Analysis below has 15 treatment groups with the total degrees of freedom of 44.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     30
Error (Within)
Total                   45     44

Calculate the Error Mean Square for this ANOVA Table.

• 1.76
• 2.14
• 1.07
• 0.50

First, we can solve for the error sum of squares by simply subtracting 30 from 45, to get an error sum of squares
of 15.

Then we must solve for the degrees of freedom. For the treatment, the degrees of freedom is equal to the number
of treatment levels (15) - 1; so 14 degrees of freedom. Then we can solve for the error degrees of freedom by
subtracting 44 - 14; so 30 degrees of freedom.

Then we can calculate the error mean square as the error sum of squares (15) divided by the error degrees of
freedom (30): 15/30 = 0.50

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     30     14     2.14
Error (Within)          15     30     0.50
Total                   45     44

22. The one-way ANOVA Analysis below has 3 treatment groups with the total degrees of freedom of 50.
Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     20
Error (Within)
Total                   100    50

What is the decision for this ANOVA Analysis if we choose the alpha risk to be 10%?
• Fail to Reject the Null
• Accept the Null
• Reject the Null
• Not Enough Information

First, we can solve for the error sum of squares by simply subtracting 20 from 100, to get an error sum of squares
of 80.

Then we must solve for the degrees of freedom. For the treatment, the degrees of freedom is equal to the number
of treatment levels (3) - 1; so 2 degrees of freedom. Then we can solve for the error degrees of freedom by
subtracting 50 - 2; so 48 degrees of freedom.

Then we can calculate the mean squares of the treatment and the error by dividing each sum of squares by its
degrees of freedom, which are 10 and 1.67 respectively.

The F-value can then be calculated by taking the ratio of MST to MSE which is 10 / 1.67 = 5.98.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     20     2      10     5.98
Error (Within)          80     48     1.67
Total                   100    50

Now, to determine if this F-statistic is within or outside of the rejection region, we need to look up the critical
F-value from the NIST Table.

Recall that ANOVA is always a one-sided test, so we're looking for the upper critical value of the F Distribution at
the intersection of 2 degrees of freedom (v1) and 48 degrees of freedom (v2).

Fcritical = F.10(2,48) = 2.417

Since our F-statistic (5.98) is greater than Fcritical (2.417), we must reject the null hypothesis.
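Here is a sketch of the same decision in code, assuming SciPy for the critical value. (The table's F of 5.98 comes from rounding MSE to 1.67; the unrounded ratio is 6.0, and the conclusion is unchanged.)

```python
from scipy.stats import f

# Question 22: SSt = 20 over 2 DF, SSe = 80 over 48 DF.
mst = 20 / 2            # 10.0
mse = 80 / 48           # ~1.667 (the table shows it rounded to 1.67)
f_stat = mst / mse      # 6.0 unrounded

f_crit = f.ppf(1 - 0.10, 2, 48)   # upper 10% point, ~2.417
print(f_stat > f_crit)            # True -> reject the null hypothesis
```

The `f.ppf` call replaces the manual NIST table lookup; it returns the same upper critical value of 2.417 for (2, 48) degrees of freedom at alpha = 0.10.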

23. The one-way ANOVA Analysis below has 7 treatment groups with the total degrees of freedom of 38.
Source of Variation     SS     DF     MS     F-Value
Treatment (Between)
Error (Within)          66
Total                   84     38

What is the decision for this ANOVA Analysis if we choose the alpha risk to be 10%?
• Fail to Reject the Null
• Accept the Null
• Reject the Null
• Not Enough Information

First, we can solve for the treatment sum of squares by simply subtracting 66 from 84, to get a treatment sum of
squares of 18.

Then we must solve for the degrees of freedom. For the treatment, the degrees of freedom is equal to the number
of treatment levels (7) - 1; so 6 degrees of freedom. Then we can solve for the error degrees of freedom by
subtracting 38 - 6; so 32 degrees of freedom.

Then we can calculate the mean squares of the treatment and the error by dividing each sum of squares by its
degrees of freedom, which are 3.00 and 2.06 respectively.

The F-value can then be calculated by taking the ratio of MST to MSE which is 3.00 / 2.06 = 1.45.

Source of Variation     SS     DF     MS     F-Value
Treatment (Between)     18     6      3.00   1.45
Error (Within)          66     32     2.06
Total                   84     38

Now, to determine if this F-statistic is within or outside of the rejection region, we need to look up the critical
F-value from the NIST Table.

Recall that ANOVA is considered a one-sided test, so we're looking for the upper critical value of the F Distribution
at the intersection of 6 degrees of freedom (v1) and 32 degrees of freedom (v2).

Fcritical = F.10(6,32) = 1.967

Since our F-statistic (1.45) is less than Fcritical (1.967), we must fail to reject the null hypothesis.
