Block 13 ST3188
Learning Objectives
Discuss the scope of the ANOVA technique and its relationship to the t test and regression
Describe one-way ANOVA, including decomposition of the total variation, measurement of
effects, significance testing and interpretation of results
Describe n-way ANOVA and the testing of the significance of the overall effect, the
interaction effect and the main effect of each factor
Describe ANCOVA and show how it accounts for the influence of uncontrolled independent
variables
Explain key factors pertaining to the interpretation of results with emphasis on interactions
and the relative importance of factors.
Reading List
Malhotra, N.K., D. Nunan and D.F. Birks. Marketing Research: An Applied Approach. (Pearson,
2017) 5th edition [ISBN 9781292103129] Chapter 21 (excluding from ‘Multiple comparisons’ on
page 622).
Overview
Analysis of variance (ANOVA) is used as a test of means for two or more populations. The null
hypothesis, typically, is that all means are equal. ANOVA must have a dependent variable which is
metric (measured using an interval or ratio scale). There must also be one or more independent
variables which are all categorical (non-metric).
Categorical independent variables are also called factors. A particular combination of factor levels, or
categories, is called a treatment.
One-way ANOVA involves only one categorical variable, or a single factor. In one-way ANOVA, a
treatment is the same as a factor level. If two or more factors are involved, the analysis is termed
n-way analysis of variance.
Figure 21.1 of the textbook shows the relationship between the t test, ANOVA, ANCOVA and
regression.
Activity 13.1
Discuss the similarities and differences between analysis of variance and analysis of covariance.
Activity 13.2
What is the relationship between analysis of variance and the t test?
Eta squared, η² - the strength of the effect of X (the independent variable or factor) on Y
(the dependent variable) is measured by η². The value of η² varies between 0 and 1.
F statistic - the null hypothesis that the category means are equal in the population is tested
by an F statistic based on the ratio of mean square related to X and mean square related to
error.
Mean square - this is the sum of squares divided by the appropriate degrees of freedom.
SSBetween=SSX - this is the variation in Y related to the variation in the means of the
categories of X. This represents variation between the categories of X or the portion of the
sum of squares in Y related to X.
SSWithin=SSError - this is the variation in Y due to the variation within each of the categories
of X. This variation is not accounted for by X.
SSY - this is the total variation in Y.
Conducting one-way ANOVA
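The total variation in Y, denoted by SSY, can be decomposed into two components:
$$SS_Y = SS_{Between} + SS_{Within}$$
where:
$$SS_Y = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2$$
Yij = the ith observation in the jth category
Ȳ = the mean over the whole sample, i.e. the overall mean
c = the number of categories of X
nj = the sample size for category j.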
Table 21.1 of the textbook shows the decomposition of the total variation in one-way ANOVA.
In ANOVA, we estimate two measures of variation: within groups (SSWithin) and between groups
(SSBetween). Therefore, by comparing the Y variance estimates based on between-groups and within-
groups variation, we can test the null hypothesis.
The strength of the effect of X on Y is measured as follows:
$$\eta^2 = \frac{SS_X}{SS_Y} = \frac{SS_Y - SS_{Error}}{SS_Y}.$$
The value of η² varies between 0 and 1, with larger values indicating stronger effects.
In one-way ANOVA, the interest lies in testing the null hypothesis that the category means are equal
in the population, given by:
𝐻0 : μ1 = μ2 = μ3= ⋯ = μc.
Under the null hypothesis, SSX and SSError come from the same source of variation. In
other words, the estimate of the population variance of Y is:
$$S_Y^2 = \frac{SS_X}{c-1} = \text{mean square due to } X = MS_X$$
or:
$$S_Y^2 = \frac{SS_{Error}}{n-c} = \text{mean square due to error} = MS_{Error}$$
where n is the overall sample size.
The null hypothesis may be tested by the F statistic based on the ratio between these two estimates:
$$F = \frac{SS_X/(c-1)}{SS_{Error}/(n-c)} = \frac{MS_X}{MS_{Error}} \sim F_{c-1,\,n-c}.$$
This statistic follows an F distribution, with c−1 and n−c degrees of freedom (df), in the numerator
and denominator, respectively. If the null hypothesis of equal category means is not rejected, then the
independent variable does not have a significant effect on the dependent variable. On the other hand,
if the null hypothesis is rejected, then the effect of the independent variable is statistically significant.
A comparison of the category mean values will indicate the nature of the effect of the independent
variable.
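The procedure above can be sketched in Python. This is a minimal illustration on hypothetical data (the three groups below are invented for the example and are not taken from the textbook), and the function name `one_way_anova` is an assumption of this sketch rather than part of any package:

```python
from statistics import mean

def one_way_anova(groups):
    """Decompose total variation and compute eta squared and F.

    `groups` is a list of lists, one list of Y values per category of X.
    """
    all_values = [y for g in groups for y in g]
    n = len(all_values)            # overall sample size
    c = len(groups)                # number of categories
    grand_mean = mean(all_values)  # overall mean of Y

    # SS_X (between): category means around the overall mean, weighted by n_j
    ss_x = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS_Error (within): observations around their own category mean
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # SS_Y (total): observations around the overall mean
    ss_y = sum((y - grand_mean) ** 2 for y in all_values)

    ms_x = ss_x / (c - 1)
    ms_error = ss_error / (n - c)
    return {
        "ss_x": ss_x, "ss_error": ss_error, "ss_y": ss_y,
        "eta_sq": ss_x / ss_y, "f": ms_x / ms_error,
        "df": (c - 1, n - c),
    }

# Hypothetical data: three categories with four observations each
groups = [[4, 5, 6, 5], [7, 8, 6, 7], [9, 10, 8, 9]]
result = one_way_anova(groups)
print(result)
```

For these invented data the decomposition gives SSX = 32, SSError = 6 and SSY = 38, so the F statistic is 16/0.667 = 24 on (2, 9) degrees of freedom. The F value would then be compared with a critical value from statistical tables, as described above.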
Activity 13.3
What is total variation? How is it decomposed in a one-way analysis of variance?
Activity 13.4
What is the null hypothesis in one-way ANOVA? Which basic statistic is used to test the null
hypothesis in one-way ANOVA? How is this statistic computed?
H0 : μ1 = μ2 = μ3.
Table 21.2 of the textbook provides the data on direct mail offer, dealership promotion, sales of new
cars and clientele rating (the data can be downloaded in the file Mercedes.sav or can be seen in the
table below).
Table 21.3 of the textbook shows the effect of dealership promotion on sales of new cars.
To test the null hypothesis, the various sums of squares are computed as follows:
$$\eta^2 = \frac{SS_X}{SS_Y} = \frac{106.067}{185.867} = 0.571.$$
In other words, 57.1% of the variation in sales, Y, is accounted for by dealership promotion, X,
indicating a modest effect. The null hypothesis may now be tested:
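As a quick check of the effect-size arithmetic above, the following Python sketch recomputes η² from the two sums of squares quoted in the text and derives the error sum of squares as their difference. The variable names are illustrative only:

```python
ss_x = 106.067   # SS_X quoted in the text
ss_y = 185.867   # SS_Y quoted in the text

eta_sq = ss_x / ss_y
ss_error = ss_y - ss_x   # variation in Y not accounted for by X

print(f"eta squared = {eta_sq:.3f}")
print(f"SS_Error    = {ss_error:.3f}")
```

The same η² is obtained from the equivalent form (SSY − SSError)/SSY.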
n-way ANOVA
In market research, one is often concerned with the effects of more than one factor simultaneously.
The following are examples.
How do advertising levels (high, medium and low) interact with price levels (high, medium
and low) to influence a brand’s sales?
Do educational levels (less than high school, high school graduate, some college and college
graduate) and age group (under 35, 35-55, over 55) affect consumption of a brand
What is the effect of consumers’ familiarity with a car dealership (high, medium and low) and
dealership image (positive, neutral and negative) on preference for the dealer?
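Consider the simple case of two factors X1 and X2 having c1 and c2 categories,
respectively. The total variation in this case is partitioned as follows:
$$SS_Y = SS_{X_1} + SS_{X_2} + SS_{X_1 X_2} + SS_{Error}$$
where SSX1X2 is the variation due to the interaction of X1 and X2.
The significance of the overall effect may be tested by an F test, as follows:
$$F = \frac{(SS_{X_1} + SS_{X_2} + SS_{X_1 X_2})/df_n}{SS_{Error}/df_d}$$
where:
dfn = degrees of freedom for the numerator = (c1−1)+(c2−1)+(c1−1)(c2−1) = c1c2−1
dfd = degrees of freedom for the denominator = n−c1c2.
The significance of the interaction effect may be tested as follows:
$$F = \frac{SS_{X_1 X_2}/df_n}{SS_{Error}/df_d} = \frac{MS_{X_1 X_2}}{MS_{Error}}$$
where:
dfn = (c1−1)(c2−1)
dfd = n−c1c2
MS = mean square.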
The significance of the main effect of each factor may be tested as follows for X1:
$$F = \frac{SS_{X_1}/df_n}{SS_{Error}/df_d} = \frac{MS_{X_1}}{MS_{Error}}$$
where:
dfn = c1 − 1
dfd = n − c1c2
Similarly, we may test for X2 using:
$$F = \frac{SS_{X_2}/df_n}{SS_{Error}/df_d} = \frac{MS_{X_2}}{MS_{Error}}$$
where:
dfn = c2 − 1
dfd = n − c1c2.
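The degrees of freedom used in these F tests can be tabulated with a short Python sketch. The function names `two_way_df` and `f_ratio`, the 3×2 design and the sums of squares below are all hypothetical and chosen only to illustrate the bookkeeping:

```python
def two_way_df(c1, c2, n):
    """Degrees of freedom for a two-factor ANOVA with c1 x c2 cells and n observations."""
    df = {
        "x1": c1 - 1,                       # main effect of X1
        "x2": c2 - 1,                       # main effect of X2
        "interaction": (c1 - 1) * (c2 - 1),  # X1 by X2 interaction
        "error": n - c1 * c2,               # denominator df for every F test
    }
    # The overall effect pools both main effects and the interaction
    df["overall"] = df["x1"] + df["x2"] + df["interaction"]
    return df

def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_n) / (SS_Error / df_d)."""
    return (ss_effect / df_effect) / (ss_error / df_error)

# Hypothetical design: 3 x 2 factor levels, 30 observations
df = two_way_df(c1=3, c2=2, n=30)
print(df)

# Hypothetical sums of squares for the main effect of X1
f_x1 = f_ratio(ss_effect=20.0, df_effect=df["x1"], ss_error=48.0, df_error=df["error"])
print(f_x1)
```

Note that the overall numerator degrees of freedom always equal c1c2 − 1, which the sketch reproduces (5 for a 3×2 design).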
Returning to the Mercedes example, Table 21.5 of the textbook shows the statistical results of a two-
way ANOVA.
Activity 13.5
How does n-way analysis of variance differ from the one-way procedure?
Activity 13.6
How is the total variation decomposed in n-way analysis of variance?
Analysis of covariance
When examining the differences in the mean values of the dependent variable related to the effect of
the controlled independent variables, it is often necessary to take into account the influence of
uncontrolled independent variables. For example:
In determining how different groups exposed to different commercials evaluate a brand, it may be
necessary to control for prior knowledge
In determining how different price levels will affect a household’s cereal consumption, it may be
essential to take household size into account.
Suppose that we wanted to determine the effect of dealership promotion and direct mail offers on
sales while controlling for the effect of clientele ratings.
Returning to the Mercedes example, Table 21.6 of the textbook shows the statistical results of an
ANCOVA.
Activity 13.7
What is the most common use of the covariate in ANCOVA?
Issues in interpretation
Important issues involved in the interpretation of ANOVA results include interactions, the relative
importance of factors and multiple comparisons.
Interactions:
Figure 21.3 of the textbook provides a classification of possible interaction effects which
could arise when conducting ANOVA on two or more factors. Figure 21.4 of the
textbook shows examples of different patterns of interactions.
Relative importance of factors:
Experimental designs are usually balanced, in that each cell contains the same number of
participants. This results in an orthogonal design in which the factors are uncorrelated. Hence
it is possible to determine unambiguously the relative importance of each factor in explaining
the variation in the dependent variable.
The most commonly used measure in ANOVA is omega squared, ω². This measure indicates what
proportion of the variation in the dependent variable is related to a particular independent variable or
factor.
The relative contribution of a factor X is calculated as follows:
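For reference, omega squared for a factor X is usually defined in terms of quantities from the ANOVA table. The form below is the standard textbook definition, stated here as an assumption about the intended formula, with df_X denoting the degrees of freedom of factor X and SS_Y the total sum of squares:

```latex
\omega_X^2 = \frac{SS_X - (df_X \times MS_{Error})}{SS_Y + MS_{Error}}
```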
Activity 13.8
What is the difference between ordinal and disordinal interaction?
Activity 13.9
How is the relative importance of factors measured in a balanced design?
2.
c. If the average importance was computed for each group, would you expect the
sample means to be similar or different?
3. An experiment tested the effects of package design and shelf display on the likelihood of
buying a breakfast cereal. Package design and shelf display were varied at two levels each,
resulting in a 2×2 design. Purchase likelihood was measured on a seven-point Likert scale.
The results are partially described in the following table.
Source of variation   Sum of squares   df   Mean square   F   p-value   ω²
4.
a. Complete the table by calculating the mean square, F, p-value and ω² values.
5. In a pilot study examining the effectiveness of three commercials (A, B and C), 10 consumers
were assigned to view each commercial and rate it on a 9-point Likert scale. The data
obtained from the 30 participants are given in the data file Commercial.sav. (An Excel
version of the dataset is Commercial.xlsx.)
Use Analyze > Compare Means > One-Way ANOVA…, then select ‘Effectiveness rating’ as the
dependent variable, and ‘Commercial’ as the factor.
b. Produce some descriptive statistics, carry out a test for the homogeneity of
variances, and produce a plot of means for the data. Interpret your results.
Use Analyze > Compare Means > One-Way ANOVA…, then select ‘Effectiveness rating’ as the
dependent variable, and ‘Commercial’ as the factor. Select the ‘Options…’ box and check
‘Descriptive’, ‘Homogeneity of variance test’ and ‘Means plot’. Click ‘Continue’, then ‘OK’.
a. Do males and females differ on average in their preference for long-haul travel?
Hint: Perform an independent samples t test.
b. Do the light, medium and heavy travellers differ on average in their preference for
long-haul travel?
Use Analyze > Compare Means > One-Way ANOVA… and, under ‘Options…’, select ‘Descriptive’.
c. Conduct a 2×3 analysis of variance with preference for long-haul travel as the
dependent variable, and gender and frequency of travel as the independent
variables. Interpret the results.
Use Analyze > General Linear Model > Univariate…, then choose ‘Preference’ as the ‘Dependent
Variable’, and ‘Gender’ and ‘Travel group’ as the ‘Fixed Factor(s)’. Under ‘Options…’, select
‘Descriptive statistics’.
Block 13: Analysis of variance and covariance
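The total variation is decomposed as SSY = SSBetween + SSWithin, where:
$$SS_{Between} = \sum_{j=1}^{c} n_j (\bar{Y}_j - \bar{Y})^2$$
and:
$$SS_{Within} = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2$$
where:
Yij = the ith observation in the jth category
Ȳ = the mean over the whole sample, i.e. the overall mean
Ȳj = the mean for category j
nj = the sample size for category j.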
SSBetween represents the variation among the means of Y in the categories of X and involves the
squares of the deviations of various category means from the overall mean.
SSWithin represents the variation in Y due to the variation within each category of X and involves
the squares of the deviations of each measurement of Y from the corresponding category mean.
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 = ⋯ = 𝜇𝑐
It is called a joint hypothesis because several independent hypotheses are being assumed, namely:
𝐻0 : 𝜇1 = 𝜇2 , 𝐻0 : 𝜇1 = 𝜇3 , 𝐻0 : 𝜇2 = 𝜇3 etc.
The basic statistic used to test the null hypothesis in one-way ANOVA is the F statistic. Under the
null hypothesis, the estimate of the population variance of Y can be based on either between-category
variation or within-category variation, i.e. we have:
$$S_Y^2 = \frac{SS_X}{c-1} = \text{mean square due to } X = MS_X$$
or:
$$S_Y^2 = \frac{SS_{Error}}{n-c} = \text{mean square due to error} = MS_{Error}$$
where c is the number of categories and n is the total number of observations. Therefore, the null
hypothesis may be tested by the F statistic based on the ratio between these estimates:
$$F = \frac{SS_X/(c-1)}{SS_{Error}/(n-c)} = \frac{MS_X}{MS_{Error}} \sim F_{c-1,\,n-c}.$$
Solution to 13.9
Usually, if an experimental design is balanced (i.e. each cell contains the same number of
participants), then it results in an orthogonal design in which the factors are uncorrelated. Therefore, it
becomes easier to determine accurately the relative importance of each factor in explaining the
variation in the dependent variable. In ANOVA, the relative contribution of a factor X is calculated
as:
2.
a. The complete table is:
b. As the interaction effect is significant, the effect of one factor should be interpreted
for each level of the other factor.
3.
a. We test the null hypothesis that all commercials are equally effective, against the
alternative that they are not all equally effective. Specifically:
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3 vs. 𝐻1 : Not all 𝜇𝑖s are equal.
Here we perform one-way ANOVA as there is one categorical independent variable (the commercial)
to explain the continuous dependent variable (the effectiveness rating). Under H0, the test statistic
is:
$$F = \frac{SS_X/(c-1)}{SS_{Error}/(n-c)} \sim F_{c-1,\,n-c}$$
The test statistic value, noting that c=3 and n=30, is:
$$f = \frac{46.667/(3-1)}{26.000/(30-3)} = \frac{23.333}{0.963} = 24.231.$$
For a 1% significance level, say, the critical value using statistical tables is 𝐹0.01,2,27=5.49. Since 5.49
< 24.231 we reject the null hypothesis and conclude that there is strong evidence that the mean
effectiveness ratings are not all equal across the commercials (i.e. that the commercials are not equally
effective). Indeed, we could have instead consulted the p-value of the F statistic, which is 0.000, and
hence we would have arrived at the same conclusion.
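The arithmetic in this solution can be reproduced with a few lines of Python. The sketch below uses only the sums of squares and the tabulated critical value quoted above; the variable names are illustrative:

```python
ss_x, ss_error = 46.667, 26.000   # sums of squares from the solution
c, n = 3, 30                      # number of commercials and total observations

ms_x = ss_x / (c - 1)             # mean square due to the commercial
ms_error = ss_error / (n - c)     # mean square due to error
f_value = ms_x / ms_error

critical_value = 5.49             # F(0.01; 2, 27), quoted from tables in the text
reject_h0 = f_value > critical_value
ss_total = ss_x + ss_error        # should match the 'Total' row of the ANOVA table

print(f"MS_X = {ms_x:.3f}, MS_Error = {ms_error:.3f}, F = {f_value:.3f}")
print(f"Reject H0 at the 1% level: {reject_h0}; SS_Y = {ss_total:.3f}")
```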
SPSS output is:
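ANOVA
Effectiveness rating
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   46.667           2    23.333        24.231   .000
Within Groups    26.000           27   .963
Total            72.667           29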
b. Looking at the descriptive statistics output, we see the sample means for commercials
A, B and C are 𝑥̅𝐴 = 4, 𝑥̅𝐵 = 5 and 𝑥̅𝐶 = 7, respectively. These are the point estimates
of the true group means, i.e. 𝜇𝐴 , 𝜇𝐵 , and 𝜇𝐶 , respectively.
The overall sample mean is 5.33 and this is the point estimate of the common group
mean under the null hypothesis in (a). The one-way ANOVA F test was highly
significant (see above) which means the test detected some significant difference(s)
between the group sample means. However, does this mean all the group means are
different, or is one different from the other two (which are the same)?
To answer this question we need to consider the 95% confidence intervals for each group mean and
see whether these overlap.
Descriptives
Effectiveness rating
                                                          95% Confidence Interval for Mean
               N    Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
Commercial A   10   4.00   .816             0.258        3.42          4.58          3         5
Commercial B   10   5.00   1.054            0.333        4.25          5.75          4         7
Commercial C   10   7.00   1.054            0.333        6.25          7.75          5         8
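Noting that $t_{0.025,\,9} = 2.262$, using statistical tables, from the output we see that 95% confidence
intervals for the three groups are, respectively:
$$\bar{x}_A \pm t_{0.025,\,n_A-1} \times \frac{S_A}{\sqrt{n_A}} \Rightarrow 4 \pm 2.262 \times \frac{0.816}{\sqrt{10}} = 4 \pm 2.262 \times 0.258 \Rightarrow (3.42,\ 4.58)$$
$$\bar{x}_B \pm t_{0.025,\,n_B-1} \times \frac{S_B}{\sqrt{n_B}} \Rightarrow 5 \pm 2.262 \times \frac{1.054}{\sqrt{10}} = 5 \pm 2.262 \times 0.333 \Rightarrow (4.25,\ 5.75)$$
$$\bar{x}_C \pm t_{0.025,\,n_C-1} \times \frac{S_C}{\sqrt{n_C}} \Rightarrow 7 \pm 2.262 \times \frac{1.054}{\sqrt{10}} = 7 \pm 2.262 \times 0.333 \Rightarrow (6.25,\ 7.75)$$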
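These intervals can be recomputed from the Descriptives output with a short Python sketch. The t value is the tabulated figure quoted above, and the helper name `confidence_interval` is an assumption of this example:

```python
from math import sqrt

T_CRIT = 2.262  # t(0.025; 9), taken from the text

# (sample mean, standard deviation, group size) from the Descriptives output
groups = {"A": (4.00, 0.816, 10), "B": (5.00, 1.054, 10), "C": (7.00, 1.054, 10)}

def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t * sd / sqrt(n), rounded to 2 d.p."""
    half_width = t_crit * sd / sqrt(n)
    return (round(mean - half_width, 2), round(mean + half_width, 2))

intervals = {name: confidence_interval(m, s, n, T_CRIT)
             for name, (m, s, n) in groups.items()}

# Two intervals overlap when the upper end of the lower one exceeds
# the lower end of the higher one
a_b_overlap = intervals["A"][1] > intervals["B"][0]
b_c_overlap = intervals["B"][1] > intervals["C"][0]

print(intervals)
print("A and B overlap:", a_b_overlap, "| B and C overlap:", b_c_overlap)
```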
The first two confidence intervals overlap (the upper endpoint of the first (4.58) exceeds the lower
endpoint of the second (4.25)), indicating no significant difference between the mean effectiveness
ratings for commercials A and B. This is equivalent to saying we would not reject the null
hypothesis of 𝐻0 : 𝜇𝐴 = 𝜇𝐵 at the 5% significance level. However, the 95% confidence interval for
commercial C is above the others, indicating that commercial C is more effective, on average, than
commercials A and B.
The means plot illustrates this difference:
So we could rank the commercials from the least effective to the most effective as A, B and C,
although there is no significant difference between the effectiveness of commercials A and B.
The one-way ANOVA test assumes the same variance for each group, i.e. the variance of
effectiveness ratings should be a constant value, 𝜎 2 , for each group, so it would be advisable to test
this assumption.
Test of Homogeneity of Variances
Effectiveness rating
Levene Statistic   df1   df2   Sig.
.375               2     27    .691
The output above is for the test of homogeneous (i.e. equal) variances, i.e. we test $H_0: \sigma_A^2 = \sigma_B^2 = \sigma_C^2$
against the alternative hypothesis that the variances are not all equal. We use the ‘Levene statistic’
(omitting the details here) and note the p-value is 0.691. Therefore, we fail to reject H0 and are
content that our assumption of homogeneous variances holds. We can therefore improve on
the earlier 95% confidence intervals for the individual group means by pooling all
30 observations to derive a more precise estimate of the common variance, σ². The estimator
is $S^2 = SS_{Error}/(n - c)$, which gives us a point estimate here of 0.963 (the within groups mean sum
of squares in the one-way ANOVA table produced in (a)).
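Noting that $t_{0.025,\,27} = 2.052$, using statistical tables, revised 95% confidence intervals are:
$$\bar{x}_A \pm t_{0.025,\,n-c} \times \frac{S}{\sqrt{n_A}} \Rightarrow 4 \pm 2.052 \times \frac{\sqrt{0.963}}{\sqrt{10}} \Rightarrow (3.36,\ 4.64)$$
$$\bar{x}_B \pm t_{0.025,\,n-c} \times \frac{S}{\sqrt{n_B}} \Rightarrow 5 \pm 2.052 \times \frac{\sqrt{0.963}}{\sqrt{10}} \Rightarrow (4.36,\ 5.64)$$
$$\bar{x}_C \pm t_{0.025,\,n-c} \times \frac{S}{\sqrt{n_C}} \Rightarrow 7 \pm 2.052 \times \frac{\sqrt{0.963}}{\sqrt{10}} \Rightarrow (6.36,\ 7.64)$$
Note the confidence interval for μA is now slightly wider (due to $S_A^2 < S^2$, i.e. (0.816)² = 0.666 <
0.963), but the other intervals are now narrower.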
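The effect of pooling on interval width can be checked by comparing half-widths directly. The following Python sketch uses only figures quoted in this solution; the variable names are illustrative:

```python
from math import sqrt

n_j = 10                                    # observations per commercial
t_individual, t_pooled = 2.262, 2.052       # t(0.025; 9) and t(0.025; 27) from the text
sds = {"A": 0.816, "B": 1.054, "C": 1.054}  # group standard deviations
pooled_var = 0.963                          # S^2 = SS_Error / (n - c)

# Half-width of each interval: t * (standard deviation) / sqrt(n_j)
pooled_half = t_pooled * sqrt(pooled_var) / sqrt(n_j)
individual_half = {g: t_individual * s / sqrt(n_j) for g, s in sds.items()}

# True where the pooled interval is wider than the individual one
wider_after_pooling = {g: pooled_half > h for g, h in individual_half.items()}

print(f"pooled half-width = {pooled_half:.3f}")
print({g: round(h, 3) for g, h in individual_half.items()})
print(wider_after_pooling)
```

The pooled half-width is the same for all three groups because the design is balanced, so only the group with a standard deviation well below the pooled value (commercial A) ends up with a wider interval.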
One-way ANOVA also assumes the dependent variable (here the effectiveness rating) is normally
distributed, which again should be tested. The popular Kolmogorov-Smirnov test would be fine,
although we omit the details here for brevity.
4.
a. Given we are comparing two groups (males and females), it is appropriate to conduct
an independent samples t test. We test: H0: μ1 = μ2 vs. H1: μ1 ≠ μ2.
The SPSS output is:
Group statistics
The F test for the equality of variances, i.e. H0: σ1² = σ2², is significant (the test statistic value of
5.700 has a p-value of 0.024), so we reject the null hypothesis of equal variances, hence equal
variances are not assumed. This means we look at the second row of the t test box of output. The t test
is not significant (t = −1.061 and the p-value = 0.300). Therefore, the null hypothesis of no
difference cannot be rejected.
b. The three usage groups differ in their preference for long-haul travel (the F statistic =
15.294 and the p-value = 0.000). Comparing means and 95% confidence intervals for
each group, it is clear that heavy users exhibit the greatest preference.
Descriptives
Preference
                                                95% Confidence Interval for Mean
N   Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
ANOVA
Preference
Total 104.300 29
c. The overall F test is significant (the F statistic = 20.743 and the p-value = 0.000). As
individual factors, both gender and frequency of travel are significant, as is the
interaction between gender and frequency of travel. For males, the heavy travellers
exhibit the greatest preference and the light travellers the lowest preference. For
females, the medium travellers exhibit the lowest preference.
Descriptive Statistics
Dependent Variable: Preference
Total 947.000 30