ANALYSIS OF VARIANCE (ANOVA)
Analysis of variance (ANOVA) is a test of means for two or more populations. The null hypothesis is that all means are equal. ANOVA requires a dependent variable that is metric (measured on an interval or ratio scale) and one or more independent variables that are categorical (non-metric). Categorical independent variables are also called factors.
ONE-WAY ANALYSIS OF VARIANCE
One-way analysis of variance involves only one categorical variable, or a single factor. In one-way analysis of variance, a treatment is the same as a factor level.
Questions:
- Do the various segments differ in terms of their volume of product consumption?
- Do the brand evaluations of groups exposed to different commercials vary?
- What is the effect of consumers' familiarity with the store (measured as high, medium, and low) on preference for the store?
Null hypothesis when comparing three groups:
$$H_0: \mu_1 = \mu_2 = \mu_3$$
STATISTICS ASSOCIATED WITH ANALYSIS OF VARIANCE
F statistic. The null hypothesis that the category means are equal in the population is tested by an F statistic based on the ratio of the mean square related to X to the mean square related to error.
Mean square. This is the sum of squares divided by the appropriate degrees of freedom.
eta² (η²). The strength of the effect of X (independent variable or factor) on Y (dependent variable) is measured by eta² (η²). The value of η² varies between 0 and 1.
F-Ratio
$$F = \frac{\text{Variance between groups}}{\text{Variance within groups}}$$
Sum of Squares
$$SS_{total} = SS_{within} + SS_{between}$$
STATISTICS ASSOCIATED WITH ONE-WAY ANALYSIS OF VARIANCE
The total sum of squares, SStotal, is computed by squaring the deviation of each score from the grand mean and summing these squares:
$$SS_{total} = \sum_{j=1}^{c} \sum_{i=1}^{n} (X_{ij} - \bar{X})^2$$
where
$X_{ij}$ = individual score, i.e., the ith observation or test unit in the jth group
$\bar{X}$ = grand mean
n = number of observations or test units in a group
c = number of groups (or columns)
SSwithin, the observed variability within each group, is computed by squaring the deviation of each score from its group mean and summing these squares:
$$SS_{within} = \sum_{j=1}^{c} \sum_{i=1}^{n} (X_{ij} - \bar{X}_j)^2$$
where
$X_{ij}$ = individual score
$\bar{X}_j$ = group mean for the jth group
n = number of observations or test units in a group
c = number of groups (or columns)
SSbetween, the variability of the group means about the grand mean, is calculated by squaring the deviation of each group mean from the grand mean, multiplying by the number of items in the group, and summing these squares:
$$SS_{between} = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$
where
$\bar{X}_j$ = group mean for the jth group
$\bar{X}$ = grand mean
$n_j$ = number of observations or test units in the jth group
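The three sums of squares defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `sums_of_squares` and the toy data are hypothetical, chosen only so the identity SStotal = SSwithin + SSbetween is easy to check by hand.

```python
from statistics import mean


def sums_of_squares(groups):
    """Return (SS_total, SS_within, SS_between) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Deviation of every score from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Deviation of every score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    # Deviation of each group mean from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_total, ss_within, ss_between


# Hypothetical toy data: three groups of three scores each.
toy_groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
ss_total, ss_within, ss_between = sums_of_squares(toy_groups)
print(ss_total, ss_within, ss_between)
```

For the toy data the grand mean is 4, so the total of 48 splits into 6 within groups and 42 between groups.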
DECOMPOSITION OF THE TOTAL VARIATION: ONE-WAY ANOVA
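[Figure: decomposition of the total variation. The independent variable X has categories X1, X2, X3, ..., Xc. Each category column lists its observations Y1, Y2, ..., Yn and its category mean (Ȳ1, Ȳ2, Ȳ3, ..., Ȳc). The total sample column lists Y1, Y2, ..., YN and the grand mean Ȳ. Within-category variation = SSwithin; between-category variation = SSbetween; total variation = SSy.]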
ANALYSIS OF VARIANCE MEAN SQUARES
To obtain the mean square between groups, SSbetween is divided by k-1 degrees of freedom:
$$MS_{between} = \frac{SS_{between}}{k - 1}$$
To obtain the mean square within groups, SSwithin is divided by N-k degrees of freedom:
$$MS_{within} = \frac{SS_{within}}{N - k}$$
where
k = number of groups
N = total number of observations
ANALYSIS OF VARIANCE F-RATIO
$$F = \frac{MS_{between}}{MS_{within}}$$
This statistic follows the F distribution, with (k-1) degrees of freedom in the numerator and (N-k) degrees of freedom in the denominator.
ONE-WAY ANALYSIS OF VARIANCE
EFFECT OF IN-STORE PROMOTION ON SALES
Normalized sales by level of in-store promotion

Store No.         High           Medium         Low
1                 10             8              5
2                 9              8              7
3                 10             7              6
4                 8              9              4
5                 9              6              5
6                 8              4              2
7                 9              5              3
8                 7              5              2
9                 7              6              1
10                6              4              2
Column totals     83             62             37
Category means    83/10 = 8.3    62/10 = 6.2    37/10 = 3.7
Grand mean        (83 + 62 + 37)/30 = 6.067
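The column totals, category means, and grand mean in the table can be checked with a short Python sketch. The variable names (`sales`, `column_totals`, and so on) are illustrative choices, not taken from the slides; the numbers are the normalized sales from the table.

```python
# Normalized sales for stores 1-10 at each level of in-store promotion.
sales = {
    "High": [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    "Medium": [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    "Low": [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

# Column totals and category means, one per promotion level.
column_totals = {level: sum(values) for level, values in sales.items()}
category_means = {level: sum(values) / len(values) for level, values in sales.items()}

# Grand mean over all 30 observations.
n_total = sum(len(values) for values in sales.values())
grand_mean = sum(column_totals.values()) / n_total

print(column_totals)
print(category_means)
print(round(grand_mean, 3))
```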
STEPS TO CALCULATE ANOVA
Step 1: Calculate the group means.
Step 2: Calculate the grand mean.
Step 3: Calculate the between-group variation (sum of squares):
$$SS_{between} = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{X})^2$$
Step 4: Calculate the within-group variation (sum of squares):
$$SS_{within} = \sum_{j=1}^{c} \sum_{i=1}^{n} (X_{ij} - \bar{X}_j)^2$$
Step 5: Calculate SStotal (SStotal = SSbetween + SSwithin).
ONE-WAY ANALYSIS OF VARIANCE
$$\begin{aligned}
SS_{between} &= 10(8.3-6.067)^2 + 10(6.2-6.067)^2 + 10(3.7-6.067)^2 \\
&= 10(2.233)^2 + 10(0.133)^2 + 10(-2.367)^2 \\
&= 106.067
\end{aligned}$$

$$\begin{aligned}
SS_{within} &= (10-8.3)^2 + (9-8.3)^2 + (10-8.3)^2 + (8-8.3)^2 + (9-8.3)^2 + (8-8.3)^2 + (9-8.3)^2 + (7-8.3)^2 + (7-8.3)^2 + (6-8.3)^2 \\
&\quad + (8-6.2)^2 + (8-6.2)^2 + (7-6.2)^2 + (9-6.2)^2 + (6-6.2)^2 + (4-6.2)^2 + (5-6.2)^2 + (5-6.2)^2 + (6-6.2)^2 + (4-6.2)^2 \\
&\quad + (5-3.7)^2 + (7-3.7)^2 + (6-3.7)^2 + (4-3.7)^2 + (5-3.7)^2 + (2-3.7)^2 + (3-3.7)^2 + (2-3.7)^2 + (1-3.7)^2 + (2-3.7)^2 \\
&= (1.7)^2 + (0.7)^2 + (1.7)^2 + (-0.3)^2 + (0.7)^2 + (-0.3)^2 + (0.7)^2 + (-1.3)^2 + (-1.3)^2 + (-2.3)^2 \\
&\quad + (1.8)^2 + (1.8)^2 + (0.8)^2 + (2.8)^2 + (-0.2)^2 + (-2.2)^2 + (-1.2)^2 + (-1.2)^2 + (-0.2)^2 + (-2.2)^2 \\
&\quad + (1.3)^2 + (3.3)^2 + (2.3)^2 + (0.3)^2 + (1.3)^2 + (-1.7)^2 + (-0.7)^2 + (-1.7)^2 + (-2.7)^2 + (-1.7)^2 \\
&= 79.80
\end{aligned}$$

$$SS_{total} = 106.067 + 79.80 = 185.867$$
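The hand calculation above is long enough that a mechanical check is useful. The following Python sketch recomputes the three sums of squares directly from the sales data; the variable names are illustrative, and the code is one straightforward way to apply the formulas rather than a prescribed procedure.

```python
# Normalized sales for stores 1-10 at each level of in-store promotion.
sales = {
    "High": [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    "Medium": [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    "Low": [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

all_sales = [y for values in sales.values() for y in values]
grand_mean = sum(all_sales) / len(all_sales)

# Between groups: squared deviation of each group mean from the grand mean,
# multiplied by the group size.
ss_between = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in sales.values()
)

# Within groups: squared deviation of each score from its own group mean.
ss_within = sum(
    (y - sum(values) / len(values)) ** 2
    for values in sales.values()
    for y in values
)

# Total: squared deviation of each score from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_sales)

print(round(ss_between, 3), round(ss_within, 3), round(ss_total, 3))
```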
STEPS TO CALCULATE ANOVA
Step 6: Calculate the number of degrees of freedom in the numerator (k-1) and in the denominator (N-k).
Step 7: Calculate the mean square between groups (MSbetween) and the mean square within groups (MSwithin):
$$MS_{between} = \frac{SS_{between}}{k - 1}$$
$$MS_{within} = \frac{SS_{within}}{N - k}$$
Step 8: Calculate the F statistic.
CALCULATION OF F RATIO
df (numerator) = k - 1 = 3 - 1 = 2
df (denominator) = N - k = 30 - 3 = 27
SStotal = SSbetween + SSwithin: 185.867 = 106.067 + 79.80
$$F = \frac{SS_{between}/(k-1)}{SS_{within}/(N-k)} = \frac{106.067/2}{79.80/27} = \frac{MS_{between}}{MS_{within}} = \frac{53.033}{2.956} = 17.94$$
With df 2, 27 the critical value of F is 3.35 for α = .05. Because 17.94 exceeds 3.35, the null hypothesis of equal means is rejected.
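The same arithmetic can be sketched in Python, starting from the sums of squares reported in the text. The critical value 3.35 is taken from the text rather than computed, and the constant name `F_CRITICAL` is an illustrative choice.

```python
# Sums of squares and design sizes from the in-store promotion example.
ss_between, ss_within = 106.067, 79.80
k, n_total = 3, 30

# Degrees of freedom for the numerator and the denominator.
df_between = k - 1
df_within = n_total - k

# Mean squares and their ratio.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Critical F(2, 27) at alpha = .05, as quoted in the text (not computed here).
F_CRITICAL = 3.35
reject_h0 = f_ratio > F_CRITICAL

print(df_between, df_within, round(ms_between, 3), round(ms_within, 3), round(f_ratio, 2))
print("Reject H0:", reject_h0)
```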
ETA SQUARE (η²)
The strength of the effect of X (independent variable or factor) on Y (dependent variable) is measured by eta² (η²). The value of η² varies between 0 and 1. For the in-store promotion example:
$$\eta^2 = \frac{SS_{between}}{SS_{total}} = \frac{106.067}{185.867} = 0.571$$
57.1% of the variation in sales (Y) is accounted for by in-store promotion (X), indicating a modest effect.
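As a cross-check that is not part of the original slides, η² can also be recovered from the F ratio and its degrees of freedom. Dividing numerator and denominator by MSwithin, and using SSbetween = MSbetween (k-1) and SSwithin = MSwithin (N-k), gives:

```latex
\[
\eta^2 = \frac{SS_{between}}{SS_{between} + SS_{within}}
       = \frac{F\,(k-1)}{F\,(k-1) + (N-k)}
       = \frac{17.94 \times 2}{17.94 \times 2 + 27}
       = \frac{35.88}{62.88} \approx 0.571
\]
```

This agrees with the value obtained directly from the sums of squares.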
SPSS EXAMPLE
Data file: demo.sav. The variable Level of education [ed] divides employees into five independent groups by level of education. Use the One-Way ANOVA procedure to test whether the mean Household income in thousands [income] differs significantly across the five groups.
2-WAY ANOVA
Two-way ANOVA allows two different treatments to be examined simultaneously.
Examples:
- Do educational level (graduate, postgraduate, PhD) and age (less than 35, 35-55, more than 55) affect consumption of a brand?
- What is the effect of consumers' familiarity with a department store (high, medium, and low) and store image (positive, neutral, and negative) on preference for the store?
- How do advertising levels (high, medium, and low) interact with price levels (high, medium, and low) to influence a brand's sales?
ANOVA
Consider the simple case of two variables X1 and X2 having k1 and k2 categories, respectively. The total variation in this case is partitioned as follows:
SStotal = SS due to X1 + SS due to X2 + SS due to interaction of X1 and X2 + SSwithin
or
$$SS_t = SS_{x_1} + SS_{x_2} + SS_{x_1 x_2} + SS_{error}$$
TWO-WAY ANALYSIS OF VARIANCE
The significance of the main effect of each factor may be tested as follows for X1 and X2:
$$F = \frac{SS_{x_1}/df_n}{SS_{error}/df_d} = \frac{MS_{x_1}}{MS_{error}}$$
$$F = \frac{SS_{x_2}/df_n}{SS_{error}/df_d} = \frac{MS_{x_2}}{MS_{error}}$$
where
dfn = k1 - 1 for X1 (k2 - 1 for X2)
dfd = N - k1k2
The next step is to examine the significance of the interaction effect. Under the null hypothesis of no interaction, the appropriate F test is:
$$F = \frac{SS_{x_1 x_2}/df_n}{SS_{error}/df_d} = \frac{MS_{x_1 x_2}}{MS_{error}}$$
where
dfn = (k1 - 1)(k2 - 1)
dfd = N - k1k2
The interaction sum of squares is obtained by subtraction:
$$SS_{x_1 x_2} = SS_t - (SS_{x_1} + SS_{x_2} + SS_{error})$$
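The degrees of freedom partition in the same way as the sums of squares, which gives a quick consistency check on any two-way ANOVA table. The numeric line below uses a hypothetical 3 x 2 design with N = 30 as the worked case:

```latex
\[
N - 1 = (k_1 - 1) + (k_2 - 1) + (k_1 - 1)(k_2 - 1) + (N - k_1 k_2)
\]
\[
29 = 2 + 1 + 2 + 24
\]
```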
ETA SQUARE
The strength of the joint effect of two factors, called the overall effect, or multiple η², is measured as follows:
$$\text{multiple } \eta^2 = \frac{SS_{x_1} + SS_{x_2} + SS_{x_1 x_2}}{SS_t}$$
ANOVA
Source   SS         df              MS                        F                 η²
X1       SSx1       k1-1            SSx1/(k1-1)               MSx1/MSerror      SSx1/SSt
X2       SSx2       k2-1            SSx2/(k2-1)               MSx2/MSerror      SSx2/SSt
X1X2     SSx1x2     (k1-1)(k2-1)    SSx1x2/((k1-1)(k2-1))     MSx1x2/MSerror
Error    SSwithin   N-k1k2          SSwithin/(N-k1k2)
Total    SStotal    N-1

Multiple η² = (SSx1 + SSx2 + SSx1x2)/SSt
Significance level: p < .05
SPSS EXAMPLE
Examine the effect of in-store promotion and couponing on store sales.
Factors: in-store promotion (high, medium, low) and couponing (distributed, not distributed)
Design: 3 x 2 factorial
Data file: grocery_1month.sav. This hypothetical data file is the grocery_coupons.sav data file with the weekly purchases "rolled-up" so that each case corresponds to a separate customer. Some of the variables that changed weekly disappear as a result, and the amount spent recorded is now the sum of the amounts spent during the four weeks of the study.
ANOVA SPSS OUTPUT
Source                SS         df    MS       F           p       η² (partial)
In-store promotion    106.067    2     53.03    54.86**     .00     .821
Coupon                53.33      1     53.33    55.172**    .00     .697
Interaction           3.27       2     1.63     1.690       .206    .123
Model                 162.67     5
Error                 23.20      24    0.97
Total                 185.87     29    6.41

N = 30; ** significant at p < .05
Interpretation: The main effects of in-store promotion (F(2, 24) = 54.86; p < .05) and couponing (F(1, 24) = 55.17; p < .05) are significant. These results indicate that sales differ significantly across the three levels of in-store promotion and between the two couponing conditions. The interaction between in-store promotion and couponing is not significant (F(2, 24) = 1.69; p = .206).
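The F ratios in the output can be reproduced from the in-store promotion sales data with a short Python sketch of the two-way decomposition. One assumption is needed that the text does not state: the first five stores at each promotion level are treated as the stores where coupons were distributed. Under that assumption the sketch gives F ratios matching the reported 54.86, 55.17, and 1.69. All names in the code are illustrative.

```python
# Normalized sales for stores 1-10 at each level of in-store promotion.
sales = {
    "High": [10, 9, 10, 8, 9, 8, 9, 7, 7, 6],
    "Medium": [8, 8, 7, 9, 6, 4, 5, 5, 6, 4],
    "Low": [5, 7, 6, 4, 5, 2, 3, 2, 1, 2],
}

# ASSUMPTION (not stated in the text): stores 1-5 at each promotion level
# received coupons and stores 6-10 did not.
cells = {}
for level, values in sales.items():
    cells[(level, "coupon")] = values[:5]
    cells[(level, "no coupon")] = values[5:]


def mean(xs):
    return sum(xs) / len(xs)


all_sales = [y for values in cells.values() for y in values]
grand_mean = mean(all_sales)


def ss_factor(position):
    """SS for one factor: pool the cells that share a level at this key position."""
    pooled = {}
    for key, values in cells.items():
        pooled.setdefault(key[position], []).extend(values)
    return sum(len(v) * (mean(v) - grand_mean) ** 2 for v in pooled.values())


ss_total = sum((y - grand_mean) ** 2 for y in all_sales)
ss_promotion = ss_factor(0)
ss_coupon = ss_factor(1)
# Error: squared deviation of each score from its own cell mean.
ss_error = sum((y - mean(v)) ** 2 for v in cells.values() for y in v)
# Interaction is obtained by subtraction.
ss_interaction = ss_total - (ss_promotion + ss_coupon + ss_error)

k1, k2, n_total = 3, 2, len(all_sales)
ms_error = ss_error / (n_total - k1 * k2)
f_promotion = (ss_promotion / (k1 - 1)) / ms_error
f_coupon = (ss_coupon / (k2 - 1)) / ms_error
f_interaction = (ss_interaction / ((k1 - 1) * (k2 - 1))) / ms_error

print(round(f_promotion, 2), round(f_coupon, 2), round(f_interaction, 2))
```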
MULTIVARIATE ANOVA
Example: In an experiment designed to measure the effect of gender and frequency of travel on preference for and satisfaction with foreign travel, a 2 x 3 between-subjects factorial design was used. Five respondents were assigned to each cell, for a total sample size of 30. Preference and satisfaction levels were measured on a 9-point scale. Conduct a 2 x 3 ANOVA.