
ECON20003 – QUANTITATIVE METHODS 2

TUTORIAL 6

Download the t6e1, t6e2 and t6e3 Excel data files from the subject website and save them
to your computer or USB flash drive. Read this handout and complete the tutorial
exercises before your tutorial class, so that you can ask your tutor for help during the
Zoom session if necessary.

After the tutorial class attempt the "Exercises for assessment". For each assessment
exercise type your answer in the corresponding box available in the Quiz. If the exercise
requires you to use R, insert the relevant R/RStudio script and printout in the same Quiz box
below your answer. To get the tutorial mark for week 7, you must submit your answers to
these exercises in the Tutorial 6 Canvas Quiz by 10am Wednesday in week 7 and attend
Tutorial 7.

Comparing Several Population Central Locations with Parametric and Nonparametric Procedures

Analysis of Variance (ANOVA) is a class of statistical procedures used to test differences
between two or more (sub-) population central locations. It is called "Analysis of Variance"
rather than "Analysis of Means" or "Analysis of Medians" because it makes inferences about
several population central locations by analysing the variations within and between these
populations.

There are many different types of ANOVA, the two simplest ones being (i) one-way ANOVA
based on the independent measures design and (ii) one-way ANOVA based on randomised
blocks. The first is the generalisation of the two-independent-sample Z / t test for the
difference between two population means (parametric) and of the Wilcoxon rank-sum test
for the difference between two population medians (nonparametric), while the second is the
extension of the matched pair Z / t test for the difference between two population means
(parametric) and of the matched pair Wilcoxon signed ranks test for the difference between
two population medians (nonparametric).

One-Way ANOVA Based on the Independent Measures Design

Parametric one-way ANOVA based on independent samples has five conditions:

(i) The data set constitutes k independent random samples of independent observations drawn from k (sub-) populations.
(ii) The variable of interest is quantitative and continuous.
(iii) The measurement scale is interval or ratio.
(iv) Each (sub-) population is normally distributed, …
(v) … and has the same variance.

L. Kónya ECON20003 - Tutorial 6
The calculations are based on the decomposition of the total sum of squares (SS) in the
pooled sample into two components: the sum of squares for treatments (SST), which is
related to the variations between the samples, and the sum of squares for error (SSE), which
is related to the variations within the samples.

In symbols,

$$SS = SST + SSE$$

where

$$SS = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}\right)^2, \qquad SST = \sum_{j=1}^{k} n_j\left(\bar{x}_j-\bar{x}\right)^2, \qquad SSE = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2$$

k is the number of (sub-) populations and also the number of independent samples, $n_j$ is the
number of observations in sample j, $\bar{x}_j$ is the mean of sample j, and $\bar{x}$ is the grand
mean, i.e., the mean of the pooled sample (all available observations). The corresponding
degrees of freedom are n – 1 for SS, k – 1 for SST and n – k for SSE, where n is the total
number of observations (n1 + n2 + … + nk).
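The decomposition can be verified numerically. The following sketch (not part of the original handout) uses two small made-up samples, so the numbers are purely illustrative:

```r
# Two made-up samples (k = 2), purely to illustrate SS = SST + SSE
x1 <- c(4, 6, 8)
x2 <- c(5, 9, 10)
x  <- c(x1, x2)      # pooled sample
gm <- mean(x)        # grand mean

SS  <- sum((x - gm)^2)
SST <- length(x1) * (mean(x1) - gm)^2 + length(x2) * (mean(x2) - gm)^2
SSE <- sum((x1 - mean(x1))^2) + sum((x2 - mean(x2))^2)

all.equal(SS, SST + SSE)   # TRUE: the two components add up to the total
```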

Under the required conditions, the common population variance can be estimated with the
sample variance of the pooled sample,

$$s_p^2 = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2}{n-k} = \frac{SSE}{n-k} = MSE$$

where MSE is the mean squares for errors.

If, in addition, the composite null hypothesis, H0: μ1 = μ2 = … = μk, is correct, the common
population variance can also be estimated using the sample variance of the sample means.
This second estimator is
$$s_0^2 = \frac{\sum_{j=1}^{k} n_j\left(\bar{x}_j-\bar{x}\right)^2}{k-1} = \frac{SST}{k-1} = MST$$

where MST is the mean squares for treatments.

The test statistic is the ratio of these two estimators,

$$F = \frac{s_0^2}{s_p^2} = \frac{MST}{MSE}$$

and under H0 it has an F distribution with df1 = k – 1 numerator degrees of freedom and df2
= n – k denominator degrees of freedom.

This test is always a right-tail test in terms of the decision rule, meaning that H0 is rejected
at the α·100% significance level if the observed test statistic value exceeds the critical
value, i.e. if Fobs > Fα,k–1,n–k.
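In R, critical values of this F distribution come straight from the qf() function. For instance, with α = 0.05, k = 4 and n = 120 (a quick sketch, not part of the original handout):

```r
# Right-tail critical value of F with k - 1 and n - k degrees of freedom
alpha <- 0.05
k <- 4
n <- 120
qf(1 - alpha, df1 = k - 1, df2 = n - k)   # F(0.05, 3, 116), about 2.68
```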

Exercise 1 (Selvanathan et al., p. 636, ex. 15.13)

The friendly folks at the Taxpayers Association are always looking for ways to improve the
wording and format of their tax return forms. Three new forms have been developed
recently. To determine which, if any, are superior to the current form, 120 individuals were
asked to participate in an experiment. Each of the three new forms and the currently used
form were filled out by 30 different people. The amount of time (in minutes) taken by each
person to complete the task was recorded and stored in the t6e1 Excel file.

(a) What conclusions can be drawn from these data? (Use α = 0.05.)

The variable of interest is Time (in minutes) it takes to fill out a form. It is a quantitative
variable measured on a ratio scale.

The question is whether there is any difference between the four forms in terms of Time,
so the hypotheses are

H0: 1 = 2 = 3 = 4 and HA: not all four population means are the same.1

Granted that the required conditions are satisfied, we can apply an ANOVA F-test. Let
us do so first manually and then with R.

The 5% critical value is Fα,k–1,n–k, the (1–α)·100% percentile of the F-distribution with
df1 = k – 1 numerator degrees of freedom and df2 = n – k denominator degrees of
freedom. In this case, k = 4 and n = 4 × 30 = 120, and from the F table this critical value
is F0.05,3,116 ≈ F0.05,3,120 = 2.68. Therefore, H0 is to be rejected if Fobs > 2.68.

To simplify the manual calculations, launch RStudio, create a new project and script,
and name them t6e1. Import the data saved in the t6e1 Excel data file to RStudio in the
usual way, just make sure that in the Import Options section of the Import Excel Data
dialogue window (see next page) you name the data frame t6e1_wide.2

Save the data set in your project as t6e1.

1 Note that this alternative hypothesis is not equivalent to μ1 ≠ μ2 ≠ μ3 ≠ μ4. This latter statement is stronger than HA because, for example, it excludes the possibility that μ1 and μ2 are equal, while under HA they can be equal.
2 So far, for the sake of simplicity, we have always used the name of the Excel data file for the R data frame. You will understand soon why we use a different name for the R data frame this time.
Generate the basic descriptive statistics for the four samples by executing the following
commands:

library(pastecs)
round(stat.desc(t6e1_wide, basic = FALSE,
desc = TRUE, norm = TRUE, p = 0.95),3)

You should get the following results:

               Form1   Form2   Form3   Form4
median        83.500  99.500 100.000 113.000
mean 90.167 95.767 106.833 111.167
SE.mean 5.749 5.480 5.564 5.840
CI.mean.0.95 11.758 11.208 11.379 11.943
var 991.523 900.875 928.695 1023.040
std.dev 31.488 30.015 30.475 31.985
coef.var 0.349 0.313 0.285 0.288
skewness 0.457 -0.116 0.361 -0.045
skew.2SE 0.535 -0.136 0.422 -0.053
kurtosis -0.106 -1.087 -0.716 -0.954
kurt.2SE -0.064 -0.653 -0.430 -0.573
normtest.W 0.956 0.969 0.967 0.967
normtest.p 0.251 0.504 0.450 0.455

Since the sample sizes are equal, the grand mean is the average of the four sample
means:

$$\bar{x} = \frac{\sum_{j=1}^{k}\bar{x}_j}{k} = \frac{90.167 + 95.767 + 106.833 + 111.167}{4} = 100.984$$

The sum of squares for treatments can be calculated from the sample means and the
grand mean:

$$SST = \sum_{j=1}^{k} n_j\left(\bar{x}_j-\bar{x}\right)^2 = 30\left[\left(90.167-100.984\right)^2 + \left(95.767-100.984\right)^2 + \left(106.833-100.984\right)^2 + \left(111.167-100.984\right)^2\right] = 8463.866$$

The sum of squares for errors can be obtained from the sample variances:

$$SSE = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2 = \sum_{j=1}^{k}\left(n_j-1\right)s_j^2 = 29 \times \left(991.523 + 900.875 + 928.695 + 1023.040\right) = 111479.857$$

The mean squares are

$$MST = \frac{SST}{k-1} = \frac{8463.866}{3} = 2821.289$$

$$MSE = \frac{SSE}{n-k} = \frac{111479.857}{116} = 961.033$$

and the observed test statistic is

$$F_{obs} = \frac{MST}{MSE} = \frac{2821.289}{961.033} = 2.936$$

Fobs > 2.68, so we reject H0 and conclude at the 5% significance level that it does not
take the same time on average to fill out the four different forms.
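The manual steps above can also be replicated in R directly from the reported sample means and variances (a sketch; the numbers are copied from the descriptive statistics printout):

```r
# Sample means and variances taken from the stat.desc output
means <- c(90.167, 95.767, 106.833, 111.167)
vars  <- c(991.523, 900.875, 928.695, 1023.040)
nj <- 30; k <- 4; n <- k * nj

gm   <- mean(means)               # grand mean (valid because the sample sizes are equal)
SST  <- nj * sum((means - gm)^2)  # about 8463.9
SSE  <- (nj - 1) * sum(vars)      # about 111479.9
Fobs <- (SST / (k - 1)) / (SSE / (n - k))
Fobs                              # about 2.936
```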

Before we reproduce these results in R, open the t6e1 Excel file. It has two sheets, Wide
and Long. They contain the same data but in different formats. On the Wide sheet, there
are four columns, one for each treatment (Form1, Form2, Form3, Form4), to record the
four samples of the time (Time) required to fill in the forms. On the Long sheet, there are
only two columns. The first column is for the type of the Form and the second for Time.
Form is the treatment variable. It has four possible values and if you scroll down you
can see that each value is repeated n = 30 times. The second column contains the
variable of interest, Time, and it has the four columns of time from the Wide sheet (from
row 2 to row 31) stacked on top of each other.

When you imported the data from the t6e1 Excel file, by default, RStudio opened the
first sheet in the file, i.e., the Wide sheet. If you check your Environment tab, you can
see that the data frame contains 4 numeric vectors.

However, to perform ANOVA in R, the data set needs to be arranged in long format, like
on the Long sheet of t6e1.

In general, a set of data arranged in a table can be in wide (unstacked) format or in long
(stacked) format. In the wide format, every data point is recorded in a single row and the
columns hold the values of various attributes, while in the long format each data point is
represented by as many rows as the number of different sets of attributes and each row
contains the values of one set of attributes for the given data point. When the data set has
2 or 3 variables, the wide format is more compact and hence it is preferred for display
purposes. However, when there are more variables, the long format is more convenient.
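As a quick illustration of the two formats (with made-up numbers, not the t6e1 data), base R's stack() function converts a wide data frame to long format:

```r
# A hypothetical wide data frame: one column per form
wide <- data.frame(FormA = c(80, 95, 102), FormB = c(110, 98, 91))

# stack() returns a long data frame with a 'values' column and an
# 'ind' factor recording which original column each value came from
long <- stack(wide)
names(long) <- c("Time", "Form")
long   # 6 rows: every observation now sits in its own row
```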

Import the data on that sheet to RStudio and name the new data frame t6e1_long.

The basic form of the R function for one-way ANOVA is

aov(formula)

where the formula argument specifies the statistical model that we intend to analyse. In
this example, we want to compare the times required to fill in the forms, so Time is the
‘dependent’ variable, Form is the independent variable, and the appropriate formula is
Time ~ Form. The output of aov is quite succinct. To obtain a more meaningful printout,
it is better to combine it with the summary function that we already used in Tutorial 3.

Execute

summary(aov(Time ~ Form, data = t6e1_long))

to obtain the following ANOVA table:

Df Sum Sq Mean Sq F value Pr(>F)


Form 3 8464 2821 2.936 0.0363 *
Residuals 116 111480 961
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Compare this printout to the manual calculations above. Note that in this printout
the two sources of variation, treatment and error, are labelled Form and Residuals,
and the mean squares are denoted Mean Sq. The reported F value is 2.936, the
same as Fobs, and the p-value (Pr(>F)) is 0.0363, smaller than 0.05, so the test rejects
H0 at the 5% significance level.
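The reported p-value can be reproduced from the F distribution functions (a quick check, not part of the original handout):

```r
# P(F > 2.936) with df1 = 3 and df2 = 116 degrees of freedom
pf(2.936, df1 = 3, df2 = 116, lower.tail = FALSE)   # about 0.0363
```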

What if we did not have the data set in long format? One option is to reshape the data
set from wide to long in Excel before importing it to RStudio, just as I did myself.
However, this can be time consuming when the data set is large. Alternatively, we can
import the data in wide format and convert it to long format in R before using the aov()
function.

To illustrate this option, stack the four vectors of time into one vector, called minutes, by
executing

minutes = c(t6e1_wide$Form1, t6e1_wide$Form2, t6e1_wide$Form3, t6e1_wide$Form4)

The new vector, minutes, has 120 elements and is identical to Time in the t6e1_long
data frame. On its own, however, it is insufficient, because by stacking the four original
vectors into one we lost a crucial piece of information, namely which tax return form
each observation belongs to. For this reason, just like in the t6e1_long data frame, minutes
must be complemented with a second variable that categorizes the observations and stores
the categories as levels. It is a qualitative or categorical variable, called a factor in R.

A factor can be generated by the

gl(n, k, length, labels)

function, where n is the number of factor levels, k is the number of replications in a row,
length is the required length of the resulting factor, and labels (optional) contains the
labels of the factor levels.

In this case, n = 4, k = 30, length = 120, labels = c("Form1", "Form2", "Form3", "Form4"),
so execute

forms = gl(4, 30, 120, c("Form1", "Form2", "Form3", "Form4"))

Type forms in the Console and press Enter to verify that we managed to replicate Form
in the t6e1_long data frame.

Now, execute

summary(aov(minutes ~ forms))

to obtain, apart from the different variable names (Form versus forms), the same ANOVA
table as the one on the previous page.

(b) What are the required conditions for the test conducted in part (a)?

As mentioned on page 1, the ANOVA F-test is based on five assumptions: (i) the data
set constitutes k independent random samples of independent observations drawn from
k (sub-) populations; (ii) the variable of interest is quantitative and continuous; (iii) the
measurement scale is interval or ratio; (iv) each (sub-) population is normally distributed
and (v) has the same variance (i.e., they are homoskedastic).

(c) Does it appear that the required conditions of the tests in part (a) are satisfied?

Independence of the samples is not a testable requirement; we just take it for granted. The
amount of time (in minutes) taken by each person to complete the task is a quantitative
variable measured on a ratio scale. The descriptive statistics and the Shapiro-Wilk test
results shown earlier do not challenge normality.3

As regards the last requirement, homoskedasticity, you learnt in the week 4 lectures and
in the previous tutorial how to test with the F-test whether two population variances are
equal. A generalization of this test to the equality of two or more population variances
is Levene's test. The hypotheses are

$$H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2, \qquad H_A: \text{not all } \sigma_i^2 \text{ are equal}$$

In R, it can be performed easily with the

leveneTest(formula)

function, where the formula argument is like in aov.4 This function is part of the car package, so
install this package if you do not have it yet, load it,

install.packages("car")
library(car)

and then execute

leveneTest(minutes ~ forms)
It returns

Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   3  0.0833  0.969
      116

3 We do not discuss this issue here in detail because by now you should be able to check normality on your own. Remember though that in the assignments and in the final exam you cannot simply assume normality; you need to verify it using the checks you learnt about.
4 We do not discuss the details of Levene's test, and you are not expected to be able to perform it manually.

As you can see, the numbers of degrees of freedom are the same as on the previous
ANOVA printout. The test statistic value is 0.0833 and the p-value (Pr(>F)) is 0.969, so
there is no reason to question the homoskedasticity assumption. This implies that the
ANOVA F-test is appropriate this time.

But what if Levene's test rejected homoskedasticity? In that case, we should not rely
on the ANOVA F-test but should use Welch's F-test, which is a generalization of
Welch's t-test for two or more independent samples. This test requires independence
and normality, just like the ANOVA F-test, but it allows unequal variances (called
heteroskedasticity). In R, it can be done with the

oneway.test(formula, var.equal = )

command, where formula is like before and var.equal is a logical variable indicating
whether to treat the variances in the samples as equal (it is FALSE by default).

Hence,

oneway.test(minutes ~ forms, var.equal = TRUE)

performs the ANOVA F-test and returns

One-way analysis of means


data: minutes and forms
F = 2.9358, num df = 3, denom df = 116, p-value = 0.03632

Apart from the different numbers of decimals, the F test statistic and the p-value on this
printout are the same as on the aov printout. To run the Welch test, we need to drop
the var.equal argument.5

oneway.test(minutes ~ forms)

returns

One-way analysis of means (not assuming equal variances)


data: minutes and forms
F = 2.8057, num df = 3.000, denom df = 64.426, p-value = 0.04661

Compared to the ANOVA F-test, the Welch F-test statistic is a bit smaller (2.8057) and
the corresponding p-value is a bit larger (0.04661), but H0 is still rejected at the 5%

5 We can use var.equal = FALSE, but it is unnecessary as this is the default option.
significance level. Therefore, it does not matter this time whether the variances are equal
or not.

What if we reduce the significance level to 4%? At this level the ANOVA F-test still rejects
the null hypothesis, but the Welch F-test does not. Because of the Levene's test result,
we had better rely on the former test, so at the 4% significance level we still conclude that it
does not take the same time on average to fill out the four different forms.

When the mean does not exist or the sampled populations are clearly not normally
distributed, we should use neither the ANOVA F-test nor the Welch F-test, but some
nonparametric test instead. The nonparametric counterpart of these tests is the Kruskal-
Wallis test, a generalization of the Wilcoxon rank-sum test to two or more (sub-) populations.

The Kruskal-Wallis test is a one-way ANOVA test for the equality of k ≥ 2 (sub-) population
medians based on the ranks of the observations in the pooled set of k independent samples,
one from each (sub-) population. It is based on the following assumptions:

(i) The data set constitutes k independent random samples of independent observations drawn from k (sub-) populations.
(ii) The variable of interest is quantitative and continuous.
(iii) The measurement scale is at least ordinal.
(iv) The sampled populations differ at most with respect to their central locations (i.e. medians).

The hypotheses are

H0: 1 = 2 = … = k and HA: not all population medians are equal.

To perform this test, we need to rank all available observations from the smallest (1) to the
largest (n = n1 + n2 +…+ nk), averaging the ranks of tied observations, and calculate the sum
of the ranks assigned to the observations in each sample. The test statistic is

$$H = \frac{12}{n(n+1)}\sum_{j=1}^{k}\frac{T_j^2}{n_j} - 3(n+1)$$

where Tj is the sum of ranks assigned to the observations in the jth sample.

The sampling distribution of this test statistic is non-standard. The small-sample critical
values for k = 3, nj = 1, …, 6 and k = 4, nj = 1, …, 5 are provided in the Kruskal-Wallis table that
you can download from the subject website, while for larger sample sizes (5 or more
observations per sample) it can be approximated with a chi-square distribution, χ²k–1.

Exercise 2 (Selvanathan et al., p. 933, ex. 20.56)

It is common practice in the advertising business to create several different advertisements
and then ask a random sample of potential customers to rate the ads on several different
dimensions. Suppose that an advertising firm developed four different ads for a new
breakfast cereal and asked a random sample of 400 shoppers to rate the believability of the
advertisements. One hundred people viewed ad 1, another 100 viewed ad 2, another 100
saw ad 3, and another 100 saw ad 4. The ratings were: very believable (4), quite believable
(3), somewhat believable (2) and not believable at all (1). The responses are stored in the
t6e2 Excel file. Based on these data, can the firm's management conclude at the 5%
significance level that differences exist in believability among the four ads?

This exercise is similar to the previous exercise, with one important difference. Namely, this
time the observed variable, response, is a qualitative variable measured on an ordinal scale.
For this reason, we cannot use a parametric test for the population means. Instead, we must
use a nonparametric procedure for the population medians. The null hypothesis is that there
is no difference among the four ads in terms of median believability, i.e., η1 = η2 = η3 = η4,
while the alternative hypothesis is that not all four population medians are the same.

The appropriate procedure is the Kruskal-Wallis test, granted that its requirements are met.
The data set consists of four independent samples of 100 independent observations each,
the underlying variable of interest is belief, which can be thought of as a continuous variable,
and the measurement scale of the observed variable is ordinal. Hence, requirements (i), (ii)
and (iii) are satisfied. As for (iv), it can be checked by illustrating the data with four
histograms.

Launch RStudio, create a new project and script, name them t6e2, import the data saved in
the t6e2 Excel file6 and load it into your project. As you can see on the Environment tab,
there are two series in the data set: Response, which is the variable of interest, and Ad,
which is used to classify the observations according to the four advertisements.

Check whether the psych library is available on your computer. If it is not, install it.7

You can develop a histogram of Response for each of the four ads by executing the following
commands:

library(psych)
par(mfrow = c(2, 2))
for (i in c(1,2,3,4)) {hist(subset(Response, Ad == i))}

The first command, library(psych), loads the psych package in the active project. The
second command, par(mfrow), sets up a 2 x 2 plotting space to be able to view the four
histograms together in a single plot. The third command, for (i in c(1,2,3,4)), creates a for
loop that iterates 4 times. In each iteration i takes on the value of the corresponding element
of vector (1, 2, 3, 4) and a histogram is generated by the hist function for a subset of
Response defined by the Ad equals i restriction.

6 Before importing the data, check in Excel that it is in long format.
7 Click Tools / Install Packages … and write psych in the second box of the Install Packages dialogue window.
You should get the plot8 shown below. The four histograms look similar, so the fourth
requirement of Kruskal-Wallis test is also satisfied.

Because of the large sample sizes, we do not do the Kruskal-Wallis test manually. The
relevant R function is

kruskal.test(x, g)

where x is the vector of observations and g is the corresponding grouping vector or factor.

In this case,

kruskal.test(Response, Ad)

returns

8 You might get an error message if you use a laptop with its native screen, as this combined plot might not fit in the bottom-right plots panel of RStudio. If that is the case, try to resize this panel and make it wider.
Kruskal-Wallis rank sum test
data: Response and Ad
Kruskal-Wallis chi-squared = 4.7766, df = 3, p-value = 0.1889

Under the null hypothesis the Kruskal-Wallis test statistic is distributed as a chi-square
random variable with k – 1 = 3 degrees of freedom. The p-value is 0.1889 > 0.05. Hence, at
the 5% significance level we maintain the null hypothesis and conclude that there are no
significant differences among the four ads in terms of believability.
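The same decision follows from the chi-square critical value (a quick check, not part of the original handout):

```r
# 5% critical value of the chi-square distribution with k - 1 = 3 degrees of freedom
qchisq(0.95, df = 3)   # about 7.815; since 4.7766 < 7.815, H0 is not rejected
```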

Exercises for Assessment

Exercise 3

A farmer wants to know if the weight of parsley plants is influenced by using a fertilizer. He
selects 90 plants and randomly divides them into three groups of 30 plants each. He applies
a biological fertilizer to the first group, a chemical fertilizer to the second group and no
fertilizer at all to the third group. After a month he weighs all plants and saves the
measurements in the t6e3 Excel file.

Can we conclude from these data at the 5% significance level that fertilizer affects the weight
of parsley plants?

a) Obtain the basic descriptive statistics with R and then perform the ANOVA F-test
manually.

b) Repeat the ANOVA F-test in R.

c) What are the required conditions for the tests in parts (a) and (b)? Do they seem to be
satisfied?

d) Perform the Welch F-test in R. Does it lead to the same conclusion as the ANOVA F-test?

e) Perform the Kruskal-Wallis test in R (use α = 0.05). Does it lead to a different conclusion than the parametric tests in parts (b) and (d)?

