ASSUMPTIONS
1. Your dependent variable should be measured
at the interval or ratio level (i.e., it is
continuous).
2. Your independent variable should consist of
two or more categorical, independent
groups.
3. You should have independence of
observations, which means that there is no
relationship between the observations in each
group or between the groups themselves.
4. There should be no significant outliers.
5. Your dependent variable should be
approximately normally distributed for each
category of the independent variable.
6. There needs to be homogeneity of variances.

HYPOTHESES

The analysis of variance is used to test the
hypothesis that the means of three or more populations
are the same against the alternative hypothesis that the
mean of at least one population is different from the
others.

ONE-WAY ANOVA

One-way analysis of variance (ANOVA) is a
method of testing the equality of three or more
population means by analyzing sample variances.

It is called the analysis of variance because the
test is based on the analysis of variation in the data
obtained from different samples.

PITFALLS OF REGRESSION ANALYSIS

Lacking an awareness of the assumptions
underlying least-squares regression.

Not knowing how to evaluate the assumptions.

Not knowing the alternatives to classical
regression if some assumption is violated.

Using a regression model without knowledge
of the subject matter.

STRATEGIES FOR AVOIDING PITFALLS OF
REGRESSION

Start with a scatter plot of X on Y to observe a
possible relationship.

Perform residual analysis to check the
assumptions.
- Use a histogram, stem-and-leaf display,
box-and-whisker plot, or normal
probability plot of the residuals to
uncover possible non-normality.
If any assumption is violated, use alternative
methods to least-squares regression or
alternative least-squares models
(e.g., curvilinear or multiple regression).

If there is no evidence of assumption
violation, then test for the significance of the
regression coefficients.
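The residual-analysis strategy above can be sketched in code. The following is a minimal illustration with made-up data, not any dataset from these notes: it fits a least-squares line by hand and computes the residuals, which in practice you would then inspect with a histogram or normal probability plot rather than just print.

```python
def fit_line(x, y):
    """Least-squares estimates of intercept a and slope b for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 5.9, 8.2, 9.8]   # invented data, roughly linear in x

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# The residuals should be centred on zero and show no pattern against x;
# non-normal or patterned residuals signal an assumption violation.
print("intercept:", round(a, 3), "slope:", round(b, 3))
print("residuals:", [round(r, 3) for r in residuals])
```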
ANALYSIS OF VARIANCE

1. One-way ANOVA
2. Two-way ANOVA
3. Tukey Test (Post Hoc Test)

Note:

The ANOVA test is applied by calculating two
estimates of the variance of the population
distributions: the variance between samples and the
variance within samples.

The variance between samples is also called the
mean square between samples, or MSB.

The variance within samples is also called the
mean square within samples, or MSW.

[MSW is the same as the MSE – the mean square due to error]

The variance between samples, MSB, gives an
estimate of variance based on the variation among the
means of samples taken from different populations.
For the example of three teaching methods, MSB will
be based on the values of the mean scores of three
samples of students taught by three different methods.
If the means of all populations under consideration are
equal, the means of the respective samples will still be
different, but the variation among them is expected to
be small, and consequently the value of MSB is
expected to be small. However, if the means of the
populations under consideration are not all equal, the
variation among the means of the respective samples
is expected to be large, and consequently the value of
MSB is expected to be large.

The variance within samples, MSW, gives an
estimate of variance based on the variation within the
data of different samples. For the example of three
teaching methods, MSW will be based on the scores of
the individual students included in the three samples
taken from the three populations.

REJECTION REGION

One-way ANOVA is always right-tailed, with the
rejection region in the right tail of the F distribution
curve.

[n is the total number of observations, not the total
number of samples; i.e., if you have 5 observations in
each of k = 6 categories (I.V.), then you have
n = (6)(5) = 30 total observations]

Example:

Suppose we have teachers at a school who have
devised three different methods to teach arithmetic.
They want to find out if these three methods produce
different mean scores. Let μ1, μ2, and μ3 be the mean
scores of all students who are taught by Methods I, II,
and III, respectively.

[To test if the three teaching methods produce different
means, we test the null hypothesis.]

Note:

Using a one-way ANOVA test, we analyze only
one factor or variable.

For instance, in the example of testing for the
equality of mean arithmetic scores of students taught
by each of the three different methods, we are
considering only one factor, which is the effect of the
different teaching methods on the scores of students.

[I.V.: Teaching Methods / D.V.: Scores of Students,
only one factor or I.V.]

Sometimes we may analyze the effects of two
factors. For example, if different teachers teach
arithmetic using these three methods, we can analyze
the effects of both teachers and teaching methods on
the scores of students. This is done by using a
two-way ANOVA.

[I.V.: Teachers and their teaching methods /
D.V.: Scores of Students, two factors or I.V.s]

Example:

Callie Cruz, Vice-President of the Nikel and Dime
Savings Bank, is reviewing employees' performance
for a possible salary increase. In evaluating tellers,
Callie decides that an important criterion is the number
of customers served each day. She expects that each
teller should handle approximately the same number
of customers daily. Otherwise, each teller should be
rewarded or penalized accordingly.

Callie randomly selects 6 business days, and the
customer traffic for each teller during these days is
recorded. The factor or variable of interest, then, is the
number of customers served. The sample data:

Solution:

Step 1:

Ho: All population means are equal; that is, Ms.
David, Ms. Chua, and Ms. Lim serve the same average
number of customers per day, and they are assumed to
have the same workload.

Ha: Not all the tellers are handling the same average
number of customers per day. At least one of the
tellers is performing better than the others, and at least
one of them is not performing up to the standard of the
others.

Since the test statistic (3.7805) is greater than the
CV (3.68), we reject Ho; therefore, at least one of the
tellers among David, Chua, and Lim is likely to be
handling more or fewer customers than the others.

POST HOC TESTS ON ONE-WAY ANOVA

Suppose we perform a one-way ANOVA and the
results lead us to conclude that at least one population is
different from the others. To determine which means
differ significantly, we make additional comparisons
between means. The procedures for making these
comparisons are called multiple comparison methods.
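The MSB/MSW construction described above can be sketched directly. The three groups below are invented toy data (not the teller example), chosen so the arithmetic is easy to follow:

```python
def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of independent samples."""
    k = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-samples variation: spread of the group means about the grand mean.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    msb = ssb / (k - 1)

    # Within-samples variation: spread of observations about their own group mean.
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    msw = ssw / (n - k)

    return msb, msw, msb / msw

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
msb, msw, f = one_way_anova(groups)
print(msb, msw, f)
```

The resulting F = MSB/MSW is then compared with the right-tail F critical value with (k − 1, n − k) degrees of freedom, matching the rejection region described above.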
TUKEY TEST
The Tukey test is also known as the Honestly
Significant Difference Test or the Wholly Significant
Difference Test. It is designed to compare pairs of
means after the null hypothesis of equal means has been
rejected.
It tests 𝐻𝑜: 𝜇i = 𝜇j versus 𝐻𝑎: 𝜇i ≠ 𝜇j for all means
where 𝑖 ≠ j. The goal of the test is to determine which
population means differ significantly.
Note:
The computation of the test statistic for Tukey's
test follows the same logic as the test for comparing two
means from independent sampling, but the standard
error used is not the same.
DISTRIBUTION OF TUKEY TEST
The q-test statistic follows a distribution called the
Studentized range distribution.
STANDARD ERROR

Where:
s2 = mean square error estimate (MSE) from the
one-way ANOVA
n1 = sample size from population 1
n2 = sample size from population 2

TEST STATISTIC FOR TUKEY'S TEST

CRITICAL VALUE FOR TUKEY'S TEST

The critical value for Tukey's test using a
familywise error rate 𝛼 is given by
q𝛼, v, k
Where:
v = degrees of freedom due to error (the total
sample size minus the number of means being
compared, or n − k).
[n here is the total number of observations]
k = total number of means being compared.

DECISION RULE

If 𝑞 ≥ 𝑞𝛼,v,k, reject the null hypothesis
𝐻𝑜: 𝜇i = 𝜇j and conclude that the means are
significantly different.

PROCEDURES USED TO MAKE MULTIPLE
COMPARISONS USING TUKEY'S TEST

1. Arrange the sample means in ascending order.
2. Compute the pairwise differences 𝑥̄i − 𝑥̄j,
where 𝑖 ≠ j. [nC2]
3. Compute the test statistic for each pairwise
difference.
[n here is the number of samples]
4. Determine the critical value.
5. Determine the decision.
6. Determine the conclusion.

Example:

Suppose that there is sufficient evidence to reject
𝐻𝑜: 𝜇1 = 𝜇2 = 𝜇3 = 𝜇4 using a one-way ANOVA. The
mean square error from the ANOVA is determined to
be s2 = 26.2. The sample means are 𝑥1 = 42.6,
𝑥2 = 49.1, 𝑥3 = 46.8, 𝑥4 = 63.7, with
𝑛1 = 𝑛2 = 𝑛3 = 𝑛4 = 6.
Use Tukey's test to determine which pairwise means
are significantly different using a familywise error of
0.05.

Solution:

Step 7:
Alternatively, if you have a continuous
covariate, you need a two-way ANCOVA.
[a covariate is a variable that also influences the
outcome of a study. It can be continuous, like age,
height, or any variable that is not of primary interest
in the study. A covariate can also be a random event
such as death, disaster, and so on]
Covariance in statistics measures the extent to
which two variables vary linearly. It reveals
whether two variables move in the same or
opposite directions.
Covariance is like variance in that it measures
variability. While variance focuses on the
variability of a single variable around its mean,
covariance examines the co-variability of two
variables around their respective means. A high
value suggests an association exists between the
variables, indicating that they tend to vary together.
[the following details are just further explanations of
ANCOVA. NOTE that this is not part of the curriculum
(no need to study this!)]
WHEN TO USE ANCOVA?
ANCOVA, or the analysis of covariance, is a
powerful statistical method that analyzes the differences
between three or more group means while controlling
for the effects of at least one continuous covariate.
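As a hedged sketch of the adjustment ANCOVA performs (not a full ANCOVA model), the snippet below adjusts each group's mean outcome for a covariate using the pooled within-group slope. The group names and data are invented, and deliberately constructed so that the entire raw difference between the groups is explained by the covariate (e.g., a pretest score):

```python
groups = {
    "Method A": {"x": [1, 2, 3], "y": [6, 8, 10]},    # x = covariate, y = outcome
    "Method B": {"x": [3, 4, 5], "y": [10, 12, 14]},
}

def mean(v):
    return sum(v) / len(v)

# Pooled within-group regression slope of y on x.
sxy = sxx = 0.0
for g in groups.values():
    mx, my = mean(g["x"]), mean(g["y"])
    sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(g["x"], g["y"]))
    sxx += sum((xi - mx) ** 2 for xi in g["x"])
b_within = sxy / sxx

grand_x = mean([xi for g in groups.values() for xi in g["x"]])

# Covariate-adjusted means: each group's mean outcome if every group had
# started at the same (grand-mean) covariate level.
adjusted = {
    name: mean(g["y"]) - b_within * (mean(g["x"]) - grand_x)
    for name, g in groups.items()
}
print(adjusted)
```

Here the raw group means differ by 4, but the adjusted means coincide: the apparent group effect was entirely a covariate (preexisting) difference.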
It is a potent tool because it adjusts for the effects
of covariates in the model. By isolating the effect of the
categorical independent variable on the dependent
variable, researchers can draw more accurate and
reliable conclusions from their data.

ANCOVA VS. ANOVA

ANCOVA is an extension of ANOVA. While
ANOVA can compare the means of three or more
groups, it cannot control for covariates. ANCOVA
builds on ANOVA by introducing one or more
covariates into the model.

In an ANCOVA model, you must specify the
dependent variable (continuous outcome), at least one
categorical variable that defines the comparison groups,
and a covariate.

ANCOVA is simply an ANOVA model that
includes at least one covariate.

Covariates are continuous independent variables
that influence the dependent variable but are not of
primary interest to the study. Additionally, the
experimenters do not control the covariates. Instead,
they only observe and record their values. In contrast,
they do control the categorical factors and set them at
specific values for the study.

Researchers refer to covariates as nuisance
variables [i.e., annoying] because they:
o Are uncontrolled conditions in the
experiment.
o Can influence the outcome.

This unfortunate combination of attributes allows
covariates to introduce both imprecision and bias into
the results. Even though the researchers aren't interested
in these variables, they must find a way to deal with
them. That's where ANCOVA comes in!

Fortunately, you can use an ANCOVA model to
control covariates statistically. Simply put, ANCOVA
removes the effects of the covariates on the dependent
variable, allowing for a more accurate assessment of the
relationship between the categorical factors and the
outcome.

ANCOVA does the following:
o Increases statistical power and precision by
accounting for some of the within-group
variability.
o Removes confounder bias by adjusting for
preexisting differences between groups.

Example:

Suppose we want to determine which of three
teaching methods is the best by comparing their mean
test scores. We can include a pretest score as a covariate
to account for participants having different starting skill
levels.

[going back to two-way ANOVA]

TWO-WAY ANOVA

The two-way ANOVA compares the mean
differences between groups that have been split on two
independent variables (called factors).

The primary purpose of a two-way ANOVA is to
understand if there is an interaction between the two
independent variables on the dependent variable.

The interaction term in a two-way ANOVA
informs you whether the effect of one of your
independent variables on the dependent variable is the
same for all values of your other independent variable
(and vice versa).

For example, you could use a two-way ANOVA to
understand whether there is an interaction between
gender and educational level on test anxiety amongst
university students, where gender (males/females) and
education level (undergraduate/postgraduate) are your
independent variables, and test anxiety is your
dependent variable.

TWO-WAY ANOVA TABLE

Reminders:

If you have three independent variables rather
than two, you need a three-way ANOVA.

Whenever conducting a two-way ANOVA, we
always first test the hypothesis regarding the
interaction effect. If the null hypothesis of no
interaction is rejected, we do not interpret the results
of the hypotheses involving the main effects. If the
interaction term is NOT significant, then we examine
the two main effects separately. [WHY?!]
[Perhaps it is favorable to have an interaction, since
you can then test the variables directly, as was done in
the Excel example]

ONE-WAY VS. TWO-WAY ANOVA
HYPOTHESES REGARDING INTERACTION
EFFECT

HYPOTHESES REGARDING MAIN EFFECTS

How do you know if there's an interaction between
two factors?

Example:

Factor A has two levels and Factor B has two
levels. In the left box, when Factor A is at level 1,
changing Factor B changes the response by 3 units.
When Factor A is at level 2, changing Factor B again
changes the response by 3 units. Similarly, when
Factor B is at level 1, changing Factor A changes the
response by 2 units. When Factor B is at level 2,
changing Factor A again changes the response by
2 units. There is no interaction. The change in the true
average response when the level of either factor
changes from 1 to 2 is the same for each level of the
other factor. In this case, changes in the levels of the
two factors affect the true average response separately,
or in an additive manner.

The right box illustrates the idea of interaction.
When Factor A is at level 1, changing Factor B
changes the response by 3 units, but when Factor A is
at level 2, changing Factor B changes the response by
6 units. When Factor B is at level 1, changing Factor A
changes the response by 2 units, but when Factor B is
at level 2, changing Factor A changes the response by
5 units. The change in the true average response when
the levels of both factors change simultaneously from
level 1 to level 2 is 8 units, which is much larger than
the separate changes suggest. In this case, there is an
interaction between the two factors, so the effect of
simultaneous changes cannot be determined from the
individual effects of the separate changes. The change
in the true average response when the level of one
factor changes depends on the level of the other factor.
You cannot determine the separate effect of Factor A
or Factor B on the response because of the interaction.
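The two boxes described above can be reduced to numbers. The cell means below are invented to match the stated shifts (3-unit and 2-unit changes in the additive case; 3/6-unit and 2/5-unit changes in the interaction case), and the 2×2 interaction contrast is zero exactly when the effects are additive:

```python
additive = {(1, 1): 10, (1, 2): 13,      # B shifts the response by +3 at either A level
            (2, 1): 12, (2, 2): 15}      # A shifts the response by +2 at either B level

interacting = {(1, 1): 10, (1, 2): 13,   # B: +3 at A=1 but +6 at A=2
               (2, 1): 12, (2, 2): 18}   # A: +2 at B=1 but +5 at B=2

def interaction_contrast(m):
    """Zero when the two factor effects are additive (no interaction)."""
    return (m[(1, 1)] + m[(2, 2)]) - (m[(1, 2)] + m[(2, 1)])

print(interaction_contrast(additive))      # additive cells -> no interaction
print(interaction_contrast(interacting))   # nonzero -> interaction present
```

Note that in the interacting table the simultaneous change from cell (1, 1) to cell (2, 2) is 18 − 10 = 8 units, matching the description above.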
Example:

I.V.: Device and Task
D.V.: Task Completion Time
Alpha = 0.01
[there was a wrong input in the PDF file, so we use
this one instead]

The row of 'Sample' indicates the variables Task 1
and Task 2, and the p-value > 0.01, implying that we
fail to reject the null hypothesis. This means that there
is no significant difference between the variables
Task 1 and Task 2.

The row of 'Columns' indicates the variables
Device 1, Device 2, and Device 3. Moreover, the
p-value < 0.01 implies that we reject the null
hypothesis, and so there are significant differences
among Device 1, Device 2, and Device 3.

Lastly, the row of 'Interaction' shows the
interaction between the two factors (Task and Device).
Showing a p-value < 0.01, hence, there is indeed an
interaction between the said factors.

POWER ANALYSIS

Statistical power analysis must be discussed in
the context of statistical hypothesis testing. It is an
important aspect of experimental design. It allows us
to determine the sample size required to detect an
effect of a given size with a given degree of
confidence.

Power analysis is normally conducted before the
data collection. The main purpose underlying power
analysis is to help the researcher determine the
smallest sample size that is suitable to detect the effect
of a given test at a desired level of significance.

It combines statistical analysis, subject-area
knowledge, and your requirements to help you derive
the optimal sample size for your study. Statistical
power in a hypothesis test is the probability that the
test will detect an effect that actually exists.

For example, 80% power in a clinical trial means
that the study has an 80% chance of ending up with a
p-value less than 5% in a statistical test (i.e., a
statistically significant treatment effect) if there really
was an important difference between treatments.

The power of any test of statistical significance is
defined as the probability that it will reject a false null
hypothesis. Statistical power is inversely related to
beta, the probability of making a Type II error. In
short, power = 1 − beta.

Usually, a power analysis calculates the needed
sample size given some expected effect size, alpha,
and power. In most cases, the researcher is interested
in solving for the sample size, so the majority of the
work needed to do a power analysis relates to
determining the expected effect to be used in the
power analysis.

Statistical power is positively correlated with the
sample size: holding the other factors at a given level,
a larger sample size gives greater power. However,
researchers are also faced with the decision of
distinguishing between statistical differences and
scientific (practical) differences.

EFFECT SIZE

The effect size is the size of the change in the
parameter of interest that can be detected by an
experiment. For example, in a coin-tossing
experiment, the parameter of interest is P, the
probability of a head. In calculating the sample size,
we would need to state what the baseline probability is
(probably 0.5) and how large a deviation from P we
want to detect with our experiment. We would expect
that it would take a much larger sample size to detect a
deviation of 0.01 than it would to detect a deviation of
0.04. Selecting an appropriate effect size is difficult
because it is subjective. The question that must be
answered is: What size change in the parameter would
be of interest? Note that, in power analysis, the effect
size is not the actual difference; instead, the effect size
is the change in the parameter that is of interest or is to
be detected.

[going back to Regression Analysis]

Note that one of the essences of regression
analysis is that we can establish a model in such a way
that we can predict certain values of our dependent
variable (it is required to have one dependent variable
only).

If the number of dependent variables exceeds one,
then we need a multivariate test; this means that if
there is more than one D.V. with three or more
independent groups, then we can't use ANOVA;
instead, we will use a multivariate ANOVA
(MANOVA). Similar logic applies to other tests in
which variances are used.

CLASSIFICATION OF LINEAR MODELS

Postulated linear models depend on the type of
dependent and independent variables used to indicate
the system to be modeled.

PURPOSE OF MODELING

To understand the mechanism that generates
the data.
To predict the values of the dependent
variable given the independent variables.
To optimize the response indexed by the
dependent variable.
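The coin-tossing point made under EFFECT SIZE above can be made concrete with a normal-approximation sample-size sketch. The formula and z-values (1.95996 for a two-sided alpha of 0.05, 0.84162 for 80% power) are standard for a one-sample test of a proportion, but this is an illustration of the effect-size trade-off, not part of these notes' worked material:

```python
import math

def sample_size(p0, delta, z_alpha=1.95996, z_beta=0.84162):
    """Approximate n needed to detect a shift from p0 to p0 + delta."""
    p1 = p0 + delta
    num = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / delta) ** 2)

n_tiny_effect = sample_size(0.5, 0.01)    # detect a deviation of 0.01
n_small_effect = sample_size(0.5, 0.04)   # detect a deviation of 0.04
print(n_tiny_effect, n_small_effect)
```

As the text predicts, detecting the 0.01 deviation requires a far larger sample than detecting the 0.04 deviation.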
VARIABLES
DEPENDENT (response/endogenous) –
whose variability is being studied or explained
within the system.
INDEPENDENT (regressor/exogenous) –
used to explain the behavior of the dependent
variable. The variability of this variable is
explained outside the system.
TYPES OF DATA
Cross-section – different stations measured at
the same point in time.
Time series – one or more stations measured
at different points in time.