ASSUMPTIONS
1. Your dependent variable should be measured
at the interval or ratio level (i.e., it is
continuous).
2. Your independent variable should consist of
two or more categorical, independent
groups.
3. You should have independence of
observations, which means that there is no
relationship between the observations in each
group or between the groups themselves.
4. There should be no significant outliers.
5. Your dependent variable should be
approximately normally distributed for each
category of the independent variable.
6. There needs to be homogeneity of variances.

HYPOTHESES

The analysis of variance is used to test the
hypothesis that the means of three or more populations
are the same against the alternative hypothesis that the
mean of at least one population is different from the
others.

ONE-WAY ANOVA

One-way analysis of variance (ANOVA) is a
method of testing the equality of three or more
population means by analyzing sample variances.

It is called the analysis of variance because the
test is based on the analysis of variation in the data
obtained from different samples.

PITFALLS OF REGRESSION ANALYSIS

Lacking an awareness of the assumptions
underlying least-squares regression.

Not knowing how to evaluate the assumptions.

Not knowing the alternatives to classical
regression if some assumption is violated.

Using a regression model without knowledge
of the subject matter.

STRATEGIES FOR AVOIDING PITFALLS OF
REGRESSION

Start with a scatter plot of X on Y to observe a
possible relationship.

Perform residual analysis to check the
assumptions.
- Use a histogram, stem-and-leaf display,
box-and-whisker plot, or normal
probability plot of the residuals to
uncover possible non-normality.
If any assumption is violated, use alternative
methods to least-squares regression or
alternative least-squares models
(e.g., curvilinear or multiple regression).

If there is no evidence of assumption
violation, then test for the significance of the
regression coefficients.
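The residual-analysis strategy above can be sketched in code. The following is a minimal illustration with made-up data, not any dataset from these notes: it fits a least-squares line by hand and computes the residuals, which in practice you would then inspect with a histogram or normal probability plot rather than just print.

```python
def fit_line(x, y):
    """Least-squares estimates of intercept a and slope b for y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 5.9, 8.2, 9.8]   # invented data, roughly linear in x

a, b = fit_line(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# The residuals should be centred on zero and show no pattern against x;
# non-normal or patterned residuals signal an assumption violation.
print("intercept:", round(a, 3), "slope:", round(b, 3))
print("residuals:", [round(r, 3) for r in residuals])
```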
ANALYSIS OF VARIANCE

1. One-way ANOVA
2. Two-way ANOVA
3. Tukey Test (Post Hoc Test)

Note:

The ANOVA test is applied by calculating two
estimates of the variance of the population
distributions: the variance between samples and the
variance within samples.

The variance between samples is also called the
mean square between samples, or MSB.

The variance within samples is also called the
mean square within samples, or MSW.

[MSW is the same as the MSE – the mean square due to error]

The variance between samples, MSB, gives an
estimate of variance based on the variation among the
means of samples taken from different populations.
For the example of three teaching methods, MSB will
be based on the values of the mean scores of three
samples of students taught by three different methods.
If the means of all populations under consideration are
equal, the means of the respective samples will still be
different, but the variation among them is expected to
be small, and consequently the value of MSB is
expected to be small. However, if the means of the
populations under consideration are not all equal, the
variation among the means of the respective samples
is expected to be large, and consequently the value of
MSB is expected to be large.

The variance within samples, MSW, gives an
estimate of variance based on the variation within the
data of different samples. For the example of three
teaching methods, MSW will be based on the scores of
the individual students included in the three samples
taken from the three populations.

REJECTION REGION

One-way ANOVA is always right-tailed, with the
rejection region in the right tail of the F distribution
curve.

[n is the total number of observations, not the total
number of samples; i.e., if you have 5 observations in
each of k = 6 categories (I.V.), then you have
n = (6)(5) = 30 total observations]

Example:

Suppose we have teachers at a school who have
devised three different methods to teach arithmetic.
They want to find out if these three methods produce
different mean scores. Let μ1, μ2, and μ3 be the mean
scores of all students who are taught by Methods I, II,
and III, respectively.

[To test if the three teaching methods produce different
means, we test the null hypothesis.]

Note:

Using a one-way ANOVA test, we analyze only
one factor or variable.

For instance, in the example of testing for the
equality of mean arithmetic scores of students taught
by each of the three different methods, we are
considering only one factor, which is the effect of the
different teaching methods on the scores of students.

[I.V.: Teaching Methods / D.V.: Scores of Students,
only one factor or I.V.]

Sometimes we may analyze the effects of two
factors. For example, if different teachers teach
arithmetic using these three methods, we can analyze
the effects of both teachers and teaching methods on
the scores of students. This is done by using a
two-way ANOVA.

[I.V.: Teachers and their teaching methods /
D.V.: Scores of Students, two factors or I.V.s]

Example:

Callie Cruz, Vice-President of the Nikel and Dime
Savings Bank, is reviewing employees' performance
for a possible salary increase. In evaluating tellers,
Callie decides that an important criterion is the number
of customers served each day. She expects that each
teller should handle approximately the same number
of customers daily. Otherwise, each teller should be
rewarded or penalized accordingly.

Callie randomly selects 6 business days, and the
customer traffic for each teller during these days is
recorded. The factor or variable of interest, then, is the
number of customers served. The sample data:

Solution:

Step 1:

Ho: All population means are equal; that is, Ms.
David, Ms. Chua, and Ms. Lim serve the same average
number of customers per day, and they are assumed to
have the same workload.

Ha: Not all the tellers are handling the same average
number of customers per day. At least one of the
tellers is performing better than the others, and at least
one of them is not performing up to the standard of the
others.

Since the test statistic (3.7805) is greater than the
CV (3.68), we reject Ho; therefore, at least one of the
tellers among David, Chua, and Lim is likely to be
handling more or fewer customers than the others.

POST HOC TESTS ON ONE-WAY ANOVA

Suppose we perform a one-way ANOVA and the
results lead us to conclude that at least one population is
different from the others. To determine which means
differ significantly, we make additional comparisons
between means. The procedures for making these
comparisons are called multiple comparison methods.
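The MSB/MSW construction described above can be sketched directly. The three groups below are invented toy data (not the teller example), chosen so the arithmetic is easy to follow:

```python
def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of independent samples."""
    k = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-samples variation: spread of the group means about the grand mean.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    msb = ssb / (k - 1)

    # Within-samples variation: spread of observations about their own group mean.
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    msw = ssw / (n - k)

    return msb, msw, msb / msw

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
msb, msw, f = one_way_anova(groups)
print(msb, msw, f)
```

The resulting F = MSB/MSW is then compared with the right-tail F critical value with (k − 1, n − k) degrees of freedom, matching the rejection region described above.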
TUKEY TEST
The Tukey test is also known as the Honestly
Significant Difference Test or the Wholly Significant
Difference Test. It is designed to compare pairs of
means after the null hypothesis of equal means has been
rejected.
It tests 𝐻𝑜: 𝜇i = 𝜇j versus 𝐻𝑎: 𝜇i ≠ 𝜇j for all means
where 𝑖 ≠ j. The goal of the test is to determine which
population means differ significantly.
Note:
The computation of the test statistic for Tukey's
test follows the same logic as the test for comparing two
means from independent sampling, but the standard
error used is not the same.
DISTRIBUTION OF TUKEY TEST
The q-test statistic follows a distribution called the
Studentized range distribution.
STANDARD ERROR

Where:
s2 = mean square error estimate (MSE) from the
one-way ANOVA
n1 = sample size from population 1
n2 = sample size from population 2

TEST STATISTIC FOR TUKEY'S TEST

CRITICAL VALUE FOR TUKEY'S TEST

The critical value for Tukey's test using a
familywise error rate 𝛼 is given by
q𝛼, v, k
Where:
v = degrees of freedom due to error (the total
sample size minus the number of means being
compared, or n − k).
[n here is the total number of observations]
k = total number of means being compared.

DECISION RULE

If 𝑞 ≥ 𝑞𝛼,v,k, reject the null hypothesis
𝐻𝑜: 𝜇i = 𝜇j and conclude that the means are
significantly different.

PROCEDURES USED TO MAKE MULTIPLE
COMPARISONS USING TUKEY'S TEST

1. Arrange the sample means in ascending order.
2. Compute the pairwise differences 𝑥̄i − 𝑥̄j,
where 𝑖 ≠ j. [nC2]
3. Compute the test statistic for each pairwise
difference.
[n here is the number of samples]
4. Determine the critical value.
5. Determine the decision.
6. Determine the conclusion.

Example:

Suppose that there is sufficient evidence to reject
𝐻𝑜: 𝜇1 = 𝜇2 = 𝜇3 = 𝜇4 using a one-way ANOVA. The
mean square error from the ANOVA is determined to
be s2 = 26.2. The sample means are 𝑥1 = 42.6,
𝑥2 = 49.1, 𝑥3 = 46.8, 𝑥4 = 63.7, with
𝑛1 = 𝑛2 = 𝑛3 = 𝑛4 = 6.
Use Tukey's test to determine which pairwise means
are significantly different using a familywise error of
0.05.

Solution:

Step 7:
Alternatively, if you have a continuous
covariate, you need a two-way ANCOVA.
[a covariate is a variable that also influences the
outcome of a study. It can be continuous, like age,
height, or any variable that is not of primary interest
in the study. A covariate can also be a random event
such as death, disaster, and so on]
Covariance in statistics measures the extent to
which two variables vary linearly. It reveals
whether two variables move in the same or
opposite directions.
Covariance is like variance in that it measures
variability. While variance focuses on the
variability of a single variable around its mean,
covariance examines the co-variability of two
variables around their respective means. A high
value suggests an association exists between the
variables, indicating that they tend to vary together.
[the following details are just further explanations of
ANCOVA. NOTE that this is not part of the curriculum
(no need to study this!)]
WHEN TO USE ANCOVA?
ANCOVA, or the analysis of covariance, is a
powerful statistical method that analyzes the differences
between three or more group means while controlling
for the effects of at least one continuous covariate.
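As a hedged sketch of the adjustment ANCOVA performs (not a full ANCOVA model), the snippet below adjusts each group's mean outcome for a covariate using the pooled within-group slope. The group names and data are invented, and deliberately constructed so that the entire raw difference between the groups is explained by the covariate (e.g., a pretest score):

```python
groups = {
    "Method A": {"x": [1, 2, 3], "y": [6, 8, 10]},    # x = covariate, y = outcome
    "Method B": {"x": [3, 4, 5], "y": [10, 12, 14]},
}

def mean(v):
    return sum(v) / len(v)

# Pooled within-group regression slope of y on x.
sxy = sxx = 0.0
for g in groups.values():
    mx, my = mean(g["x"]), mean(g["y"])
    sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(g["x"], g["y"]))
    sxx += sum((xi - mx) ** 2 for xi in g["x"])
b_within = sxy / sxx

grand_x = mean([xi for g in groups.values() for xi in g["x"]])

# Covariate-adjusted means: each group's mean outcome if every group had
# started at the same (grand-mean) covariate level.
adjusted = {
    name: mean(g["y"]) - b_within * (mean(g["x"]) - grand_x)
    for name, g in groups.items()
}
print(adjusted)
```

Here the raw group means differ by 4, but the adjusted means coincide: the apparent group effect was entirely a covariate (preexisting) difference.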
It is a potent tool because it adjusts for the effects
of covariates in the model. By isolating the effect of the
categorical independent variable on the dependent
variable, researchers can draw more accurate and
reliable conclusions from their data.

ANCOVA VS. ANOVA

ANCOVA is an extension of ANOVA. While
ANOVA can compare the means of three or more
groups, it cannot control for covariates. ANCOVA
builds on ANOVA by introducing one or more
covariates into the model.

In an ANCOVA model, you must specify the
dependent variable (continuous outcome), at least one
categorical variable that defines the comparison groups,
and a covariate.

ANCOVA is simply an ANOVA model that
includes at least one covariate.

Covariates are continuous independent variables
that influence the dependent variable but are not of
primary interest to the study. Additionally, the
experimenters do not control the covariates. Instead,
they only observe and record their values. In contrast,
they do control the categorical factors and set them at
specific values for the study.

Researchers refer to covariates as nuisance
variables [i.e., annoying] because they:
o Are uncontrolled conditions in the
experiment.
o Can influence the outcome.

This unfortunate combination of attributes allows
covariates to introduce both imprecision and bias into
the results. Even though the researchers aren't interested
in these variables, they must find a way to deal with
them. That's where ANCOVA comes in!

Fortunately, you can use an ANCOVA model to
control covariates statistically. Simply put, ANCOVA
removes the effects of the covariates on the dependent
variable, allowing for a more accurate assessment of the
relationship between the categorical factors and the
outcome.

ANCOVA does the following:
o Increases statistical power and precision by
accounting for some of the within-group
variability.
o Removes confounder bias by adjusting for
preexisting differences between groups.

Example:

Suppose we want to determine which of three
teaching methods is the best by comparing their mean
test scores. We can include a pretest score as a covariate
to account for participants having different starting skill
levels.

[going back to two-way ANOVA]

TWO-WAY ANOVA

The two-way ANOVA compares the mean
differences between groups that have been split on two
independent variables (called factors).

The primary purpose of a two-way ANOVA is to
understand if there is an interaction between the two
independent variables on the dependent variable.

The interaction term in a two-way ANOVA
informs you whether the effect of one of your
independent variables on the dependent variable is the
same for all values of your other independent variable
(and vice versa).

For example, you could use a two-way ANOVA to
understand whether there is an interaction between
gender and educational level on test anxiety amongst
university students, where gender (males/females) and
education level (undergraduate/postgraduate) are your
independent variables, and test anxiety is your
dependent variable.

TWO-WAY ANOVA TABLE

Reminders:

If you have three independent variables rather
than two, you need a three-way ANOVA.

Whenever conducting a two-way ANOVA, we
always first test the hypothesis regarding the
interaction effect. If the null hypothesis of no
interaction is rejected, we do not interpret the results
of the hypotheses involving the main effects. If the
interaction term is NOT significant, then we examine
the two main effects separately. [WHY?!]
[Perhaps it is favorable to have an interaction, since
you can then test the variables directly, as was done in
the Excel example]

ONE-WAY VS. TWO-WAY ANOVA
HYPOTHESES REGARDING INTERACTION
EFFECT

HYPOTHESES REGARDING MAIN EFFECTS

How do you know if there's an interaction between
two factors?

Example:

Factor A has two levels and Factor B has two
levels. In the left box, when Factor A is at level 1,
changing Factor B changes the response by 3 units.
When Factor A is at level 2, changing Factor B again
changes the response by 3 units. Similarly, when
Factor B is at level 1, changing Factor A changes the
response by 2 units. When Factor B is at level 2,
changing Factor A again changes the response by
2 units. There is no interaction. The change in the true
average response when the level of either factor
changes from 1 to 2 is the same for each level of the
other factor. In this case, changes in the levels of the
two factors affect the true average response separately,
or in an additive manner.

The right box illustrates the idea of interaction.
When Factor A is at level 1, changing Factor B
changes the response by 3 units, but when Factor A is
at level 2, changing Factor B changes the response by
6 units. When Factor B is at level 1, changing Factor A
changes the response by 2 units, but when Factor B is
at level 2, changing Factor A changes the response by
5 units. The change in the true average response when
the levels of both factors change simultaneously from
level 1 to level 2 is 8 units, which is much larger than
the separate changes suggest. In this case, there is an
interaction between the two factors, so the effect of
simultaneous changes cannot be determined from the
individual effects of the separate changes. The change
in the true average response when the level of one
factor changes depends on the level of the other factor.
You cannot determine the separate effect of Factor A
or Factor B on the response because of the interaction.
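The two boxes described above can be reduced to numbers. The cell means below are invented to match the stated shifts (3-unit and 2-unit changes in the additive case; 3/6-unit and 2/5-unit changes in the interaction case), and the 2×2 interaction contrast is zero exactly when the effects are additive:

```python
additive = {(1, 1): 10, (1, 2): 13,      # B shifts the response by +3 at either A level
            (2, 1): 12, (2, 2): 15}      # A shifts the response by +2 at either B level

interacting = {(1, 1): 10, (1, 2): 13,   # B: +3 at A=1 but +6 at A=2
               (2, 1): 12, (2, 2): 18}   # A: +2 at B=1 but +5 at B=2

def interaction_contrast(m):
    """Zero when the two factor effects are additive (no interaction)."""
    return (m[(1, 1)] + m[(2, 2)]) - (m[(1, 2)] + m[(2, 1)])

print(interaction_contrast(additive))      # additive cells -> no interaction
print(interaction_contrast(interacting))   # nonzero -> interaction present
```

Note that in the interacting table the simultaneous change from cell (1, 1) to cell (2, 2) is 18 − 10 = 8 units, matching the description above.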
Example:

I.V.: Device and Task
D.V.: Task Completion Time
Alpha = 0.01
[there was a wrong input in the PDF file, so we use
this one instead]

The row of 'Sample' indicates the variables Task 1
and Task 2, and the p-value > 0.01, implying that we
fail to reject the null hypothesis. This means that there
is no significant difference between the variables
Task 1 and Task 2.

The row of 'Columns' indicates the variables
Device 1, Device 2, and Device 3. Moreover, the
p-value < 0.01 implies that we reject the null
hypothesis, and so there are significant differences
among Device 1, Device 2, and Device 3.

Lastly, the row of 'Interaction' shows the
interaction between the two factors (Task and Device).
Showing a p-value < 0.01, hence, there is indeed an
interaction between the said factors.

POWER ANALYSIS

Statistical power analysis must be discussed in
the context of statistical hypothesis testing. It is an
important aspect of experimental design. It allows us
to determine the sample size required to detect an
effect of a given size with a given degree of
confidence.

Power analysis is normally conducted before the
data collection. The main purpose underlying power
analysis is to help the researcher determine the
smallest sample size that is suitable to detect the effect
of a given test at a desired level of significance.

It combines statistical analysis, subject-area
knowledge, and your requirements to help you derive
the optimal sample size for your study. Statistical
power in a hypothesis test is the probability that the
test will detect an effect that actually exists.

For example, 80% power in a clinical trial means
that the study has an 80% chance of ending up with a
p-value less than 5% in a statistical test (i.e., a
statistically significant treatment effect) if there really
was an important difference between treatments.

The power of any test of statistical significance is
defined as the probability that it will reject a false null
hypothesis. Statistical power is inversely related to
beta, the probability of making a Type II error. In
short, power = 1 − beta.

Usually, a power analysis calculates the needed
sample size given some expected effect size, alpha,
and power. In most cases, the researcher is interested
in solving for the sample size, so the majority of the
work needed to do a power analysis relates to
determining the expected effect to be used in the
power analysis.

Statistical power is positively correlated with the
sample size: holding the other factors at a given level,
a larger sample size gives greater power. However,
researchers are also faced with the decision of
distinguishing between statistical differences and
scientific (practical) differences.

EFFECT SIZE

The effect size is the size of the change in the
parameter of interest that can be detected by an
experiment. For example, in a coin-tossing
experiment, the parameter of interest is P, the
probability of a head. In calculating the sample size,
we would need to state what the baseline probability is
(probably 0.5) and how large a deviation from P we
want to detect with our experiment. We would expect
that it would take a much larger sample size to detect a
deviation of 0.01 than it would to detect a deviation of
0.04. Selecting an appropriate effect size is difficult
because it is subjective. The question that must be
answered is: What size change in the parameter would
be of interest? Note that, in power analysis, the effect
size is not the actual difference; instead, the effect size
is the change in the parameter that is of interest or is to
be detected.

[going back to Regression Analysis]

Note that one of the essences of regression
analysis is that we can establish a model in such a way
that we can predict certain values of our dependent
variable (it is required to have one dependent variable
only).

If the number of dependent variables exceeds one,
then we need a multivariate test; this means that if
there is more than one D.V. with three or more
independent groups, then we can't use ANOVA;
instead, we will use a multivariate ANOVA
(MANOVA). Similar logic applies to other tests in
which variances are used.

CLASSIFICATION OF LINEAR MODELS

Postulated linear models depend on the type of
dependent and independent variables used to indicate
the system to be modeled.

PURPOSE OF MODELING

To understand the mechanism that generates
the data.
To predict the values of the dependent
variable given the independent variables.
To optimize the response indexed by the
dependent variable.
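The coin-tossing point made under EFFECT SIZE above can be made concrete with a normal-approximation sample-size sketch. The formula and z-values (1.95996 for a two-sided alpha of 0.05, 0.84162 for 80% power) are standard for a one-sample test of a proportion, but this is an illustration of the effect-size trade-off, not part of these notes' worked material:

```python
import math

def sample_size(p0, delta, z_alpha=1.95996, z_beta=0.84162):
    """Approximate n needed to detect a shift from p0 to p0 + delta."""
    p1 = p0 + delta
    num = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((num / delta) ** 2)

n_tiny_effect = sample_size(0.5, 0.01)    # detect a deviation of 0.01
n_small_effect = sample_size(0.5, 0.04)   # detect a deviation of 0.04
print(n_tiny_effect, n_small_effect)
```

As the text predicts, detecting the 0.01 deviation requires a far larger sample than detecting the 0.04 deviation.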
VARIABLES
DEPENDENT (response/endogenous) –
whose variability is being studied or explained
within the system.
INDEPENDENT (regressor/exogenous) –
used to explain the behavior of the dependent
variable. The variability of this variable is
explained outside the system.
TYPES OF DATA
Cross-section – different stations measured at
the same point in time.
Time series – one or more stations measured
at different points in time.