Module03 Anova

The document discusses ANOVA (Analysis of Variance) and its application in statistical analysis, particularly in determining if different settings significantly affect outcomes, such as distance traveled in various settings. It covers key concepts including the Central Limit Theorem, important sampling distributions, assumptions for ANOVA, and methods for hypothesis testing like Tukey’s test and Least Significant Difference. Additionally, it emphasizes the importance of experimental design and the distinction between fixed and random effects models.

Uploaded by

A216Kasish Agarwal

ANOVA

Prof. S. Roychowdhury
The Statapult
Example: More than 2 Levels

• More than 2 samples

• Distance travelled in setting 1 (inches)
11,13,12,10,11
• Distance travelled in setting 2 (inches)
17,14,13,15,15
• Distance travelled in setting 3 (inches)
19,17,21,23,18
Question: Does setting significantly affect travelling distance?
Plasma Etching

Etch rate by observation number
Power (W)     1     2     3     4     5
160         575   542   530   539   570
180         565   593   590   579   610
200         600   651   610   637   629
220         725   700   715   685   710
1 Factor: More than 2 Levels

• Concept of ANOVA
• ANOVA Table
• Formulas
• Conclusion
Central Limit Theorem (CLT)
• Definition: If $x_1, \dots, x_n$ are independent random variables with means $\mu_i$ and variances $\sigma_i^2$, and if $y = x_1 + \dots + x_n$, then the distribution of
$$\frac{y - \sum_{i=1}^{n} \mu_i}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$$
approaches the $N(0, 1)$ distribution as $n$ approaches infinity.
(Montgomery D.C., Introduction to Statistical Quality Control)

• It implies that the sum of $n$ independently distributed random variables is approximately normal, regardless of the distribution of the individual variables.

• If the $x_i$ are independent and identically distributed (IID), and the distribution of each $x_i$ does not depart radically from the normal distribution, then the CLT works quite well for $n \ge 3$ or $4$ (common in SQC problems).
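The claim about small $n$ can be checked with a short simulation. The sketch below assumes Uniform(0, 1) summands (a choice made here for illustration, not taken from the slides), standardizes the sum of $n = 4$ of them as in the definition above, and compares the share of values inside $\pm 1.96$ with the 95% expected under $N(0, 1)$.

```python
import numpy as np

def standardized_sum(n, reps, rng):
    """Draw `reps` sums of n IID Uniform(0, 1) variables and standardize them."""
    x = rng.uniform(0.0, 1.0, size=(reps, n))
    y = x.sum(axis=1)
    mu, var = 0.5, 1.0 / 12.0          # mean and variance of Uniform(0, 1)
    return (y - n * mu) / np.sqrt(n * var)

rng = np.random.default_rng(0)         # fixed seed so the run is repeatable
z = standardized_sum(n=4, reps=200_000, rng=rng)

# For an exact N(0, 1) variable, about 95% of values fall within +/-1.96.
coverage = np.mean(np.abs(z) < 1.96)
print(f"share within +/-1.96 for n = 4: {coverage:.3f}")
```

Changing `n` or swapping in a more skewed distribution shows how quickly (or slowly) the approximation improves.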
Important Sampling Distributions Derived from Normal Distribution
1. $\chi^2$ distribution: If $x_1, \dots, x_n$ are independent standard normal random variables, then $y = x_1^2 + x_2^2 + \dots + x_n^2$ follows the chi-squared distribution with $n$ degrees of freedom.
2. $t$-distribution: If $x$ is a standard normal variable and $y$ is a chi-squared random variable with $k$ degrees of freedom, and if $x$ and $y$ are independent, then the random variable
$$t = \frac{x}{\sqrt{y / k}}$$
is distributed as $t$ with $k$ degrees of freedom.
3. $F$ distribution: If $w$ and $y$ are two independent chi-squared random variables with $u$ and $v$ degrees of freedom, then the ratio
$$F = \frac{w / u}{y / v}$$
follows the $F$ distribution with $(u, v)$ degrees of freedom.
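The three constructions above can be reproduced directly from standard normal draws. The following is a minimal sketch; the sizes `n`, `k`, `u`, `v` and the seed are arbitrary illustrative choices. It compares simulated moments with the known theoretical values (mean $n$ for $\chi^2_n$, variance $k/(k-2)$ for $t_k$, mean $v/(v-2)$ for $F_{u,v}$).

```python
import numpy as np

rng = np.random.default_rng(1)
reps, n, k, u, v = 200_000, 5, 8, 3, 10    # illustrative sizes only

# chi-squared: sum of n squared independent standard normals
chi2_n = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)

# t: standard normal divided by sqrt(chi-squared / k), parts independent
x = rng.standard_normal(reps)
y = (rng.standard_normal((reps, k)) ** 2).sum(axis=1)
t_k = x / np.sqrt(y / k)

# F: ratio of two independent chi-squared variables, each divided by its dof
w = (rng.standard_normal((reps, u)) ** 2).sum(axis=1)
y2 = (rng.standard_normal((reps, v)) ** 2).sum(axis=1)
f_uv = (w / u) / (y2 / v)

print("chi-squared mean:", chi2_n.mean(), "theory:", n)
print("t variance:      ", t_k.var(), "theory:", k / (k - 2))
print("F mean:          ", f_uv.mean(), "theory:", v / (v - 2))
```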
ANOVA
Minitab: Stat-> ANOVA-> One way ANOVA,
Graphs: Select Boxplot, Normal Probability Plot of Residuals

Example 3: The null hypothesis is rejected at $\alpha = 0.05$. The average distance travelled differs significantly across the settings.
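For readers without Minitab, the same one-way ANOVA can be sketched in Python. This assumes SciPy is available and uses `scipy.stats.f_oneway` on the three Statapult samples listed earlier; it is an alternative route to the F-test, not the Minitab procedure itself.

```python
from scipy import stats

# Distances travelled (inches) for the three Statapult settings
setting_1 = [11, 13, 12, 10, 11]
setting_2 = [17, 14, 13, 15, 15]
setting_3 = [19, 17, 21, 23, 18]

f0, p_value = stats.f_oneway(setting_1, setting_2, setting_3)

alpha = 0.05
print(f"F0 = {f0:.2f}, p-value = {p_value:.2g}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The boxplot and the normal probability plot of residuals selected in Minitab are not produced by this call and would need a separate plotting step.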
ANOVA
How do I explain (model) the variation?
ANOVA
Means Model
$y_{ij} = \mu_i + \epsilon_{ij}$
$i = 1, \dots, a$ levels
$j = 1, \dots, n$ observations in each level
Null hypothesis $H_0: \mu_1 = \mu_2 = \dots = \mu_a$
$H_1: \mu_i \ne \mu_j$ for at least one pair $(i, j)$

Effects Model
$y_{ij} = \mu + \tau_i + \epsilon_{ij}$
$i = 1, \dots, a$ levels
$j = 1, \dots, n$ observations in each level
Null hypothesis $H_0: \tau_1 = \tau_2 = \dots = \tau_a = 0$
$H_1: \tau_i \ne 0$ for at least one $i$
ANOVA
Linear Statistical Model
$y_{ij} = \mu + \tau_i + \epsilon_{ij}$
$\mu$: grand mean
$\tau_i$: treatment effect (effect due to level $i$)
$\epsilon_{ij}$: random error component, $N(0, \sigma^2)$
$i = 1, \dots, a$ levels
$j = 1, \dots, n$ observations in each level
ANOVA Calculations

$$\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$

$SS_T = SS_{Treatments} + SS_E$

Assumption: Each of the $a$ populations is normally distributed.
If $H_0$ is true, then $F_0$ (see ANOVA Table) follows the $F$ distribution with $a - 1$ and $a(n - 1)$ degrees of freedom.
Procedure: Reject $H_0$ if $F_0 > F_{\alpha, a-1, a(n-1)}$ or, equivalently, if the P-value $< \alpha$ (here $\alpha = 0.05$), meaning that the differences among the average responses at different levels are significant.
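The decomposition and the test statistic can be computed by hand in a few lines. The sketch below applies the formulas above to the Statapult data from the earlier slide (3 settings, 5 observations each); the variable names are illustrative, and NumPy and SciPy are assumed to be installed.

```python
import numpy as np
from scipy import stats

# rows = levels (a = 3 settings), columns = observations (n = 5 per level)
y = np.array([[11, 13, 12, 10, 11],
              [17, 14, 13, 15, 15],
              [19, 17, 21, 23, 18]], dtype=float)
a, n = y.shape
level_means = y.mean(axis=1)           # y-bar_i.
grand_mean = y.mean()                  # y-bar_..

ss_total = ((y - grand_mean) ** 2).sum()
ss_treat = n * ((level_means - grand_mean) ** 2).sum()
ss_error = ((y - level_means[:, None]) ** 2).sum()

ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (a * (n - 1))
f0 = ms_treat / ms_error

# P-value: upper-tail probability of F0 under F(a - 1, a(n - 1))
p_value = stats.f.sf(f0, a - 1, a * (n - 1))

print(f"SS_T = {ss_total:.3f}, SS_Treatments = {ss_treat:.3f}, SS_E = {ss_error:.3f}")
print(f"F0 = {f0:.2f}, p-value = {p_value:.2g}")
```

The printed sums of squares make it easy to confirm that the treatment and error components add up to the total.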
Assumptions for ANOVA
• The model errors are assumed to be independent, normally distributed random variables with mean 0 and variance $\sigma^2$
• $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
• $\epsilon_{ij} \sim N(0, \sigma^2)$
• Observations are mutually independent
• The variance $\sigma^2$ is assumed to be the same for all levels of the factor.
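Statistical Analysis
• $SS_T$ is a sum of squares of normally distributed random variables.
• It can be shown that $SS_T / \sigma^2 \sim \chi^2(N - 1)$, where $N = an$ is the total number of observations
• It can be shown that $SS_E / \sigma^2 \sim \chi^2(N - a)$, and
• $SS_{Treatments} / \sigma^2 \sim \chi^2(a - 1)$ if $H_0: \tau_i = 0 \ \forall i$ is true
• But the three sums of squares are not all independent, because $SS_{Treatments}$ and $SS_E$ add up to $SS_T$.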
Cochran’s Theorem
• Let $Z_i$ be $NID(0, 1)$ for $i = 1, 2, \dots, \nu$ and
$$\sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \dots + Q_s$$
where $s \le \nu$ and $Q_i$ has $\nu_i$ degrees of freedom.
Then $Q_1, \dots, Q_s$ are independent chi-square random variables with $\nu_1, \dots, \nu_s$ degrees of freedom respectively, if and only if
$$\nu = \nu_1 + \dots + \nu_s$$
Cochran’s Theorem
Cochran’s theorem implies that $SS_{Treatments} / \sigma^2$ and $SS_E / \sigma^2$ are independently distributed chi-square random variables.

So if $H_0$ of no difference in treatment means is true, the ratio
$$F_0 = \frac{SS_{Treatments} / (a - 1)}{SS_E / (N - a)} = \frac{MS_{Treatments}}{MS_E}$$
is distributed as $F$ with $(a - 1)$ and $(N - a)$ degrees of freedom.
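The degrees-of-freedom condition in Cochran's theorem can be checked explicitly for the one-way layout. The short derivation below assumes $N = an$ total observations and, as a numerical instance, the Statapult example with $a = 3$ and $n = 5$.

```latex
\begin{align*}
\underbrace{N - 1}_{\text{dof of } SS_T}
  &= \underbrace{(a - 1)}_{\text{dof of } SS_{Treatments}}
   + \underbrace{(N - a)}_{\text{dof of } SS_E} \\
\text{Statapult: } a = 3,\ n = 5,\ N = 15:\quad
  14 &= 2 + 12
\end{align*}
```

Because the degrees of freedom add up, the theorem applies under $H_0$, and the two scaled sums of squares can be treated as independent chi-square variables when forming $F_0$.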
ANOVA Calculations
ANOVA F-crit: $F_{0.05, 2, 12}$
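The critical value $F_{0.05, 2, 12}$ is usually read from a table; it can also be computed. This small sketch assumes SciPy and uses the upper-tail convention, so the 0.05 critical value is the 0.95 quantile of the $F$ distribution with 2 and 12 degrees of freedom. The helper `reject_h0` is a hypothetical name introduced here for illustration.

```python
from scipy import stats

alpha = 0.05
dof_treatments, dof_error = 2, 12      # a - 1 and N - a for a = 3, N = 15

# Upper-tail critical value: P(F > f_crit) = alpha
f_crit = stats.f.ppf(1 - alpha, dof_treatments, dof_error)
print(f"F_crit = {f_crit:.2f}")

def reject_h0(f0):
    """Return True when the observed statistic exceeds the critical value."""
    return f0 > f_crit
```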
Key Considerations for ANOVA
• Experiments have to be performed in random order so that the environment in which the treatments are applied is as uniform as possible. The experimental design should be a completely randomized design.
• When the $a$ treatment levels are specifically chosen by the experimenter, the conclusions cannot be extended to similar treatments that were not considered. This is called the Fixed Effects Model.
• When the $a$ treatment levels are chosen randomly from a larger population of treatments, the $\tau_i$ (treatment effects) are random variables and we try to estimate the variability in the $\tau_i$. This is called the Random Effects Model.
Key Considerations for ANOVA
• In the Fixed Effects Model (FEM) we test hypotheses about the treatment means.
• The conclusions for FEM apply only to the factor levels considered.
• The model parameters ($\mu, \tau_i, \sigma^2$) can be estimated from FEM.
• In the Random Effects Model (REM), conclusions can be extended to all treatments in the population.
• In REM, the $\tau_i$ are random variables, and their variability is estimated.
Which Factors are Significant

Flowchart: Does the F-test find significance? If no, terminate. If yes, perform pairwise comparisons (“LSD” or Tukey’s test), then terminate.
Least Significant Difference
▪ $LSD = t_{1-\alpha/2,\ \text{dof of residuals}} \sqrt{MS_E \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$, where $n_i$ and $n_j$ are the sample sizes of the two levels being compared

▪ Check if $|\bar{y}_i - \bar{y}_j| > LSD$ for all $i, j \in \{1, \dots, a\}$, $i \ne j$

▪ You can make plots to show which levels are different
[Figure: plot comparing levels 1, 2 and 3]

Final conclusion: Setting 3 provides significantly more travel than settings 1 and 2. There is no significant difference among the average distances travelled as a result of settings 1 and 2.
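The LSD check is mechanical enough to script. The sketch below applies the formula to the plasma etching data given earlier in these slides (4 power levels, 5 observations each); the variable names are illustrative, and NumPy and SciPy are assumed.

```python
from itertools import combinations

import numpy as np
from scipy import stats

power = [160, 180, 200, 220]           # factor levels (W)
etch = np.array([[575, 542, 530, 539, 570],
                 [565, 593, 590, 579, 610],
                 [600, 651, 610, 637, 629],
                 [725, 700, 715, 685, 710]], dtype=float)
a, n = etch.shape
means = etch.mean(axis=1)
dof_error = a * (n - 1)                # degrees of freedom of the residuals
mse = ((etch - means[:, None]) ** 2).sum() / dof_error

alpha = 0.05
lsd = stats.t.ppf(1 - alpha / 2, dof_error) * np.sqrt(mse * (1 / n + 1 / n))

# Flag every pair whose mean difference exceeds the LSD
different = {(power[i], power[j]): bool(abs(means[i] - means[j]) > lsd)
             for i, j in combinations(range(a), 2)}

print(f"LSD = {lsd:.2f}")
for pair, flag in different.items():
    print(pair, "different" if flag else "not significantly different")
```

With unequal sample sizes the `1 / n + 1 / n` term would be replaced by the reciprocals of the two group sizes, pair by pair.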
Tukey’s Test
• When the null hypothesis is rejected, all pairs of means are compared: $H_0: \mu_i = \mu_j$; $H_1: \mu_i \ne \mu_j$
• Tukey’s procedure has significance level $\alpha$ when sample sizes are equal and at most $\alpha$ when sample sizes are not equal. It keeps the family error rate at the selected level $\alpha$.
• Tukey’s procedure uses the Studentized Range Distribution
• Studentized Range Statistic: $q = \dfrac{\bar{y}_{max} - \bar{y}_{min}}{\sqrt{MS_E / n}}$
• Two means are significantly different if $|\bar{y}_{i.} - \bar{y}_{j.}| > T_\alpha = q_{1-\alpha}(a, f) \sqrt{\dfrac{MS_E}{n}}$,
• where $a$ = number of treatment levels and $f$ is the dof associated with $MS_E$.
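Tukey's threshold $T_\alpha$ can be computed without a table, assuming a SciPy release recent enough to provide `scipy.stats.studentized_range`. The sketch below uses the plasma etching data from earlier in these slides with equal sample sizes; the names are illustrative.

```python
from itertools import combinations

import numpy as np
from scipy.stats import studentized_range

power = [160, 180, 200, 220]           # factor levels (W)
etch = np.array([[575, 542, 530, 539, 570],
                 [565, 593, 590, 579, 610],
                 [600, 651, 610, 637, 629],
                 [725, 700, 715, 685, 710]], dtype=float)
a, n = etch.shape
means = etch.mean(axis=1)
f = a * (n - 1)                        # dof associated with MSE
mse = ((etch - means[:, None]) ** 2).sum() / f

alpha = 0.05
q_crit = studentized_range.ppf(1 - alpha, a, f)   # q_{1-alpha}(a, f)
t_alpha = q_crit * np.sqrt(mse / n)

print(f"q = {q_crit:.2f}, T_alpha = {t_alpha:.2f}")
for i, j in combinations(range(a), 2):
    diff = means[i] - means[j]
    verdict = "different" if abs(diff) > t_alpha else "not different"
    # Simultaneous confidence interval for mu_i - mu_j
    print(power[i], power[j], f"[{diff - t_alpha:.1f}, {diff + t_alpha:.1f}]", verdict)
```

Since $T_\alpha$ is larger than the LSD for the same data, Tukey's test is the more conservative of the two pairwise procedures.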
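Confidence Intervals
• $100(1 - \alpha)\%$ confidence intervals for all pairs of means may be constructed:
$$\bar{y}_{i.} - \bar{y}_{j.} - q_{1-\alpha}(a, f) \sqrt{\frac{MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + q_{1-\alpha}(a, f) \sqrt{\frac{MS_E}{n}}$$
For unequal sample sizes:
Check if the absolute difference of the sample means is greater than
$$T_\alpha = \frac{q_{1-\alpha}(a, f)}{\sqrt{2}} \sqrt{MS_E \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$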
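Estimating Model Parameters
• $y_{ij} = \mu + \tau_i + \epsilon_{ij}$ for $i = 1, \dots, a$; $j = 1, \dots, n$
• The model parameters can be estimated as below:
• $\hat{\mu} = \bar{y}_{..}$
• $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$ for $i = 1, \dots, a$
• $i$th treatment mean: $\hat{\mu}_i = \bar{y}_{i.}$
Confidence interval of the $i$th treatment mean:
$$\bar{y}_{i.} - t_{1-\alpha/2,\, N-a} \sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{1-\alpha/2,\, N-a} \sqrt{\frac{MS_E}{n}}$$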
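These estimates follow directly from the data. As a minimal sketch, the code below computes $\hat{\mu}$, the $\hat{\tau}_i$ and a 95% confidence interval for each treatment mean, using the plasma etching data from earlier in these slides; variable names are illustrative.

```python
import numpy as np
from scipy import stats

power = [160, 180, 200, 220]           # factor levels (W)
etch = np.array([[575, 542, 530, 539, 570],
                 [565, 593, 590, 579, 610],
                 [600, 651, 610, 637, 629],
                 [725, 700, 715, 685, 710]], dtype=float)
a, n = etch.shape
N = a * n

mu_hat = etch.mean()                   # grand mean, y-bar_..
level_means = etch.mean(axis=1)        # mu_hat_i = y-bar_i.
tau_hat = level_means - mu_hat         # estimated treatment effects

mse = ((etch - level_means[:, None]) ** 2).sum() / (N - a)
alpha = 0.05
half_width = stats.t.ppf(1 - alpha / 2, N - a) * np.sqrt(mse / n)
ci_low = level_means - half_width
ci_high = level_means + half_width

print(f"mu_hat = {mu_hat:.2f}")
for p, t, lo, hi in zip(power, tau_hat, ci_low, ci_high):
    print(f"{p} W: tau_hat = {t:+.2f}, 95% CI for mean = [{lo:.2f}, {hi:.2f}]")
```

The estimated effects sum to zero by construction, which is a quick sanity check on the arithmetic.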
Model Adequacy Checking
• To check the assumption of normality of the error term, find the residuals: $e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$
• The residuals should be structureless.
• A normal probability plot of the residuals is an effective way to check the assumption of normality
• Check for outliers with standardized residuals
$$d_{ij} = \frac{e_{ij}}{\sqrt{MS_E}}$$
About 68% of the $d_{ij}$ should be within $\pm 1$, about 95% within $\pm 2$ and almost all (99.73%) within $\pm 3$
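The residual checks can be sketched numerically before drawing any plot. The code below computes the residuals and standardized residuals for the Statapult data from the earlier slide and reports the share falling inside $\pm 1$, $\pm 2$ and $\pm 3$; with only 15 observations these shares are a rough guide rather than a formal test.

```python
import numpy as np

# rows = settings, columns = observations
y = np.array([[11, 13, 12, 10, 11],
              [17, 14, 13, 15, 15],
              [19, 17, 21, 23, 18]], dtype=float)
a, n = y.shape

fitted = y.mean(axis=1, keepdims=True)     # y-hat_ij = y-bar_i.
residuals = y - fitted                     # e_ij
mse = (residuals ** 2).sum() / (a * (n - 1))
d = residuals / np.sqrt(mse)               # standardized residuals d_ij

shares = {limit: float(np.mean(np.abs(d) <= limit)) for limit in (1, 2, 3)}
for limit, share in shares.items():
    print(f"share of |d_ij| <= {limit}: {share:.2f}")

# Any |d_ij| above 3 would be a candidate outlier
outliers = np.argwhere(np.abs(d) > 3)
print("candidate outliers (row, column):", outliers.tolist())
```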
Variance Stabilizing Transformation
Observations              Transformation
Poisson Distribution      Square root: $y'_{ij} = \sqrt{y_{ij}}$ or $y'_{ij} = \sqrt{1 + y_{ij}}$
Lognormal Distribution    Logarithmic: $y'_{ij} = \log y_{ij}$
Binomial Distribution     Arcsin: $y'_{ij} = \arcsin \sqrt{y_{ij}}$
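The three transformations in the table are one-liners in NumPy. In this sketch the function names are hypothetical, the logarithm is taken as the natural log (the slide does not state the base), and the arcsin transformation is assumed to receive proportions between 0 and 1.

```python
import numpy as np

def sqrt_transform(y, add_one=False):
    """Square-root transformation for Poisson-like counts."""
    y = np.asarray(y, dtype=float)
    return np.sqrt(1.0 + y) if add_one else np.sqrt(y)

def log_transform(y):
    """Logarithmic transformation for lognormal data (natural log assumed)."""
    return np.log(np.asarray(y, dtype=float))

def arcsin_transform(p):
    """Arcsin square-root transformation; p is assumed to be a proportion in [0, 1]."""
    return np.arcsin(np.sqrt(np.asarray(p, dtype=float)))

counts = np.array([0, 1, 4, 9])
proportions = np.array([0.0, 0.25, 1.0])

print(sqrt_transform(counts))
print(sqrt_transform(counts, add_one=True))
print(arcsin_transform(proportions))
```

After transforming, the ANOVA is rerun on the transformed values and the residual checks are repeated.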
