Topic - chapter 11 - ANOVA


Chapter 11: Analysis of Variance (ANOVA) Nguyen Thi Thu Van - November 25, 2023

One-way analysis of variance

One-way ANOVA is a technique used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups.
For example, to allocate resources and fixed costs correctly, hospital management needs to test whether a
patient’s length of stay (LOS) depends on the diagnostic-related group (DRG) code. Consider the case of a
bone fracture. LOS is a numerical response variable (measured in hours). The hospital organizes the data by
using five diagnostic codes for type of fracture (facial, radius or ulna, hip or femur, other lower extremity, all
other). Type of fracture is a categorical variable.

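Group means:
$$\bar{y}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} y_{ij}$$

Overall mean:
$$\bar{y} = \frac{1}{n} \sum_{j=1}^{c} \sum_{i=1}^{n_j} y_{ij} = \frac{1}{n} \sum_{j=1}^{c} n_j \times \frac{\sum_{i=1}^{n_j} y_{ij}}{n_j} = \frac{1}{n} \sum_{j=1}^{c} n_j \bar{y}_j$$

Sum of squares = Sum of squares between groups + Sum of squares within groups:
$$\sum_{j=1}^{c} \sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2 = \sum_{j=1}^{c} n_j \times (\bar{y}_j - \bar{y})^2 + \sum_{j=1}^{c} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2$$

The first term on the right (between groups) is explained by treatments; the second term (within groups) is unexplained random error.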
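The decomposition can be checked numerically. The sketch below is only an illustration: the three groups and their observations are made up, and the names `sst`, `ssb`, and `sse` (total, between-group, and within-group sums of squares) are labels chosen here, not taken from the chapter.

```python
# Hypothetical data: three treatments with made-up observations.
groups = {
    "T1": [4.0, 5.0, 6.0],
    "T2": [7.0, 8.0, 9.0],
    "T3": [1.0, 2.0, 3.0, 4.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return (total, between-group, within-group) sums of squares."""
    all_obs = [y for obs in groups.values() for y in obs]
    grand_mean = mean(all_obs)
    # Total: every observation against the overall mean.
    sst = sum((y - grand_mean) ** 2 for y in all_obs)
    # Between groups: each group mean against the overall mean, weighted by n_j.
    ssb = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values())
    # Within groups: every observation against its own group mean.
    sse = sum((y - mean(obs)) ** 2 for obs in groups.values() for y in obs)
    return sst, ssb, sse


sst, ssb, sse = sums_of_squares(groups)
print(f"total = {sst:.3f}, between = {ssb:.3f}, within = {sse:.3f}")
print("decomposition holds:", abs(sst - (ssb + sse)) < 1e-9)
```

Unequal group sizes are used on purpose, to show that the weighting by group size in the between-groups term is what makes the identity hold.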

Treatments    T_1                 T_2                 …    T_c
              y_11                y_12                …    y_1c
              y_21                y_22                …    y_2c
              ⋮                   ⋮                        ⋮
Group size    n_1 observations    n_2 observations    …    n_c observations
Mean          ȳ_1                 ȳ_2                 …    ȳ_c

If the treatment means do not differ greatly from the grand mean, SSB will be relatively small and SSE will be relatively large (and conversely). The sums SSB and SSE may be used to test the hypothesis that the treatment means differ from the grand mean. However, to adjust for group sizes, we first divide each sum of squares by its degrees of freedom: MSB = SSB/(c − 1) and MSE = SSE/(n − c).

ANOVA assumptions:
• Observations on Y are independent.
• Populations being sampled are normal.
• Populations being sampled have equal variances.
In general, ANOVA is considered fairly robust against violations of the equal-variances assumption as long as each group has the same sample size. However, if the sample sizes are not the same and this assumption is severely violated, you could instead run a Kruskal-Wallis test, which is the non-parametric version of the one-way ANOVA.

F-test method [Ronald A. Fisher in the 1930s]
Step 1: State the hypotheses: $H_0: \mu_1 = \mu_2 = \dots = \mu_c$ vs $H_1$: Not all the means are equal.
Step 2: Specify the decision rule:
• Numerator: $df_1 = c - 1$
• Denominator: $df_2 = n - c$
• Find the critical value $F_{crit}$ = F.INV.RT($\alpha$, $df_1$, $df_2$)
Step 3: Calculate $F = \frac{MSB}{MSE} \equiv \frac{MSA}{MSE}$ (use the ANOVA table of interest) and make the decision: reject $H_0$ if $F > F_{crit}$.
Alternatively, find the p-value = F.DIST.RT($F$, $df_1$, $df_2$) and reject $H_0$ if the p-value is less than $\alpha$.
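The three steps of the F test can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own procedure: the data are made up, the function name `one_way_anova` is hypothetical, and the critical value is typed in by hand as an approximate table value standing in for the spreadsheet call F.INV.RT(0.05, 2, 7).

```python
def one_way_anova(samples):
    """Return df1, df2, MSB, MSE and F for a list of samples (one list per group)."""
    c = len(samples)                      # number of groups
    n = sum(len(s) for s in samples)      # overall sample size
    grand_mean = sum(sum(s) for s in samples) / n
    group_means = [sum(s) / len(s) for s in samples]
    # Sum of squares between groups and within groups.
    ssb = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, group_means))
    sse = sum((y - m) ** 2 for s, m in zip(samples, group_means) for y in s)
    df1, df2 = c - 1, n - c
    # Mean squares: each sum of squares divided by its degrees of freedom.
    msb, mse = ssb / df1, sse / df2
    return {"df1": df1, "df2": df2, "MSB": msb, "MSE": mse, "F": msb / mse}


# Hypothetical data: three groups with made-up observations.
samples = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0, 4.0]]
table = one_way_anova(samples)

# Assumption: approximate 5% critical value for (2, 7) degrees of freedom,
# read from a standard F table; look up the value for your own alpha and df.
F_CRIT = 4.74
reject_h0 = table["F"] > F_CRIT

print(table)
print("reject H0:", reject_h0)
```

The p-value route is left out because the Python standard library has no F distribution; a spreadsheet's F.DIST.RT or a statistics package would supply it.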
Post hoc tests with ANOVA: To determine exactly which group means are different, we can perform a Tukey post hoc test [John W. Tukey, around 1950]. It is similar to a two-sample t-test except that it pools the variances for all $c$ samples.
- Hypotheses: $H_0: \mu_j = \mu_k$ vs $H_1: \mu_j \neq \mu_k$
- The $T$-statistic: $T_{calc} = \frac{|\bar{y}_j - \bar{y}_k|}{\sqrt{MSE\left(\frac{1}{n_j} + \frac{1}{n_k}\right)}}$; $T_{crit} = T_{c,\,n-c}$ [$c$ = number of groups; $n$ = the overall sample size. The d.f. for the numerator is $c$; for the denominator it is $n - c$]
- We reject $H_0$ if $T_{calc} > T_{crit}$ (critical values: Table T).

Test for homogeneity of variances: ANOVA assumes that observations on the response variable are from normally distributed populations that have the same variance. However, few populations meet these requirements perfectly, and unless the sample is quite large, a test for normality is impractical. But we can easily test the assumption of homogeneous (equal) variances.
Hartley’s test checks for unequal variances for $c$ groups: $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_c^2$; $H_1$: Not all the variances are equal.
If $c = 2$, use the $F$ test to compare variances [in effect, a two-tailed test]. Otherwise, use Hartley’s test statistic $H = \frac{s_{max}^2}{s_{min}^2}$ with $df_1 = c$; $df_2 = \frac{n}{c} - 1$ [$df_2$ is rounded down to the next lower integer if it is not an integer].
We reject $H_0$ if $H_{calc} > H_{crit}$ (critical values: Table H).
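Both test statistics are simple to compute once the group data are available. The following is a rough sketch under stated assumptions: the data are made up, the helper name `tukey_t` is hypothetical, and the critical values from Table T and Table H are not reproduced, so the code stops at the statistics and their degrees of freedom.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Hypothetical data: three treatments with made-up observations.
samples = {
    "T1": [4.0, 5.0, 6.0],
    "T2": [7.0, 8.0, 9.0],
    "T3": [1.0, 2.0, 3.0, 4.0],
}

c = len(samples)                              # number of groups
n = sum(len(s) for s in samples.values())     # overall sample size

# MSE pools the sample variances of all c groups (SSE divided by n - c).
mse = sum((len(s) - 1) * variance(s) for s in samples.values()) / (n - c)


def tukey_t(a, b):
    """Tukey T statistic for one pair of groups, using the pooled MSE."""
    return abs(mean(a) - mean(b)) / sqrt(mse * (1 / len(a) + 1 / len(b)))


# One T statistic per pair of groups; compare each against T_crit from Table T.
tukey = {(j, k): tukey_t(samples[j], samples[k]) for j, k in combinations(samples, 2)}

# Hartley's statistic: largest sample variance over smallest sample variance.
variances = [variance(s) for s in samples.values()]
hartley_h = max(variances) / min(variances)
hartley_df1 = c
hartley_df2 = n // c - 1                      # n/c - 1, rounded down

for pair, t in tukey.items():
    print(pair, round(t, 4))
print("H =", round(hartley_h, 4), "df1 =", hartley_df1, "df2 =", hartley_df2)
```

`statistics.variance` returns the sample variance (divisor n − 1), which is what both the pooled MSE and Hartley's ratio require.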
Two-way analysis of variance

Two-way ANOVA is a technique used to determine the effect of two nominal predictor variables on a continuous outcome variable.
For example, a numerical response variable (paint viscosity) may vary both by temperature (Factor A) and by
paint supplier (Factor B). Three different temperature settings (A1, A2, A3) were tested on shipments from three
different suppliers (B1, B2, B3).
