Chapter 6 STAT 453 558
Michelle F. Miranda
University of Victoria
[email protected]
Office: DTB 543
Introduction
The $2^k$ factorial design is a special case of the general factorial design described in Chapter
5: it has k factors, each at two levels.
Estimation of the Effects
Practical interpretation
• The effect of A is positive, suggesting that increasing A from the low level (15%) to
the high level (25%) will increase the yield
• The effect of B is negative, suggesting that increasing the amount of catalyst added to
the process will decrease the yield
• The interaction effect seems small relative to the two main effects
ANOVA for the $2^2$ Design
• It’s convenient to write down the treatment combinations in the order (1), a, b, ab
(standard order)
$$SS_A = \frac{[ab + a - b - (1)]^2}{4n}$$
$$SS_B = \frac{[ab + b - a - (1)]^2}{4n}$$
$$SS_{AB} = \frac{[ab + (1) - a - b]^2}{4n}$$
• To compute the mean squares (MS) we need the degrees of freedom, as below
  SS          DF
  $SS_A$      1
  $SS_B$      1
  $SS_{AB}$   1
  $SS_E$      4(n-1)
  $SS_T$      4n-1
• It’s important to examine the magnitude and direction of the factor effects along with
the ANOVA, because the ANOVA does not convey this information. The effect estimates
tell us which variables are likely to be important.
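The contrast, effect, and sum-of-squares calculations for the $2^2$ design can be sketched in a few lines of Python. The treatment totals and n = 3 below are hypothetical illustrative numbers, and the function name `anova_2x2` is made up for this sketch. Each effect is taken as its contrast divided by 2n, which is the k = 2 case of dividing by $n 2^{k-1}$.

```python
def anova_2x2(one, a, b, ab, n):
    """Effects and sums of squares from the four treatment totals of a 2^2 design.

    one, a, b, ab are the totals of the n observations at (1), a, b, ab.
    """
    contrasts = {
        "A": ab + a - b - one,
        "B": ab + b - a - one,
        "AB": ab + one - a - b,
    }
    # Effect = contrast / (2n); SS = contrast^2 / (4n), each with 1 d.f.
    effects = {name: c / (2 * n) for name, c in contrasts.items()}
    ss = {name: c ** 2 / (4 * n) for name, c in contrasts.items()}
    return effects, ss


# Hypothetical totals (illustrative only).
effects, ss = anova_2x2(one=80, a=100, b=60, ab=90, n=3)
df_error = 4 * (3 - 1)   # SSE has 4(n-1) degrees of freedom
```

With these made-up totals the A effect is positive, the B effect is negative, and the AB effect is small, so the signs and sizes of `effects` carry information that the sums of squares in `ss` do not.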
The regression model
• For the chemical process example
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2$$
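A minimal sketch of fitting this model, assuming the factors are coded as -1 (low) and +1 (high). Because the coded columns are orthogonal, each least squares coefficient reduces to sum(x*y)/N, and each slope is half of the corresponding effect estimate. The cell means are hypothetical, and the helper names `coef` and `predict` are made up for illustration.

```python
# Coded columns for a 2^2 design in standard order: (1), a, b, ab.
x1 = [-1, 1, -1, 1]
x2 = [-1, -1, 1, 1]
x12 = [u * v for u, v in zip(x1, x2)]

# Hypothetical cell means (illustrative only).
ybar = [26.0, 34.0, 20.0, 30.0]


def coef(column, y):
    """Least squares coefficient for an orthogonal +/-1 column."""
    return sum(c * v for c, v in zip(column, y)) / len(y)


b0 = sum(ybar) / len(ybar)            # intercept = grand average
b1 = coef(x1, ybar)                   # half of the A effect
b2 = coef(x2, ybar)                   # half of the B effect
b12 = coef(x12, ybar)                 # half of the AB effect


def predict(u, v):
    """Fitted response at coded levels (u, v)."""
    return b0 + b1 * u + b2 * v + b12 * u * v
```

Since this model has four coefficients and there are four cell means, `predict` reproduces each cell mean exactly. Dropping a small interaction term gives the reduced model used for prediction.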
The $2^3$ Design
• 3 factors: A, B, and C
• Each factor at two levels: low (-) and high (+)
• As with the $2^2$ design, we can list the 8 runs in a geometric view
• We write the treatment combinations in standard order as (1), a, b, ab, c, ac, bc, and abc
• Remember these symbols also represent the total of all n observations taken at that
particular treatment combination
Contrasts
• Three different notations are used for the runs in the $2^k$ design: geometric coding (+ or
-), lowercase letter labels, and 0 and 1 to denote the low and high levels (dummy
variables)
• There are 7 degrees of freedom between the 8 treatment combinations in the $2^3$ design
• 3 degrees of freedom are associated with the main effects A, B, and C
• 4 degrees of freedom are associated with the interactions (one each): AB, AC, BC, and
ABC
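As an illustration of how the three notations line up, the sketch below lists the runs of a $2^k$ design in standard order and reports each run as a letter label, a string of + and - signs, and a tuple of 0/1 levels. The function name `standard_order` is hypothetical.

```python
from itertools import product


def standard_order(factors):
    """List the runs of a 2^k design in standard order.

    factors is a string of lowercase factor letters, e.g. "abc".
    Each run is returned as (label, signs, dummy levels).
    """
    runs = []
    for levels in product([0, 1], repeat=len(factors)):
        bits = levels[::-1]   # reverse so the first factor changes fastest
        label = "".join(f for f, b in zip(factors, bits) if b) or "(1)"
        signs = "".join("+" if b else "-" for b in bits)
        runs.append((label, signs, bits))
    return runs


runs = standard_order("abc")
labels = [r[0] for r in runs]
```

For three factors, `labels` is (1), a, b, ab, c, ac, bc, abc, and the run labelled ab has signs `++-` and dummy levels (1, 1, 0).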
Estimating the main effect A
• The effect of A when B and C are at the low level is [a − (1)]/n
• The effect of A when B is at the high level and C is at the low level is [ab − b]/n
• The effect of A when C is at the high level and B is at the low level is [ac − c]/n
• The effect of A when both B and C are at the high level is [abc − bc]/n
$$A = \frac{1}{4n}[a - (1) + ab - b + ac - c + abc - bc]$$
• The effect of A is just the average of the four runs where A is at the high level ($\bar{y}_{A^+}$)
minus the average of the four runs where A is at the low level ($\bar{y}_{A^-}$)
$$
\begin{aligned}
A &= \bar{y}_{A^+} - \bar{y}_{A^-} \\
  &= \frac{a + ab + ac + abc}{4n} - \frac{(1) + b + c + bc}{4n} \\
  &= \frac{1}{4n}[a + ab + ac + abc - (1) - b - c - bc] \qquad (1)
\end{aligned}
$$
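The two routes to the A effect (averaging the four conditional effects, or differencing the high-A and low-A averages) can be checked numerically. The treatment totals and n = 2 below are hypothetical values chosen only for illustration.

```python
# Hypothetical treatment totals for a 2^3 design with n = 2 replicates.
n = 2
t = {"(1)": 10, "a": 14, "b": 9, "ab": 15, "c": 11, "ac": 17, "bc": 10, "abc": 18}

# Effect of A at each of the four combinations of B and C.
conditional = [
    (t["a"] - t["(1)"]) / n,    # B low,  C low
    (t["ab"] - t["b"]) / n,     # B high, C low
    (t["ac"] - t["c"]) / n,     # B low,  C high
    (t["abc"] - t["bc"]) / n,   # B high, C high
]
A_from_average = sum(conditional) / 4

# Average response with A high minus average response with A low.
y_A_high = (t["a"] + t["ab"] + t["ac"] + t["abc"]) / (4 * n)
y_A_low = (t["(1)"] + t["b"] + t["c"] + t["bc"]) / (4 * n)
A_from_means = y_A_high - y_A_low
```

Both calculations give the same number, because they are the same contrast of the eight totals divided by 4n.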
The two-factor interaction effects
• A measure of the AB interaction is the difference between the average A effects at
the two levels of B. By convention, 1/2 of this difference is called the AB interaction
$$
\begin{aligned}
AB &= \frac{abc - bc + ab - b - ac + c - a + (1)}{4n} \qquad (4) \\
   &= \frac{abc + ab + c + (1)}{4n} - \frac{bc + b + ac + a}{4n}
\end{aligned}
$$
• Similarly we can find
$$
\begin{aligned}
AC &= \frac{abc + ac + b + (1)}{4n} - \frac{bc + c + ab + a}{4n} \\
   &= \frac{1}{4n}[(1) - a + b - ab - c + ac - bc + abc] \qquad (5)
\end{aligned}
$$
$$BC = \frac{1}{4n}[(1) + a - b - ab - c - ac + bc + abc] \qquad (6)$$
• The ABC interaction is defined as the average difference between the AB interactions
at the two different levels of C
$$ABC = \frac{1}{4n}[abc - bc - ac + c - ab + b + a - (1)] \qquad (7)$$
Algebraic Signs in the $2^3$ design
• A table of plus and minus signs can be developed from the contrasts
• In the $2^3$ design with n replicates, the sum of squares for any effect is
$$SS = \frac{(\text{Contrast})^2}{8n}$$
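One way to build the table of signs is to write the main-effect columns in standard order and obtain each interaction column as the product of the columns of its factors. The sketch below does this and then computes each sum of squares as the squared contrast over $n 2^k$, which is 8n when k = 3. The function names and the totals are hypothetical.

```python
from itertools import combinations
from math import prod


def sign_table(factors="ABC"):
    """Effect names and rows of +/-1 signs for a 2^k design in standard order."""
    k = len(factors)
    # Main-effect signs: bit j of the run index says whether factor j is high.
    main = [[1 if (run >> j) & 1 else -1 for j in range(k)] for run in range(2 ** k)]
    effects = [c for r in range(1, k + 1) for c in combinations(range(k), r)]
    names = ["".join(factors[j] for j in c) for c in effects]
    # Interaction signs are products of the main-effect signs.
    rows = [[prod(row[j] for j in c) for c in effects] for row in main]
    return names, rows


def sums_of_squares(totals, n, factors="ABC"):
    """SS for every effect from treatment totals given in standard order."""
    names, rows = sign_table(factors)
    result = {}
    for col, name in enumerate(names):
        contrast = sum(row[col] * y for row, y in zip(rows, totals))
        result[name] = contrast ** 2 / (len(totals) * n)
    return result


names, rows = sign_table("ABC")
# Hypothetical totals for (1), a, b, ab, c, ac, bc, abc with n = 2.
ss = sums_of_squares([10, 14, 9, 15, 11, 17, 10, 18], n=2)
```

The first row of `rows` corresponds to (1); its ABC entry is -1, matching the sign of (1) in the ABC contrast.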
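• We can construct a confidence interval (CI) for each effect
$$\text{CI} = \text{Effect} \pm t_{\alpha/2,\,2^k(n-1)}\ \text{s.e.}(\text{Effect})$$
where the variance of an effect follows from
$$V(\text{Effect}) = V\left(\frac{\text{Contrast}}{n 2^{k-1}}\right)$$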
• Each contrast is a linear combination of $2^k$ treatment totals, and each total consists of n
observations
$$V(\text{Contrast}) = n 2^k \sigma^2$$
$$V(\text{Effect}) = \frac{1}{n 2^{k-2}} \sigma^2$$
Therefore
$$\text{s.e.}(\text{Effect}) = \sqrt{\widehat{V}(\text{Effect})} = \frac{2\sqrt{\text{MSE}}}{\sqrt{n 2^k}},$$
since $\hat{\sigma}^2 = \text{MSE}$.
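The step from the variance of a contrast to the variance of an effect uses only the rule $V(cX) = c^2 V(X)$. Written out, with the case k = 3, n = 2 added here purely as an illustration:

```latex
\begin{align*}
V(\text{Effect})
  &= \frac{V(\text{Contrast})}{(n 2^{k-1})^2}
   = \frac{n 2^k \sigma^2}{n^2 2^{2k-2}}
   = \frac{\sigma^2}{n 2^{k-2}}, \\
\text{s.e.}(\text{Effect})
  &= \sqrt{\frac{\text{MSE}}{n 2^{k-2}}}
   = \frac{2\sqrt{\text{MSE}}}{\sqrt{n 2^k}}, \\
k = 3,\ n = 2:\quad
\text{s.e.}(\text{Effect})
  &= \frac{2\sqrt{\text{MSE}}}{\sqrt{16}}
   = \frac{\sqrt{\text{MSE}}}{2}.
\end{align*}
```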
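• The general $2^k$ design has
– k main effects
– $\binom{k}{2}$ two-factor interactions
– $\binom{k}{3}$ three-factor interactions...
– $\binom{k}{k} = 1$ k-factor interaction
Total of $2^k - 1$ effects
• Error has $2^k(n - 1)$ degrees of freedom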
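The effect counts are binomial coefficients, so the degrees-of-freedom bookkeeping is easy to check. This is a small sketch with hypothetical function names.

```python
from math import comb


def effect_counts(k):
    """Number of j-factor effects in a 2^k design, for j = 1, ..., k."""
    return {j: comb(k, j) for j in range(1, k + 1)}


def degrees_of_freedom(k, n):
    """(effects, error, total) degrees of freedom for a 2^k design with n replicates."""
    effects = sum(effect_counts(k).values())   # adds up to 2**k - 1
    error = 2 ** k * (n - 1)
    total = n * 2 ** k - 1
    return effects, error, total


counts_k4 = effect_counts(4)
df_k3_n2 = degrees_of_freedom(3, 2)
```

The effect and error degrees of freedom always add up to the total, and with n = 1 the error term has zero degrees of freedom.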
Properties
A single replicate of the $2^k$ design
• n = 1
• $2^k$ observations
• Effects have $2^k - 1$ d.f. = total d.f.
• Error: 0 d.f. (No MSE)
• No F-tests to test for effects
Inference when n = 1
Since there are no degrees of freedom left to estimate the MSE, we can use the following strategy.
• The estimates of negligible effects are approximately normally distributed with mean 0 and variance
$\sigma^2$
• Significant effects will have nonzero means
• A normal probability plot can be used to identify important factors
From a normal probability plot we can identify significant effects and keep them in the
reduced model. Non-significant effects are left out of the reduced model, and they are pooled
to form an estimate of the error.
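The coordinates of a normal probability plot of the effects can be computed with the standard library alone. This is a sketch under stated assumptions: the effect estimates are hypothetical, the plotting position (i - 0.5)/m is one common choice among several, and the function name is made up. Drawing the plot is left to whatever graphics tool is at hand.

```python
from statistics import NormalDist


def normal_plot_points(effects):
    """Pair each ordered effect estimate with a theoretical normal quantile.

    effects maps effect names to estimates; returns (name, estimate, quantile)
    triples sorted by estimate.
    """
    ordered = sorted(effects.items(), key=lambda item: item[1])
    m = len(ordered)
    points = []
    for i, (name, value) in enumerate(ordered, start=1):
        p = (i - 0.5) / m                       # plotting position
        points.append((name, value, NormalDist().inv_cdf(p)))
    return points


# Hypothetical effect estimates from an unreplicated 2^3 design.
estimates = {"A": 21.6, "B": 3.1, "C": 9.9, "AB": 0.1,
             "AC": -18.1, "BC": 2.4, "ABC": 1.9}
points = normal_plot_points(estimates)
```

Negligible effects should fall roughly on a straight line through the origin when the estimates are plotted against the quantiles. Effects far from that line, here at the two ends of `points`, are the candidates for the reduced model.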
• We can find which level combination maximizes or minimizes the response variable y
• The regression model is a very useful way to optimize y
• We look at ŷ to find optimum levels for the factors
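A minimal sketch of using the fitted model to pick factor levels: a model with only main effects and interactions is linear in each coded variable, so its largest and smallest values over the region from -1 to +1 occur at corners of the design. The coefficients and the names `y_hat`, `x1`, and `x3` are hypothetical.

```python
from itertools import product


def y_hat(x1, x3):
    """Hypothetical reduced model in coded units (illustrative coefficients)."""
    return 70.0 + 10.8 * x1 + 4.9 * x3 - 9.1 * x1 * x3


# Predicted response at each corner of the design.
corners = {(x1, x3): y_hat(x1, x3) for x1, x3 in product([-1, 1], repeat=2)}
best = max(corners, key=corners.get)     # levels that maximize y_hat
worst = min(corners, key=corners.get)    # levels that minimize y_hat
```

Here the maximum is at x1 high and x3 low, a combination that the main effects alone would not suggest because of the negative interaction coefficient.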
Example 6.2
• By looking at the normal probability plot we noticed that only the main effects of A,
C, and D and the AC and AD interactions are important.
• We can discard B from the experiment so that the design becomes a $2^3$ design in A, C,
and D with two replicates. This is called design projection.
• Note that by projecting the single replicate $2^4$ into a replicated $2^3$ design we now have
both an estimate of the ACD interaction and an estimate of the error based on what
is called the hidden replication.
Projecting a $2^k$ design into a $2^{k-1}$ design
• If one factor is not important in a $2^k$ design, we can project the $2^k$ design into a $2^{k-1}$
design with 2 replicates
• Assume we have a $2^3$ design and B is not a significant effect. Then the design
becomes a $2^2$ design in A and C with 2 replicates, as shown below
  A  B  C  |  A  C
  -  -  -  |  -  -
  +  -  -  |  +  -
  -  +  -  |  -  -
  +  +  -  |  +  -
  -  -  +  |  -  +
  +  -  +  |  +  +
  -  +  +  |  -  +
  +  +  +  |  +  +
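The hidden replication in the table can be verified by dropping the B column from the full $2^3$ design and counting how often each (A, C) combination appears. The helper name `project` is hypothetical.

```python
from collections import Counter
from itertools import product


def project(design, drop):
    """Remove one factor from every run of a coded design."""
    return [tuple(v for name, v in run if name != drop) for run in design]


# Full 2^3 design in standard order (A changes fastest).
full = [(("A", a), ("B", b), ("C", c)) for c, b, a in product([-1, 1], repeat=3)]

projected = project(full, drop="B")      # each run is now (A, C)
replicates = Counter(projected)          # how often each (A, C) pair occurs
```

Each of the four (A, C) combinations appears exactly twice, so the projected design is a $2^2$ design with 2 replicates.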
• If we find clear problems with normality and equality of variance we can transform the
response variable y
• Common transformations are
– $y^* = \ln y$
– $y^* = \sqrt{y + c}$
– $y^* = (y^\lambda - 1)/\lambda$ (called the Box-Cox transformation)
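The Box-Cox family contains the log transformation as the limiting case when λ goes to 0, which is why implementations usually treat λ = 0 separately. A minimal sketch, with a hypothetical function name and assuming a strictly positive response:

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive response y; lam = 0 gives ln y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


sqrt_like = box_cox(4.0, 0.5)      # (sqrt(4) - 1) / 0.5
log_case = box_cox(math.e, 0)      # ln(e)
near_zero = box_cox(5.0, 1e-6)     # close to ln(5)
```

With λ = 0.5 the transform is a shifted and rescaled square root, and with λ = 1 it only shifts y by 1, so comparing a few values of λ covers the common transformations in one family.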