
Stat 3515 Lecture Notes No. 1, S2022
Design of Experiments (University of Connecticut)

Single Factor Completely Randomized Experiments

In an experiment to compare different treatments, each treatment must be applied to several different experimental units, because the response varies from unit to unit even when the units are treated identically. In the simplest such design, treatment i is applied to n_i units, i = 1, ..., a, in order to compare the a treatments. In many experiments n_1 = ... = n_a, but this is not necessary or even desirable. This design, in which the only recognizable difference between units is the treatment applied to them, is called a single factor completely randomized design. With this design, the assignment of treatments to units is made completely at random. This complete randomization ensures that every experimental unit has the same chance of receiving any one of the treatments or, equivalently, that all possible assignments of the experimental units to the different treatments are equally likely.
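As an aside (an illustration added to these notes, not part of the original text), complete randomization can be carried out by randomly permuting the unit labels and splitting the permutation into groups, for example in Python:

import numpy as np

# Sketch of a completely randomized assignment: N = 25 units, a = 5 treatments,
# n_i = 5 units per treatment (sizes chosen to match the example later in these notes).
rng = np.random.default_rng(seed=2022)       # arbitrary seed, for reproducibility only

treatments = [15, 20, 25, 30, 35]            # factor levels (cotton percentages)
n_per_treatment = 5
units = np.arange(1, 26)                     # labels of the 25 experimental units

# Every permutation of the units is equally likely, so every assignment of
# units to treatments is equally likely.
shuffled = rng.permutation(units)
assignment = {
    trt: sorted(shuffled[k * n_per_treatment:(k + 1) * n_per_treatment].tolist())
    for k, trt in enumerate(treatments)
}

for trt, assigned in assignment.items():
    print(f"{trt}% cotton -> units {assigned}")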

A completely randomized design is particularly useful when the experimental units are quite
homogeneous. This design is very flexible; it accommodates any number of treatments and permits
different sample sizes for each of the treatments. Its chief disadvantage is that, when the experimental
units are heterogeneous, this design is not as efficient as other statistical designs.

Terminology:

A factor is an independent variable to be studied in an investigation. For example, in an investigation of how cotton content affects the tensile strength of a new synthetic fiber, the factor studied is cotton content. Similarly, to study the relationship between cereal sales and four different package designs, the factor is package design.

A level of a factor is a particular form of that factor. In the synthetic fiber study, the product
development engineer has selected fibers with 15%, 20%, 25%, 30% and 35% cotton. These are the
five levels of the factor in that study. In the cereal study, there are four levels for the factor of package
design. In the first example the factor is a quantitative one, while in the second example it is a
qualitative one.

In a single factor experiment, a treatment corresponds to a factor level. In multi-factor studies, a treatment corresponds to a combination of factor levels. In a single factor experiment, if the levels of the factor are chosen at random, we call the model a random (effects) model; otherwise it is called a fixed (effects) model.


Type of Data:

Observational Data - data that is obtained without controlling the independent variable(s) of interest.

Experimental Data - data that is obtained by the experimenter by controlling the independent
variable(s).

Example: New synthetic fiber study

Data (lb/in^2): tensile strength of the new synthetic fiber.


_____________________________________________________
Cotton                      Observation
Percentage      1     2     3     4     5   Total    Mean
_____________________________________________________
    15          7     7    15    11     9     49      9.8
    20         12    17    12    18    18     77     15.4
    25         14    18    18    19    19     88     17.6
    30         19    25    22    19    23    108     21.6
    35          7    10    11    15    11     54     10.8
_____________________________________________________
                                              376    15.04

It is always a good idea to examine experimental data graphically. For example, one can present a boxplot and/or a scatter plot of tensile strength versus cotton percentage. In the SAS output, the letters are the individual observations and the rectangles in the boxplot are the sample means. Both graphs indicate that tensile strength increases as cotton content increases, up to 30% cotton; beyond 30% there is a sizable decrease in tensile strength. The scatter diagram also suggests that the variability does not depend on cotton content. From the graphical display one would suspect that cotton content affects tensile strength and that maximum strength is obtained at around 30% cotton.
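For readers working outside SAS, here is a minimal plotting sketch (an addition to these notes, assuming matplotlib is available) that reproduces a boxplot and a scatter plot of tensile strength versus cotton percentage from the table above:

import matplotlib.pyplot as plt

# Tensile strength (lb/in^2) by cotton percentage, from the table above.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Boxplot of tensile strength for each cotton percentage.
ax1.boxplot(list(data.values()), labels=[str(p) for p in data])
ax1.set_xlabel("Cotton percentage")
ax1.set_ylabel("Tensile strength (lb/in^2)")
ax1.set_title("Boxplots by treatment")

# Scatter plot of individual observations with treatment means overlaid.
for pct, y in data.items():
    ax2.scatter([pct] * len(y), y, color="gray")
    ax2.scatter(pct, sum(y) / len(y), color="black", marker="s")   # treatment mean
ax2.set_xlabel("Cotton percentage")
ax2.set_ylabel("Tensile strength (lb/in^2)")
ax2.set_title("Observations and treatment means")

plt.tight_layout()
plt.show()

The same pattern of means (rising up to 30% cotton, then falling) is visible here as in the SAS plots described above.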


Analysis of the Fixed Effects Model

Model:

Y_{ij} = \mu + \tau_i + \varepsilon_{ij},    j = 1, ..., n_i and i = 1, ..., a.

Y_{ij} is the jth observation for the ith treatment, \mu is the overall mean representing the common effect for the entire experiment, \tau_i is the effect of the ith treatment, and \varepsilon_{ij} is the random error present in the jth observation for the ith treatment.

Assumptions:

1. The \varepsilon_{ij} are iid N(0, \sigma^2).

2. \sum_{i=1}^{a} \tau_i = 0.

From the expression for the model it follows that, for 1 \le j \le n_i and 1 \le i \le a,

E(Y_{ij}) = \mu + \tau_i = \mu_i

is the mean of the observations in the ith group. The analysis of this experiment consists of testing

H_0: \tau_i = 0 for all i    vs    H_a: not all \tau_i are 0,

which is equivalent to testing

H_0: \mu_i = \mu for all i    vs    H_a: not all \mu_i = \mu.
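To make the model concrete, here is a small simulation sketch (an illustration added to these notes, with arbitrarily chosen parameter values) that generates data from the fixed effects model with treatment effects summing to zero:

import numpy as np

rng = np.random.default_rng(1)                 # arbitrary seed for reproducibility

mu = 15.0                                      # overall mean (hypothetical value)
tau = np.array([-5.0, 0.5, 2.5, 6.5, -4.5])    # treatment effects; they sum to zero
sigma = 3.0                                    # error standard deviation (hypothetical)
n_i = 5                                        # observations per treatment

# Y_ij = mu + tau_i + eps_ij, with eps_ij iid N(0, sigma^2)
Y = mu + tau[:, None] + rng.normal(0.0, sigma, size=(len(tau), n_i))

print("sum of treatment effects:", tau.sum())                  # 0.0, as assumed
print("treatment means mu_i = mu + tau_i:", mu + tau)
print("sample means of the simulated data:", Y.mean(axis=1).round(2))

Under H_0 every \tau_i would be 0 and all treatment means would equal \mu; the F test developed below measures how far the observed sample means depart from that.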

To test the above hypothesis, the F test in a one-way analysis of variance is used. The ANOVA approach has two purposes. First, it provides a subdivision of the total variability between the experimental units into separate components, each component representing a different source of variability, so that the relative importance of the different sources can be assessed. Second, and more importantly, it gives an estimate of the underlying variability between units, which provides a basis for inferences about the effects of the applied treatments. We now proceed to develop this for our model.


Notation:

Y_{i.} = \sum_{j=1}^{n_i} Y_{ij},    Y_{..} = \sum_{i=1}^{a} \sum_{j=1}^{n_i} Y_{ij},    N = \sum_{i=1}^{a} n_i,

\bar{Y}_{i.} = Y_{i.} / n_i,    \bar{Y}_{..} = Y_{..} / N = \sum_{i=1}^{a} Y_{i.} / N.

From the model we get the following identity

Y_{ij} - \mu = (Y_{ij} - \mu_i) + (\mu_i - \mu),

which remains valid when we replace the parameters by their estimates:

Y_{ij} - \bar{Y}_{..} = (Y_{ij} - \bar{Y}_{i.}) + (\bar{Y}_{i.} - \bar{Y}_{..}).

The above equation states that the deviation of each observation from the overall mean can be decomposed into two parts: the deviation of the observation from its treatment mean plus the deviation of the treatment mean from the overall mean. If we square both sides of the above equation and sum over i and j, the cross-product term vanishes (within each treatment the deviations from the treatment mean sum to zero), and we get the following fundamental equation of the analysis of variance:

\sum_{i=1}^{a} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{..})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i.})^2 + \sum_{i=1}^{a} n_i (\bar{Y}_{i.} - \bar{Y}_{..})^2,

or

SS_{total} = SS_{error} + SS_{treatment}.

The term on the left-hand side represents the total variability in the data. The first term on the right-hand side of the identity represents the total variability within each of the a treatments. Since we have assumed that the variances within the a treatments are equal, if we divide that term by

\sum_{i=1}^{a} (n_i - 1) = N - a

we get an unbiased estimator of the variance \sigma^2, and this estimator is valid whether or not the null hypothesis is true.

Now, if H_0 is true, then the second term divided by a - 1 is also an unbiased estimator of \sigma^2. Moreover, the two estimators are independent of each other, and their quotient, denoted by

F = \frac{SS_{treatment} / (a - 1)}{SS_{error} / (N - a)},

has an F distribution with a - 1 and N - a degrees of freedom. Since the numerator tends to be large when H_0 is not true, while the denominator remains stable, we reject H_0 for large values of F. Table IV on pages A-6 to A-10 gives the critical values of the F distribution for upper tail area \alpha, for selected values of \alpha.

For this example one can show that:

SS total = 636.96

SS treatment = 475.76

SS error = 161.20

F = 14.76

From Table IV, page A-10 in the Appendix, the critical value for our data set with \alpha = .01 is 4.43 (\nu_1 = 4, \nu_2 = 20). Therefore, we can reject the null hypothesis at the .01 level. A more accurate result can be obtained from the SAS output.
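As a cross-check (a sketch added to these notes, not the SAS program referred to above), the sums of squares, the F statistic, the \alpha = .01 critical value, and the p-value can be reproduced directly from the data, for example with scipy:

from scipy.stats import f as f_dist

# Tensile strength data (lb/in^2) by cotton percentage, from the table above.
groups = [
    [7, 7, 15, 11, 9],       # 15% cotton
    [12, 17, 12, 18, 18],    # 20% cotton
    [14, 18, 18, 19, 19],    # 25% cotton
    [19, 25, 22, 19, 23],    # 30% cotton
    [7, 10, 11, 15, 11],     # 35% cotton
]

a = len(groups)
N = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / N

# Fundamental ANOVA decomposition: SS_total = SS_error + SS_treatment.
ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
ss_treatment = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_error = ss_total - ss_treatment

ms_treatment = ss_treatment / (a - 1)            # numerator mean square
ms_error = ss_error / (N - a)                    # unbiased estimator of sigma^2
F = ms_treatment / ms_error

critical_value = f_dist.ppf(0.99, a - 1, N - a)  # alpha = .01 critical value
p_value = f_dist.sf(F, a - 1, N - a)             # upper tail area beyond the observed F

print(f"SS_total = {ss_total:.2f}, SS_treatment = {ss_treatment:.2f}, SS_error = {ss_error:.2f}")
print(f"F = {F:.2f}, F_crit(.01; 4, 20) = {critical_value:.2f}, p-value = {p_value:.2g}")
# Expected, from the notes: 636.96, 475.76, 161.20, F = 14.76, critical value 4.43.

Because the computed F of about 14.76 far exceeds the .01 critical value of 4.43, the p-value is well below .01, in agreement with the conclusion above.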
