Lec 23 - Design Single Factor

Experimental design involves strategies for organizing data collection and analysis procedures, primarily focusing on controlling variability to identify treatment effects. Key principles include control by matching, randomization, and statistical adjustment, with randomization being the most effective for ensuring equivalence among groups. Different designs such as completely randomized, randomized block, and hierarchical designs are discussed, highlighting the importance of sampling models in educational research and their impact on statistical precision and inference.

Basic Experimental Design

What is Experimental Design?


Experimental design includes both
• Strategies for organizing data collection
• Data analysis procedures matched to those data
collection strategies

Classical treatments of design stress analysis procedures
based on the analysis of variance (ANOVA)

Other analysis procedures, such as those based on
hierarchical linear models or analysis of aggregates
(e.g., class or school means), are also appropriate
Why Do We Need Experimental Design?

Because of variability

We wouldn’t need a science of experimental design if
• all units (students, teachers, & schools) were identical,
and
• all units responded identically to treatments

We need experimental design to control variability so that
treatment effects can be identified
Principles of Experimental Design
Experimental design controls background variability so that
systematic effects of treatments can be observed

Three basic principles

1. Control by matching
2. Control by randomization
3. Control by statistical adjustment

Their importance is in that order


Control by Matching
Known sources of variation may be eliminated by matching
Eliminating genetic variation
Compare animals from the same litter of mice
Eliminating district or school effects
Compare students within districts or schools
However matching is limited
• matching is only possible on observable characteristics
• perfect matching is not always possible
• matching inherently limits generalizability by removing (possibly
desired) variation
Control by Matching
Matching ensures that groups compared are alike
on specific known and observable characteristics
(in principle, everything we have thought of)

Wouldn’t it be great if there were a method of
making groups alike on not only everything we
have thought of, but everything we didn’t think of
too?

There is such a method


Control by Randomization
Matching controls for the effects of variation due to specific
observable characteristics

Randomization controls for the effects of all (observable or
non-observable, known or unknown) characteristics

Randomization makes groups equivalent (on average) on
all variables (known and unknown, observable or not)

Randomization also gives us a way to assess whether
differences after treatment are larger than would be
expected due to chance.
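
One way to see how randomization lets us judge whether a difference exceeds chance is a randomization (permutation) test: reshuffle the group labels many times and count how often the reshuffled difference is at least as large as the observed one. The following is a minimal Python sketch under stated assumptions; the function name and the outcome scores are hypothetical illustrations, not part of the lecture.

```python
import random


def permutation_p_value(treated, control, n_perm=5000, seed=None):
    """Two-sided randomization test for a difference in group means."""
    rng = random.Random(seed)
    n_t = len(treated)
    observed = sum(treated) / n_t - sum(control) / len(control)
    pooled = list(treated) + list(control)
    n_c = len(pooled) - n_t
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-randomize which units count as "treated"
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / n_c
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical outcome scores for four treated and four control units
observed, p_value = permutation_p_value([10, 11, 12, 13], [1, 2, 3, 4], seed=1)
print(observed, p_value)
```

The p-value is the share of re-randomizations that produce a difference at least as extreme as the one observed, so it relies only on the random assignment itself.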
Control by Randomization
Random assignment is not assignment with no
particular rule. It is a purposeful process

Assignment is made at random. This does not
mean that the experimenter writes down the
names of the varieties in any order that occurs to
him, but that he carries out a physical
experimental process of randomization, using
means which shall ensure that each variety will
have an equal chance of being tested on any
particular plot of ground (Fisher, 1935, p. 51)
Control by Randomization
Random assignment of schools or classrooms is not
assignment with no particular rule. It is a
purposeful process

Assignment of schools to treatments is made at
random. This does not mean that the
experimenter assigns schools to treatments in any
order that occurs to her, but that she carries out a
physical experimental process of randomization,
using means which shall ensure that each
treatment will have an equal chance of being
tested in any particular school (Hedges, 2007)
Control by Statistical Adjustment
Control by statistical adjustment is a form of
pseudo-matching

It uses statistical relations to simulate matching

Statistical control is important for increasing
precision but should not be relied upon to control
biases that may exist prior to assignment
Using Principles of Experimental Design

You have to know a lot (be smart) to use matching
and statistical control effectively

You do not have to be smart to use randomization
effectively

But

Where all are possible, randomization is not as
efficient (requires larger sample sizes for the
same power) as matching or statistical control
Basic Ideas of Design:
Independent Variables (Factors)
The values of independent variables are called levels
Some independent variables can be manipulated, others
can’t
Treatments are independent variables that can be
manipulated
Blocks and covariates are independent variables that
cannot be manipulated
These concepts are simple, but are often confused
Remember:
You can randomly assign treatment levels but not blocks
Basic Ideas of Design (Crossing)
Relations between independent variables

Factors (treatments or blocks) are crossed if every level of
one factor occurs with every level of another factor

Example
The Tennessee class size experiment assigned students to
one of three class size conditions. All three treatment
conditions occurred within each of the participating
schools

Thus treatment was crossed with schools


Basic Ideas of Design (Nesting)
Factor B is nested in factor A if every level of factor B
occurs within only one level of factor A

Example
The Tennessee class size experiment actually assigned
classrooms to one of three class size conditions. Each
classroom occurred in only one treatment condition

Thus classrooms were nested within treatments

(But treatment was crossed with schools)
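
The definitions of crossing and nesting can be checked mechanically on a list of observed level pairs. The following is a minimal Python sketch; the function names and the toy school, classroom, and condition labels are hypothetical and are not data from the Tennessee experiment.

```python
from collections import defaultdict
from itertools import product


def is_crossed(pairs):
    """True if every level of factor A occurs with every level of factor B."""
    observed = set(pairs)
    a_levels = {a for a, _ in observed}
    b_levels = {b for _, b in observed}
    return observed == set(product(a_levels, b_levels))


def is_nested(pairs):
    """True if every level of factor B occurs within only one level of factor A."""
    a_levels_by_b = defaultdict(set)
    for a, b in pairs:
        a_levels_by_b[b].add(a)
    return all(len(a_set) == 1 for a_set in a_levels_by_b.values())


# Hypothetical layout: three conditions in each of two schools,
# with one classroom per condition per school.
treatment_school = [(t, s) for s in ("school1", "school2")
                    for t in ("small", "regular", "regular+aide")]
treatment_classroom = [(t, f"{s}-{t}") for t, s in treatment_school]

print(is_crossed(treatment_school))    # treatment is crossed with schools
print(is_nested(treatment_classroom))  # classrooms are nested within treatments
```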


Where Do These Terms Come From?
(Nesting)
An agricultural experiment where blocks are literally blocks
or plots of land
Block        1     2     …     n
Treatment    T1    T2    …     T1

Here each block is literally nested within a treatment
condition
Where Do These Terms Come From?
(Crossing)
An agricultural experiment

Blocks were literally blocks of land, and plots of land within
blocks were assigned different treatments

Block     1     2     …     n
Plot 1    T1    T2    …     T1
Plot 2    T2    T1    …     T2

Here treatment literally crosses the blocks

Where Do These Terms Come From?
(Crossing)
The experiment is often depicted like this.
What is wrong with this as a field layout?
               Blocks
               1     2     …     n
Treatment 1
Treatment 2

Consider possible sources of bias


Think About These Designs
A study assigns a reading treatment (or control) to children
in 20 schools. Each child is classified into one of three
groups with different risk of reading failure.

A study assigns T or C to 20 teachers. The teachers are in
five schools, and each teacher teaches 4 science
classes

Two schools in each district are picked to participate. Each
school has two grade 4 teachers. One of them is
assigned to T, the other to C.
Three Basic Designs
The completely randomized design
Treatments are assigned to individuals

The randomized block design
Treatments are assigned to individuals within
blocks

The hierarchical design
Treatments are assigned to blocks; the same
treatment is assigned to all individuals in the
block
The Completely Randomized Design
Individuals are randomly assigned to one of two treatments
Treatment       Control
Individual 1    Individual 1
Individual 2    Individual 2
…               …
Individual n    Individual n
The Randomized Block Design
               Block 1            …    Block m
Treatment 1    Individual 1       …    Individual 1
               …                  …    …
               Individual n       …    Individual n
Treatment 2    Individual n+1     …    Individual n+1
               …                  …    …
               Individual 2n      …    Individual 2n
The Hierarchical Design

Treatment                            Control
Block 1         …    Block m         Block m+1       …    Block 2m
Individual 1    …    Individual 1    Individual 1    …    Individual 1
Individual 2    …    Individual 2    Individual 2    …    Individual 2
…               …    …               …               …    …
Individual n    …    Individual n    Individual n    …    Individual n


Randomization Procedures
Randomization has to be done as an explicit process
devised by the experimenter

• Haphazard is not the same as random
• Unknown assignment is not the same as random
• “Essentially random” is technically meaningless
• Alternation is not random, even if you alternate from a
random start

This is why R.A. Fisher was so explicit about randomization
processes
Randomization Procedures
R.A. Fisher on how to randomize an experiment with small
sample size and 5 treatments

A satisfactory method is to use a pack of cards
numbered from 1 to 100, and to arrange them in random
order by repeated shuffling. The varieties [treatments]
are numbered from 1 to 5, and any card such as the
number 33, for example is deemed to correspond to
variety [treatment] number 3, because on dividing by 5
this number is found as the remainder. (Fisher, 1935,
p. 51)
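
Fisher's card procedure can be imitated in software. The following is a minimal Python sketch; the function name is hypothetical, and the rule that a remainder of 0 maps to treatment 5 is an assumption, since the quoted passage only works through the card numbered 33.

```python
import random


def fisher_card_assignment(n_plots, n_treatments=5, n_cards=100, seed=None):
    """Shuffle cards 1..n_cards and map each dealt card to a treatment
    by its remainder on division by n_treatments."""
    if n_plots > n_cards:
        raise ValueError("not enough cards for the requested number of plots")
    rng = random.Random(seed)
    cards = list(range(1, n_cards + 1))
    rng.shuffle(cards)  # stands in for repeated physical shuffling
    treatments = []
    for card in cards[:n_plots]:
        remainder = card % n_treatments
        # Assumption: remainder 0 (cards 5, 10, ...) means the last treatment
        treatments.append(remainder if remainder != 0 else n_treatments)
    return treatments


print(fisher_card_assignment(n_plots=10, seed=1935))
```

With 100 cards and 5 treatments, exactly 20 cards correspond to each treatment, so each treatment is equally likely for any given plot.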
Randomization Procedures
You may want to use a table of random numbers, but be
sure to pick an arbitrary start point!

Beware random number generators: they typically depend
on seed values, so be sure to vary the seed value (if they
do not do it automatically)

Otherwise you can reliably generate the same sequence of
random numbers every time

It is no different than starting in the same place in a table of
random numbers
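
The seed issue is easy to demonstrate. The following short Python sketch uses the standard library `random` module; the helper name `draws` and the seed value are arbitrary illustrations.

```python
import random


def draws(seed, k=10):
    """Return k draws from {1, 2} using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(1, 2) for _ in range(k)]


# Same seed, same "random" sequence: like starting at the same
# place in a table of random numbers.
print(draws(seed=12345) == draws(seed=12345))  # True

# random.SystemRandom draws from the operating system's entropy source
# and ignores seeds, so runs do not replay the same sequence.
os_rng = random.SystemRandom()
print([os_rng.randint(1, 2) for _ in range(10)])
```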
Randomization Procedures
Completely Randomized Design
(2 treatments, 2n individuals)

Make a list of all individuals

For each individual, pick a random number from 1 to 2 (odd
or even)

Assign the individual to treatment 1 if even, 2 if odd

When one treatment is assigned n individuals, stop
assigning more individuals to that treatment
Randomization Procedures
Completely Randomized Design
(pn individuals, p treatments)

Make a list of all individuals

For each individual, pick a random number from 1 to p
One way to do this is to get a random number of any
size and divide by p; the remainder R is between 0 and
(p – 1), so add 1 to the remainder to get R + 1
Assign the individual to treatment R + 1
Stop assigning individuals to any treatment after it gets n
individuals
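
The steps above can be sketched in Python. This is an illustration under stated assumptions, not a definitive implementation: the function name, the size of the "random number of any size" (below one million), and the rule of drawing again when a treatment is already full are all choices made for the example.

```python
import random


def completely_randomized(individuals, p, n, seed=None):
    """Assign p * n individuals to treatments 1..p, n per treatment."""
    individuals = list(individuals)
    if len(individuals) != p * n:
        raise ValueError("need exactly p * n individuals")
    rng = random.Random(seed)
    counts = {t: 0 for t in range(1, p + 1)}
    assignment = {}
    for person in individuals:
        while True:
            big_number = rng.randrange(10**6)  # a random number "of any size"
            treatment = big_number % p + 1     # remainder R in 0..p-1, then R + 1
            if counts[treatment] < n:          # skip treatments that already have n
                break
        counts[treatment] += 1
        assignment[person] = treatment
    return assignment


assignment = completely_randomized(range(1, 13), p=3, n=4, seed=7)
print(assignment)
```

When p does not divide the range of the random number evenly, some remainders are very slightly more likely than others; drawing directly with `rng.randint(1, p)` avoids that.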
Randomization Procedures
Randomized Block Design with 2 Treatments
(m blocks, 2n individuals per block)

Make a list of all individuals in the first block

For each individual, pick a random number from 1 to 2 (odd
or even)
Assign the individual to treatment 1 if even, 2 if odd
Stop assigning a treatment after it is assigned n individuals
in the block
Repeat the same process with every block
Randomization Procedures
Randomized Block Design with p Treatments
(m blocks, pn individuals per block)

Make a list of all individuals in the first block

For each individual, pick a random number from 1 to p
Assign the individual to the treatment corresponding to the
number
Stop assigning a treatment after it is assigned n individuals
in the block

Repeat the same process with every block
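
The block-by-block procedure can be sketched as follows. The function name, the dictionary layout of the roster, and the school and student labels are hypothetical; drawing again when a treatment is already full within the block is an assumption about how "stop assigning" is carried out.

```python
import random


def randomized_block(roster, p, n, seed=None):
    """roster maps block id -> list of p * n individuals.
    Returns {(block, individual): treatment}, with n per treatment in each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, members in roster.items():
        if len(members) != p * n:
            raise ValueError(f"block {block!r} needs exactly p * n individuals")
        counts = [0] * p                      # restart the counts in every block
        for person in members:
            treatment = rng.randint(1, p)
            while counts[treatment - 1] >= n:  # treatment full in this block
                treatment = rng.randint(1, p)
            counts[treatment - 1] += 1
            assignment[(block, person)] = treatment
    return assignment


# Hypothetical roster: 2 schools (blocks) with 6 students each
roster = {f"school{j}": [f"student{k}" for k in range(1, 7)] for j in (1, 2)}
print(randomized_block(roster, p=2, n=3, seed=5))
```

Because the counts restart in each block, every block contains every treatment, which is what makes treatment crossed with blocks.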


Randomization Procedures
Hierarchical Design with 2 Treatments
(m blocks per treatment, n individuals per block)

Make a list of all blocks

For each block, pick a random number from 1 to 2
Assign the block to treatment 1 if even, treatment 2 if odd
Stop assigning a treatment after it is assigned m blocks

Every individual in a block is assigned to the same
treatment
Randomization Procedures
Hierarchical Design with p Treatments
(m blocks per treatment, n individuals per block)

Make a list of all blocks

For each block, pick a random number from 1 to p
Assign the block to treatment corresponding to the number
Stop assigning a treatment after it is assigned m blocks

Every individual in a block is assigned to the same
treatment
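
In the hierarchical case the unit of randomization is the block, and individuals simply inherit their block's treatment. The following is a minimal Python sketch; the function names and the school and student labels are hypothetical.

```python
import random


def assign_blocks(block_ids, p, m, seed=None):
    """Assign each block to one of p treatments, m blocks per treatment."""
    block_ids = list(block_ids)
    if len(block_ids) != p * m:
        raise ValueError("need exactly p * m blocks")
    rng = random.Random(seed)
    counts = [0] * p
    block_treatment = {}
    for block in block_ids:
        treatment = rng.randint(1, p)
        while counts[treatment - 1] >= m:   # treatment already has m blocks
            treatment = rng.randint(1, p)
        counts[treatment - 1] += 1
        block_treatment[block] = treatment
    return block_treatment


def assign_individuals(roster, block_treatment):
    """Every individual receives the treatment assigned to his or her block."""
    return {(block, person): block_treatment[block]
            for block, members in roster.items() for person in members}


# Hypothetical roster: 6 schools (blocks) with 3 students each
roster = {f"school{j}": [f"student{k}" for k in range(1, 4)] for j in range(1, 7)}
block_treatment = assign_blocks(roster, p=2, m=3, seed=11)
assignment = assign_individuals(roster, block_treatment)
print(block_treatment)
```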
Sampling Models
Sampling Models in Educational Research

Sampling models are often ignored in educational
research

But

Sampling is where the randomness comes from in
social research

Sampling therefore has profound consequences
for statistical analysis and research designs
Sampling Models in Educational Research

Simple random samples are rare in field research

Educational populations are hierarchically nested:
• Students in classrooms in schools
• Schools in districts in states

We usually exploit the population structure to sample
students by first sampling schools

Even then, most samples are not probability samples, but
they are intended to be representative (of some
population)
Sampling Models in Educational Research

Survey research calls this strategy multistage (multilevel)
clustered sampling

We often sample clusters (schools) first, then individuals
within clusters (students within schools)

This is a two-stage (two-level) cluster sample

We might sample schools, then classrooms, then students

This is a three-stage (three-level) cluster sample


Precision of Estimates
Depends on the Sampling Model

Suppose the total population variance is $\sigma_T^2$ and the ICC is $\rho$

Consider two samples of size N = mn

A simple random sample or stratified sample
The variance of the mean is $\sigma_T^2/mn$

A clustered sample of n students from each of m schools
The variance of the mean is $(\sigma_T^2/mn)[1 + (n - 1)\rho]$
The inflation factor $[1 + (n - 1)\rho]$ is called the design effect
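
A small numeric illustration of the two-level design effect, as a Python sketch. The function names and the values m = 20, n = 25, ICC = 0.2, and total variance 1.0 are hypothetical, chosen only to show the size of the inflation.

```python
def design_effect(n, icc):
    """Two-level design effect 1 + (n - 1) * rho for clusters of size n."""
    return 1 + (n - 1) * icc


def variance_of_mean(total_variance, m, n, icc=0.0):
    """Variance of the mean for m clusters of n units each.
    With icc = 0 this reduces to the simple random sample value sigma_T^2 / (m n)."""
    return total_variance / (m * n) * design_effect(n, icc)


m, n, icc = 20, 25, 0.2                                  # hypothetical values
srs_variance = variance_of_mean(1.0, m, n)               # 1 / 500
clustered_variance = variance_of_mean(1.0, m, n, icc)    # inflated by the design effect
effective_n = m * n / design_effect(n, icc)              # sample size an SRS would need

print(design_effect(n, icc), srs_variance, clustered_variance, effective_n)
```

With these values the design effect is about 5.8, so the 500 clustered students carry roughly the information of 86 students drawn by simple random sampling.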
Precision of Estimates
Depends on the Sampling Model

Suppose the population variance is $\sigma_T^2$
School level ICC is $\rho_S$, class level ICC is $\rho_C$
Consider two samples of size N = mpn
A simple random sample or stratified sample
The variance of the mean is $\sigma_T^2/mpn$
A clustered sample of n students from p classes in m
schools
The variance is $(\sigma_T^2/mpn)[1 + (pn - 1)\rho_S + (n - 1)\rho_C]$
The three level design effect is $[1 + (pn - 1)\rho_S + (n - 1)\rho_C]$
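
The three-level formula can be evaluated the same way. This is a Python sketch with hypothetical values (p = 2 classes per school, n = 20 students per class, school ICC 0.15, class ICC 0.05); the function name is an illustration.

```python
def three_level_design_effect(p, n, icc_school, icc_class):
    """Design effect 1 + (p n - 1) rho_S + (n - 1) rho_C for
    n students in each of p classes per school."""
    return 1 + (p * n - 1) * icc_school + (n - 1) * icc_class


deff = three_level_design_effect(p=2, n=20, icc_school=0.15, icc_class=0.05)
print(deff)  # about 7.8 = 1 + 39 * 0.15 + 19 * 0.05

# With one class per school and no class-level ICC, the expression
# reduces to the two-level design effect 1 + (n - 1) * rho.
print(three_level_design_effect(p=1, n=20, icc_school=0.15, icc_class=0.0))
```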
Precision of Estimates
Depends on the Sampling Model

Treatment effects in experiments and
quasi-experiments are mean differences

Therefore precision of treatment effects
and statistical power will depend on the
sampling model
Sampling Models in Educational Research

The fact that the population is structured does not mean
the sample must be a clustered sample

Whether it is a clustered sample depends on:

• How the sample is drawn (e.g., are schools sampled first,
then individuals randomly within schools)

• What the inferential population is (e.g., is the inference to
the schools studied or to a larger population of schools)
Sampling Models in Educational Research

A necessary condition for a clustered sample is that it is
drawn in stages using population subdivisions
• schools then students within schools
• schools then classrooms then students

However, if all subdivisions in a population are present in
the sample, the sample is not clustered, but stratified
Stratification has different implications than clustering
Whether there is stratification or clustering depends on the
definition of the population to which we draw inferences
(the inferential population)
Sampling Models in Educational Research

The clustered/stratified distinction matters because it
influences the precision of statistics estimated from the
sample
If all population subdivisions are included in every
sample, there is no sampling (or exhaustive sampling) of
subdivisions
• therefore differences between subdivisions add no
uncertainty to estimates
If only some population subdivisions are included in the
sample, it matters which ones you happen to sample
• thus differences between subdivisions add to uncertainty
Inferential Population and Inference Models

The inferential population or inference model has
implications for analysis and therefore for the design of
experiments

Do we make inferences to the schools in this sample or to a
larger population of schools?

Inferences to the schools or classes in the sample are
called conditional inferences

Inferences to a larger population of schools or classes are
called unconditional inferences
Inferential Population and Inference Models

Note that the inferences (what we are estimating) are
different in conditional versus unconditional inference
models

• In a conditional inference, we are estimating the mean
(or treatment effect) in the observed schools

• In unconditional inference we are estimating the mean
(or treatment effect) in the population of schools from
which the observed schools are sampled

We are still estimating a mean (or a treatment effect) but
they are different parameters with different uncertainties
Fixed and Random Effects
When the levels of a factor (e.g., particular blocks
included) in a study are sampled and the
inference model is unconditional, that factor is
called random and its effects are called random
effects

When the levels of a factor (e.g., particular blocks
included) in a study constitute the entire
inference population and the inference model is
conditional, that factor is called fixed and its
effects are called fixed effects
Applications to Experimental Design

We will look in detail at the two most widely
used experimental designs in education

• Randomized blocks designs
• Hierarchical designs
Experimental Designs
For each design we will look at
• Structural Model for data (and what it means)
• Two inference models
– What does ‘treatment effect’ mean in principle
– What is the estimate of treatment effect
– How do we deal with context effects
• Two statistical analysis procedures
– How do we estimate and test treatment effects
– How do we estimate and test context effects
– What is the sensitivity of the tests
The Randomized Block Design
The population (the sampling frame)

We wish to compare two treatments

• We assign treatments within schools
• Many schools with 2n students in each
• Assign n students to each treatment in each
school
The Randomized Block Design
The experiment

Compare two treatments in an experiment

• We assign treatments within schools
• With m schools with 2n students in each
• Assign n students to each treatment in each
school
The Randomized Block Design
Diagram of the design

             Schools
Treatment    1    2    …    m
1                      …
2                      …

School 1 is the first column of the diagram; both
treatments occur within it
The Conceptual Model
The statistical model for the observation on the kth person in
the jth school in the ith treatment is

$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{ijk}$

where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\alpha\beta_{ij}$ is the difference between the average effect of
treatment i and the effect of that treatment in school j,
$\varepsilon_{ijk}$ is a residual
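
To make the structural model concrete, data can be generated from it term by term. The following is a minimal Python sketch; the function names, the normal distributions for the school, interaction, and residual terms, and all numeric values are assumptions made for illustration.

```python
import random
from statistics import mean


def simulate_randomized_block(m, n, mu, alpha, sd_school, sd_inter, sd_resid, seed=None):
    """Rows (i, j, k, y) with y = mu + alpha_i + beta_j + alphabeta_ij + eps_ijk."""
    rng = random.Random(seed)
    rows = []
    for j in range(1, m + 1):
        beta_j = rng.gauss(0, sd_school)        # school effect, shared by both treatments
        for i in (1, 2):
            ab_ij = rng.gauss(0, sd_inter)      # treatment-by-school interaction
            for k in range(1, n + 1):
                eps = rng.gauss(0, sd_resid)    # residual
                rows.append((i, j, k, mu + alpha[i] + beta_j + ab_ij + eps))
    return rows


def treatment_difference(rows):
    """Mean outcome under treatment 1 minus mean outcome under treatment 2."""
    y1 = [y for i, _, _, y in rows if i == 1]
    y2 = [y for i, _, _, y in rows if i == 2]
    return mean(y1) - mean(y2)


# Large school effects, but no interaction and no residual noise
rows = simulate_randomized_block(m=10, n=15, mu=50.0, alpha={1: 2.0, 2: -2.0},
                                 sd_school=5.0, sd_inter=0.0, sd_resid=0.0, seed=3)
print(treatment_difference(rows))
```

Because both treatments occur in every school, each school effect enters both treatment means and cancels in the difference, so the printed value equals the difference of the treatment effects (4.0) up to rounding.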
Inference Models
Two different kinds of inferences about effects

Unconditional Inference (Schools Random)
Inference to the whole universe of schools
(requires a representative sample of schools)

Conditional Inference (Schools Fixed)
Inference to the schools in the experiment
(no sampling requirement on schools)
Statistical Analysis Procedures
Two kinds of statistical analysis procedures

Mixed Effects Procedures (Schools Random)
Treat schools in the experiment as a sample
from a population of schools
(only sensible if schools are a sample)

Fixed Effects Procedures (Schools Fixed)
Treat schools in the experiment as a population
The Hierarchical Design
The universe (the sampling frame)

We wish to compare two treatments

• We assign treatments to whole schools
• Many schools with n students in each
• Assign all students in each school to the
same treatment
The Hierarchical Design
The experiment

We wish to compare two treatments

• We assign treatments to whole schools
• With 2m schools with n students in each
• Assign all students in each school to the
same treatment
The Hierarchical Design
Diagram of the experiment

             Schools
Treatment    1    2    …    m    m+1    m+2    …    2m
1
2

Treatment 1 schools: schools 1 to m
Treatment 2 schools: schools m+1 to 2m
The Conceptual Model
The statistical model for the observation on the kth person in
the jth school in the ith treatment is
$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{jk(i)} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{jk(i)}$
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\alpha\beta_{ij}$ is the difference between the average effect of
treatment i and the effect of that treatment in school j,
$\varepsilon_{jk(i)}$ is a residual
Or $\beta_{j(i)} = \beta_j + \alpha\beta_{ij}$ is a term for the combined effect of schools
within treatments
The Conceptual Model
The statistical model for the observation on the kth person in the jth
school in the ith treatment is

$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{jk(i)} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{jk(i)}$

Context Effects
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\alpha\beta_{ij}$ is the difference between the average effect of treatment i and the
effect of that treatment in school j,
$\varepsilon_{jk(i)}$ is a residual

or $\beta_{j(i)} = \beta_j + \alpha\beta_{ij}$ is a term for the combined effect of schools within
treatments
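
Data can also be generated from the hierarchical model to see what nesting does to the treatment comparison. The following is a minimal Python sketch; the function names, the normal distributions, and all numeric values are assumptions for illustration, and for simplicity the first m schools are simply labelled as treatment 1 (in a real experiment that assignment would itself be random).

```python
import random
from statistics import mean


def simulate_hierarchical(m, n, mu, alpha, sd_school, sd_resid, seed=None):
    """Rows (i, j, k, y) with y = mu + alpha_i + beta_j(i) + eps_jk(i)."""
    rng = random.Random(seed)
    rows = []
    for j in range(1, 2 * m + 1):
        i = 1 if j <= m else 2                  # each school is in one treatment only
        beta_ji = rng.gauss(0, sd_school)       # combined school-within-treatment effect
        for k in range(1, n + 1):
            eps = rng.gauss(0, sd_resid)        # residual
            rows.append((i, j, k, mu + alpha[i] + beta_ji + eps))
    return rows


def treatment_difference(rows):
    """Mean outcome under treatment 1 minus mean outcome under treatment 2."""
    y1 = [y for i, _, _, y in rows if i == 1]
    y2 = [y for i, _, _, y in rows if i == 2]
    return mean(y1) - mean(y2)


# No residual noise at all, yet the estimate still varies from sample to sample
diffs = [treatment_difference(
             simulate_hierarchical(m=10, n=15, mu=50.0, alpha={1: 2.0, 2: -2.0},
                                   sd_school=5.0, sd_resid=0.0, seed=s))
         for s in range(200)]
print(min(diffs), max(diffs))
```

Here the school effects do not cancel, because each school contributes to only one treatment mean; the estimates scatter around 4.0 even though the residual standard deviation is zero.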
