F. Lecture 4a - Randomization Why and When
Rema Hanna
Associate Professor of Public Policy
Evidence for Policy Design Harvard Kennedy School
August 28, 2012
Today's Agenda
Discuss randomized evaluations, which are considered the gold standard of evaluation methods
Overview of what a randomized evaluation is, what is required to do one, and how it can help inform policy
We will not be able to cover all the details of experimental methods; this is an overview
If you are interested in these methods, we can recommend additional reading material and put you in touch with researchers working on the policy questions that interest you
Social experiments
Randomization Part I
* Theoretical Overview of Randomized Evaluations
* Basic Structure of a Randomized Evaluation
* When should you conduct a Randomized Evaluation
* Choosing sample and sample size
Randomization Part II
* Randomization Methods
* Level/Unit of Randomization
* Constraints
Randomization Part I
Theoretical Overview of Randomized Evaluations
Unobservable Characteristics:
Many important variables cannot easily be defined or controlled for, e.g. motivation to engage in the program in the first place
Compare outcomes for the treatment and control groups to get the program impact
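To make the role of unobservables concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are simulated, "motivation" is a hypothetical unobserved trait that raises outcomes on its own, and the true program effect is set to 1.0. It compares the treatment-control difference when people select themselves into the program with the difference when a coin flip assigns them.

```python
import random
import statistics


def simulate(n=20000, true_effect=1.0, seed=0):
    """Return (self-selected estimate, randomized estimate) on simulated data."""
    rng = random.Random(seed)
    # Hypothetical unobservable: motivation raises outcomes even without the program.
    motivation = [rng.gauss(0, 1) for _ in range(n)]
    noise = [rng.gauss(0, 1) for _ in range(n)]

    def difference_in_means(treated):
        # Outcome = motivation + program effect (if treated) + noise.
        outcomes = [m + true_effect * t + e
                    for m, t, e in zip(motivation, treated, noise)]
        treat = [y for y, t in zip(outcomes, treated) if t]
        control = [y for y, t in zip(outcomes, treated) if not t]
        return statistics.mean(treat) - statistics.mean(control)

    # Self-selection: only the more motivated people enroll.
    self_selected = [m > 0 for m in motivation]
    # Randomization: assignment is independent of motivation.
    randomized = [rng.random() < 0.5 for _ in range(n)]
    return difference_in_means(self_selected), difference_in_means(randomized)


naive, experimental = simulate()
print(f"self-selected estimate: {naive:.2f}")
print(f"randomized estimate:    {experimental:.2f}")
```

In this setup the self-selected comparison overstates the effect, because the enrolled group is also the more motivated group, while the randomized comparison should land close to the assumed true effect of 1.0.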
Randomization Part I
Basic Structure of a Randomized Evaluation
Timeline of Evaluation
* Design Study
* Choose Sample
* Baseline (if necessary)
* Randomize
* Intervention
* Monitor Process to Reduce Evaluation Threats
* Conduct Follow-up Surveys
Randomization Part I
When should you conduct a Randomized Evaluation
Do an RCT: When it is ethically feasible (piloting new programs, resources are scarce or will be phased in, program impact is unclear)
Don't do an RCT: Crises, disasters, etc.
Do an RCT: When you are fairly confident that the program can be implemented logistically
Don't do an RCT: When the logistics still have kinks (often, we pre-pilot the logistics first)
Randomization Part I
Choosing sample and sample size
Statistics tells us that if we choose a representative sample of the population, we can get very close to the right answer
But, what is the right sample? And, how large does the sample need to be for us to get the right answer?
Choosing a sample
Want a sample representative of the population
In general: best way to get a representative sample is to choose a random sample of the population
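As a sketch of what choosing a random sample can look like in practice, the snippet below draws a simple random sample without replacement from a list of units. The household identifiers, the population size of 1,000, and the sample size of 100 are all made up for illustration, and `draw_random_sample` is a hypothetical helper name.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, sample_size)


# Hypothetical sampling frame: a list of every household in the population.
households = [f"household_{i}" for i in range(1, 1001)]
sample = draw_random_sample(households, 100, seed=42)
print(len(sample), "households sampled")
```

Every household has the same chance of being drawn, which is what makes the sample representative in expectation. Recording the seed lets others reproduce the draw.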
[Figure: Treatment Group and Control Group; Village 1 and Village 2]
Technically
When we randomize, account for two aspects:
* Sample size is large enough in terms of people, houses, etc.
* Random assignment is not clustered into a few units
Often, need to do random assignment of clusters (e.g. schools, villages, etc.)
In this case, need a large enough sample of clusters to randomize over, and we will often take a random sample of households within each village to obtain information on each cluster
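The two steps described above, randomizing whole clusters and then sampling households within each cluster, can be sketched as follows. The village names, the 40 villages of 50 households each, and the sample of 10 households per village are invented for illustration. `assign_clusters` and `sample_households` are hypothetical helper names, and the even treatment/control split is just one possible design choice.

```python
import random


def assign_clusters(clusters, seed=None):
    """Randomly assign half of the clusters to treatment and the rest to control."""
    rng = random.Random(seed)
    shuffled = list(clusters)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {cluster: ("treatment" if i < half else "control")
            for i, cluster in enumerate(shuffled)}


def sample_households(households_by_cluster, per_cluster, seed=None):
    """Draw a random sample of households to survey within each cluster."""
    rng = random.Random(seed)
    return {cluster: rng.sample(households, min(per_cluster, len(households)))
            for cluster, households in households_by_cluster.items()}


# Hypothetical data: 40 villages with 50 households each.
villages = {f"village_{v}": [f"v{v}_hh{h}" for h in range(1, 51)]
            for v in range(1, 41)}

assignment = assign_clusters(villages, seed=1)
surveyed = sample_households(villages, per_cluster=10, seed=2)
print(sum(arm == "treatment" for arm in assignment.values()), "treatment villages")
```

Assignment happens at the village level, so every household in a village shares the same treatment status. The household sample only determines who is surveyed.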
Mean(treatment) - Mean(control) = Effect size
We then test whether this is the real effect or just an artifact of this particular sample. This test of statistical significance depends on how messy the data is, i.e. the variance
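A minimal sketch of this comparison is shown below. It computes the difference in means, a standard error built from the two group variances, and a two-sided p-value. Two assumptions to flag: the outcome values are made up, and the p-value uses a normal approximation, which is only reasonable in large samples (a t-test would be the usual choice for groups this small).

```python
import math
from statistics import NormalDist, mean, variance


def difference_in_means(treatment, control):
    """Return (effect, standard error, two-sided p-value)."""
    effect = mean(treatment) - mean(control)
    # Noisier data (larger variance) or smaller groups give a larger standard error.
    se = math.sqrt(variance(treatment) / len(treatment)
                   + variance(control) / len(control))
    z = effect / se
    # Normal approximation to the sampling distribution (an assumption).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return effect, se, p_value


# Hypothetical outcome data for the two groups.
treatment_outcomes = [12, 14, 11, 15, 13, 16]
control_outcomes = [10, 11, 9, 12, 10, 11]

effect, se, p_value = difference_in_means(treatment_outcomes, control_outcomes)
print(f"effect = {effect:.2f}, standard error = {se:.2f}, p-value = {p_value:.4f}")
```

For the same effect, a larger variance inflates the standard error and makes it harder to distinguish the effect from sampling noise.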
To summarize
A smaller sample size is sufficient when:
The treatment effect is large and the variance is small
To calculate what is small and large, we use a technique called a power calculation
At minimum: general rule of at least 50 per group, or it is not worth doing
But often a larger sample than that is required
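As a rough illustration of a power calculation, the sketch below uses a common textbook approximation for comparing two means with equal group sizes and a common standard deviation: n per group = 2 * ((z_alpha + z_power) * sd / effect)^2. The formula choice, the defaults of 5% significance and 80% power, and the helper name `sample_size_per_group` are assumptions for this example, not something specified in the lecture.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, std_dev, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided test of a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for desired power
    n = 2 * ((z_alpha + z_power) * std_dev / effect_size) ** 2
    return math.ceil(n)


# Large effect relative to the noise: fewer people needed.
print(sample_size_per_group(effect_size=0.5, std_dev=1.0))
# Small effect relative to the noise: many more people needed.
print(sample_size_per_group(effect_size=0.2, std_dev=1.0))
```

This applies to individual-level randomization. When whole clusters are randomized, the required sample is typically larger, because outcomes within a cluster tend to be correlated.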
Conclusion
Randomization provides the best estimate of the counterfactual and the true program effect
Does this by eliminating initial selection bias:
creates treatment and control groups that are similar on observables and unobservables
A large sample size is necessary to have similar treatment and control groups.
The definition of large depends on the effect size and the variance of the outcome variables, so we need to conduct power calculations to get the right size.
Next Lecture
Will continue discussing the details of how to conduct a randomized evaluation