Research Design Notes
content validity: do the individual questions in the test help to measure the construct?
• i.e., are the items in the test getting at what we are measuring with the intervention
• e.g. #1, If a scale is supposed to measure panic attacks, does it measure the cognitive and
physiological aspects of panic attacks or just one or the other?
• e.g. #2, If an instrument is measuring poor eating behaviors, does an item ask instead about
current weight (which may not necessarily indicate poor eating behavior)?
construct validity: is the test as a whole measuring what it’s supposed to measure (the construct)?
• i.e., did the intervention work and, if so, what parts of it are responsible for that?
• e.g. #1, Does a test really measure depression or maybe just a low mood?
• e.g. #2, Is CBT responsible for the decrease in depression, or is it just the attention that the
client is paying to therapy that is decreasing depression?
• It could be the case that just one or two parts of an intervention are responsible for a change
while the remaining parts are not.
o two kinds of construct validity:
▪ convergent validity: looks at two tests that are supposed to be measuring the
same construct and determines how similar they are
▪ divergent validity: looks at two tests that are supposed to be measuring
different constructs and determines how similar they are
internal validity: to what extent did the intervention (and not something else) cause the
outcome
You should do a baseline assessment prior to the experiment to make sure you know what changes are
taking place.
Covariate analyses involve controlling for a variable (co + variate), like age, gender, etc.
You can evaluate variability for groups but also for individuals. For example, you can get a baseline from
someone with a TBI and later see how they’ve progressed.
3 criteria for causal inference (telling whether one thing affected another):
1. timing: the cause comes before the effect
2. statistical association: the cause and effect “go together” statistically
3. nonspuriousness: it’s the cause and not some other factor (aka, confounding factor) that is
responsible for the effect
a. do this by ruling out confounding influences
b. we can control for these variables by adding covariates; we also account for this
statistically through confidence intervals and standard errors of measurement
3 criteria for a threat to internal validity (meaning internal validity is questionable if all 3 of these
criteria apply to a confounding variable):
1. plausibility: if it is plausible that the threat/confounding variable was active during the study
a. something is a threat if it impacts the results and varies between groups
b. If random assignment is NOT used for an experiment, a selection threat is plausible.
2. capability: if there is reason to believe the threat/confounding variable is capable of bringing
about the results you saw
3. systematic variation by group: if the threat/confounding influence or variable is likely to be more
pronounced in one intervention group compared with another intervention group
a. different from plausibility because it occurs during the study as opposed to before/at
baseline
b. e.g., if you’re measuring anxiety of participants while they’re watching violent films and
have two intervention groups, one of them being a group made up of Americans and the
other being a group made up of Italians
c. things can come about during the intervention that have an impact on the outcome by
making the groups vary. For example, participants may experience stress, loss, etc.
Look at the impact factor of a journal when selecting an article. The higher the impact factor of a
journal, the more credibility it has.
• anything above 10 is a good impact factor
• Full Text Finder
o is for finding an article specifically when you have a citation
• Ulrichweb
o used to check if a journal is peer-reviewed/refereed
o peer reviewed a.k.a. refereed
• PRISMA: can see how many journal articles have been written on a specific topic
• Web of Science: can see the reputability score of journals
• Thesaurus
• APA PsycInfo > My Research: for saving searches and setting alerts so you’re notified when new
articles related to a search are added; can create a folder per kind of search, class, etc.
regression to the mean: extreme scores more likely to be closer to the mean upon re-testing
• a.k.a., statistical regression
• a.k.a., regression artifact
In layman’s terms:
• “Was there a statistical interaction between intervention condition and participant sex in
predicting the outcome?” → “Did the intervention’s effect on the outcome differ for males vs. females?”
• “interactions of the causal association” or “intervention effect” → “threats to external validity”
Effect size
- Effect size: what amount of changes are accounted for by the intervention
o e.g., What amount of variability in the decrease in depression is accounted for by the
intervention
- 20% is small effect
- 80% is large effect
- Researchers will sometimes tell you the p value and not tell you the effect size.
o If the sample size is very large, there is a chance of statistical significance (a low p
value) without there actually being a meaningful effect.
o A low effect size doesn’t necessarily mean there’s no effect; a confounding variable
may be masking it.
o If you have a large sample size and various interventions, you can start looking for those
interventions that have a p < .01 instead of .05. This is because, with a large sample size,
you will likely reach statistical significance at .05 even without there being a
meaningful effect.
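The large-sample caution above can be demonstrated with a simulation. A minimal sketch, assuming a made-up scenario where the true group difference is trivially small (0.05 SD) but the sample is huge:

```python
# With a very large n, a tiny effect still reaches p < .05,
# which is why the effect size should be reported alongside p.
import math
import random
import statistics

random.seed(42)
n = 50_000                                       # per group
control = [random.gauss(0.0, 1.0) for _ in range(n)]
treated = [random.gauss(0.05, 1.0) for _ in range(n)]  # tiny true effect

m1, m2 = statistics.fmean(control), statistics.fmean(treated)
s1, s2 = statistics.stdev(control), statistics.stdev(treated)

d = (m2 - m1) / math.sqrt((s1**2 + s2**2) / 2)   # Cohen's d (effect size)
z = (m2 - m1) / math.sqrt(s1**2 / n + s2**2 / n)
p = math.erfc(abs(z) / math.sqrt(2))             # two-sided z-test p value

print(f"d = {d:.3f}, p = {p:.3g}")               # p is tiny, yet d is trivial
```

The p value comes out far below .05 even though the standardized effect is far below even a "small" effect, illustrating why significance alone is not enough.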
For a true experimental study, the main component is randomly assigned groups. Random assignment
helps ensure that pre-existing differences between groups do not bias the results.
What makes a study experimental vs. quasi-experimental is how people end up in groups: if there is any
criterion (rather than chance) that you use to select which group to assign people to, the study is
quasi-experimental. The importance of experimental vs. quasi-experimental is that the latter limits
generalizability. The study is still good but cannot be generalized to larger populations.
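Random assignment itself is simple to operationalize. A minimal sketch, assuming hypothetical participant IDs and two conditions:

```python
# Chance, not any criterion, decides group membership -- the defining
# feature of a true experiment per the notes above.
import random

random.seed(7)
participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs
random.shuffle(participants)                         # randomize order
half = len(participants) // 2
treatment, control = participants[:half], participants[half:]

print(treatment)
print(control)
```

If assignment instead used any rule (e.g., sign-up order or a screening score), the design would be quasi-experimental.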
Case studies
Case studies are guiding tools, not conceptual frameworks. They “set the stage” for experimental
studies, not the other way around.
- e.g., Before Beck made the CBT manual, he applied the principles while treating clients and
made case studies.
level changes: looking at the effect of the intervention at different time periods (e.g., change in
outcome from last observation in phase 1 to first observation in phase 2)
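The phase-1-to-phase-2 definition above can be computed directly. A minimal sketch, assuming made-up single-case scores (e.g., weekly symptom ratings):

```python
# Level change = first observation of phase 2 (intervention)
# minus last observation of phase 1 (baseline).
baseline = [12, 11, 13, 12]      # phase 1 (A) -- hypothetical scores
intervention = [8, 7, 7, 6]      # phase 2 (B) -- hypothetical scores

level_change = intervention[0] - baseline[-1]
print(level_change)              # -4 (an immediate drop at phase change)
```

A negative value here would indicate an immediate drop in the outcome when the intervention began.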
Homogeneity of variance: assumption underlying both t tests and F tests (analyses of variance,
ANOVAs) in which the population variances (i.e., the distribution, or “spread,” of scores around the
mean) of two or more samples are considered equal.
• test of homogeneity (e.g., Levene’s test): if you get a significant result, this is not good; it
means the groups’ variances differ, so the homogeneity assumption is violated
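A quick informal screen for this assumption is the variance ratio between groups. A minimal sketch, assuming made-up group scores (this rule-of-thumb ratio is not a formal test like Levene’s; it just flags obviously unequal spreads):

```python
# Variance-ratio screen: a ratio near 1 suggests similar spreads;
# a large ratio suggests the equal-variance assumption may be violated.
import statistics

group1 = [4, 5, 6, 5, 4, 6]      # hypothetical scores, tight spread
group2 = [1, 9, 2, 10, 1, 9]     # hypothetical scores, wide spread

v1, v2 = statistics.variance(group1), statistics.variance(group2)
ratio = max(v1, v2) / min(v1, v2)
print(round(ratio, 2))           # much greater than 1 -> unequal spreads
```

A formal test (run in your stats package) should still be used before trusting the t test or ANOVA.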
Make sure that the distribution shows no marked kurtosis (e.g., leptokurtic or platykurtic) or skew (e.g., negative or positive)
skew: the TAIL tells the TALE (tells you if it is positive or negative)
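The “tail tells the tale” mnemonic can be checked numerically. A minimal sketch using the population third-moment formula for skewness (the data are made-up illustrative scores):

```python
# Positive skewness -> long RIGHT tail; negative -> long LEFT tail.
import statistics

def skewness(xs):
    """Third standardized moment of xs (population formula)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    n = len(xs)
    return sum((x - m) ** 3 for x in xs) / (n * s ** 3)

print(skewness([1, 2, 2, 3, 10]))   # positive: tail stretches right
print(skewness([1, 8, 8, 9, 10]))   # negative: tail stretches left
```

The sign of the result matches the direction of the tail, which is exactly what the mnemonic says.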
logistic regression: regression analysis in which you predict a categorical (usually binary) dependent
variable and determine how much of its variance the predictors account for
• predictors can be entered in blocks (hierarchical entry), where each block accounts for individual reasons for variance
• like a more complex version of regression analysis
multiple predictor regression models: address the same question but examine how much each individual
variable accounts for the variance in the dependent variable
• like a more in-depth analysis of the results