
Research Design

Construct validity is NOT a kind of internal validity


• Internal validity is focused on the structure of a study and the accuracy of conclusions drawn about a cause-and-effect relationship
• Construct validity answers questions about the measurement of a concept or construct

content validity: do the individual questions in the test help to measure the construct?
• i.e., are the items in the test getting at what we are measuring with the intervention
• e.g. #1, If a scale is supposed to measure panic attacks, does it measure the cognitive and
physiological aspects of panic attacks or just one or the other?
• e.g. #2, If an instrument is measuring poor eating behaviors, does an item ask instead about
current weight (which may not necessarily indicate poor eating behavior)?
construct validity: is the test as a whole measuring what it’s supposed to test (the construct)
• i.e., did the intervention work and, if so, what parts of it are responsible for that?
• e.g. #1, Does a test really measure depression or maybe just a low mood?
• e.g. #2, Is CBT responsible for the decrease in depression, or is it just the attention that the
client is paying to therapy that is decreasing depression?
• It could be the case that just one or two parts of an intervention are responsible for a change
whereas the last part is not.
o two kinds of construct validity:
▪ convergent validity: looks at two tests that are supposed to be measuring the
same construct and determines how similar they are
▪ divergent validity: looks at two tests that are supposed to be measuring different constructs and determines how dissimilar they are (the correlation between them should be low; see the sketch below)
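
A minimal sketch of checking both ideas (hypothetical scores; Python's numpy, whose corrcoef function computes correlations): a high correlation between two measures of the same construct supports convergent validity, and a low correlation between measures of different constructs supports divergent validity.

import numpy as np

# Hypothetical scores for five participants on three measures
depression_a = np.array([10, 25, 18, 30, 12])  # depression measure A
depression_b = np.array([9, 27, 16, 28, 11])   # depression measure B (same construct)
iq = np.array([105, 98, 120, 101, 110])        # a different construct

print(np.corrcoef(depression_a, depression_b)[0, 1])  # expect high r -> convergent
print(np.corrcoef(depression_a, iq)[0, 1])            # expect low |r| -> divergent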

Threats to construct validity:


• participant expectations of benefits
• attention (of the participants to the intervention)
• confounding therapists with treatments
o i.e., one administrator being more skilled than another administrator
o attrition rate can signal this
• therapist expectancy

internal validity: the extent to which the intervention (and not something else) caused the outcome

x = the independent variable


y = the dependent variable

You should do a baseline assessment prior to the experiment to make sure you know what changes are
taking place.

Covariate analyses involve controlling for a variable (co + variate), like age, gender, etc.
You can evaluate variability for groups but also for individuals. For example, you can get a baseline from
someone with a TBI and later see how they’ve progressed.

3 criteria for causal inference (for telling whether one thing affected another):
1. timing: the cause comes before the effect
2. statistical association: the cause and effect “go together” statistically
3. nonspuriousness: it’s the cause and not some other factor (aka, confounding factor) that is
responsible for the effect
a. do this by ruling out confounding influences
b. we can control for these variables by adding covariates; we also account for this
statistically through confidence intervals and standard errors of measurement (see the sketch below)
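
A minimal sketch of controlling for a covariate, as in 3b (hypothetical data and variable names; Python with the statsmodels formula API):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: outcome y, intervention x (0 = control, 1 = intervention),
# and age as a covariate
df = pd.DataFrame({
    "y":   [4, 6, 5, 9, 8, 10],
    "x":   [0, 0, 0, 1, 1, 1],
    "age": [23, 35, 41, 25, 37, 44],
})

# Controlling for age: the coefficient on x estimates the intervention effect net of age
model = smf.ols("y ~ x + age", data=df).fit()
print(model.params)      # coefficients
print(model.conf_int())  # confidence intervals help rule out spuriousness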

3 criteria for diagnosing whether a confounding variable threatens internal validity (internal validity is questionable if all 3 of these criteria apply):
1. plausibility: if it is plausible that the threat/confounding variable was active during the study
a. something is a threat if it impacts the results and varies between groups
b. If random assignment is NOT used for an experiment, a selection threat is plausible.
2. capability: if there is reason to believe the threat/confounding variable is capable of bringing
about the results you saw
3. systematic variation by group: if the threat/confounding influence or variable is likely to be more pronounced in one intervention group than in another
a. different from plausibility because it occurs during the study as opposed to before/at baseline
b. e.g., if you’re measuring anxiety of participants while they’re watching violent films and
have two intervention groups, one of them being a group made up of Americans and the
other being a group made up of Italians
c. things can come about during the intervention that have an impact on the outcome by
making the groups vary. For example, participants may experience stress, loss, etc.

Statistical significance is NOT clinical significance.


• As sample size increases, the chance of statistical significance also increases (see the sketch below).
• If a result has clinical significance, there is a high chance it also has statistical significance.
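
A minimal sketch of the sample-size point (simulated data; a trivially small true difference, d = 0.02, still comes out statistically significant with a million cases per group):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# A tiny true difference that is almost certainly not clinically meaningful
control = rng.normal(0.00, 1, 1_000_000)
treated = rng.normal(0.02, 1, 1_000_000)

t, p = stats.ttest_ind(treated, control)
print(p)  # typically far below .05, purely because n is huge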

external validity: validity of the results in the real world


• measured by effect size
• ecological validity: kind of external validity that accounts for confounding factors in the
environment/other factors that may have led to the results

False Positive = Type I error


False Negative = Type II error
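
A minimal sketch of a Type I error (simulated data: both groups come from the same population, so every "significant" result is a false positive):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
false_positives = 0
for _ in range(1000):
    a = rng.normal(0, 1, 30)  # both groups are drawn from the SAME population,
    b = rng.normal(0, 1, 30)  # so any significant result is a false positive
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1
print(false_positives / 1000)  # close to 0.05, the Type I error rate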

Look at the impact factor of the journal when selecting an article. The higher the impact factor of a journal, the more credibility it has.
• anything above 10 is a good impact factor

CBT has longitudinal (long-term) positive effects.


Diagnosing threats to internal validity:
1. selection: those selected for the groups differ from one another before the intervention is
implemented
2. timing: when participants or groups join the study
3. history: events occurring during (not before) the time of tx that could account for the tx effect
a. e.g., testing separate military bases of soldiers for anxiety, but one group was deployed
to Iraq; would control by instead doing random selection with equal number of
individuals in groups from each base
4. maturation: changes that occur naturally over time
a. e.g., if testing to see if tx for dealing with recent trauma is efficacious, you could say the
negative effects of the trauma would naturally decrease as time increases
5. testing: testing individuals can affect results (because of practice, familiarity with the questions,
etc.)
6. instrumentation: changes in the measuring device could affect results
a. e.g., A new instrument or tactic adopted by a hospital to track the number of car
accidents or to treat these accidents could affect results of a study looking at healthcare
results for car accident victims.
i. The instrument being used incorrectly could also affect results.

Library Services offered:


• Academic Writer (formerly APA Style CENTRAL)
• LinkedIn Learning
• Course Guide
• SharkSearch:
o is for undergrads; do not use
o use these options instead:

• Full Text Finder
o is for finding an article specifically when you have a citation
• Ulrichweb
o used to check if a journal is peer-reviewed/refereed
o peer reviewed a.k.a. refereed
• Prisma: can see how many journal articles have been written on a specific topic
• Web of Science: can see the reputability score of journals
• Thesaurus
• APA PsycInfo > My Research: for saving searches and setting alerts so you’re notified if new
articles are added relative to something; can create a folder per kind of search, class, etc.

Other resources:

• APA Style Blog: https://apastyle.apa.org/blog


o gives you specific info. on APA format
• Owl Citation Generator: https://owl.purdue.edu/owl/research_and_citation/resources.html

Difference between literature review and systematic review:


Systematic reviews not only look at numerous studies; they also have a specific methodology for selecting and reporting on those studies, and they look for commonalities/conclusions across them.

regression to the mean: extreme scores more likely to be closer to the mean upon re-testing
• a.k.a., statistical regression
• a.k.a., regression artifact
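
A minimal sketch of regression to the mean (simulated test-retest scores; people who scored extremely high at time 1 land closer to the population mean at time 2):

import numpy as np

rng = np.random.default_rng(2)
true_score = rng.normal(100, 10, 10_000)
test1 = true_score + rng.normal(0, 5, 10_000)  # observed score = true score + noise
test2 = true_score + rng.normal(0, 5, 10_000)  # independent noise on the retest

extreme = test1 > 120         # people who scored extremely high at time 1
print(test1[extreme].mean())  # well above 120 by construction
print(test2[extreme].mean())  # lower on the retest: closer to the mean of 100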

attrition: drop out


• more likely when there are repeated observations

In layman’s terms:
• “Was there a statistical interaction between intervention condition and participant sex in
predicting the outcome?” → “Was there a difference in outcome for males vs. females?”
• “interactions of the causal association” or “intervention effect” → “threats to external validity”

Effect size
- Effect size: how much of the change is accounted for by the intervention
o e.g., what amount of variability in the decrease in depression is accounted for by the intervention
- 0.2 is a small effect
- 0.8 is a large effect (Cohen’s conventions for d)
- Researchers will sometimes tell you the p value and not tell you the effect size.
o If the sample size is very large, there is a chance of statistical significance (low p) without there actually being a meaningful effect.
o If the effect size is low, that doesn’t necessarily mean there’s no effect; there may be a confounding variable impacting it.
o If you have a large sample size and various interventions, you can start looking for those
interventions that have a p < .01 instead of .05. This is because, with a large sample size,
it’s likely you will be able to reach statistical significance of .05 easily without there
being an effect.
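
A minimal sketch of computing one common effect size, Cohen's d, on hypothetical group scores (the standardized mean difference, using a pooled SD):

import numpy as np

def cohens_d(a, b):
    # pooled-SD version of Cohen's d
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1) +
                  (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

control = np.array([20, 22, 19, 24, 21], dtype=float)  # hypothetical depression scores
treated = np.array([15, 17, 14, 18, 16], dtype=float)

print(cohens_d(control, treated))  # report this alongside the p value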

Improving Quasi-experimental designs:

Threats to internal validity:


- Selection by maturation
o e.g., employees in the intervention stores were making better gains in sales practices in general.
- Differential statistical regression
o Intervention was implemented in stores whose most recent sales trends were
particularly low.
▪ Things to consider regarding possible differences between groups prior to the intervention:
o Where did the study get participants from?
o What was the baseline prior to testing?
o Was there a pre-test?
- Local history (or selection by history)
o Something other than the sales program was responsible for increased sales (e.g.
increased foot traffic in intervention stores).

4 kinds of designs associated with quasi-experimental designs:


1. Non-equivalent control group design
2. Non-equivalent dependent variables design
3. Brief time series design
4. Removed and repeated treatments

For a true experimental study, the main component is randomly assigned groups. Random assignment helps ensure that pre-existing differences between groups do not bias the results.

In quasi-experimental designs, there is no random assignment. The distinguishing factor for experimental vs. quasi-experimental is whether individuals are assigned to groups based on things that cannot be changed (e.g., IQ or personality traits).
- e.g., You are doing an experiment to see if a specific SAT-tutoring program improves high school
students’ scores. You have one group that receives 3 sessions, one that receives 6 sessions, one
that receives 12 sessions, and one that receives 0 sessions (the control).
o if you take 9-12th graders and divide them randomly between groups: it is a true
experimental study
o if you assign 9th graders to the group receiving 3 sessions, the 10th graders to the group
receiving 6 sessions, etc: it is a quasi-experimental study
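
A minimal sketch of the random-assignment version of this example (hypothetical student IDs; Python's standard random module):

import random

# Hypothetical pool of 9th-12th graders
students = [f"student_{i}" for i in range(120)]
random.seed(42)
random.shuffle(students)  # random order, independent of grade level

# Divide randomly into the four session conditions -> a true experiment
groups = {sessions: students[i * 30:(i + 1) * 30]
          for i, sessions in enumerate([0, 3, 6, 12])}
print({sessions: len(group) for sessions, group in groups.items()})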

More generally, if there is any criterion you use to decide which group to assign people to, the design is quasi-experimental.

The importance of experimental vs. quasi-experimental is that quasi-experimental designs limit generalizability. The study is still good, but its results cannot be generalized to larger populations.

Case studies

Case studies are guiding tools, not conceptual frameworks. They “set the stage” for experimental
studies, not the other way around.
- e.g., Before Beck made the CBT manual, he applied the principles while treating clients and
made case studies.

Usually a case study involves an outlier sample.

Limitations of case studies:


- increases likelihood of confirmation bias
- difficult to rule out alternative explanations for what happened
- hard to generalize to other cases

Kinds of case studies:


- Single case studies:
o often called A-B-A-B (ABAB) designs
o very specific hypotheses; less focus on formal inferential statistics
o These studies are more likely to have repeated baselines, which you get through repeated observations.
▪ You want multiple baselines because:
• there is just usually one person and there could have been a
confounding factor at the time you measured that could have
influenced the baseline for that individual
• they help us understand maturational trends in the absence of the
intervention
o Common kinds of single case studies:
▪ A-B-A-B designs (A = absence of intervention/a baseline; B = intervention)
• shows the baseline being measured multiple times
▪ Multiple baseline designs
• Doesn’t necessarily mean you are getting a baseline at multiple times.
You can be measuring different behaviors instead.
▪ Changing criterion designs
• usually takes existing research that determines that intervention X is
effective and instead makes a new research study that examines if
intervention Y can be added or used instead and still have the same
effect
o if a case study, it is not usually based on existing research
o criteria (can be exclusion/inclusion criteria, the intervention, the dose of the intervention, etc.) are usually sequentially and systematically modified

level changes: looking at the effect of the intervention at different time periods (e.g., change in
outcome from last observation in phase 1 to first observation in phase 2)

slope: change in tx outcome over time
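
A minimal sketch of computing both on hypothetical single-case phase data (numpy's polyfit fits the within-phase trend line):

import numpy as np

# Hypothetical ABAB data: one baseline (A) phase and one intervention (B) phase
phase_a = np.array([8, 9, 8, 10])  # baseline observations
phase_b = np.array([6, 5, 4, 3])   # intervention observations

# Level change: first observation of phase B minus last observation of phase A
print(phase_b[0] - phase_a[-1])

# Slope: fitted change in the outcome per observation within the intervention phase
slope = np.polyfit(np.arange(len(phase_b)), phase_b, 1)[0]
print(slope)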

Homogeneity of variance: assumption underlying both t tests and F tests (analyses of variance,
ANOVAs) in which the population variances (i.e., the distribution, or “spread,” of scores around the
mean) of two or more samples are considered equal.
• test of homogeneity (e.g., Levene’s test): if you get a significant result, this is not good (it means the homogeneity assumption is violated; see the sketch below)
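
A minimal sketch of Levene's test on hypothetical groups (scipy.stats.levene):

import numpy as np
from scipy import stats

group_a = np.array([4.0, 5.1, 4.8, 5.5, 4.9])  # tightly clustered scores
group_b = np.array([1.0, 9.0, 2.5, 8.8, 3.1])  # much more spread out

stat, p = stats.levene(group_a, group_b)
print(p)  # a significant result (p < .05) means the equal-variance assumption is violated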

Make sure that there is no marked kurtosis (e.g., leptokurtic or platykurtic) or skew (negative or positive).
skew: the TAIL tells the TALE (the direction of the tail tells you whether the skew is positive or negative)
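
A minimal sketch of checking both (simulated right-skewed data; scipy's skew and kurtosis functions):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
scores = rng.exponential(scale=2.0, size=500)  # right-skewed simulated data

print(stats.skew(scores))      # > 0: positive skew (the tail points right)
print(stats.kurtosis(scores))  # excess kurtosis; roughly 0 for a normal distribution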

logistic regression: analyses in which you try to determine the amount of variance in a dependent variable that is accounted for by the predictors
• predictors are entered in blocks, where each block accounts for a distinct source of variance
• like a more complex version of regression analysis

r2 (coefficient of determination): represents the proportion of variance in a dependent variable that's explained by an independent variable or variables in a regression model

multiple predictor regression models: look at the same thing as logistic regression, but also at how much each individual variable accounts for the variance in the dependent variable
• like a more in-depth analysis of the results (see the sketch below)
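
A minimal sketch tying these notes together (hypothetical data; ordinary least squares is used for illustration, since the notes describe variance explained): predictors are entered in blocks, and the change in r2 across blocks shows how much each block adds.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: depression outcome, CBT dose, and an age covariate
df = pd.DataFrame({
    "depression": [20, 18, 15, 12, 10, 8, 16, 11],
    "sessions":   [0, 2, 4, 6, 8, 10, 3, 7],
    "age":        [25, 31, 22, 40, 35, 28, 45, 30],
})

block1 = smf.ols("depression ~ age", data=df).fit()             # block 1: covariate only
block2 = smf.ols("depression ~ age + sessions", data=df).fit()  # block 2: add intervention

print(block1.rsquared)                    # variance explained by age alone
print(block2.rsquared)                    # variance explained by age + sessions
print(block2.rsquared - block1.rsquared)  # what the intervention block adds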
