Correlational Methods in Psychological Statistics
Psychological research often examines how variables co-vary. Correlation coefficients quantify the degree
and direction of association between two variables. In the broadest sense, “correlation…is a measure of an
association between variables” 1 . If one variable tends to increase when another increases, the
correlation is positive; if one increases when the other decreases, it is negative. Correlation does not imply
causation, only association. A commonly cited overview notes that for continuous variables with a linear
relationship, the Pearson product-moment correlation (Pearson’s r) is used, whereas for non-normal or
ordinal data, the Spearman rank correlation (Spearman’s rho) may be applied 2 . Each type of correlation
below is defined formally, explained in plain language, and accompanied by an illustrative example.
• Pearson’s r
Definition (Academic): The Pearson product-moment correlation coefficient (r) measures the
strength and direction of the linear relationship between two continuous, interval-scaled variables
1 . Formally, r is the covariance of the two variables divided by the product of their standard
deviations (often computed by a summation formula). r ranges from –1.0 to +1.0; r = 0 indicates no
linear association, whereas r = +1.0 or –1.0 indicate perfect positive or negative linear relationships,
respectively. It assumes both variables are normally distributed and measured at least at the interval
level.
Lay Explanation: Pearson’s r tells us how closely two measurements tend to rise and fall together.
For example, if we measure students’ height (in cm) and their weight (in kg), Pearson’s r would
capture the extent to which taller students tend to weigh more (a positive correlation). A value of r =
0.80 would indicate a strong positive association. A value near 0 would mean no consistent linear
pattern (e.g., height and exam score might have r ≈ 0).
Example: Imagine a study measuring hours studied (X) and exam scores (Y) among 30 students. If
Pearson’s r = 0.75 (p < .01), we interpret this as a fairly strong positive linear relationship: higher
study time tends to be associated with higher scores. This indicates students who study more tend
to score higher, though it doesn’t prove studying causes better scores (it just shows association).
Citation: As Schober et al. note, the Pearson r “is typically used for jointly normally distributed data”
when assessing linear correlation 2 .
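To make the computation concrete, here is a minimal Python/SciPy sketch; the hours-studied and exam-score values are invented purely to mirror the example above.
```python
# Minimal sketch: Pearson's r with SciPy (illustrative data, not from a real study)
import numpy as np
from scipy import stats

hours_studied = np.array([2, 4, 5, 7, 8, 10, 12, 3, 6, 9])          # X: hours studied
exam_scores   = np.array([55, 60, 62, 70, 75, 82, 90, 58, 68, 80])  # Y: exam scores

r, p_value = stats.pearsonr(hours_studied, exam_scores)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")  # r near +1 indicates a strong positive linear relationship
```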
• Spearman’s ρ (rho)
Definition (Academic): Spearman’s rank correlation coefficient (ρ) measures the strength and
direction of a monotonic (order-preserving) relationship between two variables, using their ranks. It
is a nonparametric analog of Pearson’s r. Spearman’s ρ does not assume normal distributions; it only
assumes an ordinal or continuous scale. Like Pearson’s r, ρ ranges from –1 to +1. It is defined as the
Pearson correlation of the rank-ordered values of each variable.
Lay Explanation: Spearman’s rho tells us if higher ranks on one variable tend to go with higher
ranks on the other, without assuming exact distances. For example, imagine we rank 50 people by
satisfaction with their job (1 = least satisfied, 50 = most) and also rank their self-reported stress level
(1 = least stressed, 50 = most). Spearman’s ρ close to +1 would mean people who rank high in
satisfaction also rank high in stress (or vice versa if negative). It is robust to outliers and works with
ordinal data.
Example: Suppose a researcher surveys patients’ pain level on a 1–10 scale and their reported sleep
quality on a 1–10 scale. These ratings may not be normally distributed, and the relationship may be
curvilinear. By computing Spearman’s ρ = –0.65 (p < .01), the researcher concludes a strong negative
monotonic relationship: as pain rank increases, sleep quality rank decreases. In simple terms,
patients with higher pain ratings tend to have lower sleep quality.
Citation: As above, Spearman’s rho is recommended for ordinal or non-normally distributed data
2 .
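A minimal sketch of the pain/sleep example in SciPy; the 1–10 ratings below are invented for illustration.
```python
# Minimal sketch: Spearman's rho for ordinal ratings (illustrative data)
import numpy as np
from scipy import stats

pain       = np.array([2, 3, 5, 6, 7, 8, 9, 4, 6, 10])  # 1-10 pain ratings
sleep_qual = np.array([9, 8, 6, 6, 5, 3, 2, 7, 5, 1])   # 1-10 sleep-quality ratings

rho, p_value = stats.spearmanr(pain, sleep_qual)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.4f}")  # negative rho: more pain, worse sleep
```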
• Biserial Correlation
Definition (Academic): The biserial correlation measures the association between a continuous
variable and a dichotomous (binary) variable that is assumed to be derived from an underlying normal
continuum. In other words, one variable is continuous, and the other is a “true” dichotomy (e.g. pass/
fail) that is thought to result from cutting a normally distributed trait at some threshold 3 . The
biserial correlation is the Pearson r that would have been observed if the dichotomy were replaced
by the latent continuous variable.
Lay Explanation: Think of a class passing a test or not. The “pass”/“fail” decision is the dichotomous
variable. But if we assume there is an underlying continuous ability (like test score) that was cut at 50
to decide pass/fail, the biserial correlation gauges the relationship between ability and another
continuous measure. It corrects for the fact the dichotomous side lost some information.
Example: Suppose researchers record students’ total standardized test scores (continuous) and also
record whether each student passed a critical exam (yes/no). If we believe passing vs. failing is based
on an underlying continuous score, the biserial correlation would measure how strongly the
continuous scores relate to passing. For instance, if biserial r = 0.72, it implies a strong relationship:
higher continuous ability strongly predicts passing the exam.
Citation: As the psych R package documentation explains, “the biserial correlation is between a
continuous Y variable and a dichotomous X variable, which is assumed to have resulted from a
dichotomized normal variable” 3 .
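SciPy has no built-in biserial function (the R psych package provides one), so the sketch below applies the classical textbook formula directly; the test scores and pass/fail codes are invented, and no assumption-checking is done.
```python
# Rough sketch of the biserial correlation from the classical formula (illustrative data)
import numpy as np
from scipy.stats import norm

def biserial_r(y, d):
    """y: continuous scores; d: 0/1 dichotomy assumed to come from a latent normal trait."""
    y, d = np.asarray(y, float), np.asarray(d, int)
    p = d.mean()                       # proportion in the "1" (e.g., pass) group
    q = 1.0 - p
    m1, m0 = y[d == 1].mean(), y[d == 0].mean()
    s_y = y.std()                      # population SD of the continuous variable
    h = norm.pdf(norm.ppf(p))          # normal ordinate at the threshold cutting off p
    return (m1 - m0) / s_y * (p * q) / h

scores = [52, 58, 73, 48, 60, 80, 62, 70, 55, 65]   # continuous test scores
passed = [0,  1,  1,  0,  1,  1,  0,  1,  0,  1]    # pass/fail, assumed cut from a latent continuum
print(f"biserial r = {biserial_r(scores, passed):.2f}")
```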
• Point-Biserial Correlation
Definition (Academic): The point-biserial correlation is a special case of Pearson’s r for a truly
dichotomous variable and a continuous variable. It measures the linear relationship between a
binary (0/1 coded) variable and a continuous variable. It is calculated identically to Pearson’s r but
with one variable coded as 0/1. Unlike the biserial, it makes no assumption of an underlying
continuum; the dichotomy may be a pure nominal distinction.
Lay Explanation: The point-biserial tells us how much two distinct groups differ on a continuous
measure. For example, it is used when correlating a person’s gender (male/female) coded as 0/1 with
their score on a math test. It essentially compares group means relative to overall variability. If
male=0 and female=1, a positive point-biserial means females (1) on average scored higher on the
test.
Example: In a study of professional success, “employment status” (employed=1, unemployed=0) is
correlated with annual income (continuous). Calculating the point-biserial correlation yields r = 0.50,
indicating employed individuals tend to have higher incomes. This measure is exactly the Pearson
correlation between the 0/1 variable and income. Indeed, as one source notes, the point-biserial is
“simply the correlation between one dichotomous variable and one continuous variable” 4 .
Citation: Newsom emphasizes that point-biserial correlation is simply Pearson’s r when one variable
is dichotomous 4 .
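A minimal SciPy sketch; the employment codes and incomes are invented. Note that `pointbiserialr` and `pearsonr` give the same coefficient, which is the point made above.
```python
# Minimal sketch: point-biserial r is just Pearson's r with a 0/1 variable (illustrative data)
import numpy as np
from scipy import stats

employed = np.array([1, 1, 0, 1, 0, 1, 0, 1, 1, 0])           # 0/1 group membership
income   = np.array([52, 61, 34, 58, 30, 66, 28, 70, 55, 33])  # annual income (thousands)

r_pb, p_value = stats.pointbiserialr(employed, income)
r_pearson, _  = stats.pearsonr(employed, income)               # identical coefficient
print(f"point-biserial r = {r_pb:.2f} (Pearson r = {r_pearson:.2f}), p = {p_value:.4f}")
```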
• Tetrachoric Correlation
Definition (Academic): The tetrachoric correlation estimates the Pearson correlation between two
unobserved continuous variables from data where each variable is observed only as a dichotomy
(two categories). It assumes each observed binary variable arises from dichotomizing an underlying
normally distributed trait. The tetrachoric correlation is the “inferred Pearson correlation from a 2×2
table with the assumption of bivariate normality” 5 .
Lay Explanation: Imagine we have two yes/no questions (e.g., “Does patient have symptom X? Yes
or No” and “Does patient have symptom Y? Yes or No”) but we believe there are underlying
continuous propensities for each symptom (say, severity). The tetrachoric correlation attempts to
reconstruct the correlation of those latent severities using only the 2×2 table of yes/no counts. A
high tetrachoric correlation implies the two symptoms strongly co-occur beyond chance, under the
normal-threshold assumption.
Example: In a clinical psychology study, two dichotomous variables might be “history of depression
(yes/no)” and “history of anxiety (yes/no)”. Using tetrachoric correlation might yield ρ = 0.65,
suggesting a strong association in the underlying latent risk of depression and anxiety. This means
people with depression are much more likely to have anxiety than would occur by chance alone,
assuming both traits are normally distributed before dichotomization.
Citation: As documentation explains, “The tetrachoric correlation is the inferred Pearson correlation
from a two x two table with the assumption of bivariate normality” 5 .
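SciPy has no tetrachoric routine; full maximum-likelihood estimation is available in the R psych package (tetrachoric()). The sketch below instead uses the classical "cosine-pi" approximation computed directly from the four cell counts, which are invented for illustration.
```python
# Rough sketch: tetrachoric correlation via the classical "cosine-pi" approximation
# (a quick approximation only; ML estimation lives in R's psych::tetrachoric)
import math

a, b = 40, 15   # a: depression yes & anxiety yes,  b: depression yes & anxiety no
c, d = 20, 80   # c: depression no  & anxiety yes,  d: depression no  & anxiety no

r_tet = math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))
print(f"approximate tetrachoric r = {r_tet:.2f}")  # +1 if no "disagreement" cells, -1 if no "agreement" cells
```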
• Phi Coefficient (φ)
Definition (Academic): Phi (φ) is the Pearson correlation coefficient computed on two binary
variables (0/1 coded) in a 2×2 contingency table. It measures the degree of association between two
dichotomous variables. Phi’s value ranges from –1 to +1, analogous to Pearson’s r. It requires no
underlying normality assumption. For larger tables, an analogous measure called Cramer’s V (see
below) is used.
Lay Explanation: The phi coefficient quantifies association between two yes/no (binary) items. For
example, if we code smokers=1/non-smokers=0 and drinkers=1/non-drinkers=0, phi tells us how
smoking and drinking go together. A positive φ means people who smoke tend also to drink, a
negative φ would mean smokers tend not to drink.
Example: Suppose a survey asks two yes/no questions: “Owns a pet (yes/no)” and “Lives alone (yes/
no)”. Counting responses in a 2×2 table and computing phi might yield φ = 0.20, a weak positive
association: pet owners are slightly more likely to live alone than expected by chance. Newsom
clarifies that phi is just Pearson’s r for two binary variables 6 .
Citation: Phi is exactly the Pearson correlation for two dichotomies 6 .
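Because phi is just Pearson’s r for two 0/1 variables, it can be computed with an ordinary correlation routine; the yes/no responses below are invented.
```python
# Minimal sketch: phi as the Pearson correlation of two 0/1 variables (illustrative data)
import numpy as np

owns_pet    = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])  # 1 = yes, 0 = no
lives_alone = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

phi = np.corrcoef(owns_pet, lives_alone)[0, 1]           # Pearson r of the two binaries
print(f"phi = {phi:.2f}")
```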
• Cramér’s V
Definition (Academic): Cramér’s V measures the strength of association between two nominal variables in an r×c contingency table with more than two rows or columns. It is computed from the chi-square statistic as V = √(χ² / (n·(min(r,c) – 1))), where n is the total sample size and r and c are the numbers of rows and columns. It adjusts phi for larger tables. Cramér’s V ranges from 0 to 1 and indicates the strength (but not
direction) of association between two nominal variables.
Lay Explanation: When two variables have more than two categories, Cramer’s V tells us how
related they are. For example, if Var1 has 3 categories and Var2 has 4 categories, V summarizes the
overall association. A higher value (near 1) means a strong relationship; near 0 means
independence.
Example: Consider a 3×4 table crossing education level (High School, Bachelor’s, Master’s+) by job
sector (e.g., Business, Tech, Healthcare, Education). Computing a chi-square test shows significant
association, and Cramer’s V = 0.35 indicates a moderate association: education level relates to job
sector (certain sectors have more postgraduates).
Citation: As Newsom notes, “Cramer’s V is used to examine the association between two categorical
variables when there is more than a 2×2 contingency…Cramer’s V represents the association or
correlation between two variables” 7 .
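A minimal SciPy sketch of the education-by-sector example; the cell counts are invented.
```python
# Minimal sketch: Cramér's V from a chi-square test on an r x c table (illustrative counts)
import numpy as np
from scipy.stats import chi2_contingency

# rows: High School, Bachelor's, Master's+ ; columns: Business, Tech, Healthcare, Education
table = np.array([[30, 10, 25, 15],
                  [20, 30, 20, 10],
                  [10, 25, 15, 20]])

chi2, p, dof, expected = chi2_contingency(table)
n = table.sum()
v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, Cramér's V = {v:.2f}")
```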
• Cohen’s Kappa (κ)
Definition (Academic): Cohen’s kappa coefficient (κ) measures inter-rater agreement for categorical items, correcting for chance agreement. It is defined as $\kappa = \frac{P_o - P_e}{1 - P_e}$, where $P_o$ is the observed proportion of agreement between the raters and $P_e$ is the proportion of agreement expected by chance from the raters’ marginal frequencies 8 . κ = 1 indicates perfect agreement and κ = 0 indicates agreement no better than chance.
Lay Explanation: Two raters assigning the same categories will agree some of the time just by luck; kappa reports how much better than that lucky baseline their agreement actually is.
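A minimal by-hand sketch of the formula above; the two raters’ category labels are invented, and the confusion-matrix computation simply implements the $P_o$ and $P_e$ quantities from the definition.
```python
# Minimal sketch: Cohen's kappa computed by hand from two raters' labels (invented ratings)
import numpy as np

rater1 = np.array([0, 1, 1, 2, 0, 1, 2, 2, 0, 1])   # e.g., diagnostic categories 0/1/2
rater2 = np.array([0, 1, 2, 2, 0, 1, 2, 1, 0, 1])

k = max(rater1.max(), rater2.max()) + 1
confusion = np.zeros((k, k))
for i, j in zip(rater1, rater2):
    confusion[i, j] += 1

n = confusion.sum()
p_o = np.trace(confusion) / n                                   # observed agreement
p_e = (confusion.sum(axis=1) @ confusion.sum(axis=0)) / n**2    # chance agreement from marginals
kappa = (p_o - p_e) / (1 - p_e)
print(f"kappa = {kappa:.2f}")
```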
• Kendall’s τ (tau)
Definition (Academic): Kendall’s rank correlation coefficient (τ) is a nonparametric measure of the association between two ranked variables. It compares every possible pair of observations and counts how many pairs are concordant (ordered the same way on both variables) versus discordant (ordered in reverse); τ is based on the difference between these counts and ranges from –1 to +1.
Lay Explanation: Kendall’s tau asks, across all pairs of people, whether their standings on the two variables are in the same order vs. reversed order. A τ = 0.9 means almost all pairs rank in the same order (strong positive relation); τ = 0 means as many agreements as disagreements (no association).
Example: Suppose students in a class rank their own math skill (1–10) and science skill (1–10). If most
students who rank high in math also rank high in science, Kendall’s τ might be around +0.7,
indicating a strong positive rank association. If the rankings were uncorrelated, τ ≈ 0.
Citation: StatsDirect explains that “Kendall’s rank correlation provides…a measure of the strength of
dependence between two variables” and is a distribution-free (nonparametric) test of independence
9 .
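A minimal SciPy sketch of the math/science ranking example; the self-ratings below are invented.
```python
# Minimal sketch: Kendall's tau for two sets of ratings (illustrative data)
import numpy as np
from scipy import stats

math_skill    = np.array([8, 6, 9, 4, 7, 5, 10, 3, 6, 8])   # 1-10 self-ratings
science_skill = np.array([7, 6, 9, 5, 8, 4, 9, 3, 5, 7])

tau, p_value = stats.kendalltau(math_skill, science_skill)
print(f"Kendall tau = {tau:.2f}, p = {p_value:.4f}")
```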
Qualitative Research Methods in Psychology
The designs below describe experiences and meanings in depth rather than quantifying relationships between variables.
• Phenomenological Research
Definition (Academic): Phenomenological research describes the common meaning, or “essence,” of a lived experience of a phenomenon for several individuals who have all experienced it 10 . Data are usually in-depth interviews analyzed for significant statements and shared themes.
Lay Explanation: A phenomenological study asks, “What is it like to go through X?” For example, a researcher might interview ten new parents about the experience of chronic sleep deprivation and distill what that experience has in common across them.
• Grounded Theory
Definition (Academic): Grounded theory is a qualitative design in which the researcher moves beyond description to generate a theory of a process or action that is “grounded” in data from participants, typically through cycles of interviewing and systematic coding until the categories are saturated 11 .
Lay Explanation: Rather than testing an existing theory, the researcher builds one from the data. For example, interviewing caregivers in successive waves and developing a theory of how families adapt to a dementia diagnosis.
• Case Study
Definition (Academic): A case study is a qualitative approach in which the investigator explores a
bounded system (a case) or multiple cases over time, through detailed, in-depth data collection
involving multiple sources of information; the outcome is a case description and case-based themes
12 . The case might be a person, group, organization, or event.
Lay Explanation: A case study is an in-depth examination of one specific “case.” For example, a
clinical psychologist might do a case study of one patient, gathering interviews, observations, and
records over months. The researcher then writes a rich, holistic story of that case and what can be
learned from it.
Example: An in-depth case study might focus on a student with exceptional mathematical talent.
The researcher spends a year observing the student in class, interviewing parents and teachers, and
analyzing the student’s work. The result is a comprehensive profile: “John’s mathematical
development and the contextual factors that shaped it.”
• Ethnography
Definition (Academic): Ethnography is a qualitative design in which the researcher describes and
interprets the shared and learned patterns of values, behaviors, beliefs, and language of a
culture-sharing group 13 . The researcher often immerses in the group’s environment (participant
observation) for an extended period to understand the group’s worldview.
Lay Explanation: Ethnography involves living among a group of people to understand their culture
from the inside. For example, an ethnographer might live in a remote village or join an online
community, observing daily life and talking with members to learn the group’s customs and values.
Example: A psychologist might conduct an ethnography of classroom culture in a school: spending
months in the school, talking to teachers and students, observing routines, and writing a report on
the norms, language, and interactions that define that school culture.
• Narrative Research
Definition (Academic): Narrative research focuses on the life stories of individuals, treating the data
as narratives (stories). The researcher collects and analyzes “personal stories of experiences” to
understand how people make sense of events 14 . Narrative research emphasizes chronology and
context, viewing each story as a whole.
Lay Explanation: Narrative research is about studying people’s stories. An interviewer might ask an
individual to tell their life story or the story of a specific experience (like recovery from an accident).
The researcher then analyzes the structure, themes, and meaning of that story.
Example: A narrative study might follow refugees rebuilding their lives. Each refugee’s personal
narrative is collected (through interviews or diaries) and analyzed for themes (e.g. “journey”, “identity
reconstruction”). The researcher tells these personal stories to highlight how people make sense of
displacement and resettlement.
• Focus Groups
Definition (Academic): A focus group is a qualitative data collection method where a small group
(typically 6–12) of participants engages in guided discussion about a specific topic or concept. The
discussion is led by a moderator who encourages interaction among participants 15 . It aims to elicit
ideas, perceptions, and group norms.
Lay Explanation: In a focus group, several people meet (in person or online) to talk about a topic
while a researcher asks questions. For example, a psychologist might gather a group of teenagers to
discuss their attitudes toward social media. The group dynamic often brings out ideas and opinions
that might not emerge in one-on-one interviews.
Example: A clinical psychologist uses a focus group to explore coping strategies for anxiety. They
bring together 8 adults with anxiety disorders, ask open-ended questions (e.g., “What helps you feel
calmer?”), and let participants discuss. The resulting data includes group dialogue highlighting
common coping mechanisms.
• Action Research
Definition (Academic): Action research is a cyclical, collaborative design in which researchers and practitioners (for example, teachers or clinicians) identify a practical problem, implement a change, observe its effects, and reflect on the outcome, then repeat the cycle to improve practice 16 .
Lay Explanation: The researcher is a partner in solving a real-world problem rather than a detached observer. A school psychologist might work with teachers to design an anti-bullying program, try it for a term, evaluate it, and refine it for the next term.
• Content Analysis
Definition (Academic): Content analysis is a systematic method for coding and interpreting text,
media, or documents to quantify patterns or themes 17 . It can be qualitative (identifying themes) or
quantitative (counting word frequencies). The researcher develops coding categories a priori or
inductively and applies them to the content to identify underlying patterns of meaning.
Lay Explanation: Content analysis treats text or media as data to be coded. For example, a
psychologist might analyze newspaper articles about mental health to see how often terms like
“depression” or “anxiety” appear and in what context. By coding the content, they can report on the
most common themes or how media portrayal changes over time.
Example: A social psychologist performs content analysis on 100 political speeches to identify
frames used for social welfare. They create categories (e.g. “personal responsibility”, “government
aid”) and tally occurrences. This reveals how frequently and in what terms politicians discuss welfare
issues.
Sampling in Psychology
Sampling refers to how participants or observations are selected.
• Probability Sampling
Definition (Academic): Probability sampling is any sampling method where every member of the
population has a known, non-zero chance of being selected 18 . This includes simple random
sampling, stratified random sampling, cluster sampling, etc. The key is use of randomization,
allowing generalization from sample to population. Properties include: (1) every unit has a non-zero
probability of selection, and (2) selection is by a random mechanism 18 .
Lay Explanation: In probability sampling, researchers use chance (like random number generators)
to pick participants. For example, from a list of all students in a university, they might randomly
choose 200 to survey. Because selection is random and known, the results can more legitimately be
generalized to all students.
Example: A psychologist wants a representative sample of elementary school children nationwide.
Using cluster sampling (a probability method), they randomly select 20 schools (clusters) and then
randomly pick students within those schools. Because random methods are used at each stage,
inferences to the population are statistically justified.
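A minimal sketch of the two-stage cluster draw described above, using Python’s standard random module; the school and pupil identifiers and counts are placeholders.
```python
# Minimal sketch: two-stage cluster sampling with the random module (placeholder IDs)
import random

random.seed(42)
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(500)] for i in range(200)}  # 200 schools, 500 pupils each

stage1 = random.sample(list(schools), k=20)                        # stage 1: randomly pick 20 schools (clusters)
stage2 = [random.sample(schools[name], k=30) for name in stage1]   # stage 2: 30 pupils within each chosen school

sample = [pupil for cluster in stage2 for pupil in cluster]
print(len(sample), "pupils sampled from", len(stage1), "schools")
```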
• Non-Probability Sampling
Definition (Academic): Non-probability sampling is any method in which members of the population are selected without a random mechanism, so the probability of any individual being chosen is unknown (and may be zero). Common forms include convenience, purposive, quota, and snowball sampling. Because selection is not random, results cannot be generalized to the population with the same statistical justification as probability samples.
Lay Explanation: In non-probability sampling, researchers recruit whoever is available or suitable, such as undergraduate volunteers from a subject pool or respondents to an online advertisement. It is faster and cheaper than probability sampling, but the sample may differ systematically from the population.
Key Statistical Concepts and Tests
• Degrees of Freedom
Definition: Degrees of freedom (df) refer to the number of independent values or quantities that
can vary in an analysis after certain restrictions (like sample statistics) are imposed 20 21 . In
general, df = n – k, where n is sample size and k is the number of estimated parameters. It plays a key
role in determining the shape of test statistic distributions (t, χ², F).
Lay Explanation: Think of degrees of freedom as the count of values that are free to vary. For
example, if you have 5 scores and you know their mean, only 4 can vary before the last is fixed by the
mean. So for a sample size n estimating one parameter (the mean), df = n–1. Higher degrees of
freedom generally give more stable estimates and narrower confidence intervals.
Example: In an independent-samples t-test comparing two groups of 10 each, df = (10–1)+(10–1) =
18 total. The t distribution with 18 df is used to find p-values. As df increases (e.g. larger sample
sizes), the t-distribution approximates the normal.
• t-Test
Definition: The t-test is a statistical test comparing means. There are several types (one-sample,
independent two-sample, paired-sample), but all use the Student’s t-distribution for inference. In
general, a two-sample t-test examines whether two population means are equal 22 . The formula for
the t statistic involves the difference of sample means, pooled standard deviation, and sample sizes.
Lay Explanation: A t-test asks “are these two averages significantly different, or could the difference
be due to random sampling?” For instance, comparing average test scores between two classes. The
test yields a t value and p-value: if p is low (e.g. < .05), we reject the idea that the classes have the
same true mean. If p is high, we don’t have evidence of a real difference.
Example: A psychologist compares test anxiety scores (continuous) between a group receiving
relaxation training (n=15, mean=20) and a control group (n=15, mean=25). Using an independent t-
test, they compute t(28)=–2.3, p<.03, suggesting the training group’s mean anxiety is significantly
lower. (Note: df = 28 here.)
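A minimal SciPy sketch of this comparison; the anxiety scores below are invented stand-ins for the two groups of 15.
```python
# Minimal sketch: independent-samples t-test with SciPy (invented anxiety scores)
import numpy as np
from scipy import stats

relaxation = np.array([18, 22, 19, 21, 17, 20, 23, 19, 21, 18, 22, 20, 19, 21, 20])  # n = 15
control    = np.array([24, 26, 23, 27, 25, 24, 28, 26, 25, 23, 27, 24, 26, 25, 27])  # n = 15

t, p = stats.ttest_ind(relaxation, control)   # equal-variance (pooled) t-test by default
print(f"t({len(relaxation) + len(control) - 2}) = {t:.2f}, p = {p:.4f}")
```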
• Chi-Square Test (χ²)
Definition: The chi-square test assesses the relationship between categorical variables. Common
forms are the chi-square test of independence (in contingency tables) and the chi-square
goodness-of-fit test. The test statistic is
$\chi^2 = \sum \frac{(O - E)^2}{E},$
summing over cells where O are observed frequencies and E are expected frequencies under the
null. For independence, E = (row total * column total)/grand total. The χ² statistic (with appropriate
df) tells us if observed frequencies deviate more from expectation than chance 23 9 .
Lay Explanation: Chi-square tests ask whether categorical variables are related. For a 2×k table, the
test checks if row and column categories are independent or not. A large χ² (relative to df) gives a
small p-value, indicating dependence (association). For example, “Do exercise habit (yes/no) and
stress level (high/low) relate?” is tested by χ².
Example: A researcher tallies a 3×2 table of therapy type (Cognitive, Behavioral, None) by patient
outcome (Improved, Not improved). The chi-square test yields χ²(2)=6.5, p<.04, indicating therapy
type and improvement status are not independent (some therapies were more effective). The
researcher concludes there is a statistically significant association between therapy and outcome.
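A minimal SciPy sketch of the therapy-by-outcome table; the cell counts are invented.
```python
# Minimal sketch: chi-square test of independence on a 3x2 table (illustrative counts)
import numpy as np
from scipy.stats import chi2_contingency

#               Improved  Not improved
table = np.array([[30, 10],    # Cognitive therapy
                  [25, 15],    # Behavioral therapy
                  [15, 25]])   # No therapy

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")   # small p: therapy type and outcome are not independent
```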
• Analysis of Variance (ANOVA)
Definition: ANOVA is a family of parametric tests that compare means across three or more groups
(or more complex designs). One-way ANOVA tests whether k group means are all equal, by
partitioning total variance into “between-group” and “within-group” variance. The F-statistic = (Mean Square Between)/(Mean Square Within). If F is large, group means differ more than would be
expected by chance 24 . ANOVA models include one-way, factorial (two-way), and more. ANOVA
generalizes the t-test when there are more than two groups.
Lay Explanation: ANOVA asks “do these group averages differ?” while controlling Type I error. For
example, comparing mean anxiety in three therapy conditions. Instead of doing multiple t-tests
(which inflates error rate), ANOVA uses an F-test to see if any differences exist among all group
means simultaneously. A significant F (p<.05) means at least one pair of means is different.
Example: In a one-way ANOVA, therapists test exam performance among students taught with
method A, B, or C (three groups of 20). ANOVA yields F(2,57)=5.3, p=.008, indicating not all group
means are equal. Further analysis (post-hoc) might reveal Method B led to higher scores than C, for
instance 24 .
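A minimal SciPy sketch of the three-method comparison; the scores are simulated here simply to produce three groups of 20.
```python
# Minimal sketch: one-way ANOVA with SciPy (simulated exam scores)
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
method_a = rng.normal(70, 8, 20)   # 20 students per teaching method
method_b = rng.normal(78, 8, 20)
method_c = rng.normal(69, 8, 20)

f, p = stats.f_oneway(method_a, method_b, method_c)
print(f"F(2, 57) = {f:.2f}, p = {p:.4f}")   # significant F: at least one group mean differs
```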
• Linear Regression
Definition: Linear regression analyzes the relationship between an outcome variable (dependent)
and one or more predictor variables (independent). In simple linear regression (one predictor), the
model is $Y = b_0 + b_1 X + e$, where $b_1$ is the slope. Multiple regression extends this to multiple
predictors. Regression estimates the parameters ($b$’s) that best fit the data. Coefficients are tested
(t-tests), and overall model fit is assessed (F-test). Regression assumptions include linearity,
normality of residuals, and homoscedasticity.
Lay Explanation: Linear regression predicts a continuous outcome. For example, predicting job
performance score (Y) from years of education (X). The regression line gives the expected score for
each X. If the estimated slope $b_1=2.5$ (per year of education), it means each extra year of
schooling predicts 2.5 points higher job score. It also tells how well X predicts Y (via R² and p-values).
Example: Researchers predict depression severity (Y) from sleep hours (X) and exercise frequency (Z).
In multiple regression, the equation might be $\hat{Y} = 15 - 0.8X - 0.5Z$. Interpretation: each
additional hour of sleep reduces predicted depression by 0.8 points, controlling for exercise. If this
model has $R^2=0.45$, about 45% of variance in depression is explained by sleep and exercise
combined 25 .
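A minimal ordinary-least-squares sketch of the two-predictor model; the sleep, exercise, and depression values are simulated for illustration (in practice a package such as statsmodels or scikit-learn would be used).
```python
# Minimal sketch: multiple regression (depression ~ sleep + exercise) via least squares (simulated data)
import numpy as np

rng = np.random.default_rng(7)
sleep    = rng.uniform(4, 9, 100)        # hours of sleep
exercise = rng.integers(0, 6, 100)       # workouts per week
depress  = 15 - 0.8 * sleep - 0.5 * exercise + rng.normal(0, 1.5, 100)

X = np.column_stack([np.ones_like(sleep), sleep, exercise])   # add intercept column
b, *_ = np.linalg.lstsq(X, depress, rcond=None)               # b = [b0, b_sleep, b_exercise]

pred = X @ b
r2 = 1 - np.sum((depress - pred) ** 2) / np.sum((depress - depress.mean()) ** 2)
print(f"intercept = {b[0]:.2f}, b_sleep = {b[1]:.2f}, b_exercise = {b[2]:.2f}, R^2 = {r2:.2f}")
```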
• Measures of Central Tendency (Mean, Median, Mode)
Definition: The mean (arithmetic mean) is the sum of all values divided by n 26 . The median is the
middle value in an ordered dataset (or the average of two middle values if n is even) 27 . The mode is
the most frequently occurring value in the data 28 . Together, they describe the “center” of a
distribution, but their suitability depends on the data’s shape and level.
Lay Explanation: The mean is the “average” score (sensitive to outliers). The median splits the data
in half (better for skewed data). The mode is simply the category or number that appears most. For
example, in ages of 7,8,9,9,10, the mean = 8.6, median = 9, and mode = 9. These three give slightly
different “typical” values.
Example: A clinical sample has depression scores: 2,2,3,4,10. Mean=4.2, median=3, mode=2.
Because one person scored 10 (much higher than others), the mean (4.2) is pulled upward, so the
median (3) is a better center measure. All three are valid measures of central tendency 27 .
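The same five scores from the example above, computed with Python’s standard statistics module:
```python
# Minimal sketch: mean, median, and mode for the depression scores in the example above
from statistics import mean, median, mode

scores = [2, 2, 3, 4, 10]
print(mean(scores), median(scores), mode(scores))   # 4.2, 3, 2 -- the outlier (10) pulls the mean upward
```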
• Standardization (z-Score, t-Score)
Definition: Standardization converts raw scores into a standard scale. A z-score rescales a raw
value by subtracting the mean and dividing by the standard deviation: $z=(x-\mu)/\sigma$. The
resulting z has mean 0 and SD 1. In psychological testing, a t-score often refers to a standardized
score with mean 50 and SD 10 (not to be confused with t-test). Standard scores place different scales
on a common frame.
Lay Explanation: A z-score tells you how many standard deviations a score is above or below the
mean. For example, on an IQ test (mean=100, SD=15), an IQ of 115 has $z=(115-100)/15 = +1.0$,
meaning one standard deviation above average. A t-score might do similarly on a T-scale (e.g. a t of
60 is 1 SD above the mean of 50).
Example: A student scores 75 on a test (class mean=65, SD=5). Their z = (75–65)/5 = +2.0. This means
they scored 2 SD above the class average. In practical terms, roughly 97.5% of scores fall below +2 SD in a normal distribution, so this student did exceptionally well relative to classmates.
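The arithmetic from this example, written out directly; the conversion to a T-score (mean 50, SD 10) follows the same logic.
```python
# Minimal sketch: converting a raw score to a z-score and a T-score (mean 50, SD 10)
raw, class_mean, class_sd = 75, 65, 5

z = (raw - class_mean) / class_sd       # standard deviations above/below the class mean
t_score = 50 + 10 * z                   # same relative position expressed on the T-scale
print(f"z = {z:.1f}, T = {t_score:.0f}")   # z = 2.0, T = 70
```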
• Skewness
Definition: Skewness describes the asymmetry of a distribution around its mean. A positively skewed distribution has a long right tail (a few unusually high scores pull the mean above the median); a negatively skewed distribution has a long left tail; a symmetric distribution, such as the normal, has skewness near zero.
Lay Explanation: Skewness tells you whether scores pile up at one end of the scale. Reaction times and incomes are typically positively skewed: most values are moderate, but a few very large values stretch the right tail. Strong skew is one reason to prefer the median over the mean or to use rank-based statistics such as Spearman’s ρ.
• Type I and Type II Errors
Definition: In inferential testing, a Type I error (false positive) is rejecting a true null hypothesis; a
Type II error (false negative) is failing to reject a false null 21 . Formally, $\alpha = P(\text{Type I
error})$ and $\beta = P(\text{Type II error})$, and statistical power = $1-\beta$.
Lay Explanation: Type I error means “crying wolf”: you conclude there is an effect/difference when
there isn’t one. Type II error means “missing the wolf”: you conclude there is no effect when there
actually is. For example, saying a drug works when it doesn’t (Type I), or saying it doesn’t work when
it does (Type II). Conventionally, α is often set at 0.05 (5% chance of false alarm).
Example: In a clinical trial, suppose the null hypothesis is “New therapy has no effect”. A Type I error
would be concluding it does help (and approving the therapy) when it actually has no real effect. A
Type II error would be missing a real benefit by concluding the therapy is ineffective when it actually
is effective. StatsDirect clarifies: “Type I error is the false rejection of the null hypothesis and Type II
error is the false acceptance of the null hypothesis” 21 .
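A small simulation can make the meaning of α concrete: when the null hypothesis is true by construction, about 5% of tests at α = .05 still reject it. The sketch below is purely illustrative.
```python
# Minimal sketch: Monte Carlo check that alpha = .05 yields roughly 5% false positives under a true null
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
false_positives = 0
n_sims = 5000
for _ in range(n_sims):
    a = rng.normal(0, 1, 30)            # two samples drawn from the SAME population,
    b = rng.normal(0, 1, 30)            # so the null hypothesis is true by construction
    if stats.ttest_ind(a, b).pvalue < 0.05:
        false_positives += 1            # rejecting here is a Type I error

print(f"Type I error rate ~= {false_positives / n_sims:.3f}")   # close to 0.05
```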
Factor Analysis
Factor analysis is a multivariate technique for identifying latent factors underlying a set of observed
variables. The steps are typically:
1. Compute the Correlation Matrix: Calculate correlations among all pairs of variables, forming a
correlation (or covariance) matrix. This reveals which variables move together.
Example: In a personality questionnaire, we compute correlations among items. Highly correlated
items (e.g. “enjoy groups” and “talkative”) suggest a common factor (e.g. Extraversion).
2. Assess Factorability (KMO and Bartlett’s Test): Before extraction, check whether factor analysis is
appropriate.
3. Kaiser-Meyer-Olkin (KMO) Test: Measures sampling adequacy. Values range 0–1; values >0.8 are
meritorious, >0.6 acceptable, <0.5 is unacceptable 30 . A high KMO (near 1) means partial
correlations are low and factors should emerge well.
4. Bartlett’s Test of Sphericity: Tests whether the correlation matrix is significantly different from an
identity matrix (no correlations). A significant Bartlett’s test (p<.05) indicates enough correlations to
proceed 31 . If the test is not significant, variables may be uncorrelated and factor analysis is not
suitable.
Example: A KMO of 0.82 and Bartlett’s p<.001 suggest our personality data is factorable.
5. Factor Extraction: Determine the initial factors that account for variance. Common methods include
Principal Components Analysis (PCA, though technically dimension reduction) or common factor
methods (e.g. Principal Axis Factoring). One examines eigenvalues or a scree plot to decide how
many factors to retain (often factors with eigenvalues >1).
Example: If analysis yields three eigenvalues >1, one might extract three factors.
6. Factor Rotation: To aid interpretation, the initial factor solution is rotated. Orthogonal rotations
(e.g. Varimax) keep factors uncorrelated; oblique rotations (e.g. Promax) allow factors to correlate.
Rotation yields a simpler pattern (higher loadings on one factor, lower on others) for easier
interpretation.
Example: After rotation, one factor may have high loadings on items about social behavior, another
on items about planning.
7. Factor Naming: Interpret and label each factor based on the loadings of variables. Researchers
examine which variables load highly on a factor, identify the common theme, and give it a name (e.g.
“Anxiety”, “Achievement Motivation”).
Example: If Factor 1 loads on “likes parties”, “talkative”, “energetic”, the researcher might name it
“Sociability”.
Throughout, one would use software to compute the matrices and tests; the KMO and Bartlett’s references 30 31 provide guidance on judging factorability. A minimal computational sketch of the correlation matrix, Bartlett’s test, and eigenvalue-based extraction appears below.
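The sketch below illustrates steps 1, 4, and 5 with simulated item data built from two latent factors; KMO and rotation are left to dedicated packages (for example, the Python factor_analyzer package or R’s psych). All names and numbers are invented for illustration.
```python
# Compact sketch: correlation matrix, Bartlett's sphericity test, and Kaiser-criterion extraction
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
n, p = 300, 6
social = rng.normal(size=n)                        # latent "sociability" factor
plan   = rng.normal(size=n)                        # latent "planning" factor
items = np.column_stack([
    0.8 * social + rng.normal(scale=0.5, size=n),  # items 1-3 load on sociability
    0.7 * social + rng.normal(scale=0.5, size=n),
    0.6 * social + rng.normal(scale=0.5, size=n),
    0.8 * plan + rng.normal(scale=0.5, size=n),    # items 4-6 load on planning
    0.7 * plan + rng.normal(scale=0.5, size=n),
    0.6 * plan + rng.normal(scale=0.5, size=n),
])

R = np.corrcoef(items, rowvar=False)               # step 1: correlation matrix

# step 4: Bartlett's test of sphericity (is R different from an identity matrix?)
stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) / 2
print(f"Bartlett chi2({df:.0f}) = {stat:.1f}, p = {chi2.sf(stat, df):.4g}")

# step 5: extraction -- eigenvalues of R; the Kaiser criterion keeps eigenvalues > 1
eigenvalues = np.linalg.eigvalsh(R)[::-1]
print("eigenvalues:", np.round(eigenvalues, 2), "-> retain", np.sum(eigenvalues > 1), "factors")
```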
Software Tools in Psychology Research
Psychologists use specialized software for data analysis.
For quantitative analysis, IBM SPSS Statistics is widely used; IBM describes SPSS Advanced Statistics as providing “a comprehensive suite of univariate and multivariate analysis tools” for data 32 . For qualitative analysis of interview transcripts, field notes, and open-ended responses, NVivo supports coding and theme development; according to QSR International, “NVivo is the premier software for qualitative data analysis” and it helps researchers “organize, analyze and visualize” data from interviews and texts 33 .
Researchers may also use survey platforms (Qualtrics, SurveyMonkey), experimental design tools (PsychoPy,
E-Prime), and statistical programming (SPSS syntax, R scripts) as part of their toolkit. The choice depends on
research needs, budget, and data type.
Sources: Authoritative statistical texts, peer-reviewed methodology articles, and official software
documentation were consulted, including Schober et al. 1 , Creswell 10 11 12 13 , qualitative research
guides 15 16 17 , statistical resource sites (StatsDirect) 20 24 25 28 29 , and IBM/QSR material 32 33 .
These provide both technical definitions and accessible explanations to ensure clarity.
References
4, 6, 7: Newsom, J. “Lecture 15,” Portland State University. https://web.pdx.edu/~newsomj/pa551/lectur15.htm
8: “Kappa statistic considerations in evaluating inter-rater reliability between two raters: which, when and context matters,” BMC Cancer. https://bmccancer.biomedcentral.com/articles/10.1186/s12885-023-11325-z
10, 11, 12, 13: Creswell, J. W. Chapter 4 excerpt, SAGE Publications. https://us.sagepub.com/sites/default/files/upm-binaries/13421_Chapter4.pdf
17: “Content Analysis Method and Examples,” Columbia University Mailman School of Public Health. https://www.publichealth.columbia.edu/research/population-health-methods/content-analysis
22: “Two-Sample t-Test,” JMP Statistics Knowledge Portal. https://www.jmp.com/en/statistics-knowledge-portal/t-test/two-sample-t-test
30, 31: “KMO and Bartlett’s test of sphericity,” Analysis INN. https://www.analysisinn.com/post/kmo-and-bartlett-s-test-of-sphericity/
33: “About NVivo,” QSR International (NVivo 14 help). https://help-nv.qsrinternational.com/14/win/Content/about-nvivo/about-nvivo.htm