Exploratory Factor Analysis: Prof. Andy Field
Slide 2
When and Why?
• To test for clusters of variables or
measures.
• To see whether different measures are
tapping aspects of a common dimension.
– E.g. Anal-Retentiveness, Number of
friends, and social skills might be aspects of
the common dimension of ‘statistical
ability’
Slide 3
R-Matrix
Slide 4
Factors and components
• Factor analysis attempts to achieve parsimony by
explaining the maximum amount of common
variance in a correlation matrix using the
smallest number of explanatory constructs.
– These ‘explanatory constructs’ are called
factors.
• PCA tries to explain the maximum amount of
total variance in a correlation matrix.
– It does this by transforming the original
variables into a set of linear components.
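As a rough illustration of the PCA bullet above, the sketch below decomposes a correlation matrix into linear components with NumPy. The 3 × 3 matrix `R` is made up for the example and does not come from the slides.

```python
import numpy as np

# Hypothetical 3x3 correlation matrix (illustrative values, not from the slides).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# PCA on a correlation matrix is an eigendecomposition of R.
# eigh returns eigenvalues in ascending order, so reorder to largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Each eigenvalue is the amount of total variance one component explains.
# With standardized variables the total variance equals the number of variables.
proportion_explained = eigenvalues / eigenvalues.sum()

# Component loadings: eigenvectors scaled by the square root of their eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

print(proportion_explained)
print(loadings)
```

Keeping all components reproduces `R` exactly (`loadings @ loadings.T`), which is the sense in which PCA accounts for total rather than only common variance.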
Slide 5
Graphical Representation
[Figure 17.3: Example of a factor plot; labelled points include Talk 1, Talk 2 and Liar (Chapter 17, Exploratory Factor Analysis, p. 669)]
Slide 6
Mathematical Representation
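$Y_i = b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni} + e_i$
$\text{Component}_i = b_1 \text{Variable}_{1i} + b_2 \text{Variable}_{2i} + \dots + b_n \text{Variable}_{ni}$
$Y_i = b_1 X_{1i} + b_2 X_{2i} + \dots + b_n X_{ni}$
$\text{Sociability}_i = b_1 \text{Talk1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i$
$\text{Consideration}_i = b_1 \text{Talk1}_i + b_2 \text{Social Skills}_i + b_3 \text{Interest}_i + b_4 \text{Talk2}_i + b_5 \text{Selfish}_i + b_6 \text{Liar}_i$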
Slide 7
Mathematical Representation
Continued
• The factors in Factor Analysis are not
represented in the same way as
components.
$x = \mu + \Lambda \xi + \delta$
Variables = Variable Means + (Loadings × Common Factor) + Unique Factor
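One consequence of this model can be written out. The derivation below writes the loadings matrix as Λ, the common factors as ξ and the unique factors as δ. It rests on assumptions the slide does not state: the common factors are uncorrelated with unit variance, and the unique factors are uncorrelated with each other and with the common factors. Ψ is introduced here as a name for the diagonal matrix of unique variances.

```latex
\begin{aligned}
x &= \mu + \Lambda\xi + \delta \\
\operatorname{Cov}(x) &= \Lambda\,\operatorname{Cov}(\xi)\,\Lambda^{\top} + \operatorname{Cov}(\delta)
  = \Lambda\Lambda^{\top} + \Psi
\end{aligned}
```

Under those assumptions the loadings account for all of the covariance between variables, while Ψ only adds to the diagonal.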
Factor Loadings
• Both factor analysis and PCA are linear models
in which loadings are used as weights.
– These loadings can be expressed as a matrix
– This matrix is called the factor matrix or component
matrix (if doing PCA).
– The assumption of factor analysis (but not PCA) is
that these algebraic factors represent real-world
dimensions.
$\Lambda = \begin{pmatrix}
0.87 & 0.01 \\
0.96 & -0.03 \\
0.92 & 0.04 \\
0.00 & 0.82 \\
-0.10 & 0.75 \\
0.09 & 0.70
\end{pmatrix}$
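A short sketch of how the loadings matrix on this slide can be read numerically. The values are the ones shown above; the variable names and the choice to assign each row to its largest absolute loading are illustrative assumptions, not part of the slides.

```python
import numpy as np

# Loadings matrix from the slide: six variables (rows) by two factors (columns).
loadings = np.array([
    [0.87, 0.01],
    [0.96, -0.03],
    [0.92, 0.04],
    [0.00, 0.82],
    [-0.10, 0.75],
    [0.09, 0.70],
])

# Communality of each variable: sum of its squared loadings across factors.
communalities = (loadings ** 2).sum(axis=1)

# Variance accounted for by each factor: sum of squared loadings down a column.
ss_loadings = (loadings ** 2).sum(axis=0)

# Assign each variable to the factor on which its absolute loading is largest.
dominant_factor = np.abs(loadings).argmax(axis=1)

print(communalities.round(3))
print(ss_loadings.round(3))
print(dominant_factor)  # first three rows -> factor 0, last three -> factor 1
```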
Slide 9
The SAQ
Slide 10
Initial Considerations
• The quality of analysis depends upon the
quality of the data (GIGO).
• Test variables should correlate quite well
– r > .3.
• Avoid Multicollinearity:
– several variables highly correlated, r > .80.
• Avoid Singularity:
– some variables perfectly correlated, r = 1.
• Screen the correlation matrix, eliminate any
variables that obviously cause concern.
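The screening rules above could be automated along these lines. This is a minimal sketch: the function name `screen_correlations`, the use of absolute correlations, and the example matrix are assumptions made for illustration.

```python
import numpy as np

def screen_correlations(R, low=0.3, high=0.8):
    """Flag variables using the rules of thumb above, on absolute correlations."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    off_diag = ~np.eye(p, dtype=bool)
    abs_r = np.abs(R)
    # Variables whose correlations with every other variable are all below `low`.
    weak = [i for i in range(p) if np.all(abs_r[i][off_diag[i]] < low)]
    # Pairs correlating above `high`: possible multicollinearity
    # (a value of exactly 1 would indicate singularity).
    strong_pairs = [
        (i, j) for i in range(p) for j in range(i + 1, p) if abs_r[i, j] > high
    ]
    return weak, strong_pairs

# Hypothetical correlation matrix for four variables (illustrative only).
R = np.array([
    [1.00, 0.85, 0.40, 0.10],
    [0.85, 1.00, 0.35, 0.05],
    [0.40, 0.35, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
])
weak, strong_pairs = screen_correlations(R)
print(weak)          # [3]: the last variable correlates below .3 with all others
print(strong_pairs)  # [(0, 1)]: the first two variables correlate above .8
```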
Slide 11
Further Considerations
• Determinant:
– Indicator of multicollinearity
– should be greater than 0.00001.
• Kaiser-Meyer-Olkin (KMO):
– Measures sampling adequacy
– should be greater than 0.5.
• Bartlett’s Test of Sphericity:
– Tests whether the R-matrix is an identity matrix
– should be significant at p < .05.
• Anti-Image Matrix:
– Measures of sampling adequacy on diagonal,
– Off-diagonal elements should be small.
• Reproduced:
– Correlation matrix after rotation
– most residuals should be smaller than 0.05 in absolute value
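Three of the checks above (determinant, Bartlett's test and KMO) can be computed directly from the R-matrix. The sketch below uses the standard textbook formulas. The function name, the example matrix and the sample size `n = 100` are hypothetical, and the p-value for Bartlett's test is left out to keep the example NumPy-only.

```python
import numpy as np

def determinant_bartlett_kmo(R, n):
    """Return (determinant, Bartlett chi-square, df, overall KMO) for R and sample size n."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    det = np.linalg.det(R)
    # Bartlett's statistic is approximately chi-square with p(p-1)/2 degrees
    # of freedom under the null hypothesis that R is an identity matrix.
    chi_square = -(n - 1 - (2 * p + 5) / 6) * np.log(det)
    df = p * (p - 1) // 2
    # Partial correlations come from the inverse of R.
    inv = np.linalg.inv(R)
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off_diag = ~np.eye(p, dtype=bool)
    sum_r2 = (R[off_diag] ** 2).sum()
    sum_partial2 = (partial[off_diag] ** 2).sum()
    # KMO compares squared correlations with squared partial correlations.
    kmo = sum_r2 / (sum_r2 + sum_partial2)
    return det, chi_square, df, kmo

# Hypothetical correlation matrix and sample size (illustrative only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
det, chi_square, df, kmo = determinant_bartlett_kmo(R, n=100)
print(det > 0.00001, kmo > 0.5)
print(chi_square, df)
```

The chi-square value is then compared with the critical value for `df` degrees of freedom (about 7.81 for 3 degrees of freedom at p = .05).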
Slide 12
Slide 13
Finding Factors: Communality
• Common Variance:
– Variance that a variable shares with other variables.
• Unique Variance:
– Variance that is unique to a particular variable.
• The proportion of common variance in a
variable is called the communality.
• Communality = 1, All variance shared.
• Communality = 0, No variance shared.
• 0 < Communality < 1 = Some variance shared.
Slide 14
[Figure: variances of Variable 1, Variable 2, Variable 3 and Variable 4 drawn as areas, with cases labelled Communality = 1 and Communality = 0]
Slide 15
Finding Factors
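• We find factors by calculating the amount of
common variance
– Circularity: estimating the common variance requires the factors, but finding the factors requires the common variance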
• Principal Components Analysis:
– Assume all variance is shared
– All Communalities = 1
• Factor Analysis
– Estimate Communality
– Use Squared Multiple Correlation (SMC)
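The squared multiple correlation of each variable with all the others can be read off the inverse of the correlation matrix. A minimal sketch, with a hypothetical function name and a two-variable example chosen so the result is easy to check by hand:

```python
import numpy as np

def squared_multiple_correlations(R):
    """SMC of each variable with all the others: 1 - 1 / diag(R^-1)."""
    R = np.asarray(R, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# With two variables correlating at r = .5, each SMC is simply r squared.
R = np.array([
    [1.0, 0.5],
    [0.5, 1.0],
])
smc = squared_multiple_correlations(R)
print(smc)  # approximately [0.25, 0.25]
```

In factor analysis these values would serve as initial communality estimates, whereas PCA would start from communalities of 1.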
Slide 16
Slide 17
Factor Extraction
• Kaiser’s Extraction
– Kaiser (1960): retain factors with eigenvalues > 1.
• Scree Plot
– Cattell (1966): use the ‘point of inflexion’ of the scree plot.
• Which Rule?
– Use Kaiser’s Extraction when
• there are fewer than 30 variables and communalities after extraction are > 0.7, or
• sample size is > 250 and mean communality is ≥ 0.6.
– Scree plot is good if sample size is > 200.
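Kaiser's rule is simple to apply once the eigenvalues of the R-matrix are available. The sketch below uses a made-up correlation matrix with two uncorrelated pairs of variables; the function name is an assumption for the example.

```python
import numpy as np

def kaiser_retained(R):
    """Return eigenvalues (largest first) and how many exceed 1."""
    eigenvalues = np.sort(np.linalg.eigvalsh(np.asarray(R, dtype=float)))[::-1]
    return eigenvalues, int((eigenvalues > 1.0).sum())

# Hypothetical R-matrix: variables 0 and 1 correlate at .6, as do 2 and 3,
# and the two pairs are unrelated to each other.
R = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])
eigenvalues, n_retained = kaiser_retained(R)
print(eigenvalues)  # approximately [1.6, 1.6, 0.4, 0.4]
print(n_retained)   # 2
```

Plotting `eigenvalues` against their rank gives the scree plot used for Cattell's rule.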
Slide 18
Slide 19
Scree Plots
[Output 17.6 (p. 699): scree plot with points of inflexion marked]
Slide 20
Rotation
Slide 21
[Figure: two panels, Orthogonal and Oblique]
Slide 22
Before Rotation
[Output: table of loadings for the questionnaire items before rotation]
Slide 23
Orthogonal Rotation (varimax)
[Output from Discovering Statistics Using SPSS, p. 702]
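The slides name varimax without giving an algorithm. The sketch below is one commonly used SVD-based iteration for raw varimax (no Kaiser normalization); it is not necessarily what SPSS does. The function name, the iteration limits and the 30-degree "unrotated" starting point are all assumptions made for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loadings matrix orthogonally towards simple structure."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient-like matrix for the varimax criterion at the current rotation.
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = L.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(target)
        rotation = u @ vt  # nearest orthogonal matrix to the target
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return L @ rotation, rotation

# Hypothetical unrotated solution: a clean two-factor pattern turned by 30 degrees.
simple = np.array([
    [0.87, 0.01], [0.96, -0.03], [0.92, 0.04],
    [0.00, 0.82], [-0.10, 0.75], [0.09, 0.70],
])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(rotated.round(2))
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is the same before and after rotation; only the distribution of loadings across factors changes.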
Slide 24
Oblique Rotation
[Output 17.8]
Slide 25
Reliability
• Test-Retest Method
– What about practice effects/mood states?
• Alternate Form Method
– Expensive and Impractical
• Split-Half Method
– Splits the questionnaire into two random halves, calculates
scores and correlates them.
• Cronbach’s Alpha
– Splits the questionnaire into all possible halves, calculates the
scores, correlates them and averages the correlation for all
splits (well, sort of …).
– Ranges from 0 (no reliability) to 1 (complete reliability)
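The split-half method described above can be sketched as follows. The function name, the fixed random seed and the response data are hypothetical; rows are respondents and columns are items.

```python
import numpy as np

def split_half_correlation(scores, seed=0):
    """Correlate total scores on two randomly chosen halves of the items."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    rng = np.random.default_rng(seed)
    items = rng.permutation(n_items)
    half_a = scores[:, items[: n_items // 2]].sum(axis=1)
    half_b = scores[:, items[n_items // 2:]].sum(axis=1)
    return np.corrcoef(half_a, half_b)[0, 1]

# Hypothetical responses: six people answering four items on a 1-5 scale.
scores = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 2, 1],
    [3, 3, 3, 4],
    [4, 4, 5, 5],
])
r = split_half_correlation(scores)
print(r)
```

The result depends on which random split is drawn (here controlled by `seed`), which is the weakness that Cronbach's alpha is meant to address.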
Slide 26
Cronbach’s Alpha
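variance-covariance matrix:
$\begin{pmatrix}
\text{var}_1 & \text{cov}_{12} & \text{cov}_{13} \\
\text{cov}_{12} & \text{var}_2 & \text{cov}_{23} \\
\text{cov}_{13} & \text{cov}_{23} & \text{var}_3
\end{pmatrix}$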
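Cronbach's alpha can be computed from the item variance-covariance matrix. The sketch below uses a standard covariance-based form: the squared number of items times the mean inter-item covariance, divided by the sum of all variances and covariances. The function name and the response data are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha from respondents (rows) by items (columns)."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    cov = np.cov(scores, rowvar=False)      # item variance-covariance matrix
    sum_item_variances = np.diag(cov).sum() # diagonal: item variances
    total = cov.sum()                       # all variances plus all covariances
    mean_covariance = (total - sum_item_variances) / (n_items * (n_items - 1))
    return n_items ** 2 * mean_covariance / total

# Hypothetical responses: five people answering three items.
scores = np.array([
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
])
alpha = cronbach_alpha(scores)
print(alpha)
```

This is algebraically the same as the more familiar form k/(k - 1) × (1 - sum of item variances / variance of total scores).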
Slide 27
Interpreting Cronbach’s Alpha
• Kline (1999)
– Reliable if α > .7
• Depends on the number of items
– More questions = bigger α
• Treat Subscales separately
• Remember to reverse-score reverse-phrased
items!
– If not, α is reduced and can even be negative
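Reverse scoring itself is a one-line transformation. The sketch below assumes a 1 to 5 response scale, which the slides do not specify, and uses made-up responses to show why an unreversed item drags reliability down: it correlates negatively with the other items.

```python
import numpy as np

def reverse_score(responses, scale_min=1, scale_max=5):
    """Flip responses on a scale, e.g. 1 <-> 5 and 2 <-> 4 on a 1-5 scale."""
    return scale_min + scale_max - np.asarray(responses)

print(reverse_score([1, 2, 5]))  # [5 4 1]

# Hypothetical items: `positive` is normally phrased, `negative` is reverse phrased.
positive = np.array([1, 2, 3, 4, 5])
negative = np.array([5, 4, 4, 2, 1])

r_before = np.corrcoef(positive, negative)[0, 1]
r_after = np.corrcoef(positive, reverse_score(negative))[0, 1]
print(r_before < 0, r_after > 0)  # True True
```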
Slide 28
Reliability for Fear of Computers
Subscale
Slide 29
Reliability for Fear of Statistics Subscale
Slide 30
Reliability for Fear of Maths Subscale
Slide 31
Reliability for the Peer Evaluation
Subscale
Slide 32
The End?
• Describe Factor Structure/Reliability
• What items should be retained?
– What items did you eliminate and why?
• Application
– Where will your questionnaire be used?
– How does it fit in with psychological theory?
Slide 33