
Exploratory Factor Analysis:

dimensionality and factor scores

Psychology 588: Covariance structure and factor models


How many PCs to retain

• Unlike in confirmatory FA, the number of factors to extract is not known in advance, and any presumed dimensionality needs to be empirically supported

• There is no single generally accepted guideline for determining the dimensionality --- some rules apply to common factors, others to principal components, and most to both

• Many rules and tests are available, but unfortunately they do not necessarily suggest the same number

• The most popular are Cattell's scree test and the so-called eigenvalue-greater-than-1 rule (equivalently, VAF per factor > average variance; a.k.a. the Guttman-Kaiser rule)
• The scree test is criticized for its graphical nature (subjective and non-statistical) --- parametric (Bentler & Yuan, 1998) and non-parametric (Hong et al., 2006) scree tests are available

• Parallel analysis (Horn, 1965) adjusts the G-K rule for data size, yet remains non-statistical --- a minimal sketch follows
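A minimal MATLAB sketch of parallel analysis, assuming a data matrix X and the Statistics Toolbox (corr, prctile); the function name and the 95th-percentile criterion are illustrative choices, not code from the lecture:

function ndim = parallel_analysis(X, nRep)
% Retain PCs whose observed eigenvalues exceed the 95th percentile of
% eigenvalues from random normal data of the same size (Horn, 1965).
[N, q] = size(X);
obsEV  = sort(eig(corr(X)), 'descend');           % observed eigenvalues
nullEV = zeros(nRep, q);
for r = 1:nRep
    nullEV(r, :) = sort(eig(corr(randn(N, q))), 'descend')';
end
crit = prctile(nullEV, 95)';                      % null 95th percentiles
ndim = find(obsEV <= crit, 1) - 1;                % count leading EVs above criterion
if isempty(ndim), ndim = q; end                   % all EVs exceed the criterion
end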

[Figure: "true n = 3 mapped in a null distribution" --- two scree plots of EV against PC # (3 to 15), comparing observed eigenvalues with the parallel-analysis null distribution]
Large sample inference

• With a random sample x from a normal distribution, the sampling distributions of the eigenvalues and eigenvectors, and of tests for equality of the last q − n eigenvalues, are known --- with large N, these properties also hold for non-normally distributed x (by the central limit theorem), allowing parametric statistical testing

 2
• Sample eigenvalues $\hat{\mathbf{e}}$ are distributed as $N_q(\mathbf{e},\; 2\mathbf{E}^2/N)$, where $\mathbf{E} = \operatorname{diag}(\mathbf{e})$, so that we can test $H_0\!: e_k = 0$, $k = 1, \dots, n$, with the associated 100(1 − α)% CI:

$$\frac{\hat{e}_k}{1 + z_{\alpha/2}\sqrt{2/N}} \;\le\; e_k \;\le\; \frac{\hat{e}_k}{1 - z_{\alpha/2}\sqrt{2/N}}$$
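A small MATLAB sketch of these CIs, assuming a data matrix X (norminv requires the Statistics Toolbox; valid for large N, where z·√(2/N) < 1):

[N, q] = size(X);                        % X: data matrix, subjects by variables
ehat   = sort(eig(corr(X)), 'descend');  % sample eigenvalues
z      = norminv(1 - .05/2);             % alpha = .05
lo     = ehat ./ (1 + z*sqrt(2/N));      % lower CI bounds
hi     = ehat ./ (1 - z*sqrt(2/N));      % upper CI bounds
disp([ehat lo hi])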
• Another parametric test for principal components is the Bartlett test for $H_0\!: e_{n+1} = \cdots = e_q$, which is known to suggest too many PCs in practice:

$$\hat{\chi}^2 = -\left(N - \frac{q}{3} - \frac{5}{6} - \frac{2n}{3}\right)\log R_{q-n}, \qquad df = \frac{(q-n)(q-n+1)}{2},$$

$$R_{q-n} = \frac{\left|\mathbf{R}\right| \Big/ \prod_{k=1}^{n} e_k}{\left[\left(q - \sum_{k=1}^{n} e_k\right) \Big/ (q-n)\right]^{q-n}}$$

i.e., the product of the last q − n eigenvalues relative to their arithmetic mean raised to the power q − n
• Sample size has a direct effect on the statistic with no adjustment in the df --- large N causes too many components to be retained
• Anderson provides a more generally applicable χ² statistic than Bartlett's (e.g., not necessarily restricted to the last q − n roots), which is widely used in practice and is not sensitive to an overly large N:

$$\hat{\chi}^2 = -\nu\left(\sum_{k=n+1}^{q}\log e_k \;-\; (q-n)\log\frac{\sum_{k=n+1}^{q} e_k}{q-n}\right), \qquad df = \frac{(q-n-1)(q-n+2)}{2}$$

where ν is a Bartlett-type correction of N

• In most cases, this improved test behaves reasonably, though it suggests too many components to retain when the 1st PC is dominantly large --- a sketch of the statistic follows
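A minimal MATLAB sketch of this statistic, assuming a data matrix X and a candidate dimensionality n; taking ν to be the same Bartlett-type correction as above is an assumption (chi2cdf requires the Statistics Toolbox):

[N, q] = size(X);
e      = sort(eig(corr(X)), 'descend');
n      = 3;                                    % candidate dimensionality
tail   = e(n+1:end);                           % last q - n eigenvalues
nu     = N - q/3 - 5/6 - 2*n/3;                % assumed correction of N
chi2   = -nu * (sum(log(tail)) - (q-n)*log(mean(tail)));
df     = (q - n - 1)*(q - n + 2)/2;
p      = 1 - chi2cdf(chi2, df);                % small p: retain more than n PCs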

• Similar χ² tests are available for the common factor model, due to Anderson, Lawley, and Rubin
• The national track records data for men are tested by the Bartlett test (using Anderson's formula) and the bootstrap scree test (MATLAB code available in my netfiles: "bscree.m", with syntax "[dim,pvalues] = bscree(X, alpha, nBs, flagm)"):

[ndim,prob,chisquare] = barttest(zscore(trackm),.05)

ndim = 8

  prob      chisquare
  0          768.9522
  0          256.6401
  0.0000      59.0284
  0.0003      39.6948
  0.0074      22.5032
  0.0080      15.6116
  0.0326       6.8479

[Figure: bootstrap scree plot of VAF against factor number, with cut-off lines at each alpha]

dim = bscree(zscore(trackm), [.25 .10 .05 .01], 1000, 0)

dim = 4  3  2  2
Indeterminacy of factor scores

• Once all parameters of the common factor model (Λ, Φ, and Θ) are obtained, we may want to know the factor scores of "subjects" --- quantities that are not considered part of the model parameters, but rather values that are useful to know afterwards

• Factor scores are not uniquely determinable since there are n + q unknown factors, given only q data variables:

$$\mathbf{x}_i = [\,\boldsymbol{\Lambda},\; \mathbf{I}\,]\begin{bmatrix}\boldsymbol{\xi}_i\\ \boldsymbol{\delta}_i\end{bmatrix}, \quad i = 1,\dots,N, \qquad E(\boldsymbol{\xi}\boldsymbol{\delta}') = \mathbf{0}, \quad E(\boldsymbol{\delta}\boldsymbol{\delta}') = \boldsymbol{\Theta}$$

with $\mathbf{x}_i$ of order q × 1, $\boldsymbol{\Lambda}$ of order q × n, $\mathbf{I}$ of order q × q, and the stacked factor vector of order (n + q) × 1

• Two approaches are considered here to overcome this problem --- the weighted least squares and regression methods
What if PCs are extracted

• Since the PC model "ignores" the existence of specific factors, the estimation of factor scores simply reduces to OLS, given the (mean-centered) data and the loading matrix:

$$\mathbf{x}_i = \boldsymbol{\Lambda}\boldsymbol{\xi}_i + \boldsymbol{\delta}_i, \qquad \hat{\boldsymbol{\xi}}_i = (\boldsymbol{\Lambda}'\boldsymbol{\Lambda})^{-1}\boldsymbol{\Lambda}'\mathbf{x}_i = \mathbf{E}^{-1}\boldsymbol{\Lambda}'\mathbf{x}_i$$

which exactly determines the least-squares estimate of ξ --- accordingly, the indeterminacy of factor scores does not apply to the PC model

• This LS property holds for any rotated $\boldsymbol{\Lambda}^* = \boldsymbol{\Lambda}\mathbf{T}^{-1}$:

$$\hat{\boldsymbol{\xi}}{}^*_i = (\boldsymbol{\Lambda}^{*\prime}\boldsymbol{\Lambda}^*)^{-1}\boldsymbol{\Lambda}^{*\prime}\mathbf{x}_i = (\mathbf{T}^{-1\prime}\boldsymbol{\Lambda}'\boldsymbol{\Lambda}\mathbf{T}^{-1})^{-1}\mathbf{T}^{-1\prime}\boldsymbol{\Lambda}'\mathbf{x}_i = \mathbf{T}\mathbf{E}^{-1}\boldsymbol{\Lambda}'\mathbf{x}_i = \mathbf{T}\hat{\boldsymbol{\xi}}_i$$
WLS for common factor scores

• Under the common factor model, the q observed variables have varying contributions to the n common factors (i.e., different communalities) --- taking this into account, a weighted sum of squared errors provides a better prediction of factor scores (due to Bartlett):

$$f_{\mathrm{WLS},i} = \sum_{j=1}^{q}\frac{\delta_{ij}^2}{\theta_{jj}} = \boldsymbol{\delta}_i'\boldsymbol{\Theta}^{-1}\boldsymbol{\delta}_i = (\mathbf{x}_i - \boldsymbol{\Lambda}\boldsymbol{\xi}_i)'\boldsymbol{\Theta}^{-1}(\mathbf{x}_i - \boldsymbol{\Lambda}\boldsymbol{\xi}_i)$$

• Accordingly, the WLS estimator is:

$$\hat{\boldsymbol{\xi}}_{\mathrm{WLS},i} = (\boldsymbol{\Lambda}'\boldsymbol{\Theta}^{-1}\boldsymbol{\Lambda})^{-1}\boldsymbol{\Lambda}'\boldsymbol{\Theta}^{-1}\mathbf{x}_i$$

• Would the weight $\theta_{jj}^{-1}$ be larger or smaller with a larger communality $h_j^2$?
• A rotated version of $\hat{\boldsymbol{\xi}}_{\mathrm{WLS}}$ is obtained by simply replacing Λ in the formula by the rotated $\boldsymbol{\Lambda}^*$

• The WLS estimate satisfies only the zero-mean property of ξ but not the others (i.e., unit variance and orthogonality if orthogonally rotated), though the deviations tend to be ignorable --- alternatively, Anderson-Rubin's modified estimator satisfies all of these properties:

$$\hat{\boldsymbol{\xi}}_{\mathrm{A\text{-}R},i} = (\boldsymbol{\Lambda}'\boldsymbol{\Theta}^{-1}\mathbf{S}\boldsymbol{\Theta}^{-1}\boldsymbol{\Lambda})^{-1/2}\boldsymbol{\Lambda}'\boldsymbol{\Theta}^{-1}\mathbf{x}_i$$
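A short MATLAB sketch of both estimators, assuming mean-centered data X, the loading matrix Lambda, the diagonal uniqueness matrix Theta, and the sample covariance matrix S:

W     = inv(Theta);                                  % Theta is diagonal
XiWLS = (X * W * Lambda) / (Lambda' * W * Lambda);   % Bartlett WLS scores, N x n
M     = Lambda' * W * S * W * Lambda;
XiAR  = X * W * Lambda * M^(-0.5);                   % A-R scores; M^(-0.5) is the matrix inverse square root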
Regression on data variables

• Consider a partitioned vector of the data variables and common factors [z′, ξ′]′, with all entries mean-centered and normalized to unit variance; then the expectation of its cross-products is:

$$E\!\left(\begin{bmatrix}\mathbf{z}\\ \boldsymbol{\xi}\end{bmatrix}[\,\mathbf{z}',\; \boldsymbol{\xi}'\,]\right) = \begin{bmatrix}\boldsymbol{\Sigma}_{zz} & \boldsymbol{\Sigma}_{z\xi}\\ \boldsymbol{\Sigma}_{\xi z} & \boldsymbol{\Sigma}_{\xi\xi}\end{bmatrix} = \begin{bmatrix}\mathbf{R} & \mathbf{P}\\ \mathbf{P}' & \boldsymbol{\Phi}\end{bmatrix}, \qquad \mathbf{P} = \boldsymbol{\Lambda}\boldsymbol{\Phi}$$
• If we set up a regression equation for the common factors predicted by the scaled data,

$$\boldsymbol{\xi}_{N\times n} = \mathbf{Z}_{N\times q}\,\mathbf{B}_{q\times n} + \boldsymbol{\varepsilon}$$

then the OLS estimator of B is:

$$\hat{\mathbf{B}} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\boldsymbol{\xi} = \mathbf{R}^{-1}\mathbf{P} = \mathbf{R}^{-1}\boldsymbol{\Lambda}\boldsymbol{\Phi}$$

• From the OLS estimator of the regression weights, we have:

$$\hat{\boldsymbol{\xi}}_{\mathrm{REG},i} = \hat{\mathbf{B}}'\mathbf{z}_i = \boldsymbol{\Phi}\boldsymbol{\Lambda}'\mathbf{R}^{-1}\mathbf{z}_i$$

 Note that this estimator is applicable also to unstandardized, mean-centered data, by replacing Z, R, $\boldsymbol{\Lambda}_R$, $\boldsymbol{\Phi}_R$, respectively, with X, S, $\boldsymbol{\Lambda}_C$, $\boldsymbol{\Phi}_C$ (subscripts R and C denote the correlation- and covariance-based solutions) --- a sketch follows
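A minimal MATLAB sketch of the regression method (variable names assumed; Phi is the factor correlation matrix from the oblique rotation):

Z     = zscore(X);             % standardized data, N x q
R     = corr(X);
Bhat  = R \ (Lambda * Phi);    % R^{-1} * Lambda * Phi
XiREG = Z * Bhat;              % estimated factor scores, N x n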
• Factor scores (for factors 1 and 2) are estimated based on ML extraction and Oblimin rotation, once by regression and once by Anderson-Rubin WLS:

Rank   Reg1      Reg2     A-R1       A-R2       Reg1     Reg2     A-R1     A-R2
1      UK        USA      Portugl    Dom Rep    -0.933   -1.688   -1.067   -1.991
2      Kenya     Italy    Kenya      USA        -0.928   -1.465   -0.976   -1.687
3      USA       USSR     NewZlnd    Bermuda    -0.864   -1.259   -0.911   -1.532
4      Portugl   UK       Norway     Italy      -0.855   -1.158   -0.846   -1.467
5      E Ger     W Ger    Nethrlnd   Tailand    -0.853   -1.015   -0.844   -1.308

(left block: country rankings; right block: the corresponding estimated scores)
Reliability of model parameters

• Under normality, the ML method and PCA provide a basis for parametric testing

• For other methods, with large N, a split-half analysis can be performed to see whether an optimal factor solution derived from one random half of the data "agrees" with the result from the other half by the same method (same n, factoring, and rotation) --- using the congruence coefficient to assess how well the two sets agree (a sketch follows)
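A one-line MATLAB sketch of the congruence coefficient between two loading matrices L1 and L2 (assumed q × n, with columns already matched across the two halves):

phi = diag(L1' * L2) ./ sqrt(diag(L1' * L1) .* diag(L2' * L2));   % values near 1 indicate agreement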

• Alternatively, bootstrapping can be used to empirically create sampling distributions of the parameter estimates (e.g., Ichikawa & Konishi, 1995)

 Note that this type of non-parametric approach does not require any of the usual parametric assumptions
References for EFA

• Anderson, T.W., & Rubin, H. (1956). Statistical inference in factor analysis. Proceedings of the Third
Berkeley Symposium on Mathematical Statistics and Probability, 5, 111-150.
• Bartlett, M.S. (1950). Tests of significance in factor analysis. British Journal of Mathematical and
Statistical Psychology, 3, 77-85.
• Bentler, P.M., & Yuan, K.H. (1998). Tests for linear trend in the smallest eigenvalues of the correlation
matrix. Psychometrika, 63, 131–144.
• Carroll, J.B. (1953). An analytic solution for approximating simple structure in factor analysis.
Psychometrika, 18, 23-38.
• Carroll, J.B. (1960). Unpublished manuscript on the Oblimin rotation.
• Cattell, R.B. (1966). The Scree test for the number of factors. Multivariate Behavioral Research, 1, 245-
276.
• Guttman-Kaiser rule: Kaiser, H.F. (1970). A second generation of Little Jiffy. Psychometrika, 35, 401-415.
• Hong, S., Mitchell, S.K., & Harshman, R.A. (2006). Bootstrap scree tests: A Monte Carlo simulation and applications to published data. British Journal of Mathematical and Statistical Psychology, 59, 35-57.
• Horn, J.L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30,
179-185.
• Ichikawa, M., & Konishi, S. (1995). Application of the bootstrap methods in factor analysis. Psychometrika, 60, 77-93.
• Jennrich, R.I., & Sampson, P.F. (1966). Rotation for simple loadings. Psychometrika, 31, 313-323.
• Kaiser, H.F. (1958). The Varimax criterion for analytic rotation in factor analysis. Psychometrika, 23,
187-200.
• MINRES: Harman, H.H. (1976). Modern factor analysis (3rd ed.). Chicago: University of Chicago Press.
• ML factoring: Lawley, D.N., & Maxwell, A.E. (1971). Factor analysis as a statistical method (2nd ed.).
New York: Elsevier.
• Simple structure: Thurstone, L.L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
