Lecture 10

Factor Analysis

➢ The essential purpose of factor analysis is to describe, if possible, the covariance relationships among many variables in terms of a few underlying, but unobservable, random quantities called factors.
➢ Basically, the factor model is motivated by the following argument: Suppose
variables can be grouped by their correlations.
✓ That is, suppose all variables within a particular group are highly
correlated among themselves but have relatively small correlations
with variables in a different group.
➢ Then it is conceivable that each group of variables represents a single
underlying construct, or factor, that is responsible for the observed
correlations.
➢ For example, correlations from the group of test scores in classics, French,
English, mathematics, and music collected by Spearman suggested an
underlying "intelligence" factor. A second group of variables, representing
physical-fitness scores, if available, might correspond to another factor. It
is this type of structure that factor analysis seeks to confirm.

➢ Factor analysis can be considered an extension of principal component analysis.
✓ Both can be viewed as attempts to approximate the covariance matrix $\Sigma$.
✓ However, the approximation based on the factor analysis model is more elaborate.

Orthogonal Factor Model with m Common Factors

$$\underbrace{X}_{(p \times 1)} = \underbrace{\mu}_{(p \times 1)} + \underbrace{L}_{(p \times m)}\,\underbrace{F}_{(m \times 1)} + \underbrace{\varepsilon}_{(p \times 1)}$$

$\mu_i$ = mean of variable $i$
$\varepsilon_i$ = $i$th specific factor
$F_j$ = $j$th common factor
$l_{ij}$ = loading of the $i$th variable on the $j$th factor
The unobservable random vectors $F$ and $\varepsilon$ satisfy the following conditions:

$F$ and $\varepsilon$ are independent,
$E(F) = 0$, $\operatorname{Cov}(F) = I$,
$E(\varepsilon) = 0$, $\operatorname{Cov}(\varepsilon) = \Psi$, where $\Psi$ is a diagonal matrix.

Covariance Structure for the Orthogonal Factor Model

1. $\operatorname{Cov}(X) = LL' + \Psi$

or
$\operatorname{Var}(X_i) = l_{i1}^2 + \cdots + l_{im}^2 + \psi_i$
$\operatorname{Cov}(X_i, X_k) = l_{i1} l_{k1} + \cdots + l_{im} l_{km}$

2. $\operatorname{Cov}(X, F) = L$

or
$\operatorname{Cov}(X_i, F_j) = l_{ij}$

➢ The portion of the variance of the $i$th variable contributed by the $m$ common factors is called the $i$th communality.
➢ The portion of $\operatorname{Var}(X_i) = \sigma_{ii}$ due to the specific factor is often called the uniqueness, or specific variance.
➢ Denoting the $i$th communality by $h_i^2$, we see that

$$\underbrace{\sigma_{ii}}_{\operatorname{Var}(X_i)} = \underbrace{l_{i1}^2 + l_{i2}^2 + \cdots + l_{im}^2}_{\text{communality}} + \underbrace{\psi_i}_{\text{specific variance}}$$

or
$$h_i^2 = l_{i1}^2 + l_{i2}^2 + \cdots + l_{im}^2$$
and
$$\sigma_{ii} = h_i^2 + \psi_i, \qquad i = 1, 2, \ldots, p$$

The $i$th communality is the sum of squares of the loadings of the $i$th variable on the $m$ common factors.
Example: (Verify the relation $\Sigma = LL' + \Psi$ for two factors) Consider the covariance matrix

$$\Sigma = \begin{bmatrix} 19 & 30 & 2 & 12 \\ 30 & 57 & 5 & 23 \\ 2 & 5 & 38 & 47 \\ 12 & 23 & 47 & 68 \end{bmatrix}$$

The equality

$$\begin{bmatrix} 19 & 30 & 2 & 12 \\ 30 & 57 & 5 & 23 \\ 2 & 5 & 38 & 47 \\ 12 & 23 & 47 & 68 \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 7 & 2 \\ -1 & 6 \\ 1 & 8 \end{bmatrix} \begin{bmatrix} 4 & 7 & -1 & 1 \\ 1 & 2 & 6 & 8 \end{bmatrix} + \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$

or

$$\Sigma = LL' + \Psi$$

may be verified by matrix algebra. Therefore, $\Sigma$ has the structure produced by an $m = 2$ orthogonal factor model. Since

$$L = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \\ l_{31} & l_{32} \\ l_{41} & l_{42} \end{bmatrix} = \begin{bmatrix} 4 & 1 \\ 7 & 2 \\ -1 & 6 \\ 1 & 8 \end{bmatrix}, \qquad \Psi = \begin{bmatrix} \psi_1 & 0 & 0 & 0 \\ 0 & \psi_2 & 0 & 0 \\ 0 & 0 & \psi_3 & 0 \\ 0 & 0 & 0 & \psi_4 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$

the communality of $X_1$ is

$$h_1^2 = l_{11}^2 + l_{12}^2 = 4^2 + 1^2 = 17$$

and the variance of $X_1$ can be decomposed as

$$\sigma_{11} = (l_{11}^2 + l_{12}^2) + \psi_1 = h_1^2 + \psi_1$$

$$\underbrace{19}_{\text{variance}} = \underbrace{4^2 + 1^2}_{\text{communality}} + \underbrace{2}_{\text{specific variance}}$$

A similar breakdown occurs for the other variables.


Methods of Estimation

➢ The two most popular methods of parameter estimation are
(i) the Principal Component (and the related principal factor) method,
(ii) the Maximum Likelihood method.
➢ The solution from either can be rotated in order to simplify the interpretation of factors.

Principal Component Solution of the Factor Model

➢ The principal component factor analysis of the sample covariance matrix $S$ is specified in terms of its eigenvalue–eigenvector pairs $(\hat{\lambda}_1, \hat{e}_1), (\hat{\lambda}_2, \hat{e}_2), \ldots, (\hat{\lambda}_p, \hat{e}_p)$, where $\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \cdots \ge \hat{\lambda}_p$.
➢ Let $m < p$ be the number of common factors.
➢ Then the matrix of estimated factor loadings $\{\tilde{l}_{ij}\}$ is given by

$$\tilde{L} = \left[\sqrt{\hat{\lambda}_1}\,\hat{e}_1 \;\vdots\; \sqrt{\hat{\lambda}_2}\,\hat{e}_2 \;\vdots\; \cdots \;\vdots\; \sqrt{\hat{\lambda}_m}\,\hat{e}_m\right] \qquad (1)$$

➢ The estimated specific variances are provided by the diagonal elements of the matrix $S - \tilde{L}\tilde{L}'$, so

$$\tilde{\Psi} = \begin{bmatrix} \tilde{\psi}_1 & 0 & \cdots & 0 \\ 0 & \tilde{\psi}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \tilde{\psi}_p \end{bmatrix} \quad \text{with} \quad \tilde{\psi}_i = s_{ii} - \sum_{j=1}^{m} \tilde{l}_{ij}^2 \qquad (2)$$

➢ Communalities are estimated as

$$\tilde{h}_i^2 = \tilde{l}_{i1}^2 + \tilde{l}_{i2}^2 + \cdots + \tilde{l}_{im}^2 \qquad (3)$$

➢ The principal component factor analysis of the sample correlation matrix is obtained by starting with R in place of S.
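Formulas (1)–(3) translate directly into code. The sketch below is an illustration, not part of the lecture; the function name `pc_factor_solution` and its interface are my own. It accepts either $S$ or $R$:

```python
import numpy as np

def pc_factor_solution(S, m):
    """Principal component solution of the factor model, per (1)-(3).

    S : p x p sample covariance (or correlation) matrix
    m : number of common factors to retain
    Returns (loadings, specific_variances, communalities).
    """
    # Eigenvalues/vectors of a symmetric matrix, sorted in decreasing order
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # (1): columns of L-tilde are sqrt(lambda_j) * e_j for j = 1, ..., m
    L = eigvecs[:, :m] * np.sqrt(eigvals[:m])

    # (3): communalities are row sums of squared loadings
    h2 = np.sum(L ** 2, axis=1)

    # (2): specific variances are the diagonal of S - L L'
    psi = np.diag(S) - h2
    return L, psi, h2
```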

➢ Ideally, the contributions of the first few factors to the sample variances of the variables should be large.
✓ The contribution to the sample variance $s_{ii}$ from the first common factor is $\tilde{l}_{i1}^2$. The contribution to the total sample variance,

$$s_{11} + s_{22} + \cdots + s_{pp} = \operatorname{tr}(S),$$

from the first common factor is then

$$\tilde{l}_{11}^2 + \tilde{l}_{21}^2 + \cdots + \tilde{l}_{p1}^2 = \left(\sqrt{\hat{\lambda}_1}\,\hat{e}_1\right)'\left(\sqrt{\hat{\lambda}_1}\,\hat{e}_1\right) = \hat{\lambda}_1$$

since the eigenvector $\hat{e}_1$ has unit length.

In general,

$$\begin{pmatrix} \text{Proportion of total} \\ \text{sample variance} \\ \text{due to } j\text{th factor} \end{pmatrix} = \begin{cases} \dfrac{\hat{\lambda}_j}{s_{11} + s_{22} + \cdots + s_{pp}} & \text{for a factor analysis of } S \\[2ex] \dfrac{\hat{\lambda}_j}{p} & \text{for a factor analysis of } R \end{cases}$$

➢ The criterion above is frequently used as a heuristic device for determining the appropriate number of common factors.
➢ The number of common factors retained in the model is increased until a "suitable proportion" of the total sample variance has been explained (see the sketch below).

Example: (Factor analysis of consumer-preference data) In a consumer-preference study, a random sample of customers was asked to rate several attributes of a new product. The responses, on a 7-point semantic differential scale, were tabulated and the attribute correlation matrix constructed. The correlation matrix is presented next:

Attribute (variable)            1      2      3      4      5
Taste                    1    1.00    .02    .96    .42    .01
Good buy for money       2     .02   1.00    .13    .71    .85
Flavor                   3     .96    .13   1.00    .50    .11
Suitable for snack       4     .42    .71    .50   1.00    .79
Provides lots of energy  5     .01    .85    .11    .79   1.00

➢ It is clear from the large entries in the correlation matrix that variables 1 and 3 and variables 2 and 5 form groups.
➢ Variable 4 is "closer" to the (2, 5) group than the (1, 3) group.
➢ Given these results and the small number of variables, we might expect that
the apparent linear relationships between the variables can be explained in
terms of, at most, two or three common factors.

➢ The first two eigenvalues, $\hat{\lambda}_1 = 2.85$ and $\hat{\lambda}_2 = 1.81$, of R are the only eigenvalues greater than unity.
➢ Moreover, m = 2 common factors will account for a cumulative proportion

$$\frac{\hat{\lambda}_1 + \hat{\lambda}_2}{p} = \frac{2.85 + 1.81}{5} = .93$$

of the total (standardized) sample variance. The estimated factor loadings, communalities, and specific variances, obtained using (1), (2), and (3), are given in the following table.

                                 Estimated factor loadings   Communalities   Specific variances
Variable                         $\tilde{l}_{ij} = \sqrt{\hat{\lambda}_j}\,\hat{e}_{ij}$   $\tilde{h}_i^2$   $\tilde{\psi}_i = 1 - \tilde{h}_i^2$
                                    F1       F2
1. Taste                           .56      .82                  .98              .02
2. Good buy for money              .78     -.53                  .88              .12
3. Flavor                          .65      .75                  .98              .02
4. Suitable for snack              .94     -.10                  .89              .11
5. Provides lots of energy         .80     -.54                  .93              .07
Eigenvalues                       2.85     1.81
Cumulative proportion of total
(standardized) sample variance    .571     .932

Now

$$\tilde{L}\tilde{L}' + \tilde{\Psi} = \begin{bmatrix} .56 & .82 \\ .78 & -.53 \\ .65 & .75 \\ .94 & -.10 \\ .80 & -.54 \end{bmatrix} \begin{bmatrix} .56 & .78 & .65 & .94 & .80 \\ .82 & -.53 & .75 & -.10 & -.54 \end{bmatrix} + \begin{bmatrix} .02 & 0 & 0 & 0 & 0 \\ 0 & .12 & 0 & 0 & 0 \\ 0 & 0 & .02 & 0 & 0 \\ 0 & 0 & 0 & .11 & 0 \\ 0 & 0 & 0 & 0 & .07 \end{bmatrix} = \begin{bmatrix} 1.00 & .01 & .97 & .44 & .00 \\ & 1.00 & .11 & .79 & .91 \\ & & 1.00 & .53 & .11 \\ & & & 1.00 & .81 \\ & & & & 1.00 \end{bmatrix}$$
nearly reproduces the correlation matrix R. Thus, on a purely descriptive basis, we
would judge a two-factor model with the factor loadings displayed in the Table as
providing a good fit to the data. The communalities (.98, .88, .98, .89, .93) indicate
that the two factors account for a large percentage of the sample variance of each
variable.
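The whole example can be replayed with the `pc_factor_solution` sketch from the estimation section above. Up to rounding and possible sign flips of entire loading columns (eigenvectors are only determined up to sign), the output should match the table:

```python
import numpy as np

# Attribute correlation matrix from the consumer-preference study
R = np.array([[1.00, 0.02, 0.96, 0.42, 0.01],
              [0.02, 1.00, 0.13, 0.71, 0.85],
              [0.96, 0.13, 1.00, 0.50, 0.11],
              [0.42, 0.71, 0.50, 1.00, 0.79],
              [0.01, 0.85, 0.11, 0.79, 1.00]])

L, psi, h2 = pc_factor_solution(R, m=2)     # helper sketched earlier
print(np.round(L, 2))                       # loadings on F1 and F2
print(np.round(h2, 2))                      # communalities ~ (.98, .88, .98, .89, .93)
print(np.round(L @ L.T + np.diag(psi), 2))  # nearly reproduces R
```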
