Factor Analysis
[Figure: items mapped to a latent variable/construct; example construct: in-store behaviour]
4/15/2023
Factor analysis reduces a large number of overlapping variables to a smaller set of components
(factors) by grouping variables with similar characteristics together.
It uses a small number of surrogate (“derived”) variables, the factors, to represent the given
variables.
The relationship of each variable to the underlying factor is expressed by its factor loading.
The initial number of factors is the same as the number of variables used in the factor analysis.
• Factors are latent variables, as they are not directly observable. They are also called surrogate variables
• Sample size: ideally 300, or 10 to 15 cases per variable; at least 5 per variable
• KMO: measure of sampling adequacy. Should be greater than 0.5 (0.7 to 0.8 is ideal)
• Bartlett's test: indicates the strength of the relationship among variables. The null hypothesis (H0) is that
the correlation matrix is an identity matrix, i.e. that the variables are uncorrelated. The p-value should be
less than 0.05 (α), because we want to reject this null hypothesis.
• Factor Matrix (Component Matrix): contains the factor loadings of all the variables on all the
extracted factors.
• Rotated Component Matrix: the idea of rotation is to reduce the number of factors on which the variables
under investigation have high loadings. Rotation does not change the communalities; it only redistributes
the loadings across the factors, which makes the interpretation of the analysis easier.
• Eigenvalues: the eigenvalue is a measure of how much of the variance of the observed variables a
factor explains
• Variables are also called items, indicators, or observed variables
• Missing values: handle by EM (Expectation Maximisation) imputation or by deletion
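The points about the initial number of factors and about eigenvalues can be checked numerically. The sketch below is only an illustration: the data are simulated (hypothetical, not from these slides), and it assumes the eigenvalues are taken from the correlation matrix of the observed variables, as in principal component extraction.

```python
import numpy as np

# Hypothetical data: 200 respondents, 6 items driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[0.9, 0.8, 0.7, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.9, 0.8, 0.7]])
items = latent @ weights + 0.5 * rng.normal(size=(200, 6))

# Correlation matrix of the observed variables (columns are variables).
corr = np.corrcoef(items, rowvar=False)

# eigvalsh returns eigenvalues in ascending order for symmetric matrices; reverse them.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

n_vars = corr.shape[0]
share = eigenvalues / n_vars  # proportion of total variance per factor

for k, (ev, s) in enumerate(zip(eigenvalues, share), start=1):
    print(f"factor {k}: eigenvalue = {ev:.3f}, variance explained = {s:.1%}")
```

There are as many eigenvalues as variables, and they sum to the number of variables, so dividing an eigenvalue by that number gives the share of variance the factor explains.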
Investment data (cut-off: 70%)
F1: Perceived value of service; F2: Security factor
Car data (cut-off: 75%)
1 (Economy); 2 (Spaciousness); 3 (Safety)
Example
The dimensions (importance assigned by investors):
• Risk averseness
• Returns
• Insurance cover
• Tax rebate
• Maturity time
• Credibility of the company
• Easy accessibility
Communalities
Variable                                          Initial   Extraction
Score on Risk Averseness                          1.000     0.598
Score on Returns                                  1.000     0.304
Score on Insurance Covers                         1.000     0.612
Score on Tax Rebate                               1.000     0.111
Score on Maturity Time                            1.000     0.624
Score on Credibility of Financial Institution     1.000     0.725
Score on Easy Accessibility                       1.000     0.631
Higher communalities are desirable.
The cut-off point has to be decided by the analyst.
If the cut-off is 50%, Returns (0.304) and Tax Rebate (0.111) fall below it and are candidates for exclusion; the
other five variables in the above table are retained.
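How the choice of cut-off changes which variables are flagged can be sketched as follows. The communalities are the extraction values from the table above; the function name `below_cutoff` and the 0.10 comparison cut-off are illustrative assumptions.

```python
# Extraction communalities from the Communalities table.
communalities = {
    "Risk Averseness": 0.598,
    "Returns": 0.304,
    "Insurance Covers": 0.612,
    "Tax Rebate": 0.111,
    "Maturity Time": 0.624,
    "Credibility of Financial Institution": 0.725,
    "Easy Accessibility": 0.631,
}


def below_cutoff(values, cutoff):
    """Return the variables whose communality is below the analyst's cut-off."""
    return [name for name, h2 in values.items() if h2 < cutoff]


# Compare a lenient (hypothetical) cut-off with the 50% cut-off.
for cutoff in (0.10, 0.50):
    print(f"cut-off {cutoff:.2f}: flagged = {below_cutoff(communalities, cutoff)}")
```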
KMO (Kaiser-Meyer-Olkin) Measure of Sampling Adequacy: this measure varies between 0 and 1, and a value closer to 1 is
better. For a given set of variables, a KMO above 0.50 confirms that factor analysis can be conducted on the data
(the suggested minimum is ideally 0.60). Values smaller than 0.5 indicate a problem with sampling, i.e. the sample is
inadequate.
Interpretation
• A KMO greater than 0.5 indicates that factor analysis can be performed on the given data set. Values smaller than
0.5 indicate a problem with sampling, i.e. the sample is inadequate
• Sig. = .000 is the p-value of Bartlett's test. It is less than 0.05, so we reject the null hypothesis that the
correlation matrix is an identity matrix (i.e. that the variables are uncorrelated)
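Both statistics can be computed from the correlation matrix. The sketch below is a minimal illustration on simulated (hypothetical) data, not the data behind these slides. It assumes the usual textbook formulas: Bartlett's statistic -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO as the ratio of squared correlations to squared correlations plus squared partial correlations. The function names are illustrative.

```python
import numpy as np
from scipy.stats import chi2


def bartlett_sphericity(data):
    """Bartlett's test of H0: the correlation matrix is an identity matrix."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # slogdet gives ln|R| without underflow for small determinants.
    _, logdet = np.linalg.slogdet(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) / 2
    p_value = chi2.sf(statistic, dof)  # upper-tail probability
    return statistic, dof, p_value


def kmo_overall(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations come from the scaled inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    a2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + a2)


# Hypothetical data: 200 respondents, 6 items driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[0.9, 0.8, 0.7, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 0.9, 0.8, 0.7]])
items = latent @ weights + 0.5 * rng.normal(size=(200, 6))

statistic, dof, p_value = bartlett_sphericity(items)
kmo = kmo_overall(items)
print(f"Bartlett chi-square = {statistic:.1f}, df = {dof:.0f}, p = {p_value:.4f}")
print(f"KMO = {kmo:.3f}")
```

A p-value below 0.05 together with a KMO above 0.5 is the combination the interpretation above looks for before proceeding.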
Eigenvalues are most commonly reported in factor analyses. They are calculated
and used in deciding how many factors to extract in the overall factor analysis.
• Communality is denoted by h²
• It indicates how much of each variable's variance is
accounted for by the underlying factors
taken together
• The communality should be more than 0.5 for a
variable to be considered for further analysis;
otherwise the variable is removed from the
further steps of the factor analysis
• Example: over 72% of the variance in
“credibility of financial institution” is
accounted for, while only 30.4% of the
variance in “Returns” is accounted for
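A communality can be reproduced as the sum of a variable's squared loadings across the extracted factors. The sketch below uses the two-factor loadings that appear in this deck's sums-of-squares calculation. Pairing them with variables in the order of the Communalities table is an assumption; it reproduces the tabulated values to within rounding.

```python
# Loadings on the two extracted factors (variable order assumed to follow the table).
loadings = {
    "Risk Averseness": (0.057, -0.771),
    "Returns": (0.551, 0.004),
    "Insurance Covers": (0.109, 0.775),
    "Tax Rebate": (0.332, -0.027),
    "Maturity Time": (0.671, 0.417),
    "Credibility of Financial Institution": (0.732, -0.435),
    "Easy Accessibility": (0.771, 0.192),
}


def communality(row):
    """h² = sum of squared loadings of one variable over all extracted factors."""
    return sum(loading * loading for loading in row)


h2 = {name: communality(row) for name, row in loadings.items()}

for name, value in h2.items():
    print(f"{name}: h2 = {value:.3f}")
```

Because the loadings are themselves rounded to three decimals, a computed value can differ from the tabulated one in the last digit.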
Eigenvalue
The scree plot graphs the eigenvalue against the factor number. It is an alternative method of
determining the appropriate number of factors to retain.
Focus only on the factors above the elbow point, i.e. the factors that explain a sufficient degree of variance in the
observed indicators.
This plot shows three relatively high eigenvalues (factors 1, 2, and 3). Retain the factors that are above
the ‘bend’.
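Reading the elbow is normally done by eye. As a rough sketch, one simple heuristic (an assumption here, not a rule from these slides) is to retain the factors that come before the largest drop between consecutive eigenvalues. The eigenvalues below are hypothetical and chosen to mimic a plot with three relatively high values.

```python
def factors_above_elbow(eigenvalues):
    """Count the factors before the largest drop between consecutive eigenvalues."""
    drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
    return drops.index(max(drops)) + 1


# Hypothetical eigenvalues for 7 variables, sorted from largest to smallest.
eigenvalues = [2.6, 1.9, 1.5, 0.4, 0.3, 0.2, 0.1]

# Text-only scree plot: one bar per factor.
for k, ev in enumerate(eigenvalues, start=1):
    print(f"factor {k}: {'#' * round(ev * 10)} {ev}")

n_retained = factors_above_elbow(eigenvalues)
print("factors retained:", n_retained)
```

The heuristic can mislead when the eigenvalues decline gradually, so the plot itself should still be inspected.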
Factor 1: 0.057² + 0.551² + 0.109² + 0.332² + 0.671² + 0.732² + 0.771² = 2.010; (2.01/7) × 100 = 28.71%
Factor 2: (-0.771)² + 0.004² + 0.775² + (-0.027)² + 0.417² + (-0.435)² + 0.192² = 1.60; (1.60/7) × 100 = 22.86%
Total = 28.71 + 22.86 = 51.57%
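The same arithmetic can be sketched in code: the sum of squared loadings on a factor, divided by the number of variables (7), gives that factor's share of the total variance. The loadings are the ones from the calculation above; the function name is illustrative.

```python
# Loadings of the seven variables on each factor, as used in the calculation above.
factor_1 = [0.057, 0.551, 0.109, 0.332, 0.671, 0.732, 0.771]
factor_2 = [-0.771, 0.004, 0.775, -0.027, 0.417, -0.435, 0.192]


def variance_explained(loadings):
    """Return (sum of squared loadings, percentage of total variance)."""
    sum_sq = sum(loading * loading for loading in loadings)
    return sum_sq, 100 * sum_sq / len(loadings)


ss1, pct1 = variance_explained(factor_1)
ss2, pct2 = variance_explained(factor_2)

print(f"factor 1: sum of squares = {ss1:.3f}, variance explained = {pct1:.2f}%")
print(f"factor 2: sum of squares = {ss2:.3f}, variance explained = {pct2:.2f}%")
print(f"total variance explained = {pct1 + pct2:.2f}%")
```

The unrounded result for the second factor comes out slightly below 22.86%, because the hand calculation rounds the sum of squares to 1.60 before dividing by 7.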
Component Matrix (partial)
Variable                     Component 1   Component 2
Score on Tax Rebate          .309          .125
Score on Maturity Time       .765          -.198
Extraction Method: Principal Component Analysis.
[Table columns: Component | Initial Eigenvalues | Extraction Sums of Squared Loadings | Rotation Sums of Squared Loadings]
Component: 1 (Customer Service), 2 (Flyer Incentive), 3 (Convenience)