
Factor Analysis

Factor analysis is a technique used to reduce a large number of overlapping variables into a smaller set of underlying factors. It groups variables that are correlated with one another and expresses their relationship to the underlying factors using factor loadings. The factors are latent variables that cannot be directly observed or measured. Key aspects of factor analysis include determining the number of factors to extract based on eigenvalues above 1, evaluating the strength of the factor solution using the KMO and Bartlett's test, and interpreting the factor loadings and communalities.

4/15/2023

[Diagram: observed items linked to a latent variable (construct), e.g. In-store Behaviour]

Note: Latent Variables cannot be measured directly

Factor Analysis


Meaning of Factor Analysis

• Factor analysis is a method of dimension reduction.

• It is an interdependence technique, so there is no distinction between dependent and independent variables. The factors are relatively independent of one another.

• It reduces a large number of overlapping variables to a smaller set of factors by grouping variables with similar characteristics together.

• It uses a small number of surrogate ("derived") variables, or factors, to represent the given variables.

• The relationship of each variable to the underlying factor is expressed by the factor loading.

• The initial number of factors is the same as the number of variables used in the factor analysis.

• Factors are latent variables, as they are not directly observable. They are also called surrogate variables.
• Sample size: ideally 300, or 10 to 15 cases per variable, and at least 5 per variable.
• KMO: measure of sampling adequacy. It should be greater than 0.5 (0.7 to 0.8 is ideal).
• Bartlett's test: an indication of the strength of the relationship among the variables. The null hypothesis (H0) is that the correlation matrix is an identity matrix, i.e. that the variables are uncorrelated. The p-value should be less than 0.05 (α), because we want to reject this null hypothesis.

• Factor Matrix (Component Matrix): contains the factor loadings of all the variables on all the extracted factors.
• Rotated Component Matrix: the idea of rotation is to reduce the number of factors on which the variables under investigation have high loadings. Rotation does not change the overall solution (the communalities and the total variance explained stay the same) but makes the interpretation of the analysis easier.
• Eigenvalue: a measure of how much of the variance of the observed variables a factor explains.
• Variables are also called items, indicators, or observed variables.
• Missing values are handled by EM (Expectation Maximisation) imputation or by deletion.


Investment data
Cut-off: 70%
F1: Perceived value of service
F2: Security Factor

Car data
Cut-off: 75%
1 (Economy), 2 (Spaciousness), 3 (Safety)


Example
• The dimensions (importance assigned by investors):
• Risk averseness
• Returns
• Insurance Cover
• Tax Rebate
• Maturity Time
• Credibility of the company
• Easy Accessibility

A 5-point scale is used, where:
1 = Strongly Disagree
2 = Disagree
3 = Neutral
4 = Agree
5 = Strongly Agree

Terminologies used in Factor Analysis


Communalities: A General Example

• Communalities tell us how much of the variance in each of the original variables is explained by the extracted factors.
• The proportion of common variance present in a variable is known as its communality.

Communalities

Variable                                         Initial   Extraction
Score on Risk Averseness                         1.000     0.598
Score on Returns                                 1.000     0.304
Score on Insurance Covers                        1.000     0.612
Score on Tax Rebate                              1.000     0.111
Score on Maturity Time                           1.000     0.624
Score on Credibility of Financial Institution    1.000     0.725
Score on Easy Accessibility                      1.000     0.631
Higher communalities are desirable.
The cut-off point has to be decided by the analyst.
If the cut-off is 50%, Returns (0.304) and Tax Rebate (0.111) fall below it and would be candidates for exclusion; the other five variables would be retained.
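
As a rough illustration of how the Extraction column arises, the sketch below recomputes each communality as the sum of squared loadings across the two extracted factors and flags the variables that fall under a 0.5 cut-off. The loadings are taken from the Component Matrix of this same investment example; the names `loadings`, `communality`, and `CUTOFF` are illustrative choices, not part of the slides.

```python
# Unrotated loadings (factor 1, factor 2) from the Component Matrix
# of the investment example in these slides.
loadings = {
    "Risk Averseness": (-0.176, 0.753),
    "Returns": (0.527, 0.160),
    "Insurance Covers": (0.335, -0.707),
    "Tax Rebate": (0.309, 0.125),
    "Maturity Time": (0.765, -0.198),
    "Credibility of Financial Institution": (0.570, 0.633),
    "Easy Accessibility": (0.793, 0.047),
}

CUTOFF = 0.5  # assumption: the analyst picks 0.5, as discussed in the text


def communality(row):
    """Communality = sum of squared loadings of one variable over all factors."""
    return sum(value * value for value in row)


communalities = {name: communality(row) for name, row in loadings.items()}
below_cutoff = [name for name, h2 in communalities.items() if h2 < CUTOFF]

for name, h2 in communalities.items():
    print(f"{name:40s} {h2:.3f}")
print("Below cut-off:", below_cutoff)
```

Because the loadings are rounded to three decimals, the recomputed values can differ from the reported Extraction column in the last digit.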

Establishing the Strength of the Factor Analysis Solution

KMO (Kaiser-Meyer-Olkin) Measure of Sampling Adequacy: this measure varies between 0 and 1, and a value closer to 1 is better. For a given set of variables, a KMO above 0.50 confirms that factor analysis can be conducted on the data (the suggested minimum is ideally 0.60). Values smaller than 0.5 indicate a problem with sampling, meaning the sample is inadequate.

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy            0.591
Bartlett's Test of Sphericity     Approx. Chi-Square       80.004
                                  df                       21
                                  Sig.                     0.000

Bartlett's Test of Sphericity tests the null hypothesis that the correlation matrix is an identity matrix. An identity matrix is a matrix in which all of the diagonal elements are 1 and all off-diagonal elements are 0 (indicating a lack of correlation). You want to reject this null hypothesis.

If the significance (p) value is less than 0.05, the factor analysis can be performed.
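
The two diagnostics described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the SPSS routine: the data are synthetic (seven made-up variables driven by two made-up latent factors), and the function names `bartlett_sphericity` and `kmo` are hypothetical. The Bartlett statistic uses the usual approximation -(n - 1 - (2p + 5)/6) · ln|R| with p(p - 1)/2 degrees of freedom, and KMO compares squared correlations with squared partial correlations.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return (chi_square, df) for Bartlett's test of sphericity.

    data: 2-D array with respondents in rows and variables in columns.
    """
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)  # ln|R|, which is <= 0 for a correlation matrix
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi_square, df


def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    sum_r2 = np.sum(corr[off_diagonal] ** 2)
    sum_partial2 = np.sum(partial[off_diagonal] ** 2)
    return sum_r2 / (sum_r2 + sum_partial2)


# Synthetic data (an assumption for illustration): 200 respondents, 7 variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[0.8, 0.0], [0.7, 0.1], [0.0, 0.8], [0.6, 0.2],
                    [0.1, 0.7], [0.7, 0.0], [0.0, 0.6]])
data = latent @ weights.T + 0.6 * rng.normal(size=(200, 7))

chi_square, df = bartlett_sphericity(data)
kmo_value = kmo(data)
print(f"Bartlett chi-square = {chi_square:.3f}, df = {df}")
print(f"KMO = {kmo_value:.3f}")
```

With seven variables the degrees of freedom come out as 21, matching the df row of the output table. If SciPy is available, the p-value could be obtained with `scipy.stats.chi2.sf(chi_square, df)`.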


In our example:

Interpretation
• The KMO of 0.591 is greater than 0.5, which indicates that factor analysis can be performed on the given data set. Values smaller than 0.5 indicate a problem with sampling, meaning the sample is inadequate.
• Sig. = .000 is the p-value. It is less than 0.05, so we reject the null hypothesis that the correlation matrix of the variables is an identity matrix (i.e. that the variables are uncorrelated).

Eigenvalues are most commonly reported in factor analyses. They are calculated
and used in deciding how many factors to extract in the overall factor analysis.

• Communality is denoted by h².
• It indicates how much of the variance of each variable is accounted for by the underlying factors taken together.
• The communality should be more than 0.5 for a variable to be considered for further analysis; otherwise the variable is removed from the further steps of the factor analysis.
• Example: over 72% of the variance in "credibility of financial institution" is accounted for, while only 30.4% of the variance in "Returns" is accounted for.


Eigenvalue

• Initially, the total of the eigenvalues is equal to the total number of variables.
• The eigenvalue of each factor tells us how much of the variance in the observed indicators is explained by that latent factor.
• The eigenvalue is used to decide the number of factors that should be retained.
• The common practice is to retain factors that have eigenvalues above 1.

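Percentage of variance explained by each factor (eigenvalue ÷ number of variables × 100):

Factor 1: (2.054 ÷ 7) × 100 ≈ 29.346
Factor 2: (1.551 ÷ 7) × 100 ≈ 22.160

Eigenvalue of each factor as the sum of its squared loadings (component matrix):

Factor 1 = (-0.176)² + 0.527² + 0.335² + 0.309² + 0.765² + 0.570² + 0.793² ≈ 2.054
Factor 2 = 0.753² + 0.160² + (-0.707)² + 0.125² + (-0.198)² + 0.633² + 0.047² ≈ 1.55

Sums of squared loadings after rotation:

Factor 1 = 0.057² + 0.551² + 0.109² + 0.332² + 0.671² + 0.732² + 0.771² ≈ 2.010
Factor 2 = (-0.771)² + 0.004² + 0.775² + (-0.027)² + 0.417² + (-0.435)² + 0.192² ≈ 1.60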
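
The arithmetic above can be reproduced with a few lines of Python. The sketch below sums the squared loadings column by column to get each factor's eigenvalue and then converts it to a percentage of the total variance (seven standardized variables, so the total is 7). The variable and function names are illustrative, and small differences from the reported figures are expected because the loadings are rounded.

```python
# Unrotated loadings (factor 1, factor 2), one row per variable, in the
# order used in the slides: Risk Averseness, Returns, Insurance Covers,
# Tax Rebate, Maturity Time, Credibility, Easy Accessibility.
unrotated = [
    (-0.176, 0.753),
    (0.527, 0.160),
    (0.335, -0.707),
    (0.309, 0.125),
    (0.765, -0.198),
    (0.570, 0.633),
    (0.793, 0.047),
]


def factor_eigenvalues(loadings):
    """Eigenvalue of each factor = sum of squared loadings in its column."""
    n_factors = len(loadings[0])
    return [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]


def percent_variance(eigenvalue, n_variables):
    """Share of total variance explained by one factor, in percent."""
    return eigenvalue / n_variables * 100


eigenvalues = factor_eigenvalues(unrotated)
percentages = [percent_variance(ev, len(unrotated)) for ev in eigenvalues]

for j, (ev, pct) in enumerate(zip(eigenvalues, percentages), start=1):
    print(f"Factor {j}: eigenvalue = {ev:.3f}, variance explained = {pct:.2f}%")
```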

• The scree plot graphs the eigenvalue against the factor number.
• It is an alternative method of determining the appropriate number of factors to retain.

1) Eigenvalues: retain all factors with an eigenvalue (EV) > 1
2) Scree plot: retain all factors "before the elbow"

Focus only on the factors above the elbow point, i.e. the factors that explain a sufficient degree of variance in the observed indicators.

This plot shows that there are three relatively high eigenvalues (factors 1, 2, and 3). Retain the factors that are above the "bend".
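
A minimal sketch of the eigenvalue rule, assuming a small made-up 4 × 4 correlation matrix (not data from these slides): the eigenvalues of a correlation matrix add up to the number of variables, and the factors with eigenvalues above 1 are retained. The text bars give a crude scree plot.

```python
import numpy as np

# Hypothetical correlation matrix for four variables (illustration only):
# variables 1-2 are strongly correlated, and so are variables 3-4.
corr = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])

# eigvalsh handles symmetric matrices; sort from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_retained = int(np.sum(eigenvalues > 1.0))  # eigenvalue-above-1 rule

for i, ev in enumerate(eigenvalues, start=1):
    bar = "#" * int(round(ev * 10))  # crude text version of a scree plot
    print(f"Factor {i}: {ev:.3f} {bar}")

print("Sum of eigenvalues:", round(float(eigenvalues.sum()), 6))
print("Factors retained:", n_retained)
```

In this toy matrix two eigenvalues exceed 1, which mirrors the two correlated pairs built into it.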

• Elements of the matrix are called factor loadings.
• The correlation between risk averseness and factor 1 is -0.176, and between insurance cover and factor 2 it is -0.707.
• Factor loadings are used to compute the eigenvalue of each factor.

Factor 1 = (-0.176)² + 0.527² + 0.335² + 0.309² + 0.765² + 0.570² + 0.793² ≈ 2.054
Factor 2 = 0.753² + 0.160² + (-0.707)² + 0.125² + (-0.198)² + 0.633² + 0.047² ≈ 1.55


• To interpret the results, a cut-off point is decided.
• There is no hard and fast rule for deciding the cut-off point.
• Generally, values above 0.5 are considered.
• Let us take the cut-off point as 0.7.
• Two variables on factor 1 and two variables on factor 2 have loadings above 0.7 in absolute value.
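
The cut-off step can be sketched as a simple filter over the component matrix. The loadings are those of the investment example; comparing absolute values is an assumption made here so that a strong negative loading such as -0.707 also counts. The names `CUTOFF` and `high_loaders` are illustrative.

```python
# Unrotated loadings (factor 1, factor 2) from the investment example.
loadings = {
    "Risk Averseness": (-0.176, 0.753),
    "Returns": (0.527, 0.160),
    "Insurance Covers": (0.335, -0.707),
    "Tax Rebate": (0.309, 0.125),
    "Maturity Time": (0.765, -0.198),
    "Credibility of Financial Institution": (0.570, 0.633),
    "Easy Accessibility": (0.793, 0.047),
}

CUTOFF = 0.7  # cut-off point chosen in the text


def high_loaders(loadings, factor_index, cutoff):
    """Variables whose absolute loading on the given factor exceeds the cut-off."""
    return [name for name, row in loadings.items()
            if abs(row[factor_index]) > cutoff]


factor_1 = high_loaders(loadings, 0, CUTOFF)
factor_2 = high_loaders(loadings, 1, CUTOFF)
print("Factor 1:", factor_1)
print("Factor 2:", factor_2)
```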

F1 :Perceived value of service


F2: Security Factor

Rotation serves to make the output more understandable by seeking a so-called "simple structure".

Factor 1 = 0.057² + 0.551² + 0.109² + 0.332² + 0.671² + 0.732² + 0.771² ≈ 2.010; (2.010 ÷ 7) × 100 = 28.71%
Factor 2 = (-0.771)² + 0.004² + 0.775² + (-0.027)² + 0.417² + (-0.435)² + 0.192² ≈ 1.60; (1.60 ÷ 7) × 100 = 22.86%
Total = 28.71 + 22.86 = 51.57%
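
To make the idea of rotation concrete, here is a hedged sketch of a plain varimax rotation using the common SVD-based iteration, applied to the unrotated loadings of the investment example. It is an illustration, not the SPSS procedure: it omits the Kaiser normalization that SPSS applies, so the rotated loadings may differ from the reported ones in sign, column order, or the last digits. The function name `varimax` and its parameters are assumptions of this sketch. What it does demonstrate is that an orthogonal rotation leaves each variable's communality unchanged.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Plain varimax rotation (no Kaiser normalization) via SVD iterations.

    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = np.sum(rotated ** 2, axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated * (column_ss / p))
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if criterion > 0 and new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


# Unrotated loadings from the Component Matrix of the investment example.
unrotated = np.array([
    [-0.176, 0.753],
    [0.527, 0.160],
    [0.335, -0.707],
    [0.309, 0.125],
    [0.765, -0.198],
    [0.570, 0.633],
    [0.793, 0.047],
])

rotated, rotation = varimax(unrotated)
communalities_before = (unrotated ** 2).sum(axis=1)
communalities_after = (rotated ** 2).sum(axis=1)

print(np.round(rotated, 3))
print("Communalities unchanged:",
      bool(np.allclose(communalities_before, communalities_after)))
```

The column sums of squared loadings are redistributed between the factors, while their total stays the same, which is why the cumulative percentage of variance is identical before and after rotation.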

Communalities

Variable                                         Initial   Extraction
Score on Risk Averseness                         1.000     .598
Score on Returns                                 1.000     .304
Score on Insurance Covers                        1.000     .612
Score on Tax Rebate                              1.000     .111
Score on Maturity Time                           1.000     .624
Score on Credibility of Financial Institution    1.000     .725
Score on Easy Accessibility                      1.000     .631

Extraction Method: Principal Component Analysis.

Component Matrix (a)

                                                 Component
Variable                                         1         2
Score on Risk Averseness                         -.176     .753
Score on Returns                                 .527      .160
Score on Insurance Covers                        .335      -.707
Score on Tax Rebate                              .309      .125
Score on Maturity Time                           .765      -.198
Score on Credibility of Financial Institution    .570      .633
Score on Easy Accessibility                      .793      .047

Extraction Method: Principal Component Analysis.
a. 2 components extracted.


Factor Analysis – Data set 2

Total Variance Explained

            Initial Eigenvalues                  Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %   Total   % of Variance  Cumulative %
1           3.946   39.458         39.458        3.946   39.458         39.458         3.910   39.097         39.097
2           2.704   27.044         66.502        2.704   27.044         66.502         2.533   25.335         64.432
3           1.601   16.011         82.513        1.601   16.011         82.513         1.808   18.081         82.513

Extraction Method: Principal Component Analysis.

Rotated Component Matrix (a)

Components: 1 (Economy), 2 (Spaciousness), 3 (Safety)

Item                                                                            Loading
The price of the car should be reasonable                                       .969
Fuel mileage of the car should be at least 17 km/l                              .961
A small car should be easy to maintain and to be serviced                       .961
Seating should be comfortable for four adults                                   .897
A small car should have adequate leg space                                      .907
Brakes are the most critical part of a small car                                .830
Collapsible steering column should be standard equipment in all the new cars    .779
Power steering is a must
Interior accessories of a small car should be attractive
Small space for parking                                                         .974

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.

Cut-off: more than 0.75


Factor Analysis – Dataset 3

Total Variance Explained

            Initial Eigenvalues                  Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %   Total   % of Variance  Cumulative %
1           3.177   31.775         31.775        3.177   31.775         31.775         3.041   30.408         30.408
2           3.050   30.499         62.274        3.050   30.499         62.274         3.030   30.296         60.703
3           1.845   18.447         80.720        1.845   18.447         80.720         2.002   20.017         80.720

Extraction Method: Principal Component Analysis.

Rotated Component Matrix (a)

Components: 1 (Customer Service), 2 (Flyer Incentive), 3 (Convenience)

Item                                                 Loading
Jet Airways are always on time                       .954
The seats are very comfortable                       .962
I love the food they provide                         .912
Their air-hostesses are very beautiful               .965
My boss/friend flies with same aircraft
The airline has younger aircraft                     .959
I got the advantage of frequent flyer programme      .985
Flight timings suit my schedule                      .958
My mom feels safe when I fly Jet
Flying Jet complements my lifestyle                  .956

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.

Cut-off: 0.75
