Factor Analysis Exercises
You are an expert in labor policies working for the European Commission and you have been asked
by your supervisor to compile a report on the situation of labor and social protection within the 27
countries belonging to the European Union. For this purpose, you decide to use data made
available from the World Bank for the year 2012.
All variables consist of percentages, therefore all of them are expressed on the same scale.
1) Based on Table 4.1, which problem do you have to deal with in order to proceed with the
analysis?
I notice that there are some missing values. I could therefore opt either for removing
the countries with missing information, whether listwise or pairwise, or for replacing the
missing values with the mean of the variable.
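As a minimal sketch, the two options can be illustrated on a toy data matrix (the values below are hypothetical, not the actual World Bank figures):

```python
import numpy as np

# Toy data matrix: 4 countries x 2 indicators, one value missing (NaN).
X = np.array([[60.0, 8.0],
              [55.0, np.nan],
              [70.0, 12.0],
              [65.0, 10.0]])

# Option 1: listwise deletion -- drop every row containing a missing value.
listwise = X[~np.isnan(X).any(axis=1)]

# Option 2: mean imputation -- replace each NaN with its column mean.
col_means = np.nanmean(X, axis=0)
imputed = np.where(np.isnan(X), col_means, X)
```

Listwise deletion shrinks the sample, while mean imputation keeps all 27 countries at the cost of attenuating the variable's variance.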
2) What does Bartlett’s test of sphericity tell you about the correlation between the
items? What is the null hypothesis behind the test? Given a value of 673.178 (p-
value=.000) for Bartlett’s test, would you proceed with the analysis or not? Are you
confident with that or would you need some other piece of information?
Bartlett’s test of sphericity is a global test on the correlation matrix. Its null
hypothesis states that the correlations between the variables are all equal to zero, i.e.
that the correlation matrix is an identity matrix. Given the very large test statistic and
the p-value below any conventional significance level, the null hypothesis is rejected.
However, the result of the test alone is not enough to proceed with the analysis: the test
only tells me that there is some correlation among the variables, not how strong it is or
how many variables it involves. For this information, I need the value of the KMO measure.
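The test statistic is a standard chi-square approximation based on the determinant of the correlation matrix. A sketch, run here on illustrative data (one common factor, hypothetical values, not the report's data):

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(data):
    """Bartlett's test of sphericity.

    H0: the correlation matrix is an identity matrix, i.e. the
    variables are mutually uncorrelated."""
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)
    # Chi-square approximation based on the determinant of R.
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    p_value = stats.chi2.sf(chi2, df)
    return chi2, df, p_value

# Illustrative data: four indicators driven by one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
data = common + 0.3 * rng.normal(size=(100, 4))

chi2, df, p_value = bartlett_sphericity(data)
```

With strongly correlated columns the determinant of R is close to zero, so the statistic is large and the null hypothesis of sphericity is rejected.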
3) You decide to remove two variables from the analysis (“employment in industry” and
“employers”). Looking at the main diagonal of the new anti-image matrix, you see that the
values range from 0.644 (“Unemployment, total”) to 0.800 (“Vulnerable employment,
total”). What do these values tell us? Are you confident enough with going on with the
analysis (the new KMO measure is 0.742)?
The main diagonal of the anti-image correlation matrix reports the KMO MSA (measure of
sampling adequacy) for each variable, that is to say, the amount of common variance
between that variable and the remaining ones. The smallest value, 0.644, indicates that
the inter-correlation between that item and the others is only mediocre. However, since
all the other values are above 0.7, the degree of inter-correlation is satisfactory.
Moreover, the overall KMO measure is larger than 0.7, indicating that the total degree of
inter-correlation in the data is satisfactory. Hence, I would proceed with the analysis.
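Both the overall KMO and the per-variable MSA values compare the observed correlations with the partial correlations, which can be obtained from the inverse of the correlation matrix. A sketch on illustrative data (not the report's):

```python
import numpy as np

def kmo(data):
    """Kaiser-Meyer-Olkin measure of sampling adequacy.

    Ratio of squared correlations to squared correlations plus squared
    partial correlations; values near 1 indicate much shared variance."""
    R = np.corrcoef(data, rowvar=False)
    inv_R = np.linalg.inv(R)
    # Partial correlations are read off the inverse correlation matrix.
    d = np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
    P = -inv_R / d
    np.fill_diagonal(R, 0.0)  # exclude the diagonal from the sums
    np.fill_diagonal(P, 0.0)
    r2, p2 = R ** 2, P ** 2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))  # per variable
    overall = r2.sum() / (r2.sum() + p2.sum())                # overall KMO
    return overall, msa

# Illustrative data with one strong common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 1))
data = common + 0.5 * rng.normal(size=(200, 5))
overall, msa = kmo(data)
```

The per-variable `msa` values are what appear on the main diagonal of the anti-image correlation matrix.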
4) What are communalities? What is the difference between the initial communalities for a
“principal axis factoring analysis” and the initial communalities for a “principal component
analysis”?
The communality of a variable is the proportion of that variable’s variance that is in
common with the other variables and can therefore be explained by the factors. It can be
computed as the sum of the variable’s squared factor loadings.
The initial communalities under the principal axis factoring analysis aim to describe the
share of common variance between each variable and all the remaining ones, thus they
are numbers between 0 and 1. This happens because this method is not interested in
explaining the whole variance in the data, but only the common variance between the
variables.
Under PCA, instead, the initial communalities are all equal to one, because PCA aims to
reduce the dimension of the data while explaining as much as possible of the total
variability, not only the variance the variables have in common.
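The difference can be shown numerically: under principal axis factoring the initial communality of each variable is its squared multiple correlation (SMC) with all the other variables, which can be read off the inverse of the correlation matrix. A sketch on illustrative data (not the report's):

```python
import numpy as np

# Illustrative data: four variables sharing one common factor.
rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))
data = common + 0.5 * rng.normal(size=(300, 4))

R = np.corrcoef(data, rowvar=False)

# Principal axis factoring: the initial communality of each variable is
# its squared multiple correlation (SMC) with the other variables, i.e.
# the R^2 of regressing it on all the rest -- a number between 0 and 1.
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Principal component analysis: the initial communalities are all 1,
# since PCA sets out to explain the variables' entire variance.
pca_initial = np.ones(R.shape[1])
```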
5) Table 4.3 shows the total variance explained by each of the factors. What possible criteria
might have been used to select the number of factors? Is it OK if the two factors explain
only 81% of the whole variance?
One possible criterion is to select those factors characterized by an initial eigenvalue
bigger than one, which indicates that the factor accounts for more variability than a
single variable. Another possible criterion is to look at the scree plot, which plots the
eigenvalues against the factor number. The optimal number of factors is the one after
which the line becomes almost flat (the “elbow”), since each subsequent factor accounts
for a progressively smaller share of the total variance.
I am fine with the fact that only around 81% of the total variance is explained by the
factors. The aim of factor analysis is to reduce the data to a smaller set of variables
while retaining as much as possible of the total variance in the initial data. In this
case the result is very good: two factors, in place of the ten initial variables, explain
about 81% of the total variance.
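Both criteria follow from the eigenvalues of the correlation matrix, which sum to the number of variables. A sketch of the Kaiser criterion on illustrative data generated from two latent factors (hypothetical loadings, not the report's Table 4.3):

```python
import numpy as np

# Six observed variables generated from two latent factors.
rng = np.random.default_rng(2)
factors = rng.normal(size=(500, 2))
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                     [0.1, 0.9], [0.2, 0.8], [0.0, 0.7]])
data = factors @ loadings.T + 0.4 * rng.normal(size=(500, 6))

R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser criterion: retain the factors whose eigenvalue exceeds 1,
# i.e. those accounting for more variance than a single variable.
n_factors = int(np.sum(eigenvalues > 1))

# The eigenvalues of a correlation matrix sum to the number of
# variables, so the share of variance explained by the retained
# factors is their eigenvalue sum over that total.
explained = eigenvalues[:n_factors].sum() / eigenvalues.sum()
```

Plotting `eigenvalues` against their rank gives the scree plot; the elbow appears where the retained factors end.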
6) If you asked for rotation, SPSS output finally shows you two matrices, one with the
unrotated factors and the other with the rotated ones. Which one would you consider to
facilitate your task of interpreting the latent factors that you found? And why?
I would consider the matrix with the rotated factors. Rotation is a mathematical
transformation that alters neither the number of factors nor the total amount of
variance they explain. Rather, by redistributing the explained variance more evenly
across the factors, rotation pushes each variable to load strongly on one factor and
weakly on the others, which makes the interpretation of the factors more
straightforward. Different kinds of rotation are possible: for instance, varimax (an
orthogonal rotation) yields totally uncorrelated factors, while oblique rotations allow
the factors to be correlated.
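Varimax can be sketched in a few lines of NumPy with the standard SVD-based iteration (the loading matrix below is hypothetical, not Table 4.4):

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-6):
    """Varimax rotation: find the orthogonal rotation of the loading
    matrix L that maximizes the variance of the squared loadings."""
    p, k = L.shape
    Rot = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        Lr = L @ Rot
        # SVD of the gradient of the varimax criterion.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        Rot = u @ vt
        var_new = s.sum()
        if var_new < var_old * (1 + tol):
            break
        var_old = var_new
    return L @ Rot, Rot

# Hypothetical unrotated loadings: 4 variables, 2 factors.
L = np.array([[0.7, 0.5], [0.8, 0.4], [0.6, -0.5], [0.7, -0.6]])
rotated, Rot = varimax(L)
```

Because the rotation matrix is orthogonal, the communalities (row sums of squared loadings) are unchanged, which is exactly why rotation does not alter the total variance explained.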
7) Using Table 4.4, with the rotated factors, obtained through Varimax method with Kaiser
normalization, would you be able to compute the communality for the item “Long-term
unemployment”?
The communality for the item is the sum of its squared loadings on the two rotated
factors:
(0.097)^2 + (0.912)^2 = 0.0094 + 0.8317 ≈ 0.8412.
8) Instead of the Varimax method for rotation, say, you opted for the Direct Oblimin method.
What is the assumption behind this rotation method? You therefore got the output shown
in Table 4.5. Given this piece of information, would you be confident enough in using your
two new factors as independent variables in a regression model?
Direct Oblimin is a method for oblique rotation. This is not an orthogonal rotation of the
factors, therefore the factors are no longer uncorrelated. In this particular case, since the
correlation between the two factors is rather small, just 0.28, I would be confident
enough in using the two factors as independent variables in a regression model.
Factor correlation matrix (Table 4.5):

Factor      1       2
1         1.000    .282
2          .282   1.000