CHAPTER 6

6. Multivariate Analysis of Survey (Questionnaire) Data

6.1. Nature of Survey Data and Statistical Methods of Survey Data Analysis
6.3. Principal Component Analysis (PCA)
6.4. Exploratory Factor Analysis (EFA)
6.5. Confirmatory Factor Analysis (CFA)
6.6. Structural Equation Modeling and Path Diagram

Survey Data
A survey instrument (questionnaire) designed to collect primary data can gather information on variables measured on:
 a continuous scale
 a Likert scale
 a nominal scale

Questionnaire Data (Instrument)
A questionnaire consists of:
1. Demographic data such as sex, education, occupation, experience, family size, etc.
2. Items or questions, whose responses can be:
 o Ordinal (a 5-point Likert scale or more)
 o Continuous (scale)
 o Nominal (dichotomous)
3. Outcome items – may be a single item or multiple items.
 Outcome items are the dependent variables measured in the survey, such as income, satisfaction, service quality, employee job satisfaction, or any other parameter that you consider a dependent variable.
Generally, when designing the survey instrument it is essential to clearly identify and state the dependent and independent variables.

Steps/Procedures in Survey Data Analysis
1. Data cleaning
 Remove duplicate, incomplete, and erroneous responses.
2. Assign numerical values to categorical variables
 Ordinal (Likert-scale), nominal, and multinomial variables should all be properly assigned values.
3. Coding Likert-scale variables
4. Reverse coding (for negatively worded items)
5. Defining constructs
6. Generating composites such as averages or sums when necessary (see the sketch after this list)

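A minimal Stata sketch of steps 4 and 6, assuming a hypothetical negatively worded 5-point item s3 and placeholder items s1–s5 forming one construct:

* reverse-code the negatively worded 5-point item (1 becomes 5, 5 becomes 1)
generate s3_rev = 6 - s3
* composite score: the row-wise mean of the construct's items
egen construct1 = rowmean(s1 s2 s3_rev s4 s5)
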
Types of Survey Data Analysis
1. Descriptive data analysis
2. Data visualization
3. Cross-tabulation and chi-square tests
4. Factor analysis
5. Correlation analysis
6. Regression analysis
7. Comparative analysis – parametric and non-parametric
8. Hypothesis testing and statistical significance

Descriptive data analysis
 Use of the mean, standard deviation, and other summary statistics for continuous variables.
 Percentages and frequency distribution tables for categorical variables.
Data visualization
 Graphs, bar charts, pie charts, histograms, etc.
Reliability and validity tests
 We use Cronbach's alpha to test the internal consistency of the questionnaire (see the sketch below).
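A minimal Stata sketch of the reliability check, assuming s1–s8 are placeholder items measuring one construct:

* Cronbach's alpha with item-level diagnostics (alpha if each item is dropped)
alpha s1 s2 s3 s4 s5 s6 s7 s8, item

Values of roughly 0.7 or above are commonly treated as acceptable internal consistency.
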
Comparative analysis
 Parametric mean-comparison tests (for normally distributed data) such as the t-test and ANOVA.
 Non-parametric comparative analyses (for data that are not normally distributed) include the Mann-Whitney U test for two groups, the Kruskal-Wallis test for three or more groups, and Jonckheere's trend test for ordinal data to detect a trend.
 In all cases we compare groups to determine whether they differ (see the sketch below).
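A minimal Stata sketch of these comparisons, assuming a hypothetical continuous outcome y and grouping variable group:

* parametric comparisons
ttest y, by(group)      // two groups
oneway y group          // three or more groups (ANOVA)
* non-parametric counterparts
ranksum y, by(group)    // Mann-Whitney U test
kwallis y, by(group)    // Kruskal-Wallis test
* Jonckheere's trend test requires a recent nptrend or a user-written command; syntax not shown here
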
Factor Analysis
 Factor analysis (FA) is a statistical data-reduction technique.
 It is a statistical method designed to identify dimensions, factors, or constructs underlying a set of variables (items or questions). Its objectives are to:
 examine the correlations between variables and identify groups of variables that are highly related;
 group questions that are answered similarly in the survey;
 explain common underlying factors or constructs;
 uncover and generate a set of latent factors;
 help in the development of a theoretical model;
 determine the number of factors in the data.

Types of factor analysis
1. Principal component analysis (PCA)
 A data-reduction method used to reduce the number of items in a survey.
 It computes the correlations between the items and then groups the items into sets known as components.
2. Exploratory factor analysis (EFA)
 Principal factor analysis (PFA) computes the correlations between items to identify common factors and unique factors and to explore the factor structure (see the sketch below).
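A minimal Stata sketch contrasting the two approaches, assuming placeholder items s1–s8:

* pure principal component analysis
pca s1 s2 s3 s4 s5 s6 s7 s8
* factor analysis with the principal-component factor and principal factor methods
factor s1 s2 s3 s4 s5 s6 s7 s8, pcf
factor s1 s2 s3 s4 s5 s6 s7 s8, pf
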
3. Confirmatory factor analysis (CFA)
 It is performed after exploratory factor analysis. Its purpose is to test a hypothesized underlying structure for the factors.
 It is used to develop structural equation models – it is the initial step for SEM (see the sketch below).
 As its name indicates, it helps to confirm or disconfirm a factor structure.
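A minimal sketch of a CFA using Stata's sem command, assuming two hypothetical latent factors F1 and F2 measured by placeholder items s1–s8:

* measurement model: each latent factor loads on its own set of observed items
sem (F1 -> s1 s2 s3 s4) (F2 -> s5 s6 s7 s8)
* fit statistics used to confirm or disconfirm the hypothesized structure
estat gof, stats(all)
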
Assumptions of factor analysis
 Adequate sample size – the sample should be sufficiently large (> 200).
 No outliers – no extreme values.
 Linearity – the relationships among the variables should be linear.
 Multivariate normality – the observed variables follow a multivariate normal distribution (a check is sketched after this list).
 No perfect multicollinearity – no very high correlation between two or more of the variables.
 The data should be continuous (interval or ratio data). However, factor analysis can also be used for ordinal Likert-scale data with 5 or more categories.
 Common variance – the observed variables share common variance that is explained by the underlying factors.
 Independence of factors – the factors are assumed to be uncorrelated with each other.
 No rotation of axes – initially, the axes (factors) are unrotated.

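A minimal Stata sketch of checking the multivariate normality assumption, assuming placeholder items s1–s8:

* Doornik-Hansen and related tests of multivariate normality
mvtest normality s1 s2 s3 s4 s5 s6 s7 s8, stats(all)
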
 As part of the FA output, a correlation matrix for the items can be computed, which shows the relationship between all pairs of variables.
 Correlation coefficients should lie between -0.8 and 0.8.
 If |r| > 0.9, consider deleting one of the items because of high multicollinearity.
 If |r| < 0.3, consider deleting that item because it is only weakly correlated with the other items.
 The determinant of the correlation matrix should be greater than 0.00001 (a check is sketched below).

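A minimal Stata sketch of these checks, assuming placeholder items s1–s8:

* correlation matrix of the items
correlate s1 s2 s3 s4 s5 s6 s7 s8
* determinant of the correlation matrix (should exceed about 0.00001)
matrix C = r(C)
display det(C)
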
Sampling adequacy test
 The Kaiser-Meyer-Olkin (KMO) measure takes values between 0 and 1.
 A value > 0.5 is acceptable, and values of about 0.8 or above are ideal.
Bartlett's test of sphericity
 Ho: the correlation matrix is an identity matrix – a matrix with ones on the diagonal and zeros everywhere else.
 Reject Ho if the test is significant; rejecting Ho indicates that the variables are sufficiently correlated for factor analysis.

PCA and PFA Analyses Using Stata
Select the variables that should be reduced to a set of underlying factors (PCA) or that should be used to identify underlying dimensions (factor analysis).
On Stata, using the menus:
 ► Statistics ► Multivariate analysis ► Factor and principal component analysis ► Factor analysis. Enter the variables in the Variables box.
Also answer the related questions:
 Are the variables interval or ratio scaled?
 Is the sample size sufficiently large?
To check whether the variables are sufficiently correlated, we use:
On Stata
 ► Statistics ► Summaries, tables, and tests ► Summary and descriptive statistics ► Pairwise correlations.
 o Check Number of observations for each entry and Print significance level for each entry.
 o Select Use Bonferroni-adjusted significance level to maintain the familywise error rate. Or use the
Stata command:
pwcorr s1 s2 s3 s4 s5 s6 s7 s8, obs sig bonferroni
The KMO test can also be used.
On Stata:
 ► Statistics ► Postestimation ► Factor analysis reports and graphs ► Kaiser-Meyer-Olkin measure of sampling adequacy. Then click on Launch and OK.
Note that this analysis can only be run after the PCA has been conducted (see the sketch below).
Or use the Stata command: estat kmo
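A minimal sketch of the required order of commands, assuming placeholder items s1–s8:

* estat kmo is a postestimation command, so run the extraction step first
factor s1 s2 s3 s4 s5 s6 s7 s8, pcf
estat kmo
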
Extract the factors
Choose the method of factor analysis.
On Stata:
 ► Statistics ► Multivariate analysis ► Factor and principal component analysis ► Factor analysis. Click on the Model 2 tab and select Principal-component factor.
OR use the command:
factor [variables], pcf
Determine the number of factors
On Stata:
Kaiser criterion:
 ► Statistics ► Multivariate analysis ► Factor and principal component analysis ► Factor analysis.
 Click on the Model 2 tab and enter 1 under Minimum value of eigenvalues to be retained.
OR
factor [variables], pcf mineigen(1)
Parallel analysis: download and install the user-written command 'paran' (ssc install paran) and enter:
paran s1 s2 s3 s4 s5 s6 s7 s8, centile(95) q all graph
Thus, extract factors (1) with adjusted eigenvalues greater than 1, and (2) whose adjusted eigenvalues are greater than the random eigenvalues.
On Stata
Scree plot: ► Statistics ► Postestimation ► Factor analysis reports and graphs ► Scree plot of eigenvalues. Then click on Launch and OK.
OR use the command: screeplot
Pre-specify the number of factors based on a priori information:
On Stata
 ► Statistics ► Multivariate analysis ► Factor and principal component analysis ► Factor analysis.
 Under the Model 2 tab, tick Maximum number of factors to be retained and specify a value in the box below.
OR use the command: factor s1 s2 s3 s4 s5 s6 s7 s8, factors(2)
Here, make sure that the extracted factors account for at least 50% of the total variance explained (75% or more is recommended): check the Cumulative column in the PCA output.
Interpret the Factor Solution
Rotate the factors – use the varimax procedure or, if necessary, the promax procedure with gamma set to 3 (both with Kaiser normalization):
On Stata
 ► Statistics ► Postestimation ► Principal component analysis reports and graphs ► Rotate factor loadings.
 Select the corresponding option in the menu. OR use the commands:
Varimax: rotate, varimax normalize
Promax: rotate, promax(3) normalize
Assign variables to factors
 Check the Factor loadings (coefficient matrix) table in the output of the rotated solution.
 Assign each variable to a certain factor based on its highest absolute loading.
 To facilitate interpretation, you may also assign a variable to a different factor, but check that the loading is at an acceptable level (0.50 if only a few factors are extracted, 0.30 if many factors are extracted).
Consider making a loadings plot:
On Stata
 ► Statistics ► Postestimation ► Factor analysis reports and graphs ► Plot of factor loadings.
 Under Plot all combinations of the following, indicate the number of factors you want to plot. Check which items load highly on which factor (a command-line sketch follows).
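A minimal command-line equivalent, assuming a two-factor rotated solution (the factors(2) value is our assumption, not taken from the slides):

* plot the rotated loadings for the first two factors
loadingplot, factors(2)
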
Lastly, compute the factor scores.
On Stata
 Save factor scores as new variables: ► Statistics ► Postestimation ► Predictions ► Regression and Bartlett scores.
 Under New variable names or variable stub*, enter factor*. Select Factors scored by the regression scoring method.
OR use the command:
predict score*, regression
A complete worked sequence is sketched below.
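Putting the steps together, a minimal end-to-end sketch in Stata, assuming placeholder items s1–s8 and a two-factor solution (adapt the variable names and options to your data):

* 1. correlations and sampling adequacy
pwcorr s1 s2 s3 s4 s5 s6 s7 s8, obs sig bonferroni
* 2. extract principal-component factors with eigenvalues greater than 1
factor s1 s2 s3 s4 s5 s6 s7 s8, pcf mineigen(1)
estat kmo
screeplot
* 3. rotate and inspect the loadings
rotate, varimax normalize
loadingplot, factors(2)
* 4. save regression-scored factor scores as new variables score1, score2, ...
predict score*, regression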