Chapter 6 - Multivariate Analysis of Survey Data (QDA)
Questionnaire data or instrument
Consists of:
1. Demographic data such as sex,
education, occupation, experience, family
size, etc.
2. Items or questions – responses can be
o Ordinal (5-point Likert scale or more)
o Continuous (scale)
o Nominal (dichotomous)
3. Outcome items – may be single-item or
multiple-item
Outcome items are the dependent variables
measured in the survey, such as income,
satisfaction, service quality, employee job
satisfaction, or any other parameter that you
consider a dependent variable.
Generally, when designing survey
instruments it is essential to clearly
identify and state the dependent and
independent variables.
Steps/procedures in survey data analysis
1. Data cleaning
Remove duplicate, incomplete, and erroneous
responses
2. Assign numerical values to categorical
variables
Ordinal (Likert scale), nominal, and
multinomial variables should all be properly
assigned values.
3. Coding Likert-scale variables
4. Reverse coding
5. Defining constructs
6. Generating composites such as the average,
sum, etc., when necessary
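The reverse-coding and composite steps above can be sketched in Python. This is a minimal illustration with hypothetical 5-point Likert responses; the item layout (three items, the third negatively worded) is an assumption, not data from the slides.

```python
import numpy as np

# Hypothetical responses (rows = respondents, columns = items q1, q2, q3);
# item q3 is negatively worded and must be reverse coded.
responses = np.array([
    [5, 4, 1],
    [4, 5, 2],
    [2, 1, 5],
])

scale_max = 5  # 5-point Likert scale

# Reverse coding: new = (max + 1) - old, so 1 <-> 5, 2 <-> 4, 3 stays 3
responses[:, 2] = (scale_max + 1) - responses[:, 2]

# Composite score for the construct: row-wise average of its items
composite = responses.mean(axis=1)
print(np.round(composite, 2))  # [4.67 4.33 1.33]
```

A sum composite would simply use `responses.sum(axis=1)` instead of the mean; which one is used should be stated when defining the construct.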
Types of survey data analysis
1. Descriptive data analysis
2. Data visualization
3. Cross-tabulation and chi-square tests
4. Factor analysis
5. Correlation analysis
6. Regression analysis
7. Comparative analysis – parametric and non-
parametric
8. Hypothesis testing and statistical significance
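As an illustration of cross-tabulation with a chi-square test of independence (item 3 above), the following Python sketch uses hypothetical counts; the grouping of sex against satisfaction is invented for the example, and scipy is assumed to be available.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical cross-tabulation: rows = sex (male, female),
# columns = satisfaction level (low, medium, high)
table = np.array([
    [20, 30, 50],
    [35, 30, 35],
])

# Chi-square test of independence between the two categorical variables
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

A small p-value (e.g., below 0.05) indicates that the two categorical variables are associated rather than independent.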
Descriptive data analysis
Use the mean, SD, and other summary statistics for
continuous variables
Percentages and frequency distribution tables for
categorical variables.
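A minimal Python sketch of these descriptives, using a hypothetical income variable and a hypothetical sex variable (both invented for illustration):

```python
import numpy as np
from collections import Counter

# Hypothetical survey columns
income = np.array([1200, 1500, 900, 2000, 1700])      # continuous variable
sex = ["male", "female", "female", "male", "female"]  # categorical variable

# Continuous: mean and (sample) standard deviation
print(f"mean = {income.mean():.1f}, SD = {income.std(ddof=1):.1f}")

# Categorical: frequency table with percentages
counts = Counter(sex)
n = len(sex)
for category, count in counts.items():
    print(f"{category}: {count} ({100 * count / n:.0f}%)")
```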
Data visualization
Graphs, bar charts, pie charts, histograms, etc.
Reliability and validity tests
We use Cronbach's alpha for testing the internal
consistency of the questionnaire.
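Cronbach's alpha can be computed directly from the item variances and the variance of the summed scale. The sketch below uses hypothetical responses to a three-item scale:

```python
import numpy as np

# Hypothetical item responses (rows = respondents, columns = items)
items = np.array([
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 2, 3],
    [4, 4, 4],
], dtype=float)

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale

# Cronbach's alpha: internal-consistency reliability of the scale
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"alpha = {alpha:.3f}")  # alpha = 0.935
```

Values of alpha around 0.7 or higher are conventionally taken as acceptable internal consistency.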
Comparative analysis
Parametric (for normally distributed data) mean
comparison tests such as the t-test and ANOVA
Non-parametric (for data that are not normally
distributed) comparative analyses include:
Mann-Whitney U test for two groups,
Kruskal-Wallis test for three or more groups, and
Jonckheere's trend test for ordinal data to detect a
trend.
We compare groups to determine differences between them.
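The parametric and non-parametric tests above are all available in scipy. The sketch below generates hypothetical group scores (the group means and sample sizes are invented) and runs the t-test, Mann-Whitney U, and Kruskal-Wallis tests:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical satisfaction scores for three groups
group_a = rng.normal(3.5, 0.8, size=40)
group_b = rng.normal(3.1, 0.8, size=40)
group_c = rng.normal(3.0, 0.8, size=40)

# Parametric: independent-samples t-test (assumes normality)
t, p_t = stats.ttest_ind(group_a, group_b)

# Non-parametric alternatives
u, p_u = stats.mannwhitneyu(group_a, group_b)      # two groups
h, p_h = stats.kruskal(group_a, group_b, group_c)  # three or more groups

print(f"t-test p = {p_t:.3f}, Mann-Whitney p = {p_u:.3f}, "
      f"Kruskal-Wallis p = {p_h:.3f}")
```

With real survey data, a normality check (e.g., a histogram or Shapiro-Wilk test) would guide the choice between the parametric and non-parametric branch.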
Factor Analysis
FA is a statistical data reduction technique.
It is a statistical method designed to identify
dimensions, factors, or constructs in a set of
variables (items or questions). Its objectives are to:
examine the correlations between variables, then
identify groups of variables that are highly related;
group questions that are answered similarly in the survey;
explain common underlying factors or constructs;
uncover and generate a set of latent factors;
help in the development of theoretical models; and
determine the number of factors in the data.
Types of factor analysis
1. Principal component analysis (PCA)
Data reduction method – to reduce the number of items
in a survey;
computes correlations between the items and then
groups the items into groups known as components
2. Exploratory factor analysis (EFA)
Principal factor analysis (PFA) – computes
correlations between items to identify common
factors and unique factors, and to explore factor structures
3. Confirmatory factor analysis (CFA)
It is performed after exploratory factor analysis. Its
purpose is to verify the underlying structure of the
factors and to develop structural equation models – it
is the initial step for SEM;
as its name indicates, it helps to confirm or
disconfirm a factor structure.
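The PCA step described above can be sketched with plain numpy: compute the correlation matrix of the items and inspect its eigenvalues. The data here are hypothetical (six items driven by two latent factors), and the eigenvalue-greater-than-one retention rule is a common rule of thumb (the Kaiser criterion) rather than something stated in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 6 items driven by 2 latent factors plus noise
n = 200
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
noise = rng.normal(scale=0.5, size=(n, 6))
X = np.column_stack([f1, f1, f1, f2, f2, f2]) + noise

# PCA on the correlation matrix: eigenvalues give the variance
# explained by each component
R = np.corrcoef(X, rowvar=False)
eigenvalues, _ = np.linalg.eigh(R)
eigenvalues = eigenvalues[::-1]  # sort descending

# Kaiser criterion: retain components with eigenvalue > 1
n_components = int((eigenvalues > 1).sum())
print("eigenvalues:", np.round(eigenvalues, 2))
print("components retained:", n_components)
```

Because the data were built from two latent factors, two eigenvalues stand out clearly above 1 and two components are retained.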
Assumptions of factor analysis
Adequacy of sample size – should be sufficient,
ideally > 200
No outliers – no extreme values
Linearity – relationships among variables should be
linear
Multivariate normality – observed variables
follow a multivariate normal distribution
No perfect multicollinearity – no very high correlation
between 2 or more independent variables.
Data should be continuous (interval or ratio
data).
However, FA can be used for ordinal Likert-scale data
with 5 or more categories (scale points).
Common variance – observed variables share common
variance explained by the underlying factors
Independence of factors – factors are assumed to be
uncorrelated with each other
No rotation of axes – axes (factors) are initially unrotated.
As FA output, a correlation matrix for the items can be
computed, which shows the relationship between
all pairs of variables.
Correlation coefficients should be > -0.8 and
< 0.8.
If |r| > 0.9, there is high multicollinearity, so
consider dropping one of the items.
If |r| < 0.3 with all other items, delete that item
because its correlations are too weak.
The determinant of the correlation matrix should be > 0.00001.
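These screening rules can be checked programmatically. The sketch below builds hypothetical item data (one deliberately correlated pair; everything else is noise, which is an assumption made for the example) and applies the thresholds above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical item responses (continuous, roughly standardized)
X = rng.normal(size=(150, 5))
X[:, 1] = 0.6 * X[:, 0] + 0.8 * rng.normal(size=150)  # correlated pair

R = np.corrcoef(X, rowvar=False)

# Screening rules (rules of thumb, as described above):
# flag |r| > 0.8 (multicollinearity risk) and items whose largest
# |r| with any other item is below 0.3 (too weakly related)
off_diag = R - np.eye(5)
too_high = np.abs(off_diag) > 0.8
max_abs = np.abs(off_diag).max(axis=0)
weak_items = np.where(max_abs < 0.3)[0]

det = np.linalg.det(R)  # should be clearly above zero
print("determinant:", round(det, 4))
print("weak items:", weak_items)
```

Here the pure-noise columns show up as weak items, while the correlated pair passes both checks.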
Sample adequacy test
Uses the KMO statistic, which takes values between
0 and 1.
A KMO value > 0.5 is acceptable, and a value of
about 0.8 or above is ideal.
Bartlett's test of sphericity
Ho: the correlation matrix is an identity matrix
– a matrix with ones on the diagonal and zeros
elsewhere.
Reject Ho if the test is significant.
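Bartlett's test of sphericity can be computed directly: the statistic is -(n - 1 - (2p + 5)/6) * ln(det(R)), compared against a chi-square distribution with p(p - 1)/2 degrees of freedom. The sketch below uses hypothetical correlated items (the data-generating setup is invented for illustration; scipy is assumed available):

```python
import numpy as np
from scipy.stats import chi2 as chi2_dist

def bartlett_sphericity(X):
    """Bartlett's test: Ho = the correlation matrix is an identity matrix."""
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    dof = p * (p - 1) // 2
    p_value = chi2_dist.sf(statistic, dof)
    return statistic, dof, p_value

rng = np.random.default_rng(3)
f = rng.normal(size=(200, 1))
X = f + rng.normal(scale=0.7, size=(200, 4))  # 4 items sharing one factor

stat, dof, p = bartlett_sphericity(X)
print(f"chi2 = {stat:.1f}, dof = {dof}, p = {p:.4f}")
```

Because the four items share a common factor, the test is highly significant and Ho is rejected, which is what factor analysis requires.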
PCA and PFA Analyses Using Stata
Select variables that should be reduced to a set
of underlying factors (PCA) or should be used to
identify underlying dimensions (factor analysis).
In Stata, using the menus:
► Statistics ► Multivariate analysis ►
Factor and principal component analysis ►
Factor analysis. Enter the variables in the
Variables box.
Also answer the related questions:
Are the variables interval or ratio scaled?
Is the sample size sufficiently large?
To check whether the variables are sufficiently
correlated, we use:
In Stata:
► Statistics ► Summaries, tables, and tests ►
Summary and descriptive statistics ► Pairwise
correlations.
o Check the number of observations for each entry
and print the significance level for each entry.
o Select Use Bonferroni-adjusted significance level
to maintain the familywise error rate. Or use the
Stata command:
pwcorr x1 x2 x3 x4 x5 x6 x7 x8, obs sig bonferroni
The KMO test can also be used.
In Stata:
► Statistics ► Postestimation ► Factor
analysis reports and graphs ► Kaiser-Meyer-
Olkin measure of sampling adequacy. Then
click on Launch and OK.
Note that this analysis can only be run after
the PCA has been conducted.
Or use the Stata command: estat kmo
Extract the factors