Factor Analysis2

Factor analysis allows researchers to identify underlying dimensions, or factors, that explain correlations among sets of variables. It reduces a large number of interrelated variables into a smaller number of underlying factors. Factor analysis was used by JPMorgan Chase to identify the key dimensions consumers use to evaluate banks - traditional services, convenience, visibility, and competence. This enabled JPMorgan Chase to develop effective marketing strategies and become one of the largest U.S. banks.

Uploaded by

Vanila Mehta

Chapter 19

“Factor analysis allows us to look at groups of variables that tend to be correlated to each other and identify underlying dimensions that explain these correlations.”

William D. Neal, Senior Partner, SDR Consulting

Objectives [ After reading this chapter, the student should be able to: ]
1. Describe the concept of factor analysis and explain how it is different from
analysis of variance, multiple regression, and discriminant analysis.
2. Discuss the procedure for conducting factor analysis, including problem
formulation, construction of the correlation matrix, selection of an appropriate
method, determination of the number of factors, rotation, and interpretation
of factors.
3. Understand the distinction between principal component factor analysis
and common factor analysis methods.
4. Explain the selection of surrogate variables and their application, with
emphasis on their use in subsequent analysis.
5. Describe the procedure for determining the fit of a factor analysis model
using the observed and the reproduced correlations.

Factor Analysis

Overview
In analysis of variance (Chapter 16), regression (Chapter 17), and discriminant analysis
(Chapter 18), one of the variables is clearly identified as the dependent variable. We now turn
to a procedure, factor analysis, in which variables are not classified as independent or dependent.
Instead, the whole set of interdependent relationships among variables is examined.
This chapter discusses the basic concept of factor analysis and gives an exposition of the
factor model. We describe the steps in factor analysis and illustrate them in the context of
principal components analysis. Next, we present an application of common factor analysis.
Finally, we discuss the use of software in factor analysis. Help for running the SPSS and
SAS Learning Edition programs used in this chapter is provided in four ways: (1) detailed
step-by-step instructions are given later in the chapter, (2) you can download (from the Web
site for this book) computerized demonstration movies illustrating these step-by-step
instructions, (3) you can download screen captures with notes illustrating these step-by-step
instructions, and (4) you can refer to the Study Guide and Technology Manual, a supplement
that accompanies this book.
To begin, we provide some examples to illustrate the usefulness of factor analysis.

Real Research: Factor Analysis Earns Interest at Banks


How do consumers evaluate banks? Respondents in a survey were asked to rate the importance of 15 bank
attributes. A 5-point scale ranging from not important to very important was employed. These data were
analyzed via principal components analysis.
A four-factor solution resulted, with the factors being labeled as traditional services, convenience,
visibility, and competence. Traditional services included interest rates on loans, reputation in the community,
low rates for checking, friendly and personalized service, easy-to-read monthly statements, and obtainability
of loans. Convenience comprised convenient branch location, convenient ATM locations, speed of
service, and convenient banking hours. The visibility factor included recommendations from friends and
relatives, attractiveness of the physical structure, community involvement, and obtainability of loans.
Competence consisted of employee competence and availability of auxiliary banking services. It was
concluded that consumers evaluated banks using the four basic factors of traditional services, convenience,
visibility, and competence, and banks must excel on these factors to project a good image. By emphasizing
these factors, JPMorgan Chase & Co. became one of the largest U.S. banks and bought the banking operations
of bankrupt rival Washington Mutual in September 2008.¹ ■

[Photo caption: Factor analysis helped JPMorgan Chase & Co. to identify the dimensions consumers use to evaluate banks and to develop appropriate marketing strategies, enabling it to become one of the largest U.S. banks.]

PART III • DATA COLLECTION, PREPARATION, ANALYSIS, AND REPORTING

Basic Concept
Factor analysis is a general name denoting a class of procedures primarily used for data reduction
and summarization. In marketing research, there may be a large number of variables, most
of which are correlated and which must be reduced to a manageable level. Relationships among
sets of many interrelated variables are examined and represented in terms of a few underlying
factors. For example, store image may be measured by asking respondents to evaluate stores on
a series of items on a semantic differential scale. These item evaluations may then be analyzed to
determine the factors underlying store image.
In analysis of variance, multiple regression, and discriminant analysis, one variable is
considered as the dependent or criterion variable, and the others as independent or predictor variables.
However, no such distinction is made in factor analysis. Rather, factor analysis is an
interdependence technique in that an entire set of interdependent relationships is examined.²

factor analysis: A class of procedures primarily used for data reduction and summarization.
interdependence technique: Multivariate statistical techniques in which the whole set of interdependent relationships is examined.
factors: An underlying dimension that explains the correlations among a set of variables.

[Figure 19.1: Factors Underlying Selected Psychographics and Lifestyles. Axes: Factor 1 and Factor 2. Plotted variables: baseball, football, evening at home, go to a party, home is best place, plays, movies.]

Factor analysis is used in the following circumstances:
1. To identify underlying dimensions, or factors, that explain the correlations among a
set of variables. For example, a set of lifestyle statements may be used to measure the
psychographic profiles of consumers. These statements may then be factor analyzed
to identify the underlying psychographic factors, as illustrated in the department store
example. This is also illustrated in Figure 19.1 derived based on empirical analysis,
where the seven psychographic variables can be represented by two factors. In this
figure, factor 1 can be interpreted as homebody versus socialite, and factor 2 can be
interpreted as sports versus movies/plays.
2. To identify a new, smaller set of uncorrelated variables to replace the original set of correlated
variables in subsequent multivariate analysis (regression or discriminant analysis). For example,
the psychographic factors identified may be used as independent variables in explaining the
differences between loyal and nonloyal consumers. Thus, instead of the seven correlated
psychographic variables of Figure 19.1, we can use the two uncorrelated factors,
i.e., homebody versus socialite, and sports versus movies/plays, in subsequent analysis.
3. To identify a smaller set of salient variables from a larger set for use in subsequent multivariate
analysis. For example, a few of the original lifestyle statements that correlate highly
with the identified factors may be used as independent variables to explain the differences
between the loyal and nonloyal users. Specifically, based on theory and empirical results
(Figure 19.1), we can select home is best place and football as independent variables, and
drop the other five variables to avoid problems due to multicollinearity (see Chapter 17).
All these uses are exploratory in nature and, therefore, factor analysis is also called exploratory
factor analysis (EFA). The technique has numerous applications in marketing research. For
example:
• It can be used in market segmentation for identifying the underlying variables on which to
group the customers. New car buyers might be grouped based on the relative emphasis
they place on economy, convenience, performance, comfort, and luxury. This might result
in five segments: economy seekers, convenience seekers, performance seekers, comfort
seekers, and luxury seekers.
• In product research, factor analysis can be employed to determine the brand attributes that
influence consumer choice. Toothpaste brands might be evaluated in terms of protection
against cavities, whiteness of teeth, taste, fresh breath, and price.
• In advertising studies, factor analysis can be used to understand the media consumption
habits of the target market. The users of frozen foods may be heavy viewers of cable TV,
see a lot of movies, and listen to country music.
• In pricing studies, it can be used to identify the characteristics of price-sensitive consumers.
For example, these consumers might be methodical, economy minded, and home centered.

Factor Analysis Model


Mathematically, factor analysis is somewhat similar to multiple regression analysis, in that each
variable is expressed as a linear combination of underlying factors. The amount of variance a
variable shares with all other variables included in the analysis is referred to as communality. The
covariation among the variables is described in terms of a small number of common factors plus
a unique factor for each variable. These factors are not overtly observed. If the variables are
standardized, the factor model may be represented as:
$$X_i = A_{i1}F_1 + A_{i2}F_2 + A_{i3}F_3 + \cdots + A_{im}F_m + V_iU_i$$
where
$X_i$ = ith standardized variable
$A_{ij}$ = standardized multiple regression coefficient of variable i on common factor j
$F_j$ = common factor j
$V_i$ = standardized regression coefficient of variable i on unique factor i
$U_i$ = the unique factor for variable i
$m$ = number of common factors
The unique factors are uncorrelated with each other and with the common factors.³ The
common factors themselves can be expressed as linear combinations of the observed variables.
$$F_i = W_{i1}X_1 + W_{i2}X_2 + W_{i3}X_3 + \cdots + W_{ik}X_k$$
where
$F_i$ = estimate of ith factor
$W_{ij}$ = weight or factor score coefficient of variable j on factor i
$k$ = number of variables
It is possible to select weights or factor score coefficients so that the first factor explains the
largest portion of the total variance. Then a second set of weights can be selected, so that the second
factor accounts for most of the residual variance, subject to being uncorrelated with the first factor.
This same principle could be applied to selecting additional weights for the additional factors. Thus,
the factors can be estimated so that their factor scores, unlike the values of the original variables, are
not correlated. Furthermore, the first factor accounts for the highest variance in the data, the second
factor the second highest, and so on. A simplified graphical illustration of factor analysis in the case
of two variables is presented in Figure 19.2. Several statistics are associated with factor analysis.
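The weighting idea described above can be sketched numerically. The following is a minimal illustration, not the procedure used by SPSS or SAS: it assumes made-up data (200 hypothetical respondents, four variables generated from two latent traits) and takes the weights from the eigenvectors of the correlation matrix, which is the principal components choice. The names `latent`, `mixing`, `W`, and `scores` are introduced here for illustration only.

```python
import numpy as np

# Hypothetical data: 200 respondents, 4 correlated variables driven by 2 latent traits.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.9, 0.1, 0.0],
                   [0.0, 0.2, 1.0, 0.8]])
X = latent @ mixing + rng.normal(scale=0.5, size=(200, 4))

# Standardize each variable (mean 0, sample variance 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix and its eigen-decomposition, largest eigenvalue first.
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, W = eigvals[order], eigvecs[:, order]

# Column i of `scores` holds F_i = W_i1*X_1 + ... + W_ik*X_k for every respondent,
# where the weights for factor i are the entries of column i of W.
scores = Z @ W
score_cov = np.cov(scores, rowvar=False)

print(np.round(eigvals, 3))    # variances of the factors, in decreasing order
print(np.round(score_cov, 3))  # off-diagonal entries are zero: scores are uncorrelated
```

In this sketch the scores are left unstandardized, so the variance of each score column equals its eigenvalue; statistical packages typically rescale factor scores to unit variance.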
[Figure 19.2: Graphical Illustration of Factor Analysis. Axes: X1 and X2; the plotted factor is labeled Fi = Wi1X1 + Wi2X2.]

Statistics Associated with Factor Analysis


The key statistics associated with factor analysis are as follows:
Bartlett’s test of sphericity. Bartlett’s test of sphericity is a test statistic used to examine
the hypothesis that the variables are uncorrelated in the population. In other words, the
population correlation matrix is an identity matrix; each variable correlates perfectly with
itself (r = 1) but has no correlation with the other variables (r = 0).
Correlation matrix. A correlation matrix is a lower triangular matrix showing the simple correlations,
r, between all possible pairs of variables included in the analysis. The diagonal
elements, which are all 1, are usually omitted.
Communality. Communality is the amount of variance a variable shares with all the
other variables being considered. This is also the proportion of variance explained by
the common factors.
Eigenvalue. The eigenvalue represents the total variance explained by each factor.
Factor loadings. Factor loadings are simple correlations between the variables and the factors.
Factor loading plot. A factor loading plot is a plot of the original variables using the factor
loadings as coordinates.
Factor matrix. A factor matrix contains the factor loadings of all the variables on all the
factors extracted.
Factor scores. Factor scores are composite scores estimated for each respondent on the
derived factors.
Factor scores coefficient matrix. This matrix contains the weights, or factor score coefficients,
used to combine the standardized variables to obtain factor scores.
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. The Kaiser-Meyer-Olkin
(KMO) measure of sampling adequacy is an index used to examine the appropriateness of
factor analysis. High values (between 0.5 and 1.0) indicate factor analysis is appropriate.
Values below 0.5 imply that factor analysis may not be appropriate.
Percentage of variance. This is the percentage of the total variance attributed to each factor.
Residuals. Residuals are the differences between the observed correlations, as given in the
input correlation matrix, and the reproduced correlations, as estimated from the factor matrix.
Scree plot. A scree plot is a plot of the eigenvalues against the number of factors in order
of extraction.
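Several of these definitions can be written compactly. The block below is a sketch that assumes an orthogonal solution with $m$ retained factors, $k$ standardized variables, and loadings $A_{ij}$ as in the factor model; the symbols $h_i^2$, $\lambda_j$, and $\hat{r}_{ii'}$ are notation introduced here rather than taken from the chapter.

```latex
\begin{aligned}
h_i^2 &= \sum_{j=1}^{m} A_{ij}^2
  && \text{(communality of variable } i\text{)} \\
\lambda_j &= \sum_{i=1}^{k} A_{ij}^2
  && \text{(eigenvalue of factor } j\text{, unrotated principal components)} \\
\text{percentage of variance}_j &= \frac{\lambda_j}{k} \times 100 \\
\hat{r}_{ii'} &= \sum_{j=1}^{m} A_{ij} A_{i'j},
  \qquad \text{residual}_{ii'} = r_{ii'} - \hat{r}_{ii'}
  && \text{(reproduced correlation and residual, } i \neq i'\text{)}
\end{aligned}
```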
In the next section, we describe the uses of these statistics in the context of the procedure for
conducting factor analysis.

Conducting Factor Analysis


The steps involved in conducting factor analysis are illustrated in Figure 19.3. The first step is
to define the factor analysis problem and identify the variables to be factor analyzed. Then a
correlation matrix of these variables is constructed and a method of factor analysis selected.
The researcher decides on the number of factors to be extracted and the method of rotation.
Next, the rotated factors should be interpreted. Depending upon the objectives, the factor scores
may be calculated, or surrogate variables selected, to represent the factors in subsequent multivariate
analysis. Finally, the fit of the factor analysis model is determined. We discuss these
steps in more detail in the following sections.⁴

[Figure 19.3: Conducting Factor Analysis. Flowchart of steps: (1) formulate the problem; (2) construct the correlation matrix; (3) determine the method of factor analysis; (4) determine the number of factors; (5) rotate the factors; (6) interpret the factors; (7) calculate the factor scores or select the surrogate variables; (8) determine the model fit.]

Formulate the Problem


Problem formulation includes several tasks. First, the objectives of factor analysis should be
identified. The variables to be included in the factor analysis should be specified based on past
research, theory, and judgment of the researcher. It is important that the variables be appropriately
measured on an interval or ratio scale. An appropriate sample size should be used. As a
rough guideline, there should be at least four or five times as many observations (sample size) as
there are variables.⁵ In many marketing research situations, the sample size is small and this ratio
is considerably lower. In these cases, the results should be interpreted cautiously.
To illustrate factor analysis, suppose the researcher wants to determine the underlying benefits
consumers seek from the purchase of a toothpaste. A sample of 30 respondents was interviewed
using mall-intercept interviewing. The respondents were asked to indicate their degree of agreement
with the following statements using a 7-point scale (1 = strongly disagree, 7 = strongly agree):
V1: It is important to buy a toothpaste that prevents cavities.
V2: I like a toothpaste that gives shiny teeth.
V3: A toothpaste should strengthen your gums.
V4: I prefer a toothpaste that freshens breath.
V5: Prevention of tooth decay is not an important benefit offered by a toothpaste.
V6: The most important consideration in buying a toothpaste is attractive teeth.
The data obtained are given in Table 19.1. For illustrative purposes, we consider only a small
number of observations. In actual practice, factor analysis is performed on a much larger sample
such as that in the Dell running case and other cases with real data that are presented in this book.
A correlation matrix was constructed based on these ratings data.
TABLE 19.1
Toothpaste Attribute Ratings

Respondent
Number        V1     V2     V3     V4     V5     V6
 1           7.00   3.00   6.00   4.00   2.00   4.00
 2           1.00   3.00   2.00   4.00   5.00   4.00
 3           6.00   2.00   7.00   4.00   1.00   3.00
 4           4.00   5.00   4.00   6.00   2.00   5.00
 5           1.00   2.00   2.00   3.00   6.00   2.00
 6           6.00   3.00   6.00   4.00   2.00   4.00
 7           5.00   3.00   6.00   3.00   4.00   3.00
 8           6.00   4.00   7.00   4.00   1.00   4.00
 9           3.00   4.00   2.00   3.00   6.00   3.00
10           2.00   6.00   2.00   6.00   7.00   6.00
11           6.00   4.00   7.00   3.00   2.00   3.00
12           2.00   3.00   1.00   4.00   5.00   4.00
13           7.00   2.00   6.00   4.00   1.00   3.00
14           4.00   6.00   4.00   5.00   3.00   6.00
15           1.00   3.00   2.00   2.00   6.00   4.00
16           6.00   4.00   6.00   3.00   3.00   4.00
17           5.00   3.00   6.00   3.00   3.00   4.00
18           7.00   3.00   7.00   4.00   1.00   4.00
19           2.00   4.00   3.00   3.00   6.00   3.00
20           3.00   5.00   3.00   6.00   4.00   6.00
21           1.00   3.00   2.00   3.00   5.00   3.00
22           5.00   4.00   5.00   4.00   2.00   4.00
23           2.00   2.00   1.00   5.00   4.00   4.00
24           4.00   6.00   4.00   6.00   4.00   7.00
25           6.00   5.00   4.00   2.00   1.00   4.00
26           3.00   5.00   4.00   6.00   4.00   7.00
27           4.00   4.00   7.00   2.00   2.00   5.00
28           3.00   7.00   2.00   6.00   4.00   3.00
29           4.00   6.00   3.00   7.00   2.00   7.00
30           2.00   3.00   2.00   4.00   7.00   2.00
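As an illustration of this step, the correlation matrix can be computed directly from the ratings in Table 19.1. The sketch below assumes NumPy is available; the variable names `ratings` and `R` are introduced here for illustration. The printed lower triangle can be compared against the correlation matrix reported in Table 19.2.

```python
import numpy as np

# Ratings from Table 19.1: one row per respondent, columns V1..V6.
ratings = np.array([
    [7, 3, 6, 4, 2, 4], [1, 3, 2, 4, 5, 4], [6, 2, 7, 4, 1, 3],
    [4, 5, 4, 6, 2, 5], [1, 2, 2, 3, 6, 2], [6, 3, 6, 4, 2, 4],
    [5, 3, 6, 3, 4, 3], [6, 4, 7, 4, 1, 4], [3, 4, 2, 3, 6, 3],
    [2, 6, 2, 6, 7, 6], [6, 4, 7, 3, 2, 3], [2, 3, 1, 4, 5, 4],
    [7, 2, 6, 4, 1, 3], [4, 6, 4, 5, 3, 6], [1, 3, 2, 2, 6, 4],
    [6, 4, 6, 3, 3, 4], [5, 3, 6, 3, 3, 4], [7, 3, 7, 4, 1, 4],
    [2, 4, 3, 3, 6, 3], [3, 5, 3, 6, 4, 6], [1, 3, 2, 3, 5, 3],
    [5, 4, 5, 4, 2, 4], [2, 2, 1, 5, 4, 4], [4, 6, 4, 6, 4, 7],
    [6, 5, 4, 2, 1, 4], [3, 5, 4, 6, 4, 7], [4, 4, 7, 2, 2, 5],
    [3, 7, 2, 6, 4, 3], [4, 6, 3, 7, 2, 7], [2, 3, 2, 4, 7, 2],
], dtype=float)

# Pearson correlations between the six variables (columns).
R = np.corrcoef(ratings, rowvar=False)

# Print the lower triangle, one row per variable.
for i in range(R.shape[0]):
    print(f"V{i + 1}", " ".join(f"{R[i, j]:7.3f}" for j in range(i + 1)))
```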

Construct the Correlation Matrix


The analytical process is based on a matrix of correlations between the variables. Valuable
insights can be gained from an examination of this matrix. For the factor analysis to be appropri-
ate, the variables must be correlated. In practice, this is usually the case. If the correlations
between all the variables are small, factor analysis may not be appropriate. We would also expect
that variables that are highly correlated with each other would also highly correlate with the
same factor or factors.
Formal statistics are available for testing the appropriateness of the factor model. Bartlett’s
test of sphericity can be used to test the null hypothesis that the variables are uncorrelated in the
population; in other words, the population correlation matrix is an identity matrix. In an identity
matrix, all the diagonal terms are 1, and all off-diagonal terms are 0. The test statistic for spheric-
ity is based on a chi-square transformation of the determinant of the correlation matrix. A large
value of the test statistic will favor the rejection of the null hypothesis. If this hypothesis cannot
be rejected, then the appropriateness of factor analysis should be questioned. Another useful statistic
is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. This index compares the
magnitudes of the observed correlation coefficients to the magnitudes of the partial correlation
coefficients. Small values of the KMO statistic indicate that the correlations between pairs of
variables cannot be explained by other variables and that factor analysis may not be appropriate.
Generally, a value greater than 0.5 is desirable.

TABLE 19.2
Correlation Matrix

Variables     V1       V2       V3       V4       V5       V6
V1           1.00
V2          -0.053    1.00
V3           0.873   -0.155    1.00
V4          -0.086    0.572   -0.248    1.00
V5          -0.858    0.020   -0.778   -0.007    1.00
V6           0.004    0.640   -0.018    0.640   -0.136    1.00

The correlation matrix, constructed from the data obtained to understand toothpaste benefits
(Table 19.1), is shown in Table 19.2. There are relatively high correlations among V1 (prevention
of cavities), V3 (strong gums), and V5 (prevention of tooth decay). We would expect these
variables to correlate with the same set of factors. Likewise, there are relatively high correlations
among V2 (shiny teeth), V4 (fresh breath), and V6 (attractive teeth). These variables may also be
expected to correlate with the same factors.⁶
The results of principal components analysis are given in Table 19.3. The null hypothesis,
that the population correlation matrix is an identity matrix, is rejected by Bartlett’s test of
sphericity. The approximate chi-square statistic is 111.314 with 15 degrees of freedom, which is
significant at the 0.05 level. The value of the KMO statistic (0.660) is also large (> 0.5). Thus,
factor analysis may be considered an appropriate technique for analyzing the correlation matrix
of Table 19.2.
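Both statistics can be sketched from the correlation matrix alone. The code below is an illustration under stated assumptions, not the SPSS or SAS implementation: it uses the common textbook form of Bartlett's statistic, −[(n − 1) − (2p + 5)/6] ln|R|, and computes KMO from the partial correlations obtained by inverting R. Because the input is the three-decimal matrix of Table 19.2, the results should only approximate the values reported in Table 19.3. The names `lower`, `partial`, and `critical_05` are introduced here.

```python
import numpy as np

# Lower triangle of Table 19.2, mirrored into a full symmetric matrix.
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
p = len(lower)
R = np.eye(p)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

n = 30  # number of respondents in Table 19.1

# Bartlett's test of sphericity: chi-square transformation of the determinant of R.
_, log_det = np.linalg.slogdet(R)
chi_square = -((n - 1) - (2 * p + 5) / 6) * log_det
df = p * (p - 1) // 2
critical_05 = 24.996  # tabulated upper 5% point of chi-square with 15 df

# KMO: squared observed correlations versus squared partial correlations.
R_inv = np.linalg.inv(R)
d = np.sqrt(np.diag(R_inv))
partial = -R_inv / np.outer(d, d)
off_diag = ~np.eye(p, dtype=bool)
sum_r2 = np.sum(R[off_diag] ** 2)
sum_partial2 = np.sum(partial[off_diag] ** 2)
kmo = sum_r2 / (sum_r2 + sum_partial2)

print(f"chi-square = {chi_square:.3f}, df = {df}, reject at 0.05: {chi_square > critical_05}")
print(f"KMO = {kmo:.3f}")
```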

TABLE 19.3
Results of Principal Components Analysis

Bartlett test of sphericity
Approx. chi-square = 111.314, df = 15, significance = 0.00000
Kaiser-Meyer-Olkin measure of sampling adequacy = 0.660

Communalities
Variable    Initial    Extraction
V1          1.000      0.926
V2          1.000      0.723
V3          1.000      0.894
V4          1.000      0.739
V5          1.000      0.878
V6          1.000      0.790

Initial Eigenvalues
Factor    Eigenvalue    % of Variance    Cumulative %
1         2.731         45.520            45.520
2         2.218         36.969            82.488
3         0.442          7.360            89.848
4         0.341          5.688            95.536
5         0.183          3.044            98.580
6         0.085          1.420           100.000

Extraction Sums of Squared Loadings
Factor    Eigenvalue    % of Variance    Cumulative %
1         2.731         45.520            45.520
2         2.218         36.969            82.488

Factor Matrix
Variable    Factor 1    Factor 2
V1           0.928       0.253
V2          -0.301       0.795
V3           0.936       0.131
V4          -0.342       0.789
V5          -0.869      -0.351
V6          -0.177       0.871

Rotation Sums of Squared Loadings
Factor    Eigenvalue    % of Variance    Cumulative %
1         2.688         44.802            44.802
2         2.261         37.687            82.488

Rotated Factor Matrix
Variable    Factor 1    Factor 2
V1           0.962      -0.027
V2          -0.057       0.848
V3           0.934      -0.146
V4          -0.098       0.854
V5          -0.933      -0.084
V6           0.083       0.885

Factor Score Coefficient Matrix
Variable    Factor 1    Factor 2
V1           0.358       0.011
V2          -0.001       0.375
V3           0.345      -0.043
V4          -0.017       0.377
V5          -0.350      -0.059
V6           0.052       0.395

Reproduced Correlation Matrix
Variables    V1        V2        V3        V4        V5        V6
V1           0.926*    0.024    -0.029     0.031     0.038    -0.053
V2          -0.078     0.723*    0.022    -0.158     0.038    -0.105
V3           0.902    -0.177     0.894*   -0.031     0.081     0.033
V4          -0.117     0.730    -0.217     0.739*   -0.027    -0.107
V5          -0.895    -0.018    -0.859     0.020     0.878*    0.016
V6           0.057     0.746    -0.051     0.748    -0.152     0.790*

*The lower left triangle contains the reproduced correlation matrix; the diagonal, the
communalities; the upper right triangle, the residuals between the observed correlations
and the reproduced correlations.

Determine the Method of Factor Analysis


Once it has been determined that factor analysis is suitable for analyzing the data, an appropriate
method must be selected. The approach used to derive the weights or factor score coefficients
differentiates the various methods of factor analysis. The two basic approaches are principal
components analysis and common factor analysis. In principal components analysis, the total
variance in the data is considered. The diagonal of the correlation matrix consists of unities, and
full variance is brought into the factor matrix. Principal components analysis is recommended
when the primary concern is to determine the minimum number of factors that will account for
maximum variance in the data for use in subsequent multivariate analysis. The factors are called
principal components.
principal components analysis: An approach to factor analysis that considers the total variance in the data.
In common factor analysis, the factors are estimated based only on the common variance.
Communalities are inserted in the diagonal of the correlation matrix. This method is appropriate
when the primary concern is to identify the underlying dimensions and the common variance is
of interest. This method is also known as principal axis factoring.
common factor analysis: An approach to factor analysis that estimates the factors based only on the common variance.
Other approaches for estimating the common factors are also available. These include the methods
of unweighted least squares, generalized least squares, maximum likelihood, alpha method, and
image factoring. These methods are complex and are not recommended for inexperienced users.⁷
Table 19.3 shows the application of principal components analysis to the toothpaste example.
Under “Communalities,” “Initial” column, it can be seen that the communality for each variable,
V1 to V6, is 1.0 as unities were inserted in the diagonal of the correlation matrix. The table labeled
“Initial Eigenvalues” gives the eigenvalues. The eigenvalues for the factors are, as expected, in
decreasing order of magnitude as we go from factor 1 to factor 6. The eigenvalue for a factor
indicates the total variance attributed to that factor. The total variance accounted for by all six
factors is 6.00, which is equal to the number of variables. Factor 1 accounts for a variance of
2.731, which is (2.731/6) or 45.52 percent of the total variance. Likewise, the second factor
accounts for (2.218/6) or 36.97 percent of the total variance, and the first two factors combined
account for 82.49 percent of the total variance. Several considerations are involved in determining
the number of factors that should be used in the analysis.
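The eigenvalue arithmetic above can be reproduced from the correlation matrix. The following is a minimal sketch assuming NumPy and the three-decimal correlations of Table 19.2, so the results should agree with Table 19.3 only up to rounding; the names `lower`, `pct_variance`, and `cumulative` are introduced here.

```python
import numpy as np

# Lower triangle of Table 19.2, mirrored into a full symmetric matrix.
lower = [
    [1.000],
    [-0.053, 1.000],
    [0.873, -0.155, 1.000],
    [-0.086, 0.572, -0.248, 1.000],
    [-0.858, 0.020, -0.778, -0.007, 1.000],
    [0.004, 0.640, -0.018, 0.640, -0.136, 1.000],
]
k = len(lower)
R = np.eye(k)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r

# Principal components: eigenvalues of R, sorted from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Unities on the diagonal mean the eigenvalues sum to the number of variables.
pct_variance = eigenvalues / k * 100
cumulative = np.cumsum(pct_variance)

for f, (ev, pct, cum) in enumerate(zip(eigenvalues, pct_variance, cumulative), start=1):
    print(f"Factor {f}: eigenvalue {ev:.3f}, {pct:.2f}% of variance, {cum:.2f}% cumulative")
```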

Determine the Number of Factors


It is possible to compute as many principal components as there are variables, but in doing so, no
parsimony is gained. In order to summarize the information contained in the original variables, a
smaller number of factors should be extracted. The question is, how many? Several procedures
have been suggested for determining the number of factors. These include a priori determination
and approaches based on eigenvalues, scree plot, percentage of variance accounted for, split-half
reliability, and significance tests.
A PRIORI DETERMINATION Sometimes, because of prior knowledge, the researcher knows how
many factors to expect and thus can specify the number of factors to be extracted beforehand.
The extraction of factors ceases when the desired number of factors have been extracted. Most
computer programs allow the user to specify the number of factors, allowing for an easy
implementation of this approach.
DETERMINATION BASED ON EIGENVALUES In this approach, only factors with eigenvalues greater
than 1.0 are retained; the other factors are not included in the model. An eigenvalue represents
the amount of variance associated with the factor. Hence, only factors with a variance greater than
1.0 are included. Factors with variance less than 1.0 are no better than a single variable, because,
due to standardization, each individual variable has a variance of 1.0. If the number of variables is
less than 20, this approach will result in a conservative number of factors.
DETERMINATION BASED ON SCREE PLOT A scree plot is a plot of the eigenvalues against the
number of factors in order of extraction. The shape of the plot is used to determine the number of
factors. Typically, the plot has a distinct break between the steep slope of factors with large
eigenvalues and a gradual trailing off associated with the rest of the factors. This gradual trailing
off is referred to as the scree. Experimental evidence indicates that the point at which the scree
begins denotes the true number of factors. Generally, the number of factors determined by a
scree plot will be one or a few more than that determined by the eigenvalue criterion.
DETERMINATION BASED ON PERCENTAGE OF VARIANCE In this approach, the number of factors
extracted is determined so that the cumulative percentage of variance extracted by the factors reaches
a satisfactory level. What level of variance is satisfactory depends upon the problem. However, it is
recommended that the factors extracted should account for at least 60 percent of the variance.
DETERMINATION BASED ON SPLIT-HALF RELIABILITY The sample is split in half and factor
analysis is performed on each half. Only factors with high correspondence of factor loadings
across the two subsamples are retained.
DETERMINATION BASED ON SIGNIFICANCE TESTS It is possible to determine the statistical
significance of the separate eigenvalues and retain only those factors that are statistically
significant. A drawback is that with large samples (size greater than 200), many factors are
likely to be statistically significant, although from a practical viewpoint many of these account
for only a small proportion of the total variance.
In Table 19.3, we see that the eigenvalue-greater-than-1.0 criterion (the default option) results in two factors
being extracted. Our a priori knowledge tells us that toothpaste is bought for two major reasons.
The scree plot associated with this analysis is given in Figure 19.4. From the scree plot, a distinct
break occurs at three factors. Finally, from the cumulative percentage of variance accounted for, we
see that the first two factors account for 82.49 percent of the variance, and that the gain achieved in
going to three factors is marginal. Furthermore, split-half reliability also indicates that two factors
are appropriate. Thus, two factors appear to be reasonable in this situation.
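Three of these criteria are simple enough to apply mechanically to the eigenvalues in Table 19.3. The sketch below uses plain Python; the 60 percent threshold is the guideline stated in the text, and locating the scree by the largest drop between successive eigenvalues is a simplifying assumption made here (in practice the break is judged visually from the plot).

```python
# Initial eigenvalues from Table 19.3.
eigenvalues = [2.731, 2.218, 0.442, 0.341, 0.183, 0.085]
k = len(eigenvalues)

# Eigenvalue criterion: retain factors with eigenvalue greater than 1.0.
n_by_eigenvalue = sum(ev > 1.0 for ev in eigenvalues)

# Percentage-of-variance criterion: smallest number of factors whose
# cumulative percentage of variance reaches at least 60 percent.
cumulative = []
running = 0.0
for ev in eigenvalues:
    running += ev / k * 100
    cumulative.append(running)
n_by_variance = next(i + 1 for i, c in enumerate(cumulative) if c >= 60.0)

# Scree heuristic (assumption): the scree begins right after the largest drop.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(k - 1)]
last_factor_before_scree = drops.index(max(drops)) + 1

print("eigenvalue > 1.0:", n_by_eigenvalue)
print("cumulative variance >= 60%:", n_by_variance)
print("largest drop occurs after factor:", last_factor_before_scree)
```

On these eigenvalues the largest drop falls between the second and third factors, which is consistent with the break at three factors described for the scree plot.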
The second column under “Communalities” in Table 19.3 gives relevant information after
the desired number of factors has been extracted. The communalities for the variables under
“Extraction” differ from those under “Initial” because not all of the variance associated with the
variables is explained unless all the factors are retained. The “Extraction Sums of Squared
Loadings” give the variances associated with the factors that are retained. Note that these are the
same as under “Initial Eigenvalues.” This is always the case in principal components analysis.
The percentage of variance accounted for by a factor is determined by dividing the associated
eigenvalue by the total number of factors (or variables) and multiplying by 100. Thus, the first
factor accounts for (2.731/6) * 100 or 45.52 percent of the variance of the six variables.
Likewise, the second factor accounts for (2.218/6) * 100 or 36.969 percent of the variance.
Interpretation of the solution is often enhanced by a rotation of the factors.
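The link between the factor matrix, the extraction communalities, and the extraction sums of squared loadings can be checked with a few lines of arithmetic. This is a sketch assuming NumPy and the three-decimal loadings printed under “Factor Matrix” in Table 19.3, so the results match the table only up to rounding; the variable names are introduced here.

```python
import numpy as np

# Unrotated factor matrix from Table 19.3 (rows V1..V6, columns factor 1 and factor 2).
loadings = np.array([
    [0.928, 0.253],
    [-0.301, 0.795],
    [0.936, 0.131],
    [-0.342, 0.789],
    [-0.869, -0.351],
    [-0.177, 0.871],
])
k = loadings.shape[0]

# Communality of a variable: sum of its squared loadings across the retained factors.
communalities = (loadings ** 2).sum(axis=1)

# Variance attributed to a factor: sum of squared loadings down its column.
extraction_ss = (loadings ** 2).sum(axis=0)
pct_variance = extraction_ss / k * 100

print("communalities:", np.round(communalities, 3))
print("extraction sums of squared loadings:", np.round(extraction_ss, 3))
print("% of variance:", np.round(pct_variance, 2))
```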

Rotate Factors
An important output from factor analysis is the factor matrix, also called the factor pattern matrix.
The factor matrix contains the coefficients used to express the standardized variables in terms of the
factors. These coefficients, the factor loadings, represent the correlations between the factors and
the variables. A coefficient with a large absolute value indicates that the factor and the variable are
closely related. The coefficients of the factor matrix can be used to interpret the factors.

[Figure 19.4: Scree Plot. Eigenvalues (vertical axis, 0.0 to 3.0) plotted against Number of Factors (horizontal axis, 1 to 6).]

Although the initial or unrotated factor matrix indicates the relationship between the factors and
individual variables, it seldom results in factors that can be interpreted, because the factors are corre-
lated with many variables. For example, in Table 19.3, under “Factor Matrix”, factor 1 is at least some-
what correlated with five of the six variables (an absolute value of factor loading greater than 0.3).
Likewise, factor 2 is at least somewhat correlated with four of the six variables. Moreover, variables 2,
4, and 5 load at least somewhat on both the factors. This is illustrated in Figure 19.5(a). How should
these factors be interpreted? In such a complex matrix it is difficult to interpret the factors. Therefore,
through rotation, the factor matrix is transformed into a simpler one that is easier to interpret.
In rotating the factors, we would like each factor to have nonzero, or significant, loadings or
coefficients for only some of the variables. Likewise, we would like each variable to have nonzero
or significant loadings with only a few factors, if possible with only one. If several factors have
high loadings with the same variable, it is difficult to interpret them. Rotation does not affect the
communalities and the percentage of total variance explained. However, the percentage of
variance accounted for by each factor does change. This is seen in Table 19.3 by comparing
“Extraction Sums of Squared Loadings” with “Rotation Sums of Squared Loadings.” The
variance explained by the individual factors is redistributed by rotation. Hence, different methods
of rotation may result in the identification of different factors.
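
A minimal sketch of an orthogonal rotation may make these properties concrete. It rotates the unrotated loadings printed in Table 19.4 and checks that the communalities are unchanged while the variance attributed to each factor is redistributed. Several things here are assumptions made for illustration: the function names, the use of the raw varimax criterion (variance of squared loadings, without the Kaiser normalization that statistical packages commonly apply), and the simple grid search over angles. The resulting loadings should therefore be close to, but not identical with, the rotated matrix in the table.

```python
import math

# Unrotated loadings (Factor 1, Factor 2) for V1..V6, copied from the
# "Factor Matrix" panel of Table 19.4.
unrotated = [
    (0.949, 0.168), (-0.206, 0.720), (0.914, 0.038),
    (-0.246, 0.734), (-0.850, -0.259), (-0.101, 0.844),
]


def rotate(loadings, angle_deg):
    """Rotate both factor axes by angle_deg, keeping them at right angles."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]


def communalities(loadings):
    """Row sums of squared loadings: variance of each variable explained."""
    return [x * x + y * y for x, y in loadings]


def factor_variances(loadings):
    """Column sums of squared loadings: variance attributed to each factor."""
    return [sum(row[j] ** 2 for row in loadings) for j in range(2)]


def raw_varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        sq = [row[j] ** 2 for row in loadings]
        total += sum(v * v for v in sq) / p - (sum(sq) / p) ** 2
    return total


# Illustrative search: try angles from -45 to 44.9 degrees in 0.1-degree steps.
angles = [a / 10 for a in range(-450, 450)]
best_angle = max(angles, key=lambda a: raw_varimax_criterion(rotate(unrotated, a)))
rotated = rotate(unrotated, best_angle)

print("rotation angle (degrees):", best_angle)
print("communalities before:", [round(h, 3) for h in communalities(unrotated)])
print("communalities after: ", [round(h, 3) for h in communalities(rotated)])
print("factor variances before:", [round(v, 3) for v in factor_variances(unrotated)])
print("factor variances after: ", [round(v, 3) for v in factor_variances(rotated)])
```

The two communality lists agree, and the two factor variances sum to the same total before and after rotation, even though each individual factor's share changes.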
The rotation is called orthogonal rotation if the axes are maintained at right angles. The
most commonly used method for rotation is the varimax procedure. This is an orthogonal
method of rotation that minimizes the number of variables with high loadings on a factor,
thereby enhancing the interpretability of the factors.8 Orthogonal rotation results in factors that
are uncorrelated. The rotation is called oblique rotation when the axes are not maintained at
right angles, and the factors are correlated. Sometimes, allowing for correlations among factors
can simplify the factor pattern matrix. Oblique rotation should be used when factors in the
population are likely to be strongly correlated.
In Table 19.3, by comparing the varimax rotated factor matrix with the unrotated matrix (titled
“Factor Matrix”), we can see how rotation achieves simplicity and enhances interpretability.
Whereas five variables correlated with factor 1 in the unrotated matrix, only variables V1, V3, and V5
correlate with factor 1 after rotation. The remaining variables, V2, V4, and V6, correlate highly with
factor 2. Furthermore, no variable correlates highly with both the factors. This can be seen clearly in
Figure 19.5(b). The rotated factor matrix forms the basis for interpretation of the factors.

orthogonal rotation: Rotation of factors in which the axes are maintained at right angles.
varimax procedure: An orthogonal method of factor rotation that minimizes the number of variables
with high loadings on a factor, thereby enhancing the interpretability of the factors.
oblique rotation: Rotation of factors when the axes are not maintained at right angles.

Interpret Factors
Interpretation is facilitated by identifying the variables that have large loadings on the same factor.
That factor can then be interpreted in terms of the variables that load high on it. Another useful aid
in interpretation is to plot the variables using the factor loadings as coordinates. Variables at the
end of an axis are those that have high loadings on only that factor, and hence describe the factor.
Variables near the origin have small loadings on both the factors. Variables that are not near any of
the axes are related to both the factors. If a factor cannot be clearly defined in terms of the original
variables, it should be labeled as an undefined or a general factor.
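
Identifying the variables with large loadings on each factor can be sketched as follows. The loadings are the rotated values printed in Table 19.4. The cutoff of 0.3 and the names CUTOFF and variables_by_factor are assumptions chosen for illustration; what counts as a large loading is a judgment call.

```python
# Varimax-rotated loadings (Factor 1, Factor 2) from the "Rotated Factor
# Matrix" panel of Table 19.4.
rotated_loadings = {
    "V1": (0.963, -0.030), "V2": (-0.054, 0.747), "V3": (0.902, -0.150),
    "V4": (-0.090, 0.769), "V5": (-0.885, -0.079), "V6": (0.075, 0.847),
}
CUTOFF = 0.3  # assumed threshold for a "large" loading; adjust as needed


def variables_by_factor(loadings, cutoff):
    """Map each factor number to the variables whose |loading| exceeds cutoff."""
    n_factors = len(next(iter(loadings.values())))
    groups = {j + 1: [] for j in range(n_factors)}
    for name, row in loadings.items():
        for j, value in enumerate(row):
            if abs(value) > cutoff:
                groups[j + 1].append((name, value))
    return groups


groups = variables_by_factor(rotated_loadings, CUTOFF)
for factor, members in groups.items():
    print(f"Factor {factor}:", ", ".join(f"{n} ({v:+.3f})" for n, v in members))

# Variables that load on more than one factor complicate interpretation.
cross_loaders = [name for name, row in rotated_loadings.items()
                 if sum(abs(v) > CUTOFF for v in row) > 1]
print("Variables loading on both factors:", cross_loaders or "none")
```

The sign of each loading is kept in the output because, as with V5, a negative loading changes how the variable contributes to the factor's label.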
In the rotated factor matrix of Table 19.3, factor 1 has high coefficients for variables V1
(prevention of cavities) and V3 (strong gums), and a negative coefficient for V5 (prevention of
tooth decay is not important). Therefore, this factor may be labeled a health benefit factor.
Note that a negative coefficient for a negative variable (V5) leads to a positive interpretation
that prevention of tooth decay is important. Factor 2 is highly related with variables V2 (shiny
teeth), V4 (fresh breath), and V6 (attractive teeth). Thus, factor 2 may be labeled a social benefit
factor. A plot of the factor loadings, given in Figure 19.6, confirms this interpretation.
Variables V1, V3, and V5 are at the ends of the horizontal axis (factor 1), with V5 at the end
opposite to V1 and V3, whereas variables V2, V4, and V6 are at the end of the vertical axis
(factor 2). One could summarize the data by stating that consumers appear to seek two major
kinds of benefits from a toothpaste: health benefits and social benefits.

FIGURE 19.5 Factor Matrix Before and After Rotation

(a) High Loadings Before Rotation       (b) High Loadings After Rotation
Variables  Factor 1  Factor 2           Variables  Factor 1  Factor 2
1          X                            1          X
2          X         X                  2                    X
3          X                            3          X
4          X         X                  4                    X
5          X         X                  5          X
6                    X                  6                    X

FIGURE 19.6 Factor Loading Plot (variables V1 to V6 plotted with Factor 1 on the horizontal axis and
Factor 2 on the vertical axis, each scaled from –1.0 to 1.0)

Calculate Factor Scores


Following interpretation, factor scores can be calculated, if necessary. Factor analysis has its own
stand-alone value. However, if the goal of factor analysis is to reduce the original set of variables
to a smaller set of composite variables (factors) for use in subsequent multivariate analysis, it is
useful to compute factor scores for each respondent. A factor is simply a linear combination of
the original variables. The factor scores for the ith factor may be estimated as follows:

$$F_i = W_{i1}X_1 + W_{i2}X_2 + W_{i3}X_3 + \cdots + W_{ik}X_k$$

These symbols were defined earlier in the chapter.

factor scores: Composite scores estimated for each respondent on the derived factors.

The weights, or factor score coefficients, used to combine the standardized variables are
obtained from the factor score coefficient matrix. Most computer programs allow you to request
factor scores. Only in the case of principal components analysis is it possible to compute exact
factor scores. Moreover, in principal components analysis, these scores are uncorrelated. In
common factor analysis, estimates of these scores are obtained, and there is no guarantee that the
factors will be uncorrelated with each other. The factor scores can be used instead of the original
variables in subsequent multivariate analysis. For example, using the “Factor Score Coefficient
Matrix” in Table 19.3, one could compute two factor scores for each respondent. The standardized
variable values would be multiplied by the corresponding factor score coefficients to obtain the
factor scores.
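
The multiplication of standardized values by factor score coefficients can be sketched as below. For illustration the coefficients are those printed in the “Factor Score Coefficient Matrix” of Table 19.4. The ratings are hypothetical values invented for this sketch (they are not the Table 19.1 data), and the function names standardize and factor_scores are assumptions.

```python
from statistics import mean, stdev

# Factor score coefficients (Factor 1, Factor 2) for V1..V6, copied from the
# "Factor Score Coefficient Matrix" panel of Table 19.4.
coefficients = [
    (0.628, 0.101), (-0.024, 0.253), (0.217, -0.169),
    (-0.023, 0.271), (-0.166, -0.059), (0.083, 0.500),
]

# Hypothetical 7-point ratings on V1..V6 for five respondents (illustration
# only; these are not the Table 19.1 data).
ratings = [
    [7, 3, 6, 4, 2, 4],
    [1, 3, 2, 4, 5, 4],
    [6, 2, 7, 4, 1, 3],
    [4, 5, 4, 6, 2, 5],
    [2, 6, 2, 7, 6, 7],
]


def standardize(rows):
    """Convert each column to z-scores using the sample mean and standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def factor_scores(z_rows, weights):
    """F_i = W_i1*X_1 + ... + W_ik*X_k for each respondent and each factor i."""
    n_factors = len(weights[0])
    return [[sum(z * w[i] for z, w in zip(row, weights)) for i in range(n_factors)]
            for row in z_rows]


scores = factor_scores(standardize(ratings), coefficients)
for respondent, (f1, f2) in enumerate(scores, start=1):
    print(f"Respondent {respondent}: F1={f1:+.3f}  F2={f2:+.3f}")
```

In practice the standardization would use the means and standard deviations of the full sample on which the factor analysis was run, not a handful of cases.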

Select Surrogate Variables


Sometimes, instead of computing factor scores, the researcher wishes to select surrogate variables.
Selection of substitute or surrogate variables involves singling out some of the original variables
for use in subsequent analysis. This allows the researcher to conduct subsequent analysis and inter-
pret the results in terms of original variables rather than factor scores. By examining the factor
matrix, one could select for each factor the variable with the highest loading on that factor. That
variable could then be used as a surrogate variable for the associated factor. This process works
well if one factor loading for a variable is clearly higher than all other factor loadings. However,
the choice is not as easy if two or more variables have similarly high loadings. In such a case, the
choice between these variables should be based on theoretical and measurement considerations.
For example, theory may suggest that a variable with a slightly lower loading is more important
than one with a slightly higher loading. Likewise, if a variable has a slightly lower loading but has
been measured more precisely, it should be selected as the surrogate variable. In Table 19.3, the
variables V1, V3, and V5 all have high loadings on factor 1, and all are fairly close in magnitude,
although V1 has the highest loading and would therefore be a likely candidate. However,
if prior knowledge suggests that prevention of tooth decay is a very important benefit, V5 would be
selected as the surrogate for factor 1. Also, the choice of a surrogate for factor 2 is not
straightforward. Variables V2, V4, and V6 all have comparably high loadings on this factor. If prior knowledge
suggests that attractive teeth is the most important social benefit sought from a toothpaste, the
researcher would select V6.
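
The mechanical part of surrogate selection, picking the variable with the highest loading on each factor and flagging close competitors, can be sketched as follows. The loadings are the rotated values printed in Table 19.4. The margin of 0.15 used to define “similarly high” and the names CLOSE_MARGIN and surrogate_candidates are assumptions for illustration. The final choice among close candidates still rests on theory and measurement quality, which code cannot supply.

```python
# Varimax-rotated loadings (Factor 1, Factor 2) from the "Rotated Factor
# Matrix" panel of Table 19.4.
rotated_loadings = {
    "V1": (0.963, -0.030), "V2": (-0.054, 0.747), "V3": (0.902, -0.150),
    "V4": (-0.090, 0.769), "V5": (-0.885, -0.079), "V6": (0.075, 0.847),
}
CLOSE_MARGIN = 0.15  # assumed tolerance for "similarly high" loadings


def surrogate_candidates(loadings, factor_index, margin):
    """Return (variable with highest |loading|, other variables within margin of it)."""
    ranked = sorted(loadings,
                    key=lambda name: abs(loadings[name][factor_index]),
                    reverse=True)
    best = ranked[0]
    top = abs(loadings[best][factor_index])
    close = [name for name in ranked[1:]
             if top - abs(loadings[name][factor_index]) <= margin]
    return best, close


for factor_index in range(2):
    best, close = surrogate_candidates(rotated_loadings, factor_index, CLOSE_MARGIN)
    print(f"Factor {factor_index + 1}: highest loading {best}; close competitors {close}")
```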

Determine the Model Fit


The final step in factor analysis involves the determination of model fit. A basic assumption
underlying factor analysis is that the observed correlation between variables can be attributed to
common factors. Hence, the correlations between the variables can be deduced or reproduced
from the estimated correlations between the variables and the factors. The differences between
the observed correlations (as given in the input correlation matrix) and the reproduced correla-
tions (as estimated from the factor matrix) can be examined to determine model fit. These differ-
ences are called residuals. If there are many large residuals, the factor model does not provide a
good fit to the data and the model should be reconsidered. In the upper right triangle of the
“Reproduced Correlation Matrix” of Table 19.3, we see that only five residuals are larger than
0.05, indicating an acceptable model fit.
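
The logic of reproduced correlations and residuals can be sketched in Python. This illustration uses the common factor analysis output of Table 19.4, not Table 19.3, so the count of large residuals differs from the five mentioned above. The reproduced correlation between two variables is computed as the sum, across factors, of the products of their loadings. The function name and the dictionary layout are assumptions.

```python
names = ["V1", "V2", "V3", "V4", "V5", "V6"]

# Unrotated loadings (Factor 1, Factor 2) from the "Factor Matrix" panel of Table 19.4.
loadings = [
    (0.949, 0.168), (-0.206, 0.720), (0.914, 0.038),
    (-0.246, 0.734), (-0.850, -0.259), (-0.101, 0.844),
]

# Residuals (observed minus reproduced correlations) as printed in the upper
# right triangle of the "Reproduced Correlation Matrix" in Table 19.4.
residuals = {
    ("V1", "V2"): 0.022, ("V1", "V3"): 0.000, ("V1", "V4"): 0.024,
    ("V1", "V5"): -0.008, ("V1", "V6"): -0.042,
    ("V2", "V3"): 0.006, ("V2", "V4"): -0.008, ("V2", "V5"): 0.031,
    ("V2", "V6"): 0.012,
    ("V3", "V4"): -0.051, ("V3", "V5"): 0.008, ("V3", "V6"): 0.042,
    ("V4", "V5"): -0.025, ("V4", "V6"): -0.004,
    ("V5", "V6"): -0.003,
}


def reproduced_correlations(rows):
    """Entry (i, j) is the sum over factors of loading_i * loading_j.

    The diagonal entries are the communalities.
    """
    n = len(rows)
    return [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]


reproduced = reproduced_correlations(loadings)
print("Reproduced r(V1, V3):", round(reproduced[0][2], 3))

# Flag residuals larger than 0.05 in absolute value.
large = [pair for pair, value in residuals.items() if abs(value) > 0.05]
print("Residuals larger than 0.05:", large)
```

Values computed from the three-decimal loadings may differ from the printed reproduced correlations in the last decimal place.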

ACTIVE RESEARCH

Nokia: Factoring Preferences for Cellular Handsets


Visit www.nokia.com and search the Internet using a search engine as well as your library’s online database
to obtain information on consumers’ preferences for cellular handsets.
Nokia would like to determine the factors that underlie the cellular handset preferences of 15- to
24-year-olds, the heavy users of cellular handsets. What data would you collect and how would you
analyze that data?
As the marketing manager for Nokia, what strategies would you formulate to target the 15- to
24-year-olds, the heavy users of cellular handsets?

Real Research Manufacturing Promotion Components


The objective of this study was to develop a rather comprehensive inventory of manufacturer-controlled
trade promotion variables and to demonstrate that an association exists between these variables and the
retailer’s promotion support decision. Retailer or trade support was defined operationally as the trade
buyer’s attitude toward the promotion.
Factor analysis was performed on the explanatory variables with the primary goal of data reduction.
The principal components method, using varimax rotation, reduced the 30 explanatory variables to eight
factors having eigenvalues greater than 1.0. For the purpose of interpretation, each factor was composed of
variables that loaded 0.40 or higher on that factor. In two instances, where variables loaded 0.40 or above on
two factors, each variable was assigned to the factor where it had the highest loading. Only one variable,
“ease of handling/stocking at retail,” did not load at least 0.40 on any factor. In all, the eight factors
explained 62 percent of the total variance. Interpretation of the factor loading matrix was straightforward.
Table 1 lists the factors in the order in which they were extracted.
Stepwise discriminant analysis was conducted to determine which, if any, of the eight factors predicted
trade support to a statistically significant degree. The factor scores for the eight factors were the explanatory
variables. The dependent variable consisted of the retail buyer’s overall rating of the deal (Rating), which
was collapsed into a three-group (low, medium, and high) measure of trade support. The results of the
discriminant analyses are shown in Table 2. All eight entered the discriminant functions. Goodness-of-fit
measures indicated that, as a group, the eight factors discriminated between high, medium, and low levels
of trade support. Multivariate F ratios, indicating the degree of discrimination between each pair of groups,

Table 1 Factors Influencing Trade Promotional Support

Factor  Interpretation (% variance explained)   Loading  Variables Included in the Factor
F1      Item importance (16.3%)                  0.77    Item is significant enough to warrant promotion
                                                 0.75    Category responds well to promotion
                                                 0.66    Closest trade competitor is likely to promote item
                                                 0.64    Importance of promoted product category
                                                 0.59    Item regular (nondeal) sales volume
                                                 0.57    Deal meshes with trade promotional requirements
F2      Promotion elasticity (9.3%)                      Buyer’s estimate of sales increase on the basis of:
                                                 0.86    Price reduction and display
                                                 0.82    Display only
                                                 0.80    Price reduction only
                                                 0.70    Price reduction, display, and advertising
F3      Manufacturer brand support (8.2%)                Manufacturer’s brand support in form of:
                                                 0.85    Coupons
                                                 0.81    Radio and television advertising
                                                 0.80    Newspaper advertising
                                                 0.75    Point-of-purchase promotion (e.g., display)
F4      Manufacturer reputation (7.3%)           0.72    Manufacturer’s overall reputation
                                                 0.72    Manufacturer cooperates in meeting trade’s promotional needs
                                                 0.64    Manufacturer cooperates on emergency orders
                                                 0.55    Quality of sales presentation
                                                 0.51    Manufacturer’s overall product quality
F5      Promotion wearout (6.4%)                 0.93    Product category is overpromoted
                                                 0.93    Item is overpromoted
F6      Sales velocity (5.4%)                   -0.81    Brand market share rank (a)
                                                 0.69    Item regular sales volume (a)
                                                 0.46    Item regular sales volume
F7      Item profitability (4.5%)                0.79    Item regular gross margin
                                                 0.72    Item regular gross margin (a)
                                                 0.49    Reasonableness of deal performance requirements
F8      Incentive amount (4.2%)                  0.83    Absolute amount of deal allowances
                                                 0.81    Deal allowances as percent of regular trade cost (a)
                                                 0.49    Absolute amount of deal allowances (a)
(a) Denotes objective (archival) measure

Table 2 Discriminant Analysis Results: Analysis of Rating (N = 564)

                                  Standardized Discriminant Coefficients
Factor                            Function 1   Function 2
F1 Item importance                 0.861       -0.253
F2 Promotion elasticity            0.081        0.398
F3 Manufacturer brand support      0.127       -0.036
F4 Manufacturer reputation         0.394        0.014
F5 Promotion wearout              -0.207        0.380
F6 Sales velocity                  0.033       -0.665
F7 Item profitability              0.614        0.357
F8 Incentive amount                0.461        0.254
Wilks’ λ (for each factor)        All significant at p < 0.001
Multivariate F ratios             All significant at p < 0.001
% Cases correctly classified      65% correct

Table 3 Relative Importance of Trade Support Influencers (as Indicated by Order of Entry
into the Discriminant Analysis)

Analysis of Rating
Order of Entry  Factor Name
1 Item importance
2 Item profitability
3 Incentive amount
4 Manufacturer reputation
5 Promotion wearout
6 Sales velocity
7 Promotion elasticity
8 Manufacturer brand support

were significant at p < 0.001. Correct classification into high, medium, and low categories was achieved
for 65 percent of the cases. The order of entry into discriminant analysis was used to determine the relative
importance of factors as trade support influencers, as shown in Table 3.9
In keeping with the results of this study, P&G decided to emphasize item importance, item profitability,
incentive amount, and its reputation in order to garner retailers’ promotion support. Partially as a result of
these efforts, P&G brands touched the lives of people around the world three billion times a day in 2009. ■

Applications of Common Factor Analysis


The data of Table 19.1 were analyzed using the common factor analysis model. Instead of using
unities in the diagonal, the communalities were inserted. The output, shown in Table 19.4, is
similar to the output from principal components analysis presented in Table 19.3. Under
“Communalities,” in the “Initial” column, the communalities for the variables are no longer 1.0.
Based on the eigenvalue criterion, again two factors are extracted. The variances, after extracting
the factors, are different from the initial eigenvalues. The first factor accounts for 42.84 percent of
the variance, whereas the second accounts for 31.13 percent, in each case a little less than what was
observed in principal components analysis.
The values in the unrotated “Factor Matrix” of Table 19.4 are a little different from those in
Table 19.3, although the pattern of the coefficients is similar. Sometimes, however, the pattern of
loadings for common factor analysis is different from that for principal components analysis, with
some variables loading on different factors. The rotated factor matrix has the same pattern as that
in Table 19.3, leading to a similar interpretation of the factors. Another application is provided in
the context of “common” rebate perceptions.

ACTIVE RESEARCH

Wendy’s: How Old Fashioned Is Consumer’s Choice Criteria for Fast Foods
Visit www.wendys.com and search the Internet using a search engine as well as your library’s online database
to determine the choice criteria of consumers in selecting a fast-food restaurant.
As the marketing director for Wendy’s, what marketing strategies would you formulate to increase
your patronage?
Describe the data you would collect and the analysis you would conduct to determine the choice
criteria of consumers in selecting a fast-food restaurant.

TABLE 19.4
Results of Common Factor Analysis
Bartlett test of sphericity
Approx. chi-square = 111.314, df = 15, significance = 0.00000
Kaiser-Meyer-Olkin measure of sampling adequacy = 0.660
Communalities
Variable Initial Extraction
V1 0.859 0.928
V2 0.480 0.562
V3 0.814 0.836
V4 0.543 0.600
V5 0.763 0.789
V6 0.587 0.723

Initial Eigenvalues
Factor Eigenvalue % of Variance Cumulative %
1 2.731 45.520 45.520
2 2.218 36.969 82.488
3 0.442 7.360 89.848
4 0.341 5.688 95.536
5 0.183 3.044 98.580
6 0.085 1.420 100.000
Extraction Sums of Squared Loadings
Factor Eigenvalue % of Variance Cumulative %
1 2.570 42.837 42.837
2 1.868 31.126 73.964
Factor Matrix
Factor 1 Factor 2
V1 0.949 0.168
V2 -0.206 0.720
V3 0.914 0.038
V4 -0.246 0.734
V5 -0.850 -0.259
V6 -0.101 0.844

Rotation Sums of Squared Loadings


Factor Eigenvalue % of Variance Cumulative %
1 2.541 42.343 42.343
2 1.897 31.621 73.964
Rotated Factor Matrix
Factor 1 Factor 2
V1 0.963 -0.030
V2 -0.054 0.747
V3 0.902 -0.150
V4 -0.090 0.769
V5 -0.885 -0.079
V6 0.075 0.847

Factor Score Coefficient Matrix
Factor 1 Factor 2
V1 0.628 0.101
V2 -0.024 0.253
V3 0.217 -0.169
V4 -0.023 0.271
V5 -0.166 -0.059
V6 0.083 0.500

Reproduced Correlation Matrix
     V1        V2        V3        V4        V5        V6
V1   0.928*    0.022    -0.000     0.024    -0.008    -0.042
V2  -0.075     0.562*    0.006    -0.008     0.031     0.012
V3   0.873    -0.161     0.836*   -0.051     0.008     0.042
V4  -0.110     0.580    -0.197     0.600*   -0.025    -0.004
V5  -0.850    -0.012    -0.786     0.019     0.789*   -0.003
V6   0.046     0.629    -0.060     0.645    -0.133     0.723*

*The lower left triangle contains the reproduced correlation matrix; the diagonal, the
communalities; the upper right triangle, the residuals between the observed
correlations and the reproduced correlations.

Real Research “Common” Rebate Perceptions


Rebates are effective in obtaining new users, brand-switching, and repeat purchases among current users. In
March 2009, AT&T deployed a rebate program as a means to draw new users to their Internet services.
AT&T’s intent behind this rebate plan was to acquire new users from rivals such as Verizon. What makes
rebates effective?
A study was undertaken to determine the factors underlying consumer perception of rebates. A set of 24
items measuring consumer perceptions of rebates was constructed. Respondents were asked to express their
degree of agreement with these items on 5-point Likert scales. The data were collected by a one-stage area tele-
phone survey conducted in the Memphis metropolitan area. A total of 303 usable questionnaires were obtained.
The 24 items measuring perceptions of rebates were analyzed using common factor analysis. The initial
factor solution did not reveal a simple structure of underlying rebate perceptions. Therefore, items that had
low loadings were deleted from the scale, and the factor analysis was performed on the remaining items. This
second solution yielded three interpretable factors. The factor loadings are presented in the accompanying
table, where large loadings have been underscored. The three factors contained four, four, and three items,

Factor Analysis of Perceptions of Rebates

Factor Loading
Factor 1  Factor 2  Factor 3  Scale Items (a)
 0.194     0.671    -0.127    Manufacturers make the rebate process too complicated.
-0.031     0.612     0.352    Mail-in rebates are not worth the trouble involved.
 0.013     0.718     0.051    It takes too long to receive the rebate check from the manufacturer.
 0.205     0.616     0.173    Manufacturers could do more to make rebates easier to use.
 0.660     0.172     0.101    Manufacturers offer rebates because consumers want them. (b)
 0.569     0.203     0.334    Today’s manufacturers take a real interest in consumer welfare. (b)
 0.660     0.002     0.318    Consumer benefit is usually the primary consideration in rebate offers. (b)
 0.716     0.047    -0.033    In general, manufacturers are sincere in their rebate offers to consumers. (b)
 0.099     0.156     0.744    Manufacturers offer rebates to get consumers to buy something they don’t really need.
 0.090     0.027     0.702    Manufacturers use rebate offers to induce consumers to buy slow-moving items.
 0.230     0.066     0.527    Rebate offers require you to buy more of a product than you need.
 2.030     1.344     1.062    Eigenvalues
27.500    12.200     9.700    Percentage of explained variance
(a) The response categories for all items were: strongly agree (1), agree (2), neither agree nor disagree (3),
disagree (4), strongly disagree (5), and don’t know (6). “Don’t know” responses were excluded from
data analysis.
(b) The scores of these items were reversed.

respectively. Factor 1 was defined as a representation of consumers’ faith in the rebate system (Faith). Factor
2 seemed to capture the consumers’ perceptions of the efforts and difficulties associated with rebate redemption
(Efforts). Factor 3 represented consumers’ perceptions of the manufacturers’ motives for offering
rebates (Motives). The loadings of items on their respective factors ranged from 0.527 to 0.744.
Therefore, companies such as AT&T that employ rebates should ensure that the effort and difficulties
of consumers in taking advantage of the rebates are minimized. They should also try to build consumers’
faith in the rebate system and portray honest motives for offering rebates.10 ■

Note that in this example, when the initial factor solution was not interpretable, items that had
low loadings were deleted and the factor analysis was performed on the remaining items. If the
number of variables is large (greater than 15), principal components analysis and common factor
analysis result in similar solutions. However, principal components analysis is less prone to mis-
interpretation and is recommended for the nonexpert user. The next example illustrates an appli-
cation of principal components analysis in international marketing research, and the example
after that presents an application in the area of ethics.

Real Research Driving Nuts for Beetles


Generally, with time, consumer needs and tastes change. Consumer preferences for automobiles need to be
continually tracked to identify changing demands and specifications. However, there is one car that is quite
an exception—the Volkswagen Beetle. More than 22 million have been built since it was introduced in
1938. Surveys have been conducted in different countries to determine the reasons why people purchase
Beetles. Principal components analyses of the variables measuring the reasons for owning Beetles have
consistently revealed one dominant factor—fanatical loyalty. The company has long wished for the car’s
natural death but without any effect. This noisy and cramped “bug” has inspired devotion in drivers. Now
old bugs are being sought everywhere. “The Japanese are going absolutely nuts for Beetles,” says Jack
Finn, a recycler of old Beetles in West Palm Beach, Florida. Because of faithful loyalty to the “bug,” VW
reintroduced it in 1998 as the New Beetle. The New Beetle has proven itself as much more than a sequel to
its legendary namesake. It has won several distinguished automotive awards. The 2009 Beetle was offered
in coupe and convertible versions and the base model had a starting MSRP of $17,990.11 ■

Real Research Factors Predicting Unethical Marketing Research Practices


Unethical employee behavior was identified as a root cause for the global banking and financial mess of
2008–2009. If companies want ethical employees, then they themselves must conform to high ethical
standards. This also applies to the marketing research industry. In order to identify organizational
variables that are determinants of the incidence of unethical marketing research practices, a sample of
420 marketing professionals was surveyed. These marketing professionals were asked to provide
responses on several scales, and to provide evaluations of incidence of 15 research practices that have
been found to pose research ethics problems.

Factor Analysis of Ethical Problems and Top Management Action Scales

Factor 1: Extent of Ethical Problems Within the Organization
Factor 2: Top Management Actions on Ethics

Factor 1  Factor 2  Item
0.66                1. Successful executives in my company make rivals look bad in
                    the eyes of important people in my company.
0.68                2. Peer executives in my company often engage in behaviors that
                    I consider to be unethical.
0.43                3. There are many opportunities for peer executives in my
                    company to engage in unethical behaviors.
0.81                4. Successful executives in my company take credit for the ideas
                    and accomplishment of others.
0.66                5. In order to succeed in my company, it is often necessary to
                    compromise one’s ethics.
0.64                6. Successful executives in my company are generally more
                    unethical than unsuccessful executives.
0.78                7. Successful executives in my company look for a “scapegoat”
                    when they feel they may be associated with failure.
0.68                8. Successful executives in my company withhold information
                    that is detrimental to their self-interest.
          0.73      9. Top management in my company has let it be known in no
                    uncertain terms that unethical behaviors will not be tolerated.
          0.80      10. If an executive in my company is discovered to have engaged
                    in unethical behavior that results primarily in personal gain
                    (rather than corporate gain), he/she will be promptly
                    reprimanded.
          0.78      11. If an executive in my company is discovered to have engaged
                    in an unethical behavior that results primarily in corporate
                    gain (rather than personal gain), he/she will be promptly
                    reprimanded.
5.06      1.17      Eigenvalue
46%       11%       % of Variance Explained
0.87      0.75      Coefficient Alpha

To simplify the table, only varimax rotated loadings of 0.40 or greater are reported. Each was rated on a 5-point scale with 1 = “strongly agree”
and 5 = “strongly disagree.”
One of these scales included 11 items pertaining to the extent that ethical problems plagued the organiza-
tion, and what top management’s actions were toward ethical situations. A principal components analysis with
varimax rotation indicated that the data could be represented by two factors.
These two factors were then used in a multiple regression along with four other predictor variables.
They were found to be the two best predictors of unethical marketing research practices.12 ■

Decision Research Tiffany: Focusing on the Core

The Situation
Tiffany & Co. (www.tiffany.com) is the internationally renowned retailer, designer, manufacturer, and
distributor of fine jewelry, timepieces, sterling silverware, china, crystal, stationery, fragrances, and accessories.
Founded in 1837 by Charles Lewis Tiffany, the company had 184 Tiffany & Co. stores and boutiques that
served customers in the United States and international markets in 2009. Tiffany’s main growth strategies
consist of expanding its channels of distribution in important markets around the world, complementing
its existing product offerings with an active product development program, enhancing customer awareness
of its product designs, quality, and value, and providing levels of customer service that guarantee a great
shopping experience. Tiffany & Company’s revenues exceeded $2.94 billion in 2008.
Tiffany is slowly and subtly embracing the middle class, a potential danger for one of retail’s most
exclusive names. Over the past decade, the luxury jewelry retailer has nearly tripled its stores in the United
States and changed its promotions to highlight more lower-price wares. It currently has 70 U.S. locations
and another 114 overseas. Tiffany’s new 5,000-square-foot format will allow it to also expand into smaller
markets and double up in bigger towns. In the process of reaching out to embrace more markets, Tiffany
may be driving away some of its core customers. Although Tiffany has a long way to go before it is fully
accessible, the Tiffany Heart Tag silver bracelet has become quite a popular item among many, including
Reese Witherspoon and her entourage in Legally Blonde. Mr. James E. Quinn, president, is wondering
what the psychographic profile of Tiffany’s core customers is and what the company should do to maintain
and build upon the loyalty of its core customers. This is critical to success in the future.

Photo caption: Factor analysis can help Tiffany identify the psychographic profile of its core customers.

The Marketing Research Decision


1. What data should be collected and how should they be analyzed to determine the psychographic profile
of Tiffany’s core customers?
2. Discuss the role of the type of research you recommend in enabling James E. Quinn to maintain and
build upon the loyalty of Tiffany’s core customers.

The Marketing Management Decision


1. What new strategies should James E. Quinn formulate to target the core customers?
2. Discuss how the marketing management decision action that you recommend to James E. Quinn is
influenced by the data analysis that you suggested earlier and by the likely findings.13 ■

Statistical Software
Computer programs are available to implement both of the approaches: principal components analysis
and common factor analysis. We discuss the use of SPSS and SAS in detail in the subsequent
sections. Here, we briefly describe the use of MINITAB. In MINITAB, factor analysis can be
accessed using Multivariate>Factor analysis. Principal components or maximum likelihood can be
used to determine the initial factor extraction. If maximum likelihood is used, specify the number of
factors to extract. If a number is not specified with a principal component extraction, the program
will set it equal to the number of variables in the data set. Factor analysis is not available in EXCEL.

SPSS and SAS Computerized Demonstration Movies


We have developed computerized demonstration movies that give step-by-step instructions for
running all the SPSS and SAS Learning Edition programs that are discussed in this chapter.
These demonstrations can be downloaded from the Web site for this book. The instructions for
running these demonstrations are given in Exhibit 14.2.

SPSS and SAS Screen Captures with Notes


The step-by-step instructions for running the various SPSS and SAS Learning Edition programs
discussed in this chapter are also illustrated in screen captures with appropriate notes. These
screen captures can be downloaded from the Web site for this book.

SPSS Windows
To select this procedure using SPSS for Windows, click:
Analyze>Dimension Reduction>Factor . . .
The following are the detailed steps for running principal components analysis on the
toothpaste attribute ratings (V1 to V6) using the data of Table 19.1.

1. Select ANALYZE from the SPSS menu bar.
2. Click DIMENSION REDUCTION and then FACTOR.
3. Move “Prevents Cavities [v1],” “Shiny Teeth [v2],” “Strengthen Gums [v3],” “Freshens
Breath [v4],” “Tooth Decay Unimportant [v5],” and “Attractive Teeth [v6]” into the
VARIABLES box.
4. Click on DESCRIPTIVES. In the pop-up window, in the STATISTICS box check INITIAL
SOLUTION. In the CORRELATION MATRIX box check KMO AND BARTLETT’S
TEST OF SPHERICITY and also check REPRODUCED. Click CONTINUE.
5. Click on EXTRACTION. In the pop-up window, for METHOD select PRINCIPAL
COMPONENTS (default). In the ANALYZE box, check CORRELATION MATRIX.
In the EXTRACT box, select BASED ON EIGENVALUE and enter 1 for EIGENVALUES
GREATER THAN box. In the DISPLAY box check UNROTATED FACTOR SOLUTION.
Click CONTINUE.
6. Click on ROTATION. In the METHOD box check VARIMAX. In the DISPLAY box check
ROTATED SOLUTION. Click CONTINUE.
7. Click on SCORES. In the pop-up window, check DISPLAY FACTOR SCORE COEFFICIENT
MATRIX. Click CONTINUE.
8. Click OK.

The procedure for running common factor analysis is similar, except that in step 5, for
METHOD select PRINCIPAL AXIS FACTORING.
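
For readers working outside SPSS, the extraction requested in steps 4 and 5 can be approximated in a few lines of Python. The following is a minimal sketch under stated assumptions, not a reproduction of SPSS output: it assumes NumPy is available, it uses synthetic ratings in place of the Table 19.1 data, and the function name principal_components is hypothetical.

```python
import numpy as np

def principal_components(X, min_eigenvalue=1.0):
    """Extract principal components from the correlation matrix of X.

    Components whose eigenvalue exceeds min_eigenvalue are retained
    (the eigenvalue-greater-than-1 rule selected in step 5).
    """
    R = np.corrcoef(X, rowvar=False)            # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # reorder from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])   # unrotated factor matrix
    return R, eigvals, loadings

# Synthetic stand-in for six attribute ratings (NOT the Table 19.1 data):
# two latent dimensions, three variables driven by each.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 2))
signal = f[:, [0, 1, 0, 1, 0, 1]] * np.array([0.9, 0.9, 0.9, 0.9, -0.9, 0.9])
X = signal + 0.4 * rng.normal(size=(200, 6))

R, eigvals, loadings = principal_components(X)
communalities = (loadings ** 2).sum(axis=1)     # share of each variable's variance explained
reproduced = loadings @ loadings.T              # correlations implied by the retained factors
residuals = R - reproduced
np.fill_diagonal(residuals, 0.0)                # only off-diagonal residuals are examined

print("eigenvalues:", np.round(eigvals, 3))
print("factors retained:", loadings.shape[1])
print("largest absolute residual:", np.round(np.abs(residuals).max(), 3))
```

The sign of each column of loadings is arbitrary in an eigendecomposition, so a package may report the same factor with all signs reversed.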

SAS Learning Edition


The instructions given here and in all the data analysis chapters (14 to 22) will work with the
SAS Learning Edition as well as with the SAS Enterprise Guide. For a point-and-click approach
for performing principal components analysis and factor analysis, use the Analyze task within
the SAS Learning Edition. The Multivariate>Factor Analysis task performs both Principal
Components Analysis and Factor Analysis. The Multivariate>Principal Components task also
performs Principal Components Analysis.
To perform principal components analysis, click:

Analyze>Multivariate>Factor Analysis

The following are the detailed steps for running principal components analysis on the
toothpaste attribute ratings (V1 to V6) using the data of Table 19.1.

1. Select ANALYZE from the SAS Learning Edition menu bar.
2. Click MULTIVARIATE and then FACTOR ANALYSIS.
3. Move V1-V6 to the ANALYSIS variables task role.
4. Click FACTORING METHOD and change the SMALLEST EIGENVALUE to 1.
5. Click on ROTATION AND PLOTS and select ORTHOGONAL VARIMAX as the
Rotation method and the SCREE PLOT under the PLOTS TO SHOW.
6. Click on RESULTS and select EIGENVECTORS, FACTOR SCORING COEFFICIENTS
under FACTOR RESULTS, and MEANS AND STANDARD DEVIATIONS of input
columns, CORRELATION MATRIX of input columns and KAISER’S MEASURE OF
SAMPLING ADEQUACY under RELATED STATISTICS.
7. Click RUN.
The procedure for running common factor analysis is similar, except that in step 4, select PRINCIPAL
AXIS FACTORING as the FACTORING METHOD.
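
Both procedures above request statistics on whether the correlation matrix is suitable for factoring: Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. The sketch below shows how these two quantities are commonly computed from a correlation matrix. It assumes NumPy is available; the function names and the small example matrix are hypothetical, and the chi-square statistic still has to be compared with a chi-square critical value for the reported degrees of freedom.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test.

    R is a p x p correlation matrix estimated from n observations.
    """
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)            # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)             # partial correlations between variable pairs
    off = ~np.eye(R.shape[0], dtype=bool)       # mask selecting the off-diagonal cells
    r2 = (R[off] ** 2).sum()
    a2 = (partial[off] ** 2).sum()
    return r2 / (r2 + a2)

# Hypothetical correlation matrix: three variables, every pairwise r = 0.5.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
chi_square, df = bartlett_sphericity(R, n=100)
print(round(float(chi_square), 2), df)   # 67.35 3
print(round(float(kmo(R)), 3))           # 0.692
```

A large chi-square value argues against the hypothesis that the variables are uncorrelated in the population, and a KMO value above 0.5 is the guideline shown in the chapter's concept map.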

Project Research Factor Analysis


In the department store project, the respondents’ ratings of 21 lifestyle statements were factor analyzed to
determine the underlying lifestyle factors. Seven factors emerged: bank card versus store card preference,
credit proneness, credit avoidance, leisure time orientation, credit card favorableness, credit convenience,
and credit card cost consciousness. These factors, along with the demographic characteristics, were used to
profile the segments formed as a result of clustering.

Project Activities
Download the SPSS or SAS data file Sears Data 17 from the Web site for this book. See Chapter 17 for a
description of this file.
1. Can the 21 lifestyle statements be represented by a reduced set of factors? If so, what would be the
interpretation of these factors? Conduct a principal components analysis and save the factor scores.
2. Can the importance attached to the eight factors of the choice criteria be represented by a reduced set
of factors? If so, what would be the interpretation of these factors? Conduct a principal components
analysis. ■
Summary
Factor analysis, also called exploratory factor analysis (EFA), is a class of procedures used for
reducing and summarizing data. Each variable is expressed as a linear combination of the
underlying factors. Likewise, the factors themselves can be expressed as linear combinations of
the observed variables. The factors are extracted in such a way that the first factor accounts for
the highest variance in the data, the second the next highest, and so on. Additionally, it is possible
to extract the factors so that the factors are uncorrelated, as in principal components analysis.
Figure 19.7 gives a concept map for factor analysis.

FIGURE 19.7
A Concept Map for Factor Analysis
[Figure: concept map. Formulate the problem (specify the objectives and the variables to be included;
the sample size should be at least 4 to 5 times the number of variables); construct the correlation
matrix (conduct Bartlett's test of sphericity; Kaiser-Meyer-Olkin should be > 0.5); determine the
method of factor analysis (principal components analysis, common factor analysis, other methods);
decide the number of factors (a priori, eigenvalues, percentage of variance, split-half reliability,
scree plot, significance tests); select factor rotation (orthogonal, oblique, other rotation methods);
interpret the rotated factors; calculate factor scores, Fi = Wi1X1 + Wi2X2 + … + WikXk, or select
surrogate variables; determine model fit by examining the reproduced correlation matrix and residuals.]
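
The Summary notes that factors can be expressed as linear combinations of the observed variables and that, in principal components analysis, the factors are uncorrelated. The following sketch illustrates both points. It assumes NumPy is available, uses synthetic data rather than any data set from this book, and fixes the number of retained components at two purely for illustration.

```python
import numpy as np

# Synthetic data standing in for respondents' ratings (hypothetical values).
rng = np.random.default_rng(1)
common = rng.normal(size=(150, 2))
X = common[:, [0, 0, 0, 1, 1, 1]] + 0.5 * rng.normal(size=(150, 6))

# Standardize each variable; factor scores are computed from standardized variables.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1][:2]           # indices of the two largest eigenvalues
lam, V = eigvals[order], eigvecs[:, order]

W = V / np.sqrt(lam)    # factor score coefficients, one column of weights per factor
F = Z @ W               # each score is Fi = Wi1*X1 + Wi2*X2 + ... + Wik*Xk

# The off-diagonal entries are zero up to rounding: the scores are uncorrelated.
print(np.round(np.corrcoef(F, rowvar=False), 3))
```

Scores obtained from common factor analysis are estimated differently and are generally not exactly uncorrelated, so this property should be read as specific to the principal components case.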

In formulating the factor analysis problem, the variables to be included in the analysis should be
specified based on past research, theory, and the judgment of the researcher. These variables should
be measured on an interval or ratio scale. Factor analysis is based on a matrix of correlations
between the variables. The appropriateness of the correlation matrix for factor analysis can be
statistically tested.
The two basic approaches to factor analysis are principal components analysis and common factor
analysis. In principal components analysis, the total variance in the data is considered. Principal
components analysis is recommended when the researcher’s primary concern is to determine the
minimum number of factors that will account for maximum variance in the data for use in subsequent
multivariate analysis. In common factor analysis, the factors are estimated based only on the common
variance. This method is appropriate when the primary concern is to identify the underlying
dimensions, and when the common variance is of interest. This method is also known as principal axis
factoring.
The number of factors that should be extracted can be determined a priori or based on eigenvalues,
scree plots, percentage of variance, split-half reliability, or significance tests. Although the initial
or unrotated factor matrix indicates the relationship between the factors and individual variables,
it seldom results in factors that can be interpreted, because the factors are correlated with many
variables. Therefore, rotation is used to transform the factor matrix into a simpler one that is easier
to interpret. The most commonly used method of rotation is the varimax procedure, which results in
orthogonal factors. If the factors are highly correlated in the population, oblique rotation can be
utilized. The rotated factor matrix forms the basis for interpreting the factors.
Factor scores can be computed for each respondent. Alternatively, surrogate variables may be selected
by examining the factor matrix and selecting for each factor a variable with the highest or near
highest loading. The differences between the observed correlations and the reproduced correlations,
as estimated from the factor matrix, can be examined to determine model fit.
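
To make the idea of rotation concrete, the sketch below applies one common SVD-based formulation of varimax rotation to a small loading matrix. It is an illustration under stated assumptions rather than the routine used by SPSS or SAS: it assumes NumPy is available, it omits the Kaiser normalization that packages often apply by default, and the loading values are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Rotate a p x k loading matrix orthogonally toward the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix whose polar factor (u @ vt below) gives the next rotation matrix.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < previous * (1 + tol):      # criterion has stopped improving
            break
        previous = s.sum()
    return loadings @ rotation, rotation

# Hypothetical simple structure: three variables load on each of two factors.
simple = np.array([[0.9, 0.0], [0.7, 0.0], [0.4, 0.0],
                   [0.0, 0.9], [0.0, 0.7], [0.0, 0.4]])
theta = np.deg2rad(20)
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix        # every variable now loads on both factors

rotated, rotation = varimax(unrotated)
print(np.round(np.abs(rotated), 2))
```

Because the rotation matrix is orthogonal, each variable's communality is unchanged by the rotation; only the distribution of the loadings across factors changes, which is what makes the rotated matrix easier to interpret.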

Key Terms and Concepts


factor analysis
interdependence technique
factors
Bartlett’s test of sphericity
correlation matrix
communality
eigenvalue
factor loadings
factor loading plot
factor matrix
factor scores coefficient matrix
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
percentage of variance
residuals
scree plot
principal components analysis
common factor analysis
orthogonal rotation
varimax procedure
oblique rotation
factor scores

Suggested Cases, Video Cases, and HBS Cases


Running Case with Real Data
1.1 Dell

Comprehensive Critical Thinking Cases


2.1 American Idol 2.2 Baskin-Robbins 2.3 Akron Children’s Hospital

Data Analysis Cases with Real Data


3.1 AT&T 3.2 IBM 3.3 Kimberly-Clark

Comprehensive Cases with Real Data


4.1 JPMorgan Chase 4.2 Wendy’s

Video Case
23.1 Marriott

Live Research: Conducting a Marketing Research Project


1. The objectives of factor analysis should be clearly specified.
2. If multicollinearity is a problem, factor analysis can be used to generate uncorrelated factor scores
or to identify a smaller set of the original variables, which can be used in subsequent multivariate
analysis.
3. It is instructive to use different guidelines for determining the number of factors and different
methods of rotation and to examine the effect on the factor solutions.
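
As a small illustration of comparing guidelines for the number of factors, the sketch below applies the eigenvalue criterion and a cumulative percentage of variance criterion to the same set of eigenvalues. The eigenvalues, the function names, and the 60 and 90 percent targets are hypothetical values chosen for the example.

```python
def factors_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Count the factors whose eigenvalue exceeds the cutoff."""
    return sum(1 for value in eigenvalues if value > cutoff)

def factors_by_variance(eigenvalues, target=0.60):
    """Smallest number of factors whose cumulative share of variance reaches target."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += value / total
        if cumulative >= target:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues for six standardized variables (they sum to 6).
eigenvalues = [2.7, 2.2, 0.45, 0.35, 0.2, 0.1]
print(factors_by_eigenvalue(eigenvalues))        # 2
print(factors_by_variance(eigenvalues, 0.60))    # 2
print(factors_by_variance(eigenvalues, 0.90))    # 4
```

Here the two guidelines agree at a 60 percent target but diverge at 90 percent, which is the kind of sensitivity that item 3 suggests examining.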
