Chapter-18: Research Methodology

Cluster analysis is a technique used to group objects based on multiple variables. It can be applied to both numeric and non-numeric data. Cluster analysis is commonly used for market segmentation to split customers into homogeneous groups. It involves calculating the distances or similarities between objects to group them together in clusters with high homogeneity within each cluster but clear differences between clusters. The document provides examples of using cluster analysis for market segmentation, career planning, and segmenting industries.

Uploaded by

Pankaj2c
Copyright
© Attribution Non-Commercial (BY-NC)

CLUSTER ANALYSIS

CHAPTER-18
RESEARCH METHODOLOGY: CONCEPTS AND CASES
DR DEEPAK CHAWLA, DR NEENA SONDHI

SLIDE 18-1
What is cluster analysis?
Cluster analysis is a technique for grouping objects, cases or
entities on the basis of multiple variables. One advantage of
the technique is that it is applicable to both metric and
non-metric data.
A second advantage is that the grouping can be done post hoc, i.e. after the
primary data survey is over. The technique has wide
applications in all branches of management. However, it is
most often used for market segmentation analysis.
SLIDE 18-2
Cluster analysis: basic tenets
Can be used to cluster objects, individuals and entities

Similarity is based on multiple variables

Measures proximity between objects on the study variables

Objects grouped in one cluster are homogeneous relative to
objects in other clusters

Can be conducted on metric, non-metric as well as
mixed data



SLIDE 18-3
Usage of cluster analysis
Market segmentation: customers/potential customers can
be split into smaller, more homogeneous groups by using
the method.
Segmenting industries: the same grouping principle can
be applied to industrial consumers.
Segmenting markets: cities or regions with similar or
common traits can be grouped on the basis of climatic or
socio-economic conditions.
SLIDE 18-4
Usage of cluster analysis

Career planning and training analysis: for human
resource planning, people can be grouped into clusters on
the basis of their education/experience or aptitude and
aspirations.


Segmenting financial sectors/instruments: factors
like raw material cost, financial allocations and seasonality
are used to group sectors together to
understand the growth and performance of a group of
industries.
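Statistics associated with cluster analysis
Metric data analysis

$$d_{ij} = \sqrt{\sum_{k=1}^{3} \left(X_{ik} - X_{jk}\right)^{2}}$$

Where,
d_ij = distance between persons i and j
X_ik = value of object i on variable k
k = variable (interval/ratio scaled); here k = 1, 2, 3
i = object
j = object

The squared Euclidean distance is the same sum without the square root.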
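As a small illustration of the metric distance measure, the sketch below computes the Euclidean distance between two respondents, assuming the usual form (square root of the sum of squared differences over the variables). The function name `euclidean_distance` and the two rating vectors are hypothetical and are not taken from the slides.

```python
import math


def euclidean_distance(x_i, x_j):
    """Distance between two objects measured on the same metric variables."""
    if len(x_i) != len(x_j):
        raise ValueError("both objects need a value on every variable")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))


# Hypothetical ratings of two respondents on three variables (k = 1, 2, 3).
person_i = [5, 3, 4]
person_j = [2, 3, 8]

d_ij = euclidean_distance(person_i, person_j)
print(d_ij)       # 5.0
print(d_ij ** 2)  # 25.0, the squared Euclidean distance
```

Squaring the result gives the squared Euclidean distance, which ranks pairs of objects in the same order as the unsquared version.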
SLIDE 18-5
Statistics associated with cluster analysis
Non-metric data

$$\text{Simple matching coefficient} = \frac{P + N}{P + N + M}$$

$$\text{Jaccard coefficient} = \frac{P}{P + M}$$

Where
P = positive matches
N = negative matches
M = mismatches
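The two association measures can be sketched for binary (0/1) data as follows, assuming the standard definitions: the simple matching coefficient is (P + N) / (P + N + M) and the Jaccard coefficient is P / (P + M). The function names and the two answer profiles are hypothetical, and returning 0.0 when P + M is zero is a convention chosen here rather than something the slides specify.

```python
def match_counts(a, b):
    """Count positive matches (1,1), negative matches (0,0) and mismatches."""
    if len(a) != len(b):
        raise ValueError("both profiles need the same number of attributes")
    p = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    m = len(a) - p - n  # assumes every value is 0 or 1
    return p, n, m


def simple_matching(a, b):
    p, n, m = match_counts(a, b)
    return (p + n) / (p + n + m)


def jaccard(a, b):
    p, _, m = match_counts(a, b)
    # Convention assumed here: no positive answers at all gives similarity 0.
    return p / (p + m) if (p + m) else 0.0


# Hypothetical yes/no answers of two respondents on six attributes.
resp_a = [1, 1, 0, 0, 1, 0]
resp_b = [1, 0, 0, 1, 1, 0]

print(match_counts(resp_a, resp_b))  # (2, 2, 2)
print(jaccard(resp_a, resp_b))       # 0.5
```

The Jaccard coefficient ignores negative matches, so it is lower than the simple matching coefficient whenever two objects share many absent attributes.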

SLIDE 18-6
Statistics associated with cluster analysis
Mixed data
SLIDE 18-7
Key concepts in cluster analysis
SLIDE 18-8

Agglomeration schedule: In hierarchical clustering, a table that records
which objects or clusters are combined at each stage, starting with the
most similar pair and then showing the object or cluster that joins at
each later stage.


ANOVA table: The univariate or one-way ANOVA statistics for each
clustering variable. The higher the F value, the greater the
difference between the clusters on that variable.

Cluster variate: The variables or parameters representing the objects
to be clustered and used to calculate the similarity between objects.

Cluster centroid: The average values of the objects on all the
variables in the cluster variate.
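The centroid and ANOVA definitions above can be illustrated with a short sketch. It computes the centroid of each cluster and the one-way ANOVA F ratio (between-cluster mean square divided by within-cluster mean square) for each clustering variable. The two clusters and their values are hypothetical, as are the function names.

```python
def centroid(rows):
    """Mean of each variable over the objects (rows) of one cluster."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def anova_f(groups):
    """One-way ANOVA F ratio for one variable; groups is a list of value lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical clusters: each row is one object measured on two variables.
cluster_a = [[4, 1], [5, 2], [6, 3]]
cluster_b = [[1, 2], [2, 3], [3, 4]]

print(centroid(cluster_a))  # [5.0, 2.0]
print(centroid(cluster_b))  # [2.0, 3.0]

# F ratio per variable: compare the two clusters one variable at a time.
for v in range(2):
    groups = [[row[v] for row in cluster_a], [row[v] for row in cluster_b]]
    print(v + 1, anova_f(groups))  # variable 1 -> 13.5, variable 2 -> 1.5
```

In this made-up data the first variable has the larger F ratio, so it separates the two clusters more strongly than the second.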

Key concepts in cluster analysis
SLIDE 18-9
Cluster seeds: The initial cluster centres in non-hierarchical clustering,
i.e. the points from which one starts. The clusters are then
created around these seeds.

Cluster membership: This indicates the address or the cluster to which
a particular person/object belongs.

Dendrogram: This is a tree-like diagram that is used to graphically
present the cluster results. The vertical axis represents the objects and
the horizontal axis represents the inter-respondent distance. The figure is to
be read from left to right.

Distances between final cluster centres: These are the distances
between the individual pairs of clusters. A robust solution that is able to
demarcate the groups distinctly is one where the inter-cluster
distance is large; the larger the distance, the more distinct the
clusters.
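To make the ideas of cluster seeds, cluster membership and distances between final cluster centres concrete, here is a minimal sketch of a seed-based non-hierarchical procedure. It uses a k-means style loop, which is an assumption of this example; the slides do not say which non-hierarchical algorithm is intended. The data points, the seeds and the function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_from_seeds(points, seeds, max_iter=100):
    """Assign points to the nearest centre, then move each centre to its mean."""
    centres = [list(s) for s in seeds]
    membership = None
    for _ in range(max_iter):
        new_membership = [
            min(range(len(centres)), key=lambda c: dist(p, centres[c]))
            for p in points
        ]
        if new_membership == membership:
            break  # no object changed cluster, so the solution is stable
        membership = new_membership
        for c in range(len(centres)):
            members = [p for p, m in zip(points, membership) if m == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return membership, centres


# Hypothetical objects on two variables, with two hand-picked seeds.
points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
seeds = [[1, 1], [8, 8]]

membership, centres = cluster_from_seeds(points, seeds)
print(membership)  # [0, 0, 0, 1, 1, 1]
print(centres)     # final cluster centres
print(dist(centres[0], centres[1]))  # distance between the final centres
```

A large distance between the final centres relative to the spread inside each cluster indicates well-separated groups.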

Key concepts in cluster analysis
SLIDE 18-10
Entropy group: The individuals or small groups that do
not seem to fit into any cluster.

Final cluster centres: The mean value of the cluster on
each of the variables that is a part of the cluster variate.

Hierarchical methods: A step-wise process that starts
with the most similar pair and formulates a tree-like
structure composed of separate clusters.

Non-hierarchical methods: Cluster seeds or centres are
the starting points, and individual clusters are built
around them based on some pre-specified distance from the
seeds.
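The step-wise logic of a hierarchical method can be sketched as below. The example uses single linkage (the distance between two clusters is the distance between their closest members), which is only one of several possible linkage rules, and it returns an agglomeration schedule listing what was merged at each stage. The function names and the four data points are hypothetical.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage_schedule(points):
    """Return the schedule as (stage, cluster_a, cluster_b, distance) tuples."""
    clusters = {i: [i] for i in range(len(points))}  # label -> member indices
    schedule = []
    stage = 1
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        clusters[a] = clusters[a] + clusters.pop(b)  # merged cluster keeps label a
        schedule.append((stage, a, b, d))
        stage += 1
    return schedule


# Hypothetical objects; indices 0..3 play the role of case numbers.
points = [[0, 0], [0, 1], [0, 3], [0, 7]]
for row in single_linkage_schedule(points):
    print(row)
# (1, 0, 1, 1.0)
# (2, 0, 2, 2.0)
# (3, 0, 3, 4.0)
```

The merge distances grow from stage to stage; a sudden jump in that column is one common cue for where to stop merging.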

Key concepts in cluster analysis
SLIDE 18-11
Proximity matrix: A data matrix that consists of pair-wise
distances/similarities between the objects. It is an N x N
matrix, where N is the number of objects being clustered.

Summary: The number of cases in each cluster, as reported by
the non-hierarchical clustering method.

Vertical icicle diagram: Quite similar to the dendrogram, it
is a graphical method to demonstrate the composition of
the clusters. The objects are individually displayed at the
top. At any given stage the columns correspond to the
objects being clustered, and the rows correspond to the
number of clusters. An icicle diagram is read from bottom
to top.
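A proximity matrix as defined above can be built in a few lines. The sketch assumes Euclidean distance as the proximity measure; the function name and the three objects are hypothetical.

```python
import math


def proximity_matrix(points):
    """N x N matrix of pair-wise Euclidean distances between the objects."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
            matrix[i][j] = matrix[j][i] = d  # the matrix is symmetric
    return matrix


# Hypothetical objects measured on two variables.
objects = [[1, 2], [4, 6], [1, 6]]
for row in proximity_matrix(objects):
    print(row)
# [0.0, 5.0, 4.0]
# [5.0, 0.0, 3.0]
# [4.0, 3.0, 0.0]
```

The diagonal is zero because every object is at distance zero from itself, and only the N(N-1)/2 entries above the diagonal carry distinct information.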

Cluster analysis process
SLIDE 18-12

Stage 1: RESEARCH OBJECTIVES
  Exploratory versus confirmatory objectives
  Select variables used to cluster objects

Stage 2: CLUSTER ASSUMPTIONS
  Are the cluster variables metric or non-metric?
  Metric data: distance measures of similarity (squared Euclidean distance)
  Nonmetric data: association measures of similarity (matching coefficients)

Stage 3: CLUSTERING ALGORITHM
  Is a hierarchical, nonhierarchical, or combination of the two methods used?
  Hierarchical methods: single linkage, complete linkage, average linkage, Ward's method, centroid method
  Nonhierarchical methods: sequential threshold, parallel threshold, optimization
  Combination: use a hierarchical method to specify cluster seeds for a nonhierarchical method
  Two-step cluster

Stage 4: NUMBER OF CLUSTERS
  Hierarchical methods
  Examine dendrogram
  Cluster membership
  Conceptual consideration

Stage 5: INTERPRETING THE CLUSTERS
  Examine cluster variables
  Name clusters

Stage 6: VALIDATING AND PROFILING THE CLUSTERS
  Validation
  Profiling


Illustration : Nano study
SLIDE 18-13
[Figure: dendrogram for the Nano study. Horizontal axis: inter-respondent distance at which clusters combine, scaled from 0 to 25. Vertical axis: the 25 cases, listed from top to bottom as 18, 25, 7, 13, 11, 21, 6, 3, 8, 5, 10, 17, 22, 15, 2, 16, 20, 12, 19, 14, 9, 24, 1, 23, 4.]

Illustration: Nano study
SLIDE 18-14
ANOVA

Statement | F | Sig.
I think in India we have been able to achieve technological standard of high order | 39.036 | .000
I prefer to buy things made in India | 44.896 | .000
I usually buy things which provide value for money | 53.716 | .000
Convenience is more important than style | 65.008 | .000
I do not like wasteful expenditure | 92.103 | .000
When it comes to safety I believe there should be no compromises. | 50.579 | .000
I'm a "saver" rather than a "spender." | 23.468 | .000
I like to try new and different things. | 164.223 | .000
I always want to be a part of changing world | 96.749 | .000


Illustration : Nano study
SLIDE 18-15
Cluster centroids for Nano sample survey

Statement | Cluster 1 | Cluster 2 | Cluster 3
I think in India we have been able to achieve technological standard of high order | 2.17 | 2.00 | 4.40
I prefer to buy things made in India | 1.67 | 2.22 | 4.70
I usually buy things which provide value for money | 4.67 | 1.44 | 2.70
Convenience is more important than style | 4.67 | 1.78 | 2.10
I do not like wasteful expenditure | 4.33 | 1.00 | 2.80
When it comes to safety I believe there should be no compromises. | 4.67 | 1.22 | 2.60
I'm a "saver" rather than a "spender." | 4.17 | 1.00 | 2.60
I like to try new and different things. | 1.50 | 4.78 | 1.20
I always want to be a part of changing world | 1.33 | 4.33 | 1.40
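The centroid table can be put to work in a short sketch: given the three Nano cluster centroids, it finds the nearest cluster for a respondent and computes the distances between the final cluster centres. The centroid values are copied from the table above; the respondent's ratings, the use of Euclidean distance and the function names are assumptions made for illustration.

```python
import math

# Cluster centroids from the Nano sample survey table (nine statements, in order).
CENTROIDS = {
    1: [2.17, 1.67, 4.67, 4.67, 4.33, 4.67, 4.17, 1.50, 1.33],
    2: [2.00, 2.22, 1.44, 1.78, 1.00, 1.22, 1.00, 4.78, 4.33],
    3: [4.40, 4.70, 2.70, 2.10, 2.80, 2.60, 2.60, 1.20, 1.40],
}


def dist(a, b):
    """Euclidean distance between two rating profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(ratings):
    """Cluster whose centroid lies closest to the respondent's ratings."""
    return min(CENTROIDS, key=lambda c: dist(ratings, CENTROIDS[c]))


def centre_distances():
    """Distance between every pair of final cluster centres."""
    labels = sorted(CENTROIDS)
    return {(a, b): dist(CENTROIDS[a], CENTROIDS[b])
            for i, a in enumerate(labels) for b in labels[i + 1:]}


# Hypothetical respondent: values thrift and safety, little interest in novelty.
new_respondent = [2, 2, 5, 5, 4, 5, 4, 1, 1]
print(nearest_cluster(new_respondent))  # 1

for pair, d in centre_distances().items():
    print(pair, round(d, 2))
```

The respondent lands in cluster 1 because the profile is high on the value-for-money, safety and saving statements, which is where that cluster's centroid is highest.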


Illustration: Nano study
SLIDE 18-16
Cluster summary: Nano sample survey (number of cases in each cluster)

Cluster 1 (cautious consumer) | 6.000
Cluster 2 (innovative consumer) | 9.000
Cluster 3 (patriotic consumer) | 10.000
Valid | 25.000
Missing | .000


Validating the cluster solution
Use two-step clustering to measure the stability of
the obtained solution.

Split the data into two halves, cluster each half separately
and compare the resulting cluster centroids.

Use subjective judgment to evaluate both the group
formation and the usefulness of the clusters for managerial
decisions.
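The split-half check can be sketched as follows. The example splits the cases into two halves, clusters each half from the same seeds with a k-means style loop (an assumed algorithm, not one named in the slides), and reports how far apart the matching centroids are. Small gaps suggest a stable solution. All data, seeds and function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_from_seeds(points, seeds, max_iter=100):
    """k-means style clustering that starts from the given seeds."""
    centres = [list(s) for s in seeds]
    membership = None
    for _ in range(max_iter):
        new_membership = [
            min(range(len(centres)), key=lambda c: dist(p, centres[c]))
            for p in points
        ]
        if new_membership == membership:
            break
        membership = new_membership
        for c in range(len(centres)):
            members = [p for p, m in zip(points, membership) if m == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return membership, centres


def split_half_gaps(points, seeds):
    """Distance between matching centroids obtained from the two halves."""
    half_a, half_b = points[0::2], points[1::2]  # alternate cases into halves
    _, centres_a = cluster_from_seeds(half_a, seeds)
    _, centres_b = cluster_from_seeds(half_b, seeds)
    # Using the same seeds for both halves keeps the cluster labels aligned.
    return [dist(ca, cb) for ca, cb in zip(centres_a, centres_b)]


# Hypothetical cases on two variables and two hand-picked seeds.
cases = [[1, 1], [1, 2], [2, 1], [2, 2], [8, 8], [8, 9], [9, 8], [9, 9]]
seeds = [[1, 1], [9, 9]]
print(split_half_gaps(cases, seeds))  # [1.0, 1.0]
```

In practice the halves would be drawn at random rather than by alternating cases, and the gaps would be judged against the distances between the clusters themselves.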
SLIDE 18-17
END OF CHAPTER