Cluster Analysis

Cluster analysis is a statistical technique used to group similar data points together based on certain characteristics. It aims to find patterns within a dataset by identifying clusters with data points that are more similar to each other than points in other clusters. There are four main types of cluster analysis: hierarchical, centroid-based, distribution-based, and density-based. Cluster analysis is useful for understanding large, unstructured datasets, detecting outliers and anomalies, and has applications in marketing, business operations, earth observation, and data science.

Uploaded by

Jhonemar Tejano


Cluster analysis

- Cluster analysis is a statistical technique used in data analysis to group similar data points together
based on certain characteristics or attributes.

- It is a core task in data mining and a common technique for statistical data analysis, and it is used in
many fields such as data compression, machine learning, pattern recognition, and information retrieval.

The goal is to find patterns or structure within a dataset by identifying clusters of data points that are
more similar to each other than to those in other clusters.

Clusters should exhibit high internal homogeneity and high external heterogeneity.

What does this mean?

When plotted geometrically, objects within the same cluster should lie close together, and different
clusters should lie far apart.
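
The idea of "close together inside, far apart outside" can be checked numerically. The following is a minimal sketch, assuming two small hand-made clusters of 2-D points and Euclidean distance; the data and the function names are illustrative and not part of the original notes.

```python
from itertools import combinations
from math import dist

# Two hypothetical clusters of 2-D points (illustrative data only).
cluster_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
cluster_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]

def mean_within_distance(cluster):
    """Average distance between all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

def mean_between_distance(first, second):
    """Average distance between points taken from two different clusters."""
    total = sum(dist(p, q) for p in first for q in second)
    return total / (len(first) * len(second))

# Homogeneity: small distances inside each cluster.
within = max(mean_within_distance(cluster_a), mean_within_distance(cluster_b))
# Heterogeneity: large distances between the clusters.
between = mean_between_distance(cluster_a, cluster_b)
print(f"within: {within:.2f}, between: {between:.2f}")
```

A good clustering gives a small within-cluster value and a large between-cluster value.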

Cluster Analysis as a Stand-Alone Tool

* Marketing: In marketing, cluster analysis can be used to segregate customers into different buckets
based on their buying patterns or interests. These are known as customer personas. Organizations then
use different marketing strategies for different clusters of customers.

* Risk Analysis in Finance: Financial organizations use various cluster analysis algorithms for segregating
their customers into various risk categories based on their bank balance and debt. While approving
loans, insurance, or credit cards, these clusters are used to aid in decision-making.

* Real Estate: Infrastructure specialists use clustering to group houses according to their size, location,
and market value. This information is used to assess the real estate potential of different parts of a city.

4 Major Types of Cluster Analysis

Hierarchical Cluster Analysis

In the agglomerative method, each object starts as its own cluster. At each step, the two most similar
(closest) clusters are merged into one, and this process is repeated until all objects are in a single
cluster. In other words, agglomerative clustering starts with single objects and groups them into
progressively larger clusters.

The divisive method is another kind of hierarchical method, in which clustering starts with the complete
data set as one cluster and then recursively divides it into smaller partitions.
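
The agglomerative procedure can be sketched in a few lines. This is only an illustration under stated assumptions: it uses single linkage (the distance between two clusters is the distance between their closest points) and Euclidean distance, neither of which is specified in the notes, and the sample points and the `target_clusters` parameter are made up.

```python
from math import dist

def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points."""
    return min(dist(p, q) for p in a for q in b)

def agglomerative(points, target_clusters=1):
    """Merge the two closest clusters repeatedly until target_clusters remain."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    merges = []                       # record of merged pairs, in order
    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        del clusters[j]  # delete j first (j > i) so index i stays valid
        del clusters[i]
        clusters.append(merged)
    return clusters, merges

points = [(1.0, 1.0), (1.2, 1.1), (5.0, 5.0), (5.3, 5.2), (9.0, 1.0)]
clusters, merges = agglomerative(points, target_clusters=3)
print(clusters)
```

Stopping at `target_clusters=1` reproduces the full hierarchy, and the `merges` list records the order in which clusters were joined.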

Centroid-based Clustering

In this type of clustering, each cluster is represented by a central entity, which may or may not be a part
of the given data set. K-means is the best-known method of this type: k is the number of clusters, each
cluster has a center, and every object is assigned to the nearest cluster center.

Distribution-based Clustering

It is a type of clustering closely related to statistics and based on probability distribution models.
Objects that most likely belong to the same distribution are put into a single cluster. This type of
clustering can capture some complex properties of objects, like correlation and dependence between attributes.
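
As a rough illustration of assigning objects to distributions, the sketch below places each value in the cluster whose normal distribution gives it the highest density. The two components and their parameters are hypothetical and assumed to be known already; in practice they are usually estimated from the data, for example by fitting a Gaussian mixture model.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * sqrt(2 * pi))

# Hypothetical, already-known component distributions: name -> (mean, std).
# Real methods estimate these parameters from the data.
components = {"low": (2.0, 1.0), "high": (10.0, 2.0)}

def assign(x):
    """Return the name of the component under which x is most likely."""
    return max(components, key=lambda name: normal_pdf(x, *components[name]))

values = [1.5, 2.7, 4.0, 7.5, 9.0, 12.0]
labels = {x: assign(x) for x in values}
print(labels)
```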

Density-based Clustering

In this type of clustering, clusters are defined as areas of higher density than the rest of the data set.
Sparse areas are what separate one cluster from another. The objects in these sparse areas are usually
treated as noise or border points. The most popular method in this type of clustering is DBSCAN.
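
The following is a simplified sketch of the DBSCAN idea, not a reference implementation. It assumes Euclidean distance and uses a brute-force neighbor search; the parameter names `eps` (neighborhood radius) and `min_pts` (minimum number of neighbors for a dense point) follow common usage, and the sample points are made up.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE           # may later turn out to be a border point
            continue
        labels[i] = cluster_id          # i is a core point: start a new cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue                # already handled
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels

points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3),
          (5.0, 5.0), (5.2, 5.1), (4.9, 5.3),
          (9.0, 9.0)]
labels = dbscan(points, eps=0.6, min_pts=3)
print(labels)
```

The isolated point at (9.0, 9.0) lies in a sparse area, so it is labeled as noise instead of being forced into a cluster.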

Example of Cluster Analysis

The following example shows you how to use the centroid-based clustering algorithm to cluster 30
different points into five groups. You can plot points on a two-dimensional graph, as shown in the
graphs below.

On the left, we have a random distribution of the 30 points. The first iteration of the K-means clustering
divides the points into five groups, with each cluster represented by a different color, as shown in the
center graph.

The algorithm then iteratively reassigns points from one cluster to another and recomputes the cluster
centers until the assignments stop changing. The end result is five distinct clusters, as shown in the graph on the right.
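
The example can be reproduced with a short K-means sketch. It assumes Euclidean distance, random data points as the initial centers, and randomly generated coordinates; the actual points from the graphs are not available, so the data here is made up.

```python
import random
from math import dist

def k_means(points, k, max_iterations=100, seed=0):
    """Alternate assignment and center updates until assignments are stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: k distinct data points
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break                    # no point changed cluster: stop
        assignment = new_assignment
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:              # keep the old center if a cluster is empty
                centers[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return assignment, centers

# 30 random 2-D points clustered into five groups (coordinates are made up).
rng = random.Random(42)
points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
assignment, centers = k_means(points, k=5)
print(assignment)
```

The result depends on the initial centers, so K-means is commonly run several times with different seeds and the best grouping is kept.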

When Is Cluster Analysis Useful?

Cluster analysis helps us understand data and detect patterns. In some cases, it provides a good
starting point for further analysis. In other cases, it is itself the main source of insight into the data.
Here are some cases when cluster analysis is useful:

* If you have large and unstructured data sets, it can be expensive and time-consuming to
label groups manually. In this case, cluster analysis provides a practical way to divide your data into
groups.

* When you don’t know the number of clusters in advance, cluster analysis can provide
the first insight into groups that are available in your data set.

* When you need to detect outliers in your data, cluster analysis can be an effective
alternative to traditional outlier detection methods, such as thresholds based on standard deviation.

* Cluster analysis can help you detect anomalies. Outliers are observations distant
from the mean, but they don’t necessarily represent abnormalities. Anomalies, on the other hand, are
rare events or observations that deviate greatly from the normal pattern of the data.
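
The difference between a standard-deviation rule and a cluster-based rule can be seen on a small, made-up data set. The sketch below assumes that the cluster centers are already known from an earlier clustering step and that the distance threshold was chosen by hand; both are illustrative assumptions.

```python
from statistics import mean, pstdev

# Hypothetical 1-D data: two tight groups plus one value stranded between them.
values = [1.0, 2.0, 3.0, 50.0, 98.0, 99.0, 100.0]

# Standard-deviation rule: flag values more than 2 standard deviations
# away from the overall mean.
mu, sigma = mean(values), pstdev(values)
std_outliers = [v for v in values if abs(v - mu) > 2 * sigma]

# Cluster-based rule: flag values that are far from every cluster center.
centers = [2.0, 99.0]  # assumed to come from an earlier clustering step
max_distance = 10.0    # illustrative threshold, chosen by hand
cluster_outliers = [
    v for v in values if min(abs(v - c) for c in centers) > max_distance
]

print("standard deviation rule:", std_outliers)
print("cluster-based rule:", cluster_outliers)
```

The value 50.0 sits almost exactly at the overall mean, so the standard-deviation rule does not flag it, while the cluster-based rule does because it belongs to neither group.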

Applications of Cluster Analysis

Cluster analysis has applications in many disparate industries and fields. Here’s a list of some disciplines
that make use of this methodology.

* Marketing: Cluster analysis is popular in marketing, especially in customer segmentation.
This method of analysis helps to both target customer segments and perform sales
analysis by groups.

* Business Operations: Businesses can optimize their processes and reduce costs by
analyzing clusters and identifying similarities and differences between data points. For example, you can
identify patterns in customer data and improve customer support processes for a particular group that
may require special attention.

* Earth Observation: Using a clustering algorithm, you can create a pixel mask for objects
in an image. For example, you can use image segmentation to classify vegetation or built-up areas in a
satellite image.

* Data Science: We can use cluster analysis for predictive analytics. By applying machine
learning techniques to clusters, we can create predictive models to make inferences about a particular
data set.
