Data Mining: Hierarchical Clustering, DBSCAN The EM Algorithm

Hierarchical clustering is an unsupervised machine learning algorithm that groups similar objects into clusters. There are two main types: agglomerative, which starts with each object as a separate cluster and merges clusters step by step, and divisive, which starts with all objects in one cluster and splits clusters step by step. Agglomerative clustering is more popular: it iteratively merges the closest pair of clusters, according to a defined proximity measure, until only one cluster remains, producing a hierarchical tree structure called a dendrogram. Different linkage criteria, such as single, complete, and average linkage, can be used to calculate the distance between clusters during the merging process.

Uploaded by

Arul Kumar Venugopal


DATA MINING

LECTURE 7
Hierarchical Clustering, DBSCAN
The EM Algorithm
CLUSTERING
What is a Clustering?
• In general a grouping of objects such that the objects in a
group (cluster) are similar (or related) to one another and
different from (or unrelated to) the objects in other groups

• Intra-cluster distances are minimized
• Inter-cluster distances are maximized
Clustering Algorithms
• K-means and its variants

• Hierarchical clustering

• DBSCAN
HIERARCHICAL
CLUSTERING
Hierarchical Clustering
• Two main types of hierarchical clustering
• Agglomerative:
• Start with the points as individual clusters
• At each step, merge the closest pair of clusters until only one cluster (or k
clusters) left

• Divisive:
• Start with one, all-inclusive cluster
• At each step, split a cluster until each cluster contains a point (or there are
k clusters)

• Traditional hierarchical algorithms use a similarity or distance matrix
• Merge or split one cluster at a time
Hierarchical Clustering
• Produces a set of nested clusters organized as a
hierarchical tree
• Can be visualized as a dendrogram
• A tree like diagram that records the sequences of
merges or splits
[Figure: six points grouped into nested clusters (left) and the corresponding dendrogram (right), with leaves ordered 1 3 2 5 4 6]
Strengths of Hierarchical Clustering
• Do not have to assume any particular number of
clusters
• Any desired number of clusters can be obtained by
‘cutting’ the dendrogram at the proper level

• They may correspond to meaningful taxonomies

• Example in biological sciences (e.g., animal kingdom,
phylogeny reconstruction, …)
Agglomerative Clustering Algorithm
• More popular hierarchical clustering technique

• Basic algorithm is straightforward

1. Compute the proximity matrix
2. Let each data point be a cluster
3. Repeat
4. Merge the two closest clusters
5. Update the proximity matrix
6. Until only a single cluster remains
• Key operation is the computation of the proximity
of two clusters
• Different approaches to defining the distance between
clusters distinguish the different algorithms
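The basic algorithm above can be sketched in a few lines of Python. This is a minimal, naive sketch rather than an efficient implementation: instead of updating a stored cluster-level proximity matrix, it recomputes each cluster-to-cluster distance from the point-level matrix, which gives the same merges. The function name `agglomerative`, the `linkage` parameter, and the four sample points are illustrative assumptions, not part of the slides.

```python
from itertools import combinations

def agglomerative(dist, linkage=min):
    """Naive agglomerative clustering on a full point-to-point distance matrix.

    dist:    square list of lists, dist[i][j] = distance between points i and j
    linkage: reduces the pairwise distances between two clusters to one number
             (min = single link, max = complete link)
    Returns the merges in order, as (cluster_a, cluster_b, distance).
    """
    def cluster_distance(a, b):
        return linkage([dist[i][j] for i in a for j in b])

    clusters = [(i,) for i in range(len(dist))]   # every point starts as a cluster
    merges = []
    while len(clusters) > 1:                      # repeat until one cluster remains
        # Find the two closest clusters under the chosen linkage.
        a, b = min(combinations(clusters, 2), key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)                    # merge them
    return merges

# Illustrative data (not from the slides): four points on a line.
points = [0.0, 1.0, 5.0, 7.0]
dist = [[abs(p - q) for q in points] for p in points]   # proximity matrix

single = agglomerative(dist, linkage=min)
complete = agglomerative(dist, linkage=max)
print([d for _, _, d in single])     # merge distances under single link
print([d for _, _, d in complete])   # merge distances under complete link
```

Only the last merge differs between the two runs, because the choice of linkage only matters once clusters contain more than one point.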
Starting Situation
• Start with clusters of individual points and a
proximity matrix
[Figure: points p1, p2, ..., p12 as individual clusters, and the proximity matrix with one row and one column per point]
Intermediate Situation
• After some merging steps, we have some clusters
[Figure: clusters C1 to C5 over the points p1, ..., p12, and the proximity matrix with one row and one column per cluster]
Intermediate Situation
• We want to merge the two closest clusters (C2 and C5) and
update the proximity matrix.
[Figure: clusters C1 to C5, with C2 and C5 the closest pair, and the current proximity matrix]
After Merging
• The question is “How do we update the proximity matrix?”
[Figure: clusters C1, C2 U C5, C3, C4, and the proximity matrix in which the row and column of C2 U C5 are filled with question marks]
How to Define Inter-Cluster Similarity
[Figure: two clusters of points next to the proximity matrix over p1, p2, ...: how similar are the two clusters?]
• MIN
• MAX
• Group Average
• Distance Between Centroids
• Other methods driven by an objective function
– Ward’s Method uses squared error
Single Link – Complete Link
• Another way to view the processing of the
hierarchical algorithm is that we create links
between their elements in order of increasing
distance
• The MIN – Single Link, will merge two clusters when a
single pair of elements is linked
• The MAX – Complete Linkage will merge two clusters
when all pairs of elements have been linked.
Hierarchical Clustering: MIN
Distance matrix:

      1    2    3    4    5    6
1     0   .24  .22  .37  .34  .23
2    .24   0   .15  .20  .14  .25
3    .22  .15   0   .15  .28  .11
4    .37  .20  .15   0   .29  .22
5    .34  .14  .28  .29   0   .39
6    .23  .25  .11  .22  .39   0

[Figure: nested clusters and dendrogram for MIN; dendrogram leaves ordered 3 6 2 5 4 1]
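The single-link view (add links in order of increasing distance, and merge two clusters as soon as one link connects them) can be checked against the distance matrix of this example. The sketch below is one possible way to do it, using a union-find structure to track clusters; the function name `single_link_merges` and the union-find approach are assumptions for illustration, not something the slides prescribe.

```python
# Distance matrix from the slide (points are numbered 1..6).
D = [
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.15, 0.28, 0.11],
    [0.37, 0.20, 0.15, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
]

def single_link_merges(dist):
    """Return (distance, point_i, point_j) for every link that merges two clusters."""
    n = len(dist)
    parent = list(range(n))                 # union-find forest: one tree per cluster

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    # All pairs of points, in order of increasing distance.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    merges = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:                # this link joins two different clusters
            parent[root_i] = root_j
            merges.append((d, i + 1, j + 1))   # 1-based labels, as on the slide
    return merges

merges = single_link_merges(D)
for d, i, j in merges:
    print(f"link {i}-{j} at distance {d:.2f}")
```

The first two links join points 3 and 6, then 2 and 5, which agrees with the leaf order of the dendrogram.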
Strength of MIN

[Figure: original points (left) and the two clusters found (right)]

• Can handle non-elliptical shapes

Limitations of MIN

[Figure: original points (left) and the two clusters found (right)]

• Sensitive to noise and outliers

Hierarchical Clustering: MAX
Distance matrix:

      1    2    3    4    5    6
1     0   .24  .22  .37  .34  .23
2    .24   0   .15  .20  .14  .25
3    .22  .15   0   .15  .28  .11
4    .37  .20  .15   0   .29  .22
5    .34  .14  .28  .29   0   .39
6    .23  .25  .11  .22  .39   0

[Figure: nested clusters and dendrogram for MAX; dendrogram leaves ordered 3 6 4 1 2 5]
Strength of MAX

[Figure: original points (left) and the two clusters found (right)]

• Less susceptible to noise and outliers

Limitations of MAX

[Figure: original points (left) and the two clusters found (right)]

• Tends to break large clusters

• Biased towards globular clusters
Cluster Similarity: Group Average
• Proximity of two clusters is the average of pairwise proximity
between points in the two clusters.

$$\text{proximity}(\text{Cluster}_i, \text{Cluster}_j) = \frac{\sum_{p_i \in \text{Cluster}_i,\; p_j \in \text{Cluster}_j} \text{proximity}(p_i, p_j)}{|\text{Cluster}_i| \cdot |\text{Cluster}_j|}$$

• Need to use average connectivity for scalability since total proximity favors large clusters
1 2 3 4 5 6
1 0 .24 .22 .37 .34 .23
2 .24 0 .15 .20 .14 .25
3 .22 .15 0 .15 .28 .11
4 .37 .20 .15 0 .29 .22
5 .34 .14 .28 .29 0 .39
6 .23 .25 .11 .22 .39 0
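As a worked instance of the group-average definition, the following sketch evaluates it on the distance matrix above. The helper name `group_average` and the choice of clusters {3, 6}, {2, 5}, and {4} are illustrative assumptions; indices in the code are 0-based, so point 3 is index 2.

```python
# Distance matrix from the slide (row/column k corresponds to point k + 1).
D = [
    [0.00, 0.24, 0.22, 0.37, 0.34, 0.23],
    [0.24, 0.00, 0.15, 0.20, 0.14, 0.25],
    [0.22, 0.15, 0.00, 0.15, 0.28, 0.11],
    [0.37, 0.20, 0.15, 0.00, 0.29, 0.22],
    [0.34, 0.14, 0.28, 0.29, 0.00, 0.39],
    [0.23, 0.25, 0.11, 0.22, 0.39, 0.00],
]

def group_average(dist, cluster_a, cluster_b):
    """Average of all pairwise distances between points of the two clusters."""
    total = sum(dist[i][j] for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))

c36 = (2, 5)   # points 3 and 6
c25 = (1, 4)   # points 2 and 5
c4 = (3,)      # point 4

avg_36_25 = group_average(D, c36, c25)   # (.15 + .28 + .25 + .39) / 4
avg_36_4 = group_average(D, c36, c4)     # (.15 + .22) / 2
print(round(avg_36_25, 4), round(avg_36_4, 4))
```

Under this measure the cluster {3, 6} is closer to point 4 than to the cluster {2, 5}.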
Hierarchical Clustering: Group Average
[Figure: nested clusters and dendrogram for group average on the distance matrix above; dendrogram leaves ordered 3 6 4 1 2 5]
Hierarchical Clustering: Group Average
• Compromise between Single and
Complete Link

• Strengths
• Less susceptible to noise and outliers

• Limitations
• Biased towards globular clusters
Cluster Similarity: Ward’s Method
• Similarity of two clusters is based on the increase in
squared error (SSE) when two clusters are merged
• Similar to group average if distance between points is
distance squared

• Less susceptible to noise and outliers

• Biased towards globular clusters

• Hierarchical analogue of K-means

• Can be used to initialize K-means
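To make "increase in squared error" concrete, the sketch below computes the SSE increase of a merge in two ways: directly from the definition, and with the standard closed form |A||B| / (|A| + |B|) times the squared distance between the two centroids. The function names and the sample points are assumptions chosen for illustration; the slides do not give code or data for Ward's method.

```python
def centroid(cluster):
    """Coordinate-wise mean of a list of points (tuples of equal length)."""
    dim = len(cluster[0])
    return tuple(sum(p[k] for p in cluster) / len(cluster) for k in range(dim))

def sse(cluster):
    """Sum of squared Euclidean distances from each point to the cluster centroid."""
    c = centroid(cluster)
    return sum(sum((p[k] - c[k]) ** 2 for k in range(len(c))) for p in cluster)

def ward_increase(a, b):
    """Increase in total SSE caused by merging clusters a and b (from the definition)."""
    return sse(a + b) - sse(a) - sse(b)

def ward_increase_closed_form(a, b):
    """Same quantity via |a||b| / (|a| + |b|) * squared distance between centroids."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance

# Illustrative clusters (not from the slides).
a = [(0.0, 0.0), (2.0, 0.0)]
b = [(5.0, 1.0), (5.0, 3.0), (8.0, 2.0)]

print(ward_increase(a, b), ward_increase_closed_form(a, b))
```

Ward's method merges, at each step, the pair of clusters for which this increase is smallest.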
Hierarchical Clustering: Comparison
[Figure: nested clusters found on the same six points by MIN, MAX, Ward’s Method, and Group Average]
Hierarchical Clustering:
Time and Space requirements
• $O(N^2)$ space since it uses the proximity matrix.
• N is the number of points.

• $O(N^3)$ time in many cases

• There are N steps and at each step the proximity matrix, of size $N^2$, must be updated and searched
• Complexity can be reduced to $O(N^2 \log N)$ time for some approaches
Hierarchical Clustering:
Problems and Limitations
• Computational complexity in time and space

• Once a decision is made to combine two clusters, it cannot be undone

• No objective function is directly minimized

• Different schemes have problems with one or more of the following:
• Sensitivity to noise and outliers
• Difficulty handling different sized clusters and convex shapes
• Breaking large clusters
DBSCAN
DBSCAN: Density-Based Clustering
• DBSCAN is a Density-Based Clustering algorithm

• Reminder: In density-based clustering we partition points into dense regions separated by not-so-dense regions.

• Important Questions:
• How do we measure density?
• What is a dense region?

• DBSCAN:
• Density at point p: number of points within a circle of radius Eps
• Dense Region: A circle of radius Eps that contains at least MinPts
points
DBSCAN
• Characterization of points
• A point is a core point if it has at least a specified
number of points (MinPts) within Eps
• These points belong in a dense region and are at the interior
of a cluster

• A border point has fewer than MinPts within Eps, but is in the neighborhood of a core point.

• A noise point is any point that is not a core point or a border point.
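The three point types can be computed directly from these definitions. The sketch below is a minimal illustration under two assumptions that the slides leave open: a point counts as its own neighbor, and "core" means at least MinPts points within Eps (the convention used in the dense-region definition). The function name `classify_points` and the sample data are hypothetical.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'."""
    # Eps-neighborhood of every point; the point itself is included (assumption).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = []
    for i in range(len(points)):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in neighbors[i]):
            labels.append("border")    # not dense itself, but next to a core point
        else:
            labels.append("noise")
    return labels

# Illustrative data: a unit square, one point hanging off it, one far-away point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
labels = classify_points(points, eps=1.5, min_pts=4)
print(labels)
```

The quadratic neighborhood search is only suitable for small inputs; practical implementations typically use a spatial index.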
DBSCAN: Core, Border, and Noise Points
[Figure: original points (left) and the point types core, border, and noise (right), for Eps = 10, MinPts = 4]

Density-Connected points
• Density edge
• We place an edge between two core points q and p if they are within distance Eps.

• Density-connected
• A point p is density-connected to a point q if there is a path of edges from p to q
DBSCAN Algorithm
• Label points as core, border and noise
• Eliminate noise points
• For every core point p that has not been assigned
to a cluster
• Create a new cluster with the point p and all the
points that are density-connected to p.
• Assign border points to the cluster of the closest
core point.
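The steps above translate into a short program. The following is a sketch under stated assumptions, not a reference implementation: a point counts as its own neighbor, "core" means at least MinPts neighbors, noise is marked with the label -1, and neighborhoods are found by brute force. The function name `dbscan` and the sample points are illustrative.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Return one cluster id per point; -1 marks noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    core = [len(nb) >= min_pts for nb in neighbors]
    labels = [-1] * n                      # everything is noise until assigned
    cluster_id = 0
    for p in range(n):
        if not core[p] or labels[p] != -1:
            continue
        # New cluster: p plus every core point density-connected to p.
        labels[p] = cluster_id
        stack = [p]
        while stack:
            q = stack.pop()
            for r in neighbors[q]:
                if core[r] and labels[r] == -1:
                    labels[r] = cluster_id
                    stack.append(r)
        cluster_id += 1
    # Border points join the cluster of their closest core point.
    for b in range(n):
        if core[b]:
            continue
        core_neighbors = [j for j in neighbors[b] if core[j]]
        if core_neighbors:
            nearest = min(core_neighbors, key=lambda j: dist(points[b], points[j]))
            labels[b] = labels[nearest]
    return labels

# Illustrative data: two dense groups, one border point, one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=4)
print(labels)
```

Points that are neither core nor within Eps of a core point keep the label -1, which corresponds to eliminating the noise points.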
DBSCAN: Determining Eps and MinPts
• Idea is that for points in a cluster, their kth nearest neighbors are
at roughly the same distance
• Noise points have the kth nearest neighbor at farther distance
• So, plot sorted distance of every point to its kth nearest neighbor
• Find the distance d where there is a “knee” in the curve
• Eps = d, MinPts = k

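[Figure: sorted distance of every point to its 4th nearest neighbor; the knee of the curve suggests Eps ~ 7-10 for MinPts = 4]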
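The curve described above is easy to compute. The sketch below returns the sorted k-th nearest neighbor distances; locating the knee is left to visual inspection of a plot, as on the slide. The function name `sorted_k_distances` and the sample points are assumptions for illustration, and the brute-force search is only meant for small data sets.

```python
from math import dist

def sorted_k_distances(points, k):
    """Distance from every point to its k-th nearest neighbor, sorted ascending."""
    k_distances = []
    for i, p in enumerate(points):
        # Distances to all other points (the point itself is excluded).
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        k_distances.append(others[k - 1])
    return sorted(k_distances)

# Illustrative data: a tight square of four points and one far-away outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
curve = sorted_k_distances(points, k=2)
print(curve)
```

In this toy example the cluster points share the same small value and the outlier produces the jump at the end of the curve; a value just below the jump would be a candidate for Eps.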
When DBSCAN Works Well

[Figure: original points and the clusters found by DBSCAN]

• Resistant to Noise
• Can handle clusters of different shapes and sizes
When DBSCAN Does NOT Work Well

[Figure: original points and the clusterings found with (MinPts=4, Eps=9.75) and (MinPts=4, Eps=9.92)]

• Varying densities
• High-dimensional data
DBSCAN: Sensitive to Parameters
Other algorithms
• PAM, CLARANS: Solutions for the k-medoids problem
• BIRCH: Constructs a hierarchical tree that acts as a summary
of the data, and then clusters the leaves.
• MST: Clustering using the Minimum Spanning Tree.
• ROCK: Clustering categorical data by neighbor and link
analysis
• LIMBO, COOLCAT: Clustering categorical data using
information theoretic tools.
• CURE: Hierarchical algorithm that uses a different representation
of the cluster
• CHAMELEON: Hierarchical algorithm that uses closeness and
interconnectivity for merging
MIXTURE MODELS AND
THE EM ALGORITHM
Model-based clustering
• In order to understand our data, we will assume that there is
a generative process (a model) that creates/describes the
data, and we will try to find the model that best fits the data.
• Models of different complexity can be defined, but we will assume
that our model is a distribution from which data points are sampled
• Example: the data is the height of all people in Greece

• In most cases, a single distribution is not good enough to describe all data points: different parts of the data follow a
different distribution
• Example: the data is the height of all people in Greece and China
• We need a mixture model
• Different distributions correspond to different clusters in the data.
Gaussian Distribution
• Example: the data is the height of all people in
Greece
• Experience has shown that this data follows a Gaussian
(Normal) distribution
• Reminder: Normal distribution:

$$P(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

• $\mu$ = mean, $\sigma$ = standard deviation

Gaussian Model
• What is a model?
• A Gaussian distribution is fully defined by the mean and
the standard deviation
• We define our model as the pair of parameters $\theta = (\mu, \sigma)$

• This is a general principle: a model is defined as a vector of parameters
Fitting the model
• We want to find the normal distribution that best
fits our data
• Find the best values for $\mu$ and $\sigma$
• But what does best fit mean?
Maximum Likelihood Estimation (MLE)
• Suppose that we have a vector of values $X = (x_1, \dots, x_n)$
• And we want to fit a Gaussian model $N(\mu, \sigma)$ to the data
• Probability of observing point $x_i$:

$$P(x_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$$

• Probability of observing all points (assume independence)

$$P(X) = \prod_{i=1}^{n} P(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$$

• We want to find the parameters $\theta = (\mu, \sigma)$ that maximize the probability $P(X \mid \theta)$
Maximum Likelihood Estimation (MLE)
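• The probability $P(X \mid \theta)$ as a function of $\theta$ is called the Likelihood function

$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$$

• It is usually easier to work with the Log-Likelihood function

$$LL(\theta) = -\sum_{i=1}^{n} \frac{(x_i-\mu)^2}{2\sigma^2} - \frac{1}{2}\, n \log 2\pi - n \log \sigma$$

• Maximum Likelihood Estimation
• Find parameters $\mu$, $\sigma$ that maximize $LL(\theta)$

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i = \mu_X \quad \text{(Sample Mean)} \qquad \sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2 = \sigma_X^2 \quad \text{(Sample Variance)}$$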
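A small numerical check of the maximum likelihood estimates: the sketch below computes the sample mean and the sample variance (dividing by n, as in the formulas above) and evaluates the log-likelihood at the estimate and at a slightly shifted mean. The function names and the five sample values are illustrative assumptions, not data from the slides.

```python
from math import log, pi

def gaussian_mle(xs):
    """Maximum likelihood estimates (mean, variance) for a Gaussian."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n      # divides by n, not n - 1
    return mu, var

def log_likelihood(xs, mu, var):
    """Gaussian log-likelihood; note that n * log(sigma) = 0.5 * n * log(var)."""
    n = len(xs)
    return (-sum((x - mu) ** 2 for x in xs) / (2 * var)
            - 0.5 * n * log(2 * pi)
            - 0.5 * n * log(var))

# Illustrative values (made up, not real measurements).
xs = [170.0, 165.0, 180.0, 175.0, 160.0]
mu_hat, var_hat = gaussian_mle(xs)

best = log_likelihood(xs, mu_hat, var_hat)
shifted = log_likelihood(xs, mu_hat + 1.0, var_hat)
print(mu_hat, var_hat, best > shifted)
```

Moving the mean away from the sample mean can only lower the log-likelihood, which is what the comparison at the end illustrates.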
MLE
• Note: these are also the most likely parameters
given the data

• If we have no prior information about $\theta$, or X, then maximizing $P(X \mid \theta)$ is the same as maximizing $P(\theta \mid X)$
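The reasoning behind this note can be spelled out with Bayes' rule. The short derivation below reads "no prior information" as a constant (uniform) prior over the parameters, which is an interpretation rather than something the slides state explicitly.

```latex
P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)}
\qquad\Longrightarrow\qquad
\arg\max_{\theta} P(\theta \mid X)
  = \arg\max_{\theta} P(X \mid \theta)\, P(\theta)
  = \arg\max_{\theta} P(X \mid \theta)
```

The first equality holds because $P(X)$ does not depend on $\theta$; the second holds when $P(\theta)$ is the same for every $\theta$.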
Mixture of Gaussians
• Suppose that you have the heights of people from
Greece and China and the distribution looks like
the figure below (dramatization)
Mixture of Gaussians
• In this case the data is the result of the mixture of
two Gaussians
• One for Greek people, and one for Chinese people
• Identifying for each value which Gaussian is most likely
to have generated it will give us a clustering.
Mixture model
• A value $x_i$ is generated according to the following
process:
• First select the nationality
• With probability $\pi_G$ select Greece, with probability $\pi_C$ select China

We can also think of this as a Hidden Variable Z

• Given the nationality, generate the point from the
corresponding Gaussian
• $P(x_i \mid \theta_G) \sim N(\mu_G, \sigma_G)$ if Greece
• $P(x_i \mid \theta_C) \sim N(\mu_C, \sigma_C)$ if China
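The generative process can be simulated directly. In the sketch below the hidden variable is drawn first and the value is then drawn from the corresponding Gaussian. All parameter values are made up for illustration (they are not real height statistics), and the function name `sample_mixture` is hypothetical.

```python
import random

def sample_mixture(n, pi_g, mu_g, sigma_g, mu_c, sigma_c, seed=0):
    """Draw n (z, x) pairs: z is the hidden component, x the generated value."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Step 1: select the component; "G" with probability pi_g, else "C".
        z = "G" if rng.random() < pi_g else "C"
        # Step 2: generate the value from the Gaussian of that component.
        if z == "G":
            x = rng.gauss(mu_g, sigma_g)
        else:
            x = rng.gauss(mu_c, sigma_c)
        samples.append((z, x))
    return samples

# Made-up parameters, chosen only so that the two components overlap.
samples = sample_mixture(n=10000, pi_g=0.3,
                         mu_g=177.0, sigma_g=7.0,
                         mu_c=170.0, sigma_c=6.0)
fraction_g = sum(1 for z, _ in samples if z == "G") / len(samples)
print(round(fraction_g, 3))
```

In a clustering setting only the values x would be observed; the labels z are exactly what the mixture model tries to recover.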
Mixture Model
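• Our model has the following parameters

$$\Theta = (\pi_G, \pi_C, \mu_G, \mu_C, \sigma_G, \sigma_C)$$

Mixture probabilities: $\pi_G, \pi_C$. Distribution parameters: $\mu_G, \mu_C, \sigma_G, \sigma_C$

• For value $x_i$, we have:

$$P(x_i \mid \Theta) = \pi_G\, P(x_i \mid \theta_G) + \pi_C\, P(x_i \mid \theta_C)$$

• For all values $X = (x_1, \dots, x_n)$

$$P(X \mid \Theta) = \prod_{i=1}^{n} P(x_i \mid \Theta)$$

• We want to estimate the parameters that maximize the Likelihood of the data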
Mixture Models
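• Once we have the parameters $\Theta$ we can estimate
the membership probabilities $P(G \mid x_i)$ and $P(C \mid x_i)$ for each point $x_i$:

$$P(G \mid x_i) = \frac{\pi_G\, P(x_i \mid \theta_G)}{\pi_G\, P(x_i \mid \theta_G) + \pi_C\, P(x_i \mid \theta_C)}$$

• This is the probability that point $x_i$ belongs to the Greek or
the Chinese population (cluster)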
EM (Expectation Maximization) Algorithm
• Initialize the values of the parameters in $\Theta$ to some random
values
• Repeat until convergence
• E-Step: Given the parameters $\Theta$ estimate the membership
probabilities $P(G \mid x_i)$ and $P(C \mid x_i)$
• M-Step: Compute the parameter values that (in expectation)
maximize the data likelihood

Fraction of population in G, C:

$$\pi_C = \frac{1}{n}\sum_{i=1}^{n} P(C \mid x_i) \qquad \pi_G = \frac{1}{n}\sum_{i=1}^{n} P(G \mid x_i)$$

MLE estimates if the $\pi$'s were fixed:

$$\mu_C = \sum_{i=1}^{n} \frac{P(C \mid x_i)}{n\,\pi_C}\, x_i \qquad \mu_G = \sum_{i=1}^{n} \frac{P(G \mid x_i)}{n\,\pi_G}\, x_i$$

$$\sigma_C^2 = \sum_{i=1}^{n} \frac{P(C \mid x_i)}{n\,\pi_C}\, (x_i - \mu_C)^2 \qquad \sigma_G^2 = \sum_{i=1}^{n} \frac{P(G \mid x_i)}{n\,\pi_G}\, (x_i - \mu_G)^2$$
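The E-step and M-step updates can be put together into a small program for a one-dimensional mixture of two Gaussians. This is a sketch under several assumptions that go beyond the slides: it runs a fixed number of iterations instead of testing for convergence, it starts from the smallest and largest data value instead of random values (to keep the run reproducible), and it is tested only on synthetic data generated inside the block. The function names and all numbers are illustrative.

```python
import random
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

def em_two_gaussians(xs, iterations=200):
    """EM for a 1-D mixture of two Gaussians, labeled G and C as in the slides."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: deterministic start (min and max of the data) instead of random values.
    mu_g, mu_c = min(xs), max(xs)
    sigma_g = sigma_c = sqrt(sum((x - mean) ** 2 for x in xs) / n)
    pi_g = pi_c = 0.5
    for _ in range(iterations):
        # E-step: membership probabilities P(G|x_i) and P(C|x_i).
        p_g = []
        for x in xs:
            g = pi_g * normal_pdf(x, mu_g, sigma_g)
            c = pi_c * normal_pdf(x, mu_c, sigma_c)
            p_g.append(g / (g + c))
        p_c = [1.0 - p for p in p_g]
        # M-step: re-estimate the parameters from the soft assignments.
        pi_g, pi_c = sum(p_g) / n, sum(p_c) / n
        mu_g = sum(p * x for p, x in zip(p_g, xs)) / (n * pi_g)
        mu_c = sum(p * x for p, x in zip(p_c, xs)) / (n * pi_c)
        sigma_g = sqrt(sum(p * (x - mu_g) ** 2 for p, x in zip(p_g, xs)) / (n * pi_g))
        sigma_c = sqrt(sum(p * (x - mu_c) ** 2 for p, x in zip(p_c, xs)) / (n * pi_c))
    return {"pi": (pi_g, pi_c), "mu": (mu_g, mu_c), "sigma": (sigma_g, sigma_c)}

# Synthetic data: two well-separated groups with made-up parameters.
rng = random.Random(0)
xs = [rng.gauss(160, 5) for _ in range(200)] + [rng.gauss(180, 5) for _ in range(200)]
params = em_two_gaussians(xs)
print(params)
```

Assigning each point to the component with the larger membership probability then yields a clustering of the data.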
Relationship to K-means
• E-Step: Assignment of points to clusters
• K-means: hard assignment, EM: soft assignment
• M-Step: Computation of centroids
• K-means assumes common fixed variance (spherical
clusters)
• EM: can change the variance for different clusters or
different dimensions (ellipsoid clusters)
• If the variance is fixed then both minimize the
same error function
