Clustering Performance Evaluation in Scikit Learn
Last Updated: 26 Apr, 2025
In this article, we look at different approaches to evaluating clustering algorithms using scikit-learn, the Python machine learning library. Clustering is an unsupervised machine learning task that groups the data points of a dataset so that points in the same group are more similar to each other than to points in other groups. It is widely used for segmentation, pattern discovery, search engines, and so on.
Let's cluster a synthetic dataset and then look at different metrics for evaluating the resulting model.
Python3
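from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate 500 two-dimensional points around 5 centers;
# `target` holds the index of the blob each point was drawn from.
feature, target = make_blobs(n_samples=500,
                             centers=5,
                             random_state=42,
                             shuffle=False)
plt.scatter(feature[:, 0], feature[:, 1])
plt.show()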
Output:
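On this dataset we fit KMeans, which is a centroid-based clustering algorithm. Note that the model below is asked for 4 clusters although the data was generated from 5 centers, so the scores that follow measure an imperfect match.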
Python3
from sklearn.cluster import KMeans

model = KMeans(n_clusters=4)
model.fit(feature)

# Plot the data points in red.
plt.scatter(feature[:, 0], feature[:, 1], color="r")
# cluster_centers_ has shape (n_clusters, n_features):
# column 0 holds the x coordinates, column 1 the y coordinates.
plt.scatter(model.cluster_centers_[:, 0],
            model.cluster_centers_[:, 1],
            color="k", marker="*")
plt.show()
Output:
Performance Evaluation Metrics
Once we have built a model, we usually use it to make predictions. But how do we verify the results, and on what basis do we draw conclusions? That is where evaluation metrics come in. Evaluating a model is a critical step in any machine learning workflow. Metrics are mainly used to measure the model's output on inference or test data in comparison with the actual data. For clustering, some metrics compare the predicted cluster labels with known ground-truth labels, while others need only the data and the predicted labels.
Now let us see some common Clustering Performance Evaluations in Scikit Learn.
5 Commonly used Clustering Performance Evaluation Metrics
Adjusted Rand Index
The adjusted Rand index (ARI) measures the similarity between two clusterings. It considers every pair of the n_samples data points and counts the pairs that are assigned to the same cluster or to different clusters in both the actual and the predicted clustering, then corrects that count for the agreement expected by chance.
The adjusted rand index score is defined as:
ARI = (RI - Expected_RI) / (max(RI) - Expected_RI)
Python3
from sklearn.metrics import adjusted_rand_score
ari = adjusted_rand_score(target, model.labels_)
print(ari)
Output:
0.7812362998684788
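A score of 1 means the two clusterings are identical, and a score near 0 means the agreement is no better than chance. As a rough rule of thumb, a score above 0.7 is often taken to indicate a good match.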
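To make the formula above concrete, here is a minimal sketch that computes the ARI from pair counts and compares it with adjusted_rand_score. The helper name ari_by_hand and the toy labels are illustrative assumptions, not part of the article's dataset. The sketch uses the pair-counting form of the formula, which is equivalent to the RI-based form when the cluster sizes are held fixed.

```python
from scipy.special import comb
from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix


def ari_by_hand(labels_true, labels_pred):
    """Hypothetical helper: ARI from the contingency table of two labelings."""
    # table[i, j] = number of points in true cluster i and predicted cluster j
    table = contingency_matrix(labels_true, labels_pred)
    n = table.sum()
    # Pairs placed together in both labelings.
    same_in_both = comb(table, 2).sum()
    # Pairs placed together in the true / the predicted labeling.
    same_in_true = comb(table.sum(axis=1), 2).sum()
    same_in_pred = comb(table.sum(axis=0), 2).sum()
    # Chance-level value and maximum value of the pair count.
    expected = same_in_true * same_in_pred / comb(n, 2)
    maximum = 0.5 * (same_in_true + same_in_pred)
    return (same_in_both - expected) / (maximum - expected)


# Toy labels chosen only for illustration.
toy_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
toy_pred = [1, 1, 0, 0, 0, 0, 2, 2, 2]

ari_manual = ari_by_hand(toy_true, toy_pred)
ari_sklearn = adjusted_rand_score(toy_true, toy_pred)
print(ari_manual, ari_sklearn)
```

The two printed values should agree. Because only co-membership of pairs is counted, renaming the cluster ids in toy_pred would not change the score.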
Rand Index
The Rand index is the unadjusted form of the adjusted Rand index. It also measures the similarity between two clusterings by considering all pairs of the n_samples, but it is not corrected for chance. It ranges from 0 to 1, whereas in scikit-learn the ARI ranges from -0.5 to 1.
The rand index is defined as:
RI = (number of agreeing pairs) / (number of pairs)
Python3
from sklearn.metrics import rand_score
ris = rand_score(target, model.labels_)
print(ris)
Output:
0.9198396793587175
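The definition above can be checked directly by enumerating pairs. The following is a minimal sketch; the function rand_index_by_hand and the toy labels are assumptions made for illustration. A pair "agrees" when both labelings put its two points together, or both put them apart.

```python
from itertools import combinations
from sklearn.metrics import rand_score


def rand_index_by_hand(labels_true, labels_pred):
    """Hypothetical helper: fraction of point pairs on which two labelings agree."""
    agreeing = 0
    total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # Agreement: together in both labelings, or apart in both.
        if same_true == same_pred:
            agreeing += 1
        total += 1
    return agreeing / total


# Toy labels chosen only for illustration: 15 pairs, 12 of which agree.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 0, 1, 2, 2, 2]

ri_manual = rand_index_by_hand(toy_true, toy_pred)
ri_sklearn = rand_score(toy_true, toy_pred)
print(ri_manual, ri_sklearn)  # both 0.8
```

The double loop is quadratic in the number of samples, so it is only meant for understanding; rand_score is the practical choice.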
Silhouette Score aka Silhouette Coefficient
The silhouette score, also known as the silhouette coefficient, is an evaluation metric that ranges from -1 to 1. A score near 1 indicates that a data point lies well inside the cluster to which it belongs and far away from the other clusters. A score near -1 indicates that the point has probably been assigned to the wrong cluster. A score near 0 indicates overlapping clusters. silhouette_score returns the mean over all samples, and unlike the two Rand-based metrics it needs no ground-truth labels.
Python3
from sklearn.metrics import silhouette_score
ss = silhouette_score(feature, model.labels_)
print(ss)
Output:
0.7328381899726921
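For a single point, the silhouette value is (b - a) / max(a, b), where a is the mean distance to the other points of its own cluster and b is the mean distance to the points of the nearest other cluster. The sketch below checks this for one point against silhouette_samples. The small dataset X, the fixed random_state, and n_init=10 are assumptions chosen to keep the example reproducible; they are not the article's setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Small illustrative dataset, separate from the article's `feature` array.
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

per_point = silhouette_samples(X, labels)  # one value per sample
overall = silhouette_score(X, labels)      # mean of the per-sample values

# Silhouette of point 0 by hand (Euclidean distances).
dists = np.linalg.norm(X - X[0], axis=1)
own = labels == labels[0]
# a: mean distance to the other members of its own cluster
# (the point's zero distance to itself is excluded from the count).
a = dists[own].sum() / (own.sum() - 1)
# b: smallest mean distance to the members of any other cluster.
b = min(dists[labels == k].mean() for k in np.unique(labels) if k != labels[0])
s0 = (b - a) / max(a, b)

print(s0, per_point[0])
print(per_point.mean(), overall)
```

Inspecting per_point rather than only the mean can help spot individual points or whole clusters that are poorly separated.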
Davies-Bouldin Index
Davies-Bouldin Index score is defined as the average similarity measure of each cluster with its most similar cluster, where similarity is the ratio of within-cluster distances to between-cluster distances. Thus, clusters that are farther apart and less dispersed will result in a better score. The minimum score is 0, with lower values indicating better clustering.
Python3
from sklearn.metrics import davies_bouldin_score
dbs = davies_bouldin_score(feature, model.labels_)
print(dbs)
Output:
0.3389800864889033
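The definition above can be written out step by step. This is a minimal sketch that assumes Euclidean distances and cluster centroids computed as plain means; the helper davies_bouldin_by_hand and the small dataset are illustrative assumptions, not the article's setup.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score

# Small illustrative dataset, separate from the article's `feature` array.
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)


def davies_bouldin_by_hand(X, labels):
    """Hypothetical helper: Davies-Bouldin index with Euclidean distances."""
    ids = np.unique(labels)
    centroids = np.array([X[labels == k].mean(axis=0) for k in ids])
    # Within-cluster scatter: mean distance of members to their centroid.
    scatter = np.array([
        np.linalg.norm(X[labels == k] - centroids[i], axis=1).mean()
        for i, k in enumerate(ids)
    ])
    worst = []
    for i in range(len(ids)):
        # Similarity of cluster i to every other cluster j.
        ratios = [
            (scatter[i] + scatter[j])
            / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(len(ids)) if j != i
        ]
        # Keep only the most similar (worst-separated) neighbour.
        worst.append(max(ratios))
    return float(np.mean(worst))


db_manual = davies_bouldin_by_hand(X, labels)
db_sklearn = davies_bouldin_score(X, labels)
print(db_manual, db_sklearn)
```

Because each cluster contributes only its worst neighbour, a single pair of overlapping clusters can raise the index noticeably even when the remaining clusters are well separated.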
Mutual Information
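Mutual information measures how much two labelings of the same data agree, that is, how much knowing one labeling tells us about the other. Here it is used to compare the actual labels in target with the labels predicted by the model. The score is not normalized: mutual_info_score reports it in nats, its lower bound is 0, and its upper bound depends on the entropy of the labelings, so it can exceed 1.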
Python3
from sklearn.metrics import mutual_info_score
mis = mutual_info_score(target, model.labels_)
print(mis)
Output:
1.3321790402101235
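Because the raw score depends on the number and size of the clusters, it can be hard to read on its own. As a hedged illustration with toy labels (an assumption, not the article's data), the sketch below shows that a perfect match between three equally sized clusters gives ln(3) rather than 1, and that scikit-learn's normalized_mutual_info_score and adjusted_mutual_info_score rescale a perfect match to 1.

```python
import math
from sklearn.metrics import (
    adjusted_mutual_info_score,
    mutual_info_score,
    normalized_mutual_info_score,
)

# Toy labels: the same grouping with the cluster ids renamed.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [2, 2, 0, 0, 1, 1]

mi = mutual_info_score(toy_true, toy_pred)
nmi = normalized_mutual_info_score(toy_true, toy_pred)
ami = adjusted_mutual_info_score(toy_true, toy_pred)

# For a perfect match, the raw score equals the entropy of the labeling,
# which is ln(3) for three equally sized clusters.
print(mi, math.log(3))
print(nmi, ami)
```

When comparing clusterings with different numbers of clusters, the normalized or adjusted variants are usually easier to interpret than the raw value.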