Difference Between Agglomerative Clustering and Divisive Clustering

Last Updated : 15 May, 2025

Agglomerative and divisive clustering are two main types of hierarchical clustering methods. Agglomerative clustering is a bottom-up approach where each data point starts as its own cluster and similar ones are merged step by step.
Divisive clustering is top-down, starting with all data in one cluster and splitting it into smaller groups based on differences.

Agglomerative Clustering

Agglomerative clustering is a bottom-up approach where each data point starts as its own individual cluster. The algorithm iteratively merges the most similar pair of clusters until all the data points belong to a single cluster. It is widely used because it is simple and works well on small to medium-sized datasets.

Key steps in agglomerative clustering:

1. Treat each data point as a separate cluster.
2. Calculate the similarity (distance) between all pairs of clusters.
3. Merge the two most similar clusters.
4. Repeat steps 2-3 until all points belong to a single cluster.
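The steps above can be sketched from scratch in a few lines of NumPy. This is only a minimal illustration, not how SciPy or scikit-learn implement it: the function name `naive_agglomerative`, the choice of single linkage (distance between the two closest members) and the toy data are all assumptions made for the example.

```python
import numpy as np

def naive_agglomerative(points, n_clusters=1):
    """Merge clusters bottom-up until n_clusters remain (single linkage)."""
    # Step 1: every point starts as its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    # Pairwise Euclidean distances between all points, computed once.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    merges = []  # history of (members_a, members_b, distance)

    while len(clusters) > n_clusters:
        best = (None, None, np.inf)
        # Step 2: compare every pair of clusters.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = dist[np.ix_(clusters[a], clusters[b])].min()
                if d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        # Step 3: merge the closest pair (b > a, so index a stays valid).
        merges.append((list(clusters[a]), list(clusters[b]), float(d)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        # Step 4: the loop repeats until the requested number remains.
    return clusters, merges

rng = np.random.default_rng(0)
# Two well-separated groups of five points each (illustrative data).
points = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(5, 0.1, (5, 2))])
clusters, merges = naive_agglomerative(points, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Stopping at `n_clusters=2` should recover the two groups of indices, while `n_clusters=1` runs the full hierarchy. The nested loops over cluster pairs inside the merge loop are what make this naive version slow on large inputs.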

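This method can be computationally expensive, especially for large datasets. A straightforward implementation stores the distance between every pair of points, which takes O(n^2) memory, and searches those distances again at each of the n - 1 merges, leading to a time complexity of O(n^3). Optimized implementations reduce this to O(n^2 log n) in general, or O(n^2) for some linkage criteria such as single linkage.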
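To get a feel for how quickly the number of pairs grows, the short sketch below counts the pairwise distances for 50 points using SciPy's `pdist`. The dataset size and seed are arbitrary choices for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist

n = 50
data = np.random.default_rng(0).standard_normal((n, 2))

# pdist returns one distance per unordered pair of rows (condensed form).
pairwise = pdist(data)

# Both numbers agree: n * (n - 1) / 2 pairs.
print(len(pairwise), n * (n - 1) // 2)
```

Doubling the number of points roughly quadruples the number of distances, which is why memory is often the first limit reached on large datasets.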

It can be implemented using the scikit-learn or SciPy libraries in Python. Here’s a simple implementation of agglomerative clustering on randomly generated data using SciPy:

Python

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

# Generate 50 random two-dimensional points
data = np.random.randn(50, 2)

# Build the merge hierarchy using Ward linkage
Z = linkage(data, method='ward')

# Plot dendrogram
plt.figure(figsize=(10, 7))
dendrogram(Z)
plt.title("Agglomerative Clustering Dendrogram")
plt.show()

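Output: a dendrogram whose leaves are the 50 data points and whose vertical axis shows the distance at which clusters are merged.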
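A dendrogram shows the whole hierarchy, but in practice a flat set of cluster labels is often needed. One way to get it, sketched below, is to cut the tree with SciPy's `fcluster`. The target of 3 clusters and the fixed seed are assumptions chosen purely for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable
data = rng.standard_normal((50, 2))

# Same Ward hierarchy as in the dendrogram example.
Z = linkage(data, method="ward")

# Cut the tree so that at most 3 flat clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")

print(labels.shape)       # one label per data point
print(np.unique(labels))  # the distinct cluster ids that were assigned
```

With `criterion="maxclust"` the value of `t` is an upper bound on the number of clusters. Other criteria, such as `"distance"`, cut the tree at a given merge height instead.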

Divisive Clustering

Divisive clustering, on the other hand, is a top-down approach. It starts with all data points in a single cluster and recursively splits the clusters into smaller sub-clusters based on their dissimilarity until each data point is in its own individual cluster. This approach is generally more computationally intensive: a cluster of n points can be split in two in 2^(n-1) - 1 different ways, so practical algorithms rely on heuristics rather than examining every possible split.

Key steps in divisive clustering:

1. Start with a single cluster containing all the data points.
2. Split the cluster into two sub-clusters based on their dissimilarity.
3. Recursively apply the same process to the resulting sub-clusters.
4. Repeat until each data point is in its own cluster.
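Neither SciPy's `scipy.cluster.hierarchy` module nor the code shown earlier provides a divisive counterpart, so the steps above are sketched here from scratch. This is a simplified, DIANA-style heuristic and only one of many possible splitting criteria: the function names `split_cluster` and `divisive`, the rule of always splitting the cluster with the largest diameter, and the toy data are all assumptions made for this example.

```python
import numpy as np

def split_cluster(dist, members):
    """Split one cluster in two with a simplified splinter-group heuristic."""
    members = list(members)
    sub = dist[np.ix_(members, members)]
    # Seed the splinter group with the point that is, on average,
    # farthest from the rest of the cluster.
    avg = sub.sum(axis=1) / (len(members) - 1)
    splinter = [int(np.argmax(avg))]
    rest = [i for i in range(len(members)) if i != splinter[0]]

    moved = True
    while moved and len(rest) > 1:
        moved = False
        for i in list(rest):
            if len(rest) == 1:
                break  # never empty the remaining group
            others = [j for j in rest if j != i]
            # Move a point if it is closer, on average, to the splinter group.
            if sub[i, splinter].mean() < sub[i, others].mean():
                rest.remove(i)
                splinter.append(i)
                moved = True
    return [members[i] for i in splinter], [members[i] for i in rest]

def divisive(points, n_clusters):
    """Split top-down until n_clusters clusters exist."""
    n_clusters = min(n_clusters, len(points))
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Step 1: a single cluster holding every point.
    clusters = [list(range(len(points)))]
    while len(clusters) < n_clusters:
        # Pick the cluster with the largest diameter (singletons are skipped).
        diameters = [dist[np.ix_(c, c)].max() if len(c) > 1 else -1.0
                     for c in clusters]
        target = clusters.pop(int(np.argmax(diameters)))
        # Steps 2-3: split it and keep both halves for further splitting.
        clusters.extend(split_cluster(dist, target))
    return clusters

rng = np.random.default_rng(0)
# Two well-separated groups of five points each (illustrative data).
points = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(5, 0.1, (5, 2))])
clusters = divisive(points, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Passing `n_clusters=len(points)` carries the recursion all the way down to single-point clusters, as in step 4. Stopping earlier is what allows a top-down method to avoid building the full hierarchy.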

The complexity of divisive clustering varies with the implementation, but it generally requires more computational power because of the recursive splitting process. However, because it operates on progressively smaller sub-clusters, it can sometimes cost less than agglomerative clustering on very large datasets, for example when only the top few levels of the hierarchy are needed. It is also more complex to implement and requires a choice of splitting criterion.