Hierarchically-clustered Heatmap in Python with Seaborn Clustermap
Last Updated: 02 Dec, 2020
Seaborn is a Python visualization library for statistical graphics. It provides attractive default styles and color palettes, is built on top of the matplotlib library, and integrates closely with pandas data structures.
What is Clustering?
Clustering means grouping data points based on how similar they are to one another. It is an unsupervised learning technique: the algorithm finds structure in the data without being given labels. The most common types of clustering are shown below.
[Figure: types of clustering]
Here we look at hierarchical clustering, specifically agglomerative (bottom-up) hierarchical clustering. In agglomerative clustering, we start by treating each data point as its own cluster and then repeatedly merge the two nearest clusters into a larger one until a single cluster remains. The tree diagram that records these merges is called a dendrogram.
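The merging procedure described above can be sketched with SciPy, which seaborn relies on for this step. The points, the "average" linkage method, and the choice of two flat clusters are illustrative assumptions, not values taken from this article.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Six 2-D points forming two visibly separated groups (made-up data).
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])

# Each row of the linkage matrix records one merge:
# (cluster a, cluster b, merge distance, size of the new cluster).
merges = linkage(points, method="average", metric="euclidean")

# n points need n - 1 merges to end up in a single cluster.
print(merges.shape)

# Cut the tree so that at most two flat clusters remain.
labels = fcluster(merges, t=2, criterion="maxclust")
print(labels)
```

Passing the linkage matrix to scipy.cluster.hierarchy.dendrogram would draw the corresponding dendrogram.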
Plotting Hierarchically clustered Heatmaps
A heat map is a graphical representation of data in which values are represented by colors. Variation in color intensity shows how the values cluster or vary across the matrix.
The clustermap() function of seaborn plots a hierarchically-clustered heat map of the given matrix dataset. It returns a ClusterGrid object.
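As a minimal sketch of working with the returned object, the example below clusters a small random matrix and reads back the row and column order chosen by the clustering. The matrix, its labels, and the Agg backend line are assumptions made so the snippet runs on its own without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window

import numpy as np
import pandas as pd
import seaborn as sns

# Small made-up numeric matrix with labelled rows and columns.
rng = np.random.default_rng(0)
matrix = pd.DataFrame(
    rng.normal(size=(6, 4)),
    index=[f"row{i}" for i in range(6)],
    columns=list("ABCD"),
)

# clustermap() clusters both rows and columns by default.
grid = sns.clustermap(matrix, figsize=(5, 5))
print(type(grid).__name__)

# Positions of the original rows/columns in their clustered order.
row_order = grid.dendrogram_row.reordered_ind
col_order = grid.dendrogram_col.reordered_ind

# The input matrix rearranged to match the plotted heat map.
reordered = matrix.iloc[row_order, col_order]
print(reordered)
```

The reordered indices are useful when you want to report which rows ended up next to each other, not just look at the picture.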
Below are some examples that show a hierarchically-clustered heat map built from a dataset.
In the Flights dataset, the data (number of passengers) is clustered based on month and year:
Example 1:
Python3
# Importing the library
import seaborn as sns
from sunbird.categorical_encoding import frequency_encoding
# Load dataset
data = sns.load_dataset('flights')
# Categorical encoding
frequency_encoding(data, 'month')
# Cluster rows and columns (the default) and
# draw the heat map with the default colormap.
sns.clustermap(data, figsize=(7, 7))
Output :
The color bar at the top left of the cluster map shows how colors map to values, e.g. a bright color indicates more passengers and a dark color indicates fewer passengers.
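An alternative that avoids encoding the month column is to reshape the long-form table into a month-by-year matrix before clustering. The sketch below is not the approach used in this article; it uses a synthetic table with the same three columns as the Flights dataset (year, month, passengers) so that it runs without a download, and all values in it are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to open a plot window

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic long-form stand-in for the Flights dataset (made-up numbers).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
years = [1949, 1950, 1951, 1952]
rng = np.random.default_rng(1)
records = [
    {"year": y, "month": m,
     "passengers": int(100 + 20 * (y - 1949) + rng.integers(0, 30))}
    for y in years
    for m in months
]
long_form = pd.DataFrame(records)

# One row per month, one column per year, passengers as the cell values.
wide = long_form.pivot(index="month", columns="year", values="passengers")

# Rows (months) and columns (years) are now clustered separately.
grid = sns.clustermap(wide, figsize=(7, 7))
```

With the real data, the same pivot call could be applied to the result of sns.load_dataset('flights'), so that every cell of the heat map is a passenger count.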
Example 2:
Python3
# Importing the library
import seaborn as sns
from sunbird.categorical_encoding import frequency_encoding
# Load dataset
data = sns.load_dataset('flights')
# Categorical encoding
frequency_encoding(data, 'month')
# Cluster the data and draw the heat map
# with the 'coolwarm' colormap.
sns.clustermap(data, cmap='coolwarm', figsize=(7, 7))
Output:
Here the cmap parameter changes the color palette of the cluster map to 'coolwarm'.