Heirarchical Clustering.ipynb - Colab

10/22/24, 11:16 AM Heirarchical_clustering.ipynb - Colab

import numpy as np  # linear algebra
import pandas as pd  # data loading and tabular manipulation

dataset = pd.read_csv('/content/Mall_Customers.csv')
dataset.head()

   CustomerID  Gender  Age  Annual Income (k$)  Spending Score (1-100)
0           1    Male   19                  15                      39
1           2    Male   21                  15                      81
2           3  Female   20                  16                       6
3           4  Female   23                  16                      77
4           5  Female   31                  17                      40


dataset.shape

(200, 5)

dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   CustomerID              200 non-null    int64
 1   Gender                  200 non-null    object
 2   Age                     200 non-null    int64
 3   Annual Income (k$)      200 non-null    int64
 4   Spending Score (1-100)  200 non-null    int64
dtypes: int64(4), object(1)
memory usage: 7.9+ KB

dataset.describe()

       CustomerID         Age  Annual Income (k$)  Spending Score (1-100)
count  200.000000  200.000000          200.000000              200.000000
mean   100.500000   38.850000           60.560000               50.200000
std     57.879185   13.969007           26.264721               25.823522
min      1.000000   18.000000           15.000000                1.000000
25%     50.750000   28.750000           41.500000               34.750000
50%    100.500000   36.000000           61.500000               50.000000
75%    150.250000   49.000000           78.000000               73.000000
max    200.000000   70.000000          137.000000               99.000000

# Keep the last two columns as features: annual income and spending score
X = dataset.iloc[:, 3:]
X.head()


   Annual Income (k$)  Spending Score (1-100)
0                  15                      39
1                  15                      81
2                  16                       6
3                  16                      77
4                  17                      40
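
The positional slice iloc[:, 3:] picks the right features only because they happen to be the last two columns. A minimal sketch of the same selection by column name, assuming nothing beyond the five rows printed by dataset.head() (the `sample` frame is rebuilt by hand here purely for illustration):

```python
import pandas as pd

# The five rows printed by dataset.head(), rebuilt by hand for illustration.
sample = pd.DataFrame({
    'CustomerID': [1, 2, 3, 4, 5],
    'Gender': ['Male', 'Male', 'Female', 'Female', 'Female'],
    'Age': [19, 21, 20, 23, 31],
    'Annual Income (k$)': [15, 15, 16, 16, 17],
    'Spending Score (1-100)': [39, 81, 6, 77, 40],
})

feature_cols = ['Annual Income (k$)', 'Spending Score (1-100)']
X_by_name = sample[feature_cols]      # explicit column names
X_by_position = sample.iloc[:, 3:]    # positional slice used in the notebook

# Both selections should hold the same columns and values.
print(X_by_name.equals(X_by_position))
```

Selecting by name keeps working if the CSV ever gains or reorders columns, which is why it may be the safer choice outside a quick exploration.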


import scipy.cluster.hierarchy as hc
import matplotlib.pyplot as plt

# Set the default figure size (width, height in inches) for the plots below.
plt.rcParams['figure.figsize'] = (15, 10)

# Using a dendrogram to find the optimal number of clusters
dendrogram = hc.dendrogram(hc.linkage(X, method='ward'))
plt.title('Dendrogram')
plt.xlabel('Customers')
plt.ylabel('Euclidean Distances')
plt.show()

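# Redraw the dendrogram with a horizontal cut at height 200; the number of
# vertical branches the dashed line crosses is the number of clusters at that height.
dendrogram = hc.dendrogram(hc.linkage(X, method='ward'))
plt.title('Dendrogram')
plt.xlabel('Customers')
plt.ylabel('Euclidean Distances')
plt.axhline(200, c='r', linestyle='--')
plt.show()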
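
Reading the cluster count off the plot can also be done numerically. The sketch below is one possible way, using SciPy's fcluster to cut a Ward tree at a fixed height. It runs on synthetic blobs rather than the mall data (an assumption made so the example is self-contained), and the threshold of 50 is chosen for that synthetic data, not taken from the notebook.

```python
import numpy as np
import scipy.cluster.hierarchy as hc

# Synthetic stand-in for X: five tight, well-separated 2-D blobs
# (an assumption for illustration, not the mall data).
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [200, 50]])
points = np.vstack([c + rng.normal(scale=1.0, size=(40, 2)) for c in centers])

Z = hc.linkage(points, method='ward')

# Cut the tree at a fixed height, like the dashed line drawn with plt.axhline.
# 50 suits this synthetic data; the notebook's plot uses 200 on the real data.
threshold = 50
labels = hc.fcluster(Z, t=threshold, criterion='distance')

n_clusters = len(np.unique(labels))
print(n_clusters)
```

On the real data, the same call with t=200 would report how many clusters the dashed line in the plot above corresponds to.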


# Fitting hierarchical clustering to the mall dataset
from sklearn.cluster import AgglomerativeClustering

# The affinity parameter is omitted: Ward linkage only supports Euclidean distance.
hc_Agg = AgglomerativeClustering(n_clusters=5, linkage='ward')
y_hc = hc_Agg.fit_predict(X)
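
The scikit-learn fit and the SciPy dendrogram both build a Ward tree, so cutting either into five clusters should give the same grouping. A minimal sketch of that check, again on synthetic blobs (an assumption, since the CSV is not available here):

```python
import numpy as np
import scipy.cluster.hierarchy as hc
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for X: five well-separated blobs (illustration only).
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [100, 0], [0, 100], [100, 100], [200, 50]])
points = np.vstack([c + rng.normal(scale=1.0, size=(40, 2)) for c in centers])

# Five clusters from scikit-learn, as in the notebook.
sk_labels = AgglomerativeClustering(n_clusters=5, linkage='ward').fit_predict(points)

# Five clusters from the SciPy tree used for the dendrogram.
Z = hc.linkage(points, method='ward')
sp_labels = hc.fcluster(Z, t=5, criterion='maxclust')

# Label numbers differ (scikit-learn starts at 0, SciPy at 1, and the order is
# arbitrary), so compare the partitions instead of the raw label values.
ari = adjusted_rand_score(sk_labels, sp_labels)
print(ari)
```

An adjusted Rand index of 1 means the two partitions are identical up to relabeling. On less cleanly separated data, ties in merge order could in principle make the two libraries differ slightly.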

# Visualizing the clusters


plt.scatter(X.iloc[y_hc == 0, 0], X.iloc[y_hc == 0, 1], s = 100, c = 'red', label = 'Careful')
plt.scatter(X.iloc[y_hc == 1, 0], X.iloc[y_hc == 1, 1], s = 100, c = 'blue', label = 'Standard')
plt.scatter(X.iloc[y_hc == 2, 0], X.iloc[y_hc == 2, 1], s = 100, c = 'green', label = 'Target')
plt.scatter(X.iloc[y_hc == 3, 0], X.iloc[y_hc == 3, 1], s = 100, c = 'cyan', label = 'Careless')
plt.scatter(X.iloc[y_hc == 4, 0], X.iloc[y_hc == 4, 1], s = 100, c = 'magenta', label = 'Sensible')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
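
The legend names above ('Careful', 'Standard', 'Target', and so on) are attached to cluster numbers by hand, and those numbers are arbitrary. One way to check which number deserves which name is to look at each cluster's mean income and spending score. The sketch below assumes synthetic data with hypothetical blob centers in place of the mall data:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering

# Synthetic stand-in for X with hypothetical centers in
# (income, spending score) space; not the mall data.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]])
values = np.vstack([c + rng.normal(scale=1.0, size=(40, 2)) for c in centers])
features = pd.DataFrame(
    values, columns=['Annual Income (k$)', 'Spending Score (1-100)']
)

labels = AgglomerativeClustering(n_clusters=5, linkage='ward').fit_predict(features)

# Mean income and spending score per cluster, plus the cluster sizes.
profile = features.assign(cluster=labels).groupby('cluster').mean()
sizes = np.bincount(labels)

print(profile.round(1))
print(sizes)
```

A cluster with high mean income and high mean spending score would be the natural candidate for a name like 'Target'; the mapping should be re-checked whenever the model is refit.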
