The document outlines a practical guide for implementing K-Means clustering using Python, including steps for importing packages, reading data, calculating distances, and plotting results. It demonstrates how to cluster height and weight data into three groups and visualize the clusters. The final output includes a DataFrame showing the height, weight, and assigned cluster for each data point.

Practical - 6: K-Means for Clustering


Table of Contents

Importing Needed Packages
Reading the Data
Calculating the Minimum Distance
Plotting the Data
Applying the K-Means Model
Plotting the Result

Importing Needed Packages

In [ ]: import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.cluster import KMeans

# Imports carried over from the decision-tree practical; not used in this notebook
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

Reading the Data

In [ ]: # Load the height/weight data set (path as in the original notebook)
htWtData = pd.read_csv("/BOOK.csv")
df = pd.DataFrame(htWtData)   # read_csv already returns a DataFrame; kept for consistency with later cells
df.head()

Out[ ]:    Height  Weight
        0     185      72
        1     170      56
        2     168      60
        3     179      68
        4     182      72
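A quick check that both columns loaded as numeric values can save debugging later; a minimal sketch, reusing the df created above:

# Sanity-check the loaded data (reuses df from the cell above)
print(df.shape)       # number of rows and columns
print(df.dtypes)      # Height and Weight should both be numeric
print(df.describe())  # value ranges for Height and Weight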

Calculating the Minimum Distance

In [ ]: ht = df["Height"].tolist()
print(ht)
wt = df["Weight"].tolist()
print(wt)

# Pairwise Euclidean distances; mnD starts at a large sentinel value
ed, mnD = [], 10 ** 9

for i in range(0, len(ht)):
    for j in range(i + 1, len(wt)):
        ds = ((ht[i] - ht[j]) ** 2) + ((wt[i] - wt[j]) ** 2)
        d = ds ** 0.5
        ed.append(d)
        mnD = min(mnD, d)
    # Breaks after the first point has been compared with every other point,
    # so mnD is the minimum distance from point 0 only
    if len(ed) >= 5:
        break

print("\n")
for d in ed:
    print(d)

print("\n minimum distance : ", mnD)

[185, 170, 168, 179, 182, 188, 180, 180, 183, 180, 180, 177]
[72, 56, 60, 68, 72, 77, 71, 70, 84, 88, 67, 76]

21.93171219946131
20.808652046684813
7.211102550927978
3.0
5.830951894845301
5.0990195135927845
5.385164807134504
12.165525060596439
16.76305461424021
7.0710678118654755
8.94427190999916

minimum distance : 3.0
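Because the loop breaks once ed holds at least five distances (i.e. after the first point has been compared with every other point), mnD is only the minimum distance from point 0. A vectorised sketch that covers every pair, assuming scipy is available (it is not imported in the original notebook):

# Sketch: all pairwise Euclidean distances in one call (assumes scipy is installed)
from scipy.spatial.distance import pdist

dists = pdist(df[["Height", "Weight"]].values, metric="euclidean")
print("minimum pairwise distance :", dists.min())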

Plotting the Data

In [ ]: # Scatter plot of the raw data: weight on the x-axis, height on the y-axis
plt.figure(figsize=(8, 5))

plt.scatter(htWtData['Weight'], htWtData['Height'])
plt.xlabel("weight")
plt.ylabel("height")
plt.show()
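Since seaborn is already imported, the same plot could be drawn with it as well; a minimal sketch, reusing df:

# Sketch: the same scatter plot drawn with seaborn (reuses df from above)
plt.figure(figsize=(8, 5))
sns.scatterplot(data=df, x="Weight", y="Height")
plt.show()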

Applying the K-Means Model


In [ ]: # Fit K-Means with 3 clusters on the height/weight data
# (passing n_init explicitly, e.g. n_init=10, would suppress the FutureWarning shown below)
kmeans = KMeans(n_clusters = 3)
kmeans.fit(htWtData)

# Cluster label assigned to each data point
pdVals = kmeans.predict(htWtData)
print(pdVals)

[0 1 1 0 0 2 0 0 2 2 0 0]
/usr/local/lib/python3.10/dist-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(
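Passing n_init explicitly (for example n_init=10) silences the FutureWarning above. The fitted model also exposes its centroids and inertia, which can be used for a simple elbow check on the number of clusters; a sketch, reusing htWtData:

# Sketch: inspect the fitted model and run a simple elbow check (reuses htWtData)
print(kmeans.cluster_centers_)   # centroid (Height, Weight) of each of the 3 clusters
print(kmeans.inertia_)           # within-cluster sum of squared distances

inertias = []
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10).fit(htWtData)
    inertias.append(km.inertia_)

plt.plot(range(1, 7), inertias, marker="o")
plt.xlabel("number of clusters k")
plt.ylabel("inertia")
plt.show()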

Plotting the Result

In [ ]: # Attach the predicted cluster label to each row and plot one colour per cluster
f = pd.DataFrame(htWtData)
f["cluster"] = pdVals
print(f)

color = ["red", "blue", "green"]

for k in range(0, 3):
    final = f[f["cluster"] == k]
    plt.scatter(final["Height"], final["Weight"], c = color[k])

plt.show()

    Height  Weight  cluster
0      185      72        0
1      170      56        1
2      168      60        1
3      179      68        0
4      182      72        0
5      188      77        2
6      180      71        0
7      180      70        0
8      183      84        2
9      180      88        2
10     180      67        0
11     177      76        0
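The cluster centres can also be marked on the final plot; a minimal sketch, reusing f, kmeans and color from above (the centre columns follow the original column order, Height then Weight):

# Sketch: replot the clusters with their centroids marked (reuses f, kmeans and color)
for k in range(0, 3):
    final = f[f["cluster"] == k]
    plt.scatter(final["Height"], final["Weight"], c=color[k])

centers = kmeans.cluster_centers_            # columns follow htWtData: Height, Weight
plt.scatter(centers[:, 0], centers[:, 1], c="black", marker="x", s=100)
plt.xlabel("height")
plt.ylabel("weight")
plt.show()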
