
Experiment No. 7

Objective:
Program to implement KNN in machine learning.

Apparatus required:
PC with Jupyter Notebook or Google Colab.

Theory:

Introduction to the k-Nearest Neighbors (k-NN) Algorithm


The k-Nearest Neighbors (k-NN) algorithm is a simple, versatile supervised learning
algorithm used for both classification and regression tasks. It belongs to the category of
instance-based learning algorithms: rather than building an explicit model during training,
it makes predictions based on the similarity between instances in the training data and the
new data points to be classified or predicted. Although k-NN has limitations, particularly
its computational cost at prediction time and its sensitivity to parameter settings, it
remains a valuable tool in many applications, especially when the data distribution is
complex and not well-defined.

How does k-NN work?


At its core, the k-NN algorithm operates on the principle of proximity. It assumes that similar
instances tend to share similar labels or values. When a new data point needs to be classified
or predicted, the algorithm looks at the k nearest neighbors in the training dataset and assigns
the label or value based on the majority vote (for classification) or average (for regression) of
those neighbors.
The working of k-NN can be explained by the following steps (a from-scratch sketch of
these steps is given after the list):
o Step-1: Select the number K of neighbors.
o Step-2: Calculate the Euclidean distance from the new data point to every point in the
training set.
o Step-3: Take the K nearest neighbors according to the calculated distances.
o Step-4: Among these K neighbors, count the number of data points in each category.
o Step-5: Assign the new data point to the category with the largest count among its
neighbors.
o Step-6: Our model is ready.
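
The following is a minimal from-scratch sketch of these steps in NumPy. The function
name knn_predict and the toy data are our own illustrative choices, not part of the
original experiment:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Step-2: Euclidean distance from x_new to every training point
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Step-3: indices of the k nearest neighbors
    nearest = np.argsort(distances)[:k]
    # Steps 4-5: count labels among the neighbors and take the majority
    votes = Counter(y_train[nearest])
    return votes.most_common(1)[0][0]

# Toy 2-D data with two classes (assumed for illustration)
X_train = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 5.0], [6.0, 5.5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.2, 1.5]), k=3))  # -> 0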

Classification with k-NN


In a classification task, the steps involved in making predictions with k-NN are as
follows (a worked numeric example follows the list):
1. Calculate distances: Compute the distance between the new data point and all data points
in the training dataset. The most common distance metric used is the Euclidean distance.
2. Find nearest neighbors: Select the k data points with the smallest distances to the new
data point.
3. Majority voting: Determine the class label of the new data point by taking a majority vote
among the labels of its k nearest neighbors.
4. Assign class label: Assign the class label with the highest count as the predicted label for
the new data point.
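
To make the voting concrete, here is a small worked example on assumed toy data
(three labeled 2-D points, one query, and k = 2; the numbers are illustrative only):

from collections import Counter
import math

# Assumed toy training data: (feature vector, class label)
train = [((1.0, 1.0), 'A'), ((2.0, 2.0), 'A'), ((5.0, 4.0), 'B')]
query = (1.5, 1.2)

# 1. Calculate distances from the query to every training point
dists = [(math.dist(p, query), label) for p, label in train]
# 2. Find the k = 2 nearest neighbors
nearest = sorted(dists)[:2]
# 3-4. Majority vote among their labels and assign the winner
vote = Counter(label for _, label in nearest).most_common(1)[0][0]
print(nearest)  # [(0.538..., 'A'), (0.943..., 'A')]
print(vote)     # 'A'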

Regression with k-NN


In regression tasks, instead of assigning class labels, the algorithm predicts the continuous
value of a new data point based on the average (or weighted average) of the target values of
its k nearest neighbors.
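
A minimal sketch of k-NN regression using scikit-learn's KNeighborsRegressor, which
averages the target values of the neighbors; the noisy sine-curve data below is an
assumption for illustration:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Assumed synthetic data: noisy samples of a sine curve
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(40, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=40)

# Predict each point as the mean target of its 5 nearest neighbors
reg = KNeighborsRegressor(n_neighbors=5)
reg.fit(X, y)
print(reg.predict([[2.5]]))  # roughly sin(2.5) = 0.60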
Parameter: Choosing the value of k
The choice of the value of k is crucial in the k-NN algorithm and can significantly affect its
performance. A smaller value of k results in a more flexible model with more complex
decision boundaries, potentially leading to overfitting. On the other hand, a larger value of k
may result in oversmoothing and underfitting.
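
One common way to choose k is cross-validation over a range of candidate values. A
minimal sketch using scikit-learn's cross_val_score on the Iris data (the candidate
range 1 to 15 is our own assumption for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate each candidate k by 5-fold cross-validated accuracy
scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 16)
}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))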

Advantages of k-NN:
• Simple and intuitive algorithm
• No training phase (lazy learner)
• Handles multi-class classification and regression tasks
• Relatively robust to noisy data when k is sufficiently large

Disadvantages of k-NN:
• Computationally expensive for large datasets
• Sensitivity to the choice of distance metric and value of k
• Requires careful preprocessing and feature scaling (a scaling sketch follows this list)
• Storage of entire training dataset required for prediction
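
Because k-NN relies on distances, features measured on larger scales dominate the
distance unless the data is standardized. A minimal sketch of feature scaling with a
scikit-learn Pipeline (the pipeline is our own illustration, not part of the
experiment code):

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Standardize each feature so all contribute comparably to the distance
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X, y)
print(model.score(X, y))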

Implementation Using Code:

# Author: ANANYA SHARMA
# Import the necessary libraries
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()

# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)

# Create a KNN classifier with k = 3
knn = KNeighborsClassifier(n_neighbors=3)

# Train the classifier (k-NN simply stores the training data)
knn.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn.predict(X_test)

# Evaluate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Output:
Accuracy: 1.0

import matplotlib.pyplot as plt

# Scatter plot of the training points (first two features:
# sepal length vs. sepal width), colored by class label
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap='viridis')

# Add axis labels and a title
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.title('Iris training data colored by class')

# Show the plot
plt.show()
Output: a scatter plot of the Iris training data, colored by class, is displayed.

Result:
The program to implement the k-NN algorithm was implemented and executed
successfully; the classifier achieved an accuracy of 1.0 on the Iris test set.
