KNN

The document discusses the k-Nearest Neighbors (kNN) classification algorithm, which is a supervised learning approach used to classify data based on input features. It outlines the steps involved in the kNN algorithm, including selecting the number of neighbors, calculating distances, and assigning categories based on the majority vote of neighbors. Additionally, it provides an example of applying kNN to determine the acceptance of a new product based on its attributes and the classification of similar products.

Uploaded by

watches1432

kNN

Classification Algorithm
• In machine learning and statistics, classification
is a supervised learning approach
• A computer program learns from the input data
given to it and then uses this learning to classify
new observations
• The data set may simply be bi-class (like
identifying whether a person is male or
female, or whether a mail is spam or non-spam)
• It may be multi-class too
Examples of classification
Examples of classification problems are:
• speech recognition
• handwriting recognition
• biometric identification
• document classification
Classification Algorithms
• Linear Classifiers: Logistic Regression, Naive
Bayes Classifier
• k-Nearest Neighbor (kNN)
• Support Vector Machines
• Decision Trees (IG, ID3)
• Boosted Trees
• Random Forest
• Neural Networks
The k-Nearest-Neighbours algorithm
• Step-1: Select the number K of
neighbours.
• Step-2: Calculate the Euclidean distance from the
new data point to each training data point.
• Step-3: Take the K nearest neighbours as per
the calculated Euclidean distances.
• Step-4: Among these K neighbours, count the
number of data points in each category.
• Step-5: Assign the new data point to the
category for which the number of
neighbours is maximum.
• Step-6: Our model is ready.
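
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `knn_classify`, the toy points and the labels "A" and "B" are invented for this example. If the vote is tied, the category met first among the sorted neighbours is returned.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, new_point, k):
    """Classify new_point by majority vote among its k nearest training points."""
    # Step-2: Euclidean distance from the new point to every training point
    distances = [
        (math.dist(p, new_point), label)
        for p, label in zip(train_points, train_labels)
    ]
    # Step-3: keep the k nearest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step-4: count the data points in each category
    votes = Counter(label for _, label in nearest)
    # Step-5: the category with the most neighbours wins
    return votes.most_common(1)[0][0]

# Hypothetical toy data (assumption): two clusters labelled "A" and "B"
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify(points, labels, (2, 2), k=3))
```

Note that nothing is fitted in advance: the whole training set is kept and all the work happens when a new point is classified.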
K-NN Algorithm
[Figure: K-NN illustrations for K=4 and K=3]
• Choosing K: a starting point is the total number of
categories in the target feature + 1
• In real applications, K is chosen iteratively by comparing
accuracy across values of K (Elbow curve plot)
Validation Error Curve (Elbow curve plot)
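
One way to obtain such a validation error curve is to hold out part of the labelled data and measure the error for each candidate K. The sketch below assumes a simple hold-out split and invented toy data; `predict` and `validation_error` are hypothetical helper names, and the plotting step is left out.

```python
import math
from collections import Counter

def predict(train, query, k):
    """train is a list of (point, label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def validation_error(train, validation, k):
    """Fraction of validation rows that are misclassified for a given k."""
    wrong = sum(1 for point, label in validation
                if predict(train, point, k) != label)
    return wrong / len(validation)

# Invented toy data (assumption): two categories, one noisy "A" point at (5, 5)
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"), ((5, 5), "A"),
         ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"), ((7, 7), "B")]
validation = [((2, 2), "A"), ((5, 6), "B"), ((6, 5), "B"), ((1, 3), "A")]

# Validation error for each candidate K
errors = {k: validation_error(train, validation, k) for k in range(1, 8)}
for k, err in errors.items():
    print(f"K={k}: validation error = {err:.2f}")

# Smallest K that reaches the lowest validation error
best_k = min(errors, key=errors.get)
print("Chosen K:", best_k)
```

Plotting `errors` against K gives the curve; the "elbow" is the point after which a larger K no longer reduces the error noticeably.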
Pseudo Code of KNN
• Load the data
• Initialise the value of k
• To get the predicted class, iterate from 1
to the total number of training data points
– Calculate the distance between the test data and each
row of the training data.
– Here we use Euclidean distance as the
distance metric, since it is the most popular
choice.
– The other metrics that can be used are
Manhattan, Minkowski, Chebyshev, cosine, etc.
– Sort the calculated distances in ascending order
based on distance values
– Get top k rows from the sorted array
– Get the most frequent class of these rows
– Return the predicted class
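• Euclidean distance between two points p and q with n attributes:
d(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)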
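
For comparison, the distance metrics named in the pseudo code can be written as small Python functions. This is an illustrative sketch: the function names and the example points are assumptions, and cosine is given here as a distance (1 minus the cosine similarity), which is one common convention.

```python
import math

def euclidean(p, q):
    """Straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute differences per attribute."""
    return sum(abs(a - b) for a, b in zip(p, q))

def minkowski(p, q, r):
    """Order-r Minkowski distance; r=1 is Manhattan, r=2 is Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)

def chebyshev(p, q):
    """Largest absolute difference over all attributes."""
    return max(abs(a - b) for a, b in zip(p, q))

def cosine_distance(p, q):
    """1 - cosine similarity; not defined if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return 1 - dot / (norm_p * norm_q)

# Example points (assumption, chosen only for illustration)
p, q = (7, 4), (3, 7)
print(euclidean(p, q), manhattan(p, q), chebyshev(p, q))
```

Any of these functions can replace the Euclidean distance in the distance-calculation step without changing the rest of the procedure.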
DATASET
The acceptance and rejection of a startup company's products
is tabulated in the table given below. Advise the company
whether the new product Prod5, with Attrib1=3 and
Attrib2=7, will be accepted or rejected, using a similarity-based
learning algorithm (kNN) that considers the 2 nearest neighbours.

Product   Attrib1   Attrib2   Status   Result (Euclidean distance to Prod5)
Prod1     7         7         Reject   4
Prod2     7         4         Reject   5
Prod3     3         4         Good     3
Prod4     1         4         Good     3.6
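
The values in the Result column can be checked, and the 2-nearest-neighbour decision worked out, with a short script. This is a sketch under the assumption that Result holds the Euclidean distance from each product to Prod5 (the tabulated values agree with that reading); the variable names are illustrative.

```python
import math
from collections import Counter

# Rows copied from the table: (product, Attrib1, Attrib2, status)
products = [
    ("Prod1", 7, 7, "Reject"),
    ("Prod2", 7, 4, "Reject"),
    ("Prod3", 3, 4, "Good"),
    ("Prod4", 1, 4, "Good"),
]
prod5 = (3, 7)  # Attrib1=3, Attrib2=7
k = 2           # number of nearest neighbours to consider

# Euclidean distance from every product to Prod5
distances = [
    (name, math.dist((a1, a2), prod5), status)
    for name, a1, a2, status in products
]
for name, d, status in distances:
    print(f"{name}: distance = {d:.1f} ({status})")

# Keep the k nearest products and take the majority status
nearest = sorted(distances, key=lambda row: row[1])[:k]
prediction = Counter(status for _, _, status in nearest).most_common(1)[0][0]
print("Nearest:", [name for name, _, _ in nearest], "->", prediction)
```

With K=2 the nearest neighbours are Prod3 (distance 3) and Prod4 (distance about 3.6). Both have status Good, so under this reading Prod5 is classified as Good, that is, accepted.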
