
Machine Learning

LECTURE – 03
K-NEAREST-NEIGHBOR
ALGORITHM
K Nearest Neighbor Classification
K-Nearest Neighbors, or KNN, is one of the simplest machine learning algorithms:
◦ easy to implement
◦ easy to understand

It is a supervised machine learning algorithm. This means we need a labeled reference dataset to predict the category/group of a future data point.
K Nearest Neighbor Classification
Once the KNN model is trained on the reference dataset, it classifies a new data point based on the points (neighbors) that are most similar to it. It uses the reference data to make an "educated guess" about the category to which the new data point should belong.

The KNN algorithm simply means: "I am as good as my K nearest neighbors."
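
To make this concrete, here is a minimal sketch of KNN classification using scikit-learn (not part of the original slides; the tiny dataset below is hypothetical, chosen only for illustration):

```python
# Minimal KNN sketch with scikit-learn; all data values here are hypothetical.
from sklearn.neighbors import KNeighborsClassifier

# Reference dataset: feature vectors with their known class labels.
X_train = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]
y_train = ["A", "A", "B", "B"]

knn = KNeighborsClassifier(n_neighbors=3)  # K = 3
knn.fit(X_train, y_train)                  # "training" just stores the data

# A new point takes the majority class of its 3 nearest neighbors.
print(knn.predict([[1.2, 1.9]]))           # -> ['A']
```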
Why Is KNN Called A "Lazy Learner" Or A "Lazy Algorithm"?
KNN is called a lazy learner because, when we supply training data to the algorithm, it does not train itself at all.

KNN does not learn a discriminative function from the training data; instead, it memorizes the entire training dataset.

There is no training time in KNN.
K Nearest Neighbor Classification (Lazy Learner)
Each time a new data point comes in and we want to make a prediction, the KNN algorithm searches for the nearest neighbors in the entire training set.

Hence the prediction step becomes more time-consuming and computationally expensive.
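
This lazy behavior can be sketched in a few lines of plain Python (a hypothetical minimal implementation, not the lecture's code): fit() only memorizes the data, and all the distance work happens inside predict():

```python
import math
from collections import Counter

class LazyKNN:
    """Hypothetical minimal KNN classifier: zero work at training time."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        # "Training" is just memorizing the reference dataset.
        self.X, self.y = X, y

    def predict(self, point):
        # All the work happens here: scan the ENTIRE training set,
        # sort by Euclidean distance, and take a majority vote.
        ranked = sorted((math.dist(point, x), label)
                        for x, label in zip(self.X, self.y))
        top_k = [label for _, label in ranked[:self.k]]
        return Counter(top_k).most_common(1)[0][0]
```

Because predict() compares the query against every stored example, each prediction costs one distance computation per training point, which is exactly why the prediction step is the expensive part.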
KNN CLASSIFICATION APPROACH
KNN CLASSIFICATION ALGORITHM
The table on the slide represents our data set. It has two feature columns, Brightness and Saturation, and each row has a class of either Red or Blue.

Before we introduce a new data entry, let's assume the value of K is 5.
How to Calculate Euclidean Distance in the K-Nearest Neighbors Algorithm
Here's the new data entry:

To find its class, we calculate the distance from the new entry to each entry in the data set using the Euclidean distance formula.
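
For reference (a standard formula, not reproduced on the slide), the Euclidean distance between two points (x₁, y₁) and (x₂, y₂) is:

d = √((x₂ − x₁)² + (y₂ − y₁)²)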
Here's what the table will look like after all the
distances have been calculated:
Since we chose 5 as the value of K, we'll only
consider the closest five records. That is:
As you can see above, the majority class within the 5
nearest neighbors to the new entry is Red. Therefore,
we'll classify the new entry as Red.
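
The whole worked example can be sketched in Python. Since the slide's table is an image whose exact numbers are not reproduced here, the (Brightness, Saturation) values below are hypothetical stand-ins that produce the same outcome:

```python
import math
from collections import Counter

# Hypothetical (Brightness, Saturation) -> class records, standing in
# for the table on the slide.
data = [
    ((40, 20), "Red"), ((50, 50), "Blue"), ((60, 90), "Blue"),
    ((10, 25), "Red"), ((70, 70), "Blue"), ((60, 10), "Red"),
    ((25, 80), "Blue"),
]
new_entry = (20, 35)
K = 5

# Euclidean distance from the new entry to every record, sorted ascending.
ranked = sorted((math.dist(new_entry, point), label) for point, label in data)

# Majority vote among the K closest records.
top_k = [label for _, label in ranked[:K]]
print(Counter(top_k).most_common(1)[0][0])  # -> Red (3 of the 5 nearest are Red)
```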
KNN - EXAMPLE
HOW TO SELECT THE VALUE OF K
There is no single way of choosing the value of K, but here are some common conventions to keep in mind:

◦ Choosing a very low value will most likely lead to inaccurate predictions.
◦ The most commonly used value of K is 5.
◦ Prefer an odd value of K, so that (with two classes) majority votes cannot end in a tie.

HOW TO SELECT THE VALUE OF K
There is no particular way to determine the best value of K, so we need to try several values and find the best of them. One common procedure (a code sketch follows this list):

For each of the N training examples:
◦ find its K nearest neighbours
◦ make a classification based on these K neighbours
◦ calculate the classification error

Output the average error over all N examples, and use the K that gives the lowest average error over the N training examples.
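
A hedged sketch of this selection loop, swapping the per-example error estimate above for scikit-learn's cross-validation (X and y stand for any labeled dataset; none of this code is from the original slides):

```python
# Hypothetical sketch: pick K by lowest average classification error.
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def best_k(X, y, candidates=(1, 3, 5, 7, 9, 11)):
    errors = {}
    for k in candidates:
        knn = KNeighborsClassifier(n_neighbors=k)
        # Mean accuracy over 5 folds; error = 1 - accuracy.
        errors[k] = 1 - cross_val_score(knn, X, y, cv=5).mean()
    return min(errors, key=errors.get)  # K with the lowest average error
```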
PROS AND CONS OF KNN
Pros
◦ Simple to understand and easy to implement.
◦ No training time: the model just memorizes the reference dataset.

Cons
◦ Prediction is slow and computationally expensive, since every query is compared against the entire training set.
◦ Results depend heavily on a good choice of K.
THANK YOU!
