
Machine Learning

Lecture 13: k-Nearest Neighbors


COURSE CODE: CSE451
2023
Course Teacher
Dr. Mrinal Kanti Baowaly
Associate Professor
Department of Computer Science and
Engineering, Bangabandhu Sheikh
Mujibur Rahman Science and
Technology University, Bangladesh.

Email: [email protected]
Instance Based Learning
 No model is learned.
 It does not learn from the training set immediately; instead, it stores the
dataset and uses it at prediction time (hence it is also called lazy learning).
 It classifies/predicts the test data based on its similarity to the
stored training data.
 Example: the kNN algorithm
What is k-Nearest Neighbors (kNN) learning?
 A type of instance-based learning in which an unknown object is
assigned the most common class among its k closest objects

Basic Idea: Analogy for kNN
 If it walks like a duck and quacks like a duck, then it’s probably a duck.
 [Figure: given a test record, compute its distance to all training
records, then choose the k “nearest” records.]
What are k-Nearest Neighbors?

[Figure: a test point X with (a) its 1-nearest neighbor, (b) 2-nearest
neighbors, and (c) 3-nearest neighbors highlighted.]

The k-nearest neighbors of a record x are the data points that have the
k smallest distances to x.
kNN algorithm
 To classify an unknown record (a minimal sketch follows this list):
― Compute its distance to all training records
― Identify the k nearest records (neighbors)
― Find the most common class among the k nearest neighbors and assign
that class to the unknown record
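
A minimal from-scratch sketch of this procedure in Python (the toy data
and the function name are illustrative, not from the lecture):

    import numpy as np
    from collections import Counter

    def knn_classify(X_train, y_train, x_test, k=3):
        """Classify one test point by majority vote of its k nearest neighbors."""
        # Compute the Euclidean distance from the test point to every training record
        distances = np.linalg.norm(X_train - x_test, axis=1)
        # Identify the indices of the k nearest records
        nearest = np.argsort(distances)[:k]
        # Assign the most common class among those k neighbors
        return Counter(y_train[nearest]).most_common(1)[0][0]

    # Toy data: two features, two classes
    X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
    y_train = np.array(["A", "A", "B", "B"])
    print(knn_classify(X_train, y_train, np.array([1.2, 1.9]), k=3))  # -> "A"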
Distance measures
 Euclidean distance: useful in low dimensions, but it does not work well
in high dimensions or for categorical variables.
 Hamming distance: calculates the distance between binary vectors.
 Manhattan distance: calculates the distance between real vectors as the
sum of the absolute differences of their components. Also called City
Block distance.
 Minkowski distance: a generalization of the Euclidean and Manhattan
distances.
Note: Both Euclidean and Manhattan distances are used for continuous
variables, whereas Hamming distance is used for categorical variables.

Detail: Distance Metrics in Machine Learning
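
A short sketch of these four measures (NumPy assumed; the variable names
are illustrative):

    import numpy as np

    def euclidean(a, b):
        # Square root of the sum of squared differences
        return np.sqrt(np.sum((a - b) ** 2))

    def manhattan(a, b):
        # Sum of absolute differences (City Block distance)
        return np.sum(np.abs(a - b))

    def hamming(a, b):
        # Number of positions at which two binary vectors differ
        return np.sum(a != b)

    def minkowski(a, b, p):
        # p=1 gives Manhattan distance, p=2 gives Euclidean distance
        return np.sum(np.abs(a - b) ** p) ** (1 / p)

    a, b = np.array([1.0, 2.0, 3.0]), np.array([4.0, 0.0, 3.0])
    print(euclidean(a, b), manhattan(a, b), minkowski(a, b, 2.0))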


How to choose the value of k?
 Choice of k is very critical – a small value of k means that noise will have
a higher influence on the result, while a large value makes it computationally
expensive and may defeat the basic philosophy behind kNN (that objects
that are near are likely to have similar classes).
 A simple approach is to select k = √n, where n is the number of samples
in the training data.
 Sometimes it is best to run through each possible value of k (e.g., start
with k=1 and then increase it) and pick the value of k that gives the best
performance on the training and test data, as sketched below.
 Choose an odd value of k for binary classification to avoid ties.
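
A sketch of that scan using scikit-learn (the dataset and the split
parameters are illustrative assumptions, not from the lecture):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    # Try each candidate k and keep the one with the best held-out accuracy
    best_k, best_acc = 1, 0.0
    for k in range(1, 16):
        acc = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr).score(X_te, y_te)
        if acc > best_acc:
            best_k, best_acc = k, acc
    print(best_k, best_acc)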
How to decide the class label?
 Take the majority vote of the class labels of the k nearest neighbors
 Alternatively, weigh each vote according to a distance weight factor,
w = 1/d², so that closer neighbors count more (see the sketch below)
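
A minimal sketch of distance-weighted voting (the function name and inputs
are illustrative; it assumes the labels and distances of the k nearest
neighbors have already been computed):

    from collections import defaultdict

    def weighted_vote(neighbor_labels, neighbor_distances):
        """Return the class whose summed weights w = 1/d^2 are largest."""
        scores = defaultdict(float)
        for label, d in zip(neighbor_labels, neighbor_distances):
            scores[label] += 1.0 / (d ** 2 + 1e-12)  # epsilon guards against d = 0
        return max(scores, key=scores.get)

    # A very close "L" neighbor can outvote two farther "M" neighbors
    print(weighted_vote(["M", "M", "L"], [0.5, 0.7, 0.1]))  # -> "L"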
Example of kNN
 Suppose you have the height, weight and T-shirt size of some
customers
 You need to predict the T-shirt size of a new customer named
‘Monica’, who has a height of 161 cm and a weight of 61 kg.

Detail: ListenData, Revoledu


Example of kNN
 Consider k=5
 Calculate the distance of every customer from Monica and rank the
customers by distance
 Find the 5 customers closest to Monica
 4 of them have a ‘Medium’ T-shirt size
and 1 has a ‘Large’ T-shirt size
 So Monica is assigned a ‘Medium’ T-shirt
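
A sketch of this example with scikit-learn; the customer records below are
made-up placeholders, not the actual data from the ListenData tutorial:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    # Hypothetical customers: [height (cm), weight (kg)] and T-shirt size
    X = np.array([[158, 58], [160, 60], [163, 61], [165, 64],
                  [168, 66], [170, 68], [173, 70], [175, 75]])
    y = np.array(["M", "M", "M", "M", "L", "L", "L", "L"])

    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X, y)
    print(knn.predict([[161, 61]]))  # predicted T-shirt size for Monica -> "M"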
Characteristics of kNN
 Non-parametric (i.e., it makes no assumptions about the underlying data)
 Lazy learner/instance-based
(see Eager vs. Lazy Learners; source: Datacamp)
 Very simple and easy to implement
 Minimal training but expensive testing
 Choosing the value of k is crucial
 Variables should be normalized/standardized, otherwise variables with
larger ranges can bias the distances (source: ListenData) – a scaling
sketch follows this list
 Susceptible to a high number of independent variables (the curse of
dimensionality)
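
A brief illustration of why scaling matters, using scikit-learn’s
StandardScaler (the data values are illustrative):

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    # Age (years) and income (USD) sit on very different scales; without
    # scaling, income would dominate any Euclidean distance between records.
    X = np.array([[25, 40000.0], [32, 52000.0], [47, 95000.0]])
    X_scaled = StandardScaler().fit_transform(X)  # zero mean, unit variance per column
    print(X_scaled)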
Some Learning Materials
Datacamp: KNN Classification using Scikit-learn
Javatpoint: K-Nearest Neighbor (KNN) Algorithm for Machine Learning
ListenData: K Nearest Neighbor: Step by Step Tutorial
AnalyticsVidhya: Introduction to k-Nearest Neighbors
