PATTERN RECOGNITION

SUPERVISED LEARNING: CLASSIFICATION
LEARNING A CLASS FROM EXAMPLES
Class C of a “family car”
Prediction: Is car x a family car?
Knowledge extraction: What do people expect from a family car?
Positive (+) and negative (–) examples
Input representation: x1: price, x2: engine power
TRAINING SET X
For each car, the input is $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and the label is
$$r = \begin{cases} 1 & \text{if } \mathbf{x} \text{ is positive} \\ 0 & \text{if } \mathbf{x} \text{ is negative} \end{cases}$$
For N training examples:
$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_{t=1}^{N}$$
CLASS C
$(p_1 \le \text{price} \le p_2)$ AND $(e_1 \le \text{engine power} \le e_2)$
For suitable values of $p_1, p_2, e_1$ and $e_2$, class C is defined by a rectangle in the price–engine power space.
CLASS C
$(p_1 \le \text{price} \le p_2)$ AND $(e_1 \le \text{engine power} \le e_2)$
This expression fixes the hypothesis class H – the set of axis-aligned rectangles.
The learning algorithm finds a particular hypothesis h ∈ H that approximates C as closely as possible.
The expert defines the hypothesis class; the algorithm finds the parameters.
HYPOTHESIS CLASS H
$$h(\mathbf{x}) = \begin{cases} 1 & \text{if } h \text{ classifies } \mathbf{x} \text{ as positive} \\ 0 & \text{if } h \text{ classifies } \mathbf{x} \text{ as negative} \end{cases}$$
Training error: the predictions of h which do not match the required values in $\mathcal{X}$:
$$E(h \mid \mathcal{X}) = \sum_{t=1}^{N} \mathbf{1}\bigl(h(\mathbf{x}^t) \neq r^t\bigr)$$
HYPOTHESIS CLASS H – How to read?
$$E(h \mid \mathcal{X}) = \sum_{t=1}^{N} \mathbf{1}\bigl(h(\mathbf{x}^t) \neq r^t\bigr)$$
Read as: the error of hypothesis h given the training set $\mathcal{X}$.
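A minimal sketch of the rectangle hypothesis and its training error, assuming NumPy; the price/engine-power bounds and the tiny training set are made-up values for illustration, not from the slides:

```python
import numpy as np

def h(x, p1, p2, e1, e2):
    """Rectangle hypothesis: 1 if price and engine power fall inside the rectangle."""
    price, power = x
    return int(p1 <= price <= p2 and e1 <= power <= e2)

def training_error(X, r, p1, p2, e1, e2):
    """E(h | X): count of training examples whose prediction does not match the label."""
    return sum(h(x, p1, p2, e1, e2) != label for x, label in zip(X, r))

# Hypothetical training set: (price, engine power) pairs with 1 = family car, 0 = not
X = np.array([[15000, 150], [32000, 210], [18000, 130], [45000, 320]])
r = np.array([1, 1, 1, 0])
print(training_error(X, r, p1=10000, p2=35000, e1=100, e2=250))  # -> 0
```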
S, G, AND THE VERSION SPACE
Most specific hypothesis, S
Most general hypothesis, G
Any h ∈ H between S and G is consistent; these hypotheses make up the version space.
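As one concrete piece of this, a sketch of how the most specific hypothesis S could be computed for the rectangle class: the tightest axis-aligned rectangle around the positive examples (G, the most general consistent rectangle, would also need the negatives). This is an illustrative sketch, not a procedure given in the slides:

```python
import numpy as np

def most_specific_hypothesis(X, r):
    """S: the tightest axis-aligned rectangle containing every positive example."""
    positives = X[r == 1]
    p1, e1 = positives.min(axis=0)   # lower-left corner (min price, min engine power)
    p2, e2 = positives.max(axis=0)   # upper-right corner (max price, max engine power)
    return p1, p2, e1, e2
```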
MULTIPLE CLASSES
Classes $C_i$, $i = 1, \dots, K$
$$\mathcal{X} = \{\mathbf{x}^t, \mathbf{r}^t\}_{t=1}^{N}, \qquad r_i^t = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \neq i \end{cases}$$
Train K hypotheses $h_i(\mathbf{x})$, $i = 1, \dots, K$:
$$h_i(\mathbf{x}^t) = \begin{cases} 1 & \text{if } \mathbf{x}^t \in C_i \\ 0 & \text{if } \mathbf{x}^t \in C_j,\ j \neq i \end{cases}$$
MULTIPLE CLASSES
Classes $C_i$, $i = 1, \dots, K$
A K-class problem is treated as K two-class problems.
Example: positive examples for the class “Luxury Sedan”; ALL the rest are negative examples.
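A minimal one-vs-rest sketch of this idea; the slides do not fix a base classifier, so scikit-learn's LogisticRegression is used here purely as a stand-in two-class learner:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_one_vs_rest(X, y, num_classes):
    """Train K two-class hypotheses h_i: positive for class i, negative for all others."""
    hypotheses = []
    for i in range(num_classes):
        r_i = (y == i).astype(int)              # r_i^t = 1 if x^t in C_i, else 0
        hypotheses.append(LogisticRegression().fit(X, r_i))
    return hypotheses

def predict(hypotheses, x):
    """Assign x to the class whose hypothesis is most confident."""
    scores = [h_i.predict_proba([x])[0, 1] for h_i in hypotheses]
    return int(np.argmax(scores))
```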
LINEAR REGRESSION
EXAMPLE
David Beckham: 1.83 m; Brad Pitt: 1.83 m; George Bush: 1.81 m
Victoria Beckham: 1.68 m; Angelina Jolie: 1.70 m; Laura Bush: ?
Goal: predict the height of the wife in a couple, based on the husband’s height.
Response (outcome or dependent) variable (Y): height of the wife
Predictor (explanatory or independent) variable (X): height of the husband
WHAT IS LINEAR
Remember this?
WHAT IS LINEAR
A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.
EXAMPLE
Dataset giving the living areas and prices of 50 houses
EXAMPLE
We can plot this data.
Given data like this, how can we learn to predict the prices of other houses as a function of the size of their living areas?
NOTATIONS
The “input” variables – $x^{(i)}$ (living area in this example)
The “output” or target variable that we are trying to predict – $y^{(i)}$ (price)
A pair $(x^{(i)}, y^{(i)})$ is called a training example
A list of m training examples $\{(x^{(i)}, y^{(i)});\ i = 1, \dots, m\}$ is called a training set
X denotes the space of input values, and Y the space of output values
REGRESSION
Given a training set, the goal is to learn a function h : X → Y so that h(x) is a “good” predictor for the corresponding value of y. For historical reasons, this function h is called a hypothesis.
CHOICE OF HYPOTHESIS
Decision: how to represent the hypothesis h
For linear regression we assume that the hypothesis is linear:
$$h(x) = \theta_0 + \theta_1 x$$
HYPOTHESIS
Generally we’ll have more than one input feature, e.g. x1 = living area, x2 = # of bedrooms:
$$h(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$
HYPOTHESIS
$$h(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$
To show the dependence on θ:
$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 \quad \text{or} \quad h(x \mid \theta) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$
This is the price that the hypothesis predicts for a given house with living area x1 and number of bedrooms x2.
HYPOTHESIS
$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$$
For conciseness, define $x_0 = 1$:
$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 = \sum_{i=0}^{2} \theta_i x_i$$
For n features:
$$h_\theta(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$$
The θs are called the parameters and are real numbers. The job of the learning algorithm is to find, or learn, these parameters.
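A small sketch of this hypothesis in code, with $x_0 = 1$ prepended for the intercept; the θ values and the example house below are made up for illustration:

```python
import numpy as np

def h(theta, x):
    """Linear hypothesis h_theta(x) = theta^T x, with x_0 = 1 prepended."""
    x = np.concatenate(([1.0], x))    # define x0 = 1
    return theta @ x

# Hypothetical parameters theta_0, theta_1, theta_2 and a house (living area, bedrooms)
theta = np.array([50.0, 0.1, 20.0])
print(h(theta, np.array([2104.0, 3.0])))  # predicted price: 50 + 0.1*2104 + 20*3 = 320.4
```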
CHOOSING THE REGRESSION LINE
Which of these lines to choose?
[Figure: two candidate lines $y = h_\theta(x) = \theta_0 + \theta_1 x$ fit to the same data]
CHOOSING THE REGRESSION LINE
Consider a point $x_i$. The predicted value is:
$$\hat{y}_i = h_\theta(x_i) = \theta_0 + \theta_1 x_i$$
The true value for $x_i$ is $y_i$, so the error or residual is $\hat{y}_i - y_i$.
CHOOSING THE REGRESSION LINE
How to choose the best-fit line, in other words how to choose the θs:
$$\min_\theta \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
Minimize the sum of the squared (why squared?) distances of the points $y_i$ from the line over the m training examples.
CHOOSING THE REGRESSION LINE
$$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
The sum runs over the m training examples; each term is the difference between what the hypothesis predicted and the actual value, squared so that we don’t get negative values; the factor 1/2 simplifies the calculations.
Find the θ which minimizes this expression.
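A minimal sketch of this cost function, assuming NumPy and a design matrix X with one column per input feature:

```python
import numpy as np

def J(theta, X, y):
    """Cost J(theta) = 1/2 * sum over m examples of (h_theta(x) - y)^2."""
    X1 = np.column_stack([np.ones(len(X)), X])   # prepend x0 = 1 for theta_0
    residuals = X1 @ theta - y                   # h_theta(x^(i)) - y^(i)
    return 0.5 * np.sum(residuals ** 2)
```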
GRADIENT DESCENT
Goal: $\min_\theta J(\theta)$
Choose initial values of θ0 and θ1 and keep moving in the direction of steepest descent.
[Figure: the surface of J(θ) plotted over θ0 and θ1]
GRADIENT DESCENT
Choose initial values of θ0 and θ1 and keep moving in the direction of steepest descent.
The step size is controlled by a parameter called the learning rate.
The starting point is important.
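A minimal batch gradient descent sketch for the linear hypothesis above; the learning rate, iteration count, and tiny dataset are illustrative choices, not values from the slides:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.05, iterations=5000):
    """Batch gradient descent on J(theta); alpha is the learning rate (step size)."""
    X1 = np.column_stack([np.ones(len(X)), X])   # x0 = 1 for the intercept theta_0
    theta = np.zeros(X1.shape[1])                # starting point: all zeros
    for _ in range(iterations):
        gradient = X1.T @ (X1 @ theta - y)       # gradient of J = 1/2 * sum of squared errors
        theta -= alpha * gradient                # step in the direction of steepest descent
    return theta

# Hypothetical 1-D data lying roughly on the line y = 1 + 2x
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.1, 4.9, 7.2])
print(gradient_descent(X, y))  # -> approximately [0.99, 2.04]
```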
MODEL SELECTION
Life is not as simple as the linear model $g(x) = w_1 x + w_0$.
Non-linear regression uses a higher-order polynomial, e.g. $g(x) = w_2 x^2 + w_1 x + w_0$.
MODEL SELECTION
Inductive bias: the set of assumptions we make to make learning possible is called the inductive bias of the learning algorithm.
Examples:
Choosing the hypothesis class – rectangles
Regression – assuming the function is linear
Learning requires choosing a bias. How do we choose the right bias? Model selection.
GENERALIZATION
Generalization: how well a model performs on new data
Overfitting: the chosen hypothesis is too complex, e.g. fitting a 3rd-order polynomial to linear data
Underfitting: the chosen hypothesis is too simple, e.g. fitting a line to a quadratic function
CROSS VALIDATION
To estimate generalization error, we need data unseen during training. We split the data as:
Training set (50%)
Validation set (25%)
Test (publication) set (25%)
Choose the hypothesis that is best on the validation set – cross validation.
CROSS VALIDATION
Example: find the right order of polynomial in regression.
Use the training set to estimate the coefficients.
Calculate the errors on the validation set.
Choose the order with the least validation error.
Question: what is the expected error of the chosen model?
We can NOT use the validation error: the validation data has been used to choose the model, so it is effectively part of training.
Use the TEST data set, as sketched below.
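A minimal sketch of this selection procedure, assuming NumPy and using np.polyfit/np.polyval for the polynomial fits; the 50/25/25 split follows the slides, while the seed and maximum order are illustrative choices:

```python
import numpy as np

def select_polynomial_order(x, y, max_order=5, seed=0):
    """Pick the polynomial order with the smallest validation error,
    then report the test error of that chosen model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train, n_val = len(x) // 2, len(x) // 4              # 50% train, 25% validation, 25% test
    train, val, test = idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

    best_order, best_val_err, best_coeffs = None, np.inf, None
    for order in range(1, max_order + 1):
        coeffs = np.polyfit(x[train], y[train], order)     # estimate coefficients on training data
        val_err = np.mean((np.polyval(coeffs, x[val]) - y[val]) ** 2)
        if val_err < best_val_err:
            best_order, best_val_err, best_coeffs = order, val_err, coeffs

    test_err = np.mean((np.polyval(best_coeffs, x[test]) - y[test]) ** 2)  # error on unseen data
    return best_order, test_err
```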
SUMMARY
Model:
$$h_\theta(x) \quad \text{or} \quad h(x \mid \theta)$$
Loss function:
$$E(\theta \mid \mathcal{X}) = J(\theta) = \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2$$
Optimization:
$$\min_\theta E(\theta \mid \mathcal{X})$$
COVARIANCE
$$\mathrm{cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{X})(y_i - \bar{Y})}{n - 1}$$
cov(X,Y) > 0: X and Y are positively correlated
cov(X,Y) < 0: X and Y are negatively (inversely) correlated
cov(X,Y) = 0: X and Y are uncorrelated (zero covariance does not by itself imply independence)
CORRELATION COEFFICIENT
Pearson’s correlation coefficient is the standardized covariance (unitless):
$$r = \frac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}$$
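A minimal sketch of both formulas, assuming NumPy; the small dataset is made up for illustration:

```python
import numpy as np

def covariance(x, y):
    """Sample covariance: sum((x_i - x_bar)(y_i - y_bar)) / (n - 1)."""
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

def pearson_r(x, y):
    """Pearson's r: covariance standardized by the two standard deviations."""
    return covariance(x, y) / np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1))

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
print(pearson_r(x, y))  # close to +1 for this nearly linear data
```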
CORRELATION COEFFICIENT
Measures the relative strength of the linear relationship between two variables
Unit-less
Ranges between –1 and 1
The closer to –1, the stronger the negative linear relationship
The closer to +1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
CORRELATION COEFFICIENT
[Scatter plots of Y against X with r = –0.8, r = –0.6, r = +0.8, and r = +0.2]
CORRELATION COEFFICIENT
[Scatter plots contrasting strong and weak relationships]
ACKNOWLEDGEMENTS
Machine Intelligence, Dr M. Hanif, UET, Lahore
Machine Learning, Andrew Ng, Stanford University
Lecture Slides, Introduction to Machine Learning, E. Alpaydın, MIT Press.