02-Linear Regression

The document provides an overview of linear regression, a supervised learning technique used to predict continuous outcomes by estimating relationships among variables. It discusses the basic elements, notations, loss functions, optimization methods, and the distinction between regression and classification. Key concepts include the linear model, the goal of minimizing loss through gradient descent, and the use of cross-entropy loss in classification problems.

CSD456: Deep Learning

Linear Regression

Regression

• Regression is a type of supervised learning used to predict continuous outcomes.
• It estimates the relationships among variables.
• Common in various fields like finance, biology, and economics.

Examples
• Predicting house prices based on features like size, location.
• Forecasting stock prices using historical data.
• Estimating a person's weight based on height and age.
Basic Elements of Linear Regression
● Assumption:
  ○ Linear relationship between the independent variables x and the dependent
    variable y
  ○ y can be expressed as a weighted sum of the elements in x, given noise on
    the observations
  ○ The noise is well behaved (follows a Gaussian distribution)
● We need:
  ○ A training dataset or training set
  ○ Rows are referred to as examples, data points, data instances, or samples
  ○ The dependent variable is called the label or target
  ○ The independent variables are called the features or covariates
Notations

● n denotes the number of examples
● Index the data examples: $\mathbf{x}^{(i)} = \left[x_1^{(i)}, x_2^{(i)}\right]^{\top}$
● Corresponding labels: $y^{(i)}$

● Example:
  ○ We would like to estimate the price of houses based on area and age
  ○ We want to develop a predictive model for predicting house prices

    area   age   price
    ----   ---   -----
      15    10   15000
      25    15   25000
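For concreteness, the toy table above can be stored as a feature matrix and a label vector; a minimal sketch using NumPy (the slides themselves show no code):

```python
import numpy as np

# Feature matrix X: one row per example, one column per feature (area, age)
X = np.array([[15.0, 10.0],
              [25.0, 15.0]])

# Label vector y: the target (price) of each example
y = np.array([15000.0, 25000.0])

print(X.shape)  # (2, 2): n = 2 examples, d = 2 features
print(y.shape)  # (2,)
```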
Linear Model

Based on the linearity assumption, we say that the target is a weighted sum of
the features plus a translation (bias):

● Weights: the influence of the features on the predictions
● Bias: the value the prediction takes when the features are all 0
Linear Model

● Goal: choose weights and a bias such that, on average, the predictions of the
  model fit the true prices observed in the data
● Linear models rely on the affine transformation specified by the chosen
  weights and bias
● So, generally speaking, we have

  $\hat{y} = w_1 x_1 + \dots + w_d x_d + b$

● Compact form using vectors:

  $\hat{y} = \mathbf{w}^{\top} \mathbf{x} + b$

Linear Regression

Linear Regression (1D input): $\hat{y}^{(i)} = w\, x^{(i)} + b$

For more than 1D input: $\hat{y}^{(i)} = \mathbf{w}^{\top} \mathbf{x}^{(i)} + b$, with $\mathbf{x}^{(i)} \in \mathbb{R}^{d}$, $b \in \mathbb{R}$

• Linear Regression is one of the simplest and most widely used regression
  techniques.
• It models the relationship between input features (independent variables) and
  a continuous output (dependent variable) using a linear function.
• The objective is to find the line (or hyperplane in higher dimensions) that
  best fits the data.

[Figure: a fitted line $y = wx + b$ in the x-y plane; it crosses the x-axis at $-b/w_1$]
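To illustrate the affine prediction $\hat{y}^{(i)} = \mathbf{w}^{\top}\mathbf{x}^{(i)} + b$ above, a minimal NumPy sketch; the weights and bias below are hypothetical values for the two-feature house example, not fitted parameters:

```python
import numpy as np

def predict(X, w, b):
    """Affine prediction y_hat = X @ w + b for a batch of examples.

    X: (n, d) feature matrix, w: (d,) weight vector, b: scalar bias.
    """
    return X @ w + b

# Hypothetical (not fitted) parameters for the two-feature house example
w = np.array([900.0, 150.0])
b = 0.0
X = np.array([[15.0, 10.0], [25.0, 15.0]])
print(predict(X, w, b))  # [15000. 24750.] -- one prediction per row of X
```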
Loss Function

● A function that quantifies the difference between the real and predicted
  values of the target
● The smaller the value of the loss, the better
● A popular loss function: squared error
● The empirical error is a function of the parameters:

  $l^{(i)}(\mathbf{w}, b) = \frac{1}{2}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$, where $\hat{y}^{(i)} = \mathbf{w}^{\top} \mathbf{x}^{(i)} + b$

  ○ $\hat{y}^{(i)}$ is the estimation and $y^{(i)}$ the observation
  ○ The factor $\frac{1}{2}$ cancels when we take the derivative of the loss
Loss Function
The total loss averages the per-example losses over the n training examples
(n being the number of examples):

$L(\mathbf{w}, b) = \frac{1}{n}\sum_{i=1}^{n} l^{(i)}(\mathbf{w}, b) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\left(\mathbf{w}^{\top} \mathbf{x}^{(i)} + b - y^{(i)}\right)^2$

● When training the model, we seek parameters that minimize the total loss
  across all training examples
● Note: the superscript $(i)$ indicates an operation applied to a single example
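A minimal NumPy sketch of the per-example squared loss $l^{(i)}$ and the averaged total loss $L(\mathbf{w}, b)$ above, evaluated on the toy house data with the same hypothetical parameters as before:

```python
import numpy as np

def squared_loss(y_hat, y):
    """Per-example squared error l = (1/2) * (y_hat - y)^2."""
    return 0.5 * (y_hat - y) ** 2

def total_loss(X, y, w, b):
    """Average loss L(w, b) over all n examples."""
    y_hat = X @ w + b
    return squared_loss(y_hat, y).mean()

# Toy check: house data with the hypothetical parameters from before
X = np.array([[15.0, 10.0], [25.0, 15.0]])
y = np.array([15000.0, 25000.0])
w, b = np.array([900.0, 150.0]), 0.0
print(total_loss(X, y, w, b))  # 15625.0 -- nonzero: the line is not a perfect fit
```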
Optimization

● We seek to iteratively reduce the error of the model and improve its quality by
  ○ updating the parameters in the direction that incrementally lowers the
    loss function
● We use the gradient descent algorithm to achieve this
● We could directly take the derivative of the average loss on the entire dataset
  ○ This requires a pass over the entire dataset before making a single update
● A better solution is minibatch stochastic gradient descent:
  ○ Sample a random minibatch of examples, and
  ○ Take the derivative of the average loss on the minibatch with respect to
    the model parameters, then compute the update
Minibatch Stochastic Gradient Descent

$(\mathbf{w}, b) \leftarrow (\mathbf{w}, b) - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \partial_{(\mathbf{w}, b)}\, l^{(i)}(\mathbf{w}, b)$

● $\frac{1}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \partial_{(\mathbf{w}, b)}\, l^{(i)}(\mathbf{w}, b)$: the partial derivative of the
  average loss of the minibatch
● $\eta$: the term multiplied with the gradient (the learning rate)
● $|\mathcal{B}|$: the minibatch size
● Subtract the result from the current parameters
Optimization Algorithm

● Initialize the model parameters (typically at random)
● Sample random minibatches
● Update the parameters in the direction of the negative gradient:

  $(\mathbf{w}, b) \leftarrow (\mathbf{w}, b) - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \partial_{(\mathbf{w}, b)}\, l^{(i)}(\mathbf{w}, b)$
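Putting the pieces together, a minimal NumPy sketch of the full training loop; the synthetic data, learning rate, batch size, and epoch count are illustrative choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X @ true_w + true_b + Gaussian noise
n, d = 1000, 2
true_w, true_b = np.array([2.0, -3.4]), 4.2
X = rng.normal(size=(n, d))
y = X @ true_w + true_b + rng.normal(scale=0.01, size=n)

# Initialize parameters (typically at random)
w, b = rng.normal(scale=0.01, size=d), 0.0
lr, batch_size, epochs = 0.03, 10, 3

for epoch in range(epochs):
    # Sample random minibatches by shuffling the indices each epoch
    for idx in np.array_split(rng.permutation(n), n // batch_size):
        Xb, yb = X[idx], y[idx]
        err = Xb @ w + b - yb            # prediction error on the minibatch
        # Gradients of the average squared loss (1/2)(y_hat - y)^2
        grad_w = Xb.T @ err / len(idx)
        grad_b = err.mean()
        # Subtract the scaled gradient from the current parameters
        w -= lr * grad_w
        b -= lr * grad_b

print(w, true_w)  # learned weights should approach the true weights
print(b, true_b)
```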
Prediction using Linear Regression Model

● We adjust hyperparameters (e.g., the learning rate), assessed on a validation set
● We aim to find parameters that achieve low loss on unseen data
  ○ Also referred to as generalization (discussed later on)
● Given those learned parameters, it is now possible to estimate targets given
  the features of a new instance
● We use the squared loss to quantify the goodness/badness of the model
  ○ Using maximum likelihood estimation principles, we arrive at the negative
    log-likelihood
Normal Distribution and Squared loss

Linear regression with the squared loss can be motivated by assuming that the
observations arise from noisy measurements with Gaussian noise:

$y = \mathbf{w}^{\top} \mathbf{x} + b + \epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma^2)$

Hence, the likelihood of observing $y$ for a given $\mathbf{x}$ is

$P(y \mid \mathbf{x}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}\left(y - \mathbf{w}^{\top} \mathbf{x} - b\right)^2\right)$
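Following from the Gaussian likelihood above, a short sketch of the standard derivation connecting maximum likelihood to the squared loss:

```latex
% Assuming independent observations, maximizing the likelihood is equivalent
% to minimizing the negative log-likelihood over the dataset:
-\log P(\mathbf{y} \mid \mathbf{X})
  = \sum_{i=1}^{n} \left( \frac{1}{2}\log\left(2\pi\sigma^{2}\right)
  + \frac{1}{2\sigma^{2}} \left( y^{(i)} - \mathbf{w}^{\top}\mathbf{x}^{(i)} - b \right)^{2} \right)
% The first term and the factor 1/sigma^2 do not depend on (w, b), so
% minimizing the negative log-likelihood is the same as minimizing the sum of
% squared errors, i.e., training with the squared loss.
```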
Linear regression as single-layer neural network

[Figure: linear regression depicted as a single-layer neural network. Source: d2l.ai]


Classification
● Classification aims to predict a category from a set of categories (e.g., cats vs.
  dogs, positive vs. negative)
● Hard assignment of examples to categories (classes)
● Soft assignments: assess probabilities (discussed later on)
● We want a model that estimates the conditional probabilities of all possible
  classes
  ○ A model with multiple outputs (one per class)
  ○ Goal: optimize our parameters to produce probabilities that maximize the
    likelihood of the observed data
  ○ One main approach is softmax regression (sketched below)
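To make the multi-output idea concrete, a minimal NumPy sketch of the softmax function used by softmax regression to turn one raw score per class into probabilities; the scores are made-up values:

```python
import numpy as np

def softmax(logits):
    """Map raw per-class scores to probabilities that sum to 1.

    Subtracting the max first is a standard numerical-stability trick;
    it does not change the result.
    """
    z = logits - logits.max()
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

scores = np.array([2.0, 0.5, -1.0])  # hypothetical model outputs, one per class
probs = softmax(scores)
print(probs, probs.sum())  # approx. [0.79 0.18 0.04], summing to 1.0
```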
Cross-entropy loss

● Used to measure the quality of predicted probabilities
● A common loss function used in classification problems
● Computes the expected value of the loss for a distribution over labels
● Concretely, the cross-entropy objective
  ○ Maximizes the likelihood of the observed data
  ○ Measures the difference between two probability distributions
  ○ Minimizes the surprisal required to communicate the labels (refer to
    information theory); see the sketch below
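As an illustration, a minimal NumPy sketch of the cross-entropy loss for a single example with a hard (one-hot) label; the predicted probabilities reuse the hypothetical softmax output above:

```python
import numpy as np

def cross_entropy(probs, label):
    """Cross-entropy loss -log p(true class) for one example.

    probs: predicted class probabilities, label: index of the true class.
    """
    return -np.log(probs[label])

probs = np.array([0.7856, 0.1753, 0.0391])  # hypothetical predictions
print(cross_entropy(probs, 0))  # small loss: the true class got high probability
print(cross_entropy(probs, 2))  # large loss: the true class got low probability
```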
