Machine Learning
CSN-382 (Lecture 3)
Dr. R. Balasubramanian
Professor
Department of Computer Science and Engineering
Mehta Family School of Data Science and Artificial Intelligence
Indian Institute of Technology Roorkee
Roorkee 247 667
[email protected]
https://faculty.iitr.ac.in/cs/bala/
Limitations of ML
● Related to data
○ Lack of suitable data & human bias in the data
○ Data privacy and ethical issues
○ Rapid changes in the data
● Related to models
○ Biased models
○ Poor performance in production
○ Regular training required
○ Black box models
● Related to infrastructure
○ Expensive infrastructure requirement
How reliable are the models we train?
[1] https://bgr.com/2018/06/22/uber-self-driving-car-crash-arizona-hulu-logs/
[2] https://www.mirror.co.uk/news/world-news/google-self-driving-car-hits-7529261
Regression
Terminology Related to Regression Analysis
► Dependent Variable
► Independent Variable
► Outliers
► Multicollinearity
► Underfitting and Overfitting
Types of Regression
What is linear regression?
[Figure: blood pressure (target variable) plotted against age (independent variable)]
Regression: Use case
Let us consider this data:
● The task is to predict the blood pressure when the age is 40.

Age in years (x)    Blood pressure (y)
25                  120
36                  135
68                  143
55                  139
49                  120
72                  165
40                  ?
Linear Regression line
y = β0 + β1x + ε
y = values taken by the dependent variable Y
x = values taken by the independent variable X
β0 = y-intercept
β1 = slope
ε = random error component
What is the error term?
Calculating the error term
● The equation of the regression line is ŷ = β0 + β1x.
● The error term for each observation is εi = yi − ŷi, the vertical distance between the observed value and the fitted line.
Methods to get the best-fit line
● Ordinary least squares (OLS)
● Gradient descent
Linear Regression Model
Yi = β0 + β1Xi + εi
Yi = dependent (response) variable
Xi = independent (explanatory) variable
OLS
● Ordinary least squares (OLS) chooses β0 and β1 so as to minimize the sum of squared errors Σ(yi − ŷi)².
● The resulting estimates are β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β0 = ȳ − β1x̄.
● A worked sketch on the blood-pressure data follows below.
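The following is a minimal sketch, not from the slides, that applies the closed-form formulas above to the blood-pressure data from the use-case slide; the variable names and output format are my own.

# Minimal OLS sketch on the use-case data (not from the slides).
ages = [25, 36, 68, 55, 49, 72]        # independent variable x
bp   = [120, 135, 143, 139, 120, 165]  # dependent variable y

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(bp) / n

# Closed-form estimates: β1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)², β0 = ȳ − β1·x̄
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(ages, bp))
s_xx = sum((xi - x_bar) ** 2 for xi in ages)
beta1 = s_xy / s_xx
beta0 = y_bar - beta1 * x_bar

print(f"fitted line: y = {beta0:.2f} + {beta1:.2f}x")
print(f"predicted blood pressure at age 40: {beta0 + beta1 * 40:.1f}")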
Measures of variation
● Total sum of squares: SST = Σ(yi − ȳ)²
● Regression sum of squares: SSR = Σ(ŷi − ȳ)²
● Error sum of squares: SSE = Σ(yi − ŷi)²
● SST = SSR + SSE, and R² = SSR/SST gives the fraction of variation explained by the regression.
Gradient descent
● Loss function: the cost associated with the deviation of observed data from predicted data.
● It is an iterative method which converges to the optimum solution.
● The estimates of the parameters are updated at every iteration.
[Figure: cost plotted against the beta coefficient to estimate; the gradient points down toward the cost minimum]
Gradient descent
● Consider a ball rolling down a slope, as shown below.
● Any position on the slope corresponds to the loss (cost) of the current values of the coefficients.
● The bottom of the slope is where the cost function is at its minimum.
● The objective is to find the lowest point of the cost function by continuously trying different values of the parameters; the update rule sketched after this list makes this concrete.
● Repeating this process numerous times yields the parameters for which the cost is minimum.
[Figure: ball rolling down a curved slope toward the lowest point]
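In symbols, each coefficient is nudged against the gradient of the cost J. This generic update rule is the standard statement of gradient descent (the notation, with L for the learning step, matches the algorithm on the next slides; the rule itself is not printed on this slide):

\theta_{\text{new}} = \theta_{\text{old}} - L \cdot \frac{\partial J}{\partial \theta}

Applied to the line y = mx + c, θ stands for each of m and c in turn.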
Gradient Descent Algorithm
► For some combination of m and c, we get the least error (MSE, defined below). That combination of m and c gives us our best-fit line.
► The algorithm starts with some values of m and c (usually m = 0, c = 0). We calculate the MSE (cost) at the point m = 0, c = 0. Let's say the MSE (cost) at m = 0, c = 0 is 100.
► Then we change the values of m and c by some amount (the learning step) and notice a decrease in the MSE (cost).
► We continue doing the same until the loss function reaches a very small value, or ideally 0 (which means 0 error, or 100% accuracy).
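For reference, the MSE of a candidate line y = mx + c is the standard mean squared error; this formula follows from the slide's description rather than being printed on it:

\mathrm{MSE}(m, c) = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - (m x_i + c) \bigr)^2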
Algorithm
1. Initially, let m = 0 and c = 0, and let L be the learning rate (the learning step), which controls how much m and c change at each update.
Step 2(a)
2. (a) Calculate the partial derivative of the cost function with respect to m. Let this partial derivative be 𝐷𝑚 (with little change in m how much the cost function changes).
Step 2(b)
2. (b) Similarly, let's find the partial derivative with respect to c. Let the partial derivative of the cost function with respect to c be 𝐷𝑐 (with little change in c how much the cost function changes).
3. Now update the current values of m and c using the following equations:
𝑚 = 𝑚 − 𝐿𝐷𝑚
𝑐 = 𝑐 − 𝐿𝐷𝑐
4. We repeat this process until our cost function is very small (ideally 0).
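Differentiating the MSE above gives the standard expressions for 𝐷𝑚 and 𝐷𝑐 (these are the usual results; the slides reference the derivatives without printing the algebra):

D_m = \frac{-2}{n} \sum_{i=1}^{n} x_i \bigl( y_i - (m x_i + c) \bigr), \qquad
D_c = \frac{-2}{n} \sum_{i=1}^{n} \bigl( y_i - (m x_i + c) \bigr)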
Problem
x y
2 3
4 7
6 5
8 10
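The slide gives only the data, so here is a minimal sketch assuming the task is to fit y = mx + c to it with the gradient-descent algorithm above; the learning rate and iteration count are my own choices, and the closed-form OLS answer for this data, y = 0.95x + 1.5, serves as a check.

# A sketch of the algorithm above on the problem data (not from the slides);
# the learning rate and iteration count are illustrative choices.
x = [2, 4, 6, 8]
y = [3, 7, 5, 10]
n = len(x)

m, c = 0.0, 0.0   # step 1: start from m = 0, c = 0
L = 0.01          # learning rate (learning step)

for _ in range(10000):
    y_pred = [m * xi + c for xi in x]               # current predictions
    # step 2: partial derivatives of the MSE with respect to m and c
    D_m = (-2 / n) * sum(xi * (yi - ypi) for xi, yi, ypi in zip(x, y, y_pred))
    D_c = (-2 / n) * sum(yi - ypi for yi, ypi in zip(x, y, y_pred))
    # step 3: update the parameters against the gradient
    m -= L * D_m
    c -= L * D_c

# Step 4 (stopping once the cost is tiny) is replaced here by a fixed budget.
print(f"fitted line: y = {m:.3f}x + {c:.3f}")  # converges toward y = 0.95x + 1.5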
Thank You!