Lecture 2

This lecture covers linear regression with one variable. The model represents the hypothesis as a linear equation relating the single input variable x to the output variable y. Gradient descent is the learning algorithm used to choose the parameters θ0 and θ1 by minimizing the cost function J(θ0, θ1), which measures the error between the predicted and actual output values. In each step of gradient descent, the parameters are updated simultaneously to reduce the overall cost.

Uploaded by

Bandar Almaslukh

Linear regression with one variable
Model representation

Machine Learning, Andrew Ng
Housing Prices (Portland, OR)

[Figure: scatter plot of Price ($ in 1000's) against Size (feet²), for sizes roughly 500 to 3000]

Size in feet² (x) | Price ($) in 1000's (y)
2104 | 460
1416 | 232
1534 | 315
852 | 178
… | …
Supervised Learning: given the “right answer” for each example in the data.
Regression Problem: predict real-valued output.
Training set of housing prices (Portland, OR):

Size in feet² (x) | Price ($) in 1000's (y)
2104 | 460
1416 | 232
1534 | 315
852 | 178
… | …

Notation:
m = number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable
Training Set → Learning Algorithm → h
The learned hypothesis h maps the size of a house to an estimated price.

How do we represent h? h(x) = θ0 + θ1·x

Linear regression with one variable, also called univariate linear regression.
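The hypothesis can be sketched in a few lines of Python (a minimal illustration; the parameter values 50 and 0.2 are made up, not from the lecture):

```python
def h(x, theta0, theta1):
    """Univariate linear regression hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# With made-up parameters theta0 = 50, theta1 = 0.2, a 1000 ft^2 house
# is predicted to cost 250 (i.e. $250,000, since prices are in 1000's).
print(h(1000, 50, 0.2))
```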
Linear regression with one variable
Cost function
Training Set:

Size in feet² (x) | Price ($) in 1000's (y)
2104 | 460
1416 | 232
1534 | 315
852 | 178
… | …

Hypothesis: h(x) = θ0 + θ1·x
θ0, θ1: parameters
How to choose the θ’s?
[Figure: three example lines h(x) = θ0 + θ1·x for different choices of θ0 and θ1]
Idea: choose θ0, θ1 so that h(x) is close to y for our training examples (x, y). Formally, minimize the cost J(θ0, θ1) = (1/2m) Σ_{i=1..m} (h(x^(i)) − y^(i))².
Linear regression with one variable
Cost function intuition I
Simplified setting (set θ0 = 0):

Hypothesis: h(x) = θ1·x
Parameter: θ1
Cost Function: J(θ1) = (1/2m) Σ_{i=1..m} (h(x^(i)) − y^(i))²
Goal: minimize J(θ1) over θ1
[Figure: left, the data points and the line h(x) = θ1·x for several values of θ1 (for fixed θ1, h is a function of x); right, the corresponding points on the curve J(θ1) (a function of the parameter θ1), showing that J is smallest where the line fits the data]
Linear regression with one variable
Cost function intuition II
Hypothesis: h(x) = θ0 + θ1·x
Parameters: θ0, θ1
Cost Function: J(θ0, θ1) = (1/2m) Σ_{i=1..m} (h(x^(i)) − y^(i))²
Goal: minimize J(θ0, θ1) over θ0, θ1
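The cost function can be computed directly; a minimal sketch, using the four housing examples from the slides and an arbitrary parameter setting:

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost: J = (1/2m) * sum over i of (h(x_i) - y_i)^2."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [2104, 1416, 1534, 852]   # size in feet^2
ys = [460, 232, 315, 178]      # price in $1000's

# J for an arbitrary guess (theta0 = 0, theta1 = 0.2); a better fit gives a lower J.
print(cost(0.0, 0.2, xs, ys))
```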
[Figure: left, the housing data (Price ($) in 1000's against Size in feet²) with a candidate line h(x) = θ0 + θ1·x (for fixed θ0, θ1, a function of x); right, the contour plot of J(θ0, θ1) (a function of the parameters), with each frame marking the (θ0, θ1) pair for the line shown]
Linear regression with one variable
Gradient descent
Have some function J(θ0, θ1)
Want min over θ0, θ1 of J(θ0, θ1)

Outline:
• Start with some θ0, θ1
• Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum
J(0,1)

1
0

Andrew Ng
J(0,1)

1
0

Andrew Ng
Gradient descent algorithm:

repeat until convergence {
  θj := θj − α · ∂J(θ0, θ1)/∂θj   (simultaneously for j = 0 and j = 1)
}

α is the learning rate.
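The update rule can be sketched as a loop; a minimal illustration on the made-up cost J(θ0, θ1) = θ0² + θ1² (not the lecture's housing cost), whose partial derivatives are 2θ0 and 2θ1:

```python
def gradient_descent(grad, theta, alpha=0.1, steps=100):
    """Repeat: theta_j := theta_j - alpha * dJ/dtheta_j, for all j simultaneously."""
    for _ in range(steps):
        g = grad(theta)  # evaluate every partial derivative at the *current* theta
        theta = [t - alpha * gj for t, gj in zip(theta, g)]  # simultaneous update
    return theta

# Gradient of J(t0, t1) = t0^2 + t1^2 is (2*t0, 2*t1); the minimum is at (0, 0).
theta = gradient_descent(lambda t: [2 * t[0], 2 * t[1]], [3.0, -4.0])
print(theta)  # both parameters end up very close to 0
```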
Linear regression with one variable
Gradient descent intuition
Gradient descent algorithm (one parameter): θ1 := θ1 − α · dJ(θ1)/dθ1. The derivative is the slope of J at the current θ1: a positive slope decreases θ1 and a negative slope increases it, so either way the step moves downhill.
If α is too small, gradient descent can be slow.

If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
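Both failure modes can be seen numerically on the made-up cost J(θ) = θ², where the update is θ := θ − α·2θ:

```python
def run(alpha, theta=1.0, steps=20):
    """Apply theta := theta - alpha * (2 * theta) repeatedly, for J(theta) = theta^2."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

print(run(0.01))  # alpha too small: still far from the minimum at 0 (slow)
print(run(0.4))   # reasonable alpha: essentially at the minimum
print(run(1.1))   # alpha too large: |theta| grows every step (divergence)
```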
At a local optimum the derivative is zero, so the update θ1 := θ1 − α · 0 leaves the current value of θ1 unchanged: gradient descent stays put.
Gradient descent can converge to a local minimum, even with the learning rate α fixed. As we approach a local minimum, the derivative shrinks, so gradient descent automatically takes smaller steps; there is no need to decrease α over time.
Linear regression with one variable
Gradient descent for linear regression
Gradient descent algorithm: repeat { θj := θj − α · ∂J(θ0, θ1)/∂θj }.
Linear regression model: h(x) = θ0 + θ1·x with cost J(θ0, θ1) = (1/2m) Σ_{i=1..m} (h(x^(i)) − y^(i))².
Applying the algorithm to this model requires the partial derivatives:
∂J/∂θ0 = (1/m) Σ_{i=1..m} (h(x^(i)) − y^(i))
∂J/∂θ1 = (1/m) Σ_{i=1..m} (h(x^(i)) − y^(i)) · x^(i)
Gradient descent algorithm for linear regression:

repeat until convergence {
  θ0 := θ0 − α · (1/m) Σ_{i=1..m} (h(x^(i)) − y^(i))
  θ1 := θ1 − α · (1/m) Σ_{i=1..m} (h(x^(i)) − y^(i)) · x^(i)
}

update θ0 and θ1 simultaneously
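Putting the two update rules together; a minimal sketch on the four housing examples (the sizes are rescaled to thousands of square feet, and α and the step count are arbitrary choices, since the lecture does not fix them):

```python
def fit(xs, ys, alpha=0.1, steps=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the *old* thetas.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

xs = [2.104, 1.416, 1.534, 0.852]  # size in 1000's of feet^2 (rescaled for conditioning)
ys = [460, 232, 315, 178]          # price in $1000's
theta0, theta1 = fit(xs, ys)
print(theta0, theta1)
```

With these settings the parameters settle near the least-squares line for the four points.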
J(0,1)

1
0

Andrew Ng
Andrew Ng
[Figure: successive frames of gradient descent; left, the line h(x) fitting the housing data better and better (for fixed θ0, θ1, a function of x); right, the (θ0, θ1) trajectory descending the contours of J(θ0, θ1) toward the minimum]
“Batch” Gradient Descent

“Batch”: each step of gradient descent uses all the training examples.
Linear Algebra review (optional)
Matrices and vectors
Matrix: rectangular array of numbers.

Dimension of matrix: number of rows × number of columns.
Matrix elements (entries of a matrix): A_ij = the “i, j entry”, i.e. the entry in the i-th row and j-th column.
Vector: an n × 1 matrix. y_i denotes the i-th element. Vectors can be 1-indexed (the convention in these notes) or 0-indexed (the convention in most programming languages).
Linear Algebra review (optional)
Addition and scalar multiplication
Matrix Addition: matrices of the same dimensions are added element-wise.

Scalar Multiplication: every element is multiplied by the scalar.

Combination of Operands: expressions such as 3·A + B combine both operations, applied element by element.
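Both operations can be sketched with plain Python lists (no library assumed); the matrices are arbitrary examples:

```python
def add(A, B):
    """Element-wise sum; A and B must have the same dimensions."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(c, A):
    """Multiply every element of A by the scalar c."""
    return [[c * a for a in row] for row in A]

A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 2], [0, 5], [6, 1]]
print(add(A, B))    # [[5, 2], [2, 10], [9, 2]]
print(scale(3, A))  # [[3, 0], [6, 15], [9, 3]]
```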
Linear Algebra review (optional)
Matrix-vector multiplication
Details:
An m × n matrix A (m rows, n columns) times an n × 1 matrix x (an n-dimensional vector) gives an m-dimensional vector y.
To get y_i, multiply A’s i-th row with the elements of vector x, and add them up.
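The row-times-vector rule can be sketched as (matrix and vector values are arbitrary examples):

```python
def matvec(A, x):
    """Multiply an m x n matrix A by an n-vector x: y_i is row i of A dotted with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 3], [4, 0], [2, 1]]  # a 3 x 2 matrix
x = [1, 5]                    # a 2-dimensional vector
print(matvec(A, x))           # [16, 4, 7]
```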
House sizes: stacking all the sizes into one matrix lets a single matrix-vector product compute the predicted price of every house at once.
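Concretely: stack a column of ones next to the sizes, and one matrix-vector product evaluates h(x) = θ0 + θ1·x for every house (the parameter values −40 and 0.25 are illustrative):

```python
sizes = [2104, 1416, 1534, 852]
theta = [-40, 0.25]          # illustrative parameters: h(x) = -40 + 0.25 * x

X = [[1, s] for s in sizes]  # each row is [1, size], so row . theta = theta0 + theta1*size
predictions = [sum(a * t for a, t in zip(row, theta)) for row in X]
print(predictions)  # [486.0, 314.0, 343.5, 173.0]
```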
Linear Algebra review (optional)
Matrix-matrix multiplication
Details:
An m × n matrix A (m rows, n columns) times an n × o matrix B (n rows, o columns) gives an m × o matrix.
The i-th column of the result is obtained by multiplying A with the i-th column of B (for i = 1, 2, …, o).
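The column-by-column rule can be sketched directly (matrix values are arbitrary examples):

```python
def matmul(A, B):
    """Multiply m x n A by n x o B: column i of the result is A times column i of B."""
    m, o = len(A), len(B[0])
    C = [[0] * o for _ in range(m)]
    for i in range(o):
        b_i = [row[i] for row in B]  # i-th column of B
        for r in range(m):           # A times b_i gives the i-th column of C
            C[r][i] = sum(a * b for a, b in zip(A[r], b_i))
    return C

A = [[1, 3, 2], [4, 0, 1]]    # 2 x 3
B = [[1, 3], [0, 1], [5, 2]]  # 3 x 2
print(matmul(A, B))           # [[11, 10], [9, 14]]
```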
House sizes: have 3 competing hypotheses, i.e. three different (θ0, θ1) settings of h(x) = θ0 + θ1·x. Putting the data in one matrix and the three parameter vectors in a second matrix computes every hypothesis’s prediction for every house in a single matrix-matrix product.
Linear Algebra review (optional)
Matrix multiplication properties
Let A and B be matrices. Then in general, A × B ≠ B × A (matrix multiplication is not commutative).
Matrix multiplication is associative: (A × B) × C = A × (B × C).
Let D = B × C and compute A × D; let E = A × B and compute E × C. Both give the same result.
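Both properties can be checked numerically with a naive product (the 2 × 2 matrices are arbitrary examples):

```python
def matmul(A, B):
    """Naive matrix product via row-times-column dot products."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
C = [[1, 2], [3, 4]]

print(matmul(A, B) == matmul(B, A))                        # False: not commutative
print(matmul(matmul(A, B), C) == matmul(A, matmul(B, C)))  # True: associative
```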
Linear Algebra review (optional)
Matrix transpose
Matrix Transpose

Let A be an m × n matrix, and let B = Aᵀ (the transpose of A).
Then B is an n × m matrix, and B_ij = A_ji.
Example: the rows of A become the columns of Aᵀ.
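The definition can be sketched in one line: rows become columns (the matrix values are an arbitrary example):

```python
def transpose(A):
    """Transpose an m x n matrix into an n x m matrix: B[i][j] = A[j][i]."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 0], [3, 5, 9]]  # 2 x 3
print(transpose(A))         # [[1, 3], [2, 5], [0, 9]] -- a 3 x 2 matrix
```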
