Machine Learning

Lecture 2
Review of basic concepts
‣ Feature vectors, labels

‣ Training set

‣ Classifier

‣ Training error

‣ Test error

‣ Set of classifiers
Review: training set

[Figure: training examples in the (x1, x2) plane, marked + and − according to their labels.]
Review: a classifier

[Figure: a classifier h partitions the (x1, x2) plane into a region where h(x) = +1 and a region where h(x) = −1; training points are marked + and −.]
Review: test set

[Figure: unlabeled test points (marked ?) in the (x1, x2) plane, falling on either side of the regions h(x) = +1 and h(x) = −1.]
This lecture
‣ The set of linear classifiers

‣ Linear separation

‣ Perceptron algorithm
Linear classifiers

[Figure: a linear decision boundary in the (x1, x2) plane.]
Linear classifiers through origin

[Figure: linear decision boundaries passing through the origin in the (x1, x2) plane.]
Linear classifiers

[Figure: a linear decision boundary with offset in the (x1, x2) plane.]
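The boundaries in these figures can be written in symbols consistent with the separation definition below: a linear classifier through the origin is h(x; θ) = sign(θ · x), and the general linear classifier adds a scalar offset, h(x; θ, θ0) = sign(θ · x + θ0), so the decision boundary is the set of points with θ · x + θ0 = 0. A minimal sketch in Python (the function name is my own, not from the slides):

import numpy as np

def linear_classify(x, theta, theta_0=0.0):
    # +1 on one side of the hyperplane theta . x + theta_0 = 0, -1 on the other
    # (points exactly on the boundary are ambiguous; this sketch assigns them -1)
    return 1 if theta @ x + theta_0 > 0 else -1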
Linear separation: ex

[Figure: an example point configuration in the (x1, x2) plane for discussing linear separability.]
Linear separation: ex

[Figure: another example point configuration in the (x1, x2) plane.]
Linear separation: ex

[Figure: a third example point configuration in the (x1, x2) plane.]
Linear separation

Definition:
Training examples Sn = {(x(i), y(i)), i = 1, …, n} are
linearly separable if there exists a parameter vector θ̂ and
offset parameter θ̂0 such that y(i)(θ̂ · x(i) + θ̂0) > 0 for all
i = 1, …, n.
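The definition turns directly into a check: a candidate (θ̂, θ̂0) separates the training set exactly when every example has strictly positive agreement y(i)(θ̂ · x(i) + θ̂0). A minimal NumPy sketch (the function name is my own):

import numpy as np

def separates(theta_hat, theta_hat_0, X, y):
    # X: (n, d) feature matrix; y: (n,) labels in {+1, -1}
    # (theta_hat, theta_hat_0) linearly separates the data iff the strict
    # inequality y(i) (theta_hat . x(i) + theta_hat_0) > 0 holds for all i
    return bool(np.all(y * (X @ theta_hat + theta_hat_0) > 0))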
Learning linear classifiers
‣ Training error for a linear classifier (through origin)
Learning linear classifiers
‣ Training error for a linear classifier
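The slides give only the heading here; a standard way to write the quantity, consistent with the mistake condition in the perceptron pseudocode below, is the fraction of training examples the classifier gets wrong (points on the boundary counted as errors):

En(θ, θ0) = (1/n) Σ_{i=1..n} [[ y(i)(θ · x(i) + θ0) ≤ 0 ]],

where [[·]] is 1 if its argument holds and 0 otherwise; the through-origin case fixes θ0 = 0. A minimal NumPy sketch:

import numpy as np

def training_error(theta, theta_0, X, y):
    # X: (n, d) feature matrix; y: (n,) labels in {+1, -1}
    # an example counts as a mistake when y(i) (theta . x(i) + theta_0) <= 0
    return float(np.mean(y * (X @ theta + theta_0) <= 0))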
Learning algorithm: perceptron
Algorithm 1 Perceptron Algorithm (without offset)
procedure Perceptron({(x(i), y(i)), i = 1, …, n}, T)
    θ = 0 (vector)
    for t = 1, …, T do
        for i = 1, …, n do
            if y(i)(θ · x(i)) ≤ 0 then
                θ = θ + y(i) x(i)
    return θ

We should first establish that the perceptron updates tend to correct mistakes. To see this, consider a simple two-dimensional example in figure 4. The points in the figure are chosen such that the algorithm makes a mistake on each of them during the first pass. As a result, the updates become: θ(0) = 0, θ(1) = θ(0) + y(1)x(1) = y(1)x(1), and so on. (Note that since θ(0) = 0, the first example always satisfies y(1)(θ(0) · x(1)) = 0 ≤ 0 and therefore always triggers an update.)
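The pseudocode translates directly into runnable code. A minimal NumPy sketch of the without-offset algorithm (the function name perceptron_through_origin is my own):

import numpy as np

def perceptron_through_origin(X, y, T):
    # X: (n, d) feature matrix; y: (n,) labels in {+1, -1}; T: number of passes
    n, d = X.shape
    theta = np.zeros(d)
    for t in range(T):
        for i in range(n):
            # mistake (or tie) on example i: nudge theta toward classifying it correctly
            if y[i] * (theta @ X[i]) <= 0:
                theta = theta + y[i] * X[i]
    return theta

For linearly separable data the updates provably stop after a finite number of mistakes (the perceptron convergence theorem); otherwise the algorithm simply runs for the fixed T passes, as in the pseudocode.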
Perceptron algorithm: ex

[Figure: a two-dimensional example of perceptron updates in the (x1, x2) plane.]
Perceptron (with offset)

The perceptron algorithm and the above statements about convergence extend to the case with the offset parameter.

Algorithm 2 Perceptron Algorithm
procedure Perceptron({(x(i), y(i)), i = 1, …, n}, T)
    θ = 0 (vector), θ0 = 0 (scalar)
    for t = 1, …, T do
        for i = 1, …, n do
            if y(i)(θ · x(i) + θ0) ≤ 0 then
                θ = θ + y(i) x(i)
                θ0 = θ0 + y(i)
    return θ, θ0

Why is the offset parameter updated in this way? Think of it as augmenting each example with an additional coordinate that is set to 1 for all examples: we map our examples x ∈ R^d to x′ ∈ R^(d+1) such that x′ = [x1, …, xd, 1]. Running the without-offset perceptron on these extended examples reproduces exactly the updates above, with the extra coordinate of the extended θ playing the role of θ0.
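That observation gives a one-line reduction to the without-offset algorithm. A minimal sketch reusing the perceptron_through_origin function sketched above (again, my naming):

import numpy as np

def perceptron_with_offset(X, y, T):
    # append a constant-1 coordinate so the boundary need not pass through the origin
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    theta_aug = perceptron_through_origin(X_aug, y, T)
    # the first d coordinates are theta; the last coordinate accumulates the
    # y(i) updates and therefore equals theta_0 from Algorithm 2
    return theta_aug[:-1], theta_aug[-1]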
Key things to understand
‣ Parametric families (sets) of classifiers

‣ The set of linear classifiers

‣ Linear separation

‣ Perceptron algorithm
