
CS229 Lecture notes

Andrew Ng

Lecture on 11/4/04

1 The perceptron and large margin classifiers


In this final set of notes on learning theory, we will introduce a different
model of machine learning. Specifically, we have so far been considering
batch learning settings in which we are first given a training set to learn
with, and our hypothesis h is then evaluated on separate test data. In this set
of notes, we will consider the online learning setting in which the algorithm
has to make predictions continuously even while it’s learning.
In this setting, the learning algorithm is given a sequence of examples
(x(1), y(1)), (x(2), y(2)), . . . , (x(m), y(m)) in order. Specifically, the algorithm first

sees x(1) and is asked to predict what it thinks y (1) is. After making its pre-
diction, the true value of y (1) is revealed to the algorithm (and the algorithm
may use this information to perform some learning). The algorithm is then
shown x(2) and again asked to make a prediction, after which y (2) is revealed,
and it may again perform some more learning. This proceeds until we reach
(x(m) , y (m) ). In the online learning setting, we are interested in the total
number of errors made by the algorithm during this process. Thus, it models
applications in which the algorithm has to make predictions even while it’s
still learning.
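
As a minimal sketch of this protocol, the interaction can be written as a loop
that counts the total number of errors. The learner object and its method names
below are illustrative assumptions, not anything defined in these notes.

```python
def run_online(learner, examples):
    """Drive an online learner over a sequence of (x, y) pairs.

    The learner is shown only x before predicting; the true label y is
    revealed afterwards and may then be used for learning. Returns the
    total number of prediction errors made along the way.
    """
    mistakes = 0
    for x, y in examples:
        y_hat = learner.predict(x)   # algorithm sees x(i) and predicts
        if y_hat != y:               # true y(i) is then revealed
            mistakes += 1
        learner.update(x, y)         # algorithm may now learn from (x(i), y(i))
    return mistakes
```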
We will give a bound on the online learning error of the perceptron algo-
rithm. To make our subsequent derivations easier, we will use the notational
convention of denoting the class labels by y ∈ {−1, 1}.
Recall that the perceptron algorithm has parameters θ ∈ R^(n+1), and makes
its predictions according to

hθ(x) = g(θᵀx)    (1)


where

g(z) = 1 if z ≥ 0, and g(z) = −1 if z < 0.
Also, given a training example (x, y), the perceptron learning rule updates
the parameters as follows. If hθ(x) = y, then it makes no change to the
parameters. Otherwise, it performs the update

θ := θ + yx.

(This looks slightly different from the update rule we had written down earlier
in the quarter because here we have changed the labels to be y ∈ {−1, 1}. Also,
the learning rate parameter α was dropped; its only effect is to scale all the
parameters θ by some fixed constant, which does not affect the behavior of the
perceptron.)
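
Putting the prediction rule (1) and this update together, the following is a
minimal sketch of the perceptron run as an online algorithm. The class and
method names are illustrative choices, and x is assumed to already include any
intercept coordinate; only the prediction and update rules themselves come from
the notes.

```python
import numpy as np

class Perceptron:
    """Online perceptron with labels y in {-1, +1}."""

    def __init__(self, n):
        # Weights start at zero, matching theta(1) = 0 in the proof below.
        self.theta = np.zeros(n)

    def predict(self, x):
        # h_theta(x) = g(theta^T x), where g(z) = 1 if z >= 0 and -1 otherwise.
        return 1 if self.theta @ x >= 0 else -1

    def update(self, x, y):
        # Make no change on a correct prediction; otherwise theta := theta + y*x.
        if self.predict(x) != y:
            self.theta = self.theta + y * x
```

Driving this class with the protocol loop sketched earlier yields exactly the
mistake count analyzed in the theorem below.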

The following theorem gives a bound on the online learning error of the
perceptron algorithm, when it is run as an online algorithm that performs
an update each time it gets an example wrong. Note that the bound below
on the number of errors does not have an explicit dependence on the number
of examples m in the sequence, or on the dimension n of the inputs (!).

Theorem (Block, 1962, and Novikoff, 1962). Let a sequence of examples
(x(1), y(1)), (x(2), y(2)), . . . , (x(m), y(m)) be given. Suppose that ||x(i)|| ≤ D
for all i, and further that there exists a unit-length vector u (||u||₂ = 1) such
that y(i) · (uᵀx(i)) ≥ γ for all examples in the sequence (i.e., uᵀx(i) ≥ γ if
y(i) = 1, and uᵀx(i) ≤ −γ if y(i) = −1, so that u separates the data with a
margin of at least γ). Then the total number of mistakes that the perceptron
algorithm makes on this sequence is at most (D/γ)².
Proof. The perceptron updates its weights only on those examples on which
it makes a mistake. Let θ(k) be the weights that were being used when it made
its k-th mistake. So, θ(1) = 0 (since the weights are initialized to zero), and
if the k-th mistake was on the example (x(i), y(i)), then g((x(i))ᵀθ(k)) ≠ y(i),
which implies that

(x(i))ᵀθ(k)y(i) ≤ 0.    (2)

Also, from the perceptron learning rule, we would have that θ(k+1) = θ(k) +
y(i)x(i).
We then have

(θ(k+1))ᵀu = (θ(k))ᵀu + y(i)(x(i))ᵀu
            ≥ (θ(k))ᵀu + γ

By a straightforward inductive argument, this implies that

(θ(k+1))ᵀu ≥ kγ.    (3)

Also, we have that

||θ(k+1)||² = ||θ(k) + y(i)x(i)||²
            = ||θ(k)||² + ||x(i)||² + 2y(i)(x(i))ᵀθ(k)
            ≤ ||θ(k)||² + ||x(i)||²
            ≤ ||θ(k)||² + D²    (4)

The third step above used Equation (2). Moreover, again by applying a
straightforward inductive argument, we see that (4) implies

||θ(k+1)||² ≤ kD².    (5)

Putting together (3) and (5), we find that



√k · D ≥ ||θ(k+1)||
       ≥ (θ(k+1))ᵀu
       ≥ kγ.

The second inequality above follows from the fact that u is a unit-length
vector (and zᵀu = ||z|| · ||u|| cos φ ≤ ||z|| · ||u||, where φ is the angle between
z and u). These inequalities together give √k · D ≥ kγ, i.e., √k ≤ D/γ, and
hence k ≤ (D/γ)². Thus, if the perceptron made a k-th mistake, then
k ≤ (D/γ)². □
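
As an informal sanity check on this bound (not part of the original notes), one
can generate linearly separable data with a known unit-length separator u,
compute D and γ for that u, run a single online pass of the perceptron, and
confirm that the mistake count never exceeds (D/γ)². The dataset construction
below is a made-up example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical separable data: label points by a fixed unit-length u and
# discard points within 0.1 of the decision boundary to enforce a margin.
u = np.array([0.6, 0.8])                      # ||u|| = 1
X = rng.uniform(-1.0, 1.0, size=(500, 2))
scores = X @ u
keep = np.abs(scores) >= 0.1
X, y = X[keep], np.sign(scores[keep])

D = np.linalg.norm(X, axis=1).max()           # upper bound on ||x(i)||
gamma = (y * (X @ u)).min()                   # margin achieved by this u

# One online pass of the perceptron, counting mistakes.
theta = np.zeros(2)
mistakes = 0
for x_i, y_i in zip(X, y):
    pred = 1 if theta @ x_i >= 0 else -1      # g(theta^T x)
    if pred != y_i:
        mistakes += 1
        theta = theta + y_i * x_i             # theta := theta + y x

print(f"mistakes = {mistakes}, bound (D/gamma)^2 = {(D / gamma) ** 2:.1f}")
```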
