PNAL4 SingleLayerNets
M. Ali Akcayol
Gazi University
Department of Computer Engineering
Content
Perceptrons
Linear separability
Perceptron training algorithm
Termination criterion
Choice of learning rate
Non-numeric inputs
Adalines
Multiclass discrimination
Perceptrons
In supervised learning algorithms, the desired result is known
for samples in the training data.
Learning algorithms are simpler for networks consisting of only one node in one layer.
The modification of the weights is very simple.
Perceptrons have a simple description but limited capabilities.
A perceptron is defined to be a machine that learns using examples.
A perceptron can also be described as a stochastic gradient-descent algorithm that linearly separates samples in n-dimensional space.
Perceptrons
A perceptron has a single output whose value determines to which of two classes each input pattern belongs.
A perceptron can be represented by a single node.
The perceptron applies a step function to the net weighted sum of its inputs.
The input pattern is considered to belong to one class or the other.
The output class is decided depending on whether the node output is 0 or 1.
Perceptrons
Example
Consider two-dimensional samples (0,0), (0,1), (1,0), (-1,-1)
that belong to one class, and samples (2.1,0), (0, -2.5), (1.6,
-1.6) that belong to another class.
These classes are linearly separable.
The node function is a step function.
The output of the node is 1 if the net weighted input is greater
than 2, and 0 otherwise.
The decision regions are therefore x1 - x2 ≤ 2 (output 0) and x1 - x2 > 2 (output 1).
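A minimal Python sketch of this example; the weights w1 = 1, w2 = -1 and the threshold 2 follow directly from the decision rule x1 - x2 > 2 above:

def perceptron_output(x1, x2, w1=1.0, w2=-1.0, threshold=2.0):
    # Step-function node: output 1 if the net weighted input exceeds the
    # threshold, 0 otherwise.
    net = w1 * x1 + w2 * x2
    return 1 if net > threshold else 0

class_0 = [(0, 0), (0, 1), (1, 0), (-1, -1)]    # expected output 0
class_1 = [(2.1, 0), (0, -2.5), (1.6, -1.6)]    # expected output 1

for point in class_0 + class_1:
    print(point, perceptron_output(*point))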
Linear separability
If there exists a line that separates all samples of one class from all samples of the other class, the classification problem is said to be ‘linearly separable’.
The line’s equation is
w0 + w1 x1 + w2 x2 = 0
Linear separability
Some classes are linearly non-separable: no single straight line can separate all samples of one class from those of the other.
Linear separability
If there is only one input dimension x, then the two-class
problem can be solved using a perceptron if and only if there
is some value x0 of x such that all samples of one class occur
for x > x0 , and all samples of the other class occur for x < x0.
Linear separability
If there are three input dimensions, a two-class problem can
be solved using a perceptron if and only if there is a plane that
separates samples of different classes.
As in the two-dimensional case, coefficients of terms
correspond to the weights of the perceptron.
A generic perceptron for n-dimensional space.
Linear separability
For spaces with a higher number of input dimensions, the geometric representation must be extended.
Hyperplanes can separate samples of different classes in n-
dimensional space.
Each hyperplane in n dimensions is defined by the equation
w0 + w1 x1 + w2 x2 + … + wn xn = 0
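As a small illustration (a sketch, with the function name chosen here for exposition), the class of a sample is decided by the sign of this expression:

def side_of_hyperplane(w, x):
    # w = (w0, w1, ..., wn) are the hyperplane coefficients (the perceptron weights),
    # x = (x1, ..., xn) is the input sample.
    net = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if net > 0 else -1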
Perceptron training algorithm
The perceptron training algorithm can be used to obtain appropriate weights for a perceptron that separates two classes.
From the weight values, the equation of the hyperplane that divides the input space can be derived.
The trained perceptron can then be used to classify new samples.
The dot product (scalar product) of two vectors w and x is defined as
w . x = w1 x1 + w2 x2 + … + wn xn
Perceptron training algorithm
If i belongs to class C0 (desired node output is -1) but w.i > 0, then the weight vector needs to be modified to w + Δw so that (w + Δw).i < w.i.
Here Δw = -η.i, where η > 0.
After this weight modification, i has a better chance of being classified correctly in the following iteration.
Perceptron training algorithm
If i belongs to class C1 (desired node output is 1) but w.i < 0, then the weight vector needs to be modified to w + Δw so that (w + Δw).i > w.i; in this case Δw = η.i, where η > 0.
Let i1, i2, …, ip denote the training set, containing p input vectors.
We define a function that maps each sample to its desired output: +1 for class C1 and -1 for class C0.
Samples are presented repeatedly to train the weights.
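A Python sketch of this training loop; the bias handling (a constant 1 prepended to each input vector), the random initialization, and the epoch limit are illustrative assumptions, not specified on the slides:

import random

def train_perceptron(samples, eta=0.1, max_epochs=1000):
    # samples: list of (input_vector, desired) pairs with desired in {+1, -1};
    # each input vector is assumed to start with a constant 1 for the bias weight.
    n = len(samples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]   # arbitrary starting weights
    for _ in range(max_epochs):
        errors = 0
        for x, d in samples:
            net = sum(wk * xk for wk, xk in zip(w, x))
            if d == 1 and net <= 0:      # desired +1 but w.i is not positive
                w = [wk + eta * xk for wk, xk in zip(w, x)]   # w + η.i
                errors += 1
            elif d == -1 and net > 0:    # desired -1 but w.i is positive
                w = [wk - eta * xk for wk, xk in zip(w, x)]   # w - η.i
                errors += 1
        if errors == 0:                  # all samples classified correctly
            break
    return w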
Perceptron training algorithm
Example
Let there be 7 one-dimensional input patterns as shown
below.
Perceptron training algorithm
Example – cont.
For the input value 0.83, the net input is (0.83)(-0.36) - 1.0 ≈ -1.3.
The net input is negative, so the sample is assigned class 0, which is an error (the desired class is 1).
For η = 0.1, the new weights are calculated as follows.
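A worked version of this step in Python, assuming (as a convention not stated on the slide) that the weight -1.0 is a bias whose input is fixed at 1:

eta = 0.1
w = [-1.0, -0.36]     # [bias weight, input weight]
i = [1.0, 0.83]       # [bias input, sample value]; desired class is 1, so w moves to w + η.i
w_new = [wk + eta * ik for wk, ik in zip(w, i)]
print(w_new)          # approximately [-0.9, -0.277]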
Termination criterion
For many ANN learning algorithms, the termination criterion is
″stop when the goal is achieved″.
For any kind of classifier, the goal is the correct classification
of all samples.
So the perceptron training algorithm runs until all samples are
correctly classified.
For the perceptron, termination is assured if η is sufficiently small and the samples are linearly separable.
If η is not appropriate or the samples are not linearly separable, the algorithm may run indefinitely.
How can we detect that this may be the case?
Termination criterion
The amount of progress achieved in the recent past can be used to decide when to terminate training.
For a linear classifier, if the number of correct classifications has not changed over a large number of steps, the samples may not be linearly separable.
The same problem may occur with an inappropriate choice of η.
Trying different values of η may improve the training phase.
Termination criterion
In some problems, two classes overlap and are not linearly separable.
If the performance requirements allow some amount of misclassification, we can modify the termination criterion.
For example, if it is known that at least 6% of the samples will be misclassified (or the user is satisfied with 6% misclassification), the termination criterion can be relaxed.
We can then terminate the training algorithm as soon as 94% of the samples are correctly classified.
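A sketch of such a relaxed termination check in Python; the 6% tolerance and the size of the 'no progress' window are illustrative values chosen here, not fixed by the slides:

def should_stop(correct_history, total, tolerated_error=0.06, window=1000):
    # correct_history: number of correctly classified samples after each epoch.
    # Stop when accuracy reaches 1 - tolerated_error (94% here), or when the
    # best count has not improved over the last `window` epochs, which suggests
    # that the samples may not be linearly separable (or that η is inappropriate).
    if correct_history and correct_history[-1] / total >= 1 - tolerated_error:
        return True
    if len(correct_history) > window:
        if max(correct_history[-window:]) <= max(correct_history[:-window]):
            return True    # no progress over the last `window` epochs
    return False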
Choice of learning rate
Examining extreme cases can help derive a good choice for η.
If η is too large (e.g., 1,000,000), then the components of Δw = ±ηx can have very large magnitudes.
In that case each weight update swings the perceptron's outputs completely in one direction; as a result, the perceptron considers all samples to be in the same class, and the system oscillates between extremes.
If η = 0, the weights are never modified.
If η is very small but nonzero, the change in the weights at each step is tiny, which makes the algorithm exceedingly slow.
Choice of learning rate
If η is too large, training will initially progress very fast, but the weights will eventually jump around the optimal solution and never settle down.
If η is too small, the training will eventually converge to the best state, but this will take a long time.
To find a fairly good learning rate, the network should be trained with various learning rates and the results compared, as sketched below.
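One simple way to do this, sketched here in Python, is a sweep over candidate learning rates; it reuses the train_perceptron sketch given earlier, and the candidate values are illustrative:

def accuracy(w, samples):
    # Fraction of samples whose output sign matches the desired class (+1 / -1).
    correct = 0
    for x, d in samples:
        net = sum(wk * xk for wk, xk in zip(w, x))
        if (net > 0) == (d == 1):
            correct += 1
    return correct / len(samples)

def pick_learning_rate(samples, candidates=(0.001, 0.01, 0.1, 1.0)):
    # Train once per candidate η and keep the rate that classifies best.
    scores = {eta: accuracy(train_perceptron(samples, eta=eta), samples)
              for eta in candidates}
    return max(scores, key=scores.get)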
Choice of learning rate
What is an appropriate choice for η, one that is neither too small nor too large?
A common choice is η = 1, leading to the simple weight change rule Δw = ±x,
so that (w + Δw).x = w.x ± x.x.
If |w.x| > x.x, a single update of this size cannot change the sign of the net input, so the sample x may still not be correctly classified.
To ensure that the sample x is classified correctly after the update, (w + Δw).x must have the opposite sign to w.x.
Non-numeric inputs
In some problems, the input dimensions are non-numeric.
For example, input dimension may be ″color″.
Its values may range over the set {red, blue, green, yellow}.
We cannot establish a meaningful ordering of the colors along a single axis.
The simplest way is to generate four new dimensions (″red″,
″blue″, ″green″, ″yellow″).
We can replace each original attribute-value pair by a binary
vector.
For instance, color = ″green″ is represented by the input
vector (0, 0, 1, 0), ″blue″ is (0, 1, 0, 0).
The disadvantage of this approach is a drastic increase in the
number of dimensions.
Non-numeric inputs
Example
The day of the week (Sunday/Monday/ . . .) is an important
variable in predicting the amount of electric power consumed
in a city.
However, there is no obvious way of sequencing weekdays.
So it is not appropriate to use a single variable whose values
range from 1 to 7.
Instead, seven different variables should be used; each input sample has a value of 1 for exactly one of these coordinates and 0 for the others.
For instance, ″Tuesday″ is represented as (0, 0, 1, 0, 0, 0, 0),
″Monday″ is (0, 1, 0, 0, 0, 0, 0).
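A small Python sketch of this one-of-n (one-hot) encoding; the helper name is chosen here for illustration:

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]

def one_hot(value, categories):
    # Return a binary vector with a 1 in the position of `value` and 0 elsewhere.
    return [1 if c == value else 0 for c in categories]

print(one_hot("Tuesday", DAYS))   # [0, 0, 1, 0, 0, 0, 0]
print(one_hot("Monday", DAYS))    # [0, 1, 0, 0, 0, 0, 0]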
Adalines
The fundamental principle underlying the perceptron learning
algorithm is to modify weights to reduce the number of
misclassifications.
Perfect classification using a linear element may not be
possible for all problems.
Minimizing the mean squared error (MSE) instead of the
number of misclassified samples may be used while training.
An adaptive linear element or Adaline, proposed by Widrow
(1959, 1960), is a simple perceptron-like system.
Adalines
Adaline accomplishes classification by modifying weights in
such a way as to diminish the MSE at each iteration.
This can be accomplished using gradient descent.
MSE is a quadratic function whose derivative exists
everywhere.
Unlike the perceptron rule, weight changes are made to reduce the MSE rather than the number of misclassifications.
Even when a sample is correctly classified by the network, the
weights may change.
Adalines
In the training process, when a sample is presented to the network, the linear weighted net input is computed.
The computed net value is compared with the desired output.
The resulting error signal is used to modify each weight in the Adaline.
The weight change rule uses the partial derivative of the error with respect to each weight.
Adalines
Let xj = (xj,1, xj,2, …, xj,n) be an input vector for which dj is the desired output value.
Let netj = Σk wk xj,k be the net input to the node.
w = (w1, w2, …, wn) is the present value of the weight vector.
The squared error is Ej = (dj - netj)².
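Differentiating this error with respect to each weight (a standard derivation consistent with the definitions above) gives ∂Ej/∂wk = -2 (dj - netj) xj,k, so a gradient-descent step changes each weight by Δwk = η (dj - netj) xj,k, where the constant factor 2 is absorbed into the learning rate η.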
Adalines
Adaline Least-Mean-Squares (LMS) training algorithm
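A minimal Python sketch of LMS (delta-rule) training consistent with the update derived above; the bias convention (a constant 1 at the start of each input vector) and the fixed number of epochs are illustrative assumptions:

def train_adaline(samples, eta=0.01, epochs=100):
    # samples: list of (input_vector, desired) pairs; each input vector is
    # assumed to start with a constant 1 so the first weight acts as a bias.
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, d in samples:
            net = sum(wk * xk for wk, xk in zip(w, x))   # linear net input
            error = d - net
            # Delta rule: adjust each weight along the negative MSE gradient.
            w = [wk + eta * error * xk for wk, xk in zip(w, x)]
    return w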
Multiclass discrimination
So far, we have considered dichotomies, or two-class
problems.
Many important real-life problems require partitioning data
into three or more classes.
For example, the character recognition problem consists of distinguishing between samples of 29 different classes (for the Turkish alphabet).
A layer of perceptrons or Adalines may be used to solve some such multiclass problems.
Four perceptrons can be put together to solve a four-class classification problem.
Multiclass discrimination
Each weight wi,j indicates the strength of the connection from the jth input to the ith node.
A sample is considered to belong to the ith class if and only if the ith output oi = 1 and every other output ok = 0, for k ≠ i.
This network is trained in the same way as individual perceptrons.
If all outputs are zero, or if more than one output equals 1, the network is considered to have failed in the classification task.
If the outputs can take values between 0 and 1, a ‘maximum-selector’ can be used to select the output with the highest value.
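A small Python sketch of such a layer, one node per class, with a maximum selector as the fallback; the thresholding at 0 and the tie-breaking policy are illustrative choices, not taken from the slides:

def classify_multiclass(x, W):
    # W[i] holds the weights w_i,j from each input j to node i (one node per class).
    nets = [sum(wij * xj for wij, xj in zip(row, x)) for row in W]
    outputs = [1 if net > 0 else 0 for net in nets]
    if outputs.count(1) == 1:
        return outputs.index(1)      # exactly one node fired: that class wins
    return nets.index(max(nets))     # otherwise fall back to the maximum selector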
Homework