Learning Process

This document discusses different types of learning processes in neural networks, including supervised learning, reinforcement learning, and unsupervised learning. Supervised learning involves learning with a teacher who provides labeled examples to train the neural network. Reinforcement learning involves learning through interaction with the environment without explicit feedback, instead aiming to maximize rewards. Unsupervised learning looks for hidden patterns in unlabeled data through methods like clustering.


Learning Process

Just as there are different ways in which we ourselves learn from our
own surrounding environments, so it is with neural networks. In a broad
sense, we may categorize the learning processes through which neural
networks function as follows: learning with a teacher and learning
without a teacher. By the same token, the latter form of learning may be
subcategorized into unsupervised learning and reinforcement learning.
These different forms of learning as performed on neural networks
parallel those of human learning.
Learning with a teacher
• Learning with a teacher is also referred to as supervised learning.
Figure 24 shows a block diagram that illustrates this form of learning.
In conceptual terms, we may think of the teacher as having
knowledge of the environment, with that knowledge being
represented by a set of input–output examples. The environment is,
however, unknown to the neural network. Suppose now that the
teacher and the neural network are both exposed to a training vector
(i.e., example) drawn from the same environment.
Learning with a teacher, cont.
• By virtue of built-in knowledge, the teacher is able to provide the
neural network with a desired response for that training vector.
Indeed, the desired response represents the “optimum” action to be
performed by the neural network. The network parameters are
adjusted under the combined influence of the training vector and the
error signal. The error signal is defined as the difference between the
desired response and the actual response of the network. This
adjustment is carried out iteratively in a step-by-step fashion with the
aim of eventually making the neural network emulate the teacher;
the emulation is presumed to be optimum in some statistical sense.
• In this way, knowledge of the environment available to the teacher is
transferred to the neural network through training and stored in the
form of “fixed” synaptic weights, representing long-term memory.
When this condition is reached, we may then dispense with the
teacher and let the neural network deal with the environment
completely by itself.
• The form of supervised learning we have just described is the basis of
error-correction learning.
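The error-signal update just described can be sketched for a single linear neuron. This is an illustrative sketch only; the training data, learning rate, and function names are assumptions, not part of the text:

```python
# Sketch of error-correction (supervised) learning for one linear neuron.
# The teacher supplies a desired response d for each training vector x;
# the weights move in proportion to the error signal e = d - y.

def train_error_correction(samples, lr=0.1, epochs=50):
    """samples: list of (x, d) pairs; returns the learned weight vector."""
    n = len(samples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))   # actual response
            e = d - y                                   # error signal
            w = [wi + lr * e * xi for wi, xi in zip(w, x)]
    return w

# Teach the neuron the (hypothetical) mapping y = 2*x1 - x2
data = [((1, 0), 2), ((0, 1), -1), ((1, 1), 1)]
w = train_error_correction(data)
```

After the iterative, step-by-step adjustment, the weights emulate the teacher's mapping and the teacher can be dispensed with.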
Learning without a Teacher

• In supervised learning, the learning process takes place under the


tutelage of a teacher. However, in the paradigm known as learning
without a teacher, as the name implies, there is no teacher to oversee
the learning process. That is to say, there are no labeled examples of
the function to be learned by the network. Under this second
paradigm, two subcategories are identified:
1. Reinforcement Learning
2. Unsupervised Learning
1. Reinforcement Learning
• In reinforcement learning, the learning of an input–output mapping is
performed through continued interaction with the environment in
order to minimize a scalar index of performance. Figure 25 shows the
block diagram of one form of a reinforcement-learning system built
around a critic that converts a primary reinforcement signal received
from the environment into a higher quality reinforcement signal called
the heuristic reinforcement signal, both of which are scalar inputs
(Barto et al., 1983). The system is designed to learn under delayed
reinforcement, which means that the system observes a temporal
sequence of stimuli also received from the environment, which
eventually result in the generation of the heuristic reinforcement signal.
• The goal of reinforcement learning is to minimize a cost-to-go function, defined
as the expectation of the cumulative cost of actions taken over a sequence of
steps instead of simply the immediate cost. It may turn out that certain actions
taken earlier in that sequence of time steps are in fact the best determinants
of overall system behavior. The function of the learning system is to discover
these actions and feed them back to the environment. Delayed-reinforcement
learning is difficult to perform for two basic reasons:
• There is no teacher to provide a desired response at each step of the learning
process.
• The delay incurred in the generation of the primary reinforcement signal
implies that the learning machine must solve a temporal credit assignment
problem. By this we mean that the learning machine must be able to assign
credit and blame individually to each action in the sequence of time steps that
led to the final outcome, while the primary reinforcement may only evaluate the
outcome.
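The cost-to-go idea can be made concrete with a small numerical sketch. The discount factor and the per-step costs below are illustrative assumptions, not values from the text:

```python
# Sketch: cost-to-go as the cumulative (here, discounted) cost of a
# sequence of actions, rather than just the immediate cost.

def cost_to_go(costs, gamma=0.9):
    """Cumulative discounted cost from each time step onward."""
    J = [0.0] * len(costs)
    running = 0.0
    for k in reversed(range(len(costs))):
        running = costs[k] + gamma * running
        J[k] = running
    return J

# An action with low immediate cost can still carry a high cost-to-go:
costs = [0.0, 0.0, 10.0]   # cheap early steps, expensive delayed outcome
J = cost_to_go(costs)      # J[0] already reflects the delayed cost
```

This is exactly the temporal credit assignment difficulty: the early steps have zero immediate cost, yet their cost-to-go is dominated by the delayed outcome.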
2. Unsupervised Learning
• In unsupervised, or self-organized, learning, there is no external teacher
or critic to oversee the learning process, as indicated in Fig. 26. Rather,
provision is made for a task-independent measure of the quality of
representation that the network is required to learn, and the free
parameters of the network are optimized with respect to that measure.
• For a specific task-independent measure, once the network has
become tuned to the statistical regularities of the input data, the
network develops the ability to form internal representations for
encoding features of the input and thereby to create new classes
automatically (Becker, 1991).
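As a concrete, if toy, illustration of unsupervised learning, a two-means clustering of unlabeled 1-D data discovers the hidden groups without any teacher or critic. The data and the crude initialization are assumptions of the sketch:

```python
# Sketch of unsupervised learning: a tiny k-means-style clustering of
# unlabeled 1-D data into two groups. The task-independent measure being
# optimized is the distance of each point to its cluster centroid.

def two_means(points, iters=20):
    c1, c2 = min(points), max(points)          # crude initialization
    for _ in range(iters):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return c1, c2

data = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]          # two hidden clusters
c1, c2 = two_means(data)                       # centroids near 1.0 and 5.0
```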
Learning Tasks
• In the previous section, we discussed different learning paradigms. In
this section, we describe some basic learning tasks. The choice of a
particular learning rule is, of course, influenced by the learning task,
the diverse nature of which is testimony to the universality of neural
networks.
Pattern Association
• An associative memory is a brain-like distributed memory that learns
by association. Association has been known to be a prominent feature
of human memory since the time of Aristotle, and all models of
cognition use association in one form or another as the basic
operation (Anderson, 1995).
• Association takes one of two forms: auto-association and hetero-
association. In auto-association, a neural network is required to store
a set of patterns (vectors) by repeatedly presenting them to the
network. The network is subsequently presented with a partial
description or distorted (noisy) version of an original pattern stored in
it, and the task is to retrieve (recall) that particular pattern. Hetero-
association differs from auto-association in that an arbitrary set of
input patterns is paired with another arbitrary set of output patterns.
Auto-association involves the use of unsupervised learning, whereas
the type of learning involved in hetero-association is supervised.
• There are two phases involved in the operation of an associative
memory:
• storage phase, which refers to the training of the network. Let xk
denote a key pattern (vector) applied to an associative memory and
yk denote a memorized pattern (vector). The pattern association
performed by the network is described by
xk → yk, k = 1, 2, ..., q
where q is the number of patterns stored in the network.
• recall phase, which involves the retrieval of a memorized pattern in
response to the presentation of a noisy or distorted version of a key
pattern to the network.
Outlines
• Introduction to Associative Memory
• Models of Associative Memory
• Hebbian learning rule
• Linear Associative Memory
Associative Memories
• Associative memories can be implemented using either feedforward
or recurrent neural networks.
• Such associative neural networks are used to associate one set of
vectors with another set of vectors, say input and output patterns.
• The aim of an associative memory is to produce the associated
output pattern whenever one of the input patterns is applied to the
neural network.

Associative Memories
• Motivation
• Human ability to retrieve information from applied associated stimuli
• Example, recalling one’s relationship with another after not seeing them for several years
despite the other’s physical changes (aging, facial hair, etc.)
• Enhances human awareness and deduction skills and efficiently organizes vast amounts
of information
• Why not replicate this ability with computers?
• Ability would be a crucial addition to the Artificial Intelligence Community in developing
rational, goal oriented, problem solving agents
Associative Memory Definition
• In biological terms, associative memory refers to the brain's
capability to associate different objects, feelings, senses, etc.
with known past experiences.
• In terms of neural networks, this corresponds to the nets being able
to store a set of pattern associations. Each of these associations is
essentially an input-output pair.
• By storing these associations, the network is able to recall the
desired output for a given input that is similar, but not identical, to
the training input.
Memory Data Retrieval
• Two different addressing modes are used for memory data retrieval:
• Address-addressable memory: in digital computers, data can be accessed
when their correct addresses in the memory are given.
• Content-addressable memory: the data can be accessed based on the
content of a key vector.
• One realization of associative memories is Content-Addressable
Memory (CAM)
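The difference between the two addressing modes can be sketched in a few lines: address-addressable retrieval is a plain index lookup, while content-addressable retrieval finds the stored item closest to a possibly corrupted key. The Hamming-distance lookup and the stored patterns below are hypothetical:

```python
# Sketch contrasting the two addressing modes. Content-addressable
# retrieval returns the stored pattern nearest (in Hamming distance)
# to a key vector, rather than requiring an exact address.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def recall_by_content(memory, key):
    return min(memory, key=lambda stored: hamming(stored, key))

memory = [(1, 0, 0, 1), (0, 1, 1, 0), (1, 1, 1, 1)]

by_address = memory[1]                                # needs the exact index
by_content = recall_by_content(memory, (0, 0, 1, 0))  # a noisy key suffices
```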
Applications of Associative Memory
• An example of such an application could be palm
print recognition on an input pad. The network
would store the desired palm print of each
authorized person to that particular room.
• When a person comes to enter the room and
places their hand on the pad, the network takes in
this pattern as an input. The person may place
their hand in a different part of the pad or may
have their fingers closer or further apart than
when they entered their original print.
• The network should still be able to recognize the
print as being a match of one of the desired
patterns and so admit the person to the room
Applications of Associative Memory, cont.
• Another example of such a system would be an address (ZIP-code)
sorter in a post office.
• The ZIP code could be entered on the letter in many different
ways, i.e., handwritten, printed using different fonts, etc.
• The sorter has to be able to recognize it, no matter what the
format, to ensure that the letter reaches its correct destination.
Pattern Association & Associative Memory
• Associating patterns which are
– similar,
– contrary,
– in close proximity (spatial),
– in close succession (temporal)
– or in other relations

• Associative recall/retrieve
– evoke associated patterns
– recall a pattern by part of it: pattern completion
– evoke/recall with noisy patterns: pattern correction
Example for Associative recall/retrieve
• Recall a stored pattern by a noisy input
pattern
• Using the weights that capture the
association
• Stored patterns are viewed as
“attractors”, each has its “attraction
basin”
• Often call this type of NN “associative
memory” (recall by association, not
explicit indexing/addressing)
Outlines
• Introduction to Associative Memory
• Models of Associative Memory
• Hebbian learning rule
• Linear Associative Memory
Associative Memories
• An associative memory is a content-
addressable structure that maps a set of
input patterns to a set of output patterns.
• Two types of associative memory:
autoassociative and heteroassociative.
• Auto-association
• retrieves a previously stored pattern that most
closely resembles the current pattern.
• Hetero-association
• the retrieved pattern is, in general, different
from the input pattern not only in content but
possibly also in type and format.
Associative Memories

Stored patterns:
(x1, y1), (x2, y2), ..., (xp, yp)
• Autoassociative: yi = xi
• Heteroassociative: yi ≠ xi
where xi ∈ R^n and yi ∈ R^m
Type of Associative Memory
Hetero-association (different patterns):
A → α
B → β
Niagara → Waterfall
Auto-association (same patterns):
A → A
B → B
• For two patterns p and t:
– hetero-association (p ≠ t): relating two different patterns
– auto-association (p = t): relating parts of a pattern with other parts
Models of Associative Memory Networks
Two models of Associative Memory networks:
1- Static memory: there is one feed-forward pass only. The block
diagram shows an associative mapping of an input vector X into an
output vector Y:
Y = M(X)
where M is the mapping of X to Y. This mapping is called retrieval.
[Diagram: inputs X1, X2, ..., Xn enter the mapping M, which produces outputs y1, y2, ..., yn]
2- Dynamic memory: there exists a feedback property and hence a
time delay.
Architectures of NN associative memory

1. Single layer (with/without an input layer)
   a. Linear Associative Memory (LAM) Network
   b. Hopfield Network (autoassociative memory)
2. Two layers
   a. Bidirectional Associative Memory (BAM)
LAM (Linear Associative Memory) Network
• single-layer feed-forward network
• is a static network
• recovers the output pattern from full or partial information in
the input pattern
• the LAM model can perform both auto- and hetero-associative recall
of stored information
• The Hebbian learning rule can be used to learn a new association
pair (Pi, ti)
Hopfield Network
-Single layer
-Fully connected
Bidirectional Associative Memory (BAM)
• It is similar to the linear associator, but the connections are
bidirectional; that is,
wij = wji for i = 1, 2, ..., n and j = 1, 2, ..., m
• The BAM model can perform both auto- and hetero-associative
recall of stored information.
• BAM also uses Hebb's learning rule to build the connection weight
matrix that stores the associated pattern pairs.
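The bidirectional recall described above can be sketched with a single stored bipolar pair: the forward pass uses W and the backward pass uses its transpose. The use of numpy and the particular vectors are assumptions of the sketch:

```python
# Sketch of BAM-style bidirectional recall: x -> y through W,
# y -> x through W^T, so the same weights serve both directions.
import numpy as np

def bipolar(v):
    return np.where(v >= 0, 1, -1)

# One stored bipolar pair; weights built by Hebb's rule (outer product).
x = np.array([1, -1, 1, -1])
y = np.array([1, 1, -1])
W = np.outer(y, x)              # shape (3, 4): maps the x layer to the y layer

y_recalled = bipolar(W @ x)     # forward recall:  x -> y
x_recalled = bipolar(W.T @ y)   # backward recall: y -> x
```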
Outlines
• Introduction to Associative Memory
• Models of Associative Memory
• Hebbian learning rule
• Linear Associative Memory
Hebb’s Postulate
The oldest (1949) and most famous of all learning rules is Hebb’s
postulate of learning:

“When an axon of cell A is near enough to excite a cell B and


repeatedly or persistently takes part in firing it, some growth
process or metabolic change takes place in one or both cells
such that A’s efficiency, as one of the cells firing B, is increased.”
D. O. Hebb, 1949
[Diagram: cell A repeatedly taking part in firing cell B]
Hebbian Learning Laws

Hebb’s Rule
If a neuron receives an input from another neuron, and if both are
highly active (both have the same sign), the weight between the
neurons should be strengthened.
• This means that if two interconnected neurons are both “on” at the
same time, then the weight between them should be increased.
Hebbian Algorithm
The neuron’s output:
y = w^T x
where:
y = neuron output
w = weight vector
x = neuron input
The change in the weight vector is given by:
Δw = µ y x
where:
µ = learning constant (a positive value that determines the learning rate)
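A minimal sketch of the two formulas above, one output computation followed by one weight change; the starting weights, input, and learning constant are arbitrary assumptions:

```python
# Sketch of one Hebbian update: y = w . x, then w <- w + mu * y * x.

def hebbian_step(w, x, mu=0.1):
    y = sum(wi * xi for wi, xi in zip(w, x))            # neuron output
    w_new = [wi + mu * y * xi for wi, xi in zip(w, x)]  # delta_w = mu*y*x
    return w_new, y

w = [0.5, 0.2]
x = [1.0, 1.0]
w, y = hebbian_step(w, x)   # y = 0.7, so each weight grows by 0.07
```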
Summary
• Hebbian learning has four features:
1. It is unsupervised.
2. It is a local learning rule, meaning that it can be applied to
a network in parallel.
3. It is simple and therefore requires very little computation.
4. It is biologically plausible.
• Hebbian learning has a disadvantage: unconstrained growth in w.
Problems such as unrestrained growth can occur in cases where
responses and excitations constantly agree in sign.
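The unconstrained-growth problem can be seen directly in a one-weight simulation; the numbers below are illustrative assumptions:

```python
# Sketch: with input and output always agreeing in sign, the Hebbian
# update w <- w + mu * y * x multiplies w by (1 + mu * x * x) each step,
# so the weight grows geometrically without bound.

def hebbian_step(w, x, mu=0.1):
    y = w * x                 # single-input neuron output
    return w + mu * y * x

w = 1.0
for _ in range(100):
    w = hebbian_step(w, 1.0)  # the same excitation, over and over

# w is now 1.1**100, roughly 1.4e4, and keeps growing with more steps.
```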
Outlines
• Introduction to Associative Memory
• Models of Associative Memory
• Hebbian learning rule
• Linear Associative Memory
LAM (Linear Associative Memory) Network
• single-layer feed-forward network
• associates one or more pairs of vectors (Pi, ti) so that
T = W P
where P is a vector of size n×1,
T is a vector of size m×1,
and W is a matrix of size m×n.
[Figure: architecture of the Linear Associator network]
Learning of Linear Associative Memory
• Goal of learning: to obtain a set of weights wij from a set of
training pattern pairs {p : t}, such that when p is applied to the
input layer, t is computed at the output layer.
• The Hebbian learning rule can be used to learn a new
association pair (Pi, ti).
Hebbian rule for AM
• Similar to using Hebbian learning for classification.
• Algorithm (used with bipolar patterns (1, -1) or binary
patterns (1, 0)):
• Initially, wij = 0
• For each training sample (p, t): Δwij = tj · pi
• wij increases if both pi and tj are ON (binary) or have the same
sign (bipolar)
• Then, after updates for all Q training patterns:
wij = Σ(q=1..Q) tj(q) · pi(q)
Hebb’s Algorithm for Linear Associator
• Step 0: initialize all weights to 0.
• Step 1: take a training input pattern Pi and its output pattern ti.
• Step 2: adjust the weights: wi(new) = wi(old) + ti Pi
Or, in matrix form:
W(new) = W(old) + t P^T
• Step 3: go to Step 1.
Hebb’s Algorithm for Linear Associator, cont.
• Step 4 (Recall): after computing the weights and stopping
training, you can test (or recall) the network:
output: y = W p
f(yj) = 1 if yj > 0, 0 otherwise (binary)
OR
f(yj) = 1 if yj > 0, -1 otherwise (bipolar)
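Steps 0 through 4 can be sketched end to end; the use of numpy and the tiny bipolar training pairs are assumptions of the sketch, not part of the slides:

```python
# Sketch of Hebb's algorithm for a linear associator:
# Step 0 zero weights, Steps 1-3 accumulate W += t p^T, Step 4 recall.
import numpy as np

def train_hebb(pairs, n, m):
    W = np.zeros((m, n))               # Step 0: all weights 0
    for p, t in pairs:                 # Steps 1-3: one update per pair
        W += np.outer(t, p)            # W(new) = W(old) + t p^T
    return W

def recall(W, p):
    y = W @ p                          # Step 4: output y = W p
    return np.where(y > 0, 1, -1)      # bipolar activation

pairs = [(np.array([1, -1, -1]), np.array([1, -1])),
         (np.array([-1, 1, -1]), np.array([-1, 1]))]
W = train_hebb(pairs, n=3, m=2)
out = recall(W, pairs[0][0])           # recovers t1 = (1, -1)
```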
Using outer product
Instead of obtaining W by iterative updates, it can be computed from
the training set by summing the outer products of tq and pq over all Q
samples (zero initial weights):
W = t1 p1^T + t2 p2^T + ... + tQ pQ^T = Σ(q=1..Q) tq pq^T
In this case, Hebb’s rule is the same as taking the outer product of
the two vectors; the outer product of two vectors is a matrix:
t p^T = [t1, t2, ..., tm]^T [p1, p2, ..., pn]
      = [ t1 p1  t1 p2  ...  t1 pn ]
        [ t2 p1  t2 p2  ...  t2 pn ]
        [  ...                     ]
        [ tm p1  tm p2  ...  tm pn ]
Example 1: hetero-associative memory
• 1- Required to build a neural network which will associate the
following 4 training samples:
input P      output t
(1 0 0 0)    (1, 0)
(1 1 0 0)    (1, 0)
(0 0 0 1)    (0, 1)
(0 0 1 1)    (0, 1)
• 2- Test (recall) the network for:
x = (1 0 0 0)
x = (0 1 1 0)
x = (0 1 0 0)
Solution: First, build the network:
1- Find the four outer products:
t1 p1^T = [1, 0]^T (1 0 0 0) = [ 1 0 0 0 ]
                               [ 0 0 0 0 ]
t2 p2^T = [1, 0]^T (1 1 0 0) = [ 1 1 0 0 ]
                               [ 0 0 0 0 ]
t3 p3^T = [0, 1]^T (0 0 0 1) = [ 0 0 0 0 ]
                               [ 0 0 0 1 ]
t4 p4^T = [0, 1]^T (0 0 1 1) = [ 0 0 0 0 ]
                               [ 0 0 1 1 ]
2- Add all four individual weight matrices to produce the final
weight matrix:
W = [ 2 1 0 0 ]
    [ 0 0 1 2 ]
(Each row defines the weights for one output neuron.)
Solution, cont.
Recall: try the first input pattern x = (1 0 0 0):
y = W (1 0 0 0)^T = (2 0)
f(y) = (1 0), a correct recall.
Recall: input pattern x = (0 1 1 0)
(not sufficiently similar to any training input):
y = W (0 1 1 0)^T = (1 1)
f(y) = (1 1), not a stored pattern.
Solution, cont.
Recall: x = (0 1 0 0) (similar to P1 and P2):
y = W (0 1 0 0)^T = (1 0)
f(y) = (1 0), recalls p1.
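The worked example can be double-checked numerically; the use of numpy here is an assumption for the check, not part of the slides:

```python
# Numerical check of Example 1: sum the four outer products, then recall
# the three test vectors with the binary threshold f(y) = 1 if y > 0 else 0.
import numpy as np

P = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 1]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

W = sum(np.outer(t, p) for p, t in zip(P, T))   # [[2 1 0 0], [0 0 1 2]]

def recall(x):
    return (W @ np.array(x) > 0).astype(int)

r1 = recall([1, 0, 0, 0])   # (1, 0): correct recall
r2 = recall([0, 1, 1, 0])   # (1, 1): not a stored pattern
r3 = recall([0, 1, 0, 0])   # (1, 0): recalls p1's output
```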
Example 2: auto-associative memory
• Same as hetero-associative nets, except pi = ti for all i = 1, 2, ...
• Used to recall a pattern from a noisy or incomplete version of it
(pattern completion/pattern recovery).
• The weight matrix W is always a symmetric matrix.
• Example: a single pattern p = (1, 1, 1, -1) is stored, and the
weights computed by the Hebbian rule (outer product) are:
W = t p^T = p p^T = [  1  1  1 -1 ]
                    [  1  1  1 -1 ]
                    [  1  1  1 -1 ]
                    [ -1 -1 -1  1 ]
Example 2: auto-associative memory, cont.
• Example, cont.: recall the following patterns:
1- Recall the training pattern x = (1 1 1 -1):
y = W (1 1 1 -1)^T = (4 4 4 -4),  f(y) = (1 1 1 -1), recalls p.
2- Recall the noisy pattern x = (-1 1 1 -1):
y = W (-1 1 1 -1)^T = (2 2 2 -2),  f(y) = (1 1 1 -1), recalls p.
3- Recall the pattern with missing information x = (0 0 1 -1):
y = W (0 0 1 -1)^T = (2 2 2 -2),  f(y) = (1 1 1 -1), recalls p.
4- Recall the more noisy pattern x = (-1 1 -1 -1):
y = W (-1 1 -1 -1)^T = (0 0 0 0), not recognized.
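This example, too, can be verified numerically; numpy and the sign-based activation (with 0 left as 0 to mark "not recognized") are assumptions of the check:

```python
# Numerical check of Example 2: W = p p^T for p = (1, 1, 1, -1), then
# recall the four test vectors; a zero output vector means "not recognized".
import numpy as np

p = np.array([1, 1, 1, -1])
W = np.outer(p, p)

def recall(x):
    return np.sign(W @ np.array(x))

r1 = recall([1, 1, 1, -1])    # training pattern  -> recalls p
r2 = recall([-1, 1, 1, -1])   # one bit flipped   -> recalls p
r3 = recall([0, 0, 1, -1])    # missing entries   -> recalls p
r4 = recall([-1, 1, -1, -1])  # two bits flipped  -> all zeros
```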
Illustrative Example 3: Autoassociative Memory
In the first image, represent each white pixel with -1 and each black
pixel with 1; then the first row of the image can be written as:
-1 1 1 1 -1
and the second row as:
1 -1 -1 -1 1
and so on, concatenating the rows to form the input vector p1.
The weight matrix will be:
W = p1 p1^T + p2 p2^T + p3 p3^T
Tests: recall is tested with 50% occluded, 67% occluded, and noisy
(7 pixels changed) versions of the stored patterns.
Exercise
• Required to build a neural network which will associate the following
two sets of patterns using Hebb’s Rule:
p1 = ( 1 -1 -1 -1) t1 = ( 1 -1 -1)
p2 = (-1 1 -1 -1) t2 = ( 1 -1 1)
p3 = (-1 -1 1 -1) t3 = (-1 1 -1)
p4 = (-1 -1 -1 1) t4 = (-1 1 1)
