
CHAPTER 4

ASSOCIATIVE MEMORY
NETWORKS
PATTERN ASSOCIATION
 Associating patterns which are

• similar,
• contrary,
• in close proximity (spatial),
• in close succession (temporal).

 Associative recall

• evoke associated patterns,
• recall a pattern by part of it,
• evoke/recall with incomplete/noisy patterns.
ASSOCIATIVE MEMORY (AM) NETWORK
 Two types of associations exist. For two patterns s and t

• hetero-association (s != t): relating two different patterns (s – input, t – target).
• auto-association (s = t): relating parts of a pattern with other parts.

 Architectures of NN associative memory:


• single layer (with or without a separate input layer),
• two layers (for bidirectional association).

 Learning algorithms for AM:


• Hebbian learning rule and its variations,
• gradient descent.
ASSOCIATIVE MEMORY NETWORK
 WORKING PROCESS

• Recall a stored pattern from a noisy or incomplete input pattern.
• Use the weights that capture the association.
• Stored patterns are viewed as “attractors”, each with its own “attraction basin”.
• This type of NN is called “associative memory”: recall is by association, not by explicit indexing/addressing.
TRAINING ALGORITHM FOR ASSOCIATIVE
MEMORY NETWORK
 Network structure: single layer

• one output layer of non-linear units and one input layer.

[Diagram: single-layer network. Training inputs s_1 … s_n are applied to input units x_1 … x_n, which connect through weights w_11 … w_nm to output units y_1 … y_m with targets t_1 … t_m.]

Goal of learning:

• to obtain a set of weights w_ij from a set of training pattern pairs {s:t} such that when s is applied to the input layer, t is computed at the output layer,
• for all training pairs s:t: $t_j = f(s^T w_j)$ for all j.
HEBB RULE FOR PATTERN ASSOCIATION
 Algorithm (bipolar or binary patterns):

• For each training sample s:t: $\Delta w_{ij} = s_i\, t_j$
• $w_{ij}$ increases if both $s_i$ and $t_j$ are ON (binary) or have the same sign (bipolar).
• If the weights start at $w_{ij} = 0$, then, after updates for all P training patterns,
$$w_{ij} = \sum_{p=1}^{P} s_i(p)\, t_j(p), \qquad W = \{ w_{ij} \}$$
• Instead of obtaining W by iterative updates, it can be computed from the training set by calculating the outer product of s and t.
OUTER PRODUCT FOR PATTERN ASSOCIATION
 Let s and t be row vectors. Then, for a particular training pair s:t,
$$W(p) = s^T(p)\, t(p) = \begin{pmatrix} s_1 \\ \vdots \\ s_n \end{pmatrix} \begin{pmatrix} t_1, \ldots, t_m \end{pmatrix} = \begin{pmatrix} s_1 t_1 & \cdots & s_1 t_m \\ \vdots & & \vdots \\ s_n t_1 & \cdots & s_n t_m \end{pmatrix} = \begin{pmatrix} w_{11} & \cdots & w_{1m} \\ \vdots & & \vdots \\ w_{n1} & \cdots & w_{nm} \end{pmatrix}$$
 and, summing over all P training pairs,
$$W = \sum_{p=1}^{P} s^T(p)\, t(p)$$
HETERO-ASSOCIATIVE MEMORY NETWORK
• Binary pattern pairs s:t with |s| = 4 and |t| = 2.
• Total weighted input to output units: $y\_in_j = \sum_i x_i\, w_{ij}$
• Activation function: threshold
$$y_j = \begin{cases} 1 & \text{if } y\_in_j > 0 \\ 0 & \text{if } y\_in_j \le 0 \end{cases}$$
• Weights are computed by the Hebbian rule (sum of outer products of all training pairs):
$$W = \sum_{p=1}^{P} s^T(p)\, t(p)$$
• Training samples:
s(p) t(p)
p=1 (1 0 0 0) (1, 0)
p=2 (1 1 0 0) (1, 0)
p=3 (0 0 0 1) (0, 1)
p=4 (0 0 1 1) (0, 1)
COMPUTING THE WEIGHTS
$$s^T(1)\,t(1) = \begin{pmatrix}1\\0\\0\\0\end{pmatrix}\begin{pmatrix}1&0\end{pmatrix} = \begin{pmatrix}1&0\\0&0\\0&0\\0&0\end{pmatrix} \qquad s^T(2)\,t(2) = \begin{pmatrix}1\\1\\0\\0\end{pmatrix}\begin{pmatrix}1&0\end{pmatrix} = \begin{pmatrix}1&0\\1&0\\0&0\\0&0\end{pmatrix}$$
$$s^T(3)\,t(3) = \begin{pmatrix}0\\0\\0\\1\end{pmatrix}\begin{pmatrix}0&1\end{pmatrix} = \begin{pmatrix}0&0\\0&0\\0&0\\0&1\end{pmatrix} \qquad s^T(4)\,t(4) = \begin{pmatrix}0\\0\\1\\1\end{pmatrix}\begin{pmatrix}0&1\end{pmatrix} = \begin{pmatrix}0&0\\0&0\\0&1\\0&1\end{pmatrix}$$
$$W = \begin{pmatrix}2&0\\1&0\\0&1\\0&2\end{pmatrix}$$
TEST/RECALL THE NETWORK
 Recall with the training input $x = (1\;0\;0\;0)$:
$$(1\;0\;0\;0)\,W = (2\;0) \;\Rightarrow\; y_1 = 1,\; y_2 = 0$$
 Recall with $x = (0\;1\;1\;0)$:
$$(0\;1\;1\;0)\,W = (1\;1) \;\Rightarrow\; y_1 = 1,\; y_2 = 1$$
 Recall with $x = (0\;1\;0\;0)$, which is similar to s(1) and s(2):
$$(0\;1\;0\;0)\,W = (1\;0) \;\Rightarrow\; y_1 = 1,\; y_2 = 0$$
 (1 0 0 0) and (1 1 0 0) belong to class (1, 0); (0 0 0 1) and (0 0 1 1) belong to class (0, 1). The noisy input (0 1 0 0) is still recalled as class (1, 0), while (0 1 1 0) is not sufficiently similar to either class, so its output (1, 1) matches neither stored target.
AUTO-ASSOCIATIVE MEMORY NETWORK
• Same as hetero-associative nets, except t(p) = s(p).
• Used to recall a pattern from its noisy or incomplete version (pattern completion / pattern recovery).
• A single pattern s = (1, 1, 1, -1) is stored (weights computed by the Hebbian / outer-product rule):
$$W = \begin{pmatrix} 1&1&1&-1 \\ 1&1&1&-1 \\ 1&1&1&-1 \\ -1&-1&-1&1 \end{pmatrix}$$
training pattern: $(1\;1\;1\;{-1})\,W = (4\;4\;4\;{-4}) \to (1\;1\;1\;{-1})$
noisy pattern: $({-1}\;1\;1\;{-1})\,W = (2\;2\;2\;{-2}) \to (1\;1\;1\;{-1})$
missing info: $(0\;0\;1\;{-1})\,W = (2\;2\;2\;{-2}) \to (1\;1\;1\;{-1})$
more noise: $({-1}\;{-1}\;1\;{-1})\,W = (0\;0\;0\;0)$, not recognized
AUTO-ASSOCIATIVE MEMORY NETWORK –
DIAGONAL ELEMENTS
• Diagonal elements (each equal to P) will dominate the computation when multiple patterns are stored.
• When P is large, W is close to an identity matrix. This causes output = input, which may not be any stored pattern. The pattern-correction power is lost.
• Solution: replace the diagonal elements by zero.
$$W_0 = \begin{pmatrix} 0&1&1&-1 \\ 1&0&1&-1 \\ 1&1&0&-1 \\ -1&-1&-1&0 \end{pmatrix}$$
$(1\;1\;1\;{-1})\,W_0 = (3\;3\;3\;{-3}) \to (1\;1\;1\;{-1})$
$({-1}\;1\;1\;{-1})\,W_0 = (3\;1\;1\;{-1}) \to (1\;1\;1\;{-1})$
$(0\;0\;1\;{-1})\,W_0 = (2\;2\;1\;{-1}) \to (1\;1\;1\;{-1})$
$({-1}\;{-1}\;1\;{-1})\,W_0 = (1\;1\;{-1}\;1) \to$ wrong
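A small NumPy sketch of the auto-associative store/recall above, with the diagonal zeroed; the sign-based recall helper is my own shorthand for the bipolar threshold.

```python
import numpy as np

s = np.array([1, 1, 1, -1])          # the single stored bipolar pattern
W = np.outer(s, s)                   # Hebbian outer product with t(p) = s(p)
W0 = W - np.diag(np.diag(W))         # zero the diagonal to keep correction power

def recall(x, W):
    # Bipolar threshold: sign of the net input (0 means "undecided").
    return np.sign(x @ W).astype(int)

print(recall(np.array([ 1,  1, 1, -1]), W0))   # [ 1  1  1 -1]  stored pattern
print(recall(np.array([-1,  1, 1, -1]), W0))   # [ 1  1  1 -1]  noisy input corrected
print(recall(np.array([ 0,  0, 1, -1]), W0))   # [ 1  1  1 -1]  missing entries filled in
print(recall(np.array([-1, -1, 1, -1]), W0))   # [ 1  1 -1  1]  too noisy: wrong recall
```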
STORAGE CAPACITY
• Storage capacity: the number of patterns that can be correctly stored and recalled by a network.
• More patterns can be stored if they are not similar to each other (e.g., if they are mutually orthogonal).
• Non-orthogonal: storing $s(1) = (1\;{-1}\;{-1}\;1)$ and $s(2) = (1\;1\;{-1}\;1)$ gives
$$W_0 = \begin{pmatrix} 0&0&-2&2 \\ 0&0&0&0 \\ -2&0&0&-2 \\ 2&0&-2&0 \end{pmatrix}$$
Recall: $(1\;{-1}\;{-1}\;1)\,W_0 \to (1\;0\;{-1}\;1)$, so s(1) is not stored correctly: its second component can never be recovered.
• Orthogonal: when three mutually orthogonal patterns are stored, all three can be correctly recalled from the resulting $W_0$ (a sketch follows below).
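As an illustration of the orthogonal case, the sketch below stores three mutually orthogonal bipolar patterns (my own choice of patterns, not necessarily the slide's) and checks that each is recalled correctly from the zero-diagonal weight matrix.

```python
import numpy as np

# Three mutually orthogonal bipolar patterns (illustrative choice).
patterns = np.array([[ 1,  1, -1, -1],
                     [-1,  1,  1, -1],
                     [-1,  1, -1,  1]])

W = sum(np.outer(p, p) for p in patterns)
W0 = W - np.diag(np.diag(W))                    # zero diagonal

for p in patterns:
    recalled = np.sign(p @ W0).astype(int)
    print(p, "->", recalled, "ok" if np.array_equal(recalled, p) else "wrong")
```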
BIDIRECTIONAL ASSOCIATIVE MEMORY (BAM)
NETWORK
Architecture:
• Two layers of non-linear units: X-layer, Y-layer.
• Units: discrete threshold or continuous sigmoid (patterns can be either binary or bipolar).

Weights (Hebbian / outer product):
$$W_{n \times m} = \sum_{p=1}^{P} s^T(p)\, t(p)$$
 Symmetric weights: $w_{ij} = w_{ji}$.
 Convert binary patterns to bipolar when constructing W.
RECALL OF BAM NETWORK
 Bidirectional: recall either by X (to recall Y) or by Y (to recall X).
 Recurrent:
$$y(t) = [\,f(y\_in_1(t)), \ldots, f(y\_in_m(t))\,], \qquad \text{where } y\_in_j(t) = \sum_{i=1}^{n} w_{ij}\, x_i(t-1)$$
$$x(t+1) = [\,f(x\_in_1(t+1)), \ldots, f(x\_in_n(t+1))\,], \qquad \text{where } x\_in_i(t+1) = \sum_{j=1}^{m} w_{ij}\, y_j(t)$$
 Update can be either asynchronous (as in hetero-associative memory) or synchronous (change all Y units at one time, then all X units the next time).
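A minimal synchronous BAM recall sketch in NumPy; the two stored pattern pairs and the helper names are illustrative assumptions. The X-layer and Y-layer are updated alternately until the states stop changing.

```python
import numpy as np

def step(net, prev):
    # Bipolar threshold; a unit keeps its previous value when its net input is 0.
    out = np.sign(net)
    out[out == 0] = prev[out == 0]
    return out.astype(int)

# Two bipolar training pairs (illustrative).
S = np.array([[ 1, -1,  1, -1],
              [-1, -1,  1,  1]])
T = np.array([[ 1, -1],
              [ 1,  1]])
W = sum(np.outer(s, t) for s, t in zip(S, T))   # n x m BAM weight matrix

def bam_recall(x, W, steps=10):
    y = np.zeros(W.shape[1], dtype=int)
    for _ in range(steps):
        y = step(x @ W, y)          # y_in_j = sum_i w_ij x_i  (X drives Y)
        x = step(W @ y, x)          # x_in_i = sum_j w_ij y_j  (Y drives X)
    return x, y

print(bam_recall(np.array([1, -1, 1, -1]), W))   # recovers the stored pair (s(1), t(1))
```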
ACTIVATION FUNCTIONS IN BAM
 The activation function depends on whether the input/target vector pairs are binary or bipolar.
 Y-layer: with binary vectors, $y_j = 1$ if $y\_in_j > 0$ and $y_j = 0$ if $y\_in_j < 0$; with bipolar vectors, $y_j = 1$ if $y\_in_j > \theta_j$ and $y_j = -1$ if $y\_in_j < \theta_j$.
 X-layer: the same step functions, applied to $x\_in_i$.
 In either case, a unit keeps its previous activation when its net input equals the threshold.
DISCRETE HOPFIELD NETWORK (DHN)
 A single-layer network
• each node acts as both an input and an output unit.
 More than an associative memory, the Discrete Hopfield Network can be used in several applications,
• such as combinatorial optimization.
 Different forms: discrete & continuous.
 Major contributions of John Hopfield to NNs:
• treating a network as a dynamic system,
• introduction of the energy function into neural network research.
ARCHITECTURE OF DHN
 Architecture
• Single layer (units serve as both input and output):
 nodes are threshold units (binary or bipolar),
 weights: fully connected, symmetric, and zero diagonal:
$$w_{ij} = w_{ji}, \qquad w_{ii} = 0$$
• $x_i$ are external inputs, which may be transient or permanent.
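A sketch of discrete Hopfield storage and asynchronous recall; the random update order, the sweep count, and the function names are my own assumptions, not from the slides.

```python
import numpy as np

def hopfield_weights(patterns):
    # Hebbian sum of outer products with zero diagonal (w_ii = 0).
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0)
    return W

def hopfield_recall(W, x, sweeps=10, seed=0):
    x = x.copy()
    rng = np.random.default_rng(seed)
    for _ in range(sweeps):
        for i in rng.permutation(len(x)):   # asynchronous: one unit at a time
            net = W[i] @ x
            if net != 0:                    # keep the previous state when net = 0
                x[i] = 1 if net > 0 else -1
    return x

W = hopfield_weights(np.array([[1, 1, 1, -1]]))
print(hopfield_recall(W, np.array([-1, 1, 1, -1])))   # -> [ 1  1  1 -1]
```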
STORAGE CAPACITY OF DHN
 P: the maximum number of random patterns of dimension n that can be stored in a DHN of n nodes.
 Hopfield's observation: $P \approx 0.15\,n$, i.e. $\dfrac{P}{n} \approx 0.15$
 Theoretical analysis: $P \approx \dfrac{n}{2\log_2 n}$, i.e. $\dfrac{P}{n} \approx \dfrac{1}{2\log_2 n}$
 P/n decreases as n grows because larger n leads to more interference between stored patterns.
 There is work on modifying the Hopfield memory to bring its capacity close to n; there W is trained rather than computed by the Hebbian rule.
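A quick numeric check of the two capacity estimates, here for n = 100 nodes (the choice of n is arbitrary):

```python
import math

n = 100
print(0.15 * n)                  # Hopfield's empirical estimate: ~15 patterns
print(n / (2 * math.log2(n)))    # theoretical estimate: ~7.5 patterns
```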
CONTINUOUS HOPFIELD NET
 Architecture
• Continuous node outputs and continuous time.
• Fully connected with symmetric weights: $w_{ij} = w_{ji}$, $w_{ii} = 0$.
• Internal activation $u_i$ evolves as
$$\frac{du_i(t)}{dt} = \sum_{j=1}^{n} w_{ij}\, x_j(t) + \theta_i = net_i(t)$$
• Output (state): $x_i(t) = f(u_i(t))$,
where f is a sigmoid function that keeps the output in the binary/bipolar range, e.g. for bipolar outputs the hyperbolic tangent:
$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
CONTINUOUS HOPFIELD NET
 Computation: all units change their outputs (states) at the same time, based on the states of all the others.
• Compute the net input:
$$net_i(t) = \sum_{j=1}^{n} w_{ij}\, x_j(t) + \theta_i$$
• Compute the internal activation $u_i$ by a first-order Taylor expansion:
$$u_i(t+\tau) = u_i(t) + \frac{du_i(t)}{dt}\,\tau + \ldots \approx u_i(t) + net_i(t)\,\tau$$
• Compute the output:
$$x_i(t+\tau) = f(u_i(t+\tau))$$
ITERATIVE ASSOCIATIVE MEMORY NETWORK
Example
 The pattern $x = (1, 1, 1, -1)$ is stored, with threshold output units and
$$W = \begin{pmatrix} 0 & 1 & 1 & -1 \\ 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & -1 \\ -1 & -1 & -1 & 0 \end{pmatrix}$$
 An incomplete recall input: $x' = (1, 0, 0, 0)$
$$W x' = (0, 1, 1, -1) = x''$$
$$W x'' = (3, 2, 2, -2) \to (1, 1, 1, -1) = x$$
 In general, the current output is used as the input of the next iteration:
x(0) = initial recall input
x(I) = f(W x(I-1)), I = 1, 2, ...
until x(N) = x(K) for some K < N.
 Dynamic system: state vector x(I).
 If K = N-1, x(N) is a stable state (fixed point):
f(W x(N)) = f(W x(N-1)) = x(N)
 If x(K) is one of the stored patterns, then x(K) is called a genuine memory.
 Otherwise, x(K) is a spurious memory (caused by cross-talk/interference between genuine memories).
 Each fixed point (genuine or spurious memory) is an attractor, each with its own attraction basin.
 If K != N-1, the recall ends in a limit cycle: the network repeats x(K), x(K+1), ..., x(N) = x(K) as iteration continues.
 Iteration will eventually stop (such a K always exists) because the total number of distinct states is finite (3^n for threshold units with outputs in {-1, 0, 1}). If the states are continuous, the system may evolve forever (chaos) if no such K exists.
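A sketch of the iterative recall loop with fixed-point/limit-cycle detection, reusing W and the incomplete input from the example above; the helper names are my own.

```python
import numpy as np

def iterative_recall(W, x0, max_iters=100):
    """Iterate x(I) = f(W x(I-1)) until a previously seen state repeats."""
    f = lambda v: np.sign(v).astype(int)   # threshold units, outputs in {-1, 0, 1}
    x = np.array(x0)
    history = [tuple(x)]
    for _ in range(max_iters):
        x = f(W @ x)
        if tuple(x) in history:            # x(N) = x(K): fixed point or limit cycle
            break
        history.append(tuple(x))
    return x, history

W = np.array([[ 0,  1,  1, -1],
              [ 1,  0,  1, -1],
              [ 1,  1,  0, -1],
              [-1, -1, -1,  0]])
x, hist = iterative_recall(W, [1, 0, 0, 0])   # incomplete recall input
print(hist, x)   # passes through (0, 1, 1, -1) and settles at (1, 1, 1, -1)
```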
SUMMARY
 This chapter discussed the various associative memory networks:

• Autoassociative Network

• Hetero-associative Network

• Bidirectional Associative Memory Network

• Hopfield Nets

• Iterative Associative Nets
