Artificial Neural Network
Outline
Artificial Neural Network (ANN)
- Background and Motivation
- The Brain
- Structure of Neurons
- What is ANN?
- Components of ANN
Why ANN?
Some tasks can be done easily (effortlessly) by humans but are hard for the conventional algorithmic approach on Von Neumann machines:
- Pattern recognition (old friends, handwritten characters)
- Content-addressable recall
- Approximate, common-sense reasoning (driving, playing piano, hitting a baseball)
These tasks are often ill-defined, experience-based, and hard to apply logic to.
Educational institutions, industry, military
- > 500 books on the subject
- > 20 journals dedicated to ANNs
- Numerous popular, industry, and academic articles
Arithmetic: computer wins. Vision: brain wins.
Memory of arbitrary details: computer wins. Memory of real-world facts: brain wins.
A computer must be programmed explicitly. The brain can learn by experiencing the world.
COMPUTER
- ordered structure, serial processor
- 10,000,000 operations per second
- one operation at a time
- 1 or 2 inputs

BRAIN
- 10^10 neuron processors
- 10^4 connections
- 100 operations per second
- millions of operations at a time
- thousands of inputs
[Charts: processor speed and computational power comparisons]
Comparison
Von Neumann machine:
- One or a few high-speed (ns) processors with considerable computing power
- One or a few shared high-speed buses for communication
- Sequential memory access by address
- Problem-solving knowledge is separated from the computing component
- Hard to be adaptive

Human Brain:
- Large # (10^11) of low-speed (ms) processors with limited computing power
- Large # (10^15) of low-speed connections
- Content-addressable recall (CAM)
- Problem-solving knowledge resides in the connectivity of neurons
- Adaptation by changing the connectivity
History of ANN
McCulloch & Pitts (1943)
- First mathematical model of biological neurons
- All Boolean operations can be implemented by these neuron-like nodes (with different thresholds and excitatory/inhibitory connections)
- Competitor to the Von Neumann model as a general-purpose computing device
- Origin of automata theory
Hebb (1949)
Hebbian rule of learning: increase the connection strength between neurons i and j whenever both i and j are activated. Or increase the connection strength between nodes i and j whenever both nodes are simultaneously ON or OFF.
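As a sketch, the first variant is often written Δwij = η·xi·xj. A minimal Python version, where η (the learning rate) and the unipolar {0, 1} activations are illustrative assumptions, not taken from the slides:

```python
# Minimal sketch of the Hebbian rule: strengthen w_ij when units i and j
# are active together. eta and the {0,1} activations are assumptions.

def hebbian_update(w_ij, x_i, x_j, eta=0.1):
    """delta w_ij = eta * x_i * x_j (grows when both units are ON)."""
    return w_ij + eta * x_i * x_j

w = 0.0
w = hebbian_update(w, x_i=1, x_j=1)  # both ON  -> w grows to 0.1
w = hebbian_update(w, x_i=1, x_j=0)  # one OFF  -> w unchanged
print(w)  # 0.1
```

For the "simultaneously ON or OFF" variant, bipolar {-1, +1} activations make the product xi·xj positive exactly when the two units agree.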
History of ANN
Early booming (50s to early 60s): Rosenblatt (1958)
- Perceptron: a network of threshold nodes for pattern classification
- Perceptron learning rule (a minimal sketch follows below)
- Perceptron convergence theorem: everything that can be represented by a perceptron can be learned
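A minimal sketch of the perceptron learning rule, w <- w + η(target - output)·x, with the bias folded in as an extra weight. Training on AND, the value of η, and the epoch count are illustrative choices, not from the slides:

```python
# Perceptron learning rule on a single threshold unit.
# x[0] = 1 is a constant bias input; w[0] is the bias weight.

def predict(w, x):
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net > 0 else 0

def train_perceptron(samples, eta=0.1, epochs=20):
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, target in samples:
            err = target - predict(w, x)              # 0 when already correct
            w = [wi + eta * err * xi for wi, xi in zip(w, x)]
    return w

# Learn logical AND (linearly separable, so convergence is guaranteed)
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = train_perceptron(data)
print([predict(w, x) for x, _ in data])  # -> [0, 0, 0, 1]
```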
History of ANN
The setback (mid 60s to late 70s)
Serious problems with the perceptron model (Minsky and Papert's book, 1969):
- Single-layer perceptrons cannot represent (learn) simple functions such as XOR
- Multiple layers of nonlinear units may have greater power, but there was no learning rule for such nets
- Scaling problem: connection weights may grow infinitely
The first two problems were overcome by later efforts in the 80s, but the scaling problem persists.
Death of Rosenblatt (1971).
Thriving of the Von Neumann machine and AI.
History of ANN
- Impressive applications (character recognition, speech recognition, text-to-speech transformation, process control, associative memory, etc.)
- Traditional approaches face difficult challenges
Caution:
- Don't underestimate difficulties and limitations
- ANNs pose more problems than solutions
IMPORTANT BOOKS
1990: Artificial Neural Systems, Jacek M. Zurada
1992: Neural Networks and Fuzzy Systems, Bart Kosko
1994: Neural Networks: A Comprehensive Foundation, Simon Haykin
Properties of the brain
- It can learn and reorganize itself from experience
- It adapts to the environment
- It is robust and fault tolerant
The Machine
- Calculation
- Precision
- Logic
Neural Networks
- Biological (Real)
- Mathematical (Artificial)
How do you tell them apart??? Squeeze them!!! Excite them!!!
Biological Neuron
Neurons communicate with each other; we will see later how this works. This communication is what forms the "neural network".
Action Potential
Refractory period: after carrying a pulse, the axon fiber is in a state of complete nonexcitability for a certain time, called the refractory period.
Each neuron has a cell body (soma), an axon, and many dendrites
A neuron can be in one of two states: firing and rest. A neuron fires if the total incoming stimulus exceeds its threshold.
Synapse: thin gap between axon of one neuron and dendrite of another.
Biological neuron
[Diagram: biological neuron with dendrites, cell body, nucleus, axon, and synapse labeled]
A neuron has dendrites, a cell body (soma), and an axon.
Information circulates from the dendrites to the axon via the cell body (soma).
The axon connects to dendrites of other neurons via synapses.
The information transmission happens at the synapses.
- Synapses vary in strength
- Synapses may be excitatory or inhibitory
ANN <-> Bio NN
- Node <-> Cell body
- Input signal <-> Signal from other neurons
- Node output <-> Firing frequency
- Node function <-> Firing mechanism
- Connection weights <-> Synapses / synaptic strength
- Highly parallel: simple local computation (at the neuron level) achieves global results as an emergent property of the interaction (at the network level)
- Pattern-directed (meaning of individual nodes only in the context of a pattern)
- Fault-tolerant / graceful degradation
- Learning/adaptation plays an important role
Activation state vector: a vector of the activation levels xi of the individual neurons in the network,
X = (x1, . . . , xn)^T ∈ R^n.
[Diagram: node sums its inputs into net and outputs f(net)]
- All inputs to a node come in at the same time and remain activated until the output is produced
- Weights are associated with the links
- f(net) is the node function
- net = Σ_{i=1}^{n} wi·xi is the most popular choice, as sketched below
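A minimal sketch of this node computation in Python; the hard-threshold f and the numeric values below are illustrative, not from the slides:

```python
# One node: net = sum_i w_i * x_i, output y = f(net), with f pluggable.

def node_output(x, w, f):
    net = sum(wi * xi for wi, xi in zip(w, x))   # net = sum_{i=1}^{n} w_i x_i
    return f(net)

# Example: hard-threshold activation, firing when net exceeds 0
y = node_output([0.5, -1.0, 2.0], [1.0, 0.5, 0.25],
                lambda net: 1 if net > 0 else 0)
print(y)  # -> 1, since net = 0.5 - 0.5 + 0.5 = 0.5 > 0
```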
One Neuron
McCulloch-Pitts Neuron Model
The McCulloch-Pitts model:
- spikes are interpreted as spike rates;
- synaptic strengths are translated into synaptic weights;
- excitation means a positive product between the incoming spike rate and the corresponding synaptic weight;
- inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight.
[Diagram: inputs x1 ... xn with weights w1 ... wn feed an integrate-and-threshold unit]
Integrate-and-fire Neuron
AND: w1 = w2 = 1.0, threshold T = 1.5

X1  X2  O(k+1)
0   0   0
0   1   0
1   0   0
1   1   1

OR: w1 = w2 = 1.0, threshold T = 0.9

X1  X2  O(k+1)
0   0   0
0   1   1
1   0   1
1   1   1
NOT: w1 = -1.0, threshold T = -0.5

X1  O(k+1)
0   1
1   0
NOR, NAND
NOR: an OR unit (w1 = w2 = 1.0, T = 1) followed by a NOT unit (w = -1.0, T = 0)

X1  X2  O(k+1)
0   0   1
0   1   0
1   0   0
1   1   0
NAND: each input passes through a NOT unit (w = -1.0, T = 0); the NOT outputs feed an OR unit (w1 = w2 = 1.0, T = 1)

X1  X2  O(k+1)
0   0   1
0   1   1
1   0   1
1   1   0
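The gate constructions above are easy to check in code. A minimal sketch, assuming the firing rule "output 1 when net >= T" (the slides do not state whether the comparison is strict); NOR and NAND are composed from the single gates, which matches the two-layer constructions behaviorally:

```python
# McCulloch-Pitts threshold unit with the gate parameters from the slides.

def mp_neuron(x, w, threshold):
    """Fire (1) if the weighted sum reaches the threshold, else 0."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= threshold else 0

def AND(x1, x2):  return mp_neuron([x1, x2], [1.0, 1.0], 1.5)
def OR(x1, x2):   return mp_neuron([x1, x2], [1.0, 1.0], 0.9)
def NOT(x1):      return mp_neuron([x1], [-1.0], -0.5)
def NOR(x1, x2):  return NOT(OR(x1, x2))          # OR followed by NOT
def NAND(x1, x2): return OR(NOT(x1), NOT(x2))     # De Morgan: NOT inputs, then OR

print("X1 X2 AND OR NOR NAND")
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2), NOR(x1, x2), NAND(x1, x2))
```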
[Worked example (figure lost in extraction): given a truth table, solve for the weight and threshold; solution values: -1 and -1/2]
Nodal Representation and Vector Notation

With input vector x and weight vector w, the neuron computes
net = w^T x
y = o = f(net) = f(w^T x),
a nonlinear activation function f applied as a nonlinear operation on x.
Activation functions
Transforms a neuron's input into its output. Features of activation functions:
- A squashing effect is required: it prevents accelerating growth of activation levels through the network.
Activation Function
A signal function may typically be:
- binary threshold
- linear threshold
- sigmoidal
- Gaussian
- probabilistic
Sigmoidal
[Plot: sigmoidal activation function f(net), shown together with an equivalent algebraic form]
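The sigmoid formulas themselves did not survive extraction; this is a sketch of the standard unipolar sigmoid and an equivalent bipolar form, with an assumed steepness parameter λ:

```python
import math

def unipolar_sigmoid(net, lam=1.0):
    """f(net) = 1 / (1 + e^(-lam * net)), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * net))

def bipolar_sigmoid(net, lam=1.0):
    """f(net) = 2 / (1 + e^(-lam * net)) - 1 = tanh(lam * net / 2), in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-lam * net)) - 1.0

print(unipolar_sigmoid(0.0))  # 0.5
print(bipolar_sigmoid(0.0))   # 0.0
```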
sgn(net) = +1 if net > 0, -1 if net < 0
Bipolar means both positive and negative responses of the neuron are produced.
[Plot: activation function f(net)]
σj is the Gaussian spread factor and cj is the center. Varying the spread σj makes the function sharper or more diffuse.
Changing the center cj shifts the function to the right or left along the activation axis. This function is an example of a non-monotonic signal function.
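A sketch of a Gaussian signal function in this form; the exact slide formula is not recoverable, so the standard form with center cj and spread σj is assumed:

```python
import math

def gaussian(net, c_j=0.0, sigma_j=1.0):
    """f(net) = exp(-(net - c_j)^2 / (2 * sigma_j^2)), peaking at net = c_j."""
    return math.exp(-((net - c_j) ** 2) / (2.0 * sigma_j ** 2))

print(gaussian(0.0))               # 1.0 at the center
print(gaussian(1.0, sigma_j=0.5))  # smaller sigma_j -> sharper function
```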
Stochastic Neurons
The signal is assumed to be two-state:
sj ∈ {0, 1} or {-1, +1}
The neuron switches between these states depending on a probabilistic function of its activation, P(xj).
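A minimal sketch of such a neuron; the logistic choice of P(xj) below is an assumption, since the slides leave P unspecified:

```python
import math
import random

def stochastic_neuron(x_j):
    """Fire (s_j = 1) with probability P(x_j); here P is the logistic function."""
    p_fire = 1.0 / (1.0 + math.exp(-x_j))   # P(x_j) in (0, 1)
    return 1 if random.random() < p_fire else 0

# The empirical firing rate approaches P(x_j): here roughly 0.62 for x_j = 0.5
samples = [stochastic_neuron(0.5) for _ in range(1000)]
print(sum(samples) / 1000)
```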
TLU (Threshold Logic Unit)