
Artificial Neural Network

Dr Ashutosh Gupta

Outline

- ANN Background and Motivation
- The Brain
- Structure of Neurons
- What is ANN?
- Components of ANN
- McCulloch-Pitts Neuron Model
- Artificial Neural Element
- Activation Functions

Artificial Neural Networks

Other terms/names:
- connectionist models
- parallel distributed processing
- neural computation
- adaptive networks

Why ANN?

Some tasks that humans do easily (effortlessly) are hard for conventional algorithmic approaches on a von Neumann machine:
- Pattern recognition (recognizing old friends, handwritten characters)
- Content-addressable recall
- Approximate, common-sense reasoning (driving, playing the piano, hitting a baseball)

These tasks are often ill-defined, experience-based, and hard to capture with explicit logic.

ANN Background and Motivation

Growth has been explosive since 1987:
- education institutions, industry, military
- more than 500 books on the subject
- more than 20 journals dedicated to ANNs
- numerous popular, industry, and academic articles

A truly interdisciplinary area of study. No longer a "flash in the pan" technology.

Background and Motivation

Computers and the Brain: A Contrast

- Arithmetic: 1 brain = 1/10 pocket calculator
- Vision: 1 brain = 1000 supercomputers
- Memory of arbitrary details: computer wins
- Memory of real-world facts: brain wins
- A computer must be programmed explicitly
- The brain can learn by experiencing the world

Background and Motivation

ITEM                  COMPUTER                                  BRAIN
Complexity            ordered structure, serial processor       ~10^10 neuron processors, ~10^4 connections each
Processor speed       10,000,000 operations per second          ~100 operations per second
Computational power   one operation at a time, 1 or 2 inputs    millions of operations at a time, thousands of inputs

Background and Motivation

Inherent Advantages of the Brain:
- distributed processing and representation
- parallel processing speeds
- fault tolerance
- graceful degradation
- ability to generalize

Comparison

Von Neumann machine                                     Human Brain
------------------------------------------------------------------------------------------------
One or a few high-speed (ns) processors with            Large number (~10^11) of low-speed (ms)
considerable computing power                            processors with limited computing power
One or a few shared high-speed buses for communication  Large number (~10^15) of low-speed connections
Sequential memory access by address                     Content-addressable recall (CAM)
Problem-solving knowledge is separated from the         Problem-solving knowledge resides in the
computing component                                     connectivity of neurons
Hard to be adaptive                                     Adaptation by changing the connectivity

Background and Motivation

History of Artificial Neural Networks

Creation:
- 1890: William James defined a neuronal process of learning

Promising technology:
- 1943: McCulloch and Pitts publish the earliest mathematical models; recognised as the designers of the first neural network
- 1949: the first learning rule (Hebb)
- 1954: Hebb and an IBM research group: earliest simulations
- 1958: Frank Rosenblatt: the Perceptron algorithm

Disenchantment:
- 1969: Minsky and Papert show perceptrons have severe limitations ("death" of ANN)

Re-emergence (period of revival):
- 1980s: Grossberg, Hopfield, and Rumelhart: backpropagation algorithms, Adaptive Resonance Theory
- 1985: multilayer nets that use backpropagation
- 1986: PDP Research Group: a multidisciplined approach

Maturation of the field:
- 1990s

History of ANN

Pitts & McCulloch (1943):
- First mathematical model of biological neurons
- All Boolean operations can be implemented by these neuron-like nodes (with different thresholds and excitatory/inhibitory connections)
- A competitor to the von Neumann model as a general-purpose computing device
- The origin of automata theory

Hebb (1949):
- Hebbian rule of learning: increase the connection strength between neurons i and j whenever both i and j are activated; or increase the connection strength between nodes i and j whenever both nodes are simultaneously ON or OFF.
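A minimal sketch of this rule in Python (the learning rate `eta` and the outer-product form are standard conventions, not given on the slide):

```python
import numpy as np

def hebbian_update(W, x, y, eta=0.1):
    """One Hebbian step: strengthen w_ij when pre-synaptic x_j
    and post-synaptic y_i are active together."""
    return W + eta * np.outer(y, x)

# Example: two input nodes, one output neuron
W = np.zeros((1, 2))
x = np.array([1.0, 1.0])   # both pre-synaptic nodes ON
y = np.array([1.0])        # post-synaptic node ON
W = hebbian_update(W, x, y)
print(W)                   # both connections strengthened: [[0.1 0.1]]
```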

History of ANN

Early booming (1950s to early 1960s)

Rosenblatt (1958):
- Perceptron: a network of threshold nodes for pattern classification
- Perceptron learning rule
- Perceptron convergence theorem: everything that can be represented by a perceptron can be learned

Widrow and Hoff (1960, 1962):
- Learning rule based on gradient descent (with differentiable units)

Minsky's attempt to build a general-purpose machine with Pitts/McCulloch units

History of ANN

The setback (mid 1960s to late 1970s)

- Serious problems with the perceptron model (Minsky's book, 1969):
  - Single-layer perceptrons cannot represent (learn) simple functions such as XOR
  - Multilayer networks of nonlinear units may have greater power, but there was no learning rule for such nets
  - Scaling problem: connection weights may grow infinitely
  - The first two problems were overcome by later efforts in the 1980s, but the scaling problem persists
- Death of Rosenblatt (1971)
- The thriving of the von Neumann machine and symbolic AI
Renewed enthusiasm and flourishing (1980s to present)

New techniques:
- Backpropagation learning for multilayer feed-forward nets (with nonlinear, differentiable node functions)
- Thermodynamic models (Hopfield net, Boltzmann machine, etc.)
- Unsupervised learning

History of ANN

- Impressive applications (character recognition, speech recognition, text-to-speech transformation, process control, associative memory, etc.)
- Traditional approaches face difficult challenges
- Caution:
  - Don't underestimate the difficulties and limitations
  - The field poses more problems than solutions

Background and Motivation

ANN application areas:
- Science and medicine: modeling, prediction, diagnosis, pattern recognition
- Manufacturing: process modeling and analysis
- Marketing and sales: analysis, classification, customer targeting
- Finance: portfolio trading, investment support
- Banking and insurance: credit and policy approval
- Security: bomb, iceberg, and fraud detection
- Engineering: dynamic load shedding, pattern recognition

IMPORTANT BOOKS

- 1990: Artificial Neural Systems, Jacek M. Zurada
- 1992: Neural Networks and Fuzzy Systems, Bart Kosko
- 1994: Neural Networks: A Comprehensive Foundation, Simon Haykin

The biological inspiration

- The brain has been extensively studied by scientists, yet its vast complexity prevents all but a rudimentary understanding
- Even the behaviour of an individual neuron is extremely complex
- Some numbers:
  - The human brain contains about 10 billion nerve cells (neurons)
  - Each neuron is connected to the others through about 10,000 synapses

Features of the Brain

- Ten billion (10^10) neurons
- Neuron switching time: ~10^-3 seconds
- Face recognition: ~0.1 seconds
- On average, each neuron has several thousand connections
- Hundreds of operations per second
- High degree of parallel computation
- Distributed representations
- Neurons die off frequently (and are never replaced), yet the brain compensates through massive parallelism

Properties of the brain:
- It can learn and reorganize itself from experience
- It adapts to the environment
- It is robust and fault-tolerant

Brain and Machine

The Brain:
- Pattern recognition
- Association
- Complexity
- Noise tolerance

The Machine:
- Calculation
- Precision
- Logic

The contrast in architecture

- The von Neumann architecture uses a single processing unit:
  - Tens of millions of operations per second
  - Absolute arithmetic precision
- The brain uses many slow, unreliable processors acting in parallel

Neural Networks: Biological (Real) vs. Mathematical (Artificial)

How do you tell them apart? Squeeze them! Excite them!

Biological Neuron

[Figure: a biological neuron]

Neurons communicate with each other; we will see later how this works. This communication is the "neural network".

Action Potential

Refractory period: after carrying a pulse, the axon fiber is in a state of complete non-excitability for a certain time, called the refractory period.

Biological Neural Network

[Figure: interconnected biological neurons]

End Bulb Connection

[Figure: end bulb connection between neurons]

Biological neural activity

- Each neuron has a cell body (soma), an axon, and many dendrites
- A neuron can be in one of two states: firing and rest
- A neuron fires if the total incoming stimulus exceeds its threshold
- Synapse: the thin gap between the axon of one neuron and a dendrite of another
  - Site of signal exchange
  - Characterized by a synaptic strength/efficiency

Biological neuron

[Figure: neuron with synapse, nucleus, cell body (soma), axon, and dendrites labeled]

A neuron has:
- a branching input structure (the dendrites)
- a branching output structure (the axon: the transmission line)

- Information circulates from the dendrites to the axon via the cell body (soma)
- The axon connects to dendrites via synapses
- Information transmission happens at the synapses
- Synapses vary in strength
- Synapses may be excitatory or inhibitory

The Structure of Neurons

- A neuron only fires if its input signal exceeds a certain amount (the threshold) within a short time period
- Synapses vary in strength:
  - Good connections allow a large signal
  - Slight connections allow only a weak signal
- Synapses can be either excitatory or inhibitory

The Structure of Neurons

- Spikes travelling along the axon of the pre-synaptic neuron trigger the release of neurotransmitter substances at the synapse
- The neurotransmitters cause excitation or inhibition in the dendrite of the post-synaptic neuron
- The integration of the excitatory and inhibitory signals may produce spikes in the post-synaptic neuron
- The contribution of each signal depends on the strength of the synaptic connection

The Structure of Neurons

- A neuron has a cell body, a branching input structure (the dendrites), and a branching output structure (the axon)
- Axons connect to dendrites via synapses
- Electrochemical signals are propagated from the dendritic input, through the cell body, and down the axon to other neurons

Properties of Artificial Neural Nets (ANNs)

- Many simple neuron-like threshold switching units
- Many weighted interconnections among units
- Highly parallel, distributed processing
- Learning by tuning the connection weights

What is an (artificial) neural network?

- A set of nodes (units, neurons, processing elements)
  - Each node has inputs and an output
  - Each node performs a simple computation via its node function
- Weighted connections between nodes
  - The connectivity gives the structure/architecture of the net
  - What can be computed by a NN is primarily determined by the connections and their weights
- A very much simplified version of the networks of neurons in animal nervous systems

ANN                          Bio NN
------------------------------------------------------------
Nodes                        Cell bodies
  input                        signals from other neurons
  output                       firing frequency
  node function                firing mechanism
Connections                  Synapses
  connection strength          synaptic strength

- Highly parallel, simple local computation (at the neuron level) achieves global results as an emerging property of the interaction (at the network level)
- Pattern-directed (individual nodes have meaning only in the context of a pattern)
- Fault-tolerant / gracefully degrading
- Learning/adaptation plays an important role

Appropriate Problem Domains for Neural Network Learning

- Input is high-dimensional, discrete or real-valued (e.g., raw sensor input)
- Output is discrete or real-valued
- Output is a vector of values
- The form of the target function is unknown
- Humans do not need to interpret the results (black-box model)

Neural Networks Defined

Artificial neural networks are massively parallel adaptive networks of simple nonlinear computing elements, called neurons, which are intended to abstract and model some of the functionality of the human nervous system in an attempt to partially capture some of its computational strengths.

Eight Components of Neural Networks

1. Neurons. These can be of three types:
   - Input: receive external stimuli
   - Hidden: compute intermediate functions
   - Output: generate outputs from the network

2. Activation state vector. A vector of the activation levels x_i of the individual neurons in the network: X = (x_1, ..., x_n)^T ∈ R^n.

3. Activation/signal function. A function that generates the output signal of the neuron based on its activation.

4. Pattern of connectivity. Essentially determines the inter-neuron connection architecture, or the graph of the network. Connections, which model the inter-neuron synaptic efficacies, can be excitatory (+), inhibitory (−), or absent (0).

5. Activity aggregation rule. A way of aggregating the total activity at a neuron, usually computed as the inner product of the input vector and the neuron's fan-in weight vector.
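As a tiny illustration of this aggregation (the values here are made up):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])   # input vector (signals on the fan-in links)
w = np.array([0.8,  0.3, 0.1])   # fan-in weight vector of the neuron

activity = np.dot(w, x)          # inner product: 0.4 - 0.3 + 0.2
print(activity)                  # ≈ 0.3
```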

6. Activation rule. A function that determines the new activation level of a neuron on the basis of its current activation and its external inputs.

7. Learning rule. Provides a means of modifying connection strengths based on both external stimuli and network performance, with the aim of improving the latter.

8. Environment. The environments within which neural networks operate can be deterministic (noiseless) or stochastic (noisy).

Real and Artificial Neurons

[Figure: a real neuron alongside its artificial counterpart, which computes net and then f(net)]

net = Σ_{i=1..n} w_i x_i is the most popular aggregation rule, and f(net) is the node function.

ANN Neuron Models

- Each node has one or more inputs from other nodes, and one output to other nodes
- Input/output values can be:
  - binary {0, 1}
  - bipolar {-1, 1}
  - continuous

General neuron model:
- All inputs to a node arrive at the same time and remain activated until the output is produced
- Weights are associated with the links
- net = Σ_{i=1..n} w_i x_i (weighted input summation) is the most popular aggregation rule
- f(net) is the node function
One Neuron: the McCulloch-Pitts Neuron Model

In the McCulloch-Pitts model:
- spikes are interpreted as spike rates
- synaptic strengths are translated into synaptic weights
- excitation means a positive product between the incoming spike rate and the corresponding synaptic weight
- inhibition means a negative product between the incoming spike rate and the corresponding synaptic weight

McCulloch-Pitts Neuron Model

- Inputs: x_i, for i = 1, 2, ..., n, take the values 0 or 1 at instant k
- Output: o
- Weights: w_i = +1 or -1
- The firing rule (with threshold T) is defined as:

  o^(k+1) = 1 if Σ_{i=1..n} w_i x_i^(k) ≥ T, and 0 otherwise

[Figure: integrate-and-fire neuron; inputs x1, ..., xn with weights w1, ..., wn feed an "integrate and threshold" unit]
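A minimal sketch of this firing rule in Python (the function name and argument layout are illustrative; the tie-break at net = T follows the rule above):

```python
def mp_neuron(x, w, T):
    """McCulloch-Pitts unit: fire (output 1) iff the weighted
    sum of binary inputs reaches the threshold T."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= T else 0

print(mp_neuron(x=[1, 1], w=[1.0, 1.0], T=1.5))  # 1: both inputs on
print(mp_neuron(x=[1, 0], w=[1.0, 1.0], T=1.5))  # 0: below threshold
```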

Question: what can we do with a McCulloch-Pitts model?

Answer: surprisingly, we can model all logical expressions!

AND, OR, NOT

AND: weights w1 = w2 = 1.0 into an integrate-and-threshold unit with T = 1.5

X1  X2 | O(k+1)
0   0  | 0
0   1  | 0
1   0  | 0
1   1  | 1
OR: weights w1 = w2 = 1.0 into an integrate-and-threshold unit with T = 0.9

X1  X2 | O(k+1)
0   0  | 0
0   1  | 1
1   0  | 1
1   1  | 1

NOT: weight w1 = -1.0 into an integrate-and-threshold unit with T = -0.5

X1 | O(k+1)
0  | 1
1  | 0

NOR, NAND

NOR: an OR stage (weights w1 = w2 = 1.0, threshold T = 1) whose output feeds an inverting unit (weight -1.0, T = 0)

X1  X2 | O(k+1)
0   0  | 1
0   1  | 0
1   0  | 0
1   1  | 0

NAND: each input x_i first passes through an inverting unit (weight -1.0, T = 0); the inverted signals, with weights 1.0, then feed an OR unit (T = 1), since NOT(x1) OR NOT(x2) = NAND(x1, x2)

X1  X2 | O(k+1)
0   0  | 1
0   1  | 1
1   0  | 1
1   1  | 0
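A self-contained sketch verifying these gate constructions in Python (function names are illustrative; the weights and thresholds are the ones from the slides):

```python
def fire(x, w, T):
    """McCulloch-Pitts unit: 1 iff the weighted input sum reaches T."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= T else 0

def AND(x1, x2):  return fire([x1, x2], [1.0, 1.0], 1.5)
def OR(x1, x2):   return fire([x1, x2], [1.0, 1.0], 0.9)
def NOT(x):       return fire([x], [-1.0], -0.5)
def NOR(x1, x2):  return fire([OR(x1, x2)], [-1.0], 0.0)    # OR stage, then inverter
def NAND(x1, x2): return fire([NOT(x1), NOT(x2)], [1.0, 1.0], 1.0)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2), NOR(x1, x2), NAND(x1, x2))
```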

Example: Implementation of a Given Logic Expression

Given a logic function f(x) (specified in the slides as a figure), implement f(x) using McCulloch-Pitts neurons.

Solution: [Figure: a network of McCulloch-Pitts units with weights of -1/2 and -1 and thresholds 2.5, 3.5, and 1/2]

Artificial Neural Element (ANE)

[Figure: a node receiving an input vector, computing net, and emitting the output o]

Artificial Neural Element: Mathematical Model

net = w1·x1 + w2·x2 + ... + wn·xn + w_{n+1}
    = w^T x   (a linear operation on x)

where the weight vector is w = [w1 w2 ... wn]^T, the input vector is x = [x1 x2 ... xn]^T, and w_{n+1} is a bias term (absorbed into w^T x when x is augmented with a constant component).

y = o = f(net) = f(w^T x)   (a nonlinear activation function; a nonlinear operation on x)
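A minimal sketch of an artificial neural element under these definitions (the sigmoid choice for f is just one of the activation functions discussed later):

```python
import numpy as np

def ane(x, w, bias, f=lambda net: 1.0 / (1.0 + np.exp(-net))):
    """Artificial neural element: linear aggregation followed by
    a nonlinear activation f (here a unipolar sigmoid)."""
    net = np.dot(w, x) + bias      # net = w^T x + w_{n+1}
    return f(net)

x = np.array([1.0, 0.5])
w = np.array([0.4, -0.2])
print(ane(x, w, bias=0.1))         # f(0.4) ≈ 0.599
```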

Artificial Neural Element

[Figure: nodal representation and vector notation of the ANE]

Activation functions

An activation function transforms a neuron's input into its output. Desirable features:
- A squashing effect: prevents accelerating growth of activation levels through the network
- Simple and easy to calculate

Activation Function

A signal function may typically be:
- binary threshold
- linear threshold
- sigmoidal
- Gaussian
- probabilistic

Some Activation Functions

Logistic / sigmoidal (unipolar continuous) function:

  f(net) = 1 / (1 + e^(-λ·net))

where λ is a gain (scale) factor.

Hyperbolic Tangent Activation Function (bipolar continuous function)

Definition:

  f(net) = 2 / (1 + e^(-λ·net)) - 1

Equivalent form:

  f(net) = tanh(λ·net / 2)

Unipolar binary/discrete activation function:

  f(net) = 1 if net > 0, 0 if net < 0

Bipolar binary/discrete activation function:

  sgn(net) = +1 if net > 0, -1 if net < 0

Bipolar means both positive and negative responses of the neuron are produced.
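A minimal sketch of these four activation functions (λ is the gain factor; the tie-breaking convention at net = 0 is an assumption, since the slides leave it undefined):

```python
import numpy as np

def logistic(net, lam=1.0):        # unipolar continuous
    return 1.0 / (1.0 + np.exp(-lam * net))

def bipolar_sigmoid(net, lam=1.0): # bipolar continuous (tanh form)
    return 2.0 / (1.0 + np.exp(-lam * net)) - 1.0

def unipolar_binary(net):          # hard-limiting, outputs in {0, 1}
    return 1 if net > 0 else 0

def bipolar_binary(net):           # hard-limiting sgn, outputs in {-1, +1}
    return 1 if net > 0 else -1

print(logistic(0.0), bipolar_sigmoid(0.0))  # 0.5  0.0
```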

Soft limiting activation functions

The sigmoidal signal function f(net) has some very useful mathematical properties; it is:
- monotonic
- continuous
- bounded

Such functions are also called sigmoidal characteristics.

Hard limiting activation functions

The hard-limiting threshold function sgn(net) corresponds to the biological paradigm: the neuron either fires or does not.

[Figure: plot of the hard-limiting threshold function]

Gaussian Signal Function

σ_j is the Gaussian spread factor and c_j is the center. Varying the spread makes the function sharper or more diffuse; changing the center shifts the function to the right or left along the activation axis. This function is an example of a non-monotonic signal function.
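The slides give the formula only as a figure; a standard form consistent with this description (scaling conventions vary by text, so treat this as an assumption) is:

$$ s_j(x_j) = \exp\!\left( -\frac{(x_j - c_j)^2}{\sigma_j^2} \right) $$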

Stochastic Neurons

The signal is assumed to be two-state:

  s_j ∈ {0, 1} or {-1, 1}

The neuron switches between these states depending upon a probabilistic function of its activation, P(x_j).
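A minimal sketch of a stochastic neuron, assuming the common sigmoid choice for P(x_j) (the slides do not fix a particular form, and the temperature parameter T here is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_neuron(x, T=1.0):
    """Two-state neuron: outputs 1 with probability P(x) = sigmoid(x / T)."""
    p = 1.0 / (1.0 + np.exp(-x / T))
    return 1 if rng.random() < p else 0

print([stochastic_neuron(0.5) for _ in range(10)])  # mostly 1s for x > 0
```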

Summary of Signal Functions

[Figure: summary table of the signal functions discussed above]

TLU (Threshold Logic Unit)
