Lecture Notes SC

Outline

UNIT-1: INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS (ANN)

• Definition; why and how neural networks are used in solving problems
• Human biological neuron
• Artificial neuron
• Applications of ANN
• Comparison of ANN vs conventional AI methods

The idea of ANNs..? Neural networks to the rescue…

NNs learn relationships between cause and effect, or organize large volumes of data into orderly and informative patterns.

• Neural network: an information-processing paradigm inspired by biological nervous systems, such as our brain
• Structure: a large number of highly interconnected processing elements (neurons) working together
• Like people, they learn from experience (by example)

(Slide illustration: learning to name animals from examples: "It's a frog"; "What is that?"; frog, lion, bird.)

Definition of ANN

"Data processing system consisting of a large number of simple, highly interconnected processing elements (artificial neurons) in an architecture inspired by the structure of the cerebral cortex of the brain" (Tsoukalas & Uhrig, 1997).

Inspiration from Neurobiology

(Figure: the human biological neuron.)
Biological Neural Networks

A biological neuron has three main components: dendrites, the soma (or cell body), and the axon.
• Dendrites receive signals from other neurons.
• The soma sums the incoming signals. When sufficient input is received, the cell fires; that is, it transmits a signal over its axon to other cells.

Artificial Neurons

An ANN is an information processing system that has certain performance characteristics in common with biological nets.

Several key features of the processing elements of an ANN are suggested by the properties of biological neurons:
1. The processing element receives many signals.
2. Signals may be modified by a weight at the receiving synapse.
3. The processing element sums the weighted inputs.
4. Under appropriate circumstances (sufficient input), the neuron transmits a single output.
5. The output from a particular neuron may go to many other neurons.

Learning from experience:
• The network learns from examples / training data.
• The strength of a connection between neurons is stored as a weight value for that specific connection.
• Learning the solution to a problem = changing the connection weights.

(Figures: a physical neuron; an artificial neuron.)

Artificial Neuron

ANNs have been developed as generalizations of mathematical models of neural biology, based on the assumptions that:
1. Information processing occurs at many simple elements called neurons.
2. Signals are passed between neurons over connection links.
3. Each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted.
4. Each neuron applies an activation function to its net input to determine its output signal.

(Figures: the four basic components of a human biological neuron; the components of a basic artificial neuron.)
Model Of A Neuron

• A neural net consists of a large number of simple processing elements called neurons, units, cells, or nodes.
• Each neuron is connected to other neurons by means of directed communication links, each with an associated weight.
• The weights represent information being used by the net to solve a problem.

(Diagram: input units x1, x2, x3 with connection weights Wa, Wb, Wc feed a summing computation Σ and activation f(), producing output Y. Biological analogy: input units = dendrites, connection weights = synapses, summing computation = soma, output = axon.)
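The summing-and-activation model can be sketched in a few lines of Python. This is an illustrative sketch only; the inputs, the weight values standing in for Wa, Wb, Wc, and the threshold are assumptions, not values from the slides:

```python
# Minimal sketch of the neuron model above: weighted sum, then a binary
# step activation. Inputs, weights, and threshold are illustrative.

def neuron(inputs, weights, threshold=0.5):
    """Summing function (soma) followed by a step activation (axon output)."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= threshold else 0

# Three inputs x1..x3 with connection weights standing in for Wa, Wb, Wc
print(neuron([1, 0, 1], [0.4, 0.3, 0.2]))  # 0.4 + 0.2 = 0.6 >= 0.5 -> 1
```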

• Each neuron has an internal state, called its activation or activity level, which is a function of the inputs it has received. Typically, a neuron sends its activation as a signal to several other neurons.
• It is important to note that a neuron can send only one signal at a time, although that signal is broadcast to several other neurons.
• Neural networks are configured for a specific application, such as pattern recognition or data classification, through a learning process.
• In a biological system, learning involves adjustments to the synaptic connections between neurons; the same holds for artificial neural networks (ANNs).

Artificial Neural Network

(Diagram: inputs x1 and x2 arrive over dendrites with synaptic weights w1 and w2; the nucleus sums yin = x1w1 + x2w2, and the axon carries the output of the activation function: f(yin) = 1 if yin >= θ, and f(yin) = 0 otherwise.)

• A neuron receives input, determines the strength (weight) of each input, calculates the total weighted input, and compares that total with a threshold value θ.
• The values are in the range 0 to 1.
• If the total weighted input is greater than or equal to the threshold value, the neuron produces an output; if it is less than the threshold value, no output is produced.

History
• 1943 McCulloch-Pitts neurons
• 1949 Hebb's law
• 1958 Perceptron (Rosenblatt)
• 1960 Adaline, better learning rule (Widrow, Hoff)
• 1969 Limitations (Minsky, Papert)
• 1972 Kohonen nets, associative memory
• 1977 Brain State in a Box (Anderson)
• 1982 Hopfield net, constraint satisfaction
• 1985 ART (Carpenter, Grossberg)
• 1986 Backpropagation (Rumelhart, Hinton, McClelland)
• 1988 Neocognitron, character recognition (Fukushima)

Characterization
• Architecture – a pattern of connections between neurons
  • Single-layer feedforward
  • Multilayer feedforward
  • Recurrent
• Strategy / learning algorithm – a method of determining the connection weights
  • Supervised
  • Unsupervised
  • Reinforcement
• Activation function – a function to compute the output signal from the input signal

Single Layer Feedforward NN

(Diagram: input layer x1…xm connects directly to output layer y1…yn through weights w.)
Examples: ADALINE, AM, Hopfield, LVQ, Perceptron, SOFM

Multilayer Neural Network

(Diagram: input layer x1…xm feeds hidden layer z1…zn through weights V, which feeds output layer y1…ym through weights w.)
Examples: CCN, GRNN, MADALINE, MLFF with BP, Neocognitron, RBF, RCE
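A multilayer feedforward pass can be sketched as follows. The layer sizes, weight values, and the choice of a sigmoid activation are illustrative assumptions, not part of the slides:

```python
import math

# Sketch of one forward pass through a 2-input, 2-hidden, 1-output net.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    """weights[j][i] connects input i to unit j; each unit applies sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

x = [1.0, 0.5]                  # input layer (x1, x2)
V = [[0.2, -0.4], [0.7, 0.1]]   # input -> hidden weights (illustrative)
W = [[0.5, -0.3]]               # hidden -> output weights (illustrative)

hidden = layer(x, V)            # hidden layer activations z1, z2
output = layer(hidden, W)       # output layer activation y1
print(output)
```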

Recurrent NN

(Diagram: inputs feed hidden nodes whose outputs loop back into the network before reaching the outputs.)
Examples: ART, BAM, BSB, Boltzmann Machine, Cauchy Machine, Hopfield, RNN

Strategy / Learning Algorithm

Supervised Learning
• Learning is performed by presenting patterns together with targets.
• During learning, the produced output is compared with the desired output; the difference between the two outputs is used to modify the weights according to the learning algorithm.
• Applications: recognizing hand-written digits, pattern recognition, etc.
• Neural network models: perceptron, feed-forward, radial basis function, support vector machine.

Unsupervised Learning
• Targets are not provided.
• Appropriate for clustering tasks: finding similar groups of documents on the web, content-addressable memory, clustering.
• Neural network models: Kohonen self-organizing maps, Hopfield networks.

Reinforcement Learning
• A guidance signal is provided, but the desired output itself is absent.
• The net is only given guidance to determine whether the produced output is correct or not.
• Weights are modified in the units that have errors.

Activation Functions

• Identity: f(x) = x
• Binary step: f(x) = 1 if x >= θ; f(x) = 0 otherwise
• Binary sigmoid: f(x) = 1 / (1 + e^(-sx))
• Bipolar sigmoid: f(x) = -1 + 2 / (1 + e^(-sx))
• Hyperbolic tangent: f(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Exercise

2-input AND          2-input OR
x1 x2 | y            x1 x2 | y
 1  1 | 1             1  1 | 1
 1  0 | 0             1  0 | 1
 0  1 | 0             0  1 | 1
 0  0 | 0             0  0 | 0
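As a check on the exercise, a single binary-step neuron with suitably chosen weights reproduces both truth tables. The weights and thresholds below are one illustrative choice, not the only one:

```python
# A binary-step neuron solving the 2-input AND and OR exercises.

def binary_step(x, theta):
    return 1 if x >= theta else 0

def neuron(x1, x2, w1, w2, theta):
    return binary_step(x1 * w1 + x2 * w2, theta)

# AND: weights 0.5, 0.5 with threshold 1.0 -> fires only on (1, 1)
for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print((x1, x2), "AND ->", neuron(x1, x2, 0.5, 0.5, 1.0))

# OR: weights 0.5, 0.5 with threshold 0.5 -> fires unless both inputs are 0
for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print((x1, x2), "OR  ->", neuron(x1, x2, 0.5, 0.5, 0.5))
```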

(Worked example: inputs x1 and x2 with weights w1 = 0.5 and w2 = 0.3; yin = x1w1 + x2w2; activation is the binary step function with θ = 0.5, so f(yin) = 1 if yin >= θ and f(yin) = 0 otherwise.)

Where can neural network systems help…
• when we can't formulate an algorithmic solution.
• when we can get lots of examples of the behavior we require ("learning from experience").
• when we need to pick out the structure from existing data.
Who is interested?
• Electrical Engineers – signal processing, control theory
• Computer Engineers – robotics
• Computer Scientists – artificial intelligence, pattern recognition
• Mathematicians – modelling tool when explicit relationships are unknown

Problem Domains
• Storing and recalling patterns
• Classifying patterns
• Mapping inputs onto outputs
• Grouping similar patterns
• Finding solutions to constrained optimization problems

Classification and Clustering

(Figure: classification: a neural net maps input-layer features, e.g. coronary disease indicators or a STOP-sign pixel pattern, to output-layer class decisions. Figure: clustering: unsorted two-bit input patterns (00, 01, 10, 11) are grouped by the net into sorted sets of identical patterns.)

Applications of ANNs

• Signal processing
• Pattern recognition, e.g. handwritten characters or face identification
• Medical applications: diagnosis, or mapping symptoms to a medical case
• Information searching & retrieval
• Chemistry
• Speech recognition
• Human emotion detection
• Education: educational loan forecasting
• Business & management

Abdominal Pain Prediction

(Figure: a network with inputs Male, Age, Temp, WBC, Pain Intensity, Pain Duration, e.g. 1, 20, 37, 10, 1, 1, connected through adjustable weights to one output per diagnosis: Appendicitis, Diverticulitis, Perforated Duodenal Ulcer, Non-specific Pain, Cholecystitis, Small Bowel Obstruction, Pancreatitis.)

Voice Recognition

Educational Loan Forecasting System

Advantages Of NN

• NON-LINEARITY: it can model non-linear systems.
• INPUT-OUTPUT MAPPING: it can derive a relationship between a set of input and output responses.
• ADAPTIVITY: the ability to learn allows the network to adapt to changes in the surrounding environment.
• EVIDENTIAL RESPONSE: it can provide a confidence level with a given solution.
• CONTEXTUAL INFORMATION: knowledge is represented by the structure of the network. Every neuron in the network is potentially affected by the global activity of all other neurons, so contextual information is dealt with naturally.
• FAULT TOLERANCE: the distributed nature of the NN gives it fault-tolerant capabilities.
• NEUROBIOLOGY ANALOGY: it models the architecture of the brain.

Comparison of ANN with conventional AI methods
UNIT-2: ASSOCIATIVE MEMORY AND UNSUPERVISED LEARNING NETWORKS

Weight Vectors in Clustering Networks

(Diagram: input nodes 1–4 connect to output node k through weights wk1…wk4.)

• Node k represents a particular class of input vectors, and the weights into k encode a prototype/centroid of that class.
• So if prototype(class(k)) = [ik1, ik2, ik3, ik4], then wkm = fe(ikm) for m = 1..4, where fe is the encoding function.
• In some cases, the encoding function involves normalization; then wkm = fe(ik1…ik4).
• The weight vectors are learned during the unsupervised training phase.

Unsupervised Learning with Artificial Neural Networks

• The ANN is given a set of patterns, P, from a space, S, but little or no information about their classification, evaluation, interesting features, etc. It must learn these by itself!
• Tasks
  • Clustering: group patterns based on similarity (focus of this lecture)
  • Vector quantization: fully divide S into a small set of regions (defined by codebook vectors) that also helps cluster P
  • Probability density approximation: find a small set of points whose distribution matches that of P
  • Feature extraction: reduce the dimensionality of S by removing unimportant features

Network Types for Clustering

• Winner-take-all networks
  • Hamming networks
  • Maxnet
  • Simple competitive learning networks
• Topologically organized networks
  • Winner and its neighbors "take some"

Hamming Networks

Given: a set of m patterns, P, from an n-dimensional input space, S.
Create: a network with n input nodes and m simple linear output nodes (one per pattern), where the incoming weights to the output node for pattern p are based on the n features of p:
• ipj = jth input bit of the pth pattern, with ipj = 1 or -1.
• Set wpj = ipj / 2.
• Also include a threshold input of -n/2 at each output node.
Testing: enter an input pattern, I, and use the network to determine which member of P is closest to I. Closeness is based on the Hamming distance (the number of non-matching bits in the two patterns).

Given input I, the output value of the output node for pattern p is:

    Σ(k=1..n) wpk·Ik - n/2 = (1/2)·Σ(k=1..n) ipk·Ik - n/2

= the negative of the Hamming distance between I and p.

Hamming Networks (2)

Proof (that the output of output node p is the negative of the Hamming distance between p and the input vector I):
Assume k bits match. Then n - k bits do not match, and n - k is the Hamming distance.
Each of the k matches contributes (1)(1) or (-1)(-1) = 1 to the sum; each of the n - k mismatches contributes (-1)(1) or (1)(-1) = -1. So the output value of p's output node is:

    (1/2)·( Σ ipk·Ik - n ) = (1/2)·( (k - (n - k)) - n ) = k - n = -(n - k)

i.e. the negative of the Hamming distance. The pattern p* with the largest negative Hamming distance to I is thus the pattern with the smallest Hamming distance to I (i.e. the nearest to I). Hence the output node that represents p* will have the highest output value of all output nodes.
Hamming Network Example

P = {(1 1 1), (-1 -1 -1), (1 -1 1)} = 3 patterns of length 3.
(Network: inputs i1, i2, i3; outputs p1, p2, p3; input-to-output weights = ±1/2; threshold weight = -n/2 = -3/2.)

Given input pattern I = (-1 1 1):
Output(p1) = -1/2 + 1/2 + 1/2 - 3/2 = -1 (winner)
Output(p2) = 1/2 - 1/2 - 1/2 - 3/2 = -2
Output(p3) = -1/2 - 1/2 + 1/2 - 3/2 = -2

Simple Competitive Learning

• Combination of a Hamming-like net + Maxnet, with learning of the input-to-output weights.
• Inputs can be real-valued, not just 1 and -1, so the distance metric is actually Euclidean or Manhattan, not Hamming.
• Each output node represents a centroid for the input patterns it wins on.
• Learning: the winner node's incoming weights are updated to move closer to the input vector.
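The example above can be verified directly. A minimal sketch (`hamming_outputs` is a hypothetical helper name, not from the slides):

```python
# Sketch of the Hamming network example: weights are ipj/2 plus a bias of
# -n/2, so each output equals the negative Hamming distance to its pattern.

def hamming_outputs(patterns, I):
    n = len(I)
    return [sum(p_j / 2 * i_j for p_j, i_j in zip(p, I)) - n / 2
            for p in patterns]

P = [(1, 1, 1), (-1, -1, -1), (1, -1, 1)]
print(hamming_outputs(P, (-1, 1, 1)))  # [-1.0, -2.0, -2.0] -> p1 wins
```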

Winning & Learning

"Winning isn't everything… it's the ONLY thing" (Vince Lombardi)

• Only the incoming weights of the winner node are modified.
• Winner = the output node whose incoming weights have the shortest Euclidean distance to the input vector:

    sqrt( Σi (Ii - wki)^2 ) = Euclidean distance from input vector I to the vector represented by output node k's incoming weights

• Update formula: if j is the winning output node, then for every i:

    wji(new) = wji(old) + η·(Ii - wji(old))

Note: the use of real-valued inputs and Euclidean distance means that the simple product of weights and inputs does not correlate with "closeness" as in binary networks using Hamming distance.

SCL Examples (1)

6 cases: (0 1 1), (1 1 0.5), (0.2 0.2 0.2), (0.5 0.5 0.5), (0.4 0.6 0.5), (0 0 0)
Learning rate: 0.5
Initial randomly-generated weight vectors (hence there are 3 classes to be learned):
    [ 0.14 0.75 0.71 ]
    [ 0.99 0.51 0.37 ]
    [ 0.73 0.81 0.87 ]

Training on input vectors:
Input vector #1: [ 0.00 1.00 1.00 ]; winner: weight vector #1 [ 0.14 0.75 0.71 ], distance 0.41; updated weight vector: [ 0.07 0.87 0.85 ]
Input vector #2: [ 1.00 1.00 0.50 ]; winner: weight vector #3 [ 0.73 0.81 0.87 ], distance 0.50; updated weight vector: [ 0.87 0.90 0.69 ]
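One training step of this procedure can be sketched as follows (`scl_step` is a hypothetical helper name; learning rate 0.5 as in the example):

```python
import math

# Sketch of one simple-competitive-learning step: find the output node whose
# weight vector is nearest (Euclidean) to the input, then move it toward the
# input by eta times the difference.

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def scl_step(weights, x, eta=0.5):
    j = min(range(len(weights)), key=lambda k: distance(weights[k], x))
    weights[j] = [w + eta * (xi - w) for w, xi in zip(weights[j], x)]
    return j  # index of the winning node

W = [[0.14, 0.75, 0.71], [0.99, 0.51, 0.37], [0.73, 0.81, 0.87]]
winner = scl_step(W, [0.0, 1.0, 1.0])
print(winner, W[winner])  # winner is node 0, as in the slide's step #1
```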

SCL Examples (2)

Input vector #3: [ 0.20 0.20 0.20 ]; winner: weight vector #2 [ 0.99 0.51 0.37 ], distance 0.86; updated weight vector: [ 0.59 0.36 0.29 ]
Input vector #4: [ 0.50 0.50 0.50 ]; winner: weight vector #2 [ 0.59 0.36 0.29 ], distance 0.27; updated weight vector: [ 0.55 0.43 0.39 ]
Input vector #5: [ 0.40 0.60 0.50 ]; winner: weight vector #2 [ 0.55 0.43 0.39 ], distance 0.25; updated weight vector: [ 0.47 0.51 0.45 ]
Input vector #6: [ 0.00 0.00 0.00 ]; winner: weight vector #2 [ 0.47 0.51 0.45 ], distance 0.83; updated weight vector: [ 0.24 0.26 0.22 ]

Weight vectors after epoch 1:
    [ 0.07 0.87 0.85 ]
    [ 0.24 0.26 0.22 ]
    [ 0.87 0.90 0.69 ]

SCL Examples (3)

Clusters after epoch 1:
Weight vector #1 [ 0.07 0.87 0.85 ]: input vector #1 [ 0.00 1.00 1.00 ]
Weight vector #2 [ 0.24 0.26 0.22 ]: input vectors #3 [ 0.20 0.20 0.20 ], #4 [ 0.50 0.50 0.50 ], #5 [ 0.40 0.60 0.50 ], #6 [ 0.00 0.00 0.00 ]
Weight vector #3 [ 0.87 0.90 0.69 ]: input vector #2 [ 1.00 1.00 0.50 ]

Weight vectors after epoch 2:
    [ 0.03 0.94 0.93 ]
    [ 0.19 0.24 0.21 ]
    [ 0.93 0.95 0.59 ]

Clusters after epoch 2: unchanged.
Maxnet

A simple network to find the node with the largest initial input value.
Topology: a clique with self-arcs, where all self-arcs have a small positive (excitatory) weight θ and all other arcs have a small negative (inhibitory) weight -ε (e.g. θ = 1, ε ≤ 1/n).
Nodes: have transfer function fT = max(sum, 0).
Algorithm:
    Load initial values into the clique.
    Repeat: synchronously update all node values via fT,
    until all but one node has a value of 0.
    Winner = the non-zero node.

Maxnet Examples

Input values (1, 2, 5, 4, 3) with epsilon = 1/5 and theta = 1:
    0.000 0.000 3.000 1.800 0.600      e.g. 3.000 = (1)(5) - (0.2)(1+2+4+3)
    0.000 0.000 2.520 1.080 0.000
    0.000 0.000 2.304 0.576 0.000
    0.000 0.000 2.189 0.115 0.000
    0.000 0.000 2.166 0.000 0.000
    0.000 0.000 2.166 0.000 0.000      (stable attractor)

Input values (1, 2, 5, 4.5, 4.7) with epsilon = 1/5 and theta = 1:
    0.000 0.000 2.560 1.960 2.200
    0.000 0.000 1.728 1.008 1.296
    0.000 0.000 1.267 0.403 0.749
    0.000 0.000 1.037 0.000 0.415
    0.000 0.000 0.954 0.000 0.207
    0.000 0.000 0.912 0.000 0.017
    0.000 0.000 0.909 0.000 0.000
    0.000 0.000 0.909 0.000 0.000      (stable attractor)
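The iteration traces above can be reproduced with a short sketch (`maxnet` is a hypothetical helper name; epsilon = 0.2 and theta = 1 as in the first example):

```python
# Sketch of the Maxnet iteration: each node keeps theta times its own value
# and is inhibited by epsilon times the sum of all the other nodes; values
# are clipped at 0 and the update repeats until a fixed point.

def maxnet(values, epsilon=0.2, theta=1.0, steps=100):
    v = list(values)
    for _ in range(steps):
        total = sum(v)
        new = [max(theta * x - epsilon * (total - x), 0.0) for x in v]
        if new == v:          # stable attractor reached
            break
        v = new
    return v

print([round(x, 3) for x in maxnet([1, 2, 5, 4, 3])])  # [0.0, 0.0, 2.166, 0.0, 0.0]
```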

Associative-Memory Networks

Input: a pattern (often noisy/corrupted).
Output: the corresponding pattern (complete / relatively noise-free).
Process:
1. Load the input pattern onto a core group of highly-interconnected neurons.
2. Run the core neurons until they reach a steady state.
3. Read the output off the states of the core neurons.
Example: Input (1 0 1 -1 -1) → Output (1 -1 1 -1 -1).

Associative Network Types
1. Auto-associative: X = Y. Recognizes noisy versions of a pattern.
2. Hetero-associative bidirectional: X <> Y. BAM = Bidirectional Associative Memory. Iterative correction of input and output.

Hebb's Rule

Connection weights ~ correlations.

"When one cell repeatedly assists in firing another, the axon of the first cell develops synaptic knobs (or enlarges them if they already exist) in contact with the soma of the second cell." (Hebb, 1949)

In an associative neural net, if we compare two pattern components (e.g. pixels) within many patterns and find that they are frequently in:
a) the same state, then the arc weight between their NN nodes should be positive;
b) different states, then the arc weight between their NN nodes should be negative.

Quantifying Hebb's Rule

Compare two nodes to calculate a weight change that reflects the state correlation (i = input component, o = output component):

    Auto-association:   Δwjk ∝ ipk·ipj
    Hetero-association: Δwjk ∝ ipk·opj

When the two components are the same (different), increase (decrease) the weight.

Ideally, the weights will record the average correlations across all patterns:

    Auto:   wjk ∝ Σ(p=1..P) ipk·ipj
    Hetero: wjk ∝ Σ(p=1..P) ipk·opj

Hebbian principle: if all the input patterns are known prior to retrieval time, then initialize the weights as:

    Auto:   wjk = (1/P)·Σ(p=1..P) ipk·ipj
    Hetero: wjk = (1/P)·Σ(p=1..P) ipk·opj

Matrix memory: the weights must store the average correlations between all pattern components across all patterns. A net presented with a partial pattern can then use the correlations to recreate the entire pattern. Weights = average correlations.
Auto-Associative Memory

1. Auto-associative patterns to remember; 2. distributed storage of all patterns; 3. retrieval.
(Figure: 2x2 patterns on nodes 1–4. Node/value legend: dark blue with x => +1; dark red without x => -1; light green => 0.)
• 1 node per pattern unit
• Fully connected: clique
• Weights = average correlations across all patterns of the corresponding units

Hetero-Associative Memory

1. Hetero-associative patterns (pairs) to remember; 2. distributed storage of all patterns; 3. retrieval.
(Figure: X-layer nodes 1–3 paired with Y-layer nodes a, b.)
• 1 node per pattern unit for X & Y
• Full inter-layer connection
• Weights = average correlations across all patterns of the corresponding units
Hopfield Networks

• Auto-association network
• Fully-connected (clique) with symmetric weights
• State of a node = f(inputs)
• Weight values based on the Hebbian principle
• Performance: must iterate a bit to converge on a pattern, but generally much less computation than in back-propagation networks

Discrete node update rule:

    xpk(t+1) = sgn( Σ(j=1..n) wkj·xpj(t) + Ipk )

Hopfield Network Example

1. Patterns to remember: three 2x2 patterns p1, p2, p3 on nodes 1–4.
2. Hebbian weight initialization: average correlations across the 3 patterns:

          p1   p2   p3   Avg
    W12    1    1   -1    1/3
    W13    1   -1   -1   -1/3
    W14   -1    1    1    1/3
    W23    1   -1    1    1/3
    W24   -1    1   -1   -1/3
    W34   -1   -1   -1   -1

3. Build the network: a clique on nodes 1–4 with the average weights above.
4. Enter a test pattern (input values +1, 0, -1) and iterate until stable.
Storage Capacity of Hopfield Networks

Capacity = the relationship between the number of patterns that can be stored and retrieved without error and the size of the network:
Capacity = #patterns / #nodes, or #patterns / #weights.

• If we use the following definition of 100% correct retrieval: when any of the stored patterns is entered completely (no noise), that same pattern is returned by the network; i.e. the pattern is a stable attractor.
• A detailed proof shows that a Hopfield network of N nodes can achieve 100% correct retrieval on P patterns if P < N / (4·ln N):

    N        Max P
    10       1
    100      5
    1000     36
    10000    271
    10^11    10^9

In general, as more patterns are added to a network, the average correlations will be less likely to match the correlations in any particular pattern. Hence, the likelihood of retrieval error will increase.
=> The key to perfect recall is selective ignorance!!

Stochastic Hopfield Networks

The node state is stochastically determined by the sum of its inputs: the node fires with probability

    p = 1 / (1 + e^(-2·sumk))

For these networks, effective retrieval is obtained when P < 0.138·N, which is an improvement over standard Hopfield nets.

Boltzmann Machines

Similar to Hopfield nets but with hidden layers. State changes occur either:
a. deterministically, when ΔE ≥ 0, or
b. stochastically, with probability 1 / (1 + e^(-ΔE/T)),
where T is a decreasing temperature variable and ΔE is the expected change in energy if the change is made.
The non-determinism allows the system to "jiggle" out of local minima.
UNIT-3: FUZZY LOGIC

Overview
• What is fuzzy logic?
• Where did it begin?
• Fuzzy logic vs. neural networks
• Fuzzy logic in control systems
• Fuzzy logic in other fields
• Future

WHAT IS FUZZY LOGIC?

• Definition of fuzzy: "not clear, distinct, or precise; blurred".
• Definition of fuzzy logic: a form of knowledge representation suitable for notions that cannot be defined precisely, but which depend upon their contexts.

TRADITIONAL REPRESENTATION OF LOGIC

Speed is either slow (Speed = 0) or fast (Speed = 1):

bool speed;
// get the speed
if (speed == 0) {
    // speed is slow
}
else {
    // speed is fast
}

FUZZY LOGIC REPRESENTATION

• Every problem must be represented in terms of fuzzy sets.
• What are fuzzy sets?

Speed ranges: Slowest [0.00–0.25], Slow [0.25–0.50], Fast [0.50–0.75], Fastest [0.75–1.00]

float speed;
// get the speed
if ((speed >= 0.0) && (speed < 0.25)) {
    // speed is slowest
}
else if ((speed >= 0.25) && (speed < 0.5)) {
    // speed is slow
}
else if ((speed >= 0.5) && (speed < 0.75)) {
    // speed is fast
}
else {  // speed >= 0.75 && speed <= 1.0
    // speed is fastest
}
ORIGINS OF FUZZY LOGIC
• Traces back to Ancient Greece.
• Lotfi Asker Zadeh (1965): first to publish the ideas of fuzzy logic.
• Professor Toshiro Terano (1972): organized the world's first working group on fuzzy systems.
• F.L. Smidth & Co. (1980): first to market fuzzy expert systems.

FUZZY LOGIC VS. NEURAL NETWORKS
• How does a neural network work?
• Both fuzzy logic and neural networks model the human brain.
• Both are used to create behavioral systems.

FUZZY LOGIC IN CONTROL SYSTEMS
• Fuzzy logic provides a more efficient and resourceful way to build control systems.
• Some examples: temperature controller, anti-lock braking system (ABS).

TEMPERATURE CONTROLLER
• The problem: change the speed of a heater fan based on the room temperature and humidity.
• A temperature control system has four settings: Cold, Cool, Warm, and Hot.
• Humidity can be defined by: Low, Medium, and High.
• Using these we can define the fuzzy sets.

BENEFITS OF USING FUZZY LOGIC

FUZZY LOGIC IN OTHER FIELDS
• Business
• Hybrid modeling
• Expert systems
Fuzzy Logic Example: Automotive Speed Controller

3 inputs:
• speed (5 levels: Too Slow, Slow, Optimum, Fast, Too Fast)
• acceleration (3 levels: Decelerating, Constant, Accelerating)
• distance to destination (3 levels: Very Close, Close, Distant)

1 output:
• power (fuel flow to engine)

A set of rules determines the output based on the input values.

Example rules:
IF speed is TOO SLOW and acceleration is DECELERATING, THEN INCREASE POWER GREATLY
IF speed is SLOW and acceleration is DECREASING, THEN INCREASE POWER SLIGHTLY
IF distance is CLOSE, THEN DECREASE POWER SLIGHTLY
...

Note there would be a total of 95 different rules for all combinations of inputs taken 1, 2, or 3 at a time:
(5x3x3 + 5x3 + 5x3 + 3x3 + 5 + 3 + 3 = 95)
In practice, a system won't require all the rules.
The system is tweaked by adding or changing rules and by adjusting set boundaries.
System performance can be very good, but is not usually optimized by traditional metrics (e.g. minimizing RMS error).

Fuzzy Logic Summary

• Doesn't require an understanding of the process, but any knowledge will help formulate the rules.
• Complicated systems may require several iterations to find a set of rules resulting in a stable system.
• Combining neural networks with fuzzy logic reduces the time to establish rules by analyzing clusters of data.
• Possible applications: master production scheduling, material requirements planning, inventory capacity planning.
CONCLUSION
• Fuzzy logic provides an alternative way to represent linguistic and subjective attributes of the real world in computing.
• It can be applied to control systems and other applications in order to improve the efficiency and simplicity of the design process.

UNIT-4: FUZZY ARITHMETIC

Definition

• Fuzzy number: a convex and normal fuzzy set defined on R.
• Equivalently, it satisfies:
  • It is a normal fuzzy set on R.
  • Every alpha-cut must be a closed interval.
  • Its support must be bounded.
• Applications of fuzzy numbers: fuzzy control, decision making, optimization.

Arithmetic Operations on Intervals

For A = [a1, a3] and B = [b1, b3]:

Addition:       [a1, a3] (+) [b1, b3] = [a1 + b1, a3 + b3]
Subtraction:    [a1, a3] (-) [b1, b3] = [a1 - b3, a3 - b1]
Multiplication: [a1, a3] (·) [b1, b3] = [min(a1b1, a1b3, a3b1, a3b3), max(a1b1, a1b3, a3b1, a3b3)]
Division:       [a1, a3] (/) [b1, b3] = [min(a1/b1, a1/b3, a3/b1, a3/b3), max(a1/b1, a1/b3, a3/b1, a3/b3)], except when b1 ≤ 0 ≤ b3

Examples
• Addition: [2,5] + [1,3] = [3,8]; [0,1] + [-6,5] = [-6,6]
• Subtraction: [2,5] - [1,3] = [-1,4]; [0,1] - [-6,5] = [-5,7]
• Multiplication: [-1,1] · [-2,-0.5] = [-2,2]; [3,4] · [2,2] = [6,8]
• Division: [-1,1] / [-2,-0.5] = [-2,2]; [4,10] / [1,2] = [2,10]

Properties of Interval Operations
• Commutative: A + B = B + A; A · B = B · A
• Associative: (A + B) + C = A + (B + C); (A · B) · C = A · (B · C)
• Identity: 0 = [0,0] and 1 = [1,1]; A = A + 0 = 0 + A; A = A · 1 = 1 · A
• Subdistributive: A · (B + C) ⊆ A · B + A · C
• Inverse: 0 ⊆ A - A; 1 ⊆ A / A
• Monotonicity: for any operation *, if A ⊆ E and B ⊆ F then A * B ⊆ E * F
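The interval operations can be sketched directly (illustrative helper names; an interval is represented as a (lo, hi) tuple):

```python
# Interval arithmetic as defined above; the min/max over all four endpoint
# combinations handles multiplication and division with mixed signs.

def i_add(a, b): return (a[0] + b[0], a[1] + b[1])
def i_sub(a, b): return (a[0] - b[1], a[1] - b[0])

def i_mul(a, b):
    p = [a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]]
    return (min(p), max(p))

def i_div(a, b):
    assert not (b[0] <= 0 <= b[1]), "division is undefined when 0 is in B"
    q = [a[0]/b[0], a[0]/b[1], a[1]/b[0], a[1]/b[1]]
    return (min(q), max(q))

print(i_add((2, 5), (1, 3)))       # (3, 8)
print(i_sub((2, 5), (1, 3)))       # (-1, 4)
print(i_mul((-1, 1), (-2, -0.5)))  # (-2, 2)
print(i_div((4, 10), (1, 2)))      # (2.0, 10.0)
```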
Arithmetic Operations on Fuzzy Numbers

• Interval operations of alpha-level sets:

    α(A * B) = αA * αB for any α ∈ (0,1]   (when * = /, require 0 ∉ αB for all α ∈ (0,1])

    A * B = ∪(α ∈ (0,1]) α(A * B)

• Note: the result is a fuzzy number.
• Example: see text pp. 105 and Fig. 4.5.

Example

A + B = { 1/5, 0.8/6, 0.5/7 } (with A = { 1/2, 0.5/3 } and B = { 1/3, 0.8/4 }, as implied by the membership grades used below):
i)   z < 5: no such case, so μA(+)B(z) = 0.
ii)  z = 5: x + y = 2 + 3; min(μA(2), μB(3)) = min(1, 1) = 1.
iii) z = 6: x + y = 3 + 3 or x + y = 2 + 4; μA(+)B(6) = max( min(μA(3), μB(3)), min(μA(2), μB(4)) ) = max(0.5, 0.8) = 0.8.
iv)  z = 7: x + y = 3 + 4; min(μA(3), μB(4)) = min(0.5, 0.8) = 0.5.

Example

Max(A,B) = { (3, 1), (4, 0.5) }:
i)   z = 2: no such case, so μMAX(A,B)(z) = 0.
ii)  z = 3: max(x, y) = 3 for (x, y) = (2, 3) or (3, 3); μMAX(A,B)(3) = max( min(μA(2), μB(3)), min(μA(3), μB(3)) ) = max( min(1, 1), min(0.5, 1) ) = max(1, 0.5) = 1.
iii) z = 4: max(x, y) = 4 for (x, y) = (2, 4) or (3, 4); μMAX(A,B)(4) = max( min(μA(2), μB(4)), min(μA(3), μB(4)) ) = max( min(1, 0.5), min(0.5, 0.5) ) = max(0.5, 0.5) = 0.5.
iv)  z = 5: no such case, so μMAX(A,B)(z) = 0.

Typical Fuzzy Numbers

• Triangular fuzzy number: Fig. 4.5
• Trapezoidal fuzzy numbers: Fig. 4.4
• Linguistic variable: "Performance"
  • Linguistic values (terms): "very small" … "very large"
  • Semantic rule: terms map onto trapezoidal fuzzy numbers
  • Syntactic rules (grammar): rules for other terms such as "not small"

Lattice of Fuzzy Numbers

• Lattice: a partially ordered set with an ordering relation and meet (g.l.b.) and join (l.u.b.) operations. Example: the real numbers with "less than or equal to".
• Lattice of fuzzy numbers:

    MIN(A,B)(z) = sup(z = min(x,y)) min[A(x), B(y)] = MEET(A,B)
    MAX(A,B)(z) = sup(z = max(x,y)) min[A(x), B(y)] = JOIN(A,B)

Fuzzy Equations

Addition: A + X = B
• X = B - A is not a solution, because A + (B - A) is not B.
• Conditions for a solution to exist: for any α ∈ (0,1], let αA = [αa1, αa2], αB = [αb1, αb2] and αX = [αx1, αx2]. Then:
  (i)  αb1 - αa1 ≤ αb2 - αa2 for any α ∈ (0,1]
  (ii) α ≤ β implies αb1 - αa1 ≤ βb1 - βa1 ≤ βb2 - βa2 ≤ αb2 - αa2
• Solution: suppose αX = [αb1 - αa1, αb2 - αa2] is a solution of αA + αX = αB for any α ∈ (0,1]. Then X = ∪(α ∈ (0,1]) αX.

Multiplication: A · X = B
• X = B / A is not a solution.
• Conditions for a solution to exist: for any α ∈ (0,1], let αA = [αa1, αa2], αB = [αb1, αb2] and αX = [αx1, αx2]. Then:
  (i)  αb1 / αa1 ≤ αb2 / αa2 for any α ∈ (0,1]
  (ii) α ≤ β implies αb1 / αa1 ≤ βb1 / βa1 ≤ βb2 / βa2 ≤ αb2 / αa2
• Solution: suppose αX = [αb1 / αa1, αb2 / αa2] is a solution of αA · αX = αB for any α ∈ (0,1]. Then X = ∪(α ∈ (0,1]) αX.

Fuzzification

Fuzzification is the process of changing a real scalar value into a fuzzy value. This is achieved with different types of fuzzifiers (membership functions).

Fuzzification Example

Temp: {Freezing, Cool, Warm, Hot}, with overlapping membership functions over temperature (°F); the figure's axis is marked at 10, 30, 50, 70, 90, 110.

How cool is 36 °F? Reading the degree of truth ("membership") off the curves: it is 30% Cool and 70% Freezing, i.e. membership 0.3 in Cool and 0.7 in Freezing.
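A sketch of this fuzzification with simple shoulder/triangular membership functions. The breakpoints below are assumptions chosen to match the figure's axis ticks, so that 36 °F reads as 0.7 Freezing and 0.3 Cool:

```python
# Fuzzification sketch: evaluate a crisp temperature against two
# (assumed) membership functions for Freezing and Cool.

def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def freezing(x):  # left shoulder: fully true below 30, falls to 0 at 50
    return 1.0 if x <= 30 else max(0.0, (50 - x) / 20)

def cool(x):      # triangle rising from 30, peaking at 50, falling to 70
    return tri(x, 30, 50, 70)

print(freezing(36), cool(36))  # 0.7 0.3
```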

Membership Functions

Fuzzification Fuzzification
Membership Functions Membership Functions
The MATLAB toolbox includes 11 built-in membership function Two membership functions are built on the Gaussian distribution
types. These 11 functions are, in turn, built from several basic curve: a simple Gaussian curve and a two-sided composite of two
functions: different Gaussian curves. The two functions are gaussmf and
gauss2mf. The generalized bell membership has the function name
• piecewise linear functions
gbellmf.
• the Gaussian distribution function
• the sigmoid curve Because of their smoothness and concise notation, Gaussian and bell
membership functions are popular methods for specifying fuzzy sets.
• quadratic and cubic polynomial curves Both of these curves have the advantage of being smooth and nonzero
at all points.
Fuzzification
Membership Functions

Although the Gaussian and bell membership functions achieve smoothness, they are unable to specify asymmetric membership functions, which are important in certain applications. For this, the sigmoidal membership function is defined, which is either open to the left or to the right. Asymmetric and closed (i.e. not open to the left or right) membership functions can be synthesized using two sigmoidal functions, so in addition to the basic sigmf you also have the difference between two sigmoidal functions, dsigmf, and the product of two sigmoidal functions, psigmf.

Polynomial-based curves account for several of the membership functions in the toolbox. Three related membership functions are the Z, S, and Pi curves, all named for their shape. The function zmf is the asymmetrical polynomial curve open to the left, smf is the mirror-image function that opens to the right, and pimf is zero on both extremes with a rise in the middle.
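The sigmoid-based family can be sketched the same way: sigmf is a single logistic curve, and dsigmf/psigmf combine two of them to get closed, asymmetric shapes. Again this follows the usual textbook formulas, not the toolbox source:

```python
import math

def sigmf(x, a, c):
    """Sigmoidal membership: 1 / (1 + exp(-a * (x - c))); open left or right
    depending on the sign of a."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))

def dsigmf(x, a1, c1, a2, c2):
    """Difference of two sigmoids: a closed, possibly asymmetric bump."""
    return sigmf(x, a1, c1) - sigmf(x, a2, c2)

def psigmf(x, a1, c1, a2, c2):
    """Product of two sigmoids (typically one opening right, one left)."""
    return sigmf(x, a1, c1) * sigmf(x, a2, c2)
```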

Unit-5
GENETIC ALGORITHMS

Introduction
• After scientists became disillusioned with classical and neo-classical attempts at modeling intelligence, they looked in other directions.
• Two prominent fields arose: connectionism (neural networking, parallel processing) and evolutionary computing.
• It is the latter that this unit deals with: genetic algorithms and genetic programming.

What is GA

• A genetic algorithm (or GA) is a search technique used in computing to find true or approximate solutions to optimization and search problems.
• Genetic algorithms are categorized as global search heuristics.
• Genetic algorithms are a particular class of evolutionary algorithms that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
• Genetic algorithms are implemented as a computer simulation in which a population of abstract representations (called chromosomes, the genotype, or the genome) of candidate solutions (called individuals, creatures, or phenotypes) to an optimization problem evolves toward better solutions.
• Traditionally, solutions are represented in binary as strings of 0s and 1s, but other encodings are also possible.
What is GA

• The new population is then used in the next iteration of the algorithm.
• Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population.
• If the algorithm has terminated due to a maximum number of generations, a satisfactory solution may or may not have been reached.

Key terms

• Individual - Any possible solution
• Population - Group of all individuals
• Search Space - All possible solutions to the problem
• Chromosome - Blueprint for an individual
• Trait - Possible aspect (feature) of an individual
• Allele - Possible settings of a trait (black, blond, etc.)
• Locus - The position of a gene on the chromosome
• Genome - Collection of all chromosomes for an individual

Chromosome, Genes and Genomes

Genotype and Phenotype

• Genotype: the particular set of genes in a genome
• Phenotype: the physical characteristic of the genotype (smart, beautiful, healthy, etc.)


GA Requirements

• A typical genetic algorithm requires two things to be defined:
  – a genetic representation of the solution domain, and
  – a fitness function to evaluate the solution domain.
• A standard representation of the solution is as an array of bits. Arrays of other types and structures can be used in essentially the same way.
• The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size, which facilitates simple crossover operations.
• Variable-length representations may also be used, but crossover implementation is more complex in this case.
• Tree-like representations are explored in genetic programming.
Representation

Chromosomes could be:
– Bit strings (0101 ... 1100)
– Real numbers (43.2 -33.1 ... 0.0 89.2)
– Permutations of elements (E11 E3 E7 ... E1 E15)
– Lists of rules (R1 R2 R3 ... R22 R23)
– Program elements (genetic programming)
– ... any data structure ...

GA Requirements

• The fitness function is defined over the genetic representation and measures the quality of the represented solution.
• The fitness function is always problem dependent.
• For instance, in the knapsack problem we want to maximize the total value of objects that we can put in a knapsack of some fixed capacity.
• A representation of a solution might be an array of bits, where each bit represents a different object, and the value of the bit (0 or 1) represents whether or not the object is in the knapsack.
• Not every such representation is valid, as the size of objects may exceed the capacity of the knapsack.
• The fitness of the solution is the sum of values of all objects in the knapsack if the representation is valid, or 0 otherwise.
• In some problems it is hard or even impossible to define the fitness expression; in these cases, interactive genetic algorithms are used.
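The knapsack fitness just described (sum of packed values if the weight fits, 0 otherwise) is short enough to sketch directly; the item weights and values below are made-up illustration data:

```python
def knapsack_fitness(bits, weights, values, capacity):
    """Fitness of a bit-string knapsack solution: total value of the
    selected items if their weight fits the capacity, else 0 (invalid)."""
    weight = sum(w for b, w in zip(bits, weights) if b)
    value = sum(v for b, v in zip(bits, values) if b)
    return value if weight <= capacity else 0

weights = [3, 4, 5]          # illustrative item weights
values = [30, 50, 60]        # illustrative item values
print(knapsack_fitness([1, 1, 0], weights, values, capacity=8))  # 80
print(knapsack_fitness([1, 1, 1], weights, values, capacity=8))  # 0 (12 > 8)
```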

A fitness function

General Algorithm for GA
• Initialization
• Initially many individual solutions are randomly
generated to form an initial population. The
population size depends on the nature of the
problem, but typically contains several hundreds
or thousands of possible solutions.
• Traditionally, the population is generated
randomly, covering the entire range of possible
solutions (the search space).
• Occasionally, the solutions may be "seeded" in
areas where optimal solutions are likely to be
found.
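The random-initialization step above can be sketched in a few lines; population and chromosome sizes here are illustrative:

```python
import random

def init_population(pop_size, n_bits, rng=random):
    """Randomly generate pop_size bit-string individuals, covering the
    search space uniformly (no seeding)."""
    return [[rng.randint(0, 1) for _ in range(n_bits)]
            for _ in range(pop_size)]

population = init_population(pop_size=4, n_bits=5)
```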

General Algorithm for GA

• Selection
• During each successive generation, a proportion of the existing population is selected to breed a new generation.
• Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected.
• Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as this process may be very time-consuming.
• Most selection functions are stochastic and designed so that a small proportion of less fit solutions are selected. This helps keep the diversity of the population large, preventing premature convergence on poor solutions. Popular and well-studied selection methods include roulette wheel selection and tournament selection.
• In roulette wheel selection, individuals are given a probability of being selected that is directly proportionate to their fitness.
• Two individuals are then chosen randomly based on these probabilities and produce offspring.
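Roulette wheel selection is easy to implement by walking a cumulative fitness sum until a uniform random "pointer" is passed. A minimal sketch:

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)       # where the wheel pointer lands
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return individual
    return population[-1]              # guard against float round-off

# Two parents drawn from the wheel (B is the most likely, ~49% per draw):
parents = [roulette_select(["A", "B", "C"], [169, 576, 64]) for _ in range(2)]
```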
General Algorithm for GA

• These processes ultimately result in a next-generation population of chromosomes that is different from the initial generation.
• Generally the average fitness of the population will have increased by this procedure, since only the best organisms from the first generation are selected for breeding, along with a small proportion of less fit solutions, for the reasons already mentioned above.

Evolving Neural Networks

• Evolving the architecture of a neural network is slightly more complicated, and there have been several ways of doing it. For small nets, a simple matrix represents which neuron connects to which; this matrix is, in turn, converted into the necessary 'genes', and various combinations of these are evolved.
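The "matrix to genes" conversion for small nets can be sketched as flattening the connectivity matrix into a flat bit string (and back). This is one illustrative encoding, not the only one used in practice:

```python
def matrix_to_genes(conn):
    """Flatten a neuron connectivity matrix (entry [i][j] = 1 means
    'neuron i feeds neuron j') into a flat bit-string genome."""
    return [bit for row in conn for bit in row]

def genes_to_matrix(genes, n):
    """Rebuild the n x n connectivity matrix from the genome."""
    return [genes[i * n:(i + 1) * n] for i in range(n)]

conn = [[0, 1, 1],
        [0, 0, 1],
        [0, 0, 0]]
genes = matrix_to_genes(conn)              # [0, 1, 1, 0, 0, 1, 0, 0, 0]
assert genes_to_matrix(genes, 3) == conn   # round-trips losslessly
```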

Evolving Neural Networks

• Many would think that a learning function could be evolved via genetic programming. Unfortunately, genetic programming combined with neural networks could be incredibly slow, thus impractical.
• As with many problems, you have to constrain what you are attempting to create.
• For example, in 1990, David Chalmers attempted to evolve a function as good as the delta rule.
• He did this by creating a general equation based upon the delta rule with 8 unknowns, which the genetic algorithm then evolved.

Example

• f(x) = { MAX(x²) : 0 <= x <= 32 }
• Encode solution: just use 5 bits (1 or 0).
• Generate initial population:

  A  0 1 1 0 1
  B  1 1 0 0 0
  C  0 1 0 0 0
  D  1 0 0 1 1

• Evaluate each solution against the objective:

  Sol.   String   Fitness   % of Total
  A      01101    169       14.4
  B      11000    576       49.2
  C      01000    64        5.5
  D      10011    361       30.9
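The fitness column of the example can be checked mechanically: each 5-bit string decodes to an integer x and is scored with f(x) = x², and the "% of Total" column is each fitness as a share of the population total (1170):

```python
def fitness(bits):
    """Decode a 5-bit string as an integer x and score it with f(x) = x**2."""
    return int(bits, 2) ** 2

population = {"A": "01101", "B": "11000", "C": "01000", "D": "10011"}
scores = {name: fitness(bits) for name, bits in population.items()}
total = sum(scores.values())                                   # 1170
shares = {name: round(100 * s / total, 1) for name, s in scores.items()}
print(scores)   # {'A': 169, 'B': 576, 'C': 64, 'D': 361}
print(shares)   # {'A': 14.4, 'B': 49.2, 'C': 5.5, 'D': 30.9}
```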

Example Cont'd
• Create next generation of solutions
  – Probability of "being a parent" depends on the fitness.
• Ways for parents to create the next generation:
  – Reproduction: use a string again unmodified.
  – Crossover: cut and paste portions of one string onto another.
  – Mutation: randomly flip a bit.
  – A COMBINATION of all of the above.
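The crossover and mutation operators can be sketched on the 5-bit strings of the example; the cut point and mutation rate below are illustrative choices:

```python
import random

def crossover(p1, p2, point):
    """Single-point crossover: swap the tails of two strings after `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(bits, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return "".join(b if rng.random() > rate else "10"[int(b)] for b in bits)

# Crossing A (01101) and B (11000) after position 2:
child1, child2 = crossover("01101", "11000", point=2)
print(child1, child2)  # 01000 11101
```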
