Artificial Neural Network

The document discusses biological neurons and artificial neural networks. Biological neurons consist of dendrites, a cell body, an axon, and synapses. Artificial neural networks are modelled after biological neural networks and consist of processing elements, inputs and outputs, weights, summation functions, and activation functions. Neural network architectures can have input, hidden, and output layers and use learning algorithms to change weights and thresholds.

Biological Neuron

- The human nervous system is a complex neural network.

- The human brain is the central element in the nervous system.

- The brain consists of a large number of highly connected elements called neurons.

- The biological neuron consists of four main components (see Fig. 1):

• Dendrite
• Cell body
• Axon
• Synapse

- Dendrites receive signals from other neurons

- The axon of a neuron connects to the dendrite of another neuron by means of connectors called
synapses.

- Depending on the type of neuron, the number of synaptic connections from other neurons ranges from a few hundred to about 10^3

- Because of the electrical properties of the neuronal membrane, signals that reach the dendrites decay quickly in strength over distance and lose the ability to stimulate the neuron unless they are reinforced by other signals occurring at almost the same time and/or at nearby locations.

- The cell body (soma) sums the incoming signals from the dendrites

- When sufficient i/ps are received to stimulate the neuron to its threshold, the neuron generates an action potential, i.e., it fires, and transmits the action potential along its axon to other neurons or to target cells outside the nervous system, such as a muscle

- If the i/ps do not reach the threshold level, they quickly decay and no action potential is generated.
Fig. 1 A biological neuron (showing the dendrites, cell body, axon and synapses)


Artificial Neural Network
- Artificial Neural Network (ANN) is a model of biological neural network.
- It is a massively parallel distributed processor that stores experiential knowledge
- Knowledge is acquired through a learning process
- Neural networks are learning systems and hence are suitable for problems where a compact algorithmic
description does not exist or where only incomplete or noisy data are available.

- The only requirement is a series of examples of desired i/o relations

e.g., complex pattern recognition, speech recognition, natural language processing, and common-sense
reasoning.

Comparison between the brain and ANN


Element        Brain                           Artificial NN
Organisation   Network of neurons              Network of processing elements
Components     Dendrites, axons, synapses,     Inputs and outputs, weights,
               summer, threshold               summation function, bias,
                                               activation function
Processing     Analogue                        Digital
Architecture   10–100 billion neurons          1–10^6 processing elements
Hardware       Neuron                          Switching transistor
Switch speed   1 ms                            1 ns to 1 ms
Technology     Biological                      Silicon, optical

Comparison between the computer and a neural network


Neural Network                     Computer
Parallel information processing    Sequential
Asynchronous                       Clocked (synchronous)
Stochastic                         Deterministic
Distributed information            Stored in specific memory locations
Learns from examples               Algorithmic (single-step programming)
Netware                            Hardware

• Artificial Neuron Model

- An artificial neuron is also called a processing element, a node or a threshold logic unit

- Four basic components of the model (see Fig. 2):

• Synapses with their weights

- The input to the synapses is a vector signal x with individual vector components

- Each component of x is an i/p to a synapse and is connected to the neuron through a synaptic weight
w, i.e., each component of x is multiplied by its synaptic weight w

• Summing Device

- The summing device adds all the signals broadcast into the adder, i.e., each input is multiplied
by its associated synaptic weight and the products are then summed

- All the operations up to and including the o/p of the adder constitute a linear combiner

• Bias, bq

- The bias, b, is usually externally applied and is used to set the threshold of the neuron.

• Activation Function (Squashing Function)

- Activation function, f(.), serves to limit the amplitude of the neuron’s output (aq).

- It can be continuous valued, binary, or bipolar, or a linear function in certain cases.

- Some activation functions are as follows:

Name                          Input/Output Relation

Hard Limit                    a = 0,   y < 0
                              a = 1,   y ≥ 0

Symmetrical Hard Limit        a = -1,  y < 0
                              a = +1,  y ≥ 0

Linear                        a = y

Saturating Linear             a = 0,   y < 0
                              a = y,   0 ≤ y ≤ 1
                              a = 1,   y > 1

Symmetric Saturating Linear   a = -1,  y < -1
                              a = y,   -1 ≤ y ≤ 1
                              a = +1,  y > 1

Log-Sigmoid                   a = 1 / (1 + e^(-y))

Hyperbolic Tangent Sigmoid    a = (e^y - e^(-y)) / (e^y + e^(-y))

Positive Linear               a = 0,   y < 0
                              a = y,   y ≥ 0
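The relations above translate directly into small functions. A sketch in Python (the function names follow common usage, e.g. MATLAB-style hardlim/satlin naming, and are an assumption, not from the original):

```python
import math

def hardlim(y):
    # Hard limit: 0 for y < 0, 1 for y >= 0
    return 1 if y >= 0 else 0

def hardlims(y):
    # Symmetrical hard limit: -1 for y < 0, +1 for y >= 0
    return 1 if y >= 0 else -1

def satlin(y):
    # Saturating linear: clip y to the range [0, 1]
    return min(max(y, 0.0), 1.0)

def satlins(y):
    # Symmetric saturating linear: clip y to the range [-1, 1]
    return min(max(y, -1.0), 1.0)

def logsig(y):
    # Log-sigmoid: 1 / (1 + e^-y)
    return 1.0 / (1.0 + math.exp(-y))

def tansig(y):
    # Hyperbolic tangent sigmoid: (e^y - e^-y) / (e^y + e^-y)
    return math.tanh(y)

def poslin(y):
    # Positive linear: 0 for y < 0, y for y >= 0
    return max(y, 0.0)
```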

Fig. 2 Artificial neuron model: the vector i/p signal x1, …, xn enters through synapses with synaptic weights w1, …, wn; the summing junction in the cell body (soma), together with the threshold (or bias) b, produces yq, and the activation function f(.) produces the axon o/p aq

Neural Network Architecture


- The topology and the learning algorithm together describe the architecture of a neural network

- Learning algorithm is the rule that governs how changes are made to weights and thresholds (or bias) of
an ANN.

● Topology

- Topology = the types of network layers and their interconnections

(i) Network Layers

- Network layer is the organisation or grouping of neurons

- There are three possible layers in a neural network as illustrated in Fig. 3

• Input layer
• Hidden layer(s)
• Output layer

Fig. 3 Possible neural network layers: input layer, hidden layer(s) (zero, one or several), output layer

• Single Layer NN:

- A single layer NN has one i/p layer and one o/p layer but no hidden layer
- Fig. 4 shows a single layer neural network of S neurons
- Each unit receives a weighted i/p xj with weight wji
Fig. 4 A single-layer neural network: each of the R i/ps x1, …, xR is connected to every one of the S neurons through a weight (w1,1 to wS,R); neuron i has bias bi, net input yi and o/p ai
- Note that each of the R inputs is connected to each of the neurons and that the weight matrix now has S
rows.

- In order to simplify the complex network of arrows, Fig. 4 can be replaced by a simplified form shown in
Fig. 5 and the weight matrix is given as

w w  w 
 1,1 1,2 1, R 
w w  w 
w=
2,1 2,2 2, R 
     
 
 wS ,1 wS , 2  wS , R 
 

- Fig. 5 shows, in simplified form, a layer of S neurons; it comprises the weight matrix w (S×R), the summers, the bias vector b (S×1), the input x (R×1) and the output a (S×1)

- The output is computed as:

    a = f(y) = f(wx + b)

Fig. 5 A single-layer network in simplified form
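The computation a = f(wx + b) can be sketched with NumPy. The shapes follow Fig. 5 (w is S×R, x is R×1, b is S×1); the particular weight, bias and input values and the hardlim activation are illustrative assumptions, not from the text:

```python
import numpy as np

def layer_output(w, x, b, f):
    # y = wx + b, then apply the activation function f elementwise
    y = w @ x + b
    return f(y)

# Illustrative layer: S = 2 neurons, R = 3 inputs
w = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, 1.0]])      # S x R weight matrix
x = np.array([[2.0], [1.0], [-1.0]])  # R x 1 input vector
b = np.array([[0.5], [-3.0]])         # S x 1 bias vector

hardlim = lambda y: (y >= 0).astype(float)  # hard-limit activation
a = layer_output(w, x, b, hardlim)          # S x 1 output vector
```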

• Multi-Layer NN

- As shown in Fig. 6, a multi-layer NN has one i/p layer, one or more hidden layer(s), and one o/p layer.
- The layer of i/p units is connected to a layer of hidden units which in turn is connected to a layer of o/p
units.
- A hidden layer’s output is not externally directly accessible
- Multi-Layer NN with nonlinear activation function provides more computational capability than a single
layer system

Fig. 6 A two-layer network in simplified form: the input x (R×1) feeds a hidden layer of S1 neurons with weight matrix w1 (S1×R), bias b1 (S1×1) and output a1 = f1(y1) (S1×1), which in turn feeds an output layer of S2 neurons with weight matrix w2 (S2×S1), bias b2 (S2×1) and output a2 = f2(y2) (S2×1)
- The output of the hidden layer is computed as:

    a1 = f1(y1) = f1(w1 x + b1)

- While the output of the output layer is computed as:

    a2 = f2(y2) = f2(w2 a1 + b2) = f2(w2 f1(w1 x + b1) + b2)

- If there is a 3rd layer, its output will be computed as:

    a3 = f3(y3) = f3(w3 f2(w2 f1(w1 x + b1) + b2) + b3)

- The superscripts in the above expressions do not denote powers; they are just indications of the layer
number
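The composed expression above is just the single-layer rule applied repeatedly: each layer's output becomes the next layer's input. A minimal sketch (shapes and values are illustrative assumptions; logsig activations are used for both layers):

```python
import numpy as np

def logsig(y):
    # Log-sigmoid activation, applied elementwise
    return 1.0 / (1.0 + np.exp(-y))

def forward(x, layers):
    # layers is a list of (w, b, f) tuples; the output of each layer
    # is fed as input to the next, exactly as in
    # a2 = f2(w2 f1(w1 x + b1) + b2)
    a = x
    for w, b, f in layers:
        a = f(w @ a + b)
    return a

# Illustrative 2-layer network: R = 2 inputs, S1 = 3 hidden neurons, S2 = 1 output
w1 = np.array([[0.5, -1.0], [1.0, 1.0], [-0.5, 0.5]])  # S1 x R
b1 = np.zeros((3, 1))                                   # S1 x 1
w2 = np.array([[1.0, -1.0, 2.0]])                       # S2 x S1
b2 = np.array([[0.1]])                                  # S2 x 1

x = np.array([[1.0], [2.0]])
a2 = forward(x, [(w1, b1, logsig), (w2, b2, logsig)])
```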

Inter-Layer Connections

(i) Fully connected

- Each neuron on a previous layer is connected to every neuron on the next layer

(ii) Partially connected

- Neurons on a previous layer do not have to be connected to all neurons on the next layer

(iii) Feedforward

- Previous layer neurons have their outputs fed into inputs of next layer

(iv) Feedback

- Previous layer neurons have their outputs fed into the inputs of the next layer and vice versa (i.e., the
network has bidirectional connections)

(v) Hierarchical

- Neurons of a lower layer may only communicate with neurons on the next level of layer

(vi) Resonance

- The layers have bidirectional connections

- Messages are communicated across connections and repeated until certain conditions are reached.

McCulloch-Pitts Neuron

- The McCulloch-Pitts neuron laid the foundation for ANNs

(Figure: a McCulloch-Pitts neuron with excitatory i/ps x1, …, xn, each with synaptic weight w, and inhibitory i/ps xn+1, …, xn+m, each with synaptic weight -c; the neuron sums its i/ps into y and outputs a = 1 or 0.)

x1, …, xn → excitatory i/ps because their synaptic weights, w, are positive

xn+1, …, xn+m → inhibitory i/ps because their synaptic weights, -c, are negative

- The state of neuron y at discrete time k is determined by the states of the i/ps x1 to xn+m at time k-1

- Let the total signal received be y; then

    a = 1,  y ≥ b
    a = 0,  y < b

e.g. 1: Realise the AND logic function using a McCulloch-Pitts neuron

Network: w1 = w2 = 1, b = 2, so a = 1 iff y = x1 + x2 ≥ 2

AND logic function

x1  x2  a
1   1   1
1   0   0
0   1   0
0   0   0

e.g. 2: Realise the OR logic function using a McCulloch-Pitts neuron

Network: w1 = w2 = 2, b = 2, so a = 1 iff y = 2x1 + 2x2 ≥ 2

OR logic function

x1  x2  a
1   1   1
1   0   1
0   1   1
0   0   0
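Both gates follow the same threshold rule, a = 1 iff y ≥ b, with the weights and thresholds given above. A quick sketch that checks the two truth tables:

```python
def mcculloch_pitts(inputs, weights, b):
    # Output is 1 when the weighted sum reaches the threshold b
    y = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y >= b else 0

def AND(x1, x2):
    # w1 = w2 = 1, b = 2, as in the text
    return mcculloch_pitts([x1, x2], [1, 1], b=2)

def OR(x1, x2):
    # w1 = w2 = 2, b = 2, as in the text
    return mcculloch_pitts([x1, x2], [2, 2], b=2)

# Truth tables from the text: (x1, x2, a)
and_table = [(1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 0)]
or_table  = [(1, 1, 1), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
```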

e.g. 3: XOR logic function (Exclusive OR)

XOR logic function

x1  x2  a
1   1   0
1   0   1
0   1   1
0   0   0

- It requires a two-layer network consisting of 3 neurons

Network: hidden neurons x3 and x4 each have threshold b = 2, with net input y1 = 2x1 - x2 feeding x3 and y2 = -x1 + 2x2 feeding x4; the output neuron has threshold b = 2 and net input y3 = 2x3 + 2x4, giving o/p a
- Layer 1: x1, x2

- Layer 2: x3, x4

- Output: a

- Supposing x1 = x2 = 1, then in the middle layer of the network, x3 = x4 = 0 because y1 < b and y2 < b

i.e., y1 = (2)(x1) + (-1)(x2) = (2)(1) + (-1)(1) = 1

since y1 = 1 < b, then x3 = 0
similarly, y2 = 1 < b, then x4 = 0
∴ Since x3 = x4 = 0, y3 = 0 and the o/p a = 0

- Also, if x1 = x2 = 0, then the o/p a = 0

- But if x1 = 1 and x2 = 0, then x3 = 1 and x4 = 0, so y3 = 2 ≥ b and a = 1

- Similarly, for x1 = 0 and x2 = 1, the o/p a = 1
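The trace above can be reproduced mechanically: each unit applies the same McCulloch-Pitts firing rule with threshold b = 2. A sketch that checks the full XOR truth table:

```python
def threshold(y, b=2):
    # McCulloch-Pitts firing rule: 1 iff total input reaches threshold b
    return 1 if y >= b else 0

def xor_net(x1, x2):
    x3 = threshold(2 * x1 - 1 * x2)   # hidden unit: y1 = 2*x1 - x2
    x4 = threshold(-1 * x1 + 2 * x2)  # hidden unit: y2 = -x1 + 2*x2
    a = threshold(2 * x3 + 2 * x4)    # output unit: y3 = 2*x3 + 2*x4
    return a
```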

The Perceptron
- The name perceptron is now used as a synonym for single-layer, feed-forward networks.

- The diagram in Fig. 7 shows examples of perceptrons.

Fig. 7(a) A multineuron perceptron (R i/ps x1, …, xR, weights w1,1 to wS,R, biases b1, …, bS, o/ps a1, …, aS); Fig. 7(b) A single-neuron perceptron (R i/ps, one bias b, one o/p a)

- It is observable in Fig. 7(a) that a single weight only affects one of the outputs.

- This means that the study of perceptrons is made easier by considering only networks with a single output
(i.e., similar to the network shown in Fig. 7(b)).

What perceptrons can represent
- Perceptrons can represent the AND, OR and NOT logic functions.

- But it does not follow that a perceptron (a single-layer, feed-forward network) can represent any Boolean
function.

- To see why, consider the following truth tables of the AND and XOR functions

AND Logic          XOR Logic

x1  x2  a          x1  x2  a
1   1   1          1   1   0
1   0   0          1   0   1
0   1   0          0   1   1
0   0   0          0   0   0

- We can represent these two truth tables graphically, plotting each input pair (0,0), (0,1), (1,0), (1,1) on the (x1, x2) plane:

Fig. 8 (a) AND logic function and (b) XOR logic function

(where a filled circle represents an output of one and a hollow circle represents an output of zero).

- Looking at the AND graph in Fig. 8(a), it is observable that the ones can be separated from the zeros with a
single line. This is not possible with XOR in Fig. 8(b).

- Functions such as the AND are called “linearly separable”.

- Although only being able to learn linearly separable functions is a major disadvantage of the perceptron, it
is still worth studying, as it is relatively simple and can help provide a framework for other architectures.

- It should, however, be realised that perceptrons are not limited to two inputs (e.g. the AND function).
We can have n inputs, which gives us an n-dimensional problem.

Learning Linearly Separable Functions


- With a function such as AND (with only two inputs), we can easily decide what weights to use to give us
the required output from the neuron. But with more complex functions (i.e. those with more than two
inputs), it may not be so easy to decide on the correct weights.

- Therefore, we would like our neural network to “learn” so that it can come up with its own set of weights.

- We will consider this aspect of neural networks for a simple function (i.e. one with two inputs). We could
obviously scale up the problem to accommodate more complex problems, providing the problems are
linearly separable.

- The algorithm to do this follows, but first some terminology.

Epoch : An epoch is the presentation of the entire training set to the neural network. In the
case of the AND function an epoch consists of four sets of inputs being presented
to the network (i.e. [0,0], [0,1], [1,0], [1,1]).
Target Value, t : When we are training a network we not only present it with the input but also with
a value that we require the network to produce. For example, if we present the
network with [1,1] for the AND function the training value will be 1.
Error, e : The error value is the amount by which the value output by the network differs
from the training value. For example, if we required the network to output 0 and it
output a 1, then e = 0-1 = -1.
Bias, b : The bias value used to set the activation threshold value of the neuron
Output from Neuron, a : The output value from the neuron
xi : Inputs being presented to the neuron
wi : Weight from input neuron ( x i ) to the output neuron
LR : The learning rate. This dictates how quickly the network converges. It is set by a
matter of experimentation. It is typically 0.1.

The Perceptron Training Algorithm:


While epoch produces an error
    Present network with next inputs from epoch
    e = t – a
    If e ≠ 0 Then
        (Note: If the error is positive we need to increase a; if the error is negative we need to
        decrease a. Each input contributes wi·xi to the total input, so if xi is positive, an increase
        in wi will increase a; if xi is negative, an increase in wi will decrease a.)
        This can be achieved with the following:
            wi(new) = wi(old) + LR × e × xi
            bi(new) = bi(old) + e
        (Note: This is often called the delta learning rule.)
    End If
End While
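The algorithm above translates almost line-for-line into code. A sketch that learns the AND function (the zero initial weights, LR = 0.1 and the epoch limit are illustrative assumptions):

```python
def hardlim(y):
    # Hard-limit activation: 0 for y < 0, 1 for y >= 0
    return 1 if y >= 0 else 0

def train_perceptron(samples, lr=0.1, max_epochs=100):
    # samples: list of (inputs, target) pairs; one epoch presents them all
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(max_epochs):
        error_seen = False
        for x, t in samples:
            a = hardlim(sum(wi * xi for wi, xi in zip(w, x)) + b)
            e = t - a                       # error = target - output
            if e != 0:
                # delta rule: wi(new) = wi(old) + LR*e*xi, b(new) = b(old) + e
                w = [wi + lr * e * xi for wi, xi in zip(w, x)]
                b = b + e
                error_seen = True
        if not error_seen:                  # an epoch with no error: converged
            break
    return w, b

# The AND epoch from the text: [0,0], [0,1], [1,0], [1,1] with their targets
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_samples)
```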

- The perceptron learning rule is guaranteed to converge to a solution in a finite number of steps, so long as
a solution exists.

- A single-layer perceptron is able to learn only linearly separable functions (e.g., Fig. 8(a)).

- Number of network inputs = number of problem inputs (e.g., if features such as fruit shape, fruit skin
smoothness, and fruit number of seeds are used to classify fruits into lemon and apple, 3 inputs are
required)

- Number of neurons in output layer = number of problem outputs (e.g., to classify animals into cats and
dog classes, one neuron is required where output 1 could represent the cat class and output 0 could
represent the dog class)

- k neurons can categorise 2^k classes; for example, to categorise 2 classes, 1 neuron can be used and only
one decision boundary is required; similarly, to categorise 4 classes, 2 neurons can be used and only 2
decision boundaries are required, etc.

- Consider the graphs in Fig. 9, each of which has only 2 classes: it is impossible to draw a single line
(one decision boundary) to separate the classes into 2 regions, so these graphs represent linearly
inseparable problems

- For problems where the function is linearly inseparable (e.g., Fig. 9), multilayer perceptrons combined
with the backpropagation algorithm can be used for such classification.

Fig. 9 Linearly inseparable problems (a), (b) and (c)

Example
For the single-input neuron in the figure below, the input is 2.0, its weight is 2.3 and its
bias is -3.
(i.) What is the net input to the activation function?
(ii.) What is the neuron output?

Network: i/p x = 2.0, weight w = 2.3, bias b = -3; net input y; o/p a = f(y)
Solution
(i) The net input to the activation function is:

    y = wx + b = (2.3)(2.0) + (-3) = 1.6

(ii) The output cannot be computed because the activation function is not given

Example
What is the output of the neuron of the previous example if it has the following activation functions:
(i) Hardlimit
(ii) Linear
(iii) Log-sigmoid

Solution
(i) Hard limit activation function

    a = f(y) = 0 if y < 0
               1 if y ≥ 0

    a = f(1.6) = 1

(ii) Linear activation function

    a = f(y) = y = 1.6

(iii) Log-sigmoid activation function

    a = f(y) = 1 / (1 + e^(-y)) = 1 / (1 + e^(-1.6)) = 0.8320
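The three results can be checked numerically with a quick sketch:

```python
import math

w, x, b = 2.3, 2.0, -3.0
y = w * x + b                           # net input: (2.3)(2.0) + (-3) = 1.6

a_hardlim = 1 if y >= 0 else 0          # hard limit
a_linear = y                            # linear
a_logsig = 1.0 / (1.0 + math.exp(-y))   # log-sigmoid
```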

Example
Consider the 2-input single neuron shown below. The input vector is [-5 6]T and the weight vector is [3 2],
while the bias is 1.2. Calculate the neuron output for the following activation functions:

(i.) A symmetrical hard limit activation function


(ii.) A saturating linear activation function
(iii.) A hyperbolic tangent sigmoid (tansig) transfer function

Network: i/ps x1 = -5 (weight w1 = 3) and x2 = 6 (weight w2 = 2), bias b = 1.2; o/p a = f(y)
Solution

The net input to the activation function is computed as:

    y = wx + b = [3 2][-5 6]T + 1.2 = (3)(-5) + (2)(6) + 1.2 = -1.8

(i.) A symmetrical hard limit activation function

    a = f(y) = -1 if y < 0
               +1 if y ≥ 0

    a = f(-1.8) = -1

(ii.) A saturating linear activation function

    a = f(y) = 0 if y < 0
               y if 0 ≤ y ≤ 1
               1 if y > 1

    a = f(-1.8) = 0

(iii.) A hyperbolic tangent sigmoid transfer function

    a = f(y) = (e^y - e^(-y)) / (e^y + e^(-y))

    a = f(-1.8) = -0.9468
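As before, the three results can be verified numerically with a quick sketch:

```python
import math

w = [3.0, 2.0]
x = [-5.0, 6.0]
b = 1.2
y = sum(wi * xi for wi, xi in zip(w, x)) + b   # 3(-5) + 2(6) + 1.2 = -1.8

a_hardlims = 1 if y >= 0 else -1               # symmetrical hard limit
a_satlin = min(max(y, 0.0), 1.0)               # saturating linear
a_tansig = math.tanh(y)                        # hyperbolic tangent sigmoid
```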

Example

In a classification problem that employs a perceptron, x1, x2, x3, and x4 are input vectors while t1, t2, t3, and
t4 are the corresponding outputs; the values of the input-output pairs are provided below. Starting with a
random guess value of weight vector (w) given as [0 0], a bias value given as 0, and using a learning rate of
1.0 and employing the hardlimit activation function, determine
(i) the weight and bias values for the perceptron.
(ii) represent the classification problem graphically while showing the input vector, the decision boundary
and the weight vector.

       
  2    1    − 2   − 1 
 x1 =   , t1 = 0,  x 2 =   , t 2 = 1,  x3 =   , t 3 = 0,  x 4 =   , t 4 = 1
 2   −2   2   1 
               

Solution

From the given information, there are exactly two classes (i.e., class 0 and class 1), therefore only one
output neuron is required. It is a 2-dimensional problem with one decision boundary since there are 2
inputs to the system at any time. A sketch of the perceptron network with the first set of inputs and the
guess value of weight vector and bias is provided below.
Network: i/ps x1 = 2 and x2 = 2 (initial weights w = [0 0]), bias b = 0; o/p a = f(y)

We start by calculating the perceptron’s output a for the first input vector x1, using the initial weights and
bias:

    a = f(w(0) x1 + b(0)) = f([0 0][2 2]T + 0) = f(0) = 1   (using the hardlimit activation function)

The output a does not equal the target value t1, so we use the perceptron rule to find new weights and
bias based on the error:

    e = t1 - a = 0 - 1 = -1
    w(1) = w(0) + e x1T = [0 0] + (-1)[2 2] = [-2 -2]
    b(1) = b(0) + e = 0 + (-1) = -1

We now apply the second input vector x2, using the updated weights and bias:

    a = f(w(1) x2 + b(1)) = f([-2 -2][1 -2]T + (-1)) = f(1) = 1

This time the output a is equal to the target t2. Application of the perceptron rule will not result in any
changes:

    w(2) = w(1)
    b(2) = b(1)

We now apply the third input vector:

    a = f(w(2) x3 + b(2)) = f([-2 -2][-2 2]T + (-1)) = f(-1) = 0

The output in response to input vector x3 is equal to the target t3, so there will be no changes:

    w(3) = w(2)
    b(3) = b(2)

We now move on to the last input vector x4:

    a = f(w(3) x4 + b(3)) = f([-2 -2][-1 1]T + (-1)) = f(-1) = 0

This time the output a does not equal the appropriate target t4. The perceptron rule will result in a new set
of values for w and b:

    e = t4 - a = 1 - 0 = 1
    w(4) = w(3) + e x4T = [-2 -2] + (1)[-1 1] = [-3 -1]
    b(4) = b(3) + e = -1 + 1 = 0
We must now check the first vector x1 again. This time the output a is equal to the associated target t1:

    a = f(w(4) x1 + b(4)) = f([-3 -1][2 2]T + 0) = f(-8) = 0

Therefore there are no changes:

    w(5) = w(4)
    b(5) = b(4)

The second presentation of x2 results in an error and therefore a new set of weight and bias values:

    a = f(w(5) x2 + b(5)) = f([-3 -1][1 -2]T + 0) = f(-1) = 0
Here are those new values:

    e = t2 - a = 1 - 0 = 1
    w(6) = w(5) + e x2T = [-3 -1] + (1)[1 -2] = [-2 -3]
    b(6) = b(5) + e = 0 + 1 = 1

Cycling through each input vector once more results in no errors:

    a = f(w(6) x1 + b(6)) = f([-2 -3][2 2]T + 1) = f(-9) = 0 = t1
    a = f(w(6) x2 + b(6)) = f([-2 -3][1 -2]T + 1) = f(5) = 1 = t2
    a = f(w(6) x3 + b(6)) = f([-2 -3][-2 2]T + 1) = f(-1) = 0 = t3
    a = f(w(6) x4 + b(6)) = f([-2 -3][-1 1]T + 1) = f(0) = 1 = t4

Therefore the algorithm has converged. The final solution is:

w = [-2 -3] and b = 1
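The hand computation above can be reproduced in a few lines of Python using the same perceptron rule (LR = 1.0, hardlimit activation, initial w = [0 0] and b = 0, as given in the problem):

```python
def hardlim(y):
    # Hard-limit activation: 0 for y < 0, 1 for y >= 0
    return 1 if y >= 0 else 0

# Input-output pairs from the problem statement
samples = [([2, 2], 0), ([1, -2], 1), ([-2, 2], 0), ([-1, 1], 1)]

w, b = [0, 0], 0
converged = False
while not converged:
    converged = True
    for x, t in samples:
        a = hardlim(w[0] * x[0] + w[1] * x[1] + b)
        e = t - a
        if e != 0:
            # perceptron rule: w(new) = w(old) + e*xT, b(new) = b(old) + e
            w = [w[0] + e * x[0], w[1] + e * x[1]]
            b = b + e
            converged = False
```

Running this epoch loop reproduces the final solution w = [-2 -3] and b = 1.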

(ii) We can graph the training data and the decision boundary of the solution. The decision boundary is
given by:

    y = wx + b = w1,1 x1 + w1,2 x2 + b = -2x1 - 3x2 + 1 = 0

To find the x2 intercept of the decision boundary, set x1 = 0:

    x2 = -b / w1,2 = -1 / (-3) = 1/3

To find the x1 intercept of the decision boundary, set x2 = 0:

    x1 = -b / w1,1 = -1 / (-2) = 1/2
The resulting decision boundary is illustrated as follows:

(Figure: the decision boundary -2x1 - 3x2 + 1 = 0 with the four training points. The weight vector must be orthogonal to the decision boundary and point in the direction of the points to be classified as 1 (the dark points); the weight vector can have any length we like.)

Example
Consider the following graph representation of classification problems and determine

(i) which of (a), (b), and (c) can be learnt by a single layer perceptron.
(ii) the weight vectors and biases of problems that can be learnt by single layer perceptron and confirm
their correctness by testing them with the input vectors

(Figure: classification problems (a), (b) and (c), each plotting shaded and unshaded points on a grid.)
Solution
(i) Since there are only 2 classes (shaded and unshaded circles), only 1 decision boundary is required in
order to classify the problems with a single-neuron perceptron. Hence, only (a) and (c) can be learnt by a
single layer perceptron. This is because (a) and (c) are linearly separable: a single line can divide
the "1" outputs from the "0" outputs, as shown below.

(Figure: problems (a) and (c) redrawn with a decision boundary separating the two classes and the weight vector w drawn orthogonal to each boundary.)
(ii) The next step is to find the weights and biases. The weight vectors must be orthogonal to the decision
boundaries, and pointing in the direction of points to be classified as 1 (the dark points). The weight
vectors can have any length we like.

For (a):
The slope of the decision boundary line is computed as:

    m = (y2 - y1) / (x2 - x1) = (2 - 0) / (1 - 0) = 2

Since the weight vector must be orthogonal to the decision boundary, its line is normal to the decision
boundary line, with a slope of -1/m = -1/2

Choose any value for the x component of the weight vector and use this to compute the corresponding
value of y as follows:

Assuming x = -2 and choosing the weight vector line to pass through the origin, the y component
is computed as:

    y = mx + c = (-1/2)(-2) + 0 = 1

(This is the equation of a straight line, hence y and x in this context are different from the input and
output vectors of the neuron)

Hence the weight vector is [-2 1]

Now we find the bias value for the perceptron by picking a point on the decision boundary and
satisfying:

    y = wx + b = 0

Picking the point [0 0], we have:

    b = -wx = -[-2 1][0 0]T = 0

We can now check our solution against the original points. Here we test the network on each input vector:

• For input [-2 2]:  a = f([-2 1][-2 2]T + 0) = f(6) = 1  (the result matches the actual output)

• For input [-2 0]:  a = f([-2 1][-2 0]T + 0) = f(4) = 1  (the result matches the actual output)

• For input [-2 -2]: a = f([-2 1][-2 -2]T + 0) = f(2) = 1  (the result matches the actual output)

• For input [0 -2]:  a = f([-2 1][0 -2]T + 0) = f(-2) = 0  (the result matches the actual output)
The test continues until all the input points have been tested

For (c):
The slope of the decision boundary line is computed as:

    m = (y2 - y1) / (x2 - x1) = (2 - 1) / (-1 - (-2)) = 1

Since the weight vector must be orthogonal to the decision boundary, its line is normal to the decision
boundary line, with a slope of -1/m = -1

Choose any value for the x component of the weight vector and use this to compute the corresponding
value of y as follows:

Assuming x = 2 and choosing the weight vector line to pass through the origin, the y component
is computed as:

    y = mx + c = (-1)(2) + 0 = -2

Hence the weight vector is [2 -2]

Now we find the bias value for the perceptron by picking a point on the decision boundary and
satisfying:

    y = wx + b = 0

Picking the point [-1 2], we have:

    b = -wx = -[2 -2][-1 2]T = -((2)(-1) + (-2)(2)) = 6
 

We can now check our solution against the original points. Here we test the network on each input vector:

• For input [0 2]:  a = f([2 -2][0 2]T + 6) = f(2) = 1  (the result matches the actual output)

• For input [-2 2]: a = f([2 -2][-2 2]T + 6) = f(-2) = 0  (the result matches the actual output)

• For input [-2 0]: a = f([2 -2][-2 0]T + 6) = f(2) = 1  (the result matches the actual output)

The test continues until all the input points have been tested
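Both hand-derived solutions can be checked in one pass. A sketch testing the points listed in the text against w = [-2 1], b = 0 for problem (a) and w = [2 -2], b = 6 for problem (c):

```python
def hardlim(y):
    # Hard-limit activation: 0 for y < 0, 1 for y >= 0
    return 1 if y >= 0 else 0

def classify(w, b, x):
    # Single-neuron perceptron: a = f(wx + b)
    return hardlim(w[0] * x[0] + w[1] * x[1] + b)

# Points tested in the text for problem (a): (input, expected output)
tests_a = [([-2, 2], 1), ([-2, 0], 1), ([-2, -2], 1), ([0, -2], 0)]
results_a = [classify([-2, 1], 0, x) for x, _ in tests_a]

# Points tested in the text for problem (c): (input, expected output)
tests_c = [([0, 2], 1), ([-2, 2], 0), ([-2, 0], 1)]
results_c = [classify([2, -2], 6, x) for x, _ in tests_c]
```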
