
Neural Network

Unit – 2nd
McCulloch and Pitts Neural Network (MCP Model):
Architecture, solution of the AND and OR functions using the MCP model.
Hebb Model: Architecture, training and testing, Hebb network for the AND function.
Perceptron Network: Architecture, training, testing, single- and multi-output models, Perceptron for the AND function.
Linear function, applications of the linear model, linear separability, solution of the OR function using the linear separability model.
McCulloch and Pitts Neural Network:
The first computational model of a neuron was proposed by Warren McCulloch (neuroscientist) and Walter Pitts (logician) in 1943.
Architecture: The motivation behind the McCulloch-Pitts model is the biological neuron. A biological neuron receives input signals through its dendrites, processes them, and, if the signal is received positively, passes the output on to connected neurons through its axon and synapses. This basic working of a biological neuron is what the McCulloch-Pitts model interprets and mimics.
The McCulloch-Pitts model of a neuron is a fairly simple model consisting of some number (n) of binary inputs, each with an associated weight. An input is known as an 'inhibitory input' if its associated weight is negative, and as an 'excitatory input' if its associated weight is positive. As the inputs are binary, they can take either of the 2 values, 0 or 1.

Then we have a summation junction that aggregates all the weighted inputs and passes the result to the activation function. The activation function is a threshold function that outputs 1 if the sum of the weighted inputs is equal to or above the threshold value, and 0 otherwise.
Let's say we have n inputs = {X1, X2, X3, …, Xn}
and n corresponding weights = {W1, W2, W3, …, Wn}.
The summation of weighted inputs is X.W = X1.W1 + X2.W2 + X3.W3 + … + Xn.Wn

If X.W ≥ θ (the threshold value)
Output = 1
Else Output = 0
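The thresholding logic above is easy to express in code. The following Python sketch is illustrative (the function name mcp_neuron is our own, not part of the original model):

# McCulloch-Pitts neuron: outputs 1 when the weighted sum of the
# binary inputs reaches the threshold theta, and 0 otherwise.
def mcp_neuron(inputs, weights, theta):
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s >= theta else 0

# OR: unit weights, threshold 1 -> fires when at least one input is 1.
# AND: unit weights, threshold 2 -> fires only when both inputs are 1.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "OR:", mcp_neuron(x, [1, 1], theta=1),
          "AND:", mcp_neuron(x, [1, 1], theta=2))

Note that the weights and thresholds here are fixed by hand; as discussed below, the MCP model has no learning mechanism.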
Geometric Interpretation of the McCulloch-Pitts Model
OR Function
We know that the thresholding parameter for the OR function is 1, i.e. θ = 1. The possible combinations of inputs are (0,0), (0,1), (1,0), and (1,1). Considering the OR function's aggregation condition, x1 + x2 ≥ 1, let us plot the graph.

The graph shows that all inputs lying ON or ABOVE the line give an output of 1 (positive) when passed through the OR-function M-P neuron, while all inputs lying BELOW the line give an output of 0 (negative). Therefore, the McCulloch-Pitts model has produced a linear decision boundary that splits the inputs into two classes, positive and negative.
AND Function
Similar to the OR function, we can plot the graph for the AND function, whose aggregation condition is x1 + x2 ≥ 2.
In this case, the decision boundary equation is x1 + x2 = 2. Here, the only input point that lies ON or ABOVE the line, namely (1,1), outputs 1 when passed through the AND-function M-P neuron.
It fits! The decision boundary works!

Limitations of McCulloch-Pitts Neuron


 Inability to handle non-boolean inputs.
 The requirement to manually set thresholds.
 All inputs are treated equally; there is no mechanism for learning the weights.
 Cannot handle functions that are not linearly separable, such as XOR.
These limitations led to the development of more advanced
models, such as the perceptron proposed by Frank Rosenblatt
in 1958, which introduced learning mechanisms for weights
and thresholds.
Hebb Network:
The Hebb Model is based on Hebbian learning, proposed by
Donald Hebb in 1949. It introduces the idea that "neurons that
fire together, wire together," meaning that if two neurons are
activated simultaneously, the connection between them is
strengthened. This learning rule is the foundation for how
neural networks adapt based on experience. "When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
Architecture of the Hebb Model
In Hebbian networks, each neuron has:
 Input Units (x₁, x₂, ..., xₙ): Binary inputs (either 0 or 1, or
-1 and 1 in some cases).
 Weights (w₁, w₂, ..., wₙ): Each input is associated with a
weight. The weights are updated using the Hebbian
learning rule.
 Output (y): The final result from the neuron based on the
weighted sum of inputs.
The neuron performs the following operation:
 Weighted sum: S = Σ(wᵢ * xᵢ)
 Activation function: In most basic Hebbian models, the neuron fires if the sum is positive, often using a step activation function: f(S) = +1 if S > 0, and −1 (or 0) otherwise.
The weight update in the Hebb rule is given by:


wᵢ(new) = wᵢ(old) + (xᵢ * y)
w(new) = w(old) + Δw
where,
Δw = η⋅x⋅y
 w: weight between the input and the output
 x: input
 y: target output (desired result)
 η: learning rate (usually set to 1 in the simplest form of Hebbian learning)
STEP 1: Initialize the weights and bias to 0, i.e. w1 = 0, w2 = 0, …, wn = 0, b = 0.
STEP 2: Steps 3–5 have to be performed for each training input vector and target output pair s : t (s = training input vector, t = target output).
STEP 3: Input unit activations are set; in most cases the input layer uses an identity function (one of the types of activation function):
xᵢ = sᵢ for i = 1 to n
Identity function: it is a linear function, defined as f(x) = x for all x.
STEP 4: The output unit activation is set: y = t.
STEP 5: Weight and bias adjustments are performed:
1. wᵢ(new) = wᵢ(old) + (xᵢ * y)
2. b(new) = b(old) + y
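The training procedure above can be sketched in a few lines of Python (the function name hebb_train is ours, and we assume bipolar inputs and targets, matching the AND example that follows):

# One pass of Hebbian learning (eta = 1) over the training pairs.
def hebb_train(samples):
    n = len(samples[0][0])
    w = [0] * n                  # STEP 1: weights start at 0
    b = 0                        # ... and so does the bias
    for x, y in samples:         # STEP 2: visit each s:t pair
        for i in range(n):       # STEP 5: w_i(new) = w_i(old) + x_i * y
            w[i] += x[i] * y
        b += y                   # b(new) = b(old) + y
    return w, b

# Bipolar AND data: target is +1 only for the input (+1, +1).
and_samples = [([-1, -1], -1), ([-1, 1], -1), ([1, -1], -1), ([1, 1], 1)]
print(hebb_train(and_samples))   # -> ([2, 2], -2), as derived below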
Implementing the AND Gate:
For convenience in Hebb networks, the inputs and outputs are often expressed as −1 (for 0) and +1 (for 1). Thus, the AND function becomes:
x₁ = −1, x₂ = −1 → y = −1
x₁ = −1, x₂ = +1 → y = −1
x₁ = +1, x₂ = −1 → y = −1
x₁ = +1, x₂ = +1 → y = +1
Step 1: Initialize Weights
 Start with all weights (w₁, w₂) and bias initialized to 0:
o w₁ = 0, w₂ = 0, bias (b) = 0
Step 2: Learning Process (Training)
The training involves adjusting weights based on the input and
target output using the Hebbian learning rule.
For each training pattern, the weights are updated as follows:
1. Input (x₁ = −1, x₂ = −1, y = −1): Δw₁ = (−1)(−1) = 1, Δw₂ = (−1)(−1) = 1, Δb = −1 → w₁ = 1, w₂ = 1, b = −1.
2. Input (x₁ = −1, x₂ = +1, y = −1): Δw₁ = 1, Δw₂ = −1, Δb = −1 → w₁ = 2, w₂ = 0, b = −2.
3. Input (x₁ = +1, x₂ = −1, y = −1): Δw₁ = −1, Δw₂ = 1, Δb = −1 → w₁ = 1, w₂ = 1, b = −3.
4. Input (x₁ = +1, x₂ = +1, y = +1): Δw₁ = 1, Δw₂ = 1, Δb = 1 → w₁ = 2, w₂ = 2, b = −2.
Final Weights After Training
After the training phase, the final weights are:
 w₁ = 2, w₂ = 2, b = -2
Step 3: Testing the Hebb Network
Decision Boundary:
The net input of the trained network is y_in = w₁x₁ + w₂x₂ + b = 2x₁ + 2x₂ − 2.
Setting y_in = 0 gives the decision boundary: 2x₁ + 2x₂ − 2 = 0
2(x₁ + x₂) = 2
The final equation, x₂ = −x₁ + 1, is a line that separates the positive point (+1, +1) from the three negative points.
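A quick check of the trained network (a small sketch; the helper name hebb_output is ours):

# Apply the trained Hebb network (w1 = 2, w2 = 2, b = -2) with a
# bipolar step activation: +1 if the net input is positive, else -1.
def hebb_output(x1, x2):
    net = 2 * x1 + 2 * x2 - 2
    return 1 if net > 0 else -1

for x in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print(x, "->", hebb_output(*x))   # only (1, 1) gives +1, as AND requires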
Perceptron Network: The perceptron is one of the simplest artificial neural network architectures. It was introduced by Frank Rosenblatt in 1957. It is the simplest type of feedforward neural network, consisting of a single layer of input nodes that are fully connected to a layer of output nodes. It can learn only linearly separable patterns. It uses a slightly different type of artificial neuron known as a threshold logic unit (TLU).
Types of Perceptron
Single-Layer Perceptron:
 This is one of the simplest types of artificial neural networks (ANNs). A single-layer perceptron model consists of a feed-forward network and also includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to classify linearly separable objects with binary outcomes.
 A single-layer perceptron does not rely on prior recorded data; it begins with randomly allocated values for the weight parameters. It then sums up all the weighted inputs. If the total sum is more than a pre-determined threshold value, the model is activated and shows the output value as +1.
 If the output matches the desired value, the model's performance is considered satisfactory, and the weights are left unchanged. However, the single-layer model runs into difficulties when the input patterns are not linearly separable.
 Multilayer Perceptron: Like a single-layer perceptron model, a multi-layer perceptron model has the same basic structure but a greater number of hidden layers.
 The multi-layer perceptron model is typically trained with the backpropagation algorithm, which executes in two stages as follows:
 Forward Stage: Activations propagate from the input layer to the output layer in the forward stage.
 Backward Stage: In the backward stage, weight and bias values are modified as per the model's requirement. The error between the actual and desired output is propagated backward, starting at the output layer and ending at the input layer.
Advantages of Multi-Layer Perceptron:
o A multi-layered perceptron model can be used to solve
complex non-linear problems.
o It works well with both small and large input data.
o It helps us to obtain quick predictions after the training.
Disadvantages of Multi-Layer Perceptron:
o In Multi-layer perceptron, computations are difficult and
time-consuming.
o In a multi-layer perceptron, it is difficult to predict how much each independent variable affects the dependent variable.
o The model functioning depends on the quality of the
training.
Basic Components of Perceptron
A perceptron, the basic unit of a neural network, comprises
essential components that collaborate in information
processing.
 Input Features: The perceptron takes multiple input
features, each input feature represents a characteristic or
attribute of the input data.
 Weights: Each input feature is associated with a weight,
determining the significance of each input feature in
influencing the perceptron’s output. During training, these
weights are adjusted to learn the optimal values.
 Summation Function: The perceptron calculates the
weighted sum of its inputs using the summation function.
The summation function combines the inputs with their
respective weights to produce a weighted sum.
 Activation Function: The weighted sum is then passed through an activation function. The perceptron uses the Heaviside step function, which takes the weighted sum as input, compares it with the threshold, and produces an output of 0 or 1.
 Output: The final output of the perceptron is determined by the activation function's result. For example, in binary classification problems, the output might represent a predicted class (0 or 1).
 Bias: A bias term is often included in the perceptron model.
The bias allows the model to make adjustments that are
independent of the input. It is an additional parameter that
is learned during training.
 Learning Algorithm (Weight Update Rule): During training,
the perceptron learns by adjusting its weights and bias
based on a learning algorithm. A common approach is the
perceptron learning algorithm, which updates weights
based on the difference between the predicted output and
the true output.
How does the Perceptron work?
Step-1
In the first step, multiply all input values with their corresponding weight values and then add the products to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:
∑wi*xi = x1*w1 + x2*w2 + … + xn*wn
Add a special term called the bias 'b' to this weighted sum to improve the model's performance:
∑wi*xi + b
Step-2
In the second step, an activation function is applied to the above-mentioned weighted sum, which gives us an output either in binary form or as a continuous value, as follows:
Y = f(∑wi*xi + b)
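In code, the two steps reduce to a few lines (a minimal sketch; taking the step threshold at 0, with output 1 for a strictly positive sum, is our assumption):

# Perceptron forward pass: weighted sum plus bias, then step activation.
def perceptron_output(x, w, b):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b   # Step-1: weighted sum
    return 1 if s > 0 else 0                       # Step-2: step function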
Training the Perceptron
The goal of training is to adjust the weights and bias so that
the perceptron can correctly classify the given input patterns.
The training process uses supervised learning, where we have
known input-output pairs.
Steps in Perceptron Training:
1. Initialize the Weights and Bias:
o Start with small random values for the weights (w₁, w₂, ...,
wₙ) and bias (b).
o Alternatively, set weights and bias to zero.
2. For Each Training Sample:
o Compute the weighted sum: S=Σ(wi⋅xi)+b.
o Apply the step activation function to determine the
predicted output y.
3. Update the Weights and Bias:
If the predicted output is incorrect, adjust the weights and
bias using the following update rules:
wᵢ ← wᵢ + η⋅(y_target − y_predicted)⋅xᵢ
b ← b + η⋅(y_target − y_predicted)
Where:
o η is the learning rate (a small positive number, typically between 0 and 1).
o y_target is the correct output for the training example.
o y_predicted is the perceptron's predicted output.
4. Repeat Until Convergence: Continue adjusting the weights
and bias for all training samples until the perceptron
correctly classifies all training data, or for a fixed number of
iterations (epochs).
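The four training steps can be sketched as follows (illustrative Python; the function name train_perceptron and the strictly-positive step convention are our assumptions):

# Perceptron learning rule: update weights and bias only on mistakes.
def train_perceptron(samples, eta=1.0, max_epochs=100):
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0                    # Step 1: zero initialization
    for _ in range(max_epochs):              # Step 4: repeat over epochs
        errors = 0
        for x, t in samples:                 # Step 2: each training sample
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            y = 1 if s > 0 else 0            # step activation
            if y != t:                       # Step 3: update on error
                for i in range(n):
                    w[i] += eta * (t - y) * x[i]
                b += eta * (t - y)
                errors += 1
        if errors == 0:                      # converged: all samples correct
            break
    return w, b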
Testing the Perceptron
After training, the perceptron can be tested on unseen input
data to check its ability to generalize. The testing phase
involves:
1. Input the test data.
2. Compute the weighted sum S=Σ(wi⋅xi)+b.
3. Apply the activation function to get the output y.
If the training was successful, the perceptron should correctly
classify most of the test inputs.
Perceptron for AND Function: The inputs are binary (0 or 1), and the perceptron should learn to output 1 only when both inputs are 1.
Step-by-Step Perceptron Training for AND Function:
1. Initialize Weights and Bias:
o Start with w₁ = 0, w₂ = 0, and b=0.
2. Choose a Learning Rate (η):
o Set η=1
3. Training Iteration 1 (taking f(S) = 1 when S > 0 and 0 otherwise, with patterns presented in the order (0,0), (0,1), (1,0), (1,1)): the first three patterns give S = 0 and y = 0, matching their targets. For (x₁ = 1, x₂ = 1, t = 1), S = 0 gives y = 0 ≠ 1, so the update yields w₁ = 1, w₂ = 1, b = 1.
4. Training Iteration 2: the patterns (0,0) and (0,1) now fire incorrectly, driving the bias down, and the miss on (1,1) raises the weights again; the epoch ends with w₁ = 2, w₂ = 1, b = 0. A few further epochs repeat these corrections until no errors remain.
Final Weights and Bias:
 After training, the final weights and bias are w₁ = 2, w₂ = 1, and b = −2.
Testing the Perceptron for AND Function:
x₁ = 0, x₂ = 0: S = 2(0) + 1(0) − 2 = −2 → y = 0 (correct)
x₁ = 0, x₂ = 1: S = 1 − 2 = −1 → y = 0 (correct)
x₁ = 1, x₂ = 0: S = 2 − 2 = 0 → y = 0 (correct, since S is not strictly positive)
x₁ = 1, x₂ = 1: S = 2 + 1 − 2 = 1 → y = 1 (correct)
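Running the train_perceptron sketch from the previous section on the AND data reproduces this result under the stated conventions (zero initialization, η = 1, strictly-positive step, patterns in the order above):

and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data, eta=1.0)
print(w, b)                                # -> [2.0, 1.0] -2.0
for x, t in and_data:
    s = w[0] * x[0] + w[1] * x[1] + b
    print(x, "->", 1 if s > 0 else 0, "(target", t, ")")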
Characteristics of Perceptron
1. Perceptron is a machine learning algorithm for supervised
learning of binary classifiers.
2. In Perceptron, the weight coefficient is automatically
learned.
3. Initially, weights are multiplied with input features, and the
decision is made whether the neuron is fired or not.
4. The activation function applies a step rule to check whether the weighted sum is greater than zero.
5. The linear decision boundary is drawn, enabling the
distinction between the two linearly separable classes +1
and -1.
6. If the weighted sum of all input values is more than the threshold value, the output is 1; otherwise, the output is 0.
A perceptron model has limitations as follows:
o The output of a perceptron can only be a binary number (0
or 1) due to the hard limit transfer function.
o A perceptron can only be used to classify linearly separable sets of input vectors. If the input vectors are not linearly separable, the perceptron cannot classify them correctly.

Linear Function: A linear function in machine learning refers to a function that represents a straight-line relationship between the input variables (features) and the output. Mathematically, it is expressed as:
y = w1x1 + w2x2 + ... + wnxn + b
Where:
 x1,x2,...,xn are the input features.
 w1,w2,...,wn are the weights associated with each input.
 b is the bias term.
 y is the output.
In the context of classification tasks, linear models aim to find
a linear decision boundary that separates different classes.
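As a tiny illustration of evaluating such a function (the feature values, weights, and bias below are made up for the example):

# Evaluate a linear model y = w1*x1 + ... + wn*xn + b.
def linear(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# e.g. a two-feature model with weights (0.5, -0.2) and bias 1.0
print(linear([2.0, 3.0], [0.5, -0.2], 1.0))   # -> 1.4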
Application of Linear Models
Linear models are widely used in various applications like:
1. Binary Classification: In tasks like spam detection, a linear
model is used to classify emails as spam or not spam based
on certain features like word count or frequency of certain
phrases.
2. Regression: Linear models are applied to predict a
continuous output (e.g., house prices, stock values) based
on input variables.
3. Sentiment Analysis: A linear classifier can be used to
separate positive and negative sentiment in text based on
words or phrases.
4. Medical Diagnostics: Linear models can be used to
determine the likelihood of a disease based on various input
symptoms or test results.
Linear Separability
Linear separability is a concept that refers to whether a
dataset can be separated into different classes using a straight
line (in 2D), a plane (in 3D), or a hyperplane (in higher
dimensions).
 A problem is linearly separable if there exists a line (or
hyperplane) that perfectly divides the data points into their
respective classes.
In classification tasks:
 If data is linearly separable, algorithms like the perceptron
or linear SVM (Support Vector Machine) can easily classify
the data with 100% accuracy.
 If the data is not linearly separable, more advanced models
like non-linear classifiers (e.g., neural networks) are
required.
Linear Separability for OR Function
The OR function is a good example of a linearly separable problem. The OR function outputs 1 when at least one of its inputs is 1, and outputs 0 when both inputs are 0. The two classes (OR = 0 and OR = 1) are linearly separable, meaning you can draw a line that separates the output-0 point (0,0) from the output-1 points (1,0), (0,1), and (1,1).
Solution of OR Function Using Linear Separability
Step 1: Define Input and Output
x1 = 0, x2 = 0 → OR output y = 0
x1 = 0, x2 = 1 → y = 1
x1 = 1, x2 = 0 → y = 1
x1 = 1, x2 = 1 → y = 1
Step 2: Initialize Weights and Bias
We will use a simple linear model:
y = w1⋅x1 + w2⋅x2 + b
 Initialize the weights w1 = 1, w2 = 1, and the bias b = 0.
Step 3: Compute the Weighted Sum for OR Function
For each input pair (x1, x2), compute the weighted sum S and apply a step function for binary output:
S = w1⋅x1 + w2⋅x2 + b
 If S > 0, output y = 1 (OR output is 1).
 If S ≤ 0, output y = 0 (OR output is 0).
Step 4: Testing the OR Function Using the Linear Separability Model
x1 = 0, x2 = 0: S = 0 → y = 0 (correct)
x1 = 0, x2 = 1: S = 1 → y = 1 (correct)
x1 = 1, x2 = 0: S = 1 → y = 1 (correct)
x1 = 1, x2 = 1: S = 2 → y = 1 (correct)
Step 5: Visualize the Linear Decision Boundary
The decision boundary for the OR function is the line:
w1⋅x1 + w2⋅x2 + b = 0
Substituting w1 = 1, w2 = 1, and b = 0:
x1 + x2 = 0
This line passes through the point (0,0), where the OR output is 0; since the step function outputs 1 only when S is strictly greater than 0, the boundary point (0,0) is assigned output 0.
In this case, the points (1,0), (0,1), and (1,1) all lie strictly above the line, and the point (0,0) is on the line itself, making the dataset linearly separable.
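A few lines of Python make the check concrete (a sketch; the strictly-positive threshold matches Step 3 above):

# Fixed linear model for OR: w1 = 1, w2 = 1, b = 0; output 1 iff S > 0.
w1, w2, b = 1, 1, 0
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    S = w1 * x1 + w2 * x2 + b
    y = 1 if S > 0 else 0
    print((x1, x2), "S =", S, "-> y =", y)   # matches the OR truth table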
