Unit-4 TNM

This document covers the fundamentals of learning with neural networks, including neuron models, network architectures, and the backpropagation algorithm. It explains the structure and function of biological neurons, introduces the McCulloch-Pitts model as a simplified artificial neuron, and discusses various neural network architectures such as feedforward and recurrent networks. Additionally, it details the multilayer perceptron (MLP) and the error backpropagation algorithm used for training these networks.

UNIT-IV

Learning with Neural Networks


Syllabus:
Neural models, Network Architectures, Perceptrons, The Error correction
Delta rule, Multilayer perceptron (MLP) networks and the error back
propagation algorithm.

Neuron Models:
Biological Neuron:
The human brain contains on the order of 10^10 to 10^11 neurons (recent
estimates are close to 86 billion). Each nerve cell can interact directly
with up to 200,000 other neurons.
The brain is organized into different regions, each responsible for
different functions. The largest parts of the brain are the cerebral
hemispheres, which occupy most of the interior of the skull.
Neurons are connected to each other via their axons and dendrites.
Signals are sent through the axon of one neuron to the dendrites of
other neurons. Hence, dendrites may be represented as the inputs to the
neuron, and the axon as its output.
Each neuron has many inputs through its multiple dendrites, whereas it
has only one output through its single axon. The axon of each neuron
forms connections with the dendrites of many other neurons, with each
branch of the axon meeting exactly one dendrite of another cell at what
is called a synapse.
The main parts of a biological neuron, the fundamental unit of the nervous system, are:
1. Dendrites
• These are branch-like structures extending from the cell body (soma).
• They receive signals from other neurons and transmit them toward the soma.
• The signals are usually in the form of electrochemical impulses.

2. Cell Body (Soma) & Nucleus
• The soma contains the nucleus, which regulates the cell’s activities and maintains its functionality.
• It integrates incoming signals and determines whether to generate an output signal.
3. Axon
• The axon is a long, tube-like extension from the soma.
• It transmits electrical impulses away from the cell body to other neurons or muscles.
• The axon plays a key role in signal propagation.
4. Synaptic Terminals
• These are the endpoints of the axon.
• They contain neurotransmitters that help in transmitting signals to the next neuron.
5. Synapse & Synaptic Gap
• The synapse is the junction between the synaptic terminal of one neuron and the dendrite of another neuron.
• The synaptic gap (or cleft) is a tiny space (around 50-200 angstroms) between two neurons where neurotransmitters are released.
6. Signal Transmission Process
1. A neuron receives signals via dendrites.
2. The signals are processed in the cell body and, if strong enough, generate
an action potential.
3. The action potential travels down the axon.
4. At the synaptic terminals, neurotransmitters are released into the
synaptic gap.
5. These neurotransmitters bind to receptors on the next neuron’s dendrites,
continuing the transmission.
Key Takeaways
• Dendrites = Input zone
• Soma = Processing unit
• Axon = Transmission pathway
• Synapse = Communication point
This biological neuron structure serves as an inspiration for artificial neurons in
machine learning models, such as the McCulloch-Pitts neuron and perceptron.
McCulloch-Pitts (MP) Model

The McCulloch-Pitts model, introduced in 1943 by Warren McCulloch and Walter Pitts, is a
mathematical model of a biological neuron. It was the first attempt to model a neuron
using a simplified, logical framework.
The MP model is a binary threshold neuron model, meaning its output is either
0 or 1 depending on whether the weighted sum of inputs reaches (equals or exceeds)
a certain threshold.
Fig 6. Mathematical model of a neuron (perceptron)
The bias term may be absorbed in the input vector itself as shown in Fig. 6b.

In the literature, this model of an artificial neuron is also referred to as a perceptron (the name
was given by Rosenblatt in 1958).
The expressions for the neuron output $\hat{y}$ are referred to as the cell recall mechanism. They
describe how the output is reconstructed from the input signals and the values of the cell
parameters.

Components of the MP Model


1. Inputs (x1, x2, …, xn):
o Represent signals received from other neurons.
o Each input can be either 0 or 1 (binary inputs).
2. Weights (w1, w2, …, wn):
o Each input is associated with a weight.
o Weights are real numbers, usually positive, indicating the strength
of the connection.
3. Summation Unit (∑): Calculates the weighted sum of inputs:
\[ a = \sum_{i=1}^{n} w_i x_i \]
4. Threshold (θ): A predefined value that the weighted sum is
compared against.
5. Activation Function (Step Function):
o Produces an output based on the comparison of the weighted sum with the threshold θ.

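Mathematical Representation
\[ y = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i \ge \theta \\ 0, & \text{otherwise} \end{cases} \]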

Characteristics
• Deterministic Model: The output is strictly determined by the inputs and weights.
• Non-adaptive: Weights are fixed and not modified during operation.
• Feedforward: No feedback loops; information flows in one direction.
• Binary Inputs and Outputs: Limited to solving problems with binary data.

Examples of Logical Gates Using MP Model


AND Gate
• Inputs: x1, x2
• Weights: w1=1, w2=1
• Threshold: θ=2
• Output:
o Y=1 if both x1=1 and x2=1
o Otherwise, Y=0
OR Gate
• Inputs: x1, x2
• Weights: w1=1, w2=1
• Threshold: θ=1
• Output:
o Y=1 if either x1=1 or x2=1 or both.
o Otherwise, Y=0
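The two gates above can be checked with a few lines of Python. This is a minimal sketch, assuming the unit fires when the weighted sum equals or exceeds the threshold (which is what the AND settings w1=1, w2=1, θ=2 require). The function names `mp_neuron`, `and_gate`, and `or_gate` are illustrative and do not come from the notes.

```python
def mp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of the binary inputs reaches the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


def and_gate(x1, x2):
    # Weights and threshold taken from the AND example: w1 = w2 = 1, theta = 2.
    return mp_neuron([x1, x2], weights=[1, 1], threshold=2)


def or_gate(x1, x2):
    # Weights and threshold taken from the OR example: w1 = w2 = 1, theta = 1.
    return mp_neuron([x1, x2], weights=[1, 1], threshold=1)


# Print the full truth table for both gates.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", and_gate(x1, x2), "OR:", or_gate(x1, x2))
```

Only the threshold differs between the two gates; the weights and the summation are identical.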

Limitations (Recap)
1. No Learning: Weights are fixed.
2. Binary Inputs/Outputs Only: Cannot process continuous data.
3. Only Linearly Separable Problems: A single unit cannot solve the XOR problem.
4. No Feedback Mechanism: Limited to simple feedforward networks.
5. Simplistic Activation Function: Only uses step function.
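The XOR limitation can be illustrated with a small search. The sketch below tries every combination of two weights and a threshold on a coarse grid (an arbitrary choice made for this example) and reports whether any single threshold unit reproduces a given truth table. A grid search is only an illustration, not a proof; the general argument is that XOR would need θ > 0, w1 ≥ θ, w2 ≥ θ and w1 + w2 < θ at the same time, which is impossible. The helper name `find_mp_unit` is hypothetical.

```python
from itertools import product


def mp_neuron(inputs, weights, threshold):
    """Binary threshold unit: fires when the weighted sum reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def find_mp_unit(truth_table, grid):
    """Return the first (w1, w2, theta) on the grid that matches the table, or None."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(mp_neuron(x, [w1, w2], theta) == y for x, y in truth_table.items()):
            return (w1, w2, theta)
    return None


# Candidate values from -4.0 to 4.0 in steps of 0.5 (illustrative range).
GRID = [i / 2 for i in range(-8, 9)]

AND_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print("AND realizable:", find_mp_unit(AND_TABLE, GRID) is not None)
print("XOR realizable:", find_mp_unit(XOR_TABLE, GRID) is not None)
```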

Hard limiting activation functions


Hard limiting activation functions are functions that produce discrete (binary or
stepped) outputs rather than continuous values. These functions are generally
non-differentiable at certain points but are useful in scenarios where binary or
step-like decisions are required.
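As a minimal sketch, two common hard limiters are the unipolar step (outputs 0 or 1) and the bipolar sign function (outputs -1 or +1). The function names and the convention that an input exactly at the threshold maps to the upper value are assumptions made for this example.

```python
def unipolar_step(a, threshold=0.0):
    """Hard limiter with outputs in {0, 1}."""
    return 1 if a >= threshold else 0


def bipolar_sign(a):
    """Hard limiter with outputs in {-1, +1}."""
    return 1 if a >= 0 else -1


# Both functions jump at a single point and are flat everywhere else,
# so their derivative is zero wherever it exists.
for a in (-2.0, -0.1, 0.0, 0.1, 2.0):
    print(a, unipolar_step(a), bipolar_sign(a))
```

Because the derivative is zero almost everywhere, these functions give gradient-based training nothing to work with, which is why smooth activations are used when weights are learned by gradient descent.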
Soft limiting activation functions
Soft limiting activation functions are functions that smoothly limit the output
of a neuron to a certain range, typically between 0 and 1 or -1 and 1. They are
commonly used in neural networks to introduce non-linearity and ensure
outputs remain within a desired range.
Popular Soft Limiting Activation Functions
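Two soft limiters that appear later in this unit are the log-sigmoid, with outputs in (0, 1), and the tan-sigmoid (tanh), with outputs in (-1, 1). The sketch below shows both together with their derivatives, which are what gradient-based learning needs. The function names are illustrative, and the code is not hardened against overflow for very large negative inputs.

```python
import math


def log_sigmoid(a):
    """Logistic function: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def log_sigmoid_derivative(a):
    """Derivative of the logistic function: s(a) * (1 - s(a))."""
    s = log_sigmoid(a)
    return s * (1.0 - s)


def tan_sigmoid(a):
    """Hyperbolic tangent: squashes any real input into (-1, 1)."""
    return math.tanh(a)


def tan_sigmoid_derivative(a):
    """Derivative of tanh: 1 - tanh(a)^2."""
    return 1.0 - math.tanh(a) ** 2


for a in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(a, log_sigmoid(a), tan_sigmoid(a))
```

Both functions are smooth and have a nonzero derivative everywhere, unlike the hard limiters.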
Neural Network Architectures:
Artificial Neural Networks (ANNs) are inspired by the structure and functioning
of the biological brain, where a vast number of neurons are interconnected to
perform complex and intelligent tasks. In ANNs, neuron models are
interconnected to form various types of networks. These networks can be
broadly classified into three main types based on their structure and signal
flow:
1. Feedforward Neural Networks
2. Recurrent Neural Networks
3. Convolutional Neural Networks

The structure of a neural network comprises layers of interconnected nodes,
commonly referred to as neurons or units.
These neurons are organized into three main types of layers: an input layer, one
or more hidden layers, and an output layer. Let us understand each key element
of the neural network in detail:
1. Input layer: The input layer is responsible for receiving the initial data or
features that are fed into the neural network.
2. Hidden layers: Hidden layers are intermediate layers between the input
and output layers. They perform complex computations and
transformations on the input data. A neural network can have numerous
hidden layers, each consisting of numerous neurons or nodes.
3. Neurons (Nodes): Neurons are the basic computational units that compute
a weighted sum of their inputs, add a bias, and pass the result through
an activation function.
• Neurons in the hidden and output layers utilize activation functions
to introduce non-linearities into the network, allowing it to learn
complex patterns.
4. Weights and biases: Weights represent the strength of the connections
between neurons. Biases shift the point at which a neuron activates, so a
neuron can produce a nonzero output even when all inputs are zero.
5. Activation functions: Activation functions are functions applied to each
neuron's weighted sum. Nonlinear activations introduce non-linearities into
the neural network, enabling it to model complex relationships between
inputs and outputs. Common activation functions include sigmoid, tanh,
ReLU (Rectified Linear Unit), and softmax.
Feedforward Neural Networks (FNN):

A Feedforward Neural Network (FNN) is a type of artificial neural network where
connections between nodes do not form a cycle. Information moves in one direction, from the
input layer through hidden layers (if any) to the output layer.
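The one-directional flow can be sketched as a loop over layers, where each layer's output becomes the next layer's input. The layer sizes, weights, and the names `dense_layer` and `forward` below are made up for illustration; they are not taken from the notes.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def identity(a):
    return a


def dense_layer(inputs, weights, biases, activation):
    """One layer: each neuron computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Pass the signal through the layers in order; nothing is fed back."""
    signal = x
    for weights, biases, activation in layers:
        signal = dense_layer(signal, weights, biases, activation)
    return signal


# Illustrative network: 2 inputs -> 2 sigmoid hidden neurons -> 1 linear output.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], sigmoid),
    ([[1.0, -1.0]], [0.0], identity),
]

y = forward([1.0, 1.0], layers)
print(y)
```

Each row of a weight matrix holds the incoming weights of one neuron, so the number of rows sets the layer's width.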
Recurrent Neural Networks (RNNs):

A Recurrent Neural Network (RNN) is a type of neural network where connections form
directed cycles, allowing information to persist across different time steps. This makes RNNs
suitable for sequential data, such as time series, speech, and text processing.
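A minimal sketch of the recurrence, assuming a single hidden unit with a tanh activation: the new state depends on the current input and on the previous state, which is how information persists across time steps. The weight values and the names `rnn_step` and `run_rnn` are illustrative.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One time step: combine the current input with the previous hidden state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Feed a sequence through the cell and collect the hidden state at each step."""
    h = h0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)  # the feedback loop: h is reused
        states.append(h)
    return states


# An input pulse at the first step followed by zeros.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

With this input the later states stay nonzero even though the later inputs are zero, because the first input is carried forward through the recurrent weight `w_h`.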
Let’s explore each of these architectures in detail.
Multilayer Neural Networks
Neural networks typically consist of multiple layers of neurons, each performing
transformations on input data. Multilayer Perceptrons (MLPs) with nonlinear hidden units can,
given enough hidden neurons, approximate any continuous function on a bounded input region to
any desired accuracy, making them powerful tools for solving complex real-world problems like
image recognition, speech processing, and pattern classification.
Activation Functions in Multilayer Networks

Neural networks use nonlinear activation functions in the hidden layers to model complex
relationships. Some common activation functions include sigmoid, tanh, ReLU, and softmax.

Output Layer Activations

• Sigmoid (for binary classification)
• Softmax (for multi-class classification)
• Linear (for regression problems)
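Softmax is the least obvious of the three, so here is a minimal sketch of it. It turns a list of raw output scores into probabilities that are positive and sum to 1. Subtracting the maximum score before exponentiating is a common numerical-stability step and does not change the result; the function name and sample scores are illustrative.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability distribution over classes."""
    largest = max(scores)
    # Shifting by the largest score avoids overflow in exp() for big inputs.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs, sum(probs))
```

The class with the largest score always receives the largest probability, and equal scores receive equal probabilities.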
Popular MLP Algorithms and Variants

Multi-Layer Perceptron (MLP) models can be trained using different optimization algorithms
and structural variations. Here are some notable ones:
4.4 Multi-Layer Perceptron (MLP) Networks and The Error-Backpropagation Algorithm:
The gradient-descent (delta) learning rule used to train these networks:
• Works well for linear and nonlinear activation functions.
• Used in the backpropagation algorithm in MLP.
• Ensures gradient-based optimization for weight updates.
• Helps in reducing the Mean Squared Error (MSE).
To adjust the weights based on the error between the predicted output $\hat{y}$ and the
target output $y$, we use gradient descent.

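1. Define the Error Function: A common error function is the Mean Squared Error
(MSE). For a single training example, written with the conventional factor of 1/2:
\[ E = \frac{1}{2}\,(y - \hat{y})^2 \]

2. Compute the Gradient of the Error with respect to each weight $w_j$ (and the bias $w_0$):
\[ \frac{\partial E}{\partial w_j} = -(y - \hat{y})\,\frac{\partial \hat{y}}{\partial w_j} \]

3. Update the Weights, with learning rate $\eta$:
\[ w_j \leftarrow w_j - \eta\,\frac{\partial E}{\partial w_j} \]
4. Update the Bias:
\[ w_0 \leftarrow w_0 - \eta\,\frac{\partial E}{\partial w_0} \]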

Fig. Neural unit with any differentiable activation function


The problem is to find the expression for the learning rule for adapting weights using a training
set of pairs of input and output patterns; the learning is in stochastic gradient descent mode, as
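in the last section. We begin by defining the error function E(k) for training example i at iteration k:
\[ E(k) = \frac{1}{2}\left(y^{(i)} - \hat{y}(k)\right)^2 = \frac{1}{2}\,[e(k)]^2 \]
where $e(k) = y^{(i)} - \hat{y}(k)$ is the error, $\hat{y}(k) = \sigma(a(k))$ is the neuron output, and
$a(k) = \sum_{j=1}^{n} w_j(k)\,x_j^{(i)} + w_0(k)$ is the input to the activation function.
For each training example i, the weights $w_j$, j = 1, ..., n (and the bias $w_0$) are updated by
adding $\Delta w_j$ (and $\Delta w_0$).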
The E(k) is a nonlinear function of the weights now, and the gradient cannot be calculated
following the equations derived in the last section for a linear neuron. Fortunately, the
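calculation of the gradient is straightforward in the nonlinear case as well. By the chain rule,
\[ \frac{\partial E(k)}{\partial w_j(k)} = \frac{\partial E(k)}{\partial a(k)}\,\frac{\partial a(k)}{\partial w_j(k)} \]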
where the first term on the right-hand side is a measure of an error change due to the activation
value a(k) at the kth iteration, and the second term shows the influence of the weights on that
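particular activation value a(k). Applying the chain rule again, we get,
\[ \frac{\partial E(k)}{\partial w_j(k)} = \frac{\partial E(k)}{\partial \hat{y}(k)}\,\frac{\partial \hat{y}(k)}{\partial a(k)}\,\frac{\partial a(k)}{\partial w_j(k)} = -e(k)\,\sigma'(a(k))\,x_j^{(i)} \]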
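The learning rule can be written as,
\[ w_j(k+1) = w_j(k) + \eta\,e(k)\,\sigma'(a(k))\,x_j^{(i)}, \quad j = 1, \ldots, n \]
\[ w_0(k+1) = w_0(k) + \eta\,e(k)\,\sigma'(a(k)) \]
where $\eta$ is the learning rate and $\sigma'$ is the derivative of the activation function.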
This is the most general learning rule that is valid for a single neuron having any nonlinear and
differentiable activation function and whose input is formed as a product of the pattern and
weight vectors. It follows the LMS algorithm for a linear neuron presented in the last section,
which was an early powerful strategy for adapting weights using data pairs only.
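This rule is also known as the delta learning rule, with delta defined as
\[ \delta(k) = e(k)\,\sigma'(a(k)) \]
where $e(k) = y^{(i)} - \hat{y}(k)$ is the error and $\sigma'$ is the derivative of the activation function.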
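In terms of δ(k), the weight-update equations become
\[ w_j(k+1) = w_j(k) + \eta\,\delta(k)\,x_j^{(i)}, \quad j = 1, \ldots, n \]
\[ w_0(k+1) = w_0(k) + \eta\,\delta(k) \]
where $\eta$ is the learning rate.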

It should be carefully noted that the δ(k) in these equations is not the error but the error change
due to the input a(k) to the nonlinear activation function at the kth iteration:
\[ \delta(k) = -\frac{\partial E(k)}{\partial a(k)} \]
Thus, δ(k) will generally not be equal to the error e(k). We will use the term error signal for
δ(k), keeping in mind that, in fact, it represents the error change.
In the world of neural computing, the error signal δ(k) is of highest importance.
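Interestingly, for a linear activation function
\[ \hat{y}(k) = \sigma(a(k)) = a(k) \]
Therefore,
\[ \sigma'(a(k)) = 1 \]
and
\[ \delta(k) = e(k)\,\sigma'(a(k)) = e(k) \]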
That is, delta represents the error itself. Therefore, the delta rule for a linear neuron is the same as
the LMS learning rule presented in the previous section.
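The delta rule for a single neuron can be sketched as follows. The example assumes the error signal is the error multiplied by the derivative of the activation function at a(k), and that each weight moves by the learning rate times the error signal times its input. The learning rate 0.5, the 2000 epochs, the zero initial weights, and the OR training set are all illustrative choices, as are the function names.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_prime(a):
    s = sigmoid(a)
    return s * (1.0 - s)


def linear(a):
    return a


def linear_prime(a):
    return 1.0


def delta_rule_step(weights, bias, x, y, eta, act, act_prime):
    """One stochastic-gradient update for a single neuron."""
    a = sum(w * xi for w, xi in zip(weights, x)) + bias  # input to the activation
    e = y - act(a)                                       # error e(k)
    delta = e * act_prime(a)                             # error signal delta(k)
    new_weights = [w + eta * delta * xi for w, xi in zip(weights, x)]
    new_bias = bias + eta * delta
    return new_weights, new_bias, e, delta


def train(data, eta=0.5, epochs=2000, act=sigmoid, act_prime=sigmoid_prime):
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            weights, bias, _, _ = delta_rule_step(
                weights, bias, x, y, eta, act, act_prime)
    return weights, bias


OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(OR_DATA)
predictions = [
    1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0
    for x, _ in OR_DATA
]
print(weights, bias, predictions)

# With a linear activation the derivative is 1, so delta equals the error.
_, _, e_lin, delta_lin = delta_rule_step(
    [0.2, -0.1], 0.0, [1.0, 2.0], 1.0, 0.1, linear, linear_prime)
print(e_lin, delta_lin)
```

Passing `linear` and `linear_prime` reduces the same update to the LMS rule, since the derivative factor drops out.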
Fig: Scalar-output MLP network
The functions $\sigma_o(\cdot)$ / $\sigma_{oq}(\cdot)$ are the linear or log-sigmoid activation functions of the output layer, and
the functions $\sigma_{hl}(\cdot)$ are the activation functions of the hidden layer (log-sigmoid or tan-sigmoid).
The structure of the output layer is shown in the figure below.

Fig: Output layer of vector-output MLP network

Backpropagation is a learning algorithm used to train a neural network. It
involves propagating the error (difference between predicted and actual outputs)
backward through the network and adjusting the weights and biases iteratively to
minimize the loss function.
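A minimal sketch of backpropagation for a scalar-output MLP with one hidden layer is given below. It assumes log-sigmoid activations in both layers and a squared-error loss with a factor of 1/2. The 2-2-1 layout, the learning rate, the random seed, and the XOR training set are illustrative choices, and all function names are hypothetical. Whether training reaches a good XOR fit depends on the initial weights.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(params, x):
    """Return (hidden activations, network output) for input x."""
    W1, b1, W2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    y_hat = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, y_hat


def loss(params, x, y):
    """Squared error for one example, with the factor 1/2."""
    return 0.5 * (y - forward(params, x)[1]) ** 2


def backprop(params, x, y):
    """Gradients of the loss with respect to every weight and bias."""
    W1, b1, W2, b2 = params
    hidden, y_hat = forward(params, x)
    # Output error signal: error times the sigmoid derivative y_hat * (1 - y_hat).
    delta_o = (y - y_hat) * y_hat * (1.0 - y_hat)
    # Hidden error signals: output signal sent back through the outgoing weight.
    delta_h = [h * (1.0 - h) * w * delta_o for h, w in zip(hidden, W2)]
    # Gradient of E is minus (error signal times the input on that connection).
    gW2 = [-delta_o * h for h in hidden]
    gb2 = -delta_o
    gW1 = [[-d * xi for xi in x] for d in delta_h]
    gb1 = [-d for d in delta_h]
    return gW1, gb1, gW2, gb2


def sgd_step(params, grads, eta):
    """Move every parameter a small step against its gradient."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    W1 = [[w - eta * g for w, g in zip(rw, rg)] for rw, rg in zip(W1, gW1)]
    b1 = [b - eta * g for b, g in zip(b1, gb1)]
    W2 = [w - eta * g for w, g in zip(W2, gW2)]
    return W1, b1, W2, b2 - eta * gb2


rng = random.Random(0)  # fixed seed so repeated runs behave the same
params = ([[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)],
          [0.0, 0.0],
          [rng.uniform(-1, 1) for _ in range(2)],
          0.0)
XOR_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

for epoch in range(5000):
    for x, y in XOR_DATA:
        params = sgd_step(params, backprop(params, x, y), eta=0.5)

for x, y in XOR_DATA:
    print(x, y, forward(params, x)[1])
```

A useful way to check a hand-written `backprop` is to compare one of its gradients with a finite-difference estimate of the same derivative computed from `loss`.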
