NN-BNU2

The document discusses various types of neural networks, including Perceptron, Feed-forward Neural Networks, Radial Basis Functions, Recurrent Neural Networks, Convolutional Neural Networks, and Modular Neural Networks, highlighting their unique characteristics and applications. It also covers the workings of neural networks, focusing on activation functions, loss functions, backpropagation, and training algorithms such as gradient descent. Additionally, it explains the importance of error and regularization terms in the learning problem, detailing various error metrics used in neural networks.

Neural Networks
Lecture (2): ANN Architectures and Learning Schemes

Prof. Walaa Gabr
Professor of Electrical Engineering & Department Chair
Faculty of Engineering at Benha
Types of Neural Networks
Perceptron
The perceptron is one of the simplest types of neural networks. It consists of a single layer of neurons, known as perceptrons or artificial neurons. Each perceptron takes multiple inputs, applies weights to those inputs, sums them up, and applies an activation function to produce an output. The output is typically binary, indicating a class or decision.

Feed-forward Neural Network
This is the simplest form of ANN (artificial neural network): data travels in only one direction, from input to output. This is the example we just looked at. Using a trained network is fast; training it takes a while. Almost all vision and speech recognition applications use some form of this type of neural network.

Radial Basis Function Neural Network
This model classifies a data point based on its distance from a center point. If you don't have labeled training data, for example, you'll want to group things and create a center point. The network looks for data points that are similar to each other and groups them. One application of this is power restoration systems.

Kohonen Self-organizing Neural Network
Vectors of random input are presented to a discrete map composed of neurons. Vectors are also called dimensions or planes. Applications include recognizing patterns in data.

Recurrent Neural Network
RNNs are designed to process sequential data, where the order and context of data points matter. They introduce feedback connections, allowing information to flow in cycles or loops within the network. RNNs have a memory component that enables them to retain and utilize information from previous steps in the sequence. They are widely used for tasks such as natural language processing, speech recognition, and time series analysis.

Convolutional Neural Network
CNNs are primarily used for analyzing visual data, such as images or video, but they can also be applied to other grid-like data. They employ specialized layers, such as convolutional layers and pooling layers, to efficiently process spatially structured data. A convolutional layer applies a set of learnable filters to the input, performing convolutions to detect local patterns and spatial relationships. The pooling layers, on the other hand, reduce the dimensionality of the feature maps.

Modular Neural Network
A modular neural network, also known as a modular neural architecture, is a type of neural network structure composed of distinct and relatively independent modules. Each module is responsible for handling a specific subtask or aspect of the overall problem. The idea behind modular neural networks is to break down a complicated problem into simpler sub-problems and have specialized modules tackle each one. In a modular neural network, these modules often work in parallel or in a hierarchical manner, where the outputs of one module feed into another. This allows for greater modularity, flexibility, and easier debugging.

How do they work?


Mathematically speaking, we can represent a neuron as computing $y = f\left(\sum_{i} w_i x_i + b\right)$, where the $x_i$ are the inputs, the $w_i$ are the weights, $b$ is the bias, and $f$ is the activation function.
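As a minimal illustration (added to these notes; the function names and example values are hypothetical), this computation in Python looks like:

```python
import numpy as np

def neuron(x, w, b, activation):
    """Output of a single neuron: activation(w . x + b)."""
    return activation(np.dot(w, x) + b)

def step(z, threshold=0.5):
    """A simple threshold activation: fire (1) when the net input exceeds the threshold."""
    return 1.0 if z > threshold else 0.0

# Two inputs, both weighted 0.9 -- the same values used in Example 1 later in the lecture
print(neuron(x=np.array([1.0, 1.0]), w=np.array([0.9, 0.9]), b=0.0, activation=step))  # 1.0
```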

Activation Functions

Two common activation functions (the plots are omitted here):
▪ Rectified Linear Unit (ReLU): $f(x) = \max(0, x)$
▪ Exponential Linear Unit (ELU): $f(x) = x$ for $x > 0$, and $f(x) = \alpha(e^{x} - 1)$ for $x \le 0$
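As a small sketch (an addition to these notes, not from the slides), these two activations can be written in NumPy:

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: passes positive values, zeros out negatives."""
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    """Exponential Linear Unit: linear for x > 0, smooth saturation toward -alpha for x <= 0."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))  # [0.     0.     0.     1.5   ]
print(elu(z))   # [-0.8647 -0.3935  0.      1.5   ]
```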

How do they work?


Loss function (index): The loss function measures the discrepancy between the predicted outputs of the neural network and the true values. It quantifies the network's performance and guides the learning process by providing feedback.

Backpropagation: Backpropagation is a learning algorithm used to train a neural network. It involves propagating the error (the difference between predicted and actual outputs) backward through the network and adjusting the weights and biases iteratively to minimize the loss function.

Optimization algorithm: Optimization algorithms, such as gradient descent, are employed to update the weights and biases during training. These algorithms determine the direction and magnitude of weight adjustments based on the gradients of the loss function with respect to the network parameters.


Learning Problem
We represent the learning problem in terms of the minimization of a loss index $f$. Here, $f$ is the function that measures the performance of a neural network on a given dataset. Generally, the loss index consists of an error term and a regularization term. While the error term evaluates how well a neural network fits a dataset, the regularization term helps prevent overfitting by controlling the effective complexity of the neural network.


Loss index
The loss index plays a vital role in the use of neural networks. It defines the task the neural network is required to do and provides a measure of the quality of the representation it is required to learn. The choice of a suitable loss index depends on the application. When setting a loss index, two different terms must be chosen: an error term and a regularization term.

loss_index = error_term + regularization_term


Error term
The error is the most important term in the loss expression. It measures how well the neural network fits the data set. These errors can be measured over different subsets of the data: the training error refers to the error measured on the training samples, the selection error is measured on the selection samples, and the testing error is measured on the testing samples.


Next, we describe the most important errors used in the field of neural networks:
▪ Mean squared error.
▪ Normalized squared error.
▪ Weighted squared error.
▪ Cross-entropy error.
▪ Minkowski error.


▪ Mean squared error:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $y_i$ are the observed values and $\hat{y}_i$ are the predicted values.


▪ Normalized squared error:

$$\mathrm{NMSE} = \frac{1}{n}\,\frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\hat{y}_i^{\,2}}$$

where $y_i$ are the observed values and $\hat{y}_i$ are the predicted values.


▪ Cross-entropy error:

$$\mathrm{CEE} = -\sum_{i=1}^{n}\left[\,y_i \ln(\hat{y}_i) + (1 - y_i)\ln(1 - \hat{y}_i)\,\right]$$

where $y_i$ are the observed values and $\hat{y}_i$ are the predicted values.


▪ Minkowski error: a loss index that is less sensitive to outliers than the standard mean squared error.

What is an outlier?
Outliers refer to those data points which lie far away from most of the other data points. So, basically, outliers are points which are rare or distinct. Here is a simple example: say we have a set of 10 numbers, {45, 47, 56, 3, 54, 42, 50, 99, 48, 55}. In this set, we observe that most of the numbers lie between 40 and 60. But there are two numbers, 3 and 99, which are far away from most of the others. These numbers would be called outliers.


The Minkowski error is computed as

$$\text{Minkowski Error} = \frac{1}{n}\sum_{i=1}^{n}\left|\,y_i - \hat{y}_i\,\right|^{p}$$

where $y_i$ are the observed values, $\hat{y}_i$ are the predicted values, and $p$ is the Minkowski exponent.
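To make these formulas concrete, here is a small NumPy sketch of the error terms above (an illustration added to these notes; the example targets and predictions are arbitrary, and the normalization in nmse follows the formula given here):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error."""
    return np.mean((y - y_hat) ** 2)

def nmse(y, y_hat):
    """Normalized squared error, as defined above."""
    return np.sum((y - y_hat) ** 2) / (len(y) * np.sum(y_hat ** 2))

def cross_entropy(y, y_hat, eps=1e-12):
    """Binary cross-entropy error; eps guards against log(0)."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def minkowski(y, y_hat, p=1.5):
    """Minkowski error; p < 2 makes it less sensitive to outliers than MSE."""
    return np.mean(np.abs(y - y_hat) ** p)

y     = np.array([1.0, 0.0, 1.0, 1.0])
y_hat = np.array([0.9, 0.2, 0.8, 0.6])
print(mse(y, y_hat), nmse(y, y_hat), cross_entropy(y, y_hat), minkowski(y, y_hat))
```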


Regularization term
A solution is regular when small changes in the input variables lead to small changes in the outputs. An approach for non-regular problems is to control the effective complexity of the neural network. We can achieve this by including a regularization term in the loss index. Regularization terms usually measure the values of the parameters in the neural network. Adding such a term to the error will cause the neural network to have smaller weights and biases, which will force its response to be smoother. The most used types of regularization are the following:
• L1 regularization.
• L2 regularization.


L1 regularization
The L1 regularization method consists of the sum of the absolute values of all the parameters in the neural network:

l1_regularization = regularization_weight ⋅ ∑|parameters|

L2 regularization
The L2 regularization method consists of the sum of the squares of all the parameters in the neural network:

l2_regularization = regularization_weight ⋅ ∑parameters²
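A brief sketch of the two penalties (added to these notes, assuming the network's parameters are flattened into a single NumPy vector):

```python
import numpy as np

def l1_penalty(params, reg_weight):
    """L1 regularization: reg_weight times the sum of absolute parameter values."""
    return reg_weight * np.sum(np.abs(params))

def l2_penalty(params, reg_weight):
    """L2 regularization: reg_weight times the sum of squared parameter values."""
    return reg_weight * np.sum(params ** 2)

w = np.array([0.5, -1.2, 0.3])
error_term = 0.42  # hypothetical error term from the previous section
loss_index = error_term + l1_penalty(w, reg_weight=0.01)  # loss_index = error_term + regularization_term
print(loss_index)
```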


Training Algorithms
Training algorithms differ in computational speed and memory requirements (the comparison chart is omitted here). The slowest training algorithm is usually gradient descent, but it is the one requiring the least memory. On the contrary, the fastest one might be the Levenberg-Marquardt algorithm, but it usually requires a lot of memory. A good compromise might be the quasi-Newton method.


1. Gradient descent (GD)
Gradient descent is the most straightforward training algorithm. It requires information from the gradient vector, and hence it is a first-order method. Let $f(w^{(i)}) = f^{(i)}$ and $\nabla f(w^{(i)}) = g^{(i)}$. The method begins at a point $w^{(0)}$ and, until a stopping criterion is satisfied, moves from $w^{(i)}$ to $w^{(i+1)}$ in the training direction $d^{(i)} = -g^{(i)}$. Therefore, the gradient descent method iterates in the following way:

$$w^{(i+1)} = w^{(i)} - \eta\, g^{(i)}, \qquad i = 0, 1, \ldots$$
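A minimal sketch of this iteration in Python (added for illustration; the example objective $f(w) = \lVert w \rVert^2$ and the learning rate are arbitrary choices, not from the slides):

```python
import numpy as np

def gradient_descent(grad_f, w0, eta=0.1, tol=1e-6, max_iter=1000):
    """Iterate w <- w - eta * g until the gradient is small or max_iter is reached."""
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(w)                 # gradient vector g^(i)
        if np.linalg.norm(g) < tol:   # stopping criterion
            break
        w = w - eta * g               # step in the training direction d^(i) = -g^(i)
    return w

# Example: minimize f(w) = ||w||^2, whose gradient is 2w; the minimum is at the origin
print(gradient_descent(lambda w: 2 * w, w0=[3.0, -2.0]))  # ~ [0, 0]
```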

The loss function $f(w)$ depends on the adaptive parameters of the neural network: its weights and biases. These parameters can be grouped into a single $n$-dimensional weight vector $w$. The gradient vector groups the first derivatives of the loss function, and the Hessian matrix groups the second derivatives. (A pictorial representation of the loss function is omitted here.)


Backpropagation
Backpropagation propagates the error in the backward direction to update the weights.

Steps (a small sketch implementing them follows this list):
1. Compare the computed output to the actual output and determine the loss.
2. Determine in which direction to change each weight to reduce the loss.
3. Determine the amount by which to change each weight.
4. Apply the corrections to the weights.
5. Repeat the procedure in each iteration until the loss is reduced to an acceptable value.
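As an illustrative sketch added to these notes (not part of the slides), the five steps for a tiny one-hidden-layer network with sigmoid activations and squared error might look like this; the XOR data, network size, and learning rate are all hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
T = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output
eta = 0.5  # learning rate

for epoch in range(10000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    # Step 1: compare computed output to actual output (squared-error loss)
    loss = np.mean((Y - T) ** 2)
    if loss < 1e-3:                      # Step 5: stop once the loss is acceptable
        break
    # Steps 2-3: gradients give the direction and size of each weight change
    dY = (Y - T) * Y * (1 - Y)           # error signal at the output layer
    dH = (dY @ W2.T) * H * (1 - H)       # error propagated back to the hidden layer
    # Step 4: apply the corrections
    W2 -= eta * H.T @ dY;  b2 -= eta * dY.sum(axis=0)
    W1 -= eta * X.T @ dH;  b1 -= eta * dH.sum(axis=0)

print(np.round(Y.ravel(), 2))  # approaches [0, 1, 1, 0] after training
```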

Perceptron
Developed by Frank Rosenblatt in 1957 using the McCulloch and Pitts model, the perceptron is the basic operational unit of artificial neural networks. It employs a supervised learning rule and is able to classify data into two classes (a "binary classifier").
Operational characteristics of the perceptron: it consists of a single neuron with an arbitrary number of inputs along with adjustable weights, but the output of the neuron is 1 or 0 depending upon the threshold. It also has a bias, whose associated input is always 1. The following figure (omitted here) gives a schematic representation of the perceptron.


Perceptron Learning Algorithm
1. First, multiply all input values by the corresponding weight values and then add them to determine the weighted sum: $\sum_i w_i x_i = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$. Add another essential term called the bias $b$ to the weighted sum to improve the model's performance: $\sum_i w_i x_i + b$.
2. Next, an activation function is applied to this weighted sum, producing a binary or continuous-valued output: $Y = f\left(\sum_i w_i x_i + b\right)$.
3. Next, the difference between this output and the actual target value is computed to get the error term $E$, generally in terms of squared error: $E = (Y - Y_{\text{actual}})^2$. The steps up to this point form the forward-propagation part of the algorithm.
4. We minimize this error (the loss function) using an optimization algorithm. Generally, some form of gradient descent is used to find the optimal values of the weights and bias, with hyperparameters such as the learning rate controlling the search. This step forms the backward-propagation part of the algorithm.

Training Algorithm for a Single Output Unit
Step 1 − Initialize the following to start the training:
• Weights
• Bias
• Learning rate α
For easy calculation and simplicity, the weights and bias may be set equal to 0 and the learning rate set equal to 1.
Step 2 − Continue steps 3-8 while the stopping condition is not true.
Step 3 − Continue steps 4-6 for every training vector x.
Step 4 − Activate each input unit as follows:
$$x_i = s_i \qquad (i = 1, \ldots, n)$$
Step 5 − Now obtain the net input with the following relation:
$$y_{in} = b + \sum_{i=1}^{n} x_i \, w_i$$
Here b is the bias and n is the total number of input neurons.
Step 6 − Apply the threshold activation function to obtain the final output:
$$y = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{otherwise} \end{cases}$$
Step 7 − Adjust the weights and bias as follows:
Case 1 − if y ≠ t, then
$$w_i(\text{new}) = w_i(\text{old}) + \alpha\, t\, x_i, \qquad b(\text{new}) = b(\text{old}) + \alpha\, t$$
Case 2 − if y = t, then the weights and bias are left unchanged.
Here y is the actual output and t is the desired/target output.
Step 8 − Test for the stopping condition, which is met when there is no change in the weights.
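The following Python sketch (added to these notes) implements this training loop; it uses the error-driven update $w_i \leftarrow w_i + \alpha\,(t - y)\,x_i$, which is the form the worked example below also applies:

```python
import numpy as np

def train_perceptron(X, T, alpha=1.0, theta=0.5, max_epochs=100):
    """Train a single-output perceptron with a step activation at threshold theta."""
    w = np.zeros(X.shape[1])                  # Step 1: initialize weights ...
    b = 0.0                                   # ... and bias to 0
    for _ in range(max_epochs):               # Step 2: loop until the stopping condition
        changed = False
        for x, t in zip(X, T):                # Step 3: every training vector
            y_in = b + np.dot(x, w)           # Steps 4-5: net input
            y = 1.0 if y_in > theta else 0.0  # Step 6: threshold activation
            if y != t:                        # Step 7: adjust only on error
                w += alpha * (t - y) * x
                b += alpha * (t - y)
                changed = True
        if not changed:                       # Step 8: no weight change -> stop
            break
    return w, b
```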
Example 1

We are going to set weights randomly. Let’s say that w1 = 0.9 and w2 = 0.9
Round 1
We will apply the 1st instance to the perceptron: x1 = 0 and x2 = 0.
The sum unit gives Σ = x1 * w1 + x2 * w2 = 0 * 0.9 + 0 * 0.9 = 0.
The activation unit checks whether the sum is greater than a threshold. If this rule is satisfied, the unit fires and returns 1; otherwise it returns 0. (BTW, modern neural network architectures do not use this kind of step function as an activation.) The activation threshold here is 0.5.
The sum was 0 for the 1st instance, so the activation unit returns 0 because it is less than 0.5. The target output is 0 as well, so we will not update the weights because there is no error in this case.
Let's focus on the 2nd instance: x1 = 0 and x2 = 1.
Sum unit: Σ = x1 * w1 + x2 * w2 = 0 * 0.9 + 1 * 0.9 = 0.9

What about errors?
The activation unit will return 1 because the sum (0.9) is greater than 0.5. However, the target output of this instance is 0, so this instance is not predicted correctly. That's why we will update the weights based on the error:
ε = actual − prediction = 0 − 1 = −1
We add the error times the learning rate to the weights. The learning rate here is 0.5. (BTW, we mostly set the learning rate to a value between 0 and 1.)
w1 = w1 + α * ε = 0.9 + 0.5 * (−1) = 0.9 − 0.5 = 0.4
w2 = w2 + α * ε = 0.9 + 0.5 * (−1) = 0.9 − 0.5 = 0.4
Focus on the 3rd instance: x1 = 1 and x2 = 0.

Sum unit: Σ = x1 * w1 + x2 * w2 = 1 * 0.4 + 0 * 0.4 = 0.4
The activation unit returns 0 this time because the output of the sum unit is 0.4, which is less than the threshold 0.5. We will not update the weights.
Now consider the 4th instance: x1 = 1 and x2 = 1.
Sum unit: Σ = x1 * w1 + x2 * w2 = 1 * 0.4 + 1 * 0.4 = 0.8
The activation unit returns 1 because the output of the sum unit is 0.8, which is greater than the threshold value 0.5. Its target value is 1 as well, which means the 4th instance is predicted correctly. We will not update anything.

Round 2
In the previous round, we used the old weight values for the 1st instance and it was classified correctly. Let's apply the feed-forward pass with the new weight values.
Remember the 1st instance: x1 = 0 and x2 = 0.
Sum unit: Σ = x1 * w1 + x2 * w2 = 0 * 0.4 + 0 * 0.4 = 0
The activation unit returns 0 because the sum is 0, which is less than the threshold value 0.5. The target output of the 1st instance is 0 as well, which means the instance is classified correctly. We will not update the weights.
Feed forward for the 2nd instance: x1 = 0 and x2 = 1.

Sum unit: Σ = x1 * w1 + x2 * w2 = 0 * 0.4 + 1 * 0.4 = 0.4
The activation unit returns 0 because the sum is less than the threshold 0.5. The target output is 0 as well, which means it is classified correctly, and we will not update the weights.
We have already applied the feed-forward calculation for the 3rd and 4th instances with the current weight values in the previous round; they were classified correctly.
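Running the train_perceptron sketch from earlier on this example's data reproduces the final behavior (the sketch starts from zero weights rather than 0.9, so the intermediate values differ, but the learned classifications match):

```python
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 0.0, 0.0, 1.0])  # targets used in Example 1

w, b = train_perceptron(X, T, alpha=0.5, theta=0.5)
print([1.0 if b + np.dot(x, w) > 0.5 else 0.0 for x in X])  # [0.0, 0.0, 0.0, 1.0]
```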

OR Function Using a Perceptron (worked example omitted in this extract)
