UNIT-III
A Feedforward Neural Network (FNN) is a type of artificial neural network where connections between the nodes do
not form cycles. This characteristic differentiates it from recurrent neural networks (RNNs). The network consists of
an input layer, one or more hidden layers, and an output layer. Information flows in one direction—from input to
output—hence the name "feedforward."
Structure of a Feedforward Neural Network
1. Input Layer: The input layer consists of neurons that receive the input data. Each neuron in the input layer
represents a feature of the input data.
2. Hidden Layers: One or more hidden layers are placed between the input and output layers. These layers
are responsible for learning the complex patterns in the data. Each neuron in a hidden layer applies a
weighted sum of inputs followed by a non-linear activation function.
3. Output Layer: The output layer provides the final output of the network. The number of neurons in this layer
corresponds to the number of classes in a classification problem or the number of outputs in a regression
problem.
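To make the structure above concrete, here is a minimal NumPy sketch of a forward pass through such a network. The layer sizes (4 input features, 8 hidden units, 3 output classes) and the sigmoid activation are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 4 input features, 8 hidden units, 3 output classes.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input layer -> hidden layer
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)   # hidden layer -> output layer

def forward(x):
    h = sigmoid(x @ W1 + b1)   # hidden layer: weighted sum + non-linear activation
    return h @ W2 + b2         # output layer: one raw score per class

x = rng.normal(size=(1, 4))    # one sample with 4 features
print(forward(x).shape)        # (1, 3): one score per class
```

Note that information flows strictly from x through h to the output; nothing feeds a later layer's output back into an earlier layer, which is exactly the "no cycles" property described above.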
LSTM vs RNN

Feature                | LSTM (Long Short-Term Memory)                                  | RNN (Recurrent Neural Network)
Memory                 | Dedicated memory cell retains information over long sequences  | Single hidden state; long-range information fades
Gating                 | Input, forget, and output gates control the flow of information | No gating mechanism
Long-term dependencies | Captures them effectively                                       | Struggles due to the vanishing gradient problem
Drawbacks              | Computationally heavier and slower to train                     | Vanishing/exploding gradients on long sequences
LSTM excels at sequence prediction tasks because it can capture long-term dependencies. It is ideal for time series, machine translation, and speech recognition, where the order of the data matters. This article provides an in-depth introduction to LSTM, covering the LSTM model, its architecture, its working principles, and the critical role it plays in various applications.
What is LSTM?
Long Short-Term Memory (LSTM) is an improved version of the recurrent neural network, designed by Hochreiter & Schmidhuber (1997).
A traditional RNN has a single hidden state that is passed through time, which can make it difficult for the network to learn long-term dependencies. LSTMs address this problem by introducing a memory cell, a container that can hold information for an extended period.
LSTM Architecture
The LSTM architecture involves a memory cell controlled by three gates: the input gate, the forget gate, and the output gate. These gates decide what information is added to, removed from, and output from the memory cell.
• The input gate controls what information is added to the memory cell.
• The forget gate controls what information is removed from the memory cell.
• The output gate controls what information is output from the memory cell.
This allows LSTM networks to selectively retain or discard information as it flows through the network, which in turn enables them to learn long-term dependencies.
The LSTM maintains a hidden state, which acts as the short-term memory of the network. The hidden state is
updated based on the input, the previous hidden state, and the memory cell’s current state.
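The gate behaviour described above can be written down directly. Below is a minimal NumPy sketch of a single LSTM time step; the parameter layout (one weight matrix per gate, held in dictionaries) is a convenient assumption for illustration, not the only possible organization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold parameters for the input (i),
    forget (f), and output (o) gates plus the candidate cell content (g)."""
    i = sigmoid(x_t @ W["i"] + h_prev @ U["i"] + b["i"])  # input gate: what to add
    f = sigmoid(x_t @ W["f"] + h_prev @ U["f"] + b["f"])  # forget gate: what to remove
    o = sigmoid(x_t @ W["o"] + h_prev @ U["o"] + b["o"])  # output gate: what to expose
    g = np.tanh(x_t @ W["g"] + h_prev @ U["g"] + b["g"])  # candidate cell content
    c_t = f * c_prev + i * g       # memory cell: forget some old info, add some new
    h_t = o * np.tanh(c_t)         # hidden state: filtered view of the memory cell
    return h_t, c_t

# Illustrative sizes: 3 input features, 5 hidden units.
d_in, d_hid = 3, 5
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(d_in, d_hid)) for k in "ifog"}
U = {k: rng.normal(size=(d_hid, d_hid)) for k in "ifog"}
b = {k: np.zeros(d_hid) for k in "ifog"}
h, c = np.zeros(d_hid), np.zeros(d_hid)
h, c = lstm_step(rng.normal(size=d_in), h, c, W, U, b)  # process one input
```

The update c_t = f * c_prev + i * g is the key line: the forget gate scales what is kept from the previous cell state, and the input gate scales how much new content is written in.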
Advantages:
1. Handling Long Sequences: LSTMs are well-suited for processing sequences of data with long-range
dependencies. They can capture information from earlier time steps and remember it for a more extended
period, making them effective for tasks like natural language processing (NLP) and time series analysis.
2. Avoiding Vanishing Gradient Problem: LSTMs address the vanishing gradient problem, which is a
common issue in training deep networks, particularly RNNs. The architecture of LSTMs includes gating
mechanisms (such as the forget gate) that allow them to control the flow of information and gradients
through the network, preventing the gradients from becoming too small during training.
3. Handling Variable-Length Sequences: LSTMs can handle variable-length input sequences by dynamically adjusting their internal state. This is useful in many real-world applications where the length of the input data varies (see the padding/masking sketch after this list).
4. Memory Cell: LSTMs have a memory cell that can store and retrieve information over long sequences.
This memory cell allows LSTMs to maintain important information while discarding irrelevant
information, making them suitable for tasks that involve remembering past context.
5. Gradient Flow Control: LSTMs are equipped with mechanisms that allow them to control the flow of
gradients during backpropagation. The forget gate, for example, can prevent gradients from vanishing
when they need to be propagated back in time. This enables LSTMs to capture information from earlier
time steps effectively.
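To illustrate point 3 above (variable-length sequences), here is a minimal sketch using Keras padding and masking. The vocabulary size, embedding width, and layer sizes are illustrative assumptions.

```python
import tensorflow as tf

# Three sequences of different lengths, padded with 0 to a common length.
seqs = [[1, 2, 3], [4, 5], [6]]
padded = tf.keras.utils.pad_sequences(seqs, padding="post")

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10, output_dim=8, mask_zero=True),  # 0 = padding
    tf.keras.layers.LSTM(16),               # the LSTM skips masked (padded) time steps
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
print(model(padded).shape)  # (3, 1): one prediction per sequence
```

With mask_zero=True, the LSTM simply carries its internal state over at padded positions, so sequences of different true lengths can share one batch.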
Disadvantages:
1. Computational Complexity: LSTMs are computationally more intensive compared to other neural network
architectures like feedforward networks or simple RNNs. Training LSTMs can be slower and may require
more resources.
2. Overfitting: Like other deep learning models, LSTMs are susceptible to overfitting when there is insufficient training data. Regularization techniques like dropout can help mitigate this issue (see the sketch after this list).
3. Hyperparameter Tuning: LSTMs have several hyperparameters to tune, such as the number of LSTM units,
the learning rate, and the sequence length. Finding the right set of hyperparameters for a specific
problem can be a challenging and time-consuming process.
4. Limited Interpretability: LSTMs are often considered “black-box” models, making it challenging to interpret how they arrive at a particular decision. This can be a drawback in applications where interpretability is crucial.
5. Long Training Times: Training deep LSTM models on large datasets can be time-consuming and may
require powerful hardware, such as GPUs or TPUs.
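As noted in point 2 of the list above, dropout is the usual first defence against overfitting. Here is a minimal Keras sketch; the layer sizes, dropout rates (0.2 and 0.5), and the 10-feature input are illustrative assumptions.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 10)),  # variable-length sequences, 10 features/step
    # dropout applies to layer inputs; recurrent_dropout to the recurrent connections
    tf.keras.layers.LSTM(64, return_sequences=True, dropout=0.2, recurrent_dropout=0.2),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dropout(0.5),             # standard dropout before the output layer
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```

Note that a nonzero recurrent_dropout disables Keras's GPU-optimized LSTM kernels, so it trades training speed for regularization.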
A Convolutional Neural Network (CNN) is a type of Deep Learning neural network architecture commonly used in Computer Vision. Computer Vision is a field of Artificial Intelligence that enables a computer to understand and interpret images and other visual data.
When it comes to Machine Learning, Artificial Neural Networks perform really well. Neural Networks are used on various kinds of data, such as images, audio, and text. Different types of Neural Networks are used for different purposes: for example, for predicting a sequence of words we use Recurrent Neural Networks, more precisely an LSTM; similarly, for image classification we use Convolutional Neural Networks. In this blog, we are going to build a basic building block for a CNN.
Steps:
• Import the necessary libraries.
• Set the parameters.
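Putting those steps together, here is a minimal Keras sketch of a basic CNN building block. The input shape (28×28 grayscale images), the number of classes, and the layer parameters are illustrative assumptions, since the text does not specify a dataset.

```python
# Step 1: import the necessary libraries
import tensorflow as tf

# Step 2: set the parameters (illustrative values)
img_height, img_width, channels = 28, 28, 1
num_classes = 10

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(img_height, img_width, channels)),
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu"),  # learn local features
    tf.keras.layers.MaxPooling2D(pool_size=2),                     # downsample feature maps
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),      # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

The Conv2D + MaxPooling2D pair is the basic building block referred to above; deeper CNNs stack several such pairs before the final dense layers.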