LONG SHORT-TERM MEMORY (LSTM)

- Noureen Tabassum
23011DB025
M. Tech (Data Science)
WHAT IS LSTM?
 LSTM (Long Short-Term Memory) is a recurrent neural network
(RNN) architecture widely used in Deep Learning. It excels at capturing
long-term dependencies, making it ideal for sequence prediction tasks.
 Unlike traditional neural networks, an LSTM incorporates feedback
connections, allowing it to process entire sequences of data rather than
individual data points (see the sketch after this list).
 This makes it highly effective in understanding and predicting patterns
in sequential data like time series, text, and speech.
 LSTM has become a powerful tool in artificial intelligence and deep
learning, enabling breakthroughs in various fields by uncovering
valuable insights from sequential data.
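
As a minimal illustration of this sequential processing, the sketch below (assuming PyTorch, with arbitrary toy dimensions) feeds a sequence through a single LSTM cell one timestep at a time, carrying the hidden and cell states forward as feedback:

import torch
import torch.nn as nn

# A minimal sketch (assumes PyTorch; dimensions are arbitrary toy values).
# The LSTM cell consumes a sequence one timestep at a time, feeding its own
# hidden state (h) and cell state (c) back into the next step.
cell = nn.LSTMCell(input_size=8, hidden_size=16)

sequence = torch.randn(20, 8)   # 20 timesteps, 8 features each
h = torch.zeros(1, 16)          # initial hidden state (short-term memory)
c = torch.zeros(1, 16)          # initial cell state (long-term memory)

for x_t in sequence:            # the whole sequence, not a single data point
    h, c = cell(x_t.unsqueeze(0), (h, c))

print(h.shape)  # torch.Size([1, 16]): a summary of the entire sequence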
LSTM ARCHITECTURE
An LSTM resolves the vanishing gradient problem faced by RNNs. In this
section, we will see how it does so by examining the architecture of the
LSTM. At a high level, an LSTM works very much like an RNN cell. The LSTM
network architecture consists of three parts, and each part performs an
individual function.
THE LOGIC BEHIND LSTM
 These three parts of an LSTM unit are known as gates. They control the flow
of information into and out of the memory cell, or LSTM cell.
 The first gate is the Forget gate, the second is the Input gate, and the
last one is the Output gate.
 An LSTM unit consisting of these three gates and a memory cell is analogous
to a layer of neurons in a traditional feedforward neural network, with each
unit maintaining a hidden state and a cell state.
 Like an RNN, an LSTM has a hidden state, where Ht-1 represents the hidden
state of the previous timestamp and Ht is the hidden state of the current
timestamp.
 In addition, an LSTM also has a cell state, represented by Ct-1 and Ct for
the previous and current timestamps, respectively.
 The hidden state is known as short-term memory, and the cell state is known
as long-term memory.
ROLES OF GATES IN ARCHITECTURE
1. Forget Gate:
 In a cell of the LSTM network, the first step is to decide whether to keep
the information from the previous timestep or forget it. The forget gate
equation is:

ft = σ( Xt · Uf + Ht-1 · Wf )

where:
•Xt: input at the current timestamp
•Uf: weight matrix associated with the input
•Ht-1: hidden state of the previous timestamp
•Wf: weight matrix associated with the hidden state

Because of the sigmoid, ft lies between 0 and 1; it is multiplied
element-wise with the previous cell state Ct-1 to decide how much of the old
information to keep.
2. Input Gate:

 The input gate quantifies the importance of the new information carried by
the input. The input gate equation is:

it = σ( Xt · Ui + Ht-1 · Wi )

where:
•Xt: input at the current timestamp t
•Ui: weight matrix associated with the input
•Ht-1: hidden state of the previous timestamp
•Wi: weight matrix associated with the hidden state

The candidate new information is Nt = tanh( Xt · Uc + Ht-1 · Wc ), and the
cell state is updated as Ct = ft ⊙ Ct-1 + it ⊙ Nt.
3. Output Gate:

 The output gate equation is similar in form to the two previous gates:

Ot = σ( Xt · Uo + Ht-1 · Wo )

 Its value also lies between 0 and 1 because of the sigmoid function. To
calculate the current hidden state, we combine Ot with the tanh of the
updated cell state:

Ht = Ot ⊙ tanh( Ct )
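
To tie the gate equations together, here is a minimal NumPy sketch of a single LSTM cell step using the notation above; the weight matrices are random stand-ins rather than trained values, and the small dimensions are arbitrary:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Minimal sketch of one LSTM cell step using the gate equations above.
# Weights are random stand-ins (assumed toy sizes: input dim 4, hidden dim 3);
# bias terms are omitted, matching the equations as written.
rng = np.random.default_rng(0)
n_in, n_h = 4, 3
U_f, U_i, U_c, U_o = (rng.standard_normal((n_in, n_h)) for _ in range(4))
W_f, W_i, W_c, W_o = (rng.standard_normal((n_h, n_h)) for _ in range(4))

def lstm_step(x_t, h_prev, c_prev):
    f_t = sigmoid(x_t @ U_f + h_prev @ W_f)  # forget gate: how much old memory to keep
    i_t = sigmoid(x_t @ U_i + h_prev @ W_i)  # input gate: how much new info to add
    n_t = np.tanh(x_t @ U_c + h_prev @ W_c)  # candidate new information
    c_t = f_t * c_prev + i_t * n_t           # updated cell state (long-term memory)
    o_t = sigmoid(x_t @ U_o + h_prev @ W_o)  # output gate
    h_t = o_t * np.tanh(c_t)                 # new hidden state (short-term memory)
    return h_t, c_t

h, c = np.zeros(n_h), np.zeros(n_h)
x_t = rng.standard_normal(n_in)
h, c = lstm_step(x_t, h, c)
print(h)  # hidden state after one timestep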
LSTM NETWORK
WHAT ARE BIDIRECTIONAL LSTMS?
 Bidirectional LSTMs (Long Short-Term Memory) are a type of recurrent
neural network (RNN) architecture that processes input data in both forward
and backward directions.
 In a traditional LSTM, the information flows only from past to future,
making predictions based on the preceding context.
 However, in bidirectional LSTMs, the network also considers future
context, enabling it to capture dependencies in both directions (see the
sketch after this list).
 As a result, bidirectional LSTMs are particularly useful for tasks that
require a comprehensive understanding of the input sequence, such as
natural language processing tasks like sentiment analysis, machine
translation, and named entity recognition.
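
As a minimal sketch (assuming PyTorch, with arbitrary toy dimensions), a bidirectional LSTM can be created by setting a single flag; the output at every timestep concatenates the forward and backward hidden states:

import torch
import torch.nn as nn

# Minimal sketch (assumes PyTorch; toy dimensions). A bidirectional LSTM runs
# one pass forward and one pass backward over the sequence and concatenates
# both hidden states at every timestep.
bilstm = nn.LSTM(input_size=10, hidden_size=32,
                 batch_first=True, bidirectional=True)

x = torch.randn(4, 15, 10)       # (batch, timesteps, features)
out, (h_n, c_n) = bilstm(x)

print(out.shape)  # torch.Size([4, 15, 64]): 32 forward + 32 backward per step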
ADVANTAGES AND DISADVANTAGES OF LSTMS
 Advantages
 It can learn long-term dependencies in data.
 It helps us better understand and predict sequential behaviour, such as
stock market movements.
 It typically achieves better accuracy than simple RNNs on sequence tasks.
 It is capable of learning and remembering information over long time spans.

 Disadvantages
 One drawback is that implementing LSTM networks on FPGAs requires
specialized hardware and software knowledge.
 If they are not trained properly, they are prone to overfitting.
 Their long-term memory is still limited in practice for very long sequences.
 Their complex cell structure makes them computationally expensive to train.
