
Notes on RNN, LSTM, Bidirectional LSTM, Encoder-Decoder, and Transformers

1. Recurrent Neural Networks (RNN)

---------------------------------

RNNs are neural networks designed for sequential data. Unlike feedforward networks, RNNs have connections that form directed cycles, allowing them to maintain hidden states that capture temporal dependencies.

Architecture:

- Input Layer: Takes one time step of the input sequence at a time.

- Hidden Layer: Processes the input and the previous hidden state.

- Output Layer: Produces the output for each time step.

Equation:

h_t = f(W_xh*x_t + W_hh*h_(t-1) + b_h)

y_t = g(W_hy*h_t + b_y)

Where:

- x_t: Input at time t

- h_t: Hidden state at time t

- y_t: Output at time t

- W_xh, W_hh, W_hy: Weight matrices; b_h, b_y: Bias vectors

Limitations:

- Struggles with long-term dependencies due to vanishing gradients.
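
A minimal NumPy sketch of the forward recursion above, assuming f = tanh and taking g as the identity (the shapes and activation choices are illustrative assumptions, not taken from these notes):

    import numpy as np

    def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y):
        """Run a vanilla RNN over a sequence xs (a list of input vectors x_t)."""
        h = np.zeros(W_hh.shape[0])               # initial hidden state h_0
        outputs = []
        for x_t in xs:
            # h_t = f(W_xh*x_t + W_hh*h_(t-1) + b_h), with f = tanh
            h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
            # y_t = g(W_hy*h_t + b_y), with g taken as the identity here
            outputs.append(W_hy @ h + b_y)
        return outputs, h

For input dimension d, hidden size n, and output dimension m, W_xh has shape (n, d), W_hh has shape (n, n), and W_hy has shape (m, n).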

2. Long Short-Term Memory (LSTM)


---------------------------------

LSTMs are an advanced version of RNNs designed to handle long-term dependencies. They introduce gates to regulate the flow of information.

Architecture:

- Cell State: Stores long-term information.

- Forget Gate: Decides what information to discard.

- Input Gate: Determines which new information to store.

- Output Gate: Controls the output based on the cell state and hidden state.

Equations:

f_t = sigmoid(W_f[x_t, h_(t-1)] + b_f)

i_t = sigmoid(W_i[x_t, h_(t-1)] + b_i)

C~_t = tanh(W_C[x_t, h_(t-1)] + b_C)

C_t = f_t * C_(t-1) + i_t * C~_t

o_t = sigmoid(W_o[x_t, h_(t-1)] + b_o)

h_t = o_t * tanh(C_t)

Advantages:

- Effectively captures long-term dependencies.
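
A minimal NumPy sketch of one LSTM time step, following the gate equations above; the convention that each weight matrix acts on the concatenation [x_t, h_(t-1)] matches the notation, while the shapes are illustrative assumptions:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
        """One LSTM step; each weight matrix acts on the concatenated [x_t, h_(t-1)]."""
        z = np.concatenate([x_t, h_prev])
        f_t = sigmoid(W_f @ z + b_f)            # forget gate: what to discard from C_(t-1)
        i_t = sigmoid(W_i @ z + b_i)            # input gate: which new information to store
        C_tilde = np.tanh(W_C @ z + b_C)        # candidate cell state C~_t
        C_t = f_t * C_prev + i_t * C_tilde      # updated cell state
        o_t = sigmoid(W_o @ z + b_o)            # output gate
        h_t = o_t * np.tanh(C_t)                # updated hidden state
        return h_t, C_t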

3. Bidirectional LSTM

----------------------

A Bidirectional LSTM processes the sequence in both forward and backward directions, capturing context from both past and future.

Architecture:

- Two LSTM layers: One processes the input forward, and the other processes it backward.

Equation:

h_t = concat(h_t_forward, h_t_backward)

Applications:

- Speech recognition, language modeling, etc.
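
A minimal sketch of the bidirectional wiring, reusing the lstm_step function sketched in the previous section (the helper names and the per-direction parameter dictionaries are assumptions for illustration):

    import numpy as np

    def run_lstm(xs, params, hidden_size):
        """Run an LSTM over a sequence and return the hidden state at every time step."""
        h = np.zeros(hidden_size)
        C = np.zeros(hidden_size)
        states = []
        for x_t in xs:
            h, C = lstm_step(x_t, h, C, **params)   # lstm_step from the LSTM section
            states.append(h)
        return states

    def bidirectional_lstm(xs, fwd_params, bwd_params, hidden_size):
        h_fwd = run_lstm(xs, fwd_params, hidden_size)              # processes x_1 .. x_T
        h_bwd = run_lstm(xs[::-1], bwd_params, hidden_size)[::-1]  # x_T .. x_1, re-aligned
        # h_t = concat(h_t_forward, h_t_backward) for every time step t
        return [np.concatenate([hf, hb]) for hf, hb in zip(h_fwd, h_bwd)]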

4. Encoder-Decoder Architecture

--------------------------------

Used for tasks like machine translation, this architecture includes:

- Encoder: Processes the input sequence and encodes it into a context vector.

- Decoder: Decodes the context vector into the target sequence.

Workflow:

1. The encoder processes the input sequence and generates a fixed-size context vector.

2. The decoder takes this context vector and generates the output sequence step-by-step.
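
A minimal sketch of this workflow with RNN-style components; the greedy argmax decoding, the embedding table, and the start/end token ids are illustrative assumptions:

    import numpy as np

    def encode(xs, W_xh, W_hh, b_h):
        """Encoder: fold the whole input sequence into one fixed-size context vector."""
        h = np.zeros(W_hh.shape[0])
        for x_t in xs:
            h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        return h                                     # the context vector

    def decode(context, embed, W_eh, W_hh, W_hy, b_h, b_y, start_id, end_id, max_len=50):
        """Decoder: generate the target sequence step by step from the context vector."""
        h = context
        token = start_id
        output = []
        for _ in range(max_len):
            # Feed the embedding of the previously generated token back in.
            h = np.tanh(W_eh @ embed[token] + W_hh @ h + b_h)
            token = int(np.argmax(W_hy @ h + b_y))   # greedy choice of the next token
            if token == end_id:
                break
            output.append(token)
        return output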

5. Transformers

----------------

Transformers are models that replace recurrence with self-attention mechanisms. They are the foundation for models like BERT and GPT.

Components:

1. Encoder-Decoder Structure:

- Encoder: Consists of multiple layers with self-attention and feedforward sublayers.

- Decoder: Similar to the encoder, but its self-attention is masked (each position attends only to earlier positions) and it adds cross-attention layers over the encoder output.


2. Self-Attention: Computes attention weights for each word in relation to other words.

Attention Equation:

Attention(Q, K, V) = softmax((QK^T)/sqrt(d_k))V
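
A minimal NumPy sketch of this equation, assuming Q, K, and V are matrices with one row per token:

    import numpy as np

    def attention(Q, K, V):
        """Attention(Q, K, V) = softmax((QK^T)/sqrt(d_k))V."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)              # shape (num_queries, num_keys)
        # Row-wise softmax over the keys (max subtracted for numerical stability).
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V                           # weighted sum of the value rows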

Advantages:

- Captures global dependencies.

- Highly parallelizable, since all time steps are processed at once rather than sequentially.

Summary Table:

| Model               | Key Feature                | Use Case                    |
|---------------------|----------------------------|-----------------------------|
| RNN                 | Cyclic connections         | Sequential data             |
| LSTM                | Gates for long-term deps.  | Long sequence modeling      |
| Bidirectional LSTM  | Processes two directions   | Context-aware tasks         |
| Encoder-Decoder     | Separate encode/decode     | Translation, summarization  |
| Transformers        | Self-attention             | NLP, image processing       |
