Recurrent neural networks
Deep Learning for Engineering
Master’s Degree in Electrical Engineering
Ivan Aldaya and Leandra Abreu
Table of Contents
▶ Introduction to RNNs
▶ Basic RNNs
▶ Long short-term memory and gated recurrent units
▶ Bibliography
1/25
Introduction to RNNs
Justification of RNNs
MLPs do not exploit signal correlation
CNNs exploit signal correlation
However, neither of them has memory
In some cases, it is interesting to have memory
• Mainly in series and sequence processing
• Forecasting of complicated series is of great interest
• MLPs and CNNs can also be employed
2/25
Introduction to RNNs
Examples of RNNs
Time series forecast:
Sahoo, B. B., Jha, R., Singh, A., & Kumar, D. (2019). Long short-term memory (LSTM) recurrent neural network for low-flow hydrological time series forecasting. Acta Geophysica, 67(5), 1471-1481.

Other applications (nonlinear compensation):
Liu, X., Wang, Y., Wang, X., Xu, H., Li, C., & Xin, X. (2021). Bi-directional gated recurrent unit neural network based nonlinear equalizer for coherent optical communication system. Optics Express, 29(4), 5923-5933.
3/25
Introduction to RNNs
Types of time series and sequences
Dimensionality
Time series can be unidimensional:
• Some temperature traces
• Some voltages or currents
Or they can be multidimensional:
• Vectorial measurements of movement: forces, acceleration...
• Multidimensional communication signals: amplitude/phase, polarization

Directionality
In many cases, only past values can be considered
• This is the case of forecasting
In some cases, we have information in both directions
• Even for causal systems
• Sometimes, we perform a time framework conversion
4/25
Introduction to RNNs
General structure of RNNs
[Figure: Single-layer RNN vs. multilayer RNN. Inputs x[k], x[k+1], x[k+2], ..., x[k+i], ..., x[k+N] feed a recurrent stage (one recurrent layer, or several stacked recurrent layers), followed by a densely connected stage that produces the output y[k].]
5/25
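The two-stage structure above — a recurrent stage that consumes the window one sample at a time, followed by a densely connected stage — can be sketched in NumPy (a minimal illustration; all function and weight names are ours, not a library API):

```python
import numpy as np

def rnn_forward(x_window, w_in, w_rec, w_out):
    """Sketch of the general structure: a recurrent stage over the window
    x[k]...x[k+N], then a densely connected stage mapping the final
    hidden state to y[k]."""
    h = np.zeros(w_rec.shape[0])
    for x_k in x_window:                 # recurrent stage, one step per sample
        h = np.tanh(w_in * x_k + w_rec @ h)
    return float(w_out @ h)              # densely connected stage

rng = np.random.default_rng(0)
x_window = rng.normal(size=10)           # a window of 10 scalar samples
w_in = rng.normal(size=4)                # 4 recurrent units, scalar input
w_rec = rng.normal(size=(4, 4)) * 0.5    # recurrent weights
w_out = rng.normal(size=4)               # dense output weights
y = rnn_forward(x_window, w_in, w_rec, w_out)
```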
Introduction to RNNs
Different implementations
[Figure: each implementation drawn as a chain of memory cells traversed along the time direction.]

Basic RNNs
They are intuitive but they have some problems:
• Training is challenging
• Subject to gradient vanishing/explosion

Advanced RNNs
They improve basic RNNs to avoid:
• Gradient vanishing/explosion
There are mainly two types:
• Long short-term memory (LSTM)
• Gated recurrent unit (GRU)

Transformers
More advanced memory cells, capable of extracting complex temporal patterns
Transformers are used in:
• Natural language processing
• DNA sequencing
6/25
Table of Contents
▶ Introduction to RNNs
▶ Basic RNNs
▶ Long short-term memory and gated recurrent units
▶ Bibliography
7/25
Basic RNNs
The recurrent unit
[Figure: The recurrent unit. Inputs x1, x2, ..., xN are scaled by weights ω1, ω2, ..., ωN and summed (weighted sum u); the activation function produces the output y, which is fed back through a delay T with recurrent weight ωr (the recurrent connection).]
8/25
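The unit in the figure computes a weighted sum of the inputs plus the delayed, ωr-weighted output, then applies the activation function. A minimal NumPy sketch (names are ours, for illustration):

```python
import numpy as np

def recurrent_unit(x_seq, w, w_r, activation=np.tanh):
    """Single recurrent unit: u[k] = sum_i w_i * x_i[k] + w_r * y[k-1],
    then y[k] = f(u[k])."""
    y = 0.0
    outputs = []
    for x_k in x_seq:                    # x_k is the input vector at step k
        u = np.dot(w, x_k) + w_r * y     # weighted sum plus recurrent term
        y = activation(u)                # activation function
        outputs.append(float(y))
    return outputs

x_seq = [np.array([0.5, -0.2]), np.array([0.1, 0.3]), np.array([0.0, 1.0])]
w = np.array([0.8, -0.4])                # input weights ω1, ω2
y_seq = recurrent_unit(x_seq, w, w_r=0.6)  # ωr is the recurrent weight
```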
Basic RNNs
Unwrapping an RNN

[Figure: Unwrapping. The recurrent unit (input weights ωi, recurrent weight ωr, delay T) is unrolled in time: the samples xi[k], xi[k+1], xi[k+2], ..., xi[k+LW] each feed a copy of the unit, whose hidden outputs h1[k], h2[k], h3[k], ... are chained through ωr up to the final output y[k].]
9/25
Basic RNNs
Implementation of RNNs
In a more schematic way:

[Figure: The unwrapped network redrawn compactly: each sample xi[k+j] enters a recurrent block that produces hj[k], with consecutive blocks chained through ωr; the last block, fed by xi[k+LW], produces y[k].]
10/25
Basic RNNs
Implementation of RNNs
There are two alternatives:

Return the full sequence: the layer emits one value per input sample (h1[k], h2[k], ..., hj[k], ..., y[k]). The full sequence is usually adopted in intermediate layers.

Return the last value: the layer emits only the final value y[k]. A single output is usually adopted in the final recurrent layer.
11/25
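These two alternatives map directly onto the `return_sequences` flag of Keras recurrent layers. A NumPy sketch of the same idea (toy layer, our own names):

```python
import numpy as np

def recurrent_layer(x_window, w_in, w_rec, return_sequences=False):
    """Recurrent layer over one window. With return_sequences=True it
    emits one hidden vector per input sample (intermediate layers);
    with False it emits only the last value (final layer)."""
    h = np.zeros(w_rec.shape[0])
    hs = []
    for x_k in x_window:
        h = np.tanh(w_in * x_k + w_rec @ h)
        hs.append(h)
    return np.stack(hs) if return_sequences else h

rng = np.random.default_rng(1)
x_window = rng.normal(size=5)            # 5 scalar samples
w_in = rng.normal(size=3)                # 3 recurrent units
w_rec = rng.normal(size=(3, 3)) * 0.5
full = recurrent_layer(x_window, w_in, w_rec, return_sequences=True)
last = recurrent_layer(x_window, w_in, w_rec, return_sequences=False)
print(full.shape, last.shape)   # (5, 3) (3,)
```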
Basic RNNs
Implementation of RNNs
Stacking different recurrent layers

[Figure: The samples xi[k], xi[k-1], ..., xi[k-j], ..., xi[k-N] feed a first recurrent layer (weights ωr(1)) producing h11[k], h21[k], ..., hN1[k]; its sequence feeds a second recurrent layer (ωr(2)) producing h12[k], h22[k], ..., hN2[k]; a third recurrent layer (ωr(3)) produces the single output y[k].]
12/25
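Stacking can be sketched by chaining such layers, with intermediate layers returning full sequences and the last one a single value (toy NumPy code; all shapes are illustrative choices):

```python
import numpy as np

def recurrent_layer(x_seq, w_in, w_rec, return_sequences=False):
    """Toy recurrent layer: each input is combined with the previous
    hidden state through the layer's recurrent weights ωr."""
    h = np.zeros(w_rec.shape[0])
    hs = []
    for x_k in x_seq:
        h = np.tanh(w_in @ np.atleast_1d(x_k) + w_rec @ h)
        hs.append(h)
    return np.stack(hs) if return_sequences else h

rng = np.random.default_rng(2)
x_window = rng.normal(size=(6, 1))       # 6 scalar samples
# layer 1 (ωr(1)): 4 units, returns its full sequence
h1 = recurrent_layer(x_window, rng.normal(size=(4, 1)),
                     rng.normal(size=(4, 4)) * 0.5, return_sequences=True)
# layer 2 (ωr(2)): 3 units, also returns its full sequence
h2 = recurrent_layer(h1, rng.normal(size=(3, 4)),
                     rng.normal(size=(3, 3)) * 0.5, return_sequences=True)
# layer 3 (ωr(3)): 2 units, returns only the last value
y = recurrent_layer(h2, rng.normal(size=(2, 3)),
                    rng.normal(size=(2, 2)) * 0.5)
print(h1.shape, h2.shape, y.shape)   # (6, 4) (6, 3) (2,)
```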
Basic RNNs
Formatting the data for RNNs
Overlapping windowing is usually employed to build the input matrix and the output vector.

The length of the window should coincide with the depth of the unwrapped cell: LW = window length.

The number of windows is: NW = window number.

[Figure: window 1, window 2, window 3, ..., window i slide along the series, each paired with its target out1, out2, out3, ..., outi.]
13/25
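A windowing sketch in NumPy, assuming a 1-D series (the helper name is our own):

```python
import numpy as np

def make_windows(series, lw):
    """Overlapping windows: row i is series[i:i+lw], its target is
    series[i+lw]. Returns the NW x LW input matrix and the targets."""
    nw = len(series) - lw                       # number of windows NW
    X = np.stack([series[i:i + lw] for i in range(nw)])
    y = series[lw:lw + nw]
    return X, y

series = np.arange(14.0)                        # a toy series
X, y = make_windows(series, lw=3)
print(X.shape, y.shape)   # (11, 3) (11,)
```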
Basic RNNs
Formatting the data for RNNs
The different windows are combined in a single matrix.

[Figure: window 1, window 2, window 3, ..., window i stacked as the rows of the input matrix; out1, out2, out3, ..., outi stacked as the output vector.]
14/25
Basic RNNs
Formatting the data for RNNs
[Figure: Example with LW = 11. Window 0 contains x[0], ..., x[10] with output x[11]; window 1 contains x[1], ..., x[11] with output x[12]; window 2 contains x[2], ..., x[12] with output x[13]; in general, window i contains x[i+0], ..., x[i+10] with output x[i+11].]

The size of the input matrix is: NW × LW
The size of the output vector is: 1 × NW
15/25
Basic RNNs
Processing the data with RNNs
[Figure: Each window of the input matrix (size NW × LW) is processed sample by sample by the recurrent stage: the 1st recurrent layer has N1 units (RU1,1, RU1,2, ..., RU1,N1), the 2nd recurrent layer has N2 units (RU2,1, RU2,2, ..., RU2,N2), and so on up to the mth recurrent layer with Nm units. A densely connected layer then produces the output vector y[0], y[1], y[2], ..., y[NW].]
16/25
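Putting the pieces together — window matrix, stacked recurrent layers, dense output — as a toy NumPy sketch (all names and sizes are illustrative, not a library API):

```python
import numpy as np

def recurrent_layer(x_seq, w_in, w_rec, return_sequences=False):
    """Toy recurrent layer (tanh units), as in the previous slides."""
    h = np.zeros(w_rec.shape[0])
    hs = []
    for x_k in x_seq:
        h = np.tanh(w_in @ np.atleast_1d(x_k) + w_rec @ h)
        hs.append(h)
    return np.stack(hs) if return_sequences else h

rng = np.random.default_rng(3)
series = np.sin(np.linspace(0, 6, 40))            # a toy series
lw = 8                                            # window length LW
X = np.stack([series[i:i + lw] for i in range(len(series) - lw)])  # NW x LW

n1, n2 = 5, 4                                     # units per recurrent layer
w = {"in1": rng.normal(size=(n1, 1)), "rec1": rng.normal(size=(n1, n1)) * 0.4,
     "in2": rng.normal(size=(n2, n1)), "rec2": rng.normal(size=(n2, n2)) * 0.4,
     "dense": rng.normal(size=n2)}

y_hat = []
for window in X:                                  # one window at a time
    h1 = recurrent_layer(window, w["in1"], w["rec1"], return_sequences=True)
    h2 = recurrent_layer(h1, w["in2"], w["rec2"])  # last value only
    y_hat.append(float(w["dense"] @ h2))           # densely connected layer
y_hat = np.asarray(y_hat)
print(X.shape, y_hat.shape)   # (32, 8) (32,)
```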
Basic RNNs
Drawbacks of RNNs: the exploding/vanishing effect

Let's consider a single recurrent unit. To simplify our discussion, we assume a linear activation function.

The contribution of the input x[i] to the output is proportional to: ωR^(LW−i)

Therefore, we have two possibilities:
• |ωR| > 1: gradient explosion
• |ωR| < 1: gradient vanishing

[Figure: The unwrapped unit with inputs x[0], x[1], ..., x[LW] and output y[0], drawn for |ωR| > 1 (gradient explosion) and |ωR| < 1 (gradient vanishing).]
17/25
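The ωR^(LW−i) scaling can be checked numerically (a small sketch; LW = 50 and the two values of ωR are arbitrary choices):

```python
import numpy as np

lw = 50                                        # unwrapping depth LW
idx = np.arange(lw)                            # input sample index i
# contribution of x[i] to the output, for one ωR above and one below 1
contribs = {w_r: abs(w_r) ** (lw - idx) for w_r in (1.1, 0.9)}
for w_r, c in contribs.items():
    print(f"wR = {w_r}: x[0] weighted by {c[0]:.3g}, x[LW-1] by {c[-1]:.3g}")
```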
▶ Introduction to RNNs
▶ Basic RNNs
▶ Long short-term memory and gated recurrent units
▶ Bibliography
18/25
Long short-term memory and gated recurrent units
Short- and long-term paths

A possible solution to this problem is to implement bypass connections.

[Figure: Left: the basic recurrent unit unwrapped over x[0], x[1], ..., x[LW], with a single short-memory path to y[0]. Right: the same unit with an additional long-memory path that bypasses the step-by-step recurrence.]
19/25
Long short-term memory and gated recurrent units
Long short-term memory (LSTM) units

[Figure: The LSTM unit. At each step, the input x[i] and the previous hidden state feed a forget gate (logistic), an input gate (logistic) paired with a tanh candidate, and an output gate (logistic). The cell state carries the long-memory path, while the hidden state carries the short-memory path, from x[0], x[1], ..., x[LW] to y[0].]
20/25
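One step of an LSTM cell can be sketched as follows (a common formulation; the stacking order of the gates and all names here are our own choices, for illustration):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W, U, b stack the four gates in the order
    forget, input, candidate, output (an assumed convention)."""
    n = h.size
    z = W @ np.atleast_1d(x) + U @ h + b
    f = sigmoid(z[0:n])            # forget gate: what to erase from c
    i = sigmoid(z[n:2 * n])        # input gate: what to write into c
    g = np.tanh(z[2 * n:3 * n])    # candidate cell values
    o = sigmoid(z[3 * n:4 * n])    # output gate: what to expose as h
    c = f * c + i * g              # cell state: the long-memory path
    h = o * np.tanh(c)             # hidden state: the short-memory path
    return h, c

rng = np.random.default_rng(4)
n = 3                                    # number of LSTM units
W = rng.normal(size=(4 * n, 1))          # input weights (scalar input)
U = rng.normal(size=(4 * n, n))          # recurrent weights
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x_k in [0.5, -0.1, 0.8]:             # a short input window
    h, c = lstm_step(x_k, h, c, W, U, b)
```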
Long short-term memory and gated recurrent units
Gated recurrent units
[Figure: In an LSTM, the input at each step interacts with separate short-memory terms and long-memory terms. In a GRU, the long-memory and short-memory terms are merged into a single state.]
21/25
Long short-term memory and gated recurrent units
Gated recurrent units
[Figure: The GRU unit. The input x[i] and the previous cell state feed a reset gate (logistic) and an update gate (logistic); the reset-gated state and the input produce the candidate hidden state (tanh), which the update gate blends with the previous state to form the new state, from x[0], x[1], ..., x[LW] to y[0].]
22/25
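One GRU step, in the same toy style (all weight names are our own; note the single state vector in place of the LSTM's separate hidden and cell states):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step (illustrative names)."""
    x = np.atleast_1d(x)
    z = sigmoid(Wz @ x + Uz @ h)              # update gate
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate hidden state
    return (1.0 - z) * h + z * h_tilde        # blend old state and candidate

rng = np.random.default_rng(5)
n = 3                                         # number of GRU units
Wz, Wr, Wh = (rng.normal(size=(n, 1)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(size=(n, n)) for _ in range(3))
h_new = np.zeros(n)
for x_k in [0.5, -0.1, 0.8]:                  # a short input window
    h_new = gru_step(x_k, h_new, Wz, Uz, Wr, Ur, Wh, Uh)
```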
Table of Contents
▶ Introduction to RNNs
▶ Basic RNNs
▶ Long short-term memory and gated recurrent units
▶ Bibliography
23/25
Bibliography
1. Goodfellow, I. (2016). Deep learning (Vol. 196). MIT Press.
2. Alpaydin, E. (2021). Machine learning. MIT Press.
3. [Link]
4. [Link]
5. [Link]
6. [Link]
24/25
Exercise
Reply to and justify the following questions for RNNs, LSTMs, and GRUs:
1. Which are the model parameters of each of the three architectures?
2. Which are the model hyperparameters of each of the three architectures?
3. Briefly compare them in terms of complexity.
Tip: for the sake of clarity, I recommend building a table to summarize the information for each question.
25/25