Chapter 15: RNN
Tsz-Chiu Au
[email protected]
Recurrent Neurons
• At each time step t (also called a frame), a recurrent neuron receives
the inputs x(t) as well as its own output from the previous time step, y(t–1).
» Since there is no previous output at the first time step, it is generally set to 0.
A Layer of Recurrent Neurons
• At each time step t, every neuron receives both the input vector x(t) and
the output vector from the previous time step y(t–1).
• Each recurrent neuron has two sets of weights: Wx for the inputs x(t) and
Wy for the outputs of the previous time step, y(t–1).
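• For a single instance, the output of the whole recurrent layer at time step t can be
written as follows (φ is the activation function and b the bias vector):

$$
\mathbf{y}_{(t)} = \phi\big(\mathbf{W}_x^\top \mathbf{x}_{(t)} + \mathbf{W}_y^\top \mathbf{y}_{(t-1)} + \mathbf{b}\big)
$$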
Memory Cells
• A recurrent neuron has memory because its output is a function of all the
inputs from previous time steps.
• A part of a neural network that preserves some state across time steps is
called a memory cell (or simply a cell).
• A single recurrent neuron, or a layer of recurrent neurons, is a very basic
cell, capable of learning only short patterns.
» To learn longer patterns, a more powerful type of cell is needed.
• A cell’s state at time step t, denoted h(t) (the “h” stands for “hidden”), is a
function of some inputs at that time step and its state at the previous time
step: h(t) = f(h(t–1), x(t)).
• The output at time step t, denoted y(t), is also a function of the previous
state and the current inputs.
Input and Output Sequences
• Sequence-to-sequence network
» E.g., predicting time series such as
stock prices
• Sequence-to-vector network
» E.g., feed the network a sequence of
words corresponding to a movie
review and output a sentiment
score
• Vector-to-sequence network
» E.g., the input could be an image,
and the output could be a caption
for that image.
• Encoder–Decoder
» E.g., translating a sentence from one
language to another.
» Feed the network a sentence in one
language, the encoder would
convert this sentence into a single
vector representation, and then the
decoder would decode this vector
into a sentence in another language.
Training RNNs
• Backpropagation through time (BPTT)
» First, forward pass through the unrolled network
» Second, the output sequence is evaluated using a cost function C(Y(0), Y(1), ..., Y(T))
§ The cost function can ignore some outputs
» Third, the gradients of that cost function are then propagated backward through the unrolled
network
» Fourth, the model parameters are updated using the gradients computed during BPTT.
§ Since the same parameters W and b are used at each time step, backpropagation will do
the right thing and sum over all time steps.
Forecasting a Time Series
• A time series is a sequence of data, one per time step.
» A univariate time series has a single value per time step.
» A multivariate time series has multiple values per time step.
• Forecasting is the task of predicting future values.
» E.g., forecast the value at the next time step (represented by the X) in the following
graphs
• To generate time series for our experiments, we use a function that returns a NumPy
array of shape [batch size, time steps, 1], where each series is the sum of two sine
waves of fixed amplitudes but random frequencies and phases, plus a bit of noise
(a sketch of such a generator follows this list).
» In general, the input features are represented as 3D arrays of shape [batch size, time
steps, dimensionality], where dimensionality is 1 for univariate time series and more for
multivariate time series.
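• A minimal sketch of such a generator (the name generate_time_series and the exact
amplitudes and frequency ranges are illustrative assumptions):

    import numpy as np

    def generate_time_series(batch_size, n_steps):  # illustrative name
        # random frequencies and phase offsets for the two sine waves
        freq1, freq2, offsets1, offsets2 = np.random.rand(4, batch_size, 1)
        time = np.linspace(0, 1, n_steps)
        series = 0.5 * np.sin((time - offsets1) * (freq1 * 10 + 10))   # wave 1
        series += 0.2 * np.sin((time - offsets2) * (freq2 * 20 + 20))  # + wave 2
        series += 0.1 * (np.random.rand(batch_size, n_steps) - 0.5)    # + noise
        return series[..., np.newaxis].astype(np.float32)  # shape [batch, steps, 1]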
• Let’s create a training set, a validation set, and a test set.
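» For example, using the generator sketched above (each input window has 50 time steps
and the target is the value at the next step):

    # 10,000 series of 51 steps: the first 50 are the inputs, the last is the target
    n_steps = 50
    series = generate_time_series(10000, n_steps + 1)
    X_train, y_train = series[:7000, :n_steps], series[:7000, -1]
    X_valid, y_valid = series[7000:9000, :n_steps], series[7000:9000, -1]
    X_test,  y_test  = series[9000:, :n_steps],  series[9000:, -1]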
Baseline Metrics
• It is a good idea to compare our RNN models against some simple baselines, to make
sure they actually perform better than trivial approaches.
• Baseline 1: naive forecasting
» Use the last value in a series to predict the next value.
» It gives a mean squared error of about 0.020 in our previous example.
• Baseline 2: a simple linear model (flatten each input window and use a single Dense
neuron), sketched below.
» If we compile this model using the MSE loss and the default Adam optimizer, then fit it
on the training set for 20 epochs and evaluate it on the validation set, we get an MSE of
about 0.004.
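• A minimal sketch of this linear baseline, assuming the 50-step windows created earlier:

    from tensorflow import keras

    # Baseline 2 (sketch): flatten each 50-step window and fit one linear unit
    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=[50, 1]),
        keras.layers.Dense(1)
    ])
    model.compile(loss="mse", optimizer="adam")
    history = model.fit(X_train, y_train, epochs=20,
                        validation_data=(X_valid, y_valid))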
Implementing a Simple RNN
• Let’s build a simple RNN, then stack several recurrent layers to form a deep RNN, and
compare them with the baselines.
• Note that you must set return_sequences=True for all recurrent layers
except the last one.
• If you compile, fit, and evaluate the deep RNN, you will find that it reaches an
MSE of about 0.003 (i.e., better than the linear model).
• The last layer is too simple to serve as the output layer, so we can replace it with a
Dense layer (see the sketch below).
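• A minimal sketch of a deep RNN with a Dense output layer (layer sizes are illustrative):

    from tensorflow import keras

    # All recurrent layers except the last return full sequences;
    # the output layer is a Dense layer predicting a single value.
    model = keras.models.Sequential([
        keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
        keras.layers.SimpleRNN(20),
        keras.layers.Dense(1)
    ])
    model.compile(loss="mse", optimizer="adam")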
Forecasting Several Time Steps Ahead
• To predict not just the value at the next time step but also the next 10 values
» One simple way is to use the trained model to predict the next value, then add that value
to the inputs, and use the model again to predict the following value, and so on.
• We can instead train an RNN that predicts all 10 next values at once, and go further by
turning it into a sequence-to-sequence model that predicts the next 10 values at each
and every time step (sketched below).
» With this sequence-to-sequence model, we get a validation MSE of about 0.006, which is
25% better than the previous model.
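• A minimal sketch of the sequence-to-sequence model (an assumed setup in which the
targets Y_train have shape [batch size, time steps, 10]):

    from tensorflow import keras

    # TimeDistributed applies the Dense(10) layer at every time step,
    # so the model outputs the next 10 values at each step.
    model = keras.models.Sequential([
        keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1]),
        keras.layers.SimpleRNN(20, return_sequences=True),
        keras.layers.TimeDistributed(keras.layers.Dense(10))
    ])
    model.compile(loss="mse", optimizer="adam")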
Unstable Gradients Problem
• To deal with the unstable gradients problem, we can reuse the same tricks
for deep nets:
» good parameter initialization, faster optimizers, dropout, and so on.
• However, unlike deep nets such as CNNs, we should not use nonsaturating
activation functions (e.g., ReLU) for RNNs.
» They may actually make the RNN even more unstable during training.
§ Since the same weights are used at every time step, a small increase in the outputs at
one time step can keep growing and eventually cause the outputs to explode.
» Hence, use a saturating activation function like the hyperbolic tangent (the default).
• The gradients themselves can explode too.
» If you notice that training is unstable, you may want to monitor the size of the
gradients (e.g., using TensorBoard)
» Use Gradient Clipping when needed.
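• For example, a minimal sketch of gradient clipping with a Keras optimizer (the model
and clipping value are illustrative):

    from tensorflow import keras

    # Clip each gradient component to [-1.0, 1.0] during training;
    # clipnorm=1.0 would instead clip the whole gradient vector by its norm.
    model = keras.models.Sequential([keras.layers.SimpleRNN(1, input_shape=[None, 1])])
    optimizer = keras.optimizers.SGD(learning_rate=0.01, clipvalue=1.0)
    model.compile(loss="mse", optimizer=optimizer)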
Layer Normalization
• Batch Normalization (BN) cannot be used as efficiently with RNNs as with
deep feedforward nets.
» In fact, you cannot use BN between time steps, only between recurrent layers.
» Applying BN between recurrent layers (i.e., vertically in Figure 15-7) was found to be
only slightly better than nothing, and it did not help within recurrent layers (i.e.,
horizontally).
• Layer Normalization: instead of normalizing across the batch dimension, it
normalizes across the features dimension.
» Like BN, Layer Normalization learns a scale and an offset parameter for each input.
» In an RNN, it is typically used right after the linear combination of the inputs and the hidden
states.
• One advantage is that it can compute the required statistics on the fly, at each
time step, independently for each instance.
» This also means that it behaves the same way during training and testing (as opposed to BN).
» It does not need to use exponential moving averages to estimate the feature statistics across
all instances in the training set.
Implementation of Layer Normalization
• Use tf.keras to implement Layer Normalization within a simple memory cell.
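• A minimal sketch of such a custom cell (the class name LNSimpleRNNCell and layer
sizes are illustrative):

    from tensorflow import keras

    # A custom cell that applies Layer Normalization right after the linear
    # combination of the inputs and the hidden state, then the activation.
    class LNSimpleRNNCell(keras.layers.Layer):
        def __init__(self, units, activation="tanh", **kwargs):
            super().__init__(**kwargs)
            self.state_size = units
            self.output_size = units
            # no activation here: normalize first, then activate
            self.simple_rnn_cell = keras.layers.SimpleRNNCell(units, activation=None)
            self.layer_norm = keras.layers.LayerNormalization()
            self.activation = keras.activations.get(activation)

        def call(self, inputs, states):
            outputs, new_states = self.simple_rnn_cell(inputs, states)
            norm_outputs = self.activation(self.layer_norm(outputs))
            return norm_outputs, [norm_outputs]

    # Wrap the custom cell in a generic keras.layers.RNN layer to use it in a model.
    model = keras.models.Sequential([
        keras.layers.RNN(LNSimpleRNNCell(20), return_sequences=True,
                         input_shape=[None, 1]),
        keras.layers.RNN(LNSimpleRNNCell(20)),
        keras.layers.Dense(1)
    ])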
• To add dropout, all recurrent layers (except for keras.layers.RNN) and all cells
provided by Keras have a dropout hyperparameter and a recurrent_dropout
hyperparameter.
» The former defines the dropout rate to apply to the inputs (at each time step), and the latter
defines the dropout rate for the hidden states (also at each time step).
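• For example (a sketch; the dropout rates and layer sizes are illustrative):

    from tensorflow import keras

    # dropout applies to the inputs, recurrent_dropout to the hidden states,
    # both at each time step.
    model = keras.models.Sequential([
        keras.layers.SimpleRNN(20, return_sequences=True, input_shape=[None, 1],
                               dropout=0.2, recurrent_dropout=0.2),
        keras.layers.SimpleRNN(20, dropout=0.2, recurrent_dropout=0.2),
        keras.layers.Dense(1)
    ])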
Long Short-Term Memory
• In RNNs, some information is lost at each time step.
» After a while, the RNN’s state contains virtually no trace of the first inputs.
• To tackle this problem, various types of cells with long-term memory have
been introduced.
» They have proven so successful that the basic cells are not used much anymore.
• The most popular long-term memory cell is the Long Short-Term Memory
(LSTM) cell.
• Two ways to add LSTM layers to a model in Keras:
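» A sketch of both options (layer sizes are illustrative): use the keras.layers.LSTM
layer directly, or wrap a keras.layers.LSTMCell in a generic keras.layers.RNN layer.

    from tensorflow import keras

    # Option 1: use the LSTM layer directly
    # (it uses an optimized implementation when running on a GPU).
    model = keras.models.Sequential([
        keras.layers.LSTM(20, return_sequences=True, input_shape=[None, 1]),
        keras.layers.LSTM(20),
        keras.layers.Dense(1)
    ])

    # Option 2: wrap an LSTMCell in a generic keras.layers.RNN layer.
    model = keras.models.Sequential([
        keras.layers.RNN(keras.layers.LSTMCell(20), return_sequences=True,
                         input_shape=[None, 1]),
        keras.layers.RNN(keras.layers.LSTMCell(20)),
        keras.layers.Dense(1)
    ])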
• Wxi, Wxf, Wxo, Wxg are the weight matrices of each of the four layers for
their connection to the input vector x(t).
• Whi, Whf, Who, and Whg are the weight matrices of each of the four layers
for their connection to the previous short-term state h(t–1).
• bi, bf, bo, and bg are the bias terms for each of the four layers. Note that
TensorFlow initializes bf to a vector full of 1s instead of 0s. This prevents the cell
from forgetting everything at the beginning of training.
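• For reference, these parameters appear in the standard LSTM computations (σ is the
logistic function, ⊗ denotes element-wise multiplication):

$$
\begin{aligned}
\mathbf{i}_{(t)} &= \sigma\big(\mathbf{W}_{xi}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hi}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_i\big) \\
\mathbf{f}_{(t)} &= \sigma\big(\mathbf{W}_{xf}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hf}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_f\big) \\
\mathbf{o}_{(t)} &= \sigma\big(\mathbf{W}_{xo}^\top \mathbf{x}_{(t)} + \mathbf{W}_{ho}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_o\big) \\
\mathbf{g}_{(t)} &= \tanh\big(\mathbf{W}_{xg}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hg}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_g\big) \\
\mathbf{c}_{(t)} &= \mathbf{f}_{(t)} \otimes \mathbf{c}_{(t-1)} + \mathbf{i}_{(t)} \otimes \mathbf{g}_{(t)} \\
\mathbf{y}_{(t)} &= \mathbf{h}_{(t)} = \mathbf{o}_{(t)} \otimes \tanh\big(\mathbf{c}_{(t)}\big)
\end{aligned}
$$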
Peephole Connections
• In a regular LSTM cell, the gate controllers can look only at the input
x(t) and the previous short-term state h(t–1).
• An LSTM variant with extra connections called peephole connections
» The previous long-term state c(t–1) is added as an input to the controllers of the forget
gate and the input gate
» The current long-term state c(t) is added as input to the controller of the output gate.
• Peephole connections often improve performance, but not always.
• Keras offers an experimental implementation of LSTM cells with
peephole connections
» tf.keras.experimental.PeepholeLSTMCell
» You can create a keras.layers.RNN layer and pass a PeepholeLSTMCell to its
constructor.
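• For example (a sketch; layer sizes are illustrative):

    import tensorflow as tf
    from tensorflow import keras

    # Wrap the experimental PeepholeLSTMCell in a generic keras.layers.RNN layer.
    model = keras.models.Sequential([
        keras.layers.RNN(tf.keras.experimental.PeepholeLSTMCell(20),
                         return_sequences=True, input_shape=[None, 1]),
        keras.layers.RNN(tf.keras.experimental.PeepholeLSTMCell(20)),
        keras.layers.Dense(1)
    ])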
Gated Recurrent Unit (GRU) Cell
• The GRU cell is a simplified version of the LSTM cell, and it seems to
perform just as well
» Both state vectors are merged into a single vector h(t).
» A single gate controller z(t) controls both the forget gate and the input gate.
§ If the gate controller outputs a 1, the forget gate is open (= 1) and the input gate is
closed (1 – 1 = 0).
§ If it outputs a 0, the opposite happens.
» There is no output gate; the full state vector is output at every time step.
§ However, there is a new gate controller r(t) that controls which part of the previous
state will be shown to the main layer (g(t)).
Equations for GRU Cells
• The equations for GRU cells:
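» Standard GRU formulation (σ is the logistic function, ⊗ element-wise multiplication):

$$
\begin{aligned}
\mathbf{z}_{(t)} &= \sigma\big(\mathbf{W}_{xz}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hz}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_z\big) \\
\mathbf{r}_{(t)} &= \sigma\big(\mathbf{W}_{xr}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hr}^\top \mathbf{h}_{(t-1)} + \mathbf{b}_r\big) \\
\mathbf{g}_{(t)} &= \tanh\big(\mathbf{W}_{xg}^\top \mathbf{x}_{(t)} + \mathbf{W}_{hg}^\top (\mathbf{r}_{(t)} \otimes \mathbf{h}_{(t-1)}) + \mathbf{b}_g\big) \\
\mathbf{h}_{(t)} &= \mathbf{z}_{(t)} \otimes \mathbf{h}_{(t-1)} + (1 - \mathbf{z}_{(t)}) \otimes \mathbf{g}_{(t)}
\end{aligned}
$$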
Using 1D Convolutional Layers to Process Sequences
• A 1D convolutional layer (e.g., with a kernel size of 4 and a stride of 2) can be placed
before the GRU layers to downsample the input sequence.
» By shortening the sequences, the convolutional layer may help the GRU layers
detect longer patterns (a sketch of such a model follows).
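• A minimal sketch of such a model (filter counts and GRU sizes are illustrative; in a
sequence-to-sequence setting, the targets must be downsampled to match the
convolution's stride):

    from tensorflow import keras

    # A 1D conv layer downsamples the sequence (kernel_size=4, strides=2)
    # before two GRU layers that predict the next 10 values at each step.
    model = keras.models.Sequential([
        keras.layers.Conv1D(filters=20, kernel_size=4, strides=2, padding="valid",
                            input_shape=[None, 1]),
        keras.layers.GRU(20, return_sequences=True),
        keras.layers.GRU(20, return_sequences=True),
        keras.layers.TimeDistributed(keras.layers.Dense(10))
    ])
    model.compile(loss="mse", optimizer="adam")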
WaveNet
• It is possible to use only 1D convolutional layers and drop the recurrent
layers entirely.
• WaveNet stacks 1D convolutional layers, doubling the dilation rate (how
spread apart each neuron’s inputs are) at every layer.
» The lower layers learn short-term patterns, while the higher layers learn long-term
patterns.
» Thanks to the doubling dilation rate, the network can process extremely large sequences
very efficiently.
WaveNet (cont.)
• Here is how to implement a simplified WaveNet in Keras:
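• A sketch of such a model (filter counts are illustrative):

    from tensorflow import keras

    # A simplified WaveNet: causal 1D convolutions with dilation rates doubling
    # at every layer (1, 2, 4, 8, then again 1, 2, 4, 8).
    model = keras.models.Sequential()
    model.add(keras.layers.InputLayer(input_shape=[None, 1]))
    for rate in (1, 2, 4, 8) * 2:
        model.add(keras.layers.Conv1D(filters=20, kernel_size=2, padding="causal",
                                      activation="relu", dilation_rate=rate))
    # output layer: 10 filters of size 1, no activation function
    model.add(keras.layers.Conv1D(filters=10, kernel_size=1))
    model.compile(loss="mse", optimizer="adam")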
• This Sequential model starts with an explicit input layer, then continues
with a 1D convolutional layer using "causal" padding.
» This ensures that the convolutional layer does not peek into the future when making
predictions.
• Add similar pairs of layers using growing dilation rates: 1, 2, 4, 8, and again
1, 2, 4, 8.
• Finally, we add the output layer: a convolutional layer with 10 filters of size
1 and without any activation function.
• The GRU model preceded by a 1D convolutional layer and the WaveNet model offer the
best performance so far in forecasting our time series.