
Tutorial Recurrent Neural Networks

1. The document discusses mounting Google Drive to access files for machine learning tasks in Google Colab notebooks. 2. It then presents a stock market trend prediction problem using LSTM neural networks, including a dataset description. 3. The preprocessing steps covered are feature scaling without data leakage, creating time-series inputs using sliding windows, and reshaping the input data.

Uploaded by

rajitkurup
Copyright
© All Rights Reserved

HANDS ON: RNN

Mount Drive and Set Working Directory


Google Drive is a free cloud-based storage service that enables users to store and access files
online. Before your computer can use any kind of storage device (such as a hard drive, CD-ROM,
or network share), you or your operating system must make it accessible through the computer's
file system. This process is called mounting. You can only access files on mounted media.

# Mounting the drive
from google.colab import drive
drive.mount("/content/gdrive")

# Setting the current working directory and the data path
# (%cd is the explicit IPython magic; a bare "cd" only works while automagic is enabled)
%cd "/content/gdrive/My Drive/Colab Notebooks/MLDA/RNN/"
data_path = "/content/gdrive/My Drive/Colab Notebooks/MLDA/RNN/"
DATASET
Stock Market Trend Prediction
Share market trend analysis, or equity market trend analysis, is the process of analysing
current trends in order to predict future trends. A trend is the general direction in which a
stock is moving.

Trend analysis is based on the idea that what has happened in the past gives traders an idea of
what will happen in the future.

Direction: Trends can move in three directions: up, down, and sideways. If you study prices
over a long period of time, you will be able to see all three types of trends on the same chart.

There is no specified duration for a movement to be considered a trend; however, the longer
the trend moves (either upward or downward), the more noteworthy the trend becomes.

LSTMs are very powerful in time-series analysis because they’re able to store past information
or remember information through time. This is important in our case because the previous
market trend of a stock is crucial in predicting its future market trend.
Description
This data set contains 1258 observations and a total of 6 columns: date, volume, and
OHLC (Open, High, Low, Close).

In stock trading, the high and low refer to the maximum and minimum prices in a given
time period. Open and close are the prices at which a stock began and ended trading in the
same period. Volume is the total amount of trading activity, that is, the total number of
shares actually traded (bought and sold) during the trading day or another specified period.

The stacked-LSTM model is trained using 5 years of Google stock data
(“Google_Stock_Price_Train”). For testing, a new dataset (“Google_Stock_Price_Test”) is
fetched for the period from 03.01.2017 to 31.01.2017.
DATA PREPROCESSING
Feature Scaling
The goal of feature scaling is to put all features on approximately the same scale, so that
each feature carries comparable weight and the data is easier for most ML algorithms to process.

Data leakage refers to a mistake made by the creator of a machine learning model in which
they accidentally share information between the test and training datasets. Typically, when
splitting a dataset into testing and training sets, the goal is to ensure that no data is shared
between the two.

Leakage means that information is revealed to the model that gives it an unrealistic
advantage to make better predictions. This could happen when test data is leaked into the
training set.

Feature scaling across instances should be done after splitting the data between training and
test set, using only the data from the training set. This is because the test set plays the role of
fresh unseen data, so it's not supposed to be accessible at the training stage.
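
The rule above can be illustrated with a small sketch. It assumes a chronological split and uses made-up prices. The helper names `fit_min_max` and `apply_min_max` are hypothetical. They apply the same min-max formula that a scaler such as MinMaxScaler uses, but are written by hand in NumPy so that it is visible which data the minimum and maximum come from.

```python
import numpy as np

def fit_min_max(train, feature_range=(0.0, 1.0)):
    """Learn the per-column minimum and maximum from the training data only."""
    low, high = feature_range
    return train.min(axis=0), train.max(axis=0), low, high

def apply_min_max(data, params):
    """Scale any data with statistics that were learned on the training set."""
    data_min, data_max, low, high = params
    return (data - data_min) / (data_max - data_min) * (high - low) + low

# Made-up closing prices, oldest first (illustrative values, not the Google data)
prices = np.array([[10.0], [12.0], [11.0], [15.0], [20.0], [18.0]])

# Chronological split: the test set is the most recent part of the series
train, test = prices[:4], prices[4:]

params = fit_min_max(train)                 # statistics come from train only
train_scaled = apply_min_max(train, params)
test_scaled = apply_min_max(test, params)   # reused, never refitted on test

print(train_scaled.ravel())
print(test_scaled.ravel())
```

Because the test prices never influence the minimum and maximum, scaled test values may fall outside the target range. That is expected, and it is the price of keeping the test set unseen.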
Data preprocessing is a data mining technique used to transform raw data into a useful and
efficient format. Real-world data is often incomplete, inconsistent, lacking in certain
behaviors or trends, and likely to contain many errors.

Data Preprocessing
1. Importing the libraries
2. Importing the training set
3. Feature scaling (Standardization & Normalization)
4. Creating time-series inputs with corresponding output value
5. Reshaping the input

# 1. Importing the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# 2. Importing the training set
dataset = pd.read_csv(filepath + "file_name.csv")

# 3. Feature scaling with sklearn.preprocessing (feature extraction and normalization)
from sklearn.preprocessing import MinMaxScaler
# low and high are placeholders for the bounds of the target range
sc = MinMaxScaler(feature_range = (low, high))
[Figure: sliding window of size 3, sliding with a stride of 1, over the sequence 1, 2, 3, 4, 5, 6.
Each window of three values is paired with the value that follows it:
(1, 2, 3) -> 4, (2, 3, 4) -> 5, (3, 4, 5) -> 6.]

Keras Documentation: https://keras.io/api/
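
Steps 4 and 5 of the list (creating time-series inputs with a sliding window of size 3 and stride 1, then reshaping the input) might look like the following sketch. The helper name `make_windows` and the toy series are assumptions for illustration. The final shape follows the (samples, timesteps, features) layout that Keras recurrent layers expect.

```python
import numpy as np

def make_windows(series, timesteps):
    """Pair each run of `timesteps` consecutive values with the value that follows it."""
    X, y = [], []
    for i in range(timesteps, len(series)):
        X.append(series[i - timesteps:i])  # input window
        y.append(series[i])                # corresponding output value
    return np.array(X), np.array(y)

# Toy series standing in for the scaled prices
series = np.array([1, 2, 3, 4, 5, 6], dtype=float)

# Step 4: sliding window of size 3, stride 1
X_train, y_train = make_windows(series, timesteps=3)

# Step 5: reshape to (samples, timesteps, features); here there is one feature
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)

print(X_train[:, :, 0])
print(y_train)
print(X_train.shape)
```

With a real price series only the window length changes. The making-predictions step later in this tutorial slices the last 60 rows of the training set, which suggests a window of 60 timesteps.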


BUILDING THE MODEL
Building the RNN
1. Importing the modules
2. Initializing the model
3. Adding the layer

# 1. Importing the modules
from keras.models import Sequential
from keras.layers import SimpleRNN, LSTM, GRU, Bidirectional

# 2. Initializing the model
model = Sequential()

# 3. Adding the layer: choose one recurrent layer type
# (units is a placeholder for the number of hidden units)
model.add(SimpleRNN(units))
model.add(LSTM(units))
model.add(GRU(units))
# Bidirectional wraps a recurrent layer instance, e.g. layer = LSTM(units)
model.add(Bidirectional(layer))


Understanding return_state and return_sequences

LSTM(units = 100)
  return_sequences = False, return_state = False
  Returns ht, the hidden state at the last timestep only.

LSTM(units = 200, return_sequences = True)
  return_sequences = True, return_state = False
  Returns h1, h2, ..., ht, the hidden state at every timestep.

LSTM(units = 100, return_state = True)
  return_sequences = False, return_state = True
  Returns ht as the output, plus the final hidden state ht and the final cell state ct.

LSTM(units = 200, return_sequences = True, return_state = True)
  return_sequences = True, return_state = True
  Returns h1, h2, ..., ht as the output, plus the final hidden state ht and the final cell state ct.
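
To make the four combinations concrete, here is a minimal NumPy sketch of a single LSTM layer's forward pass. It is not the Keras implementation: the function `lstm_forward`, the random weights, the gate ordering inside the weight matrices, and the missing batch dimension are all simplifications chosen for illustration. Only the meaning of the two flags mirrors the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(x, W, U, b, return_sequences=False, return_state=False):
    """Run one LSTM layer over x of shape (timesteps, features).

    W: (features, 4 * units), U: (units, 4 * units), b: (4 * units,).
    The gate order (input, forget, candidate, output) is this sketch's own choice.
    """
    units = U.shape[0]
    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    hidden_states = []
    for x_t in x:
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:units])                # input gate
        f = sigmoid(z[units:2 * units])       # forget gate
        g = np.tanh(z[2 * units:3 * units])   # candidate cell values
        o = sigmoid(z[3 * units:])            # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
        hidden_states.append(h)
    # return_sequences: every hidden state h1..ht instead of only the last one
    output = np.stack(hidden_states) if return_sequences else h
    # return_state: additionally hand back the final hidden and cell states
    if return_state:
        return output, h, c
    return output

rng = np.random.default_rng(0)
timesteps, features, units = 5, 1, 4
x = rng.normal(size=(timesteps, features))
W = rng.normal(size=(features, 4 * units))
U = rng.normal(size=(units, 4 * units))
b = np.zeros(4 * units)

last_h = lstm_forward(x, W, U, b)
all_h = lstm_forward(x, W, U, b, return_sequences=True)
out, state_h, state_c = lstm_forward(x, W, U, b, return_state=True)
seq, seq_h, seq_c = lstm_forward(x, W, U, b, return_sequences=True, return_state=True)

print(last_h.shape, all_h.shape, state_c.shape, seq.shape)
```

Without return_sequences the output and the returned hidden state are the same vector, so return_state mainly adds the cell state. With return_sequences the last row of the output equals the returned hidden state.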
Building the Stacked-LSTM
1. Importing the modules
2. Initializing the model
3. Adding the layers
4. Adding Dropout Regularization

# 1. Importing the modules
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout

# 2. Initializing the model
stacked_lstm = Sequential()

# 3. Adding the layers
stacked_lstm.add(layer)

# 4. Adding Dropout regularization
stacked_lstm.add(Dropout(rate=0.2))


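Stacked-LSTM (stacked_lstm = Sequential())

# The slide draws the stack bottom-up; in code the layers are added in this order.
# First layer: receives the input shape and passes the full sequence upward
stacked_lstm.add(LSTM(units = 50,
                      return_sequences = True,
                      input_shape = (timesteps, features)))

# Middle layers: also return the full sequence so the next LSTM gets one input per timestep
stacked_lstm.add(LSTM(units = 50,
                      return_sequences = True))

stacked_lstm.add(LSTM(units = 50,
                      return_sequences = True))

# Last LSTM layer: returns only the final hidden state
stacked_lstm.add(LSTM(units = 50))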
Compiling and Training the Stacked-LSTM
1. Importing the modules
2. Compiling the model
3. Fitting the model to the training set

# 1. Importing the modules
from keras.optimizers import Adam

# 2. Compiling the model
stacked_lstm.compile(optimizer = 'adam',
                     loss = 'mean_squared_error')

# 3. Fitting the model to the training set
stacked_lstm.fit(X_train, y_train, epochs = 100, batch_size = 32)


Making the predictions
1. Getting the real stock prices of 2017
2. Getting the predicted stock prices of 2017
   a. Feature scale the test set using the same MinMaxScaler instance used for scaling the
      training set
   b. Slice training set for testing
   c. Concatenate sliced training set and scaled test set
   d. Creating time-series inputs with corresponding output value
   e. Make the predictions

# 1. Getting the real stock prices of 2017
test_set = pd.read_csv(filepath + "file_name.csv")

# 2a. Scaling the test set with the scaler fitted on the training set
test_set_scaled = sc.transform(test_set)

# 2b. Slicing the last 60 rows of the scaled training set
train_set_test_inputs = training_set_scaled[len(training_set_scaled) - 60:]

# 2c. Concatenating the sliced training set and the scaled test set
test_data = np.concatenate((train_set_test_inputs, test_set_scaled))
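
Steps b through d can be sketched in NumPy as follows. The two arrays are synthetic stand-ins for the scaled training and test sets. The 1258 rows come from the dataset description, while the 20 test rows and the single feature column are assumptions made for this illustration.

```python
import numpy as np

timesteps = 60  # number of past rows sliced from the training set in step b

# Synthetic stand-ins for the scaled arrays, one feature column each
training_set_scaled = np.linspace(0.0, 1.0, 1258).reshape(-1, 1)
test_set_scaled = np.linspace(1.0, 1.1, 20).reshape(-1, 1)  # 20 rows is an assumption

# b. Slice the last `timesteps` rows of the training set: they form the
#    history needed to predict the first test day
train_set_test_inputs = training_set_scaled[len(training_set_scaled) - timesteps:]

# c. Concatenate the sliced training set and the scaled test set
test_data = np.concatenate((train_set_test_inputs, test_set_scaled))

# d. Create one window of `timesteps` past values for every test day
X_test = np.array([test_data[i - timesteps:i, 0]
                   for i in range(timesteps, len(test_data))])

# Reshape to (samples, timesteps, features), the layout used for training
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

print(test_data.shape, X_test.shape)
```

For step e, the windows would typically be passed to `stacked_lstm.predict(X_test)`, and the scaled predictions mapped back to prices with the same scaler's `inverse_transform` before plotting. The slides do not show that part, so treat it as one plausible completion.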
Visualising the Results

plt.plot(real_stock_price, color = 'red', label = 'Real Google Stock Price')
plt.plot(predicted_stock_price, color = 'blue', label = 'Predicted Google Stock Price')
plt.title('Google Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Google Stock Price')
plt.legend()
plt.show()
