
Deep Learning for Financial Applications: A Survey

Ahmet Murat Ozbayoglu, Mehmet Ugur Gudelek, Omer Berat Sezer

Department of Computer Engineering, TOBB University of Economics and Technology, Ankara, Turkey

arXiv:2002.05786v1 [q-fin.ST] 9 Feb 2020

Abstract
Computational intelligence in finance has been a very popular topic for both academia and
the financial industry in the last few decades, and numerous studies resulting in a variety of
models have been published. Meanwhile, within the Machine Learning (ML) field, Deep
Learning (DL) has recently started getting a lot of attention, mostly due to its outperformance
of classical models. Many different implementations of DL exist today, and the broad interest
is continuing. Finance is one particular area where DL models have started getting traction;
however, the field is still wide open, and many research opportunities remain. In this paper,
we try to provide a state-of-the-art snapshot of the DL models developed for financial
applications. We not only categorize the works according to their intended subfield in
finance but also analyze them based on their DL models. In addition, we aim at identifying
possible future implementations and highlight the pathway for ongoing research within
the field.
Keywords: deep learning, finance, computational intelligence, machine learning, financial
applications, algorithmic trading, portfolio management, risk assessment, fraud detection

1. Introduction
Stock market forecasting, algorithmic trading, credit risk assessment, portfolio allocation,
asset pricing and derivatives markets are among the areas where ML researchers have focused
on developing models that can provide real-time working solutions for the financial industry.
Hence, a lot of publications and implementations exist in the literature.
However, within the ML field, DL is an emerging area with a rising interest every year.
As a result, an increasing number of DL models for finance started appearing in conferences
and journals. Our focus in this paper is to present different implementations of the developed
financial DL models in such a way that the researchers and practitioners that are interested
in the topic can decide which path they should take.
In this paper, we tried to provide answers to the following research questions:

• What financial application areas are of interest to the DL community?

• How mature is the existing research in each of these application areas?

• What are the areas that have promising potentials from an academic/industrial research perspective?
Preprint submitted to Applied Soft Computing February 17, 2020
• Which DL models are preferred (and more successful) in different applications?

• How do DL models fare against traditional soft computing / ML techniques?

• What is the future direction for DL research in Finance?

Our focus was solely on DL implementations for financial applications. A substantial
portion of the computational intelligence for finance research is devoted to financial time
series forecasting. However, we preferred to cover those studies in a separate survey
paper [1] in order to pinpoint other, less covered application areas. Meanwhile, we
decided to include algorithmic trading studies with DL based trading strategies which
may or may not have an embedded time series forecasting component.
For our search methodology, we surveyed and carefully reviewed the studies that came
to our attention from the following sources: ScienceDirect, ACM Digital Library, Google
Scholar, arXiv.org, ResearchGate, and Google keyword searches for DL and finance.
The range of our survey spanned not only journals and conferences, but also Masters
and PhD theses, book chapters, arXiv papers and noteworthy technical papers that came
up in Google searches. Furthermore, we only chose the articles that were written in English.
It is worth mentioning that we encountered a few studies that were written in a different
language but had English abstracts. However, for overall consistency, we decided not to
include those studies in our survey.
Most of the papers in this survey used the term "deep learning" in their model description
and were published in the last 5 years. However, we also included some older papers
that implemented deep learning models even though they were not called "deep learning"
models at their time of publication. Some examples of such models include Recurrent
Neural Networks (RNNs) and Jordan-Elman networks.
To the best of our knowledge, this is the first comprehensive "deep learning for financial
applications" survey paper. As will be introduced in the next section, a lot of ML surveys
exist for different areas of finance; however, no study has concentrated on DL implementations.
We genuinely believe our study will highlight the major advancements in the field
and provide a roadmap for researchers who would like to develop DL models
for different financial application areas.
The rest of the paper is structured as follows. After this brief introduction, in Section 2,
the existing surveys that are focused on ML and soft computing studies for financial ap-
plications are presented. In Section 3, we will provide the basic working DL models that
are used in finance, i.e. Convolutional Neural Network (CNN), Long-Short Term Memory
(LSTM), etc. Section 4 will focus on the implementation areas of the DL models in finance.
Some of these include algorithmic trading, credit risk assessment, portfolio allocation, asset
pricing, fraud detection and derivatives market. After briefly stating the problem definition
in each subsection, DL implementations of each associated problem will be given.
In Section 5, these studies will be compared and some overall statistical results will be
presented, including histograms of the yearly distribution of different subfields, models,
publication types, etc. These statistics will not only demonstrate the current state of the
field but also show which areas are mature, which areas still have opportunities and
which areas are getting accelerated attention. Section 6 will include discussions about what
has been done in the field so far and where the industry is heading, together with the
achievements and expectations of both academia and the industry. Also, open
areas and recommended research topics will be mentioned. Finally, in Section 7, we will
summarize the findings and conclude.

2. Machine Learning in Finance


Finance has always been one of the most studied application areas for ML, starting
as early as 40 years ago. So far, thousands of research papers were published in various
fields within finance, and the overall interest does not seem to diminish anytime soon. Even
though this survey paper is solely focused on DL implementations, we wanted to provide the
audience with some insights about previous ML studies by citing the related surveys within
the last 20 years.
There are a number of ML surveys and books with a general perspective such that they
do not concentrate on any particular implementation area. The following survey papers fall
into that category. Bahrammirzaee et al. [2] compared Artificial Neural Networks (ANNs),
Expert Systems and Hybrid models for various financial applications. Zhang et al. [3]
reviewed the data mining techniques including Genetic Algorithm (GA), rule-based systems,
Neural Networks (NNs) preferred in different financial application areas. Similarly, Mochón
et al. [4] also provided insights about financial implementations based on soft computing
techniques like fuzzy logic, probabilistic reasoning and NNs. Even though Pulakkazhy et
al. [5] focused particularly on data mining models in banking applications, they still had a
span of several subtopics within the field. Meanwhile, Mullainathan et al. [6] studied the
ML implementations from a high level and econometric point of view. Likewise, Gai et al.
[7] reviewed the Fintech studies and implementations not only from an ML perspective but
in general. The publications in [8, 9, 10, 11] constitute some of the books that cover the
implementations of soft computing models in finance.
Meanwhile, there are some survey papers that are also not application area-specific but
rather focused on particular ML techniques. One of those soft computing techniques is
the family of Evolutionary Algorithms (EAs), i.e. GA, Particle Swarm Optimization (PSO),
etc. commonly used in financial optimization implementations like Portfolio Selection. Chen
et al. [12] wrote a book covering GAs and Genetic Programming (GP) in Computational
Finance. Later, Castillo et al. [13], Ponsich et al. [14], Aguilar-Rivera et al. [15] extensively
surveyed Multiobjective Evolutionary Algorithms (MOEAs) on portfolio optimization and
other various financial applications.
Since ANNs were quite popular among researchers, a number of survey papers were just
dedicated to them. Wong et al. [16] covered early implementations of ANNs in finance. Li
et al. [17] reviewed implementations of ANNs for stock price forecasting and some other
financial applications. Lately, Elmsili et al. [18] covered ANN applications in economics
and management research in their survey.
In addition, LeBaron [19] covered the studies focused on agent-based computational
finance. Meanwhile, Chalup et al. [20] wrote a book chapter on kernel methods in financial
applications which includes models like Principal Component Analysis (PCA) and Support
Vector Machine (SVM).
And then, there are application-specific survey papers that single out particular financial
areas; these are quite useful and informative for researchers who already know what they
are looking for. Such papers will be covered in the appropriate subsections of Section 4
during the problem descriptions. In the next section, brief working structures of the DL
models used in financial applications will be given.

3. Deep Learning
Deep Learning is a particular type of ML that consists of multiple ANN layers. It provides
high-level abstraction for data modelling [21]. In the literature, different DL models exist:
Deep Multilayer Perceptron (DMLP), CNN, RNN, LSTM, Restricted Boltzmann Machines
(RBMs), Deep Belief Networks (DBNs), and Autoencoders (AEs).

3.1. Deep Multi Layer Perceptron (DMLP)


In the literature, DMLP was the first proposed ANN model of its kind. DMLP networks
consist of input, output and hidden layers, just like an ordinary Multilayer Perceptron (MLP);
however, a DMLP has more layers than an MLP. Each neuron in every layer has
input (x), weight (w) and bias (b) terms. The output of a neuron in the network is
illustrated in Equation 1. In addition, each neuron has a nonlinear activation function which
produces its output by accumulating the weighted inputs from the neurons
in the preceding layer. Sigmoid [22], hyperbolic tangent [23], Rectified Linear Unit (ReLU)
[24], leaky ReLU [25], swish [26], and softmax [27] are among the most preferred nonlinear
activation functions in the literature.
y_i = σ( Σ_i W_i x_i + b_i )    (1)

With multi-layer deep ANNs, more efficient classification and regression performances
are achieved when compared against shallow nets. The learning process of a DMLP is
implemented through backpropagation: the error at the output layer neurons is reflected
back to the neurons in the previous layers. In DMLP, the Stochastic Gradient
Descent (SGD) method is (mostly) used for the optimization of learning (to update the
weights of the connections between the layers). In Figure 1, a DMLP model, its layers, the
neurons in the layers and the weights between the neurons are shown.
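As a concrete illustration, Equation 1 applied layer by layer is all a DMLP forward pass is. The following NumPy sketch runs a forward pass through a small DMLP; the layer sizes and random weights are purely illustrative, not taken from any surveyed study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(x, W, b, activation=sigmoid):
    # Equation 1 for a whole layer: y = sigma(W x + b)
    return activation(W @ x + b)

# A tiny 3-layer DMLP: 4 inputs -> 5 hidden -> 3 hidden -> 2 outputs
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((5, 4)), np.zeros(5)),
          (rng.standard_normal((3, 5)), np.zeros(3)),
          (rng.standard_normal((2, 3)), np.zeros(2))]

x = rng.standard_normal(4)
for W, b in layers:          # each layer feeds the next, as in Figure 1's forward pass
    x = dense_forward(x, W, b)
print(x.shape)  # (2,)
```

In a real implementation the weights would of course be trained with backpropagation and SGD rather than left random.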

Figure 1: Deep Multi Layer Neural Network Forward Pass and Backpropagation [21]

3.2. Convolutional Neural Networks (CNNs)


CNN is a type of Deep Neural Network (DNN) that is mostly used for image classification
and image recognition problems. In its methodology, the whole image is scanned with
filters; in the literature, 1x1, 3x3 and 5x5 filter sizes are mostly used. Most CNN
architectures contain different types of layers: convolutional, pooling (average or
maximum) and fully connected layers. The convolutional layers are based on the
convolution operation. Figure 2 shows a generalized CNN architecture with convolutional,
subsampling (pooling) and fully connected layers.
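To make the convolution and subsampling operations concrete, here is a minimal NumPy sketch of a single convolutional layer followed by ReLU and max pooling; the 6x6 input and the averaging filter are illustrative placeholders:

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" convolution: slide the filter over the image (convolutional layer)
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    # non-overlapping max pooling (subsampling layer)
    h, w = x.shape
    trimmed = x[:h - h % size, :w - w % size]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36.0).reshape(6, 6)
kernel = np.ones((3, 3)) / 9.0                   # a 3x3 averaging filter
features = np.maximum(conv2d(image, kernel), 0)  # ReLU activation
pooled = max_pool(features)
print(pooled.shape)  # (2, 2)
```

A full CNN stacks many such filter banks and finishes with fully connected layers, as in Figure 2.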

Figure 2: Generalized Convolutional Neural Network Architecture

3.3. Recurrent Neural Network (RNN)
In the literature, RNNs have mostly been used on sequential data such as time series,
audio and speech data, and language. An RNN consists of recurrent units that are structured
consecutively. Unlike feed-forward networks, RNNs use internal memory to process the
incoming inputs, and they are used for analyzing time series data in various fields (handwriting
recognition, speech recognition, etc.).
There are different types of RNN structures: one to many, many to one, and many to many.
Generally, an RNN processes the input sequence one element at a time during its operation,
and the units in the hidden layer hold information about the history of the input in a
"state vector" [21]. RNNs can be trained using the Backpropagation Through Time (BPTT)
method; using BPTT, the gradient of the loss at any time t is propagated back to the weights
of the network at previous time steps. Training RNNs is more difficult than training
Feedforward Neural Networks (FFNNs), and the training period of RNNs takes longer.
In Figure 3, the information flow in the RNN's hidden layer is unrolled into discrete time
steps. The state of the node s at time t is shown as s_t, the input value x at time t is x_t,
and the output value o at time t is shown as o_t. The parameter values (U, W, V) are
shared across all time steps.
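The recurrence above, with shared parameters (U, W, V) at every step, can be sketched in a few lines of NumPy; dimensions and random weights are illustrative only:

```python
import numpy as np

def rnn_forward(xs, U, W, V, s0):
    # s_t = tanh(U x_t + W s_{t-1});  o_t = V s_t  (same U, W, V at every step)
    s, outputs = s0, []
    for x in xs:
        s = np.tanh(U @ x + W @ s)   # state vector carries the input history
        outputs.append(V @ s)
    return outputs, s

rng = np.random.default_rng(1)
U = rng.standard_normal((3, 2))      # input -> hidden
W = rng.standard_normal((3, 3))      # hidden -> hidden (recurrence)
V = rng.standard_normal((1, 3))      # hidden -> output
xs = [rng.standard_normal(2) for _ in range(5)]
outputs, state = rnn_forward(xs, U, W, V, np.zeros(3))
print(len(outputs), state.shape)  # 5 (3,)
```

BPTT would differentiate the loss through this exact unrolled loop to update U, W and V.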

Figure 3: RNN cell through time [21]

3.4. Long Short Term Memory (LSTM)


The LSTM network [28] is a type of DL network specifically intended for sequential
data analysis. The advantage of LSTM networks lies in the fact that both short-term
and long-term values can be remembered by the network. Therefore, LSTM networks are
mostly used by DL researchers for sequential data analysis (automatic speech recognition,
language translation, handwritten character recognition, time series data forecasting, etc.).
LSTM networks consist of LSTM units, and an LSTM unit is composed of cells having
input, output and forget gates. These three gates regulate the information flow; with these
features, each cell remembers the desired values over arbitrary time intervals. LSTM cells
combine to form layers of neural networks. Figure 4 illustrates the basic LSTM unit (σ:
sigmoid function, tanh: hyperbolic tangent function, X: multiplication, +: addition).
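A single LSTM step with the three gates described above can be sketched as follows. This is a minimal NumPy illustration of the standard gate equations; the dimensions and random weights are illustrative, not taken from any surveyed study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, Wf, Wi, Wo, Wc, bf, bi, bo, bc):
    z = np.concatenate([h_prev, x])            # previous output H_{t-1} plus input X_t
    f = sigmoid(Wf @ z + bf)                   # forget gate
    i = sigmoid(Wi @ z + bi)                   # input gate
    o = sigmoid(Wo @ z + bo)                   # output gate
    c = f * c_prev + i * np.tanh(Wc @ z + bc)  # memory of current block C_t
    h = o * np.tanh(c)                         # output of current block H_t
    return h, c

rng = np.random.default_rng(2)
n_in, n_hid = 2, 4
mk = lambda: rng.standard_normal((n_hid, n_hid + n_in))
Wf, Wi, Wo, Wc = mk(), mk(), mk(), mk()
bf = bi = bo = bc = np.zeros(n_hid)            # biases left at zero for the sketch
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in [rng.standard_normal(n_in) for _ in range(3)]:
    h, c = lstm_step(x, h, c, Wf, Wi, Wo, Wc, bf, bi, bo, bc)
print(h.shape, c.shape)  # (4,) (4,)
```

The gating is what lets the cell keep or discard information over arbitrary intervals, which is exactly the property Figure 4 depicts.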

Figure 4: Basic LSTM Unit [28]

3.5. Restricted Boltzmann Machines (RBMs)


RBM is a type of ANN model that can learn the probability distribution of its input
set [29]. RBMs are mostly used for dimensionality reduction, classification, and feature
learning. An RBM is a two-layer, bipartite, undirected graphical model consisting of a
visible and a hidden layer (Figure 5); the units within a layer are not connected to each
other. Each unit is a computational point that processes the input and makes stochastic
decisions about whether or not to transmit the input data. The inputs are multiplied by
specific weights, certain threshold values (biases) are added, and the calculated values are
passed through an activation function. In the reconstruction stage, the results at the outputs
re-enter the network as inputs and then exit from the visible layer as outputs. The
reconstructed values are compared with the previous input values, and the purpose of
training is to reduce this difference; the learning is performed over multiple passes through
the network [29]. The disadvantage of RBMs is their tricky training: "RBMs are tricky
because although there are good estimators of the log-likelihood gradient, there are no
known cheap ways of estimating the log-likelihood itself" [30].
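The reconstruct-and-compare training loop described above is commonly implemented with contrastive divergence. The NumPy sketch below shows one-step contrastive divergence (CD-1) on a single binary input vector; the layer sizes, learning rate and update loop are illustrative assumptions, not taken from any surveyed study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, b, c, lr=0.1, rng=np.random.default_rng(0)):
    # CD-1: sample hidden units, reconstruct the visible layer, resample hidden,
    # then nudge the parameters toward the data statistics (in-place updates)
    ph0 = sigmoid(v0 @ W + c)                   # P(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0) * 1.0    # stochastic hidden activations
    pv1 = sigmoid(h0 @ W.T + b)                 # reconstruction P(v=1 | h0)
    ph1 = sigmoid(pv1 @ W + c)
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
    b += lr * (v0 - pv1)                        # visible biases
    c += lr * (ph0 - ph1)                       # hidden biases
    return np.mean((v0 - pv1) ** 2)             # reconstruction error

rng = np.random.default_rng(3)
n_vis, n_hid = 6, 3
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)
v = np.array([1.0, 0, 1, 0, 1, 0])              # one binary training pattern
errors = [cd1_update(v, W, b, c) for _ in range(50)]
print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

Note that CD-1 approximates the log-likelihood gradient without ever computing the log-likelihood itself, which is exactly the difficulty the quote from [30] refers to.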

Figure 5: RBM Visible and Hidden Layers [29]

3.6. Deep Belief Networks (DBNs)


DBN is a type of ANN that consists of a stack of RBM layers. A DBN is a probabilistic
generative model built from latent variables, used for finding independent and
discriminative features in the input set through an unsupervised approach. During training,
a DBN learns to reconstruct the input set in a probabilistic way, and the layers of the
network then begin to detect discriminative features. After this learning step, supervised
learning is carried out to perform classification [31]. Figure 6 illustrates the DBN structure.

Figure 6: Deep Belief Network [29]

3.7. Autoencoders (AEs)


AE networks are commonly used in DL models; they remap the inputs (features) such
that the inputs become more representative for classification. In other words, AE networks
perform an unsupervised feature learning process. A representation of a data set is learned
by reducing its dimensionality with an AE, and in the literature AEs have been used for
feature extraction and dimensionality reduction [27, 32]. The architecture of an AE is
similar to that of an FFNN: it consists of an input layer, an output layer and one (or more)
hidden layers that connect them. In an AE, the number of nodes in the input layer and the
number of nodes in the output layer are equal, and the network has a symmetrical structure.
AEs contain two components: an encoder and a decoder.
The advantages of using an AE are dimensionality reduction and feature learning. However,
these also cause a drawback: since the AE focuses on minimizing the reconstruction
loss through a low-dimensional code, some significant relationships in the data may be
lost [33]. Figure 7 shows the basic AE structure.
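The symmetric encoder-decoder structure can be sketched as follows. This minimal NumPy example only shows the forward pass (no training loop), and the layer sizes and random weights are illustrative:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

class TinyAutoencoder:
    # symmetric structure: input(8) -> code(3) -> output(8)
    def __init__(self, n_in=8, n_code=3, seed=0):
        rng = np.random.default_rng(seed)
        self.We = rng.standard_normal((n_code, n_in)) * 0.5  # encoder weights
        self.Wd = rng.standard_normal((n_in, n_code)) * 0.5  # decoder weights

    def encode(self, x):
        return relu(self.We @ x)   # low-dimensional code (learned representation)

    def decode(self, code):
        return self.Wd @ code      # reconstruction of the input

    def forward(self, x):
        return self.decode(self.encode(x))

ae = TinyAutoencoder()
x = np.arange(8.0)
code, x_hat = ae.encode(x), ae.forward(x)
print(code.shape, x_hat.shape)  # (3,) (8,)
```

Training would minimize the reconstruction error between x and x_hat, after which the 3-dimensional code serves as the extracted feature vector.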

Figure 7: Basic Autoencoder Structure

3.8. Other Deep Structures


DL models are not limited to the ones mentioned in the previous subsections. Some of
the other well-known structures in the literature are Deep Reinforcement Learning
(DRL), Generative Adversarial Networks (GANs), Capsule Networks, and Deep Gaussian
Processes (DGPs). To the best of our knowledge, we have not encountered any noteworthy
academic or industrial publication on financial applications using these models so far, with
the exception of DRL, which has started getting attention lately. However, that does not
imply that these models do not fit the financial domain well. On the contrary, they offer
great potential for researchers and practitioners in the finance and deep learning
communities who are willing to go the extra mile to come up with novel solutions.
Since research on model development in DL is ongoing, new structures keep appearing;
however, the aforementioned models currently cover almost all of the published work. The
next section will provide details about the implementation areas along with the preferred
DL models.

4. Financial Applications
There are a lot of financial applications of soft computing in the literature. DL has been
studied in most of them, although some opportunities still exist in a number of fields.
Throughout this section, we categorize the implementation areas and present them
in separate subsections. Besides, in each subsection we tabulate the representative features
of the relevant studies in order to provide as much information as possible in the limited
space.
Also, the readers should note that there were some overlaps between different
implementation areas for some papers. There were two main reasons for this: in some papers,
multiple problems were addressed separately (e.g., text mining was studied for feature
extraction, then algorithmic trading was implemented); in other cases, a paper might fit
directly into multiple implementation areas due to the survey structure (e.g.,
cryptocurrency portfolio management). In such cases we included the papers in all of the
relevant subsections, creating some overlaps.
Some of the existing study areas can be grouped as follows:

4.1. Algorithmic Trading


Algorithmic trading (or Algo-trading) is defined as buy-sell decisions made solely by al-
gorithmic models. These decisions can be based on some simple rules, mathematical models,
optimized processes, or as in the case of machine/deep learning, highly complex function
approximation techniques. With the introduction of electronic online trading platforms and
frameworks, algorithmic trading took over the finance industry in the last two decades. As
a result, Algo-trading models based on DL also started getting attention.
Most of the Algo-trading applications are coupled with price prediction models for market
timing purposes. As a result, a majority of the price or trend forecasting models that
trigger buy-sell signals based on their prediction are also considered as Algo-trading systems.
However, there are also some studies that propose stand-alone Algo-trading models focused
on the dynamics of the transaction itself by optimizing trading parameters such as bid-ask
spread, analysis of limit order book, position-sizing, etc. High Frequency Trading (HFT)
researchers are particularly interested in this area. Hence, DL models also started appearing
in HFT studies.
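As a minimal illustration of how a forecasting model is coupled with market timing, the sketch below turns next-step price predictions (from any model) into buy-sell-hold signals via a simple return threshold. The function name and the 1% threshold are hypothetical choices for illustration, not taken from any surveyed study:

```python
def signals_from_forecast(prices, forecasts, threshold=0.01):
    # go long when the model predicts a rise beyond the threshold,
    # go short when it predicts a comparable fall, hold otherwise
    out = []
    for p, f in zip(prices, forecasts):
        expected_return = (f - p) / p
        if expected_return > threshold:
            out.append("buy")
        elif expected_return < -threshold:
            out.append("sell")
        else:
            out.append("hold")
    return out

prices    = [100.0, 101.0, 102.0, 101.5]
forecasts = [102.0,  99.0, 102.5, 101.6]   # next-step predictions from any model
print(signals_from_forecast(prices, forecasts))  # ['buy', 'sell', 'hold', 'hold']
```

Real Algo-trading systems layer transaction costs, position sizing and risk limits on top of such a signal rule, but the forecast-to-signal coupling is the common core.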
Before diving into the DL implementations, it would be beneficial to briefly mention the
existing ML surveys on Algo-trading. Hu et al. [34] reviewed the implementations
of various EAs in algorithmic trading models. Since financial time series forecasting is
highly coupled with algorithmic trading, there are also a number of ML survey papers focused
on Algo-trading models based on forecasting; interested readers can refer to [1] for more
information.
As far as the DL research is concerned, Table 1, Table 2, and Table 3 present the past and
current status of algo-trading studies based on DL models. The papers are distributed to
these tables as follows: Table 1 has the particular algorithmic trading implementations that
are embedded with time series forecasting models, whereas Table 2 is focused on classification
based (Buy-sell Signal, or Trend Detection) algo-trading models. Finally, Table 3 presents
stand-alone studies or other algorithmic trading models (pairs trading, arbitrage, etc) that
do not fit into the above clustering criteria.
Most of the Algo-trading studies were concentrated on the prediction of stock or index
prices. Meanwhile, LSTM was the most preferred DL model in these implementations. In
[35], market microstructures based trade indicators were used as the input into RNN with
Graves LSTM to perform the price prediction for algorithmic stock trading. Bao et al. [36]
used technical indicators as the input into Wavelet Transforms (WT), LSTM and Stacked
Autoencoders (SAEs) for the forecasting of stock prices. In [37], CNN and LSTM model
structures were implemented together (CNN was used for stock selection, LSTM was used
for price prediction).
Table 1: Algo-trading Applications Embedded with Time Series Forecasting Models
(Columns: Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Environment)

[35] GarantiBank in BIST, Turkey | 2016 | OCHLV, Spread, Volatility, Turnover, etc. | PLR, Graves LSTM | MSE, RMSE, MAE, RSE, Correlation R-square | Spark
[36] CSI300, Nifty50, HSI, Nikkei 225, S&P500, DJIA | 2010-2016 | OCHLV, Technical Indicators | WT, Stacked autoencoders, LSTM | MAPE, Correlation coefficient, THEIL-U | -
[37] Chinese Stocks | 2007-2017 | OCHLV | CNN + LSTM | Annualized Return, Mxm Retracement | Python
[38] 50 stocks from NYSE | 2007-2016 | Price data | SFM | MSE | -
[39] The LOB of 5 stocks of Finnish Stock Market | 2010 | FI-2010 dataset: bid/ask and volume | WMTR, MDA | Accuracy, Precision, Recall, F1-Score | -
[40] 300 stocks from SZSE, Commodity | 2014-2015 | Price data | FDDR, DNN+RL | Profit, return, SR, profit-loss curves | Keras
[41] S&P500 Index | 1989-2005 | Price data, Volume | LSTM | Return, STD, SR, Accuracy | Python, TensorFlow, Keras, R, H2O
[42] Stock of National Bank of Greece (ETE) | 2009-2014 | FTSE100, DJIA, GDAX, NIKKEI225, EUR/USD, Gold | GASVR, LSTM | Return, volatility, SR, Accuracy | Tensorflow
[43] Chinese stock-IF-IH-IC contract | 2016-2017 | Decisions for price change | MODRL+LSTM | Profit and loss, SR | -
[44] Singapore Stock Market Index | 2010-2017 | OCHL of last 10 days of Index | DNN | RMSE, MAPE, Profit, SR | -
[45] GBP/USD | 2017 | Price data | Reinforcement Learning + LSTM + NES | SR, downside deviation ratio, total profit | Python, Keras, Tensorflow
[46] Commodity, FX future, ETF | 1991-2014 | Price Data | DNN | SR, capability ratio, return | C++, Python
[47] USD/GBP, S&P500, FTSE100, oil, gold | 2016 | Price data | AE + CNN | SR, % volatility, avg return/trans, rate of return | H2O
[48] Bitcoin, Dash, Ripple, Monero, Litecoin, Dogecoin, Nxt, Namecoin | 2014-2017 | MA, BOLL, the CRIX returns, Euribor interest rates, OCHLV | LSTM, RNN, MLP | Accuracy, F1-measure | Python, Tensorflow
[49] S&P500, KOSPI, HSI, and EuroStoxx50 | 1987-2017 | 200-days stock price | Deep Q-Learning, DNN | Total profit, Correlation | -
[50] Stocks in the S&P500 | 1990-2015 | Price data | DNN, GBT, RF | Mean return, MDD, Calmar ratio | H2O
[51] Fundamental and Technical Data, Economic Data | - | Fundamental, technical and market information | CNN | - | -

Using a different model, Zhang et al. [38] proposed a novel State Frequency Memory
(SFM) recurrent network for stock price prediction with multiple frequency trading patterns
and achieved better prediction and trading performances. For an HFT trading system, Tran
et al. [39] developed a DL model that implements price change forecasting through mid-
price prediction using high-frequency limit order book data with tensor representation. In
[40], the authors used Fuzzy Deep Direct Reinforcement Learning (FDDR) for stock price
prediction and trading signal generation.
For index prediction, the following studies are noteworthy. In [41], the price prediction
of S&P500 index using LSTM was implemented. Mourelatos et al. [42] compared the
performance of LSTM and GA with a SVR (GASVR) for Greek Stock Exchange Index
prediction. Si et al. [43] implemented a Chinese intraday futures market trading model with
DRL and LSTM. Yong et al. [44] used a feed-forward DNN and the Open, Close, High,
Low (OCHL) values of the index time series to predict Singapore Stock Market index data.
Forex or cryptocurrency trading was implemented in some studies. In [45], agent inspired
trading using deep (recurrent) reinforcement learning and LSTM was implemented and
tested on the trading of GBP/USD. In [46], feedforward deep MLP was implemented for
the prediction of commodities and FX trading prices. Korczak et al. [47] implemented a
forex trading (GBP/PLN) model using several different input parameters on a multi-agent-
based trading environment. One of the agents was using CNN as the prediction model and
outperformed all other models.
On the cryptocurrency side, Spilak et al. [48] used several cryptocurrencies (Bitcoin,
Dash, Ripple, Monero, Litecoin, Dogecoin, Nxt, Namecoin) to construct a dynamic portfolio
using LSTM, RNN, MLP methods.
In a versatile study, Jeong et al. [49] combined deep Q-learning and DNN to implement
price forecasting and they intended to solve three separate problems: Increasing profit in
a market, prediction of the number of shares to trade, and preventing overfitting with
insufficient financial data.
In [52], the buy and sell limits of a technical analysis indicator (Relative Strength Index
(RSI)) were optimized with a GA and used to generate buy-sell signals; after the
optimization, a DMLP was also used for function approximation. In [53], the authors
combined a deep Fully Connected Neural Network (FNN) with a selective trade strategy
unit to predict the next price. In [54], the crossover and Moving Average Convergence and
Divergence (MACD) signals were used to predict the trend of the Dow 30 stocks' prices.
Sirignano et al. [55] proposed a novel method that used limit order book flow and history
information to determine stock movements using an LSTM model. Tsantekidis et al. [56]
also used limit order book time series data and an LSTM for trend prediction.
Several studies focused on utilizing CNN based models due to their success in image
classification problems. However, in order to do that, the financial input data needed to be
transformed into images which required some creative preprocessing. Gudelek et al. [57]
converted time series of price data to 2-dimensional images using technical analysis and
classified them with deep CNN. Similarly, Sezer et al. [58] also proposed a novel technique
that converts financial time series data that consisted of technical analysis indicator outputs
to 2-dimensional images and classified these images using CNN to determine the trading
signals. In [59], candlestick chart graphs were converted into 2-dimensional images. Then,
unsupervised convolutional AE was fed with the images to implement portfolio construction.
Tsantekidis et al. [60] proposed a novel method that used the last 100 entries from the limit
order book to create a 2-dimensional image for the stock price prediction using CNN method.
In [61], an innovative method was proposed that feeds correlated features into a CNN
to predict the trend of stock prices. Finally, Sezer et al. [62] directly used bar
chart images as inputs to CNN and predicted if the image class was Buy, Hold or Sell, hence
a corresponding Algo-trading model was developed.
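As a minimal sketch of this time-series-to-image idea (the indicator choice, window length, and moving-average periods below are hypothetical; the cited papers use richer feature sets), a price series can be stacked into a 2-D indicator-by-time matrix for a CNN to consume:

```python
import numpy as np

def sma(prices, n):
    """Simple moving average over a trailing window of n bars."""
    return np.convolve(prices, np.ones(n) / n, mode="valid")

def indicator_image(prices, window=15, periods=(3, 5, 8, 10, 12, 15)):
    """Stack several moving-average series over the last `window` bars
    into a 2-D (indicator x time) matrix -- one 'image' per sample.
    Illustrative only: the surveyed papers use larger indicator sets."""
    rows = [sma(prices, n)[-window:] for n in periods]  # align on latest bars
    return np.stack(rows)

prices = 100 + np.cumsum(np.random.default_rng(0).normal(0, 1, 120))
img = indicator_image(prices)
print(img.shape)  # (6, 15)
```

Each such matrix would then be labeled (e.g., Buy/Hold/Sell) and fed to a 2-D CNN exactly as an image classifier consumes pixels.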
Table 2: Classification (Buy-sell Signal, or Trend Detection) Based Algo-trading Models

Art. Data Set Period Feature Set Method Performance Criteria Environment
[52] Stocks in Dow30 1997-2017 RSI DMLP with ge- Annualized Spark MLlib, Java
netic algorithm return
[53] SPY ETF, 10 stocks 2014-2016 Price data FFNN Cumulative gain MatConvNet,
from S&P500 Matlab
[54] Dow30 stocks 2012-2016 Close data and LSTM Accuracy Python, Keras,
several technical Tensorflow,
indicators TALIB
[55] High-frequency record 2014-2017 Price data, record LSTM Accuracy -
of all orders of all orders, trans-
actions
[56] Nasdaq Nordic (Kesko 2010 Price and volume LSTM Precision, Re- -
Oyj, Outokumpu Oyj, data in LOB call, F1-score,
Sampo, Rautaruukki, Cohen’s k
Wartsila Oyj)
[57] 17 ETFs 2000-2016 Price data, techni- CNN Accuracy, MSE, Keras, Tensorflow
cal indicators Profit, AUROC
[58] Stocks in Dow30 and 9 1997-2017 Price data, techni- CNN with feature Recall, precision, Python, Keras,
Top Volume ETFs cal indicators imaging F1-score, annual- Tensorflow, Java
ized return
[59] FTSE100 2000-2017 Price data CAE TR, SR, MDD, -
mean return
[60] Nasdaq Nordic (Kesko 2010 Price, Volume CNN Precision, Re- Theano, Scikit
Oyj, Outokumpu Oyj, data, 10 orders of call, F1-score, learn, Python
Sampo, Rautaruukki, the LOB Cohen’s k
Wartsila Oyj)
[61] Borsa Istanbul 100 2011-2015 75 technical indi- CNN Accuracy Keras
Stocks cators and OCHLV
[62] ETFs and Dow30 1997-2007 Price data CNN with feature Annualized Keras, Tensorflow
imaging return
[63] 8 experimental assets - Asset prices data RL, DNN, Genetic Learning and -
from bond/derivative Algorithm genetic algorithm
market error
[64] 10 stocks from S&P500 - Stock Prices TDNN, RNN, Missed oppor- -
PNN tunities, false
alarms ratio
[65] London Stock Exchange 2007-2008 Limit order book CNN Accuracy, kappa Caffe
state, trades,
buy/sell orders,
order deletions
[66] Cryptocurrencies, Bit- 2014-2017 Price data CNN, RNN, Accumulative -
coin LSTM portfolio value,
MDD, SR

Serrano et al. [63] proposed a novel method called “GoldAI Sachs” Asset Banker Rein-
forcement Learning Algorithm for algorithmic trading. The proposed method used a ran-
dom neural network, GP, and Reinforcement Learning (RL) to generate the trading signals.
Saad et al. [64] compared the Time-Delay Neural Network (TDNN), RNN, and Probabilistic
Neural Network (PNN) for trend detection using 10 stocks from S&P500. In [65], HFT
microstructures forecasting with CNN method was performed. In [66], cryptocurrency port-
folio management based on three different proposed models (basic RNN, LSTM and CNN)
was implemented.
Tino et al. [67] used The Deutscher Aktienindex (DAX), London Financial Times Stock
Exchange Index (FTSE)100, call and put options prices to predict the changes with Markov
models and used the financial time series data to predict volatility changes with RNN.
Meanwhile, Chen et al. [68] proposed a method that applies a filterbank CNN algorithm to
15x15 synthetic images converted from volatility time series. In that study, financial domain
knowledge and the filterbank mechanism were combined to determine the trading signals. Bari
et al. [69] used text mining to extract information from tweets and financial news and
used LSTM, RNN, and Gated Recurrent Unit (GRU) models to generate the trading signals.
Dixon et al. [70] used RNN for the sequence classification of the limit order book to predict
the next price-flip event.
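The trend-labeling step shared by these limit order book studies can be sketched as follows; the horizon and threshold values are hypothetical stand-ins for parameters the papers tune per stock:

```python
import numpy as np

def trend_labels(mid, horizon=5, alpha=0.001):
    """Label each order book snapshot as up (1), flat (0), or down (-1)
    by comparing the mean mid-price over the next `horizon` events with
    the current mid-price. alpha is a hypothetical noise threshold."""
    labels = []
    for t in range(len(mid) - horizon):
        future = mid[t + 1 : t + 1 + horizon].mean()
        change = (future - mid[t]) / mid[t]
        labels.append(1 if change > alpha else (-1 if change < -alpha else 0))
    return np.array(labels)

mid = np.linspace(10.00, 11.00, 30)   # steadily rising mid-price series
labels = trend_labels(mid)
print(labels[:5])  # [1 1 1 1 1]
```

The resulting label sequence, paired with the raw order book snapshots, forms the supervised dataset the LSTM/RNN classifiers are trained on.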
Table 3: Stand-alone and/or Other Algorithmic Models

Art. Data Set Period Feature Set Method Performance Criteria Environment
[67] DAX, FTSE100, 1991-1998 Price data Markov model, Ewa-measure, -


call/put options RNN iv, daily profits’
mean and std
[68] Taiwan Stock Index Fu- 2012-2014 Price data to im- Visualization Accumulated -
tures, Mini Index Fu- age method + CNN profits, accuracy
tures
[69] Energy-Sector/ 2015-2016 Text and Price LSTM, RNN, Return, SR, pre- Python, Tweepy
Company-Centric data GRU cision, recall, ac- API
Tweets in S&P500 curacy
[70] CME FIX message 2016 Limit order book, RNN Precision, recall, Python, Tensor-
time-stamp, price F1-measure Flow, R
data
[71] Taiwan stock index fu- 2017 Price data Agent based Accuracy -
tures (TAIFEX) RL with CNN
pre-trained
[72] Stocks from S&P500 2010-2016 OCHLV DCNL PCC, DTW, Pytorch
VWL
[73] News from NowNews, 2013-2014 Text, Sentiment DNN Return Python, Tensor-
AppleDaily, LTN, Mon- flow
eyDJ for 18 stocks
[74] 489 stocks from S&P500 2014-2015 Limit Order Book Spatial neural net- Cross entropy er- NVIDIA’s cuDNN
and NASDAQ-100 work ror
[75] Experimental dataset - Price data DRL with CNN, Mean profit Python
LSTM, GRU,
MLP

Chen et al. [71] used 1-dimensional CNN with an agent-based RL algorithm on the
Taiwan stock index futures (TAIFEX) dataset. Wang et al. [72] proposed a Deep Co-
investment Network Learning (DeepCNL) method that used convolutional and RNN layers.
The investment pattern was determined using the extracted Rise-Fall trends. Day et al. [73]
used financial sentiment analysis using text mining and DNN for stock algorithmic trading.
Sirignano et al. [74] proposed a “spatial neural network” model that used limit order book
and spatial features for algorithmic trading; their model estimates the best bid-ask prices
using the bid and ask prices in the limit order book. Gao et al. [75] used GRU and LSTM
units, CNN, and MLP to model Q-values in a DRL implementation.
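The Q-value learning behind such DRL trading models can be reduced to a toy tabular sketch; the cited studies replace the Q table with GRU/LSTM/CNN/MLP function approximators, and the state discretization and reward below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0.05, 1.0, 500))  # synthetic price path
returns = np.diff(prices)

# State: sign of the last return (toy discretization); actions: short/flat/long.
Q = np.zeros((2, 3))
position = np.array([-1.0, 0.0, 1.0])   # P&L multiplier per action
alpha, gamma, eps = 0.1, 0.95, 0.1      # learning rate, discount, exploration

for t in range(1, len(returns) - 1):
    s = int(returns[t - 1] > 0)
    # epsilon-greedy action selection
    a = int(rng.integers(3)) if rng.random() < eps else int(Q[s].argmax())
    reward = position[a] * returns[t]    # next-step trading P&L
    s_next = int(returns[t] > 0)
    # standard one-step Q-learning update
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

print(Q.shape)  # (2, 3)
```

In the deep variants, `Q[s]` becomes a network forward pass over a rich market state, and the update becomes a gradient step on the squared temporal-difference error.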

4.2. Risk Assessment


Another study area that has been of interest to DL researchers is Risk Assessment, which
identifies the “riskiness” of any given asset, firm, person, product, bank, etc. Several different
versions of this general problem exist, such as bankruptcy prediction, credit scoring, credit
evaluation, loan/insurance underwriting, bond rating, loan application, consumer credit de-
termination, corporate credit rating, mortgage choice decision, financial distress prediction,
business failure prediction. Correctly identifying the risk status in such cases is crucial, since
asset pricing is highly dependent on these risk assessment measures. The mortgage crisis
based on improper risk assessment of Credit Default Swaps (CDS) between financial insti-
tutions caused the real-estate bubble to burst in 2008 and resulted in the Great Recession
[76].
The majority of the risk assessment studies concentrate on credit scoring and bank
distress classification. However, there are also a few papers covering mortgage default possi-
bility, risky transaction detection or crisis forecasting. Meanwhile, there are some anomaly
detection studies for risk assessment, most of which also fall under the “Fraud Detection”
category, which will be covered in the next subsection.
Table 4: Credit Scoring or Classification Studies

Art. Data Set Period Feature Set Method Performance Criteria Env.
[77] The XR 14 CDS con- 2016 Recovery rate, DBN+RBM AUROC, FN, WEKA
tracts spreads, sector FP, Accuracy
and region
[78] German, Japanese - Personal financial SVM + DBN Weighted- -
credit datasets variables accuracy, TP,
TN
[79] Credit data from Kaggle - Personal financial DNN Accuracy, TP, -
variables TN, G-mean
[80] Australian, German - Personal financial GP + AE as FP Python, Scikit-
credit data variables Boosted DNN learn
[81] German, Australian - Personal financial DCNN, MLP Accuracy, -
credit dataset variables False/Missed
alarm
[82] Consumer credit data - Relief algorithm CNN + Relief AUROC, K-s Keras
from Chinese finance chose the 50 most statistic, Accu-
company important features racy
[83] Credit approval dataset - UCI credit ap- Rectifier, Tanh, - AWS EC2, H2O, R
by UCI Machine Learn- proval dataset Maxout DL
ing repo

Before going into the details about specific DL implementations, it is worthwhile to men-
tion the existing ML surveys on the topic. Kirkos et al. [84], Ravi et al. [85], Fethi et al. [86]
reviewed the bank performance assessment studies based on Artificial Intelligence (AI) and
ML models. Lahsasna et al. [87], Chen et al.[88] surveyed the credit scoring and credit risk
assessment studies based on soft computing techniques, whereas Marques et al. [89] focused
only on Evolutionary Computation (EC) models for credit scoring implementations. Mean-
while, Kumar et al. [90], Verikas et al. [91] reviewed ML implementations of bankruptcy
prediction studies. Similarly, Sun et al. [92] provided a comprehensive survey about research
on financial distress and corporate failures. Apart from these reviews, for assessing overall
risk, Lin et al. [93] surveyed the financial crisis prediction studies based on ML models.
Since risk assessment is becoming vital for survival in today’s financial world, a lot of
researchers have turned their attention to DL for higher accuracy. Table 4 and Table 5
provide snapshot information about the different risk assessment studies implemented using
various DL models.
For credit score classification (Table 4), Luo et al. [77] used CDS data for corporate
credit rating and the corresponding credit classification (A, B, or C). Among the tested
models, DBN with RBM performed the best; this was probably the first study to implement
credit rating with DBN. Similarly, in [78], a cascaded hybrid model of DBN, Backpropagation,
and SVM was implemented for credit classification, achieving good performance results
(accuracies above 80-90%). In [79], credit risk classification was implemented with an
ensemble of deep MLP networks, each trained on a subspace of the data formed through
k-means clustering. The class imbalance problem was handled by giving every subspace all
of the positive (minority) instances but only a subsample of the negative (majority)
instances; the final decision combined the outputs of the subspace models. In [80], credit
scoring was performed using an SAE network and a GP
model to create credit assessment rules in order to generate good or bad credit cases. In
another study, Neagoe et al. [81] classified credit scores using various DMLP and deep CNN
networks. In a different study [82], consumer credit scoring was implemented by transforming
the input consumer data into a 2-D pixel matrix and using the resulting images as the
training and test data for a CNN; this was the first implementation of credit scoring with
CNN. Niimi [83] used the UCI credit approval dataset 1 to compare DL, SVM, Logistic
Regression (LR), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost) models
for credit approval classification, and also provided an introduction to credit fraud detection
applications.
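The subsampling scheme of [79] can be sketched as follows; this simplified version omits the k-means step and shows only the minority-preserving balanced subsets (data shapes and the default rate are synthetic):

```python
import numpy as np

def balanced_subsets(X, y, n_subsets=5, seed=0):
    """Split an imbalanced credit dataset into several balanced subsets:
    each keeps every minority (default) case plus a fresh random sample
    of the majority (non-default) class. One base classifier is then
    trained per subset and their votes are combined."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    subsets = []
    for _ in range(n_subsets):
        sample = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, sample])
        subsets.append((X[idx], y[idx]))
    return subsets

X = np.random.default_rng(2).normal(size=(1000, 10))   # synthetic features
y = (np.random.default_rng(3).random(1000) < 0.1).astype(int)  # ~10% defaults
parts = balanced_subsets(X, y)
print(len(parts), parts[0][1].mean())  # 5 0.5
```

One deep MLP would then be trained per subset, with the final ensemble decision formed by combining (e.g., voting over) the subspace models.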
Financial distress prediction for banks and corporates has been studied extensively (Table 5).
In [94], a hybrid DBN with SVM was used for financial distress prediction to identify whether
the firm was in trouble or not, whereas bank risk classification was studied in [95]. In [96],
news semantics were extracted through word sequence learning and the associated events
were labeled with bank stress; then, from the resulting semantic vector representation, the
bank stress level was determined and classified against a threshold, neatly integrating
prediction with semantic meaning extraction.
for identifying the bank distress by extracting the data from financial news and then using
a Deep Feed Forward Network (DFFN) on semantic sentence vectors extracted from word
embeddings to classify if there was an event or not. Similarly, Cerchiello et al. [98] used
text mining on financial news to classify bank distress. Malik et al. [99] evaluated bank
stress by first predicting the bank’s performance through an LSTM network and then using
a Backpropagation network to find the bank stress level.
Table 5: Financial Distress, Bankruptcy, Bank Risk, Mortgage Risk, Crisis Forecasting Studies

Art. Data Set Period Feature Set Method Performance Criteria Env.
[94] 966 french firms - Financial ratios RBM+SVM Precision, Recall -

1
https://archive.ics.uci.edu/ml/datasets.html
[95] 883 BHC from EDGAR 2006-2017 Tokens, weighted CNN, LSTM, Accuracy, Preci- Keras, Python,
sentiment polarity, SVM, RF sion, Recall, F1- Scikit-learn
leverage and ROA score
[96] The event data set for 2007-2014 Word, sentence DNN +NLP pre- Relative useful- -
large European banks, process ness, F1-score
news articles from
Reuters
[97] Event dataset on Euro- 2007-2014 Text, sentence Sentence vector + Usefulness, F1- -
pean banks, news from DFFN score, AUROC
Reuters
[98] News from Reuters, fun- 2007-2014 Financial ratios doc2vec + NN Relative useful- Doc2vec
damental data and news text ness
[99] Macro/Micro economic 1976-2017 Macro economic CGAN, MVN, RMSE, Log like- -
variables, Bank char- variables and bank MV-t, LSTM, lihood, Loan loss
acteristics/performance performances VAR, FE-QAR rate
variables from BHC
[100] Financial statements of 2002-2006 Financial ratios DBN Recall, Precision, -
French companies F1-score, FP, FN
[101] Stock returns of Amer- 2001-2011 Price data DBN Accuracy Python, Theano
ican publicly-traded
companies from CRSP
[102] Financial statements of 2002-2016 Financial ratios CNN F1-score, AU- -
several companies from ROC
Japanese stock market
[103] Mortgage dataset with 1995-2014 Mortgage related ANN Negative average AWS
local and national eco- features log-likelihood
nomic factors
[104] Mortgage data from 2012-2016 Personal financial CNN Accuracy, Sensi- -
Norwegian financial variables tivity, Specificity,
service group, DNB AUROC
[105] Private brokerage com- - 250 features: order CNN, LSTM F1-Score Keras, Tensorflow
pany’s real data of risky details, etc.
transactions
[106] Several datasets com- 1996-2017 Index data, 10- Logit, CART, AUROC, KS, G- R
bined to create a new year Bond yield, RF, SVM, NN, mean, likelihood
one exchange rates, XGBoost, DNN ratio, DP, BA,
WBA

There are also a number of research papers that were focused on bankruptcy or corporate
default prediction. Ribeiro et al. [100] implemented bankruptcy prediction with DBN. The
results of DBN were compared with SVM and RBM. Yeh et al. [101] used the stock returns
of default and solvent companies as inputs to an RBM acting as an SAE, and the RBM
output was then fed into a DBN to predict whether a company was solvent or in default;
the DBN model outperformed an SVM baseline. Hosaka et al.
[102] tried a different approach by converting the financial data into images in order to use
CNN for bankruptcy prediction.
The remaining implementations of risk assessment are as follows: Sirignano et al. [103]
used 20 years of mortgage application data to identify mortgage risk using various
parameters, and performed extensive analyses relating the different factors that affected the
mortgage payment structure, including prepayment and delinquency behavior in their
assessment. For another mortgage risk assessment application, Kvamme
et al. [104] used CNN and RF models to predict whether a customer would default on their
mortgage. In a different study, Abroyan et al. [105] used CNN and LSTM networks
to classify whether a transaction performed on the stock market (a trade) was risky or not,
and high
accuracy was achieved. Finally, Chatzis et al. [106] developed several ML and DL models for
detecting events that cause stock market crashes; the DL models achieved good classification
(crisis or no crisis) performance.

4.3. Fraud Detection


Financial fraud is one of the areas where the governments and authorities are desperately
trying to find a permanent solution. Several different financial fraud cases exist such as credit
card fraud, money laundering, consumer credit fraud, tax evasion, bank fraud, insurance
claim fraud. This is one of the most extensively studied areas of finance for ML research
and several survey papers were published accordingly. At different times, Kirkos et al. [107],
Yue et al. [108], Wang et al. [109], Phua et al. [110], Ngai et al. [111], Sharma et al. [112],
West et al. [113] all reviewed the accounting and financial fraud detection studies based on
soft computing and data mining techniques.
These types of studies can mostly be considered anomaly detection and are generally
formulated as classification problems. Table 6 presents the different fraud detection studies
based on DL models.
There are a number of studies focused on identifying credit card fraud. Heryadi et al.
[114] developed several DL models for credit card fraud detection for Indonesian banks. They
also analyzed the effects of the data imbalance between fraud and nonfraud data. In more
recent studies, Roy et al. [115] used LSTM model for the credit card fraud detection, whereas
in [116], the authors implemented MLP networks to classify if a credit card transaction was
fraudulent or not. Sohony et al. [117] used an ensemble of FFNN for the detection of card
fraud. Jurgovsky et al. [118] used LSTM for detecting credit card fraud from credit card
transaction sequences. They compared their results with RF.
Paula et al. [119] used a deep AE for anomaly detection to identify financial fraud and
money laundering in the export tax claims of Brazilian companies. In a similar study,
Gomes et al. [120] proposed an anomaly detection model, also based on deep AE, that
identified anomalies in parliamentary expenditure spending in Brazilian elections.
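The reconstruction-error principle behind these AE-based detectors can be illustrated with a linear autoencoder, which is equivalent to PCA; the cited papers train deep nonlinear AEs, and the records below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fit on (mostly clean) historical records; score new records by how
# poorly the autoencoder reconstructs them.
train = rng.normal(0, 1, size=(500, 8))          # stand-in for normal records
mu = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mu, full_matrices=False)
W = Vt[:3].T                                     # linear 8 -> 3 bottleneck

def reconstruction_error(X):
    Xc = X - mu
    return np.square(Xc - (Xc @ W) @ W.T).sum(axis=1)

# Flag anything reconstructed worse than 99% of the training data.
threshold = np.percentile(reconstruction_error(train), 99)
new = np.vstack([rng.normal(0, 1, size=(20, 8)),   # 20 normal records
                 rng.normal(6, 1, size=(3, 8))])   # 3 off-manifold records
flags = reconstruction_error(new) > threshold
print(flags[-3:])  # the three off-manifold records are flagged
```

A deep AE replaces the linear projection with stacked nonlinear encoder/decoder layers, but the anomaly score remains the same reconstruction error.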
Wang et al. [121] used text mining and DNN models for the detection of automobile
insurance fraud. Longfei et al. [122] developed DNN models to detect online payment
transaction fraud. Costa et al. [123] used LSTM on the character sequences in financial
transactions and the responses from the counterparty to detect whether a transaction was
fraudulent. Finally, Goumagias et al. [124] used deep Q-learning (RL) to predict risk-averse
firms’ tax evasion behavior and provided suggestions for states to maximize their tax
revenues accordingly.
Table 6: Fraud Detection Studies

Art. Data Set Period Feature Set Method Performance Criteria Env.
[114] Debit card transactions by a local 2016-2017 Financial transaction CNN, AUROC -
Indonesia bank amount on several time Stacked-
periods LSTM,
CNN-LSTM

[115] Credit card transactions from re- 2017 Transaction variables and LSTM, GRU Accuracy Keras
tail banking several derived features
[116] Card purchases’ transactions 2014-2015 Probability of fraud per ANN AUROC -
currency/origin coun-
try, other fraud related
features
[117] Transactions made with credit 2013 Personal financial variables ANN, RF Recall, Pre- -
cards by European cardholders to PCA cision, Accu-
racy
[118] Credit-card transactions 2015 Transaction and bank fea- LSTM AUROC Keras,
tures Scikit-
learn
[119] Databases of foreign trade of the 2014 8 Features: Foreign Trade, AE MSE H2O, R
Secretariat of Federal Revenue of Tax, Transactions, Em-
Brazil ployees, Invoices, etc
[120] Chamber of Deputies open data, 2009-2017 21 features: Brazilian Deep Au- MSE, RMSE H2O, R
Companies data from Secretariat State expense, party toencoders
of Federal Revenue of Brazil name, Type of expense,
etc.
[121] Real-world data for automobile - Car, insurance and acci- DNN + LDA TP, FP, -
insurance company labeled as dent related features Accuracy,
fraudulent Precision,
F1-score
[122] Transactions from a giant online 2006 Personal financial variables GBDT+DNN AUROC -
payment platform
[123] Financial transactions - Transaction data LSTM t-SNE -
[124] Empirical data from Greek firms - - DQL Revenue Torch

4.4. Portfolio Management


Portfolio Management is the process of choosing various assets within the portfolio for a
predetermined period. As seen in other financial applications, slightly different versions of
this problem exist, even though the underlying motivation is the same. In general, Portfolio
Management covers the following closely related areas: Portfolio Optimization, Portfolio
Selection, Portfolio Allocation. Sometimes, these terms are used interchangeably. Li et al.
[125] reviewed the online portfolio selection studies using various rule-based or ML models.
Portfolio Management is actually an optimization problem: identifying the best possible
course of action for selecting the best-performing assets for a given period. As a result, there
are a lot of EA models that were developed for this purpose. Metaxiotis et al. [126] surveyed
the MOEAs implemented solely on the portfolio optimization problem.
However, some DL researchers have managed to formulate it as a learning problem and
obtained superior performance. Since Robo-advisory for portfolio management is on the rise, these
DL implementations have the potential to have a far greater impact on the financial industry
in the near future. Table 7 presents the portfolio management DL models and summarizes
their achievements.
There are a number of stock selection implementations. Takeuchi et al. [127] classified
stocks into two classes, low momentum and high momentum, depending on their expected
return; they used a deep RBM encoder-classifier network and achieved high returns. Simi-
larly, in [128], stocks were evaluated against their benchmark index with DMLP to classify
whether they would outperform or underperform, and the portfolio allocation weights of
the stocks were then adjusted according to the predictions for enhanced indexing. In [129],
an ML framework including DMLP was constructed for the stock selection problem.
Table 7: Portfolio Management Studies

Art. Data Set Period Feature Set Method Performance Criteria Env.
[66] Cryptocurrencies, Bit- 2014-2017 Price data CNN, RNN, Accumulative -
coin LSTM portfolio value,
MDD, SR
[127] Stocks from NYSE, 1965-2009 Price data Autoencoder + Accuracy, confu- -
AMEX, NASDAQ RBM sion matrix
[128] 20 stocks from S&P500 2012-2015 Technical indica- MLP Accuracy Python, Scikit
tors Learn, Keras,
Theano
[129] Chinese stock data 2012-2013 Technical, funda- Logistic Regres- AUC, accuracy, Keras, Tensorflow,
mental data sion, RF, DNN precision, recall, Python, Scikit
f1, tpr, fpr learn
[130] Top 5 companies in - Price data and Fi- LSTM, Auto- CAGR -
S&P500 nancial ratios encoding, Smart
indexing
[131] IBB biotechnology in- 2012-2016 Price data Auto-encoding, Returns -
dex, stocks Calibrating, Vali-
dating, Verifying
[132] Taiwan’s stock market - Price data Elman RNN MSE, return -
[133] FOREX (EUR/USD, 2013 Price data Evolino RNN Return Python
etc), Gold
[134] Stocks in NYSE, 1993-2017 Price, 15 firm LSTM+MLP Monthly return, Python,Keras,
AMEX, NASDAQ, TAQ characteristics SR Tensorflow in
intraday trade AWS
[135] S&P500 1985-2006 monthly and daily DBN+MLP Validation, Test Theano, Python,
log-returns Error Matlab
[136] 10 stocks in S&P500 1997-2016 OCHLV, Price RNN, LSTM, Accuracy, Keras, Tensorflow
data GRU Monthly return
[137] Analyst reports on the 2016-2018 Text LSTM, CNN, Bi- Accuracy, R2 R, Python, MeCab
TSE and Osaka Ex- LSTM
change
[138] Stocks from Chi- 2015-2018 OCHLV, Funda- DDPG, PPO SR, MDD -
nese/American stock mental data
market
[139] Hedge fund monthly re- 1996-2015 Return, SR, STD, DNN Sharpe ratio, -
turn data Skewness, Kurto- Annual return,
sis, Omega ratio, Cum. return
Fund alpha
[140] 12 most-volumed cryp- 2015-2016 Price data CNN + RL SR, portfolio -
tocurrency value, MDD

Portfolio selection and smart indexing were the main focuses of [130] and [131] using AE
and LSTM networks. Lin et al. [132] used the Elman network for optimal portfolio selection
by predicting the stock returns for t+1 and then constructing the optimum portfolio ac-
cording to the returns. Meanwhile, Maknickiene et al. [141] used Evolino RNN for portfolio
selection and return prediction accordingly. The selected portfolio components (stocks) were
orthogonal in nature.
In [134], portfolios of the stocks predicted to perform best were constructed by forecasting
the next month’s returns, and good monthly returns were achieved with LSTM and combined
LSTM-MLP DL models. Similarly, Batres et al. [135] combined DBN and MLP to construct
a stock portfolio by predicting each stock’s monthly log-return and choosing only the stocks
that were expected to outperform the median stock. Lee et al.
[136] compared 3 RNN models (S-RNN, LSTM, GRU) for stock price prediction and then
constructed a threshold-based portfolio by selecting the stocks according to the predictions.
With a different approach, Iwasaki et al. [137] extracted sentiment from analyst reports
through text mining and word embeddings, and used the sentiment features as inputs to a
Deep Feedforward Neural Network (DFNN) model for stock price prediction. Different
portfolio selections were then implemented based on the projected stock returns.
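Once a model has produced per-asset return predictions, the selection step shared by these studies can be sketched with a hypothetical equal-weight top-k rule (the cited papers differ in their thresholds and weighting schemes):

```python
import numpy as np

def topk_portfolio(predicted_returns, k=3):
    """Equal-weight the k assets with the highest predicted next-period
    return; everything else gets zero weight. This is only the selection
    step -- the prediction step (LSTM, DBN+MLP, ...) is where the
    surveyed studies differ."""
    weights = np.zeros_like(predicted_returns, dtype=float)
    top = np.argsort(predicted_returns)[-k:]   # indices of k best forecasts
    weights[top] = 1.0 / k
    return weights

preds = np.array([0.02, -0.01, 0.05, 0.00, 0.03])  # hypothetical forecasts
w = topk_portfolio(preds, k=2)
print(w.tolist())  # [0.0, 0.0, 0.5, 0.0, 0.5]
```

A threshold-based variant, as in [136], would instead include every asset whose predicted return exceeds a cutoff and renormalize the weights.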
Liang et al. [138] used DRL for portfolio allocation, adjusting the stock weights through
various RL models. Chen et al. [139] compared different ML models (including DFFN) for
hedge fund return prediction and hedge fund selection; the DL and RF models had the best
performance.
Cryptocurrency portfolio management also started getting attention from DL researchers.
In [140], portfolio management (allocation and adjustment of weights) was implemented
by CNN and DRL on selected cryptocurrencies. Similarly, Jiang et al. [66] implemented
cryptocurrency portfolio management (allocation) based on 3 different proposed models,
namely RNN, LSTM and CNN.

4.5. Asset Pricing and Derivatives Market (options, futures, forward contracts)
Accurate pricing or valuation of an asset is a fundamental study area in finance. There
are a vast number of ML models developed for banks, corporates, real estate, derivative
products, etc. However, DL has not been widely applied to this particular field, even though
there are several implementation areas where DL models could assist asset pricing
researchers or valuation experts. We were able to pinpoint only a handful of studies within
the DL and finance community; hence, there are vast opportunities in this field for future
studies and publications.
Meanwhile, financial models based on derivative products are quite common. Option
pricing, hedging strategy development, and financial engineering with options, futures, and
forward contracts are among the studies that can benefit from developing DL models. Some
recent studies indicate that researchers started showing interest in DL models that can
provide solutions to this complex and challenging field. Table 8 summarizes these studies
with their intended purposes.
Table 8: Asset Pricing and Derivatives Market Studies

Art. | Der. Type | Data Set | Period | Feature Set | Method | Performance Criteria | Env.
[137] | Stock exchange | Analyst reports on the TSE and Osaka Exchange | 2016-2018 | Text | LSTM, CNN, Bi-LSTM | Accuracy, R2 | R, Python, MeCab
[142] | Options | Simulated a range of call option prices | - | Price data, option strike/maturity, dividend/risk free rates, volatility | DNN | RMSE, the average percentage pricing error | Tensorflow
[143] | Futures, Options | TAIEX Options | 2017 | OCHLV, fundamental analysis, option price | MLP, MLP with Black-Scholes | RMSE, MAE, MAPE | -
[144] | Equity returns | Returns in NYSE, AMEX, NASDAQ | 1975-2017 | 57 firm characteristics | Fama-French n-factor model DL | R2, RMSE | Tensorflow

Iwasaki et al. [137] used a DFNN model and the analyst reports for sentiment analyses
to predict the stock prices. Different portfolio selection approaches were implemented after
the prediction of the stock prices. Culkin et al. [142] proposed a novel method that used
feedforward DNN model to predict option prices by comparing their results with Black
& Scholes option pricing formula. Similarly, Hsu et al. [143] proposed a novel method
that predicted TAIEX option prices using bid-ask spreads and Black & Scholes option price
model parameters with 3-layer DMLP. In [144], characteristic features such as Asset growth,
Industry momentum, Market equity, Market Beta, etc. were used as inputs to a Fama-French
n-factor model DL to predict US equity returns in National Association of Securities Dealers
Automated Quotations (NASDAQ), American Stock Exchange (AMEX), New York Stock
Exchange (NYSE) indices.
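The Black & Scholes formula that [142] and [143] benchmark against (and that [142] uses to generate training targets for its pricing network) values a European call as follows:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call(S, K, T, r, sigma):
    """Black-Scholes European call price: spot S, strike K, maturity T
    (in years), risk-free rate r, volatility sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf   # standard normal CDF
    return S * N(d1) - K * exp(-r * T) * N(d2)

# A pricing network would be trained on sampled (S/K, T, r, sigma) inputs
# with these formula prices as targets, then compared against market quotes.
print(round(bs_call(100, 100, 1.0, 0.05, 0.2), 4))  # 10.4506
```

The appeal of the DNN approach is that, once trained, the same network architecture can be refit to market data where the Black & Scholes assumptions do not hold.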

4.6. Cryptocurrency and Blockchain Studies


In the last few years, cryptocurrencies have been the talk of the town due to their
incredible price gain and loss within short periods. Even though price forecasting dominates
the area of interest, some other studies also exist, such as cryptocurrency Algo-trading
models.
Meanwhile, blockchain is a new technology that provides a distributed, decentralized
ledger system that fits well with the cryptocurrency world. As a matter of fact, cryptocur-
rency and blockchain are highly coupled, even though blockchain technology has a much
wider span of implementation possibilities that remain to be studied. It is still in its
early development phase; hence, there is a lot of hype about its potential.
Some DL models have already appeared in cryptocurrency studies, mostly for price
prediction or trading systems. However, there is still a lack of blockchain research within
the DL community. Given the attention that the underlying technology has attracted,
there is a great chance that new studies will start appearing in the near future. Table 9
tabulates the studies on cryptocurrency and blockchain research.
Table 9: Cryptocurrency and Blockchain Studies

Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Env.
[48] | Bitcoin, Dash, Ripple, Monero, Litecoin, Dogecoin, Nxt, Namecoin | 2014-2017 | MA, BOLL, the CRIX daily returns, Euribor interest rates, OCHLV of EURO/UK, EURO/USD, US/JPY | LSTM, RNN, MLP | Accuracy, F1-measure | Python, Tensorflow
[66] | Cryptocurrencies, Bitcoin | 2014-2017 | Price data | CNN | Accumulative portfolio value, MDD, SR | -
[140] | 12 most-volumed cryptocurrencies | 2015-2016 | Price data | CNN + RL | SR, portfolio value, MDD | -
[145] | Bitcoin data | 2010-2017 | Hash value, bitcoin address, public/private key, digital signature, etc. | Takagi-Sugeno fuzzy cognitive maps | Analytical hierarchy process | -

[146] | Bitcoin data | 2012, 2013, 2016 | TransactionId, input/output addresses, timestamp | Graph embedding using heuristic, Laplacian eigenmap, deep AE | F1-score | -
[147] | Bitcoin, Litecoin, StockTwits | 2015-2018 | OCHLV, technical indicators, sentiment analysis | CNN, LSTM, State Frequency Model | MSE | Keras, Tensorflow
[148] | Bitcoin | 2013-2016 | Price data | Bayesian optimized RNN, LSTM | Sensitivity, specificity, precision, accuracy, RMSE | Keras, Python, Hyperas

Chen et al. [145] proposed a blockchain transaction traceability algorithm using a Takagi-
Sugeno fuzzy cognitive map and a 3-layer DMLP. Bitcoin data (hash value, bitcoin address,
public/private key, digital signature, etc.) was used as the dataset. Nan et al. [146] proposed
a method for bitcoin mixing detection that consisted of several stages: constructing the
Bitcoin transaction graph, implementing node embedding, and detecting outliers with an AE.
Lopes et al. [147] combined opinion mining and price prediction for cryptocurrency
trading. Text mining combined with two models, CNN and LSTM, was used to extract
opinions. Bitcoin, Litecoin and StockTwits data were used as the dataset. Open, Close, High,
Low, Volume (OCHLV) prices, technical indicators, and sentiment analysis were used as the
feature set.
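To make the outlier-detection stage in such pipelines concrete, the sketch below fits a linear autoencoder in closed form (for a linear AE, the optimal k-dimensional bottleneck is the top-k principal subspace, obtainable via SVD) and flags points with high reconstruction error. The per-address feature matrix is synthetic and purely hypothetical; the surveyed papers train nonlinear deep AEs on real transaction graphs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-address features (degree, volume, timing statistics, ...):
# 200 "normal" addresses lying on a 2-D latent structure, plus 5 injected
# mixing-like anomalies that do not follow that structure.
normal = rng.normal(0.0, 1.0, (200, 2)) @ rng.normal(0.0, 1.0, (2, 6))
anomalies = rng.normal(0.0, 4.0, (5, 6))
X = np.vstack([normal, anomalies])
X = X - X.mean(axis=0)

# A *linear* autoencoder with a k-dim bottleneck has a closed-form optimum:
# the top-k principal subspace, computed here with SVD to stay dependency-free.
k = 2
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:k].T                      # 6 x 2 encoder (decoder is its transpose)
recon = X @ P @ P.T               # encode to 2-D, then decode back to 6-D
err = ((X - recon) ** 2).sum(axis=1)

flagged = np.argsort(err)[-5:]    # highest reconstruction error -> outliers
print(sorted(flagged.tolist()))
```

Points far from the learned latent structure reconstruct poorly, which is exactly the signal a deep AE uses for anomaly detection.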
In another study, Jiang et al. [66] presented a financial-model-free RL framework for
cryptocurrency portfolio management based on three proposed models: basic RNN, LSTM
and CNN. In [140], portfolio management was implemented with CNN and DRL on the
12 most-volumed cryptocurrencies; Bitcoin, Ethereum, Bitcoin Cash and Digital Cash were
used as the dataset.
In addition, Spilak et al. [48] used eight cryptocurrencies (Bitcoin, Dash, Ripple, Monero,
Litecoin, Dogecoin, Nxt, Namecoin) to construct a dynamic portfolio using LSTM, RNN
and MLP methods. McNally et al. [148] compared a Bayesian optimized RNN, LSTM and
Autoregressive Integrated Moving Average (ARIMA) model to predict the bitcoin price
direction. Sensitivity, specificity, precision, accuracy and Root Mean Square Error (RMSE)
were used as the performance metrics.
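To illustrate the recurrence behind such LSTM-based direction predictors, the sketch below runs a single, untrained, randomly initialized LSTM cell over a window of synthetic daily returns and maps the final hidden state to an up-move probability. All weights and the price path are hypothetical stand-ins; the surveyed studies learn these weights by backpropagation through time.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(returns, Wx, Wh, b):
    """Run a single-layer LSTM over a 1-D sequence of daily returns.

    Wx: (4h, 1) input weights, Wh: (4h, h) recurrent weights, b: (4h,) bias;
    gate order: input, forget, cell candidate, output.
    """
    h_dim = Wh.shape[1]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    for r in returns:
        z = Wx @ np.array([r]) + Wh @ h + b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g          # cell state carries long-term memory
        h = o * np.tanh(c)         # hidden state summarises the sequence
    return h

rng = np.random.default_rng(1)
h_dim = 8
Wx = rng.normal(0, 0.5, (4 * h_dim, 1))
Wh = rng.normal(0, 0.5, (4 * h_dim, h_dim))
b = np.zeros(4 * h_dim)

prices = 100 + np.cumsum(rng.normal(0, 1, 61))   # synthetic price path
returns = np.diff(prices) / prices[:-1]          # 60-day return window

h_last = lstm_forward(returns, Wx, Wh, b)
w_out = rng.normal(0, 0.5, h_dim)
p_up = sigmoid(w_out @ h_last)                   # P(next-day move is up)
print(round(float(p_up), 3))
```

The gating structure (forget/input/output) is what lets these models retain information over long return histories, which plain RNNs struggle with.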

4.7. Financial Sentiment Analysis and Behavioral Finance


One of the most important components of behavioral finance is emotion, or investor
sentiment. Lately, advancements in text mining techniques have opened up the possibility of
successful sentiment extraction from social media feeds. There is a growing interest
in financial sentiment analysis, especially for trend forecasting and algo-trading model
development. Kearney et al. [149] surveyed ML-based financial sentiment analysis studies
that use textual data.
Nowadays there is broad interest in sentiment analysis for financial forecasting research
using DL models. Table 10 provides information about the sentiment analysis studies
that are focused on financial forecasting and based on text mining.
In [150], technical analysis (MACD, Moving Average (MA), Directional Movement Index
(DMI), Exponential Moving Average (EMA), Triple Exponential Moving Average (TEMA),
Momentum, RSI, Commodity Channel Index (CCI), Stochastic Oscillator, Rate of Change
(ROC)) and sentiment analysis (using social media) were used to predict stock prices.
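For reference, minimal NumPy versions of a few of the indicators listed above (EMA, MACD, RSI) can be written as follows. The price series is a made-up example; production systems typically compute these with a library such as TA-Lib.

```python
import numpy as np

def ema(prices, span):
    """Exponential moving average with smoothing alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = np.empty_like(prices, dtype=float)
    out[0] = prices[0]
    for t in range(1, len(prices)):
        out[t] = alpha * prices[t] + (1 - alpha) * out[t - 1]
    return out

def macd(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA."""
    return ema(prices, fast) - ema(prices, slow)

def rsi(prices, period=14):
    """Relative Strength Index over the last `period` price changes (0..100)."""
    delta = np.diff(prices)[-period:]
    gain = delta[delta > 0].sum()
    loss = -delta[delta < 0].sum()
    if loss == 0:
        return 100.0
    rs = gain / loss
    return 100.0 - 100.0 / (1.0 + rs)

prices = np.array([44.0, 44.5, 45.1, 44.8, 45.6, 46.0, 45.7, 46.3,
                   46.9, 46.5, 47.2, 47.8, 47.5, 48.1, 48.6])
print(macd(prices)[-1], rsi(prices))
```

These indicator values are exactly the kind of engineered features that appear in the "Feature Set" column of the tables in this section.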
Shi et al. [151] proposed a method that visually interprets text-based DL models in predicting
stock price movements; they used financial news from Reuters and Bloomberg.
In [152], text mining and word embeddings were used to extract information from financial
news from Reuters and Bloomberg to predict stock price movements. In addition, in
[153], index prices and emotional data from text posts were used to predict the
next day's stock opening price. Zhongshengz [154] performed classification and stock
price prediction using text and price data. Das et al. [155] used Twitter sentiment data and
stock price data to predict the prices of Google, Microsoft and Apple stocks.
Prosky et al. [156] performed sentiment and mood prediction using news from Reuters and
used these sentiments for price prediction. Li et al. [157] used sentiment classification
(neutral, positive, negative) for stock open/close price prediction with various LSTM
models. They compared their results with SVM and achieved higher overall performance.
Iwasaki et al. [137] used analyst reports for sentiment analysis through text mining and word
embeddings. They used the sentiment features as inputs to a DFNN model for stock price
prediction. Finally, different portfolio selections were implemented based on the projected
stock returns.
In a different study, Huang et al. [158] used several models, including Hidden Markov
Model (HMM), DMLP and CNN, with Twitter moods along with financial price data
to predict the next day's move (up or down). CNN achieved the best result.
Table 10: Financial Sentiment Studies coupled with Text Mining for Forecasting

| Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Env. |
|------|----------|--------|-------------|--------|----------------------|------|
| [137] | Analyst reports on the TSE and Osaka Exchange | 2016-2018 | Text | LSTM, CNN, Bi-LSTM | Accuracy, R2 | R, Python, MeCab |
| [150] | Sina Weibo, Stock market records | 2012-2015 | Technical indicators, sentences | DRSE | F1-score, precision, recall, accuracy, AUROC | Python |
| [151] | News from Reuters and Bloomberg for S&P500 stocks | 2006-2015 | Financial news, price data | DeepClue | Accuracy | Dynet software |
| [152] | News from Reuters and Bloomberg, Historical stock security data | 2006-2013 | News, price data | DNN | Accuracy | - |
| [153] | SCI prices | 2008-2015 | OCHL of change rate, price | Emotional Analysis + LSTM | MSE | - |
| [154] | SCI prices | 2013-2016 | Text data and Price data | LSTM | Accuracy, F1-measure | Python, Keras |
| [155] | Stocks of Google, Microsoft and Apple | 2016-2017 | Twitter sentiment and stock prices | RNN | - | Spark, Flume, Twitter API |
| [156] | 30 DJIA stocks, S&P500, DJI, news from Reuters | 2002-2016 | Price data and features from news articles | LSTM, NN, CNN and word2vec | Accuracy | VADER |
| [157] | Stocks of CSI300 index, OCHLV of CSI300 index | 2009-2014 | Sentiment Posts, Price data | Naive Bayes + LSTM | Precision, Recall, F1-score, Accuracy | Python, Keras |
| [158] | S&P500, NYSE Composite, DJIA, NASDAQ Composite | 2009-2011 | Twitter moods, index data | DNN, CNN | Error rate | Keras, Theano |

Even though financial sentiment is highly coupled with text mining, we decided to present
those two topics in separate subsections. The main reason for this choice is not only
the existence of some financial sentiment studies that do not directly depend on financial
textual data (like [158]), but also the existence of financial text mining studies that are
not used for sentiment analysis, which will be covered in the next section.

4.8. Financial Text Mining


With the rapid spread of social media and real-time streaming news/tweets, instant
text-based information retrieval became available for financial model development. As a
result, financial text mining studies have become very popular in recent years. Even though
some of these studies are directly interested in sentiment analysis through crowdsourcing,
many implementations are interested in content retrieval from news, financial statements,
disclosures, etc. through analyzing the text context. There are a few ML surveys
focused on text mining and news analytics. Among the noteworthy studies, Mitra
et al. [159] edited a book on news analytics in finance, whereas Li et al. [160], Loughran et
al. [161] and Kumar et al. [162] surveyed the studies of textual analysis of financial documents,
news and corporate disclosures. It is worth mentioning that there are also some studies
[163, 164] of text mining for financial prediction models.
The previous section focused on DL models using sentiment analysis specifically tailored
for financial forecasting implementations, whereas this section covers DL studies
of text mining without sentiment analysis for forecasting (Table 11), financial
sentiment analysis coupled with text mining without forecasting intent (Table 12), and
other text mining implementations (Table 13), respectively.
Huynh et al. [165] used financial news from Reuters and Bloomberg together with stock
price data to predict future stock movements. In [166], different event types concerning
Chinese companies were classified based on a novel event-type pattern classification algorithm;
in addition, stock prices were predicted using additional inputs. Kraus et al. [167]
implemented LSTM with transfer learning using text mining of financial news and stock
market data. Dang et al. [168] used Stock2Vec and Two-stream GRU (TGRU) models to
generate the input data from financial news and stock prices for the classification of
stock prices.
In [169], events were detected from Reuters and Bloomberg news through text mining.
The extracted information was used for price prediction and stock trading through a CNN
model. Vargas et al. [170] used text mining and price prediction together for intraday
directional movement estimation. Akita et al. [171] implemented a method that used text
mining and price prediction together for forecasting prices. Verma et al. [172] combined
news data with financial data to classify stock price movements. Bari et al. [69] used
text mining to extract information from tweets and news; in their method, time
series models were used for stock trade signal generation. In [173], a method that performed
information fusion from news and social media sources was proposed to predict the trend of
stocks.
In [174], social media news was used to predict the index price and the index direction
with RNN-Boost using Latent Dirichlet Allocation (LDA) features. Hu et al. [175]
proposed a novel method that used text mining techniques and Hybrid Attention Networks
based on financial news for forecasting the trend of stocks. Li et al. [176] implemented
intraday stock price direction classification using financial news and stock prices. In
[177], financial news data and word embeddings with Word2vec were used to create
the inputs for a Recurrent CNN (RCNN) to predict the stock price.
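A minimal sketch of the kind of text featurization several of these studies build on (e.g. the TF-IDF features of [176]) is shown below. The headlines are invented for illustration; real pipelines use much larger corpora and often learned embeddings such as word2vec instead of, or alongside, TF-IDF.

```python
import math
from collections import Counter

# Toy corpus of hypothetical financial news headlines.
headlines = [
    "company beats earnings forecast shares rally",
    "regulator fines company over disclosure",
    "shares fall after weak earnings guidance",
]

docs = [h.split() for h in headlines]
n_docs = len(docs)
vocab = sorted({w for d in docs for w in d})
df = Counter(w for d in docs for w in set(d))   # document frequency per term

def tfidf(doc):
    """Map one tokenised headline to a TF-IDF vector over the corpus vocab."""
    tf = Counter(doc)
    return [tf[w] / len(doc) * math.log(n_docs / df[w]) for w in vocab]

X = [tfidf(d) for d in docs]   # feature matrix fed to a downstream classifier
print(len(X), len(X[0]))
```

Each headline becomes a fixed-length numeric vector, which is what allows a DNN, RCNN or GRU to consume news text alongside price features.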
Table 11: Text Mining Studies without Sentiment Analysis for Forecasting

| Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Env. |
|------|----------|--------|-------------|--------|----------------------|------|
| [69] | Energy-Sector/Company-Centric Tweets in S&P500 | 2015-2016 | Text and Price data | - | Return, SR, precision, recall, accuracy | Python, Tweepy API |
| [165] | News from Reuters, Bloomberg | 2006-2013 | Financial news, price data | Bi-GRU | Accuracy | Python, Keras |
| [166] | News from Sina.com, ACE2005 Chinese corpus | 2012-2016 | A set of news text | Their unique algorithm | Precision, Recall, F1-score | - |
| [167] | CDAX stock market data | 2010-2013 | Financial news, stock market data | LSTM | MSE, RMSE, MAE, Accuracy, AUC | TensorFlow, Theano, Python, Scikit-Learn |
| [168] | Apple, Airbus, Amazon news from Reuters, Bloomberg, S&P500 stock prices | 2006-2013 | Price data, news, technical indicators | TGRU, stock2vec | Accuracy, precision, AUROC | Keras, Python |
| [169] | S&P500 Index, 15 stocks in S&P500 | 2006-2013 | News from Reuters and Bloomberg | CNN | Accuracy, MCC | - |
| [170] | S&P500 index news from Reuters | 2006-2013 | Financial news titles, Technical indicators | SI-RCNN (LSTM + CNN) | Accuracy | - |
| [171] | 10 stocks in Nikkei 225 and news | 2001-2008 | Textual information and Stock prices | Paragraph Vector + LSTM | Profit | - |
| [172] | NIFTY50 Index, NIFTY Bank/Auto/IT/Energy Index, News | 2013-2017 | Index data, news | LSTM | MCC, Accuracy | - |
| [173] | Price data, index data, news, social media data | 2015 | Price data, news from articles and social media | Coupled matrix and tensor | Accuracy, MCC | Jieba |
| [174] | HS300 | 2015-2017 | Social media news, price data | RNN-Boost with LDA | Accuracy, MAE, MAPE, RMSE | Python, Scikit-learn |
| [175] | News and Chinese stock data | 2014-2017 | Selected words in a news | HAN | Accuracy, Annual return | - |
| [176] | News, stock prices from Hong Kong Stock Exchange | 2001 | Price data and TF-IDF from news | ELM, DLR, PCA, BELM, KELM, NN | Accuracy | Matlab |
| [177] | TWSE index, 4 stocks in TWSE | 2001-2017 | Technical indicators, Price data, News | CNN + LSTM | RMSE, Profit | Keras, Python, TALIB |
| [178] | Stock of Tsugami Corporation | 2013 | Price data | LSTM | RMSE | Keras, Tensorflow |
| [179] | News, Nikkei Stock Average and 10-Nikkei companies | 1999-2008 | News, MACD | RNN, RBM+DBN | Accuracy, P-value | - |
| [180] | ISMIS 2017 Data Mining Competition dataset | - | Expert identifier, classes | LSTM + GRU + FFNN | Accuracy | - |
| [181] | Reuters, Bloomberg News, S&P500 price | 2006-2013 | News and sentences | LSTM | Accuracy | - |
| [182] | APPL from S&P500 and news from Reuters | 2011-2017 | Input news, OCHLV, Technical indicators | CNN + LSTM, CNN + SVM | Accuracy, F1-score | Tensorflow |
| [183] | Nikkei225, S&P500, news from Reuters and Bloomberg | 2001-2013 | Stock price data and news | DGM | Accuracy, MCC, %profit | - |
| [184] | Stocks from S&P500 | 2006-2013 | Text (news) and Price data | LAR+News, RF+News | MAPE, RMSE | - |

Minami et al. [178] proposed a method that predicted the stock price with corporate
action event information and macro-economic index data using LSTM. In [179], a novel
method that used a combination of RBM, DBN and word embeddings to create word vectors
for an RNN-RBM-DBN network was proposed to predict stock prices. Buczkowski et al.
[180] proposed a novel method that used expert recommendations and an ensemble of GRU
and LSTM for price prediction.
In [181], a novel method using a character-based neural language model over financial
news and LSTM was proposed. Liu et al. [182] proposed a method that used word
embeddings with word2vec, technical analysis features and stock prices for price prediction.
In [183], a Deep Neural Generative Model (DGM) with news articles processed by the
Paragraph Vector algorithm was used to create the input vector for stock price prediction.
In [184], stock price data and word embeddings were used for stock price prediction; the
results showed that the information extracted from embedded news improves performance.
Table 12: Financial Sentiment Studies coupled with Text Mining without Forecasting

| Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Env. |
|------|----------|--------|-------------|--------|----------------------|------|
| [95] | 883 BHC from EDGAR | 2006-2017 | Tokens, weighted sentiment polarity, leverage and ROA | CNN, LSTM, SVM, Random Forest | Accuracy, Precision, Recall, F1-score | Keras, Python, Scikit-learn |
| [185] | SemEval-2017 dataset, financial text, news, stock market data | 2017 | Sentiments in Tweets, News headlines | Ensemble SVR, CNN, LSTM, GRU | Cosine similarity score, agreement score, class score | Python, Keras, Scikit Learn |
| [186] | Financial news from Reuters | 2006-2015 | Word vector, Lexical and Contextual input | Targeted dependency tree LSTM | Cumulative abnormal return | - |
| [187] | Stock sentiment analysis from StockTwits | 2015 | StockTwits messages | LSTM, Doc2Vec, CNN | Accuracy, precision, recall, f-measure, AUC | - |
| [188] | Sina Weibo, Stock market records | 2012-2015 | Technical indicators, sentences | DRSE | F1-score, precision, recall, accuracy, AUROC | Python |
| [189] | News from NowNews, AppleDaily, LTN, MoneyDJ for 18 stocks | 2013-2014 | Text, Sentiment | - | Return | Python, Tensorflow |
| [190] | StockTwits | 2008-2016 | Sentences, StockTwits messages | CNN, LSTM, GRU | MCC, WSURT | Keras, Tensorflow |
| [191] | Financial statements of Japan companies | - | Sentences, text | DNN | Precision, recall, f-score | - |
| [192] | Twitter posts, news headlines | - | Sentences, text | Deep-FASP | Accuracy, MSE, R2 | - |
| [193] | Forums data | 2004-2013 | Sentences and keywords | Recursive neural tensor networks | Precision, recall, f-measure | - |
| [194] | News from Financial Times related US stocks | - | Sentiment of news headlines | SVR, Bidirectional LSTM | Cosine similarity | Python, Scikit Learn, Keras, Tensorflow |

Akhtar et al. [185] compared CNN, LSTM and GRU based DL models against MLP for
financial sentiment analysis. Rawte et al. [95] tried to solve three separate problems using
CNN, LSTM, SVM and RF: bank risk classification, sentiment analysis and Return on Assets
(ROA) regression.
Chang et al. [186] implemented the estimation of information content polarity (negative/
positive effect) with text mining, word vectors, lexical and contextual input, and various
LSTM models. They used financial news from Reuters.
Jangid et al. [187] proposed a novel method combining LSTM and CNN
for word embedding and sentiment analysis, using Bidirectional LSTM (Bi-LSTM) for aspect
extraction; the proposed method used a multichannel CNN for financial sentiment analysis.
Shijia et al. [188] used an attention-based LSTM for financial sentiment analysis using
news headlines and microblog messages. Sohangir et al. [189] used LSTM, doc2vec, CNN
and stock market opinions posted in StockTwits for sentiment analysis. Mahmoudi et al.
[190] extracted tweets from StockTwits to identify user sentiment; in the evaluation,
they also used emojis for sentiment analysis. Kitamori et al. [191] extracted
sentiments from financial news and used DNN to classify positive and negative news.
In [192], sentiment/aspect prediction was implemented using an ensemble of LSTM,
CNN and GRU networks. In a different study, Li et al. [193] proposed a DL based sentiment
analysis method using RNN to identify the top sellers in the underground economy. Moore
et al. [194] used text mining techniques for sentiment analysis of financial news.
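At its simplest, the headline sentiment scoring these studies build on can be reduced to lexicon matching, as sketched below. The mini-lexicon is purely illustrative; the surveyed papers learn sentiment with LSTM/CNN models or use curated lexicons such as VADER.

```python
# Hypothetical mini-lexicon of positive and negative finance terms.
POS = {"beat", "rally", "growth", "upgrade", "profit"}
NEG = {"fine", "fall", "weak", "downgrade", "loss"}

def sentiment_score(headline):
    """Score in [-1, 1]: (positive - negative) / total matched sentiment words."""
    tokens = headline.lower().split()
    pos = sum(t in POS for t in tokens)
    neg = sum(t in NEG for t in tokens)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

print(sentiment_score("shares rally on profit beat"))    # all matches positive
print(sentiment_score("weak guidance triggers fall"))    # all matches negative
```

Deep models improve on this baseline mainly by handling negation, context and domain-specific word usage that fixed lexicons miss.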
Table 13: Other Text Mining Studies

| Art. | Data Set | Period | Feature Set | Method | Performance Criteria | Env. |
|------|----------|--------|-------------|--------|----------------------|------|
| [73] | News from NowNews, AppleDaily, LTN, MoneyDJ for 18 stocks | 2013-2014 | Text, Sentiment | DNN | Return | Python, Tensorflow |
| [96] | The event data set for large European banks, news articles from Reuters | 2007-2014 | Word, sentence preprocess | DNN + NLP | Relative usefulness, F1-score | - |
| [97] | Event dataset on European banks, news from Reuters | 2007-2014 | Text, sentence | Sentence vector + DFFN | Usefulness, F1-score, AUROC | - |
| [98] | News from Reuters, fundamental data | 2007-2014 | Financial ratios and news text | doc2vec + NN | Relative usefulness | Doc2vec |
| [121] | Real-world data for automobile insurance company labeled as fraudulent | - | Car, insurance and accident related features | DNN + LDA | TP, FP, Accuracy, Precision, F1-score | - |
| [123] | Financial transactions | - | Transaction data | LSTM | t-SNE | - |
| [195] | Taiwan's National Pension Insurance | 2008-2014 | Insured's id, area-code, gender, etc. | RNN | Accuracy, total error | Python |
| [196] | StockTwits | 2015-2016 | Sentences, StockTwits messages | Doc2vec, CNN | Accuracy, precision, recall, f-measure, AUC | Python, Tensorflow |

In [195], individual social security payment types (paid, unpaid, repaid, transferred) were
classified and predicted using LSTM, HMM and SVM. Sohangir et al. [196] used two neural
network models (doc2vec, CNN) to find the top authors in StockTwits messages and to
classify authors as expert or non-expert.
In [123], the character sequences in financial transactions and the responses from the
counterparty were used with LSTM to detect whether a transaction was fraudulent. Wang et al.
[121] used text mining and DNN models to detect automobile insurance fraud.
In [96], news semantics were extracted via word sequence learning, and bank stress
was determined and classified along with the associated events. Day et al. [73] used financial
sentiment analysis via text mining and DNN for stock algorithmic trading.
Cerchiello et al. [98] used fundamental data and text mining of financial news
(Reuters) to classify bank distress. In [97], bank distress was identified by extracting
data from financial news through text mining; the proposed method applied a DFNN
to semantic sentence vectors to classify whether an event occurred.

4.9. Theoretical or Conceptual Studies


There were a number of research papers that either focused on the theoretical
concepts of finance or presented conceptual designs without a model implementation phase;
however, they still provide valuable information, so we decided to include them in our survey. In
Table 14, these studies are tabulated according to their topic of interest.
In [197], the connection between deep AEs and Singular Value Decomposition (SVD)
was discussed; the two were compared using stocks from the iShares Nasdaq Biotechnology
ETF (IBB) index and the stock of Amgen Inc. Bouchti et al. [198] explained the details of DRL and
mentioned that DRL could be used for fraud detection/risk management in banking.

Table 14: Other - Theoretical or Conceptual Studies

| Art. | SubTopic | IsTimeSeries? | Data Set | Period | Feature Set | Method |
|------|----------|---------------|----------|--------|-------------|--------|
| [197] | Analysis of AE, SVD | Yes | Selected stocks from the IBB index and stock of Amgen Inc. | 2012-2014 | Price data | AE, SVD |
| [198] | Fraud Detection in Banking | No | Risk Management / Fraud Detection | - | - | DRL |

4.10. Other Financial Applications


Finally, there were some research papers that did not fit into any of the previously
covered topics. Their data sets and intended outputs differed from most of the other
studies covered in this survey. These studies include social security payment classification,
bank telemarketing success prediction, hardware solutions for faster financial transaction
processing, etc. There were some anomaly detection implementations, such as tax evasion and
money laundering, that could have been included in this group; however, we decided to cover
them in a different subsection, fraud detection. Table 15 shows all these aforementioned
studies with their differences.
Dixon et al. [199] used an Intel Xeon Phi to speed up the price movement direction prediction
problem using DFFN; the main contribution of the study was the increased processing
speed. Alberg et al. [200] used several company financials (fundamental data)
together with price data to predict the next period's company financials. Kim et al. [201]
used CNN for predicting the success of bank telemarketing; in their study, they used
phone call records of bank marketing data and 16 finance-related attributes. Lee et al. [202]
used technical indicators and patent information to estimate corporate revenue and profit
using an RBM based DBN, FFNN and Support Vector Regressor (SVR).
Ying et al. [195] classified and predicted individual social security payment types (paid,
unpaid, repaid, transferred) using LSTM, HMM and SVM. Li et al. [193] proposed a deep
learning-based sentiment analysis method to identify the top sellers in the underground
economy. Jeong et al. [49] combined deep Q-learning and a deep NN to solve three
separate problems: increasing profit in a market, predicting the number of
shares to trade, and preventing overfitting with insufficient financial data.
Table 15: Other Financial Applications

| Art. | Subtopic | Data Set | Period | Feature Set | Method | Performance Criteria | Env. |
|------|----------|----------|--------|-------------|--------|----------------------|------|
| [49] | Improving trading decisions | S&P500, KOSPI, HSI, and EuroStoxx50 | 1987-2017 | 200-days stock price | Deep Q-Learning and DNN | Total profit, Correlation | - |
| [193] | Identifying Top Sellers in Underground Economy | Forums data | 2004-2013 | Sentences and keywords | Recursive neural tensor networks | Precision, recall, f-measure | - |
| [195] | Predicting Social Ins. Payment Behavior | Taiwan's National Pension Insurance | 2008-2014 | Insured's id, area-code, gender, etc. | RNN | Accuracy, total error | Python |
| [199] | Speedup | 45 CME listed commodity and FX futures | 1991-2014 | Price data | DNN | - | - |
| [200] | Forecasting Fundamentals | Stocks in NYSE, NASDAQ or AMEX exchanges | 1970-2017 | 16 fundamental features from balance sheet | MLP, LFM | MSE, Compound annual return, SR | - |
| [201] | Predicting Bank Telemarketing | Phone calls of bank marketing data | 2008-2010 | 16 finance-related attributes | CNN | Accuracy | - |
| [202] | Corporate Performance Prediction | 22 pharmaceutical companies data in US stock market | 2000-2015 | 11 financial and 4 patent indicator | RBM, DBN | RMSE, profit | - |

5. Current Snapshot of DL Research for Financial Applications


For this survey, we reviewed 144 papers from various financial application areas. Each
paper was analyzed according to its topic, publication type, problem type, method, dataset,
feature set and performance criteria. Due to space limitations, we will only provide
general summary statistics indicating the current state of DL research for finance.

[Figure 8: The histogram of Publication Count in Topics — bar chart comparing all-years and last-3-years publication counts per topic]

First and foremost, we clustered the various topics within financial applications
research and presented them in Figure 8. A quick glance at the figure shows that financial text
mining and algorithmic trading are the two fields researchers have worked on most,
followed by risk assessment, sentiment analysis, portfolio management and fraud detection,
respectively. The results indicate that most of the papers were published within the last 3 years,
implying the domain is very hot and actively studied. We can also observe this phenomenon
by analyzing Figure 9. It is also worth mentioning that the few papers published
before 2013 all used RNN based models.
[Figure 9: The histogram of Publication Count in Years — bar chart of publication counts per year, 1998-2018]

[Figure 10: The histogram of Publication Count in Model Types — all-years vs. last-3-years counts for RNN, CNN, DMLP, RL, RBM, DBN and other models]

When the papers are clustered by DL model type, as presented in Figure 10, we
observe the dominance of RNN, DMLP and CNN over the remaining models, which might
be expected, since these are the most commonly preferred models in general DL
implementations. Meanwhile, RNN is an umbrella model with several versions,
including LSTM and GRU. Within the RNN choice, most of the models actually belonged
to LSTM, which is very popular in time series forecasting and regression problems and is also
used quite often in algorithmic trading. More than 70% of the RNN papers used
LSTM models.

Figure 11: Wordcloud of most-used Software, Frameworks, Environments

Figure 11 presents the software and frameworks commonly used for DL model
implementations as a word cloud, whereas Figure 12 provides details about the development
environments. The left chart (Figure 12a) presents the high-level view, where Python had
the lion's share with 80% over R (with 10%) and the other languages. The right chart
(Figure 12b) details how developers use Python through different libraries and frameworks.

[Figure 12: Distribution of Preferred Environments — (a) Preferred Development Environments: Python (80.1%), R, Spark, Java, Matlab and others; (b) Preferred Python Libraries: Keras, TensorFlow, scikit-learn, pandas, Theano, NumPy, Torch, Caffe, PyTorch, DyNet, Jieba, TALIB, Hyperas, VADER, TextBlob]

Meanwhile, DMLP generally fits classification problems well; hence it is a common
choice for most financial application areas. Moreover, since it is a natural extension of
its shallow counterpart, MLP, it has a longer history than the other DL models.

Figure 13: Top Journals (the number next to each journal is its impact factor; the chart distinguishes last-3-years from other-years publication counts)

| Journal | Impact Factor |
|---------|---------------|
| Expert Systems with Applications | 4.292 |
| Decision Support Systems | 3.847 |
| Applied Soft Computing | 4.873 |
| Applied Stochastic Models in Business and Industry | 1.124 |
| European Journal of Operational Research | 3.806 |
| IEEE Transactions on Neural Networks | 2.633 |
| Knowledge-based Systems | 5.101 |
| Neurocomputing | 4.072 |
| SSRN - Social Science Research Network | - |
| Algorithmic Finance | - |
| Applied Mathematics and Computation | 3.092 |
| Data & Knowledge Engineering | 1.583 |
| Electronic Commerce Research and Applications | 2.911 |
| Engineering Applications of Artificial Intelligence | 3.526 |
| Engineering Letters | - |
| Frontiers in Signal Processing | - |
| Future Generation Computer Systems | 5.768 |
| IEEE Access | 4.098 |
| IEEE Transactions on Industrial Informatics | 7.377 |
| IEEE Transactions on Knowledge and Data Engineering | 3.857 |
| IEEE Transactions on Neural Networks and Learning Systems | 11.683 |
| IEICE Transactions on Information and Systems | - |
| International Journal of Computer Applications | 3.12 |
| International Journal of Intelligent Systems and Applications in Engineering | - |
| International Journal of Machine Learning and Computing | - |
| Journal of Big Data | - |
| Journal of Computational Science | 2.502 |
| Journal of Mathematical Finance | 0.39 |
| Neural Computing and Applications | 4.664 |
| Pattern Recognition Letters | 2.810 |
| Plos One | 2.776 |
| Sustainability | 2.592 |
| The Journal of Supercomputing | 2.16 |

CNN started getting more attention lately, since most of its implementations appeared
within the last 3 years. Careful analysis of the CNN papers indicates a growing trend of
representing financial data as 2-D images in order to utilize CNN. Hence,
CNN based models might overtake the other models in the future; they have actually
already passed DMLP over the last 3 years.
The top journals are tabulated in Figure 13. The journals with the most published papers in
the last 3 years include Expert Systems with Applications, Decision Support Systems, Applied
Soft Computing, Neurocomputing, Knowledge-based Systems and the European Journal
of Operational Research.

6. Discussion and Open Issues


After reviewing all the publications based on the selection criteria explained in the previous
section, we present our findings on the current state of the art. Our
discussions are categorized by DL models and implementation topics.

6.1. Discussions on DL Models


It is possible to claim that LSTM is the dominant DL model preferred by most
researchers, due to its well-established structure for financial time series forecasting. Most
financial implementations involve time-varying data representations requiring regression-type
approaches, which fit very well with LSTM and its derivatives due to their easy adaptation
to such problems. As long as the temporal nature of financial data remains, LSTM
and its related family of models will maintain their popularity.
Meanwhile, CNN based models have gained traction among researchers in the
last two years. Unlike LSTM, CNN works better for classification problems and is more
suitable for non-time-varying or static data representations. Since most
financial data is time-varying, CNN is not, under normal circumstances, the natural choice
for financial applications. However, in some independent studies, researchers performed
an innovative transformation of 1-D time-varying financial data into 2-D, mostly stationary,
image-like data in order to utilize the power of CNN through adaptive filtering and implicit
dimensionality reduction. This novel approach seems to work remarkably well for complex
financial patterns regardless of the application area. In the future, such
implementations might become more common; only time will tell.
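One published example of such a 1-D-to-2-D transformation is the Gramian Angular Field encoding; the sketch below applies it to a synthetic price window (the window length and price path are arbitrary choices for illustration, and other encodings, such as candlestick images or stacked indicator matrices, appear in the literature as well).

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D series as a 2-D image (Gramian Angular Summation Field).

    Values are rescaled to [-1, 1], mapped to angles phi = arccos(x), and the
    pixel at (i, j) is cos(phi_i + phi_j), giving a square "image" a CNN can consume.
    """
    x = np.asarray(series, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1.0   # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])

rng = np.random.default_rng(2)
closes = 100 + np.cumsum(rng.normal(0, 1, 32))   # 32-day synthetic price window
img = gramian_angular_field(closes)              # 32 x 32 "price image"
print(img.shape)
```

The resulting matrix preserves temporal correlations as spatial structure, which is what makes convolutional filters applicable.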
Another model attracting rising interest is the DRL based implementation, in particular those coupled with agent-based modelling. Even though algorithmic trading is the most preferred implementation area for such models, working structures can be developed for any problem type.
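The agent-based flavor of these approaches can be illustrated with a deliberately tiny tabular Q-learning trader on a synthetic price series; real DRL trading systems replace the table with a deep network and use far richer state and reward definitions (all choices below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
prices = 100 + np.cumsum(rng.normal(0.05, 1.0, 1000))  # drifting toy price series
returns = np.diff(prices)

# State: was yesterday's return up (1) or down (0). Actions: 0 = flat, 1 = long.
Q = np.zeros((2, 2))
alpha, gamma, eps = 0.1, 0.9, 0.1      # learning rate, discount, exploration

for t in range(1, len(returns) - 1):
    s = int(returns[t - 1] > 0)
    # epsilon-greedy action selection
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
    reward = a * returns[t]            # P&L of holding (or not holding) today
    s_next = int(returns[t] > 0)
    # standard Q-learning update toward the bootstrapped target
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

print(Q)                               # learned action values per state
```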
Careful analysis of the reviewed papers indicates that in most studies hybrid models are preferred over native models for better performance. Many researchers tune topologies and network parameters to achieve higher performance. However, there is a danger in creating overly complex hybrid models that are hard to build and difficult to interpret.
The performance evaluation results suggest that, in general terms, DL models outperform their ML counterparts on the same problems. DL models also have the advantage of being able to work on larger amounts of data. With the growing expansion of open-source DL libraries and frameworks (Figure 11), the DL model building and development process is easier than ever, and this phenomenon is also supported by the increasing interest in adapting DL models to all areas of finance, which can be observed in Figure 9.
It is also worth mentioning that, besides DL's outperformance of ML, the performance evaluation results are improving every year, even though it is very difficult to explicitly quantify the amount of improvement. The improvements are most notable in trend-prediction-based algo-trading implementations and text-mining studies, owing to deeper and/or more versatile networks and innovative new model designs. This is also reflected in the increasing number of papers published year over year.

6.2. Discussions on Implementation Areas


Price/trend prediction and algo-trading models attract the most interest among all financial applications that use DL models. Risk assessment and portfolio management have always been popular within the ML community, and this appears to hold for DL researchers as well.
Even though broad interest in DL models is on the rise, financial text mining in particular is getting more attention than most other financial applications. The streaming flow of financial news, tweets, statements and blogs has opened up a whole new world for the financial community, allowing it to build better and more versatile prediction and evaluation models that integrate numerical and textual data. The general approach nowadays is to combine text mining with financial sentiment analysis, and it is reasonable to assume this will yield higher performance; many researchers have started working on this particular application area. It is quite probable that the next generation of outperforming implementations will be based on models that can successfully integrate text mining with quantified numerical data.
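A minimal sketch of such an integration, with a made-up sentiment lexicon and arbitrary feature choices (nothing here comes from a surveyed paper), is simply to concatenate a text-derived score with numeric market features before feeding a downstream model:

```python
import numpy as np

# Toy sentiment lexicon: purely illustrative, not from any surveyed study.
LEXICON = {"beat": 1.0, "surge": 1.0, "strong": 0.5,
           "miss": -1.0, "plunge": -1.0, "weak": -0.5}

def sentiment_score(text):
    """Average lexicon polarity of the words in a headline (0 if no hits)."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def make_features(headline, closes):
    """Concatenate a text-derived feature with numeric market features."""
    ret_1d = (closes[-1] - closes[-2]) / closes[-2]   # last daily return
    ret_5d = (closes[-1] - closes[-6]) / closes[-6]   # weekly return
    return np.array([sentiment_score(headline), ret_1d, ret_5d])

closes = [100, 101, 99, 102, 103, 105, 104.0]
x = make_features("earnings beat forecasts on strong demand", closes)
print(x)   # one joint feature vector for any downstream classifier or regressor
```

Production systems would replace the lexicon with a learned NLP model (e.g. an embedding-based sentiment classifier), but the fusion step stays conceptually the same.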
Another hot area within current DL research is cryptocurrencies. We can also include blockchain research here; although it is not necessarily directly related to cryptocurrencies, the two are generally used together in most implementations. Cryptocurrency price prediction attracts the most attention within the field, but since the topic is fairly new, more studies and implementations will probably keep pouring in, given the high expectations and promising rewards.

6.3. Open Issues and Future Work


When we extrapolate the current state of research and its accomplishments into the future, a few areas of interest stand out. We elaborate on them below and provide a pathway for what can be done, or needs to be done, within the next few years, organizing our opinions from the model-development and research-topic points of view.

6.3.1. Model Development Perspective


We have already mentioned the growing attention on adapting 2-D CNN implementations to various financial application areas. This particular technique looks promising and provides opportunities; it would be beneficial to further explore its possibilities on different problems. The playing field is still wide open.
Graph CNN is another closely related model, though with some notable differences. It has not been used much: to our knowledge, only one published study relates graph CNN to financial applications. However, versatile transformations of financial data into graphs, integration of sentiment analysis through graph representations, and construction of different models can create opportunities for researchers to build better-performing financial applications.
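One common, generic way to obtain such a graph (a sketch, not the method of the single study mentioned above) is to treat assets as nodes and connect those whose return correlation exceeds a threshold; a graph CNN would then operate on this adjacency structure:

```python
import numpy as np

rng = np.random.default_rng(3)
n_assets, n_days = 6, 250
market = rng.normal(0, 1, n_days)               # shared market factor
# First three assets load on the market factor, last three are idiosyncratic.
loads = np.array([0.9, 0.8, 0.7, 0.0, 0.0, 0.0])
returns = loads[:, None] * market + rng.normal(0, 0.5, (n_assets, n_days))

corr = np.corrcoef(returns)                     # asset-by-asset correlations
# Edge wherever |correlation| exceeds 0.5; drop self-loops on the diagonal.
adj = (np.abs(corr) > 0.5) & ~np.eye(n_assets, dtype=bool)
print(adj.astype(int))                          # edges among the correlated block
```

With the synthetic factor structure above, the three market-driven assets form a connected block while the idiosyncratic ones stay isolated, which is exactly the kind of structure a graph model can exploit.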
Recently developed DL models, such as GANs and Capsule networks, can also provide viable alternatives to existing implementations. They have started showing up in various non-financial studies; however, to the best of our knowledge, no implementation of this kind exists for financial applications. This might open up a new window of opportunity for financial researchers and practitioners. In addition to such new models, innovative paradigms like transfer learning and one-shot learning can be tested in this environment.
Since financial text mining is overtaking the other topics at an accelerating pace, new data models like Stock2Vec [168] can be enhanced into better and more representative models. In addition, Natural Language Processing (NLP) based ensemble models, or deeper integration of data semantics, can increase the accuracy of existing models.
Finally, according to our observations, hybrid models are preferred over native or standalone models in most studies. This trend will likely continue; however, researchers need to introduce more versatile, sometimes unconventional models for better results. Hybrid models integrating various simple DL layers, such as cascaded CNN-LSTM blocks, can produce better outcomes, since ensembling spatial and temporal information in a novel way might be an important milestone for researchers seeking "alpha" in their models.

6.3.2. Implementation Perspective


As far as application areas are concerned, the usual suspects (algorithmic trading, portfolio management and risk assessment) will probably continue to dominate the financial research arena in the foreseeable future. Meanwhile, some new shining stars have started getting more attention, not only because they represent fairly new research opportunities, but also because their forecasted impact on the financial world is noteworthy.
Cryptocurrencies and blockchain technology are among these new research areas, so it is worthwhile to explore the possibilities they bring. It will be a while before any of these technologies become a widely accepted industry standard; however, that is precisely why they offer a great opportunity for researchers to shape the future of the financial world with innovative new models, in the hope that the rest of the world will follow in their footsteps.
Another area that can benefit from more innovative models is portfolio management. Robo-advisory systems are on the rise throughout the world, and these systems depend on high-performing automated decision support systems. Since DL models fit that description well, it is logical to assume that the utilization of DL implementations will increase in the coming years. Accordingly, the corresponding quant funds will be very interested in what DL researchers can offer the financial community. This might require integrating learning and optimization models for better-performing systems; hence, ensemble models that successfully mix EC and DL components might be what the industry is anticipating in the immediate future. This could also create new research opportunities.
One other research area, generally avoided by soft computing and DL researchers, is the financial derivatives market. Even though many different products exist on the market, the corresponding DL research is very scarce. For professionals working in the finance industry, however, these products provide incredible flexibility, ranging from hedging investments to implementing leveraged transactions with minimized risk. Opportunities exist for DL researchers, yet broad interest has not materialized: there are only a handful of studies on the derivatives market. Option strategy optimization, futures trading, option pricing and arbitrage trading are among the areas that might benefit from DL research.
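For instance, a DL option pricer is typically trained as a regression from contract parameters to prices, with training labels supplied by a standard model; the sketch below generates such a label with textbook Black-Scholes and checks it against a Monte Carlo estimate (the parameter values are arbitrary, and this is illustrative background rather than any surveyed method):

```python
import math
import numpy as np

def bs_call(S, K, T, r, sigma):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + sigma**2 / 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda v: 0.5 * (1 + math.erf(v / math.sqrt(2)))  # standard normal CDF
    return S * N(d1) - K * math.exp(-r * T) * N(d2)

def mc_call(S, K, T, r, sigma, n=200_000, seed=5):
    """Monte Carlo estimate under the same lognormal model."""
    z = np.random.default_rng(seed).normal(size=n)
    ST = S * np.exp((r - sigma**2 / 2) * T + sigma * math.sqrt(T) * z)
    return math.exp(-r * T) * np.maximum(ST - K, 0).mean()

S, K, T, r, sigma = 100, 105, 0.5, 0.01, 0.2
exact, approx = bs_call(S, K, T, r, sigma), mc_call(S, K, T, r, sigma)
print(round(exact, 3), round(approx, 3))   # the two estimates agree closely
```

A network trained on many (S, K, T, r, sigma) tuples labeled this way can then approximate the pricing function, which is the usual starting point before moving to models without closed-form solutions.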
Sentiment analysis, text mining and risk-adjusted asset pricing are other implementation areas that attract researchers but are not yet fully explored. It is quite probable that we will see more papers in these fields in the near future.
Last, but not least, HFT is an area that has not yet benefited from the advancements in ML research to its full potential. Since HFT requires lightning-fast transaction processing, the statistical learning model embedded in such a trading system must not introduce any extra latency. This necessitates careful planning and design of such models. For that purpose, DL models embedded within Graphic Processing Unit (GPU) or Field Programmable Gate Array (FPGA) based hardware solutions can be studied. The hardware aspects of DL implementations are generally omitted in almost all studies, but as stated above, there might be opportunities in that field as well.

6.3.3. Suggestions for Future Research


Careful analysis of Figures 8 and 9 indicates a rising overall appetite for applied DL research in finance. Even though the interest is broad, some areas, such as cryptocurrency and blockchain studies, might get more attention than others.
Given the promising outlook in text mining and financial sentiment analysis, we believe behavioral finance is also a fairly untouched research area that hides many opportunities. There is a lack of published research on behavioral finance using DL models, probably mainly due to the difficulty of quantifying the inputs and outputs of behavioral finance research for use with DL models. However, new advancements in text mining, NLP and semantics, combined with agent-based computational finance, can open up huge opportunities in this field. We encourage researchers to look further into it as a possible implementation area, as it currently seems wide open for new studies.

6.4. Responses to our Initial Research Questions


At this point, since we gathered and processed all the information we need, we are
ready to provide answers to our initially stated research questions. The questions and our
corresponding answers according to our survey are as follows:

• What financial application areas are of interest to the DL community?
Response: Financial text mining, algo-trading, risk assessment, sentiment analysis, portfolio management and fraud detection are among the most studied areas of finance research (see Figure 8).

• How mature is the existing research in each of these application areas?
Response: Even though DL models have already achieved better results than their traditional counterparts in almost all areas, overall interest is still on the rise in every research area.

• What are the areas that have promising potential from an academic/industrial research perspective?
Response: Cryptocurrencies, blockchain, behavioral finance, HFT and the derivatives market have promising potential for research.

• Which DL models are preferred (and more successful) in different applications?


Response: RNN based models (in particular LSTM), CNN and DMLP have been
used extensively in implementations. From what we have encountered, LSTM is more
successful and preferred in time-series forecasting, whereas DMLP and CNN are better
suited to applications requiring classification.

• How do DL models fare against traditional soft computing / ML techniques?
Response: In most studies, DL models performed better than their ML counterparts. There were a few occasions where ML had comparable or even better results; however, the general tendency is for the DL methods to outperform.

• What is the future direction for DL research in finance?
Response: Hybrid models based on spatio-temporal data representations, together with NLP, semantics and text-mining-based models, might become more important in the near future.

7. Conclusions
The financial industry and academia have started realizing the potential of DL in various application areas. The number of research works keeps increasing every year at an accelerating pace. However, we are just in the early years of this new era; more studies will be implemented, and new models will keep pouring in. In this survey, we wanted to highlight the state of the art in DL research for financial applications. We not only provided a snapshot of the existing research status but also tried to identify a future roadmap for interested researchers. Our findings indicate there are incredible opportunities within the field, and it looks like they will not disappear anytime soon. So, we encourage researchers interested in the area to start exploring.

8. Acknowledgement
This work is supported by the Scientific and Technological Research Council of Turkey
(TUBITAK) grant no 215E248.

Glossary
AE Autoencoder. 4, 8, 9, 11, 12, 15, 18–20, 23, 29, 30
AI Artificial Intelligence. 15
AMEX American Stock Exchange. 20–22, 31
ANN Artificial Neural Network. 3, 4, 7, 8, 17, 19
ARIMA Autoregressive Integrated Moving Average. 23
AUC Area Under the Curve. 20, 26, 27, 29
AUROC Area Under the Receiver Operating Characteristics. 13, 15, 17–19, 24, 26, 28, 29
BA Balanced Accuracy. 17
BELM Basic Extreme Learning Machine. 26
BHC Bank Holding Companies. 17, 27
Bi-GRU Bidirectional Gated Recurrent Unit. 26
Bi-LSTM Bidirectional LSTM. 20, 21, 24, 28
BIST Istanbul Stock Exchange Index. 11
BOLL Bollinger Band. 22
BPTT Backpropagation Through Time. 6
CAE Convolutional Autoencoder. 13
CAGR Compound Annual Growth Rate. 20
CART Classification and Regression Trees. 17
CCI Commodity Channel Index. 24
CDAX German Stock Market Index Calculated by Deutsche Börse. 26
CDS Credit Default Swaps. 15, 16
CGAN Conditional GAN. 17
CME Chicago Mercantile Exchange. 14, 30
CNN Convolutional Neural Network. 2, 4, 5, 10–18, 20–32, 34–37, 39
CRIX The Cryptocurrency Index. 22
CRSP Center for Research in Security Prices. 17
CSI China Securities Index. 11, 25
DAX The Deutscher Aktienindex. 13, 14
DBN Deep Belief Network. 4, 8, 15–17, 20, 27, 30, 31
DCNL Deep Co-investment Network Learning. 14
DCNN Deep Convolutional Neural Network. 15
DDPG Deep Deterministic Policy Gradient. 20
Deep-FASP The Financial Aspect and Sentiment Prediction task with Deep neural networks. 28
DFFN Deep Feed Forward Network. 16, 17, 21, 29, 30
DFNN Deep Feedforward Neural Network. 21, 22, 24, 29
DGM Deep Neural Generative Model. 27
DGP Deep Gaussian Process. 9
DJI Dow Jones Index. 24
DJIA Dow Jones Industrial Average. 11, 24, 25
DL Deep Learning. 1–4, 6, 8–11, 14–16, 18–25, 28, 31–39
DLR Deep Learning Representation. 26
DMI Directional Movement Index. 24
DMLP Deep Multilayer Perceptron. 4, 12, 13, 16, 19, 20, 22–24, 32, 34, 35, 39
DNN Deep Neural Network. 5, 11–15, 17–22, 24, 25, 28–30
DP Discriminant Power. 17
DQL Deep Q-Learning. 19
DRL Deep Reinforcement Learning. 9, 12, 14, 21, 23, 29, 30, 35
DRSE Deep Random Subspace Ensembles. 24, 28
DTW Dynamic Time Warping. 14
EA Evolutionary Algorithm. 3, 10, 19
EC Evolutionary Computation. 15, 38
ELM Extreme Learning Machine. 26
EMA Exponential Moving Average. 24
ETF Exchange-Traded Fund. 11, 13
FDDR Fuzzy Deep Direct Reinforcement Learning. 11
FE-QAR Fixed Effects Quantile VAR. 17
FFNN Feedforward Neural Network. 6, 8, 13, 18, 27, 30
FN False Negative. 15, 17
FNN Fully Connected Neural Network. 12
FP False Positive. 15, 17, 19, 29
FPGA Field Programmable Gate Array. 38
FTSE London Financial Times Stock Exchange Index. 11, 13, 14
G-mean Geometric Mean. 17
GA Genetic Algorithm. 3, 12
GAN Generative Adversarial Network. 9, 37
GASVR GA with a SVR. 11, 12
GBDT Gradient-Boosted-DecisionTrees. 19
GBT Gradient Boosted Trees. 11
GP Genetic Programming. 3, 13, 15, 16
GPU Graphic Processing Unit. 38
GRU Gated-Recurrent Unit. 14, 19–21, 27, 28, 32
HAN Hybrid Attention Network. 26
HFT High Frequency Trading. 10, 11, 13, 38, 39
HMM Hidden Markov Model. 24, 29, 30
HS China Shanghai Shenzhen Stock Index. 26
HSI Hong Kong Hang Seng Index. 11, 30
IBB iShares Nasdaq Biotechnology ETF. 20, 29, 30
KELM Kernel Extreme Learning Machine. 26
KOSPI The Korea Composite Stock Price Index. 11, 30
KS Kolmogorov–Smirnov. 17
LAR Linear Auto-regression Predictor. 27
LDA Latent Dirichlet Allocation. 19, 26, 29
LFM Lookahead Factor Models. 31
LOB Limit Order Book Data. 13
LR Logistic Regression. 16
LSTM Long-Short Term Memory. 2, 4, 6, 10–14, 16–30, 32, 33, 35, 37, 39
MA Moving Average. 22, 24
MACD Moving Average Convergence and Divergence. 12, 24, 27
MAE Mean Absolute Error. 11, 21, 26
MAPE Mean Absolute Percentage Error. 11, 21, 26, 27
MCC Matthew Correlation Coefficient. 26–28
MDA Multilinear Discriminant Analysis. 11
MDD Maximum Drawdown. 11, 13, 20, 22
ML Machine Learning. 1–4, 10, 15, 18–21, 23, 25, 35, 36, 38, 39
MLP Multilayer Perceptron. 4, 11, 12, 14–16, 18, 20–23, 28, 31, 34
MODRL Multi-objective Deep Reinforcement Learning. 11
MOEA Multiobjective Evolutionary Algorithm. 3, 19
MSE Mean Squared Error. 11, 13, 19, 20, 23, 24, 26, 28, 31
MV-t Multivariate t Distribution. 17
MVN Multivariate Normal Distribution. 17
NASDAQ National Association of Securities Dealers Automated Quotations. 14, 20–22, 25, 31
NES Natural Evolution Strategies. 11
NIFTY National Stock Exchange of India. 26
NIKKEI Tokyo Nikkei Index. 11
NLP Natural Language Processing. 17, 29, 37–39
NN Neural Network. 3, 17, 24, 26, 29, 30
NYSE New York Stock Exchange. 11, 20–22, 25, 31
OCHL Open, Close, High, Low. 11, 12, 24
OCHLV Open, Close, High, Low, Volume. 11, 13, 14, 20–23, 25, 27
PCA Principal Component Analysis. 4, 19, 26
PCC Pearson's Correlation Coefficient. 14
PLR Piecewise Linear Representation. 11
PNN Probabilistic Neural Network. 13
PPO Proximal Policy Optimization. 20
PSO Particle Swarm Optimization. 3
R2 Squared correlation, Non-linear regression multiple correlation. 20, 21, 24, 28
RBM Restricted Boltzmann Machine. 4, 7, 8, 15–17, 19, 20, 27, 30, 31
RCNN Recurrent CNN. 26
ReLU Rectified Linear Unit. 4
RF Random Forest. 11, 16–21, 27, 28
RL Reinforcement Learning. 11, 13, 14, 18, 20–23
RMSE Root Mean Square Error. 11, 17, 19, 21, 23, 26, 27, 31
RNN Recurrent Neural Network. 2, 4, 6, 10–14, 20–24, 26–33, 39
ROA Return on Assets. 17, 27, 28
ROC Price of Change. 24
RSE Relative Squared Error. 11
RSI Relative Strength Index. 12, 13, 24
S&P500 Standard's & Poor's 500 Index. 11, 13, 14, 20, 24–27, 30
SAE Stacked Autoencoder. 10, 16, 17
SCI SSE Composite Index. 24
SFM State Frequency Memory. 11
SGD Stochastic Gradient Descent. 4
SPY SPDR S&P 500 ETF. 13
SR Sharpe-ratio. 11, 13, 14, 20, 22, 26, 31
STD Standard Deviation. 11, 20
SVD Singular Value Decomposition. 29, 30
SVM Support Vector Machine. 4, 15–17, 24, 27–30
SVR Support Vector Regressor. 12, 27, 28, 30
SZSE Shenzhen Stock Exchange Composite Index. 11
TAIEX Taiwan Capitalization Weighted Stock Index. 21
TALIB Technical Analysis Library Package. 13, 26
TAQ Trade and Quote. 20
TDNN Timedelay Neural Network. 13
TEMA Triple Exponential Moving Average. 24
TF-IDF Term Frequency-Inverse Document Frequency. 26
TGRU Two-stream GRU. 25, 26
THEIL-U Theil's inequality coefficient. 11
TN True Negative. 15
TP True Positive. 15, 19, 29
TR Total Return. 13
TSE Tokyo Stock Exchange. 20, 21, 24
TWSE Taiwan Stock Exchange. 26
VAR Vector Auto Regression. 17
VWL WL Kernel-based Method. 14
WBA Weighted Balanced Accuracy. 17
WMTR Weighted Multichannel Time-series Regression. 11
WSURT Wilcoxon Sum-rank Test. 28
WT Wavelet Transforms. 10, 11
XGBoost eXtreme Gradient Boosting. 16, 17

References
[1] Omer Berat Sezer, Mehmet Ugur Gudelek, and Ahmet Murat Ozbayoglu. Financial time series fore-
casting with deep learning : A systematic literature review: 2005-2019, 2019.
[2] Arash Bahrammirzaee. A comparative survey of artificial intelligence applications in finance: artificial
neural networks, expert system and hybrid intelligent systems. Neural Computing and Applications,
19(8):1165–1195, June 2010.
[3] D. Zhang and L. Zhou. Discovering golden nuggets: Data mining in financial application. IEEE
Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 34(4):513–522,
November 2004.
[4] Asunción Mochón, David Quintana, Yago Sáez, and Pedro Isasi Viñuela. Soft computing techniques
applied to finance. Applied Intelligence, 29:111–115, 2007.
[5] Pulakkazhy. Mining in banking and its applications: A review. Journal of Computer Science, 9(10):
1252–1259, October 2013.
[6] Sendhil Mullainathan and Jann Spiess. Machine learning: An applied econometric approach. Journal
of Economic Perspectives, 31(2):87–106, May 2017.
[7] Keke Gai, Meikang Qiu, and Xiaotong Sun. A survey on fintech. Journal of Network and Computer
Applications, 103:262–273, 2018.
[8] Boris Kovalerchuk and Evgenii Vityaev. Data Mining in Finance: Advances in Relational and Hybrid
Methods. Kluwer Academic Publishers, Norwell, MA, USA, 2000.
[9] Rafik A. Aliev, Bijan Fazlollahi, and Rashad R. Aliev. Soft computing and its applications in business
and economics. In Studies in Fuzziness and Soft Computing, 2004.
[10] Anthony Brabazon and Michael O’Neill, editors. Natural Computing in Computational Finance.
Springer Berlin Heidelberg, 2008.
[11] Ludmila Dymowa. Soft Computing in Economics and Finance. Springer Berlin Heidelberg, 2011.
[12] Shu-Heng Chen, editor. Genetic Algorithms and Genetic Programming in Computational Finance.
Springer US, 2002.
[13] Ma. Guadalupe Castillo Tapia and Carlos A. Coello Coello. Applications of multi-objective evolu-
tionary algorithms in economics and finance: A survey. In 2007 IEEE Congress on Evolutionary
Computation. IEEE, September 2007.
[14] Antonin Ponsich, Antonio Lopez Jaimes, and Carlos A. Coello Coello. A survey on multiobjective
evolutionary algorithms for the solution of the portfolio optimization problem and other finance and
economics applications. IEEE Transactions on Evolutionary Computation, 17(3):321–344, June 2013.
[15] Ruben Aguilar-Rivera, Manuel Valenzuela-Rendon, and J.J. Rodriguez-Ortiz. Genetic algorithms and
darwinian approaches in financial applications: A survey. Expert Systems with Applications, 42(21):
7684–7697, November 2015.
[16] Bo K Wong and Yakup Selvi. Neural network applications in finance: A review and analysis of
literature (1990–1996). Information & Management, 34(3):129–139, October 1998.
[17] Yuhong Li and Weihua Ma. Applications of artificial neural networks in financial economics: A survey.
In 2010 International Symposium on Computational Intelligence and Design. IEEE, October 2010.
[18] B. Elmsili and B. Outtaj. Artificial neural networks applications in economics and management
research: An exploratory literature review. In 2018 4th International Conference on Optimization
and Applications (ICOA), pages 1–6, April 2018.
[19] Blake LeBaron. Chapter 24 agent-based computational finance. In L. Tesfatsion and K.L. Judd,
editors, Handbook of Computational Economics, volume 2 of Handbook of Computational Economics,
pages 1187–1233. Elsevier, 2006.
[20] Stephan K. Chalup and Andreas Mitschele. Kernel methods in finance. In Handbook on Information
Technology in Finance, pages 655–687. Springer Berlin Heidelberg, 2008.
[21] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[22] George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of control,
signals and systems, 2(4):303–314, 1989.

[23] Barry L Kalman and Stan C Kwasny. Why tanh: choosing a sigmoidal function. In [Proceedings 1992]
IJCNN International Joint Conference on Neural Networks, volume 4, pages 578–581. IEEE, 1992.
[24] Vinod Nair and Geoffrey E Hinton. Rectified linear units improve restricted boltzmann machines.
In Proceedings of the 27th international conference on machine learning (ICML-10), pages 807–814,
2010.
[25] Andrew L Maas, Awni Y Hannun, and Andrew Y Ng. Rectifier nonlinearities improve neural network
acoustic models. In Proc. icml, volume 30, page 3, 2013.
[26] Prajit Ramachandran, Barret Zoph, and Quoc V Le. Searching for activation functions. arXiv preprint
arXiv:1710.05941, 2017.
[27] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
http://www.deeplearningbook.org.
[28] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–
1780, 1997.
[29] Xueheng Qiu, Le Zhang, Ye Ren, P. Suganthan, and Gehan Amaratunga. Ensemble deep learning
for regression and time series forecasting. In 2014 IEEE Symposium on Computational Intelligence in
Ensemble Learning (CIEL), pages 1–6, 2014.
[30] Yoshua Bengio. Deep learning of representations for unsupervised and transfer learning. In Proceedings
of ICML workshop on unsupervised and transfer learning, pages 17–36, 2012.
[31] Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. A fast learning algorithm for deep belief
nets. Neural Computation, 18(7):1527–1554, 2006.
[32] Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and
composing robust features with denoising autoencoders. In Proceedings of the 25th international
conference on Machine learning, pages 1096–1103. ACM, 2008.
[33] Qinxue Meng, Daniel Catchpoole, David Skillicom, and Paul J Kennedy. Relational autoencoder
for feature extraction. In 2017 International Joint Conference on Neural Networks (IJCNN), pages
364–371. IEEE, 2017.
[34] Yong Hu, Kang Liu, Xiangzhou Zhang, Lijun Su, E.W.T. Ngai, and Mei Liu. Application of evolu-
tionary computation for rule discovery in stock algorithmic trading: A literature review. Applied Soft
Computing, 36:534–551, November 2015.
[35] Sercan Karaoglu and Ugur Arpaci. A deep learning approach for optimization of systematic signal
detection in financial trading systems with big data. International Journal of Intelligent Systems and
Applications in Engineering, SpecialIssue(SpecialIssue):31–36, July 2017.
[36] Wei Bao, Jun Yue, and Yulei Rao. A deep learning framework for financial time series using stacked
autoencoders and long-short term memory. PLOS ONE, 12(7):e0180944, July 2017.
[37] Shuanglong Liu, Chao Zhang, and Jinwen Ma. Cnn-lstm neural network model for quantitative strat-
egy analysis in stock markets. In Neural Information Processing, pages 198–206. Springer International
Publishing, 2017.
[38] Liheng Zhang, Charu Aggarwal, and Guo-Jun Qi. Stock price prediction via discovering multi-
frequency trading patterns. In Proceedings of the 23rd ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining - KDD17. ACM Press, 2017.
[39] Dat Thanh Tran, Martin Magris, Juho Kanniainen, Moncef Gabbouj, and Alexandros Iosifidis. Tensor
representation in high-frequency financial data for price change prediction. In 2017 IEEE Symposium
Series on Computational Intelligence (SSCI). IEEE, November 2017.
[40] Yue Deng, Feng Bao, Youyong Kong, Zhiquan Ren, and Qionghai Dai. Deep direct reinforcement
learning for financial signal representation and trading. IEEE Transactions on Neural Networks and
Learning Systems, 28(3):653–664, March 2017.
[41] Thomas Fischer and Christopher Krauss. Deep learning with long short-term memory networks for
financial market predictions. European Journal of Operational Research, 270(2):654–669, October
2018.
[42] Marios Mourelatos, Christos Alexakos, Thomas Amorgianiotis, and Spiridon Likothanassis. Financial
indices modelling and trading utilizing deep learning techniques: The athens se ftse/ase large cap use

case. In 2018 Innovations in Intelligent Systems and Applications (INISTA). IEEE, July 2018.
[43] Weiyu Si, Jinke Li, Peng Ding, and Ruonan Rao. A multi-objective deep reinforcement learning
approach for stock index future’s intraday trading. In 2017 10th International Symposium on Com-
putational Intelligence and Design (ISCID). IEEE, December 2017.
[44] Bang Xiang Yong, Mohd Rozaini Abdul Rahim, and Ahmad Shahidan Abdullah. A stock market
trading system using deep neural network. In Communications in Computer and Information Science,
pages 356–364. Springer Singapore, 2017.
[45] David W. Lu. Agent inspired trading using recurrent reinforcement learning and lstm neural networks,
2017.
[46] Matthew Francis Dixon, Diego Klabjan, and Jin Hoon Bang. Classification-based financial markets
prediction using deep neural networks. SSRN Electronic Journal, 2016.
[47] Jerzy Korczak and Marcin Hernes. Deep learning for financial time series forecasting in a-trader
system. In Proceedings of the 2017 Federated Conference on Computer Science and Information
Systems. IEEE, September 2017.
[48] Bruno Spilak. Deep neural networks for cryptocurrencies price prediction. Master’s thesis, Humboldt-
Universitat zu Berlin, Wirtschaftswissenschaftliche Fakultat, 2018.
[49] Gyeeun Jeong and Ha Young Kim. Improving financial trading decisions using deep q-learning: Pre-
dicting the number of shares, action strategies, and transfer learning. Expert Systems with Applications,
117:125–138, March 2019.
[50] Christopher Krauss, Xuan Anh Do, and Nicolas Huck. Deep neural networks, gradient-boosted trees,
random forests: Statistical arbitrage on the s&p 500. European Journal of Operational Research, 259
(2):689–702, June 2017.
[51] Google. System and method for computer managed funds to outperform benchmarks.
[52] Omer Berat Sezer, Murat Ozbayoglu, and Erdogan Dogdu. A deep neural-network based stock trading
system based on evolutionary optimized technical analysis parameters. Procedia Computer Science,
114:473–480, 2017.
[53] Ariel Navon and Yosi Keller. Financial time series prediction using deep learning, 2017.
[54] Luigi Troiano, Elena Mejuto Villa, and Vincenzo Loia. Replicating a trading strategy by means of
lstm for financial industry applications. IEEE Transactions on Industrial Informatics, 14(7):3226–
3234, July 2018.
[55] Justin Sirignano and Rama Cont. Universal features of price formation in financial markets: Perspec-
tives from deep learning. SSRN Electronic Journal, 2018.
[56] Avraam Tsantekidis, Nikolaos Passalis, Anastasios Tefas, Juho Kanniainen, Moncef Gabbouj, and
Alexandros Iosifidis. Using deep learning to detect price change indications in financial markets. In
2017 25th European Signal Processing Conference (EUSIPCO). IEEE, August 2017.
[57] M. Ugur Gudelek, S. Arda Boluk, and A. Murat Ozbayoglu. A deep learning based stock trading
model with 2-d cnn trend detection. In 2017 IEEE Symposium Series on Computational Intelligence
(SSCI). IEEE, November 2017.
[58] Omer Berat Sezer and Ahmet Murat Ozbayoglu. Algorithmic financial trading with deep convolutional
neural networks: Time series to image conversion approach. Applied Soft Computing, 70:525–538,
September 2018.
[59] Guosheng Hu, Yuxin Hu, Kai Yang, Zehao Yu, Flood Sung, Zhihong Zhang, Fei Xie, Jianguo Liu, Neil
Robertson, Timothy Hospedales, and Qiangwei Miemie. Deep stock representation learning: From
candlestick charts to investment decisions. In 2018 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP). IEEE, April 2018.
[60] Avraam Tsantekidis, Nikolaos Passalis, Anastasios Tefas, Juho Kanniainen, Moncef Gabbouj, and
Alexandros Iosifidis. Forecasting stock prices from the limit order book using convolutional neural
networks. In 2017 IEEE 19th Conference on Business Informatics (CBI). IEEE, July 2017.
[61] Hakan Gunduz, Yusuf Yaslan, and Zehra Cataltepe. Intraday prediction of borsa istanbul using convo-
lutional neural networks and feature correlations. Knowledge-Based Systems, 137:138–148, December
2017.
[62] Omer Berat Sezer and Ahmet Murat Ozbayoglu. Financial trading model with stock bar chart image
time series with deep convolutional neural networks. arXiv preprint arXiv:1903.04610, 2019.
[63] Will Serrano. Fintech model: The random neural network with genetic algorithm. Procedia Computer
Science, 126:537–546, 2018.
[64] E.W. Saad, D.V. Prokhorov, and D.C. Wunsch. Comparative study of stock trend prediction using
time delay, recurrent and probabilistic neural networks. IEEE Transactions on Neural Networks, 9
(6):1456–1470, 1998.
[65] Jonathan Doering, Michael Fairbank, and Sheri Markose. Convolutional neural networks applied
to high-frequency market microstructure forecasting. In 2017 9th Computer Science and Electronic
Engineering (CEEC). IEEE, September 2017.
[66] Zhengyao Jiang, Dixing Xu, and Jinjun Liang. A deep reinforcement learning framework for the
financial portfolio management problem. arXiv preprint arXiv:1706.10059, 2017.
[67] P. Tino, C. Schittenkopf, and G. Dorffner. Financial volatility trading using recurrent neural networks.
IEEE Transactions on Neural Networks, 12(4):865–874, July 2001.
[68] Yu-Ying Chen, Wei-Lun Chen, and Szu-Hao Huang. Developing arbitrage strategy in high-frequency
pairs trading with filterbank cnn algorithm. In 2018 IEEE International Conference on Agents (ICA).
IEEE, July 2018.
[69] Omar A. Bari and Arvin Agah. Ensembles of text and time-series models for automatic generation of
financial trading signals from social media content. Journal of Intelligent Systems, 2018.
[70] Matthew Francis Dixon. Sequence classification of the limit order book using recurrent neural net-
works. SSRN Electronic Journal, 2017.
[71] Chiao-Ting Chen, An-Pin Chen, and Szu-Hao Huang. Cloning strategies from trading records using
agent-based reinforcement learning algorithm. In 2018 IEEE International Conference on Agents
(ICA). IEEE, July 2018.
[72] Yue Wang, Chenwei Zhang, Shen Wang, Philip S. Yu, Lu Bai, and Lixin Cui. Deep co-investment
network learning for financial assets, 2018.
[73] Min-Yuh Day and Chia-Chou Lee. Deep learning for financial sentiment analysis on finance news
providers. In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis
and Mining (ASONAM). IEEE, August 2016.
[74] Justin Sirignano. Deep learning for limit order books, 2016.
[75] Xiang Gao. Deep reinforcement learning for time series: playing idealized trading games, 2018.
[76] Martin Neil Baily, Robert E. Litan, and Matthew S. Johnson. The origins of the financial crisis.
Initiative on Business and Public Policy at Brookings, Fixing Finance Series - Paper 3, 2008.
[77] Cuicui Luo, Desheng Wu, and Dexiang Wu. A deep learning approach for credit scoring using credit
default swaps. Engineering Applications of Artificial Intelligence, 65:465–470, October 2017.
[78] Lean Yu, Rongtian Zhou, Ling Tang, and Rongda Chen. A dbn-based resampling svm ensemble
learning paradigm for credit classification with imbalanced data. Applied Soft Computing, 69:192–202,
August 2018.
[79] Ying Li, Xianghong Lin, Xiangwen Wang, Fanqi Shen, and Zuzheng Gong. Credit risk assessment algo-
rithm using deep neural networks with clustering and merging. In 2017 13th International Conference
on Computational Intelligence and Security (CIS). IEEE, December 2017.
[80] Khiem Tran, Thanh Duong, and Quyen Ho. Credit scoring model: A combination of genetic program-
ming and deep learning. In 2016 Future Technologies Conference (FTC). IEEE, December 2016.
[81] Victor-Emil Neagoe, Adrian-Dumitru Ciotec, and George-Sorin Cucu. Deep convolutional neural
networks versus multilayer perceptron for financial prediction. In 2018 International Conference on
Communications (COMM). IEEE, June 2018.
[82] Bing Zhu, Wenchuan Yang, Huaxuan Wang, and Yuan Yuan. A hybrid deep learning model for
consumer credit scoring. In 2018 International Conference on Artificial Intelligence and Big Data
(ICAIBD). IEEE, May 2018.
[83] Ayahiko Niimi. Deep learning for credit card data analysis. In 2015 World Congress on Internet
Security (WorldCIS). IEEE, October 2015.
[84] Efstathios Kirkos and Yannis Manolopoulos. Data mining in finance and accounting: A review of
current research trends. In Proceedings of the 1st International Conference on Enterprise Systems
and Accounting (ICESAcc), pages 63–78, 2004.
[85] V. Ravi, H. Kurniawan, Peter Nwee Kok Thai, and P. Ravi Kumar. Soft computing system for bank
performance prediction. Applied Soft Computing, 8(1):305–315, January 2008.
[86] Meryem Duygun Fethi and Fotios Pasiouras. Assessing bank efficiency and performance with op-
erational research and artificial intelligence techniques: A survey. European Journal of Operational
Research, 204(2):189–198, July 2010.
[87] Adel Lahsasna, Raja Noor Ainon, and Ying Wah Teh. Credit scoring models using soft computing
methods: A survey. Int. Arab J. Inf. Technol., 7:115–123, 2010.
[88] Ning Chen, Bernardete Ribeiro, and An Chen. Financial credit risk assessment: a recent review.
Artificial Intelligence Review, 45(1):1–23, October 2015.
[89] A. I. Marques, Vicente García, and José Salvador Sánchez. A literature review on the application of
evolutionary computing to credit scoring. Journal of the Operational Research Society, 64(9):1384–
1399, 2013.
[90] P. Ravi Kumar and V. Ravi. Bankruptcy prediction in banks and firms via statistical and intelligent
techniques – a review. European Journal of Operational Research, 180(1):1–28, July 2007.
[91] Antanas Verikas, Zivile Kalsyte, Marija Bacauskiene, and Adas Gelzinis. Hybrid and ensemble-based
soft computing techniques in bankruptcy prediction: a survey. Soft Computing, 14(9):995–1010,
September 2009.
[92] Jie Sun, Hui Li, Qing-Hua Huang, and Kai-Yu He. Predicting financial distress and corporate failure: A
review from the state-of-the-art definitions, modeling, sampling, and featuring approaches. Knowledge-
Based Systems, 57:41–56, February 2014.
[93] W. Lin, Y. Hu, and C. Tsai. Machine learning in financial crisis prediction: A survey. IEEE Trans-
actions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(4):421–436, July
2012.
[94] Zineb Lanbouri and Said Achchab. A hybrid deep belief network approach for financial distress
prediction. In 2015 10th International Conference on Intelligent Systems: Theories and Applications
(SITA). IEEE, October 2015.
[95] Vipula Rawte, Aparna Gupta, and Mohammed J. Zaki. Analysis of year-over-year changes in risk
factors disclosure in 10-k filings. In Proceedings of the Fourth International Workshop on Data Science
for Macro-Modeling with Financial and Economic Datasets - DSMM'18. ACM Press, 2018.
[96] Samuel Ronnqvist and Peter Sarlin. Detect & describe: Deep learning of bank stress in the news. In
2015 IEEE Symposium Series on Computational Intelligence. IEEE, December 2015.
[97] Samuel Ronnqvist and Peter Sarlin. Bank distress in the news: Describing events through deep
learning. Neurocomputing, 264:57–70, November 2017.
[98] Paola Cerchiello, Giancarlo Nicola, Samuel Rönnqvist, and Peter Sarlin. Deep learning bank distress
from news and numerical financial data. CoRR, abs/1706.09627, 2017.
[99] Nikhil Malik, Param Vir Singh, and Urooj Khan. Can banks survive the next financial crisis? An
adversarial deep learning model for bank stress testing. SSRN Electronic Journal (June 30, 2018), 2018.
[100] Bernardete Ribeiro and Noel Lopes. Deep belief networks for financial prediction. In Neural Informa-
tion Processing, pages 766–773. Springer Berlin Heidelberg, 2011.
[101] Shu-Hao Yeh, Chuan-Ju Wang, and Ming-Feng Tsai. Deep belief networks for predicting corporate
defaults. In 2015 24th Wireless and Optical Communication Conference (WOCC). IEEE, October
2015.
[102] Tadaaki Hosaka. Bankruptcy prediction using imaged financial ratios and convolutional neural net-
works. Expert Systems with Applications, September 2018.
[103] Justin Sirignano, Apaar Sadhwani, and Kay Giesecke. Deep learning for mortgage risk. SSRN Elec-
tronic Journal, 2018.
[104] Håvard Kvamme, Nikolai Sellereite, Kjersti Aas, and Steffen Sjursen. Predicting mortgage default
using convolutional neural networks. Expert Systems with Applications, 102:207–217, July 2018.
[105] Narek Abroyan. Neural networks for financial market risk classification. Frontiers in Signal
Processing, 1(2), August 2017.
[106] Sotirios P. Chatzis, Vassilis Siakoulis, Anastasios Petropoulos, Evangelos Stavroulakis, and Nikos
Vlachogiannakis. Forecasting stock market crisis events using deep and statistical machine learning
techniques. Expert Systems with Applications, 112:353–371, December 2018.
[107] E Kirkos, C Spathis, and Y Manolopoulos. Data mining techniques for the detection of fraudulent
financial statements. Expert Systems with Applications, 32(4):995–1003, May 2007.
[108] Dianmin Yue, Xiaodan Wu, Yunfeng Wang, Yue Li, and Chao-Hsien Chu. A review of data mining-
based financial fraud detection research. In 2007 International Conference on Wireless Communica-
tions, Networking and Mobile Computing. IEEE, September 2007.
[109] Shiguo Wang. A comprehensive survey of data mining-based accounting-fraud detection research. In
2010 International Conference on Intelligent Computation Technology and Automation. IEEE, May
2010.
[110] Clifton Phua, Vincent C. S. Lee, Kate Smith-Miles, and Ross W. Gayler. A comprehensive survey of
data mining-based fraud detection research. CoRR, abs/1009.6119, 2010.
[111] E.W.T. Ngai, Yong Hu, Y.H. Wong, Yijun Chen, and Xin Sun. The application of data mining tech-
niques in financial fraud detection: A classification framework and an academic review of literature.
Decision Support Systems, 50(3):559–569, February 2011.
[112] Anuj Sharma and Prabin Kumar Panigrahi. A review of financial accounting fraud detection based
on data mining techniques. International Journal of Computer Applications, 39(1):37–47, February
2012.
[113] Jarrod West and Maumita Bhattacharya. Intelligent financial fraud detection: A comprehensive
review. Computers & Security, 57:47–66, March 2016.
[114] Yaya Heryadi and Harco Leslie Hendric Spits Warnars. Learning temporal representation of transaction
amount for fraudulent transaction recognition using cnn, stacked lstm, and cnn-lstm. In 2017 IEEE
International Conference on Cybernetics and Computational Intelligence (CyberneticsCom). IEEE,
November 2017.
[115] Abhimanyu Roy, Jingyi Sun, Robert Mahoney, Loreto Alonzi, Stephen Adams, and Peter Beling. Deep
learning detecting fraud in credit card transactions. In 2018 Systems and Information Engineering
Design Symposium (SIEDS). IEEE, April 2018.
[116] Jon Ander Gómez, Juan Arévalo, Roberto Paredes, and Jordi Nin. End-to-end neural network archi-
tecture for fraud scoring in card payments. Pattern Recognition Letters, 105:175–181, April 2018.
[117] Ishan Sohony, Rameshwar Pratap, and Ullas Nambiar. Ensemble learning for credit card fraud detec-
tion. In Proceedings of the ACM India Joint International Conference on Data Science and Manage-
ment of Data - CoDS-COMAD'18. ACM Press, 2018.
[118] Johannes Jurgovsky, Michael Granitzer, Konstantin Ziegler, Sylvie Calabretto, Pierre-Edouard
Portier, Liyun He-Guelton, and Olivier Caelen. Sequence classification for credit-card fraud detection.
Expert Systems with Applications, 100:234–245, June 2018.
[119] Ebberth L. Paula, Marcelo Ladeira, Rommel N. Carvalho, and Thiago Marzagao. Deep learning
anomaly detection as support fraud investigation in brazilian exports and anti-money laundering. In
2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE,
December 2016.
[120] Thiago Alencar Gomes, Rommel Novaes Carvalho, and Ricardo Silva Carvalho. Identifying anomalies
in parliamentary expenditures of brazilian chamber of deputies with deep autoencoders. In 2017 16th
IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, December
2017.
[121] Yibo Wang and Wei Xu. Leveraging deep learning with lda-based text analytics to detect automobile
insurance fraud. Decision Support Systems, 105:87–95, January 2018.
[122] Longfei Li, Jun Zhou, Xiaolong Li, and Tao Chen. Poster: Practical fraud transaction prediction. In
ACM Conference on Computer and Communications Security, 2017.
[123] Allan Inocencio de Souza Costa and Luis Silva. Sequence classification of the limit order book using
recurrent neural networks. 2016.
[124] Nikolaos D. Goumagias, Dimitrios Hristu-Varsakelis, and Yannis M. Assael. Using deep q-learning
to understand the tax evasion behavior of risk-averse firms. Expert Systems with Applications, 101:
258–270, July 2018.
[125] Bin Li and Steven C. H. Hoi. Online portfolio selection: A survey. ACM Comput. Surv., 46(3):
35:1–35:36, January 2014.
[126] K. Metaxiotis and K. Liagkouras. Multiobjective evolutionary algorithms for portfolio management:
A comprehensive literature review. Expert Systems with Applications, 39(14):11685–11698, October
2012.
[127] Lawrence Takeuchi. Applying deep learning to enhance momentum trading strategies in stocks. 2013.
[128] Anthony Grace. Can deep learning techniques improve the risk adjusted returns from enhanced
indexing investment strategies. Master’s thesis, 2017.
[129] XingYu Fu, JinHong Du, YiFeng Guo, MingWen Liu, Tao Dong, and XiuWen Duan. A machine
learning framework for stock selection, 2018.
[130] Saurabh Aggarwal and Somya Aggarwal. Deep investment in financial markets using deep learning
models. International Journal of Computer Applications, 162(2):40–43, March 2017.
[131] J.B. Heaton and Nick Polson. Deep learning for finance: Deep portfolios. SSRN Electronic Journal,
2016.
[132] Chi-Ming Lin, Jih-Jeng Huang, Mitsuo Gen, and Gwo-Hshiung Tzeng. Recurrent neural network for
dynamic portfolio selection. Applied Mathematics and Computation, 175(2):1139–1146, April 2006.
[133] Nijolė Maknickienė. Selection of orthogonal investment portfolio using evolino rnn trading model.
Procedia - Social and Behavioral Sciences, 110:1158–1165, January 2014.
[134] Bo Zhou. Deep learning and the cross-section of stock returns: Neural networks combining price and
fundamental information. SSRN Electronic Journal, 2018.
[135] Bilberto Batres-Estrada. Deep learning for multivariate financial time series. Master’s thesis, KTH,
Mathematical Statistics, 2015.
[136] Sang Il Lee and Seong Joon Yoo. Threshold-based portfolio: the role of the threshold and its appli-
cations. The Journal of Supercomputing, September 2018.
[137] Hitoshi Iwasaki and Ying Chen. Topic sentiment asset pricing with dnn supervised learning. SSRN
Electronic Journal, 2018.
[138] Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, and Yanran Li. Adversarial deep reinforce-
ment learning in portfolio management, 2018.
[139] Jiaqi Chen, Wenbo Wu, and Michael Tindall. Hedge fund return prediction and fund selection: A
machine-learning approach. Occasional Papers 16-4, Federal Reserve Bank of Dallas, November 2016.
[140] Zhengyao Jiang and Jinjun Liang. Cryptocurrency portfolio management with deep reinforcement
learning. In 2017 Intelligent Systems Conference (IntelliSys). IEEE, September 2017.
[141] Nijole Maknickiene, Aleksandras Vytautas Rutkauskas, and Algirdas Maknickas. Investigation of
financial market prediction by recurrent neural network. 2014.
[142] Robert Culkin and Sanjiv R. Das. Machine learning in finance: The case of deep learning in option
pricing. 2017.
[143] Pei-Ying Hsu, Chin Chou, Szu-Hao Huang, and An-Pin Chen. A market making quotation strategy
based on dual deep learning agents for option pricing and bid-ask spread estimation. In 2018 IEEE
International Conference on Agents (ICA). IEEE, July 2018.
[144] Guanhao Feng, Nicholas G. Polson, and Jianeng Xu. Deep factor alpha, 2018.
[145] Rui-Yang Chen. A traceability chain algorithm for artificial neural networks using t–s fuzzy cognitive
maps in blockchain. Future Generation Computer Systems, 80:198–210, March 2018.
[146] Lihao Nan and Dacheng Tao. Bitcoin mixing detection using deep autoencoder. In 2018 IEEE Third
International Conference on Data Science in Cyberspace (DSC). IEEE, June 2018.
[147] Gonçalo Duarte Lima Freire Lopes. Deep learning for market forecasts. 2018.
[148] Sean McNally, Jason Roche, and Simon Caton. Predicting the price of bitcoin using machine learn-
ing. In 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based
Processing (PDP). IEEE, March 2018.
[149] Colm Kearney and Sha Liu. Textual sentiment in finance: A survey of methods and models. Interna-
tional Review of Financial Analysis, 33:171–185, May 2014.
[150] Qili Wang, Wei Xu, and Han Zheng. Combining the wisdom of crowds and technical analysis for
financial market prediction using deep random subspace ensembles. Neurocomputing, 299:51–61, July
2018.
[151] Lei Shi, Zhiyang Teng, Le Wang, Yue Zhang, and Alexander Binder. Deepclue: Visual interpretation
of text-based deep stock prediction. IEEE Transactions on Knowledge and Data Engineering, pages
1–1, 2018.
[152] Yangtuo Peng and Hui Jiang. Leverage financial news to predict stock price movements using word
embeddings and deep neural networks. In Proceedings of the 2016 Conference of the North American
Chapter of the Association for Computational Linguistics: Human Language Technologies. Association
for Computational Linguistics, 2016.
[153] Qun Zhuge, Lingyu Xu, and Gaowei Zhang. Lstm neural network with emotional analysis for prediction
of stock price. 2017.
[154] Zhongshengz. Measuring financial crisis index for risk warning through analysis of social network.
Master’s thesis, 2018.
[155] Sushree Das, Ranjan Kumar Behera, Mukesh Kumar, and Santanu Kumar Rath. Real-time sentiment
analysis of twitter streaming data for stock prediction. Procedia Computer Science, 132:956–964, 2018.
[156] Jordan Prosky, Xingyou Song, Andrew Tan, and Michael Zhao. Sentiment predictability for stocks.
CoRR, abs/1712.05785, 2017.
[157] Jiahong Li, Hui Bu, and Junjie Wu. Sentiment-aware stock market prediction: A deep learning
method. In 2017 International Conference on Service Systems and Service Management. IEEE, June
2017.
[158] Yifu Huang, Kai Huang, Yang Wang, Hao Zhang, Jihong Guan, and Shuigeng Zhou. Exploiting twitter
moods to boost financial trend prediction based on deep network models. In Intelligent Computing
Methodologies, pages 449–460. Springer International Publishing, 2016.
[159] Leela Mitra and Gautam Mitra. Applications of news analytics in finance: A review. In The Handbook
of News Analytics in Finance, pages 1–39. John Wiley & Sons, Ltd., May 2012.
[160] Feng Li. Textual analysis of corporate disclosures: A survey of the literature. Journal of Accounting
Literature, 29, February 2011.
[161] Tim Loughran and Bill McDonald. Textual analysis in accounting and finance: A survey. Journal of
Accounting Research, 54(4):1187–1230, June 2016.
[162] B. Shravan Kumar and Vadlamani Ravi. A survey of the applications of text mining in financial
domain. Knowledge-Based Systems, 114:128–147, December 2016.
[163] Marc-André Mittermayer and Gerhard F Knolmayer. Text mining systems for market response to
news: A survey. September 2006.
[164] Arman Khadjeh Nassirtoussi, Saeed Aghabozorgi, Teh Ying Wah, and David Chek Ling Ngo. Text
mining for market prediction: A systematic review. Expert Systems with Applications, 41(16):7653–
7670, November 2014.
[165] Huy D. Huynh, L. Minh Dang, and Duc Duong. A new model for stock price movements prediction
using deep neural network. In Proceedings of the Eighth International Symposium on Information and
Communication Technology - SoICT 2017. ACM Press, 2017.
[166] Songqiao Han, Xiaoling Hao, and Hailiang Huang. An event-extraction approach for business analysis
from online chinese news. Electronic Commerce Research and Applications, 28:244–260, March 2018.
[167] Mathias Kraus and Stefan Feuerriegel. Decision support from financial disclosures with deep neural
networks and transfer learning. Decision Support Systems, 104:38–48, December 2017.
[168] L. Minh Dang, Abolghasem Sadeghi-Niaraki, Huy D. Huynh, Kyungbok Min, and Hyeonjoon Moon.
Deep learning approach for short-term stock trends prediction based on two-stream gated recurrent
unit network. IEEE Access, pages 1–1, 2018.
[169] Xiao Ding, Yue Zhang, Ting Liu, and Junwen Duan. Deep learning for event-driven stock prediction.
In Proceedings of the 24th International Conference on Artificial Intelligence, IJCAI’15, pages 2327–
2333. AAAI Press, 2015.
[170] Manuel R. Vargas, Beatriz S. L. P. de Lima, and Alexandre G. Evsukoff. Deep learning for stock market
prediction from financial news articles. In 2017 IEEE International Conference on Computational
Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA). IEEE,
June 2017.
[171] Ryo Akita, Akira Yoshihara, Takashi Matsubara, and Kuniaki Uehara. Deep learning for stock pre-
diction using numerical and textual information. In 2016 IEEE/ACIS 15th International Conference
on Computer and Information Science (ICIS). IEEE, June 2016.
[172] Ishan Verma, Lipika Dey, and Hardik Meisheri. Detecting, quantifying and accessing impact of news
events on indian stock indices. In Proceedings of the International Conference on Web Intelligence -
WI'17. ACM Press, 2017.
[173] Xi Zhang, Yunjia Zhang, Senzhang Wang, Yuntao Yao, Binxing Fang, and Philip S. Yu. Improving
stock market prediction via heterogeneous information fusion. Knowledge-Based Systems, 143:236–247,
March 2018.
[174] Weiling Chen, Chai Kiat Yeo, Chiew Tong Lau, and Bu Sung Lee. Leveraging social media news to
predict stock index movement using rnn-boost. Data & Knowledge Engineering, August 2018.
[175] Ziniu Hu, Weiqing Liu, Jiang Bian, Xuanzhe Liu, and Tie-Yan Liu. Listening to chaotic whispers:
A deep learning framework for news-oriented stock trend prediction. In Proceedings of the Eleventh
ACM International Conference on Web Search and Data Mining, WSDM ’18, pages 261–269, New
York, NY, USA, 2018. ACM.
[176] Xiaodong Li, Jingjing Cao, and Zhaoqing Pan. Market impact analysis via deep learned architectures.
Neural Computing and Applications, March 2018.
[177] Che-Yu Lee and Von-Wun Soo. Predict stock price with financial news based on recurrent convolu-
tional neural networks. In 2017 Conference on Technologies and Applications of Artificial Intelligence
(TAAI). IEEE, December 2017.
[178] Shotaro Minami. Predicting equity price with corporate action events using lstm-rnn. Journal of
Mathematical Finance, 08(01):58–63, 2018.
[179] Akira Yoshihara, Kazuki Fujikawa, Kazuhiro Seki, and Kuniaki Uehara. Predicting stock market
trends by recurrent deep neural networks. In Lecture Notes in Computer Science, pages 759–769.
Springer International Publishing, 2014.
[180] Przemyslaw Buczkowski. Predicting stock trends based on expert recommendations using gru/lstm
neural networks. In Lecture Notes in Computer Science, pages 708–717. Springer International Pub-
lishing, 2017.
[181] Leonardo dos Santos Pinheiro and Mark Dras. Stock market prediction with deep learning: A
character-based neural language model for event-based trading. In Proceedings of the Australasian
Language Technology Association Workshop 2017, pages 6–15, 2017.
[182] Yang Liu, Qingguo Zeng, Huanrui Yang, and Adrian Carrio. Stock price movement prediction from
financial news with deep learning and knowledge graph embedding. In Knowledge Management and
Acquisition for Intelligent Systems, pages 102–113. Springer International Publishing, 2018.
[183] Takashi Matsubara, Ryo Akita, and Kuniaki Uehara. Stock price prediction by deep neural generative
model of news articles. IEICE Transactions on Information and Systems, E101.D(4):901–908, 2018.
[184] Janderson B. Nascimento and Marco Cristo. The impact of structured event embeddings on scalable
stock forecasting models. In Proceedings of the 21st Brazilian Symposium on Multimedia and the Web
- WebMedia'15. ACM Press, 2015.
[185] Md Shad Akhtar, Abhishek Kumar, Deepanway Ghosal, Asif Ekbal, and Pushpak Bhattacharyya.
A multilayer perceptron based ensemble technique for fine-grained financial sentiment analysis. In
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages
540–546. Association for Computational Linguistics, 2017.
[186] Ching-Yun Chang, Yue Zhang, Zhiyang Teng, Zahn Bozanic, and Bin Ke. Measuring the information
content of financial news. In COLING, 2016.
[187] Hitkul Jangid, Shivangi Singhal, Rajiv Ratn Shah, and Roger Zimmermann. Aspect-based financial
sentiment analysis using deep learning. In Companion of the The Web Conference 2018 on The Web
Conference 2018 - WWW'18. ACM Press, 2018.
[188] Shijia E., Li Yang, Mohan Zhang, and Yang Xiang. Aspect-based financial sentiment analysis with
deep neural networks. In Companion of the The Web Conference 2018 on The Web Conference 2018
- WWW'18. ACM Press, 2018.
[189] Sahar Sohangir, Dingding Wang, Anna Pomeranets, and Taghi M. Khoshgoftaar. Big data: Deep
learning for financial sentiment analysis. Journal of Big Data, 5(1), January 2018.
[190] Nader Mahmoudi, Paul Docherty, and Pablo Moscato. Deep neural networks understand investors
better. Decision Support Systems, 112:23–34, August 2018.
[191] Shiori Kitamori, Hiroyuki Sakai, and Hiroki Sakaji. Extraction of sentences concerning business
performance forecast and economic forecast from summaries of financial statements by deep learning.
In 2017 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, November 2017.
[192] Guangyuan Piao and John G. Breslin. Financial aspect and sentiment predictions with deep neural
networks. In Companion of the The Web Conference 2018 on The Web Conference 2018 - WWW'18.
ACM Press, 2018.
[193] Weifeng Li and Hsinchun Chen. Identifying top sellers in underground economy using deep learning-
based sentiment analysis. In 2014 IEEE Joint Intelligence and Security Informatics Conference. IEEE,
September 2014.
[194] Andrew Moore and Paul Rayson. Lancaster a at semeval-2017 task 5: Evaluation metrics matter: pre-
dicting sentiment from financial news headlines. In Proceedings of the 11th International Workshop on
Semantic Evaluation (SemEval-2017), pages 581–585, Vancouver, Canada, August 2017. Association
for Computational Linguistics.
[195] Josh Jia-Ching Ying, Po-Yu Huang, Chih-Kai Chang, and Don-Lin Yang. A preliminary study on deep
learning for predicting social insurance payment behavior. In 2017 IEEE International Conference on
Big Data (Big Data). IEEE, December 2017.
[196] Sahar Sohangir and Dingding Wang. Finding expert authors in financial forum using deep learning
methods. In 2018 Second IEEE International Conference on Robotic Computing (IRC). IEEE, January
2018.
[197] Vadim Sokolov. Discussion of ’deep learning for finance: deep portfolios’. Applied Stochastic Models
in Business and Industry, 33(1):16–18, 2017.
[198] Abdelali El Bouchti, Ahmed Chakroun, Hassan Abbar, and Chafik Okar. Fraud detection in banking
using deep reinforcement learning. In 2017 Seventh International Conference on Innovative Computing
Technology (INTECH). IEEE, August 2017.
[199] Matthew Dixon, Diego Klabjan, and Jin Hoon Bang. Implementing deep neural networks for financial
market prediction on the intel xeon phi. In Proceedings of the 8th Workshop on High Performance
Computational Finance - WHPCF'15. ACM Press, 2015.
[200] John Alberg and Zachary Chase Lipton. Improving factor-based quantitative investing by forecasting
company fundamentals. CoRR, abs/1711.04837, 2017.
[201] Kee-Hoon Kim, Chang-Seok Lee, Sang-Muk Jo, and Sung-Bae Cho. Predicting the success of bank
telemarketing using deep convolutional neural network. In 2015 7th International Conference of Soft
Computing and Pattern Recognition (SoCPaR). IEEE, November 2015.
[202] Joonhyuck Lee, Dong Sik Jang, and Sangsung Park. Deep learning-based corporate performance
prediction model considering technical capability. Sustainability, 9(6), May 2017.