
Review of deep learning models for crypto price prediction: implementation and evaluation

Jingyang Wu, Xinyi Zhang, Fangyixuan Huang, Haochen Zhou, Rohitash Chandra
UNSW Sydney, Sydney, Australia
arXiv:2405.11431v1 [cs.LG] 19 May 2024

Abstract
There has been much interest in accurate cryptocurrency price forecast models by investors and researchers. Deep Learning models
are prominent machine learning techniques that have transformed various fields and have shown potential for finance and economics.
Although various deep learning models have been explored for cryptocurrency price forecasting, it is not clear which models are
suitable due to high market volatility. In this study, we review the literature about deep learning for cryptocurrency price forecasting
and evaluate novel deep learning models for cryptocurrency price prediction. Our deep learning models include variants of
long short-term memory (LSTM) recurrent neural networks, variants of convolutional neural networks (CNNs), and the Transformer
model. We evaluate univariate and multivariate approaches for multi-step-ahead prediction of cryptocurrency close prices. Our
results show that the univariate LSTM model variants perform best for cryptocurrency predictions. We also carry out volatility
analysis on the four cryptocurrencies which reveals significant fluctuations in their prices throughout the COVID-19 pandemic.
Additionally, we investigate the prediction accuracy of two scenarios identified by different training sets for the models. First, we
use the pre-COVID-19 datasets to model cryptocurrency close-price forecasting during the early period of COVID-19. Secondly,
we utilise data from the COVID-19 period to predict prices for 2023 to 2024.
Keywords: cryptocurrency, deep learning, time series prediction
PACS: 0000, 1111
2000 MSC: 0000, 1111

1. Introduction

The traditional financial ecosystem is implemented through a complex set of policies and structural mechanisms that financial institutions utilise to engender currency within an economy [1]. The core of this ecosystem is the central bank, treasury, and commercial banking entities, which are classified under three primary monetary frameworks: commodity-based [2], commodity-backed [3], and fiat currency systems [4]. Triggered by the flaws in these institutions, such as inflationary propensities and transactional inefficiencies [5], the digitisation of currency has become a revolution [6]. Cryptocurrencies aim to rectify existing system imperfections [7], such as inflation, financial stability, transactional efficiency, and reduced operational costs. A cryptocurrency is a peer-to-peer digital exchange system where cryptographic techniques are employed to create and distribute units of currency among participants [8, 9]. The cryptocurrency market has seen rapid and unpredictable changes over its relatively brief existence [10]. The security of the cryptocurrency market is ensured by a technology called blockchain [11], which provides comprehensive security. In the present year (2024), there are over 5,000 cryptocurrencies and 5.8 million active users in the cryptocurrency industry [12]. Due to its inherent nature of mixing cryptography with a monetary unit, Bitcoin (BTC) became one of the most popular cryptocurrencies and received attention in fields such as computer science, economics and cryptography [13]. Satoshi Nakamoto pseudonymously introduced Bitcoin and released it as open-source software in January 2009 [8]. The cryptocurrency ecosystem encompassing Bitcoin and Altcoins, with tokens such as Civic and BitDegree, marks a significant stride towards a decentralised financial system. The cryptocurrency ecosystem refers to the broader infrastructure and community that encompasses various cryptocurrencies and blockchain projects, whereas a "token" such as BitDegree or Civic serves specific functions within these ecosystems, often facilitating access to services or representing certain assets. Unlike Bitcoin, which is primarily a digital currency intended for transactions and value storage, BitDegree serves a distinct purpose by focusing on education, offering tokens as incentives within its educational platform. Nevertheless, due to its decentralised nature and absence of governmental support, the cryptocurrency market is susceptible to significant fluctuations in value and the formation of pricing bubbles [14].

The inherent volatility of cryptocurrencies, featuring transaction volume fluctuations and price variability, complicates the predictive analysis of cryptocurrency prices [15]. However, volatility [16] makes it a profitable market for speculation as the source of potential gain. Prominent cryptocurrencies such as Bitcoin (BTC), Ethereum (ETH), and Litecoin (LTC) differ based on valuation, transaction speed, usage, and volatility [17]. Identifying the precise catalysts for these price trends in the cryptocurrency domain remains elusive due to the
sector's pronounced volatility. Nevertheless, the market value of cryptocurrencies is projected to increase in the future, with an expected compound annual growth rate of 11.1% [18]. Meanwhile, the financial audit sector is evolving to integrate cryptocurrencies as a valid transaction medium. Investors have encountered challenges in previous instances due to price bubbles resulting in extreme fluctuations [19]. In order to surmount these obstacles, it is imperative to have a dependable model that can aid market participants in identifying trends and generating accurate predictions. Predicting cryptocurrency prices with precision is difficult due to their sensitivity to multiple factors, including government policies, technology advancements, public perception, and world events [20]. Murray et al. [21] highlight the inherent difficulties in predicting the pricing of cryptocurrencies because of their high volatility, decentralised nature, and other distinctive features such as transaction speed and variations in their ecosystems.

Several researchers have affirmed the correlation between cryptocurrencies and other domains such as economics, finance, the internet, and even politics. Wang et al. [22] presented an analysis using machine learning models and revealed a strong correlation between cryptocurrencies and their intrinsic features (e.g., lagged volatility, previous trading information). Kyriazis [23] studied spillover effects in cryptocurrency markets, emphasising Bitcoin's role, using statistical models such as vector autoregression (VAR) [24] and generalised autoregressive conditional heteroskedasticity (GARCH) [25] to explain inter-market dynamics. Huynh et al. [26] revealed that gold can be used as a reliable tool to reduce the risk associated with unpredictable changes in the cryptocurrency market when utilised as a separate form of currency. However, investors are enthusiastic and also cautious due to the highly volatile cryptocurrency market. Machine learning and deep learning models are promising for cryptocurrency prediction due to their ability to model multimodal [27] and spatiotemporal data [28], and their track record in time series forecasting [29].

Machine learning and deep learning models have shown great potential in temporal forecasting problems for various domains, such as climate extremes [30], energy [31], and financial time series [32]. Deep learning models can assist in forecasting future cryptocurrency prices, although there are challenges due to the nonlinear and volatile nature of the time series. Many researchers are keen to use long short-term memory (LSTM) networks and their variants to predict cryptocurrencies [33, 34, 35]. Deep learning methods such as LSTM recurrent neural networks [36, 37], convolutional neural networks (CNNs) [38], and Transformer models [39] are also promising for predicting cryptocurrencies. Chandra et al. [40] led a comparative analysis of various deep learning models for multi-step-ahead time series prediction. A myriad of factors, both internal and external, such as trading volume, market beta, and volatility, play a critical role in determining cryptocurrency value. Therefore, we need to utilise cryptocurrencies that are highly correlated for deep learning models and assess univariate and multivariate deep learning models.

In this paper, we provide a detailed review of the literature on crypto-price forecasting using deep learning models and then evaluate novel deep learning models for cryptocurrency price forecasting. Specifically, we utilise variants of long short-term memory (LSTM) recurrent neural networks, variants of convolutional neural networks (CNNs), and the Transformer model. We evaluate univariate and multivariate approaches for multi-step-ahead prediction of cryptocurrency close prices. Our results show that the univariate LSTM model variants perform best for cryptocurrency predictions. We also carry out volatility analysis on the four cryptocurrencies and investigate the prediction accuracy of two scenarios identified by different training sets for the models. First, we use the pre-COVID-19 datasets to model cryptocurrency close-price forecasting during the early period of COVID-19. Secondly, we utilise data from the COVID-19 period to predict prices for 2023 to 2024. We investigate the effect of univariate and multivariate models, where the multivariate model features the Gold price, the close, open, and high price of the crypto being predicted, and a highlighted correlated crypto price index.

The rest of the paper is organised as follows. Section 2 provides a comprehensive overview and analysis of previous research and literature relating to the topic. Section 3 provides the framework that compares selected deep learning models. Section 4 presents the experiments and results. Section 5 presents a discussion, and Section 6 concludes the paper.

2. Review

Forecasting financial time series is highly favoured by researchers in both academic and financial sectors due to its wide applications and significant influence. Machine learning and deep learning have paved the way for numerous models, leading to a large body of published research. Among these areas of interest, cryptocurrency price prediction stands out. This section offers an in-depth overview of how machine learning and deep learning are applied to financial time series forecasting, especially for predicting cryptocurrency prices, without using complex terminology.

2.1. Financial time series prediction

Financial time series forecasting has an emphasis on predicting asset prices [41]. Although there are diverse methodologies, the key focus has been on predicting the future movements of the underlying asset with deep learning models [42]. This field covers a variety of subjects, including forecasting of stock prices, index prediction, forex price prediction, as well as predictions for commodities (such as oil and gold), bond prices, volatility, and cryptocurrency prices [43]. Despite the wide range of topics, the underlying principles applied in these forecasts remain uniformly applicable across all categories.

Research within financial time series forecasting is broadly segregated into two categories based on precise price forecasting and trend (directional movement) forecasting [44]. Although exact price prediction aligns with regression tasks, the primary goal in numerous financial forecasting projects is not the accurate prediction of prices, but rather the correct identification of price movement direction. This shifts the emphasis
towards trend prediction, or determining the directional change in prices, marking it as a more critical area of investigation compared to pinpoint price forecasting. Hence, trend prediction is approached as a classification problem. Some analyses focus on binary outcomes, addressing only upward/downward movements [45], while others incorporate a third class (neutral option), thus constituting a 3-class problem [46].

In recent years, researchers have utilised machine learning and deep learning for the analysis of financial time series data. Nabipour et al. [45] conducted a comparative analysis of deep learning models (simple recurrent neural networks (RNNs) [47] and LSTM networks [36]) with machine learning models for stock market trend prediction, demonstrating the superior accuracy of deep learning. Mehtab et al. [48] enhanced NIFTY-50 Indian stock index prediction using LSTM models with grid-search and walk-forward validation, and achieved notable accuracy. NIFTY-50 represents the weighted average of 50 of the top companies listed on the National Stock Exchange (NSE) of India (https://www.nseindia.com/). Rezaei et al. [49] combined deep learning with frequency decomposition methods, including empirical mode decomposition (EMD) [50] and complete ensemble empirical mode decomposition (CEEMD) [51], to predict stock prices and demonstrated the effectiveness of CEEMD-CNN-LSTM and EMD-CNN-LSTM. Jing et al. [52] developed a hybrid model that merges deep learning with investor sentiment analysis, utilising a CNN for sentiment classification and an LSTM for stock price prediction, demonstrating enhanced predictive accuracy for stock prices. Mehtab and Sen [53] used a blend of machine learning and deep learning models with walk-forward validation and a grid-search technique for precise short-term forecasting rather than long-term trends of NIFTY-50, offering valuable insights for short-term traders. Li and Pan [54] enhanced stock price prediction accuracy by employing an ensemble deep learning model that leveraged stock prices and news data, using LSTM and gated recurrent unit (GRU) networks. Kanwal et al. [55] introduced a hybrid deep learning model combining bidirectional LSTM and one-dimensional CNN for stock price prediction, achieving higher accuracy and efficiency on five distinct stock datasets. Swathi et al. [56] presented a novel model for stock price prediction, leveraging Twitter sentiment analysis with an impressive accuracy of 94.73%, showcasing its effectiveness over traditional and other deep learning methods. Ben Ameur et al. [57] utilised deep learning models (LSTM, GRU, RNN, and CNNs) to forecast commodity prices for the Bloomberg Index, demonstrating the superior performance of LSTM models. Baser et al. [58] evaluated gold price prediction using tree-based models, including Decision Trees, AdaBoost, Random Forest, Gradient Boosting, and XGBoost; they demonstrated XGBoost's superior accuracy through technical indicator analysis. Deepa et al. [59] used statistical and machine learning models for the prediction of cotton prices in India and reported that boosted decision tree regression provided the highest accuracy. Zhao and Yang [60] proposed an integrated deep learning framework for stock price movement prediction, which combined sentiment analysis with deep learning models and achieved enhanced prediction accuracy by incorporating both market data and investor sentiment.

Table 1 provides a list of sample studies focused on financial time series forecasting with machine learning and deep learning methods. We report the various models along with error measures such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of the data used in these studies.

2.2. Cryptocurrency prediction

Some researchers have employed machine learning models, such as simple neural networks (SNN), also known as backpropagation or artificial neural networks [61], support vector machines (SVM) [62], genetic algorithm-based SNN [63], and neuroevolution of augmenting topologies (NEAT) [64], which evolves both the architecture and the neural network parameters. Next, we review some of the machine learning models that are pivotal in predicting cryptocurrency prices. Greaves and Au [65] demonstrated the superiority of neural networks over linear regression, logistic regression, and support vector machines (SVM) [66] for Bitcoin price prediction. Sovbetov [67] examined the effect of market factors on various cryptocurrencies by using an autoregressive distributed lag (ARDL) model and the S&P50 Index. Guo et al. [68] improved short-term Bitcoin volatility forecasting with temporal mixture models, outperforming traditional methods. Akcora et al. [69] investigated the predictive Granger causality of chainlets and identified certain types of chainlets that exhibit the highest predictive influence on Bitcoin price and investment risk. Roy et al. [70] used ARIMA, autoregressive, and moving average models in forecasting short-run volatility in Bitcoin's weighted costs. Derbentsev et al. [71] compared binary autoregressive tree (BART), ARIMA, and autoregressive fractional integrated moving average (ARFIMA) models for forecasting Bitcoin, Ethereum, and Ripple prices, where BART had the best accuracy. Kumar et al. [72] and Latif et al. [73] examined the effectiveness of LSTM and ARIMA models in the short-term prediction of BTC prices, demonstrating that while ARIMA models can capture the general trend, LSTM models excel in predicting both the direction and magnitude of price movements, highlighting the potential of deep learning in financial market predictions. Maleki et al. [74] used machine learning models including linear regression, gradient boosting regressor (GBR), support vector regressor (SVR), random forest regressor (RFR) and ARIMA in predicting Bitcoin prices, suggesting new investment strategies in the cryptocurrency market.

Table 2 provides a list of studies focused on using traditional statistical and machine learning methods to predict cryptocurrency trends. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of the data used in these papers.
Methods | Data | Target predictor | Time range (month/day/year) | Metric
LSTM with grid-search [48] | NIFTY 50 index | Price prediction | 12/29/2014-07/31/2020 | RMSE
MLP, RNN, LSTM [45] | Stock market trends | Trend prediction | 11/01/2009-11/01/2019 | F1-Score, Accuracy, ROC-AUC
LSTM, CNN, empirical mode decomposition, CEEMD [49] | Stock prices | Price prediction | 01/01/2010-09/01/2019 | RMSE, MAE, MAPE
CNN, LSTM [52] | Stock prices | Price prediction | 01/01/2017-07/01/2019 | MAPE
Linear Regression, Bagging, XGBoost, Random Forests, MLP, SVM, LSTM [53] | NSE stock prices | Short-term price prediction | - | Comparative analysis
LSTM with Sentiment Analysis [54] | Stock prices | Price prediction | 12/31/2017-06/01/2018 | MSE, Precision, Recall, F1-Score
BD-LSTM, 1D-CNN [55] | Stock prices | Price prediction | 01/01/2000-12/31/2020 | Accuracy, Efficiency
Teaching Learning Based Optimization LSTM [56] | Stock prices | Price prediction | - | Accuracy, Precision, Recall, F1-Score
LSTM, Gated Recurrent Units, RNN, CNN [57] | Bloomberg Commodity Index | Price prediction | 01/01/2002-12/31/2020 | Accuracy
Decision Tree, AdaBoost, Random Forests, Gradient Boosting, XGBoost [58] | Gold prices | Price prediction | 11/18/2011-01/01/2019 | MAE, MSE, RMSE, R2 Score
Logistic Regression, Bayesian Linear Regression, Boosted Decision Tree Regression, Random Forest Regression, Poisson Regression [59] | Agriculture material prices | Price prediction | - | MAE, RMSE, RAE, R square
LSTM, Ensemble CNN, Denoising Autoencoder, Sentiment Analysis [60] | Stock prices and sentiment | Price movement | 01/01/2002-12/31/2020 | RMSE

Table 1: Sample studies focused on financial time series forecasting

Methods | Cryptocurrency (type) | Target predictor | Time range (month/day/year) | Metric
Linear regression, Logistic regression, Neural Networks, SVM [65] | BTC | Future price | prior-07/04/2013 | MSE, Accuracy
Autoregressive Distributed Lag [67] | BTC, ETH, Dash, LTC, Monero | Short/long-term price | 01/01/2010-01/12/2018 | ADF test, price
Temporal mixture models [68] | BTC | Short-term volatility | 09/01/2015-04/01/2017 | RMSE, MAE
k-Chainlets [69] | BTC | Close price | 01/01/2009-01/01/2018 | RMSE, wallet gain
ARIMA, Autoregressive, Moving average [70] | BTC | Market price | 07/31/2013-08/01/2017 | RMSE
BART, ARIMA, ARFIMA [71] | BTC, Ripple, ETH | Short-term price | 01/01/2017-03/01/2019 | RMSE
ARIMA, LSTM [72] | ETH | Close price | 01/01/2016-12/31/2021 | MSE
ARIMA, LSTM [73] | BTC | Short-term price | 12/21/2020-12/21/2021 | MAPE, MAE, RMSE
Logistic Regression, Gradient boosting regressor, SVR, Random forest regressor, ARIMA [74] | BTC, ETH, ZEC, LTC | Close price | 04/01/2018-03/31/2019 | MSE, MAPE, MAE, AIC, BIC

Table 2: Sample studies focused on using machine learning and time series methods in cryptocurrency forecasting

2.3. Deep learning models for cryptocurrency prediction

In recent years, deep learning models have been prominent in the prediction of cryptocurrencies, as follows. Jiang and Liang [38] combined CNNs with reinforcement learning [75] for portfolio management, utilising historical cryptocurrency pricing data to allocate assets optimally within specified portfolio constraints. Wu et al. [35] improved Bitcoin prediction accuracy by using autoregressive characteristics in an LSTM network, outperforming the standard LSTM. Lee et al. [76] introduced a novel approach employing inverse reinforcement learning coupled with agent-based modelling for Bitcoin price prediction. Ly et al. [77] employed LSTM networks to predict Bitcoin trends, demonstrating the models' capability to forecast price changes and classify market movements with varying degrees of accuracy. Saad et al. [13] found LSTM to be the most accurate in forecasting Bitcoin prices compared to various machine learning models. Lucarelli and Borrotti [78] investigated automated cryptocurrency trading using deep reinforcement learning, employing double deep Q-learning networks trained by Sharpe ratio rewards, which outperformed traditional models in Bitcoin trading. Lahmiri and Bekiros [79] compared LSTM networks with generalised regression neural networks (GRNN) to forecast cryptocurrency prices, revealing the chaotic dynamics and fractality in digital currencies' time series. Patel et al. [80] introduced a hybrid LSTM with gated recurrent unit model for Litecoin and Monero and achieved higher accuracy than a simple LSTM model. Livieris et al. [33] combined deep learning and ensemble learning to forecast trends and prices of Bitcoin, Ethereum, and Ripple; LSTM, bidirectional LSTM, and CNN models demonstrated the capability to deliver precise and dependable predictions. Marne et al. [81] used RNN and LSTM models to predict Bitcoin prices, which showed better results than machine learning models. Nasekin and Chen [82] analysed cryptocurrency investor sentiment using a CNN for sentiment classification and index construction from StockTwits messages. Sridhar and Sanagavarapu [39] employed a Transformer model for Dogecoin price prediction, demonstrating the model's capability to capture both short-term and long-term dependencies effectively. Betancourt and Chen [83] proposed the utilisation of deep reinforcement learning (DRL) [84] for the dynamic management of cryptocurrency asset portfolios, accommodating portfolios comprising an evolving number of cryptocurrency assets. Shahbazi and Byun [85] applied reinforcement learning for forecasting Litecoin and Monero market values. D'Amato et al. [86] employed a Jordan RNN to enhance the prediction of cryptocurrency volatility, demonstrating superior accuracy over traditional machine learning models for Bitcoin, Ripple, and Ethereum. Schnaubelt [87] applied reinforcement learning to develop cryptocurrency trading strategies. Parekh et al. [88] combined LSTM and sentiment analysis to predict cryptocurrency prices; the study integrated market sentiment from social media for enhanced forecasting accuracy. Kim et al. [89] applied a self-attention-based multiple LSTM model and improved the prediction accuracy for Bitcoin. Goutte et al. [90] used LSTM networks with technical analysis to enhance cryptocurrency trading strategies, particularly focusing on Bitcoin.

Table 3 provides an overview of sample research that focuses
on applying deep learning techniques to forecast the trend of cryptocurrencies. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of the data used in these papers.

2.4. Cryptocurrency volatility and prediction

Several researchers concentrate on analysing and predicting the volatility of cryptocurrencies. Volatility in the cryptocurrency market is a significant factor that influences numerous decisions in business and finance [91]. Recently, volatility spillovers have been identified between the cryptocurrency market and other financial markets [86]. Katsiampa [16] employed an Asymmetric Diagonal BEKK model to examine the volatility dynamics in the cryptocurrency market, revealing significant interdependencies and responsiveness to major news in the volatility levels of major cryptocurrencies such as Bitcoin, Ether, Ripple, Litecoin, and Stellar Lumen. Woebbeking [92] developed the CVX index using a model-free approach derived from cryptocurrency option prices, unveiling that cryptocurrency volatility often diverges from traditional financial markets and is distinctly reactive to major market events. Yen and Cheng [91] utilised stochastic volatility models to analyse the impact of the Economic Policy Uncertainty (EPU) index on cryptocurrency volatility, finding that China's EPU uniquely predicts the volatility of Bitcoin and Litecoin, suggesting these cryptocurrencies might serve as hedging tools against EPU risks. Cross, Hou, and Trinh [93] utilised a time-varying parameter model to explore the returns and volatility of cryptocurrencies during the 2017-18 bubble, highlighting a significant risk premium effect in Litecoin and Ripple and identifying adverse news effects as key drivers of the 2018 crash across Bitcoin, Ethereum, Litecoin, and Ripple. Ftiti, Louhichi, and Ben Ameur [94] utilised heterogeneous autoregressive (HAR) models with high-frequency data to explore cryptocurrency volatility during the COVID-19 pandemic; their findings underscore the predictive superiority of models incorporating both positive and negative semi-variances, especially during the crisis, suggesting these models can effectively capture the asymmetric dynamics of market volatility. Yin, Nie, and Han [95] applied the Generalized Autoregressive Conditional Heteroskedasticity - Mixed Data Sampling (GARCH-MIDAS) model to explore the influence of oil market shocks on the volatility of Bitcoin, Ethereum, and Ripple. Their analysis revealed that oil market shocks, both supply and demand types, significantly affect the long-term volatility of these cryptocurrencies, thereby suggesting potential hedging capabilities against oil-induced economic uncertainties.

There is also some research on the prediction of cryptocurrency volatility. Catania, Grassi, and Ravazzolo [96] employed a score-driven Generalized Hyperbolic Skew Student's t (GHSKT) model to analyse and predict the volatility of Bitcoin, Ethereum, Litecoin, and Ripple. They demonstrated that accounting for long memory and asymmetric reactions to past shocks enhances the model's predictive accuracy significantly across various forecast horizons. Catania and Grassi [97] further employed the GHSKT model, demonstrating that the model's ability to incorporate higher-order moments and leverage effects significantly enhances the accuracy of volatility forecasts across various cryptocurrencies. Ma et al. [98] employed the Markov Regime-Switching Mixed Data Sampling (MRS-MIDAS) model to forecast cryptocurrency volatility, particularly focusing on Bitcoin. They enhanced the standard MIDAS approach by incorporating jump-driven time-varying transition probabilities, which allowed the model to capture dynamic changes in volatility states influenced by market jumps.

Table 4 provides an overview of sample research that focuses on cryptocurrency volatility and prediction. The table reports metrics such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and root mean squared error (RMSE). We also mention the time periods of the data used in these papers.

3. Methodology: Implementation and Evaluation

3.1. Conventional models

3.1.1. ARIMA

The ARIMA model, often known as the Box-Jenkins model [100], is a commonly used statistical/econometric model for forecasting time series data. The ARIMA model consists of three components: autoregressive (AR), integrated (I), and moving average (MA). The integrated component represents the amount of differencing required to transform the series into a stationary representation. The autoregressive component describes the relationship between the present value of a time series and its previous values, capturing their correlation. The moving average component indicates the correlation between the current observation and its previous error terms; this component assists the model in capturing stochastic variations in the time series. The three components constitute the three parameters p, d, and q in the model: p represents the number of lag observations in the autoregressive part, d is the order of differencing, which forms the integrated part, and q is the number of lagged forecast errors in the moving average component.

3.1.2. Multilayer perceptron

A simple neural network, also known as the multilayer perceptron (MLP), is a machine learning model that features an input layer, an output layer and at least one hidden layer. Figure 1 illustrates the architecture of the MLP. The MLP needs a training algorithm to update the weights and biases to ensure that the output (prediction) of the network resembles the actual observations (training data). The network computes the weighted sum of the inputs to obtain the hidden and output layers by

\[
h_{W,b}(x) = f(Wx + b) = f\left(\sum_{i=1}^{n} W_i x_i + b\right)
\tag{1}
\]

where x is the input, f(·) is the activation function, b is the bias, n is the number of input units and W is the weight.
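For illustration, the sketch below shows how an ARIMA(p, d, q) baseline of the kind described above can be fit to a close-price series with the statsmodels library; the file name, the 70:30 split, and the order (5, 1, 0) are assumptions for the example rather than the configuration tuned in this study.

```python
# Minimal ARIMA(p, d, q) baseline sketch using statsmodels; the order (5, 1, 0)
# and the input file are illustrative assumptions, not this paper's settings.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

close = pd.read_csv("btc.csv", parse_dates=["Date"], index_col="Date")["Close"]  # hypothetical file
split = int(0.7 * len(close))
train, test = close.iloc[:split], close.iloc[split:]

model = ARIMA(train, order=(5, 1, 0))   # p lag terms, d differences, q error lags
fit = model.fit()
forecast = fit.forecast(steps=5)        # 5-step-ahead close-price forecast
print(forecast)
```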
Deep learning techniques | Cryptocurrency (type) | Target predictor | Time range (month/day/year) | Metric
CNN, Reinforcement Learning [38] | BTC, ETH, XRP | Close price | 01/01/2018-02/28/2019 | RMSE, Accuracy, AUC, F1
LSTM with autoregressive characteristics [35] | BTC | Short/long-term price | 01/01/2018-07/28/2018 | MSE, RMSE, MAPE, MAE
Inverse Reinforcement Learning, Agent-based Model [76] | BTC | Close price | 09/01/2016-07/31/2017 | MSE, RMSE, MAPE
RNN, LSTM [77] | BTC | Trend prediction | - | -
Reinforcement Learning, LSTM, Conjugate Gradient [13] | BTC, ETC | Close price | 10/01/2015-05/01/2018 | RMSE, MAE
Deep Reinforcement Learning [78] | BTC | Trading | - | Profit-based metrics
Chaotic Neural Networks [79] | BTC, DASH, XRP | Price forecasting | 07/16/2010-10/01/2018 | -
LSTM with Gated Recurrent Units [80] | LTC, Monero | Short/long-term price | 30/01/2015-02/23/2020 | Accuracy
LSTM, BD-LSTM, CNN [33] | BTC, ETH, XRP | Short/long-term price | 01/01/2018-08/31/2019 | MSE, RMSE, MAPE
RNN, LSTM [81] | BTC | Close price | 01/01/2014-01/31/2019 | RMSE
CNN [82] | Various | Sentiment analysis | 03/01/2013-05/31/2018 | -
Transformer [39] | DOGE | Close price | 07/05/2019-04/28/2021 | Accuracy, R-squared
Deep Reinforcement Learning [83] | Various | Portfolio management | 08/17/2017-11/01/2019 | Total Return, Sharpe Ratio
Reinforcement Learning [85] | LTC, Monero | Price prediction | 2016-2020 | MAE, MSE, RMSE, MAPE
Jordan RNN [86] | BTC, XRP, ETH | Volatility | 04/28/2013-12/15/2019 | MSE, MAPE
Deep Reinforcement Learning [87] | Various | Limit order placement | 01/01/2018-06/30/2019 | Total Return, Sharpe Ratio
LSTM, sentiment analysis [88] | Dash, BTC-Cash | Price prediction | - | MSE, MAE, MAPE

Table 3: Sample studies focused on deep learning methods in cryptocurrency forecasting

Methods | Data | Target predictor | Time range (month/day/year) | Metric
Asymmetric Diagonal BEKK model [16] | BTC, ETH, XRP, LTC, XLM | Volatility dynamics | 08/07/2015-02/10/2018 | Past squared errors, past conditional volatility
Model-free volatility, CVX index [92] | BTC | Volatility dynamics | 02/06/2020-07/06/2021 | Not specified
Stochastic volatility models [91] | BTC, LTC | EPU impact on cryptocurrency volatility | 02/2014-06/2019 | Not specified
Time-varying parameter stochastic volatility model [93] | BTC, ETH, LTC, XRP | Returns and volatility dynamics | 01/2017-01/2019 | Forecast accuracy, MSFE, ALPL
Heterogeneous autoregressive models [94] | BTC, ETH, ETC, XRP | Volatility forecasting | 04/01/2018-06/30/2020 | MSE, MAE, MAPE
GARCH-MIDAS model [95] | BTC, ETH, XRP | Impact of oil market shocks on volatility | 04/28/2013-12/31/2018 | MAE, MAPE, RMSE
Score-driven Generalized Hyperbolic Skew Student's t model [96] | BTC, ETH, LTC, XRP | Volatility forecasting | 04/29/2013-12/01/2017 | Quasi-likelihood loss function
Score-driven Generalized Hyperbolic Skew Student's t model [97] | BTC, ETH, LTC, XRP | Volatility forecasting | - | MSE, MAE, MAPE
Markov Regime-Switching, Mixed Data Sampling model [98] | BTC | Volatility forecasting | 03/01/2013-09/29/2018 | Quasi-likelihood loss function, MSE, MAE
GARCH-MIDAS model [99] | Gold, Silver | Cryptocurrency uncertainty impact on precious metal volatility | 01/02/2014-05/13/2022 | Diebold-Mariano test, R-square, Model Confidence Set test, Direction-of-Change rate test

Table 4: Sample studies focused on cryptocurrency volatility and prediction

Figure 1: Architecture of a multilayer perceptron showing the input, hidden and output layers and the interconnections between them, as well as the parameters in each layer (weight, bias and input).

3.2. Deep learning models

3.2.1. Variants of LSTM networks

RNNs are well-known for modelling temporal sequences, and are distinguished by their context layers, as they memorise information from prior inputs to influence future results. There are several simple RNN architectures, such as the Elman RNN [47] (also known as the simple RNN), which was one of the earliest attempts at effectively modelling temporal sequences. Figure 2 gives the architecture of the Elman RNN. There are trainable weights connecting every two adjacent layers. A context (state or memory) layer is used to store the output of the state neurons resulting from the computation of previous time steps, making the network appropriate for capturing time-varying patterns in data.

Figure 2: Architecture of the Elman RNN that consists of input, hidden, context (state), and output layers.

However, simple RNNs faced problems in training due to the vanishing gradient problem [101], which arises when handling long-term dependencies in sequence data. The LSTM algorithm is considered to be an enhanced version of the RNN [36]. The LSTM overcame the vanishing gradient constraint by enhancing its ability to retain long-term dependencies through memory cells in the hidden layer. We present the architecture of the LSTM network in Figure 3, showing how information is passed through LSTM memory cells in the hidden layer. The LSTM cell is designed as a unit that memorises each input for a long time, where previous information can still be retained, hence addressing the problem of learning long-term dependencies in sequence data. The LSTM cell calculates a hidden state output h_t by

\[
\begin{aligned}
f_t &= \sigma(W_f[h_{t-1}, x_t] + b_f)\\
i_t &= \sigma(W_i[h_{t-1}, x_t] + b_i)\\
o_t &= \sigma(W_o[h_{t-1}, x_t] + b_o)\\
z &= \tanh(W_z[h_{t-1}, x_t] + b_z)\\
C_t &= f_t * C_{t-1} + i_t * z\\
h_t &= o_t * \tanh(C_t)
\end{aligned}
\tag{2}
\]

where f_t, i_t and o_t refer to the forget gate, input gate and output gate, respectively. W denotes the weight matrices adjusted during learning along with b, which is the bias; x_t is the input feature vector at time step t, and h_t is the hidden state. z expresses the intermediate cell state, and C_t is the current cell memory.

Figure 3: LSTM network showing the input, hidden (LSTM cell), and output layers. The LSTM cell extracts information from the input feature x over the whole time span.

The bi-directional LSTM (BD-LSTM) is an advanced algorithm based on the LSTM, which processes information in both directions with two independent hidden layers [102]. The basic idea is that each input sequence passes through the RNN once in both the forward and reverse directions. This bidirectional architecture provides the output layer with complete past and future context information for each node in the input sequence. The structure of the BD-LSTM is shown in Figure 4. In contrast to the LSTM, the BD-LSTM exhibits greater efficacy in addressing problems that require the acquisition of context from both temporal directions. This is particularly evident in certain applications within the domains of natural language processing [103] and speech recognition [104].
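As a concrete illustration of the LSTM formulation above, the following minimal PyTorch sketch maps a window of past close prices to a multi-step-ahead output; the hidden size, window length and prediction horizon are illustrative assumptions rather than the tuned values reported later in the paper.

```python
# Univariate LSTM for multi-step-ahead prediction; layer sizes and window
# lengths are illustrative assumptions, not the tuned values from this study.
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, input_size=1, hidden_size=50, horizon=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, horizon)   # last hidden state -> prediction horizon

    def forward(self, x):                 # x: (batch, window_length, 1)
        out, _ = self.lstm(x)             # out: (batch, window_length, hidden_size)
        return self.fc(out[:, -1, :])     # predictions for the next `horizon` steps

model = LSTMForecaster()
window = torch.randn(16, 6, 1)            # batch of 16 windows of 6 scaled close prices
print(model(window).shape)                # torch.Size([16, 5])
```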
Figure 4: BD-LSTM network showing the input, output and backward/forward layers and the connections between them. The two hidden layers combine forward and backward information flow to enhance the ability of the model to acquire information.

The encoder-decoder LSTM (ED-LSTM) can output a required sequence based on an input sequence (the lengths of the sequences can differ) [105]. The ED-LSTM makes specific architectural changes to the original LSTM to better handle a class of problems known as sequence-to-sequence. The ED-LSTM is very suitable for translating a certain language into different languages [106]. We present the ED-LSTM architecture in Figure 5.

3.2.2. Convolutional neural networks

CNNs are one of the most prominent deep learning models, initially designed for computer vision and image processing tasks [107, 108, 109, 110]. Their application spans diverse areas, notably detection [111] and segmentation tasks [112], where they have shown superior accuracy when compared to traditional machine learning models. A CNN typically comprises several layers, including convolutional, pooling, and fully connected layers. The fully connected layer, akin to conventional neural networks, ensures a dense interconnection between the nodes of consecutive layers. CNNs identify hierarchical patterns (features) in the data through iterative convolution and pooling, culminating in a fully connected layer that consolidates these features for the final task output. This structural design has been pivotal for their proficiency in handling tasks related to image processing.

Given that our dataset consists of a univariate price time series, it is crucial to modify the conventional two-dimensional CNN to suit our problem. Consequently, we integrated a specialised function into our model that processes a set of price data inputs, along with specifications such as filter count, filter width, and stride length, while the kernel's height remains irrelevant. This function initialises filter values using a Gaussian distribution and sets biases to zero. It generates several matrices as outputs, where their quantity corresponds to the filter count. These matrices are crucial as they contribute to feature extraction within the CNN model, ultimately serving as inputs for the subsequent pooling layer following the activation function's execution. We note that in the case of multivariate time series data, the conventional 2D-CNN can be utilised.

The activation function is essential to optimise model performance. Activation functions such as the hyperbolic tangent (Tanh), rectified linear unit (ReLU), Sigmoid, and leaky ReLU are typically employed in CNNs. We opt for ReLU and Leaky ReLU, which are prominent in the literature and also have the ability to avoid vanishing gradients, as given below:

\[
\mathrm{ReLU}(z) = \begin{cases} 0 & \text{if } z \le 0,\\ z & \text{if } z > 0, \end{cases}
\qquad
\text{Leaky-ReLU}(z_i) = \begin{cases} \alpha_i z_i & \text{if } z_i \le 0,\\ z_i & \text{if } z_i > 0, \end{cases}
\tag{3}
\]

where z and z_i are convolution outcomes, and α_i is a user-defined hyperparameter for convolutional layer i, typically starting at 0.01. We select ReLU for the initial convolutional layer to address the vanishing gradient issue, followed by Leaky-ReLU in subsequent layers, as shown in Figure 6. We train the CNN model by minimising the error defined by the loss function using the Adam optimiser [113] with a user-defined learning rate λ = 0.0001.

3.2.3. Convolutional LSTM networks

The Convolutional LSTM (Conv-LSTM) network [114] was initially introduced for weather forecasting problems. This network extends the original fully connected LSTM and changes the matrix multiplication of the LSTM cell to convolution. We use ∗ to denote the convolution operation and ◦ the Hadamard product. The key equations in the Conv-LSTM cell are expressed as:

\[
\begin{aligned}
f_t &= \sigma(W_{xf} * x_t + W_{hf} * h_{t-1} + W_{cf} \circ c_{t-1} + b_f)\\
i_t &= \sigma(W_{xi} * x_t + W_{hi} * h_{t-1} + W_{ci} \circ c_{t-1} + b_i)\\
o_t &= \sigma(W_{xo} * x_t + W_{ho} * h_{t-1} + W_{co} \circ c_t + b_o)\\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh(W_{xc} * x_t + W_{hc} * h_{t-1} + b_c)\\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
\tag{4}
\]

where f_t, i_t, o_t and h_t refer to the forget gate, input gate, output gate and hidden state, respectively. W denotes the weight matrices learned along with b, the bias. Also, the past state c_{t-1} can be regarded as "forgotten" in the process, and c_t is the current cell memory. These equations are similar to Equation (2). The Conv-LSTM model has the ability to capture both the spatial and temporal relationships in the data at the same time, resulting in more precise predictions. In our implementation, for the case of univariate time series we utilise 1D convolutions in the Conv-LSTM, and 2D convolutions for multivariate time series forecasting.
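A minimal sketch of the 1D-CNN arrangement described above is given below, with ReLU after the first convolutional layer, Leaky ReLU afterwards, and the Adam optimiser with learning rate 0.0001 as stated in the text; the filter counts, kernel sizes and window length are illustrative assumptions rather than the exact architecture used in this study.

```python
# 1D-CNN sketch for a univariate input window: ReLU after the first convolution,
# Leaky ReLU afterwards, Adam with learning rate 0.0001 as in the text;
# filter counts, kernel sizes and window length are illustrative assumptions.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=3, padding=1),
    nn.LeakyReLU(negative_slope=0.01),
    nn.MaxPool1d(kernel_size=2),
    nn.Flatten(),
    nn.Linear(64 * 3, 5),                  # input window of 6 steps -> 5-step-ahead forecast
)
optimiser = torch.optim.Adam(cnn.parameters(), lr=0.0001)

x = torch.randn(16, 1, 6)                  # (batch, channels, window length)
print(cnn(x).shape)                        # torch.Size([16, 5])
```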
Figure 5: Encoder-Decoder LSTM Network showing encoder and decoder transfer sequence with encoder vector.

Figure 6: Architecture of the 1D-CNN, featuring input, 1D-convolutional, pooling, dense and output layers.

3.2.4. Transformer networks

The Transformer model is an extension of the encoder-decoder LSTM architecture, which has been widely used in machine translation problems [115]. The encoder condenses the essential data of the input sequence into a vector of fixed length, which is subsequently transformed into an output by the decoder [105]. The design of the decoder offers a method for managing lengthy sequential data [116].

Analogously, we input the sequential data to a vector representation layer. Given the input sequence X = {x_i : i = 1, ..., N} ∈ R^N, the m-dimensional embedding layer yields a matrix B ∈ R^{N×m} through a dense network. We need to incorporate temporal encoding with the vectorised input to encapsulate the temporal structure of the time series. Hence, employing sine and cosine functions at distinct frequencies to represent temporal information, we define

\[
TE_{(i,2k)} = \sin\!\left(i/10000^{2k/m}\right), \qquad TE_{(i,2k+1)} = \cos\!\left(i/10000^{2k/m}\right),
\]

where 1 ≤ 2k ≤ m. The temporal encoding, hence, is TE ∈ R^{N×m}. The vector representations alongside the temporal encodings are then concatenated and provided to the encoder layers.

A concise overview of the complete framework of our Transformer model is delineated in Figure 7. The encoder depicted in Figure 7 consists of M identically structured layers. Each layer is equipped with two sub-layers: a multi-head self-attention mechanism and a fully connected feed-forward network. Both sub-layers incorporate residual connections and normalisation to enhance their functionality. The decoder, also shown in Figure 7, mirrors the encoder's structure with a notable distinction: it features an additional multi-head self-attention layer. Unlike the original decoder described in [117], this version omits the masked attention mechanism because it processes only observed historical data, which does not include future information.

The emergence of attention mechanisms marks a pivotal innovation in deep learning, focusing computational effort in a way that mirrors attention in cognition. Vaswani et al. [117] revolutionised this approach by introducing the Transformer architecture, predicated on the exclusive use of self-attention mechanisms. The self-attention mechanism is defined as follows:

\[
\mathrm{Attention}(P, R, S) = \mathrm{softmax}\!\left(\frac{PR^{\top}}{\sqrt{m}}\right) S,
\tag{5}
\]

where P, R, S ∈ R^{N×m} correspond to the query, key, and value matrices derived from three separate linear transformations of the same input. The architecture of the self-attention mechanism is illustrated in Figure 7.

The self-attention mechanism has transformed the strategy of focusing on vital local content within the data. Vaswani et al. [117] expanded this idea by proposing multi-head attention, whereby several self-attention processes, or "heads", are executed in parallel, each assessing different projected versions of the queries, keys, and values. The combined outcomes of these heads are then linearly transformed to obtain the final output, as shown in Figure 7.
Figure 7: Architecture of Transformer model showing ...

3.2.5. Model training with the Adam optimiser

We utilise the modified Adam (adaptive moment estimation) optimiser [113], which is an extension of stochastic gradient descent [118, 119] and further extends adaptive gradient methods (AdaGrad [120], AdaDelta [121], and RMSProp [122]). Adam is an adaptive gradient-based optimisation algorithm that computes individual adaptive learning rates for different parameters from the history of the first and second moments (mean and variance) of the gradients. Let g_k ◦ g_k signify the element-wise square of g_k.

Input:
• Step size, α
• Exponential decay rates for the moment estimates, β_a, β_b ∈ [0, 1)
• Stochastic objective function with parameters, f(ξ)
• Initial parameter vector, ξ_0

Initialise:
m_0 ← 0 (initialise 1st moment vector)
v_0 ← 0 (initialise 2nd moment vector)
k ← 0 (initialise timestep)

Algorithm:
while ξ_k not converged do
    k ← k + 1
    g_k ← ∇_ξ f(ξ_{k-1})
    m_k ← β_a · m_{k-1} + (1 − β_a) · g_k
    v_k ← β_b · v_{k-1} + (1 − β_b) · g_k ◦ g_k
    m̂_k ← m_k / (1 − β_a^k)
    v̂_k ← v_k / (1 − β_b^k)
    ξ_k ← ξ_{k-1} − α · m̂_k / (√v̂_k + ϵ′)
end while
Output: resulting parameters ξ_k

Adam optimisation aims to minimise the expected value of a differentiable function, such as a neural network model f(ξ), with a set of parameters ξ representing the weights and biases. The algorithm updates the exponential moving averages of the gradient (m_k) and its square (v_k), with hyperparameters β_a, β_b controlling their decay rates. Adjustments in the algorithm improve efficiency, employing an updated computation for parameter adjustments with α_k = α √(1 − β_b^k) / (1 − β_a^k). We note that vector operations are performed element-wise.

The key to Adam's update mechanism is the adaptive step size, influenced by the signal-to-noise ratio m̂_k / √v̂_k, which dictates the magnitude of the parameter updates. This feature allows for effective scaling of steps in parameter space, contributing to the robustness and versatility of the algorithm in various optimisation contexts.
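The pseudocode above can be transcribed almost line by line; the following sketch applies the same update rule to a simple quadratic objective, which stands in for the network loss (the experiments themselves rely on the library implementation rather than this hand-written version).

```python
# One-to-one transcription of the Adam update loop above using NumPy;
# the quadratic objective is a stand-in for the network loss.
import numpy as np

def adam(grad_fn, xi, alpha=0.001, beta_a=0.9, beta_b=0.999, eps=1e-8, steps=1000):
    m = np.zeros_like(xi)                  # first moment estimate
    v = np.zeros_like(xi)                  # second moment estimate
    for k in range(1, steps + 1):
        g = grad_fn(xi)
        m = beta_a * m + (1 - beta_a) * g
        v = beta_b * v + (1 - beta_b) * g * g
        m_hat = m / (1 - beta_a ** k)      # bias-corrected first moment
        v_hat = v / (1 - beta_b ** k)      # bias-corrected second moment
        xi = xi - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return xi

# Moves towards the minimiser at w = 3 of the objective (w - 3)^2.
print(adam(lambda w: 2 * (w - 3.0), np.array([0.0]), alpha=0.05, steps=500))
```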
3.3. Data

We choose four different cryptocurrencies to evaluate the performance of the respective statistical and deep learning models. The cryptocurrencies include Bitcoin, Ethereum, Dogecoin and Litecoin. We focus on multi-step-ahead price forecasting, where a step is defined by a day. Bitcoin is the first and most prominent cryptocurrency, launched in 2009 by Satoshi Nakamoto [8]. Ethereum was designed in 2013 by Vitalik Buterin and Gavin Wood [123]; Ethereum is not just a cryptocurrency but also a platform for building decentralised applications using smart contracts. After Bitcoin, Ethereum is the cryptocurrency with the second-largest market capitalisation. Dogecoin, another open-source cryptocurrency based on the popular "doge" internet meme, grew in popularity and price in 2021 after billionaire Elon Musk publicly backed it. Litecoin, created by Charlie Lee in 2011, is based on Bitcoin's protocol but differs in terms of the algorithm used; Litecoin uses scrypt encryption, proposed by Colin Percival [124].

Due to the incompleteness of data sources, we combine data from two websites, namely Yahoo Finance² and Kaggle [125], with fundamental details summarised in Table 5. The datasets feature the historical price information of the four cryptocurrencies, with the begin and end dates and the number of data points shown in Table 5. We forecast the closing price of each cryptocurrency using univariate and multivariate deep learning models. According to Wang et al. [22], in the multivariate model it is feasible to incorporate features of the cryptocurrency price, such as the open, high, low and volume, to enhance forecasting accuracy. Hence, we used these features in our multivariate models, as shown in Table 6. We also add gold prices as an additional feature to the multivariate model, as noted by Huynh et al. [26], who found a strong correlation between cryptocurrencies and the gold market. We obtained the Gold price data from the London bullion market (LBMA) for 31 December 2012 to 28 February 2022, collected from Factset³.

3.4. Data processing

We need to reconstruct the original time series data for multi-step-ahead prediction using deep learning models. The embedding theorem of Takens states that the reconstruction process can replicate significant characteristics of the initial time series [126]. Given an observed time series x(t), we can generate an embedded phase space Y(t) = [x(t), x(t − T), ..., x(t − (D − 1)T)], where T is the time delay, D is the embedding dimension with t = 0, 1, 2, ..., N − D − 1, and N is the original length of the time series. Takens' theorem demonstrates that if the original attractor had a dimension of d, then an embedding dimension of D = 2d + 1 would be enough. In the univariate prediction method, we divide a single time series dataset into several sections. The input characteristics for each segment are data from a window of subsequent time points, and the output label(s) is the time point(s) that comes afterwards. Therefore, we can have single-step prediction or multi-step-ahead prediction. In the multivariate strategy, as input vectors, we utilise a window that holds multiple time series at sequential time points, as shown for the case of Bitcoin price prediction in Figure 8. The input data consists of multiple time series, such as the high price and close price of Bitcoin and the price of Gold, to provide a multi-step prediction of the Bitcoin close price.

3.5. Framework

We present our framework in Figure 9, which highlights the major components of the entire process. In Step 1, we extract data and pre-process data for analysis. Since the Gold price data only has values on trading days (Monday to Friday), we use interpolation methods to fill in prices on non-trading days (Saturday, Sunday and public holidays). The interpolation we used is a linear method⁴, which fills in missing values with the average of both sides. This is commonly used to handle missing values in time series data [127].

The data pre-processing in Step 2 features data scaling to ensure model stability, which limits the price within the range defined by a min-max scaler⁵. Furthermore, we separate the data into two sets for comparing the influence of COVID-19 and also determine the train-test data split, which is based on a given timeline (not shuffled). Our goal is to predict the closing price of the respective cryptocurrencies; therefore we have univariate data with the close price, and multivariate data with the close price, gold price, high price, low price and opening price, as shown in Table 6. We split the dataset into a 70:30 ratio. We use the data for the selected cryptocurrency from its opening to June 2021 as Dataset 1, where the last 30% of the data includes the period following the beginning of COVID-19 (March 2020 to June 2021), during which high volatility in crypto prices was evident. We know that cryptocurrency is a financial asset with highly volatile prices [15]; hence we explore which model is more effective in predicting the rising trend following the breakout of COVID-19. Dataset 2 features the COVID-19 period in both the train and test datasets (March 2020 to April 2024) to ensure that high-volatility crypto price data is part of the training dataset. Table 7 presents the dates for the train and test datasets for both experiments.

In Step 3, we select the optimal hyperparameters for each model using trial runs, knowledge from runs of the same models in the literature (e.g. [40]), and the default values in the library implementation (e.g. PyTorch [128]). We note that we use RMSE as the accuracy measure for all the models in our framework. Once the best parameters have been determined, we can then continue with our investigations that compare univariate and multivariate deep learning models.
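The windowing just described amounts to slicing the (scaled) series into overlapping input windows and multi-step output windows; a minimal sketch with assumed window and horizon sizes is given below.

```python
# Sliding-window reconstruction of a univariate series into (input window,
# multi-step output) pairs; window and horizon sizes are illustrative.
import numpy as np

def make_windows(series, window=6, horizon=5):
    X, y = [], []
    for t in range(len(series) - window - horizon + 1):
        X.append(series[t : t + window])                     # past observations
        y.append(series[t + window : t + window + horizon])  # next `horizon` values
    return np.array(X), np.array(y)

close = np.arange(20, dtype=float)          # stand-in for a scaled close-price series
X, y = make_windows(close)
print(X.shape, y.shape)                     # (10, 6) (10, 5)
```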
2 https://finance.yahoo.com/lookup
3 https://www.factset.com/
4 https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.interpolate.html
5 https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html
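A short sketch of the Step 1 and Step 2 pre-processing (linear interpolation of the gold series over non-trading days, followed by min-max scaling), using the pandas and scikit-learn utilities referenced in the footnotes, is given below; the file names and column labels are hypothetical.

```python
# Step 1-2 pre-processing sketch: reindex gold prices to a daily calendar,
# linearly interpolate non-trading days, and min-max scale the features.
# File names and column labels are hypothetical.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

gold = pd.read_csv("gold.csv", parse_dates=["Date"], index_col="Date")["Gold"]
gold = gold.asfreq("D").interpolate(method="linear")      # fill weekends/holidays

btc = pd.read_csv("btc.csv", parse_dates=["Date"], index_col="Date")
features = btc[["Open", "High", "Low", "Close"]].join(gold, how="left")

scaler = MinMaxScaler()                                    # scales each column to [0, 1]
scaled = scaler.fit_transform(features.dropna())
print(scaled.min(), scaled.max())
```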

Cryptocurrency Period (Day/Month/Year) Size Mean: Close price (USD) Variance: Close price
Bitcoin 29/04/2013-01/04/2024 3991 13692 2.856 × 108
Ethereum 08/08/2015-01/04/2024 3160 983.53 1.274 × 106
Dogecoin 16/12/2013-01/04/2024 3760 0.0406 5.947 × 10−3
Litecoin 29/04/2013-01/04/2024 3991 60.208 3.884 × 103

Table 5: Statistical description of the dataset, where the size refers to the number of data points (days).

Figure 8: Data windowing in multivariate approach with Bitcoin close-price forecasting, showing the size of input window (time series window) and output window
(prediction horizon). The input data consists of multiple time series, such as the Bitcoin (BTC) high-price and close price and the stock price of Gold, to provide
multistep ahead prediction of BTC close price.

Variable   Variable Description                   Data type
SNo        The order of the data                  Number
Name       Name of cryptocurrency                 Letter
Symbol     Abbreviation of cryptocurrency         Letter
Date       Date of observation                    Date
High       Highest price on given day             Number
Low        Lowest price on given day              Number
Open       Opening price on given day             Number
Close      Closing price on given day             Number
Volume     Volume of transactions on given day    Number

Table 6: Dataset for multivariate models.

Crypto     Experiment   Split   Period (Day/Month/Year)
Bitcoin    Dataset 1    Train   29/04/2013-27/12/2018
                        Test    28/12/2018-01/06/2021
           Dataset 2    Train   01/03/2020-09/01/2023
                        Test    10/01/2023-01/04/2024
Ethereum   Dataset 1    Train   08/08/2015-02/09/2019
                        Test    03/09/2019-01/06/2021
           Dataset 2    Train   01/03/2020-09/01/2023
                        Test    10/01/2023-01/04/2024
Dogecoin   Dataset 1    Train   16/12/2013-06/03/2019
                        Test    07/03/2019-01/06/2021
           Dataset 2    Train   01/03/2020-09/01/2023
                        Test    10/01/2023-01/04/2024
Litecoin   Dataset 1    Train   29/04/2013-27/12/2018
                        Test    28/12/2018-01/06/2021
           Dataset 2    Train   01/03/2020-09/01/2023
                        Test    10/01/2023-01/04/2024

Table 7: Time span of training and test dataset in different experiments for each cryptocurrency.

In Step 4, we provide data analysis by first implementing a volatility analysis of the close price of the selected cryptocurrencies, which can reveal fluctuations and patterns throughout the selected period. Historically, cryptocurrencies featured a wide price range, with major fluctuations across the COVID-19 period [129, 130]. We use the volatility analysis to review these fluctuations, since they can lead to instability during the model training process, causing slower convergence and poor generalisation ability on the test dataset. In Step 4, taking into account our multivariate models, we also provide a feature correlation analysis to find out how the different features affect each other.

We next compare the respective models in Step 5 with Experiment 1 (pre-COVID-19 training data) and select the two best-performing models for the next step. We develop and compare the multivariate and univariate deep learning models, including LSTM, BD-LSTM, ED-LSTM, CNN, Conv-LSTM, and Transformer. These models predict the close price of the selected cryptocurrencies. We use MLP and ARIMA models as baseline models for the Bitcoin dataset.

In Step 6, we use Dataset 2 for Experiment 2 to predict the close price during COVID-19 using training data that features the COVID-19 effect on the cryptocurrencies. We do this to determine whether the prediction accuracy has improved and hence compare the results with Experiment 1. We also
incorporate the shuffle data-splitting strategy to enhance model performance; the initial 70% of the data is randomly rearranged using the shuffle.

3.6. Technical details

In order to distinguish the performance of the models, we use the RMSE as the criterion across the different prediction horizons. The smaller the RMSE value, the better the prediction accuracy:

RMSE = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 }    (6)

where y_i and ŷ_i are the observed and predicted data, respectively, and N is the length of the observed data.
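As a concrete illustration of Equation (6), the snippet below computes the RMSE separately for each prediction horizon from arrays of observed and predicted values. It is a minimal sketch; the array shapes are assumptions matching the 5-step-ahead setup described above.

```python
import numpy as np

def rmse_per_horizon(y_true, y_pred):
    """RMSE for each prediction step.

    y_true, y_pred: arrays of shape (N, horizon), one row per test window.
    Returns an array of `horizon` RMSE values plus their mean.
    """
    squared_errors = (y_true - y_pred) ** 2
    rmse = np.sqrt(squared_errors.mean(axis=0))   # average over the N test windows
    return rmse, rmse.mean()

# Example with dummy 5-step-ahead predictions
y_true = np.random.rand(100, 5)
y_pred = y_true + 0.02 * np.random.randn(100, 5)
per_step, overall = rmse_per_horizon(y_true, y_pred)
print(per_step, overall)
```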
As indicated earlier, we use the Adam optimiser for all the deep learning models, with default values for its hyperparameters, i.e. α = 0.001, β1 = 0.9, β2 = 0.999, and ϵ = 1e−8.
In the case of the Transformer and ARIMA models, we reviewed the literature [39, 131, 70] to obtain the hyperparameters. Table 9 describes the details of the model hyperparameters, including the number of input layers, output layers, hidden layers and other hyperparameters. We use the ReLU activation function in the respective deep learning models with a maximum training time of 200 epochs via the Adam optimiser [113].
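The sketch below illustrates the training configuration described above (ReLU activations, a maximum of 200 epochs, and the Adam optimiser with the quoted default hyperparameters) on a small stand-in network; it is only an illustrative Keras example under these assumptions, not the exact model code used for the experiments.

```python
import tensorflow as tf

# Stand-in network: one hidden layer with ReLU, 5-step-ahead output
model = tf.keras.Sequential([
    tf.keras.Input(shape=(6,)),            # window of 6 past (scaled) close prices
    tf.keras.layers.Dense(50, activation="relu"),
    tf.keras.layers.Dense(5),              # 5 prediction horizons
])

# Adam with the default hyperparameters quoted above
optimizer = tf.keras.optimizers.Adam(
    learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-8)
model.compile(optimizer=optimizer, loss="mse")

# Training would then run for at most 200 epochs, e.g.:
# model.fit(X_train, y_train, epochs=200, batch_size=32, verbose=0)
```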
We implemented specific experiments to determine the appropriate hyperparameters for each model. Based on related models in the literature [40], we used model architectures where the CNN and LSTM variants feature one hidden layer with a selected number of hidden units. We refer to previous research on the MLP model for time series prediction [132] and evaluate performance for selected numbers of hidden neurons, as shown in Table 8. We use the first 70% of the Bitcoin close-price in Dataset 1 for training and the remaining for testing. We repeated model training with different initial parameters for each hyperparameter configuration 5 times and reported the average. Table 8 presents the performance (RMSE) of each model on the train and test datasets for the respective hyperparameters, with the best values in bold.

Model       Hidden   Train    Test
LSTM        20       0.0194   0.0696
            50       0.0176   0.0490
            100      0.0165   0.0370
BD-LSTM     20       0.0195   0.0239
            50       0.0177   0.0210
            100      0.0170   0.0204
ED-LSTM     20       0.0192   0.0930
            50       0.0169   0.0609
            100      0.0164   0.0373
Conv-LSTM   20       0.0124   0.0176
            50       0.0123   0.0181
            100      0.0132   0.0194
CNN         20       0.0126   0.0206
            50       0.0128   0.0209
            100      0.0135   0.0235
MLP         5        0.0137   0.0278
            10       0.0130   0.0268
            20       0.0122   0.0195

Table 8: Hyperparameter selection for the number of hidden units for the selected models using RMSE for the train and test dataset (LSTM, BD-LSTM, ED-LSTM, Conv-LSTM, CNN, and MLP).

4. Results

In this section, we provide comprehensive information about the datasets and present the research design with computational results.

4.1. Data analysis

The coronavirus disease 2019 (COVID-19) pandemic [133] originated on 17th November 2019 in Wuhan, China, and began spreading extensively worldwide from March 2020 [134]. COVID-19 had a devastating effect on the world economy, and its impact included finance, supply chains, politics, and mental health [135], with further effects in the post-pandemic era [136, 137, 138]. Therefore, it is necessary to analyze the price trends of the four cryptocurrencies in our study.

We investigate the trends for the four selected cryptocurrencies over the given period covering COVID-19. We cover all the phases of COVID-19, including its initiation, spread, and decline. Figure 10 presents the close price of Bitcoin, Ethereum, Dogecoin and Litecoin across the selected period, with the shaded region (pink) indicating COVID-19. We can observe that the closing price of each cryptocurrency exhibited large fluctuations within the shaded area. Litecoin experienced significant volatility before the beginning of COVID-19, while the price fluctuations of the other three cryptocurrencies (Bitcoin, Dogecoin and Ethereum) before COVID-19 were not significant. This demonstrates that after COVID-19, cryptocurrency prices became more volatile than before. We observe that the Ethereum trend is highly correlated with Bitcoin before and during COVID-19. In the case of Bitcoin and Ethereum, there is a significant price increase from 2020 to 2022, which was subsequently followed by a decrease and another increase that recovered the price. Next, we present the monthly volatility plot in Figure 11, where we observe that Ethereum and Litecoin volatility generally lies below 10% during COVID-19 (highlighted in pink). Bitcoin monthly volatility stays below 6% during the same time; however, Dogecoin presents a different trend during COVID-19: its monthly volatility reached above 20% in January and May 2021, while in other months it remained consistently around 15%. Our analysis reveals that the volatility patterns of the four cryptocurrencies indicate a significant decrease in volatility in the month following periods of high volatility. The monthly volatility during COVID-19 is generally similar to the monthly volatility prior to the pandemic (2018 onwards). Although the monthly volatility does not change significantly, the daily close price fluctuates significantly across the entire period.
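A simple way to reproduce the monthly volatility analysis described above is sketched below, assuming volatility is measured as the standard deviation of daily returns within each calendar month; the column name and this particular volatility definition are our assumptions, since the exact formula is not spelled out in the text.

```python
import pandas as pd

def monthly_volatility(df):
    """Standard deviation of daily returns, grouped by calendar month.

    df: DataFrame with a DatetimeIndex and a 'Close' column.
    Returns a Series indexed by month, expressed as a percentage.
    """
    daily_returns = df["Close"].pct_change()
    return daily_returns.groupby(df.index.to_period("M")).std() * 100

# Illustrative usage with synthetic data standing in for e.g. the Bitcoin close price
idx = pd.date_range("2019-01-01", "2021-12-31", freq="D")
prices = pd.DataFrame({"Close": 10000 + pd.Series(range(len(idx)), index=idx)}, index=idx)
print(monthly_volatility(prices).head())
```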
Figure 9: Framework of evaluation and details include data collection, pre-experiment analysis, data pre-processing, hyperparameter selection and design of
experiments.

Model        Input layers   Hidden layers   Output layers   Comments
MLP          (6,1)          3               (1,5)           Include three hidden layers.
ARIMA        -              -               -               Construct ARIMA(1,0,1) model.
LSTM         (6,1)          2               (1,5)           Include two LSTM layers.
BD-LSTM      (6,1)          2               (1,5)           Include Forward & Backward LSTM layer.
ED-LSTM      (6,1)          4               (1,5)           Two LSTM networks with a time distributed layer.
Conv-LSTM    (6,1)          3               (1,5)           Include Conv1D layer, LSTM network and dense layer.
CNN          (6,1)          4               (1,5)           Include Conv1D layer, pooling layer and two dense layers.
Transformer  (6,1)          2               (1,5)           Include a Multi-Head Attention mechanism and a position-wise fully connected feed-forward network.

Table 9: Hyperparameter configuration of all respective models.
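As an illustration of how the shapes in Table 9 translate into a model, the sketch below builds a bidirectional LSTM that maps an input window of 6 time steps (one feature in the univariate case) to 5 prediction horizons. The 100 hidden units follow the best-performing value for BD-LSTM in Table 8; the rest is an illustrative Keras sketch rather than the exact architecture used in the experiments.

```python
import tensorflow as tf

def build_bd_lstm(input_window=6, n_features=1, horizon=5, hidden=100):
    """Bidirectional LSTM: (input_window, n_features) -> horizon outputs."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_window, n_features)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(hidden)),  # forward & backward pass
        tf.keras.layers.Dense(horizon),                               # one output per prediction step
    ])

model = build_bd_lstm()
model.compile(optimizer="adam", loss="mse")
model.summary()
```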

(a) Bitcoin (b) Ethereum

(c) Dogecoin (d) Litecoin

Figure 10: Time-series of cryptocurrency close-price highlighting COVID-19 trend in pink.

Since we will develop a multivariate model, we also need to analyse how the different features of the cryptocurrency (low, high, open, and close price) are correlated with the Gold price. Figure 12 shows the correlations between the features of the multivariate model in each cryptocurrency using Pearson correlation. We observe that the close-price is highly correlated with the low-price, high-price and open-price. We observe that there is a lower correlation between Gold and the other features; however, we will use Gold in our multivariate model as data that is outside the crypto ecosystem, but linked to it. We also find that the Gold price has the highest correlation with Bitcoin, followed by Ethereum and Litecoin, and the least with Dogecoin. Figure 13 presents the Pearson correlation for the respective features, including the close, high, low and opening price for a given cryptocurrency together with the Gold price and the most correlated other cryptocurrency (using Figure 12), which is Ethereum in the case of Bitcoin, i.e. Figure 13, Panel (a). We will use this for the multivariate prediction strategy, with data processing as shown in Figure 8.
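The correlation analysis of Figures 12 and 13 can be reproduced along the following lines with pandas, which computes Pearson correlation coefficients by default; the column names and the synthetic stand-in data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a merged dataset: High, Low, Open, Close of one coin plus the Gold price
rng = np.random.default_rng(0)
close = np.cumsum(rng.normal(size=500)) + 100
df = pd.DataFrame({
    "High": close + rng.random(500),
    "Low": close - rng.random(500),
    "Open": close + rng.normal(scale=0.5, size=500),
    "Close": close,
    "Gold": 0.3 * close + rng.normal(scale=5, size=500),
})

corr = df.corr(method="pearson")                   # Pearson correlation matrix, as in Figures 12 and 13
print(corr["Gold"].sort_values(ascending=False))   # which features track Gold most closely
```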
4.2. Results: pre-COVID-19

We next implement the investigations outlined in Step 4 (Experiment 1) of our framework (Figure 9), where we compare the selected deep learning models and the univariate and multivariate strategies using the pre-COVID-19 training dataset. Note that our test dataset includes the first phase of COVID-19 (Table 3).

We present the results for each prediction horizon (step) obtained from 30 independent experimental runs (mean RMSE and 95% confidence interval) that feature model training using different initial weights and biases. We note that robustness is the degree of confidence in a forecast, which is indicated by a low confidence interval. Moreover, scalability refers to the capacity to maintain constant performance as the prediction horizon expands. Our main focus is the performance (RMSE) on the test dataset, both in terms of the mean of the 5 prediction horizons and the individual prediction horizons. Therefore, in the rest of the discussion, we focus on the test dataset.
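The mean RMSE and 95% confidence interval over the 30 independent runs can be computed as sketched below; the use of a normal approximation (z = 1.96) for the interval is our assumption, as the exact formula is not stated in the text.

```python
import numpy as np

def mean_and_ci(rmse_runs, z=1.96):
    """Mean and 95% confidence-interval half-width over independent runs.

    rmse_runs: array of shape (n_runs,) with one RMSE value per run.
    """
    rmse_runs = np.asarray(rmse_runs)
    mean = rmse_runs.mean()
    half_width = z * rmse_runs.std(ddof=1) / np.sqrt(len(rmse_runs))
    return mean, half_width

runs = np.random.normal(loc=0.033, scale=0.002, size=30)   # e.g. 30 runs of one model
mean, ci = mean_and_ci(runs)
print(f"{mean:.4f} ± {ci:.4f}")
```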
(a) Bitcoin (b) Ethereum

(c) Dogecoin (d) Litecoin

Figure 11: Monthly volatility of cryptocurrency highlighting COVID-19 trend in pink.



We first use Bitcoin data to evaluate conventional models
(MLP and ARIMA) when compared to deep learning mod-
els (LSTM, ED-LSTM, BD-LSTM, CNN, Conv-LSTM, Trans-
former), for the univariate (Figure 14) and multivariate strategies (Figure 15). The results show that MLP and ARIMA perform
worse than the deep learning models. MLP exhibits a lack of
robustness, and ARIMA model struggles in test prediction ac-
curacy when compared to the deep learning models. We note
that ARIMA does the best on the train dataset due to over-
training and struggles in generalisation ability. The deep learn-
ing model results are consistent with the finding by Chandra et
al. [40] where the prediction accuracy of deep learning mod-
els is better than conventional machine learning models for
multistep ahead time-series forecasting. The prediction per-
formance of each model shows a trend where the best Mul-
tivariate strategy (ED-LSTM) provides consistent accuracy as
the prediction horizon changes when compared to the Univari-
ate strategy (BD-LSTM). In Figure 15, the Multivariate strategy
shows that Conv-LSTM provides the lowest prediction accu-
racy, while ED-LSTM and BD-LSTM models provide the most
accurate predictions. In Figure 14, contrary to the results of
the Multivariate strategy, the most robust Univariate model for
predicting Bitcoin is Conv-LSTM.

Figure 12: Correlation coefficients between Gold stock price and closing price of four cryptocurrencies.
Figure 16 presents the results for Ethereum using the Uni-
variate strategy, where we observe that LSTM provides the best
test performance, followed by BD-LSTM. Figure 17 provides
(a) Correlation coefficients between variables in multivariable models in Bitcoin    (b) Correlation coefficients between variables in multivariable models in Ethereum

(c) Correlation coefficients between variables in multivariable models in Dogecoin    (d) Correlation coefficients between variables in multivariable models in Litecoin

Figure 13: Heatmap of Pearson correlation coefficients for the multivariate models.

(a) Univariate model train and test data (mean of 5 prediction horizons) (b) 5 step-ahead prediction for test dataset

Figure 14: Bitcoin: performance evaluation of respective univariate methods (RMSE mean with 95% confidence interval for 30 experimental runs).

(a) Multivariate model train and test data (mean of 5 prediction horizons) (b) 5 step-ahead prediction for the test dataset

Figure 15: BTC: performance evaluation of respective multivariate methods (RMSE mean with 95% confidence interval for 30 experimental runs).

the results for the Multivariate strategy, where CNN provides the best performance, followed by Conv-LSTM, while the Transformer model provides the worst performance. In comparison to the Univariate strategy, we notice that the Multivariate strategy provides much better test accuracy, which is also more robust and scalable, i.e. higher prediction horizons maintain better accuracy. Furthermore, we note that the Conv-LSTM provides the worst performance in the Univariate case, but one of the best in the Multivariate strategy.

In the case of Dogecoin, Figures 18 and 19 reveal that BD-LSTM exhibits the best accuracy, both for the Univariate and the Multivariate strategies, and also provides similar stability for higher prediction horizons. This could be due to the price and volatility trends in Figures 10 and 11 (Panels c), where we notice that Dogecoin has a similar trend pre-COVID-19 and during the first phase of COVID-19, which make up Dataset 1 used for these experiments. Furthermore, we also note that in Figure 13 (Panel c), Dogecoin is least correlated with the Gold stock price, which is the major factor making a difference in the multivariate model. We notice that CNN provides the worst accuracy, followed by the Transformer model, in both strategies.

Finally, we present the results for Litecoin for both the Univariate and Multivariate strategies. Figures 20 and 21 show the respective results, where we find that Conv-LSTM and BD-LSTM show the best performance for the Univariate strategy, whereas LSTM, ED-LSTM and Conv-LSTM provide the best performances for the Multivariate strategy. On the contrary, the CNN provides the worst performance for the Multivariate strategy, much higher in magnitude when compared to the rest of the models. We also notice that the Multivariate strategy provides better stability as the prediction horizon increases when compared to the Univariate strategy, while the Univariate strategy provides much better accuracy for the best models when compared to the Multivariate strategy. We summarise the results further in Table 10, which features the model prediction accuracy on the test dataset. We report the RMSE mean and
95% confidence interval for the four cryptocurrencies, and the best models for the different steps are highlighted in bold. The test mean provides the average of the five steps. It is clear that the Univariate models are better than the Multivariate models; however, we find that the accuracy of both strategies is close (test mean) for Bitcoin and Dogecoin.

4.3. Results: Data featuring COVID-19

The previous section presented the results given by the respective models using data from before COVID-19. We found that the Univariate strategy was better than the Multivariate strategy (Table 10); therefore, we only used the Univariate strategy for Experiment 2 (during COVID-19) and present the results in Table 11. We find that, compared with the results of Experiment 1, the prediction accuracy for Bitcoin, Ethereum, and Dogecoin has improved to a certain extent. The greatest improvement in prediction accuracy is for forecasting the close price of Dogecoin: after training using data from the COVID-19 period, the prediction performance for Dogecoin is close to that of Bitcoin and Ethereum. Nevertheless, the forecast precision for Litecoin decreased. Additionally, we find that the robustness of the models improved after training with data from the COVID-19 period; the confidence intervals for all predicted horizons of Bitcoin and Ethereum are kept within ±0.0007. In the case of Dogecoin and Litecoin, the robustness of the models in 1-step ahead prediction has generally improved.

5. Discussion

In this section, we provide a discussion based on the results, taking into consideration the model architecture as well as the characteristics of the data. In summary, we evaluated the predictive performance of all models and presented the results in Figures 14 to 21. We also provide a ranking of the performance accuracy for each type of model in Tables 12 and 13.

We first review the results of the first experiment, which investigated model performance with pre-COVID-19 training data. Our results show that LSTM, BD-LSTM and ED-LSTM provide outstanding predictive performance across the four cryptocurrencies and the two approaches for selecting model variables. The CNN, Conv-LSTM, and Transformer models show good performance only under particular conditions. We also note that for all the cryptocurrencies, the Univariate model outperformed the Multivariate model, although in some cases (Bitcoin and Dogecoin) the accuracy was close when comparing both strategies. We found that the models with high forecast accuracy were mostly accompanied by narrower confidence intervals; on the contrary, higher RMSE values usually came with lower robustness. In other words, models with better prediction accuracy (lower RMSE) provide more robust performance given different model weight initialisations in independent experimental runs. The accuracy generally declines as the prediction horizon increases, which is natural for multistep ahead problems (Figure 14b): the prediction is derived from the current values, and the information gap expands as the prediction step increases. This is because our task is defined as a direct approach for forecasting multiple steps, rather than an iterated prediction strategy. We observe the changes across the prediction horizons of each model and find that CNN and Conv-LSTM are significantly worse than the other models. The forecast accuracy of these two models frequently declines more rapidly than that of the other models, and occasionally even shows volatility (Figure 18b) across Steps 1 to 5. We found that the CNN-related models using the convolution operation provided lower accuracy than the other models in predicting cryptocurrency prices; we analyse the cause of this issue later. Among the predictions for the four currencies in Experiment 1 (Table 10), Dogecoin has the worst prediction performance, with RMSE values significantly higher than those of the other three cryptocurrencies. We believe this is due to the particularity of the Dogecoin price, which leads to large prediction errors: the first 70% of the data fluctuates smoothly, while the last 30% fluctuates violently.

In the second experiment, we evaluate the accuracy of the deep learning models in predicting cryptocurrency prices during COVID-19. We utilize the two models that predicted best in the previous experiments to forecast with the new dataset, which includes all close prices from the start of COVID-19 to April 2024. We find that the close-price forecasts for Bitcoin, Dogecoin, and Ethereum have all improved. Also, as the prediction horizon rises, the prediction accuracy of the models deteriorates more slowly. We claim that there are two causes contributing to the decrease in the accuracy of forecasting Litecoin. The first is that, because our evaluation criterion is the overall performance of the model, we did not choose Conv-LSTM, which had the best performance in predicting Litecoin in the previous experiment; Conv-LSTM may be more suitable for predicting Litecoin. The second reason is that Litecoin had violent price ups and downs before COVID-19, and due to the design of our experiment, the data from this period was not included in the training set.

Next, we aim to investigate what might be contributing to the lower accuracy of multivariate models compared to univariate models. In our analysis (Fig or Table xxx), we notice that the price of cryptocurrencies is extremely unstable and is greatly influenced by several variables outside or inside the market [10]. Simply inserting additional factors will not only be ineffective in helping the model forecast accurately, but may also mislead the model into learning irrelevant data features.

Our analysis of volatility (Figures 10 and 11) shows the high degree of volatility exhibited by cryptocurrencies throughout the COVID-19 pandemic. Through a comparative analysis of the outcomes obtained from the first and second experiments, we observed that using high-volatility data as the training set provides better prediction accuracy for the model. The robustness and scalability of the models are also improved.
Data Strategy Model Step 1 Step 2 Step 3 Step 4 Step 5 Test Mean
BTC Univariate CNN 0.0380±0.0015 0.0451±0.0017 0.0476±0.0013 0.0537±0.0015 0.0616±0.0021 0.0492±0.0016
LSTM 0.0223±0.0013 0.0337±0.0020 0.0425±0.0025 0.0496±0.0026 0.0515±0.0022 0.0399±0.0021
ED-LSTM 0.0250±0.0014 0.0363±0.0029 0.0421±0.0030 0.0448±0.0028 0.0477±0.0025 0.0392±0.0025
BD-LSTM 0.0196±0.0008 0.0296±0.0013 0.0333±0.0014 0.0385±0.0012 0.0424±0.0015 0.0327±0.0012
Conv-LSTM 0.0244±0.0012 0.0302±0.0011 0.0381±0.0015 0.0434±0.0020 0.0468±0.0023 0.0366±0.0016
Transformer 0.0360±0.0041 0.0431±0.0038 0.0500±0.0036 0.0563±0.0037 0.0617±0.0039 0.0494±0.0038
Multivariate CNN 0.0416±0.0011 0.0477±0.0016 0.0534±0.0018 0.0575±0.0019 0.0606±0.0021 0.0522±0.0017
LSTM 0.0310±0.0018 0.0395±0.0029 0.0449±0.0036 0.0458±0.0031 0.0467±0.0022 0.0416±0.0027
ED-LSTM 0.0290±0.0022 0.0356±0.0017 0.0384±0.0015 0.0405±0.0013 0.0430±0.0014 0.0373±0.0016
BD-LSTM 0.0247±0.0018 0.0336±0.0022 0.0415±0.0026 0.0462±0.0026 0.0511±0.0026 0.0394±0.0024
Conv-LSTM 0.0500±0.0037 0.0466±0.0019 0.0678±0.0095 0.0886±0.0144 0.1327±0.0266 0.0771±0.0112
Transformer 0.0382±0.0026 0.0418±0.0024 0.0459±0.0021 0.0501±0.0020 0.0526±0.0022 0.0457±0.0023
ETH Univariate CNN 0.0384±0.0012 0.0453±0.0017 0.0478±0.0026 0.0521±0.0030 0.0586±0.0035 0.0484±0.0024
LSTM 0.0280±0.0013 0.0341±0.0012 0.0372±0.0011 0.0441±0.0016 0.0470±0.0017 0.0381±0.0014
ED-LSTM 0.0261±0.0012 0.0373±0.0019 0.0405±0.0018 0.0456±0.0013 0.0471±0.0012 0.0393±0.0015
BD-LSTM 0.0265±0.0012 0.0351±0.0019 0.0401±0.0019 0.0423±0.0022 0.0498±0.0025 0.0388±0.0019
Conv-LSTM 0.0327±0.0020 0.0410±0.0021 0.0497±0.0030 0.0609±0.0048 0.0751±0.0063 0.0519±0.0036
Transformer 0.0337±0.0035 0.0412±0.0042 0.0449±0.0039 0.0521±0.0032 0.0535±0.0036 0.0451±0.0037
Multivariate CNN 0.1007±0.0038 0.1385±0.0261 0.1564±0.0444 0.1147±0.0034 0.1276±0.0023 0.1276±0.0160
LSTM 0.1913±0.0156 0.1868±0.0190 0.1993±0.0178 0.1887±0.0180 0.1767±0.0224 0.1886±0.0186
ED-LSTM 0.1815±0.0362 0.1955±0.0423 0.1862±0.0441 0.1881±0.0459 0.1913±0.0476 0.1885±0.0432
BD-LSTM 0.1033±0.0159 0.1788±0.0354 0.1799±0.0314 0.1741±0.0285 0.1960±0.0302 0.1664±0.0283
Conv-LSTM 0.0799±0.0243 0.1011±0.0247 0.1185±0.0422 0.1786±0.0526 0.1950±0.0517 0.1346±0.0391
Transformer 0.2464±0.0141 0.2485±0.0136 0.2590±0.0150 0.2508±0.0147 0.2470±0.0153 0.2503±0.0145
DOGE Univariate CNN 0.1714±0.0490 0.2620±0.0831 0.2786±0.1056 0.4862±0.0956 0.5171±0.0968 0.3431±0.0860
LSTM 0.1492±0.0122 0.1523±0.0134 0.1577±0.0123 0.1637±0.0127 0.1681±0.0133 0.1582±0.0128
ED-LSTM 0.1386±0.0136 0.1419±0.0129 0.1401±0.0127 0.1422±0.0121 0.1428±0.0118 0.1411±0.0126
BD-LSTM 0.0509±0.0014 0.0562±0.0017 0.0722±0.0025 0.0648±0.0021 0.0645±0.0020 0.0617±0.0019
Conv-LSTM 0.0590±0.0198 0.1554±0.0913 0.1430±0.0914 0.1456±0.0893 0.1413±0.0830 0.1289±0.0750
Transformer 0.2116±0.0361 0.2228±0.0517 0.2435±0.0601 0.2219±0.0503 0.2188±0.0461 0.2237±0.0489
Multivariate CNN 0.8122±0.0212 0.6364±0.0792 0.6725±0.0700 0.6656±0.0840 0.5972±0.0871 0.6768±0.0683
LSTM 0.1706±0.0087 0.1746±0.0074 0.1828±0.0083 0.1810±0.0083 0.1825±0.0095 0.1783±0.0084
ED-LSTM 0.1829±0.0211 0.1823±0.0205 0.1816±0.0200 0.1821±0.0196 0.1823±0.0192 0.1822±0.0201
BD-LSTM 0.0616±0.0065 0.0603±0.0051 0.0619±0.0039 0.0653±0.0032 0.0720±0.0042 0.0642±0.0046
Conv-LSTM 0.2103±0.0975 0.1962±0.0897 0.1912±0.0859 0.2347±0.1085 0.2160±0.1012 0.2097±0.0966
Transformer 0.2472±0.0200 0.2482±0.0217 0.2501±0.0211 0.2345±0.0215 0.2332±0.0203 0.2426±0.0209
LTC Univariate CNN 0.0390±0.0008 0.0470±0.0015 0.0627±0.0033 0.0948±0.0128 0.1131±0.0170 0.0713±0.0071
LSTM 0.0382±0.0019 0.0479±0.0024 0.0619±0.0031 0.0768±0.0038 0.0861±0.0038 0.0622±0.0030
ED-LSTM 0.0369±0.0028 0.0487±0.0025 0.0631±0.0036 0.0756±0.0058 0.0838±0.0066 0.0616±0.0043
BD-LSTM 0.0318±0.0019 0.0401±0.0019 0.0507±0.0029 0.0588±0.0039 0.0699±0.0034 0.0503±0.0028
Conv-LSTM 0.0265±0.0012 0.0362±0.0017 0.0443±0.0015 0.0513±0.0019 0.0581±0.0017 0.0433±0.0016
Transformer 0.0726±0.0035 0.0712±0.0041 0.0746±0.0046 0.0818±0.0038 0.0891±0.0032 0.0779±0.0038
Multivariate CNN 0.4648±0.0468 0.4553±0.0444 0.4810±0.0445 0.5086±0.0460 0.5167±0.0492 0.4853±0.0462
LSTM 0.0625±0.0052 0.0788±0.0070 0.0921±0.0058 0.0885±0.0044 0.0919±0.0044 0.0828±0.0054
ED-LSTM 0.0940±0.0130 0.1121±0.0125 0.1032±0.0116 0.0833±0.0036 0.0880±0.0037 0.0961±0.0089
BD-LSTM 0.0859±0.0111 0.1520±0.0268 0.2513±0.0441 0.2838±0.0523 0.2760±0.0501 0.2098±0.0369
Conv-LSTM 0.0542±0.0054 0.0713±0.0094 0.0952±0.0126 0.1028±0.0136 0.1123±0.0127 0.0872±0.0107
Transformer 0.1532±0.0131 0.1624±0.0108 0.1724±0.0083 0.1671±0.0129 0.1650±0.0128 0.1640±0.0116

Table 10: Prediction accuracy on the test dataset reporting RMSE mean and 95% confidence interval (±) for the four cryptocurrencies. The best models for the
different steps are highlighted in bold, and the test mean provides the average of the five steps.

(a) Univariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 16: ETH: performance evaluation of respective univariate methods (RMSE mean with 95% CI for 30 experimental runs.)

(a) Multivariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 17: ETH: performance evaluation of respective multivariate methods (RMSE mean with 95% CI for 30 experimental runs.)

Next, we investigate the factors contributing to the advantages and disadvantages of each model. Conventional time series models and machine learning techniques are inadequate for addressing issues such as timing dependency and gradient explosion. When we used the MLP and ARIMA models to predict Bitcoin, the prediction performance was not as good as that of the deep learning models, and according to [139], there is long-term memory in the cryptocurrency market. The LSTM network, a deep learning model initially designed to address long-term memory issues, distinguishes itself in this regard: the memory gate in the LSTM network can better capture information in time series with long-term dependencies. The prediction performance of CNN is worse than that of LSTM, which is what we expected, because the convolutional layer in CNN is better at capturing local patterns and features, and has better prediction results for sequences where there is obvious spatial correlation between data points. According to past research, CNN seems to be more effective in handling image recognition problems. Next, we analyse the differences between the four models with LSTM layers (LSTM, ED-LSTM, BD-LSTM and Conv-LSTM). The ED-LSTM model was created for language modeling problems, particularly for sequence-to-sequence modeling in language translation. In this model, an encoder LSTM is used to transform a source sequence into a fixed-length vector, while a decoder LSTM is employed to convert the vector representation back into a variable-length target sequence [105]. In our study, the encoder function maps an input time series to a vector of fixed length. Subsequently, the decoder LSTM function translates the vector representation to several prediction horizons. Despite the differences in the application, the fundamental objective of mapping inputs to outputs remains unchanged. As a result, ED-LSTM models have proven to be highly efficient for multi-step ahead prediction.
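A minimal encoder-decoder LSTM of the kind described above can be sketched in Keras as follows, with the encoder compressing the input window into a fixed-length vector that the decoder unrolls over the prediction horizons. The layer sizes and the use of RepeatVector/TimeDistributed are illustrative assumptions rather than the exact architecture used in our experiments.

```python
import tensorflow as tf

def build_ed_lstm(input_window=6, n_features=1, horizon=5, hidden=100):
    """Encoder-decoder LSTM for multi-step ahead prediction."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_window, n_features)),
        tf.keras.layers.LSTM(hidden),                         # encoder: window -> fixed-length vector
        tf.keras.layers.RepeatVector(horizon),                # repeat the vector for each output step
        tf.keras.layers.LSTM(hidden, return_sequences=True),  # decoder: unroll over the horizon
        tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),  # one value per step
    ])

model = build_ed_lstm()
model.compile(optimizer="adam", loss="mse")
model.summary()   # output shape: (None, 5, 1), i.e. 5 prediction horizons
```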
BD-LSTMs utilize two LSTM models to capture both forward and backward information about the sequence at each time step [102]. While these models have been shown to be effective for language modeling, our findings indicate that they are also useful in tracking both present and future states for time series modeling.

(a) Univariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 18: DOGE: performance evaluation of respective univariate methods (RMSE mean with 95% CI for 30 experimental runs.)

(a) Multivariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 19: DOGE: performance evaluation of respective multivariate methods (RMSE mean with 95% CI for 30 experimental runs.)

In our experiments, BD-LSTM and ED-LSTM provided stable and outstanding prediction performance. Although Conv-LSTM uses convolutional layers as input with LSTM memory cells in the hidden layer, it differs from the conventional LSTM models since the memory cells from different hidden layers update only in the time domain and are mutually independent [140]. Therefore, information at the top layer at time t − 1 will be ignored by the bottom layer at time t. In cryptocurrency price prediction, this time information is crucial, which also explains why the prediction accuracy of Conv-LSTM in our experiments is often low and unstable at high prediction horizons. The Transformer also provided unsatisfactory results, which may be attributed to the limited training data, as Transformer models are often better suited for handling large amounts of data.

6. Conclusions

In this study, we provided a rigorous evaluation of novel deep learning models for cryptocurrency price forecasting. We compared prominent deep learning models using univariate and multivariate strategies. The results show that the Bidirectional LSTM provides the highest accuracy in predicting cryptocurrency prices. We also provided a comparison with baseline models such as the multilayer perceptron and ARIMA and found that the deep learning models generally outperform them. We also found that the multivariate models provided lower prediction accuracy than the univariate models; however, there is scope for improvement given the availability of more highly correlated time series data as features. In terms of the effect of COVID-19, we found that the close-price volatility of cryptocurrencies is quite apparent. Our experimental results show that utilising a training data set with high volatility enhances the precision of our predictions.
(a) Univariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 20: LTC: performance evaluation of respective univariate methods (RMSE mean with 95% CI for 30 experimental runs.)

(a) Multivariate model accuracy for train and test data (mean of 5 prediction (b) 5 step-ahead prediction for the test dataset
horizons)

Figure 21: LTC: performance evaluation of respective multivariate methods (RMSE mean with 95% CI for 30 experimental runs.)

In future work, it would be worthwhile to improve the multivariate model. It is advisable to utilise more dependable factors to enhance the forecasts, perhaps employing techniques such as causal inference to find these variables. We can also use this framework to switch the goal to predicting other financial indicators, such as the volatility of cryptocurrencies. Further applications to other specific problems could also be viable, such as predicting energy use and extreme weather forecasting.

7. Code and Data

We provide open source code and data via our GitHub repository: https://github.com/sydney-machine-learning/deeplearning-crypto

References

[1] S. Bose, G. Dong, A. Simpson, The financial ecosystem, Springer, 2019.
[2] J. Frankel, B. Smit, F. Sturzenegger, Fiscal and monetary policy in a commodity-based economy, Economics of Transition 16 (4) (2008) 679–713.
[3] F. A. Hayek, Denationalisation of money: the argument refined: an analysis of the theory and practice of concurrent currencies, Vol. 70, Institute of Economic Affairs, 1990.
[4] M. M. Gross, C. Siebenbrunner, Money creation in fiat and digital currency systems, International Monetary Fund, 2019.
[5] J. H. Boyd, R. Levine, B. D. Smith, The impact of inflation on financial sector performance, Journal of Monetary Economics 47 (2) (2001) 221–248.
[6] U. Milkau, J. Bott, Digitalisation in payments: From interoperability to centralised models?, Journal of Payments Strategy & Systems 9 (3) (2015) 321–340.
[7] D. Chaum, Blind signatures for untraceable payments, in: Advances in Cryptology: Proceedings of Crypto 82, Springer, 1983, pp. 199–203.
[8] S. Nakamoto, Bitcoin: A peer-to-peer electronic cash system (2008).

Data Model Step 1 Step 2 Step 3 Step 4 Step 5 Test Mean
BTC BD-LSTM 0.0194±0.0002 0.0258±0.0003 0.0311±0.0004 0.0367±0.0003 0.0414±0.0004 0.0309±0.0003
ED-LSTM 0.0199±0.0001 0.0284±0.0004 0.0339±0.0006 0.0381±0.0005 0.0418±0.0003 0.0324±0.0004
LSTM 0.0283±0.0004 0.0326±0.0003 0.0369±0.0003 0.0410±0.0003 0.0447±0.0002 0.0367±0.0003
CNN 0.0293±0.0006 0.0342±0.0006 0.0388±0.0005 0.0431±0.0004 0.0467±0.0003 0.0384±0.0005
Conv-LSTM 0.0209±0.0004 0.0263±0.0004 0.0317±0.0004 0.0372±0.0003 0.0421±0.0002 0.0316±0.0004
Transformer 0.0484±0.0055 0.0514±0.0051 0.0546±0.0047 0.0585±0.0047 0.0609±0.0047 0.0548±0.0050
ETH BD-LSTM 0.0230±0.0002 0.0304±0.0004 0.0361±0.0005 0.0426±0.0004 0.0484±0.0006 0.0361±0.0004
ED-LSTM 0.0230±0.0001 0.0307±0.0004 0.0362±0.0004 0.0426±0.0004 0.0481±0.0006 0.0361±0.0004
LSTM 0.0238±0.0005 0.0306±0.0005 0.0361±0.0005 0.0426±0.0005 0.0480±0.0005 0.0362±0.0005
CNN 0.0321±0.0004 0.0397±0.0008 0.0481±0.0014 0.0536±0.0016 0.0585±0.0015 0.0464±0.0012
Conv-LSTM 0.0250±0.0004 0.0332±0.0007 0.0405±0.0013 0.0481±0.0020 0.0550±0.0025 0.0404±0.0014
Transformer 0.0269±0.0008 0.0359±0.0010 0.0411±0.0011 0.0486±0.0017 0.0542±0.0019 0.0413±0.0013
DOGE BD-LSTM 0.0291±0.0001 0.0618±0.0041 0.0673±0.0046 0.0742±0.0044 0.0795±0.0041 0.0624±0.0035
ED-LSTM 0.0290±0.0001 0.0626±0.0030 0.0656±0.0029 0.0708±0.0026 0.0757±0.0024 0.0607±0.0022
LSTM 0.0660±0.0039 0.0683±0.0042 0.0728±0.0047 0.0792±0.0048 0.0835±0.0043 0.0740±0.0044
CNN 0.0646±0.0041 0.0630±0.0040 0.0655±0.0039 0.0762±0.0039 0.0806±0.0031 0.0700±0.0038
Conv-LSTM 0.0538±0.0021 0.0593±0.0022 0.0612±0.0020 0.0662±0.0020 0.0726±0.0019 0.0626±0.0020
Transformer 0.0536±0.0071 0.0586±0.0077 0.0639±0.0073 0.0724±0.0070 0.0789±0.0064 0.0655±0.0071
LTC BD-LSTM 0.0577±0.0007 0.0797±0.0022 0.0968±0.0018 0.1096±0.0015 0.1205±0.0016 0.0929±0.0016
ED-LSTM 0.0578±0.0006 0.0797±0.0012 0.0962±0.0010 0.1096±0.0009 0.1198±0.0009 0.0926±0.0009
LSTM 0.0587±0.0021 0.0804±0.0017 0.0971±0.0015 0.1100±0.0014 0.1207±0.0013 0.0934±0.0016
CNN 0.0823±0.0011 0.1060±0.0042 0.1163±0.0025 0.1268±0.0029 0.1403±0.0049 0.1143±0.0031
Conv-LSTM 0.0809±0.0073 0.1043±0.0095 0.1201±0.0087 0.1286±0.0086 0.1383±0.0072 0.1145±0.0083
Transformer 0.0890±0.0056 0.1046±0.0048 0.1163±0.0042 0.1273±0.0038 0.1354±0.0033 0.1146±0.0043

Table 11: Prediction accuracy on the test dataset reporting RMSE mean and 95% confidence interval (±) for the four cryptocurrencies during COVID-19.

Data Model LSTM ED-LSTM BD-LSTM CNN Conv-LSTM Transformer


Univariate 4 3 1 5 2 6
BTC
Multivariate 3 1 2 5 6 4
Univariate 1 3 2 5 6 4
ETH
Multivariate 5 4 3 1 2 6
Univariate 4 3 1 6 2 5
DOGE
Multivariate 2 3 1 6 4 5
Univariate 4 3 2 5 1 6
LTC
Multivariate 1 3 5 6 2 4
Mean Rank 3 2.875 2.125 4.875 3.125 5

Table 12: Performance (rank) of different models for predicting cryptocurrency for Dataset 1.

Data LSTM ED-LSTM BD-LSTM CNN Conv-LSTM Transformer


BTC 4 3 1 5 2 6
ETH 3 2 1 4 5 6
DOGE 6 4 1 5 2 3
LTC 3 1 2 4 5 6
Mean Rank 4 2.5 1.25 4.5 3.5 5.25

Table 13: Performance (rank) of different models for predicting cryptocurrency for Dataset 2.

[9] A. Manimuthu, G. Rejikumar, D. Marwaha, et al., A literature review on [13] M. Saad, J. Choi, D. Nyang, J. Kim, A. Mohaisen, Toward characteriz-
Bitcoin: transformation of crypto currency into a global phenomenon, ing blockchain-based cryptocurrencies for highly accurate predictions,
IEEE Engineering Management Review 47 (1) (2019) 28–35. IEEE Systems Journal 14 (1) (2019) 321–332.
[10] R. Farell, An analysis of the cryptocurrency industry, Wharton Research [14] S. Corbet, B. Lucey, L. Yarovaya, Datestamping the bitcoin and
Scholars 130 (2015) 1–23. ethereum bubbles, Finance Research Letters 26 (2018) 81–88.
[11] I. Eyal, Blockchain technology: Transforming libertarian cryptocur- [15] J. Bhosale, S. Mavale, Volatility of select crypto-currencies: A compar-
rency dreams to finance and banking realities, Computer 50 (9) (2017) ison of bitcoin, ethereum and litecoin, Annu. Res. J. SCMS, Pune 6 (1)
38–49. (2018) 132–141.
[12] H. Jang, J. Lee, An empirical study on modeling and prediction of bit- [16] P. Katsiampa, An empirical investigation of volatility dynamics in the
coin prices with bayesian neural networks based on blockchain informa- cryptocurrency market, Research in International Business and Finance
tion, IEEE access 6 (2017) 5427–5437. 50 (2019) 322–335.

[17] H. Elendner, S. Trimborn, B. Ong, T. M. Lee, The cross-section of [42] T. Fischer, C. Krauss, Deep learning with long short-term memory net-
crypto-currencies as financial assets: An overview (2016). works for financial market predictions, European journal of operational
[18] P. L. Seabe, C. R. B. Moutsinga, E. Pindza, Forecasting cryptocurrency research 270 (2) (2018) 654–669.
prices using lstm, gru, and bi-directional lstm: A deep learning ap- [43] O. B. Sezer, M. U. Gudelek, A. M. Ozbayoglu, Financial time series
proach, Fractal and Fractional 7 (2) (2023) 203. forecasting with deep learning: A systematic literature review: 2005–
[19] N. Kyriazis, S. Papadamou, S. Corbet, A systematic review of the bubble 2019, Applied soft computing 90 (2020) 106181.
dynamics of cryptocurrency prices, Research in International Business [44] V. Plakandaras, T. Papadimitriou, P. Gogas, K. Diamantaras, Market sen-
and Finance 54 (2020) 101254. timent and exchange rate directional forecasting, Algorithmic Finance
[20] M. A. Ammer, T. H. Aldhyani, Deep learning algorithm to predict cryp- 4 (1-2) (2015) 69–79.
tocurrency fluctuation prices: Increasing investment awareness, Elec- [45] M. Nabipour, P. Nayyeri, H. Jabani, S. Shahab, A. Mosavi, Predict-
tronics 11 (15) (2022) 2349. ing stock market trends using machine learning and deep learning al-
[21] K. Murray, A. Rossi, D. Carraro, A. Visentin, On forecasting cryptocur- gorithms via continuous and binary data; a comparative analysis, Ieee
rency prices: A comparison of machine learning, deep learning, and Access 8 (2020) 150199–150212.
ensembles, Forecasting 5 (1) (2023) 196–209. [46] A. Kong, H. Zhu, R. Azencott, Predicting intraday jumps in stock prices
[22] Y. Wang, G. Andreeva, B. Martin-Barragan, Machine learning ap- using liquidity measures and technical indicators, Journal of Forecasting
proaches to forecasting cryptocurrency volatility: Considering internal 40 (3) (2021) 416–438.
and external determinants, International Review of Financial Analysis [47] J. L. Elman, Finding structure in time, Cognitive science 14 (2) (1990)
90 (2023) 102914. 179–211.
[23] N. A. Kyriazis, A survey on empirical findings about spillovers in cryp- [48] S. Mehtab, J. Sen, A. Dutta, Stock price prediction using machine
tocurrency markets, Journal of Risk and Financial Management 12 (4) learning and lstm-based deep learning models, in: Machine Learning
(2019) 170. and Metaheuristics Algorithms, and Applications: Second Symposium,
[24] J. H. Stock, M. W. Watson, Vector autoregressions, Journal of Economic SoMMA 2020, Chennai, India, October 14–17, 2020, Revised Selected
perspectives 15 (4) (2001) 101–115. Papers 2, Springer, 2021, pp. 88–106.
[25] J.-C. Duan, The garch option pricing model, Mathematical finance 5 (1) [49] H. Rezaei, H. Faaljou, G. Mansourfar, Stock price prediction using deep
(1995) 13–32. learning and frequency decomposition, Expert Systems with Applica-
[26] T. L. D. Huynh, M. A. Nasir, X. V. Vo, T. T. Nguyen, “small things tions 169 (2021) 114332.
matter most”: The spillover effects in the cryptocurrency market and [50] G. Rilling, P. Flandrin, P. Goncalves, et al., On empirical mode decom-
gold as a silver bullet, The North American Journal of Economics and position and its algorithms, in: IEEE-EURASIP workshop on nonlinear
Finance 54 (2020) 101277. signal and image processing, Vol. 3, Grado: IEEE, 2003, pp. 8–11.
[27] T. Baltrušaitis, C. Ahuja, L.-P. Morency, Multimodal machine learning: [51] M. E. Torres, M. A. Colominas, G. Schlotthauer, P. Flandrin, A complete
A survey and taxonomy, IEEE transactions on pattern analysis and ma- ensemble empirical mode decomposition with adaptive noise, in: 2011
chine intelligence 41 (2) (2018) 423–443. IEEE international conference on acoustics, speech and signal process-
[28] S. Wang, J. Cao, S. Y. Philip, Deep learning for spatio-temporal data ing (ICASSP), IEEE, 2011, pp. 4144–4147.
mining: A survey, IEEE transactions on knowledge and data engineering [52] N. Jing, Z. Wu, H. Wang, A hybrid model integrating deep learning with
34 (8) (2020) 3681–3700. investor sentiment analysis for stock price prediction, Expert Systems
[29] B. Lim, S. Zohren, Time-series forecasting with deep learning: a survey, with Applications 178 (2021) 115019.
Philosophical Transactions of the Royal Society A 379 (2194) (2021) [53] S. Mehtab, J. Sen, Stock price prediction using machine learning and
20200209. deep learning algorithms and models, Machine Learning in the Analysis
[30] V. Jacques-Dumas, F. Ragone, P. Borgnat, P. Abry, F. Bouchet, and Forecasting of Financial Time Series (2022) 235–303.
Deep learning-based extreme heatwave forecast, Frontiers in Climate [54] Y. Li, Y. Pan, A novel ensemble deep learning model for stock prediction
4 (2022). based on stock prices and news, International Journal of Data Science
[31] S. Mahjoub, L. Chrifi-Alaoui, B. Marhic, L. Delahoche, Predicting en- and Analytics 13 (2) (2022) 139–149.
ergy consumption using lstm, multi-layer gru and drop-gru neural net- [55] A. Kanwal, M. F. Lau, S. P. Ng, K. Y. Sim, S. Chandrasekaran,
works, Sensors 22 (11) (2022) 4062. Bicudnnlstm-1dcnn—a hybrid deep learning-based predictive model for
[32] R. Chandra, Y. He, Bayesian neural networks for stock price forecasting stock price prediction, Expert Systems with Applications 202 (2022)
before and during covid-19 pandemic, Plos one 16 (7) (2021) e0253217. 117123.
[33] I. E. Livieris, E. Pintelas, S. Stavroyiannis, P. Pintelas, Ensemble deep [56] T. Swathi, N. Kasiviswanath, A. A. Rao, An optimal deep learning-based
learning models for forecasting cryptocurrency time-series, Algorithms lstm for stock price prediction using twitter sentiment analysis, Applied
13 (5) (2020) 121. Intelligence 52 (12) (2022) 13675–13688.
[34] F. Ferdiansyah, S. H. Othman, R. Z. R. M. Radzi, D. Stiawan, Y. Sazaki, [57] H. Ben Ameur, S. Boubaker, Z. Ftiti, W. Louhichi, K. Tissaoui, Fore-
U. Ependi, A lstm-method for bitcoin price prediction: A case study ya- casting commodity prices: empirical evidence using deep learning tools,
hoo finance stock market, in: 2019 international conference on electrical Annals of Operations Research (2023) 1–19.
engineering and computer science (ICECOS), IEEE, 2019, pp. 206–210. [58] P. Baser, J. R. Saini, N. Baser, Gold commodity price prediction using
[35] C.-H. Wu, C.-C. Lu, Y.-F. Ma, R.-S. Lu, A new forecasting framework tree-based prediction models, International Journal of Intelligent Sys-
for bitcoin price with lstm, in: 2018 IEEE international conference on tems and Applications in Engineering 11 (1s) (2023) 90–96.
data mining workshops (ICDMW), IEEE, 2018, pp. 168–175. [59] S. Deepa, A. Alli, S. Gokila, et al., Machine learning regression model
[36] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural com- for material synthesis prices prediction in agriculture, Materials Today:
putation 9 (8) (1997) 1735–1780. Proceedings 81 (2023) 989–993.
[37] Y. Yu, X. Si, C. Hu, J. Zhang, A review of recurrent neural net- [60] Y. Zhao, G. Yang, Deep learning-based integrated framework for
works: LSTM cells and network architectures, Neural computation stock price movement prediction, Applied Soft Computing 133 (2023)
31 (7) (2019) 1235–1270. 109921.
[38] Z. Jiang, J. Liang, Cryptocurrency portfolio management with deep rein- [61] J. Almeida, S. Tata, A. Moser, V. Smit, Bitcoin prediciton using ann,
forcement learning, in: 2017 Intelligent systems conference (IntelliSys), Neural networks 7 (2015) 1–12.
IEEE, 2017, pp. 905–913. [62] D. C. Mallqui, R. A. Fernandes, Predicting the direction, maximum,
[39] S. Sridhar, S. Sanagavarapu, Multi-head self-attention transformer for minimum and closing prices of daily bitcoin exchange rate using ma-
dogecoin price prediction, in: 2021 14th International Conference on chine learning techniques, Applied Soft Computing 75 (2019) 596–606.
Human System Interaction (HSI), IEEE, 2021, pp. 1–6. [63] S. G. Quek, G. Selvachandran, J. H. Tan, H. Y. A. Thiang, N. T. Tuan,
[40] R. Chandra, S. Goyal, R. Gupta, Evaluation of deep learning models for et al., A new hybrid model of fuzzy time series and genetic algorithm
multi-step ahead time series prediction, Ieee Access 9 (2021) 83105– based machine learning algorithm: a case study of forecasting prices
83123. of nine types of major cryptocurrencies, Big Data Research 28 (2022)
[41] R. S. Tsay, Analysis of financial time series, John wiley & sons, 2005. 100315.

[64] A. Radityo, Q. Munajat, I. Budi, Prediction of bitcoin exchange rate to [87] M. Schnaubelt, Deep reinforcement learning for the optimal placement
american dollar using artificial neural network methods, in: 2017 in- of cryptocurrency limit orders, European Journal of Operational Re-
ternational conference on advanced computer science and information search 296 (3) (2022) 993–1006.
systems (ICACSIS), IEEE, 2017, pp. 433–438. [88] R. Parekh, N. P. Patel, N. Thakkar, R. Gupta, S. Tanwar, G. Sharma, I. E.
[65] A. Greaves, B. Au, Using the bitcoin transaction graph to predict the Davidson, R. Sharma, Dl-guess: Deep learning and sentiment analysis-
price of bitcoin, No data 8 (2015) 416–443. based cryptocurrency price prediction, IEEE Access 10 (2022) 35398–
[66] C. Cortes, V. Vapnik, Support-vector networks, Machine learning 20 35409.
(1995) 273–297. [89] G. Kim, D.-H. Shin, J. G. Choi, S. Lim, A deep learning-based cryp-
[67] Y. Sovbetov, Factors influencing cryptocurrency prices: Evidence from tocurrency price prediction model that uses on-chain data, IEEE Access
bitcoin, ethereum, dash, litcoin, and monero, Journal of Economics and 10 (2022) 56232–56248.
Financial Analysis 2 (2) (2018) 1–27. [90] S. Goutte, H.-V. Le, F. Liu, H.-J. Von Mettenheim, Deep learning and
[68] T. Guo, A. Bifet, N. Antulov-Fantulin, Bitcoin volatility forecasting with technical analysis in cryptocurrency market, Finance Research Letters
a glimpse into buy and sell orders, in: 2018 IEEE international confer- 54 (2023) 103809.
ence on data mining (ICDM), IEEE, 2018, pp. 989–994. [91] K.-C. Yen, H.-P. Cheng, Economic policy uncertainty and cryptocur-
[69] C. G. Akcora, A. K. Dey, Y. R. Gel, M. Kantarcioglu, Forecasting bitcoin rency volatility, Finance Research Letters 38 (2021) 101428.
price with graph chainlets, in: Advances in Knowledge Discovery and [92] F. Woebbeking, Cryptocurrency volatility markets, Digital finance 3 (3)
Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, (2021) 273–298.
VIC, Australia, June 3-6, 2018, Proceedings, Part III 22, Springer, 2018, [93] J. L. Cross, C. Hou, K. Trinh, Returns, volatility and the cryptocurrency
pp. 765–776. bubble of 2017–18, Economic Modelling 104 (2021) 105643.
[70] S. Roy, S. Nanjiba, A. Chakrabarty, Bitcoin price forecasting using time [94] Z. Ftiti, W. Louhichi, H. Ben Ameur, Cryptocurrency volatility forecast-
series analysis, in: 2018 21st International Conference of Computer and ing: What can we learn from the first wave of the covid-19 outbreak?,
Information Technology (ICCIT), IEEE, 2018, pp. 1–5. Annals of Operations Research 330 (1) (2023) 665–690.
[71] V. Derbentsev, N. Datsenko, O. Stepanenko, V. Bezkorovainyi, Forecast- [95] L. Yin, J. Nie, L. Han, Understanding cryptocurrency volatility: The role
ing cryptocurrency prices time series using machine learning approach, of oil market shocks, International Review of Economics & Finance 72
in: SHS Web of Conferences, Vol. 65, EDP Sciences, 2019, p. 02001. (2021) 233–253.
[72] S. Aanandhi, S. Akhilaa, V. Vardarajan, M. Sathiyanarayanan, et al., [96] L. Catania, S. Grassi, F. Ravazzolo, Predicting the volatility of cryp-
Cryptocurrency price prediction using time series forecasting (arima), tocurrency time-series, Mathematical and Statistical Methods for Actu-
in: 2021 4th International Seminar on Research of Information Technol- arial Sciences and Finance: MAF 2018 (2018) 203–207.
ogy and Intelligent Systems (ISRITI), IEEE, 2021, pp. 598–602. [97] L. Catania, S. Grassi, Forecasting cryptocurrency volatility, Interna-
[73] N. Latif, J. D. Selvam, M. Kapse, V. Sharma, V. Mahajan, Comparative tional Journal of Forecasting 38 (3) (2022) 878–894.
performance of lstm and arima for the short-term prediction of bitcoin [98] F. Ma, C. Liang, Y. Ma, M. I. M. Wahab, Cryptocurrency volatility fore-
prices, Australasian Accounting, Business and Finance Journal 17 (1) casting: A markov regime-switching midas approach, Journal of Fore-
(2023) 256–276. casting 39 (8) (2020) 1277–1290.
[74] N. Maleki, A. Nikoubin, M. Rabbani, Y. Zeinali, Bitcoin price prediction [99] Y. Wei, Y. Wang, B. M. Lucey, S. A. Vigne, Cryptocurrency uncertainty
based on other cryptocurrencies using machine learning and time series and volatility forecasting of precious metal futures markets, Journal of
analysis, Scientia Iranica 30 (1) (2023) 285–301. Commodity Markets 29 (2023) 100305.
[75] L. P. Kaelbling, M. L. Littman, A. W. Moore, Reinforcement learning: [100] G. E. Box, G. M. Jenkins, G. C. Reinsel, G. M. Ljung, Time series
A survey, Journal of artificial intelligence research 4 (1996) 237–285. analysis: forecasting and control, John Wiley & Sons, 2015.
[76] K. Lee, S. Ulkuatam, P. Beling, W. Scherer, Generating synthetic bit- [101] S. Hochreiter, The vanishing gradient problem during learning recurrent
coin transactions and predicting market price movement via inverse re- neural nets and problem solutions, International Journal of Uncertainty,
inforcement learning and agent-based modeling, Journal of Artificial So- Fuzziness and Knowledge-Based Systems 6 (02) (1998) 107–116.
cieties and Social Simulation 21 (3) (2018). [102] A. Graves, J. Schmidhuber, Framewise phoneme classification with bidi-
[77] B. Ly, D. Timaul, A. Lukanan, J. Lau, E. Steinmetz, Applying deep rectional lstm and other neural network architectures, Neural networks
learning to better predict cryptocurrency trends, in: Midwest Instruction 18 (5-6) (2005) 602–610.
and Computing Symposium, 2018. [103] Y. Liu, C. Sun, L. Lin, X. Wang, Learning natural language infer-
[78] G. Lucarelli, M. Borrotti, A deep reinforcement learning approach for ence using bidirectional lstm model and inner-attention, arXiv preprint
automated cryptocurrency trading, in: Artificial Intelligence Applica- arXiv:1605.09090 (2016).
tions and Innovations: 15th IFIP WG 12.5 International Conference, [104] L. Chen, J. Tao, S. Ghaffarzadegan, Y. Qian, End-to-end neural network
AIAI 2019, Hersonissos, Crete, Greece, May 24–26, 2019, Proceedings based automated speech scoring, in: 2018 IEEE international conference
15, Springer, 2019, pp. 247–258. on acoustics, speech and signal processing (ICASSP), IEEE, 2018, pp.
[79] S. Lahmiri, S. Bekiros, Cryptocurrency forecasting with deep learning 6234–6238.
chaotic neural networks, Chaos, Solitons & Fractals 118 (2019) 35–40. [105] I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with
[80] M. M. Patel, S. Tanwar, R. Gupta, N. Kumar, A deep learning-based neural networks, Advances in neural information processing systems 27
cryptocurrency price prediction scheme for financial institutions, Journal (2014).
of information security and applications 55 (2020) 102583. [106] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares,
[81] S. Marne, S. Churi, D. Correia, J. Gomes, Predicting price of H. Schwenk, Y. Bengio, Learning phrase representations using RNN
cryptocurrency–a deep learning approach, NTASU-9 (3) (2020). encoder-decoder for statistical machine translation, arXiv preprint
[82] S. Nasekin, C. Y.-H. Chen, Deep learning-based cryptocurrency senti- arXiv:1406.1078 (2014).
ment construction, Digital Finance 2 (1) (2020) 39–67. [107] H. Gunduz, Y. Yaslan, Z. Cataltepe, Intraday prediction of borsa is-
[83] C. Betancourt, W.-H. Chen, Deep reinforcement learning for portfolio tanbul using convolutional neural networks and feature correlations,
management of markets with a dynamic number of assets, Expert Sys- Knowledge-Based Systems 137 (2017) 138–148.
tems with Applications 164 (2021) 114002. [108] L. Di Persio, O. Honchar, et al., Artificial neural networks architectures
[84] K. Arulkumaran, M. P. Deisenroth, M. Brundage, A. A. Bharath, Deep for stock price prediction: Comparisons and applications, International
reinforcement learning: A brief survey, IEEE Signal Processing Maga- journal of circuits, systems and signal processing 10 (2016) 403–413.
zine 34 (6) (2017) 26–38. [109] E. Hoseinzade, S. Haratizadeh, Cnnpred: Cnn-based stock market pre-
[85] Z. Shahbazi, Y.-C. Byun, Improving the cryptocurrency price prediction diction using a diverse set of variables, Expert Systems with Applica-
performance based on reinforcement learning, IEEE Access 9 (2021) tions 129 (2019) 273–285.
162651–162659. [110] A. Siripurapu, Convolutional networks for stock trading, Stanford Univ
[86] V. D’Amato, S. Levantesi, G. Piscopo, Deep learning in predicting cryp- Dep Comput Sci 1 (2) (2014) 1–6.
tocurrency volatility, Physica A: Statistical Mechanics and its Applica- [111] H. Jiang, E. Learned-Miller, Face detection with the faster r-cnn, in:
tions 596 (2022) 127158. 2017 12th IEEE international conference on automatic face & gesture

recognition (FG 2017), IEEE, 2017, pp. 650–657. [137] G. Miao, Z. Chen, H. Cao, W. Wu, X. Chu, H. Liu, L. Zhang,
[112] A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, H. Zhu, H. Cai, X. Lu, et al., From immunogen to COVID-19 vaccines:
J. Garcia-Rodriguez, A review on deep learning techniques applied to Prospects for the post-pandemic era, Biomedicine & Pharmacotherapy
semantic segmentation, arXiv preprint arXiv:1704.06857 (2017). 158 (2023) 114208.
[113] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv [138] D. Łaskawiec, M. Grajek, P. Szlacheta, I. Korzonek-Szlacheta, Post-
preprint arXiv:1412.6980 (2014). pandemic stress disorder as an effect of the epidemiological situation
[114] X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, W.-c. Woo, Con- related to the COVID-19 pandemic, in: Healthcare, Vol. 10, 2022, p.
volutional lstm network: A machine learning approach for precipita- 975.
tion nowcasting, Advances in neural information processing systems 28 [139] Y. Jiang, H. Nie, W. Ruan, Time-varying long-term memory in bitcoin
(2015). market, Finance Research Letters 25 (2018) 280–284.
[115] K. Cho, B. Van Merriënboer, D. Bahdanau, Y. Bengio, On the proper- [140] Y. Wang, M. Long, J. Wang, Z. Gao, P. S. Yu, Predrnn: Recurrent neural
ties of neural machine translation: Encoder-decoder approaches, arXiv networks for predictive learning using spatiotemporal lstms, Advances
preprint arXiv:1409.1259 (2014). in neural information processing systems 30 (2017).
[116] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly
learning to align and translate, arXiv preprint arXiv:1409.0473 (2014).
[117] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez,
Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in neural
information processing systems 30 (2017).
[118] S.-i. Amari, Backpropagation and stochastic gradient descent method,
Neurocomputing 5 (4-5) (1993) 185–196.
[119] S. Ruder, An overview of gradient descent optimization algorithms,
arXiv preprint arXiv:1609.04747 (2016).
[120] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for on-
line learning and stochastic optimization., Journal of machine learning
research 12 (7) (2011).
[121] M. D. Zeiler, Adadelta: an adaptive learning rate method, arXiv preprint
arXiv:1212.5701 (2012).
[122] G. Hinton, N. Srivastava, K. Swersky, Neural networks for machine
learning lecture 6a overview of mini-batch gradient descent, Toronto
University (2012).
[123] V. Buterin, et al., A next-generation smart contract and decentralized
application platform, white paper 3 (37) (2014) 2–1.
[124] C. Percival, S. Josefsson, The scrypt password-based key derivation
function, Tech. rep. (2016).
[125] Cryptocurrency historical prices, last accessed 13 Feburary 2024 (2021).
URL https://www.kaggle.com/datasets/sudalairajkumar/
cryptocurrencypricehistory/data
[126] F. Takens, Detecting strange attractors in turbulence, in: Dynamical Sys-
tems and Turbulence, Warwick 1980: proceedings of a symposium held
at the University of Warwick 1979/80, Springer, 2006, pp. 366–381.
[127] A. Gnauck, Interpolation and approximation of water quality time series
and process identification, Analytical and bioanalytical chemistry 380
(2004) 484–492.
[128] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan,
T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., Pytorch: An imper-
ative style, high-performance deep learning library, Advances in neural
information processing systems 32 (2019).
[129] P. E. Mandaci, E. C. Cagli, Herding intensity and volatility in cryp-
tocurrency markets during the COVID-19, Finance Research Letters 46
(2022) 102382.
[130] M. A. Naeem, E. Bouri, Z. Peng, S. J. H. Shahzad, X. V. Vo, Asymmetric
efficiency of cryptocurrencies during COVID19, Physica A: Statistical
Mechanics and its Applications 565 (2021) 125562.
[131] A. Tanwar, V. Kumar, Prediction of cryptocurrency prices using trans-
formers and long short term neural networks, in: 2022 International
Conference on Intelligent Controller and Computing for Smart Power
(ICICCSP), IEEE, 2022, pp. 1–4.
[132] D. Wang, W.-Z. Lu, Forecasting of ozone level in time series using mlp
model with a novel hybrid training algorithm, Atmospheric Environment
40 (5) (2006) 913–924.
[133] D. Cucinotta, M. Vanelli, Who declares covid-19 a pandemic, Acta bio
medica: Atenei parmensis 91 (1) (2020) 157.
[134] K. G. Andersen, A. Rambaut, W. I. Lipkin, E. C. Holmes, R. F. Garry,
The proximal origin of sars-cov-2, Nature medicine 26 (4) (2020) 450–
452.
[135] A. Brodeur, D. Gray, A. Islam, S. Bhuiyan, A literature review of the
economics of covid-19, Journal of economic surveys 35 (4) (2021)
1007–1044.
[136] M. Leach, H. MacGregor, I. Scoones, A. Wilkinson, Post-pandemic
transformations: How and why COVID-19 requires us to rethink de-
velopment, World development 138 (2021) 105233.
