ISJBC

This document discusses characterizing blockchain-based cryptocurrencies like Bitcoin and Ethereum to accurately predict their prices. It analyzes user and network activity data that impact cryptocurrency prices over time. Machine learning models are constructed to predict cryptocurrency prices with up to 99% accuracy based on key network features correlated with demand and supply dynamics.


IEEE SYSTEMS JOURNAL (ISJ), VOLUME X, ISSUE X, DECEMBER 2018 1

Towards Characterizing Blockchain-based Cryptocurrencies for Highly-Accurate Predictions
Muhammad Saad, Jinchun Choi, DaeHun Nyang, Joongheon Kim, and Aziz Mohaisen Senior Member, IEEE.

Abstract—Recently, the Blockchain-based cryptocurrency market witnessed enormous growth. Bitcoin, the leading cryptocurrency, reached all-time highs many times over the year, leading to speculations to explain the trend in its growth. In this paper, we study Bitcoin and Ethereum and explore features in their network that explain their price hikes. We gather data and analyze user and network activity that highly impact the price of these cryptocurrencies. We monitor the change in the activities over time and relate them to economic theories. We identify key network features that help us determine the demand and supply dynamics in a cryptocurrency. Finally, we use machine learning methods to construct models that predict Bitcoin price. Based on our experimental results using two large datasets for validation, we confirm that our approach provides an accuracy of up to 99% for Bitcoin and Ethereum price prediction in both instances.

Index Terms—Blockchain, Bitcoin, Ethereum, prediction.

M. Saad, J. Choi, and A. Mohaisen are with the University of Central Florida. D. Nyang and J. Choi are with Inha University, South Korea. J. Kim is with Chung-Ang University, Seoul, South Korea. J. Kim and A. Mohaisen are the corresponding authors. E-mail: [email protected], Tel: +1-407-823-1294. An earlier version of this work has appeared in HotPOST 2018.

I. INTRODUCTION

Blockchain-based digital currencies have witnessed enormous change in value over the last few years [1]. Bitcoin, the most popular cryptocurrency, was launched in 2009, and stayed as the only Blockchain-based cryptocurrency for more than two years. However, today, the cryptocurrency world has more than 5000 cryptocurrencies [2] and more than 5.8 million active users [3]. Bitcoin leads the cryptocurrency market with 58% market share, corresponding to $4.9 Billion USD trade volume and over 12,000 transactions per hour [4]. In December 2016, the price of 1 Bitcoin token (BTC) was under $1000 USD, compared to about $19,000 USD in late 2017, and over $3600 USD in January 2019 [5]. These changes in the price led to a lot of interest in cryptocurrency and Bitcoin in particular. In this paper, we carry out a study on Bitcoin and Ethereum to analyze their network features that capture the user behavior and in turn have an impact on their price.

The underlying technology of every cryptocurrency is the Blockchain. Blockchain acts as a decentralized public database that preserves anonymity and augments trust between the users [6]. Trust in an anonymous peer-to-peer model is achieved by consensus protocols such as Proof-of-Work (PoW), Proof-of-Stake (PoS), Proof-of-Knowledge (PoK), and distributed consensus [7]. The decentralized environment and the append-only model prevent Blockchain failure and data tampering, and such features lay ideal foundations for cryptocurrency applications to be built on top of Blockchain.

Cryptocurrencies involve the exchange of digital assets (tokens) and have evolved from virtual currency to smart contracts and applications beyond currency. This transformation of cryptocurrencies is categorized as Blockchain 1.0, 2.0 and 3.0 [8]. Blockchain 1.0 solely involves transfer of digital currency between parties. Bitcoin is an example of Blockchain 1.0, since it only allows transfer of digital tokens (bitcoins). Blockchain 2.0 is an extension of Blockchain 1.0 that allows transfer of many other assets, offering more flexible protocols for the users to design their transactions, such as smart contracts [9] and decentralized autonomous organizations (DAOs) [10], which are among many useful applications of Blockchain 2.0 [11]. Blockchain 3.0 is yet another extension of this technology that envisions the use of Blockchain beyond digital currencies, with applications for distributed censorship-resistant organization models, digital identity verification, and decentralized domain name systems [12].

New cryptocurrencies address shortcomings of older ones, with better throughput, scalability, and programmability. Although this gives a general idea why cryptocurrency markets have grown, many factors contributing to the rise in cryptocurrency prices are not well understood. In this paper, we look at the dynamics of various variables in a cryptocurrency, namely Bitcoin and Ethereum, which can shed light on their price trends. We use the network features of these cryptocurrencies as an example and perform an in-depth analysis using the data obtained from their Blockchain and peer-to-peer network.

The key factor that influences the growth of the cryptocurrency market is the interest shown by the users towards the trade of the digital tokens. As more users engage in the market activity, the demand for the digital tokens increases, leading to a higher price. However, unlike fiat currency systems, which are centralized and traceable, cryptocurrencies are (theoretically) decentralized and pseudo-anonymous, lacking tangible digital footprints. Therefore, with insufficient knowledge it becomes challenging to measure the interest factor of the users and perform a user-based study aimed towards the understanding of changing price and market trends.

We address this challenge by arguing that despite anonymity and decentralization of cryptocurrencies, there are several network indicators that might be useful in demonstrating the interest of users and the overall market behavior. We show that these network indicators have a high correlation with the price of a cryptocurrency and can be used to accurately predict its price. Furthermore, these features can also be used to provide a rationale behind the network activity driven by
the user behavior. To validate our reasoning with experiments, we construct a machine learning model that learns from the highly correlated network indicators and predicts the price of cryptocurrencies with high accuracy.

[Figure 1: line plot of normalized value (0 to 1) against dates (mm/dd/yy), from 05/01/17 to 05/01/18, for Ripple, Litecoin, Dash, Bitcoin, and Ethereum.]

Fig. 1. Cryptocurrency price change in 2017-18. Notice that there is a high correlation in the price fluctuation in all cryptocurrencies. Furthermore, it can be observed that towards the end of 2017, each cryptocurrency reached its highest price index. In 2018, the price index decreased.

Prior efforts on prediction for cryptocurrency price used the past price indexes to forecast the future price [13]. This approach is inspired by a large body of work on stock market prediction [14], and has been tried in the cryptocurrency market. However, this method does not account for the volatile behaviors of network entities that may indirectly, albeit drastically, influence the price, independent of the previous price indexes. For instance, a sudden decrease in the network hash rate can prolong the block publishing time and reduce the network throughput as well as the number of newly generated coins. Such a decline in hash rate is independent of the past price index, and therefore the past price index alone cannot be used to accurately model the future price. Lacking the ability to capture this behavior has led to low accuracy of prediction models in the prior art (≈ 52%). Specific to our work, we take a tangential approach towards modeling the price by using network indicators that are strongly correlated with the cryptocurrency price and lead to better and more accurate prediction models.

Novelty. The novelty of our approach lies in: 1) the identification of the key network features that capture the changing price models, and can therefore be used for feature engineering (sub-subsection III-A1), 2) the distinguishing methodology from the prior work [13], in which only the past price indexes were used to predict the future price (subsection III-A), and 3) the methodical reasoning about correlation of identified features with cryptocurrency price, to enhance the understanding of user behavior and the cryptocurrency network (section IV). After feature selection, we use standard machine learning techniques, including regression, long short-term memory (LSTM) networks, and the conjugate gradient algorithm. As a result, our prediction models achieve a high accuracy of 99% for Bitcoin and Ethereum, outperforming the state-of-the-art [15] (52%), and validating the novelty and significance of our approach.

Contributions. In summary, we make the following contributions. 1) We study the Bitcoin and Ethereum networks and identify the key network indicators that affect their price. 2) We show how these features are driven by user and network activity, and provide a rationale behind their influence on price. 3) We adopt a machine learning approach using regression and long short-term memory (LSTM) analyses to construct price prediction models for Bitcoin and Ethereum. 4) Our prediction models estimate the price of cryptocurrencies with high accuracy (99%), and outperform the state-of-the-art.

Organization. The rest of the paper is organized as follows. In section II, we review the related work. In section III, we provide preliminaries of this work and outline our methodology and dataset attributes. In section IV, we perform data analysis to extract the most significant features that impact the price. In section V, we carry out our experiments and report the results. Concluding remarks are made in section VI.

II. RELATED WORK

In this section, we review the notable related work. We focus on analyses dedicated to understanding how cryptocurrencies influence the financial and other systems, general analysis of Bitcoin and Ethereum, and their price prediction.

Vigna et al. [16] analyzed how Blockchain-based applications are challenging the global economic order by exploring the impact of Blockchain-based applications on the future of the financial system. Swan [8] proposed a possibility of cheaper, efficient, and secure economic models based on Blockchain. The use of Blockchain 3.0 is estimated to create new possibilities in Internet of Things (IoT), privacy management, and voting systems [17].

Blockchain 2.0 transformed cryptocurrency from mere exchange of tokens to smart contracts. Rose [18] analyzed the evolution of digital currencies and Omohundro [19] explored recent developments in cryptocurrency and smart contracts. Kosba et al. [9] explored different dimensions of smart contracts, including criminal smart contracts. Peters et al. [20] analyzed the future of banking system ledgers with Blockchain technology, transaction processing, and smart contracts.

For better applications, the security attack surface of Blockchain is also explored, including the 51% attack, selfish mining, double-spending, block withholding, block forks, and distributed denial-of-service (DDoS) attacks [21], the last being arguably the most prevalent attack [22].

Limited research is done on the feature-based price analysis. Indera et al. [13] developed a non-linear autoregressive Bitcoin price prediction model using the opening and closing past prices to predict future price. McNally [15] explored various machine learning approaches to predict Bitcoin price using the Bitcoin price index, achieving a maximum accuracy of 52% with LSTM networks. This paper is an extension of our previously published work in [23]; concurrent to that work, Jang and Lee [24] performed a time series analysis of Bitcoin to improve predictive performance. They use a Bayesian neural network with other linear and non-linear benchmark models to explain volatility in Bitcoin price.

In this paper, we explore other features, besides past prices, to establish patterns in price. We investigate various network features and identify the highly correlated ones that determine the price. Using those features, we train and test our models, which achieve a near-perfect prediction accuracy.
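The idea of keeping only the network features that are highly correlated with price can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy series, and the use of the absolute value of the coefficient are assumptions made for illustration, and the default threshold of 0.6 is the feature-selection threshold quoted in the caption of Table I.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, price, threshold=0.6):
    """Keep features whose |correlation| with price meets the threshold.

    Using the absolute value is an assumption of this sketch.
    """
    selected = {}
    for name, series in features.items():
        r = pearson(series, price)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


# Toy, made-up series (hypothetical; not taken from the paper's dataset).
price = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "hash_rate": [10.0, 19.0, 31.0, 42.0, 50.0],   # rises with price
    "orphaned_blocks": [3.0, 1.0, 4.0, 1.0, 5.0],  # weakly related to price
}
selected = select_features(features, price)
print(sorted(selected))
```

In this toy input only the series that tracks the price closely survives the filter; the surviving features would then be passed to whatever model is trained.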
III. PRELIMINARIES

The main goal of this work is broad and aims to provide the initial step towards characterizing Blockchain-based cryptocurrencies for predictions. Towards that, we perform a detailed analysis for the top two cryptocurrencies in the market, namely Bitcoin and Ethereum. We select them due to their widespread popularity, extensive user-base, and high market cap. Our approach towards price characterization based on network features can be extended to other cryptocurrencies.

In Fig. 1, we plot the price change trend of five major cryptocurrencies over the last year. The difference in the actual price value of each currency is high, and the prices cannot be plotted in one graph. We use the min-max normalization to scale the data in the range [0, 1] and plot the normalized price. The min-max scaling is conducted as $z_i = \frac{x_i - \min(x)}{\max(x) - \min(x)}$.

In Fig. 1, we observe an increase in the price of every cryptocurrency over the year 2017, and particularly towards the end of the year. The growing trend started around April 2017, and kept on increasing. Towards the end of 2017, the rise in the price has been very steep. It is commonly believed that these cryptocurrencies are competitors in the market and that price hikes in one lead to a price fall in another. However, from the plots we observed that there is an almost monotonic change in the price of all the currencies simultaneously. They all followed similar trends of rise and fall over time. It can be further observed in Fig. 1 that the price of each cryptocurrency decreased sharply at the start of the year 2018. Although the price has been fluctuating over the year, it is noteworthy that there has been a monotonic change in the price across all cryptocurrencies, indicating the presence of a correlating factor among all.

To further analyze the similarity in their trends, we use the Pearson correlation coefficient between the price in all currencies over time, defined as $\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$. We report our results in Fig. 2. While the pair-wise correlation is high across all currencies, supporting the initial premise of this work, we observe significant correlation between Bitcoin, Dash and Litecoin price growth. Furthermore, we found a significant correlation between the price trend of Ethereum and Ripple. As such, the growth in one major cryptocurrency drives the growth in another similar currency, highlighting the speculative nature of the interdependent interactions between the currencies’ prices, and hinting at the potential generality of findings to other systems.

[Figure 2: heatmap of pair-wise price correlation, color scale from 0.80 to 1.00.]

            Eth   Bitcoin   Litecoin   Ripple   Dash
Eth         1.0   0.8       0.9        0.9      0.8
Bitcoin     0.8   1.0       0.9        0.8      0.9
Litecoin    0.9   0.9       1.0        0.8      0.9
Ripple      0.9   0.8       0.8        1.0      0.8
Dash        0.8   0.9       0.9        0.8      1.0

Fig. 2. Correlation of major cryptocurrencies exemplified through a heatmap.

A. Methodology

In this section, we outline our methodology for characterizing the price of Bitcoin and Ethereum, spanning data collection, data characteristics, and our approach. We outline the relevance of key indicators in our dataset towards the broader goal of making a price prediction model.

1) Data Collection: For this study, we crawled data related to the network features of Bitcoin and Ethereum using online resources. For Bitcoin, we used the public Blockchain and API provided by the exchange company “Blockchain” [5], which maintains data related to the Bitcoin network. One of the features used in our prediction model is the “number of wallets”. These are the wallets created solely on the exchange of “Blockchain” and are not related to other exchanges such as “Coinbase”. The memory pool (mempool) data shown in our work is also related to the information maintained by the mempool of the “Blockchain” full node. It is worth mentioning that the memory pool of nodes in the peer-to-peer settings of Bitcoin may vary due to peer positioning and the nature of transaction relay. However, all other features such as hash rate, price, number of bitcoins, and number of transactions are consistent across all exchanges and nodes in the network. From the “Blockchain” API, we collected data from 04/2016 to 05/2018. The dataset consists of features including the number of wallets, unspent transaction outputs (UTXOs), mempool size, block size, mean confirmation time, miner’s income, transactions per day, transactions per block, unique Bitcoin addresses, cumulative network’s hash rate, network’s difficulty, fee, fee per transaction, system-wide total bitcoins, trade volume, and the market price of Bitcoin.

For Ethereum, we followed the same procedure and collected data using the information provided by an Ethereum exchange “Etherscan” [25]. We collected data from 04/2016 to 05/2018 including features such as transaction growth, address count, ether supply, market cap, transaction fee, hashing power, difficulty, block time, gas limit, and gas used.

The price of a cryptocurrency can be influenced by internal features, external features, or both. Internal features include indicators that represent the network behavior such as mempool size, hash rate [23], etc. On the other hand, external features include crude oil price, government policies towards cryptocurrency exchanges, electricity charges, public sentiment [26], etc. In this paper, we focus on collecting internal features and determining their effect on price. Our rationale for this approach is driven by the fact that the internal features eventually account for the impact of external policies. For instance, if electricity cost is increased, some mining pools shut down [27]. As a result, the internal features including the hash rate and the block publishing time change. Since the external factors influencing the cryptocurrency market are eventually manifested in the internal network behavior, we primarily focus on collecting and analyzing internal features. External features, however useful, fall outside the scope of this paper.
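The min-max scaling and the pair-wise Pearson correlation used in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the two toy price series (`coin_a`, `coin_b`) are hypothetical, and the sketch assumes every series has at least two distinct values so that no denominator is zero.

```python
import math


def min_max(xs):
    """Scale a series into [0, 1]: z_i = (x_i - min(x)) / (max(x) - min(x))."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def pearson(xs, ys):
    """rho(X, Y) = Cov(X, Y) / sqrt(Var(X) * Var(Y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Pair-wise Pearson correlation between all named series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Toy, made-up prices on very different scales (hypothetical values).
prices = {
    "coin_a": [100.0, 150.0, 300.0, 250.0],
    "coin_b": [1.0, 1.4, 3.1, 2.4],
}
normalized = {name: min_max(values) for name, values in prices.items()}
matrix = correlation_matrix(normalized)
print(normalized["coin_a"])
```

Because the Pearson coefficient is unchanged by shifting and positively rescaling a series, the normalization only matters for plotting several currencies on one axis; the correlation matrix is the same whether it is computed on raw or normalized prices.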
[Figure 3: nine panels, each plotting min-max normalized values (0 to 1) against dates (mm/dd/yy).
(a) Normalized Features (Bitcoin): Wallets, Mempool, Hash Rate, Bitcoins, Price
(b) Normalized Features (Bitcoin): Transaction Cost, Difficulty, Miner’s Revenue, Fee, Orphaned Blocks, Price
(c) Normalized Features (Bitcoin): Block Size, Confirmation Time, Addresses, Price
(d) Normalized Features (Bitcoin): Tx per day, Tx per block, Price
(e) Normalized Features (Ethereum): Addresses, Block Size, Hash Rate, Ether, Price
(f) Normalized Features (Ethereum): Block Time, Difficulty, Fee, Price
(g) Normalized Features (Ethereum): Market Cap, Uncle Blocks, Gas Limit, Price
(h) Demand and Price (Bitcoin): Demand, Price
(i) Demand and Price (Ethereum): Demand, Price]

Fig. 3. Trends in the features captured from our dataset. Notice that the hash rate, the difficulty, and the transaction cost are highly correlated with the price.
The increase in demand (Total Wallets / Total Bitcoins) has led to an increase in the price. The features in Ethereum dataset are more correlated with price
than Bitcoin dataset. Furthermore, Ethereum also captures the demand and supply trend more accurately than Bitcoin.
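The demand measure named in the caption (Total Wallets / Total Bitcoins, then min-max normalized) can be sketched as below. The function names and the toy wallet and coin counts are hypothetical and chosen only to show the shape of the calculation; they are not values from the dataset.

```python
def demand_series(wallets, coins):
    """Demand proxy per time step: number of wallets per available coin."""
    return [w / c for w, c in zip(wallets, coins)]


def min_max(xs):
    """Scale a series into [0, 1]; assumes the series is not constant."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


# Toy, made-up counts: wallets grow quickly, coins grow by a small constant.
wallets = [1000.0, 1500.0, 2600.0, 4000.0]
coins = [100.0, 101.0, 102.0, 103.0]

demand = min_max(demand_series(wallets, coins))
print(demand)
```

With wallets growing faster than coins, the normalized demand rises from 0 to 1 over the window, which is the pattern the caption associates with the price increase.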

2) Data characteristics: Bitcoin and Ethereum involve the exchange of digital tokens, and their operations may vary at the application level. As mentioned earlier, Bitcoin belongs to the first generation of Blockchain (Blockchain 1.0) that only involves the exchange of digital coins. Ethereum, on the other hand, belongs to Blockchain 2.0, which offers development of smart contracts atop Blockchains. Smart contracts enable the users to make conditional changes in the exchange of coins by offering greater programmability with a broader use-case. Due to that, the dataset includes some common features among both cryptocurrencies such as hash rate, block size, etc., and some unique features such as gas limit, gas price, etc.

The number of wallets gives an estimate of how many new users join the platform every day. Although this measure is specific to the exchange, the other parameter known as “unique addresses” captures the growth of users in the overall cryptocurrency. For Bitcoin, we collected a total of 24,867,899 wallets and 464,173 unique addresses, while for Ethereum we collected a total of 812,183 addresses. In cryptocurrencies, the mempool is a repository for unconfirmed transactions prior to the mining process. The size of the mempool varies depending on the rate of the incoming transactions, the transaction backlog, and the rate of transaction mining.

In Bitcoin, the size of blocks is fixed at 1MB and the average block computation time is 10 minutes. In Ethereum, the block size is adjustable depending upon the transaction backlog and mean confirmation time. The average block computation time in Ethereum is between 10 and 20 seconds. We observed in our dataset that the maximum hash rate of Bitcoin was equal to 11,941,671 Terahashes per second (TH/second) with a difficulty parameter of 1,590,896,927,258, and the maximum hash rate of Ethereum was equal to 268,134 Gigahashes per second (GH/second) with a difficulty parameter of 3218.953. The total coins in Bitcoin and Ethereum at the time of our data collection were 17,055,012 and 99,687,139, respectively.

3) Analysis Metrics and Approach: In this paper, we analyze the attributes of the cryptocurrency system, exemplified by Bitcoin and Ethereum, that are influential on their price. To determine the contributing features towards price, we found the most highly correlated features in the dataset to explore general trends and insights about the two cryptocurrencies. Next, we estimated the change in user behavior (characterized by various attributes associated with users) that led to an increase or decrease in the price. For example, if the number of wallets is increasing, then there is a likelihood that more users are joining the network, which leads to a higher demand
for the fixed number of coins in the system. With the limited coin supply and high collective purchase power, the price (naturally) goes up. Using those highly correlated features, we train machine learning models to predict the price of Bitcoin and Ethereum over time. Towards that, we divided our data into a training dataset and a test dataset, and cross-validated the predicted outcome. With good accuracy, we were able to construct models for the top two cryptocurrencies that help in explaining their price trends.

IV. DATA ANALYSIS AND TRENDS

General trends. We analyze the trends in features of the dataset for each cryptocurrency. In order to do that, we normalize the data using the min-max normalization and plot various normalized features over time in Fig. 3. In Fig. 3(a) and Fig. 3(b) we observe that the number of wallets, the hash rate, the number of bitcoins, the cost per transaction, the difficulty, and the miner’s revenue change monotonically with the price. In Bitcoin, the mempool size and the fee varied over time, although they followed closely matching trends; the correlation between the fee and the mempool size was 0.82. When the mempool size grows, due to sudden high demand or while the Bitcoin network is under flood attacks [28], users naturally pay more to prioritize their transactions, which explains the high correlation between the mempool size and the transaction fee.

We also observed that in the Ethereum dataset, the features including addresses, hash rate, block time, and gas limit closely followed the changing trends in price. In Blockchain applications, it is possible that two miners come up with a valid block and only one of them gets accepted into the main chain. In Bitcoin, those rejected blocks are known as the “Orphaned Blocks” and in Ethereum they are called “Uncle Blocks”. From Fig. 3(b), it can be observed that in Bitcoin there is no link between the rate of orphaned blocks and price, but in Ethereum, from Fig. 3(g), there is a high correlation between the rate of uncle blocks and the price. One possible explanation for that is the block time in each cryptocurrency. In Bitcoin, the average block time is 10 minutes and it is less likely that two miners can come up with the same block within that time period. However, in Ethereum, the block time is very short, and when the price is increasing more miners attempt to mine blocks, which increases the possibility of uncle blocks.

Supply-and-demand trends. In cryptocurrencies, new coins are generated in the system when a block is published. Since the average block time is constant, the supply of new currency in the system is deterministic and linear. When new users join the cryptocurrency, new wallets and addresses are created. In Fig. 3(a), we observed that the number of wallets and addresses have increased non-uniformly in Bitcoin and Ethereum, raising the demand for the limited number of coins. Since the number of wallets grew at a higher rate than new coins, we can formulate this as a demand and supply model: a growing rate of wallets denotes that more users are joining Bitcoin, which leads to an increase in demand for the coins. Since the increase rate of coins is a small constant, the new coin supply to the system is less than the demand, which explains the primary cause of the price rise with the growing number of wallets.

We plot the min-max normalized number of wallets per available coins for Bitcoin and Ethereum in Fig. 3(h) and Fig. 3(i). We first calculated the number of wallets per bitcoin, and then normalized the number using the min-max normalization. We observed that there is an increase in the demand, which contributes to the price hike. We also noticed that the correlation between demand and price in Ethereum was higher (0.96) than in Bitcoin (0.74).

Examining External Features. It has been postulated in the literature [29] that the crude oil price may influence trends in the cryptocurrency market. The crude oil price affects the electricity tariffs worldwide, which in turn affect the operations of the mining pools. High electricity prices can force mining pools to shut down, and as a result, the hash power and the throughput of a cryptocurrency might decrease.

To examine that, we collected the price indexes of crude oil in the international market and observed its correlation with the price of Bitcoin and Ethereum. In Figure 4, we plot the normalized price indexes of cryptocurrencies with crude oil. Notice that the overall trend in oil prices differs from that of Bitcoin and Ethereum. In particular, since the start of 2018, while the price of cryptocurrencies decreased, the oil prices have increased considerably.

[Figure 4: line plot of normalized value (0 to 1) against dates (mm/dd/yy) for Bitcoin, Ethereum, and Oil.]

Fig. 4. Price trends observed in Bitcoin, Ethereum, crude oil. Notice that over time, while Bitcoin and Ethereum show similar trends in price changes, the oil prices have been distinctly different.

TABLE I
Regression analysis results, highlighting a moderate positive correlation between crude oil prices and Bitcoin (0.55), as well as Ethereum (0.44). However, the crude oil price is not used as a prediction feature in this study because it is below the correlation threshold (0.6), used for feature selection.

            Slope   Y-Intercept   Correlation Coefficient   Standard Error
Bitcoin     0.41    0.09          0.55                      0.02
Ethereum    0.35    0.08          0.44                      0.03

To further observe the patterns of similarity, we performed linear regression analysis to model the relationship between the independent variable (crude oil price) and the dependent variables (Bitcoin and Ethereum prices). We report our results in Table I. Overall, the results show a positive correlation between the crude oil price and the price of cryptocurrencies. In particular, Bitcoin has a higher correlation coefficient (0.55) than Ethereum (0.44). However, and as we show in the subsequent paragraph, for our prediction models, we only select features that have a minimum
[Figure 5: six panels, each plotting min-max normalized values (0 to 1) against dates (mm/dd/yy).
(a) Miner’s revenue and fee paid (Bitcoin): Miners Reward, Fee
(b) Trend of hash rate and difficulty (Bitcoin): Hash Rate, Difficulty
(c) Price with difficulty and hash rate (Bitcoin): Hash Rate, Confirmation Time, Difficulty, Price
(d) Miner’s revenue and fee paid (Ethereum): Miners Reward, Fee
(e) Trend of hash rate and difficulty (Ethereum): Hash Rate, Difficulty
(f) Ethereum change in difficulty: Hash Rate, Difficulty, Block Time]
Fig. 5. In 5(a), the miner's revenue is indicated by the Coinbase reward. 5(b) shows the increasing hash rate and the network's difficulty. Notice in 5(c) that when the network's difficulty is constant and the hash rate decreases, the price also decreases.
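The effect noted for Fig. 5(c), constant difficulty with a falling hash rate, can be illustrated numerically. The following is a minimal sketch, assuming the usual proof-of-work simplification that a single hash solves the block with probability target / 2^256; the target and hash rate values are hypothetical and are not taken from the dataset.

```python
# Sketch of expected block time under a constant target (constant difficulty).
# The target and hash rates below are hypothetical illustration values.

HASH_SPACE = 2 ** 256  # number of possible 256-bit hash outputs


def expected_block_time(target: int, hash_rate: float) -> float:
    """Expected seconds per block for a given target and hash rate (hashes/s)."""
    p_block = target / HASH_SPACE        # probability that one hash solves the block
    expected_hashes = 1.0 / p_block      # average number of hashes needed per block
    return expected_hashes / hash_rate   # seconds per block


# Pick a target that gives a 10-minute block at 1e18 hashes/s (hypothetical).
hash_rate = 1e18
target = HASH_SPACE // int(hash_rate * 600)

t_full = expected_block_time(target, hash_rate)
t_half = expected_block_time(target, hash_rate / 2)  # hash rate halves, target unchanged
print(round(t_full), round(t_half))
```

With the target held fixed, halving the hash rate doubles the expected block time, which in turn delays transaction confirmation.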

[Figures 6 and 7: correlation heatmaps with annotated cell values. Fig. 6 (Bitcoin) axes: wal, cpt, dif, hr, mr, tt, ua, fee, pr, utxo. Fig. 7 (Ethereum) axes: tt, ua, tc, mc, pr, hr, diff, gp, gl, gu.]

Fig. 6. Correlation matrix of Bitcoin. Here, wal, cpt, dif, hr, mr, tt, ua, fee, pr, and utxo denote number of wallets, cost per transaction, difficulty, hash rate, mining revenue, total transactions, unique addresses, fee, price, and unspent transaction output respectively.
Fig. 7. Correlation matrix of Ethereum. Here tt, ua, tc, mc, pr, hr, diff, gp, gl, and gu denote total transactions, unique addresses, market cap, price, hash rate, difficulty, gas price, gas limit, and gas used respectively.
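The feature selection behind Fig. 6 and Fig. 7 keeps only features whose correlation with the price exceeds 0.6. The following is a minimal sketch of that filter, assuming a Pearson correlation on the signed value (the text does not say whether absolute values are used); the function names and the toy series are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, price, threshold=0.6):
    """Keep the features whose correlation with price exceeds the threshold."""
    scores = {name: pearson(values, price) for name, values in features.items()}
    return {name: r for name, r in scores.items() if r > threshold}


# Hypothetical toy series; real inputs would be the daily network features.
price = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {
    "hash_rate": [10.0, 19.0, 31.0, 42.0, 48.0],  # rises with the price
    "fee": [5.0, 1.0, 4.0, 2.0, 3.0],             # weakly related to the price
}
selected = select_features(features, price)
print(sorted(selected))
```

In this toy input only `hash_rate` passes the threshold, mirroring how crude oil (below 0.6) is left out of the prediction features.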
correlation coefficient of 0.6. Since the correlation coefficient of crude oil is below our baseline criteria, we do not include it among the selected features for the prediction task.
Features for price prediction. To determine the most useful features in our dataset for price estimation, we calculated the correlation matrix of all data attributes. We report a subset of the correlation matrix in Fig. 6 and Fig. 7. It can be observed from the figures that the features in the Ethereum dataset are more highly correlated with the price than the features in the Bitcoin dataset. In Ethereum, the minimum and maximum correlation factors of the features with the price are 0.7 and 0.9, respectively, while in Bitcoin, the minimum and maximum correlations of the features with the price are 0.4 and 1.0, respectively. For our regression model and prediction, we selected features with a correlation coefficient greater than 0.6.
A. Effects of User Activity on Price
In this section, we try to explain how user activity, as captured by the highly correlated features, affects the price. Among them, features such as the number of wallets, the hash rate, and the UTXOs indicate the number of new users coming into the network, new miners joining the mining pools, and the aggregate spendable balance of all the users, respectively.
Wallets and Unique Addresses. As mentioned earlier, the increase in the number of wallets corresponds to greater demand of the limited coins in the system, which results in a
price hike. This reasoning can also be extended to the number of unique addresses and the number of transactions per day. The growth in these two features indicates more users coming into the system and making more transactions. As such, the increase in the number of users and user activity (transactions) corresponds to (possibly) more cash flowing into the system. Since cash flow in Bitcoin increases, the (collective) purchase power of users also increases. This implies that for fixed assets (bitcoins) owned by a user A in the system, there is some user B in the system who is willing to pay more for the same set of assets. In economics, the trend above is captured by a theory known as the “greater fool theory” [30], which states that the price of a commodity is determined by the expectations of users rather than by the commodity’s intrinsic value.
Difficulty and Hash Rate. Computing a block generates new coins in the system, which are given to the miner as a Coinbase reward. Miners earn coins from the Coinbase rewards and the fees paid by the users for transaction processing. As the price grows, the corresponding value of the miner’s income (in USD) also grows. In Fig. 5(a) and Fig. 5(d), we plot the miner’s income from our two datasets. We observed that the Coinbase rewards and fees have increased over time. With the growing incentive of income, more miners are joining the mining pools hoping to capitalize on the increasing monetary reward, which explains why the hash rate grows with the price.
Bitcoin. In Bitcoin, the difficulty is a measure of how long it takes to compute a block, which is defined by a target value set by the network [31]. Based on the hashing power, the target is adjusted every 2016 blocks (roughly two weeks) to keep the block mining time close to 10 minutes. The difficulty is recomputed based on the hashing power: if the hashing power increases, the probability of finding a block in under 10 minutes increases. To adjust the probability, the difficulty is raised by lowering the target. In Fig. 5(b) and Fig. 5(e), we plot the difficulty along with the network’s hashing rate for Bitcoin and Ethereum. In (1), we show how the block computation time, $T(B)$, is affected by the hashing rate, $H_r$, the target, $Target$, the probability of finding a block, $P_r(B)$, and the average number of hashes required to solve the target, $H$.
$$P_r(B) = \frac{Target}{2^{256}}; \quad H = \frac{1}{P_r(B)}; \quad T(B) = \frac{H}{H_r} \qquad (1)$$
Since the difficulty remains constant for 2016 blocks, we analyze how the mining pool size affects the price and the average block computation time. From our dataset, we found a window of time where the difficulty was constant and the hashing rate was reduced. For the same time interval, we found the mean confirmation time for transactions and the price. From (1) we inferred that, with constant $P_r(B)$, the block time $T(B)$ increases if $H_r$ is reduced, leading to a higher confirmation time for transactions and fewer Coinbase rewards per time unit, therefore leading to a fall in the price. In Fig. 5(c), we plot one case that happened in October 2017.
Ethereum. In Ethereum, the difficulty is adjusted after every block using the Homestead method described in [32]. Since Ethereum follows a different set of protocols than Bitcoin, its difficulty measure does not remain constant for a deterministic period of time (2016 blocks in Bitcoin). Due to that, there is no price fluctuation observed in Ethereum related to the change in the hash rate and constant difficulty. However, within our dataset, we noticed that on 15th October 2017, the difficulty measure of Ethereum decreased by 52% overnight while the hash rate was constant. Although this did not affect the price, it decreased the block computation time by 52%. We plot this observation in Fig. 5(f).
UTXOs. In Bitcoin, another important feature that contributes towards the price is the set of unspent transaction outputs (UTXOs). UTXOs are the spendable transaction outputs in wallets that are confirmed in the Blockchain. UTXOs determine the number of sellers in Bitcoin. Just as the increase in the number of wallets indicates more buyers in the system, more UTXOs indicate more sellers. UTXOs depend on the number of coins produced and the nature of the ongoing transactions. In our dataset, we observed that there is a high correlation between the price and the UTXO set.
As the UTXO set increases, there is more spendable balance in the system, which leads to an increase in the exchange of transactions (trade), which in turn increases the price of Bitcoin. The fall in Bitcoin price in 2018 can also be attributed to the fall in the UTXO set, indicating less spendable balance in the system and limited trade avenues for the users. This can be further linked to the decreasing interest of people in Bitcoin, which explains the decrease in price.
Gas. In Ethereum, gas is the “fuel” unit used in the execution of smart contracts. Each operation code instruction in a smart contract consumes a different number of gas units, which are summed up at the end of the smart contract execution to compute the total units of gas used. The transaction fee is calculated using the gas price and the units of gas used during the process. In our dataset, we observed that the amount of gas used in Ethereum had a high correlation with the price, indicating a user behavior related to the interest in smart contracts. A high use of gas can (possibly) mean that more smart contracts are being run on the Ethereum virtual machine (EVM), or that more computation-intensive operations are being performed while running smart contracts. In each case, it is indicative of a high user interest in Ethereum and smart contracts, which explains the price hike.
V. PREDICTIONS: EXPERIMENT AND RESULTS
In this section we build price prediction models for Bitcoin and Ethereum using features in our dataset. For the prediction models we take a supervised learning approach using regression, long short-term memory (LSTM) networks, and the conjugate gradient algorithm. Our results validate that network features can be used to accurately predict the price of a cryptocurrency.
A. Regression Approach
We consider three popular approaches: linear regression, regression with gradient boosting, and regression with random forest. We test our datasets with each method to find the optimum technique for the price prediction of Bitcoin and Ethereum. In the following, we review the conceptual primitives required for understanding each of those algorithms.
Linear regression. Linear regression (LR) is a method of predicting the future value of an unknown dependent variable by learning the values of a known independent variable [33].
[Figure 8: eight panels for the Bitcoin dataset, plotting normalized price or prediction error against data points, with Predicted and Test curves. Panels: (a) Predicted and test values (LR); (b) Error in predicted/test val (LR); (c) Design-based sampling (LR); (d) Predicted and test values (RF); (e) Error in predicted/test val (RF); (f) Design-based sampling (RF); (g) Predicted and test values (GB); (h) Design-based sampling (GB).]

Fig. 8. Results obtained from regression model applied on Bitcoin dataset with 30% test data. Notice in 8(a), high similarity in prediction and test values
indicate high accuracy. Also notice that random sampling always achieved a higher accuracy than design-based sampling. Due to low accuracy in design
based sampling as shown in 8(c), 8(f), and 8(h), there is a significant difference in the predicted and test price.
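The two data divisions compared in Fig. 8 can be sketched as follows. This is a minimal illustration, assuming that design-based sampling means a chronological split (train on the earliest rows, test on the latest); the function names, the seed, and the toy rows are hypothetical.

```python
import random


def random_split(rows, test_fraction, seed=0):
    """Random sampling: shuffle indices, then hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx = set(indices[:n_test])
    train = [r for i, r in enumerate(rows) if i not in test_idx]
    test = [r for i, r in enumerate(rows) if i in test_idx]
    return train, test


def chronological_split(rows, test_fraction):
    """Time-ordered split: train on the earliest rows, test on the latest."""
    n_test = round(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]


# Hypothetical daily observations as (day_index, price) pairs.
rows = [(day, 100.0 + day) for day in range(10)]
train_r, test_r = random_split(rows, 0.3)          # 30% test data, as in Fig. 8
train_c, test_c = chronological_split(rows, 0.2)   # 80/20 split on the time series
print(len(train_r), len(test_r), [day for day, _ in test_c])
```

In the random split, test days are interleaved with training days, so every test point has nearby training neighbors. In the chronological split, the model must extrapolate beyond the last training day.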

[Figure 9: eight panels for the Ethereum dataset, plotting price or prediction error against data points, with Predicted and Test curves. Panels: (a) Predicted and test values (LR); (b) Error in predicted/test val (LR); (c) Design-based sampling (LR); (d) Predicted and test values (RF); (e) Error in predicted/test val (RF); (f) Design-based sampling (RF); (g) Predicted and test values (GB); (h) Design-based sampling (GB).]

Fig. 9. Results obtained from Ethereum dataset. Notice that unlike Bitcoin, gradient boosting and random forest achieve higher accuracy of prediction with
low error compared to linear regression. Design-based sampling achieves lower accuracy than random sampling, however, it is higher compared to Bitcoin.

Provided data in the format $x = x_1, x_2, \ldots, x_n$, $y = y_1, y_2, \ldots, y_n$, where $x$ is the independent variable and $y$ is the dependent variable to be predicted, linear regression finds a line of best fit, $y = mx + b$, where $m$ is the coefficient of regression of $y$ on $x$, and $b$ is the y-intercept. For example, if the regression coefficient $m$ of $y$ on $x$ is 0.45 units, then $y$ will increase by 0.45 if $x$ increases by 1 unit. The accuracy of linear regression is determined by calculating the coefficient of determination $R^2$, also known as the least square fit. The least square fit minimizes the sum of squared differences between the predicted value and the real value, as given below:
$$R^2 = \sum_{i=1}^{n} (\Delta y_i)^2 = \sum_{i=1}^{n} [(m x_i + b) - y_i]^2 = \min \qquad (2)$$
The values of the regression coefficient $m$ and the y-intercept $b$ are computed by taking the partial derivatives of $R^2$ and setting them to 0:
$$m = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \qquad (3)$$
$$b = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \qquad (4)$$
The approach of using linear regression for modeling has been widely adopted in many applications. De Cock et al. [34] introduced a protocol for performing linear regression over a dataset distributed among multiple parties. Roy et al. [35] propose predicting financial market behavior based on a linear regression model.
Gradient Boosting. Gradient boosting (GB) uses residual fitting to minimize the loss function and improve the accuracy.
TABLE II
RESULTS OBTAINED FROM REGRESSION MODELS APPLIED ON BITCOIN AND ETHEREUM DATASETS WITH VARYING TEST DATA PERCENTAGE. LINEAR REGRESSION PERFORMS BEST WITH 10% TEST DATA WHILE GRADIENT BOOSTING AND RANDOM FOREST PERFORM BEST WITH 5% TEST DATA.

                          Linear Regression         Random Forest             Gradient Boosting
Dataset    Test Data (%)  R^2     RMSE    MAE       R^2     RMSE    MAE       R^2     RMSE    MAE
Bitcoin     5             0.9937  0.0207  0.0143    0.9970  0.0141  0.0072    0.9968  0.0146  0.0076
Bitcoin    15             0.9956  0.0175  0.0121    0.9933  0.0215  0.0108    0.9924  0.0228  0.0108
Bitcoin    25             0.9949  0.0179  0.0117    0.9914  0.0234  0.0105    0.9924  0.0221  0.0105
Bitcoin    35             0.9951  0.0170  0.0112    0.9893  0.0251  0.0109    0.9927  0.0206  0.0100
Bitcoin    50             0.9952  0.0162  0.0106    0.9899  0.0235  0.0105    0.9933  0.0191  0.0094
Ethereum    5             0.9559  0.0486  0.0289    0.9999  0.0028  0.0014    0.9999  0.0022  0.0012
Ethereum   15             0.8897  0.0718  0.0316    0.9984  0.0088  0.0026    0.9996  0.0041  0.0016
Ethereum   25             0.8964  0.0651  0.0298    0.9981  0.0087  0.0027    0.9992  0.0056  0.0017
Ethereum   35             0.9113  0.0593  0.0267    0.9978  0.0093  0.0028    0.9994  0.0050  0.0017
Ethereum   50             0.9277  0.0563  0.0262    0.9972  0.0110  0.0031    0.9995  0.0049  0.0016
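The closed-form slope and intercept of the linear regression fit, and the three scores reported in Table II, can be sketched for a single feature as below. This is a minimal illustration, assuming that the R^2 column is the usual coefficient of determination (1 minus the ratio of residual to total sum of squares) and that RMSE and MAE are averaged over the n test points; the helper names and the toy data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Closed-form least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    denom = n * sxx - sx ** 2
    m = (n * sxy - sx * sy) / denom
    b = (sxx * sy - sx * sxy) / denom
    return m, b


def metrics(ys, preds):
    """Return (R^2, RMSE, MAE) for targets ys and predictions preds."""
    n = len(ys)
    mean_y = sum(ys) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in ys)            # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(y - p) for y, p in zip(ys, preds)) / n
    return r2, rmse, mae


# Hypothetical feature/price pairs lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)
r2, rmse, mae = metrics(ys, [m * x + b for x in xs])
print(m, b, r2, rmse, mae)
```

The experiments use multiple features, so this single-variable form only shows the mechanics of the fit and the scoring.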

The loss functions, root-mean-square error (RMSE) and mean absolute error (MAE), are defined as:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - y_i^p)^2}, \quad MAE = \frac{1}{n}\sum_{i=1}^{n} |y_i - y_i^p| \qquad (5)$$
where $y_i$ is the i-th target value and $y_i^p$ is the i-th prediction value. To minimize the loss function value, a gradient descent approach is used to update the predictions based on a learning rate, $\alpha$.
$$y_i^p = y_i^p - \alpha * 2 * \sum_{i=1}^{n} (y_i - y_i^p) \qquad (6)$$
Gradient boosting allows updating the prediction values so that the sum of the residuals is minimum and the predicted values are close to the actual values. This approach is used for many applications as well. For example, Alonso et al. [36] research the wind energy prediction problem using Gradient Boosted Regression. Zhang et al. [37] propose a gradient boosting regression tree method to improve travel time prediction.
Random forest. Random forest (RF) is a supervised learning algorithm that builds multiple decision trees and combines them to make precise predictions [38]. Random forest creates random subsets of the features by drawing a bootstrap sample $Z^*$ of size $N$ from the training data and growing a random forest tree $T_b$ using these subsets recursively. It outputs the ensemble of trees $\{T_b\}_1^B$ and makes a prediction over a new point $x$ with regression using $\hat{f}_{rf}^{B}(x) = \frac{1}{B}\sum_{b=1}^{B} T_b(x)$. Random forest is robust against outliers and avoids overfitting. Various structures are predicted in the literature using this approach. Lin et al. [39] show the prediction of wind speed and direction using random forests. Sadeghi-Mobarakeh et al. [40] use a random forest model to predict the values in the electricity market.
For our first experiment, we formulated our problem as a multiple regression model based on highly correlated features in the dataset. We applied the random sampling method for data division and trained the model on linear regression, random forest regression, and gradient boosting. We changed the percentage of training and test data for each regression model and evaluated the performance using the regression coefficient $R^2$, RMSE, and MAE, defined in (2) and (5). We applied both random sampling and design-based sampling [41] for each regression model. In design-based sampling, we split the dataset into 80% training data and 20% test data based on the time series. In other words, we trained on data from April 2016 to January 2018, and predicted results from January 2018 to May 2018. We report our results in Fig. 8 and Fig. 9. We found higher accuracy in Ethereum price prediction compared to Bitcoin. We also noticed that, compared to random sampling, the design-based sampling achieved lower accuracy. This can be observed in Fig. 8(c), 8(f), and 8(h), as well as in Fig. 9(c), 9(f), and 9(h), which show a big difference between the predicted curve and the test curve. Unlike random sampling, the design-based sampling does not capture characteristics of data that lie in the unknown regions. It is similar to using past indexes to predict the future. Therefore, it does not lead to high accuracy. For more details about random and design-based sampling, we refer the reader to [41]. To further investigate the accuracy of prediction, we varied the percentage of test data from 5% to 50% and noted the change in the accuracy and error. We report our results in Table II. From our experiment, we made the following observations.
1) Linear regression achieved the highest accuracy and low error in the Bitcoin dataset, followed by the random forest and gradient boosting, respectively.
2) Gradient boosting achieved the highest accuracy in the Ethereum dataset, followed by the random forest and linear regression.
3) As the percentage of the training data decreased, the accuracy decreased and the error increased.
4) The maximum accuracy achieved in the Bitcoin dataset was 0.9957 with 10% test data.
5) The maximum accuracy achieved in the Ethereum dataset was 0.9999 with 10% test data.
6) Design-based sampling always achieved lower accuracy (a maximum of 0.901) compared to the random sampling.
7) In design-based sampling, linear regression outperformed gradient boosting and the random forest.
8) There is a more linear relationship between the Bitcoin features and its price, compared to Ethereum.
B. LSTM Approach
LSTM units are units of recurrent neural networks (RNNs) that can be used for prediction by keeping a continuous set of data for a long time [42]. RNNs constructed from LSTM
[Figure 10: four panels plotting normalized Bitcoin or Ethereum price against data points, with Actual, Trained, and Predicted curves. Panels: (a) Epoch = 10 (Bitcoin); (b) Epoch = 30 (Bitcoin); (c) Epoch = 10 (Ethereum); (d) Epoch = 30 (Ethereum).]

Fig. 10. Results obtained by applying LSTM networks prediction over the dataset. Note that prediction over Bitcoin is more accurate than Ethereum.
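An LSTM cell of the kind behind these predictions combines a forget gate, an input gate, a candidate state, and an output gate. The following is a minimal sketch of a single cell step, assuming the standard LSTM formulation with a scalar input and a scalar hidden state; the weights are hypothetical, and this is not the trained network used in the experiments.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step with a scalar hidden state and a scalar input.

    params maps each gate name to (weights over [h_prev, x_t], bias).
    """
    hx = [h_prev, x_t]
    f_t = sigmoid(dot(params["f"][0], hx) + params["f"][1])        # forget gate
    i_t = sigmoid(dot(params["i"][0], hx) + params["i"][1])        # input gate
    c_tilde = math.tanh(dot(params["c"][0], hx) + params["c"][1])  # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                             # new cell state
    o_t = sigmoid(dot(params["o"][0], hx) + params["o"][1])        # output gate
    h_t = o_t * math.tanh(c_t)                                     # new hidden state
    return h_t, c_t


# Hypothetical weights; a trained network would learn these.
params = {gate: ([0.5, -0.3], 0.1) for gate in ("f", "i", "c", "o")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:   # a short window of normalized prices
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

Feeding a window of past values through the cell one step at a time is what lets the hidden state summarize the recent price history before a prediction is made.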

[Figure 11: diagram of an LSTM cell with inputs $x_t$, $h_{t-1}$, and $C_{t-1}$, gate outputs $f_t$, $i_t$, and $o_t$, candidate state $\tilde{C}_t$, sigmoid ($\sigma$) and tanh layers, and outputs $C_t$ and $h_t$.]
Fig. 11. LSTM cell overview. $C_t$, $h_t$, $x_t$, $f_t$, $i_t$, and $o_t$ denote state of the cell, hidden state, input, output of forget gate, output of the input gate, and output of the output gate. $C_{t-1}$ and $h_{t-1}$ denote previous cell’s state.
units are also called LSTM networks, and are popular for making predictions based on time series data. For prediction purposes, three types of approaches are typically used: recurrent neural networks (RNNs), convolutional neural networks (CNNs), and autoregressive integrated moving average (ARIMA). Our choice of RNNs over other alternatives is driven by the results obtained in the literature. In one such work [15], it has been shown comparatively how RNNs and ARIMA perform in predicting Bitcoin price. In particular, it is noted that RNNs significantly outperformed ARIMA and achieved higher accuracy. Moreover, another work has shown that CNNs are well suited to perform predictive analysis on image or text-based samples [43]. To this end, and because RNNs (particularly, LSTM-based RNNs) are more suitable to our goal of prediction, we use them in this study.
Technically, an LSTM consists of memory cells where each cell is composed of three gates: a forget gate, an input gate, and an output gate. The gates are responsible for managing the information of each cell. The forget gate layer determines the information transfer based on the results of the sigmoid layer, $f_t$, where $W$ is weight, $b$ is bias, and $\odot$ is element-wise vector product as defined below:
$$f_t = \sigma(W_f \odot [h_{t-1}, x_t] + b_f) \qquad (7)$$
The input gate layer and the tanh layer decide the nature of the information to be stored in the cell. The sigmoid layer determines the value to update ($i_t$) and the tanh layer creates a new candidate, $\tilde{C}_t$, that is the state value of the cell.
$$i_t = \sigma(W_i \odot [h_{t-1}, x_t] + b_i) \qquad (8)$$
$$\tilde{C}_t = \tanh(W_C \odot [h_{t-1}, x_t] + b_C) \qquad (9)$$
The current state of the cell can be calculated by multiplying the old state of the cell, $C_{t-1}$, by the result of the forget gate, $f_t$, and adding the result of $i_t * \tilde{C}_t$, which is the product of the output of the input gate and the new candidate values.
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \qquad (10)$$
The hidden state, $h_t$, based on the revised state of the cell, is obtained by selecting the parts of the cell state to be output at the output gate, $o_t$, putting the current cell state into the tanh layer, and multiplying by the result of the sigmoid layer. The formulas for computing $o_t$ and $h_t$ are defined below in (11) and (12), while the model of this LSTM network is illustrated in Fig. 11.
$$o_t = \sigma(W_o \odot [h_{t-1}, x_t] + b_o) \qquad (11)$$
$$h_t = o_t * \tanh(C_t) \qquad (12)$$
We used the LSTM approach on our datasets of Bitcoin and Ethereum to build a price prediction model. Similar to our methods in linear regression, we used min-max normalization on dataset features and selected the features with a correlation factor greater than 0.6. We split the dataset into 80% training and 20% test subsets, and varied the number of epochs to observe the change in the prediction model. We set the batch size (subset size of training sample) to 1 and the look back value to 1. The look back value is the number of previous time steps to be utilized as input variables for prediction of the next time period. We tested various look back values (1–5 and 10–50), and chose 10 for our experiments based on the performance. We report our results in Fig. 10. Our results indicate that the error values, captured by RMSE and MAE, in test data for Bitcoin were low at 50 epochs (0.11 and 0.095), while the error values in test data for Ethereum were low at 30 epochs (0.13 and 0.1091). For the train data, the error values decreased as the number of epochs increased for both Bitcoin and Ethereum. In Table III and Table IV, we list the values of RMSE and MAE for training and test data obtained from our experiments. The results show that with LSTM, Bitcoin achieves higher accuracy with minimum error on each epoch. This also validates our results obtained in regression analysis.
C. Conjugate Gradient Approach
We also built a neural network and used the conjugate gradient algorithm with line search for price prediction. We normalize and split the data into 20% test and 80% training subsets. We train our network on 100 epochs and compute the training and validation errors. For this model evaluation, if the training and validation errors are high, the model is considered to be underfitting, and overfitting otherwise. In our experiment, we found the training error for Bitcoin was 0.00013, where the corresponding validation error was 0.00089. For Ethereum,
[Figure 12: four panels plotting normalized price against data points, with Predicted and Test curves. Panels: (a) Predicted/test values (Bitcoin); (b) Design-based sampling (Bitcoin); (c) Predicted/test values (Ethereum); (d) Design-based (Ethereum).]

Fig. 12. Results obtained from neural network for Bitcoin and Ethereum. With random sampling, the accuracy was higher than with design-based sampling, with low training and validation error. Prediction accuracy was higher for the Bitcoin dataset.

TABLE III
THE RESULTS OBTAINED FROM LSTM MODEL USED ON BITCOIN DATASET. THE RESULTS SHOW THAT WITH 50 EPOCHS RMSE AND MAE FOR TEST DATA WAS MINIMUM.

          Train Data            Test Data
Epochs    RMSE    MAE           RMSE    MAE
10        0.05    0.042485      0.17    0.163407
20        0.05    0.045599      0.14    0.135538
30        0.05    0.046075      0.13    0.118998
40        0.05    0.044236      0.12    0.106008
50        0.04    0.040621      0.11    0.094958

TABLE IV
THE RESULTS OBTAINED FROM LSTM MODEL USED ON ETHEREUM DATASET. THE RESULTS SHOW THAT WITH 30 EPOCHS RMSE AND MAE FOR TEST DATA WAS MINIMUM.

          Train Data            Test Data
Epochs    RMSE    MAE           RMSE    MAE
10        0.09    0.08287       0.15    0.123918
20        0.07    0.06937       0.14    0.114183
30        0.06    0.057834      0.13    0.109132
40        0.05    0.050322      0.14    0.113484
50        0.05    0.043871      0.15    0.125366

the training and validation errors were found to be 0.00026 and 0.00095, respectively. From this experiment, we notice that the validation error, while small, is slightly higher than the training error. Such a model is considered to be a good fit and we report our results in Fig. 12. For comparison, we also used the Hessian gradient descent optimization, which reduces training and validation error at a faster rate by using second derivative information for a better gradient direction. However, the overall margin of error with the Hessian algorithm was more than the conjugate gradient’s.
VI. CONCLUSION AND FUTURE WORK
In this paper, we look into analyzing cryptocurrency market price through a correlation analysis with various cryptocurrency attributes, exemplified by Bitcoin and Ethereum. We collect data spanning more than 20 months and estimate the most significant features that influence the price. We computed the correlation between features such as hash rate, number of users, transaction rate, total bitcoins and price. We map the change in features on users and network activities to understand the dynamics of the cryptocurrencies. We used our findings to construct a machine learning model that accurately predicts Bitcoin and Ethereum prices with the minimum error rate, based on other attributes than past price. Compared to the previous work that predicts Bitcoin price based on previous price observations, our approach is highly accurate.
Acknowledgement. This research was supported by National Research Foundation of Korea under NRF-2016K1A1A2912757 and 2017R1A4A1015675, and Chung-Ang University Research Grant (2019).
REFERENCES
[1] P. M. Krafft, N. D. Penna, and A. S. Pentland, “An experimental study of cryptocurrency market dynamics,” in Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, Canada, Apr 2018, p. 605. [Online]. Available: http://doi.acm.org/10.1145/3173574.3174179
[2] A. Sonewane, “Top 10 cryptocurrency 2017 - best cryptocurrency to invest,” 2017. [Online]. Available: https://goo.gl/D1cafv
[3] G. Hileman and M. Rauchs, “Global cryptocurrency benchmarking study,” Cambridge Centre for Alternative Finance, 2017.
[4] K. Sedgwick, “Statistics that reveal growing demand for the cryptocurrency,” 2017. [Online]. Available: https://goo.gl/yK5dyh
[5] B. Community, “Bitcoin block explorer - blockchain,” 2018. [Online]. Available: https://blockchain.info/
[6] M. Saad, L. Njilla, C. A. Kamhoua, and A. Mohaisen, “Countering selfish mining in blockchains,” CoRR, vol. abs/1811.09943, 2018. [Online]. Available: http://arxiv.org/abs/1811.09943
[7] A. Ahmad, M. Saad, M. Bassiouni, and A. Mohaisen, “Towards blockchain-driven, secure and transparent audit logs,” in International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, MobiQuitous, New York City, USA, Nov 2018, pp. 443–448. [Online]. Available: https://doi.org/10.1145/3286978.3286985
[8] M. Swan, Blockchain: Blueprint for a New Economy, 1st ed. O’Reilly Media, Inc., 2015. [Online]. Available: https://goo.gl/7wck2T
[9] A. E. Kosba, A. Miller, E. Shi, Z. Wen, and C. Papamanthou, “Hawk: The blockchain model of cryptography and privacy-preserving smart contracts,” in Proceedings of the 37th IEEE Symposium on Security and Privacy (Oakland), San Jose, CA, May 2016, pp. 839–858. [Online]. Available: https://doi.org/10.1109/SP.2016.55
[10] A. Norta, “Creation of smart-contracting collaborations for decentralized autonomous organizations,” in International Conference on Business Informatics Research. Springer, 2015, pp. 3–17.
[11] B. Community, “Crypto 2.0 comparison spreadsheet,” 2017. [Online]. Available: https://www.reddit.com/r/CryptoCurrency/comments/2921em/created_a_crypto_20_comparison_spreadsheet_and/
[12] V. Buterin et al., “A next-generation smart contract and decentralized application platform,” white paper, 2014.
[13] N. Indera, I. Yassin, A. Zabidi, and Z. Rizman, “Non-linear autoregressive with exogeneous input (NARX) bitcoin price prediction model using pso-optimized parameters and moving average technical indicators,” Journal of Fundamental and Applied Sciences, vol. 9, 2017. [Online]. Available: https://goo.gl/9pojUX
[14] K. Kohara, T. Ishikawa, Y. Fukuhara, and Y. Nakamura, “Stock price prediction using prior knowledge and neural networks,” Int. Syst. in Accounting, Finance and Management, vol. 6, no. 1, pp. 11–22, 1997.
[15] S. McNally, “Predicting the price of bitcoin using machine learning,” Ph.D. dissertation, Dublin, National College of Ireland, 2016.
[16] P. Vigna and M. J. Casey, The age of cryptocurrency: how bitcoin and the Blockchain are challenging the global economic order. Macmillan, 2016. [Online]. Available: https://goo.gl/tTJN2j
[17] M. Pilkington, “Blockchain technology: principles and applications,” Browser Download This Paper, 2015.
[18] C. Rose, “The evolution of digital currencies: Bitcoin, a cryptocurrency causing a monetary revolution,” The International Business & Economics Research Journal (Online), vol. 14, no. 4, p. 617, 2015.
[19] S. Omohundro, “Cryptocurrencies, smart contracts, and artificial intelligence,” AI matters, vol. 1, no. 2, pp. 19–21, 2014.
[20] G. W. Peters and E. Panayi, “Understanding modern banking ledgers through blockchain technologies: Future of transaction processing and smart contracts on the internet of money,” in Banking Beyond Banks and Money. Springer, 2016, pp. 239–278.
[21] A. Sapirshtein, Y. Sompolinsky, and A. Zohar, “Optimal selfish mining strategies in bitcoin,” in 20th International Conference on Financial Cryptography and Data Security, FC, Christ Church, Barbados, Feb 2016, pp. 515–532. [Online]. Available: https://doi.org/10.1007/978-3-662-54970-4_30
[22] M. Vasek, M. Thornton, and T. Moore, “Empirical analysis of denial-of-service attacks in the bitcoin ecosystem,” in Financial Cryptography and Data Security - FC Workshops, Christ Church, Barbados, Mar 2014, pp. 57–71. [Online]. Available: https://doi.org/10.1007/978-3-662-44774-1_5
[23] M. Saad and A. Mohaisen, “Towards characterizing blockchain-based cryptocurrencies for highly-accurate predictions,” in IEEE International Workshop on Hot Topics in Pervasive Mobile and Online Social Networking, HOTPOST, Honolulu, USA, April 2018, pp. 704–709. [Online]. Available: https://doi.org/10.1109/INFCOMW.2018.8406859
[24] H. Jang and J. Lee, “An empirical study on modeling and prediction of bitcoin prices with bayesian neural networks based on blockchain information,” IEEE Access, vol. 6, pp. 5427–5437, 2018.
[25] Etherscan, “The ethereum block explorer. ethereum charts and statistics,” 2018. [Online]. Available: https://etherscan.io/charts
[26] R. Bohme, N. Christin, B. Edelman, and T. Moore, “Bitcoin: Economics, technology, and governance,” Journal of Economic Perspectives, vol. 29, no. 2, pp. 213–38, 2015.
[27] L. Wang and Y. Liu, “Exploring miner evolution in bitcoin network,” in International Conference on Passive and Active Measurement PAM, New York, NY, USA, March 2015, pp. 290–302. [Online]. Available: https://doi.org/10.1007/978-3-319-15509-8_22
[28] M. Saad, M. T. Thai, and A. Mohaisen, “POSTER: deterring DDoS attacks on blockchain-based cryptocurrencies through mempool optimization,” in Proceedings of Asia Conference on Computer and Communications Security, ASIACCS, Incheon, Republic of Korea, Jun 2018, pp. 809–811. [Online]. Available: https://goo.gl/4kgiCM
[29] M. Gronwald, “The economics of bitcoins–market characteristics and price jumps,” Journal of Economic Perspectives, 2014.
[30] M. Oberholzer, Share prices: critical perspective of the greater fool theory. Potchefstroom: Noordwes-Universiteit, Potchefstroomkampus (Suid-Afrika), 2010.
[31] B. Community, “Difficulty in Bitcoin,” 2018. [Online]. Available: https://en.bitcoin.it/wiki/Difficulty
[32] P. Szilagyi, “Ethereum block validator,” 2018. [Online]. Available: https://goo.gl/sBLoD4
[33] R. G. Ahangar, M. Yahyazadehfar, and H. Pournaghshband, “The comparison of methods artificial neural network with linear regression using specific variables for prediction stock price in tehran stock exchange,” CoRR, vol. abs/1003.1457, 2010. [Online]. Available: http://arxiv.org/abs/1003.1457
[34] M. D. Cock, R. Dowsley, A. C. A. Nascimento, and S. C. Newman, “Fast, privacy preserving linear regression over distributed datasets based on pre-distributed data,” in Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, AISec Colorado, USA, Oct 2015, pp. 3–14. [Online]. Available: https://goo.gl/fr6Wti
[35] S. S. Roy, D. Mittal, A. Basu, and A. Abraham, “Stock market forecasting using LASSO linear regression model,” in Afro-European Conference for Industrial Advancement - Proceedings of the First International Afro-European Conference for Industrial Advancement, AECIA 2014, Addis Ababa, Ethiopia, 17-19, Nov 2014, pp. 371–381. [Online]. Available: https://goo.gl/majFkf
[36] Á. Alonso, A. Torres, and J. R. Dorronsoro, “Random forests and gradient boosting for wind energy prediction,” in Proceedings of 10th International Conference on Hybrid Artificial Intelligent Systems HAIS, Bilbao, Spain, Jun 2015, pp. 26–37. [Online]. Available: https://goo.gl/RuDhbH
[37] Y. Zhang and A. Haghani, “A gradient boosting method to improve travel time prediction,” Transportation Research Part C: Emerging Technologies, vol. 58, pp. 308–324, 2015.
[38] J. Friedman, T. Hastie, and R. Tibshirani, The elements of statistical learning. Springer series in statistics New York, 2001, vol. 1.
[39] Y. Lin, U. Krüger, J. Zhang, Q. Wang, L. A. Lamont, and L. E. Chaar, “Seasonal analysis and prediction of wind energy using random forests and ARX model structures,” IEEE Trans. Contr. Sys. Techn., vol. 23, no. 5, pp. 1994–2002, 2015. [Online]. Available: https://goo.gl/kjWEor
[40] A. Sadeghi-Mobarakeh, M. Kohansal, E. E. Papalexakis, and H. M. Rad, “Data mining based on random forest model to predict the california ISO day-ahead market prices,” in Proceedings of IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, ISGT, Washington, USA, Apr 2017, pp. 1–5. [Online]. Available: https://goo.gl/pQHMHi
[41] A. Abadie, S. Athey, G. W. Imbens, and J. M. Wooldridge, “Sampling-based vs. design-based uncertainty in regression analysis,” arXiv preprint arXiv:1706.01778, 2017.
[42] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[43] S. Mobin, B. Cheung, and B. A. Olshausen, “Convolutional vs. recurrent neural networks for audio source separation,” CoRR, vol. abs/1803.08629, 2018. [Online]. Available: http://arxiv.org/abs/1803.08629

Muhammad Saad is a Ph.D. student in the Department of Computer Science at the
University of Central Florida (UCF). At UCF, Saad is a member of the Security
Analytics and Research Lab (SEAL), advised by Prof. Aziz Mohaisen. His
research spans blockchain systems, with an emphasis on their attack surface.
His work has appeared in reputable venues including ASIACCS 2018, HotPOST
2018, and DLoT 2018. He won the best paper award at DLoT 2018 for his work on
blockchain-based audit logs.

Jinchun Choi is a Ph.D. student at the Department of Computer Science at the
University of Central Florida and the Department of Computer Information
Science of Inha University (joint Ph.D. program). He obtained his B.Eng. and
M.S. degrees from Inha University, Incheon, Korea, in 2011 and 2014,
respectively. He is a member of the Global Research Lab on Big Data Security
and is conducting research in the field of information security. In
particular, his interests include biometrics, network security, user
authentication, and IoT security.

DaeHun Nyang is a Full Professor at the Department of Computer Science and
Engineering at Inha University, Korea. He obtained his Ph.D. in Computer
Science from Yonsei University, Korea, in 2000. He is a member of the board of
directors and the editorial board of the Korean Institute of Information
Security and Cryptology and a section editor of the ETRI Journal. His research
interests include cryptography, privacy, usable security, network security,
and system security. He is a member of IEEE.

Joongheon Kim (M’06–SM’18) has been an assistant professor with Chung-Ang
University, Seoul, Korea, since 2016. He received his B.S. (2004) and M.S.
(2006) degrees from Korea University, Seoul, Korea; and his Ph.D. (2014)
degree from the University of Southern California (USC), Los Angeles, CA, USA.
In industry, he was with LG Electronics (Seoul, Korea, 2006–2009),
InterDigital (San Diego, CA, USA, 2012), and Intel Corporation (Santa Clara,
CA, USA, 2013–2016). He is a senior member of the IEEE. He was a recipient of
the Annenberg Graduate Fellowship with his Ph.D. admission from USC (2009) and
the Haedong Young Scholar Award (2018).

Aziz Mohaisen earned his M.Sc. and Ph.D. degrees from the University of
Minnesota in 2012. He is currently an Associate Professor at the University of
Central Florida, where he only directs the Security and Analytics Lab (SEAL).
Before joining UCF in 2017, he was an Assistant Professor at SUNY Buffalo
(2015–201) and a Senior Research Scientist at Verisign Labs (2012–2015). His
research interests are in the areas of networked systems and their security,
online privacy, and measurements. He is an Associate Editor of IEEE
Transactions on Mobile Computing, and is a senior member of ACM (2018) and
IEEE (2015).
