
The Time Series Forecasting: From The Aspect of Network

This document proposes a new method for time series forecasting based on converting time series data into networks and using link prediction. It introduces visibility graphs and link prediction, describes how to convert time series to networks and find related nodes. An experiment on stock price forecasting shows the method performs well with small training data compared to ARIMA.

epl draft

The time series forecasting: from the aspect of network

S. Chen¹, X. Lan¹, Y. Hu², Q. Liu³,⁴ and Y. Deng¹

¹ School of Computer and Information Science, Southwest University - Chongqing 400715, China
² Institute of Business Intelligence and Knowledge Discovery, Guangdong University of Foreign Studies, Guangzhou 510006, China
³ Department of Biomedical Informatics, Medical Center, Vanderbilt University, Nashville, 37235, USA
⁴ School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200030, China

arXiv:1403.1713v1 [physics.data-an] 7 Mar 2014

PACS 89.75.Hc – Networks and genealogical trees


PACS 05.40.Fb – Random walks
PACS 89.65.-s – Social and economic systems

Abstract – Forecasting estimates the state of events from historical data and is important
in many disciplines. At present, time series models are used to solve forecasting problems in
various domains. In general, researchers use curve fitting and parameter estimation methods
(moment estimation, maximum likelihood estimation and the least squares method) to forecast.
In this paper, a new perspective on forecasting is given and a completely different method is
proposed to forecast time series. Inspired by the visibility graph and link prediction, this
letter converts a time series into a network and then finds the nodes that are most likely to
link with the predicted node. Finally, the predicted value is obtained according to the state
of the links. The TAIEX data set is used in the case study to illustrate that the proposed
method is effective. Compared with the ARIMA model, the proposed method shows good forecasting
performance when there is a small amount of data.

Introduction. – Forecasting estimates the future state of events from historical data, and it
is important in many disciplines, such as finance, meteorology and industry. At present, many
time series models are used to solve forecasting problems.

For instance, exponential smoothing methods were introduced in the 1950s through the work of
Brown and have been developed considerably since then. In general, the smoothing parameters are
restricted to the range 0 to 1, but the usual intervals may produce non-invertible models. The
autoregressive moving average (ARMA) model is an important method for studying time series. The
concepts of autoregressive (AR) and moving average (MA) models were formulated in the works of
Yule, Slutsky, Walker and Yaglom. The autoregressive integrated moving average (ARIMA) model [1]
is based on the ARMA model. The difference is that the ARIMA model converts a non-stationary time
series into a stationary one before applying the ARMA model. ARMA and ARIMA models are widely
used to predict linear time series. To predict non-linear time series, other models have been
proposed. The artificial neural network (ANN) is useful for nonlinear processes, so it is applied
in forecasting [2, 3]. In an artificial neural network, the inputs or variables are filtered
through one or more hidden layers, and the intermediate output is related to the final output.
Besides exponential smoothing methods and the ARMA, ARIMA and ANN models, there are other methods
for studying time series, such as the autoregressive conditional heteroscedastic (ARCH) model,
the generalized autoregressive conditional heteroscedastic (GARCH) model, long memory models,
structural models and so forth. These methods have their respective characteristics.

In this letter, a new perspective on forecasting is given. Unlike the existing methods, the
proposed method forecasts the time series according to the network structure. Inspired by the
visibility graph and link prediction, this letter converts a time series into a network by means
of the visibility graph and finds the relationship between the predicted node and the other nodes
by means of link prediction. Link prediction estimates the likelihood that a link exists between
two nodes; commonly, the more similar two nodes are, the more likely they are to be connected. In
other words, link prediction tells us which nodes the predicted node will link with. A method
that converts the network back into a time series is then proposed to calculate the value of the
predicted node according to the state of the links.

This letter presents a new aspect of forecasting. The proposed method is compared with the
classic ARIMA model in forecasting stock prices. We use time series of different lengths as
training sets. The experimental results show that when the training set is small, the proposed
method forecasts better than the ARIMA model. In general, the forecasting performance of the
proposed method is favorable.

Preliminary knowledge. –

The visibility graph. Complex networks are widely used in many disciplines [4–13]. Recently,
a tool called the visibility graph was proposed to transform a time series into a network, which
builds a bridge between time series and networks. The visibility graph method was proposed by
L. Lacasa et al. in 2008 [14]. In the visibility graph, the values of the time series are plotted
as vertical bars. A vertical bar links with others according to the visibility criterion
established in [14] as follows:

Definition 1 Two data values $(t_1, y_1)$ and $(t_2, y_2)$ have visibility if every other value
$(t_3, y_3)$ placed between them fulfills:

$$ y_3 < y_2 + (y_1 - y_2)\,\frac{t_2 - t_3}{t_2 - t_1} \tag{1} $$

Geometrically, every intermediate bar must lie below the straight line joining the tops of the
two bars.

Fig. 1: The histogram shows a time series with 10 data values, and the associated graph is
obtained from it by the visibility algorithm. In the histogram, if a bar can be seen from the top
of the considered one, the two are linked. If two bars are linked in the histogram, the nodes
that represent them are linked in the associated graph.

The associated graph derived from a time series has the following properties:
1) Connected: a node can see at least its nearest neighbors.
2) Undirected: the associated graph extracted from a time series is undirected.
3) Invariant under affine transformations of the series data: the visibility criterion is
unchanged when both the horizontal and vertical axes are rescaled.

The visibility graph has since been applied in many disciplines to analyze time series [15–21].

Link prediction. Recently, link prediction has received much attention in complex networks.
It finds the links that are likely to appear in the network. Reference [22] proposes an efficient
method for link prediction. This method is based on a local random walk and identifies possible
links by calculating the similarity between two nodes. The more similar two nodes are, the more
likely they are to be connected. The main process is described as follows.

Suppose there is an undirected network $G(V, E)$, where $V$ is the set of nodes and $E$ is the
set of links. For each pair of nodes $x, y \in V$, a score $S_{xy}$ denotes the similarity
between node $x$ and node $y$.

The random walk process is described by the transition probability matrix $R$, with
$R_{xy} = a_{xy}/k_x$ the probability that a walker at node $x$ moves to node $y$ in the next
step. Here $a_{xy} = 1$ if node $x$ links with node $y$ and $a_{xy} = 0$ otherwise, and $k_x$ is
the degree of node $x$. The following equation gives the probability that a walker starting from
node $x$ is located at each of the other nodes after $t$ steps:

$$ \vec{\pi}_x(t) = R^{T}\,\vec{\pi}_x(t-1) \tag{2} $$

where $\vec{\pi}_x(0)$ is an $N \times 1$ vector with the $x$-th element equal to 1 and the
others equal to 0, and $\pi_{xy}(t)$ denotes the $y$-th element of $\vec{\pi}_x(t)$. The
similarity between node $x$ and node $y$ is then defined as follows:

$$ S^{LRW}_{xy}(t) = \frac{k_x}{2M}\,\pi_{xy}(t) + \frac{k_y}{2M}\,\pi_{yx}(t) \tag{3} $$

where $M$ is the number of links, and $S_{xy} = S_{yx}$. To let a random walker circulate locally
rather than go too far away, the following equation superposes the contribution of each walker:

$$ S^{SRW}_{xy}(t) = \sum_{i=1}^{t} S^{LRW}_{xy}(i) \tag{4} $$

The authors of [22] showed that this link prediction method based on the local random walk is
efficient: compared with six well-known methods on five real networks, it gives better
predictions than the three local similarity indices.

Proposed method. – Inspired by the visibility graph and link prediction, we propose a new
method to forecast time series from the aspect of the network. The time series is converted into
a network to exploit the relationship between the predicted node and the other nodes. This
network is then converted back into a time series by our method to calculate the value of the
predicted node. The detailed process is described as follows.

Suppose there is a time series $(t_1, x_1), (t_2, x_2), \cdots, (t_i, x_i), \cdots,
(t_n, x_n)$, where $x_i$ is the data value and $t_i$ is the time at which $x_i$ is observed.

Step 1: The time series is converted into a network by using the visibility graph algorithm. A
node $y$, which denotes the datum to be predicted, is then added to the network. Every node sees
at least its nearest neighbors, so the predicted node $y$ should link with the node that denotes
the last datum in the time series.
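Step 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `visibility_edges` and `add_predicted_node` and the five-point sample series are hypothetical, and the graph is stored as a plain set of index pairs.

```python
from itertools import combinations

def visibility_edges(t, y):
    """Return the undirected edges (a, b), a < b, of the visibility graph."""
    edges = set()
    for a, b in combinations(range(len(y)), 2):
        # Eq. (1): every point between a and b must lie strictly below
        # the straight line joining (t[a], y[a]) and (t[b], y[b]).
        if all(
            y[c] < y[b] + (y[a] - y[b]) * (t[b] - t[c]) / (t[b] - t[a])
            for c in range(a + 1, b)
        ):
            edges.add((a, b))
    return edges

def add_predicted_node(edges, n):
    """Step 1: attach the predicted node (index n) to the last datum (index n - 1)."""
    return edges | {(n - 1, n)}

# Hypothetical sample series, chosen only for illustration.
t = [1, 2, 3, 4, 5]
x = [3.0, 1.0, 2.0, 0.5, 4.0]
g = visibility_edges(t, x)
g_new = add_predicted_node(g, len(x))
```

Adjacent points are always linked because no point lies between them, which is the "connected" property listed above. The brute-force check is cubic in the series length, which is acceptable for the short training windows used in the case study.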


[Figure 2: the original network together with the added node]

Fig. 2: Suppose the original network is the one shown in fig. 1. The added node denotes the datum
which needs to be predicted. It must link with the last datum in the time series.
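The local-random-walk similarity of Eqs. (2)-(4), which is applied to the extended network in Step 2, can be sketched with NumPy as follows. This is a hedged sketch: the function name `srw_similarity`, the three-node path graph and the choice of two steps are assumptions made for illustration, since the number of steps $t$ is not specified in the text.

```python
import numpy as np

def srw_similarity(adj, steps):
    """Superposed random walk similarity, Eqs. (2)-(4).

    adj is a symmetric 0/1 adjacency matrix without isolated nodes
    (a visibility graph is connected, so every degree is positive).
    """
    adj = np.asarray(adj, dtype=float)
    k = adj.sum(axis=1)              # degrees k_x
    two_m = adj.sum()                # 2M: every undirected link is counted twice
    R = adj / k[:, None]             # R[x, y] = a_xy / k_x
    pi = np.eye(len(adj))            # column x holds pi_x(0)
    srw = np.zeros_like(adj)
    for _ in range(steps):
        pi = R.T @ pi                # Eq. (2), for all starting nodes at once
        weighted = pi.T * (k[:, None] / two_m)   # [x, y] = k_x / (2M) * pi_xy(t)
        srw += weighted + weighted.T             # Eq. (3), summed as in Eq. (4)
    return srw

# Hypothetical example: the path graph 0 - 1 - 2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
S = srw_similarity(adj, steps=2)
```

Keeping all starting nodes as columns of one matrix avoids a loop over $x$; the result is symmetric by construction, in line with $S_{xy} = S_{yx}$.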

With the node $y$ attached to the last datum of the series, a new network is obtained. Fig. 2
shows this process.

Step 2: The link prediction method is applied to the new network. According to Eq. (2), Eq. (3)
and Eq. (4), the similarity between the added node $y$ and each of the other nodes $x_i$ is
obtained. An implicit assumption is that a link denotes the similarity between its two endpoints:
the higher the similarity between two nodes, the more likely they are to be linked. A threshold
value is therefore set to judge whether two nodes are linked with each other:

$$ T = \frac{\sum_{i=1}^{n} S_{y x_i}}{n} \tag{5} $$

where $n$ is the number of nodes in the original network and $S_{y x_i}$ is the similarity
between the added node $y$ and the node $x_i$. To judge whether the node $x_i$ links with the
added node $y$, we have:

$$ E_{y x_i} = \begin{cases} 1, & S_{y x_i} > T \\ 0, & S_{y x_i} \le T \end{cases} \tag{6} $$

$E_{y x_i}$ is the edge between the node $x_i$ and the added node $y$, and $E_{y x_i} =
E_{x_i y}$. This equation means that if $S_{y x_i} > T$, the added node $y$ links with the node
$x_i$; if $S_{y x_i} \le T$, it does not. In particular, the last node $x_n$ must link with $y$,
so $E_{y x_n} = 1$.

After this step, we know which nodes link with the added node $y$, and the structure of the new
network is clear. The value of the node $y$ is obtained in the next step.

Step 3: In this step, the network is converted into a time series again and the value of the
added datum, which needs to be predicted, is obtained.

In the new time series, the order of the data $x_i$ is the same as before they were converted
into the network, and so are their values. The added datum $y$ is the last datum in the new time
series and its value needs to be calculated. This new time series is plotted by using vertical
bars. In the histogram, suppose that $x_j$ (one of the $x_i$) links with $y$ and that $x_j^*$,
observed at time $t_j^*$, is the last datum that $x_j$ can see on its right side. We have:

$$ y_j = \frac{x_j^* - x_j}{t_j^* - t_j}\,(t - t_j) + x_j \tag{7} $$

where $t$ is the time when $y$ appears. That is, $y_j$ is the value at time $t$ on the straight
line through $(t_j, x_j)$ and $(t_j^*, x_j^*)$. The value of the datum $y$ is defined as follows:

$$ y = \min_j (y_j) \tag{8} $$

In particular, the node $x_n$ is the last one in the original time series. There is no node on
the right side of $x_n$, so $x_n$ is not considered in the above calculation. Fig. 3 shows the
process of obtaining the value of $y$.

[Figure 3: the extended network and the corresponding bar plot over x5, x7, x8, x10 and y, with
the candidate values y5 and y8]

Fig. 3: Suppose that, according to Step 2, the node y links with the node x5 and the node x8.
The network is then converted into a time series again and the new time series is plotted by
using vertical bars. In the histogram, x7 is the last datum that x5 can see on its right side and
x10 is the last datum that x8 can see on its right side. According to Eq. (7), the values of y5
and y8 can be obtained. Because the value of y8 is less than the value of y5, Eq. (8) gives
y = y8. The predicted value is y8.

Case Study. – Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) values in 2012
are used to verify the forecasting performance of the proposed method. In this letter, these data
are regarded as continuous data. The autoregressive integrated moving average (ARIMA) model is
used for comparison with the proposed method. To estimate the forecasting accuracy of the
proposed method, the root mean square error (RMSE) is used as a performance measure:

$$ \mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} |\mathrm{actual}(t) - \mathrm{forecast}(t)|^{2}}{n}} \tag{9} $$

The proposed method and the ARIMA model use the training set to forecast the next datum in the
time series. As the prediction proceeds, the actual stock price is added to the training data and
the oldest datum is deleted, so the number of training data remains unchanged. The training data
comprise three groups. Group 1 originally consists of the data of October (21 records), group 2
originally consists of the data of August, September and October (64 records), and group 3
originally consists of the data of June, July, August, September and October (107 records). These
groups are respectively used to forecast the stock price values of November.
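Steps 2 and 3 and the error measure of Eq. (9) can be sketched as follows. The sketch assumes that the similarity scores $S_{y x_i}$ are already available as a list; the names `sees`, `predict_value` and `rmse`, the sample series and the sample scores are hypothetical. The fallback used when no node other than $x_n$ links with $y$ is an assumption of this sketch, since the text does not cover that case.

```python
import math

def sees(t, x, a, b):
    """Visibility criterion of Eq. (1) between indices a < b."""
    return all(
        x[c] < x[b] + (x[a] - x[b]) * (t[b] - t[c]) / (t[b] - t[a])
        for c in range(a + 1, b)
    )

def predict_value(t, x, scores, t_next):
    """Threshold the similarities (Eqs. 5-6), then extrapolate (Eqs. 7-8).

    scores[i] is the similarity between the added node y and node x_i.
    """
    n = len(x)
    threshold = sum(scores) / n                                  # Eq. (5)
    # Eq. (6); the last node x_n always links with y but is left out of
    # the calculation because it has no neighbour on its right side.
    linked = [i for i in range(n - 1) if scores[i] > threshold]
    candidates = []
    for j in linked:
        # x_j^*: the last datum that x_j can see on its right side.
        j_star = max(b for b in range(j + 1, n) if sees(t, x, j, b))
        slope = (x[j_star] - x[j]) / (t[j_star] - t[j])
        candidates.append(slope * (t_next - t[j]) + x[j])        # Eq. (7)
    # Assumption (not in the source): repeat the last value if no candidate exists.
    return min(candidates) if candidates else x[-1]              # Eq. (8)

def rmse(actual, forecast):
    """Root mean square error, Eq. (9)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

# Hypothetical inputs, chosen only for illustration.
t = [1, 2, 3, 4, 5]
x = [3.0, 1.0, 2.0, 0.5, 4.0]
scores = [0.9, 0.1, 0.8, 0.2, 1.0]
y_hat = predict_value(t, x, scores, t_next=6)
```

In this example the threshold is 0.6, so y links with the first and third data; both see the last datum as their right-most visible bar, giving the candidates 4.25 and 6.0, and Eq. (8) selects 4.25.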


The forecasts obtained with each group are shown in Fig. 4, Fig. 5 and Fig. 6. Table 1 shows the
root mean square errors of the ARIMA model and the proposed method.

[Figures 4-6: stock price (6500 to 8000) against time (2012/11/5 to 2012/11/25); curves: actual
stock price, forecast by proposed method, forecast by ARIMA]

Fig. 4: The training data consists of the data of October (21 records). The figure shows the
predicted stock price of November obtained by ARIMA and the proposed method, respectively.

Fig. 5: The training data consists of the data of August, September and October (64 records).
The figure shows the predicted stock price of November obtained by ARIMA and the proposed method,
respectively.

Fig. 6: The training data consists of the data of June, July, August, September and October
(107 records). The figure shows the predicted stock price of November obtained by ARIMA and the
proposed method, respectively.

Table 1: The RMSE of ARIMA and the proposed method

Group    RMSE (ARIMA)    RMSE (the proposed method)
1        158.06          116.00
2        103.78           78.29
3         82.91           78.15

[Figure 7: RMSE (0 to 200) against the group (1, 2, 3) for ARIMA and the proposed method]

Fig. 7: This figure shows the RMSE of ARIMA and the proposed method. For every training-set
size, the RMSE of the proposed method is lower than that of ARIMA.

Table 1 shows that the proposed method has a lower RMSE than the ARIMA model in all three groups.
The advantage is largest when there is a small amount of data: with group 1 (21 records) the RMSE
is 116.00 against 158.06 for ARIMA, whereas with group 3 (107 records) the gap narrows to 78.15
against 82.91. As the number of data increases, the accuracy of both the proposed method and the
ARIMA model improves. The results show that the proposed method is effective and that, when there
is a small amount of data, it achieves better predictive performance than the ARIMA model.

Conclusion. – In this letter, a new forecasting method is proposed. This new method forecasts
data from the aspect of the network. The time series is converted into a network to find out
which data will link with the datum which needs to be predicted. Then, the network
which contains the relationship information between the predicted datum and the other data is
converted into a time series again to forecast the value of the datum which needs to be
predicted. The case study illustrates that the proposed method is effective. Compared with
ARIMA, the proposed method shows good forecasting performance when there is a small amount of
data.

∗∗∗

The work described in this letter is partially supported by the Chongqing Natural Science
Foundation (Grant No. CSCT, 2010BA2003), the National Natural Science Foundation of China (Grant
Nos. 61174022 and 71271061), the National High Technology Research and Development Program of
China (863 Program) (No. 2013AA013801), the Science and Technology Planning Project of Guangdong
Province, China (2010B010600034, 2012B091100192), the Doctor Funding of Southwest University
(Grant No. SWU110021) and the Fundamental Research Funds for the Central Universities
(No. XDJK2014D008).

REFERENCES

[1] Box G. E., Jenkins G. M. and Reinsel G. C., Wiley.com, (2013)
[2] Hippert H. S., Pedreira C. E. and Souza R. C., IEEE Trans. Power Syst., 16 (2001) 1
[3] Zhang G., Eddy Patuwo B. and Hu M., International Journal of Forecasting, 14 (1998) 1
[4] Watts D. J. and Strogatz S. H., Nature, 393 (1998) 6684
[5] Barabási A.-L. and Albert R., Science, 286 (1999) 5439
[6] Newman M. E. J., Oxford University Press, New York, (2009)
[7] Liu H., Lu J., Lü J. and Hill D. J., Automatica, 45 (2009) 8
[8] Vidal M., Cusick M. E. and Barabási A.-L., Cell, 144 (2011) 6
[9] Wei D. J., Deng X. Y., Zhang X. G., Deng Y. and Mahadevan S., Phys. A, 392 (2013) 10
[10] Donges J. F., Donner R. V. and Kurths J., Europhys. Lett., 102 (2013) 1
[11] Wei D. J., Liu Q., Zhang H. X., Hu Y., Deng Y. and Mahadevan S., Sci. Rep., 2013 (Doi:10.1038)
[12] Chen D. B., Gao H., Lü L. and Zhou T., PLoS ONE, 8 (2013) 10
[13] Liu J. G., Ren Z.-M. and Guo Q., Phys. A, 392 (2013) 18
[14] Lacasa L., Luque B., Ballesteros F., Luque J. and Nuño J. C., Proc. Natl. Acad. Sci. USA, 105 (2008) 13
[15] Telesca L. and Lovallo M., Europhys. Lett., 97 (2012) 5
[16] Lacasa L., Luque B., Luque J. and Nuño J. C., Europhys. Lett., 86 (2009) 3
[17] Mehraban S., Shirazi A., Zamani M. and Jafari G., arXiv:1301.1010, (2013)
[18] Ahmadlou M., Adeli H. and Adeli A., Physica A, 391 (2012) 20
[19] Telesca L., Lovallo M., Ramirez R. A. and Flores M. L., Physica A, 392 (2013) 24
[20] Donner R. V. and Donges J. F., Acta Geophysica, 60 (2012) 3
[21] Pierini J. O., Lovallo M. and Telesca L., Acta Geophysica, 391 (2012) 20
[22] Liu W. and Lü L., Europhys. Lett., 89 (2010) 5
