
A novel algorithmic trading strategy using data driven

This paper presents a novel algorithmic trading strategy that utilizes a data-driven approach to estimate innovation volatility, improving upon traditional Kalman filtering methods. The proposed data-driven innovation volatility forecast (DDIVF) demonstrates better performance in trading strategies compared to commonly used methods, particularly in volatile markets. The study highlights the robustness of the DDIVF in handling varying initial values and its application in pairs and multiple trading strategies using cointegrated exchange-traded funds.
A Novel Algorithmic Trading Strategy Using Data-Driven Innovation Volatility


You Liang, Department of Mathematics, Ryerson University, Toronto, Canada, [email protected]
Aerambamoorthy Thavaneswaran, Department of Statistics, University of Manitoba, Winnipeg, Canada, [email protected]
Md. Erfanul Hoque, Department of Statistics, University of Manitoba, Winnipeg, Canada, [email protected]

2020 IEEE Symposium Series on Computational Intelligence (SSCI), December 1-4, 2020, Canberra, Australia

Abstract—The explosion of algorithmic trading has been one of the most prominent recent trends in the finance industry. Regularized estimating functions including Kalman filtering (KF) allow dynamic data scientists and algo traders to enhance the predictive power of statistical models and improve trading strategies. Recently there has been a growing interest in using KF in pairs trading. However, a major drawback is that the innovation volatility estimate calculated by using a KF algorithm is always affected by the initial values and outliers. A simple yet effective data-driven approach to estimate the innovation volatility with some robustness properties is presented in this paper. The results show that the performance of the trading strategy based on the data-driven innovation volatility forecast (DDIVF) is better than the commonly used KF-based innovation volatility forecast (KFIVF). Autocorrelations of the absolute values of the innovations in multiple trading are used to demonstrate that the innovations are non-normal with time-varying volatility. We describe and analyze experiments on three cointegrated exchange-traded funds (ETFs) and explain how our approach can improve the performance of the trading strategies. A proposed novel trading strategy for multiple trading with robustness to initial values and to the volatile stock market is also discussed in some detail by using a training sample and a test sample.

Index Terms—Pairs Trading, Multiple Trading, Kalman Filter, Data-Driven Volatility, Robustness, Volatile Market

I. INTRODUCTION

Supervised learning ([1]) is the most widely utilized form of machine learning. Its goal is to predict the response from the associated features. Regularization ([2]) puts extra constraints on a machine learning model, and these constraints and penalties are designed to encode specific kinds of prior knowledge. Consider the linear regression model

\[ y = A\theta + \varepsilon \tag{1} \]

with no intercept, where $y$ is the $p \times 1$ vector of responses, $A$ is the $p \times m$ matrix of standardized features, and $\varepsilon$ is the $p \times 1$ vector of independent and identically distributed (i.i.d) normal errors. The regularized least squares estimate minimizes the sum of the objectives

\[ J_1 = \|y - A\theta\|_2^2, \qquad J_2 = p_\lambda(\theta), \]

where $J_1$ is the minimization objective for the regression residuals, $J_2$ is the minimization objective for the prior information, and $\lambda$ is a tuning parameter. Lasso estimates are viewed as $L_1$-penalized least squares estimates with penalty $J_2 = \lambda\|\theta\|_1$.

As an extension to the dynamic setup, we consider the filtering problems and state space models ([3]). Kalman filtering (KF), non-normal filtering and partially Bayes maximum informative filtering have algorithmic trading applications in quantitative finance ([4], [5]). In dynamical learning, KF offers a computationally efficient recursive procedure to learn the dynamical systems using prior knowledge.

First the information based estimating function (EF) approach is introduced to study the robust filtering problems. Consider a probability space $(\Omega, \mathcal{F}, P)$, on which $y$ and $\theta$ are jointly distributed random variables, and $\theta$ is real valued. An EF for $\theta$ is a real valued function, denoted by $g(y; \theta)$, and it is unbiased if $E[g(y; \theta)] = 0$. The information matrix associated with $g$ is defined by

\[ I_g = E[g g'] = (E[\partial g/\partial\theta])' \, (E[g g'])^{-1} \, (E[\partial g/\partial\theta]). \]

[6]–[8] illustrate the combined EFs approach in a number of linear non-Gaussian process filtering problems in the scalar case. Filters are obtained as the solutions of maximum informative estimating equations. Recently, [9] introduced the penalized EF approach by including a penalty in the combined (linear and quadratic) EF obtained in [10] and studied the penalized estimate of $\theta$ for logarithmic autoregressive conditional duration models. However, the resulting combined EF with a penalty added in [9] is biased. In this paper, an unbiased Bayesian regularized EF is defined as a combination of the optimal EF based on the observed process and the optimal EF of the prior process. Consider a dynamic model for a filtering problem:

\[ y_t = A_t\theta_t + \varepsilon_t, \]

where $\varepsilon_t$ is an independent sequence of zero mean random variables with density $f(\cdot)$, $y_t$ is an observed sequence of variables, and $\theta_t \mid \theta_{t-1}$ has density $\lambda_t(\cdot)$. For example, $\varepsilon_t$ might have a heavy-tailed distribution such as the Laplace distribution or the Cauchy distribution. In this case, it is impossible to obtain a simple recursive relation for the posterior mean. However, we can take Godambe's formulation (see [8]) as a starting point and investigate a combination of orthogonal EFs. It can be shown that if $\theta_{t-1}$ were known the optimal
regularized filtered estimate $\hat{\theta}_t$ is obtained as the solution of the combined EFs,

\[ \frac{\partial \ln f(y_t - A_t\theta_t)}{\partial\theta_t} + \frac{\partial \ln \lambda_t(\theta_t \mid \theta_{t-1})}{\partial\theta_t} = 0. \tag{2} \]

If $f$ and $\lambda$ are normal densities then the filtered estimate $\hat{\theta}_t$ turns out to be the posterior mean as in KF for state space models. It can also be shown that the optimal linear predictor can be obtained as a solution of the unbiased filtering equation

\[ -\frac{A_t (y_t - A_t\theta_t)}{\sigma_\eta^2} + \frac{\theta_t - \mu_t}{\sigma^2} = 0, \tag{3} \]

where $\mu_t$ is the conditional mean of $\theta_t$. Moreover, if $f$ is the normal density and $\lambda_t(\cdot)$ is the density of a symmetric distribution with mean zero, sign correlation $\rho$ (see [11]) and variance $\sigma^2$, the optimal linear predictor could be obtained as a solution of the unbiased filtering equation

\[ -\frac{A_t (y_t - A_t\theta_t)}{\sigma_\eta^2} + \frac{\operatorname{sign}(\theta_t - \mu_t)\,(|\theta_t - \mu_t| - \rho\sigma)}{\sigma^2 (1 - \rho^2)} = 0, \tag{4} \]

which can be rewritten as

\[ -\frac{A_t (y_t - A_t\theta_t)}{\sigma_\eta^2} + \frac{\theta_t - \mu_t}{\sigma^2 (1 - \rho^2)} - \frac{\rho}{\sigma (1 - \rho^2)} \operatorname{sign}(\theta_t - \mu_t) = 0. \]

As a special case, if we further assume that $\lambda_t(\cdot)$ is the density of a Laplace distribution then we can interpret the dynamic Lasso estimate as a posterior mode for each $t$. In general, (4) can be interpreted as a generalized unbiased filtering equation with symmetric priors and it is more informative than the least squares EF. The corresponding estimate can be interpreted as a dynamic generalized Bayesian least square estimate.

Algorithmic trading ([12]–[16]) uses a computer program that follows a defined set of instructions (an algorithm) to place a trade and can generate profits at a speed and frequency that is impossible for a human trader. It is rarely in the best interest of investment managers to share profitable trading strategies with the public, so most trading strategies including pairs trading remained a secret of the investors until the introduction of online trading. Pairs trading is a trading strategy used to exploit financial markets that are out of equilibrium. The strategy involves identifying two securities whose prices tend to move together in the long term. Upon divergence, the cheaper security is bought long and the more expensive one is sold short. When prices converge back to their historical equilibrium, the trade is closed and a profit collected. Pairs trading has been introduced to the academic community through [17]. The idea behind a pair (of stocks, bonds, foreign exchanges, commodities, etc.) is closely linked to the statistical concept of cointegration. If a linear combination of a collection of nonstationary time series is stationary, then the collection is said to be cointegrated. For cointegrated prices $P_{1,t}$ and $P_{2,t}$, the difference or spread of two prices, $\varepsilon_t = P_{1,t} - \theta_0 - \theta_1 P_{2,t}$, is stationary, which suggests that $\varepsilon_t$ is perturbed around an equilibrium value. In pairs trading, the regression coefficient $\theta_1$ is called the hedge ratio, and it describes the amount of one security to purchase or sell for every unit of the other security. In a non-dynamic setting, we use in-sample data to obtain offline estimates $\hat{\theta}_0$ and $\hat{\theta}_1$ for regression coefficients $\theta_0$ and $\theta_1$, and calculate an estimate $\hat{\sigma}$ of the standard deviation of $\varepsilon_t$. In the trading period, the z-score $z_t$ is computed as

\[ z_t = \nu_t/\hat{\sigma} = (P_{1,t} - \hat{\theta}_0 - \hat{\theta}_1 P_{2,t})/\hat{\sigma}, \]

which is used to generate trading signals. Further studies describe the price difference of a pair in the state space formulation ([18], [19]). In order to incorporate the time varying regression coefficients ([4], [5]), and extend pairs trading to multiple trading ([20]), the linear state space model or dynamic linear model can be used. The state space model employs a random walk as the state equation:

\[ \theta_t = \theta_{t-1} + v_t, \tag{5} \]

where $\theta_t$ is the $m$-dimensional state vector at time $t$, and $v_t$ is i.i.d with mean zero and covariance matrix $\Sigma_v$. An observed process $y_t$ can be described by an observation equation:

\[ y_t = A_t\theta_t + \varepsilon_t, \tag{6} \]

where $A_t$ is an $m$-dimensional feature, and the observational noise $\varepsilon_t$ is i.i.d with mean zero and variance $\sigma_\varepsilon^2$. A primary aim of the analysis is to produce dynamic filtered estimates, $\theta_{t|t} = E[\theta_t \mid \mathcal{F}_t^y]$, for the hedge ratio $\theta_t$ to hedge the risk exposure of the stock price movement, given the data $\mathcal{F}_t^y = \{y_1, \ldots, y_t\}$ up to time $t$. Using the filtered estimate $\theta_{t-1|t-1}$, $\nu_t = y_t - A_t\theta_{t-1|t-1}$ is called the innovation at time $t$. The innovation sequence $\nu_t$ and its time varying volatility are used to generate trading signals in algorithmic trading.

Recently there has been a growing interest in pairs trading and multiple trading based on Kalman filtering. In the literature [4], [5], [20] among others, very small initial values of the KF are used. Trading profit is sensitive to initial values, and it decreases sharply when initial values are slightly increased. In this paper, a novel data-driven robust filtering algorithm based on regularized EFs is proposed for multiple trading, which does not need to assume very small initial values. It is shown that the commonly used square root of the innovation variance is not an appropriate estimator of the innovation volatility (see [11] for details). A data-driven trading strategy based on joint forecasts of volatility and stock price is studied in [21]. The data-driven generalized exponential weighted moving average (DD-EWMA) volatility forecasting model proposed in [11] is used to forecast the innovation volatility directly in this paper. The data-driven innovation volatility forecast (DDIVF) provides accurate dynamic interval forecasts of innovations and can be used to generate the trading signals appropriately. Let the conditional variance of the innovation $\nu_t$, based on the past data up to time $t-1$, be $\sigma_t^2$. The DD-EWMA volatility forecasting model for innovations is given by

\[ \hat{\sigma}_t = (1 - \alpha)\,\hat{\sigma}_{t-1} + \alpha\,\frac{|\nu_{t-1} - \bar{\nu}|}{\hat{\rho}_\nu}, \qquad 0 < \alpha < 1, \tag{7} \]

where $\alpha$ is the smoothing constant, and $\hat{\rho}_\nu$ is the sample sign correlation of the innovation sequence, defined as $\operatorname{Corr}(\nu_t - \bar{\nu}, \operatorname{sgn}(\nu_t - \bar{\nu}))$. Model (7) is data-driven in the


sense that the optimal value of $\alpha$ is obtained by minimizing the one-step ahead forecast error sum of squares (FESS), and the sample sign correlation $\hat{\rho}_\nu$ is used to identify the conditional distribution of $\nu_t$. In this paper, this model is used and extended to study the volatility forecasts for innovation and improve the stability of the filtering algorithm.

The remainder of this paper is organized as follows. In Section II, a maximum informative filtering algorithm is proposed with DDIVF. In Section III, a data-driven multiple trading strategy using maximum informative filtered hedge ratios and DDIVF is proposed. Trading strategies constructed using the DDIVF perform better than those using the commonly used KF-based innovation volatility forecast (KFIVF). The robustness of these two strategies is analyzed and compared using a training sample and a test sample. The trading strategy using DDIVF is robust to a wide range of initial values, and robust to the volatile stock market, since the time varying innovation volatility is properly investigated. Finally, Section IV provides conclusions.

II. METHODS

We consider multiple trading for stocks with multiple cointegrations and construct a novel trading strategy using dynamic maximum informative filtering. Consider $m$ asset prices $P_{1,t}, P_{2,t}, \ldots, P_{m,t}$ with a multiple cointegrated relationship. The state space model (5) - (6) is used where $\theta_t = (\theta_{0,t}, \theta_{1,t}, \ldots, \theta_{m-1,t})'$, $y_t = P_{1,t}$ and $A_t = (1, P_{2,t}, \ldots, P_{m,t})$. In addition, it is assumed for simplicity that $\theta_0$, $v_t$ and $\varepsilon_t$ are uncorrelated.

A. Data-Driven Maximum Informative Filters Using Estimating Functions

For model (5) - (6), let $\theta_{t-1|t-1} = E[\theta_{t-1} \mid \mathcal{F}_{t-1}^y]$ and $I_{t-1|t-1}^{-1} = P_{t-1|t-1} = \operatorname{Var}(\theta_{t-1} - \theta_{t-1|t-1} \mid \mathcal{F}_{t-1}^y)$. Based on the non-Gaussian maximum informative filter provided in [8], we consider the combination of two elementary EFs:

\[ g_{1t} = \theta_t - E[\theta_t \mid \mathcal{F}_{t-1}^y] = \theta_t - \theta_{t-1|t-1}, \]

and

\[ g_{2t} = \nu_t = y_t - E[y_t \mid \mathcal{F}_{t-1}^y] = y_t - A_t\theta_{t-1|t-1}, \]

where $\nu_t = y_t - y_{t|t-1}$ is the innovation or forecast error of $y_t$. The "optimal" combination in the class of linear combinations of $g_{1t}$ and $\nu_t$ is given by

\[ \theta_t - \theta_{t-1|t-1} - \frac{\operatorname{Cov}(g_{1t}, \nu_t \mid \mathcal{F}_{t-1}^y)}{\operatorname{Var}(\nu_t \mid \mathcal{F}_{t-1}^y)}\,(y_t - A_t\theta_{t-1|t-1}). \tag{8} \]

This yields the "optimal" estimate of $\theta_t$ as

\[ \hat{\theta}_t = \theta_{t-1|t-1} + (I_{t-1|t-1}^{-1} + \Sigma_v)\,A_t'\,Q_t^{-1}\,(y_t - A_t\theta_{t-1|t-1}), \tag{9} \]

where the innovation variance is given by

\[ Q_t = \operatorname{Var}(\nu_t \mid \mathcal{F}_{t-1}^y) = A_t\,(I_{t-1|t-1}^{-1} + \Sigma_v)\,A_t' + \sigma_\varepsilon^2. \]

In most of the applications including pairs trading and risk forecasting, the filtered estimate $\theta_{t-1|t-1}$ for the state variable, the innovation $\nu_t$ and the innovation volatility $\sqrt{Q_t}$ are used. However, $\sqrt{Q_t}$ is not an appropriate estimate of the innovation volatility. Therefore, the DD-EWMA volatility forecasting model (7) is used to obtain DDIVF, and Algorithm 1 illustrates the details of the DDIVF calculation. Based on the past $k$ innovations $\nu_{t-k}, \ldots, \nu_{t-1}$, the sample sign correlation $\hat{\rho}_\nu$ and the volatility estimates $|\nu_s - \bar{\nu}|/\hat{\rho}_\nu$, $s = t-k, \ldots, t-1$, are calculated. The smoothed value $S_s$ of the volatility estimate is calculated recursively. The optimal smoothing constant $\alpha_{opt}$ is determined by minimizing the one-step ahead FESS. Using the optimal value $\alpha_{opt}$, we calculate the smoothed value $S_s$ recursively. Finally, $S_{t-1}$ is computed, and used as the volatility forecast $\hat{\sigma}_t$ for $\nu_t$.

Algorithm 1 Dynamic DD-EWMA volatility forecasts of innovation
Require: Predicted errors $\nu_s$, $s = t-k, \ldots, t-1$
1: $\hat{\rho}_\nu \leftarrow \operatorname{Corr}(\nu_s - \bar{\nu}, \operatorname{sign}(\nu_s - \bar{\nu}))$
2: $V_s \leftarrow |\nu_s - \bar{\nu}|/\hat{\rho}_\nu$ {Compute estimated volatility}
3: $S_{t-k-1} \leftarrow \bar{V}_l$ {Initial volatility forecast using first $l$ observations}
4: $\alpha \leftarrow (0.01, 0.5)$ by $0.01$ {Set a range for $\alpha$}
5: $S_s \leftarrow \alpha V_s + (1 - \alpha) S_{s-1}$, $s = t-k, \ldots, t-1$
6: $\alpha_{opt} \leftarrow \arg\min_\alpha \sum_{s=t-k+l}^{t-1} (V_s - S_{s-1})^2$ {Determine optimal $\alpha$ by minimizing FESS}
7: $S_s \leftarrow \alpha_{opt} V_s + (1 - \alpha_{opt}) S_{s-1}$, $s = t-k, \ldots, t-1$
8: $\hat{\sigma}_t \leftarrow S_{t-1}$ {Calculate one-step-ahead DDIVF based on $k$ observations}
9: return $\alpha_{opt}$, $\hat{\sigma}_t$

It follows from [6] and [8] that the information matrix associated with the combined EF (8) is maximal in the class of linear combinations of $g_{1t}$ and $\nu_t$. When Gaussian assumptions hold for $v_t$ and $\varepsilon_t$, the linear optimal filter (9) turns out to be KF. The point estimation can be regarded as recursive if $\hat{\theta}_t = \theta_{t|t} = E[\theta_t \mid \mathcal{F}_t^y]$. We cannot conclude this in general, though we conclude that $\hat{\theta}_t$ is $\mathcal{F}_t^y$-measurable and $E[\hat{\theta}_t \mid \mathcal{F}_{t-1}^y] = E[\theta_t \mid \mathcal{F}_{t-1}^y]$. The following dynamic maximum informative filtering algorithm is used to calculate dynamic hedge ratios recursively. The algorithm updates $I_{t|t}$, which is computationally simpler than updating the covariance matrix $P_{t|t}$. The updating formulas for $I_{t|t}$ are equivalent to the updating formulas for $P_{t|t}$, which are given by $P_{t|t} = (I - I_{t|t-1}^{-1} A_t' Q_t^{-1} A_t)\,P_{t|t-1}$.

The innovation $\nu_t$, and the standard deviation $\sqrt{Q_t}$ or the DDIVF $\hat{\sigma}_t$, can be used to construct the signals for a trading strategy at each time $t$. [5] discussed a pairs trading strategy, and [20] proposed a multiple trading strategy using $\sqrt{Q_t}$. The first two values of $\nu_t$ are relatively large since the filter needs a few iterations before stabilization. Without outliers, $\nu_t$ follows a normal distribution approximately. Hence, the dynamic z-score $z_t$ is computed as

\[ z_t = \nu_t / \sqrt{Q_t}, \tag{10} \]

and the z-scores will be compared with a threshold value $p$ to generate trading signals. The strategies using $\nu_t$ and $\sqrt{Q_t}$

require the initial values of $\Sigma_v$ and $\sigma_\varepsilon^2$ to be very small since increased initial values will cause the problem of volatility clustering of $\nu_t$ and it is not appropriate to directly use $\sqrt{Q_t}$ as the volatility estimate.

Algorithm 2 Dynamic maximum informative filtered hedge ratios
Require: Data: adjusted closing stock prices $P_{1,t}, P_{2,t}, \ldots, P_{m,t}$, $t = 1, \ldots, n$
1: Let $y_t = P_{1,t}$, $A_t = (1, P_{2,t}, \ldots, P_{m,t})$
2: Initialization: initial state $\theta_0$, initial information matrix $I_{0|0} = \Sigma_0^{-1}$ (the inverse of the initial error covariance matrix $\Sigma_0$), constant error covariance matrix $\Sigma_v$, constant innovation variance $\sigma_\varepsilon^2$
3: for $t \leftarrow 1, \ldots, n$ do
4:   Prediction: Based on data available at $t-1$:
5:   $\theta_{t|t-1} \leftarrow \theta_{t-1|t-1}$; $I_{t|t-1}^{-1} \leftarrow I_{t-1|t-1}^{-1} + \Sigma_v$; $\hat{y}_{t|t-1} \leftarrow A_t\theta_{t|t-1}$
6:   Update: Inference about $\theta_t$ is updated using the observation $y_t$ at time $t$
7:   $\nu_t \leftarrow y_t - \hat{y}_{t|t-1}$; $Q_t \leftarrow A_t I_{t|t-1}^{-1} A_t' + \sigma_\varepsilon^2$
8:   DDIVF $\hat{\sigma}_t$ is calculated based on $\nu_{t-k}, \ldots, \nu_{t-1}$ using Algorithm 1
9:   $\theta_{t|t} \leftarrow \theta_{t|t-1} + I_{t|t-1}^{-1} A_t' Q_t^{-1} \nu_t$; $I_{t|t} = I_{t|t-1} + \frac{1}{\sigma_\varepsilon^2} A_t' A_t$
10: end for
11: return $\nu_t$, $Q_t$, $\hat{\sigma}_t$

B. A Novel Data-Driven Robust Multiple Trading Strategy

In the literature [4], [5], [20], trading profit is sensitive to initial values. Therefore, a robust multiple trading strategy using $\theta_{t-1|t-1}$, $\nu_t$ and DDIVF $\hat{\sigma}_t$ is proposed and compared with the multiple trading strategy using $\sqrt{Q_t}$ to demonstrate the profitability and robustness of the proposed strategy. The dynamic robust z-score $z_t$ is computed as

\[ z_t = \nu_t / \hat{\sigma}_t, \tag{11} \]

and the z-scores will be compared with a threshold value $p$ to generate trading signals. The strategy using equation (10) always requires very small initial values of $\Sigma_v$ and $\sigma_\varepsilon^2$ due to the convergence issue and to guarantee a successful trading strategy. However, the proposed strategy using equation (11) does not require such assumptions for $\Sigma_v$ and $\sigma_\varepsilon^2$. The spread $\nu_t = P_{1,t} - \hat{\theta}_{0,t-1|t-1} - \hat{\theta}_{1,t-1|t-1} P_{2,t} - \cdots - \hat{\theta}_{m-1,t-1|t-1} P_{m,t}$ is modelled as a mean-reverting process. Upon divergence, the cheaper security (or linear combination of securities) is bought long and the more expensive security (or linear combination of securities) is sold short. When the prices converge back to their historical equilibrium, the trade is closed and a profit is collected. Algorithm 3 generates the trading signals $s_t$, where sells are represented as $s_t = -1$, buys as $s_t = 1$, and no signal as $s_t = 0$. The buy signal is generated when $z_t$ crosses the threshold $-p$ from above; the sell signal is generated when $z_t$ crosses $p$ from below. Then trading positions are determined using $s_t$, and the profit of holding those positions is further computed. Finally, annualized Sharpe ratios (ASRs) are calculated for a range of values of $p$. The optimal threshold value, $p_{opt}$, is determined by maximizing the ASR; the optimal cumulative profit is computed from optimal signals and positions generated by using $p_{opt}$.

Algorithm 3 Robust Multiple Trading using rolling DDIVF and optimal signals
Require: Data: $P_{1,t}, P_{2,t}, \ldots, P_{m,t}$, $t = 1, \ldots, n$; rolling window size $k$
1: Let $y_t = P_{1,t}$, $A_t = (1, P_{2,t}, \ldots, P_{m,t})$; $\nu_t$ and $\theta_{t-1|t-1}$ obtained from Algorithm 2
2: $\hat{\sigma}_t$, $t = k+1, \ldots, n$ is obtained by using Algorithm 1 using a rolling approach based on $\nu_t$, $t = 1, \ldots, n$. Each rolling window size is $k$
3: $z_t \leftarrow \nu_t/\hat{\sigma}_t$, $t = k+1, \ldots, n$
4: Generate trading signals $s_t$:
5: for $t \leftarrow k+2, \ldots, n$ do
6:   If $z_{t-1} < p$ & $z_t > p$, then $s_t \leftarrow -1$; If $z_{t-1} > -p$ & $z_t < -p$, then $s_t \leftarrow 1$; Else $s_t \leftarrow 0$
7:   $position.A_t \leftarrow -1000 * \theta_{t-1|t-1} * s_t$; $position.y_t \leftarrow 1000 * s_t$
8:   $profit.A_t \leftarrow (A_t - A_{t-1}) * position.A_t$; $profit.y_t \leftarrow position.y_t * (y_t - y_{t-1})$
9:   $profit_t \leftarrow profit.A_t + profit.y_t$
10: end for
11: Calculate the ASR as $SR(p) = \sqrt{252} * \mathrm{mean}(profit_t)/\mathrm{sd}(profit_t)$
12: Determine the optimal value of $p$, $p_{opt}$, by maximizing $SR(p)$
13: Obtain the cumulative profit $\mathrm{cumsum}(profit_t)$ using $p_{opt}$
14: return $p_{opt}$, cumulative profit

III. RESULTS

In this section we test the trading strategies constructed using DDIVF against those constructed using KFIVF ($\sqrt{Q_t}$ from the KF algorithm), and explore the robustness of the profit and ASR. The proposed method and algorithms are tested with the adjusted closing prices of three exchange-traded funds (ETFs) downloaded from Yahoo Finance for the period from 2017-02-01 to 2020-03-15: iShares MSCI Australia ETF (EWA), iShares MSCI Canada ETF (EWC), and iShares North American Natural Resources ETF (IGE). From the whole period of time, we selected the training sample from 2017-02-01 to 2019-03-14 and the test sample from 2019-03-15 to 2020-03-15. We use EWA and EWC to illustrate the proposed robust trading strategy. The price movements of these two stocks are visualized in Fig. 1 for the training sample and the test sample, respectively. The training sample is used to obtain $p_{opt}$ for each chosen value of $\sigma_\varepsilon^2$ and $\delta$ in Table III. Then the test sample is used to test the profitability and robustness of the proposed robust strategy. It is known that March 2020 was a historically volatile month for the stock market. The proposed strategy is demonstrated to be profitable and robust during this period, until the two stocks ceased to exhibit the cointegration relationship in April 2020.
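As a rough illustration of Algorithm 1 (dynamic DD-EWMA volatility forecasts), the following Python sketch computes the sample sign correlation, the scaled absolute deviations, and the smoothing constant that minimizes the one-step-ahead FESS over a grid. The function names, the `n_init` argument (the number l of initial observations, set to 20 purely for illustration) and the use of NumPy are assumptions of this sketch; it is not the authors' R implementation.

```python
import numpy as np

def sign_correlation(x):
    """Sample sign correlation: Corr(x - mean(x), sign(x - mean(x)))."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    return np.corrcoef(d, np.sign(d))[0, 1]

def dd_ewma_forecast(nu, n_init=20, alphas=None):
    """Return (alpha_opt, sigma_hat) for a window of k innovations `nu`.

    n_init plays the role of l in Algorithm 1 (hypothetical default).
    """
    nu = np.asarray(nu, dtype=float)
    if alphas is None:
        alphas = np.linspace(0.01, 0.5, 50)        # step 4: grid for alpha
    rho = sign_correlation(nu)                     # step 1
    v = np.abs(nu - nu.mean()) / rho               # step 2: volatility estimates V_s
    s0 = v[:n_init].mean()                         # step 3: initial forecast

    def smooth(alpha):
        s_prev, fess = s0, 0.0
        for i, v_i in enumerate(v):
            if i >= n_init:                        # FESS starts after the first l values
                fess += (v_i - s_prev) ** 2        # one-step-ahead forecast error
            s_prev = alpha * v_i + (1 - alpha) * s_prev   # step 5
        return s_prev, fess

    results = [smooth(a) for a in alphas]
    best = int(np.argmin([fess for _, fess in results]))  # step 6
    return float(alphas[best]), float(results[best][0])   # steps 7-9
```

Calling `dd_ewma_forecast(nu[t-k:t])` on the k most recent innovations would give the forecast for time t; repeating this over a rolling window yields the sequence of forecasts used in the z-score (11).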

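The filtering recursion of Algorithm 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it propagates the covariance $P = I^{-1}$ rather than the information matrix, using the equivalent update $P_{t|t} = (I - P_{t|t-1} A_t' Q_t^{-1} A_t) P_{t|t-1}$ quoted in Section II, because that form also works when the initial covariance is a zero matrix. The function name, argument names and return values are hypothetical.

```python
import numpy as np

def filter_hedge_ratios(y, A, theta0, P0, Sigma_v, sigma_eps2):
    """Random-walk state space filter, covariance form.

    y: (n,) responses; A: (n, m) features including the leading column of ones.
    Returns innovations nu, innovation variances Q, and filtered states theta_{t|t}.
    """
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    theta = np.asarray(theta0, dtype=float).copy()
    P = np.asarray(P0, dtype=float).copy()
    nu = np.empty(n)
    Q = np.empty(n)
    thetas = np.empty((n, m))
    for t in range(n):
        a = A[t]
        P_pred = P + Sigma_v                 # prediction: I_{t|t-1}^{-1}
        nu[t] = y[t] - a @ theta             # innovation uses theta_{t-1|t-1}
        Q[t] = a @ P_pred @ a + sigma_eps2   # innovation variance
        gain = P_pred @ a / Q[t]
        theta = theta + gain * nu[t]         # update of the state estimate
        P = P_pred - np.outer(gain, a) @ P_pred
        thetas[t] = theta
    return nu, Q, thetas
```

The square root of `Q` corresponds to the KFIVF, while the returned `nu` is what the DD-EWMA forecast takes as input.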

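The signal, profit and Sharpe ratio steps of Algorithm 3 might look like the sketch below for a single pair. Several choices here are assumptions rather than the authors' code: the helper names, the restriction to one feature series `x` with a slope-only hedge ratio `hedge`, and applying the position held at time t-1 to the price change at time t (one reading of "lagging the signals").

```python
import numpy as np

def trading_signals(z, p):
    """-1 (sell) when z crosses +p from below, +1 (buy) when z crosses -p from above."""
    z = np.asarray(z, dtype=float)
    s = np.zeros(len(z), dtype=int)
    s[1:][(z[:-1] < p) & (z[1:] > p)] = -1
    s[1:][(z[:-1] > -p) & (z[1:] < -p)] = 1
    return s

def strategy_profit(y, x, hedge, s, units=1000):
    """Per-period profit of holding `units` of y against hedge * units of x."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    hedge, s = np.asarray(hedge, dtype=float), np.asarray(s)
    pos_y = units * s
    pos_x = -units * hedge * s
    profit = np.zeros(len(y))
    # Lagged positions: the position at t-1 earns the price change from t-1 to t.
    profit[1:] = pos_y[:-1] * np.diff(y) + pos_x[:-1] * np.diff(x)
    return profit

def annualized_sharpe(profit):
    """sqrt(252) * mean / sd, with 0.0 returned when the profit series is constant."""
    sd = np.std(profit, ddof=1)
    return float(np.sqrt(252) * np.mean(profit) / sd) if sd > 0 else 0.0

def best_threshold(z, y, x, hedge, grid=None):
    """Grid search for the threshold p that maximizes the annualized Sharpe ratio."""
    if grid is None:
        grid = np.linspace(0.1, 2.0, 191)    # 0.1 to 2 in steps of 0.01
    scores = [annualized_sharpe(strategy_profit(y, x, hedge, trading_signals(z, p)))
              for p in grid]
    k = int(np.argmax(scores))
    return float(grid[k]), scores[k]
```

The cumulative profit for the selected threshold would then be `np.cumsum(strategy_profit(...))`, mirroring step 13.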
The cointegration of the two stocks is regularly checked by the Engle-Granger test and the Johansen test over time.

[Figure: two panels titled "pair.stock", 2017-02-01 / 2019-03-14 and 2019-03-15 / 2020-03-13, showing EWA and EWC]
Fig. 1. Daily adjusted closing prices of EWA and EWC

We consider the state space model (5) - (6) where $\theta_t = (\theta_{0,t}, \theta_{1,t})$, $P_{1,t} = \theta_{0,t} + \theta_{1,t} P_{2,t} + \varepsilon_t$. $P_{1,t}$ is EWC, and $P_{2,t}$ is EWA. The initial state is $\theta_0 = 0$, and the initial information matrix $I_0$ is chosen such that $I_0^{-1}$ is a zero matrix. The innovation covariance is $\sigma_\varepsilon^2 = 0.001$, and the error covariance matrix $\Sigma_v$ is a 3 × 3 diagonal matrix with elements $\delta/(1 - \delta)$, where $\delta = 0.0001$. The dynamics of the filtered intercept (green) and hedge ratio for EWA (blue) are shown in Fig. 2. The small initial values of the innovation covariance and the error covariance matrix will guarantee a successful trading strategy by using $\nu_t$ and $Q_t$ as in [5] and [20]. However, this condition is not required for the proposed robust trading strategy using DDIVF. We will discuss this in the following sections.

[Figure: two panels titled "Dynamic hedge ratios", 2017-02-02 / 2019-03-14 and 2019-03-18 / 2020-03-13]
Fig. 2. Filtered $\hat{\theta}_{0,t|t}$ (green) and $\hat{\theta}_{1,t|t}$ (blue)

A. Robust Multiple Trading Strategy Using $\nu_t$ and DDIVF

A rolling window approach is first applied to the training sample to forecast the volatility of $\nu_t$ and obtain $p_{opt}$ to be used for the test sample. The selected data covers 532 days, with 432 overlapping rolling windows of size 100 days. Each window of size 100 is used to calculate a one-day-ahead DDIVF using Algorithm 1, and the corresponding z-score using (11). For example, $\nu_1, \ldots, \nu_{100}$ are used to calculate the volatility forecast $\hat{\sigma}_{101}$ for $\nu_{101}$, and $z_{101} = \nu_{101}/\hat{\sigma}_{101}$. Then the z-scores are compared with a threshold value $p$ to generate trading signals. Using Algorithm 3, the range of $p$ is chosen as (0.1, 2) with an increment of 0.01. The optimal value is determined as $p_{opt} = 1.68$. The corresponding optimal trading signals are visualized in Fig. 4, where the red lines are the bounds calculated by $p_{opt}\hat{\sigma}_t$. Fig. 3 compares DDIVF and KFIVF, and it is shown that KFIVF is very similar to the average of DDIVF and is not able to capture the time varying innovation volatility. Therefore the bounds calculated by $p_{opt}\sqrt{Q_t}$, where $p_{opt} = 1.48$ from the traditional pairs trading, do not provide an online guide for signals, as shown in Fig. 5. We calculate the positions in each asset according to the spread and signals using $p_{opt} = 1.68$. The look-ahead bias is eliminated by lagging the signals. Each trade consists of 1,000 units of the spread. The estimated profit is the sum of the price differences multiplied by the corresponding positions in each asset. The cumulative profit of the robust pairs trading is \$7024.441.

[Figure: panel titled "Innovation Volatility", 2017-02-03 / 2019-03-14]
Fig. 3. DDIVF $\hat{\sigma}_t$ vs. KFIVF $\sqrt{Q_t}$: 2017-02-01 to 2019-03-14

[Figure: panel titled "Trading signals", 2017-02-03 / 2019-03-14]
Fig. 4. Robust pairs trading using $\hat{\sigma}_t$: 2017-02-01 to 2019-03-14

The pair EWA and IGE is used to construct another robust pairs trading strategy, where IGE is the response variable and EWA is the feature. EWA, EWC and IGE are used

TABLE II
Trading signals 2017−02−03 / 2019−03−14 O PTIMAL ROBUST AND T RADITIONAL MULTIPLE TRADING STRATEGIES :
2017-02-01 TO 2019-03-14

Robust multiple trading Traditional multiple trading


popt ASR Profit popt ASR Profit B/H Profit
0.5 0.5 EWA/EWC 1.68 1.694 7024.441 1.48 1.520 6319.299 3798.899
EWA/IGE 1.73 1.113 8105.911 1.5 -0.013 -98.021 2555.466
EWA/EWC/IGE 1.85 1.896 10645.81 1.08 1.472 8530.784 4768.198
ASR: annualized Sharpe ratio; B/H: buy and hold

0.0 0.0

sponding traditional multiple trading strategy and the buy and


hold strategy without transaction costs. A stability analysis
is further conducted for the initial values of pairs trading
−0.5 −0.5
using EWA and EWC for the training sample. The robust
pairs trading strategy using DDIVF is stable when the initial
Feb 03 Jun 01 Sep 01 Dec 01 Mar 01 Jun 01 Sep 04 Dec 03 Mar 01 value of σ2 and the error covariance matrix Σv are increased.
2017 2017 2017 2017 2018 2018 2018 2018 2019
However, the traditional pairs trading strategy requires very
√ small initial values. In the following Table III, we compare
Fig. 5. Pairs trading using Qt : 2017-02-01 to 2019-03-14
the optimal popt , profit and ASR between the proposed robust
pairs trading strategy and the traditional one, according to
to construct a robust multiple trading strategy with IGE as various initial values of σ2 and δ. The traditional pairs trading
the response variable and EWA and EWC as features. The strategy fails gradually when the initial values are increased,
summary statistics of innovation νt are provided in Table I. however; our strategy is robust to a wide range of initial
The results of all trading strategies are summarized in Table II. values. The traditional pairs trading works well only when
Table II compares each robust multiple trading strategy with δ = 0.0001 and σ2 = 0.001. However, our proposed robust
the traditional multiple trading strategy and the buy and hold data-driven pairs trading strategy works consistently well for
strategy. For each collection of stocks, the optimal threshold p_opt, cumulative profit
and ASR for the robust strategy are provided in columns 2 to 4, respectively. The
corresponding values for the traditional strategy are provided in columns 5 to 7,
respectively. The last column provides the profit from the buy and hold strategy.
Each data-driven robust multiple trading strategy outperforms (with a higher profit
and ASR) the corresponding traditional one and the buy and hold strategy without
transaction costs. The value p_opt is essential to guarantee an appropriate number of
signals since we trade at the daily adjusted closing price with no market impact or
slippage, and trade for free. The most profitable strategy in theory is not necessarily
the best one in practice since (1) cost increases with the number of stocks; (2) a
strategy with a relatively smaller p_opt will generate more signals and thus a higher
transaction cost. In practice, the R code used for the trading algorithms can be
modified to fit into a live trading platform such as Zorro (see [5]).

TABLE I
SUMMARY STATISTICS OF ν_t: 2017-02-01 TO 2019-03-14

               ρ̂_ν     ρ̂*_ν    acf-ν_t    acf-|ν_t|   acf-ν_t²
EWA/EWC        0.179   0.801   -0.0362    0.0023      -0.0233
EWA/IGE        0.191   0.789    0.0262    0.0041       0.0119
EWA/EWC/IGE    0.178   0.793   -0.0299   -0.0048       0.0183
ρ̂_ν: sample sign correlation of ν_t; ρ̂*_ν: sample sign correlation of ν_t
without outliers; acf: lag 1 sample autocorrelation

B. Stability Analysis of Initial Values for Pairs Trading
It follows from Table II that the data-driven robust multiple strategies are
profitable. Each one outperforms the corre-

various initial values. The robust data-driven pairs trading strategy works
consistently well for various initial values. The cumulative profits and ASRs are
stable when σ² ranges from 0.001 to 10, and δ ranges from 0.0001 to 0.05, as shown
in Table III.

TABLE III
STABILITY ANALYSIS OF INITIAL VALUES: 2017-02-01 TO 2019-03-14

                   Robust pairs trading         Traditional pairs trading
σ²      δ          p_opt   ASR     Profit       p_opt   ASR     Profit
0.001   0.0001     1.68    1.694   7024.441     1.48    1.520   6319.299
0.001   0.001      1.65    1.694   7023.8       0.47    1.520   6317.67
0.001   0.005      1.66    1.694   7023.737     0.21    1.638   7304.173
0.001   0.01       1.66    1.694   7023.729     0.15    1.520   6317.493
0.001   0.05       1.66    1.694   7023.723     0.1     0.477   1066.272
0.01    0.0001     1.72    2.075   8085.982     1.21    1.681   6991.076
0.01    0.001      1.68    1.694   7024.441     0.34    1.509   6787.412
0.01    0.005      1.65    1.694   7023.877     0.21    1.520   6317.862
0.01    0.01       1.65    1.694   7023.8       0.15    1.520   6317.668
0.01    0.05       1.66    1.694   7023.737     0.1     0.477   1066.279
0.05    0.0001     1.57    1.439   5646.207     0.81    1.910   8210.672
0.05    0.001      1.65    1.841   7632.649     0.41    1.434   5962.275
0.05    0.005      1.68    1.694   7024.438     0.21    1.504   6252.573
0.05    0.01       1.12    1.739   7803.182     0.15    1.504   6251.728
0.05    0.05       1.65    1.694   7023.796     0.1     0.477   1066.311
0.1     0.05       1.65    1.694   7023.87      0.1     0.477   1066.351
0.5     0.05       1.68    1.694   7024.409     0.1     0.477   1066.653
1       0.05       1.65    1.628   6754.427     0.1     0.477   1066.995
2       0.05       1.64    1.841   7632.201     0.1     0.477   1067.584
5       0.05       1.72    2.075   8085.885     0.1     0.404   906.864
6       0.05       1.71    2.109   8208.29      0.1     0.404   906.802
7       0.05       1.72    2.052   7994.416     0.1     0.230   516.504
8       0.05       1.69    2.052   7994.204     0.1     0.230   516.456
9       0.05       1.69    2.172   8459.779     NA      NA      NA
10      0.05       1.68    1.763   6891.977     NA      NA      NA
ASR: annualized Sharpe ratio
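
Table I reports the sample sign correlation of the innovations ν_t with and without outliers. The following is a minimal Python sketch of how such a statistic could be computed, assuming the usual definition Corr(ν_t − ν̄, sign(ν_t − ν̄)), which this section does not restate. The function names `pearson` and `sign_correlation` and the synthetic data are illustrative assumptions, not the authors' R code or data.

```python
import math
import random


def pearson(a, b):
    """Plain sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def sign_correlation(x):
    """Assumed definition: Corr(x - mean(x), sign(x - mean(x)))."""
    mean_x = sum(x) / len(x)
    centered = [v - mean_x for v in x]
    signs = [1.0 if c > 0 else -1.0 if c < 0 else 0.0 for c in centered]
    return pearson(centered, signs)


random.seed(0)  # synthetic Gaussian data, not the paper's innovations
clean = [random.gauss(0.0, 1.0) for _ in range(20000)]
contaminated = clean + [500.0]  # the same data plus one gross outlier

rho_clean = sign_correlation(clean)
rho_contaminated = sign_correlation(contaminated)
print(round(rho_clean, 3), round(rho_contaminated, 3))
```

For Gaussian data the population sign correlation is √(2/π) ≈ 0.798, which is close to the outlier-free values ρ̂*_ν in Table I (0.789 to 0.801). A single large outlier inflates the standard deviation and pulls the statistic down sharply, in line with the much smaller ρ̂_ν values (0.178 to 0.191).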


Authorized licensed use limited to: University of Canberra. Downloaded on May 21,2021 at 07:16:58 UTC from IEEE Xplore. Restrictions apply.
C. Stability Analysis of Volatile Market
Considering that the duration of the test trading period is 252 days, the rolling
window size to forecast innovation volatility is selected to be 50 days, and 172
overlapping rolling windows are used. For σ² = 0.001, δ = 0.0001 and p_opt = 1.68
from the training sample, trading signals from the robust strategy for the test
sample are visualized in Fig. 7. Fig. 6 compares DDIVF (red line) and KFIVF (blue
line), and shows that KFIVF is very similar to the average of DDIVF and is not able
to capture the time-varying innovation volatility in the test period. Therefore, for
the traditional strategy, the bounds calculated by p_opt √Q_t, where p_opt = 1.48
from the training sample, are not able to provide appropriate trading signals
dynamically for the test sample, as shown in Fig. 8.
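
The contrast drawn above between time-varying bounds p_opt σ̂_t and nearly constant bounds p_opt √Q_t can be illustrated with a small Python sketch. The signal convention (long the spread when the innovation falls below the lower bound, short when it rises above the upper bound), the function name `trading_signals`, and all numbers other than the two thresholds are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def trading_signals(innovations, vol_forecasts, p):
    """Return +1 (enter long spread), -1 (enter short spread) or 0 per step.

    Assumed convention: long when nu_t < -p * sigma_t, short when
    nu_t > p * sigma_t, and no signal while nu_t stays inside the band.
    """
    signals = []
    for nu, sigma in zip(innovations, vol_forecasts):
        bound = p * sigma
        if nu < -bound:
            signals.append(1)
        elif nu > bound:
            signals.append(-1)
        else:
            signals.append(0)
    return signals


# Made-up innovations and volatility forecasts, for illustration only.
nu = [0.1, -0.9, 0.4, 1.2, -0.2]
sigma_time_varying = [0.3, 0.4, 0.5, 0.6, 0.3]  # stands in for a DDIVF-like forecast
sigma_flat = [1.0] * 5                          # stands in for a nearly constant sqrt(Q_t)

robust = trading_signals(nu, sigma_time_varying, p=1.68)
traditional = trading_signals(nu, sigma_flat, p=1.48)
print(robust)       # [0, 1, 0, -1, 0]
print(traditional)  # [0, 0, 0, 0, 0]
```

In this toy example the band that tracks the volatility forecast produces entry signals, while the flat band is too wide for the same innovations and produces none. A flat band that is too narrow would instead trigger on almost every step.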

[Fig. 6. DDIVF σ̂_t vs. KFIVF √Q_t (test period 2019-03-19 to 2020-03-13)]

[Fig. 7. Robust pairs trading using σ̂_t (trading signals, 2019-03-19 to 2020-03-13)]

[Fig. 8. Trading signals in volatile market: robust pairs trading vs. traditional
pairs trading (2019-03-19 to 2020-03-13)]

TABLE IV
STABILITY ANALYSIS OF VOLATILE MARKET: 2019-03-15 TO 2020-03-15

                   Robust pairs trading         Traditional pairs trading
σ²      δ          p_opt   ASR     Profit       p_opt   ASR      Profit
0.001   0.0001     1.68    2.177   4548.931     1.48    9.408    3415.201
0.001   0.001      1.65    1.748   3661.748     0.47    9.412    3416.024
0.001   0.005      1.66    2.177   4549.977     0.21    9.413    3416.108
0.001   0.01       1.66    2.177   4549.989     0.15    9.413    3416.118
0.001   0.05       1.66    2.177   4549.999     0.1     18.225   1538.643
0.01    0.0001     1.72    2.108   4169.427     1.21    4.546    5523.647
0.01    0.001      1.68    2.177   4548.932     0.34    2.891    6002.732
0.01    0.005      1.65    1.748   3661.637     0.21    9.412    3415.923
0.01    0.01       1.65    1.748   3661.749     0.15    9.413    3416.025
0.01    0.05       1.66    2.177   4549.979     0.1     18.224   1538.627
0.05    0.0001     1.57    1.129   2373.655     0.81    1.799    2269.826
0.05    0.001      1.65    1.912   3999.317     0.41    4.547    5524.517
0.05    0.005      1.68    2.177   4548.936     0.21    9.409    3415.205
0.05    0.01       1.12    1.276   3104.549     0.15    9.411    3415.638
0.05    0.05       1.65    1.748   3661.754     0.1     18.219   1538.557
0.1     0.05       1.65    1.748   3661.647     0.1     18.213   1538.47
0.5     0.05       1.68    2.177   4548.979     0.1     18.174   1537.849
1       0.05       1.65    2.177   4548.15      0.1     18.135   1537.22
2       0.05       1.64    1.912   3999.732     0.1     18.081   1536.338
5       0.05       1.72    2.107   4169.42      0.1     18.004   1535.427
6       0.05       1.71    2.108   4169.496     0.1     17.989   1535.444
7       0.05       1.72    2.108   4169.689     0.1     1.571    226.6097
8       0.05       1.69    1.913   3999.329     0.1     1.572    226.8437
9       0.05       1.69    1.913   3999.679     NA      NA       NA
10      0.05       1.68    1.912   4000.079     NA      NA       NA
ASR: annualized Sharpe ratio

In Table IV, for each value of p_opt obtained from the training trading period, the
profit and ASR of the proposed robust strategy and the traditional one are compared
for various initial values of σ² and δ during the volatile test trading period. The
traditional pairs trading strategy fails even with very small initial values;
however, the robust data-driven pairs trading strategy works consistently well for
various initial values. The cumulative profits and ASRs are stable when σ² ranges
from 0.001 to 10, and δ ranges from 0.0001 to 0.05, as shown in Table IV. It is noted
that the ASR is around 2 (shown in column 4 in Table IV) for most of the cases, which
is an ideal value for daily trading.
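
The annualized Sharpe ratio (ASR) reported in the tables is not defined in this section. A common convention, assumed here, is the mean daily excess return divided by its sample standard deviation and scaled by √252. The sketch below follows that convention; the function name `annualized_sharpe`, the zero risk-free rate and the sample returns are illustrative assumptions and may differ from the authors' exact calculation.

```python
import math
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio under the assumed sqrt(252) scaling."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean_excess = statistics.fmean(excess)
    sd_excess = statistics.stdev(excess)  # sample standard deviation (n - 1)
    return math.sqrt(periods_per_year) * mean_excess / sd_excess


# Hypothetical daily strategy returns, for illustration only.
example_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002]
print(annualized_sharpe(example_returns))
```

Because the ratio divides by the standard deviation of daily returns, a strategy that trades rarely and earns a small, steady profit can show a very high ASR alongside a low cumulative profit, so the two columns are best read together.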

The profit of the buy and hold strategy during this test period is $-8429.348. Each
proposed robust strategy with certain initial values is much more profitable than the
buy and hold strategy for the same test period, with a profit shown in column 5 in
Table IV.

IV. CONCLUSION
This paper presents a robust filter based on the maximum informative estimating
function, and a robust multiple trading strategy based on DDIVF using the DD-EWMA
forecast model for volatility. The driving idea, unlike the existing work, is that
the filtering algorithm is obtained using the partially Bayes EF approach (see [8]).
Data-driven robust trading strategies have been evaluated through experiments, and
it is shown that the proposed robust trading strategy using DDIVF outperforms (i.e.,
has a larger profit and is much more robust to initial values) the trading strategy
using KFIVF. Moreover, the robustness of the proposed method and algorithms is
further tested in the volatile market.

ACKNOWLEDGEMENT
The first author acknowledges the Faculty of Science start-up grant from Ryerson
University. The second author acknowledges the Discovery grant from the Natural
Sciences and Engineering Research Council (NSERC).

REFERENCES
[1] B. Efron and T. Hastie, Computer age statistical inference. Cambridge
University Press, 2016.
[2] M. Jahja, D. Farrow, R. Rosenfeld, and R. J. Tibshirani, “Kalman filter,
sensor fusion, and constrained regression: Equivalences and insights”, in
33rd Conference on Neural Information Processing Systems (NeurIPS
2019), Vancouver, BC, Canada, 2019.
[3] J. Durbin and S. J. Koopman, Time Series Analysis by State Space
Methods. Oxford Statistical Science Series, Oxford, 2001.
[4] E. Chan, Algorithmic Trading: Winning Strategies and Their Rationale.
Hoboken, New Jersey, John Wiley and Sons, 2013.
[5] K. Longmore, Kalman Filter Example: Pairs Trading in R. Robot Wealth.
https://robotwealth.com/kalman-filter-pairs-trading-r/, September, 2019.
[6] M. E. Thompson and A. Thavaneswaran, “Filtering via estimating
functions”, Applied Mathematics Letters, vol. 2(5), pp. 61-67, 1999.
[7] M. E. Thompson, “Dynamic data science and official statistics”, The
Canadian Journal of Statistics, vol. 46(1), pp. 10-23, 2018.
[8] A. Thavaneswaran and M. E. Thompson, “Nonnormal filtering via
estimating functions”, in N. Balakrishnan, editor, Aspects of Probability
and Statistics, pp. 173-183, CRC Press, London, 2019.
[9] Y. Zhang, J. Zou, N. Ravishanker, and A. Thavaneswaran, “Modeling
financial durations using penalized estimating functions”, Computational
Statistics & Data Analysis, vol. 131, pp. 145-158, 2019.
[10] A. Thavaneswaran, N. Ravishanker, and Y. Liang, “Generalized duration
models and optimal estimation using estimating functions”, Annals of
the Institute of Statistical Mathematics, vol. 67(1), pp. 129-156, 2015.
[11] A. Thavaneswaran, A. Paseka, and J. Frank, “Generalized Value at Risk
Forecasting”, Communications in Statistics - Theory and Methods, pp.
1-8, 2019.
[12] A. Arratia, Computational Finance: An Introductory Course with R.
Atlantis Press, 2014.
[13] Á. Cartea, S. Jaimungal, and J. Penalva, Algorithmic and High-
Frequency Trading (Mathematics, Finance and Risk). Cambridge
University Press, 2015.
[14] C. Conlan, Automated Trading with R: Quantitative Research and
Platform Development. Apress, 2016.
[15] M. L. Halls-Moore, Advanced Algorithmic Trading. Available at
https://www.quantstart.com/advanced-algorithmic-trading-ebook/, 2017.
[16] A. Thavaneswaran, Y. Liang, Z. Zhu, and R. K. Thulasiram, “Novel
Data Driven Fuzzy Algorithmic Volatility Forecasting Models with
Applications to Algorithmic Trading”, in Proceedings of IEEE International
Conference on Fuzzy Systems (FUZZ-IEEE), Glasgow, UK, July, 2020.
[17] E. G. Gatev, W. N. Goetzmann, and K. G. Rouwenhorst, “Pairs trading:
Performance of a relative-value arbitrage rule”, Review of Financial
Studies, vol. 19(3), pp. 797-827, 2006.
[18] R. J. Elliott, J. van der Hoek, and W. P. Malcolm, “Pairs trading”,
Quantitative Finance, vol. 5, pp. 271-276, 2005.
[19] C. Ed. de Moura, A. H. Pizzinga, and J. P. Zubelli, “A pairs trading
strategy based on linear state space models and the Kalman filter”,
Quantitative Finance, vol. 16, pp. 1559-1573, 2016.
[20] Y. Liang, A. Thavaneswaran, N. Yu, M. E. Hoque, and R. K. Thulasiram,
“Dynamic data science applications in optimal profit algorithmic
trading”, pp. 1294-1299, in Proceedings (Workshop) of IEEE 44th Annual
Computers, Software, and Applications Conference (COMPSAC 2020),
2020.
[21] Y. Liang, A. Thavaneswaran, A. Paseka, Z. Zhu, and R. K. Thulasiram,
“A novel data-driven algorithmic trading strategy using joint forecasts of
volatility and stock price”, pp. 293-302, in Proceedings (Symposia) of
IEEE 44th Annual Computers, Software, and Applications Conference
(COMPSAC 2020), 2020.
