Bayesian Modelling of Time-Varying Conditional Heteroscedasticity

S. Karmakar and A. Roy

1157–1185
1 Introduction
For datasets observed over a long period, stationarity turns out to be an oversimplified
assumption that ignores systematic deviations of parameters from constancy. Thus time-
varying parameter models have been studied extensively in the literature of statistics,
economics, and related fields. For example, financial datasets, owing to external factors such as wars, terrorist attacks, economic crises, and political events, exhibit deviations from time-constant stationary models. Accounting for such changes is crucial, as time-constant models can otherwise lead to incorrect policy implications, as pointed out by Bai (1997). Thus functional estimation of unknown parameter curves using time-varying
models has become an important research topic today. In this paper, we analyze popular
conditional heteroscedastic models such as AutoRegressive Conditional Heteroscedas-
ticity (ARCH) and Generalized ARCH (GARCH) from a Bayesian perspective in a
time-varying setup. Before discussing our new contributions in this paper, we provide a
brief overview of some previous works in this area.
In the regression regime, time-varying models have garnered a lot of recent atten-
tion; see, for instance, Fan and Zhang (1999), Fan and Zhang (2000), Hoover et al.
(1998), Huang et al. (2004), Lin and Ying (2001), Ramsay and Silverman (2005), Zhang
et al. (2002) among others. These models allow a time-heterogeneous relationship between the response and the predictors. Consider the following two regression models:

$$\text{Model I: } y_i = x_i^T\theta_i + e_i, \qquad \text{Model II: } y_i = x_i^T\theta_0 + e_i, \qquad i = 1, \dots, n,$$
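The contrast between the two models can be sketched numerically. In the hypothetical simulation below (the coefficient curve $\theta(t) = 1 + \sin(2\pi t)$, the noise level, and the rolling-window estimator are our own illustrative choices, not the authors' method), a single constant coefficient (Model II) misses the systematic variation that local least squares (a crude version of Model I estimation) recovers:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)                                # scalar predictor
theta = 1.0 + np.sin(2 * np.pi * np.arange(n) / n)    # slowly varying coefficient
y = theta * x + 0.1 * rng.normal(size=n)

# Model II: one constant coefficient for all i (ordinary least squares)
theta_const = np.sum(x * y) / np.sum(x * x)

# Model I (crudely): local least squares over a rolling window around each i
h = 50
theta_local = np.array([
    np.sum(x[max(0, i - h):i + h] * y[max(0, i - h):i + h])
    / np.sum(x[max(0, i - h):i + h] ** 2)
    for i in range(n)
])

err_const = np.mean((theta_const - theta) ** 2)   # large: misses the variation
err_local = np.mean((theta_local - theta) ** 2)   # much smaller
```

The local fit tracks $\theta(t)$ at the cost of a window (bandwidth) choice, which is exactly the tuning burden the kernel-based frequentist methods discussed later carry.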
© 2021 International Society for Bayesian Analysis https://doi.org/10.1214/21-BA1267
Similar models can be defined for tvGARCH(1,1) as well, where $\sigma_i^2$ has an additional recursive term involving $\sigma_{i-1}^2$.
posterior contraction rates with respect to the average log-affinity, and then the same rate is transferred to the average Hellinger metric. The frequentist literature on inference for time-varying models requires very stringent moment and local-stationarity assumptions, which are often difficult to verify. Moreover, for econometric datasets, the existence of even the fourth moment is often questionable. Thus this paper offers an alternative way to estimate the coefficients under weaker assumptions.
The rest of the paper is organized as follows. Section 2 describes the proposed
Bayesian model in detail. Section 3 discusses an efficient computational scheme for
the proposed method. We calculate posterior contraction rates in Section 4. In Section 5 we study the performance of our proposed method in a simulation study. Section 6 deals with
real data application of the proposed methods for the three separate models and con-
cludes with a brief interpretation of the results. We wrap the paper up with discussions,
some concluding remarks, and possible future directions in Section 7. The supplemen-
tary materials Karmakar and Roy (2021) contain theoretical proofs and some additional
results.
2 Modeling
We elaborate on the models and our Bayesian framework for time-varying analogs of
three specific cases that are popularly used to analyze econometric datasets.
In a Bayesian regime we put priors on $\mu(\cdot)$ and the $a_k(\cdot)$. To respect the shape constraints imposed by $\mathcal{P}$, we reformulate the problem. With $B_j$ as the B-spline basis functions, let

$$\mu(x) = \sum_{j=1}^{K_1} \exp(\beta_j) B_j(x),$$

$$a_k(x) = \sum_{j=1}^{K_2} \theta_{kj} M_k B_j(x), \quad 0 \le \theta_{kj} \le 1,$$

$$M_i = \frac{\exp(\delta_i)}{\sum_{k=0}^{p} \exp(\delta_k)}, \quad i = 1, \dots, p,$$

$$\delta_l \sim N(0, c_1) \text{ for } 0 \le l \le p, \qquad \beta_j \sim N(0, c_2) \text{ for } 1 \le j \le K_1,$$

$$\theta_{kj} \sim U(0, 1) \text{ for } 1 \le k \le p,\ 1 \le j \le K_2.$$
The prior induced by the above construction is supported in $\mathcal{P}$, and the verification is straightforward. In the above construction, $\sum_{j=0}^{p} M_j = 1$, and thus $\sum_{j=1}^{p} M_j \le 1$. Since $0 \le \theta_{kj} \le 1$, we have $\sup_x a_i(x) \le M_i$, and thus $\sup_x \sum_{i=1}^{p} a_i(x) \le \sum_{i=1}^{p} M_i \le 1$. Equality $\sum_{j=1}^{p} M_j = 1$ holds if and only if $\delta_0 = -\infty$, which has probability zero, so the inequality is strict almost surely. On the other hand, $\mu(\cdot) \ge 0$ since $\exp(\beta_j) \ge 0$. Thus, the induced priors described above are well supported in $\mathcal{P}$.
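The construction above can be checked numerically. The sketch below uses a linear hat-function basis (which, like B-splines, forms a partition of unity on $[0,1]$) as a simple stand-in for the paper's B-splines; the knot grid and hyperparameter values are illustrative:

```python
import numpy as np

def hat_basis(x, knots):
    # linear B-spline (hat-function) basis; a stand-in for the higher-order
    # B-splines used in the paper; the basis sums to 1 at every x
    B = np.zeros((len(x), len(knots)))
    for j, k in enumerate(knots):
        left = knots[j - 1] if j > 0 else 2 * knots[0] - knots[1]
        right = knots[j + 1] if j < len(knots) - 1 else 2 * knots[-1] - knots[-2]
        B[:, j] = np.maximum(0.0, np.minimum((x - left) / (k - left),
                                             (right - x) / (right - k)))
    return B

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 200)
knots = np.linspace(0, 1, 6)
B = hat_basis(x, knots)

p, c1, c2 = 1, 100.0, 100.0
delta = rng.normal(0, np.sqrt(c1), p + 1)       # delta_0, ..., delta_p
beta = rng.normal(0, np.sqrt(c2), len(knots))
theta = rng.uniform(0, 1, len(knots))

M = np.exp(delta) / np.exp(delta).sum()         # softmax weights, sum to 1
mu = B @ np.exp(beta)                           # mu(x) >= 0 by construction
a1 = (B @ theta) * M[1]                         # sup_x a1(x) <= M_1 < 1
```

Because $\sum_j B_j(x) = 1$ and $0 \le \theta_{kj} \le 1$, the curve `a1` is bounded by `M[1]`, which is strictly below 1 with probability one, matching the support verification above.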
Additionally we impose the following constraints on the parameter space for the time-varying parameters,

$$\mathcal{P}_1 = \Big\{\mu, a_i, b_j : \mu(x) \ge 0,\ 0 \le a_i(x),\ 0 \le b_j(x),\ \sup_x \Big(\sum_k a_k(x) + \sum_j b_j(x)\Big) < 1\Big\}. \quad (2.5)$$
$$\mu(x) = \sum_{j=1}^{K_1} \exp(\beta_j) B_j(x),$$

$$a_k(x) = \sum_{j=1}^{K_2} \theta_{kj} M_k B_j(x), \quad 0 \le \theta_{kj} \le 1,\ 1 \le k \le p,$$

$$b_k(x) = \sum_{j=1}^{K_3} \eta_{kj} M_{k+p} B_j(x), \quad 0 \le \eta_{kj} \le 1,\ 1 \le k \le q,$$

$$M_i = \frac{\exp(\delta_i)}{\sum_{k=0}^{p+q} \exp(\delta_k)}, \quad i = 1, \dots, p+q,$$

$$\delta_l \sim N(0, c_1) \text{ for } 0 \le l \le p+q, \qquad \beta_j \sim N(0, c_2) \text{ for } 1 \le j \le K_1,$$

$$\theta_{kj} \sim U(0, 1) \text{ for } 1 \le k \le p,\ 1 \le j \le K_2, \qquad \eta_{kj} \sim U(0, 1) \text{ for } 1 \le k \le q,\ 1 \le j \le K_3.$$

Here the $B_j$'s are the B-spline basis functions, and the parameters $\delta_j$ are unbounded. The verification of the support condition (2.5) for the proposed prior is similar.
We impose the following constraints on the parameter space for the time-varying parameters,

$$\mathcal{P}_2 = \Big\{\mu, a_k, b_j : \mu(x) \ge 0,\ 0 \le a_k(x) \le 1,\ \sum_k a_k(x) + \sum_j b_j(x) = 1\Big\}. \quad (2.7)$$

The prior functions that allow us to reformulate the problem while keeping it consistent with (2.7) are described below:

$$\mu(x) = \sum_{j=1}^{K_1} \exp(\beta_j) B_j(x),$$

$$a_k(x) = \sum_{j=1}^{K_2} \theta_{kj} M_k B_j(x), \quad 0 \le \theta_{kj} \le 1,\ 1 \le k \le p,$$

$$b_i(x) = \sum_{j=1}^{K_3} \eta_{ij} M_{i+p} B_j(x), \quad 0 \le \eta_{ij} \le 1,\ 1 \le i \le q-1,$$

$$b_q(x) = 1 - \Big(\sum_{k=1}^{p} a_k(x) + \sum_{j=1}^{q-1} b_j(x)\Big),$$

$$M_i = \frac{\exp(\delta_i)}{\sum_{k=0}^{p+q-1} \exp(\delta_k)}, \quad i = 1, \dots, p+q-1,$$

$$\delta_l \sim N(0, c_1) \text{ for } 0 \le l \le p+q-1, \qquad \beta_j \sim N(0, c_2) \text{ for } 1 \le j \le K_1,$$

$$\theta_{kj} \sim U(0, 1) \text{ for } 1 \le k \le p,\ 1 \le j \le K_2, \qquad \eta_{kj} \sim U(0, 1) \text{ for } 1 \le k \le q-1,\ 1 \le j \le K_3.$$
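For $p = q = 1$, the remainder definition of $b_q$ enforces the sum-to-one constraint of (2.7) automatically. The small sketch below illustrates this; the $a_1$ values are arbitrary stand-ins for the spline construction:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 101)

# p = q = 1: softmax weight M_1 built from delta_0, delta_1
delta = rng.normal(0, 10, 2)
M1 = np.exp(delta[1]) / np.exp(delta).sum()

# stand-in for a_1(x) = sum_j theta_{1j} M_1 B_j(x): any values in [0, M_1]
a1 = M1 * rng.uniform(0, 1, x.size)
b1 = 1.0 - a1          # b_q defined as the remainder enforces a_1 + b_1 = 1
```

Since $0 \le a_1(x) \le M_1 < 1$ pointwise, the remainder $b_1(x)$ automatically lies in $(0, 1]$.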
$$L \propto \exp\Bigg(-\frac{1}{2}\sum_{i=p+1}^{n}\Big\{\log \sigma_i^2 + \frac{X_i^2}{\sigma_i^2}\Big\} - \sum_{j=1}^{K_1}\frac{\beta_j^2}{2c_2} - \sum_{l=0}^{p}\frac{\delta_l^2}{2c_1}\Bigg)\prod_{k,j} 1_{\{0 \le \theta_{kj} \le 1\}},$$

where $\sigma_i^2 = \mu(i/n) + \sum_{k=1}^{p} a_k(i/n)X_{i-k}^2$, $\mu(x) = \sum_{j=1}^{K_1}\exp(\beta_j)B_j(x)$, $a_k(x) = \sum_{j=1}^{K_2}\theta_{kj}M_kB_j(x)$ and $M_j = \exp(\delta_j)/\sum_{k=0}^{p}\exp(\delta_k)$.
We develop an efficient Markov chain Monte Carlo (MCMC) algorithm to sample the parameters $\beta$, $\theta$ and $\delta$ from the above likelihood. The availability of derivatives allows us to develop an efficient gradient-based MCMC algorithm to sample these parameters. We calculate the gradients of the negative log-likelihood ($-\log L$) with respect to the parameters $\beta$, $\theta$ and $\delta$; they are given below:
$$-\frac{\partial \log L}{\partial \beta_j} = \frac{\exp(\beta_j)}{2}\sum_i \frac{B_j(i/n)}{\sigma_i^2}\Big(1 - \frac{X_i^2}{\sigma_i^2}\Big) + \frac{\beta_j}{c_2}, \quad (3.1)$$

$$-\frac{\partial \log L}{\partial \theta_{kj}} = \frac{M_k}{2}\sum_i \frac{B_j(i/n)\,X_{i-k}^2}{\sigma_i^2}\Big(1 - \frac{X_i^2}{\sigma_i^2}\Big), \quad (3.2)$$

$$-\frac{\partial \log L}{\partial \delta_j} = \frac{\delta_j}{c_1} + \frac{1}{2}\sum_i \sum_{k} \big(M_j 1_{\{j=k\}} - M_j M_k\big)\Big(\sum_{l} \theta_{kl} B_l(i/n)\Big)\frac{X_{i-k}^2}{\sigma_i^2}\Big(1 - \frac{X_i^2}{\sigma_i^2}\Big), \quad (3.3)$$

where $1_{\{j=k\}}$ stands for the indicator function taking the value 1 when $j = k$, and $\sigma_i^2 = \mu(i/n) + \sum_k a_k(i/n)X_{i-k}^2$ as before.
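Gradient expressions like these are easy to sanity-check against finite differences. The sketch below does this for the $\beta$ gradient (3.1) in a tvARCH(1) setting, using a linear hat-function basis as a stand-in for the B-splines and holding $a_1(\cdot)$ fixed; data, knots and constants are illustrative:

```python
import numpy as np

def hat_basis(x, knots):
    # linear hat-function basis; stand-in for the paper's B-splines
    B = np.zeros((len(x), len(knots)))
    for j, k in enumerate(knots):
        left = knots[j - 1] if j > 0 else 2 * knots[0] - knots[1]
        right = knots[j + 1] if j < len(knots) - 1 else 2 * knots[-1] - knots[-2]
        B[:, j] = np.maximum(0.0, np.minimum((x - left) / (k - left),
                                             (right - x) / (right - k)))
    return B

rng = np.random.default_rng(5)
n, c2 = 300, 100.0
X = rng.normal(size=n)                  # data; need not follow the model here
knots = np.linspace(0, 1, 5)
Bt = hat_basis(np.arange(1, n) / n, knots)
beta = rng.normal(0, 0.5, len(knots))
a1 = 0.3 * np.ones(n - 1)               # a_1(i/n) held fixed for the check

def nll(b):
    # negative log-likelihood plus the Gaussian prior term for beta
    sig2 = Bt @ np.exp(b) + a1 * X[:-1] ** 2
    return 0.5 * np.sum(np.log(sig2) + X[1:] ** 2 / sig2) + np.sum(b ** 2) / (2 * c2)

# analytic gradient following (3.1)
sig2 = Bt @ np.exp(beta) + a1 * X[:-1] ** 2
grad = 0.5 * np.exp(beta) * (Bt.T @ ((1 - X[1:] ** 2 / sig2) / sig2)) + beta / c2

# central finite differences, one coordinate at a time
eps = 1e-6
fd = np.array([(nll(beta + eps * e) - nll(beta - eps * e)) / (2 * eps)
               for e in np.eye(len(beta))])
```

The analytic and numerical gradients agree to the accuracy of the finite-difference scheme, which is the standard check before plugging the gradients into an HMC sampler.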
For tvGARCH(p, q) the posterior has the same structure, with the conditional variance

$$\sigma_t^2 = \mu(t/n) + \sum_{i=1}^{p} a_i(t/n)X_{t-i}^2 + \sum_{i=1}^{q} b_i(t/n)\sigma_{t-i}^2,$$

and prior terms $-\sum_{j=1}^{K_1}\beta_j^2/(2c_2) - \sum_{l=0}^{p+q}\delta_l^2/(2c_1)$. The gradients are analogous to (3.1)–(3.3); in the derivative with respect to $\delta_j$, the indices $j > p$ pick up an additional contribution of the form

$$\sum_t \Big(\sum_l \eta_{kl} B_l(t/n)\Big)\frac{\sigma_{t-k}^2}{\sigma_t^2}\Big(1 - \frac{X_t^2}{\sigma_t^2}\Big)1_{\{j>p\}}, \quad 1 \le k \le q.$$
While fitting tvGARCH(p, q), we assume $X_t^2 = 0$ and $\sigma_t^2 = 0$ for any $t < 0$. Thus, we additionally need to estimate the parameter $\sigma_0^2$. The derivative of the likelihood with respect to $\sigma_0^2$ is calculated numerically using the jacobian function from the R package pracma. For tviGARCH, the derivatives are similar, so we omit them for the sake of brevity.
Based on these gradient functions, we develop gradient-based Hamiltonian Monte Carlo (HMC) sampling. Note that the parameter spaces of the $\theta_{kj}$'s have bounded support. We circumvent this by mapping any Metropolis candidate falling outside the parameter space back to the nearest boundary. HMC has two tuning parameters that must be specified: the number of leap-frog steps and the step size. It is difficult to tune both simultaneously, so we choose to tune the step size to maintain an acceptance rate between 0.6 and 0.8: after every 100 iterations, the step size is increased or reduced accordingly if the rate falls outside this range. Neal et al. (2011) showed
that a higher number of leap-frog steps improves estimation accuracy at the expense of greater computation. To balance accuracy and computational cost, we keep it fixed at 30 and obtain good results.
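The boundary-mapping and step-size-tuning scheme can be sketched on a toy one-dimensional target. Everything here is illustrative (the target, bounds, and tuning constants are ours, not the paper's), but the structure — leap-frog integration, mapping out-of-bounds candidates to the nearest boundary, and adapting the step size every 100 iterations toward a 0.6–0.8 acceptance rate — mirrors the description above:

```python
import numpy as np

def hmc_bounded(U, grad_U, theta0, n_iter=2000, eps=0.05, L=30, seed=0):
    rng = np.random.default_rng(seed)
    theta, samples, acc = theta0, [], []
    for it in range(n_iter):
        m = rng.normal()                       # momentum refresh
        th, p = theta, m
        for _ in range(L):                     # leap-frog integrator, L steps
            p -= 0.5 * eps * grad_U(th)
            th += eps * p
            p -= 0.5 * eps * grad_U(th)
        th = min(max(th, 0.0), 1.0)            # map candidate to nearest boundary
        logr = U(theta) - U(th) + 0.5 * (m ** 2 - p ** 2)
        accept = np.log(rng.uniform()) < logr
        if accept:
            theta = th
        acc.append(accept)
        if (it + 1) % 100 == 0:                # tune step size toward 0.6-0.8
            rate = np.mean(acc[-100:])
            if rate > 0.8:
                eps *= 1.1
            elif rate < 0.6:
                eps *= 0.9
        samples.append(theta)
    return np.array(samples)

# toy bounded target: N(0.3, 0.2^2) truncated to [0, 1]
U = lambda th: (th - 0.3) ** 2 / (2 * 0.2 ** 2)
gU = lambda th: (th - 0.3) / 0.2 ** 2
draws = hmc_bounded(U, gU, 0.5)
```

Mapping to the boundary is a pragmatic heuristic rather than an exact reflection scheme, which matches the paper's description of handling the bounded $\theta_{kj}$'s.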
4 Large-Sample Properties
We now focus on obtaining posterior contraction rates for our proposed Bayesian models.
The posterior consistency is studied in the asymptotic regime of an increasing number of time points $n$. We study posterior consistency with respect to the average Hellinger distance on the coefficient functions, which is

$$d_{1,n}^2 = \frac{1}{n} d_H^2(\kappa_1, \kappa_2) = \frac{1}{n}\int\big(\sqrt{f_1} - \sqrt{f_2}\big)^2,$$

where $f_1 = \prod_{i=1}^{n} P_{\kappa_1}(X_i \mid X_{i-1})$ and $f_2$ denotes the corresponding likelihood under $\kappa_2$.
Definition: For a sequence $\epsilon_n$, if $\Pi_n\big(d(f, f_0) \ge M_n \epsilon_n \mid X^{(n)}\big) \to 0$ in $F^{(n)}_{\kappa_0}$-probability for every sequence $M_n \to \infty$, then the sequence $\epsilon_n$ is called the posterior contraction rate.
All the proofs are postponed to the supplementary materials. The proof is based on the general contraction rate result for independent non-i.i.d. observations (Ghosal and Van der Vaart, 2017) and some results on B-spline-based finite random series. The exponentially consistent tests are constructed by leveraging the Neyman-Pearson
Lemma as in Ning et al. (2020). Thus the first step is to calculate the posterior contraction rate with respect to the average log-affinity $r_n^2(f_1, f_2) = -\frac{1}{n}\log \int f_1^{1/2} f_2^{1/2}$. Then we show that $r_n^2(f_1, f_2) \lesssim \epsilon_n^2$ implies $\frac{1}{n} d_H^2(f_1, f_2) \lesssim \epsilon_n^2$. We also consider the following simplified priors for $\alpha_j$ and $\tau_i$, for $i = 1, 2$, to obtain better control over the tail probabilities. These priors are not used while fitting the model, as that would require a computationally expensive reversible jump MCMC strategy. The contraction rate depends on the smoothness of the true coefficient functions $\mu$ and $a$ and on the parameters $b_{13}$ and $b_{23}$ from the prior distributions of $K_1$ and $K_2$. Let $\kappa_0 = (\mu_0, a_{01})$ be the truth of $\kappa$.
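The transfer from the average log-affinity to the average Hellinger distance is elementary: since $1 - x \le -\log x$ for $x > 0$,

```latex
\frac{1}{n} d_H^2(f_1, f_2)
  = \frac{2}{n}\Big(1 - \int f_1^{1/2} f_2^{1/2}\Big)
  \le -\frac{2}{n}\log \int f_1^{1/2} f_2^{1/2}
  = 2\, r_n^2(f_1, f_2),
```

so any contraction rate established for $r_n$ immediately holds for $d_{1,n}$ as well.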
Assumptions (A): There exist constants $M_X > 1$ and $0 < M_\mu < M_X$ such that:

(A.1) The coefficient functions satisfy $\sup_x \mu_0(x) < M_\mu$ and $\sup_x a_{01}(x) < 1 - M_\mu/M_X$.

(A.2) $\inf_x \min(\mu_0(x), a_{01}(x)) > \rho$ for some small $\rho > 0$.
by recursion.
Theorem 1. Under Assumptions (A.1)–(A.3), let the true functions $\mu_0(\cdot)$ and $a_{01}(\cdot)$ be Hölder smooth with regularity levels $\iota_1$ and $\iota_2$ respectively. Then the posterior contraction rate with respect to the distance $d_{1,n}^2$ is

$$\max\Big\{ n^{-\iota_1/(2\iota_1+1)} (\log n)^{\iota_1/(2\iota_1+1) + (1-b_{13})/2},\ n^{-\iota_2/(2\iota_2+1)} (\log n)^{\iota_2/(2\iota_2+1) + (1-b_{23})/2} \Big\},$$
These simplified priors are again used only for the theoretical analysis, not for model fitting, as the latter would require a computationally expensive reversible jump MCMC strategy. The contraction rate depends on the smoothness of the true coefficient functions $\mu$, $a$ and $b$ and on the parameters $b_{13}$, $b_{23}$ and $b_{33}$ from the prior distributions of $K_1$, $K_2$ and $K_3$. Let $\kappa_0 = (\mu_0, a_{01}, b_{01})$ be the truth of $\kappa$.
Assumptions (B): There exist constants $M_X > 1$ and $0 < M_\mu < M_X$ such that:

(B.1) The coefficient functions satisfy $\sup_x \mu_0(x) < M_\mu$ and $\sup_x \big(a_{01}(x) + b_{01}(x)\big) < 1 - M_\mu/M_X$.

(B.2) $\inf_x \min(\mu_0(x), a_{01}(x), b_{01}(x)) > \rho$ for some small $\rho > 0$.
by recursion. Similarly, we have $E_{\kappa_0}(\sigma_i^2) = E_{\kappa_0}\big(E_{\kappa_0}(X_i^2 \mid \mathcal{F}_{i-1})\big) = E_{\kappa_0}(X_i^2) < M_X$.
Theorem 2. Under Assumptions (B.1)–(B.3), let the true functions $\mu_0(\cdot)$, $a_{01}(\cdot)$ and $b_{01}(\cdot)$ be Hölder smooth with regularity levels $\iota_1$, $\iota_2$ and $\iota_3$ respectively. Then the posterior contraction rate with respect to the distance $d_{1,n}^2$ is

$$\max\Big\{ n^{-\iota_1/(2\iota_1+1)}(\log n)^{\iota_1/(2\iota_1+1)+(1-b_{13})/2},\ n^{-\iota_2/(2\iota_2+1)}(\log n)^{\iota_2/(2\iota_2+1)+(1-b_{23})/2},\ n^{-\iota_3/(2\iota_3+1)}(\log n)^{\iota_3/(2\iota_3+1)+(1-b_{33})/2} \Big\},$$
The same simplified priors are used here for the theoretical analysis. The contraction rate again depends on the smoothness of the true coefficient functions and on the parameters $b_{13}$ and $b_{23}$ from the prior distributions of $K_1$ and $K_2$. Let $\kappa_0 = (\mu_0, a_{01})$ be the truth of $\kappa$.
(C.1) The coefficient functions satisfy $\sup_x \mu_0(x) < M_\mu < \infty$ for some $M_\mu$.

(C.2) $\inf_x \mu_0(x) > \rho$, $\inf_x a_{01}(x) > \rho$, and $\sup_x a_{01}(x) < 1 - \rho$ for some $\rho > 0$.
Theorem 3. Under Assumptions (C.1)–(C.2), let the true functions $\mu_0(\cdot)$ and $a_{01}(\cdot)$ be Hölder smooth with regularity levels $\iota_1$ and $\iota_2$ respectively. Then the posterior contraction rate with respect to the distance $d_{1,n}^2$ is

$$\max\Big\{ n^{-\iota_1/(2\iota_1+1)}(\log n)^{\iota_1/(2\iota_1+1)+(1-b_{13})/2},\ n^{-\iota_2/(2\iota_2+1)}(\log n)^{\iota_2/(2\iota_2+1)+(1-b_{23})/2} \Big\}.$$
5 Simulation
We run simulations to study the performance of our proposed Bayesian method in capturing the true coefficient functions under different true models. The hyperparameters c1 and c2 of the normal priors are all set to 100, which makes the priors weakly informative. We consider 4, 5 and 6 equidistant knots for the B-splines when n = 200, 500 and 1000 respectively. We collect 10,000 MCMC samples and treat the last 5,000 as post-burn-in samples for inference. We compare the estimated functions with the true functions in terms of the posterior estimates of the functions along with their 95% pointwise
credible bands. The credible bands are calculated from the MCMC samples at each
point t = 1/T, 2/T, . . . , 1. We take the posterior mean as the posterior estimate of the
unknown functions.
Since, to the best of our knowledge, there is no other Bayesian model for these
time-varying conditional heteroscedastic models, we compare our Bayesian estimates
with corresponding frequentist time-varying estimates. For computing the time-varying
estimates of these models, we use the kernel-based method from Karmakar et al. (2021).
The M-estimator of the parameter vector $\theta(t)$ is obtained using the conditional quasi-log-likelihood. For instance, in the tvARCH(1) case, with $\theta(t) = (\mu(t), a_1(t))$,

$$\hat{\theta}_{b_n}(t) = \operatorname*{argmin}_{\theta \in \Theta} \sum_{i=2}^{n} K\Big(\frac{t - i/n}{b_n}\Big)\, \ell(X_i \mid \mathcal{F}_{i-1}, \theta), \quad t \in [0, 1],$$
where $\ell(\cdot)$ denotes the negative Gaussian log-likelihood. Note that these methods are fast but usually need a cross-validated choice of the bandwidth $b_n$. We use the Epanechnikov kernel $K(x) = \frac{3}{4}(1 - x^2) 1_{\{|x| \le 1\}}$ and an appropriately chosen bandwidth as suggested by the authors therein. Since
our discussion also involves the iGARCH formulation, we implemented an analogous kernel-based frequentist estimator for iGARCH models. Apart from these two time-varying estimates, we also obtain a time-constant fit on the same data to help initiate a discussion on whether time-varying coefficients were necessary at all. For this, the tseries and rugarch R packages are used for the ARCH/GARCH and iGARCH fits respectively.
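A minimal sketch of such a kernel-weighted quasi-likelihood estimator for tvARCH(1) follows. The simulated data, bandwidth, and optimizer settings are our own illustrative choices, not the authors' exact implementation:

```python
import numpy as np
from scipy.optimize import minimize

# simulate a tvARCH(1) path with slowly varying coefficients
rng = np.random.default_rng(3)
n = 1000
t_grid = np.arange(n) / n
mu_true = 1.0 + 0.5 * t_grid
a_true = 0.2 + 0.2 * t_grid
X = np.zeros(n)
for i in range(1, n):
    sig2 = mu_true[i] + a_true[i] * X[i - 1] ** 2
    X[i] = np.sqrt(sig2) * rng.normal()

def K(u):
    # Epanechnikov kernel, as in the text
    return 0.75 * (1 - u ** 2) * (np.abs(u) <= 1)

def theta_hat(t, bn=0.2):
    # kernel-weighted negative Gaussian log-likelihood, minimized over (mu, a1)
    w = K((t - t_grid[1:]) / bn)
    def nll(par):
        mu, a = par
        sig2 = mu + a * X[:-1] ** 2
        return np.sum(w * (0.5 * np.log(sig2) + X[1:] ** 2 / (2 * sig2)))
    res = minimize(nll, x0=[1.0, 0.1], method="L-BFGS-B",
                   bounds=[(1e-4, None), (0.0, 0.99)])
    return res.x

mu_hat, a_hat = theta_hat(0.5)     # local estimate of (mu(0.5), a1(0.5))
```

The bandwidth `bn` plays the role of $b_n$ above; in practice it would be chosen by cross-validation, which is the main tuning cost of this approach compared with the Bayesian method.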
To compare these estimates, we evaluate the average mean squared error (AMSE) for the three estimates. Note that in a usual linear regression of a response $y$ on a predictor $X$, the fitted MSE is often defined as $\frac{1}{n}\sum_i (y_i - \hat{y}_i)^2$. Since here $X_i \mid \mathcal{F}_{i-1} \sim N(0, \sigma_i^2)$, we use the following definition of AMSE:

$$\mathrm{AMSE} = \frac{1}{n}\sum_i \big(X_i^2 - \hat{\sigma}_i^2\big)^2,$$
where $\hat{\sigma}_i^2$ is computed with the fitted parameter values under the model in question. For example, for a tvGARCH(1,1) model we have $\hat{\sigma}_i^2 = \hat{\mu}(i/n) + \hat{a}_1(i/n)X_{i-1}^2 + \hat{b}_1(i/n)\hat{\sigma}_{i-1}^2$, where $\hat{\mu}(\cdot)$, $\hat{a}_1(\cdot)$ and $\hat{b}_1(\cdot)$ are the estimated curves from the posterior. Replacing the response $y_i$ by $X_i^2$ is natural, as the autocorrelations of $X_i^2$ are often checked to gauge the presence of a CH effect. Moreover, one of the early methods for dealing with CH models was to view $X_i^2$ as approximately a TVAR(1) process; see Bose and Mukherjee (2003) and references therein. Estimators similar to our proposed AMSE have been used in the literature previously to evaluate fitting accuracy; see Starica (2003); Fryzlewicz et al. (2008a); Rohan and Ramanathan (2013); Karmakar et al. (2021) for example.
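A direct computation of the AMSE might look like the following sketch; the initialisation of the $\hat{\sigma}^2$ recursion and the function names are our own choices:

```python
import numpy as np

def amse(X, mu_hat, a1_hat, b1_hat=None):
    # average MSE between X_i^2 and fitted sigma_i^2;
    # tvARCH(1) when b1_hat is None, tvGARCH(1,1) otherwise
    n = len(X)
    sig2 = np.empty(n)
    sig2[0] = np.var(X)                 # initialisation (our choice)
    for i in range(1, n):
        sig2[i] = mu_hat[i] + a1_hat[i] * X[i - 1] ** 2
        if b1_hat is not None:          # GARCH adds the recursive term
            sig2[i] += b1_hat[i] * sig2[i - 1]
    return np.mean((X[1:] ** 2 - sig2[1:]) ** 2)

# usage: a constant-variance "fit" on white noise
rng = np.random.default_rng(6)
X_demo = rng.normal(size=300)
score = amse(X_demo, np.ones(300), np.zeros(300))
```

The same function evaluates Bayesian, kernel-based and time-constant fits on equal footing: each method just supplies its own fitted curves.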
In the next three subsections, we provide the results for the three models, namely tvARCH, tvGARCH, and tviGARCH. Our conclusions from these results are summarized at the end of the section.
Figure 1: tvARCH(1): True coefficient functions (red), estimated curve (black) along
with the 95% pointwise credible bands (green) are shown for T = 200, 500, 1000 from
top to bottom.
$$\mu_0(x) = 10\exp\big(-(x-0.5)^2/0.1\big), \qquad a_{01}(x) = 0.4(x-0.15)^2 + 0.1.$$

We compare the estimated functions with the truth for sample size 1000 in Figure 1. Table 1 illustrates the performance of our method relative to the other competing methods.
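The tvARCH(1) data-generating process with the true curves above can be simulated directly; the seed and initialisation are our own choices:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
t = np.arange(1, n + 1) / n
mu0 = 10 * np.exp(-(t - 0.5) ** 2 / 0.1)      # true intercept curve
a10 = 0.4 * (t - 0.15) ** 2 + 0.1             # true ARCH coefficient curve

X = np.zeros(n)                               # X_0 = 0 initialisation
for i in range(1, n):
    sig2 = mu0[i] + a10[i] * X[i - 1] ** 2
    X[i] = np.sqrt(sig2) * rng.normal()       # X_i | F_{i-1} ~ N(0, sig2)
```

Since $\sup_x a_{01}(x) < 1$, the recursion stays stable, consistent with the support constraints imposed on the priors.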
Note that estimation of GARCH, due to the additional $b_i(\cdot)$ parameter curves, is a significantly more challenging problem and often requires a much larger sample size for reasonable estimation. We show by means of the plots in Figure 2 that the estimation looks reasonable even for smaller sample sizes. The AMSE score comparisons are shown in Table 2, where the performance of our method is also contrasted with the other competing methods.
The frequentist computation for the tviGARCH method is carried out with a kernel-based estimation scheme along the same lines as Karmakar et al. (2021). The estimated
plots along with the 95% credible intervals are shown in Figure 3 for three sample sizes
n = 200, 500, 1000 and the AMSE scores in Table 3.
Figure 2: tvGARCH(1,1): True coefficient functions (red), estimated curve (black) along
with the 95% pointwise credible bands (green) are shown for T = 200, 500, 1000 from
top to bottom.
To summarize, our estimated functions are close to the true functions in all cases. We also find that the credible bands tighten with increasing sample size; thus the estimates improve in precision as the sample size grows, as shown in Figures 1 to 3. The AMSEs of our Bayesian estimates are better in all cases, as seen in Tables 1 to 3. For tviGARCH, AMSE* is reported instead, because the AMSE values are huge and hardly comparable when the variance does not exist.
Figure 3: tviGARCH(1,1): True coefficient functions (red), estimated curve (black) along
with the 95% pointwise credible bands (green) are shown for T = 200, 500, 1000 from
top to bottom.
Typically we model the log-returns of the daily closing prices of these data to avoid the unit-root scenario. The log-return is defined as follows and is close to the relative return:

$$Y_i = \log P_i - \log P_{i-1} = \log\Big(1 + \frac{P_i - P_{i-1}}{P_{i-1}}\Big) \approx \frac{P_i - P_{i-1}}{P_{i-1}},$$
where Pi is the closing price on the ith day. Conditional heteroscedastic models are
popularly used for model building, analysis and forecasting. Here we extend such models
to a more sophisticated and general scenario by allowing the coefficient functions to vary.
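As a quick numerical check on the approximation above (the closing prices here are hypothetical), the log-return is indeed close to the relative return for typical daily moves:

```python
import numpy as np

P = np.array([100.0, 101.2, 100.7, 102.3, 101.9])   # hypothetical closing prices
Y = np.diff(np.log(P))                              # log-returns
rel = np.diff(P) / P[:-1]                           # relative returns

# Y = log(1 + rel), and for |rel| ~ 1% the two differ only at order rel^2 / 2
```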
In this section, we analyze two datasets: USD to JPY conversion and NASDAQ, a
popular US stock market data. We analyze the NASDAQ data through tvGARCH(1,1)
and tviGARCH(1,1) models, and the USDJPY conversion-rate data through tvARCH(1) models. We fit just one lag for these models, as multiple-lag fits are similar and larger lags appear to be insignificant. This result is consistent with the findings in Karmakar et al. (2021), Fryzlewicz et al. (2008b), etc. Moreover, as Fryzlewicz et al. (2008b) claim,
stock indices and Forex rates are more suited to GARCH and ARCH type of models re-
spectively for their superior predictive performance. Each of these datasets was collected
up to 31 July 2020. We exhibit our results for the last 200, 500 and 1000 days which
capture the last 6 months, around 1.5 years, and around 3 years of data respectively.
All these datasets were collected from www.investing.com. Note that these datasets are usually available for weekdays barring holidays, so there are typically about 260 data points per year.
Figure 4: USDJPY data (tvARCH(1) model): Estimated curve (black) along with the 95% pointwise credible bands (green), shown for T = 200, 500, 1000 from top to bottom.
a tvGARCH(1,1) model on the NASDAQ data for the last 200, 500, 1000 days ending on 31 July 2020. One can see that the $a_1(\cdot)$ values are generally low and the $b_1(\cdot)$ values are higher, which is consistent with how these estimates typically turn out for time-constant fits to econometric datasets. One can also see the role sample size plays in shaping these time-varying estimates. For n = 200, $b_1(\cdot)$ achieves a high value of about 0.6 around mid-March 2020, but for higher sample sizes it shows values as high as 0.8. One can also note the striking similarity between the analyses of the last 500 and 1000 days, which is
fairly consistent with the idea that estimation of such CH-type models is more stable at larger sample sizes. Nonetheless, the estimates for n = 200 seem quite smooth as well, which can be seen as a benefit of our methodology. Table 5 provides a comparison of AMSE scores across the three methods for the three sample sizes. The Bayesian tvGARCH(1,1) performs better than the other methods, and the estimated curves have narrower credible bands as the sample size grows. The behavior of the mean function also shows higher volatility around the pandemic.
Figure 5: NASDAQ data (tvGARCH(1,1) model): Estimated curve (black) along with the 95% pointwise credible bands (green), shown for T = 200, 500, 1000 from top to bottom.
in terms of the Bayes factor (Kass and Raftery, 1995). Our calculation of the Bayes factor is based on the posterior samples, using the harmonic mean identity of Newton and Raftery (1994). Let $B_{200}$, $B_{500}$ and $B_{1000}$ denote the Bayes factors for the three sample sizes, where

$$B_i = \frac{P(D^{(i)} \mid \text{tvGARCH})}{P(D^{(i)} \mid \text{tviGARCH})},$$
Figure 6: NASDAQ data (tviGARCH(1,1) model): Estimated curve (black) along with the 95% pointwise credible bands (green), shown for T = 200, 500, 1000 from top to bottom.
for sample size i and the corresponding dataset D(i) . The values we obtain are
$2\log(B_{200}) = 8.16$, $2\log(B_{500}) = 19.08$ and $2\log(B_{1000}) = 24.14$. According to the guidelines in Section 3.2 of Kass and Raftery (1995), there is 'positive' evidence in favor of tvGARCH for sample sizes 200 and 500, and the evidence becomes 'strong' for sample size 1000.
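The harmonic mean identity estimates each marginal likelihood from posterior samples: $P(D)^{-1} = E_{\text{post}}\big[1/P(D \mid \text{params})\big]$. A sketch (the function names are ours; this estimator is known to be simple but numerically unstable, which is why the log-sum-exp form is used):

```python
import numpy as np
from scipy.special import logsumexp

def log_marginal_harmonic(loglik):
    # harmonic-mean identity: 1 / p(D) = posterior mean of 1 / p(D | params),
    # computed stably on the log scale from posterior log-likelihood draws
    loglik = np.asarray(loglik)
    return -(logsumexp(-loglik) - np.log(len(loglik)))

def two_log_bf(ll1, ll2):
    # 2 * log Bayes factor of model 1 over model 2, as reported in the text
    return 2 * (log_marginal_harmonic(ll1) - log_marginal_harmonic(ll2))
```

Here `ll1` and `ll2` would be the log-likelihood values evaluated at the posterior MCMC draws of the tvGARCH and tviGARCH fits respectively.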
$$L_m^{(n)} = \frac{1}{m}\sum_{i=n-m+1}^{n}\Big(-\frac{X_i^2}{2\hat{\sigma}_i^2} - \log \hat{\sigma}_i - \frac{1}{2}\log(2\pi)\Big),$$
One can see that we again observe the same advantage of tviGARCH modeling over tvGARCH for smaller sample sizes. This is an interesting finding of this paper in the context of Bayesian model fitting for these datasets.
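The out-of-sample average predictive log-likelihood $L_m^{(n)}$ can be computed as in the short sketch below (the function name is ours):

```python
import numpy as np

def avg_predictive_loglik(X, sig2_hat, m):
    # average one-step Gaussian log-likelihood over the last m observations,
    # given fitted conditional variances sig2_hat aligned with X
    Xm, s2 = X[-m:], sig2_hat[-m:]
    return np.mean(-Xm ** 2 / (2 * s2) - 0.5 * np.log(s2) - 0.5 * np.log(2 * np.pi))
```

Larger values indicate better one-step predictive fit, which is the basis of the tvGARCH versus tviGARCH comparison above.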
Supplementary Material
Proof of Theorems (DOI: 10.1214/21-BA1267SUPP; .pdf). The supplementary material
includes the proof of Theorems 1, 2 and 3 and a general discussion of the main strategy
behind them. We also include the traceplots for the MCMC chain from our simulations.
References
Amorim, L. D., Cai, J., Zeng, D., and Barreto, M. L. (2008). Regression splines in the
time-dependent coefficient rates model for recurrent event data. Statistics in Medicine,
27(28):5890–5906. MR2597750. doi: https://doi.org/10.1002/sim.3400. 1159
Andreou, E. and Ghysels, E. (2006). Monitoring disruptions in financial markets. Jour-
nal of Econometrics, 135(1-2):77–124. MR2328397. doi: https://doi.org/10.1016/
j.jeconom.2005.07.023. 1158
Andrews, D. W. K. (1993). Tests for parameter instability and structural change with
unknown change point. Econometrica, 61(4):821–856. MR1231678. doi: https://
doi.org/10.2307/2951764. 1158
Audrino, F. and Bühlmann, P. (2009). Splines for financial volatility. Journal of
the Royal Statistical Society: Series B (Statistical Methodology), 71(3):655–670.
MR2749912. doi: https://doi.org/10.1111/j.1467-9868.2009.00696.x. 1159
Bai, J. (1997). Estimation of a change point in multiple regression models. The Review
of Economics and Statistics, 79(4):551–563. 1157
Betancourt, M. (2017). A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434. 1159
Engle, R. F. and Rangel, J. G. (2005). The spline GARCH model for unconditional volatility and its global macroeconomic causes. 1158
Engle, R. F. and Rangel, J. G. (2008). The spline-GARCH model for low-frequency
volatility and its global macroeconomic causes. The Review of Financial Studies,
21(3):1187–1222. 1159
Fan, J. and Zhang, W. (1999). Statistical estimation in varying coefficient models.
Annals of Statistics, 27(5):1491–1518. MR1742497. doi: https://doi.org/10.1214/
aos/1017939139. 1157
Fan, J. and Zhang, W. (2000). Simultaneous confidence bands and hypothesis test-
ing in varying-coefficient models. Scandinavian Journal of Statistics, 27(4):715–731.
MR1804172. doi: https://doi.org/10.1111/1467-9469.00218. 1157
Fan, J. and Zhang, W. (2008). Statistical methods with varying coefficient models.
Statistics and its Interface, 1(1):179. MR2425354. doi: https://doi.org/10.4310/
SII.2008.v1.n1.a15. 1159
Franco-Villoria, M., Ventrucci, M., Rue, H., et al. (2019). A unified view on
Bayesian varying coefficient models. Electronic Journal of Statistics, 13(2):5334–5359.
MR4047589. doi: https://doi.org/10.1214/19-EJS1653. 1159
Fryzlewicz, P., Sapatinas, T., and Subba Rao, S. (2008a). Normalized least-squares
estimation in time-varying ARCH models. Annals of Statistics, 36(2):742–786.
MR2396814. doi: https://doi.org/10.1214/07-AOS510. 1158, 1159, 1168
Fryzlewicz, P., Sapatinas, T., and Subba Rao, S. (2008b). Normalized least-squares
estimation in time-varying ARCH models. Annals of Statistics, 36(2):742–786.
MR2396814. doi: https://doi.org/10.1214/07-AOS510. 1161, 1173
Ghosal, S. and Van der Vaart, A. (2017). Fundamentals of nonparametric Bayesian
inference, volume 44. Cambridge University Press. MR3587782. doi: https://doi.
org/10.1017/9781139029834. 1165
Gu, C. and Wahba, G. (1993). Smoothing spline anova with component-wise Bayesian
“confidence intervals”. Journal of Computational and Graphical Statistics, 2(1):97–
117. MR1272389. doi: https://doi.org/10.2307/1390957. 1159
Hastie, T. and Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal
Statistical Society: Series B (Methodological), 55(4):757–779. MR1229881. 1159
Hoover, D. R., Rice, J. A., Wu, C. O., and Yang, L.-P. (1998). Nonparamet-
ric smoothing estimates of time-varying coefficient models with longitudinal data.
Biometrika, 85(4):809–822. MR1666699. doi: https://doi.org/10.1093/biomet/
85.4.809. 1157, 1158
Huang, J. Z. and Shen, H. (2004). Functional coefficient regression models for non-
linear time series: a polynomial spline approach. Scandinavian Journal of Statistics,
31(4):515–534. MR2101537. doi: https://doi.org/10.1111/j.1467-9469.2004.
00404.x. 1159
Acknowledgments
We would like to thank the editor, the associate editor, and two anonymous referees for their
constructive suggestions that improved the quality of the manuscript.