Large Deviations and Asymptotic Methods in Finance by Peter K. Friz, Jim Gatheral, Archil Gulisashvili, Antoine Jacquier, Josef Teichmann (Eds.)
Springer Proceedings in Mathematics & Statistics
Volume 110
Editors
Peter K. Friz, Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
Archil Gulisashvili, Department of Mathematics, Ohio University, Athens, OH, USA
Contents
Systemic Risk and Default Clustering for Large Financial Systems (Konstantinos Spiliopoulos)
Estimation of Volatility Functionals: The Case of a √n Window (Jean Jacod and Mathieu Rosenbaum)
Introduction
theory was originally developed (in the finite-dimensional case) by Harald Cramér
in the 1930s for actuarial mathematics.
A widely circulated preprint by Patrick Hagan et al. (following the famous
SABR paper), first presented in 2001 by Andrew Lesniewski at the Courant Finance
Seminar, intensified the connection between heat-kernels, geometry and finance.
The resulting SABR formula has become industry standard in fixed income mod-
elling (and presumably a long-time headache for quants tortured by risk manage-
ment). The topic was further explored by a number of people including Marco
Avellaneda, Christian Bayer, Gérard Ben Arous, Jérôme Busca, Jean-Dominique
Deuschel, Martin Forde, Pierre Henry-Labordère, Elton Hsu, Peter Laurence,
Cheng Ouyang and many others (including, unsurprisingly, all the editors of this
volume).
Despite the undisputed mathematical depth of this development, the agenda has
been largely initiated by people in or near the industry, a quick publication not
always being their first priority. This, at least, is our only explanation for the fact
that some key papers have remained preprints ever since, though widely circulating
and used for years. We also note that the derivation of closed-form approximation
formulae in various non-tractable models remains a constant topic in major
academic and industry meetings alike, not to mention some specialist meetings
(Vienna 2009, Berlin 2011, London 2013) organized by factions of the present
group of editors in different constellations.
The present proceedings grew on this fertile ground. Contributions include some
unpublished classics (in brushed-up versions), notably the aforementioned preprint
by Patrick Hagan et al. as well as recent works touching the theme of large devi-
ations and/or asymptotic expansions in mathematical finance.
The editors have known each other for a long time. The idea for this book project
was born in July 2013, but the first step towards realization was overshadowed by a
sad event: we are still shocked that our esteemed colleague and friend, whom we
had invited to co-edit this volume, has never received his invitation: Peter Laurence
passed away unexpectedly in August 2013.
Peter Laurence was born in New York, NY, on 27 March 1952. After under-
graduate courses at the Wharton School of Finance and Commerce at the
University of Pennsylvania, he obtained a Bachelor of Science in Mathematics and
Philosophy degree in 1973. He also obtained a Master of Science degree (1977)
and a Ph.D. degree (1981) from the University of Wisconsin-Madison. From
1974 to 1991, Peter was a faculty member at the University of Wisconsin, the
Courant Institute of Mathematical Sciences at New York University, Worcester
Polytechnic Institute, Pennsylvania State University, and the University of Milano,
Italy. From 1991 till his untimely death in 2013, Peter was a professor at Sapienza
Università di Roma and a visiting scholar at the Courant Institute. Peter published
more than 60 research papers, co-authored a book “Quantitative Methods of
Derivative Securities: From Theory to Practice” with Marco Avellaneda, and was
one of the editors of the volume “Quantitative Energy Finance: Modeling, Pricing,
and Hedging in Energy and Commodity Markets”. His long-term friend Marco
Avellaneda remembers: Peter had an infinite joie de vivre… This involved a lot of
research in Math Physics, one of his passions. I enjoyed discussing Math Physics
with him. We also began our interest in finance in the 90s and co-authored [our]
book. He had a kind heart. May he rest in peace and live in us who are still here.
Or as Bruno Dupire articulates it: He was a gentleman and will be missed.
Each of us has stories to tell about Peter and his inexhaustible passion for
mathematics and its impact on finance. Instead of trying to fit them in this intro-
duction we rather let him speak through mathematics: some of Peter Laurence’s
final contributions to mathematical finance do appear in these proceedings, with the
kind agreement of the respective co-authors.
We are indebted to all the reviewers who helped us complete this work. It is also
our pleasure to thank Magdalena Mueller-Laurence, as well as the Springer
Proceedings team, without whom this book would never have appeared.
Abstract We study the SABR model of stochastic volatility (Wilmott Mag, 2003
[10]). This model is essentially an extension of the local volatility model (Risk
7(1):18–20 [4], Risk 7(2):32–39, 1994 [6]), in which a suitable volatility parameter is
assumed to be stochastic. The SABR model admits a large variety of shapes of volatil-
ity smiles, and it performs remarkably well in the swaptions and caps/floors markets.
We refine the results of (Wilmott Mag, 2003 [10]) by constructing an accurate and
efficient asymptotic form of the probability distribution of forwards. Furthermore,
we discuss the impact of boundary conditions at zero forward on the volatility smile.
Our analysis is based on a WKB type expansion for the heat kernel of a perturbed
Laplace-Beltrami operator on a suitable hyperbolic Riemannian manifold.
1 Introduction
The SABR model [10] of stochastic volatility attempts to capture the dynamics of
smile in the interest rate derivatives markets which are dominated by caps/floors and
swaptions. It provides a parsimonious, accurate, intuitive, and easy to implement
framework for pricing, position management, and relative value in those markets.
The model describes the dynamics of a single forward (swap or LIBOR) rate with
stochastic volatility. The dynamics of the model is characterized by a function C ( f )
of the forward rate f which determines the general shape of the volatility skew, a
parameter v which controls the level of the volatility of volatility, and a parameter
ρ which governs the correlation between the changes in the underlying forward rate
and its volatility. It is an extension of Black’s model: choosing v = 0 and C ( f ) = f
reduces SABR to the lognormal Black model, while v = 0 and C ( f ) = 1 reduces
it to the normal Black model.
The main reason why the SABR model has proven effective in the industrial
setting is that, even though it is too complex to allow for a closed form solution,
it has an accurate asymptotic solution. This solution, as well as its implications for
pricing and risk management of interest derivatives, has been described in [10].
In this paper we further refine the results presented in [10]. Our developments
go in two directions. First, we present a more systematic framework for generating
an accurate, asymptotic form of the probability distribution in the SABR model.
Secondly, we address the issue of low strikes, or the behavior of the model as the
forward rate approaches zero.
Our way of thinking has been strongly influenced by the asymptotic techniques
which go by the names of the geometric optics or the WKB method, and, most
importantly, by the classical results of Varadhan [19, 20] (see also [13, 18] for
more recent presentations and refinements). These techniques allow one to relate
the short time asymptotics of the fundamental solution (or the Green’s function)
of Kolmogorov’s equation to the differential geometry of the state space. From the
probabilistic point of view, the Green’s function represents the transition probability
of the diffusion, and it thus carries all the information about the process.
Specifically, let U denote the state space of an n-dimensional diffusion process
with no drift, and let $G_X(s, x)$, $x, X \in U$, denote the Green’s function. We also
assume that the process is time homogeneous, meaning that the diffusion matrix is
independent of $s$. Then, Varadhan’s theorem states that
\[
\lim_{s \to 0} s \log G_X(s, x) = -\frac{d(x, X)^2}{2}.
\]
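As a minimal illustration of this limit, consider the simplest possible case: standard one-dimensional Brownian motion, for which the Green's function is the Gaussian heat kernel and $d(x, X) = |x - X|$. This toy case is our own choice for illustration, and the function names are hypothetical. The sketch works with $\log G$ directly, so that very small $s$ does not cause numerical underflow.

```python
import math


def log_heat_kernel_1d(s: float, x: float, X: float) -> float:
    """Log of the transition density of standard 1D Brownian motion."""
    d = x - X
    return -d * d / (2.0 * s) - 0.5 * math.log(2.0 * math.pi * s)


def varadhan_gap(s: float, x: float, X: float) -> float:
    """Return s*log G + d^2/2, which should tend to zero as s -> 0."""
    d = abs(x - X)
    return s * log_heat_kernel_1d(s, x, X) + 0.5 * d * d
```

In this toy case the gap equals $-(s/2)\log(2\pi s)$, which vanishes as $s \to 0$, so only the distance term survives in the limit.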
The function D (ζ) represents a certain metric whose precise meaning is explained
in the body of the paper. The key object from the point of view of option pricing is
the probability distribution of forwards $P_F(\tau, f)$. Our main result in this paper is
the explicit asymptotic formula:
\[
P_F(\tau, f) = \frac{1}{\sqrt{2\pi\tau}\,\sigma C(F)\,\bigl(\cosh D(\zeta) - \rho \sinh D(\zeta)\bigr)^{3/2}}
\exp\left(-\frac{D(\zeta)^2}{2\tau v^2}\right) (1 + \cdots).
\]
In order not to burden the notation, we have written down the leading term only; the
complete formula is stated in Sect. 5. To leading order, the probability distribution of
forwards in the SABR model is Gaussian with the metric D (ζ) replacing the usual
distance.
From this probability distribution, we can deduce explicit expressions for implied
volatility. The normal volatility is given by:
Precise formulas, including the subleading terms and the impact of boundary condi-
tions at zero forward, are stated in Sect. 5. To calculate the corresponding lognormal
volatility one can use the results of [11].
We would like to mention that other stochastic volatility models have been exten-
sively studied in the literature (notably among them the Heston model [12]). Useful
presentations of these models are contained in [5, 17].
A comment on our style of exposition in this paper. We chose to present the argu-
ments in an informal manner. In order to make the presentation self-contained, we
present all the details of calculations, and do not rely on general theorems of differ-
ential geometry, stochastic calculus, or the theory of partial differential equations.
And while we believe that all the results of this paper could be stated and proved
rigorously as theorems, little would be gained and clarity might easily get lost in the
course of doing so.
The paper is organized as follows. In Sect. 2 we review the model and formulate
the basic partial differential equation, the backward Kolmogorov equation. We also
introduce the Green’s function and discuss various boundary conditions at zero. Section 3 is
devoted to the description of the differential geometry underlying the SABR model.
We show that the stochastic dynamics defining the model can be viewed as a pertur-
bation of the Brownian motion on a deformed Poincare plane. The elliptic operator
in the Kolmogorov equation turns out to be a perturbed Laplace-Beltrami opera-
tor. This differential geometric setup is key to our asymptotic analysis of the model
4 P. Hagan et al.
which is carried through in Sect. 4. In Sect. 5 we derive the explicit formulas for the
probability distribution and implied volatility which we have discussed above. In
Appendix A we review the derivation of the fundamental solution of the heat equa-
tion on the Poincare plane. This solution is the starting point of our perturbation
expansion. Finally, Appendix B contains some useful asymptotic expansions.
2 SABR Model
In this section we describe the SABR model of stochastic volatility [10]. It is a two
factor model with the dynamics given by a system of two stochastic differential
equations. The state variables of the model can be thought of as the forward price
of an asset, and a volatility parameter. In order to derive explicit expressions for the
associated probability distribution and the implied volatility, we study the Green’s
function of the backward Kolmogorov operator.
We consider a European option on a forward asset expiring T years from today. The
forward asset that we have in mind can be for instance a forward LIBOR rate, a
forward swap rate, or the forward yield on a bond. The dynamics of the forward in
the SABR model is given by1 :
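\[
\begin{aligned}
dF_t &= \Sigma_t C(F_t)\, dW_t, \\
d\Sigma_t &= v \Sigma_t\, dZ_t.
\end{aligned} \tag{1}
\]
Here $F_t$ is the forward rate process, and $W_t$ and $Z_t$ are Brownian motions with
\[
E[dW_t\, dZ_t] = \rho\, dt, \tag{2}
\]
where the correlation $\rho$ is assumed constant. We supplement the dynamics (1) with
the initial condition
\[
\begin{aligned}
F_0 &= f, \\
\Sigma_0 &= \sigma.
\end{aligned} \tag{3}
\]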
1 Note that our notation departs somewhat from the notation used in [10]: we use $\Sigma_t$ instead of $\alpha_t$
and $v$ instead of $\nu$. The name SABR is an acronym for “Stochastic Alpha Beta Rho” which was
the name of the model originally used at Paribas.
Probability Distribution in the SABR Model of Stochastic Volatility 5
Note that we assume that a suitable numeraire has been chosen so that $F_t$ is a
martingale. The process $\Sigma_t$ is the stochastic component of the volatility of $F_t$, and $v$
is the volatility of $\Sigma_t$ (the “volvol”) which is also assumed to be constant.
The function $C(x)$ is defined for $x > 0$, and is assumed to be positive and smooth,
with $1/C$ integrable around 0:
\[
\int_0^K \frac{du}{C(u)} < \infty, \quad \text{for all } K > 0. \tag{4}
\]
The so extended C(x) is an even function, C (−x) = C(x), for all values of x, and
thus the process (1) is invariant under the reflection Ft → −Ft . The state space of
the extended process is thus the upper half plane. Later on in this paper we shall
discuss the Dirichlet and Neumann boundary conditions for the SABR model.
A special case of (1) which will play an important role in our analysis is the case
of C(x) = 1, and ρ = 0. In this situation, the basic equations of motion have a
particularly simple form:
\[
\begin{aligned}
dF_t &= \Sigma_t\, dW_t, \\
d\Sigma_t &= v \Sigma_t\, dZ_t,
\end{aligned} \tag{8}
\]
with E [dWt d Z t ] = 0. We shall refer to this model as the normal SABR model.
or, explicitly,
\[
\sigma_K(T, f, \sigma)^2 = C(K)^2\, E\bigl[(\Sigma_t)^2 \,\big|\, F(0) = f,\ F_t = K,\ \Sigma(0) = \sigma\bigr]. \tag{10}
\]
Our analysis in the following sections enables us, in particular, to derive an explicit
expression for σ K .
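\[
\frac{\partial G}{\partial t} + \frac{1}{2}\sigma^2 \left( C(f)^2 \frac{\partial^2 G}{\partial f^2}
+ 2 v \rho C(f) \frac{\partial^2 G}{\partial f\, \partial \sigma}
+ v^2 \frac{\partial^2 G}{\partial \sigma^2} \right) = 0, \tag{11}
\]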
Thus $G_{T,F,\Sigma}(t, f, \sigma)$ is a Green’s function for (11). Once we have constructed
it, we can price any European option. For example, the price $C_{T,K}(t, f, \sigma)$ of a
European call option struck at $K$ and expiring at time $T$ can be written in terms of
$G_{T,F,\Sigma}(t, f, \sigma)$ as
\[
C_{T,K}(t, f, \sigma) = \iint (F - K)^+\, G_{T,F,\Sigma}(t, f, \sigma)\, dF\, d\Sigma, \tag{14}
\]
where, as usual, $(F - K)^+ = \max(F - K, 0)$, and where the integration extends
over the upper half plane $\{(F, \Sigma) \in \mathbb{R}^2 : \Sigma > 0\}$.
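Note that the process (1) is time homogeneous, and thus $G_{T,F,\Sigma}(t, f, \sigma)$ is a
function of the time to expiry $\tau = T - t$ only. Denoting
\[
G_{F,\Sigma}(\tau, f, \sigma) \equiv G_{T,F,\Sigma}(t, f, \sigma)
\]
and
\[
C_K(\tau, f, \sigma) \equiv C_{T,K}(t, f, \sigma),
\]
we find that
\[
\frac{\partial G}{\partial \tau} = \frac{1}{2}\sigma^2 \left( C(f)^2 \frac{\partial^2 G}{\partial f^2}
+ 2 v \rho C(f) \frac{\partial^2 G}{\partial f\, \partial \sigma}
+ v^2 \frac{\partial^2 G}{\partial \sigma^2} \right), \tag{15}
\]
and
\[
G_{F,\Sigma}(\tau, f, \sigma) = \delta(f - F, \sigma - \Sigma), \quad \text{at } \tau = 0. \tag{16}
\]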
This formula has familiar structure, and one of our main goals will be to derive a
useful expression for PF (τ , f ).
It is also easy to express the local volatility in terms of the Green’s function.
Indeed,
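\[
\sigma_K(\tau, f, \sigma)^2 = C(K)^2\,
\frac{\int_0^\infty \Sigma^2\, G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma}{\int_0^\infty G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma}, \tag{19}
\]
or
\[
\sigma_K(\tau, f, \sigma) = C(K) \sqrt{\frac{M_K^2(\tau, f, \sigma)}{P_K(\tau, f, \sigma)}}, \tag{20}
\]
where
\[
M_K^2(\tau, f, \sigma) = \int_0^\infty \Sigma^2\, G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma \tag{21}
\]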
\[
s = \frac{\tau}{T}, \quad x = f, \quad X = F, \quad y = \frac{\sigma}{v}, \quad Y = \frac{\Sigma}{v},
\]
and the rescaled Green’s function:
In terms of these variables, the initial value problem (15) and (16) can be recast as:
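\[
\begin{aligned}
\frac{\partial K}{\partial s} &= \frac{1}{2}\varepsilon y^2 \left( C(x)^2 \frac{\partial^2 K}{\partial x^2}
+ 2\rho C(x) \frac{\partial^2 K}{\partial x\, \partial y} + \frac{\partial^2 K}{\partial y^2} \right), \\
K(0, x, y) &= \delta(x - X, y - Y),
\end{aligned} \tag{22}
\]
where $\varepsilon = v^2 T$.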
It will be assumed that ε is small and it will serve as the parameter of our expansion.
The heuristic picture behind this idea is that the volatility varies slower than the for-
ward, and the rates of variability of f and σ/v are similar. The time T defines the time
scale of the problem, and thus s is a natural dimensionless time variable. Expressed in
terms of the new variables, our problem has a natural differential geometric content
which is key to its solution.
Finally, let us write down the equations above for the normal SABR model:
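\[
\begin{aligned}
\frac{\partial K}{\partial s} &= \frac{1}{2}\varepsilon y^2 \left( \frac{\partial^2 K}{\partial x^2} + \frac{\partial^2 K}{\partial y^2} \right), \\
K(0, x, y) &= \delta(x - X, y - Y).
\end{aligned} \tag{24}
\]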
We will show later that this initial value problem has a closed form solution.
The problem as we have formulated it so far is not complete. Since the value of the
forward rate should be positive,2 we have to specify a boundary condition for the
Green’s function at x = 0. Three commonly used boundary conditions are [9]:
2 Recent history shows that this is not always the case, but we regard such occurrences
as anomalous.
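• Dirichlet (or absorbing) boundary condition. We assume that the Green’s function,
denoted by $K^D_{X,Y}(s, x, y)$, vanishes at $x = 0$,
\[
K^D_{X,Y}(s, 0, y) = 0. \tag{25}
\]
• Neumann (or reflecting) boundary condition. The Green’s function, which we shall denote
by $K^N_{X,Y}(s, x, y)$, has a vanishing normal derivative at $x = 0$,
\[
\frac{\partial}{\partial x} K^N_{X,Y}(s, 0, y) = 0. \tag{26}
\]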
• Robin (or mixed) boundary condition. The Green’s function, which we shall denote
by $K^R_{X,Y}(s, x, y)$, satisfies the following condition. Given $\eta > 0$,
\[
\left( -\frac{\partial}{\partial x} + \eta \right) K^R_{X,Y}(s, 0, y) = 0. \tag{27}
\]
From the financial point of view, the relevant boundary conditions are the Dirichlet
and Neumann conditions. It is well known that the Green’s functions corresponding
to these different boundary conditions obey the following conditioning inequalities:
\[
K^D \le K \le K^N. \tag{28}
\]
Since the Dirichlet boundary condition corresponds to the stochastic process being
killed at the boundary, the total mass of the Green’s function is less than one:
\[
\iint K^D_{X,Y}(s, x, y)\, dx\, dy < 1. \tag{29}
\]
extended to the entire upper half plane, as explained in Sect. 2.1 (see footnote 3). Then, one verifies
readily that
\[
K^D_{X,Y}(s, x, y) = K_{X,Y}(s, x, y) - K_{X,Y}(s, -x, y), \tag{31}
\]
and
\[
K^N_{X,Y}(s, x, y) = K_{X,Y}(s, x, y) + K_{X,Y}(s, -x, y). \tag{32}
\]
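The image construction in (31) and (32) can be sketched with any free-space kernel that is symmetric under reflection. The sketch below uses a one-dimensional Gaussian heat kernel as a stand-in; this toy kernel and all function names are our own assumptions for illustration, not the SABR Green's function itself. It only shows why the difference vanishes at the boundary and why the sum has zero slope there.

```python
import math


def free_kernel(s: float, x: float, X: float) -> float:
    """Toy free-boundary kernel: 1D Gaussian heat kernel (assumed stand-in)."""
    return math.exp(-(x - X) ** 2 / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)


def dirichlet_kernel(s: float, x: float, X: float) -> float:
    """Image construction of (31): subtract the reflected kernel."""
    return free_kernel(s, x, X) - free_kernel(s, -x, X)


def neumann_kernel(s: float, x: float, X: float) -> float:
    """Image construction of (32): add the reflected kernel."""
    return free_kernel(s, x, X) + free_kernel(s, -x, X)


def d_dx(fn, s: float, x: float, X: float, h: float = 1e-6) -> float:
    """Central finite-difference derivative in x."""
    return (fn(s, x + h, X) - fn(s, x - h, X)) / (2.0 * h)
```

For positive arguments the three kernels are ordered as in (28), since the reflected term is positive.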
It is easy to write down a formal solution to the initial value problem (22). Let L
denote the partial differential operator
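\[
L = \frac{1}{2} y^2 \left( C(x)^2 \frac{\partial^2}{\partial x^2} + 2\rho C(x) \frac{\partial^2}{\partial x\, \partial y} + \frac{\partial^2}{\partial y^2} \right) \tag{33}
\]
\[
\begin{aligned}
\frac{\partial U}{\partial s} &= \varepsilon L U, \\
U(0) &= I,
\end{aligned}
\]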
and thus the Green’s function K X,Y (s, x, y) is the integral kernel of U (s):
In order to solve the problem (22) it is thus sufficient to construct the semigroup
U (s) and find its integral kernel. Keeping in mind that our goal is to find an explicit
formula for K X,Y (s, x, y), the strategy will be to represent L as the sum
\[
L = L_0 + V, \tag{36}
\]
3 This solution ignores any boundary condition at x = 0 and is sometimes referred to as the Green’s
function with a free boundary condition.
Here, the operator Q (s) is given by the well known regular perturbation expansion:
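\[
Q(s) = I + \sum_{1 \le n < \infty} \int_{0 \le s_1 \le \dots \le s_n \le s\varepsilon}
e^{s_1 \mathrm{ad}_{L_0}}(V) \cdots e^{s_n \mathrm{ad}_{L_0}}(V)\, ds_1 \cdots ds_n, \tag{39}
\]
where
\[
\mathrm{ad}_{L_0}(V) = L_0 V - V L_0. \tag{40}
\]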
We will use the first few terms in the expansion above in order to construct an
accurate approximation to the Green’s function K X,Y (s, x, y):
\[
Q(s) = I + s\varepsilon V + \frac{1}{2}(s\varepsilon)^2 \left( \mathrm{ad}_{L_0}(V) + V^2 \right) + O\bigl((s\varepsilon)^3\bigr). \tag{41}
\]
We shall disregard the convergence issues associated with this series, and use it solely
as a tool to generate an asymptotic expansion.
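The truncation (41) can be checked in a finite-dimensional toy setting. The sketch below replaces the operators $L_0$ and $V$ by small matrices and assumes that $Q$ is defined through the factorization $e^{\lambda(L_0 + V)} = Q\, e^{\lambda L_0}$ with $\lambda = s\varepsilon$, which is how (41) is used later in (75). The matrices `L0_TOY` and `V_TOY` and all helper names are hypothetical choices for illustration only.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_lin(A, B, a=1.0, b=1.0):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(A, terms=40):
    """Matrix exponential by its Taylor series (adequate for small norms)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = mat_lin(result, term)
    return result


def scale(A, c):
    return [[c * x for x in row] for row in A]


def q_exact(L0, V, lam):
    """Q = exp(lam*(L0+V)) exp(-lam*L0), the assumed definition of Q."""
    return mat_mul(mat_exp(scale(mat_lin(L0, V), lam)), mat_exp(scale(L0, -lam)))


def q_second_order(L0, V, lam):
    """Right-hand side of (41) without the O(lam^3) remainder."""
    ad = mat_lin(mat_mul(L0, V), mat_mul(V, L0), 1.0, -1.0)  # L0 V - V L0
    second = mat_lin(ad, mat_mul(V, V))
    out = mat_lin(identity(len(L0)), V, 1.0, lam)
    return mat_lin(out, second, 1.0, 0.5 * lam * lam)


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Hypothetical toy matrices standing in for the operators L_0 and V.
L0_TOY = [[0.0, 1.0], [-2.0, -0.5]]
V_TOY = [[0.3, -0.2], [0.1, 0.4]]
```

For small $\lambda$ the second-order truncation should be noticeably closer to the exact product than the first-order one, $I + \lambda V$.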
In solving our model we find that the normal SABR model represents Brownian
motion on the Poincare plane. Generally, when $\rho \neq 0$ or $C(x) \neq 1$, the model
amounts to Brownian motion on a two dimensional manifold, the SABR plane, perturbed
by a drift term. In this section we summarize a number of basic facts about
the differential geometry of the state space of the SABR model. The fundamental
geometric structure is that of the Poincare plane. We will show that the state space of
the SABR model can be viewed as a suitable deformation of the Poincare geometry.
We begin by reviewing the Poincare geometry of the upper half plane which will
serve as the standard state space of our model. For a full (and very readable) account
of the theory the reader is referred to e.g. [1].
The Poincare plane (also known as the hyperbolic or Lobachevski plane) is the
upper half plane H2 = {(x, y) : y > 0} equipped with the Poincare line element
\[
ds^2 = \frac{dx^2 + dy^2}{y^2}. \tag{42}
\]
\[
h = \frac{1}{y^2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{43}
\]
(clearly, this is a reflection with respect to the y-axis). The key fact about θ is that it
is an involution, i.e.
θ ◦ θ (z) = z. (47)
One can also write $\theta$ as $\theta(z) = -\bar{z}$, which shows that it is an anti-holomorphic map
of H2 into itself. It is easy to find the set of fixed points of θ, namely the points on
the Poincare plane which are left invariant by θ:
\[
\cosh d(z, Z) = 1 + \frac{|z - Z|^2}{2yY}, \tag{49}
\]
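Formula (49) translates directly into code. The following is a minimal sketch (the function name is our own), with points given as pairs $(x, y)$, $y > 0$.

```python
import math


def poincare_distance(z, Z):
    """Geodesic distance on the Poincare plane, from formula (49)."""
    (x, y), (X, Y) = z, Z
    cosh_d = 1.0 + ((x - X) ** 2 + (y - Y) ** 2) / (2.0 * y * Y)
    return math.acosh(cosh_d)
```

Two quick sanity checks follow from (49) itself: along a vertical line the distance between $(0, 1)$ and $(0, e)$ is 1, and rescaling both points by the same positive factor leaves the distance unchanged.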
The state space associated with the general SABR model has a somewhat more
complicated geometry. Let S2 denote the upper half plane {(x, y) : y > 0} , equipped
with the following metric g:
\[
g = \frac{1}{(1 - \rho^2)\, y^2 C(x)^2}
\begin{pmatrix} 1 & -\rho C(x) \\ -\rho C(x) & C(x)^2 \end{pmatrix}. \tag{51}
\]
This metric is a generalization of the Poincare metric: the case of ρ = 0 and C(x) = 1
reduces to the Poincare metric. In fact, the metric g is the pullback of the Poincare
metric under a suitable diffeomorphism. To see this, we define a map φ : S2 → H2 by
\[
\phi(z) = \left( \frac{1}{\sqrt{1 - \rho^2}} \left( \int_0^x \frac{du}{C(u)} - \rho y \right),\ y \right), \tag{52}
\]
where $z = (x, y)$ and $Z = (X, Y)$ are two points on $S^2$. Since $\det(g) = y^{-4} C(x)^{-2}$,
the invariant volume element on $S^2$ is given by
\[
d\mu_g(z) = \sqrt{\det(g)}\, dx\, dy = \frac{dx\, dy}{C(x)\, y^2}. \tag{55}
\]
i.e. $\theta$ is inherited from the corresponding reflection $\theta$ of the Poincare plane. Explicitly,
$\theta(x, y) = (-x, y)$. Strictly speaking, this holds only if $x \neq 0$, as the metric
(51) explodes at the boundary $x = 0$.
It is no coincidence that the SABR model leads to the Poincare geometry. Indeed,
the dynamics of the normal SABR model is given by the Brownian motion on the
Poincare plane. In this section we shall establish this relationship, and use it in
Sect. 3.3 in order to find an explicit representation of the integral kernel of (37).
Recall [13] that the Brownian motion on the Poincare plane is described by the
following system of stochastic differential equations:
\[
\begin{aligned}
dX_t &= Y_t\, dW_t, \\
dY_t &= Y_t\, dZ_t,
\end{aligned} \tag{57}
\]
with
\[
E[dW_t\, dZ_t] = 0. \tag{58}
\]
Comparing this with the special case of the normal SABR model (8), we see that (8)
reduces to (57) once we have made the following identifications:
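\[
\begin{aligned}
X_t &= F_{v^2 t}, \\
Y_t &= \frac{1}{v} \Sigma_{v^2 t},
\end{aligned} \tag{59}
\]
and used the scaling properties of a Wiener process:
\[
\begin{aligned}
dW_{v^2 t} &= v\, dW_t, \\
dZ_{v^2 t} &= v\, dZ_t.
\end{aligned}
\]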
Note that the system (57) can easily be solved in closed form: its solution is
given by
\[
\begin{aligned}
X_t &= X_0 + Y_0 \int_0^t \exp\left( Z(s) - \frac{s}{2} \right) dW(s), \\
Y_t &= Y_0 \exp\left( Z_t - \frac{t}{2} \right).
\end{aligned} \tag{60}
\]
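The second component of this solution is easy to check pathwise. For the equation $dY_t = Y_t\, dZ_t$ of (57), Ito's lemma gives $Y_t = Y_0 \exp(Z_t - t/2)$. The sketch below, with hypothetical names and an arbitrary seed, drives an Euler scheme and the closed form with the same Brownian increments and returns both terminal values; it is an illustration of the formula, not a recommended simulation scheme.

```python
import math
import random


def simulate_y(y0: float, t: float, n_steps: int, seed: int = 0):
    """Return (Euler value, closed-form value) of Y_t on one Brownian path."""
    rng = random.Random(seed)
    dt = t / n_steps
    y_euler = y0
    z = 0.0  # running value of the Brownian motion Z
    for _ in range(n_steps):
        dz = rng.gauss(0.0, math.sqrt(dt))
        y_euler += y_euler * dz  # Euler step for dY = Y dZ
        z += dz
    y_closed = y0 * math.exp(z - 0.5 * t)
    return y_euler, y_closed
```

With a few thousand steps the two values should agree to within a few percent on a typical path; the residual difference is the discretization error of the Euler scheme.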
Let us now compare the SABR dynamics with that of the diffusion on the SABR
plane. In order to find the dynamics of Brownian motion on the SABR plane we use
the fact that there is a mapping (namely, (52)) of S2 into H2 . Using this mapping and
Ito’s lemma yields the following system
\[
\begin{aligned}
dX_t &= \frac{1}{2} Y_t^2\, C(X_t)\, C'(X_t)\, dt + Y_t\, C(X_t)\, dW_t, \\
dY_t &= Y_t\, dZ_t,
\end{aligned} \tag{61}
\]
Note that this is not exactly the SABR model dynamics. Indeed, one can regard the
SABR model as the perturbation of the Brownian motion on the SABR plane by the
drift term $-\frac{1}{2} Y_t^2\, C(X_t)\, C'(X_t)\, dt$.
As in the case of the Poincare plane, it is possible to represent the solution to the
system (61) explicitly:
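\[
\begin{aligned}
\int_{X_0}^{X_t} \frac{du}{C(u)} &= Y_0 \int_0^t \exp\left( Z(s) - \frac{s}{2} \right) dW(s), \\
Y_t &= Y_0 \exp\left( Z_t - \frac{t}{2} \right).
\end{aligned} \tag{63}
\]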
\[
\begin{aligned}
dX_t &= Y_t\, C(X_t) \circ dW_t, \\
dY_t &= Y_t \circ dZ_t.
\end{aligned}
\]
Consequently, the initial value problem (22) can be written in the following geometric
form:
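\[
\begin{aligned}
\frac{\partial}{\partial s} K_Z(s, z) &= \frac{1}{2}\varepsilon\, \partial^\mu \partial_\mu K_Z(s, z), \\
K_Z(0, z) &= \delta(z - Z),
\end{aligned} \tag{64}
\]
\[
\Delta_g f = \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^\mu} \left( \sqrt{\det g}\; g^{\mu\nu} \frac{\partial f}{\partial x^\nu} \right), \tag{65}
\]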
In the case of the Poincare plane, the Laplace-Beltrami operator has the form:
\[
\Delta_h = y^2 \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right). \tag{66}
\]
As anticipated by our discussion in Sect. 3.2, this operator is closely related to the
operator L in the normal SABR model. In fact, in this case,
\[
L = \frac{1}{2} \Delta_h, \tag{67}
\]
2
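and thus the problem (24) turns out to be the initial value problem for the heat equation
on $H^2$:
\[
\begin{aligned}
\frac{\partial K_Z}{\partial s} &= \frac{1}{2}\varepsilon \Delta_h K_Z, \\
K_Z(0, z) &= \delta(z - Z).
\end{aligned} \tag{68}
\]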
The key fact is that the Green’s function for this equation can be represented in closed
form,
\[
K_Z^h(s, z) = \frac{e^{-s\varepsilon/8} \sqrt{2}}{(2\pi s\varepsilon)^{3/2}\, Y^2}
\int_{d(z,Z)}^\infty \frac{u\, e^{-u^2/2s\varepsilon}}{\sqrt{\cosh u - \cosh d(z, Z)}}\, du. \tag{69}
\]
This formula was originally derived by McKean [16] (see also [13] and references
therein). We have added the superscript h to indicate that this Green’s function
is associated with the Poincare metric. In Appendix A we outline an elementary
derivation of this fact.
Let us now extend the discussion above to the general case. We note first that,
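except for the case of $C(x) = 1$, the operator $\partial^\mu \partial_\mu$ does not coincide with the
Laplace-Beltrami operator $\Delta_g$ on $S^2$ associated with the metric (51). It is, however,
easy to verify that
\[
\begin{aligned}
\partial^\mu \partial_\mu f &= \Delta_g f - \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^\nu} \left( \sqrt{\det g}\; g^{\mu\nu} \right) \frac{\partial f}{\partial x^\mu} \\
&= \Delta_g f - \frac{1}{\sqrt{1 - \rho^2}}\, y^2 C C' \frac{\partial f}{\partial x},
\end{aligned}
\]
and thus
\[
\begin{aligned}
L &= \frac{1}{2} \Delta_g - \frac{1}{2\sqrt{1 - \rho^2}}\, y^2 C C' \frac{\partial}{\partial x} \\
&= L_0 + V,
\end{aligned}
\]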
where
\[
L_0 = \frac{1}{2} \Delta_g, \tag{70}
\]
and V (x) is lower order:
\[
V = -\frac{1}{2\sqrt{1 - \rho^2}}\, y^2 C(x) C'(x) \frac{\partial}{\partial x}. \tag{71}
\]
\[
\phi \circ \Delta_g = \Delta_h \circ \phi, \tag{72}
\]
\[
\frac{\partial K}{\partial s} = \frac{1}{2}\varepsilon \Delta_g K
\]
on $S^2$ can be solved in closed form! The Green’s function $K_Z^g(s, z)$ of this equation
is related to (69) by
\[
K_Z^g(s, z) = \det\bigl(\nabla\phi(Z)\bigr)\, K^h_{\phi(Z)}\bigl(s, \phi(z)\bigr). \tag{73}
\]
Explicitly,
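\[
K_Z^g(s, z) = \frac{e^{-s\varepsilon/8} \sqrt{2}}{(2\pi s\varepsilon)^{3/2} \sqrt{1 - \rho^2}\; Y^2 C(X)}
\int_\delta^\infty \frac{u\, e^{-u^2/2s\varepsilon}}{\sqrt{\cosh u - \cosh \delta}}\, du, \tag{74}
\]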
where δ = δ (z, Z ) is the geodesic distance (54) on S2 . This is the explicit represen-
tation of the integral kernel of the operator U0 (s).
4 Asymptotic Expansion
In principle, we have now completed our task of solving the initial value problem
(24). Indeed, its solution is given by
\[
K_Z(s, z) = Q(s)\, K_Z^g(s, z), \tag{75}
\]
where Q (s) is the perturbation expansion given by (39). In order to produce clear
results that can readily be used in practice we perform now a perturbation expansion
on the expression above. Our method allows one to calculate the Green’s function of
the model to the desired order of accuracy.
Let us start with the Green’s function $K_Z^h(s, z)$ which is defined on the Poincare
plane. In Appendix B we derived an asymptotic expansion (117) for the heat kernel
on the Poincare plane. After rescaling as in (106), we arrive at
\[
K_Z^h(s, z) = \frac{1}{2\pi\lambda Y^2} \exp\left( -\frac{d^2}{2\lambda} \right)
\sqrt{\frac{d}{\sinh d}} \left[ 1 - \frac{1}{8} \left( \frac{d \coth d - 1}{d^2} + 1 \right) \lambda + O(\lambda^2) \right],
\quad \lambda = s\varepsilon. \tag{76}
\]
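The expansion (76) can be compared numerically with the integral representation (69). The sketch below evaluates (69), written in terms of $\lambda = s\varepsilon$, by the midpoint rule after the substitution $u = d + w^2$, which removes the inverse square root singularity at $u = d$. The quadrature parameters `n` and `w_max` and the function names are our own choices, and the code assumes $d > 0$.

```python
import math


def mckean_kernel(lam, d, Y=1.0, n=20000, w_max=2.0):
    """Integral representation (69), with lam = s*eps, by the midpoint rule."""
    h = w_max / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        u = d + w * w
        # cosh u - cosh d = 2 sinh((u+d)/2) sinh((u-d)/2); stable for small w
        gap = 2.0 * math.sinh(0.5 * (u + d)) * math.sinh(0.5 * w * w)
        # du = 2 w dw
        total += 2.0 * w * u * math.exp(-u * u / (2.0 * lam)) / math.sqrt(gap)
    prefactor = math.exp(-lam / 8.0) * math.sqrt(2.0) / (
        (2.0 * math.pi * lam) ** 1.5 * Y * Y)
    return prefactor * total * h


def asymptotic_kernel(lam, d, Y=1.0, order=1):
    """Expansion (76), truncated after the leading (order=0) or linear term."""
    lead = (math.exp(-d * d / (2.0 * lam)) * math.sqrt(d / math.sinh(d))
            / (2.0 * math.pi * lam * Y * Y))
    if order == 0:
        return lead
    c1 = ((d / math.tanh(d) - 1.0) / (d * d) + 1.0) / 8.0
    return lead * (1.0 - c1 * lam)
```

For small $\lambda$ the ratio of the two evaluations should be close to one, and including the term linear in $\lambda$ should bring it closer than the leading term alone.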
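We can now extend the expression to the general Green’s function $K_Z^g(s, z)$. Using
(73) or (74) we find that $K_Z^g(s, z)$ has the following asymptotic expansion:
\[
K_Z^g(s, z) = \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; Y^2 C(X)} \exp\left( -\frac{\delta^2}{2\lambda} \right)
\sqrt{\frac{\delta}{\sinh \delta}} \left[ 1 - \frac{1}{8} \left( \frac{\delta \coth \delta - 1}{\delta^2} + 1 \right) \lambda + O(\lambda^2) \right].
\]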
To complete the calculation in the case of general C(x) we need to take into
account the contribution to the Green’s function coming from perturbation V defined
in (71). Let us define the function:
which yields the following asymptotic formula for the Green’s function:
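\[
\begin{aligned}
K_Z(s, z) = {} & \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; Y^2 C(X)} \exp\left( -\frac{\delta^2}{2\lambda} \right)
\sqrt{\frac{\delta}{\sinh \delta}} \\
& \times \left[ 1 - \frac{\delta}{\sinh \delta}\, q
- \left( \frac{1}{8} + \frac{\delta \coth \delta - 1}{8\delta^2}
- \frac{3(1 - \delta \coth \delta) + \delta^2}{8\delta \sinh \delta}\, q \right) \lambda + O(\lambda^2) \right].
\end{aligned} \tag{79}
\]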
In a way, this is the central result of this paper. It gives us a precise asymptotic
behavior of the Green’s function of the SABR model, as λ → 0.
5 Volatility Smile
We are now ready to complete our analysis. Given the explicit form of the approximate
Green’s function, we can calculate (via another asymptotic expansion) the marginal
probability distribution. Comparing the result with the normal probability distribution
allows us to find the implied normal and lognormal volatilities, as functions of the
model parameters. We conclude this section by deriving explicit formulas for the
case of the CEV model $C(x) = x^\beta$ and the shifted lognormal model $C(x) = x + a$.
Here the metric δ (z, Z ) is defined implicitly by (54). We evaluate this integral asymp-
totically by using Laplace’s method (steepest descent). This analysis is carried out
in Appendix B.2. The key step is to analyze the argument Y of the exponent
\[
\phi(Y) = \frac{1}{2} \delta(z, Z)^2, \tag{81}
\]
in order to find the point $Y_0$ where this function is at a minimum. Let us introduce
the notation:
\[
\zeta = \frac{1}{y} \int_X^x \frac{du}{C(u)}.
\]
Since $yC(u)$ is basically the rescaled volatility at forward $u$, $1/\zeta$ represents the
average volatility between today’s forward $x$ and the option’s strike $X$. In other words,
$\zeta$ represents how “easy” it is to reach the strike $X$. Some algebra shows that the
minimum of (81) occurs at $Y_0 = Y_0(\zeta, y)$, where
\[
Y_0 = y \sqrt{\zeta^2 - 2\rho\zeta + 1}. \tag{82}
\]
The meaning of Y0 is clear: it is the “most likely value” of Y , and thus Y0 C (X ) (when
expressed in the original units) should be the leading contribution to the observed
implied volatility. Also, let D (ζ) denote the value of δ (z, Z ) with Y = Y0 . Explicitly,
\[
D(\zeta) = \log \frac{\sqrt{\zeta^2 - 2\rho\zeta + 1} + \zeta - \rho}{1 - \rho}. \tag{83}
\]
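The minimization leading to (82) and (83) can be checked numerically. The sketch below assumes that the geodesic distance $\delta(z, Z)$ on $S^2$ is the Poincare distance (49) between the images $\phi(z)$ and $\phi(Z)$ under the map (52), as the pullback statement suggests. It takes $C(x) = 1$ by default, so that $\int_0^x du/C(u) = x$; this special case and all function names are our own choices for illustration.

```python
import math


def poincare_distance(p, q):
    """Distance (49) on the Poincare plane."""
    (x, y), (X, Y) = p, q
    return math.acosh(1.0 + ((x - X) ** 2 + (y - Y) ** 2) / (2.0 * y * Y))


def phi(z, rho, int_inv_c):
    """The map (52); int_inv_c(x) is the integral of du/C(u) from 0 to x."""
    x, y = z
    return ((int_inv_c(x) - rho * y) / math.sqrt(1.0 - rho * rho), y)


def sabr_distance(z, Z, rho, int_inv_c=lambda x: x):
    """Assumed form of delta(z, Z): Poincare distance of the images under phi."""
    return poincare_distance(phi(z, rho, int_inv_c), phi(Z, rho, int_inv_c))


def big_i(zeta, rho):
    """The square root appearing in (82)."""
    return math.sqrt(zeta * zeta - 2.0 * rho * zeta + 1.0)


def big_d(zeta, rho):
    """The function D(zeta) of (83)."""
    return math.log((big_i(zeta, rho) + zeta - rho) / (1.0 - rho))
```

With $x = 1$, $X = 0.7$, $y = 0.4$ and $\rho = -0.3$, for instance, one has $\zeta = 0.75$, and the distance evaluated at $Y_0 = y\sqrt{\zeta^2 - 2\rho\zeta + 1}$ should coincide with $|D(\zeta)|$ and be smaller than at neighboring values of $Y$.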
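The analysis in Appendix B.3 shows that the probability distribution for $x$ is Gaussian
in this minimum distance, at least to leading order. Specifically, it is shown there that
to within $O(\lambda^2)$,
\[
\begin{aligned}
P_X(s, x, y) = \frac{1}{\sqrt{2\pi\lambda}} \frac{1}{y C(X) I^{3/2}} \exp\left( -\frac{D^2}{2\lambda} \right)
\Biggl\{ & 1 + \frac{y C'(x) D}{2\sqrt{1 - \rho^2}\, I} \\
& - \frac{1}{8} \lambda \Biggl[ 1 + \frac{y C'(x) D}{2\sqrt{1 - \rho^2}\, I}
+ \frac{6\rho y C'(x)}{\sqrt{1 - \rho^2}\, I^2} \cosh(D) \\
& \quad - \left( \frac{3(1 - \rho^2)}{I} + \frac{3 y C'(x) (5 - \rho^2) D}{2\sqrt{1 - \rho^2}\, I^2} \right) \frac{\sinh(D)}{D} \Biggr] + \cdots \Biggr\},
\end{aligned} \tag{84}
\]
where
\[
\begin{aligned}
I(\zeta) &= \sqrt{\zeta^2 - 2\rho\zeta + 1} \\
&= \cosh D(\zeta) - \rho \sinh D(\zeta).
\end{aligned} \tag{85}
\]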
As this expression may be useful on its own, we rewrite it in terms of the original
variables:
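\[
\begin{aligned}
P_F(\tau, f, \sigma) = \frac{1}{\sqrt{2\pi\tau}} \frac{1}{\sigma C(F) I^{3/2}} \exp\left( -\frac{D^2}{2\tau v^2} \right)
\Biggl\{ & 1 + \frac{\sigma C'(f) D}{2v\sqrt{1 - \rho^2}\, I} \\
& - \frac{1}{8} \tau v^2 \Biggl[ 1 + \frac{\sigma C'(f) D}{2v\sqrt{1 - \rho^2}\, I}
+ \frac{6\rho \sigma C'(f)}{v\sqrt{1 - \rho^2}\, I^2} \cosh(D) \\
& \quad - \left( \frac{3(1 - \rho^2)}{I} + \frac{3 \sigma C'(f) (5 - \rho^2) D}{2v\sqrt{1 - \rho^2}\, I^2} \right) \frac{\sinh(D)}{D} \Biggr] + \cdots \Biggr\},
\end{aligned} \tag{86}
\]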
where we have slightly abused the notation. This is the desired asymptotic form of
the marginal probability distribution.
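As a rough sanity check of the leading term of (86), one can integrate it over $F$ and confirm that the total mass is close to one when $\tau v^2$ is small. The sketch below does this for the normal SABR model, $C(x) = 1$, where $\zeta = (v/\sigma)(f - F)$ and the correction terms involving $C'$ vanish. The parameter values, the trapezoidal quadrature and all function names are illustrative assumptions.

```python
import math


def big_i(zeta, rho):
    """I(zeta) from (85)."""
    return math.sqrt(zeta * zeta - 2.0 * rho * zeta + 1.0)


def big_d(zeta, rho):
    """D(zeta) from (83)."""
    return math.log((big_i(zeta, rho) + zeta - rho) / (1.0 - rho))


def leading_density(F, tau, f, sigma, v, rho):
    """Leading term of (86) for C(x) = 1 (assumed special case)."""
    zeta = (v / sigma) * (f - F)
    i_val = big_i(zeta, rho)
    d_val = big_d(zeta, rho)
    return (math.exp(-d_val * d_val / (2.0 * tau * v * v))
            / (math.sqrt(2.0 * math.pi * tau) * sigma * i_val ** 1.5))


def total_mass(tau, f, sigma, v, rho, n=4000, width=12.0):
    """Trapezoidal integral of the leading density over F around f."""
    half = width * sigma * math.sqrt(tau)
    h = 2.0 * half / n
    total = 0.0
    for k in range(n + 1):
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * leading_density(f - half + k * h, tau, f, sigma, v, rho)
    return total * h
```

The mass is not exactly one, since the omitted terms are of order $\tau v^2$; for $\tau v^2 \approx 0.01$ the deviation should be well below one percent.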
The normal implied volatility is given by Sect. 2.2, and we are thus left with the task
of calculating the conditional second moment. Explicitly,
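\[
\begin{aligned}
M_X^2(s, x, y) &= \int_0^\infty Y^2 K_Z(s, z)\, dY \\
&= \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; C(X)} \int_0^\infty e^{-\delta^2/2\lambda}
\sqrt{\frac{\delta}{\sinh \delta}} \Biggl[ 1 - \frac{\delta}{\sinh \delta}\, q \\
&\qquad - \frac{1}{8} \lambda \left( 1 + \frac{\delta \coth \delta - 1}{\delta^2}
- \frac{3(1 - \delta \coth \delta) + \delta^2}{\delta \sinh \delta}\, q \right) \Biggr] dY.
\end{aligned} \tag{87}
\]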
Despite their complicated appearances, the two expressions have a lot in common,
and their ratio has a rather simple form. After the dust settles, we find that
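\[
\sigma_K(\tau, f, \sigma)^2 = \sigma^2 C(f)^2 I(\zeta)^2
\left[ 1 + \frac{2\sigma C'(f) \bigl( \rho \cosh(D) - \sinh(D) \bigr)}{\sigma C'(f) D I + 2\sqrt{1 - \rho^2}\, I^2 v}\, \tau v^2 + \cdots \right], \tag{88}
\]
or
\[
\sigma_K(\tau, f, \sigma) = \sigma C(f) I(\zeta)
\left[ 1 + \frac{\sigma C'(f) \bigl( \rho \cosh(D) - \sinh(D) \bigr)}{\sigma C'(f) D I + 2\sqrt{1 - \rho^2}\, I^2 v}\, \tau v^2 + \cdots \right]. \tag{89}
\]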
\[
\zeta = \frac{v}{\sigma} \log \frac{f}{F}. \tag{90}
\]
\[
\zeta = \frac{v}{\sigma} \log \frac{f + a}{F + a}. \tag{92}
\]
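Both closed forms can be cross-checked against the definition $\zeta = \frac{1}{y} \int_X^x du / C(u)$ with $y = \sigma/v$, $x = f$ and $X = F$. The sketch below evaluates the integral by the composite Simpson rule. It assumes that (90) corresponds to $C(u) = u$ and (92) to $C(u) = u + a$; the function names and parameter values are illustrative.

```python
import math


def zeta_numeric(f, F, sigma, v, C, n=2000):
    """zeta = (v/sigma) * integral from F to f of du/C(u); Simpson, n even."""
    h = (f - F) / n
    total = 1.0 / C(F) + 1.0 / C(f)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) / C(F + k * h)
    return (v / sigma) * total * h / 3.0


def zeta_lognormal(f, F, sigma, v):
    """Closed form (90), assumed to correspond to C(u) = u."""
    return (v / sigma) * math.log(f / F)


def zeta_shifted(f, F, sigma, v, a):
    """Closed form (92) for the shifted lognormal model C(u) = u + a."""
    return (v / sigma) * math.log((f + a) / (F + a))
```

The same quadrature routine can be reused for other choices of $C$, for example $C(u) = u^\beta$, where no logarithm appears.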
Our analysis so far has been based on the assumption of free boundary conditions
at zero forward. In the case of $\rho = 0$, we can tackle the Dirichlet and Neumann
boundary conditions explicitly.
As explained in Sect. 2.3, the Green’s functions corresponding to the Dirichlet and
Neumann boundary conditions at zero forward can easily be calculated, using the
method of images, in terms of the Green’s function with free boundary conditions.
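\[
\begin{aligned}
P_F^{\mathrm{Dirichlet}}(\tau, f, \sigma) &= P_F(\tau, f, \sigma) - P_F(\tau, -f, \sigma), \\
P_F^{\mathrm{Neumann}}(\tau, f, \sigma) &= P_F(\tau, f, \sigma) + P_F(\tau, -f, \sigma).
\end{aligned} \tag{93}
\]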
Analogous formulas hold for the conditional second moments. We can now easily find
asymptotic expressions for the implied volatilities corresponding to these boundary
conditions.
In order to keep the appearance of the otherwise unwieldy formulas reasonable,
we shall introduce some additional notation. Let
\[
I^\theta = I(\zeta^\theta), \tag{94}
\]
where
\[
\zeta^\theta = \frac{1}{y} \int_X^{\theta(x)} \frac{du}{C(u)}. \tag{95}
\]
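\[
\sigma_K^\eta(\tau, f, \sigma) = \sigma C(K)\, I\, \frac{1}{1 - \eta\gamma + \eta^2\gamma^2}
\left[ 1 + \frac{\sigma C'(f) \bigl( \rho \cosh(D) - \sinh(D) \bigr)}{\sigma C'(f) D I + 2\sqrt{1 - \rho^2}\, I^2 v}\, \tau v^2 + \cdots \right]. \tag{98}
\]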
It is worthwhile to note that for large strikes all three of these quantities are practically
equal, and one might as well work with the free boundary condition expression.
Indeed, in this case, $\gamma \approx 0$, and so $1 - \eta\gamma + \eta^2\gamma^2 \approx 1$. Also, we see from this
expression that, at least asymptotically,
$$\sigma_K^{\mathrm{Dirichlet}}(\tau, f, \sigma) < \sigma_K^{\mathrm{free}}(\tau, f, \sigma) < \sigma_K^{\mathrm{Neumann}}(\tau, f, \sigma). \tag{99}$$
This result is intuitively clear, and (98) quantifies it in a way that can be used for
position management purposes. The decision which boundary condition to adopt
should be made based on specific market conditions.
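$$Q = i\left(y\frac{\partial}{\partial y} - \frac{1}{2}\right) + y\frac{\partial}{\partial x}. \tag{101}$$

$$Q^\dagger = i\left(y\frac{\partial}{\partial y} - \frac{1}{2}\right) - y\frac{\partial}{\partial x}, \tag{102}$$

$$\frac{1}{2}\left(Q Q^\dagger + Q^\dagger Q\right) = -\Delta_h - \frac{1}{4}. \tag{103}$$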
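$$\begin{aligned}
(\Phi \,|\, -\Delta_h \Phi) &= \frac{1}{2}(\Phi \,|\, Q Q^\dagger \Phi) + \frac{1}{2}(\Phi \,|\, Q^\dagger Q \Phi) + \frac{1}{4}(\Phi \,|\, \Phi)\\
&= \frac{1}{2}(Q^\dagger \Phi \,|\, Q^\dagger \Phi) + \frac{1}{2}(Q\Phi \,|\, Q\Phi) + \frac{1}{4}(\Phi \,|\, \Phi)\\
&\geq \frac{1}{4}(\Phi \,|\, \Phi),
\end{aligned}$$
where we have used the fact that $(\Phi \,|\, \Phi) \geq 0$, for all functions $\Phi \in H$. As a
consequence, we have established that the spectrum of the operator $-\Delta_h$ is bounded
from below by 1/4! This fact was first proved in [16].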
$$\begin{aligned}
\frac{\partial}{\partial s} G_Z(s, z) &= \Delta_h G_Z(s, z),\\
G_Z(0, z) &= Y^2\,\delta(z - Z),
\end{aligned} \tag{104}$$
Note that, up to the factor of Y 2 in front of the delta function and a trivial time
rescaling, this is exactly the initial value problem (68):
The Green’s function $G_Z(s, z)$ is also referred to as the heat kernel⁴ on $H^2$. The
reason for inserting the factor of $Y^2$ in front of $\delta(z - Z)$ is that the distribution
$Y^2\delta(z - Z)$ is invariant under the action (44) of the Lie group $SL(2, \mathbb{R})$. In fact, we
verify readily that
$$Y^2\,\delta(z - Z) = \frac{1}{\pi}\,\delta\left(\cosh d(z, Z) - 1\right).$$
4 It is the integral kernel of the semigroup of operators generated by the heat equation.
Now, since the initial value problem (105) is invariant under $SO(2, \mathbb{R})$, its solution
must be invariant and thus a function of $d(z, Z)$ only. Let $r = \cosh d(z, Z)$, and
write $G_Z(s, z) = \varphi(s, r)$. Then the heat equation in (105) takes the form
$$\frac{\partial}{\partial s}\varphi(s, r) = \left(r^2 - 1\right)\frac{\partial^2}{\partial r^2}\varphi(s, r) + 2r\,\frac{\partial}{\partial r}\varphi(s, r). \tag{107}$$
We have established above that the operator $-\Delta_h$ is self-adjoint on the Hilbert space
$H$, and its spectrum is bounded from below by 1/4. Therefore, we shall seek the solution
as the Laplace transform
$$\varphi(s, r) = \int_{1/4}^{\infty} e^{-s\lambda}\, L(\lambda, r)\, d\lambda \tag{108}$$
where
$$\nu = -\frac{1}{2} \pm i\sqrt{\lambda - \frac{1}{4}} = -\frac{1}{2} \pm i\omega,$$
and recognize in (109) the Legendre equation. Note that, as a consequence of the
inequality $\lambda \geq 1/4$, $\omega$ is real and $\operatorname{Re}\nu = -1/2$.
In the remainder of this appendix, we will use the well known properties of the
solutions to the Legendre equation, and follow Chaps. 7 and 8 of Lebedev’s book on
special functions [15]. The general solution to (109) is a linear combination of the
Legendre functions of the first and second kinds, P−1/2+iω (r ) and Q −1/2+iω (r ),
respectively:
$$L\left(\tfrac{1}{4} + \omega^2, r\right) = A_\omega P_{-1/2+i\omega}(r) + B_\omega Q_{-1/2+i\omega}(r). \tag{110}$$
which would imply that ϕ (s, cosh d) is singular at d = 0, for all values of s > 0.
Since this is impossible, we conclude that Bω = 0. Note that, on the other hand,
1
Aω = tanh (πω) .
2π
Note that this relation can be viewed as a spectral representation for the unbounded
self-adjoint Laplace-Beltrami operator on the Poincare plane.
Now, the Legendre function of the first kind P−1/2+iω (r ) has the following
integral representation:
$$P_{-1/2+i\omega}(\cosh d) = \frac{\sqrt{2}}{\pi}\,\coth(\pi\omega)\int_d^{\infty}\frac{\sin(\omega u)}{\sqrt{\cosh u - \cosh d}}\,du, \tag{115}$$
This is McKean’s closed form representation of the Green’s function of the heat
equation on the Poincare plane [16].
Going back to the original normalization conventions of (68) yields formula (69).
5 Strictly speaking, we will deal with distributions rather than functions. A rigor oriented reader can
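We shall first establish a short time asymptotic expansion of McKean’s kernel. This
expansion plays a key role in the analysis of the Green’s function of the SABR model.
In the right hand side of (116) we substitute $u = \sqrt{4sw + d^2}$:
$$G_Z(s, z) = \frac{e^{-s/4}\sqrt{2}}{4\pi^{3/2}\sqrt{s}}\; e^{-d^2/4s}\int_0^{\infty}\frac{e^{-w}\,dw}{\sqrt{\cosh\sqrt{4sw + d^2} - \cosh d}}.$$
Expanding the integrand for small $s$ gives
$$G_Z(s, z) = \frac{e^{-s/4}}{4\pi s}\exp\left(-\frac{d^2}{4s}\right)\sqrt{\frac{d}{\sinh d}}\left[1 - \frac{1}{4}\,\frac{d\coth d - 1}{d^2}\,s + O\left(s^2\right)\right],$$
and we thus obtain the following asymptotic expansion of the McKean kernel:
$$G_Z(s, z) = \frac{1}{4\pi s}\exp\left(-\frac{d^2}{4s}\right)\sqrt{\frac{d}{\sinh d}}\left[1 - \frac{1}{4}\left(\frac{d\coth d - 1}{d^2} + 1\right)s + O\left(s^2\right)\right]. \tag{117}$$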
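The short time expansion above can be checked numerically. The following is a minimal sketch, assuming the integral representation obtained after the substitution $u = \sqrt{4sw + d^2}$ and the first-order expansion (117) with a square root on $d/\sinh d$; the function names, the further substitution $w = v^2$ (used only to remove the integrable endpoint singularity) and the midpoint rule are illustrative choices, not part of the original text.

```python
import math

def mckean_kernel(s, d, n=4000, v_max=7.0):
    """Integral form of G_Z(s, z) with w = v**2, evaluated by the midpoint rule."""
    h = v_max / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h  # midpoints never hit the removable singularity at v = 0
        u = math.sqrt(4.0 * s * v * v + d * d)
        total += 2.0 * v * math.exp(-v * v) / math.sqrt(math.cosh(u) - math.cosh(d))
    integral = total * h
    prefactor = math.exp(-s / 4.0) * math.sqrt(2.0) / (4.0 * math.pi ** 1.5 * math.sqrt(s))
    return prefactor * math.exp(-d * d / (4.0 * s)) * integral

def mckean_short_time(s, d, order=1):
    """Expansion (117): leading term (order=0) or leading term with the O(s) correction."""
    lead = math.exp(-d * d / (4.0 * s)) / (4.0 * math.pi * s) * math.sqrt(d / math.sinh(d))
    if order == 0:
        return lead
    c1 = 0.25 * ((d / math.tanh(d) - 1.0) / (d * d) + 1.0)
    return lead * (1.0 - c1 * s)

if __name__ == "__main__":
    s, d = 0.02, 1.0
    exact = mckean_kernel(s, d)
    for order in (0, 1):
        approx = mckean_short_time(s, d, order)
        print(order, abs(approx / exact - 1.0))
```

Including the $O(s)$ term should reduce the relative error from order $s$ to order $s^2$.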
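Taking the derivative of $G_Z(s, z)$ with respect to $d(z, Z)$ in the expansion above,
we find that
$$\frac{\partial}{\partial d}G_Z(s, z) = \frac{1}{4\pi s}\exp\left(-\frac{d^2}{4s}\right)\sqrt{\frac{d}{\sinh d}}\left[-\frac{d}{2s} + \frac{d}{8}\left(1 + 3\,\frac{1 - d\coth d}{d^2}\right) + O(s)\right]. \tag{118}$$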
Next we review the Laplace method (see e.g. [2, 3]) which allows one to evaluate
approximately integrals of the form:
$$\int_0^{\infty} f(u)\, e^{-\phi(u)/\epsilon}\, du. \tag{119}$$
We use this method in order to evaluate the marginal probability distribution for the
Green’s function.
In the integral (119), $\epsilon$ is a small parameter, and $f(u)$ and $\phi(u)$ are smooth
functions on the interval $[0, \infty)$.⁶ We also assume that $\phi(u)$ has a unique minimum
$u_0$ inside the interval with $\phi''(u_0) > 0$. The idea is that, as $\epsilon \to 0$, the value of the
integral is dominated by the quadratic approximation to $\phi(u)$ around $u_0$.
More precisely, we have the following asymptotic expansion. As $\epsilon \to 0$,
$$\int_0^{\infty} f(u)\,e^{-\phi(u)/\epsilon}\,du = \sqrt{\frac{2\pi\epsilon}{\phi''(u_0)}}\; e^{-\phi(u_0)/\epsilon}\left\{ f(u_0) + \epsilon\left[\frac{f''(u_0)}{2\phi''(u_0)} - \frac{\phi^{(4)}(u_0)\, f(u_0)}{8\phi''(u_0)^2} - \frac{f'(u_0)\,\phi^{(3)}(u_0)}{2\phi''(u_0)^2} + \frac{5\,\phi^{(3)}(u_0)^2 f(u_0)}{24\,\phi''(u_0)^3}\right] + O\left(\epsilon^2\right)\right\}. \tag{120}$$
To generate this expansion, we first expand $f(u)$ and $\phi(u)$ in Taylor series around $u_0$
to orders 2 and 4, respectively (keep in mind that the first order term in the expansion
of $\phi(u)$ is zero). Then, expanding the regular terms in the exponential, we organize
the integrand as $e^{-\phi''(u_0)(u-u_0)^2/2\epsilon}$ times a polynomial in $\epsilon$. In the limit $\epsilon \to 0$, the
integral reduces to calculating moments of the Gaussian measure; the result is (120).
It is straightforward to compute terms of order higher than 1 in $\epsilon$, even though the
calculations become increasingly complex as the order increases.
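As an illustration, expansion (120) can be tested on an integral with a known closed form. The sketch below assumes the test functions $\phi(u) = u - 1 - \ln u$ (minimum at $u_0 = 1$) and $f(u) = u^a$, which are chosen here only because the integral then reduces to a Gamma function; they do not appear in the original text, and the function names are hypothetical.

```python
import math

def laplace_first_order(f0, f1, f2, phi0, phi2, phi3, phi4, eps):
    """Right-hand side of (120), truncated after the O(eps) term.

    f0, f1, f2 are f, f', f'' at u0; phi0, phi2, phi3, phi4 are phi and its
    second, third and fourth derivatives at u0.
    """
    lead = math.sqrt(2.0 * math.pi * eps / phi2) * math.exp(-phi0 / eps)
    corr = (f2 / (2.0 * phi2)
            - phi4 * f0 / (8.0 * phi2 ** 2)
            - f1 * phi3 / (2.0 * phi2 ** 2)
            + 5.0 * phi3 ** 2 * f0 / (24.0 * phi2 ** 3))
    return lead * (f0 + eps * corr)

def exact_gamma_integral(a, eps):
    """Exact value of int_0^inf u**a * exp(-(u - 1 - ln u)/eps) du via log-Gamma."""
    n = 1.0 / eps
    return math.exp(n + math.lgamma(n + a + 1.0) - (n + a + 1.0) * math.log(n))

if __name__ == "__main__":
    a, eps = 2.0, 0.01
    # At u0 = 1: f = 1, f' = a, f'' = a(a-1); phi = 0, phi'' = 1, phi''' = -2, phi'''' = 6.
    approx = laplace_first_order(1.0, a, a * (a - 1.0), 0.0, 1.0, -2.0, 6.0, eps)
    exact = exact_gamma_integral(a, eps)
    print(approx, exact, abs(approx / exact - 1.0))
```

For $a = 0$ the bracket in (120) reduces to $1 + \epsilon/12$, which is the first correction in Stirling's formula.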
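Finally, let us state a slight generalization of (120), which we use below. In the
integral (119), we replace $f(u)$ by $f(u) + \epsilon g(u)$. Then, as $\epsilon \to 0$,
$$\int_0^{\infty}\left[f(u) + \epsilon g(u)\right]e^{-\phi(u)/\epsilon}\,du = \sqrt{\frac{2\pi\epsilon}{\phi''(u_0)}}\; e^{-\phi(u_0)/\epsilon}\left\{ f(u_0) + \epsilon\left[g(u_0) + \frac{f''(u_0)}{2\phi''(u_0)} - \frac{\phi^{(4)}(u_0)\, f(u_0)}{8\phi''(u_0)^2} - \frac{f'(u_0)\,\phi^{(3)}(u_0)}{2\phi''(u_0)^2} + \frac{5\,\phi^{(3)}(u_0)^2 f(u_0)}{24\,\phi''(u_0)^3}\right] + O\left(\epsilon^2\right)\right\}. \tag{121}$$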
We shall now apply formula (121) to evaluate the integrals (80) and (87). Each of
these integrals is of the form given by the right hand side of (121). We find easily
that the minimum $Y_0$ of the function
$$\phi(Y) = \frac{1}{2}\,\delta(z, Z)^2$$
is given by
$$Y_0 = y\sqrt{\zeta^2 - 2\rho\zeta + 1},$$
where
$$\zeta = \frac{1}{y}\int_X^x\frac{du}{C(u)}.$$
and
$$I(\zeta) = \sqrt{\zeta^2 - 2\rho\zeta + 1}.$$
D
φ (Y0 ) = ,
1 − ρ2 y 2 I sinh D
3D
φ(3) (Y0 ) = − ,
1 − ρ2 y 3 I 2 sinh D
and
3 (1 − D coth D) 12D
φ(4) (Y0 ) = 2 + .
1−ρ 2 4 2 2
y I sinh D 1 − ρ y 4 I 3 sinh D
2
It is actually easier to begin the calculation with (87). In order to evaluate the
various terms on the right hand side of (121), let us define
δ δ
f (Y ) = 1− q ,
sinh δ sinh δ
and
δ 1 δ coth δ − 1 3 (1 − δ coth δ) + δ 2
g (Y ) = − + − q .
sinh δ 8 8δ 2 8δ sinh δ
3/2
D C (x) (sinh (D) − ρ cosh (D))
f (Y0 ) = − 3/2 ,
sinh D 2 1 − ρ2 I2
D 1 − D coth D 3yC (x) D
f (Y0 ) = 1+
sinh D 2 1 − ρ2 y 2 I D sinh D 2 1 − ρ2 I
3/2
D C (x) (sinh (D) − ρ cosh (D))
+ 3/2 ,
sinh D 1 − ρ2 yI3
and
1 D 1 − D coth D 3 (1 − D coth D) + D 2
g (Y0 ) = − 1− + yC (x) .
8 sinh D D2 2 1 − ρ2 I D
as claimed in Sect. 5.
Let us now compute (80). We note that the functions f and g in (121) occurring in
this integral are obtained from the corresponding functions in (87) by dividing them
by $Y^2$. We thus define
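$$\tilde f(Y) = \frac{f(Y)}{Y^2},$$
and
$$\tilde g(Y) = \frac{g(Y)}{Y^2}.$$
Then,
$$\begin{aligned}
\tilde f(Y_0) &= \frac{f(Y_0)}{y^2 I^2},\\
\tilde f'(Y_0) &= -\frac{2 f(Y_0)}{y^3 I^3} + \frac{f'(Y_0)}{y^2 I^2},\\
\tilde f''(Y_0) &= \frac{6 f(Y_0)}{y^4 I^4} - \frac{4 f'(Y_0)}{y^3 I^3} + \frac{f''(Y_0)}{y^2 I^2},
\end{aligned}$$
and
$$\tilde g(Y_0) = \frac{g(Y_0)}{y^2 I^2}.$$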
as stated in Sect. 5.
References
1. Beardon, A.F.: The Geometry of Discrete Groups. Springer, New York (1983)
2. Bender, C.M., Orszag, S.A.: Advanced Mathematical Methods for Scientists and Engineers.
Springer, New York (1999)
3. Bleistein, N., Handelsman, R.A.: Asymptotic Expansions of Integrals. Dover Publications,
New York (1986)
4. Derman, E., Kani, I.: Riding on a smile. Risk 7(2), 32–39 (1994)
5. Duffie, D., Pan, J., Singleton, K.: Transform analysis and asset pricing for affine jump diffusions.
Econometrica 68, 1343–1377 (2000)
6. Dupire, B.: Pricing with a smile. Risk 7(1), 18–20 (1994)
7. Elworthy, K.D.: Geometric aspects of diffusions on manifolds. Ecole d’Ete de Probabilites de
Saint Flour, vol. XVII. Springer, New York (1987)
8. Emery, M.: Stochastic Calculus in Manifolds. Springer, Berlin (1989)
9. Guenther, R.B., Lee, J.W.: Partial Differential Equations of Mathematical Physics and Integral
Equations. Prentice Hall, Englewood Cliffs (1988)
10. Hagan, P.S., Kumar, D., Lesniewski, A., Woodward D.E.: Managing smile risk, Wilmott Mag.
(2003)
11. Hagan, P.S., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6(3), 147–157
(1999)
12. Heston, S.: A closed form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
13. Hsu, E.P.: Stochastic Analysis on Manifolds. American Mathematical Society, Providence
(2002)
14. Kevorkian, J., Cole, J.D.: Perturbation Methods in Applied Mathematics. Springer, Berlin
(1985)
15. Lebedev, N.N.: Special Functions and their Applications. Dover Publications, New York (1972)
16. McKean, H.P.: An upper bound to the spectrum of Δ on a manifold of negative curvature. J.
Differ. Geom. 4, 359–366 (1970)
17. Lewis, A.L.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach
(2000)
18. Molchanov, S.A.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30, 1–63
(1975)
19. Varadhan, S.R.S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20, 431–455 (1967)
20. Varadhan, S.R.S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20,
659–685 (1967)
Asymptotic Implied Volatility at the Second
Order with Application to the SABR Model
Louis Paulot
1 Introduction
The best-known model for pricing derivatives is the Black-Scholes-Merton model,
in which the underlying is assumed to follow a geometric Brownian motion. Popular
extensions include local volatility models and stochastic volatility models. As an
example, the SABR model [6] combines the local volatility of the CEV model [4] with
a lognormal volatility process. Closed formulas for European options can be obtained
for a few models, such as the CEV model or, among stochastic volatility models,
the Heston model [10]. These are however special cases, and in general there are no
closed-form formulas. Finite difference methods or Monte-Carlo simulations can be
used to price derivatives. Approximations have also been computed to achieve faster
pricing, especially for calibration.
For short maturities, Hagan, Kumar, Lesniewski and Woodward provide an
approximation for the implied volatility of the SABR model they introduce [6].
Berestycki, Busca and Florent [2, 3] and Henry-Labordère [8] give general methods
L. Paulot (B)
Misys, 42 Rue Washington, 75008 Paris, France
e-mail: [email protected]
A stochastic volatility model for some asset with pure diffusion (no jumps) is
described by two risk-neutral processes: the asset price S and a variable V which
describes the stochastic part of volatility. In the Heston model V would be the vari-
ance whereas in the SABR model it is a factor of volatility. The diffusion is given by
the stochastic differential equations
where $W_1$ and $W_2$ are two standard Brownian motions with correlation ρ.
The dependence of the parameters on the variables S and V written here is the most
common one; more generally, all parameters may depend on both variables.
Stochastic volatility models can be seen as diffusions on a Riemann surface. More
precisely, prices of securities are sections of a line bundle over this Riemann surface
which are solutions of a diffusion (or heat) equation.
An introduction to this subject and its applications to finance can be found in [9].
We present here the formalism and define all quantities we use in order to set our
conventions.
Let us consider a general model with n state variables X i (t) (which will be the
spot and the volatility) which follow a pure diffusion process, without jumps. For
simplicity we consider a European payoff of some maturity T . The price P(X (t), t)
of such a payoff is the solution of a diffusion equation
$$-\partial_t P = \mu^i\,\partial_i P + \frac{1}{2}\,\Sigma^{ij}\,\partial_i\partial_j P - rP \tag{2}$$
where $\Sigma^{ij}$ is the covariance matrix, $\mu^i$ the drifts and $r$ the numéraire rate. All
coefficients can depend on state variables $X^i(t)$ and time $t$. Unless explicitly stated,
we adopt the Einstein summation convention: repeated indices are summed. The price of the
European option is given by the solution of this equation with terminal boundary
condition at maturity T given by the payoff.
The covariance matrix $\Sigma^{ij}$ can be seen geometrically as the inverse $g^{ij} = \Sigma^{ij}$ of
a metric $g_{ij}$ on the space of variables. The diffusion equation describes the diffusion
over a Riemannian manifold: the space of state variables endowed with the metric
$g_{ij} = \left(\Sigma^{-1}\right)_{ij}$.
Examples
$$-\partial_t P = (r - q)\,S\,\partial_S P + \frac{1}{2}\,\sigma^2 S^2\,\partial_S\partial_S P - rP$$
There are several gauge transformations which are natural for such systems:
1. Change of numéraire:
P(X, t)
P(X, t) −→
(X, t)
The natural way to handle a system with gauge freedom is to introduce covariant
derivatives. The coordinate freedom is handled through the Levi-Civita connection
which acts respectively on scalars, vectors and 1-forms as
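$$\begin{aligned}
D_i f &= \partial_i f\\
D_i f^j &= \partial_i f^j + \Gamma^j_{ik} f^k\\
D_i f_j &= \partial_i f_j - \Gamma^k_{ij} f_k
\end{aligned}$$
where $\Gamma^j_{ik}$ are the Christoffel symbols. The action on tensors with more indices is
obtained by acting on all indices with the Christoffel symbols. Christoffel symbols
can be computed from the metric as
$$\Gamma^k_{ij} = \frac{1}{2}\,g^{kl}\left(\partial_i g_{lj} + \partial_j g_{il} - \partial_l g_{ij}\right).$$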
A fundamental property of the Levi-Civita connection is the covariance of the metric:
Di g jk = 0.
The metric is used to transform vectors into 1-forms and conversely, i.e. lowering
or raising indices:
$$A^i = g^{ij} A_j, \qquad A_i = g_{ij} A^j.$$
The numéraire gauge freedom is handled through a line bundle $L$ (i.e. with sections
in $\mathbb{R}$). Geometrically, $P$ is a section of $L$. An $\mathbb{R}$-valued connection¹ is defined
with spatial and time components given by a 1-form $A_i$ and a scalar² $Q$:
$$\nabla_i P = (D_i - A_i)P, \qquad \nabla_t P = (\partial_t - Q)P.$$
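$$\begin{aligned}
P &\longrightarrow e^{\phi(X,t)} P\\
\nabla_i P &\longrightarrow e^{\phi(X,t)}\,\nabla_i P\\
\nabla_t P &\longrightarrow e^{\phi(X,t)}\,\nabla_t P,
\end{aligned}$$
$$\begin{aligned}
A_i &\longrightarrow A_i - \partial_i\phi\\
Q &\longrightarrow Q - \partial_t\phi.
\end{aligned}$$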
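$$-\nabla_t P = \frac{1}{2}\,\nabla^i\nabla_i P. \tag{3}$$
Identifying terms between Eqs. (2) and (3), the $\mathbb{R}$ connection must be
$$A_i = g_{ij}\left(-\frac{1}{2}\,\Gamma^j_{kl}\,g^{kl} - \mu^j\right) \tag{4}$$
$$Q = -\frac{1}{2}\,g^{ij}\left(\partial_i A_j - A_i A_j - \Gamma^k_{ij} A_k\right) + r. \tag{5}$$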
1 This connection is similar to the connection which described the electromagnetic potential, except
that the fibre of the gauge bundle is R instead of U (1). This causes a difference of a factor i in
equations.
2 There is a breaking of symmetry between time and spatial directions. The diffusion equation can
In addition with
gi j = i j
The Kolmogorov backward Eq. (3) leads to a dual Kolmogorov forward equation.
We suppose that all prices are expressed with respect to a numéraire which is a
traded asset that does not pay any coupon or dividend. The price of the numéraire
security itself is identically 1; this reads mathematically
$$-\nabla_t 1 = \frac{1}{2}\,\nabla^i\nabla_i 1$$
If $p(X, t)$ is the risk-neutral probability density of reaching state $X$ at time $t$ starting
from state $X_0$ at time 0, then the price of a European payoff of maturity $T \geq t$ can
be written as
$$P(X_0, 0) = \int dX\, p(X, t)\,P(X, t).$$
As $t$ does not appear on the left-hand side, the derivative of the integral with respect
to $t$ must vanish:
$$\int dX\, \partial_t\left(p(X, t)\,P(X, t)\right) = 0.$$
We define an action of the gauge group on $p$ with a plus sign instead of the minus sign
used when acting on $P$:
$$\nabla_i p = (D_i + A_i)\,p, \qquad \nabla_t p = (\partial_t + Q)\,p.$$
This means that $p$ and $P$ have opposite charges under the numéraire $\mathbb{R}$ gauge
group, such that $pP$ is neutral and $\nabla_t(pP) = \partial_t(pP)$. We thus have
$$\int dX\left(\nabla_t p(X, t)\,P(X, t) + p(X, t)\,\nabla_t P(X, t)\right) = 0.$$
Using Eq. (3) for $\nabla_t P(X, t)$ and integrating by parts in the spatial directions, this
equation becomes
$$\int dX\left[\nabla_t p(X, t) - \frac{1}{2}\,\nabla^i\nabla_i\, p(X, t)\right]P(X, t) = 0. \tag{6}$$
$$\nabla_t p = \frac{1}{2}\,\nabla^i\nabla_i\, p. \tag{7}$$
Moreover, if the market is complete Eq. (6) must be true for all functions P(·, t) which
imposes Eq. (7). This is the Kolmogorov forward equation, written in a covariant way.
It should be noted that p(X, t) is a density, which means that the Levi-Civita
connection does not reduce to a partial derivative as would be the case for a scalar.
More precisely, the transition probability p(X 0 , 0; X, t) has value in L L∗ ⊗
$\wedge^d(T^*M)$. Numéraire gauge transformations associated with the line bundle $L$ acting
on $p$ give the well-known changes of measure which are usually obtained from the
Girsanov formula.
$$\begin{aligned}
dF &= \sigma_F(F, V)\,dW_1\\
dV &= \mu_V(V)\,dt + \sigma_V(V)\,dW_2
\end{aligned}$$
In order to keep the exposition as simple and clear as possible, we will skip here technical
details and refer the reader to [5, 13] for mathematically precise statements.3
3 In finance we will usually consider noncompact manifolds, possibly with boundaries as in the
SABR model for 0 < β < 1.
At short time the solution of Eq. (7) with initial condition $p(X, 0) = \delta(X - X_0)$
is asymptotically given by a heat kernel expansion4
$$p(X, t) = \frac{\sqrt{g(X)}}{(2\pi t)^{n/2}}\,\sqrt{\Delta(X_0, X)}\;P(X_0, X)\;e^{-\frac{d^2(X_0, X)}{2t}}\sum_{k \geq 0} a_k(X_0, X)\,t^k. \tag{8}$$
$$g = \operatorname{Det}(g_{ij}).$$
$d(X_0, X)$ is the geodesic distance between the starting point $X_0$ and the end point
$X$, i.e. the minimal distance between $X_0$ and $X$. It can also be written as
$$\frac{d^2(X_0, X)}{t} = \min_{X(s)}\int_0^t dt\; g_{ij}\,\dot X^i\dot X^j$$
where the minimum is taken on all paths going from $X(0) = X_0$ to $X(t) = X$.
(This is independent of $t$.) We denote by $C$ this geodesic path. $\Delta(X_0, X)$ is the Van
Vleck–Morette determinant
$$\Delta(X_0, X) = \frac{\operatorname{Det}\left(-\dfrac{1}{2}\,\dfrac{\partial^2 d^2(X_0, X)}{\partial X_0^i\,\partial X^j}\right)}{\sqrt{g(X_0)\,g(X)}}.$$
$P(X_0, X)$ is the parallel transport along the geodesic with respect to the $\mathbb{R}$ connection.
It is such that its covariant derivative along the geodesic path is null:
$$P(X_0, X) = e^{-\int_C A_i\,\dot X^i\,dt} = e^{-\int_C A_i\,dX^i}$$
where the integral is computed on the geodesic path $C$. Finally, $a_k(X_0, X)$ are functions
which are defined recursively with
$$a_0 = 1$$
4 Using Feynman path integral, the solution to Eq. (7) can be written up to some normalization factor
as
$$p(X, t) \propto \int [DX]\; e^{-\int_0^t dt\left(\frac{1}{2}\,g_{ij}\,\dot X^i\dot X^j + A_i\,\dot X^i + Q\right)}$$
where $[DX]$ means integrating over all paths $X(s)$ going from $X(0) = X_0$ to $X(t) = X$. The
normalization factor is the inverse of the same quantity with the integral computed over all paths
with starting point $X_0$, so that the total probability is 1. It is generally not possible to compute
this integral exactly. However it gives some hints on the asymptotic solution at short time: the
solution will be dominated by the path corresponding to the minimal value of the integrand inside
the exponential, which will be close to the geodesic path.
Along a given geodesic curve parameterized by its geodesic distance from $X_0$, this
equation reads
$$(k + d\,\partial_d)\,a_k = P^{-1}\Delta^{-1/2}\left(\frac{1}{2}\,\nabla^i\nabla_i - Q\right)P\,\Delta^{1/2}\,a_{k-1}$$
where $p(F, V; t)$ is given by the heat kernel expansion (8) with $X = (F, V)^T$ and
$n = 2$. The integrand can be written as
$$\sigma_F^2(K, V)\,p(K, V; t) = \frac{1}{2\pi t}\,e^{-\frac{B}{t} - C - Dt + o(t)} \tag{10}$$
with
$$B = \frac{1}{2}\,d(F_0, V_0; K, V)^2 \tag{11}$$
$$C = -2\ln(\sigma_F(K, V)) - \frac{1}{2}\left[\ln(g(K, V)) + \ln(\Delta(F_0, V_0; K, V))\right] + M(K, V) \tag{12}$$
$$D = -a_1(K, V) \tag{13}$$
the initial conditions and the strike $K$. Expanding all functions in the neighborhood
of $V_{\min}$, where $B'(V_{\min}) = 0$, the integrand is
$$\sigma_F^2(K, V)\,p(K, V; t) = \frac{1}{2\pi t}\,e^{-\frac{B}{t} - C - Dt}\,e^{-\frac{B''}{2t}\,\delta V^2}\left[1 - \frac{1}{2}\left(C'' - C'^2\right)\delta V^2 - \left(\frac{1}{24}B^{(4)} - \frac{1}{6}B^{(3)}C'\right)\frac{\delta V^4}{t} + \frac{1}{72}\left(B^{(3)}\right)^2\frac{\delta V^6}{t^2} + o(t) + \text{odd terms}\right]$$
where derivatives are with respect to $V$, all functions $B$, $C$, $D$ and their derivatives
are taken at $(K, V_{\min})$ and $\delta V = V - V_{\min}$. When writing $o(t)$, we have anticipated
that after integration $\delta V^2 \sim t$. We have also anticipated that odd terms in $\delta V$ will
not give contributions to the integral.
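Integrating over $\delta V$, and using that the first even moments of the standard normal
distribution are $M_2 = 1$, $M_4 = 3$ and $M_6 = 15$, we get for the integral
$$E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \frac{1}{\sqrt{2\pi t\,B''}}\,e^{-\frac{B}{t} - C - Dt}\left[1 - \frac{t}{2B''}\left(C'' - C'^2\right) - \left(\frac{1}{24}B^{(4)} - \frac{1}{6}B^{(3)}C'\right)\frac{3t}{B''^2} + \frac{1}{72}\left(B^{(3)}\right)^2\frac{15t}{B''^3} + o(t)\right]$$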
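$$E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \frac{1}{\sqrt{2\pi t}}\,e^{-\frac{B}{t} - \tilde C - \tilde D t + o(t)} \tag{14}$$
with
$$\tilde C = C + \frac{1}{2}\ln(B'') \tag{15}$$
$$\begin{aligned}
\tilde D &= D + \frac{1}{2B''}\left[C'' - C'^2 + \frac{1}{4}\frac{B^{(4)}}{B''} - \frac{B^{(3)}}{B''}\,C' - \frac{5}{12}\left(\frac{B^{(3)}}{B''}\right)^2\right]\\
&= D + \frac{1}{2B''}\left[\tilde C'' - \tilde C'^2 - \frac{1}{4}\frac{B^{(4)}}{B''} + \frac{1}{3}\left(\frac{B^{(3)}}{B''}\right)^2\right]
\end{aligned} \tag{16}$$
where all derivatives are with respect to $V$ and all functions and their derivatives are
taken at $(K, V_{\min})$.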
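The two forms of $\tilde D$ in Eq. (16) are related through the derivatives of $\tilde C = C + \frac{1}{2}\ln(B'')$. The following minimal sketch checks their equality numerically; the argument names (`B2` for $B''$, `B3` for $B^{(3)}$, `C1` for $C'$, and so on) and the sample values are hypothetical and chosen only for illustration.

```python
def d_tilde_direct(D, B2, B3, B4, C1, C2):
    """First form of Eq. (16), written with C' (C1) and C'' (C2)."""
    bracket = (C2 - C1 ** 2 + 0.25 * B4 / B2
               - (B3 / B2) * C1 - (5.0 / 12.0) * (B3 / B2) ** 2)
    return D + bracket / (2.0 * B2)

def d_tilde_via_c_tilde(D, B2, B3, B4, C1, C2):
    """Second form of Eq. (16), written with derivatives of C~ = C + ln(B'')/2."""
    Ct1 = C1 + B3 / (2.0 * B2)                           # C~'
    Ct2 = C2 + B4 / (2.0 * B2) - B3 ** 2 / (2.0 * B2 ** 2)  # C~''
    bracket = Ct2 - Ct1 ** 2 - 0.25 * B4 / B2 + (B3 / B2) ** 2 / 3.0
    return D + bracket / (2.0 * B2)

if __name__ == "__main__":
    args = dict(D=0.1, B2=2.5, B3=-1.2, B4=3.4, C1=0.7, C2=-0.3)
    print(d_tilde_direct(**args), d_tilde_via_c_tilde(**args))
```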
The price of a Call of maturity $T$ and strike $K$ can be written as the payoff integrated
against the risk-neutral distribution:
$$\mathrm{Call}(K, T) = e^{-rT}\int dF\,(F - K)^+\,p(F, T)$$
where $p(F, t)$ is the marginal probability density of $F_t$. This can be written also as
a double integral over forward and time as
$$\mathrm{Call}(K, T) = e^{-rT}\left[(F_0 - K)^+ + \int dF\int_0^T dt\,(F - K)^+\,\partial_t\, p(F, t)\right]. \tag{17}$$
$$\sigma_{\mathrm{loc}}^2(F, t) = E\left[\sigma_F^2(F_t, V_t) \,\middle|\, F_t = F\right] = \frac{E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - F)\right]}{p(F, t)}.$$
Plugging the Kolmogorov equation (18) in Eq. (17) and integrating twice by parts on
the $F$ variable, the Call price is finally obtained as an integral over time at strike $K$:
$$\mathrm{Call}(K, T) = e^{-rT}\left[(F_0 - K)^+ + \frac{1}{2}\int_0^T dt\; E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right]\right]. \tag{19}$$
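Equation (19) can be illustrated on a model where everything is known in closed form. The sketch below assumes the normal (Bachelier) model $dF = \sigma\,dW$ with $r = 0$, for which $E[\sigma_F^2\,\delta(F_t - K)]$ is $\sigma^2$ times a Gaussian density; this model, the function names and the midpoint quadrature are assumptions made for the example only.

```python
import math

def bachelier_call(f0, k, sigma, T):
    """Closed-form call price in the normal model dF = sigma dW with r = 0."""
    d = (f0 - k) / (sigma * math.sqrt(T))
    cdf = 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    return (f0 - k) * cdf + sigma * math.sqrt(T) * pdf

def call_from_time_integral(f0, k, sigma, T, n=50000):
    """Intrinsic value plus (1/2) * int_0^T sigma^2 p(K, t) dt, as in Eq. (19) with r = 0."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoint rule
        density = (math.exp(-(k - f0) ** 2 / (2.0 * sigma ** 2 * t))
                   / (sigma * math.sqrt(2.0 * math.pi * t)))
        total += sigma ** 2 * density
    return max(f0 - k, 0.0) + 0.5 * total * h

if __name__ == "__main__":
    print(bachelier_call(100.0, 105.0, 20.0, 1.0))
    print(call_from_time_integral(100.0, 105.0, 20.0, 1.0))
```

Both values should agree up to the quadrature error, for strikes on either side of the forward.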
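Using expression (14) for the integrand, the integral over time can be computed:
$$\frac{1}{2}\int_0^T dt\; E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \frac{1}{\sqrt{2\pi}}\,e^{-\tilde C}\left[\sqrt{T}\,e^{-\frac{B}{T}} - \sqrt{\pi B}\,\operatorname{erfc}\left(\sqrt{\frac{B}{T}}\right) - \frac{\tilde D}{3}\left((T - 2B)\sqrt{T}\,e^{-\frac{B}{T}} + 2B^{3/2}\sqrt{\pi}\,\operatorname{erfc}\left(\sqrt{\frac{B}{T}}\right)\right)\right] + o\left(T^{5/2}\,e^{-\frac{B}{T}}\right)$$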
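The closed form above can be compared with a direct quadrature. The following minimal sketch assumes that the integrand (14) is truncated as $e^{-B/t - \tilde C}(1 - \tilde D t)/\sqrt{2\pi t}$, for which the erfc expression is exact; the parameters `C` and `D` stand for $\tilde C$ and $\tilde D$, and the numerical values are arbitrary.

```python
import math

def time_value_closed_form(B, C, D, T):
    """Closed form of (1/2) int_0^T exp(-B/t - C) (1 - D t) / sqrt(2 pi t) dt."""
    x = math.sqrt(B / T)
    e = math.exp(-B / T)
    term0 = math.sqrt(T) * e - math.sqrt(math.pi * B) * math.erfc(x)
    term1 = ((T - 2.0 * B) * math.sqrt(T) * e
             + 2.0 * B ** 1.5 * math.sqrt(math.pi) * math.erfc(x))
    return math.exp(-C) / math.sqrt(2.0 * math.pi) * (term0 - D / 3.0 * term1)

def time_value_quadrature(B, C, D, T, n=20000):
    """Midpoint-rule evaluation of the same integral."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(-B / t - C) * (1.0 - D * t) / math.sqrt(2.0 * math.pi * t)
    return 0.5 * total * h

if __name__ == "__main__":
    print(time_value_closed_form(0.1, 0.3, 0.5, 1.0))
    print(time_value_quadrature(0.1, 0.3, 0.5, 1.0))
```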
The final step consists in computing the same expansion for the Black–Scholes model,
which is simpler as there is no stochastic volatility to be integrated. The metric is
given by the inverse of the variance:
$$g_{FF} = \frac{1}{\sigma^2 F^2}.$$
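$$\Delta(F_0, K) = 1.$$
$$p(K, t) = \frac{1}{\sigma K\sqrt{2\pi t}}\,\sqrt{\frac{F_0}{K}}\;e^{-\frac{\ln^2\frac{K}{F_0}}{2\sigma^2 t}}\left(1 - \frac{\sigma^2}{8}\,t + o(t)\right).$$
with
$$\begin{aligned}
B_{BS} &= \frac{1}{2\sigma^2}\,\ln^2\frac{K}{F_0}\\
\tilde C_{BS} &= -\ln(\sigma) - \frac{1}{2}\ln(K F_0)\\
\tilde D_{BS} &= \frac{\sigma^2}{8}.
\end{aligned}$$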
(In fact formula (21) is exact: there is no o(t) correction and it can be integrated
exactly to get the Black–Scholes formula.)
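The exactness remark can be verified directly. The sketch below assumes a driftless lognormal forward $dF = \sigma F\,dW$ and compares $\sigma^2 K^2 p(K, t)$, computed from the exact lognormal density, with $e^{-B/t - C - Dt}/\sqrt{2\pi t}$ built from the three Black–Scholes coefficients listed above; the function names are illustrative.

```python
import math

def lognormal_density(k, f0, sigma, t):
    """Exact density at k of F_t for dF = sigma F dW (driftless), starting at f0."""
    z = math.log(k / f0) + 0.5 * sigma ** 2 * t
    return math.exp(-z * z / (2.0 * sigma ** 2 * t)) / (k * sigma * math.sqrt(2.0 * math.pi * t))

def bs_integrand_from_coefficients(k, f0, sigma, t):
    """exp(-B/t - C - D t) / sqrt(2 pi t) with the Black-Scholes B, C, D."""
    B = math.log(k / f0) ** 2 / (2.0 * sigma ** 2)
    C = -math.log(sigma) - 0.5 * math.log(k * f0)
    D = sigma ** 2 / 8.0
    return math.exp(-B / t - C - D * t) / math.sqrt(2.0 * math.pi * t)

if __name__ == "__main__":
    k, f0, sigma = 110.0, 100.0, 0.3
    for t in (0.1, 1.0, 5.0):
        lhs = (sigma * k) ** 2 * lognormal_density(k, f0, sigma, t)
        rhs = bs_integrand_from_coefficients(k, f0, sigma, t)
        print(t, lhs, rhs)
```

The two columns should coincide for any maturity, not only for small $t$.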
Writing Eq. (20) for both the stochastic volatility model and the Black-Scholes
model, the implied volatility is such that both quantities are equal:
$$\frac{B}{T} + \tilde C + \ln(B) + \tilde D\,T + \frac{3}{2B}\,T = \frac{B_{BS}}{T} + \tilde C_{BS} + \ln(B_{BS}) + \tilde D_{BS}\,T + \frac{3}{2B_{BS}}\,T + o(T). \tag{22}$$
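Expanding the implied volatility σ as a Taylor expansion
$$\sigma(K, T) = \sigma_0(K) + \sigma_1(K)\,T + \sigma_2(K)\,T^2 + o(T^2)$$
and plugging this into Eq. (22) on the Black–Scholes side, we get
$$\begin{aligned}
&\frac{1}{2\sigma_0^2 T}\,\ln^2\frac{K}{F_0}\left[1 - 2\frac{\sigma_1}{\sigma_0}T - 2\frac{\sigma_2}{\sigma_0}T^2 + 3\left(\frac{\sigma_1}{\sigma_0}\right)^2 T^2\right] - \ln(\sigma_0) - \frac{\sigma_1}{\sigma_0}T\\
&\quad - \frac{1}{2}\ln(K F_0) + \ln\left(\frac{1}{2\sigma_0^2}\,\ln^2\frac{K}{F_0}\right) - 2\frac{\sigma_1}{\sigma_0}T + \frac{\sigma_0^2}{8}T + \frac{3\sigma_0^2}{\ln^2\frac{K}{F_0}}T\\
&= \frac{B}{T} + \tilde C + \ln(B) + \tilde D\,T + \frac{3}{2B}\,T + o(T).
\end{aligned}$$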
Coefficients must be equal at each order in T , which gives our final expansion of the
implied volatility.
Power −1 gives the order 0 implied volatility
$$\sigma_0 = \frac{\left|\ln\frac{K}{F_0}\right|}{\sqrt{2B}} = \frac{\left|\ln\frac{K}{F_0}\right|}{d(F_0, V_0; K, V_{\min})} \tag{23}$$
We stress that this result is exact in strike: for a given strike, we have computed
exactly the first three coefficients of the Taylor expansion. Moreover, contrary to
other expansions, the order 1 expansion is extracted from the order 0 expansion of the
probability. This technique allows us to extract a second order term for the implied
volatility from the order 1 term in the probability expansion. This method can be
used to compute the Taylor expansion of implied volatility up to any order, although
the computation becomes more complicated and involves integrals of increasing
dimension: the ak coefficient of the heat kernel expansion involves k + 1 integrals.
The computation we have performed makes the implicit hypothesis that we are not
exactly at the money: $K \neq F_0$. Otherwise, the dominant term in the exponential
would vanish and we could not use the asymptotic expansion of the erfc function at
infinity. Precisely at the money, we should use instead a Taylor expansion at 0. As
the implied volatility surface is smooth, we just take the limit of formulas (23), (24)
and (25) as $K \to F_0$. If we perform instead the Taylor expansion of the erfc function
at 0, we find only the first two orders
$$\sigma_0(F_0) = \frac{e^{-\tilde C(F_0)}}{F_0}$$
$$\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{3}\left(\frac{\sigma_0^2}{8} - \tilde D(F_0)\right).$$
Careful Taylor expansions of all quantities at the money can be used to check that this
is indeed the limit of Eqs. (23) and (24). Moreover, it can be seen that the existence
of these limits is a condition for formulas (24) and (25) to be convergent, as $B$ goes to
0 at the money (at order 2 in the geodesic distance, which means that the numerators
must in fact vanish at order 2).
Instead of Black volatility, the asymptotic expansion can be computed for other
local volatility models. Without stochastic volatility, the SABR model reduces to the
CEV model. The local volatility part of the model is thus taken into account exactly
without introducing approximation besides the stochastic corrections. In view of our
application to the SABR model, we will compute here a CEV implied volatility.
There are closed formulas for this model, involving Bessel functions. This implied
volatility can therefore be used in the CEV pricing formula in order to get the price
of the option.
For a CEV model with parameter $\beta_0$ and volatility factor σ, such that
$$dF = \sigma F^{\beta_0}\,dW,$$
the functions $B$, $\tilde C$ and $\tilde D$ are
1
B0 = ln2 (q0 )
2σ 2
0 = −ln(σ) − 1 β0 ln(K F0 )
C
2
0 = β0 (2 − β0 )σ
2
D 1−β
8K 1−β0 F0 0
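with
$$q_0 = \begin{cases}\dfrac{K^{1-\beta_0} - F_0^{1-\beta_0}}{1 - \beta_0} & \beta_0 < 1\\[2ex] \ln\dfrac{K}{F_0} & \beta_0 = 1.\end{cases}$$
$$\sigma_0 = \frac{|q_0|}{\sqrt{2B}} = \frac{|q_0|}{d(F_0, V_0; K, V_{\min})} \tag{27}$$
$$\frac{\sigma_1}{\sigma_0} = -\frac{\tilde C + \ln(\sigma_0) + \frac{1}{2}\beta_0\ln(K F_0)}{2B}. \tag{28}$$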
2
σ2 σ1 β (2 − β0 )σ02
=
3
−
1 + 3 σ1 − 0
D . (29)
σ0 2 σ0 2B σ0 1−β
8K 1−β0 F0 0
The Black implied volatility formulas correspond to the special case β0 = 1. The
Bachelier (i.e. normal) implied volatility would correspond to β0 = 0.
3.7 Generalization
− C
C ∗ + ln(B) − ln(B∗ )
λ1 =
B∗
− D
D ∗ − λ1 B∗ − λ1 C
∗ − 1 λ2 B∗
B∗ 2 1
λ2 = .
B∗
• Plug parameters z i (λ1 T + λ2 T 2 ) into the closed form option price of the proxy
model to get an approximate price of the option in the real model.
The closer the models are, the better the approximation is. It is clear that if the proxy
model is the real model itself, there are no corrections at all. This procedure consists
in approximating only the differences between models at a given strike and not the
option price itself. In the basic case of Sect. 3.4 where the proxy model is the Black-
Scholes model, the approximation leverages on the fact that the volatility surface is
more regular than the option price.
4 SABR Model
4.1 Model
The SABR Model [6] is a stochastic volatility model where the volatility is a local
volatility function multiplied by a lognormal stochastic volatility:
$$\begin{aligned}
dF &= V\,C(F)\,dW_1\\
dV &= \nu V\,dW_2
\end{aligned}$$
with $dW_1\,dW_2 = \rho\,dt$. The initial value for $V$ is the parameter5 α:
$$\alpha = V(0).$$
$$C(F) = F^\beta$$
5 We use the standard notation of α for the initial value of the volatility variable in the SABR model
In order to compute the order 0 implied volatility, the only geometric object involved
is the metric. According to the dictionary of Sect. 2.2, its inverse is the covariance
matrix
$$g^{ij} = \begin{pmatrix} V^2 C(F)^2 & \rho\nu V^2 C(F)\\ \rho\nu V^2 C(F) & \nu^2 V^2\end{pmatrix}.$$
$$q = \frac{F^{1-\beta} - F_0^{1-\beta}}{1 - \beta}$$
and for β = 1
$$q = \ln\frac{F}{F_0}.$$
In addition, we rescale time such that ν disappears from the equations while
keeping the same solution (the variances, which are the physical
quantities, are not changed):
$$t \longrightarrow \nu^2 t, \qquad \alpha \longrightarrow \frac{\alpha}{\nu}, \qquad \nu \longrightarrow 1.$$
At the end of the computation, the inverse transformation must be applied to the
implied volatility:
$$\sigma\nu \longleftarrow \sigma$$
$$x = \frac{q - \rho V}{\sqrt{1 - \rho^2}}, \qquad y = V.$$
$$ds^2 = \frac{dx^2 + dy^2}{y^2}.$$
This geometry corresponds to the hyperbolic plane, in the Poincaré half-plane
representation ($y > 0$) [7, 8]. Geodesics are vertical lines and semi-circles orthogonal
to the $y = 0$ axis. The geodesic distance between two points $(x_1, y_1)$ and $(x_2, y_2)$
can be computed:
$$d(x_1, y_1; x_2, y_2) = \cosh^{-1}\left(1 + \frac{(x_2 - x_1)^2 + (y_2 - y_1)^2}{2 y_1 y_2}\right).$$
(We have dropped the absolute values as the numerator and the denominator have
the same sign.)
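The distance formula on the Poincaré half-plane is straightforward to evaluate. A minimal sketch follows; the function name `poincare_distance` is illustrative, and the check uses the standard fact that along a vertical geodesic the distance is $|\ln(y_2/y_1)|$.

```python
import math

def poincare_distance(x1, y1, x2, y2):
    """Geodesic distance between (x1, y1) and (x2, y2) in the half-plane y > 0."""
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

if __name__ == "__main__":
    # Vertical geodesic: the distance reduces to |ln(y2 / y1)|.
    print(poincare_distance(0.0, 1.0, 0.0, math.e))
    # Distances add up along a geodesic: ln 2 + ln(5/2) = ln 5.
    print(poincare_distance(0.0, 1.0, 0.0, 2.0) + poincare_distance(0.0, 2.0, 0.0, 5.0))
```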
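Plugging the expression for $V_{\min}$ and going back to the original time, with ν
factors, the order 0 implied volatility for the SABR model is
$$\sigma_0 = \frac{\nu\,\ln\dfrac{K}{F_0}}{\ln\left(\dfrac{\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2} + \rho\alpha + q\nu}{(1 + \rho)\,\alpha}\right)} \tag{32}$$
with
$$q = \begin{cases}\dfrac{K^{1-\beta} - F_0^{1-\beta}}{1 - \beta} & \beta < 1\\[2ex] \ln\dfrac{K}{F_0} & \beta = 1.\end{cases}$$
$$\sigma_0(F_0) = \alpha F_0^{\beta - 1}$$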
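Formula (32) and its at-the-money limit can be sketched as follows. The function names and the sample parameters are hypothetical; the second check compares (32) with $\nu\,|\ln(K/F_0)|/d$, where $d$ is the hyperbolic distance between the rescaled end points $(x_1, y_1)$ and $(x_2, y_2)$ of the geodesic.

```python
import math

def sabr_q(f0, k, beta):
    """The variable q of Eq. (32)."""
    if beta == 1.0:
        return math.log(k / f0)
    return (k ** (1.0 - beta) - f0 ** (1.0 - beta)) / (1.0 - beta)

def sabr_sigma0(f0, k, alpha, beta, rho, nu):
    """Order 0 implied volatility, Eq. (32); requires k != f0."""
    q = sabr_q(f0, k, beta)
    root = math.sqrt(alpha ** 2 + 2.0 * rho * alpha * nu * q + nu ** 2 * q ** 2)
    log_arg = (root + rho * alpha + nu * q) / ((1.0 + rho) * alpha)
    return nu * math.log(k / f0) / math.log(log_arg)

def sabr_sigma0_from_distance(f0, k, alpha, beta, rho, nu):
    """Same quantity from the geodesic distance in the Poincare half-plane."""
    q = sabr_q(f0, k, beta)
    root = math.sqrt(alpha ** 2 + 2.0 * rho * alpha * nu * q + nu ** 2 * q ** 2)
    s = math.sqrt(1.0 - rho ** 2)
    x1, y1 = -rho * alpha / (nu * s), alpha / nu
    x2, y2 = (nu * q - rho * root) / (nu * s), root / nu
    d = math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))
    return nu * abs(math.log(k / f0)) / d

if __name__ == "__main__":
    f0, alpha, beta, rho, nu = 100.0, 2.0, 0.5, -0.3, 0.4
    print(sabr_sigma0(f0, 120.0, alpha, beta, rho, nu))
    print(sabr_sigma0_from_distance(f0, 120.0, alpha, beta, rho, nu))
    # Near the money the value approaches alpha * f0**(beta - 1).
    print(sabr_sigma0(f0, f0 * (1.0 + 1e-6), alpha, beta, rho, nu), alpha * f0 ** (beta - 1.0))
```

Exactly at $K = F_0$ the ratio in (32) is of the form 0/0, so the sketch evaluates it only for $K \neq F_0$.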
To compute the order 1 correction, we need the scalar factor in the time value
expansion, given by $\tilde C$ in Eq. (15), with $C$ given in Eq. (12).
For the hyperbolic plane, the Van Vleck–Morette determinant can be computed
as a function of the geodesic distance:
$$\Delta = \frac{d}{\sinh(d)}.$$
We need also
$$\sigma_F(K, V) = V\,C(K) = V K^\beta$$
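$$g(K, V) = \frac{1}{V^4\,C(K)^2\,(1 - \rho^2)} = \frac{1}{V^4 K^{2\beta}\,(1 - \rho^2)}$$
$$B'' = d\,d'' = \frac{d}{\alpha V_{\min}\,(1 - \rho^2)\sinh(d)}$$
$$A = \frac{1}{2(1 - \rho^2)}\,d\ln(C(F)) - \frac{\rho\,C'(F)}{2(1 - \rho^2)}\,dV. \tag{34}$$
$$A = \frac{C'(F)}{2\sqrt{1 - \rho^2}}\,dx.$$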
to
x2 √
q−ρV
= 1−ρ2 .
y2 V
If β = 1, the 1-form
1
A= dx
2 1 − ρ2
M = 0.
In other cases, the first part of Eq. (34) is an exact form and can be integrated directly:
$$\int_{(F_0, \alpha)}^{(K, V)}\frac{1}{2(1 - \rho^2)}\,d\ln(C(F)) = \frac{1}{2(1 - \rho^2)}\,\ln\frac{C(K)}{C(F)} = \frac{\beta}{2(1 - \rho^2)}\,\ln\frac{K}{F}. \tag{35}$$
The second part must be integrated on the geodesic path. The geodesic is a semi-circle
with center $(X, 0)$ and radius $R$, going through $(x_1, y_1)$ and $(x_2, y_2)$. The center
is therefore at
$$X = \frac{x_2^2 - x_1^2 + y_2^2 - y_1^2}{2(x_2 - x_1)} \tag{36}$$
$$x = X + R\,\frac{1 - t^2}{1 + t^2}, \qquad y = R\,\frac{2t}{1 + t^2}.$$
$$ds = d\ln(t)$$
$$a = F_0^{1-\beta}, \qquad b = (1 - \beta)\sqrt{1 - \rho^2}, \qquad c = (1 - \beta)\,\rho$$
$$t_i = \sqrt{\frac{R - x_i + X}{R + x_i - X}} \tag{40}$$
R + xi − X
and
−1 (z) = 1 ln 1 + z ,
tanh
2 1 − z
which coincides with the inverse function of tanh on ] − 1; 1[. Summing Eqs. (35)
and (38), the integral of the connection is finally
β K ρβ
M= ln − [G(t2 ) − G(t1 )] . (41)
2 F (1 − β) 1 − ρ2
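$$\begin{pmatrix}x_1\\ y_1\end{pmatrix} = \begin{pmatrix}\dfrac{-\rho\alpha}{\nu\sqrt{1 - \rho^2}}\\[2ex] \dfrac{\alpha}{\nu}\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}x_2\\ y_2\end{pmatrix} = \begin{pmatrix}\dfrac{\nu q - \rho\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2}}{\nu\sqrt{1 - \rho^2}}\\[2ex] \dfrac{\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2}}{\nu}\end{pmatrix}$$
and $G(t)$ is defined in formula (39).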
which is valid for all positive strikes. Exactly at the money, the formula we give must
be replaced by its limit, which can be computed by a Taylor expansion or numerically.
At the money and only at the money it appears to be equal to the original HKLW
formula:
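$$\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{24}\,\alpha^2(1 - \beta)^2 F_0^{2(\beta - 1)} + \frac{1}{4}\,\rho\alpha\nu\beta F_0^{\beta - 1} + \frac{1}{12}\,\nu^2 - \frac{1}{8}\,\rho^2\nu^2. \tag{44}$$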
σ0 24 4 12 8
This is not surprising as their expansion is in fact an expansion in both maturity and
moneyness (eventually of order 0 in moneyness).
4.4 Order 2
To compute the second order correction to implied volatility, we need to compute $\tilde D$
as defined in Eq. (16), with $D = -a_1$ and $a_1$ defined in Eq. (9). Most of the
integration can be done analytically. We have first the integral of $Q$ along the geodesic:
$$a_1^{(Q)} = -\frac{1}{d}\int_C Q\,ds.$$
Using the values defined in the previous section for X , R, t1 , t2 , a, b and c, its integral
along the geodesic is
(Q) β β R2 H (t2 ) − H (t1 )
a1 = 1−β+ (45)
2 2(1 − ρ2 ) (1 − β)2 R 2− (a + bX )2 ln(t2 ) − ln(t1 )
with
a + b(R + X ) + c Rt
H (t) =
(a + bX )(1 + t 2 ) + b R(1 − t 2 ) + 2c Rt
⎧
⎪
⎪ cR c R + t (a + b(X − R))
⎪
⎪ tan −1
(a + bX )2 > (1 − β)2 R 2
⎨
(a + bX )2 − (1 − β)2 R 2 (a + bX )2 − (1 − β)2 R 2
+
⎪
⎪ cR c R + t (a + b(X − R))
⎪−
⎪ tanh −1 (a + bX )2 < (1 − β)2 R 2 .
⎩
(1 − β)2 R 2 − (a + bX )2 (1 − β)2 R 2 − (a + bX )2
Note that in the denominator, the quantity $\ln(t_2) - \ln(t_1)$ is up to a sign the geodesic
distance $d$. If β = 1, $a_1^{(Q)}$ reduces to
$$a_1^{(Q)} = -\frac{R\,|x_2 - x_1|}{8(1 - \rho^2)\,d}. \tag{46}$$
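The remark that $\ln(t_2) - \ln(t_1)$ equals the geodesic distance up to a sign can be checked numerically. The sketch below assumes the center (36), the radius $R$ of the semi-circle through both points and the parameter $t_i$ of Eq. (40); the function names and sample points are illustrative, and the construction requires $x_1 \neq x_2$ (non-vertical geodesic).

```python
import math

def geodesic_circle(x1, y1, x2, y2):
    """Center abscissa X (Eq. 36) and radius R of the geodesic semi-circle; needs x1 != x2."""
    X = (x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2) / (2.0 * (x2 - x1))
    R = math.hypot(x1 - X, y1)
    return X, R

def t_param(x, X, R):
    """Parameter t of a point with abscissa x on the semi-circle, Eq. (40)."""
    return math.sqrt((R - x + X) / (R + x - X))

def distance_from_t(x1, y1, x2, y2):
    """Geodesic distance as |ln(t2) - ln(t1)|, using ds = d ln(t)."""
    X, R = geodesic_circle(x1, y1, x2, y2)
    return abs(math.log(t_param(x2, X, R)) - math.log(t_param(x1, X, R)))

def distance_from_acosh(x1, y1, x2, y2):
    """Geodesic distance from the closed-form cosh^{-1} expression."""
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

if __name__ == "__main__":
    pts = (0.3, 1.0, 1.7, 0.4)
    print(distance_from_t(*pts), distance_from_acosh(*pts))
```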
$$D^i D_i = y^2\left(\partial_x^2 + \partial_y^2\right).$$
Forgetting $A^{(0)}$ which is pure gauge (it can be checked by hand that $A^{(0)}$ will not
contribute), we denote by $P^{(1)}$ the $A^{(1)}$ part of the parallel transport:
\[
P^{(1)} = e^{-M^{(1)}} = e^{-\int_C A^{(1)}}
\]
with ⎧
⎪ ρβ
⎪
⎨− [G(t2 ) − G(t1 )] β<1
(1) (1 − β) 1 − ρ2
M = ρ (48)
⎪
⎪ K
⎩ ρ ln −V +α β = 1.
2 1 − ρ2 F0
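we can rewrite
\[
\frac{1}{2} P^{-1}\nabla^i\nabla_i P = \frac{1}{2}\,y^2\Bigl[-(\partial_x^2 + \partial_y^2)M^{(1)} + \bigl(\partial_x M^{(1)} - A_x^{(1)}\bigr)^2
+ \bigl(\partial_y M^{(1)} - A_y^{(1)}\bigr)^2 + \partial_x A_x^{(1)} + \partial_y A_y^{(1)}\Bigr]
\]
\[
\partial_x A_x^{(1)} + \partial_y A_y^{(1)} = 0.
\]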
This integral can be computed by a numerical quadrature with few points. For β = 1
the connection has in fact no curvature and therefore a1(A) = 0.
1
d =
α(1 − ρ2 )Vmin sinh(d)
B = dd
B (3) 3
=−
B Vmin
B (4) 12 cosh(d) 1
= 2 − 3d − . (50)
B Vmin sinh(d) d
and C
. Using our decomposition M = β
We need finally C 2 ln K
F + M(1) , we can
write from formulas (12) and (15)
= − β ln(K F) − 1 ln d 1
C + ln B + ln 1 − ρ2 + M1 .
2 2 sinh(d) 2
with a1(Q) given in Eq. (45), a1(A) in (49), B in (50) and M(1) in (48) with G(t)
defined in Eq. (39). For β = 1, this expression can be simplified using Eq. (46) for
a1(Q) and
a1(A) = 0
ρ
M(1) = −
2(1 − ρ2 )
M(1) = 0.
The computation has been done in redefined variables such that $\nu = 1$. To restore
the $\nu$ factors, $\alpha$ must be replaced by $\alpha/\nu$, $T$ by $\nu^2 T$ and the final implied volatility
must be multiplied by $\nu$.
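The rescaling just described can be wrapped around any routine that works in the ν = 1 variables. The sketch below is only illustrative: `implied_vol_unit_nu` is a hypothetical callable standing for the expansion computed with ν = 1, its argument order is an assumption, and the replacement of α is read as α/ν.

```python
def restore_nu(implied_vol_unit_nu, strike, f0, alpha, beta, rho, nu, maturity):
    """Evaluate a nu = 1 implied-volatility routine and restore the nu factors.

    implied_vol_unit_nu(strike, f0, alpha, beta, rho, maturity) is a
    hypothetical callable implementing the expansion in rescaled variables.
    """
    rescaled_alpha = alpha / nu            # alpha -> alpha / nu
    rescaled_maturity = nu**2 * maturity   # T -> nu^2 T
    vol = implied_vol_unit_nu(strike, f0, rescaled_alpha, beta, rho, rescaled_maturity)
    return nu * vol                        # final volatility is multiplied by nu


# Toy check: a routine returning its alpha argument must give back alpha
toy_vol = restore_nu(lambda k, f, a, b, r, t: a, 4.0, 4.0, 0.30, 1.0, -0.5, 0.40, 2.0)
```

The toy routine is not an implied volatility formula; it only shows that the factors of ν cancel as expected at order 0.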
At the money, the formula for $\sigma_2/\sigma_0$ looks divergent but its limit is well defined. We
compute this limit numerically, although it could be done analytically.
The results of Sect. 3.6 can be used to invert the SABR volatility into a CEV fractional
volatility. Using formulas of Sect. 3.6 the implied CEV volatility is computed and
used in the closed-form option prices of the CEV model.
This appears to be useful at low strikes for β < 1/2 or with small volatility of
volatility: only the corrections to the CEV model which come from the stochastic
volatility are approximated, not the local volatility part. For example, at the money
the first order coefficient of the Black volatility which is given by Eq. (44) becomes
for the CEV volatility
\[
\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{4}\rho\alpha\nu\beta F_0^{\beta-1} + \frac{1}{12}\nu^2 - \frac{1}{8}\rho^2\nu^2.
\]
We present in Figs. 1 and 2 the implied volatility given by our expansion and compare
it to the implied volatility computed by a two-dimensional finite difference method
(FDM) scheme. We also show for comparison the implied volatility given by the original
formula of [6]. In this example, parameters are F0 = 4, α = 30 %, β = 0.7,
ν = 40 %, ρ = −0.5. The FDM scheme is a second order Yanenko scheme [14]
with exponential fitting. We use 400 points in strike, 200 points in volatility and 30
time steps.
[Figure 1: four rows of panels, for maturities 2.5, 5, 7.5 and 10 years. Left panels: implied volatility against strike (curves Order 2, Order 1, HKLW, FDM). Right panels: relative error against strike (curves Order 2, Order 1, HKLW).]
Fig. 1 Implied volatility and relative error for the SABR model with parameters F0 = 4, α = 30 %,
β = 0.7, ν = 40 %, ρ = −0.5 and maturities 2.5 yr, 5 yr, 7.5 yr and 10 yr. On the left, implied
volatilities are plotted for our first order and second order expansions, the original formula of [6]
and are compared to the result of a FDM solution. On the right are the relative errors with respect
to this reference solution
[Figure 2: three rows of panels, for maturities 15, 20 and 30 years. Left panels: implied volatility against strike (curves Order 2, Order 1, HKLW, FDM). Right panels: relative error against strike (curves Order 2, Order 1, HKLW).]
Fig. 2 Implied volatility and relative error for the SABR model with parameters F0 = 4, α = 30 %,
β = 0.7, ν = 40 %, ρ = −0.5 and maturities 15 yr, 20 yr and 30 yr. On the left, implied volatilities
are plotted for our first order and second order expansions, the original formula of [6] and are
compared to the result of a FDM solution. On the right are the relative errors with respect to this
reference solution
At very short maturities, all expansions are acceptable as the expansion is dominated
by the order 0 term. At first order our expansion is equal to the HKLW formula at the
money but is more regular in strikes and is better in the wings as our computation
does not involve any approximation in the moneyness. Our second order expansion is
one order of magnitude more precise.6 When maturity grows, first order expansions
lose precision but the second order remains relatively good up to 10 years, where
ν²T = 1.6. At higher maturities, the second order expansion explodes quadratically
and finally gives even negative volatilities at very long maturity and low strikes. At
long maturities, a FDM or another numerical method must be used, unless a valid
long maturity expansion could be computed more efficiently.
6 In fact, at very short maturities, the FDM scheme we use is less precise and less stable than this
second order expansion, especially in the wings where the probability density is very small.
5 Conclusion
Acknowledgments We thank Erwan Curien for his interest in this work, Martial Millet, Xavier
Lacroze and Wafaa Bennehhou for many discussions.
At long maturity, the SABR model is not realistic as the volatility process is a
geometric Brownian motion. In particular the variance of the volatility increases
linearly in time. A direct extension would be to add mean reversion to the volatility
process, either on the volatility or on the variance. The asymptotic expansion can
still be computed at order 2. However its domain of validity is usually reduced:
in addition to other conditions, the maturity must be small compared to the mean
reversion characteristic time.
We impose mean reversion on the volatility process as
The metric is not modified as it describes the diffusion part. Expressions for $A$ and
$Q$ are modified as follows:
\[
A = A_{\mathrm{SABR}} - \frac{\rho\kappa(\bar V - V)}{\nu V^2 (1-\rho^2) F^{\beta}}\,dF + \frac{\kappa(\bar V - V)}{\nu^2 V^2 (1-\rho^2)}\,dV
\]
\[
Q = Q_{\mathrm{SABR}} + \frac{1}{2}\,\frac{\kappa^2(\bar V - V)^2}{\nu^2 V^2 (1-\rho^2)} + \frac{1}{2}\kappa - \kappa\frac{\bar V}{V} - \frac{1}{2}\,\frac{\rho\beta\kappa(\bar V - V)}{\nu(1-\rho^2)F^{1-\beta}}.
\]
At first order, the integral of A along the geodesic is needed. Using the same
notations as in Sect. 4.3 where variables have been rescaled such that ν = 1, it gets
an additional term
' (
ρκ V t
−1 −1 2
M = MSABR + 2 tan (t2 ) − tan (t1 ) − ln
1 − ρ2 R t1
' (
V V V
+ κ ln + − .
α V α
References
1. Avramidi, I.G.: Analytical and geometric methods for heat kernel applications in finance. http://
infohost.nmt.edu/~iavramid/notes/hkt/hktslides7b.pdf (2007)
2. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2(1), 61–69 (2002)
3. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
4. Cox, J.C.: Notes on option pricing I: constant elasticity of diffusions. Working paper (1975)
5. DeWitt, B.S.: Dynamical Theory of Groups and Fields. Gordon and Breach, New York (1965)
6. Hagan, P.S., Kumar, D., Lesniewski, A.S., Woodward, D.E.: Managing smile risk. Wilmott
Mag. 1(8), 84–108 (2002)
7. Hagan, P.S., Lesniewski, A.S., Woodward, D.E.: Probability distribution in the SABR model of
stochastic volatility. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.)
Large Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics
and Statistics, vol. 110 (2015)
8. Henry-Labordère, P.: A general asymptotic implied volatility for stochastic volatility models.
http://arxiv.org/abs/cond-mat/0504317 (2005)
9. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance: Advanced Methods in
Option Pricing. Chapman & Hall/CRC, Boca Raton (2008)
10. Heston, S.L.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
11. Varadhan, S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20(4),
659–685 (1967)
12. Varadhan, S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20(2), 431–455 (1967)
13. Vassilevich, D.V.: Heat kernel expansion: user’s manual. Physics report 388, 279–360. http://
arxiv.org/abs/hep-th/0306138 (2003)
14. Yanenko, N.N.: The Method of Fractional Steps. Springer, New York (1971)
Unifying the BGM and SABR Models:
A Short Ride in Hyperbolic Geometry
Pierre Henry-Labordère
1 Introduction
The BGM model [6, 14] has recently been the focus of much attention as it gives
a theoretical justification for pricing caps-floors using the classical Black-Scholes
formula. The basic (physical) random variables are given by the Libor forward rates
which are assumed to follow a correlated log-normal process. As the forward swap
rate model implied by the BGM model is quite complicated (the swap forward rate
is not log-normally distributed), the calibration to a swaption matrix is difficult. An
asymptotic swaption implied volatility (at the zero-order in the swaption maturity)
was initially derived by Rebonato [17], Hull and White [9] for the BGM model.
Despite its great success, the BGM model presents the same drawbacks as the
classical Black-Scholes theory: as the forward rates follow a correlated log-normal
process, the model is not able to calibrate the full swaption matrix in/out-the money
(in particular the caplets smile) or to give good dynamics to the Libor rates. The
incorporation of a swaption smile can be obtained by introducing more elaborate
models which should be flexible enough to calibrate caplets and a grid of swaption
volatilities (not necessarily at the money) across all swaption expiries and underlying
swap maturities. One property that these models must still share is their ability to
quickly calibrate the swaption matrix without using complicated numerical routines
such as Monte-Carlo simulation which are usually noisy and time-consuming. In this
context, Andersen-Andreasen introduced the CEV Libor Market Model (LMM) [1]
which assumes that each Libor forward rate follows a CEV process, and showed how
to obtain an asymptotic swaption smile. Their method is still based on the Rebonato
“freezing” argument which consists in assuming that the ratio of a forward Libor rate
over the swap rate and the derivative of the swap rate with respect to a forward Libor
rate are almost constant (and therefore equal to their values at the spot).
P. Henry-Labordère
Société Générale, Global Markets Quantitative Research, Paris, France
Recently, for this specific model, Kawai found a more accurate asymptotic formula
using the Wiener chaos expansion [15]. Although giving more flexibility than the
BGM model, the CEV LMM model is still not able to calibrate the swaption matrix
for in/out-the-money strikes and in this context, we are naturally led to use stochastic
volatility LMM. The literature on this subject is not particularly large. Andersen et al.
introduced a LMM where the Libors follow a multi-dimensional correlated CEV process
coupled (but uncorrelated) to a Heston model [2, 3] and more recently Piterbarg modified
this model by allowing the model parameters to be time-dependent [16]. Using an
averaging principle, which consists in replacing the time-dependent parameters by some
effective constant parameters, Piterbarg derives an asymptotic volatility. Note that as
these models are uncorrelated to the stochastic volatility, the swaption fair value is
simply given by the fair price in the case of a local volatility model conditional to the
stochastic volatility process as explained by the Hull and White decomposition [10].
An asymptotic expression can then be generated by approximating the moments of
the volatility process [2].
For pricing exotic options (such as Bermudan swaptions for example), it is simpler
or more natural to model directly the forward swap rate with a stochastic volatility
process. For example, the SABR model [11] was introduced to fulfill this goal. An
asymptotic swaption smile formula (at the first-order) was derived for this specific
model and helps to calibrate quickly the model to liquid market data. In this context,
it is natural to try to reconcile/unify both benchmark models, the BGM and
SABR models. We therefore introduce a LMM where the forward rates follow a
multi-dimensional CEV process (with one beta for each Libor forward rate) correlated
to a SABR model. As it is the case for the SABR model, we impose that the
Libors are correlated to a unique volatility and it is therefore not possible to follow
the Andersen et al. [3] method (i.e. the Hull and White decomposition) to derive an
asymptotic swaption smile.
In this paper, we pursue our previous work on the application of the heat kernel
expansion on a Riemannian manifold endowed with an Abelian connection [12]
to derive an asymptotic smile formula for a swaption. The plan of this paper is as
follows: in the first part, we will recall some definitions and present a list of recent
Libor Market Models. In the second part, we apply the heat kernel expansion to
derive an asymptotic swaption smile formula at the first-order valid for any LMM.
In the third part, we present our stochastic LMM and apply this general formula.
We will prove that the geometry underlying this model is the hyperbolic manifold
Hn+1 with n the number of forward rates. Furthermore, we show that the “freezing”
argument is no longer valid when we try to price a swaption in/out the money.
We denote by $F_k(t) \equiv F(t, T_{k-1}, T_k)$ the forward rate resetting at $T_{k-1}$, with
$\tau_k = T_k - T_{k-1}$ the tenor. As the product of the bond $P(t, T_k)$ with the forward
rate $F_k(t)$ is a difference of two bonds with maturities $T_{k-1}$ and $T_k$,
$\frac{1}{\tau_k}\bigl(P(t, T_{k-1}) - P(t, T_k)\bigr)$, and therefore a traded asset, $F_k$ is a (local) martingale under $Q^k$, the
(forward) measure associated with the numéraire $P(t, T_k)$. Therefore, we assume
the following driftless dynamics
Note that the stochastic differential equation for the Libors rate Fk has been written
in the forward measure Qk and the stochastic equation for a remains the same in the
forward or forward swap rate measures as a is assumed to be uncorrelated with the
Libor rates. This will not be the case in our LMM.
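The relation between a forward Libor rate and the two bonds that define it can be checked numerically. The sketch below is a minimal illustration, assuming simple compounding over a positive tenor; the function name `forward_libor` and the sample bond prices are hypothetical.

```python
def forward_libor(p_start, p_end, tau):
    """Simply compounded forward rate F_k from P(t, T_{k-1}) and P(t, T_k)."""
    return (p_start / p_end - 1.0) / tau


# Hypothetical bond prices and a one-year tenor
p_prev, p_k, tau_k = 0.95, 0.91, 1.0
f_k = forward_libor(p_prev, p_k, tau_k)

# P(t, T_k) * F_k is the difference of the two bonds divided by the tenor,
# i.e. a portfolio of traded assets
traded_asset = p_k * f_k
```

Because `traded_asset` is a combination of bond prices, dividing it by the numéraire P(t, T_k) gives back F_k, which is the reason for the martingale property used in the text.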
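We note $s_{\alpha\beta}$ the forward swap rate starting at $T_\alpha$ and expiring at $T_\beta$. The forward
swap rate satisfies the following driftless dynamics in the forward-swap measure
$Q^{\alpha\beta}$ (associated to the numéraire $C_{\alpha\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i P(t, T_i)$)
\[
ds_{\alpha\beta} = \sum_{k=\alpha+1}^{\beta} \frac{\partial s_{\alpha\beta}}{\partial F_k}\,\sigma_k(t)\,\Phi_k(a, F_k)\,dZ_k
\]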
(3.1)
3.1 Saddle-Point
As the conditional probability at the zero-order is proportional to $e^{-d(x,x^0)^2/(4t)}$ (see
Appendix A) with $d(x, x^0)$ the geodesic distance between the points
$x = (\{F_i\}_{i=1,\dots,n}, a) \in \mathbb{R}^{n+1}$ and $x^0 = (\{F_i^0\}_{i=1,\dots,n}, \alpha)$, the saddle-point corresponds
to the point $x$ on the submanifold $s_{\alpha\beta} = K$ which minimizes the geodesic distance
$d(x, x^0)$ [4, 5]
Plugging our asymptotic expression for the conditional probability (5.2) into (3.1)
and doing the integration over B using the Laplace method (see Appendix B), we
finally obtain the local volatility at the first-order
⎧
β
⎨
αβ
(σloc )2 (t, K ) = ρi j (t)σi (t)σ j (t) f i j (F ∗ , a ∗ )
⎩
i, j=α+1
⎛ ⎛ ⎞ ⎞⎫
n+1
∂μ f i j (F ∗ , a ∗ ) ∗ ∗
n+1
∂μν f i j (F ∗ , a ∗ ) ⎬
1 + 2t Aμν ⎝ ⎝2 ∂ν ψ(F , a ) − Aγδ ∂νγδ d 2 ⎠ + ⎠
f i j (F ∗ , a ∗ ) ψ(F ∗ , a ∗ ) f i j (F ∗ , a ∗ ) ⎭
μ,ν=1 γ,δ=1
∂s ∂sαβ √
with f i j (F, a) = i (a, Fi ) j (a, F j ) ∂ Fαβi ∂ Fj , ψ(F, a) = gP and Aμν =
[∂μν d 2 ]−1 . g, P and are defined in Appendix A and computed explicitly in Sect. 4.
Note that as opposed to other asymptotic methods presented in the literature, this
formula is exact at t → 0. A similar zero-order formula (independent of the time t
for σi (t), ρi j (t) constant) was derived for a general multi-dimensional local volatility
model by [4]. Moreover, in the expansion, we assumed that the time t is small but
we have made no assumption that Fk is close to the spot Libor or that the volatility
of volatility is small.
The asymptotic smile can be derived in two steps from the asymptotic local volatility:
first, we have (s0 ≡ sαβ (t = 0))
αβ
σloc (t, sαβ ) αβ
dsαβ = αβ
σloc (t, s0 )dBt
σloc (t, s0 )
t αβ
and doing a change of local time t = 0 σloc (u, s0 )2 du, we now obtain the associated
local volatility model for the swap rate
αβ
dsαβ = σ̄loc (t, sαβ )dBt
αβ
αβ σloc (t,s)
with σ̄loc (t, s) = αβ . Secondly, we know that there is a one-to-one
σloc (t,s0 )
correspondence between this local volatility and the smile [12] given at the first-
order by
Tα αβ 2
αβ 0 (σloc ) (u, s0 )du αβ
σBS (K , Tα ) = σBS (K )0
Tα
Tα
1 αβ αβ
× 1+ (σloc )2 (u, s0 )duσBS (K )1 (3.4)
2 0
αβ ln( sK )
σBS (K )0 = K 0
dx
s0 C(x)
αβ
αβ
αβ 1 (σBS (K )0 )2 K s0 1 ∂t σloc ( f av , 0)
σBS (K )1 = − 2 ln + αβ 2
K dx C(K ) (σ ) (0, s0 ) C( f av )
s0 C(x) loc
αβ
σloc (0,K ) s0 +K
with C( f ) ≡ αβ , f av ≡ 2 .
σloc (0,s0 )
4 SABR-LMM Model
We have seen that the asymptotic local and implied volatilities can be computed if we
know the geodesic distance and a parametrization of geodesic curves on Mn+1 . This
is the case for the hyperbolic space Hn for all n. This manifold has a lot of important
properties. In the first part, we present our BGM-LMM-SABR model and show
that the underlying geometry is Hn+1 (with n the number of forward Libor rates).
Using this connection, we will find an asymptotic local volatility and an asymptotic
swaption implied volatility.
4.1 Dynamics
We introduce the SABR-LMM model, given by the following SDE under the spot
Libor measure $Q$ (associated to the numéraire $B_m(t) = \prod_{j=1}^{\beta(t)-1} (1 + \tau_j F_j(T_{j-1}))\,P(t, T_{\beta(t)-1})$
where $\beta(t) = m$ if $T_{m-2} < t < T_{m-1}$)
with
\[
C_k(F_k) = \phi_k F_k^{\beta_k}
\]
\[
B_k(t, F) = \sum_{j=\beta(t)}^{k} \frac{\tau_j\,\rho_{jk}\,\sigma_k(t)\sigma_j(t)\,C_k(F_k)C_j(F_j)}{1 + \tau_j F_j}
\]
$Z$ is a correlated Brownian motion under the measure $Q$. Here we take the local
volatility term $\Phi_k(a, F_k)$ of type $a\phi_k F_k^{\beta_k}$ with $\phi_k \in \mathbb{R}$.
The functions $C_k(F_k)$ have been scaled by $\phi_k$ and therefore we can impose that
$\sigma_k(0) = 1$. The stochastic equation for $a$ was written in the spot Libor measure in
order to get a SDE independent of a specific underlying swap $s_{\alpha\beta}$ or a forward bond.
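Under the forward swap measure $Q^{\alpha\beta}$, we have
with
\[
b_k(t, F) = \sum_{j=\alpha+1}^{\beta} (2\cdot 1_{j\le k} - 1)\,\tau_j\,\frac{P(t, T_j)}{C_{\alpha\beta}(t)} \sum_{i=\min(k+1,\,j+1)}^{\max(k,\,j)} \frac{\tau_i\,\rho_{ki}\,\sigma_i(t)\sigma_k(t)\,C_i(F_i)C_k(F_k)}{1 + \tau_i F_i}
\]
\[
C_{\alpha\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i P(t, T_i)
\]
\[
b_a(F, t) = \sum_{i=\alpha+1}^{\beta} \tau_i\,\omega_i(t) \sum_{k=\beta(t)}^{i} \frac{\tau_k\,C_k(F_k)\,\rho_{ka}\,\sigma_k(t)}{1 + \tau_k F_k(t)}
\]
and with
\[
\omega_i(t) = \frac{\prod_{k=\beta(t)}^{i} \frac{1}{1+\tau_k F_k}}{\sum_{j=\alpha+1}^{\beta} \prod_{k=\beta(t)}^{j} \frac{1}{1+\tau_k F_k}}.
\]
Here $Z$ is a correlated Brownian motion
under the measure $Q^{\alpha\beta}$. Note that the forward-rate dynamics under the forward measure
$Q^k$ is much simpler and given by the following stochastic differential equations
(SDE)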
As it is the case for the BGM model, we can use a piecewise parametric form or a
functional form for the serial volatilities σi (t) and the correlation ρi j (t) (here full
rank) as
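By definition (see Eq. (5.1a)), the infinitesimal distance (at $t = 0$) between the points
$x^\alpha$ and $x^\alpha + dx^\alpha$ is given by ($\rho^{ij} \equiv [\rho^{-1}]_{ij}$, $(i, j) = (1, \dots, n)$, $\rho^{ia} \equiv [\rho^{-1}]_{ia}$
and $\rho^{aa} \equiv [\rho^{-1}]_{aa}$ are the components of the inverse of the correlation matrix $\rho$)
\[
ds^2 \equiv \sum_{\alpha,\beta=1}^{n+1} g_{\alpha\beta}\,dx^\alpha dx^\beta
= \frac{2}{\nu^2 a^2}\left(\sum_{i,j=1}^{n} \rho^{ij}\,\frac{\nu\,dF_i}{C_i(F_i)}\,\frac{\nu\,dF_j}{C_j(F_j)} + 2\sum_{i=1}^{n} \rho^{ia}\,\frac{\nu\,dF_i}{C_i(F_i)}\,da + \rho^{aa}\,da^2\right)
\]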
After some algebraic manipulations, we show that in the new coordinates [xk ]k=1...n+1
( L̂ is the Cholesky decomposition of the (reduced) correlation matrix: [ρ]i, j=1...n =
[ L̂ L̂ † ]i, j=1...n )
n Fi n
d Fi
xk = ν L̂ ki
+ ρia L̂ ik a , k = 1, . . . , n
Fi0 Ci (Fi )
i=1 i=1
n
1
xn+1 = (ρaa − ρia ρ ja ρi j ) 2 a
i, j
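Using the geodesic distance on $H^{n+1}$ between the points $x = (\{F\}_k, a)$ and the
initial point $x^0 = (\{F^0\}_k, \alpha)$ ($q_i \equiv \int_{F_i^0}^{F_i} \frac{dF_i'}{C_i(F_i')}$) given by
\[
d(F, a|F^0, \alpha) = \cosh^{-1}\left[1 + \frac{\nu^2\sum_{i,j=1}^{n} \rho^{ij} q_i q_j + 2\nu(a-\alpha)\sum_{j=1}^{n} \rho^{ja} q_j + (a-\alpha)^2\rho^{aa}}{2\bigl(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\bigr)\,a\,\alpha}\right]
\]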
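The distance formula above is straightforward to evaluate numerically. The following pure-Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the last row and column of `corr` are taken to be the volatility index a, and the factor ρ_ij in the denominator is read as the inverse of the n × n block [ρ^{ij}] (a notational assumption; with this reading the prefactor equals 1 for a correlation matrix, and the n = 1 case reduces to the usual SABR distance).

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix given as nested lists."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hyperbolic_distance(q, a, alpha, nu, corr):
    """d(F, a | F0, alpha) for the (n+1)x(n+1) correlation matrix corr."""
    n = len(q)
    inv = invert(corr)  # components rho^{ij}, rho^{ia}, rho^{aa}
    quad = sum(inv[i][j] * q[i] * q[j] for i in range(n) for j in range(n))
    cross = sum(inv[j][n] * q[j] for j in range(n))
    numerator = nu**2 * quad + 2.0 * nu * (a - alpha) * cross + (a - alpha)**2 * inv[n][n]
    # Assumption: rho_ij in the prefactor is the inverse of the block [rho^{ij}]
    block_inv = invert([row[:n] for row in inv[:n]])
    prefactor = inv[n][n] - sum(inv[i][n] * inv[j][n] * block_inv[i][j]
                                for i in range(n) for j in range(n))
    return math.acosh(1.0 + numerator / (2.0 * prefactor * a * alpha))


# One forward rate (n = 1): comparison with the classical SABR distance
rho, nu, alpha, a, q = -0.5, 0.4, 0.3, 0.35, 0.5
d_general = hyperbolic_distance([q], a, alpha, nu, [[1.0, rho], [rho, 1.0]])
d_sabr = math.acosh(1.0 + (nu**2 * q**2 - 2.0 * rho * nu * q * (a - alpha) + (a - alpha)**2)
                    / (2.0 * (1.0 - rho**2) * a * alpha))
```

The hand-written inverse keeps the sketch free of external dependencies; for more than a few forward rates a linear algebra library would be the natural choice.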
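In this model, the non-linear equations (3.3), satisfied by the saddle-point $a^*(s)$, $q_i^*(s)$
which implicitly depends on the swaption strike $s$, read:
\[
a^*(s)^2\rho^{aa} = \alpha^2\rho^{aa} - 2\nu\alpha\sum_{i=1}^{n} \rho^{ia} q_i^* + \nu^2\sum_{i,j=1}^{n} \rho^{ij} q_i^* q_j^* \tag{4.4}
\]
\[
\frac{\Bigl(\rho^{ia}\,\frac{a^*(s)-\alpha}{\nu} + \sum_{j=1}^{n} \rho^{ij} q_j^*\Bigr)\,d(q^*, a^*)}{a^*(s)\,\bigl(\cosh(d(a^*, \{q_i^*\}))^2 - 1\bigr)^{1/2}} = \frac{\lambda}{\alpha}\,\frac{\partial s_{\alpha\beta}}{\partial q_i}\Big|_{(a,q)=(a^*,q^*)} \tag{4.5}
\]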
with
\[
q_i^* = \phi_i^{-1}\int_{F_i^0}^{F_i^*} x^{-\beta_i}\,dx
\]
The saddle-point is determined by solving these non-linear equations (4.4) and (4.5)
and an approximation (which could be used as a guess solution in a numerical opti-
mization routine) is found by linearizing these equations around the spot Libor rates
(i.e. qi = 0)
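\[
\lambda^*(s) = \frac{s - s_0}{\sum_{p,q=1}^{n} \omega_p\,\omega_q\,\tilde\rho_{pq}} \tag{4.6}
\]
\[
\frac{F_i^*}{F_i^0} = 1 + \frac{\sum_{j=1}^{n} \tilde\rho_{ij}\,\omega_j\,(s - s_0)}{\sum_{p,q=1}^{n} \omega_p\,\omega_q\,\tilde\rho_{pq}} + O((s - s_0)^2) \tag{4.7}
\]
with $\omega_i \equiv \frac{\partial s_{\alpha\beta}}{\partial q_i}(q_i = 0)$ and $\tilde\rho_{ij} = \rho_{ij} - \frac{\rho_{ia}\rho_{ja}}{\rho_{aa}}$. Note that when the strike is close to
at-the-money, the saddle-points are close to the spot Libors and $a^* = \alpha$. Moreover,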
by using the explicit expression for the hyperbolic distance, the Van-Vleck-Morette
determinant is
d(F, a|F 0 , α)
(F, a|F 0 , α) =
cosh2 (d(F, a|F 0 , α)) − 1
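The linearized solution (4.6) and (4.7) is easy to code as a starting guess for a numerical solver of (4.4) and (4.5). The sketch below is a minimal illustration: the function name is hypothetical, and the sensitivities ω_i and the matrix ρ̃ are assumed to be supplied by the caller.

```python
def linearized_saddle_point(strike, spot_swap_rate, omega, rho_tilde):
    """Return lambda*(s) of Eq. (4.6) and the first-order shifts F_i*/F_i0 - 1 of Eq. (4.7)."""
    n = len(omega)
    # (rho_tilde . omega)_i
    rho_omega = [sum(rho_tilde[i][j] * omega[j] for j in range(n)) for i in range(n)]
    # omega . rho_tilde . omega
    denominator = sum(omega[i] * rho_omega[i] for i in range(n))
    lam = (strike - spot_swap_rate) / denominator
    return lam, [lam * v for v in rho_omega]


# Hypothetical inputs for two forward rates
omega = [0.6, 0.4]
rho_tilde = [[1.0, 0.3], [0.3, 1.0]]
lam, shifts = linearized_saddle_point(0.055, 0.05, omega, rho_tilde)
```

By construction the shifts weighted by ω add up to s − s0, so the linearized saddle-point reproduces the strike at first order.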
4.3 Connection
Finally, the Abelian 1-form connection, $A = \sum_{i=1}^{n} A_i\,dF_i + A_a\,da$, is
n
1 b j (t, F) ∂ j C j (F j )
n
A= − ν ρi j dqi + ρa j da
ν C j (F j ) 2
j=1 i=1
n
− ba (t, F) ν ρia dqi + ρaa da
i=1
In order to compute the log of the parallel gauge transport $\ln(P)(a, q|\alpha) = \int_C A$,
we need to know a parametrization of the geodesic curve on $H^{n+1}$. However, we
can directly find $\ln(P)(a, q|\alpha)$ if we approximate the drifts $b_k(t, F)$ by their values
at the Libor spots (and $t = 0$). A similar approximation was done in the Hagan et al.
formula [11] as was shown in [12]. Modulo this approximation,
n
1 b j (0, F 0 ) ∂ j C j (F j )
n 0
ln(P)(a, q|α) ∼ − ν ρi j qi + ρa j (a − α)
ν C j (F 0)
j
2
j=1 i=1
n
− ba (0, F 0 ) ν ρia qi + ρaa (a − α)
i=1
β
αβ
(σloc )2 (t, s) = ρi j σi (t)σ j (t) f i j (a, F)
i, j=α+1
⎛ ⎧
β
⎨∂ f (F∗, a ∗ ) ∂μ f i j (F∗, a ∗ ) ∂ν ψ(F∗, a ∗ )
μν i j
× ⎝1 + 2t Aμν + 2
⎩ f i j (F∗, a ∗ ) f i j (F∗, a ∗ ) ψ(F∗, a ∗ )
μ,ν=α+1
⎫⎞
n+1
∂μ f i j (F∗, a ∗ ) ⎬
− Aγδ ∂νγδ d 2
(F∗, a ∗
) ⎠
f i j (F∗, a ∗ ) ⎭
γ,δ=1
with (a ∗ ≡ a ∗ (s), F ∗ ≡ {Fi∗ (s)}i ) the saddle-point satisfying the Eqs. (4.4) and
(4.5) approximated by (4.6) and (4.7) and
∂sαβ ∂sαβ
f i j (a, F) = a 2 Ci (Fi )C j (F j ) , ψ(a, F) = gP , Aαβ = [∂αβ d 2 ]−1
∂ Fi ∂ F j
2 n ij n ja 2
ν i, j=1 ρ qi q j + 2ν(a − α) j=1 ρ q j + (a − α)
0
d(F, a|F , α) = cosh −1 1+ n
2(ρaa − i, j=1 ρ ρ ρi j )aα
ia ja
⎡ ⎛ ⎞⎤
0
1 ⎣ b j (0, F 0 ) ∂ j C j (F j ) ⎝ i j
n n
ln(P)(a, q|α) ∼ − ν ρ qi + ρ (a − α)⎠⎦
a j
ν C j (F j0 ) 2
j=1 i=1
⎛ ⎞
n
− ba (0, F ) ⎝ν
0 ρ qi + ρ (a − α)⎠
ia aa
i=1
d(F, a|F 0 , α)
(F, a|F 0 , α) =
cosh2 (d(F, a|F 0 , α)) − 1
n+1 1
√ 2 det[ρ]− 2
2
g= n
νa 1+n
i=1 Ci (Fi )
Note that this expression is exact when $t$ goes to zero. The smile at the first-order
is then obtained by plugging the above expression into (3.4) with $t$ replaced by
$\frac{\nu^2}{2\bigl(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\bigr)}\,t$.
Remark 4.4.1 (Libor CEV model) Note that our model reduces when $\nu$ goes to zero
(and $\alpha \equiv 1$) to the Andersen-Andreasen CEV Libor model (with different CEV
parameters for each Libor) and the above expressions degenerate into
\[
f_{ij}(F) = C_i(F_i)\,C_j(F_j)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}\frac{\partial s_{\alpha\beta}}{\partial F_j}
\]
\[
d(F) = \sqrt{2\sum_{i,j=1}^{n} \rho^{ij} q_i q_j}
\]
\[
\ln(P)(q) = \sum_{j=1}^{n}\left(\frac{b_j(0, F^0)}{C_j(F_j^0)} - \frac{\partial_j C_j(F_j^0)}{2}\right)\sum_{i=1}^{n} \rho^{ij} q_i
\]
\[
\Delta(F, F^0) = 1
\]
\[
\sqrt{g} = \frac{2^{n/2}\,\det[\rho]^{-1/2}}{\prod_{i=1}^{n} C_i(F_i)}
\]
with the saddle-points (4.4) and (4.5) satisfying the non-linear equations (modulo
the constraint $s_{\alpha\beta} = s$)
\[
\sum_{j=1}^{n} \rho^{ij} q_j^* = \lambda\,\frac{\partial s_{\alpha\beta}}{\partial q_i}\Big|_{q^*}.
\]
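It is interesting to note that for $n = 1$, i.e. for a caplet, the caplet asymptotic smile
reduces to the classical SABR formula by construction. Moreover, the asymptotic
local volatility is given at the zero-order by
\[
(\sigma_{loc}^{\alpha\beta})^2(s, t) = \sum_{i,j=1}^{n} \rho_{ij}(t)\,\sigma_i(t)\sigma_j(t)\,a^{*2}(F^*)\,C_i(F_i^*)\,C_j(F_j^*)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}(F^*)\,\frac{\partial s_{\alpha\beta}}{\partial F_j}(F^*)
\]
with $F^*$ depending implicitly on $s$ via (4.4) and (4.5). At this stage, it is useful
to recall how a similar asymptotic local volatility is derived using the “freezing”
argument. The forward swap rate satisfies the following SDE in the forward swap
measure $Q^{\alpha\beta}$
\[
ds_{\alpha\beta} = \sum_{k=1}^{n} \frac{\partial s_{\alpha\beta}}{\partial F_k}\,\sigma_k(t)\,a\,C_k(F_k)\,dZ_k \tag{4.8}
\]
The “freezing” argument consists in assuming that the terms $\frac{\partial s_{\alpha\beta}}{\partial F_k}$ and $\frac{C_k(F_k)}{C_k(s)}$ are
almost constant. Therefore, the SDE (4.8) can be approximated by
\[
ds_{\alpha\beta} = \sum_{k=1}^{n} \frac{\partial s_{\alpha\beta}}{\partial F_k}(F^0)\,\sigma_k(t)\,a\,\frac{C_k(F_k^0)}{C_k(s^0)}\,C_k(s)\,dZ_k
\]
and the corresponding local volatility is
\[
(\sigma_{loc}^{\alpha\beta})^2(s, t) = \sum_{i,j=1}^{n} \rho_{ij}(t)\,\sigma_i(t)\sigma_j(t)\,a^{*2}(s)\,\frac{C_i(F_i^0)}{C_i(s^0)}\,\frac{C_j(F_j^0)}{C_j(s^0)}\,\frac{\partial s_{\alpha\beta}}{\partial F_i}(F^0)\,\frac{\partial s_{\alpha\beta}}{\partial F_j}(F^0)\,C_i(s)\,C_j(s)
\]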
Table 4 Scenario B: Libor volatility $dL_i = 0.25\,(0.17 + 0.002\,(T_{i-1} - t))\,L_i^{\beta}\,dW$

Swaption   Strike            MC (%)   F1 (error, bp)   F2 (error, bp)
5 × 15     5.08 % (ITM)      18.12    18.20 % (8)      18.17 % (5)
5 × 15     7.26 % (ATM)      16.51    16.61 % (10)     16.63 % (12)
5 × 15     9.44 % (OTM)      15.38    15.38 % (0)      15.56 % (18)
10 × 10    5.55 % (ITM)      17.80    17.81 % (1)      17.89 % (9)
10 × 10    7.93 % (ATM)      16.26    16.33 % (7)      16.38 % (11)
10 × 10    10.31 % (OTM)     15.17    15.19 % (2)      15.32 % (15)

Libor $L_i(0) = \log(a + bi)$, $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$, $\beta = 0.5$
We can reproduce this formula for the swaption smile at-the-money1 as the
saddle-point Libor rates coincide with the spot rates. This is not the case for in/out-
the-money swaptions. Therefore our expression (exact at the zero-order) shows that
the freezing argument is no longer correct when we try to fit a swaption implied smile
in/out-the-money. In the following, we have tested our asymptotic swaption formula
at the zero-order with the same beta βk = β and ν = 0 (Formula F1) against the
Andersen-Andreasen asymptotic formula (Formula F2) [1] in the case ν = 0. The
accuracy of these approximations is examined using Monte-Carlo (MC) prices as a
benchmark. Following [15], we consider five scenarios (see Tables 3, 4, 5, 6 and 7). In
the following tables, the implied volatility is reported and the numbers in brackets are
the errors (in basis points, i.e. true volatility times 10^4) corresponding to the implied
volatility computed using the F1 or F2 formula minus the MC implied volatility. An
x × y swaption has an option maturity of x years, a swap length of y years and a
tenor of one year. We set a time-step for Monte-Carlo δ = 0.125 and 2^16 paths.2 In most
of the cases reported in the tables, our formula F1 is more accurate than F2.
1 An at-the-money swaption (ATM) has a strike K equal to the spot rate sαβ (0) and an out-of-the
money (OTM) (resp. in-the-money (ITM)) swaption has K < sαβ (0) (resp. K > sαβ (0)).
2 We have used a predictor-corrector scheme with a Brownian bridge.
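As a worked example of the error convention, the bracketed numbers can be recomputed from the reported volatilities. The helper below is a hypothetical illustration; the inputs are the 5 × 15 ITM entries of Table 4, expressed in percent.

```python
def error_in_basis_points(formula_vol_percent, mc_vol_percent):
    """Formula minus Monte-Carlo implied volatility, in basis points."""
    # A difference of 0.01 (in percent units) is one basis point
    return round((formula_vol_percent - mc_vol_percent) * 100.0)


# 5 x 15 ITM swaption of Table 4 (volatilities in percent)
f1_error = error_in_basis_points(18.20, 18.12)
f2_error = error_in_basis_points(18.17, 18.12)
```

Here `f1_error` and `f2_error` reproduce the bracketed values 8 and 5 of that row.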
Table 5 Scenario C: Libor volatility $dL_i = 0.25\,(0.17 - 0.002\,(T_{i-1} - t))\,L_i^{\beta}\,dW$

Swaption   Strike            MC (%)   F1 (error, bp)   F2 (error, bp)
5 × 15     5.08 % (ITM)      14.89    14.97 % (8)      15.08 % (19)
5 × 15     7.26 % (ATM)      13.73    13.79 % (4)      13.81 % (8)
5 × 15     9.44 % (OTM)      12.92    12.91 % (−1)     12.92 % (0)
10 × 10    5.55 % (ITM)      14.52    14.53 % (1)      14.64 % (12)
10 × 10    7.93 % (ATM)      13.33    13.38 % (5)      13.40 % (7)
10 × 10    10.31 % (OTM)     12.51    12.51 % (0)      12.54 % (3)

Libor $L_i(0) = \log(a + bi)$, $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$, $\beta = 0.5$
Table 6 Scenario D: $dL_i = 0.05\,L_i^{\beta}\left(\frac{b_{i1}(t)}{\sqrt{b_{i1}(t)^2 + b_{i2}(t)^2}}\,dW_1 + \frac{b_{i2}(t)}{\sqrt{b_{i1}(t)^2 + b_{i2}(t)^2}}\,dW_2\right)$,
$b_{i1}(t) = \rho e^{-k_1(T_{i-1}-t)} + \theta e^{-k_2(T_{i-1}-t)}$, $b_{i2}(t) = \sqrt{1-\rho^2}\,e^{-k_1(T_{i-1}-t)}$

Swaption   Strike            MC (%)   F1 (error, bp)   F2 (error, bp)
5 × 15     5.08 % (ITM)      19.19    19.33 % (14)     19.38 % (19)
5 × 15     7.26 % (ATM)      17.59    17.72 % (13)     17.75 % (16)
5 × 15     9.44 % (OTM)      16.46    16.49 % (3)      16.61 % (15)
10 × 10    5.55 % (ITM)      18.92    18.94 % (2)      19.06 % (14)
10 × 10    7.93 % (ATM)      17.31    17.39 % (8)      17.45 % (14)
10 × 10    10.31 % (OTM)     16.18    16.21 % (3)      16.32 % (14)

$\rho = 0.99$, $\theta = -0.99$, $k_1 = k_2 = 0.54$. Libor $L_i(0) = \log(a + bi)$, $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$, $\beta = 0.5$
5 Conclusion
In this short note, we have introduced a LMM model coupled to a SABR stochastic
volatility process. By using the heat kernel expansion technique in the short time
limit, we have obtained an asymptotic swaption implied volatility at the first-order,
compatible with the Hagan-al classical formula for caplets. Moreover, we have seen
that this exact expression (when the expiry is very short) is incompatible with the
analogous expression obtained using the freezing argument.
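with $dW_\mu\,dW_\nu = \rho_{\mu\nu}(t)\,dt$. Then, the metric $g_{\mu\nu}$ depends only on the diffusion terms
$\sigma_\mu$ and the connection $A_\mu$ on the drift terms $b^\mu$ as well
\[
g_{\mu\nu}(t, x) = 2\,\frac{\rho^{\mu\nu}(t)}{\sigma_\mu(t, x)\,\sigma_\nu(t, x)}, \quad \mu, \nu = 1, \dots, n, \quad \rho^{\mu\nu} \equiv [\rho^{-1}]_{\mu\nu} \tag{5.1a}
\]
\[
A_\alpha(t, x) = \frac{1}{2}\sum_{\mu=1}^{n} g_{\alpha\mu}\left(b^\mu(t, x) - g^{-1/2}\sum_{\nu=1}^{n} \partial_\nu\bigl(g^{1/2} g^{\mu\nu}(t, x)\bigr)\right), \quad \alpha = 1, \dots, n \tag{5.1b}
\]
with $g(t, x) \equiv \det[g_{\mu\nu}(t, x)]$. In terms of these functions, the asymptotic solution to
the Kolmogorov equation in the short-time limit is given by
\[
p(x, t|x^0) = \frac{\sqrt{g(x)}}{(4\pi t)^{n/2}}\,\sqrt{\Delta(x, x^0)}\,P(x, x^0)\,e^{-\frac{\sigma(x, x^0)}{2t}}\left(1 + \sum_{n=1}^{\infty} a_n(x, x^0)\,t^n\right), \quad t \to 0 \tag{5.2}
\]
with
\[
\sigma(x, x^0) = \frac{d^2(x, x^0)}{2}
\]
\[
\Delta(x, x^0) = g(0, x)^{-1/2}\,\det\left(-\frac{\partial^2\sigma(x, x^0)}{\partial x\,\partial x^0}\right) g(0, x^0)^{-1/2} \tag{5.3}
\]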
References
1. Andersen, L., Andreasen, J.: Volatility skews and extensions of the Libor market model. Appl.
Math. Financ. 7(1), 1–32 (2000)
2. Andersen, L., Andreasen, J.: Volatile volatilities. Risk 15(12), 163–168 (2002)
3. Andersen, L., Brotherton-Ratcliffe, R.: Extended Libor market models with stochastic volatil-
ity. J. Comput. Financ. 9(1), 1–40 (2005)
4. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing the smile. Risk Mag.
(2002)
5. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 31–44 (1998)
6. Brace, A., Gatarek, D., Musiela, M.: The market model of interest rate dynamics. Math. Financ.
7, 127–154 (1996)
7. Brigo, D., Mercurio, F.: Interest Rate Models, Theory and Practice. Springer Finance. Springer,
Berlin (2015)
8. Erdélyi, A.: Asymptotic Expansions. Dover, New York (1956)
9. Hull, J., White, A.: Forward rate volatilities swap rate volatilities, and the implementation of
the Libor market model. J. Fixed Income 3, 46–62 (2000)
10. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 42,
281–300 (1987)
11. Hagan, P., Kumar, D., Lesniewski, A., Woodward, D.: Managing smile risk. Wilmott Mag.
84–108 (2002)
12. Henry-Labordère, P.: A General Asymptotic Implied Volatility for Stochastic Volatility Models.
http://arxiv.org/abs/cond-mat/0504317
13. Henry-Labordère, P.: Solvable local and stochastic volatility models: supersymmetric methods
in option pricing. Quant. Financ. 7(5), 525–535 (2007)
14. Jamshidian, F.: Libor and swap market models and measures. Financ. Stoch. 1(4), 293–330
(1997)
15. Kawai, A.: A new approximate swaption formula in the Libor market model: an asymptotic
approach. Appl. Math. Financ. 10(1), 49–74 (2003)
16. Piterbarg, V.: A stochastic volatility forward Libor model with a term structure of volatility
smiles, SSRN
17. Rebonato, R.: On the pricing implications of the joint lognormal assumption for the swaption
and cap markets. J. Comput. Financ. 3(3), 5–26 (1999)
18. Terras, A.: Harmonic Analysis on Symmetric Spaces and Applications, vol. I, II. Springer, New
York (1985, 1988)
Second Order Expansion for Implied
Volatility in Two Factor Local Stochastic
Volatility Models and Applications
to the Dynamic λ-Sabr Model
1 Introduction
Part of this work carried out while the author was a Visiting Scholar at the Courant Institute.
literature were the Hull and White model [37], the Stein and Stein model [50] and
the Heston model [33]. In all three of these models the underlying asset and its volatility
are driven by Brownian motions that may or may not be instantaneously correlated.
The correlation coefficient is taken to be a constant ρ. Then Bates introduced the
first of a series of models incorporating jumps [7]. These were followed by work by
Andersen and Andreasen [2].
In recent times there has been an explosion of models using the method of stochastic
time changes to produce ever more versatile models [15, 16]. However, purely
diffusive models have retained their popularity. A case in point is the introduction
into the literature of the dynamic SABR (stochastic alpha-beta-rho) model by
Hagan, Lesniewski and Woodward [32]:
\[
\begin{aligned}
dF_t &= \gamma(t)\,F_t^{\beta}\,y_t\,dW_{1t}\\
dy_t &= \nu(t)\,y_t\,dW_{2t}\\
\langle dW_{1t}, dW_{2t}\rangle &= \rho(t)\,dt\\
F_0 &= \bar F_0;\quad y_0 = \alpha
\end{aligned}
\]
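To make the dynamics of the dynamic SABR system concrete, here is a minimal simulation sketch. The Euler step for the forward, the absorption of the forward at zero, the log-Euler step for the volatility factor and the function name `simulate_dynamic_sabr` are all assumptions of this illustration; none of them is taken from [32].

```python
import math
import random


def simulate_dynamic_sabr(f0, alpha, beta, gamma, nu, rho, horizon, n_steps, seed=0):
    """Simulate one path of dF = gamma(t) F^beta y dW1, dy = nu(t) y dW2.

    gamma, nu and rho are callables of time. The discretization scheme and
    the absorption of F at zero are assumptions of this sketch.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    f, y = f0, alpha
    path = [(0.0, f, y)]
    for k in range(n_steps):
        t = k * dt
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        r = rho(t)
        dw1 = math.sqrt(dt) * z1
        # Correlate the second Brownian increment with the first.
        dw2 = math.sqrt(dt) * (r * z1 + math.sqrt(1.0 - r * r) * z2)
        if f > 0.0:
            # Plain Euler step for the forward, absorbed at zero.
            f = max(f + gamma(t) * f ** beta * y * dw1, 0.0)
        # Log-Euler step for the volatility factor keeps y strictly positive.
        y *= math.exp(nu(t) * dw2 - 0.5 * nu(t) ** 2 * dt)
        path.append((t + dt, f, y))
    return path


path = simulate_dynamic_sabr(100.0, 0.2, 0.5, lambda t: 1.0, lambda t: 0.4,
                             lambda t: -0.3, horizon=1.0, n_steps=250, seed=1)
```

Each entry of `path` is a triple (time, forward, volatility factor); passing time-dependent callables for `gamma`, `nu` and `rho` gives the time inhomogeneous case.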
A possible shortcoming of the Sabr model was the lack of mean reversion in the stochastic
volatility. To address this, a generalization was proposed by Henry-Labordère, who
in [39] introduced the λ-Sabr model. In this new model the second equation is
complemented by a mean-reverting term
extending earlier asymptotic results for local volatility models to second order, so
that they become capable of furnishing highly accurate expansions both for time
homogeneous and time inhomogeneous models. In addition, unlike earlier treatments (with
the exception of Forde and Jacquier’s treatment of the Heston model), the expansions
in [28] were put on a rigorous mathematical basis. In particular it is shown that the
terms in the expansion of the implied volatility can actually be interpreted as
derivatives with respect to time to maturity and, for this reason, are in a certain sense
optimal.
In Sect. 4.2 we combine the results in [28] with the Gyöngy projection technique
and with the heat kernel method for time inhomogeneous diffusions to develop
asymptotic expansions that are highly accurate, and we show, using in part the results
in [28], that these results extend to stochastic volatility models with time dependent
parameters. That is to say, we describe in detail the first and second term in the
expansion of the implied volatility in the limit of short times to expiration in two
factor local-stochastic volatility models. The form of the resulting expansion is:
• Second order expansion of the local volatility $\sigma_L(f_t, a_t, t, f, T)$:
\[
\sigma_L(f_t, a_t, t, f, T) = \sigma_L^{(0)}(f_t, a_t, t, f) + \sigma_L^{(1)}(f_t, a_t, t, f)\,\tau
+ \sigma_L^{(2)}(f_t, a_t, t, f)\,\tau^2 + o(\tau^2), \quad \tau \to 0 \tag{1.1}
\]
where we have set $\tau = T - t$ and where, in the time inhomogeneous case, we note
that the coefficients $\sigma_L^{(i)}$, $i = 0, 1, 2$, will depend explicitly on the spot time $t$ as
well.
• Order $\tau^{5/2}$ expansion of the call prices:
\[
\left[C(s, t, K, T) - (s - K)^+\right] e^{d(K,s,t)^2/2\tau}
= C^{(1)}(s, K, t)\,\tau^{3/2} + C^{(2)}(s, K, t)\,\tau^{5/2} + o(\tau^{5/2}). \tag{1.2}
\]
• Second order expansion of the implied volatility:
\[
\sigma_{BS}(f_t, a_t, t, f, T) = \sigma_{BS}^{(0)}(f_t, a_t, t, f) + \sigma_{BS}^{(1)}(f_t, a_t, t, f)\,\tau
+ \sigma_{BS}^{(2)}(f_t, a_t, t, f)\,\tau^2 + o(\tau^2) \tag{1.3}
\]
The key to the proof will be determining the coefficients in the first expansion (for
local volatility). With the local volatility up to second order in hand, we can apply
the asymptotic expansion obtained for time inhomogeneous local volatility models
in [28] to derive the implied volatility. One difference between this paper and earlier
ones is that we do not make simplifications for the purpose of making formulas shorter
or simpler. We take the point of view that the derivation of the full length formulas
should be made clear and these should be presented at a first stage. At a second stage,
one can explore ways to simplify or shorten the formulas. Since the formulas are all in
closed form, up to quadrature, this requires at most approximating a one dimensional
integral by Simpson’s rule and thus, even if the formulas are long, their calculation
is instantaneous. An illustration is given for the λ-Sabr model. We devote a section
to the asymptotics obtained in [28] for the one factor (local volatility) models, since
these can be readily applied in the present setting once the local volatility function
has been determined.
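The remark about quadrature can be illustrated with a short sketch of composite Simpson's rule applied to a one dimensional integral of the type that occurs in this chapter, namely a distance integral of the form $\int_K^f du/\sigma_L(u)$. The CEV-type local volatility `gamma * u**beta` and all function names are hypothetical choices made only so that the result can be compared with a closed form.

```python
def simpson(func, lower, upper, n_intervals=200):
    """Composite Simpson rule on [lower, upper]; n_intervals is rounded up to even."""
    n = n_intervals + (n_intervals % 2)
    h = (upper - lower) / n
    total = func(lower) + func(upper)
    for i in range(1, n):
        # Odd nodes carry weight 4, even interior nodes weight 2.
        total += (4.0 if i % 2 else 2.0) * func(lower + i * h)
    return total * h / 3.0


def distance_1d(f, strike, local_vol):
    """Signed distance: integral from strike to f of du / local_vol(u)."""
    return simpson(lambda u: 1.0 / local_vol(u), strike, f)


# Hypothetical test case: local_vol(u) = gamma * u**beta has a closed-form distance.
gamma, beta = 0.3, 0.5
d_numeric = distance_1d(1.2, 0.8, lambda u: gamma * u ** beta)
d_exact = (1.2 ** (1 - beta) - 0.8 ** (1 - beta)) / (gamma * (1 - beta))
```

With a few hundred nodes the quadrature error for a smooth integrand of this kind is far below any accuracy relevant in practice, which is the sense in which long closed-form-up-to-quadrature formulas remain cheap to evaluate.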
As this work was nearing completion we learned of nice independent work by
Louis Paulot [48] who also derives a second order asymptotic expansion for the
implied volatility in two factor stochastic volatility models in the time homogeneous
case. Paulot does not consider the time inhomogeneous case considered in this paper.
Paulot takes a somewhat different (and very reasonable) approach to ours, in which
he bypasses the computation of the local volatility function. For the local variance
function (square of the local volatility), which he does not try to determine explicitly,
his calculations stop at the first order (4.23) and do not seek to determine the next
order (4.24). The determination of the local volatility in stochastic volatility models
is of independent interest and is provided by our method. Once the local volatility
has been determined, the determination of the call prices is a straightforward “plug
and play”, using Proposition 5.2. Alternatively, using the local volatility one can
use Dupire’s forward equation to obtain the call prices for all strikes and maturities.
The determination of the implied volatility is more involved, since the relationship
between local volatility and implied volatility derived in [28] and recalled in (5.3) is
correspondingly more complicated.
Another important issue concerns rigor. Two questions are often asked concerning
asymptotic expansions used in mathematical finance and also in other areas of the
applied sciences. The first is: are the results rigorous? The second, related question
concerns the role of the boundary conditions and their influence on the asymptotics.
In the present paper, in the style of classical asymptotic
analysis, our main concern is not full rigor, but rather to provide details of full
expansions. As opposed to previous works, we provide full detail of intermediate
steps.
However, in order to point out the rigorous underpinnings of this work, the influence
of degenerating coefficients, and of boundary conditions, we devote a section to
a discussion of how (certain aspects of) our results here can be made rigorous,
based on work of Azencott and co-workers [6], the relevance of whose work seems
not to have been noticed in mathematical finance heretofore. In particular, we point
out that the Sabr and λ-Sabr models are associated to an incomplete Riemannian
manifold when β < 1.
In this section, we introduce the family of models to which our results will apply.
It is understood that we are working under a risk neutral forward measure and are
94 G. Ben Arous and P. Laurence
with initial conditions at time zero given by (s0 , a0 ). Here W1 and W2 are Brownian
motions with deterministic, possibly time dependent correlation ρ(t).
The associated Kolmogorov backward and forward equations for this family of
models are given by:
\[
p_t + \tfrac{1}{2} C^2 a^2 p_{ff} + \rho C a v\, p_{fa} + \tfrac{1}{2} v^2 p_{aa} + \alpha p_a = 0 \tag{2.4}
\]
\[
p_t - \tfrac{1}{2} (C^2 a^2 p)_{ff} - (\rho C a v\, p)_{fa} - \tfrac{1}{2} (v^2 p)_{aa} + (\alpha p)_a = 0 \tag{2.5}
\]
The matrix a with entries ai j , i, j = 1, 2
C 2 a 2 ρCav
a= (2.6)
ρCav v 2 a 2
\[
C(f_t, a_t; K, T) = E_{(f_t, a_t)}\left[(f_T - K)^+\right].
\]
Here and throughout this paper we will assume the expectations are taken with respect
to a (given) risk neutral measure.
Heat Kernel: Time homogeneous case
In the time homogeneous case, we may without loss of generality assume the initial
time is 0. It is well known in differential geometry and stochastic analysis that, under
certain technical conditions (see the discussion in Sect. 3), the transition density has
in the time homogeneous case the following expansion as $T \to 0^+$:
\[
p(f_0, a_0; f, a, T) = \frac{e^{-d^2(f_0, a_0; f, a)/2T}}{2\pi T}\,\sqrt{g(f, a)}\;U(f_0, a_0; f, a, T) \tag{2.7}
\]
< A, B > = gi j ai b j
\[
U(x_0; x, T) = \sum_{k=0}^{n} u_k(x_0; x)\,T^k + O(T^{n+1}).
\]
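• $P = e^{\int_\gamma \langle V, \dot\gamma\rangle}$ is the exponential of the work done by the vector field $V$ along the
geodesic $\gamma$, with $V = V^i \partial_i$ and
\[
V^i = b^i - \frac{1}{2\sqrt{g}}\,\frac{\partial}{\partial x^j}\left(\sqrt{g}\,g^{ij}\right). \tag{2.8}
\]
Here the bracket “$\langle\cdot,\cdot\rangle$” with vector entries $V$ and $W$ is defined via the metric
$g_{ij}$ as $g_{ij} v^i w^j$.
• By adding and subtracting first order terms, (2.4) can then be re-expressed in the
form
\[
u_t + \tfrac{1}{2}\Delta_B u + V\cdot\nabla u = 0, \tag{2.9}
\]
where $\Delta_B$ is the second order Laplace Beltrami operator
$\frac{1}{\sqrt{g}}\frac{\partial}{\partial x^i}\left(\sqrt{g}\,g^{ij}\frac{\partial}{\partial x^j}\right)$.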
Note that the zeroth order heat kernel coefficient is known in closed form provided
we have available in closed form both the distance function and the geodesics. The
higher order on diagonal heat kernel coefficients u i (x, x) have been calculated up to
order 4 in very general settings. On the other hand the efficient calculation of the off
\[
C(s_0, a_0; K, T) \sim \frac{1}{2\pi T}\sum_{k=0}^{n}\iint_{\{s \ge K\}} e^{-d^2(x_0, x)/2T}\,G_k(s, a)\,T^k\,ds\,da.
\]
\[
p(x_0, t; x, T) = \frac{e^{-d^2(x_0; x, t)/2\tau}}{2\pi\tau}\,u(x_0, t; x, T)\,\sqrt{g(x, T)} \tag{2.11}
\]
where now the distance function $d$, the Van Vleck-De Witt determinant and the
contravariant drift (2.8) depend explicitly on time because of the explicit dependence
of the metric and the drift on time. The series expansion H in this case reads
\[
u(x_0, t; x, T) = \sum_{k=0}^{n} u_k(x_0, x, t)\,\tau^k + O(\tau^{n+1}).
\]
Moreover, the explicit form of the heat kernel coefficients is given by the following
formula
d( f 0 ,a0 , f,a,t)
V,γ̇ ∂d( f˜(ρ), ã(ρ), f, a)
u 0 (x0 , x, t) = (x0 , x)e γ exp − dρ
0 ∂t
and recursively
d( f 0 ,a0 , f,a,t)
u 0 (x0 , x, t) ρ k−1 ∂Uk−1
u k (x0 , x, t) = k LUk−1 + dρ,
d (x0 , x, t) 0 U0 ∂t
(2.12)
ui
uˆi = (2.13)
u0
where
Ai = gi j A j
Q = g i j (Ai A j − b j Ai − ∂ j Ai )
While the form (2.14) is equivalent to our form (2.9), it makes it necessary to
introduce the potential term Q. The motivations for introducing such a formulation
are undoubtedly related to the arduous task encountered in quantum field theory
of determining the higher order heat kernel coefficients. This covariant approach,
introduced among others by Avramidi in [4] and discussed in his lectures on
mathematical finance [5], introduces heavy machinery (complex line bundles, gauge groups,
etc.), which for instance in the case of time homogeneous parabolic operators leads
to the determination of heat kernel coefficients $u_n$ via the solution of the transport
equations:
1
(1 + σi )u n = P −1 −1/2
(D + Q) 1/2
Pu n−1
n
where
σi = ∇i σ ∇i = ∂i + Ai
σ i = g i j σi
This section is meant to give the reader an intuitive grasp of some of the key concepts.
We do not aspire to completeness but try to indicate original sources where various
issues are dealt with in depth. This section can be skipped by those interested only
in the practical results.
Varadhan’s lemma, in the form he first derived it (see [52, 53]), applies to the
case where the underlying operator is uniformly parabolic, with sufficient regularity
on the coefficients. In its original form as obtained in [52], it relates to the unique
fundamental solution in all of Rn associated to a diffusion
pt − L p = 0
p D (x, y, t)
= Prob. of reaching y at time t
and not reaching the boundary before time t, starting at x
This leads to a general construction of the so-called minimal heat kernel which
applies to non-compact Riemannian manifolds like R2+ . In such a construction, which
is discussed in detail in Hsu [34], the manifold is exhausted by a sequence Dn of
nested compact domains for which the Dirichlet heat kernel pn (diffusion killed on
exiting Dn ), is constructed and the minimal heat kernel corresponds to the pointwise
limit of these as n → ∞. It is intuitively clear that at points on the boundary of the
manifold where the diffusion can reach the boundary, this construction leads to a
fundamental solution that satisfies a Dirichlet condition in the backward variables.
As was shown by Azencott [6], a simple sufficient condition for Varadhan’s lemma
to hold, one that does extend to the case where the coefficients degenerate at the boundary,
is that the Riemannian manifold be complete. To put this geometric concept into
perspective, consider the case where the underlying manifold consists of the first
quadrant, corresponding to non-negative forward price and volatility. In this case,
the manifold together with the natural Riemannian metric associated to the inverse of
the diffusion metric, is complete, if the boundary (and the other points “at infinity”,
if the domain is unbounded) are at an infinite distance from any point in the interior
of the manifold. It can be shown that a Riemannian manifold is complete if it is
metrically complete, i.e., all Cauchy sequences converge to a point in M (and not on
the boundary of M).
The above, with a modern proof, appears as Theorem 5.2.1 in Hsu’s book [34].
Theorem 3.1 Let M be a complete Riemannian manifold and p M (t, x, y) the min-
imal heat kernel on M. Then, uniformly on compact subsets of M × M, we have
(3.1).
When the underlying Riemannian manifold fails to be complete, Hsu [36] refined
the work by Azencott by showing that a sufficient condition under which Varadhan’s
lemma is still guaranteed to hold is
which in our setting amounts to requiring that the distance of x to y does not exceed
the sum of the distance of x to the boundary and the distance of y to the boundary.
Below we illustrate this with a couple of examples:
Consider the well known distance function associated to the family of λ-Sabr
models (7.8) in the ( f, a) plane.
• When β = 1, the boundary a = 0 is at an infinite distance from any point in
the interior ( f, a). This is because the quantity q involved in the definition of the
distance equals
\[
\left|\lim_{\epsilon\to 0}\int_{\epsilon}^{f}\frac{1}{u}\,du\right| = \infty.
\]
Similarly it is easily seen that the f = 0 axis is at an infinite distance from any
point in the interior. Thus in the case β = 1 the Riemannian manifold is complete
and Varadhan’s lemma holds for all points in the manifold.
• When β < 1, since $1/f^{\beta}$ is integrable at zero, the points on the f = 0 axis lie at
a finite distance and those on the a = 0 axis at an infinite distance. Thus here
we need to apply the sufficient condition (3.2) to guarantee the applicability of
Varadhan’s lemma.
• For the family of SV models
\[
df_t = a_t\,dW_{1t} + \mu_f\,dt, \qquad da_t = \gamma\,a_t^{q}\,dW_{2t} + \mu_a\,dt
\]
Also, it is easily seen that when q < 1 the a = 0 axis lies at a finite distance from
interior points (it suffices to note that vertical lines are geodesics) and also in this
case the curvature of the metric blows up at a = 0.
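The dichotomy between β = 1 and β < 1 in the examples above can be seen numerically. The sketch below evaluates $\int_\epsilon^f u^{-\beta}\,du$ in closed form for shrinking ε. Treating this integral as the length of a path toward the f = 0 axis (with the volatility held fixed and constant prefactors dropped) is a simplifying assumption of the illustration, and `horizontal_length` is a hypothetical helper name.

```python
import math


def horizontal_length(f, eps, beta):
    """Integral of u**(-beta) from eps to f, for f > eps > 0."""
    if beta == 1.0:
        return math.log(f / eps)
    return (f ** (1.0 - beta) - eps ** (1.0 - beta)) / (1.0 - beta)


for beta in (0.5, 1.0):
    # Lengths for eps = 1e-2, 1e-4, 1e-8, 1e-16.
    lengths = [horizontal_length(1.0, 10.0 ** (-k), beta) for k in (2, 4, 8, 16)]
    print(beta, lengths)
```

For β = 0.5 the lengths converge to a finite limit, so the axis is reachable at finite distance; for β = 1 they grow like log(1/ε) without bound.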
A final note concerns the relationship between the remarks above and the
interesting recent work by Doust [19] who, in determining call prices in the Sabr model
for 0 < β < 1, takes into account the probability that the forward hits zero. Doust
correctly points out, just as Lewis had done in the case of the CEV model (see [42]
p. 305), that the call price needs to be adjusted to allow for this possibility. There is no
conflict between this result and the modified versions of Varadhan’s lemma. These
simply give sufficient conditions for the distance of points x, y from the boundary
(here “infinity”), so that the effect of paths of the forward which reach the boundary
is exponentially negligible in the small time limit. As Doust points out and as his
numerics show, for longer times these need to be taken into account.
We also would like to remark that when the local volatility component C( f, t)
of the local-stochastic volatility models is such that C(0, t) = 0 (for instance
$C(f, t) = \gamma(t) f^{\beta}$ with 0 < β < 1), the series constructed by means of the heat kernel
vanishes on the boundary in the backward variables. Remarkably, this comes “for
free” (see (8.1) in Sect. 8) and was not part of the heat kernel’s explicit construction.
d f t = σ L ( f t , t)dWt ,
gives rise to the so-called local volatility models, made famous by the work of Bruno
Dupire. Given a two factor local-stochastic volatility model, of the form
Note that the effective drift of f t hasn’t changed when the original drift is of the
special form b(t) f (t). In fact, applying Gyöngy’s result in our case, the effective
drift is given by
\[
\sigma_L^2(F, T) = \frac{C^2(F, T)\int_{\mathbb{R}_+} A^2\,p(f, a, F, A, T)\,dA}{\int_{\mathbb{R}_+} p(f, a, F, A, T)\,dA} \tag{4.5}
\]
2 This expression was coined by Marco Avellaneda, who in 1999, without prior knowledge of
Gyöngy’s or Dupire’s work, independently discovered the technique.
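where
\[
\alpha_0 =: \sqrt{g(F, A, T)}\,u_0 \tag{4.7}
\]
\[
\phantom{\alpha_0} =: \hat{C} \tag{4.8}
\]
\[
\alpha_i = \sqrt{g(F, A, T)}\,u_0\,\hat{u}_i(f, a, F, A, t), \quad i \ge 1 \tag{4.9}
\]
and where we have canceled the common factor $\frac{1}{(2\pi\tau)^{n/2}}$. Letting $\epsilon = \tau$ and $\phi = \frac{d^2}{2}$,
the above may be expressed in the form
\[
\sigma_L^2(F, T \mid f, a) = \frac{C^2(F, T)\int_{\mathbb{R}_+} A^2 \sum_{i=0}^{n}\epsilon^i\alpha_i\,e^{-\phi/\epsilon}\,dA}{\int_{\mathbb{R}_+}\sum_{i=0}^{n}\epsilon^i\alpha_i\,e^{-\phi/\epsilon}\,dA} \tag{4.10}
\]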
Now we apply the Laplace expansion to obtain an expansion of the local volatility.
It is convenient to consider separately the time homogeneous and the more involved
time inhomogeneous cases:
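Proposition 4.1 Suppose $\phi(A)$ has a unique minimum in $(0, +\infty)$, at the point $A_0$.
The asymptotic expansion of
\[
\int_0^{+\infty} f(A)\,e^{-\phi(A)/\epsilon}\,dA, \tag{4.11}
\]
as $\epsilon \to 0$ is given by:
\[
\sqrt{\frac{2\pi\epsilon}{\phi''(A_0)}}\;e^{-\phi(A_0)/\epsilon}
\left[f^{(0)}(A_0) + \epsilon f^{(1)}(A_0) + \epsilon^2 f^{(2)}(A_0)\right]
\]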
where
In using the Laplace expansion in conjunction with the heat kernel form (4.16),
we must deal with a more general (than (4.16)) expression of the form
\[
\int_0^{+\infty}\left[f_0(A) + \epsilon f_1(A) + \epsilon^2 f_2(A)\right]e^{-\phi(A)/\epsilon}\,dA \tag{4.16}
\]
Let us introduce the notation $f_i^{(j)}$, $i, j = 0, 1, 2$, to indicate the jth term in the Laplace
expansion above applied to the function $f_i$. We apply the proposition to numerator and
denominator of the expression defining the local volatility. This yields an expression
that has the following form
This ratio can be expanded in powers of , and yields the following asymptotic
expansion which can be applied to obtain the effective local volatility valid in a wide
3 The expansion up to the first order (4.12) appears in several sources including in textbook form,
as in Bender and Orszag [9], p. 273. On the other hand we have not been able to locate a source for
the second order expansion given here as (4.15).
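\[
\begin{aligned}
&\frac{f_0^{(0)}}{\zeta_0^{(0)}}
+ \frac{\epsilon}{(\zeta_0^{(0)})^2}\left[-f_0^{(0)}\left(\zeta_0^{(1)} + \zeta_1^{(0)}\right) + \zeta_0^{(0)}\left(f_0^{(1)} + f_1^{(0)}\right)\right]\\
&\quad + \frac{\epsilon^2}{(\zeta_0^{(0)})^3}\Big[f_0^{(0)}\left(\zeta_0^{(1)} + \zeta_1^{(0)}\right)^2
- f_0^{(0)}\zeta_0^{(0)}\left(\zeta_0^{(2)} + \zeta_1^{(1)} + \zeta_2^{(0)}\right)\\
&\qquad + \zeta_0^{(0)}\Big(-\left(\zeta_0^{(1)} + \zeta_1^{(0)}\right)\left(f_0^{(1)} + f_1^{(0)}\right)
+ \zeta_0^{(0)}\left(f_0^{(2)} + f_1^{(1)} + f_2^{(0)}\right)\Big)\Big] + o(\epsilon^2), \quad \epsilon \to 0
\end{aligned}
\]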
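As a sanity check of the leading term of the Laplace expansion in Proposition 4.1, one can pick a phase for which the integral is known exactly. The sketch below uses the hypothetical test functions $\phi(A) = A - 1 - \log A$ (unique minimum at $A_0 = 1$, with $\phi(A_0) = 0$ and $\phi''(A_0) = 1$) and $f \equiv 1$. For this choice the integral reduces to a Gamma function and the leading Laplace term is Stirling's approximation, so the relative error should behave like ε/12.

```python
import math


def log_exact_integral(eps):
    """log of int_0^inf exp(-(A - 1 - log A)/eps) dA, via the Gamma function."""
    x = 1.0 / eps
    # int_0^inf A**x * exp(-x*A) dA = Gamma(x + 1) / x**(x + 1); exp(x) comes from phi.
    return x + math.lgamma(x + 1.0) - (x + 1.0) * math.log(x)


def log_laplace_leading(eps):
    """log of sqrt(2*pi*eps/phi''(A0)) * exp(-phi(A0)/eps) with A0 = 1."""
    return 0.5 * math.log(2.0 * math.pi * eps)


def relative_error(eps):
    """Exact integral divided by the leading Laplace term, minus one."""
    return math.exp(log_exact_integral(eps) - log_laplace_leading(eps)) - 1.0


errors = {eps: relative_error(eps) for eps in (0.1, 0.01, 0.001)}
```

The error shrinks linearly in ε, which is the behaviour the first order correction term $\epsilon f^{(1)}(A_0)$ is meant to capture.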
Let us introduce the following notation in relation to the family of stochastic volatility
models in (4.1), in conjunction with the notation introduced in our discussion of the
heat kernel expansion, following Yoshida. Recall from (4.8) the definition of Ĉ
Ĉ(a) = detg(a, f ) ( f 0 , a0 , f, a)P( f 0 , a0 , f, a)
where on the left hand side we suppress the dependence of Ĉ on variables other than
a, since these are not relevant in the Laplace asymptotics.
Proposition 4.2 Assume the distance function in the family of local-stochastic volatility
models has a unique minimum, as a function of the final value of the volatility a,
at $c = a_{\min}$. Let $\phi = \frac{d^2}{2}$. Then the effective local volatility in the family of local
stochastic volatility models (4.1) is given, up to the second order, by
and correspondingly, by taking square roots and expanding, the local volatility is
given by4
where
\[
\sigma_L^{(0)} = \sqrt{V_L^{(0)}}\;(= C(f)\,c) \tag{4.19}
\]
\[
\sigma_L^{(1)} = \frac{V_L^{(1)}}{2\sigma_L^{(0)}} \tag{4.20}
\]
4 In (4.17) we once again suppressed on the right hand side the dependence on variables other than
t, T and f. In the next section, when we combine the asymptotics for local volatility with the above
asymptotics for implied volatility, given local volatility, it will be important that the local volatility
depends on the initial time t and on the final time T in the particular way indicated.
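where
\[
V_L^{(0)}(f, t) = C^2(f)\,c^2 \tag{4.22}
\]
\[
V_L^{(1)}(f, t) = \frac{C^2(f)\left[\left(\hat{C}(c) + 2c\,\hat{C}'(c)\right)\phi''(c) - c\,\hat{C}(c)\,\phi^{(3)}(c)\right]}{\hat{C}(c)\,\phi''(c)^2} \tag{4.23}
\]
\[
V_L^{(2)}(f, t) = \frac{V_L^{(num,2)}(f, t)}{V_L^{(den,2)}(f, t)} \tag{4.24}
\]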
with
1
VL(den,2) ( f, t) =
24Ĉ(c)2 (φ (c))6 φ (c)5
and
(num,2)
VL ( f, t)
2
= C( f )2 12c φ (c) −2Ĉ (c) + cĈ(c)u 1 (c)φ (c) φ (c) Ĉ (c)
+ 2Ĉ(c)u 1 (c)φ (c) − Ĉ (c)φ (3) (c) + Ĉ(c)φ (c) 2φ (c)6 12cφ (c)2 Ĉ (3) (c)
+ 35cĈ (c)φ (3) (c)2 + 6Ĉ (c)φ (c) 3φ (c) − 5cφ (3) (c)
− 15Ĉ (c)φ (c) 2φ (3) (c) + cφ (4) (c) + φ
6
−12Ĉ (c)φ (c)
φ (c) + c2 u 1 (c)φ (c)2 − cφ (3) (c) + Ĉ(c)u 1 (c)φ (c)
−24φ (c)2 − 24c2 u 1 (c)φ (c)3 + 5c2 φ (3) (c)2 − 3cφ (c) −8φ (3) (c)
+ cφ (4) (c) + 2Ĉ (c) −11cφ (3) (c)2 + 6cu 1 (c)φ (c)2 4φ (c) + cφ (3) (c)
+ 3φ (c) 2φ (3) (c) + cφ (4) (c) + Ĉ(c)2 (u 1 (c)
6
φ φ (c)2 24φ (c)2 − 5c2 φ (3) (c)2 3cφ (c) −8φ (3) (c)
+ cφ (4) (c) + φ 48cu 1 (c)φ (c)4 + φ (c) − cφ (3) (c)
6
−5φ (3) (c)2 + 3φ (c)φ (4) (c) + φ (c)6 −35cφ (3) (c)3 + 35φ (c)φ (3) (c)
φ (3) (c) + cφ (4) (c) − 3φ (c)2 5φ (4) (c) + 2cφ (5) (c)
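The zeroth and first order coefficients (4.22) and (4.23) can be checked on a toy example in which the ratio of Laplace-type integrals is known exactly. The choices below are assumptions made purely for illustration: $C(f) = 1$, $\hat{C}(A) = A$ and $\phi(A) = A - 1 - \log A$, so that $c = 1$, $\phi''(c) = 1$ and $\phi^{(3)}(c) = -2$. Both integrals are then Gamma functions, and the exact ratio is $(1 + 3\epsilon)(1 + 2\epsilon)$.

```python
import math


def exact_variance_ratio(eps):
    """Ratio of int A^2 C_hat e^{-phi/eps} dA to int C_hat e^{-phi/eps} dA.

    With C_hat(A) = A and phi(A) = A - 1 - log(A) the numerator and the
    denominator are Gamma(x + 4)/x**(x + 4) and Gamma(x + 2)/x**(x + 2)
    (times a common factor), where x = 1/eps.
    """
    x = 1.0 / eps
    return math.exp(math.lgamma(x + 4.0) - math.lgamma(x + 2.0) - 2.0 * math.log(x))


def first_order_variance(eps):
    """V^(0) + eps * V^(1) from (4.22) and (4.23) for the same toy choices."""
    c = 1.0
    c_hat, d_c_hat = 1.0, 1.0   # C_hat(c) and its first derivative at c
    phi2, phi3 = 1.0, -2.0      # phi''(c) and phi'''(c)
    v0 = c * c
    v1 = ((c_hat + 2.0 * c * d_c_hat) * phi2 - c * c_hat * phi3) / (c_hat * phi2 ** 2)
    return v0 + eps * v1


eps = 0.01
gap = exact_variance_ratio(eps) - first_order_variance(eps)
```

For this example (4.23) gives $V_L^{(1)} = 5$, and the remaining gap is of order $\epsilon^2$, which is the part that the second order coefficient (4.24) accounts for.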
Since the heat kernel expansion for the probability transition density involves
the backward (in financial terms, spot) time in all places except for the factor
$\sqrt{g(F, A, T)}$ in Eq. (4.9), the Laplace expansion technique is essentially unchanged.
As was done in [28], in Sect. 2, we need to develop $\sqrt{g(F, A, T)}$ in a power series
expansion in time around $T = t$, i.e.
\[
\begin{aligned}
\sqrt{g(F, A, T)} &= \sqrt{g(F, A, t)} + \frac{\partial\sqrt{g(F, A, T)}}{\partial T}\Big|_{T=t}\,\tau
+ \frac{1}{2}\frac{\partial^2\sqrt{g(F, A, T)}}{\partial T^2}\Big|_{T=t}\,\tau^2 + o(\tau^2)\\
&=: d_0(A, t) + d_1(A, t)\,\epsilon + d_2(A, t)\,\epsilon^2 + o(\epsilon^2), \quad \epsilon = \tau
\end{aligned}
\tag{4.25}
\]
where $\alpha_i$ was defined in (4.9), the expansion in powers of $\epsilon$ will now follow from
the expansion in powers of $\epsilon$ of
In [28] we obtained an optimal result for the asymptotics of the implied volatility in
a local volatility model of the form5
d f t = σ L ( f t , t)dWt (5.1)
In order to formulate our asymptotic result in the stochastic volatility setting, we will
need to combine those results with the results in the previous section. We begin by
recalling some of the required auxiliary quantities derived in [28].
• One dimensional (signed) distance function
\[
d_1(f, K, t) = \int_K^f \frac{1}{\sigma_L(u, t)}\,du, \quad t \in [0, T]
\]
and
\[
u_1^{(1d)} = \frac{u_0^{(1d)}(f, K, t)}{d_1(K, f, t)}\int_K^f\left[\frac{\sigma_L^2}{2}\left(H^2 + H_f\right) + bH + c
+ \int_K^{\eta} H_t(\zeta, K, t)\,d\zeta\right]\frac{d\eta}{\sigma_L(\eta, t)} \tag{5.3}
\]
where
\[
H(f, K, t) = \frac{\partial}{\partial f}\left[\ln u_0(f, K, t)\right]
= \frac{(\sigma_L)_f(f, t)}{2\sigma_L(f, t)} - \frac{(d_1)_t(K, f, t)}{\sigma_L(f, t)}.
\]
5 Reference [28] explains how to adjust the results to allow for a non zero but constant interest (or
other constant yield) rate.
\[
C(s, K, t, T) = (s - K)^+ + \frac{1}{2}\int_t^T \sigma_L(K, u)^2\,p(s, t, K, u)\,du.
\]
\[
C(s, K, t, T) - (s - K)^+ \sim \frac{1}{2\sqrt{2\pi}}\sum_{i=0}^{k}\int_t^T \sigma_L(K, u)\,e^{-d(K, s, t)^2/2(u - t)}\,(u - t)^{i - \frac{1}{2}}\,du\;u_i(s, K, t).
\]
Letting
\[
U_i(\omega, \tau) = \int_0^{\tau} u^{i - \frac{1}{2}}\,e^{-\omega^2/2u}\,du, \tag{5.4}
\]
the expansion may, in the time inhomogeneous case, be expressed in the compact
form
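Proposition 5.1 The expansion of the call prices in a driftless local volatility model
is given by:
\[
C(s, K, t, T) - (s - K)^+ \sim \frac{1}{2\sqrt{2\pi}}\sum_{i=0}^{k}\left[\sigma_L(K, t)\,U_i(d, \tau) + (\sigma_L)_t(K, t)\,U_{i+1}(d, \tau) + \frac{1}{2}(\sigma_L)_{tt}(K, t)\,U_{i+2}(d, \tau)\right]u_i(s, K, t). \tag{5.5}
\]
Using that
\[
U_0(\omega, \tau) \sim 2\left(\frac{\tau^{3/2}}{\omega^2} - \frac{3\tau^{5/2}}{\omega^4}\right)e^{-\omega^2/2\tau},
\qquad
U_1(\omega, \tau) \sim \frac{2\tau^{5/2}}{\omega^2}\,e^{-\omega^2/2\tau},
\]
we obtain:
Proposition 5.2 The expansion of the call prices in a driftless local volatility model
is given, in the small time limit $\tau \to 0$, up to the order $\tau^{5/2}$ by
\[
\begin{aligned}
C(s, K, t, T) - (s - K)^+
= \frac{1}{\sqrt{2\pi}}\,e^{-d^2/2\tau}\Big[&\frac{1}{d^2}\,\sigma_L(K, t)\,u_0(s, K, t)\,\tau^{3/2}\\
&+ \Big(-\frac{3}{d^4}\,\sigma_L(K, t)\,u_0(s, K, t) + (\sigma_L)_t\,\frac{1}{d^2}\,u_0(s, K, t) + \frac{1}{d^2}\,\sigma_L(K, t)\,u_1(s, K, t)\Big)\tau^{5/2}\Big]
\end{aligned}
\]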
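The two-term asymptotics of $U_0$ used to pass from Proposition 5.1 to Proposition 5.2 can be checked against a closed form. The identity $U_0(\omega, \tau) = 2\sqrt{\tau}\,e^{-\omega^2/2\tau} - \omega\sqrt{2\pi}\,\mathrm{erfc}(\omega/\sqrt{2\tau})$ used below is not stated in the text; it follows by differentiating both sides in τ, and is included here only as a convenient reference value. The function names are hypothetical.

```python
import math


def u0_exact(omega, tau):
    """U_0(omega, tau) = int_0^tau u**(-1/2) * exp(-omega**2 / (2u)) du in closed form."""
    gauss = math.exp(-omega ** 2 / (2.0 * tau))
    return (2.0 * math.sqrt(tau) * gauss
            - omega * math.sqrt(2.0 * math.pi) * math.erfc(omega / math.sqrt(2.0 * tau)))


def u0_asymptotic(omega, tau, n_terms=2):
    """Small-tau expansion 2*(tau**1.5/omega**2 - 3*tau**2.5/omega**4) * exp(-omega**2/(2 tau))."""
    gauss = math.exp(-omega ** 2 / (2.0 * tau))
    value = 2.0 * tau ** 1.5 / omega ** 2
    if n_terms >= 2:
        value -= 6.0 * tau ** 2.5 / omega ** 4
    return value * gauss


omega, tau = 1.0, 0.02
ratio_one_term = u0_asymptotic(omega, tau, n_terms=1) / u0_exact(omega, tau)
ratio_two_terms = u0_asymptotic(omega, tau, n_terms=2) / u0_exact(omega, tau)
```

The two-term ratio is noticeably closer to one than the one-term ratio, consistent with the $\tau^{5/2}$ correction being needed for the second order call price expansion.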
Implied volatility
For the implied volatility the following expansion was obtained in the same paper:
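Proposition 5.3 The implied volatility $\sigma_{BS}$ admits the following asymptotic
expansion, away from the money:
where
\[
\sigma_{BS,0} = \frac{\xi}{d_1(K, f, t)} = \frac{\ln\frac{f}{K}}{\int_K^f \frac{d\eta}{\sigma_L(\eta, t)}}. \tag{5.6}
\]
\[
\sigma_{BS,1} = \frac{\sigma_{BS,0}^3}{\xi^2}\,\ln\left(\frac{u_0^{(1d)}(f, K, t)\,\sigma_L(K, t)}{\sigma_{BS,0}\,\sqrt{fK}}\right)
= \frac{\xi}{d_1(K, f, t)^3}\,\ln\left(\frac{\sigma_L(K, t)\,u_0^{(1d)}(f, K, t)\,d(K, f, t)}{\xi\,\sqrt{fK}}\right). \tag{5.7}
\]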
3σBS,1 σBS,0
2 2
3σBS,1 σBS,0
3
σBS,2 = − + +
ξ2 2σBS,0 ξ2
2
3σBS,0 σBS,0
2
(σ L )t (K , t) 3 u (1d)
1 ( f, K , t)
+ + − + (1d)
ξ 2 8 σ L (K , t) d (K , f, t) u ( f, K , t)
2
0
2
(1d)
3σBS,1 3σBS,1 ξ 3 ξ (σ L )t (K , t) u ( f, K , t)
=− + + 5+ 3 + (1d)
1
.
d2 2σBS,0 8d d σ L (K , t) u ( f, K , t) 0
(5.8)
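The zeroth order term (5.6) is simple enough to evaluate directly. The sketch below does so for two hypothetical local volatility functions with closed-form distance integrals: a lognormal one, $\sigma_L(u) = \sigma u$, for which (5.6) must return σ itself, and a CEV-type one, $\sigma_L(u) = \gamma u^{\beta}$. The function names and parameter values are illustrative assumptions.

```python
import math


def sigma_bs0(f, strike, distance):
    """Zeroth order implied volatility (5.6): log(f/K) divided by the signed distance."""
    if f == strike:
        raise ValueError("at the money the ratio must be replaced by its limit")
    return math.log(f / strike) / distance(f, strike)


def lognormal_distance(sigma):
    """Distance integral for the hypothetical local volatility sigma * u."""
    return lambda f, strike: math.log(f / strike) / sigma


def cev_distance(gamma, beta):
    """Distance integral for the hypothetical local volatility gamma * u**beta, beta != 1."""
    def distance(f, strike):
        return (f ** (1.0 - beta) - strike ** (1.0 - beta)) / (gamma * (1.0 - beta))
    return distance


flat = sigma_bs0(110.0, 100.0, lognormal_distance(0.25))
low_strike = sigma_bs0(100.0, 80.0, cev_distance(2.0, 0.5))
high_strike = sigma_bs0(100.0, 120.0, cev_distance(2.0, 0.5))
```

In the CEV-type case with β below one, the value at the low strike exceeds the value at the high strike, the familiar downward skew, and close to the money the result approaches $\gamma f^{\beta - 1}$.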
Given the expansion (8.4) for the local volatility in powers of $\epsilon = \tau$, we can use
the 1D expansion above to determine the contributions of the higher order terms to
the implied volatility. Note that the above expansion of the Black-Scholes implied
volatility in powers of τ has an underlying functional dependence on the underlying
local volatility, $\sigma_L \sim \sigma_L^{(0)} + \epsilon\,\sigma_L^{(1)} + \epsilon^2\sigma_L^{(2)}$, obtained in Sect. 3. In order to
emphasize this dependence, below we write $\sigma_{BS,i}[\sigma_L]$. In order to combine the local
volatility asymptotics (4.2) with the implied volatility asymptotics, we must plug
the former into the latter and again expand in powers of $\epsilon = \tau$. Now an additional
subtlety arises, and taking it into account properly is crucial to the correct derivation
of all subsequent formulas. The subtlety is that the local volatility expansion obtained
in the previous section introduces an additional dependence on the backward variable
$f_0$, which (alongside $a_0$) serves as an initial condition when using the call
decomposition formula (4.10). Thus $f_0$ enters as a parameter and the “active” variable
in $\sigma_L(f_0, a_0, f, t)$ is $f$. Every time a differentiation or integration with respect to a
spatial argument needs to be carried out, we need to use $f$ as the active variable.
This may seem a little surprising because in deriving the 2-D heat kernel we are
freezing the forward variable $f$ (or $K$), whereas now we are freezing the backward
variable $f_0$. This state of affairs is easily clarified when we remember that underlying
Dupire’s derivation of the local volatility function is a differentiation with respect to
the forward variable. So, in a sense we are mixing a backward and a forward
representation. Although this may seem odd, it does carry some distinct advantages,
especially in the time-inhomogeneous case, since all quantities of interest there are
evaluated at the spot time t, which is mostly frozen throughout.
As an example, let $d_1(f, f_0, t)$ denote the 1-D distance function introduced
in the beginning of the preceding section, and consider the distance corresponding to the
expansion for the local volatility $\sigma_L$ obtained in (8.4):
\[
d(f_0, f, t, T) = \int_f^{f_0}\frac{1}{\sigma_L(f_0, a_0, u, t, T)}\,du
\]
So, using that by definition $\frac{\partial}{\partial T}\big|_{T=t} = \frac{\partial}{\partial\epsilon}\big|_{\epsilon=0}$, we obtain
\[
\frac{d}{d\epsilon}\,d(f, f_0, t, \epsilon)\Big|_{\epsilon=0} = -\int_f^{f_0}\frac{\sigma_L^{(1)}(f_0, a_0, u, t)}{\left(\sigma_L^{(0)}(f_0, a_0, u, t)\right)^2}\,du \tag{5.9}
\]
Proposition 5.4 The small time to maturity expansion of the call prices in the family
(2.1)–(2.3) is given by
\[
\begin{aligned}
C(s, K, t, T) - (s - K)^+
= \frac{1}{\sqrt{2\pi}}\,e^{-d^2/2\tau}\Big[&\frac{1}{d^2}\,\sigma_L^{(0)}(K, t)\,u_{0,1}(s, K, t)\,\tau^{3/2}\\
&+ \Big(-\frac{3}{d^4}\,\sigma_L^{(0)}(K, t)\,u_{0,1}(s, K, t) + \sigma_L^{(1)}\,\frac{1}{d^2}\,u_{0,1}(s, K, t) + \frac{1}{d^2}\,\sigma_L^{(0)}(K, t)\,u_{1,1}(s, K, t)\Big)\tau^{5/2}\Big]
\end{aligned}
\]
where $\sigma_L^{(0)}$ is given by (4.19), $\sigma_L^{(1)}$ is given by (4.20), $u_{0,1}$ is
given by (5.14) and $u_{1,1}$ is given by (5.16).
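\[
\sigma^{SV}_{BS}[\sigma_L](f_0, a_0, K, t) \equiv \sigma^{SV}_{BS,0}[\sigma_L]
+ \sigma^{SV}_{BS,1}[\sigma_L](f_0, a_0, K, t)\,\tau + \sigma^{SV}_{BS,2}[\sigma_L](f_0, K, T)\,\tau^2 + o(\tau^2), \quad \tau = T - t \to 0
\]
where
\[
\sigma^{SV}_{BS,0}[\sigma_L](f_0, a_0, K, t) = \frac{\xi}{\int_K^{f_0}\frac{du}{\sigma_L^{(0)}(f_0, a_0, u, t)}} \tag{5.10}
\]
\[
\sigma^{SV}_{BS,1}[\sigma_L](f_0, a_0, K, t) \tag{5.11}
\]
\[
= \frac{\xi\,\ln\left(\dfrac{\sigma_L^{(0)}(f_0, a_0, K, t)\,u_{0,1}(f_0, K, t)\,d^{(0)}(f_0, a_0, K, t)}{\xi\,\sqrt{f_0 K}}\right)}{d^{(0)}(f_0, K, t)^3}. \tag{5.12}
\]
\[
d^{(0)}(f_0, K, t) = \int_K^{f_0}\frac{du}{\sigma_L^{(0)}(f_0, a_0, u, t)} \tag{5.13}
\]
\[
u_{0,1}(f_0, K, t) = \sqrt{\frac{\sigma_L^{(0)}(f_0, a_0, f_0, t)}{\sigma_L^{(0)}(f_0, a_0, K, t)}}\;
\exp\left(\int_K^{f_0}\frac{1}{\sigma_L^{(0)}(f_0, a_0, \eta, t)}\int_K^{\eta}\frac{\sigma_L^{(1)}(f_0, a_0, u, t)}{\left(\sigma_L^{(0)}(f_0, a_0, u, t)\right)^2}\,du\,d\eta\right). \tag{5.14}
\]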
Also
σBS,2
SV
( f 0 , a0 , K , t)
SV SV )2
(1)
3σBS,1 3(σBS,1 ξ3 ξ (σ L u 1,1
=− + + + (0) 3 +
(d (0) )2 SV
2σBS,0 8(d (0) )5 (d ) σL
(0) u 0,1
(5.15)
where, recalling (5.3), and, for brevity suppressing the dependence on the initial
variables ( f 0 , a0 ) we have
(0)
u 0,1 f0 σ L (η, t)
u 1,1 = (0)
d K d (0)
η
dη
× H(2) + (H f )(2) +
2
(Ht )(2) (ζ, K , t)dζ (0)
, (5.16)
K σ L (η, t)
where
(1)
f 0 σ L (u,t)
σL
(1) K (σ (0) (u,t)))2 du
H(2) = (0)
− L
(0)
,
2(σ L )2 σL
(H f )(2)
(1)
f0 σL
(1) K (σ (0) )2 du σ L(1) (0) (0)
σL (σ L )2K (σ ) K K
= (0)
− L
(0)
− + L (0) ,
(σ L )3 (σ L )2 2(σ L )(0) )2 2σ L
(Ht )(2)
(1) (0) (2)
1 f −2(σ L )2 + σ L σ L (0)
= 2 du σ L
2(σ L(0) )2 K (σ L(0) )3
f (0,1)
σL (1) (1) (0) (0) (1)
+2 (0)
du σ L − σ L (σ L ) K + σ L (σ L ) K
K (σ L )2
expression only $\sigma_L^{(0)}$ and $\sigma_L^{(1)}$ and their derivatives are involved. Recall that the
determination of the first of these required only the heat kernel coefficient $u_0$ in the
two-dimensional heat kernel expansion, while the second of these required only the heat kernel
coefficient $u_1$. The lengthy expression for $\sigma_L^{(2)}$, i.e. the coefficient of $\tau^2$ in (8.4),
expressed in terms of the lengthy expression for $V_2$, is required above only for the
first term under the integral of $(H_t)_{(2)}$. Thus, we may propose as a very reasonable
approximation for the implied volatility expansion at the second order to use (5.15),
with all terms the same but using instead the modified $H_2^m$ defined by
1 f −2(σ L(1) )2 +
2 du σ L(0)
2(σ L(0) )2 K (σ L(0) )3
f (0,1)
σL (1) (1) (0) (0) (1)
+2 (0)
du σ L − σ L (σ L ) K + σ L (σ L ) K
K (σ L )2
\[
\begin{aligned}
df_t &= C(f, t)\,a_t\,dW_{1,t}\\
da_t &= \kappa(t)(\bar\lambda - a_t)\,dt + a_t\,\nu(t)\,dW_{2,t}
\end{aligned}
\]
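with inverse
\[
g_{ij} = \begin{pmatrix}
\dfrac{1}{C^2 a^2 (1 - \rho^2)} & -\dfrac{\rho}{a^2 \nu C (1 - \rho^2)}\\[2ex]
-\dfrac{\rho}{a^2 \nu C (1 - \rho^2)} & \dfrac{1}{\nu^2 (1 - \rho^2) a^2}
\end{pmatrix},
\qquad
g^{-1} = -a^4 C^2 \nu^2 (-1 + \rho^2); \tag{6.1}
\]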
We now use the definition of the Laplace-Beltrami operator to rewrite the original
PDE in the form
\[
u_t + \tfrac{1}{2}\Delta_B u + V^1 u_f + V^2 u_a = 0
\]
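where a straightforward calculation using the explicit form of the metric coefficients
yields that the “metric” part of the drift is given by
\[
\begin{aligned}
V_m^1 &= \frac{1}{2}\,a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}\left[\partial_f\left(\frac{g^{11}}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right) + \partial_a\left(\frac{g^{12}}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right)\right]\\
&= \frac{1}{2}\,a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}\;\frac{\beta\gamma f^{\beta - 1}}{\nu\sqrt{1 - \rho^2}} = \frac{1}{2}\,\gamma^2\beta a^2 f^{2\beta - 1}
\end{aligned}
\]
\[
\begin{aligned}
V^2 &= \frac{1}{2}\,a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}\left[\partial_f\left(\frac{g^{12}}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right) + \partial_a\left(\frac{g^{22}}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right)\right]\\
&= \frac{1}{2}\,a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}\left[\partial_f\left(\frac{\gamma\nu\rho f^{\beta} a^2}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right) + \partial_a\left(\frac{\nu^2 a^2}{a^2\nu\gamma f^{\beta}\sqrt{1 - \rho^2}}\right)\right]\\
&= 0
\end{aligned}
\]
\[
V_m = \frac{1}{2}\,\gamma^2\beta a^2 f^{2\beta - 1}\,\partial_f
\]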
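The corresponding covector field is obtained by lowering the indices.
Hence
\[
V_{1m} = \frac{1}{2}\,\gamma^2\beta a^2 f^{2\beta - 1}\cdot\frac{1}{\gamma^2 f^{2\beta}(1 - \rho^2)a^2}
= \frac{1}{2}\,\frac{\beta}{f(1 - \rho^2)}
\]
\[
V_{2m} = -\frac{\rho}{\gamma\nu(1 - \rho^2)f^{\beta}a^2}\cdot\frac{1}{2}\,\gamma^2\beta a^2 f^{2\beta - 1}
= -\frac{\beta\gamma\rho}{2\nu(1 - \rho^2)}\,f^{\beta - 1}
\]
\[
\frac{\beta}{2f(1 - \rho^2)}\,df - \frac{\beta\gamma\rho}{2(1 - \rho^2)}\,f^{\beta - 1}\,\frac{da}{\nu} \tag{6.2}
\]
\[
\frac{1}{2(1 - \rho^2)}\,\frac{\partial C/\partial f}{C}\,df - \frac{\rho\,\partial C/\partial f}{2\nu(1 - \rho^2)}\,da, \tag{6.3}
\]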
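The metric part of the drift and its lowered components can be cross-checked numerically. The sketch below builds the diffusion matrix for $C(f) = \gamma f^{\beta}$ with time-frozen parameters, differentiates $\sqrt{g}\,g^{ij}$ by central differences, and lowers the index with the inverse matrix. It follows the sign convention of the computation above; the function names, the finite-difference step and the sample parameter values are assumptions of the illustration.

```python
import math


def upper_metric(f, a, gamma, beta, nu, rho):
    """Diffusion matrix g^{ij} with C(f) = gamma * f**beta."""
    c = gamma * f ** beta
    return [[c * c * a * a, rho * c * nu * a * a],
            [rho * c * nu * a * a, nu * nu * a * a]]


def lower_metric(f, a, gamma, beta, nu, rho):
    """Covariant metric g_{ij}: the 2x2 inverse of upper_metric."""
    g = upper_metric(f, a, gamma, beta, nu, rho)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]


def metric_drift(f, a, gamma, beta, nu, rho, h=1e-5):
    """(1/2)(1/sqrt g) d_j(sqrt g g^{ij}) by central differences; returns [V^1, V^2]."""
    def weighted(ff, aa):
        g = upper_metric(ff, aa, gamma, beta, nu, rho)
        sqrt_g = 1.0 / math.sqrt(g[0][0] * g[1][1] - g[0][1] * g[1][0])
        return [[sqrt_g * g[i][j] for j in range(2)] for i in range(2)], sqrt_g

    _, sqrt_g = weighted(f, a)
    wf_plus, _ = weighted(f + h, a)
    wf_minus, _ = weighted(f - h, a)
    wa_plus, _ = weighted(f, a + h)
    wa_minus, _ = weighted(f, a - h)
    drift = []
    for i in range(2):
        d_f = (wf_plus[i][0] - wf_minus[i][0]) / (2.0 * h)   # derivative in f of sqrt(g) g^{i1}
        d_a = (wa_plus[i][1] - wa_minus[i][1]) / (2.0 * h)   # derivative in a of sqrt(g) g^{i2}
        drift.append(0.5 * (d_f + d_a) / sqrt_g)
    return drift


params = dict(f=1.3, a=0.4, gamma=0.9, beta=0.6, nu=0.5, rho=-0.3)
v_up = metric_drift(**params)
g_low = lower_metric(**params)
v_low = [g_low[i][0] * v_up[0] + g_low[i][1] * v_up[1] for i in range(2)]
```

The entries of `v_up` can be compared with $\frac{1}{2}\gamma^2\beta a^2 f^{2\beta-1}$ and zero, and those of `v_low` with the two coefficients in (6.2).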
and, in the new variables defined by (6.6) and (6.7) below, this term may be expressed
in the form
∂C
∂f
d x,
2 1 − ρ2
1
C( f 0 , t)) 2(1−ρ 2 )
C(K , t)
β
F0 2(1−ρ 2 )
(6.4)
K
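To this metric part, we need to add the drift coming from the mean reverting
volatility. This vector has only the $\partial_a$ component
\[
V^{meanrev} = \kappa(\bar\lambda - a)\,\partial_a
\]
\[
\begin{aligned}
V_{meanrev} &= g_{12}\,\kappa(\bar\lambda - a)\,df + g_{22}\,\kappa(\bar\lambda - a)\,da\\
&= \frac{\rho\,\kappa(t)\left(-a + \bar\lambda(t)\right)}{a^2\nu\left(-1 + \rho^2\right)C(f, t)}\,df
- \frac{\kappa\left(-a + \bar\lambda(t)\right)}{a^2\nu^2\left(-1 + \rho^2\right)}\,da
\end{aligned}
\tag{6.5}
\]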
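Define
\[
q = \frac{f^{1 - \beta}}{\gamma(t)(1 - \beta)}
\]
\[
x = \frac{\nu q - \rho a}{\nu\sqrt{1 - \rho^2}} \tag{6.6}
\]
\[
y = \frac{a}{\nu} \tag{6.7}
\]
\[
df = \gamma\sqrt{1 - \rho^2}\,C\,dx + \gamma\rho\,C\,dy \tag{6.8}
\]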
This change of variables is defined in such a way that the principal part of the
partial differential equation, expressed in the new coordinates, is the standard Sabr
model corresponding to the (rescaled) hyperbolic plane, i.e., of the form
\[
\tfrac{1}{2}\,\nu^2 y^2\left(u_{xx} + u_{yy}\right)
\]
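That the change of variables (6.6) and (6.7) diagonalizes the principal part can be verified with a few lines of linear algebra: under a change of coordinates with Jacobian $J$, the diffusion matrix $G$ becomes $J G J^{T}$, and this should equal $\nu^2 y^2$ times the identity, i.e. $a^2$ times the identity. The sketch assumes $C(f) = \gamma f^{\beta}$ and parameters frozen in time; the function names and sample values are hypothetical.

```python
import math


def jacobian(f, a, gamma, beta, nu, rho):
    """Jacobian of (f, a) -> (x, y), with q = f**(1-beta)/(gamma*(1-beta)),
    x = (nu*q - rho*a)/(nu*sqrt(1-rho**2)) and y = a/nu, at a frozen time."""
    s = math.sqrt(1.0 - rho * rho)
    dq_df = f ** (-beta) / gamma
    return [[dq_df / s, -rho / (nu * s)],
            [0.0, 1.0 / nu]]


def transformed_diffusion(f, a, gamma, beta, nu, rho):
    """J * G * J^T, where G is the diffusion matrix of (f, a) with C = gamma * f**beta."""
    c = gamma * f ** beta
    g = [[c * c * a * a, rho * c * nu * a * a],
         [rho * c * nu * a * a, nu * nu * a * a]]
    j = jacobian(f, a, gamma, beta, nu, rho)
    jg = [[sum(j[i][k] * g[k][l] for k in range(2)) for l in range(2)] for i in range(2)]
    return [[sum(jg[i][k] * j[l][k] for k in range(2)) for l in range(2)] for i in range(2)]


m = transformed_diffusion(f=1.3, a=0.4, gamma=0.9, beta=0.6, nu=0.5, rho=-0.3)
```

Only the second order part is checked here; the first order terms generated by the change of variables are what the discussion of the drift addresses.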
After writing the backward partial differential equation in the form $\frac{1}{2}\Delta_B + V$, each
of the terms is covariant. This means that in order to determine the contravariant
drift in the new coordinates, it suffices to transform the contravariant drift in the old
coordinates by the recipe for the change of a contravariant vector under changes of
the independent variables. However, since the changes of variables we are making,
(6.6) and (6.7), depend explicitly on time, it is clear from the expression of the new
dependent variable in terms of the old, i.e. u( f, a, t) = v(x( f, a, t), y( f, a, t), t),
that in the drift term for the new PDE there is an extra term coming from the time
derivatives of the independent variables. A simple calculation shows that the metric
part of the drift (6.2) can be expressed in the compact form
\[
\frac{\beta\gamma f^{\beta - 1}}{2\sqrt{1 - \rho^2}}\,dx \tag{6.9}
\]
Recalling the transformation (6.6) and (6.7), the time derivatives are
1 f 1−β ρ 1 ρ
xt = ( ) − y−( ) νy
γ (1 − β) 1 − ρ 2 (1 − ρ 2 )3/2 ν 1 − ρ2
γ y ρ ν ρ
= − (x + ρ )− y + ( ) y (6.10)
γ 1 − ρ2 (1 − ρ )
2 3/2 ν 1 − ρ2
\[ y_t = \left(\frac{1}{\nu}\right)_t \nu y \tag{6.11} \]
Note that from the transformations (6.6) and (6.7) we obtain the relation
\[ dx = \frac{df}{\sqrt{1-\rho^2}\,\gamma f^\beta} - \rho\,\frac{dy}{\sqrt{1-\rho^2}} \tag{6.12} \]
For the non-metric part of the drift arising from the mean reversion we obtain
from (6.12) and from (6.5):
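\[
-\frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^2}\,\nu^3 y^2}\,dx
+ \left(\frac{-\rho^2\kappa(\bar\lambda - \nu y)}{(1-\rho^2)\,\nu^3 y^2} + \frac{\kappa(\bar\lambda - \nu y)}{(1-\rho^2)\,\nu^3 y^2}\right)dy
\tag{6.13}
\]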
\[ v_t + \frac{1}{2}\nu^2 y^2 (v_{xx} + v_{yy}) + \tilde V^x v_x + \tilde V^y v_y = 0, \tag{6.14} \]
where
β γ f β−1 ρκ(λ̄ − νy) 1
Ṽx d x = − + 2 xt d x
2 1 − ρ2 1 − ρ 2 ν 3 y2 y
Since the first term contains an exact part, it is better to express the first expression
in the equation above in the form
\[ \frac{1}{2(1-\rho^2)}\,d\log f^\beta - \frac{\beta\gamma\rho}{2(1-\rho^2)}\, f^{\beta-1}\,dy \]
As mentioned earlier, the first part is trivially integrated (see (6.4)), and the second
part is absorbed into $\tilde V_y\,dy$, so we have the full drift written as
\[ \bar{\tilde V}_x\,dx + \bar{\tilde V}_y\,dy + \frac{1}{2(1-\rho^2)}\,d\log f^\beta, \tag{6.15} \]
where
˜ ρκ(λ̄ − νy) 1
V̄x d x = − + 2 xt d x
1 − ρ 2 ν 3 y2 y
β γρ 1 κ(λ̄ − νy)
V̄˜y dy = − f β−1
dy + y t + dy, (6.16)
2(1 − ρ 2 ) y2 ν 3 y2
β γρ 1 1 κ(λ̄ − νy)
= − + 2 yt + dy
2(1 − ρ 2 ) (1 − β) γ 1 − ρ 2 x + ρ(1 − β) γ y y ν 3 y2
(6.17)
\[ f^{\beta-1} = \frac{1}{(1-\beta)\,\gamma\sqrt{1-\rho^2}\,x + \rho(1-\beta)\,\gamma\,y} \]
Recall that the difference between V̄˜ and Ṽ is that in V̄˜ we removed the exact part
of the drift vector. For completeness we note that, alternatively, including the exact
part explicitly, we have the relations:
β γ f β−1 ρκ(λ̄ − νy) 1
Ṽx = − + 2 xt d x
2 1 − ρ2 1 − ρ 2 ν 3 y2 y
β ρκ(λ̄ − νy) 1
= − + 2 xt d x
2 1 − ρ 2 (1 − β)( 1 − ρ 2 x + ρy) 1 − ρ 2 ν 3 y2 y
(6.18)
1 κ(λ̄ − νy)
Ṽy = yt + dy, (6.19)
y2 ν 3 y2
βν 2 y 2 ν 2 ρκ(λ̄ − νy)
Ṽ x = − + ν 2 xt
2 1 − ρ 2 (1 − β)( 1 − ρ 2 x + ρy) 1 − ρ2ν3
κ(λ̄ − νy)
Ṽ y = ν 2 yt + (6.20)
ν
The changes of variables made in the last section transform the original
operator into the operator
\[ \partial_t + \frac{1}{2}\nu^2 y^2(\partial_x^2 + \partial_y^2) + \tilde V^x\partial_x + \tilde V^y\partial_y \]
The factor ν 2 in front of the second order part means we are not yet in the standard
Poincaré plane where ν = 1. To adjust for this we further make the change of time
t
t= (6.21)
ν2
\[ \partial_t + \frac{1}{2} y^2(\partial_x^2 + \partial_y^2) + \frac{\tilde V^x}{\nu^2}\,\partial_x + \frac{\tilde V^y}{\nu^2}\,\partial_y \]
As is easily seen the covariant form of the drift remains the same (due to offsetting
effects), but the contravariant form changes by a factor of ν 2 .
The geodesic passing through the points (x1 , y1 ) and (x2 , y2 ) is known to be the
semi-circle with origin at (x0 , 0) where
x = x0 + R cos θ, 0 ≤ θ ≤ π
y = R sin θ, 0 ≤ θ ≤ π (6.23)
Notice that if we denote by (x, y) the running point on the geodesic, then the polar
angle corresponding to this point, issuing from a fixed point (x2 , y2 ) is given by
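\[
\theta(x, y) =
\begin{cases}
\arctan\!\left(\dfrac{y}{x - \frac{x_2^2 - x^2 + y_2^2 - y^2}{2(x_2 - x)}}\right), & \arctan\!\left(\dfrac{y}{x - \frac{x_2^2 - x^2 + y_2^2 - y^2}{2(x_2 - x)}}\right) > 0 \\[2ex]
\arctan\!\left(\dfrac{y}{x - \frac{x_2^2 - x^2 + y_2^2 - y^2}{2(x_2 - x)}}\right) + \pi, & \arctan\!\left(\dfrac{y}{x - \frac{x_2^2 - x^2 + y_2^2 - y^2}{2(x_2 - x)}}\right) < 0
\end{cases}
\tag{6.24}
\]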
where $\gamma((x_1, y_1), (x_2, y_2))$ is the geodesic joining the points $(x_1, y_1)$ and $(x_2, y_2)$ in
the standard hyperbolic plane, which is the image of the original $(f, a)$ plane under
the coordinate transformation (6.6) and (6.7), call it .
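The geodesic parametrization (6.23) and the polar angle (6.24) can be illustrated with a short sketch. It assumes the two points have different abscissae (so the geodesic is a semi-circle and not a vertical line) and that the running point is not the apex of the semi-circle; the helper names `geodesic_circle` and `polar_angle` are hypothetical.

```python
import math

def geodesic_circle(p1, p2):
    """Centre (x0, 0) and radius R of the semi-circle through p1 and p2 (x1 != x2)."""
    (x1, y1), (x2, y2) = p1, p2
    x0 = (x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2) / (2.0 * (x2 - x1))
    R = math.hypot(x1 - x0, y1)
    return x0, R

def polar_angle(p, x0):
    """Polar angle in (0, pi) of p on the semi-circle centred at (x0, 0), as in (6.24)."""
    x, y = p
    t = math.atan(y / (x - x0))  # undefined at the apex x == x0
    return t if t > 0 else t + math.pi

p1, p2 = (0.0, 1.0), (2.0, 1.0)
x0, R = geodesic_circle(p1, p2)
theta1 = polar_angle(p1, x0)
theta2 = polar_angle(p2, x0)

# Reconstruct p1 from its angle using x = x0 + R cos(theta), y = R sin(theta)
p1_rebuilt = (x0 + R * math.cos(theta1), R * math.sin(theta1))
```

For these two points the centre is at x0 = 1 with radius √2, and the angles are 3π/4 and π/4 respectively.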
Let us first carry this out for the so-called “metric part” of the drift, i.e. the part
corresponding to κ = 0 and no time dependence in the coefficients.
Then, using (6.18), for the metric part of the drift we have that
Ṽ m dy
γ(x,y)
θ2
β γρ cos θ
=− dθ
2(1 − ρ 2 ) 1 − ρ (x0 + R cos θ ) + Rρ sin θ
2
θ1
cos θ
1 − ρ 2 (x
+ R cos θ ) + ρ R sin θ
0
1 − ρ2ρ
= + d log 1 − ρ 2 (x0 + R cos θ ) + ρ R sin θ
R R
x0 1 − ρ 2 1
−
R 1 − ρ 2 (x0 + R cos θ ) + ρ R sin θ
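\[ \int_{\theta_1}^{\theta_2} \frac{1}{\sqrt{1-\rho^2}\,x_0 + R\cos\!\left(\theta - \arctan\!\left(\frac{\rho}{\sqrt{1-\rho^2}}\right)\right)}\,d\theta \]
Letting
\[ a_1 = \sqrt{1-\rho^2}\,x_0, \qquad \hat\gamma = \arctan\!\left(\frac{\rho}{\sqrt{1-\rho^2}}\right), \]
\[ \underbrace{\int_{\theta_1}^{\theta_2} \frac{1}{a_1 + R\cos(\theta - \hat\gamma)}\,d(\theta - \hat\gamma)}_{I_1} \tag{6.26} \]
\[ = \int_{\theta_1 - \hat\gamma}^{\theta_2 - \hat\gamma} \frac{1}{a_1 + R\cos\theta}\,d\theta \tag{6.27} \]
\[
=
\begin{cases}
\dfrac{2}{\sqrt{a_1^2 - R^2}}\,\arctan\!\left(\sqrt{\dfrac{a_1 - R}{a_1 + R}}\,\tan\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1^2 > R^2 \\[3ex]
\dfrac{2}{\sqrt{R^2 - a_1^2}}\,\operatorname{arctanh}\!\left(\sqrt{\dfrac{R - a_1}{R + a_1}}\,\tan\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1^2 < R^2 \\[3ex]
\dfrac{1}{R}\,\tan\!\left(\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1 = R \\[3ex]
\dfrac{1}{R}\,\cot\!\left(\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1 = -R
\end{cases}
\tag{6.28}
\]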
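As a sanity check on the closed form of the integral I1 in (6.28), the following sketch compares two of the cases against a composite Simpson quadrature. It is restricted to the assumptions a1 > R > 0 (first case) and a1 = R (third case), with integration limits chosen inside (−π, π) so that tan(θ/2) stays continuous; all numerical values are illustrative.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def i1_closed_form(a1, R, lo, hi):
    """Integral of 1/(a1 + R cos(theta)) over [lo, hi], assuming a1 > R > 0."""
    root = math.sqrt(a1 ** 2 - R ** 2)
    k = math.sqrt((a1 - R) / (a1 + R))
    antiderivative = lambda th: 2.0 / root * math.atan(k * math.tan(th / 2.0))
    return antiderivative(hi) - antiderivative(lo)

a1, R, lo, hi = 2.0, 1.0, 0.2, 2.5
closed = i1_closed_form(a1, R, lo, hi)
numeric = simpson(lambda th: 1.0 / (a1 + R * math.cos(th)), lo, hi)

# Degenerate case a1 == R: the antiderivative is tan(theta/2) / R
closed_deg = (math.tan(hi / 2.0) - math.tan(lo / 2.0)) / R
numeric_deg = simpson(lambda th: 1.0 / (R + R * math.cos(th)), lo, hi)
```

In both cases the closed form and the quadrature agree to well within the quadrature error.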
So that all told, taking into account (6.4) we obtain the following expression for
the line integral corresponding to the metric part:
log(P ) M E
β γρ 1 − ρ2 ρ
=− (θ2 − θ1 ) + log 1 − ρ 2 (x0 + R cos θ) + ρ R sin θ |θθ21 (6.29)
2(1 − ρ )
2 R R
x0 1 − ρ 2
β f0
− I1 + log( ) (6.30)
R 2 1−ρ 2 K
λSabr part
Next we deal with the part specific to λ-Sabr, which we have found above to be
Inserting this into the line integral that must be evaluated, we find
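\[
\frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^2}\,\nu^3}\int_{\theta_1}^{\theta_2}\csc\theta\,d\theta
- \frac{\rho\kappa}{\sqrt{1-\rho^2}\,\nu^2}\int_{\theta_1}^{\theta_2} 1\,d\theta
+ \frac{\kappa\bar\lambda}{\nu^3 R}\int_{\theta_1}^{\theta_2}\frac{\cos\theta}{\sin^2\theta}\,d\theta
- \frac{\kappa}{\nu^2}\int_{\theta_1}^{\theta_2}\frac{\cos\theta}{\sin\theta}\,d\theta
\]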
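Now
\[ \int \csc\theta\,d\theta = \log(\tan(\theta/2)) + C \]
\[ \int \frac{\cos\theta}{\sin^2\theta}\,d\theta = -\frac{1}{\sin\theta} + C \]
\[ \int \cot\theta\,d\theta = \log(|\sin\theta|) + C \]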
Therefore, all told we obtain for the line integral corresponding to the mean reverting
part:
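\[ \frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^2}\,\nu^3}\,\log\left|\frac{\tan(\theta_2/2)}{\tan(\theta_1/2)}\right| - \frac{\rho\kappa}{\sqrt{1-\rho^2}\,\nu^2}\,(\theta_2 - \theta_1) \tag{6.32} \]
\[ + \frac{\kappa\bar\lambda}{\nu^3 R}\left(\frac{1}{\sin\theta_1} - \frac{1}{\sin\theta_2}\right) - \frac{\kappa}{\nu^2}\,\log\!\left(\frac{|\sin\theta_2|}{|\sin\theta_1|}\right) \tag{6.33} \]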
Since θ varies between 0 and π and since in this range both sin θ and tan θ/2 are
non-negative, we can remove the absolute value signs above and arrive at
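\[
\log(P)_{MR}
= \frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^2}\,\nu^3}\,\log\frac{\tan(\theta_2/2)}{\tan(\theta_1/2)} - \frac{\rho\kappa}{\sqrt{1-\rho^2}\,\nu^2}\,(\theta_2 - \theta_1)
+ \frac{\kappa\bar\lambda}{\nu^3 R}\left(\frac{1}{\sin\theta_1} - \frac{1}{\sin\theta_2}\right) - \frac{\kappa}{\nu^2}\,\log\!\left(\frac{\sin\theta_2}{\sin\theta_1}\right)
\tag{6.34}
\]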
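The closed form (6.34) for the mean reverting part can be checked against a direct quadrature of the four integrals it comes from. The sketch below assumes constant parameters and angles strictly inside (0, π); the function names and the numerical values are illustrative assumptions only.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def log_p_mr(theta1, theta2, R, kappa, lam_bar, nu, rho):
    """Closed form (6.34), valid for 0 < theta1, theta2 < pi."""
    s = math.sqrt(1.0 - rho ** 2)
    return (
        rho * kappa * lam_bar / (R * s * nu ** 3)
        * math.log(math.tan(theta2 / 2.0) / math.tan(theta1 / 2.0))
        - rho * kappa / (s * nu ** 2) * (theta2 - theta1)
        + kappa * lam_bar / (nu ** 3 * R) * (1.0 / math.sin(theta1) - 1.0 / math.sin(theta2))
        - kappa / nu ** 2 * math.log(math.sin(theta2) / math.sin(theta1))
    )

def integrand(theta, R, kappa, lam_bar, nu, rho):
    """Sum of the four integrands (csc, constant, cos/sin^2, cot) before integration."""
    s = math.sqrt(1.0 - rho ** 2)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    return (
        rho * kappa * lam_bar / (R * s * nu ** 3) / sin_t
        - rho * kappa / (s * nu ** 2)
        + kappa * lam_bar / (nu ** 3 * R) * cos_t / sin_t ** 2
        - kappa / nu ** 2 * cos_t / sin_t
    )

args = (1.5, 0.5, 0.2, 0.4, -0.3)  # R, kappa, lam_bar, nu, rho (illustrative)
closed = log_p_mr(0.6, 2.0, *args)
numeric = simpson(lambda t: integrand(t, *args), 0.6, 2.0)
```

Agreement between `closed` and `numeric` confirms that the three antiderivatives have been combined with the right signs.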
\[ \frac{y_t}{y^2} = -\frac{\nu_t}{\nu}\,\frac{1}{y} \]
log(Pt ) (6.35)
θ2 θ2
1 γ ρ ρ νρ dx ν dy
= (− − + ) −
1−ρ 2 γ 1−ρ 2 ν θ1 y ν θ1 y
θ2
γ x
− dx
γ θ1 y 2
which gives
log(Pt )
1 γ ρ ρ νρ ν sin θ2
= (− − + )(θ1 − θ2 ) − log | |
1 − ρ2 γ 1 − ρ2 ν ν sin θ1
γ 1 tan θ2 sin θ2
− log − log | |
γ R tan θ1 sin θ1
(6.36)
The calculations of the heat kernel coefficient $u_1$ (and all other heat kernel
coefficients) can be carried out either in the original variables $(f, a)$ or in the (rescaled)
standard hyperbolic plane with metric $ds^2 = \frac{1}{y^2}(dx^2 + dy^2)$ (see footnote 6), provided we use the
transformed drift vector (6.20).
Recall, from formula (2.12) with i = 1, that
\[ u_1 = \frac{u_0}{d}\int_0^d u_0^{-1}\,(L u_0 + u_\tau)\,ds \tag{7.1} \]
6 Recall that a time change (6.21) gets rid of the extra factor ν 2 .
where
√ d( f 0 ,a0 , f,a,t) ∂d(s̃(ρ), ã(ρ), s, a)
u0 = P exp − dρ (7.2)
0 ∂t
u0t
where we recall that, using (6.4), (6.34) and (6.36), P is known in closed form, and
where in the formula above the distance function is given by (7.8). Also recall that
$\Delta$ is the Van-Vleck De Witt determinant, and we know that in the hyperbolic space
of curvature −1
\[ \Delta = \frac{d}{\sinh d}, \tag{7.3} \]
As noted by Willmore [54] p. 208, the Van-Vleck De Witt determinant (which Will-
more calls the “discriminant function”) is independent of the coordinate system cho-
sen. This means that to express the VVDW determinant in the original coordinate
system, it is sufficient to use the same expression (7.3) in conjunction with the dis-
tance function in that coordinate system, i.e. (7.8). Since all ingredients in u 0 are
explicit we can now simply insert this expression into (7.1) and use a symbolic
calculation engine like Mathematica or Maple to do the calculations.
Alternatively, we can do the calculations using the formula for u 1 in the (x, y)
plane. This has the advantage that many of the terms arising during the calculation
can be computed, once again, in closed form. We discuss this further in the remainder
of this section.
The fact that $\Delta$ depends exclusively on the distance simplifies some of the
expressions involved in $u_1$. In polar coordinates based at the point y, the Laplace
Beltrami operator can be expressed in the form
n−1
∂2 ∂ r ∂ ∂2
+ log + a(r, θ, t) 2 (7.4)
∂r 2 ∂r ∂r ∂θ
∂
∂ 2 n−1 ∂ ∂2
= 2 + − ∂r + + a(r, θ, t) 2 , (7.5)
∂r r ∂r ∂θ
see Proposition G.V.3, p. 134 of Berger-Gauduchon-Mazet [12] and use the fact that
( )−1 = θ (r u) as defined in C.III.3 in [12]. For brevity’s sake, let
∂
∂r n−1
c(r, t)) = − +
r
Using the explicit form of $\Delta$ one calculates that, in the case of the standard hyperbolic
plane,
\[ c(r, t) = \coth(d), \qquad a(r, t) = \frac{1}{\sinh^2(d)} \]
\[ \frac{\partial^2}{\partial r^2} + c(r, t)\,\frac{\partial}{\partial r} + a(r, \theta, t)\,\frac{\partial^2}{\partial\theta^2} \]
Letting $\Delta_B$ act on $u_0$, there are some simplifications due to the fact that $\Delta$ (the Van
Vleck De Witt determinant) depends only on r:
B u0 = B( (r )P(r, θ ))
∂2 ∂ ∂2
= +
+ c(r, t) a(r, θ, t) B (P(r, θ ))
∂r 2 ∂r ∂θ 2
2
∂ ∂ ∂2
= (P(r, θ )) + c(r, t) ( (r )) + a(r, θ, t) 2 (P(r, θ )) ( (r ))
∂r 2 ∂r ∂θ
Therefore
Lu 0 =
√ √
= PL + LP
√ √ 1 √
= PL + P LA + g i j Ai A j P
2
And so
u −1
0 Lu 0
√
L 1 ij
= √ + LA + g Ai A j
2
√
L 1 2 2
= √ + LA + y (Ax + A2y )
2
So, that putting it all together and inserting into (7.1) we obtain an expression of
the form
& √ '
√ 1 L 1 2 2
P √ + LA + y (Ax + Ay ) ds 2
d γ[x2 ,y2 ,x1 ,y1 ] 2
Since the first part, $L\sqrt{\Delta}/\sqrt{\Delta}$, depends only on the distance, it can be integrated in
closed form, and we obtain, using Mathematica and the expression (7.7) below, that
it equals
√
(−1 + d 2 + d coth d) d csch d
−P
4d 2
Now
LA
y2
= (Ax x + Ayy ) + Ṽ x Ax + Ṽ y Ay (7.6)
√2
L
√
√
−1 − 3d 2 + (1 + d 2 ) cosh 2d csch4 d sinh d
= (7.7)
8d 2 (csch d)3/2
So a key quantity that we must calculate is the action of the differential operator L
on the exponent A in P.
We express this exponent using the standard polar coordinates (6.23) (see footnote 7). With
fixed point $(x_2, y_2)$, and letting for brevity $\theta_1 = \theta(x_1, y_1)$, $\theta_2 = \theta(x_2, y_2)$ with θ
given by (6.24), we have
θ1 (x,y,x2 ,y2 )
A(x, y, x2 , y2 ) = R(x, y, x2 , y2 ) (−Ṽx sin θ + Ṽy cos θ )dθ
θ2
Therefore
∂A
(x1 , y1 ) = R −Ṽx sin θ1 + Ṽy (x1 , y1 ) cos θ1 (θ1 )x |x=x1 ,y=y1
∂ x& '
θ1 (x1 ,y2 ,x2 ,y2 )
+ (R)x (−Ṽx sin θ + Ṽy cos θ )dθ
θ2
x=x1 ,y=y1
and similarly for the other first order partial derivative $\partial A/\partial y$. Note that the evaluation of the
terms above requires knowledge of the value of the integral which was computed in
the previous section.
7 Note this angle is not the same angle as that used above in geodesic polar coordinates.
Also we may calculate the second order partial derivatives Ax x , Ayy , Axy and
express these in terms of Ṽ and derivatives of the polar angle θ1 , given as in (6.23),
(6.24) and of the radius, given by (6.22).
where
f0 1 1 1−β
q( f, f 1 , t) = df = ( f 1−β − f 1 )
f1 γ(t) f β γ(1 − β)
Next we check the value that minimizes the geodesic distance from the hyperplane
f = f1:
This value is given by
c(t) := (a1 )min (t) = a0 a02 − 2ν(t)q(t)a0 ρ(t) + ν 2 (t)q 2 (t) (7.9)
After some simplification, the minimum distance may be written in one of two forms,
the first corresponding to a distance function
and the second to a signed distance function (of the point $(a_0, f_0)$ to the line $f = K$):
1 qνρ − a0 ρ − a02 + q 2 ν 2 + 2a0 qνρ
2
us
dmin = , unsigned
ν a0 1 − ρ 2
(7.10)
and
1 1
s
dmin = log νq − a0 ρ + a02 − 2a0 νρq + ν 2 q 2 , signed
ν a0 (1 − ρ)
(7.11)
As mentioned above, this second version of the distance function can become
negative, but its absolute value coincides with the expression above. This distinction
seems not to have been emphasized in previous treatments of the subject. In
calculating the local volatility expansion, especially the term $V_2$, we require derivatives
up to the sixth order of $\phi = d^2/2$. These are listed below:
Let
φ (2) (7.12)
Csch(νd)d
= (7.13)
ν a02 + q 2 − 2qνρ a0 − a0 ρ 2
φ (3)
3Csch(ν d)d
= 2
a0 ν a0 + q 2 − 2qνρ −1 + ρ 2
(7.14)
φ (4)
3Csch(dν) −4da0 ν −1 + ρ 2 + a02 + q 2 ν 2 − 2qa0 νρ(1 − dνCoth(dν))Csch(dν)
= 3/2 2
a02 ν 2 a02 + q 2 ν 2 − 2qa0 νρ −1 + ρ 2
(7.15)
φ (5)
(−1+dνCoth(dν))Csch(dν)
30Csch(dν) 2
2dν
2 + 3/2
a0 +q 2 ν 2 −2qa0 νρ a0 a02 +q 2 ν 2 −2qa0 νρ (−1+ρ 2 )
=
a0 ν 2 −1 + ρ 2
(7.16)
and
φ (6)
1
= − 5/2 3
a ν a + q ν − 2qaνρ
3 2 2 2 2 −1 + ρ 2
2
× 15Csch[dν] 24da 2 ν −1 + ρ 2 − 3 a 2 + q 2 ν 2 − 2qaνρ Coth[dν]Csch[dν]2
+ dν a 2 + q 2 ν 2 − 2qaνρ (2 + Cosh[2dν])Csch[dν]4
+ 18a a 2 + q 2 ν 2 − 2qaνρ −1 + ρ 2 Csch[dν]2 (dνCosh[dν] − Sinh[dν]) (7.17)
Collecting the results in this paper, we see that the probability density function in
the λ-Sabr model is given, at the zero-th order, by
p( f, a, t, K , A, T )
√ *
1 1 C( f, t 1−ρ
1 d
= (√ ) 2
2π(T − t) 1 − ρ(T )2 C(K , T )ν(T )A2 C(K , t) sinh d
2 d( f 0 ,a0 , f,a,t) ∂d( f˜(ρ),ã(ρ), f,a)
− 2(Td −t) − − dρ +log(P ) M R +log(Pt )
×e 0 ∂t
where the distance d in the above formula is given by (7.8). In the case of the λ-Sabr
model, the results in this paper allow us to sharpen the above results and obtain the
expansion for the heat kernel up to the zero-th order in the form
p CEV ( f, a, t, K , A, T )
β 1 *
1−ρ 2
1 1 f 2 d
= β
2π(T − t) 1 − ρ(T )2 ν(T ) γ(T )K β A2 K 2 sinh d
2 d( f 0 ,a0 , f,a,t) ∂d( f˜(ρ),ã(ρ), f,a)
− 2(Td −t) + − dρ +log(P ) M R +log(Pt )
×e 0 ∂t
(8.1)
where $\log(P)_{MR}$ is explicit and defined in (6.34), and the fully explicit $\log(P_t)$ is
given by (6.36). Also, the derivative of the distance function appearing in the exponent
of the exponential can easily be calculated for any specific functional form of the time
dependent parameters γ(t), ν(t), ρ(t) by differentiating (7.8) with respect to t.
Using (8.4) we see that the zero-th order local volatility σ L , for the family of
stochastic volatility models (2.1), (2.2) (for any local volatility C( f, t)), can be
expressed, in the form
σ L(0) ( f 0 , a0 , K , τ )
f0
1
= C(K , t)(a1 )min = C(K , t) a02 − 2qa1 ν(t)ρ(t)q + q 2 ν 2 (t), q( f 0 , K ) =
K C(u, t)
where ξ was defined via (5.6), $d^s_{\min}$ is the signed distance to the line f = K given in
(7.11), q was defined by (7.9), and we recall that $a_0$ is the initial value of $a_t$.
This zero-th order result, in the time homogeneous case where the coefficients ν, ρ
are constants and where γ = 1, agrees with that given by Berestycki et al. and by
Henry-Labordère in his Encyclopedia article [35], but does not agree with the formula in
Hagan et al. [31], a discrepancy already pointed out by Obloj in [46].
Ĉ(c)
1 dmin
=
a 2 ν(t) γ(t) f β (1 − ρ 2 (t)) sinh(dmin )
β
1
2
f 2 1−ρ log(P M R )+log(Pt )
× β
e
K2
where the quantities in the exponent of the exponential were defined respectively
in, (6.34), and (6.36), and where the derivatives of the functions φ were supplied in
(7.13) and (7.14).
The Black-Scholes implied volatility is now given by (5.12) whose expression we
recall here
(0)
σ L ( f 0 ,a0 ,K ,t)u 0,1 ( f 0 ,K ,t)dmin ( f 0 ,a0 ,K ,t)
ξ ln √
ξ f0 K
(1)
σBS = 3
.
dmin
and where we recall that the definition of u 0,1 , given by (5.14) (where we need to plug
(1)
in the quantity, and only that term, involves σ L and hence involves the non-metric
of the zero-th order heat kernel. Also, recall that dmin was given above (8.3).
Moreover for the second order expansion of local volatility and implied volatility
respectively we have
[Figure: curves labelled "new order 1", "MC "exact"" and "Hagan et al"; horizontal axis from 100 to 104, vertical axis from 0.062 to 0.074.]
[Figure: curves labelled "new sig1", "sig1 Hag" and "MC sig1"; horizontal axis from 99 to 106, vertical axis from 0.058 to 0.078.]
9 Numerics
In this section we compare the accuracy of our first order expansion with that
of Hagan-Kumar-Lesniewski-Woodward [31], who do not provide second order
approximations, on three time horizons. In all of the numerics below, the bench-
mark prices used were obtained from a Monte Carlo simulation with 40 time steps
and 300,000 paths per time step. The standard error associated, calculated by means
of dividing the empirical standard deviation by the square root of the number of
paths, was 0.0032. The dimensionless parameters involved in the empirical work
were large, since we took vol of vol ν to be equal to 1.
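The standard error quoted above is the empirical standard deviation of the simulated payoffs divided by the square root of the number of paths. A minimal sketch of that computation follows; it assumes the sample (n − 1) convention for the standard deviation, which the text does not specify, and the sample values are purely illustrative.

```python
import math
import statistics

def standard_error(samples):
    """Empirical standard deviation divided by sqrt(number of samples)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))

# Tiny illustrative sample; a real run would use the simulated discounted payoffs
se = standard_error([1.0, 2.0, 3.0, 4.0])
```

With 300,000 paths the difference between the n and n − 1 conventions is negligible.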
What we have found is that our first order approximation is more accurate on
time horizons of 0.5 and 1 year, both in and out of the money. On a 2 year time
horizon, Hagan et al.’s first order approximation is sometimes better than ours out
of the money and tends to be worse in the money. These results are illustrated in the
figures (Figs. 1, 2 and 3).
[Figure: curves labelled "new order 1", "order 1 Hag" and "order 1 MC"; horizontal axis from 100 to 106, vertical axis from 0.062 to 0.078.]
Acknowledgments We would like to thank Tai-Ho Wang of Baruch College for his assistance in
calculating the higher derivatives of φ in Sect. 7, and for having pointed out to us the validity of the
curvature formula (3.3) in the case q = 1. We would also like to thank Elton Hsu for interesting
discussions.
(2)
VL =
1
− + (c) + cCt(u)(d
C( f, t)2 12cφ (c)2 −2d0 (c)Ct + 1 (u) + u 1 (u))φ (c)
ˆ
24Ct(c) d0 (c) φ (c)
2 2 5
+ (c)d0 (c) − d0 (c)Ct
φ (c) −2Ct + (c) − 2Ct(u)(d
+ + (c)φ (3) (c)
1 (u) + u 1 (u))φ (c) + d0 (c)Ct
+
Ct(c)φ (c) 12cφ (c)2 2Ct + (c)d0 (c) 2d0 (c) + c(d1 (c) + u 1 (c))φ (c) +
+
+ Ct(u)(d (3)
1 (u) + u 1 (u)) φ (c) 4d0 (c) − cd0 (c) + 2c(d1 (c) + u 1 (c))φ (c) + cd0 (c)φ (c)
− d0 (c)φ (c) 48cd0 (c)Ct + (c)φ (c) − 24Ct(u)d
+ +
1 (u)φ (c) − 24Ct(u)u
2
1 (u)φ (c)
2
− 24cφ (c)2 d0(3) (c) + 48d0 (c)φ (c)φ (3) (c) + 24cd1 (c)φ (c)2 φ (3) (c)
+ 24cu 1 (c)φ (c)2 φ (3) (c)− 48cd0 (c)φ (3) (c)2 + 5c2 d1 (c)φ (c)φ (3) (c)2 + 5c2 u 1 (c)φ (c)φ (3) (c)2
− 24d0 (c)φ (c) φ (c) − 2cφ (3) (c) − 3cφ (c) −8d0 (c) + c(d1 (c) + u 1 (c))φ (c) φ (4) (c)
+ 2d0 (c)2 15cφ (3) (c)3 − φ (c)φ (3) (c) 15φ (3) (c) + 16cφ (4) (c)
+ 3φ (c)2 2φ (4) (c) + cφ (5) (c) (A.1)
where all quantities appearing above were already defined in connection with (4.24)
in the time-inhomogeneous case and where
√
+=
Ct P
and $d_1$, $d_2$ were defined in (4.25). In the case of the λ-Sabr model all required
derivatives of φ (up to order 6) were supplied in Sect. 7.1.
References
1. Andersen, L.B.G.: Option pricing with quadratic volatility: a revisit. Financ. Stoch. 15(2),
191–219 (2011)
2. Andreasen, J., Andersen, L.B.G.: Jump diffusion pricing: volatility smile fitting and numerical
methods for option pricing. Rev. Deriv. Res. 4, 231–262 (2000)
3. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing volatility. Risk Mag.
15(10), 87–91 (2002)
4. Avramidi, I.: Heat Kernel and Quantum Gravity. Lecture Notes in Physics, New Series, Mono-
graphs, vol. 64. Springer, Berlin (2000)
5. Avramidi, I.: Lectures on Heat Kernel Applications in Finance. New Mexico Technology. http://
infohost.nmt.edu/iavramid/notes/hkt/hkt.html
6. Azencott, R.: Géodésiques et diffusions en temps petit. Séminaire de Probabilités. pp. 84–85
(1991)
7. Bates, D.: Jumps and stochastic volatility: the exchange rate processes implicit in Deutschemark
options. Rev. Financ. Stud. 9, 69–107 (1996)
8. Benhamou, E., Croissant, O.: Local time for the SABR model: connection with the ‘complex’
black scholes and application to CMS and spread options, SSRN preprint (2004)
9. Bender, C.M., Orszag, S.A.: Advanced Mathematical Methods for Scientists and Engineers.
Springer, Berlin (1999)
10. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
11. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
12. Berger, M., Gauduchon, P., Mazet, E.: Le spectre d’une variété Riemannienne. Lecture Notes
in Mathematics. Springer, Berlin (1970)
13. Bourgade, P., Croissant, O.: Heat kernel expansion for a family of stochastic volatility models:
δ geometry, SSRN (2005)
14. Carr, P., Jarrow, R.: The stop-loss start-gain paradox and option valuation: a new decomposition
into intrinsic and time value. Rev. Financ. Stud. 3(3), 469–492 (1990)
15. Carr, P., Wu, L.: Stochastic skew for currency options. J. Financ. Econ. 86, 213–247 (2007)
16. Carr, P., Geman, H., Madan, D., Yor, M.: From local volatility to local Lévy models. Quant.
Financ. 4, 581–588 (2004)
17. Chavel, I.: Eigenvalues in Riemannian Geometry. Academic Press, Waltham (1984)
18. Cox, J.: Notes on option pricing I: constant elasticity of diffusions. Unpublished draft. Palo
Alto, CA: Stanford University, September 1975
19. Doust, P.: No arbitrage Sabr, Royal Bank of Scotland working paper (2010)
20. Dupire, B.: Pricing with a smile. Risk 7, 18–20 (1994)
21. Dupire, B.: A unified theory of volatility, discussion paper paribas capital management. In:
Peter, C. (ed.) Reprinted in Derivative Pricing: The Classic Collection. Risk Books (2004)
22. Derman, E., Kani, I.: Riding on a smile. Risk 7, 32–39 (1994)
23. Feng, J., Forde, M., Fouque, J.-P.: Short-maturity asymptotics for a fast mean-reverting Heston
stochastic volatility model. SIAM J. Financ. Math. 1, 126–141 (2010)
24. Forde, M., Jacquier, A.: Small time asymptotics for implied volatility under the Heston model.
Int. J. Theory Appl. Financ. 12, 861–876 (2009)
25. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15(4),
755–780 (2011)
26. Forde, M., Jacquier, A., Mijatović, A.: Asymptotic formulae for implied volatility in the Heston
model. Proc. R. Soc. A 466(2124), 3593–3620 (2010)
27. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3(1), 690–708 (2012)
28. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
29. Hadamard, J.: Lecons sur les equations différentielles. Dover Publications, Sewed (1952)
30. Hagan, P., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6, 147–157
(1999)
31. Hagan, P., Kumar, P.S., Lesniewski, A., Woodward, D.E.: Managing smile risk. Wilmott Mag.,
84–108 (2003)
32. Hagan, P., Lesniewski, A.: Probability distribution in the SABR model of stochastic volatility.
In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large Deviations
and Asymptotic Methods in Finance. Springer Proceedings in Mathematics and Statistics, vol.
110 (2015)
33. Heston, S.: A closed form solution for options with stochastic volatility, with applications to
bond and currency pricing. Rev. Financ. Stud. 6, 327–342 (1993)
34. Hsu, E.: Stochastic analysis on manifolds, Graduate Studies in Mathematics. American Math-
ematical Society (2002)
35. Henry-Labordère, P.: SABR Model. Encyclopedia of Quantitative Finance (2010)
36. Hsu, E.: The heat kernel on non-complete manifolds. Indiana Math. J. 39(2), 431 (1990)
37. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 42(2),
282–300 (1987)
38. Kunimoto, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11(1), 117–151 (2001)
39. Henry-Labordère, P.: A general asymptotic implied volatility for stochastic volatility models.
http://arxiv.org/abs/cond-mat/0504317 (2005)
40. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance. Chapman & Hall/CRC
Financial Mathematics Series (2008)
41. Henry-Labordère, P.: Solvable local and stochastic volatility models: supersymmetric methods
in option pricing. Quant. Financ. 7(5), 525–535 (2007)
42. Lewis, A.L.: Option valuation under stochastic volatility: with mathematica code. CA: Finance
Press (2000)
43. Lesniewski, A.: Swaption Smiles via the WKB Method. Seminar Mathematical Finance,
Courant Institute of Mathematical Sciences (February 2002)
44. Minakshisundaram, S., Pleijel, A.: Some properties of the eigenfunctions of the Laplace- oper-
ator on Riemannian manifolds. Can. J. Math. 1, 242–256 (1949)
45. Medvedev, A., Scaillet, O.: Approximation and calibration of short-term implied volatilities
under jump-diffusion stochastic volatility. Rev. Financ. Stud. 20, 427–459 (2007)
46. Obloj, J.: Fine-tuning your smile: correction to Hagan et al. Wilmott Mag. (2008)
47. Molchanov, S.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30, 11–63
(1975)
48. Paulot, L.: Asymptotic implied volatility at the second order with application to the SABR
model. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large
Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics and
Statistics, vol. 110 (2015)
49. Renault, E., Touzi, N.: Option hedging and implied volatilities in a stochastic volatility model.
Math. Financ. 6(3), 279–302 (1996)
50. Stein, E.M., Stein, J.C.: Stock distributions with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4(4), 727–752 (1991)
51. Takahashi, A., Takehara, K., Toda, M.: Computation in an asymptotic expansion method, SSRN
preprint (2009)
52. Varadhan, S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20, 431–455 (1967)
53. Varadhan, S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20,
659–685 (1967)
54. Willmore, T.J.: Riemannian Geometry. Oxford Science Publications, Oxford (2002)
55. Yoshida, K.: On the fundamental solution of the parabolic equation in a Riemannian space.
Osaka Math. J. 1(1), 1–52 (1953)
General Asymptotics of Wiener Functionals
and Application to Implied Volatilities
Yasufumi Osajima
1 Introduction
Y. Osajima (B)
BNP Paribas, Fixed Income Research and Strategies, 10 Harewood Avenue,
London NW1 6AA, UK
e-mail: [email protected]
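\[ dX^{\varepsilon,i}(t) = \varepsilon\sum_{k=1}^{d} V_k^i(t, X^\varepsilon(t))\,dW_k(t) + V_0^i(t, X^\varepsilon(t))\,dt, \quad 1 \le i \le N, \]
\[ X^\varepsilon(0) = x_0 = (x_0^1, \dots, x_0^N), \quad x_0 \in \mathbb{R}^N. \tag{1} \]
\[ V_0^1 \equiv 0, \tag{2} \]
and the ellipticity of $V_1, \dots, V_d$ at $x_0$, i.e. there exists a constant δ > 0 such that
\[ \sum_{k=1}^{d} V_k(0, x_0) \otimes V_k(0, x_0) \ge \delta I, \tag{3} \]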
where I denotes the identity matrix. Then there exists a unique solution to (1).
Moreover, we assume that X ε (t) is continuous in t with probability one.
We investigate the distribution of X ε1 (T ). From the ellipticity condition (3), the
law of X ε1 (T ), denoted by νε , is absolutely continuous and has a smooth density
pε (y). Let H be the Cameron-Martin space of d-dimensional Wiener space. We
consider the associated ordinary differential equation:
\[ \frac{d}{dt}\,y^i(t; h) = \sum_{k=1}^{d} V_k^i(t, y(t; h))\,\dot h^k(t) + V_0^i(t, y(t; h)), \quad t \in [0, T],\ h \in H, \]
\[ y(0; h) = x_0, \quad x_0 \in \mathbb{R}^N. \tag{4} \]
Since $V_0^1 \equiv 0$, this energy function satisfies $e(x_0^1) = 0$. Let us define a flow
$\phi : [0, T] \times \mathbb{R}^N \to \mathbb{R}^N$ by
\[ \frac{d}{dt}\,\phi(t, x) = V_0(t, \phi(t, x)), \quad t \in [0, T],\ x \in \mathbb{R}^N, \tag{6} \]
\[ \phi(0, x) = x. \]
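\[ \tilde V_k^i(t, y) = \sum_{j=1}^{N} \frac{\partial\phi^i}{\partial x^j}(-t, \phi(t, y))\,V_k^j(t, \phi(t, y)), \quad 1 \le i \le N,\ 1 \le k \le d, \tag{7} \]
which is the push-forward of the vector field V by the map $\phi_t$. Let us define
$(g^{ij})_{1 \le i, j \le N} : [0, T] \times \mathbb{R}^N \to \mathbb{R}$ by
\[ g^{ij}(t, x) = \sum_{k=1}^{d} \tilde V_k^i(t, x)\,\tilde V_k^j(t, x), \quad 1 \le i, j \le N. \]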
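From (3), the matrix $(g^{ij})_{1 \le i, j \le N}$ is positive definite and defines a Riemannian
metric on $\mathbb{R}^N$. We define the generating operator $L_t$, $t \in [0, T]$, by
\[ (L_t f)(x) = \frac{1}{2}\sum_{i,j=1}^{N} g^{ij}(t, x)\,\frac{\partial^2 f}{\partial x^i\,\partial x^j}(x) + \sum_{i=1}^{N} \tilde V_0^i(t, x)\,\frac{\partial f}{\partial x^i}(x), \]
\[ \tilde V_0^i(t, y) = \frac{1}{2}\sum_{k,l=1}^{N}\sum_{m=1}^{d} \frac{\partial^2\phi^i}{\partial x^k\,\partial x^l}(-t, \phi(t, y))\,V_m^k(t, \phi(t, y))\,V_m^l(t, \phi(t, y)), \quad 1 \le i \le N. \tag{9} \]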
N T ∂f
(V f )(t, x) ≡ g 1i (t, x) (s, x)ds, (10)
t ∂ xi
i=1
N
T ∂f T ∂g
Γ ( f, g)(x) ≡ g ij (t, x) (s, x)ds (s, x)ds dt. (11)
t ∂ xi t ∂x j
i, j=1
1 b2 b b2
3
e(y) − (y − x01 )2 − 3 (y − x01 )3 + − 4 + 25 (y − x01 )4
2b1 3b1 4b1 2b1
≤ C0 |y − x01 |5 , y ∈ [x01 − r0 , x01 + r0 ], (12)
where
T
3 T
b1 = g (t, x0 )dt, b2 =
11
(V g 11 )(t, x0 )dt, (13)
0 2 0
T
1 T
b3 = 2 (V g )(t, x0 )dt +
2 11
Γ (g 11 , g 11 )(x0 ).
0 2 0
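(2) There are constants $C_1, C_2 > 0$ such that the probability density $p_\varepsilon(y)$ satisfies
the following:
\[ \left| (2\pi\varepsilon^2)^{1/2}\exp\!\left(\frac{e(y)}{\varepsilon^2}\right) p_\varepsilon(y) - a_0(y) - \varepsilon^2 a_2(y) \right| \le \varepsilon^4 C_1, \quad y \in [x_0^1 - r_0, x_0^1 + r_0]. \tag{14} \]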
∂ 2 e(y) 1 L(y − x 1 )2
2
a0 (y) − exp 0
≤ C2 |y − x01 |3 , y ∈ [x01 − r0 , x01 + r0 ],
∂ y2 2b12
(15)
and
1 L 5 b22 3 b3
a2 (x01 ) = √ − − + , (16)
b1 2b1 6 b13 4 b12
where
Remark 1 We can restate our results (14) as the heat kernel expansion:
\[ p_\varepsilon(y) \sim \frac{1}{\varepsilon(2\pi)^{1/2}}\,e^{-e(y)/\varepsilon^2}\left(a_0(y) + \varepsilon^2 a_2(y) + O(\varepsilon^4)\right). \]
Next, we apply our results to the asymptotic expansion of call option values and
their implied volatilities. We regard X ε1 as the underlying of these options. Then the
forward value of a call option of strike rate K and maturity T is given by
Some properties of $\varphi_n$ are given in Lemma 7. By (12), we can define a
function $q \in C^2([x_0^1 - r_0, x_0^1 + r_0]; \mathbb{R}_+)$ such that
\[ e(x) = \frac{1}{2}\left(\int_{x_0^1}^{x}\frac{dy}{q(y)}\right)^{2}, \quad x \in [x_0^1 - r_0, x_0^1 + r_0]. \tag{19} \]
Then the asymptotic expansion of call option values is given by the following.
Theorem 2 There are constants K 0 < K 1 and C1 such that the value of the call
option with strike rate K , maturity T satisfies
√
√ e(K ) 2e(K )
2π exp( 2 )Cε (T, K ) − εa0 (K )q(K )2 ϕ1 R2 (ε, K ) ≤ C1 ε4 ,
ε ε
ε ∈ (0, 1], K ∈ [K 0 , K 1 ],
where
√
a0 (K ) 3 q (K ) ϕ2 ( 2e(K )/ε)
R2 (ε, K ) = εq(K ) + √
a0 (K ) 2 q(K ) ϕ1 ( 2e(K )/ε)
2 1 a0 (K ) a0 (K ) q (K ) 7 q (K ) 2 2 q (K )
+ ε q(K )
2
+2 + +
2 a0 (K ) a0 (K ) q(K ) 6 q(K ) 3 q(K )
√
ϕ3 ( 2e(K )/ε) a2 (K )
× √ + ε2 . (20)
ϕ1 ( 2e(K )/ε) a0 (K )
f (0+ ) = ∞, f (∞) = 0.
the value of the call option with strike rate K and maturity T is given by
∞
1 z2 K − x1
C N (T, K ) = √ (z + x01 − K )+ exp − 2 d x = (K − x01 ) · f √ 0 .
2π σ 2 T −∞ 2σ T σ T
\[ \sigma_N^\varepsilon(T, K) = \frac{K - x_0^1}{f^{-1}\!\left(C_\varepsilon(T, K)/(K - x_0^1)\right)\sqrt{T}}, \quad K > x_0^1. \]
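The inversion behind the implied normal volatility can be sketched as follows. The explicit form of f used here, f(u) = φ(u)/u − Φ(−u) with φ and Φ the standard normal density and distribution function, is an assumption obtained by rewriting the Bachelier call price as (K − x_0) f((K − x_0)/(σ√T)); it is consistent with f(0+) = ∞ and f(∞) = 0 but is not quoted from the text. The bisection bounds and sample values are likewise illustrative.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def f(u):
    """Assumed form of f: strictly decreasing on (0, inf), f(0+) = inf, f(inf) = 0."""
    return norm_pdf(u) / u - norm_cdf(-u)

def bachelier_call(x0, K, sigma, T):
    """Normal-model (Bachelier) forward call value."""
    d = (x0 - K) / (sigma * math.sqrt(T))
    return (x0 - K) * norm_cdf(d) + sigma * math.sqrt(T) * norm_pdf(d)

def implied_normal_vol(price, x0, K, T, lo=1e-8, hi=1e3):
    """Solve f(u) = price / (K - x0) by bisection for K > x0, then return sigma."""
    target = price / (K - x0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid  # f is decreasing, so the root lies to the right
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return (K - x0) / (u * math.sqrt(T))

price = bachelier_call(100.0, 102.0, 5.0, 1.0)
vol = implied_normal_vol(price, 100.0, 102.0, 1.0)
```

Bisection is used because f is monotone; a Newton iteration would converge faster but needs a good starting point for deep out-of-the-money strikes.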
The asymptotic expansion of the implied normal volatilities is given by the
following.
Theorem 3 The asymptotic expansion of implied normal volatilities is given by
ε|K − x 1 | −1
√ 0
σ Nε (T, K ) − exp(J ) ≤ C(ε + |K − x01 |)3 , K ∈ [x01 , K 1 ], (22)
2e(K )T
where
√ √
|K − x01 |2 L 1 b22 1 b3 2e(K ) ε2 L 5 3 b3
b22 2e(K )
J = + − ϕ 1 + − − + ϕ 1
b12 2 6 b2 4 b1 ε b1 2 6b12 4 b1 ε
1
√ √
ε |K − x01 | 2 b22 3 b3 2e(K ) 2
ε L b 2
b3 2e(K )
+ √ L+ − ϕ 2 + + 22 − ϕ3 .
b1 b1 3 b2 4 b1 ε b1 2 2b1 2b1 ε
1
Remark 2 Since we can give the same formula for put options, Theorem 3 still
holds in the case K < x01 . The implied volatility for a put option of strike rate K and
maturity T is the same as the implied volatility for a call option with the same strike
rate and maturity due to the put-call parity. See Appendix 3 for the details.
\[ H = \left\{ h \in C_0([0, 1]; \mathbb{R}^d) : h \text{ is absolutely continuous and } \sum_{i=1}^{d}\int_0^1 \dot h^i(t)^2\,dt < \infty \right\}. \]
\[ (h, k)_H = \sum_{i=1}^{d}\int_0^1 \dot h^i(t)\,\dot k^i(t)\,dt. \]
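For intuition, the Cameron-Martin inner product can be approximated on a grid by replacing the derivatives with difference quotients. The sketch below treats scalar paths only (d = 1), and the function name `cameron_martin_inner` is a hypothetical helper.

```python
def cameron_martin_inner(h, k, n=1000):
    """Approximate (h, k)_H = int_0^1 h'(t) k'(t) dt for scalar paths on a uniform grid."""
    dt = 1.0 / n
    total = 0.0
    for j in range(n):
        t0, t1 = j * dt, (j + 1) * dt
        # (dh/dt) * (dk/dt) * dt with difference quotients on [t0, t1]
        total += (h(t1) - h(t0)) * (k(t1) - k(t0)) / dt
    return total

# h(t) = t and k(t) = t^2 both start at 0; the exact value is int_0^1 2t dt = 1
value = cameron_martin_inner(lambda t: t, lambda t: t * t)
```

The same sum with h = k gives the squared norm of h, i.e. twice the energy of the path.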
d i
d
y (t; h) = Ṽki (t, y(t; h))ḣ k (t), 1 ≤ i ≤ N , t ∈ [0, 1],
dt
k=1
y(0; h) = x0 , x0 ∈ R N .
d
j
g ij (t, x) = Ṽki (t, x)Ṽk (t, x).
k=1
\[ H(t, x, p) = \frac{1}{2}\sum_{i,j=1}^{N} g^{ij}(t, x)\,p_i\,p_j. \tag{23} \]
Then the correspondence between Hamilton equation and the energy of path is given
by the following.
Proposition 1 Let J ji : [0, 1] × H → R be the solution to the following ordinary
differential equation:
\[
\frac{d}{dt} J^i_j(t;h) = \sum_{k=1}^d \sum_{r=1}^N \frac{\partial \tilde V_k^i}{\partial x^r}(t, y(t;h))\,J^r_j(t;h)\,\dot h^k(t),
\qquad J^i_j(0;h) = \delta_{ij}, \quad 1 \le i,j \le N,
\]
where $\delta_{ij}$ is Kronecker's delta. Let $\bar J(t;h) = J^{-1}(t;h)$. We assume there is $h_0 \in H$
and $\lambda \in \mathbb{R}^N$ such that
\[
h_0 = \sum_{k=1}^N \lambda_k\, Dy^k(1;h_0). \tag{24}
\]
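\[
x(t) = y(t;h_0), \qquad p_i(t) = \sum_{j,k=1}^N \bar J^j_i(t;h_0)\,J^k_j(1;h_0)\,\lambda_k. \tag{25}
\]
\[
\frac{d}{dt} x^i(t) = \frac{\partial}{\partial p_i} H(t,x(t),p(t)), \qquad
\frac{d}{dt} p_i(t) = -\frac{\partial}{\partial x^i} H(t,x(t),p(t)), \quad 0 \le t \le 1,\ 1 \le i \le N, \tag{26}
\]
\[
x(0) = x_0, \quad x_0 \in \mathbb{R}^N.
\]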
\[
\frac{d}{dt} h_0^k(t) = \sum_{i=1}^N p_i(t)\,\tilde V_k^i(t, x(t)), \quad 0 \le t \le 1,\ 1 \le k \le d,
\]
\[
\|h_0\|_H^2 = \sum_{i,j=1}^N \int_0^1 g^{ij}(t,x(t))\,p_i(t)\,p_j(t)\,dt. \tag{27}
\]
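Proof We note that $\bar J^i_j : [0,1] \times H \to \mathbb{R}$ satisfies the following ordinary differential
equation:
\[
\frac{d}{dt} \bar J^i_j(t;h) = -\sum_{k=1}^d \sum_{r=1}^N \frac{\partial}{\partial x^j} \tilde V_k^r(t, y(t;h))\,\bar J^i_r(t;h)\,\dot h^k(t),
\qquad \bar J^i_j(0;h) = \delta_{ij}, \quad 1 \le i,j \le N.
\]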
\[
Dy^i(1;h)[k] = \sum_{l=1}^d \sum_{r,j=1}^N J^i_r(1;h) \int_0^1 \bar J^r_j(t;h)\,\tilde V_l^j(t, y(t;h))\,\dot k^l(t)\,dt, \quad 1 \le i \le N. \tag{28}
\]
From (25), it is easy to see $\lambda = p(1)$. Since $h_0 = \sum_{i=1}^N \lambda_i\, Dy^i(1;h_0)$, we see that
\[
(h_0, k)_H = \sum_{i=1}^N \sum_{l=1}^d \int_0^1 p_i(t)\,\tilde V_l^i(t, y(t;h_0))\,\dot k^l(t)\,dt.
\]
Therefore we have (27). We can check that (x(t), p(t)), 0 ≤ t ≤ 1, satisfies (26) as
follows:
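\[
\frac{d}{dt} x^i(t) = \sum_{k=1}^d \tilde V_k^i(t, x(t))\,\dot h_0^k(t) = \sum_{j=1}^N g^{ij}(t, x(t))\,p_j(t),
\]
\[
\frac{d}{dt} p_i(t) = -\sum_{k=1}^d \sum_{j=1}^N \frac{\partial \tilde V_k^j}{\partial x^i}(t, x(t))\,p_j(t)\,\dot h_0^k(t)
= -\frac12 \sum_{j,r=1}^N \frac{\partial g^{jr}}{\partial x^i}(t, x(t))\,p_j(t)\,p_r(t).
\]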
and let $h_0 \in H$ be the minimizer of the energy function. Then we can apply
Lagrange's method, and there is a $\lambda \in \mathbb{R}^N$ such that
\[
h_0 = \sum_{k=1}^N \lambda_k\, Dy^k(1;h_0),
\]
which is the condition (24). In particular, the condition (29) in the next proposition
corresponds to the energy function (5).
Proposition 2 Let $x(t;w)$, $p(t;w)$ be the solution to the Hamilton equation (26) with
\[
\lambda_i = \begin{cases} w & (i = 1), \\ 0 & (2 \le i \le N), \end{cases} \qquad w \in \mathbb{R}, \tag{29}
\]
under the boundary condition $x(0) = x_0$, $p(1) = \lambda$. Then the asymptotic expansion
of $x^1(1;w)$ is given as follows:
\[
x^1(1;w) \underset{3}{\sim} x_0^1 + b_1 w + b_2 w^2 + b_3 w^3, \tag{30}
\]
\[
x^i(t;w) = x_0^i + \sum_{j=1}^N \int_0^t g^{ij}(s, x(s;w))\,p_j(s;w)\,ds, \tag{31}
\]
\[
p_i(t;w) = p_i(1;w) + \frac12 \sum_{j,r=1}^N \int_t^1 \frac{\partial g^{jr}}{\partial x^i}(s, x(s;w))\,p_j(s;w)\,p_r(s;w)\,ds. \tag{32}
\]
Since the integral term in (32) is of the second order in w and from the boundary
condition (29), we have the first order expansion of p:
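\[
p_i(t;w) \underset{1}{\sim} p_i(1;w) = \begin{cases} w & (i = 1), \\ 0 & (2 \le i \le N). \end{cases} \tag{34}
\]
\[
p_i(t;w) \underset{2}{\sim} p_i(1;w) + \frac12 \sum_{j,r=1}^N \int_t^1 \frac{\partial g^{jr}}{\partial x^i}(s, x(s;w))\,ds\; p_j(1;w)\,p_r(1;w)
\underset{2}{\sim} p_i(1;w) + \Big( \frac12 \int_t^1 \frac{\partial g^{11}}{\partial x^i}(s, x_0)\,ds \Big) w^2.
\]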
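We substitute (35) for (31). Then we have the second order expansion of x:
\[
x^i(t;w) \underset{2}{\sim} x_0^i + \sum_{j=1}^N \int_0^t g^{ij}(s, x(s;w)) \Big( p_j(1;w) + \Big( \frac12 \int_s^1 \frac{\partial g^{11}}{\partial x^j}(r, x_0)\,dr \Big) w^2 \Big) ds
\]
\[
\underset{2}{\sim} x_0^i + \Big( \int_0^t g^{i1}(s, x_0)\,ds \Big) w
+ \sum_{j=1}^N \Big( \int_0^t \int_0^s \frac{\partial g^{i1}}{\partial x^j}(s, x_0)\,g^{j1}(u, x_0)\,du\,ds
+ \frac12 \int_0^t g^{ij}(s, x_0) \int_s^1 \frac{\partial g^{11}}{\partial x^j}(u, x_0)\,du\,ds \Big) w^2.
\]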
From the second order expansion of p and the first order expansion of x, we have
third order expansion of p:
1 1 ∂g 11
pi (t; w) ∼ pi (1; w) + (s, x 0 )ds w2
3 2 t ∂ xi
N 1 ∂g 11
1 1 ∂g j1
+ (s, x 0 ) (u, x 0 )du ds
2 ∂ xi s ∂x
j
j=1 t
1 2 11 s
∂ g
+ (s, x 0 ) g j1
(u, x 0 )du ds w3 .
t ∂x ∂x
i j
0
1 t
N
1 ∂g k1 1 ∂g 11
+ g ij (s, x0 ) (u, x 0 ) (r, x0 )dr du ds
s ∂x u ∂x
2 0 j k
j,k=1
1 ∂ 2 g 11 u
1 t ij
+ g (s, x0 ) (u, x 0 ) g k1
(r, x 0 )dr du ds
s ∂x ∂x
2 0 j k
0
1 ∂g 11 s
1 t ∂g ij
+ (s, x 0 ) (u, x 0 )du g k1
(r, x 0 )dr ds
2 0 ∂xk s ∂x
j
0
s 1 ∂g 11
1 t ∂g i1
+ (s, x 0 ) g jk
(u, x 0 ) (r, x0 )dr du ds
2 0 ∂x j
0 u ∂x
k
t i1 s u
∂g ∂g j1
+ (s, x 0 ) (u, x 0 ) g k1
(r, x 0 )dr du ds
0 ∂x 0 ∂x
j k
0
s s
1 t ∂ 2 g i1
+ (s, x 0 ) g j1
(u, x 0 )du g k1
(r, x 0 )dr ds w3 .
2 0 ∂x j ∂xk 0 0
\[
x^1(1;w) \underset{3}{\sim} x_0^1 + b_1 w + b_2 w^2 + b_3 w^3.
\]
3 Proof of Theorem 1
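Let $\tilde X_\varepsilon$ be defined by $\tilde X_\varepsilon(t) = \phi(-t, X_\varepsilon(t))$. Then $\tilde X_\varepsilon$ satisfies the following stochastic
differential equation:
\[
d\tilde X_\varepsilon^i(t) = \varepsilon \sum_{k=1}^d \tilde V_k^i(t, \tilde X_\varepsilon(t))\,dW^k(t) + \varepsilon^2\, \tilde V_0^i(t, \tilde X_\varepsilon(t))\,dt, \quad 1 \le i \le N,\ t \in [0,1],
\qquad \tilde X_\varepsilon(0) = x_0, \tag{36}
\]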
where Ṽ is defined as (7) and (9). The solution to the associated ordinary differential
equation ỹ satisfies (37) in the next lemma.
Lemma 1 Let y(t; h) : [0, 1]× H → R, be the solution defined by (4). Let us define
\[
\frac{d}{dt} \tilde y^i(t;h) = \sum_{k=1}^d \tilde V_k^i(t, \tilde y(t;h))\,\dot h^k(t), \quad 1 \le i \le N,\ t \in [0,1]. \tag{37}
\]
d
j
−V0i (t, φ(−t, φ(t, y))) + ∇ j φ i (−t, φ(t, y))V0 (t, φ(t, y)) = 0.
j=1
Proof (Theorem 1(1)) Since V01 ≡ 0, we have ỹ 1 (t; h) = y 1 (t; h), and the energy
function can be defined as follows.
1
d 1
2
e(x) = inf ḣ k0 (t) dt : ỹ 1 (1; h) = x .
2
k=1 0
Therefore it is enough to prove the theorem for the driftless case, i.e. V0 ≡ 0.
Let h 0 be defined by
where
\[
c_1 = \frac{1}{b_1}, \qquad c_2 = -\frac{b_2}{b_1^3}, \qquad c_3 = -\frac{b_3}{b_1^4} + 2\,\frac{b_2^2}{b_1^5}. \tag{41}
\]
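The coefficients in (41) are those of the formal inverse of the cubic x − x_0^1 = b_1 w + b_2 w^2 + b_3 w^3, that is, w = c_1 y + c_2 y^2 + c_3 y^3 + O(y^4) with y = x − x_0^1. The short Python sketch below checks this with exact rational arithmetic by composing the two truncated series; the helper names are ours and serve only as an illustration.

```python
from fractions import Fraction


def inverse_series_coeffs(b1, b2, b3):
    """Coefficients c1, c2, c3 of (41) for the inverse of b1*w + b2*w^2 + b3*w^3."""
    c1 = 1 / b1
    c2 = -b2 / b1 ** 3
    c3 = -b3 / b1 ** 4 + 2 * b2 ** 2 / b1 ** 5
    return c1, c2, c3


def poly_mul(p, q, order):
    """Product of two polynomials (ascending coefficients), truncated at y^order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out


def compose(b, c, order=3):
    """Coefficients of y^1..y^order in b(c(y)), where b and c have no constant term."""
    w = ([Fraction(0)] + list(c))[: order + 1]
    result = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order
    for bk in b:
        power = poly_mul(power, w, order)  # w(y)^k, truncated
        result = [r + bk * t for r, t in zip(result, power)]
    return result[1:]
```

Composing the cubic with its truncated inverse should return the identity up to third order, i.e. the coefficient list [1, 0, 0].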
N
t
h k0 (t, x) = pi (u; w)Ṽki (u, x(u; w))dt
i=1 0
t t
∼ Ṽk1 (u; x0 )du w∼ Ṽk1 (u; x0 )du c1 (x − x01 ).
1 0 1 0
In this section, we will use the same notations as in [12, 13]. Let $(\Theta, \|\cdot\|_\Theta)$ be a
separable Banach space and $(H, \|\cdot\|_H)$ be a separable Hilbert space such that H is
a dense subspace of $\Theta$ and the inclusion map is continuous. Let $\mu_s$, $s \in [0,\infty)$, be
the (necessarily unique) probability measure on $(\Theta, \mathcal{B}_\Theta)$ with the property that
\[
\int_\Theta \exp\big[\sqrt{-1}\,\langle u, \theta \rangle\big]\,\mu_s(d\theta) = \exp\Big(-\frac{s}{2}\,\|u\|_H^2\Big), \qquad u \in \Theta^*.
\]
\[
dX_s^i(t,\theta) = \sum_{k=1}^d V_k^i(t, X_s(t,\theta))\,d\theta^k(t) + s\,V_0^i(t, X_s(t,\theta))\,dt, \quad 1 \le i \le N,\ t \in [0,1],
\qquad X_s(0) = x_0. \tag{44}
\]
For each $(s,y) \in (0,1] \times [-r_0, r_0]$, the density function $p_s(y)$ satisfies
\[
\Big| (2\pi s)^{1/2} \exp\Big(\frac{e(y)}{s}\Big) p_s(y) - a_0(y) \Big| \le K_0\, s^{1/2}, \qquad (s,y) \in (0,1] \times [-r_0, r_0].
\]
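\[
\mathcal{A} f(s,\theta) = \Big[ \frac{\partial f}{\partial s} + \frac12 \operatorname{trace}_H D^2 f \Big](s,\theta),
\]
and
\[
B(y) \equiv \frac{\partial e(y)}{\partial y}\, D^2 F^1(0, h_0(y), y). \tag{47}
\]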
In this section, we calculate each terms in right hand side of (46) explicitly. First we
calculate the heat operator.
Lemma 2 There are constants C > 0 and r > 0 such that
1 t
(y − x01 )
N
A F 1 (0, h 0 (y), y) − V0i (u, x0 )∇i g 11 (t, x0 )dudt
2b1 0 0
i=1
N d 1
t
+ Vk1 (t, x0 )∇i,2 j Vk1 (t, x0 ) g ij (u, x0 )du dt
i, j=1 k=1 0 0
d
1 1
A F (s, θ, y) =
i
A [Vki (u, X s (u, θ ))]dθsk (u) + V0i (u, X s (u, θ ))du
k=1 0 0
1
+s A [V0i (u, X s (u, θ ))]du, 1 ≤ i ≤ N .
0
Therefore we have
A F 1 (0, h 0 (y), y)
N d 1
j
= ∇ j Vk1 (u, X 0 (u, h 0 (u; y)))A X 0 (u, h 0 (u; y))ḣ k0 (u; y)du
j=1 k=1 0
d
1 1 2 1
N
j
+ ∇i, j Vk (u, X 0 (u, h 0 (u; y)))D X 0i (u), D X 0 (u)ḣ k0 (u; y)du.
2 0
i, j=1 k=1
d
N 1
j
A F 1 (0, h 0 (y), y) − (y − x01 ) ∇ j Vk1 (u, X 0 (u; 0))A X 0 (u; 0)α̇ k (u)du
j=1 k=1 0
d
1 1 2 1
N
j
+ ∇i, j Vk (u, X 0 (u; 0))D X 0i (u; 0), D X 0 (u; 0)α̇ k (u)du
2 0
i, j=1 k=1
= O(|y − x01 |2 ),
j t j
where A X 0 (t; 0) = 0 V0 (u, X 0 (u; 0))du.
N d 1
t
D 2 F 1 (0, 0, x0 ) 2
HS =2 gl1 l2 (u, x0 )∇l1 Vm1 (t, x0 )∇l2 Vm1 (t, x0 ) du dt.
l1 ,l2 =1 m=1 0 0
d
N t
D X 0i (t; h)[k] = ∇l Vmi (u, X 0 (u; h))D X 0l (u; h)[k]ḣ m (u)du
l=1 m=1 0
d
t
+ Vmi (u, X 0 (u; h))k̇ m (u)du.
m=1 0
N
d 1 1
= (∇l Vm1 1 (t, x0 )Vml 2 (u, x0 )1t>u
l=1 m 1 ,m 2 =1 0 0
D 2 F 1 (0, 0, x0 ) 2
HS
N
d 1 1
= ((∇l Vm1 1 (t, x0 )Vml 2 (u, x0 )1t>u
l=1 m 1 ,m 2 =1 0 0
N
d 1 t
=2 ∇l1 Vm1 1 (t, x0 )Vml12 (u, x0 )∇l2 Vm1 1 (t, x0 )Vml22 (u, x0 ) du dt
l1 ,l2 =1 m 1 ,m 2 =1 0 0
N d
1 t
=2 gl1 l2 (u, x0 )∇l1 Vm1 (t, x0 )∇l2 Vm1 (t, x0 ) du dt.
l1 ,l2 =1 m=1 0 0
Finally we will complete the proof of Theorem 1.
Proof (Theorem 1 (2)) Using (46), we have
1 ∂e(y) 1 ∂ 2 e(y)
log a0 (y) = − log det 2 (I H − B(y)) + A F 1 (0, h(y), y)+ log .
2 ∂y 2 ∂ y2
In the right hand side, the asymptotic expansion of second term is given by Lemma 2,
so we will give the asymptotic expansion of the first term.
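Since B is defined by (47) and $\frac{\partial e(y)}{\partial y} \underset{1}{\sim} c_1 (y - x_0^1)$, we have
\[
{\det}_2(I - B(y)) = \exp\Big( -\sum_{n=2}^{\infty} \frac{1}{n} \operatorname{trace}_H\big(B(y)^n\big) \Big).
\]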
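The series representation of the regularized determinant can be checked in finite dimensions, where det_2(I − B) = det(I − B) exp(trace B). The Python sketch below does this for a 2 × 2 matrix with spectral radius below one; it is a finite-dimensional illustration only (the text works with Hilbert-Schmidt operators on H), and the function names are our own.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]


def det2_closed_form(b):
    """det_2(I - B) = det(I - B) * exp(trace B) for a 2x2 matrix B."""
    det = (1.0 - b[0][0]) * (1.0 - b[1][1]) - b[0][1] * b[1][0]
    return det * math.exp(trace(b))


def det2_series(b, terms=80):
    """exp(-sum_{n=2}^{terms} trace(B^n)/n); needs spectral radius of B below 1."""
    power = mat_mul(b, b)  # B^2
    total = 0.0
    for n in range(2, terms + 1):
        total += trace(power) / n
        power = mat_mul(power, b)
    return math.exp(-total)
```

Because the sum starts at n = 2, the leading behaviour for small B is governed by trace(B^2), which is the term used in the estimate that follows in the text.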
Therefore we have
\[
\log {\det}_2(I_H - B(y)) + \frac{c_1^2 (y - x_0^1)^2}{2}\, \big\| D^2 F^1(0, 0, x_0) \big\|_{HS}^2 = O(|y - x_0^1|^3). \tag{48}
\]
(y − x01 )2
log a0 (y) ∼ D 2 F 1 (0, 0, x0 ) 2
HS
2 4b12
(y − x01 ) 1 ∂ 2 e(y)
+ A F 1 (0, h 0 (y), y) + log
b1 2 ∂ y2
d
1 (y − x01 )2 1 t l1 l2
N
= g (u, x0 )∇l1 Vm1 (t, x0 )∇l2 Vm1 (t, x0 )dudt
2 b12 l ,l =1 m=1 0 0
1 2
N
(y − x01 )2 1 t j
+ V0 (u, x0 )∇ j g 11 (t, x0 )dudt
2b12 j=1 0 0
d
1 (y − x01 )2 1 t 1
N
+ Vm (t, 0)gl1 l2 (u, x0 )∇l1 ,l2 Vm1 (t, x0 )dudt
2 b12 0 0
l1 ,l2 =1 m=1
1 ∂ 2 e(y)
+ log .
2 ∂ y2
εz εdz
pε (y)dy = pε (x01 + √ ) √
c1 c1
εz 1 c1 z 2 εc2 z 3
∼ (a0 (x01 + √ ) + ε2 a2 (x01 )) √ exp − √ + √
2 c1 2π 2 c1 3 c1
ε c3 z 4
2
+ √ dz
4 c1
c2 c2 c22 c3
= 1 − 3/2 ε(z 3 − 3z) + ε2 2
3
z 6
− 3
+ 2 z4
3c1 18c1 3c1 4c1
c 2
3c3 Lc1 2
+ − 23 + 2 + z + ε2 a2 (x01 ) φ(z)dz
2c1 2c1 2
c2 c22 c22 c3
= 1−ε 3/2
H3 (z) + ε2 H6 (z) + ε2 − H4 (z)
3c1 18c13 2c13 4c12
c1 L c22 c22 c3
+ ε2 H2 (z) + ε2 (a2 (x01 ) − H (0) −
3 6
− H4 (0)
2 18c1 2c13 4c12
c1 L c1 z2
− H2 (0)) · exp(− )dz,
2 2π ε 2 2
\[
H_2(x) = x^2 - 1, \quad H_3(x) = x^3 - 3x, \quad H_4(x) = x^4 - 6x^2 + 3, \quad H_6(x) = x^6 - 15x^4 + 45x^2 - 15.
\]
then we have
\[
a_2(x_0^1) = \frac{c_2^2}{18 c_1^3}\, H_6(0) - \Big( \frac{c_2^2}{2 c_1^3} - \frac{c_3}{4 c_1^2} \Big) H_4(0) - \frac{c_1 L}{2}\, H_2(0).
\]
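The polynomials H_2, H_3, H_4 and H_6 listed above are the probabilists' Hermite polynomials, which satisfy the three-term recurrence He_{n+1}(x) = x He_n(x) − n He_{n−1}(x). The following Python sketch generates their coefficients and the values at zero that enter the formula for a_2(x_0^1); the function names are illustrative.

```python
def hermite_coeffs(n):
    """Coefficients (ascending powers) of the probabilists' Hermite polynomial He_n."""
    prev, cur = [1], [0, 1]  # He_0 and He_1
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + cur                  # x * He_k
        for i, a in enumerate(prev):
            nxt[i] -= k * a              # subtract k * He_{k-1}
        prev, cur = cur, nxt
    return cur


def hermite_at_zero(n):
    """Value He_n(0), i.e. the constant coefficient."""
    return hermite_coeffs(n)[0]
```

In particular the recurrence gives H_2(0) = −1, H_4(0) = 3 and H_6(0) = −15.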
4 Proof of Theorem 2
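\[
a_\varepsilon(y) = (2\pi\varepsilon^2)^{1/2} \exp\Big( \frac{e(y)}{\varepsilon^2} \Big)\, p_\varepsilon(y), \qquad y \in \mathbb{R}.
\]
We assume that there are constants $N \in \mathbb{N}$, $C_0 > 0$ and $K_0 > 0$ such that
\[
\Big| a_\varepsilon(y) - \sum_{k=0}^N a_{2k}(y)\,\varepsilon^{2k} \Big| \le C_0\, \varepsilon^{2N+2}, \qquad y \in [x_0^1, K_0],
\]
and assume that the energy function e satisfies $e'(x) > 0$, $x \in (x_0^1, K_0]$. We define
$g : \mathbb{R} \to \mathbb{R}$ by
\[
e(g(x)) = \frac{x^2}{2}.
\]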
Since e is strictly increasing, g is well defined. Then there are constants K 1 < K 0
and C1 , such that the value of the call option satisfies following:
√ e(K ) g −1 (K )
2π exp( )C ε (T, K ) − εϕ1 a0 (K )q(K )2 R N (ε, K ) ≤ C1 ε N +1 ,
ε2 ε
ε ∈ (0, 1], K ∈ [x01 , K 1 ].
where
cn,m (g −1 (K )) ϕm+1 (g −1 (K )/ε) 2n+m
R N (ε, K ) = ε . (49)
n,m≥0,n+m≥1
c0,0 (g −1 (K )) ϕ1 (g −1 (K )/ε)
2n+m+1≤N
m
1 d k+1 d m−k
cn,m (x) = g(x) · An (x), (50)
(k + 1)!(m − k)! d x dx
k=0
where
\[
A_k(x) = a_{2k}(g(x))\, g'(x), \qquad k \in \mathbb{N},\ x \in [x_0^1, K_1]. \tag{51}
\]
Proof Since
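\[
1 = \int_{-\infty}^{\infty} p_\varepsilon(y)\,dy = \frac{1}{(2\pi\varepsilon^2)^{1/2}} \int_{-\infty}^{\infty} a_\varepsilon(y) \exp\Big( -\frac{e(y)}{\varepsilon^2} \Big)\,dy,
\]
we have
\[
1 = \frac{1}{(2\pi)^{1/2}} \int_{-\infty}^{\infty} a_\varepsilon(g(\varepsilon y)) \exp\Big( -\frac{y^2}{2} \Big)\, g'(\varepsilon y)\,dy.
\]
Since the right hand side is bounded, taking the limit $\varepsilon \downarrow 0$, we have
\[
a_0(g(0))\, g'(0) = 1.
\]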
Proof (Proof of Theorem 4) We can divide the value of a call option into two parts:
where
K0 K0 1 1 e(y)
C̃ε (T, K ) = (y − K ) pε (y)dy = (y − K ) 2 exp(− )aε (y)dy,
K K 2π ε2 ε2
and
Rε (K 0 ) = E[X ε1 (T ) − K : X ε1 (T ) > K 0 ].
Since $e(g(x)) = \frac{x^2}{2}$, we have
g −1 (K 0 ) 1 1 x2
C̃ε (T, K ) = (g(x) − K ) 2 exp(− )aε (g(x))g (x)dx.
g −1 (K ) 2π ε2 2ε2
g −1 (K )2
exp C̃ε (T, K )
2ε2
K̃ ε
1 z2 zg −1 (K )
= g(εz + g −1 (K )) − K √ exp − − Aε (εz + g −1 (K ))dz.
0 2π 2 ε
We define
n
Ãε,n (x) = āε,n (g(x))g (x) = Ak (x)ε2k .
k=0
We also define
C̃ε,n (T, K )
g −1 (K )2 K̃ ε 1 z2 zg −1 (K )
= exp − g(εz + g −1 (K )) − K √ exp − −
2ε2 0 2π 2 ε
−1
× Ãε,n (εz + g (K ))dz.
g −1 (K )2
exp C̃ε (T, K ) − C̃ε,n (T, K ) ≤ C1 ε2n+2 .
2ε2
Since
we have
e(K ) 1 g −1 (K )
exp( )C̃ε,n (T, K ) − cn,m (g −1 (K ))ε2n+m+1 √ ϕm+1
ε2 n,m≥0 2π ε
2n+m+1≤N
≤ Rε N +1 , K ∈ [x01 , K 1 ].
Rε (K 0 ) ≤ E[X ε1 (T ); X ε1 (T ) > K 0 ]
≤ E[X ε1 (T )1/δ ]δ P(X ε1 (T ) > K 0 )1−δ .
Therefore we have
d −1 −1
q(K ) = g (g −1 (K )) = g (K ) .
dK
Then we have our assertion.
c0,1 (g −1 (K )) ϕ2 (g −1 (K )/ε) −1 −1
2 c0,2 (g (K )) ϕ3 (g (K )/ε)
R2 (ε, K ) = ε + ε
c0,0 (g −1 (K )) ϕ1 (g −1 (K )/ε) c0,0 (g −1 (K )) ϕ1 (g −1 (K )/ε)
c1,0 (g −1 (K )/ε)
+ ε2 .
c0,0 (g −1 (K )/ε)
d2
g(g −1 (K )) = q(K )q (K ),
dK2
d3
g(g −1 (K )) = q(K )q (K )2 + q(K )2 q (K ).
dK3
Using the definition of cn,m given in (50), we can calculate c0,0 , c0,1 , c1,0 , c0,2 explic-
itly as follows:
c0,0 (g −1 (K )) = a0 (K )q(K )2 ,
c1,0 (g −1 (K )) = a2 (K )q(K )2 ,
3
c0,1 (g −1 (K )) = a0 (K )q(K )3 + a0 (K )q(K )2 q (K ),
2
−1 1 7
c0,2 (g (K )) = a0 (K )q(K ) + 2a0 (K )q(K )3 q (K ) + a0 (K )q(K )2 q (K )2
4
2 6
2
+ a0 (K )q(K ) q (K ).
3
3
Then we have our theorem.
5 Proof of Theorem 3
\[
\theta_1(x) = \varphi_1(x), \qquad \theta_{n+1}(x) = -n\,\theta_n(x) + \theta_n'(x)\,\theta_1(x)\,x. \tag{52}
\]
where f is defined by (21). The properties of h are given in Lemma 9. Then we have
the following.
Proposition 3 The implied normal volatilities of call options are given as follows.
1+l(ε,K )
ε(K − x01 ) 1 g −1 (K )
σ Nε (T, K ) = √ exp − ϕ1 (h(t, ))dt , K > x01 .
g −1 (K ) T 1 t ε
Here
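\[
l(\varepsilon, K) = (1 + R(\varepsilon, K))(1 + r(K)) - 1,
\]
where
\[
R(\varepsilon, K) = \frac{\sqrt{2\pi}\, \exp\big( e(K)/\varepsilon^2 \big)\, C_\varepsilon(T,K)}{\varepsilon\, c_{0,0}(g^{-1}(K))\, \varphi_1(g^{-1}(K)/\varepsilon)} - 1,
\]
and
\[
r(K) = \frac{g^{-1}(K)\, c_{0,0}(g^{-1}(K))}{K - x_0^1} - 1.
\]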
and
lim r (K ) = 0.
K ↓x01
Proof From Theorem 4 and Lemma 7, we have (54). Using l’Hospital’s rule, we
have
g −1 (K )c0,0 (g −1 (K ))
lim = g (x01 )a0 (x01 ) = 1.
K ↓0 K − x01
On the other hand, the value of call option under the normal model is given by
K − x01
V = (K − x01 ) f √ .
σ T
Therefore we have
K − x01 g −1 (K )
f √ = (1 + r (K ))(1 + R(ε, K )) f .
σ T ε
Using the definition of h given by (53) and Lemma 9, we have our assertion.
Next we will give the asymptotic expansion of implied volatilities.
Theorem 5 For any N ∈ N, there is a constant C > 0 such that the asymptotic
expansion of implied volatilities satisfies the following:
Here
l N (ε, K ) = (1 + R N (ε, K ))(1 + r (K )) − 1, (55)
where
g −1 (K )c0,0 (g −1 (K ))
r (K ) = − 1. (56)
K − x01
Therefore
1+l(ε,K ) N 1+l N (ε,K )
1 g −1 (K ) θn (y)
θ1 (h(t, ))dt − (t − 1)n dt
1 t ε 1 n!
n=0
1+l(ε,K ) −1 1+l N (ε,K )
1 g (K ) 1 g −1 (K )
≤ θ1 (h(t, ))dt − θ1 (h(t, ))dt
1 t ε 1 t ε
1+l N (ε,K ) N 1+l N (ε,K )
1 g −1 (K ) θn (y)
+ θ1 (h(t, ))dt − (t − 1)n dt
1 t ε 1 n!
n=0
≤ C1 |l(ε, K ) − l N (ε, K )| + C2 |l N (ε, K )| ≤ C(ε + |K − x01 |) N .
N
1 q (x01 ) 2 c2 q (x01 ) 11 c2 2 3 c3
q(x01 ) = √ , =− , = − ,
c1 q(x01 ) 3 c1 q(x01 ) 9 c1 2 c1
a0 (x01 ) c2 a0 (x01 ) c 2 3c
2 3
= , = c12 L − + ,
a0 (x0 )
1 c1 a0 (x0 )
1 c1 c1
a2 (x01 ) 1 c12 L 2 c2 2 3 c3
= − + − ,
a0 (x01 ) c1 2 3 c1 4 c1
Proof Since
1 2
e(g(x)) = x ,
2
x = e (g(x))g (x),
1 = e (g(x))g (x)2 + e (g(x))g (x),
0 = e (g(x))g (x)3 + 3e (g(x))g (x)g (x) + e (g(x))g (x),
0 = e(4) (g(x))g (x)4 + 6e (g(x))g (x)2 g (x) + 3e (g(x))g (x)2
+ 4e (g(x))g (x)g (x) + e (g(x))g (4) (x).
Furthermore, since
we have
√
2 b2 b1
g (0) = b1 , g (0) = , g (0) = (9b1 b3 − 8b22 ).
3 b1 6
Lemma 6
R2 (ε, K ) − R20 (ε, K ) ≤ C(ε + |K − x01 |)3 ,
r (K ) − r 0 (K ) ≤ C|K − x01 |3 ,
where
and
1 c2 2 1 c3 c2 L
r 0 (K ) = − + + 1 (K − x01 )2 .
3 c1 4 c1 2
Proof We will calculate each terms of R2 given by (20). From Lemma 7, the functions
ϕ2 /ϕ1 and ϕ3 /ϕ1 are bounded above. Since the first term is O(ε) and other terms are
O(ε2 ), it is enough to calculate the first order of K in the first term and 0th order in
the other terms. Using Lemma 5, we have
c0,1 (x01 )
= 0,
c0,0 (x01 )
d c0,1 (g −1 (K )) a0 (K ) a0 (K ) 2 3 q (K ) a0 (K ) q (K )
= q(K ) + + + .
d K c0,0 (g −1 (K )) a0 (K ) a0 (K ) 2 q(K ) a0 (K ) q(K )
c0,1 (g −1 (K )) (K − x01 ) 2 5 c2 2 3 c3
∼ √ c L − + ,
c0,0 (g −1 (K )) 1 c1 1
6 c1 4 c1
c0,2 (g −1 (K )) 1 c12 L 1 c2 2 1 c3
∼ − + ,
c0,0 (g −1 (K )) 0 c1 2 2 c1 2 c1
c1,0 (g −1 (K )) 1 c2 L 2 c2 2 3 c3
−1
∼ − 1 + − .
c0,0 (g (K )) 0 c1 2 3 c1 4 c1
2
l2 (ε, K )n+1 g −1 (K ) g −1 (K )
θn+1 ( ) ∼ (R20 (ε, K ) + r 0 (K ))ϕ1 ( ).
(n + 1)! ε 2 ε
n=0
6 Examples
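where
\[
\Lambda = \int_0^T \lambda^2(t)\,dt.
\]
\[
b_1 = \sigma(x_0)^2 \Lambda, \qquad b_2 = \frac32\, \sigma(x_0)^3 \sigma'(x_0) \Lambda^2, \qquad
b_3 = \Big( \frac83\, \sigma(x_0)^4 \sigma'(x_0)^2 + \frac23\, \sigma(x_0)^5 \sigma''(x_0) \Big) \Lambda^3,
\]
\[
L = \Big( \frac12\, \sigma(x_0)^2 \sigma'(x_0)^2 + \frac12\, \sigma(x_0)^3 \sigma''(x_0) \Big) \Lambda^2, \qquad
g^{-1}(y) = \frac{1}{\sqrt{\Lambda}} \int_{x_0}^{y} \frac{dx}{\sigma(x)}.
\]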
Then using Theorems 1 and 3 we can calculate the density function and implied
normal volatilities. We illustrate some cases.
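The coefficient formulas for b_1, b_2, b_3 and L depend on the local volatility function only through σ(x_0), σ'(x_0) and σ''(x_0). The Python sketch below evaluates them and specializes to two of the examples that follow; it reflects our reading of the formulas stated before this paragraph, and the function and argument names are illustrative assumptions.

```python
def expansion_coefficients(s, ds, d2s, lam):
    """b1, b2, b3, L from sigma(x0)=s, sigma'(x0)=ds, sigma''(x0)=d2s and Lambda=lam."""
    b1 = s ** 2 * lam
    b2 = 1.5 * s ** 3 * ds * lam ** 2
    b3 = (8.0 / 3.0 * s ** 4 * ds ** 2 + 2.0 / 3.0 * s ** 5 * d2s) * lam ** 3
    big_l = (0.5 * s ** 2 * ds ** 2 + 0.5 * s ** 3 * d2s) * lam ** 2
    return b1, b2, b3, big_l


def displaced_diffusion_coefficients(x0, q, vol, maturity):
    """sigma(x) = q*x + (1-q)*x0 with lambda(t) = vol, so Lambda = vol^2 * T."""
    lam = vol ** 2 * maturity
    return expansion_coefficients(x0, q, 0.0, lam)


def cev_coefficients(x0, beta, alpha, maturity):
    """sigma(x) = x^beta with lambda(t) = alpha, so Lambda = alpha^2 * T."""
    lam = alpha ** 2 * maturity
    s = x0 ** beta
    ds = beta * x0 ** (beta - 1.0)
    d2s = beta * (beta - 1.0) * x0 ** (beta - 2.0)
    return expansion_coefficients(s, ds, d2s, lam)
```

For the displaced diffusion case the output reduces to b_1 = x_0^2 Λ, b_2 = (3/2) x_0^3 q Λ^2, b_3 = (8/3) x_0^4 q^2 Λ^3 and L = (1/2) x_0^2 q^2 Λ^2, which matches the closed forms quoted in that example.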
Example 1 (CEV model) This is the case λ(t) ≡ α and
σ (x) = x β .
2β 3 4β−1 2 2 6β−2 3
Λ = α 2 T, b1 = x0 Λ, b2 = βx0 Λ , b3 = (β 2 − β + 4)x0 Λ ,
2 3
β 4β−2 2 β(1 + β)
L = (β 2 − )x0 Λ , e (y) = ,
2 2α 2 T y β+2
⎧ 1−β
⎪
⎨ √1 y −x0
1−β
(β = 1)
Λ
g −1 (y) =
1−β
⎪
⎩ √1 log( y ) (β = 1).
Λ x0
σ (x) = q x + (1 − q)x0 .
Fig. 1 Implied volatility smile of displaced diffusion, asymptotic expansion versus analytic solution
with x0 = 1.0, q = 0.5, σ = 0.15, T = 10
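\[
\Lambda = \sigma^2 T, \qquad b_1 = x_0^2 \Lambda, \qquad b_2 = \frac32\, x_0^3 q \Lambda^2, \qquad b_3 = \frac83\, x_0^4 q^2 \Lambda^3, \qquad L = \frac12\, x_0^2 q^2 \Lambda^2,
\]
\[
g^{-1}(y) = \frac{1}{\sqrt{\Lambda}} \int_{x_0}^{y} \frac{dx}{q x + (1-q) x_0} = \frac{1}{q \sqrt{\Lambda}} \log \frac{q y + (1-q) x_0}{x_0},
\]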
√
1 + g −1 (y)q Λ
e (y) = .
Λ(qy + (1 − q)x0 )2
This model was investigated in Hagan and Woodward [6, 14]. The energy function
was given in Hagan et al. [8] as follows.
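\[
e(y) = \frac{1}{2\nu^2 T} \Big( \log \frac{\sqrt{1 - 2\rho\zeta + \zeta^2} - \rho + \zeta}{1 - \rho} \Big)^2 = \frac{\hat x(\zeta(y))^2}{2\nu^2 T},
\]
where
\[
\zeta(y) = -\frac{\nu}{\alpha} \int_{x_0}^{y} \frac{dz}{\sigma(z)}.
\]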
In Theorem 3.1 [14], we also gave the energy function by solving Hamilton equations.
Then the parameters are given by (Fig. 2)
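\[
b_1 = \alpha^2 \sigma(x_0)^2 T, \qquad b_2 = \frac32\, \sigma(x_0)^3 \alpha^3 \big( \alpha \sigma'(x_0) + \nu\rho \big) T^2,
\]
\[
b_3 = \Big( \frac83\, \alpha^6 \sigma(x_0)^4 \sigma'(x_0)^2 + \frac23\, \alpha^6 \sigma(x_0)^5 \sigma''(x_0) + 6\nu\rho\, \sigma(x_0)^4 \sigma'(x_0)\, \alpha^5
+ 2\nu^2\rho^2 \sigma(x_0)^4 \alpha^4 + \frac23\, \alpha^4 \sigma(x_0)^4 \nu^2 \Big) T^3,
\]
\[
L = \frac{\alpha^2 \sigma(x_0)^2 T^2}{2} \Big( \alpha^2 \big( \sigma'(x_0)^2 + \sigma(x_0) \sigma''(x_0) \big) + 4\nu\rho\alpha\, \sigma'(x_0) + \nu^2 \Big),
\]
\[
g^{-1}(y) = \frac{1}{\nu\sqrt{T}} \log \frac{\sqrt{1 - 2\rho\zeta(y) + \zeta(y)^2} - \rho + \zeta(y)}{1 - \rho}.
\]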
Fig. 2 Implied volatility smile of SABR model, asymptotic expansion versus Monte Carlo simu-
lation with x0 = 1, α = 0.15, β = 0.5, ν = 0.2, ρ = −0.2, T = 10.
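The SABR coefficients can be cross-checked against the closed-form energy function in a special case. The Python sketch below takes σ ≡ 1, so that all terms with σ' and σ'' vanish, computes c_1, c_2, c_3 from (41), and compares the cubic c_1 z + c_2 z^2 + c_3 z^3, z = y − x_0, with the exact derivative e'(y). The check assumes that the multiplier w coincides with ∂e/∂y beyond first order; the text states only the first-order relation, so this is our working hypothesis, and all names are illustrative.

```python
import math


def sabr_coefficients_flat(alpha, nu, rho, maturity):
    """b1, b2, b3 for the SABR example with sigma(x) = 1 (sigma' = sigma'' = 0)."""
    b1 = alpha ** 2 * maturity
    b2 = 1.5 * alpha ** 3 * nu * rho * maturity ** 2
    b3 = (2.0 * nu ** 2 * rho ** 2 + 2.0 / 3.0 * nu ** 2) * alpha ** 4 * maturity ** 3
    return b1, b2, b3


def inverse_coefficients(b1, b2, b3):
    """c1, c2, c3 as in (41)."""
    return 1.0 / b1, -b2 / b1 ** 3, -b3 / b1 ** 4 + 2.0 * b2 ** 2 / b1 ** 5


def energy_derivative(y, x0, alpha, nu, rho, maturity):
    """Exact e'(y) for e(y) = xhat(zeta(y))^2 / (2 nu^2 T) with sigma = 1."""
    zeta = -(nu / alpha) * (y - x0)
    root = math.sqrt(1.0 - 2.0 * rho * zeta + zeta * zeta)
    xhat = math.log((root - rho + zeta) / (1.0 - rho))
    dxhat_dzeta = 1.0 / root
    dzeta_dy = -nu / alpha
    return xhat * dxhat_dzeta * dzeta_dy / (nu ** 2 * maturity)


def cubic_approximation(y, x0, alpha, nu, rho, maturity):
    """c1 z + c2 z^2 + c3 z^3 with z = y - x0."""
    c1, c2, c3 = inverse_coefficients(*sabr_coefficients_flat(alpha, nu, rho, maturity))
    z = y - x0
    return c1 * z + c2 * z ** 2 + c3 * z ** 3
```

With the parameters of Fig. 2 (α = 0.15, ν = 0.2, ρ = −0.2, T = 10) and a strike close to x_0, the two quantities should differ only at fourth order in z, while dropping the c_3 term leaves a visibly larger gap.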
Acknowledgments The author would like to thank Professor Shigeo Kusuoka for useful
discussions.
Appendix 1
Then we have
\[
\lim_{x \to \infty} x^{n+1} \varphi_n(x) = \int_0^{\infty} y^n e^{-y}\,dy = n!.
\]
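\[
\frac{\partial^n}{\partial t^n} \log h(t,y) = \frac{1}{t^n}\, \theta_n(h(t,y)), \qquad t \in [0,1],\ y > 0,
\]
where $\theta_n \in C_b([0,\infty])$, $n \ge 1$, are given inductively as follows:
\[
\theta_1(x) = \varphi_1(x), \qquad \theta_{n+1}(x) = -n\,\theta_n(x) + \theta_n'(x)\,\theta_1(x)\,x.
\]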
∂h f (h(t, y))
(t, y) = .
∂t t f (h(t, y))
Since
1 ϕ2 (x)
f (x) = −( +x+ ) f (x) < 0, x > 0,
x ϕ1 (x)
we have
f (x) ϕ2 (x) −1
θ1 (x) = = 1 + x2 + x = ϕ1 (x).
x f (x) ϕ1 (x)
It is easy to check that θ1 ∈ Cb ([0, ∞]) and xθ1 (x) ∈ Cb ([0, ∞]). We have
∂ 1
log h(t, y) = θ1 (h(t, y)).
∂t t
Since
∂ 1 1
θn (h(t, y)) = n+1 −nθn (h(t, y)) + θn (h(t, y))θ1 (h(t, y))h(t, y) ,
∂t t n t
it is easy to prove our lemma.
Appendix 2
In this section, we summarize the main theorem in Kusuoka and Osajima [13]. See
[13] for the definitions.
Let f, g ∈ G ∞ (A ; R) and F ∈ G ∞ (A ; R N ) be completely P-regular functions
and Y be a compact subset in R N . We assume the following.
(A1) There is an α > 0 such that
(1 + α) f (s, θ )
sup s log( exp( )μs (dθ )) < ∞.
s∈(0,1] Θ s
We define e : R N → [−∞, ∞] by
h 2
e(x) ≡ inf{ − f (0, h) : F(0, h) = x}, x ∈ RN .
2
We also assume the following.
(A2) For each y ∈ Y ,
M(y) ≡ {h ∈ H ; F(0, h) = y} = ∅
and that
h(y) 2
e(y) = − f (0, h(y))
2
for precisely one h(y) ∈ M(y).
We assume moreover the following.
(A3) T (y) ≡ D F(0, h(y)) has rank N for every y ∈ Y.
Let π(y) = T (y)∗ (T (y)T (y)∗ )−1 T (y), y ∈ Y. π(y) is an orthogonal projection
in H. Let π(y)⊥ = I H − π(y). Then π(y)⊥ is also an orthogonal projection in H
onto ker T (y). Let V (y) : H × H → R be a bilinear form given by
V (y)(h, h )
= D 2 f (0, h(y))(π(y)⊥ h, π(y)⊥ h )
+ (h(y) − D f (0, h(y)), T (y)∗ (T (y)T (y)∗ )−1 D 2 F(0, h(y))(π(y)⊥ h, π(y)⊥ h )) H .
V (y)(h, h) < h 2 .
Finally we define
admits a smooth density ps (·) with respect to Lebesgue’s measure. Moreover, there
exist sequence {an }∞ ∞
n=0 ⊆ C(Y ; R) and {K n }n=0 ⊆ (0, ∞) with the property that,
for every n ∈ N,
n
(2π s) N /2 ee(y)/s ps (y; 0) − s m/2 am (y) K n s (n+1)/2 , (s, y) ∈ (0, 1] × Y.
m=0
N
∂e
a0 (y) = (det ∇ 2 e(y))1/2 det 2 (I H − B(y))−1/2 exp (y)A F i (0, h(y))
∂ yi
i=1
+ A f (0, h(y))
for y ∈ Y, where
N
∂e
B(y) ≡ (y)D 2 F i (0, h(y)) + D 2 f (0, h(y)), y ∈ Y.
∂ yi
i=1
Appendix 3
In this section, we discuss about the implied volatilities for the case K < x01 . We
define the forward value of a put option of strike rate K and maturity T by
Since we have put-call parity, the implied volatility of the put option is the same as
the implied volatility of a call option with strike rate K and maturity T . Since
d
d X̄ εi (t) = ε V̄ki (t, X̃ ε (t))dWk (t) + V̄0i (t, X̃ ε (t))dt, 1 ≤ i ≤ N ,
k=1
where
j −Vk1 (t, x̄) (1 ≤ k ≤ d)
V̄k (t, x) = j
Vk (t, x̄) (1 ≤ k ≤ d, j = 1).
d j
Since the associated Riemaniann metric ḡ ij (t, x) = k=1 V̄ki (t, x)V̄k (t, x) is
given by
ḡ 11 (t, x) = g 11 (t, x), ḡ 1i (t, x) = −g 1i (t, x) (i = 1), ḡ ij (t, x) = g ij (t, x) (i, j = 1),
we have
b̄1 = b1 , b̄2 = −b2 , b̄3 = b3 , L̄ = L .
References
1. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
2. Bismut, J.M.: Large Deviations and the Malliavin Calculus. Birkhäuser, Boston (1984)
3. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1) (2014)
4. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350
(2014)
5. Dunford, N., Schwartz, J.T.: Linear Operators, Part II. Wiley, New York (1988)
6. Hagan, P.S., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6, 147–157
(1999)
7. Hagan, P.S., Kumar, D., Lesniewski, S., Woodward, D.E.: Managing smile risk. Wilmott Mag.
18(11), 84–108 (2002)
8. Hagan, P.S., Lesniewski, S., Woodward, D.E.: Probability distribution in the SABR model of
stochastic volatility. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.)
Large Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics
and Statistics, vol. 110 (2015)
9. Henry-Labordère, P.: A General Asymptotic Implied Volatility for Stochastic Volatility Models,
preprint, http://arxiv.org/abs/cond-mat/0504317 (2005)
10. Kunitomo, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11, 117–151 (2001)
11. Kusuoka, S., Stroock, D.W.: Applications of Malliavin Calculus, Part I. In: Ito, K. (ed.) Pro-
ceedings of the Taniguchi International Symposium on Stochastic Analysis, Kyoto and Katata,
1982, pp. 271–360. Kinokuniya, Tokyo (1984)
12. Kusuoka, S., Stroock, D.W.: Precise asymptotics of certain Wiener functionals. J. Funct. Anal.
99, 1–74 (1991)
13. Kusuoka, S., Osajima, Y.: A remark on the asymptotic expansion of density function of Wiener
functionals. J. Funct. Anal. 255, 2545–2562 (2007)
14. Osajima, Y.: The Asymptotic Expansion Formula of Implied Volatility for Dynamic SABR
model and FX hybrid model, BNP Paribas, Date posted: 26 Feb 2007 SSRN working paper
series
15. Shigekawa, I.: Stochastic analysis. Am. Math. Soc. (2004)
16. Siopacha, M., Teichmann, J.: Weak and strong Taylor methods for numerical solutions of
stochastic differential equations. Quant. Financ. 11(4), 517–528 (2011)
17. Watanabe, S.: Analysis of Wiener functionals (Malliavin calculus) and its application to heat
kernels. Ann. Probab. 15, 1–39 (1987)
18. Yoshida, N.: Asymptotic expansions of maximum likelihood estimators for small diffusions
via the theory of Malliavin-Watanabe. Probab. Theory Relat. Fields 92, 275–311 (1992)
Implied Volatility of Basket Options
at Extreme Strikes
We thank the anonymous reviewer for the careful reading of our manuscript and many constructive
comments.
A. Gulisashvili
Department of Mathematics, Ohio University, Athens, OH, USA
e-mail: [email protected]
P. Tankov (B)
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris Diderot, Paris, France
e-mail: [email protected]
P. Tankov
International Laboratory of Quantitative Finance,
National Research University “Higher School of Economics”, Moscow, Russia
1 Introduction
In option markets, prices of vanilla call and put options are commonly quoted in terms
of their implied volatility I (T, K ), defined as the value of the volatility parameter
which must be substituted into the Black-Scholes option pricing formula to obtain
the quoted option price. Similarly, given a risk-neutral model, one can define the
function (T, K ) → I (T, K ) from the prices of vanilla options computed for that
model. However, since in most stochastic asset price models the implied volatility
function is not known explicitly, it becomes important to obtain efficient and accurate
asymptotic approximations for it. Such approximations are useful for at least two
reasons. First, they may shed light on the qualitative behavior of the implied volatility
in the asset price model, and also on the effect of different model parameters on
the shape of the model-generated implied volatility surface. Second, they allow one to
perform an approximate calibration of the model by comparing the market implied
volatility with the asymptotic approximation. Such preliminary estimates can be
used as intelligent guesses in the construction of a numerical calibration algorithm
to accelerate its convergence.
Approximations to the implied volatility have been studied by many authors in
a variety of asymptotic regimes, both in specific models and in model-independent
settings. One of the early references on the subject is the book by Lewis [31] dealing
with stochastic volatility models. Various model-free formulas describing the wing
behavior of the implied volatility were obtained in the last decade. To our knowledge,
Lee's celebrated moment formulas were the first model-independent asymptotic formulas
for the implied volatility at extreme strikes (see [30]). Lee's results were
later refined by Benaim and Friz [8, 9] and Gulisashvili [22–24]. In Gao and Lee
[19], higher order asymptotic formulas for the implied volatility at extreme strikes
were found, and in Tehranchi [41], uniform estimates for the implied volatility are
obtained. Small-time behavior of implied volatility is analyzed, among other papers,
in [11] (in local volatility models), [17] (for the Heston stochastic volatility model),
[33] (for jump-diffusions), and in [2, 16, 34, 38] (for exponential Lévy models).
Formulae for the implied volatility far from maturity are given in [18] (for the Heston
model) and [40] (model-independent). Finally, sharp price and implied volatility
approximations for various models have been obtained as “expansions around the
Black-Scholes model” in [10, 21].
Implied volatility is also quoted in the market for options on a basket of stocks.
Note that the Black-Scholes formula can be applied to price a vanilla option by
considering the entire basket (index) as a log-normal random variable. In particular,
options on stock indices or major exchange traded funds are often liquid and quoted
in terms of their implied volatility. Several studies [5, 13, 29] explore the relationship
between the implied volatilities of index options and those of the constituents, with
the aim of designing dispersion trading strategies. Another example is provided
by swaptions, which are also quite liquid, often quoted in terms of their implied
volatility, and can be interpreted as basket options on the underlying Libor rates
[4, 37]. A tractable relationship between swaption and caplet implied volatilities
could be used to design a calibration procedure for the correlation structure of the
Libor rates.
In the above cases, finding reliable asymptotic approximations to the implied
volatility can be even more important, since calculating the exact value numerically
can be computationally very expensive due to the large dimension of the basket.
Approximations based on the small-noise asymptotics in multidimensional local
volatility models have been developed in [5] and more recently refined in [7], but in
other asymptotic regimes, much less is known about multi-asset options than in the
single-asset case.
Our main goal in the present paper is to characterize the asymptotic behavior of
the implied volatility of a call option on a basket of stocks (with positive weights)
for large and small strikes. Three different classes of multidimensional risk-neutral
models with increasing generality are considered in the paper. In Sect. 3, we discuss
the case of correlated log-normal assets, in other words, the assets which follow the
multidimensional Black-Scholes model. Using a recent characterization of the tail
behavior of sums of correlated log-normal random variables [27], we obtain a sharp
asymptotic formula with error estimates for the implied volatility at small strikes. On
the other hand, the asymptotics of the implied volatility at large strikes can be easily
characterized using the results obtained in [3]. It turns out that for very large strikes,
the implied volatility of a basket call option converges to the highest volatility among
the stocks in the basket.
Section 4 deals with the case where the assets follow the multidimensional Black-
Scholes model time-changed by an independent increasing stochastic process. It is
assumed in this section that the marginal density of the time-change process decays
at infinity like the function $s \mapsto s^{\alpha} e^{-\theta s}$ with $\alpha \in \mathbb{R}$ and $\theta > 0$. The class of such
models includes standard multidimensional extensions of various exponential Lévy
models, for instance, of the variance gamma model, the normal inverse Gaussian
model, or the generalized hyperbolic model. These extensions were previously discussed
in, e.g., [15, 32, 36]. To our knowledge, for such a class of multidimensional
models, the tail behavior of marginal distributions has not been studied before. In
Sect. 4, we provide two-sided estimates for the distribution function of the asset price
in the time-changed multidimensional Black-Scholes model, and use these estimates
to find the leading term in the asymptotic expansion of the implied volatility.
Finally, in Sect. 5, we deal with the case where the assets in the basket are correlated,
and the dependence structure is described by a given copula function (we refer
the reader to the book [12] for details on this modeling approach). Here we obtain
an asymptotic formula that can be considered as a generalization to the multidimensional
setting of one of the tail-wing formulae established in [9]. The new tail-wing
formula uses a special characteristic of the copula called weak lower tail dependence
function. This notion was recently introduced in [39].
• Let f and g be functions defined on R, and let a ∈ [−∞, ∞]. Throughout the
present paper, we write “ f ∼ g as x → a” provided that
\[
\lim_{x \to a} \frac{f(x)}{g(x)} = 1.
\]
f (x)
lim sup ≤ 1,
x→a g(x)
and write “ f (x) ≈ g(x) as x → a” if there exist c1 > 0 and c2 > 0 such that
f (λx)
lim = λα .
x→0 f (x)
for all α > 0. The class of all regularly varying functions with index α is denoted
by Rα . The elements of the class R0 are called slowly varying functions. Regularly
varying functions at zero can be defined similarly.
• The following set will be used in the paper:
\[
\Delta_d := \Big\{ w \in \mathbb{R}^d : w_i \ge 0,\ i = 1, \ldots, d, \text{ and } \sum_{i=1}^d w_i = 1 \Big\}.
\]
• Let $w \in \Delta_d$. We set
\[
E(w) := -\sum_{i=1}^d w_i \log w_i, \tag{1}
\]
respectively. Here T > 0 is the maturity, while K > 0 is the strike price.
The implied volatility (T, K ) → I (T, K ) is determined from the following
equality:
C(K , T ) = CBS (T, K , σ = I (T, K )),
where the symbol CBS stands for the Black-Scholes call pricing function. In the
sequel, the maturity T will be fixed, and the implied volatility will be considered as
a function of only the strike price.
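The defining equality for the implied volatility can be solved numerically. The Python sketch below is a minimal illustration under the assumptions of zero interest rate and a generic initial price S_0; it prices a call with the Black-Scholes formula and recovers the volatility by bisection. Function names, the bracket [1e-6, 5] and the iteration count are our own choices and are not taken from the text.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bs_call(s0, strike, maturity, sigma):
    """Black-Scholes call price with zero interest rate."""
    total_vol = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + 0.5 * total_vol ** 2) / total_vol
    d2 = d1 - total_vol
    return s0 * norm_cdf(d1) - strike * norm_cdf(d2)


def implied_vol(price, s0, strike, maturity, lo=1e-6, hi=5.0):
    """Solve bs_call(s0, strike, maturity, sigma) = price for sigma by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s0, strike, maturity, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is used here because the Black-Scholes price is strictly increasing in the volatility, so the root is unique whenever the quoted price lies strictly between the no-arbitrage bounds.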
We will next formulate two model-free asymptotic formulas, characterizing the
left-wing behavior of the implied volatility in terms of the put pricing function. These
formulas will be needed below. Suppose the initial condition for the price process is
X 0 = 1. Suppose also that the asset price model does not have atoms at zero. The
previous assumption means that P(X T = 0) = 0. Then the following asymptotic
formula (a zero order formula for the implied volatility) holds:
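$$I(K) = \frac{\sqrt{2}}{\sqrt{T}}\sqrt{\log\frac{1}{\widetilde{P}(K)} - \frac{1}{2}\log\log\frac{K}{\widetilde{P}(K)}} - \frac{\sqrt{2}}{\sqrt{T}}\sqrt{\log\frac{K}{\widetilde{P}(K)} - \frac{1}{2}\log\log\frac{K}{\widetilde{P}(K)}} + O\left(\left(\log\frac{K}{P(K)}\right)^{-\frac{1}{2}}\right) \tag{3}$$
as $K \to 0$, where $\widetilde{P}$ is a positive function such that $\widetilde{P}(K) \approx P(K)$ as $K \to 0$.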
The next asymptotic formula (a first-order formula for the implied volatility) can
be easily deduced from the results formulated in [24, Sects. 9.6 and 9.9]:
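$$I(K) = \frac{\sqrt{2}}{\sqrt{T}}\sqrt{\log\frac{1}{P(K)} - \frac{1}{2}\log\log\frac{K}{P(K)} + \log B(K)} - \frac{\sqrt{2}}{\sqrt{T}}\sqrt{\log\frac{K}{P(K)} - \frac{1}{2}\log\log\frac{K}{P(K)} + \log B(K)} + O\left(\left(\log\frac{K}{P(K)}\right)^{-\frac{3}{2}}\log\log\frac{K}{P(K)}\right) \tag{4}$$
as $K \to 0$, where
$$B(K) = \frac{\sqrt{\log\frac{1}{P(K)}} - \sqrt{\log\frac{K}{P(K)}}}{2\sqrt{\pi}\,\sqrt{\log\frac{1}{P(K)}}}. \tag{5}$$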
Formula (4) takes into account the results obtained in [19]. It provides more terms
in the asymptotic expansion of the implied volatility at small strikes than formula
(3) with $\widetilde{P} = P$. More information on model-free formulas for the implied volatility
can be found in [24].
Our goal in the present section is to characterize the asymptotic behavior of the
implied volatility at small strikes in the case of a basket option of European style in
the n-dimensional driftless Black-Scholes model. We assume that the interest rate is
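equal to zero. Let $S^1, \dots, S^n$ be a basket of assets such that
$$\log \mathbf{S}_t = \log \mathbf{S}_0 - \frac{\operatorname{diag}(B)\,t}{2} + B^{\frac{1}{2}} W_t,$$
with the price of the basket given by
$$S_t = \sum_{i=1}^n \lambda_i S_t^i, \quad t \ge 0. \tag{6}$$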
Implied Volatility of Basket Options at Extreme Strikes 181
The initial condition for the process S is given by $S_0 = \sum_{i=1}^n \lambda_i S_0^i$, and we will
assume in the sequel that $S_0^i = 1$ for all 1 ≤ i ≤ n. The previous condition implies
that $S_0 = 1$. Therefore,
$$S_t = \sum_{i=1}^n \exp\{Y_t^i\}, \tag{7}$$
where
$$Y_t^i = \log\lambda_i - \frac{b_{ii}t}{2} + \sum_{j=1}^n \beta_{ij}W_t^j, \quad 1 \le i \le n. \tag{8}$$
In (8), the symbols $\beta_{ij}$ stand for the elements of the matrix $B^{\frac{1}{2}}$. We also set
$$\mu_{i,t} = \log\lambda_i - \frac{b_{ii}t}{2}, \quad 1 \le i \le n. \tag{9}$$
The existence and uniqueness of w̄ follows from the non-degeneracy of the matrix
B. We let
$\bar\mu \in \mathbb{R}^{\bar n}$ with $\bar\mu_i = \mu_{\bar k(i)}$, and $\bar B \in M_{\bar n}(\mathbb{R})$ with $\bar B_{ij} = B_{\bar k(i),\bar k(j)}$. The inverse matrix
of $\bar B$ is denoted by $\bar B^{-1}$, and the elements and the row sums of $\bar B^{-1}$ are denoted
by $\bar a_{ij}$ and $\bar A_k := \sum_{j=1}^{\bar n}\bar a_{kj}$, respectively. Since the variables $Y_1, \dots, Y_n$ in (7) are
exchangeable, we can assume with no loss of generality that for the covariance matrix
B, $\bar I = \{1, \dots, \bar n\}$ with $\bar n \le n$. By the strict convexity of the objective function, the
minimizer of $\min_{w\in\Delta_{\bar n}} w^\perp \bar B w$ coincides with the first $\bar n$ components of $\bar w$ and therefore
belongs to the interior of the set $\mathbb{R}^{\bar n}_+$. The minimizer over $\Delta_{\bar n}$ then coincides with the
minimizer over the set $\{w \in \mathbb{R}^{\bar n} : \sum_{i=1}^{\bar n} w_i = 1\}$, which means that
$$(\bar w_i)_{i=1,\dots,\bar n} = \frac{\bar B^{-1}\mathbf{1}}{\mathbf{1}^\perp \bar B^{-1}\mathbf{1}},$$
or, equivalently,
$$\bar w_k = \frac{\bar A_k}{\sum_{i=1}^{\bar n}\bar A_i}, \quad k = 1, \dots, \bar n. \tag{11}$$
Since $\sum_{i=1}^{\bar n}\bar A_i > 0$ (the matrix $\bar B^{-1}$ is positive definite), this implies that $\bar A_k > 0$
for $k = 1, \dots, \bar n$.
We will next formulate a condition under which the asymptotic formula for the
density pT holds.
Assumption (A) For every $i \in \{1, \dots, n\} \setminus \bar I$, $(e^i - \bar w)^\perp B\bar w \ne 0$, where $e^i \in \mathbb{R}^n$
satisfies $e^i_j = 1$ if i = j and $e^i_j = 0$ otherwise.
Assumption (A) is a natural nondegeneracy condition for our problem. The following
straightforward equality gives a relation between the optimization problem in (10)
and a similar problem without the normalization constraint:
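$$\inf_{w\in\Delta_n,\ r\ge 0}\Big\{\frac{r^2}{2}w^\perp Bw - r\Big\} = \inf_{v\in\mathbb{R}^n:\ v_i\ge 0,\ i=1,\dots,n}\Big\{\frac{1}{2}v^\perp Bv - \mathbf{1}^\perp v\Big\}. \tag{12}$$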
A minimizer v̄ of the right-hand side can therefore be constructed from the minimizer
w̄ of (10) as follows:
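$$\bar v = \frac{\bar w}{\bar w^\perp B\bar w}.$$
Now, introducing the vector $\lambda \in \mathbb{R}^n$ of Lagrange multipliers for the positivity con-
straints on the right-hand side of (12), we get the Lagrangian $\frac{1}{2}v^\perp Bv - \mathbf{1}^\perp v - \lambda^\perp v$.
At the extremum therefore, $B\bar v = \mathbf{1} + \lambda$, or in other words,
$$\frac{B\bar w}{\bar w^\perp B\bar w} = \mathbf{1} + \lambda.$$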
Therefore, Assumption (A) simply states that for the constraints, which are saturated,
the Lagrange multipliers are not equal to zero (since the constraints are inequalities,
this is equivalent to the strict positivity for the multipliers). This is generally true,
except when the solution of the unconstrained problem belongs to the boundary of
the domain defined by the constraints. Assumption A is not restrictive and is satisfied
in most applications. Note that if the row sums of the covariance matrix B̄ satisfy
Ai > 0, 1 ≤ i ≤ n, then Assumption A holds.
It was established in [27] that under Assumption (A), the following asymptotic
formula is valid for the density $p_T$ of the price $S_T$ of the basket:
$$p_T(x) = C_T\Big(\log\frac{1}{x}\Big)^{\frac{1-\bar n}{2}} x^{-1+\frac{1}{T}\sum_{k=1}^{\bar n}\bar A_k\big(\log\frac{\bar A_1+\dots+\bar A_{\bar n}}{\bar A_k}+\bar\mu_{k,T}\big)}\exp\Big\{-\frac{1}{2T}(\bar A_1+\dots+\bar A_{\bar n})\log^2\frac{1}{x}\Big\}\Big(1+O\Big(\Big(\log\frac{1}{x}\Big)^{-1}\Big)\Big), \tag{13}$$
Using formula (13), we can characterize the asymptotic behavior of the put pricing
function P at small strikes. This can be done as follows. Consider the fractional
integral of order two defined by
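$$F_2M(\sigma) = \int_\sigma^\infty (\tau-\sigma)M(\tau)\,d\tau, \tag{15}$$
as y → ∞, where
$$M_1(y) = C_T(\log y)^{\frac{1-\bar n}{2}}\, y^{-2-T^{-1}\sum_{k=1}^{\bar n}\bar A_k\big(\log\frac{\bar A_1+\dots+\bar A_{\bar n}}{\bar A_k}+\mu_{k,T}\big)}\exp\Big\{-\frac{1}{2T}(\bar A_1+\dots+\bar A_{\bar n})\log^2 y\Big\}, \quad y > y_0. \tag{18}$$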
It follows from (17) that there exist c > 0 and y1 > 0 such that
In [26], a general asymptotic formula was obtained for fractional integrals (see
also Theorem 5.3 in [24]). We will next formulate this general result. Suppose
$M(y) = a(y)e^{-b(y)}$ for all $y \ge c$,
where c > 0 is some number. Suppose also that the following conditions hold:
1. $y|a'(y)| \le \gamma a(y)$ for some γ > 0 and all y > c.
2. $b(y) = B(\log y)$, where B is a positive increasing function on (c, ∞) such that
$B''(y) \approx 1$ as y → ∞.
Then as σ → ∞,
$$F_2M(\sigma) = \frac{M(\sigma)}{b'(\sigma)^2}\big(1+O((\log\sigma)^{-1})\big). \tag{21}$$
$$F_2M_i(\sigma) = \frac{M_i(\sigma)}{b'(\sigma)^2}\big(1+O((\log\sigma)^{-1})\big) \tag{22}$$
as σ → ∞, where i = 1, 2 and
$$b(u) = \frac{1}{2T}(\bar A_1+\dots+\bar A_{\bar n})\log^2 u, \tag{23}$$
$$F_2M(\sigma) = F_2M_1(\sigma) + O(F_2M_2(\sigma)) = \frac{M_1(\sigma)}{b'(\sigma)^2}\big(1+O((\log\sigma)^{-1})\big) \tag{24}$$
as σ → ∞. Now, using (16), (23), and (24), we establish the following assertion.
Theorem 1 Let P be the price of the put option defined in (2), and suppose Assump-
tion (A) holds for the covariance matrix B (see [27]). Then, as K → 0,
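$$P(K) = \delta_0\Big(\log\frac{1}{K}\Big)^{\delta_1}\Big(\frac{1}{K}\Big)^{\delta_2}\exp\Big\{-\delta_3\log^2\frac{1}{K}\Big\}\Big(1+O\Big(\Big(\log\frac{1}{K}\Big)^{-1}\Big)\Big), \tag{25}$$
where
$$\delta_0 = \frac{C_T T^2}{(\bar A_1+\dots+\bar A_{\bar n})^2}, \qquad \delta_1 = -\frac{3+\bar n}{2},$$
$$\delta_2 = -1-\frac{1}{T}\sum_{k=1}^{\bar n}\bar A_k\Big(\log\frac{\bar A_1+\dots+\bar A_{\bar n}}{\bar A_k}+\mu_{k,T}\Big), \qquad \delta_3 = \frac{1}{2T}(\bar A_1+\dots+\bar A_{\bar n}),$$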
The next statement characterizes the asymptotic behavior of the implied volatility
for small strikes.
Theorem 2 Suppose Assumption (A) holds for the covariance matrix B. Then, as
K → 0,
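$$I(K) = \frac{1}{\sqrt{\bar A_1+\dots+\bar A_{\bar n}}} - \frac{2\sum_{k=1}^{\bar n}\bar A_k\Big(\log\frac{\bar A_1+\dots+\bar A_{\bar n}}{\bar A_k}+\mu_{k,T}\Big)+T}{2(\bar A_1+\dots+\bar A_{\bar n})^{\frac{3}{2}}}\Big(\log\frac{1}{K}\Big)^{-1} - \frac{T(\bar n-1)}{2(\bar A_1+\dots+\bar A_{\bar n})^{\frac{3}{2}}}\Big(\log\frac{1}{K}\Big)^{-2}\log\log\frac{1}{K} + O\Big(\Big(\log\frac{1}{K}\Big)^{-2}\Big). \tag{26}$$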
Remark 1 The leading term in the implied volatility expression above can also be
written as
$$\lim_{K\downarrow 0} I(K) = \frac{1}{\sqrt{\bar A_1+\dots+\bar A_{\bar n}}} = \sqrt{\min_{w\in\Delta_n} w^\perp Bw}. \tag{27}$$
Formula (27) for the leading term of the implied volatility holds even if assumption
(A) is not satisfied—in this case, this formula can be obtained as a corollary of
Theorem 9 of this paper.
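$$\log\frac{1}{P(K)} = \log\frac{1}{\delta_0} - \delta_1\log\log\frac{1}{K} - \delta_2\log\frac{1}{K} + \delta_3\log^2\frac{1}{K} + O\Big(\Big(\log\frac{1}{K}\Big)^{-1}\Big) \tag{28}$$
and
$$\log\frac{K}{P(K)} = \log\frac{1}{\delta_0} - \delta_1\log\log\frac{1}{K} - (\delta_2+1)\log\frac{1}{K} + \delta_3\log^2\frac{1}{K} + O\Big(\Big(\log\frac{1}{K}\Big)^{-1}\Big) \tag{29}$$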
where δ0 , δ1 , δ2 , and δ3 are such as in Theorem 1. Moreover, the error term in (4)
can be represented as follows:
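$$O\Big(\Big(\log\frac{1}{K}\Big)^{-3}\log\log\frac{1}{K}\Big). \tag{30}$$
and hence
$$\log B(K) = \log\frac{1}{4\sqrt{\pi}\,\delta_3} - \log\log\frac{1}{K} + O\Big(\Big(\log\frac{1}{K}\Big)^{-1}\Big) \tag{31}$$
as K → 0.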
Our next goal is to simplify formula (4) by taking into account (28), (29), and
(31), and replacing the error term by the expression in (30). We can drop the terms
$O\big((\log\frac{1}{K})^{-1}\big)$ in (28), (29), and (31), using the mean value theorem. This will
introduce an error term $O\big((\log\frac{1}{K})^{-2}\big)$ in the formula that follows from formula
(4). Thus
$$I(K) = \frac{\sqrt{2}}{\sqrt{T}}\sqrt{V_1(K) - \frac{1}{2}\log\widetilde{V}_2(K) + \log\frac{1}{4\sqrt{\pi}\,\delta_3} - \log\log\frac{1}{K}} - \frac{\sqrt{2}}{\sqrt{T}}\sqrt{V_2(K) - \frac{1}{2}\log\widetilde{V}_2(K) + \log\frac{1}{4\sqrt{\pi}\,\delta_3} - \log\log\frac{1}{K}} + O\Big(\Big(\log\frac{1}{K}\Big)^{-2}\Big) \tag{32}$$
as K → 0. Put
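$$h_1(K) = \frac{-\log\big(4\sqrt{\pi}\,\delta_0\delta_3^{\frac{3}{2}}\big) - (\delta_1+2)\log\log\frac{1}{K} - \delta_2\log\frac{1}{K}}{\delta_3\log^2\frac{1}{K}}$$
and
$$h_2(K) = \frac{-\log\big(4\sqrt{\pi}\,\delta_0\delta_3^{\frac{3}{2}}\big) - (\delta_1+2)\log\log\frac{1}{K} - (\delta_2+1)\log\frac{1}{K}}{\delta_3\log^2\frac{1}{K}}.$$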
$$E[(S_t-K)^+] \approx \frac{K}{\log^2 K}\exp\Big\{-\frac{(\log K-\mu)^2}{2\sigma^2 t}\Big\}, \quad K \to \infty.$$
Applying Corollary 2.4 in [22] (which is nothing but the right-tail version of formula
(3)), we conclude that
$$I(K) = \sigma + O\Big(\frac{\psi(K)}{\log K}\Big)$$
The function ψ can be removed from the error estimate in the previous formula,
using Lemma 3.1, part 1, in [24]. The resulting formula is as follows:
$$I(K) = \sigma + O\Big(\frac{1}{\log K}\Big)$$
as K → ∞.
Numerical illustration In this part of the paper we compare the theoretical left-
tail limit of the implied volatility given by Formula (27) with the numerical values
computed by Monte Carlo in the multidimensional Black-Scholes model. Figure 1
plots the implied volatility of two basket call options as function of the strike price
with 2-standard deviation confidence intervals (for 5 million paths), as well as the
horizontal line corresponding to the theoretical limit.
In the left graph, the basket contains two independent identical assets following
the Black-Scholes model with volatility σ = 0.3. In the right graph, the basket
contains ten identical assets following the multidimensional Black-Scholes model,
where the volatility of every component is σ = 0.3 and the correlation between the
log-prices of different components is ρ = 0.5. The maturity of the options is T = 0.2
years in both graphs.
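For the two baskets just described, the theoretical limit of Formula (27) can be evaluated by hand. The sketch below relies on one observation that is not spelled out in the text: for identical assets with a common pairwise correlation, the quadratic form is symmetric in the weights and strictly convex, so its minimizer over the simplex is the equal-weight vector. The helper name `left_wing_limit` is hypothetical.

```python
import math

def left_wing_limit(n, sigma, rho):
    """sqrt(min_w w'Bw) over the simplex for B_ii = sigma^2, B_ij = rho*sigma^2.

    With equal weights w_i = 1/n the quadratic form equals
    sigma^2 * (1 + (n - 1) * rho) / n.
    """
    min_quadratic_form = sigma ** 2 * (1.0 + (n - 1) * rho) / n
    return math.sqrt(min_quadratic_form)

limit_two_assets = left_wing_limit(2, 0.3, 0.0)    # independent assets
limit_ten_assets = left_wing_limit(10, 0.3, 0.5)   # correlation 0.5
print(limit_two_assets, limit_ten_assets)
```

Under these assumptions the limits are roughly 0.212 for the two-asset basket and 0.222 for the ten-asset basket, both well below the single-asset volatility σ = 0.3.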
We observe that in both cases the volatility is almost constant as a function of
strike (note the scale on the vertical axis), and for all strikes it is very close to the
theoretical limit of Formula (27). We only show the zero-order term of the expansion
in Theorem 2 because the higher-order terms do not lead to an improvement of the
approximation for the strikes shown in the graph. Indeed, the higher-order terms in
this expansion have a singularity at K = 1 and have a “reasonable” value only when
$(\log\frac{1}{K})^{-1}$ is very small.
For comparison, we also plot the implied volatility in the right wing in Fig. 2.
According to Remark 2, in the right wing, the implied volatility must converge to
σ = 0.3. However, from the graph in Fig. 2 we see that this convergence is very slow:
for all strike values for which option prices may be computed without sophisticated
variance reduction, the implied volatility, although it increases slightly with strike,
remains close to its left-wing limit.
Fig. 1 Implied volatility of a basket call option in the multidimensional Black-Scholes model
together with the theoretical 0-order approximation for the left wing. Left option on a basket of 2
identical assets. Right option on a basket of 10 identical assets
Fig. 2 Right wing of the implied volatility of a basket call option in the two-dimensional Black-
Scholes model together with the theoretical 0-order left-wing approximation
The detailed discussion of the behavior of the distribution of the sum of two log-
normal variables can be found in [20, 27]. The covariance matrix in this case is as
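follows: $B = [b_{ij}]$, where $b_{11} = \sigma_1^2$, $b_{12} = b_{21} = \rho\sigma_1\sigma_2$, $b_{22} = \sigma_2^2$ with $\sigma_1 > 0$,
$\sigma_2 > 0$, and the correlation coefficient satisfies −1 < ρ < 1. We will also assume
$\sigma_1 \ge \sigma_2$. Note that the case where $\rho < \frac{\sigma_2}{\sigma_1}$ is a regular case, and Assumption (A)
holds. In the case where $\rho > \frac{\sigma_2}{\sigma_1}$, we have to rearrange the rows and the columns
of B (see the example in Sect. 2.1 of [27]). Then $\bar B = (\sigma_2^2)$, and Assumption (A)
holds. The case where $\rho = \frac{\sigma_2}{\sigma_1}$ is exceptional. Here Assumption (A) does not hold.
The following asymptotic formulas for the implied volatility follow from (26):
• Suppose $\rho > \frac{\sigma_2}{\sigma_1}$. Then
$$I(K) = \sigma_2 - \sigma_2\log\lambda_2\Big(\log\frac{1}{K}\Big)^{-1} + O\Big(\Big(\log\frac{1}{K}\Big)^{-2}\Big) \tag{36}$$
as K → 0.
• Suppose $\rho < \frac{\sigma_2}{\sigma_1}$. Then
$$I(K) = \sigma_\infty - \sigma_\infty\Big[\frac{T}{2}\sigma_\infty^2 + \Big(\log\lambda_1 - \frac{\sigma_1^2T}{2} - \log\bar v\Big)\bar v + \Big(\log\lambda_2 - \frac{\sigma_2^2T}{2} - \log(1-\bar v)\Big)(1-\bar v)\Big]\Big(\log\frac{1}{K}\Big)^{-1} - \frac{T}{2}\sigma_\infty^3\,\frac{\log\log\frac{1}{K}}{\log^2\frac{1}{K}} + O\Big(\Big(\log\frac{1}{K}\Big)^{-2}\Big) \tag{37}$$
as K → 0, where
$$\sigma_\infty = \frac{\sigma_1\sigma_2\sqrt{1-\rho^2}}{\sqrt{\sigma_1^2+\sigma_2^2-2\rho\sigma_1\sigma_2}} \quad\text{and}\quad \bar v = \frac{\sigma_2(\sigma_2-\rho\sigma_1)}{\sigma_1^2+\sigma_2^2-2\rho\sigma_1\sigma_2}.$$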
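As a sanity check on the two-asset constants in the regular case, the sketch below (function names are illustrative) computes the limiting volatility and the optimal weight of the first asset in two ways: from the closed-form expressions, and from the row sums of the inverse covariance matrix, using the leading term of (26) and the weights in (11). The parameter values are arbitrary and chosen so that ρ is smaller than the ratio of the smaller to the larger volatility.

```python
import math

def closed_form(s1, s2, rho):
    """Limiting volatility and weight of asset 1 from the closed-form expressions."""
    d = s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2
    sigma_inf = s1 * s2 * math.sqrt(1.0 - rho * rho) / math.sqrt(d)
    v_bar = s2 * (s2 - rho * s1) / d
    return sigma_inf, v_bar

def via_row_sums(s1, s2, rho):
    """Same quantities from the row sums A_1, A_2 of the inverse of B."""
    det = (s1 * s2) ** 2 * (1.0 - rho * rho)
    a1 = (s2 * s2 - rho * s1 * s2) / det   # first row sum of B^{-1}
    a2 = (s1 * s1 - rho * s1 * s2) / det   # second row sum of B^{-1}
    return 1.0 / math.sqrt(a1 + a2), a1 / (a1 + a2)

# Example with sigma_1 = 0.4 >= sigma_2 = 0.3 and rho = 0.2 < 0.75.
cf = closed_form(0.4, 0.3, 0.2)
rs = via_row_sums(0.4, 0.3, 0.2)
print(cf, rs)
```

Both routes should agree up to rounding, and the weight lies strictly between 0 and 1 exactly when both row sums are positive.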
as x → 0. Recall that we assume that μ = 0. Recall also that μ1,T and μ2,T are
defined in (9).
Remark 3 Formula (38) can be derived from formula (B20) established at the end
of the proof of part (ii) of Theorem 2.3 in [20]. Note that in the present paper we
assume σ1 ≥ σ2 , while in [20], σ1 ≤ σ2 .
Set
$$V_{1,T} = \log\Big(\frac{1}{\rho^2}-1\Big) + \mu_{1,T} - \mu_{2,T} \quad\text{and}\quad V_2 = \log\Big(\frac{1}{\rho^2}-1\Big). \tag{39}$$
as x → 0. Hence
! " ! "
1 1 1 1 2
exp − log2 V2 + log log ∼ exp − log log log
2T (σ12 − σ22 ) x 2T (σ12 − σ22 ) x
as x → 0. In addition,
! "
1 1 1
exp log log log V2 + log log
T (σ12 − σ22 ) x x
1 ! "
1 T (σ12 −σ22 ) 1 1 1
≈ log exp log log log log log
x T (σ12 − σ22 ) x x
as x → 0.
Our next goal is to obtain a two-sided estimate for the put pricing function P,
by taking into account formula (40). We will use the ideas employed in the proof of
Theorem 1. Let us set
and
μ2,T V1,T V1,T
−2− − −1
T σ22 T (σ12 −σ22 ) T (σ12 −σ22 ) 2
a(y) = y (log y) (log log y) .
It is not hard to see that the restrictions, under which formula (21) is valid, are
satisfied. In addition, for the function b(x) = B(log x), we have $b'(x) \approx \frac{\log x}{x}$ as
x → ∞. Now, reasoning as in the proof of Theorem 1, we obtain the following
formula: $P(K) \approx \widetilde{P}(K)$ as K → 0, where
as K → 0. Next, using (3) with $\widetilde{P}$ given by (41), and making numerous simplifi-
cations, we obtain the following asymptotic formula for the implied volatility in the
exceptional case:
$$I(K) = \sigma_2 + O\Big(\Big(\log\frac{1}{K}\Big)^{-1}\Big) \tag{42}$$
as K → 0. Comparing formula (42) with formulas (36) and (37), we see that the
behavior of the implied volatility at the critical point $\rho = \frac{\sigma_2}{\sigma_1}$, where the qualitative
change happens, is similar to that in the case where $\rho > \frac{\sigma_2}{\sigma_1}$.
Recall that in Sect. 3, we introduced the price process S for a basket of assets (see
formula (6)). The present section deals with time changes in such processes. Suppose
$\tau_t$, t ≥ 0, is a non-negative non-decreasing stochastic process on $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\ge 0}, \mathbb{P})$
(a time change). Then, the time-changed process S has the following form: t → Sτt .
We only consider time changes which are independent of the price process S. In the
next subsections, two-sided estimates for marginal distribution functions of time-
changed price processes such as above will be established. Moreover, the leading
term in the asymptotic expansion of the implied volatility associated with a time-
changed price process t → Sτt in the n-dimensional Black-Scholes model will be
found.
The next assertion provides an upper bound for the distribution function of a random
variable imitating the random variable Sτt for fixed t > 0. The additional drift vector
μ̃ will be needed later to ensure the martingale property.
Theorem 3 (Upper bound) Let Y be a centered Gaussian vector with covariance
matrix $B = [b_{ij}]_{1\le i,j\le n}$, and let $\mu \in \mathbb{R}^n$ and $\tilde\mu \in \mathbb{R}^n$. Suppose Z is a random
variable with values in (0, ∞), which has a density ρ(x) satisfying $\rho(s) \le cs^\alpha e^{-\theta s}$
for s ≥ 1, where θ > 0, c > 0 and α ∈ R are constants. Then, there exists C > 0
such that as k → +∞,
$$P\Big[\sum_{i=1}^n e^{Y_i\sqrt{Z}+\mu_iZ+\tilde\mu_i} \le e^{-k}\Big] \lesssim Ck^\alpha e^{-c^*k},$$
where
$$c^* = \min_{t\ge 0}\max_{w\in\Delta_n}\Big\{\theta t + \frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t}\Big\}. \tag{43}$$
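Proof In this proof, C denotes a constant which may change from line to line. For
k > 0, set
$$F_t(k) = P\Big[\sum_{i=1}^n e^{Y_i\sqrt{kt}+\mu_ikt+\tilde\mu_i} \le e^{-k}\Big].$$
Fix $w \in \Delta_n$, and let t be such that $1 + t\mu^\perp w > 0$. Then, by Jensen’s inequality,
$$\begin{aligned}
P\Big[\sum_{i=1}^n e^{Y_i\sqrt{kt}+\mu_ikt+\tilde\mu_i} \le e^{-k}\Big]
&\le P\Big[\sqrt{kt}\sum_{i=1}^n w_iY_i + kt\mu^\perp w + \tilde\mu^\perp w + E(w) \le -k\Big]\\
&= N\Big(-\frac{k+tk\mu^\perp w+\tilde\mu^\perp w+E(w)}{\sqrt{w^\perp Bw\,kt}}\Big)\\
&\le \frac{C\sqrt{t}}{(1+t\mu^\perp w)\sqrt{k}}\exp\Big\{-\frac{(k+tk\mu^\perp w+\tilde\mu^\perp w+E(w))^2}{2w^\perp Bw\,kt}\Big\}\\
&= \frac{C\sqrt{t}}{(1+t\mu^\perp w)\sqrt{k}}\exp\Big\{-k\frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t}\Big\}\exp\Big\{-\frac{(\tilde\mu^\perp w+E(w))^2}{2w^\perp Bw\,kt}\Big\}\\
&\quad\times\exp\Big\{-\frac{E(w)+\tilde\mu^\perp w}{w^\perp Bw\,t}\Big\}\exp\Big\{-\frac{\mu^\perp w\,(E(w)+\tilde\mu^\perp w)}{w^\perp Bw}\Big\}\\
&\le \frac{C\sqrt{t}}{(1+t\mu^\perp w)\sqrt{k}}\exp\Big\{-k\frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t}\Big\}\exp\Big\{-\frac{\tilde\mu^\perp w}{w^\perp Bw\,t}\Big\},
\end{aligned}$$
$$F(t,w) = \theta t + \frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t}.$$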
The following lemma establishes some properties of this function. The proof is given
in the appendix.
Lemma 1 There exists a unique couple $(\bar t, \bar w)$, with $\bar t \in (0, \infty)$ and $\bar w \in \Delta_n$ such
that
$$F(\bar t, \bar w) = \min_{t>0}\max_{w\in\Delta_n} F(t,w).$$
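Remark that if $\overline{T} < \infty$, then $f(\overline{T}) = \theta\overline{T} > f(\bar t)$. Let us also choose $\underline{T}$ small
enough so that
$$1 - |\mu^\perp\bar w|\,\underline{T} \ge \frac{1}{2} \quad\text{and}\quad \frac{1}{8\bar w^\perp B\bar w\,\underline{T}} > f(\bar t),$$
and assume that k is large enough so that $k + 8\tilde\mu\bar w > 0$. We bound the distribution
function of the Gaussian mixture from above as follows:
$$P\Big[\sum_{i=1}^n e^{Y_i\sqrt{Z}+\mu_iZ+\tilde\mu_i} \le e^{-k}\Big] = E[F_{Z/k}(k)] = \int_0^\infty \rho(s)F_{s/k}(k)\,ds = k\int_0^\infty \rho(tk)F_t(k)\,dt \tag{44}$$
$$\le k\max_{0\le t\le\underline{T}}F_t(k) + k\int_{\underline{T}}^{\overline{T}}\frac{C(tk)^\alpha\sqrt{t}}{\sqrt{k}\,(\mathbf{1}+\mu t)^\perp w}\,e^{-kf(t)}\,dt + ck\int_{\overline{T}}^\infty e^{-tk\theta}(tk)^\alpha\,dt. \tag{45}$$
Now, by the choice of $\underline{T}$, the first term on the right-hand side of the last inequality
in (45) satisfies
$$k\max_{0\le t\le\underline{T}}F_t(k) \le C\sqrt{k}\,e^{-\beta k}$$
with $\beta > f(t^*)$. The second term is computed using Laplace’s method. As k → +∞,
up to a constant,
$$k\int_{\underline{T}}^{\overline{T}}\frac{C(tk)^\alpha\sqrt{t}}{\sqrt{k}\,(\mathbf{1}+\mu t)^\perp w}\,e^{-kf(t)}\,dt \sim Ck^\alpha e^{-kf(t^*)}.$$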
$$P\Big[\sum_{i=1}^n e^{Y_i\sqrt{Z}+\mu_iZ+\tilde\mu_i} \le e^{-k}\Big] \gtrsim Ck^{\alpha-n}e^{-c^*k},$$
By Proposition 3.2 in [28], the above probability can be bounded from below (very
roughly) as follows:
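$$P\big[Y_i\sqrt{kt}+\mu_ikt+\tilde\mu_i \le -k-\log n,\ i=1,\dots,n\big] \ge \frac{C}{(1+k(1+t))^n}\exp\{-\alpha_t/2\},$$
where
$$\begin{aligned}
\alpha_t &= \min_{x\ge\frac{1}{\sqrt{kt}}((k+\log n)\mathbf{1}+kt\mu+\tilde\mu)} x^\perp B^{-1}x\\
&= \max_{u\in\mathbb{R}^n_+}\Big\{-\frac{1}{2}u^\perp Bu + u^\perp\frac{1}{\sqrt{kt}}\big((k+\log n)\mathbf{1}+kt\mu+\tilde\mu\big)\Big\}\\
&= \max_{w\in\Delta_n}\frac{(k+\log n+kt\mu^\perp w+\tilde\mu^\perp w)^2}{2w^\perp Bw\,kt}\\
&\le \max_{w\in\Delta_n}k\frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t} + \max_{w\in\Delta_n}\frac{(1+t\mu^\perp w)(\log n+\tilde\mu^\perp w)}{w^\perp Bw\,t} + \max_{w\in\Delta_n}\frac{(\log n+\tilde\mu^\perp w)^2}{2w^\perp Bw\,kt}.
\end{aligned}$$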
Finally, we bound the distribution function of the Gaussian mixture from below as
follows:
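$$\begin{aligned}
P\Big[\sum_{i=1}^n e^{Y_i\sqrt{Z}+\mu_iZ+\tilde\mu_i} \le e^{-k}\Big] &= k\int_0^\infty\rho(tk)F_t(k)\,dt \ge ck\int_{\bar t-1/k}^{\bar t+1/k}(tk)^\alpha e^{-\theta tk}F_t(k)\,dt\\
&\ge \frac{Ck(\bar tk)^\alpha}{(1+k(1+t))^n}\int_{\bar t-1/k}^{\bar t+1/k}\exp\Big\{-\theta\bar tk - k\max_{w\in\Delta_n}\frac{(1+t\mu^\perp w)^2}{2w^\perp Bw\,t}\Big\}\,dt\\
&\ge \frac{C(\bar tk)^\alpha}{(1+k(1+\bar t))^n}\exp\Big\{-\theta\bar tk - k\max_{w\in\Delta_n}\frac{(1+\bar t\mu^\perp w)^2}{2w^\perp Bw\,\bar t}\Big\} = \frac{Ck^\alpha e^{-kf(\bar t)}}{(1+k(1+\bar t))^n}.
\end{aligned}$$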
Remark 4 Theorems 3 and 4 show that under their assumptions, the dominating
factor describing the decay of the left tail of the price of a portfolio of assets is
exponential with the decay rate equal to the constant c∗ . For example, for n = 1, we
have
$$c^* = \min_{t\ge 0}\Big\{\theta t + \frac{(1+\mu t)^2}{2\sigma^2t}\Big\} = \frac{\sqrt{2\theta\sigma^2+\mu^2}+\mu}{\sigma^2}.$$
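The one-dimensional decay rate can be checked numerically. The sketch below (function names and parameter values are illustrative) compares the closed-form value with a direct minimization over t. The objective has the form a·t + b/t plus a constant with a, b > 0, so it is convex on (0, ∞) and a ternary search suffices.

```python
import math

def rate_closed_form(theta, mu, sigma):
    """Closed-form decay rate c* for a single asset (n = 1)."""
    return (math.sqrt(2.0 * theta * sigma ** 2 + mu ** 2) + mu) / sigma ** 2

def rate_numeric(theta, mu, sigma, lo=1e-6, hi=50.0, iters=300):
    """Minimize theta*t + (1 + mu*t)^2 / (2*sigma^2*t) over t by ternary search."""
    def objective(t):
        return theta * t + (1.0 + mu * t) ** 2 / (2.0 * sigma ** 2 * t)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            hi = m2   # the minimizer lies to the left of m2
        else:
            lo = m1   # the minimizer lies to the right of m1
    return objective(0.5 * (lo + hi))

c_star_exact = rate_closed_form(10.0, -0.5, 0.3)
c_star_search = rate_numeric(10.0, -0.5, 0.3)
print(c_star_exact, c_star_search)
```

The search interval is an assumption: it must contain the minimizer, which is located at t = 1/sqrt(2θσ² + μ²) for this objective.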
$$\log \mathbf{S}_t = \log \mathbf{S}_0 + \tilde\mu t + \mu\tau_t + B^{\frac{1}{2}}W_{\tau_t}, \tag{46}$$
where we use the same notation as in the beginning of Sect. 3. Let S denote the price
process of the basket. Fix a maturity T > 0, and suppose the random variable τT has
a density ρT . Suppose also that there exist c1 > 0, c2 > 0, θ > 0 and α ∈ R such
that
$$c_1s^\alpha e^{-\theta s} \le \rho_T(s) \le c_2s^\alpha e^{-\theta s}, \quad s \ge 1. \tag{47}$$
$$\theta > \mu_i + \frac{B_{ii}}{2}. \tag{48}$$
This assumption implies that there exists ε > 0 such that
It follows from Theorems 3 and 4 that there exist C1 > 0, C2 > 0, and y0 > 0
such that
$$C_1y^{c^*}\Big(\log\frac{1}{y}\Big)^{\alpha-n} \le P[S_{\tau_T} \le y] \le C_2y^{c^*}\Big(\log\frac{1}{y}\Big)^{\alpha}, \quad y < y_0. \tag{50}$$
Since we have
$$P(K) = E\big[(K-S_{\tau_T})^+\big] = \int_0^K P[S_{\tau_T} \le y]\,dy,$$
the estimates in (50) imply that there exist C3 > 0, C4 > 0, and K0 > 0 such that
$$C_3K^{c^*+1}\Big(\log\frac{1}{K}\Big)^{\alpha-n} \le P(K) \le C_4K^{c^*+1}\Big(\log\frac{1}{K}\Big)^{\alpha}, \quad K < K_0. \tag{51}$$
Note that the put pricing function in (51) is squeezed between two regularly varying
functions with the same index of regular variation at zero. Such estimates allow one
to find the leading term in the asymptotic expansion of the implied volatility near
zero.
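The passage from distribution-function estimates to put-price estimates rests on the identity that the put price equals the integral of the distribution function of the terminal price from 0 to K. A minimal numerical check of that identity is sketched below for a single lognormal asset with initial price 1 and zero rates; this is an illustrative stand-in, not the time-changed basket model of this section, and the function names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_put(T, K, sigma):
    """Black-Scholes put price, spot 1, zero interest rate."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(1.0 / K) + 0.5 * s * s) / s
    d2 = d1 - s
    return K * norm_cdf(-d2) - norm_cdf(-d1)

def put_from_cdf(T, K, sigma, steps=20000):
    """Midpoint-rule approximation of the integral of P[S_T <= y] over (0, K)."""
    s = sigma * math.sqrt(T)
    h = K / steps
    total = 0.0
    for j in range(steps):
        y = (j + 0.5) * h
        # Distribution function of S_T = exp(-s^2/2 + s*Z), Z standard normal.
        total += norm_cdf((math.log(y) + 0.5 * s * s) / s)
    return total * h

closed = bs_put(0.2, 0.8, 0.3)
integrated = put_from_cdf(0.2, 0.8, 0.3)
print(closed, integrated)
```

Integrating a two-sided bound on the distribution function therefore raises the power of the strike by one, which is how the exponents in (51) arise from those in (50).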
Theorem 5 Suppose condition (47) holds for the time-change process τ and that
the assumptions (48) and (49) are satisfied. Then the following asymptotic formula
holds for the implied volatility in time-changed n-dimensional Black-Scholes model:
$$I(K) \sim \Big(\frac{\psi(c^*)}{T}\log\frac{1}{K}\Big)^{\frac{1}{2}}$$
Remark 5 Condition (47) holds for many processes commonly used as stochastic
time changes, e.g., for the gamma process, the inverse Gaussian process, or the
generalized inverse Gaussian process. The latter process is used as time change in
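the generalized hyperbolic Lévy model. Recall that the density of the gamma process
is given by
$$\rho_t(s) = \frac{\lambda^{ct}}{\Gamma(ct)}s^{ct-1}e^{-\lambda s},$$
and the density of the inverse Gaussian process is given by
$$\rho_t(s) = \frac{ct}{s^{3/2}}\,e^{2ct\sqrt{\pi\lambda}-\lambda s-\pi c^2t^2/s}.$$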
In the previous formulas, the symbols λ and c stand for the parameters of the distri-
butions.
We close this section with a counterpart of Theorem 5 for the right tail, which can
be deduced from Theorem 10 proved in the next section.
Theorem 6 Suppose condition (47) holds for the time-change process τ and that the
assumptions (48) and (49) are satisfied. Then the following asymptotic formula holds
for the implied volatility in a time-changed n-dimensional Black-Scholes model:
$$I(K) \sim \Big(\frac{\psi(c_{\min}-1)}{T}\log K\Big)^{\frac{1}{2}}$$
as K → +∞, where
$$c_{\min} = \min_{i=1,\dots,n}\frac{\sqrt{2\theta B_{ii}+\mu_i^2}-\mu_i}{B_{ii}}.$$
Proof Let $G_i(x) = P[\log S_T^i \ge x]$. By Theorems 3 and 4, there exist constants C1
and C2 such that
$$C_1x^\alpha e^{-c_ix} \gtrsim G_i(x) \gtrsim C_2x^{\alpha-n}e^{-c_ix}$$
as x → +∞, where
$$c_i = \frac{\sqrt{2\theta B_{ii}+\mu_i^2}-\mu_i}{B_{ii}}.$$
Note that in the single-asset case Theorems 3 and 4 can also be applied to the right
tail, by symmetry. It follows that
$$\log G_i(x) \sim -c_ix$$
Numerical illustration In this part of the paper we illustrate the asymptotic result of
Theorem 5 with a numerical example. Figure 3 plots the squared implied volatility of
two basket call options computed by Monte Carlo as function of the strike price on
logarithmic scale, as well as the theoretical asymptote with slope given by Theorem
5. Note that Theorem 5 only provides the value of the limiting slope of the squared
implied volatility. Therefore, the performance of the asymptotic results should be
evaluated by comparing the slope of the wing of the smile with the slope of the
straight line. The intercept of the straight line has been chosen to keep the straight line
relatively close to the curve, solely for the purpose of visualisation. The confidence
intervals for 5 million simulated paths are very tight and not shown on the graphs.
Fig. 3 Implied volatility of a basket call option together with the theoretical asymptote. Left option
on a basket of 2 identical assets. Right option on a basket of 10 identical assets. The logarithms of
asset prices follow the multidimensional variance gamma model
We focus on the left wing of the smile since in the left wing the slope of the
smile is correlation-dependent, and therefore can in principle be used to calibrate the
correlation structure. Also, numerical experiments for the right wing (not presented
in the paper) show that one needs to go much further into the tail to observe the
asymptotic behavior predicted by Theorem 6.
In this numerical illustration, the time change follows the variance gamma law
(53) with λ = 10 and c = 10. In the left graph, the basket contains two identical
assets with price processes given by (46), where we take μ ≡ 0, $S_0^i = 50$ for i = 1, 2
and the covariance matrix which satisfies $B_{ii} = \sigma^2$ with σ = 0.3 for i = 1, 2 and
$B_{ij} = 0$ for i ≠ j. In the right graph, the basket contains ten identical assets with
price processes given by (46), where we take μ ≡ 0, $S_0^i = 10$ for i = 1, . . . , 10 and
the covariance matrix which satisfies $B_{ii} = \sigma^2$ for i = 1, . . . , 10 and $B_{ij} = \rho\sigma^2$
with ρ = 0.5 for i ≠ j. The maturity of the options is T = 0.2 years in both graphs.
A copula exists by Sklar’s theorem and is uniquely defined in the case where the
marginal distributions of X 1 , . . . , X n are continuous. We refer to [35] for more details
on copulas.
The Gaussian copula with correlation matrix R is the unique copula of any
Gaussian vector with correlation matrix R and nonconstant components (it does
not depend on the mean vector and on the variances of the components).
Given a function φ : [0, 1] → [0, ∞] which is continuous, strictly decreasing
and such that its inverse φ−1 is completely monotonic, the Archimedean copula with
generator φ is defined by
provided that the limit exists and is finite for all α1 , . . . , αn ≥ 0 such that αk > 0
for at least one k.
Suppose also that the copula C admits a weak lower tail dependence function χ.
Then,
$$\lim_{x\downarrow 0}\frac{\log P[X_1+\dots+X_n \le x]}{\min_i\log P[X_i \le x]} = \frac{1}{\chi(\eta_1,\dots,\eta_n)}.$$
Theorem 8
• Assume that a copula function C has strong tail dependence in the left tail, meaning
that the limit
$$\lambda_L = \lim_{u\downarrow 0}\frac{C(u,\dots,u)}{u},$$
exists and satisfies $\lambda_L > 0$. Then, the weak lower tail dependence function of C
satisfies $\chi(\alpha_1, \dots, \alpha_n) = 1$.
• Let C be a Gaussian copula with correlation matrix R such that det R ≠ 0. Then,
where the matrix Σ has entries $\Sigma_{ij} = \frac{R_{ij}}{\sqrt{\alpha_i\alpha_j}}$, 1 ≤ i, j ≤ n.
• Let C be an Archimedean copula with a generator function φ such that $\log\varphi^{-1}$ is
regularly varying at ∞ with index λ > 0. Then,
$$\chi(\alpha_1,\dots,\alpha_n) = \frac{\max(\alpha_1,\dots,\alpha_n)}{\big(\alpha_1^{1/\lambda}+\dots+\alpha_n^{1/\lambda}\big)^{\lambda}}.$$
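The Archimedean case of Theorem 8 is explicit enough to evaluate directly. The following sketch (the function name `chi_archimedean` is illustrative) computes the weak lower tail dependence function for given exponents and index λ, and shows that for equal exponents the value reduces to n raised to the power −λ.

```python
def chi_archimedean(alphas, lam):
    """Weak lower tail dependence function for an Archimedean copula.

    alphas: non-negative exponents, at least one strictly positive.
    lam:    index of regular variation (lam > 0) of log(phi^{-1}).
    """
    if lam <= 0 or max(alphas) <= 0:
        raise ValueError("need lam > 0 and at least one positive exponent")
    denominator = sum(a ** (1.0 / lam) for a in alphas) ** lam
    return max(alphas) / denominator

# Equal exponents: the value is n ** (-lam).
chi_pair = chi_archimedean([1.0, 1.0], 1.0)                # 2 ** -1
chi_four = chi_archimedean([1.0, 1.0, 1.0, 1.0], 0.5)      # 4 ** -0.5
print(chi_pair, chi_four)
```

Smaller values of λ push the function towards 1, the value obtained under strong tail dependence in the first statement of the theorem.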
In this subsection, we study the left-wing behavior of the implied volatility associated
with a basket call option. Recall that we denoted by (Y1 , . . . , Yn ) the vector of
logarithmic returns of the risky assets, and by (λ1 , . . . , λn ) the corresponding vector
of weights. Let C be the copula of the vector (Y1 , . . . , Yn ), and G i be the distribution
function of Yi for i = 1, . . . , n. The implied volatility is considered in this section
as a function k → I (−k) of the variable −k, where k is the log-strike defined by
k = log K . The tail-wing formulas due to Benaim and Friz (see [9]) play an important
role in the sequel.
Theorem 9 Let α > 0, and assume that the following are true:
• There exists ε > 0 such that E[e−εYi ] < ∞, i = 1, . . . , n.
• For every 1 ≤ i ≤ n, the function k → − log G i (−k), k > k0 , belongs to the class
Rα of regularly varying functions, and there exist positive constants η1 , . . . , ηn
and a function G such that
Since the function log G i is regularly varying at −∞, it is clear that log Fi is slowly
varying at zero and
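$$\log F(e^{-k}) \sim \frac{\max_i\eta_i}{\chi(\eta_1,\dots,\eta_n)}\log G(-k) \quad\text{as } k \to \infty,$$
and hence
$$\frac{\rho_1(x)}{\rho_2(x)} \to 1 \quad\text{as } x \to \infty. \tag{58}$$
Then
$$\frac{\psi(\rho_1(x))}{\psi(\rho_2(x))} \to 1 \quad\text{as } x \to \infty. \tag{59}$$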
$$\psi(u) = \frac{2}{(\sqrt{u+1}+\sqrt{u})^2}. \tag{60}$$
The equality in (60) describes the structure of the function ψ better than the original
definition.
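The representation in (60) is easy to check numerically. The sketch below assumes that the original definition is the standard one from the tail-wing literature, namely 2 − 4(sqrt(u² + u) − u); that definition is not visible in this part of the text and is stated here as an assumption. Function names are illustrative.

```python
import math

def psi_original(u):
    """Assumed original definition: 2 - 4*(sqrt(u^2 + u) - u)."""
    return 2.0 - 4.0 * (math.sqrt(u * u + u) - u)

def psi_rational(u):
    """Representation (60): 2 / (sqrt(u + 1) + sqrt(u))^2."""
    return 2.0 / (math.sqrt(u + 1.0) + math.sqrt(u)) ** 2

grid = [0.0, 0.1, 0.5, 1.0, 5.0, 10.0, 1000.0]
max_gap = max(abs(psi_original(u) - psi_rational(u)) for u in grid)
values = [psi_rational(u) for u in grid]
print(max_gap, values)
```

The second form makes two properties evident without any computation: the function is positive and strictly decreasing on (0, ∞). It also avoids the cancellation that affects the first form for large arguments.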
$$\psi((1-\varepsilon)u) \le \frac{2}{(1-\varepsilon)\Big(\sqrt{u+\frac{1}{1-\varepsilon}}+\sqrt{u}\Big)^2} \le \frac{2}{(1-\varepsilon)(\sqrt{u+1}+\sqrt{u})^2} = \frac{1}{1-\varepsilon}\psi(u).$$
Similarly
$$\psi((1+\varepsilon)u) \ge \frac{1}{1+\varepsilon}\psi(u).$$
Therefore,
$$\frac{1}{1+\varepsilon}\psi(u) \le \psi((1+\varepsilon)u) \le \psi((1-\varepsilon)u) \le \frac{1}{1-\varepsilon}\psi(u). \tag{61}$$
It follows from (58) that for every ε > 0 there exists xε > 0 such that
for all x > xε . Since the function ψ decreases on (0, ∞), we have
$$\frac{1}{1+\varepsilon}\psi(\rho_2(x)) \le \psi(\rho_1(x)) \le \frac{1}{1-\varepsilon}\psi(\rho_2(x))$$
Finally, it is not hard to see that (56), (57), and Lemma 2 imply (55).
This completes the proof of Theorem 9.
The next example shows that condition (54) does not prevent one from choosing
different marginal laws for different components of the process (Y1 , . . . , Yn ) as long
as these laws have a similar tail behavior.
Example 1 Let us consider the following multidimensional extension of the example
given in Sect. 5.2 of [9]. We assume that for i = 1, . . . , n, the distribution of the
random variable Yi is normal inverse Gaussian, more precisely, NIG(αi , βi , μi , δi ).
It is also supposed that the parameters satisfy αi > |βi | > 0 and δi > 0. This means
that the moment generating function of Yi is given by
$$M_i(z) = \exp\Big\{\delta_i\Big(\sqrt{\alpha_i^2-\beta_i^2}-\sqrt{\alpha_i^2-(\beta_i+z)^2}\Big)+\mu_iz\Big\}.$$
We refer the reader to [6] for more details on the normal inverse Gaussian distribution.
In particular, it follows that $Y_i$ has a density $g_i$ which satisfies the following condition:
$$g_i(k) \sim C_i|k|^{-\frac{3}{2}}e^{-\alpha_i|k|+\beta_ik}, \quad k \to \pm\infty,$$
$$\Sigma_{ij} = \frac{R_{ij}}{\sqrt{(\alpha_i-\beta_i)(\alpha_j-\beta_j)}}.$$
Remark 6 In this remark, we compare the asymptotic formulas for the implied
volatility obtained in Sects. 3 and 5 (see Theorems 2 and 9). The latter theorem
is more general than the former one. It provides the leading term in the asymptotic
expansion of the implied volatility under certain restrictions on the marginal distri-
butions of log-returns and the corresponding copula, and applies to many special
models. In the case of a Gaussian copula and log-normal marginal distributions,
all the conditions in Theorem 9 are satisfied, and the leading term is equal to the
constant $(\bar A_1+\dots+\bar A_{\bar n})^{-\frac{1}{2}}$. This follows from Theorem 9, the second equality in
formula (27), and the second statement in Theorem 8. The advantage of Theorem 9
is its generality, while the disadvantage is that the asymptotic formula for the implied
volatility contains only the leading term and no error estimate. On the other hand,
Theorem 2 applies only to the case of Gaussian copula and lognormal margins under
a not very restrictive condition (A), but provides a sharp asymptotic formula for the
implied volatility with several terms and an error estimate.
Theorem 10 Let α > 0, and suppose that the following assumptions hold:
• There exists ε > 0 such that E[e(1+ε)Yi ] < ∞ for i = 1, . . . , n.
• For each i = 1, . . . , n, the function k → − log G i (k) belongs to the class Rα at
infinity.
Then,
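$$\frac{I(k)^2T}{k} \sim \psi\Big(-1-\frac{1}{k}\max_i\log G_i(k)\Big) \quad\text{as } k \to +\infty. \tag{63}$$
$$P[X_1+\dots+X_n \ge x] \le P\Big[\exists i : X_i \ge \frac{x}{n}\Big] \le \sum_{i=1}^n P\Big[X_i \ge \frac{x}{n}\Big] \le n\max_i P\Big[X_i \ge \frac{x}{n}\Big].$$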
Since for each i, the function log G i is regularly varying at infinity, it follows that
the function x → log P[X i ≥ x] is slowly varying, and therefore, for x sufficiently
large and any ε > 0,
Finally,
$$\lim_{x\to+\infty}\frac{\log P[X_1+\dots+X_n \ge x]}{\max_i\log P[X_i \ge x]} = 1,$$
and formula (63) follows from Theorem 1 in [9] with a similar proof to that of
Theorem 9.
Numerical illustration In this part of the paper we illustrate the asymptotic result of
Theorem 9 with a numerical experiment. Figure 4 plots the squared implied volatility
of two basket call options computed by Monte Carlo as function of the strike price on
logarithmic scale, as well as the theoretical asymptote with slope given by Theorem 9.
Once again, we focus on the left wing of the smile since the slope of the left wing
depends on the correlation of the Gaussian copula while the slope of the right wing
does not.
Fig. 4 Implied volatility of a basket call option together with the theoretical asymptote. Left option
on a basket of 2 identical assets. Right option on a basket of 10 identical assets. The logarithms of
asset prices follow the variance gamma model with dependence given by a Gaussian copula
In both graphs, the basket contains assets with price processes
where $\tilde\mu_i$ is chosen so that $E[S_T^i] = S_0^i$ and $X^i$ is the variance gamma process with
characteristic function
$$E\big[e^{iuX_T^i}\big] = \Big(1-iu\kappa_i^{-1}\mu_i+\frac{\sigma_i^2u^2\kappa_i^{-1}}{2}\Big)^{-\kappa_iT}.$$
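The variance gamma characteristic function is simple to evaluate with complex arithmetic. The sketch below is illustrative: the function name and the parameter values are assumptions, not taken from the numerical experiment. It evaluates the characteristic function for one asset and recovers the mean of the process at time T from a central finite difference at zero, which for this formula should equal μT.

```python
def vg_char_fn(u, T, mu, sigma, kappa):
    """Characteristic function of the variance gamma variable X_T.

    Implements (1 - i*u*mu/kappa + sigma^2*u^2/(2*kappa)) ** (-kappa*T).
    """
    base = 1.0 - 1j * u * mu / kappa + sigma ** 2 * u ** 2 / (2.0 * kappa)
    return base ** (-kappa * T)

# Illustrative parameters (hypothetical values).
T, mu, sigma, kappa = 0.2, -0.2, 0.3, 10.0

# E[X_T] = -i * d/du of the characteristic function at u = 0.
h = 1e-4
mean_estimate = ((vg_char_fn(h, T, mu, sigma, kappa)
                  - vg_char_fn(-h, T, mu, sigma, kappa)) / (2j * h)).real
print(vg_char_fn(0.0, T, mu, sigma, kappa), mean_estimate)
```

For real arguments the base has real part at least 1, so the principal branch of the complex power used by Python is the appropriate one here.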
Therefore, the limiting behavior of the implied volatility in this model is also given by
(62). We see that the numerical illustration agrees well with the theoretical prediction.
Compared to the numerical example of Sect. 4, we see that the slope of the implied
volatility is steeper in the multidimensional VG model than in the copula model.
This happens because the multidimensional VG model introduces additional positive
dependence between the assets through the common time change process.
Proof of Lemma 1
$$F(t,w) = \max_{\lambda>0}\Big\{\theta t+\lambda w^\perp(\mathbf{1}+\mu t)-\frac{\lambda^2w^\perp Bw\,t}{2}\Big\},$$
where 1 stands for the n-dimensional vector with all elements equal to 1. Therefore,
$$\max_{w\in\Delta_n}F(t,w) = \max_{u\in\mathbb{R}^n_+}\widetilde{F}(t,u),$$
with
$$\widetilde{F}(t,u) = \Big\{\theta t+u^\perp(\mathbf{1}+\mu t)-\frac{u^\perp Bu\,t}{2}\Big\}.$$
Since for every t > 0, $\widetilde{F}(t,u)$ is strictly concave in u, there exists a unique $\bar u(t) \in \mathbb{R}^n_+$
with $\bar u(t) \ne 0$ such that $\widetilde{F}(t,\bar u) = \max_{u\in\mathbb{R}^n_+}\widetilde{F}(t,u)$. This in turn implies that there
exists a unique $\bar w(t)$ such that $F(t,\bar w) = \max_{w\in\Delta_n}F(t,w)$. It is also easy to see
that $\bar u(t)$ depends continuously on t.
Let $\bar f(t) = \widetilde{F}(t,\bar u(t))$. We would like to show that $\bar f$ is differentiable in t and
compute its derivative. ū(t) may be characterized as follows: for i = 1, . . . , n
Let I (t) denote the set of indices i ∈ {1, . . . , n} such that ū(t)i > 0, and, for a
vector x ∈ Rn , let x I (t) denote the subset of components of x with indices in I (t):
x I (t) = {xi : i ∈ I (t)}. Furthermore, let B I (t),I (t) denote the submatrix of the
covariance matrix, containing the elements bij with i ∈ I (t) and j ∈ I (t). Then, the
vector ū(t) satisfies
\[
\bar u(t)_{I(t)} = \frac{1}{t} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t)_{I(t)}, \qquad \bar u(t)_{\tilde I(t)} = 0,
\]
where the set $\tilde I(t)$ contains the indices i ∈ {1, . . . , n} which are not in I(t).
Now, fix t ∈ (0, ∞) and for t′ ∈ (0, ∞), define
\[
v(t')_{I(t)} = \frac{1}{t'} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t')_{I(t)}, \qquad v(t')_{\tilde I(t)} = 0.
\]
First, assume that for all i such that $\bar u(t)_i = 0$, either $[\mathbf{1} + \mu t - tB\bar u(t)]_i < 0$ (with
strict inequality) or
\[
[\mathbf{1} + \mu t' - t' B v(t')]_i = 0
\]
for all t′ ∈ (0, ∞). We shall call this Assumption 1. Then we can find δ > 0, such
that for every t′ ∈ (0, ∞) with |t − t′| < δ, v(t′) satisfies the characterization (64)
and (65). Therefore, $v(t') = \bar u(t')$. This means that
\[
\bar f(t') = \theta t' + \frac{1}{2t'} (\mathbf{1} + \mu t')^\perp_{I(t)} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t')_{I(t)}.
\]
Differentiating at t′ = t, we obtain
\[
\bar f'(t) = \theta - \frac{1}{2t^2} \mathbf{1}^\perp_{I(t)} B^{-1}_{I(t),I(t)} \mathbf{1}_{I(t)} + \frac{1}{2} \mu^\perp_{I(t)} B^{-1}_{I(t),I(t)} \mu_{I(t)} = \theta - \frac{1}{2t} \bar u(t)^\perp (\mathbf{1} - \mu t). \tag{66}
\]
Now assume that there exists at least one i such that $\bar u(t)_i = 0$ and $[\mathbf{1} + \mu t - tB\bar u(t)]_i = 0$,
or, equivalently,
\[
[\mathbf{1} + \mu t' - t' B v(t')]_i = 0
\]
with t′ = t. The case when the above equality holds for all t′ is covered by Assumption 1.
Since the left-hand side is linear in t′, this means that for a given index set
I(t) and for a given i, there exists only one t′ ∈ (0, ∞) which satisfies the above
equality. Since the number of possible index sets is finite, we conclude that there is
at most a finite number of elements t ∈ (0, ∞) which do not satisfy Assumption
1. But then, we can conclude by continuity that $\bar f$ is strictly convex (which entails
uniqueness of $\bar t$) and differentiable for all t ∈ (0, ∞), with the derivative given by
(66) or alternatively by
\[
\bar f'(t) = \theta - \frac{1}{2t^2\, \bar w(t)^\perp B \bar w(t)} + \frac{(\bar w(t)^\perp \mu)^2}{2\, \bar w(t)^\perp B \bar w(t)}.
\]
Comparing this with the derivative of f, which is easily computed, we see that at the
point $\bar t$, these derivatives coincide. Since this point is characterized by the first order
condition $\bar f'(\bar t) = 0$, and the function f is strictly convex, f also attains its unique
minimum at $\bar t$.
References
1. Albin, J.M.P., Sundén, M.: On the asymptotic behaviour of Lévy processes, part I: subexpo-
nential and exponential processes. Stoch. Process. Appl. 119, 281–304 (2009)
2. Andersen, L., Lipton, A.: Asymptotics for exponential Lévy processes and their volatility smile:
survey and new results. Int. J. Theor. Appl. Financ. 16, 1350001-1–1350001-98 (2013)
3. Asmussen, S., Rojas-Nandayapa, L.: Asymptotics of sums of lognormal random variables with
Gaussian copula. Stat. Probab. Lett. 78, 2709–2714 (2008)
4. d’Aspremont, A.: Interest rate model calibration using semidefinite programming. Appl. Math.
Financ. 10, 183–213 (2003)
5. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstruction of volatility: pricing index
options using the steepest-descent approximation. Risk Mag. 15, 87–91 (2002)
6. Barndorff-Nielsen, O.: Processes of normal inverse Gaussian type. Financ. Stoch. 2, 41–68
(1998)
7. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol baskets.
Commun. Pure Appl. Math. 67, 1618–1657 (2014)
8. Benaim, S., Friz, P.: Smile asymptotics II: models with known MGF. J. Appl. Probab. 45, 16–32
(2008)
9. Benaim, S., Friz, P.: Regular variation and smile asymptotics. Math. Financ. 19, 1–12 (2009)
10. Benhamou, E., Gobet, E., Miri, M.: Smart expansion and fast calibration for jump diffusions.
Financ. Stoch. 13, 563–589 (2009)
11. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
12. Cherubini, U., Luciano, E., Vecchiato, W.: Copula Methods in Finance. Wiley, Chichester
(2004)
13. Cont, R., Deguest, R.: Equity correlations implied by index options: estimation and model
uncertainty analysis. Math. Financ. 23, 496–530 (2013)
14. De Marco, S., Hillairet, C., Jacquier, A.: Shapes of implied volatility with positive mass at zero
(2013). arXiv:1310.1020
15. Eberlein, E., Madan, D.B.: On correlating Lévy processes. J. Risk 13, 3–16 (2010)
16. Figueroa-López, J., Forde, M.: The small-maturity smile for exponential Lévy models. SIAM
J. Financ. Math. 3, 33–65 (2012)
17. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
Int. J. Theor. Appl. Financ. 12, 861–876 (2009)
18. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15,
755–780 (2011)
19. Gao, K., Lee, R.: Asymptotics of implied volatility to arbitrary order. Financ. Stoch. 18, 349–
392 (2014)
20. Gao, X., Xu, H., Ye, D.: Asymptotic behavior of tail density for sum of correlated lognormal
variables. Int. J. Math. Math. Sci. 2009, p. 28 (2009)
21. Gobet, E., Miri, M.: Time dependent Heston model. SIAM J. Financ. Math. 1, 289 (2010)
22. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1, 609–641 (2010)
23. Gulisashvili, A.: Asymptotic equivalence in Lee’s moment formulas for the implied volatility,
asset price models without moment explosions, and Piterbarg’s conjecture. Int. J. Theor. Appl.
Financ. 15, 1250020 (2012)
24. Gulisashvili, A.: Analytically Tractable Stochastic Stock Price Models. Springer, Berlin (2012)
25. Gulisashvili, A.: Left-wing asymptotics of the implied volatility in the presence of atoms. Int.
J. Theor. Appl. Finan. 18(2) (2015)
26. Gulisashvili, A., Stein, E.M.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61, 287–315 (2010)
27. Gulisashvili, A., Tankov, P.: Tail behavior of sums and differences of log-normal random
variables. Bernoulli (to appear)
28. Hashorva, E., Hüsler, J.: On multivariate Gaussian tails. Ann. Inst. Stat. Math. 55, 507–522
(2003)
29. Jourdain, B., Sbai, M.: Coupling index and stocks. Quant. Financ. 12, 805–818 (2012)
30. Lee, R.: The moment formula for implied volatility at extreme strikes. Math. Financ. 14, 469–
480 (2004)
31. Lewis, A.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach (2000)
32. Luciano, E., Schoutens, W.: A multivariate jump-driven financial asset model. Quant. Financ.
6, 385–402 (2006)
33. Medvedev, A., Scaillet, O.: Approximation and calibration of short-term implied volatilities
under jump-diffusion stochastic volatility. Rev. Financ. Stud. 20, 427–459 (2007)
34. Mijatović, A., Tankov, P.: A new look at short-term implied volatility in asset price models
with jumps. Math. Financ., to appear
35. Nelsen, R.: An Introduction to Copulas. Springer, New York (1999)
36. Prause, K.: The generalized hyperbolic model: estimation, financial derivatives, and risk mea-
sures, Ph.D. thesis, University of Freiburg (1999)
37. Schoenmakers, J.: Robust Libor Modelling and Pricing of Derivative Products. CRC Press,
Boca Raton (2005)
38. Tankov, P.: Pricing and Hedging in Exponential Lévy Models: Review of Recent Results.
Paris-Princeton Lectures on Mathematical Finance. Springer, Berlin (2010)
39. Tankov, P.: Large deviation asymptotics for the left tail of the sum of dependent positive random
variables (2014). arXiv:1402.4683
40. Tehranchi, M.R.: Asymptotics of implied volatility far from maturity. J. Appl. Probab. 46,
629–650 (2009)
41. Tehranchi, M.R.: Uniform bounds for Black-Scholes implied volatility, Pre-print (2014)
Small-Time Asymptotics for the At-the-Money Implied Volatility in a Multi-dimensional Local
Volatility Model
To the memory of Peter Laurence, who passed away unexpectedly during the final stage of the
preparation of this manuscript.
C. Bayer
Weierstrass Institute, Mohrenstrasse 39, 10117 Berlin, Germany
e-mail: [email protected]
P. Laurence
Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street,
New York, NY 10012, USA
Dipartimento di Matematica, Università di Roma 1, Piazzale Aldo Moro 2,
00185 Rome, Italy
1 Introduction
For a local volatility type model for a basket of stocks, whose forward prices are
given by
\[
dF_i(t) = \sigma_i(F_i(t))\, dW_i(t), \qquad i = 1, \dots, n,
\]
with Brownian motions correlated through $d\langle W_i, W_j\rangle_t = \rho_{ij}\, dt$, we consider options on the
basket with payoff
\[
\Big(\sum_{i=1}^n w_i F_i(T) - K\Big)^+
\]
and price $C_B(\mathbf{F}_0, K, T)$,
where we generally denote in bold face a vector of the corresponding italic com-
ponents, as in F = (F1 , . . . , Fn ). Since we only assume that at least one of the
weights w1 , . . . , wn is positive, we will refer to options of that type as generalized
spread options.
The purpose of this paper is to provide an explicit first order accurate short time
expansion of the price CB (F0 , K , T ) of the above option using the heat kernel ex-
pansion technique (see, for instance, [12, 13, 20]) when the option is at the money.
Moreover, from the asymptotic formula for the option price we also obtain an asymp-
totic formula for the implied and for the local volatility.1 Thereby we complement
the results obtained in [5], where a first order accurate asymptotic formula was given
when the option is not at the money. (The zero order accurate formula is well-known,
see, for instance [1]. When the option is not at the money, alternative first order ac-
curate results can be found in [12].) Such asymptotic formulas are highly relevant,
in particular when the dimension of the model is high (say n > 3), since then tradi-
tional (simulation or PDE) techniques to compute CB fail or are at least very time
consuming. In fact, for a wide range of different parameters, [5] show numerically
that their asymptotic formula is remarkably close to the true price as given by the
model, even for not so small maturities T (like 5 or even 10 years), for dimensions of
up to n = 100 (or even more). The same holds true when the option is at the money,
see Sect. 6.
We now sketch the procedure for deriving the asymptotic formulas, highlighting
the differences to the non-ATM case.
• In the first step, we derive a Carr-Jarrow formula for the basket option price,
separating the price into the intrinsic value of the option (which vanishes in the
ATM case) and an integral over the arrival manifold $\{\sum_i w_i F_i = K\}$ with respect
to the transition density $p(\mathbf{F}_0, \mathbf{F}, T)$. This is done in Sect. 2.
• The first terms in the heat kernel expansion of p(F0 , F, T ) are computed. In the
non-ATM case, a zero-order heat kernel expansion was sufficient to get first order
accurate formulas for the implied volatilities. At the money, we actually need to
add one additional term in the heat kernel expansion. The heat kernel coefficients
are computed in Lemma 3.6.
• The aforementioned integral on the arrival manifold is essentially an integral with
respect to the rapidly decaying kernel $\exp\big(-d(\mathbf{F}_0, \mathbf{F})^2/(2T)\big)$, where d denotes
the Riemannian (geodesic) distance induced by the stock price process. Hence,
the integral can be approximated using Laplace's expansion for T → 0, which
involves the minimizer $\mathbf{F}^*$ of $\mathbf{F} \mapsto d(\mathbf{F}_0, \mathbf{F})^2$ subject to $\sum_i w_i F_i = K$. In the
general case, this minimizer has to be computed numerically, while it is obviously
given by $\mathbf{F}^* = \mathbf{F}_0$ when the option is at the money. On the other hand, the formulas
are much longer and more complex due to the higher order heat kernel expansion
used, see Proposition 3.4 together with Lemmas 3.3 and 3.7.
1 Since we consider spread options here (for which $\sum_i w_i F_{0,i}$ may be negative), we derive implied
volatilities both in the Black-Scholes and in the Bachelier sense.
• In Sect. 4, we use the same Laplace’s expansion technique to derive the local
volatility of the basket, see Proposition 4.1.
• Finally, in Sect. 5, an asymptotic expansion for the implied volatilities is computed
by a comparison of coefficients between the asymptotic expansion of the basket
price derived in Proposition 5.1 and asymptotic expansions of the Black-Scholes
and Bachelier formulas, respectively, see Eqs. (5.2)–(5.4).
An alternative way to derive the asymptotic expansion for at-the-money options
would be to start from the non-at-the-money formulas and pass to the limit. This
would involve indeterminate expressions of the form “0/0”, which would need to be
resolved by l'Hôpital's rule. In particular, we would have to compute limits of derivatives of the
optimal configuration, which are not known in closed form when the option is not
at-the-money. Still, one could follow that approach using similar techniques as in
[4], but the derivation would hardly be any simpler than directly starting from scratch
again (the course of action chosen in this article).
In Sect. 6 we present numerical examples for one particular choice of a local
volatility model, namely the CEV model, corresponding to $\sigma_i(F_i) = \xi_i F_i^{\beta_i}$,
$0 \le \beta_i \le 1$, $1 \le i \le n$. The numerical observations support the claimed accuracy
of the asymptotic price formulas. In fact, comparisons with highly accurate reference
solutions show that the asymptotic formulas indeed have the suggested rates of convergence
as T → 0. Even more, they indicate that the formulas, in particular the first
order formula, are highly accurate even for large maturities such as T = 10 years,
thereby confirming the observations in [5].
Remark 1.1 In the same spirit as [5], the aim of this paper is to give informal deriva-
tions of fast and accurate asymptotic formulas. Indeed, there are several steps, in
which our derivations are not fully rigorous. In particular, most local volatility mod-
els (like the CEV model) exhibit singular behaviour at the boundary of the domain
Rn+ which can inhibit the validity of the heat kernel expansion, and, a fortiori, also
the Laplace expansion applied later. It is clearly possible to rigorously justify both
expansions under appropriate (uniform) ellipticity assumptions (see, for instance,
[20] for the validity of the heat kernel expansion and [6] for a rigorous version of
Laplace’s expansion). An extension to general or some specific local volatility mod-
els, however, seems to be a difficult task, see also the comments in [5, Sect. 4], and,
in particular, the results of [2]. Thus, we believe that a more “hands-on” approach
can be justified in this particular case. For related problems see also [3, 8, 9].
\[
d\Big(\sum_{i=1}^n w_i F_i(t)\Big) = \sum_{i=1}^n w_i \sigma_i(F_i(t))\, dW_i(t)
= \sqrt{\underbrace{\sum_{i,j=1}^n w_i w_j \sigma_i(F_i(t)) \sigma_j(F_j(t)) \rho_{ij}}_{=:\, \sigma_{N,B}^2}}\; d\bar W(t),
\]
for a new Brownian motion $\bar W$. Here we have used the notation $\sigma_{N,B}$ to indicate the
“normal volatility” of the basket which must not be confused with the lognormal
(Black) volatility $\sigma_B = \sigma_{N,B} / \sum_{i=1}^n w_i F_i$ used in reference [1]. Therefore, by the Itô-Tanaka
formula we have
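\[
d\Big(\sum_{i=1}^n w_i F_i(t) - K\Big)^+ = \sum_{i=1}^n w_i \mathbf{1}_{\{\sum_j w_j F_j(t) > K\}}\, dF_i(t)
+ \frac{1}{2}\, \delta_{\{\mathbf{F} :\, \sum_j w_j F_j(t) = K\}}\, \sigma_{N,B}^2(\mathbf{F}(t))\, dt.
\]
Integrating we obtain
\[
\Big(\sum_{i=1}^n w_i F_i(T) - K\Big)^+ = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \sum_{i=1}^n \int_0^T w_i \mathbf{1}_{\{\sum_j w_j F_j(u) > K\}}\, dF_i(u)
+ \frac{1}{2} \int_0^T \delta_{\{\mathbf{F}(u) :\, \sum_j w_j F_j(u) = K\}}\, \sigma_{N,B}^2(\mathbf{F}(u))\, du.
\]
Letting $E_K = \{\mathbf{F} \in \mathbb{R}^n_+ : \sum_i w_i F_i = K\}$ and taking conditional expectations with
respect to the filtration $\mathcal{F}_0$ at time 0, we obtain, assuming $F_i(t)$ is a martingale for
each i (see footnote 2):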
\[
C_B(\mathbf{F}_0, K, T) = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+ + \frac{1}{2} \int_0^T E\big[\sigma_{N,B}^2\, \delta_{E_K}(B_t)\big]\, dt.
\]
2 In many cases of interest, $F_i(t)$ is only a local martingale and not a martingale. But the discrepancy
is not “felt” for short times, since the set of paths that can reach the boundary has small probability
in this limit. This is known as the principle of “not feeling the boundary” for small times and is borne
out by our numerical results. More surprisingly, the boundary is not felt even for quite large times.
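Letting $u(\mathbf{F}) := \sum_{i=1}^n w_i F_i$ and recalling $|\nabla u| = |w|$ (where | · | denotes the
Euclidean norm) we can re-express this as
\[
C_B(\mathbf{F}_0, K, T) = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \frac{1}{2} \frac{1}{|w|} \int_0^T \int_{\mathbb{R}^n} |\nabla u(\mathbf{F})|\, \sigma_{N,B}^2(\mathbf{F})\, \delta_0(u(\mathbf{F}) - K)\, p(\mathbf{F}_0, \mathbf{F}, t)\, d\mathbf{F}\, dt.
\]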
Using the formula for the basket’s local volatility, [1, 12], expressed in the notation
introduced above, after canceling common factors we also have the
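Proposition 2.2 The local volatility of the basket option is given by:
\[
\sigma_{loc}^2(K, T)\, K^2 = \frac{\displaystyle\int_{E(K)} \sum_{i,j=1}^n w_i w_j \sigma_i(F_i) \sigma_j(F_j) \rho_{ij}\, p(\mathbf{F}_0, \mathbf{F}, T)\, H_{n-1}(d\mathbf{F})}{\displaystyle\int_{E(K)} p(\mathbf{F}_0, \mathbf{F}, T)\, H_{n-1}(d\mathbf{F})}.
\]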
The starting points are the basket Carr-Jarrow formula of Proposition 2.1 for the calculation
of the option prices and Proposition 2.2 for the calculation of the local
volatilities. The next step is to approximate the transition density there using the heat
kernel. For reasons that will become clear in the course of the asymptotics, it will be
necessary to use the so-called geometric expansion
\[
p(\mathbf{F}_0, \mathbf{F}, t) = \frac{1}{(2\pi t)^{n/2}} \sqrt{\det g(\mathbf{F})}\; e^{-\frac{d^2(\mathbf{F}_0, \mathbf{F})}{2t}} \big(u_0(\mathbf{F}_0, \mathbf{F}) + t\, u_1(\mathbf{F}_0, \mathbf{F}) + o(t)\big). \tag{3.1}
\]
For a detailed exposition of the geometrical underpinning of (3.1) we refer to [5, 12,
13, 17, 20]. Here, we just give a very quick reminder.
Remark 3.1 We shall assume that the process Ft is locally elliptic in the sense that
ρ is invertible and σi (Fi ) > 0 for any Fi > 0 and any i, i.e., for F in the interior of
the domain of the process Ft . A rigorous heat kernel expansion for locally elliptic
diffusions is given in [2], with the restriction that the expansion is only valid for
compact subsets of Rn+ .
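The state space $\mathbb{R}^n_+$ is equipped with a Riemannian metric by defining the inverse
$g^{-1}$ of the metric tensor by
\[
g^{ij}(\mathbf{F}) = \sigma_i(F_i)\, \rho_{ij}\, \sigma_j(F_j), \qquad \text{i.e.,} \qquad g_{ij}(\mathbf{F}) = \frac{\rho^{ij}}{\sigma_i(F_i)\, \sigma_j(F_j)},
\]
with determinant
\[
\det g(\mathbf{F}) = \det(\rho^{-1}) \prod_{k=1}^n \sigma_k(F_k)^{-2}
\]
(where $\rho^{ij}$ denotes the (i, j)-component of the inverse matrix $\rho^{-1}$ of the correlation
matrix ρ). The (geodesic) distance between two points $\mathbf{F}_0$ and $\mathbf{F}$ is denoted
by $d(\mathbf{F}_0, \mathbf{F})$.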
The specific form of these quantities in the setting of local volatility models has
no relevance in our initial asymptotic derivations, which can be obtained for generic
versions of these. So, to lighten the notation and streamline the presentation, we
first derive the asymptotic expansions without any specific reference to these and
then plug in the specific form only at the end of the process in order to produce the
required concrete asymptotic expansions.
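Plugging the heat kernel expansion (3.1) into the expressions in Propositions 2.1
and 2.2, respectively, we see that we have to compute expressions of the form
\[
\frac{1}{(2\pi t)^{n/2}} \int_{E_K} \Psi(\mathbf{F}) \exp\Big(-\frac{d(\mathbf{F}_0, \mathbf{F})^2}{2t}\Big) H_{n-1}(d\mathbf{F}), \tag{3.2}
\]
where
\[
\Psi(\mathbf{F}) = \bar u_i(\mathbf{F}_0, \mathbf{F}) := \sqrt{\det g(\mathbf{F})}\, \sigma_{N,B}^2(\mathbf{F})\, u_i(\mathbf{F}_0, \mathbf{F}), \qquad i = 0, 1, \tag{3.3}
\]
for the option price and for the numerator in Proposition 2.2 and
\[
\Psi(\mathbf{F}) = \hat u_i(\mathbf{F}_0, \mathbf{F}) := \sqrt{\det g(\mathbf{F})}\, u_i(\mathbf{F}_0, \mathbf{F}), \qquad i = 0, 1, \tag{3.4}
\]
for the denominator in Proposition 2.2.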
Denoting
Note that the set $G_K$ is introduced in order to ensure that $F_n$ in (3.5) is non-negative,
as it needs to be. The set $E_K$ is an (n − 1)-dimensional hyperplane in $\mathbb{R}^n_+$.
Note that, when we parametrize the hyperplane $E_K$ using $(F_1, \dots, F_{n-1})$, as
in (3.5), we will always assume that the weight multiplying $F_n$ is positive. This can always
be achieved by choosing as the nth asset one of the assets with a positive weight.
Then for the surface measure, we have
\[
dH_{n-1} = \sqrt{1 + |\nabla F_n|^2}\; dF_1 \cdots dF_{n-1} = \frac{|w|}{|w_n|}\, dF_1 \cdots dF_{n-1}.
\]
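In this notation, with $\Phi = \frac{d^2}{2}$, the integral (3.2) reads
\[
\frac{1}{(2\pi t)^{n/2}} \frac{|w|}{|w_n|} \int_{G_K} e^{-\frac{\Phi(\mathbf{F}_0, \mathbf{F}_K(\mathbf{G}))}{t}}\, \Psi(\mathbf{F}_K(\mathbf{G}))\, dF_1 \cdots dF_{n-1}
= \frac{1}{(2\pi t)^{n/2}} \frac{|w|}{|w_n|} \int_{G_K} e^{-\frac{\Phi(\mathbf{G})}{t}}\, \Psi(\mathbf{G})\, d\mathbf{G}, \tag{3.6}
\]
using the notation $\Phi(\mathbf{G}) := \Phi(\mathbf{F}_0, \mathbf{F}_K(\mathbf{G}))$ and (by abuse of notation) $\Psi(\mathbf{G}) :=
\Psi(\mathbf{F}_K(\mathbf{G}))$. We now use Laplace asymptotics for multiple integrals. The main contribution
comes from a neighborhood of the minimum point.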
= d 2 (F0 , E K ).
Set F∗K = (G∗ , Fn (G∗ , K )). (Of course, when the option is at the money, we have
G∗ = (F0,1 , . . . , F0,n−1 ).)
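Order Zero. The zero-th order term in the Laplace expansion of
\[
\int_{G_K} e^{-\frac{\Phi(\mathbf{G})}{t}}\, \Psi(\mathbf{G})\, d\mathbf{G}
\]
is identical to the one in [5] except that in the present setting we have $d(\mathbf{F}_0, \mathbf{F}^*_K) = 0$.
We get, as in [5]
\[
t^{\frac{n-1}{2}}\, \Psi(\mathbf{G}^*) \int_{\mathbb{R}^{n-1}} e^{-\frac{\mathbf{z}^T Q \mathbf{z}}{2}}\, d\mathbf{z} = t^{\frac{n-1}{2}}\, \Psi(\mathbf{G}^*)\, \frac{(2\pi)^{\frac{n-1}{2}}}{(\det Q)^{\frac{1}{2}}},
\]
where $Q := D^2\Phi(\mathbf{G}^*)$ denotes the Hessian of Φ at $\mathbf{G}^*$. Together with the prefactor in (3.6),
this gives the zeroth order term
\[
h_0^\Psi := \frac{|w|}{|w_n|} \frac{1}{\sqrt{2\pi t \det Q}}\, \Psi(\mathbf{F}_0). \tag{3.8}
\]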
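Order One. For obtaining first order implied or local volatility terms in the ATM
regime, we need to push the Laplace expansion one step further, i.e., we need one
additional term for
\[
\int_{G_K} e^{-\frac{\Phi(\mathbf{G})}{t}}\, \Psi(\mathbf{G})\, d\mathbf{G}.
\]
Hence, we apply the (multi-variate) Taylor expansion for $\Phi(\mathbf{G}) := \Phi(\mathbf{F}_0, \mathbf{F}_K(\mathbf{G}))$
up to order 4 around the minimizer $\mathbf{G}^*$, which can be expressed in tensor notation
as
\[
\Phi(\mathbf{G}) = \Phi(\mathbf{G}^*) + \underbrace{D\Phi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*)}_{=0} + \frac{1}{2} D^2\Phi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*)^{\otimes 2}
+ \frac{1}{6} D^3\Phi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*)^{\otimes 3} + \frac{1}{24} D^4\Phi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*)^{\otimes 4} + \cdots,
\]
with
\[
D^k\Phi(\mathbf{x})\, \mathbf{y}^{\otimes k} := \sum_{i_1, \dots, i_k} \frac{\partial^k \Phi}{\partial x_{i_1} \cdots \partial x_{i_k}}(\mathbf{x})\, y_{i_1} \cdots y_{i_k},
\]
and, similarly,
\[
\Psi(\mathbf{G}) = \Psi(\mathbf{G}^*) + D\Psi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*) + \frac{1}{2} D^2\Psi(\mathbf{G}^*)(\mathbf{G} - \mathbf{G}^*)^{\otimes 2} + \cdots.
\]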
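In the end, we are interested in small-time asymptotics, so we change variables
\[
\mathbf{z} := \frac{1}{\sqrt{t}} (\mathbf{G} - \mathbf{G}^*),
\]
which gives
\[
\frac{1}{t} \Phi(\mathbf{G}) = \frac{1}{t} \Phi(\mathbf{G}^*) + \frac{1}{2} D^2\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 2} + \frac{1}{6} D^3\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 3} \sqrt{t}
+ \frac{1}{24} D^4\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 4}\, t + o(t),
\]
and
\[
\Psi(\mathbf{G}) = \Psi(\mathbf{G}^*) + D\Psi(\mathbf{G}^*) \mathbf{z} \sqrt{t} + \frac{1}{2} D^2\Psi(\mathbf{G}^*) \mathbf{z}^{\otimes 2}\, t + o(t).
\]
Using the above Taylor expansions, the change of variables, and
\[
e^{a\sqrt{t} + bt} = 1 + a\sqrt{t} + \Big(\frac{a^2}{2} + b\Big) t + o(t),
\]
we obtain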
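\[
\int_{G_K} e^{-\frac{\Phi(\mathbf{F}_0, \mathbf{F}_K(\mathbf{G}))}{t}}\, \Psi(\mathbf{G})\, d\mathbf{G}
= t^{(n-1)/2}\, e^{-\Phi(\mathbf{G}^*)/t} \int_{(G_K - \mathbf{G}^*)/\sqrt{t}} e^{-\frac{1}{2} D^2\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 2}}
\]
\[
\times \Big[ 1 - \frac{1}{6} D^3\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 3} \sqrt{t} + \Big( \frac{1}{2} \Big(\frac{1}{6} D^3\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 3}\Big)^2 - \frac{1}{24} D^4\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 4} \Big) t + o(t) \Big]
\]
\[
\times \Big[ \Psi(\mathbf{G}^*) + D\Psi(\mathbf{G}^*) \mathbf{z} \sqrt{t} + \frac{1}{2} D^2\Psi(\mathbf{G}^*) \mathbf{z}^{\otimes 2}\, t + o(t) \Big]\, d\mathbf{z}. \tag{3.9}
\]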
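with $h_0^\Psi$ defined in (3.8) and
\[
h_1^\Psi := \frac{|w|}{|w_n|} \frac{t^{(n-1)/2}}{(2\pi t)^{n/2}} \int_{\mathbb{R}^{n-1}} e^{-\frac{1}{2} \mathbf{z}^T Q \mathbf{z}} \Big[ \frac{1}{2} D^2\Psi(\mathbf{G}^*) \mathbf{z}^{\otimes 2} - \frac{1}{6} D^3\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 3} \times D\Psi(\mathbf{G}^*) \mathbf{z}
\]
\[
+ \frac{1}{2} \Big(\frac{1}{6} D^3\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 3}\Big)^2 \Psi(\mathbf{G}^*) - \frac{1}{24} D^4\Phi(\mathbf{G}^*) \mathbf{z}^{\otimes 4}\, \Psi(\mathbf{G}^*) \Big]\, d\mathbf{z}. \tag{3.11}
\]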
Here we assume that Ψ is polynomially bounded and $\mathbf{F}_0 > 0$ (i.e., all components
of $\mathbf{F}_0$ are strictly positive). Indeed, under these assumptions we observe that the error
in the approximation (3.10) decays, in fact, like $e^{-1/t}$ by properties of the normal
distribution.
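The first order Laplace expansion can be checked numerically in one dimension. The sketch below is only an illustration under stated assumptions: the phase `phi` and amplitude `psi` are hypothetical test functions (not taken from the model), the minimizer is at 0 with phase value 0, and the first order bracket is the one-dimensional specialization of the integral in (3.11), where the Gaussian moments are $E[z^2] = 1/Q$, $E[z^4] = 3/Q^2$ and $E[z^6] = 15/Q^3$.

```python
import math


def phi(x):
    # Hypothetical phase with a unique global minimum at x = 0 and phi(0) = 0.
    return x ** 2 / 2 + x ** 3 / 3 + x ** 4 / 4


def psi(x):
    # Hypothetical polynomially bounded amplitude.
    return 1 + x + x ** 2


def laplace_integral(t, half_width=1.0, n=20000):
    """Trapezoidal approximation of the integral of exp(-phi/t) * psi over [-half_width, half_width]."""
    h = 2 * half_width / n
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-phi(x) / t) * psi(x)
    return total * h


def laplace_orders(t):
    """Zeroth and first order Laplace approximations around the minimizer x = 0."""
    q, d3, d4 = 1.0, 2.0, 6.0   # phi''(0), phi'''(0), phi''''(0)
    p0, p1, p2 = 1.0, 1.0, 2.0  # psi(0), psi'(0), psi''(0)
    zero = math.sqrt(2 * math.pi * t / q) * p0
    bracket = (p2 / (2 * q)
               - d3 * p1 * 3 / (6 * q ** 2)
               + p0 * d3 ** 2 * 15 / (72 * q ** 3)
               - p0 * d4 * 3 / (24 * q ** 2))
    first = zero + t * math.sqrt(2 * math.pi * t / q) * bracket
    return zero, first
```

For small t the first order value should be visibly closer to the numerical integral than the zeroth order value, which is the effect exploited in the ATM expansion.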
Using Isserlis' Theorem (see [14]), the integral in (3.11) defining $h_1^\Psi$ can be computed
explicitly.
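Lemma 3.2 (Isserlis' theorem for fourth and sixth moments) For a covariance
matrix $\Sigma \in \mathbb{R}^{d \times d}$ let $T^2(\Sigma) \in (\mathbb{R}^d)^{\otimes 4}$ and $T^3(\Sigma) \in (\mathbb{R}^d)^{\otimes 6}$ be the tensors defined
by
\[
T^2(\Sigma)_{i_1, \dots, i_4} = \Sigma_{i_1 i_2} \Sigma_{i_3 i_4} + \Sigma_{i_1 i_3} \Sigma_{i_2 i_4} + \Sigma_{i_1 i_4} \Sigma_{i_2 i_3}
\]
and
\begin{align*}
T^3(\Sigma)_{i_1, \dots, i_6} ={}& \Sigma_{i_1 i_2} \Sigma_{i_3 i_4} \Sigma_{i_5 i_6} + \Sigma_{i_1 i_2} \Sigma_{i_3 i_5} \Sigma_{i_4 i_6} + \Sigma_{i_1 i_2} \Sigma_{i_3 i_6} \Sigma_{i_4 i_5} \\
&+ \Sigma_{i_1 i_3} \Sigma_{i_2 i_4} \Sigma_{i_5 i_6} + \Sigma_{i_1 i_3} \Sigma_{i_2 i_5} \Sigma_{i_4 i_6} + \Sigma_{i_1 i_3} \Sigma_{i_2 i_6} \Sigma_{i_4 i_5} \\
&+ \Sigma_{i_1 i_4} \Sigma_{i_2 i_3} \Sigma_{i_5 i_6} + \Sigma_{i_1 i_4} \Sigma_{i_2 i_5} \Sigma_{i_3 i_6} + \Sigma_{i_1 i_4} \Sigma_{i_2 i_6} \Sigma_{i_3 i_5} \\
&+ \Sigma_{i_1 i_5} \Sigma_{i_2 i_3} \Sigma_{i_4 i_6} + \Sigma_{i_1 i_5} \Sigma_{i_2 i_4} \Sigma_{i_3 i_6} + \Sigma_{i_1 i_5} \Sigma_{i_2 i_6} \Sigma_{i_3 i_4} \\
&+ \Sigma_{i_1 i_6} \Sigma_{i_2 i_3} \Sigma_{i_4 i_5} + \Sigma_{i_1 i_6} \Sigma_{i_2 i_4} \Sigma_{i_3 i_5} + \Sigma_{i_1 i_6} \Sigma_{i_2 i_5} \Sigma_{i_3 i_4},
\end{align*}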
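\[
h_1^\Psi = \frac{|w|}{|w_n|} \frac{1}{\sqrt{2\pi t \det Q}} \Big[ \frac{1}{2} D^2\Psi(\mathbf{G}^*)\, Q^{-1}
- \frac{1}{6} \sum_{i_1, \dots, i_4} (\partial_{i_1, i_2, i_3} \Phi)(\mathbf{G}^*)\, (\partial_{i_4} \Psi)(\mathbf{G}^*)\, T^2(Q^{-1})_{i_1, \dots, i_4}
\]
\[
+ \frac{1}{72} \Psi(\mathbf{G}^*) \sum_{i_1, \dots, i_6} \partial_{i_1, i_2, i_3} \Phi(\mathbf{G}^*)\, \partial_{i_4, i_5, i_6} \Phi(\mathbf{G}^*)\, T^3(Q^{-1})_{i_1, \dots, i_6}
- \frac{1}{24} \Psi(\mathbf{G}^*) \sum_{i_1, \dots, i_4} \partial_{i_1, \dots, i_4} \Phi(\mathbf{G}^*)\, T^2(Q^{-1})_{i_1, \dots, i_4} \Big].
\]
with $h_0^\Psi$ given in (3.8) and $h_1^\Psi$ given in Lemma 3.3.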
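The pairing structure behind the tensors $T^2$ and $T^3$ of Lemma 3.2 can be generated mechanically. The following sketch (the helper names `pairings` and `isserlis_moment` are hypothetical) enumerates all partitions of an index list into pairs and sums the corresponding covariance products, which by Isserlis' theorem gives the moment of a centred Gaussian vector.

```python
def pairings(indices):
    """Yield all partitions of `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def isserlis_moment(cov, idx):
    """E[Z_{i1} ... Z_{ik}] for a centred Gaussian vector Z with covariance `cov`.

    `idx` lists the component indices i1, ..., ik; an odd number of indices
    produces no pairing and hence the value 0.
    """
    total = 0.0
    for pairing in pairings(list(range(len(idx)))):
        term = 1.0
        for a, b in pairing:
            term *= cov[idx[a]][idx[b]]
        total += term
    return total
```

Four indices produce the 3 products of $T^2$ and six indices the 15 products of $T^3$; evaluating with `cov` equal to $Q^{-1}$ yields the tensor entries that appear in the explicit formula for $h_1$.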
Remark 3.5 We note that the key assumptions of Proposition 3.4 are not easy to
verify in general. We refer to [5] for elements of a proof of the heat kernel expansion
and to [3] for further discussion.
The last ingredient needed for the asymptotic expansions of both implied and
local volatilities are the heat kernel coefficients u 0 and u 1 . As we are assuming the
options to be ATM, we only need the heat kernel coefficients on the diagonal.
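Lemma 3.6 For a local volatility model, we have the following formulas for the
heat kernel coefficients on the diagonal:
\[
u_0(\mathbf{F}, \mathbf{F}) = 1,
\]
\[
u_1(\mathbf{F}, \mathbf{F}) = \frac{1}{4} \sum_{i=1}^n \sigma_i(F_i)\, \sigma_i''(F_i) - \frac{1}{8} \sum_{i,j=1}^n \sigma_i'(F_i)\, \rho^{ij}\, \sigma_j'(F_j),
\]
where $\rho^{ij}$ denotes the components of $\rho^{-1}$.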
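Proof Note that the infinitesimal generator A of the process $\mathbf{F}(t)$ can be expressed
(using the summation convention) as
\[
A = \frac{1}{2} \Delta_g - \frac{1}{2} f^i(\mathbf{F}) \frac{\partial}{\partial F_i},
\]
where
\[
\Delta_g = \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial F_i} \Big( g^{ij} \sqrt{\det g}\, \frac{\partial}{\partial F_j} \Big)
\]
denotes the Laplace-Beltrami operator associated to g and the vector field f is given
by
\[
f^i(\mathbf{F}) = \sigma_i(F_i)\, \sigma_i'(F_i), \qquad i = 1, \dots, n.
\]
As indicated in (3.1), the transition density of the process $\mathbf{F}(t)$ satisfies (under certain
assumptions, see Proposition 3.4 and Remark 3.5)
\[
p(\mathbf{F}_0, \mathbf{F}, T) = \frac{1}{(2\pi T)^{n/2}} \sqrt{\det g(\mathbf{F})}\; e^{-\frac{d(\mathbf{F}_0, \mathbf{F})^2}{2T}} \big(u_0(\mathbf{F}_0, \mathbf{F}) + T u_1(\mathbf{F}_0, \mathbf{F}) + o(T)\big),
\]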
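where $d(\mathbf{F}_0, \mathbf{F})$ is the geodesic distance between $\mathbf{F}_0$ and $\mathbf{F}$ and $u_0$ and $u_1$ are the heat
kernel coefficients.
The order zero heat kernel coefficient is given by
$u_0(\mathbf{F}_0, \mathbf{F}) = \sqrt{\Delta(\mathbf{F}_0, \mathbf{F})}\; e^{-\frac{1}{2} \int_\gamma \langle f, \dot\gamma \rangle_g}$, where
$\int_\gamma \langle f, \dot\gamma \rangle_g$ is understood as integral along the geodesic γ joining
$\mathbf{F}_0$ and $\mathbf{F}$ and $\Delta(\mathbf{F}_0, \mathbf{F})$ is the Van Vleck–De Witt determinant,
\[
\Delta(\mathbf{F}_0, \mathbf{F}) = \frac{1}{\sqrt{\det g(\mathbf{F}_0) \det g(\mathbf{F})}} \det\Big( -\frac{1}{2} \frac{\partial^2 d^2(\mathbf{F}_0, \mathbf{F})}{\partial \mathbf{F}_0\, \partial \mathbf{F}} \Big).
\]
On the diagonal, we clearly have $\int_\gamma \langle f, \dot\gamma \rangle_g = 0$ and for any local volatility model we
have $\Delta(\mathbf{F}_0, \mathbf{F}) \equiv 1$, as the geometry is isomorphic to the Euclidean geometry by the
coordinate transformation $\mathbf{F} \to L\mathbf{y}$, where $L \rho L^T = \mathrm{Id}$ and $y_i := \int_0^{F_i} \sigma_i(u)^{-1}\, du$.
Hence, $u_0(\mathbf{F}, \mathbf{F}) = 1$.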
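For the first order heat kernel coefficient, we refer to [16, Eq. (4.1)], where it is
shown that
\[
u_1(\mathbf{F}, \mathbf{F}) = \frac{1}{6} \kappa + \frac{1}{4} \operatorname{div}_g f(\mathbf{F}) - \frac{1}{8} |f(\mathbf{F})|_g^2.
\]
Here, κ denotes the scalar curvature, which vanishes for local volatility models due
to the isomorphism with the Euclidean geometry already used above. (Note that [16]
consider the heat kernel corresponding to $\Delta_g + f$, whereas we consider the operator
$\frac{1}{2} \Delta_g + \frac{1}{2} f$. Hence, we evaluate the formula obtained in [16, Eq. (4.1)] at t/2 instead
of t.)
\[
\operatorname{div}_g f(\mathbf{F}) = \frac{1}{\sqrt{\det g(\mathbf{F})}} \frac{\partial}{\partial F_i} \Big( f^i(\mathbf{F}) \sqrt{\det g(\mathbf{F})} \Big) = \sigma_i(F_i)\, \sigma_i''(F_i),
\]
\[
|f(\mathbf{F})|_g^2 = g_{ij}(\mathbf{F})\, f^i(\mathbf{F})\, f^j(\mathbf{F}) = \sigma_i'(F_i)\, \rho^{ij}\, \sigma_j'(F_j).
\]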
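For the CEV specification $\sigma_i(F_i) = \xi_i F_i^{\beta_i}$ the diagonal coefficient $u_1(\mathbf{F}, \mathbf{F}) = \frac14 \operatorname{div}_g f - \frac18 |f|_g^2$ is elementary to evaluate. The sketch below is an illustration only; the names `invert` and `cev_u1` are hypothetical, and the correlation matrix is inverted by plain Gauss-Jordan elimination to keep the example free of dependencies.

```python
def invert(matrix):
    """Inverse of a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def cev_u1(F, xi, beta, rho):
    """u_1(F, F) = 1/4 sum_i sigma_i sigma_i'' - 1/8 sum_ij sigma_i' rho^{ij} sigma_j' for CEV volatilities."""
    n = len(F)
    s = [xi[i] * F[i] ** beta[i] for i in range(n)]
    d1 = [xi[i] * beta[i] * F[i] ** (beta[i] - 1) for i in range(n)]
    d2 = [xi[i] * beta[i] * (beta[i] - 1) * F[i] ** (beta[i] - 2) for i in range(n)]
    rho_inv = invert(rho)
    div_f = sum(s[i] * d2[i] for i in range(n))
    norm_f = sum(d1[i] * rho_inv[i][j] * d1[j] for i in range(n) for j in range(n))
    return div_f / 4 - norm_f / 8
```

In the one-dimensional lognormal case ($\beta = 1$) this reduces to $-\xi^2/8$, and it vanishes for constant volatility ($\beta = 0$).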
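The numerator in the right hand side of the formula in Proposition 2.2 is given by
\[
\frac{1}{(2\pi t)^{n/2}} \int_{E_K} \big(\bar u_0(\mathbf{F}_0, \mathbf{F}) + t\, \bar u_1(\mathbf{F}_0, \mathbf{F})\big) \exp\Big(-\frac{d(\mathbf{F}_0, \mathbf{F})^2}{2t}\Big) H_{n-1}(d\mathbf{F})
= h_0^{\bar u_0} + t \big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) + o(t),
\]
As
\[
\frac{a_1 + b_1 t + o(t)}{a_2 + b_2 t + o(t)} = \frac{a_1}{a_2} + \frac{a_2 b_1 - a_1 b_2}{a_2^2}\, t + o(t),
\]
we arrive at
As $\bar u_0 = \sigma_{N,B}^2\, \hat u_0$, we can easily simplify
\[
\frac{h_0^{\bar u_0}}{h_0^{\hat u_0}} = \sigma_{N,B}^2(\mathbf{F}_0).
\]
For the first order term, we note that all the terms $h_i^{\bar u_j}$ and $h_i^{\hat u_j}$ have the common
factor $\frac{|w|}{|w_n|} \frac{1}{\sqrt{2\pi T \det Q}}$, which, hence, cancels out in the first order term. In particular,
this implies that the “first order term” is really first order in T. Thus, we get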
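Proposition 4.1 For $K = \bar F_0 = \sum_{i=1}^n w_i F_{0,i}$, the basket local volatility has the
asymptotic expansion $\sigma_{loc}^2(T, K) = \sigma_{loc,0}^2(K) + \sigma_{loc,1}^2(K)\, T + o(T)$, with
\[
\sigma_{loc,0}^2(K) = \frac{\sigma_{N,B}^2(\mathbf{F}_0)}{K^2},
\]
\[
\sigma_{loc,1}^2(K) = \frac{h_0^{\hat u_0} \big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) - h_0^{\bar u_0} \big(h_1^{\hat u_0} + h_0^{\hat u_1}\big)}{\big(h_0^{\hat u_0}\big)^2 K^2}.
\]
\[
\sigma_{N,B}(\mathbf{F})^2 = \sum_{i,j=1}^n w_i \sigma_i(F_i)\, \rho_{ij}\, w_j \sigma_j(F_j).
\]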
5 Implied Volatility
The strategy for obtaining an asymptotic expansion for the implied volatility is as
follows: we first compute an asymptotic expansion of the basket option price in our
local volatility model, then we compare coefficients with the short time expansion
of the corresponding call option price in the Black-Scholes or Bachelier model,
respectively. Hence, we first apply our general asymptotic expansion obtained in
Proposition 3.4 to the Carr-Jarrow formula from Proposition 2.1, getting (for K = F 0 )
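Now we can insert these results back into Proposition 2.1, and we obtain
\[
C_B(\mathbf{F}_0, K, T) = \frac{1}{2|w|} \int_0^T \Big[ h_0^{\bar u_0} + t \big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) + o(\sqrt{t}) \Big]\, dt
\]
\[
= \frac{1}{2} \int_0^T \Big[ \frac{g_0^{\bar u_0}}{\sqrt{t}} + \sqrt{t}\, \big(g_1^{\bar u_0} + g_0^{\bar u_1}\big) + o(\sqrt{t}) \Big]\, dt
\]
\[
= g_0^{\bar u_0} \sqrt{T} + \frac{1}{3} \big(g_1^{\bar u_0} + g_0^{\bar u_1}\big)\, T^{3/2} + o\big(T^{3/2}\big),
\]
where
\[
g_i^{\bar u_j} := \frac{\sqrt{t}}{|w|}\, h_i^{\bar u_j}, \qquad i, j = 0, 1, \tag{5.1}
\]
is independent of t. Finally, using (3.8) together with (3.3), and Lemma 3.7, we get
\[
g_0^{\bar u_0} = \frac{\sigma_{N,B}^2(\mathbf{F}_0) \sqrt{\det g(\mathbf{F}_0)}}{|w_n| \sqrt{2\pi \det Q}} = \frac{\sigma_{N,B}(\mathbf{F}_0)}{\sqrt{2\pi}}.
\]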
Proposition 5.1 The expansion of the call prices (at-the-money) in drift-less local
volatility models is asymptotically equivalent, to first order, to
\[
C_B(\mathbf{F}_0, K, T) = \frac{\sigma_{N,B}(\mathbf{F}_0)}{\sqrt{2\pi}} \sqrt{T} + \frac{1}{3} \big(g_1^{\bar u_0} + g_0^{\bar u_1}\big)\, T^{3/2} + o(T^{3/2})
\]
as T → 0.
In the final step, we compute an expansion of the implied volatility with respect
to either the Black-Scholes or the Bachelier model. Let us consider the prices of call options
with stock price $\bar F_0 = \sum_{i=1}^n w_i F_{0,i} = K$ in the Black-Scholes and Bachelier models,
assuming that the respective volatilities are of the form $\sigma_{BS} = \sigma_{BS,0} + T \sigma_{BS,1}$ and
$\sigma_{Bach} = \sigma_{Bach,0} + T \sigma_{Bach,1}$. We obtain the well known formulas
\[
C_{BS}(\bar F_0, K, T) = C_{BS}(K, K, T) = \frac{K}{\sqrt{2\pi}}\, \sigma_{BS,0} \sqrt{T} + \frac{K}{\sqrt{2\pi}} \Big[ \sigma_{BS,1} - \frac{1}{24} \sigma_{BS,0}^3 \Big] T^{3/2} + o(T^{3/2}),
\]
\[
C_{Bach}(\bar F_0, K, T) = C_{Bach}(K, K, T) = \frac{K}{\sqrt{2\pi}}\, \sigma_{Bach,0} \sqrt{T} + \frac{K}{\sqrt{2\pi}}\, \sigma_{Bach,1}\, T^{3/2} + o(T^{3/2}).
\]
Despite being well-known, we recall the zeroth order implied volatility coefficients
and some of their properties. By comparison of coefficients, see Proposition 5.1 and
the above expansions for $C_{BS}$ and $C_{Bach}$, respectively, we find that
\[
\sigma_{BS,0} = \sigma_{Bach,0} = \frac{1}{|w_n| K}\, \bar u_0(\mathbf{F}_0, \mathbf{F}_0)\, (\det Q)^{-\frac{1}{2}} = \frac{\sigma_{N,B}(\mathbf{F}_0)}{\bar F_0}, \tag{5.2}
\]
where we also used $\bar F_0 = K$. Note, in particular, that the basket implied volatility
(5.2) can be interpreted as a weighted mean of the individual components' (ATM)
implied volatilities in the sense that
$(\sigma_{BS,0})^2 = \sum_{i,j=1}^n \rho_{ij}\, w_i \frac{F_{0,i}}{K} \sigma^i_{BS,0}\, w_j \frac{F_{0,j}}{K} \sigma^j_{BS,0}$.
Remark 5.2 The right hand side in Eq. (5.2) is nothing but the local volatility of the
basket $\sum_{i=1}^n w_i F_i$ at $\mathbf{F}_0$ in the Black-Scholes (i.e., log-normal) sense. Hence, we
have obtained that the zero order term in the small time expansion of the implied
volatility of the basket is equal to its local volatility when we consider an ATM option.
That result is not surprising in light of [11], where similar results were obtained (in
one-dimensional models). In this sense, one could even take (5.2) as an ex-post
justification of Lemma 3.7.
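The zeroth order coefficient (5.2) only requires the normal basket volatility at the initial forwards. A minimal sketch, assuming the local volatility functions are passed as Python callables; the names `normal_basket_vol` and `zero_order_implied_vol` are hypothetical.

```python
import math


def normal_basket_vol(F, w, sigma, rho):
    """sigma_{N,B}(F) = sqrt(sum_ij w_i sigma_i(F_i) rho_ij w_j sigma_j(F_j))."""
    n = len(F)
    s = [w[i] * sigma[i](F[i]) for i in range(n)]
    variance = sum(s[i] * rho[i][j] * s[j] for i in range(n) for j in range(n))
    return math.sqrt(variance)


def zero_order_implied_vol(F0, w, sigma, rho):
    """Zeroth order ATM implied volatility sigma_{N,B}(F_0) / sum_i w_i F_{0,i}, as in (5.2)."""
    basket = sum(wi * fi for wi, fi in zip(w, F0))
    return normal_basket_vol(F0, w, sigma, rho) / basket
```

For a single lognormal asset this returns the asset's own volatility; for several assets it agrees with the weighted-mean representation stated after (5.2). The formula presupposes a non-zero basket value, so for spread options with $\sum_i w_i F_{0,i}$ close to zero only the normal volatility is meaningful.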
The first order implied volatilities in the Black-Scholes and the Bachelier model do
not coincide any more. Indeed, we immediately have the first order correction term
in the Bachelier model
\[
\sigma_{Bach,1} = \frac{\sqrt{2\pi}}{3K} \big(g_1^{\bar u_0} + g_0^{\bar u_1}\big). \tag{5.3}
\]
On the other hand, for the Black-Scholes model we have
\[
\sigma_{BS,1} = \frac{\sqrt{2\pi}}{3K} \big(g_1^{\bar u_0} + g_0^{\bar u_1}\big) + \frac{\sigma_{BS,0}^3}{24} = \sigma_{Bach,1} + \frac{\sigma_{BS,0}^3}{24}, \tag{5.4}
\]
implying that implied volatility quoted in the Black-Scholes framework is strictly
larger than the implied volatility in the Bachelier framework up to first order; the
prices are, of course, equal up to first order.
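The ATM Black-Scholes expansion used in the comparison of coefficients can be verified against the closed-form price. The sketch below assumes zero rates, so that the ATM call on a forward K equals $K\,\mathrm{erf}\big(\sigma\sqrt{T}/(2\sqrt{2})\big)$; all function names are hypothetical.

```python
import math


def bs_atm_call(K, sigma, T):
    """Exact ATM Black-Scholes call on a forward (zero rates): K * (2 N(sigma sqrt(T) / 2) - 1)."""
    return K * math.erf(sigma * math.sqrt(T) / (2.0 * math.sqrt(2.0)))


def bs_atm_expansion(K, sigma0, sigma1, T):
    """Small-time expansion K/sqrt(2 pi) * (sigma0 sqrt(T) + (sigma1 - sigma0^3 / 24) T^{3/2})."""
    return K / math.sqrt(2.0 * math.pi) * (
        sigma0 * math.sqrt(T) + (sigma1 - sigma0 ** 3 / 24.0) * T ** 1.5)


def bs_first_order_from_bachelier(sigma0, sigma_bach1):
    """Relation (5.4): sigma_{BS,1} = sigma_{Bach,1} + sigma_{BS,0}^3 / 24."""
    return sigma_bach1 + sigma0 ** 3 / 24.0
```

With a constant volatility (first order coefficient zero) the expansion and the exact price differ only at order $T^{5/2}$, which is what the term $-\sigma_{BS,0}^3/24$ accounts for.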
6 Numerical Results
As in [5], we consider the CEV model for the numerical examples. The CEV model
is a special case of the general local volatility model considered so far, where the
local volatilities are given by
\[
\sigma_i(F_i) = \xi_i F_i^{\beta_i}, \qquad i = 1, \dots, n,
\]
for some parameters ξi ≥ 0 and βi > 0. In fact, the most realistic scenario here is
0 < βi ≤ 1. Note that we allow βi < 1/2, which implies degenerate densities of Ft
at the boundary.
Implementation of the zero order terms of the implied volatilities in either Black-
Scholes or Bachelier setting is, of course, easy using (5.2). On the other hand, the
formulas for σBS,1 and σBach,1 are much less straightforward to implement. While the
formulas in the ATM case are fully explicit (unlike in [5]) an efficient implementation
is much less trivial. The formula for $h_1$ in Lemma 3.3, for instance, depends on the
derivatives up to order four of the squared Riemannian distance at $\mathbf{F}_0$ and on the
Jacobi matrix of $\mathbf{F} \mapsto u_0(\mathbf{F}_0, \mathbf{F})$. Already the evaluation of the
(n − 1) × (n − 1) × (n − 1) × (n − 1) tensor of fourth derivatives can be very time-consuming, if a naive implementation is
used, which does not take into account that most derivatives actually vanish. But even
when more efficient implementations are used, the sheer size of the tensor may impose
limitations on the dimension of the problem. So far, we have implemented (3.11) in
Mathematica using symbolic differentiation of the squared Riemannian distance and
the zeroth order heat kernel coefficient u 0 , which works for small dimensions, up to
n = 5, say.
As in the paper [5], we compare the approximate prices against prices obtained
from sophisticated Monte Carlo simulation. Here, the CEV-SDE is discretized using
the Ninomiya-Victoir scheme [18], which is a second order weak approximation
scheme based on a splitting of the generator. Strictly speaking, the CEV process
violates the strong regularity assumption of that scheme, especially at the boundary
of the domain, but, as often in equity modelling, we do empirically observe second
order convergence for CEV-baskets, yet another beneficial effect of “not feeling
the boundary”. For variance reduction, we combine the discretization with the mean
value Monte Carlo method, see [19]. This is a variant of the control variate technique,
where a linear combination of one-dimensional geometrical Brownian motions is
230 C. Bayer and P. Laurence
used as control variate. More precisely, we freeze each component but one of the
basket, and replace the dynamics of the remaining basket by a corresponding Black-
Scholes dynamics. In the resulting model, the true option price can be explicitly
calculated. Finally, we choose a linear combination of those partially frozen models
so as to minimize the variance of the Monte Carlo estimator.
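The control variate mechanism described above can be illustrated with a minimal sketch. It is not the mean value Monte Carlo method of [19]: it uses a single control (a lognormal variable with known mean) instead of a linear combination of partially frozen models, and the toy payoff, the sample size and all names are assumptions made for illustration only.

```python
import math
import random
import statistics

def control_variate(samples, controls, control_mean):
    """Correct `samples` with one control variate, using the sample-optimal coefficient."""
    mean_y = statistics.fmean(samples)
    mean_c = statistics.fmean(controls)
    cov = sum((y - mean_y) * (c - mean_c)
              for y, c in zip(samples, controls)) / (len(samples) - 1)
    beta = cov / statistics.variance(controls)   # variance-minimizing coefficient
    corrected = [y - beta * (c - control_mean) for y, c in zip(samples, controls)]
    return statistics.fmean(corrected), statistics.variance(corrected)

# Toy setting (hypothetical): ATM call on a driftless lognormal asset with S0 = K = 1.
sigma = 0.2
rng = random.Random(0)
terminal = [math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2)
            for _ in range(20000)]
payoffs = [max(s - 1.0, 0.0) for s in terminal]

plain_mean = statistics.fmean(payoffs)
plain_var = statistics.variance(payoffs)
# Control: the terminal asset value itself, whose expectation is known to be 1.
cv_mean, cv_var = control_variate(payoffs, terminal, control_mean=1.0)

# Closed-form reference for this toy payoff: 2 N(sigma / 2) - 1.
exact = math.erf(sigma / (2.0 * math.sqrt(2.0)))
print(plain_mean, cv_mean, exact, cv_var / plain_var)
```

With several controls, the single coefficient becomes a vector obtained from a least-squares fit, which is the role played by the linear combination in the text.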
The expectation of the random variable obtained by combining the Ninomiya-
Victoir discretization of the CEV process and the mean value Monte Carlo method
is then approximated using Sobol numbers. In some sense, this contradicts the above
motivation for the variance reduction, but we do find empirically that the integration
error for a Quasi Monte Carlo estimator is also reduced by the variance reduction, i.e.,
the variance reduction also seems to reduce the number of most relevant dimensions
of the integration problem. Finally, we sacrifice some of the accuracy available by the
combination of the three techniques mentioned so far by introducing a random shift
of the Sobol numbers, i.e., we use the Randomized Quasi Monte Carlo technique,
see L’Ecuyer [15]. In this way, we can obtain reliable computable error bounds for
the integration error.
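The random-shift idea can be sketched in one dimension as follows. This is an illustration under simplifying assumptions, not the setup used for the reference values: a base-2 van der Corput sequence stands in for the Sobol numbers, the integrand is a toy function, and the error estimate is the standard error across independent shifts.

```python
import math
import random
import statistics

def van_der_corput(i, base=2):
    """i-th point of the van der Corput low-discrepancy sequence in [0, 1)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

def shifted_qmc(f, n_points, n_shifts, rng):
    """Randomly shifted QMC: mean of the per-shift estimates and its standard error."""
    estimates = []
    for _ in range(n_shifts):
        shift = rng.random()                      # one uniform shift per replication
        total = sum(f((van_der_corput(i) + shift) % 1.0) for i in range(n_points))
        estimates.append(total / n_points)
    std_err = statistics.stdev(estimates) / math.sqrt(n_shifts)
    return statistics.fmean(estimates), std_err

rng = random.Random(1)
estimate, std_err = shifted_qmc(lambda x: x * x, n_points=1024, n_shifts=16, rng=rng)
print(estimate, std_err)   # the integral of x^2 over [0, 1] is 1/3
```

Each shifted point set is still a low-discrepancy set, while the independent shifts make the per-shift estimates i.i.d., which is what yields a computable error bound.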
We compute the ATM price, i.e., the option price at K = 21, for maturities
T ∈ {0.5, 1, 2, 5, 10} years, which we compare with the zeroth and first order prices
in the corresponding Bachelier model. We also report σBach,0 = 0.1487036 and
σBach,1 = −6.72781 × 10−5 . Note that the “error bounds” reported in Tables 1
and 2 are upper estimates for the integration error (i.e., quasi Monte Carlo error) for
the reference values. Hence, numbers obtained from the first order approximation
formula are within the error bounds around the reference values.
In Fig. 1, we plot (linear interpolations of) the relative errors of the zeroth and
first order approximate pricing formulas close to the money (as obtained in [5]) and
compare them to the ATM-formulas represented by circles. We see that the accuracy
is extremely good in both cases, and that our approximation formulas for ATM CEV-
Small-Time Asymptotics for the At-the-Money Implied Volatility … 231
Table 1 Prices

Time   Price     0th order price   1st order price   Error bound
0.5    0.88073   0.88092           0.88072           2.43e-05
1      1.24525   1.24581           1.24524           4.63e-05
2      1.76023   1.76184           1.76024           8.90e-05
5      2.77895   2.78571           2.77941           3.21e-04
10     3.91968   3.93959           3.92176           5.92e-04

Error bounds given correspond to the (quasi) Monte Carlo error in the numerical scheme. The
discretization error is of higher order
basket options nicely interpolate the formulas available away from the money. Indeed,
deviations from the non-ATM values only appear at very small orders of magnitude
in the logarithmic scale of Fig. 1 (where the Monte Carlo error contained in the
reference values probably dominates). For the sake of completeness, Fig. 2 reports
the absolute errors of the respective asymptotic formulas over a wide range of strike
prices, indicating that the asymptotic formulas exhibit their worst quality ATM.
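The entries of Table 1 can be cross-checked with a short script. Two assumptions are made here and are not stated in the text: the first order implied volatility is taken to be σBach,0 + T σBach,1, and the ATM Bachelier price is taken to be 21 · σ · sqrt(T/(2π)), where the scale factor 21 (equal to the ATM strike) is inferred from the tabulated numbers.

```python
import math

SIGMA_BACH_0 = 0.1487036     # zeroth order Bachelier implied volatility, as reported
SIGMA_BACH_1 = -6.72781e-5   # first order coefficient, as reported
SCALE = 21.0                 # ASSUMPTION: inferred from Table 1, not stated in the text

# (maturity, reference price, 0th order price, 1st order price), copied from Table 1
TABLE_1 = [
    (0.5, 0.88073, 0.88092, 0.88072),
    (1.0, 1.24525, 1.24581, 1.24524),
    (2.0, 1.76023, 1.76184, 1.76024),
    (5.0, 2.77895, 2.78571, 2.77941),
    (10.0, 3.91968, 3.93959, 3.92176),
]

def bachelier_atm_price(sigma, maturity, scale=SCALE):
    """ATM price in a Bachelier model with (scaled) normal volatility sigma."""
    return scale * sigma * math.sqrt(maturity / (2.0 * math.pi))

reproduced = []
for maturity, reference, tab0, tab1 in TABLE_1:
    price0 = bachelier_atm_price(SIGMA_BACH_0, maturity)
    price1 = bachelier_atm_price(SIGMA_BACH_0 + maturity * SIGMA_BACH_1, maturity)
    reproduced.append((maturity, price0, price1))
    print(maturity, round(price0, 5), round(price1, 5),
          abs(tab0 - reference), abs(tab1 - reference))
```

The last two printed columns are the absolute deviations of the tabulated zeroth and first order prices from the reference price at each maturity.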
We present a proof of Lemma 3.7. Recall that we want to compute the determinant
of the Hessian Q of the map

$$ G \mapsto \frac{1}{2}\, d\big(F_0, (G, F_N(G, K))\big)^2 $$

evaluated at $G = (F_{0,1}, \ldots, F_{0,n-1})$. Let $S_i(x)$ denote the anti-derivative of $1/\sigma_i$
satisfying (for simplicity) $S_i(F_{0,i}) = 0$. Now consider the change of variables $F \to y$
with $y_i := S_i(F_i)$, $i = 1, \ldots, n$. As verified in [5], this transformation turns the
Riemannian geometry introduced above into an (almost) Euclidean geometry, with
[Fig. 1: relative error (logarithmic scale), with curves for T = 0.5, 1.0, 2.0, 5.0, 10.0]
This way, we understand the above map as a function ϕ(x) in the new (reduced) coordinates,
and obtain for the Hessian
[Fig. 2: absolute error (logarithmic scale) against the strike K, with curves for T = 0.5, 1.0, 2.0, 5.0, 10.0]
Fig. 2 Absolute errors. Solid lines correspond to prices obtained from (non-ATM) zeroth order
approximate formulas, dashed lines to (non-ATM) first order approximate formulas. The
corresponding ATM-approximate prices are represented by circles and other symbols. Note that the
option is ATM for K = 21
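$$ H_x \phi(0) = \left( \rho^{ij} - \rho^{in}\, \frac{w_j \sigma_j(F_{0,j})}{w_n \sigma_n(F_{0,n})} - \rho^{jn}\, \frac{w_i \sigma_i(F_{0,i})}{w_n \sigma_n(F_{0,n})} + \rho^{nn}\, \frac{w_i \sigma_i(F_{0,i})\, w_j \sigma_j(F_{0,j})}{w_n^2\, \sigma_n(F_{0,n})^2} \right)_{i,j=1}^{n-1}. $$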
From the structure of the above expression and the expression in Lemma 3.7, we see
that we may assume that $w_i = 1$, $i = 1, \ldots, n$, and $\sigma_n(F_{0,n}) = 1$. In this case, we
are left to prove that the determinant of the matrix

$$ A := \left( \rho^{ij} - \rho^{in} s_j - \rho^{jn} s_i + \rho^{nn} s_i s_j \right)_{i,j=1}^{n-1} $$

is equal to the expression $a := s^T \rho s / \det \rho$, where we used the short-hand notation
$s_i = \sigma_i(F_{0,i})$, $i = 1, \ldots, n-1$, and $s_n = 1$, and $s = (s_1, \ldots, s_n)$.
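Before turning to the proof, the claimed identity det A = sᵀρs / det ρ can be checked numerically. The sketch below reads the entries appearing in A as those of the inverse correlation matrix (an assumption about the notation); the 3 × 3 correlation matrix and the vector s are arbitrary illustrative values, and the small linear algebra helper is written out to keep the example self-contained.

```python
def det_and_inverse(matrix):
    """Gauss-Jordan elimination with partial pivoting; returns (determinant, inverse)."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det                       # a row swap flips the sign
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return det, [row[n:] for row in aug]

rho = [[1.0, 0.3, 0.2],
       [0.3, 1.0, 0.5],
       [0.2, 0.5, 1.0]]          # illustrative correlation matrix (hypothetical values)
s = [0.7, 1.3, 1.0]               # s_n = 1 as in the normalization above
n = len(rho)

det_rho, rho_inv = det_and_inverse(rho)
last = n - 1
A = [[rho_inv[i][j] - rho_inv[i][last] * s[j] - rho_inv[j][last] * s[i]
      + rho_inv[last][last] * s[i] * s[j]
      for j in range(n - 1)] for i in range(n - 1)]
det_A, _ = det_and_inverse(A)

quadratic_form = sum(s[i] * rho[i][j] * s[j] for i in range(n) for j in range(n))
a = quadratic_form / det_rho
print(det_A, a)
```

Such a check does not replace the coefficient comparison carried out in the proof, but it is a convenient safeguard when implementing the formula.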
As both $\det A$ and $a$ are polynomials in $s_1, \ldots, s_{n-1}$, we prove this equality by
establishing that they have the same coefficients. Here, Cramer's rule is the essential
tool:

$$ B^{-1} = \frac{1}{\det B}\, \mathrm{Adj}(B), $$

where the adjugate matrix $\mathrm{Adj}\, B$ is the transpose of the matrix of co-factors, i.e.,
$\mathrm{Adj}(B)_{ij} = (-1)^{i+j} \det B_{\hat{j}\hat{i}}$,
with $B_{\hat{j}\hat{i}}$ being obtained from $B$ by removing the jth row and the ith column. By
symmetry, we hence have

$$ \frac{\rho_{ij}}{\det \rho} = (-1)^{i+j} \det \rho^{-1}_{\hat{i}\hat{j}}, \quad \forall (i,j) \in \{1, \ldots, n-1\}^2, \tag{A.1} $$
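$$ \forall x \in \bigcup_{k=0}^{2(n-1)} \{s_1, \ldots, s_{n-1}\}^k : \quad \pi_x \det A = \pi_x a. $$

$$ \pi_1 \det A = \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) \prod_{i=1}^{n-1} \rho^{i\sigma(i)} = \det \rho^{-1}_{\hat{n}\hat{n}} = \mathrm{Adj}(\rho^{-1})_{nn} = \frac{\rho_{nn}}{\det \rho} = \pi_1 a. $$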
Moreover, one can see that sign(σ̃) = (−1)k+n−1 sign(σ). Hence, we obtain
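$$
\begin{aligned}
\pi_{s_k} \det A &= -2 \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma)\, \rho^{n \tilde\sigma(n)} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k\}} \rho^{i \tilde\sigma(i)} \\
&= 2(-1)^{k+n} \sum_{\tilde\sigma \in S(\{1,\ldots,n\} \setminus \{k\};\, \{1,\ldots,n-1\})} \operatorname{sign}(\tilde\sigma)\, \rho^{n \tilde\sigma(n)} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k\}} \rho^{i \tilde\sigma(i)} \\
&= 2(-1)^{k+n} \det \rho^{-1}_{\hat{k}\hat{n}} \\
&= 2\, \mathrm{Adj}(\rho^{-1})_{kn} = \frac{2\rho_{kn}}{\det \rho} = \pi_{s_k} a.
\end{aligned}
$$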
else. Note that it is easy to see that sign(σ) = sign(σ̃). Hence, we have
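$$
\begin{aligned}
\pi_{s_k^2} \det A &= \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) \prod_{i \in \{1,\ldots,n\} \setminus \{k\}} \rho^{i \tilde\sigma(i)} \\
&= \sum_{\tilde\sigma \in S(\{1,\ldots,n\} \setminus \{k\};\, \{1,\ldots,n\} \setminus \{k\})} \operatorname{sign}(\tilde\sigma) \prod_{i \in \{1,\ldots,n\} \setminus \{k\}} \rho^{i \tilde\sigma(i)} \\
&= \det \rho^{-1}_{\hat{k}\hat{k}} = \pi_{s_k^2} a.
\end{aligned}
$$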
Higher order terms. Regarding the higher order terms, we note that $\pi_x a = 0$ for any
monomial of degree larger than two. Therefore, the same should be true for $\det A$,
where it does not seem to follow from an obvious argument. Note that we only
need to consider polynomials where each individual variable $s_k$ appears at most two
times, as any other monomial cannot appear in $\det A$ by the definition of $A$ and of
the determinant. But any coefficient of $\det A$ with respect to such monomials can
be understood as the determinant of a matrix $\tilde\rho$, which is obtained from $\rho^{-1}$ by
omitting one row and one column and by replacing some rows/columns by copies of
other rows/columns. Of course, any such matrix $\tilde\rho$ has vanishing determinant, implying
that $\pi_x \det A = 0$. For concreteness, we indicate this mechanism by appealing
to two special cases. First, take $x = s_k^2 s_l$, $l \neq k$. Similarly to the case of $x = s_k$, one
can show that

$$
\begin{aligned}
\pi_{s_k^2 s_l} \det A = -2 \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) \Bigg( & \mathbf{1}_{k = \sigma(k)}\, \rho^{nn} \rho^{\sigma^{-1}(l) n} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k,l\}} \rho^{i\sigma(i)} \\
& + \mathbf{1}_{k \neq \sigma(k)}\, \rho^{\sigma(k) n} \rho^{\sigma^{-1}(k) n} \rho^{\sigma^{-1}(l) n} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k,\, \sigma^{-1}(k),\, \sigma^{-1}(l)\}} \rho^{i\sigma(i)} \Bigg),
\end{aligned}
$$

$$ \pi_{s_1^2 \cdots s_{n-1}^2} \det A = \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma)\, (\rho^{nn})^{n-1} = 0, $$

as the determinant of the $(n-1) \times (n-1)$ matrix with all entries being equal to $\rho^{nn}$.
References
1. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Application of large deviation methods to
the pricing of index options in finance. C. R. Math. Acad. Sci. Paris 336(3), 263–266 (2003)
2. Azencott, R.: Densité des diffusions en temps petit: développements asymptotiques. I. Seminar
on Probability, XVIII. Lecture Notes in Mathematics, vol. 1059, pp. 402–498. Springer, Berlin
(1984)
3. Bayer, C., Friz, P., Laurence, P.: On the Probability Density Function of Baskets. Springer
Proceedings in Mathematics & Statistics (2014)
4. Bayer, C., Laurence, P.: Calculation of greeks for basket options. Working paper
5. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol baskets.
Commun. Pure Appl. Math. 67(10), 1618–1657 (2014)
6. Breitung, K., Hohenbichler, M.: Asymptotic approximations for multivariate integrals with an
application to multinormal probabilities. J. Multivar. Anal. 30, 80–97 (1989)
7. Carr, P.P., Jarrow, R.A.: The stop-loss start-gain paradox and option valuation: a new
decomposition into intrinsic and time value. Rev. Financ. Stud. 3(3), 469–492 (1990)
8. Deuschel, J., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and
stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1), 40–82
(2013)
9. Deuschel, J., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and
stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350 (2013)
10. Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Studies in
Advanced Mathematics. CRC Press, Boca Raton (1992)
11. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
12. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance: Advanced Methods in
Option Pricing. Chapman & Hall/CRC Financial Mathematics Series. CRC Press, Boca Raton
(2009)
13. Hsu, P.: Heat kernel on noncomplete manifolds. Indiana Univ. Math. J. 39(2), 431–442 (1990)
14. Isserlis, L.: On a formula for the product-moment coefficient of any order of a normal frequency
distribution in any number of variables. Biometrika 12(1/2), 134–139 (1918)
15. L’Ecuyer, P.: Quasi-Monte Carlo methods with applications in finance. Financ. Stoch. 13(3),
307–349 (2009)
16. McKean Jr., H.P., Singer, I.M.: Curvature and the eigenvalues of the Laplacian. J. Differ. Geom.
1(1), 43–69 (1967)
17. Minakshisundaram, S., Pleijel, Å.: Some properties of the eigenfunctions of the Laplace-
operator on Riemannian manifolds. Can. J. Math. 1, 242–256 (1949)
18. Ninomiya, S., Victoir, N.: Weak approximation of stochastic differential equations and appli-
cation to derivative pricing. Appl. Math. Financ. 15(1–2), 107–121 (2008)
19. Pellizzari, P.: Efficient Monte Carlo pricing of European options using mean value control
variates. Decis. Econ. Financ. 24(2), 107–126 (2001)
20. Yosida, K.: On the fundamental solution of the parabolic equation in a Riemannian space.
Osaka Math. J. 5, 65–74 (1953)
A Remark on Gatheral’s ‘Most-Likely Path
Approximation’ of Implied Volatility
In his book 'The Volatility Surface: A Practitioner's Guide', Jim Gatheral presents
an approximation formula for the implied volatility of a European option, when the
underlying stock follows a general diffusion process

$$ \frac{dS_t}{S_t} = \mu(t, S_t)\, dt + \sigma(t, S_t)\, dW_t. \tag{1} $$
M. Keller-Ressel (B)
Fachrichtung Mathematik, Institut f. Math. Stochastik, TU Dresden,
01062 Dresden, Germany
e-mail: [email protected]
J. Teichmann
Department of Mathematics, ETH Zürich,
Rämistrasse 101, 8092 Zürich, Switzerland
e-mail: [email protected]
Here, the measures Gt are given by their Radon-Nikodym derivatives with respect
to the risk-neutral measure Q,
where $\sigma_{K,T}(t)$ is a function that is yet to be specified, $\Gamma_{BS}$ denotes the Black-Scholes
Gamma and expectations are always taken to be under the risk-neutral pricing measure.
Let us emphasize that (2) is an exact formula, and that it is the second part of
the method where the approximation happens: Gatheral argues that the density (3)
is concentrated (as a function of $(t, S)$) close to a narrow ridge connecting today's
stock price $S_0$ to the strike price $K$ at time $T$, and claims that a good approximation
to (2) is to evaluate it as if the density was entirely concentrated on this ridge. In the
terminology of Gatheral this ridge is called the most-likely path and the described
approximation method the most-likely path approximation. Extensions of the
representation (3) have been proposed e.g. by Guyon and Henry-Labordère [2] for implied
correlations.
In this note we will only be concerned with the first part of Gatheral's method,
i.e. the derivation of the exact Eq. (2), and in particular the definition of the yet
unknown function $\sigma_{K,T}(t)$. Gatheral [1] first defines, on p. 27, the 'Black-Scholes
forward implied variance' $v_{K,T}(t)$ by

$$ v_{K,T}(t) = \frac{E\left[\sigma^2(t, S_t)\, S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}{E\left[S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}, \tag{4} $$
Differentiating (5) and inserting into (4) yields an ordinary differential equation for
$\sigma_{K,T}(t)$. This definition through an ODE leaves open the question whether (and
under which conditions) the quantities $v_{K,T}(t)$ and $\sigma_{K,T}(t)$ actually exist. We will
show that a simpler definition of $\sigma_{K,T}(t)$ can be given, which clarifies the problem of
existence, implies Eqs. (4) and (5) and finally leads to a proof of the implied volatility
representation (2).
For our proof of the implied volatility representation we assume that the stock price
follows an Itô-process with respect to the risk-neutral measure Q (with respect to
which all expectations are taken) of the form

$$ \frac{dS_t}{S_t} = r\, dt + \sigma_t\, dW_t, \tag{6} $$

such that the discounted stock price $(e^{-rt} S_t)_{0 \le t \le T}$ is a square-integrable martingale.
The volatility process $\sigma$ is a general predictable, $W$-integrable process. This
setup covers in particular local volatility models, where $\sigma_t = \sigma(t, S_t)$, and stochastic
volatility models, where $\sigma_t = \sigma(t, V_t)$ and $V_t$ is a stochastic factor driving the
volatility. We fix a terminal time $T$ and assume that $S$ is non-deterministic in the sense that
$P(S_t \neq S_T) > 0$ for all $t \in [0, T)$. Fixing also a strike price $K$ we are ultimately
interested in the implied Black-Scholes volatility $\sigma_{imp}(T, K)$ for a European option
with expiry $T$ and strike $K$ in the above model.
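To start our derivation, we associate for each $u \in [0, T]$ and $\Sigma_u \ge 0$ the
'regime-switching' process $S^u$ to $S$, given by

$$
\begin{aligned}
\frac{dS^u_t}{S^u_t} &= r\, dt + \sigma_t\, dW_t, & t &\in [0, u], \\
\frac{dS^u_t}{S^u_t} &= r\, dt + \Sigma_u\, dW_t, & t &\in [u, T].
\end{aligned}
\tag{7}
$$

and

$$ d_{1,2}(w) = \frac{\log\left(e^{r(T-u)} S / K\right)}{\sqrt{w}} \pm \frac{\sqrt{w}}{2}. $$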
Definition 2.1 For $u \in [0, T)$ we define the implied forward total variance
$\hat{w}_u = \hat{w}_u(T, K) \ge 0$ as the solution of

$$ e^{-ru} E\left[P_{BS}(u, S_u, T, K; \hat{w}_u)\right] = e^{-rT} E\left[(K - S_T)^+\right], \tag{8} $$

i.e. $\hat{w}_u$ is the total variance $w_u = (T - u)(\Sigma_u)^2$ that has to be chosen in the
regime-switching model (7) such that the resulting put-price coincides with the put-price
from the original model (6).
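Definition 2.1 can be made concrete with a small numerical sketch. As a test case (an assumption made for illustration only), the original model is taken to be a Black-Scholes model with constant volatility σ, for which the implied forward total variance is known to be σ²(T − u). The expectation over S_u is computed by a trapezoidal rule on a truncated Gaussian grid and the defining equation is solved by bisection; the function names and parameter values are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def put_bs(spot, strike, rate, tau, w):
    """Black-Scholes put as a function of total variance w over the remaining time tau."""
    discounted_strike = strike * math.exp(-rate * tau)
    if w <= 0.0:
        return max(discounted_strike - spot, 0.0)
    sqrt_w = math.sqrt(w)
    d1 = math.log(spot / discounted_strike) / sqrt_w + 0.5 * sqrt_w
    d2 = d1 - sqrt_w
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)

def expected_put(u, w, s0, strike, rate, maturity, sigma, n=801, z_max=8.0):
    """e^{-ru} E[P_BS(u, S_u, T, K; w)] when S follows constant-volatility dynamics on [0, u]."""
    step = 2.0 * z_max / (n - 1)
    total = 0.0
    for i in range(n):
        z = -z_max + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        s_u = s0 * math.exp((rate - 0.5 * sigma ** 2) * u + sigma * math.sqrt(u) * z)
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * put_bs(s_u, strike, rate, maturity - u, w) * density
    return math.exp(-rate * u) * total * step

def implied_forward_total_variance(u, s0, strike, rate, maturity, sigma, iterations=60):
    """Solve the defining equation by bisection; the left-hand side is increasing in w."""
    target = put_bs(s0, strike, rate, maturity, sigma ** 2 * maturity)  # time-0 put price
    lo, hi = 0.0, 4.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_put(u, mid, s0, strike, rate, maturity, sigma) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

w_hat = implied_forward_total_variance(u=0.4, s0=100.0, strike=100.0,
                                       rate=0.02, maturity=1.0, sigma=0.2)
print(w_hat, 0.2 ** 2 * (1.0 - 0.4))
```

For a genuinely local or stochastic volatility model, the expectation would have to be replaced by a simulation or PDE-based estimate; the root-finding step stays the same.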
Proposition 2.2 There exists a unique positive deterministic function $u \mapsto \hat{w}_u$, such
that the equality

$$ e^{-ru} E\left[P_{BS}(u, S_u, T, K; \hat{w}_u)\right] = e^{-rT} E\left[(K - S_T)^+\right] \tag{9} $$

Proof For $w = 0$, the Black-Scholes price $e^{-ru} P_{BS}(u, S_u, T, K; w)$ equals
$e^{-ru}\left(e^{-r(T-u)} K - S_u\right)^+$. Since $(e^{-ru} S_u)_{0 \le u \le T}$ is a martingale, we have by Jensen's
inequality that

$$ e^{-ru} E\left[P_{BS}(u, S_u, T, K; 0)\right] = e^{-ru} E\left[\left(e^{-r(T-u)} K - S_u\right)^+\right] \le e^{-rT} E\left[(K - S_T)^+\right]. $$
Remark 2.3 Notice that the previous proof holds in fact for semi-martingales S, such
that (exp(−r t)St )0≤t≤T is a martingale, so neither square integrability nor absence
of jumps are needed. However, we do not get regularity assertions for u → ŵu .
We now present our main result on the implied forward total variance ŵu . Here
the assumption of continuous trajectories is really needed, as well as the following
L 2 -continuity assumption:
Assumption 2.4 We assume that $\sigma_u$ is mean-square continuous, i.e. the map
$[0, T] \ni u \mapsto \sigma_u^2 \in L^2(\Omega, Q)$ is continuous with respect to the $L^2$-topology.
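Theorem 2.5 Under Assumption 2.4 the mapping $u \mapsto \hat{w}_u$ is in $C^1[0, T) \cap C^0[0, T]$
and satisfies the ODE

$$ \frac{\partial \hat{w}_u}{\partial u} = -\frac{E\left[\varphi(d_2(\hat{w}_u))\, \sigma_u^2\right]}{E\left[\varphi(d_2(\hat{w}_u))\right]}, \quad u \in [0, T), \tag{10} $$

with terminal condition $\lim_{u \to T} \hat{w}_u = 0$ and where $\varphi$ denotes the standard normal
density. For $u = 0$ it holds that

$$ \hat{w}_0 = T\, \sigma_{imp}^2(T, K), $$

where $\sigma_{imp}(T, K)$ is the implied Black-Scholes volatility for time-to-maturity $T$ and
strike $K$ in (6).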
Remark 2.6 Equation (10) can be rewritten as (2). Alternatively, it can be written as
i.e., the rate of decrease in total implied variance is given by expected instanta-
neous stochastic volatility plus a correction term that accounts for correlation effects
between σu and Su in a highly non-linear way.
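Proof We set

$$ F(u, w) = e^{-ru} E\left[P_{BS}(u, S_u, T, K; w)\right]. $$

Note that the derivative of $P_{BS}$ with respect to total variance $w$ is given by

$$ \frac{\partial}{\partial w} P_{BS}(u, S, T, K; w) = \frac{1}{2\sqrt{w}}\, S\, \varphi(d_1), $$

$$ \frac{\partial}{\partial w} F(u, w) = \frac{e^{-ru}}{2\sqrt{w}}\, E\left[S_u\, \varphi(d_1(w))\right] = \frac{e^{-rT} K}{2\sqrt{w}}\, E\left[\varphi(d_2(w))\right]. \tag{11} $$

$$ \frac{\partial}{\partial u} F(u, w) = e^{-ru} E\left[-r P_{BS} + \frac{\partial}{\partial u} P_{BS} + \frac{\partial}{\partial S} P_{BS}\, r S_u + \frac{1}{2} \frac{\partial^2}{\partial S^2} P_{BS}\, S_u^2 \sigma_u^2\right]. \tag{12} $$

$$ -r P_{BS} + \frac{\partial}{\partial u} P_{BS} + r S \frac{\partial}{\partial S} P_{BS} = 0, $$

such that (12) simplifies to

$$ \frac{\partial}{\partial u} F(u, w) = e^{-ru} E\left[\frac{1}{2} \frac{\partial^2}{\partial S^2} P_{BS}\, S_u^2 \sigma_u^2\right] = \frac{1}{2} \frac{e^{-rT} K}{\sqrt{w}}\, E\left[\varphi(d_2(w))\, \sigma_u^2\right]. \tag{13} $$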
Note that due to Assumption 2.4 both $\partial_u F(u, w)$ and $\partial_w F(u, w)$ are continuous.
Furthermore, recall that $\hat{w}_u$ is given in Definition 2.1 by the implicit equation

$$ F(u, \hat{w}_u) = e^{-rT} E\left[(K - S_T)^+\right], \tag{14} $$

where the right hand side depends neither on $u$ nor on $\hat{w}_u$. Let us first examine the
boundary behavior of $F(u, w)$. We easily derive that

$$ \lim_{w \to 0^+} F(u, w) = E\left[\left(e^{-rT} K - e^{-ru} S_u\right)^+\right], $$

for all $u \in [0, T)$. From (11) we see that $\partial_w F(u, w) > 0$ and hence $w \mapsto F(u, w)$
is increasing for $w \in (0, \infty)$. Altogether, it follows that for each $u \in [0, T]$ a
unique $\hat{w}_u$ solving (14) exists. In addition, by the implicit function theorem, $\hat{w}_u$ is
in $C^1[0, T) \cap C^0[0, T]$ with derivative

$$ \frac{\partial}{\partial u} \hat{w}_u = -\frac{\partial_u F(u, \hat{w}_u)}{\partial_w F(u, \hat{w}_u)} = -\frac{E\left[\varphi(d_2(\hat{w}_u))\, \sigma_u^2\right]}{E\left[\varphi(d_2(\hat{w}_u))\right]}, $$
where we have combined (11) and (13). The initial and terminal conditions for
ŵu at u = 0 and u = T can be derived from the above boundary conditions for
F(u, w). Indeed,
PBS (0, S0 , K ; ŵ0 ) = C(K , T )
implies that w = 0 and hence both boundary conditions for ŵu follow.
Acknowledgments MKR acknowledges funding from the Excellence Initiative of the German
Research Foundation (DFG).
References
1 Introduction
Because of their consistency with the known prices of European options, and despite
their unrealistic dynamical implications, local volatility models continue to be used
in practice as powerful tools for risk management of equity derivatives portfolios.
Under the forward measure (with no drift), local volatility models take the form
$$ \frac{dS_t}{S_t} = \sigma(S_t, t)\, dB_t, \tag{1.1} $$
In memory of our long term collaborator and friend, a passionate mathematician, Peter Laurence.
where Bt is a Brownian motion and σ is a local volatility function that depends only
on the underlying level S and the time t.
Assume that prices of European options of all strikes $K$ and expirations $T$ are
given or equivalently that the Black-Scholes implied volatility function $\sigma_{BS}(K, T)$
is known. In that case, it is straightforward to compute the local volatility function
$\sigma$ from, for example, Eq. (1.10) of Gatheral [6]:

$$ \sigma^2(K, T) = \frac{\dfrac{\partial w}{\partial T}}{1 - \dfrac{k}{w} \dfrac{\partial w}{\partial k} + \dfrac{1}{4} \left(-\dfrac{1}{4} - \dfrac{1}{w} + \dfrac{k^2}{w^2}\right) \left(\dfrac{\partial w}{\partial k}\right)^2 + \dfrac{1}{2} \dfrac{\partial^2 w}{\partial k^2}} \tag{1.2} $$

where $k$ denotes the log-strike $k := \log(K/S)$ and $w$, the Black-Scholes implied total
variance, is given by $w(K, T) := \sigma_{BS}^2(K, T)\, T$.
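Formula (1.2) translates directly into code once the implied total variance is available as a function of log-strike and expiration. The sketch below is an illustration only: it assumes a smooth, user-supplied function `w(k, T)` (a hypothetical name) and approximates the partial derivatives by central finite differences with fixed step sizes.

```python
def local_variance(w, k, T, hk=1e-4, hT=1e-4):
    """Local variance from the implied total variance w(k, T) via formula (1.2)."""
    w0 = w(k, T)
    dw_dT = (w(k, T + hT) - w(k, T - hT)) / (2.0 * hT)
    dw_dk = (w(k + hk, T) - w(k - hk, T)) / (2.0 * hk)
    d2w_dk2 = (w(k + hk, T) - 2.0 * w0 + w(k - hk, T)) / hk ** 2
    denominator = (1.0
                   - k / w0 * dw_dk
                   + 0.25 * (-0.25 - 1.0 / w0 + k ** 2 / w0 ** 2) * dw_dk ** 2
                   + 0.5 * d2w_dk2)
    return dw_dT / denominator

# Flat smile: w = sigma^2 T, so the local variance equals sigma^2.
flat = local_variance(lambda k, T: 0.04 * T, k=0.1, T=1.0)

# Time-dependent Black-Scholes: w = integral of sigma^2(t), here 0.04 T + 0.01 T^2,
# so the local variance at T is 0.04 + 0.02 T.
term_structure = local_variance(lambda k, T: 0.04 * T + 0.01 * T ** 2, k=-0.2, T=1.0)
print(flat, term_structure)
```

With market data, `w` would come from an interpolation of quoted implied volatilities, and the finite-difference steps would need to be chosen with the noise of that interpolation in mind.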
In practice, we observe option prices for only a finite set of strikes and expirations.
Moreover (see for example Gatheral and Jacquier [7]), it is very hard if not impossible
to find a functional form for implied volatility that both matches observed prices and
is free from static arbitrage. One alternative approach is to assume a parameterized
functional form for the local volatility function σ (S, t) and price a finite set of
European options, tuning the parameters of the function until a satisfactory fit is
achieved. Such calibration of local volatility models to given option prices is in
practice typically performed using numerical PDE techniques. However, numerical
PDE techniques are slow and moreover are not practical in higher dimensions.
Alternatively, to achieve better understanding of the qualitative properties of local
volatility models, and potentially faster calibration, both academics and practitioners
have exploited asymptotic expansions of implied volatility in terms of local volatility.
First, Berestycki et al. [2] solved the nonlinear PDE (1.2) for the implied total vari-
ance w in the small time to expiration limit, obtaining an exact expression for implied
volatility as an integral of local volatility. Subsequently, this asymptotic approximation
was extended to first order in time to expiry τ = T − t by Henry-Labordère (see
the article in this volume and also Henry-Labordère [12]), and then to second order in
Gatheral et al. [9] using the heat kernel expansion. Jordan and Tier [14] apply similar
methods to derive an asymptotic solution for the SABR and CEV models. In related
work, the paper of Cheng et al. [5] derives an operator expansion of the density,
which up to first order agrees with prior expansions obtained using the heat kernel
expansion. As an earlier example of work in a similar spirit to the most-likely-path
approach of our paper, Baldi and Caramellino [1] develop a small-time expansion
for the hitting probability of a one-dimensional diffusion.
Our contribution in this paper is to derive an exact Brownian bridge representation
for the transition density, from which an exact expression for the transition density
in terms of a path integral follows. Indeed, the path integral representation of the
density has often been used as a powerful tool for the derivation of improved asymptotic
expansions of the transition density. For example, in the following, we apply
a technique from the paper of Goovaerts et al. [10]. An earlier paper by Linetsky
[16] provides a more general survey of the application of path integral techniques to
option pricing.
Implied Volatility from Local Volatility: A Path Integral Approach 249
By replacing all paths that contribute to the path integral by the most-likely-path,
the unique path that minimizes the action functional in the path integral formulation,
we obtain a new approximation to the transition density which is both more accurate
and more natural than the classical heat kernel version. As an application, we obtain an
improved most-likely-path approximation for implied volatility in terms of local
volatility.
The most-likely-path (MLP) approach has been used to analyze the asymptotic
behavior of implied volatility in stochastic volatility models in Gatheral [6]; this
analysis is further elaborated in an article by Keller-Ressel and Teichmann [15] in
these proceedings. Guyon and Henry-Labordère [11] and Reghai [17] both explore
alternative definitions of the most likely path, achieving improved accuracy by con-
sidering fluctuations around the MLP. In particular, Guyon and Henry-Labordère
[11] compare and contrast various approximations in a unified setting. Though the
approach of Guyon and Henry-Labordère [11] differs from our path integral approach
in the current paper, it is worth mentioning that their heat kernel approximation is
closely related to ours. Once again however, our path integral approach leads to an
unambiguously natural definition of the most-likely-path.
Our paper is organized as follows. In Sect. 2, we derive Brownian bridge and path
integral representations for the transition density of one dimensional diffusions. As
an application, in Sect. 3, we present a novel probabilistic derivation of the heat kernel
expansion, also referred to as the WKB method in the physics literature. For time
homogeneous diffusions, this new expansion recovers the conventional heat kernel
expansion; however, in the time-inhomogeneous case, the two expansions differ
slightly. In Sect. 4, we present heuristic derivations of known small time asymptotic
expansions of implied volatility to zeroth order. From the path integral perspective,
these known approximations are suboptimal in the sense that they correspond to
computing the optimal path of an approximate but incomplete action functional. By
considering the optimal path of the exact action functional, we show how an optimal
approximation may be computed. An interesting feature of the optimal approximation
is that it recovers the implied volatility of the time dependent Black-Scholes model
exactly, which so far, to the best of our knowledge, none of the existing small time
approximations are able to achieve. Finally, in Sect. 5, we summarize and conclude.
Throughout the text, $B_t$ denotes the standard Brownian motion defined on the
filtered probability space $(\Omega, \mathcal{F}_t, P)$ satisfying the usual conditions. $X_t$ denotes the
Brownian motion with some drift $h$. $p^X(T, y | t, x)$ denotes the transition density
of $X$ from $x$ at time $t$ to $y$ at time $T$ and similarly $p^S(T, s_T | t, s_t)$ is the transition
density from $s_t$ to $s_T$ of the process $S_t$. Moreover, dot will always refer to the partial
derivative with respect to the time variable and prime to the space variables $x$ or $s$.
250 T.-H. Wang and J. Gatheral
In this section, we derive path integral representations of the transition density and
of the call prices under local volatility, which will in turn yield the most-likely-path
approximation to implied volatility. The key ingredient in this derivation is a Brown-
ian bridge representation for the transition density, which though straightforward,
does not appear to be well-known.
We start with the case of one-dimensional Brownian motion with general but
Markovian drift. We reduce the more general diffusion case which concerns us here
to this one by applying the well-known Lamperti change of variable.
Two Brownian bridge representations for the transition density of Brownian motion
with general but Markovian, smooth and bounded, drift are derived in Theorem 1. The
first expression, (2.1), will be used in the derivation of the path integral representation
for transition density in Sect. 2.2 and the second, (2.2), will be used to derive the heat
kernel expansion of transition density in Sect. 3.
Theorem 1 Let $X_t$ be a Brownian motion with drift driven by

$$ dX_t = dB_t + h(X_t, t)\, dt, $$

and
where $\varphi$ is the Gaussian density $\varphi(t, \xi) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{\xi^2}{2t}}$. The notation $\tilde{E}_{x,y}[\cdot]$ denotes
the expectation under the Brownian bridge measure from $x$ to $y$.
Proof Note that $X_t$ under the original measure $P$ is a Brownian motion with drift $h$.
Define a new probability measure $\tilde{P}$ through the Radon-Nikodym derivative

$$ \frac{d\tilde{P}}{dP} = e^{-\int_t^T h(X_s, s)\, dB_s - \frac{1}{2} \int_t^T h^2(X_s, s)\, ds}. $$

By the Girsanov theorem, $X_t$ is a Brownian motion under $\tilde{P}$. Given any bounded
measurable function $f$, we have, since $dB_t = dX_t - h(X_t, t)\, dt$,

$$ E_{t,x}[f(X_T)] = \tilde{E}_{t,x}\left[f(X_T)\, \frac{dP}{d\tilde{P}}\right] = \tilde{E}_{t,x}\left[f(X_T)\, e^{\int_t^T h(X_s, s)\, dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\, ds}\right], $$

where, for notational simplicity, $E_{t,x}[\cdot]$ denotes the conditional expectation $E[\cdot | X_t = x]$,
and similarly for $\tilde{E}_{t,x}[\cdot]$. It follows that, for any bounded measurable function $f$,

$$ \int f(y)\, p^X(T, y | t, x)\, dy = \int f(y)\, \tilde{E}_{x,y}\left[e^{\int_t^T h(X_s, s)\, dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\, ds}\right] \varphi(T - t, y - x)\, dy, $$

$$ \int_t^T h(X_s, s)\, dX_s = H(X_T, T) - H(X_t, t) - \int_t^T \left[H_t(X_s, s) + \frac{h_x(X_s, s)}{2}\right] ds, $$
Remark 1 We remark that the conditional expectations in both (2.1) and (2.2) are
under the Brownian bridge measure since X t is a Brownian motion under P̃. One
intriguing feature of the representation (2.2) is that, if we Taylor expand the condi-
tional expectation for small T − t around the straight line connecting the initial and
terminal points, we recover the heat kernel expansion in the time-homogeneous case
and probably do better than the heat kernel expansion in the time-inhomogeneous
case. See Sect. 3 for more detailed discussions on the heat kernel expansion.
Now for the general diffusion case, consider the process $S_t$ driven by the stochastic
differential equation (SDE)

$$ dS_t = \mu(S_t, t)\, dt + a(S_t, t)\, dB_t, $$

where for simplicity, we assume the coefficients $\mu$ and $a$ are Lipschitz and of linear
growth; $a$ is further assumed to be bounded away from zero. By applying the Lamperti
transformation $x = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$, the process $S_t$ is transformed into a Brownian motion
with drift. Specifically, denote the transformation from $s$ to $x$ by
$x = \phi(s, t) = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$. Applying Itô's formula to $X_t = \phi(S_t, t)$ yields

$$
\begin{aligned}
dX_t &= d\phi(S_t, t) \\
&= \left[\dot\phi(S_t, t) + \mu(S_t, t)\, \phi_s(S_t, t) + \frac{a^2(S_t, t)}{2}\, \phi_{ss}(S_t, t)\right] dt + \phi_s(S_t, t)\, a(S_t, t)\, dB_t \\
&= \left[\dot\phi(S_t, t) + \frac{\mu(S_t, t)}{a(S_t, t)} - \frac{a_s(S_t, t)}{2}\right] dt + dB_t \\
&= dB_t + h(X_t, t)\, dt,
\end{aligned}
$$

$$ p^S(T, s_T | t, s_t) = \frac{1}{a(s_T, T)}\, p^X(T, x_T | t, x_t), $$

with $x_T = \phi(s_T, T)$ and $x_t = \phi(s_t, t)$. Thus, the transition from the Brownian
bridge representation for $p^X$ to a similar representation for $p^S$ is straightforward by
applying Theorem 1. Theorem 2 formalizes this result.
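As a concrete instance of the Lamperti transformation, the following sketch treats time-homogeneous coefficients only, so that the time derivative of the transformation vanishes. The integral is evaluated by Simpson's rule and the derivative of a by a central difference; the geometric Brownian motion coefficients and all names are illustrative assumptions, chosen because the closed forms log(s/s0)/σ and r/σ − σ/2 are available for comparison.

```python
import math

def lamperti(s, s0, a, n=2000):
    """phi(s) = integral from s0 to s of d(xi) / a(xi), by composite Simpson (n even)."""
    step = (s - s0) / n
    total = 1.0 / a(s0) + 1.0 / a(s)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) / a(s0 + i * step)
    return total * step / 3.0

def transformed_drift(s, mu, a, ds=1e-5):
    """h = mu / a - a_s / 2 for time-homogeneous coefficients (phi has no time dependence)."""
    a_s = (a(s + ds) - a(s - ds)) / (2.0 * ds)
    return mu(s) / a(s) - 0.5 * a_s

# Illustrative coefficients: geometric Brownian motion, mu(s) = r s and a(s) = sigma s.
r, sigma, s0 = 0.05, 0.2, 100.0
mu = lambda s: r * s
a = lambda s: sigma * s

x = lamperti(120.0, s0, a)
h = transformed_drift(120.0, mu, a)
print(x, math.log(120.0 / s0) / sigma)   # numerical vs closed-form transformation
print(h, r / sigma - 0.5 * sigma)        # numerical vs closed-form drift
```

For time-dependent a, the additional term given by the time derivative of the transformation would have to be added to the drift, as in the displayed computation.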
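Theorem 2 Let $S_t$ be the diffusion process driven by the stochastic differential equation

$$ dS_t = \mu(S_t, t)\, dt + a(S_t, t)\, dB_t, \quad S_0 = s_0. $$

Denote the Lamperti transformation from $s$ to $x$ by $x = \phi(s, t) = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$. Define
the function $h$ by $h(x, t) = \dot\phi(s, t) + \frac{\mu(s, t)}{a(s, t)} - \frac{a_s(s, t)}{2}$, with $s = \phi^{-1}(x, t)$, where
subindices refer to corresponding partial derivatives. Let $H$ be an antiderivative
of $h$ with respect to $x$, namely, $\frac{\partial}{\partial x} H(x, t) = h(x, t)$, for all $x$ and $t$. Then the
transition density $p^S$ of $S_t$ from $(t, s_t)$ to $(T, s_T)$ has the following Brownian bridge
representations:

$$ p^S(T, s_T | t, s_t) = \frac{\varphi(T - t, \phi(s_T, T) - \phi(s_t, t))}{a(s_T, T)}\, \tilde{E}_{\phi(s_t, t), \phi(s_T, T)}\left[e^{\int_t^T h(X_s, s)\, dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\, ds}\right] \tag{2.3} $$

and

$$ p^S(T, s_T | t, s_t) = \frac{\varphi(T - t, \phi(s_T, T) - \phi(s_t, t))}{a(s_T, T)}\, e^{H(\phi(s_T, T), T) - H(\phi(s_t, t), t)} \times \tilde{E}_{\phi(s_t, t), \phi(s_T, T)}\left[e^{-\frac{1}{2} \int_t^T \left[h^2(X_s, s) + h_x(X_s, s) + 2 H_t(X_s, s)\right] ds}\right], \tag{2.4} $$

where again $\varphi$ denotes the Gaussian density $\varphi(t, \xi) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{\xi^2}{2t}}$. As before, the
notation $\tilde{E}_{x,y}[\cdot]$ denotes the expectation under the Brownian bridge measure from $x$
to $y$.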
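Representation (2.4) can be sanity-checked in the simplest possible case, where the integrand inside the bridge expectation is deterministic. The sketch assumes constant coefficients μ and a = σ (arithmetic Brownian motion with drift), so that h = μ/σ is constant, its x-derivative vanishes, and H(x, t) = h x has no time dependence; the exact transition density is then Gaussian with mean s_t + μ(T − t) and variance σ²(T − t). All names and numerical values are illustrative.

```python
import math

def gaussian_kernel(t, xi):
    """Gaussian density with variance t, evaluated at xi."""
    return math.exp(-xi * xi / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def density_via_bridge_formula(s_T, s_t, tau, mu, sigma):
    """Right-hand side of (2.4) for dS = mu dt + sigma dB with constant mu and sigma."""
    dx = (s_T - s_t) / sigma            # phi(s_T) - phi(s_t) for phi(s) = (s - s0) / sigma
    h = mu / sigma                      # constant drift in x-space: h_x = 0, H(x) = h x, H_t = 0
    prefactor = gaussian_kernel(tau, dx) / sigma
    boundary_term = math.exp(h * dx)    # exp(H(x_T) - H(x_t))
    bridge_expectation = math.exp(-0.5 * h * h * tau)   # the integrand is deterministic here
    return prefactor * boundary_term * bridge_expectation

def exact_density(s_T, s_t, tau, mu, sigma):
    """Gaussian transition density of arithmetic Brownian motion with drift."""
    return gaussian_kernel(sigma * sigma * tau, s_T - s_t - mu * tau)

args = dict(s_T=103.0, s_t=100.0, tau=0.5, mu=1.5, sigma=4.0)
print(density_via_bridge_formula(**args), exact_density(**args))
```

For non-constant h the bridge expectation no longer factors out and has to be estimated, for instance by simulating Brownian bridges between the two endpoints.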
Note that the $X_t$ process in both expressions (2.3) and (2.4) is a Brownian bridge
from $x_t = \phi(s_t, t)$ to $x_T = \phi(s_T, T)$. One application of such Brownian bridge
representations of transition densities is to devise more efficient simulation schemes. For
example, for some given function $f$, we may compute numerically the expectation
$E_{t,s_t}[f(S_T)]$ in $x$-space as
Hence,

$$ E_{t,s_t}[f(S_T)] \approx e^{-\frac{1}{2} \int_t^T \left[h^2(x_\tau, \tau) + h_x(x_\tau, \tau) + 2 H_t(x_\tau, \tau)\right] d\tau}\, E\left[f \circ \phi^{-1}(Y, T)\, e^{H(Y, T) - H(x_t, t)}\right], $$
p S (T, sT |t, st ) = ··· p S (ti , si |ti−1 , si−1 ) ds1 . . . dsn−1 , (2.6)
i=1
where we set s0 = st and sn = sT . Recall from (2.3) that the transition density p S
of St from (ti−1 , si−1 ) to (ti , si ) has the Brownian bridge representation
where X τ is a Brownian bridge from ϕ(si−1 , ti−i ) to ϕ(si , ti ). We next compute the
limit of (2.6), as t → 0+ (or equivalently n → ∞), assuming that, for i = 1, . . . , n,
the si ’s form a discretization of a differentiable curve sτ , for τ ∈ [t, T ].
Implied Volatility from Local Volatility: A Path Integral Approach 255
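We have

$$ \lim_{n \to \infty} \prod_{i=1}^{n} \tilde{E}_{\phi(s_{i-1}, t_{i-1}), \phi(s_i, t_i)}\left[e^{\int_{t_{i-1}}^{t_i} h(X_\tau, \tau)\, dX_\tau - \frac{1}{2} \int_{t_{i-1}}^{t_i} h^2(X_\tau, \tau)\, d\tau}\right] = e^{\int_t^T h(\phi(s_\tau, \tau), \tau)\, \dot{x}_\tau\, d\tau - \frac{1}{2} \int_t^T h^2(\phi(s_\tau, \tau), \tau)\, d\tau} $$

and

$$
\begin{aligned}
\lim_{n \to \infty} \prod_{i=1}^{n} e^{-\frac{1}{2\Delta t} \left[\phi(s_i, t_i) - \phi(s_{i-1}, t_{i-1})\right]^2}
&= \lim_{n \to \infty} e^{-\frac{1}{2\Delta t} \sum_{i=1}^{n} \left[\phi(s_i, t_i) - \phi(s_{i-1}, t_{i-1})\right]^2} \\
&= \lim_{n \to \infty} e^{-\frac{1}{2\Delta t} \sum_{i=1}^{n} \left[\phi'_{i-1}\, \Delta s_i + \dot\phi_{i-1}\, \Delta t + O\left(\Delta s_i^2 + \Delta t^2\right)\right]^2} \\
&= e^{-\frac{1}{2} \int_t^T \left[\phi'(s_\tau, \tau)\, \dot{s}_\tau + \dot\phi(s_\tau, \tau)\right]^2 d\tau}.
\end{aligned}
$$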
Substitution into (2.6) and taking the limit $n \to \infty$ yields the following path integral
representation for the transition density $p^S$

$$ p^S(T, s_T | t, s_t) = \int_{C_s} e^{-\frac{1}{2} \int_t^T \left[\phi'(s_\tau, \tau)\, \dot{s}_\tau + \dot\phi(s_\tau, \tau) - h(\phi(s_\tau, \tau), \tau)\right]^2 d\tau}\, D[s], \tag{2.7} $$

where

$$ D[s] = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi \Delta t}\; a(s_T, T)} \prod_{i=1}^{n-1} \frac{ds_i}{\sqrt{2\pi \Delta t}\; a(s_i, t_i)}, $$

and $C_s$ denotes the collection of all differentiable curves from $(t, s_t)$ to $(T, s_T)$.
Equivalently, because $dx_i = \frac{ds_i}{a(s_i, t_i)}$, we may rewrite the path integral representation
(2.7) more neatly and simply in $x$-space as

$$ p^S(T, s_T | t, s_t) = \frac{1}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2} \int_t^T \left[\dot{x}_\tau - h(x_\tau, \tau)\right]^2 d\tau}\, D[x] \tag{2.8} $$

where

$$ D[x] = \lim_{n \to \infty} \frac{1}{\sqrt{2\pi \Delta t}} \prod_{i=1}^{n-1} \frac{dx_i}{\sqrt{2\pi \Delta t}} $$

and $C_x$ denotes the collection of all differentiable curves from $(t, x_t)$ to $(T, x_T)$.
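The exponent in (2.8) can be evaluated for any discretized curve, which makes the role of the action functional tangible. The sketch below is an illustration only: the left-point discretization, the test paths and all names are assumptions. It shows that for zero drift the straight line has action (x_T − x_t)²/(2(T − t)), matching the exponent of the Gaussian density, that a perturbed path has a larger action, and that a path whose velocity equals the drift has zero action.

```python
import math

def action(xs, ts, drift):
    """Discretized action 0.5 * sum ((dx/dt - h(x, t))^2 dt) along a piecewise-linear path."""
    total = 0.0
    for i in range(1, len(xs)):
        dt = ts[i] - ts[i - 1]
        velocity = (xs[i] - xs[i - 1]) / dt
        total += 0.5 * (velocity - drift(xs[i - 1], ts[i - 1])) ** 2 * dt
    return total

n, t0, t1, x0, x1 = 200, 0.0, 2.0, 0.0, 1.0
ts = [t0 + (t1 - t0) * i / n for i in range(n + 1)]
line = [x0 + (x1 - x0) * (t - t0) / (t1 - t0) for t in ts]
# Perturbation that vanishes at both endpoints, so the path still connects x0 to x1.
bumped = [x + 0.1 * math.sin(math.pi * (t - t0) / (t1 - t0)) for x, t in zip(line, ts)]

zero_drift = lambda x, t: 0.0
action_line = action(line, ts, zero_drift)
action_bumped = action(bumped, ts, zero_drift)
action_matched = action(line, ts, lambda x, t: 0.5)   # drift equal to the path velocity
print(action_line, action_bumped, action_matched)
```

The most-likely path is the curve minimizing this quantity; for state-dependent drift it is in general no longer a straight line and has to be found numerically or from the associated Euler-Lagrange equation.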
We shall henceforth deal mostly with the simpler expression (2.8). Heuristically, one
could think of the path integral representation (2.8) of the density as an exponentially-
weighted average over all possible differentiable curves connecting xt to x T . D[x]
could then be regarded as the “Lebesgue” measure on the space of differentiable
256 T.-H. Wang and J. Gatheral
The path integral representation (2.7) of the transition density $p^S$ in this case has the
following simpler form
\[ p^S(T, s_T \mid t, s_t) = \int_{C_s} e^{-\frac{1}{2}\int_t^T\left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)}+\frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s]. \]
Integrating the payoff function over the transition density, the path integral representation
for the call price is immediate:
\[ C(t, s_t, K, T) = \int_K^\infty (s_T - K) \int_{C_s} e^{-\frac{1}{2}\int_t^T\left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)}+\frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s]\, ds_T, \]
or equivalently in x-space,
\[ C(t, s_t, K, T) = \int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T\left|\dot{x}_\tau-h(x_\tau,\tau)\right|^2 d\tau}\, D[x]\, ds_T, \tag{2.9} \]
where $h(x, t) = \dot{\varphi}(s, t) - \frac{a_s(s,t)}{2}$.
The heat kernel expansion is a small time asymptotic expansion of the fundamental
solution of the heat equation over a Riemannian manifold. Reexpressing the transition
density of a diffusion process in terms of this fundamental solution leads naturally to a
small time asymptotic expansion of the transition density. This topic is well-studied
in the Riemannian geometry literature, see Chavel [4] for a geometric analytical
approach and Hsu [13] for a probabilistic approach. In the physics literature, the heat
kernel approach to deriving small time asymptotic expansions is also known as the
WKB method or the ray solution, see Jordan and Tier [14]. Deriving such expansions
in one dimension is much simpler than in higher dimensions where no analogue of
the Lamperti transformation exists.
Though the heat kernel expansion is very well-known, the Brownian bridge rep-
resentation (2.4) of Theorem 2 leads to a novel probabilistic derivation which we
will now present. To fix ideas and illustrate the methodology employed, we start
with the case of Brownian motion with drift; as before, the general diffusion case
Theorem 3 Let $X_t$ be the Brownian motion with drift $h$, i.e., $X_t$ satisfies the SDE
$dX_t = dB_t + h(X_t, t)\,dt$. Denote by $H$ an antiderivative of $h$ with respect to $x$,
namely, $\frac{\partial}{\partial x} H(x, t) = h(x, t)$, for all $x$ and $t$. The transition density $p^X$ of $X_t$ has,
as $t \to T^-$, the following small time asymptotic expansion:
where $\phi$ is the Gaussian density $\phi(t, \xi) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{\xi^2}{2t}}$, and $x_s^*$ denotes the straight line
from $(t, x)$ to $(T, y)$, i.e., $x_s^* = x + \frac{s-t}{T-t}(y - x)$ for $s \in [t, T]$.
Notice that in the time-inhomogeneous case h = h(x, t), the approximation (3.1)
is different from the heat kernel expansion (see, for example, (3.3), (3.6), and (3.7)
on page 603 of Gatheral et al. [9]) in that the approximation in (3.1) involves an
integration from t to T whereas, in the classical heat kernel expansion, all quantities
are evaluated at the fixed initial time t. Of course, in the time homogeneous case where
the drift h = h(x) has no explicit dependence on t, the expansion (3.1) coincides
with the classical heat kernel expansion as formalized in the following corollary.
Corollary 1 (Heat kernel expansion for Brownian motion with drift) For Brownian
motion with time homogeneous drift h = h(x), the transition density p X of X t from
(t, x) to (T, y) has the asymptotic expansion up to first order as
which coincides with the classical heat kernel expansion up to first order (see, for
instance, Gatheral et al. [9]).
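\[ \begin{aligned}
\int_t^T \left[ h^2(x_s^*) + h'(x_s^*) \right] ds
&= \int_t^T \left[ h^2\!\left(x + \tfrac{s-t}{T-t}(y - x)\right) + h'\!\left(x + \tfrac{s-t}{T-t}(y - x)\right) \right] ds \\
&= \frac{T-t}{y-x} \int_x^y \left[ h^2(\xi) + h'(\xi) \right] d\xi,
\end{aligned} \]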
Let Yt denote the Brownian bridge from x at time t to y at time T and Ẽx,y [·]
be the expectation under the Brownian bridge measure. The proof of the asymptotic
expansion (3.1) requires the following two lemmas.
Lemma 1 For a bounded function $g = g(x, s)$, $|g| \le M$ say, we have the following
estimate
\[ \tilde{E}_{x,y}\!\left[ e^{\int_t^T g(Y_s,s)\,ds} \right] = 1 + \int_t^T \tilde{E}_{x,y}\left[g(Y_s, s)\right] ds + O\!\left((T-t)^2\right). \]
Proof The proof is based on a clever application of the convex order for random
variables first observed, to our knowledge, in the paper by Goovaerts et al. [10] (see
Proposition 6.2 on p. 348). Denote by Q g(Ys ,s) (q) the qth quantile of the random
variable g(Ys , s). Since exponential functions are convex, it follows from Proposition
6.2 of Goovaerts et al. [10] that
\[ \tilde{E}_{x,y}\!\left[ e^{\int_t^T g(Y_s,s)\,ds} \right] \le \int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq. \]
We establish an upper bound for the right hand side. First we Taylor expand the
integrand and rewrite the integral as
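\[ \int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq = \sum_{k=0}^{\infty} \frac{1}{k!} \int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^{k} dq. \]
An upper bound for $\int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^{k} dq$ is then determined as
\[ \begin{aligned}
\int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^{k} dq
&\le (T-t)^{k-1} \int_0^1 \int_t^T \left|Q_{g(Y_s,s)}(q)\right|^k ds\, dq \quad \text{(by Hölder's inequality)} \\
&= (T-t)^{k-1} \int_t^T \tilde{E}_{x,y}\!\left[ |g(Y_s, s)|^k \right] ds \\
&\le M^k (T-t)^k \quad \text{(since } |g| \le M\text{)}.
\end{aligned} \]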
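Thus,
\[ \begin{aligned}
\int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq
&= 1 + \int_0^1 \int_t^T Q_{g(Y_s,s)}(q)\,ds\, dq + \sum_{k=2}^{\infty} \frac{1}{k!} \int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^{k} dq \\
&\le 1 + \int_t^T \tilde{E}_{x,y}\left[g(Y_s, s)\right] ds + \sum_{k=2}^{\infty} \frac{1}{k!} M^k (T-t)^k \\
&\le 1 + \int_t^T \tilde{E}_{x,y}\left[g(Y_s, s)\right] ds + M^2 (T-t)^2 e^{M(T-t)},
\end{aligned} \]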
Lemma 2 asserts that the time integral of the conditional expectation in Lemma 1
is approximately, up to order (T − t)2 , equal to the integral along a straight line
connecting x at time t to y at time T .
Lemma 2 For a bounded function $g = g(x, s)$ with bounded second partial derivative
with respect to $x$, the following asymptotic holds:
\[ \int_t^T \tilde{E}_{x,y}\left[g(Y_s, s)\right] ds = \int_t^T g(x_s, s)\,ds + O\!\left((T-t)^2\right), \]
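\[ g(Y_s, s) = g(x_s, s) + g_x(x_s, s)(Y_s - x_s) + \frac{g_{xx}(\xi_s, s)}{2}(Y_s - x_s)^2, \]
for some $\xi_s$ between $Y_s$ and $x_s$. Since $Y_s$ is a Brownian bridge from $(t, x)$ to $(T, y)$,
$Y_s$ is normally distributed: $Y_s \sim N\!\left(x_s, \frac{(s-t)(T-s)}{T-t}\right)$. Therefore,
\[ \begin{aligned}
\tilde{E}_{x,y}\left[g(Y_s, s)\right] &= g(x_s, s) + g_x(x_s, s)\, \tilde{E}_{x,y}\left[Y_s - x_s\right] + \frac{g_{xx}(\xi_s, s)}{2}\, \tilde{E}_{x,y}\!\left[(Y_s - x_s)^2\right] \\
&= g(x_s, s) + \frac{g_{xx}(\xi_s, s)}{2}\, \frac{(s-t)(T-s)}{T-t}.
\end{aligned} \]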
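As a numerical sanity check of Lemma 2 (not part of the original argument), the following Python sketch takes the test function g(x) = cos(x), which is our own choice, and uses the Gaussian identity E[cos(Y)] = cos(m) e^{-v/2} for Y ~ N(m, v) to evaluate the bridge expectation exactly. The helper names `simpson` and `lemma2_error` are hypothetical. The difference between the straight-line integral and the bridge integral, divided by (T − t)^2, should stay bounded as T − t shrinks.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def lemma2_error(x, y, t, T):
    """Straight-line integral minus bridge-expectation integral for g = cos."""
    tau = T - t
    line = lambda s: x + (s - t) / tau * (y - x)   # mean of the bridge at time s
    var = lambda s: (s - t) * (T - s) / tau        # variance of the bridge at time s
    # For Y ~ N(m, v): E[cos(Y)] = cos(m) * exp(-v / 2)
    bridge = simpson(lambda s: math.cos(line(s)) * math.exp(-var(s) / 2), t, T)
    straight = simpson(lambda s: math.cos(line(s)), t, T)
    return straight - bridge

# The error divided by (T - t)^2 should remain of order one as T - t -> 0.
ratios = [lemma2_error(0.2, 0.5, 0.0, tau) / tau**2 for tau in (0.1, 0.05, 0.025)]
print(ratios)
```

For this test function the ratio settles near the integral of cos(x_s) u(1 − u)/2 over u in [0, 1], which is at most 1/12, in line with the variance term in the proof.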
For general nondegenerate diffusions, consider the process St driven by the SDE:
Again the Lamperti transformation allows us to carry over the small time asymptotic
expansion (3.1) in x-space to s-space. Specifically, recall that the Lamperti transformation
$x_t = \varphi(s_t, t) = \int_{s_0}^{s_t} \frac{d\xi}{a(\xi,t)}$ transforms the SDE (3.2) into a Brownian motion
\[ p^S(T, s_T \mid t, s_t) = \frac{1}{a(s_T, T)}\, p^X(T, x_T \mid t, x_t), \]
with x T = ϕ(sT , T ) and xt = ϕ(st , t). Hence, a small time asymptotic expansion
as t → T − for p S can be obtained by simply applying the expansion (3.1). This
argument is formalized in Theorem 4.
where $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t, t) + \frac{\tau-t}{T-t}\,\varphi(s_T, T)$.
We stress once again that in the time-inhomogeneous case, a = a(s, t), the
expansion in (3.3) is not identical to the classical heat kernel expansion as it involves
an integral along the path ϕτ . On the other hand, in the time-homogeneous case a =
a(s), (3.3) does recover the classical heat kernel expansion. In this sense therefore,
we have derived a natural generalization of the classical heat kernel expansion.
Proof We verify that the expansion (3.4) is indeed the classical heat kernel expansion.
The classical heat kernel expansion up to first order (see, for instance, Gatheral et al.
[9]) reads in our notation
where $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T)$. Therefore, it suffices to show that
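and
\[ -\frac{1}{2} \int_t^T \left[ h^2(\varphi_\tau) + h'(\varphi_\tau) \right] d\tau = \frac{T-t}{\varphi(s_T) - \varphi(s_t)} \int_{s_t}^{s_T} \frac{L u(s, s_T)}{u(s, s_T)}\, \frac{ds}{a(s)}. \tag{3.6} \]
For (3.5), since $h \circ \varphi(s) = \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2}$, $\varphi'(s) = \frac{1}{a(s)}$, and $H$ is an antiderivative of
$h$, we have
\[ \begin{aligned}
H \circ \varphi(s_T) - H \circ \varphi(s_t) &= \int_{\varphi(s_t)}^{\varphi(s_T)} h(\xi)\,d\xi = \int_{s_t}^{s_T} h \circ \varphi(s)\,d\varphi(s) \\
&= \int_{s_t}^{s_T} \left( \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2} \right) \frac{ds}{a(s)} = \int_{s_t}^{s_T} \frac{\mu(s)}{a^2(s)}\,ds - \frac{1}{2} \log \frac{a(s_T)}{a(s_t)}.
\end{aligned} \]
Therefore,
\[ e^{H \circ \varphi(s_T) - H \circ \varphi(s_t)} = \sqrt{\frac{a(s_t)}{a(s_T)}}\; e^{\int_{s_t}^{s_T} \frac{\mu(s)}{a^2(s)}\,ds} = u(s_t, s_T). \]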
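As for (3.6), since $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T)$, we have
\[ \begin{aligned}
\int_t^T \left[ h^2(\varphi_\tau) + h'(\varphi_\tau) \right] d\tau
&= \int_t^T \left[ h^2\!\left( \tfrac{T-\tau}{T-t}\,\varphi(s_t) + \tfrac{\tau-t}{T-t}\,\varphi(s_T) \right) + h'\!\left( \tfrac{T-\tau}{T-t}\,\varphi(s_t) + \tfrac{\tau-t}{T-t}\,\varphi(s_T) \right) \right] d\tau \\
&= \frac{T-t}{\varphi(s_T) - \varphi(s_t)} \int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s).
\end{aligned} \]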
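\[ h' \circ \varphi(s) = \frac{1}{\varphi'(s)} \frac{d}{ds}\left[ h \circ \varphi(s) \right] = a(s) \times \frac{d}{ds}\left( \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2} \right), \]
consequently,
\[ \int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s) = \int_{s_t}^{s_T} \left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^{2} + a \left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] \frac{ds}{a(s)}, \]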
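\[ \begin{aligned}
L u(s, s_T) &= \frac{a^2(s)}{2}\, \partial_s^2 u(s, s_T) + \mu(s)\, \partial_s u(s, s_T) \\
&= \left[ -\frac{\mu^2}{2a^2} - \frac{(a')^2}{8} - \frac{\mu'}{2} + \frac{\mu a'}{a} + \frac{a a''}{4} \right] u(s, s_T) \\
&= -\frac{1}{2} \left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^{2} + a \left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] u(s, s_T).
\end{aligned} \]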
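It follows that
\[ \begin{aligned}
\int_{s_t}^{s_T} \frac{L u(s, s_T)}{u(s, s_T)}\, \frac{ds}{a(s)}
&= -\frac{1}{2} \int_{s_t}^{s_T} \left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^{2} + a \left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] \frac{ds}{a(s)} \\
&= -\frac{1}{2} \int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s),
\end{aligned} \]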
The implied volatility σBS = σBS (K , T ) is defined implicitly by solving the nonlin-
ear equation
C(s, t, K , T ) = CBS (s, t, K , T, σBS (K , T )), (4.1)
where the function CBS on the right hand side is the celebrated Black-Scholes pricing
formula for call options (assuming zero interest rate and dividend yield):
As t → T − , the main contribution to the integral comes from the minimum point of
d, which in this case is the boundary point of the support of f because, in the OTM
case, d(x) is strictly increasing in x, and f (x) has the payoff function as a factor (see
(4.4)). To zeroth order, the Laplace asymptotic formula (for example, see (5.2.23)
on p. 193 of Bleistein and Handelsman [3]) then reads
\[ \int_K^\infty e^{-\frac{d(x)}{T-t}} f(x)\,dx \approx (T-t)^2\, e^{-\frac{d(K)}{T-t}}\, \frac{f'(K)}{|d'(K)|^2} \tag{4.3} \]
order term. Our objective in this section is to demonstrate how to implement this
matching procedure from the path integral perspective.
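The Laplace asymptotic formula (4.3) can be checked numerically. The sketch below is our own illustration: it assumes the prefactor is read as f'(K)/|d'(K)|^2 (natural because f vanishes at K), and it picks the hypothetical test functions d(x) = x^2/2 and f(x) = x − K, with the common factor e^{-d(K)/(T−t)} divided out to avoid underflow. The function names are illustrative only.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def laplace_ratio(tau, K=1.0):
    """Exact integral over leading Laplace term, both divided by exp(-d(K)/tau)."""
    d = lambda x: 0.5 * x * x      # strictly increasing on [K, inf) when K > 0
    f = lambda x: x - K            # payoff-like factor, vanishes at K
    upper = K + 60.0 * tau         # beyond this point the integrand is below e^{-60}
    integral = simpson(lambda x: math.exp(-(d(x) - d(K)) / tau) * f(x), K, upper)
    leading = tau**2 * 1.0 / K**2  # tau^2 * f'(K) / d'(K)^2 with f'(K) = 1, d'(K) = K
    return integral / leading

for tau in (0.01, 0.001):
    print(tau, laplace_ratio(tau))
```

The ratio approaches 1 as τ = T − t decreases, with a relative correction of order τ, which is the sense in which (4.3) holds to zeroth order.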
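Recasting Eq. (4.1) for implied volatility using our path integral representation of
the density, and using our earlier representation (2.9) of the call price, we obtain
\[ \begin{aligned}
&\int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau - h(x_\tau,\tau)|^2 d\tau}\, D[x]\, ds_T \\
&\quad = \int_K^\infty \frac{s_T - K}{\sigma_{BS}\, s_T}\, e^{-\frac{\sigma_{BS}}{2}(x_T - x_t) - \frac{\sigma_{BS}^2}{8}(T-t)} \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau|^2 d\tau}\, D[x]\, ds_T \\
&\quad = \int_K^\infty \frac{s_T - K}{\sqrt{2\pi(T-t)}\, \sigma_{BS}\, s_T}\, e^{-\frac{1}{2}\left( \frac{\log s_T - \log s_t}{\sigma_{BS}\sqrt{T-t}} + \frac{\sigma_{BS}\sqrt{T-t}}{2} \right)^{2}} ds_T. 
\end{aligned} \tag{4.4} \]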
To rederive the results in Berestycki et al. [2] and Gatheral et al. [9] from (4.4), we
approximate both sides of (4.4) as Laplace type integrals as in (4.2). The path integral
on the left hand side of (4.4) is approximated as follows:
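\[ \begin{aligned}
\int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau - h(x_\tau,\tau)|^2 d\tau}\, D[x]
&= \int_{C_x} e^{-\frac{1}{2}\left[ \int_t^T |\dot{x}_\tau|^2 d\tau - 2\int_t^T h(x_\tau,\tau)\,dx_\tau + \int_t^T h^2(x_\tau,\tau)\,d\tau \right]}\, D[x] \\
&\approx \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau|^2 d\tau} \left[ 1 - 2\int_t^T h(x_\tau, \tau)\,dx_\tau + \int_t^T h^2(x_\tau, \tau)\,d\tau \right] D[x] \\
&\approx e^{-\frac{(x_T - x_t)^2}{2(T-t)}} \left[ 1 + O(T-t) \right],
\end{aligned} \]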
where in the last step we approximated the path integral by evaluating the integral in
the exponent along a single path: the straight line connecting $x_t$ and $x_T$. Recall that
$x_t = \varphi(s_t, t) = \int_{s_0}^{s_t} \frac{d\xi}{a(\xi,t)}$. Substitution back into the left hand side of (4.4) gives
\[ \int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau - h(x_\tau,\tau)|^2 d\tau}\, D[x]\, ds_T \approx \int_K^\infty e^{-\frac{|\varphi(s_T,T) - \varphi(s_t,t)|^2}{2(T-t)}} \left[ 1 + O(T-t) \right] \frac{s_T - K}{a(s_T, T)}\, ds_T, \]
which is of Laplace type as in (4.2). Applying the Laplace asymptotic formula (4.3),
we obtain that, up to a factor,
\[ C(s, t, K, T) \approx e^{-\frac{|\varphi(K,T) - \varphi(s,t)|^2}{2(T-t)}}. \tag{4.5} \]
Likewise, the Black-Scholes price on the right hand side of (4.4) is given, up to a
factor, by
\[ C_{BS}(s, t, K, T) \approx e^{-\frac{|\log K - \log s|^2}{2\sigma_{BS}^2 (T-t)}}. \tag{4.6} \]
Finally, by matching the exponents in (4.5) and (4.6), we obtain the zeroth order
approximation of the implied volatility as
\[ \sigma_{BS} \approx \frac{\log K - \log s}{\varphi(K, T) - \varphi(s, t)}. \]
\[ \varphi(K) - \varphi(s) = \int_s^K \frac{d\xi}{a(\xi)} \]
and we recover the BBF formula as in Berestycki et al. [2] and Gatheral et al. [9].
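To make the BBF formula concrete, here is a minimal Python sketch that evaluates the zeroth order implied volatility log(K/s) divided by the integral of 1/a from s to K by quadrature. It assumes that a(ξ) is the time-homogeneous diffusion coefficient of the price process (so that a(ξ) = σξ corresponds to Black-Scholes); the function names and the CEV test case a(ξ) = σ ξ^β are our own illustrative choices.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def bbf_implied_vol(a, s, K, n=2000):
    """Zeroth order implied volatility: log(K/s) / int_s^K dxi / a(xi)."""
    if K == s:
        return a(s) / s            # at-the-money limit of the ratio
    integral = simpson(lambda xi: 1.0 / a(xi), s, K, n)
    return math.log(K / s) / integral

# Black-Scholes: a(xi) = sigma * xi gives back sigma for every strike.
flat = bbf_implied_vol(lambda xi: 0.2 * xi, 1.0, 1.2)

# CEV-type coefficient a(xi) = sigma * xi**beta, compared with the closed form.
sigma, beta, s0, strike = 0.2, 0.5, 1.0, 1.3
cev = bbf_implied_vol(lambda xi: sigma * xi**beta, s0, strike)
closed = sigma * (1 - beta) * math.log(strike / s0) / (strike**(1 - beta) - s0**(1 - beta))
print(flat, cev, closed)
```

The same routine works for strikes below spot, since both the logarithm and the integral change sign together.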
where
\[ D[s] = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi\Delta t}\; a(s_T, T)} \prod_{i=1}^{n-1} \frac{ds_i}{\sqrt{2\pi\Delta t}\; a(s_i, t_i)}. \]
\[ C(t, s_t, K, T) = \int_K^\infty (s_T - K) \int_{C_s} e^{-\frac{1}{2}\int_t^T\left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)}+\frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s]\, ds_T. \]
integral and evaluating the resulting path integral along the path that maximizes the
functional
\[ e^{-\frac{1}{2}\int_t^T \left( \frac{\dot{s}_\tau}{a(s_\tau,\tau)} \right)^{2} d\tau}. \]
In other words,
\[ C(s, t, K, T) \approx \int_K^\infty (s_T - K)\, e^{-\frac{1}{2}\int_t^T \left( \frac{\dot{s}^*_\tau}{a(s^*_\tau,\tau)} \right)^{2} d\tau}\, ds_T, \]
where $s^*_\tau$ is the optimal path that minimizes the action functional $\int_t^T \left( \frac{\dot{s}_\tau}{a(s_\tau,\tau)} \right)^{2} d\tau$
subject to the constraints that initial and terminal points are fixed at $s_t$ and $s_T$ respectively.
Moreover, since the resulting integral is of Laplace type, the call price is given
asymptotically, up to a factor, by
\[ C(s, t, K, T) \approx e^{-\frac{1}{2}\int_t^T \left( \frac{\dot{s}^*_\tau}{a(s^*_\tau,\tau)} \right)^{2} d\tau}, \]
where the optimal path $s^*_\tau$ has initial and terminal points $s$ and $K$ respectively. Finally,
by matching the exponent with the Black-Scholes asymptotic as in (4.6), the zeroth
order approximation of implied volatility is given by
\[ \sigma_{BS} \approx \frac{|\log K - \log s|}{\sqrt{T-t}} \left[ \int_t^T \left( \frac{\dot{s}^*_\tau}{a(s^*_\tau, \tau)} \right)^{2} d\tau \right]^{-\frac{1}{2}} \]
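\[ \frac{1}{2} \int_t^T |\dot{x}_\tau - h(x_\tau, \tau)|^2\, d\tau \tag{4.7} \]
\[ \frac{1}{2} \int_t^T \left[ \frac{\dot{s}_\tau}{a(s_\tau, \tau)} + \frac{a_s(s_\tau, \tau)}{2} \right]^{2} d\tau \tag{4.8} \]
without dropping terms. The Euler-Lagrange equation associated with the functional
in (4.7) is
\[ \ddot{x}_\tau = h\, h_x + h_t \tag{4.9} \]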
where $x^*_\tau$ is the optimal path which minimizes the functional (4.7) (or equivalently
solves (4.9)) with initial and terminal points given by $\varphi(s, t)$ and $\varphi(K, T)$ respectively.
Solving (4.10) for $\sigma_{BS}$ yields our new-and-improved zeroth order approximation
for implied volatility.
To illustrate the accuracy of our new approximation (4.10), consider the case
of time dependent Black-Scholes, where rather pleasingly, (4.10) gives the exact
solution. Note in passing that, to the best of our knowledge, none of the existing
small time approximations is able to recover this very simple case.
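\[ dS_\tau = \sigma(\tau)\, S_\tau\, dB_\tau, \qquad S_t = s_t. \]
\[ \begin{aligned}
dX_t &= \dot{\varphi}(S_t, t)\,dt + \varphi_s(S_t, t)\,dS_t + \frac{1}{2}\varphi_{ss}(S_t, t)\,d[S]_t \\
&= dB_t - \left( \frac{\sigma'}{\sigma} X_t + \frac{\sigma}{2} \right) dt.
\end{aligned} \]
Thus $h(x, t) = -\frac{\sigma}{2} - \frac{\sigma'}{\sigma}\, x$.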
With the change of variable $x = \frac{z}{\sigma}$, the above ODE for $x$ is transformed into the
following ODE for $z$
\[ \ddot{z} - \frac{2\sigma'}{\sigma}\,\dot{z} = 0 \;\Longrightarrow\; \frac{d}{d\tau}\left( \frac{\dot{z}}{\sigma^2} \right) = 0. \]
\[ \sigma_\tau x_\tau = z_\tau = \sigma_t x_t + \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds} \int_t^\tau \sigma^2(s)\,ds. \]
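(c) Solve for implied volatility: It follows that the functional (4.7) evaluated along
the optimal path, taking into account that $\frac{\dot{z}}{\sigma^2} = \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds}$ is a constant, is given
by
\[ \begin{aligned}
\int_t^T |\dot{x}_\tau - h(x_\tau, \tau)|^2\, d\tau
&= \int_t^T \left( \dot{x}_\tau + \frac{\sigma}{2} + \frac{\sigma'}{\sigma}\, x \right)^{2} d\tau \\
&= \int_t^T \left( \frac{\partial_\tau(\sigma x)}{\sigma} + \frac{\sigma}{2} \right)^{2} d\tau = \int_t^T \sigma^2 \left( \frac{\dot{z}}{\sigma^2} + \frac{1}{2} \right)^{2} d\tau \\
&= \left( \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds} + \frac{1}{2} \right)^{2} \int_t^T \sigma_\tau^2\, d\tau \\
&= \left( \frac{\log s_T - \log s_t}{\int_t^T \sigma^2(s)\,ds} + \frac{1}{2} \right)^{2} \int_t^T \sigma^2(s)\,ds.
\end{aligned} \]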
Finally, substituting this last expression into (4.10) gives the well-known result
\[ \sigma_{BS}^2 = \frac{1}{T-t} \int_t^T \sigma^2(s)\,ds, \]
which is exact.
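The closed-form result is easy to evaluate for any deterministic volatility function. The following Python sketch, with hypothetical function names and a linear volatility schedule chosen purely for illustration, computes the implied volatility as the root-mean-square of σ over [t, T].

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def time_dependent_bs_implied_vol(sigma, t, T, n=2000):
    """sigma_BS = sqrt( (1 / (T - t)) * int_t^T sigma(s)^2 ds )."""
    integrated_variance = simpson(lambda s: sigma(s) ** 2, t, T, n)
    return math.sqrt(integrated_variance / (T - t))

# Constant volatility is returned unchanged.
const_vol = time_dependent_bs_implied_vol(lambda s: 0.3, 0.0, 2.0)

# Illustrative linear schedule sigma(s) = 0.2 + 0.1 s on [0, 1]:
# the integrated variance is 0.04 + 0.02 + 0.01/3 = 0.19/3.
linear_vol = time_dependent_bs_implied_vol(lambda s: 0.2 + 0.1 * s, 0.0, 1.0)
print(const_vol, linear_vol)
```

Note that the result does not depend on the strike, consistent with the flat smile of the time-dependent Black-Scholes model.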
5 Conclusion
We have shown, up to first order in τ = T − t, that the classical heat kernel expan-
sion can be derived using a novel probabilistic approach. This new probabilistic
derivation of the heat kernel expansion inspires a path integral representation of the
transition density; natural definitions of the most-likely-path approximation of the
transition density, the call price, and the implied volatility then follow. In the time
homogeneous case, we recover well-known classical results. However, in the time
inhomogeneous case, we obtain a new asymptotic expansion that generalizes the
classical one. We showed how the lowest order approximation of Berestycki, Busca
and Florent as well as the higher order approximations of Gatheral et al. [9] and
Gatheral and Wang [8] correspond to dropping terms in our lowest order path inte-
gral representation. We further showed that by restoring the dropped terms, our new
representation recovers the exact expression for Black-Scholes implied volatility in
the time-dependent Black-Scholes model, which no existing asymptotic expansion
technique has so far been able to achieve, to the best of our knowledge. Further applications
of this promising approach to the important practical problem of accurately
approximating implied volatility under local volatility are left for future research.
Acknowledgments We thank the anonymous reviewer for his helpful and constructive comments.
We are also grateful for helpful discussions with the participants of the following seminars: Math
Finance and PDE Seminar at Rutgers University, Probability Seminar at TU Berlin, Probability
Seminar at Academia Sinica, Mathematics Colloquium at Ritsumeikan University, Mathematical
Finance Seminar at Osaka University. All errors are our own responsibility.
References
1. Baldi, P., Caramellino, L.: Asymptotics of hitting probabilities for general one-dimensional
diffusions. Ann. Appl. Probab. 12, 1071–1095 (2002)
2. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
3. Bleistein, N., Handelsman, R.A.: Asymptotic Expansions of Integrals. Dover Publications,
New York (1986)
4. Chavel, I.: Eigenvalues in Riemannian geometry. Pure and Applied Mathematics, Book 115,
Academic Press (1984)
5. Cheng, W., Costanzino, N., Liechty, J., Mazzucato, A.L., Nistor, V.: Closed-form asymptotics
and numerical approximations of 1D parabolic equations with applications to option pricing.
SIAM J. Financ. Math. 2(1), 901–934 (2011)
6. Gatheral, J.: The Volatility Surface: A Practitioner’s Guide, Wiley Finance (2006)
7. Gatheral, J., Jacquier, A.: Arbitrage-free SVI volatility surfaces. Quant. Financ. 14(1), 59–71
(2014)
8. Gatheral, J., Wang, T.-H.: The heat kernel most-likely-path approximation. Int. J. Theor. Appl.
Financ. 15(1), 1250001 (2012)
9. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
10. Goovaerts, M., De Schepper, A., Decamps, M.: Closed-form approximations for diffusion
densities: a path integral approach. J. Comput. Appl. Math. 164–165, 337–364 (2004)
11. Guyon, J., Henry-Labordère, P.: From spot volatilities to implied volatilities. Risk Mag. pp.
79–84 (2011)
12. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance. Chapman & Hall/CRC,
Financial Mathematics Series (2008)
13. Hsu, E.P.: Stochastic Analysis on Manifolds. Graduate Studies in Mathematics, American
Mathematical Society (2002)
14. Jordan, R., Tier, C.: Asymptotic approximations to deterministic and stochastic volatility mod-
els. SIAM J. Financ. Math. 2(1), 935–964 (2011)
15. Keller-Ressel, M., Teichmann, J.: A remark on Gatheral’s ’most-likely path approximation’ of
implied volatility. In: Springer Proceedings in Mathematics & Statistics (2014)
16. Linetsky, V.: The path integral approach to financial modeling and options pricing. Comput.
Econ. 11(1–2), 129–163 (1997)
17. Reghai, A.: The hybrid most likely path. Risk Mag. 34–35 (2006)
Extrapolation Analytics for Dupire’s Local
Volatility
1 Introduction
One of the main objectives in option pricing theory is to price exotic derivatives con-
sistently with observed vanilla prices. According to the seminal work of Dupire [5],
this can in principle be achieved, for a one-dimensional underlying, by a model with
dynamics d St /St = σ(St , t)dWt . As opposed to stochastic volatility models, here
the volatility is a deterministic function of time and current underlying price. Any
given smooth call price surface C(K , T ), for strikes K > 0 and maturities T > 0,
A preprint of this article circulated under the title “Don’t stay local—extrapolation analytics for
Dupire’s local volatility”.
P. Friz (B)
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]
P. Friz
Weierstraß-Institut für Angewandte Analysis und Stochastik,
Berlin, Germany
S. Gerhold
Financial and Actuarial Mathematics, Vienna University of Technology, Wiedner Hauptstraße
8/105-1 A-1040, Vienna, Austria
e-mail: [email protected]
can be recovered by a so-called local volatility model d St /St = σloc (St , t)dWt ,
where the volatility function is given by Dupire’s formula [5]
\[ \sigma_{loc}^2(K, T) = \frac{2\,\partial_T C}{K^2\, \partial_{KK} C}. \tag{1} \]
Exotic options can then be priced by Monte Carlo simulation. Local volatility models
are of considerable practical importance, and serve as building blocks for more
advanced models, e.g. local-stochastic-volatility (LSV) models.
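As an illustration of Dupire's formula (1), the sketch below applies central finite differences to a call price surface and recovers the local volatility. It is a minimal example under our own assumptions: the test surface is generated by the Black-Scholes model with constant volatility and zero interest rate, the step sizes are arbitrary, and the function names are hypothetical. In that setting the recovered local volatility should equal the input volatility.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, K, T, sigma):
    """Black-Scholes call price with zero interest rate and dividend yield."""
    v = sigma * math.sqrt(T)
    d1 = math.log(s0 / K) / v + 0.5 * v
    return s0 * norm_cdf(d1) - K * norm_cdf(d1 - v)

def dupire_local_vol(call, K, T, dK=1e-3, dT=1e-4):
    """Local volatility from sigma_loc^2 = 2 dC/dT / (K^2 d2C/dK2), by central differences."""
    dC_dT = (call(K, T + dT) - call(K, T - dT)) / (2 * dT)
    d2C_dK2 = (call(K + dK, T) - 2 * call(K, T) + call(K - dK, T)) / dK**2
    return math.sqrt(2 * dC_dT / (K**2 * d2C_dK2))

# Test surface: constant volatility 0.2, spot 1.
surface = lambda K, T: bs_call(1.0, K, T, 0.2)
recovered = dupire_local_vol(surface, 1.1, 1.0)
print(recovered)
```

Far in the wings the second strike derivative becomes very small and finite differences lose accuracy, which is one instance of the numerical instability that motivates asymptotic formulas.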
In the present paper, we consider local volatility surfaces that arise from call
prices that are generated by some model for the underlying. Our aim is to turn the
knowledge of that model’s mgf (moment generating function; of log-spot X T ) into
asymptotic results of the corresponding local volatility surface. In [3], we described
two applications of such approximations. One is to the design of local volatility
parametrizations, whose asymptotic behavior may be matched to our results. Another
application concerns model risk. Consider pricing under an “advanced” model (affine
stochastic volatility, Lévy, etc.; anything with known mgf) versus a local volatility
model. The relative difference between the prices has been named “toxicity index”
in [13]. Roughly speaking, it measures the distance of the trade from vanilla options.
The most consistent way to calculate this index is to use the local volatility model
generated by the “advanced” model, because only then all vanillas will have zero
toxicity. When computing the local volatility surface, our accurate approximations
can then profitably replace other numerical methods in regimes where the latter
become unstable (see [3] for details).
We suppose that the underlying price process St = exp(X t ) is a martingale under
the pricing measure P and write C(K , T ) for its call price surface. For simplicity we
assume zero interest rate throughout. If C is sufficiently smooth, then the associated
local volatility function is given by Dupire’s formula (1). Recall the main asymptotic
formula from [3]:
\[ \sigma_{loc}^2(K, T) \sim \left. \frac{2\, \frac{\partial}{\partial T} m(s, T)}{s(s-1)} \right|_{s = \hat{s}(k,T)}, \tag{2} \]
Here, m(s, T ) := log M(s, T ) is the logarithm of the moment generating function
(mgf) M, which is defined by M(s, T ) := E exp(s X T ) and is analytic in the (max-
imal) strip s− (T ) < Re(s) < s+ (T ). The numbers s− and s+ are called critical
exponents. In this note, we will use (2) for K → ∞, but other asymptotic regimes
can also be covered [3, 8]; it is thus not only a local-volatility analogue of Lee’s
moment formula [11], but works also for maturity (or joint) asymptotics.
Theorem 1 In the Heston model with $\rho \le 0$ (the relevant regime in practice, at least
for equity models), the asymptotic equivalence (2) holds for $k \to \infty$. The explicit
leading term is
\[ \sigma_{loc}^2(K, T) \sim \frac{2}{s_+(s_+ - 1)\, R_1/R_2} \times k, \qquad k \to \infty, \tag{5} \]
Proof It was shown in [3] that the right hand side of (5) asymptotically equals the
right hand side of (2). It thus remains to show that (2) holds for the Heston model as
k → ∞.
By the exponential decay of the Heston mgf towards ±i∞, the second equality
in formula (4) is correct for the Heston model. For the saddle point analysis of (4),
we employ the approximate saddle point
\[ \sigma(T) = -\frac{\partial T^*}{\partial s}(s_+(T)), \]
and
T ∗ (s) = sup{t ≥ 0 : E[es X t ] < ∞}.
This is the same approximate saddle point as in [7]; see there for more details on its
choice, and the definition of σ(T ) and T ∗ (s). (In [7], our ŝapprox was called simply ŝ,
since the exact saddle point of the denominator of (4), defined in (3), did not occur.)
This approximate saddle may be used for both integrals in (4). As for the denominator,
this was carried out in detail in [7], where an expansion of the Heston density ∂ K K C
was determined. The analysis of the numerator in (4) is similar, except that a new
tail estimate is required. But first we discuss the local expansion around the saddle
point. Let us fix a number $\alpha \in (\frac{2}{3}, \frac{3}{4})$ and define $h(k) = k^{-\alpha}$. Then, in the central
range $|s - \hat{s}_{approx}(k)| \le h(k)$, we have
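\[ \begin{aligned}
\frac{1}{s(s-1)} &= \frac{1}{s_+(s_+ - 1)} + O(s_+ - s) \\
&= \frac{1}{s_+(s_+ - 1)} \left( 1 + O(k^{-1/2}) \right)
\end{aligned} \]
\[ \begin{aligned}
2\, \frac{\partial}{\partial T} m(s, T) &= \frac{2\beta^2}{\sigma (s_+ - s)^2} + O\!\left( \frac{1}{s_+ - s} \right) \\
&= \frac{2\beta^2}{\sigma} \left( \beta k^{-1/2} + O(k^{-\alpha}) \right)^{-2} + O(k^{-1/2}) \\
&= \frac{2k}{\sigma} \left( 1 + O(k^{1/2-\alpha}) \right).
\end{aligned} \]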
Therefore, the local expansions of the two integrands in (4) agree, up to a factor that
is given by
\[ \frac{2\,\partial_T m(s, T)}{s(s-1)} = \frac{2k}{\sigma s_+(s_+ - 1)} \left( 1 + O(k^{1/2-\alpha}) \right), \tag{8} \]
where the error term holds uniformly w.r.t. the integration variable s. According to
Theorem 1.2 of [7], we have
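\[ \frac{1}{2i\pi} \int_{\hat{s}_{approx} - i h(k)}^{\hat{s}_{approx} + i h(k)} e^{-ks} M(s, T)\,ds \sim A_1\, e^{(1-A_3)k + A_2\sqrt{k}}\, k^{-3/4 + a/c^2} \tag{9} \]
\[ \frac{1}{2i\pi} \int_{\hat{s}_{approx} - i h(k)}^{\hat{s}_{approx} + i h(k)} \frac{2\,\partial_T m(s, T)}{s(s-1)}\, e^{-ks} M(s, T)\,ds \sim \frac{2k}{\sigma s_+(s_+ - 1)} \times A_1\, e^{(1-A_3)k + A_2\sqrt{k}}\, k^{-3/4 + a/c^2}. \tag{10} \]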
Dividing (10) by (9) shows our claim (5), provided that the tails |s − ŝapprox (k)| >
h(k) of the integrals can be discarded. For the denominator of (4), this was shown in
Lemma A.3 of [7]. So we proceed with the numerator. We consider only the upper
tail, as the lower one is handled by symmetry. By Lemma A.3 of [7], there is a
constant B > 0 such that
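\[ \left| \int_{\hat{s}_{approx} + i h(k)}^{\hat{s}_{approx} + i B} e^{-ks} M(s, T)\,ds \right| \le e^{(1-A_3)k} \exp\!\left( A_2\sqrt{k} - \tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha} + O(\log k) \right). \tag{11} \]
From formula (18) in [3] we obtain
\[ \left| \frac{\partial_T m(s, T)}{s(s-1)} \right| \le \mathrm{const} \times k \]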
for all s on the contour in (11). This estimate can be absorbed into the factor
exp(O(log k)) in (11), so that we conclude
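\[ \left| \int_{\hat{s}_{approx} + i h(k)}^{\hat{s}_{approx} + i B} \frac{\partial_T m(s, T)}{s(s-1)}\, e^{-ks} M(s, T)\,ds \right| \le e^{(1-A_3)k} \exp\!\left( A_2\sqrt{k} - \tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha} + O(\log k) \right). \tag{12} \]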
This grows slower than the right hand side of (10) (compare the relevant factors
$k^{-3/4 + a/c^2}$ resp. $\exp(-\tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha})$). As for $\mathrm{Im}(s) > B$, it was shown in [7]
This was deduced from the exponential decay of M(s, T ) for large Im(s) (Lemma
A.1 in [7]). The following lemma implies that the new factor ∂T m(s, T )/(s(s − 1))
grows only polynomially, so that the exponential decay of the integrand persists for
the numerator of (4). This finishes the proof of Theorem 1.
To state the lemma, recall that m(s, t) = φ(s, t) + v0 ψ(s, t), where φ and ψ
satisfy the Riccati equations
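\[ \begin{aligned}
\dot{\phi} &= a\psi, & \phi(0) &= 0, \\
\dot{\psi} &= \tfrac{1}{2}(s^2 - s) + \tfrac{1}{2} c^2 \psi^2 + b\psi + s\rho c\,\psi, & \psi(0) &= 0.
\end{aligned} \]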
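The Riccati system can be integrated numerically to obtain m(s, t) = φ(s, t) + v0 ψ(s, t). The sketch below is a minimal illustration restricted to real s, using a classical fourth order Runge-Kutta scheme; the function name, the step count and the parameter values are our own assumptions, and the coefficients a, b, c, ρ are taken literally from the equations as displayed (so a negative b corresponds to mean reversion). Since the right hand side of the ψ equation vanishes at ψ = 0 for s = 0 and s = 1, the log-mgf must be zero there, which reflects the martingale property of the underlying.

```python
def heston_log_mgf(s, T, v0, a, b, c, rho, steps=2000):
    """m(s, T) = phi + v0 * psi from the Riccati ODEs, for real s, via RK4."""
    def psi_dot(psi):
        return 0.5 * (s * s - s) + 0.5 * c * c * psi * psi + (b + s * rho * c) * psi

    h = T / steps
    phi, psi = 0.0, 0.0
    for _ in range(steps):
        k1 = psi_dot(psi)
        k2 = psi_dot(psi + 0.5 * h * k1)
        k3 = psi_dot(psi + 0.5 * h * k2)
        k4 = psi_dot(psi + h * k3)
        # phi' = a * psi, evaluated at the same Runge-Kutta stage values of psi
        l1 = a * psi
        l2 = a * (psi + 0.5 * h * k1)
        l3 = a * (psi + 0.5 * h * k2)
        l4 = a * (psi + h * k3)
        phi += h * (l1 + 2 * l2 + 2 * l3 + l4) / 6
        psi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return phi + v0 * psi

# Illustrative parameters (assumed, not taken from the text).
params = dict(v0=0.04, a=0.08, b=-2.0, c=0.3, rho=-0.7)
m_zero = heston_log_mgf(0.0, 1.0, **params)
m_one = heston_log_mgf(1.0, 1.0, **params)
m_half = heston_log_mgf(0.5, 1.0, **params)
print(m_zero, m_one, m_half)
```

For s strictly between 0 and 1 the value is negative, as expected from Jensen's inequality; approaching the critical exponent s_+ from below, ψ would grow without bound and the scheme would need a much finer step.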
\[ C_{1,T} = \frac{1}{3c}, \qquad C_{2,T} = \frac{1}{2}(2\xi_{max} - 1)\,T, \qquad C_{3,T} = T\left( 1 + \frac{c^2}{2}\, C_{2,T}^2 \right). \]
Proof It follows from the proof of Lemma A.1 in [7] that (e.g. with C1,T := T θ =
1√
c 1/6 ≤ 3c
1
)
\[ f(t) \le -T\theta y = -\frac{1}{c}\sqrt{1/6}\; y \le -\frac{1}{3c}\, y =: -C_{1,T}\, y. \]
We next provide a similar upper estimate for $g$. To this end we first show that $g = g(t)$
remains $\ge 0$ for all times $t > 0$. The differential equation for $g$,
\[ \dot{g} = \tfrac{1}{2}(2\xi y - y) + c^2 f g - \gamma g, \qquad g(0) = 0, \]
implies the first order Euler estimate
\[ \begin{aligned}
g(t) &= g(0) + \left[ \tfrac{1}{2}(2\xi y - y) + c^2 f(0) g(0) - \gamma g(0) \right] t + o(t) \\
&= \underbrace{\tfrac{1}{2}(2\xi y - y)}_{>0}\; t + o(t),
\end{aligned} \]
and hence g is positive (even strictly so) on some interval (0, ε1 ). Assume this interval
is maximal in the sense that g(ε1 ) = 0 and g is (strictly) negative on some further
interval (ε1 , ε2 ). Clearly then ġ(ε1 ) ≤ 0, which contradicts the information from the
differential equation: indeed, using g(ε1 ) = 0, we obtain the contradiction
\[ \dot{g}(\varepsilon_1) = \underbrace{\tfrac{1}{2}(2\xi y - y)}_{>0}. \]
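The observation that $g \ge 0$ is useful to us, since it leads, together with $f \le -C_{1,T}\, y$
and $\gamma \ge 0$, to the differential inequality
\[ \begin{aligned}
\dot{g} &= \tfrac{1}{2}(2\xi y - y) + c^2 f g - \gamma g \\
&\le \tfrac{1}{2}(2\xi y - y) - \left( c^2 C_{1,T}\, y + \gamma \right) g \\
&\le \tfrac{1}{2}(2\xi y - y),
\end{aligned} \]
and hence to the upper estimate
\[ \forall\, 0 \le t \le T: \quad g(t) \le \tfrac{1}{2}(2\xi_{max} - 1)\, T \times y =: C_{2,T}\, y. \]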
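We can feed this upper estimate on $g$ back in the differential equation for $f$ to obtain
a lower estimate
\[ \begin{aligned}
\dot{f} &= \tfrac{1}{2}\left( \xi^2 - y^2 - \xi \right) + \tfrac{c^2}{2}\left( f^2 - g^2 \right) - \gamma f \\
&\ge \tfrac{1}{2}\left( \xi^2 - y^2 - \xi \right) + \tfrac{c^2}{2} f^2 - \tfrac{c^2}{2}\, C_{2,T}^2\, y^2 - \gamma f \\
&= -\tfrac{1}{2}\left( 1 + c^2 C_{2,T}^2 \right) y^2 + \tfrac{1}{2}\left( \xi^2 - \xi \right) - \gamma f + \tfrac{c^2}{2} f^2 \\
&\ge -\tfrac{1}{2}\left( 1 + c^2 C_{2,T}^2 \right) y^2 + \tfrac{1}{2}\left( \xi^2 - \xi \right) - \gamma f \\
&\ge -\left( 1 + \tfrac{c^2}{2}\, C_{2,T}^2 \right) y^2 - \gamma f,
\end{aligned} \]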
where in the last step we assume that $y$ is large enough so that the extra amount
subtracted (at least $\tfrac{1}{2} y^2$) is larger than $\tfrac{1}{2}|\xi^2 - \xi|$, which remains bounded. We also
know that $f(t) \le -C_{1,T}\, y \le 0$ for all $0 \le t \le T$. It follows that $-\gamma f \ge 0$, and
omitting this term leads to our final lower bound on $\dot{f}$, namely
\[ \dot{f} \ge -\left( 1 + \tfrac{c^2}{2}\, C_{2,T}^2 \right) y^2. \]
\[ f(t) \ge -T\left( 1 + \tfrac{c^2}{2}\, C_{2,T}^2 \right) y^2 =: -C_{3,T}\, y^2. \]
\[ \sigma_{loc}^2(K, T) \sim \frac{2 \log(k/T)}{\nu s_+(s_+ - 1)}, \qquad k \to \infty. \tag{14} \]
Note that the numerator of (14) is ∼ 2 log k. We kept the T -dependence, because
the same analysis works for fixed k and T → 0, and in fact for any asymptotic regime
with k/T → ∞. This is a common feature of Lévy models, since the right-hand side
of (2) depends on k and T only through k/T .
Proof We write the moment generating function as
\[ M(s, T) = e^{bTs} \left( \tfrac{1}{2}\sigma^2 \nu\, (s_+ - s)(s - s_-) \right)^{-T/\nu} \]
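\[ \frac{1}{2i\pi} \int_{-i\infty}^{i\infty} e^{-ks} M(s, T)\,ds \approx \frac{\exp\!\left( m(\hat{s}, T) - k\hat{s} \right)}{\sqrt{2\pi\, m''(\hat{s}, T)}}. \tag{15} \]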
The interesting point now is that (15) is wrong for the variance gamma model,
inasmuch as asymptotic equality does not hold. The algebraic singularity of the mgf
is not pronounced enough to make the saddle point method work; see also the remark
after the proof. For a correct analysis, we use an integration contour as in Fig. 1. The
U-shaped notch, denoted by C(k), extends a bit to the right of the singularity s+ ,
and captures enough asymptotic information from it. By transformation into a so
called Hankel path, Hankel’s representation of the Gamma function can be invoked
after termwise integration of a local expansion. This “Hankel contour approach” is
well known in analytic combinatorics, in particular, from the so-called singularity
analysis of generating functions [6].
Let us first argue that the integrals over the dashed lines in Fig. 1 can be discarded.
By symmetry, it suffices to consider the upper one. The real part of s is then Re(s) =
s+ + (log k)/k. First suppose that s is away from the singularity, say Im(s) > 1. The
integral of $((s_+ - s)(s - s_-))^{-T/\nu}$ over this part of the contour is $O(1)$, and so we
get the bound $O(e^{-k\mathrm{Re}(s)}) = O(e^{-ks_+}/k)$. Now consider $s$ with $1/k \le \mathrm{Im}(s) < 1$.
We estimate the resulting integral by the length of the contour, which is $O(1)$, times
the absolute value of the integrand at the lower endpoint $s = s_+ + (\log k)/k + i/k$.
The latter is easily seen to be $O(e^{-ks_+} k^{T/\nu - 1} (\log k)^{-T/\nu})$.
We will now show that the integral over C(k) is of order e−ks+ k T /ν−1 , so that the
tail estimates we have just derived are good enough. The factor (s − s− ) is locally
almost constant; we have, uniformly for s ∈ C(k),
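\[ \int_{C(k)} e^{-ks} M(s, T)\,ds \sim \int_{C(k)} e^{-ks}\, \frac{c_1}{(s_+ - s)^{T/\nu}}\,ds. \]
\[ \begin{aligned}
\frac{e^{-ks_+}}{k} \int_{H(k)} e^{w}\, c_1 \left( \frac{k}{w} \right)^{T/\nu} dw
&= c_1\, \frac{e^{-ks_+}}{k^{1-T/\nu}} \int_{H(k)} e^{w} w^{-T/\nu}\,dw \\
&\sim c_1\, \frac{e^{-ks_+}}{k^{1-T/\nu}} \int_{H(\infty)} e^{w} w^{-T/\nu}\,dw.
\end{aligned} \]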
The integration paths are displayed in Fig. 2. The right one, H(∞), is called a Hankel
contour; H(k) is a Hankel contour truncated at Re(s) = −log k. Now recall Hankel’s
representation for the Gamma function [12]:
\[ \frac{1}{2i\pi} \int_{H(\infty)} e^{w} w^{-z}\,dw = \frac{1}{\Gamma(z)}. \]
Fig. 2 The integration contours H(k) and H(∞). The dots should indicate that the contour H(∞)
extends to −∞
We thus arrive at
\[ \frac{1}{2i\pi} \int_{-i\infty}^{i\infty} e^{-ks} M(s, T)\,ds \sim \frac{c_1}{\Gamma(T/\nu)}\, e^{-ks_+} k^{T/\nu - 1}. \tag{16} \]
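The numerator of (4) can be treated analogously, with a very similar tail estimate.
The contribution of the new factor to the local expansion is
\[ \begin{aligned}
\frac{2\,\partial_T m(s, T)}{s(s-1)} &\sim \frac{2/\nu}{s_+(s_+ - 1)} \log \frac{1}{s_+ - s} \\
&= \frac{2/\nu}{s_+(s_+ - 1)} \log \frac{k}{w} \\
&\sim \frac{2 \log k}{\nu s_+(s_+ - 1)},
\end{aligned} \]
and so
\[ \frac{1}{2i\pi} \int_{-i\infty}^{i\infty} \frac{2\,\partial_T m(s, T)}{s(s-1)}\, e^{-ks} M(s, T)\,ds \sim \frac{2 \log k}{\nu s_+(s_+ - 1)} \times \frac{c_1}{\Gamma(T/\nu)}\, e^{-ks_+} k^{T/\nu - 1}. \tag{17} \]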
Without giving proofs, we briefly discuss local volatility asymptotics for two other
jump models. The mgf of Kou’s double exponential Lévy jump diffusion is given by
\[ M(s, T) = \exp\!\left( T \left( bs + \frac{\sigma^2 s^2}{2} + \lambda \left( \frac{\lambda_+ p}{\lambda_+ - s} + \frac{\lambda_- (1-p)}{\lambda_- + s} - 1 \right) \right) \right). \]
The singularity type, the same as in the Heston model, is amenable to the saddle
point method. Formula (2) can thus certainly be verified, and yields
\[ \sigma_{loc}^2(K, T) \sim \frac{2\sqrt{\lambda p}}{\sqrt{\lambda_+ T}\,(\lambda_+ - 1)}\; k^{1/2}, \qquad k \to \infty. \]
For T → 0, the blowup of local volatility is of order T −1/2 . (Just as the Hankel
contour analysis in the proof of Theorem 3 can be carried out for any asymptotic
regime with k/T → ∞, the same is true when applying the saddle point method to
the local volatility surface of a Lévy model.)
Finally, we consider the normal inverse Gaussian (NIG) model. The mgf
\[ M(s, T) = \exp\!\left( Tbs + \delta T \left( \sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + s)^2} \right) \right) \]
s+ = α − β,
of $M(s, T)$ near $s_+$. However, the approximation (2) hinges on the first term of the
local expansion of $M(s, T)$. It therefore fails to capture the asymptotics of $\sigma_{loc}^2(K, T)$
here, which depend on the first singular term (the term $\sqrt{s_+ - s}$ in (18)). The NIG
correct result of convergence to a constant, but a wrong one.) The Hankel contour
analysis in the proof of Theorem 3 can be adapted to handle this situation. The result
is that local volatility tends to a constant for k → ∞. This fact may be understood
by comparing the NIG marginals with those of Heston’s in the time T → ∞ regime
(this link is made precise in [10]). In particular, the result is then consistent with the
Heston asymptotics (5) of local vol, given that the O(k) term carries a factor ≈ 1/T
which tends to zero as T → ∞.
Acknowledgments We thank M. Drmota, J. Morgenbesser, and the referee for helpful comments,
and gratefully acknowledge financial support from MATHEON (P. Friz) resp. the Austrian Science
Fund (FWF) under grant P 24880-N25 (S. Gerhold).
References
1. Carr, P., Chang, E., Madan, D.: The variance gamma process and option pricing. Eur. Financ.
Rev. 2, 79–105 (1998)
2. Carr, P., Geman, H., Madan, D.P., Yor, M.: From local volatility to local Lévy models. Quant.
Financ. 4, 581–588 (2004)
3. De Marco, S., Friz, P., Gerhold, S.: Rational shapes of local volatility. Risk 2, 82–87 (2013)
4. Drmota, M., Soria, M.: Marking in combinatorial constructions: generating functions and
limiting distributions. Theor. Comput. Sci. 144, 67–99 (1995). Special volume on mathematical
analysis of algorithms
5. Dupire, B.: Pricing with a smile. Risk 7, 18–20 (1994)
6. Flajolet, P., Odlyzko, A.: Singularity analysis of generating functions. SIAM J. Discret. Math.
3, 216–240 (1990)
7. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11, 1151–1164 (2011)
8. Friz, P.K., Gerhold, S., Yor, M.: How to make Dupire’s local volatility work with jumps. Quant.
Financ. 14, 1327–1331 (2014)
9. Gatheral, J., Jacquier, A.: Convergence of Heston to SVI. Quant. Financ. 11, 1129–1132 (2011)
10. Keller-Ressel, M.: Moment explosions and long-term behavior of affine stochastic volatility
models. Math. Financ. 21, 73–98 (2011)
11. Lee, R.W.: The moment formula for implied volatility at extreme strikes. Math. Financ. 14,
469–480 (2004)
12. Olver, F.W.J., Lozier, D.W., Boisvert, R.F., Clark, C.W. (eds.): NIST Handbook of Mathematical
Functions. U.S. Department of Commerce National Institute of Standards and Technology,
Washington, DC (2010)
13. Reghai, A.: Model evolution. Presentation at the Parisian Model Validation seminar. https://
sites.google.com/site/projeteuclide/les-seminaires-vmf/archives-vmf (2011)
The Gärtner-Ellis Theorem,
Homogenization, and Affine
Processes
Abstract We obtain a first order extension of the large deviation estimates in the
Gärtner-Ellis theorem. In addition, for a given family of measures, we find a spe-
cial family of functions having a similar Laplace principle expansion up to order
one to that of the original family of measures. The construction of the special fam-
ily of functions mentioned above is based on heat kernel expansions. Some of the
ideas employed in the paper come from the theory of affine stochastic processes.
For instance, we provide an explicit expansion with respect to the homogenization
parameter of the rescaled cumulant generating function in the case of a generic
continuous affine process. We also compute the coefficients in the homogenization
expansion for the Heston model, one of the most popular stock price models
with stochastic volatility.
1 Introduction
The large deviations theory has found numerous applications in mathematical finance
(see, e.g., [19]). For instance, using the methods of the large deviations theory, one
can estimate various important characteristics of financial models such as tails of
asset price distributions, option pricing functions, and the implied volatility (see,
e.g., [7–11, 13, 15] and the references therein). A popular source of information on
A. Gulisashvili (B)
Department of Mathematics, Ohio University, Athens, OH, USA
e-mail: [email protected]
J. Teichmann
Department of Mathematics, ETH Zürich, Zürich, Switzerland
e-mail: [email protected]
the large deviations theory is the book [4] by Dembo and Zeitouni. A useful result
in the theory is the Gärtner-Ellis theorem (see [6, 12], see also [4]). This theorem
allows one to infer the upper and lower estimates in the large deviation principle from
the properties of the limiting cumulant generating function.
We will next provide a brief overview of the contents of the paper. In Sect. 2, a
new notion of Laplace principle equivalent expansions for families of functions and
measures is introduced. This notion is motivated by the homogenization expansion
of the rescaled cumulant generating function associated with an affine stochastic
process X , that is, the function defined by
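\[
\Lambda(\varepsilon, u) = \varepsilon \log \mathbb{E}\Big[\exp\Big\{-\frac{u}{\varepsilon} X_\varepsilon\Big\}\Big] = \varepsilon \log \int_{\mathbb{R}} \exp\Big\{-\frac{u}{\varepsilon} z\Big\}\, p_\varepsilon(dz).
\]
Actually, the homogenization expansion mentioned above is nothing else but the real
analytic expansion of the function Λ with respect to the parameter ε (see Sect. 4).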
In Sect. 3, we gather definitions and known facts from the theory of general affine
processes, while in Sect. 4, the homogenization procedure is described in all details
for continuous affine processes. The main general results obtained in the paper are
contained in Sect. 2 (see Theorems 2.4 and 2.7). Theorem 2.4 states that for any family
of measures on the real line, satisfying the conditions in the Gärtner-Ellis theorem,
and such that the homogenization expansion exists, we can find a special family of
functions that is Laplace principle equivalent to the original family of measures. The
structure of the function family in Theorem 2.4 resembles the first two terms in the
heat kernel expansions on Riemannian manifolds (notice that we face a degenerate
situation here, so we could not apply heat kernel expansion directly). Theorem 2.7 is a
generalization of the Gärtner-Ellis theorem. It is shown in Theorem 2.7 that under the
same conditions as in Theorem 2.4, the first order large deviation estimates are valid.
Finally, in Sect. 5, we compute the coefficients in the homogenization expansion for
the correlated Heston model, one of the most popular stock price
models with stochastic volatility.
• The functions f and φ in (2.1) are continuous on the interval (a, b), and the integral
in (2.1) converges absolutely for all 0 < ε < ε_0.
• The function φ has a unique absolute minimum that occurs at z = z_0 with a <
z_0 < b.
• The function φ is strictly convex in a neighborhood of z_0.
• The function φ is four times continuously differentiable in a neighborhood of z_0,
and
\[
\phi(z) = \phi(z_0) + \sum_{n=2}^{4} \frac{\partial^n \phi(z_0)}{n!}(z-z_0)^n + O\big((z-z_0)^5\big) \tag{2.2}
\]
as z → z_0.
• The formula in (2.2) can be differentiated. More exactly, the condition
\[
\partial\phi(z) = \sum_{n=2}^{4} \frac{\partial^n \phi(z_0)}{(n-1)!}(z-z_0)^{n-1} + O\big((z-z_0)^4\big), \quad z \to z_0, \tag{2.3}
\]
holds.
• The function f is twice continuously differentiable in a neighborhood of z_0, and
\[
f(z) = \sum_{n=0}^{2} \frac{\partial^n f(z_0)}{n!}(z-z_0)^n + O\big((z-z_0)^3\big) \tag{2.4}
\]
as z → z_0.
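Then, as ε → 0,
\[
\int_a^b f(z)\exp\Big\{-\frac{\phi(z)}{\varepsilon}\Big\}\,dz
= \exp\Big\{-\frac{\phi(z_0)}{\varepsilon}\Big\}\sqrt{\frac{2\pi\varepsilon}{\partial^2\phi(z_0)}}
\Big[ f(z_0) + \varepsilon\Big( \frac{\partial^2 f(z_0)}{2\,\partial^2\phi(z_0)} + \frac{5(\partial^3\phi(z_0))^2 f(z_0)}{24(\partial^2\phi(z_0))^3}
- \frac{\partial^4\phi(z_0)\, f(z_0)}{8(\partial^2\phi(z_0))^2} - \frac{\partial^3\phi(z_0)\,\partial f(z_0)}{2(\partial^2\phi(z_0))^2} \Big) + O(\varepsilon^2) \Big]. \tag{2.5}
\]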
Formula (2.5) can be derived by following the proof of Theorem 8.1 in [18].
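As a numerical illustration of the first-order Laplace expansion, the sketch below compares a direct quadrature of ∫ f(z) exp{−φ(z)/ε} dz with the two-term approximation, using the standard form of the coefficient (the terms in ∂²f, (∂³φ)²f, ∂⁴φ f and ∂³φ ∂f). The test functions φ(z) = e^z − 1 − z and f(z) = 1 + z + z², the interval (−3, 3), and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def laplace_first_order(f0, f1, f2, p0, p2, p3, p4, eps):
    """Two-term Laplace approximation from derivatives of f and phi at z_0.

    f0, f1, f2: f and its first two derivatives at z_0;
    p0, p2, p3, p4: phi and its 2nd, 3rd, 4th derivatives at z_0.
    """
    lead = math.exp(-p0 / eps) * math.sqrt(2.0 * math.pi * eps / p2)
    corr = (f2 / (2.0 * p2) + 5.0 * p3 ** 2 * f0 / (24.0 * p2 ** 3)
            - p4 * f0 / (8.0 * p2 ** 2) - p3 * f1 / (2.0 * p2 ** 2))
    return lead * (f0 + eps * corr)

# Illustrative choice: phi has its unique minimum at z_0 = 0 with
# phi'' = phi''' = phi'''' = 1 there; f(0) = 1, f'(0) = 1, f''(0) = 2.
phi = lambda z: math.exp(z) - 1.0 - z
f = lambda z: 1.0 + z + z * z

eps = 0.01
numeric = simpson(lambda z: f(z) * math.exp(-phi(z) / eps), -3.0, 3.0, 20000)
two_term = laplace_first_order(1.0, 1.0, 2.0, 0.0, 1.0, 1.0, 1.0, eps)
leading_only = math.sqrt(2.0 * math.pi * eps)

rel_err_two_term = abs(numeric / two_term - 1.0)
rel_err_leading = abs(numeric / leading_only - 1.0)
```

For this choice the first-order coefficient equals 7/12, so the leading term alone is off by roughly 0.6% at ε = 0.01, while the two-term approximation is off only at order ε².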
Let us next assume that weaker differentiability restrictions than those listed above
are imposed on the functions f and φ:
• The function φ is twice continuously differentiable in a neighborhood of z_0, and
\[
\phi(z) = \phi(z_0) + \frac{\partial^2\phi(z_0)}{2}(z-z_0)^2 + O\big((z-z_0)^3\big) \tag{2.6}
\]
as z → z_0.
holds.
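• The function f is such that
\[
f(z) = f(z_0) + O(z-z_0) \quad \text{as } z \to z_0. \tag{2.8}
\]
Then, as ε → 0,
\[
\int_a^b f(z)\exp\Big\{-\frac{\phi(z)}{\varepsilon}\Big\}\,dz = \exp\Big\{-\frac{\phi(z_0)}{\varepsilon}\Big\}\sqrt{\frac{2\pi\varepsilon}{\partial^2\phi(z_0)}}\,\big[ f(z_0) + O(\varepsilon) \big]. \tag{2.9}
\]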
Remark 2.1 Using the Taylor formula, we see that (2.2), (2.3), and (2.4) hold pro-
vided that the function f is three times continuously differentiable and the function φ
is five times continuously differentiable near z 0 . Similarly, (2.6), (2.7), and (2.8) hold
if f is continuously differentiable and φ is three times continuously differentiable
near z 0 .
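Let p = {p_ε}_{ε>0} be a family of probability measures on R. The following assumption
is modeled on the behavior of the family of moment generating functions of the
affine process and on the homogenization ideas (see Sect. 4 for more details):
\[
\int_{\mathbb{R}} \exp\Big\{-\frac{u}{\varepsilon} z\Big\}\, p_\varepsilon(dz) = \exp\Big\{\frac{\Lambda^{(0)}(u)}{\varepsilon}\Big\} \exp\big\{\Lambda^{(1)}(u)\big\}\big( 1 + \varepsilon\Lambda^{(2)}(u) + O(\varepsilon^2) \big) \tag{2.10}
\]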
and
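\[
\exp\big\{\Lambda^{(1)}(u)\big\}\,\Lambda^{(2)}(u)
= \lim_{\varepsilon\to 0} \frac{1}{\varepsilon}\Big[ \exp\Big\{-\frac{\Lambda^{(0)}(u)}{\varepsilon}\Big\} \int_{\mathbb{R}} \exp\Big\{-\frac{u}{\varepsilon} z\Big\}\, p_\varepsilon(dz) - \exp\big\{\Lambda^{(1)}(u)\big\} \Big]. \tag{2.13}
\]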
It will be assumed throughout the rest of the paper that the conditions in the
Gärtner-Ellis theorem hold. More precisely, we suppose that the following are true:
• The function Λ^(0) defined in (2.11) exists as an extended real number for all u ∈ R.
We denote by I the maximum open interval such that the number Λ^(0)(u) is finite
for all u ∈ I.
• The point u = 0 belongs to the interval I.
• The function Λ^(0) is continuously differentiable on I, the derivative ∂_u Λ^(0) is a
strictly increasing function on I, and the range of the function ∂_u Λ^(0) is R.
The previous restrictions concern only the function Λ^(0). By the Gärtner-Ellis
theorem, they imply the validity of the large deviation principle for the family p.
More information on the Gärtner-Ellis theorem can be found in [4]. The existence
of the functions Λ^(1) and Λ^(2) (these functions are determined from (2.12) and
(2.13), respectively) signals that certain refinements of large deviation results may
be possible.
Remark 2.2 In the paper [16] of Jacquier and Roome, an assumption similar to that
in (2.10) is imposed on the rescaled cumulant generating function (see (2.1) in [16]).
Moreover, there are more similarities between the assumptions in the present section
and those in Sect. 2 of [16]. Note that the main results obtained in [16] concern the
asymptotic behavior of forward start options and forward smiles.
The function Λ^(0) is strictly convex on I. Let us define an appropriate Legendre-
Fenchel transform of Λ^(0); more precisely, we put
\[
\Lambda^{(0)*}(z) = -\inf_{u\in I}\big( uz + \Lambda^{(0)}(u) \big), \quad z \in \mathbb{R}.
\]
It is clear that there exists a unique minimizer z ↦ u*(z) in the problem described
above, satisfying the condition
It follows that
\[
\Lambda^{(0)*}(z) = -z u^*(z) - \Lambda^{(0)}(u^*(z)). \tag{2.15}
\]
Since Λ^(0)(0) = 0, we have Λ^(0)*(z) ≥ 0. It is well-known that the function
Λ^(0)* is strictly convex on R. The previous statements, (2.14), and (2.15) imply
that Λ^(0)*(z) = 0 if z = −∂_u Λ^(0)(0), and Λ^(0)*(z) > 0 if z ≠ −∂_u Λ^(0)(0).
Next, set
\[
d(z) = \sqrt{2\Lambda^{(0)*}(z)}. \tag{2.16}
\]
It is clear that
\[
\frac{d^2(z)}{2} = \Lambda^{(0)*}(z). \tag{2.17}
\]
Therefore,
\[
d(z) = \sqrt{-2\big[ z u^*(z) + \Lambda^{(0)}(u^*(z)) \big]}. \tag{2.18}
\]
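The construction of u*(z), the Legendre-Fenchel transform and d(z) can be traced on a Gaussian family. The sketch below is illustrative only: it assumes p_ε = N(μ, εσ²), for which Λ^(0)(u) = σ²u²/2 − μu, and it locates the minimizer from the first-order condition ∂_u Λ^(0)(u) = −z of the infimum; the helper names and the bisection bracket are hypothetical.

```python
import math

def u_star(z, d_lambda0, lo=-50.0, hi=50.0, tol=1e-12):
    """Solve d_lambda0(u) = -z by bisection (d_lambda0 is strictly increasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d_lambda0(mid) + z > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def rate_and_distance(z, lambda0, d_lambda0):
    """Return (u*(z), Lambda^(0)*(z), d(z)) following (2.15) and (2.16)."""
    u = u_star(z, d_lambda0)
    rate = -z * u - lambda0(u)
    return u, rate, math.sqrt(max(2.0 * rate, 0.0))

# Gaussian example (an assumption for illustration): p_eps = N(mu, eps * sig^2).
mu, sig = 0.3, 0.8
lambda0 = lambda u: 0.5 * sig ** 2 * u ** 2 - mu * u
d_lambda0 = lambda u: sig ** 2 * u - mu

u, rate, dist = rate_and_distance(2.0, lambda0, d_lambda0)
```

In this example the closed forms are u*(z) = (μ − z)/σ², Λ^(0)*(z) = (z − μ)²/(2σ²) and d(z) = |z − μ|/σ, so d is the distance from the mean measured in units of σ.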
for all u ∈ Jn .
(ii) The sequence of intervals J_n, n ≥ 1, is increasing and ⋃_{n=1}^∞ J_n = I.
The next statement explains how to construct the family f. The ansatz, defining
the structure of the function f in formula (2.20), is based on the classical theory of
heat kernel expansions.
Theorem 2.4 Let p be a family of Borel probability measures on R satisfying (2.10),
and suppose the conditions in the Gärtner-Ellis theorem hold. Suppose also that the
function Λ^(0) is five times continuously differentiable on I, the function Λ^(1) is
three times continuously differentiable on I, and the function Λ^(2) is continuously
differentiable on I. Define a family f of functions as follows:
\[
f_\varepsilon(z) = \frac{1}{\sqrt{2\pi\varepsilon}} \exp\Big\{-\frac{d^2(z)}{2\varepsilon}\Big\}\big( C_0(z) + \varepsilon C_1(z) \big), \quad \varepsilon > 0, \tag{2.20}
\]
and
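\[
\begin{aligned}
C_1(z) ={}& C_0(z)\,\Lambda^{(2)}(u^*(z)) - \frac{\partial^2 C_0(z)\,\partial_u^2\Lambda^{(0)}(u^*(z))}{2} - \frac{5 C_0(z)\big(\partial_u^3\Lambda^{(0)}(u^*(z))\big)^2}{24\big(\partial_u^2\Lambda^{(0)}(u^*(z))\big)^3} \\
&+ \frac{C_0(z)\Big[ 3\big(\partial_u^3\Lambda^{(0)}(u^*(z))\big)^2 - \partial_u^2\Lambda^{(0)}(u^*(z))\,\partial_u^4\Lambda^{(0)}(u^*(z)) \Big]}{8\big(\partial_u^2\Lambda^{(0)}(u^*(z))\big)^3} \\
&+ \frac{\partial C_0(z)\,\partial_u^3\Lambda^{(0)}(u^*(z))}{2\,\partial_u^2\Lambda^{(0)}(u^*(z))}.
\end{aligned}
\]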
Set
\[
\phi_u(z) = uz + \frac{d^2(z)}{2}. \tag{2.22}
\]
Laplace’s principle will be applied to the family of integrals appearing on the right-
hand side of (2.21) twice. The first time, formula (2.5) with f = C0 and φ = φu
will be used, while for the second time, formula (2.9) will be used with f = C1 and
φ = φu .
The critical point z*(u) of the function φ_u given by (2.22) is the solution to the
equation ∂_z Λ^(0)*(z) = u. It is not hard to see that z = z*(u) if and only if
u = u*(z). It follows from (2.14) that
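The next formulas can be derived using (2.17), (2.15), (2.22), and (2.23). We have
\[
\partial_z^2\phi_u(z^*(u)) = \frac{1}{\partial^2\Lambda^{(0)}(u)}, \tag{2.24}
\]
\[
\partial_z^3\phi_u(z^*(u)) = \frac{\partial^3\Lambda^{(0)}(u)}{\big[\partial^2\Lambda^{(0)}(u)\big]^3}, \tag{2.25}
\]
and
\[
\partial_z^4\phi_u(z^*(u)) = \frac{3\big[\partial^3\Lambda^{(0)}(u)\big]^2 - \partial^2\Lambda^{(0)}(u)\,\partial^4\Lambda^{(0)}(u)}{\big[\partial^2\Lambda^{(0)}(u)\big]^5}. \tag{2.26}
\]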
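The three derivative formulas can be checked by hand on a simple example. The choice Λ^(0)(u) = e^{−u} − 1 below is purely illustrative and is not taken from the paper (the range of its derivative is only (−∞, 0), so the computation is meaningful for z > 0); it assumes the first-order condition z = −∂_u Λ^(0)(u*(z)) of the infimum defining the transform.

```latex
\begin{align*}
\Lambda^{(0)}(u) &= e^{-u}-1, \qquad u^*(z) = -\log z, \qquad z^*(u) = e^{-u},\\
\Lambda^{(0)*}(z) &= -z u^*(z) - \Lambda^{(0)}(u^*(z)) = z\log z - z + 1,\\
\phi_u(z) &= uz + z\log z - z + 1, \qquad \partial_z\phi_u(z) = u + \log z,\\
\partial_z^2\phi_u(z^*(u)) &= \frac{1}{z^*(u)} = e^{u} = \frac{1}{\partial^2\Lambda^{(0)}(u)},\\
\partial_z^3\phi_u(z^*(u)) &= -\frac{1}{z^*(u)^2} = -e^{2u} = \frac{-e^{-u}}{\big[e^{-u}\big]^3}
  = \frac{\partial^3\Lambda^{(0)}(u)}{\big[\partial^2\Lambda^{(0)}(u)\big]^3},\\
\partial_z^4\phi_u(z^*(u)) &= \frac{2}{z^*(u)^3} = 2e^{3u} = \frac{3e^{-2u} - e^{-u}e^{-u}}{\big[e^{-u}\big]^5}
  = \frac{3\big[\partial^3\Lambda^{(0)}(u)\big]^2 - \partial^2\Lambda^{(0)}(u)\,\partial^4\Lambda^{(0)}(u)}{\big[\partial^2\Lambda^{(0)}(u)\big]^5}.
\end{align*}
```

Each line agrees with the corresponding general formula, which is a useful sanity check on the signs and powers.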
It is not hard to see that condition (ii) in Definition 2.3 is satisfied. Next, using (2.5)
and (2.21), we obtain
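\[
\begin{aligned}
\int_{-n}^{n} \exp\Big\{-\frac{u}{\varepsilon} z\Big\} f_\varepsilon(z)\,dz
={}& \exp\Big\{-\frac{\phi_u(z^*(u))}{\varepsilon}\Big\} \sqrt{\frac{1}{\partial_z^2\phi_u(z^*(u))}}
\Big[ C_0(z^*(u)) + \varepsilon\Big( C_1(z^*(u)) + \frac{\partial_z^2 C_0(z^*(u))}{2\,\partial_z^2\phi_u(z^*(u))} \\
&+ \frac{5\big(\partial_z^3\phi_u(z^*(u))\big)^2 C_0(z^*(u))}{24\big(\partial_z^2\phi_u(z^*(u))\big)^3}
- \frac{\partial_z^4\phi_u(z^*(u))\, C_0(z^*(u))}{8\big(\partial_z^2\phi_u(z^*(u))\big)^2} \\
&- \frac{\partial_z^3\phi_u(z^*(u))\,\partial_z C_0(z^*(u))}{2\big(\partial_z^2\phi_u(z^*(u))\big)^2} \Big) + O_{n,u}(\varepsilon^2) \Big]
\end{aligned} \tag{2.27}
\]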
This shows that if we choose the function d as in (2.16), then the first factors in
formulas (2.10) and (2.27) coincide. Moreover, the functions C0 and C1 have to be
chosen so that
C0 (z ∗ (u)) = ∂u2 (0) (u) exp((1) (u)) (2.28)
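and
\[
\begin{aligned}
C_1(z^*(u)) ={}& C_0(z^*(u))\,\Lambda^{(2)}(u) - \frac{\partial_z^2 C_0(z^*(u))}{2\,\partial_z^2\phi_u(z^*(u))}
- \frac{5\big(\partial_z^3\phi_u(z^*(u))\big)^2 C_0(z^*(u))}{24\big(\partial_z^2\phi_u(z^*(u))\big)^3} \\
&+ \frac{\partial_z^4\phi_u(z^*(u))\, C_0(z^*(u))}{8\big(\partial_z^2\phi_u(z^*(u))\big)^2}
+ \frac{\partial_z^3\phi_u(z^*(u))\,\partial_z C_0(z^*(u))}{2\big(\partial_z^2\phi_u(z^*(u))\big)^2}.
\end{aligned} \tag{2.29}
\]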
for all y ∈ R. Hence the infimum of the function Λ^(0)* on the real line is attained
at the point y such that u*(y) = 0. This point is given by y = z*(0) = ∂Λ^(0)(0).
Moreover,
\[
\inf_{y\in\mathbb{R}} \Lambda^{(0)*}(y) = -\Lambda^{(0)}(0) = 0.
\]
Remark 2.6 A heuristic conclusion that can be reached using Theorem 2.4 is that the
family f is a small-time approximation to the family p in a certain very weak sense.
Finding such approximations is an important problem. We consider our results as
first modest steps in going beyond the celebrated Gärtner-Ellis theorem.
The next assertion provides a first order large deviation estimate in the Gärtner-
Ellis theorem for families of measures satisfying condition (2.10). Higher order
estimates can also be found, but we do not include them in the present paper. Let A
be a bounded Borel set. Denote by Ā the closure of the set A, and let
Then we have z+, z− ∈ Ā.
Theorem 2.7 Let p be a family of probability Borel measures on R such that (2.10)
holds. Suppose also that the function Λ^(0) is twice continuously differentiable on I
and the conditions in the Gärtner-Ellis theorem hold (see the conditions listed after
formula (2.13)). Suppose also that A ⊂ R is a bounded Borel set, and x ∈ A. Then
the following are true:
The big O estimates in (2.30) and (2.31) are uniform with respect to x ∈ A.
Remark 2.8 The conditions x ≥ ∂Λ^(0)(0) and x < ∂Λ^(0)(0) are equivalent to
u*(x) ≥ 0 and u*(x) < 0, respectively.
Theorem 2.9 Let p be a family of probability Borel measures on R such that (2.10)
holds. Suppose also that the function Λ^(0) is twice continuously differentiable on I
and the conditions in the Gärtner-Ellis theorem hold (see the conditions listed after
formula (2.13)). Suppose also that A ⊂ R is a bounded open set, and x ∈ A. Then
the following are true:
(i) Let x ≥ ∂Λ^(0)(0). Then there exists a constant γ_A > 0 depending on the set
A such that as ε → 0,
\[
p_\varepsilon(A) \ge \exp\Big\{-\frac{\Lambda^{(0)*}(x) + u^*(x)(x-a^-)}{\varepsilon}\Big\} \exp\big\{\Lambda^{(1)}(u^*(x))\big\}
\times \Big( 1 - \exp\Big\{-\frac{\gamma_A}{\varepsilon}\Big\} \Big)\big( 1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2) \big). \tag{2.32}
\]
The constant γ A in (2.33) is the same as in (2.32), and the big O estimates in (2.32)
and (2.33) are uniform with respect to x ∈ A.
Remark 2.10 Note that applying the transformation lim sup_{ε→0} ε log p_ε(A) to the
upper estimates in Theorem 2.7, we obtain the upper estimate in the large deviation
principle for any bounded Borel set A. This gives a little more than the upper estimate
in the Gärtner-Ellis theorem. However, we should not forget that formula (2.30) was
derived under a stronger restriction (2.10), than in the Gärtner-Ellis theorem.
Proof of Theorem 2.7 We borrow some ideas from the proofs of Cramér’s theorem
and the Gärtner-Ellis theorem given in [4]. The proofs of the upper estimates in those
theorems use Chebyshev’s inequality. In our case, due to a special structure of the
problem, we can provide a slightly more direct proof.
Suppose the conditions in Theorem 2.7 hold, and let u ∈ I and ε > 0. Then we
have
\[
\int_A \exp\Big\{-\frac{uz}{\varepsilon}\Big\}\, p_\varepsilon(dz) \ge p_\varepsilon(A) \inf_{z\in A} \exp\Big\{-\frac{uz}{\varepsilon}\Big\}. \tag{2.34}
\]
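It follows from (2.34) that for every u ∈ I there exists ξ(u) ∈ Ā such that
\[
\begin{aligned}
p_\varepsilon(A) &\le \exp\Big\{\frac{u\xi(u)}{\varepsilon}\Big\} \int_A \exp\Big\{-\frac{uz}{\varepsilon}\Big\}\, p_\varepsilon(dz) \\
&= \exp\Big\{-\frac{\Lambda^{(0)}(u)}{\varepsilon}\Big\} \int_A \exp\Big\{-\frac{u}{\varepsilon} z\Big\}\, p_\varepsilon(dz)
\times \exp\Big\{\frac{\Lambda^{(0)}(u) + xu + u(\xi(u)-x)}{\varepsilon}\Big\}.
\end{aligned}
\]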
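Next, by plugging u = u*(x) into the previous equalities and taking into account
condition (2.10), we get
\[
\begin{aligned}
p_\varepsilon(A) &\le \exp\Big\{-\frac{\Lambda^{(0)}(u^*(x))}{\varepsilon}\Big\} \int_A \exp\Big\{-\frac{u^*(x)}{\varepsilon} z\Big\}\, p_\varepsilon(dz)
\times \exp\Big\{-\frac{\Lambda^{(0)*}(x) - u^*(x)(\xi(u^*(x))-x)}{\varepsilon}\Big\} \\
&\le \exp\big\{\Lambda^{(1)}(u^*(x))\big\} \exp\Big\{-\frac{\Lambda^{(0)*}(x) - u^*(x)(\xi(u^*(x))-x)}{\varepsilon}\Big\}
\times \big( 1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2) \big)
\end{aligned} \tag{2.35}
\]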
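The upper bound can be checked on a case where everything is explicit. The sketch below assumes p_ε = N(0, ε), so that Λ^(0)(u) = u²/2 and Λ^(1) = Λ^(2) = 0, takes A = [a, b], and uses u*(x) = −x from the first-order condition ∂_u Λ^(0)(u) = −x. These choices and the function names are illustrative assumptions, not the paper's example.

```python
import math

def gaussian_prob_interval(a, b, eps):
    """P(a <= Z <= b) for Z ~ N(0, eps)."""
    s = math.sqrt(2.0 * eps)
    return 0.5 * (math.erfc(a / s) - math.erfc(b / s))

def chebyshev_type_bound(x, a, b, eps):
    """exp{-[Lambda*(x) - u*(x)(xi - x)]/eps} for the Gaussian family, A = [a, b]."""
    u = -x                       # u*(x) for Lambda^(0)(u) = u^2 / 2
    xi = a if u < 0.0 else b     # point of [a, b] minimising exp(-u z / eps)
    rate = 0.5 * x * x           # Lambda^(0)*(x)
    return math.exp(-(rate - u * (xi - x)) / eps)

checks = []
for eps in (0.5, 0.1, 0.02):
    for x in (1.0, 1.5, 2.0):
        p = gaussian_prob_interval(1.0, 2.0, eps)
        checks.append(p <= chebyshev_type_bound(x, 1.0, 2.0, eps))
```

With x = a = 1 the bound reduces to the familiar estimate exp{−1/(2ε)}; for x further inside A the bound is weaker but still valid, since the inequality holds for every admissible u.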
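Proof of Theorem 2.9 The lower bounds given in Theorem 2.9 are more delicate.
Here we start with the estimate
\[
\int_A \exp\Big\{-\frac{uz}{\varepsilon}\Big\}\, p_\varepsilon(dz) \le p_\varepsilon(A) \sup_{z\in A} \exp\Big\{-\frac{uz}{\varepsilon}\Big\}
\]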
Our next goal is to use the change of measure method. Consider a new family p̃
of probability measures defined by
\[
\tilde p_\varepsilon(dz) = \frac{\exp\big\{-\frac{u^*(x) z}{\varepsilon}\big\}\, p_\varepsilon(dz)}{\int_{\mathbb{R}} \exp\big\{-\frac{u^*(x) z}{\varepsilon}\big\}\, p_\varepsilon(dz)}, \quad \varepsilon > 0.
\]
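For intuition, the change of measure can be worked out in closed form for a Gaussian family. The example below is an illustration only: it assumes p_ε = N(0, ε), for which Λ^(0)(u) = u²/2 and u*(x) = −x.

```latex
\begin{align*}
\tilde p_\varepsilon(dz)
 &\propto \exp\Big\{\frac{xz}{\varepsilon}\Big\}\exp\Big\{-\frac{z^2}{2\varepsilon}\Big\}\,dz
  = \exp\Big\{\frac{x^2}{2\varepsilon}\Big\}\exp\Big\{-\frac{(z-x)^2}{2\varepsilon}\Big\}\,dz,\\
\tilde p_\varepsilon &= N(x,\varepsilon).
\end{align*}
```

The tilted family is thus centered at the point x, so for an open set A containing x the mass of the complement of A under the tilted measure decays exponentially as ε → 0. This is the mechanism the lower bound exploits.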
as ε → 0.
We will next estimate the quantity
\[
\tilde p_\varepsilon(A) = 1 - \tilde p_\varepsilon(A^c) \tag{2.38}
\]
from below. This will be done using the upper estimate in the Gärtner-Ellis theorem.
Let us denote by Λ̃^(0) the function defined by (2.11) for the family p̃ instead of the
family p. Then it is not hard to see that
Next, taking into account that A^c is a closed set, and using the upper large deviations
estimate in the Gärtner-Ellis theorem (see Theorem 2.3.6 in [4]), we obtain
\[
\limsup_{\varepsilon\to 0}\, \varepsilon \log \tilde p_\varepsilon(A^c) \le -\inf_{y\in A^c} \tilde\Lambda^{(0)*}(y).
\]
Set δ_A = inf_{y∈A^c} Λ̃^(0)*(y). Using Remark 2.5 and (2.39), we see that the
unique infimum of the function Λ̃^(0)* on the real line is attained at the point
\[
y = \partial\tilde\Lambda^{(0)}(0) = \partial\Lambda^{(0)}(u^*(x)) = x,
\]
and is equal to zero. Since x ∉ A^c, and the set A^c is closed, we have δ_A > 0.
Therefore, for every τ > 0, there exists ε_τ > 0 such that
\[
\tilde p_\varepsilon(A^c) \le \exp\Big\{\frac{-\delta_A + \tau}{\varepsilon}\Big\}, \quad 0 < \varepsilon < \varepsilon_\tau. \tag{2.40}
\]
Fix any number τ > 0 with 0 < τ < δ_A, and set γ_A = δ_A − τ. Then (2.38) and
(2.40) imply the following estimate:
\[
\tilde p_\varepsilon(A) \ge 1 - \exp\Big\{\frac{-\gamma_A}{\varepsilon}\Big\}, \quad 0 < \varepsilon < \varepsilon_\tau. \tag{2.41}
\]
3 Affine Processes
Let D be a non-empty Borel subset of the real Euclidean space R^d, equipped with
the Borel σ-algebra 𝒟, and assume that the affine hull of D is the full space R^d. To
D we add a point δ that serves as a ‘cemetery state’. Define
\[
\hat D = D \cup \{\delta\}, \qquad \hat{\mathcal{D}} = \sigma(\mathcal{D}, \{\delta\}),
\]
and equip D̂ with the Alexandrov topology, in which any open set with a compact
complement in D is declared an open neighborhood of δ. Any continuous function
f defined on D is extended to D̂ by setting f(δ) = 0.
Let (Ω, F, 𝔽) be a filtered measurable space, on which a family (P_x)_{x∈D̂} of
probability measures is defined, and assume that F is P_x-complete for all x ∈ D̂
and that the filtration 𝔽 is right continuous. Finally, let X be a càdlàg process taking
values in D̂, whose transition kernel
\[
p_t(x, A) = P_x(X_t \in A), \qquad (t \ge 0,\ x \in \hat D,\ A \in \hat{\mathcal{D}})
\]
holds for each t, s ≥ 0 and (x, dξ) ∈ D̂ × 𝒟̂.
We equip R^d with the canonical inner product ⟨·, ·⟩, and associate to D the set
U ⊆ C^d defined by
\[
\mathcal{U} = \Big\{ u \in \mathbb{C}^d : \sup_{x\in D} \operatorname{Re}\langle u, x\rangle < \infty \Big\}.
\]
Note that the set U is the set of complex vectors u such that the exponential function
x ↦ e^{⟨u,x⟩} is bounded on D. It is easy to see that U is a convex cone and always
contains the set of purely imaginary vectors iR^d.
Definition 3.1 (Affine processes) A stochastic process X is called affine with state
space D, if the transition kernel pt (x, dξ) of X satisfies the following conditions:
(i) It is stochastically continuous, i.e., lims→t ps (x, .) = pt (x, .) weakly for all
t ≥ 0, x ∈ D.
(ii) The Fourier-Laplace transform of the kernel depends on the initial state in the
following way: there exist functions Φ : R≥0 × U → C and ψ : R≥0 × U → C^d,
such that
\[
\int_D e^{\langle \xi, u\rangle}\, p_t(x, d\xi) = \Phi(t,u) \exp\big( \langle x, \psi(t,u)\rangle \big) \tag{3.1}
\]
This is essentially the definition that was used in [5]. Condition (3.2) means
that the Fourier-Laplace transform of the transition function is the exponential of an
affine function of x. This fact is usually interpreted as the reason for the name ‘affine
process’, even though affine functions also appear in other aspects of affine processes,
e.g., in the coefficients of the infinitesimal generator, or in the differentiated semi-
martingale characteristics. We prefer to use equality (3.1) instead of equality (3.2),
since the former equality leads to a slightly more general definition that avoids the
necessity of the a-priori assumption that the left hand side of (3.1) is non-zero for
all t and u.
Before we start exploring the first simple consequences of Definition 3.1, additional
notation will be introduced. For any u ∈ U, set σ(u) := inf{t ≥ 0 : Φ(t, u) = 0}
and Q := {(t, u) ∈ R≥0 × U : t < σ(u)}, and let φ be a function on Q such that
The uniqueness of φ will be discussed below. The functions φ and ψ have the
following properties (see [17]):
Remark 3.4 In the sequel, the functions φ and ψ will always be chosen according to
Proposition 3.3.
\[
F(u) = \frac{\partial\phi(t,u)}{\partial t}\Big|_{t=0+}, \qquad R(u) = \frac{\partial\psi(t,u)}{\partial t}\Big|_{t=0+}
\]
The next statement illustrates why regularity is a crucial property. This
statement was originally established in [5] for affine processes on the state space
R^n × R^m_{≥0}.
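Proposition 3.6 Let X be a regular affine process. Then there exist R^d-vectors
b, β^1, …, β^d; d × d-matrices a, α^1, …, α^d; real numbers c, γ^1, …, γ^d, and signed
Borel measures m, μ^1, …, μ^d on R^d \ {0} such that the functions F(u) and R(u)
can be represented as follows:
\[
F(u) = \frac{1}{2}\langle u, a u\rangle + \langle b, u\rangle - c + \int_{\mathbb{R}^d\setminus\{0\}} \big( e^{\langle \xi, u\rangle} - 1 - \langle h(\xi), u\rangle \big)\, m(d\xi), \tag{3.3a}
\]
\[
R_i(u) = \frac{1}{2}\langle u, \alpha^i u\rangle + \langle \beta^i, u\rangle - \gamma^i + \int_{\mathbb{R}^d\setminus\{0\}} \big( e^{\langle \xi, u\rangle} - 1 - \langle h(\xi), u\rangle \big)\, \mu^i(d\xi). \tag{3.3b}
\]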
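\[
\begin{aligned}
A(x) &= a + x_1\alpha^1 + \cdots + x_d\alpha^d, && \text{(3.4a)}\\
B(x) &= b + x_1\beta^1 + \cdots + x_d\beta^d, && \text{(3.4b)}\\
C(x) &= c + x_1\gamma^1 + \cdots + x_d\gamma^d, && \text{(3.4c)}\\
\nu(x, d\xi) &= m(d\xi) + x_1\mu^1(d\xi) + \cdots + x_d\mu^d(d\xi) && \text{(3.4d)}
\end{aligned}
\]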
Moreover, for u ∈ U and t ∈ [0, σ(u)), the functions φ and ψ satisfy the following
ordinary differential equations:
\[
\frac{\partial}{\partial t}\phi(t,u) = F(\psi(t,u)), \qquad \phi(0,u) = 0, \tag{3.5a}
\]
\[
\frac{\partial}{\partial t}\psi(t,u) = R(\psi(t,u)), \qquad \psi(0,u) = u. \tag{3.5b}
\]
Remark 3.7 Equations (3.5) are called generalized Riccati equations, since they are
classical Riccati equations when m(dξ) = μ^i(dξ) = 0. Moreover, Eqs. (3.3) and (3.4)
imply that u ↦ F(u) + ⟨R(u), x⟩ is a function of Lévy-Khintchine form for each
x ∈ D.
Proof See [17].
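The generalized Riccati equations are easy to integrate numerically in the simplest diffusion case. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the one-dimensional square-root (CIR-type) process dV = (a − bV)dt + σ√V dW on R≥0, for which F(u) = au and R(u) = σ²u²/2 − bu, and compares a Runge-Kutta solution with the closed form of this scalar Riccati equation. Parameter values and function names are hypothetical.

```python
import math

def rk4(rhs, y0, t_end, n):
    """Classical fourth-order Runge-Kutta for y' = rhs(y) on [0, t_end]."""
    h = t_end / n
    y = list(y0)
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
             for yi, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4)]
    return y

# Illustrative parameters of dV = (A_CIR - B_CIR * V) dt + SIGMA * sqrt(V) dW.
A_CIR, B_CIR, SIGMA = 0.04, 1.5, 0.3

def F(u):
    return A_CIR * u

def R(u):
    return 0.5 * SIGMA ** 2 * u ** 2 - B_CIR * u

def riccati_numeric(t, u, n=2000):
    """Integrate phi' = F(psi), psi' = R(psi), phi(0) = 0, psi(0) = u."""
    phi, psi = rk4(lambda y: [F(y[1]), R(y[1])], [0.0, u], t, n)
    return phi, psi

def riccati_closed_form(t, u):
    """Closed-form solution of the same system (valid while g > 0)."""
    g = 1.0 - SIGMA ** 2 * u / (2.0 * B_CIR) * (1.0 - math.exp(-B_CIR * t))
    return -2.0 * A_CIR / SIGMA ** 2 * math.log(g), u * math.exp(-B_CIR * t) / g
```

For real u ≤ 0 the solution exists for all t ≥ 0, and E[exp(uV_t)] = exp(φ(t, u) + V_0 ψ(t, u)).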
In general, the parameters (a, α^i, b, β^i, c, γ^i, m, μ^i)_{i∈{1,...,d}} appearing in the
representations of F and R in (3.3a) and (3.3b) have to satisfy additional conditions,
called the admissibility conditions. These conditions guarantee the existence of an
affine Markov process X with state space D and with prescribed F and R. It is clear
that such conditions should depend strongly on the geometry of the (boundary of the)
state space D. Finding such (necessary and sufficient) conditions on the parameters
for different types of state spaces has been the focus of several publications. For
D = R^m_{≥0} × R^n, the admissibility conditions were derived in [5]. For the cone of
semi-definite matrices D = S_d^+, such conditions were found in [2], and for
symmetric irreducible cones, the admissibility conditions were found in [3]. Finally, for
affine diffusions (m = μ^i = 0) on polyhedral cones and on quadratic state spaces,
the admissibility conditions were given in [20].
state space.
Affine processes on canonical state spaces are completely characterized in [5]
in terms of the admissibility conditions imposed on F and R. Affine processes
on canonical state spaces have continuous trajectories (such processes are called
continuous affine processes) if and only if the functions F and R satisfy the admis-
sibility conditions and are polynomials of degree at most 2 (see Proposition 3.6).
4 Homogenization Procedure
In this section, we consider continuous affine processes on the canonical state space
D = R^m_{≥0} × R^n. We will next introduce a natural homogenization procedure, which
allows one to analyze the short-time asymptotics of the law of continuous affine processes.
In the case of affine processes, the homogenization leads in fact to real analytic
expansions with respect to the homogenization parameter.
The following lemmas introduce the homogenization procedure.
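Lemma 4.1 Let ψ : U × R≥0 → U be the unique solution of the equation
\[
\frac{\partial}{\partial t}\psi(u,t) = R\big(\psi(u,t)\big), \qquad \psi(u,0) = u \in \mathcal{U},
\]
\[
\frac{\partial}{\partial t}\psi_\varepsilon(u,t) = R_\varepsilon\big(\psi_\varepsilon(u,t)\big), \qquad \psi_\varepsilon(u,0) = u
\]
with R_ε(u) := ε² R(ε⁻¹u) for u ∈ U.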
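Analogously, let φ : U × R≥0 → C be the unique solution of the equation
\[
\frac{\partial}{\partial t}\phi(u,t) = F\big(\psi(u,t)\big), \qquad \phi(u,0) = 0.
\]
Then, for every ε > 0, the function
\[
\phi_\varepsilon(u,t) := \varepsilon\,\phi\Big(\frac{u}{\varepsilon}, \varepsilon t\Big)
\]
solves the equation
\[
\frac{\partial}{\partial t}\phi_\varepsilon(u,t) = F_\varepsilon\big(\psi_\varepsilon(u,t)\big), \qquad \phi_\varepsilon(u,0) = 0
\]
with F_ε(u) := ε² F(ε⁻¹u) for u ∈ U.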
The proof of Lemma 4.1 is simple, and we leave it as an exercise for the reader.
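For readers who want the exercise written out, the computation is a one-line application of the chain rule. It is sketched here under the assumption that the rescaled functions are ψ_ε(u, t) := εψ(u/ε, εt) and φ_ε(u, t) := εφ(u/ε, εt), which is the scaling consistent with R_ε(u) = ε²R(ε⁻¹u) and F_ε(u) = ε²F(ε⁻¹u).

```latex
\begin{align*}
\partial_t\psi_\varepsilon(u,t)
 &= \varepsilon^2\,(\partial_t\psi)\Big(\frac{u}{\varepsilon},\varepsilon t\Big)
  = \varepsilon^2 R\Big(\psi\Big(\frac{u}{\varepsilon},\varepsilon t\Big)\Big)
  = \varepsilon^2 R\big(\varepsilon^{-1}\psi_\varepsilon(u,t)\big)
  = R_\varepsilon\big(\psi_\varepsilon(u,t)\big),\\
\psi_\varepsilon(u,0) &= \varepsilon\,\psi\Big(\frac{u}{\varepsilon},0\Big) = u,\\
\partial_t\phi_\varepsilon(u,t)
 &= \varepsilon^2\,(\partial_t\phi)\Big(\frac{u}{\varepsilon},\varepsilon t\Big)
  = \varepsilon^2 F\Big(\psi\Big(\frac{u}{\varepsilon},\varepsilon t\Big)\Big)
  = \varepsilon^2 F\big(\varepsilon^{-1}\psi_\varepsilon(u,t)\big)
  = F_\varepsilon\big(\psi_\varepsilon(u,t)\big),\\
\phi_\varepsilon(u,0) &= \varepsilon\,\phi\Big(\frac{u}{\varepsilon},0\Big) = 0.
\end{align*}
```

When R and F are polynomials of degree at most 2, as for continuous affine processes, R_ε and F_ε are polynomials in both u and ε, which is what makes the expansion in ε tractable.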
Lemma 4.2 Under the previous assumptions, the limit lim_{ε→0} ψ_ε = ψ^(0) exists
uniformly on compact sets in U × R≥0. Furthermore,
is a convergent power series expansion for small ε > 0. The coefficient functions in
(4.1) satisfy certain ordinary differential equations, i.e., in particular,
\[
\frac{\partial}{\partial t}\psi^{(0)}(u,t) = R^{(0)}\big(\psi^{(0)}(u,t)\big), \qquad \psi^{(0)}(u,0) = u,
\]
and
\[
\frac{\partial}{\partial t}\psi^{(1)}(u,t) = \frac{\partial}{\partial\varepsilon}\Big|_{\varepsilon=0} R_\varepsilon\big(\psi^{(0)}(u,t)\big)\,\psi^{(1)}(u,t), \qquad \psi^{(1)}(u,0) = 0.
\]
For n ≥ 2, the equations for the coefficient functions involve higher order derivatives.
In complete analogy, the limit lim_{ε→0} φ_ε = φ^(0) exists uniformly on compact sets in
U × R≥0. Furthermore
Proof Observe that
\[
R_\varepsilon = R^{(0)} + \varepsilon R^{(1)} + \frac{\varepsilon^2}{2} R^{(2)} \quad\text{and}\quad F_\varepsilon = F^{(0)} + \varepsilon F^{(1)} + \frac{\varepsilon^2}{2} F^{(2)}.
\]
Hence, the vector fields appearing in the equation in Lemma 4.2 are polynomial in u
and ε. Standard results on differential equations with polynomial vector fields yield
the assertions in Lemma 4.2, in particular, the real analyticity of the solution with
respect to ε.
\[
\hat\Lambda^{(0)}(u) + \varepsilon\hat\Lambda^{(1)}(u) + \ldots := \phi_\varepsilon(-u,1) + \langle x, \psi_\varepsilon(-u,1)\rangle, \tag{4.2}
\]
They are the solutions of the Riccati equations appearing in the previous lemmas.
Note that we suppress the dependence on the initial value x on the left-hand side of
(4.2). The functions Λ̂^(i) exist as extended real numbers for u ∈ R^d.
Remark 4.3 If the expression on the right-hand side of (4.2) is finite, then the power
series on the left-hand side converges absolutely for sufficiently small values of ε.
Remark 4.4 For continuous affine processes, the homogenization procedure leads
to the following representation:
\[
\mathbb{E}\Big[\exp\Big\{-\Big\langle \frac{u}{\varepsilon}, X_\varepsilon\Big\rangle\Big\}\Big] = \int_D \exp\Big\{-\Big\langle \frac{u}{\varepsilon}, z\Big\rangle\Big\}\, p_\varepsilon(dz)
= \exp\Big\{\frac{\hat\Lambda^{(0)}(u)}{\varepsilon} + \hat\Lambda^{(1)}(u) + \ldots\Big\}, \tag{4.3}
\]
where u is such that the expressions on both sides of (4.3) are finite for small enough
values of ε.
The representation in (4.3), valid for any continuous affine process, motivated us
to introduce condition (2.10) used in the previous sections. However, the
expansion in (4.3) is a little different from that in (2.10).
In this section, we find explicit formulas for the functions Λ^(i), 0 ≤ i ≤ 2, associated
with the log-price process in the Heston model. Let us consider the following
correlated Heston model:
\[
\begin{aligned}
dX_t &= (r + kV_t)\,dt + \sqrt{V_t}\,dW_{1,t},\\
dV_t &= (a - bV_t)\,dt + \sigma\sqrt{V_t}\,dW_{2,t},
\end{aligned} \tag{5.1}
\]
\[
g(u) = \frac{b - \rho\sigma u + d(u)}{b - \rho\sigma u - d(u)},
\]
and
\[
d(u) = \sqrt{(\rho\sigma u - b)^2 - \sigma^2(2ku + u^2)}
\]
(see [1]). Here and in the sequel, the symbol √· stands for the principal square root
function. We will explain below the meaning of the logarithmic function appearing
in the expression for the function C (see the discussion after formula (5.7)). Note
that for u = 0, the expressions for the functions C and D should be understood in
the limiting sense. More precisely,
\[
\Lambda(u,t) = tC\Big(-\frac{u}{t}, t\Big) + tD\Big(-\frac{u}{t}, t\Big)v_0 - ux_0. \tag{5.2}
\]
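\[
\begin{aligned}
D(u,t) &= \frac{1}{\sigma^2}\,(A(u)+d(u))\,\frac{1 - e^{d(u)t}}{1 - \frac{A(u)+d(u)}{A(u)-d(u)}\, e^{d(u)t}} \\
&= \frac{1}{\sigma^2}\,\big(A(u)^2 - d(u)^2\big)\,\frac{\sinh\frac{d(u)t}{2}}{d(u)\cosh\frac{d(u)t}{2} + A(u)\sinh\frac{d(u)t}{2}}.
\end{aligned}
\]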
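The equality of the exponential and hyperbolic forms of D(u, t) is purely algebraic and can be confirmed numerically. In the sketch below A(u) is taken to be b − ρσu, which is inferred from the identity A(−u/t) = b + ρσu/t used later in the text; parameter values and function names are illustrative assumptions.

```python
import cmath

def _A_and_d(u, b, rho, sigma, k):
    """A(u) = b - rho*sigma*u (inferred) and the principal square root d(u)."""
    A = b - rho * sigma * u
    d = cmath.sqrt((rho * sigma * u - b) ** 2 - sigma ** 2 * (2.0 * k * u + u * u))
    return A, d

def heston_D_exponential(u, t, b, rho, sigma, k):
    """First (exponential) form of D(u, t); requires A(u) != d(u)."""
    A, d = _A_and_d(u, b, rho, sigma, k)
    e = cmath.exp(d * t)
    return (A + d) * (1.0 - e) / (sigma ** 2 * (1.0 - (A + d) / (A - d) * e))

def heston_D_hyperbolic(u, t, b, rho, sigma, k):
    """Second (hyperbolic) form of D(u, t)."""
    A, d = _A_and_d(u, b, rho, sigma, k)
    h = 0.5 * d * t
    return ((A * A - d * d) * cmath.sinh(h)
            / (sigma ** 2 * (d * cmath.cosh(h) + A * cmath.sinh(h))))
```

The hyperbolic form is even in d, so it does not depend on the choice of branch of the square root, which is convenient when d(u) becomes purely imaginary.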
2 2
Moreover,
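\[
\begin{aligned}
C(u,t) &= rut + \frac{a}{\sigma^2}\Big[ (A(u)+d(u))\,t - 2\log\frac{1 - \frac{A(u)+d(u)}{A(u)-d(u)}\, e^{d(u)t}}{1 - \frac{A(u)+d(u)}{A(u)-d(u)}} \Big] \\
&= rut + \frac{a}{\sigma^2}\Big[ A(u)\,t - 2\log\frac{d(u)\cosh\frac{d(u)t}{2} + A(u)\sinh\frac{d(u)t}{2}}{d(u)} \Big].
\end{aligned}
\]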
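We also have
\[
A\Big(-\frac{u}{t}\Big) = b + \rho\sigma\frac{u}{t},
\]
\[
A^2\Big(-\frac{u}{t}\Big) = b^2 + 2b\rho\sigma\frac{u}{t} + \rho^2\sigma^2\frac{u^2}{t^2},
\]
\[
d^2\Big(-\frac{u}{t}\Big) = -\frac{u^2(1-\rho^2)\sigma^2}{t^2} + \frac{2\sigma u(k\sigma + b\rho)}{t} + b^2,
\]
\[
\frac{1}{\sigma^2}\Big[ A^2\Big(-\frac{u}{t}\Big) - d^2\Big(-\frac{u}{t}\Big) \Big] = \frac{u^2}{t^2} - \frac{2ku}{t},
\]
and
\[
D\Big(-\frac{u}{t}, t\Big) = \Big( \frac{u^2}{t^2} - \frac{2ku}{t} \Big)\frac{\sinh\frac{d(-u/t)\,t}{2}}{d\big(-\frac{u}{t}\big)\cosh\frac{d(-u/t)\,t}{2} + A\big(-\frac{u}{t}\big)\sinh\frac{d(-u/t)\,t}{2}}. \tag{5.4}
\]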
Let us denote by Z the set of real numbers u such that the expressions on the
right-hand side of (5.3) and (5.4) are finite for all small enough values of t, and put
\[
\hat S(u,t) = d\Big(-\frac{u}{t}\Big)\frac{t}{2}.
\]
It is easy to see that
\[
\hat S(u,t) = \frac{1}{2}\sqrt{-u^2(1-\rho^2)\sigma^2 + 2tu(k\sigma^2 + b\rho\sigma) + t^2b^2}.
\]
In the previous formula, t is a real number. Therefore, for every real number u ≠ 0,
Ŝ(u, t) is purely imaginary for all numbers t with |t| small enough. For such u and
t, Ŝ(u, t) = i S(u, t), where
\[
S(u,t) = \frac{1}{2}\sqrt{u^2(1-\rho^2)\sigma^2 - 2tu(k\sigma^2 + b\rho\sigma) - t^2b^2} \tag{5.5}
\]
and
\[
tD\Big(-\frac{u}{t}, t\Big) = \big( u^2 - 2tku \big)\frac{\sin S(u,t)}{2S(u,t)\cos S(u,t) + (bt + \rho\sigma u)\sin S(u,t)}. \tag{5.7}
\]
Our next goal is to introduce an additional condition under which the logarithmic
function appearing in formula (5.6) exists, and the expressions on the right-hand
sides of (5.6) and (5.7) are finite. Recall that we have assumed that u ≠ 0 and |t| is
small enough. Set
\[
\tilde S(u) = \lim_{t\to 0}\big[ 2S(u,t)\cos S(u,t) + (bt + \rho\sigma u)\sin S(u,t) \big].
\]
Then, we have
\[
\lim_{t\to 0} S(u,t) = \frac{1}{2}|u|\sigma\sqrt{1-\rho^2}
\]
and
\[
\tilde S(u) = |u|\sigma\sqrt{1-\rho^2}\,\cos\frac{|u|\sigma\sqrt{1-\rho^2}}{2} + u\sigma\rho\,\sin\frac{|u|\sigma\sqrt{1-\rho^2}}{2}.
\]
Let ρ ≠ 0, and assume that
\[
-\frac{2}{\sigma\sqrt{1-\rho^2}}\arctan\frac{\sqrt{1-\rho^2}}{\rho} < u < \frac{2}{\sigma\sqrt{1-\rho^2}}\Big( \pi - \arctan\frac{\sqrt{1-\rho^2}}{\rho} \Big). \tag{5.8}
\]
The restriction in (5.8) means that the variable u is bounded from below by the largest
negative root of the function
\[
\bar S(u) = \sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2},
\]
and from above by the smallest positive root of the same function. Note that S̄(0) > 0.
Therefore, we have S̄(u) > 0 for all u satisfying the condition in (5.8).
It is easy to see that S̃(u) = σ|u| S̄(u) for all u ≠ 0 satisfying the condition in
(5.8). Hence, S̃(u) > 0 under the same restrictions on u. It follows from (5.7) that
for all u ≠ 0 such that (5.8) holds, the right-hand side of (5.7) is eventually finite as
t → 0, and moreover
\[
\lim_{t\to 0} tD\Big(-\frac{u}{t}, t\Big) = \frac{u\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}}{\sigma\Big[ \sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2} \Big]}. \tag{5.9}
\]
In addition, the expression under the logarithm sign in (5.6) is eventually positive,
and
\[
\lim_{t\to 0} tC\Big(-\frac{u}{t}, t\Big) = 0. \tag{5.10}
\]
\[
\partial_u\Lambda^{(0)}(u) = \frac{v_0}{\sigma}\,\frac{\sin(2\theta u) + \sigma u}{1 + \cos(2\theta u)} - x_0.
\]
In the following two statements, we provide formulas for the critical point u*(x)
and the second derivative of the function Λ^(0)(u). These results can be used in the
asymptotic formulas established in the previous sections in the case of the Heston
model.
Lemma 5.2 Suppose ρ ≠ 0 and condition (5.8) holds. Then, for every x ∈ R, the
critical point u*(x) is the unique solution to the equation
\[
\frac{\rho\,[1 - \cos(2\theta u)] + \sqrt{1-\rho^2}\,\sin(2\theta u) + \sigma(1-\rho^2)u}{\big( \sqrt{1-\rho^2}\,\cos(\theta u) + \rho\,\sin(\theta u) \big)^2} = \frac{2\sigma}{v_0}(x_0 - x).
\]
If ρ = 0 and condition (5.11) holds, then for every x ∈ R, u*(x) is the unique
solution to the equation
\[
\frac{\sin(2\theta u) + \sigma u}{1 + \cos(2\theta u)} = \frac{\sigma}{v_0}(x_0 - x).
\]
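In the uncorrelated case the equation for u*(x) is a scalar monotone equation and can be solved by bisection. The sketch below is an illustration under assumptions: it takes θ = σ/2 for ρ = 0 (the value consistent with the limit of tD), searches on the interval where 1 + cos(2θu) > 0, and uses hypothetical parameter values and names.

```python
import math

def u_star_rho0(x, x0, v0, sigma, tol=1e-12):
    """Solve (sin(2*theta*u) + sigma*u) / (1 + cos(2*theta*u)) = sigma*(x0 - x)/v0.

    Assumes rho = 0 and theta = sigma / 2; the left-hand side is increasing
    on (-pi/sigma, pi/sigma) and runs from -infinity to +infinity there.
    """
    theta = 0.5 * sigma
    target = sigma * (x0 - x) / v0

    def g(u):
        return ((math.sin(2.0 * theta * u) + sigma * u)
                / (1.0 + math.cos(2.0 * theta * u)) - target)

    lo = (-math.pi + 1e-6) / sigma
    hi = (math.pi - 1e-6) / sigma
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def residual(u, x, x0, v0, sigma):
    """Left-hand side minus right-hand side of the equation at u."""
    w = sigma * u
    return (math.sin(w) + w) / (1.0 + math.cos(w)) - sigma * (x0 - x) / v0

u_sol = u_star_rho0(0.05, 0.0, 0.04, 0.4)
```

With x above x0 the right-hand side is negative and the solution is negative; at x = x0 the solution is u = 0.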
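\[
\partial^2\Lambda^{(0)}(u) = \frac{v_0\, S(u)}{2\sigma\big[ \sqrt{1-\rho^2}\,\cos(\theta u) + \rho\,\sin(\theta u) \big]^3}
\]
where
\[
\begin{aligned}
S(u) ={}& \big( 2\theta + \sigma\sqrt{1-\rho^2} \big)\big[ \rho\sqrt{1-\rho^2}\,\sin(\theta u) + (1-\rho^2)\cos(\theta u) \big] \\
&+ 2\sigma\theta(1-\rho^2)u\big[ \sqrt{1-\rho^2}\,\sin(\theta u) - \rho\cos(\theta u) \big].
\end{aligned}
\]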
Lemmas 5.2 and 5.3 are straightforward, and their proofs are omitted.
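We will next compute the functions $\Lambda^{(1)}$ and $\Lambda^{(2)}$. Recall that
$$
\int\exp\left(-\frac{u}{t}z\right)p_t(dz)=\exp\left(\frac{\Lambda^{(0)}(u)}{t}\right)\exp\left(\Lambda^{(1)}(u)\right)\left(1+t\Lambda^{(2)}(u)+\dots\right).
$$
Therefore,
$$
\Lambda^{(1)}(u)=\lim_{t\to 0}\frac{\partial}{\partial t}\Lambda(u,t)
$$
and
$$
\Lambda^{(2)}(u)=\frac{1}{2}\lim_{t\to 0}\frac{\partial^2}{\partial t^2}\Lambda(u,t). \tag{5.13}
$$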
Let us fix u ≠ 0 as in Theorem 5.1. Then the function t → S(u, t) defined
by (5.5) is real analytic in t in a small neighborhood of t = 0, depending on u. Using
the Taylor formula, we obtain
$$
S(u,t)=c_0(u)+c_1(u)t+\frac{1}{2}c_2(u)t^2+O(t^3) \tag{5.14}
$$
as t → 0, where the O-estimate depends on u, and the coefficients are given by
$$
c_0(u)=\frac{|u|\sigma}{2}\sqrt{1-\rho^2}, \tag{5.15}
$$
$$
c_1(u)=-\frac{|u|}{u}\,\frac{k\sigma+b\rho}{2\sqrt{1-\rho^2}}, \tag{5.16}
$$
and
$$
c_2(u)=-\frac{|u|}{u^2}\,\frac{b^2(1-\rho^2)+(k\sigma+b\rho)^2}{2\sigma(1-\rho^2)^{3/2}}.
$$
Our next goal is to expand the functions t → sin S(u, t) and t → cos S(u, t).
Using the Taylor formula and (5.14), we get
$$
\sin S(u,t)=U_0(u)+U_1(u)t+\frac{1}{2}U_2(u)t^2+O(t^3) \tag{5.17}
$$
as t → 0, where
$$
U_0(u)=\sin c_0(u), \tag{5.18}
$$
$$
U_1(u)=c_1(u)\cos c_0(u), \tag{5.19}
$$
and
$$
U_2(u)=c_2(u)\cos c_0(u)-c_1(u)^2\sin c_0(u).
$$
Similarly,
$$
\cos S(u,t)=W_0(u)+W_1(u)t+\frac{1}{2}W_2(u)t^2+O(t^3) \tag{5.20}
$$
as t → 0, where
$$
W_0(u)=\cos c_0(u),\qquad W_1(u)=-c_1(u)\sin c_0(u),
$$
and
$$
W_2(u)=-\left[c_2(u)\sin c_0(u)+c_1(u)^2\cos c_0(u)\right].
$$
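The second order coefficients above follow from the chain rule applied to sin and cos of the quadratic c0 + c1 t + c2 t²/2; the first order coefficients used here, c1 cos c0 and −c1 sin c0, come from the same computation. A small numerical sanity check, with arbitrary (hypothetical) values for c0, c1, c2, could look as follows.

```python
import math

def taylor_sin_cos(c0, c1, c2):
    # coefficients of sin S and cos S for S(t) = c0 + c1 t + c2 t^2 / 2
    s, c = math.sin(c0), math.cos(c0)
    U = (s, c1 * c, c2 * c - c1 ** 2 * s)
    W = (c, -c1 * s, -(c2 * s + c1 ** 2 * c))
    return U, W

def quad(coef, t):
    # evaluates coef[0] + coef[1] t + coef[2] t^2 / 2
    return coef[0] + coef[1] * t + 0.5 * coef[2] * t * t

c = (0.7, -0.3, 0.5)   # hypothetical values of c0, c1, c2
U, W = taylor_sin_cos(*c)
t = 1e-2
err_sin = abs(math.sin(quad(c, t)) - quad(U, t))
err_cos = abs(math.cos(quad(c, t)) - quad(W, t))
print(err_sin, err_cos)  # both errors are of order t^3
```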
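We will next expand the functions $t\mapsto tD(-\frac{u}{t},t)$ and $t\mapsto tC(-\frac{u}{t},t)$. It follows
from (5.14), (5.17), and (5.20) that
$$
2S(u,t)\cos S(u,t)+(bt+\rho\sigma u)\sin S(u,t)=V_0(u)+V_1(u)t+\frac{1}{2}V_2(u)t^2+O(t^3) \tag{5.21}
$$
as t → 0, where
$$
V_0(u)=2c_0(u)W_0(u)+\rho\sigma uU_0(u),
$$
$$
V_1(u)=2c_0(u)W_1(u)+2c_1(u)W_0(u)+bU_0(u)+\rho\sigma uU_1(u),
$$
and
$$
V_2(u)=2c_0(u)W_2(u)+4c_1(u)W_1(u)+2c_2(u)W_0(u)+2bU_1(u)+\rho\sigma uU_2(u).
$$
In terms of the coefficients $c_i$, these read
$$
V_0(u)=2c_0(u)\cos c_0(u)+\rho\sigma u\sin c_0(u),
$$
$$
V_1(u)=(2+\rho\sigma u)c_1(u)\cos c_0(u)+[b-2c_0(u)c_1(u)]\sin c_0(u),
$$
and
$$
V_2(u)=\left[2c_2(u)+2bc_1(u)+\rho\sigma uc_2(u)-2c_0(u)c_1(u)^2\right]\cos c_0(u)
-\left[2c_0(u)c_2(u)+4c_1(u)^2+\rho\sigma uc_1(u)^2\right]\sin c_0(u).
$$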
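The two expressions for V2 (via the U and W coefficients, and directly via c0, c1, c2) can be compared numerically. The sketch below uses arbitrary, hypothetical values for the coefficients and for b, ρ, σ, u; it is only a consistency check of the algebra, not part of the original derivation.

```python
import math

def v_coeffs(c0, c1, c2, b, r):
    # r stands for the product rho * sigma * u
    s, c = math.sin(c0), math.cos(c0)
    U = (s, c1 * c, c2 * c - c1 ** 2 * s)
    W = (c, -c1 * s, -(c2 * s + c1 ** 2 * c))
    V0 = 2 * c0 * W[0] + r * U[0]
    V1 = 2 * c0 * W[1] + 2 * c1 * W[0] + b * U[0] + r * U[1]
    V2 = 2 * c0 * W[2] + 4 * c1 * W[1] + 2 * c2 * W[0] + 2 * b * U[1] + r * U[2]
    return V0, V1, V2

def v2_closed(c0, c1, c2, b, r):
    # V2 written directly in terms of c0, c1, c2
    return ((2 * c2 + 2 * b * c1 + r * c2 - 2 * c0 * c1 ** 2) * math.cos(c0)
            - (2 * c0 * c2 + 4 * c1 ** 2 + r * c1 ** 2) * math.sin(c0))

c0, c1, c2, b, r = 0.7, -0.3, 0.5, 0.8, -0.3   # hypothetical values
V0, V1, V2 = v_coeffs(c0, c1, c2, b, r)

def lhs(t):
    # left-hand side of (5.21) with S replaced by its quadratic Taylor polynomial
    S = c0 + c1 * t + 0.5 * c2 * t * t
    return 2 * S * math.cos(S) + (b * t + r) * math.sin(S)

t = 1e-2
print(abs(V2 - v2_closed(c0, c1, c2, b, r)), abs(lhs(t) - (V0 + V1 * t + 0.5 * V2 * t * t)))
```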
Therefore,
and
$$
\frac{1}{2}T_0(u)V_2(u)+T_1(u)V_1(u)+\frac{1}{2}T_2(u)V_0(u)=\frac{1}{2}u^2U_2(u)-2kuU_1(u).
$$
It follows from the previous equalities that
$$
T_0(u)=\frac{u^2U_0(u)}{V_0(u)},
$$
and
$$
T_2(u)=\frac{Q(u)}{V_0(u)^3},
$$
where
as t → 0.
Now, we turn our attention to the function $t\mapsto tC(-\frac{u}{t},t)$. Using (5.6), we see
that
$$
tC\left(-\frac{u}{t},t\right)=-tru+t\,\frac{a}{\sigma^2}\left[bt+\rho\sigma u-2\log\frac{V_0(u)+V_1(u)t+O(t^2)}{2c_0(u)+2c_1(u)t+O(t^2)}\right]. \tag{5.28}
$$
Set
$$
\frac{V_0(u)+V_1(u)t+O(t^2)}{2c_0(u)+2c_1(u)t+O(t^2)}=L_0(u)+L_1(u)t+O(t^2)
$$
with
$$
L_0(u)=\frac{V_0(u)}{2c_0(u)}. \tag{5.29}
$$
We also have
$$
\log\left[L_0(u)+L_1(u)t+O(t^2)\right]=\log L_0(u)+\frac{L_1(u)}{L_0(u)}t+O(t^2) \tag{5.31}
$$
as t → 0.
Next, we will find explicit expressions for the functions $\Lambda^{(1)}$ and $\Lambda^{(2)}$. Suppose
ρ ≠ 0, u ≠ 0, and condition (5.8) holds. Then
$$
\Lambda^{(1)}(u)=\left(\frac{a\rho}{\sigma}-r\right)u-\frac{2a}{\sigma^2}\log\frac{V_0(u)}{2c_0(u)}
+v_0\,\frac{u^2U_1(u)V_0(u)-2kuU_0(u)V_0(u)-u^2U_0(u)V_1(u)}{V_0(u)^2}. \tag{5.33}
$$
where
$$
E_1(u)=-\frac{u\sigma(k\sigma+b\rho)}{2},
$$
$$
E_2(u)=-2k\sigma\sqrt{1-\rho^2}+\frac{k\sigma+b\rho}{\sqrt{1-\rho^2}},
$$
and
$$
E_3(u)=-\left(2k\rho\sigma+b+u\sigma\,\frac{k\sigma+b\rho}{2}\right).
$$
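where
$$
\widetilde{E}_1(u)=-\frac{u^3}{2}\sigma(k\sigma+b\rho),
$$
$$
\widetilde{E}_2(u)=u\left[-\rho\sigma u|u|\frac{k\sigma+b\rho}{2\sqrt{1-\rho^2}}-2k|u|\sigma\sqrt{1-\rho^2}+(2+\rho\sigma u)|u|\frac{k\sigma+b\rho}{2\sqrt{1-\rho^2}}\right],
$$
and
$$
\widetilde{E}_3(u)=-u^2\left(2k\rho\sigma+b+u\sigma\,\frac{k\sigma+b\rho}{2}\right).
$$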
Next, replacing |u| by u in the previous formulas (it is not hard to see that this can
be done) and making several cancellations, we obtain formula (5.34).
This completes the proof of Theorem 5.4.
Our final goal in the present section is to find an explicit formula for the function
$\Lambda^{(2)}$ in terms of the Heston model parameters. It follows from (5.2), (5.13), (5.27),
and (5.32) that
where Q(u) is given by (5.26). Now, it is clear how to obtain an explicit expression
for the function $\Lambda^{(2)}$, expressed in terms of the Heston model parameters. It suffices
to transform the formula in (5.35), using the explicit expressions for the functions
$c_i$, $U_i$, $V_i$ with i = 0, 1, 2, and the function Q. Let us also note that the value of the
function on the right-hand side of formula (5.35) does not change if we replace |u| by
u. Taking into account what was said above, and making long but straightforward
computations, we see that the following statement holds.
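Theorem 5.5 Suppose ρ ≠ 0, u ≠ 0, and condition (5.8) holds. Then
$$
\begin{aligned}
\Lambda^{(2)}(u)={}&\frac{ab}{\sigma^2}-\frac{a}{\sigma^3(1-\rho^2)u}\,
\frac{I_0(u)}{\sqrt{1-\rho^2}\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}+\rho\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}}\\
&+\frac{v_0\,I_1(u)}{2\sigma^2\left[\sqrt{1-\rho^2}\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}+\rho\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right]^2}\\
&+\frac{v_0\,I_2(u)I_3(u)}{2u\sigma^3\left[\sqrt{1-\rho^2}\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}+\rho\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right]^3},
\end{aligned} \tag{5.36}
$$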
where
$$
I_0(u)=-u\rho\sigma\sqrt{1-\rho^2}\,(k\sigma+b\rho)\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}
+\left[2b+2\rho k\sigma+u\sigma(k\sigma+b\rho)(1-\rho^2)\right]\sin\frac{u\sigma\sqrt{1-\rho^2}}{2},
$$
#
u b2 (1 − ρ2 ) + (kσ + bρ)2 (kσ + bρ)2 uσ 1 − ρ2
I1 (u) = − + sin 2
2(1 − ρ2 ) 1 − ρ2 2
# #
b2 (1 − ρ2 ) + (kσ + bρ)2 b(kσ + bρ) uσ 1 − ρ2 uσ 1 − ρ2
+ 3
+ # sin cos ,
σ(1 − ρ2 ) 2 1 − ρ2 2 2
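$$
I_2(u)=2\left[2k\sigma\sqrt{1-\rho^2}-\frac{(2+\rho\sigma u)(k\sigma+b\rho)}{2\sqrt{1-\rho^2}}\right]\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}
+2\left[2k\rho\sigma+b+\frac{u\sigma(k\sigma+b\rho)}{2}\right]\sin\frac{u\sigma\sqrt{1-\rho^2}}{2},
$$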
and
#
#
2 uσ 1 − ρ
2
I3 (u) = − uσ 1 − ρ + b sin
2
# 2 #
kσ + bρ uσ 1 − ρ2 uσ 1 − ρ2
−# sin cos .
1 − ρ2 2 2
Proof The second term on the right-hand side of (5.36) can be obtained from the
corresponding term in (5.35) by taking into account (5.15), (5.16), (5.22), and (5.23).
Next, using (5.26), we see that
Moreover
$$
U_2(u)V_0(u)-U_0(u)V_2(u)=2c_0(u)c_2(u)+4c_1(u)^2\sin^2c_0(u)-2\left[c_2(u)+bc_1(u)\right]\sin c_0(u)\cos c_0(u),
$$
$$
4kuV_0(u)+2u^2V_1(u)=2u\left[4kc_0(u)+u(2+\rho\sigma u)c_1(u)\right]\cos c_0(u)+2u^2\left[2k\rho\sigma+b-2c_0(u)c_1(u)\right]\sin c_0(u),
$$
and
$$
U_0(u)V_1(u)-U_1(u)V_0(u)=b\sin^2c_0(u)+2c_1(u)\sin c_0(u)\cos c_0(u)-2c_0(u)c_1(u).
$$
Set
$$
I_1(u)=U_2(u)V_0(u)-U_0(u)V_2(u),
$$
$$
I_2(u)=u^{-2}\left[4kuV_0(u)+2u^2V_1(u)\right],
$$
and
$$
I_3(u)=U_0(u)V_1(u)-U_1(u)V_0(u).
$$
Next, taking into account (5.35), (5.37), and using the explicit expressions for the
functions $c_i$, $U_i$, and $V_i$ with i = 0, 1, 2, which were found above, we obtain (5.36).
This completes the proof of Theorem 5.5.
Remark 5.6 The present remark concerns the continuity of the functions $\Lambda^{(i)}$ with
i = 0, 1, 2 on their domain. Recall that $\Lambda^{(i)}(0)=0$. It follows from Theorems 5.1,
5.4, and 5.5 that the functions $\Lambda^{(i)}$ are continuous on their domain with a possible
exception of the point u = 0. However, it is not hard to see, using the explicit
expressions for the functions $\Lambda^{(i)}$, provided in the theorems mentioned above, that
References
1. Aït-Sahalia, Y., Yu, J.: Saddlepoint approximations for continuous-time Markov processes. J.
Econom. 134, 507–551 (2006)
2. Cuchiero, C., Filipovic, D., Mayerhofer, E., Teichmann, J.: Affine processes on positive semi-
definite matrices. Ann. Appl. Probab. 21, 397–463 (2011)
3. Cuchiero, C., Keller-Ressel, M., Mayerhofer, E., Teichmann, J.: Affine processes on symmetric
cones. J. Theor. Probab. doi:10.1007/s10959-014-0580-x (2014)
4. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Jones and Bartlett
Publishers Inc., Boston (1993)
5. Duffie, D., Filipovic, D., Schachermayer, W.: Affine processes and applications in finance.
Ann. Appl. Probab. 13, 984–1053 (2003)
6. Ellis, R.S.: Large deviations for a general class of random vectors. Ann. Probab. 12, 1–12
(1984)
7. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
IJTAF 12, 861–876 (2009)
8. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15,
755–780 (2011)
9. Forde, M., Jacquier, A., Mijatović, A.: Asymptotic formulae for implied volatility in the Heston
model. Proc. R. Soc. A 466, 3593–3620 (2010)
10. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3, 690–708 (2012)
11. Forde, M., Kumar, R.: Large-time option pricing for a general stochastic volatility model with
a stochastic interest rate, using the Donsker-Varadhan LDP, Preprint (2013)
12. Gärtner, J.: On large deviations from the invariant measure. Theory Probab. Appl. 22, 24–39
(1977)
13. Gulisashvili, A., Laurence, P.: The Heston Riemannian distance function. Journal de Mathé-
matiques Pures et Appliquées 101, 303–329 (2014)
14. Heston, S.L.: A closed-form solution for options with stochastic volatility, with applications
to bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
15. Jacquier, A., Mijatović, A.: Large deviations for the extended Heston model: the large-time
case. Asia-Pacific Finan. Markets 21(3), 263–280 (2014)
320 A. Gulisashvili and J. Teichmann
16. Jacquier, A., Roome, P.: Asymptotics of forward implied volatility. SIAM J. Finan. Math. 6(1),
307–351 (2015)
17. Keller-Ressel, M., Teichmann, J., Schachermayer, W.: Regularity of affine processes on general
state spaces. Electron. J. Probab. 18, 1–17 (2013)
18. Olver, F.W.J.: Asymptotics and Special Functions. A K Peters Ltd., Wellesley (1997)
19. Pham, H.: Some applications and methods of large deviations in finance and insurance. Paris-
Princeton Lectures on Mathematical Finance 2004. Lecture Notes in Mathematics, vol. 1919,
pp. 191–244 (2007)
20. Spreij, P., Veerman, E.: Affine diffusions with non-canonical state space. Stoch. Anal. Appl.
30, 605–641 (2012)
Asymptotics for d-Dimensional Lévy-Type
Processes
1 Introduction
M. Lorig
Department of Applied Mathematics, University of Washington, Seattle, WA, USA
S. Pagliarani
Ecole Polytechnique, CMAP, Route de Saclay, 91128 Palaiseau Cedex, France
A. Pascucci (B)
Dipartimento di Matematica, Università di Bologna, Bologna, Italy
the form u(t, x) := E[ϕ(X T )|X t = x]. Under mild conditions, the function u(t, x)
is the unique classical solution of a partial integro-differential equation (PIDE).
Unfortunately, closed form and even semi-closed form solutions of these PIDEs are
available only in rare cases. As such, it is important to develop general methods for
finding analytical approximations for the solutions of these PIDEs.
Within the mathematical finance literature, a number of different approaches have
been taken for finding approximate transition densities and option prices for markets
described by Markov processes. Most of these techniques involve expansions that
exploit a small parameter or a limiting case. For example, Benhamou et al. [1] develop
analytical approximations for models with local volatility and Gaussian jumps in the
small diffusion and small jump frequency/size limits (see also the recent review
paper by Bompis and Gobet [2]). Deuschel et al. [5] obtain densities for diffusion
processes in a small noise limit. Fouque et al. [6] find option prices for Black-Scholes-
like multiscale models where volatility is driven by two factors, one running on a
fast scale, one running on a slow scale. Lorig [11], Lorig and Lozano-Carbassé [12]
extend these multiscale techniques to more general diffusions and to the exponential
Lévy setting.
Recently, Pagliarani and Pascucci [19] introduced a method for finding asymptotic
solutions of parabolic PDEs. The approach, called the adjoint expansion method, was
extended by Lorig et al. [15] and Pagliarani et al. [21] to models with jumps, and was
further generalized by Lorig et al. [13] to a family of asymptotic expansions for a
d-dimensional market described by an Itô SDE (i.e., a Markov market with no jumps).
The method consists of expanding the pricing PDE in polynomial basis functions,
which results in a nested sequence of Cauchy problems, and deriving analytical
solutions for these nested Cauchy problems. In this paper, we extend the results of
Lorig et al. [13], Lorig et al. [15], Pagliarani et al. [21] to the PIDEs that arise when
markets are described by a d-dimensional Lévy-Itô SDE. Results presented here also
simplify results from Lorig et al. [13], Lorig et al. [15], Pagliarani et al. [21].
The rest of this paper proceeds as follows. In Sect. 2 we present a general d-
dimensional market model. We also describe the kinds of derivative-assets we wish
to price, and we relate the price of such derivative-assets to the solution of a parabolic
PIDE. In Sect. 3 we introduce the idea of polynomial expansions of the pricing PIDE
and in Sect. 4, we derive a family of analytical price approximations—one for each
polynomial expansion of the pricing PIDE. Lastly, in Sect. 5 we provide a numerical
example, illustrating the versatility and accuracy of our methods.
2 Market Model
which is rather standard for Lévy-type models. The components of X could represent
a number of things, such as economic factors, asset prices, indices, or functions
of these quantities. We assume a risk-free interest rate of the form $r(t,X_t)$ where
$r:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}$. We also introduce a random time ζ, which is given by
$$
\zeta=\inf\left\{t\ge 0:\int_0^t\gamma(s,X_s)\,ds\ge\mathcal{E}\right\},\qquad \gamma:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}_+,
$$
Thus, to value a European-style option, one must compute functions of the form
$$
u(t,x):=E\left[e^{-\int_t^T\lambda(s,X_s)\,ds}\,\varphi(X_T)\,\middle|\,X_t=x\right]. \tag{2.3}
$$
Under mild assumptions (see, for instance, Pascucci [22]), the function u, defined
by (2.3), satisfies the Kolmogorov backward equation
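with
$$
\langle z,x\rangle:=\sum_{i=1}^dz_ix_i,\qquad \nabla_x:=(\partial_{x_1},\partial_{x_2},\dots,\partial_{x_d}),\qquad e^{\langle z,\nabla_x\rangle}f(x):=f(x+z).
$$
The formal representation of the shift operator $e^{\langle z,\nabla_x\rangle}$ is motivated by the fact that its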
Taylor expansion applied to the function f (x) gives the Taylor expansion of f (x + z)
about the point x. As in Øksendal and Sulem [18, Chap. 1], we regard the domain of
A(t) to be all functions f : Rd → R such that A(t) f (x) exists and is finite for all
x ∈ Rd .
Remark 2.1 (Martingale property) Let us denote by $X^{(i)}$ the ith component of the
vector X and assume that
$$
\int_{|z|\ge 1}e^{z_i}\,\bar\nu(dz)<\infty,
$$
for some i ≤ d, with $\bar\nu$ as in (2.1). If $S_t:=\mathbb{I}_{\{\zeta>t\}}e^{X_t^{(i)}}$ is supposed to be a traded
asset then, in order for S to be a martingale, the drift $\mu_i$ must satisfy
$$
\mu_i(t,x)=\gamma(t,x)-\int_{\mathbb{R}^d}\nu(t,x,dz)\left(e^{z_i}-1-z_i\right)-\frac{1}{2}\left(\sigma\sigma^T\right)_{ii}(t,x),
$$
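Let us start by rewriting the differential operator (2.5) in the more compact form
$$
A(t):=\int_{\mathbb{R}^d}\nu(t,x,dz)\left(e^{\langle z,\nabla_x\rangle}-1-\langle z,\nabla_x\rangle\right)+\sum_{|\alpha|\le 2}a_\alpha(t,x)D_x^\alpha,\qquad t\in\mathbb{R},\ x\in\mathbb{R}^d,
$$
where
$$
\alpha=(\alpha_1,\dots,\alpha_d)\in\mathbb{N}_0^d,\qquad |\alpha|=\sum_{i=1}^d\alpha_i,\qquad D_x^\alpha=\partial_{x_1}^{\alpha_1}\cdots\partial_{x_d}^{\alpha_d}.
$$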
In this section we introduce a family of expansion schemes for A(t), which we shall
use to construct closed-form approximate solutions (one for each family) of (2.4).
Definition 3.1 For |α| ≤ 2 and $n\le N\in\mathbb{N}_0$, let $a_{\alpha,n}=a_{\alpha,n}(t,x)$ and
$\nu_n=\nu_n(t,x,dz)$ be such that the following hold:
(i) For any t ∈ [0, T], $a_{\alpha,n}(t,\cdot)$ are polynomial functions with $a_{\alpha,0}(t,x)\equiv a_{\alpha,0}(t)$,
and for any $x\in\mathbb{R}^d$ the functions $a_{\alpha,n}(\cdot,x)$ belong to $L^\infty([0,T])$.
(ii) For any t ∈ [0, T], $x\in\mathbb{R}^d$, we have
$$
\nu_n(t,x,dz)=\sum_{|\beta|\le M_n}x^\beta\,\nu_{n,\beta}(t,dz),\qquad M_n\in\mathbb{N}_0, \tag{3.1}
$$
where each $\nu_{n,\beta}(t,dz)$ satisfies condition (2.1). Moreover, $M_0=0$, $\nu_0\ge 0$ and
$$
\int_{|z|\ge 1}e^{\lambda|z|}\,\nu_0(t,dz)<\infty,\qquad t\in[0,T], \tag{3.2}
$$
where $h(t,\cdot,z)\in C^N(\mathbb{R}^d)$ with h ≥ 0, and $\bar\nu$ is a Lévy measure. Then, for any
fixed $\bar x\in\mathbb{R}^d$ and n ≤ N, we define $\nu_n$ and $a_{\alpha,n}$ as the nth order terms of the Taylor
expansions of ν and $a_\alpha$, respectively, in the spatial variables x around the point $\bar x$.
That is, we set
where as usual $\beta!=\beta_1!\cdots\beta_d!$ and $x^\beta=x_1^{\beta_1}\cdots x_d^{\beta_d}$. The expansion proposed
in Lorig et al. [14, 17] is the particular case when ν ≡ 0, whereas the expansion
proposed in Lorig et al. [15, 16] is the particular case when d = 1.
This expansion for the coefficients allows the expansion point x̄ of the Taylor series to
evolve in time according to the evolution of the underlying process X t . For instance,
one could choose x̄(t) = E[X t ]. In Lorig et al. [14] this choice results in a highly
accurate approximation for option prices and implied volatility in the Heston [8]
model.
fundamental solution (i.e., the transition density of the underlying stochastic model)
has singularities. In such cases, it is natural to approximate it in some L p norm rather
than in the pointwise sense. For the Hermite expansion centered at x̄, one sets
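$$
\nu_n(t,x,dz)=\sum_{|\beta|=n}\langle H_\beta(\cdot-\bar x),\nu(t,\cdot,dz)\rangle\,H_\beta(x-\bar x),
$$
$$
a_{\alpha,n}(t,x)=\sum_{|\beta|=n}\langle H_\beta(\cdot-\bar x),a_\alpha(t,\cdot)\rangle\,H_\beta(x-\bar x),\qquad |\alpha|\le 2,
$$
where the inner product $\langle\cdot,\cdot\rangle$ is an integral over $\mathbb{R}^d$ with a Gaussian weighting
centered at $\bar x$ and the functions $H_\beta(x)=H_{\beta_1}(x_1)\cdots H_{\beta_d}(x_d)$, where $H_n$ is the nth
one-dimensional Hermite polynomial (properly normalized so that $\langle H_\alpha,H_\beta\rangle=\delta_{\alpha,\beta}$,
with $\delta_{\alpha,\beta}$ being the Kronecker delta).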
We insert expansion (4.1) for A(t) into Cauchy problem (2.4) and find
$$
u(t)=P_0(t,T)\varphi+\sum_{k=1}^\infty\int_t^Tdt_1\int_{t_1}^Tdt_2\cdots\int_{t_{k-1}}^Tdt_k\,
P_0(t,t_1)B(t_1)P_0(t_1,t_2)B(t_2)\cdots P_0(t_{k-1},t_k)B(t_k)P_0(t_k,T)\varphi \tag{4.3}
$$
$$
=P_0(t,T)\varphi+\sum_{n=1}^\infty\sum_{k=1}^n\int_t^Tdt_1\int_{t_1}^Tdt_2\cdots\int_{t_{k-1}}^Tdt_k
\sum_{i\in I_{n,k}}P_0(t,t_1)A_{i_1}(t_1)P_0(t_1,t_2)A_{i_2}(t_2)\cdots P_0(t_{k-1},t_k)A_{i_k}(t_k)P_0(t_k,T)\varphi, \tag{4.4}
$$
$$
I_{n,k}=\{i=(i_1,i_2,\dots,i_k)\in\mathbb{N}^k\mid i_1+i_2+\cdots+i_k=n\}. \tag{4.5}
$$
The second-to-last equality (4.3) is known as the Dyson series expansion of u (see, for
instance, Sect. 5.7 of Sakurai and Tuan [23] or Chap. IX.2.6 of Kato [10]).
To obtain (4.4) from (4.3) we have used (4.1) to replace B(t) by the infinite sum
$\sum_{n=1}^\infty A_n(t)$, and we have partitioned on the sum of the subscripts of the $(A_{i_k})$.
Expansion (4.4) motivates the following definition.
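The index set in (4.5) consists of the ordered k-tuples of positive integers that sum to n, so that each term of order n in (4.4) corresponds to one such tuple. As an illustration only (the function name `compositions` is hypothetical), the set can be enumerated recursively:

```python
def compositions(n, k):
    """Yield all k-tuples of positive integers summing to n, i.e. the set I_{n,k}."""
    if k == 1:
        if n >= 1:
            yield (n,)
        return
    # the first entry can be at most n - (k - 1), leaving at least 1 for each other entry
    for first in range(1, n - k + 2):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest

# index tuples contributing to the third order term
for k in range(1, 4):
    print(k, list(compositions(3, k)))
```

For n = 3 this lists (3,), then (1, 2) and (2, 1), then (1, 1, 1), which are the operator orderings A3, A1A2, A2A1 and A1A1A1 entering the third order term.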
Definition 4.1 For a fixed N th order polynomial expansion (An (t))0≤n≤N satisfying
Definition 3.1, we define ū N , the N th order price approximation of u, as
$$
\bar u_N:=\sum_{n=0}^Nu_n, \tag{4.6}
$$
where
Here, P0 (t, T ) is the semigroup generated by A0 (t) and In,k is as given in (4.5).
In Sects. 4.1 and 4.2 we will provide explicit expressions for u 0 and (u n )n≥1
respectively.
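In what follows, it will be helpful to recall the definition of the Fourier and inverse
Fourier transforms. For any function ϕ in the Schwartz class, we define
$$
\text{Fourier transform:}\qquad \mathcal{F}[\varphi](\xi)=\hat\varphi(\xi)=\int_{\mathbb{R}^d}dx\,\varphi(x)e^{i\langle\xi,x\rangle},
$$
$$
\text{Inverse transform:}\qquad \mathcal{F}^{-1}[\hat\varphi](x)=\varphi(x)=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}d\xi\,\hat\varphi(\xi)e^{-i\langle\xi,x\rangle}.
$$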
Recall that by construction $M_0=0$ (cf. Definition 3.1) and therefore the operator
$A_0(t)$ has time-dependent coefficients which are independent of x. Then the action
of the semigroup of operators $P_0(t,T)$ generated by $A_0(t)$ is well known:
$$
u_0(t):=P_0(t,T)\varphi=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat P_0(t,x,T,\xi)\,\hat\varphi(-\xi)\,d\xi \tag{4.8}
$$
where
$$
\hat P_0(t,x,T,\xi):=e^{i\langle\xi,x\rangle+\Phi_0(t,T,\xi)} \tag{4.9}
$$
with
$$
\Phi_0(t,T,\xi)=\sum_{|\alpha|\le 2}(i\xi)^\alpha\int_t^Tds\,a_{\alpha,0}(s)+\Psi_0(t,T,\xi), \tag{4.10}
$$
and
$$
\Psi_0(t,T,\xi)=\int_t^T\int_{\mathbb{R}^d}\left(e^{i\langle\xi,z\rangle}-1-i\langle\xi,z\rangle\right)\nu_0(s,dz)\,ds.
$$
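To make the zeroth order characteristic function in (4.9) and (4.10) concrete, here is a minimal one-dimensional sketch. It assumes time-independent coefficients a0, a1, a2 and a Gaussian jump measure with intensity `lam`, mean `m` and standard deviation `s`; these choices, and all function names, are illustrative assumptions rather than part of the general result. For a Gaussian measure the jump integral has the closed form used in `jump_exponent`.

```python
import cmath

def jump_exponent(xi, lam, m, s):
    # integral of (e^{i xi z} - 1 - i xi z) against lam * N(m, s^2)(dz), in closed form
    return lam * (cmath.exp(1j * xi * m - 0.5 * s * s * xi * xi) - 1.0 - 1j * xi * m)

def exponent0(t, T, xi, a0, a1, a2, lam, m, s):
    # exponent of (4.9) minus i*xi*x, for d = 1 and constant coefficients (assumption)
    ixi = 1j * xi
    return (T - t) * (a0 + a1 * ixi + a2 * ixi ** 2 + jump_exponent(xi, lam, m, s))

def p0_hat(t, x, T, xi, a0, a1, a2, lam, m, s):
    # zeroth order characteristic function e^{i xi x + exponent}
    return cmath.exp(1j * xi * x + exponent0(t, T, xi, a0, a1, a2, lam, m, s))

# hypothetical inputs: no killing, drift -0.05, diffusion coefficient 0.02
print(p0_hat(0.0, 0.1, 1.0, 1.5, 0.0, -0.05, 0.02, 2.0, -0.1, 0.2))
```

At ξ = 0 and with a0 = 0 the function equals 1, as a characteristic function should.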
Remark 4.2 We introduce $\hat P$ and $e_\xi$, the characteristic function and oscillating
exponential, respectively:
$$
\hat P(t,x,T,\xi):=E\left[e^{\int_t^Ta_{0,0}(s,X_s)\,ds}\,e^{i\langle\xi,X_T\rangle}\,\middle|\,X_t=x\right],\qquad e_\xi(x)=e^{i\langle\xi,x\rangle}, \tag{4.11}
$$
where $a_{0,0}$ is short-hand for $a_{(0,0,\dots,0),0}$. From (2.3) we observe that $\hat P(t,x,T,\xi)$ is
obtained as the special case $\varphi=e_\xi$. We note that $\hat P_0(t,x,T,\xi)$ in (4.9) represents the
330 M. Lorig et al.
Moreover, the components of M(t, tk ) commute. Therefore the operators (G j (t, tk )),
which are polynomials in M(t, tk ) by construction, are well defined.
Proof The proof consists in showing that the operator G j (t, tk ) in (4.14) satisfies
Assuming (4.16) holds, we can use the fact that P0 (tk , tk+1 ) is a semigroup
from which (4.12) and (4.13) follow directly. Thus, we only need to show that
G j (t, tk ) satisfies (4.16). It is sufficient to investigate how the operator P0 (t, tk )A j (tk )
acts on the oscillating exponential in (4.11). First, we note that
$$
\begin{aligned}
(-i\partial_{\xi_i})(-i\partial_{\xi_j})e^{\Phi_0(t,t_k,\xi)}e_\xi(x)
&=(-i\partial_{\xi_i})M_j(t,t_k,\xi)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&=M_j(t,t_k)(-i\partial_{\xi_i})e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&=M_j(t,t_k)M_i(t,t_k,\xi)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&=M_j(t,t_k)M_i(t,t_k)e^{\Phi_0(t,t_k,\xi)}e_\xi(x).
\end{aligned} \tag{4.19}
$$
$$
(-i\nabla_\xi)^\beta e^{\Phi_0(t,t_k,\xi)}e_\xi(x)=(M(t,t_k))^\beta e^{\Phi_0(t,t_k,\xi)}e_\xi(x). \tag{4.20}
$$
$$
M_j(t,t_k)M_i(t,t_k)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)=M_i(t,t_k)M_j(t,t_k)e^{\Phi_0(t,t_k,\xi)}e_\xi(x).
$$
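Finally, we compute
$$
\begin{aligned}
P_0(t,t_k)A_j(t_k)e_\xi(x)
&=P_0(t,t_k)\int_{\mathbb{R}^d}\nu_j(t_k,x,dz)\left(e^{\langle z,\nabla_x\rangle}-1-\langle z,\nabla_x\rangle\right)e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}P_0(t,t_k)a_{\alpha,j}(t_k,x)D_x^\alpha e_\xi(x) &&\text{(by (3.3))}\\
&=P_0(t,t_k)\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle}-1-i\langle z,\xi\rangle\right)\nu_j(t_k,x,dz)e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}(i\xi)^\alpha P_0(t,t_k)a_{\alpha,j}(t_k,x)e_\xi(x)\\
&=\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle}-1-i\langle z,\xi\rangle\right)\nu_j(t_k,-i\nabla_\xi,dz)P_0(t,t_k)e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}(i\xi)^\alpha a_{\alpha,j}(t_k,-i\nabla_\xi)P_0(t,t_k)e_\xi(x)\\
&=\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle}-1-i\langle z,\xi\rangle\right)\nu_j(t_k,-i\nabla_\xi,dz)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}(i\xi)^\alpha a_{\alpha,j}(t_k,-i\nabla_\xi)e^{\Phi_0(t,t_k,\xi)}e_\xi(x) &&\text{(by (4.17))}\\
&=\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle}-1-i\langle z,\xi\rangle\right)\nu_j(t_k,M(t,t_k),dz)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}(i\xi)^\alpha a_{\alpha,j}(t_k,M(t,t_k))e^{\Phi_0(t,t_k,\xi)}e_\xi(x) &&\text{(by (4.20))}\\
&=\int_{\mathbb{R}^d}\nu_j(t_k,M(t,t_k),dz)\left(e^{\langle z,\nabla_x\rangle}-1-\langle z,\nabla_x\rangle\right)e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}a_{\alpha,j}(t_k,M(t,t_k))D_x^\alpha e^{\Phi_0(t,t_k,\xi)}e_\xi(x)\\
&=\int_{\mathbb{R}^d}\nu_j(t_k,M(t,t_k),dz)\left(e^{\langle z,\nabla_x\rangle}-1-\langle z,\nabla_x\rangle\right)P_0(t,t_k)e_\xi(x)\\
&\quad+\sum_{|\alpha|\le 2}a_{\alpha,j}(t_k,M(t,t_k))D_x^\alpha P_0(t,t_k)e_\xi(x) &&\text{(by (4.17))}\\
&=G_j(t,t_k)P_0(t,t_k)e_\xi(x), &&\text{(by (4.14))}
\end{aligned}
$$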
Remark 4.4 Error bounds for the Taylor approximation ū N in the scalar case d = 1
can be found in Lorig et al. [15, 16].
where, for clarity, we have explicitly indicated using superscripts that $L_n^x(t,T)$ acts
on x and $\hat L_n^\xi(t,T)$ acts on ξ. With a slight abuse of terminology, we call $\hat L_n^\xi$ the
symbol¹ of the operator $L_n^x(t,T)$ in (4.13).
Let us consider the operator $M^x(t,t_k)\equiv M(t,t_k)$ in (4.15) and denote by
$M_i^x(t,t_k)$ its ith component. The symbol $\widehat M_i^\xi(t,t_k)$ of $M_i^x(t,t_k)$ is defined
analogously to (4.21), that is
$$
M_i^x(t,t_k)e^{i\langle\xi,x\rangle}=\widehat M_i^\xi(t,t_k)e^{i\langle\xi,x\rangle}.
$$
Explicitly, we have
$$
\begin{aligned}
M_i^x(t,t_k)M_j^x(t,t_k)e^{i\langle\xi,x\rangle}
&=M_i^x(t,t_k)\widehat M_j^\xi(t,t_k)e^{i\langle\xi,x\rangle}\\
&=\widehat M_j^\xi(t,t_k)M_i^x(t,t_k)e^{i\langle\xi,x\rangle}\\
&=\widehat M_j^\xi(t,t_k)\widehat M_i^\xi(t,t_k)e^{i\langle\xi,x\rangle}.
\end{aligned}
$$
1 The operator $\hat L_n^\xi$ is not a function as in the classical theory of pseudo-differential calculus. However,
$e^{-i\langle\xi,x\rangle}\hat L_n^\xi e^{i\langle\xi,x\rangle}$ is the symbol of $L_n^x(t,T)$.
Since $M_i^x$ and $M_j^x$ commute when applied to a function that admits a Fourier
representation, $\widehat M_j^\xi$ and $\widehat M_i^\xi$ also commute when applied to such functions. In
particular, the operator $\left(\widehat M^\xi(t,t_k)\right)^\beta$, for $\beta\in\mathbb{N}_0^d$, is well defined and we have
$$
\left(\widehat M^\xi(t,t_k)\right)^\beta e^{i\langle\xi,x\rangle}=(M(t,t_k))^\beta e^{i\langle\xi,x\rangle}. \tag{4.22}
$$
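From identity (4.22) we obtain directly the expression of the symbol of $G_j$ in (4.14).
Indeed, recalling the expression (3.1) of $\nu_j$ we have
$$
\hat G_j^\xi(t,t_k)=\sum_{|\beta|\le M_j}\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle}-1-i\langle z,\xi\rangle\right)\nu_{j,\beta}(t_k,dz)\left(\widehat M^\xi(t,t_k)\right)^\beta
+\sum_{|\alpha|\le 2}(i\xi)^\alpha a_{\alpha,j}\left(t_k,\widehat M^\xi(t,t_k)\right).
$$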
The following theorem extends the Fourier pricing formula (4.8) to higher order
approximations.
Theorem 4.6 Under the assumptions of Proposition 4.3, for any n ≥ 1 we have
$$
u_n(t)=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat P_n(t,x,T,\xi)\,\hat\varphi(-\xi)\,d\xi, \tag{4.24}
$$
where $\hat P_n(t,x,T,\xi)$ is the nth order term of the approximation of the characteristic
function of X (cf. Remark 4.2). Explicitly, we have
$$
\hat P_n(t,x,T,\xi):=\hat P_0(t,x,T,\xi)\left(e^{-i\langle\xi,x\rangle}\hat L_n^\xi(t,T)e^{i\langle\xi,x\rangle}\right)
$$
where $\hat P_0(t,x,T,\xi)$ is the 0th order approximation in (4.9) and $\hat L_n^\xi(t,T)$ is the
differential operator defined in (4.23).
Proof We first note that, since the approximating operator $L_n^x$ acts in the x variables,
it commutes² with the Fourier pricing operator (4.8). Thus, by (4.12) combined
with (4.8), we get
$$
\begin{aligned}
u_n(t)=L_n^x(t,T)u_0(t)&=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}L_n^x(t,T)e^{i\langle\xi,x\rangle+\Phi_0(t,T,\xi)}\,\hat\varphi(-\xi)\,d\xi\\
&=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat P_0(t,x,T,\xi)\left(e^{-i\langle\xi,x\rangle}\hat L_n^\xi(t,T)e^{i\langle\xi,x\rangle}\right)\hat\varphi(-\xi)\,d\xi,
\end{aligned}
$$
Remark 4.8 In the case of non-integrable payoffs (e.g. Call and Put options), the Fourier
representation (4.24) can be easily extended by considering the Fourier transform
on a line in the complex plane, $\xi=\xi_r+i\xi_i$. For instance, since the Call option payoff
$\varphi(x)=(e^x-e^k)^+$ is not integrable, its Fourier transform $\hat\varphi(-\xi)$ must be computed
in a generalized sense by fixing an imaginary component of the Fourier variable
$\xi_i<-1$.
Remark 4.9 Observe that the Nth order approximation (4.6)–(4.24) requires only a
single Fourier inversion
$$
\bar u_N(t,x)=\sum_{n=0}^Nu_n(t,x)=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\sum_{n=0}^N\hat P_n(t,x,T,\xi)\,\hat\varphi(-\xi)\,d\xi.
$$
Moreover, when evaluating the inverse transform, the number of dimensions over
which one must integrate numerically is equal to the number of components of x that
appear in the option payoff ϕ. This is due to the fact that the Fourier transform of a
constant is a Dirac delta function. In particular, let $\varphi(x)\equiv\bar\varphi(\bar x)$ with $\bar x=(x_1,\dots,x_{d'})$,
for some d' < d. Then we have $\hat\varphi(\xi)=(2\pi)^{d-d'}\,\hat{\bar\varphi}(\bar\xi)\,\delta_0(\xi_{d'+1})\cdots\delta_0(\xi_d)$ with
$\bar\xi=(\xi_1,\dots,\xi_{d'})$, and thus
$$
\bar u_N(t,x)=\frac{1}{(2\pi)^{d'}}\int_{\mathbb{R}^{d'}}\sum_{n=0}^N\hat P_n\left(t,x,T,(\bar\xi,0)\right)\hat{\bar\varphi}(-\bar\xi)\,d\bar\xi.
$$
2 This was one of the main points of the adjoint expansion method proposed by Pagliarani et al.
[21].
Consider the following model for an asset $S=e^X$, written under the pricing measure
Q assuming zero interest rates:
$$
dX_t=\left(-\frac{1}{2}-\int_{\mathbb{R}}\nu(d\zeta)\left(e^\zeta-1-\zeta\right)\right)Z_t\,dt+\sqrt{Z_t}\,dW_t+\int_{\mathbb{R}}\zeta\,d\widetilde N(t,Z_t,dt,d\zeta),
$$
$$
dZ_t=\kappa(\theta-Z_t)\,dt+\delta\sqrt{Z_t}\,dB_t,\qquad d\langle W,B\rangle_t=\rho\,dt.
$$
Note that, just as in the Heston model, the instantaneous volatility of X is given by
$\sqrt{Z_t}$, where Z is a CIR process. Likewise, the instantaneous arrival rate of jumps of
size dζ is given by $Z_t\,\nu(d\zeta)$, where ν is a Lévy measure satisfying all of the usual
integrability conditions. The generator A of the process (X, Z) is given by
$$
A=z\left(\mu\partial_x+\frac{1}{2}\partial_x^2+\int_{\mathbb{R}}\nu(d\zeta)\left(e^{\zeta\partial_x}-1-\zeta\partial_x\right)\right)
+\kappa(\theta-z)\partial_z+\frac{1}{2}\delta^2z\,\partial_z^2+\rho\delta z\,\partial_x\partial_z,
$$
$$
\mu=-\frac{1}{2}-\int_{\mathbb{R}}\nu(d\zeta)\left(e^\zeta-1-\zeta\right).
$$
2 R
With an explicit expression for $\hat P(t,x,z,T,\xi)$ available, the price of a European call
option can be computed using standard Fourier methods
$$
u(t,x,z)=\frac{1}{2\pi}\int_{\mathbb{R}}d\xi_r\,\hat P(t,x,z,T,\xi)\,\hat\varphi(-\xi),\qquad \hat\varphi(\xi)=\frac{-e^{k-ik\xi}}{i\xi+\xi^2},\qquad
\xi=\xi_r+i\xi_i,\quad \xi_i<-1. \tag{5.1}
$$
Note that, since the call option payoff $\varphi(x)=(e^x-e^k)^+$ is not in $L^1(\mathbb{R})$, its Fourier
transform $\hat\varphi(\xi)$ must be computed in a generalized sense by fixing an imaginary
component of the Fourier variable $\xi_i<-1$.
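The mechanics of the contour shift in (5.1) can be illustrated on a model where the answer is known in closed form. The sketch below is not the model of this section: it swaps in the Black-Scholes characteristic function (zero rates, constant volatility `sig`) purely to test the inversion, and compares the result with the Black-Scholes call price. The factor `payoff_transform(xi, k)` is the expression $-e^{k-ik\xi}/(i\xi+\xi^2)$ from (5.1), which for $\xi_i<-1$ equals the integral of $(e^x-e^k)^+e^{-i\xi x}$ over x. All function names, the damping level `xi_i = -2`, and the truncation and step parameters are illustrative choices.

```python
import cmath
import math

def payoff_transform(xi, k):
    # -(e^{k - i k xi}) / (i xi + xi^2), finite for Im(xi) < -1
    return -cmath.exp(k - 1j * k * xi) / (1j * xi + xi * xi)

def bs_char(xi, x, sig, tau):
    # E[e^{i xi X_T} | X_t = x] for X_T ~ N(x - sig^2 tau / 2, sig^2 tau); a stand-in model
    mean = x - 0.5 * sig * sig * tau
    return cmath.exp(1j * xi * mean - 0.5 * sig * sig * tau * xi * xi)

def call_fourier(x, k, sig, tau, xi_i=-2.0, L=60.0, n=6000):
    # trapezoidal rule in xi_r along the line Im(xi) = xi_i
    h = 2.0 * L / n
    total = 0.0
    for j in range(n + 1):
        xi = complex(-L + j * h, xi_i)
        w = 0.5 if j in (0, n) else 1.0
        total += w * (bs_char(xi, x, sig, tau) * payoff_transform(xi, k)).real
    return total * h / (2.0 * math.pi)

def call_bs(x, k, sig, tau):
    # Black-Scholes call with zero rates, log-spot x and log-strike k
    N = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    d1 = (x - k + 0.5 * sig * sig * tau) / (sig * math.sqrt(tau))
    return math.exp(x) * N(d1) - math.exp(k) * N(d1 - sig * math.sqrt(tau))

print(call_fourier(0.0, 0.05, 0.2, 1.0), call_bs(0.0, 0.05, 0.2, 1.0))
```

The same routine applies to any model for which the characteristic function is available for complex ξ; only `bs_char` would need to be replaced.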
Also of interest are sensitivities of option prices, or Greeks. In particular, consider
the Delta Δ and the Gamma Γ, which are defined as
where we have used x(s) = log s. When computing terms of the form $\partial_x^mu(t,x,z)$,
observe that the differential operator $\partial_x^m$ acts only on the characteristic function $\hat P$
appearing in (5.1) and not on the Fourier transform $\hat\varphi$ of the payoff ϕ. Likewise, when
using Theorem 4.6 to compute $\partial_x^m\bar u_n(t,x,z)=\sum_{i=0}^n\partial_x^mu_i(t,x,z)$, the differential
operator $\partial_x^m$ acts only on $\hat P_i$ in (4.24).
In Fig. 1 we plot the implied volatility σ corresponding to the exact price u as well
as the implied volatility $\bar\sigma_2$ corresponding to our second order approximation $\bar u_2$.
To compute σ we first compute option prices using (5.1); we then invert the Black-
Scholes formula numerically in order to obtain the implied volatility σ. To compute
our second order approximation of implied volatility $\bar\sigma_2$ we first compute our second
order approximation for prices $\bar u_2$ using Theorem 4.6; we then invert the Black-
Scholes formula numerically in order to obtain $\bar\sigma_2$. Values from Fig. 1 can be found
in Table 1. In Fig. 2 we plot the exact Δ as well as our second order approximation $\bar\Delta_2$.
In Fig. 3 we plot the exact Γ as well as our second order approximation $\bar\Gamma_2$. Values
from Figs. 2 and 3 are given in Tables 2 and 3 respectively. Exact Greeks are computed
by combining (5.1)–(5.3). Approximate Greeks are computed by combining
Theorem 4.6 and Eqs. (5.2) and (5.3).
[Fig. 1: implied volatility against log strike; panel titles recovered from the plot: t = 0.10, t = 0.25]
$$
\nu(d\zeta)=\frac{\lambda}{\sqrt{2\pi s^2}}\exp\left(\frac{-(\zeta-m)^2}{2s^2}\right)d\zeta.
$$
Fig. 1 For the model considered in Sect. 5, we plot the implied volatility σ corresponding to the
exact option price u (solid black) as well as the implied volatility σ̄2 corresponding to our second
order option price approximation ū 2 (dashed black). The units of the horizontal axis are log strike
k := log K . Approximate prices are computed using the Taylor series expansion of A(t) as described
in Example 3.2. We assume the Lévy measure ν is as parametrized above. The following parameters
are used in all four plots: κ = 1.15, θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, x = 0, m = −0.1,
s = 0.2, λ = 2.0
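With the Gaussian Lévy measure used for the figures, the jump compensator in the drift μ of the model has a closed form, since the integral of $e^\zeta$ against a normal density with mean m and variance s² equals $e^{m+s^2/2}$. The following sketch evaluates μ for the stated parameters and cross-checks the closed form against a simple trapezoidal rule; the function names and quadrature settings are illustrative only.

```python
import math

def jump_compensator(lam, m, s):
    # integral of (e^zeta - 1 - zeta) against lam * N(m, s^2)(d zeta), in closed form
    return lam * (math.exp(m + 0.5 * s * s) - 1.0 - m)

def drift_mu(lam, m, s):
    # mu = -1/2 - integral of (e^zeta - 1 - zeta) nu(d zeta)
    return -0.5 - jump_compensator(lam, m, s)

def jump_compensator_numeric(lam, m, s, n=4000, width=12.0):
    # trapezoidal rule on [m - width*s, m + width*s]
    h = 2.0 * width * s / n
    total = 0.0
    for j in range(n + 1):
        z = m - width * s + j * h
        w = 0.5 if j in (0, n) else 1.0
        dens = lam * math.exp(-(z - m) ** 2 / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))
        total += w * (math.exp(z) - 1.0 - z) * dens
    return total * h

lam, m, s = 2.0, -0.1, 0.2   # parameters quoted in the caption of Fig. 1
print(drift_mu(lam, m, s), jump_compensator_numeric(lam, m, s))
```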
Table 1 Exact implied vols σ, second order approximation σ̄2 and relative error |(σ̄2 − σ)/σ|
           k − x      −0.2     −0.15    −0.1     −0.05    0.00     0.05     0.1      0.15     0.2
t = 0.10   σ          0.2797   0.2478   0.2269   0.2133   0.2028   0.1940   0.1881   0.1960   0.2296
           σ̄2         0.2795   0.2483   0.2271   0.2132   0.2028   0.1939   0.1877   0.1963   0.2324
           rel. err.  0.0006   0.0018   0.0009   0.0003   0.0002   0.0001   0.0020   0.0018   0.0120
t = 0.25   σ          0.2441   0.2323   0.2217   0.2120   0.2028   0.1941   0.1863   0.1805   0.1803
           σ̄2         0.2456   0.2328   0.2215   0.2116   0.2025   0.1939   0.1859   0.1793   0.1799
           rel. err.  0.0059   0.0018   0.0013   0.0020   0.0013   0.0009   0.0021   0.0067   0.0027
t = 0.50   σ          0.2348   0.2266   0.2183   0.2101   0.202    0.1940   0.1864   0.1796   0.1743
           σ̄2         0.2350   0.2254   0.2168   0.2088   0.201    0.1933   0.1856   0.1783   0.1723
           rel. err.  0.0005   0.0049   0.0069   0.0063   0.004    0.0037   0.0040   0.0070   0.0116
t = 1.00   σ          0.2268   0.2204   0.2138   0.2072   0.2005   0.1939   0.1875   0.1813   0.1757
           σ̄2         0.2217   0.2149   0.2089   0.2031   0.1973   0.1914   0.1854   0.1794   0.1740
           rel. err.  0.0227   0.0246   0.0230   0.0197   0.0160   0.0130   0.0111   0.0103   0.0096
Parameters are the same as those in Fig. 1
[Fig. 2: Delta against x; panel titles recovered from the plot: t = 0.10, t = 0.25]
Fig. 2 For the model considered in Sect. 5, we plot the Delta Δ corresponding to the exact option
price u (solid black) as well as the Delta Δ̄2 corresponding to our second order option price
approximation ū2 (dashed black). The units of the horizontal axis are x. Approximate prices are
computed using the Taylor series expansion of A(t) as described in Example 3.2. We assume the
Lévy measure ν is as given in Fig. 1. The following parameters are used in all four plots: κ = 1.15,
θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, k = 0, m = −0.1, s = 0.2, λ = 2.0
[Fig. 3: Gamma against x; panel titles recovered from the plot: t = 0.10, t = 0.25]
Fig. 3 For the model considered in Sect. 5, we plot the Gamma Γ corresponding to the exact option
price u (solid black) as well as the Gamma Γ̄2 corresponding to our second order option price
approximation ū2 (dashed black). The units of the horizontal axis are x. Approximate prices are
computed using the Taylor series expansion of A(t) as described in Example 3.2. We assume the
Lévy measure ν is as given in Fig. 1. The following parameters are used in all four plots: κ = 1.15,
θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, k = 0, m = −0.1, s = 0.2, λ = 2.0
Table 2 Exact Delta Δ, second order approximation Δ̄2 and relative error |(Δ̄2 − Δ)/Δ|
           x          −0.2     −0.15    −0.1     −0.05    0.00     0.05     0.1      0.15     0.2
t = 0.10   Δ          0.0008   0.00516  0.05084  0.2312   0.5370   0.8024   0.9385   0.9845   0.9959
           Δ̄2         0.0009   0.00478  0.05081  0.2313   0.5368   0.8026   0.9387   0.9843   0.9958
           rel. err.  0.1309   0.07358  0.00048  0.0006   0.0003   0.0002   0.0002   0.0002   0.0000
t = 0.25   Δ          0.01311  0.05708  0.1690   0.3503   0.5559   0.7329   0.8563   0.9293   0.9672
           Δ̄2         0.0114   0.05674  0.1696   0.3502   0.5552   0.7330   0.8576   0.9306   0.9673
           rel. err.  0.1305   0.00585  0.0035   0.0004   0.0012   0.0000   0.0014   0.0014   0.0000
t = 0.50   Δ          0.06608  0.1506   0.2767   0.4260   0.5739   0.7018   0.8014   0.8731   0.9215
           Δ̄2         0.06425  0.1508   0.2766   0.4246   0.5719   0.7007   0.8027   0.8766   0.9256
           rel. err.  0.02773  0.0014   0.0003   0.0032   0.0034   0.0015   0.0015   0.0040   0.0044
t = 1.00   Δ          0.1708   0.2667   0.3760   0.4878   0.5927   0.6849   0.7618   0.8234   0.8713
           Δ̄2         0.1662   0.2627   0.3710   0.4814   0.5857   0.6791   0.7595   0.8262   0.8789
           rel. err.  0.0268   0.01496  0.0131   0.0130   0.0117   0.0084   0.0030   0.0033   0.0088
Parameters are the same as those in Fig. 2
6 Conclusion
In this paper we derive a family of asymptotic expansions for European option prices
when the underlying is modeled as a d-dimensional time inhomogeneous Lévy-type
process. By combining the classical Dyson series expansion with a novel polynomial
expansion of the generator, we obtain two equivalent representations for approximate
option prices: (i) as an integro-differential operator acting on the order zero price, and
(ii) as a Fourier transform. We implement our pricing approximation on a Heston-
like model which allows for both stochastic volatility and stochastic jump intensity.
We find that our second order expansion provides an excellent approximation for
prices (as seen through corresponding implied volatilities), as well as for the Greeks
Δ and Γ.
References
1. Benhamou, E., Gobet, E., Miri, M.: Smart expansion and fast calibration for jump diffusions.
Financ. Stoch. 13(4), 563–589 (2009)
2. Bompis, R., Gobet, E.: Asymptotic and non asymptotic approximations for option valuation.
Recent Developments in Computational Finance. Foundations, Algorithms and Applications,
pp. 159–241. World Scientific, Hackensack (2013)
3. Carr, P., Wu, L.: Time-changed Lévy processes and option pricing. J. Financ. Econ. 71(1),
113–141 (2004)
4. Corielli, F., Foschi, P., Pascucci, A.: Parametrix approximation of diffusion transition densities.
SIAM J. Financ. Math. 1, 833–867 (2010)
5. Deuschel, J.-D., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part i: theoretical foundations. Commun. Pure Appl. Math. 67(1),
40–82 (2014)
Asymptotics for d-Dimensional Lévy-Type Processes 343
6. Fouque, J.-P., Papanicolaou, G., Sircar, R., Solna, K.: Multiscale Stochastic Volatility for Equity,
Interest Rate, and Credit Derivatives. Cambridge University Press, Cambridge (2011)
7. Friz, P. K., Gerhold, S., Yor, M.: How to make Dupire’s local volatility work with jumps. Quant.
Financ. 14(8), 1327–1331 (2014)
8. Heston, S.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
9. Jeanblanc, M., Yor, M., Chesney, M.: Mathematical Methods for Financial Markets. Springer,
London (2009)
10. Kato, T.: Perturbation Theory for Linear Operators. Classics in Mathematics. Springer, Berlin
(1995). Reprint of the 1980 edition
11. Lorig, M.: Pricing derivatives on multiscale diffusions: an eigenfunction expansion approach.
Math. Finance 24(2), 331–363 (2014)
12. Lorig, M., Lozano-Carbassé O.: Multiscale Exponential Lévy models. Quant. Finance 15(1),
91–100 (2015)
13. Lorig, M., Pagliarani, S., Pascucci, A.: Analytical expansions for parabolic equations. SIAM
J. Appl. Math. 75(2), 468–491 (2015)
14. Lorig, M., Pagliarani, S., Pascucci, A.: Explicit implied volatilities for multifactor local-
stochastic volatility models. Math. Finance (to appear) (2015). ArXiv preprint arXiv:1306.5447
15. Lorig, M., Pagliarani, S., Pascucci, A.: A family of density expansions for Lévy-type processes
with default. Ann. Appl. Probab. 25(1), 235–267 (2015)
16. Lorig, M., Pagliarani, S., Pascucci, A.: Pricing approximations and error estimates for local
Lévy-type models with default. Comp. Math. App. 69(10), 1189–1219 (2015)
17. Lorig, M., Pagliarani, S., Pascucci, A.: A Taylor series approach to pricing and implied vol for
LSV models. J. Risk 17(2), 1–17 (2014)
18. Øksendal, B., Sulem, A.: Applied Stochastic Control of Jump Diffusions. Springer, Berlin
(2005)
19. Pagliarani, S., Pascucci, A.: Analytical approximation of the transition density in a local volatil-
ity model. Cent. Eur. J. Math. 10(1), 250–270 (2012)
20. Pagliarani, S., Pascucci, A.: Local stochastic volatility with jumps: analytical approximations.
Int. J. Theor. Appl. Financ. 16(8), 1–35 (2013)
21. Pagliarani, S., Pascucci, A., Riga, C.: Adjoint expansions in local Lévy models. SIAM J. Financ.
Math. 4, 265–296 (2013)
22. Pascucci, A.: PDE and Martingale Methods in Option Pricing. Bocconi & Springer Series, vol.
2. Springer, Milan (2011)
23. Sakurai, J.J., Tuan, S.F.: Modern Quantum Mechanics, vol. 104. Addison-Wesley, Reading
(Mass.) (1994)
Asymptotic Expansion Approach in Finance
Akihiko Takahashi
1 Introduction
Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\in[0,T]}, P)$ denote a probability space with filtration, on which an
$r$-dimensional standard Wiener process $W$ is defined, where $P$ is an appropriate pricing
measure (a risk-neutral measure) in finance, and $T$ denotes some positive constant.
Now, let $F(\omega)$ be a Wiener functional; then $V$, the security or portfolio value, can
be expressed as $V = E[F(\omega)]$ under certain conditions. Evaluating this expectation
is one of the main issues in finance. Moreover, if $F$ depends on a parameter $\theta$,
computation of $\frac{\partial V}{\partial\theta} = \frac{\partial}{\partial\theta} E[F(\omega; \theta)]$, the sensitivity of the security value with respect
to a change in this parameter (one of the so-called Greeks), is also an important task in practice.
I dedicate this note to the late Professor Peter Laurence and Koji Takahashi.
I am very grateful to Professor Fujii, Professor Shiraya, Professor Takehara, Dr. Toda,
Dr. Tsuzuki and Professor Yamada, my coauthors in the original articles, which are main bases
for this survey.
A. Takahashi (B)
Graduate School of Economics, The University of Tokyo, 7-3-1, Hongo,
Tokyo, Bunkyo-ku 113-0033, Japan
e-mail: [email protected]
where $\epsilon \in [0, 1]$ is a known parameter. Here, the coefficients are assumed to
satisfy some regularity conditions. In finance, many problems of pricing derivatives
and evaluating portfolios in investment theory reduce to computing
$E[f(X_T^{(\epsilon)})]$, the expectation of $f(X_T^{(\epsilon)})$, which is a function of $X_T^{(\epsilon)}$.
In finance applications, it is important to deal with not only smooth functions
$f(x)$ but also non-smooth ones. For example, when various options are evaluated,
$f$ is expressed as $f = T \circ g$, where $T(x) = \max\{x, 0\}$ and $g$ stands for a smooth
function from $\mathbf{R}^d$ to $\mathbf{R}$. In general, it is difficult to represent this expectation explicitly
except in special cases. Hence, numerical methods such as Monte Carlo simulation
or numerical solution of partial differential equations (PDEs) are employed, and
various speed-up techniques have been developed, since fast and precise computation is
required in practice.
As a different approach, one may approximate the expectation by an asymptotic
expansion of the stochastic differential equation around $\epsilon = 0$.
Furthermore, because $\frac{\partial}{\partial x_0} E[f(X_T^{(\epsilon)})]$ and $\frac{\partial}{\partial\epsilon} E[f(X_T^{(\epsilon)})]$, the sensitivities of the
security value with respect to changes in the initial value $x_0$ and in the parameter $\epsilon$, are
important indicators for practical purposes, highly accurate approximations
are valuable. Moreover, some schemes that combine Monte Carlo simulation
with low-order asymptotic expansions have been developed, since the asymptotic
expansion up to the first or second order can be easily evaluated. Those schemes
improve both the efficiency of Monte Carlo simulation and the accuracy of the
approximations obtained by the asymptotic expansions.
An asymptotic expansion approach in finance has been developed over the past two
decades; it is mathematically justified by Watanabe theory (Watanabe [111])
in Malliavin calculus (e.g. Malliavin [64], Chap. V-8 in Ikeda and Watanabe [39],
Nualart [72]). To the best of our knowledge, the asymptotic expansion technique was
first applied to finance for the evaluation of average options, which are popular
derivatives in commodity markets. Kunitomo and Takahashi [48] and Takahashi [85] derive
approximation formulas for average options by an asymptotic expansion method
based on log-normal approximations of average price distributions, when the
underlying asset prices follow geometric Brownian motions. Yoshida [119] derives an
asymptotic expansion of an average option price around a normal distribution for a
general diffusion model, which is a byproduct of his result in statistics [118] based
on the Watanabe theory.
Thereafter, the asymptotic expansion approach has been applied to a broad class
of valuation problems in finance, which includes pricing options with stochastic
volatility models, pricing options under Heath-Jarrow-Morton (HJM) models [37]
or Libor market models (LMM) (Brace, Gatarek and Musiela [7], Jamshidian [43])
of interest rates, and pricing so-called exotic-type options such as basket and barrier
options in addition to average options.
For instance, please see Kawai [44], Kobayashi, Takahashi and Tokioka [45],
Kunitomo and Takahashi [49–51], Li [59], Matsuoka, Takahashi and Uchida [66],
Muroi [67], Nishiba [71], Osajima [75], Shiraya and Takahashi [78–80], Shiraya,
Takahashi and Toda [81], Shiraya, Takahashi and Yamada [83], Shiraya, Takahashi
and Yamazaki [82], Takahashi and Matsushima [88], Takahashi and Saito [89], Taka-
hashi and Takehara [90–94], Takahashi, Takehara and Toda [90, 91], Takahashi and
Tsuzuki [98], Takahashi and Uchida [99], Takahashi and Yamada [100–104], Taka-
hashi and Yoshida [106, 107], Takahashi and Takehara [92, 93], Violante [110], Xu
and Zheng [112, 113], and Takahashi [86, 87].
We briefly introduce some of the above works in Sect. 3.6. Moreover, we remark
that the asymptotic expansion approach is employed by Yamanobe [116, 117] in
physics for analyses of an impulse-driven stochastic biological oscillator and the global
dynamics of a stochastic neuronal oscillator.
We also note that there exist many other types of expansion/perturbation
methods which have turned out to be useful for applications in finance. For
example, see Bayer and Laurence [2], Ben Arous and Laurence [3], Benaim, Friz and
Lee [4], Col, Gnoatto and Grasselli [9], Davydov and Linetsky [11], Deuschel, Friz,
Jacquier and Violante [12, 13], Forde and Jacquier [18], Forde, Jacquier and Lee [17],
Foschi, Pagliarani, Pascucci [19], Fouque, Papanicolaou and Sircar [20, 21], Fujii
[24], Fujii and Takahashi [25–27, 29], Gatheral, Hsu, Laurence, Ouyang, and Wang
[30], Gnoatto and Grasselli [31], Gulisashvili [32], Hagan, Kumar, Lesniewski and
Woodward [33], Henry-Labordère [38], Kato, Takahashi and Yamada [46, 47],
Kusuoka and Osajima [57], Lee [58], Lipton [60], Linetsky [61], Osajima [76],
Pagliarani and Pascucci [77], Siopacha and Teichmann [84], Yamamoto, Sato and
Takahashi [114], Yamamoto and Takahashi [115], and references therein.
The organization of the paper is as follows. The next section describes the outline
of the asymptotic expansion approach in a general diffusion setting. Then, Sect. 3
explains a computational scheme for the expansion method. Section 4 provides an
extension of the general computational scheme in the previous section, and Sect. 5
briefly introduces two improvement schemes for the expansion method. Section 6
extends the approach to non-diffusion Wiener functionals by using an instantaneous
forward rates model as an example. Sections 7 and 8 introduce an asymptotic expan-
sion in jump-diffusion models and a perturbation scheme in forward backward sto-
chastic differential equations (FBSDEs). Section 9 concludes.
Following [87, 96], this section briefly describes an asymptotic expansion method
in a general diffusion setting.
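\[
\frac{1}{\epsilon^{k}} \left\| g(X_T^{(\epsilon)}) - \left( g_{0T} + \epsilon g_{1T} + \cdots + \epsilon^{k-1} g_{k-1,T} \right) \right\|_{q,s} = O(1) \quad (\text{as } \epsilon \downarrow 0),
\]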
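\[
g(X_T^{(\epsilon)}) = SR_T^{(\epsilon)} = \frac{1 - \prod_{j=0}^{N-1} \frac{1}{1+\tau L_{jT}^{(\epsilon)}}}{\tau \sum_{i=0}^{N-1} \prod_{j=0}^{i} \frac{1}{1+\tau L_{jT}^{(\epsilon)}}},
\]
that is, a swap rate with inception date $T$ and maturity date $T_N = T + N\tau$. Here,
$L_{jT}$ stands for the forward Libor rate at $T$ fixing at $T + j\tau$ with tenor $\tau$.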
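The swap rate above can be evaluated with a single pass over the forward Libor rates. The following is a minimal sketch, not code from the original articles; the function name `swap_rate` and the flat-curve inputs are illustrative assumptions.

```python
from typing import Sequence


def swap_rate(forward_libors: Sequence[float], tau: float) -> float:
    """Swap rate at T from forward Libor rates L_0, ..., L_{N-1} with tenor tau.

    Implements (1 - prod_j D_j) / (tau * sum_i prod_{j<=i} D_j),
    where D_j = 1 / (1 + tau * L_j).
    """
    if not forward_libors:
        raise ValueError("need at least one forward rate")
    discount = 1.0   # running product prod_{j<=i} 1 / (1 + tau * L_j)
    annuity = 0.0    # sum over i of the running products
    for libor in forward_libors:
        discount /= 1.0 + tau * libor
        annuity += discount
    return (1.0 - discount) / (tau * annuity)


# Hypothetical flat curve: ten semi-annual forward rates of 3%.
flat = swap_rate([0.03] * 10, 0.5)
print(flat)
```

For a flat curve the formula collapses to the common forward rate, which gives a quick sanity check of an implementation.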
Let $A_{kt} = \frac{1}{k!} \frac{\partial^k X_t^{(\epsilon)}}{\partial \epsilon^k}\Big|_{\epsilon=0}$, and let $A_{kt}^{j}$, $j = 1, \ldots, d$, denote the $j$th elements of $A_{kt}$. In
particular, $A_{1t}$ is represented by
\[
A_{1t} = \int_0^t Y_t Y_u^{-1} \left( \partial_\epsilon V_0(X_u^{(0)}, 0)\,du + V(X_u^{(0)})\,dW_u \right), \tag{2}
\]
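where $\partial_\epsilon^{l} = \frac{\partial^{l}}{\partial \epsilon^{l}}$, $\partial^{\beta}_{\mathbf{d}_\beta} = \frac{\partial^{\beta}}{\partial x_{d_1} \cdots \partial x_{d_\beta}}$,
\[
\sum^{(l)}_{\mathbf{l}_\beta, \mathbf{d}_\beta} := \sum_{\beta=1}^{l} \ \sum_{\mathbf{l}_\beta \in L_{l,\beta}} \ \sum_{\mathbf{d}_\beta \in \{1,\ldots,d\}^{\beta}} \tag{4}
\]
for $l \geq 1$,
\[
L_{l,\beta} := \Big\{ \mathbf{l}_\beta = (l_1, \ldots, l_\beta);\ \sum_{j=1}^{\beta} l_j = l;\ (l, l_j, \beta \in \mathbf{N}) \Big\}, \tag{5}
\]
and for $l = 0$,
\[
\sum^{(0)}_{\mathbf{l}_\beta, \mathbf{d}_\beta} = \sum_{\beta=0} \ \sum_{\mathbf{l}_0 = (\emptyset)} \ \sum_{\mathbf{d}_0 = (\emptyset)}.
\]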
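\[
g_{0T} = g(X_T^{(0)}), \qquad
g_{1T} = \sum_{j=1}^{d} \partial_j g(X_T^{(0)}) A_{1T}^{j},
\]
\[
g_{nT} = \sum^{(n)}_{\mathbf{l}_\beta,\mathbf{d}_\beta} \frac{1}{\beta!}\, \partial^{\beta}_{\mathbf{d}_\beta} g(X_T^{(0)})\, A^{d_1}_{l_1 T} \cdots A^{d_\beta}_{l_\beta T}. \tag{6}
\]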
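To make the index set in (4)-(6) concrete, the sketch below enumerates the set $L_{l,\beta}$ of (5) and the pairs $(\mathbf{l}_\beta, \mathbf{d}_\beta)$ summed over in $g_{nT}$. The function names `compositions` and `index_set` are illustrative and do not come from the original articles.

```python
from itertools import product
from typing import Iterator, Tuple


def compositions(l: int, beta: int) -> Iterator[Tuple[int, ...]]:
    """Yield every (l_1, ..., l_beta) of positive integers with sum l, i.e. L_{l,beta}."""
    if beta == 1:
        if l >= 1:
            yield (l,)
        return
    # The first entry can be at most l - (beta - 1), so the rest can all be >= 1.
    for first in range(1, l - beta + 2):
        for rest in compositions(l - first, beta - 1):
            yield (first,) + rest


def index_set(n: int, d: int) -> Iterator[Tuple[Tuple[int, ...], Tuple[int, ...]]]:
    """Yield the pairs (l_beta, d_beta) of the triple sum (4) for order n >= 1."""
    for beta in range(1, n + 1):
        for ls in compositions(n, beta):
            for ds in product(range(1, d + 1), repeat=beta):
                yield ls, ds


for ls, ds in index_set(2, 1):
    print(ls, ds)
```

For $n = 2$ and $d = 1$ this lists the two terms of $g_{2T}$: $\mathbf{l} = (2)$ with $\beta = 1$, and $\mathbf{l} = (1, 1)$ with $\beta = 2$, the latter carrying the factor $1/2!$.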
where $\mu : \mathbf{R}^d \times \mathbf{R}_+ \to \mathbf{R}^d$ and $\sigma : \mathbf{R}^d \times \mathbf{R}_+ \to \mathbf{R}^d \otimes \mathbf{R}^r$.
Definition 1 A grading of $\mathbf{R}^d$ is a decomposition $\mathbf{R}^d = \mathbf{R}^{d_1} \times \cdots \times \mathbf{R}^{d_q}$ with
$d = d_1 + \cdots + d_q$. The coordinates of a point in $\mathbf{R}^d$ are always arranged in
increasing order along the subspaces $\mathbf{R}^{d_i}$, and we set $M_0 = 0$ and $M_l = d_1 + \cdots + d_l$
for $1 \le l \le q$. We say that the coefficients $\mu$ and $\sigma$ are graded according to the
grading $\mathbf{R}^d = \mathbf{R}^{d_1} \times \cdots \times \mathbf{R}^{d_q}$ if $\mu^i(x, t)$ and $\sigma^i_j(x, t)$, $j = 1, \ldots, r$, depend on $x$
only through the coordinates $(x^k)_{1 \le k \le M_p}$ when $M_{p-1} \le i \le M_p$.
Theorem 1 We assume that the coefficients $\mu$ and $\sigma$ in (7) are graded according to
$\mathbf{R}^d = \mathbf{R}^{d_1} \times \cdots \times \mathbf{R}^{d_q}$. Moreover, for $F(x, t) = \mu(x, t)$ or $\sigma_j(x, t)$, $j = 1, \ldots, r$,
we assume that $F$ is differentiable in $x$ on $\mathbf{R}^d$ and
1. $|F^i(0, t)| \le Z_t$ for $i = 1, \ldots, d$
2. $|\frac{\partial}{\partial x_j} F^i(x, t)| \le \hat{Z}_t (1 + |x|^{\theta})$ for all $i, j$
3. $|\frac{\partial}{\partial x_j} F^i(x, t)| \le \zeta$ if $M_{p-1} \le i, j \le M_p$ for some $p \le q$,
For the details of the definition and theorem above, see pp. 45–47 in Bichteler,
Gravereaux and Jacod [5].
Applying Theorem 1 to the system of stochastic differential equations consisting
of $A^{i}_{lt}$ ($i = 1, \ldots, d$, $l = 1, \ldots, k$, $0 \le t \le T$) as well as any products of them, we
obtain the following lemma.
\[
G^{(\epsilon)} = \frac{g(X_T^{(\epsilon)}) - g_{0T}}{\epsilon}
\]
for $\epsilon \in (0, 1]$. Then, we have
in $\mathbf{D}^{\infty}$.
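Next, for $h \in H$, where $H$ denotes the Cameron-Martin subspace of the
$r$-dimensional Wiener space, the $H$-derivative of $G^{(\epsilon)}$ is expressed as
\[
D_h G^{(\epsilon)} = \frac{1}{\epsilon} \sum_{i=1}^{d} \partial_i g(X_T^{(\epsilon)})\, D_h X_T^{(\epsilon),i}
= \sum_{i=1}^{d} \partial_i g(X_T^{(\epsilon)}) \int_0^T \big[ Y_T^{(\epsilon)} (Y_t^{(\epsilon)})^{-1} V(X_t^{(\epsilon)}) \dot{h}_t \big]^{i}\, dt,
\]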
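where $Y^{(\epsilon)}$ is the $\mathbf{R}^d \otimes \mathbf{R}^d$-valued stochastic process which is the solution to the
stochastic differential equation:
where $\partial g(X_T^{(\epsilon)}) = (\partial_1 g(X_T^{(\epsilon)}), \ldots, \partial_d g(X_T^{(\epsilon)}))$, the Malliavin (co)variance of
$G^{(\epsilon)}$ is given by
\[
\sigma_{G^{(\epsilon)}} = \int_0^T \hat{V}_t^{(\epsilon)} (\hat{V}_t^{(\epsilon)})'\, dt. \tag{8}
\]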
Moreover, let
\[
\hat{V}_t := \hat{V}_t^{(0)} = \partial g(X_T^{(0)})\, Y_T Y_t^{-1} V(X_t^{(0)})
\]
Note that $g_{1T}$ follows a normal distribution with variance $\Sigma_T$, and the density function
of $g_{1T}$, denoted by $f_{g_{1T}}(x)$, is given as
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right)
\]
where
\[
C := \partial g(X_T^{(0)}) \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\, dt. \tag{9}
\]
Since $\Sigma_T$ is the variance of the random variable $g_{1T}$, which follows a normal
distribution, (Assumption 1) means that the distribution of $g_{1T}$ does not
degenerate. In applications, this condition is easy to check in most cases, which
makes it useful for practical purposes.
Next, let us briefly introduce a truncated version of the Watanabe theory [111]
based on Yoshida [118, 119]. Under (Assumption 1), $\sigma_{G^{(\epsilon)}}$ is uniformly
non-degenerate for $\{|\eta_c^{(\epsilon)}| \le 1\}$; that is, it can be shown that there exists a positive
real number $c_0 > 0$ such that for any $c > c_0$ and $p > 1$,
where $\eta_c^{(\epsilon)} = c \int_0^T |\hat{V}_t^{(\epsilon)} - \hat{V}_t|\, dt$.
Let $\mathcal{S}$ be the real Schwartz space of rapidly decreasing $C^{\infty}$-functions on $\mathbf{R}$ and $\mathcal{S}'$
be its dual space. Then, for $\Phi : \mathbf{R} \to \mathbf{R}$, $\Phi \in \mathcal{S}'$, a composite function
$\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)} = \psi(\eta_c^{(\epsilon)}) \Phi(G^{(\epsilon)})$ is well-defined as an element of $\tilde{\mathbf{D}}^{-\infty} = \cup_{s<0} \cap_{1<p<\infty} \mathbf{D}_{p,s}$.
Here, $\psi(x)$, $x \in \mathbf{R}$, denotes a smooth function with $0 \le \psi(x) \le 1$, defined as $\psi(x) = 1$
for $|x| \le 1/2$ and $\psi(x) = 0$ for $|x| \ge 1$. The Banach space $\mathbf{D}_{p,s}$, $s < 0$, is the
dual space of $\mathbf{D}_{q,-s}(\mathbf{R})$ ($q = p/(p - 1)$).
This means that the probability of the events truncated by $\psi(\eta_c^{(\epsilon)})$ is smaller than any
polynomial order of $\epsilon$. Then, in the expansion of $\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)}$, the coefficients
expressed as generalized Wiener functionals belonging to $\tilde{\mathbf{D}}^{-\infty}$ can be written by
applying Taylor's formula to $\Phi(g_{0T} + \epsilon g_{1T} + \epsilon^2 g_{2T} + \cdots)$. Therefore, the asymptotic
expansion of the expectation $E[\Phi(G^{(\epsilon)})]$ can be obtained relatively easily. For the
details of Watanabe theory and its truncated version above, please consult Watanabe
[111] and Yoshida [118, 119]. For its application to valuation problems in finance,
please also see [50].
In particular, if we take the delta function at $y \in \mathbf{R}$, $\delta_y$, as $\Phi$, that is, $\Phi(x) = \delta_y(x)$,
we obtain an asymptotic expansion of the density function of $G^{(\epsilon)}$. Moreover, because
functions such as $\Phi(x) = \max\{x, 0\}$, which are measurable but not smooth, frequently
appear in finance, the framework mentioned above is necessary for the asymptotic
expansion.
For instance, when we take $\max\{x, 0\}$, $\min\{x, 0\}$ or $\delta_y(x)$ as $\Phi(x)$ for a useful
application in finance, the expectation of $\Phi(G^{(\epsilon)})$ is expanded as follows: for $N =
0, 1, 2, \ldots$,
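\[
\begin{aligned}
E[\Phi(G^{(\epsilon)})]
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!}\, E\Big[ \Phi^{(m)}(g_{1T}) \Big( \prod_{j=1}^{m} g_{(k_j+1)T} \Big) \Big] + o(\epsilon^{N}) \\
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!}\, E\big[ \Phi^{(m)}(g_{1T})\, X^{\mathbf{k}_m} \big] + o(\epsilon^{N}) \\
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!} \int_{-\infty}^{\infty} \Phi^{(m)}(x)\, E[X^{\mathbf{k}_m} | g_{1T} = x]\, f_{g_{1T}}(x)\, dx + o(\epsilon^{N}) \\
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!} \int_{-\infty}^{\infty} \Phi(x) (-1)^{m} \frac{d^{m}}{dx^{m}} \big\{ E[X^{\mathbf{k}_m} | g_{1T} = x]\, f_{g_{1T}}(x) \big\}\, dx + o(\epsilon^{N})
\end{aligned}
\tag{11}
\]
where $\Phi^{(m)}(g_{1T}) = \frac{d^{m} \Phi(x)}{dx^{m}}\Big|_{x = g_{1T}}$, $\sum^{(n)}_{\mathbf{k}_m} = \sum_{m=1}^{n} \sum_{\mathbf{k}_m \in L_{n,m}}$, and
\[
X^{\mathbf{k}_m} := \prod_{j=1}^{m} g_{(k_j+1)T}. \tag{12}
\]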
In order to compute the asymptotic expansion (11), we need to evaluate
conditional expectations of the form
\[
E\big[ \tilde{X}^{\mathbf{k}_m} \,\big|\, g_{1T} = x \big],
\]
where $\tilde{X}^{\mathbf{k}_m}$ is represented by a product of multiple Wiener-Itô integrals.
In the preceding works on applications of the asymptotic expansion, the conditional
expectations in (11) were computed directly with formulas, including
multi-dimensional ones, given for example in [85, 86]. While those works give the
formulas only up to the third order, [95] has more recently developed a high-order
computation scheme for the conditional expectations by using the fact that each of
$\{A^{j}_{k,t}\}_{j,k}$, $\{g_{nT}\}_n$ and also $\{X^{\mathbf{k}_m}\}_{\mathbf{k}_m}$ can be decomposed into a finite sum of iterated multiple
Wiener-Itô integrals by applying Itô's formula together with certain properties of
iterated multiple Wiener-Itô integrals. (Please see Sect. 4 of [95] for the details.)
On the other hand, as shown in the next section, we can develop an alternative
method which does not evaluate the conditional expectations directly.
3 Computational Scheme
This section follows [96] to introduce a computational scheme for the asymptotic
expansion, which is an alternative to the direct calculation method for the conditional
expectations given in [95].
3.1 Preparation
To compute the conditional expectations on the right hand side of (11), we use the
following lemma which can be derived from a property of Hermite polynomials and
leads us to compute the unconditional expectations instead of the conditional ones.
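\[
E[X | Z = x] = \sum_{n=0}^{\infty} \frac{a_n}{\Sigma^{n}} H_n(x; \Sigma) \tag{13}
\]
\[
H_n(x; \Sigma) = (-\Sigma)^{n} e^{x^{2}/2\Sigma} \frac{d^{n}}{dx^{n}} e^{-x^{2}/2\Sigma}
\]
and the coefficients $a_n$ are given by
\[
a_n = \frac{1}{n!} \frac{1}{i^{n}} \frac{\partial^{n}}{\partial \xi^{n}} \Big\{ e^{\frac{\xi^{2}}{2}\Sigma} E[e^{i\xi Z} X] \Big\} \Big|_{\xi=0}, \quad (i = \sqrt{-1}). \tag{14}
\]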
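(Proof) Since the system of Hermite polynomials $\{H_n(x; \Sigma)\}$ is an orthogonal
basis of $L^{2}(\mathbf{R}, \mu)$, and $E[X | Z = x] \in L^{2}(\mathbf{R}, \mu)$, we have the following unique
expansion of $E[X | Z = x]$ in $L^{2}(\mathbf{R}, \mu)$:
\[
E[X | Z = x] = \sum_{n=0}^{\infty} \frac{a_n}{\Sigma^{n}} H_n(x; \Sigma).
\]
then,
\[
\begin{aligned}
e^{\frac{\xi^{2}}{2}\Sigma} E[e^{i\xi Z} X]
&= e^{\frac{\xi^{2}}{2}\Sigma} \int_{\mathbf{R}} e^{i\xi x} E[X | Z = x]\, \mu(dx) \\
&= \int_{\mathbf{R}} \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} (i\xi)^{m} a_n \frac{H_m(x; \Sigma)}{m!} \frac{H_n(x; \Sigma)}{\Sigma^{n}}\, \mu(dx) \\
&= \sum_{n=0}^{\infty} a_n (i\xi)^{n}.
\end{aligned}
\]
Comparing with the coefficients of the Taylor series of $e^{\frac{\xi^{2}}{2}\Sigma} E[e^{i\xi Z} X]$ around 0 with
respect to $\xi$, we see that $a_n$ can be written as (14).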
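As a small numerical illustration of the Hermite expansion (13), the sketch below takes the hypothetical choice $X = Z^2$ with $Z \sim N(0, \Sigma)$, so that $E[X | Z = x] = x^2 = H_2(x;\Sigma) + \Sigma H_0(x;\Sigma)$. The coefficients are computed from the orthogonality relation $\int H_m H_n\, d\mu = n!\,\Sigma^n \delta_{mn}$, which is equivalent to (14) but is not the formula itself. The helper names and the trapezoid quadrature are assumptions made for illustration.

```python
import math


def hermite(n: int, x: float, sigma: float) -> float:
    """H_n(x; sigma) via H_{k+1} = x H_k - k sigma H_{k-1}, H_0 = 1, H_1 = x."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * sigma * h_prev
    return h


def coefficient(n, cond_exp, sigma, half_width=12.0, steps=20000):
    """a_n = (1/n!) * integral of cond_exp(x) H_n(x; sigma) against N(0, sigma)."""
    sd = math.sqrt(sigma)
    lo, hi = -half_width * sd, half_width * sd
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        density = math.exp(-x * x / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma)
        total += weight * cond_exp(x) * hermite(n, x, sigma) * density
    return total * dx / math.factorial(n)


sigma = 0.7
a = [coefficient(n, lambda x: x * x, sigma) for n in range(4)]
# Rebuild E[X | Z = x] at one point from the truncated series (13).
x0 = 1.3
rebuilt = sum(a[n] / sigma ** n * hermite(n, x0, sigma) for n in range(4))
print(a, rebuilt)
```

In this example only $a_0 = \Sigma$ and $a_2 = \Sigma^2$ are non-zero, so the series terminates, in the same way that the sum in (15) terminates for polynomial functionals.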
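Next, we write $\hat{V}_t = (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)})$ as $\hat{V}(X_t^{(0)})$. Then, we define
$\hat{g}_1 = \{\hat{g}_{1t};\ t \in \mathbf{R}_+\}$ and $Z^{\xi} = \{Z_t^{\xi};\ t \in \mathbf{R}_+\}$ as the stochastic processes
\[
\hat{g}_{1t} = \int_0^t \hat{V}(X_u^{(0)})\, dW_u
\]
and
\[
Z_t^{\xi} = \exp\Big( i\xi \hat{g}_{1t} + \frac{\xi^{2}}{2} \Sigma_t \Big),
\]
respectively, where $\Sigma_t := \int_0^t \hat{V}(X_u^{(0)}) \hat{V}(X_u^{(0)})'\, du$.
Then, from Lemma 2, the conditional expectations appearing on the right-hand
side of Eq. (11) are expressed as
\[
E[X^{\mathbf{k}_m} | g_{1T} = x] = E[X^{\mathbf{k}_m} | \hat{g}_{1T} = x - C]
= \sum_{l=0}^{\infty} \frac{a_l^{\mathbf{k}_m}}{\Sigma_T^{l}} H_l(x - C; \Sigma_T) \tag{15}
\]
where
\[
a_l^{\mathbf{k}_m} = \frac{1}{l!} \frac{1}{i^{l}} \frac{\partial^{l}}{\partial \xi^{l}} E[X^{\mathbf{k}_m} Z_T^{\xi}] \Big|_{\xi=0}. \tag{16}
\]
Note that with this expression we now need to compute the unconditional
expectations $E[X^{\mathbf{k}_m} Z_T^{\xi}]$ instead of the conditional expectations.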
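\[
\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(t; \xi) = E\Big[ \Big( \prod_{j=1}^{\beta} A^{d_j}_{l_j t} \Big) Z_t^{\xi} \Big], \tag{17}
\]
and for $n = 0$ as
\[
\eta^{(\emptyset)}_{(\emptyset)}(t; \xi) = E\big[ Z_t^{\xi} \big]. \tag{18}
\]
Then, by using (6) we write the unconditional expectations $E[X^{\mathbf{k}_m} Z_T^{\xi}]$ in (16)
in terms of $\eta$ as follows:
\[
\begin{aligned}
E[X^{\mathbf{k}_m} Z_T^{\xi}]
&= E\Big[ \Big( \prod_{j=1}^{m} g_{(k_j+1)T} \Big) Z_T^{\xi} \Big] \\
&= E\Big[ \Big( \prod_{j=1}^{m} \Big\{ \sum^{(k_j+1)}_{\mathbf{l}^{j}_{\beta_j}, \mathbf{d}^{j}_{\beta_j}} \frac{1}{\beta_j!}\, \partial^{\beta_j}_{\mathbf{d}^{j}_{\beta_j}} g(X_T^{(0)})\, A^{d^{j}_1}_{l^{j}_1 T} \cdots A^{d^{j}_{\beta_j}}_{l^{j}_{\beta_j} T} \Big\} \Big) Z_T^{\xi} \Big] \\
&= \sum^{(k_1+1)}_{\mathbf{l}^{1}_{\beta_1}, \mathbf{d}^{1}_{\beta_1}} \cdots \sum^{(k_m+1)}_{\mathbf{l}^{m}_{\beta_m}, \mathbf{d}^{m}_{\beta_m}}
\Big( \prod_{j=1}^{m} \frac{1}{\beta_j!}\, \partial^{\beta_j}_{\mathbf{d}^{j}_{\beta_j}} g(X_T^{(0)}) \Big)\,
\eta^{\mathbf{d}^{1}_{\beta_1} \otimes \cdots \otimes \mathbf{d}^{m}_{\beta_m}}_{\mathbf{l}^{1}_{\beta_1} \otimes \cdots \otimes \mathbf{l}^{m}_{\beta_m}}(T; \xi)
\end{aligned}
\tag{19}
\]
where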
So, we have to calculate $\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(T; \xi)$ to evaluate the asymptotic expansion (11).
In the following, we derive a system of ODEs satisfied by these $\{\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}\}$. Before
showing a general result, we first derive the ODEs for a few low-order terms
explicitly to give a better intuition for the key idea of our method. In particular, let us
consider the evaluation of $\eta^{j}_{(2)}(T; \xi) = E[A^{j}_{2T} Z_T^{\xi}]$, which appears in the $\epsilon$-order.
Here, for simplicity, we assume that $V_0$ does not depend on $\epsilon$, and write $V_0(x, \epsilon)$ as
$V_0(x)$. In this case, we first note that the SDEs of $A^{j}_{1t}$ and $A^{j}_{2t}$ ($j = 1, \ldots, d$) are
given as follows:
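\[
dA^{j}_{1t} = \sum_{j'=1}^{d} A^{j'}_{1t}\, \partial_{j'} V_0^{j}(X_t^{(0)})\, dt + V^{j}(X_t^{(0)})\, dW_t \tag{20}
\]
\[
dA^{j}_{2t} = \Big[ \sum_{j'=1}^{d} A^{j'}_{2t}\, \partial_{j'} V_0^{j}(X_t^{(0)}) + \frac{1}{2} \sum_{j',k'=1}^{d} A^{j'}_{1t} A^{k'}_{1t}\, \partial_{j'} \partial_{k'} V_0^{j}(X_t^{(0)}) \Big] dt
+ \sum_{j'=1}^{d} A^{j'}_{1t}\, \partial_{j'} V^{j}(X_t^{(0)})\, dW_t. \tag{21}
\]
Also, the SDE of $Z_t^{\xi}$ is expressed as:
\[
dZ_t^{\xi} = (i\xi) \hat{V}(X_t^{(0)}) Z_t^{\xi}\, dW_t. \tag{22}
\]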
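Then, applying Itô's formula to $A^{j}_{2t} Z_t^{\xi}$, we have
\[
\begin{aligned}
d(A^{j}_{2t} Z_t^{\xi})
&= \Big\{ (i\xi) \sum_{j'=1}^{d} A^{j'}_{1t} Z_t^{\xi}\, \hat{V}(X_t^{(0)})\, \partial_{j'} V^{j}(X_t^{(0)})
+ \sum_{j'=1}^{d} A^{j'}_{2t} Z_t^{\xi}\, \partial_{j'} V_0^{j}(X_t^{(0)}) \\
&\qquad + \frac{1}{2} \sum_{j',k'=1}^{d} A^{j'}_{1t} A^{k'}_{1t} Z_t^{\xi}\, \partial_{j'} \partial_{k'} V_0^{j}(X_t^{(0)}) \Big\}\, dt \\
&\quad + \Big\{ (i\xi) A^{j}_{2t} Z_t^{\xi}\, \hat{V}(X_t^{(0)}) + \sum_{j'=1}^{d} A^{j'}_{1t} Z_t^{\xi}\, \partial_{j'} V^{j}(X_t^{(0)}) \Big\}\, dW_t.
\end{aligned}
\]
Since the last term is a martingale, taking expectations on both sides, we have the
following ordinary differential equation for $\eta^{j}_{(2)}$:
\[
\begin{aligned}
\frac{d}{dt} \eta^{j}_{(2)}(t; \xi)
&= (i\xi) \sum_{j'=1}^{d} \eta^{j'}_{(1)}(t; \xi)\, \hat{V}(X_t^{(0)})\, \partial_{j'} V^{j}(X_t^{(0)}) \\
&\quad + \sum_{j'=1}^{d} \eta^{j'}_{(2)}(t; \xi)\, \partial_{j'} V_0^{j}(X_t^{(0)})
+ \frac{1}{2} \sum_{j',k'=1}^{d} \eta^{j',k'}_{(1,1)}(t; \xi)\, \partial_{j'} \partial_{k'} V_0^{j}(X_t^{(0)}).
\end{aligned}
\]
Here, $\eta^{j}_{(1)}$ ($j = 1, \ldots, d$) appearing on the right-hand side of the above ODE are
evaluated in a similar manner:
\[
\begin{aligned}
d(A^{j}_{1t} Z_t^{\xi})
&= A^{j}_{1t}\, dZ_t^{\xi} + Z_t^{\xi}\, dA^{j}_{1t} + d\langle A^{j}_{1}, Z^{\xi} \rangle_t \\
&= \Big\{ (i\xi) Z_t^{\xi}\, \hat{V}(X_t^{(0)}) V^{j}(X_t^{(0)}) + \sum_{j'=1}^{d} A^{j'}_{1t} Z_t^{\xi}\, \partial_{j'} V_0^{j}(X_t^{(0)}) \Big\}\, dt \\
&\quad + \Big\{ (i\xi) A^{j}_{1t} Z_t^{\xi}\, \hat{V}(X_t^{(0)}) + Z_t^{\xi} V^{j}(X_t^{(0)}) \Big\}\, dW_t,
\end{aligned}
\]
hence, we have
\[
\frac{d}{dt} \eta^{j}_{(1)}(t; \xi) = (i\xi) \hat{V}(X_t^{(0)}) V^{j}(X_t^{(0)}) + \sum_{j'=1}^{d} \eta^{j'}_{(1)}(t; \xi)\, \partial_{j'} V_0^{j}(X_t^{(0)}).
\]
$\eta^{j,k}_{(1,1)}$ and other higher-order terms can be evaluated in the same way. The key
observation is that each ODE does not involve any higher-order terms, and only lower- or
same-order terms appear on the right-hand side of the ODE. So, one can easily
solve (analytically or numerically) the system of ODEs and evaluate the expectations.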
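The first-order ODE for $\eta^{j}_{(1)}$ can be solved numerically in a few lines. The sketch below uses a hypothetical one-dimensional model that is not taken from the source: $V_0(x) = a x$, $V(x) = \sigma x$ and $g(x) = x$, so that $X_t^{(0)} = x_0 e^{at}$, $Y_t = e^{at}$ and $\hat{V}(X_t^{(0)}) = \sigma x_0 e^{aT}$. Under these assumptions the exact solution is $\eta_{(1)}(t;\xi) = i\xi\, \sigma^2 x_0^2 e^{aT} t\, e^{at}$, which equals $i\xi \Sigma_T$ at $t = T$. The integrator is a plain fourth-order Runge-Kutta scheme chosen for illustration.

```python
import math


def eta1_terminal(xi: float, x0: float, a: float, sigma: float,
                  T: float, steps: int = 2000) -> complex:
    """Solve d/dt eta = (i xi) Vhat(t) V(X0_t) + eta * dV0/dx, eta(0) = 0, by RK4.

    Toy model (an assumption): V0(x) = a x, V(x) = sigma x, g(x) = x.
    """
    def x_det(t: float) -> float:
        return x0 * math.exp(a * t)                 # deterministic path X^(0)_t

    def v_hat(t: float) -> float:
        # Y_T Y_t^{-1} V(X^(0)_t) with Y_t = exp(a t)
        return math.exp(a * (T - t)) * sigma * x_det(t)

    def rhs(t: float, eta: complex) -> complex:
        return 1j * xi * v_hat(t) * sigma * x_det(t) + a * eta

    h = T / steps
    eta = 0j
    for k in range(steps):
        t = k * h
        k1 = rhs(t, eta)
        k2 = rhs(t + 0.5 * h, eta + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, eta + 0.5 * h * k2)
        k4 = rhs(t + h, eta + h * k3)
        eta += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return eta


xi, x0, a, sigma, T = 0.3, 1.0, 0.05, 0.2, 1.0
sigma_T = sigma ** 2 * x0 ** 2 * math.exp(2 * a * T) * T   # integral of Vhat^2 on [0, T]
numeric = eta1_terminal(xi, x0, a, sigma, T)
print(numeric, 1j * xi * sigma_T)
```

The result is linear in $(i\xi)$, so differentiating once in $\xi$ at $\xi = 0$, as in (16), recovers $\Sigma_T$ in this toy case.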
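The following proposition provides a way to calculate general $\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(T; \xi)$ as a
solution to the system of the ordinary differential equations:
Proposition 1 For $\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(t; \xi)$ defined in (17), the following system of ordinary
differential equations is satisfied:
\[
\begin{aligned}
\frac{d}{dt} \eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(t; \xi)
&= \sum_{k=1}^{\beta} \frac{1}{l_k!}\, \eta^{\mathbf{d}_{\beta/k}}_{\mathbf{l}_{\beta/k}}(t; \xi)\, \partial_\epsilon^{l_k} V_0^{d_k}(X_t^{(0)}, 0) \\
&\quad + \sum_{k=1}^{\beta} \sum_{l=1}^{l_k} \sum^{(l)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \frac{1}{(l_k - l)!} \frac{1}{\gamma!}\,
\eta^{(\mathbf{d}_{\beta/k}) \otimes \tilde{\mathbf{d}}_\gamma}_{(\mathbf{l}_{\beta/k}) \otimes \mathbf{m}_\gamma}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} \partial_\epsilon^{l_k - l} V_0^{d_k}(X_t^{(0)}, 0) \\
&\quad + \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \sum^{(l_k - 1)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \sum^{(l_m - 1)}_{\mathbf{m}_\delta, \hat{\mathbf{d}}_\delta} \frac{1}{\gamma!\,\delta!}\,
\eta^{(\mathbf{d}_{\beta/k,m}) \otimes \tilde{\mathbf{d}}_\gamma \otimes \hat{\mathbf{d}}_\delta}_{(\mathbf{l}_{\beta/k,m}) \otimes \mathbf{m}_\gamma \otimes \mathbf{m}_\delta}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} V^{d_k}(X_t^{(0)})\, \partial^{\delta}_{\hat{\mathbf{d}}_\delta} V^{d_m}(X_t^{(0)}) \\
&\quad + (i\xi) \sum_{k=1}^{\beta} \sum^{(l_k - 1)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \frac{1}{\gamma!}\,
\eta^{(\mathbf{d}_{\beta/k}) \otimes \tilde{\mathbf{d}}_\gamma}_{(\mathbf{l}_{\beta/k}) \otimes \mathbf{m}_\gamma}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} V^{d_k}(X_t^{(0)})\, \hat{V}(X_t^{(0)})
\end{aligned}
\tag{23}
\]
where $\sum^{(l)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma}$ is defined in (4), and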
Remark 2 Due to $\eta^{(\emptyset)}_{(\emptyset)}(t; \xi) = E[Z_t^{\xi}] = 1$ and the hierarchical structure of the
ODEs with respect to $n = \sum_{j=1}^{\beta} l_j$, one can easily solve these ODEs successively
from lower-order terms to higher-order terms with initial conditions $\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(0; \xi) = 0$
for $(\mathbf{l}_\beta, \mathbf{d}_\beta) \neq (\emptyset, \emptyset)$.
Remark 3 Further, due to the structure of the system of differential equations,
it is easily shown by induction that each $\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(t; \xi)$ is expressed as a polynomial of
degree $n = \sum_{j=1}^{\beta} l_j$ with respect to $(i\xi)$. Then, we can also show that $E[X^{\mathbf{k}_m} Z_T^{\xi}]$
is a polynomial of degree $(n + m)$ with respect to $(i\xi)$, and thus $a_l^{\mathbf{k}_m} = 0$ $(l > n + m)$
for $\mathbf{k}_m \in L_{n,m}$. This ensures the convergence of the infinite sum in (15).
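Then, from Lemma 2 and (11), we have the following expression of $E[\Phi(G^{(\epsilon)})]$:
\[
\begin{aligned}
E[\Phi(G^{(\epsilon)})]
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!} \int_{\mathbf{R}} \Phi(x) (-1)^{m} \frac{d^{m}}{dx^{m}}
\Big\{ \sum_{l=0}^{n+m} \frac{a_l^{\mathbf{k}_m}}{\Sigma_T^{l}} H_l(x - C; \Sigma_T)\, f_{g_{1T}}(x) \Big\}\, dx + o(\epsilon^{N}) \\
&= \sum_{n=0}^{N} \epsilon^{n} \sum^{(n)}_{\mathbf{k}_m} \frac{1}{m!} \int_{\mathbf{R}} \Phi(x)
\Big\{ \sum_{l=0}^{n+m} \frac{a_l^{\mathbf{k}_m}}{\Sigma_T^{l+m}} H_{l+m}(x - C; \Sigma_T)\, f_{g_{1T}}(x) \Big\}\, dx + o(\epsilon^{N})
\end{aligned}
\]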
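Theorem 2 Let $X^{(\epsilon)}$ be the solution to the stochastic differential equation (1).
Suppose a function $g : \mathbf{R}^d \to \mathbf{R}$ is smooth and all of its derivatives have
polynomial growth. Then, the asymptotic expansion of the density function of
$G^{(\epsilon)} = \frac{g(X_T^{(\epsilon)}) - g(X_T^{(0)})}{\epsilon}$ up to $N$-order is given by
where
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right) \tag{27}
\]
with
\[
C = \partial g(X_T^{(0)}) \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\, dt,
\]
\[
\Sigma_T = \int_0^T \hat{V}(X_t^{(0)}) \hat{V}(X_t^{(0)})'\, dt > 0,
\]
\[
\hat{V}(X_t^{(0)}) = (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)}).
\]
\[
H_n(x; \Sigma) = (-\Sigma)^{n} e^{x^{2}/2\Sigma} \frac{d^{n}}{dx^{n}} e^{-x^{2}/2\Sigma}, \tag{28}
\]
and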
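\[
\begin{aligned}
C_{nm} &= \frac{1}{\Sigma_T^{m}} \sum^{(m)}_{\mathbf{k}_\delta}\ \sum^{(k_1+1)}_{\mathbf{l}^{1}_{\beta_1}, \mathbf{d}^{1}_{\beta_1}} \cdots \sum^{(k_\delta+1)}_{\mathbf{l}^{\delta}_{\beta_\delta}, \mathbf{d}^{\delta}_{\beta_\delta}} \frac{1}{\delta!\,(m - \delta)!}
\Big( \prod_{j=1}^{\delta} \frac{1}{\beta_j!}\, \partial^{\beta_j}_{\mathbf{d}^{j}_{\beta_j}} g(X_T^{(0)}) \Big) \\
&\quad \times \frac{1}{i^{m-\delta}} \frac{\partial^{m-\delta}}{\partial \xi^{m-\delta}}\,
\eta^{\mathbf{d}^{1}_{\beta_1} \otimes \cdots \otimes \mathbf{d}^{\delta}_{\beta_\delta}}_{\mathbf{l}^{1}_{\beta_1} \otimes \cdots \otimes \mathbf{l}^{\delta}_{\beta_\delta}}(T; \xi) \Big|_{\xi=0}, \quad (i = \sqrt{-1}).
\end{aligned}
\tag{29}
\]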
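$\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(T; \xi)$ are obtained as a solution to the following system of ODEs:
\[
\begin{aligned}
\frac{d}{dt} \eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(t; \xi)
&= \sum_{k=1}^{\beta} \frac{1}{l_k!}\, \eta^{\mathbf{d}_{\beta/k}}_{\mathbf{l}_{\beta/k}}(t; \xi)\, \partial_\epsilon^{l_k} V_0^{d_k}(X_t^{(0)}, 0) \\
&\quad + \sum_{k=1}^{\beta} \sum_{l=1}^{l_k} \sum^{(l)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \frac{1}{(l_k - l)!} \frac{1}{\gamma!}\,
\eta^{(\mathbf{d}_{\beta/k}) \otimes \tilde{\mathbf{d}}_\gamma}_{(\mathbf{l}_{\beta/k}) \otimes \mathbf{m}_\gamma}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} \partial_\epsilon^{l_k - l} V_0^{d_k}(X_t^{(0)}, 0) \\
&\quad + \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \sum^{(l_k - 1)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \sum^{(l_m - 1)}_{\mathbf{m}_\delta, \hat{\mathbf{d}}_\delta} \frac{1}{\gamma!\,\delta!}\,
\eta^{(\mathbf{d}_{\beta/k,m}) \otimes \tilde{\mathbf{d}}_\gamma \otimes \hat{\mathbf{d}}_\delta}_{(\mathbf{l}_{\beta/k,m}) \otimes \mathbf{m}_\gamma \otimes \mathbf{m}_\delta}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} V^{d_k}(X_t^{(0)})\, \partial^{\delta}_{\hat{\mathbf{d}}_\delta} V^{d_m}(X_t^{(0)}) \\
&\quad + (i\xi) \sum_{k=1}^{\beta} \sum^{(l_k - 1)}_{\mathbf{m}_\gamma, \tilde{\mathbf{d}}_\gamma} \frac{1}{\gamma!}\,
\eta^{(\mathbf{d}_{\beta/k}) \otimes \tilde{\mathbf{d}}_\gamma}_{(\mathbf{l}_{\beta/k}) \otimes \mathbf{m}_\gamma}(t; \xi)\,
\partial^{\gamma}_{\tilde{\mathbf{d}}_\gamma} V^{d_k}(X_t^{(0)})\, \hat{V}(X_t^{(0)}), \\
\eta^{\mathbf{d}_\beta}_{\mathbf{l}_\beta}(0; \xi) &= 0 \ \text{ for } (\mathbf{l}_\beta, \mathbf{d}_\beta) \neq (\emptyset, \emptyset), \qquad
\eta^{(\emptyset)}_{(\emptyset)}(t; \xi) = 1 \ \text{ for } (\mathbf{l}_\beta, \mathbf{d}_\beta) = (\emptyset, \emptyset).
\end{aligned}
\tag{30}
\]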
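$H_0(x; \Sigma) = 1$,
$H_1(x; \Sigma) = x$,
$H_2(x; \Sigma) = x^2 - \Sigma$,
$H_3(x; \Sigma) = x^3 - 3\Sigma x$,
$H_4(x; \Sigma) = x^4 - 6\Sigma x^2 + 3\Sigma^2$,
$H_5(x; \Sigma) = x^5 - 10\Sigma x^3 + 15\Sigma^2 x$,
$H_6(x; \Sigma) = x^6 - 15\Sigma x^4 + 45\Sigma^2 x^2 - 15\Sigma^3$.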
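The polynomials listed above can be regenerated from the three-term recurrence $H_{n+1}(x;\Sigma) = x H_n(x;\Sigma) - n\Sigma H_{n-1}(x;\Sigma)$. The sketch below stores each polynomial as a dictionary from (power of $x$, power of $\Sigma$) to an integer coefficient. This representation and the name `hermite_table` are illustrative choices, not part of the original scheme.

```python
from typing import Dict, List, Tuple

Poly = Dict[Tuple[int, int], int]   # (power of x, power of Sigma) -> coefficient


def hermite_table(n_max: int) -> List[Poly]:
    """Return [H_0, ..., H_{n_max}] built from H_{n+1} = x H_n - n Sigma H_{n-1}."""
    table: List[Poly] = [{(0, 0): 1}]
    if n_max >= 1:
        table.append({(1, 0): 1})
    for n in range(1, n_max):
        nxt: Poly = {}
        for (px, ps), c in table[n].items():        # contribution of x * H_n
            nxt[(px + 1, ps)] = nxt.get((px + 1, ps), 0) + c
        for (px, ps), c in table[n - 1].items():    # contribution of -n * Sigma * H_{n-1}
            nxt[(px, ps + 1)] = nxt.get((px, ps + 1), 0) - n * c
        table.append({key: val for key, val in nxt.items() if val != 0})
    return table


table = hermite_table(6)
for n, poly in enumerate(table):
    print(n, sorted(poly.items(), reverse=True))
```

Reading off `table[6]` gives the coefficients $1, -15, 45, -15$ of $x^6$, $\Sigma x^4$, $\Sigma^2 x^2$ and $\Sigma^3$.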
We can also apply the conditional expectation formulas for the multi-dimensional
case in Lemma 1.1 of [85] and Lemma 2.1 of [86] to derive an asymptotic expansion
up to the third order of the multi-dimensional density functions. This is particularly
useful for pricing exotic-type options such as barrier options with discrete monitoring
(e.g. [83]), and pricing Bermudan-type or approximate American-type derivatives
(e.g. Nishiba [71]).
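where $\mathbf{n} = (n_1, n_2, \ldots, n_d)$, $|\mathbf{n}| = n_1 + n_2 + \cdots + n_d$, $\mathbf{n}! = n_1! n_2! \cdots n_d!$ and
\[
a_{\mathbf{n}} = \frac{1}{\mathbf{n}!} \frac{1}{i^{|\mathbf{n}|}} \Big( \frac{\partial}{\partial \xi} \Big)^{\mathbf{n}} \Big\{ e^{\frac{1}{2} \xi' \Sigma \xi} E\big[ e^{i \xi' Z} X \big] \Big\} \Big|_{\xi=0}, \quad (i = \sqrt{-1}). \tag{32}
\]
where
\[
n[x : \Sigma] = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\Big( -\frac{1}{2} x' \Sigma^{-1} x \Big). \tag{34}
\]
$\{H_{\mathbf{n}}(x : \Sigma) : \mathbf{n} = (n_1, n_2, \ldots, n_d);\ n_i = 0, 1, 2, \ldots, (i = 1, 2, \ldots, d)\}$,
$\{\tilde{H}_{\mathbf{n}}(x : \Sigma) : \mathbf{n} = (n_1, n_2, \ldots, n_d);\ n_i = 0, 1, 2, \ldots, (i = 1, 2, \ldots, d)\}$,
where $H_{\mathbf{n}}(x : \Sigma)$ is given by (33) and $\tilde{H}_{\mathbf{n}}(x : \Sigma)$ is defined as follows:
\[
\tilde{H}_{\mathbf{n}}(x ; \Sigma) = \frac{1}{n[x : \Sigma]} \Big( -\frac{\partial}{\partial y_1} \Big)^{n_1} \Big( -\frac{\partial}{\partial y_2} \Big)^{n_2} \cdots \Big( -\frac{\partial}{\partial y_d} \Big)^{n_d} n[x : \Sigma], \tag{35}
\]
\[
y = (y_1, y_2, \ldots, y_d)' = \Sigma^{-1} x.
\]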
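where $(i\xi)^{\mathbf{j}} = (i\xi_1)^{j_1} (i\xi_2)^{j_2} \cdots (i\xi_d)^{j_d}$. Hence,
\[
e^{i \xi' x} = e^{-\frac{1}{2} \xi' \Sigma \xi} \sum_{|\mathbf{j}|=0}^{\infty} \frac{(i\xi)^{\mathbf{j}}}{\mathbf{j}!} \tilde{H}_{\mathbf{j}}(x : \Sigma).
\]
Therefore,
\[
\begin{aligned}
e^{\frac{1}{2} \xi' \Sigma \xi} E\big[ e^{i \xi' Z} X \big]
&= e^{\frac{1}{2} \xi' \Sigma \xi} E\big[ e^{i \xi' Z} E[X | Z] \big] \\
&= \int_{\mathbf{R}^d} \Big\{ \sum_{|\mathbf{j}|=0}^{\infty} \frac{(i\xi)^{\mathbf{j}}}{\mathbf{j}!} \tilde{H}_{\mathbf{j}}(x : \Sigma) \Big\}
\times \Big\{ \sum_{|\mathbf{n}|=0}^{\infty} a_{\mathbf{n}} H_{\mathbf{n}}(x : \Sigma) \Big\}\, \mu(dx)
\end{aligned}
\tag{38}
\]
\[
= \sum_{|\mathbf{n}|=0}^{\infty} a_{\mathbf{n}}\, i^{|\mathbf{n}|} \xi^{\mathbf{n}}; \quad (\xi^{\mathbf{n}} = \xi_1^{n_1} \xi_2^{n_2} \cdots \xi_d^{n_d}), \tag{39}
\]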
Here, $C_{nm}$ is given by (29), and $H_n(x; \Sigma)$ is the Hermite polynomial of degree $n$ with
parameter $\Sigma$, which is defined as
\[
H_n(x; \Sigma) = (-\Sigma)^{n} e^{x^{2}/2\Sigma} \frac{d^{n}}{dx^{n}} e^{-x^{2}/2\Sigma}.
\]
$C$ and $\Sigma_T$ are given respectively by
\[
C = \partial g(X_T^{(0)}) \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\, dt
\]
and
\[
\Sigma_T = \int_0^T \hat{V}(X_t^{(0)}) \hat{V}(X_t^{(0)})'\, dt,
\]
where
\[
\hat{V}(X_t^{(0)}) = (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)}).
\]
Also, $P(0, T)$ denotes the price at time 0 of a zero coupon bond with maturity $T$.
$N(x)$ stands for the standard normal distribution function, and its density function
is given by $n(x) = \frac{1}{\sqrt{2\pi}} e^{-x^{2}/2}$.
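(Proof) We first note that the call price is expanded as follows:
\[
\begin{aligned}
C(K, T) &= P(0, T) E[\max\{g(X_T^{(\epsilon)}) - K, 0\}] \\
&= \epsilon P(0, T) E\Big[ \max\Big\{ \Big( \frac{g(X_T^{(\epsilon)}) - g(X_T^{(0)})}{\epsilon} \Big) + \Big( \frac{g(X_T^{(0)}) - K}{\epsilon} \Big), 0 \Big\} \Big] \\
&= \epsilon P(0, T) E\big[ \max\{ G^{(\epsilon)} + y, 0 \} \big] \\
&= \epsilon P(0, T) \int_{-y}^{\infty} (x + y) f_{G^{(\epsilon)},N}(x)\, dx + o(\epsilon^{(N+1)}).
\end{aligned}
\tag{41}
\]
where
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right).
\]
\[
\frac{d}{dx} H_n(x; \Sigma) = n H_{n-1}(x; \Sigma) \tag{43}
\]
\[
\frac{d^{m}}{dx^{m}} \{ H_n(x; \Sigma)\, n(x; \Sigma) \} = \Big( \frac{-1}{\Sigma} \Big)^{m} H_{n+m}(x; \Sigma)\, n(x; \Sigma)
\]
\[
H_{n+1}(x; \Sigma) = x H_n(x; \Sigma) - n \Sigma H_{n-1}(x; \Sigma),
\]
where $n(x; \Sigma) = \frac{1}{\sqrt{2\pi\Sigma}} e^{-\frac{x^{2}}{2\Sigma}}$.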
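Then, we can obtain the following expressions for the integrals appearing on the
right-hand side of (41):
\[
\int_{-y}^{\infty} f_{g_{1T}}(x)\, dx = N\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big), \tag{44}
\]
\[
\int_{-y}^{\infty} x f_{g_{1T}}(x)\, dx = \sqrt{\Sigma_T}\; n\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big) + C N\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big),
\]
\[
\int_{-y}^{\infty} H_m(x - C; \Sigma_T) f_{g_{1T}}(x)\, dx = \sqrt{\Sigma_T}\, H_{m-1}(-(y + C); \Sigma_T)\; n\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big); \quad m \ge 1,
\]
\[
\int_{-y}^{\infty} x H_m(x - C; \Sigma_T) f_{g_{1T}}(x)\, dx = -\sqrt{\Sigma_T}\, y\, H_{m-1}(-(y + C); \Sigma_T)\; n\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big)
+ \Sigma_T^{\frac{3}{2}} H_{m-2}(-(y + C); \Sigma_T)\; n\Big( \frac{y + C}{\sqrt{\Sigma_T}} \Big); \quad m \ge 2.
\]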
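The two Hermite integral identities above (for $m \ge 1$ and $m \ge 2$) can be checked numerically. The following sketch compares a trapezoid-rule evaluation of the left-hand sides with the closed forms, reading $n(\cdot)$ as the standard normal density. The parameter values, the helper names and the truncation of the upper limit at $C + 12\sqrt{\Sigma_T}$ are assumptions made for illustration.

```python
import math


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def hermite(n: int, x: float, sigma: float) -> float:
    """H_n(x; sigma) via H_{k+1} = x H_k - k sigma H_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * sigma * h_prev
    return h


def density(x: float, C: float, sigma: float) -> float:
    """Density of g_1T, a normal law with mean C and variance sigma."""
    return math.exp(-(x - C) ** 2 / (2.0 * sigma)) / math.sqrt(2.0 * math.pi * sigma)


def numeric_integral(m, y, C, sigma, with_x, steps=40000):
    """Trapezoid rule for the integral over [-y, C + 12 sd] of (x or 1) H_m(x - C) f(x)."""
    lo, hi = -y, C + 12.0 * math.sqrt(sigma)
    h = (hi - lo) / steps

    def f(x):
        return (x if with_x else 1.0) * hermite(m, x - C, sigma) * density(x, C, sigma)

    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


def closed_form(m, y, C, sigma, with_x):
    sd = math.sqrt(sigma)
    z, arg = (y + C) / sd, -(y + C)
    if not with_x:                       # requires m >= 1
        return sd * hermite(m - 1, arg, sigma) * norm_pdf(z)
    return (-sd * y * hermite(m - 1, arg, sigma)     # requires m >= 2
            + sigma ** 1.5 * hermite(m - 2, arg, sigma)) * norm_pdf(z)


y, C, sigma = 0.3, 0.1, 0.5
for m in (2, 3):
    print(m, numeric_integral(m, y, C, sigma, True), closed_form(m, y, C, sigma, True))
```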
\[
d\hat{X}_t^{j} = \hat{V}_0^{j}(\hat{X}_t)\, dt + \hat{V}^{j}(\hat{X}_t)\, dW_t \quad (j = 1, \ldots, d) \tag{45}
\]
\[
\hat{X}_0 = x_0 \in \mathbf{R}^d.
\]
Then, in order to apply the asymptotic expansion method, we may rewrite the model,
for instance, as
\[
dX_t^{(\epsilon),j} = V_0^{j}(X_t^{(\epsilon)})\, dt + \epsilon V^{j}(X_t^{(\epsilon)})\, dW_t \quad (j = 1, \ldots, d) \tag{46}
\]
\[
X_0^{(\epsilon)} = x_0 \in \mathbf{R}^d,
\]
where by rescaling $\hat{V}^{j}(x)$ we set $V^{j}(x)$ so that $\hat{V}^{j}(x) = \epsilon V^{j}(x)$ for some $\epsilon \in (0, 1]$.
Consequently, an approximate call price under the original model (45) is obtained
by (40) without $o(\epsilon^{N+1})$.
We already have a so-called closed-form approximate formula (40) for the option
price, and hence are able to obtain approximations of its Greeks (that is, sensitivities
to changes in the parameters of a model) in closed form as well (or at least with an
easy numerical method such as the difference quotient method applied to the approximate
option pricing formula).
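As an illustration of the difference quotient approach, the sketch below keeps only the leading term of the call price expansion. Assuming the call price representation (41) with $y = (g(X_T^{(0)}) - K)/\epsilon$ and $C = 0$, the first two integrals in (44) give $C(K,T) \approx P(0,T)\,[(F - K) N(z) + s\, n(z)]$ with $F = g(X_T^{(0)})$, $s = \epsilon\sqrt{\Sigma_T}$ and $z = (F - K)/s$. The model is a hypothetical one-dimensional example, $d\hat{X}_t = r \hat{X}_t\,dt + \hat\sigma \hat{X}_t\,dW_t$ with $g(x) = x$ and $P(0,T) = e^{-rT}$, for which $F = x_0 e^{rT}$ and $s = \hat\sigma F \sqrt{T}$. This is not formula (40) itself, whose higher-order terms are omitted here, and all function names are illustrative.

```python
import math


def norm_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def leading_order_call(x0: float, strike: float, r: float, vol: float, T: float) -> float:
    """Leading term of the expansion for the toy model dX = r X dt + vol X dW, g(x) = x."""
    forward = x0 * math.exp(r * T)             # g(X^(0)_T)
    s = vol * forward * math.sqrt(T)           # epsilon * sqrt(Sigma_T)
    z = (forward - strike) / s
    return math.exp(-r * T) * ((forward - strike) * norm_cdf(z) + s * norm_pdf(z))


def delta_by_difference(x0: float, strike: float, r: float, vol: float, T: float,
                        bump: float = 1e-3) -> float:
    """Central difference quotient of the approximate price in the initial value x0."""
    up = leading_order_call(x0 + bump, strike, r, vol, T)
    down = leading_order_call(x0 - bump, strike, r, vol, T)
    return (up - down) / (2.0 * bump)


price = leading_order_call(100.0, 100.0, 0.0, 0.2, 1.0)
delta = delta_by_difference(100.0, 100.0, 0.0, 0.2, 1.0)
print(price, delta)
```

At this order the price is of normal (Bachelier) type. Differentiating the closed form directly gives $N(z) + \hat\sigma\sqrt{T}\,n(z)$ for the Delta in this toy model, so the difference quotient can be checked against it.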
For instance, [68] implements direct differentiation of the approximate formulas
for option values under a time-homogeneous general local volatility model, and
obtains closed-form approximate formulas for the Deltas and Vegas. Moreover, [68]
applies a similar technique to computing the Deltas and Vegas for average options
with continuous monitoring, and obtains closed-form approximate formulas for them as well.
The work also confirms the validity of the approximations through numerical experiments
in the CEV model.
By deriving asymptotic expansions of characteristic functions of option values,
[93, 94] propose a new expansion scheme for pricing options on long-term currencies
under a Libor market model (LMM) and a general diffusion stochastic volatility
model with jumps in spot exchange rates. Furthermore, applying the approximate
formulas, they provide analytical (closed-form) approximations for the Deltas and
Gammas of the options. Please see [93, 94] for the details.
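Alternatively, for a parameter $\theta$, the sensitivity of a call price $C(K, T)$ with respect
to a change in $\theta$ is expressed as follows:
\[
\begin{aligned}
\frac{\partial}{\partial\theta} C(K, T)
&= \frac{\partial}{\partial\theta} \Big\{ P(0, T) E[\max\{g(X_T^{(\epsilon)}) - K, 0\}] \Big\} \\
&= \frac{\partial}{\partial\theta} \Big\{ \epsilon P(0, T) E\big[ \max\{ G^{(\epsilon)} + y, 0 \} \big] \Big\} \\
&= \frac{\partial}{\partial\theta} \{P(0, T)\}\, \epsilon E\big[ \max\{ G^{(\epsilon)} + y, 0 \} \big]
+ \epsilon P(0, T) \frac{\partial}{\partial\theta} E\big[ \max\{ G^{(\epsilon)} + y, 0 \} \big] \\
&= \frac{\partial \{P(0, T)\}}{\partial\theta} \frac{C^{AE}(K, T)}{P(0, T)}
+ \epsilon P(0, T) E\Big[ \Big( \frac{\partial G^{(\epsilon)}}{\partial\theta} + \frac{\partial y}{\partial\theta} \Big) 1_{\{G^{(\epsilon)} > -y\}} \Big],
\end{aligned}
\tag{47}
\]
where $C^{AE}(K, T)$ stands for the approximate call price with strike $K$ and maturity
$T$, which is obtained by the asymptotic expansion.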
Then, we are able to obtain an approximation of the sensitivity by a direct
application of the asymptotic expansion to the above equation, in particular to the second
term in the last line. For example, under a one-dimensional diffusion setting, that
is, a general time-homogeneous local volatility model, [66] successfully applies the
expansion technique to the computation of the Deltas and the Vegas with numerical
experiments.
More generally, we note that a method similar to the one used for option pricing in the previous
subsection can be applied to Greeks, since we can take $\Phi \in \mathcal{S}'$ for $E[\Phi(G^{(\epsilon)})]$ in
(11) and apply the integration-by-parts method in Malliavin calculus. Recently, [103]
takes this approach and derives asymptotic expansions of Greeks around the
Black-Scholes model in a stochastic volatility environment, and develops a unified method
for precise estimates of the expansion errors. In particular, they make use of the
so-called Kusuoka-Stroock functions introduced by Kusuoka [52], which are a powerful
tool to clarify the order of a Wiener functional with respect to the time parameter $t$
in a unified manner. Then, they estimate the error bounds for the Malliavin weights
of both the coefficient and the residual terms in the expansions.
The framework of the asymptotic expansion can be applied not only to the simple
cases mentioned above, but also to the evaluation of a much broader range of asset and
security values. In particular, the asymptotic expansion can be applied to approximate
these values in the many cases where the underlying asset prices of financial
securities, cash flows and interest rates are expressed as functions of a
random vector $X^{(\epsilon)}$ that follows a diffusion process. The method is almost the same
as the one illustrated above and hence it is omitted. In this subsection, we only review
how to represent the values of financial assets.
First, just as in the previous subsections, we consider a $d$-dimensional diffusion
process $X^{(\epsilon)}$ defined as the strong solution to the stochastic differential equation (1).
As an example, the present value $V$ of a financial asset which generates a cash flow
at the maturity date $T$ is represented as
\[
V = E\Big[ e^{-\int_0^T R_2(X_u^{(\epsilon)})\,du}\, F\big(g(X_T^{(\epsilon)})\big) \Big], \tag{48}
\]
where $g$ denotes the underlying asset price and $F$ is the cash flow which characterizes
the asset to be evaluated. Note that the underlying asset price $g$ follows a diffusion
process, whose drift term (the coefficient of the $dt$ term) is $R_1(X_t^{(\epsilon)})\,g - D(X_t^{(\epsilon)})$
under an equivalent martingale measure. Moreover, $R_1$ at time $t \in [0,T]$ is
represented as
\[
R_1(X_t^{(\epsilon)}) = r(X_t^{(\epsilon)}) + \sum_{j=1}^{J_1} s_{1j}(X_t^{(\epsilon)}),
\]
where $r$ denotes the risk-free interest rate and $s_{1j}$, $j = 1, \dots, J_1$ stand for various
spreads (the differences from the risk-free rate) such as credit spreads and liquidity
spreads. Suppose also that those are expressed as functions of the variable $X^{(\epsilon)}$.
Further, $D(X_t^{(\epsilon)})$ denotes a payoff generated by the underlying asset, such as a dividend or
an interest rate, and is also represented as a function of the variable $X^{(\epsilon)}$. Meanwhile,
the discount rate at time $t$ of the target asset $F$ to be evaluated, that is $R_2(X_t^{(\epsilon)})$, is
also expressed as
\[
R_2(X_t^{(\epsilon)}) = r(X_t^{(\epsilon)}) + \sum_{j=1}^{J_2} s_{2j}(X_t^{(\epsilon)}).
\]
As an example, let $F = 1$ in (48) for a zero-coupon bond with the face value
1 and the maturity date $T$. Also, let $V_i$ denote the price of the zero-coupon bond
with the maturity $T_i$. Then, $V$, the value of a coupon bond with the maturity $T_N$
and coupon (and principal) payments $c_i$ at $T_i$ ($i = 1, \dots, N$, $T_1 < \cdots < T_N$), is
represented by the equation $V = \sum_{i=1}^{N} c_i V_i$. Moreover, the present value of a call
option on the coupon bond with the option maturity $T$ ($< T_1$) can be evaluated if we
set $F(x) = (x - K)^+$ and $g(X_T^{(\epsilon)}) = \sum_{i=1}^{N} c_i\, g_i(X_T^{(\epsilon)})$ in the equation (48), where
$g_i(X_T^{(\epsilon)})$, $i = 1, \dots, N$ are given by
\[
g_i(X_T^{(\epsilon)}) = E\Big[ e^{-\int_T^{T_i} R_1(X_u^{(\epsilon)})\,du} \,\Big|\, X_T^{(\epsilon)} \Big].
\]
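The representation $V = \sum_i c_i V_i$ can be illustrated with a minimal sketch. It assumes, purely for illustration, that the risk-free rate and a single spread are constant, so that (48) with $F = 1$ reduces to $V_i = e^{-R_2 T_i}$; the function names and the numerical inputs are hypothetical and do not come from the text.

```python
import math

def zero_coupon_price(r, spread, maturity):
    """Value of (48) with F = 1 when R2 = r + spread is constant (an assumption)."""
    return math.exp(-(r + spread) * maturity)

def coupon_bond_price(cash_flows, r, spread):
    """V = sum_i c_i V_i for cash flows given as (T_i, c_i) pairs."""
    return sum(c * zero_coupon_price(r, spread, t) for t, c in cash_flows)

# Three-year bond with annual coupon 0.05 and principal 1 repaid at T_3.
cash_flows = [(1.0, 0.05), (2.0, 0.05), (3.0, 1.05)]
price = coupon_bond_price(cash_flows, r=0.02, spread=0.01)
print(price)
```

With state-dependent rates the discount factors $V_i$ would instead have to be approximated, for instance by the expansion described in this section.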
Applying the framework described above to default risk models, Muroi [67]
derives asymptotic expansions for approximations of CDS (credit default swap)
spreads.
Shiraya et al. [82] apply the expansion technique to obtain an approximation
of swaption values under the Libor market model (LMM) of interest rates (Brace,
Gatarek and Musiela [7], Jamshidian [43]) with local-stochastic volatility models.
Takahashi and Takehara [90–92] develop asymptotic expansion formulas for pricing
long-term currency options with a Libor market model (LMM) of interest rates
and diffusion or jump-diffusion stochastic volatility processes of spot exchange rates.
Moreover, [92] presents a new characteristic-function-based Monte Carlo simulation
scheme with the asymptotic expansion as a control variate.
Takahashi and Takehara [96] develop a general computation scheme for the
high-order expansion method explained in this section, and apply it to the SABR model
(Hagan, Kumar, Lesniewski, and Woodward [33]). They derive the expansions of
the option prices up to the fifth order to show that the higher order expansions improve
the approximations.
Takahashi and Takehara [108] and Takahashi et al. [109] also apply this scheme to
long-term currency options, such as those with a 10-year maturity, under a Libor market
model (LMM) of interest rates and stochastic volatility processes of spot exchange
rates. Again, they confirm that the fourth or the fifth order expansion provides
better approximations than the lower order ones.
Furthermore, we are able to apply the expansion method to pricing the so-called
exotic options. For instance, [78] derives expansions of average options with
discrete monitoring under stochastic volatility models in order to obtain approximate
prices of commodity average options. Moreover, they calibrate the models to
real plain-vanilla futures option prices of the underlying commodities, and evaluate
average options based on the parameters obtained by the calibration.
Shiraya et al. [83] develop new approximation formulas for pricing single and
double barrier options with discrete monitoring under stochastic volatility models.
In addition, they demonstrate the validity of the formulas through numerical experiments.
Shiraya et al. [81] present a new approximation scheme for pricing continuous
barrier options in a stochastic volatility environment. In particular, they make use of
a static hedging scheme and the fifth order expansions of the vanilla options to
obtain accurate approximate prices. Further, they derive the fifth order expansions
for pricing average options with continuous monitoring under stochastic volatility
models to achieve very precise approximations.
Shiraya and Takahashi [79] develop a general scheme for the evaluation of the
so-called multi-asset cross currency options. In particular, they derive the expansions of
basket option prices with 100 underlying assets (200 state variables with their stochastic
volatilities), and cross currency average/basket options with discrete monitoring
under stochastic volatility models to obtain accurate approximations.
Kato et al. [46, 47] develop a new expansion scheme for solutions of Cauchy-
Dirichlet problems for second order parabolic partial differential equations (PDEs)
and apply it to pricing down-and-out/up-and-out barrier options with continuous
monitoring under stochastic volatility models.
4 Extension
This section follows [97], which presents an extension of the general computational
scheme of the asymptotic expansion described in the previous section. In particular,
by a change of variable technique and by various ways of setting the perturbation
parameters in the expansion, we gain flexibility in choosing the benchmark
distribution around which the expansion is made, together with an automatic way of
computing the expansion up to any order. For instance, we introduce expansions
called the log-normal expansion and the CEV expansion.
\[
dX_t^j = V_0^j(X_t)\,dt + V^j(X_t)\,dW_t \quad (j = 1, \dots, d), \qquad X_0 = x_0 \in \mathbf{R}^d, \tag{49}
\]
where $W = (W^1, \dots, W^r)$ is an $r$-dimensional standard Wiener process; $V_0^j :
\mathbf{R}^d \to \mathbf{R}$ and $V^j : \mathbf{R}^d \to \mathbf{R}^d$ are smooth functions with bounded derivatives of all
orders.
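Next, let $C : \mathbf{R}^d \to \mathbf{R}^d$ be a $C^2$-function which has the unique inverse function,
$C^{-1}$, and define $\tilde X_t$ as $\tilde X_t = C(X_t)$. Then, the dynamics of $\tilde X$ is given by
\[
d\tilde X_t^j = \tilde V_0^j(\tilde X_t)\,dt + \tilde V^j(\tilde X_t)\,dW_t \quad (j = 1, \dots, d), \qquad \tilde X_0 = \tilde x_0, \tag{50}
\]
where
\[
\begin{aligned}
\tilde V_0^j(\tilde x) &:= \sum_{j'=1}^{d} \partial_{j'} C^j(C^{-1}(\tilde x))\, V_0^{j'}(C^{-1}(\tilde x))
 + \frac{1}{2} \sum_{j',k'=1}^{d} \partial_{j'k'} C^j(C^{-1}(\tilde x))\, V^{j'}(C^{-1}(\tilde x))\, V^{k'}(C^{-1}(\tilde x)), \\
\tilde V^j(\tilde x) &:= \sum_{j'=1}^{d} \partial_{j'} C^j(C^{-1}(\tilde x))\, V^{j'}(C^{-1}(\tilde x)),
\end{aligned}
\]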
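\[
\tilde X_t \to \tilde X_t^{(\epsilon)}, \qquad
\tilde V_0^j(\tilde x) \to \tilde V_0^{(\epsilon),j}(\tilde x, \epsilon), \qquad
\tilde V^j(\tilde x) \to \epsilon\, \tilde V^j(\tilde x).
\]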
Then, we are able to apply the technique developed in the previous section to the
transformed SDE (51).
We assume that the underlying process is the unique solution to the following SDE:
\[
C(x) = (C_1(x^1), x^2, \dots, x^d),
\]
\[
d\tilde S_t = \frac{1}{2} |\sigma(X_t)|^2\, h(C_1^{-1}(\tilde S_t))^2\, C_1''(C_1^{-1}(\tilde S_t))\,dt
 + \sigma(X_t)\, h(C_1^{-1}(\tilde S_t))\, C_1'(C_1^{-1}(\tilde S_t))\,dW_t, \qquad \tilde s_0 = C_1(s_0), \tag{54}
\]
where $C_1'(x) := \frac{d}{dx} C_1(x)$ and $C_1''(x) := \frac{d^2}{dx^2} C_1(x)$.
Next, we introduce a perturbation parameter $\epsilon$ as follows:
\[
S_t = C_1^{-1}(\tilde S_t) = C_1^{-1}(\tilde S_t^{(1)}),
\]
where $\tilde S_t^{(1)} = \tilde S_t^{(\epsilon)}\big|_{\epsilon = 1}$.
According to Theorem 2 in the previous section, we already have an asymptotic
expansion of the density function of $G^{(\epsilon)} = (\tilde S_T^{(\epsilon)} - \tilde S_T^{(0)})/\epsilon$ up to the $N$-th order, denoted by
$f_{G^{(\epsilon)},N}(x)$.
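Therefore, an approximation formula of the call price is given as follows:
\[
\mathrm{Call}(K,T) = E[(S_T - K)^+] = E\Big[ \big( C_1^{-1}(\tilde S_T^{(1)}) - K \big)^+ \Big] \tag{56}
\]
\[
\approx \int_y^{\infty} \big( C_1^{-1}(x + \tilde S_T^{(0)}) - K \big)\, f_{G^{(1)},N}(x)\,dx, \tag{57}
\]
where $y = C_1(K) - \tilde S_T^{(0)}$.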
A simple example is the following. Set the local volatility function to be linear:
For $x = (x^1, x^2, \dots, x^d)$, let
\[
C(x) = (\log x^1, x^2, \dots, x^d),
\]
and set $\eta(\epsilon) = \epsilon^k$ where $k$ is 0, 1 or 2. Then, we have $\tilde S_t^{(\epsilon)} = \log S_t^{(\epsilon)}$, where
This case corresponds to some existing research (e.g. [91, 92, 95, 96, 100]).
4.3 Examples
The first example is the well-known CEV (Constant Elasticity of Variance) model
(Cox [10]):
\[
dS_t = \sigma \big( S_t^{\beta} S_0^{1-\beta} \big)\,dW_t, \quad \sigma \text{ and } S_0 \text{ are positive constants}, \ \beta \in [0,1], \tag{60}
\]
where the term $S_0^{1-\beta}$ keeps the level of $\sigma$ of the same order for different $\beta$. For
$x > 0$, let us take the change of variable function to be $C(x) = \log(x/S_0)$, that is
$x = C^{-1}(\tilde x) = S_0 \exp(\tilde x)$. Hence, $\tilde S_t = \log(S_t/S_0)$ and we have
\[
d\tilde S_t = -\frac{1}{2} \sigma^2 e^{2(\beta-1)\tilde S_t}\,dt + \sigma e^{(\beta-1)\tilde S_t}\,dW_t; \qquad \tilde S_0 = 0. \tag{61}
\]
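Next, we introduce a perturbation $\epsilon \in [0, 1]$, again as follows:
an approximation formula of the call price with strike $K$ and maturity $T$ is given as
follows:
\[
\begin{aligned}
\mathrm{Call}(K,T) = E[(S_T - K)^+] &= E\Big[ \big( S_0 \exp\big( G^{(1)} + \tilde S_T^{(0)} \big) - K \big)^+ \Big] \\
&\approx \int_y^{\infty} \big( S_0 \exp\big( x + \tilde S_T^{(0)} \big) - K \big)\, f_{G^{(1)},N}(x)\,dx;
\end{aligned}
\tag{63}
\]
\[
y = C(K) - \tilde S_T^{(0)} = \log\frac{K}{S_0} - \tilde S_T^{(0)}. \tag{64}
\]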
Note that $f_{g_{1T}}$, the first term in the asymptotic expansion of the density $f_{G^{(\epsilon)}}$, is a
normal density and hence the underlying asset price is expanded around a log-normal
distribution. Thus, we could call this case a log-normal asymptotic expansion. We
also remark that the case of $\eta(\epsilon) = \epsilon^0 = 1$ is harder to evaluate than the other
cases, which is essentially due to the difficulty in computing $\tilde S_t^{(0)}$ for $\eta(\epsilon) = 1$.
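The structure of (63) and (64) can be checked numerically in the simplest benchmark case. The sketch below assumes $\beta = 1$, a zero interest rate, $\tilde S_T^{(0)} = -\sigma^2 T/2$, and a Gaussian $N(0, \sigma^2 T)$ density standing in for $f_{G^{(1)},N}$; these choices are illustrative assumptions rather than the expansion itself, and the function names are hypothetical. Under them the integral should reproduce the Black-Scholes price.

```python
import math

def normal_pdf(x, var):
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_by_density(s0, strike, sigma, maturity, n_steps=20000):
    """Simpson's rule for (63) with a Gaussian density in place of f_{G^(1),N}."""
    var = sigma * sigma * maturity
    sd = math.sqrt(var)
    s_tilde0 = -0.5 * var                    # assumed value of S~_T^(0)
    y = math.log(strike / s0) - s_tilde0     # lower limit of integration, as in (64)
    upper = max(y, 0.0) + 12.0 * sd          # truncation of the infinite range
    h = (upper - y) / n_steps                # n_steps must be even

    def integrand(x):
        return (s0 * math.exp(x + s_tilde0) - strike) * normal_pdf(x, var)

    total = integrand(y) + integrand(upper)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(y + i * h)
    return total * h / 3.0

def black_scholes_call(s0, strike, sigma, maturity):
    """Zero-rate Black-Scholes price, used only as a reference value."""
    sd = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + 0.5 * sd * sd) / sd
    return s0 * normal_cdf(d1) - strike * normal_cdf(d1 - sd)

approx = call_by_density(100.0, 110.0, 0.2, 1.0)
reference = black_scholes_call(100.0, 110.0, 0.2, 1.0)
print(approx, reference)
```

For $\beta < 1$ the Gaussian density would be replaced by the expanded density, which adds correction terms to the same integral.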
• On the Validity of the Asymptotic Expansion for CEV model
Previous works such as [85, 86, 107] have considered an asymptotic expansion
of (average and vanilla) option prices based on the following type of a perturbed
process: For $\beta \in [1/2, 1)$,
Although the coefficient function in this model is not smooth at 0, the asymptotic
expansion method is still applicable. For instance, we could use a smooth
modification technique (e.g. [106, 107]). That is, let us take a modified process
$(\tilde S_t^{(\epsilon)})_{t \in [0,T]}$ of $(S_t^{(\epsilon)})_{t \in [0,T]}$ as follows:
Here, $\hat g(x)$ is a smooth modification of $g(x) = (x \vee 0)^{\beta}$ such that $\hat g(x) = x^{\beta}$ when
$x \ge a_1$ for some small $a_1 \in (0, a)$ with $a = \frac{1}{2} s_0$, and $\hat g(x) = 0$ when $x \le a_2$ for
some $a_2 \in (0, a_1)$. Specifically, we may set $\hat g(x)$ as follows. For $t \in [0, T]$,
\[
\begin{aligned}
\hat g(x) &= h(x)\, x^{\beta}, \\
h(x) &= \frac{\psi(x - a_2)}{\psi(x - a_2) + \psi(a_1 - x)}, \quad 0 < a_2 < a_1, \\
\psi(x) &= e^{-1/x} \ \text{for } x > 0, \qquad \psi(x) = 0 \ \text{for } x \le 0.
\end{aligned}
\]
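The cutoff construction above can be written out directly. The following sketch implements $\psi$, $h$ and the modified power function and evaluates them at a few points; the function names and the values $a_1 = 0.5$, $a_2 = 0.1$ are illustrative assumptions (they correspond, for example, to $s_0 = 2$, so that $a = 1$).

```python
import math

def psi(x):
    """psi(x) = exp(-1/x) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0

def cutoff(x, a1, a2):
    """h(x): equals 0 for x <= a2, equals 1 for x >= a1, smooth in between."""
    num = psi(x - a2)
    return num / (num + psi(a1 - x))

def modified_power(x, beta, a1, a2):
    """Smooth modification h(x) * x**beta of (x v 0)**beta."""
    if x <= a2:
        return 0.0  # h vanishes here; also avoids x**beta for x <= 0
    return cutoff(x, a1, a2) * x ** beta

a1, a2, beta = 0.5, 0.1, 0.5
for x in (-1.0, 0.05, 0.3, 0.5, 2.0):
    print(x, modified_power(x, beta, a1, a2))
```

The modified function agrees with $x^{\beta}$ on $[a_1, \infty)$, so the modified and original processes coincide until the price first falls below $a_1$.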
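Suppose that for an $\mathbf{R}$-valued function $f$, $E\big[ f(S^{(\epsilon)})^2 \big] < \infty$ and $E\big[ f(\tilde S^{(\epsilon)})^2 \big] < \infty$
(e.g. we can take option payoff functions as $f$ in our setting). Then, we have
\[
\Big| E\Big[ \big( f(S^{(\epsilon)}) - f(\tilde S^{(\epsilon)}) \big) 1_{\{S^{(\epsilon)} \ne \tilde S^{(\epsilon)}\}} \Big] \Big|
\le \Big( E\big[ |f(S^{(\epsilon)})|^2 \big]^{\frac{1}{2}} + E\big[ |f(\tilde S^{(\epsilon)})|^2 \big]^{\frac{1}{2}} \Big)
\times P\big( \{ S^{(\epsilon)} \ne \tilde S^{(\epsilon)} \} \big)^{\frac{1}{2}}.
\]
We can easily see that the second term after the last inequality is 0. The first term
is smaller than any $\epsilon^n$ for $n = 1, 2, \dots$ by the following lemma of a large deviation
inequality: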
Lemma 4 Suppose that $Z_t$, $t \in [0, T]$ is the solution to the SDE:
where $\mu(z)$ satisfies the Lipschitz and linear growth conditions, and $\sigma(z)$ satisfies
the linear growth condition. We assume that the unique strong solution exists. Then,
there exist positive constants $c_1$ and $c_2$ independent of $\epsilon$ such that
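Next, let us consider a stochastic volatility model, the so-called SABR [33] (or $\lambda$-SABR
[38]) model:
\[
\begin{aligned}
dS_t &= \sigma_t \big( S_t^{\beta} S_0^{1-\beta} \big)\,dW_t^1; \quad S_0 > 0, \\
d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0,
\end{aligned}
\tag{69}
\]
\[
\begin{aligned}
d\tilde S_t &= -\frac{1}{2} \sigma_t^2 e^{2(\beta-1)\tilde S_t}\,dt + \sigma_t e^{(\beta-1)\tilde S_t}\,dW_t^1; \quad \tilde S_0 = 0, \\
d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0.
\end{aligned}
\tag{70}
\]
Again, we note that the case of $\eta(\epsilon) = \epsilon^0 = 1$ is harder to evaluate than the
other cases, which results from the difficulty in computing $\tilde S_t^{(0)}$ for $\eta(\epsilon) = 1$.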
• CEV Asymptotic Expansion
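Let us take the change of variable function $C$ as $C(x_1, x_2) = (C_1(x_1), x_2)$ for $(x_1, x_2)$,
where for $x > 0$ and $\beta \in [0, 1)$,
\[
C_1(x) = \frac{1}{1-\beta}\, \frac{x^{1-\beta}}{S_0^{1-\beta}} = \int_0^x \frac{dz}{z^{\beta} S_0^{1-\beta}}. \tag{74}
\]
That is,
\[
C_1^{-1}(\tilde x) = S_0 (1-\beta)^{\frac{1}{1-\beta}}\, \tilde x^{\frac{1}{1-\beta}}. \tag{75}
\]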
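As a quick consistency check of (74) and (75), the two maps can be coded and composed. This is a minimal sketch with hypothetical function names and illustrative parameter values; it verifies that the maps are inverse to each other and that $C_1(S_0) = 1/(1-\beta)$.

```python
def cev_transform(x, s0, beta):
    """C_1(x) of (74), for x > 0 and 0 <= beta < 1."""
    return (x / s0) ** (1.0 - beta) / (1.0 - beta)

def cev_inverse(x_tilde, s0, beta):
    """C_1^{-1}(x~) of (75), for x~ >= 0."""
    return s0 * ((1.0 - beta) * x_tilde) ** (1.0 / (1.0 - beta))

s0, beta = 100.0, 0.5
start = cev_transform(s0, s0, beta)                              # C_1(S_0)
round_trip = cev_inverse(cev_transform(80.0, s0, beta), s0, beta)
print(start, round_trip)
```

For $\beta = 1/2$ the inverse map reduces to $S_0 \tilde x^2 / 4$.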
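\[
\begin{aligned}
d\tilde S_t &= -\frac{1}{2}\, \frac{\beta}{1-\beta}\, \sigma_t^2\, \frac{1}{\tilde S_t}\,dt + \sigma_t\,dW_t^1; \quad \tilde S_0 = \frac{1}{1-\beta} > 0, \\
d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0.
\end{aligned}
\tag{76}
\]
\[
\begin{aligned}
d\tilde S_t^{(\epsilon)} &= -\epsilon\, \frac{\beta}{2(1-\beta)}\, (\sigma_t^{(\epsilon)})^2\, \frac{1}{\tilde S_t^{(\epsilon)}}\,dt + \epsilon\, \sigma_t^{(\epsilon)}\,dW_t^1; \quad \tilde S_0^{(\epsilon)} = \frac{1}{1-\beta}, \\
d\sigma_t^{(\epsilon)} &= \epsilon\, \lambda(\theta - \sigma_t^{(\epsilon)})\,dt + \epsilon\, \nu \sigma_t^{(\epsilon)}\,dW_t^2; \quad \sigma_0^{(\epsilon)} = \sigma_0.
\end{aligned}
\tag{78}
\]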
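In this case, as $\tilde S_t^{(0)} = \frac{1}{1-\beta}$ and $\sigma_t^{(0)} = \sigma_0$ for all $t \in [0, T]$, the sum of the first two terms in
the asymptotic expansion, $\tilde g_{1t} = \frac{1}{1-\beta} + \frac{\partial}{\partial \epsilon} \tilde S_t^{(\epsilon)} \big|_{\epsilon = 0}$, follows a Gaussian process:
\[
d\tilde g_{1t} = -\frac{\beta \sigma_0^2}{2}\,dt + \sigma_0\,dW_t^1; \qquad \tilde g_{10} = \frac{1}{1-\beta}. \tag{79}
\]
\[
\hat g_{1t} := C_1^{-1}(\tilde g_{1t}) = S_0 (1-\beta)^{\frac{1}{1-\beta}}\, \tilde g_{1t}^{\frac{1}{1-\beta}}, \tag{80}
\]
and using
\[
\tilde g_{1t} = \frac{1}{1-\beta}\, \frac{\hat g_{1t}^{1-\beta}}{S_0^{1-\beta}}, \tag{81}
\]
we formally obtain the SDE of $\hat g_{1t}$, though it is generally well-defined only for
$\tilde g_{1t} \ge 0$:
\[
d\hat g_{1t} = \frac{\beta \sigma_0^2 S_0^{1-\beta}}{2}\, \hat g_{1t}^{\beta} \big( -1 + S_0^{1-\beta} \hat g_{1t}^{\beta-1} \big)\,dt + \sigma_0 S_0^{1-\beta} \hat g_{1t}^{\beta}\,dW_t^1; \qquad \hat g_{10} = S_0. \tag{82}
\]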
Here, because the diffusion coefficient of $\hat g_{1t}$ is given by $\sigma_0 S_0^{1-\beta} (\hat g_{1t})^{\beta}$ and we may
think that $S$ is expanded around $\hat g_1$, we call this case a CEV asymptotic expansion
(though $\hat g_1$ is not exactly a CEV process).
In particular, when $\beta = 1/2$,
\[
d\hat g_{1t} = \frac{\sigma_0^2}{4} \Big( -\sqrt{S_0\, \hat g_{1t}} + S_0 \Big)\,dt + \sigma_0 \sqrt{S_0\, \hat g_{1t}}\,dW_t^1; \qquad \hat g_{10} = S_0, \tag{83}
\]
and because
\[
\hat g_{1T} = \frac{S_0}{4}\, \tilde g_{1T}^2, \tag{84}
\]
$\hat g_{1T}/(S_0 \sigma_0^2 T/4)$ follows a non-central $\chi^2$ distribution, around which the original
underlying asset price $S_T$ is expanded.
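The non-central $\chi^2$ claim can be spelled out in one step. The following worked computation uses only (79) and (84) with $\beta = 1/2$; the degrees of freedom and the non-centrality parameter are derived here and are not stated in the text.

```latex
% From (79) with \beta = 1/2: \tilde g_{1T} is Gaussian,
\tilde g_{1T} = 2 - \frac{\sigma_0^2 T}{4} + \sigma_0 W_T^1
  \sim N\!\Big( 2 - \frac{\sigma_0^2 T}{4},\ \sigma_0^2 T \Big).
% Dividing (84) by S_0 \sigma_0^2 T / 4 gives the square of a unit-variance Gaussian:
\frac{\hat g_{1T}}{S_0 \sigma_0^2 T / 4}
  = \Big( \frac{\tilde g_{1T}}{\sigma_0 \sqrt{T}} \Big)^{2},
\qquad
\frac{\tilde g_{1T}}{\sigma_0 \sqrt{T}} \sim N(\mu, 1),
\quad
\mu = \frac{2 - \sigma_0^2 T / 4}{\sigma_0 \sqrt{T}}.
% Hence the ratio is non-central chi-square with one degree of freedom
% and non-centrality parameter \mu^2.
```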
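Finally, for $\eta_i(\epsilon) = \epsilon^{j_i}$, $i = 1, 2$, where $j_i$ is a nonnegative integer, an approximation
formula of the call price with strike $K$ and maturity $T$ is obtained as follows:
\[
\begin{aligned}
\mathrm{Call}(K,T) = E[(S_T - K)^+] &= E\Big[ \big( C_1^{-1}(\tilde S_T) - K \big)^+ \Big] \\
&= E\Big[ \Big( S_0 (1-\beta)^{\frac{1}{1-\beta}} (\tilde S_T)^{\frac{1}{1-\beta}} - K \Big)^+ \Big] \\
&= E\Big[ \Big( S_0 (1-\beta)^{\frac{1}{1-\beta}} (\tilde S_T^{(1)})^{\frac{1}{1-\beta}} - K \Big)^+ \Big] \\
&= E\Big[ \Big( S_0 (1-\beta)^{\frac{1}{1-\beta}} (G^{(1)} + \tilde S_T^{(0)})^{\frac{1}{1-\beta}} - K \Big)^+ \Big] \\
&\approx \int_y^{\infty} \Big( S_0 (1-\beta)^{\frac{1}{1-\beta}} (x + \tilde S_T^{(0)})^{\frac{1}{1-\beta}} - K \Big)\, f_{G^{(1)},N}(x)\,dx;
\end{aligned}
\tag{85}
\]
\[
y = C_1(K) - \tilde S_T^{(0)} = \frac{1}{1-\beta} \Big( \frac{K}{S_0} \Big)^{1-\beta} - \tilde S_T^{(0)}. \tag{86}
\]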
Takahashi and Tsuzuki [98] develop a new scheme for improving density approximation
methods, which also provides precise approximations of option values.
Specifically, the scheme is inspired by the idea of the Hilbert space projection theorem,
and the so-called “Dykstra’s cyclic projections algorithm” is applied for its
implementation. (See Deutsch [14] for the details of the algorithm.) We also
remark that the scheme can be easily implemented in practice, since it needs only
the market data used for usual calibration, such as option prices across strikes.
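To make the projection idea concrete, the following is a generic sketch of Dykstra's algorithm on a toy problem, not the scheme of [98]. It projects a discretized "density" with a negative entry onto the intersection of two convex sets chosen here purely for illustration: vectors with nonnegative entries, and vectors whose entries sum to one. All names are hypothetical.

```python
def project_nonnegative(v):
    """Euclidean projection onto {x : x_i >= 0}."""
    return [max(x, 0.0) for x in v]

def project_unit_sum(v):
    """Euclidean projection onto the hyperplane {x : sum(x) = 1}."""
    shift = (sum(v) - 1.0) / len(v)
    return [x - shift for x in v]

def dykstra(v, n_iter=200):
    """Dykstra's cyclic projections for the intersection of the two sets."""
    x = list(v)
    p = [0.0] * len(v)  # correction term attached to the first set
    q = [0.0] * len(v)  # correction term attached to the second set
    for _ in range(n_iter):
        y = project_nonnegative([xi + pi for xi, pi in zip(x, p)])
        p = [xi + pi - yi for xi, pi, yi in zip(x, p, y)]
        x = project_unit_sum([yi + qi for yi, qi in zip(y, q)])
        q = [yi + qi - xi for yi, qi, xi in zip(y, q, x)]
    return x

raw = [0.5, -0.2, 0.9]   # approximate weights with a negative entry
fixed = dykstra(raw)
print(fixed)
```

Unlike plain alternating projections, the correction terms make the iterates converge to the point of the intersection nearest to the starting vector, which is why the method suits the task of minimally adjusting an approximate density.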
Furthermore, numerical experiments for vanilla option pricing under the SABR model
demonstrate the validity of the scheme. In fact, in terms of approximation accuracy,
this scheme improves the third and fifth order asymptotic expansions while preserving
the required conditions, such as nonnegative densities under an appropriate forward
measure.
We finally remark that the scheme is general and flexible enough to include any set
of conditions and information that one would like to impose on an approximate density,
and it can be applied to approximation methods other than the asymptotic expansion
method. For example, a considerable amount of research has sought to extend the
SABR model while fixing the problem of the negative densities in the method of
[33]. (For instance, see Doust [15].) We note that the scheme is also a candidate for
handling this issue. Also, the estimate of the absorption probability based on Monte
Carlo simulations as in [15] can be consistently incorporated in the scheme.
Takahashi and Yamada [105] develop a new weak approximation scheme for
expectations of functions of the solutions to SDEs. In particular, the scheme connects
approximate operators constructed from the asymptotic expansion. More
concretely, a diffusion semigroup is defined as the expectation of an appropriate
function of the solution to a certain SDE: for example, $P_t f(x) = E[f(X_t^{x,\epsilon})]$
with the solution $X_t^{x,\epsilon}$ of an SDE with perturbation parameter $\epsilon$ and a function $f$.
Then, we approximate $P_t$ by an operator $Q_t^m$ which is constructed from the
asymptotic expansion up to a certain order $m$. Thus, given a partition of $[0, T]$,
$\pi = \{(t_0, t_1, \dots, t_n) : 0 = t_0 < t_1 < \cdots < t_n = T\}$, we are able to approximate
$P_T f(x)$ by connecting the expansion-based approximations sequentially with the use of
multi-dimensional Malliavin weights: that is, roughly speaking, with
$s_k = t_k - t_{k-1}$, $k = 1, \dots, n$,
The present research justifies this idea by applying Malliavin calculus, particularly
the theories developed by Watanabe [111] and Kusuoka [52–54]. In computation, in order
to evaluate the Malliavin weights, the paper makes use of conditional expectation
formulas for multi-dimensional asymptotic expansions in [86].
Moreover, the paper shows through numerical examples for option pricing under
local and stochastic volatility models that a very coarse partition, such as $n = 2$, is
usually enough to substantially reduce the errors of the first or second order
expansions ($m = 1, 2$) for deep out-of-the-money options.
Among main stochastic models in finance, there exist models in which the stochastic
processes of the underlying variables do not belong to the class of diffusion processes.
This section illustrates an instantaneous forward rates model as a typical example.
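where $\sigma_{F^{\epsilon}}$ stands for the Malliavin covariance matrix of $F^{\epsilon}$. Then, for a Schwartz
distribution $T \in \mathcal{S}'(\mathbf{R}^n)$, we have an asymptotic expansion in $\mathbf{R}$:
\[
E[T(F^{\epsilon})] - \Bigg\{ \int_{\mathbf{R}^n} T(x)\, p^{F^0}(x)\,dx
 + \sum_{j=1}^{N} \epsilon^j \int_{\mathbf{R}^n} T(x)\, E\Bigg[ \sum_{k}^{(j)} H_{\alpha^{(k)}}\Big( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \Big) \,\Big|\, F^0 = x \Bigg] p^{F^0}(x)\,dx \Bigg\} = O(\epsilon^{N+1}),
\tag{88}
\]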
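Equivalently,
\[
E[T(F^{\epsilon})] - \Bigg\{ \int_{\mathbf{R}^n} T(x)\, p^{F^0}(x)\,dx
 + \sum_{j=1}^{N} \epsilon^j \sum_{k}^{(j)} (-1)^k \int_{\mathbf{R}^n} T(x)\, \partial_{\alpha^{(k)}}^{k} \Bigg\{ E\Bigg[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \,\Big|\, F^0 = x \Bigg] p^{F^0}(x) \Bigg\}\,dx \Bigg\} = O(\epsilon^{N+1}),
\tag{89}
\]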
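where $F_i^{0,k} := \frac{1}{k!} \frac{d^k}{d\epsilon^k} F_i^{\epsilon} \big|_{\epsilon = 0}$, $k \in \mathbf{N}$ ($i = 1, \dots, n$), $\alpha^{(k)}$ denotes a multi-index,
$\alpha^{(k)} = (\alpha_1, \dots, \alpha_k)$, and
\[
\sum_{k}^{(j)} \equiv \sum_{k=1}^{j}\ \sum_{\beta_1 + \cdots + \beta_k = j,\ \beta_i \ge 1}\ \sum_{\alpha^{(k)} \in \{1, \dots, n\}^k} \frac{1}{k!}.
\]
$p^{F^0}(x)$ stands for the density function of $F^0$. The Malliavin weight $H_{\alpha^{(k)}}$ is recursively
defined as follows: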
where
\[
H_{(l)}(F, G) = D^{*}\Big( \sum_{i=1}^{n} G\, \gamma_{li}^{F}\, DF_i \Big). \tag{91}
\]
Here, $F_i \in \mathbf{D}^{\infty}$, $G \in \mathbf{D}^{\infty}$, $D^{*}\big( \sum_{i=1}^{n} G \gamma_{li}^{F} DF_i \big)$ is the divergence of $\sum_{i=1}^{n} G \gamma_{li}^{F} DF_i$,
$DF_i$ is the Malliavin derivative of $F_i$, and $\gamma^{F} = (\gamma_{ij}^{F})_{1 \le i,j \le n}$ denotes the
inverse matrix of the Malliavin covariance matrix of $F$. Moreover, we use the notation
$\int T(x) g(x)\,dx$ for $T \in \mathcal{S}'(\mathbf{R}^n)$ and $g \in \mathcal{S}(\mathbf{R}^n)$, meaning ${}_{\mathcal{S}'}\langle T, g \rangle_{\mathcal{S}}$. (See
Sect. 2 of [100] for the details of those definitions.)
Remark 7 The asymptotic expansion formula (89) is the formula developed by
Watanabe [111]. Hence, this theorem shows that the expansion (88), based on the push-down
(conditional expectation) of Malliavin weights (divergences), is equivalent to
Watanabe’s formula.
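(Proof) We use $\alpha$ as an abbreviation of $\alpha^{(k)}$ in the proof, and the notation
$\langle \cdot, \cdot \rangle_{p^{F^0}(x)dx}$ is defined as follows:
\[
\big\langle T, E^{F^0}[\,\cdot\,] \big\rangle_{p^{F^0}(x)dx} := {}_{\mathcal{S}'}\big\langle T, E^{F^0}[\,\cdot\,]\, p^{F^0} \big\rangle_{\mathcal{S}}.
\]
\[
\begin{aligned}
\cdots &= \Big\langle T, (\partial^{*})_{\alpha}^{k}\, E^{F^0}\Big[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \Big] \Big\rangle_{p^{F^0}(x)dx} \\
&= (-1)^k \int_{\mathbf{R}^n} T(x)\, \partial_{\alpha}^{k} \Bigg\{ E\Bigg[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \,\Big|\, F^0 = x \Bigg] p^{F^0}(x) \Bigg\}\,dx.
\end{aligned}
\tag{94}
\]
Here, $(\partial^{*})_{\alpha}^{k}$ means $(\partial^{*})_{\alpha}^{k} = \partial_{\alpha_1}^{*} \circ \cdots \circ \partial_{\alpha_k}^{*}$ ($k$ times), and $\partial_{\alpha}^{*}$ denotes the divergence
operator on the space $\big( \mathbf{R}^n, p^{F^0}(x)\,dx \big)$.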
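Corollary 1 The asymptotic expansion of the density function of $F^{\epsilon}$, $p^{F^{\epsilon}}(y)$, is
expressed with the push-down of the Malliavin weights as follows:
\[
p^{F^{\epsilon}}(y) = p^{F^0}(y) + \sum_{j=1}^{m} \epsilon^j\, E\Bigg[ \sum_{k}^{(j)} H_{\alpha^{(k)}}\Big( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \Big) \,\Big|\, F^0 = y \Bigg] p^{F^0}(y) + O(\epsilon^{m+1}),
\tag{95}
\]
where $p^{F^0}(y)$ is the density function of $F^0$. An alternative expression is given as
follows:
\[
p^{F^{\epsilon}}(y) = p^{F^0}(y) + \sum_{j=1}^{m} \epsilon^j \sum_{k}^{(j)} (-1)^k\, \partial_{\alpha^{(k)}}^{k} \Bigg\{ E\Bigg[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \,\Big|\, F^0 = y \Bigg] p^{F^0}(y) \Bigg\} + O(\epsilon^{m+1}).
\tag{96}
\]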
A typical stochastic model for pricing interest rate derivatives is the
model developed by Heath-Jarrow-Morton [37], the so-called HJM model, which is
formulated in terms of the forward rates over infinitesimal terms,
that is, the instantaneous forward rates $\{f(s, t) : 0 \le s \le t \le T\}$. Here, $s$ is the time
when the forward rate is fixed and $t$ denotes the inception time when the forward
rate is applied.
The stochastic processes for the instantaneous forward rates are considered in
the framework of the asymptotic expansion by introducing a parameter $\epsilon \in [0, 1]$.
For example, let $W$ be an $m$-dimensional standard Wiener process and let $f(0, t)$,
$t \in [0, T]$ be a given Lipschitz continuous function of $t$. Then, under the equivalent
martingale measure, the stochastic processes $\{f^{(\epsilon)}(s, t) : 0 \le s \le t \le T\}$ are
solutions to the following stochastic integral equations:
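\[
\begin{aligned}
f^{(\epsilon)}(s, t) = f(0, t) &+ \epsilon^2 \int_0^s \sum_{i=1}^{m} \Big[ \sigma_i(f^{(\epsilon)}(v, t), v, t) \int_v^t \sigma_i(f^{(\epsilon)}(v, y), v, y)\,dy \Big]\,dv \\
&+ \epsilon \sum_{i=1}^{m} \int_0^s \sigma_i(f^{(\epsilon)}(v, t), v, t)\,dW_i(v); \qquad \epsilon \in [0, 1],
\end{aligned}
\tag{97}
\]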
where the volatility functions $\{\sigma_i(x, s, t);\ i = 1, \dots, m\}$ are smooth and satisfy the
regularity conditions which guarantee that the equation (97) has a unique strong
solution. Note that the drift term (the coefficient of the $dv$ term) of $f^{(\epsilon)}(s, t)$
depends on $\{f^{(\epsilon)}(v, y);\ 0 \le v < s,\ v \le y < t\}$. Moreover, the stochastic process
of the instantaneous short-term interest rate $r^{(\epsilon)}(t)$ is determined by the relation
$r^{(\epsilon)}(t) = f^{(\epsilon)}(t, t)$.
For this model, the approximations of the values of interest rate derivatives can
still be considered in a unified framework, by deriving asymptotic expansions
of the instantaneous forward rates as $\epsilon \downarrow 0$ and by using the relation between
the instantaneous forward rates and a zero-coupon bond price:
\[
P^{(\epsilon)}(t, T) = \exp\Big( -\int_t^T f^{(\epsilon)}(t, u)\,du \Big). \tag{98}
\]
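Relation (98) is straightforward to evaluate once a forward curve is given. The sketch below integrates a deterministic forward curve with the trapezoidal rule; the curve, the function name and the step count are illustrative assumptions, and in the model above the curve $f^{(\epsilon)}(t, \cdot)$ would itself be random.

```python
import math

def bond_price(forward_curve, t, maturity, n_steps=1000):
    """P(t, T) = exp(-int_t^T f(t, u) du), using the trapezoidal rule in u."""
    h = (maturity - t) / n_steps
    total = 0.5 * (forward_curve(t) + forward_curve(maturity))
    for i in range(1, n_steps):
        total += forward_curve(t + i * h)
    return math.exp(-total * h)

flat = bond_price(lambda u: 0.03, 0.0, 5.0)                # flat 3% curve
sloped = bond_price(lambda u: 0.02 + 0.01 * u, 0.0, 2.0)   # linear curve
print(flat, sloped)
```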
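follows:
\[
P^{(\epsilon)}(t, T) \sim \frac{P(0, T)}{P(0, t)} \Bigg\{ 1 - \epsilon \int_t^T f_1(t, u)\,du - \epsilon^2 \int_t^T f_2(t, u)\,du
 + \frac{1}{2} \epsilon^2 \Big( \int_t^T f_1(t, u)\,du \Big)^2 + \cdots \Bigg\} \quad \text{in } \mathbf{D}^{\infty}, \tag{101}
\]
\[
e^{-\int_0^T r^{(\epsilon)}(s)\,ds} \sim P(0, T) \Bigg\{ 1 - \epsilon \int_0^T f_1(t, t)\,dt - \epsilon^2 \int_0^T f_2(t, t)\,dt
 + \frac{1}{2} \epsilon^2 \Big( \int_0^T f_1(t, t)\,dt \Big)^2 + \cdots \Bigg\} \quad \text{in } \mathbf{D}^{\infty}, \tag{102}
\]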
Here, $\sigma_i^{(0)}(v, t) = \sigma_i(f^{(0)}(v, t), v, t)$, and $b^{(0)}(v, t)$ and $\partial \sigma_i^{(0)}(v, t)$ are defined as
\[
b^{(0)}(v, t) = \sum_{i=1}^{n} \sigma_i(f^{(0)}(v, t), v, t) \int_v^t \sigma_i(f^{(0)}(v, y), v, y)\,dy,
\]
Then, the payoff at maturity of the call option on a coupon bond is written as
\[
V_c(T) = \max\Big\{ \sum_{i=2}^{n} c_i X_t^{(\epsilon),i} - K,\ 0 \Big\}. \tag{105}
\]
So far, we have used stochastic models whose randomness is generated only by
Wiener processes. However, we are also able to apply the asymptotic expansion
approach to stochastic processes with jumps in their sample paths. This section
provides a very brief review. For the details, please see the cited papers.
From a mathematical viewpoint, Yoshida [120] presented an extension
of Watanabe theory to develop a framework for establishing the validity of asymptotic
expansions in Wiener-Poisson spaces, which can be applied to jump-diffusion models
under some regularity conditions. Hayashi [34] applied a Malliavin calculus of jump
type to prove an asymptotic expansion theorem for functionals of a Poisson random
measure, and Hayashi [35] derived the coefficients in the expansion of a call option
price under a pure jump model. Moreover, Hayashi and Ishikawa [36] proved an
asymptotic expansion formula for the compositions of a smooth Wiener-Poisson
functional with Schwartz distributions.
First, we define the model of the underlying asset prices and their volatility
processes, which is used for pricing European-type basket options. In particular,
suppose that the filtered probability space $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\}_{t \ge 0})$ is given, where $P$ is an
equivalent martingale measure and the filtration satisfies the usual conditions. The
risk-free interest rate is assumed to be a nonnegative constant $r$ for simplicity. Then,
$(S_t^i)_{t \in [0,T]}$ and $(\sigma_t^i)_{t \in [0,T]}$, $i = 1, \dots, d$ represent the underlying asset prices and
their volatilities for $t \in [0, T]$, respectively. In particular, let us assume that $S_T^i$ and
$\sigma_T^i$ are given by the solutions of the following stochastic integral equations:
\[
\begin{aligned}
S_T^i = s_0^i &+ \int_0^T \alpha^i S_{t-}^i\,dt + \int_0^T \phi_{S^i}\big( \sigma_{t-}^i, S_{t-}^i \big)\,dW_t^{S^i} \\
&+ \sum_{l=1}^{n} \Bigg( \sum_{j=1}^{N_{l,T}} h_{S^i,l,j}\, S_{\tau_{j,l}-}^i - \int_0^T \Lambda_l\, S_{t-}^i\, E[h_{S^i,l,j}]\,dt \Bigg),
\end{aligned}
\tag{107}
\]
\[
\begin{aligned}
\sigma_T^i = \sigma_0^i &+ \int_0^T \lambda^i (\theta^i - \sigma_{t-}^i)\,dt + \int_0^T \phi_{\sigma^i}\big( \sigma_{t-}^i \big)\,dW_t^{\sigma^i} \\
&+ \sum_{l=1}^{n} \Bigg( \sum_{j=1}^{N_{l,T}} h_{\sigma^i,l,j}\, \sigma_{\tau_{j,l}-}^i - \int_0^T \Lambda_l\, \sigma_{t-}^i\, E[h_{\sigma^i,l,j}]\,dt \Bigg),
\end{aligned}
\tag{108}
\]
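where $s_0^i$ and $\sigma_0^i$, $i = 1, \dots, d$ are given as some constants. The notations are defined
as follows:
• $\alpha^i$ ($i = 1, \dots, d$) are constants.
• $\lambda^i$ and $\theta^i$ ($i = 1, \dots, d$) are nonnegative constants.
• $\phi_{S^i}(x, y)$ and $\phi_{\sigma^i}(x)$ are some functions with appropriate regularity conditions.
• $W^{S^i}$ and $W^{\sigma^i}$ ($i = 1, \dots, d$) are correlated Brownian motions.
• We also define $\partial_x \Phi_S$ ($x = S$ or $\sigma$) as
\[
\partial_x \Phi_S := \begin{bmatrix}
\frac{\partial}{\partial x^1} (\Phi_S)_{1,1} & \cdots & \frac{\partial}{\partial x^1} (\Phi_S)_{1,2d} \\
\vdots & \ddots & \vdots \\
\frac{\partial}{\partial x^d} (\Phi_S)_{d,1} & \cdots & \frac{\partial}{\partial x^d} (\Phi_S)_{d,2d}
\end{bmatrix}, \tag{117}
\]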
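Next, let us define the payoff of a basket call option with strike price $K$ as
Then, for a strike price $K = g(S_T^{(0)}) - \epsilon y$ for an arbitrary $y \in \mathbf{R}$, the payoff of the
call option with maturity $T$ is expanded as follows:
\[
\begin{aligned}
\big( g(S_T^{(\epsilon)}) - K \big)^+ &= \epsilon \Bigg( \frac{g(S_T^{(\epsilon)}) - g(S_T^{(0)})}{\epsilon} + y \Bigg)^+ \\
&= \epsilon \Big( g(S_T^{(1)}) + \frac{\epsilon}{2}\, g(S_T^{(2)}) + y + o(\epsilon) \Big)^+ \\
&= \epsilon \big( g(S_T^{(1)}) + y \big)^+ + \frac{\epsilon^2}{2}\, 1_{\{g(S_T^{(1)}) > -y\}}\, g(S_T^{(2)}) + o(\epsilon^2).
\end{aligned}
\tag{128}
\]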
n
(0)
ξ{kl } := (kl − l T )m S,l ∗ ST (130)
l=1
and
⎛ ⎞
T
n kl
(0) (0) (0)
ŜT := eα(T −t) ∗ S σt , St d Zt + ⎝ γ S,l ∗ ζ S, j,l ∗ ST ⎠ . (131)
0 l=1 j=1
{k }
Here, T l is defined as follows:
T
{k } (0) (0) (0) (0)
T l := w ∗ eα(T −t) ∗ S σt , St w ∗ eα(T −t) ∗ S σt , St dt
0
n
(0) (0)
+ kl (w ∗ γ S,l ∗ ST ) ϑζ S,l (w ∗ γ S,l ∗ ST ), (133)
l=1
where ϑζ S,l stands for the correlation matrix of ζ S, j,l = (ζ S 1 , j,l , . . . , ζ S d , j,l ), and x
denotes the transpose of x.
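Next, we define
\[
\eta_2(x, \{k_l\}) = E\Big[ g(S_T^{(2)}) \,\Big|\, g(\hat S_T) = x,\ \{N_l = k_l\} \Big]. \tag{134}
\]
With those preparations, we approximate the expectation of the basket call payoff
under an equivalent martingale measure in the following way:
\[
\begin{aligned}
E\Big[ \big( g(S_T^{(\epsilon)}) - K \big)^+ \Big]
= &\ \epsilon\, E\Big[ E\Big[ \big( g(S_T^{(1)}) + y \big)^+ \,\Big|\, g(\hat S_T) = x,\ \{N_l = k_l\} \Big] \Big] \\
&+ \frac{\epsilon^2}{2}\, E\Big[ E\Big[ 1_{\{g(S_T^{(1)}) > -y\}}\, g(S_T^{(2)}) \,\Big|\, g(\hat S_T) = x,\ \{N_l = k_l\} \Big] \Big] + o(\epsilon^2).
\end{aligned}
\tag{135}
\]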
We also note that the probability of $\{N_l = k_l\} := \{N_{1,T} = k_1, \dots, N_{n,T} = k_n\}$
is expressed as
\[
p_{\{k_l\}} := \prod_{l=1}^{n} \frac{(\Lambda_l T)^{k_l}\, e^{-\Lambda_l T}}{k_l!}, \tag{136}
\]
which is the product of the probabilities that $N_{l,T}$ jumps $k_l$ times ($l = 1, \dots, n$),
that is $\prod_{l=1}^{n} P(\{N_{l,T} = k_l\})$, thanks to the independence of $N_{l,T}$ ($l = 1, \dots, n$).
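The weight (136) is a product of Poisson probabilities and can be computed directly. The following sketch uses hypothetical names and illustrative intensities; it evaluates the weight for given jump counts and checks that the weights over a truncated grid of counts sum to approximately one, which is the kind of truncation the outer sums over $k_l$ require in practice.

```python
import math
from itertools import product

def jump_count_probability(counts, intensities, maturity):
    """Product over l of (L_l T)^{k_l} exp(-L_l T) / k_l!, as in (136)."""
    prob = 1.0
    for k, intensity in zip(counts, intensities):
        mean = intensity * maturity
        prob *= mean ** k * math.exp(-mean) / math.factorial(k)
    return prob

intensities = (0.5, 1.2)   # illustrative jump intensities, n = 2
maturity = 2.0
no_jump = jump_count_probability((0, 0), intensities, maturity)
total = sum(jump_count_probability(ks, intensities, maturity)
            for ks in product(range(40), repeat=2))
print(no_jump, total)
```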
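Then, we calculate the coefficients of $\epsilon$ and $\frac{\epsilon^2}{2}$ on the right hand side of (135) as follows:
and the coefficient of $\frac{\epsilon^2}{2}$ is given by:
\[
E\Big[ E\Big[ 1_{\{g(S_T^{(1)}) > -y\}}\, g(S_T^{(2)}) \,\Big|\, g(\hat S_T) = x,\ \{N_l = k_l\} \Big] \Big]
= \sum_{k=0}^{\infty}\ \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}} \int_{-(g(\xi_{\{k_l\}}) + y)}^{\infty} \eta_2(x, \{k_l\})\, n\big( x; 0, \Sigma_T^{\{k_l\}} \big)\,dx. \tag{138}
\]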
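Then, the initial value $C(K, T)$ of the basket call option with maturity $T$ and
strike $K$ is expanded around $\epsilon = 0$ as follows:
\[
\begin{aligned}
C(K, T) = \sum_{k=0}^{\infty}\ \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}}\, e^{-rT} \Bigg\{ &\epsilon \int_{-y_{\{k_l\}}}^{\infty} (x + y_{\{k_l\}})\, n\big( x; 0, \Sigma_T^{\{k_l\}} \big)\,dx \\
&+ \frac{\epsilon^2}{2} \int_{-y_{\{k_l\}}}^{\infty} \eta_2(x, \{k_l\})\, n\big( x; 0, \Sigma_T^{\{k_l\}} \big)\,dx \Bigg\} + o(\epsilon^2),
\end{aligned}
\tag{139}
\]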
FBSDEs have become quite popular in the finance community since El Karoui,
Peng and Quenez [16], especially after the recent financial crises and the subsequent
highly volatile markets, which have led us to recognize the importance of counterparty
risk management, particularly credit value adjustments (CVA).
However, explicit solutions of FBSDEs are known only for simple
linear or quadratic examples. Although several techniques have been proposed in
the last decade, they seem very limited in practical applications since they rely on
numerical methods for non-linear partial differential equations (PDEs) or
regression-based Monte Carlo simulations, which are generally very difficult to implement or
quite time-consuming, especially for high-dimensional and long-horizon problems.
Recently, [25] has developed a simple analytical approximation scheme for
nonlinear FBSDEs, covering not only the so-called decoupled cases but also the
coupled cases. Fujii and Takahashi [25] has introduced a perturbation parameter
where the credit value adjustment (CVA) is taken into account. Roughly speaking,
considering a perturbed forward SDE X ε , ε ∈ (0, 1] and an associated backward
SDE (Y ε , Z ε ), they have the following recursive asymptotic expansion around some
non-degenerate Gaussian model X̄ 0 . That is, for k ≥ 0, N ≥ 1
(142)
where $Y_s^{\varepsilon,k,N,t,x} = u^{\varepsilon,k,N}(s, \bar X_s^{0,t,x})$ and $Z_s^{\varepsilon,k,N,t,x} = (\nabla_x u^{\varepsilon,k,N} \sigma)(s, \bar X_s^{0,t,x})$. Here,
the processes $\pi_{i,t}^0$ and $N_{i,t}^0$, $i = 1, \dots, N$ are the Malliavin weights and, in particular,
$N_{0,t}^0$ corresponds to the weight appearing in a representation theorem in Ma and Zhang
[63].
This subsection briefly describes the perturbation method following [25]. Firstly, let
us consider the following decoupled FBSDE:
SDE:
\[
dX_t = \gamma_0(X_t)\,dt + \gamma(X_t) \cdot dW_t; \qquad X_0 = x. \tag{144}
\]
Hereafter, we assume the appropriate regularity conditions that guarantee the
mathematical validity. For example, please see [104] on this point.
In order to approximate the pair $(V_t, Z_t)$ in terms of $X_t$, we extract the linear
term from the generator $f$ and treat the residual non-linear term as a perturbation
to the linear FBSDE. That is, let us introduce a perturbation parameter $\epsilon$, and then
write the equation as
We remark that, as in the previous asymptotic expansion cases, the residual part $g$
should be small for a precise approximation. Hence, one should choose the linear
term $c(X_t) V_t^{(\epsilon)}$ in such a way that the residual non-linear term $g$ becomes as small
as possible.
Now, we are going to expand the solution of BSDE (145) with respect to $\epsilon$. That
is, suppose $V_t^{(\epsilon)}$ and $Z_t^{(\epsilon)}$ are expanded as follows:
For illustrative purposes, let us show the first few steps of the expansion. For the zeroth
order of $\epsilon$, it is easily seen that $V_t^{(0)}$ is a solution to the following equation:
\[
V_T^{(0)} = \Psi(X_T). \tag{150}
\]
Then, $V_t^{(0)}$ can be represented as follows:
\[
V_t^{(0)} = E\Big[ e^{-\int_t^T c(X_s)\,ds}\, \Psi(X_T) \,\Big|\, \mathcal{F}_t \Big], \tag{151}
\]
which is equivalent to the value of a standard European contingent claim with the
terminal payoff $\Psi(X_T)$ and the discount rate $c(X_t)$ under a suitable pricing measure.
Clearly, $V_t^{(0)}$ is a function of $X_t$ due to the Markovian nature of the model. Moreover,
applying Itô’s formula (or the Malliavin derivative), we are able to obtain $Z_t^{(0)}$ as a
function of $X_t$ as well.
Next, let us consider the process $V^{(\epsilon)} - V^{(0)}$:
\[
\begin{aligned}
d\big( V_t^{(\epsilon)} - V_t^{(0)} \big) &= c(X_t) \big( V_t^{(\epsilon)} - V_t^{(0)} \big)\,dt
 - \epsilon\, g(X_t, V_t^{(\epsilon)}, Z_t^{(\epsilon)})\,dt + \big( Z_t^{(\epsilon)} - Z_t^{(0)} \big) \cdot dW_t, \\
V_T^{(\epsilon)} - V_T^{(0)} &= 0.
\end{aligned}
\tag{152}
\]
Now, by extracting the first order term in $\epsilon$, we can once again recover the linear
FBSDE:
\[
\begin{aligned}
dV_t^{(1)} &= c(X_t) V_t^{(1)}\,dt - g(X_t, V_t^{(0)}, Z_t^{(0)})\,dt + Z_t^{(1)} \cdot dW_t, \\
V_T^{(1)} &= 0,
\end{aligned}
\tag{153}
\]
which leads to
\[
V_t^{(1)} = E\Big[ \int_t^T e^{-\int_t^u c(X_s)\,ds}\, g(X_u, V_u^{(0)}, Z_u^{(0)})\,du \,\Big|\, \mathcal{F}_t \Big]. \tag{154}
\]
Even if it seems impossible to get the exact result, we can still obtain an analytic
approximation for $(V_t^{(i)}, Z_t^{(i)})$, again through the asymptotic expansion method.
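The first two orders can be checked on a toy problem whose exact solution is known. The sketch below assumes, purely for illustration, a constant $c$, terminal value 1 and $g(x, v, z) = -k v$, so that the perturbed BSDE has the deterministic solution $V_t^{(\epsilon)} = e^{-(c + \epsilon k)(T - t)}$ with $Z \equiv 0$; the zeroth order is taken from (151) and the first order from (154). All names and parameter values are hypothetical.

```python
import math

def v0(t, T, c):
    """Zeroth order (151) with constant c and terminal value 1."""
    return math.exp(-c * (T - t))

def v1(t, T, c, k, n_steps=2000):
    """Midpoint rule for (154) with g(x, v, z) = -k * v and deterministic V^(0)."""
    h = (T - t) / n_steps
    total = 0.0
    for i in range(n_steps):
        u = t + (i + 0.5) * h
        total += math.exp(-c * (u - t)) * (-k * v0(u, T, c))
    return total * h

def exact(t, T, c, k, eps):
    """Exact solution of the toy perturbed BSDE."""
    return math.exp(-(c + eps * k) * (T - t))

def first_order_error(eps, t=0.0, T=1.0, c=0.03, k=0.5):
    return exact(t, T, c, k, eps) - (v0(t, T, c) + eps * v1(t, T, c, k))

print(first_order_error(0.1), first_order_error(0.05))
```

Halving $\epsilon$ reduces the error by roughly a factor of four, consistent with a remainder of order $\epsilon^2$.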
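We are able to treat this case in a similar way to the decoupled case by introducing
perturbations to the forward SDE in addition to the one in the BSDE:
\[
\begin{aligned}
dV_t^{(\epsilon)} &= c(t, X_t^{(\epsilon)}) V_t^{(\epsilon)}\,dt - \epsilon\, g\big( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \big)\,dt + Z_t^{(\epsilon)} \cdot dW_t, \\
V_T^{(\epsilon)} &= \Psi\big( X_T^{(\epsilon)} \big), \\
dX_t^{(\epsilon)} &= \Big( r\big( t, X_t^{(\epsilon)} \big) + \epsilon\, \mu\big( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \big) \Big)\,dt \\
&\quad + \Big( \sigma\big( t, X_t^{(\epsilon)} \big) + \epsilon\, \eta\big( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \big) \Big) \cdot dW_t.
\end{aligned}
\]
We also note that a similar method can be applied to the coupled case under a PDE
(partial differential equation) formulation based on the so-called four step scheme
(e.g. Ma-Yong [62]). Please see [25] for the details. Establishing the mathematical
validity of the scheme for the coupled case will be one of the research topics in the
future.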
This subsection briefly introduces a new scheme proposed by Fujii and Takahashi [27]. Except in the cases where fully closed-form expressions are available, the higher-order expansions of perturbed FBSDEs generally contain multi-dimensional time integrations of expectation values due to the convoluted nature of the scheme, which makes standard Monte Carlo simulations too time-consuming. To avoid nested simulations, one can apply a particle representation inspired by the ideas of branching diffusion models (e.g. Fujita [23], Ikeda, Nagasawa and Watanabe [40–42], McKean [69], Nagasawa and Sirao [70]). We are then able to provide a straightforward simulation scheme to solve nonlinear FBSDEs at each order of the approximation based on the perturbation. In particular, compared to a direct application of the branching diffusion method, the method is expected to be less numerically intensive because, thanks to the expansion of the perturbed generator, the system of interest is already decomposed into a set of linear problems. We illustrate the outline of the method by following [27].
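Again, let us introduce a perturbation parameter $\epsilon$ in the generator of a BSDE as follows:
$$ \begin{cases} dV_s^{(\epsilon)} = -\epsilon f(X_s, V_s^{(\epsilon)}, Z_s^{(\epsilon)})\,ds + Z_s^{(\epsilon)} \cdot dW_s \\ V_T^{(\epsilon)} = \Psi(X_T), \end{cases} \tag{158} $$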
Next, let us fix the initial time as $t$. We denote the Malliavin derivative of $X_u$ ($u \ge t$) at time $t$ as
$$ D_t X_u \in \mathbb{R}^{r \times d}. \tag{160} $$
Let us also note that, in terms of the future time $u$, the SDE of $(Y_{t,u})^i_j$ defined by $(Y_{t,u})^i_j = \partial_{x_t^j} X_u^i$ is given in the following:
where $\partial_k$ denotes the partial differentiation with respect to the $k$th component of $X$, and $\delta^i_j$ stands for the Kronecker delta. Here, $i$ and $j$ run through $\{1, \ldots, d\}$ and $a$ runs through $\{1, \ldots, r\}$, and we adopt the Einstein notation, which assumes summation over all paired indices. Then, it is well-known that
Then, it is clear that they can be evaluated by standard Monte Carlo simulations.
However, for their use in higher order approximations, it is crucial to obtain analytical
(closed form) approximate expressions for these two quantities, for example based
on the asymptotic expansion technique as before.
In the following, let us suppose that we have obtained the solutions up to a given order of the asymptotic expansion, and write each of them as a function of $x_t$:
$$ \begin{cases} V_t^{(0)} = v^{(0)}(x_t) \\ Z_t^{(0)} = z^{(0)}(x_t). \end{cases} \tag{164} $$
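Next, for the first-order coefficient in $\epsilon$, $V_t^{(1)}$, we obtain the expression
$$ V_t^{(1)} = \int_t^T E\left[ f(X_u, V_u^{(0)}, Z_u^{(0)}) \,\Big|\, \mathcal{F}_t \right] du = \int_t^T E\left[ f\left( X_u, v^{(0)}(X_u), z^{(0)}(X_u) \right) \Big|\, \mathcal{F}_t \right] du. \tag{165} $$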
Then, we define a new process for $s > t$ by introducing a deterministic positive process $\lambda_t$ as follows:
$$ \hat{V}_{ts}^{(1)} = e^{\int_t^s \lambda_u\,du}\, V_s^{(1)}. \tag{166} $$
Here, $\lambda_t$ can be a positive constant in the simplest case. Then, for the fixed initial time $t$, its SDE is given by
$$ d\hat{V}_{ts}^{(1)} = \lambda_s \hat{V}_{ts}^{(1)}\,ds - \lambda_s \hat{f}_{ts}(X_s, v^{(0)}(X_s), z^{(0)}(X_s))\,ds + e^{\int_t^s \lambda_u\,du}\, Z_s^{(1)} \cdot dW_s, $$
where
$$ \hat{f}_{ts}(x, v^{(0)}(x), z^{(0)}(x)) = \frac{1}{\lambda_s}\, e^{\int_t^s \lambda_u\,du}\, f(x, v^{(0)}(x), z^{(0)}(x)). $$
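Since we have $\hat{V}_{tt}^{(1)} = V_t^{(1)}$, one can easily see that the following relation holds:
$$ V_t^{(1)} = E\left[ \int_t^T e^{-\int_t^u \lambda_s\,ds}\, \lambda_u\, \hat{f}_{tu}(X_u, v^{(0)}(X_u), z^{(0)}(X_u))\,du \,\Big|\, \mathcal{F}_t \right]. \tag{167} $$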
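As a quick consistency check, (167) reduces to (165) once the definition of $\hat{f}_{tu}$ is substituted. The step below is only a sketch written out for convenience; it uses the symbols already defined and assumes that the expectation and the time integral may be interchanged.

```latex
\begin{align*}
e^{-\int_t^u \lambda_s\,ds}\,\lambda_u\,\hat{f}_{tu}\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr)
  &= e^{-\int_t^u \lambda_s\,ds}\,\lambda_u \cdot \frac{1}{\lambda_u}\,
     e^{\int_t^u \lambda_s\,ds}\, f\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr) \\
  &= f\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr).
\end{align*}
```

Hence the integrand in (167), evaluated at $x = X_u$, coincides with the integrand in (165) for any choice of the deterministic positive intensity $\lambda$.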
As in standard credit risk modeling (e.g. Bielecki-Rutkowski [6]), this is the present value of a default payment, where the default intensity is $\lambda_s$ and the payoff upon default at time $s$ ($> t$) is $\hat{f}_{ts}(X_s, v^{(0)}(X_s), z^{(0)}(X_s))$. Thus, we obtain the following proposition.
Proposition 3 The $V_t^{(1)}$ in (165) can be equivalently expressed as
$$ V_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, \hat{f}_{t\tau}\left( X_\tau, v^{(0)}(X_\tau), z^{(0)}(X_\tau) \right) \Big|\, \mathcal{F}_t \right]. \tag{168} $$
Here $\tau$ is the interaction time, where the interaction is drawn independently from a Poisson process with an arbitrary deterministic positive intensity $\lambda_t$. $\hat{f}$ is defined as
$$ \hat{f}_{ts}(x, v^{(0)}(x), z^{(0)}(x)) = \frac{1}{\lambda_s}\, e^{\int_t^s \lambda_u\,du}\, f(x, v^{(0)}(x), z^{(0)}(x)). \tag{169} $$
Now, let us consider the first-order coefficient in $\epsilon$ of $Z^{(\epsilon)}$, that is, the component $Z^{(1)}$. It can be expressed as
$$ Z_t^{(1)} = \int_t^T E\left[ D_t f\left( X_u, v^{(0)}(X_u), z^{(0)}(X_u) \right) \Big|\, \mathcal{F}_t \right] du. \tag{170} $$
Firstly, we observe that the SDE of the Malliavin derivative of $V^{(1)}$ is given as follows:
$$ d(D_t V_s^{(1)}) = -(D_t X_s^i)\,\nabla_i(X_s, v^{(0)}, z^{(0)}) f(X_s, v^{(0)}, z^{(0)})\,ds + (D_t Z_s^{(1)}) \cdot dW_s; \qquad D_t V_t^{(1)} = Z_t^{(1)}, \tag{171} $$
where
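Then, we define, for $s > t$, $\widehat{D_t V_s^{(1)}}$ as
$$ \widehat{D_t V_s^{(1)}} = e^{\int_t^s \lambda_u\,du}\, (D_t V_s^{(1)}). \tag{174} $$
Its dynamics are given by
$$ d\left( \widehat{D_t V_s^{(1)}} \right) = \lambda_s \left( \widehat{D_t V_s^{(1)}} \right) ds - \lambda_s (D_t X_s^i)\,\nabla_i(X_s, v^{(0)}, z^{(0)}) \hat{f}_{ts}(X_s, v^{(0)}, z^{(0)})\,ds + e^{\int_t^s \lambda_u\,du}\, (D_t Z_s^{(1)}) \cdot dW_s, \tag{175} $$
$$ \widehat{D_t V_t^{(1)}} = Z_t^{(1)}. \tag{176} $$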
Hence,
$$ Z_t^{(1)} = E\left[ \int_t^T e^{-\int_t^u \lambda_s\,ds}\, \lambda_u\, (D_t X_u^i)\,\nabla_i(X_u, v^{(0)}, z^{(0)}) \hat{f}_{tu}(X_u, v^{(0)}, z^{(0)})\,du \,\Big|\, \mathcal{F}_t \right]. \tag{177} $$
Thus, following the same argument as for the previous proposition, we have the next result:
Proposition 4 $Z_t^{(1)}$ in (170) is equivalently expressed as
$$ Z_t^{(1),a} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, (Y_{t,\tau}\gamma(X_\tau))^i_a\, \nabla_i(X_\tau, v^{(0)}, z^{(0)}) \hat{f}_{t\tau}(X_\tau, v^{(0)}, z^{(0)}) \,\Big|\, \mathcal{F}_t \right], \tag{178} $$
where the definitions of the random time $\tau$ and the positive deterministic process $\lambda$ are the same as those in the previous proposition.
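Now, we are able to obtain a new Monte Carlo scheme. That is, we have a new particle interpretation of $(V^{(1)}, Z^{(1)})$ as follows:
$$ V_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, \hat{f}_{t\tau}\left( X_\tau, v^{(0)}, z^{(0)} \right) \Big|\, \mathcal{F}_t \right] \tag{179} $$
$$ Z_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, (Y_{t,\tau}\gamma(X_\tau))^i\, \nabla_i(X_\tau, v^{(0)}, z^{(0)}) \hat{f}_{t\tau}(X_\tau, v^{(0)}, z^{(0)}) \,\Big|\, \mathcal{F}_t \right], \tag{180} $$
which allows an efficient time integration with the following Monte Carlo scheme: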
• Run the diffusion processes of $X$ and $Y$.
• Carry out a Poisson draw with probability $\lambda_s \Delta s$ at each time step $s$ and, if “one” is drawn, set that time as $\tau$.
• Then store the relevant quantities at $\tau$, or store 0 in the case $\tau > T$.
• Repeat the above procedure and take the average over the samples.
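The steps above can be sketched in a few lines of Python for the value component (179). This is a minimal illustration under assumptions that are not part of the text: $X$ is taken to be a one-dimensional standard Brownian motion, the intensity $\lambda$ is a positive constant, the initial time is $t = 0$, and the driver depends on $x$ only. The function name `first_order_value` and its parameters are hypothetical.

```python
import numpy as np

def first_order_value(f, x0, T, lam, n_steps, n_paths, seed=0):
    """Estimate V_0^(1) = int_0^T E[f(X_u)] du with the interaction-time scheme.

    Assumptions for this sketch: X is a one-dimensional standard Brownian
    motion started at x0, and the intensity lam is a positive constant.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, float(x0))          # current state of each path
    alive = np.ones(n_paths, dtype=bool)     # paths with no interaction yet
    stored = np.zeros(n_paths)               # 0 is kept when tau > T
    for k in range(n_steps):
        s = k * dt
        # Poisson draw: an interaction occurs with probability lam * dt.
        hit = alive & (rng.random(n_paths) < lam * dt)
        # Store f_hat = exp(lam * (tau - t)) / lam * f(X_tau), here with t = 0.
        stored[hit] = np.exp(lam * s) / lam * f(x[hit])
        alive &= ~hit
        # Advance the forward diffusion by one step (exact for Brownian motion).
        x += np.sqrt(dt) * rng.standard_normal(n_paths)
    # Sample average over all paths approximates the expectation in (179).
    return stored.mean()

estimate = first_order_value(lambda x: x**2, x0=0.0, T=1.0, lam=1.0,
                             n_steps=100, n_paths=200_000)
print(estimate)  # compare with int_0^1 u du = 0.5, up to Monte Carlo and time-step error
```

For this toy driver $f(x) = x^2$ the exact value is $\int_0^T u\,du = T^2/2$, which gives a simple benchmark. Only a single random time per path is needed, so no nested simulation or explicit time integration is required.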
Finally, we remark that the higher order coefficients in the expansions are evaluated in a similar way. Please see [27] for the details.
9 Conclusion
The present note has reviewed an asymptotic expansion approach in finance, particularly in terms of computational problems arising in the practice of financial derivatives. However, due to limitations of space, we have not provided thorough explanations, especially for recent progress such as improvement schemes in Sect. 5, expansion methods in jump and jump-diffusion models in Sect. 7 and perturbation schemes in forward backward stochastic differential equations (FBSDEs) in Sect. 8. Please see the cited papers for the details.
Moreover, we have not introduced an application of the method to mean-variance hedging problems in partially observable markets, which is an interesting topic as an application of stochastic filtering problems in finance. Please see [29] for the details.
References
1. Alòs, E., Eydeland, A., Laurence, P.: A Kirk’s and a Bachelier’s formula for three asset spread
options. Energy Risk 09(2011), 52–57 (2011)
2. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol
baskets. Commun. Pure Appl. Math. (2013). Published online 9 October
3. Ben Arous, G., Laurence, P.: Second order expansion for implied volatility in two factor
local stochastic volatility models and applications to the dynamic λ-SABR model. In: Friz,
P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large Deviations and
Asymptotic Methods in Finance. Springer Proceedings in Mathematics and Statistics, vol.
110. Springer, Berlin (2009)
4. Benaim, S., Friz, P., Lee, R.: On Black-Scholes implied volatility at extreme strikes. In:
Cont, R. (ed.) Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling. Wiley,
Hoboken (2008)
5. Bichteler, K., Gravereaux, J.-B., Jacod, J.: Malliavin Calculus for Processes with Jumps.
Stochastic Monographs. Gordon and Breach Science Publishers, New York (1987)
6. Bielecki, T., Rutkowski, M.: Credit Risk: Modeling, Valuation and Hedging. Springer, Berlin
(2000)
7. Brace, A., Gatarek, D., Musiela, M.: The market model of interest rate dynamics. Math.
Financ. 7, 127–155 (1997)
8. Carr, P., Jarrow, R., Myneni, R.: Alternative characterizations of American put options. Math.
Financ. 2, 87–106 (1992)
9. Col, A.D., Gnoatto, A., Grasselli, M.: Smiles all around: FX joint calibration in a multi-Heston
model. J. Bank. Financ. 37(10), 3799–3818 (2013)
10. Cox, J.: Notes on option pricing I: constant elasticity of diffusions. Unpublished draft, Stanford
University (1975)
11. Davydov, D., Linetsky, V.: Pricing options on scalar diffusions: an eigenfunction expansion
approach. Oper. Res. 51, 185–209 (2003)
12. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffu-
sions and stochastic volatility I: theoretical foundations. Commun. Pure Appl. Math. 67–1,
321–350 (2014)
13. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for dif-
fusions and stochastic volatility II: applications. Commun. Pure Appl. Math. 67–2, 40–82
(2014)
14. Deutsch, F.: Best Approximation in Inner Product Spaces. Springer, New York (2001)
15. Doust, P.: No-arbitrage SABR. J. Comput. Financ. 15(3), 3–31 (2012)
16. El Karoui, N., Peng, S.G., Quenez, M.C.: Backward stochastic differential equations in
finance. Math. Financ. 7, 1–71 (1997)
17. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3, 690–708 (2012)
18. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
Int. J. Theor. Appl. Financ. 12(6), 861–876 (2009)
19. Foschi, P.P., Pagliarani, S., Pascucci, A.: Approximations for Asian options in local volatility
models. J. Comput. Appl. Math. 237, 442–459 (2013)
20. Fouque, J.-P., Papanicolaou, G., Sircar, K.R.: Financial modeling in a fast mean-reverting
stochastic volatility environment. Asia-Pac. Financ. Mark. 6(1), 37–48 (1999)
21. Fouque, J.-P., Papanicolaou, G., Sircar, K.R.: Derivatives in Financial Markets with Stochastic
Volatility. Cambridge University Press, Cambridge (2000)
22. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11(8), 1151–1164 (2011)
23. Fujita, H.: On the blowing up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$. J.
Fac. Sci. Univ. Tokyo 13, 109–124 (1966)
24. Fujii, M.: Momentum-space approach to asymptotic expansion for stochastic filtering. Ann.
Inst. Stat. Math. 66(1) (2012)
25. Fujii, M., Takahashi, A.: Analytical approximation for non-linear FBSDEs with perturbation
scheme. Int. J. Theor. Appl. Financ. 15(5) (2012)
26. Fujii, M., Takahashi, A.: Perturbative expansion of FBSDE in an incomplete market with
stochastic volatility. Q. J. Financ. 2(3) (2012)
27. Fujii, M., Takahashi, A.: Perturbative expansion technique for non-linear FBSDEs with inter-
acting particle method. Asia-Pacific Finan. Markets (2015)
28. Fujii, M., Sato, S., Takahashi, A.: An FBSDE approach to American option pricing with an
interacting particle method. CARF-F-302 (2012)
29. Fujii, M., Takahashi, A.: Making mean-variance hedging implementable in a partially observ-
able market. Quant. Financ. 14(10), 1709–1724 (2014)
30. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatil-
ity in local volatility models. Math. Financ. 22(4), 591–620 (2012)
31. Gnoatto, A., Grasselli, M.: An affine multi-currency model with stochastic volatility and
stochastic interest rates. SIAM J. Financ. Math. 5(1), 493–531 (2014)
32. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1(1), 609–641 (2011)
33. Hagan, P.S., Kumar, D., Lesniewski, A.S., Woodward, D.E.: Managing smile risk. Wilmott
Mag. 15, 84–108 (2002)
34. Hayashi, M.: Asymptotic expansions for functionals of a Poisson random measure. J. Math.
Kyoto Univ. 48(1), 91–132 (2008)
35. Hayashi, M.: Coefficients of asymptotic expansions of SDE with jumps. Asia-Pac. Financ.
Mark. 17(4), 373–380 (2010)
36. Hayashi, M., Ishikawa, Y.: Composition with distributions of Wiener-Poisson variables and
its asymptotic expansion. Mathematische Nachrichten 285(5–6), 619–658 (2011)
37. Heath, D., Jarrow, R., Morton, A.: Bond pricing and the term structure of interest rates: a new
methodology for contingent claims valuation. Econometrica 60, 77–105 (1992)
38. Henry-Labordère, P.: Analysis, Geometry and Modeling in Finance: Advanced Methods in
Options Pricing. Chapman and Hall, Boca Raton (2008)
39. Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes, 2nd edn.
North-Holland/Kodansha, Tokyo (1989)
40. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes. Proc. Jpn. Acad. 41,
816–821 (1965)
41. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes. Proc. Jpn. Acad. 42,
252–257, 370–375, 380–384, 719–724, 1016–1021, 1022–1026 (1966)
42. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes I(II). J. Math. Kyoto
Univ. 8, 233–278, 365–410 (1968)
43. Jamshidian, F.: LIBOR and Swap market models and measures. Financ. Stoch. 1, 293–330
(1997)
44. Kawai, A.: A new approximate Swaption formula in the LIBOR market model: an asymptotic
expansion approach. Appl. Math. Financ. 10, 49–74 (2003)
45. Kobayashi, T., Takahashi, A., Tokioka, N.: Dynamic optimality of yield curve strategies. Int.
Rev. Financ. 4, 49–78 (2003) (published in 2005)
46. Kato, T., Takahashi, A., Yamada. T.: A semi-group expansion for pricing barrier options. Int.
J. Stoch. Anal. 2014(268086) (2014)
47. Kato, T., Takahashi, A., Yamada. T.: An asymptotic expansion formula for up-and-out barrier
option price under stochastic volatility model. JSIAM Lett. 5, 17–20 (2013)
48. Kunitomo, N., Takahashi, A.: Pricing average options. Jpn. Financ. Rev. 14, 1–20 (1992). (in
Japanese)
49. Kunitomo, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11, 117–151 (2001)
50. Kunitomo, N., Takahashi, A.: On validity of the asymptotic expansion approach in contingent
claim analysis. Ann. Appl. Probab. 13(3), 914–952 (2003)
51. Kunitomo, N., Takahashi, A.: Applications of the asymptotic expansion approach based on
Malliavin-Watanabe calculus in financial problems. Stochastic Processes and Applications to
Mathematical Finance, pp. 195–232 (2004)
52. Kusuoka, S.: Malliavin calculus revisited. J. Math. Sci. Univ. Tokyo 10, 261–277 (2003)
53. Kusuoka, K.: Approximation of expectation of diffusion process and mathematical finance.
Taniguchi Conference on Mathematics, Nara, 1998. Advanced Studies in Pure Mathematics,
vol. 31, pp. 147–165. Mathematical Society of Japan, Tokyo (2001)
54. Kusuoka, K.: Approximation of expectation of diffusion process based on Lie algebra and
Malliavin calculus. Adv. Math. Econ. 6, 69–83 (2004)
55. Kusuoka S., Stroock, D.: Applications of the Malliavin Calculus Part I. Stochastic Analysis
(Katata/Kyoto 1982), pp. 271–306 (1984)
56. Kusuoka, S., Stroock, D.: Precise asymptotics of certain Wiener functionals. J. Funct. Anal.
99, 1–74 (1991)
57. Kusuoka, S., Osajima, Y.: A remark on the asymptotic expansion of density function of Wiener
functionals. J. Funct. Anal. 255(9), 2545–2562 (2007)
58. Lee, R.: The moment formula for implied volatility at extreme. Math. Financ. 14(3), 469–480
(2004)
59. Li, C.: Closed-form expansion, conditional expectation, and option valuation. Math. Oper.
Res. 39(2), 487–516 (2014)
60. Lipton, A.: Mathematical Methods for Foreign Exchange: A Financial Engineer’s Approach.
World Scientific Publication, Singapore (2001)
61. Linetsky, V.: Spectral expansions for Asian (average price) options. Oper. Res. 52, 856–867
(2004)
62. Ma, J., Yong, J.: Forward-Backward Stochastic Differential Equations and Their Applications.
Springer, Berlin (2000)
63. Ma, J., Zhang, J.: Representation theorem of backward stochastic differential equations. Ann.
Appl. Probab. 12(4), 1390–1418 (2002)
64. Malliavin, P.: Stochastic Analysis. Springer, Berlin (1997)
65. Malliavin, P., Thalmaier, A.: Stochastic Calculus of Variations in Mathematical Finance.
Springer, Berlin (2006)
66. Matsuoka, R., Takahashi, A., Uchida, Y.: A new computational scheme for computing greeks
by the asymptotic expansion approach. Asia-Pac. Financ. Mark. 11, 393–430 (2004)
67. Muroi, Y.: Pricing contingent claims with credit risk: asymptotic expansion approach. Financ.
Stoch. 9(3), 415–427 (2005)
68. Matsuoka, R., Takahashi, A.: An asymptotic expansion approach to computing Greeks. FSA
Res. Rev. 2005, 72–108 (2005)
69. McKean, H.P.: Application of Brownian motion to the equation of Kolmogorov-Petrovskii-
Piskunov. Commun. Pure Appl. Math. 28, 323–331 (1975)
70. Nagasawa, M., Sirao, T.: Probabilistic treatment of the blowing up of solutions for a nonlinear
integral equation. Trans. Am. Math. Soc. 139, 301–310 (1969)
71. Nishiba, M.: Pricing exotic options and American options: a multidimensional asymptotic
expansion approach. Asia-Pac. Financ. Mark. 20(2), 147–182 (2013)
72. Nualart, D.: The Malliavin Calculus and Related Topics. Springer, Berlin (1995)
73. Nualart, D., Üstünel, A.S., Zakai, M.: On the moments of a multiple Wiener-Itô integral and
the space induced by the polynomials of the integral. Stochastics 25, 233–340 (1988)
74. Ocone, D., Karatzas, I.: A generalized clark representation formula, with application to opti-
mal portfolios. Stoch. Stoch. Rep. 34, 187–220 (1991)
75. Osajima, Y.: The asymptotic expansion formula of implied volatility for dynamic SABR model
and FX hybrid model. Preprint, Graduate School of Mathematical Sciences, The University
of Tokyo (2006)
76. Osajima, Y.: General asymptotics of wiener functionals and application to mathematical
finance. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large
Deviations and Asymptotic Methods in Finance Springer Proceedings in Mathematics and
Statistics, vol. 110 (2015)
77. Pagliarani, S., Pascucci, A.: Local stochastic volatility with jumps. Int. J. Theor. Appl. Financ
16(8), 1350050 (2013)
78. Shiraya, K., Takahashi, A.: Pricing average options on commodities. J. Futures Mark. 31(5),
407–439 (2011)
79. Shiraya, K., Takahashi, A.: Pricing multi-asset cross currency options. J. Futures Mark. 34(1),
1–19 (2014)
80. Shiraya, K., Takahashi, A.: Pricing basket options under local stochastic volatility with jumps.
CARF-F-336 (2013)
81. Shiraya, K., Takahashi, A., Toda, M.: Pricing barrier and average options under stochastic
volatility environment. J. Comput. Financ. 15(2), 111–148 (2011)
82. Shiraya, K., Takahashi, A., Yamazaki, A.: Pricing swaptions under the LIBOR market model
of interest rates with local-stochastic volatility models. Wilmott 2011(54), 61–73 (2011)
83. Shiraya, K., Takahashi, A., Yamada, T.: Pricing discrete barrier options under stochastic
volatility. Asia-Pac. Financ. Mark. 19(3), 205–232 (2012)
84. Siopacha, M., Teichmann, J.: Weak and strong Taylor methods for numerical solutions of
stochastic differential equations. Quant. Financ. 11(4), 517–528 (2011)
85. Takahashi, A.: Essays on the valuation problems of contingent claims. Unpublished Ph.D.
Dissertation, Haas School of Business, University of California, Berkeley (1995)
86. Takahashi, A.: An asymptotic expansion approach to pricing contingent claims. Asia-Pac.
Financ. Mark. 6, 115–151 (1999)
87. Takahashi, A.: On an asymptotic expansion approach to numerical problems in finance.
Selected Papers on Probability and Statistics, pp. 199–217. American Mathematical Soci-
ety (2009)
88. Takahashi, A., Matsushima, S.: Monte Carlo simulation with an asymptotic expansion in HJM
framework. FSA Research Review 2004, pp. 82–103. Financial Services Agency (2004)
89. Takahashi, A., Saito, T.: An asymptotic expansion approach to pricing American options.
Monet. Econ. Stud. 22, 35–87 (2003). (in Japanese)
90. Takahashi, A., Takehara, K.: An asymptotic expansion approach to currency options with a
market model of interest rates under stochastic volatility processes of spot exchange rates.
Asia-Pac. Financ. Mark. 14, 69–121 (2007)
91. Takahashi, A., Takehara, K.: Fourier transform method with an asymptotic expansion
approach: an applications to currency options. Int. J. Theor. Appl. Financ. 11(4), 381–401
(2008)
92. Takahashi, A., Takehara, K.: A hybrid asymptotic expansion scheme: an application to currency
options. Working paper, CARF-F-116, The University of Tokyo,
http://www.carf.e.u-tokyo.ac.jp/workingpaper/ (2008)
93. Takahashi, A., Takehara, K.: A hybrid asymptotic expansion scheme: an application to long-
term currency options. Int. J. Theor. Appl. Financ. 13(8), 1179–1221 (2010)
94. Takahashi, A., Takehara, K.: Asymptotic expansion approaches in finance: applications to
currency options. Finance and Banking Developments, pp. 185–232. Nova Science Publishers,
New York (2010)
95. Takahashi, A., Takehara, K., Toda, M.: Computation in an asymptotic expansion method.
CARF-F-149 (2009)
96. Takahashi, A., Takehara, K., Toda, M.: A general computation scheme for a high-order asymp-
totic expansion method. Int. J. Theor. Appl. Financ. 15(6) (2012)
97. Takahashi, A., Toda, M.: Note on an extension of an asymptotic expansion scheme. Int. J.
Theor. Appl. Financ. 16(5), 1350031-1–1350031-23 (2013)
98. Takahashi, A., Tsuzuki, Y.: A new improvement scheme for approximation methods of prob-
ability density functions. CARF-F-350. Forthcoming in J. Comput. Financ. (2013)
99. Takahashi, A., Uchida, Y.: New acceleration schemes with the asymptotic expansion in Monte
Carlo simulation. Adv. Math. Econ. 8, 411–431 (2006)
100. Takahashi, A., Yamada, T.: An asymptotic expansion with push-down of Malliavin weights.
SIAM J. Financ. Math. 3, 95–136 (2012)
101. Takahashi, A., Yamada, T.: A remark on approximation of the solutions to partial differential
equations in finance. Recent Adv. Financ. Eng. 2011, 133–181 (2011)
102. Takahashi, A., Yamada, T.: An asymptotic expansion for forward-backward SDEs: a Malliavin
calculus approach. CARF-F-296 (2012)
103. Takahashi, A., Yamada, T.: On error estimates for asymptotic expansions with Malliavin
weights—application to stochastic volatility model-. CARF-F-324. Forthcoming in Math.
Oper. Res. (2013)
104. Takahashi, A., Yamada, T.: An asymptotic expansion for forward-backward SDEs with a
perturbed driver. CARF-F-326 (2013)
105. Takahashi, A., Yamada, T.: A weak approximation with asymptotic expansion and multidi-
mensional Malliavin weights. CARF-F-335. Forthcoming in Ann. Appl. Probab. (2013)
106. Takahashi, A., Yoshida, N.: An asymptotic expansion scheme for optimal investment prob-
lems. Stat. Inference Stoch. Process. 7(2), 153–188 (2004)
107. Takahashi, A., Yoshida, N.: Monte Carlo simulation with asymptotic method. J. Jpn. Stat.
Soc. 35(2), 171–203 (2005)
108. Takehara, K., Takahashi, A., Toda, M.: New unified computation algorithm in a high-order
asymptotic expansion scheme. In: Recent Advances in Financial Engineering (The Proceed-
ings of KIER-TMU International Workshop on Financial Engineering 2009), pp. 231–251
(2010)
109. Takehara, K., Toda, M., Takahashi, A.: Application of a high-order asymptotic expansion
scheme to long-term currency options. Int. J. Bus. Financ. Res. 5(3), 87–100 (2011)
110. Violante, S.P.N.: Asymptotics of Wiener functionals and applications to mathematical finance.
Ph.D. Thesis, Department of Mathematics, Imperial College London (2012)
111. Watanabe, S.: Analysis of Wiener functionals (Malliavin calculus) and its applications to heat
kernels. Ann. Probab. 15, 1–39 (1987)
112. Xu, G., Zheng, H.: Basket options valuation for a local volatility jump-diffusion model with
the asymptotic expansion method. Insur. Math. Econ. 47(3), 415–422 (2010)
113. Xu, G., Zheng, H.: Lower bound approximation to basket option values for local volatility
jump-diffusion models. Int. J. Theor. Appl. Financ. 17, 1–15 (2014)
114. Yamamoto, K., Sato, S., Takahashi, A.: Probability distribution and option pricing for draw-
down in a stochastic volatility environment. Int. J. Theor. Appl. Financ. 13(2), 335–354 (2010)
115. Yamamoto, K., Takahashi, A.: A remark on a singular perturbation method for option pricing
under a stochastic volatility model. Asia-Pac. Financ. Mark. 16(4), 333–345 (2009)
116. Yamanobe, T.: Stochastic phase transition operator. Phys. Rev. E 84, 011924 (2011)
117. Yamanobe, T.: Global dynamics of a stochastic neuronal oscillator. Phys. Rev. E 88, 052709
(2013)
118. Yoshida, N.: Asymptotic expansion for small diffusions via the theory of Malliavin-Watanabe.
Probab. Theor. Relat. Fields 92, 275–311 (1992)
119. Yoshida, N.: Asymptotic expansions for statistics related to small diffusions. J. Jpn. Stat. Soc.
22, 139–159 (1992)
120. Yoshida, N.: Conditional expansions and their applications. Stoch. Process. Appl. 107, 53–81
(2003)
121. Zariphopoulou, T.: A solution approach to valuation with unhedgeable risks. Financ. Stoch.
5, 61–82 (2001)
On Small Time Asymptotics for Rough
Differential Equations Driven by Fractional
Brownian Motions
Abstract We survey existing results concerning the study in small times of the
density of the solution of a rough differential equation driven by fractional Brown-
ian motions. We also slightly improve existing results and discuss some possible
applications to mathematical finance.
The first author of this research was supported in part by NSF Grant DMS 0907326.
F. Baudoin (✉)
Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
e-mail: [email protected]
C. Ouyang
Department of Mathematics, Statistics and Computer Science, University of Illinois
at Chicago, Chicago, IL 60607, USA
e-mail: [email protected]
1 Introduction
In this paper, our main goal is to survey some existing results concerning the small-time asymptotics of the density of rough differential equations driven by fractional Brownian motions. Even though we do not claim any new results, we slightly improve some of the existing ones and also point out some possible connections to finance. We also hope it will be useful for the reader to have, in one place, the most recent results concerning the small-time asymptotics questions related to rough differential equations driven by fractional Brownian motions. Our discussion will mainly be based, on the one hand, on the papers [5–7] by the two present authors and, on the other hand, on the papers [27, 28] by Inahama.
Random dynamical systems are a well established modeling tool for a variety of
natural phenomena ranging from physics (fundamental and phenomenological) to
chemistry and more recently to biology, economy, engineering sciences and mathe-
matical finance. In many interesting models the lack of any regularity of the external
inputs of the differential equation as functions of time is a technical difficulty that
hampers their mathematical analysis. The theory of rough paths has been initially
developed by T. Lyons [31] in the 1990s to provide a framework to analyze a large
class of driven differential equations and the precise relations between the driving
signal and the output (that is the state, as function of time, of the controlled system).
Rough paths theory provides a perfect framework to study differential equations
driven by Gaussian processes (see [19]). In particular, using rough paths theory,
we may define solutions of stochastic differential equations driven by a fractional
Brownian motion with a parameter H > 1/4 (see [15]). Let us then consider the
equation
$$ X_t^x = x + \int_0^t V_0(X_s^x)\,ds + \sum_{i=1}^d \int_0^t V_i(X_s^x)\,dB_s^i, \tag{1.1} $$
and
d(I ) = k + n(I ),
where n(I ) is the number of 0 in the word I . The basic and fundamental result
concerning the existence of a density for stochastic differential equations driven by
fractional Brownian motions is the following:
Theorem 1.1 ([4, 10, 12, 24]) Assume $H > 1/4$ and assume that, at some $x \in \mathbb{R}^n$, there exists $N$ such that
Then, for any $t > 0$, the law of the random variable $X_t^x$ has a smooth density $p_t(x, y)$ with respect to the Lebesgue measure on $\mathbb{R}^n$.
$$ p_t(x, y) = \frac{1}{(t^H)^d}\, e^{-\frac{d^2(x,y)}{2t^{2H}}} \left( \sum_{i=0}^N c_i(x, y)\, t^{2iH} + r_{N+1}(t, x, y)\, t^{2(N+1)H} \right). \tag{1.3} $$
Our goal here is to discuss the various assumptions under which such an expansion is known to be true, and also to discuss possible variations. The approach to the problem is similar to the case of Brownian motion; the main difficulty to overcome is to study the Laplace method on the path space of the fractional Brownian motion (see [3] for the Brownian case).
The paper is organized as follows. In Sect. 2 we give some basic results of the theory of rough paths and of the Malliavin calculus tools that will be needed. In Sect. 3, we prove Varadhan-type small-time asymptotics for $\ln p_t(x, y)$. The discussion is mainly based on [7]. In Sect. 4, we study sufficient conditions under which the above expansion (1.3) is valid. Our discussion is based on [5, 27, 28]. Finally, in Sect. 5, we discuss some models in mathematical finance where the asymptotics of the density for rough differential equations may play an important role.
2 Preliminary Material
For some fixed $H > 1/4$, we consider $(\Omega, \mathcal{F}, \mathbb{P})$, the canonical probability space associated with the fractional Brownian motion (in short fBm) with Hurst parameter $H$. That is, $\Omega = C_0([0, 1])$ is the Banach space of continuous functions vanishing at zero equipped with the supremum norm, $\mathcal{F}$ is the Borel sigma-algebra and $\mathbb{P}$ is the unique probability measure on $\Omega$ such that the canonical process $B = \{B_t = (B_t^1, \ldots, B_t^d),\ t \in [0, 1]\}$ is a fractional Brownian motion with Hurst parameter $H$. In this context, let us recall that $B$ is a $d$-dimensional centered Gaussian process, whose covariance structure is induced by
$$ R(t, s) := \mathbb{E}\left[ B_s^j B_t^j \right] = \frac{1}{2}\left( s^{2H} + t^{2H} - |t - s|^{2H} \right), \quad s, t \in [0, 1] \text{ and } j = 1, \ldots, d. \tag{2.1} $$
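The covariance (2.1) is all that is needed to simulate one component of $B$ on a finite time grid. The following Python sketch does so with a Cholesky factorization of the covariance matrix; it is an illustration only, the function names `fbm_covariance` and `sample_fbm` are hypothetical, and the grid is assumed to consist of distinct, strictly positive times so that the matrix is positive definite.

```python
import numpy as np

def fbm_covariance(times, hurst):
    """Covariance matrix R(t_i, t_j) of one fBm component, following (2.1)."""
    t = np.asarray(times, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (t[:, None] ** two_h + t[None, :] ** two_h
                  - np.abs(t[:, None] - t[None, :]) ** two_h)

def sample_fbm(times, hurst, n_paths, seed=0):
    """Draw n_paths samples of (B_{t_1}, ..., B_{t_n}) for one component.

    Assumes the times are distinct and strictly positive, so that the
    covariance matrix admits a Cholesky factor.
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(fbm_covariance(times, hurst))
    # If G has i.i.d. standard normal entries, G @ chol.T has covariance chol @ chol.T.
    return rng.standard_normal((n_paths, len(times))) @ chol.T

grid = np.linspace(0.1, 1.0, 10)
paths = sample_fbm(grid, hurst=0.3, n_paths=20_000)
print(paths.shape)  # one row per simulated path, one column per grid time
```

For $H = 1/2$ the matrix reduces to $\min(s, t)$, the covariance of Brownian motion, which gives a convenient sanity check. The Cholesky approach costs $O(n^3)$ in the grid size and is meant for small grids only.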
In this section, we recall some basic results in rough paths theory. More details can be found in the monographs [20] and [32]. For $N \in \mathbb{N}$, recall that the truncated algebra $T^N(\mathbb{R}^d)$ is defined by
$$ T^N(\mathbb{R}^d) = \bigoplus_{m=0}^N (\mathbb{R}^d)^{\otimes m}, $$
with the convention $(\mathbb{R}^d)^{\otimes 0} = \mathbb{R}$. The set $T^N(\mathbb{R}^d)$ is equipped with a straightforward vector space structure plus a multiplication $\otimes$. Let $\pi_m$ be the projection on the $m$th tensor level. Then $(T^N(\mathbb{R}^d), +, \otimes)$ is an associative algebra with unit element $1 \in (\mathbb{R}^d)^{\otimes 0}$.
For $s < t$ and $m \ge 2$, consider the simplex $\Delta^m_{st} = \{(u_1, \ldots, u_m) \in [s, t]^m;\ u_1 < \cdots < u_m\}$, while the simplices over $[0, 1]$ will be denoted by $\Delta^m$. A continuous
where $\{e_1, \ldots, e_d\}$ denotes the canonical basis of $\mathbb{R}^d$, and then define the truncated signature of $x$ as
$$ S_N(x) : \Delta^2 \to T^N(\mathbb{R}^d), \qquad (s, t) \mapsto S_N(x)_{s,t} := 1 + \sum_{m=1}^N x^m_{s,t}. $$
The function $S_N(x)$ for a smooth function $x$ will be our typical example of a multiplicative functional. Let us stress the fact that those elements take values in the strict subset $G^N(\mathbb{R}^d) \subset T^N(\mathbb{R}^d)$, called the free nilpotent group of step $N$, which is equipped with the classical Carnot-Caratheodory norm, simply denoted by $|\cdot|$. For a path $x \in C([0, 1], G^N(\mathbb{R}^d))$, the $p$-variation norm of $x$ is defined to be
$$ \|x\|_{p\text{-var};[0,1]} = \sup_{\Pi \subset [0,1]} \left( \sum_i \left| x_{t_i}^{-1} \otimes x_{t_{i+1}} \right|^p \right)^{1/p}. $$
With these notions in hand, let us briefly define what we mean by a geometric rough path (we refer to [20, 32] for a complete overview): for $p \ge 1$, an element $x : [0, 1] \to G^{\lfloor p \rfloor}(\mathbb{R}^d)$ is said to be a geometric rough path if it is the $p$-var limit of a sequence $S_{\lfloor p \rfloor}(x^m)$. In particular, it is an element of the space
It can be shown that $T_h(x)$ is an element of $C^{p\text{-var}}([0, 1], G^{\lfloor p \rfloor}(\mathbb{R}^d))$. Moreover, one can show that $T_h(x)$ is uniformly continuous in $h$ and $x$ on bounded sets.
Remark 2.1 A typical situation of the above translation of $x$ by $h$ in the present paper is when $x = \mathbf{B}$, the fractional Brownian motion lifted as a rough path, and $h$ is a Cameron-Martin element of $B$. In this case, we simply denote $T_h(\mathbf{B}) = \mathbf{B} + h$.
where $\Pi$ stands again for the set of partitions of $[0,1]$. It is known (see, for example,
[20]) that if a process has a covariance function with finite $\rho$-variation for $\rho \in [1,2)$, it
admits a lift to a geometric $p$-rough path for all $p > 2\rho$. As a consequence, we have
the following for fractional Brownian motions:
Proposition 2.2 For a fractional Brownian motion with Hurst parameter H , we
have Vρ (R) < ∞ for all ρ ≥ 1/(2H ). Consequently, for H > 1/4 the process B
admits a lift B as a geometric rough path of order p for any p > 1/H .
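The thresholds in Proposition 2.2 follow from a one-line computation. The sketch below restricts, as an assumption made only for this check, to $\frac14 < H \le \frac12$, so that the smallest admissible exponent $\rho = 1/(2H)$ is at least 1.

```latex
\[
\rho = \frac{1}{2H} \in [1,2) \iff \frac14 < H \le \frac12,
\qquad\text{and then}\qquad
p > 2\rho = \frac{1}{H}.
\]
```

In other words, the condition $H > \frac14$ is what keeps $\rho$ inside the range $[1,2)$ required by the lifting criterion quoted above, and the resulting rough path order is any $p > 1/H$.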
We introduce the basic framework of Malliavin calculus in this subsection. The reader
is invited to read the corresponding chapters in [33] for further details. Let $\mathcal{E}$ be the
space of $\mathbb{R}^d$-valued step functions on $[0,1]$, and $\mathcal{H}$ the closure of $\mathcal{E}$ for the scalar
product:
\[
\big\langle (1_{[0,t_1]},\dots,1_{[0,t_d]}),\ (1_{[0,s_1]},\dots,1_{[0,s_d]}) \big\rangle_{\mathcal{H}} = \sum_{i=1}^{d} R(t_i,s_i).
\]
C γ ⊂ H ⊂ L 2 ([0, 1])
defines an isometry between H and H H . Let us now quote from [20, Chap. 15] a
result relating the 2-d regularity of R and the regularity of H H .
Proposition 2.3 Let $B$ be a fBm with Hurst parameter $\frac14 < H < \frac12$. Then one has
$\mathcal{H}_H \subset C^{\rho\text{-var}}$ for $\rho > (H + 1/2)^{-1}$. Furthermore, the following quantitative bound
holds:
\[
\|h\|_{\mathcal{H}_H} \ge \frac{\|h\|_{\rho\text{-var}}}{(V_\rho(R))^{1/2}}.
\]
Remark 2.4 The above proposition shows that for fBm we have $\mathcal{H}_H \subset C^{\rho\text{-var}}$ for
$\rho > (H + 1/2)^{-1}$. Hence an integral of the form $\int h\,dB$ can be interpreted in the
Young sense by means of $p$-variation techniques.
Remark 2.5 Under the same conditions, the above embedding can be sharpened to
H H ⊂ C ρ−var for all ρ ≥ (H + 1/2)−1 . We refer interested readers to [17] for more
details.
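\[
D_t F = \sum_{i=1}^{n} \phi_i(t)\,\frac{\partial f}{\partial x_i}\big(B(\phi_1),\dots,B(\phi_n)\big).
\]
For any $p \ge 1$, it can be checked that the operator $D^k$ is closable from $\mathcal{S}$ into
$L^p(\Omega; \mathcal{H}^{\otimes k})$. We denote by $\mathbb{D}^{k,p}$ the closure of the class of cylindrical random
variables with respect to the norm
\[
\|F\|_{k,p} = \Big( \mathbb{E}|F|^p + \sum_{j=1}^{k} \mathbb{E}\,\|D^j F\|^p_{\mathcal{H}^{\otimes j}} \Big)^{1/p},
\]
and
\[
\mathbb{D}^\infty = \bigcap_{p \ge 1} \bigcap_{k \ge 1} \mathbb{D}^{k,p}.
\]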
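\[
H_{(i)} = \sum_{j=1}^{d} \delta\big( G\,(\gamma_F^{-1})_{ij}\,DF_j \big)
\]
\[
H_\alpha = H_{(\alpha_k)}\big(H_{(\alpha_1,\dots,\alpha_{k-1})}\big),
\]
\[
\|H_\alpha\|_{L^p} \le C_{p,q}\,\big\|\gamma_F^{-1} DF\big\|^{k}_{k,\,2^{k-1}r}\,\|G\|_{k,q},
\]
where $\frac1p = \frac1q + \frac1r$.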
Remark 2.8 By the estimates for Hα above, one can conclude that there exist con-
stants β, γ > 1 and integers m, r such that
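\[
X^\varepsilon_t = x + \varepsilon \sum_{i=1}^{d} \int_0^t V_i(X^\varepsilon_s)\,dB^i_s + \int_0^t V_0(\varepsilon, X^\varepsilon_s)\,ds, \tag{2.4}
\]
where the vector fields $V_1,\dots,V_d$ are $C^\infty$-bounded vector fields on $\mathbb{R}^n$ and $V_0(\varepsilon,\cdot)$
is $C^\infty$-bounded uniformly in $\varepsilon \in [0,1]$.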
Proposition 2.2 ensures the existence of a lift of B as a geometrical rough path.
The general rough paths theory (see e.g. [20, 22]) together with some integrability
results (see e.g. [12, 18]) allow us to state the following proposition:
Proposition 2.9 Consider Eq. (2.4) driven by a $d$-dimensional fBm $B$ with Hurst
parameter $H > \frac14$, and assume that the vector fields $V_i$ are $C^\infty$-bounded. Then
(i) For each $\varepsilon \in (0,1]$, Eq. (2.4) admits a unique finite $p$-var continuous solution $X^\varepsilon$
in the rough paths sense, for any $p > \frac1H$.
(ii) There exists $\lambda > 0$ such that
\[
\mathbb{E} \exp\Big( \lambda \sup_{t \in [0,1],\ \varepsilon \in (0,1]} |X^\varepsilon_t|^{(2H+1)\wedge 2} \Big) < \infty. \tag{2.5}
\]
Once Eq. (2.4) is solved, the vector X tε is a typical example of random variable
which can be differentiated in the Malliavin sense. We shall express this Malliavin
derivative in terms of the Jacobian Jε of the equation, which is defined by the relation
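\[
J^{\varepsilon,ij}_t = \partial_{x_j} X^{\varepsilon,i}_t.
\]
\[
J^\varepsilon_t = \mathrm{Id}_n + \varepsilon \sum_{j=1}^{d} \int_0^t DV_j(X^\varepsilon_s)\,J^\varepsilon_s\,dB^j_s, \tag{2.6}
\]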
and that the following results hold true (see [10, 11, 34] for further details):
Proposition 2.10 Let $X^\varepsilon$ be the solution to Eq. (2.4) and suppose the $V_i$'s are $C^\infty$-bounded.
Then for every $i = 1,\dots,n$, $t > 0$, and $x \in \mathbb{R}^n$, we have $X^{\varepsilon,i}_t \in \mathbb{D}^\infty$
and
\[
D^j_s X^\varepsilon_t = J^\varepsilon_{st}\,V_j(X^\varepsilon_s), \quad j = 1,\dots,d,\ 0 \le s \le t,
\]
Let us now quote the recent result [12], which gives a useful estimate for moments
of the Jacobian of rough differential equations driven by Gaussian processes.
Proposition 2.11 Consider a fractional Brownian motion $B$ with Hurst parameter
$H > \frac14$ and $p > \frac1H$. Then for any $\eta \ge 1$, there exists a finite constant $c_\eta$ such that
the Jacobian $J^\varepsilon$ defined at Proposition 2.10 satisfies:
\[
\mathbb{E}\Big( \sup_{\varepsilon \in [0,1]} \|J^\varepsilon\|^{\eta}_{p\text{-var};[0,1]} \Big) = c_\eta. \tag{2.7}
\]
Proof The integrability of $J^\varepsilon$ is only proved in [12] when $\varepsilon = 1$. On the other hand,
the estimates of $J$ in [12] only depend on the supremum norm of the vector fields and
their derivatives. In our case, the vector fields in Eq. (2.4) are the $\varepsilon V_i$'s, which
together with their derivatives are bounded uniformly in $\varepsilon \in (0,1)$. Hence the uniform
integrability of $J^\varepsilon$ (in $\varepsilon$) follows.
Finally, we close the discussion of this section with the following large deviation
principle that will be needed later. Let $\Phi : \mathcal{H}_H \to C([0,1], \mathbb{R}^n)$ be given by solving
the ordinary differential equation
\[
\Phi_t(h) = x + \sum_{i=1}^{d} \int_0^t V_i(\Phi_s(h))\,dh^i_s + \int_0^t V_0(0, \Phi_s(h))\,ds. \tag{2.8}
\]
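\[
I(y) = \inf_{\Phi_1(h) = y} \frac12 \|h\|^2_{\mathcal{H}_H}.
\]
Recall that $X^\varepsilon_1$ is the solution to Eq. (2.4). Then $X^\varepsilon_1$ satisfies a large deviation principle
with rate function $I(y)$.
Proof Fix any $p > \frac1H$. It is known (see [20]) that $\varepsilon B$ as a $G^{\lfloor p \rfloor}(\mathbb{R}^d)$-valued rough path
satisfies a large deviation principle in $p$-variation topology with good rate function
given by
\[
J(h) = \begin{cases} \frac12 \|h\|^2_{\mathcal{H}} & \text{if } h \in \mathcal{H}, \\ +\infty & \text{otherwise.} \end{cases}
\]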
3 Varadhan Asymptotics
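\[
\Phi_t(h) = x + \sum_{i=1}^{d} \int_0^t V_i(\Phi_s(h))\,dh^i_s.
\]
Clearly, we have $X^\varepsilon_t = \Phi_t(\varepsilon B)$. Denote by $\gamma_{\Phi_1}(h)$ the deterministic Malliavin matrix
of $\Phi_1(h)$, i.e.,
\[
\gamma^{ij}_{\Phi_1}(h) = \big\langle D\Phi^i_1(h),\ D\Phi^j_1(h) \big\rangle_{\mathcal{H}}.
\]
\[
d^2(y) = I(y) = \inf_{\Phi_1(h) = y} \frac12 \|h\|^2_{\mathcal{H}_H}, \quad\text{and}\quad d_R^2(y) = \inf_{\Phi_1(h) = y,\ \det \gamma_{\Phi_1}(h) > 0} \frac12 \|h\|^2_{\mathcal{H}_H}.
\]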
In the absence of the drift term (V0 = 0) in our setting in this section, one can show
that the above two distances coincide.
Lemma 3.1 For every $y \in \mathbb{R}^n$, we have $d(y) = d_R(y)$.
Proof We follow an argument of Léandre (see [30]). By using Theorem I.2 in [30] and
the isometry between the Cameron-Martin space of the fractional Brownian motion
and the Cameron-Martin space of the Brownian motion, we see that for every $\varepsilon > 0$,
there exists $h \in \mathcal{H}$ such that $\|h\|_{\mathcal{H}} \le \varepsilon$ and $\det \gamma_{\Phi_1}(h) > 0$. Then arguing as in the
Remark after Proposition II.1 in [30], we can for every $\eta > 0$ and $y \in \mathbb{R}^n$ construct
$h \in \mathcal{H}$ such that $\Phi_1(h) = y$, $\det \gamma_{\Phi_1}(h) > 0$ and
\[
\frac12 \|h\|^2_{\mathcal{H}} \le d^2(y) + \eta.
\]
Throughout the section, we assume that the following Hypothesis 3.2
is satisfied. Let us first introduce some notation. Let $\mathcal{A} = \{\emptyset\} \cup \bigcup_{k=1}^{\infty} \{1,2,\dots,n\}^k$
and $\mathcal{A}_1 = \mathcal{A} \setminus \{\emptyset\}$. We say that $I \in \mathcal{A}$ is a word of length $k$ if $I = (i_1,\dots,i_k)$ and
we write $|I| = k$. If $I = \emptyset$, then we denote $|I| = 0$. For any integer $l \ge 1$, we denote
by $\mathcal{A}(l)$ the set $\{I \in \mathcal{A};\ |I| \le l\}$ and by $\mathcal{A}_1(l)$ the set $\{I \in \mathcal{A}_1;\ |I| \le l\}$. We also
define an operation $*$ on $\mathcal{A}$ by $I * J = (i_1,\dots,i_k,j_1,\dots,j_l)$ for $I = (i_1,\dots,i_k)$
and $J = (j_1,\dots,j_l)$ in $\mathcal{A}$. We define vector fields $V_{[I]}$ inductively by
\[
V_{[j]} = V_j, \quad V_{[I*j]} = [V_{[I]}, V_j], \quad j = 1,\dots,d.
\]
Under this assumption the main result proved in [7] is the following Varadhan-type
estimate:
The two key ingredients in proving Theorem 3.3 are an estimate for the Malliavin
derivative $DX^\varepsilon_1$ and an estimate of the Malliavin matrix $\gamma_{X^\varepsilon_1}$ of $X^\varepsilon_1$. Building on
previous results from [8], the following estimates were obtained in [7]:
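\[
\mathbb{E} f(X^\varepsilon_1) = e^{-\frac{\|h\|^2_{\mathcal{H}_H}}{2\varepsilon^2}}\ \mathbb{E}\Big[ f(\Phi_1(\varepsilon B + h))\,e^{-\frac{B(h)}{\varepsilon}} \Big].
\]
Hence, we obtain
\[
\varepsilon^2 \log p_\varepsilon(y) \ge -\Big( \frac12 \|h\|^2_{\mathcal{H}_H} + 2\eta \Big) + \varepsilon^2 \log \mathbb{E}\big[ \chi(\varepsilon B(h))\,\delta_y(\Phi_1(\varepsilon B + h)) \big]. \tag{3.4}
\]
Note that
\[
Z_1(h) = \lim_{\varepsilon \downarrow 0} \frac{\Phi_1(\varepsilon B + h) - \Phi_1(h)}{\varepsilon}
\]
is an $n$-dimensional random vector in the first Wiener chaos with variance $\gamma_{\Phi_1}(h) > 0$.
Hence $Z_1(h)$ is non-degenerate and one can then prove that
\[
\lim_{\varepsilon \downarrow 0} \mathbb{E}\Big[ \chi(\varepsilon B(h))\,\delta_0\Big( \frac{\Phi_1(\varepsilon B + h) - \Phi_1(h)}{\varepsilon} \Big) \Big] = \mathbb{E}\,\delta_0(Z_1(h)).
\]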
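Therefore,
\[
\lim_{\varepsilon \downarrow 0} \varepsilon^2 \log \mathbb{E}\big[ \chi(\varepsilon B(h))\,\delta_y(\Phi_1(\varepsilon B + h)) \big] = 0.
\]
where $\frac1p + \frac1q = 1$. By Remark 2.8 we know that
for some constants $\beta, \gamma > 0$ and integers $k, m, r$. Thus, by Lemma 3.4 we have
Finally by Theorem 2.12, a large deviation principle for $X^\varepsilon_1$ ensures that for small
$\varepsilon$ we have
\[
\mathbb{P}(X^\varepsilon_1 \in \operatorname{supp}\chi)^{\frac1q} \le e^{-\frac{1}{q\varepsilon^2}\left( \inf_{y \in \operatorname{supp}\chi} d^2(y) + o(1) \right)}.
\]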
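Fix $H > \frac14$ and consider Eq. (2.4). For the convenience of our discussion, in what
follows, we write the above equation in the following form
\[
X^\varepsilon_t = x + \varepsilon \int_0^t \sigma(X^\varepsilon_s)\,dB_s + \int_0^t b(\varepsilon, X^\varepsilon_s)\,ds,
\]
Recall that for each $k \in \mathcal{H}_H$, $\Phi(k)$ is the deterministic Itô map defined in (2.8). Set
\[
\Lambda(\phi) = \inf\Big\{ \frac12 \|k\|^2_{\mathcal{H}_H},\ \phi = \Phi(k),\ k \in \mathcal{H}_H \Big\}.
\]
Here
and
\[
c = \inf\big\{ dF(\phi_i)Y_i,\ i \in \{1,2,\dots,n\} \big\},
\]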
dYi (s) = ∂x σ(φi (s))Yi (s)dγi (s) + ∂ε b(0, φi (s))ds + ∂x b(0, φi (s))Yi (s)ds
with Yi (0) = 0.
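In what follows, we sketch the proof of the above Laplace approximation in the
case $H > \frac12$. Remarks on the rough case $\frac14 < H < \frac12$ will be provided afterwards.
Without loss of generality, we may assume that $F + \Lambda$ attains its minimum at a
unique path $\phi$. There exists a $\gamma \in \mathcal{H}_H$ such that
\[
\phi = \Phi(\gamma), \quad\text{and}\quad \Lambda(\phi) = \frac12 \|\gamma\|^2_{\mathcal{H}_H},
\]
and
\[
a \overset{\mathrm{def}}{=} \inf\{ F + \Lambda(\phi),\ \phi \in P(\mathbb{R}^d) \} = \inf\Big\{ F \circ \Phi(k) + \frac12 \|k\|^2_{\mathcal{H}_H},\ k \in \mathcal{H}_H \Big\}.
\]
\[
d^2\Big( F \circ \Phi + \frac12 \|\cdot\|^2_{\mathcal{H}_H} \Big)(\gamma)k^2 > 0.
\]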
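Consider the following stochastic differential equation
\[
Z^\varepsilon_t = x + \int_0^t \sigma(Z^\varepsilon_s)(\varepsilon\,dB_s + d\gamma_s) + \int_0^t b(\varepsilon, Z^\varepsilon_s)\,ds.
\]
It is clear that $Z^0 = \phi$. Denote $Z^{m,\varepsilon}_t = \partial^m_\varepsilon Z^\varepsilon_t$. Taking the Taylor expansion
with respect to $\varepsilon$ near $\varepsilon = 0$, we obtain
\[
Z^\varepsilon = \phi + \sum_{j=0}^{N} \frac{g_j \varepsilon^j}{j!} + \varepsilon^{N+1} R^\varepsilon_{N+1},
\]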
dg1 (s) = σ(φs )d Bs + ∂x σ(φs )g1 (s)dγs + ∂x b(0, φs )g1 (s)ds + ∂ε b(0, φs )ds.
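Hence, letting
\[
J_\rho(\varepsilon) = \mathbb{E}\Big[ f(X^\varepsilon_T)\,e^{-F(X^\varepsilon_T)/\varepsilon^2},\ X^\varepsilon \in B(\phi,\rho) \Big],
\]
\[
\theta(\varepsilon) = \theta(0) + \varepsilon\theta'(0) + \frac12 \varepsilon^2 \theta''(0) + \varepsilon^3 R(\varepsilon).
\]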
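By the Cameron-Martin theorem for fractional Brownian motions, we have
\begin{align*}
J_\rho(\varepsilon) &= \mathbb{E}\Big\{ f(Z^\varepsilon) \exp\Big( -\frac{F(Z^\varepsilon)}{\varepsilon^2} \Big) \exp\Big( -\frac{1}{\varepsilon} \int_0^T \big( (K_H^*)^{-1} (\dot{K_H^{-1}\gamma}) \big)_s\,dB_s - \frac{\|\gamma\|^2_{\mathcal{H}_H}}{2\varepsilon^2} \Big);\ Z^\varepsilon \in B(\phi,\rho) \Big\} \tag{4.1} \\
&= \mathbb{E}\Big[ \exp\Big\{ -\frac{1}{\varepsilon^2}\Big( F(\phi) + \frac12 \|\gamma\|^2_{\mathcal{H}_H} \Big) \Big\} \exp\Big\{ -\frac{\theta'(0) + \int_0^T \big( (K_H^*)^{-1} (\dot{K_H^{-1}\gamma}) \big)_s\,dB_s}{\varepsilon} \Big\} \\
&\qquad\quad \exp\Big\{ -\frac12 \theta''(0) \Big\} \cdot f(Z^\varepsilon)\,e^{-\varepsilon R(\varepsilon)};\ Z^\varepsilon \in B(\phi,\rho) \Big].
\end{align*}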
Step 3: It is clear that to prove Theorem 4.2, it suffices to analyze the four terms in
the expectation above. First of all, it is apparent that the first term (of order $-2$) is
\[
\exp\Big\{ -\frac{1}{\varepsilon^2}\Big( F(\phi) + \frac12 \|\gamma\|^2_{\mathcal{H}_H} \Big) \Big\} = e^{-\frac{a}{\varepsilon^2}}, \tag{4.2}
\]
\[
dF(\phi)(d\Phi(\gamma)k) = -\int_0^T \big( (K_H^*)^{-1} (\dot{K_H^{-1}\gamma}) \big)_s\,dk_s.
\]
By the continuity of Young's integral with respect to the driving path, the above
extends to
\[
dF(\phi)(d\Phi(\gamma)B) = -\int_0^T \big( (K_H^*)^{-1} (\dot{K_H^{-1}\gamma}) \big)_s\,dB_s.
\]
and
\[
g_1 = d\Phi(\gamma)B + Y.
\]
dYs = ∂x σ(φs )Ys dγs + ∂ε b(0, φs )ds + ∂x b(0, φs )Ys ds, Y (0) = 0.
We obtain
\[
\exp\Big\{ -\frac{\theta'(0) + \int_0^T \big( (K_H^*)^{-1} (\dot{K_H^{-1}\gamma}) \big)_s\,dB_s}{\varepsilon} \Big\} = \exp\Big\{ -\frac{dF(\phi)Y}{\varepsilon} \Big\}. \tag{4.3}
\]
For the third term (of order 0), one can show that there exists a $\beta > 0$ such that
\[
\mathbb{E} \exp\Big\{ -(1+\beta)\,\frac12\,\theta''(0) \Big\} < \infty. \tag{4.4}
\]
Let us emphasize that in order to show the above integrability of $\theta''(0)$, one needs
to use assumption H2 and prove that $d^2 F \circ \Phi(\gamma)(k^1,k^2)$ is Hilbert-Schmidt. For
more details, we refer the reader to [5] for the case when $H > \frac12$, and to [27] when
$\frac14 < H < \frac12$. Moreover, one can prove the following integrability of $R(\varepsilon)$.
Lemma 4.3 and (4.4) allow us to analyze the third and fourth terms and show
\[
\mathbb{E}\Big[ f(Z^\varepsilon)\,e^{-\frac12 \theta''(0) - \varepsilon R(\varepsilon)};\ Z^\varepsilon \in B(\phi,\rho) \Big] = \sum_{m=0}^{N} \alpha_m \varepsilon^m + O(\varepsilon^{N+1}). \tag{4.5}
\]
Finally, combining (4.1)–(4.3), and (4.5), the proof of Theorem 4.2 is complete.
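Remark 4.4 In applications (see the next section), one may also be interested in an
SDE which involves a fractional order term of $\varepsilon$,
\[
X^\varepsilon_t = x + \varepsilon \int_0^t \sigma(X^\varepsilon_s)\,dB_s + \varepsilon^{\frac1H} \int_0^t b(\varepsilon, X^\varepsilon_s)\,ds. \tag{4.6}
\]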
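Set
\[
\Lambda_2 = \{\kappa - 2 \mid \kappa \in \Lambda_1 \setminus \{0\}\},
\]
and define
\[
\Lambda_3 = \{a_1 + a_2 + \cdots + a_m \mid m \in \mathbb{N}_+ \text{ and } a_1,\dots,a_m \in \Lambda_1\}
\]
and
\[
\Lambda'_3 = \{a_1 + a_2 + \cdots + a_m \mid m \in \mathbb{N}_+ \text{ and } a_1,\dots,a_m \in \Lambda_2\}.
\]
Finally let
\[
\Lambda_4 = \{a + b \mid a \in \Lambda_3,\ b \in \Lambda'_3\}
\]
\[
Z^\varepsilon = \phi + \sum_{j=0}^{N} g_{\kappa_j} \varepsilon^{\kappa_j} + \varepsilon^{\kappa_{N+1}} R^\varepsilon_{\kappa_{N+1}}.
\]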
Note that in (4.8), indices up to degree two are $(0, 1, 1/H, 2)$. There is an extra term
$1/H$ compared to the case without fractional order. Hence when plugging (4.9) into
Step 2 of the proof of Theorem 4.2, there is an extra (but deterministic) term
\[
\exp\Big\{ -\frac{dF(\phi)\,g_{\kappa_2}}{\varepsilon^{2-\frac1H}} \Big\},
\]
It is not hard to see that the other terms up to degree two remain the same, and that
although higher order terms are different they could be handled similarly as before.
Hence we obtain
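Theorem 4.5 Let $X^\varepsilon$ satisfy (4.6). We have
\[
\mathbb{E}\Big[ f(X^\varepsilon)\,e^{-F(X^\varepsilon)/\varepsilon^2} \Big] = e^{-\frac{a}{\varepsilon^2}}\,e^{-\frac{c}{\varepsilon}} \exp\Big\{ -\frac{d}{\varepsilon^{2-\frac1H}} \Big\} \Big( \alpha_{\lambda_0} + \alpha_{\lambda_1}\varepsilon^{\lambda_1} + \cdots + \alpha_{\lambda_N}\varepsilon^{\lambda_N} + O(\varepsilon^{\lambda_{N+1}}) \Big).
\]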
Here
dY (s) = ∂x σ(φi (s))Y (s)dγ(s)+∂ε b(0, φ(s))ds+∂x b(0, φ(s))Y (s)ds, Y (0) = 0,
and
dgκ2 (s) = ∂x σ(φs )gκ2 (s)dγs + b(0, φs )ds, gκ2 (0) = 0.
Remark 4.6 Theorem 4.2 for the rough case $\frac14 < H < \frac12$ was proved by Inahama
[27]. In this case, the equation is understood in the rough path sense. Thanks to
Proposition 2.3, the equations for $g_i$ and $R_i$ are understood in the sense of Young pairing.
In [27] the author also discussed RDEs with fractional orders of $\varepsilon$, in which the
index set $\Lambda_1$ was introduced. The main idea of the proof for the rough case is the same
as that outlined above. But the major difficulty is to show that $d^2 F \circ \Phi(\gamma)(k^1,k^2)$ is
Hilbert-Schmidt. This is easier when $H > \frac12$, since in this case $\partial_t K(t,s)$ is integrable,
and one can easily obtain a nice representation for $d^2 F \circ \Phi(\gamma)(k^1,k^2)$.
Consider
\[
X_t = x + \sum_{i=1}^{d} \int_0^t V_i(X_s)\,dB^i_s + \int_0^t V_0(X_s)\,ds. \tag{4.10}
\]
\[
X^\varepsilon_t = x + \varepsilon \sum_{i=1}^{d} \int_0^t V_i(X^\varepsilon_s)\,dB^i_s + \varepsilon^{\frac1H} \int_0^t V_0(X^\varepsilon_s)\,ds.
\]
In what follows, we use the Laplace approximation to obtain a short time asymptotic
expansion for the density of $X^\varepsilon_1$ in the case when $H > \frac12$. For this purpose, we
need the following assumption.
Assumption 4.7 • A 1: For every $x \in \mathbb{R}^d$, the vectors $V_1(x),\dots,V_d(x)$ form a
basis of $\mathbb{R}^d$.
• A 2: There exist smooth and bounded functions $\omega^l_{ij}$ such that:
\[
[V_i, V_j] = \sum_{l=1}^{d} \omega^l_{ij} V_l,
\]
and
\[
\omega^l_{ij} = -\omega^j_{il}.
\]
Whenever there is no confusion, we always suppress the starting point $x$ and denote
it simply by $\Phi(k)$ as before. Then we have (see Lemma 4.2 in [5])
Lemma 4.8 $\Phi(x,k)$ is a geodesic if and only if $k(t) = tu$ for some $u \in \mathbb{R}^d$.
As a consequence of the previous lemma, we then have the following key result
(Proposition 4.3 in [5]):
\[
\inf_{k \in \mathcal{H}_H,\ \Phi_T(x,k) = y} \|k\|^2_{\mathcal{H}_H} = \frac{d^2(x,y)}{T^{2H}}.
\]
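Let $F$ be as in the above lemma and $p_\varepsilon(x,y)$ the density function of $X^\varepsilon_1$. By the
inversion of Fourier transformation we have
\begin{align*}
p_\varepsilon(x,y)\,e^{-\frac{F(x,y,y)}{\varepsilon^2}} &= \frac{1}{(2\pi)^d} \int e^{-i\zeta\cdot y}\,d\zeta \int e^{i\zeta\cdot z}\,e^{-\frac{F(x,y,z)}{\varepsilon^2}}\,p_\varepsilon(x,z)\,dz \\
&= \frac{1}{(2\pi\varepsilon)^d} \int e^{-i\frac{\zeta\cdot y}{\varepsilon}}\,d\zeta \int e^{i\frac{\zeta\cdot z}{\varepsilon}}\,e^{-\frac{F(x,y,z)}{\varepsilon^2}}\,p_\varepsilon(x,z)\,dz \\
&= \frac{1}{(2\pi\varepsilon)^d} \int d\zeta\ \mathbb{E}_x\Big[ e^{\frac{i\zeta\cdot(X^\varepsilon_1 - y)}{\varepsilon}}\,e^{-\frac{F(x,y,X^\varepsilon_1)}{\varepsilon^2}} \Big]. \tag{4.11}
\end{align*}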
It is clear that by applying Laplace approximation to the expectation in the last equa-
tion above and switching the order of integration (with respect to ζ) and summation,
we obtain an asymptotic expansion for the density function pε (x, y).
Remark 4.11 One might wonder why we do not construct, for each fixed $x, y$, a function
$F$ which minimizes (at $z = y$)
\[
F(x,y,z) + \frac{D(x,z)^2}{2}
\]
in Lemma 4.10, where
\[
D^2(x,y) = \inf_{k \in \mathcal{H}_H,\ \Phi_1(x,k) = y} \|k\|^2_{\mathcal{H}_H}.
\]
After all, $D(x,y)$ seems the natural "distance" for the system (4.10), instead of
the Riemannian distance $d(x,y)$. The problem with $D(x,y)$ is that it is not clear
whether it is differentiable, while the construction of $F$ in Lemma 4.10 needs some
differentiability of $D(x,y)$. This is indeed one of the reasons why we impose the
structure assumption A2 so that $D(x,y) = d(x,y)$ (content of Proposition 4.9).
With this identification, we know $D(x,y)$ is smooth for all $x \ne y$.
Remark 4.12 In order to show Proposition 4.9, we used the fact that $\partial K(t,s)/\partial t$ is
integrable, which is only true for the smooth case $H > \frac12$. Hence although Inahama
proved the Laplace approximation for $\frac14 < H < \frac12$ in [27], we cannot repeat the proof
in this section to produce an expansion of the density function for the rough case.
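\[
\Lambda_2 = \{\kappa - 1 \mid \kappa \in \Lambda_1 \setminus \{0\}\}
\]
and
\[
\Lambda'_2 = \{\kappa - 2 \mid \kappa \in \Lambda_1 \setminus \{0\}\}.
\]
Next define
\[
\Lambda_3 = \{a_1 + a_2 + \cdots + a_m \mid m \in \mathbb{N}_+ \text{ and } a_1,\dots,a_m \in \Lambda_2\}
\]
and
\[
\Lambda'_3 = \{a_1 + a_2 + \cdots + a_m \mid m \in \mathbb{N}_+ \text{ and } a_1,\dots,a_m \in \Lambda'_2\}.
\]
Finally, set
\[
\Lambda_4 = \{a + b \mid a \in \Lambda_3,\ b \in \Lambda'_3\}
\]
\[
p(t;x,y) = \frac{1}{(t^H)^d}\,e^{-\frac{d^2(x,y)}{2t^{2H}} + \frac{\beta}{t^{2H-1}}} \Big( \sum_{i=0}^{N} c_i(x,y)\,t^{\lambda_i H} + r_{N+1}(t,x,y)\,t^{\lambda_{N+1} H} \Big), \quad y \in V.
\]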
and uses Watanabe distribution theory. Hence he is able to work with D(x, y) intro-
duced in Remark 4.11 directly and avoids the technical assumption A2 of Assumption
4.7. On the other hand, the smoothness of coefficient and the uniform estimate for
the remainder terms in the expansion are not provided in [28, 29].
Fractional Brownian motions have been used in financial models to introduce mem-
ory. In this section, we give two examples of such models and remark on how the
methods and results in the previous sections could be applied to the study of such
models.
Memory can be introduced into the stock price process directly. In particular, the so-called
fractional Black and Scholes model is given by
\[
S_t = S_0 \exp\Big( \mu t + \sigma B^H_t - \frac{\sigma^2}{2} t^{2H} \Big), \tag{5.1}
\]
where B H is a fractional Brownian motion with Hurst parameter H , μ the mean rate
of return and σ > 0 the volatility. Let r be the interest rate. The price for the risk-free
bond is given by er t .
More generally, one can also consider a fractional local volatility model
Here the stochastic integration with respect to B H could be understood in the sense
of rough path theory. After a simple change of variable X t = log St , one obtains
There has been an intensive study recently of option prices and implied volatilities for
options with short maturity (e.g. [9, 16, 21]). Since the above equation is a special
case of (4.10), we can use the results obtained in the previous sections to obtain
short-time asymptotic behavior of such models.
A drawback of the finance models discussed above is that they lead to the existence
of arbitrage opportunities. For example, let the couple (αt , βt ), t ∈ [0, T ] be a
portfolio with αt the amount of bonds and βt the amount of stocks at time t. One can
construct an arbitrage in the fractional Black and Scholes model by (for simplicity,
we assume μ = r = 0)
\[
\beta_t = S_t - S_0, \quad\text{and}\quad \alpha_t = \int_0^t \beta_u\,dS_u - \beta_t S_t.
\]
Let $V_t$ be the value of the portfolio at time $t$. It is not hard to see that this is a
self-financing portfolio that satisfies $V_0 = 0$ and $V_t = \alpha_t + \beta_t S_t = \frac12 (S_t - S_0)^2$ for all $t > 0$, and hence
it is an arbitrage. For more discussion on arbitrage in models given by fractional
Brownian motions, we refer the reader to [35].
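The arbitrage can be checked numerically. The sketch below, assuming NumPy, H = 0.75 (so that the integral is a Young integral), and μ = r = 0, simulates one path of the fractional Black and Scholes price and approximates the gains $\int_0^T (S_u - S_0)\,dS_u$ by a left-point Riemann sum. By the ordinary chain rule valid for Young integrals, the limit is $\frac12 (S_T - S_0)^2 \ge 0$. All function names, parameter values, and the seed are illustrative.

```python
import numpy as np

def fbm_path(n_steps, hurst, horizon, rng):
    """Times and values of an fBm path on a uniform grid, started at 0."""
    t = horizon * np.arange(1, n_steps + 1) / n_steps
    s, u = np.meshgrid(t, t, indexing="ij")
    cov = 0.5 * (s ** (2 * hurst) + u ** (2 * hurst) - np.abs(s - u) ** (2 * hurst))
    b = np.linalg.cholesky(cov) @ rng.standard_normal(n_steps)
    return np.concatenate(([0.0], t)), np.concatenate(([0.0], b))

def portfolio_gains(stock):
    """Left-point Riemann sum of int (S_u - S_0) dS_u, i.e. the gains from holding beta."""
    beta = stock[:-1] - stock[0]
    return float(np.sum(beta * np.diff(stock)))

rng = np.random.default_rng(1)          # illustrative seed
hurst, sigma, s0 = 0.75, 0.2, 1.0       # illustrative parameters, mu = r = 0
t, b = fbm_path(500, hurst, 1.0, rng)
stock = s0 * np.exp(sigma * b - 0.5 * sigma**2 * t ** (2 * hurst))
value = portfolio_gains(stock)
half_square = 0.5 * (stock[-1] - stock[0]) ** 2
```

The discrete sum differs from `half_square` exactly by half the sum of squared price increments, which vanishes as the grid is refined when H > 1/2. For Brownian motion (H = 1/2) this correction does not vanish, which is why the same strategy is not an arbitrage in the classical model.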
Stochastic volatility models were introduced to capture both the volatility smile and
the correct dynamics of the volatility smile (see [23] for instance). For these models,
modeling the volatility process is one of the key factors. In [14], the authors proposed
a long memory specification of the volatility process in order to capture the steepness
of long term volatility smiles without over increasing the short run persistence.
The following stochastic volatility model based on the fractional Ornstein-Uhlenbeck
process provides another way of introducing long memory to the volatility
process:
d St = μSt dt + σt St dWt ,
Then $Q$ is the minimal martingale measure associated with $P$. Moreover, the
risk-minimizing hedging price at $t = 0$ of a European call option with payoff $(S_T - K)^+$
is given by
\[
C_0 = e^{-rT}\,\mathbb{E}_Q (S_T - K)^+.
\]
References
1. Azencott, R.: Densité des diffusions en temps petit: développements asymptotiques. I. Seminar
on probability, XVIII. Lecture Notes in Mathematics, vol. 1059, pp. 402-498. Springer, Berlin
(1984)
2. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Ann. Sci. École Norm. Sup. (4) 21(3), 307–331 (1988)
3. Ben Arous, G.: Méthode de Laplace et de la phase stationnaire sur l’espace de Wiener.
Stochastics 25(3), 125–153 (1988)
4. Baudoin, F., Hairer, M.: A version of Hörmander’s theorem for the fractional Brownian motion.
Probab. Theory Relat. Fields 139, 373–395 (2007)
5. Baudoin, F., Ouyang, C.: Small-time kernel expansion for solutions of stochastic differential
equations driven by fractional Brownian motions. Stoch. Process. Appl. 121(4), 759–792 (2011)
6. Baudoin, F., Ouyang, C.: Gradient bounds for solutions of stochastic differential equa-
tions driven by fractional Brownian motions. Malliavin Calculus and Stochastic Analysis:
A Festschrift in Honor of David Nualart. Springer, Berlin (2012)
7. Baudoin, F., Ouyang, C., Zhang, X.: Varadhan estimates for RDEs driven by fractional Brown-
ian motions. Stoch. Proc. Appl. 125(2), 634–652 (2015)
8. Baudoin, F., Ouyang, C., Zhang, X.: Smoothing effect of rough differential equations driven
by fractional Brownian motions. Ann. Inst. Henri Poincare Probab. Statist. (2013)
9. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math., Vol. LVII, 1352–1373 (2004)
10. Cass, T., Friz, P.: Densities for rough differential equations under Hörmander condition. Ann.
Math. 171(3), 2115–2141 (2010)
11. Cass, T., Friz, P., Victoir, N.: Non-degeneracy of Wiener functionals arising from rough differ-
ential equations. Trans. Am. Math. Soc. 361, 3359–3371 (2009)
12. Cass, T., Litterer, C., Lyons, T.: Integrability and tail estimates for Gaussian rough differential
equations. Ann. Probab. 41(4), 3026–3050 (2013)
13. Comte, F., Renault, E.: Long memory in continuous-time stochastic volatility models. Math.
Financ. 8, 291–323 (1998)
14. Comte, F., Coutin, L., Renault, E.: Affine fractional stochastic volatility models. Ann. Financ.
8(2–3), 337–378 (2012)
15. Coutin, L., Qian, Z.M.: Stochastic analysis, rough path analysis and fractional Brownian
motions. Probab. Theory Relat. Fields 122(1), 108–140 (2002)
16. Feng, J., Forde, M., Fouque, J.P.: Short maturity asymptotics for a fast mean reverting Heston
stochastic volatility model. SIAM J. Financ. Math. 1, 126–141 (2010)
17. Friz, P., Gess, B., Gulisashvili, A., Riedel, S.: The Jain-Monrad criterion for rough paths and
applications to random Fourier series and non-Markovian Hörmander theory. Ann. Probab.
(2013)
18. Friz, P., Riedel, S.: Integrability of (non-)linear rough differential equations and integrals.
Stoch. Anal. Appl. 31(2), 336–358 (2013)
19. Friz, P., Victoir, N.: Differential equations driven by Gaussian signals. Ann. Inst. Henri Poincare
Probab. Stat. 46(2), 369–413 (2010)
20. Friz, P., Victoir, N.: Multidimensional Stochastic Processes as Rough Paths. Cambridge
University Press, Cambridge (2010)
21. Gatheral, J., Hsu, E., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22, 591–620 (2012)
22. Gubinelli, M.: Controlling rough paths. J. Funct. Anal. 216, 86–140 (2004)
23. Hagan, P., Kumar, D., Lesniewski, A., Woodward, D.: Managing Smile Risk. Wilmott Mag.
(2003)
24. Hairer, M., Pillai, N.S.: Regularity of laws and ergodicity of hypoelliptic SDEs driven by rough
paths. Ann. Inst. Henri Poincaré Probab. Stat. 47(2), 601–628 (2011)
25. Hu, Y.: Integral transformations and anticipative calculus for fractional Brownian motions.
Mem. Am. Math. Soc. 175(825), 324 (2005)
26. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 3,
281–300 (1987)
27. Inahama, Y.: Laplace approximation for rough differential equation driven by fractional Brown-
ian motion. Ann. Probab. 41(1), 170–205 (2013)
28. Inahama, Y.: Short time kernel asymptotics for young SDE by means of Watanabe distribution
theory. To appear in J. Math. Soc. Jpn. (2013)
29. Inahama, Y.: Short time kernel asymptotics for rough differential equation driven by fractional
Brownian motion. Preprint (2014)
30. Léandre, R.: Minoration en temps petit de la densité d’une diffusion dégénérée. J. Funct. Anal.
74, 399–414 (1987)
31. Lyons, T.: Differential equations driven by rough signals. Rev. Mat. Iberoam. 14(2), 215–310
(1998)
32. Lyons, T., Qian, Z.: System Control and Rough Paths. Oxford University Press, Oxford (2002)
33. Nualart, D.: The Malliavin Calculus and Related Topics, 2nd edn. Probability and its Applica-
tions. Springer, Berlin (2006)
34. Nualart, D., Saussereau, B.: Malliavin calculus for stochastic differential equations driven by
a fractional Brownian motion. Stoch. Process. Appl. 119(2), 391–409 (2009)
35. Rogers, L.C.G.: Arbitrage with fractional Brownian motion. Math. Financ. 7(1), 95–105 (1997)
On Singularities in the Heston Model
Vladimir Lucic
1 Problem Formulation
Consider the Heston stochastic volatility model, which under risk-neutral measure
and with zero drift has the following dynamics
\[
dS_t = S_t \sqrt{v_t}\,dW^{(1)}_t,
\]
\[
dv_t = \lambda(\bar{v} - v_t)\,dt + \eta \sqrt{v_t}\,\big( \rho\,dW^{(1)}_t + \sqrt{1-\rho^2}\,dW^{(2)}_t \big),
\]
where the parameters λ, η, and v̄ are nonnegative, ρ ∈ [−1, 1], and the initial values
S0 and v0 are positive.
The Heston characteristic function is defined as
\[
\phi_H(u,\tau) = \mathbb{E}\big[ e^{iu \log(S_\tau/S_0)} \big], \quad \alpha < \Im(u) < \beta.
\]
Results of Heston [5] and Lewis [7] show that on the strip of convergence $\alpha < \Im(u) < \beta$
the Heston characteristic function coincides with
V. Lucic (B)
Quantitative Analytics, Barclays, 5 The North Colonnade, Canary Wharf,
London E14 4BB, UK
e-mail: [email protected]
where
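\[
D(u,\tau) = r_- \frac{1 - e^{-d\tau}}{1 - g e^{-d\tau}}, \quad C(u,\tau) = \lambda \Big( r_- \tau - \frac{2}{\eta^2} \log \frac{1 - g e^{-d\tau}}{1 - g} \Big),
\]
\[
r_\pm = \frac{\beta \pm d}{\eta^2}, \quad d = \sqrt{\beta^2 + 2\alpha\eta^2}, \quad g = \frac{r_-}{r_+},
\]
\[
\alpha = \frac{u^2}{2} - \frac{iu}{2}, \quad \beta = \lambda + \rho\eta i u.
\]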
With the customary abuse of terminology, we’ll refer to φ(u, τ ), u ∈ Z as the Heston
characteristic function.
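As a consistency check with the notation used in the proof of the main result, one can substitute $u = is$ into the definitions of $\alpha$, $\beta$ and $d$ given above. The short computation below is a sketch that relies only on those definitions, read as $\alpha = u^2/2 - iu/2$, $\beta = \lambda + \rho\eta i u$ and $d^2 = \beta^2 + 2\alpha\eta^2$.

```latex
\[
\beta = \lambda + \rho\eta\, i (is) = \lambda - \rho\eta s,
\qquad
\alpha = \frac{(is)^2}{2} - \frac{i(is)}{2} = -\frac{s(s-1)}{2},
\]
\[
d^2 = \beta^2 + 2\alpha\eta^2 = \beta^2 - \eta^2 s(s-1).
\]
```

These are the expressions that appear as (2.2a) and (2.2b), so for real $s$ both $\beta$ and $d^2$ are real.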
Using a result1 of Lukacs [8], Lewis [7] points out that φ(u, τ ) has singularities
on the imaginary axis at the boundaries of the strip of convergence. Whether there
are any other singularities (necessarily complex-conjugate) on that boundary could
not be readily established. Furthermore, no conclusions can be made about singular-
ities outside of the strip of convergence. The purpose of this note is to provide full
characterization of the singularities of φ(u, τ ).
2 Main Result
The following theorem, although presented as an existence result, allows for con-
struction of the singularities of φ(u, τ ) via standard numerical methods.
Theorem 2.1 All singularities of φ(u, τ ) are pure imaginary.
Proof Assume η > 0, as for η = 0 we have the Black-Scholes model whose char-
acteristic function is free of singularities (see, e.g., Lewis [7]).
To simplify notation we put is = u and show that the (essential) singularities of
φ(is, τ ) are real. To this end, we show that the transcendental equation
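\[
\frac{r_+}{r_-} = e^{-d\tau}, \tag{2.1}
\]
where
\begin{align}
\beta &= \lambda - \rho\eta s \tag{2.2a} \\
d &= \sqrt{\beta^2 - \eta^2 s(s-1)} \tag{2.2b} \\
r_\pm &= \frac{\beta \pm d}{\eta^2} \tag{2.2c}
\end{align}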
1 As noted in Lukacs [8], this is a corollary of a more general result on Laplace transforms, e.g.
Theorem II.5b of Widder [9].
We consider (2.1) and (2.2) as a system in d and s. Equation (2.1) can be written as
\[
s = \frac{d^2 - \lambda^2}{\eta^2 - 2\lambda\rho\eta}. \tag{2.6}
\]
\[
s = \frac{1}{1 - e^{-\lambda\tau}}.
\]
If d = 0 we have equality in (2.3), while from (2.5) and (2.6) it follows that the roots
in s are real.
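For orientation, equation (2.1) can be put in hyperbolic form by elementary algebra. The following sketch uses only (2.1) and (2.2c) and assumes $1 + e^{-d\tau} \ne 0$; it is offered as a plausible reading of the intermediate step, not as a quotation of the original equation.

```latex
\[
\frac{\beta + d}{\beta - d} = e^{-d\tau}
\;\Longrightarrow\;
\beta\,(1 - e^{-d\tau}) = -d\,(1 + e^{-d\tau})
\;\Longrightarrow\;
d = -\beta\,\frac{1 - e^{-d\tau}}{1 + e^{-d\tau}} = -\beta \tanh\!\Big(\frac{\tau d}{2}\Big).
\]
```

With $\beta = \lambda - \rho\eta s$ this reads $d = (-\lambda + \rho\eta s)\tanh(\tau d/2)$, which holds trivially at $d = 0$ and has the same shape as the $\tanh$ relation displayed in the sequel.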
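For $d \ne 0$ substituting (2.5) in (2.3) yields
\[
d = \Big( -\lambda + \rho\,\frac{\eta - 2\rho\lambda \pm \sqrt{(\eta - 2\rho\lambda)^2 + 4q^2(\lambda^2 - d^2)}}{2q^2} \Big) \tanh(\tau d/2),
\]
\[
\big( 2dq^2 \coth(\tau d/2) + 2\lambda - \rho\eta \big)^2 = \rho^2 \big( (\eta - 2\rho\lambda)^2 + 4q^2(\lambda^2 - d^2) \big), \tag{2.7}
\]
and
\[
d \coth(\tau d/2) + \lambda = \frac{\rho}{\eta - 2\rho\lambda}\,(\lambda^2 - d^2). \tag{2.8}
\]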
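With
\[
\frac{\tau d}{2} = iz, \quad a = \operatorname{sgn}(\rho)\,\frac{\tau(\eta - 2\lambda\rho)}{4q}, \quad b = \frac{\tau\lambda}{2}, \quad c = \frac{|\rho|}{q}
\]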
Lemma 2.2 implies that the roots of (2.7) are either real or pure imaginary. For the
special case (2.8), Lemma 2.3 with
\[
\frac{\tau d}{2} = iz, \quad b = \frac{\tau\lambda}{2}, \quad c = \frac{2\rho}{\tau(\eta - 2\rho\lambda)}
\]
implies that the corresponding roots are also either real or pure imaginary.
Therefore, it follows that for $d \ne 0$ the expression in the brackets in (2.3) is
real (being a ratio of either real or imaginary numbers), which in turn implies that the
solutions of the transcendental equation (2.1) are real in $s$.
Lemma 2.2 For real a and real nonnegative b, c the roots of the equation
Proof For c = 0 the result follows from Lemma A.6. If c > 0 from Lemma 2.4 we
have that for sufficiently large N equation (2.9) has 4N + 2 roots inside the square
with vertices (N + 1/2)(±π, ±iπ). On the other hand, from Lemmas A.1 and A.3
it follows that there are 4N + 2 real or pure imaginary roots inside the same square,
so the result follows.
Lemma 2.3 For real nonnegative b and real c the roots of the equation
Proof For $c = 0$ the result follows from Lemma A.6. Putting $a = 0$ in Lemma 2.4
we conclude that for every $c \ne 0$ and sufficiently large $N$ equation
has 4N + 4 roots inside the square with vertices (N + 1/2)(±π, ±iπ). On the other
hand, from Lemmas A.2 and A.4 it follows that both equations
z cot(z) + b = ±c(b2 + z 2 )
have 2N + 2 real or pure imaginary roots inside the same square, whence the result
follows.
In the next lemma² we make repeated use of Rouché's theorem (e.g., Hille [6]
[Theorem 9.2.3]).
Lemma 2.4 Let $C_N$, $N \in \mathbb{N}$ denote the square in the complex plane with vertices at
$(N + \frac12)(\pm\pi, \pm i\pi)$. Then for real $a$, nonnegative $b$, $c$, and $d = 1, 2$ there exists
$N_0 \in \mathbb{N}$ such that for every integer $N > N_0$ the equation
\[
|\cot(z)| = \Big| \cot\Big( \frac{\pi}{2} + N\pi + iy \Big) \Big| = |\tan(iy)| = \frac{e^y - e^{-y}}{e^y + e^{-y}} < 1, \tag{2.12}
\]
2 e y + e−y
Together with (2.12) and the fact that $|\cot(z)| = |\cot(-z)|$ this implies
\[
|\cot(z)| \le \frac{1 + e^{-(2N+1)\pi}}{1 - e^{-(2N+1)\pi}} =: k_N, \quad z \in C_N.
\]
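The bound on $|\cot(z)|$ over the square $C_N$ is easy to confirm numerically. The following is a minimal sketch, assuming NumPy; it samples the boundary of $C_N$ and compares the largest value of $|\cot(z)|$ with $k_N$. The helper names and the number of sample points are illustrative.

```python
import numpy as np

def cot(z):
    """Complex cotangent."""
    return np.cos(z) / np.sin(z)

def max_abs_cot_on_square(n, samples=4001):
    """Largest sampled |cot(z)| on the boundary of the square C_N."""
    half_side = (n + 0.5) * np.pi
    s = np.linspace(-half_side, half_side, samples)
    boundary = np.concatenate([
        half_side + 1j * s,   # right vertical side
        -half_side + 1j * s,  # left vertical side
        s + 1j * half_side,   # top horizontal side
        s - 1j * half_side,   # bottom horizontal side
    ])
    return float(np.max(np.abs(cot(boundary))))

def k_bound(n):
    """The constant k_N from the displayed inequality."""
    e = np.exp(-(2 * n + 1) * np.pi)
    return (1 + e) / (1 - e)
```

On the horizontal sides the maximum is attained where $\sin(\operatorname{Re} z) = 0$ and equals $\coth((N + \frac12)\pi) = k_N$, so the bound is sharp; on the vertical sides $|\cot(z)| < 1$, in line with (2.12).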
For $z \in C_N$ we have
2 A weaker version of this result (dealing with the case of real roots only) appears as Problem E1295
their multiplicities). For sufficiently large N those two numbers are 4N and 2d
respectively, whence the equation (2.11) has 4N + 2d roots inside C N .
Consider now $d = 1$, $0 < c < 1$. Let $D_N$ be the square with vertices at $(\pm N\pi, \pm Ni\pi)$,
and let $\tilde{D}_N$ denote $D_N$ extended with semicircles of radius $\epsilon$ so that the poles of
$\cot(z)$ at $\pm N\pi$ are inside $\tilde{D}_N$, but the real zeros of (2.11) in $(N\pi, (N + 1/2)\pi)$
and $(-(N + 1/2)\pi, -N\pi)$ described in Lemma A.1 remain outside. For ease of
exposition in what follows we make $\epsilon$ smaller if necessary, which can be done without
invalidating previously established statements.
Similarly as before, on the right vertical side of $D_N$ we have
\[
|\tan(z)| = |\tan(N\pi + iy)| = |\tan(iy)| = \frac{e^y - e^{-y}}{e^y + e^{-y}} < 1, \tag{2.13}
\]
Together with (2.13) and the fact that $|\cot(z)| = |\cot(-z)|$ this implies that for
sufficiently small $\epsilon > 0$
\[
|\cot(z)| \ge \frac{1 - e^{-2N\pi}}{1 + e^{-2N\pi}} =: k_n, \quad z \in D_N. \tag{2.15}
\]
On $D_N$ we have
\[
\frac{|c^2(a^2 + b^2 + z^2)|}{|(z\cot(z) + b - ac)^2|} \le \frac{c^2}{\Big( |\cot(z)| - \big| \frac{b - ac}{z} \big| \Big)^2} + \frac{|c^2(a^2 + b^2)|}{\big( |z||\cot(z)| - |b - ac| \big)^2},
\]
\[
\frac{|c^2(a^2 + b^2 + z^2)|}{|(z\cot(z) + b - ac)^2|} \le \frac{c^2}{\Big( k_N - \big| \frac{b - ac}{z} \big| \Big)^2} + \frac{|c^2(a^2 + b^2)|}{\big( |z| k_N - |b - ac| \big)^2}.
\]
so we are left with the number of zeros of the second mapping, which for sufficiently
small $\epsilon$, according to Lemma A.6, is $4N$. Therefore, from Lemma A.5, and taking
into account two real zeros in $(-(N + 1/2)\pi, -N\pi) \cup (N\pi, (N + 1/2)\pi)$ whose
existence is established in Lemma A.1, we conclude that for $d = 1$, $0 < c < 1$ and
sufficiently large $N$ there are $4N + 2$ zeros inside $C_N$.
Finally, consider the case $c = 1$, $d = 1$. Put $\alpha = b - ac$, $\beta^2 = a^2 + b^2$, so that
we get
\[
\frac{\cos(2z)}{\sin^2(z)}\,z^2 + \alpha z\,\frac{\sin(2z)}{\sin^2(z)} - (\beta^2 - \alpha^2) = 0, \tag{2.16}
\]
or, equivalently,
On $D_N$ we have
\[
\frac{|2\alpha z\cot(z) - (\beta^2 - \alpha^2)|}{|2\cot(2z)\cot(z) z^2|} \le \frac{\Big| 2\alpha - \frac{\beta^2 - \alpha^2}{z\cot(z)} \Big|}{2|z||\cot(2z)|} \le \frac{|2\alpha| + \frac{|\beta^2 - \alpha^2|}{|z| k_N}}{2|z| k_{2N}}.
\]
that is,
\[
\Big| \frac{\cos(2z)}{\sin^2(z)}\,z^2 \Big| > \Big| \alpha z\,\frac{\sin(2z)}{\sin^2(z)} - (\beta^2 - \alpha^2) \Big|, \quad z \in D_N.
\]
Thus, by Rouché's theorem, the number of roots of (2.16) inside $D_N$
equals the number of zeros of $z \mapsto \frac{\cos(2z)}{\sin^2(z)}\,z^2$ inside $D_N$, which is $4N$. Therefore,
reasoning as in the previous part of the proof we conclude that for $c = 1$ we have
$4N + 2$ zeros of (2.11) inside $C_N$ for $N$ large enough.
Acknowledgments I wish to thank Tomislav Šekara of University of Belgrade and the anonymous
referee for their comments and suggestions.
Addendum
The first version of this paper appeared on SSRN in 2007 (following the author’s
investigation into the applicability of Talbot’s numerical inversion method in transform
analysis of option prices). Since then several publications have appeared using
the main result of the present work, which we list below for completeness.
446 V. Lucic
Based on Theorem 2.1, in Ferreiro-Castilla [2] and del Baño Rollin et al. [1] a
smoothness result for the density of the log-spot in the Heston model is presented,
together with an alternative proof of our main result. Theorem 2.1 was also used in
Friz et al. [3] in the study of the asymptotic behaviour of the stock price density in
the negatively correlated Heston model. Finally, Lemma 6.1 from Gulisashvili et al.
[4], used in the study of the asymptotic behaviour of the mixing distribution density
in the uncorrelated Heston model, is quite close in spirit to the results presented here.
Appendix
Lemma A.1 For N sufficiently large, equation (2.9) has 4N −2 real roots in (−(N +
1/2)π, −π) ∪ (π, (N + 1/2)π).
Proof By Lemma A.5, for every N > 1 equation (2.9) has two real roots in each of
the intervals (−(k + 1)π, −kπ) and (kπ, (k + 1)π), k = 1, 2, . . . N − 1.
Rewrite (2.9) as
$$z\cot(z) = -(b - ac) \pm c\sqrt{a^2 + b^2 + z^2}. \qquad (A.1)$$
For N > 0
lim z cot(N π) = +∞, z cot(N π + π/2) = 0, (A.2)
z→N π+
so we conclude that for sufficiently large N the equation with plus sign has one real
root in (N π, (N + 1/2)π), hence by symmetry in (−(N + 1/2)π, −N π).
Lemma A.2 For N sufficiently large equation (2.10) has 2N real roots in (−(N +
1/2)π, −π) ∪ (π, (N + 1/2)π) if c > 0, and 2N − 2 real roots if c < 0.
Proof By Lemma A.5 for every N > 1 equation (2.10) has one real root in each of
the intervals (−(k + 1)π, −kπ) and (kπ, (k + 1)π), k = 1, 2, . . . N − 1.
Rewrite (2.10) as
z cot(z) = −b + c(b2 + z 2 ). (A.3)
From (A.2) and (A.3) we conclude that if c > 0 for sufficiently large N equa-
tion (2.10) has one real root in (N π, (N + 1/2)π), hence by symmetry in (−(N +
1/2)π, −N π).
Lemma A.3 For real a, nonnegative b, and c > 0 equation (2.9) has either four
real roots in (−π, π), or two real roots in (−π, π) and two imaginary roots.
Proof The proof follows by simple geometrical considerations. For z = 0 the right-
hand side of (A.1) assumes two values
$$\alpha_1 := -(b - ac) + c\sqrt{a^2 + b^2}, \qquad \alpha_2 := -(b - ac) - c\sqrt{a^2 + b^2}.$$
On Singularities in the Heston Model 447
Since $ac - c\sqrt{b^2 + a^2} \le 0$ we have $\alpha_2 \le 0$. On the other hand, the function
$x \mapsto x\cot(x)$ equals one at the origin and strictly decreases on $[0, \pi)$, with a discontinuity
of the second kind at $\pi$. Thus, (A.1) has one real root corresponding to the intersection
of $x \mapsto x\cot(x)$ and $x \mapsto -(b - ac) - c\sqrt{a^2 + b^2 + x^2}$ on $(0, \pi)$.
If $\alpha_1 < 1$, following the same argument we conclude that there is another real root
in $(0, \pi)$ corresponding to the intersection of $x \mapsto x\cot(x)$ and
$x \mapsto -(b - ac) + c\sqrt{a^2 + b^2 + x^2}$. If $\alpha_1 = 1$ we have a double root at zero.
Thus, based on the above considerations and the symmetry around the origin it
follows that in (−π, π) equation (A.1) has four real roots if α1 ≤ 1, and two real roots
if α1 > 1. Therefore, to complete the proof we show that (A.1) has two imaginary
roots if α1 > 1.
Put z = i y, y ∈ R in (A.1) to get
$$y\coth(y) = -(b - ac) \pm c\sqrt{a^2 + b^2 - y^2}. \qquad (A.4)$$
On the left-hand side we have a continuous function equal to one at the origin that
tends to infinity as y increases. Note that $\alpha_1 > 1$ implies $a^2 + b^2 > 0$. Thus, on
the right-hand side we have a semi-circle starting at (0, α1 ) on the ordinate, entering
into the right half-plane, and ending at (0, α2 ) on the ordinate, half-encircling the
point (0, 1) (as α1 > 1 and α2 ≤ 0). Therefore, there must exist y0 > 0 for which
the equality holds in (A.4). Since −y0 also solves (A.4), we have two imaginary
solutions.
Lemma A.4 Assume b ≥ 0. For c > 0 equation (2.10) has either two real roots
in (−π, π) or two imaginary roots. If c < 0 equation (2.10) has two real roots in
(−π, π) and two imaginary roots.
Therefore, if c > 0 and −b+cb2 > 1 the right-hand side dominates the left-hand side
at the origin, while the opposite is true for sufficiently large y. From the continuity of
the two functions it then follows that (A.5) has one positive root, hence by symmetry
one negative root. Finally, if c < 0 the left-hand side dominates the right-hand side at
the origin, while the opposite is true for sufficiently large y, giving a pair of imaginary
roots.
Lemma A.5 For every positive integer k equation (2.11) has two real roots in each
of the intervals (−(k + 1)π, −kπ) and (kπ, (k + 1)π).
Proof The result follows from the fact that on each of those intervals the range of
the map x → x cot(x) is the whole real line, while the maps x → −b + ac ± c(a 2 +
b2 + x 2 )d/2 are bounded.
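The argument of Lemma A.5 can be illustrated numerically. The following is a minimal sketch for the case d = 1, assuming hypothetical parameter values for a, b, c (they are not taken from the text): on each interval (kπ, (k+1)π) the map x → x cot(x) runs from +∞ to −∞ while the right-hand side stays bounded, so a plain bisection locates one root for each sign.

```python
import math

def lhs(x):
    """x * cot(x); on (k*pi, (k+1)*pi), k >= 1, it decreases from +inf to -inf."""
    return x * math.cos(x) / math.sin(x)

def rhs(x, a, b, c, sign):
    """-(b - a*c) + sign * c * sqrt(a^2 + b^2 + x^2), i.e. the bounded map for d = 1."""
    return -(b - a * c) + sign * c * math.sqrt(a * a + b * b + x * x)

def root_in_interval(k, a, b, c, sign, iters=200):
    """Bisection for lhs(x) = rhs(x) on (k*pi, (k+1)*pi).

    Relies on lhs - rhs being positive near k*pi and negative near (k+1)*pi.
    """
    lo = k * math.pi + 1e-9
    hi = (k + 1) * math.pi - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lhs(mid) - rhs(mid, a, b, c, sign) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    a, b, c = 0.5, 1.0, 0.7  # illustrative values only
    for k in range(1, 4):
        print(k, root_in_interval(k, a, b, c, +1), root_in_interval(k, a, b, c, -1))
```

The two roots found in each interval are distinct because the two branches of the right-hand side differ by $2c\sqrt{a^2 + b^2 + x^2} > 0$.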
z cot(z) = a, z ∈ Z (A.6)
has 2N roots inside the square with vertices (±N π, ±N iπ). The roots are real or
pure imaginary.
Proof For a = 0 the roots are the zeros of cos(z). If a ≠ 0, from (2.13) and (2.14)
we conclude that for sufficiently large N
has 2N + 1 roots inside the square with vertices (±N π, ±N iπ). If k > 0 it has two
real roots in (−(k + 1)π/2, kπ/2) ∪ (kπ/2, (k + 1)π/2) if either a > 0 and k is
even, or a < 0 and k is odd.
On the other hand, in (−π, π) there are three roots (counting their multiplicities)
if a ≥ 1 and one root if 0 < a < 1. In the latter case there are two imaginary roots
(c.f. example on p. 255 of Hille [6]). Since (A.6) has one root less at the origin, the
result follows.
References
1. del Baño Rollin, S., Ferreiro-Castilla, A., Utzet, F.: On the density of log-spot in the Heston
volatility model. Stoch. Process. Appl. 120, 2037–2063 (2010)
2. Ferreiro-Castilla, A.: Stochastic Calculus and Analytic Characteristic Functions: Applications
to Finance. Ph.D. thesis, Universitat Autònoma de Barcelona (2011)
3. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11, 1151–1164 (2011)
4. Gulisashvili, A., Stein, E.M.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61, 287–315 (2010)
5. Heston, S.L.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
6. Hille, E.: Analytic Function Theory, vol. 1. Blaisdell, New York (1965)
7. Lewis, A.L.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach (2000)
8. Lukacs, E.: Characteristic Functions. Charles Griffin & Co., London (1970)
9. Widder, D.V.: The Laplace Transform. Princeton University Press, Princeton (1946)
On the Probability Density Function
of Baskets
Abstract The state price density of a basket, even under uncorrelated Black–Scholes
dynamics, does not allow for a closed form density. (This may be rephrased as a
statement on the sum of lognormals, which is especially annoying because such sums are used
frequently in Financial and Actuarial Mathematics.) In this note we discuss short
time and small volatility expansions, respectively. The method works for general
multi-factor models with correlations and leads to the analysis of a system of ordi-
nary (Hamiltonian) differential equations. Surprisingly perhaps, even in two asset
Black–Scholes situation (with its flat geometry), the expansion can degenerate at
a critical (basket) strike level; a phenomenon which seems to have gone unnoticed
in the literature to date. Explicit computations relate this to a phase transition from
a unique to more than one “most-likely” paths (along which the diffusion, if suit-
ably conditioned, concentrates in the afore-mentioned regimes). This also provides a
(quantifiable) understanding of how precisely a presently out-of-money basket option
may still end up in-the-money.
C. Bayer
Weierstrass Institute, Mohrenstrasse 39, 10117 Berlin, Germany
e-mail: [email protected]
P.K. Friz (B)
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]
P.K. Friz
Weierstraß-Institut für Angewandte Analysis und Stochastik, Berlin, Germany
P. Laurence
Dipartimento di Matematica, Università di Roma 1 Piazzale Aldo Moro 2,
00185 Rome, Italy
P. Laurence
Courant Institute of Mathematical Sciences, New York University,
251 Mercer Street, New York, NY 10012, USA
1 Introduction
As is well known, the sum of independent log-normal variables does not admit a
closed-form density. And yet, there are countless applications in Finance and Actuarial
Mathematics where such sums play a crucial role; consider for instance the law
of a Black–Scholes basket B at time T, i.e. the weighted average of d geometric
Brownian motions.
As a consequence, there is a natural interest in approximations and expansions,
see e.g. [9] and the references therein. This article contains a detailed investigation
in small volatility and short time regimes. Forthcoming work of A. Gulisashvili
and P. Tankov [12] deals with tail asymptotics. Our methods are not restricted to
the geometric Brownian motion case: in principle, each Black–Scholes component
could be replaced by the asset price in a stochastic volatility model, such as the
Stein–Stein model [16], with full correlation between all assets and their volatilities.
In the end, explicit solutions only depend on the analytical tractability of a system of
ordinary differential equations. If such tractability is not given, one can still proceed
with numerical ODE solvers.
As a matter of fact, our aim here is not to push the generality in which our methods
work: one can and should expect involved answers in complicated models. Rather, our
main—and somewhat surprising—insight is that unexpected phenomena are already
present in the simplest possible setting: to this end, our first focus will be on the
case of d = 2 independent Black–Scholes assets (without drift and correlation, with
unit spot and unit volatility). To be more specific, if $C_B$ denotes the fair value of an
(out-of-money) call option on the basket B struck at K , one naturally expects, for a
small maturity T ,
$$\frac{\partial^2}{\partial K^2}\, C_B(K, T) \sim (\text{const})\, \exp\left(-\frac{\Lambda(K)}{T}\right) \frac{1}{\sqrt{T}}.$$
And yet, while true for most strikes, it fails for K = K ∗ ; in fact,
$$\frac{\partial^2}{\partial K^2}\, C_B(K, T)\Big|_{K = K^*} \sim (\text{const})\, \exp\left(-\frac{\Lambda(K^*)}{T}\right) \frac{1}{T^{3/4}}.$$
To the best of our knowledge, and despite the seeming triviality of the situation (two
independent Black–Scholes assets!), the existence of a “special” strike level K ∗ , at
which the value of a basket option (here: butterfly spread1 ) has a “special” decay
behavior, as maturity approaches 0, seems to be new. There are different proofs
of this fact; the most elementary argument—based on the analysis of a convolu-
tion integral—is given in Sect. 2. However, this approach—while telling us what
happens—does not tell us how it happens.
The main contribution of this note is precisely a good understanding of the latter.
In fact, there is a clear picture that comes with K ∗ . For K < K ∗ and conditional on the
1 Extensions to spreads and vanilla options are possible and will be discussed elsewhere.
option to expire on the money, there is a unique “most likely” path around which the
underlying asset price process will concentrate as maturity approaches 0. For K >
K ∗ , however, this ceases to be true: there will be two distinct (here: equally likely)
paths around which concentration occurs. What underlies this interpretation is that
large deviation theory not only characterizes the probability of unlikely events (such
as expiration in-the-money, if presently out-of-the-money, as time to maturity goes to
zero) but also the mechanism via which these events can occur. Such understanding
was already crucial in previous works on baskets aiming at quantification of basket
(implied vol) skew relative to its components, starting with [1, 2]. As a matter of
fact, the analysis in these papers relied on the statement that “generically there is
a unique arrival point (of a unique energy minimizing path) on the (basket-strike)
arrival manifold”. The situation, however, even in the Black–Scholes model, is more
involved. And indeed, we shall establish existence of a critical strike K ∗ , at which
one sees the phase-transition from one to two energy minimizing, “most likely”,
paths.2 And this information will have meaning to traders (as long as they believe in
a diffusion model as maturity approaches 0, which may or may not be a good idea
…) as it tells them the possible scenarios in which an out-of-the money basket option
may still expire in the money.
Let us conclude this introduction with a few technical notes. We view the evolution
of the basket price—even in the Black–Scholes model—as a stochastic volatility
evolution model; by which we mean d Bt /Bt = σ (t, ω)dWt (as opposed to a local
vol evolution where σ = σ (t, Bt )). This should explain why the methods developed
in Part I of [6, 7] for the analysis of stochastic volatility models (then used in Part
II, [7], to solve the concrete smile problem (shape of the wings) for the correlated
Stein–Stein model), are also adequate for the analysis of baskets.
$$B_T = \sum_{i=1}^{d} S_0^i \exp\left(\mu^i T + \sigma^i W_T^i\right).$$
2 It can be shown that, sufficiently close to the arrival manifold, there is in fact a unique energy
minimizing path. The (near-the-money) analysis of [1, 2] is then justified.
saddle point method. It will be enough for our purposes to illustrate the method
in the afore-mentioned simplest possible setting:
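$$d = 2, \quad S_0^1 = S_0^2 = 1, \quad \mu^1 = \mu^2 = 0, \quad \sigma^1 = \sigma^2 = 1.$$
In other words, $B_T = \exp(W_T^1) + \exp(W_T^2)$. We claim that for some constant
$c_0 = c_0(K) > 0$
$$f(K) = \begin{cases} \exp\left(-\dfrac{\Lambda(K)}{T}\right) \dfrac{1}{\sqrt{T}}\, (c_0 + O(T)), & \text{when } K \neq K^*, \quad (1a) \\[2ex] \exp\left(-\dfrac{\Lambda(K^*)}{T}\right) \dfrac{1}{T^{3/4}}\, (c_0 + O(T)), & \text{when } K = K^*, \quad (1b) \end{cases} \qquad (1)$$
with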
K ∗ = 2e ≈ 5.43656
and
$$\Lambda(K) = \inf\{h_K(x) \mid x \in [0, K]\}$$
with
$$h_K(x) := (\log x)^2 + (\log(K - x))^2. \qquad (2)$$
Note that for $K \le K^*$ we can explicitly solve this minimization problem and obtain
$\Lambda(K) = \log(K/2)^2$ with minimizer $x^* = K/2$, corresponding to
the single local extremum of $h_K$. For $K > K^*$, we have two global minima, which
cannot be given in closed form, and hence $\Lambda(K)$ can only be computed numerically
(Fig. 1).
Fig. 1 Plot of $h_K$ for different choices of K. a For $K < K^*$ there is a unique global minimum at
$x^* = K/2$ which is non-degenerate in the sense that $h_K''(x^*) > 0$. b For $K = K^*$ there is a unique
global minimum at $x^* = K/2$ which is degenerate in the sense that $h_K''(x^*) = 0$. c For $K > K^*$,
$x = K/2$ gives a local maximum. There are two symmetric global minimizers, which are not given
in closed form
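The three regimes of $h_K$ can be reproduced with a few lines of code. The sketch below is only an illustration: the function names and the grid-scan approach are choices made here, not taken from the text. It scans $h_K$ on a uniform grid of (0, K), reports its local minimizers, and evaluates the closed-form second derivative $16(1 - \log(K/2))/K^2$ at the midpoint.

```python
import math

def h(K, x):
    """h_K(x) = (log x)^2 + (log(K - x))^2 as in (2)."""
    return math.log(x) ** 2 + math.log(K - x) ** 2

def h_second_at_mid(K):
    """Second derivative of h_K at x = K/2, i.e. 16 (1 - log(K/2)) / K^2."""
    return 16.0 * (1.0 - math.log(K / 2.0)) / K ** 2

def local_minimizers(K, n=20000):
    """Grid points of (0, K) at which h_K has a discrete local minimum."""
    xs = [K * i / n for i in range(1, n)]
    vals = [h(K, x) for x in xs]
    return [xs[i] for i in range(1, len(xs) - 1)
            if vals[i] < vals[i - 1] and vals[i] <= vals[i + 1]]

if __name__ == "__main__":
    for K in (4.0, 8.0):
        mins = local_minimizers(K)
        print(K, mins, [h(K, x) for x in mins], h_second_at_mid(K))
```

For K = 4 the scan returns the single minimizer K/2, while for K = 8 it returns two symmetric minimizers whose value of $h_K$ lies below $h_K(K/2)$.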
The stock price $S_T^i$ has a log-normal distribution with parameters $\mu^i = 0$ and
$\xi^i = \sigma^i\sqrt{T} = \sqrt{T}$, where the density of the log-normal distribution is given by
$$f_{\mu,\xi}(x) = \frac{1}{\sqrt{2\pi}\,\xi x} \exp\left(-\frac{(\log x - \mu)^2}{2\xi^2}\right). \qquad (3)$$
Obviously, the density of the sum of these two independent log-normal random
variables satisfies
$$f(K) = \int_0^K f_{\mu^1,\xi^1}(K - x)\, f_{\mu^2,\xi^2}(x)\, dx. \qquad (4)$$
In order to apply the Laplace approximation to (4), we compute the minimizer for
h K , which is found by the first order condition
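$$h_K'(x) = 0 \iff \frac{\log x}{x} - \frac{\log(K - x)}{K - x} = 0. \qquad (5)$$
By symmetry, $x^* = K/2$ always solves (5), and there
$$h_K''(x^*) = h_K''(K/2) = 16\, \frac{1 - \log(K/2)}{K^2}$$
and
$$h_K''(x^*) = 0 \iff K = 2e. \qquad (6)$$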
With more work one can see that also the global minima $x_1^*, x_2^*$, in the case K > 2e,
are non-degenerate. Hence, whenever K ≠ 2e a standard Laplace method leads to
the expansion (1a). In the remainder of this section, we consider the degenerate case
and establish (1b).
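The different algebraic factors in the generic and the degenerate case can be observed numerically. The sketch below is an illustration under stated assumptions: it evaluates the convolution integral (4) with $\mu^i = 0$, $\xi^i = \sqrt{T}$ by a midpoint rule, strips off the exponential factor $\exp(-h_K(K/2)/(2T))$, and estimates the exponent of T from two small maturities. The function names, the grid size and the chosen maturities are hypothetical, and the routine is only meant for $K \le 2e$, where $K/2$ is the global minimizer of $h_K$.

```python
import math

def scaled_density(K, T, n=40000):
    """f(K) * exp(h_K(K/2) / (2T)), with f(K) from (4), via the midpoint rule.

    Intended for K <= 2e, where h_K attains its minimum at K/2.
    """
    h_mid = 2.0 * math.log(K / 2.0) ** 2
    dx = K / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        hx = math.log(x) ** 2 + math.log(K - x) ** 2
        # product of the two log-normal densities, exponential factor removed
        total += math.exp(-(hx - h_mid) / (2.0 * T)) / (x * (K - x))
    return total * dx / (2.0 * math.pi * T)

def log_slope(K, T1, T2):
    """Estimated power of T in the algebraic prefactor between T1 and T2."""
    ratio = scaled_density(K, T1) / scaled_density(K, T2)
    return math.log(ratio) / math.log(T1 / T2)

if __name__ == "__main__":
    print("K = 4 :", log_slope(4.0, 0.004, 0.002))           # close to -1/2
    print("K = 2e:", log_slope(2.0 * math.e, 0.004, 0.002))  # close to -3/4
```

The estimated slopes are finite-T approximations, so they only approach −1/2 and −3/4 as the maturities shrink.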
Our main tools here are novel marginal density expansions in the small-noise regime [6].
These were used in order to compute the large-strike behavior of implied volatility in
the correlated Stein–Stein model; [11, 16].3
In fact, the technical assumptions of [6] were satisfied in the analysis of the Stein–
Stein model whereas in the (seemingly) trivial case of two IID Black-Scholes assets,
the technical assumptions of [6] are indeed violated for a critical strike K = K ∗ .
The necessity of this condition is then highlighted by the fact, as was seen in the
previous section, that
$$\frac{\partial^2}{\partial K^2}\, C_B(K, T)\Big|_{K = K^*} \not\sim (\text{const})\, \exp\left(-\frac{\Lambda(K^*)}{T}\right) \frac{1}{T^{1/2}}.$$
3 Similar investigations have recently been conducted in the Heston model; [10, 13] and the
references therein.
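In the following, we review [6]. Consider a d-dimensional diffusion $(X_t^\varepsilon)_{t \ge 0}$ given
by the stochastic differential equation
$$dX_t^\varepsilon = b(\varepsilon, X_t^\varepsilon)\, dt + \varepsilon\sigma(X_t^\varepsilon)\, dW_t, \quad \text{with } X_0^\varepsilon = x_0^\varepsilon \in \mathbb{R}^d, \qquad (7)$$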
and
x0ε = x0 + εx̂0 + o (ε) as ε ↓ 0. (10)
Assume b (ε, ·) → σ0 (·) in the sense of (8), (9), and X0ε ≡ x0ε → x0 as ε → 0
in the sense of (10). Assume non-degeneracy of σ in the sense that $\sigma\sigma^T$ is strictly
positive definite everywhere in space.5 Fix $y \in \mathbb{R}^l$, $N_y := (y, \cdot)$ and let $\mathcal{K}_y$ be
the space of all $h \in H$, the Cameron-Martin space of absolutely continuous paths
with derivatives in $L^2([0, T], \mathbb{R}^m)$, s.t. the solution to
$$d\phi_t^h = \sigma_0(\phi_t^h)\, dt + \sum_{i=1}^m \sigma_i(\phi_t^h)\, dh_t^i, \quad \phi_0^h = x_0 \in \mathbb{R}^d$$
satisfies $\phi_T^h \in N_y$.
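4 If (7) is understood in Stratonovich sense, so that $dW$ is replaced by $\circ\, dW$, the drift vector field
$b(\varepsilon, \cdot)$ is changed to $\tilde{b}(\varepsilon, \cdot) = b(\varepsilon, \cdot) - (\varepsilon^2/2) \sum_{i=1}^m \sigma_i \cdot \partial\sigma_i$. In particular, $\sigma_0$ is also the limit of
$\tilde{b}(\varepsilon, \cdot)$ in the sense of (8).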
5 This may be relaxed to a weak Hoermander condition with an explicit controllability condition.
6 If $\#\mathcal{K}_y^{\min} = 1$ smoothness of the energy can be shown and need not be assumed; [6]. Note also
that in our application to tail asymptotics, with θ-scaling, θ ∈ {1, 2}, the energy must be linear resp.
quadratic (by scaling) and hence smooth.
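Assume also (i) there are only finitely many minimizers, i.e. $\#\mathcal{K}_y^{\min} < \infty$ where
$$\mathcal{K}_y^{\min} := \left\{ h_0 \in \mathcal{K}_y : \tfrac{1}{2}\|h_0\|_H^2 = \Lambda(y) \right\};$$
(ii) $x_0$ is non-focal for $N_y$ in the sense of [6]. (We shall review below how to check
this.) Then there exists $c_0 = c_0(x_0, y, T) > 0$ such that
$$Y_T^\varepsilon = \Pi_l X_T^\varepsilon = \left(X_T^{\varepsilon,1}, \dots, X_T^{\varepsilon,l}\right), \quad 1 \le l \le d,$$
admits a density with expansion
$$f^\varepsilon(y, T) = e^{-\Lambda(y)/\varepsilon^2}\, e^{\max\{\Lambda'(y) \cdot \hat{Y}_T(h_0)\, :\, h_0 \in \mathcal{K}_y^{\min}\}/\varepsilon}\, \varepsilon^{-l}\, (c_0 + O(\varepsilon)) \quad \text{as } \varepsilon \downarrow 0,$$
where $\hat{Y}_T(h_0) = \Pi_l \hat{X}_T$ and $\hat{X}$ solves
$$d\hat{X}_t = \left(\partial_x b\big(0, \phi_t^{h_0}(x_0)\big) + \partial_x \sigma\big(\phi_t^{h_0}(x_0)\big)\, \dot{h}_0(t)\right) \hat{X}_t\, dt + \partial_\varepsilon b\big(0, \phi_t^{h_0}(x_0)\big)\, dt, \quad \hat{X}_0 = \hat{x}_0. \qquad (11)$$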
We present here the mechanics of the actual computations, in the spirit of the Pon-
tryagin maximum principle (e.g. [15]). For details we refer to [6].
• The Hamiltonian. Based on the SDE (7), with diffusion vector fields σ1 , . . . , σm
and drift vector field σ0 (in the ε → 0 limit) we define the Hamiltonian
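$$\mathcal{H}(x, p) := \langle p, \sigma_0(x) \rangle + \frac{1}{2} \sum_{i=1}^m \langle p, \sigma_i(x) \rangle^2 = \langle p, \sigma_0(x) \rangle + \frac{1}{2} \left\langle p, (\sigma\sigma^T)(x)\, p \right\rangle.$$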
Ht←0 (x0 , p0 )
is the unique solution to the above ODE with initial data $(x_0, p_0)$. Our standing
(regularity) assumptions are more than enough to guarantee uniqueness and local
ODE existence. As in [5, p. 37], the vector field $(\partial_p \mathcal{H}, -\partial_x \mathcal{H})$ is complete, i.e.,
one has global existence. It can be useful to start the flow backwards with time-T
terminal data, say (xT , pT ); we then write
Ht←T (xT , pT )
for the unique solution to (14) with given time-T terminal data. Of course,
• Solving the Hamiltonian ODEs as boundary value problem. Given the target
manifold Na = (a, ·), the analysis in [6] requires solving the Hamiltonian ODEs
(14) with mixed initial-, terminal—and transversality conditions,
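$$\begin{aligned} x(0) &= x_0 \in \mathbb{R}^d, \\ x(T) &= (y, \cdot) \in \mathbb{R}^l \oplus \mathbb{R}^{d-l}, \\ p(T) &= (\cdot, 0) \in \mathbb{R}^l \oplus \mathbb{R}^{d-l}. \end{aligned} \qquad (15)$$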
Each such candidate is indeed admissible in the sense h0 ∈ Ka but may fail
to be a minimizer. We thus compute the energy h0 2H = H(x0 , p0 ) for each
candidate and identify those (“h0 ∈ Kmin a ”) with minimal energy. The procedure
via Hamiltonian flows also yields a unique p0 = p0 (h0 ). If σ0 = 0—as in our
case—the energy is equal to H(x0 , p0 ), otherwise the formula is slightly more
complicated.
• Checking non-focality. By definition [6], $x_0$ is non-focal for $N = (y, \cdot)$ along
$h_0 \in \mathcal{K}_a^{\min}$ in the sense that, with $(x_T, p_T) := H_{T \leftarrow 0}(x_0, p_0(h_0)) \in T^*\mathbb{R}^d$, the matrix
$$\partial_{(z,q)}\Big|_{(z,q)=(0,0)}\, \pi H_{0 \leftarrow T}\left(x_T + \begin{pmatrix} 0 \\ z \end{pmatrix},\ p_T + (q, 0)\right)$$
is non-degenerate (here π denotes the projection π(x, p) = x). Note that in the point-point setting, $x_T = y$ is fixed and only
perturbations of the arrival “velocity” $p_T$ (without restrictions, i.e. without
transversality condition) are considered. Non-degeneracy of the resulting map should then
be called non-conjugacy (between two points; here: xT and x0 ). In the absence of
the drift vector field σ0 , this is consistent with the usual meaning of non-conjugacy;
after identifying tangent- and cotangent-space ∂q|q=0 π H0←T is precisely the dif-
ferential of the exponential map.
• The explicit marginal density expansion. We then have
$$\mathcal{H}(x, p) = \frac{1}{2} \left\langle p, \left(\sigma(x)\sigma(x)^T\right) p \right\rangle,$$
which can be easily seen from the observation that H is constant along solutions of
the Hamiltonian ODEs together with symmetry between (x 1 , p 1 ) and (x 2 , p 2 ). This
immediately implies that the inverse flow is given by
$$H_{0 \leftarrow t}(x_t, p_t) = \begin{pmatrix} x_t^1\, e^{-\sigma^2 x_t^1 p_t^1 t} \\ x_t^2\, e^{-\sigma^2 x_t^2 p_t^2 t} \\ p_t^1\, e^{\sigma^2 x_t^1 p_t^1 t} \\ p_t^2\, e^{\sigma^2 x_t^2 p_t^2 t} \end{pmatrix}. \qquad (18)$$
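The closed-form flow can be cross-checked against a direct numerical integration of the Hamiltonian ODEs. The following sketch assumes the two-asset Hamiltonian $\mathcal{H}(x,p) = \frac{1}{2}\sigma^2\big((x^1 p^1)^2 + (x^2 p^2)^2\big)$, i.e. $\sigma(x) = \mathrm{diag}(\sigma x^1, \sigma x^2)$, so that $\dot{x}^i = \sigma^2 (x^i)^2 p^i$ and $\dot{p}^i = -\sigma^2 x^i (p^i)^2$. The classical Runge-Kutta scheme, the function names and the test data are choices made here for illustration.

```python
import math

def hamiltonian_rhs(state, sigma):
    """Right-hand side (dH/dp, -dH/dx) for H = sigma^2/2 * ((x1 p1)^2 + (x2 p2)^2)."""
    x1, x2, p1, p2 = state
    s2 = sigma * sigma
    return (s2 * x1 * x1 * p1, s2 * x2 * x2 * p2,
            -s2 * x1 * p1 * p1, -s2 * x2 * p2 * p2)

def rk4_flow(state, sigma, t, steps=2000):
    """Integrate the Hamiltonian ODEs from time 0 to t with classical RK4."""
    dt = t / steps
    y = tuple(state)
    for _ in range(steps):
        k1 = hamiltonian_rhs(y, sigma)
        k2 = hamiltonian_rhs(tuple(a + 0.5 * dt * b for a, b in zip(y, k1)), sigma)
        k3 = hamiltonian_rhs(tuple(a + 0.5 * dt * b for a, b in zip(y, k2)), sigma)
        k4 = hamiltonian_rhs(tuple(a + dt * b for a, b in zip(y, k3)), sigma)
        y = tuple(a + dt * (b + 2 * c + 2 * d + e) / 6.0
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
    return y

def closed_form_flow(state, sigma, t):
    """Forward flow: x^i p^i is conserved, so each component is an exponential."""
    x1, x2, p1, p2 = state
    e1 = math.exp(sigma ** 2 * x1 * p1 * t)
    e2 = math.exp(sigma ** 2 * x2 * p2 * t)
    return (x1 * e1, x2 * e2, p1 / e1, p2 / e2)

def inverse_flow(state, sigma, t):
    """Inverse flow as in (18), mapping time-t data back to time 0."""
    return closed_form_flow(state, sigma, -t)

if __name__ == "__main__":
    start = (1.0, 1.0, 0.8, 0.3)  # illustrative initial data
    print(rk4_flow(start, 0.9, 1.0))
    print(closed_form_flow(start, 0.9, 1.0))
```

Running the flow backwards in time is the same as replacing t by −t, which is why `inverse_flow` simply reuses the forward formula.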
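$$\begin{aligned} x_0^1 &= S_0^1 = 1, \\ x_0^2 &= S_0^2 = 1, \\ x_1^1 + x_1^2 &= K, \\ p_1^1 - p_1^2 &= 0. \end{aligned}$$
$$\Lambda(K) = \frac{1}{2}\|h_0\|_H^2 = \frac{\log(K/2)^2}{\sigma^2} = \mathcal{H}(x_0, p_0). \qquad (20)$$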
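For strikes well above 2e the boundary value problem has further, asymmetric solutions. The sketch below assumes σ = 1 and unit spots, in which case the closed-form flow gives $x^i(1) = \exp(p_0^i)$ and $p^i(1) = \log(x^i(1))/x^i(1)$, so the transversality condition $p^1(1) = p^2(1)$ on the terminal point $(x, K - x)$ reads $\log(x)/x = \log(K-x)/(K-x)$, and the energy is $\frac{1}{2}\big(\log^2 x + \log^2(K - x)\big)$. The bracketing interval used in the bisection is a hypothetical choice that works for strikes comfortably above 2e.

```python
import math

def transversality_gap(x, K):
    """p^1(1) - p^2(1) for the terminal point (x, K - x), sigma = 1, unit spots."""
    return math.log(x) / x - math.log(K - x) / (K - x)

def energy(x, K):
    """Hamiltonian H(x0, p0) = (log(x)^2 + log(K - x)^2) / 2 of the candidate path."""
    return 0.5 * (math.log(x) ** 2 + math.log(K - x) ** 2)

def asymmetric_endpoint(K, iters=200):
    """Bisection for the asymmetric solution with first coordinate below K/2."""
    lo, hi = 1.0, 0.495 * K
    if not (transversality_gap(lo, K) < 0.0 < transversality_gap(hi, K)):
        raise ValueError("no sign change; K is probably too close to or below 2e")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if transversality_gap(mid, K) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    K = 8.0
    x = asymmetric_endpoint(K)
    print("asymmetric endpoint:", (x, K - x), "energy:", energy(x, K))
    print("symmetric endpoint :", (K / 2, K / 2), "energy:", energy(K / 2, K))
```

For K = 8 the asymmetric candidate has strictly smaller energy than the symmetric one, in line with the two "most likely" paths described for K > K∗.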
M(x1 , p1 ) = ,
−e−σ x1 p1 + x12 p12 σ 2 e−σ x1 p1 −σ 2 (x12 )2 e−σ x1 p1
2 2 2 2 2 2 2 2 2
implying that
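$$M(x_1^*, p_1^*) = \begin{pmatrix} -\dfrac{\sigma^2 K}{2} & \dfrac{2}{K}\,(1 - \log(K/2)) \\[2ex] -\dfrac{\sigma^2 K}{2} & \dfrac{2}{K}\,(-1 + \log(K/2)) \end{pmatrix},$$
which is degenerate (i.e. has zero determinant) if and only if K = 2e. We summarize the results of this calculation as
follows: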
• In the generic case K ≠ 2e, the non-focality condition of Theorem 1 holds true,
and we obtain (from Corollary 3) the following (short time) density expansion of
$B_T = \exp(\sigma W_T^1) + \exp(\sigma W_T^2)$:
$$K \mapsto \exp\left(-\frac{\Lambda(K)}{T}\right) \frac{1}{\sqrt{T}}\, (c_0 + O(T)).$$
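The entries of $M(x_1^*, p_1^*)$ can be checked by finite differences. The sketch below assumes that they arise as partial derivatives of one component $x\, e^{-\sigma^2 x p}$ of the inverse flow (18) at t = 1, evaluated at $x = K/2$ and $p = \log(K/2)/(\sigma^2 K/2)$; the latter value follows from the conservation of $x p$ along the flow together with $x(1) = K/2$. The function names and step size are illustrative choices.

```python
import math

def initial_position(x, p, sigma):
    """One x-component of the inverse flow (18) at t = 1: x * exp(-sigma^2 x p)."""
    return x * math.exp(-sigma ** 2 * x * p)

def entries_at_optimum(K, sigma, eps=1e-6):
    """Central differences in p and x at the symmetric optimal configuration."""
    x = K / 2.0
    p = math.log(K / 2.0) / (sigma ** 2 * x)
    d_dp = (initial_position(x, p + eps, sigma)
            - initial_position(x, p - eps, sigma)) / (2.0 * eps)
    d_dx = (initial_position(x + eps, p, sigma)
            - initial_position(x - eps, p, sigma)) / (2.0 * eps)
    return d_dp, d_dx

def det_M(K, sigma):
    """Determinant of [[a, b], [a, -b]] with a = -sigma^2 K/2, b = (2/K)(1 - log(K/2))."""
    a = -sigma ** 2 * K / 2.0
    b = (2.0 / K) * (1.0 - math.log(K / 2.0))
    return a * (-b) - b * a

if __name__ == "__main__":
    print(entries_at_optimum(4.0, 0.8))
    for K in (4.0, 2.0 * math.e, 8.0):
        print(K, det_M(K, 0.8))
```

The determinant equals $2\sigma^2(1 - \log(K/2))$, so it changes sign exactly at K = 2e.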
Remark 4 It is immediate to use this analysis to deal also with the case of non-unit
(but identical) spots $S_0^1 = S_0^2$ by scaling the Black-Scholes dynamics accordingly,
i.e., by replacing K with $K/S_0^1$. Hence, in this case focality happens when
$\log\left(\frac{K}{2S_0^1}\right) = 1$, i.e., when $K = 2S_0^1 e$.
Remark 5 The question arises whether the critical (“focal”) case K = 2e, with atypical
algebraic factor $T^{-3/4}$, cf. (1b), can also be recovered by a general theorem. Related
results in [14] and also [17] suggest that this may indeed be the case but would
require substantial additional work.
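As explained in Sect. 3 the analysis is really based on a small noise (small vol)
expansion of
$$dB_t^\varepsilon = S_t^{1,\varepsilon}\, \varepsilon\sigma\, dW_t^1 + S_t^{2,\varepsilon}\, \varepsilon\sigma\, dW_t^2,$$
run until time T = 1. Consider now a situation with small rates, also of order $\varepsilon$. In
other words,
$$dS_t^{i,\varepsilon} = \varepsilon r S_t^{i,\varepsilon}\, dt + S_t^{i,\varepsilon}\, \varepsilon\sigma\, dW^i,$$
and then $B_t^\varepsilon = S_t^{1,\varepsilon} + S_t^{2,\varepsilon}$ as before. We still assume $S_0^i = 1$. A look at Theorem 1
(now we cannot use Corollary 3) reveals that the entire leading order computation
remains unchanged (at least at unit time and with trivial changes otherwise). The
resulting (now: small noise) density expansion of $B_T^\varepsilon|_{T=1}$ is more involved and
takes the form
$$K \mapsto \exp\left(-\frac{\Lambda(K)}{\varepsilon^2}\right) \exp\left(\frac{2r\log(K/2)}{\varepsilon\, \sigma^2 \log(2)}\right) \frac{1}{\varepsilon}\, (c_0 + O(\varepsilon)). \qquad (22)$$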
Here $\Lambda(K)$ is given in closed form, cf. (20), so that $\Lambda'(K) = \frac{2\log(K/2)}{\sigma^2 K}$ is also
explicitly known. Furthermore, under similar restrictions on K as before, $h_0$ is (still)
given by (19), so that
$$\phi_t^{h_0} = \begin{pmatrix} (K/2)^t \\ (K/2)^t \end{pmatrix}.$$
implying that Ŷ1 = X̂ 11 + X̂ 12 = r K / log(2). Thus, the second exponential term has
the form given above.
One can immediately write down the Hamiltonian associated to, say two, or d > 2
assets, each of which is governed by local vol dynamics or stochastic vol, based
on additional factors. In general, however, one will be stuck with the analysis of
the resulting boundary value problem for the Hamiltonian ODEs; numerical (e.g.
shooting) methods will have to be used. In some models, including the Stein–Stein
model, we believe (due to the analysis carried out in [7]) that, in special cases, closed
form answers are possible but we will not pursue this here. Instead, we continue with
a few more computations in the Black–Scholes case for d assets.
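$$\mathcal{H}(x, p) = \frac{1}{2} \sum_{i,j=1}^d \rho_{ij}\, \sigma^i p^i x^i\, \sigma^j x^j p^j.$$
The Hamiltonian ODEs then read
$$\dot{x}^l = \sigma^l x^l \sum_{i=1}^d \rho_{li}\, \sigma^i p^i x^i, \quad l = 1, \dots, d,$$
$$\dot{p}^l = -\sigma^l p^l \sum_{i=1}^d \rho_{li}\, \sigma^i p^i x^i, \quad l = 1, \dots, d.$$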
Consequently, it is again easy to see that $\frac{\partial}{\partial t}\left(x^l(t)\, p^l(t)\right) = 0$, implying that $x^l(t)\, p^l(t) =
x_0^l p_0^l$. The Hamiltonian flow has the form
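$$H_{t \leftarrow 0}(x_0, p_0) = \begin{pmatrix} \left( x_0^l \exp\left( \sigma^l \sum_{i=1}^d \rho_{li}\, \sigma^i p_0^i x_0^i\, t \right) \right)_{l=1}^d \\[1ex] \left( p_0^l \exp\left( -\sigma^l \sum_{i=1}^d \rho_{li}\, \sigma^i p_0^i x_0^i\, t \right) \right)_{l=1}^d \end{pmatrix}. \qquad (23)$$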
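Using again that $p^l(t)\, x^l(t) = p^l(0)\, x^l(0)$ for any l, we obtain the inverse Hamiltonian
flow
$$H_{0 \leftarrow t}(x_t, p_t) = \begin{pmatrix} \left( x_t^l \exp\left( -\sigma^l \sum_{i=1}^d \rho_{li}\, \sigma^i p_t^i x_t^i\, t \right) \right)_{l=1}^d \\[1ex] \left( p_t^l \exp\left( \sigma^l \sum_{i=1}^d \rho_{li}\, \sigma^i p_t^i x_t^i\, t \right) \right)_{l=1}^d \end{pmatrix}. \qquad (24)$$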
$$x_0 = S_0, \qquad (25a)$$
$$\sum_{l=1}^d x^l(1) = K, \qquad (25b)$$
$$p^1(1) = p^2(1) = \dots = p^d(1). \qquad (25c)$$
Indeed, the transversality condition (25c) says that the final momentum p(1) is
orthogonal to the surface $\left\{ \sum_{l=1}^d y^l = K \right\}$, whose tangent space is spanned by the
Remark 6 The main point of this calculation is that while explicit solutions are no
longer possible in a general Black-Scholes model, the phenomenon (1) potentially
appears in all Black-Scholes models. Moreover, we stress that the non-focality con-
ditions are easily checked numerically.
Remark 7 Note that the discretely monitored Asian option can be considered as
a special case of a basket option on correlated assets. Indeed, let us consider an
option on
$$\frac{1}{N} \sum_{i=1}^N S_{t_i}, \quad \text{with (for simplicity) } t_i = i\Delta t,\ i = 1, \dots, N.$$
For each individual i ∈ {1, . . . , N} we have, for fixed $\Delta t > 0$, the equality in law
$$S_{t_i} = S_0\, e^{\sigma B_{i\Delta t} - \frac{1}{2}\sigma^2 i\Delta t} = S_0\, e^{\sigma^i W^i_{\Delta t} - \frac{1}{2}(\sigma^i)^2 \Delta t}$$
for $\sigma^i := \sqrt{i}\,\sigma$ and $W^i_{\Delta t} := B_{i\Delta t}/\sqrt{i}$. In law, the vector $\left(W^1_{\Delta t}, \dots, W^N_{\Delta t}\right)$ corre-
Introducing ⎛ ⎞ ⎛ ⎞
1 2 + · · · + d
⎜1⎟ ⎜ −2 ⎟
⎜ ⎟ ⎜ ⎟
q = 1 ⎜ . ⎟ , z = ⎜ .. ⎟,
⎝ .. ⎠ ⎝ . ⎠
1 −d
!
gl = − 1 − (σ l )2 x1l p1l e−(σ ) p1 x1 , l = 2, . . . d.
l 2 l l
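In the symmetric case, we can evaluate M at the optimal configuration and obtain
$$M(x_1^*, p_1^*) = \begin{pmatrix} -\sigma^2 \frac{K}{d} & \left(1 - \log(K/d)\right) \frac{d}{K} & \cdots & \left(1 - \log(K/d)\right) \frac{d}{K} \\ -\sigma^2 \frac{K}{d} & -\left(1 - \log(K/d)\right) \frac{d}{K} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\sigma^2 \frac{K}{d} & 0 & \cdots & -\left(1 - \log(K/d)\right) \frac{d}{K} \end{pmatrix},$$
Thus, the non-focality condition fails if and only if K = de. Moreover, we obtain
the energy
$$\Lambda(K) = \mathcal{H}(x_0^*, p_0^*) = \frac{d}{2}\, \frac{\log(K/d)^2}{\sigma^2}.$$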
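The critical strike K = de also shows up in a simple second-order check, offered here as a sanity check rather than a proof. The sketch assumes d uncorrelated assets with σ = 1 and unit spots, for which twice the energy of a path ending at $x$ is $\sum_i \log^2(x^i)$, and looks at the second derivative of this function at the symmetric point along the direction $e_1 - e_2$, which is tangent to the constraint $\sum_i x^i = K$. The function names and the step size are choices made for this illustration.

```python
import math

def twice_energy(x):
    """sum_i log(x_i)^2, i.e. twice the energy for sigma = 1 and unit spots."""
    return sum(math.log(xi) ** 2 for xi in x)

def tangential_second_derivative(K, d, eps=1e-3):
    """Second difference of twice_energy along e1 - e2 at x = (K/d, ..., K/d)."""
    base = [K / d] * d
    plus = list(base)
    minus = list(base)
    plus[0] += eps
    plus[1] -= eps
    minus[0] -= eps
    minus[1] += eps
    return (twice_energy(plus) - 2.0 * twice_energy(base)
            + twice_energy(minus)) / eps ** 2

def symmetric_energy(K, d, sigma):
    """Closed-form energy (d/2) log(K/d)^2 / sigma^2 of the symmetric configuration."""
    return 0.5 * d * math.log(K / d) ** 2 / sigma ** 2

if __name__ == "__main__":
    d = 3
    for K in (6.0, d * math.e, 10.0):
        print(K, tangential_second_derivative(K, d), symmetric_energy(K, d, 1.0))
```

The second derivative equals $4(1 - \log(K/d))/(K/d)^2$ up to discretization error, so the symmetric point stops being a constrained local minimum precisely when K exceeds de.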
In this final section we take a more geometrical look at the non-focality condition
appearing in Sect. 3.2. Consider the Black Scholes model
$$dS_t^i = \sigma^i S_t^i\, dW_t^i, \qquad d\langle W^i, W^j \rangle_t = \rho_{ij}\, dt.$$
We change parameters S → y → x, by
Si
log
S0i
y i := , x i = L i p y p , i = 1, . . . , d,
σi
p
x i = x i (F) = L i p log S p /S0 /σ p ,
S i = S i (x) = S0i eσi L
ipx p
.
The advantage of using the chart x is that the corresponding Riemannian metric
tensor is the usual Euclidean metric tensor. Thus, we simply have
d(S0 , S) = |x0 − x|
and the geodesics are straight lines as seen from the x-chart. Note furthermore that
$S = S_0$ is transformed to $x = 0$.
The payoff function of the option is given by $\left(\sum_i w_i S_T^i - K\right)^+$. We normalize
$w_i \equiv 1$ and $T \equiv 1$. The strike surface $F = \left\{ S \in \mathbb{R}_+^d \,\middle|\, \sum_{i=1}^d S^i = K \right\}$, which is
(a sub-set of) a hyperplane in S coordinates is, however, transformed to a much more
complicated submanifold in x coordinates. Re-phrasing the equation $\sum_i S^i = K$ in
y-coordinates and solving for $y^d$ gives
$$y^d = \log\left( \left( K - \sum_{i=1}^{d-1} S_0^i\, e^{\sigma^i \sum_{p=1}^d L_{ip} x^p} \right) \Big/ S_0^d \right) \Big/ \sigma^d,$$
and
+ ) * ,
1
d−1 i
d−1
S0i eσ
i ip p
ϕ(q) := q, dd log K− p=1 L q /S0d /σ −
d dk k
L q .
L
i=1 k=1
Note that by the change of coordinates, we are implicitly assuming that S i > 0 for
all i. Moreover, the standard basis e1 (p), . . . , ed−1 (p) of the tangent space Tp F to F
at p = ϕ(q) is given by the columns of the Jacobi matrix of ϕ evaluated at q, more
precisely we have
⎛ ⎡ d−1 j ji j σ j j L jr q r ⎤⎞
1 1 σ L S e r =1
ei (p) = ⎝(δi )d−1 ⎣ + L di ⎦⎠
j j=i 0
j=1 , − dd
σ d K − d−1 S j eσ j rj=1 L jr q r
L
j=1 0
$$\left\{ p + \frac{1}{k_i(p)}\, N(p) \ \Big|\ 1 \le i \le d - 1 \text{ such that } k_i(p) \neq 0 \right\}.$$
ki (p)
$$L(\varphi(q))_{ij} = -\left\langle \frac{\partial}{\partial q^j} (N \circ \varphi)(q),\ e_i(\varphi(q)) \right\rangle, \quad i, j = 1, \dots, d - 1.$$
The principal curvatures k1 (p), . . . , kd−1 (p) are, thus, the eigenvalues of the (d −1)-
dimensional matrix L(p).
Since the calculations become too complicated in the general case, we now again
concentrate on the case of two uncorrelated assets, i.e., d = 2 and ρ = L = I2 . In
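this case, we have
$$e_1(p) = \left( 1,\ -\frac{\sigma^1 S_0^1 e^{\sigma^1 q^1}}{\sigma^2 \left( K - S_0^1 e^{\sigma^1 q^1} \right)} \right),$$
$$N(\varphi(q)) = \frac{\left( \sigma^1 S_0^1 e^{\sigma^1 q^1},\ \sigma^2 \left( K - S_0^1 e^{\sigma^1 q^1} \right) \right)}{\sqrt{(\sigma^1)^2 (S_0^1)^2 e^{2\sigma^1 q^1} + (\sigma^2)^2 \left( K - S_0^1 e^{\sigma^1 q^1} \right)^2}}.$$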
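where for $q = (q^1) \in \mathbb{R}$
$$\kappa(\varphi(q)) = k_1(\varphi(q)) = \frac{K (\sigma^1)^2 (\sigma^2)^2\, S_0^1 e^{\sigma^1 q^1} \left( S_0^1 e^{\sigma^1 q^1} - K \right)}{\left( (\sigma^1)^2 (S_0^1)^2 e^{2\sigma^1 q^1} + (\sigma^2)^2 \left( S_0^1 e^{\sigma^1 q^1} - K \right)^2 \right)^{3/2}}$$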
f1 = q 1 + 1 1
,
σ 1 (σ 2 )2 K K − S01 eσ q
K − S01 eσ q ((σ 1 )2 + (σ 2 )2 )S01 eσ q
1 1 1 1
σ 2 K e−σ q
1 1
1 σ2
f = 2 log
2
+ 2 − − .
σ S02 (σ 1 )2 (σ 1 )2 S01 (σ 1 )2 σ 2 K
Remark 9 As both components of the normal vector N are non-negative on F and the
curvature κ is negative, 0 can only be a focal point if F has a non-empty intersection
with the positive quadrant. Inserting into the parametrization of F, we see that this
can only be the case if K > S01 + S02 . In other words: if the option is in the money, then
the non-focality condition is always satisfied (in the two-dimensional, uncorrelated
case).
Let us again use the parameters of Sect. 2, i.e., $S_0^1 = S_0^2 = 1$, $\sigma^1 = \sigma^2 = \sigma$. Then
we consider $S^* = (K/2, K/2)$, which translates into $x^* = \left( \frac{\log(K/2)}{\sigma}, \frac{\log(K/2)}{\sigma} \right)$.
Inserting into the formulas for the focal points, we obtain
$$f^1(x^*) = f^2(x^*) = \frac{\log\left(\frac{K}{2}\right) - 1}{\sigma}.$$
So, 0 is focal to the optimal configuration, if and only if
K = 2e,
and we recover, once more, the results of Sects. 2 and 4—recall that S0 corresponds
to 0 in x-coordinates.
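The focal point of the optimal configuration can be recomputed from the geometric definition $p + N(p)/\kappa(p)$. The sketch below assumes d = 2 uncorrelated assets, so that the strike surface is the graph $q \mapsto \log\big((K - S_0^1 e^{\sigma^1 q})/S_0^2\big)/\sigma^2$ in the x-chart; it compares the closed-form curvature with a finite-difference curvature of this graph and evaluates the focal point at the symmetric configuration. Function names, default arguments and test values are illustrative.

```python
import math

def graph(q, K, s1, s2, S1=1.0, S2=1.0):
    """Second coordinate of the strike surface in the x-chart (d = 2, L = identity)."""
    return math.log((K - S1 * math.exp(s1 * q)) / S2) / s2

def curvature_formula(q, K, s1, s2, S1=1.0):
    """Closed-form curvature kappa(phi(q)) of the strike surface."""
    u = S1 * math.exp(s1 * q)
    num = K * s1 ** 2 * s2 ** 2 * u * (u - K)
    den = (s1 ** 2 * u ** 2 + s2 ** 2 * (u - K) ** 2) ** 1.5
    return num / den

def curvature_fd(q, K, s1, s2, eps=1e-4):
    """Curvature g'' / (1 + g'^2)^(3/2) of the graph via central differences."""
    g0 = graph(q, K, s1, s2)
    gp = graph(q + eps, K, s1, s2)
    gm = graph(q - eps, K, s1, s2)
    g1 = (gp - gm) / (2.0 * eps)
    g2 = (gp - 2.0 * g0 + gm) / eps ** 2
    return g2 / (1.0 + g1 ** 2) ** 1.5

def focal_point(q, K, s1, s2, S1=1.0, S2=1.0):
    """Point p + N(p) / kappa(p) for p = (q, graph(q))."""
    u = S1 * math.exp(s1 * q)
    n1, n2 = s1 * u, s2 * (K - u)
    norm = math.hypot(n1, n2)
    kappa = curvature_formula(q, K, s1, s2, S1)
    return (q + n1 / (norm * kappa),
            graph(q, K, s1, s2, S1, S2) + n2 / (norm * kappa))

if __name__ == "__main__":
    for K in (4.0, 2.0 * math.e):
        q_star = math.log(K / 2.0)  # sigma = 1
        print(K, focal_point(q_star, K, 1.0, 1.0))
```

For K = 2e and σ = 1 the focal point of the symmetric configuration is the origin of the x-chart, which is the image of the spot $S_0$.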
In Figs. 2 and 3 the focal points are visualized for two different configurations
of two uncorrelated baskets. We plot the surface F as a submanifold of R2 . We
have seen above that for any p ∈ F there is precisely one focal point f(p). Hence,
we additionally plot the surface {f(p)|p ∈ F}—more precisely, part of this surface.
In Fig. 2 we show the case constructed above where the non-focality condition is
violated. In Fig. 3 the option is ITM. As explained above, in the ITM case the manifold
F does not intersect the positive quadrant, implying that the non-focality condition
is satisfied.
[Figure 2. Panel (a): Optimal configuration (legend: F, Opt. path, Opt. config.). Panel (b): Focal points (legend: F, Focal points). Both axes range from −2 to 2.]
Fig. 2 Optimal configuration and focal points for two independent assets with σ 1 = σ 2 = 1,
S0 = (1, 1), K = 2e. a The dashed line depicts the optimal path between the spot price S0 (0 in the
q-chart) and the optimal configuration. b Dotted lines connect some selected points on the manifold
F with the corresponding focal points. Points marked with a triangle visualize the construction of
the focal points. We see that 0 is, indeed, focal to the optimal configuration
[Figure 3. Panel (a): Optimal configuration (in the money regime) (legend: F, Opt. path, Opt. config.). Panel (b): Focal points (legend: F, Focal points). Both axes range from −3.0 to 0.0.]
Fig. 3 Optimal configuration and focal points for two independent assets with σ 1 = σ 2 = 1,
S0 = (1, 1), K = 2/e. a The dashed line depicts the optimal path between the spot price S0
(0 in the q-chart) and the optimal configuration. b Dotted lines connect some selected points on
the manifold F with the corresponding focal points. Points marked with a triangle visualize the
construction of the focal points. This example illustrates the fact that the non-focality condition
always holds when the basket option is in the money
References
1. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Application of large deviation methods to
the pricing of index options in finance, Comptes Rendus de l’Académie des Sciences—Series
I—Mathematique (2003)
2. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing volatility. RISK (2004)
3. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Annales Scientifiques de l’Ecole Normale Supérieure 4(21), 307–331 (1988)
4. Ben Arous, G.: Méthodes de Laplace et de la phase stationnaire sur l’espace de Wiener.
Stochastics 25, 125–153 (1988)
5. Bismut, J.M.: Malliavin Calculus and Large Deviations. Birkhäuser, Boston (1984)
6. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1), 40–
82 (2014)
7. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350
(2014)
8. do Carmo, M.P.: Differential Geometry of Curves and Surfaces. Prentice-Hall, Englewood
Cliffs (1976)
9. Dufresne, D.: The log-normal approximation in financial and other computations. Adv. Appl.
Probab. 36, 747–773 (2004)
10. Gulisashvili, A.: Analytically tractable stochastic stock price models. Springer Finance.
Springer, London (2012)
11. Gulisashvili, A., Stein, E.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61(3), 287–315. doi:10.
1007/s00245-009-9085-x
12. Gulisashvili, A., Tankov, P.: Tail behavior of sums and differences of log-normal random
variables, Bernoulli, to appear
13. Heston, S.: A closed-form solution for options with stochastic volatility, with application to
bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
14. Molchanov, S.A.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30(1),
1–63 (1975)
15. Seierstad, A., Sydsaeter, K.: Optimal Control Theory with Economic Applications. Advanced
Textbooks in Economics, vol. 24. North-Holland, Amsterdam (1987)
16. Stein, E.M., Stein, J.C.: Stock price distributions with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4, 727–752 (1991)
17. Takanobu, S., Watanabe, S.: Asymptotic expansion formulas of the Schilder type for a class
of conditional Wiener functional integration. In: Elworthy, K.D., Ikeda, N. (eds.) Asymptotic
Problems in Probability Theory: Wiener Functionals and Asymptotics. Pitman Research
Notes in Mathematics Series, vol. 284, pp. 194–241 (1993)
On Small-Noise Equations with Degenerate
Limiting System Arising from Volatility
Models
G. Conforti
Universität Potsdam, Potsdam, Germany
e-mail: [email protected]
S. De Marco (B)
Ecole Polytechnique, Route de Saclay, Palaiseau Cedex 91128, France
e-mail: [email protected]
J.-D. Deuschel
Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]
1 Introduction
holds for subsets $\Gamma$ of the path space $C([0,T])$.¹ Denote by $\varphi(h)$ the unique solution of the ODE $d\varphi_t = b(\varphi_t)\,dt + \sigma(\varphi_t)\,dh_t$, $\varphi_0 = x$, where the control $h$ is an absolutely continuous path with square-integrable derivative $\dot h$. The rate function $I$ is given by $I(\phi) = \frac12 |\dot h|^2_{L^2}$, where $h$ is the control steering the trajectory of the deterministic system along the given path $\phi$, that is $\varphi(h) = \phi$. When the diffusion coefficient $\sigma$ is invertible, the control $h$ is identified by $\dot h_t = \sigma(\varphi_t)^{-1}(\dot\varphi_t - b(\varphi_t))$, yielding the typical form of the rate function
\[ I(\phi) = \frac12 \int_0^T \frac{(\dot\phi_t - b(\phi_t))^2}{\sigma(\phi_t)^2}\,dt. \]
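The rate function above can be approximated on a time grid. The following is only a minimal sketch: the function name `rate_function`, the midpoint discretisation and the uniform grid are illustrative assumptions, not part of the original text.

```python
def rate_function(path, dt, b, sigma):
    """Riemann-sum approximation of
    I(phi) = 1/2 * int_0^T (phi'_t - b(phi_t))^2 / sigma(phi_t)^2 dt
    for a path sampled on a uniform grid with spacing dt (assumed discretisation)."""
    total = 0.0
    for left, right in zip(path[:-1], path[1:]):
        mid = 0.5 * (left + right)        # value of the path at the cell midpoint
        velocity = (right - left) / dt    # finite-difference derivative on the cell
        total += (velocity - b(mid)) ** 2 / sigma(mid) ** 2 * dt
    return 0.5 * total
```

For instance, with zero drift and unit diffusion coefficient, a straight line from 0 to a over [0, T] has cost a²/(2T), while a path following the deterministic flow of the drift has (numerically) zero cost.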
The intuition behind such a result is that we can write X ε (ω) = X (εω), where X
is the ‘pathwise’ solution of dX = b(X )dt + σ(X )dB, X 0 = x. If we accept that such
a map X exists and is regular enough, then the contraction principle in conjunction
with Schilder’s theorem for large deviations of Brownian paths [12, Chap. 1] provides
the LDP and the rate function for $X^\varepsilon$. The standard assumptions under which such a
program is carried out are conditions of global Lipschitz continuity and ellipticity for the
coefficients, see [10, 12]. Several works have aimed at weakening these assumptions
and extending the class of equations for which the LDP holds. Dependence on ε in
both the drift and the starting point can be introduced, and global Lipschitz continuity
can be replaced with (essentially) local Lipschitz-continuity and conditions for the
non explosion of the solution (building on the idea of Azencott [3] to exploit the
quasi-continuity property of the Itô map, that only relies on local properties of the
equation coefficients). We refer to [4] for a nice recent summary of sets of conditions
under which the Wentzell–Freidlin estimate holds.
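1 The precise statement here is $-\inf_{\phi\in\Gamma^\circ} I(\phi) \le \liminf_{\varepsilon\to0} \varepsilon^2 \log W(X^\varepsilon\in\Gamma) \le \limsup_{\varepsilon\to0} \varepsilon^2 \log W(X^\varepsilon\in\Gamma) \le -\inf_{\phi\in\bar\Gamma} I(\phi)$.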
Recent research on heat kernel asymptotics [11] focuses on the tail behavior for
correlated stochastic volatility models. Exploiting the space-scaling properties of
the log-price process $Y_t$ in some parametric models (namely: there exists $\theta > 0$
such that the rescaled variable $Y_t^\varepsilon := \varepsilon^\theta Y_t$ has the same law as the log-price in a
stochastic volatility model with driving noise $\varepsilon\,dB_t$), the approach of [11] is to convert
the asymptotic problem for the tail distribution, $W(Y_t > R)$ as $R \to \infty$, to the
problem of small-noise probabilities, $W(Y_t^\varepsilon > 1)$ as $\varepsilon \to 0$. Then, a large deviation
principle for the rescaled process serves as a building block to study the asymptotic
behavior of the corresponding heat kernel (using the tools of Malliavin calculus and
the Laplace method on path space, see [5, 7]). This approach can be fully justified,
and explicit computations are possible, for the stochastic volatility model of Stein
and Stein [25] (also known as Schöbel–Zhu [24] in the correlated case), where the
stochastic volatility follows an Ornstein–Uhlenbeck process with constant diffusion
coefficient, which is the main case-study of [11]. As pointed out in [11, Sect. 5.3], in
the framework of models where the volatility has square-root diffusion coefficient
(main example: Heston), or more generally a diffusion coefficient of the form $x^\gamma$,
$\gamma < 1$ (as in [2, 21]), such a space-scaling approach leads to a situation where the
same approach is not justified anymore (and a formal application of the resulting
expansion even leads to a wrong conclusion). Quoting [11, Sect. 5.3], “curiously
then even a large deviation principle for (the rescaled volatility process) as given
above presently lacks justification”.
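To be more specific, consider the equation $dX_t = (\alpha + \beta X_t)\,dt + \sigma X_t^\gamma\,dB_t$ with positive initial condition $X_0 = x > 0$. Looking for a value of $\theta$ such that $\varepsilon^\theta X$ satisfies an equation with small noise $\varepsilon$ leads one to define the rescaled process $X^\varepsilon := \varepsilon^{1/(1-\gamma)} X$, which indeed satisfies the equation
\[ dX_t^\varepsilon = (\alpha_\varepsilon + \beta X_t^\varepsilon)\,dt + \varepsilon\sigma (X_t^\varepsilon)^\gamma\,dB_t, \qquad X_0^\varepsilon = x^\varepsilon, \tag{1.1} \]
with
\[ \alpha_\varepsilon := \varepsilon^{1/(1-\gamma)}\alpha, \qquad x^\varepsilon := \varepsilon^{1/(1-\gamma)} x. \]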
The Eq. (1.2) is known to admit infinitely many solutions. When $\dot h_t \ge 0$, the set of solutions contains the one-parameter family
\[ \varphi_t^{(\theta)} = e^{\beta t}\Big(\sigma(1-\gamma)\int_\theta^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds\Big)^{1/(1-\gamma)} 1_{\{t\ge\theta\}}, \]
with $\theta \ge 0$.² Then, the very definition of the map $h \mapsto \varphi(h)$ associating the control with the corresponding solution of the ODE is no longer possible.
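The non-uniqueness can be checked numerically. The sketch below assumes the constant control $\dot h \equiv 1$ and takes the controlled ODE in the form $\dot\varphi = \beta\varphi + \sigma\varphi^\gamma \dot h$, as it is written later in Remark 3.12; the function names and the finite-difference check are illustrative choices, not part of the original text.

```python
import math

def phi_theta(t, theta, beta, gamma, sigma):
    """Member phi^(theta) of the one-parameter family, for the control h_dot = 1."""
    if t < theta:
        return 0.0
    b = beta * (1.0 - gamma)
    # closed form of the integral of exp(-b*s) over [theta, t]
    if b == 0.0:
        integral = t - theta
    else:
        integral = (math.exp(-b * theta) - math.exp(-b * t)) / b
    return math.exp(beta * t) * (sigma * (1.0 - gamma) * integral) ** (1.0 / (1.0 - gamma))

def ode_residual(t, theta, beta, gamma, sigma, step=1e-5):
    """phi'(t) - (beta*phi + sigma*phi**gamma), with phi' from central differences."""
    derivative = (phi_theta(t + step, theta, beta, gamma, sigma)
                  - phi_theta(t - step, theta, beta, gamma, sigma)) / (2.0 * step)
    value = phi_theta(t, theta, beta, gamma, sigma)
    return derivative - (beta * value + sigma * value ** gamma)
```

Every value of `theta` gives a path that starts at zero and has a vanishing residual, so the same control is compatible with infinitely many trajectories.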
We will occasionally address this situation as “degenerate”. Let us note straight
away that large deviations for diffusions with non-Lipschitz coefficients have been
studied in Baldi and Caramellino [4], Donati-Martin et al. [13], Klebaner and Liptser
[19] and Robertson [23]. In [4, Theorem 1.2] a large deviation principle is derived
for the family of equations $dX_t^\varepsilon = b(X_t^\varepsilon)\,dt + \varepsilon\sigma(X_t^\varepsilon)\,dB_t$, $X_0^\varepsilon = x > 0$ (note the
strictly positive initial condition), where the function $\sigma(\cdot)$ roughly behaves like $\sigma x^\gamma$
(see [4, Assumption (A1.1)] for precise conditions) and b : [0, ∞) → R is a locally
Lipschitz function with sub-linear growth and b(0) > 0. The conditions for both a
drift term b and an initial datum independent of ε, such that b(0) > 0 and x > 0, are
violated in the situation we consider here. In [13], b(0) = 0 and x = 0 are allowed,
but the analysis is limited to the square-root case γ = 1/2, and b and x remain
independent of $\varepsilon$. Note in this respect that setting $b(0) = x = 0$ implies $X^\varepsilon \equiv 0$
for all $\varepsilon$, and in this case an LDP trivially holds with the rate function $I(0) = 0$,
$I(\phi) = \infty$ for $\phi \not\equiv 0$ (as stated in [13, Theorem 1.3]); in contrast with (1.1), where
both $b_\varepsilon(0) = \alpha_\varepsilon$ and $x^\varepsilon$ do tend to zero as $\varepsilon \to 0$, but coming from strictly positive
values, so that the solution of the SDE is non-trivial for every value of $\varepsilon$. In both
these works, uniqueness for the limiting ODE is a key point (and appears as a part
of [4, Assumption (A2.3)] and is exploited in [13, Sect. 5]). In order to study the
asymptotic behavior of the ruin probability $W(\tau_0 \le T)$ with $\tau_0 = \inf\{t : X_t = 0\}$
as the initial condition $x$ tends to infinity, Klebaner and Liptser [19] exploit a similar
space scaling by working with the ‘normed’ process $X_t^x = X_t/x$, and show that an
LDP holds for the process $X^x$ as $x \to \infty$. The major difference with our setting is
that the initial condition $X_0^x = 1$ in [19] is fixed and does not tend to zero as $x^\varepsilon$ does in
(1.1), which is one of the difficulties to overcome in our analysis. Robertson [23]
derives an LDP for a class of stochastic volatility models, including the Heston model
with square-root volatility process. One of the assumptions used there is that the
small noise problem for the volatility process has the same form as in Donati-Martin
et al. [13], see [23, Assumption 2.1], and the work carried out is to transfer the LDP
to the second component of the process (the log-price). Therefore, the work of [23]
does not cover small-noise problems in the form of (1.1).
We establish an LDP for a generalized version of Eq. (1.1), allowing $\alpha$ to be a
function of the process. That is, we start from Eq. (1) under the assumptions:
(H1) γ ∈ [1/2, 1), σ > 0, x > 0.
(H2) b(y) = α(y) + β y, where α is a Lipschitz continuous and bounded function,
and α(y) ≥ 0 in a neighbourhood of 0.
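2 When $\beta = 0$, $\gamma = 1/2$ and $\dot h \equiv 1$, one retrieves the textbook example of an ODE for which uniqueness fails, $\dot\varphi_t = \sigma\sqrt{|\varphi_t|}$, whose solutions from $\varphi_0 = 0$ are given by the one-parameter family $\varphi_t^{(\theta)} = \frac{\sigma^2}{4}(t-\theta)^2\, 1_{\{t\ge\theta\}}$.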
Under (H1)–(H2), (1) is known to admit a positive solution, which is pathwise unique
by Yamada and Watanabe’s uniqueness theorem.
Theorem 1.1 Assume conditions (H1)–(H2), and let $(X_t)_{t\ge0}$ be the unique strong solution to (1). Set $X^\varepsilon := \varepsilon^{1/(1-\gamma)}X$; then $X^\varepsilon$ satisfies (1.1) with the constant $\alpha$ replaced by the function $\alpha(\cdot)$. Then, the family $\{X^\varepsilon\}_\varepsilon$ satisfies a large deviation principle on the path space $C([0,T],\mathbb{R}_+)$ with inverse speed $\varepsilon^2$ and rate function
\[ I_T(\varphi) = \frac{1}{2\sigma^2}\int_0^T \Big(\frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^\gamma}\Big)^2 1_{\{\varphi_t \ne 0\}}\,dt, \]
Let us note that in the definition of $I_T$ above, the expression $\frac{1}{\varphi_t^\gamma}1_{\varphi_t\ne0}$ is understood to be well defined for any $\varphi_t\in\mathbb{R}_+$, and it is equal to zero when $\varphi_t = 0$.
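It is easy to see that the unique zero of $I_T$ is $\varphi\equiv0$, consistently with the fact that $X^\varepsilon \xrightarrow{W} 0$ as $\varepsilon\to0$. Roughly speaking, Theorem 1.1 allows one to write $W(X^\varepsilon\in\Gamma) = \exp\big(-\frac{1}{\varepsilon^2}\big(\inf_{\phi\in\Gamma}I_T(\phi)+\psi(\varepsilon)\big)\big)$ for subsets $\Gamma$ of $C([0,T])$ such that $\inf_{\phi\in\bar\Gamma}I_T(\phi) = \inf_{\phi\in\Gamma^\circ}I_T(\phi)$, where the function $\psi(\varepsilon)$ vanishes as $\varepsilon\to0$; we refer to Theorem 2.1 in Sect. 2 for the precise statements.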
According to our definition of $X^\varepsilon$, one has $W(X_t^\varepsilon \ge 0, \forall t\ge0, \forall\varepsilon>0) = W(X_t\ge0, \forall t\ge0) = 1$. A criterion for the strict positivity of the trajectories of $X^\varepsilon$, based on Feller's test for explosion, can also be given (see [9, Proposition 3.1]: when $\gamma > 1/2$, $\alpha(0) > 0$ implies $W(X_t^\varepsilon > 0, t \ge 0) = 1$, while for $\gamma = 1/2$, the same conclusion is guaranteed by $2\alpha(y)/\sigma^2 \ge 1$ for $y$ in a right neighborhood of zero, yielding the familiar Feller condition $2\alpha/\sigma^2 \ge 1$ when $\alpha$ is constant). Note that Theorem 1.1 does not assume any of these conditions for the non-attainability of zero; in particular for the CIR diffusion, we do not assume the Feller condition on the coefficients $\alpha$ and $\sigma$.
From Theorem 1.1, tail asymptotics for some functionals of the process $X$ can
be derived (which is exactly why the $\varepsilon$-scaling leading to $X^\varepsilon$ was introduced!). The
pathwise LDP allows one to consider path functionals of the process, such as the running
supremum, or the time average.
Theorem 1.2 Let $(X_t)_{t\ge0}$ be the unique strong solution to (1) under conditions (H1)–(H2), and let $T>0$. Then, as $R\to\infty$,
\[ W(X_T\ge R) = e^{-R^{2(1-\gamma)}(c_T+o(1))} \tag{1.3} \]
and
\[ W\Big(\sup_{t\in[0,T]}X_t\ge R\Big) = e^{-R^{2(1-\gamma)}(c_T+o(1))} \tag{1.4} \]
and
\[ W\Big(\frac1T\int_0^T X_t\,dt\ge R\Big) = e^{-R^{2(1-\gamma)}(\nu_T+o(1))}. \tag{1.5} \]
The constants $c_T$ and $\nu_T$ are explicitly known in terms of the model parameters, and are provided below in Proposition 2.5 and, for the case $\gamma=1/2$, in Proposition 3.14, respectively.
The estimates in Theorem 1.2 can be compared with the explicit formulae available
for cumulative distributions and critical exponents in the CIR and CEV models:
these consistency checks are done in Sects. 2.1 and 3.4, showing that the estimates
in Theorem 1.2 are correct on the log-scale. While in the one-dimensional setting
the large deviation approach yielded by Theorem 1.1 applies to equations with a more
general drift term than a purely affine function, it also opens the way to heat kernel
analysis for higher-dimensional diffusions involving (1) as a component, which is
exactly the case left open in [11].
Let us finally note that, due to the non-uniqueness of solutions for the limiting
system, the problem we consider here appears to be related to the issue of regularization
by noise of ODEs. Leaving further discussions to future work, let us just point
out here a structural difference with that setting: in that context, one considers an
SDE of the form $dX_t^\varepsilon = b(X_t^\varepsilon)\,dt + \varepsilon\,dB_t$, with unit dispersion coefficient, seen as a
perturbation of the deterministic system $\dot x_t = b(x_t)$ with non-Lipschitz drift $b$ (e.g.
$b(x) = \mathrm{sign}(x)|x|^\gamma$). Among the possible solutions of the deterministic system, one
then looks at the (few) ones supporting the limiting law of X ε , obtaining the so-called
zero noise limits of the equation; see [27] and references therein. In our framework,
the equation for X ε already possesses a Lipschitz continuous drift b(x) = αε + βx.
Correspondingly, the limiting system ẋt = βx, x0 = 0, already has a unique solu-
tion (here: the null path x = 0), which then gives the unique weak limit for X ε (in
contrast to [27, Corollary 1.2], where the limit is a probability distribution supported
on two trajectories). As we pointed out, the difficulties in our setting come from the
non-Lipschitz diffusion coefficient and appear at the level of the definition of the rate
function via the control system (1.2).
In the remainder of the document, Sect. 2 is devoted to the proof of Theorem 1.1,
while in Sect. 3.4 we prove the different statements of Theorem 1.2. We collect in
Appendix A the proofs of some of the more technical material.
we denote by $X$ the $W$-almost surely unique strong solution of (1). We define the rescaled process $X^\varepsilon := \varepsilon^{\frac{1}{1-\gamma}}X$; it is clear that $X^\varepsilon$ solves Eq. (1.1) with coefficients identified by $\alpha_\varepsilon(x) = \varepsilon^{1/(1-\gamma)}\alpha(x)$ and $x^\varepsilon = \varepsilon^{1/(1-\gamma)}x$. Denote $b_\varepsilon(x) := \alpha_\varepsilon(x) + \beta x$.
The following theorem gives the precise LDP announced in Theorem 1.1 in the Introduction. We recall that the expression $\frac{1}{y^\gamma}1_{y\ne0}$ is well defined for any $y\in\mathbb{R}_+$, and it is equal to zero when $y=0$.
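for every closed set $F \subseteq \Omega_{\ge0}$ and every open set $G\subseteq\Omega_{\ge0}$, where the rate function $I_T(\varphi)$ is defined by
\[ I_T(\varphi) := \frac{1}{2\sigma^2}\int_0^T\Big(\frac{\dot\varphi_t-\beta\varphi_t}{\varphi_t^\gamma}\Big)^2 1_{\{\varphi_t\ne0\}}\,dt, \tag{2.2} \]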
Remark 2.2 We could state the large deviation principle of Theorem 2.1 on $\Omega = C([0,T],\mathbb{R})$, setting the rate function $I_T(\varphi)$ to $+\infty$ whenever $\varphi\notin\Omega_{\ge0}$. Since the process $X^\varepsilon$ is known to be positive $W$-a.s. for every $\varepsilon>0$, with such a definition of the rate function the LDP (2.1) holds for every closed subset $F$ and every open subset $G$ of $\Omega$.
Remark 2.3 As pointed out in the Introduction, the rate function for a family $\{X^\varepsilon\}_\varepsilon$ satisfying $dX_t^\varepsilon = b(X_t^\varepsilon)\,dt + \varepsilon\sigma(X_t^\varepsilon)\,dB_t$, $X_0^\varepsilon = x$, can be written as
\[ I_T(\varphi) = \inf\Big\{\frac12|\dot h|^2_{L^2} : h\in H,\ \varphi(h)=\varphi\Big\} \tag{2.3} \]
where $\varphi(h)$ is the solution to the limiting ODE controlled by $h$, $\dot\varphi = b(\varphi)+\sigma(\varphi)\dot h$ and $\varphi_0 = x$, provided this solution is unique. In our setting, consider $\varphi\in S(u)$, where now $S(u)$ denotes the set of positive solutions of the degenerate ODE (1.2) with control parameter $h = u \in H$: on the set $\{\varphi>0\}$, $u$ is uniquely determined by $\varphi$ via $\dot u_t = \frac{\dot\varphi_t-\beta\varphi_t}{\sigma\varphi_t^\gamma}$; on the set $\{\varphi = 0\}$, the function $\varphi$ is seen to satisfy Eq. (1.2) for any control parameter $h$. This means that the set of $h$ such that $\varphi\in S(h)$ contains the infinitely many elements given by
\[ \dot h_t = \frac{\dot\varphi_t-\beta\varphi_t}{\sigma\varphi_t^\gamma}\,1_{\{\varphi_t>0\}} + \frac{d\tilde h_t}{dt}\,1_{\{\varphi_t=0\}}, \qquad \tilde h\in H. \]
The control $h^0$ achieving the minimum norm is obtained by setting $\tilde h\equiv0$. This gives $\frac12|\dot h^0|^2_{L^2} = \inf\{\frac12|\dot h|^2_{L^2} : h\in H,\ \varphi\in S(h)\} = I_T(\varphi)$ for the rate function $I_T$ defined in (2.2).
and JT (ϕ) = ∞ if ϕ is not absolutely continuous, where one classically agrees that
1/ϕt is equal to +∞ if ϕt = 0. We stress that the latter rate function is radically
different from IT defined in (2.2): whenever ϕ = 0 on some non trivial interval
K ⊂ [0, 1], then JT (ϕ) = ∞, while in such a case the integrand in (2.2) gives zero
contribution to IT on K . In other words, while trajectories with a zero-set of positive
ε
measure require infinite energy to be followed by the process X in the small-noise
limit, they are favoured by the rate function of the process X ε .
The space-scaling $X^\varepsilon = \varepsilon^{1/(1-\gamma)}X$ together with the large deviation principle (2.1) allows one to work out tail asymptotics for functionals of the process $X$. The following proposition provides the precise constants appearing in Theorem 1.2 in the Introduction.
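Proposition 2.5 The asymptotic formulas (1.3) and (1.4) in Theorem 1.2 hold with the constant $c_T$ given by
\[ c_T = \begin{cases} \dfrac{\beta e^{-2\beta(1-\gamma)T}}{\sigma^2(1-\gamma)\big(1-e^{-2\beta(1-\gamma)T}\big)} & \text{if } \beta\ne0,\\[2mm] \dfrac{1}{2\sigma^2(1-\gamma)^2T} & \text{if } \beta=0. \end{cases} \tag{2.4} \]
One can see that $c_T$ does not depend on the function $\alpha(\cdot)$ in the drift of $X$, nor on the initial condition $x$.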
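As a small numerical companion to (2.4), the sketch below evaluates $c_T$ together with the function $d(T)$ that appears in the CEV density formula of the comparison (i) that follows, so that the relation $c_T = e^{-2\beta(1-\gamma)T}/(2d(T))$ and the continuity of both cases at $\beta = 0$ can be checked. The function names and the use of `math.expm1` for numerical stability are illustrative choices, not part of the original text.

```python
import math

def c_T(beta, gamma, sigma, T):
    """Constant c_T of (2.4); the beta == 0 branch is the limit of the other one."""
    if beta == 0.0:
        return 1.0 / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2 * T)
    rate = 2.0 * beta * (1.0 - gamma) * T
    # -expm1(-rate) equals 1 - exp(-rate), computed without cancellation
    return beta * math.exp(-rate) / (sigma ** 2 * (1.0 - gamma) * (-math.expm1(-rate)))

def d_T(beta, gamma, sigma, T):
    """Function d(T) of the CEV density formula, with its beta -> 0 limit."""
    if beta == 0.0:
        return (1.0 - gamma) ** 2 * sigma ** 2 * T
    rate = 2.0 * beta * (1.0 - gamma) * T
    return (1.0 - gamma) * sigma ** 2 / (2.0 * beta) * (-math.expm1(-rate))
```

Both signs of $\beta$ are admissible: the factor $1 - e^{-2\beta(1-\gamma)T}$ changes sign together with $\beta$, so $c_T$ and $d(T)$ stay positive.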
(i) Comparison with explicit formulae for the CEV process. The asymptotic
behavior (1.3) can be compared with the explicit formulae available for the den-
sity of the CEV process. When α ≡ 0 in (1), X can be obtained as a deterministic
time-change of a power of a squared Bessel process (see [16, Sect. 6.4.3]). As a consequence, the density of $X_T$ on $(0,\infty)$ is given by
\[ f_{X_T}(y) = \frac{(1-\gamma)}{d(T)}\, e^{\beta(-2(1-\gamma)+1/2)T} \exp\Big(-\frac{1}{2d(T)}\big(x^{2(1-\gamma)} + y^{2(1-\gamma)}e^{-2\beta(1-\gamma)T}\big)\Big) \times x^{1/2}\, y^{-2\gamma+1/2}\, I_{\frac{1}{2(1-\gamma)}}\Big(\frac{1}{d(T)}\,x^{1-\gamma}y^{1-\gamma}e^{-\beta(1-\gamma)T}\Big), \quad y>0, \tag{2.5} \]
where $I_\nu$ is the modified Bessel function of the first kind of index $\nu>0$, and $d(T) = \frac{(1-\gamma)\sigma^2}{2\beta}\big(1-e^{-2\beta(1-\gamma)T}\big)$ (note en passant that one has $d(T)>0$ for every choice of the sign of $\beta$).³ The formula (2.5) is also valid for $\beta=0$, when one replaces all the $\beta$-dependent constants with their limits as $\beta\to0$, such as $d(T)|_{\beta=0} = (1-\gamma)^2\sigma^2T$. Using the asymptotic behavior (see [1, Sect. 9.7.1]) of the modified Bessel function $I_\nu(z)\sim\frac{e^z}{\sqrt{2\pi z}}$ as $z\to\infty$ for fixed $\nu>0$, one immediately obtains
\[ \log f_{X_T}(y) =: g(y) \sim -\frac{e^{-2\beta(1-\gamma)T}}{2d(T)}\,y^{2(1-\gamma)} = -c_T\,y^{2(1-\gamma)}, \qquad y\to\infty, \]
with the constant $c_T$ defined in (2.4). Using some standard tools of regular variation [6], one can then easily prove that $\log W(X_T>y) = \log\int_y^\infty e^{g(z)}\,dz \sim g(y)\sim -c_T\,y^{2(1-\gamma)}$ as $y\to\infty$, thus showing that estimate (1.3) is exact on the log-scale.
(ii) The asymptotic estimate $f_{X_T}(y)\le A_T\,e^{-a_T y^{2(1-\gamma)}}$, $y>1$, for the density of $X_T$ was proven in [9] for the solutions of a class of SDEs containing (1) under conditions (H1)–(H2) (namely, in [9] the coefficients $\beta$ and $\gamma$ are also allowed to depend smoothly on $X$), relying on techniques of Malliavin calculus and transformations for 1-dimensional SDEs. The constant $a_T$ provided there is not optimal. While the estimates in [9] remain valid for more general equations, the large deviation principle in Theorem 2.1 allows one to obtain a sharp estimate on the log-scale.
The asymptotic behavior $W\big(\frac1T\int_0^TX_t\,dt\ge R\big) = \exp\big(-R^{2(1-\gamma)}(\nu_T+o(1))\big)$ for the time average of the process can also be proven using Theorem 2.1: see Proposition 3.14 in Sect. 3.4, where an expression of the constant $\nu_T$ is provided in the case $\gamma=1/2$.
3 When $\gamma\in[1/2,1)$, the law of $X_T$ also possesses an atom at zero, $P(X_T=0)=m_T>0$, and an explicit formula for the mass $m_T$ is available (see again [16, Chap. 6]). From our point of view, this only means that the density $f_{X_T}$ does not integrate to 1 on $(0,\infty)$, without affecting our analysis of the tail asymptotics at $\infty$.
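We prove the large deviation principle in Theorem 2.1 by first showing the exponential tightness of the family $\{X^\varepsilon\}_\varepsilon$, namely for every $m<0$ there exists a compact set $K_m\subset C([0,T])$ such that $\limsup_{\varepsilon\to0}\varepsilon^2\log W(X^\varepsilon\in K_m^c)\le m$. We then prove the weak upper bound
\[ \limsup_{R\to0}\,\limsup_{\varepsilon\to0}\,\varepsilon^2\log W\big(X^\varepsilon\in B(\varphi,R)\big)\le -I_T(\varphi)\qquad\forall\varphi\in\Omega_{\ge0}, \]
and the weak lower bound
\[ \liminf_{R\to0}\,\liminf_{\varepsilon\to0}\,\varepsilon^2\log W\big(X^\varepsilon\in B(\varphi,R)\big)\ge -I_T(\varphi)\qquad\forall\varphi\in\Omega_{\ge0}, \]
where $B(\varphi,R)$ denotes the closed ball in $C([0,T])$ of radius $R$, $B(\varphi,R):=\{\tilde\varphi : |\tilde\varphi-\varphi|_\infty\le R\}$. It is a general fact that exponential tightness combined with the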
weak upper bound yields the large deviation upper bound in (2.1) for any closed set
after a covering argument (see [12, Chaps. 1 and 2]). On the other hand, the weak
lower bound trivially provides the full lower bound in (2.1), observing that open sets
are neighborhoods of their points.
We prove the exponential tightness considering balls in the Hölder norm $\|\omega\|_\eta := \sup_{s,t\le T,\,s\ne t}\frac{|\omega_t-\omega_s|}{|t-s|^\eta}$ and a natural bound on the initial condition $\omega_0$. More precisely, we define
\[ K_R := \{\|\omega\|_\eta\le R\}\cap\{\omega_0\in(0,x]\}. \tag{3.1} \]
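For a path sampled on a uniform grid, membership in $K_R$ can be tested as follows. This is only a sketch: the supremum is restricted to grid points (so it is a lower bound for the true Hölder seminorm), and the names `holder_seminorm` and `in_K_R` are illustrative, not from the original text.

```python
def holder_seminorm(path, dt, eta):
    """Largest ratio |w_t - w_s| / |t - s|**eta over all pairs of distinct grid times."""
    best = 0.0
    n = len(path)
    for i in range(n):
        for j in range(i + 1, n):
            ratio = abs(path[j] - path[i]) / ((j - i) * dt) ** eta
            best = max(best, ratio)
    return best

def in_K_R(path, dt, eta, R, x):
    """Grid version of the set K_R in (3.1): Holder ball plus bound on the start point."""
    return holder_seminorm(path, dt, eta) <= R and 0.0 < path[0] <= x
```

The double loop costs O(n²) operations, which is acceptable for illustration on a few hundred grid points.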
We follow [13] in the proof of Proposition 3.1. First, let us observe that for $\varepsilon\le1$, $W(X_0^\varepsilon\in(0,x]) = 1$ so that we just need to estimate the Hölder norm of $X^\varepsilon$. To this end, we use a version of the Garsia–Rodemich–Rumsey lemma, and the existence of exponential moments for a process bounding $X^\varepsilon$ from above.
and define $\tilde X^\varepsilon := \varepsilon^{1/(1-\gamma)}\tilde X$. Then, there exist positive constants $c$ and $C$ such that:
\[ E\Big[\exp\Big(c\,\varepsilon^{-2}(\tilde X_t^\varepsilon)^{2(1-\gamma)}\Big)\Big]\le C,\qquad \forall t\in[0,T],\ \forall\varepsilon>0. \tag{3.2} \]
Proof According to the definition of $\tilde X^\varepsilon$, one has $\varepsilon^{-2}(\tilde X_t^\varepsilon)^{2(1-\gamma)} = \tilde X_t^{2(1-\gamma)}$, so that (3.2) holds if and only if $E\big[\exp\big(c\,\tilde X_t^{2(1-\gamma)}\big)\big]\le C$ for all $t\in[0,T]$. When $\gamma=1/2$, (3.2) follows from the asymptotic behavior of the density of the CIR process for large arguments (see e.g. [16, Sect. 6.3.2, p. 358]); for general $\gamma$ and $\beta=0$, from the asymptotic behavior of the density of the classical CEV process as stated for example in [16, Lemma 6.4.3.1, p. 368]. For general $\gamma$ and $\beta$, we rely on a slight generalization of the proof of [9, Proposition 3.3]; we leave the details to Appendix A.
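The scaling identity used in the first line of the proof is elementary and can be checked numerically. The sketch below also illustrates, as a direct consequence of the definition of the rescaling (an observation added here, not a statement quoted from the text), that choosing $\varepsilon = R^{-(1-\gamma)}$ maps the level $R$ to the level 1 and gives $\varepsilon^{-2} = R^{2(1-\gamma)}$, the exponent appearing in Theorem 1.2. Function names are illustrative.

```python
def rescale(x, eps, gamma):
    """Space scaling x -> eps**(1/(1-gamma)) * x."""
    return eps ** (1.0 / (1.0 - gamma)) * x

def scaled_exponent(x, eps, gamma):
    """eps**(-2) * (rescaled x)**(2*(1-gamma)); should not depend on eps."""
    return eps ** -2 * rescale(x, eps, gamma) ** (2.0 * (1.0 - gamma))

def noise_level_for(R, gamma):
    """The eps for which the level R of X corresponds to the level 1 of X^eps."""
    return R ** (-(1.0 - gamma))
```

With `eps = noise_level_for(R, gamma)`, the event {X_T ≥ R} coincides with {X_T^ε ≥ 1}, which is how a small-noise estimate translates into a tail estimate.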
\[ \|\omega\|_\eta \le R. \tag{3.4} \]
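In the proof of Proposition 3.1, we exploit a localization procedure: for any $\varepsilon>0$ and $n\in\mathbb{N}$, define the process $X^{\varepsilon,n}$ as the strong solution of the SDE with truncated coefficients:
\[ dX_t^{\varepsilon,n} = b_\varepsilon(X_t^{\varepsilon,n}\wedge n)\,dt + \sigma\varepsilon\,(X_t^{\varepsilon,n}\wedge n)^\gamma\,dB_t,\qquad X_0^{\varepsilon,n} = x^\varepsilon. \tag{3.5} \]
The paths of $X^{\varepsilon,n}$ can be decomposed into their martingale part and locally bounded variation part
\[ dX_t^{\varepsilon,n} = dA_t^{\varepsilon,n} + dM_t^{\varepsilon,n} \]
with $dM_t^{\varepsilon,n} = \varepsilon\sigma(X_t^{\varepsilon,n}\wedge n)^\gamma\,dB_t$ and $dA_t^{\varepsilon,n} = b_\varepsilon(X_t^{\varepsilon,n}\wedge n)\,dt$. We shall also define for every $n,\varepsilon$ the stopping time $T^{\varepsilon,n} := \inf\{t\ge0 : X_t^\varepsilon\ge n\}$. By the pathwise uniqueness for Eq. (1) (equivalently, (3.5)), we have that up to time $T^{\varepsilon,n}$ the processes $(X_t^\varepsilon)_{t\in[0,T]}$ and $(X_t^{\varepsilon,n})_{t\in[0,T]}$ coincide almost surely. More precisely, $\forall n\in\mathbb{N}$ and $\varepsilon>0$
\[ W\big(X^\varepsilon_{t\wedge T^{\varepsilon,n}} = X^{\varepsilon,n}_{t\wedge T^{\varepsilon,n}},\ \forall t\in[0,T]\big) = 1. \tag{3.6} \]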
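Let us estimate the first term in (3.7). Using Proposition 3.3 and Markov's inequality we have for every $\varepsilon, n$:
\[ W\big(\|M^{\varepsilon,n}\|_\eta\ge R\big) \le W\Big(\int_0^T\!\!\int_0^T \exp\Big(\varepsilon^{-2}\frac{|M_t^{\varepsilon,n}-M_s^{\varepsilon,n}|}{\sqrt{|t-s|}}\Big)\,ds\,dt\ge K_{\varepsilon,\eta}(R)\Big) \le \frac{1}{K_{\varepsilon,\eta}(R)}\int_0^T\!\!\int_0^T E\exp\Big(\varepsilon^{-2}\frac{|M_t^{\varepsilon,n}-M_s^{\varepsilon,n}|}{\sqrt{|t-s|}}\Big)\,ds\,dt. \]
Applying the exponential martingale inequality $E(\exp(\lambda M_t))\le E\big(\exp(2\lambda^2\langle M\rangle_t)\big)^{1/2}$ [22, Chap. IV] with $\lambda = \frac{1}{\varepsilon^2\sqrt{|t-s|}}$, for $t>s$ one has
\[ E\exp\Big(\frac{|M_t^{\varepsilon,n}-M_s^{\varepsilon,n}|}{\varepsilon^2\sqrt{t-s}}\Big) \le 2\,E\Big[\exp\Big(\frac{2\sigma^2}{\varepsilon^2(t-s)}\int_s^t(X_r^{\varepsilon,n}\wedge n)^{2\gamma}\,dr\Big)\Big]^{1/2} \le 2\exp\big(\sigma^2\varepsilon^{-2}n^{2\gamma}\big). \]
Therefore, using the definition of the constant $K_{\varepsilon,\eta}(R)$ in Proposition 3.3
\[ \limsup_{\varepsilon\to0}\varepsilon^2\log W\big(\|M^{\varepsilon,n}\|_\eta\ge R\big) \le -T^{\eta-1/2}\,\frac{R}{8} + \sigma^2n^{2\gamma}. \tag{3.8} \]
Under hypothesis (H), $b_\varepsilon(x)\le|\alpha|_\infty+\beta x$ for every $x$. Therefore, for every $\varepsilon, n$
\[ W\big(\|A^{\varepsilon,n}\|_\eta\ge R\big) \le W\big(T^{1-\eta}(|\alpha|_\infty+\beta n)\ge R\big) = 0, \tag{3.9} \]
where the last identity holds as soon as $R > T^{1-\eta}(|\alpha|_\infty+\beta n)$.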
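We now deal with the second term in (3.7). It follows from the comparison theorem for one-dimensional SDEs [17, Proposition 5.2.18], that $X_t^\varepsilon\le\tilde X_t^\varepsilon$, $t\le T$, almost surely, where $\tilde X^\varepsilon$ is defined in Lemma 3.2. For every fixed $\gamma$ and $a>0$, it is a simple exercise to show that the function $y\mapsto\exp(a\varepsilon^{-2}(1+y)^{2(1-\gamma)})$, $y>0$, is increasing and convex if $\varepsilon$ is small enough.⁴ For such values of $\varepsilon$, since $\tilde X_t^\varepsilon$ is a submartingale, so is $\exp\big(a\varepsilon^{-2}(1+\tilde X_t^\varepsilon)^{2(1-\gamma)}\big)$. Then, we can apply Markov's inequality and Doob's $L^2$-inequality, obtaining:
\[ W\big(T^{\varepsilon,n}\le T\big) = W\Big(\sup_{t\in[0,T]}X_t^\varepsilon\ge n\Big) \le W\Big(\sup_{t\in[0,T]}\exp\big(a\varepsilon^{-2}(1+\tilde X_t^\varepsilon)^{2(1-\gamma)}\big)\ge\exp\big(a\varepsilon^{-2}(1+n)^{2(1-\gamma)}\big)\Big) \le \exp\big(-a\varepsilon^{-2}(1+n)^{2(1-\gamma)}\big)\times4\,E\Big[\exp\big(a\varepsilon^{-2}(1+\tilde X_T^\varepsilon)^{2(1-\gamma)}\big)\Big]. \tag{3.10} \]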
where $C$ is the second constant in Lemma 3.2. Now choosing $n := \sqrt{R}$, the condition under which (3.9) holds true is satisfied for $R$ large enough. Passing to the limit as $\varepsilon\to0$ in (3.7) and using (3.8), (3.9) and (3.11), we obtain
\[ \limsup_{\varepsilon\to0}\varepsilon^2\log W\big(\|X^\varepsilon\|_\eta\ge R\big) \le \max\Big\{-\frac{R}{8}+\sigma^2R^\gamma,\ -aR^{(1-\gamma)}+c\Big\}. \]
4 The second derivative reads $e^{a\varepsilon^{-2}(1+y)^{2(1-\gamma)}}\times2a\varepsilon^{-2}(1-\gamma)(1+y)^{-2\gamma}\times\big[1-2\gamma+\frac{2a}{\varepsilon^2}(1-\gamma)(1+y)^{2(1-\gamma)}\big]$.
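The second-derivative formula of footnote 4 can be compared with a finite-difference approximation, and its sign inspected for large and small $\varepsilon$. This is a minimal sketch; the function names and the parameter values used to exercise it are illustrative assumptions.

```python
import math

def f(y, a, eps, gamma):
    """The function y -> exp(a * eps**-2 * (1 + y)**(2*(1 - gamma)))."""
    return math.exp(a * eps ** -2 * (1.0 + y) ** (2.0 * (1.0 - gamma)))

def f_second(y, a, eps, gamma):
    """Second derivative in y, as given in footnote 4."""
    scale = a * eps ** -2
    bracket = 1.0 - 2.0 * gamma + 2.0 * scale * (1.0 - gamma) * (1.0 + y) ** (2.0 * (1.0 - gamma))
    return f(y, a, eps, gamma) * 2.0 * scale * (1.0 - gamma) * (1.0 + y) ** (-2.0 * gamma) * bracket

def f_second_numeric(y, a, eps, gamma, step=1e-3):
    """Central second difference, used only as an independent check."""
    return (f(y + step, a, eps, gamma) - 2.0 * f(y, a, eps, gamma)
            + f(y - step, a, eps, gamma)) / step ** 2
```

For $\gamma > 1/2$ the bracket is negative when $\varepsilon$ is large and positive once $\varepsilon$ is small, which is the reason for the restriction to small $\varepsilon$ in the convexity claim.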
By setting $\varepsilon = 0$ in (3.13), we can define the functional $F^0(\phi,h)$. Note that $F^\varepsilon(\cdot,h)$ is continuous $\forall h\in H$ on the whole space $\Omega_{\ge0}$ with respect to the sup-norm topology, and converges to $F^0(\cdot,h)$ uniformly on $\Omega_{\ge0}$ as $\varepsilon\to0$.
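Remark 3.5 Applying the integration by parts formula to the product $h_tX_t^\varepsilon$, one has
\[ \varepsilon\sigma\int_0^Th_t(X_t^\varepsilon)^\gamma\,dB_t = h_TX_T^\varepsilon - h_0x_0^\varepsilon - \int_0^T\big[\dot h_tX_t^\varepsilon + h_tb_\varepsilon(X_t^\varepsilon)\big]\,dt = h_TX_T^\varepsilon - h_0x_0^\varepsilon - h_T\int_0^Tb_\varepsilon(X_t^\varepsilon)\,dt - \int_0^T\dot h_t\Big(X_t^\varepsilon - \int_0^tb_\varepsilon(X_s^\varepsilon)\,ds\Big)dt, \]
hence
\[ F^\varepsilon(X_\cdot^\varepsilon,h) = \varepsilon\sigma\int_0^Th_s(X_s^\varepsilon)^\gamma\,dB_s - \frac{\sigma^2}{2}\int_0^Th_s^2(X_s^\varepsilon)^{2\gamma}\,ds. \]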
Proof Assume $\varphi\in\Omega_{\ge0}\cap H$ is such that $I_T(\varphi)<\infty$. Then, the function $u$ defined by $u_0 = 0$, $\dot u_s = \frac{\dot\varphi_s-\beta\varphi_s}{\sigma\varphi_s^\gamma}1_{\varphi_s\ne0}$ is by definition an element of $H$, and $\varphi$ satisfies by construction the ODE (1.2) with control $u$. Repeating the computations in Remark 3.5, one can see that
\[ F^0(\varphi,h) = \sigma\int_0^Th_s\varphi_s^\gamma\dot u_s\,ds - \frac{\sigma^2}{2}\int_0^Th_s^2\varphi_s^{2\gamma}\,ds. \]
Note that $F^0(\varphi,h)$ is concave in $h$, hence if it has a critical point, this must be a maximum. The Fréchet differential $D_hF^0(\varphi,h)$ at $h$, applied to the generic element $k\in H$, reads
\[ D_hF^0(\varphi,h)[k] = \sigma\int_0^Tk_s\big(\varphi_s^\gamma\dot u_s - \sigma h_s\varphi_s^{2\gamma}\big)\,ds. \]
\[ F^0(\varphi,h^*) = \int_0^T(\dot u_s)^21_{\varphi_s\ne0}\,ds - \frac12\int_0^T(\dot u_s)^21_{\varphi_s\ne0}\,ds = \frac12\int_0^T(\dot u_s)^21_{\varphi_s\ne0}\,ds = I_T(\varphi). \]
On the other hand, if $\varphi$ is absolutely continuous and such that $I_T(\varphi) = +\infty$, one can approximate the function $\frac{\dot\varphi_s-\beta\varphi_s}{\varphi_s^{2\gamma}}$ with a sequence $h_n\in H$ such that $F^0(\varphi,h_n)\to+\infty$.
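The identity $F^0(\varphi,h^*) = I_T(\varphi)$ can be illustrated on a concrete path. The sketch below uses the test path $\varphi_t = t^2$ and takes $h^* = \dot u/(\sigma\varphi^\gamma)$, the pointwise zero of the integrand of the Fréchet differential above; this explicit form of $h^*$ is derived here from that differential and is an assumption insofar as the excerpt does not display it. The discretisation (midpoint rule) and all names are illustrative.

```python
def rate_and_functional(beta, gamma, sigma, T=1.0, n=1000, bump=0.0):
    """Return (I_T(phi), F0(phi, h* + bump)) for phi_t = t**2, by the midpoint rule."""
    dt = T / n
    rate = 0.0
    functional = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        phi, phi_dot = t * t, 2.0 * t
        # rate function integrand as in (2.2)
        rate += ((phi_dot - beta * phi) / phi ** gamma) ** 2 / (2.0 * sigma ** 2) * dt
        # control u steering the ODE along phi, and the candidate maximiser h*
        u_dot = (phi_dot - beta * phi) / (sigma * phi ** gamma)
        h = u_dot / (sigma * phi ** gamma) + bump
        functional += (sigma * h * phi ** gamma * u_dot
                       - 0.5 * sigma ** 2 * h ** 2 * phi ** (2.0 * gamma)) * dt
    return rate, functional
```

With `bump=0.0` the two returned numbers agree up to rounding; any non-zero `bump` lowers the functional, in line with the concavity of $F^0(\varphi,\cdot)$.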
In other words, the family Y ε satisfies a large deviation weak lower bound on
C([0, T ], R+ ), with rate function IT (ψ).
Once we are provided with Proposition 3.8, it is straightforward to prove the weak
lower bound for X ε .
Proof of Proposition 3.7 Consider $\psi\in\Omega_{\ge0}$ absolutely continuous. By Lemma 3.45 in [20], $\dot\psi = 0$ a.s. on $\{\psi=0\}$. Therefore, $I_T$ defined in Proposition 3.8 can be rewritten as $I_T(\psi) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T\big(\dot\psi_t-\beta(1-\gamma)\psi_t\big)^21_{\psi_t\ne0}\,dt$. Using the definition of $Y^\varepsilon$ and (3.18), since the map $\psi\mapsto\varphi = \psi^{\frac{1}{1-\gamma}}$ is continuous on $\Omega_{\ge0}$, we can apply the contraction principle and obtain that $W(X^\varepsilon\in\cdot)$ satisfies a large deviation weak lower bound with rate function $\bar I_T$. Let us describe $\bar I_T(\varphi)$ when $\varphi$ is absolutely continuous and such that $I_T(\varphi)<\infty$ (where $I_T$ was defined in (2.2)). Let $\psi_t = \varphi_t^{1-\gamma}$. On $\{\varphi=0\}$, one has $\psi=0$ as well, while for a point $t$ in the open set $\{\varphi>0\}$ such that $\dot\varphi_t$ exists, one has $\dot\psi_t = (1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma}$. Then, noting that $I_T(\varphi)<\infty$ implies that $\frac{\dot\varphi_t}{\varphi_t^\gamma}1_{\varphi_t>0}$ is integrable on $[0,T]$, $\psi$ is also absolutely continuous on $[0,T]$ (see [20, Corollary 3.41]), with derivative $\dot\psi_t = (1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma}1_{\varphi_t>0}$. This yields
\[ \bar I_T(\varphi) = I_T(\psi(\varphi)) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T\Big((1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma} - \beta(1-\gamma)\varphi_t^{1-\gamma}\Big)^21_{\varphi_t\ne0}\,dt = I_T(\varphi) < \infty. \tag{3.19} \]
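The change of variables $\psi = \varphi^{1-\gamma}$ behind (3.19) can be checked on an explicit path. The sketch below evaluates both forms of the rate function by the midpoint rule for a path that is positive on $(0,T]$; the function names, the test path and the discretisation are illustrative assumptions.

```python
def rate_phi(phi, phi_dot, beta, gamma, sigma, T=1.0, n=2000):
    """Midpoint-rule value of (2.2) for a path phi that is positive on (0, T]."""
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += ((phi_dot(t) - beta * phi(t)) / phi(t) ** gamma) ** 2 * dt
    return total / (2.0 * sigma ** 2)

def rate_psi(psi, psi_dot, beta, gamma, sigma, T=1.0, n=2000):
    """Midpoint-rule value of the rate function written for psi = phi**(1 - gamma)."""
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += (psi_dot(t) - beta * (1.0 - gamma) * psi(t)) ** 2 * dt
    return total / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2)
```

For example, with $\gamma = 0.6$ and $\varphi_t = t^3$ one has $\psi_t = t^{1.2}$ and $\dot\psi_t = 1.2\,t^{0.2}$, and the two functions return the same value up to rounding.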
This section is devoted to the proof of the large deviation weak lower bound for
the process Y ε in (3.18). While postponing some of the most technical elements to
Appendix A, we will make use here of the following notation: for every h ∈ H, y ∈ R,
we define S y (h) to be the unique solution on [0, T ] of the ODE
Remark 3.9 Note that for (3.22) there exists a weak solution, which we construct
directly from a solution of (1.1) applying Girsanov’s Theorem. Since pathwise
uniqueness holds for the couple (b, σ), another application of the same theorem
shows that pathwise uniqueness for (1.1) implies pathwise uniqueness for (3.22).
Therefore we can always assume that X ε,h solves (3.22) with the Brownian motion B.
Two main ingredients enter in the proof of Proposition 3.8: the convergence in
law (under some conditions on h) of the process Y ε,h to the deterministic limit
S0 (h) under the measure W (equivalently: the weak convergence of the measure
W ε,h (Y ε ∈ .) to δ S0 (h) ), and a lower bound for the probability W (Y ε ∈ B(ψ, R))
depending explicitly on the relative entropy between the two measures W ε,h and W .
This is the content of the two following lemmas.
(i) S0 (h)t > 0, ∀t ∈ (0, T ]; (ii) ḣ t > k in a neighborhood of 0, for some k > 0.
(3.23)
Lemma 3.11 (Relative entropy bound) Let $(\Omega,\mathcal{F})$ be a probability space and $P$, $Q$ two probability measures on $(\Omega,\mathcal{F})$ such that $dQ = F\,dP$. The relative entropy $H(Q|P)$ is defined as:
\[ H(Q|P) := \int F\log(F)\,dP. \]
Then, $\forall A\in\mathcal{F}$ we have:
\[ \log\frac{P(A)}{Q(A)} \ge -\frac{e^{-1}+H(Q|P)}{Q(A)}. \tag{3.24} \]
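The bound (3.24) is easy to test on a finite probability space, where events are subsets of indices. The following sketch is purely illustrative (finite state space, hypothetical function names, strictly positive reference measure so that $Q$ is absolutely continuous with respect to $P$).

```python
import math

def relative_entropy(q, p):
    """H(Q|P) = sum_i q_i * log(q_i / p_i), with the convention 0 * log 0 = 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)

def bound_holds(q, p, event):
    """Check log(P(A)/Q(A)) >= -(1/e + H(Q|P)) / Q(A) for the index set `event`."""
    p_a = sum(p[i] for i in event)
    q_a = sum(q[i] for i in event)
    if q_a == 0.0:
        return True  # nothing to check: (3.24) is only informative when Q(A) > 0
    lhs = math.log(p_a / q_a)
    rhs = -(math.exp(-1.0) + relative_entropy(q, p)) / q_a
    return lhs >= rhs
```

In the proof of Proposition 3.8 the bound is used with $A$ a small ball around the target path, whose probability under the shifted measure tends to one.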
therefore
\[ H\big(W^{\varepsilon,h}\,|\,W\big) = \frac{1}{2\varepsilon^2}\int_0^T\dot h_t^2\,dt. \tag{3.25} \]
The proof of Lemma 3.10 is postponed to Appendix A; using this lemma and
Lemma 3.11, we can achieve here the proof of Proposition 3.8, completing the proof
of the large deviation weak lower bound for the process X ε .
Proof of Proposition 3.8 If $I_T(\psi) = \infty$, (3.18) is trivially true. Then, consider $\psi\in\Omega_{\ge0}$ such that $I_T(\psi)<\infty$, and define $h\in H$ by setting $\dot h_t = \frac{\dot\psi_t-\beta(1-\gamma)\psi_t}{\sigma(1-\gamma)}$, so that $S_0(h) = \psi$.
Step 1. Assume that $h$ is such that (3.23) holds true. An application of the relative entropy bound (3.24) with $P = W$, $Q = W^{\varepsilon,h}$ yields
\[ \varepsilon^2\log W\big(Y^\varepsilon\in B(\psi,R)\big) \ge -\varepsilon^2\,\frac{e^{-1}+H(W^{\varepsilon,h}|W)}{W^{\varepsilon,h}(Y^\varepsilon\in B(\psi,R))} + \varepsilon^2\log W^{\varepsilon,h}\big(Y^\varepsilon\in B(\psi,R)\big). \]
Using $W^{\varepsilon,h}(Y^\varepsilon\in B(\psi,R)) = W(Y^{\varepsilon,h}\in B(\psi,R))\to1$ for every $R>0$ by Lemma 3.10, and the expression of $H(W^{\varepsilon,h}|W)$ from (3.25), taking the limit as $\varepsilon\to0$ we obtain (3.18).
Step 2. Assume now $\psi\in C^1([0,T])$. Let $h$ be defined as above, and define $h^n\in H$, $n\in\mathbb{N}$, by
\[ \dot h^n_t := \dot h_t + 1/n. \tag{3.26} \]
We claim that $\forall n\in\mathbb{N}$, $h^n$ satisfies (3.23). Let us first prove that condition (ii) in (3.23) holds. Observe that $\psi\ge0$ and $\psi_0=0$ imply $\dot\psi_0\ge0$, hence $\dot h^n_0\ge1/n$. By the continuity of $\dot h^n$, ensured by the fact that $\psi\in C^1([0,T])$, it follows that the condition (ii) in (3.23) holds with, say, $k = 1/(2n)$. In order to prove condition (i), we observe that the comparison principle for ODEs implies that $\forall t\in(0,T]$, $S_0(h^n)_t > S_0(h)_t = \psi_t\ge0$; condition (i) is then proved. Furthermore, by the continuity of the solution to (3.20) with respect to the control parameter $h$, one has
if $n$ is large enough. In the first part of the proof, we have shown that the weak lower bound holds for $W(Y^\varepsilon\in B(S_0(h^n),R/2))$; then, taking the limits as $\varepsilon\to0$ and $R\to0$ in (3.28), one has
\[ \liminf_{R\to0}\,\liminf_{\varepsilon\to0}\,\varepsilon^2\log W\big(Y^\varepsilon\in B(\psi,R)\big)\ge -I_T(S_0(h^n))\qquad\text{for every } n\in\mathbb{N}. \]
Since $I_T(S_0(h^n)) = \frac12\int_0^T(\dot h^n_t)^2\,dt\to\frac12\int_0^T(\dot h_t)^2\,dt = I_T(\psi)$, the bound (3.18) follows. Finally, a standard density argument of $C^1([0,T])$ functions in $C([0,T])$ allows one to extend the claim to any $\psi\in\Omega_{\ge0}$ such that $I_T(\psi)<+\infty$.
Remark 3.12 In a classical situation, the claim would be the lower bound (3.17) for a process $X^\varepsilon$ satisfying, say, $dX^\varepsilon = b_\varepsilon(X^\varepsilon)\,dt + \varepsilon\sigma(X^\varepsilon)\,dB$ with Lipschitz coefficients σ and $b_\varepsilon \to b_0$, and $X_0^\varepsilon = x_\varepsilon \to x$. In this setting, fixing a control $h \in H$ and defining $X^{\varepsilon,h}$ from $X^\varepsilon$ by shifting the Brownian motion B as in (3.22), it is straightforward (in fact: an application of Gronwall's Lemma) to show that $X^{\varepsilon,h}$ converges in law to the unique solution of the deterministic limit equation $d\varphi = b_0(\varphi)\,dt + \sigma(\varphi)\,dh$, $\varphi_0 = x$.
In the present (degenerate) situation, the deterministic limit equation for the process
X ε,h (obtained setting ε = 0 in (3.22)) coincides with the ODE (1.2) which admits
infinitely many solutions. When circumventing this problem by passing through the
transformed process Y ε,h , we actually show that the convergence in law of X ε,h to
a particular solution ϕ∗ of the limiting equation is restored. Indeed, assume as in Proposition 3.10 that h is such that the unique solution ψ of the well-posed equation (3.20) with y = 0 is positive for every t > 0, and $Y^{\varepsilon,h}$ converges in law to ψ. The function ψ is easily computed, namely $\psi_t = \sigma(1-\gamma)\,e^{\beta(1-\gamma)t}\int_0^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds$. By definition, one has $X^{\varepsilon,h} = (Y^{\varepsilon,h})^{\frac{1}{1-\gamma}} \to \psi^{\frac{1}{1-\gamma}} =: \varphi^*$ in law under W. By direct computation, ϕ∗ is absolutely continuous and such that $\varphi^*_0 = 0$ and $\dot\varphi^* = \beta\varphi^* + \sigma(\varphi^*)^{\gamma}\,\dot h$, hence ϕ∗ is a solution to (1.2); in particular,
$$\varphi^*_t := e^{\beta t}\Big(\sigma(1-\gamma)\int_0^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds\Big)^{\frac{1}{1-\gamma}}. \tag{3.29}$$
Therefore, in the small noise limit, the stochastic dynamics (3.22) performs a selection among the solutions of the limiting deterministic system (1.2), selecting the strictly positive one, ϕ∗. This looks reasonable in light of the fact that, though converging to zero, the drift parameter $\alpha_\varepsilon$ and the initial condition $x_\varepsilon$ of the process remain strictly positive for all ε > 0 (see footnote 5). Figure 1 shows the convergence of simulated trajectories of $X^{\varepsilon,h}$ to ϕ∗.
Footnote 5: By perturbing the initial condition and the drift in (1.2), one can retrieve the trajectory ϕ∗ in (3.29) as the limit as ρ → 0 of the solution of the equation $d\varphi_t = (\rho + \beta\varphi_t)\,dt + \sigma\varphi_t^{\gamma}\,dh_t$, $\varphi_0 = \rho$, for which existence and uniqueness hold.
Fig. 1 An illustration of the convergence of the process $X^{\varepsilon,h}$ in (3.22) to a particular solution ϕ∗ of the limiting deterministic system. Trajectories have been simulated for different values of the noise parameter ε and γ = 1/2, α(x) ≡ 1, β = 0, σ = 2, ḣ = 1, x = 0
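The selection mechanism of Remark 3.12 can be illustrated deterministically, following footnote 5 rather than the stochastic simulation behind Fig. 1. The sketch below is only an illustration under stated assumptions: it uses the parameters of the figure caption (γ = 1/2, β = 0, σ = 2, ḣ = 1), for which (3.29) reduces to ϕ∗_t = t², and an explicit Euler scheme whose step size and function names (`euler_perturbed`, `phi_star`) are our own choices.

```python
import math

def euler_perturbed(rho, t_end=1.0, dt=1e-4, beta=0.0, sigma=2.0, gamma=0.5, hdot=1.0):
    """Explicit Euler scheme for phi' = rho + beta*phi + sigma*phi**gamma*hdot, phi(0) = rho.

    This is the perturbed equation of footnote 5 with a constant control derivative hdot.
    """
    phi = rho
    for _ in range(int(round(t_end / dt))):
        phi += dt * (rho + beta * phi + sigma * max(phi, 0.0) ** gamma * hdot)
    return phi

def phi_star(t, beta=0.0, sigma=2.0, gamma=0.5, hdot=1.0):
    """Closed form (3.29), specialised to a constant hdot."""
    a = beta * (1.0 - gamma)
    integral = t if a == 0.0 else (1.0 - math.exp(-a * t)) / a
    return math.exp(beta * t) * (sigma * (1.0 - gamma) * hdot * integral) ** (1.0 / (1.0 - gamma))

if __name__ == "__main__":
    for rho in (1e-2, 1e-4, 1e-6):
        print(rho, euler_perturbed(rho), phi_star(1.0))
    # With rho = 0 the scheme never leaves zero: the null function also solves (1.2).
    print(euler_perturbed(0.0))
```

With ρ = 0 the scheme stays at the zero solution, while any ρ > 0 produces a trajectory that approaches ϕ∗ as ρ decreases, which is the non-uniqueness and selection effect described in the remark.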
Remark 3.13 (Lower bound from the upper bound) In general, the weak convergence of the controlled process $X^{\varepsilon,h}$ can be shown by exploiting the large deviation upper bound. This goes as follows: in the notation of Remark 3.12, assume $X^\varepsilon$ satisfies $dX^\varepsilon = b_\varepsilon(X^\varepsilon)\,dt + \varepsilon\sigma(X^\varepsilon)\,dB$ with Lipschitz coefficients, and define $X^{\varepsilon,h}$ from $X^\varepsilon$ as in (3.22). Assume one has proven a large deviation upper bound analogous to (3.12) for the process $X^{\varepsilon,h}$, with a good rate function $I^h$ depending on the control parameter h,
$$I^h(\psi) := \frac12\int_0^T \Big(\frac{\dot\psi_t - b_0(\psi_t) - \sigma(\psi_t)\,\dot h_t}{\sigma(\psi_t)}\Big)^2 dt.$$
It is clear that $I^h$ admits as a unique zero the solution ϕ(h) of $\dot\psi_t = b_0(\psi_t) + \sigma(\psi_t)\,\dot h_t$. Using the compactness of the level sets of $I^h$ and the large deviation upper bound, it is easy to conclude that
$$\lim_{\varepsilon\to 0} W\big(X^{\varepsilon,h} \notin B(\varphi(h), R)\big) = 0 \quad \forall R > 0,$$
hence $X^{\varepsilon,h} \to \varphi(h)$ in law. This provides a way of "bootstrapping" the large deviation lower bound from the upper bound (via weak convergence, together with the bound on relative entropy in Lemma 3.11). When the limit ODE has several solutions, this approach is no longer possible: in the present case, the rate function
$$I^h(\psi) = \frac12\int_0^T \Big(\frac{\dot\psi_t - \beta\psi_t - \psi_t^{\gamma}\,\dot h_t}{\psi_t^{\gamma}}\Big)^2 1_{\{\psi_t>0\}}\,dt$$
has uncountably many zeroes, corresponding to the possible solutions of the degenerate ODE (1.2). While one expects converging subsequences of the family of measures $\{W(X^{\varepsilon,h} \in \cdot)\}_\varepsilon$ to converge to a probability distribution supported by the set of solutions, it is not obvious a priori how to restore a unique limit for $X^{\varepsilon,h}$ (which is why we pass through the transformed process $Y^{\varepsilon,h}$). When uniqueness for the limiting equation is granted, such an approach remains efficient, and applies outside the Markovian framework (see [8] for a treatment of delayed equations; in the setting of [8], uniqueness of solutions for the deterministic system is essential, and enters via their condition (H4)).
In this section, we prove the asymptotic estimates that have been stated in Sect. 2.1
and that follow from Theorem 2.1.
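Proof of Proposition 2.5 Setting $\varepsilon := R^{-(1-\gamma)}$ in (2.1), one has
$$\limsup_{R\to+\infty} R^{-2(1-\gamma)} \log W(X_T \ge R) = \limsup_{\varepsilon\to 0} \varepsilon^2 \log W\big(X_T^{\varepsilon} \ge 1\big) \le -P$$
where $P := \inf_{y\ge 1} P(y)$ and $P(y) := \inf\{I_T(\varphi) : \varphi \text{ is abs. cont.},\ \varphi_0 = 0,\ \varphi \ge 0,\ \varphi_T = y\}$.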
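Fix y ≥ 1 and a function ϕ in the admissible set of P(y), such that $I_T(\varphi) < \infty$. Set $\psi_t = \varphi_t^{1-\gamma}$. On {ϕ = 0}, one has ψ = 0 as well, while for a point t in the open set {ϕ > 0} such that $\dot\varphi_t$ exists, one has $\dot\psi_t = (1-\gamma)\,\dot\varphi_t/\varphi_t^{\gamma}$. Then, noting that $I_T(\varphi) < \infty$ implies that $\frac{\dot\varphi_t}{\varphi_t^{\gamma}} 1_{\varphi_t>0}$ is integrable on [0, T], ψ is also absolutely continuous on [0, T] (see [20, Corollary 3.41]). Moreover,
$$I_T(\varphi) = \frac{1}{2\sigma^2}\int_0^T \Big(\frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^{\gamma}}\Big)^2 1_{\varphi_t>0}\,dt = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2 1_{\psi_t>0}\,dt.$$
Noting that the inverse transformation $\varphi = \psi^{\frac{1}{1-\gamma}}$ also maps AC positive functions to AC positive functions (as $\frac{1}{1-\gamma} > 1$), one has
$$P(y) = \inf\Big\{\frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2 1_{\psi_t>0}\,dt \ :\ \psi \text{ is abs. cont.},\ \psi_0 = 0,\ \psi \ge 0,\ \psi_T = y^{1-\gamma}\Big\}.$$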
When β = 0, the minimizer of this problem is $\psi^*_t(y) = y^{1-\gamma}\,t/T$. When β ≠ 0, the solution of the Euler-Lagrange equation associated with the Lagrangian $(\dot\psi - \beta(1-\gamma)\psi)^2$ and the boundary conditions $\psi_0 = 0$, $\psi_T = y^{1-\gamma}$ yields the minimizer
$$\psi^*_t(y) = \frac{y^{1-\gamma}}{e^{\beta(1-\gamma)T} - e^{-\beta(1-\gamma)T}}\,\big(e^{\beta(1-\gamma)t} - e^{-\beta(1-\gamma)t}\big).$$
In both cases, $\psi^*_t(y) > 0$ for all t ∈ (0, T], and the positivity constraint in P(y) can be dropped. Using the monotonicity of $\psi^*$ w.r.t. y, this yields
$$\inf_{y\ge 1} P(y) = P(1) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi^*_t(1) - \beta(1-\gamma)\psi^*_t(1)\big)^2 dt.$$
An application of the large deviation lower bound (2.1) gives $\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W(X_T > R) = \liminf_{\varepsilon\to 0} \varepsilon^2 \log W(X_T^{\varepsilon} > 1) \ge -\inf_{y>1} P(y) = -P(1)$. Finally, the explicit evaluation of the integral in P(1) over the function $\psi^*$ yields the expression of the constant $c_T$ in (2.4).
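As a sanity check on the last step, the integral in P(1) can be evaluated numerically and compared with a closed form. The sketch below is not taken from the paper: the closed form 2a/(e^{2aT} − 1), with a = β(1 − γ), is our own evaluation of the integral along the minimizer (statement (2.4) is not reproduced in this section), and the function names and the midpoint quadrature are illustrative assumptions.

```python
import math

def psi_star(t, T, a):
    """Minimizer with psi_0 = 0 and psi_T = 1 (case y = 1), where a = beta*(1 - gamma)."""
    if a == 0.0:
        return t / T
    return math.sinh(a * t) / math.sinh(a * T)

def psi_star_dot(t, T, a):
    """Time derivative of psi_star."""
    if a == 0.0:
        return 1.0 / T
    return a * math.cosh(a * t) / math.sinh(a * T)

def cost_numeric(T, beta, sigma, gamma, n=20000):
    """Midpoint-rule evaluation of P(1) along the minimizer."""
    a = beta * (1.0 - gamma)
    dt = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += (psi_star_dot(t, T, a) - a * psi_star(t, T, a)) ** 2 * dt
    return total / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2)

def cost_closed_form(T, beta, sigma, gamma):
    """Our evaluation of the same integral: 2a/(exp(2aT) - 1), or 1/T when a = 0."""
    a = beta * (1.0 - gamma)
    integral = 1.0 / T if a == 0.0 else 2.0 * a / math.expm1(2.0 * a * T)
    return integral / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2)

if __name__ == "__main__":
    for beta in (-1.0, 0.0, 0.7):
        print(beta, cost_numeric(1.0, beta, 2.0, 0.5), cost_closed_form(1.0, beta, 2.0, 0.5))
```

The two columns agree for mean-reverting, driftless and explosive choices of β, which supports the hyperbolic-sine form of the minimizer given above.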
Let us consider the running maximum process. Another application of the large deviation principle (2.1) with $\varepsilon = R^{-(1-\gamma)}$ gives
$$\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W\Big(\sup_{t\in[0,T]} X_t > R\Big) \ge -c_T.$$
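Proposition 3.14 Estimate (1.5) in Theorem 1.2 holds with $\nu_T > 0$. When γ = 1/2, the constant $\nu_T$ is given by
$$\nu_T = \begin{cases} \dfrac{1}{2\sigma^2}\Big(T\beta^2 + \dfrac{4\omega^2}{T}\Big) & \text{if } T\beta/2 < 1 \\[2mm] \dfrac{1}{2\sigma^2}\Big(T\beta^2 - \dfrac{4\omega^2}{T}\Big) & \text{if } T\beta/2 \ge 1 \end{cases} \tag{3.30}$$
where
$$\omega = \begin{cases} \text{the } \omega \in (0, \pi) \text{ such that } \omega\cos\omega = \frac{T\beta}{2}\sin\omega & \text{if } T\beta/2 < 1 \\ 0 & \text{if } T\beta/2 = 1 \\ \text{the } \omega \in (0, \infty) \text{ such that } \omega\cosh\omega = \frac{T\beta}{2}\sinh\omega & \text{if } T\beta/2 > 1. \end{cases} \tag{3.31}$$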
Remark 3.15 Following the lines of the proof of Proposition 3.14, one can prove the analogous asymptotic relation for a general time-average functional $\int_0^T X_t\,\mu(dt)$, where μ is a bounded signed measure on [0, T]. One gets
$$W\Big(\int_0^T X_t\,\mu(dt) \ge R\Big) = e^{-R^{2(1-\gamma)}(V_T + \psi(R))} \quad \text{as } R \to \infty,$$
where $V_T$ is characterised by the variational formula $V_T := \inf\big\{I_T(\varphi) : \int_0^T \varphi_t\,\mu(dt) \ge 1,\ \varphi_t \ge 0,\ \forall t \in [0, T]\big\}$.
When γ = 1/2, the latter variational problem was studied in [12, Exercise 2.1.13]. The explicit solution for J provides the expression of the constant $\nu_T = \inf_{\eta\ge 1} J(\eta) = J(1)$ given in (3.30). The large deviation lower bound yields
$$\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W\Big(\frac1T\int_0^T X_t\,dt > R\Big) = \liminf_{\varepsilon\to 0} \varepsilon^2 \log W\Big(\frac1T\int_0^T X_t^{\varepsilon}\,dt > 1\Big) \ge -J(1) = -\nu_T,$$
and the claim is proved.
Consistency check with the explicit formulae for the integrated CIR process.
Let us consider the case γ = 1/2, and compare Proposition 3.14 with the moment explosion of the integrated CIR process, corresponding to α(x) ≡ α ≥ 0 in condition (H2). We focus on the (common) case of a mean-reverting drift, i.e. β < 0; computations for β > 0 are similar. Estimate (1.5) establishes that $\frac1T\int_0^T X_t\,dt$ has finite exponential moments up to order $\nu_T$: more precisely,
$$u^* := \sup\Big\{u > 0 : E\Big[\exp\Big(\frac{u}{T}\int_0^T X_t\,dt\Big)\Big] < \infty\Big\} = \sup\Big\{\nu > 0 : P\Big(\frac1T\int_0^T X_t\,dt > x\Big) = O(e^{-\nu x}) \text{ as } x \to \infty\Big\} = \nu_T \tag{3.32}$$
(for the central identity, see for example [15, Sect. 4]); in other words, $\nu_T$ is the positive critical exponent of $\frac1T\int_0^T X_t\,dt$. Critical exponents for integrated CIR have been assessed by [2, 14, 18] relying (essentially) on the affine structure of the process. It is typical to obtain $u^*$ by inverting an explicit explosion time: following [2, Corollary 3.3], $E[\exp(\frac{u}{T}\int_0^T X_t\,dt)]$ is always finite if $u \le T\beta^2/(2\sigma^2)$, and if $u > T\beta^2/(2\sigma^2)$, the expectation is finite for $T < T^*(u)$ and infinite for $T > T^*(u)$, where $T^*$ reads
$$T^*(u) = 2\,\frac{\pi + \arctan\big(\gamma(u)/\beta\big)}{\gamma(u)},$$
where $\gamma(u) = \sqrt{2\sigma^2\frac{u}{T} - \beta^2}$. Fixing T and using the monotonicity of $T^*$, this means that the expectation becomes infinite for $u > u^*$ with $u^*$ the solution to
$$\pi + \arctan\frac{\gamma(u)}{\beta} = \gamma(u)\,\frac{T}{2}. \tag{3.33}$$
As an equation in γ, it is easy to see that (3.33) has a unique root $\gamma^*$ on $\mathbb{R}_+$ such that $\frac{T}{2}\gamma^* \in (\frac{\pi}{2}, \pi)$. From the definition of γ,
$$u^* = \frac{1}{2\sigma^2}\big(T\beta^2 + T(\gamma^*)^2\big) = \frac{1}{2\sigma^2}\Big(T\beta^2 + \frac{4}{T}\Big(\frac{T\gamma^*}{2}\Big)^2\Big) = \frac{1}{2\sigma^2}\Big(T\beta^2 + \frac{4}{T}(\omega^*)^2\Big)$$
setting $\omega^* = \frac{T\gamma^*}{2}$. From (3.33), $\omega^*$ is the unique solution to $\omega = \pi + \arctan\frac{2\omega}{T\beta}$, which is equivalent to $\tan(\omega) = \frac{2\omega}{T\beta}$ together with $\omega \in (\frac{\pi}{2}, \pi)$: one sees that this definition coincides with the one for ω in (3.31) (noticing we are in the first case when β < 0).
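The consistency check can also be run numerically. The sketch below, written for the mean-reverting case β < 0 only, solves the first equation in (3.31) by bisection, builds ν_T from (3.30), and verifies that the explosion time T∗(ν_T) quoted from [2, Corollary 3.3] equals T. The function names, the bisection tolerance and the parameter values are illustrative assumptions, not part of the paper.

```python
import math

def omega_first_case(T, beta, tol=1e-12):
    """Root in (0, pi) of w*cos(w) = (T*beta/2)*sin(w), assuming T*beta/2 < 1 (bisection)."""
    c = T * beta / 2.0
    f = lambda w: w * math.cos(w) - c * math.sin(w)
    lo, hi = 1e-9, math.pi          # f(lo) > 0 when c < 1, and f(pi) = -pi < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nu_T(T, beta, sigma):
    """Critical exponent (3.30), first case."""
    w = omega_first_case(T, beta)
    return (T * beta ** 2 + 4.0 * w ** 2 / T) / (2.0 * sigma ** 2)

def explosion_time(u, T, beta, sigma):
    """T*(u) as displayed before (3.33), for beta < 0 and u > T*beta**2/(2*sigma**2)."""
    g = math.sqrt(2.0 * sigma ** 2 * u / T - beta ** 2)
    return 2.0 * (math.pi + math.atan(g / beta)) / g

if __name__ == "__main__":
    T, beta, sigma = 1.0, -1.0, 0.5
    nu = nu_T(T, beta, sigma)
    print(omega_first_case(T, beta), nu, explosion_time(nu, T, beta, sigma))
```

For β < 0 the computed root lies in (π/2, π) and the explosion time at u = ν_T coincides with T, in line with the identification ω∗ = Tγ∗/2.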
Acknowledgments We would like to thank an anonymous referee for the careful reading of the
paper and for several valuable comments which helped to improve the presentation. We thank
Peter Friz for stimulating discussions and Antoine Jacquier for useful references on integrated
CIR processes. SDM (affiliated with TU-Berlin when this work was started) acknowledges partial
financial support from Matheon. GC acknowledges financial support from Berlin Mathematical
School. SDM and GC acknowledge financial support for travel expenses from the research program
‘Chaire Risques Financiers’ of the Fondation du Risque.
Appendix A
after a simple application of the product rule, one has that the process $Z_t := \exp(|\beta|t)X_t$ is a solution to
$$dZ_t = \big(|\alpha|_\infty \exp(|\beta|t) + |\beta| Z_t\big)\,dt + \sigma Z_t^{\gamma}\,dB_t, \quad Z_0 = x.$$
Since $|\alpha|_\infty \exp(|\beta|t) \ge |\alpha|_\infty$, an application of the comparison principle for SDEs [17, Proposition 5.2.18] yields $Z_t \ge \tilde X_t$, for all t ≥ 0. Therefore, if $X_t^{2(1-\gamma)}$ admits (some) exponential moments, so does $Z_t^{2(1-\gamma)}$ and by comparison $\tilde X_t^{2(1-\gamma)}$. In this
sense, the process X is not covered by Proposition 3.3 in [9], since the latter deals
with the case of a diffusion coefficient that does not depend on time (see [9, Eq.
(3.1)]); nonetheless, the essential condition that [9, Proposition 3.3] relies on is the
presence of a non-strictly positive slope coefficient, say b in the drift term a + bX
(cf. [9, Eq. (3.3)]). Since this is the case for the process X (which has zero slope
coefficient b), it is straightforward to extend the proof to the present setting: in particular, in the spirit of Lamperti's change-of-variable argument, one still defines the function $\varphi(x) = \int_0^x \frac{dy}{\sigma y^{\gamma}} = \frac{1}{\sigma(1-\gamma)}\,x^{1-\gamma}$ and studies the process $\tilde\varphi(X_t)$, where the function $\tilde\varphi$ is a modification of ϕ identically null around zero. Itô's formula shows that $\tilde\varphi(X_t)$ is an Itô process with bounded quadratic variation and a bounded drift term; the existence of quadratic exponential moments for $\tilde\varphi(X_t)$, then, is a consequence of the Dubins–Schwarz time-change argument and Fernique's theorem. As a consequence, there exist $c', C > 0$ such that $\sup_{t\le T} E[\exp(c' X_t^{2(1-\gamma)})] \le C$; it follows $\sup_{t\le T} E[\exp(c \tilde X_t^{2(1-\gamma)})] \le \sup_{t\le T} E[\exp(c Z_t^{2(1-\gamma)})] \le C$ with $c := c' \exp(-2|\beta|(1-\gamma)T)$, and the claim is proved.
We report the statement given in [26, Chap. 2, Theorem 2.13].
then
$$|\omega_t - \omega_s| \le 8\int_0^{|t-s|} \Psi^{-1}\Big(\frac{4K}{u^2}\Big)\,dp(u). \tag{A.2}$$
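We can apply Itô's formula to the function $f(x) = x^{1-\gamma}$ up to time $T^{\varepsilon}(Y^{\varepsilon,h})$, and obtain
$$Y_t^{\varepsilon,h} - \varepsilon x^{1-\gamma} = \int_0^t \tilde b_\varepsilon(Y_s^{\varepsilon,h})\,ds + \sigma(1-\gamma)h_t + \varepsilon\sigma(1-\gamma)B_t, \quad \forall\, t \le T^{\varepsilon}(Y^{\varepsilon,h}), \text{ a.s.} \tag{A.4}$$
where $\tilde b_\varepsilon$ is given by
$$\tilde b_\varepsilon(y) := (1-\gamma)\,\varepsilon^{\frac{1}{1-\gamma}}\,\alpha\big(\varepsilon^{-\frac{1}{1-\gamma}}\,y^{\frac{1}{1-\gamma}}\big)\,\frac{1}{y^{\frac{\gamma}{1-\gamma}}} - \frac{\sigma^2\gamma(1-\gamma)}{2}\,\varepsilon^2\,\frac{1}{y} + \beta(1-\gamma)y. \tag{A.5}$$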
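We need to prove
$$\lim_{\varepsilon\to 0} W\Big(\sup_{t\in[0,T]} |Y_t^{\varepsilon,h} - S_0(h)_t| \le R\Big) = 1 \quad \forall R > 0. \tag{A.6}$$
A direct computation shows that there exists a constant c > 0 depending on x, σ, α(·) such that:
$$\inf_{y \ge \frac12 \varepsilon x^{1-\gamma}} \big(\tilde b_\varepsilon(y) - \beta(1-\gamma)y\big) \ge -c\varepsilon. \tag{A.9}$$
Define $(Z_t)_{t\in[0,T]}$ by
$$Z_t = \varepsilon x^{1-\gamma} + \big(-c\varepsilon + \sigma(1-\gamma)k\big)\,t + \beta(1-\gamma)\int_0^t Z_s\,ds + \varepsilon\sigma(1-\gamma)B_t. \tag{A.10}$$
Using (A.9), it follows from the comparison principle for SDEs that
$$Y_t \ge Z_t \quad \forall\, t \le T^{\varepsilon}(Y), \text{ a.s.} \tag{A.11}$$
We claim that
$$W\big(T^{\varepsilon}(Z) \le T\big) \to 0. \tag{A.12}$$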
Since both the events in the right hand side of the last inequality have probability
converging to 1, (A.6) follows, and Lemma 3.10 is proved under condition (A.7).
Step 2. We assume that (A.7) holds only on the time interval [0, ρ], that is $\dot h_t \ge k$ for every t ≤ ρ, for some k, ρ > 0. Repeating the argument of Step 1 with T = ρ, we have
$$\lim_{\varepsilon\to 0} W\Big(\sup_{t\in[0,\rho]} |Y_t - S_0(h)_t| \le R\Big) = 1, \quad \forall R > 0 \tag{A.13}$$
and set
$$Y^{y,\rho} := (X^{y,\rho})^{1-\gamma}.$$
Note that $Y^{y,\rho}$ is well defined since $X^{y,\rho} \ge 0$ for all t ∈ [0, T], W-almost surely. If h = 0 the non-negativity of the trajectories of $X^{y,\rho}$ follows from an application of Proposition 3.1 in [9] and extends to h ∈ H by an application of the Girsanov theorem. By definition of Y and $Y^{y,\rho}$, the Markov property yields
$$E\big(f(\tau_\rho Y)\,|\,\mathcal{F}_\rho\big) = E\big(f(Y^{Y_\rho,\rho})\big).$$
By the continuity of the map $(h, y) \mapsto S_y(h)$ we can choose R' > 0 such that
$$\sup_{y\in B(S_0(h)_\rho, R')}\ \sup_{t\in[0,T-\rho]} |S_y(\tau_\rho h)_t - S_{S_0(h)_\rho}(\tau_\rho h)_t| \le \frac{R}{2}. \tag{A.14}$$
Therefore, using (A.14) the following inclusion of events holds (assume w.l.o.g. $R' \le \frac{R}{2}$):
$$\Big\{\sup_{t\in[0,T]} |Y_t - S_0(h)_t| \le R\Big\} \supseteq \Big\{\sup_{[0,\rho]} |Y_t - S_0(h)_t| \le R'\Big\} \cap \Big\{\sup_{t\in[0,T-\rho]} |\tau_\rho(Y)_t - S_{Y_\rho}(\tau_\rho h)_t| \le \frac{R}{2}\Big\}.$$
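It follows from the hypothesis $S_0(h)_t > 0$ ∀t > 0 and the continuity of the map $(y, h) \mapsto S_y(h)$ that, if R', R are small enough
$$y^* := \inf_{y\in B(S_0(h)_\rho, R')}\ \inf_{t\in[0,T-\rho]} S_y(\tau_\rho h)_t - \frac{R}{2} > 0. \tag{A.17}$$
where
$$\tilde b^u_\varepsilon(y) = \begin{cases} \tilde b_\varepsilon(y) & \text{if } y \ge y^* \\ \beta(1-\gamma)y + (1-\gamma)\,\varepsilon^{\frac{1}{1-\gamma}}\,\alpha\big(\varepsilon^{-\frac{1}{1-\gamma}}(y^*)^{\frac{1}{1-\gamma}}\big)\,\dfrac{1}{(y^*)^{\frac{\gamma}{1-\gamma}}} - \dfrac{\sigma^2\gamma(1-\gamma)}{2}\,\varepsilon^2\,\dfrac{1}{y^*} & \text{if } y < y^*. \end{cases}$$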
By letting ε → 0 and applying the Markov inequality, observing that the right hand
side of (A.19) does not depend on y, we have proven (A.16). By letting ε → 0 in
(A.15) and applying (A.13) and (A.16), the proof of Lemma 3.10 is complete.
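Proof of (A.12). Observe that $\tilde Z := \frac{1}{\varepsilon} Z$ is an Ornstein-Uhlenbeck process,
$$\tilde Z_t = x^{1-\gamma} + \mu_\varepsilon t + \beta(1-\gamma)\int_0^t \tilde Z_s\,ds + \sigma(1-\gamma)B_t. \tag{A.20}$$
Now, the choice $\tau_\varepsilon = \sqrt{\varepsilon}$ gives $f_\varepsilon(\tau_\varepsilon) \sim \mu_\varepsilon \tau_\varepsilon \to \infty$ as ε → 0, so that $\big(f_\varepsilon(\tau_\varepsilon) - x^{1-\gamma}/2\big)^{-1} \to 0$. On the other hand, $\inf_{t\in[0,\tau_\varepsilon]} \tilde Z_t \to x^{1-\gamma}$ a.s. as ε → 0, hence $W\big(\inf_{t\in[0,\tau_\varepsilon]} \tilde Z_t \le x/2\big) \to 0$ as ε → 0, and the claim is proven.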
References
1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs,
and Mathematical Tables, 10th edn. Dover, New York (1972)
2. Andersen, L., Piterbarg, V.: Moment explosions in stochastic volatility models. Financ. Stoch.
11, 29–50 (2007)
3. Azencott, R.: Grandes déviations et applications. Ecole d’été de Probabilités de Saint-Flour
VIII-1978. Lecture Notes in Mathematics, vol. 774, pp. 1–176. Springer, Berlin (1980)
4. Baldi, P., Caramellino, L.: General Freidlin-Wentzell large deviations and positive diffusions.
Stat. Probab. Lett. 81, 1218–1229 (2011)
5. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Annales scientifiques de l’Ecole Normale Supérieure 4(21), 307–331 (1988)
6. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge University Press,
Cambridge (1987)
7. Bismut, J.-M.: Large Deviations and the Malliavin Calculus. Birkhäuser, Boston (1984)
8. Chiarini, A., Fischer, M.: On large deviations for small noise Itô processes. Adv. Appl. Probab.
46(4), 1126–1147 (2014)
9. De Marco, S.: Smoothness and asymptotic estimates of densities for SDEs with locally smooth coefficients and applications to square root-type diffusions. Ann. Appl. Probab. 21(4), 1282–1321 (2011)
10. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Applications
of Mathematics, Springer, New York (1998)
11. Deuschel, J.-D., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, II: Applications. Comm. in Pure and Applied Math. 67(1), 40–82
(2014)
12. Deuschel, J.-D., Stroock, D.W.: Large Deviations. Pure and Applied Mathematics. American Mathematical Society, New York, London. Revised edition of: An introduction to the theory of large deviations/D.W. Stroock. cop.1984 (2000)
13. Donati-Martin, C., Rouault, A., Yor, M., Zani, M.: Large deviations for squares of Bessel and
Ornstein-Uhlenbeck processes. Probab. Theory Relat. Fields 129, 261–289 (2004)
14. Dufresne, D.: The integrated square-root process. Research Paper no. 90, Centre for Actuarial
Studies, University of Melbourne (2001)
15. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1(1), 609–641 (2010)
16. Jeanblanc, M., Yor, M., Chesney, M.: Mathematical Methods for Financial Markets. Springer
finance. Springer, Dordrecht, Heidelberg, London (2009)
17. Karatzas, I., Shreve, S.: Brownian Motion and Stochastic Calculus, 2nd edn. Springer, New
York (1991)
18. Keller-Ressel, M.: Moment explosions and long-term behavior of affine stochastic volatility
models. Math. Financ. 21, 73–98 (2011)
19. Klebaner, F., Liptser, R.: Asymptotic analysis of ruin in the constant elasticity of variance
model. Theory Probab. Appl. 55(2), 291–297 (2011)
20. Leoni, G.: A First Course in Sobolev Spaces. Graduate studies in mathematics. American
Mathematical Society, Cambridge (2009)
21. Lions, P.-L., Musiela, M.: Correlations and bounds for stochastic volatility models. Annales
de l’Institut H. Poincaré 24, 1–16 (2007)
22. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, New
York (1999)
23. Robertson, S.: Sample path large deviations and optimal importance sampling for stochastic
volatility models. Stoch. Process. Appl. 120(1), 66–83 (2010)
24. Schöbel, R., Zhu, J.: Stochastic volatility with an Ornstein-Uhlenbeck process: an extension.
Eur. Financ. Rev. 3(1), 23–46 (1999)
25. Stein, E.M., Stein, J.C.: Stock price distribution with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4, 727–752 (1991)
26. Stroock, D., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Grundlehren der math-
ematischen Wissenschaften. Fundamental Principles of Mathematical Sciences, vol. 233.
Springer, Berlin (1979). Reprinted in 2006
27. Trevisan, D.: Zero noise limits using local times. Electron. Commun. Probab. 18(31), 1–7
(2013)
Long Time Asymptotics for Optimal
Investment
Huyên Pham
Abstract This survey reviews the portfolio selection problem over a long-term horizon. We consider two objectives: (i) maximize the probability of outperforming a target growth rate of the wealth process; (ii) minimize the probability of falling below a target growth rate. We study the asymptotic behavior of these criteria, formulated as large deviations control problems, which we solve by a duality method leading to ergodic risk-sensitive portfolio optimization problems. Special emphasis is placed on linear factor models where explicit solutions are obtained.
1 Introduction
Dynamic portfolio selection looks for strategies maximizing some performance cri-
terion. It is a main topic in mathematical finance, first solved in continuous time in
the seminal paper [13], and extended in various directions by taking into account
stochastic investment opportunities, market imperfections and/or transaction costs.
H. Pham (B)
Laboratoire de Probabilités et Modèles Aléatoires CNRS, UMR 7599,
Université Paris Diderot, Paris, France
e-mail: [email protected]
URL: http://www.math.univ-paris-diderot.fr
H. Pham
CREST-ENSAE, Malakoff, France
We refer for instance to the textbooks [10, 11] or [19], and the recent survey paper
[12] for developments on this subject.
Classical criterion for investment decision is the expected utility maximization
from terminal wealth, which requires to specify on one hand the utility function
representing the investor’s preference, and subjective by nature, and on the other
hand the finite horizon. We consider in this paper an alternative behavioral founda-
tion, with an objective criterion over long term. More precisely, we are concerned
with the performance of a portfolio relative to a given target, and are interested in
maximizing (resp. minimizing) the probability to outperform (resp. to fall below)
a target growth rate when time horizon goes to infinity. Such criterion, formulated
as a large deviations portfolio optimization problem, has been proposed by [22] in
a static framework, studied in a continuous-time framework for the maximization
of upside chance probability by [17], and then by [9], see also [21] in discrete-time
models and [18] for a survey paper. The asymptotics of minimizing the downside
risk probability is studied in [8, 15].
Large deviations portfolio optimization is a nonstandard stochastic control prob-
lem, and is tackled by duality approach. The dual control problem is an ergodic risk-
sensitive portfolio optimization problem studied in [6] by dynamic programming
PDE methods in a Markovian setting, see also [7], and leads to particularly tractable
results with time-homogeneous policies. A nice feature of the duality approach is also
to relate the target level in the objective probability of upside chance maximization
or downside risk minimization to the subjective degree of risk aversion, hence to
make endogenous the utility function of the investor.
The rest of this paper is organized as follows. Section 2 formulates the large devi-
ations criterion. In Sect. 3, we state the general duality relation for the large devi-
ations optimization problem, both for the upside chance probability maximization
and downside risk minimization. We illustrate in Sect. 4 our results in the Black-
Scholes toy model with constant proportion portfolio. Finally, we consider in Sect. 5
a factor model for assets price, and characterize the optimal strategy of the large
deviations optimization problem via the resolution of an ergodic Hamilton-Jacobi-
Bellman equation from the risk-sensitive dual control. Explicit solutions are provided
in the linear Gaussian factor model.
dynamics:
where diag(St )−1 denotes the diagonal d × d matrix of i-th diagonal term 1/Sti .
We then define the so-called growth rate portfolio, i.e. the logarithm of the wealth
process X π :
$$L^\pi_t := \ln X^\pi_t, \quad t \ge 0.$$
$$\bar L^\pi_t := \frac{L^\pi_t}{t}, \quad t > 0.$$
We shall then consider two problems on the long time asymptotics for the average
growth rate:
(i) Upside chance probability: given a target growth rate ℓ, the agent wants to maximize over portfolio strategies π ∈ A
$$P\big[\bar L^\pi_T \ge \ell\big] \quad \text{when } T \to \infty.$$
(ii) Downside risk probability: given a target growth rate ℓ, the agent wants to minimize over portfolio strategies π ∈ A
$$P\big[\bar L^\pi_T \le \ell\big] \quad \text{when } T \to \infty.$$
Actually, when the time horizon T goes to infinity, the probabilities of upside chance or downside risk typically have an exponential decay in time, and we are led to the following mathematical formulations of the large deviations criterion:
$$v_+(\ell) := \sup_{\pi\in A}\,\limsup_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \ge \ell\big], \tag{2.2}$$
$$v_-(\ell) := \inf_{\pi\in A}\,\liminf_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \le \ell\big]. \tag{2.3}$$
This criterion depends on the objective probability P and the target growth rate ℓ, but involves neither an exogenous utility function nor a finite horizon. The large deviations control problems (2.2) and (2.3) are nonstandard in the literature on stochastic control, and we shall study these problems by a duality approach.
3 Duality
We derive in this section the dual formulation of the large deviations criterion introduced in (2.2) and (2.3). Given π ∈ A, if the average growth rate portfolio $\bar L^\pi_T$ satisfies a large deviations principle, then large deviations theory states that its rate function I(., π) should be related to its limiting log-Laplace transform Γ(., π) by duality via the Gärtner-Ellis theorem:
$$I(\ell, \pi) = \sup_{\theta}\big[\theta\ell - \Gamma(\theta, \pi)\big], \tag{3.1}$$
$$\limsup_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \ge \ell\big] = -\inf_{\ell' \ge \ell} I(\ell', \pi) = -I(\ell, \pi), \quad \ell \ge \lim_{T\to\infty} \bar L^\pi_T, \tag{3.2}$$
where
$$\Gamma(\theta, \pi) := \limsup_{T\to\infty} \frac1T \ln E\big[e^{\theta T \bar L^\pi_T}\big], \quad \theta \in \mathbb{R},$$
The issue is now to extend this duality relation (3.1) when optimizing over control
π. To fix the ideas, let us formally derive from (3.1) and (3.2) the maximization of
upside chance probability.
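$$\begin{aligned} \sup_{\pi}\,\limsup_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \ge \ell\big] &= \sup_{\pi}\big[-I(\ell, \pi)\big] \\ &= \sup_{\pi}\Big[-\sup_{\theta}\big[\theta\ell - \Gamma(\theta, \pi)\big]\Big] \\ &= \sup_{\pi}\,\inf_{\theta}\big[\Gamma(\theta, \pi) - \theta\ell\big] \\ \text{(if we can invert sup and inf)} \quad &= \inf_{\theta}\,\sup_{\pi}\big[\Gamma(\theta, \pi) - \theta\ell\big]. \end{aligned}$$
In other words, we should have a duality relation between the value function $v_+$ of the large deviations control problem and the value function $\Gamma_+$ of a risk-sensitive control problem, where $\Gamma_+$ is defined by
$$\Gamma_+(\theta) := \sup_{\pi\in A}\,\limsup_{T\to\infty} \frac1T \ln E\big[e^{\theta T \bar L^\pi_T}\big], \quad \theta \ge 0. \tag{3.4}$$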
π∈A T →∞ T
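We easily see from Hölder's inequality that $\Gamma_+$ is convex on $\mathbb{R}_+$. The following result is due to [17].
Theorem 3.1 Suppose that $\Gamma_+$ is finite and differentiable on $(0, \bar\theta)$ for some $\bar\theta \in (0, \infty]$, and there exists $\hat\pi(\theta) \in A$ solution to $\Gamma_+(\theta)$ for any $\theta \in (0, \bar\theta)$. Then, for all $\ell < \Gamma_+'(\bar\theta)$, we have:
$$v_+(\ell) = \inf_{\theta\in[0,\bar\theta)}\big[\Gamma_+(\theta) - \theta\ell\big].$$
Moreover, an optimal control for $v_+(\ell)$, when $\ell \in (\Gamma_+'(0), \Gamma_+'(\bar\theta))$, is $\pi^{+,\ell} = \hat\pi(\theta(\ell))$, where $\theta(\ell) \in (0, \bar\theta)$ is such that $\Gamma_+'(\theta(\ell)) = \ell$.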
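Proof Step 1: Let us consider the Fenchel-Legendre transform of the convex function $\Gamma_+$ on $[0, \bar\theta)$:
$$\Gamma_+^*(\ell) = \sup_{\theta\in[0,\bar\theta)}\big[\theta\ell - \Gamma_+(\theta)\big], \quad \ell \in \mathbb{R}. \tag{3.5}$$
Since $\Gamma_+$ is $C^1$ on $(0, \bar\theta)$, it is well-known (see e.g. Lemma 2.3.9 in [4]) that the function $\Gamma_+^*$ is convex, nondecreasing and satisfies:
$$\Gamma_+^*(\ell) = \begin{cases} \theta(\ell)\ell - \Gamma_+(\theta(\ell)), & \text{if } \Gamma_+'(0) < \ell < \Gamma_+'(\bar\theta) \\ 0, & \text{if } \ell \le \Gamma_+'(0), \end{cases} \tag{3.6}$$
$$\theta(\ell)\ell - \Gamma_+^*(\ell) > \theta(\ell)\ell' - \Gamma_+^*(\ell'), \quad \forall\, \Gamma_+'(0) < \ell < \Gamma_+'(\bar\theta),\ \forall \ell' \ne \ell, \tag{3.7}$$
where $\theta(\ell) \in (0, \bar\theta)$ is s.t. $\Gamma_+'(\theta(\ell)) = \ell \in (\Gamma_+'(0), \Gamma_+'(\bar\theta))$. Moreover, $\Gamma_+^*$ is continuous on $(-\infty, \Gamma_+'(\bar\theta))$.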
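Step 2: Upper bound. For all ℓ ∈ R, π ∈ A, an application of Chebycheff's inequality yields:
$$P\big[\bar L^\pi_T \ge \ell\big] \le \exp(-\theta\ell T)\,E\big[\exp(\theta T \bar L^\pi_T)\big], \quad \forall\, \theta \in [0, \bar\theta),$$
and so
$$\limsup_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \ge \ell\big] \le -\theta\ell + \limsup_{T\to\infty} \frac1T \ln E\big[\exp(\theta T \bar L^\pi_T)\big], \quad \forall\, \theta \in [0, \bar\theta).$$
By the definitions of $\Gamma_+$ and $\Gamma_+^*$, taking the supremum over π and then the infimum over θ, we deduce:
$$\sup_{\pi\in A}\,\limsup_{T\to\infty} \frac1T \ln P\big[\bar L^\pi_T \ge \ell\big] \le -\Gamma_+^*(\ell). \tag{3.8}$$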
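Step 3: Lower bound. Consider first the case $\ell \in (\Gamma_+'(0), \Gamma_+'(\bar\theta))$, and let us define the probability measure $Q_T$ on $(\Omega, \mathcal{F}_T)$ via:
$$\frac{dQ_T}{dP} = \exp\Big[\theta(\ell)\,L_T^{\pi^{+,\ell}} - \Gamma_T(\theta(\ell), \pi^{+,\ell})\Big], \tag{3.9}$$
where $\Gamma_T(\theta, \pi) := \ln E[\exp(\theta L^\pi_T)]$. For all ε > 0, we have
$$\frac1T \ln P\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] \ge -\theta(\ell)(\ell + \varepsilon) + \frac1T \Gamma_T(\theta(\ell), \pi^{+,\ell}) + \frac1T \ln Q_T\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big],$$
where we use (3.9) in the last inequality. By definition of the dual problem, this yields:
$$\begin{aligned} \liminf_{T\to\infty} \frac1T \ln P\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] &\ge -\theta(\ell)(\ell + \varepsilon) + \Gamma_+(\theta(\ell)) + \liminf_{T\to\infty} \frac1T \ln Q_T\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] \\ &\ge -\Gamma_+^*(\ell) - \theta(\ell)\varepsilon + \liminf_{T\to\infty} \frac1T \ln Q_T\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big], \end{aligned} \tag{3.10}$$
where the second inequality follows by the definition of $\Gamma_+^*$ (and actually holds with equality due to (3.6)). We now show that:
$$\liminf_{T\to\infty} \frac1T \ln Q_T\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] = 0. \tag{3.11}$$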
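Denote by $\tilde\Gamma_T$ the c.g.f. under $Q_T$ of $L_T^{\pi^{+,\ell}}$. For all ζ ∈ R, we have by (3.9):
$$\tilde\Gamma_T(\zeta) := \ln E^{Q_T}\big[\exp(\zeta L_T^{\pi^{+,\ell}})\big] = \Gamma_T(\theta(\ell) + \zeta, \pi^{+,\ell}) - \Gamma_T(\theta(\ell), \pi^{+,\ell}).$$
Therefore, by definition of the dual control problem (3.4), we have for all $\zeta \in [-\theta(\ell), \bar\theta - \theta(\ell))$:
$$\limsup_{T\to\infty} \frac1T \tilde\Gamma_T(\zeta) \le \Gamma_+(\theta(\ell) + \zeta) - \Gamma_+(\theta(\ell)). \tag{3.12}$$
As in Step 2 of this proof, by Chebycheff's inequality, we have for all $\zeta \in [0, \bar\theta - \theta(\ell))$:
$$\limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \ge \ell + \varepsilon\big] \le -\zeta(\ell + \varepsilon) + \limsup_{T\to\infty} \frac1T \tilde\Gamma_T(\zeta) \le -\zeta(\ell + \varepsilon) + \Gamma_+(\zeta + \theta(\ell)) - \Gamma_+(\theta(\ell)),$$
and so:
$$\begin{aligned} \limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \ge \ell + \varepsilon\big] &\le -\sup\{\zeta(\ell + \varepsilon) - \Gamma_+(\zeta) : \zeta \in [\theta(\ell), \bar\theta)\} - \Gamma_+(\theta(\ell)) + \theta(\ell)(\ell + \varepsilon) \\ &\le -\Gamma_+^*(\ell + \varepsilon) - \Gamma_+(\theta(\ell)) + \theta(\ell)(\ell + \varepsilon) \\ &= -\Gamma_+^*(\ell + \varepsilon) + \Gamma_+^*(\ell) + \varepsilon\theta(\ell), \end{aligned} \tag{3.13}$$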
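where the second inequality and the last equality follow from (3.6). Similarly, we have for all $\zeta \in [-\theta(\ell), 0]$:
$$\limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \le \ell - \varepsilon\big] \le -\zeta(\ell - \varepsilon) + \limsup_{T\to\infty} \frac1T \tilde\Gamma_T(\zeta) \le -\zeta(\ell - \varepsilon) + \Gamma_+(\theta(\ell) + \zeta) - \Gamma_+(\theta(\ell)),$$
and so:
$$\begin{aligned} \limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \le \ell - \varepsilon\big] &\le -\sup\{\zeta(\ell - \varepsilon) - \Gamma_+(\zeta) : \zeta \in [0, \theta(\ell)]\} - \Gamma_+(\theta(\ell)) + \theta(\ell)(\ell - \varepsilon) \\ &\le -\Gamma_+^*(\ell - \varepsilon) + \Gamma_+^*(\ell) - \varepsilon\theta(\ell). \end{aligned} \tag{3.14}$$
$$\begin{aligned} \limsup_{T\to\infty} \frac1T \ln Q_T\Big[\big\{\bar L_T^{\pi^{+,\ell}} \le \ell - \varepsilon\big\} \cup \big\{\bar L_T^{\pi^{+,\ell}} \ge \ell + \varepsilon\big\}\Big] &\le \max\Big\{\limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \ge \ell + \varepsilon\big];\ \limsup_{T\to\infty} \frac1T \ln Q_T\big[\bar L_T^{\pi^{+,\ell}} \le \ell - \varepsilon\big]\Big\} \\ &\le \max\Big\{-\Gamma_+^*(\ell + \varepsilon) + \Gamma_+^*(\ell) + \varepsilon\theta(\ell);\ -\Gamma_+^*(\ell - \varepsilon) + \Gamma_+^*(\ell) - \varepsilon\theta(\ell)\Big\} \\ &< 0, \end{aligned}$$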
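where the strict inequality follows from (3.7). This implies that $Q_T\big[\{\bar L_T^{\pi^{+,\ell}} \le \ell - \varepsilon\} \cup \{\bar L_T^{\pi^{+,\ell}} \ge \ell + \varepsilon\}\big] \to 0$ and hence $Q_T\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] \to 1$ as T goes to infinity. In particular (3.11) is satisfied, and by sending ε to zero in (3.10), we get for any $\ell' < \ell < \Gamma_+'(\bar\theta)$:
$$\liminf_{T\to\infty} \frac1T \ln P\big[\bar L_T^{\pi^{+,\ell}} > \ell'\big] \ge \lim_{\varepsilon\to 0}\,\liminf_{T\to\infty} \frac1T \ln P\big[\ell - \varepsilon < \bar L_T^{\pi^{+,\ell}} < \ell + \varepsilon\big] \ge -\Gamma_+^*(\ell).$$
$$\liminf_{T\to\infty} \frac1T \ln P\big[\bar L_T^{\pi^{+,\ell}} \ge \ell\big] \ge -\Gamma_+^*(\ell).$$
This last inequality combined with (3.8) proves the assertion for $v_+(\ell)$ when $\ell \in (\Gamma_+'(0), \Gamma_+'(\bar\theta))$.
Now, consider the case $\ell \le \Gamma_+'(0)$, and define $\ell_n = \Gamma_+'(0) + \frac1n$, $\pi^{+(n)} = \hat\pi(\theta(\ell_n))$. Then, by the same arguments as in (3.10) with $\ell_n \in (\Gamma_+'(0), \Gamma_+'(\bar\theta))$, we have
$$\liminf_{T\to\infty} \frac1T \ln P\big[\bar L_T^{\pi^{+(n)}} \ge \ell\big] \ge \lim_{\varepsilon\to 0}\,\liminf_{T\to\infty} \frac1T \ln P\big[\ell_n - \varepsilon < \bar L_T^{\pi^{+(n)}} < \ell_n + \varepsilon\big] \ge -\Gamma_+^*(\ell_n).$$
$$\liminf_{n\to\infty}\,\liminf_{T\to\infty} \frac1T \ln P\big[\bar L_T^{\pi^{+(n)}} \ge \ell\big] \ge -\Gamma_+^*(\Gamma_+'(0)) = 0,$$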
Remark 3.1 Theorem 3.1 shows that the upside chance large deviations control problem can be solved via the resolution of the dual control problem. When the target growth rate level ℓ is smaller than $\Gamma_+'(0)$, then one can achieve almost surely over the long term an average growth term above ℓ, in the sense that $v_+(\ell) = 0$, with a nearly optimal portfolio strategy which does not depend on this level. When the target level ℓ lies between $\Gamma_+'(0)$ and $\Gamma_+'(\bar\theta)$, the optimal strategy depends on this level and is obtained from the optimal strategy for the dual control problem $\Gamma_+(\theta)$ at point θ = θ(ℓ). When $\Gamma_+'(\bar\theta) = \infty$, i.e. $\Gamma_+$ is steep, we have a complete resolution of the large deviations control problem for all values of ℓ. Otherwise, the problem remains open for $\ell > \Gamma_+'(\bar\theta)$.
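Let us next consider the downside risk probability, and define the corresponding dual control problem:
$$\Gamma_-(\theta) := \inf_{\pi\in A}\,\liminf_{T\to\infty} \frac1T \ln E\big[e^{\theta T \bar L^\pi_T}\big], \quad \theta \le 0. \tag{3.15}$$
$$\frac{\lambda\theta_1}{\theta}\,X_T^{\pi^1} + \frac{(1-\lambda)\theta_2}{\theta}\,X_T^{\pi^2} = X_T^{\pi}.$$
$$\ln X_T^{\pi} \ge \frac{\lambda\theta_1}{\theta}\,\ln X_T^{\pi^1} + \frac{(1-\lambda)\theta_2}{\theta}\,\ln X_T^{\pi^2},$$
$$\theta T \bar L_T^{\pi} \le \lambda\theta_1 T \bar L_T^{\pi^1} + (1-\lambda)\theta_2 T \bar L_T^{\pi^2}.$$
Taking exponential and expectation on both sides of this relation, and using Hölder's inequality, we get:
$$E\big[e^{\theta T \bar L_T^{\pi}}\big] \le \Big(E\big[e^{\theta_1 T \bar L_T^{\pi^1}}\big]\Big)^{\lambda}\,\Big(E\big[e^{\theta_2 T \bar L_T^{\pi^2}}\big]\Big)^{1-\lambda}.$$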
Theorem 3.2 Suppose that $\Gamma_-$ is differentiable on (−∞, 0), and there exists $\hat\pi(\theta) \in A$ solution to $\Gamma_-(\theta)$ for any θ < 0. Then, for all $\ell < \Gamma_-'(0)$, we have:
$$v_-(\ell) = \inf_{\theta\le 0}\big[\Gamma_-(\theta) - \theta\ell\big],$$
and an optimal control for $v_-(\ell)$, when $\ell \in (\Gamma_-'(-\infty), \Gamma_-'(0))$ is:
Remark 3.2 Theorem 3.2 shows that the downside risk large deviations control problem can be solved via the resolution of the dual control problem. When the target growth rate level ℓ is smaller than $\Gamma_-'(-\infty)$, then one can find a portfolio strategy so that the average growth term almost never falls below ℓ over the long term, in the sense that $v_-(\ell) = -\infty$. When the target level ℓ lies between $\Gamma_-'(-\infty)$ and $\Gamma_-'(0)$, the optimal strategy depends on this level and is obtained from the optimal strategy for the dual control problem $\Gamma_-(\theta)$ at point θ = θ(ℓ).
$$\frac{1}{\theta}\,\Gamma_\pm(\theta) = \sup_{\pi\in A}\,\limsup_{T\to\infty} J_T(\theta, \pi),$$
with
$$J_T(\theta, \pi) := \frac{1}{\theta T} \ln E\big[e^{\theta T \bar L^\pi_T}\big],$$
This relation shows that risk-sensitive control amounts to making dynamic the
Markowitz problem: one maximizes the expected average growth rate subject to
a constraint on its variance. Risk-sensitive portfolio criterion on finite horizon T has
been studied in [2, 3], and in the ergodic case T → ∞, by [6, 16].
Endogenous utility function
Recalling that growth rate is the logarithm of wealth process, the duality relation for the upside large deviations probability means formally that for large horizon T:
$$P\big[\bar L_T^{\pi^{+,\ell}} \ge \ell\big] \approx \exp\big(v_+(\ell)T\big) = \exp\big(\Gamma_+(\theta(\ell))T - \theta(\ell)\ell T\big) \approx E\Big[\big(X_T^{\pi^{+,\ell}}\big)^{\theta(\ell)}\Big]\,e^{-\theta(\ell)\ell T}, \quad \text{with } \theta(\ell) > 0.$$
In other words, the target growth rate level ℓ determines endogenously the risk aversion parameter 1 − θ(ℓ) of an agent with Constant Relative Risk Aversion (CRRA) utility function and large investment horizon. Moreover, the optimal strategy $\pi^{\pm,\ell}$ for $v_\pm(\ell)$ is expected to provide a good approximation for the solution to the CRRA utility maximization problem
$$\sup_{\pi\in A} E\big[(X_T^{\pi})^{\theta(\ell)}\big],$$
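We illustrate the results of the previous section in a toy example, namely the Black-
Scholes model, with one stock of price process
\[ dS_t = S_t \big( b\,dt + \sigma\,dW_t \big), \quad t \ge 0. \]
For a constant proportion $\pi \in \mathbb{R}$ of wealth invested in the stock, and starting w.l.o.g. with unit capital, the average growth
rate portfolio of the agent is equal to
\[ \bar{L}_T^{\pi} = \frac{L_T^{\pi}}{T} = b\pi - \frac{\sigma^2\pi^2}{2} + \sigma\pi \frac{W_T}{T}. \]
It is therefore Gaussian,
\[ \bar{L}_T^{\pi} \sim \mathcal{N}\Big( b\pi - \frac{\sigma^2\pi^2}{2}, \ \frac{\sigma^2\pi^2}{T} \Big), \]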
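and its (limiting) Log-Laplace function is equal to
\[ \Lambda(\theta, \pi) := \Big( \lim_{T\to\infty} \Big) \frac{1}{T} \ln \mathbb{E}\big[ e^{\theta T \bar{L}_T^{\pi}} \big] = \theta \Big[ b\pi - \frac{\sigma^2\pi^2}{2}(1-\theta) \Big] \]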
with
\[ \hat\pi(\theta) = \frac{b}{\sigma^2(1-\theta)}. \]
Hence, $\Lambda_+$ is differentiable on $[0, 1)$ with: $\Lambda_+'(0) = \frac{b^2}{2\sigma^2}$, and $\Lambda_+'(1) = \infty$, i.e. $\Lambda_+$ is
steep. From Theorem 3.1, the value function of the upside large deviations probability
is explicitly computed as:
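\[ v_+(\ell) := \sup_{\pi\in\mathbb{R}} \limsup_{T\to\infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar{L}_T^{\pi} \ge \ell \big] = \inf_{0\le\theta<1} \big[ \Lambda_+(\theta) - \theta\ell \big] = \begin{cases} 0, & \text{if } \ell \le \Lambda_+'(0) = \frac{b^2}{2\sigma^2} \\ -\big( \sqrt{\Lambda_+'(0)} - \sqrt{\ell} \big)^2, & \text{if } \ell > \Lambda_+'(0) \end{cases} \]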
Notice that, when $\ell \le \Lambda_+'(0)$, we have not only a nearly optimal control as stated
in Theorem 3.1, but an optimal control given by $\pi^+ = b/\sigma^2$, which is precisely the
optimal portfolio for the classical Merton problem with logarithm utility function.
Indeed, in this model, we have by the law of large numbers: $\bar{L}_T^{\pi^+} \to \frac{b^2}{2\sigma^2} = \Lambda_+'(0)$, as
$T$ goes to infinity, and so $\lim_{T\to\infty} \frac{1}{T} \ln \mathbb{P}[ \bar{L}_T^{\pi^+} \ge \ell ] = 0 = v_+(\ell)$. Otherwise, when
$\ell > \Lambda_+'(0)$, the optimal strategy depends on $\ell$, and the larger the target growth rate
level, the more one has to invest in the stock.
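The upside duality formula of this toy example can be checked numerically. The following Python sketch is an illustration added for the reader and is not part of the original computations: it minimises $\Lambda_+(\theta) - \theta\ell$ over $[0, 1)$ by ternary search (which applies because the objective is convex in $\theta$) and compares the result with the closed-form value. The function names and the parameter values $b = 0.1$, $\sigma = 0.2$ are assumptions chosen only for the example.

```python
import math


def lambda_plus(theta, b, sigma):
    """Dual value Lambda_+(theta) = b^2/(2 sigma^2) * theta/(1 - theta), 0 <= theta < 1."""
    return b * b / (2.0 * sigma * sigma) * theta / (1.0 - theta)


def v_plus_numeric(ell, b, sigma, iters=200):
    """Ternary search for the infimum over [0, 1) of Lambda_+(theta) - theta*ell.

    Returns (value, minimising theta). The objective is convex in theta.
    """
    def objective(theta):
        return lambda_plus(theta, b, sigma) - theta * ell

    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    theta_star = 0.5 * (lo + hi)
    return objective(theta_star), theta_star


def v_plus_closed(ell, b, sigma):
    """Closed form: 0 below Lambda_+'(0), -(sqrt(Lambda_+'(0)) - sqrt(ell))^2 above."""
    c = b * b / (2.0 * sigma * sigma)  # Lambda_+'(0)
    if ell <= c:
        return 0.0
    return -(math.sqrt(c) - math.sqrt(ell)) ** 2


if __name__ == "__main__":
    b, sigma = 0.1, 0.2  # hypothetical drift and volatility
    for ell in (0.05, 0.125, 0.2, 0.5):
        value, theta_star = v_plus_numeric(ell, b, sigma)
        pi_hat = b / (sigma ** 2 * (1.0 - theta_star))  # hat pi(theta) at the minimiser
        print(ell, value, v_plus_closed(ell, b, sigma), pi_hat)
```

For target levels above $\Lambda_+'(0)$ the printed proportion $\hat\pi(\theta)$ at the minimiser increases with $\ell$, in line with the remark that a larger target growth rate requires a larger investment in the stock.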
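• Downside risk probability.
The dual control problem in the downside case is then given by
\[ \Lambda_-(\theta) = \inf_{\pi\in\mathbb{R}} \Lambda(\theta, \pi) = \Lambda(\theta, \hat\pi(\theta)) = \frac{b^2}{2\sigma^2} \frac{\theta}{1-\theta}, \quad \theta \le 0, \]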
with
\[ \hat\pi(\theta) = \frac{b}{\sigma^2(1-\theta)}. \]
Hence, $\Lambda_-$ is differentiable on $\mathbb{R}_-$ with: $\Lambda_-'(-\infty) = 0$, and $\Lambda_-'(0) = \frac{b^2}{2\sigma^2}$. From
Theorem 3.2, the value function of the downside large deviations probability is
explicitly computed as:
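\[ v_-(\ell) := \inf_{\pi\in\mathbb{R}} \liminf_{T\to\infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar{L}_T^{\pi} \le \ell \big] = \inf_{\theta\le 0} \big[ \Lambda_-(\theta) - \theta\ell \big] = \begin{cases} -\infty, & \text{if } \ell < 0 \\ -\big( \sqrt{\Lambda_-'(0)} - \sqrt{\ell} \big)^2, & \text{if } 0 \le \ell \le \Lambda_-'(0) = \frac{b^2}{2\sigma^2} \end{cases} \]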
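A complementary check, added here as an illustration rather than taken from the original text, works directly with constant-proportion strategies. Since $\bar{L}_T^{\pi}$ is Gaussian with mean $m(\pi) = b\pi - \sigma^2\pi^2/2$ and variance $\sigma^2\pi^2/T$, standard Gaussian tail asymptotics give $\lim_{T\to\infty} \frac{1}{T} \ln \mathbb{P}[\bar{L}_T^{\pi} \le \ell] = -(m(\pi) - \ell)^2 / (2\sigma^2\pi^2)$ whenever $\ell < m(\pi)$. The Python sketch below maximises this decay rate over $\pi > 0$ and compares it with the closed form for $0 < \ell \le \Lambda_-'(0)$. Function names and parameter values are assumptions made for the example.

```python
import math


def decay_rate(pi, ell, b, sigma):
    """Limit of -(1/T) ln P[Lbar_T^pi <= ell] for constant pi > 0 (zero if ell >= mean)."""
    mean = b * pi - 0.5 * sigma ** 2 * pi ** 2
    gap = mean - ell
    return 0.0 if gap <= 0.0 else gap ** 2 / (2.0 * sigma ** 2 * pi ** 2)


def best_constant_pi(ell, b, sigma, iters=200):
    """Maximise g(pi) = (m(pi) - ell)/(sigma*pi) by ternary search.

    g is concave on pi > 0 when ell > 0, and the decay rate equals g^2/2 where g > 0.
    """
    def g(pi):
        return (b * pi - 0.5 * sigma ** 2 * pi ** 2 - ell) / (sigma * pi)

    lo, hi = 1e-9, 2.0 * b / sigma ** 2
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) >= g(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def log_prob_over_T(pi, ell, b, sigma, T):
    """Exact (1/T) ln P[Lbar_T^pi <= ell] computed from the Gaussian law at horizon T."""
    mean = b * pi - 0.5 * sigma ** 2 * pi ** 2
    z = (ell - mean) * math.sqrt(T) / (sigma * abs(pi))
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0))) / T


if __name__ == "__main__":
    b, sigma, ell = 0.1, 0.2, 0.05  # hypothetical values with 0 < ell <= b^2/(2 sigma^2)
    pi_star = best_constant_pi(ell, b, sigma)
    closed = (math.sqrt(b * b / (2.0 * sigma ** 2)) - math.sqrt(ell)) ** 2
    print(pi_star, decay_rate(pi_star, ell, b, sigma), closed)
    print(log_prob_over_T(pi_star, ell, b, sigma, 2000.0))
```

In this sketch the maximising proportion coincides with $\hat\pi(\theta)$ evaluated at the $\theta \le 0$ for which $\Lambda_-'(\theta) = \ell$, which is the strategy described in Remark 3.2.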
Remark 4.1 The above direct calculations rely on the fact that we restrict portfolio
π to be constant in proportion. Actually, the explicit forms of the value function and
optimal strategy remain the same if we allow a priori portfolio strategies π ∈ A to
change over time based on the available information, i.e. to be F-predictable. This
requires more advanced tools from stochastic control and PDEs to be presented in
the sequel in a more general framework.
5 Factor Model
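We consider a market model with one riskless asset price $S^0 = 1$, and $d$ stocks of
price process $S$ governed by
\[ \int_0^T |\pi_t^\top b(Y_t)|\,dt + \int_0^T |\pi_t^\top \sigma(Y_t)|^2\,dt < \infty, \quad \text{a.s. for all } T > 0. \]
\[ L_T^{\pi} = \int_0^T \Big( \pi_t^\top b(Y_t) - \frac{\pi_t^\top \sigma\sigma^\top(Y_t) \pi_t}{2} \Big) dt + \int_0^T \pi_t^\top \sigma(Y_t)\,dW_t. \]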
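For any $\theta \in \mathbb{R}$, and $\pi$, we compute the Log-Laplace function of the growth rate
portfolio:
\[ \Lambda_T(\theta, \pi) := \ln \mathbb{E}\big[ e^{\theta L_T^{\pi}} \big] = \ln \mathbb{E}\Big[ \mathcal{E}\Big( \int_0^{\cdot} \theta \pi_t^\top \sigma(Y_t)\,dW_t \Big)_T \, e^{\theta \int_0^T f(\theta, Y_t, \pi_t)\,dt} \Big], \]
where
\[ f(\theta, y, \pi) = \pi^\top b(y) - \frac{1-\theta}{2} \pi^\top \sigma\sigma^\top(y) \pi. \]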
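We now impose the admissibility condition that $\pi$ lies in $\mathcal{A}$ if the Doléans-Dade local
martingale $\big( \mathcal{E}\big( \int_0^{\cdot} \theta \pi_u^\top \sigma(Y_u)\,dW_u \big)_t \big)_{0\le t\le T}$ is a true martingale for any $T > 0$, which is
ensured, for instance, by the Novikov condition. In this case, this Doléans-Dade
exponential defines a probability measure $Q^{\pi}$ through its density with respect to $\mathbb{P}$, and the dual control problems read: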
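\[ \Lambda_+(\theta) = \sup_{\pi\in\mathcal{A}} \limsup_{T\to\infty} \frac{1}{T} \ln \mathbb{E}^{Q^{\pi}}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\,dt \Big) \Big]. \]
\[ \Lambda_-(\theta) = \inf_{\pi\in\mathcal{A}} \liminf_{T\to\infty} \frac{1}{T} \ln \mathbb{E}^{Q^{\pi}}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\,dt \Big) \Big]. \]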
These problems are known in the literature as ergodic risk-sensitive control problems,
and studied by dynamic programming methods in [1, 5, 14]. Let us now formally
derive the ergodic equations associated to these risk-sensitive control problems. We
consider the finite horizon risk-sensitive stochastic control problems:
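\[ u_+(T, y; \theta) = \sup_{\pi\in\mathcal{A}} \mathbb{E}^{Q^{\pi}}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\,dt \Big) \,\Big|\, Y_0 = y \Big], \quad \theta \ge 0 \]
\[ u_-(T, y; \theta) = \inf_{\pi\in\mathcal{A}} \mathbb{E}^{Q^{\pi}}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\,dt \Big) \,\Big|\, Y_0 = y \Big], \quad \theta \le 0, \]
\[ \frac{\partial u_\pm}{\partial T} = \sup_{\pi\in\mathbb{R}^d} \Big[ \theta f(\theta, y, \pi) u_\pm + \big( \eta(y) + \theta\gamma(y)\sigma^\top(y)\pi \big)^\top D_y u_\pm + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D_y^2 u_\pm \big) \Big], \]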
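we obtain the ergodic HJB equation for the pair $(\Lambda_\pm(\theta), \varphi_\pm(\cdot, \theta))$ as:
\[ \Lambda(\theta) = \eta(y)^\top D_y\varphi + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D_y^2\varphi \big) + \frac{1}{2} \big| \gamma^\top(y) D_y\varphi \big|^2 + \theta \sup_{\pi\in\mathbb{R}^d} \Big[ \pi^\top \big( b(y) + \sigma(y)\gamma^\top(y) D_y\varphi \big) - \frac{1-\theta}{2} \pi^\top \sigma\sigma^\top(y) \pi \Big], \]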
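which is well-defined for $\theta < 1$. In the above equation $\Lambda(\theta)$ is a candidate for $\Lambda_\pm(\theta)$
while $\varphi$ is a candidate solution for $\varphi_\pm$. This can be rewritten as a semi-linear ergodic
PDE with quadratic growth in the gradient:
\[ \Lambda(\theta) = \Big( \eta(y) + \frac{\theta}{1-\theta} \gamma\sigma^\top(\sigma\sigma^\top)^{-1} b(y) \Big) \cdot D_y\varphi + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D_y^2\varphi \big) + \frac{1}{2} D_y\varphi^\top \gamma(y) \Big( I_{d+m} + \frac{\theta}{1-\theta} \sigma^\top(\sigma\sigma^\top)^{-1}\sigma(y) \Big) \gamma^\top(y) D_y\varphi + \frac{\theta}{2(1-\theta)} b^\top(\sigma\sigma^\top)^{-1} b(y), \tag{5.1} \]
with the supremum over $\pi$ attained at the candidate feedback control
\[ \hat\pi(y; \theta) = \frac{1}{1-\theta} (\sigma\sigma^\top)^{-1}(y) \big( b(y) + \sigma\gamma^\top(y) D_y\varphi(y; \theta) \big). \tag{5.2} \]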
According to [1] (see also [15, 20]), the next result states the existence of a smooth
solution to the ergodic equation.
Proposition 5.1 Under (H1)–(H4), there exists for any $\theta < 1$, a solution
$(\Lambda(\theta), \varphi(\cdot; \theta))$ with $\varphi(\cdot; \theta)$ $C^2$, to the ergodic HJB equation s.t.:
\[ \varphi(y; \theta) \longrightarrow \infty, \quad \text{as } |y| \to \infty, \]
and
\[ |D_y\varphi(y; \theta)| \le C_\theta (1 + |y|). \]
We now relate a solution to the ergodic equation to the dual risk-sensitive control
problem. In other words, this means the convergence of the finite horizon risk-sensitive
stochastic control to the component $\Lambda$ of the ergodic equation. We distinguish
the downside and upside cases.
• Downside risk: In this case, it is shown in [15] that for all $\theta < 0$, the solution
$(\Lambda(\theta), \varphi(\cdot; \theta))$ to (5.1), with $\varphi(\cdot; \theta)$ $C^2$ and upper bounded, is unique (up to an additive
constant for $\varphi(\cdot; \theta)$), and we have:
\[ \Lambda(\theta) = \Lambda_-(\theta). \]
Moreover, there is an admissible optimal feedback control $\hat\pi(\cdot, \theta)$ for $\Lambda_-(\theta)$ given
by (5.2), and for which the factor process $Y$ is ergodic under $Q^{\hat\pi}$. It is also proved
in [15] that $\Lambda = \Lambda_-$ is differentiable on $(-\infty, 0)$. Therefore, from Theorem 3.2, the
solution to the downside risk large deviations probability is given by:
\[ v_-(\ell) = \inf_{\theta\le 0} \big[ \Lambda(\theta) - \theta\ell \big], \quad \ell < \Lambda'(0), \]
\[ \Lambda(\theta) = \Lambda_+(\theta), \]
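\[ \frac{1}{2} C(\theta)^\top \gamma \Big( I_{d+m} + \frac{\theta}{1-\theta} \sigma^\top(\sigma\sigma^\top)^{-1}\sigma \Big) \gamma^\top C(\theta) + \Big( K + \frac{\theta}{1-\theta} \gamma\sigma^\top(\sigma\sigma^\top)^{-1} B_1 \Big)^\top C(\theta) + \frac{1}{2} \frac{\theta}{1-\theta} B_1^\top (\sigma\sigma^\top)^{-1} B_1 = 0, \tag{5.3} \]
\[ \Lambda(\theta) = \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top C(\theta) \big) + \frac{1}{2} D(\theta)^\top \gamma \Big( I_{d+m} + \frac{\theta}{1-\theta} \sigma^\top(\sigma\sigma^\top)^{-1}\sigma \Big) \gamma^\top D(\theta) + \frac{\theta}{1-\theta} B_0^\top (\sigma\sigma^\top)^{-1} \sigma\gamma^\top D(\theta) + \frac{1}{2} \frac{\theta}{1-\theta} B_0^\top (\sigma\sigma^\top)^{-1} B_0, \]
\[ \hat\pi(y; \theta) = \frac{1}{1-\theta} (\sigma\sigma^\top)^{-1} \Big[ \big( B_1 + \sigma\gamma^\top C(\theta) \big) y + B_0 + \sigma\gamma^\top D(\theta) \Big]. \]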
In [6], it is shown that there exists some positive $\bar\theta$ small enough, s.t. for $\theta < \bar\theta$, there
exists a solution $C(\theta)$ to the Riccati equation (5.3) s.t. $Y$ is ergodic under $Q^{\hat\pi}$, and
so by verification theorem, $\Lambda(\theta) = \Lambda_\pm(\theta)$. In the one-dimensional asset and factor
model, as studied in [17], we obtain more precise results. Indeed, in this case: $d =
m = 1$, the Riccati equation is a second-order polynomial equation in $C(\theta)$, which
admits two explicit roots given by:
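\[ C_\pm(\theta) = -K \, \frac{ 1 - \theta\Big( 1 - \rho \frac{|\gamma| B_1}{K|\sigma|} \Big) \pm \sqrt{(1-\theta)(1-\theta\beta)} }{ |\gamma|^2 \big( 1 - \theta(1-\rho^2) \big) }, \]
\[ \bar\theta = \frac{1}{\beta} \wedge 1, \quad \beta = 1 - \rho^2 + \Big( \rho - \frac{|\gamma| B_1}{K|\sigma|} \Big)^2 > 0, \]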
where $|\gamma|$ (resp. $|\sigma|$) is the Euclidean norm of $\gamma$ (resp. $\sigma$), and $\rho \in [-1, 1]$ is the
correlation between $S$ and $Y$, i.e. $\rho = \frac{\gamma\sigma^\top}{|\gamma||\sigma|}$. Actually, only the solution $C(\theta) =
C_-(\theta)$ is relevant in the sense that for this root, $Y$ is ergodic under $Q^{\hat\pi}$, and thus by
verification theorem:
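\[ \Lambda_\pm(\theta) = \Lambda(\theta) = \frac{1}{2} |\gamma|^2 C_-(\theta) + \frac{1}{2} |\gamma|^2 D(\theta)^2 \Big( 1 + \frac{\theta}{1-\theta} \rho^2 \Big) + \frac{\theta}{1-\theta} \frac{B_0}{|\sigma|} \rho|\gamma| D(\theta) + \frac{1}{2} \frac{\theta}{1-\theta} \frac{B_0^2}{|\sigma|^2}, \quad \theta < \bar\theta, \]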
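where
\[ D(\theta) = -\frac{B_0}{K|\sigma|} \, \frac{ \theta \Big( \rho|\gamma| C_-(\theta) + \frac{B_1}{|\sigma|} \Big) }{ \sqrt{(1-\theta)(1-\theta\beta)} }, \]
\[ \hat\pi(y; \theta) = \frac{1}{(1-\theta)|\sigma|} \Big[ \Big( \frac{B_1}{|\sigma|} + \rho|\gamma| C_-(\theta) \Big) y + \frac{B_0}{|\sigma|} + \rho|\gamma| D(\theta) \Big]. \]
\[ \lim_{\theta\uparrow\bar\theta} \Lambda'(\theta) = \infty. \]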
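From Theorems 3.1 and 3.2, the solutions to the upside chance and downside risk
large deviations probability are given by:
\[ v_+(\ell) = \inf_{0\le\theta<\bar\theta} \big[ \Lambda(\theta) - \theta\ell \big], \quad \ell \in \mathbb{R}, \]
\[ v_-(\ell) = \inf_{\theta\le 0} \big[ \Lambda(\theta) - \theta\ell \big], \quad \ell < \Lambda'(0), \]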
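As a sanity check on the explicit roots $C_\pm(\theta)$ of the one-dimensional case, the following Python sketch (an illustration added here, not part of the original text) evaluates them and verifies that they solve the scalar form of the Riccati equation (5.3). The scalar reduction used in `riccati_residual`, namely $\frac{1}{2}|\gamma|^2\big(1 + \frac{\theta}{1-\theta}\rho^2\big)C^2 + \big(K + \frac{\theta}{1-\theta}\rho|\gamma|B_1/|\sigma|\big)C + \frac{1}{2}\frac{\theta}{1-\theta}B_1^2/|\sigma|^2 = 0$, is obtained here by specialising (5.3) to $d = m = 1$ and should be read as an assumption of the sketch. All parameter values are hypothetical.

```python
import math


def beta_coeff(K, B1, gamma, sigma, rho):
    """beta = 1 - rho^2 + (rho - |gamma| B1 / (K |sigma|))^2, with gamma, sigma the norms."""
    return 1.0 - rho ** 2 + (rho - gamma * B1 / (K * sigma)) ** 2


def theta_bar(K, B1, gamma, sigma, rho):
    """Upper bound min(1/beta, 1) of the admissible range of theta."""
    return min(1.0 / beta_coeff(K, B1, gamma, sigma, rho), 1.0)


def riccati_roots(theta, K, B1, gamma, sigma, rho):
    """Return (C_minus, C_plus) from the explicit formula, for theta < theta_bar."""
    g = gamma * B1 / (K * sigma)
    beta = beta_coeff(K, B1, gamma, sigma, rho)
    disc = (1.0 - theta) * (1.0 - theta * beta)
    if disc < 0.0:
        raise ValueError("theta is outside the range where the roots are real")
    root = math.sqrt(disc)
    base = 1.0 - theta * (1.0 - rho * g)
    denom = gamma ** 2 * (1.0 - theta * (1.0 - rho ** 2))
    return -K * (base - root) / denom, -K * (base + root) / denom


def riccati_residual(C, theta, K, B1, gamma, sigma, rho):
    """Left-hand side of the scalar Riccati equation (assumed reduction of (5.3))."""
    r = theta / (1.0 - theta)
    return (0.5 * gamma ** 2 * (1.0 + r * rho ** 2) * C ** 2
            + (K + r * rho * gamma * B1 / sigma) * C
            + 0.5 * r * B1 ** 2 / sigma ** 2)


if __name__ == "__main__":
    params = dict(K=-1.0, B1=0.5, gamma=0.3, sigma=0.2, rho=-0.5)  # hypothetical
    print("theta_bar =", theta_bar(**params))
    for theta in (-1.0, 0.3):
        for C in riccati_roots(theta, **params):
            print(theta, C, riccati_residual(C, theta, **params))
```

Both roots make the residual vanish up to rounding error; which of them makes $Y$ ergodic under $Q^{\hat\pi}$ is a separate question that this check does not address.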
5.2 Examples
\[ \Lambda_\pm(\theta) = \Lambda(\theta) = \frac{1}{2} \frac{\theta}{1-\theta} \frac{B_0^2}{|\sigma|^2}, \quad \forall\theta < 1. \]
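\[ C_-(\theta) = \frac{|K|}{|\sigma|^2} \big( 1 - \sqrt{1-\theta} \big), \quad D(\theta) = -\frac{1}{2}\theta, \]
and so
\[ \Lambda(\theta) = \frac{|K|}{2} \big( 1 - \sqrt{1-\theta} \big) + \theta \frac{|\sigma|^2}{8}, \quad \theta < 1, \]
\[ \Lambda'(0) = \bar\ell := \frac{|K|}{4} + \frac{|\sigma|^2}{8}, \quad \Lambda'(-\infty) = \underline{\ell} := \frac{|\sigma|^2}{8}, \]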
\[ \theta(\ell) = 1 - \Big( \frac{\bar\ell - \underline{\ell}}{\ell - \underline{\ell}} \Big)^2, \quad \forall \ell > \underline{\ell}. \]
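The solution to the upside chance large deviations probability is then given by:
\[ v_+(\ell) = \begin{cases} -\dfrac{(\ell - \bar\ell)^2}{\ell - \bar\ell + \frac{|K|}{4}}, & \text{if } \ell > \bar\ell \\ 0, & \text{if } \ell \le \bar\ell. \end{cases} \]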
K − 4( − ) ¯ 1
πt+, = Yt + , if > ¯
|σ| 2 2
+(n) K − 1/n 1 ¯
πt = Yt + , if ≤ .
|σ| 2 2
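The closed-form expressions of this example can be cross-checked numerically. The Python sketch below is an added illustration under stated assumptions: it takes $\Lambda(\theta) = \frac{|K|}{2}(1 - \sqrt{1-\theta}) + \theta|\sigma|^2/8$ from above, minimises $\Lambda(\theta) - \theta\ell$ over $[0, 1]$ by ternary search (the objective is convex), and compares the minimiser and the minimum with $\theta(\ell)$ and $v_+(\ell)$. The function names and the values $|K| = 1$, $|\sigma| = 0.2$ are hypothetical.

```python
import math


def dual_value(theta, K_abs, sigma_abs):
    """Lambda(theta) = |K|/2 * (1 - sqrt(1 - theta)) + theta * |sigma|^2 / 8, theta <= 1."""
    return 0.5 * K_abs * (1.0 - math.sqrt(1.0 - theta)) + theta * sigma_abs ** 2 / 8.0


def levels(K_abs, sigma_abs):
    """Return (ell_low, ell_bar) = (Lambda'(-inf), Lambda'(0))."""
    ell_low = sigma_abs ** 2 / 8.0
    return ell_low, K_abs / 4.0 + ell_low


def theta_of_ell(ell, K_abs, sigma_abs):
    """Solution of Lambda'(theta) = ell, valid for ell > ell_low."""
    ell_low, ell_bar = levels(K_abs, sigma_abs)
    return 1.0 - ((ell_bar - ell_low) / (ell - ell_low)) ** 2


def v_plus_closed(ell, K_abs, sigma_abs):
    """Closed-form upside value function of the example."""
    ell_low, ell_bar = levels(K_abs, sigma_abs)
    if ell <= ell_bar:
        return 0.0
    return -(ell - ell_bar) ** 2 / (ell - ell_low)


def v_plus_numeric(ell, K_abs, sigma_abs, iters=200):
    """Ternary search for the infimum over [0, 1] of Lambda(theta) - theta*ell."""
    def objective(theta):
        return dual_value(theta, K_abs, sigma_abs) - theta * ell

    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) <= objective(m2):
            hi = m2
        else:
            lo = m1
    theta_star = 0.5 * (lo + hi)
    return objective(theta_star), theta_star


if __name__ == "__main__":
    K_abs, sigma_abs = 1.0, 0.2  # hypothetical parameter values
    for ell in (0.1, 0.4, 1.0):
        value, theta_star = v_plus_numeric(ell, K_abs, sigma_abs)
        print(ell, value, v_plus_closed(ell, K_abs, sigma_abs), theta_star)
```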
The solution to the downside risk large deviations probability is given by:
¯
− (−) ¯
2
4( − ) 1
πt−, = − Yt + , if < ≤ ¯
|σ|2 2
References
1. Bensoussan, A., Frehse, J.: On Bellman equations of ergodic control in R N . J. Reine Angew.
Math. 429, 125–160 (1992)
2. Bielecki, T.R., Pliska, S.R.: Risk-sensitive dynamic asset management. Appl. Math. Optim.
39, 337–360 (1999)
3. Davis, M., Lleo, S.: Risk-sensitive benchmarked asset management. Quant. Financ. 8, 415–426
(2008)
4. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer,
New York (1998)
5. Fleming, W., McEneaney, W.: Risk sensitive control on an infinite horizon. SIAM J. Control
Optim. 33, 1881–1915 (1995)
6. Fleming, W., Sheu, S.J.: Risk sensitive control and an optimal investment model. Math. Financ.
10, 197–213 (2000)
7. Guasoni, P., Robertson, S.: Portfolios and risk premia for the long run. Ann. Appl. Probab. 22,
239–284 (2012)
8. Hata, H., Nagai, H., Sheu, S.J.: Asymptotics of the probability minimizing a down-side risk.
Ann. Appl. Probab. 20, 52–89 (2010)
9. Hata, H., Sekine, J.: Solving long term investment problems with Cox-Ingersoll-Ross interest
rates. Adv. Math. Econ. 8, 231–255 (2005)
10. Karatzas, I., Shreve, S.: Methods of Mathematical Finance. Springer, New York (1998)
11. Korn, R.: Optimal Portfolios: Stochastic Models for Optimal Investment and Risk Management
in Continuous-time. World Scientific, Singapore (1997)
12. Liu, R., Muhle-Karbe, J.: Portfolio choice with stochastic investment opportunities: a user’s
guide, Preprint (2013)
13. Merton, R.: Optimum consumption and portfolio rules in a continuous-time model. J. Econ.
Theory 3, 373–413 (1971)
14. Nagai, H.: Bellman equations of risk sensitive control. SIAM J. Control Optim. 34, 74–101
(1996)
15. Nagai, H.: Downside risk minimization via a large deviation approach. Ann. Appl. Probab. 22,
608–669 (2012)
16. Nagai, H., Peng, S.: Risk-sensitive dynamic portfolio optimization with partial information on
infinite time horizon. Ann. Appl. Probab 12, 173–195 (2002)
17. Pham, H.: A large deviations approach to optimal long term investment. Financ. Stoch. 7,
169–195 (2003)
18. Pham, H.: Some applications and methods of large deviations in finance and insurance. Paris-
Princeton Lectures on Mathematical Finance, Lecture Notes in Mathematics, vol. 1919 (2007)
19. Pham, H.: Continuous Time Stochastic Control and Optimization with Financial Applications.
SMAP. Springer, New York (2009)
20. Robertson, S., Xing, H.: Large time behavior of solutions to semi-linear equations with
quadratic growth in the gradient. SIAM J. Control Optim. 53(1), 185–212 (2015)
21. Stettner, L.: Duality and risk-sensitive portfolio optimization. In: Yin, G., Zhang, Q. (eds.)
Mathematics of Finance, Contemporary Mathematics, vol. 351, pp. 333–347 (2004)
22. Stutzer, M.: Portfolio choice with endogenous utility: a large deviations approach. J. Econom.
116, 365–386 (2003)
Systemic Risk and Default Clustering
for Large Financial Systems
Konstantinos Spiliopoulos
1 Introduction
The past several years have made clear the need to better understand the behaviour of
large interconnected financial systems. Almost all areas of modern life are touched
by a financial crisis. The recent financial crisis of 2007–2009 brought into focus the
networked structure of the financial world. It challenged the mathematical finance
community to understand connectedness in financial systems. The understanding of
systemic risk, i.e., the risk that a large number of components of an interconnected
financial system fail within a short time, leading to the failure of the system itself,
becomes an important issue to investigate.
Interconnections often make a system robust, but they can also act as conduits
for risk. Even things that may seemingly be unrelated may become related, as risk
restrictions may, for example, force a sale of one type of a well-performing asset to
compensate for the poor behavior of another asset. Thus, appropriate mathematical
models need to be developed, in order to help in the understanding of how risk can
propagate between financial objects.

K. Spiliopoulos
Department of Mathematics & Statistics, Boston University, Boston, MA 02215, USA
It is possible that initial shocks could trigger contagion effects (e.g., [1]). Examples
of such shocks include changes in interest rates, in currency values or in
commodity prices, or a reduction in global economic growth. Then, there may be a
transmission mechanism which causes other institutions in the system to be affected
by the initial shock. An example of such a mechanism is financial linkages among
economies. Another reason could simply be investor irrationality. In either case,
systemic risk causes the perceived risk-return trade-off in the economy to change.
Uncertainty becomes an issue and market participants fear subsequent losses in
asset prices with a large dispersion in regards to the magnitude of the crisis.
Reduced-form point process models of correlated default are often used (a) to assess
portfolio credit risk and (b) to value securities exposed to correlated default risk.
The workhorses of these models are counting processes. In this work we focus on
using dynamic portfolio credit risk models to study large portfolio asymptotics and
default clustering.
Large portfolio asymptotics were first studied in [2]. The model in [2] is a static
model of a homogeneous pool in which firms default independently of one another
conditional on a normally distributed random variable representing a systematic risk
factor. Alternative distributions of the systematic factor were examined in [3, 4] and
the case of heterogeneous portfolios was studied in [5]. In [6], the authors extend the
model of [2] dynamically and the systematic risk factor follows a Brownian motion.
In [6], the authors study a structural model for the distance to default process in a pool
of names. A firm defaults when the default process hits zero. Exploiting conditional
independence of defaults, [7, 8] have studied the tail of the loss distribution in the
static case. Large deviations arguments were also used in [9] to study stochastic
recovery effects on large static pools of credit assets.
Reduced-form models of correlated default timing have appeared in the finance
literature under different forms. Giesecke and Weber [10] take the intensity of a
name as a function of the state of the names in a specified neighborhood of that
name. The authors in [11, 12] take the intensity to be a function of the portfolio
loss and each name can be either in a good or in a distressed financial state. These
papers prove law of large numbers for the portfolio loss distribution and develop
Gaussian approximations to the portfolio loss distribution based on central limit
theorems. Cvitanić et al. [13] consider the typical behavior of a mean field system
with permanent default impact.
Sircar and Zariphopoulou [14] study large portfolio asymptotics for utility indif-
ference valuation of securities exposed to the losses in the pool. In [15], the authors
study systematic risk via a mean field model of interacting agents. Using a model
of a two well potential, agents can move freely from a healthy state to a failed state.
The authors study probabilities of transition from the healthy to the failed state using
large deviations ideas. In [16] the authors propose and study a model for inter-bank
lending and study its stochastic stability.
tools developed allow one to reach finance-related conclusions about the behavior of
such large financial systems.
Although the primary interest of this work is risk in financial systems, models of
the type discussed in this paper are generic enough to allow for modifications that
make them relevant in other domains, including systems reliability, insurance and
epidemiology. In reliability, a large system of interacting components might have
a central connection, and be influenced by an external environment (temperature,
for example). The failure of an individual component (which could be governed by
an intensity model appropriate for the particular application) increases the stress
on the central connection and thus the other components, making the entire system
more likely to fail. In insurance, the system could represent a pool of insurance
policies. The effect of wildfires might, in that example, be modelled by a contagion
term. Systematic risk in the form of environmental conditions has an impact on the
whole pool.
The rest of the article is structured as follows. In Sect. 2 we describe the correlated
default timing model proposed in [20]. Section 3 studies the typical behavior of the loss
distribution in such portfolios as the number of names (agents) in the pool grows to
infinity. Section 4 focuses on developing the Gaussian correction theory. As we shall
see there, Gaussian corrections are very useful because they make the approximations
accurate even for portfolios of relatively small sizes. In Sect. 5, we study the tail of
the loss distribution using arguments from the large deviations theory. We also study
the most likely path to systemic failure and to the development of default clusters.
An understanding of the preferred paths to large default rates and the most likely
path to the creation of default clusters can give useful insights into how to optimally
safeguard against such events. Importance sampling techniques can then be used to
construct asymptotically efficient estimators for tail event probabilities, see Sect. 6.
Conclusions are in Sect. 7. A large part of the material presented in this work, but
not all, is related to recent work of the author described in [20, 24–26].
\[ L_t^N = \frac{1}{N} \sum_{n=1}^{N} 1_{\{\tau^n \le t\}}, \tag{3} \]
The process $X_t$ represents the systematic risk, which can be modeled to be the
solution to some SDE
\[ dX_t = b_0(X_t)\,dt + \sigma_0(X_t)\,dV_t, \quad X_0 = x_\circ. \tag{5} \]
The system (2)–(5) can naturally be understood as an interacting particle system. This
suggests how to understand its large-scale behavior. The structure of the feedback (the
empirical average L N ) is of mean-field type (roughly within the class of McKean-
Vlasov models; see [31, 39]). An understanding of “typical” behavior of a system
as N → ∞ is fundamental in identifying “atypical” or “rare” events.
To formulate the law of large numbers result, we define the empirical distribution
of the $p^n$'s corresponding to the names that have survived up to time $t$, as follows:
\[ \mu_t^N = \frac{1}{N} \sum_{n=1}^{N} \delta_{p_t^{N,n}} 1_{\{\tau^n > t\}}. \]
This captures the entire dynamics of the model (including the effect of the
heterogeneities). We can directly calculate the failure rate from the $\mu^N$'s:
\[ L_t^N = 1 - \mu_t^N(\mathcal{P}). \]
Let us then identify the limit of μtN (P) as N → ∞. This is a law of large numbers
(LLN) result and it identifies the baseline “typical” behavior of the system. For
f ∈ C 2 (P), let
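\[ \begin{aligned} (\mathcal{L}_1 f)(p) &= \frac{1}{2} \sigma^2 \lambda \frac{\partial^2 f}{\partial\lambda^2}(p) - \alpha(\lambda - \bar\lambda) \frac{\partial f}{\partial\lambda}(p) - \lambda f(p) \\ (\mathcal{L}_2 f)(p) &= \beta^C \frac{\partial f}{\partial\lambda}(p) \\ (\mathcal{L}_3^x f)(p) &= \varepsilon\beta^S \lambda b_0(x) \frac{\partial f}{\partial\lambda}(p) + \frac{\varepsilon^2}{2} (\beta^S)^2 \lambda^2 \sigma_0^2(x) \frac{\partial^2 f}{\partial\lambda^2}(p) \\ (\mathcal{L}_4^x f)(p) &= \varepsilon\beta^S \lambda \sigma_0(x) \frac{\partial f}{\partial\lambda}(p) \quad \text{and} \quad \mathcal{Q}(p) = \lambda \end{aligned} \tag{8} \]
for all $f \in C^2(\mathcal{P})$, the limit $\bar\mu$ satisfies the stochastic evolution equation
\[ d\langle f, \bar\mu_t \rangle = \Big\{ \langle \mathcal{L}_1 f, \bar\mu_t \rangle + \langle \mathcal{Q}, \bar\mu_t \rangle \langle \mathcal{L}_2 f, \bar\mu_t \rangle + \langle \mathcal{L}_3^{X_t} f, \bar\mu_t \rangle \Big\} dt + \langle \mathcal{L}_4^{X_t} f, \bar\mu_t \rangle\, dV_t \quad \text{a.s.} \tag{9} \]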
where ∗ denotes adjoint in the appropriate sense (for notational simplicity, we have
written (10) to include the types as one of the coordinates; in a heterogeneous col-
lection in practice we would often use only λ in solving (10)). We recall the rigorous
statement in Theorem 3.1.
The SIPDE (10) gives us a “large system approximation” of the failure rate:
\[ L_t^N \approx 1 - \bar\mu_t(\mathcal{P}) = 1 - \int_{\mathcal{P}} \upsilon(t, p)\,dp. \tag{11} \]
The computation of the first-order approximation (11) suggested by the LLN requires
solving the SIPDE (10) governing the density of the limiting measure. In [24] a
numerical method for this purpose is proposed. The method is based on an infinite
system of SDE’s for certain moments of the limiting measure. These SDEs are
driven by the systematic risk process X and a truncated system can be solved using
a discretization or random ODE scheme. The solution to the SDE system leads to
the solution to the SIPDE via an inverse moment problem.
The approximation (11) has significant computational advantages over a naive
Monte Carlo simulation of the high-dimensional original stochastic system (2)–(5)
and its accuracy is demonstrated in the left of Fig. 1 for a specific choice of parameters.
It also provides information about catastrophic failure.
The tail represents extreme default scenarios, and these are at the center of risk
measurement and management applications in practice. The analysis of the limiting
distribution generates important insights into the behavior of the tails as a function of
the characteristics of the system (2)–(5). For example, we see that the tail is heavily
influenced by the sensitivity of a name to the variations of the systematic risk X . The
bigger the sensitivity the fatter the tail, and the larger the likelihood of large losses
in the system (see the right of Fig. 1). Insights of this type can help understand the
role of contagion and systematic risk, and how they interact to produce atypically
large failure rates. This, in turn, leads to ways to minimize or “manage” catastrophic
failures.

Fig. 1 On the left: comparison of distributions of failure rate $L_t^N$ for different $N$ ($N$ = 250, 1000, 5000, 10000, and the asymptotic limit) at $t = 1$; horizontal axis: portfolio loss. Parameter
choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) = (0.9, 4, 0.2, 0.2, 4, 8)$. On the right: comparison of distribution of
limiting failure rate $1 - \bar\mu_t(\mathcal{P})$ for different values of the systematic risk sensitivity $\beta^S$ ($\beta^S$ = 1, 2, 3, 4) at $t = 1$; horizontal axis: limiting portfolio loss.
Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C) = (0.9, 4, 0.2, 0.2, 2)$
Let us next present the statement of the mathematical result. We denote by E the
collection of sub-probability measures (i.e., defective probability measures) on P;
i.e., E consists of those Borel measures ν on P such that ν(P) ≤ 1.
Theorem 3.1 (Theorem 3.1 in [24]) We have that $\mu_\cdot^N$ converges in distribution to
$\bar\mu_\cdot$ in $D_E[0, T]$. The evolution of $\bar\mu_\cdot$ is given by the measure evolution equation
Then
\[ \bar\mu_t = \upsilon(t, p)\,dp. \]
We close this section, by briefly describing the method of moments that leads to
the numerical computation of the loss from default. We focus our discussion on the
homogeneous case and we refer the reader to [24] for the general case.
Firstly, we remark that the SPDE (12) can be supplied with appropriate boundary
conditions, which, as mentioned in [24], are
\[ \upsilon(t, \lambda = 0) = \upsilon(t, \lambda = \infty) = 0. \]
Secondly, it turns out that for $k \in \mathbb{N}$, the moments $u_k(t) = \int_0^\infty \lambda^k \upsilon(t, \lambda)\,d\lambda$ exist
almost surely. By (11) it is clear that we want to compute $u_0(t)$. In particular, note that
the limiting loss is
\[ L_t = 1 - u_0(t). \]
The asymptotics of (10) give via (11) the limiting behavior of the system as the
number of components becomes large. Starting with that result, the results in [25]
develop Gaussian fluctuation theory analogous to the central limit theory (see for
example [11, 12, 32, 40] for some related literature). This result provides the leading
order asymptotic correction to the law of large numbers approximation developed
in Sect. 3. In practical terms, the usefulness of such a result is twofold: (a) the
approximation is accurate even for portfolios of moderate size, see [25], and (b)
one can make use of the approximation to develop tractable statistical inference
procedures for the statistical calibration of such models, see [27].
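To be more precise, let us define the signed measure
\[ \Xi_t^N = \sqrt{N} \big( \mu_t^N - \bar\mu_t \big); \]
\[ \mu_t^N = \frac{1}{\sqrt{N}} \Xi_t^N + \bar\mu_t \overset{d}{\approx} \frac{1}{\sqrt{N}} \bar\Xi_t + \bar\mu_t, \]
which then implies the following second-order approximation for the portfolio loss.
\[ L_t^N \overset{d}{\approx} L_t - \frac{1}{\sqrt{N}} \bar\Xi_t(\mathcal{P}). \tag{14} \]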
Fig. 2 On the left: comparison of approximate and actual loss distributions of failure rate $L_t^N$
for different $N$ at $t = 0.5$ (curves: actual loss distribution, first-order approximation, second-order approximation; horizontal axis: loss). Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) = (0.9, 4, 0.2, 0.2, 1, 1)$. On
the right: comparison of approximate and actual VaR (curves: actual VaR, first-order approximation VaR, second-order approximation VaR, at the 95 % and 99 % levels; axes: VaR against time; panel labels $N$ = 150, 250, 1000). Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) =
(0.9, 4, 0.2, 0.2, 1, 1)$. In both cases, $X$ is an OU process with reversion speed 2, volatility 1, initial
value 1 and mean 1
It is clear that if βnS = 0 for all n, then the limiting distribution-valued martingale
M̄ is centered Gaussian with covariance operator given by the (now deterministic)
term within the expectation in (16).
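The main idea for the derivation of (15) comes from the proof of the convergence
to the solution of (9). Define
\[ (\mathcal{L}_1^\circ f)(p) = \frac{1}{2} \sigma^2 \lambda \frac{\partial^2 f}{\partial\lambda^2}(p) - \alpha(\lambda - \bar\lambda) \frac{\partial f}{\partial\lambda}(p) \]
for $p = (\lambda, \alpha, \bar\lambda, \sigma, \beta^C, \beta^S)$. Let us also assume for the moment that $\beta_n^S = 0$ for every
$n \in \mathbb{N}$, i.e., let us neglect exposure to the exogenous
risk $X$ and focus on contagion.
Then we can write the evolution of $\langle f, \mu_t^N \rangle$ as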
1 ◦ 1
N N
d f, μtN = L f (ptN )1{t<τn } dt − f (pnt )λnt 1{t<τ n } dt
N N
n=1 n=1
N
1
N
βnC
+ f p +
n
e1 − f (pnt ) 1{τn <t} λm
t 1{τm ≤t} dt + d Mt
N N
n=1 m=1
where M is a martingale which may change from line to line. This leads to (9), when
βnS = 0 for every n ∈ N, see [20].
To get the Gaussian correction, we see that
Once we have identified what is typical, we can study the structure of atypically
large failure rates. Large deviations outlines a circle of ideas and calculations for
understanding the origination and transformation of rare events (see [42, 43]). Large
deviation arguments allow us to identify the “dominant” way that rare events will
occur in complex systems. This is the feature that is being exploited in [26], i.e., how
different sources of stochasticity can lead to system collapse.
By the discussion in Sect. 3, we have that the pool has a default rate $L_T =
1 - \bar\mu_T(\mathcal{P})$ at time $T$. Let us fix $\ell > L_T$. Then $\lim_{N\to\infty} \mathbb{P}\{L_T^N \ge \ell\} = 0$; it is a rare
event that the default rate in the pool exceeds $\ell$. We want to understand as much as
possible about $\{L_T^N \ge \ell\}$.
Using the theory of large deviations, we can understand both how rare this event
is, and what the “most likely” way is for this rare event to occur. Events far from
equilibrium crucially depend on how rare events propagate through the system. Large
deviations gives rigorous ways to understand these effects, and we want to use this
machinery to understand the structure of atypically large default clusters in the port-
folio. A reference for large deviations is [44].
If we have that
where
\[ I'(\ell) = \inf\{ I(\varphi) : \varphi(T) = \ell \} \tag{18} \]
(in other words, $I'$ is the large deviations rate function for $L_T^N$). This gives us the rate
at which the tail of the default rate $L_T^N$ decays as the diversification parameter grows.
More importantly, though, the variational problem (18) gives us the preferred way
in which atypically large default rates occur. Namely, if there is a $\varphi^* : [0, T] \to [0, 1]$
such that
\[ I'(\ell) = I(\varphi^*) \]
then for any $\delta > 0$, the Gibbs conditioning principle suggests that
Insights into large deviations of (2)–(5) have been developed in [26] when $\varepsilon \downarrow 0$
and when $\varepsilon = O(1)$ as $N \uparrow \infty$. We note here that in the case $\varepsilon = O(1)$, the large
deviations principle is conditional on the systematic risk $X$. Such results allow us to
study the comparative effect of the systematic risk process X and of the contagion
feedback on the tails of the loss distribution.
Before presenting the result, let us first investigate numerically a test case, which
is indicative of the kind of results that large deviations theory can give us. Apart
from approximating the tail of the distribution, large deviations can give quantitative
insights into the most likely path to failure of a system.
For presentation purposes and for the rest of this section, we assume that ε =
ε N ↓ 0 as N ↑ ∞. Consider a heterogeneous test portfolio composed initially of
N = 200 names. Let us assume that we can separate the names in the portfolio
into three types: Type A is 16.67 % of the names, Type B is 33.33 % of the names
and Type C is 50 % of the names. For presentation purposes, we assume that all
parameters but the contagion parameter are the same among the different types. In
particular, we have the following choice of parameters.
It is instructive to compare the different cases, based on whether there are contagion
effects in the default intensities or not. In particular, we compare two different
cases, (a) Systematic risk only: $\beta^S \ne 0$, $\beta^C = 0$, and (b) Systematic risk and contagion:
$\beta^S \ne 0$, $\beta^C \ne 0$. In each case, the time horizon is $T = 1$.
Using the methods of Sect. 3, one can compute the typical loss in such a pool
at time $T = 1$. If contagion effects are not present, i.e., if $\beta_A^C = \beta_B^C = \beta_C^C = 0$,
then the typical loss in such a portfolio at time $T = 1$ is $L_T$ = 42.5 %. If on the
other hand, contagion (feedback) effects are present and the $\beta^C$ parameters take the
values of Table 1, then the typical loss in such a portfolio at time $T = 1$ has been
increased to $L_T$ = 72.1 %. In Fig. 3, we plot the large deviations rate functions for
each of the two different cases. As we saw in the beginning of this section, the rate
function governs the asymptotics of the tail of the loss distribution. Notice that in
every case, the rate function is convex and it becomes zero at the corresponding law
of large numbers.
Moreover, since the contagion parameter of Type A is higher than the contagion
parameter for Type B or C, one expects that names of Type A will be more prone
to the contagious impact of defaults. Indeed, after computing the rate function and
the associated extremals, as defined by large deviations theory, one gets the most
likely paths to failure as seen in Figs. 4 and 5. The ϕ(t) trajectories correspond to the
contagion extremals for each of the three types, whereas the ψ(t) corresponds to the
systematic risk extremal.
One can make two conclusions out of Figs. 4 and 5. The first conclusion is related
to the ϕ extremals (Fig. 4). We notice that at any given time t, the extremal for Type
Table 1 Parameter values for a test portfolio composed of three types of assets
α λ̄ σ λ0 γ βS βC
Type A 0.5 2 0.5 0.2 1 1 10
Type B 0.5 2 0.5 0.2 1 1 3
Type C 0.5 2 0.5 0.2 1 1 1
We take ε N = √1
N
544 K. Spiliopoulos
0.6
Rate functions
0.4
0.2
0.0
Fig. 3 Rate function governing the log-asymtptotics of the tail of the loss distribution
Type A
0.8
Type B
Type C
0.6
phi extremal
0.4
0.2
0.0
Fig. 4 Optimal ϕ(t) trajectories for the three different types in the pool for t ∈ [0, 1] and = 0.81
A is bigger than the extremal for Type B, which in turn is bigger than the extremal of
Type C. This implies that unlikely large losses for components of Type A are more
likely than unlikely large losses for components of Type B, which are more likely
than large losses for components of Type C. Thus, components of Type A affect
the pool more than components of Type B, which in turn affect the pool more than
components of Type C even though Type A composes 16.67 % of the pool, whereas
Type B, composes 33.33 % of the pool and Type C composes 50 % of the pool. The
second conclusion is related to the ψ extremals (Fig. 5). We notice that the effect
of the systematic risk is most profound in the beginning but then its significance
decreases.
Fig. 5 Comparing optimal ψ(t) trajectories (ψ extremals) in the absence of contagion effects (systematic risk only) and in their presence (systematic risk and contagion), for t ∈ [0, 1] and ℓ = 0.81
Namely, if a large cluster were to occur, the systematic risk factor is likely to
play an important role in the beginning, but then the contagion effects become more
important. Assets of Type A are likely to contribute to the default clustering effect
more, followed by assets of Type B and the ones that will contribute the least to the
default cluster are assets of Type C.
As is also seen in the numerical experiments done in [26], the large deviations
analysis helps quantify the effect that the contagion and the systematic risk factor have
on the behavior of the extremals (the most likely path to failure). An understanding
of the role of the preferred paths to large default rates and the most likely ways in
which contagion and systematic risk combine to lead to large default rates would
give useful insights into how to optimally hedge against such events.
Let us next proceed by motivating the development of the large deviations principle
for the default timing model (2)–(5) that is considered in this paper.
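We denote scenarios, i.e., defaults, that are not in [0, T] by an abstract point ⋆ not
in [0, T] and define the Polish space
\[ \mathcal{T} = [0, T] \cup \{\star\}. \]
To motivate things, let us first assume for simplicity that $\beta^C = \beta^S = 0$ and that
the system is homogeneous, i.e., that $p_n = p$ for all $n$. Define
\[ d\lambda_t = -\alpha(\lambda_t - \bar\lambda)\,dt + \sigma\sqrt{\lambda_t}\,dW_t \]
for all t > 0; $\mu_0$ is the common law of the default times $\tau_n$.
In the independent case, i.e., when $\beta^C = 0$, the standard Sanov theorem [44]
implies that $\{dL^N\}_{N\in\mathbb{N}}$ has a large deviations principle with rate function
\[ H(\nu, \mu_0) = \int_{t\in\mathcal{T}} \ln\frac{d\nu}{d\mu_0}(t)\,\nu(dt), \]
\[ I^{\mathrm{ind}}(\ell) = \inf\big\{ H(\nu, \mu_0) : \nu\in\mathcal{P}(\mathbb{R}_+),\ \nu[0,t] = \phi(t) \text{ for all } t\in[0,T] \text{ and } \nu[0,T] = \ell \big\}. \]
In the independent case, we can actually compute both the extremal $\phi$ that achieves
the infimum and the corresponding rate function $I^{\mathrm{ind}}(\ell)$ in closed form.
Assume that $\mu_0[0,T]\in(0,1)$ and $\ell\in(0,1)$. Fix $\nu\in\mathcal{P}(\mathcal{T})$ such that $\nu[0,T] = \ell$. Define
\[ \mu_{0,-}(A) = \frac{\mu_0(A)}{\mu_0[0,T]}, \qquad \nu_-(A) = \frac{\nu(A)}{\nu[0,T]} \]
for all $A\in\mathcal{B}[0,T]$. Then $\mu_{0,-}$ and $\nu_-$ are in $\mathcal{P}[0,T]$. We can write that
\[ H(\nu,\mu_0) = \ell\,H(\nu_-,\mu_{0,-}) + \ell\ln\frac{\ell}{\mu_0[0,T]} + \nu\{\star\}\ln\frac{\nu\{\star\}}{\mu_0\{\star\}}. \tag{19} \]
Since $\nu\{\star\} = 1-\ell$ and $H(\nu_-,\mu_{0,-})\ge 0$, with equality when $\nu_- = \mu_{0,-}$, we get
\begin{align*}
I^{\mathrm{ind}}(\ell) &= \ell\ln\frac{\ell}{\mu_0[0,T]} + (1-\ell)\ln\frac{1-\ell}{\mu_0\{\star\}} \tag{20}\\
&= \ell\ln\frac{\ell}{\mu_0[0,T]} + (1-\ell)\ln\frac{1-\ell}{1-\mu_0[0,T]}.
\end{align*}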
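As a quick numerical illustration of formula (20), the following Python sketch evaluates the Bernoulli relative entropy and the rescaled path ϕ(t) = ℓ μ0[0, t]/μ0[0, T]. The function names, and the exponential default-time law used in the usage lines, are assumptions made only for illustration; they are not part of the model (2)–(5).

```python
import math


def bernoulli_rate(ell, p):
    """Relative entropy of Bernoulli(ell) with respect to Bernoulli(p), as in (20).

    Here p plays the role of mu_0[0, T] and ell is the target loss level.
    """
    if not (0.0 < ell < 1.0) or not (0.0 < p < 1.0):
        raise ValueError("ell and p must lie strictly inside (0, 1)")
    return ell * math.log(ell / p) + (1.0 - ell) * math.log((1.0 - ell) / (1.0 - p))


def optimal_path(t, ell, mu0_cdf, T):
    """Path phi(t) = ell * mu_0[0, t] / mu_0[0, T] for a given default-time CDF."""
    return ell * mu0_cdf(t) / mu0_cdf(T)


# Usage with a hypothetical exponential default-time law (rate 0.5), T = 1.
def cdf(t):
    return 1.0 - math.exp(-0.5 * t)


p = cdf(1.0)
print(bernoulli_rate(p, p))               # zero at the law of large numbers
print(bernoulli_rate(0.81, p))            # positive cost of an atypical loss level
print(optimal_path(0.5, 0.81, cdf, 1.0))  # value of the path halfway to T
```

The rate vanishes at ℓ = μ0[0, T] and is positive elsewhere, in line with the convexity remark made for Fig. 3.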
This is in fact obvious: $L_T^N = \frac{1}{N}\sum_{n=1}^N 1_{\{\tau_n\le T\}}$, and in this case the $1_{\{\tau_n\le T\}}$'s are i.i.d.
Bernoulli random variables with common bias $\mu_0[0,T]$. The rate function $I^{\mathrm{ind}}(\ell)$
of (20) is the entropy of Bernoulli coin flips. Of more interest, however, is the optimal
path. In setting $\nu_- = \mu_{0,-}$ in (19), we essentially identify the optimal path
\[ \phi(t) = \ell\,\frac{\mu_0[0,t]}{\mu_0[0,T]}, \qquad t\in[0,T]. \]
\[ dX_t = -\gamma X_t\,dt + dV_t, \qquad X_0 = x_\circ. \]
Let W ∗ be a reference Brownian motion. Fix a name in the pool p = (λ◦ , α, λ̄, σ,
βC , β S) ∈ P and time horizon T > 0.
The Freidlin-Wentzell theory of large deviations for SDE’s gives us a natural
starting point. In the Freidlin-Wentzell analysis, a dominant ODE is subjected to a
small diffusive perturbation; informally, the Freidlin-Wentzell theory tells us that
if we want to find the probability that the randomly-perturbed path is close to a
reference trajectory, we should use that reference trajectory in the dynamics. This
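leads to the correct LDP rate function for the original SDE. If we want to find the
asymptotics of the probability that $dL^N \approx d\phi$, $\varepsilon_N\,dX \approx d\psi$ for some absolutely
continuous functions $\phi$ and $\psi$, i.e., $\phi, \psi \in AC([0,T],\mathbb{R})$, we should consider the
stochastic hazard functions
\[ d\lambda_t^{\phi,\psi} = -\alpha(\lambda_t^{\phi,\psi} - \bar\lambda)\,dt + \sigma\sqrt{\lambda_t^{\phi,\psi}}\,dW_t^* + \beta^C\,d\phi(t) + \beta^S\lambda_t^{\phi,\psi}\,d\psi(t), \qquad t\in[0,T], \]
\[ \lambda_0^{\phi,\psi} = \lambda_\circ. \]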
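where we have used the superscript p to denote the dependence on the particular
type. Then for every $t\in[0,T]$ we have that
\[ \int_{s=0}^t f^{p}_{\phi,\psi}(s)\,ds = 1 - E\exp\Big(-\int_{s=0}^t \lambda_s^{\phi,\psi}\,ds\Big) = P\Big\{\int_{s=0}^t \lambda_s^{\phi,\psi}\,ds > \mathfrak{e}\Big\}. \]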
In fact, the previous heuristics can be carried out rigorously and in the end one derives
the following rigorous large deviations result.
Theorem 5.1 (Theorem 3.8 in [26]) Consider the system defined in (2)–(5) with
$\lim_{N\to\infty}\varepsilon_N = 0$ such that $\lim_{N\to\infty} N\varepsilon_N^2 = c\in(0,\infty)$ and let $T<\infty$. Under
the appropriate assumptions the family $\{L_T^N, N\in\mathbb{N}\}$ satisfies the large deviation
principle, with rate function
\[ I(\ell) = \inf\Big\{ I(\phi,\psi) : \phi\in C(\mathcal{P}\times[0,T]),\ \psi\in C([0,T]),\ \psi(0) = \phi(p,0) = 0,\ \bar\phi(s) = \int_{\mathcal{P}}\phi(p,s)\,U(dp),\ \bar\phi(T) = \ell \Big\} \]
and $I(\phi,\psi) = \infty$ otherwise. Here, $J_X(\psi)$ is the rate function for the process
$\{\varepsilon_N X^N, N<\infty\}$. Namely, for $\psi\in AC([0,T];\mathbb{R})$ with $\psi(0) = 0$ we have
\[ J_X(\psi) = \frac{1}{2}\int_0^T \big|\dot\psi(s) + \gamma\psi(s)\big|^2\,ds. \]
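To make the cost functional J_X concrete, here is a minimal Python sketch that approximates (1/2) ∫_0^T |ψ'(s) + γψ(s)|² ds by a midpoint rule with finite-difference derivatives. The function name `ou_rate`, the discretisation and the test path ψ(t) = t are illustrative assumptions, not part of the source.

```python
def ou_rate(psi, gamma, T, steps=1000):
    """Midpoint-rule approximation of J_X(psi) = 0.5 * int_0^T (psi' + gamma*psi)^2 ds.

    psi   : callable path with psi(0) = 0 (assumed absolutely continuous)
    gamma : mean-reversion parameter of the systematic risk process
    """
    h = T / steps
    total = 0.0
    for i in range(steps):
        left, right = i * h, (i + 1) * h
        dpsi = (psi(right) - psi(left)) / h      # finite-difference derivative
        mid = psi(0.5 * (left + right))          # path value at the midpoint
        total += 0.5 * (dpsi + gamma * mid) ** 2 * h
    return total


# The zero path costs nothing; a linear push psi(t) = t costs a positive amount.
print(ou_rate(lambda t: 0.0, gamma=2.0, T=1.0))
print(ou_rate(lambda t: t, gamma=2.0, T=1.0))
```

For ψ(t) = t, γ = 2 and T = 1 the integral can be computed by hand as 13/6, which gives a simple check of the discretisation.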
Due to the affine structure of the model, we have an explicit expression for $f^{p}_{\phi,\psi}$ (see
Lemma 4.1 in [26]).
Assume that $\kappa_i$ % of the names are of type $A_i$ with $i = 1,\ldots,K$ and $\sum_{i=1}^K \kappa_i = 100$.
Setting $\phi(p,s) = \sum_{i=1}^K \frac{\kappa_i}{100}\,\phi_{A_i}(s)\,\chi_{\{p_{A_i}\}}(p)$, we get the following simplified
expression for the rate function
\[ I(\ell) = \inf\Big\{ \sum_{i=1}^K \frac{\kappa_i}{100}\,g^{p_{A_i}}(\phi_{A_i},\phi,\psi) + \frac{1}{c}J_X(\psi) :\ \phi(t) = \sum_{i=1}^K \frac{\kappa_i}{100}\,\phi_{A_i}(t) \text{ for every } t\in[0,T], \]
\[ \qquad\qquad \phi(T) = \ell,\ \phi_{A_i}(0) = \psi(0) = 0,\ \phi_{A_i},\psi\in AC([0,T]) \text{ for every } i = 1,\ldots,K \Big\}. \]
system. In particular, this understanding can guide the selection of meaningful stress
scenarios to be analyzed. Thirdly, they can motivate the design of asymptotically
efficient importance sampling schemes for the tail of the portfolio loss. We discuss
some of the related issues in Sect. 6.
\[ \Gamma_N = 1_{\{L_T^N > \ell\}}\,\frac{dP}{dQ}, \]
where $\frac{dP}{dQ}$ is the associated Radon-Nikodym derivative.
Importance sampling involves the generation of independent copies of $\Gamma_N$ under
Q; the estimate is the sample mean. The specific number of samples required depends
on the desired accuracy, which is measured by the variance of the sample mean.
However, since the samples are independent it suffices to consider the variance of
a single sample. Because of unbiasedness, minimizing the variance is equivalent to
minimizing the second moment. An application of Jensen's inequality shows that if
\[ \liminf_{N\to\infty}\frac{1}{N}\ln E_Q(\Gamma_N)^2 = -2I(\ell), \]
then $\Gamma_N$ achieves this best decay rate, and is said to be asymptotically optimal. One
wants to choose Q such that asymptotic optimality is attained.
To motivate things, let us assume for the moment that $\beta^C = \beta^S = 0$ and that
the system is homogeneous, i.e., that $p_n = p$ for all $n$. In the independent and
homogeneous case, the indicators $1_{\{\tau_n\le T\}}$ are i.i.d. random variables such that for every
$t\in[0,T]$
\[ P\{\tau_n\le t\} = P\Big\{\int_0^t \lambda_s^{0,0}\,ds > \mathfrak{e}\Big\} = 1 - E\exp\Big(-\int_0^t \lambda_s^{0,0}\,ds\Big) = \int_0^t f_{0,0}(s)\,ds. \]
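\[ \bar\Lambda(\theta;t) = \lim_{N\to\infty}\frac{1}{N}\Lambda_N(N\theta;t) = \ln\big(p(e^{\theta}-1)+1\big). \]
Define
\[ p_\theta = \frac{pe^{\theta}}{1+p(e^{\theta}-1)}, \]
\[ Z_\theta = \prod_{n=1}^N \Big(\frac{p}{p_\theta}\Big)^{1_{\{\tau_n\le T\}}}\Big(\frac{1-p}{1-p_\theta}\Big)^{1-1_{\{\tau_n\le T\}}} = \prod_{n=1}^N \big(1+p(e^{\theta}-1)\big)\,e^{-\theta 1_{\{\tau_n\le T\}}} = e^{N\left(-\theta L_T^N + \bar\Lambda(\theta;T)\right)}. \]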
Therefore, for θ fixed, the suggestion is to simulate under a new measure,
under which $N L_T^N \sim \mathrm{Binomial}(N, p_\theta)$, and to return the estimator
\[ \frac{1}{M}\sum_{i=1}^M 1_{\{L_T^{N,i} > \ell\}}\,e^{N\left(-\theta L_T^{N,i} + \bar\Lambda(\theta;T)\right)}. \]
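It is clear that this estimator is unbiased. We want to choose θ that minimizes the
variance, or equivalently the second moment. For this purpose, we define the second
moment
\[ Q(\ell,\theta) = E_\theta\big[1_{\{L_T^N > \ell\}}\,Z_\theta^2\big] = E_\theta\Big[1_{\{L_T^N > \ell\}}\,e^{2N\left(-\theta L_T^N + \bar\Lambda(\theta;T)\right)}\Big]. \]
Notice that
\[ -\frac{1}{N}\ln Q(\ell,\theta) \ge -\frac{1}{N}\,2N\big(-\theta\ell + \bar\Lambda(\theta;T)\big) = 2\big(\theta\ell - \bar\Lambda(\theta;T)\big), \]
\[ P_{\theta^*}\{\tau_n\le T\} = p_{\theta^*} = \ell. \]
Proof By Jensen's inequality we clearly have the upper bound. Namely, for every
$\theta\in[0,\infty)$
\[ \limsup_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta) \le 2I^{\mathrm{ind}}(\ell). \tag{21} \]
Now, we need to prove that the lower bound is achieved for $\theta = \theta^*$, i.e., that
\[ \liminf_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta^*) \ge 2I^{\mathrm{ind}}(\ell). \tag{22} \]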
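Recalling that $\theta^* = \ln\frac{\ell(1-p)}{p(1-\ell)}$ and $p = \int_{s=0}^T f_{0,0}(s)\,ds$, we easily see that
\begin{align*}
\liminf_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta^*) &\ge 2\big(\theta^*\ell - \bar\Lambda(\theta^*;T)\big)\\
&= 2\Big(\theta^*\ell - \ln\big(p(e^{\theta^*}-1)+1\big)\Big)\\
&= 2\Big(\ell\ln\frac{\ell}{p} + (1-\ell)\ln\frac{1-\ell}{1-p}\Big)\\
&= 2I^{\mathrm{ind}}(\ell).
\end{align*}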
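The exponential twist for the independent homogeneous case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the default probability p, the pool size N, the sample size M and all function names are hypothetical choices, and the defaults are simulated directly as Bernoulli draws with the twisted probability rather than through the intensity model.

```python
import math
import random


def twisted_probability(p, theta):
    """Default probability p_theta under the exponentially twisted measure."""
    return p * math.exp(theta) / (1.0 + p * (math.exp(theta) - 1.0))


def optimal_theta(p, ell):
    """Twist theta* for which the twisted default probability equals ell."""
    return math.log(ell * (1.0 - p) / (p * (1.0 - ell)))


def is_tail_estimate(N, p, ell, M, seed=0):
    """Importance sampling estimate of P(L_T^N > ell) for i.i.d. Bernoulli(p) defaults."""
    rng = random.Random(seed)
    theta = optimal_theta(p, ell)
    p_theta = twisted_probability(p, theta)
    log_mgf = math.log(p * (math.exp(theta) - 1.0) + 1.0)  # limiting log-moment function
    total = 0.0
    for _ in range(M):
        defaults = sum(1 for _ in range(N) if rng.random() < p_theta)
        loss = defaults / N
        if loss > ell:
            # Likelihood ratio exp(N * (-theta * loss + log_mgf)) corrects the twist.
            total += math.exp(N * (-theta * loss + log_mgf))
    return total / M


print(is_tail_estimate(N=40, p=0.1, ell=0.25, M=20000, seed=1))
```

With θ = θ* the twisted default probability equals ℓ, so the rare event {L_T^N > ℓ} becomes typical under the sampling measure, and the likelihood ratio keeps the estimator unbiased.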
indexed by n. Due to independence, similar methods as the one described above can
be used to construct asymptotically efficient importance sampling schemes in the
heterogeneous case.
The scheme just presented essentially amounts to a twist in the intensity of the
defaults. However, in contrast to the independent case, i.e., when $\beta^C = \beta^S = 0$, the
situation in the general dependent case $\beta^C, \beta^S \neq 0$ is more complicated. Notice also
that if at least one of the $\beta_n^C$'s is not zero, then the model (2)–(5) does not fall into the
category of the doubly-stochastic models, so techniques such as the ones used in [45] do
not apply. Also, interacting particle schemes for Markov chain
models, such as the ones developed in [29, 47], do not readily apply to such intensity
models. The re-sampling schemes of [48] could apply in this setting, but one would
need to construct an appropriate mimicking Markov chain, and it is not
clear how to do this in the current setting.
We briefly present here an importance sampling scheme for the case in which there
exists at least one $\beta_n^C \neq 0$; it applies independently of whether the systematic
effects are present in the model or not. The suggested measure change essentially
mimics the principal idea behind the measure change for the independent case. To
be more precise, one directly twists the intensity of $N L_T^N = \sum_{n=1}^N 1_{\{\tau_n\le T\}}$.
Let $\{S_k\}$ be the arrival times of $N L^N$ and notice that $\{L_T^N \ge \ell\} = \{S_{\ell N}\le T\}$.
Let $M_s^n = 1_{\{\tau^n > s\}}$ and let $\theta_s^N \ge 1$ be some progressively measurable twisting process.
Then, define the measure Q via the Radon-Nikodym derivative
\[ Z_N = \exp\Big(-\int_0^{S_{\ell N}} \log\theta_{s-}^N\,d(N L_s^N) - \int_0^{S_{\ell N}} \big(1-\theta_s^N\big)\sum_{n=1}^N \lambda_s^n M_s^n\,ds\Big). \]
It is known that if $E\big[e^{-\sum_{k=1}^{\ell N}\log\theta^N_{S_k-}}\big] < \infty$, then Q defined by $\frac{dP}{dQ} = Z_N$
is a probability measure, and it can be shown that $N L_s^N$ admits Q-intensity
$\theta_s^N\sum_{n=1}^N \lambda_s^n M_s^n$ on the interval $[0, S_{\ell N})$.
This construction gives us some freedom in choosing the twisting
process $\theta_s^N$ appropriately. Different choices of the twisting process are of course possible. For
tractability purposes we restrict attention to a one-parameter family and set
\[ \theta_s^N = \frac{\beta N}{\sum_{n=1}^N \lambda_s^n M_s^n} + 1. \]
For any $\beta\ge 0$ and under the measure induced by $Z_N$, i.e., under $Q_\beta$, the process
$N L_s^N$ has intensity $\sum_{n=1}^N \lambda_s^n M_s^n + \beta N$ on $[0, S_{\ell N})$, i.e., it amounts to an additive
shift of the intensity. Thus, β is a superimposed default rate and its role is to increase
the default rate in the whole portfolio.
The purpose then is to optimize the limit as N → ∞ of the upper bound of the
second moment of the resulting estimator over β. This is the measure change that is
investigated in [57], and it is shown there that there is a choice of β = β ∗ for which
asymptotic optimality can be established. Namely, there is a choice of β = β ∗ that
minimizes the second moment of the estimator in the limit as N → ∞. We refer the
interested reader to [57] for implementation details on this change of measure for
related intensity models and for corresponding simulation results.
7 Conclusions
Acknowledgments The author was partially supported by the National Science Foundation (DMS
1312124).
References
1. Meinerding, C.: Asset allocation and asset pricing in the face of systemic risk: a literature
overview and assessment. Int. J. Theor. Appl. Financ. (IJTAF) 15(03), 1250023-1–1250023-27
(2012)
2. Vasicek, O.: Limiting loan loss probability distribution. Technical Report, KMV Corporation
(1991)
3. Lucas, A., Klaassen, P., Spreij, P., Straetmans, S.: An analytic approach to credit risk of large
corporate bond and loan portfolios. J. Bank. Financ. 25, 1635–1664 (2001)
4. Schloegl, L., O’Kane, D.: A note on the large homogeneous portfolio approximation with the
student-t copula. Financ. Stoch. 9(4), 577–584 (2005)
5. Gordy, M.B.: A risk-factor model foundation for ratings-based bank capital rules. J. Financ.
Intermed. 12, 199–232 (2003)
6. Bush, N., Hambly, B., Haworth, H., Jin, L., Reisinger, C.: Stochastic evolution equations in
portfolio credit modelling. SIAM J. Financ. Math. 2, 627–664 (2011)
7. Dembo, A., Deuschel, J.-D., Duffie, D.: Large portfolio losses. Financ. Stoch. 8, 3–16 (2004)
8. Glasserman, P., Kang, W., Shahabuddin, P.: Large deviations in multifactor portfolio credit
risk. Math. Financ. 17(3), 345–379 (2007)
9. Spiliopoulos, K., Sowers, R.: Recovery rates in investment-grade pools of credit assets: a large
deviations analysis. Stoch. Process. Appl. 121(12), 2861–2898 (2011)
10. Giesecke, K., Weber, S.: Credit contagion and aggregate losses. J. Econ. Dyn. Control 30,
741–767 (2006)
11. Dai Pra, P., Runggaldier, W., Sartori, E., Tolotti, M.: Large portfolio losses: a dynamic contagion
model. Ann. Appl. Probab. 19, 347–394 (2009)
12. Dai Pra, P., Tolotti, M.: Heterogeneous credit portfolios and the dynamics of the aggregate
losses. Stoch. Process. Appl. 119, 2913–2944 (2009)
13. Cvitanić, J., Ma, J., Zhang, J.: The law of large numbers for self-exciting correlated defaults.
Stoch. Process. Appl. 122(8), 2781–2810 (2012)
14. Sircar, R., Zariphopoulou, T.: Utility valuation of credit derivatives and application to CDOs.
Quant. Financ. 10(2), 195–208 (2010)
15. Garnier, J., Papanicolaou, G., Yang, T.-W.: Large deviations for a mean field model of systemic
risk. SIAM J. Financ. Math. 4, 151–184 (2012)
16. Fouque, J.-P., Ichiba, T.: Stability in a model of inter-bank lending. SIAM J. Financ. Math. 4,
784–803 (2013)
17. Ait-Sahalia, Y., Cacho-Diaz, J., Laeven, R.: Modeling financial contagion using mutually excit-
ing jump processes. To appear in Journal of Financial Economics
18. Duan, J.-C.: Maximum likelihood estimation using price data of the derivative contract. Math.
Financ. 4, 155–167 (1994)
19. Bielecki, T., Crépey, S., Herbertsson, A.: Markov chain models of portfolio credit risk. Oxford
Handbook of Credit Derivatives. Oxford University Press, New York (2011)
20. Giesecke, K., Spiliopoulos, K., Sowers, R.: Default clustering in large portfolios: typical events.
Ann. Appl. Probab. 23(1), 348–385 (2013)
21. Azizpour, S., Giesecke, K., Schwenkler, G.: Exploring the sources of default clustering. Work-
ing paper, Stanford University (2010)
22. Duffie, D., Saita, L., Wang, K.: Multi-period corporate default prediction with stochastic covari-
ates. J. Financ. Econ. 83(3), 635–665 (2006)
23. Bo, L., Capponi, A.: Bilateral credit valuation adjustment for large credit derivatives portfolios.
Financ. Stoch. 18(2), 431–482 (2014)
24. Giesecke, K., Spiliopoulos, K., Sowers, R., Sirignano, J.A.: Large portfolio asymptotics for
loss from default. Math. Financ. 25(1), 77–114 (2015)
25. Spiliopoulos, K., Sirignano, J.A., Giesecke, K.: Fluctuation analysis for the loss from default.
Stoch. Process. Appl. 124(7), 2322–2362 (2014)
26. Spiliopoulos, K., Sowers, R.: Default clustering in large pools: large deviations. SIAM J.
Financ. Math. 6, 86–116 (2015)
27. Sirignano, J.A., Schwenkler, G., Giesecke, K.: Likelihood estimation for large financial sys-
tems. Working paper, Stanford University (2013)
28. Glasserman, P., Wang, Y.: Counterexamples in importance sampling for large deviations prob-
abilities. Ann. Appl. Probab. 7, 731–746 (1997)
29. Carmona, R., Fouque, J.-P., Vestal, D.: Interacting particle systems for the computation of
rare credit portfolio losses. Financ. Stoch. 13(4), 613–633 (2009)
30. Deng, S., Giesecke, K., Lai, T.L.: Sequential importance sampling and resampling for dynamic
portfolio credit risk. Oper. Res. 60(1), 78–91 (2012)
31. Kotelenez, P.M., Kurtz, T.G.: Macroscopic limits for stochastic partial differential equations
of McKean-Vlasov type. Probab. Theory Relat. Fields 146(1–2), 189–222 (2010)
32. Kurtz, T.G., Xiong, J.: A stochastic evolution equation arising from the fluctuations of a class
of interacting particle systems. Commun. Math. Sci. 2(3), 325–358 (2004)
33. Purtukhia, O.G.: On the equations of filtering of multi-dimensional diffusion processes
(unbounded coefficients). Thesis, Moscow, Lomonosov University 1984 (in Russian) (1984)
34. Sadowsky, S.J.: On Monte Carlo estimation of large deviations probabilities. Ann. Appl. Probab.
6, 399–722 (1996)
35. Duffie, D., Pan, J., Singleton, K.: Transform analysis and asset pricing for affine jump-
diffusions. Econometrica 68, 1343–1376 (2000)
36. Dawson, D.A., Hochberg, K.J.: Wandering random measures in the Fleming-Viot model. Ann.
Probab. 10(3), 554–580 (1982)
37. Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, New
York (1986)
38. Fleming, W.H., Viot, M.: Some measure-valued Markov processes in population genetics the-
ory. Indiana Univ. Math. J. 28(5), 817–843 (1979)
39. Gartner, J.: On the Mckean-Vlasov limit for interacting diffusions. Mathematische Nachrichten
137(1), 197–248 (1988)
40. Fernandez, B., Mélèard, S.: A Hilbertian approach for fluctuations on the Mckean-Vlasov
model. Stoch. Process. Appl. 71, 33–53 (1997)
41. Gyöngy, I., Krylov, N.: Stochastic partial differential equations with unbounded coefficients
and applications, I. Stoch. Stoch. Rep. 32, 53–91 (1990)
42. Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems, 2nd edn. Springer,
New York (1984)
43. Varadhan, S.R.S.: Large Deviations and Applications. CBMS-NSF Regional Conference Series
in Applied Mathematics, vol. 46. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia (1984)
44. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer,
New York (1988)
45. Bassamboo, A., Jain, S.: Efficient importance sampling for reduced form models in credit risk.
In: Proceedings of the 2006 Winter Simulation Conference, pp. 741–748 (2006)
46. Juneja, S., Bassamboo, A., Zeevi, A.: Portfolio credit risk with extremal dependence: asymp-
totic analysis and efficient simulation. Oper. Res. 56(3), 593–606 (2008)
47. Carmona, R., Crépey, S.: Particle methods for the estimation of Markovian credit portfolio loss
distributions. Int. J. Theor. Appl. Financ. 13(4), 577–602 (2010)
48. Giesecke, K., Kakavand, H., Mousavi, M., Takada, H.: Exact and efficient simulation of cor-
related defaults. SIAM J. Financ. Math. 1, 868–896 (2010)
49. Glasserman, P., Li, J.: Importance sampling for portfolio credit risk. Manag. Sci 51(11), 1643–
1656 (2005)
50. Glasserman, P.: Tail approximations for portfolio credit risk. J. Deriv. 12(2), 24–42 (2004)
51. Zhang, X., Blanchet, J., Giesecke, K., Glynn, P.: Affine point processes: asymptotic analysis
and efficient rare-event simulation. Working paper, Stanford University (2011)
52. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Grundlehren
der Mathematischen Wissenschaften. Springer, New York (2007)
53. Bucklew, J.: Introduction to Rare Event Simulation. Grundlehren der Mathematischen Wis-
senschaften. Springer, New York (2004)
54. Dupuis, P., Spiliopoulos, K., Wang, H.: Importance sampling for multiscale diffusions. Multi-
scale Model. Simul. 12, 1–27 (2012)
55. Dupuis, P., Wang, H.: Importance sampling, large deviations and differential games. Stoch.
Stoch. Rep. 76, 481–508 (2004)
56. Dupuis, P., Wang, H.: Subsolutions of an Isaacs equation and efficient schemes for importance
sampling. Math. Oper. Res. 32, 723–757 (2007)
57. Giesecke, K., Shkolnik, A.: Asymptotic optimal importance sampling of default times. Working
paper, Stanford University (2011)
Estimation of Volatility Functionals: The Case of a √n Window
1 Introduction
J. Jacod
Institut de Mathématiques de Jussieu, CNRS – UMR 7586 and Université Pierre et Marie Curie,
4 Place Jussieu, 75252 Paris Cedex 05, France
e-mail: [email protected]
M. Rosenbaum (B)
Laboratoire de Probabilités et Modèles Aléatoires, CNRS – UMR 7599 and Université Pierre et
Marie Curie, 4 Place Jussieu, 75252 Paris Cedex 05, France
e-mail: [email protected]
This is of course a quite well understood problem when g is the identity function.
In particular, when X is one-dimensional and continuous, $\int_0^t g(c_s)\,ds$ corresponds
then to the integrated (squared) volatility, which can be efficiently estimated using
the so-called realized volatility, that is the sum of the squared increments of X. However,
many other functions g are of interest. For example, in the case of the realized
volatility mentioned above, the quantity $\int_0^t c_s^2\,ds$, called quarticity and corresponding
to $g(x) = x^2$, appears in the asymptotic variance of the estimator. Therefore,
estimating the quarticity becomes necessary if one wants to build confidence intervals
for the integrated volatility. Actually, in the context of volatility estimation, for most
statistical procedures, the asymptotic variance is a combination of terms of the form
$\int_0^t g(c_s)\,ds$, see [4]. Hence the statistician needs to be able to estimate such quantities.
Note that the functions g involved in limiting variances are often polynomial.
Nevertheless, more complicated expressions may also be found, in particular in the
multi-dimensional setting in the presence of jumps. We refer to [5] for more details on
the motivation for estimating general integrated functionals of the volatility process.
In [5], we have exhibited estimators which are consistent and asymptotically
optimal, in the sense that they asymptotically achieve the best rate $1/\sqrt{\Delta_n}$, and
also the minimal asymptotic variance in the cases where optimality is well-defined
(namely, when X is continuous and has a Markov type structure, in the sense of [2]).
These estimators have this rate and minimal asymptotic variance as soon as the jumps
of X are summable, plus some mild technical conditions.
The aim of this paper is to complement [5] with another estimator, of the same
type, but using spot volatility estimators based on a different window size. In this
introduction, we explain the differences between the estimator in [5] and the one
presented here.
For the sake of simplicity, we consider the case when X is continuous and one-dimensional
(the discontinuous and multi-dimensional case is considered later), that
is of the form
\[ X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s \]
and $c_t = \sigma_t^2$ is the squared volatility. Natural estimators for $V(g)_t = \int_0^t g(c_s)\,ds$ are
\[ V(g)^n_t = \Delta_n\sum_{i=1}^{[t/\Delta_n]-k_n+1} g(\widehat c^{\,n}_i), \quad\text{where}\quad \widehat c^{\,n}_i = \frac{1}{k_n\Delta_n}\sum_{j=0}^{k_n-1}\big(X_{(i+j)\Delta_n} - X_{(i+j-1)\Delta_n}\big)^2 \tag{1.1} \]
for an arbitrary sequence of integers such that $k_n\to\infty$ and $k_n\Delta_n\to 0$. One knows
that $V(g)^n_t \xrightarrow{P} V(g)_t$ (when g is continuous and of polynomial growth).
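The estimators in (1.1) translate directly into code. The following Python sketch is a plain, unoptimised illustration for a one-dimensional continuous path; the function names, the argument `delta` (standing for Δ_n), the argument `k` (standing for k_n) and the constant-volatility toy path are assumptions made for the example.

```python
import math


def spot_variances(x, delta, k):
    """Local spot variance estimates from (1.1).

    x     : observations X_0, X_delta, X_{2 delta}, ... (a list of floats)
    delta : time between observations
    k     : window size, with 1 <= k <= len(x) - 1
    Returns one estimate per window of k consecutive squared increments.
    """
    squared = [(x[i] - x[i - 1]) ** 2 for i in range(1, len(x))]
    n = len(squared)
    if not 1 <= k <= n:
        raise ValueError("window size k must satisfy 1 <= k <= number of increments")
    return [sum(squared[i:i + k]) / (k * delta) for i in range(n - k + 1)]


def integrated_functional(x, delta, k, g):
    """Riemann-sum estimator of the integral of g(c_s), as in (1.1)."""
    return delta * sum(g(c) for c in spot_variances(x, delta, k))


# Toy path with constant volatility 0.2: every increment equals 0.2 * sqrt(delta),
# so each spot estimate should be close to 0.2 ** 2 = 0.04.
delta, k, n = 0.01, 10, 100
path = [0.2 * math.sqrt(delta) * i for i in range(n + 1)]
print(spot_variances(path, delta, k)[:3])
print(integrated_functional(path, delta, k, lambda c: c * c))
```

Note that the sum in (1.1) stops k_n − 1 windows short of the end of the sample, so the raw estimator slightly underweights the edges of [0, t].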
The variables $\widehat c^{\,n}_i$ are spot volatility estimators, and according to [4] we know that
$\widehat c^{\,n}_{[t/\Delta_n]}$ estimates $c_t$, with a rate depending on the "window size" $k_n$. The optimal rate
$1/\Delta_n^{1/4}$ is achieved by taking $k_n \asymp 1/\sqrt{\Delta_n}$.¹ When $k_n$ is smaller, the rate is $\sqrt{k_n}$
and the estimation error is a purely "statistical error"; when $k_n$ is bigger, the rate is
$1/\sqrt{k_n\Delta_n}$ and the estimation error is due to the variability of the volatility process $c_t$
itself (its volatility and its jumps). With the optimal choice $k_n \asymp 1/\sqrt{\Delta_n}$, the estimation
error is a mixture of the statistical error and the error due to the variability of $c_t$.

¹ By $k_n \asymp 1/\sqrt{\Delta_n}$, we mean $a_1/\sqrt{\Delta_n} \le k_n \le a_2/\sqrt{\Delta_n}$, for some $a_1 > 0$ and $a_2 > 0$.

In [5], we have used a "small" window, that is $k_n \ll 1/\sqrt{\Delta_n}$. Somewhat surprisingly,
this allows for optimality in the estimation of $\int_0^t g(c_s)\,ds$ (rate $1/\sqrt{\Delta_n}$ and
minimal asymptotic variance). However, the price to pay is the need of a de-biasing
term to be subtracted from $V(g)^n$, without which the rate is smaller and no Central
Limit Theorem is available.
Here, we consider the window size $k_n \asymp 1/\sqrt{\Delta_n}$. This leads to a convergence
rate $1/\sqrt{\Delta_n}$ for $V(g)^n$ itself, and the limit is again conditionally Gaussian with the
"minimal" asymptotic variance, but with a bias that depends on the volatility of
the volatility $c_t$, and on its jumps. It is however possible to subtract from $V(g)^n$ a
de-biasing term again, so that the limit becomes (conditionally) centered.
Section 2 is devoted to presenting assumptions and results, and all proofs are
gathered in Sect. 3. The reader is referred to [5] for motivation and various comments
and a detailed discussion of optimality. However, in order to make this paper readable,
we basically give the full proofs, even though a number of partial results have already
been proved in the above-mentioned paper, and with the exception of a few well
designated lemmas.
2 The Results
The underlying process X is d-dimensional, and observed at the times $i\Delta_n$ for $i = 0, 1, \ldots$,
within a fixed interval of interest [0, t]. For any process Y we write $\Delta^n_i Y = Y_{i\Delta_n} - Y_{(i-1)\Delta_n}$
for the increment over the ith observation interval. We assume that
the sequence $\Delta_n$ goes to 0. The precise assumptions on X are as follows.
First, X is an Itô semimartingale on a filtered space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0}, P)$. It can be
written in its Grigelionis form, as follows, using a d-dimensional Brownian motion
W and a Poisson random measure μ on $\mathbb{R}_+\times E$, where E is an auxiliary Polish
space and with the (non-random) intensity measure $\nu(dt, dz) = dt\otimes\lambda(dz)$ for
some σ-finite measure λ on E:
\begin{align*}
X_t = X_0 &+ \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + \int_0^t\!\!\int_E \delta(s,z)\,1_{\{\|\delta(s,z)\|\le 1\}}\,(\mu-\nu)(ds,dz)\\
&+ \int_0^t\!\!\int_E \delta(s,z)\,1_{\{\|\delta(s,z)\|>1\}}\,\mu(ds,dz). \tag{2.1}
\end{align*}
The spot volatility process $c_t = \sigma_t\sigma_t^*$ (∗ denotes transpose) takes its values in the
set $\mathcal{M}^+_d$ of all nonnegative symmetric d × d matrices. We suppose that $c_t$ is again
an Itô semimartingale, which can be written as
\begin{align*}
c_t = c_0 &+ \int_0^t \widetilde b_s\,ds + \int_0^t \widetilde\sigma_s\,dW_s + \int_0^t\!\!\int_E \widetilde\delta(s,z)\,1_{\{\|\widetilde\delta(s,z)\|\le 1\}}\,(\mu-\nu)(ds,dz)\\
&+ \int_0^t\!\!\int_E \widetilde\delta(s,z)\,1_{\{\|\widetilde\delta(s,z)\|>1\}}\,\mu(ds,dz), \tag{2.2}
\end{align*}
with the same W and μ as in (2.1). This is indeed not a restriction: if X and c are
two Itô semimartingales, we have a representation as above for the pair (X, c) and,
if the dimension of W exceeds the dimension of X, one can always add fictitious
components to X, arbitrarily set to 0, so that the dimensions of X and W agree.
In (2.2), $\widetilde b$ and $\widetilde\sigma$ are optional and $\widetilde\delta$ is as δ; moreover
$\widetilde b$ and $\widetilde\delta$ are $\mathbb{R}^{d^2}$-valued.
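Finally, we need the spot volatility of the volatility and "spot covariation" of the
continuous martingale parts of X and c, which are
\[ \widetilde c^{\,ij,kl}_t = \sum_{m=1}^d \widetilde\sigma^{ij,m}_t\,\widetilde\sigma^{kl,m}_t, \qquad \widetilde c^{\,\prime\, i,jk}_t = \sum_{l=1}^d \sigma^{il}_t\,\widetilde\sigma^{jk,l}_t. \]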
The precise assumptions on the coefficients are as follows, with r a real in [0, 1).
Assumption (A'-r): There are a sequence $(J_n)$ of nonnegative bounded λ-integrable
functions on E and a sequence $(\tau_n)$ of stopping times increasing to ∞, such that
\[ t\le\tau_n(\omega) \implies \|\delta(\omega,t,z)\|^r\wedge 1 + \|\widetilde\delta(\omega,t,z)\|^2\wedge 1 \le J_n(z). \]
Moreover, the processes $b'_t = b_t - \int_E \delta(t,z)\,1_{\{\|\delta(t,z)\|\le 1\}}\,\lambda(dz)$ (which is well
defined), $\widetilde c_t$ and $\widetilde c^{\,\prime}_t$ are càdlàg or càglàd, and the maps $t\mapsto\widetilde\delta(\omega,t,z)$ are càglàd (recall
that $\widetilde\delta$ should be predictable), as well as the processes
$\widetilde b_t + \int_E \widetilde\delta(t,z)\big(\kappa(\|\widetilde\delta(t,z)\|) - 1_{\{\|\widetilde\delta(t,z)\|\le 1\}}\big)\,\lambda(dz)$
for one (hence for all) continuous function κ on $\mathbb{R}_+$ with compact
support and equal to 1 on a neighborhood of 0.
The bigger r, the weaker Assumption (A'-r), and when (A'-0) holds the process
X has finitely many jumps on each finite interval. The part of (A'-r) concerning the
jumps of X implies that $\sum_{s\le t}\|\Delta X_s\|^r < \infty$ a.s. for all $t < \infty$, and it is in fact
"almost" implied by this property. Since r < 1, this implies $\sum_{s\le t}\|\Delta X_s\| < \infty$ a.s.
Remark 2.1 (A’-r ) above is basically the same as Assumption (A-r ) in [5], albeit
(slightly) stronger (hence its name): some degree of regularity in time seems to be
needed for c ,
c,
b, δ in the present case.
For defining the estimators of the spot volatility, we first choose a sequence $k_n$ of
integers which satisfies, as $n\to\infty$:
\[ k_n \sim \frac{\theta}{\sqrt{\Delta_n}}, \qquad \theta\in(0,\infty), \tag{2.3} \]
\[ V(g)^n_t := \Delta_n\sum_{i=1}^{[t/\Delta_n]-k_n+1} g(\widehat c^{\,n}_i) \ \overset{\text{u.c.p.}}{\Longrightarrow}\ V(g)_t := \int_0^t g(c_s)\,ds \tag{2.5} \]
(convergence in probability, uniformly over each compact interval; by convention
$\sum_{i=a}^b v_i = 0$ if b < a), as soon as the function g on $\mathcal{M}^+_d$ is continuous with
$|g(x)| \le K(1 + \|x\|^p)$ for some constants K, p. Actually, for this to hold we need
much weaker assumptions on X, but we do not need this below. Note also that when
X is continuous, the truncation in (2.4) is useless: one may use (2.4) with $u_n \equiv \infty$,
which reduces to (1.1) in the one-dimensional case.
Now, we want to determine at which rate the convergence (2.5) takes place. This
amounts to proving an associated Central Limit Theorem. For an appropriate choice
of the truncation levels, such a CLT is available for $V(g)^n$, with the rate $1/\sqrt{\Delta_n}$, but
the limit exhibits a bias term. Below, g is a smooth function on $\mathcal{M}^+_d$, and the two first
partial derivatives are denoted as $\partial_{jk}g$ and $\partial^2_{jk,lm}g$, since any $x\in\mathcal{M}^+_d$ has $d^2$ components
$x^{jk}$. The family of all partial derivatives of order j is simply denoted as $\partial^j g$.
\[ \|\partial^j g(x)\| \le K(1 + \|x\|^{p-j}), \qquad j = 0, 1, 2, 3 \tag{2.6} \]
\[ u_n \asymp \Delta_n^{\varpi}, \qquad \frac{2p-1}{2(2p-r)} \le \varpi < \frac{1}{2}. \tag{2.7} \]
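\[ \frac{1}{\sqrt{\Delta_n}}\big(V(g)^n_t - V(g)_t\big) \xrightarrow{\ \mathcal{L}_f\text{-}s\ } A^1_t + A^2_t + A^3_t + A^4_t + Z_t, \]
\[ E\big((Z_t)^2 \mid \mathcal{F}\big) = \sum_{j,k,l,m=1}^d \int_0^t \partial_{jk}g(c_s)\,\partial_{lm}g(c_s)\,\big(c^{jl}_s c^{km}_s + c^{jm}_s c^{kl}_s\big)\,ds, \tag{2.8} \]
we have
\begin{align*}
A^1_t &= -\frac{\theta}{2}\big(g(c_0) + g(c_t)\big),\\
A^2_t &= \frac{1}{2\theta}\sum_{j,k,l,m=1}^d \int_0^t \partial^2_{jk,lm}g(c_s)\,\big(c^{jl}_s c^{km}_s + c^{jm}_s c^{kl}_s\big)\,ds,\\
A^3_t &= -\frac{\theta}{12}\sum_{j,k,l,m=1}^d \int_0^t \partial^2_{jk,lm}g(c_s)\,\widetilde c^{\,jk,lm}_s\,ds,\\
A^4_t &= \theta\sum_{s\le t} G(c_{s-}, \Delta c_s).
\end{align*}
\[ \frac{(k_n-1)\Delta_n}{2}\Big(g(\widehat c^{\,n}_1) + g(\widehat c^{\,n}_{[t/\Delta_n]-k_n+1})\Big). \]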
Remark 2.4 Observe that (2.7) implies r < 1. This restriction is not a surprise, since
one needs r ≤ 1 in order to estimate the integrated volatility by the (truncated)
realized volatility, with a rate of convergence $1/\sqrt{\Delta_n}$. When r = 1, it is likely that
the CLT still holds for an appropriate choice of the sequence $u_n$, and with another
additional bias, see e.g. [6] for a slightly different context. Here we leave this borderline
case aside.
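Now we proceed to "remove" the bias, which means subtracting consistent estimators
for the bias from $V(g)^n_t$. As written before, we have
\[ A^{n,1}_t = -\frac{k_n\sqrt{\Delta_n}}{2}\Big(g(\widehat c^{\,n}_1) + g(\widehat c^{\,n}_{[t/\Delta_n]-k_n+1})\Big) \xrightarrow{P} A^1_t \tag{2.10} \]
(this comes from $\widehat c^{\,n}_1 \xrightarrow{P} c_0$ and $\widehat c^{\,n}_{[t/\Delta_n]-k_n+1} \xrightarrow{P} c_{t-}$, plus $c_{t-} = c_t$ a.s.). Next,
observe that $A^2 = \frac{1}{\theta}V(h)$ for the test function h defined on $\mathcal{M}^+_d$ by
\[ h(x) = \frac{1}{2}\sum_{j,k,l,m=1}^d \partial^2_{jk,lm}g(x)\,\big(x^{jl}x^{km} + x^{jm}x^{kl}\big). \]
Therefore
\[ A^{n,2}_t = \frac{1}{k_n\sqrt{\Delta_n}}\,V(h)^n_t \xrightarrow{P} A^2_t. \tag{2.11} \]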
The term $A^3_t$ involves the volatility of the volatility, for which estimators have
been provided in the one-dimensional case by Vetter [7]; namely, if d = 1 and under
suitable technical assumptions (slightly stronger than here), plus the continuity of
$X_t$ and $c_t$, he proves that
\[ \frac{3}{2k_n}\sum_{i=1}^{[t/\Delta_n]-2k_n+1}\big(\widehat c^{\,n}_{i+k_n} - \widehat c^{\,n}_i\big)^2 \]
converges to $\int_0^t\big(\widetilde c_s + \frac{6}{\theta^2}(c_s)^2\big)\,ds$. Of course, we need to modify this estimator here,
in order to include the function $\partial^2 g$ in the limit and account for the possibilities of
having d ≥ 2 and having jumps in X. We propose to take
\[ A^{n,3}_t = -\frac{\sqrt{\Delta_n}}{8}\sum_{i=1}^{[t/\Delta_n]-2k_n+1}\ \sum_{j,k,l,m=1}^d \partial^2_{jk,lm}g(\widehat c^{\,n}_i)\,\big(\widehat c^{\,n,jk}_{i+k_n} - \widehat c^{\,n,jk}_i\big)\big(\widehat c^{\,n,lm}_{i+k_n} - \widehat c^{\,n,lm}_i\big). \tag{2.12} \]
When X and c are continuous, one may expect the convergence to $A^3_t - \frac{1}{2}A^2_t$ (observe
that $\frac{\sqrt{\Delta_n}}{8} \sim \frac{3}{2k_n}\cdot\frac{\theta}{12}$), and one may expect the same when X jumps and c is still
continuous, because in (2.4) the truncation basically eliminates the jumps of X. In
contrast, when c jumps, the limit should rather be related to the "full" quadratic
variation of c. Indeed we have the following theorem.
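Theorem 2.5 Under the assumptions of Theorem 2.2, for all t ≥ 0 we have
\[ A^{n,3}_t \xrightarrow{P} -\frac{1}{2}A^2_t + A^3_t + A'^{\,4}_t, \]
where
\[ A'^{\,4}_t = \theta\sum_{s\le t} G'(c_{s-}, \Delta c_s) \]
and
\[ G'(x,y) = -\frac{1}{8}\sum_{j,k,l,m}\int_0^1 \Big(\partial^2_{jk,lm}g(x) + \partial^2_{jk,lm}g\big(x + (1-w)y\big)\Big)\,w^2\,y^{jk}y^{lm}\,dw. \tag{2.13} \]
At this stage, it remains to find consistent estimators for $A^4_t - A'^{\,4}_t$, which has the
form
\[ A^4_t - A'^{\,4}_t = \theta\sum_{s\le t} G''(c_{s-}, \Delta c_s), \qquad\text{where } G'' = G - G'. \]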
\[
u_n \to 0, \qquad \frac{u_n}{\Delta_n^{\varpi'}} \to \infty \quad \text{for some } \varpi' \in \Big(0, \frac18\Big). \tag{2.14}
\]
The condition (2.14) implies that for $n$ large enough there is at most one jump of size bigger than $u_n$ in each interval $((i-1)\Delta_n, (i-1+k_n)\Delta_n]$ within $[0, t]$, and no two consecutive intervals of this form contain such jumps. Despite this, the statement above is of course not true, the main reason being that $\hat c^n_i$ and $c^n_i$ do not exactly agree. However it is “true enough” to allow for the next estimators to be consistent for $\mathcal V(F)_t$:
\[
\mathcal V(F)^n_t = \sum_{j=3}^{[t/k_n\Delta_n]-3} F\big(\hat c^n_{(j-3)k_n+1},\, \delta^n_j \hat c\big)\, 1_{\{\|\delta^n_{j-1}\hat c\| \vee \|\delta^n_{j+1}\hat c\| \vee u_n < \|\delta^n_j \hat c\|\}}, \tag{2.15}
\]
where $\delta^n_j \hat c = \hat c^n_{jk_n+1} - \hat c^n_{(j-2)k_n+1}$.
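The selection rule in (2.15) keeps block $j$ only when the increment $\delta^n_j \hat c$ exceeds both of its neighbours and the threshold $u_n$. A minimal Python sketch for $d = 1$ follows, with norms replaced by absolute values. The function name `jump_functional`, the 0-based storage `c_hat[i-1]` for $\hat c^n_i$, and the way the number of blocks is recovered from the array length are assumptions of the example.

```python
import numpy as np

def jump_functional(c_hat, k_n, u_n, F):
    """Sketch of the estimator (2.15) for d = 1.

    c_hat[i - 1] holds c-hat_i^n, so c-hat_{j k_n + 1} is c_hat[j * k_n].
    F is a user-supplied function F(x, y) of the pre-jump level and the jump.
    """
    n_incr = len(c_hat) + k_n - 1        # number of underlying increments
    n_blocks = n_incr // k_n             # plays the role of [t / (k_n Delta_n)]

    def delta(j):
        # delta_j = c-hat_{j k_n + 1} - c-hat_{(j - 2) k_n + 1}
        return c_hat[j * k_n] - c_hat[(j - 2) * k_n]

    total = 0.0
    for j in range(3, n_blocks - 2):     # j = 3, ..., n_blocks - 3
        d_prev, d_cur, d_next = abs(delta(j - 1)), abs(delta(j)), abs(delta(j + 1))
        # keep block j only if its increment dominates both neighbours and u_n
        if max(d_prev, d_next, u_n) < d_cur:
            total += F(c_hat[(j - 3) * k_n], delta(j))
    return total
```

On a noise-free path with a single jump of $c$, exactly one block passes the test and contributes $F$ evaluated at the pre-jump level and the full jump size.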
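At this stage, we can set, with the notation (2.11), (2.12) and (2.15), and also (2.9) and (2.13) for $G$ and $G'$:
\[
\widehat V(g)^n_t = V(g)^n_t + \frac{k_n \Delta_n}{2}\Big(g(\hat c^n_1) + g\big(\hat c^n_{[t/\Delta_n]-k_n+1}\big)\Big) - \sqrt{\Delta_n}\,\Big(\frac32 A^{n,2}_t + A^{n,3}_t\Big) - k_n \Delta_n\, \mathcal V(G - G')^n_t.
\]
Theorem 2.7 Under the assumptions of Theorem 2.2, and with $Z$ as in this theorem, for all $t \ge 0$ we have the finite-dimensional stable convergence in law
\[
\frac{1}{\sqrt{\Delta_n}}\,\big(\widehat V(g)^n_t - V(g)_t\big) \ \overset{\mathcal L_f - s}{\longrightarrow}\ Z_t.
\]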
Note that $\theta$ no longer explicitly appears in this statement, so one can replace (2.3) by the weaker statement
\[
k_n \asymp \frac{1}{\sqrt{\Delta_n}}
\]
(this is easily seen by taking subsequences $n_l$ such that $k_{n_l}\sqrt{\Delta_{n_l}}$ converge to an arbitrary limit in $(0, \infty)$).
It is simple to make this CLT “feasible”, that is, usable in practice for determining a confidence interval for $V(g)_t$ at any time $t > 0$. Indeed, we can define the following function on $\mathcal M_d^+$:
\[
h(x) = \sum_{j,k,l,m=1}^{d} \partial_{jk} g(x)\, \partial_{lm} g(x)\, \big(x^{jl} x^{km} + x^{jm} x^{kl}\big).
\]
We then have $V(h)^n \overset{\text{u.c.p.}}{\Longrightarrow} V(h)$, where $V(h)_t$ is the right hand side of (2.8). Then we readily deduce:
Corollary 2.8 Under the assumptions of the previous theorem, for any $t > 0$ we have the following stable convergence in law, where $Y$ is an $\mathcal N(0, 1)$ variable:
\[
\frac{\widehat V(g)^n_t - V(g)_t}{\sqrt{\Delta_n\, V(h)^n_t}} \ \overset{\mathcal L - s}{\longrightarrow}\ Y, \quad \text{in restriction to the set } \{V(h)_t > 0\}.
\]
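Corollary 2.8 translates directly into an asymptotic confidence interval. As a hedged illustration for $d = 1$, where the function $h$ above becomes $h(x) = 2\,g'(x)^2 x^2$, the Python sketch below returns the interval with half-width $z\sqrt{\Delta_n V(h)^n_t}$. The function name, the argument `dg` (the derivative of $g$), the default 97.5% normal quantile `z`, and the convention $V(h)^n_t = \Delta_n \sum_i h(\hat c^n_i)$ are assumptions of the example.

```python
import math
import numpy as np

def confidence_interval(estimate, c_hat, delta_n, dg, z=1.959963984540054):
    """Asymptotic interval estimate +/- z * sqrt(delta_n * V(h)^n_t), d = 1.

    `estimate` is the bias-corrected estimator of V(g)_t, `c_hat` the array of
    local variance estimates, `dg` the first derivative of g (array-valued).
    """
    # V(h)^n_t with h(x) = 2 g'(x)^2 x^2 (the d = 1 form of the function h)
    v_h = delta_n * np.sum(2.0 * dg(c_hat) ** 2 * c_hat ** 2)
    half_width = z * math.sqrt(delta_n * v_h)
    return estimate - half_width, estimate + half_width
```

The interval is only meaningful on the set where $V(h)_t > 0$, in line with the restriction stated in the corollary.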
Finally, let us mention that the estimators $\widehat V(g)^n_t$ enjoy exactly the same asymptotic efficiency properties as the estimators in [5], and we refer to this paper for a discussion of this topic.
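\[
\Delta_n \sum_{i=1}^{[t/\Delta_n]-k_n+1} \Big(1 - \frac{3}{k_n}\Big) (\hat c^n_i)^2 + \frac{\Delta_n}{4} \sum_{i=1}^{[t/\Delta_n]-2k_n+1} \big(\hat c^n_{i+k_n} - \hat c^n_i\big)^2 + \frac{(k_n - 1)\Delta_n}{2} \Big((\hat c^n_1)^2 + \big(\hat c^n_{[t/\Delta_n]-k_n+1}\big)^2\Big).
\]
The asymptotic variance is $8\int_0^t c_s^4\, ds$, to be compared with the asymptotic variance of the more usual estimator $\frac{1}{3\Delta_n} \sum_{i=1}^{[t/\Delta_n]} (\Delta^n_i X)^4$, which is $\frac{32}{3}\int_0^t c_s^4\, ds$.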
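The displayed estimator is what the general bias-corrected estimator becomes for $d = 1$ and $g(x) = x^2$ once the jump-correction term is left out (up to the factor $k_n - 1$ in place of $k_n$ in the boundary term), since then $\frac32\sqrt{\Delta_n}\,A^{n,2}_t = \frac{3}{k_n}\Delta_n \sum_i (\hat c^n_i)^2$ and $-\sqrt{\Delta_n}\,A^{n,3}_t = \frac{\Delta_n}{4}\sum_i (\hat c^n_{i+k_n} - \hat c^n_i)^2$. The Python sketch below implements it next to the usual fourth-power estimator. The function names, the optional truncation level `trunc`, and the simulated Brownian path with constant unit volatility are assumptions made for illustration.

```python
import numpy as np

def local_variance(dx, k_n, delta_n, trunc=np.inf):
    """Local variance estimates c-hat_i^n from windows of k_n increments."""
    sq = np.where(np.abs(dx) <= trunc, dx ** 2, 0.0)
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    return (csum[k_n:] - csum[:-k_n]) / (k_n * delta_n)

def quarticity_estimator(dx, k_n, delta_n, trunc=np.inf):
    """Bias-corrected estimator of int_0^t c_s^2 ds (d = 1, g(x) = x^2)."""
    c_hat = local_variance(dx, k_n, delta_n, trunc)
    # main term with the (1 - 3/k_n) correction
    main = delta_n * np.sum((1.0 - 3.0 / k_n) * c_hat ** 2)
    # squared increments of c-hat over k_n steps, weight delta_n / 4
    diff = c_hat[k_n:] - c_hat[:-k_n]
    increments = 0.25 * delta_n * np.sum(diff ** 2)
    # boundary correction at both ends of the sample
    border = 0.5 * (k_n - 1) * delta_n * (c_hat[0] ** 2 + c_hat[-1] ** 2)
    return main + increments + border

def naive_quarticity(dx, delta_n):
    """The more usual estimator (1 / (3 delta_n)) * sum of fourth powers."""
    return np.sum(dx ** 4) / (3.0 * delta_n)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000
    delta = 1.0 / n
    dx = rng.normal(0.0, np.sqrt(delta), n)   # c_s = 1, so the target is 1
    k_n = int(round(1.0 / np.sqrt(delta)))    # theta = 1
    print(quarticity_estimator(dx, k_n, delta), naive_quarticity(dx, delta))
```

With $k_n \sqrt{\Delta_n}$ close to a fixed $\theta$, both estimators target the same quantity; the stated asymptotic variances suggest the first one fluctuates slightly less.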
3 Proofs
3.1 Preliminaries
According to the localization Lemma 4.4.9 of [4] (for the assumption (K) in that
lemma), it is enough to show all four Theorems 2.2, 2.5–2.7 under the following
stronger assumption.
Assumption (SA’-r ): We have (A’-r ). Moreover, we have for a λ-integrable function
J on E and a constant A:
b , b , b , c ,
c , c , J ≤ A,
δ(ω, t, z) ≤ J (z),
r
δ(ω, t, z) 2 ≤ J (z). (3.1)
In the sequel, we thus suppose that X satisfies (SA’-r ), and also that (2.3) holds:
these assumptions are typically not recalled. Below, all constants are denoted by K ,
and they vary from line to line. They may implicitly depend on the process X (usually
through A in (3.1)). When they depend on an additional parameter p, we write K p .
We will usually replace the discontinuous process X by the continuous process
t t
Xt = bs ds + σs dWs , (3.2)
0 0
connected with X by X t = X 0 + X t + s≤t X s . Note that b is bounded, and
without loss of generality we will use below its càdlàg version. Note also that, since
the jumps of c are bounded, one can rewrite (2.2) as
t t t
ct = c0 +
bs ds +
σs dWs +
δ(s, z) (μ − ν)(ds, dz).
0 0 0 E
This amounts to replacing b in (2.2) by bt+ + E δ(t+, z)(κ( δ(t+, z) )
− 1{ δ(t+,z) ≤1} ) λ(dz), where κ is a continuous function with compact support,
equal to 1 on the set [0, A]. Note that the new process
b is bounded càdlàg.
With any process Z we associate the variables
η(Z )t,s = E supv∈(t,t+s] Z t+v − Z t 2 | Ft , (3.3)
Lemma 3.1 For all t > 0, all bounded càdlàg processes Z , and all sequences
[t/n ]
vn ≥ 0 of real numbers tending to 0, we have n E i=1 η(Z )(i−1)n ,vn → 0,
and for all 0 ≤ v ≤ s we have E(η(Z )nt+v,s | Ft ) ≤ η(Z )t,s .
on some space (, F, (Ft )t≥0 , P), which may be different from the one on which
X is defined, as well as W and μ, but we still suppose that the intensity measure ν is
the same. Note that Y0 = 0 here. We assume that for some constant A and function
J Y we have, with cY = σ Y σ Y,∗ :
bY ≤ A, cY ≤ A2 , δ Y (ω, t, z) 2 ≤ J Y (z) ≤ A2 , J Y (z) λ(dz) ≤ A2 .
E
(3.4)
t
The compensator of the quadratic variation of Y is of the form 0 csY ds, where
ctY = ctY + E δ Y (t, z) δ Y (t, z)∗ λ(dz). Moreover, if the process cY is itself an Itô
semimartingale, the quadratic
t Ycovariation of the continuous martingale parts of Y
and cY is also of the form 0 cs ds for some process c Y , necessarily bounded if both
Y and cY satisfy (3.4) (and, if Y = X , we have cY = c and cY = c ).
Lemma 3.2 Below we assume (3.4), and the constant K only depends on A.
(a) We have for t ∈ [0, 1]:
E(Yt | F0 ) − tbY ≤ t η(bY )0,t ≤ K t
0
√ (3.5)
E(Yt j Y m | F0 ) − tcY, jm ≤ K t (t + t η(bY )0,t + η(cY )0,t ) ≤ K t,
t 0
Proof The first part of (3.5) follows by taking the F0 -conditional expectation in the
t
decomposition Yt = Mt + tb0Y + 0 (bsY − b0Y ) ds, where M is a d-dimensional
martingale with M0 = 0. For the second part, we deduce from Itô’s formula that
Y j Y m is the sum of a martingale vanishing at 0 and of
t t t t
j j j j j
b0 Ysm ds + b0m Ys ds + Ysm (bs − b0 ) ds + Ys (bsm − b0m ) ds
0 0 0 0
t
Y, jm Y, jm Y, jm
+ c0 t + (cs − c0 ) ds.
0
√
Since E( Yt | F0 ) ≤ K A t, as in (3.9), we deduce the second part of (3.5) and also
(3.6) by taking again the conditional expectation and by using the Cauchy-Schwarz
inequality and the first part.
j
Equation (3.7) is a part of Lemma 4.1 of [5]. For (3.8), we first observe that Yt Ytk −
Y, jk Y,lm Y,lm
tc0 = Bt + Mt and ct − c0 = Bt + Mt , with M and M martingales (M
is continuous). The processes B, B , M, M, M , M and M, M are absolutely
continuous, with densities bs , bs , h s , h s and h s satisfying, by (3.4) for Y and cY :
j
where h s = Ys c Y,k,lm + Ysk
c Y, j;lm . Again as in (3.9) below, E( Yt q | F0 ) ≤
Kq t q/2 for all q, and E( ct − c0Y 2 | F0 ) ≤ K t. This yields E(Bt2 | F0 ) ≤ K t 3 and
Y
(1) We begin with well known estimates for X and c, under (3.1) and for s, t ≥ 0
and q ≥ 0, see [4] for details:
E supw∈[0,s] X t+w − X t q | Ft ≤ K q s q/2 , E(X t+s − X t | Fs ) ≤ K s
E supw∈[0,s] ct+w − ct q | Ft ≤ K q s 1∧(q/2) , E(ct+s − ct | Fs ) ≤ K s.
(3.9)
(2) The jumps of $c$ also potentially cause trouble, so we will eliminate the “big” jumps as follows. For any $\rho > 0$ we consider the subset $E_\rho = \{z : J(z) > \rho\}$, which satisfies $\lambda(E_\rho) < \infty$, and we denote by $\mathcal G^\rho$ the σ-field generated by the variables $\mu([0, t] \times A)$, where $t \ge 0$ and $A$ runs through all Borel subsets of $E_\rho$. The process
\[
N^\rho_t = \mu((0, t] \times E_\rho) \tag{3.11}
\]
is a Poisson process and we let $S^\rho_1, S^\rho_2, \ldots$ be its successive jump times, and $\Omega_{n,t,\rho}$ be the set on which $S^\rho_j \notin \{i\Delta_n : i \ge 1\}$ for all $j \ge 1$ such that $S^\rho_j < t$, and $S^\rho_{j+1} > t \wedge S^\rho_j + (6k_n + 1)\Delta_n$ for all $j \ge 0$ (with the convention $S^\rho_0 = 0$; taking $6k_n$ here instead of the more natural $k_n$ will be needed in the proof of Theorem 2.6, and makes no difference here). All these objects are $\mathcal G^\rho$-measurable, and $P(\Omega_{n,t,\rho}) \to 1$ as $n \to \infty$, for all $t, \rho > 0$.
t
c(ρ)t = ct −
δ(s, z) μ(ds, dz) = c(1) (ρ)t + c(2) (ρ)t , where
0 Eρ
t t
c(1) (ρ)t = c0 + 0 b(ρ)s ds + 0 σs dWs (3.12)
t
c (ρ)t = 0 (E ρ )c
(2) δ(t−, z) (μ − ν)(ds, dz),
2 2
so c(ρ), which is Rd ⊗ Rd -valued, is the càdlàg version of the density of the pre-
dictable quadratic variation of c(ρ). Moreover G ρ = {∅, } and (
b(ρ), c(ρ)) = (
b, c)
when ρ exceeds the bound of the function J . Note also that b(ρ) and c(ρ) are càdlàg.
By Lemma 2.1.5 and Proposition 2.1.10 in [4] applied to each component of X
and c(2) (ρ), plus the property b(ρ) ≤ K /ρ, for all t ≥ 0, s ∈ [0, 1], ρ ∈ (0, 1],
q ≥ 2, we have
E supw∈[0,s] X t+w − X t q | Ft ∨ G ρ ≤ K q s q/2
E(X t+s − X t | Fs ∨ G ρ ) + E(c(ρ)t+s − c(ρ)t | Fs ∨ G ρ ) ≤ K s
E supw∈[0,s] c(2) (ρ)t+w − c(2) (ρ)t q | Ft ∨ G ρ ≤ K q φρ (s + s q/2 )
q
E supw∈[0,s] c(ρ)t+w − c(ρ)t q | Ft ∨ G ρ ≤ K q φρ s + s q/2 + ρs q ≤ K q,ρ s.
(3.13)
where φρ = (E ρ )c J (z) λ(dz) → 0 as ρ → 0. Note also that
b(ρ)t ≤ K /ρ.
n,ρ
All the above variables are Fi -measurable. Recalling (3.3), and writing
η(Z , (Ht ))t,s if we use the filtration (Ht ) instead of (Ft ), we also set
η(ρ)i,n j = max(η(Y, (G ρ Ft ))(i−1)n , jn : Y = b , c ,
b(ρ), c, c(ρ),
η(ρ)in = η(ρ)i,i+2k
n
n
.
Therefore, Lemma 3.1 yields for all t, ρ > 0 and j, k such that j + k ≤ 2kn :
[t/ ]
n n,ρ
n E η(ρ)in → 0, E(η(ρ)i+
n
j,k | Fi ) ≤ η(ρ)in . (3.15)
i=1
We still need some additional notation. First, define the $\mathcal G^\rho$-measurable (random) set of integers:
ρ ρ
L(n, ρ) = {i = 1, 2, . . . : N(i+2kn )n − N(i−1)n = 0} (3.16)
(taking above 2kn instead of kn is necessary for the proof of Theorem 2.5).
Observe that
the third inequality following from the first two ones, plus Burkholder-Gundy and
Hölder inequalities, and the last inequality from the third one and the boundedness
of ct . Moreover, since the set {i ∈ L(n, ρ)} is G ρ -measurable, the last part of (3.13),
(3.17), and Hölder’s inequality, readily yield
n
q/2
) ≤ K q
n,ρ q/4
q ≥ 2, i ∈ L(n, ρ) ⇒ E βin q
| Fi n φρ + n + .
ρq
(3.20)
(5) The previous estimates are not enough for us. We will apply the estimates of
Lemma 3.2 with Yt = X (i−1)n +t − X (i−1)n for any given pair n, i, and with the
filtration (F(i−1)n +t ∨ G ρ )t≥0 . We observe that on the set A(ρ, n, i) = {∃ j ≤ 2kn :
i − j ∈ L(n, ρ)}, which is G ρ -measurable, and because of (3.17), the process cY
coincides with c(ρ)(i−1)n +t − c(ρ)(i−1)n if t ∈ [0, n ]. Then in restriction to this
set, by (3.6) and (3.7) and by the definition of η(ρ)i,1
n , we have
√
E(n X j n X m | F n,ρ ) − cn, jm n ≤ K ρ n3/2 ( n + η(ρ)n )
i i i i i,1
n j n k n l n m
E X X X X | F n,ρ
i i i i i
c )2 ≤ K ρ n
n, jk n,lm n, jl n,km n, jm n,kl 5/2
− (ci c i +c c i +ci i i n
(the constant above depends on ρ, through the bound K /ρ for the drift of c(ρ)).
Then a simple calculation gives us
√
E(αn | F n,ρ ) ≤ K ρ n3/2 ( n + η(ρ)n )
i i i,1
n, jk n,lm on A(ρ, n, i).
E α αi
n,ρ n, jl
| Fi ) − (ci cin,km + ci ci )2n ≤ K ρ n
n, jm n,kl 5/2
i
(3.21)
i ∈ L(n, ρ), 0 ≤ t ≤ kn n ⇒
n, jklm
E((c jk jk n,ρ
(i−1)n +t − c(i−1)n )(c(i−1)n +t − c(i−1)n ) | Fi ) − tc(ρ)i
lm lm
(3.22)
≤ K ρ t η(ρ)i,k
n
n
E(c(i−1) +t − c(i−1) | F n,ρ ) − t b(ρ)in ≤ K ρ t η(ρ)i,k
n ≤ K p t.
n n i n
Moreover, the Cauchy-Schwarz inequality and (3.19) on the one hand, and (3.8)
applied with the process Yt = X (i−1)n +t − X (i−1)n on the other hand, give us
n,kl n n,ρ
E α i b(ρ)ms | Fi ≤ K n η(ρ)i,1
n
i
i ∈ L(n, ρ) ⇒ n,kl 3/2 √
(3.23)
E α in cms | Fi ≤ K ρ n ( n + η(ρ)i,1
n,ρ n ).
i
n, jk n,lm
Proof We set ζi,n j = αi+
n
j + (ci+ j − ci )n and write βi
n n βi as
kn −1 kn −2 k
n −1 kn −2 k
n −1
1 n, jk n,lm 1 n, jk n,lm 1 n,lm n, jk
ζ ζ + ζ ζ + ζi,u ζi,v .
kn2 2n i,u i,u
kn2 2n i,u i,v
kn2 2n
u=0 u=0 v=u+1 u=0 v=u+1
(3.24)
For the estimates below, we implicitly assume i ∈ L(n, ρ) and u, v ∈ {0, . . . , kn −1}.
First, we deduce from (3.21) and (3.22), plus (3.23) and successive condition-
ing, that
E(ζ n, jk ζ n,lm | F n,ρ ) − (cn, jl cn,km + cn, jm cn,kl )2 ≤ K n5/2 . (3.25)
i,u i,u i i i i i n
)n −
n, jk n,ρ n, jk n, jk n, jk
|E(ζi,v | Fi+u+1 ) − (ci+u+1 − ci b(ρ)i+u+1 2n (v − u − 1)|
3/2 √
≤ K n (kn n + η(ρ)i+v,1n )
n,lm n, jk n, jk n,ρ 3/2 √
|E(αi+u (ci+u+1 − ci+u ) | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 n )
n,lm n, jk n, jk n,ρ 3/2 √
|E(αi+u (ci+u − ci ) | Fi+u )| ≤ K n ( n + ηi+u,1 ) n
n,lm n, jk 3/2 √
(b(ρ)i+u+1 −
n, jk n,ρ
|E(αi+u b(ρ)i+u ) | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 n )
n,lm n, jk n,ρ 3/2 √
|E(αi+u b(ρ)i+u | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 ) n
term. The first claim of the lemma readily follows from this and (3.24) and (3.25).
The proof of the second claim is similar. Indeed, we have
kn −1
n, jk n,lm 1 n, jk n, jk n, jk n,lm
βi (ci+k − cin,lm ) = αi,u + (ci+u − ci )n ci+k − cin,lm
n k n n n
u=0
and
E(cn,lm − cn,lm | F n,ρ ) − cn,lm − cn,lm − n,lm
b(ρ)i+u+1 n (kn − u − 1)
i+kn i i+u+1 i+u+1 i
≤ K n η(ρ)i+u+1,k
n
n −u
.
(3.26)
n, jk
To see that the first claim holds, one expands the product γi γin,lm and uses suc-
cessive conditioning, the Cauchy-Schwarz inequality and (3.13), (3.17) and (3.22),
n, jk n, jk n, jk n, jk
βi βin,lm + βi+kn βi+k
n,lm
n
+ (ci+kn − ci n,lm
)(ci+k n
− cin,lm )
n, jk n,lm n, jk n, jk
− βi (ci+k n
− cin,lm ) − βin,lm (ci+kn − ci ).
For the second claim we use (3.13), (3.17) and (3.20), and it holds for all q ≥ 2.
Assumption (SA’-r ) is of course not fully used. One only needs the assumptions
concerning the process ct .
Proof With the notation (3.12), and for l = 1, 2 we define μ(l) (ρ)nj and U (l) (ρ)nt
as above, upon substituting c(ρ) and u n /4 with c(l) (ρ) and u n /8. Since U (ρ)nt ≤
4U (1) (ρ)nt + 4U (2) (ρ)nt , it suffices to prove the result for each U (l) (ρ)nt .
First, μ(1) (ρ)nj 2 1{ μ(1) (ρ)n >u n /8} is smaller than K μ(1) (ρ)nj 4 /u n2 , whereas
i
(recalling b(ρ) ≤ K /ρ) classical estimates yield E μ(1) (ρ)n 4 ≤ K n (1 +
j
(1) 1/2−2
n /ρ). Thus the expectation of U (ρ)t is less than K n
n (1+n /ρ), yielding
the result for U (1) (ρ)nt .
[t/k ]
Secondly, we have U (2) (ρ)nt ≤ j=3n n μ(2) (ρ)in 2 and the first part of (3.13)
√
yields E μ(2) (ρ)in 2 ≤ K φρ n . Since φρ → 0 as ρ → 0, the result for U (1) (ρ)nt
follows.
We start the proof of Theorem 2.2 by giving a decomposition of V (g)n − V (g), with
cin = cin + βin and on the definition
quite a few terms. It is based on the key property
(3.18) of αin and βin . A simple calculation shows that √1 (V (g)nt − V (g)t ) =
5 n, j
n
The leading term is V n,3 , the bias comes from the terms V n,4 and V n,5 , and the
first two terms are negligible, in the sense that they satisfy
n, j P
j = 1, 2 ⇒ Vt −→ 0 for all t > 0. (3.28)
cin ) − g(
|g( cin )| ≤ K (1 +
cin +
cin ) p−1
cin −
cin
cin ) p−1
≤ K (1 + cin −
cin + K
cin −
cin p
.
Recalling the last part of (3.19), we deduce from (3.10), together with the fact that
1 − r − p(1 − 2 ) < (2−r 2q
)
for all q > 1 small enough and Hölder’s inequality
(2 p−r ) +1− p
cin ) − g(
that E(|g( cin )|) ≤ K an n . Therefore
(2 p−r ) +1/2− p
E sup |Vsn,1 | ≤ K tan n
s≤t
(functional stable convergence in law), where Z is the process defined in Theorem 2.2.
A change of order of summation allows us to rewrite V n,3 as
[t/n ] (i−1)∧(k
n −1)
1 n,lm n,lm 1
Vtn,3 = √ wi αi , where win,lm = ∂lm g(ci−
n
j ).
n i=1 l,m kn
j=(i−[t/n ]+kn −1)+
Observe that win and αin are measurable with respect to Fin and Fi+1
n , respectively, so
[t/n
]−kn +1
1 P
√ win,lm E(αin,lm | Fin ) −→ 0 (3.30)
n i=1
]−kn +1
[t/n
1 n, jk n, jk
wi win,lm E(αi αin,lm | Fin )
n
i=1
P
t jl jm
−→ ∂ jk g(cs ) ∂lm g(cs ) cs cskm + cs cskl ds (3.31)
0
]−kn +1
[t/n
1 P
win 4
E( αin 4
| Fin ) −→ 0 (3.32)
2n
i=1
[t/n
]−kn +1
1 P
√ win,lm E(αin,lm in N | Fin ) −→ 0, (3.33)
n i=1
a way, we can apply all estimates of the previous subsections with the conditioning
σ-fields Fin . Therefore (3.19) and the property win ≤ K readily imply (3.30) and
(3.32). In view of the form of αin , a usual argument (see e.g. [4]) shows that in fact
E(αin,lm in N | Fin ) = 0 for all N as above, hence (3.33) holds.
For (3.31), by (3.21) it suffices to prove that
]−kn +1
[t/n
n, jk n, jl n,km n, jm n,kl
n wi win,lm (ci ci + ci ci )
i=1
P
t jl jm
−→ ∂ jk g(cs ) ∂lm g(cs ) cs cskm + cs cskl ds.
0
n, jk n, jk
In view of the definition of win , for each t we have wi(n,t) → ∂ jk g(ct ) and ci(n,t) →
jk
ct almost surely if |i(n, t)n − t| ≤ kn n (recall that c is almost surely continuous
at t, for any fixed t), and the above convergence follows by the dominated convergence
theorem, thus ending the proof of (3.29).
\[
V^{n,4}_t \ \overset{\mathbb P}{\longrightarrow}\ \frac{\theta}{2} \sum_{l,m} \int_0^t \partial_{lm} g(c_{s-})\, dc^{lm}_s - \theta\, g(c_t). \tag{3.34}
\]
We call $V'^{n,4}_t$ and $V''^{n,4}_t$, respectively, the first sum and the last integral in the definition of $V^{n,4}_t$. Since $k_n\sqrt{\Delta_n} \to \theta$ and $c$ is a.s. continuous at $t$, it is obvious that $V''^{n,4}_t$ converges almost surely to $-\theta\, g(c_t)$, and it remains to prove the convergence of $V'^{n,4}_t$ to the first term in the right side of (3.34).
We first observe that ci+un − cin = u−1 v=0 i+v c. Then, upon changing the order
n
[t/
n ]−1
n,4
Vt = win,lm in clm ,
i=1 l,m
√ (i−1)∧(k
n n −2)
win,lm = (kn − 1 − u)∂lm g(ci−u
n
).
kn
u=0∨(i+kn −1−[t/n ])
√
In other words, recalling kn n ≤ K and ∂g(cs ) ≤ K , we see that
t
Vt n,4 = H (n, t)lm
s dcs ,
lm
l,m 0
(its expression on $[0, k_n\Delta_n)$ and on $(t - k_n\Delta_n, t]$ is more complicated, but not needed, apart from the fact that it is uniformly bounded). Now, since $\sum_{u=0}^{k_n-2} (k_n - 1 - u) = k_n^2/2 + O(k_n)$ as $n \to \infty$, we observe that $H(n, t)^{lm}_s$ converges to $\frac{\theta}{2}\, \partial_{lm} g(c_{s-})$ for all $s \in (0, t)$. Since $c$ is a.s. continuous at $t$, we deduce from the dominated convergence theorem for stochastic integrals that $V'^{n,4}_t$ indeed converges in probability to the first term in the right side of (3.34).
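\[
V^{n,5}_t \ \overset{\mathbb P}{\longrightarrow}\ A^2_t - 2A^3_t + \theta \sum_{s \le t} \int_0^1 \Big(g(c_{s-} + w\Delta c_s) - g(c_{s-}) - w \sum_{l,m} \partial_{lm} g(c_{s-})\, \Delta c^{lm}_s\Big)\, dw. \tag{3.35}
\]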
[t/n ]−kn +1
We have Vtn,5 = i=1 vin , where
vin = n g(cin + βin ) − g(cin ) − ∂lm g(cin ) βin,lm .
l,m
We also set
kn −1 n kn −1 n
αin = kn1n u=0 αi+u , β in = βin − αin = k1n u=1 (ci+u − cin ),
√
vi n = n g(cin + β in ) − g(cin ) − ∂lm g(cin ) β in,lm , vi n = vin − vi n .
l,m
(3.36)
We take ρ ∈ (0, 1], and will eventually let it go to 0. With the sets L(n, ρ) of
(3.16), we associate
n,ρ
n,ρ
n,ρ
Ut = vin , Ut = vi n , Ut = vi n . (3.37)
i∈L(n,ρ,t) i∈L(n,ρ,t) i∈L(n,ρ,t)
Therefore
3
U n,ρ = U ( j)n,ρ , where U ( j)nt = v( j)in . (3.38)
j=1 i∈L(n,ρ,t)
√
the càdlàg property of c and c(ρ) and kn n → θ imply
]−kn +1
[t/n t
P ρ θ jklm
W (ρ)nt := n w(ρ)in −→U (1)t := A2t + ∂ 2jk,lm g(cs ) c(ρ)s ds.
6 0
i=1 j,k,l,m
1/4
On the other hand, Lemma 3.3 yields |v(1)in − n w(ρ)in | ≤ K ρ n (n + η(ρ)in )
when i ∈ L(n, ρ), whereas |w(ρ)in | ≤ K always. Therefore
[t/
n ]
n,ρ
E |U (1)t − W (ρ)nt | ≤ K ρ n E ( n + η(ρ)in ) + K n E #(L(n, ρ, t)) .
i=1
ρ
Now, √#(L(n, ρ, t)) is not bigger than (2kn +1)Nt , implying that n E(#(L (n, ρ, t)))
≤ K ρ n . Taking advantage of (3.15), we deduce that the above expectation goes
to 0 as n → ∞, and thus
n,ρ P ρ
U (1)t −→ U (1)t . (3.39)
n,ρ n,ρ
Next, v(2)in is Fi+kn -measurable, with vanishing Fi -conditional expectation,
n,ρ
and each set {i ∈ L(n, ρ)} is F0 -measurable. It follows that
n,ρ n,ρ
E (U (2)t )2 ≤ 2kn E i∈L(n,ρ,t) E |v(2)i | | Fi
n 2
n4 √
n,ρ
≤ K k n n E i∈L(n,ρ,t) E |β i | | Fi ≤ K tφρ + K ρ t n ,
where we have applied (3.20) for the last inequality. Another application of the same
estimate gives us
1/4
E |U (3)nt |) ≤ K tφρ + K ρ tn .
ρ ρ
(B) The processes U n,ρ . We will use here the jump times S1 , S2 , . . . of the Poisson
process N ρ , and will restrict our attention to the set n,t,ρ defined before (3.12),
whose probability goes to 1 as n → ∞. On this set, L(n, ρ, t) is the collection of
ρ ρ
all integers i which are between [Sq /n ] − 2kn + 2 and [Sq /n ] + 1, for some q
ρ
between 1 and Nt . Thus
ρ ρ
Nt [Sq /n ]+1
n,ρ
Ut = H (n, ρ, q), where H (n, ρ, q) = vi n . (3.41)
ρ
q=1 i=[Sq /n ]−2kn +1
ρ
The behavior of each H (n, ρ, q) is a pathwise question. We fix q and set S = Sq and
an = [S/n ], so S > an n because S is not a multiple of n . For further reference
we consider a case slightly more general than strictly needed here. We have cin → c S−
when an − 6kn + 1 ≤ i ≤ an + 1 and cin → c S when an + 2 ≤ i ≤ an + 6kn ,
uniformly in i (for each given outcome ω). Hence
which implies
n −3
k
u u
H (n, ρ, q) − n g c Sq − + c Sq − g(c Sq − ) − ∂lm g(c Sq − ) clm
S →0
kn kn q
u=1 l,m
Henceforth, we have
ρ
Nt
n,ρ P ρ
1
Ut −→Ut := θ g(c Sq − + wc Sq ) − g(c Sq − ) − w ∂lm g(c Sq − ) clm
Sq dw.
q=1 0 l,m
(3.43)
(C) The processes U n,ρ . Since |β in | ≤ K we deduce from (2.6) that |vi n | ≤
√ n,ρ q/4
K n ( αin + αin p ). (3.19) yields E( αin q | Fi ) ≤ K q n for all q > 0.
Therefore n,ρ 3/4 1/4
E |Ut | ≤ K n E(#(L(, n, ρ, t))) ≤ K ρ n ,
n,ρ P
Ut −→ 0. (3.44)
(D) Proof of (3.35). On the one hand, V n,5 = U (1)n,ρ + U (2)n,ρ + U (3)n,ρ +
U n,ρ + U n,ρ ; on the other hand, the dominated convergence theorem (observe that
ρ P
c(ρ)t →
σt2 for all t) yields that U (1)t −→A2 − 1
2 A3t and
ρ P
1
Ut −→θ g(cs− + wcs ) − g(cs− ) − w ∂lm g(cs− ) cslm dw
s≤t 0 l,m
as ρ → 0 (for the latter convergence, note that |g(x +y)−g(x)− l,m ∂lm g(x)y lm | ≤
K y 2 when x, y stay in a compact set). Then the property (3.35) follows from (3.39),
(3.40), (3.43) and (3.44).
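(E) Proof of Theorem 2.2. We are now ready to prove Theorem 2.2. Recall that $\frac{1}{\sqrt{\Delta_n}}\big(V(g)^n_t - V(g)_t\big) = \sum_{j=1}^{5} V^{n,j}_t$. By virtue of (3.28), (3.29), (3.34), (3.35), it is enough to check that
\[
A^1_t + A^3_t + A^4_t = \frac{\theta}{2} \sum_{l,m} \int_0^t \partial_{lm} g(c_{s-})\, dc^{lm}_s - \theta\, g(c_t) - 2A^3_t + \theta \sum_{s \le t} \int_0^1 \Big(g(c_{s-} + w\Delta c_s) - g(c_{s-}) - w \sum_{l,m} \partial_{lm} g(c_{s-})\, \Delta c^{lm}_s\Big)\, dw.
\]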
1
so the desired equality is immediate (use also 0 w dw = 1
2 ), and the proof of
Theorem 2.2 is complete.
The proof of Theorem 2.5 follows the same line as in Sect. 3.8, and we begin with
an auxiliary step.
Step (1) Replacing cin . The summands in the definition (2.12) of An,3
cin by are
t
R(ci ,
n ci+kn ), where R(x, y) =
n ∂ 2
j,k,l,m jk,lm g(x)(y jk − x jk )(y lm − x lm ), and
we set
√ [t/n]−2kn +1
n
Atn,3 = − cin ,
R( n
ci+k n
).
8
i=1
for all t, and this is done as in the step j = 1 in Sect. 3.5. The function R is C 1 on
R2+ with ∂ j R(x, y) ≤ K (1 + x + y ) p− j for j = 0, 1, by (2.6). Thus
cin ,
|R( n
ci+k n
) − R(
cin , n
ci+k n
)|
≤ K (1 +
cin + n
ci+k n
) ) p−1 (
cin −
cin + n
ci+k n
− n
ci+k n
)
+K
cin −
cin p
+K n
ci+k n
− n
ci+k n
p
.
P 1 2
Atn,3 −→ − A + A3t + At4 .
2 t
Step (2) From now on we use the same notation as in Sect. 3.8, although they denote
different variables or processes. For any ρ ∈ (0, 1] we have A n,3 = U n,ρ + U n,ρ +
U n,ρ , as defined in (3.37), but with
√
n
vin = − 8 R(cin + βin , ci+k
n
n
+ βi+k
n
n
)
√
vi n = − 8n R(cin + β in , ci+k
n
n
+ β i+k
n
n
), vi n = vin − vi n .
Use
cin − cin = βin and (2.6) and a Taylor expansion to check that
|v(3)in | ≤ K n γin 2
βin (1 + βin ) p−3 .
√
We also have |v(2)in | ≤ K n γin 2, hence (3.20) and (3.26) yield
n
E(|v(3)in | | G ρ ) + E(|v(2)in |2 | G ρ ) ≤ K n φρ + n
1/4
+ ,
ρp
and thus (3.40) holds here as well, by the same argument. Moreover, (3.26) again
yields (3.39), with now
ρ
t θ jklm 1 jl km jm
Ut = − ∂ 2jk,lm g(cs ) c(ρ)s + (cs cs + cs cskl ) ds.
12 4θ
j,k,l,m 0
n,ρ P ρ ρ P
ρ > 0 ⇒ Ut −→Ut , with, as ρ → 0, Ut −→At4 . (3.46)
Step (3) On the set n,t,ρ we have (3.41) and we study H (n, ρ, q), in the same way
as before, on the set n,t,ρ . We fix q and set S = Sq and an = [S/n ]. We then apply
(3.42) and also cin → c S− or cin → c S , according to whether an −2kn +1 ≤ i ≤ an +1
or an +2 ≤ i ≤ an +kn , to obtain vi n −v in → 0, uniformly in i between an −2kn +1
and an + 1, where
⎧
⎪
⎪ 0 if an − 2kn + 1 ≤ i ≤ an − 2kn + 2
⎪ (2kn −an +i−2)2 √n
⎪
⎪ jk
⎨− j,k,l,m ∂ jk,lm g(c S− ) c S c S
2 lm
⎪ 8kn2
vi =
n
if an − 2kn + 3 ≤i ≤ an − kn + 1
⎪
⎪ 2 √
⎪ (a −i+2) kn −an +i+2 jk
j,k,l,m ∂ jk,lm g c S− + c S c S clm
2
⎪
⎪
n n
⎪
⎩
8kn2 kn S
if an − kn + 2 ≤ i ≤ an + 1.
ρ Ntρ
which is θG (c Sq − , c Sq ), hence the first part of (3.46), with Ut = θ q=1 G
(c Sqρ − , c Sqρ ). The second part of (3.46) follows from the dominated convergence
theorem, and the proof of Theorem 2.5 is complete.
The proof is once more somewhat similar to the proof of Sect. 3.8, although the way
cin by
we replace cin and further by αin + βin is different.
(A) Preliminaries. The jth summand in (2.15) involves several estimators cin , span-
ning the time interval (( j − 3)kn n , ( j + 2)kn n ]. It is thus convenient to replace
the sets L(n, ρ), L(n, ρ, t) and L(n, ρ, t), for ρ, t > 0, by the following ones:
ρ ρ
L (n, ρ) = { j = 3, 4, . . . : N( j+2)kn n − N( j−3)kn n = 0}
L (n, ρ, t) = {3, . . . , [t/kn n ] − 3} ∩ L (n, ρ)
L (n, ρ, t) = {3, . . . , [t/kn n ] − 3} ∩ (N\L (n, ρ)).
n,ρ n,ρ
For any ρ ∈ (0, 1] we write V(F)nt = Vt + V t , where
v nj = F(
cn , δ n
c) 1{ δnj−1c ∨ δnj+1c ∨u n < δnjc }
n,ρ ( j−3)kn +1 jn n,ρ
Vt = j∈L (n,ρ,t) v j , V t = j∈L (n,ρ,t) v nj .
We also set
δ nj
c =
c n n +1 −
c(nj−2)kn +1 , δ nj β = β njkn +1 − β(nj−2)kn +1 ,
jk
2
w j = m=−3
n c( j+m)kn +1 −
n c( j+m)kn +1 , w jn = (1 +
n c(nj−3)kn +1 ) p−1 (1 + δ nj
c )2 .
n,ρ q/4
q/2
n
i ∈ L (n, ρ) ⇒ E δ nj
c q
| F( j−2)kn +1 ) ≤ K q n φρ + n + .
ρq
(3.48)
|v nj | ≤ K (1 +
c(nj−3)kn +1 ) p−2 δ nj
c 2 1{ δ nj
c >u n } + K δ nj
c p.
[t/k
n n ]
n,ρ n,ρ
Bt = a nj , Ct = δ nj
c p, Dtn = a jn .
j∈L (n,ρ,t) j∈L (n,ρ,t) j=3
l(q,v)
First, (3.47) and Hölder’s inequality give us E(a jn ) ≤ K q,v n for any q > 1
and v > 0, where (recalling (2.7) and (2.14) for and ) we have set l(q, v) =
q − p(1 − 2 ) ∨ v(1 − 2 + ) . Upon choosing v small enough and q close
1−r
enough to 1, and in view of (2.7), we see that l(q, v) > 21 , thus implying
E(Dtn ) → 0. (3.49)
n,ρ n
p/2
| Gρ
p/4
E Ct ≤ KE E δ nj
c p
≤ K t φρ + n + ,
ρp
i∈L (n,ρ,t)
n,ρ
The analysis of Bt is more complicated. We have δ nj
c = z nj + z jn , where
1
kn
z nj = αnjkn +1 − αn( j−2)kn +1 , z jn = (cnjkn +m − c(nj−2)kn +m )
kn
m=1
n,ρ,m zn 3
Bt = j∈L (n,ρ,t) a(m)nj , a(1)nj = (1 + c(nj−3)kn +1 ) p−2 uj ,
n
a(2)nj = z jn 2 1{ z n >u n /4} , a(3)nj = c(nj−3)kn +1 p−1 z jn 2 .
i
On the other hand, observe that z jn = μ(ρ)nj , with the notation (3.27), and as soon
as j ∈ L (n, ρ), so Lemma 3.4 gives us
n,ρ,2
lim lim sup E Bt = 0. (3.53)
ρ→0 n→∞
n,ρ √
Finally, (3.13) shows that E( z jn q | F( j−2)kn +1 ) ≤ K q,ρ n for all q ≥ 2 and
j ∈ L (n, ρ), whereas c(nj−3)kn +1 is F(nj−2)kn +1 -measurable, so (3.13), (3.19) and
√
successive conditioning yield E(a(3)nj | G ρ ) ≤ K q,ρ n . Then, again as for (3.52),
one obtains n,ρ,3
E Bt ≤ K ρ t. (3.54)
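(C) The processes $\overline V^{n,\rho}$. With the previous notation $S^\rho_j$ and $N^\rho_t$, and on the set $\Omega_{n,\rho,t}$, we have
\[
\overline V^{n,\rho}_t = \sum_{m=1}^{N^\rho_t}\ \sum_{j=-2}^{2} v^n_{[S^\rho_m/k_n\Delta_n]+j}. \tag{3.56}
\]
This is a finite sum (bounded in $n$ for each $\omega$). Letting $S = S^\rho_m$ for $m$ and $\rho$ fixed and $w_n = \frac{S}{k_n\Delta_n} - \big[\frac{S}{k_n\Delta_n}\big]$, we know that for any given $j \in \mathbb Z$ the variable $\hat c^n_{([S/k_n\Delta_n]+j)k_n+1}$ converges in probability to $c_{S-}$ if $j < 0$ and to $c_S$ if $j > 0$, whereas for $j = 0$ we have $\hat c^n_{[S/k_n\Delta_n]k_n+1} - w_n c_{S-} - (1 - w_n) c_S \overset{\mathbb P}{\longrightarrow} 0$. This in turn implies
\[
j < 0 \ \text{or}\ j > 2 \ \Rightarrow\ \delta^n_{[S/k_n\Delta_n]+j}\,\hat c \ \overset{\mathbb P}{\longrightarrow}\ 0,
\]
\[
\delta^n_{[S/k_n\Delta_n]}\,\hat c - (1 - w_n)\Delta c_S \overset{\mathbb P}{\longrightarrow} 0, \qquad \delta^n_{[S/k_n\Delta_n]+1}\,\hat c \overset{\mathbb P}{\longrightarrow} \Delta c_S, \qquad \delta^n_{[S/k_n\Delta_n]+2}\,\hat c - w_n \Delta c_S \overset{\mathbb P}{\longrightarrow} 0.
\]
By virtue of the definition of $v^n_j$, and since $u_n \to 0$ and also since $w_n$ is almost surely in $(0, 1)$ and $F$ is continuous and $F(x, 0) = 0$, one readily deduces that
\[
v^n_{[S/k_n\Delta_n]+j} \ \overset{\mathbb P}{\longrightarrow}\
\begin{cases}
F(c_{S-}, \Delta c_S) & \text{if } j = 1,\\
0 & \text{if } j \ne 1.
\end{cases}
\]
Summing over $m$ and $j$ in (3.56) gives
\[
\overline V^{n,\rho}_t \ \overset{\mathbb P}{\longrightarrow}\ \overline V^{\rho}_t := \sum_{m=1}^{N^\rho_t} F\big(c_{S^\rho_m-}, \Delta c_{S^\rho_m}\big). \tag{3.57}
\]
In view of (2.16), an application of the dominated convergence theorem gives $\overline V^\rho_t \to \mathcal V(F)_t$. Then (2.17) follows from $\mathcal V(F)^n_t = V^{n,\rho}_t + \overline V^{n,\rho}_t$ and (3.55) and (3.57). The proof of Theorem 2.6 is complete.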
Acknowledgments We are grateful to the referee for his/her very careful reading of the paper.
References
1. Alvarez, A., Panloup, P., Pontier, M., Savy, N.: Estimation of the instantaneous volatility. Stat. Inference Stoch. Process. 15, 27–59 (2010)
2. Clément, E., Delattre, S., Gloter, A.: An infinite dimensional convolution theorem with applications to the efficient estimation of the integrated volatility. Stoch. Process. Appl. 123, 2500–2521 (2013)
3. Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes, 2nd edn. Springer, Berlin (2003)
4. Jacod, J., Protter, P.: Discretization of Processes. Springer, Berlin (2012)
5. Jacod, J., Rosenbaum, M.: Quarticity and other functionals of volatility: efficient estimation. Ann. Stat. 41, 1462–1484 (2013)
6. Vetter, M.: Limit theorems for bipower variation of semimartingales. Stoch. Process. Appl. 120, 22–38 (2010)
7. Dette, H., Podolskij, M., Vetter, M.: Estimation of integrated volatility in continuous-time financial models with applications to goodness-of-fit testing. Scand. J. Stat. 33(2), 259–278 (2006)