Large Deviations and Asymptotic Methods in Finance by Peter K. Friz, Jim Gatheral, Archil Gulisashvili, Antoine Jacquier, Josef Teichmann (Eds.)

This document provides information about a book titled "Large Deviations and Asymptotic Methods in Finance". The book contains selected contributions from workshops and conferences in mathematics and statistics related to current research areas. It focuses on large deviations and asymptotic methods, which involve analyzing stochastic processes in limits like the short-time limit. These methods provide expansions of transition densities and heat kernels that can be used to study models in finance and capture behaviors like implied volatility surfaces. The book contains chapters applying these asymptotic techniques to specific financial models.


Springer Proceedings in Mathematics & Statistics

Peter K. Friz
Jim Gatheral
Archil Gulisashvili
Antoine Jacquier
Josef Teichmann Editors

Large Deviations
and Asymptotic
Methods in
Finance
Volume 110

This book series features volumes composed of selected contributions from workshops and conferences in all areas of current research in mathematics and statistics, including operation research and optimization. In addition to an overall evaluation of the interest, scientific quality, and timeliness of each proposal at the hands of the publisher, individual contributions are all refereed to the high quality standards of leading journals in the field. Thus, this series provides the research community with well-edited, authoritative reports on developments in the most exciting areas of mathematical and statistical research today.

More information about this series at http://www.springer.com/series/10533


Editors

Peter K. Friz
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
and
Weierstraß-Institut für Angewandte Analysis und Stochastik, Berlin, Germany

Jim Gatheral
Department of Mathematics, City University of New York Baruch College, New York, NY, USA

Archil Gulisashvili
Department of Mathematics, Ohio University, Athens, OH, USA

Antoine Jacquier
Department of Mathematics, Imperial College London, London, UK

Josef Teichmann
Department of Mathematics, ETH Zürich, Zürich, Switzerland

ISSN 2194-1009 ISSN 2194-1017 (electronic)


Springer Proceedings in Mathematics & Statistics
ISBN 978-3-319-11604-4 ISBN 978-3-319-11605-1 (eBook)
DOI 10.1007/978-3-319-11605-1

Library of Congress Control Number: 2015935733

Mathematics Subject Classification (2010): 91G80, 60H30, 60F10, 91G20

Springer Cham Heidelberg New York Dordrecht London


© Springer International Publishing Switzerland 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)
Contents

Probability Distribution in the SABR Model of Stochastic Volatility
Patrick Hagan, Andrew Lesniewski and Diana Woodward

Asymptotic Implied Volatility at the Second Order with Application to the SABR Model
Louis Paulot

Unifying the BGM and SABR Models: A Short Ride in Hyperbolic Geometry
Pierre Henry-Labordère

Second Order Expansion for Implied Volatility in Two Factor Local Stochastic Volatility Models and Applications to the Dynamic λ-Sabr Model
Gérard Ben Arous and Peter Laurence

General Asymptotics of Wiener Functionals and Application to Implied Volatilities
Yasufumi Osajima

Implied Volatility of Basket Options at Extreme Strikes
Archil Gulisashvili and Peter Tankov

Small-Time Asymptotics for the At-the-Money Implied Volatility in a Multi-dimensional Local Volatility Model
Christian Bayer and Peter Laurence

A Remark on Gatheral’s ‘Most-Likely Path Approximation’ of Implied Volatility
Martin Keller-Ressel and Josef Teichmann

Implied Volatility from Local Volatility: A Path Integral Approach
Tai-Ho Wang and Jim Gatheral

Extrapolation Analytics for Dupire’s Local Volatility
Peter Friz and Stefan Gerhold

The Gärtner-Ellis Theorem, Homogenization, and Affine Processes
Archil Gulisashvili and Josef Teichmann

Asymptotics for d-Dimensional Lévy-Type Processes
Matthew Lorig, Stefano Pagliarani and Andrea Pascucci

Asymptotic Expansion Approach in Finance
Akihiko Takahashi

On Small Time Asymptotics for Rough Differential Equations Driven by Fractional Brownian Motions
Fabrice Baudoin and Cheng Ouyang

On Singularities in the Heston Model
Vladimir Lucic

On the Probability Density Function of Baskets
Christian Bayer, Peter K. Friz and Peter Laurence

On Small-Noise Equations with Degenerate Limiting System Arising from Volatility Models
Giovanni Conforti, Stefano De Marco and Jean-Dominique Deuschel

Long Time Asymptotics for Optimal Investment
Huyên Pham

Systemic Risk and Default Clustering for Large Financial Systems
Konstantinos Spiliopoulos

Estimation of Volatility Functionals: The Case of a √n Window
Jean Jacod and Mathieu Rosenbaum

Introduction

In a sense, this book is a celebration of the Black-Scholes model. Widely criticized
for its shortcomings, ever since the dramatic Long Term Capital Management
meltdown in 1998, this ‘first generation’ model is still the first benchmark in
financial modelling. May it be the Heston, Stein-Stein, Bergomi, local volatility,
local-stochastic volatility, Lévy, uncertain volatility or fancy hybrid model: they are
all perturbations of the Black-Scholes model, typically either making volatility
stochastic or introducing jumps. If Black-Scholes assumes, say 20–40 % volatility,
all the above extensions more or less agree with this order of magnitude for overall
volatility, and indeed the around-the-money volatility smile is plainly a perturbation
of the flat implied volatility corresponding to Black-Scholes.
The aforementioned extensions do not come with tractable option price formulae, since most diffusion processes do not admit closed-form transition densities. The second best situation is a closed-form Fourier transform of the transition density, and many of the aforementioned extensions share this property, in addition to explaining some stylized facts from the dynamics of the implied volatility surface.
An alternative to the Fourier approach is given by asymptotic expansions of
transition densities of stochastic processes, say in the short-time (or more generally
small-noise) limit. Such investigations go back to S.R. Srinivasa Varadhan in the
late 1960s and are intimately connected to his theory of large deviations: it is pretty
unlikely for a particle starting at some position to diffuse to some other position if
there is almost no time to do so (or if the driving noise is switched off). The beauty
of large deviations is to explicitly identify a precise scale rough enough to be
computable (or at least to be characterized in terms of some variational problem),
and fine enough to capture the most important leading-order behaviour of the
system. In the context of transition densities, or heat-kernels in PDE terminology,
complete expansions have been derived in the 1970s and 1980s, with a bulk of
geometric information hidden in the coefficients. The Russian school has also been
fundamental in the development of (sample path) large deviations for stochastic
processes, in particular through the works of Mark Freidlin and Alexander Wentzell
in the 1970s. On a historical note, it is interesting to remember that large deviations theory was originally developed (in the finite-dimensional case) by Harald Cramér in the 1930s for actuarial mathematics.
A widely circulated preprint by Patrick Hagan et al. (following the famous SABR paper), first presented in 2001 by Andrew Lesniewski at the Courant Finance Seminar, intensified the connection between heat-kernels, geometry and finance. The resulting SABR formula has become industry standard in fixed income modelling (and presumably a long-time headache for quants tortured by risk management). The topic was further explored by a number of people including Marco Avellaneda, Christian Bayer, Gérard Ben Arous, Jérôme Busca, Jean-Dominique Deuschel, Martin Forde, Pierre Henry-Labordère, Elton Hsu, Peter Laurence, Cheng Ouyang and many others (including, unsurprisingly, all the editors of this volume).
Despite the undisputed mathematical depth of this development, the agenda has
been largely initiated by people in or near the industry, a quick publication not
always being their first priority. This, at least, is our only explanation for the fact
that some key papers have remained preprints ever since, though widely circulating
and used for years. We also note that the derivation of closed-form approximation
formulae in various non-tractable models remains a constant topic in major
academic and industry meetings alike, not to mention some specialist meetings
(Vienna 2009, Berlin 2011, London 2013) organized by factions of the present
group of editors in different constellations.
The present proceedings grew on this fertile ground. Contributions include some unpublished classics (in brushed-up versions), notably the aforementioned preprint by Patrick Hagan et al. as well as recent works touching the theme of large deviations and/or asymptotic expansions in mathematical finance.
The editors have known each other for a long time. The idea for this book project
was born in July 2013, but the first step towards realization was overshadowed by a
sad event: we are still shocked that our esteemed colleague and friend, whom we
had invited to co-edit this volume, has never received his invitation: Peter Laurence
passed away unexpectedly in August 2013.
Peter Laurence was born in New York, NY, on 27 March 1952. After undergraduate
courses at the Wharton School of Finance and Commerce at the
University of Pennsylvania, he obtained a Bachelor of Science in Mathematics and
Philosophy degree in 1973. He also obtained a Master of Science degree (1977)
and a Ph.D. degree (1981) from the University of Wisconsin Madison. From
1974–1991, Peter was a faculty member at the University of Wisconsin, the
Courant Institute of Mathematical Sciences at New York University, Worcester
Polytechnic Institute, Pennsylvania State University, and the University of Milano,
Italy. From 1991 till his untimely death in 2013, Peter was a professor at Sapienza
Università di Roma and a visiting scholar at the Courant Institute. Peter published
more than 60 research papers, co-authored a book “Quantitative Methods of
Derivative Securities: From Theory to Practice” with Marco Avellaneda, and was
one of the editors of the volume “Quantitative Energy Finance: Modeling, Pricing,
and Hedging in Energy and Commodity Markets”. His long-term friend Marco Avellaneda remembers: Peter had an infinite joie de vivre… This involved a lot of research in Math Physics, one of his passions. I enjoyed discussing Math Physics with him. We also began our interest in finance in the 90s and co-authored [our] book. He had a kind heart. May he rest in peace and live in us who are still here. Or as Bruno Dupire articulates it: He was a gentleman and will be missed.
Each of us has stories to tell about Peter and his inexhaustible passion for mathematics and its impact on finance. Instead of trying to fit them in this introduction we rather let him speak through mathematics: some of Peter Laurence’s final contributions to mathematical finance do appear in these proceedings, with the kind agreement of the respective co-authors.
We are indebted to all the reviewers who helped us achieve this work. It is also our pleasure to thank Magdalena Mueller-Laurence, as well as the Springer Proceedings team, without whom this book would never have appeared.

January 2015 Peter K. Friz


Jim Gatheral
Archil Gulisashvili
Antoine Jacquier
Josef Teichmann
Probability Distribution in the SABR Model
of Stochastic Volatility

Patrick Hagan, Andrew Lesniewski and Diana Woodward

Abstract We study the SABR model of stochastic volatility (Wilmott Mag, 2003
[10]). This model is essentially an extension of the local volatility model (Risk
7(1):18–20 [4], Risk 7(2):32–39, 1994 [6]), in which a suitable volatility parameter is
assumed to be stochastic. The SABR model admits a large variety of shapes of volatility
smiles, and it performs remarkably well in the swaptions and caps/floors markets.
We refine the results of (Wilmott Mag, 2003 [10]) by constructing an accurate and
efficient asymptotic form of the probability distribution of forwards. Furthermore,
we discuss the impact of boundary conditions at zero forward on the volatility smile.
Our analysis is based on a WKB type expansion for the heat kernel of a perturbed
Laplace-Beltrami operator on a suitable hyperbolic Riemannian manifold.

Keywords SABR · Heat kernel expansion · WKB expansion · Implied volatility · Asymptotic smile formula

1 Introduction

The SABR model [10] of stochastic volatility attempts to capture the dynamics of smile in the interest rate derivatives markets which are dominated by caps/floors and swaptions. It provides a parsimonious, accurate, intuitive, and easy to implement framework for pricing, position management, and relative value in those markets. The model describes the dynamics of a single forward (swap or LIBOR) rate with stochastic volatility. The dynamics of the model is characterized by a function $C(f)$ of the forward rate $f$ which determines the general shape of the volatility skew, a parameter $v$ which controls the level of the volatility of volatility, and a parameter $\rho$ which governs the correlation between the changes in the underlying forward rate and its volatility. It is an extension of Black’s model: choosing $v = 0$ and $C(f) = f$ reduces SABR to the lognormal Black model, while $v = 0$ and $C(f) = 1$ reduces it to the normal Black model.

P. Hagan (✉) · D. Woodward
Gorilla Science, 7700 NE Palm Way, Boca Raton, FL 33487, USA
A. Lesniewski
Department of Mathematics, Baruch College, CUNY, 1 Bernard Baruch Way, New York, NY 10010, USA

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods in Finance, Springer Proceedings in Mathematics & Statistics 110, DOI 10.1007/978-3-319-11605-1_1

The main reason why the SABR model has proven effective in the industrial
setting is that, even though it is too complex to allow for a closed form solution,
it has an accurate asymptotic solution. This solution, as well as its implications for
pricing and risk management of interest derivatives, has been described in [10].
In this paper we further refine the results presented in [10]. Our developments
go in two directions. First, we present a more systematic framework for generating
an accurate, asymptotic form of the probability distribution in the SABR model.
Secondly, we address the issue of low strikes, or the behavior of the model as the
forward rate approaches zero.
Our way of thinking has been strongly influenced by the asymptotic techniques
which go by the names of the geometric optics or the WKB method, and, most
importantly, by the classical results of Varadhan [19, 20] (see also [13, 18] for
more recent presentations and refinements). These techniques allow one to relate
the short time asymptotics of the fundamental solution (or the Green’s function)
of Kolmogorov’s equation to the differential geometry of the state space. From the
probabilistic point of view, the Green’s function represents the transition probability
of the diffusion, and it thus carries all the information about the process.
Specifically, let $U$ denote the state space of an n-dimensional diffusion process with no drift, and let $G_X(s, x)$, $x, X \in U$, denote the Green’s function. We also assume that the process is time homogeneous, meaning that the diffusion matrix is independent of $s$. Then, Varadhan’s theorem states that

$$\lim_{s \to 0} s \log G_X(s, x) = -\frac{d(x, X)^2}{2}.$$

Here $d(x, X)$ is the geodesic distance on $U$ with respect to a Riemannian metric which is determined by the coefficients of the Kolmogorov equation. This gives us the leading order behavior of the Green’s function. To extract usable asymptotic information about the transition probability, more accurate analysis is necessary, but the choice of the Riemannian structure on $U$ dictated by Varadhan’s theorem turns out to be key. Indeed, that Riemannian geometry becomes an important bookkeeping tool in carrying out the calculations, rather than merely fancy language. Technically speaking, we are led to studying the asymptotic properties of the perturbed Laplace-Beltrami operator on a Riemannian manifold.
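
As a sanity check on the statement of Varadhan’s theorem, the following minimal sketch evaluates $s \log G_X(s, x)$ for the simplest possible case, standard one-dimensional Brownian motion, where the geodesic distance is just $|x - X|$. The function names are illustrative only and do not come from the paper; the logarithm of the kernel is computed analytically to avoid floating-point underflow at small $s$.

```python
import math

def log_heat_kernel_1d(s, x, X):
    """log of the 1D Brownian transition density exp(-(x-X)^2/(2s)) / sqrt(2 pi s)."""
    return -(x - X) ** 2 / (2.0 * s) - 0.5 * math.log(2.0 * math.pi * s)

def varadhan_lhs(s, x, X):
    """The quantity s * log G_X(s, x) whose s -> 0 limit appears in Varadhan's theorem."""
    return s * log_heat_kernel_1d(s, x, X)

x, X = 0.0, 1.5
limit = -(x - X) ** 2 / 2.0  # -d(x, X)^2 / 2 with Euclidean distance d = |x - X|
for s in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"s = {s:g}:  s*log G = {varadhan_lhs(s, x, X):.6f}   limit = {limit:.6f}")
```

The correction term $-(s/2)\log(2\pi s)$ vanishes as $s \to 0$, which is why only the distance survives in the limit.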
In order to explain the results of this paper we define a universal function $D(\zeta)$:

$$D(\zeta) = \log \frac{\sqrt{\zeta^2 - 2\rho\zeta + 1} + \zeta - \rho}{1 - \rho},$$

where $\zeta$ is the following combination of today’s forward rate $f$, strike $F$, and a volatility parameter $\sigma$ (which is calibrated so that the at the money options prices match the market prices):

$$\zeta = \frac{v}{\sigma} \int_F^f \frac{du}{C(u)}.$$

The function $D(\zeta)$ represents a certain metric whose precise meaning is explained in the body of the paper. The key object from the point of view of option pricing is the probability distribution of forwards $P_F(\tau, f)$. Our main result in this paper is the explicit asymptotic formula:

$$P_F(\tau, f) = \frac{\exp\left\{-D(\zeta)^2 / 2\tau v^2\right\}}{\sqrt{2\pi\tau}\, \sigma C(F) \left(\cosh D(\zeta) - \rho \sinh D(\zeta)\right)^{3/2}} \,(1 + \cdots).$$

In order not to burden the notation, we have written down the leading term only; the
complete formula is stated in Sect. 5. To leading order, the probability distribution of
forwards in the SABR model is Gaussian with the metric D (ζ) replacing the usual
distance.
From this probability distribution, we can deduce explicit expressions for implied volatility. The normal volatility is given by:

$$\sigma_n = \sigma C(F) \left(\cosh D(\zeta) - \rho \sinh D(\zeta)\right) (1 + \cdots).$$

Precise formulas, including the subleading terms and the impact of boundary conditions at zero forward, are stated in Sect. 5. To calculate the corresponding lognormal volatility one can use the results of [11].
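
The leading-order formulas above are simple enough to evaluate directly. The sketch below does so under the assumption $C(u) = u^\beta$ with $0 \le \beta < 1$, for which the integral in $\zeta$ has the closed form $(f^{1-\beta} - F^{1-\beta})/(1-\beta)$. All function names and parameter values are hypothetical, and the $(1 + \cdots)$ corrections are simply dropped, so this illustrates the leading terms only.

```python
import math

def zeta_cev(f, F, sigma, v, beta):
    """zeta = (v / sigma) * integral_F^f du / C(u), assuming C(u) = u**beta, 0 <= beta < 1."""
    return (v / sigma) * (f ** (1.0 - beta) - F ** (1.0 - beta)) / (1.0 - beta)

def D(zeta, rho):
    """D(zeta) = log((sqrt(zeta^2 - 2 rho zeta + 1) + zeta - rho) / (1 - rho))."""
    root = math.sqrt(zeta * zeta - 2.0 * rho * zeta + 1.0)
    return math.log((root + zeta - rho) / (1.0 - rho))

def leading_density(tau, f, F, sigma, v, rho, beta):
    """Leading term of P_F(tau, f); higher-order corrections are omitted."""
    d = D(zeta_cev(f, F, sigma, v, beta), rho)
    geom = math.cosh(d) - rho * math.sinh(d)
    prefactor = math.sqrt(2.0 * math.pi * tau) * sigma * F ** beta * geom ** 1.5
    return math.exp(-d * d / (2.0 * tau * v * v)) / prefactor

def leading_normal_vol(f, F, sigma, v, rho, beta):
    """Leading term of the normal volatility sigma_n at strike F."""
    d = D(zeta_cev(f, F, sigma, v, beta), rho)
    return sigma * F ** beta * (math.cosh(d) - rho * math.sinh(d))

# Illustrative parameters (not taken from the paper).
f, sigma, v, rho, beta, tau = 0.03, 0.2, 0.4, -0.3, 0.5, 1.0
for F in (0.02, 0.03, 0.04):
    print(F, leading_density(tau, f, F, sigma, v, rho, beta),
          leading_normal_vol(f, F, sigma, v, rho, beta))
```

At the money ($F = f$) one has $\zeta = 0$ and $D(0) = 0$, so the leading normal volatility reduces to $\sigma C(f)$. A short calculation from the definition of $D$ also gives $\cosh D(\zeta) - \rho \sinh D(\zeta) = \sqrt{\zeta^2 - 2\rho\zeta + 1}$, which can serve as a numerical cross-check.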
We would like to mention that other stochastic volatility models have been exten-
sively studied in the literature (notably among them the Heston model [12]). Useful
presentations of these models are contained in [5, 17].
A comment on our style of exposition in this paper. We chose to present the arguments in an informal manner. In order to make the presentation self-contained, we present all the details of calculations, and do not rely on general theorems of differential geometry, stochastic calculus, or the theory of partial differential equations. And while we believe that all the results of this paper could be stated and proved rigorously as theorems, little would be gained and clarity might easily get lost in the course of doing so.
The paper is organized as follows. In Sect. 2 we review the model and formulate the basic partial differential equation, the backward Kolmogorov equation. We also introduce the Green’s function and discuss various boundary conditions at zero. Section 3 is devoted to the description of the differential geometry underlying the SABR model. We show that the stochastic dynamics defining the model can be viewed as a perturbation of the Brownian motion on a deformed Poincaré plane. The elliptic operator in the Kolmogorov equation turns out to be a perturbed Laplace-Beltrami operator. This differential geometric setup is key to our asymptotic analysis of the model which is carried through in Sect. 4. In Sect. 5 we derive the explicit formulas for the probability distribution and implied volatility which we have discussed above. In Appendix A we review the derivation of the fundamental solution of the heat equation on the Poincaré plane. This solution is the starting point of our perturbation expansion. Finally, Appendix B contains some useful asymptotic expansions.

2 SABR Model

In this section we describe the SABR model of stochastic volatility [10]. It is a two
factor model with the dynamics given by a system of two stochastic differential
equations. The state variables of the model can be thought of as the forward price
of an asset, and a volatility parameter. In order to derive explicit expressions for the
associated probability distribution and the implied volatility, we study the Green’s
function of the backward Kolmogorov operator.

2.1 Underlying Process

We consider a European option on a forward asset expiring T years from today. The
forward asset that we have in mind can be for instance a forward LIBOR rate, a
forward swap rate, or the forward yield on a bond. The dynamics of the forward in
the SABR model is given by:¹

$$
\begin{aligned}
dF_t &= \Sigma_t\, C(F_t)\, dW_t,\\
d\Sigma_t &= v\, \Sigma_t\, dZ_t.
\end{aligned} \tag{1}
$$

Here $F_t$ is the forward rate process, and $W_t$ and $Z_t$ are Brownian motions with

$$E\left[dW_t\, dZ_t\right] = \rho\, dt, \tag{2}$$

where the correlation $\rho$ is assumed constant. We supplement the dynamics (1) with the initial condition

$$F_0 = f, \qquad \Sigma_0 = \sigma. \tag{3}$$

¹ Note that our notation departs somewhat from the notation used in [10]: we use $\Sigma_t$ instead of $\alpha_t$ and $v$ instead of $\nu$. The name SABR is an acronym for “Stochastic Alpha Beta Rho” which was the name of the model originally used at Paribas.

Note that we assume that a suitable numeraire has been chosen so that $F_t$ is a martingale. The process $\Sigma_t$ is the stochastic component of the volatility of $F_t$, and $v$ is the volatility of $\Sigma_t$ (the “volvol”) which is also assumed to be constant.
The function $C(x)$ is defined for $x > 0$, and is assumed to be positive and smooth, with $1/C$ integrable near 0:

$$\int_0^K \frac{du}{C(u)} < \infty, \quad \text{for all } K > 0. \tag{4}$$

Two examples of $C$, which are particularly popular among financial practitioners, are functions of the form:

$$C(x) = x^\beta, \quad \text{where } 0 \le \beta < 1 \tag{5}$$

(stochastic CEV model), or

$$C(x) = x + a, \quad \text{where } a > 0 \tag{6}$$

(stochastic shifted lognormal model).

Our analysis uses an asymptotic expansion in the parameter $v^2 T$, and we thus require that $v^2 T$ be small. In practice, this is an excellent assumption for medium and longer dated options. Typical for shorter dated options are significant, discontinuous movements in implied volatility. The SABR model should presumably be extended to include such jump behavior of short dated options.
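
To make the dynamics (1) concrete, here is a minimal Monte Carlo sketch. It is not the method of the paper, which proceeds analytically; it only simulates the two equations under stated assumptions: $C(x) = x^\beta$ as in (5), an exact lognormal step for the volatility factor, a plain Euler step for the forward, and absorption of the forward at zero. The function name, the discretization, and all parameter values are hypothetical.

```python
import math
import random

def simulate_sabr(f, sigma, v, rho, beta, T, n_steps, n_paths, seed=0):
    """Simulate dF = Sigma C(F) dW, dSigma = v Sigma dZ with C(x) = x**beta.

    Sigma is advanced with its exact lognormal step; F takes an Euler step and
    is absorbed at zero (a choice made for this sketch only).
    Returns a list of (F_T, Sigma_T) pairs.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    ortho = math.sqrt(1.0 - rho * rho)
    out = []
    for _ in range(n_paths):
        F, Sig = f, sigma
        for _ in range(n_steps):
            z = rng.gauss(0.0, 1.0)                    # increment driving Sigma
            w = rho * z + ortho * rng.gauss(0.0, 1.0)  # increment driving F, corr(w, z) = rho
            if F > 0.0:                                # once absorbed, F stays at zero
                F = max(F + Sig * F ** beta * sqrt_dt * w, 0.0)
            Sig *= math.exp(v * sqrt_dt * z - 0.5 * v * v * dt)
        out.append((F, Sig))
    return out

# Illustrative run: v^2 T = 0.16 plays the role of the small expansion parameter.
paths = simulate_sabr(f=0.03, sigma=0.05, v=0.4, rho=-0.3, beta=0.5,
                      T=1.0, n_steps=50, n_paths=2000, seed=1)
mean_F = sum(F for F, _ in paths) / len(paths)
print("sample mean of F_T:", mean_F)
```

Since $F_t$ is a martingale, the sample mean of $F_T$ should stay close to $f$ up to Monte Carlo and discretization error, which gives a quick plausibility check on the simulation.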
The process $\Sigma_t$ is purely lognormal and thus $\Sigma_t > 0$ almost surely. Since, depending on the choice of $C(x)$, $F_t$ can reach zero with non-zero probability, we should take into account the boundary behavior of the process (1), as $F_t$ approaches 0. This can easily be done in the case of zero correlation between $W_t$ and $Z_t$, $\rho = 0$. We extend the function $C(x)$ to all values of $x$ by setting

$$C(-x) = C(x), \quad \text{for } x < 0. \tag{7}$$

The extended $C(x)$ is an even function, $C(-x) = C(x)$, for all values of $x$, and thus the process (1) is invariant under the reflection $F_t \to -F_t$. The state space of the extended process is thus the upper half plane. Later on in this paper we shall discuss the Dirichlet and Neumann boundary conditions for the SABR model.
A special case of (1) which will play an important role in our analysis is the case of $C(x) = 1$, and $\rho = 0$. In this situation, the basic equations of motion have a particularly simple form:

$$
\begin{aligned}
dF_t &= \Sigma_t\, dW_t,\\
d\Sigma_t &= v\, \Sigma_t\, dZ_t,
\end{aligned} \tag{8}
$$

with $E\left[dW_t\, dZ_t\right] = 0$. We shall refer to this model as the normal SABR model.

Local volatility [4, 6] is defined as the conditional expectation value

$$\sigma_K(T, f, \sigma)^2\, dT = E\left[(dF_t)^2 \,\middle|\, F(0) = f,\ F_t = K,\ \Sigma(0) = \sigma\right], \tag{9}$$

or, explicitly,

$$\sigma_K(T, f, \sigma)^2 = C(K)^2\, E\left[\Sigma_t^2 \,\middle|\, F(0) = f,\ F_t = K,\ \Sigma(0) = \sigma\right]. \tag{10}$$

Our analysis in the following sections enables us, in particular, to derive an explicit expression for $\sigma_K$.

2.2 Green’s Function

Green’s functions arise in finance as the prices of Arrow-Debreu securities. Equations (1)–(3) correspond to the Arrow-Debreu security whose payoff at time $T$ is given by Dirac’s delta function $\delta(F_T - F, \Sigma_T - \Sigma)$. The time $t < T$ price $G = G_{T,F,\Sigma}(t, f, \sigma)$ of this security is the solution to the following parabolic partial differential equation:

$$\frac{\partial G}{\partial t} + \frac{1}{2}\sigma^2 \left( C(f)^2 \frac{\partial^2 G}{\partial f^2} + 2v\rho\, C(f) \frac{\partial^2 G}{\partial f\, \partial \sigma} + v^2 \frac{\partial^2 G}{\partial \sigma^2} \right) = 0, \tag{11}$$

with the terminal condition:

$$G_{T,F,\Sigma}(t, f, \sigma) = \delta(f - F, \sigma - \Sigma), \quad \text{at } t = T. \tag{12}$$

This equation should also be supplemented by a boundary condition at infinity such that $G$ is financially meaningful. Since the payoff takes place only if the forward has a predetermined value in a finite amount of time, the value of the Arrow-Debreu security has to tend to zero as $F$ and $\Sigma$ become large:

$$G_{T,F,\Sigma}(t, f, \sigma) \to 0, \quad \text{as } F, \Sigma \to \infty. \tag{13}$$

Thus $G_{T,F,\Sigma}(t, f, \sigma)$ is a Green’s function for (11). Once we have constructed it, we can price any European option. For example, the price $C_{T,K}(t, f, \sigma)$ of a European call option struck at $K$ and expiring at time $T$ can be written in terms of $G_{T,F,\Sigma}(t, f, \sigma)$ as

$$C_{T,K}(t, f, \sigma) = \iint (F - K)^+\, G_{T,F,\Sigma}(t, f, \sigma)\, dF\, d\Sigma, \tag{14}$$

where, as usual, $(F - K)^+ = \max(F - K, 0)$, and where the integration extends over the upper half plane $\{(F, \Sigma) \in \mathbb{R}^2 : \Sigma > 0\}$.
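Note that the process (1) is time homogeneous, and thus $G_{T,F,\Sigma}(t, f, \sigma)$ is a function of the time to expiry $\tau = T - t$ only. Denoting

$$G_{F,\Sigma}(\tau, f, \sigma) \equiv G_{T,F,\Sigma}(t, f, \sigma),$$

and

$$C_K(\tau, f, \sigma) \equiv C_{T,K}(t, f, \sigma),$$

we can reformulate (11)–(12) as the initial value problem:

$$\frac{\partial G}{\partial \tau} = \frac{1}{2}\sigma^2 \left( C(f)^2 \frac{\partial^2 G}{\partial f^2} + 2v\rho\, C(f) \frac{\partial^2 G}{\partial f\, \partial \sigma} + v^2 \frac{\partial^2 G}{\partial \sigma^2} \right), \tag{15}$$

and

$$G_{F,\Sigma}(\tau, f, \sigma) = \delta(f - F, \sigma - \Sigma), \quad \text{at } \tau = 0. \tag{16}$$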

Introducing the marginal probability distribution

$$P_F(\tau, f, \sigma) = \int_0^\infty G_{F,\Sigma}(\tau, f, \sigma)\, d\Sigma, \tag{17}$$

we can express the call price (14) as

$$C_K(\tau, f, \sigma) = \int_{-\infty}^{\infty} (F - K)^+\, P_F(\tau, f, \sigma)\, dF. \tag{18}$$

This formula has familiar structure, and one of our main goals will be to derive a useful expression for $P_F(\tau, f)$.
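
Formula (18) can be exercised numerically with any candidate density. The sketch below integrates the call payoff against a density by the midpoint rule and, as a cross-check, uses the one case where everything is known in closed form: the normal Black model ($v = 0$, $C = 1$), in which the forward at expiry is Gaussian with mean $f$ and variance $\sigma^2 \tau$ and the call price is given by the classical Bachelier formula. The helper names and the quadrature settings are assumptions made for illustration.

```python
import math

def call_from_density(density, K, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of (F - K)^+ * density(F) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        F = lo + (i + 0.5) * h
        total += max(F - K, 0.0) * density(F)
    return total * h

def gaussian_density(f, sigma, tau):
    """Density of F at expiry in the normal Black model: Gaussian, mean f, variance sigma^2 tau."""
    s = sigma * math.sqrt(tau)
    return lambda F: math.exp(-0.5 * ((F - f) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def bachelier_call(f, K, sigma, tau):
    """Closed-form (undiscounted) call price in the normal model, used as a cross-check."""
    s = sigma * math.sqrt(tau)
    d = (f - K) / s
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))
    return s * (d * cdf + pdf)

# Illustrative parameters (not taken from the paper).
f, K, sigma, tau = 0.03, 0.035, 0.01, 2.0
s = sigma * math.sqrt(tau)
numeric = call_from_density(gaussian_density(f, sigma, tau), K, f - 10.0 * s, f + 10.0 * s)
print(numeric, bachelier_call(f, K, sigma, tau))
```

The same `call_from_density` helper accepts any other callable density, so an asymptotic approximation of $P_F$ could be plugged in the same way.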
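It is also easy to express the local volatility in terms of the Green’s function. Indeed,

$$\sigma_K(\tau, f, \sigma)^2 = \frac{C(K)^2 \int_0^\infty \Sigma^2\, G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma}{\int_0^\infty G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma}, \tag{19}$$

or

$$\sigma_K(\tau, f, \sigma) = C(K) \sqrt{\frac{M_K^2(\tau, f, \sigma)}{P_K(\tau, f, \sigma)}}, \tag{20}$$

where

$$M_K^2(\tau, f, \sigma) = \int_0^\infty \Sigma^2\, G_{K,\Sigma}(\tau, f, \sigma)\, d\Sigma \tag{21}$$

is the conditional second moment.
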
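We will solve (15)–(17) by means of asymptotic techniques. In order to set up the expansion, it is convenient to introduce the following variables:

$$s = \frac{\tau}{T}, \quad x = f, \quad X = F, \quad y = \frac{\sigma}{v}, \quad Y = \frac{\Sigma}{v},$$

and the rescaled Green’s function:

$$K_{X,Y}(s, x, y) = vT\, G_{X,vY}(Ts, x, vy).$$

In terms of these variables, the initial value problem (15) and (16) can be recast as:

$$
\begin{aligned}
\frac{\partial K}{\partial s} &= \frac{1}{2}\varepsilon y^2 \left( C(x)^2 \frac{\partial^2 K}{\partial x^2} + 2\rho\, C(x) \frac{\partial^2 K}{\partial x\, \partial y} + \frac{\partial^2 K}{\partial y^2} \right),\\
K(0, x, y) &= \delta(x - X, y - Y),
\end{aligned} \tag{22}
$$

where $K = K_{X,Y}$, and

$$\varepsilon = v^2 T. \tag{23}$$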

It will be assumed that $\varepsilon$ is small and it will serve as the parameter of our expansion. The heuristic picture behind this idea is that the volatility varies more slowly than the forward, and the rates of variability of $f$ and $\sigma/v$ are similar. The time $T$ defines the time scale of the problem, and thus $s$ is a natural dimensionless time variable. Expressed in terms of the new variables, our problem has a natural differential geometric content which is key to its solution.
Finally, let us write down the equations above for the normal SABR model:

$$
\begin{aligned}
\frac{\partial K}{\partial s} &= \frac{1}{2}\varepsilon y^2 \left( \frac{\partial^2 K}{\partial x^2} + \frac{\partial^2 K}{\partial y^2} \right),\\
K(0, x, y) &= \delta(x - X, y - Y).
\end{aligned} \tag{24}
$$

We will show later that this initial value problem has a closed form solution.

2.3 Boundary Conditions at Zero Forward

The problem as we have formulated it so far is not complete. Since the value of the forward rate should be positive,² we have to specify a boundary condition for the Green’s function at $x = 0$. Three commonly used boundary conditions are [9]:

• Dirichlet (or absorbing) boundary condition. We assume that the Green’s function, denoted by $K^D_{X,Y}(s, x, y)$, vanishes at $x = 0$,

$$K^D_{X,Y}(s, 0, y) = 0. \tag{25}$$

• Neumann (or reflecting) boundary condition. We assume that the derivative of the Green’s function at $x = 0$, normal to the boundary (and pointing outward), vanishes. Let $K^N_{X,Y}(s, x, y)$ denote this Green’s function; then

$$\frac{\partial}{\partial x} K^N_{X,Y}(s, 0, y) = 0. \tag{26}$$

• Robin (or mixed) boundary condition. The Green’s function, which we shall denote by $K^R_{X,Y}(s, x, y)$, satisfies the following condition. Given $\eta > 0$,

$$\left( -\frac{\partial}{\partial x} + \eta \right) K^R_{X,Y}(s, 0, y) = 0. \tag{27}$$

² Recent history shows that this is not always necessarily the case, but we regard such occurrences as anomalous.

From the financial point of view, the relevant boundary conditions are the Dirichlet
and Neumann conditions. It is well known that the Green’s functions corresponding
to these different boundary conditions obey the following conditioning inequalities:

$$ K^D \le K \le K^N. \tag{28} $$

Since the Dirichlet boundary condition corresponds to the stochastic process being killed at the boundary, the total mass of the Green's function is less than one:

$$ \int K^D_{X,Y}(s, x, y)\, dx\, dy < 1. \tag{29} $$

The remaining probability is a Dirac delta function at $x = 0$. On the other hand, for the free and Neumann boundary conditions,

$$ \int K_{X,Y}(s, x, y)\, dx\, dy = \int K^N_{X,Y}(s, x, y)\, dx\, dy = 1, \tag{30} $$

and so they are bona fide probability distributions.


Our method allows for deriving explicit expressions for the Green’s functions in
the case of zero correlation. In this case, the differential operator in (22) is invariant
under a Z2 group action given by the reflection x → −x of the upper half plane.
This allows us to construct the desired Green’s functions by means of the method
of images. Namely, let K X,Y (s, x, y) denote now the solution to (22) with C(x)
10 P. Hagan et al.

extended to the entire upper half plane, as explained in Sect. 2.1.³ Then, one verifies readily that

$$ K^D_{X,Y}(s, x, y) = K_{X,Y}(s, x, y) - K_{X,Y}(s, -x, y), \tag{31} $$

and

$$ K^N_{X,Y}(s, x, y) = K_{X,Y}(s, x, y) + K_{X,Y}(s, -x, y) \tag{32} $$

are the solutions to the Dirichlet and Neumann problems, respectively.
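The image construction (31) and (32) can be illustrated with a short Python sketch. The function `free_kernel` below is a hypothetical stand-in (a flat two-dimensional Gaussian), chosen only because it shares the reflection symmetry of the zero-correlation case; it is not the SABR Green's function.

```python
import math

def free_kernel(s, x, y, X, Y):
    """Toy free-boundary kernel (assumption: flat 2D Gaussian heat kernel).

    It is used only because it is invariant when both x and X change sign,
    which is the symmetry the method of images relies on.
    """
    r2 = (x - X) ** 2 + (y - Y) ** 2
    return math.exp(-r2 / (2.0 * s)) / (2.0 * math.pi * s)

def dirichlet_kernel(s, x, y, X, Y):
    """Eq. (31): subtract the image source located at -x."""
    return free_kernel(s, x, y, X, Y) - free_kernel(s, -x, y, X, Y)

def neumann_kernel(s, x, y, X, Y):
    """Eq. (32): add the image source located at -x."""
    return free_kernel(s, x, y, X, Y) + free_kernel(s, -x, y, X, Y)

def d_dx(kernel, s, x, y, X, Y, h=1e-5):
    """Central finite difference in x, used to inspect the Neumann condition."""
    return (kernel(s, x + h, y, X, Y) - kernel(s, x - h, y, X, Y)) / (2.0 * h)
```

With this toy kernel, `dirichlet_kernel` vanishes at x = 0, the x-derivative of `neumann_kernel` vanishes there, and the ordering (28) holds for x, X > 0.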

2.4 Solving the Initial Value Problem

It is easy to write down a formal solution to the initial value problem (22). Let L
denote the partial differential operator

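$$ L = \frac{1}{2}\, y^2 \left( C(x)^2 \frac{\partial^2}{\partial x^2} + 2\rho\, C(x) \frac{\partial^2}{\partial x\,\partial y} + \frac{\partial^2}{\partial y^2} \right), \tag{33} $$

supplemented by a suitable boundary condition at $x = 0$. Consider the one-parameter semigroup of operators

$$ U(s) = \exp(s\varepsilon L). \tag{34} $$

Then $U$ solves the following initial value problem:

$$ \frac{\partial U}{\partial s} = \varepsilon L U, \qquad U(0) = I, $$

and thus the Green's function $K_{X,Y}(s, x, y)$ is the integral kernel of $U(s)$:

$$ K_{X,Y}(s, x, y) = U(s)(x, y; X, Y). \tag{35} $$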

In order to solve the problem (22) it is thus sufficient to construct the semigroup
U (s) and find its integral kernel. Keeping in mind that our goal is to find an explicit
formula for K X,Y (s, x, y), the strategy will be to represent L as the sum

$$ L = L_0 + V, \tag{36} $$

where $L_0$ is a second order differential operator with the property that

$$ U_0(s) = \exp(s\varepsilon L_0) \tag{37} $$

3 This solution ignores any boundary condition at x = 0 and is sometimes referred to as the Green’s
function with a free boundary condition.

can be represented in closed form. Specifically, we will proceed in several steps. We


start with the normal SABR model defined in Sect. 2.1, and notice that the corre-
sponding operator L is a well known object, namely the generator of the Brownian
motion on the upper half-plane. The integral kernel of the semigroup U (s) generated
by this operator can be represented as an explicit integral over the real axis. Next we
observe that the general SABR model can naturally be mapped on the normal SABR
model by means of a suitable diffeomorphism φ. We find that, under this mapping,
the operator L is the sum of two parts: (i) the pullback of the generator of the Brown-
ian motion on the upper half-plane, denoted by L 0 , and (ii) a perturbation V . The
kernel of the semigroup generated by L 0 has an explicit integral representation. The
operator V turns out to be a differential operator of first order, and we will treat it as
a small perturbation of the operator L 0 .
The semigroup $U(s)$ can now be expressed in terms of $U_0(s)$ and $V$ as

$$ U(s) = Q(s)\, U_0(s). \tag{38} $$

Here, the operator $Q(s)$ is given by the well known regular perturbation expansion:

$$ Q(s) = I + \sum_{1 \le n < \infty} \int_{0 \le s_1 \le \dots \le s_n \le s\varepsilon} e^{s_1 \mathrm{ad}_{L_0}}(V) \cdots e^{s_n \mathrm{ad}_{L_0}}(V)\, ds_1 \cdots ds_n, \tag{39} $$

where $\mathrm{ad}_{L_0}$ is the commutator with $L_0$:

$$ \mathrm{ad}_{L_0}(V) = L_0 V - V L_0. \tag{40} $$

We will use the first few terms in the expansion above in order to construct an accurate approximation to the Green's function $K_{X,Y}(s, x, y)$:

$$ Q(s) = I + s\varepsilon V + \frac{1}{2}(s\varepsilon)^2 \left( \mathrm{ad}_{L_0}(V) + V^2 \right) + O\left((s\varepsilon)^3\right). \tag{41} $$
We shall disregard the convergence issues associated with this series, and use it solely
as a tool to generate an asymptotic expansion.
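As a finite-dimensional sanity check of the truncation (41), one can replace the operators by small matrices and compare exp(t(L0 + V)) with Q exp(t L0), where t plays the role of sε. The sketch below is only an illustration under that assumption; the helper names and the 2×2 matrices used to exercise it are arbitrary and not part of the model.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B, c=1.0):
    """Return A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by a plain Taylor series (adequate for small norms)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(matmul(term, A), 1.0 / k)
        result = add(result, term)
    return result

def q_second_order(L0, V, t):
    """Q = I + t V + (t^2 / 2) (ad_{L0}(V) + V^2), cf. (41) with t = s * eps."""
    ad = add(matmul(L0, V), matmul(V, L0), -1.0)   # L0 V - V L0
    second = add(ad, matmul(V, V))
    return add(add(identity(len(V)), scale(V, t)), scale(second, 0.5 * t * t))

def expansion_error(L0, V, t):
    """Largest entry of exp(t (L0 + V)) - Q(t) exp(t L0)."""
    exact = expm(scale(add(L0, V), t))
    approx = matmul(q_second_order(L0, V, t), expm(scale(L0, t)))
    return max(abs(e - a) for re, ra in zip(exact, approx)
               for e, a in zip(re, ra))
```

Halving t should reduce the error by roughly a factor of eight, consistent with the O((sε)³) remainder.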

3 Stochastic Geometry of the State Space

In solving our model we find that the normal SABR model represents Brownian
motion on the Poincare plane. Generally, when ρ ≠ 0 or C(x) ≠ 1, the model
amounts to Brownian motion on a two dimensional manifold, the SABR plane,
perturbed by a drift term. In this section we summarize a number of basic facts about
the differential geometry of the state space of the SABR model. The fundamental
geometric structure is that of the Poincare plane. We will show that the state space of
the SABR model can be viewed as a suitable deformation of the Poincare geometry.

3.1 SABR Plane

We begin by reviewing the Poincare geometry of the upper half plane which will
serve as the standard state space of our model. For a full (and very readable) account
of the theory the reader is referred to e.g. [1].
The Poincare plane (also known as the hyperbolic or Lobachevski plane) is the
upper half plane $\mathbb{H}^2 = \{(x, y) : y > 0\}$ equipped with the Poincare line element

$$ ds^2 = \frac{dx^2 + dy^2}{y^2}. \tag{42} $$

This line element comes from the metric tensor given by

$$ h = \frac{1}{y^2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{43} $$

The Poincare plane admits a large group of symmetries. We introduce complex coordinates on $\mathbb{H}^2$, $z = x + iy$ (the defining condition then reads $\mathrm{Im}\, z > 0$), and consider a Moebius transformation

$$ z' = \frac{az + b}{cz + d}, \tag{44} $$

where a, b, c, d are real numbers with ad − bc = 1. We verify easily the following two facts.
• Transformation (44) is a biholomorphic map of $\mathbb{H}^2$ onto itself.
• The Poincare metric is invariant under (44).
As a consequence, the Lie group

$$ SL(2, \mathbb{R}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{R},\ ad - bc = 1 \right\} \tag{45} $$

acts holomorphically and isometrically on $\mathbb{H}^2$. This symmetry group plays very much the same role in hyperbolic geometry as the Euclidean group in the usual Euclidean geometry of the plane $\mathbb{R}^2$.
In order to study the SABR model with the Dirichlet or Neumann boundary
conditions at zero forward, we define the following reflection θ : H2 → H2 :

θ (x, y) = (−x, y) (46)

(clearly, this is a reflection with respect to the y-axis). The key fact about θ is that it
is an involution, i.e.
θ ◦ θ (z) = z. (47)

One can also write θ as $\theta(z) = -\bar{z}$, which shows that it is an anti-holomorphic map of $\mathbb{H}^2$ into itself. It is easy to find the set of fixed points of θ, namely the points on the Poincare plane which are left invariant by θ:

$$ \theta(x, y) = (x, y) \iff x = 0, \tag{48} $$

i.e. it is the positive y-axis.
Let $d(z, Z)$ denote the geodesic distance between two points $z, Z \in \mathbb{H}^2$, $z = x + iy$, $Z = X + iY$, i.e. the length of the shortest path connecting z and Z. There is an explicit expression for $d(z, Z)$:

$$ \cosh d(z, Z) = 1 + \frac{|z - Z|^2}{2yY}, \tag{49} $$

where $|z - Z|$ denotes the Euclidean distance between z and Z. In particular, if x = X, then $d(z, Z) = |\log(y/Y)|$. We also note that the reflection θ is an isometry with respect to this metric, $d(\theta(z), \theta(Z)) = d(z, Z)$.
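The distance formula (49) and its invariance properties are easy to probe numerically. The following Python sketch represents points of the upper half plane as complex numbers; the function names are illustrative choices, not notation from the text.

```python
import math

def poincare_distance(z, Z):
    """Geodesic distance on the upper half plane, eq. (49)."""
    return math.acosh(1.0 + abs(z - Z) ** 2 / (2.0 * z.imag * Z.imag))

def moebius(z, a, b, c, d):
    """Moebius transformation (44); the caller must ensure a*d - b*c == 1."""
    return (a * z + b) / (c * z + d)

def reflect(z):
    """Reflection (46): theta(x + iy) = -x + iy."""
    return complex(-z.real, z.imag)
```

For example, with a, b, c, d = 2, 1, 3, 2 the distance between two points is unchanged after applying `moebius` to both, and two points on the same vertical line are separated by |log(y/Y)|.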
We also note that since $\det(h) = y^{-4}$, the invariant volume element on $\mathbb{H}^2$ is given by

$$ d\mu_h(z) = \sqrt{\det(h)}\, dx\, dy = \frac{dx\, dy}{y^2}. \tag{50} $$

The state space associated with the general SABR model has a somewhat more
complicated geometry. Let $\mathbb{S}^2$ denote the upper half plane $\{(x, y) : y > 0\}$, equipped with the following metric g:

$$ g = \frac{1}{(1 - \rho^2)\, y^2 C(x)^2} \begin{pmatrix} 1 & -\rho C(x) \\ -\rho C(x) & C(x)^2 \end{pmatrix}. \tag{51} $$

This metric is a generalization of the Poincare metric: the case of ρ = 0 and C(x) = 1 reduces to the Poincare metric. In fact, the metric g is the pullback of the Poincare metric under a suitable diffeomorphism. To see this, we define a map $\phi : \mathbb{S}^2 \to \mathbb{H}^2$ by

$$ \phi(z) = \left( \frac{1}{\sqrt{1 - \rho^2}} \left( \int_0^x \frac{du}{C(u)} - \rho y \right),\ y \right), \tag{52} $$

where z = (x, y). The Jacobian ∇φ of φ is

$$ \nabla\phi(z) = \begin{pmatrix} \dfrac{1}{\sqrt{1 - \rho^2}\, C(x)} & -\dfrac{\rho}{\sqrt{1 - \rho^2}} \\ 0 & 1 \end{pmatrix}, \tag{53} $$

and so $\phi^* h = g$, where $\phi^*$ denotes the pullback of φ. The manifold $\mathbb{S}^2$ is thus isometrically diffeomorphic with the Poincare plane. A consequence of this fact is that we have an explicit formula for the geodesic distance $\delta(z, Z)$ on $\mathbb{S}^2$:

$$ \cosh \delta(z, Z) = \cosh d(\phi(z), \phi(Z)) = 1 + \frac{\left( \int_X^x \frac{du}{C(u)} \right)^2 - 2\rho\,(y - Y) \int_X^x \frac{du}{C(u)} + (y - Y)^2}{2\,(1 - \rho^2)\, yY}, \tag{54} $$

where z = (x, y) and Z = (X, Y) are two points on $\mathbb{S}^2$. Since $\det(g) = (1 - \rho^2)^{-1} y^{-4} C(x)^{-2}$, the invariant volume element on $\mathbb{S}^2$ is given by

$$ d\mu_g(z) = \sqrt{\det(g)}\, dx\, dy = \frac{dx\, dy}{\sqrt{1 - \rho^2}\, C(x)\, y^2}. \tag{55} $$
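To see that (54) is indeed the Poincare distance between the images under the map (52), one can check the identity numerically. The sketch below assumes, purely for illustration, the CEV backbone C(u) = u^β with 0 ≤ β < 1, so that the integral of 1/C from 0 is finite; the function names are hypothetical.

```python
import math

def primitive(x, beta):
    """A(x) = integral from 0 to x of du / u**beta, valid for 0 <= beta < 1."""
    return x ** (1.0 - beta) / (1.0 - beta)

def phi(x, y, rho, beta):
    """Map (52) from the SABR plane to the Poincare plane (as a complex number)."""
    return complex((primitive(x, beta) - rho * y) / math.sqrt(1.0 - rho * rho), y)

def poincare_distance(z, Z):
    """Eq. (49)."""
    return math.acosh(1.0 + abs(z - Z) ** 2 / (2.0 * z.imag * Z.imag))

def sabr_distance(x, y, X, Y, rho, beta):
    """Eq. (54) for the assumed backbone C(u) = u**beta."""
    a = primitive(x, beta) - primitive(X, beta)   # integral from X to x of du / C(u)
    num = a * a - 2.0 * rho * (y - Y) * a + (y - Y) ** 2
    return math.acosh(1.0 + num / (2.0 * (1.0 - rho * rho) * y * Y))
```

For ρ = 0 and β = 0 the function `sabr_distance` reduces to the Poincare distance (49) itself.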

In the case of ρ = 0, the manifold $\mathbb{S}^2$ carries an isometric reflection θ which commutes with (52):

$$ \theta \circ \phi(z) = \phi \circ \theta(z), \tag{56} $$

i.e. θ is inherited from the corresponding reflection θ of the Poincare plane. Explicitly, θ(x, y) = (−x, y). Strictly speaking, this holds only if x ≠ 0, as the metric (51) explodes at the boundary x = 0.

3.2 Brownian Motion on the SABR Plane

It is no coincidence that the SABR model leads to the Poincare geometry. Indeed,
the dynamics of the normal SABR model is given by the Brownian motion on the
Poincare plane. In this section we shall establish this relationship, and use it in
Sect. 3.3 in order to find an explicit representation of the integral kernel of (37).
Recall [13] that the Brownian motion on the Poincare plane is described by the
following system of stochastic differential equations:

$$ dX_t = Y_t\, dW_t, \qquad dY_t = Y_t\, dZ_t, \tag{57} $$

with the two Wiener processes $W_t$ and $Z_t$ satisfying

$$ E[dW_t\, dZ_t] = 0. \tag{58} $$

Comparing this with the special case of the normal SABR model (8), we see that (8) reduces to (57) once we have made the following identifications:

$$ X_t = F_{v^2 t}, \qquad Y_t = \frac{1}{v}\, \sigma_{v^2 t}, \tag{59} $$

and used the scaling properties of a Wiener process:

$$ dW_{v^2 t} = v\, dW_t, \qquad dZ_{v^2 t} = v\, dZ_t. $$

Note that the system (57) can easily be solved in closed form: its solution is given by

$$ X_t = X_0 + Y_0 \int_0^t \exp\left( Z(s) - \frac{s}{2} \right) dW(s), \qquad Y_t = Y_0 \exp\left( Z_t - \frac{t}{2} \right). \tag{60} $$
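A small Monte Carlo sketch can make the system (57) concrete. The scheme below is an assumption made for illustration: it advances Y with the exact exponential update for dY = Y dZ and X with an Euler step, using independent Gaussian increments as in (58). Step counts, seed and function name are arbitrary.

```python
import math
import random

def simulate_poincare_bm(x0, y0, t_end, n_steps, n_paths, seed=0):
    """Simulate (57): dX = Y dW, dY = Y dZ with independent W and Z.

    Y is advanced with its exact exponential update, X with an Euler step.
    Returns the lists of terminal values (X_T, Y_T) over all paths.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    xs, ys = [], []
    for _ in range(n_paths):
        x, y = x0, y0
        for _ in range(n_steps):
            dw = sqrt_dt * rng.gauss(0.0, 1.0)
            dz = sqrt_dt * rng.gauss(0.0, 1.0)
            x += y * dw                     # Euler step for dX = Y dW
            y *= math.exp(dz - 0.5 * dt)    # exact step for dY = Y dZ
        xs.append(x)
        ys.append(y)
    return xs, ys
```

Since both X and Y are martingales, their sample means should stay close to the starting values, and Y remains positive, so the paths never leave the upper half plane.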

Let us now compare the SABR dynamics with that of the diffusion on the SABR
plane. In order to find the dynamics of Brownian motion on the SABR plane we use
the fact that there is a mapping (namely, (52)) of S2 into H2 . Using this mapping and
Ito’s lemma yields the following system

$$ dX_t = \frac{1}{2}\, Y_t^2\, C(X_t)\, C'(X_t)\, dt + Y_t\, C(X_t)\, dW_t, \qquad dY_t = Y_t\, dZ_t, \tag{61} $$

with the two Wiener processes $W_t$ and $Z_t$ satisfying

$$ E[dW_t\, dZ_t] = \rho\, dt. \tag{62} $$

Note that this is not exactly the SABR model dynamics. Indeed, one can regard the SABR model as the perturbation of the Brownian motion on the SABR plane by the drift term $-\frac{1}{2} Y_t^2\, C(X_t)\, C'(X_t)\, dt$.
As in the case of the Poincare plane, it is possible to represent the solution to the system (61) explicitly:

$$ \int_{X_0}^{X_t} \frac{du}{C(u)} = Y_0 \int_0^t \exp\left( Z(s) - \frac{s}{2} \right) dW(s), \qquad Y_t = Y_0 \exp\left( Z_t - \frac{t}{2} \right). \tag{63} $$

Parenthetically, we note that, within Stratonovich’s calculus, (61) can be written as

$$ dX_t = Y_t\, C(X_t) \circ dW_t, \qquad dY_t = Y_t \circ dZ_t. $$

Therefore, the stochastic differential equations of the SABR model, if interpreted according to Stratonovich, describe the dynamics of Brownian motion on the SABR plane.

3.3 Laplace-Beltrami Operator on the SABR Plane

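It will be convenient to use invariant notation. Let $z^1 = x$, $z^2 = y$, and let $\partial_\mu = \partial/\partial z^\mu$, μ = 1, 2, denote the corresponding partial derivatives. We denote the components of $g^{-1}$ by $g^{\mu\nu}$, and use $g^{-1}$ and $g$ to raise and lower the indices: $z_\mu = g_{\mu\nu} z^\nu$, $\partial^\mu = g^{\mu\nu} \partial_\nu = \partial/\partial z_\mu$, where we sum over the repeated indices. Explicitly,

$$ \partial^1 = y^2 \left( C(x)^2 \partial_1 + \rho\, C(x)\, \partial_2 \right), \qquad \partial^2 = y^2 \left( \rho\, C(x)\, \partial_1 + \partial_2 \right). $$

Consequently, the initial value problem (22) can be written in the following geometric form:

$$ \frac{\partial}{\partial s} K_Z(s, z) = \frac{1}{2}\, \varepsilon\, \partial^\mu \partial_\mu K_Z(s, z), \qquad K_Z(0, z) = \delta(z - Z), \tag{64} $$

where $\delta(z - Z) = \delta(x - X, y - Y)$ denotes the two-dimensional Dirac delta function.
Recall that the Laplace-Beltrami operator $\Delta_g$ on a Riemannian manifold M with metric tensor g is defined by

$$ \Delta_g f = \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^\mu} \left( \sqrt{\det g}\; g^{\mu\nu}\, \frac{\partial f}{\partial x^\nu} \right), \tag{65} $$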

where f is a smooth function on M. It is a natural generalization of the familiar


Laplace operator to spaces with non-Euclidean geometry. Its importance for prob-
ability theory comes from the fact that it serves as the infinitesimal generator of
Brownian motion on such spaces (see e.g. [7, 8, 13]).

In the case of the Poincare plane, the Laplace-Beltrami operator has the form:

$$ \Delta_h = y^2 \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right). \tag{66} $$

As anticipated by our discussion in Sect. 3.2, this operator is closely related to the operator L in the normal SABR model. In fact, in this case,

$$ L = \frac{1}{2}\, \Delta_h, \tag{67} $$

and thus the problem (24) turns out to be the initial value problem for the heat equation on $\mathbb{H}^2$:

$$ \frac{\partial K_Z}{\partial s} = \frac{1}{2}\, \varepsilon\, \Delta_h K_Z, \qquad K_Z(0, z) = \delta(z - Z). \tag{68} $$

The key fact is that the Green's function for this equation can be represented in closed form,

$$ K^h_Z(s, z) = \frac{e^{-s\varepsilon/8} \sqrt{2}}{(2\pi s\varepsilon)^{3/2}\, Y^2} \int_{d(z,Z)}^{\infty} \frac{u\, e^{-u^2/2s\varepsilon}}{\sqrt{\cosh u - \cosh d(z, Z)}}\, du. \tag{69} $$

This formula was originally derived by McKean [16] (see also [13] and references
therein). We have added the superscript h to indicate that this Green’s function
is associated with the Poincare metric. In Appendix A we outline an elementary
derivation of this fact.
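Let us now extend the discussion above to the general case. We note first that, except for the case of C(x) = 1, the operator $\partial^\mu \partial_\mu$ does not coincide with the Laplace-Beltrami operator $\Delta_g$ on $\mathbb{S}^2$ associated with the metric (51). It is, however, easy to verify that

$$ \partial^\mu \partial_\mu f = \Delta_g f - \frac{1}{\sqrt{\det g}}\, \frac{\partial}{\partial x^\nu} \left( \sqrt{\det g}\; g^{\mu\nu} \right) \frac{\partial f}{\partial x^\mu} = \Delta_g f - \frac{1}{\sqrt{1 - \rho^2}}\, y^2\, C C'\, \frac{\partial f}{\partial x}, $$

and thus

$$ L = \frac{1}{2}\, \Delta_g - \frac{1}{2\sqrt{1 - \rho^2}}\, y^2\, C C'\, \frac{\partial}{\partial x} = L_0 + V, $$

where $L_0$ is essentially the Laplace-Beltrami operator:

$$ L_0 = \frac{1}{2}\, \Delta_g, \tag{70} $$

and V is lower order:

$$ V = -\frac{1}{2\sqrt{1 - \rho^2}}\, y^2\, C(x)\, C'(x)\, \frac{\partial}{\partial x}. \tag{71} $$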

Let us first focus on the Laplace-Beltrami operator $\Delta_g$. The key property of the Laplace-Beltrami operator is that it commutes with isometries of Riemannian manifolds. In particular, this implies that

$$ \phi \circ \Delta_g = \Delta_h \circ \phi, \tag{72} $$

and thus the Laplace-Beltrami operator $\Delta_g$ is the pullback of $\Delta_h$ under φ. As a consequence, the heat equation

$$ \frac{\partial K}{\partial s} = \frac{1}{2}\, \varepsilon\, \Delta_g K $$

on $\mathbb{S}^2$ can be solved in closed form! The Green's function $K^g_Z(s, z)$ of this equation is related to (69) by

$$ K^g_Z(s, z) = \det\left( \nabla\phi(Z) \right) K^h_{\phi(Z)}(s, \phi(z)). \tag{73} $$

Explicitly,

$$ K^g_Z(s, z) = \frac{e^{-s\varepsilon/8} \sqrt{2}}{(2\pi s\varepsilon)^{3/2} \sqrt{1 - \rho^2}\; Y^2\, C(X)} \int_{\delta}^{\infty} \frac{u\, e^{-u^2/2s\varepsilon}}{\sqrt{\cosh u - \cosh \delta}}\, du, \tag{74} $$

where $\delta = \delta(z, Z)$ is the geodesic distance (54) on $\mathbb{S}^2$. This is the explicit representation of the integral kernel of the operator $U_0(s)$.

4 Asymptotic Expansion

In principle, we have now completed our task of solving the initial value problem (24). Indeed, its solution is given by

$$ K_Z(s, z) = Q(s)\, K^g_Z(s, z), \tag{75} $$

where $Q(s)$ is the perturbation expansion given by (39). In order to produce clear results that can readily be used in practice we now perform a perturbation expansion on the expression above. Our method allows one to calculate the Green's function of the model to the desired order of accuracy.
Let us start with the Green's function $K^h_Z(s, z)$ which is defined on the Poincare plane. In Appendix B we derive an asymptotic expansion (117) for the heat kernel on the Poincare plane. After rescaling as in (106), we arrive at

$$ K^h_Z(s, z) = \frac{1}{2\pi\lambda Y^2} \exp\left( -\frac{d^2}{2\lambda} \right) \sqrt{\frac{d}{\sinh d}} \left[ 1 - \frac{1}{8} \left( \frac{d \coth d - 1}{d^2} + 1 \right) \lambda + O\left(\lambda^2\right) \right], $$

where we have introduced a new variable,

$$ \lambda = s\varepsilon. \tag{76} $$

We can now extend the expression to the general Green's function $K^g_Z(s, z)$. Using (73) or (74) we find that $K^g_Z(s, z)$ has the following asymptotic expansion:

$$ K^g_Z(s, z) = \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; Y^2\, C(X)} \exp\left( -\frac{\delta^2}{2\lambda} \right) \sqrt{\frac{\delta}{\sinh \delta}} \left[ 1 - \frac{1}{8} \left( \frac{\delta \coth \delta - 1}{\delta^2} + 1 \right) \lambda + O\left(\lambda^2\right) \right]. $$

To complete the calculation in the case of general C(x) we need to take into account the contribution to the Green's function coming from the perturbation V defined in (71). Let us define the function:

$$ q(z, Z) = \sinh \delta(z, Z)\; V \delta(z, Z) = -\frac{y\, C'(x)}{2\,(1 - \rho^2)^{3/2}\, Y} \left( \int_X^x \frac{du}{C(u)} - \rho\,(y - Y) \right). \tag{77} $$

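From (117) and (118),

$$ K_Z(s, z) = (I + \lambda V)\, K^g_Z(s, z) = K^g_Z(s, z) + \lambda\, \frac{1}{\sqrt{1 - \rho^2}\; Y^2\, C(X)}\, \frac{q}{\sinh \delta}\, \frac{\partial}{\partial \delta}\, G_Z, \tag{78} $$

where $G_Z$ denotes the kernel (116) with d replaced by δ and the time rescaled as in (106). This yields the following asymptotic formula for the Green's function:

$$ K_Z(s, z) = \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; Y^2\, C(X)} \exp\left( -\frac{\delta^2}{2\lambda} \right) \sqrt{\frac{\delta}{\sinh \delta}} \left[ 1 - \frac{\delta}{\sinh \delta}\, q - \left( \frac{1}{8} + \frac{\delta \coth \delta - 1}{8\delta^2} - \frac{3\,(1 - \delta \coth \delta) + \delta^2}{8\delta \sinh \delta}\, q \right) \lambda + O\left(\lambda^2\right) \right]. \tag{79} $$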

In a way, this is the central result of this paper. It gives us a precise asymptotic
behavior of the Green’s function of the SABR model, as λ → 0.

5 Volatility Smile

We are now ready to complete our analysis. Given the explicit form of the approximate
Green’s function, we can calculate (via another asymptotic expansion) the marginal
probability distribution. Comparing the result with the normal probability distribution
allows us to find the implied normal and lognormal volatilities, as functions of the
model parameters. We conclude this section by deriving explicit formulas for the
case of the CEV model $C(x) = x^\beta$ and the shifted lognormal model $C(x) = x + a$.

5.1 Marginal Transition Probability

First, we integrate the asymptotic joint density over the terminal volatility variable Y to find the marginal density for the forward x. To within $O(\lambda^2)$,

$$ P_X(s, x, y) = \int_0^\infty K_Z(s, z)\, dY = \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; C(X)} \int_0^\infty e^{-\delta^2/2\lambda} \sqrt{\frac{\delta}{\sinh \delta}} \left[ 1 - \frac{\delta}{\sinh \delta}\, q - \frac{1}{8}\, \lambda \left( 1 + \frac{\delta \coth \delta - 1}{\delta^2} - \frac{3\,(1 - \delta \coth \delta) + \delta^2}{\delta \sinh \delta}\, q \right) \right] \frac{dY}{Y^2}. \tag{80} $$

Here the metric $\delta(z, Z)$ is defined implicitly by (54). We evaluate this integral asymptotically by using Laplace's method (steepest descent). This analysis is carried out in Appendix B.2. The key step is to analyze the exponent as a function of Y,

$$ \phi(Y) = \frac{1}{2}\, \delta(z, Z)^2, \tag{81} $$
2
in order to find the point $Y_0$ where this function is at a minimum. Let us introduce the notation:

$$ \zeta = \frac{1}{y} \int_X^x \frac{du}{C(u)}. $$

Since yC(u) is basically the rescaled volatility at forward u, 1/ζ represents the average volatility between today's forward x and the option's strike X. In other words, ζ represents how “easy” it is to reach the strike X. Some algebra shows that the minimum of (81) occurs at $Y_0 = Y_0(\zeta, y)$, where

$$ Y_0 = y \sqrt{\zeta^2 - 2\rho\zeta + 1}. \tag{82} $$

The meaning of $Y_0$ is clear: it is the “most likely value” of Y, and thus $Y_0\, C(X)$ (when expressed in the original units) should be the leading contribution to the observed implied volatility. Also, let $D(\zeta)$ denote the value of $\delta(z, Z)$ with $Y = Y_0$. Explicitly,

$$ D(\zeta) = \log \frac{\sqrt{\zeta^2 - 2\rho\zeta + 1} + \zeta - \rho}{1 - \rho}. \tag{83} $$
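The statements (82) and (83) can be checked numerically: for fixed ζ, y and ρ, the distance (54) viewed as a function of Y should attain its minimum at Y0, and the minimal value should equal D(ζ) for ζ > 0. The Python sketch below rewrites (54) in terms of ζ; the function names are illustrative.

```python
import math

def delta(zeta, y, Y, rho):
    """Geodesic distance (54), written via zeta = (1/y) * integral_X^x du/C(u)."""
    a = y * zeta
    num = a * a - 2.0 * rho * (y - Y) * a + (y - Y) ** 2
    return math.acosh(1.0 + num / (2.0 * (1.0 - rho * rho) * y * Y))

def I_of(zeta, rho):
    """I(zeta) = sqrt(zeta^2 - 2 rho zeta + 1)."""
    return math.sqrt(zeta * zeta - 2.0 * rho * zeta + 1.0)

def Y0_of(zeta, y, rho):
    """Minimiser of delta over Y, eq. (82)."""
    return y * I_of(zeta, rho)

def D_of(zeta, rho):
    """Minimal distance, eq. (83)."""
    return math.log((I_of(zeta, rho) + zeta - rho) / (1.0 - rho))
```

The same helper functions also reproduce the identity I = cosh D − ρ sinh D.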

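The analysis in Appendix B.3 shows that the probability distribution for x is Gaussian in this minimum distance, at least to leading order. Specifically, it is shown there that to within $O(\lambda^2)$,

$$ P_X(s, x, y) = \frac{1}{\sqrt{2\pi\lambda}}\, \frac{1}{y\, C(X)\, I^{3/2}} \exp\left( -\frac{D^2}{2\lambda} \right) \Bigg\{ 1 + \frac{y\, C'(x)\, D}{2\sqrt{1 - \rho^2}\, I} - \frac{1}{8}\, \lambda \Bigg[ 1 + \frac{y\, C'(x)\, D}{2\sqrt{1 - \rho^2}\, I} + \frac{6\rho\, y\, C'(x)}{\sqrt{1 - \rho^2}\, I^2} \cosh(D) - \left( \frac{3\,(1 - \rho^2)}{I} + \frac{3\, y\, C'(x)\, (5 - \rho^2)\, D}{2\sqrt{1 - \rho^2}\, I^2} \right) \frac{\sinh(D)}{D} \Bigg] + \cdots \Bigg\}, \tag{84} $$

where

$$ I(\zeta) = \sqrt{\zeta^2 - 2\rho\zeta + 1} = \cosh D(\zeta) - \rho \sinh D(\zeta). \tag{85} $$

As this expression may be useful on its own, we rewrite it in terms of the original variables:

$$ P_F(\tau, f, \sigma) = \frac{1}{\sqrt{2\pi\tau}}\, \frac{1}{\sigma\, C(F)\, I^{3/2}} \exp\left( -\frac{D^2}{2\tau v^2} \right) \Bigg\{ 1 + \frac{\sigma\, C'(f)\, D}{2v\sqrt{1 - \rho^2}\, I} - \frac{1}{8}\, \tau v^2 \Bigg[ 1 + \frac{\sigma\, C'(f)\, D}{2v\sqrt{1 - \rho^2}\, I} + \frac{6\rho\, \sigma\, C'(f)}{v\sqrt{1 - \rho^2}\, I^2} \cosh(D) - \left( \frac{3\,(1 - \rho^2)}{I} + \frac{3\, \sigma\, C'(f)\, (5 - \rho^2)\, D}{2v\sqrt{1 - \rho^2}\, I^2} \right) \frac{\sinh(D)}{D} \Bigg] + \cdots \Bigg\}, \tag{86} $$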

where we have slightly abused the notation. This is the desired asymptotic form of
the marginal probability distribution.

5.2 Implied Volatility

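The normal implied volatility is given by Sect. 2.2, and we are thus left with the task of calculating the conditional second moment. Explicitly,

$$ M^2_X(s, x, y) = \int_0^\infty Y^2 K_Z(s, z)\, dY = \frac{1}{2\pi\lambda \sqrt{1 - \rho^2}\; C(X)} \int_0^\infty e^{-\delta^2/2\lambda} \sqrt{\frac{\delta}{\sinh \delta}} \left[ 1 - \frac{\delta}{\sinh \delta}\, q - \frac{1}{8}\, \lambda \left( 1 + \frac{\delta \coth \delta - 1}{\delta^2} - \frac{3\,(1 - \delta \coth \delta) + \delta^2}{\delta \sinh \delta}\, q \right) \right] dY. \tag{87} $$

In Appendix B.3 we show that

$$ M^2_X(s, x, y) = \frac{1}{\sqrt{2\pi\lambda}}\, \frac{y \sqrt{I}}{C(X)} \exp\left( -\frac{D^2}{2\lambda} \right) \Bigg\{ 1 + \frac{y\, C'(x)\, D}{2\sqrt{1 - \rho^2}\, I} + \frac{1}{8}\, \lambda \Bigg[ 1 - \frac{y\, C'(x)\, D}{2\sqrt{1 - \rho^2}\, I} + \frac{2\rho\, y\, C'(x)}{\sqrt{1 - \rho^2}\, I^2} \cosh(D) + \left( \frac{3\,(1 - \rho^2)}{I} + \frac{2\, y\, C'(x)\, (3\rho^2 - 4)\, D}{\sqrt{1 - \rho^2}\, I^2} \right) \frac{\sinh(D)}{D} \Bigg] + \cdots \Bigg\}. $$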

Despite their complicated appearances, the two expressions have a lot in common, and their ratio has a rather simple form. After the dust settles, we find that

$$ \sigma_K(\tau, f, \sigma)^2 = \sigma^2\, C(f)^2\, I(\zeta)^2 \left( 1 + \frac{2\sigma\, C'(f)\, \left( \rho \cosh(D) - \sinh(D) \right)}{\sigma\, C'(f)\, D\, I + 2\sqrt{1 - \rho^2}\, I^2\, v}\; \tau v^2 + \cdots \right), \tag{88} $$

or

$$ \sigma_K(\tau, f, \sigma) = \sigma\, C(f)\, I(\zeta) \left( 1 + \frac{\sigma\, C'(f)\, \left( \rho \cosh(D) - \sinh(D) \right)}{\sigma\, C'(f)\, D\, I + 2\sqrt{1 - \rho^2}\, I^2\, v}\; \tau v^2 + \cdots \right). \tag{89} $$

This is a refinement of the original asymptotic expression for implied volatility in the SABR model.
It is easy to apply this formula to a specific choice of the function C(f). In the case of the stochastic CEV model, $C(f) = f^\beta$, with 0 < β ≤ 1. If β = 1, then

$$ \zeta = \frac{v}{\sigma} \log \frac{f}{F}. \tag{90} $$

For 0 < β < 1,

$$ \zeta = \frac{v}{\sigma}\, \frac{f^{1-\beta} - F^{1-\beta}}{1 - \beta}. \tag{91} $$

In the shifted lognormal model, C(f) = f + a, where a > 0. Consequently,

$$ \zeta = \frac{v}{\sigma} \log \frac{f + a}{F + a}. \tag{92} $$
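The formulas (90) to (92) translate directly into code. The sketch below also evaluates the leading factor σ C(f) I(ζ) of (89) for the CEV backbone, deliberately omitting the O(τv²) correction; the function names and argument order are choices made here for illustration.

```python
import math

def zeta_cev(f, F, sigma, v, beta):
    """Eqs. (90)-(91): zeta for C(f) = f**beta with 0 < beta <= 1."""
    if abs(beta - 1.0) < 1e-12:
        return (v / sigma) * math.log(f / F)
    return (v / sigma) * (f ** (1.0 - beta) - F ** (1.0 - beta)) / (1.0 - beta)

def zeta_shifted_lognormal(f, F, sigma, v, a):
    """Eq. (92): zeta for C(f) = f + a with a > 0."""
    return (v / sigma) * math.log((f + a) / (F + a))

def leading_vol_cev(f, F, sigma, v, rho, beta):
    """Leading factor sigma * C(f) * I(zeta) of (89); the tau * v^2 term is dropped."""
    z = zeta_cev(f, F, sigma, v, beta)
    return sigma * f ** beta * math.sqrt(z * z - 2.0 * rho * z + 1.0)
```

As β tends to 1 the expression (91) approaches (90), and at f = F one has ζ = 0 and I = 1, so the leading factor reduces to σ C(F).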

5.3 Implied Volatility at Low Strikes

Our analysis so far has been based on the assumption that free boundary conditions are imposed
at zero forward. In the case of ρ = 0, we can tackle the Dirichlet and Neumann
boundary conditions explicitly.
As explained in Sect. 2.3, the Green’s functions corresponding to the Dirichlet and
Neumann boundary conditions at zero forward can easily be calculated, using the
method of images, in terms of the Green’s function with free boundary conditions.
This, in turn, allows us to express the marginal probability distributions in terms of (86):

$$ P_F^{\mathrm{Dirichlet}}(\tau, f, \sigma) = P_F(\tau, f, \sigma) - P_F(\tau, -f, \sigma), \qquad P_F^{\mathrm{Neumann}}(\tau, f, \sigma) = P_F(\tau, f, \sigma) + P_F(\tau, -f, \sigma). \tag{93} $$

Analogous formulas hold for the conditional second moments. We can now easily find
asymptotic expressions for the implied volatilities corresponding to these boundary
conditions.
In order to keep the appearance of the otherwise unwieldy formulas reasonable, we shall introduce some additional notation. Let

$$ I^\theta = I\left( \zeta^\theta \right), \tag{94} $$

where

$$ \zeta^\theta = \frac{1}{y} \int_X^{\theta(x)} \frac{du}{C(u)}. \tag{95} $$

Furthermore, let us define the ratio

$$ \gamma = \frac{I}{I^\theta}, \tag{96} $$

and note that γ < 1. Finally, we set:

$$ \eta = \begin{cases} -1, & \text{for the Dirichlet boundary condition}, \\ 0, & \text{for the free boundary condition}, \\ 1, & \text{for the Neumann boundary condition}. \end{cases} \tag{97} $$

It is now easy to see that:

$$ \sigma^\eta_K(\tau, f, \sigma) = \sigma\, C(K)\, I\; \frac{1}{\sqrt{1 - \eta\gamma + \eta^2\gamma^2}} \left( 1 + \frac{\sigma\, C'(f)\, \left( \rho \cosh(D) - \sinh(D) \right)}{\sigma\, C'(f)\, D\, I + 2\sqrt{1 - \rho^2}\, I^2\, v}\; \tau v^2 + \cdots \right). \tag{98} $$

It is worthwhile to note that for large strikes all three of these quantities are practically equal, and one might as well work with the free boundary condition expression. Indeed, in this case, γ ≈ 0, and so $\sqrt{1 - \eta\gamma + \eta^2\gamma^2} \approx 1$. Also, we see from this expression that, at least asymptotically,

$$ \sigma^{\mathrm{Dirichlet}}_K(\tau, f, \sigma) < \sigma^{\mathrm{free}}_K(\tau, f, \sigma) < \sigma^{\mathrm{Neumann}}_K(\tau, f, \sigma). \tag{99} $$

This result is intuitively clear, and (98) quantifies it in a way that can be used for
position management purposes. The decision which boundary condition to adopt
should be made based on specific market conditions.
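The size of the boundary effect in (98) is governed by the factor 1/√(1 − ηγ + η²γ²). A minimal Python sketch of this factor and of the ratio γ from (94) to (96) follows; the function names are hypothetical, and the values of ζ and ζ^θ are taken as given inputs rather than computed from a backbone C.

```python
import math

def boundary_factor(eta, gamma):
    """Multiplier 1 / sqrt(1 - eta*gamma + eta^2 * gamma^2) appearing in (98)."""
    return 1.0 / math.sqrt(1.0 - eta * gamma + (eta * gamma) ** 2)

def gamma_ratio(zeta, zeta_theta, rho=0.0):
    """gamma = I(zeta) / I(zeta_theta), cf. (94)-(96)."""
    i_direct = math.sqrt(zeta * zeta - 2.0 * rho * zeta + 1.0)
    i_image = math.sqrt(zeta_theta * zeta_theta - 2.0 * rho * zeta_theta + 1.0)
    return i_direct / i_image
```

For 0 < γ < 1 the factor is below one for η = −1, equal to one for η = 0 and above one for η = 1, which is the ordering (99); as γ tends to zero all three coincide.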

Appendix A Heat Equation on the Poincare Plane

In this appendix we present an elementary derivation of the explicit representation


of the Green’s function for the heat equation on H2 . This explicit formula has been
known for a long time (see e.g. [16]), and we include its construction here in order
to make our calculations self-contained.

A.1 Lower Bound on the Laplace-Beltrami Operator

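We shall first establish a lower bound on the spectrum of the Laplace-Beltrami operator on the Poincare plane. Let $\mathcal{H} = L^2\left( \mathbb{H}^2, d\mu_h \right)$ denote the Hilbert space of complex functions on $\mathbb{H}^2$ which are square integrable with respect to the measure (50). The inner product on this space is thus given by:

$$ (\Phi | \Psi) = \int_{\mathbb{H}^2} \overline{\Phi(z)}\, \Psi(z)\, \frac{dx\, dy}{y^2}. \tag{100} $$

It is easy to verify that the Laplace-Beltrami operator $\Delta_h$ is self-adjoint with respect to this inner product.
Consider now the first order differential operator Q on $\mathcal{H}$ defined by

$$ Q = i \left( y \frac{\partial}{\partial y} - \frac{1}{2} \right) + y \frac{\partial}{\partial x}. \tag{101} $$

Its hermitian adjoint with respect to (100) is

$$ Q^\dagger = i \left( y \frac{\partial}{\partial y} - \frac{1}{2} \right) - y \frac{\partial}{\partial x}, \tag{102} $$

and we verify readily that

$$ \frac{1}{2} \left( Q Q^\dagger + Q^\dagger Q \right) = -\Delta_h - \frac{1}{4}. \tag{103} $$

This implies that

$$ (\Phi | -\Delta_h \Phi) = \frac{1}{2} \left( \Phi | Q Q^\dagger \Phi \right) + \frac{1}{2} \left( \Phi | Q^\dagger Q \Phi \right) + \frac{1}{4} (\Phi | \Phi) = \frac{1}{2} \left( Q^\dagger \Phi | Q^\dagger \Phi \right) + \frac{1}{2} \left( Q \Phi | Q \Phi \right) + \frac{1}{4} (\Phi | \Phi) \ge \frac{1}{4} (\Phi | \Phi), $$

where we have used the fact that $(\Phi | \Phi) \ge 0$, for all functions $\Phi \in \mathcal{H}$. As a consequence, we have established that the spectrum of the operator $-\Delta_h$ is bounded from below by 1/4! This fact was first proved in [16].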

A.2 Construction of the Green’s Function

Let us now consider the following initial value problem:

$$ \frac{\partial}{\partial s} G_Z(s, z) = \Delta_h G_Z(s, z), \qquad G_Z(0, z) = Y^2 \delta(z - Z), \tag{104} $$

where $z, Z \in \mathbb{H}^2$. In addition, we require that

$$ G_Z(s, z) \to 0, \quad \text{as } d(z, Z) \to \infty. \tag{105} $$

Note that, up to the factor of $Y^2$ in front of the delta function and a trivial time rescaling, this is exactly the initial value problem (68):

$$ G_Z(s, z) = Y^2 K_Z(2s/\varepsilon, z). \tag{106} $$

The Green's function $G_Z(s, z)$ is also referred to as the heat kernel⁴ on $\mathbb{H}^2$. The reason for inserting the factor of $Y^2$ in front of $\delta(z - Z)$ is that the distribution $Y^2 \delta(z - Z)$ is invariant under the action (44) of the Lie group $SL(2, \mathbb{R})$. In fact, we verify readily that

$$ Y^2 \delta(z - Z) = \frac{1}{\pi}\, \delta\left( \cosh d(z, Z) - 1 \right). $$

4 It is the integral kernel of the semigroup of operators generated by the heat equation.

Now, since the initial value problem (104) is invariant under $SO(2, \mathbb{R})$, its solution must be invariant and thus a function of $d(z, Z)$ only. Let $r = \cosh d(z, Z)$, and write $G_Z(s, z) = \varphi(s, r)$. Then the heat equation in (104) takes the form

$$ \frac{\partial}{\partial s} \varphi(s, r) = \left( r^2 - 1 \right) \frac{\partial^2}{\partial r^2} \varphi(s, r) + 2r \frac{\partial}{\partial r} \varphi(s, r). \tag{107} $$

We have established above that the operator $-\Delta_h$ is self-adjoint on the Hilbert space $\mathcal{H}$, and its spectrum is bounded from below by 1/4. Therefore, we shall seek the solution as the Laplace transform

$$ \varphi(s, r) = \int_{1/4}^{\infty} e^{-s\lambda} L(\lambda, r)\, d\lambda, \tag{108} $$

which yields the following ordinary differential equation:

$$ \left( 1 - r^2 \right) \frac{d^2}{dr^2} L(\lambda, r) - 2r \frac{d}{dr} L(\lambda, r) - \lambda L(\lambda, r) = 0. \tag{109} $$

We write

$$ \lambda = -\nu(\nu + 1), $$

where

$$ \nu = -\frac{1}{2} \pm i \sqrt{\lambda - \frac{1}{4}} = -\frac{1}{2} \pm i\omega, $$

and recognize in (109) the Legendre equation. Note that, as a consequence of the inequality λ ≥ 1/4, ω is real and Re ν = −1/2.
In the remainder of this appendix, we will use the well known properties of the
solutions to the Legendre equation, and follow Chaps. 7 and 8 of Lebedev’s book on
special functions [15]. The general solution to (109) is a linear combination of the
Legendre functions of the first and second kinds, $P_{-1/2+i\omega}(r)$ and $Q_{-1/2+i\omega}(r)$, respectively:

$$ L\left( \frac{1}{4} + \omega^2, r \right) = A_\omega P_{-1/2+i\omega}(r) + B_\omega Q_{-1/2+i\omega}(r). \tag{110} $$

As d → 0 (which is equivalent to r → 1),

$$ Q_{-1/2+i\omega}(\cosh d) \sim \mathrm{const} \cdot \log d, \tag{111} $$

which would imply that $\varphi(s, \cosh d)$ is singular at d = 0, for all values of s > 0. Since this is impossible, we conclude that $B_\omega = 0$. Note that, on the other hand,

$$ P_{-1/2+i\omega}(1) = 1, \tag{112} $$

i.e. $P_{-1/2+i\omega}(\cosh d)$ is non-singular at d = 0. We will now invoke the Mehler-Fock transformation of a function⁵:

$$ \tilde{f}(\omega) = \int_1^\infty f(r)\, P_{-1/2+i\omega}(r)\, dr, \tag{113} $$

$$ f(r) = \int_0^\infty \tilde{f}(\omega)\, P_{-1/2+i\omega}(r)\, \omega \tanh(\pi\omega)\, d\omega. \tag{114} $$

In particular, (112) implies that the Mehler-Fock transform of δ(r − 1) is 1, and thus (remember that we need to divide δ(r − 1) by π):

$$ A_\omega = \frac{1}{2\pi} \tanh(\pi\omega). $$

Note that this relation can be viewed as a spectral representation for the unbounded self-adjoint Laplace-Beltrami operator on the Poincare plane.
Now, the Legendre function of the first kind $P_{-1/2+i\omega}(r)$ has the following integral representation:

$$ P_{-1/2+i\omega}(\cosh d) = \frac{\sqrt{2}}{\pi} \coth(\pi\omega) \int_d^\infty \frac{\sin(\omega u)}{\sqrt{\cosh u - \cosh d}}\, du, \tag{115} $$

which is valid for all real ω. Therefore

$$ L\left( \frac{1}{4} + \omega^2, \cosh d \right) = \frac{1}{\sqrt{2}\, \pi^2} \int_d^\infty \frac{\sin(\omega u)}{\sqrt{\cosh u - \cosh d}}\, du, $$

and we can easily carry out the integration in (108) to obtain

$$ G_Z(s, z) = \frac{e^{-s/4} \sqrt{2}}{(4\pi s)^{3/2}} \int_{d(z,Z)}^{\infty} \frac{u\, e^{-u^2/4s}}{\sqrt{\cosh u - \cosh d(z, Z)}}\, du. \tag{116} $$

This is McKean’s closed form representation of the Green’s function of the heat
equation on the Poincare plane [16].
Going back to the original normalization conventions of (68) yields formula (69).

5 Strictly speaking, we will deal with distributions rather than functions. A rigor oriented reader can

easily recast the following calculations into respectable mathematics.



Appendix B Some Asymptotic Expansions

In this appendix we collect a number of asymptotic expansions used in this paper.

B.1 Asymptotics of the McKean Kernel

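We shall first establish a short time asymptotic expansion of McKean's kernel. This expansion plays a key role in the analysis of the Green's function of the SABR model. In the right hand side of (116) we substitute $u = \sqrt{4sw + d^2}$:

$$ G_Z(s, z) = \frac{e^{-s/4} \sqrt{2}}{4\pi^{3/2} \sqrt{s}}\, e^{-d^2/4s} \int_0^\infty \frac{e^{-w}\, dw}{\sqrt{\cosh \sqrt{4sw + d^2} - \cosh d}}. $$

Expanding the integrand in powers of s yields

$$ \frac{1}{\sqrt{\cosh \sqrt{4sw + d^2} - \cosh d}} = \sqrt{\frac{d}{\sinh d}} \left[ \frac{1}{\sqrt{2sw}} - \frac{d \coth d - 1}{4d^2} \sqrt{2sw} + O\left( s^{3/2} \right) \right]. $$

Integrating term by term over w we find that

$$ G_Z(s, z) = \frac{e^{-s/4}}{4\pi s} \exp\left( -\frac{d^2}{4s} \right) \sqrt{\frac{d}{\sinh d}} \left[ 1 - \frac{1}{4}\, \frac{d \coth d - 1}{d^2}\, s + O\left( s^2 \right) \right], $$

and we thus obtain the following asymptotic expansion of the McKean kernel:

$$ G_Z(s, z) = \frac{1}{4\pi s} \exp\left( -\frac{d^2}{4s} \right) \sqrt{\frac{d}{\sinh d}} \left[ 1 - \frac{1}{4} \left( \frac{d \coth d - 1}{d^2} + 1 \right) s + O\left( s^2 \right) \right]. \tag{117} $$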
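The quality of the expansion (117) can be examined by evaluating the integral (116) numerically. The Python sketch below is one possible implementation, not the authors' procedure: it uses the substitution u² = d² + 4sv² (that is, w = v² in the formula above), which removes the endpoint singularity, followed by a plain midpoint rule. The grid size and cutoff are arbitrary choices, and d > 0 is assumed.

```python
import math

def mckean_kernel(s, d, n=4000, v_max=7.0):
    """G_Z(s, z) from (116) as a function of d = d(z, Z) > 0, by midpoint quadrature.

    With u**2 = d**2 + 4*s*v**2 the integrand becomes smooth in v. The difference
    cosh(u) - cosh(d) is evaluated as 2*sinh((u+d)/2)*sinh((u-d)/2) to avoid
    cancellation when u is close to d.
    """
    h = v_max / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        u = math.sqrt(d * d + 4.0 * s * v * v)
        u_minus_d = 4.0 * s * v * v / (u + d)
        diff = 2.0 * math.sinh(0.5 * (u + d)) * math.sinh(0.5 * u_minus_d)
        total += 2.0 * v * math.exp(-v * v) / math.sqrt(diff) * h
    prefactor = math.exp(-s / 4.0) * math.sqrt(2.0) / (4.0 * math.pi ** 1.5 * math.sqrt(s))
    return prefactor * math.exp(-d * d / (4.0 * s)) * total

def mckean_asymptotic(s, d):
    """Two-term short time expansion (117)."""
    corr = 0.25 * ((d / math.tanh(d) - 1.0) / (d * d) + 1.0)
    lead = math.exp(-d * d / (4.0 * s)) / (4.0 * math.pi * s) * math.sqrt(d / math.sinh(d))
    return lead * (1.0 - corr * s)
```

For small s the two functions should agree up to a relative error of order s².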

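Taking the derivative of $G_Z(s, z)$ with respect to $d(z, Z)$ in the expansion above, we find that

$$ \frac{\partial}{\partial d} G_Z(s, z) = \frac{1}{4\pi s} \exp\left( -\frac{d^2}{4s} \right) \sqrt{\frac{d}{\sinh d}} \left[ -\frac{d}{2s} + \frac{d}{8} \left( 1 + 3\, \frac{1 - d \coth d}{d^2} \right) + O(s) \right]. \tag{118} $$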

B.2 Laplace’s Method

Next we review the Laplace method (see e.g. [2, 3]) which allows one to evaluate approximately integrals of the form:

$$ \int_0^\infty f(u)\, e^{-\phi(u)/\epsilon}\, du. \tag{119} $$

We use this method in order to evaluate the marginal probability distribution for the Green's function.
In the integral (119), ϵ is a small parameter, and f(u) and φ(u) are smooth functions on the interval [0, ∞).⁶ We also assume that φ(u) has a unique minimum $u_0$ inside the interval with $\phi''(u_0) > 0$. The idea is that, as ϵ → 0, the value of the integral is dominated by the quadratic approximation to φ(u) around $u_0$.
More precisely, we have the following asymptotic expansion. As ϵ → 0,

$$ \int_0^\infty f(u)\, e^{-\phi(u)/\epsilon}\, du = \sqrt{\frac{2\pi\epsilon}{\phi''(u_0)}}\; e^{-\phi(u_0)/\epsilon} \left\{ f(u_0) + \epsilon \left[ \frac{f''(u_0)}{2\phi''(u_0)} - \frac{\phi^{(4)}(u_0)\, f(u_0)}{8\phi''(u_0)^2} - \frac{f'(u_0)\, \phi^{(3)}(u_0)}{2\phi''(u_0)^2} + \frac{5\phi^{(3)}(u_0)^2 f(u_0)}{24\phi''(u_0)^3} \right] + O\left( \epsilon^2 \right) \right\}. \tag{120} $$

To generate this expansion, we first expand f(u) and φ(u) in Taylor series around $u_0$ to orders 2 and 4, respectively (keep in mind that the first order term in the expansion of φ(u) is zero). Then, expanding the regular terms in the exponential, we organize the integrand as $e^{-\phi''(u_0)(u - u_0)^2/2\epsilon}$ times a polynomial in ϵ. In the limit ϵ → 0, the integral reduces to calculating moments of the Gaussian measure; the result is (120). It is straightforward to compute terms of order higher than 1 in ϵ, even though the calculations become increasingly complex as the order increases.

6 It can be an arbitrary interval.



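Finally, let us state a slight generalization of (120), which we use below. In the integral (119), we replace f(u) by f(u) + ϵ g(u). Then, as ϵ → 0,

$$ \int_0^\infty \left[ f(u) + \epsilon\, g(u) \right] e^{-\phi(u)/\epsilon}\, du = \sqrt{\frac{2\pi\epsilon}{\phi''(u_0)}}\; e^{-\phi(u_0)/\epsilon} \left\{ f(u_0) + \epsilon \left[ g(u_0) + \frac{f''(u_0)}{2\phi''(u_0)} - \frac{\phi^{(4)}(u_0)\, f(u_0)}{8\phi''(u_0)^2} - \frac{f'(u_0)\, \phi^{(3)}(u_0)}{2\phi''(u_0)^2} + \frac{5\phi^{(3)}(u_0)^2 f(u_0)}{24\phi''(u_0)^3} \right] + O\left( \epsilon^2 \right) \right\}. \tag{121} $$

This formula follows immediately from (120).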

B.3 Application of Laplace’s Method

We shall now apply formula (121) to evaluate the integrals (80) and (87). Each of
these integrals is of the form given by the left-hand side of (121). We find easily
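that the minimum $Y_0$ of the function

\[
\phi(Y) = \frac{1}{2}\,\delta(z, Z)^2
\]

is given by

\[
Y_0 = y\sqrt{\zeta^2 - 2\rho\zeta + 1},
\]

where

\[
\zeta = \frac{1}{y}\int_X^x \frac{du}{C(u)}.
\]

Also, we let $D(\zeta)$ denote the value of $\delta(z, Z)$ with $Y = Y_0$:

\[
D(\zeta) = \log\frac{\sqrt{\zeta^2 - 2\rho\zeta + 1} + \zeta - \rho}{1 - \rho},
\]

and

\[
I(\zeta) = \sqrt{\zeta^2 - 2\rho\zeta + 1}.
\]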
32 P. Hagan et al.

Finally, we note that the second derivative $\phi''(Y_0)$ of $\phi(Y)$ with respect to $Y$ is

\[
\phi''(Y_0) = \frac{D}{(1 - \rho^2)\, y^2 I \sinh D},
\]

where we have suppressed the argument $\zeta$ in $D(\zeta)$ and $I(\zeta)$. Likewise,

\[
\phi^{(3)}(Y_0) = -\frac{3D}{(1 - \rho^2)\, y^3 I^2 \sinh D},
\]

and

\[
\phi^{(4)}(Y_0) = \frac{3(1 - D\coth D)}{(1 - \rho^2)^2\, y^4 I^2 \sinh^2 D} + \frac{12D}{(1 - \rho^2)\, y^4 I^3 \sinh D}.
\]

It is actually easier to begin the calculation with (87). In order to evaluate the
various terms on the right hand side of (121), let us define

δ δ
f (Y ) = 1− q ,
sinh δ sinh δ

and

δ 1 δ coth δ − 1 3 (1 − δ coth δ) + δ 2
g (Y ) = − + − q .
sinh δ 8 8δ 2 8δ sinh δ

Then, after some manipulations we find that:


  
D yC (x) D
f (Y0 ) = 1+  ,
sinh D 2 1 − ρ2 I

3/2
D C (x) (sinh (D) − ρ cosh (D))
f (Y0 ) = −  3/2 ,
sinh D 2 1 − ρ2 I2

  
D 1 − D coth D 3yC (x) D
f (Y0 ) =   1+ 
sinh D 2 1 − ρ2 y 2 I D sinh D 2 1 − ρ2 I
3/2
D C (x) (sinh (D) − ρ cosh (D))
+  3/2 ,
sinh D 1 − ρ2 yI3

and
  
1 D 1 − D coth D 3 (1 − D coth D) + D 2
g (Y0 ) = − 1− + yC (x)  .
8 sinh D D2 2 1 − ρ2 I D

Putting all these together we find that


 
1 √ D2 yC (x) D
M X2 (s, x, y) = √ yC (X ) I exp − 1+ 
2πλ 2λ 2 1 − ρ2 I

1 yC (x) D 2ρyC (x)
+ λ 1−  + cosh (D)
8 2 1−ρ I 2 1 − ρ2 I 2
       
3 1 − ρ2 2yC (x) 3ρ2 − 4 D sinh (D)  
+ +  +O λ 2
,
I 1 − ρ2 I 2 D

as claimed in Sect. 5.
Let us now compute (80). We note that the functions f and g in (121) occurring in
this integral are obtained from the corresponding functions in (87) by dividing them
by $Y^2$. We thus define
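\[
\tilde f(Y) = \frac{f(Y)}{Y^2},
\]

and

\[
\tilde g(Y) = \frac{g(Y)}{Y^2}.
\]

Then, since $Y_0 = yI$,

\[
\tilde f(Y_0) = \frac{f(Y_0)}{y^2 I^2},
\]
\[
\tilde f'(Y_0) = -\frac{2 f(Y_0)}{y^3 I^3} + \frac{f'(Y_0)}{y^2 I^2},
\]
\[
\tilde f''(Y_0) = \frac{6 f(Y_0)}{y^4 I^4} - \frac{4 f'(Y_0)}{y^3 I^3} + \frac{f''(Y_0)}{y^2 I^2},
\]

and

\[
\tilde g(Y_0) = \frac{g(Y_0)}{y^2 I^2}.
\]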

Combining all the terms we find that


 
1 1 D2 yC (x) D
PX (s, x, y) = √ exp − 1+ 
2πλ yC (X ) I 3/2 2λ 2 1 − ρ2 I

1 yC (x) D 6ρyC (x)
− λ 1+  + cosh (D)
8 2 1−ρ I 2 1 − ρ2 I 2
       
3 1 − ρ2 3yC (x) 5 − ρ2 D sinh (D)  
− +  +O λ 2
,
I 2 1 − ρ2 I 2 D

as stated in Sect. 5.

References

1. Beardon, A.F.: The Geometry of Discrete Groups. Springer, New York (1983)
2. Bender, C.M., Orszag, S.A.: Advanced Mathematical Methods for Scientists and Engineers.
Springer, New York (1999)
3. Bleistein, N., Handelsman, R.A.: Asymptotic Expansions of Integrals. Dover Publications,
New York (1986)
4. Derman, E., Kani, I.: Riding on a smile. Risk 7(2), 32–39 (1994)
5. Duffie, D., Pan, J., Singleton, K.: Transform analysis and asset pricing for affine jump diffusions.
Econometrica 68, 1343–1377 (2000)
6. Dupire, B.: Pricing with a smile. Risk 7(1), 18–20 (1994)
7. Elworthy, K.D.: Geometric aspects of diffusions on manifolds. Ecole d’Ete de Probabilites de
Saint Flour, vol. XVII. Springer, New York (1987)
8. Emery, M.: Stochastic Calculus in Manifolds. Springer, Berlin (1989)
9. Guenther, R.B., Lee, J.W.: Partial Differential Equations of Mathematical Physics and Integral
Equations. Prentice Hall, Englewood Cliffs (1988)
10. Hagan, P.S., Kumar, D., Lesniewski, A., Woodward, D.E.: Managing smile risk. Wilmott Mag.
(2003)
11. Hagan, P.S., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6(3), 147–157
(1999)
12. Heston, S.: A closed form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
13. Hsu, E.P.: Stochastic Analysis on Manifolds. American Mathematical Society, Providence
(2002)
14. Kevorkian, J., Cole, J.D.: Perturbation Methods in Applied Mathematics. Springer, Berlin
(1985)
15. Lebedev, N.N.: Special Functions and their Applications. Dover Publications, New York (1972)
16. McKean, H.P.: An upper bound to the spectrum of Δ on a manifold of negative curvature. J.
Differ. Geom. 4, 359–366 (1970)
17. Lewis, A.L.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach
(2000)

18. Molchanov, S.A.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30, 1–63
(1975)
19. Varadhan, S.R.S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20, 431–455 (1967)
20. Varadhan, S.R.S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20,
659–685 (1967)
Asymptotic Implied Volatility at the Second
Order with Application to the SABR Model

Louis Paulot

Abstract We provide a general method to compute a Taylor expansion in time of
implied volatility for stochastic volatility models, using a heat kernel expansion.
Beyond the order 0 implied volatility which is already known, we compute the first
order correction exactly at all strikes from the scalar coefficient of the heat kernel
expansion. Furthermore, the first correction in the heat kernel expansion gives the
second order correction for implied volatility, which we also give exactly at all strikes.
As an application, we compute this asymptotic expansion at order 2 for the SABR
model and compare it to the original formula.

Keywords Stochastic volatility · Asymptotic expansion · Implied volatility · Heat
kernel · SABR

1 Introduction

The best-known model for pricing derivatives is the Black-Scholes-Merton model,
where the underlying is assumed to follow a geometric Brownian motion. Popular
extensions include local volatility models and stochastic volatility models. As an
example, the SABR model [6] combines the local volatility of the CEV model [4] and
a lognormal volatility process. Closed formulas for European options can be obtained
for a few models; this is the case for the CEV model or, among stochastic volatility
models, the Heston model [10]. These are however special cases and there are generally no
closed-form formulas. Finite difference methods or Monte-Carlo simulations can be
used to price derivatives. Approximations have also been computed to achieve faster
pricing, especially for calibration.
For short maturities, Hagan, Kumar, Lesniewski and Woodward provide an
approximation for the implied volatility of the SABR model they introduce [6].
Berestycki, Busca and Florent [2, 3] and Henry-Labordère [8] give general methods

L. Paulot (B)
Misys, 42 Rue Washington, 75008 Paris, France
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 37


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_2
38 L. Paulot

to compute short maturity asymptotics of stochastic volatility models. These
expansions give the implied volatility at first order in maturity. In addition some
quantities are approximated by their value at the money, which can produce errors
in the wings of the distributions.
In this paper, we build on the heat kernel methods introduced for the study
of stochastic processes by Varadhan [11, 12] and used for the SABR model in
[7, 8]. Using the heat kernel expansion of DeWitt [5], we provide a method to compute
exactly a Taylor expansion of the implied volatility at all strikes. The stochastic
volatility diffusion is formulated as a diffusion on a Riemannian manifold. The geodesic
distance gives the implied volatility at null maturity. The multiplicative factor
of the heat kernel expansion provides the first order (in time) correction to implied
volatility. The first corrective term of the heat kernel is translated into the second
order correction to implied volatility and similarly for higher order corrections. We
perform a detailed computation up to order 2 of the Taylor expansion in time of
implied volatility, without other approximations.
More generally, our method can be used to approximate a stochastic volatility
model by another model for which a closed form solution exists, with an implied
parameter computed as a Taylor expansion.
As an application, we compute the asymptotic SABR volatility at order 2 and
compare it to finite difference method results and to the original SABR expansion.
Our results can be useful for pricing short-maturity options or even long-maturity
options with low volatility of volatility. When the approximation is not valid,
a numerical method such as a finite difference method (FDM) has to be used. When
our approximation is valid, it gives much faster results. At very short maturities, the
prices are even more precise. Calibration at short maturities appears to be more stable
using this approximation.
In Sect. 2 we recast the financial model in physical and geometric terms and fix our
conventions. In Sect. 3 we use a heat kernel expansion to compute a short maturity
expansion of Black or more generally CEV implied volatility. Finally in Sect. 4 we
apply the method to the SABR model and compare the results to FDM and to the
original formula.

2 Diffusion Equation in Covariant Form

A stochastic volatility model for some asset with pure diffusion (no jumps) is
described by two risk-neutral processes: the asset price S and a variable V which
describes the stochastic part of volatility. In the Heston model V would be the vari-
ance whereas in the SABR model it is a factor of volatility. The diffusion is given by
the stochastic differential equations

\[
\begin{aligned}
dS &= \mu_S(S)\,dt + \sigma_S(S, V)\,dW_1 \\
dV &= \mu_V(V)\,dt + \sigma_V(V)\,dW_2
\end{aligned} \tag{1}
\]
Asymptotic Implied Volatility at the Second Order … 39

where $dW_1$ and $dW_2$ are two standard Brownian processes with correlation $\rho$.
The dependence of the parameters on the variables $S$ and $V$ written here is the most
common one; it may be more general, with all parameters depending on both variables.
Stochastic volatility models can be seen as diffusions on a Riemann surface. More
precisely, prices of securities are sections of a line bundle over this Riemann surface
which are solutions of a diffusion (or heat) equation.
An introduction to this subject and its applications to finance can be found in [9].
We present here the formalism and define all quantities we use in order to set our
conventions.

2.1 Diffusion Equation

Let us consider a general model with $n$ state variables $X^i(t)$ (which will be the
spot and the volatility) which follow a pure diffusion process, without jumps. For
simplicity we consider a European payoff of some maturity $T$. The price $P(X(t), t)$
of such a payoff is the solution of a diffusion equation

\[
-\partial_t P = \mu^i \partial_i P + \frac{1}{2}\Sigma^{ij}\partial_i\partial_j P - rP \tag{2}
\]

where $\Sigma^{ij}$ is the covariance matrix, $\mu^i$ the drifts and $r$ the numéraire rate. All
coefficients can depend on state variables $X^i(t)$ and time $t$. Unless explicitly stated,
we adopt the Einstein summation convention: repeated indices are summed. The price of the
European option is given by the solution of this equation with terminal boundary
condition at maturity $T$ given by the payoff.
The covariance matrix $\Sigma^{ij}$ can be seen geometrically as the inverse $g^{ij} = \Sigma^{ij}$ of
a metric $g_{ij}$ on the space of variables. The diffusion equation describes the diffusion
over a Riemannian manifold: the space of state variables endowed with the metric
$g_{ij} = (\Sigma^{-1})_{ij}$.

Examples

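1. The Black-Scholes equation in the monetary account numéraire with volatility $\sigma$,
risk free rate $r$ and dividend yield $q$ reads

\[
-\partial_t P = (r - q)S\,\partial_S P + \frac{1}{2}\sigma^2 S^2\,\partial_S\partial_S P - rP.
\]

This is Eq. (2) with $\mu^S = (r - q)S$, $\Sigma^{SS} = \sigma^2 S^2$ and $r = r$.

2. For the stochastic volatility model described by Eq. (1), $\mu^i$ is a two-dimensional
vector

\[
\mu = \begin{pmatrix} \mu_S \\ \mu_V \end{pmatrix}.
\]

The covariance matrix $\Sigma^{ij}$ is

\[
\Sigma = \begin{pmatrix} \sigma_S^2 & \rho\,\sigma_S\sigma_V \\ \rho\,\sigma_S\sigma_V & \sigma_V^2 \end{pmatrix}.
\]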
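
To make the link between the covariance matrix and the metric concrete, here is a minimal sketch for the two-factor case. It builds the covariance matrix of the second example at a given state and inverts it in closed form to obtain the metric with lower indices. The function names are illustrative, not from the paper, and the volatilities are passed in as plain numbers evaluated at one point.

```python
def covariance_matrix(sigma_s, sigma_v, rho):
    """2x2 covariance matrix of the pair (S, V) for volatilities evaluated at one state."""
    off_diagonal = rho * sigma_s * sigma_v
    return [[sigma_s ** 2, off_diagonal],
            [off_diagonal, sigma_v ** 2]]

def metric_from_covariance(cov):
    """Metric with lower indices, i.e. the inverse of the 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite (|rho| < 1)")
    return [[d / det, -b / det],
            [-c / det, a / det]]

if __name__ == "__main__":
    cov = covariance_matrix(0.3, 0.5, -0.4)
    print(metric_from_covariance(cov))
```

For example, the first diagonal component of the metric equals $1/((1-\rho^2)\sigma_S^2)$, which shows how correlation stretches distances in the spot direction.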

In what follows we restrict ourselves to the case of time-homogeneous models:
there is no explicit time-dependence in parameters. The generalization to
time-dependent cases is not difficult.

2.2 Gauge Structure

There are several gauge transformations which are natural for such systems:
1. Change of numéraire:
P(X, t)
P(X, t) −→
(X, t)

where (X, t) is the price of a security which is always nonzero. Mathematically


it is a real function which is positive everywhere, that we denote thus by (X, t) =
e−φ(X,t) .
2. Change of variables
$X \longrightarrow X'(X)$.

The natural way to handle a system with gauge freedom is to introduce covariant
derivatives. The coordinate freedom is handled through the Levi-Civita connection
which acts respectively on scalars, vectors and 1-forms as

\[
\begin{aligned}
D_i f &= \partial_i f \\
D_i f^j &= \partial_i f^j + \Gamma^j_{ik} f^k \\
D_i f_j &= \partial_i f_j - \Gamma^k_{ij} f_k
\end{aligned}
\]

where $\Gamma^j_{ik}$ are the Christoffel symbols. The action on tensors with more indices is
obtained by acting on all indices with the Christoffel symbols. Christoffel symbols
can be computed from the metric as

\[
\Gamma^k_{ij} = \frac{1}{2} g^{kl}\left(\partial_i g_{lj} + \partial_j g_{il} - \partial_l g_{ij}\right).
\]

A fundamental property of the Levi-Civita connection is the covariance of the metric:

\[
D_i g_{jk} = 0.
\]

The metric is used to transform vectors into 1-forms and conversely, i.e. lowering
or raising indices:

\[
A^i = g^{ij} A_j, \qquad A_i = g_{ij} A^j.
\]

The numéraire gauge freedom is handled through a line bundle $L$ (i.e. with sections
in $\mathbb{R}$). Geometrically, $P$ is a section of $L$. An $\mathbb{R}$-valued connection¹ is defined
with spatial and time components given by a 1-form $A_i$ and a scalar² $Q$:

∇i P = (Di − Ai )P
∇t P = (∂t − Q)P.

Under the change of numéraire

P −→ eφ(X,t) P

these operators are covariant,

∇i P −→ eφ(X,t) ∇i P
∇t P −→ eφ(X,t) ∇t P,

provided that Ai and Q are shifted as

Ai −→ Ai − ∂i φ
Q −→ Q − ∂t φ.

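Using these connections, the diffusion equation (2) can be rewritten as

\[
-\nabla_t P = \frac{1}{2}\nabla^i\nabla_i P. \tag{3}
\]

Identifying terms between Eqs. (2) and (3), the $\mathbb{R}$ connection must be

\[
A_i = g_{ij}\left(-\frac{1}{2}\Gamma^j_{kl}\, g^{kl} - \mu^j\right) \tag{4}
\]
\[
Q = -\frac{1}{2} g^{ij}\left(\partial_i A_j - A_i A_j - \Gamma^k_{ij} A_k\right) + r. \tag{5}
\]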

1 This connection is similar to the connection which describes the electromagnetic potential, except
that the fibre of the gauge bundle is $\mathbb{R}$ instead of $U(1)$. This causes a difference of a factor $i$ in
equations.
2 There is a breaking of symmetry between time and spatial directions. The diffusion equation can
be seen as a non-relativistic limit of a pure wave equation in imaginary time.



Together with

\[
g^{ij} = \Sigma^{ij}
\]

this translates the set of financial parameters into geometrical quantities.

2.3 Kolmogorov Forward Equation

The Kolmogorov backward Eq. (3) leads to a dual Kolmogorov forward equation.
We suppose that all prices are expressed with respect to a numéraire which is a
traded asset that does not pay any coupon or dividend. The price of the numéraire
security itself is identically 1; this reads mathematically

\[
-\nabla_t 1 = \frac{1}{2}\nabla^i\nabla_i 1.
\]

If $p(X, t)$ is a risk-neutral probability density to get in state $X$ at time $t$ starting
from state $X_0$ at time 0, then the price of a European payoff of maturity $T \ge t$ can
be written as

\[
P(X_0, 0) = \int dX\, p(X, t)\, P(X, t).
\]

As $t$ does not appear on the left-hand side, the derivative of the integral with respect
to $t$ must vanish:

\[
\int dX\, \partial_t\left(p(X, t)\, P(X, t)\right) = 0.
\]

We define an action of the gauge group on $p$ with a plus sign instead of the minus sign
used when acting on $P$:

\[
\nabla_i p = (D_i + A_i)\,p, \qquad \nabla_t p = (\partial_t + Q)\,p.
\]

This means that $p$ and $P$ have opposite charges under the numéraire $\mathbb{R}$ gauge
group, such that $pP$ is neutral and $\nabla_t(pP) = \partial_t(pP)$. We have thus

\[
\int dX \left(\nabla_t p(X, t)\, P(X, t) + p(X, t)\, \nabla_t P(X, t)\right) = 0.
\]

Using Eq. (3) for $\nabla_t P(X, t)$ and integrating by parts on the spatial directions, this
equation becomes

\[
\int dX \left(\nabla_t p(X, t) - \frac{1}{2}\nabla^i\nabla_i p(X, t)\right) P(X, t) = 0. \tag{6}
\]
2

This equation will be automatically satisfied if

\[
\nabla_t p = \frac{1}{2}\nabla^i\nabla_i p. \tag{7}
\]

Moreover, if the market is complete, Eq. (6) must be true for all functions $P(\cdot, t)$, which
imposes Eq. (7). This is the Kolmogorov forward equation, written in a covariant way.
It should be noted that $p(X, t)$ is a density, which means that the Levi-Civita
connection does not reduce to a partial derivative as would be the case for a scalar.
More precisely, the transition probability $p(X_0, 0; X, t)$ takes values in
$L \boxtimes L^* \otimes \wedge^d(T^*M)$. Numéraire gauge transformations associated with the line bundle $L$ acting
on $p$ give the well-known changes of measure which are usually obtained from the
Girsanov formula.

3 Asymptotic Implied Volatility

We consider a stochastic volatility model where the variable is a forward price or
rate $F$ with a volatility variable $V$:

\[
\begin{aligned}
dF &= \sigma_F(F, V)\,dW_1 \\
dV &= \mu_V(V)\,dt + \sigma_V(V)\,dW_2
\end{aligned}
\]

with $dW_1\,dW_2 = \rho\,dt$.

Our computation of an asymptotic expansion at short time of implied volatility at
strike $K$ involves four steps:
1. Compute an asymptotic value of the transition probability from initial state $F_0$,
$V_0$ at time 0 to $K$, $V$ at time $t$ using a heat kernel expansion;
2. Compute $E[\sigma_F^2(K, V)\,\delta(F - K)]$ using a saddle point method;
3. Integrate over time to compute the time value;
4. Compare to the same formula for the Black-Scholes model to extract the implied
volatility.

3.1 Heat Kernel Expansion

In order to keep the exposition as simple and clear as possible, we skip technical
details here and refer the reader to [5, 13] for mathematically precise statements.³

3 In finance we will usually consider noncompact manifolds, possibly with boundaries as in the
SABR model for 0 < β < 1.

At short time the solution of Eq. (7) with initial condition $p(X, 0) = \delta(X - X_0)$
is asymptotically given by a heat kernel expansion⁴

\[
p(X, t) = \frac{\sqrt{g(X)}}{(2\pi t)^{n/2}} \sqrt{\Delta(X_0, X)}\; \mathcal{P}(X_0, X)\; e^{-\frac{d^2(X_0, X)}{2t}} \sum_{k \ge 0} a_k(X_0, X)\, t^k. \tag{8}
\]

$g(X)$ is the determinant of the metric at point $X$:

\[
g = \mathrm{Det}(g_{ij}).
\]

$d(X_0, X)$ is the geodesic distance between the starting point $X_0$ and the end point
$X$; this is the minimal distance between $X_0$ and $X$. It can also be written as

\[
\frac{d^2(X_0, X)}{t} = \min_{X(s)} \int_0^t ds\; g_{ij}\, \dot X^i \dot X^j
\]

where the minimum is taken over all paths going from $X(0) = X_0$ to $X(t) = X$.
(This is independent of $t$.) We denote by $\mathcal{C}$ this geodesic path. $\Delta(X_0, X)$ is the Van
Vleck–Morette determinant

\[
\Delta(X_0, X) = \frac{\mathrm{Det}\left(-\dfrac{1}{2}\dfrac{\partial^2 d^2(X_0, X)}{\partial X_0^i\, \partial X^j}\right)}{\sqrt{g(X_0)\, g(X)}}.
\]

$\mathcal{P}(X_0, X)$ is the parallel transport along the geodesic with respect to the $\mathbb{R}$ connection.
It is such that its covariant derivative along the geodesic path is null:

\[
\mathcal{P}(X_0, X) = e^{-\int_{\mathcal{C}} A_i \dot X^i\, dt} = e^{-\int_{\mathcal{C}} A_i\, dX^i}
\]

where the integral is computed on the geodesic path $\mathcal{C}$. Finally, $a_k(X_0, X)$ are
functions which are defined recursively with

\[
a_0 = 1
\]

4 Using the Feynman path integral, the solution to Eq. (7) can be written up to some normalization factor
as

\[
p(X, t) \propto \int [DX]\; e^{-\int_0^t dt \left(\frac{1}{2} g_{ij} \dot X^i \dot X^j + A_i \dot X^i + Q\right)}
\]

where $[DX]$ means integrating over all paths $X(s)$ going from $X(0) = X_0$ to $X(t) = X$. The
normalization factor is the inverse of the same quantity with the integral computed over all paths
with starting point $X_0$, so that the total probability is 1. It is generally not possible to compute
this integral exactly. However it gives some hints on the asymptotic solution at short time: the
solution will be dominated by the path corresponding to the minimal value of the integrand inside
the exponential, which will be close to the geodesic path.

and the $a_k$'s satisfy the differential equations

\[
\left(k + (\nabla^i d)\, d\, \nabla_i\right) a_k = \mathcal{P}^{-1} \Delta^{-1/2} \left(\frac{1}{2}\nabla^i\nabla_i - Q\right) \mathcal{P} \Delta^{1/2}\, a_{k-1}.
\]

Along a given geodesic curve parameterized by its geodesic distance from $X_0$, this
equation reads

\[
(k + d\,\partial_d)\, a_k = \mathcal{P}^{-1} \Delta^{-1/2} \left(\frac{1}{2}\nabla^i\nabla_i - Q\right) \mathcal{P} \Delta^{1/2}\, a_{k-1}
\]

which can be integrated as

\[
a_k = \frac{1}{d^k} \int_0^d ds\; s^{k-1}\, \mathcal{P}^{-1} \Delta^{-1/2} \left(\frac{1}{2}\nabla^i\nabla_i - Q\right) \mathcal{P} \Delta^{1/2}\, a_{k-1}.
\]

Functions $a_k$ are sections of an $L \boxtimes L^*$ bundle. The parallel transport $\mathcal{P}$ and the
connection with respect to the numéraire gauge group act on the second factor of this
external product. (The first factor is related to the numéraire at $t = 0$.) Also note that
$p(X, t)/\sqrt{g(X)}$ is a scalar with respect to the Levi-Civita connection.
In order to produce a first order expansion of the implied volatility, only the common
multiplicative factor of expansion (8) is needed. In order to compute a second
order term for the implied volatility, we will also make use of the first corrective term
$a_1 t$ with

\[
a_1 = \frac{1}{d} \int_0^d ds\; \mathcal{P}^{-1} \Delta^{-1/2} \left(\frac{1}{2}\nabla^i\nabla_i - Q\right) \mathcal{P} \Delta^{1/2}. \tag{9}
\]

3.2 Expected Variance

We now compute $E[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)]$. This quantity can be written as an
integral over the terminal volatility variable $V$:

\[
E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \int dV\; \sigma_F^2(K, V)\, p(K, V; t)
\]

where $p(F, V; t)$ is given by the heat kernel expansion (8) with $X = (F, V)^{\mathsf T}$ and
$n = 2$. The integrand can be written as

\[
\sigma_F^2(K, V)\, p(K, V; t) = \frac{1}{2\pi t}\, e^{-\frac{B}{t} - C - Dt + o(t)} \tag{10}
\]

with

\[
B = \frac{1}{2}\, d(F_0, V_0; K, V)^2 \tag{11}
\]
\[
C = -2\ln(\sigma_F(K, V)) - \frac{1}{2}\left[\ln(g(K, V)) + \ln(\Delta(F_0, V_0; K, V))\right] + M(K, V) \tag{12}
\]
\[
D = -a_1(K, V) \tag{13}
\]

where $M$ is the integral of the $\mathbb{R}$ connection

\[
M(K, V) = -\ln(\mathcal{P}(K, V)) = \int_{\mathcal{C}} A_i\, dX^i
\]

on $\mathcal{C}$, the geodesic curve joining $(F_0, V_0)$ to $(K, V)$, and $a_1$ is given in Eq. (9) as an
integral over the geodesic path.
The integral of (10) over $V$ will be dominated at short time by the $B$ term. More
precisely, it will be dominated by the volatility $V_{\min}$ which minimizes
$B(K, V) = \frac{1}{2} d(F_0, V_0; K, V)^2$. This is the final volatility which minimizes the distance between
the initial conditions and the strike $K$. Expanding all functions in the neighborhood
of $V_{\min}$, where $B'(V_{\min}) = 0$, the integrand is

\[
\sigma_F^2(K, V)\, p(K, V; t) = \frac{1}{2\pi t}\, e^{-\frac{B}{t} - C - Dt}\; e^{-\frac{B''}{2t}\delta V^2}
\left[ 1 - \frac{1}{2}\left(C'' - C'^2\right)\delta V^2 - \left(\frac{1}{24} B^{(4)} - \frac{1}{6} B^{(3)} C'\right) \frac{\delta V^4}{t} + \frac{1}{72} \left(B^{(3)}\right)^2 \frac{\delta V^6}{t^2} + o(t) + \text{odd terms} \right]
\]

where derivatives are with respect to $V$, all functions $B$, $C$, $D$ and their derivatives
are taken at $(K, V_{\min})$ and $\delta V = V - V_{\min}$. When writing $o(t)$, we have anticipated
that after integration $\delta V^2 \sim t$. We have also anticipated that odd terms in $\delta V$ will
not give contributions to the integral.
Integrating over $\delta V$, and using that the first even moments of the standard normal
distribution are $M_2 = 1$, $M_4 = 3$ and $M_6 = 15$, we get for the integral

\[
E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \frac{1}{\sqrt{2\pi t\, B''}}\, e^{-\frac{B}{t} - C - Dt}
\left[ 1 - \frac{1}{2}\left(C'' - C'^2\right)\frac{t}{B''} - \left(\frac{1}{24} B^{(4)} - \frac{1}{6} B^{(3)} C'\right) \frac{3t}{B''^2} + \frac{1}{72} \left(B^{(3)}\right)^2 \frac{15t}{B''^3} + o(t) \right].
\]

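This can be rewritten as

\[
E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{B}{t} - \tilde C - \tilde D t + o(t)} \tag{14}
\]

with

\[
\tilde C = C + \frac{1}{2}\ln(B'') \tag{15}
\]
\[
\begin{aligned}
\tilde D &= D + \frac{1}{2B''}\left[ C'' - C'^2 + \frac{1}{4}\frac{B^{(4)}}{B''} - \frac{B^{(3)}}{B''} C' - \frac{5}{12}\left(\frac{B^{(3)}}{B''}\right)^2 \right] \\
&= D + \frac{1}{2B''}\left[ \tilde C'' - \tilde C'^2 - \frac{1}{4}\frac{B^{(4)}}{B''} + \frac{1}{3}\left(\frac{B^{(3)}}{B''}\right)^2 \right]
\end{aligned} \tag{16}
\]

where all derivatives are with respect to $V$ and all functions and their derivatives are
taken at $(K, V_{\min})$.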
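
The two expressions for the corrected coefficient in (16) can be checked against each other numerically. The sketch below is an illustration under stated assumptions: the inputs are arbitrary numbers standing for the derivatives of $B$ and $C$ at the minimizing volatility, and the derivatives of the corrected $C$ are obtained by differentiating (15) with the chain rule, a step the text leaves implicit. All function names are hypothetical.

```python
import math

def standard_normal_even_moment(order):
    """Even moment of the standard normal: M_2 = 1, M_4 = 3, M_6 = 15, ..."""
    moment = 1
    for odd in range(1, order, 2):
        moment *= odd
    return moment

def c_tilde(C, B2):
    """Eq. (15): corrected C, with B2 the second derivative of B at the minimum."""
    return C + 0.5 * math.log(B2)

def d_tilde_first_form(D, B2, B3, B4, C1, C2):
    """First line of Eq. (16), written with derivatives C1 = C', C2 = C''."""
    bracket = (C2 - C1 ** 2 + 0.25 * B4 / B2
               - (B3 / B2) * C1 - (5.0 / 12.0) * (B3 / B2) ** 2)
    return D + bracket / (2.0 * B2)

def d_tilde_second_form(D, B2, B3, B4, C1, C2):
    """Second line of Eq. (16), using derivatives of the corrected C from (15)."""
    Ct1 = C1 + B3 / (2.0 * B2)                           # derivative of (15)
    Ct2 = C2 + B4 / (2.0 * B2) - B3 ** 2 / (2.0 * B2 ** 2)  # second derivative of (15)
    bracket = Ct2 - Ct1 ** 2 - 0.25 * B4 / B2 + (B3 / B2) ** 2 / 3.0
    return D + bracket / (2.0 * B2)

if __name__ == "__main__":
    args = (0.1, 2.0, 0.7, -1.3, 0.4, -0.2)  # arbitrary illustrative inputs
    print(d_tilde_first_form(*args), d_tilde_second_form(*args))
```

Both functions should print the same value up to rounding, which is a quick guard against sign slips when transcribing (16).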

3.3 Time Value

The price of a Call of maturity $T$ and strike $K$ can be written as the payoff integrated
against the risk-neutral distribution:

\[
\mathrm{Call}(K, T) = e^{-rT} \int dF\; (F - K)^+\, p(F, T)
\]

where $p(F, t)$ is the marginal probability density of $F_t$. This can be written also as
a double integral over forward and time as

\[
\mathrm{Call}(K, T) = e^{-rT} \left[ (F_0 - K)^+ + \int dF \int_0^T dt\; (F - K)^+\, \partial_t p(F, t) \right]. \tag{17}
\]

As $F_t$ is a forward, it is a driftless process and the Kolmogorov forward equation
reduces to

\[
\partial_t p(F, t) = \frac{1}{2}\,\partial_F^2 \left( \sigma_{\mathrm{loc}}^2(F, t)\, p(F, t) \right) \tag{18}
\]

where $\sigma_{\mathrm{loc}}(F, t)$ is the local (normal) volatility

\[
\sigma_{\mathrm{loc}}^2(F, t) = E\left[\sigma_F^2(F_t, V_t) \mid F_t = F\right] = \frac{E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - F)\right]}{p(F, t)}.
\]

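Plugging the Kolmogorov equation (18) into Eq. (17) and integrating twice by parts on
the $F$ variable, the Call price is finally obtained as an integral over time at strike $K$:

\[
\mathrm{Call}(K, T) = e^{-rT} \left[ (F_0 - K)^+ + \frac{1}{2} \int_0^T dt\; E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] \right]. \tag{19}
\]

Using expression (14) for the integrand, the integral over time can be computed:

\[
\frac{1}{2} \int_0^T dt\; E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right] =
\frac{1}{\sqrt{2\pi}}\, e^{-\tilde C} \left[ \sqrt{T}\, e^{-\frac{B}{T}} - \sqrt{\pi B}\; \mathrm{erfc}\sqrt{\frac{B}{T}}
- \frac{\tilde D}{3} \left( \sqrt{T}\,(T - 2B)\, e^{-\frac{B}{T}} + 2\sqrt{\pi}\, B^{3/2}\, \mathrm{erfc}\sqrt{\frac{B}{T}} \right) \right] + o\left( T^{5/2} e^{-\frac{B}{T}} \right)
\]

where erfc is the complementary error function, equal to the cumulative of the
standard normal distribution up to $\sqrt{2}$ factors:

\[
\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{+\infty} dy\; e^{-y^2} = 2\, N\left(-\sqrt{2}\, x\right).
\]

The asymptotic expansion of this function at $+\infty$

\[
\mathrm{erfc}(x) = \frac{e^{-x^2}}{x\sqrt{\pi}} \left( 1 - \frac{1}{2x^2} + \frac{3}{4x^4} + o\left(\frac{1}{x^4}\right) \right)
\]

with $x = \sqrt{B/T}$ gives the asymptotic expansion of the time value:

\[
\frac{1}{2} \int_0^T dt\; E\left[\sigma_F^2(F_t, V_t)\,\delta(F_t - K)\right]
= \frac{T^{3/2}}{2\sqrt{2\pi}}\; e^{-\frac{B}{T} - \tilde C - \ln(B) - \tilde D T - \frac{3}{2B} T + o(T)}. \tag{20}
\]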
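
The following sketch illustrates the steps leading to (20) numerically. It checks the relation between erfc and the normal cumulative, the three-term large-argument expansion of erfc, and the agreement between the closed-form time integral and its asymptotic form (20). It is only an illustration: the closed form integrates the linearized integrand, with $e^{-\tilde D t}$ replaced by $1 - \tilde D t$, the function names are hypothetical, and the parameter values are arbitrary.

```python
import math
from statistics import NormalDist

def erfc_via_normal_cdf(x):
    """erfc(x) = 2 N(-sqrt(2) x), with N the standard normal cumulative."""
    return 2.0 * NormalDist().cdf(-math.sqrt(2.0) * x)

def erfc_asymptotic(x):
    """Three-term expansion of erfc at +infinity."""
    series = 1.0 - 1.0 / (2.0 * x ** 2) + 3.0 / (4.0 * x ** 4)
    return math.exp(-x * x) / (x * math.sqrt(math.pi)) * series

def time_value_closed_form(B, c_tilde, d_tilde, T):
    """Half the time integral of exp(-B/t - c_tilde) (1 - d_tilde t) / sqrt(2 pi t) on [0, T]."""
    x = math.sqrt(B / T)
    gauss = math.exp(-B / T)
    i0 = math.sqrt(T) * gauss - math.sqrt(math.pi * B) * math.erfc(x)
    i1 = (math.sqrt(T) * (T - 2.0 * B) * gauss
          + 2.0 * math.sqrt(math.pi) * B ** 1.5 * math.erfc(x))
    return math.exp(-c_tilde) / math.sqrt(2.0 * math.pi) * (i0 - d_tilde / 3.0 * i1)

def time_value_asymptotic(B, c_tilde, d_tilde, T):
    """Right-hand side of Eq. (20) without the o(T) term."""
    exponent = -B / T - c_tilde - math.log(B) - d_tilde * T - 1.5 * T / B
    return T ** 1.5 / (2.0 * math.sqrt(2.0 * math.pi)) * math.exp(exponent)

if __name__ == "__main__":
    B, c_t, d_t, T = 0.5, 0.1, 0.3, 0.02
    print(time_value_closed_form(B, c_t, d_t, T), time_value_asymptotic(B, c_t, d_t, T))
```

The two time values differ by a relative amount of order $(T/B)^2$, so the agreement degrades as the strike approaches the money, where $B$ goes to zero.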

3.4 Implied Volatility

The final step consists in computing the same expansion for the Black–Scholes model,
which is simpler as there is no stochastic volatility to be integrated. The metric is
given by the inverse of the variance:

\[
g_{FF} = \frac{1}{\sigma^2 F^2}.
\]

The Christoffel symbol is therefore

\[
\Gamma^F_{FF} = -\frac{1}{F}.
\]

The $\mathbb{R}$-connection components are computed using (4) and (5):

\[
A_F = \frac{1}{2F}, \qquad Q = \frac{\sigma^2}{8}.
\]

The geodesic distance is

\[
d(F_0, K) = \left| \int_{F_0}^{K} \frac{dF}{\sigma F} \right| = \frac{1}{\sigma} \left| \ln\frac{K}{F_0} \right|.
\]

The Van Vleck–Morette determinant is simply

\[
\Delta(F_0, K) = 1.
\]

The parallel transport is

\[
\mathcal{P}(F_0, K) = e^{-\int_{F_0}^{K} dF\, A_F} = \sqrt{\frac{F_0}{K}}.
\]
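
These one-dimensional Black–Scholes quantities are easy to confirm numerically. The sketch below differentiates the metric by central differences to recover the Christoffel symbol, and integrates along the segment from $F_0$ to $K$ with a composite Simpson rule to recover the geodesic distance and the parallel transport. The helper names, step sizes and sample values are assumptions made for illustration.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def bs_metric(F, sigma):
    """Black-Scholes metric g_FF = 1 / (sigma F)^2."""
    return 1.0 / (sigma * F) ** 2

def bs_christoffel(F, sigma, rel_step=1e-4):
    """One-dimensional Christoffel symbol, (1/2) g^{FF} dg_FF/dF, by central differences."""
    h = rel_step * F
    dg = (bs_metric(F + h, sigma) - bs_metric(F - h, sigma)) / (2.0 * h)
    return 0.5 * dg / bs_metric(F, sigma)

def bs_geodesic_distance(F0, K, sigma):
    """Geodesic distance |int_{F0}^{K} dF / (sigma F)|."""
    return abs(simpson(lambda F: 1.0 / (sigma * F), F0, K))

def bs_parallel_transport(F0, K):
    """Parallel transport exp(-int_{F0}^{K} A_F dF) with A_F = 1 / (2F)."""
    return math.exp(-simpson(lambda F: 0.5 / F, F0, K))

if __name__ == "__main__":
    print(bs_christoffel(100.0, 0.2))              # close to -1/F = -0.01
    print(bs_geodesic_distance(100.0, 120.0, 0.2))  # close to ln(1.2) / 0.2
    print(bs_parallel_transport(100.0, 120.0))      # close to sqrt(100 / 120)
```

The same helpers could be pointed at another one-dimensional local volatility metric by swapping `bs_metric`, which is one reason to keep the integration numerical here.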
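Putting all these elements together, the heat kernel expansion of $p(K, t)$ is according
to (8)

\[
p(K, t) = \frac{1}{\sigma K \sqrt{2\pi t}} \sqrt{\frac{F_0}{K}}\; e^{-\frac{\ln^2(K/F_0)}{2\sigma^2 t}} \left( 1 - \frac{\sigma^2}{8}\, t + o(t) \right).
\]

Multiplying by the local variance, we get

\[
E\left[\sigma_F^2(K)\,\delta(F - K)\right] = \sigma^2 K^2\, p(K, t) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{B_{\mathrm{BS}}}{t} - \tilde C_{\mathrm{BS}} - \tilde D_{\mathrm{BS}}\, t + o(t)} \tag{21}
\]

with

\[
B_{\mathrm{BS}} = \frac{1}{2\sigma^2} \ln^2\frac{K}{F_0}, \qquad
\tilde C_{\mathrm{BS}} = -\ln(\sigma) - \frac{1}{2}\ln(K F_0), \qquad
\tilde D_{\mathrm{BS}} = \frac{\sigma^2}{8}.
\]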

(In fact formula (21) is exact: there is no o(t) correction and it can be integrated
exactly to get the Black–Scholes formula.)
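
The exactness of (21) can be seen directly by comparing it with the lognormal density of a driftless forward. The sketch below is an illustrative check with arbitrary parameter values and hypothetical function names: it evaluates the local variance times the lognormal density and the exponential form of (21) without the $o(t)$ term, and the two should agree to rounding error for any maturity, not only short ones.

```python
import math

def lognormal_forward_density(K, F0, sigma, t):
    """Density at K of F_t = F0 exp(sigma W_t - sigma^2 t / 2) (driftless forward)."""
    z = math.log(K / F0) + 0.5 * sigma ** 2 * t
    return math.exp(-z * z / (2.0 * sigma ** 2 * t)) / (sigma * K * math.sqrt(2.0 * math.pi * t))

def bs_exponential_form(K, F0, sigma, t):
    """Right-hand side of Eq. (21) without the o(t) term."""
    B = math.log(K / F0) ** 2 / (2.0 * sigma ** 2)
    C = -math.log(sigma) - 0.5 * math.log(K * F0)
    D = sigma ** 2 / 8.0
    return math.exp(-B / t - C - D * t) / math.sqrt(2.0 * math.pi * t)

if __name__ == "__main__":
    K, F0, sigma = 120.0, 100.0, 0.25
    for t in (0.1, 1.0, 5.0):
        lhs = (sigma * K) ** 2 * lognormal_forward_density(K, F0, sigma, t)
        print(t, lhs, bs_exponential_form(K, F0, sigma, t))
```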
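Writing Eq. (20) for both the stochastic volatility model and the Black-Scholes
model, the implied volatility is such that both quantities are equal:

\[
\frac{B}{T} + \tilde C + \ln(B) + \tilde D T + \frac{3}{2B} T = \frac{B_{\mathrm{BS}}}{T} + \tilde C_{\mathrm{BS}} + \ln(B_{\mathrm{BS}}) + \tilde D_{\mathrm{BS}} T + \frac{3}{2B_{\mathrm{BS}}} T + o(T). \tag{22}
\]

Expanding the implied volatility $\sigma$ as a Taylor expansion

\[
\sigma(K, T) = \sigma_0(K) + \sigma_1(K)\, T + \sigma_2(K)\, T^2 + o(T^2)
\]

and plugging this into Eq. (22) on the Black–Scholes side, we get

\[
\begin{aligned}
&\frac{1}{2\sigma_0^2 T} \ln^2\frac{K}{F_0} \left[ 1 - 2\frac{\sigma_1}{\sigma_0} T - 2\frac{\sigma_2}{\sigma_0} T^2 + 3\left(\frac{\sigma_1}{\sigma_0}\right)^2 T^2 \right] - \ln(\sigma_0) - \frac{\sigma_1}{\sigma_0} T \\
&\quad - \frac{1}{2}\ln(K F_0) + \ln\left( \frac{1}{2\sigma_0^2} \ln^2\frac{K}{F_0} \right) - 2\frac{\sigma_1}{\sigma_0} T + \frac{\sigma_0^2}{8} T + \frac{3\sigma_0^2}{\ln^2\frac{K}{F_0}} T \\
&= \frac{B}{T} + \tilde C + \ln(B) + \tilde D T + \frac{3}{2B} T + o(T).
\end{aligned}
\]

Coefficients must be equal at each order in $T$, which gives our final expansion of the
implied volatility.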
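Power $-1$ gives the order 0 implied volatility

\[
\sigma_0 = \frac{\left| \ln\frac{K}{F_0} \right|}{\sqrt{2B}} = \frac{\left| \ln\frac{K}{F_0} \right|}{d(F_0, V_0; K, V_{\min})} \tag{23}
\]

which was already obtained in [2, 8].

The first order correction is extracted from the constant term:

\[
\frac{\sigma_1}{\sigma_0} = -\frac{\tilde C + \ln\left(\sigma_0 \sqrt{K F_0}\right)}{2B}. \tag{24}
\]

Finally the $O(T)$ term gives the second order correction:

\[
\frac{\sigma_2}{\sigma_0} = \frac{3}{2} \left(\frac{\sigma_1}{\sigma_0}\right)^2 - \frac{1}{2B} \left[ \tilde D + 3\frac{\sigma_1}{\sigma_0} - \frac{\sigma_0^2}{8} \right]. \tag{25}
\]

This gives our final result as the implied volatility expansion

\[
\sigma = \sigma_0 \left( 1 + \frac{\sigma_1}{\sigma_0} T + \frac{\sigma_2}{\sigma_0} T^2 + o(T^2) \right). \tag{26}
\]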
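
Formulas (23) to (26) translate directly into a few lines of code. The sketch below assumes that the three model-dependent inputs (the minimized $B$ and the corrected coefficients of (15) and (16)) have already been computed for the strike of interest; the function name and argument names are hypothetical. As a sanity check, feeding in the Black–Scholes coefficients of (21) should return the constant volatility with vanishing corrections.

```python
import math

def implied_vol_expansion(B, c_tilde, d_tilde, K, F0, T):
    """Second-order Black implied volatility from Eqs. (23)-(26); requires K != F0."""
    sigma0 = abs(math.log(K / F0)) / math.sqrt(2.0 * B)                        # Eq. (23)
    s1 = -(c_tilde + math.log(sigma0 * math.sqrt(K * F0))) / (2.0 * B)         # Eq. (24)
    s2 = 1.5 * s1 ** 2 - (d_tilde + 3.0 * s1 - sigma0 ** 2 / 8.0) / (2.0 * B)  # Eq. (25)
    return sigma0 * (1.0 + s1 * T + s2 * T ** 2)                               # Eq. (26)

if __name__ == "__main__":
    # Sanity check with the Black-Scholes coefficients of Eq. (21).
    K, F0, sigma, T = 120.0, 100.0, 0.25, 0.5
    B_bs = math.log(K / F0) ** 2 / (2.0 * sigma ** 2)
    C_bs = -math.log(sigma) - 0.5 * math.log(K * F0)
    D_bs = sigma ** 2 / 8.0
    print(implied_vol_expansion(B_bs, C_bs, D_bs, K, F0, T))  # close to 0.25
```

The function divides by $B$, so it is not meant to be called exactly at the money, where the limit of the formulas has to be taken instead.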

We stress that this result is exact in strike: for a given strike, we have computed
exactly the first three coefficients of the Taylor expansion. Moreover, contrary to
other expansions, the order 1 term is extracted from the order 0 expansion of the
probability. This technique allows us to extract a second order term for the implied
volatility from the order 1 term in the probability expansion. This method can be
used to compute the Taylor expansion of implied volatility up to any order, although
the computation becomes more complicated and involves integrals of increasing
dimension: the $a_k$ coefficient of the heat kernel expansion involves $k + 1$ integrals.

3.5 At the Money

The computation we have performed makes the implicit hypothesis that we are not
exactly at the money: $K \neq F_0$. Otherwise, the dominant term in the exponential
would vanish and we could not use the asymptotic expansion of the erfc function at
infinity. Precisely at the money, we should use instead a Taylor expansion at 0. As
the implied volatility surface is smooth, we just take the limit of formulas (23), (24)
and (25) at $K \to F_0$. If we perform instead the Taylor expansion of the erfc function
at 0, we find only the first two orders

\[
\sigma_0(F_0) = \frac{e^{-\tilde C(F_0)}}{F_0}
\]
\[
\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{3} \left( \frac{\sigma_0^2}{8} - \tilde D(F_0) \right).
\]

Careful Taylor expansions of all quantities at the money can be used to check that this
is indeed the limit of Eqs. (23) and (24). Moreover, it can be seen that the existence
of these limits is a condition for formulas (24) and (25) to be convergent, as $B$ goes to
0 at the money (at order 2 in the geodesic distance, which means that the numerators
must in fact vanish at order 2).

3.6 CEV Volatility

Instead of Black volatility, the asymptotic expansion can be computed for other
local volatility models. Without stochastic volatility, the SABR model reduces to the
CEV model. The local volatility part of the model is thus taken into account exactly
without introducing approximation besides the stochastic corrections. In view of our
application to the SABR model, we will compute here a CEV implied volatility.
There are closed formulas for this model, involving Bessel functions. This implied
volatility can therefore be used in the CEV pricing formula in order to get the price
of the option.

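For a CEV model with parameter $\beta_0$ and volatility factor $\sigma$, such that

\[
dF = \sigma F^{\beta_0}\, dW,
\]

the functions $B$, $\tilde C$ and $\tilde D$ are

\[
B_0 = \frac{1}{2\sigma^2}\, q_0^2
\]
\[
\tilde C_0 = -\ln(\sigma) - \frac{1}{2}\beta_0 \ln(K F_0)
\]
\[
\tilde D_0 = \frac{\beta_0 (2 - \beta_0)\, \sigma^2}{8 K^{1-\beta_0} F_0^{1-\beta_0}}
\]

with

\[
q_0 = \begin{cases}
\dfrac{K^{1-\beta_0} - F_0^{1-\beta_0}}{1 - \beta_0} & \beta_0 < 1 \\[2ex]
\ln\dfrac{K}{F_0} & \beta_0 = 1.
\end{cases}
\]

Formulas (23), (24) and (25) are modified as follows.

\[
\sigma_0 = \frac{|q_0|}{\sqrt{2B}} = \frac{|q_0|}{d(F_0, V_0; K, V_{\min})} \tag{27}
\]
\[
\frac{\sigma_1}{\sigma_0} = -\frac{\tilde C + \ln(\sigma_0) + \frac{1}{2}\beta_0 \ln(K F_0)}{2B} \tag{28}
\]
\[
\frac{\sigma_2}{\sigma_0} = \frac{3}{2} \left(\frac{\sigma_1}{\sigma_0}\right)^2 - \frac{1}{2B} \left[ \tilde D + 3\frac{\sigma_1}{\sigma_0} - \frac{\beta_0 (2 - \beta_0)\, \sigma_0^2}{8 K^{1-\beta_0} F_0^{1-\beta_0}} \right]. \tag{29}
\]

This gives the CEV implied volatility expansion

\[
\sigma = \sigma_0 \left( 1 + \frac{\sigma_1}{\sigma_0} T + \frac{\sigma_2}{\sigma_0} T^2 + o(T^2) \right). \tag{30}
\]

The Black implied volatility formulas correspond to the special case $\beta_0 = 1$. The
Bachelier (i.e. normal) implied volatility would correspond to $\beta_0 = 0$.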

3.7 Generalization

This technique can be generalized easily to other parameterizations of the options


prices. Consider a model with local volatility or stochastic volatility, for which there
are closed form formulas for European option prices. It can be used as a proxy in the
following way.
• Denoting by z i the parameters of the model, compute B∗ (z i ), C ∗ (z i ) and D∗ (z i ),
 
the quantities B, C and D of the asymptotic expansion (20) for this model at a
given strike.
(0) (0)
• Find parameters z i such that B∗ (z i ) = B (there can be several solutions).
• Choose a one-dimensional subset of the parameters z i = z i (λ) which allows a
(0)
wide range of option prices at the given strike and such that z i (0) = z i .
∗ (z i ) and D
• Compute derivatives of B∗ (z i ), C ∗ (z i ) with respect to λ at λ = 0. We
use the notation B∗ = B∗ (z i (0)), B∗ = ∂λ B∗ (z i (λ)) |λ=0 …
• Write a Taylor expansion λ(T ) = λ1 T + λ2 T 2 + o(T 2 ) and write the equality of
the asymptotic expansion (20) for the model and the proxy model:

B  B + λ1 T B∗ + λ2 T 2 B∗ + 21 λ21 T 2 B∗


 + 3 T = ∗
+ C + ln(B) + DT
T 2B T
+C∗ + λ1 T C ∗ + 3 T + o(T ).
∗ + ln(B∗ + λ1 T B∗ ) + D
2B∗

• This gives the Taylor expansion of λ:

− C
C ∗ + ln(B) − ln(B∗ )
λ1 =
B∗

− D
D ∗ − λ1 B∗ − λ1 C
∗ − 1 λ2 B∗
B∗ 2 1
λ2 = .
B∗

• Plug parameters z i (λ1 T + λ2 T 2 ) into the closed form option price of the proxy
model to get an approximate price of the option in the real model.
The closer the models are, the better the approximation is. It is clear that if the proxy
model is the real model itself, there are no corrections at all. This procedure consists
in approximating only the differences between models at a given strike and not the
option price itself. In the basic case of Sect. 3.4 where the proxy model is the Black-
Scholes model, the approximation leverages on the fact that the volatility surface is
more regular than the option price.
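
The last two steps of the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, and the coefficients B, C̃, D̃ of the real model and B*, C̃*, D̃* of the proxy (with their λ-derivatives at λ = 0) are assumed to have been computed beforehand.

```python
import math

def proxy_lambda(B, C, D, B_star, C_star, D_star, dB_star, dC_star, d2B_star):
    """Taylor coefficients (lambda_1, lambda_2) of the proxy parameter.

    B, C, D                : coefficients of the real model at the given strike
    B_star, C_star, D_star : same coefficients for the proxy model at lambda = 0
    dB_star, dC_star       : first lambda-derivatives of B* and C~* at lambda = 0
    d2B_star               : second lambda-derivative of B* at lambda = 0
    All names are hypothetical; they are not taken from the text.
    """
    lam1 = (C - C_star + math.log(B) - math.log(B_star)) / dB_star
    lam2 = (D - D_star
            - lam1 * dB_star / B_star
            - lam1 * dC_star
            - 0.5 * lam1 ** 2 * d2B_star) / dB_star
    return lam1, lam2

def proxy_parameter(lam1, lam2, T):
    """Value of lambda(T) = lambda_1 T + lambda_2 T^2 to plug into the proxy model."""
    return lam1 * T + lam2 * T ** 2
```

When the proxy coincides with the real model, both coefficients vanish, which is the "no corrections at all" case mentioned above.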

4 SABR Model

4.1 Model

The SABR model [6] is a stochastic volatility model where the volatility is a local
volatility function multiplied by a lognormal stochastic volatility:

\[
dF = V C(F)\, dW_1, \qquad dV = \nu V\, dW_2
\]

with $dW_1 dW_2 = \rho\, dt$. The initial value for V is the parameter α (see footnote 5):

\[
\alpha = V(0).
\]

C(F) is a local volatility function, which is generally

\[
C(F) = F^{\beta}.
\]

β is a number between 0 and 1 which controls the local skew. 0 corresponds to a
normal process and 1 to a lognormal process. The implied volatility at time 0 and at
the money is the local volatility $\alpha F_0^{\beta}$ (expressed as a normal volatility; the
corresponding lognormal volatility is $\alpha F_0^{\beta-1}$).
Depending on the parameters, the origin F = 0 could be reached with finite
probability in finite time. For example this happens for the CEV process (i.e. even
without stochastic volatility) for β ≤ 1/2. If F models a positive variable, a boundary
condition must be imposed. The asymptotic expansion does not distinguish between
different boundary conditions, as the computation is local around the geodesic path.
It is valid as long as this geodesic does not reach the boundary. However, the range of
maturities over which it is valid may be reduced for low strikes, when the probability
of reflection or absorption at the origin modifies the probability distribution at the
strike considered in a significant way.
In the following sections, we compute the asymptotic expansion for the SABR
model. This short maturity expansion is valid when both α²T and ν²T are small
compared to 1. If one uses CEV implied volatility instead of lognormal implied
volatility, the expansion is in ν²T only. Numerical experiments indicate that the
approximation remains very good for ν²T < 1.

5 We use the standard notation of α for the initial value of the volatility variable in the SABR model
instead of V0 as in the previous section.



4.2 Order 0: Metric

In order to compute the order 0 implied volatility, the only geometric object involved
is the metric. According to the dictionary of Sect. 2.2, its inverse is the covariance
matrix

\[
g^{ij} = \begin{pmatrix} V^2 C(F)^2 & \rho\nu V^2 C(F) \\ \rho\nu V^2 C(F) & \nu^2 V^2 \end{pmatrix}.
\]

This matrix is first simplified by changing the variable F to

\[
q = \int_{F_0}^{F} \frac{dF}{C(F)} \tag{31}
\]

which for $C(F) = F^{\beta}$ reads for β ≠ 1

\[
q = \frac{F^{1-\beta} - F_0^{1-\beta}}{1-\beta}
\]

and for β = 1

\[
q = \ln\left(\frac{F}{F_0}\right).
\]

In addition, we rescale the time such that ν disappears from the equations while
keeping the same solution (the variances, which are the physical quantities, are not
changed):

\[
t \longrightarrow \nu^2 t, \qquad \alpha \longrightarrow \frac{\alpha}{\nu}, \qquad \nu \longrightarrow 1.
\]

At the end of the computation, the inverse transformation must be applied to the
implied volatility:

\[
\sigma\nu \longleftarrow \sigma.
\]

The matrix in the set of variables (q, V) after this rescaling is

\[
g^{ij} = V^2 \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.
\]

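This is diagonalized by going from variables (q, V) to (x, y) with

\[
x = \frac{q - \rho V}{\sqrt{1-\rho^2}}, \qquad y = V.
\]

The covariance matrix becomes

\[
g^{ij} = y^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]

and its inverse is the metric

\[
g_{ij} = \frac{1}{y^2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]

which corresponds to the infinitesimal distance

\[
ds^2 = \frac{dx^2 + dy^2}{y^2}.
\]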

This geometry corresponds to the hyperbolic plane, in the Poincaré half-plane
representation (y > 0) [7, 8]. Geodesics are vertical lines and semi-circles orthogonal
to the y = 0 axis. The geodesic distance between two points (x1, y1) and (x2, y2)
can be computed:

\[
d(x_1, y_1; x_2, y_2) = \cosh^{-1}\left(1 + \frac{(x_2 - x_1)^2 + (y_2 - y_1)^2}{2 y_1 y_2}\right).
\]

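In the (q, V) variables, going from q = 0, V = α to q, V the geodesic distance is

\[
d(0, \alpha; q, V) = \cosh^{-1}\left(1 + \frac{q^2 + (V - \alpha)^2 - 2\rho q (V - \alpha)}{2(1-\rho^2)\alpha V}\right).
\]

For a given strike, i.e. a given q, it is minimized by the volatility

\[
V_{\min} = \sqrt{\alpha^2 + 2\rho\alpha q + q^2}
\]

and the minimal distance is

\[
d(0, \alpha; q) = \cosh^{-1}\left(\frac{V_{\min} - \rho q - \rho^2\alpha}{(1-\rho^2)\alpha}\right)
= \left| \ln\left(\frac{V_{\min} + \rho\alpha + q}{(1+\rho)\alpha}\right) \right|.
\]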
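
As a numerical illustration of the minimization above, the following Python sketch (rescaled variables, ν = 1; function names are hypothetical) evaluates the geodesic distance in the (q, V) variables and compares it with the closed form at V = Vmin.

```python
import math

def geodesic_distance(q, V, alpha, rho):
    """Hyperbolic distance from (0, alpha) to (q, V) in the rescaled variables."""
    num = q * q + (V - alpha) ** 2 - 2.0 * rho * q * (V - alpha)
    return math.acosh(1.0 + num / (2.0 * (1.0 - rho * rho) * alpha * V))

def v_min(q, alpha, rho):
    """Volatility minimizing the distance for a given q."""
    return math.sqrt(alpha ** 2 + 2.0 * rho * alpha * q + q * q)

def minimal_distance(q, alpha, rho):
    """Closed form of the minimal distance (logarithmic expression)."""
    vm = v_min(q, alpha, rho)
    return abs(math.log((vm + rho * alpha + q) / ((1.0 + rho) * alpha)))
```

Evaluating `geodesic_distance` at `v_min(q, alpha, rho)` should reproduce `minimal_distance`, and any other V should give a larger distance.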

Equation (23) gives the order 0 implied volatility

\[
\sigma_0 = \frac{\ln\left(\dfrac{K}{F_0}\right)}{\ln\left(\dfrac{V_{\min} + \rho\alpha + q}{(1+\rho)\alpha}\right)}.
\]

(We have dropped the absolute values as the numerator and the denominator have
the same sign.)
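Plugging the expression for Vmin and going back to the original time, with ν
factors, the order 0 implied volatility for the SABR model is

\[
\sigma_0 = \frac{\nu \ln\left(\dfrac{K}{F_0}\right)}{\ln\left(\dfrac{\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2} + \rho\alpha + q\nu}{(1+\rho)\alpha}\right)} \tag{32}
\]

with

\[
q = \begin{cases}
\dfrac{K^{1-\beta} - F_0^{1-\beta}}{1-\beta} & \beta < 1 \\[2ex]
\ln\left(\dfrac{K}{F_0}\right) & \beta = 1.
\end{cases}
\]

At the money, the limit of this expression is simply

\[
\sigma_0(F_0) = \alpha F_0^{\beta-1}
\]

which is the local volatility.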
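
A minimal Python sketch of Eq. (32) is given below. The function names and the tolerance used to switch to the at-the-money limit are assumptions made for illustration, not part of the original computation.

```python
import math

def sabr_q(K, F0, beta):
    """Variable q of Eq. (31) evaluated at the strike K."""
    if abs(beta - 1.0) < 1e-12:
        return math.log(K / F0)
    return (K ** (1.0 - beta) - F0 ** (1.0 - beta)) / (1.0 - beta)

def sabr_sigma0(K, F0, alpha, beta, rho, nu):
    """Order 0 lognormal implied volatility of Eq. (32)."""
    if abs(K - F0) < 1e-12 * F0:
        # At the money the ratio is 0/0; use the limit alpha * F0^(beta - 1).
        return alpha * F0 ** (beta - 1.0)
    q = sabr_q(K, F0, beta)
    v_min = math.sqrt(alpha ** 2 + 2.0 * rho * alpha * nu * q + (nu * q) ** 2)
    denom = math.log((v_min + rho * alpha + nu * q) / ((1.0 + rho) * alpha))
    return nu * math.log(K / F0) / denom
```

Numerator and denominator change sign together when the strike crosses the forward, so the ratio stays positive on both sides.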

4.3 Order 1: Connection

To compute the order 1 correction, we need the scalar factor in the time value
expansion, given by $\tilde C$ in Eq. (15), with C given in Eq. (12).
For the hyperbolic plane, the Van Vleck–Morette determinant can be computed
as a function of the geodesic distance:

\[
\Delta = \frac{d}{\sinh(d)}.
\]

We need also

\[
\sigma_F(K, V) = V C(K) = V K^{\beta}
\]

\[
g(K, V) = \frac{1}{V^4 C(K)^2 (1-\rho^2)} = \frac{1}{V^4 K^{2\beta} (1-\rho^2)}
\]

Using that $d' = 0$ for the minimum distance at V = Vmin, we compute

\[
B'' = d\, d'' = \frac{d}{\alpha V_{\min} (1-\rho^2) \sinh(d)}
\]

Terms simplify against each other to give at V = Vmin

\[
\tilde C = -\ln\left(\sqrt{\alpha V_{\min}}\, K^{\beta}\right) + M \tag{33}
\]

M is the integral of the connection 1-form on the geodesic. According to formula
(4), the connection is given by

\[
A = \frac{1}{2(1-\rho^2)}\, d\ln(C(F)) - \frac{\rho\, C'(F)}{2(1-\rho^2)}\, dV. \tag{34}
\]

In fact, A can be rewritten in (x, y) variables as

\[
A = \frac{C'(F)}{2\sqrt{1-\rho^2}}\, dx.
\]

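It must be integrated from

\[
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \dfrac{-\rho\alpha}{\sqrt{1-\rho^2}} \\ \alpha \end{pmatrix}
\]

to

\[
\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} \dfrac{q - \rho V}{\sqrt{1-\rho^2}} \\ V \end{pmatrix}.
\]

If β = 1, the 1-form

\[
A = \frac{1}{2\sqrt{1-\rho^2}}\, dx
\]

is exact and can be integrated directly:

\[
M = \frac{q - \rho V + \rho\alpha}{2(1-\rho^2)}
= \frac{1}{2} \ln\left(\frac{K}{F_0}\right) + \frac{\rho}{2(1-\rho^2)} \left( \rho \ln\left(\frac{K}{F_0}\right) - V + \alpha \right).
\]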
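
The equality of the two expressions of M for β = 1 is a simple algebraic identity, which the short Python check below illustrates (rescaled variables with ν = 1; the function names are hypothetical).

```python
import math

def m_lognormal(K, F0, V, alpha, rho):
    """M for beta = 1, written as the integral of A between the two endpoints."""
    q = math.log(K / F0)
    return (q - rho * V + rho * alpha) / (2.0 * (1.0 - rho * rho))

def m_lognormal_split(K, F0, V, alpha, rho):
    """Same quantity, split into the pure-gauge part and the remainder."""
    q = math.log(K / F0)
    return 0.5 * q + rho / (2.0 * (1.0 - rho * rho)) * (rho * q - V + alpha)
```

The split form isolates the term ½ ln(K/F0), which is the part that does not depend on the correlation.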

We consider now the general case where β < 1. If x1 = x2, the geodesic is a
vertical line. As A is along dx, its integral is zero:

\[
M = 0.
\]

In other cases, the first part of Eq. (34) is an exact form and can be integrated directly:

\[
\int_{(F_0, \alpha)}^{(K, V)} \frac{1}{2(1-\rho^2)}\, d\ln(C(F))
= \frac{1}{2(1-\rho^2)} \ln\left(\frac{C(K)}{C(F_0)}\right)
= \frac{\beta}{2(1-\rho^2)} \ln\left(\frac{K}{F_0}\right). \tag{35}
\]

The second part must be integrated on the geodesic path. The geodesic is a semi-circle
with center (X, 0) and radius R, going through (x1, y1) and (x2, y2). The center
is therefore given by

\[
X = \frac{x_2^2 - x_1^2 + y_2^2 - y_1^2}{2(x_2 - x_1)} \tag{36}
\]

and the radius by

\[
R = \sqrt{y_1^2 + (x_1 - X)^2}. \tag{37}
\]

We parameterize the geodesic by t = tan(θ/2) where θ is the angle on the circle:

\[
x = X + R\, \frac{1 - t^2}{1 + t^2}, \qquad y = R\, \frac{2t}{1 + t^2}.
\]

In this parametrization, the geodesic distance is given by

ds = d ln(t)

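and we can compute

\[
-\int_{\mathcal C} \frac{\rho\beta F^{\beta-1}}{2(1-\rho^2)}\, dV
= -\frac{\rho^2\beta}{2(1-\rho^2)} \ln\left(\frac{K}{F_0}\right)
- \frac{\rho\beta}{(1-\beta)\sqrt{1-\rho^2}} \left[ G(t_2) - G(t_1) \right] \tag{38}
\]

with

\[
G(t) = \tan^{-1}(t) +
\begin{cases}
-\dfrac{a + bX}{\sqrt{(a+bX)^2 - (1-\beta)^2 R^2}} \tan^{-1}\left( \dfrac{cR + t\,(a + b(X - R))}{\sqrt{(a+bX)^2 - (1-\beta)^2 R^2}} \right) & (a+bX)^2 > (1-\beta)^2 R^2 \\[3ex]
\dfrac{a + bX}{cR + t\,(a + b(X - R))} & (a+bX)^2 = (1-\beta)^2 R^2 \\[3ex]
\dfrac{a + bX}{\sqrt{(1-\beta)^2 R^2 - (a+bX)^2}}\, \widetilde{\tanh^{-1}}\left( \dfrac{cR + t\,(a + b(X - R))}{\sqrt{(1-\beta)^2 R^2 - (a+bX)^2}} \right) & (a+bX)^2 < (1-\beta)^2 R^2
\end{cases} \tag{39}
\]

\[
a = F_0^{1-\beta}, \qquad
b = (1-\beta)\sqrt{1-\rho^2}, \qquad
c = (1-\beta)\rho, \qquad
t_i = \sqrt{\frac{R - x_i + X}{R + x_i - X}} \tag{40}
\]

and

\[
\widetilde{\tanh^{-1}}(z) = \frac{1}{2} \ln\left| \frac{1 + z}{1 - z} \right|,
\]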

which coincides with the inverse function of tanh on ]−1; 1[. Summing Eqs. (35)
and (38), the integral of the connection is finally

\[
M = \frac{\beta}{2} \ln\left(\frac{K}{F_0}\right) - \frac{\rho\beta}{(1-\beta)\sqrt{1-\rho^2}} \left[ G(t_2) - G(t_1) \right]. \tag{41}
\]

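Replacing M in Eq. (33) we get

\[
\tilde C = -\frac{1}{2} \ln\left( \alpha F_0^{\beta} V_{\min} K^{\beta} \right) +
\begin{cases}
-\dfrac{\rho\beta}{(1-\beta)\sqrt{1-\rho^2}} \left[ G(t_2) - G(t_1) \right] & \beta < 1 \\[3ex]
\dfrac{\rho}{2(1-\rho^2)} \left( \rho \ln\left(\dfrac{K}{F_0}\right) - V_{\min} + \alpha \right) & \beta = 1
\end{cases} \tag{42}
\]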
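Restoring the factor ν, the order 1 correction is given by Eq. (24):

\[
\frac{\sigma_1}{\sigma_0} = -\nu^2\, \frac{\tilde C + \ln\left( \dfrac{\sigma_0 \sqrt{K F_0}}{\nu} \right)}{\ln^2\left( \dfrac{\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2} + \rho\alpha + q\nu}{(1+\rho)\alpha} \right)} \tag{43}
\]

where σ0 is given by Eq. (32), $\tilde C$ by Eq. (42) (with α divided by ν, also inside Vmin),
X, R, t1 and t2 are given in Eqs. (36), (37) and (40) from

\[
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \dfrac{-\rho\alpha}{\nu\sqrt{1-\rho^2}} \\[2ex] \dfrac{\alpha}{\nu} \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} \dfrac{\nu q - \rho\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2}}{\nu\sqrt{1-\rho^2}} \\[2ex] \dfrac{\sqrt{\alpha^2 + 2\rho\alpha\nu q + \nu^2 q^2}}{\nu} \end{pmatrix}
\]

and G(t) is defined in formula (39).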

Using this expression, the first order implied volatility is

\[
\sigma = \sigma_0 \left( 1 + \frac{\sigma_1}{\sigma_0}\, t + o(t) \right)
\]

which is valid for all positive strikes. Exactly at the money, the formula we give must
be replaced by its limit, which can be computed by a Taylor expansion or numerically.
At the money and only at the money it appears to be equal to the original HKLW
formula:

\[
\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{24} \alpha^2 (1-\beta)^2 F_0^{2(\beta-1)} + \frac{1}{4} \rho\alpha\nu\beta F_0^{\beta-1} + \frac{1}{12} \nu^2 - \frac{1}{8} \rho^2\nu^2. \tag{44}
\]

This is not surprising as their expansion is in fact an expansion in both maturity and
moneyness (eventually of order 0 in moneyness).

4.4 Order 2


To compute the second order correction to implied volatility, we need to compute $\tilde D$
as defined in Eq. (16), with $D = -a_1$ and $a_1$ defined in Eq. (9). Most of the integration
can be done analytically. We have first the integral of Q along the geodesic:

\[
a_1^{(Q)} = -\frac{1}{d} \int_{\mathcal C} Q\, ds.
\]

According to Eq. (5), Q is

\[
Q = \frac{\beta}{4} \left( 1 - \beta + \frac{\beta}{2(1-\rho^2)} \right) \frac{V^2}{F^{2(1-\beta)}}.
\]

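Using the values defined in the previous section for X, R, t1, t2, a, b and c, its integral
along the geodesic is

\[
a_1^{(Q)} = -\frac{\beta}{2} \left( 1 - \beta + \frac{\beta}{2(1-\rho^2)} \right) \frac{R^2}{(1-\beta)^2 R^2 - (a + bX)^2}\, \frac{H(t_2) - H(t_1)}{\ln(t_2) - \ln(t_1)} \tag{45}
\]

with

\[
H(t) = \frac{a + b(R + X) + cRt}{(a + bX)(1 + t^2) + bR(1 - t^2) + 2cRt} +
\begin{cases}
\dfrac{cR}{\sqrt{(a+bX)^2 - (1-\beta)^2 R^2}} \tan^{-1}\left( \dfrac{cR + t\,(a + b(X - R))}{\sqrt{(a+bX)^2 - (1-\beta)^2 R^2}} \right) & (a+bX)^2 > (1-\beta)^2 R^2 \\[3ex]
-\dfrac{cR}{\sqrt{(1-\beta)^2 R^2 - (a+bX)^2}}\, \widetilde{\tanh^{-1}}\left( \dfrac{cR + t\,(a + b(X - R))}{\sqrt{(1-\beta)^2 R^2 - (a+bX)^2}} \right) & (a+bX)^2 < (1-\beta)^2 R^2.
\end{cases}
\]

The overall sign is fixed by the definition $a_1^{(Q)} = -\frac{1}{d}\int_{\mathcal C} Q\, ds$ with Q > 0, and by the
β = 1 limit given in Eq. (46).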

Note that in the denominator, the quantity ln(t2) − ln(t1) is up to a sign the geodesic
distance d. If β = 1, $a_1^{(Q)}$ reduces to

\[
a_1^{(Q)} = -\frac{R\, |x_2 - x_1|}{8(1-\rho^2)\, d}. \tag{46}
\]
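
The geometric quantities entering these formulas are easy to compute. The Python sketch below (hypothetical function names, assuming x1 ≠ x2 so that the geodesic is a semi-circle) evaluates X, R and t_i from Eqs. (36), (37) and (40), checks that |ln(t2) − ln(t1)| is the geodesic distance, and evaluates the β = 1 formula (46).

```python
import math

def geodesic_circle(x1, y1, x2, y2):
    """Center abscissa X and radius R of the semi-circle through both points."""
    X = (x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2) / (2.0 * (x2 - x1))
    R = math.sqrt(y1 ** 2 + (x1 - X) ** 2)
    return X, R

def circle_parameter(x, X, R):
    """Parameter t = tan(theta / 2) of the point with abscissa x on the circle."""
    return math.sqrt((R - x + X) / (R + x - X))

def hyperbolic_distance(x1, y1, x2, y2):
    """Geodesic distance in the Poincare half-plane."""
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

def a1_Q_lognormal(x1, y1, x2, y2, rho):
    """Formula (46): a1^(Q) in the case beta = 1."""
    _, R = geodesic_circle(x1, y1, x2, y2)
    d = hyperbolic_distance(x1, y1, x2, y2)
    return -R * abs(x2 - x1) / (8.0 * (1.0 - rho * rho) * d)
```

For the points (0, 1) and (1, 2), for instance, the circle has X = 2 and R = √5, and both ways of computing the distance give ln((3 + √5)/2).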

The Laplacian on the hyperbolic plane is in (x, y) coordinates

\[
D^i D_i = y^2 (\partial_x^2 + \partial_y^2).
\]

As the Van Vleck–Morette determinant $\Delta = \frac{s}{\sinh(s)}$ depends only on the geodesic
distance s, its derivative on the orthogonal coordinate vanishes: $\partial_\perp \Delta = 0$. On the
other hand, by definition the parallel transport on the geodesic curve has no covariant
derivative along the curve: $\nabla_s P = 0$. As a consequence, there is no crossed term and
both terms decouple: we have

\[
P^{-1} \Delta^{-1/2} \nabla^i \nabla_i (P \Delta^{1/2}) = P^{-1} \nabla^i \nabla_i P + \Delta^{-1/2} D^i D_i \Delta^{1/2}
\]

(the R charge is carried only by P).


The metric part can be integrated analytically [1, 7]:

\[
a_1^{(R)} = -\frac{1}{8} \left( 1 + \frac{1}{d} \left( \frac{\cosh(d)}{\sinh(d)} - \frac{1}{d} \right) \right). \tag{47}
\]
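
Formula (47) involves a cancellation between cosh(d)/sinh(d) and 1/d at small distance. A possible way to evaluate it safely is sketched below in Python; the switch to a series below d = 10⁻³ is an implementation choice made for this illustration, not something prescribed by the text.

```python
import math

def a1_R(d):
    """Metric part a1^(R) of Eq. (47) as a function of the geodesic distance d."""
    if abs(d) < 1e-3:
        # Series of (coth(d) - 1/d) / d = 1/3 - d^2/45 + O(d^4), avoids cancellation.
        bracket = 1.0 / 3.0 - d * d / 45.0
    else:
        bracket = (math.cosh(d) / math.sinh(d) - 1.0 / d) / d
    return -(1.0 + bracket) / 8.0
```

With this expression the value at zero distance is −1/6 and the value at large distance tends to −1/8.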

The last part to integrate is the R connection term $P^{-1} \nabla^i \nabla_i P$. As the action of
gauge transformations on the heat kernel expansion is fully carried by the parallel
transport term P, $a_1$ can only depend on gauge-invariant quantities constructed from
$F = dA$. We split therefore $A = A^{(0)} + A^{(1)}$ into a pure gauge part $A^{(0)}$ and $A^{(1)}$
such that $F = dA^{(1)}$:

\[
A^{(0)} = \frac{1}{2}\, d\ln(C(F))
\]

\[
A^{(1)} = \frac{\rho^2}{2(1-\rho^2)}\, d\ln(C(F)) - \frac{\rho\, C'(F)}{2(1-\rho^2)}\, dV.
\]

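Forgetting $A^{(0)}$ which is pure gauge (it can be checked by hand that $A^{(0)}$ will not
contribute), we denote by $P^{(1)}$ the $A^{(1)}$ part of the parallel transport:

\[
P^{(1)} = e^{-M^{(1)}} = e^{-\int_{\mathcal C} A^{(1)}}
\]

with

\[
M^{(1)} =
\begin{cases}
-\dfrac{\rho\beta}{(1-\beta)\sqrt{1-\rho^2}} \left[ G(t_2) - G(t_1) \right] & \beta < 1 \\[3ex]
\dfrac{\rho}{2(1-\rho^2)} \left( \rho \ln\left(\dfrac{K}{F_0}\right) - V + \alpha \right) & \beta = 1.
\end{cases} \tag{48}
\]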

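Using $M^{(1)}$ and $A^{(1)}$ which in (x, y) variables is

\[
A^{(1)} = \frac{\rho\, C'(F)}{2} \left( \frac{\rho}{\sqrt{1-\rho^2}}\, dx - dy \right),
\]

we can rewrite

\[
\frac{1}{2} P^{-1} \nabla^i \nabla_i P = \frac{1}{2} y^2 \left[ -\left( \partial_x^2 + \partial_y^2 \right) M^{(1)} + \left( \partial_x M^{(1)} - A^{(1)}_x \right)^2 + \left( \partial_y M^{(1)} - A^{(1)}_y \right)^2 + \partial_x A^{(1)}_x + \partial_y A^{(1)}_y \right]
\]

where the specific form of $A^{(1)}$ can be used to simplify terms

\[
\partial_x A^{(1)}_x + \partial_y A^{(1)}_y = 0.
\]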

Computing $M^{(1)}$ and $A^{(1)}$ analytically, we compute numerically the x and y
derivatives and integrate numerically along the geodesic curve to get

\[
a_1^{(A)} = \frac{1}{2d} \int_{\mathcal C} ds\; y^2 \left[ -\left( \partial_x^2 + \partial_y^2 \right) M^{(1)} + \left( \partial_x M^{(1)} - A^{(1)}_x \right)^2 + \left( \partial_y M^{(1)} - A^{(1)}_y \right)^2 \right]. \tag{49}
\]

This integral can be computed by a numerical quadrature with few points. For β = 1
the connection has in fact no curvature and therefore $a_1^{(A)} = 0$.

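We compute also the following quantities (at V = Vmin):

\[
d'' = \frac{1}{\alpha (1-\rho^2) V_{\min} \sinh(d)}
\]

\[
B'' = d\, d''
\]

\[
\frac{B^{(3)}}{B''} = -\frac{3}{V_{\min}}
\]

\[
\frac{B^{(4)}}{B''} = \frac{12}{V_{\min}^2} - 3 d'' \left( \frac{\cosh(d)}{\sinh(d)} - \frac{1}{d} \right). \tag{50}
\]

We need finally $\tilde C'$ and $\tilde C''$. Using our decomposition $M = \frac{\beta}{2} \ln\frac{K}{F_0} + M^{(1)}$, we can
write from formulas (12) and (15)

\[
\tilde C = -\frac{\beta}{2} \ln(K F_0) - \frac{1}{2} \ln\left( \frac{d}{\sinh(d)} \right) + \frac{1}{2} \ln B'' + \frac{1}{2} \ln\left( 1-\rho^2 \right) + M^{(1)}.
\]

Its first and second derivatives in V at V = Vmin are

\[
\tilde C' = \frac{1}{2} \frac{B^{(3)}}{B''} + M^{(1)\prime}
\]

\[
\tilde C'' = \frac{1}{2} d'' \left( \frac{\cosh(d)}{\sinh(d)} - \frac{1}{d} \right) + \frac{1}{2} \frac{B^{(4)}}{B''} - \frac{1}{2} \left( \frac{B^{(3)}}{B''} \right)^2 + M^{(1)\prime\prime}.
\]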
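
The closed forms (50) can be checked against finite differences. The Python sketch below works in rescaled variables (ν = 1) and assumes B = d²/2, which is consistent with B″ = d d″ at the minimum; the function names are hypothetical.

```python
import math

def cosh_d(V, q, alpha, rho):
    """cosh of the geodesic distance from (0, alpha) to (q, V)."""
    num = q * q + (V - alpha) ** 2 - 2.0 * rho * q * (V - alpha)
    return 1.0 + num / (2.0 * (1.0 - rho * rho) * alpha * V)

def B(V, q, alpha, rho):
    """B = d^2 / 2 (assumed definition, see the lead-in)."""
    return 0.5 * math.acosh(cosh_d(V, q, alpha, rho)) ** 2

def derivatives_at_vmin(q, alpha, rho):
    """V_min, d and the V-derivatives B'', B^(3), B^(4) at V = V_min, from Eq. (50)."""
    v_min = math.sqrt(alpha ** 2 + 2.0 * rho * alpha * q + q * q)
    d = math.acosh(cosh_d(v_min, q, alpha, rho))
    d2 = 1.0 / (alpha * (1.0 - rho * rho) * v_min * math.sinh(d))
    B2 = d * d2
    B3 = -3.0 / v_min * B2
    B4 = (12.0 / v_min ** 2
          - 3.0 * d2 * (math.cosh(d) / math.sinh(d) - 1.0 / d)) * B2
    return v_min, d, B2, B3, B4
```

The formulas require q ≠ 0, since d vanishes at the money and the ratios must then be replaced by their limits.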

We choose to differentiate $M^{(1)}$ numerically, by finite difference, as the analytical
expression is very long and hard to simplify.
Simplifying $\frac{\tilde C''}{2B''}$ against $-a_1^{(R)}$ we get finally

\[
\tilde D = -a_1^{(Q)} - a_1^{(A)} + \frac{1}{8} + \frac{1}{2B''} \left( M^{(1)\prime\prime} - \left( M^{(1)\prime} \right)^2 + \frac{3}{V_{\min}} M^{(1)\prime} - \frac{3}{4 V_{\min}^2} \right)
\]

with $a_1^{(Q)}$ given in Eq. (45), $a_1^{(A)}$ in (49), $B''$ in (50) and $M^{(1)}$ in (48) with G(t)
defined in Eq. (39). For β = 1, this expression can be simplified using Eq. (46) for
$a_1^{(Q)}$ and

\[
a_1^{(A)} = 0, \qquad
M^{(1)\prime} = -\frac{\rho}{2(1-\rho^2)}, \qquad
M^{(1)\prime\prime} = 0.
\]

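We have finally obtained the second order corrective term

\[
\frac{\sigma_2}{\sigma_0} = \frac{3}{2} \left( \frac{\sigma_1}{\sigma_0} \right)^2 - \frac{1}{2B} \left( \tilde D + 3 \frac{\sigma_1}{\sigma_0} - \frac{\sigma_0^2}{8} \right)
\]

such that we get the quadratic approximation to implied volatility

\[
\sigma = \sigma_0 \left( 1 + \frac{\sigma_1}{\sigma_0} T + \frac{\sigma_2}{\sigma_0} T^2 + o(T^2) \right).
\]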

The computation has been done in redefined variables such that ν = 1. To restore
the ν factors, α must be replaced by α/ν, T by ν²T and the final implied volatility
must be multiplied by ν.
At the money, the formula for σ2/σ0 looks divergent but its limit is well defined. We
compute this limit numerically, although it could be done analytically.
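
Once the coefficients are known, assembling the second order approximation is straightforward. The Python sketch below works in the rescaled variables (ν = 1) and takes $\tilde D$, σ1/σ0, B and σ0 as inputs; the function names are hypothetical and B is assumed to be d²/2, consistent with B″ = d d″.

```python
def sigma2_over_sigma0(D_tilde, s1, B, sigma0):
    """Second order coefficient sigma_2 / sigma_0.

    D_tilde : coefficient D~ of the time value expansion
    s1      : first order coefficient sigma_1 / sigma_0
    B       : leading coefficient (assumed B = d^2 / 2), must be non-zero
    sigma0  : order 0 implied volatility
    """
    return 1.5 * s1 ** 2 - (D_tilde + 3.0 * s1 - sigma0 ** 2 / 8.0) / (2.0 * B)

def implied_vol_order2(sigma0, s1, s2, T):
    """Quadratic approximation sigma_0 (1 + s1 T + s2 T^2)."""
    return sigma0 * (1.0 + s1 * T + s2 * T ** 2)
```

In this formula the correction vanishes when σ1 = 0 and $\tilde D$ = σ0²/8, since the bracket then cancels exactly. Away from that case, the division by B is the source of the apparent divergence at the money.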

4.5 CEV Volatility

The results of Sect. 3.6 can be used to invert the SABR volatility into a CEV fractional
volatility. Using formulas of Sect. 3.6 the implied CEV volatility is computed and
used in the closed-form option prices of the CEV model.
This appears to be useful at low strikes for β < 1/2 or with small volatility of
volatility: only the corrections to the CEV model which come from the stochastic
volatility are approximated, not the local volatility part. For example, at the money
the first order coefficient of the Black volatility which is given by Eq. (44) becomes
for the CEV volatility

\[
\frac{\sigma_1}{\sigma_0}(F_0) = \frac{1}{4} \rho\alpha\nu\beta F_0^{\beta-1} + \frac{1}{12} \nu^2 - \frac{1}{8} \rho^2\nu^2.
\]

The corrective term in α²T has disappeared.
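
The two at-the-money coefficients can be compared directly. The following Python sketch (hypothetical function names) evaluates the Black coefficient of Eq. (44) and the CEV coefficient above; their difference is the pure local volatility term in α².

```python
def atm_sigma1_cev(alpha, beta, rho, nu, F0):
    """At-the-money first order coefficient for the CEV implied volatility."""
    return (0.25 * rho * alpha * nu * beta * F0 ** (beta - 1.0)
            + nu ** 2 / 12.0
            - rho ** 2 * nu ** 2 / 8.0)

def atm_sigma1_black(alpha, beta, rho, nu, F0):
    """At-the-money first order coefficient for the Black implied volatility, Eq. (44)."""
    local_term = alpha ** 2 * (1.0 - beta) ** 2 * F0 ** (2.0 * (beta - 1.0)) / 24.0
    return local_term + atm_sigma1_cev(alpha, beta, rho, nu, F0)
```

For β = 1 both coefficients coincide, and for ν = 0 the CEV coefficient is zero, as expected when the model reduces to a pure CEV process.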

4.6 Numerical Results

We present in Figs. 1 and 2 the implied volatility given by our expansion and compare
it to the implied volatility computed by a two-dimensional finite difference method
(FDM) scheme. We also show for comparison the implied volatility given by the original
formula of [6]. In this example, parameters are F0 = 4, α = 30 %, β = 0.7,
ν = 40 %, ρ = −0.5. The FDM scheme is a second order Yanenko scheme [14]
with exponential fitting. We use 400 points in strike, 200 points in volatility and 30
time steps.

[Figure 1: one pair of panels per maturity (2.5 years, 5 years, 7.5 years, 10 years). Left panels: Implied Volatility versus Strike for Order 2, Order 1, HKLW and FDM. Right panels: Relative Error versus Strike for Order 2, Order 1 and HKLW.]

Fig. 1 Implied volatility and relative error for the SABR model with parameters F0 = 4, α = 30 %,
β = 0.7, ν = 40 %, ρ = −0.5 and maturities 2.5 yr, 5 yr, 7.5 yr and 10 yr. On the left, implied
volatilities are plotted for our first order and second order expansions, the original formula of [6]
and are compared to the result of a FDM solution. On the right are the relative errors with respect
to this reference solution

[Figure 2: one pair of panels per maturity (15 years, 20 years, 30 years). Left panels: Implied Volatility versus Strike for Order 2, Order 1, HKLW and FDM. Right panels: Relative Error versus Strike for Order 2, Order 1 and HKLW.]

Fig. 2 Implied volatility and relative error for the SABR model with parameters F0 = 4, α = 30 %,
β = 0.7, ν = 40 %, ρ = −0.5 and maturities 15 yr, 20 yr and 30 yr. On the left, implied volatilities
are plotted for our first order and second order expansions and the original formula of [6], and are
compared to the result of a FDM solution. On the right are the relative errors with respect to this
reference solution

At very short maturities, all expansions are acceptable as the expansion is dominated
by the order 0 term. At first order our expansion is equal to the HKLW formula at the
money but is more regular in strikes and is better in the wings as our computation
does not involve any approximation in the moneyness. Our second order expansion is
one order of magnitude more precise (see footnote 6). When maturity grows, first order
expansions lose precision but the second order remains relatively good up to 10 years,
where ν²T = 1.6. At higher maturities, the second order expansion explodes quadratically
and finally gives even negative volatilities at very long maturity and low strikes. At
long maturities, an FDM or another numerical method must be used, unless a valid
long maturity expansion could be computed more efficiently.

6 In fact at very short maturities, the FDM scheme we use is less precise and less stable than this
second order expansion, especially in the wings where the probability density is very small.

5 Conclusion

We have presented a general method to compute a Taylor expansion in maturity
of implied volatility for stochastic volatility models. We give exact formulas for
the first and second order corrections. As an application, we have computed this
expansion for the SABR model and compared it to the implied volatility given by a
numerical scheme and to the original HKLW formula. It appears that it gives more
precise results than the usual formula and extends the domain where a short maturity
expansion can be used. Outside this range of validity, other methods must be used:
numerical schemes or possibly other approximations.
If a model with closed-form formulas exists that is closer to the real model than
Black-Scholes, we provide a method to use this model as a proxy to extend the domain
of validity of the expansion. It would be interesting to see the results of this method
for the SABR model with a stochastic volatility model as a proxy.
Obtaining exact option prices at all maturities would be a non-perturbative com-
putation, which is a longstanding issue in theoretical physics.

Acknowledgments We thank Erwan Curien for his interest in this work, Martial Millet, Xavier
Lacroze and Wafaa Bennehhou for many discussions.

Appendix: Mean reversion

At long maturity, the SABR model is not realistic as the volatility process is a
geometric Brownian motion. In particular the variance of the volatility increases
linearly in time. A direct extension would be to add mean reversion to the volatility
process, either on the volatility or on the variance. The asymptotic expansion can
still be computed at order 2. However its domain of validity is usually reduced:
in addition to other conditions, the maturity must be small compared to the mean
reversion characteristic time.
We impose mean reversion on the volatility process as

\[
dV = \nu V\, dW_2 + \kappa(\bar V - V)\, dt
\]

The metric is not modified as it describes the diffusion part. Expressions for A and
Q are modified as follows:

\[
A = A_{\mathrm{SABR}} - \frac{\rho\kappa(V - \bar V)}{\nu V^2 (1-\rho^2) F^{\beta}}\, dF + \frac{\kappa(V - \bar V)}{\nu^2 V^2 (1-\rho^2)}\, dV
\]

\[
Q = Q_{\mathrm{SABR}} + \frac{1}{2} \frac{\kappa^2 (V - \bar V)^2}{\nu^2 V^2 (1-\rho^2)} + \frac{1}{2} \kappa - \kappa \frac{\bar V}{V} - \frac{1}{2} \frac{\rho\beta\kappa(V - \bar V)}{\nu (1-\rho^2) F^{1-\beta}}.
\]

At first order, the integral of A along the geodesic is needed. Using the same
notations as in Sect. 4.3 where variables have been rescaled such that ν = 1, it gets
an additional term

\[
M = M_{\mathrm{SABR}} + \frac{\rho\kappa}{\sqrt{1-\rho^2}} \left[ 2 \left( \tan^{-1}(t_2) - \tan^{-1}(t_1) \right) - \frac{\bar V}{R} \ln\left( \frac{t_2}{t_1} \right) \right]
+ \kappa \left[ \ln\left( \frac{V}{\alpha} \right) + \frac{\bar V}{V} - \frac{\bar V}{\alpha} \right].
\]

The second order correction involves a one-dimensional integral which can be
computed numerically by a quadrature with a few points, although it may be possible
to get an analytical expression.

References

1. Avramidi, I.G.: Analytical and geometric methods for heat kernel applications in finance.
http://infohost.nmt.edu/~iavramid/notes/hkt/hktslides7b.pdf (2007)
2. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2(1), 61–69 (2002)
3. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
4. Cox, J.C.: Notes on option pricing I: constant elasticity of diffusions. Working paper (1975)
5. DeWitt, B.S.: Dynamical Theory of Groups and Fields. Gordon and Breach, New York (1965)
6. Hagan, P.S., Kumar, D., Lesniewski, A.S., Woodward, D.E.: Managing smile risk. Wilmott
Mag. 1(8), 84–108 (2002)
7. Hagan, P.S., Lesniewski, A.S., Woodward, D.E.: Probability distribution in the SABR model of
stochastic volatility. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.)
Large Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics
and Statistics, vol. 110 (2015)
8. Henry-Labordère, P.: A general asymptotic implied volatility for stochastic volatility models.
http://arxiv.org/abs/cond-mat/0504317 (2005)
9. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance: Advanced Methods in
Option Pricing. Chapman & Hall/CRC, Boca Raton (2008)
10. Heston, S.L.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
11. Varadhan, S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20(4),
659–685 (1967)
12. Varadhan, S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20(2), 431–455 (1967)
13. Vassilevich, D.V.: Heat kernel expansion: user’s manual. Physics report 388, 279–360.
http://arxiv.org/abs/hep-th/0306138 (2003)
14. Yanenko, N.N.: The Method of Fractional Steps. Springer, New York (1971)
Unifying the BGM and SABR Models:
A Short Ride in Hyperbolic Geometry

Pierre Henry-Labordère

Abstract In this paper, using a geometric method introduced in (Henry-Labordère
Large Deviations and Asymptotic Methods in Finance (2015) [12]) and initiated by
(Avellaneda et al. Risk Mag. (2002) [4]), we derive an asymptotic swaption implied
volatility at the first-order for a general stochastic volatility Libor Market Model.
This formula is useful to quickly calibrate a model to a full swaption matrix. We
apply this formula to a specific model where the forward rates are assumed to follow
a multi-dimensional CEV process correlated to a SABR process. For a caplet, this
model degenerates to the classical SABR model and our asymptotic swaption implied
volatility reduces naturally to the Hagan et al. formula (Hagan et al. Wilmott Mag.
88–108 (2002) [11]). The geometry underlying this model is the hyperbolic manifold
$H^{n+1}$ with n the number of Libor forward rates.

Keywords Heat kernel expansion · Hyperbolic geometry · Asymptotic smile
formula · Stochastic Libor market model

1 Introduction

The BGM model [6, 14] has recently been the focus of much attention as it gives
a theoretical justification for pricing caps-floors using the classical Black-Scholes
formula. The basic (physical) random variables are given by the Libor forward rates
which are assumed to follow a correlated log-normal process. As the forward swap
rate model implied by the BGM model is quite complicated (the swap forward rate
is not log-normally distributed), the calibration to a swaption matrix is difficult. An
asymptotic swaption implied volatility (at the zero-order in the swaption maturity)
was initially derived by Rebonato [17], Hull and White [9] for the BGM model.
Despite its great success, the BGM model presents the same drawbacks as the
classical Black-Scholes theory: as the forward rates follow a correlated log-normal
process, the model is not able to calibrate the full swaption matrix in/out-the money
(in particular the caplets smile) and give a good dynamics to the Libor rates. The
incorporation of a swaption smile can be obtained by introducing more elaborate
models which should be flexible enough to calibrate caplets and a grid of swaption
volatilities (not necessarily at the money) across all swaption expiries and underlying
swap maturities. One property that these models must still share is their ability to
quickly calibrate the swaption matrix without using complicated numerical routines
such as Monte-Carlo simulation which are usually noisy and time-consuming. In this
context, Andersen-Andreasen introduced the CEV Libor Market Model (LMM) [1]
which assumes that each Libor forward rate follows a CEV process, and showed how
to obtain an asymptotic swaption smile. Their method is still based on the Rebonato
“freezing” argument which consists in assuming that the ratio of a forward Libor rate
over the swap rate and the derivative of the swap rate according to a forward Libor
rate are almost constant (and therefore equal to their values at the spot).

P. Henry-Labordère
Société Générale, Global Markets Quantitative Research, Paris, France
e-mail: [email protected]

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_3

Recently, for this specific model, Kawai found a more accurate asymptotic formula
using the Wiener chaos expansion [15]. Although giving more flexibility than the
BGM model, the CEV LMM model is still not able to calibrate the swaption matrix
for in/out strikes and in this context, we are naturally led to use stochastic volatility
LMM. The literature on this subject is not particularly large. Andersen et al. introduced
a LMM where the Libors follow a multi-dimensional correlated CEV process coupled
(but uncorrelated) to a Heston model [2, 3] and recently Piterbarg modified this
model by allowing the model parameters to be time-dependent [16]. Using an averaging
principle, which consists in replacing the time-dependent parameters by some
effective constant parameters, Piterbarg derives an asymptotic volatility. Note that as
these models are uncorrelated to the stochastic volatility, the swaption fair value is
simply given by the fair price in the case of a local volatility model conditional to the
stochastic volatility process as explained by the Hull and White decomposition [10].
An asymptotic expression can then be generated by approximating the moments of
the volatility process [2].
For pricing exotic options (such as Bermudan swaptions for example), it is simpler
or more natural to model directly the forward swap rate with a stochastic volatility
process. For example, the SABR model [11] was introduced to fulfill this goal. An
asymptotic swaption smile formula (at the first-order) was derived for this specific
model and helps to calibrate quickly the model to liquid market data. In this context,
it is natural to try to reconcile/unify both benchmark models, the BGM and
SABR models. We therefore introduce a LMM where the forward rates follow a
multi-dimensional CEV process (with one beta for each Libor forward rate) correlated
to a SABR model. As is the case for the SABR model, we impose that the
Libors are correlated to a unique volatility and it is therefore not possible to follow
the Andersen et al. [3] method (i.e. the Hull and White decomposition) to derive an
asymptotic swaption smile.
In this paper, we pursue our previous work on the application of the heat kernel
expansion on a Riemannian manifold endowed with an Abelian connection [12]
to derive an asymptotic smile formula for a swaption. The plan of this paper is as
follows: in the first part, we will recall some definitions and present a list of recent
Libor Market Models. In the second part, we apply the heat kernel expansion to
derive an asymptotic swaption smile formula at the first-order valid for any LMM.
In the third part, we present our stochastic LMM and apply this general formula.
We will prove that the geometry underlying this model is the hyperbolic manifold
$H^{n+1}$ with n the number of forward rates. Furthermore, we show that the “freezing”
argument is no longer valid when we try to price a swaption in/out the money.

2 Libor Market Models

We denote by $F_k(t) \equiv F(t, T_{k-1}, T_k)$ the forward rate resetting at $T_{k-1}$, with
$\tau_k = T_k - T_{k-1}$ the tenor. As the product of the bond $P(t, T_k)$ with the forward
rate $F_k(t)$ is a difference of two bonds with maturities $T_{k-1}$ and $T_k$,
$\frac{1}{\tau_k}(P(t, T_{k-1}) - P(t, T_k))$, and therefore a traded asset, $F_k$ is a (local) martingale
under $Q^k$, the (forward) measure associated with the numéraire $P(t, T_k)$. Therefore,
we assume the following driftless dynamics

\[
dF_k(t) = \sigma_k(t)\, \Phi_k(a, F_k)\, dW_k, \quad \forall t \le T_{k-1}, \quad k = 1, \dots, n
\]

\[
dW_k\, dW_l = \rho_{kl}(t)\, dt
\]

with the initial conditions a(t = 0) = α and $F_k(t = 0) = F_k^0$. Throughout this
paper, W denotes a Brownian motion under the forward measure.
In order to achieve some flexibility, we assume that the (normal) local volatility
$\Phi_k(a, F_k)$ depends on a hidden Markov process a (to be specified later) representing
a stochastic volatility. We therefore assume that all the forward rates are coupled
with the same stochastic volatility a. Table 1 presents a list of the different functional
forms for $\Phi_k$ used in the literature. The BGM, (limited) CEV and shifted log-normal
models correspond to local volatility models (a = 1) and the others to stochastic
volatility models with a unique stochastic volatility a driven by a Heston process.

Table 1 Examples of stochastic (or local) volatility Libor models

Libor market model | SDE
BGM [6] | $dF_k = \sigma_k(t) F_k\, dW_k$
CEV [1] | $dF_k = \sigma_k(t) F_k^{\beta}\, dW_k$
Limited CEV | $dF_k = \sigma_k(t) F_k \min(F_k^{\beta-1}, \epsilon^{\beta-1})\, dW_k$ with ε a small positive number
Shifted log-normal | $dX_k = \sigma_k(t) X_k\, dW_k$ with $F_k = X_k + \alpha_k$
FL-SV [2] | $dF_k = \sigma_k(t)(\beta_k F_k + (1 - \beta_k) F_k^0) \sqrt{v}\, dW_k$; $dv = \lambda(v - \bar\lambda)\, dt + \nu \sqrt{v}\, dZ$; $dW_k\, dZ = 0$
FL-TSS [16] | $dF_k = \sigma_k(t)(\beta_k(t) F_k + (1 - \beta_k(t)) F_k^0) \sqrt{v}\, dW_k$; $dv = \lambda(v - \bar\lambda)\, dt + \nu \sqrt{v}\, dZ$; $dW_k\, dZ = 0$

Note that the stochastic differential equation for the Libor rates $F_k$ has been written
in the forward measure $Q^k$ and the stochastic equation for a remains the same in the
forward or forward swap rate measures as a is assumed to be uncorrelated with the
Libor rates. This will not be the case in our LMM.

3 Asymptotic Swaption Smile

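We denote by $s_{\alpha\beta}$ the forward swap rate starting at $T_\alpha$ and expiring at $T_\beta$. The forward
swap rate satisfies the following driftless dynamics in the forward-swap measure
$Q^{\alpha\beta}$ (associated to the numéraire $C_{\alpha\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i P(t, T_i)$)

\[
ds_{\alpha\beta} = \sum_{k=\alpha+1}^{\beta} \frac{\partial s_{\alpha\beta}}{\partial F_k}\, \sigma_k(t)\, \Phi_k(a, F_k)\, dZ_k
\]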
k=α+1

Throughout this paper, W denotes a Brownian under the forward measure.


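The local volatility associated to the forward swap rate ($ds_{\alpha\beta} = \sigma^{\alpha\beta}_{loc}(t, s_{\alpha\beta})\, dB$)
is then by definition

\[
(\sigma^{\alpha\beta}_{loc})^2(t, K) \equiv E^{\alpha\beta}\left[ \sum_{i,j=\alpha+1}^{\beta} \rho_{ij}(t)\, \sigma_i(t)\, \sigma_j(t)\, \Phi_i(a, F_i)\, \Phi_j(a, F_j)\, \frac{\partial s_{\alpha\beta}}{\partial F_i} \frac{\partial s_{\alpha\beta}}{\partial F_j} \,\Big|\, s_{\alpha\beta} = K \right]
\]

\[
= \sum_{i,j=\alpha+1}^{\beta} \rho_{ij}(t)\, \sigma_i(t)\, \sigma_j(t)\, \frac{\int_B \Phi_i(a, F_i)\, \Phi_j(a, F_j)\, \frac{\partial s_{\alpha\beta}}{\partial F_i} \frac{\partial s_{\alpha\beta}}{\partial F_j}\, p\, (da \prod_i dF_i)}{\int_B p\, (da \prod_i dF_i)} \tag{3.1}
\]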

with the submanifold $B = \{(\{F_i\}_i, a) \mid s_{\alpha\beta} = K\}$ and $p \equiv p(F_i, a, t \mid F_i^0, \alpha)$ the
conditional probability satisfying the (backward) Kolmogorov equation associated
to the SDE for the Libors and the volatility $a$ in the forward swap measure $\mathbb{Q}^{\alpha\beta}$. An
asymptotic expression in the short time limit for the local volatility $\sigma^{\alpha\beta}(s, t)$ can
be found in two steps: first by finding an asymptotic expansion for the conditional
probability $p$ (in $\mathbb{Q}^{\alpha\beta}$) and then by doing the integration over $B$. The first step
is achieved via the heat kernel expansion technique summarized in Appendix A,
and the second step using the Laplace saddle-point method explained briefly in
Appendix B.

3.1 Saddle-Point
As the conditional probability at the zero-order is proportional to $e^{-\frac{d(x, x^0)^2}{4t}}$ (see
Appendix A) with $d(x, x^0)$ the geodesic distance between the points $x =
(\{F_i\}_{i=1,\dots,n}, a) \in \mathbb{R}^{n+1}$ and $x^0 = (\{F_i^0\}_{i=1,\dots,n}, \alpha)$, the saddle-point corresponds
to the point $x$ on the submanifold $s_{\alpha\beta} = K$ which minimizes the geodesic distance
$d(x, x^0)$ [4, 5]

$$(\{F_i^*\}, a^*) \equiv \operatorname{argmin}_{\{F_i\}, a \,\mid\, s_{\alpha\beta} = K}\,[d(x, x^0)^2] \qquad (3.2)$$

Introducing a Lagrange multiplier $\lambda$, this is equivalent to

$$(\{F_i^*\}, a^*) \equiv \operatorname{argmin}_{\{F_i\}, a, \lambda}\,[d(x, x^0)^2 + \lambda(s_{\alpha\beta}(F) - K)] \qquad (3.3)$$

3.2 Asymptotic Local Volatility

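Plugging our asymptotic expression for the conditional probability (5.2) into (3.1)
and doing the integration over $B$ using the Laplace method (see Appendix B), we
finally obtain the local volatility at the first-order

$$(\sigma_{loc}^{\alpha\beta})^2(t, K) = \sum_{i,j=\alpha+1}^{\beta} \rho_{ij}(t)\sigma_i(t)\sigma_j(t)\, f_{ij}(F^*, a^*) \Big\{ 1 + 2t \sum_{\mu,\nu=1}^{n+1} A^{\mu\nu} \Big( \frac{\partial_\mu f_{ij}(F^*, a^*)}{f_{ij}(F^*, a^*)} \Big( 2\,\frac{\partial_\nu \psi(F^*, a^*)}{\psi(F^*, a^*)} - \sum_{\gamma,\delta=1}^{n+1} A^{\gamma\delta}\,\partial_{\nu\gamma\delta} d^2 \Big) + \frac{\partial_{\mu\nu} f_{ij}(F^*, a^*)}{f_{ij}(F^*, a^*)} \Big) \Big\}$$

with $f_{ij}(F, a) = \Phi_i(a, F_i)\Phi_j(a, F_j)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}\frac{\partial s_{\alpha\beta}}{\partial F_j}$, $\psi(F, a) = \sqrt{g\,\Delta}\,P$ and $A^{\mu\nu} =
[\partial_{\mu\nu} d^2]^{-1}$. $g$, $P$ and $\Delta$ are defined in Appendix A and computed explicitly in Sect. 4.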
Note that as opposed to other asymptotic methods presented in the literature, this
formula is exact at t → 0. A similar zero-order formula (independent of the time t
for σi (t), ρi j (t) constant) was derived for a general multi-dimensional local volatility
model by [4]. Moreover, in the expansion, we assumed that the time t is small but
we have made no assumption that Fk is close to the spot Libor or that the volatility
of volatility is small.

3.3 Asymptotic Smile

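The asymptotic smile can be derived in two steps from the asymptotic local volatility:
first, we have ($s_0 \equiv s_{\alpha\beta}(t = 0)$)

$$ds_{\alpha\beta} = \frac{\sigma_{loc}^{\alpha\beta}(t, s_{\alpha\beta})}{\sigma_{loc}^{\alpha\beta}(t, s_0)}\,\sigma_{loc}^{\alpha\beta}(t, s_0)\,dB_t$$

and doing a change of local time $t' = \int_0^t \sigma_{loc}^{\alpha\beta}(u, s_0)^2\,du$, we now obtain the associated
local volatility model for the swap rate

$$ds_{\alpha\beta} = \bar\sigma_{loc}^{\alpha\beta}(t', s_{\alpha\beta})\,dB_{t'}$$

with $\bar\sigma_{loc}^{\alpha\beta}(t, s) = \frac{\sigma_{loc}^{\alpha\beta}(t, s)}{\sigma_{loc}^{\alpha\beta}(t, s_0)}$. Secondly, we know that there is a one-to-one
correspondence between this local volatility and the smile [12] given at the first-order by

$$\sigma_{BS}^{\alpha\beta}(K, T_\alpha) = \sqrt{\frac{\int_0^{T_\alpha} (\sigma_{loc}^{\alpha\beta})^2(u, s_0)\,du}{T_\alpha}}\;\sigma_{BS}^{\alpha\beta}(K)^0 \times \Big(1 + \frac{1}{2}\int_0^{T_\alpha} (\sigma_{loc}^{\alpha\beta})^2(u, s_0)\,du\;\sigma_{BS}^{\alpha\beta}(K)^1\Big) \qquad (3.4)$$

$$\sigma_{BS}^{\alpha\beta}(K)^0 = \frac{\ln(K/s_0)}{\int_{s_0}^{K} \frac{dx}{C(x)}}$$

$$\sigma_{BS}^{\alpha\beta}(K)^1 = -\frac{1}{\big(\int_{s_0}^{K} \frac{dx}{C(x)}\big)^2}\,\ln\Big(\frac{(\sigma_{BS}^{\alpha\beta}(K)^0)^2\, K s_0}{C(K)}\Big) + \frac{1}{(\sigma_{loc}^{\alpha\beta})^2(0, s_0)}\,\frac{\partial_t \sigma_{loc}^{\alpha\beta}(f_{av}, 0)}{C(f_{av})}$$

with $C(f) \equiv \frac{\sigma_{loc}^{\alpha\beta}(0, f)}{\sigma_{loc}^{\alpha\beta}(0, s_0)}$, $f_{av} \equiv \frac{s_0 + K}{2}$.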
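
As an illustration of the zero-order term $\sigma_{BS}^{\alpha\beta}(K)^0 = \ln(K/s_0) / \int_{s_0}^{K} dx/C(x)$, the following minimal Python sketch evaluates the integral by Simpson's rule and compares it with a closed form. The choice $C(x) = (x/s_0)^{\beta}$ (a CEV-type shape normalised so that $C(s_0) = 1$), the function names and the numerical values are assumptions made only for this example.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def zero_order_implied_vol(strike, s0, c_func):
    """Zero-order term ln(K/s0) / int_{s0}^{K} dx / C(x)."""
    if abs(strike - s0) < 1e-12:
        # At the money the ratio tends to C(s0) / s0.
        return c_func(s0) / s0
    integral = simpson(lambda x: 1.0 / c_func(x), s0, strike)
    return math.log(strike / s0) / integral

# Illustrative (assumed) inputs: spot swap rate 6 %, strike 8 %, beta = 0.5.
s0, strike, beta = 0.06, 0.08, 0.5
cev_shape = lambda x: (x / s0) ** beta          # assumed form of C(x)

numeric = zero_order_implied_vol(strike, s0, cev_shape)
# Closed form of the integral for this particular C(x).
exact_integral = s0 ** beta * (strike ** (1 - beta) - s0 ** (1 - beta)) / (1 - beta)
closed_form = math.log(strike / s0) / exact_integral
```

Because $C$ is normalised to one at the spot, the value returned is the zero-order volatility in the rescaled time $t'$; the square-root prefactor of (3.4) converts it back to calendar time.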

4 SABR-LMM Model

We have seen that the asymptotic local and implied volatilities can be computed if we
know the geodesic distance and a parametrization of geodesic curves on Mn+1 . This
is the case for the hyperbolic space Hn for all n. This manifold has a lot of important
properties. In the first part, we present our BGM-LMM-SABR model and show
that the underlying geometry is Hn+1 (with n the number of forward Libor rates).

Using this connection, we will find an asymptotic local volatility and an asymptotic
swaption implied volatility.

4.1 Dynamics

We introduce the SABR-LMM model, given by the following SDE under the spot
Libor measure $\mathbb{Q}$ (associated to the numéraire $B_m(t) = \prod_{j=1}^{\beta(t)-1} (1 + \tau_j F_j(T_{j-1}))\,
P(t, T_{\beta(t)-1})$ where $\beta(t) = m$ if $T_{m-2} < t < T_{m-1}$)

$$dF_k = a^2 B_k(t, F)\,dt + \sigma_k(t)\,a\,C_k(F_k)\,dZ_k$$

$$da = \nu a\,dZ_{n+1}, \qquad dZ_i\,dZ_j = \rho_{ij}(t)\,dt, \quad i, j = 1, \dots, n + 1$$

with

$$C_k(F_k) = \phi_k F_k^{\beta_k}$$

$$B_k(t, F) = \sum_{j=\beta(t)}^{k} \frac{\tau_j\,\rho_{jk}\,\sigma_k(t)\sigma_j(t)\,C_k(F_k)C_j(F_j)}{1 + \tau_j F_j}$$

$Z$ is a correlated Brownian motion under the measure $\mathbb{Q}$. Here we take the local
volatility term $\Phi_k(a, F_k)$ of type $a\,\phi_k F_k^{\beta_k}$ with $\phi_k \in \mathbb{R}$.
The functions $C_k(F_k)$ have been scaled by $\phi_k$ and therefore we can impose that
$\sigma_k(0) = 1$. The stochastic equation for $a$ was written in the spot Libor measure in
order to get an SDE independent of a specific underlying swap $s_{\alpha\beta}$ or a forward bond.
Under the forward swap measure $\mathbb{Q}^{\alpha\beta}$, we have

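$$dF_k = a^2 b_k(t, F)\,dt + \sigma_k(t)\,a\,C_k(F_k)\,dZ_k$$

$$da = -\nu^2 a^2 b^a(t, F)\,dt + \nu a\,dZ_{n+1}, \qquad dZ_i\,dZ_j = \rho_{ij}(t)\,dt, \quad i, j = 1, \dots, n + 1$$

with

$$b_k(t, F) = \sum_{j=\alpha+1}^{\beta} (2\cdot 1_{j \le k} - 1)\,\tau_j\,\frac{P(t, T_j)}{C_{\alpha\beta}(t)} \sum_{i=\min(k+1,\,j+1)}^{\max(k,\,j)} \frac{\tau_i\,\rho_{ki}\,\sigma_i(t)\sigma_k(t)\,C_i(F_i)C_k(F_k)}{1 + \tau_i F_i}$$

$$C_{\alpha\beta}(t) = \sum_{i=\alpha+1}^{\beta} \tau_i P(t, T_i)$$

$$b^a(F, t) = \sum_{i=\alpha+1}^{\beta} \tau_i\,\omega_i(t) \sum_{k=\beta(t)}^{i} \frac{\tau_k\,C_k(F_k)\,\rho_{ka}\,\sigma_k(t)}{1 + \tau_k F_k(t)}$$

and with $\omega_i(t) = \frac{\prod_{k=\beta(t)}^{i} \frac{1}{1 + \tau_k F_k}}{\sum_{j=\alpha+1}^{\beta} \prod_{k=\beta(t)}^{j} \frac{1}{1 + \tau_k F_k}}$. Here $Z$ is a correlated Brownian motion
under the measure $\mathbb{Q}^{\alpha\beta}$. Note that the forward-rate dynamics under the forward measure
$\mathbb{Q}^k$ is much simpler and given by the following stochastic differential equations
(SDE)

$$dF_k(t) = \sigma_k(t)\,a\,C_k(F_k)\,dW_k, \qquad dW_k\,dW_p = \rho_{kp}(t)\,dt$$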

As is the case for the BGM model, we can use a piecewise parametric form or a
functional form for the serial volatilities $\sigma_i(t)$ and the correlation $\rho_{ij}(t)$ (here full
rank) as

$$\sigma_i(t) = N_i\big[(a(T_{i-1} - t) + d)\,e^{-b(T_{i-1} - t)} + c\big] \quad \forall t \le T_{i-1}$$

$$\rho_{ij}(t = 0) = \rho_L + (1 - \rho_L)\,e^{-(\delta_A - \delta_B \min[T_{i-1},\,T_{j-1}])\,|T_{i-1} - T_{j-1}|}$$

The constants $N_i$ are fixed such that $\sigma_i(0) = 1$. The model depends on $9 + 3n$
parameters (see Table 2) which are calibrated on the swaption matrix. In the next
subsection, we derive the metric, the geodesic distance and the Abelian connection
underlying this model.
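
The two functional forms above can be sketched in a few lines of Python. This is only an illustration: the function names and all numerical parameter values are assumptions, and the coefficient called $a$ in the volatility formula is named `a_coef` to avoid confusion with the stochastic volatility $a$.

```python
import math

def serial_vol(t, expiry, a_coef, b, c, d):
    """sigma_i(t) for t <= expiry = T_{i-1}, normalised so that sigma_i(0) = 1."""
    def shape(u):
        tau = expiry - u
        return (a_coef * tau + d) * math.exp(-b * tau) + c
    norm = 1.0 / shape(0.0)          # this is the constant N_i
    return norm * shape(t)

def correlation(t_i, t_j, rho_l, delta_a, delta_b):
    """rho_ij(0) between the Libors fixing at t_i = T_{i-1} and t_j = T_{j-1}."""
    decay = (delta_a - delta_b * min(t_i, t_j)) * abs(t_i - t_j)
    return rho_l + (1.0 - rho_l) * math.exp(-decay)

# Illustrative (assumed) parameter values, not calibrated to any market.
params = dict(a_coef=0.05, b=0.5, c=0.1, d=0.1)
vol_today = serial_vol(0.0, 5.0, **params)
vol_later = serial_vol(2.0, 5.0, **params)
rho_3_7 = correlation(3.0, 7.0, rho_l=0.3, delta_a=0.2, delta_b=0.01)
```

Normalising by the value of the shape function at $t = 0$ is what fixes $N_i$, so only $a, b, c, d$ remain free in the volatility term structure.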

4.2 Hyperbolic Geometry

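By definition (see Eq. (5.1a)), the infinitesimal distance (at $t = 0$) between the points
$x_\alpha$ and $x_\alpha + dx_\alpha$ is given by ($\rho^{ij} \equiv [\rho^{-1}]_{ij}$, $(i, j) = (1, \dots, n)$, $\rho^{ia} \equiv [\rho^{-1}]_{ia}$
and $\rho^{aa} \equiv [\rho^{-1}]_{aa}$ are the components of the inverse of the correlation matrix $\rho$)

$$ds^2 \equiv \sum_{\alpha,\beta=1}^{n+1} g_{\alpha\beta}\,dx_\alpha\,dx_\beta = \frac{2}{\nu^2 a^2}\Big(\sum_{i,j=1}^{n} \rho^{ij}\,\frac{\nu\,dF_i}{C_i(F_i)}\,\frac{\nu\,dF_j}{C_j(F_j)} + 2\sum_{i=1}^{n} \rho^{ia}\,\frac{\nu\,dF_i}{C_i(F_i)}\,da + \rho^{aa}\,da^2\Big)$$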

Table 2 SABR-LMM: $9 + 3n$ parameters

BGM parameters     $a, b, c, d, \phi_i, \rho_L, \delta_A, \delta_B$
CEV parameters     $\beta_i$, $i = 1, \dots, n$
SABR parameters    $\alpha, \nu, \rho_{ia}$, $i = 1, \dots, n$

After some algebraic manipulations, we show that in the new coordinates $[x_k]_{k=1\dots n+1}$
($\hat L$ is the Cholesky decomposition of the (reduced) correlation matrix: $[\rho]_{i,j=1\dots n} =
[\hat L \hat L^\dagger]_{i,j=1\dots n}$)

$$x_k = \nu \sum_{i=1}^{n} \hat L^{ki} \int_{F_i^0}^{F_i} \frac{dF_i'}{C_i(F_i')} + \sum_{i=1}^{n} \rho^{ia}\,\hat L_{ik}\,a, \qquad k = 1, \dots, n$$

$$x_{n+1} = \Big(\rho^{aa} - \sum_{i,j}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\Big)^{\frac{1}{2}} a$$

the metric becomes

$$ds^2 = \frac{2\big(\rho^{aa} - \sum_{i,j}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\big)}{\nu^2}\;\frac{\sum_{i=1}^{n} dx_i^2 + dx_{n+1}^2}{x_{n+1}^2}$$

Written in the coordinates $[x_i]$, the metric is therefore the standard hyperbolic
metric on $\mathbb{H}^{n+1}$ modulo a constant factor $\mu = \frac{2(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij})}{\nu^2}$. This factor
can be eliminated by doing a change of time $t' = \mu^{-1} t$. In order to compute our
saddle-point (3.3), we need the geodesic distance which is given in [18].
Proposition 4.2.1 The geodesic distance $d(x, x^0)$ on $\mathbb{H}^{n+1}$ is given by

$$d(x, x^0) = \cosh^{-1}\Big[1 + \frac{\sum_{i=1}^{n+1} (x_i - x_i^0)^2}{2\,x_{n+1}\,x_{n+1}^0}\Big] \qquad (4.3)$$
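
The distance formula of Proposition 4.2.1 is straightforward to evaluate. The sketch below is a minimal Python version in the upper half-space coordinates, where the last coordinate must be positive; the function name is an assumption for this example. As a sanity check it uses the fact that two points on the same vertical line, at heights $1$ and $e$, are at hyperbolic distance $\ln(e/1) = 1$.

```python
import math

def hyperbolic_distance(x, x0):
    """Geodesic distance of Eq. (4.3); x and x0 are sequences whose last entry is > 0."""
    if len(x) != len(x0):
        raise ValueError("points must have the same dimension")
    if x[-1] <= 0 or x0[-1] <= 0:
        raise ValueError("last coordinate must be positive")
    squared = sum((u - v) ** 2 for u, v in zip(x, x0))
    return math.acosh(1.0 + squared / (2.0 * x[-1] * x0[-1]))

# Two points on a vertical geodesic of the Poincare half-plane.
d_vertical = hyperbolic_distance((0.0, 1.0), (0.0, math.e))
# A generic pair of points in three dimensions.
d_generic = hyperbolic_distance((0.1, -0.2, 0.5), (0.0, 0.0, 0.3))
```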

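The geodesic distance on $\mathbb{H}^{n+1}$ between the point $x = (\{F_k\}_k, a)$ and the
initial point $x^0 = (\{F_k^0\}_k, \alpha)$ (with $q_i \equiv \int_{F_i^0}^{F_i} \frac{dF_i'}{C_i(F_i')}$) is given by

$$d(F, a \mid F^0, \alpha) = \cosh^{-1}\Big[1 + \frac{\nu^2 \sum_{i,j=1}^{n} \rho^{ij} q_i q_j + 2\nu(a - \alpha)\sum_{j=1}^{n} \rho^{ja} q_j + (a - \alpha)^2 \rho^{aa}}{2\big(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\big)\,a\,\alpha}\Big]$$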

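In this model, the non-linear equation (3.3), satisfied by the saddle-point $a^*(s)$, $q_i^*(s)$
which implicitly depends on the swaption strike $s$, reads:

$$a^*(s)^2 \rho^{aa} = \alpha^2 \rho^{aa} - 2\nu\alpha \sum_{i=1}^{n} \rho^{ia} q_i^* + \nu^2 \sum_{i,j=1}^{n} \rho^{ij} q_i^* q_j^* \qquad (4.4)$$

$$\frac{\Big(\rho^{ia}\,\frac{a^*(s) - \alpha}{\nu} + \sum_{j=1}^{n} \rho^{ij} q_j^*\Big)\, d(q^*, a^*)}{a^*(s)\,\big(\cosh(d(a^*, \{q_i^*\}))^2 - 1\big)^{\frac{1}{2}}} = \frac{\lambda}{\alpha}\,\frac{\partial s_{\alpha\beta}}{\partial q_i}\Big|_{(a,q)=(a^*,q^*)} \qquad (4.5)$$

with

$$q_i^* = \phi_i^{-1} \int_{F_i^0}^{F_i^*} x^{-\beta_i}\,dx$$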
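
Equation (4.4) can be checked numerically in the simplest setting. The sketch below assumes a single Libor ($n = 1$) that is uncorrelated with the volatility, so that $\rho^{11} = \rho^{aa} = 1$ and $\rho^{1a} = 0$; the constraint then fixes $q$, only $a$ remains free, and (4.4) reduces to $a^{*2} = \alpha^2 + \nu^2 q^2$. The golden-section search, the function names and the parameter values are all choices made for this illustration, not part of the original derivation.

```python
import math

def sabr_distance(q, a, alpha, nu):
    """Hyperbolic distance for one Libor with zero Libor/volatility correlation (assumed)."""
    numerator = nu * nu * q * q + (a - alpha) ** 2
    return math.acosh(1.0 + numerator / (2.0 * a * alpha))

def golden_section_min(func, lo, hi, tol=1e-12):
    """Minimise a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if func(c) < func(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)

# Illustrative (assumed) values.
alpha, nu, q = 0.2, 0.4, 0.5

a_star_numeric = golden_section_min(lambda a: sabr_distance(q, a, alpha, nu), 0.01, 1.0)
a_star_formula = math.sqrt(alpha ** 2 + (nu * q) ** 2)   # Eq. (4.4) in this special case
```

The numerical minimiser agrees with the closed form only up to roughly the square root of machine precision, because the distance is flat to second order around its minimum.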

The saddle-point is determined by solving the non-linear equations (4.4) and (4.5).
An approximation (which could be used as an initial guess in a numerical
optimization routine) is found by linearizing these equations around the spot Libor rates
(i.e. $q_i = 0$)

$$\lambda^*(s) = \frac{s - s_0}{\sum_{p,q=1}^{n} \omega_p\,\omega_q\,\tilde\rho_{pq}} \qquad (4.6)$$

$$\frac{F_i^*}{F_i^0} = 1 + \frac{\sum_{j=1}^{n} \tilde\rho_{ij}\,\omega_j\,(s - s_0)}{\sum_{p,q=1}^{n} \omega_p\,\omega_q\,\tilde\rho_{pq}} + O((s - s_0)^2) \qquad (4.7)$$

with $\omega_i \equiv \frac{\partial s_{\alpha\beta}}{\partial q_i}(q_i = 0)$ and $\tilde\rho_{ij} = \rho^{ij} - \frac{\rho^{ia}\rho^{ja}}{\rho^{aa}}$. Note that when the strike is close to
at-the-money, the saddle-points are close to the spot Libors and $a^* = \alpha$. Moreover,
by using the explicit expression for the hyperbolic distance, the Van-Vleck-Morette
determinant is

$$\Delta(F, a \mid F^0, \alpha) = \frac{d(F, a \mid F^0, \alpha)}{\sqrt{\cosh^2(d(F, a \mid F^0, \alpha)) - 1}}$$

4.3 Connection

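The Abelian connection is given by (5.1b)

$$A_i = \frac{1}{C_i(F_i)}\Big(\sum_{j=1}^{n} \rho^{ij}\Big(\frac{b_j(t, F)}{C_j(F_j)} - \frac{\partial_j C_j(F_j)}{2}\Big) - \nu\,\rho^{ia}\,b^a(F, t)\Big)$$

$$A_a = \frac{1}{\nu}\Big(\sum_{j=1}^{n} \rho^{aj}\Big(\frac{b_j(t, F)}{C_j(F_j)} - \frac{\partial_j C_j(F_j)}{2}\Big) - \nu\,\rho^{aa}\,b^a(F, t)\Big)$$

where we have used that

$$\sqrt{g} = \frac{2^{\frac{n+1}{2}}\,\det[\rho]^{-\frac{1}{2}}}{\nu\,a^{n+1}\,\prod_{i=1}^{n} C_i(F_i)}$$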

Finally, the Abelian 1-form connection, $A = \sum_{i=1}^{n} A_i\,dF_i + A_a\,da$, is

$$A = \frac{1}{\nu}\sum_{j=1}^{n}\Big(\frac{b_j(t, F)}{C_j(F_j)} - \frac{\partial_j C_j(F_j)}{2}\Big)\Big(\nu\sum_{i=1}^{n} \rho^{ij}\,dq_i + \rho^{aj}\,da\Big) - b^a(t, F)\Big(\nu\sum_{i=1}^{n} \rho^{ia}\,dq_i + \rho^{aa}\,da\Big)$$

In order to compute the log of the parallel gauge transport $\ln(P)(a, q \mid \alpha) = \int_C A$,
we need to know a parametrization of the geodesic curve on $\mathbb{H}^{n+1}$. However, we
can directly find $\ln(P)(a, q \mid \alpha)$ if we approximate the drifts $b_k(t, F)$ by their values
at the Libor spots (and $t = 0$). A similar approximation was done in the Hagan et al.
formula [11] as was shown in [12]. Modulo this approximation,

$$\ln(P)(a, q \mid \alpha) \sim \frac{1}{\nu}\sum_{j=1}^{n}\Big(\frac{b_j(0, F^0)}{C_j(F_j^0)} - \frac{\partial_j C_j(F_j^0)}{2}\Big)\Big(\nu\sum_{i=1}^{n} \rho^{ij} q_i + \rho^{aj}(a - \alpha)\Big) - b^a(0, F^0)\Big(\nu\sum_{i=1}^{n} \rho^{ia} q_i + \rho^{aa}(a - \alpha)\Big)$$

4.4 Asymptotic Smile—Summary

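The asymptotic local volatility is given by (3.4)

$$(\sigma_{loc}^{\alpha\beta})^2(t, s) = \sum_{i,j=\alpha+1}^{\beta} \rho_{ij}\,\sigma_i(t)\sigma_j(t)\, f_{ij}(a^*, F^*) \times \Big(1 + 2t\sum_{\mu,\nu=1}^{n+1} A^{\mu\nu}\Big\{\frac{\partial_{\mu\nu} f_{ij}(F^*, a^*)}{f_{ij}(F^*, a^*)} + 2\,\frac{\partial_\mu f_{ij}(F^*, a^*)}{f_{ij}(F^*, a^*)}\,\frac{\partial_\nu \psi(F^*, a^*)}{\psi(F^*, a^*)} - \sum_{\gamma,\delta=1}^{n+1} A^{\gamma\delta}\,\partial_{\nu\gamma\delta} d^2(F^*, a^*)\,\frac{\partial_\mu f_{ij}(F^*, a^*)}{f_{ij}(F^*, a^*)}\Big\}\Big)$$

with ($a^* \equiv a^*(s)$, $F^* \equiv \{F_i^*(s)\}_i$) the saddle-point satisfying the Eqs. (4.4) and
(4.5) approximated by (4.6) and (4.7) and

$$f_{ij}(a, F) = a^2 C_i(F_i)C_j(F_j)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}\frac{\partial s_{\alpha\beta}}{\partial F_j}, \qquad \psi(a, F) = \sqrt{g\,\Delta}\,P, \qquad A^{\alpha\beta} = [\partial_{\alpha\beta} d^2]^{-1}$$

$$d(F, a \mid F^0, \alpha) = \cosh^{-1}\Big[1 + \frac{\nu^2 \sum_{i,j=1}^{n} \rho^{ij} q_i q_j + 2\nu(a - \alpha)\sum_{j=1}^{n} \rho^{ja} q_j + (a - \alpha)^2 \rho^{aa}}{2\big(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij}\big)\,a\,\alpha}\Big]$$

$$\ln(P)(a, q \mid \alpha) \sim \frac{1}{\nu}\sum_{j=1}^{n}\Big[\Big(\frac{b_j(0, F^0)}{C_j(F_j^0)} - \frac{\partial_j C_j(F_j^0)}{2}\Big)\Big(\nu\sum_{i=1}^{n} \rho^{ij} q_i + \rho^{aj}(a - \alpha)\Big)\Big] - b^a(0, F^0)\Big(\nu\sum_{i=1}^{n} \rho^{ia} q_i + \rho^{aa}(a - \alpha)\Big)$$

$$\Delta(F, a \mid F^0, \alpha) = \frac{d(F, a \mid F^0, \alpha)}{\sqrt{\cosh^2(d(F, a \mid F^0, \alpha)) - 1}}$$

$$\sqrt{g} = \frac{2^{\frac{n+1}{2}}\,\det[\rho]^{-\frac{1}{2}}}{\nu\,a^{1+n}\,\prod_{i=1}^{n} C_i(F_i)}$$

Note that this expression is exact when $t$ goes to zero. The smile at the first-order
is then obtained by plugging the above expression into (3.4) with
$t' = \frac{\nu^2}{2(\rho^{aa} - \sum_{i,j=1}^{n} \rho^{ia}\rho^{ja}\rho_{ij})}\,t$.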

Remark 4.4.1 (Libor CEV model) Note that our model reduces, as $\nu$ goes to zero
(and $\alpha \equiv 1$), to the Andersen-Andreasen CEV Libor model (with different CEV
parameters for each Libor) and the above expressions degenerate into

$$f_{ij}(F) = C_i(F_i)C_j(F_j)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}\frac{\partial s_{\alpha\beta}}{\partial F_j}$$

$$d(F) = \sqrt{2\sum_{i,j=1}^{n} \rho^{ij} q_i q_j}$$

$$\ln(P)(q) = \sum_{j=1}^{n}\Big(\frac{b_j(0, F^0)}{C_j(F_j^0)} - \frac{\partial_j C_j(F_j^0)}{2}\Big)\sum_{i=1}^{n} \rho^{ij} q_i$$

$$\Delta(F, F^0) = 1$$

$$\sqrt{g} = \frac{2^{\frac{n}{2}}\,\det[\rho]^{-\frac{1}{2}}}{\prod_{i=1}^{n} C_i(F_i)}$$

with the saddle-points (4.4) and (4.5) satisfying the non-linear equations (modulo
the constraint $s_{\alpha\beta} = s$)

$$\rho^{ij} q_j^* = \lambda\,\frac{\partial s_{\alpha\beta}}{\partial q_i}\Big|_{q^*}.$$

4.5 Comments and Numerical Tests

It is interesting to note that for $n = 1$, i.e. for a caplet, the caplet asymptotic smile
reduces to the classical SABR formula by construction. Moreover, the asymptotic
local volatility is given at the zero-order by

$$(\sigma_{loc}^{\alpha\beta})^2(s, t) = \sum_{i,j=1}^{n} \rho_{ij}(t)\sigma_i(t)\sigma_j(t)\,a^*(s)^2\,C_i(F_i^*)C_j(F_j^*)\,\frac{\partial s_{\alpha\beta}}{\partial F_i}(F^*)\,\frac{\partial s_{\alpha\beta}}{\partial F_j}(F^*)$$

with $F^*$ depending implicitly on $s$ via (4.4) and (4.5). At this stage, it is useful
to recall how a similar asymptotic local volatility is derived using the “freezing”
argument. The forward swap rate satisfies the following SDE in the forward swap
measure $\mathbb{Q}^{\alpha\beta}$

$$ds_{\alpha\beta} = \sum_{k=1}^{n} \frac{\partial s_{\alpha\beta}}{\partial F_k}\,\sigma_k(t)\,a\,C_k(F_k)\,dZ_k \qquad (4.8)$$

The “freezing” argument consists in assuming that the terms $\frac{\partial s_{\alpha\beta}}{\partial F_k}$ and $\frac{C(F_i)}{C(s)}$ are
almost constant. Therefore, the SDE (4.8) can be approximated by

$$ds_{\alpha\beta} = \sum_{k=1}^{n} \frac{\partial s_{\alpha\beta}}{\partial F_k}(F^0)\,\sigma_k(t)\,a\,\frac{C_k(F_k^0)}{C_k(s^0)}\,C_k(s)\,dZ_k$$

and the local volatility is

$$(\sigma_{loc}^{\alpha\beta})^2(s, t) = \sum_{i,j=1}^{n} \rho_{ij}(t)\sigma_i(t)\sigma_j(t)\,a^*(s)^2\,\frac{C_i(F_i^0)}{C_i(s^0)}\,\frac{C_j(F_j^0)}{C_j(s^0)}\,\frac{\partial s_{\alpha\beta}}{\partial F_i}(F^0)\,\frac{\partial s_{\alpha\beta}}{\partial F_j}(F^0)\,C_i(s)C_j(s)$$

Table 3 Scenario A: Libor volatility $\lambda_i(t) = 5\,\%$

Swaption   Strike          MC (%)   F1               F2
5 × 15     4 % (ITM)       22.42    22.41 % (−1)     22.61 % (19)
           6 % (ATM)       20.33    20.41 % (8)      20.46 % (13)
           8 % (OTM)       18.92    18.93 % (1)      19.01 % (10)
10 × 10    4 % (ITM)       22.41    22.51 % (11)     22.67 % (26)
           6 % (ATM)       20.38    20.41 % (3)      20.50 % (12)
           8 % (OTM)       18.93    18.93 % (−1)     19.05 % (12)
Libor $L_i(0) = 6\,\%$. $\beta = 0.5$

Table 4 Scenario B: Libor volatility $dL_i = 0.25\,(0.17 + 0.002\,(T_{i-1} - t))\,L_i^{\beta}\,dW$

Swaption   Strike          MC (%)   F1               F2
5 × 15     5.08 % (ITM)    18.12    18.20 % (8)      18.17 % (5)
           7.26 % (ATM)    16.51    16.61 % (10)     16.63 % (12)
           9.44 % (OTM)    15.38    15.38 % (0)      15.56 % (18)
10 × 10    5.55 % (ITM)    17.80    17.81 % (1)      17.89 % (9)
           7.93 % (ATM)    16.26    16.33 % (7)      16.38 % (11)
           10.31 % (OTM)   15.17    15.19 % (2)      15.32 % (15)
Libor $L_i(0) = \log(a + bi)$, $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$. $\beta = 0.5$

We can reproduce this formula for the swaption smile at-the-money¹ as the
saddle-point Libor rates coincide with the spot rates. This is not the case for in/out-
the-money swaptions. Therefore our expression (exact at the zero-order) shows that
the freezing argument is no longer correct when we try to fit a swaption implied smile
in/out-the-money. In the following, we have tested our asymptotic swaption formula
at the zero-order with the same beta $\beta_k = \beta$ and $\nu = 0$ (Formula F1) against the
Andersen-Andreasen asymptotic formula (Formula F2) [1] in the case $\nu = 0$. The
accuracy of these approximations is examined using Monte-Carlo (MC) prices as a
benchmark. Following [15], we consider five scenarios (see Tables 3, 4, 5, 6 and 7). In
the following tables, the implied volatility is reported and the numbers in brackets are
the errors (in basis points, i.e. true volatility times $10^4$) corresponding to the implied
volatility computed using the F1 or F2 formula minus the MC implied volatility. An
$x \times y$ swaption has an option maturity of $x$ years, a swap length of $y$ years and a
tenor of one year. We set a time-step for Monte-Carlo $\delta = 0.125$ and $2^{16}$ paths.² Our
formula F1 is more accurate than F2 in 28 of the 30 cases reported in Tables 3 to 7.
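
The bracketed errors in the tables can be recomputed from the quoted volatilities. A small Python sketch, with a hypothetical helper name, for two entries of Table 3: volatilities quoted in percent differ by hundredths of a percent, i.e. by basis points. Some table entries differ by one basis point from this recomputation, presumably because the published errors were computed before the volatilities were rounded to two decimals.

```python
def error_in_bp(formula_vol_pct, mc_vol_pct):
    """Difference between two implied volatilities quoted in percent, in basis points."""
    return round((formula_vol_pct - mc_vol_pct) * 100.0)

# Table 3, 5 x 15 swaption, strike 4 % (ITM): MC 22.42 %, F1 22.41 %, F2 22.61 %.
err_f1 = error_in_bp(22.41, 22.42)
err_f2 = error_in_bp(22.61, 22.42)
```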

1 An at-the-money swaption (ATM) has a strike $K$ equal to the spot rate $s_{\alpha\beta}(0)$ and an out-of-the
money (OTM) (resp. in-the-money (ITM)) swaption has $K > s_{\alpha\beta}(0)$ (resp. $K < s_{\alpha\beta}(0)$).
2 We have used a predictor-corrector scheme with a Brownian bridge.

Table 5 Scenario C: Libor volatility $dL_i = 0.25\,(0.17 - 0.002\,(T_{i-1} - t))\,L_i^{\beta}\,dW$

Swaption   Strike          MC (%)   F1               F2
5 × 15     5.08 % (ITM)    14.89    14.97 % (8)      15.08 % (19)
           7.26 % (ATM)    13.73    13.79 % (4)      13.81 % (8)
           9.44 % (OTM)    12.92    12.91 % (−1)     12.92 % (0)
10 × 10    5.55 % (ITM)    14.52    14.53 % (1)      14.64 % (12)
           7.93 % (ATM)    13.33    13.38 % (5)      13.40 % (7)
           10.31 % (OTM)   12.51    12.51 % (0)      12.54 % (3)
Libor $L_i(0) = \log(a + bi)$. $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$. $\beta = 0.5$

Table 6 Scenario D: $dL_i = 0.05\,L_i^{\beta}\Big(\frac{b_{i1}(t)}{\sqrt{b_{i1}(t)^2 + b_{i2}(t)^2}}\,dW_1 + \frac{b_{i2}(t)}{\sqrt{b_{i1}(t)^2 + b_{i2}(t)^2}}\,dW_2\Big)$, with $b_{i1}(t) =
\rho\,e^{-k_1(T_{i-1} - t)} + \theta\,e^{-k_2(T_{i-1} - t)}$, $b_{i2}(t) = \sqrt{1 - \rho^2}\,e^{-k_1(T_{i-1} - t)}$

Swaption   Strike          MC (%)   F1               F2
5 × 15     5.08 % (ITM)    19.19    19.33 % (14)     19.38 % (19)
           7.26 % (ATM)    17.59    17.72 % (13)     17.75 % (16)
           9.44 % (OTM)    16.46    16.49 % (3)      16.61 % (15)
10 × 10    5.55 % (ITM)    18.92    18.94 % (2)      19.06 % (14)
           7.93 % (ATM)    17.31    17.39 % (8)      17.45 % (14)
           10.31 % (OTM)   16.18    16.21 % (3)      16.32 % (14)
$\rho = 0.99$, $\theta = -0.99$, $k_1 = k_2 = 0.54$. Libor $L_i(0) = \log(a + bi)$. $L_0(0) = 5\,\%$, $L_{19}(0) = 9\,\%$.
$\beta = 0.5$

Table 7 Scenario E: Scenario D with $\beta = 0.3$

Swaption   Strike          MC (%)   F1               F2
5 × 15     5.08 % (ITM)    33.09    33.65 % (56)     34.24 % (115)
           7.26 % (ATM)    29.47    29.96 % (49)     30.23 % (76)
           9.44 % (OTM)    26.92    27.14 % (22)     27.49 % (57)
10 × 10    5.55 % (ITM)    31.75    32.41 % (66)     33.33 % (158)
           7.93 % (ATM)    28.47    28.88 % (41)     29.37 % (91)
           10.31 % (OTM)   26.01    26.18 % (17)     26.68 % (67)

5 Conclusion

In this short note, we have introduced an LMM coupled to a SABR stochastic
volatility process. By using the heat kernel expansion technique in the short time
limit, we have obtained an asymptotic swaption implied volatility at the first-order,
compatible with the classical formula of Hagan et al. for caplets. Moreover, we have seen
that this exact expression (when the expiry is very short) is incompatible with the
analogous expression obtained using the freezing argument.

Appendix A: Heat Kernel Expansion

A short-time expansion of the conditional probability for a multi-dimensional Itô
diffusion process can be achieved using the heat kernel expansion. For that purpose, the
Kolmogorov equation is rewritten as the heat kernel equation on an $n$-dimensional
Riemannian manifold $M^n$ endowed with an Abelian connection as explained in
[12, 13]. Let us assume that our multi-dimensional stochastic equations (in $\mathbb{Q}^{\alpha\beta}$) are
written as

$$dx_\mu = b_\mu(t, x)\,dt + \sigma_\mu(t, x)\,dW_\mu$$

with $dW_\mu\,dW_\nu = \rho_{\mu\nu}(t)\,dt$. Then, the metric $g_{\mu\nu}$ depends only on the diffusion terms
$\sigma_\mu$, and the connection $A_\mu$ depends on the drift terms $b_\mu$ as well:

$$g_{\mu\nu}(t, x) = 2\,\frac{\rho^{\mu\nu}(t)}{\sigma_\mu(t, x)\,\sigma_\nu(t, x)}, \quad \mu, \nu = 1, \dots, n, \qquad \rho^{\mu\nu} \equiv [\rho^{-1}]_{\mu\nu} \qquad (5.1a)$$

$$A_\alpha(t, x) = \frac{1}{2}\sum_{\mu=1}^{n} g_{\alpha\mu}\Big(b_\mu(t, x) - \sum_{\nu=1}^{n} g^{-\frac{1}{2}}\,\partial_\nu\big(g^{1/2} g^{\mu\nu}(t, x)\big)\Big), \quad \alpha = 1, \dots, n \qquad (5.1b)$$

with $g(t, x) \equiv \det[g_{\mu\nu}(t, x)]$. In terms of these functions, the asymptotic solution to
the Kolmogorov equation in the short-time limit is given by

$$p(x, t \mid x^0) = \frac{\sqrt{g(x)}}{(4\pi t)^{\frac{n}{2}}}\,\sqrt{\Delta(x, x^0)}\,P(x, x^0)\,e^{-\frac{\sigma(x, x^0)}{2t}}\Big(1 + \sum_{n=1}^{\infty} a_n(x, x^0)\,t^n\Big), \quad t \to 0 \qquad (5.2)$$

• Here, $\sigma(x, x^0)$ is the Synge function defined as

$$\sigma(x, x^0) = \frac{d^2(x, x^0)}{2}$$

The distance $d(x, x^0)$ is defined as the minimizer of

$$d(x, x^0)^2 = \min_{C} \int_0^T g_{\mu\nu}(t = 0, x)\,\frac{dx_\mu(t)}{dt}\,\frac{dx_\nu(t)}{dt}\,dt$$

and $t$ parameterizes the curve $C(x, x^0)$ joining $x(t = 0) \equiv x^0$ and $x(T) \equiv x$.

• $\Delta(x, x^0)$ is the so-called Van Vleck-Morette determinant

$$\Delta(x, x^0) = g(0, x)^{-\frac{1}{2}}\,\det\Big(-\frac{\partial^2 \sigma(x, x^0)}{\partial x\,\partial x^0}\Big)\,g(0, x^0)^{-\frac{1}{2}} \qquad (5.3)$$

with $g(x) = \det[g_{\mu\nu}(0, x)]$

• $P(x, x^0)$ is the parallel transport of the Abelian connection along the geodesic
$C(x, x^0)$ from the point $x$ to $x^0$.

$$P(x, x^0) = e^{-\int_{C(x^0, x)} A_\mu(t = 0, x)\,dx^\mu} \qquad (5.4)$$

• The $a_i(x, x^0)$ are smooth functions on $M^n$ and depend on geometric invariants
such as the scalar curvature $R$. More details can be found in [12].

Appendix B: Saddle-Point Method

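The integration over $B$ is obtained by using a saddle-point method which consists in
approximating at the first order the integral $\int f(x)\,e^{\epsilon\phi(x)}\,dx$ in the limit $\epsilon$ large by [8]

$$\int f(x)\,e^{\epsilon\phi(x)}\,dx \;\sim_{\epsilon \gg 1}\; f(x^*)\,e^{\epsilon\phi(x^*)}\Big(1 + \frac{1}{\epsilon}\Big(-\frac{\partial_{\alpha\beta} f}{2f}\,A^{\alpha\beta} + \Big(\frac{\partial_\alpha f}{2f}\,\partial_{\beta\gamma\delta}\phi + \frac{1}{8}\,\partial_{\alpha\beta\gamma\delta}\phi\Big)A^{\alpha\beta}A^{\gamma\delta} - \frac{\partial_{\alpha\beta\gamma}\phi\,\partial_{\delta\mu\nu}\phi}{72}\,A^{\alpha\beta}A^{\gamma\delta}A^{\mu\nu}\Big)\Big)$$

with $A^{\alpha\beta} = [\partial_{\alpha\beta}\phi]^{-1}$, $dx \equiv \prod_{i=1}^{n} dx_i$ and $x^*$ the saddle-point (which minimizes
$\phi(x)$). Here we have used Einstein summation convention. This expression can be
obtained by developing $\phi(x)$ and $f(x)$ in series around $x^*$. The quadratic part in
$\phi(x)$ leads to a Gaussian integration over $x$ which can be performed.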
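
A one-dimensional numerical check may help to see how this expansion is used in (3.1), where a ratio of two such integrals appears and the common prefactor cancels. The Python sketch below is an illustration under assumptions: it takes $\phi(x) = -\frac{1}{2}(x - x^*)^2$, which has its extremum (here a maximum, a sign convention chosen for this example) at $x^*$ with $\partial_{xx}\phi = -1$, and $f(x) = x^2$. All higher derivatives of $\phi$ vanish, so only the term $-\frac{\partial_{xx} f}{2f}A$ survives and the ratio should equal $f(x^*)(1 + 1/\epsilon)$.

```python
import math

def trapezoid(func, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, n):
        total += func(lo + k * h)
    return total * h

def weighted_average(f, phi, eps, lo, hi):
    """Ratio int f e^{eps*phi} dx / int e^{eps*phi} dx, computed numerically."""
    numerator = trapezoid(lambda x: f(x) * math.exp(eps * phi(x)), lo, hi)
    denominator = trapezoid(lambda x: math.exp(eps * phi(x)), lo, hi)
    return numerator / denominator

x_star = 1.0
phi = lambda x: -0.5 * (x - x_star) ** 2    # assumed test exponent, phi'' = -1
f = lambda x: x * x                         # assumed test integrand

eps = 50.0
numeric_ratio = weighted_average(f, phi, eps, x_star - 4.0, x_star + 4.0)
# First-order saddle-point prediction: A = 1 / phi'' = -1, f'' = 2, f(x*) = 1.
first_order = f(x_star) * (1.0 + 1.0 / eps)
```

For this quadratic exponent the first-order formula is exact, which makes it a convenient test case; for a general $\phi$ the agreement would only hold up to terms of order $1/\epsilon^2$.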

References

1. Andersen, L., Andreasen, J.: Volatility skews and extensions of the Libor market model. Appl.
Math. Financ. 7(1), 1–32 (2000)
2. Andersen, L., Andreasen, J.: Volatile volatilities. Risk 15(12), 163–168 (2002)
3. Andersen, L., Brotherton-Ratcliffe, R.: Extended Libor market models with stochastic volatil-
ity. J. Comput. Financ. 9(1), 1–40 (2005)
4. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing the smile. Risk Mag.
(2002)
5. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 31–44 (1998)
6. Brace, A., Gatarek, D., Musiela, M.: The market model of interest rate dynamics. Math. Financ.
7, 127–154 (1996)
7. Brigo, D., Mercurio, F.: Interest Rate Models, Theory and Practice. Springer Finance. Springer,
Berlin (2015)
8. Erdélyi, A.: Asymptotic Expansions. Dover, New York (1956)

9. Hull, J., White, A.: Forward rate volatilities swap rate volatilities, and the implementation of
the Libor market model. J. Fixed Income 3, 46–62 (2000)
10. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 42,
281–300 (1987)
11. Hagan, P., Kumar, D., Lesniewski, A., Woodward, D.: Managing smile risk. Wilmott Mag.
84–108 (2002)
12. Henry-Labordère, P.: A General Asymptotic Implied Volatility for Stochastic Volatility Models.
http://arxiv.org/abs/cond-mat/0504317
13. Henry-Labordère, P.: Solvable local and stochastic volatility models: supersymmetric methods
in option pricing. Quant. Financ. 7(5), 525–535 (2007)
14. Jamshidian, F.: Libor and swap market models and measures. Financ. Stoch. 1(4), 293–330
(1997)
15. Kawai, A.: A new approximate swaption formula in the Libor model market: an asymptotic
approach. Appl. Math. Financ. 10(1), 49–74 (2003)
16. Piterbarg, V.: A stochastic volatility forward Libor model with a term structure of volatility
smiles, SSRN
17. Rebonato, R.: On the pricing implications of the joint lognormal assumption for the swaption
and cap markets. J. Comput. Financ. 3(3), 5–26 (1999)
18. Terras, A.: Harmonic Analysis on Symmetric Spaces and Applications, vol. I, II. Springer, New
York (1985, 1988)
Second Order Expansion for Implied
Volatility in Two Factor Local Stochastic
Volatility Models and Applications
to the Dynamic λ-Sabr Model

Gérard Ben Arous and Peter Laurence

Abstract Using an expansion of the transition density function of a two dimensional


time inhomogeneous diffusion, we obtain the first and second order terms in the short
time asymptotics of the local volatility function in a family of time inhomogeneous
local-stochastic volatility models. With the local volatility function at our disposal,
we show how recent results (Gatheral et al., Math. Financ. 22:591–620, 2012, [28])
for one dimensional diffusions can be applied to also determine expansions for call
prices as well as for the implied volatility. The results are worked out in detail in
the case of the dynamic Sabr model, thus generalizing earlier work by Hagan et al.
(Wilmott Mag. 84–108, 2003, [31]), Hagan and Lesniewski (Springer Proceedings in
Mathematics and Statistics, vol. 110, 2015, [32]) and by Henry-Labordère (Springer
Proceedings in Mathematics and Statistics, vol. 110, 2015, Geometry, and Modeling
in Finance. Chapman & Hall/CRC Financial Mathematics Series, 2008, [39, 40]).

Keywords Implied volatility · Local volatility · Asymptotic expansion · Heat


kernels

1 Introduction

Stochastic volatility models offer a widely accepted approach to incorporating into
the modeling of option markets a flexibility that accounts for the implied volatility
smile or skew. From a historical perspective the first models to be introduced into the

Part of this work carried out while the author was a Visiting Scholar at the Courant Institute.

G. Ben Arous (B) · P. Laurence


Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street,
New York, NY 10012, USA
e-mail: [email protected]
P. Laurence
Dipartimento di Matematica, Università di Roma 1 Piazzale Aldo Moro, 2,
00185 Rome, Italy

© Springer International Publishing Switzerland 2015 89


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_4

literature were the Hull and White model [37], the Stein and Stein model [50] and
the Heston model [33]. In all three of these models the underlying asset and its volatility
are driven by Brownian motions that may or may not be instantaneously correlated.
The correlation coefficient is taken to be a constant $\rho$. Then Bates introduced the
first of a series of models incorporating jumps [7]. These were followed by work by
Andersen and Andreasen [2].
In recent times there has been an explosion of models using the method of stochastic
time changes to produce ever more versatile models [15, 16]. However, purely
diffusive models have retained their popularity. A case in point is the introduction
into the literature of the dynamic SABR (stochastic alpha-beta-rho) model, by
Hagan, Lesniewski and Woodward [32]

$$dF_t = \gamma(t)\,F_t^{\beta}\,y_t\,dW_{1t}$$
$$dy_t = \nu(t)\,y_t\,dW_{2t}$$
$$\langle dW_{1t}, dW_{2t} \rangle = \rho(t)\,dt$$
$$F_0 = \bar F_0; \quad y_0 = \alpha$$
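
To make the dynamics concrete, here is a minimal simulation sketch in Python. It is an illustration under several assumptions that are not part of the original text: the parameters $\gamma$, $\nu$, $\rho$ are taken constant, the forward is advanced by an Euler step floored at zero, the volatility is advanced by its exact log-normal step, and the function name and parameter values are hypothetical.

```python
import math
import random

def simulate_sabr_path(f0, alpha, beta, nu, rho, gamma=1.0,
                       horizon=1.0, steps=250, seed=0):
    """Return a list of (time, forward, volatility) along one simulated path."""
    rng = random.Random(seed)
    dt = horizon / steps
    sqrt_dt = math.sqrt(dt)
    f, y = f0, alpha
    path = [(0.0, f, y)]
    for k in range(1, steps + 1):
        z1 = rng.gauss(0.0, 1.0)
        # Second Gaussian with correlation rho to the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Euler step for the forward, floored at zero (an assumed boundary rule).
        f = max(f + gamma * (f ** beta) * y * sqrt_dt * z1, 0.0)
        # Exact log-normal step for the volatility, which stays strictly positive.
        y *= math.exp(nu * sqrt_dt * z2 - 0.5 * nu * nu * dt)
        path.append((k * dt, f, y))
    return path

# Illustrative (assumed) parameter values.
path = simulate_sabr_path(f0=0.05, alpha=0.2, beta=0.5, nu=0.4, rho=-0.3)
```

Time-dependent $\gamma(t)$, $\nu(t)$ and $\rho(t)$ could be handled by evaluating them at the start of each step; that extension is left out to keep the sketch short.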

A possible shortcoming of the Sabr model was the lack of mean reversion in the stochastic
volatility. To address this a generalization was proposed by Henry-Labordère, who
in [39] introduced the λ-Sabr model. In this new model the second equation is
complemented by a mean-reverting term

$$dy_t = \kappa(\theta - y_t)\,dt + y_t\,dW_{2t}$$

The incorporation of a mean-reverting volatility was well received in the market
place, since most practitioners think this feature is inherent in the volatility’s evolution.
In this paper we will consider a model which includes both the dynamic Sabr
model and the λ-Sabr models, in that we allow the volatility to be mean reverting
and in addition for all parameters to depend explicitly on time. We explain how the
approach via Laplace asymptotics needs to be adjusted in order to allow for this time
dependence. The family of models considered also encompasses the so-called local
volatility Heston models.
Two major tools have been developed to evaluate European option prices in sto-
chastic volatility models in closed form. The first is the so-called mixing formula,
due to Hull and White [37] and Renault and Touzi [49]. This method is particularly
powerful when applied to uncorrelated stochastic volatility models, since in this case,
the price of the European option is fully determined, provided the law of the time
integral of the instantaneous variance is known. In the language of stochastic time
changes, we need to determine the law of the time change. The second tool, and the
one closest to the approach taken in this paper, was introduced into mathematical
finance by Kunitomo and Takahashi [38]. It is the method of asymptotic expansions.
Hagan and Woodward [30] applied this method to find asymptotic expansions for
the implied volatility of European options in a local volatility setting. Hagan, Kumar,

Lesniewski and Woodward [32] used asymptotic methods to obtain approximations
for the implied volatility in the two factor Sabr models. In a Courant Institute lecture
André Lesniewski [43] introduced the geometric approach to asymptotics, by relat-
ing the underlying geometry of diffusion associated to the SABR model in the case
β = 0 to the Poincaré plane, a model of hyperbolic space, and outlined an approach to
the asymptotics in stochastic volatility models via a WKB expansion. This approach
was further developed in the important unpublished working paper [31] by Hagan,
Lesniewski and Woodward. These authors used changes of variables to reduce the
Sabr model with $\beta \neq 0$ to a perturbed form of the Sabr model with $\beta = 0$ and
then used the Baker-Campbell-Hausdorff formula to find approximate solutions for
the fundamental solution of the perturbed problem. In [39] Henry-Labordère made
contributions of both a theoretical and a practical nature. On the theoretical side
he showed how an approach to small time asymptotics of fundamental solutions
can be derived via the heat kernel expansion. This expansion was introduced by
Minakshisundaram and Pleijel in [44], building on earlier work by Hadamard [29].
Using the heat kernel methods a fundamental solution for linear parabolic differential
operators with variable coefficients can be obtained even in cases where the operator
contains first order (i.e. drift) and zero order terms. On the practical side, as men-
tioned earlier, Henry-Labordère introduced the λ-Sabr model, and showed how the
heat kernel method yields asymptotic formulas for the fundamental solution and for
the implied volatility and local volatility in this model. Also, in recent work, Forde
and Jacquier [24, 25], have begun the rigorous investigation of implied volatility in
both the short and long time limit, in the Heston model. In particular they obtained the
first closed form near the money expression for the implied volatility in the Heston
setting. This work was further extended in joint work with Mijatović and with Lee
[26, 27]. A different, i.e. fast mean reverting, regime was investigated in [23]. Also,
Takahashi and collaborators have recently applied their Malliavin and Wiener-Itô
iterated integral based approach to the λ-Sabr model [51], and have demonstrated
its effectiveness and versatility. Benhamou and Croissant [8] examine the concept
of local time in the Sabr model and show that this valuation works using a Black-
Scholes like formula in which complex quantities appear. Bourgade and Croissant
[13], following an approach by Molchanov [47], in a working paper, applied the
latter to a generalized p-homogeneous version of the Sabr model [13].
In a recent paper Gatheral, Hsu, Laurence, Ouyang and Wang [28] have re-
considered the implied volatility expansions in local volatility models and developed
highly accurate expansions up to the second order for the implied volatility. They
have also pointed out the need to consider separate “regimes” in developing such
expansions. A “near the money” small time limit regime was considered in an influential
paper by Medvedev and Scaillet [45] based on the nonlinear PDE approach
of Berestycki et al. [11], extended to a class of jump diffusions. Gatheral et al. [28]
considered two complementary regimes, an “away from the money” regime and an
“at the money” regime. In the former, the limit is taken as the time to maturity τ goes to
zero, for fixed value of the spot (or of the forward) and for fixed strike. This regime
is associated to exponential decay of the difference between call prices and their
intrinsic value as τ → 0. On the practical side, the main contribution of [28] was

extending earlier asymptotic results for local volatility models to second order, so
that they become capable of furnishing highly accurate expansions both for time homogeneous
and time inhomogeneous models. In addition, unlike earlier treatments (with
the exception of Forde and Jacquier’s treatment of the Heston model) the expansions
in [28] were put on a rigorous mathematical basis. In particular it is shown that the
terms in the expansion of the implied volatility can actually be interpreted as deriv-
atives with respect to time to maturity, and for this reason, are in a certain sense
optimal.
In Sect. 4.2 we combine the results in [28] with the Gyöngy projection technique
and with the heat kernel method for time inhomogeneous diffusions to develop
highly accurate asymptotic expansions, and we show, using in part the results
in [28], that these results extend to stochastic volatility models with time dependent
parameters. That is to say, we describe in detail the first and second terms in the
expansion of the implied volatility in the limit of short times to expiration in two
factor local-stochastic volatility models. The form of the resulting expansions is:
• Second order expansion of local volatility $\sigma_L(f_t, a_t, f, T)$:

\[
\sigma_L(f_t, a_t, t, f, T) = \sigma_L^{(0)}(f_t, a_t, t, f) + \sigma_L^{(1)}(f_t, a_t, t, f)\,\tau + \sigma_L^{(2)}(f_t, a_t, t, f)\,\tau^2 + o(\tau^2), \quad \tau \to 0, \tag{1.1}
\]

where we have set τ = T − t and where, in the time inhomogeneous case, we note
that the coefficients $\sigma_L^{(i)}$, i = 0, 1, 2, will depend explicitly on the spot time t as
well.
• Order $\tau^{5/2}$ expansion of the call prices:

\[
\left[ C(s, t, K, T) - (s - K)^+ \right] e^{d(K,s,t)^2/2\tau} = C^{(1)}(s, K, t)\,\tau^{3/2} + C^{(2)}(s, K, t)\,\tau^{5/2} + o(\tau^{5/2}). \tag{1.2}
\]

• Second order expansion of implied volatility $\sigma_{BS}(f_t, a_t, f, T)$:

\[
\sigma_{BS}(f_t, a_t, t, f, T) = \sigma_{BS}^{(0)}(f_t, a_t, t, f) + \sigma_{BS}^{(1)}(f_t, a_t, t, f)\,\tau + \sigma_{BS}^{(2)}(f_t, a_t, t, f)\,\tau^2 + o(\tau^2). \tag{1.3}
\]

The key to the proof will be determining the coefficients in the first expansion (for
local volatility). With the local volatility up to second order in hand, we can apply
the asymptotic expansion obtained for time inhomogeneous local volatility models
in [28] to derive the implied volatility. One difference between this paper and earlier
ones is that we do not make simplifications for the purpose of making formulas shorter
or simpler. We take the point of view that the derivation of the full length formulas
should be made clear and these should be presented at a first stage. At a second stage,
one can explore ways to simplify or shorten the formulas. Since the formulas are all in
closed form, up to quadrature, this requires at most approximating a one dimensional
Second Order Expansion for Implied Volatility … 93

integral by Simpson’s rule and thus, even if the formulas are long their calculation
is instantaneous. An illustration is given for the λ-Sabr model. We devote a section
to the asymptotics obtained in [28] for the one factor (local volatility) models, since
these can be readily applied in the present setting once the local volatility function
has been determined.
As this work was nearing completion we learned of nice independent work by
Louis Paulot [48] who also derives a second order asymptotic expansion for the
implied volatility in two factor stochastic volatility models in the time homogeneous
case. Paulot does not consider the time inhomogeneous case considered in this paper.
Paulot takes a somewhat different (and very reasonable) approach to ours, in which
he bypasses the computation of the local volatility function. For the local variance
function (square of the local volatility), which he does not try to determine explicitly,
his calculations stop at the first order (4.23) and do not seek to determine the next
order (4.24). The determination of the local volatility in stochastic volatility models
is of independent interest and is provided by our method. Once the local volatility
has been determined the determination of the call prices is a straightforward “plug
and play”, using the Proposition (5.2). Alternatively using the local volatility one can
use Dupire’s forward equation to obtain the call prices for all strikes and maturities.
The determination of the implied volatility is more involved, since the relationship
between local volatility and implied volatility derived in [28] and recalled in (5.3) is
correspondingly more complicated.
Another important issue concerns rigor. Two questions are often asked concerning
asymptotic expansions used in mathematical finance and also in other areas of the
applied sciences. The first is, are the results rigorous? The second, related question
concerns the role of the boundary conditions and the influence of the boundary
conditions on the asymptotics. In the present paper, in the style of classical asymptotic
analysis, our main concern is not full rigor, but rather to provide the details of full
expansions. As opposed to previous works, we provide full detail of the intermediate
steps.
However, in order to point out the rigorous underpinnings of this work, the influence
of degenerating coefficients, and of boundary conditions, we devote a section to
the discussion of how (certain aspects of) our results here can be made rigorous,
based on work of Azencott and co-workers [6], the relevance of whose work seems
not to have been noticed in mathematical finance heretofore. In particular, we point
out that the Sabr and λ-Sabr models are associated to an incomplete Riemannian
manifold when β < 1.

2 The Family of Models Considered and Application of the Heat Kernel Expansion

2.1 An Efficient Approach

In this section, we introduce the family of models to which our results will apply.
It is understood that we are working under a risk neutral forward measure and are
assuming zero dividends for the asset process

\[ df_t = C(f_t, t)\,a_t\,dW_{1,t}, \tag{2.1} \]
\[ da_t = \alpha(a_t, t)\,dt + v(a_t, t)\,dW_{2,t}, \tag{2.2} \]
\[ \langle dW_1, dW_2 \rangle = \rho(t)\,dt \tag{2.3} \]

with initial conditions at time zero given by (s0 , a0 ). Here W1 and W2 are Brownian
motions with deterministic, possibly time dependent correlation ρ(t).
The associated Kolmogorov backward and forward equations for this family of
models are given by:

\[ p_t + \tfrac{1}{2} C^2 a^2 p_{ff} + \rho C a v\, p_{fa} + \tfrac{1}{2} v^2 p_{aa} + \alpha p_a = 0, \tag{2.4} \]
\[ p_t - \tfrac{1}{2} (C^2 a^2 p)_{ff} - (\rho C a v\, p)_{fa} - \tfrac{1}{2} (v^2 p)_{aa} + (\alpha p)_a = 0. \tag{2.5} \]

The matrix $a$ with entries $a_{ij}$, i, j = 1, 2,

\[
a = \begin{pmatrix} C^2 a^2 & \rho C a v \\ \rho C a v & v^2 \end{pmatrix} \tag{2.6}
\]

is the diffusion matrix.


The price at time t of a European call option $C(f_t, a_t; K, T)$ struck at K with
maturity T, under the forward measure, is given by the expectation

\[ C(f_t, a_t; K, T) = E_{(f_t, a_t)}\left[ (f_T - K)^+ \right], \]

which can be expressed via the transition density $p(f_t, a_t, t; f, a, T)$ in the form

\[ C(f_t, a_t, t; K, T) = \iint (f - K)^+\, p(f_t, a_t, t; f, a, T)\,df\,da. \]

Here and throughout this paper we will assume the expectations are taken with respect
to a (given) risk neutral measure.
Heat Kernel: Time homogeneous case
In the time homogeneous case, we may without loss of generality assume the initial
time is 0. It is well known in differential geometry and stochastic analysis that, under
certain technical conditions (see the discussion in Sect. 3), the transition density has
in the time homogeneous case the following expansion as T → 0+

\[
p(f_0, a_0; f, a, T) = \frac{e^{-\frac{d^2(f_0, a_0; f, a)}{2T}}}{2\pi T}\,\sqrt{g(f, a)}\;U(f_0, a_0; f, a, T) \tag{2.7}
\]

where, letting x0 = ( f 0 , a0 ) and x = ( f, a)


• $\sqrt{g}$, with g the determinant of the metric, is the volume form associated with the
Riemannian metric determined by the inverse of the diffusion matrix (2.6). The inner
product, denoted < ., . >, of two vectors A and B is given by

\[ \langle A, B \rangle = g_{ij}\,a^i b^j, \]

where the Einstein summation convention is used to sum over repeated indices.


• d(x0 , x) is the geodesic (Riemannian) distance between x0 and x, in the above
mentioned metric gi j corresponding to the inverse of the diffusion matrix.
• U is the series expansion:

\[ U(x_0; x, T) = \sum_{k=0}^{n} u_k(x_0; x)\,T^k + O(T^{n+1}). \]

The $u_k$’s are called the heat kernel coefficients. In particular,
$u_0(x_0; x) = \sqrt{\Delta(x_0, x)}\; e^{\int_\gamma \langle V, \dot\gamma \rangle}$, where Δ and V are defined below.
• Δ is the Van Vleck-DeWitt determinant:

\[
\Delta(x_0, x) = \frac{1}{\sqrt{g(x_0)\,g(x)}}\,\det\left( -\frac{1}{2}\,\frac{\partial^2 d^2}{\partial x_0\,\partial x} \right),
\]

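• $P = e^{\int_\gamma \langle V, \dot\gamma \rangle}$ is the exponential of the work done by the vector field V along the
geodesic γ, with $V = V^i \partial_i$ and

\[ V^i = b^i - \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^j}\left( \sqrt{g}\,g^{ij} \right). \tag{2.8} \]

Here the bracket “< ·, · >” with vector entries V and W is defined via the metric
$g_{ij}$ as $g_{ij} v^i w^j$.
• By adding and subtracting first order terms, (2.4) can then be re-expressed in the
form

\[ u_t + \frac{1}{2}\,\Delta_B u + V \cdot \nabla u = 0, \tag{2.9} \]

where $\Delta_B$ is the second order Laplace-Beltrami operator
$\frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^i}\left( \sqrt{g}\,g^{ij}\,\frac{\partial}{\partial x^j} \right)$.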

Note that the zeroth order heat kernel coefficient is known in closed form provided
we have available in closed form both the distance function and the geodesics. The
higher order on diagonal heat kernel coefficients u i (x, x) have been calculated up to
order 4 in very general settings. On the other hand the efficient calculation of the off

diagonal heat kernel coefficients $u_i$, for i ≥ 1, is still an active field of research. We
refer to Hsu [34] for an in-depth introduction to the heat kernel expansion from the
perspective of stochastic analysis.
Given the heat kernel expansion in (2.7) for the transition density p, the call price
C as T → 0+ has the expansion


\[
C(s_0, a_0; K, T) \sim \frac{1}{2\pi T} \sum_{k=0}^{n} \iint_{\{s \ge K\}} e^{-\frac{d^2(x_0, x)}{2T}}\, G_k(s, a)\,T^k\,ds\,da,
\]

where for notational simplicity we have denoted by $G_k$ the expression

\[ G_k(s, a) = (s - K)\,u_k(s, a)\,\sqrt{g(s, a)}. \tag{2.10} \]

Heat kernel expansion: Time inhomogeneous case


For models of the form (2.1)–(2.3) with time dependent coefficients, the heat kernel
expansion is modified as follows: again, let $x_0 = (f_0, a_0)$ and $x = (f, a)$ and let τ = T − t. Then

\[
p(x_0, t; x, T) = \frac{e^{-\frac{d^2(x_0; x, t)}{2\tau}}}{2\pi\tau}\;u(x_0, t; x, T)\,\sqrt{g(x, T)}, \tag{2.11}
\]

where now the distance function d, the Van Vleck-DeWitt determinant and the
contravariant drift (2.8) depend explicitly on time because of the explicit dependence
of the metric and the drift on time. The series expansion u in this case reads

\[ u(x_0, t; x, T) = \sum_{k=0}^{n} u_k(x_0, x, t)\,\tau^k + O(\tau^{n+1}). \]

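Moreover, the explicit form of the heat kernel coefficients is given by the following
formula

\[
u_0(x_0, x, t) = \sqrt{\Delta(x_0, x)}\; e^{\int_\gamma \langle V, \dot\gamma \rangle}\, \exp\left( -\int_0^{d(f_0, a_0, f, a, t)} \frac{\partial d(\tilde f(\rho), \tilde a(\rho), f, a)}{\partial t}\,d\rho \right)
\]

and recursively

\[
u_k(x_0, x, t) = \frac{u_0(x_0, x, t)}{d^k(x_0, x, t)} \int_0^{d(f_0, a_0, f, a, t)} \frac{\rho^{k-1}}{U_0} \left[ L U_{k-1} + \frac{\partial U_{k-1}}{\partial t} \right] d\rho, \tag{2.12}
\]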

where $\tilde f(0) = f$, $\tilde a(0) = a$. Since the definition of $u_i$ always involves $u_0$, it is
convenient to factor the latter out and to define

\[ \hat u_i = \frac{u_i}{u_0}. \tag{2.13} \]

Comparison with Henry-Labordère approach


In [39] Henry-Labordère takes a different approach to implementing the heat kernel
expansion, equivalent to ours in the time-homogeneous case, which involves a
complex line bundle. Thus Eq. (2.9) is recast in the form

\[ u_t + g^{-1/2} (\partial_i + A_i)\, g^{1/2} g^{ij} (\partial_j + A_j)\,u + Q u = 0, \tag{2.14} \]

where

\[ A_i = g_{ij} A^j, \]

where $A^j$ is given in (2.8), and where

\[ Q = g^{ij} \left( A_i A_j - b_j A_i - \partial_j A_i \right). \]

While the form (2.14) is equivalent to our form (2.9), it makes it necessary to
introduce the potential term Q. The motivations for introducing such a formulation
are undoubtedly related to the arduous task encountered in quantum field theory
of determining the higher order heat kernel coefficients. This covariant approach,
introduced among others by Avramidi in [4] and discussed in his lectures on
mathematical finance [5], introduces heavy machinery (complex line bundles, gauge groups,
etc.), which for instance in the case of time homogeneous parabolic operators leads
to the determination of the heat kernel coefficients $u_n$ via the solution of the transport
equations:

\[
\left( 1 + \frac{1}{n}\,\sigma^i \right) u_n = P^{-1} \Delta^{-1/2} (D + Q)\, \Delta^{1/2} P\, u_{n-1},
\]

where

\[ \sigma_i = \nabla_i \sigma, \qquad \nabla_i = \partial_i + A_i, \qquad \sigma^i = g^{ij} \sigma_j. \]

This covariant approach undoubtedly has computational and conceptual advantages
when calculating the higher order dimensional heat kernel coefficients. However,
the method explained in the following subsection, due to Yoshida, is, we think, much
simpler and more effective in linking in an intuitive way the coefficients of the original
parabolic operator to the $u_i$. Moreover, it is extremely easy in Yoshida’s approach to
incorporate the influence of time inhomogeneity on the heat kernel coefficients.

3 The Starting Point for Rigorous Justification of Heat Kernel Method: Varadhan’s Lemma

This section is meant to give the reader an intuitive grasp of some of the key concepts.
We do not aspire to completeness but try to indicate original sources where various
issues are dealt with in depth. This section can be skipped by those interested only
in the practical results.
Varadhan’s lemma, in the form he first derived it (see [52, 53]), applies to the
case where the underlying operator is uniformly parabolic, with sufficient regularity
on the coefficients. In its original form as obtained in [52], it relates to the unique
fundamental solution in all of $R^n$ associated to a diffusion

\[ p_t - L p = 0 \]

and finds the small time behavior of such a diffusion to be

\[ \lim_{t \to 0}\, -2t \log p_t(x, y) = d^2(x, y), \tag{3.1} \]

where d is the Riemannian distance introduced in Sect. 2, associated to the principal
part of the diffusion operator. Varadhan proved (see Theorem 4.1.3) that (3.1) holds uniformly
for x and y in compact sets, for which d(x, y) remains bounded. When expressed in
the above form, one has in mind that the domain considered does not have a boundary.
Varadhan’s lemma can be viewed as a weak form of the zero-th order heat kernel
expansion described in Sect. 2. In fact, a general principle is that in many cases, once
it is known that Varadhan’s lemma holds, the full heat kernel expansion (suitably
modified by a Levy parametrix, see Sect. 3.4 in [28]) readily follows.
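As a minimal illustration of (3.1), the following sketch checks the limit for the simplest possible case, a one dimensional diffusion dX = σ dW with constant σ, whose transition density is Gaussian and whose Riemannian distance is |x − y|/σ. The choice of model and the function names are assumptions made for this example only; the log of the density is evaluated directly to avoid floating point underflow at small t.

```python
import math

def log_kernel(t, x, y, sigma):
    """Log transition density of dX = sigma dW (constant sigma, illustrative choice)."""
    var = sigma * sigma * t
    return -((x - y) ** 2) / (2.0 * var) - 0.5 * math.log(2.0 * math.pi * var)

def varadhan_lhs(t, x, y, sigma):
    """The quantity -2 t log p_t(x, y) appearing in Varadhan's lemma."""
    return -2.0 * t * log_kernel(t, x, y, sigma)

def riemannian_distance(x, y, sigma):
    """Distance in the metric 1/sigma^2, the inverse of the diffusion coefficient."""
    return abs(x - y) / sigma

if __name__ == "__main__":
    x, y, sigma = 0.0, 1.0, 2.0
    for t in (1e-2, 1e-4, 1e-6):
        print(t, varadhan_lhs(t, x, y, sigma), riemannian_distance(x, y, sigma) ** 2)
```

In this toy case the discrepancy is exactly t log(2πσ²t), which tends to zero with t, in line with the statement of the lemma.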
On a domain with a boundary, as is the case encountered in this paper, where the
domain is $R^2_+$ or the upper half plane,¹ the situation is more delicate. It is natural that
this be the case since, to begin with, on a domain with boundary one needs to
deal with the possibility that the diffusion can reach the boundary. In cases where the
diffusion can reach the boundary, in order to uniquely define the diffusion thereafter
we can impose an absorbing boundary condition. This amounts to considering the
transition density for the diffusion that is killed the first time it reaches the boundary
of the domain.

p D (x, y, t)
= Prob. of reaching y at time t
and not reaching the boundary before time t, starting at x

1 Depending on whether forward or log of forward is taken as state variable.



This leads to a general construction of the so-called minimal heat kernel which
applies to non-compact Riemannian manifolds like R2+ . In such a construction, which
is discussed in detail in Hsu [34], the manifold is exhausted by a sequence Dn of
nested compact domains for which the Dirichlet heat kernel pn (diffusion killed on
exiting Dn ), is constructed and the minimal heat kernel corresponds to the pointwise
limit of these as n → ∞. It is intuitively clear that at points on the boundary of the
manifold where the diffusion can reach the boundary, this construction leads to a
fundamental solution that satisfies a Dirichlet condition in the backward variables.
As was shown by Azencott [6], a simple sufficient condition for Varadhan’s lemma
to hold that does extend to the case where the coefficients degenerate at the boundary,
is the case of complete Riemannian manifolds. To put this geometric concept into
perspective, consider the case where the underlying manifold consists of the first
quadrant, corresponding to non-negative forward price and volatility. In this case,
the manifold, together with the natural Riemannian metric associated to the inverse of
the diffusion matrix, is complete if the boundary (and the other points “at infinity”,
if the domain is unbounded) are at an infinite distance from any point in the interior
of the manifold. It can be shown that a Riemannian manifold is complete if it is
metrically complete, i.e., all Cauchy sequences converge to a point in M (and not on
the boundary of M).
The above, with a modern proof, appears as Theorem 5.2.1 in Hsu’s book [34].
Theorem 3.1 Let M be a complete Riemannian manifold and p M (t, x, y) the min-
imal heat kernel on M. Then, uniformly on compact subsets of M × M, we have
(3.1).
When the underlying Riemannian manifold fails to be complete, Hsu [36] refined
the work by Azencott by showing that a sufficient condition under which Varadhan’s
lemma is still guaranteed to hold is

d(x, y) ≤ d(x, ∞) + d(y, ∞), (3.2)

which in our setting amounts to requiring that the distance of x to y does not exceed
the sum of the distance of x to the boundary and the distance of y to the boundary.
Below we illustrate this with a couple of examples:
Consider the well known distance function associated to the family of λ-Sabr
models (7.8) in the ( f, a) plane.
• When β = 1, the boundary a = 0 is at an infinite distance from any point in
the interior (f, a). This is because the quantity q involved in the definition of the
distance equals

\[ \left| \lim_{\epsilon \to 0} \int_{\epsilon}^{f} \frac{1}{u}\,du \right| = \infty. \]

Similarly it is easily seen that the f = 0 axis is at an infinite distance from any
point in the interior. Thus in the case β = 1 the Riemannian manifold is complete
and Varadhan’s lemma holds for all points in the manifold.
• When β < 1, since $1/f^{\beta}$ is integrable at zero, the points on the f = 0 axis lie at
a finite distance and those on the a = 0 axis at an infinite distance. Thus here
we need to apply the sufficient condition (3.2) to guarantee the applicability of
Varadhan’s lemma.
• For the family of SV models

\[ df_t = a_t\,dW_{1t} + \mu_f\,dt, \]
\[ da_t = \gamma\,a_t^{q}\,dW_{2t} + \mu_a\,dt, \]
\[ \langle dW_{1t}, dW_{2t} \rangle = \rho\,dt, \]

where ρ and γ are constants, the (Gaussian) curvature of the Riemannian metric
naturally associated to the problem is independent of the correlation and of the
drift and is given by

\[ (q - 2)\,y^{2(q-1)}. \tag{3.3} \]

Also, it is easily seen that when q < 1 the a = 0 axis lies at a finite distance from
interior points (it suffices to note that vertical lines are geodesics) and also in this
case the curvature of the metric blows up at a = 0.
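The curvature formula (3.3) can be checked numerically in a simple special case. The sketch below assumes zero correlation and γ = 1, so that the metric is df²/a² + da²/a^{2q}, identifies the variable y of (3.3) with a, and uses the classical expression for the Gaussian curvature of an orthogonal metric E(a) df² + G(a) da² whose coefficients depend on a only. These choices, the finite difference step and the function names are assumptions of the example, not part of the text.

```python
import math

def curvature_numeric(a, q, h=1e-4):
    """Gaussian curvature of E(a) df^2 + G(a) da^2 by central finite differences.

    Uses K = -(1 / (2 sqrt(E G))) * d/da( E'(a) / sqrt(E G) ), valid when the
    metric is orthogonal and depends on a only (assumed: rho = 0, gamma = 1).
    """
    E = lambda x: x ** -2.0           # coefficient of df^2
    G = lambda x: x ** (-2.0 * q)     # coefficient of da^2

    def dE(x):
        return (E(x + h) - E(x - h)) / (2.0 * h)

    def inner(x):
        return dE(x) / math.sqrt(E(x) * G(x))

    d_inner = (inner(a + h) - inner(a - h)) / (2.0 * h)
    return -d_inner / (2.0 * math.sqrt(E(a) * G(a)))

def curvature_closed_form(a, q):
    """Formula (3.3) with y identified with a."""
    return (q - 2.0) * a ** (2.0 * (q - 1.0))

if __name__ == "__main__":
    for q in (0.5, 1.0, 1.5):
        print(q, curvature_numeric(0.7, q), curvature_closed_form(0.7, q))
```

For q = 1 the metric is that of the hyperbolic half plane and the curvature is −1; for q < 1 the closed form blows up as a → 0, consistent with the remark above.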
A final note concerns the relationship between the remarks above and the inter-
esting recent work by Doust [19] who in determining call prices in the Sabr model
for 0 < β < 1, takes into account the probability that the forward hits zero. Doust
correctly points out, just as Lewis had done, in the case of the CEV model (see [42]
p. 305), that the call price needs to be adjusted to allow for this possibility. There is no
conflict between this result and the modified versions of Varadhan’s lemma. These
simply give sufficient conditions on the distance of points x, y from the boundary
(here “infinity”), so that the effect of paths of the forward which reach the boundary
is exponentially negligible in the small time limit. As Doust points out and as his
numerics show, for longer times these need to be taken into account.
We would also like to remark that when the local volatility component C(f, t)
of the local-stochastic volatility models is such that C(0, t) = 0 (for instance
$C(f, t) = \gamma(t) f^{\beta}$ with 0 < β < 1), the series constructed by means of the heat kernel
vanishes on the boundary in the backward variables. Remarkably, this comes “for
free” (see (8.1) in Sect. 8) and was not part of the heat kernel’s explicit construction.

4 The Projection Method via Gyöngy Projection

A driftless diffusion (say in the forward measure)

\[ df_t = \sigma_L(f_t, t)\,dW_t, \]

gives rise to the so-called local volatility models, made famous by the work of Bruno
Dupire. Given a two factor local-stochastic volatility model, of the form

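\[
\begin{aligned}
df_t &= b(t) f_t\,dt + C(f_t, t)\,a_t\,dW_{1,t}, \\
da_t &= \mu(a_t, t)\,dt + v(a_t, t)\,dW_{2,t}, \\
f_0 &= f, \quad a_0 = a, \\
\langle dW_{1,t}, dW_{2,t} \rangle &= \rho(t)\,dt
\end{aligned} \tag{4.1}
\]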

the Gyöngy projection technique yields a one factor model

\[ df_t = b(t) f_t\,dt + \sigma_L(f_t, t)\,dW_t \tag{4.2} \]

with effective local volatility $\sigma_L$ and drift $\bar b$ defined by

\[ \sigma_L^2(F, T) = C^2(F, T)\,E\left[ a_T^2 \mid f_T = F,\ f_0 = f,\ a_0 = a \right]. \tag{4.3} \]

Note that the effective drift of $f_t$ is unchanged when the original drift is of the
special form $b(t) f_t$. In fact, applying Gyöngy’s result in our case, the effective
drift is given by

\[ \bar b(F, T) = E\left[ b(T) f_T \mid f_T = F,\ f_0 = f,\ a_0 = a \right] \tag{4.4} \]

and the latter clearly equals b(T)F.


The local volatility model (4.2) has the same marginals with respect to the f variable
as the original two factor model (4.1). Independently of Gyöngy, this projection
(sometimes also called² marginalization) technique was discovered by Bruno Dupire,
who was the first to apply it in mathematical finance in [21].
As pointed out and exploited in Hagan, Kumar, Lesniewski and Woodward, in
Hagan, Lesniewski and Woodward [32] and in Henry-Labordère [39], formula (4.3)
can be expressed using the joint probability density p(f, a, t, F, A, T) as

\[
\sigma_L^2(F, T) = \frac{C^2(F, T) \int_{R_+} A^2\, p(f, a, F, A, T)\,dA}{\int_{R_+} p(f, a, F, A, T)\,dA}. \tag{4.5}
\]

2 This expression was coined by Marco Avellaneda, who in 1999, without prior knowledge of
Gyöngy’s or Dupire’s work, independently discovered the technique.

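Therefore plugging in the expansion (2.11) into (4.5) we obtain

\[
\sigma_L^2(F, T) = \frac{C^2(F, T) \int_{R_+} A^2\, e^{-\frac{d^2}{2\tau}} \sum_{i=0}^{n} \alpha_i(A, T)\,\tau^i\,dA}{\int_{R_+} e^{-\frac{d^2}{2\tau}} \sum_{i=0}^{n} \alpha_i(A, T)\,\tau^i\,dA}, \tag{4.6}
\]

where

\[ \alpha_0 =: \sqrt{g(F, A, T)}\;u_0 \tag{4.7} \]
\[ =: \hat C, \tag{4.8} \]
\[ \alpha_i = \sqrt{g(F, A, T)}\;u_0\,\hat u_i(f, a, F, A, t), \quad i \ge 1, \tag{4.9} \]

and where we have canceled the common factor $\frac{1}{(2\pi\tau)^{n/2}}$. Letting ε = τ and $\phi = \frac{d^2}{2}$,
the above may be expressed in the form

\[
\sigma_L^2(F, T \mid f, a) = \frac{C^2(F, T) \int_{R_+} A^2 \sum_{i=0}^{n} \alpha_i\,\epsilon^i\, e^{-\phi/\epsilon}\,dA}{\int_{R_+} \sum_{i=0}^{n} \alpha_i\,\epsilon^i\, e^{-\phi/\epsilon}\,dA}. \tag{4.10}
\]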

Now we apply the Laplace expansion to obtain an expansion of the local volatility.
It is convenient to consider separately the time homogeneous and the more involved
time inhomogeneous cases:

4.1 Laplace Expansion: Time-Homogeneous Case

Proposition 4.1 Suppose φ(A) has a unique minimum in (0, +∞), at the point $A_0$.
The asymptotic expansion of

\[ \int_0^{+\infty} f(A)\,e^{-\phi(A)/\epsilon}\,dA, \tag{4.11} \]

as ε → 0 is given by:

\[
\sqrt{\frac{2\pi\epsilon}{\phi''(A_0)}}\; e^{-\phi(A_0)/\epsilon} \left[ f^{(0)}(A_0) + f^{(1)}(A_0)\,\epsilon + f^{(2)}(A_0)\,\epsilon^2 \right]
\]

where, with all derivatives of f and φ evaluated at $A_0$,

\[ f^{(0)}(A_0) = f, \]

\[
f^{(1)}(A_0) = \frac{f''}{2\phi''} - \frac{\phi^{(4)} f}{8 (\phi'')^2} - \frac{f' \phi^{(3)}}{2 (\phi'')^2} + \frac{5 (\phi^{(3)})^2 f}{24 (\phi'')^3}, \tag{4.12}
\]

\[
\begin{aligned}
f^{(2)}(A_0) = \frac{1}{1152\,(\phi'')^6} \Big[ & -480\,(\phi'')^3 f^{(3)} \phi^{(3)} + 840\,(\phi'')^2 f'' (\phi^{(3)})^2 \\
& - 840\,\phi'' f' (\phi^{(3)})^3 + 385\, f\, (\phi^{(3)})^4 + 144\,(\phi'')^4 f^{(4)} \\
& - 360\,(\phi'')^3 f'' \phi^{(4)} + 840\,(\phi'')^2 f' \phi^{(3)} \phi^{(4)} \\
& - 630\, f\, \phi'' (\phi^{(3)})^2 \phi^{(4)} + 105\, f\, (\phi'')^2 (\phi^{(4)})^2 \\
& - 144\,(\phi'')^3 f' \phi^{(5)} + 168\, f\, (\phi'')^2 \phi^{(3)} \phi^{(5)} \\
& - 24\, f\, (\phi'')^3 \phi^{(6)} \Big],
\end{aligned} \tag{4.13–4.15}
\]

where $A_0$ is the point that minimizes φ.³
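The coefficients of Proposition 4.1 can be sanity checked on an integral that is known in closed form. The sketch below assumes the test case f ≡ 1 and φ(A) = A − ln A − 1 (minimum at A₀ = 1 with φ(A₀) = 0), for which the integral reduces to a Gamma function and the expansion must reproduce the Stirling series 1 + ε/12 + ε²/288. The test case, the function names and the argument layout are choices made for this example.

```python
import math

def laplace_coefficients(f, phi):
    """Coefficients f^(0), f^(1), f^(2) of the Laplace expansion.

    f   = (f, f', f'', f''', f'''') evaluated at the minimizer A0
    phi = (phi'', phi''', phi^(4), phi^(5), phi^(6)) evaluated at A0
    """
    f0, f1, f2, f3, f4 = f
    p2, p3, p4, p5, p6 = phi
    c0 = f0
    c1 = (f2 / (2 * p2) - p4 * f0 / (8 * p2**2)
          - f1 * p3 / (2 * p2**2) + 5 * p3**2 * f0 / (24 * p2**3))
    c2 = (-480 * p2**3 * f3 * p3 + 840 * p2**2 * f2 * p3**2
          - 840 * p2 * f1 * p3**3 + 385 * f0 * p3**4
          + 144 * p2**4 * f4 - 360 * p2**3 * f2 * p4
          + 840 * p2**2 * f1 * p3 * p4 - 630 * f0 * p2 * p3**2 * p4
          + 105 * f0 * p2**2 * p4**2 - 144 * p2**3 * f1 * p5
          + 168 * f0 * p2**2 * p3 * p5 - 24 * f0 * p2**3 * p6) / (1152 * p2**6)
    return c0, c1, c2

def gamma_test_ratio(eps):
    """Exact integral of exp(-(A - ln A - 1)/eps) over (0, inf), divided by sqrt(2 pi eps)."""
    n = 1.0 / eps
    log_integral = math.lgamma(n + 1.0) + (n + 1.0) * math.log(eps) + n
    return math.exp(log_integral - 0.5 * math.log(2.0 * math.pi * eps))

if __name__ == "__main__":
    # Derivatives of phi(A) = A - ln A - 1 at A0 = 1: phi'' = 1, phi''' = -2, ...
    c0, c1, c2 = laplace_coefficients((1.0, 0.0, 0.0, 0.0, 0.0),
                                      (1.0, -2.0, 6.0, -24.0, 120.0))
    eps = 0.05
    print(c0 + c1 * eps + c2 * eps**2, gamma_test_ratio(eps))
```

The agreement to order ε² in this test case is only a consistency check of the coefficients as written above, not a proof of the general expansion.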

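In using the Laplace expansion in conjunction with the heat kernel form (4.10),
we must deal with a more general (than (4.11)) expression of the form

\[ \int_0^{+\infty} \left[ f_0(A) + \epsilon f_1(A) + \epsilon^2 f_2(A) \right] e^{-\phi(A)/\epsilon}\,dA. \tag{4.16} \]

Let us introduce the notation $f_i^{(j)}$, i, j = 0, 1, 2, to indicate the jth term in the Laplace
expansion above applied to the function $f_i$. We apply the proposition to numerator and
denominator of the expression defining the local volatility. This yields an expression
that has the following form

\[
\frac{f_0^{(0)} + \epsilon f_0^{(1)} + \epsilon^2 f_0^{(2)} + \epsilon \left( f_1^{(0)} + \epsilon f_1^{(1)} + \epsilon^2 f_1^{(2)} \right) + \epsilon^2 f_2^{(0)} + o(\epsilon^2)}{\zeta_0^{(0)} + \epsilon \zeta_0^{(1)} + \epsilon^2 \zeta_0^{(2)} + \epsilon \left( \zeta_1^{(0)} + \epsilon \zeta_1^{(1)} + \epsilon^2 \zeta_1^{(2)} \right) + \epsilon^2 \zeta_2^{(0)} + o(\epsilon^2)}
= \frac{f_0^{(0)} + \epsilon \left( f_0^{(1)} + f_1^{(0)} \right) + \epsilon^2 \left( f_0^{(2)} + f_1^{(1)} + f_2^{(0)} \right) + o(\epsilon^2)}{\zeta_0^{(0)} + \epsilon \left( \zeta_0^{(1)} + \zeta_1^{(0)} \right) + \epsilon^2 \left( \zeta_0^{(2)} + \zeta_1^{(1)} + \zeta_2^{(0)} \right) + o(\epsilon^2)}.
\]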

This ratio can be expanded in powers of ε, and yields the following asymptotic
expansion, which can be applied to obtain the effective local volatility in a wide
range of models:

\[
\begin{aligned}
& \frac{f_0^{(0)}}{\zeta_0^{(0)}}
+ \frac{1}{(\zeta_0^{(0)})^2} \left[ -f_0^{(0)} \left( \zeta_0^{(1)} + \zeta_1^{(0)} \right) + \zeta_0^{(0)} \left( f_0^{(1)} + f_1^{(0)} \right) \right] \epsilon \\
& + \frac{1}{(\zeta_0^{(0)})^3} \Big[ f_0^{(0)} \left( \zeta_0^{(1)} + \zeta_1^{(0)} \right)^2 - f_0^{(0)} \zeta_0^{(0)} \left( \zeta_0^{(2)} + \zeta_1^{(1)} + \zeta_2^{(0)} \right) \\
& \qquad + \zeta_0^{(0)} \Big( -\left( \zeta_0^{(1)} + \zeta_1^{(0)} \right) \left( f_0^{(1)} + f_1^{(0)} \right) + \zeta_0^{(0)} \left( f_0^{(2)} + f_1^{(1)} + f_2^{(0)} \right) \Big) \Big] \epsilon^2 + o(\epsilon^2), \quad \epsilon \to 0.
\end{aligned}
\]

3 The expansion up to the first order (4.12) appears in several sources, including in textbook form,
as in Bender and Orszag [9], p. 273. On the other hand, we have not been able to locate a source for
the second order expansion given here as (4.15).
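The expansion of such a ratio is just the quotient of two truncated power series. As a small sketch, write the numerator as n₀ + n₁ε + n₂ε² and the denominator as d₀ + d₁ε + d₂ε², where n₁, n₂, d₁, d₂ stand for the grouped sums of Laplace terms; these shorthand names, and the function below, are introduced only for this example.

```python
def quotient_series(num, den):
    """First three coefficients of (n0 + n1 e + n2 e^2) / (d0 + d1 e + d2 e^2) in powers of e."""
    n0, n1, n2 = num
    d0, d1, d2 = den
    q0 = n0 / d0
    q1 = (n1 * d0 - n0 * d1) / d0**2
    q2 = (n0 * d1**2 - n0 * d0 * d2 + d0 * (-d1 * n1 + d0 * n2)) / d0**3
    return q0, q1, q2

if __name__ == "__main__":
    num, den = (2.0, 3.0, 5.0), (4.0, 1.0, 7.0)
    q0, q1, q2 = quotient_series(num, den)
    eps = 1e-3
    exact = (num[0] + num[1] * eps + num[2] * eps**2) / (den[0] + den[1] * eps + den[2] * eps**2)
    print(q0 + q1 * eps + q2 * eps**2, exact)
```

The difference between the truncated series and the exact ratio is of order ε³, which is the behaviour expected of the o(ε²) remainder.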

Let us introduce the following notation in relation to the family of stochastic volatility
models in (4.1), in conjunction with the notation introduced in our discussion of the
heat kernel expansion, following Yoshida. Recall from (4.8) the definition of $\hat C$:

\[ \hat C(a) = \sqrt{\det g(a, f)}\;\sqrt{\Delta(f_0, a_0, f, a)}\;P(f_0, a_0, f, a), \]

where on the left hand side we suppress the dependence of Ĉ on variables other than
a, since these are not relevant in the Laplace asymptotics.
Proposition 4.2 Assume the distance function in the family of local-stochastic volatility
models has a unique minimum, as a function of the final value of the volatility a,
at $c = a_{\min}$. Let $\phi = \frac{d^2}{2}$. Then the effective local volatility in the family of
local-stochastic volatility models (4.1) is given, up to the second order, by

\[
\sigma_L^2(t, T, a_0, f_0, f) =: V_L(f, t) = V_L^{(0)}(f, t) + V_L^{(1)}(f, t)\,\tau + V_L^{(2)}(f, t)\,\tau^2, \tag{4.17}
\]

and correspondingly, by taking square roots and expanding, the local volatility is
given by⁴

\[ \sigma_L = \sigma_L^{(0)} + \sigma_L^{(1)} \epsilon + \sigma_L^{(2)} \epsilon^2 + \cdots, \quad \text{where } \epsilon = \tau, \tag{4.18} \]

where

\[ \sigma_L^{(0)} = \sqrt{V_L^{(0)}} \;\; (= C(f)\,c), \tag{4.19} \]
\[ \sigma_L^{(1)} = \frac{V_L^{(1)}}{2 \sigma_L^{(0)}}, \tag{4.20} \]

4 In (4.17) we once again suppressed on the right hand side the dependence on variables other than
t, T and f. In the next section, when we combine the asymptotics for local volatility with the above
asymptotics for implied volatility, given local volatility, it will be important that the local volatility
depends on the initial time t and on the final time T in the particular way indicated.

\[
\sigma_L^{(2)} = \frac{-(V_L^{(1)})^2 + 4 V_L^{(0)} V_L^{(2)}}{8 (V_L^{(0)})^{3/2}}
= \frac{-4 (\sigma_L^{(1)})^2 (\sigma_L^{(0)})^2 + 4 (\sigma_L^{(0)})^2 V_L^{(2)}}{8 (\sigma_L^{(0)})^3} \tag{4.21}
\]

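where

\[ V_L^{(0)}(f, t) = C^2(f)\,c^2, \tag{4.22} \]

\[
V_L^{(1)}(f, t) = \frac{C^2(f) \left[ \left( \hat C(c) + 2c\,\hat C'(c) \right) \phi''(c) - c\,\hat C(c)\,\phi^{(3)}(c) \right]}{\hat C(c)\,\phi''(c)^2}, \tag{4.23}
\]

\[ V_L^{(2)}(f, t) = \frac{V_L^{(num,2)}(f, t)}{V_L^{(den,2)}(f, t)}, \tag{4.24} \]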

with
1
VL(den,2) ( f, t) =
24Ĉ(c)2 (φ (c))6 φ (c)5

and
(num,2)
VL ( f, t)
  2   
= C( f )2 12c φ (c) −2Ĉ (c) + cĈ(c)u 1 (c)φ (c) φ (c) Ĉ (c)
   
+ 2Ĉ(c)u 1 (c)φ (c) − Ĉ (c)φ (3) (c) + Ĉ(c)φ (c) 2φ (c)6 12cφ (c)2 Ĉ (3) (c)
 
+ 35cĈ (c)φ (3) (c)2 + 6Ĉ (c)φ (c) 3φ (c) − 5cφ (3) (c)
    
− 15Ĉ (c)φ (c) 2φ (3) (c) + cφ (4) (c) + φ
6
−12Ĉ (c)φ (c)
 
φ (c) + c2 u 1 (c)φ (c)2 − cφ (3) (c) + Ĉ(c)u 1 (c)φ (c)
 
−24φ (c)2 − 24c2 u 1 (c)φ (c)3 + 5c2 φ (3) (c)2 − 3cφ (c) −8φ (3) (c)
   
+ cφ (4) (c) + 2Ĉ (c) −11cφ (3) (c)2 + 6cu 1 (c)φ (c)2 4φ (c) + cφ (3) (c)
 
+ 3φ (c) 2φ (3) (c) + cφ (4) (c) + Ĉ(c)2 (u 1 (c)
 6  
φ φ (c)2 24φ (c)2 − 5c2 φ (3) (c)2 3cφ (c) −8φ (3) (c)
     
+ cφ (4) (c) + φ 48cu 1 (c)φ (c)4 + φ (c) − cφ (3) (c)
6
  
−5φ (3) (c)2 + 3φ (c)φ (4) (c) + φ (c)6 −35cφ (3) (c)3 + 35φ (c)φ (3) (c)
   
φ (3) (c) + cφ (4) (c) − 3φ (c)2 5φ (4) (c) + 2cφ (5) (c)
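The passage from the variance expansion (4.17) to the volatility expansion (4.18) via (4.19)–(4.21) is a square root of a truncated power series. The following sketch, with hypothetical numerical values for the three variance coefficients, checks that squaring the resulting series recovers the input up to order ε².

```python
import math

def sqrt_series(V0, V1, V2):
    """Coefficients of sqrt(V0 + V1 e + V2 e^2) up to e^2, following (4.19)-(4.21)."""
    s0 = math.sqrt(V0)
    s1 = V1 / (2.0 * s0)
    s2 = (-V1**2 + 4.0 * V0 * V2) / (8.0 * V0**1.5)
    return s0, s1, s2

if __name__ == "__main__":
    # Hypothetical variance coefficients, chosen only for illustration.
    V0, V1, V2 = 0.04, 0.01, 0.003
    s0, s1, s2 = sqrt_series(V0, V1, V2)
    eps = 1e-2
    exact = math.sqrt(V0 + V1 * eps + V2 * eps**2)
    print(s0 + s1 * eps + s2 * eps**2, exact)
```

Matching coefficients gives 2 s₀ s₁ = V₁ and 2 s₀ s₂ + s₁² = V₂, which is exactly what (4.20) and (4.21) express.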

4.2 Laplace Expansion: Time-Inhomogeneous Case

Since the heat kernel expansion for the probability transition density involves
the backward (in financial terms, spot) time in all places except for the factor
$\sqrt{g(F, A, T)}$ in Eq. (4.9), the Laplace expansion technique is essentially unchanged.
As was done in [28], in Sect. 2, we need to develop $\sqrt{g(F, A, T)}$ in a power series
expansion in time around T = t, i.e.

\[
\begin{aligned}
\sqrt{g(F, A, T)} &= \sqrt{g(F, A, t)} + \frac{\partial \sqrt{g(F, A, T)}}{\partial T}\Big|_{T=t}\,\tau
+ \frac{1}{2}\,\frac{\partial^2 \sqrt{g(F, A, T)}}{\partial T^2}\Big|_{T=t}\,\tau^2 + o(\tau^2) \\
&=: d_0(A, t) + d_1(A, t)\,\epsilon + d_2(A, t)\,\epsilon^2 + o(\epsilon^2), \quad \epsilon = \tau.
\end{aligned} \tag{4.25}
\]

The presence of powers of ε in the expansion above implies that, letting

\[ \beta_i = \frac{\alpha_i}{\sqrt{g(F, A, T)}} =: \sqrt{\Delta}\,P\,\hat u_i, \]

where $\alpha_i$ was defined in (4.9), the expansion in powers of ε will now follow from
the expansion in powers of ε of

\[
\begin{aligned}
& \left( \beta_i^{(0)} + \beta_i^{(1)} \epsilon + \beta_i^{(2)} \epsilon^2 + o(\epsilon^2) \right) \left( d_0 + d_1 \epsilon + d_2 \epsilon^2 + o(\epsilon^2) \right) \\
&= \beta_i^{(0)} d_0 + \left( \beta_i^{(1)} d_0 + \beta_i^{(0)} d_1 \right) \epsilon + \left( \beta_i^{(2)} d_0 + d_2 \beta_i^{(0)} + \beta_i^{(1)} d_1 \right) \epsilon^2 + o(\epsilon^2) \\
&=: \gamma_i^{(0)} + \gamma_i^{(1)} \epsilon + \gamma_i^{(2)} \epsilon^2 + o(\epsilon^2).
\end{aligned}
\]

I.e. it suffices to replace, in the Laplace expansion of numerator and denominator of
the expansion of the local volatility function, $\alpha_i$ by $\gamma_i$ throughout. Notice that the
zero-th order $\gamma_i^{(0)}$ coincides with $\alpha_i$ in the time homogeneous case. Since only $\alpha_0$
enters into the definition of $\sigma_L^{(1)}$, this means that the form of the latter is basically
unchanged.
The new form of $V_L^{(2)}$ is given in the Appendix.

5 Coupling with the Local Volatility and Call Price Expansion

5.1 The Key Quantities in One-Dimensional Case

In [28] we obtained an optimal result for the asymptotics of the implied volatility in
a local volatility model of the form⁵

\[ df_t = \sigma_L(f_t, t)\,dW_t. \tag{5.1} \]

In order to formulate our asymptotic result in the stochastic volatility setting, we will
need to combine those results with the results in the previous section. We begin by
recalling some of the required auxiliary quantities derived in [28].
• One dimensional (signed) distance function

\[ d_1(f, K, t) = \int_K^f \frac{1}{\sigma_L(u, t)}\,du, \quad t \in [0, T]. \]

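• One dimensional heat kernel coefficients $u_0^{(1d)}$ and $u_1^{(1d)}$, given by

\[
u_0^{(1d)}(f, K, t) = \sqrt{\frac{\sigma_L(f, t)}{\sigma_L(K, t)}}\; \exp\left( -\int_K^f \frac{(d_1)_t(K, \eta, t)}{\sigma_L(\eta, t)}\,d\eta \right) \tag{5.2}
\]

and

\[
u_1^{(1d)} = \frac{u_0^{(1d)}(f, K, t)}{d_1(K, f, t)} \int_K^f \left[ \frac{\sigma_L^2}{2} \left( H^2 + H_f \right) + bH + c + \int_K^{\eta} H_t(\zeta, K, t)\,d\zeta \right] \frac{d\eta}{\sigma_L(\eta, t)}, \tag{5.3}
\]

where

\[
H(f, K, t) = \frac{\partial}{\partial f} \left[ \ln u_0(f, K, t) \right] = \frac{(\sigma_L)_f(f, t)}{2 \sigma_L(f, t)} - \frac{(d_1)_t(K, f, t)}{\sigma_L(f, t)}.
\]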

Expansion for call prices


As noted by Henry-Labordère, following on the work by Dupire and Derman and
Kani, it is possible to use a formula, which actually goes back even further, i.e., to
the work of Carr and Jarrow [14] for the call prices C(s, K , t, T ) which reads

5 Reference [28] explains how to adjust the results to allow for a non zero but constant interest (or
other constant yield) rate.
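\[
C(s, K, t, T) = (s - K)^+ + \frac{1}{2} \int_t^T \sigma_L(K, u)^2\, p(s, t, K, u)\,du.
\]

Inserting the heat kernel expansion for p, this gives

\[
C(s, K, t, T) - (s - K)^+ \sim \frac{1}{2\sqrt{2\pi}} \sum_{i=0}^{k} \left[ \int_t^T \sigma_L(K, u)\, e^{-d(K, s, t)^2 / 2(u - t)}\, (u - t)^{i - \frac{1}{2}}\,du \right] u_i(s, K, t).
\]

Letting

\[ U_i(\omega, \tau) = \int_0^{\tau} u^{i - \frac{1}{2}}\, e^{-\omega^2 / 2u}\,du, \tag{5.4} \]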

the expansion may, in the time inhomogeneous case, be expressed in the compact
form

Proposition 5.1 The expansion of the call prices in a driftless local volatility model
is given by:

\[
C(s, K, t, T) - (s - K)^+ \sim \frac{1}{2\sqrt{2\pi}} \sum_{i=0}^{k} \left[ \sigma_L(K, t)\,U_i(d, \tau) + (\sigma_L)_t(K, t)\,U_{i+1}(d, \tau) + \frac{1}{2} (\sigma_L)_{tt}(K, t)\,U_{i+2}(d, \tau) \right] u_i(s, K, t). \tag{5.5}
\]

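Using that

\[
U_0(\omega, \tau) \sim 2 \left( \frac{\tau^{3/2}}{\omega^2} - \frac{3 \tau^{5/2}}{\omega^4} \right) e^{-\omega^2 / 2\tau},
\]
\[
U_1(\omega, \tau) \sim \frac{2 \tau^{5/2}}{\omega^2}\, e^{-\omega^2 / 2\tau},
\]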
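The two asymptotic relations for U₀ and U₁ follow from repeated integration by parts in (5.4), and can be checked against direct quadrature. The sketch below uses a plain composite Simpson rule; the grid size, the test values ω = 1 and τ = 0.005, and all function names are choices made for this illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

def U(i, omega, tau, n=2000):
    """Numerical value of U_i(omega, tau) as defined in (5.4)."""
    def integrand(u):
        if u <= 0.0:
            return 0.0  # the integrand vanishes as u -> 0+
        return u ** (i - 0.5) * math.exp(-omega * omega / (2.0 * u))
    return simpson(integrand, 0.0, tau, n)

def U0_asymptotic(omega, tau):
    return 2.0 * (tau**1.5 / omega**2 - 3.0 * tau**2.5 / omega**4) * math.exp(-omega**2 / (2.0 * tau))

def U1_asymptotic(omega, tau):
    return 2.0 * tau**2.5 / omega**2 * math.exp(-omega**2 / (2.0 * tau))

if __name__ == "__main__":
    omega, tau = 1.0, 0.005
    print(U(0, omega, tau) / U0_asymptotic(omega, tau))
    print(U(1, omega, tau) / U1_asymptotic(omega, tau))
```

Both ratios approach 1 as τ → 0; the one-term formula for U₁ is less accurate than the two-term formula for U₀ at a given τ, as expected from the orders retained.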

an alternate form for the call price expansion up to order $\tau^{5/2}$ is

Proposition 5.2 The expansion of the call prices in a driftless local volatility model
is given, in the small time limit τ → 0, up to the order $\tau^{5/2}$, by

\[
\begin{aligned}
C(s, K, t, T) - (s - K)^+ = \frac{1}{\sqrt{2\pi}}\, e^{-d^2 / 2\tau} \bigg[ & \frac{1}{d^2}\,\sigma_L(K, t)\,u_0(s, K, t)\,\tau^{3/2} \\
& + \Big( -\frac{3}{d^4}\,\sigma_L(K, t)\,u_0(s, K, t) + (\sigma_L)_t\,\frac{1}{d^2}\,u_0(s, K, t) + \frac{1}{d^2}\,\sigma_L(K, t)\,u_1(s, K, t) \Big) \tau^{5/2} \bigg].
\end{aligned}
\]

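Implied volatility
For the implied volatility the following expansion was obtained in the same paper:
Proposition 5.3 The implied volatility $\sigma_{BS}$ admits the following asymptotic expansion,
away from the money:

\[ \sigma_{BS} \sim \sigma_{BS,0} + \sigma_{BS,1}\,\tau + \sigma_{BS,2}\,\tau^2 + o(\tau^2), \quad T \to t, \quad \text{for } f \ne K, \]

where

\[
\sigma_{BS,0} = \frac{\xi}{d_1(K, f, t)} = \frac{\ln \frac{f}{K}}{\int_K^f \frac{d\eta}{\sigma_L(\eta, t)}}, \tag{5.6}
\]

\[
\sigma_{BS,1} = \frac{\sigma_{BS,0}^3}{\xi^2}\, \ln\left( \frac{u_0^{(1d)}(f, K, t)\,\sigma_L(K, t)\,\xi^2}{\sqrt{fK}\;d^2\,\sigma_{BS,0}^3} \right)
= \frac{\xi}{d_1(K, f, t)^3}\, \ln\left( \frac{\sigma_L(K, t)\,u_0^{(1d)}(f, K, t)\,d(K, f, t)}{\xi \sqrt{fK}} \right), \tag{5.7}
\]

\[
\begin{aligned}
\sigma_{BS,2} &= -\frac{3 \sigma_{BS,1} \sigma_{BS,0}^2}{\xi^2} + \frac{3 \sigma_{BS,1}^2}{2 \sigma_{BS,0}}
+ \frac{\sigma_{BS,0}^3}{\xi^2} \left[ \frac{3 \sigma_{BS,0}^2}{\xi^2} + \frac{\sigma_{BS,0}^2}{8} + \frac{(\sigma_L)_t(K, t)}{\sigma_L(K, t)} - \frac{3}{d^2(K, f, t)} + \frac{u_1^{(1d)}(f, K, t)}{u_0^{(1d)}(f, K, t)} \right] \\
&= -\frac{3 \sigma_{BS,1}}{d^2} + \frac{3 \sigma_{BS,1}^2}{2 \sigma_{BS,0}} + \frac{\xi^3}{8 d^5}
+ \frac{\xi}{d^3} \left[ \frac{(\sigma_L)_t(K, t)}{\sigma_L(K, t)} + \frac{u_1^{(1d)}(f, K, t)}{u_0^{(1d)}(f, K, t)} \right].
\end{aligned} \tag{5.8}
\]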
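The first two coefficients of Proposition 5.3 require only a one dimensional quadrature. The sketch below evaluates (5.6) and (5.7) with Simpson's rule under the simplifying assumption of a time homogeneous local volatility, in which case (5.2) reduces to $u_0^{(1d)} = \sqrt{\sigma_L(f)/\sigma_L(K)}$. The CEV-type test function, its parameters and the function names are illustrative choices, not taken from the text.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals; works for a > b (signed)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0

def distance(sigma_L, K, f, n=200):
    """Signed distance: integral from K to f of du / sigma_L(u)."""
    return simpson(lambda u: 1.0 / sigma_L(u), K, f, n)

def sigma_bs0(sigma_L, K, f):
    """Zeroth order implied volatility (5.6)."""
    return math.log(f / K) / distance(sigma_L, K, f)

def sigma_bs1_time_homogeneous(sigma_L, K, f):
    """First order coefficient (5.7), assuming sigma_L has no explicit time dependence."""
    xi = math.log(f / K)
    d = distance(sigma_L, K, f)
    u0 = math.sqrt(sigma_L(f) / sigma_L(K))
    return xi * math.log(sigma_L(K) * u0 * d / (xi * math.sqrt(f * K))) / d**3

if __name__ == "__main__":
    nu, beta = 0.3, 0.5                    # hypothetical CEV-type parameters
    cev = lambda u: nu * u**beta
    f, K = 100.0, 120.0
    print(sigma_bs0(cev, K, f), sigma_bs1_time_homogeneous(cev, K, f))
```

For a lognormal local volatility σ_L(u) = σu the formulas return σ_BS,0 = σ and σ_BS,1 = 0, as they must, which gives a convenient regression check.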

5.2 Plugging the Local Volatility into the Call Price and into the Implied Volatility Expansion

Given the expansion (8.4) for the local volatility in powers of ε = τ, we can use
the 1D expansion above to determine the contributions of the higher order terms to
the implied volatility. Note that the above expansion of the Black-Scholes implied
volatility in powers of τ has an underlying functional dependence on the underlying
local volatility, $\sigma_L^{\epsilon} \sim \sigma_L^{(0)} + \epsilon\,\sigma_L^{(1)} + \epsilon^2 \sigma_L^{(2)}$, obtained in Sect. 4. In order to
emphasize this dependence, below we write $\sigma_{BS,i}[\sigma_L^{\epsilon}]$. In order to combine the local
volatility asymptotics (4.2) with the implied volatility asymptotics, we must plug
the former into the latter and again expand in powers of ε = τ. Now an additional
subtlety arises and taking it into account properly is crucial to the correct derivation
110 G. Ben Arous and P. Laurence

of all subsequent formulas. The subtlety is that the local volatility expansion obtained
in the previous section introduces an additional dependence on the backward variable
$f_0$ which (alongside $a_0$) serves as an initial condition when using the call decomposition
formula (4.10). Thus $f_0$ enters as a parameter and the “active” variable
in $\sigma_L(f_0, a_0, f, t)$ is $f$. Every time a differentiation or integration with respect to a
spatial argument needs to be carried out, we need to use $f$ as the active variable.
This may seem a little surprising because in deriving the 2-D heat kernel we are
freezing the forward variable $f$ (or $K$), whereas now we are freezing the backward
variable $f_0$. This state of affairs is easily clarified when we remember that underlying
Dupire’s derivation of the local volatility function is a differentiation with respect to
the forward variable. So, in a sense we are mixing a backward and a forward representation.
Although this may seem odd, it does carry some distinct advantages,
especially in the time-inhomogeneous case, since all quantities of interest there are
evaluated at the spot time $t$, which is mostly frozen throughout.
As an example, let $d_1(f, f_0, t)$ denote the 1-D distance function introduced
at the beginning of the preceding section, and let $d^{\varepsilon}$ denote the distance corresponding
to the expansion $\sigma_L^{\varepsilon}$ of the local volatility obtained in (8.4):

\[
d^{\varepsilon}(f_0, f, t, T) = \int_f^{f_0} \frac{1}{\sigma_L^{\varepsilon}(f_0, a_0, u, t, T)}\, du .
\]

So, using that by definition $\frac{\partial}{\partial T}\big|_{T=t} = \frac{\partial}{\partial \varepsilon}$, we obtain

\[
\frac{d}{d\varepsilon}\, d^{\varepsilon}(f, f_0, t, \varepsilon)\Big|_{\varepsilon=0}
= -\int_f^{f_0} \frac{\sigma_L^{(1)}(f_0, a_0, u, t)}{\big(\sigma_L^{(0)}(f_0, a_0, u, t)\big)^{2}}\, du . \tag{5.9}
\]

Therefore the expression appearing in $u_0$ can be written

\[
\exp\!\left( \int_f^{f_0} \frac{1}{\sigma_L^{(0)}(f_0, a_0, \eta, t)} \int_K^{\eta} \frac{\sigma_L^{(1)}(f_0, a_0, u, t)}{\big(\sigma_L^{(0)}(f_0, a_0, u, t)\big)^{2}}\, du\, d\eta \right)
\]

Applying this procedure throughout, we obtain the following results:

Proposition 5.4 The small time to maturity expansion of the call prices in the family
(2.1)–(2.3) is given by

\[
C(s,K,t,T) - (s-K)^{+}
= \frac{1}{\sqrt{2\pi}}\, e^{-d^{2}/2\tau}
\left[ \frac{1}{d^{2}}\,\sigma_L(K,t)\,u_{0,1}(s,K,t)\,\tau^{3/2}
+ \left( -\frac{3}{d^{4}}\,\sigma_L^{(0)}(K,t)\,u_0(s,K,t) + \sigma_L^{(1)}\,\frac{1}{d^{2}}\,u_{0,1}(s,K,t) + \frac{1}{d^{2}}\,\sigma_L^{(0)}(K,t)\,u_{1,1}(s,K,t) \right)\tau^{5/2} \right]
\]

where $\sigma_L^{(0)}$ is given by (4.19), $\sigma_L^{(1)}$ is given by (4.20), $u_{0,1}$ is given by (5.14) and $u_{1,1}$ is given by (5.16).

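Proposition 5.5 Given the family of local-stochastic volatility models (2.1)–(2.3),
and assuming the coefficients do not depend on time, we have the following asymptotic
expansion for $\varepsilon \to 0$:

\[
\sigma^{SV}_{BS}[\sigma_L](f_0, a_0, K, t) \equiv \sigma^{SV}_{BS,0}[\sigma_L]
+ \sigma^{SV}_{BS,1}[\sigma_L](f_0, a_0, K, t)\,\tau
+ \sigma^{SV}_{BS,2}[\sigma_L](f_0, K, T)\,\tau^{2} + o(\tau^{2}), \qquad \tau = T - t \to 0,
\]

where

\[
\sigma^{SV}_{BS,0}[\sigma_L](f_0, a_0, K, t) = \frac{\xi}{\int_K^{f_0} \frac{du}{\sigma_L^{(0)}(f_0, a_0, u, t)}}, \tag{5.10}
\]

\[
\sigma^{SV}_{BS,1}[\sigma_L](f_0, a_0, K, t) \tag{5.11}
\]
\[
= \frac{\xi}{d^{(0)}(f_0, K, t)^{3}}\,
\ln\!\left( \frac{\sigma_L^{(0)}(f_0, a_0, K, t)\,u_{0,1}(f_0, K, t)\,d^{(0)}(f_0, a_0, K, t)}{\xi\,\sqrt{f_0 K}} \right). \tag{5.12}
\]

In the equation above

\[
d^{(0)}(f_0, K, t) = \int_K^{f_0} \frac{1}{\sigma_L^{(0)}(f_0, a_0, u, t)}\, du, \tag{5.13}
\]

\[
u_{0,1}(f, K, t) = \sqrt{\frac{\sigma_L^{(0)}(f_0, a_0, f_0, t)}{\sigma_L^{(0)}(f_0, a_0, K, t)}}\;
\exp\!\left( \int_K^{f} \frac{1}{\sigma_L^{(0)}(f_0, \eta, t)} \int_K^{\eta} \frac{\sigma_L^{(1)}(f_0, a_0, u, t)}{\big(\sigma_L^{(0)}(f_0, a_0, u, t)\big)^{2}}\, du\, d\eta \right). \tag{5.14}
\]

Also

\[
\sigma^{SV}_{BS,2}(f_0, a_0, K, t)
= -\frac{3\sigma^{SV}_{BS,1}}{(d^{(0)})^{2}} + \frac{3(\sigma^{SV}_{BS,1})^{2}}{2\sigma^{SV}_{BS,0}} + \frac{\xi^{3}}{8(d^{(0)})^{5}}
+ \frac{\xi}{(d^{(0)})^{3}} \left( \frac{\sigma_L^{(1)}}{\sigma_L^{(0)}} + \frac{u_{1,1}}{u_{0,1}} \right) \tag{5.15}
\]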

where, recalling (5.3), and, for brevity suppressing the dependence on the initial
variables ( f 0 , a0 ) we have

 (0)
u 0,1 f0 σ L (η, t)
u 1,1 = (0)
d K d (0)
   η 

× H(2) + (H f )(2) +
2
(Ht )(2) (ζ, K , t)dζ (0)
, (5.16)
K σ L (η, t)

where
(1)
f 0 σ L (u,t)
σL
(1) K (σ (0) (u,t)))2 du
H(2) = (0)
− L
(0)
,
2(σ L )2 σL
(H f )(2)
 (1)

f0 σL
(1) K (σ (0) )2 du σ L(1) (0) (0)
σL (σ L )2K (σ ) K K
= (0)
− L
(0)
− + L (0) ,
(σ L )3 (σ L )2 2(σ L )(0) )2 2σ L

(Ht )(2)
 (1) (0) (2)

1 f −2(σ L )2 + σ L σ L (0)
= 2 du σ L
2(σ L(0) )2 K (σ L(0) )3
 f (0,1)  
σL (1) (1) (0) (0) (1)
+2 (0)
du σ L − σ L (σ L ) K + σ L (σ L ) K
K (σ L )2

Remark 5.6 A simplified formula at the second order

A remarkable aspect of the above expression for $\sigma^{SV}_{BS,2}$ is that in almost the entire
expression only $\sigma_L^{(0)}$ and $\sigma_L^{(1)}$ and their derivatives are involved. Recall that the
determination of the first of these required only the heat kernel coefficient $u_0$ in the
2-D heat kernel expansion, while the second of these required only the heat kernel
coefficient $u_1$. The lengthy expression for $\sigma_L^{(2)}$, i.e. the coefficient of $\tau^{2}$ in (8.4),
expressed in terms of the lengthy expression for $V_2$, is required above only for the
first term under the integral of $(H_t)^{(2)}$. Thus, we may propose, as a very reasonable
approximation for the implied volatility expansion at the second order, to use (5.15),
with all terms the same but using instead the modified H2m defined by

 
1 f −2(σ L(1) )2 +
2 du σ L(0)
2(σ L(0) )2 K (σ L(0) )3
 f (0,1)  
σL (1) (1) (0) (0) (1)
+2 (0)
du σ L − σ L (σ L ) K + σ L (σ L ) K
K (σ L )2

6 Example: Dynamic λ-SABR Model

6.1 Step 1: Reduction to Laplace Beltrami + Drift Form

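In this section we apply our expansions to the dynamic λ-Sabr model

\[
df_t = C(f,t)\,a_t\,dW_{1,t}, \qquad
da_t = \kappa(t)(\bar\lambda - a_t)\,dt + a_t\,\nu(t)\,dW_{2,t} .
\]

Our diffusion matrix is

\[
g^{ij} = \begin{pmatrix} C^{2}a^{2} & \nu\rho C a^{2} \\ C\nu\rho a^{2} & \nu^{2}a^{2} \end{pmatrix}
\]

with inverse

\[
g_{ij} = \begin{pmatrix}
\dfrac{1}{C^{2}a^{2}(1-\rho^{2})} & -\dfrac{\rho}{a^{2}\nu(1-\rho^{2})C} \\[2ex]
-\dfrac{\rho}{a^{2}\nu C(1-\rho^{2})} & \dfrac{1}{\nu^{2}(1-\rho^{2})a^{2}}
\end{pmatrix}
\]

Recall that the Laplace Beltrami operator $\Delta_B$ is given by

\[
\Delta_B = g^{-1/2}\,\partial_i\big( g^{1/2} g^{ij}\,\partial_j \big) = g^{ij}\partial_i\partial_j + g^{-1/2}\,\partial_i\big(g^{1/2}g^{ij}\big)\,\partial_j
\]

In the case of our metric we have

\[
g^{-1} = -a^{4}C^{2}\nu^{2}(-1+\rho^{2}); \tag{6.1}
\]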
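The algebra relating the diffusion matrix, its inverse and the determinant (6.1) is easy to confirm numerically. The sketch below does so at one arbitrary sample point; the function names and the sample values of $C$, $a$, $\nu$, $\rho$ are illustrative assumptions only.

```python
def diffusion_matrix(C, a, nu, rho):
    # g^{ij} in the variables (f, a); C stands for C(f, t) at a given point.
    return [[C * C * a * a, nu * rho * C * a * a],
            [nu * rho * C * a * a, nu * nu * a * a]]

def inverse_metric(C, a, nu, rho):
    # g_{ij} as written in the text.
    k = 1.0 / (1.0 - rho * rho)
    return [[k / (C * C * a * a), -k * rho / (a * a * nu * C)],
            [-k * rho / (a * a * nu * C), k / (nu * nu * a * a)]]

def matmul2(A, B):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

C, a, nu, rho = 1.7, 0.65, 1.3, -0.5   # arbitrary sample values
g_up = diffusion_matrix(C, a, nu, rho)
g_down = inverse_metric(C, a, nu, rho)
product = matmul2(g_up, g_down)         # should be the identity matrix
det_up = g_up[0][0] * g_up[1][1] - g_up[0][1] * g_up[1][0]
det_formula = -a**4 * C**2 * nu**2 * (-1.0 + rho**2)   # right-hand side of (6.1)
print(product, det_up, det_formula)
```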

We now use the definition of the Laplace-Beltrami operator to rewrite the original
PDE in the form

\[
u_t + \frac{1}{2}\,\Delta_B u + V^{1} u_f + V^{2} u_a = 0
\]
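where a straightforward calculation using the explicit form of the metric coefficients
yields that the “metric” part of the drift is given by

\[
\begin{aligned}
V_m^{1} &= \frac{1}{2}\, a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}
\left[ \partial_f\!\left( \frac{1}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}}\, g^{11} \right)
+ \partial_a\!\left( \frac{1}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}}\, g^{12} \right) \right] \\
&= \frac{1}{2}\, a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}\;\frac{\beta\gamma f^{\beta-1}}{\nu\sqrt{1-\rho^{2}}}
= \frac{1}{2}\,\gamma^{2}\beta a^{2} f^{2\beta-1},
\end{aligned}
\]

\[
\begin{aligned}
V_m^{2} &= \frac{1}{2}\, a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}
\left[ \partial_f\!\left( \frac{1}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}}\, g^{12} \right)
+ \partial_a\!\left( \frac{1}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}}\, g^{22} \right) \right] \\
&= \frac{1}{2}\, a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}
\left[ \partial_f\!\left( \frac{\gamma\nu\rho f^{\beta}a^{2}}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}} \right)
+ \partial_a\!\left( \frac{\nu^{2}a^{2}}{a^{2}\nu\gamma f^{\beta}\sqrt{1-\rho^{2}}} \right) \right]
= 0 .
\end{aligned}
\]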

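Thus the metric part of the contravariant drift can be written

\[
V_m = \frac{1}{2}\,\gamma^{2}\beta a^{2} f^{2\beta-1}\,\partial_f
\]

The corresponding covector field is obtained by lowering the indices:

\[
V_{1m} = g_{11} V_m^{1}, \qquad V_{2m} = g_{12} V_m^{1} .
\]

Hence

\[
V_{1m} = \frac{1}{2}\,\gamma^{2}\beta a^{2} f^{2\beta-1}\,\frac{1}{\gamma^{2} f^{2\beta}(1-\rho^{2})a^{2}} = \frac{1}{2}\,\frac{\beta}{f(1-\rho^{2})},
\]
\[
V_{2m} = -\frac{\rho}{\gamma\nu(1-\rho^{2}) f^{\beta} a^{2}}\;\frac{1}{2}\,\gamma^{2}\beta a^{2} f^{2\beta-1} = -\frac{\beta\gamma\rho}{2\nu(1-\rho^{2})}\, f^{\beta-1} .
\]

Thus the covector form of the metric part of the drift is

\[
\frac{\beta}{2f(1-\rho^{2})}\, df - \frac{\beta\gamma\rho}{2(1-\rho^{2})}\, f^{\beta-1}\,\frac{da}{\nu} \tag{6.2}
\]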

Remark: Allowing general C(f, t).

Via the same procedure, when instead of $\gamma(t) f^{\beta}$ we consider a general form
$C(f,t)$ in (2.1), we obtain, for the metric part of the drift, when $C$ does not depend
explicitly on time:

\[
\frac{1}{2(1-\rho^{2})}\,\frac{\partial C/\partial f}{C}\, df - \frac{\rho\,\partial C/\partial f}{2\nu(1-\rho^{2})}\, da, \tag{6.3}
\]

and, in the new variables defined by (6.6) and (6.7) below, this term may be expressed
in the form

\[
\frac{\partial C/\partial f}{2\sqrt{1-\rho^{2}}}\, dx,
\]

a fact already pointed out by Henry-Labordère [39] and Paulot [48].

Note also that the first part of (6.3) is a perfect differential, $\frac{1}{2(1-\rho^{2})}\, d\log C(f)$, and
can thus be integrated directly along the original geodesic (from final point $(K, a)$
to initial point $(f_0, a_0)$) to get

\[
\frac{1}{2(1-\rho^{2})}\,\log\!\left( \frac{C(f_0,t)}{C(K,t)} \right)
\]

The work done by this term is thus immediately evaluated to be

\[
\left( \frac{C(f_0,t)}{C(K,t)} \right)^{\frac{1}{2(1-\rho^{2})}}
\]

In particular in the λ-Sabr model, we get

\[
\left( \frac{F_0}{K} \right)^{\frac{\beta}{2(1-\rho^{2})}} \tag{6.4}
\]

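To this metric part, we need to add the drift coming from the mean reverting
volatility. This vector has only the $\partial_a$ component:

\[
V^{\mathrm{meanrev}} = \kappa(\bar\lambda - a)\,\partial_a ,
\]
\[
V_{\mathrm{meanrev}} = g_{12}\,\kappa(\bar\lambda - a)\, df + g_{22}\,\kappa(\bar\lambda - a)\, da
= \frac{\rho\,\kappa[t](-a + \lambda[t])}{a^{2}\nu(-1+\rho^{2})\,C[f,t]}\, df - \frac{\kappa(-a + \lambda[t])}{a^{2}\nu^{2}(-1+\rho^{2})}\, da \tag{6.5}
\]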

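Define

\[
q = \frac{f^{1-\beta}}{\gamma(t)(1-\beta)}
\]

We make the change of variables

\[
x = \frac{\nu q - \rho a}{\nu\sqrt{1-\rho^{2}}} \tag{6.6}
\]
\[
y = \frac{a}{\nu} \tag{6.7}
\]
\[
df = \gamma\sqrt{1-\rho^{2}}\, C\, dx + \gamma\rho\, C\, dy \tag{6.8}
\]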
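The change of variables (6.6) and (6.7) can be coded directly, which also gives a quick check of the identity $q = \sqrt{1-\rho^{2}}\,x + \rho y$ used later to express $f^{\beta-1}$ through $(x, y)$. The sketch below works at a frozen time $t$, so $\gamma$, $\nu$, $\rho$ are plain numbers; the function names and sample values are illustrative assumptions.

```python
import math

def to_xy(f, a, gamma, beta, nu, rho):
    # Change of variables (6.6)-(6.7) at a frozen time t.
    q = f ** (1.0 - beta) / (gamma * (1.0 - beta))
    x = (nu * q - rho * a) / (nu * math.sqrt(1.0 - rho * rho))
    y = a / nu
    return x, y

def f_power_from_xy(x, y, gamma, beta, rho):
    # f^(beta - 1) written in terms of (x, y), using q = sqrt(1-rho^2) x + rho y.
    return 1.0 / ((1.0 - beta) * gamma * (math.sqrt(1.0 - rho * rho) * x + rho * y))

f, a = 102.0, 0.65
gamma, beta, nu, rho = 1.0, 0.5, 1.2, -0.5
x, y = to_xy(f, a, gamma, beta, nu, rho)
lhs = f ** (beta - 1.0)
rhs = f_power_from_xy(x, y, gamma, beta, rho)
print(x, y, lhs, rhs)
```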

This change of variables is defined in such a way that the principal part of the
partial differential equation, expressed in the new coordinates, is the standard Sabr
model corresponding to the (rescaled) hyperbolic plane, i.e., of the form

\[
\frac{1}{2}\,\nu^{2}y^{2}\,(u_{xx} + u_{yy})
\]

After writing the backward partial differential equation in the form $\frac{1}{2}\Delta_B + V$, each
of the terms is covariant. This means that in order to determine the contravariant
drift in the new coordinates, it suffices to transform the contravariant drift in the old
coordinates by the recipe for the change of a contravariant vector under changes of
the independent variables. However, since the changes of variables we are making,
(6.6) and (6.7), depend explicitly on time, it is clear from the expression of the new
dependent variable in terms of the old, i.e. $u(f, a, t) = v(x(f, a, t), y(f, a, t), t)$,
that in the drift term for the new PDE there is an extra term coming from the time
derivatives of the independent variables. A simple calculation shows that the metric
part of the drift (6.2) can be expressed in the compact form

\[
\frac{\beta\gamma f^{\beta-1}}{2\sqrt{1-\rho^{2}}}\, dx \tag{6.9}
\]

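Recalling the transformation (6.6) and (6.7), the time derivatives are (a prime denotes differentiation with respect to $t$)

\[
\begin{aligned}
x_t &= \left(\frac{1}{\gamma}\right)' \frac{f^{1-\beta}}{(1-\beta)\sqrt{1-\rho^{2}}} - \frac{\rho'}{(1-\rho^{2})^{3/2}}\, y - \left(\frac{1}{\nu}\right)' \frac{\rho}{\sqrt{1-\rho^{2}}}\,\nu y \\
&= -\frac{\gamma'}{\gamma}\left( x + \rho\,\frac{y}{\sqrt{1-\rho^{2}}} \right) - \frac{\rho'}{(1-\rho^{2})^{3/2}}\, y + \frac{\nu'}{\nu}\,\frac{\rho}{\sqrt{1-\rho^{2}}}\, y
\end{aligned} \tag{6.10}
\]
\[
y_t = \left(\frac{1}{\nu}\right)' \nu y \tag{6.11}
\]

Note that from the transformations (6.6) and (6.7) we obtain the relation

\[
dx = \frac{df}{\sqrt{1-\rho^{2}}\,\gamma f^{\beta}} - \rho\,\frac{dy}{\sqrt{1-\rho^{2}}} \tag{6.12}
\]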

For the non-metric part of the drift arising from the mean reversion we obtain
from (6.12) and from (6.5):

\[
-\frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}}\, dx
+ \left( -\frac{\rho^{2}\kappa(\bar\lambda - \nu y)}{(1-\rho^{2})\,\nu^{3}y^{2}} + \frac{\kappa(\bar\lambda - \nu y)}{(1-\rho^{2})\,\nu^{3}y^{2}} \right) dy \tag{6.13}
\]

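All in all the PDE satisfied by $v$ is therefore

\[
v_t + \frac{1}{2}\,\nu^{2}y^{2}\,(v_{xx} + v_{yy}) + \tilde V^{x} v_x + \tilde V^{y} v_y = 0, \tag{6.14}
\]

where

\[
\tilde V_x\, dx = \left( \frac{\beta\gamma f^{\beta-1}}{2\sqrt{1-\rho^{2}}} - \frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}} + \frac{1}{y^{2}}\, x_t \right) dx
\]

Since the first part contains an exact part, it is better to express the first expression
in the equation above in the form

\[
\frac{1}{2(1-\rho^{2})}\, d\log f^{\beta} - \frac{\beta\gamma\rho}{2(1-\rho^{2})}\, f^{\beta-1}\, dy
\]

As mentioned earlier, the first part is trivially integrated (see (6.4)), and the second
is part of $\tilde V_y\, dy$, so we have the full drift written as

\[
\bar{\tilde V}_x\, dx + \bar{\tilde V}_y\, dy + \frac{1}{2(1-\rho^{2})}\, d\log f^{\beta}, \tag{6.15}
\]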

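where

\[
\bar{\tilde V}_x\, dx = \left( -\frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}} + \frac{1}{y^{2}}\, x_t \right) dx ,
\]
\[
\bar{\tilde V}_y\, dy = -\frac{\beta\gamma\rho}{2(1-\rho^{2})}\, f^{\beta-1}\, dy + \left( \frac{1}{y^{2}}\, y_t + \frac{\kappa(\bar\lambda - \nu y)}{\nu^{3}y^{2}} \right) dy \tag{6.16}
\]
\[
= \left( -\frac{\beta\gamma\rho}{2(1-\rho^{2})}\,\frac{1}{(1-\beta)\gamma\sqrt{1-\rho^{2}}\, x + \rho(1-\beta)\gamma y} + \frac{1}{y^{2}}\, y_t + \frac{\kappa(\bar\lambda - \nu y)}{\nu^{3}y^{2}} \right) dy \tag{6.17}
\]

where, in the next to last step, we used the relation

\[
f^{\beta-1} = \frac{1}{(1-\beta)\gamma\sqrt{1-\rho^{2}}\, x + \rho(1-\beta)\gamma y}
\]

Recall that the difference between $\bar{\tilde V}$ and $\tilde V$ is that in $\bar{\tilde V}$ we removed the exact part
of the drift vector. For completeness we note that, alternatively, including the exact
part explicitly, we have the relations:

\[
\begin{aligned}
\tilde V_x\, dx &= \left( \frac{\beta\gamma f^{\beta-1}}{2\sqrt{1-\rho^{2}}} - \frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}} + \frac{1}{y^{2}}\, x_t \right) dx \\
&= \left( \frac{\beta}{2\sqrt{1-\rho^{2}}\,(1-\beta)\big(\sqrt{1-\rho^{2}}\, x + \rho y\big)} - \frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}} + \frac{1}{y^{2}}\, x_t \right) dx
\end{aligned} \tag{6.18}
\]
\[
\tilde V_y\, dy = \left( \frac{1}{y^{2}}\, y_t + \frac{\kappa(\bar\lambda - \nu y)}{\nu^{3}y^{2}} \right) dy, \tag{6.19}
\]

and the associated contravariant components:

\[
\tilde V^{x} = \frac{\beta\nu^{2}y^{2}}{2\sqrt{1-\rho^{2}}\,(1-\beta)\big(\sqrt{1-\rho^{2}}\, x + \rho y\big)} - \frac{\nu^{2}\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}} + \nu^{2} x_t ,
\qquad
\tilde V^{y} = \nu^{2} y_t + \frac{\kappa(\bar\lambda - \nu y)}{\nu} \tag{6.20}
\]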

6.2 The Geodesics in the Poincaré Plane

The changes of variables we have made in the last section transform the original
operator into the operator

\[
\partial_t + \frac{1}{2}\,\nu^{2}y^{2}\,(\partial_x^{2} + \partial_y^{2}) + \tilde V^{x}\partial_x + \tilde V^{y}\partial_y
\]

The factor $\nu^{2}$ in front of the second order part means we are not yet in the standard
Poincaré plane where $\nu = 1$. To adjust for this we further make the change of time

\[
t = \frac{\tilde t}{\nu^{2}} \tag{6.21}
\]

so, in the $\tilde t$ variable the operator is now in standard form

\[
\partial_{\tilde t} + \frac{1}{2}\, y^{2}\,(\partial_x^{2} + \partial_y^{2}) + \frac{\tilde V^{x}}{\nu^{2}}\,\partial_x + \frac{\tilde V^{y}}{\nu^{2}}\,\partial_y
\]

As is easily seen, the covariant form of the drift remains the same (due to offsetting
effects), but the contravariant form changes by a factor of $\nu^{2}$.
The geodesic passing through the points $(x_1, y_1)$ and $(x_2, y_2)$ is known to be the
semi-circle with origin at $(x_0, 0)$ where

\[
x_0(x_1, y_1, x_2, y_2) = \frac{x_2^{2} - x_1^{2} + y_2^{2} - y_1^{2}}{2(x_2 - x_1)}
\]

and where the radius $R$ is given by:

\[
R = \sqrt{y_1^{2} + (x_1 - x_0)^{2}} = \sqrt{y_2^{2} + \big(x_2 - x_0(x_1, y_1, x_2, y_2)\big)^{2}} \tag{6.22}
\]

It will be convenient to use standard polar coordinates

\[
x = x_0 + R\cos\theta, \qquad y = R\sin\theta, \qquad 0 \le \theta \le \pi \tag{6.23}
\]

Notice that if we denote by $(x, y)$ the running point on the geodesic, then the polar
angle corresponding to this point, issuing from a fixed point $(x_2, y_2)$, is given by

\[
\theta(x,y) =
\begin{cases}
\arctan\!\left( \dfrac{y}{x - \frac{x_2^{2} - x^{2} + y_2^{2} - y^{2}}{2(x_2 - x)}} \right), & \arctan\!\left( \dfrac{y}{x - \frac{x_2^{2} - x^{2} + y_2^{2} - y^{2}}{2(x_2 - x)}} \right) > 0, \\[3ex]
\arctan\!\left( \dfrac{y}{x - \frac{x_2^{2} - x^{2} + y_2^{2} - y^{2}}{2(x_2 - x)}} \right) + \pi, & \arctan\!\left( \dfrac{y}{x - \frac{x_2^{2} - x^{2} + y_2^{2} - y^{2}}{2(x_2 - x)}} \right) < 0 .
\end{cases} \tag{6.24}
\]
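The geodesic data (6.22)–(6.24) are simple to compute. The sketch below finds the centre, radius and polar angles for two sample points and checks them against two standard facts about the upper half-plane with metric $(dx^{2}+dy^{2})/y^{2}$ that are assumed here rather than taken from the text: the arc length along the semicircle is $|\log(\tan(\theta_2/2)/\tan(\theta_1/2))|$, and the distance has the closed form $\cosh^{-1}\big(1 + ((x_2-x_1)^{2}+(y_2-y_1)^{2})/(2y_1y_2)\big)$. The function names are illustrative, and `atan2` is used in place of the two-branch arctangent.

```python
import math

def geodesic_circle(p1, p2):
    # Centre (x0, 0) and radius R of the semicircle through p1 and p2, as in (6.22).
    (x1, y1), (x2, y2) = p1, p2
    x0 = (x2**2 - x1**2 + y2**2 - y1**2) / (2.0 * (x2 - x1))
    return x0, math.hypot(x1 - x0, y1)

def polar_angle(p, x0):
    # For y > 0, atan2 returns an angle in (0, pi); this reproduces both
    # branches of (6.24) (arctan, shifted by pi when it is negative).
    return math.atan2(p[1], p[0] - x0)

def hyperbolic_distance(p1, p2):
    # Closed-form distance in the upper half-plane (assumed standard formula).
    (x1, y1), (x2, y2) = p1, p2
    return math.acosh(1.0 + ((x2 - x1)**2 + (y2 - y1)**2) / (2.0 * y1 * y2))

p1, p2 = (0.3, 1.2), (2.0, 0.7)
x0, R = geodesic_circle(p1, p2)
R_check = math.hypot(p2[0] - x0, p2[1])          # radius measured from the second point
th1, th2 = polar_angle(p1, x0), polar_angle(p2, x0)
arc_length = abs(math.log(math.tan(th2 / 2.0) / math.tan(th1 / 2.0)))
print(x0, R, th1, th2, arc_length, hyperbolic_distance(p1, p2))
```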

6.3 Calculation of the Work Done by the Drift

The integral we need to calculate is

\[
\int_{\gamma((x_1,y_1),(x_2,y_2))} \tilde V_x\, dx + \tilde V_y\, dy
\]

where $\gamma((x_1, y_1), (x_2, y_2))$ is the geodesic joining the points $(x_1, y_1)$ and $(x_2, y_2)$ in
the standard hyperbolic plane, which is the image of the original $(f, a)$ plane under
the coordinate transformation (6.6) and (6.7).
Let us first carry this out for the so-called “metric part” of the drift, i.e. the part
corresponding to $\kappa = 0$ and no time dependence in the coefficients.
Then, using (6.18), for the metric part of the drift we have that

\[
\int_{\gamma(x,y)} \tilde V^{m}\, dy
= -\frac{\beta\gamma\rho}{2(1-\rho^{2})} \int_{\theta_1}^{\theta_2} \frac{\cos\theta}{\sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + R\rho\sin\theta}\, d\theta
\]

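Therefore we must calculate the integral

\[
\int \frac{1}{\sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta}\,\cos\theta\, d\theta
\]

To evaluate this integral, note that we may write the identity

\[
\frac{\cos\theta}{\sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta}
= \frac{\sqrt{1-\rho^{2}}}{R} + \frac{\rho}{R}\,\frac{d}{d\theta}\log\!\Big( \sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta \Big)
- \frac{x_0(1-\rho^{2})}{R}\,\frac{1}{\sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta}
\]

and therefore the integral can be written

\[
\frac{\sqrt{1-\rho^{2}}}{R}\,(\theta_2 - \theta_1) + \frac{\rho}{R}\,\log\!\Big( \sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta \Big)\Big|_{\theta_1}^{\theta_2}
- \frac{x_0(1-\rho^{2})}{R} \int_{\theta_1}^{\theta_2} \frac{1}{\sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta}\, d\theta \tag{6.25}
\]

and it remains only to evaluate this last integral.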
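The decomposition leading to (6.25) can be checked numerically before the remaining integral is evaluated in closed form. The sketch below compares a direct quadrature of the left-hand side with the right-hand side, taking the coefficient of the remaining integral to be $x_0(1-\rho^{2})/R$; the sample values of $\rho$, $x_0$, $R$ and the angles are arbitrary assumptions, chosen so that the denominator stays positive on the integration range.

```python
import math

def simpson(g, a, b, n=2000):
    # Composite Simpson rule on [a, b] with an even number n of panels.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

rho, x0, R = -0.5, 0.4, 1.3
s = math.sqrt(1.0 - rho * rho)
th1, th2 = 0.4, 1.1

def denom(th):
    # sqrt(1 - rho^2) (x0 + R cos th) + rho R sin th
    return s * (x0 + R * math.cos(th)) + rho * R * math.sin(th)

lhs = simpson(lambda th: math.cos(th) / denom(th), th1, th2)
remaining = simpson(lambda th: 1.0 / denom(th), th1, th2)
rhs = (s / R * (th2 - th1)
       + rho / R * math.log(denom(th2) / denom(th1))
       - x0 * (1.0 - rho * rho) / R * remaining)
print(lhs, rhs)
```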


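This may be expressed as

\[
\int_{\theta_1}^{\theta_2} \frac{1}{\sqrt{1-\rho^{2}}\, x_0 + R\cos\!\Big( \theta - \arctan\big( \tfrac{\rho}{\sqrt{1-\rho^{2}}} \big) \Big)}\, d\theta
\]

Letting

\[
a_1 = \sqrt{1-\rho^{2}}\, x_0, \qquad \hat\gamma = \arctan\!\left( \frac{\rho}{\sqrt{1-\rho^{2}}} \right),
\]

the last integral can be expressed as:

\[
I_1 := \int_{\theta_1}^{\theta_2} \frac{1}{a_1 + R\cos(\theta - \hat\gamma)}\, d(\theta - \hat\gamma) \tag{6.26}
\]
\[
= \int_{\theta_1 - \hat\gamma}^{\theta_2 - \hat\gamma} \frac{1}{a_1 + R\cos\theta}\, d\theta \tag{6.27}
\]
\[
= \begin{cases}
\dfrac{2}{\sqrt{a_1^{2} - R^{2}}}\,\arctan\!\left( \sqrt{\dfrac{a_1 - R}{a_1 + R}}\,\tan\dfrac{\theta}{2} \right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1^{2} > R^{2}, \\[3ex]
\dfrac{2}{\sqrt{R^{2} - a_1^{2}}}\,\operatorname{arctanh}\!\left( \sqrt{\dfrac{R - a_1}{R + a_1}}\,\tan\dfrac{\theta}{2} \right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1^{2} < R^{2}, \\[3ex]
\dfrac{1}{R}\,\tan\!\left(\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1 = R, \\[3ex]
\dfrac{1}{R}\,\cot\!\left(\dfrac{\theta}{2}\right)\Bigg|_{\theta=\theta_1-\hat\gamma}^{\theta=\theta_2-\hat\gamma} & \text{if } a_1 = -R .
\end{cases} \tag{6.28}
\]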
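The two generic branches of (6.28) can be compared with a direct quadrature of $1/(a_1 + R\cos\theta)$. The sketch below is illustrative only: it assumes $a_1 + R\cos\theta > 0$ on the integration range (in particular $a_1 > 0$ in the arctan branch), takes limits that already include the shift by $\hat\gamma$, and leaves the boundary cases $a_1 = \pm R$ out.

```python
import math

def simpson(g, a, b, n=2000):
    # Composite Simpson rule on [a, b] with an even number n of panels.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def i1_closed_form(a1, R, lo, hi):
    # Closed form of the integral of 1 / (a1 + R cos(theta)) over [lo, hi],
    # first two branches of (6.28); assumes a1 + R cos(theta) > 0 on [lo, hi].
    def antiderivative(th):
        t = math.tan(th / 2.0)
        if a1 * a1 > R * R:
            return 2.0 / math.sqrt(a1 * a1 - R * R) * math.atan(math.sqrt((a1 - R) / (a1 + R)) * t)
        if a1 * a1 < R * R:
            return 2.0 / math.sqrt(R * R - a1 * a1) * math.atanh(math.sqrt((R - a1) / (R + a1)) * t)
        raise ValueError("boundary cases a1 = +/-R are not handled in this sketch")
    return antiderivative(hi) - antiderivative(lo)

case_a = i1_closed_form(2.0, 1.0, 0.3, 1.2)      # a1^2 > R^2
num_a = simpson(lambda th: 1.0 / (2.0 + math.cos(th)), 0.3, 1.2)
case_b = i1_closed_form(0.5, 1.0, 0.2, 1.0)      # a1^2 < R^2
num_b = simpson(lambda th: 1.0 / (0.5 + math.cos(th)), 0.2, 1.0)
print(case_a, num_a, case_b, num_b)
```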

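So that all told, taking into account (6.4) we obtain the following expression for
the line integral corresponding to the metric part:

\[
\log(P)_{ME}
= -\frac{\beta\gamma\rho}{2(1-\rho^{2})} \left[ \frac{\sqrt{1-\rho^{2}}}{R}\,(\theta_2 - \theta_1) + \frac{\rho}{R}\,\log\!\Big( \sqrt{1-\rho^{2}}\,(x_0 + R\cos\theta) + \rho R\sin\theta \Big)\Big|_{\theta_1}^{\theta_2} \right. \tag{6.29}
\]
\[
\left. {} - \frac{x_0(1-\rho^{2})}{R}\, I_1 \right] + \frac{\beta}{2(1-\rho^{2})}\,\log\!\left( \frac{f_0}{K} \right) \tag{6.30}
\]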

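λ-Sabr part
Next we deal with the part specific to λ-Sabr, which we have found above to be

\[
-\frac{\rho\kappa(\bar\lambda - \nu y)}{\sqrt{1-\rho^{2}}\,\nu^{3}y^{2}}\, dx + \frac{\kappa(\bar\lambda - \nu y)}{\nu^{3}y^{2}}\, dy \tag{6.31}
\]

Inserting this into the line integral that must be evaluated, we find

\[
\frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^{2}}\,\nu^{3}} \int_{\theta_1}^{\theta_2} \csc\theta\, d\theta
- \frac{\rho\kappa}{\sqrt{1-\rho^{2}}\,\nu^{2}} \int_{\theta_1}^{\theta_2} 1\, d\theta
+ \frac{\kappa\bar\lambda}{\nu^{3}R} \int_{\theta_1}^{\theta_2} \frac{\cos\theta}{\sin^{2}\theta}\, d\theta
- \frac{\kappa}{\nu^{2}} \int_{\theta_1}^{\theta_2} \frac{\cos\theta}{\sin\theta}\, d\theta
\]

Now

\[
\int \csc\theta\, d\theta = \log(\tan(\theta/2)) + C, \qquad
\int \frac{\cos\theta}{\sin^{2}\theta}\, d\theta = -\frac{1}{\sin\theta} + C, \qquad
\int \cot\theta\, d\theta = \log(|\sin\theta|) + C
\]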

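Therefore, all told we obtain for the line integral corresponding to the mean reverting
part:

\[
\frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^{2}}\,\nu^{3}}\,\log\left| \frac{\tan(\theta_2/2)}{\tan(\theta_1/2)} \right| - \frac{\rho\kappa}{\sqrt{1-\rho^{2}}\,\nu^{2}}\,(\theta_2 - \theta_1) \tag{6.32}
\]
\[
{} + \frac{\kappa\bar\lambda}{\nu^{3}R} \left( \frac{1}{\sin\theta_1} - \frac{1}{\sin\theta_2} \right) - \frac{\kappa}{\nu^{2}}\,\log\!\left( \frac{|\sin\theta_2|}{|\sin\theta_1|} \right) \tag{6.33}
\]

Since $\theta$ varies between 0 and $\pi$ and since in this range both $\sin\theta$ and $\tan(\theta/2)$ are
non-negative, we can remove the absolute value signs above and arrive at

\[
\log(P)_{MR}
= \frac{\rho\kappa\bar\lambda}{R\sqrt{1-\rho^{2}}\,\nu^{3}}\,\log\!\left( \frac{\tan(\theta_2/2)}{\tan(\theta_1/2)} \right) - \frac{\rho\kappa}{\sqrt{1-\rho^{2}}\,\nu^{2}}\,(\theta_2 - \theta_1)
+ \frac{\kappa\bar\lambda}{\nu^{3}R} \left( \frac{1}{\sin\theta_1} - \frac{1}{\sin\theta_2} \right) - \frac{\kappa}{\nu^{2}}\,\log\!\left( \frac{\sin\theta_2}{\sin\theta_1} \right) \tag{6.34}
\]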
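The closed form (6.34) can be cross-checked by integrating the one-form (6.31) numerically along the semicircle $x = x_0 + R\cos\theta$, $y = R\sin\theta$. The sketch below does this with a Simpson rule; the parameter values and function names are arbitrary illustrations and are not calibrated to anything in the text.

```python
import math

def simpson(g, a, b, n=2000):
    # Composite Simpson rule on [a, b] with an even number n of panels.
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def log_p_mr(th1, th2, R, rho, kappa, lam, nu):
    # Closed form (6.34) for the mean-reverting part of the line integral.
    s = math.sqrt(1.0 - rho * rho)
    return (rho * kappa * lam / (R * s * nu**3) * math.log(math.tan(th2 / 2.0) / math.tan(th1 / 2.0))
            - rho * kappa / (s * nu**2) * (th2 - th1)
            + kappa * lam / (nu**3 * R) * (1.0 / math.sin(th1) - 1.0 / math.sin(th2))
            - kappa / nu**2 * math.log(math.sin(th2) / math.sin(th1)))

def line_integral_mr(th1, th2, R, rho, kappa, lam, nu):
    # Direct quadrature of the one-form (6.31) along the geodesic.
    s = math.sqrt(1.0 - rho * rho)
    def integrand(th):
        y = R * math.sin(th)
        dx, dy = -R * math.sin(th), R * math.cos(th)   # derivatives of x, y in theta
        m = kappa * (lam - nu * y) / (nu**3 * y * y)
        return -rho / s * m * dx + m * dy
    return simpson(integrand, th1, th2)

args = (0.6, 2.0, 1.3, -0.5, 0.8, 0.5, 1.2)   # th1, th2, R, rho, kappa, lambda-bar, nu
closed = log_p_mr(*args)
numeric = line_integral_mr(*args)
print(closed, numeric)
```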

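Time dependent component of drift

The last integral we need to evaluate arises from the term $x_t v_x + y_t v_y$ which, as we
have seen, gives rise, after lowering the indices in the hyperbolic metric, to:

\[
\int_{\gamma[x,y]} \frac{x_t}{y^{2}}\, dx + \int_{\gamma[x,y]} \frac{y_t}{y^{2}}\, dy
\]

Using the explicit expressions (6.10) and (6.11) we find that (a prime denoting, as in (6.10), the derivative with respect to $t$)

\[
\frac{x_t}{y^{2}} = -\frac{\gamma'}{\gamma} \left( \frac{x}{y^{2}} + \rho\,\frac{1}{\sqrt{1-\rho^{2}}\; y} \right) - \frac{\rho'}{(1-\rho^{2})^{3/2}}\,\frac{1}{y} + \frac{\nu'}{\nu}\,\frac{\rho}{\sqrt{1-\rho^{2}}}\,\frac{1}{y},
\qquad
\frac{y_t}{y^{2}} = -\frac{\nu'}{\nu}\,\frac{1}{y}
\]

So the integrals to be calculated are:

\[
\log(P_t) = \frac{1}{\sqrt{1-\rho^{2}}} \left( -\frac{\gamma'}{\gamma}\,\rho - \frac{\rho'}{1-\rho^{2}} + \frac{\nu'\rho}{\nu} \right) \int_{\theta_1}^{\theta_2} \frac{dx}{y} - \frac{\nu'}{\nu} \int_{\theta_1}^{\theta_2} \frac{dy}{y} - \frac{\gamma'}{\gamma} \int_{\theta_1}^{\theta_2} \frac{x}{y^{2}}\, dx \tag{6.35}
\]

which gives

\[
\log(P_t) = \frac{1}{\sqrt{1-\rho^{2}}} \left( -\frac{\gamma'}{\gamma}\,\rho - \frac{\rho'}{1-\rho^{2}} + \frac{\nu'\rho}{\nu} \right)(\theta_1 - \theta_2) - \frac{\nu'}{\nu}\,\log\left| \frac{\sin\theta_2}{\sin\theta_1} \right|
- \frac{\gamma'}{\gamma} \left( \frac{1}{R}\,\log\frac{\tan\theta_2}{\tan\theta_1} - \log\left| \frac{\sin\theta_2}{\sin\theta_1} \right| \right) \tag{6.36}
\]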

7 Calculation of u1 Heat Kernel Coefficient

The calculations of the heat kernel coefficient $u_1$ (and all other heat kernel coefficients)
can be carried out either in the original variables $(f, a)$ or in the (rescaled)
standard hyperbolic plane with metric $ds^{2} = \frac{1}{y^{2}}(dx^{2} + dy^{2})$,⁶ provided we use the
transformed drift vector (6.20).
Recall, from formula (2.12) with i = 1 that
 d
1
u1 = u0 u −1
0 (Lu 0 + u τ ) ds (7.1)
d 0

6 Recall that a time change (6.21) gets rid of the extra factor ν 2 .

where
 
√ d( f 0 ,a0 , f,a,t) ∂d(s̃(ρ), ã(ρ), s, a)
u0 = P exp − dρ (7.2)
0 ∂t
  
u0t

where we recall that, using (6.4), (6.34) and (6.36), $P$ is known in closed form and
where in the formula above the distance function is given by (7.8). Also recall that
$\Delta$ is the Van-Vleck De Witt determinant, and we know that in the hyperbolic space
of curvature $-1$

\[
\Delta = \frac{d}{\sinh d}, \tag{7.3}
\]

As noted by Willmore [54] p. 208, the Van-Vleck De Witt determinant (which Willmore
calls the “discriminant function”) is independent of the coordinate system chosen.
This means that to express the VVDW determinant in the original coordinate
system, it is sufficient to use the same expression (7.3) in conjunction with the distance
function in that coordinate system, i.e. (7.8). Since all ingredients in $u_0$ are
explicit we can now simply insert this expression into (7.1) and use a symbolic
calculation engine like Mathematica or Maple to do the calculations.
Alternatively, we can do the calculations using the formula for $u_1$ in the $(x, y)$
plane. This has the advantage that many of the terms arising during the calculation
can be computed, once again, in closed form. We discuss this further in the remainder
of this section.
The fact that $\Delta$ depends exclusively on the distance simplifies some of the expressions
involved in $u_1$. In polar coordinates based at the point $y$, we have that the Laplace
Beltrami operator can be expressed in the form

\[
\frac{\partial^{2}}{\partial r^{2}} + \left( \frac{\partial}{\partial r}\log\frac{r^{n-1}}{\Delta} \right)\frac{\partial}{\partial r} + a(r,\theta,t)\,\frac{\partial^{2}}{\partial\theta^{2}} \tag{7.4}
\]
\[
= \frac{\partial^{2}}{\partial r^{2}} + \left( \frac{n-1}{r} - \frac{\partial_r\Delta}{\Delta} \right)\frac{\partial}{\partial r} + a(r,\theta,t)\,\frac{\partial^{2}}{\partial\theta^{2}}, \tag{7.5}
\]

see Proposition G.V.3, p. 134 of Berger-Gauduchon-Mazet [12] and use the fact that
$\Delta^{-1} = \theta(ru)$ as defined in C.III.3 in [12]. For brevity’s sake, let

\[
c(r,t) = -\frac{\partial_r\Delta}{\Delta} + \frac{n-1}{r}
\]

Using the explicit form of $\Delta$ one calculates that, in the case of the standard hyperbolic
plane,

\[
c(r,t) = \coth(d), \qquad a(r,t) = \frac{1}{\sinh^{2}(d)}
\]

so that the above operator can be expressed as

\[
\frac{\partial^{2}}{\partial r^{2}} + c(r,t)\,\frac{\partial}{\partial r} + a(r,\theta,t)\,\frac{\partial^{2}}{\partial\theta^{2}}
\]
Letting $\Delta_B$ act on $u_0$, there are some simplifications due to the fact that $\Delta$ (the Van
Vleck De Witt determinant) depends only on $r$:

B u0 = B( (r )P(r, θ ))
 
∂2 ∂ ∂2
= +
+ c(r, t) a(r, θ, t) B (P(r, θ ))
∂r 2 ∂r ∂θ 2
 2     
∂ ∂ ∂2
= (P(r, θ )) + c(r, t) ( (r )) + a(r, θ, t) 2 (P(r, θ )) ( (r ))
∂r 2 ∂r ∂θ

Therefore

Lu 0 =
√ √
= PL + LP
√ √ 1 √
= PL + P LA + g i j Ai A j P
2
And so

u −1
0 Lu 0

L 1 ij
= √ + LA + g Ai A j
2

L 1 2 2
= √ + LA + y (Ax + A2y )
2

So, that putting it all together and inserting into (7.1) we obtain an expression of
the form
 & √ '
√ 1 L 1 2 2
P √ + LA + y (Ax + Ay ) ds 2
d γ[x2 ,y2 ,x1 ,y1 ] 2

Since the first part $L\sqrt{\Delta}/\sqrt{\Delta}$ depends only on the distance, it can be integrated in
closed form, and we obtain, using Mathematica and the expression (7.7) below, that
it equals

(−1 + d 2 + d coth d) d csch d
−P
4d 2
Now

LA
y2
= (Ax x + Ayy ) + Ṽ x Ax + Ṽ y Ay (7.6)
√2
L

  √
−1 − 3d 2 + (1 + d 2 ) cosh 2d csch4 d sinh d
= (7.7)
8d 2 (csch d)3/2

So a key quantity that we must calculate is the action of the differential operator $L$
on the exponent $A$ in $P$.
Expressing this exponent using the standard polar coordinates (6.23),⁷ we have,
with fixed point $(x_2, y_2)$, letting for brevity $\theta_1 = \theta(x_1, y_1)$, $\theta_2 = \theta(x_2, y_2)$ with $\theta$
given by (6.24),

\[
A(x, y, x_2, y_2) = R(x, y, x_2, y_2) \int_{\theta_2}^{\theta_1(x,y,x_2,y_2)} \big( -\tilde V_x \sin\theta + \tilde V_y \cos\theta \big)\, d\theta
\]

Therefore

\[
\frac{\partial A}{\partial x}(x_1, y_1)
= R\,\big( -\tilde V_x \sin\theta_1 + \tilde V_y(x_1, y_1)\cos\theta_1 \big)\,(\theta_1)_x\Big|_{x=x_1,\,y=y_1}
+ \left[ (R)_x \int_{\theta_2}^{\theta_1(x_1,y_2,x_2,y_2)} \big( -\tilde V_x \sin\theta + \tilde V_y \cos\theta \big)\, d\theta \right]_{x=x_1,\,y=y_1}
\]

\[
\begin{aligned}
\frac{\partial^{2} A}{\partial x^{2}}(x_1, y_1)
&= \left[ (R)_{xx} \int_{\theta_2}^{\theta_1(x_1,y_2,x_2,y_2)} \big( -\tilde V_x \sin\theta + \tilde V_y \cos\theta \big)\, d\theta \right]_{x=x_1,\,y=y_1} \\
&\quad + 2\,(R)_x\,\big( -\tilde V_x \sin\theta_1 + \tilde V_y(x_1, y_1)\cos\theta_1 \big)\,(\theta_1)_x\Big|_{x=x_1,\,y=y_1} \\
&\quad + R\,\big( -\tilde V_x \sin\theta_1 + \tilde V_y(x_1, y_1)\cos\theta_1 \big)\,(\theta_1)_{xx}\Big|_{x=x_1,\,y=y_1}
\end{aligned}
\]

and similarly for other partial derivatives such as $\frac{\partial A}{\partial y}$. Note that the evaluation of the
terms above requires knowledge of the value of the integral which was computed in
the previous section.
Also we may calculate the second order partial derivatives $A_{xx}$, $A_{yy}$, $A_{xy}$ and
express these in terms of $\tilde V$ and derivatives of the polar angle $\theta_1$, given as in (6.23),
(6.24), and of the radius, given by (6.22).

7 Note this angle is not the same angle as that used above in geodesic polar coordinates.

7.1 Distance Function and Derivatives in Time Dependent Case

The distance function in the original coordinates is given by

\[
d\big((f_0, a_0), (f, a_1)\big) = \frac{1}{\nu}\cosh^{-1}\!\left( 1 + \frac{q^{2} - 2\rho(t)(y_1 - y_0)\,q + (y_1 - y_0)^{2}}{2(1-\rho^{2})\,y_0 y_1} \right), \qquad a = \nu(t)\,y \tag{7.8}
\]
\[
= \frac{1}{\nu}\cosh^{-1}\!\left( 1 + \frac{\nu^{2}(t)\,q^{2} - 2\rho(t)\nu(t)(a_1 - a_0)\,q + (a_1 - a_0)^{2}}{2(1-\rho^{2})\,a_0 a_1} \right)
\]

where

\[
q(f, f_1, t) = \int_{f_1}^{f_0} \frac{1}{\gamma(t) f^{\beta}}\, df = \frac{1}{\gamma(1-\beta)}\big( f^{1-\beta} - f_1^{1-\beta} \big)
\]

Next we check the value that minimizes the geodesic distance from the hyperplane
$f = f_1$. This value is given by

\[
c(t) := (a_1)_{\min}(t) = \sqrt{a_0^{2} - 2\nu(t)q(t)a_0\rho(t) + \nu^{2}(t)q^{2}(t)} \tag{7.9}
\]

After some simplification, the value of the distance at the minimum point
may be written in one of two forms, one corresponding to a distance function
and the second to a signed distance function (of the point $(a_0, f_0)$ to the line $f = K$):

1 qνρ − a0 ρ − a02 + q 2 ν 2 + 2a0 qνρ
2
us
dmin =   , unsigned
ν a0 1 − ρ 2
(7.10)

and

\[
d^{s}_{\min} = \frac{1}{\nu}\,\log\!\left( \frac{1}{a_0(1-\rho)} \left( \nu q - a_0\rho + \sqrt{a_0^{2} - 2a_0\nu\rho q + \nu^{2}q^{2}} \right) \right), \quad \text{signed} \tag{7.11}
\]

As mentioned above, this second version of the distance function can become
negative, but its absolute value coincides with the expression above. This distinction
seems not to have been emphasized in previous treatments of the subject. In calculating
the local volatility expansion, especially the term $V_2$, we require derivatives
up to the sixth order of $\phi = \frac{d^{2}}{2}$. These are listed below:
Let

φ (2) (7.12)
Csch(νd)d
=    (7.13)
ν a02 + q 2 − 2qνρ a0 − a0 ρ 2

φ (3)
3Csch(ν d)d
=  2  
a0 ν a0 + q 2 − 2qνρ −1 + ρ 2
(7.14)

φ (4)
  
  
3Csch(dν) −4da0 ν −1 + ρ 2 + a02 + q 2 ν 2 − 2qa0 νρ(1 − dνCoth(dν))Csch(dν)
=   3/2  2 
a02 ν 2 a02 + q 2 ν 2 − 2qa0 νρ −1 + ρ 2
(7.15)

φ (5)
 
(−1+dνCoth(dν))Csch(dν)
30Csch(dν)  2
2dν
2 +  3/2
a0 +q 2 ν 2 −2qa0 νρ a0 a02 +q 2 ν 2 −2qa0 νρ (−1+ρ 2 )
=  
a0 ν 2 −1 + ρ 2
(7.16)

and
φ (6)
1
= −  5/2  3 
a ν a + q ν − 2qaνρ
3 2 2 2 2 −1 + ρ 2
   2  
× 15Csch[dν] 24da 2 ν −1 + ρ 2 − 3 a 2 + q 2 ν 2 − 2qaνρ Coth[dν]Csch[dν]2
 
+ dν a 2 + q 2 ν 2 − 2qaνρ (2 + Cosh[2dν])Csch[dν]4
   
+ 18a a 2 + q 2 ν 2 − 2qaνρ −1 + ρ 2 Csch[dν]2 (dνCosh[dν] − Sinh[dν]) (7.17)

8 Probability Distribution, Local and Implied Volatility in Dynamic λ-Sabr Model

Collecting the results in this paper, we see that the probability density function in
the λ-Sabr model is given, at the zero-th order, by

p( f, a, t, K , A, T )
√ *
1 1 C( f, t 1−ρ
1 d
=  (√ ) 2
2π(T − t) 1 − ρ(T )2 C(K , T )ν(T )A2 C(K , t) sinh d
 
2 d( f 0 ,a0 , f,a,t) ∂d( f˜(ρ),ã(ρ), f,a)
− 2(Td −t) − − dρ +log(P ) M R +log(Pt )
×e 0 ∂t

where the distance d in the above formula is given by (7.8). In the case of the λ-Sabr
model, the results in this paper allow us to sharpen the above results and obtain the
expansion for the heat kernel up to the zero-th order in the form

p CEV ( f, a, t, K , A, T )
β  1 *
1−ρ 2
1 1 f 2 d
=  β
2π(T − t) 1 − ρ(T )2 ν(T ) γ(T )K β A2 K 2 sinh d
 
2 d( f 0 ,a0 , f,a,t) ∂d( f˜(ρ),ã(ρ), f,a)
− 2(Td −t) + − dρ +log(P ) M R +log(Pt )
×e 0 ∂t
(8.1)

where $\log(P)_{MR}$ is explicit and defined in (6.34), and the fully explicit $\log(P_t)$ is
given by (6.36). Also, the derivative of the distance function appearing in the exponent
of the exponential can easily be calculated for any specific functional form of the time
dependent parameters $\gamma(t)$, $\nu(t)$, $\rho(t)$ by differentiating (7.8) with respect to $t$.
Using (8.4) we see that the zero-th order local volatility $\sigma_L^{(0)}$, for the family of
stochastic volatility models (2.1), (2.2) (for any local volatility $C(f, t)$), can be
expressed in the form

\[
\sigma_L^{(0)}(f_0, a_0, K, \tau) = C(K,t)\,(a_1)_{\min} = C(K,t)\sqrt{a_0^{2} - 2a_0\nu(t)\rho(t)\,q + q^{2}\nu^{2}(t)}, \qquad q(f_0, K) = \int_K^{f_0} \frac{du}{C(u,t)}
\]

To determine the implied volatility we need to calculate the integral


 f0 1
 du
K C(u, t) a12 + 2q( f 0 , u)a1 νρq( f 0 , u) + q 2 ( f 0 , u)ν 2
Note that this integration is easy to carry out because $-\frac{du}{C(u,t)} = dq$. In particular, in
the λ-Sabr model case, with $C(f,t) = \gamma(t) f^{\beta}$, we get:
  
(0) ξ 1
σBS = log (νq + a0 ρ(t) + a02 − 2qa0 ν(t)ρ(t) + q 2 ν 2 (t)) , (8.2)
ν(t) a0 (1 + ρ(t))
ξ ξ
= (0) |amin = s (8.3)
d dmin

where $\xi$ was defined via (5.6), $d^{s}_{\min}$ is the signed distance to the line $f = K$ given in
(7.11), $q$ was defined after (7.8), and we recall that $a_0$ is the initial value of $a_t$.
This zero-th order result, in the time homogeneous case, where the coefficients $\nu$, $\rho$
are constants and where $\gamma = 1$, agrees with that given by Berestycki et al. and by Henry-
Labordère in his Encyclopedia article [35], but does not agree with the formula in
Hagan et al. [31], a discrepancy already pointed out by Obloj in [46].

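\[
\sigma_L = \sigma_L^{(0)} + \sigma_L^{(1)}\varepsilon + \sigma_L^{(2)}\varepsilon^{2} + \cdots \quad \text{where } \varepsilon = \tau \tag{8.4}
\]
\[
= \sqrt{V_L^{(0)}} + \frac{V_L^{(1)}}{2\sqrt{V_L^{(0)}}}\,\tau + \frac{-(V_L^{(1)})^{2} + 4V_L^{(0)}V_L^{(2)}}{8\,(V_L^{(0)})^{3/2}}\,\tau^{2} + \cdots \tag{8.5}
\]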

where, recalling that c = (a1 )min , given by (7.9), we have for a g


  
γ(t) f β Ĉ(c) + 2cĈ (c) φ (c) − cĈ(c)φ (3) (c)
(1)
σ L ( f, t) = , (8.6)
2cĈ(c)φ (c)2

where recalling the definition of Ĉ from (4.8) we have

Ĉ(c)

1 dmin
= 
a 2 ν(t) γ(t) f β (1 − ρ 2 (t)) sinh(dmin )
β 
1
2
f 2 1−ρ log(P M R )+log(Pt )
× β
e
K2

where the quantities in the exponent of the exponential were defined respectively
in (6.34) and (6.36), and where the derivatives of the function $\phi$ were supplied in
(7.13) and (7.14).
The Black-Scholes implied volatility is now given by (5.12), whose expression we
recall here:

\[
\sigma_{BS}^{(1)} = \frac{\xi}{d_{\min}^{3}}\,\ln\!\left( \frac{\sigma_L^{(0)}(f_0, a_0, K, t)\,u_{0,1}(f_0, K, t)\,d_{\min}(f_0, a_0, K, t)}{\xi\,\sqrt{f_0 K}} \right)
\]

where we recall the definition of $u_{0,1}$, given by (5.14). That quantity, and only that term,
involves $\sigma_L^{(1)}$ and hence involves the non-metric part
of the zero-th order heat kernel. Also, recall that $d_{\min}$ was given above in (8.3).
Moreover, for the second order expansion of local volatility and implied volatility
respectively, we have

\[
\sigma_L^{(2)} = \frac{-4(\sigma_L^{(1)})^{2}(\sigma_L^{(0)})^{2} + 4(\sigma_L^{(0)})^{2}\,V_L^{(2)}}{8\,(\sigma_L^{(0)})^{3}}
\]

where $V_L^{(2)}$ is given by (4.24) in the time-homogeneous case and by (A.1) in the
time inhomogeneous case. The ingredients going into the definition of $V_L^{(2)}$ include
the derivatives up to order three of $\hat C$ at the minimum point, and derivatives of $\phi$
up to order six, which were given in (7.13)–(7.17). Lastly we obtain $\sigma_{BS}^{(2)}$ by using
expression (5.15).
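The passage from the variance-type coefficients $V_L^{(i)}$ to the volatility coefficients $\sigma_L^{(i)}$ in (8.5) is a plain Taylor expansion of a square root, which the sketch below checks numerically. It also confirms that the $\tau^{2}$ coefficient of (8.5) agrees with the formula for $\sigma_L^{(2)}$ written in terms of $\sigma_L^{(0)}$, $\sigma_L^{(1)}$ and $V_L^{(2)}$. The sample values of $V_L^{(0)}, V_L^{(1)}, V_L^{(2)}$ are arbitrary assumptions, and the function name is illustrative.

```python
import math

def sigma_coefficients(V0, V1, V2):
    # Coefficients of sqrt(V0 + V1*tau + V2*tau^2) up to tau^2, as in (8.5).
    s0 = math.sqrt(V0)
    s1 = V1 / (2.0 * s0)
    s2 = (-V1**2 + 4.0 * V0 * V2) / (8.0 * V0**1.5)
    return s0, s1, s2

V0, V1, V2 = 0.04, 0.01, -0.02      # arbitrary sample coefficients
s0, s1, s2 = sigma_coefficients(V0, V1, V2)

tau = 0.01
exact = math.sqrt(V0 + V1 * tau + V2 * tau**2)
approx = s0 + s1 * tau + s2 * tau**2            # truncation error is O(tau^3)

# Same tau^2 coefficient, written with sigma^(0), sigma^(1) and V^(2).
s2_alt = (-4.0 * s1**2 * s0**2 + 4.0 * s0**2 * V2) / (8.0 * s0**3)
print(exact, approx, s2, s2_alt)
```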

[Figure: curves labelled “new order 1”, “MC "exact"” and “Hagan et al”; horizontal axis from 100 to 104, vertical axis from 0.062 to 0.074]

Fig. 1 1 Year; model parameters S0 = 102, a0 = 0.65, ν = 1, T = 1, β = 0.5, ρ = −0.5. The
third column contains the option price calculated using Monte Carlo and the fourth column contains
the (benchmark) numerically inverted implied volatility
[Figure: curves labelled “new sig1”, “sig1 Hag” and “MC sig1”; horizontal axis from 99 to 106, vertical axis from 0.058 to 0.078]

Fig. 2 Half a year (T = 0.5); model parameters S0 = 102, a0 = 0.65, ν = 1, T = 1, β = 0.5, ρ = −0.5

9 Numerics

In this section we compare the accuracy of our first order expansion with that
of Hagan-Kumar-Lesniewski-Woodward [31], who do not provide second order
approximations, on three time horizons. In all of the numerics below, the bench-
mark prices used were obtained from a Monte Carlo simulation with 40 time steps
and 300,000 paths per time step. The standard error associated, calculated by means
of dividing the empirical standard deviation by the square root of the number of
paths, was 0.0032. The dimensionless parameters involved in the empirical work
were large, since we took vol of vol ν to be equal to 1.
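To make the benchmark procedure concrete, the following is a minimal Monte Carlo sketch with 40 time steps that reports the price together with the standard error computed as described above (empirical standard deviation divided by the square root of the number of paths). It is not the authors' implementation: it assumes no mean reversion ($\kappa = 0$), $\gamma = 1$, an Euler step for $f$ floored at zero, an exact lognormal step for $a$, and a far smaller number of paths than the 300,000 quoted in the text; the function name and defaults are hypothetical.

```python
import math
import random

def sabr_call_mc(f0, a0, nu, beta, rho, K, T, n_steps=40, n_paths=2000, seed=0):
    # Monte Carlo call price and standard error for df = f^beta a dW1,
    # da = nu a dW2 with correlation rho (kappa = 0 and gamma = 1 assumed).
    rng = random.Random(seed)
    dt = T / n_steps
    sqdt = math.sqrt(dt)
    payoffs = []
    for _ in range(n_paths):
        f, a = f0, a0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            f = max(f + (f ** beta) * a * sqdt * z1, 0.0)        # Euler step, floored at 0
            a *= math.exp(nu * sqdt * z2 - 0.5 * nu * nu * dt)   # exact lognormal step
        payoffs.append(max(f - K, 0.0))
    mean = sum(payoffs) / n_paths
    var = sum((p - mean) ** 2 for p in payoffs) / (n_paths - 1)
    std_err = math.sqrt(var) / math.sqrt(n_paths)   # empirical std / sqrt(number of paths)
    return mean, std_err

price, std_err = sabr_call_mc(102.0, 0.65, 1.0, 0.5, -0.5, K=102.0, T=1.0)
print(price, std_err)
```

The standard error shrinks like the inverse square root of the number of paths, so matching the accuracy quoted above would require far more paths than this illustration uses.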
What we have found is that our first order approximation is more accurate on
time horizons of 0.5 and 1 year, both in and out of the money. On a 2 year time
horizon, Hagan et al.’s first order approximation is sometimes better than ours out
of the money and tends to be worse in the money. These results are illustrated in the
figures (Figs. 1, 2 and 3).

[Figure: curves labelled “new order 1”, “order 1 Hag” and “order 1 MC”; horizontal axis from 100 to 106, vertical axis from 0.062 to 0.078]

Fig. 3 2 year; model parameters S0 = 102, a0 = 0.65, ν = 1, T = 1, β = 0.5, ρ = −0.5

Acknowledgments We would like to thank Tai-Ho Wang of Baruch College for his assistance in
calculating the higher derivatives of φ in Sect. 7, and for having pointed out to us the validity of the
curvature formula (3.3) in the case q = 1. We would also like to thank Elton Hsu for interesting
discussions.

Appendix: Form of VL(2) in the Time-Inhomogeneous Case

(2)
VL =
1   
− + (c) + cCt(u)(d
C( f, t)2 12cφ (c)2 −2d0 (c)Ct + 1 (u) + u 1 (u))φ (c)
ˆ
24Ct(c) d0 (c) φ (c)
2 2 5
   
+ (c)d0 (c) − d0 (c)Ct
φ (c) −2Ct + (c) − 2Ct(u)(d
+ + (c)φ (3) (c)
1 (u) + u 1 (u))φ (c) + d0 (c)Ct
   
+
Ct(c)φ (c) 12cφ (c)2 2Ct + (c)d0 (c) 2d0 (c) + c(d1 (c) + u 1 (c))φ (c) +
   
+
+ Ct(u)(d (3)
1 (u) + u 1 (u)) φ (c) 4d0 (c) − cd0 (c) + 2c(d1 (c) + u 1 (c))φ (c) + cd0 (c)φ (c)

− d0 (c)φ (c) 48cd0 (c)Ct + (c)φ (c) − 24Ct(u)d
+ +
1 (u)φ (c) − 24Ct(u)u
2
1 (u)φ (c)
2

+ (c)φ (c)2 − 12c2 u 1 (c)Ct


− 12c2 d1 (c)Ct + (c)φ (c)2 + 24cCt(u)d
+ (3)
1 (u)φ (c)φ (c)
+
+ 24cCt(u)u (3) 2+
1 (u)φ (c)φ (c) + 5c Ct(u)d
(3) 2+
1 (u)φ (c) + 5c Ct(u)u
2 (3)
1 (u)φ (c) + 12Ct
2 + (c)
     
(3) (3)
× 4d0 (c) φ (c) − cφ (c) + cφ (c) 4d0 (c) + (d1 (c) + u 1 (c)) 4φ (c) + cφ (c)
 
+
− 3c2 Ct(u)(d (4)
1 (u) + u 1 (u))φ (c)φ (c) − 24d0 (c)
2 +(3) (c) + 2cCt
cφ (c)2 Ct + (c)φ (3) (c)2
   
+ (c)φ (c) φ (c) − 2cφ (3) (c) − Ct
+ Ct + (c)φ (c) 2φ (3) (c) + cφ (4) (c)
   
+ 2 12cφ (c)2 2d0 (c) + c(d1 (c) + u 1 (c))φ (c) d0 (c)φ (c) − d0 (c)φ (3) (c)
+ Ct(c)

+ d0 (c)φ (c) −24d1 (c)φ (c)3 − 24u 1 (c)φ (c)3 − 48cd1 (c)φ (c)3 − 48cu 1 (c)φ (c)3

− 24cφ (c)2 d0(3) (c) + 48d0 (c)φ (c)φ (3) (c) + 24cd1 (c)φ (c)2 φ (3) (c)
+ 24cu 1 (c)φ (c)2 φ (3) (c)− 48cd0 (c)φ (3) (c)2 + 5c2 d1 (c)φ (c)φ (3) (c)2 + 5c2 u 1 (c)φ (c)φ (3) (c)2
    
− 24d0 (c)φ (c) φ (c) − 2cφ (3) (c) − 3cφ (c) −8d0 (c) + c(d1 (c) + u 1 (c))φ (c) φ (4) (c)
  
+ 2d0 (c)2 15cφ (3) (c)3 − φ (c)φ (3) (c) 15φ (3) (c) + 16cφ (4) (c)
 
+ 3φ (c)2 2φ (4) (c) + cφ (5) (c) (A.1)

where all quantities appearing above were already defined in connection with (4.24)
in the time-inhomogeneous case and where

+=
Ct P

and d1 , d2 were defined in (4.25). In the case of the λ-Sabr model all required
derivatives of φ (up to order 6) were supplied in Sect. 7.1.

References

1. Andersen, L.B.G.: Option pricing with quadratic volatility: a revisit. Financ. Stoch. 15(2),
191–219 (2011)
2. Andreasen, J., Andersen, L.B.G.: Jump diffusion pricing: volatility smile fitting and numerical
methods for option pricing. Rev. Deriv. Res. 4, 231–262 (2000)
3. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing volatility. Risk Mag.
15(10), 87–91 (2002)
4. Avramidi, I.: Heat Kernel and Quantum Gravity. Lecture Notes in Physics, New Series,
Monographs, vol. 64. Springer, Berlin (2000)
5. Avramidi, I.: Lectures on Heat Kernel Applications in Finance. New Mexico Technology.
http://infohost.nmt.edu/iavramid/notes/hkt/hkt.html
6. Azencott, R.: Géodésiques et diffusions en temps petit. Séminaire de Probabilités. pp. 84–85
(1991)

7. Bates, D.: Jumps and stochastic volatility: the exchange rate processes implicit in Deutschemark
options. Rev. Financ. Stud. 9, 69–107 (1996)
8. Benhamou, E., Croissant, O.: Local time for the SABR model: connection with the ‘complex’
black scholes and application to CMS and spread options, SSRN preprint (2004)
9. Bender, C.M., Orszag, S.A.: Advanced Mathematical Methods for Scientists and Engineers.
Springer, Berlin (1999)
10. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
11. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
12. Berger, M., Gauduchon, P., Mazet, E.: Le spectre d’une variété riemannienne. Lecture Notes
in Mathematics. Springer, Berlin (1970)
13. Bourgade, P., Croissant, O.: Heat kernel expansion for a family of stochastic volatility models:
δ geometry, SSRN (2005)
14. Carr, P., Jarrow, R.: The stop-loss start-gain paradox and option valuation: a new decomposition
into intrinsic and time value. Rev. Financ. Stud. 3(3), 469–492 (1990)
15. Carr, P., Wu, L.: Stochastic skew for currency options. J. Financ. Econ. 86, 213–247 (2007)
16. Carr, P., Geman, H., Madan, D., Yor, M.: From local volatility to local Lévy models. Quant.
Financ. 4, 581–588 (2004)
17. Chavel, I.: Eigenvalues in Riemannian Geometry. Academic Press, Waltham (1984)
18. Cox, J.: Notes on option pricing I: constant elasticity of diffusions. Unpublished draft. Palo
Alto, CA: Stanford University, September 1975
19. Doust, P.: No arbitrage Sabr, Royal Bank of Scotland working paper (2010)
20. Dupire, B.: Pricing with a smile. Risk 7, 18–20 (1994)
21. Dupire, B.: A unified theory of volatility, discussion paper paribas capital management. In:
Peter, C. (ed.) Reprinted in Derivative Pricing: The Classic Collection. Risk Books (2004)
22. Derman, E., Kani, I.: Riding on a smile. Risk 7, 32–39 (1994)
23. Feng, J., Forde, M., Fouque, J.-P.: Short-maturity asymptotics for a fast mean-reverting Heston
stochastic volatility model. SIAM J. Financ. Math. 1, 126–141 (2010)
24. Forde, M., Jacquier, A.: Small time asymptotics for implied volatility under the Heston model.
Int. J. Theory Appl. Financ. 12, 861–876 (2009)
25. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15(4),
755–780 (2011)
26. Forde, M., Jacquier, A., Mijatović, A.: Asymptotic formulae for implied volatility in the Heston
model. Proc. R. Soc. A 466(2124), 3593–3620 (2010)
27. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3(1), 690–708 (2012)
28. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
29. Hadamard, J.: Lecons sur les equations différentielles. Dover Publications, Sewed (1952)
30. Hagan, P., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6, 147–157
(1999)
31. Hagan, P., Kumar, P.S., Lesniewski, A., Woodward, D.E.: Managing smile risk. Wilmott Mag.,
84–108 (2003)
32. Hagan, P., Lesniewski, A.: Probability distribution in the SABR model of stochastic volatility.
In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large Deviations
and Asymptotic Methods in Finance. Springer Proceedings in Mathematics and Statistics, vol.
110 (2015)
33. Heston, S.: A closed form solution for options with stochastic volatility, with applications to
bond and currency pricing. Rev. Financ. Stud. 6, 327–342 (1993)
34. Hsu, E.: Stochastic analysis on manifolds, Graduate Studies in Mathematics. American
Mathematical Society (2002)
35. Henry-Labordère, P.: SABR Model. Encyclopedia of Quantitative Finance (2010)
36. Hsu, E.: The heat kernel on non-complete manifolds. Indiana Math. J. 39(2), 431 (1990)

37. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 42(2),
282–300 (1987)
38. Kunitomo, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11(1), 117–151 (2001)
39. Henry-Labordère, P.: A general asymptotic implied volatility for stochastic volatility models.
http://arxiv.org/abs/cond-mat/0504317 (2005)
40. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance. Chapman & Hall/CRC
Financial Mathematics Series (2008)
41. Henry-Labordère, P.: Solvable local and stochastic volatility models: supersymmetric methods
in option pricing. Quant. Financ. 7(5), 525–535 (2007)
42. Lewis, A.L.: Option valuation under stochastic volatility: with mathematica code. CA: Finance
Press (2000)
43. Lesniewski, A.: Swaption Smiles via the WKB Method. Seminar Mathematical Finance,
Courant Institute of Mathematical Sciences (February 2002)
44. Minakshisundaram, S., Pleijel, A.: Some properties of the eigenfunctions of the Laplace-operator
on Riemannian manifolds. Can. J. Math. 1, 242–256 (1949)
45. Medvedev, A., Scaillet, O.: Approximation and calibration of short-term implied volatilities
under jump-diffusion stochastic volatility. Rev. Financ. Stud. 20, 427–459 (2007)
46. Obloj, J.: Fine-tuning your smile: correction to Hagan et al. Wilmott Mag. (2008)
47. Molchanov, S.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30, 11–63
(1975)
48. Paulot, L.: Asymptotic implied volatility at the second order with application to the SABR
model. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large
Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics and
Statistics, vol. 110 (2015)
49. Renault, E., Touzi, N.: Option hedging and implied volatilities in a stochastic volatility model.
Math. Financ. 6(3), 279–302 (1996)
50. Stein, E.M., Stein, J.C.: Stock distributions with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4(4), 727–752 (1991)
51. Takahashi, A., Takehara, K., Toda, M.: Computation in an asymptotic expansion method, SSRN
preprint (2009)
52. Varadhan, S.: On the behavior of the fundamental solution of the heat equation with variable
coefficients. Commun. Pure Appl. Math. 20, 431–455 (1967)
53. Varadhan, S.: Diffusion processes in a small time interval. Commun. Pure Appl. Math. 20,
659–685 (1967)
54. Willmore, T.J.: Riemannian Geometry. Oxford Science Publications, Oxford (2002)
55. Yoshida, K.: On the fundamental solution of the parabolic equation in a Riemannian space.
Osaka Math. J. 1(1), 1–52 (1953)
General Asymptotics of Wiener Functionals
and Application to Implied Volatilities

Yasufumi Osajima

Abstract In the present paper, we give an asymptotic expansion of the probability
density for a component of general diffusion models. Our approach is based on infinite
dimensional analysis on the Malliavin calculus and Kusuoka-Stroock’s asymptotic
expansion theory for general Wiener functionals (Kusuoka and Stroock, J. Funct.
Anal. 99:1–74, 1991 [12]). The initial term of the expansion is given by the geodesic
distance and we calculate it by solving Hamilton’s equation. We apply our approach
to obtain asymptotic expansion formulae for implied volatilities in general diffusion
models, e.g. the CEV and SABR models.

Keywords Wiener functional · Stochastic volatility · Hamilton equation · Malliavin
calculus · Asymptotic approximation · SABR model

1 Introduction

There are many applications of asymptotic expansion theory to mathematical finance.
The most popular is the singular perturbation approach. For example, Hagan and
Woodward [6] gave an asymptotic expansion formula for implied volatilities of local
volatility models and Hagan et al. [7] gave a formula for a stochastic volatility
model (SABR model) based on Hagan et al. [8]. Their formula is well-known to
practitioners. Berestycki-Busca-Florent [1] applied non-linear PDE analysis to this
problem. Henry-Labordère [9] applied a heat kernel expansion method and gave an
asymptotic expansion formula for a mean-reverting SABR model.
In this paper, we take an approach based on Malliavin calculus. The theory of
asymptotic expansions of probability densities based on Malliavin calculus was
originated by Bismut [2] and was developed by Watanabe [17] and Kusuoka and Stroock
[11, 12]. Many applications of this theory to finance were given by Yoshida [18],
Takahashi-Kunitomo [10] and Siopacha and Teichmann [16]. In [14], we gave an
asymptotic expansion for implied volatilities of SABR model with time-dependent
coefficients. Deuschel et al. [3, 4] gave density expansions for multi dimensional
hypoelliptic diffusions $(X^1, \dots, X^d)$ at fixed time $T$ and projected to their first $l$
coordinates. They applied their results to short time and tail asymptotics of implied
volatilities for some stochastic volatility models.

Y. Osajima (B)
BNP Paribas, Fixed Income Research and Strategies, 10 Harewood Avenue,
London NW1 6AA, UK
e-mail: [email protected]

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_5

In this paper, we apply the methods of Kusuoka-Stroock [12] to the asymptotic
expansion for implied volatilities of call options. The key theorem is given in [13]
and also summarized in the Appendix. Finally we give explicit analytic formulae in
general diffusion models.
Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\{W^1(t), \dots, W^d(t);\ t \in [0,T]\}$ be
a $d$-dimensional Brownian motion. Let $V_0, \dots, V_d \in C_b^\infty([0,T] \times \mathbb{R}^N; \mathbb{R}^N)$. Here
$C_b^\infty([0,T] \times \mathbb{R}^N; \mathbb{R}^N)$ denotes the space of $\mathbb{R}^N$-valued smooth functions defined in
$[0,T] \times \mathbb{R}^N$ whose derivatives of any order are bounded.
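Now let $X_\varepsilon(t)$, $t \in [0,T]$, $\varepsilon \in (0,1]$, be the solution to the following stochastic
differential equation:

$$
\begin{aligned}
dX_\varepsilon^i(t) &= \varepsilon \sum_{k=1}^{d} V_k^i(t, X_\varepsilon(t))\, dW^k(t) + V_0^i(t, X_\varepsilon(t))\, dt, \quad 1 \le i \le N, \\
X_\varepsilon(0) &= x_0 = (x_0^1, \dots, x_0^N), \quad x_0 \in \mathbb{R}^N.
\end{aligned}
\tag{1}
$$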

In view of financial applications, cf. below, we assume

$$V_0^1 \equiv 0, \tag{2}$$

and the ellipticity of $V_1, \dots, V_d$ at $x_0$, i.e. there exists a constant $\delta > 0$ such that

$$\sum_{k=1}^{d} V_k(0, x_0) \otimes V_k(0, x_0) \ge \delta I, \tag{3}$$

where $I$ denotes the identity matrix. Then there exists a unique solution to (1).
Moreover, we assume that $X_\varepsilon(t)$ is continuous in $t$ with probability one.
We investigate the distribution of $X_\varepsilon^1(T)$. From the ellipticity condition (3), the
law of $X_\varepsilon^1(T)$, denoted by $\nu_\varepsilon$, is absolutely continuous and has a smooth density
$p_\varepsilon(y)$. Let $H$ be the Cameron-Martin space of the $d$-dimensional Wiener space. We
consider the associated ordinary differential equation:

$$
\begin{aligned}
\frac{d}{dt} y^i(t; h) &= \sum_{k=1}^{d} V_k^i(t, y(t; h))\, \dot h^k(t) + V_0^i(t, y(t; h)), \quad t \in [0,T],\ h \in H, \\
y(0; h) &= x_0, \quad x_0 \in \mathbb{R}^N.
\end{aligned}
\tag{4}
$$

We define the energy function $e : \mathbb{R} \to \mathbb{R}$ by

$$
e(y) = \inf\left\{ \frac{1}{2} \sum_{i=1}^{d} \int_0^T |\dot h^i(s)|^2\, ds \;;\; h \in H,\ y^1(T; h) = y \right\}.
\tag{5}
$$

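Since $V_0^1 \equiv 0$, this energy function satisfies $e(x_0^1) = 0$. Let us define a flow
$\phi : [0,T] \times \mathbb{R}^N \to \mathbb{R}^N$ by

$$
\frac{d}{dt}\phi(t, x) = V_0(t, \phi(t, x)), \quad t \in [0,T],\ x \in \mathbb{R}^N, \qquad \phi(0, x) = x.
\tag{6}
$$

Then the map $\phi(t, \cdot) : \mathbb{R}^N \to \mathbb{R}^N$, $t \in [0,T]$, is a diffeomorphism denoted by $\phi_t$.
Note that $\phi_t^1(x) = x^1$. We define

$$
\tilde V_k^i(t, y) = \sum_{j=1}^{N} \frac{\partial \phi^i}{\partial x^j}(-t, \phi(t, y))\, V_k^j(t, \phi(t, y)), \quad 1 \le i \le N,\ 1 \le k \le d,
\tag{7}
$$

which is the push-forward of the vector field $V$ by the map $\phi_t$. Let us define
$(g^{ij})_{1 \le i,j \le N} : [0,T] \times \mathbb{R}^N \to \mathbb{R}$ by

$$
g^{ij}(t, x) = \sum_{k=1}^{d} \tilde V_k^i(t, x)\, \tilde V_k^j(t, x), \quad 1 \le i, j \le N.
$$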

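From (3), the matrix $(g^{ij})_{1 \le i,j \le N}$ is positive definite and corresponds to a Riemannian
metric on $\mathbb{R}^N$. We define the generating operator $L_t$, $t \in [0,T]$, by

$$
(L_t f)(x) = \frac{1}{2} \sum_{i,j=1}^{N} g^{ij}(t, x) \frac{\partial^2 f}{\partial x^i \partial x^j}(x) + \sum_{i=1}^{N} \tilde V_0^i(t, x) \frac{\partial f}{\partial x^i}(x),
\quad f \in C_b^\infty(\mathbb{R}^N),\ x \in \mathbb{R}^N,\ t \in [0,T],
\tag{8}
$$

where $\tilde V_0^i \in C_b^\infty([0,T] \times \mathbb{R}^N; \mathbb{R}^N)$ is given by

$$
\tilde V_0^i(t, y) = \frac{1}{2} \sum_{k,l=1}^{N} \sum_{m=1}^{d} \frac{\partial^2 \phi^i}{\partial x^k \partial x^l}(-t, \phi(t, y))\, V_m^k(t, \phi(t, y))\, V_m^l(t, \phi(t, y)), \quad 1 \le i \le N.
\tag{9}
$$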

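Let us define linear operators $\mathcal{V} : C_b^\infty([0,T] \times \mathbb{R}^N) \to C_b^\infty([0,T] \times \mathbb{R}^N)$ and
$\Gamma : C_b^\infty([0,T] \times \mathbb{R}^N) \otimes C_b^\infty([0,T] \times \mathbb{R}^N) \to C_b^\infty(\mathbb{R}^N)$ by

$$
(\mathcal{V} f)(t, x) \equiv \sum_{i=1}^{N} g^{1i}(t, x) \int_t^T \frac{\partial f}{\partial x^i}(s, x)\, ds,
\tag{10}
$$

$$
\Gamma(f, g)(x) \equiv \sum_{i,j=1}^{N} g^{ij}(t, x) \left( \int_t^T \frac{\partial f}{\partial x^i}(s, x)\, ds \right) \left( \int_t^T \frac{\partial g}{\partial x^j}(s, x)\, ds \right) dt.
\tag{11}
$$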

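Our main result is:

Theorem 1 There is a constant $r_0 > 0$ satisfying the following (1) and (2).
(1) The energy function $e \in C^2([x_0^1 - r_0, x_0^1 + r_0])$ and there is a constant $C_0 > 0$
such that the asymptotic expansion of the energy $e$ satisfies

$$
\left| e(y) - \left\{ \frac{1}{2 b_1}(y - x_0^1)^2 - \frac{b_2}{3 b_1^3}(y - x_0^1)^3 + \left( -\frac{b_3}{4 b_1^4} + \frac{b_2^2}{2 b_1^5} \right)(y - x_0^1)^4 \right\} \right|
\le C_0 |y - x_0^1|^5, \quad y \in [x_0^1 - r_0, x_0^1 + r_0],
\tag{12}
$$

where

$$
\begin{aligned}
b_1 &= \int_0^T g^{11}(t, x_0)\, dt, \qquad b_2 = \frac{3}{2} \int_0^T (\mathcal{V} g^{11})(t, x_0)\, dt, \\
b_3 &= 2 \int_0^T (\mathcal{V}^2 g^{11})(t, x_0)\, dt + \frac{1}{2} \int_0^T \Gamma(g^{11}, g^{11})(x_0).
\end{aligned}
\tag{13}
$$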

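(2) There are constants $C_1, C_2 > 0$ such that the probability density $p_\varepsilon(y)$ satisfies
the following:

$$
\left| (2\pi\varepsilon^2)^{1/2} \exp\left( \frac{e(y)}{\varepsilon^2} \right) p_\varepsilon(y) - a_0(y) - \varepsilon^2 a_2(y) \right| \le \varepsilon^4 C_1, \quad y \in [x_0^1 - r_0, x_0^1 + r_0].
\tag{14}
$$

Here, $a_0$ and $a_2$ are continuous functions which satisfy

$$
\left| a_0(y) - \left( \frac{\partial^2 e(y)}{\partial y^2} \right)^{1/2} \exp\left( \frac{L (y - x_0^1)^2}{2 b_1^2} \right) \right| \le C_2 |y - x_0^1|^3, \quad y \in [x_0^1 - r_0, x_0^1 + r_0],
\tag{15}
$$

and

$$
a_2(x_0^1) = \frac{1}{\sqrt{b_1}} \left( -\frac{L}{2 b_1} - \frac{5}{6} \frac{b_2^2}{b_1^3} + \frac{3}{4} \frac{b_3}{b_1^2} \right),
\tag{16}
$$

where

$$
L = \iint_{0 < u < t < T} L_u\big(g^{11}(t, \cdot)\big)(x_0)\, du\, dt.
\tag{17}
$$
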

Remark 1 We can restate our results (14) as the heat kernel expansion:

$$
p_\varepsilon(y) \sim \frac{1}{\varepsilon (2\pi)^{1/2}}\, e^{-e(y)/\varepsilon^2} \left( a_0(y) + \varepsilon^2 a_2(y) + O(\varepsilon^4) \right).
$$
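
As a sanity check of this expansion, the following sketch considers the simplest possible case, which is our own illustrative choice and not an example from the text: $N = d = 1$, $V_1 \equiv 1$, $V_0 \equiv 0$, so that $X_\varepsilon(T) = x_0 + \varepsilon W(T)$. Under these assumptions $e(y) = (y - x_0)^2/(2T)$, $b_1 = T$, $b_2 = b_3 = 0$ and $L = 0$, so (15) and (16) suggest $a_0 = T^{-1/2}$ and $a_2 = 0$. The function names are hypothetical.

```python
import math

def energy(y, x0, T):
    # e(y) for dX = eps dW on [0, T]: the minimising path is a straight line.
    return (y - x0) ** 2 / (2.0 * T)

def density(y, x0, eps, T):
    # Exact Gaussian density of X_eps(T) = x0 + eps * W(T).
    var = eps * eps * T
    return math.exp(-(y - x0) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def rescaled_density(y, x0, eps, T):
    # Left-hand quantity of (14): (2 pi eps^2)^(1/2) exp(e(y)/eps^2) p_eps(y).
    pref = math.sqrt(2.0 * math.pi * eps * eps)
    return pref * math.exp(energy(y, x0, T) / eps ** 2) * density(y, x0, eps, T)

if __name__ == "__main__":
    x0, T = 1.0, 2.0
    for eps in (0.5, 0.2, 0.1):
        # Both printed values should agree: a0 = T^(-1/2), with no eps^2 correction.
        print(eps, rescaled_density(1.3, x0, eps, T), 1.0 / math.sqrt(T))
```

In this degenerate case the rescaled density is constant in $\varepsilon$ and $y$, which is consistent with $a_2 = 0$; a model with state-dependent coefficients would be needed to see a non-trivial correction term.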

Next, we apply our results to the asymptotic expansion of call option values and
their implied volatilities. We regard $X_\varepsilon^1$ as the underlying of these options. Then the
forward value of a call option of strike rate $K$ and maturity $T$ is given by

$$
C_\varepsilon(T, K) = E[(X_\varepsilon^1(T) - K)^+], \quad \varepsilon \in (0,1],\ K > 0.
$$

We define smooth functions $\varphi_n \in C_b^\infty([0, \infty))$, $n \ge 0$, by

$$
\varphi_n(x) = \int_0^\infty z^n \exp\left( -x z - \frac{z^2}{2} \right) dz, \quad x \ge 0.
\tag{18}
$$
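
The functions $\varphi_n$ in (18) can be evaluated without quadrature. The sketch below is an illustration under our own assumptions, not the text's Lemma 7: it uses the closed form $\varphi_0(x) = \sqrt{\pi/2}\, e^{x^2/2} \operatorname{erfc}(x/\sqrt{2})$ together with the relations $\varphi_1 = 1 - x\varphi_0$ and $\varphi_{n+1} = n\varphi_{n-1} - x\varphi_n$ ($n \ge 1$), which follow from integrating (18) by parts. A crude trapezoidal rule is included only as a cross-check; all function names are hypothetical.

```python
import math

def phi(n, x):
    """phi_n(x) = int_0^inf z^n exp(-x z - z^2 / 2) dz, by forward recursion.

    Note: the recursion suffers from cancellation for large x, and exp(x^2/2)
    overflows for x beyond roughly 38, so this is meant for moderate x only.
    """
    phi_prev = math.sqrt(math.pi / 2.0) * math.exp(0.5 * x * x) * math.erfc(x / math.sqrt(2.0))
    if n == 0:
        return phi_prev
    phi_curr = 1.0 - x * phi_prev              # phi_1
    for k in range(1, n):
        # phi_{k+1} = k * phi_{k-1} - x * phi_k
        phi_prev, phi_curr = phi_curr, k * phi_prev - x * phi_curr
    return phi_curr

def phi_quad(n, x, upper=20.0, steps=20000):
    """Trapezoidal approximation of (18), used only to cross-check phi."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * z ** n * math.exp(-x * z - 0.5 * z * z)
    return total * h

if __name__ == "__main__":
    for n in range(4):
        print(n, phi(n, 0.7), phi_quad(n, 0.7))
```

At $x = 0$ the recursion reproduces the elementary values $\varphi_1(0) = 1$, $\varphi_2(0) = \sqrt{\pi/2}$ and $\varphi_3(0) = 2$.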

Some properties of $\varphi_n$ are given in Lemma 7. By (12), we can define a
function $q \in C^2([x_0^1 - r_0, x_0^1 + r_0]; \mathbb{R}_+)$ such that

$$
e(x) = \frac{1}{2} \left( \int_{x_0^1}^{x} \frac{dy}{q(y)} \right)^2, \quad x \in [x_0^1 - r_0, x_0^1 + r_0].
\tag{19}
$$

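Then the asymptotic expansion of call option values is given by the following.
Theorem 2 There are constants $K_0 < K_1$ and $C_1$ such that the value of the call
option with strike rate $K$ and maturity $T$ satisfies

$$
\left| \sqrt{2\pi}\, \exp\left( \frac{e(K)}{\varepsilon^2} \right) C_\varepsilon(T, K) - \varepsilon\, a_0(K)\, q(K)^2\, \varphi_1\!\left( \frac{\sqrt{2 e(K)}}{\varepsilon} \right) R_2(\varepsilon, K) \right| \le C_1 \varepsilon^4,
\quad \varepsilon \in (0,1],\ K \in [K_0, K_1],
$$

where

$$
\begin{aligned}
R_2(\varepsilon, K) ={}& \varepsilon\, q(K) \left( \frac{a_0'(K)}{a_0(K)} + \frac{3}{2} \frac{q'(K)}{q(K)} \right) \frac{\varphi_2(\sqrt{2 e(K)}/\varepsilon)}{\varphi_1(\sqrt{2 e(K)}/\varepsilon)} \\
&+ \varepsilon^2 q(K)^2 \left( \frac{1}{2} \frac{a_0''(K)}{a_0(K)} + 2\, \frac{a_0'(K)}{a_0(K)} \frac{q'(K)}{q(K)} + \frac{7}{6} \left( \frac{q'(K)}{q(K)} \right)^2 + \frac{2}{3} \frac{q''(K)}{q(K)} \right) \frac{\varphi_3(\sqrt{2 e(K)}/\varepsilon)}{\varphi_1(\sqrt{2 e(K)}/\varepsilon)} \\
&+ \varepsilon^2 \frac{a_2(K)}{a_0(K)}.
\end{aligned}
\tag{20}
$$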

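Next we calculate the asymptotic expansion of implied volatilities of call options.
Let us define $f \in C^\infty(\mathbb{R}_+; \mathbb{R}_+)$ by

$$
f(x) = \frac{1}{x} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) \varphi_1(x), \quad x > 0.
\tag{21}
$$

We can easily check that $f$ is strictly decreasing and

$$
f(0+) = \infty, \qquad f(\infty) = 0.
$$

Therefore the inverse function $f^{-1} : \mathbb{R}_+ \to \mathbb{R}_+$ is well defined. When we consider
the following normal model:

$$
d\tilde X(t) = \sigma\, d\tilde W(t), \qquad \tilde X(0) = x_0^1,
$$

the value of the call option with strike rate $K$ and maturity $T$ is given by

$$
C_N(T, K) = \frac{1}{\sqrt{2\pi\sigma^2 T}} \int_{-\infty}^{\infty} (z + x_0^1 - K)^+ \exp\left( -\frac{z^2}{2\sigma^2 T} \right) dz = (K - x_0^1) \cdot f\!\left( \frac{K - x_0^1}{\sigma\sqrt{T}} \right).
$$

Therefore the implied normal volatility can be written as

$$
\sigma_N^\varepsilon(T, K) = \frac{K - x_0^1}{f^{-1}\big( C_\varepsilon(T, K)/(K - x_0^1) \big) \sqrt{T}}, \quad K > x_0^1.
$$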
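
The inversion of $f$ can be carried out numerically. The sketch below is only an illustration under stated assumptions: it uses the identity $e^{-x^2/2}\varphi_1(x)/\sqrt{2\pi} = n(x) - x\,(1 - \Phi(x))$, where $n$ and $\Phi$ denote the standard normal density and distribution function, and it inverts $f$ by plain bisection (our choice, not a method from the text), relying on the stated monotonicity of $f$. The function names and the bracketing interval are hypothetical.

```python
import math

def f(x):
    # f(x) = exp(-x^2/2) phi_1(x) / (x sqrt(2 pi)) = (n(x) - x (1 - Phi(x))) / x, x > 0.
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))
    return (pdf - x * tail) / x

def f_inverse(target, lo=1e-12, hi=40.0, iters=200):
    # Bisection: f is strictly decreasing, so f(mid) > target means the root lies to the right.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def normal_call(x0, K, sigma, T):
    # Call value in the normal model, valid for K > x0.
    return (K - x0) * f((K - x0) / (sigma * math.sqrt(T)))

def implied_normal_vol(price, x0, K, T):
    # Implied normal volatility for K > x0.
    return (K - x0) / (f_inverse(price / (K - x0)) * math.sqrt(T))

if __name__ == "__main__":
    price = normal_call(100.0, 103.0, 5.0, 2.0)
    # Round trip: the recovered volatility should be close to the input 5.0.
    print(price, implied_normal_vol(price, 100.0, 103.0, 2.0))
```

For strikes far from $x_0^1$ the difference $n(x) - x(1 - \Phi(x))$ loses accuracy through cancellation, so a production implementation would likely need an asymptotic form of $f$ in that regime.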

The asymptotic expansion of the implied normal volatility is given by the
following.
Theorem 3 The asymptotic expansion of the implied normal volatility is given by

$$
\left| \sigma_N^\varepsilon(T, K) \left( \frac{\varepsilon |K - x_0^1|}{\sqrt{2 e(K) T}} \right)^{-1} - \exp(J) \right| \le C (\varepsilon + |K - x_0^1|)^3, \quad K \in [x_0^1, K_1],
\tag{22}
$$

where
√ √
|K − x01 |2  L 1 b22 1 b3  2e(K ) ε2  L 5 3 b3 
b22 2e(K )
J = + − ϕ 1 + − − + ϕ 1
b12 2 6 b2 4 b1 ε b1 2 6b12 4 b1 ε
1
√ √
ε |K − x01 |  2 b22 3 b3  2e(K ) 2
ε L b 2
b3  2e(K )
+ √ L+ − ϕ 2 + + 22 − ϕ3 .
b1 b1 3 b2 4 b1 ε b1 2 2b1 2b1 ε
1

Remark 2 Since we can give the same formula for put options, Theorem 3 still
holds in the case $K < x_0^1$. The implied volatility for a put option of strike rate $K$ and
maturity $T$ is the same as the implied volatility for a call option with the same strike
rate and maturity due to the put-call parity. See Appendix 3 for the details.

2 Hamilton Equation and the Energy of Path

In this section, we investigate the correspondence between the Hamilton equation
and the energy of path defined by (5). Without loss of generality, we can assume
$T = 1$. Let $H$ be a separable real Hilbert space defined by

$$
H = \left\{ h \in C_0([0,1]; \mathbb{R}^d) : h \text{ is absolutely continuous and } \sum_{i=1}^{d} \int_0^1 |\dot h^i(t)|^2\, dt < \infty \right\}.
$$

The inner product is given by

$$
(h, k)_H = \sum_{i=1}^{d} \int_0^1 \dot h^i(t)\, \dot k^i(t)\, dt.
$$

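This Hilbert space $H$ is called the Cameron-Martin space.
Let $y(t; h)$, $t \in [0,1]$, $h \in H$, be the solution to the ordinary differential
equation:

$$
\begin{aligned}
\frac{d}{dt} y^i(t; h) &= \sum_{k=1}^{d} \tilde V_k^i(t, y(t; h))\, \dot h^k(t), \quad 1 \le i \le N,\ t \in [0,1], \\
y(0; h) &= x_0, \quad x_0 \in \mathbb{R}^N.
\end{aligned}
$$

Let $(g^{ij})_{1 \le i,j \le N} : [0,1] \times \mathbb{R}^N \to \mathbb{R}$ be given by

$$
g^{ij}(t, x) = \sum_{k=1}^{d} \tilde V_k^i(t, x)\, \tilde V_k^j(t, x).
$$

We define the Hamiltonian $\mathcal{H} : [0,1] \times \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}$ by

$$
\mathcal{H}(t, x, p) = \frac{1}{2} \sum_{i,j=1}^{N} g^{ij}(t, x)\, p_i\, p_j.
\tag{23}
$$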

Then the correspondence between the Hamilton equation and the energy of path is given
by the following.
Proposition 1 Let $J_j^i : [0,1] \times H \to \mathbb{R}$ be the solution to the following ordinary
differential equation:

$$
\begin{aligned}
\frac{d}{dt} J_j^i(t; h) &= \sum_{k=1}^{d} \sum_{r=1}^{N} \frac{\partial \tilde V_k^i}{\partial x^r}(t, y(t; h))\, J_j^r(t; h)\, \dot h^k(t), \\
J_j^i(0; h) &= \delta_j^i, \quad 1 \le i, j \le N,
\end{aligned}
$$

where $\delta_j^i$ is Kronecker’s delta. Let $\bar J(t; h) = J^{-1}(t; h)$. We assume there is $h_0 \in H$
and $\lambda \in \mathbb{R}^N$ such that¹

$$
h_0 = \sum_{k=1}^{N} \lambda_k\, D y^k(1; h_0).
\tag{24}
$$

We define $x, p \in C^\infty([0,1]; \mathbb{R}^N)$ by

$$
x(t) = y(t; h_0), \qquad p_i(t) = \sum_{j,k=1}^{N} \bar J_i^j(t; h_0)\, J_j^k(1; h_0)\, \lambda_k.
\tag{25}
$$

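Then $(x, p)$ satisfies the Hamilton equation:

$$
\begin{aligned}
\frac{d}{dt} x^i(t) &= \frac{\partial}{\partial p_i} \mathcal{H}(t, x(t), p(t)), \\
\frac{d}{dt} p_i(t) &= -\frac{\partial}{\partial x^i} \mathcal{H}(t, x(t), p(t)), \quad 0 \le t \le 1,\ 1 \le i \le N, \\
x(0) &= x_0, \quad x_0 \in \mathbb{R}^N.
\end{aligned}
\tag{26}
$$

Furthermore, we have $\lambda = p(1)$ and

$$
\begin{aligned}
\frac{d}{dt} h_0^k(t) &= \sum_{i=1}^{N} p_i(t)\, \tilde V_k^i(t, x(t)), \quad 0 \le t \le 1,\ 1 \le k \le d, \\
\| h_0 \|_H^2 &= \sum_{i,j=1}^{N} \int_0^1 g^{ij}(t, x(t))\, p_i(t)\, p_j(t)\, dt.
\end{aligned}
\tag{27}
$$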

¹ We define $D y(\cdot; h)[k] = \frac{d}{d\varepsilon}\, y(\cdot; h + \varepsilon k)\big|_{\varepsilon = 0}$.

Proof We note that $\bar J_j^i : [0,1] \times H \to \mathbb{R}$ satisfies the following ordinary differential
equation:

$$
\begin{aligned}
\frac{d}{dt} \bar J_j^i(t; h) &= -\sum_{k=1}^{d} \sum_{r=1}^{N} \frac{\partial \tilde V_k^r}{\partial x^j}(t, y(t; h))\, \bar J_r^i(t; h)\, \dot h^k(t), \\
\bar J_j^i(0; h) &= \delta_j^i, \quad 1 \le i, j \le N.
\end{aligned}
$$

From Proposition 6.6 in Shigekawa [15], we have

$$
D y^i(1; h)[k] = \sum_{l=1}^{d} \sum_{r,j=1}^{N} J_r^i(1; h) \int_0^1 \bar J_j^r(t; h)\, \tilde V_l^j(t, y(t; h))\, \dot k^l(t)\, dt, \quad 1 \le i \le N.
\tag{28}
$$

From (25), it is easy to see $\lambda = p(1)$. Since $h_0 = \sum_{i=1}^{N} \lambda_i\, D y^i(1; h_0)$, we see that

$$
(h_0, k)_H = \sum_{i=1}^{N} \sum_{l=1}^{d} \int_0^1 p_i(t)\, \tilde V_l^i(t, y(t; h_0))\, \dot k^l(t)\, dt.
$$

Therefore we have (27). We can check that (x(t), p(t)), 0 ≤ t ≤ 1, satisfies (26) as
follows:

d i  d  N
x (t) = Ṽki (t, x(t))ḣ k0 (t) = g ij (t, x(t)) p j (t),
dt
k=1 j=1

d 
d 
N
∂ Ṽk
j 
N
∂g jr
pi (t) = − (t, x(t)) p j (t)ḣ k0 (t) = − (t, x(t)) p j (t) pr (t).
dt ∂ xi ∂ xi
k=1 j,r =1 j,r =1

Remark 3 We comment on condition (24). We define an energy function
$E : \mathbb{R}^N \to \mathbb{R}$ as

$$
E(y) = \inf\left\{ \frac{1}{2} \| h \|_H^2 \;;\; h \in H,\ y(1; h) = y \right\},
$$

and let $h_0 \in H$ be the minimizer of the energy function. Then we can apply
Lagrange’s method and there is a $\lambda \in \mathbb{R}^N$ such that

$$
h_0 = \sum_{k=1}^{N} \lambda_k\, D y^k(1; h_0),
$$

which is the condition (24). In particular, the condition (29) in the next proposition
corresponds to the energy function (5).

Let us define the following notation:

$$
f \underset{k}{\sim} g \;\overset{\text{def}}{\iff}\; \lim_{w \downarrow 0} \frac{f(w) - g(w)}{w^k} = 0, \quad k \ge 0,\ f, g \in C([0,1]).
$$

In the following case, we obtain the asymptotic solutions.


Proposition 2 Let $x(t; w)$, $p(t; w)$ be the solution to the Hamilton equation (26) with

$$
\lambda_i = \begin{cases} w & (i = 1), \\ 0 & (2 \le i \le N), \end{cases} \qquad w \in \mathbb{R},
\tag{29}
$$

under the boundary condition $x(0) = x_0$, $p(1) = \lambda$. Then the asymptotic expansion
of $x^1(1; w)$ is given as follows:

$$
x^1(1; w) \underset{3}{\sim} x_0^1 + b_1 w + b_2 w^2 + b_3 w^3,
\tag{30}
$$

where $b_1$, $b_2$, $b_3$ are defined by (13).
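
To illustrate the proposition, the sketch below solves the Hamilton equation numerically in a one-dimensional, time-homogeneous example that is our own assumption and does not appear in the text: $N = 1$, $T = 1$, $g^{11}(x) = x$ and $x_0 = 1$. For this choice the equations $\dot x = g(x)p$, $\dot p = -\tfrac12 g'(x)p^2$ can be solved by hand, giving $x(1; w) = (1 - w/2)^{-2}$, and evaluating (13) by hand gives $b_1 = 1$, $b_2 = 3/4$, $b_3 = 1/2$; these values are our computation, offered as a check rather than taken from the source. The two-point boundary condition $x(0) = x_0$, $p(1) = w$ is handled by shooting on $p(0)$ with a secant iteration, and all function names are hypothetical.

```python
def rk4_flow(x0, p0, g, dg, steps=400):
    """Integrate dx/dt = g(x) p, dp/dt = -g'(x) p^2 / 2 from t = 0 to t = 1 (RK4)."""
    def rhs(x, p):
        return g(x) * p, -0.5 * dg(x) * p * p

    h = 1.0 / steps
    x, p = x0, p0
    for _ in range(steps):
        k1 = rhs(x, p)
        k2 = rhs(x + 0.5 * h * k1[0], p + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], p + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], p + h * k3[1])
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        p += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, p

def endpoint(x0, w, g, dg, tol=1e-13):
    """Shoot on p(0) with a secant iteration until p(1) = w, then return x(1; w)."""
    a, b = w, 1.1 * w
    fa = rk4_flow(x0, a, g, dg)[1] - w
    fb = rk4_flow(x0, b, g, dg)[1] - w
    for _ in range(50):
        if abs(fb) < tol or fb == fa:
            break
        a, b, fa = b, b - fb * (b - a) / (fb - fa), fb
        fb = rk4_flow(x0, b, g, dg)[1] - w
    return rk4_flow(x0, b, g, dg)[0]

def metric(x):
    return x          # illustrative choice g^{11}(x) = x

def metric_prime(x):
    return 1.0

if __name__ == "__main__":
    x0, w = 1.0, 0.1
    numeric = endpoint(x0, w, metric, metric_prime)
    exact = (1.0 - w / 2.0) ** -2                      # hand-derived closed form
    series = x0 + 1.0 * w + 0.75 * w**2 + 0.5 * w**3   # b1, b2, b3 from (13)
    print(numeric, exact, series)
```

The gap between the numerical endpoint and the cubic series should shrink like $w^4$ as $w \to 0$, which is what the relation $\underset{3}{\sim}$ in (30) asserts.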

Proof The solution can be written as

$$
x^i(t; w) = x_0^i + \sum_{j=1}^{N} \int_0^t g^{ij}(s, x(s; w))\, p_j(s; w)\, ds,
\tag{31}
$$

$$
p_i(t; w) = p_i(1; w) + \frac{1}{2} \sum_{j,r=1}^{N} \int_t^1 \frac{\partial g^{jr}}{\partial x^i}(s, x(s; w))\, p_j(s; w)\, p_r(s; w)\, ds.
\tag{32}
$$

We calculate the asymptotic expansion inductively. Since $x(t; 0) = x_0$, $p(t; 0) = 0$,
we have

$$
x(t; w) \underset{0}{\sim} x_0, \qquad p(t; w) \underset{0}{\sim} 0.
\tag{33}
$$

Since the integral term in (32) is of the second order in $w$ and from the boundary
condition (29), we have the first order expansion of $p$:

$$
p_i(t; w) \underset{1}{\sim} p_i(1; w) = \begin{cases} w & (i = 1), \\ 0 & (2 \le i \le N). \end{cases}
\tag{34}
$$

Substituting (34) into (31), we have the first order expansion of $x$:

$$
x^i(t; w) \underset{1}{\sim} x_0^i + \left( \int_0^t g^{i1}(s, x_0)\, ds \right) w.
\tag{35}
$$

Substituting (34) into (32), we have the second order expansion of $p$:

$$
\begin{aligned}
p_i(t; w) &\underset{2}{\sim} p_i(1; w) + \frac{1}{2} \sum_{j,r=1}^{N} \left( \int_t^1 \frac{\partial g^{jr}}{\partial x^i}(s, x(s; w))\, ds \right) p_j(1; w)\, p_r(1; w) \\
&\underset{2}{\sim} p_i(1; w) + \frac{1}{2} \left( \int_t^1 \frac{\partial g^{11}}{\partial x^i}(s, x_0)\, ds \right) w^2.
\end{aligned}
$$

We substitute (35) for (31). Then we have the second order expansion of x:

N   
 t 1  1 ∂g 11  
x (t; w) ∼
i
x0i + g (s, x(s; w)) p j (1; w) +
ij
(r, x 0 )dr w2 ds
2 2 s ∂x j
j=1 0
 t  N  t 
 s ∂g i1
∼ x0i + g i1 (s, x0 )ds w + (s, x0 )g j1 (u, x0 )duds
2 0 0 0 ∂x j
j=1
 t 
1 1 ∂g 11
+ g ij (s, x0 ) (u, x 0 ) du ds w2 .
2 0 s ∂x j

From the second order expansion of p and the first order expansion of x, we have
third order expansion of p:

1  1 ∂g 11 
pi (t; w) ∼ pi (1; w) + (s, x 0 )ds w2
3 2 t ∂ xi
N   1 ∂g 11
1  1 ∂g j1 
+ (s, x 0 ) (u, x 0 )du ds
2 ∂ xi s ∂x
j
j=1 t
 1 2 11  s  
∂ g
+ (s, x 0 ) g j1
(u, x 0 )du ds w3 .
t ∂x ∂x
i j
0

Finally we have the following third order expansion of x:


 t 
x (t; w) ∼
i
x0i + g i1 (s, x0 )ds w
3 0
N  t 
 ∂g i1 s
+ (s, x0 )g j1 (u, x0 )duds
0 0 ∂x j
j=1
  
1 t 1 ij ∂g 11
+ g (s, x0 ) (u, x0 )duds w2
2 0 s ∂x j

 1 t
N  
 1 ∂g k1  1 ∂g 11  
+ g ij (s, x0 ) (u, x 0 ) (r, x0 )dr du ds
s ∂x u ∂x
2 0 j k
j,k=1
  1 ∂ 2 g 11  u  
1 t ij
+ g (s, x0 ) (u, x 0 ) g k1
(r, x 0 )dr du ds
s ∂x ∂x
2 0 j k
0
  1 ∂g 11  s 
1 t ∂g ij
+ (s, x 0 ) (u, x 0 )du g k1
(r, x 0 )dr ds
2 0 ∂xk s ∂x
j
0
  s  1 ∂g 11  
1 t ∂g i1
+ (s, x 0 ) g jk
(u, x 0 ) (r, x0 )dr du ds
2 0 ∂x j
0 u ∂x
k
 t i1   s   u  
∂g ∂g j1
+ (s, x 0 ) (u, x 0 ) g k1
(r, x 0 )dr du ds
0 ∂x 0 ∂x
j k
0
  s  s 
1 t ∂ 2 g i1
+ (s, x 0 ) g j1
(u, x 0 )du g k1
(r, x 0 )dr ds w3 .
2 0 ∂x j ∂xk 0 0

From the definition of the linear operator $\mathcal{V}$ given in (10), we have

$$
x^1(1; w) \underset{3}{\sim} x_0^1 + b_1 w + b_2 w^2 + b_3 w^3. \qquad \square
$$

3 Proof of Theorem 1

3.1 Proof of Theorem 1 (1)

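Let $\tilde X_\varepsilon$ be defined by $\tilde X_\varepsilon(t) = \phi(-t, X_\varepsilon(t))$. Then $\tilde X_\varepsilon$ satisfies the following stochastic
differential equation:

$$
\begin{aligned}
d\tilde X_\varepsilon^i(t) &= \varepsilon \sum_{k=1}^{d} \tilde V_k^i(t, \tilde X_\varepsilon(t))\, dW^k(t) + \varepsilon^2\, \tilde V_0^i(t, \tilde X_\varepsilon(t))\, dt, \quad 1 \le i \le N,\ t \in [0,1], \\
\tilde X_\varepsilon(0) &= x_0,
\end{aligned}
\tag{36}
$$

where $\tilde V$ is defined as in (7) and (9). The solution $\tilde y$ to the associated ordinary differential
equation satisfies (37) in the next lemma.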
Lemma 1 Let $y(t; h) : [0,1] \times H \to \mathbb{R}^N$ be the solution defined by (4). Let us define

$$
\tilde y(t; h) = \phi(-t, y(t; h)), \quad t \in [0,1],
$$

then $\tilde y$ satisfies the ordinary differential equation:

$$
\frac{d}{dt} \tilde y^i(t; h) = \sum_{k=1}^{d} \tilde V_k^i(t, \tilde y(t; h))\, \dot h^k(t), \quad 1 \le i \le N,\ t \in [0,1].
\tag{37}
$$

Proof From the definition of φ given by (6), we have


d
j
−V0i (t, φ(−t, φ(t, y))) + ∇ j φ i (−t, φ(t, y))V0 (t, φ(t, y)) = 0.
j=1

Therefore we have our lemma. 

Proof (Theorem 1(1)) Since V01 ≡ 0, we have ỹ 1 (t; h) = y 1 (t; h), and the energy
function can be defined as follows.


1  
d 1
2
e(x) = inf ḣ k0 (t) dt : ỹ 1 (1; h) = x .
2
k=1 0

Therefore it is enough to prove the theorem for the driftless case, i.e. V0 ≡ 0.
Let h 0 be defined by

h 0 (x) ≡ argmin{e(h); h ∈ H, y 1 (1; h) = x}. (38)

We denote $h_0(x)(t) \equiv h_0(t, x)$. Then from the non-degeneracy condition, there is an
$r > 0$ such that $h_0(x)$ is unique for $x \in (x_0^1 - r, x_0^1 + r)$. Using the Lagrange multiplier
theorem, we have

$$
h_0(x) = \lambda(x)\, D F^1(0, h_0(x)),
\tag{39}
$$

where $\lambda : (x_0^1 - r, x_0^1 + r) \to \mathbb{R}$ is a smooth function. Applying Proposition 2, we
have

$$
x^1(1; \lambda(x)) - \left( x_0^1 + b_1 \lambda(x) + b_2 \lambda(x)^2 + b_3 \lambda(x)^3 \right) = O(|x - x_0^1|^4).
$$

Since $x^1(1; \lambda(x)) = x$, we have the following asymptotic expansion of $\lambda$ in $x$:

$$
\lambda(x) \underset{3}{\sim} c_1 (x - x_0^1) + c_2 (x - x_0^1)^2 + c_3 (x - x_0^1)^3,
\tag{40}
$$

where

$$
c_1 = \frac{1}{b_1}, \qquad c_2 = -\frac{b_2}{b_1^3}, \qquad c_3 = -\frac{b_3}{b_1^4} + 2\, \frac{b_2^2}{b_1^5}.
\tag{41}
$$
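
The coefficients in (41) are those of a standard power series reversion, and this can be checked mechanically. The sketch below is an illustration with hypothetical helper names: it composes the cubic $b_1\lambda + b_2\lambda^2 + b_3\lambda^3$ with the truncated series (40) in exact rational arithmetic and confirms that the result is $x - x_0^1$ up to terms of fourth order.

```python
from fractions import Fraction

def inverse_coeffs(b1, b2, b3):
    # Coefficients c1, c2, c3 of (41).
    c1 = 1 / b1
    c2 = -b2 / b1**3
    c3 = -b3 / b1**4 + 2 * b2**2 / b1**5
    return c1, c2, c3

def poly_mul(p, q, order):
    # Product of two polynomials (coefficient lists, lowest degree first), truncated.
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

def compose(b1, b2, b3, order=3):
    # Coefficients of b1*lam + b2*lam^2 + b3*lam^3 with lam = c1 D + c2 D^2 + c3 D^3.
    c1, c2, c3 = inverse_coeffs(b1, b2, b3)
    lam = [Fraction(0), c1, c2, c3]
    lam2 = poly_mul(lam, lam, order)
    lam3 = poly_mul(lam2, lam, order)
    return [b1 * lam[k] + b2 * lam2[k] + b3 * lam3[k] for k in range(order + 1)]

if __name__ == "__main__":
    b = (Fraction(2), Fraction(3, 4), Fraction(1, 2))
    print(compose(*b))                                 # expected coefficients: 0, 1, 0, 0
    c1, c2, c3 = inverse_coeffs(*b)
    # Integrating (40) term by term gives the energy coefficients c1/2, c2/3, c3/4.
    print(c1 / 2, c2 / 3, c3 / 4)
```

Dividing $c_1$, $c_2$, $c_3$ by 2, 3 and 4 yields the coefficients of the quadratic, cubic and quartic terms of the energy expansion.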

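From [13] we have

$$
\lambda(x) = \frac{\partial e(x)}{\partial x}.
\tag{42}
$$

Since $e(x_0^1) = 0$, we can calculate the energy by integrating (40):

$$
e(x) = \int_{x_0^1}^{x} \lambda(y)\, dy \underset{4}{\sim} \frac{c_1}{2}(x - x_0^1)^2 + \frac{c_2}{3}(x - x_0^1)^3 + \frac{c_3}{4}(x - x_0^1)^4.
$$

Therefore we have Theorem 1 (1). $\square$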

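Let us define $\alpha : [0,1] \to \mathbb{R}$ by

$$
\alpha(t) = c_1 \left( \int_0^t \tilde V_k^1(u; x_0)\, du \right).
\tag{43}
$$

Then we have the following.

Corollary 1 Let $h_0 \in H$ be the element defined in (38), then we have

$$
\left\| h_0^k(x) - \alpha(\cdot)(x - x_0^1) \right\|_H = O(|x - x_0^1|^2).
$$

Proof From (27) and the proof of Theorem 1 (1), we have

$$
\begin{aligned}
h_0^k(t, x) &= \sum_{i=1}^{N} \int_0^t p_i(u; w)\, \tilde V_k^i(u, x(u; w))\, du \\
&\underset{1}{\sim} \left( \int_0^t \tilde V_k^1(u; x_0)\, du \right) w \underset{1}{\sim} \left( \int_0^t \tilde V_k^1(u; x_0)\, du \right) c_1 (x - x_0^1). \qquad \square
\end{aligned}
$$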

3.2 Proof of Theorem 1 (2)

In this section, we will use the same notations as in [12, 13]. Let $(\Theta, \|\cdot\|_\Theta)$ be a
separable Banach space and $(H, \|\cdot\|_H)$ be a separable Hilbert space such that $H$ is
a dense subspace of $\Theta$ and the inclusion map is continuous. Let $\mu_s$, $s \in [0, \infty)$, be
the (necessarily unique) probability measure on $(\Theta, \mathcal{B}_\Theta)$ with the property that

$$
\int_\Theta \exp\left[ \sqrt{-1}\, \langle u, \theta \rangle \right] \mu_s(d\theta) = \exp\left( -\frac{s}{2} \| u \|_H^2 \right), \quad u \in \Theta^*.
$$

We can rewrite (36) replacing $\varepsilon^2$ by $s$:

$$
\begin{aligned}
dX_s^i(t, \theta) &= \sum_{k=1}^{d} V_k^i(t, X_s(t, \theta))\, d\theta^k(t) + s\, V_0^i(t, X_s(t, \theta))\, dt, \quad 1 \le i \le N,\ t \in [0,1], \\
X_s(0) &= x_0.
\end{aligned}
\tag{44}
$$

Here we replaced $\tilde X$ and $\tilde V$ in (36) by $X$ and $V$ respectively for simplicity.
Let us define Wiener functionals $F^i : (0,1) \times \Theta \times [x_0^1 - r_0, x_0^1 + r_0] \to \mathbb{R}$, $1 \le i \le N$, by

$$
F^i(s, \theta, y) = X_s^i(1, \theta) - y.
\tag{45}
$$

The main theorem in [13] is summarized in Appendix 2. To apply Theorem 7, it is
necessary to check the assumptions (A-1), . . ., (A-5) in Appendix 2. Since f ≡ 0,
we can check (A-1). Since h(0) = 0, we can check (A-2), (A-3) and (A-4) in the
neighborhood of the origin. By the ellipticity condition at the origin, we can check (A-5),
using the same discussion given in Appendix B in [14]. Then we have the following.

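For each $(s, y) \in (0,1] \times [-r_0, r_0]$, the density function $p_s(y)$ satisfies

$$
\left| (2\pi s)^{1/2} \exp\left( \frac{e(y)}{s} \right) p_s(y) - a_0(y) \right| \le K_0\, s^{1/2}, \quad (s, y) \in (0,1] \times [-r_0, r_0].
$$

The function $a_0 \in C([-r_0, r_0])$ is given by

$$
a_0(y) = \left( \frac{\partial^2 e(y)}{\partial y^2} \right)^{1/2} \det{}_2\big( I_H - B(y) \big)^{-1/2} \exp\left( \frac{\partial e(y)}{\partial y}\, \mathcal{A} F^1(0, h_0(y), y) \right).
\tag{46}
$$

Here $\mathcal{A}$ is called the heat operator defined by

$$
\mathcal{A} f(s, \theta) = \left[ \frac{\partial f}{\partial s} + \frac{1}{2} \operatorname{trace}_H D^2 f \right](s, \theta),
$$

and

$$
B(y) \equiv \frac{\partial e(y)}{\partial y}\, D^2 F^1(0, h_0(y), y).
\tag{47}
$$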

In this section, we calculate each terms in right hand side of (46) explicitly. First we
calculate the heat operator.
Lemma 2 There are constants C > 0 and r > 0 such that
 1 t
(y − x01 ) 
N
A F 1 (0, h 0 (y), y) − V0i (u, x0 )∇i g 11 (t, x0 )dudt
2b1 0 0
i=1

N d  1
  t  
+ Vk1 (t, x0 )∇i,2 j Vk1 (t, x0 ) g ij (u, x0 )du dt
i, j=1 k=1 0 0

= O(|y − x01 |2 ), y> x01 .

Proof Since the adaptivity of X , we have

d 
 1  1
A F (s, θ, y) =
i
A [Vki (u, X s (u, θ ))]dθsk (u) + V0i (u, X s (u, θ ))du
k=1 0 0
 1
+s A [V0i (u, X s (u, θ ))]du, 1 ≤ i ≤ N .
0

Therefore we have

A F 1 (0, h 0 (y), y)
N  d  1
j
= ∇ j Vk1 (u, X 0 (u, h 0 (u; y)))A X 0 (u, h 0 (u; y))ḣ k0 (u; y)du
j=1 k=1 0
d 
1  1 2 1
N
j
+ ∇i, j Vk (u, X 0 (u, h 0 (u; y)))D X 0i (u), D X 0 (u)ḣ k0 (u; y)du.
2 0
i, j=1 k=1

Then using Corollary 1, we have the following.

 d 
N  1
j
A F 1 (0, h 0 (y), y) − (y − x01 ) ∇ j Vk1 (u, X 0 (u; 0))A X 0 (u; 0)α̇ k (u)du
j=1 k=1 0
d  
1  1 2 1
N
j
+ ∇i, j Vk (u, X 0 (u; 0))D X 0i (u; 0), D X 0 (u; 0)α̇ k (u)du
2 0
i, j=1 k=1

= O(|y − x01 |2 ),

j t j
where A X 0 (t; 0) = 0 V0 (u, X 0 (u; 0))du. 

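Lemma 3 The Hilbert-Schmidt norm of $D^2 F^1$ is given by

$$
\left\| D^2 F^1(0, 0, x_0) \right\|_{HS}^2 = 2 \sum_{l_1, l_2 = 1}^{N} \sum_{m=1}^{d} \int_0^1 \int_0^t g^{l_1 l_2}(u, x_0)\, \nabla_{l_1} V_m^1(t, x_0)\, \nabla_{l_2} V_m^1(t, x_0)\, du\, dt.
$$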
l1 ,l2 =1 m=1 0 0

Proof The Malliavin derivatives of X 0i , 1 ≤ i ≤ N , to the direction k ∈ H is


given by

 d 
N  t
D X 0i (t; h)[k] = ∇l Vmi (u, X 0 (u; h))D X 0l (u; h)[k]ḣ m (u)du
l=1 m=1 0
d 
 t
+ Vmi (u, X 0 (u; h))k̇ m (u)du.
m=1 0

The second Malliavin derivative of F 1 to the direction k1 , k2 ∈ H is given by

D 2 F 1 (0, 0, x0 )[k1 ][k2 ]


 N  d  1
= ∇l Vm1 (u, x0 )D X 0l (u; 0)[k1 ]k̇2m (u)du
l=1 m=1 0
 1
+ ∇l Vm1 (u, x0 )D X 0l (u; 0)[k2 ]k̇1m (u)du
0
N  d  1  t 
= ∇l Vm1 1 (t, x0 ) Vml 2 (u, x0 )k̇2m 2 (u)du k̇1m 1 (t)dt
l=1 m 1 ,m 2 =1 0 0


N 
d  1 1
= (∇l Vm1 1 (t, x0 )Vml 2 (u, x0 )1t>u
l=1 m 1 ,m 2 =1 0 0

+ ∇l Vm 2 (u, x0 )Vm 1 (t, x0 )1t<u )k̇1m 1 (t)k̇2m 2 (u)


1 l
du dt.

Therefore we can calculate the Hilbert-Schmidt norm of D 2 F 1 as follows:

D 2 F 1 (0, 0, x0 ) 2
HS

N 
d  1 1
= ((∇l Vm1 1 (t, x0 )Vml 2 (u, x0 )1t>u
l=1 m 1 ,m 2 =1 0 0

+ ∇l Vm1 2 (u, x0 )Vml 1 (t, x0 )1t<u )2 du dt


N  d  1 t  N
=2 ( ∇l Vm1 1 (t, x0 )Vml 2 (u, x0 ))2 du dt
l=1 m 1 ,m 2 =1 0 0 l=1


N 
d  1 t
=2 ∇l1 Vm1 1 (t, x0 )Vml12 (u, x0 )∇l2 Vm1 1 (t, x0 )Vml22 (u, x0 ) du dt
l1 ,l2 =1 m 1 ,m 2 =1 0 0


N d 
 1 t
=2 gl1 l2 (u, x0 )∇l1 Vm1 (t, x0 )∇l2 Vm1 (t, x0 ) du dt.
l1 ,l2 =1 m=1 0 0


Finally we will complete the proof of Theorem 1.
Proof (Theorem 1 (2)) Using (46), we have

$$
\log a_0(y) = -\frac{1}{2} \log \det{}_2\big( I_H - B(y) \big) + \frac{\partial e(y)}{\partial y}\, \mathcal{A} F^1(0, h_0(y), y) + \frac{1}{2} \log\left( \frac{\partial^2 e(y)}{\partial y^2} \right).
$$

On the right hand side, the asymptotic expansion of the second term is given by Lemma 2,
so we will give the asymptotic expansion of the first term.
Since $B$ is defined by (47) and $\frac{\partial e(y)}{\partial y} \underset{1}{\sim} c_1 (y - x_0^1)$, we have

$$
B(y) - c_1 D^2 F^1(0, 0, x_0)(y - x_0^1) = O(|y - x_0^1|^2).
$$

Since $B(x_0^1) = 0$, if $|y - x_0^1|$ is sufficiently small, we have

$$
\det{}_2\big( I - B(y) \big) = \exp\left( -\sum_{n=2}^{\infty} \frac{1}{n} \operatorname{trace}_H\big( B(y)^n \big) \right).
$$

Therefore we have

$$
\log \det{}_2\big( I_H - B(y) \big) + \frac{c_1^2 (y - x_0^1)^2}{2} \left\| D^2 F(0, 0, x_0) \right\|_{HS}^2 = O(|y - x_0|^3).
\tag{48}
$$

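The Hilbert-Schmidt norm of $D^2F^1$ is given by Lemma 3. Therefore we have

\[
\begin{aligned}
\log a_0(y) &\sim \frac{(y-x_0^1)^2}{4b_1^2}\,\|D^2F^1(0,0,x_0)\|_{HS}^2
+ \frac{(y-x_0^1)}{b_1}\,AF^1(0,h_0(y),y) + \frac12\log\Big(\frac{\partial^2e(y)}{\partial y^2}\Big)\\
&= \frac12\frac{(y-x_0^1)^2}{b_1^2}\sum_{l_1,l_2=1}^{N}\sum_{m=1}^{d}\int_0^1\!\!\int_0^t g^{l_1l_2}(u,x_0)\,\nabla_{l_1}V_m^1(t,x_0)\,\nabla_{l_2}V_m^1(t,x_0)\,du\,dt\\
&\quad + \frac{(y-x_0^1)^2}{2b_1^2}\sum_{j=1}^{N}\int_0^1\!\!\int_0^t V_0^j(u,x_0)\,\nabla_jg^{11}(t,x_0)\,du\,dt\\
&\quad + \frac12\frac{(y-x_0^1)^2}{b_1^2}\sum_{l_1,l_2=1}^{N}\sum_{m=1}^{d}\int_0^1\!\!\int_0^t V_m^1(t,0)\,g^{l_1l_2}(u,x_0)\,\nabla_{l_1,l_2}V_m^1(t,x_0)\,du\,dt\\
&\quad + \frac12\log\Big(\frac{\partial^2e(y)}{\partial y^2}\Big).
\end{aligned}
\]

From the definition (8), we have

\[
\log a_0(y) \sim \frac{(y-x_0^1)^2}{2b_1^2}\iint_{0<u<t<1}L_u(g^{11}(t,\cdot))\,du\,dt + \frac12\log\Big(\frac{\partial^2e(y)}{\partial y^2}\Big).
\]

Then we have (15).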



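Finally we calculate $a_2(x_0^1)$. First we give an asymptotic expansion of the density
using Hermite polynomials. Let $y = x_0^1 + \varepsilon\frac{z}{\sqrt{c_1}}$. Then the asymptotic expansion in
$\varepsilon$ up to the second order is given as follows:

\[
\begin{aligned}
p_\varepsilon(y)\,dy &= p_\varepsilon\Big(x_0^1+\frac{\varepsilon z}{\sqrt{c_1}}\Big)\frac{\varepsilon\,dz}{\sqrt{c_1}}\\
&\sim \Big(a_0\Big(x_0^1+\frac{\varepsilon z}{\sqrt{c_1}}\Big)+\varepsilon^2a_2(x_0^1)\Big)\frac{1}{\sqrt{2\pi}}
\exp\Big(-\Big\{\frac{c_1}{2}\Big(\frac{z}{\sqrt{c_1}}\Big)^2+\frac{\varepsilon c_2}{3}\Big(\frac{z}{\sqrt{c_1}}\Big)^3+\frac{\varepsilon^2c_3}{4}\Big(\frac{z}{\sqrt{c_1}}\Big)^4\Big\}\Big)dz\\
&= \Big(1-\frac{c_2}{3c_1^{3/2}}\,\varepsilon(z^3-3z)
+\varepsilon^2\Big(\frac{c_2^2}{18c_1^3}z^6-\Big(\frac{c_2^2}{3c_1^3}+\frac{c_3}{4c_1^2}\Big)z^4
+\Big(-\frac{c_2^2}{2c_1^3}+\frac{3c_3}{2c_1^2}+\frac{Lc_1}{2}\Big)z^2\Big)+\varepsilon^2a_2(x_0^1)\Big)\phi(z)\,dz\\
&= \Big(1-\varepsilon\frac{c_2}{3c_1^{3/2}}H_3(z)+\varepsilon^2\frac{c_2^2}{18c_1^3}H_6(z)
+\varepsilon^2\Big(\frac{c_2^2}{2c_1^3}-\frac{c_3}{4c_1^2}\Big)H_4(z)+\varepsilon^2\frac{c_1L}{2}H_2(z)\\
&\qquad+\varepsilon^2\Big(a_2(x_0^1)-\frac{c_2^2}{18c_1^3}H_6(0)-\Big(\frac{c_2^2}{2c_1^3}-\frac{c_3}{4c_1^2}\Big)H_4(0)-\frac{c_1L}{2}H_2(0)\Big)\Big)
\cdot\sqrt{\frac{c_1}{2\pi\varepsilon^2}}\exp\Big(-\frac{z^2}{2}\Big)dz,
\end{aligned}
\]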

where Hn , n ∈ N, are Hermite polynomials e.g.

H2 (x) = x 2 − 1,
H3 (x) = x 3 − 3x,
H4 (x) = x 4 − 6x 2 + 3,
H6 (x) = x 6 − 15x 4 + 45x 2 − 15.

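Since $p_\varepsilon$ is a probability density, we have

\[
1 = \int_{-\infty}^{\infty}p_\varepsilon(y)\,dy = \int_{-\infty}^{\infty}p_\varepsilon(\varepsilon z)\,\varepsilon\,dz.
\]

The orthogonality of Hermite polynomials implies

\[
\int_{-\infty}^{\infty}H_n(z)\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{z^2}{2}\Big)dz = 0,\qquad n\ge1,
\]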

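so the constant term in the expansion above must vanish, and we have

\[
a_2(x_0^1) = \frac{c_2^2}{18c_1^3}H_6(0)+\Big(\frac{c_2^2}{2c_1^3}-\frac{c_3}{4c_1^2}\Big)H_4(0)+\frac{c_1L}{2}H_2(0).
\]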

This completes the proof of Theorem 1 (2). 
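The Hermite bookkeeping in this proof can be checked mechanically. The following is a minimal sketch in exact rational arithmetic: it verifies that $H_2, H_3, H_4, H_6$ integrate to zero against the standard normal density, and that the combination of $H_6$, $H_4$, $H_2$ with the coefficients used above reproduces the $z^6$, $z^4$ and $z^2$ coefficients of the polynomial form of the expansion. The numerical values chosen for $c_1, c_2, c_3, L$ are hypothetical and serve only as a test point.

```python
from fractions import Fraction

# Probabilists' Hermite polynomials as coefficient lists: entry k multiplies z**k.
H = {
    2: [-1, 0, 1],
    3: [0, -3, 0, 1],
    4: [3, 0, -6, 0, 1],
    6: [-15, 0, 45, 0, -15, 0, 1],
}

def gaussian_moment(k):
    """E[Z**k] for standard normal Z: 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return 0
    out = 1
    for j in range(1, k, 2):
        out *= j
    return out

def gaussian_mean(coeffs):
    """Integral of a polynomial against the standard normal density."""
    return sum(c * gaussian_moment(k) for k, c in enumerate(coeffs))

def hermite_combination(alpha, beta, gamma):
    """Coefficients (in powers of z) of alpha*H6 + beta*H4 + gamma*H2."""
    out = [Fraction(0)] * 7
    for weight, n in ((alpha, 6), (beta, 4), (gamma, 2)):
        for k, c in enumerate(H[n]):
            out[k] += weight * c
    return out

# Hypothetical sample values (exact rationals), used only as a test point.
c1, c2, c3, L = Fraction(2), Fraction(3), Fraction(5), Fraction(7)
alpha = c2**2 / (18 * c1**3)                    # coefficient of H6
beta = c2**2 / (2 * c1**3) - c3 / (4 * c1**2)   # coefficient of H4
gamma = c1 * L / 2                              # coefficient of H2
poly = hermite_combination(alpha, beta, gamma)
```

Here `poly[6]`, `poly[4]` and `poly[2]` can be compared with the coefficients of $z^6$, $z^4$ and $z^2$ in the polynomial form of the expansion, and `poly[0]` is the value of the Hermite combination at $z = 0$.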


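The asymptotic expansion of the probability density in $\varepsilon$ using Hermite polynomials
is given as follows.
Corollary 2 For each $z\in\mathbf R$, let $y = x_0^1+\varepsilon\frac{z}{\sqrt{c_1}}$, $\varepsilon\in(0,1]$. For any $r\ge0$, there
is a constant $C>0$ such that

\[
\Big|\sqrt{\frac{2\pi\varepsilon^2}{c_1}}\exp\Big(\frac{z^2}{2}\Big)p_\varepsilon(y)
-\Big(1-\varepsilon\frac{c_2}{3c_1^{3/2}}H_3(z)+\varepsilon^2\frac{c_2^2}{18c_1^3}H_6(z)
+\varepsilon^2\Big(\frac{c_2^2}{2c_1^3}-\frac{c_3}{4c_1^2}\Big)H_4(z)+\varepsilon^2\frac{c_1L}{2}H_2(z)\Big)\Big|
\le \varepsilon^3C,\qquad \varepsilon\in(0,1],\ z\in[-r,r].
\]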

4 Proof of Theorem 2

First we prove the following theorem.


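Theorem 4 We assume $X_\varepsilon^1(T)$ has a density $p_\varepsilon(y)$, $y\in\mathbf R$, and let

\[
a_\varepsilon(y) = (2\pi\varepsilon^2)^{1/2}\exp\Big(\frac{e(y)}{\varepsilon^2}\Big)p_\varepsilon(y),\qquad y\in\mathbf R.
\]

We assume that there are constants $N\in\mathbf N$, $C_0>0$ and $K_0>0$ such that

\[
\Big|a_\varepsilon(y)-\sum_{k=0}^{N}a_{2k}(y)\varepsilon^{2k}\Big|\le C_0\varepsilon^{2N+2},\qquad y\in[x_0^1,K_0],
\]

and assume that the energy function $e$ satisfies $e'(x)>0$, $x\in(x_0^1,K_0]$. We define
$g:\mathbf R\to\mathbf R$ by

\[
e(g(x)) = \frac{x^2}{2}.
\]

Since $e$ is strictly increasing, $g$ is well defined. Then there are constants $K_1<K_0$
and $C_1$ such that the value of the call option satisfies the following:

\[
\Big|\sqrt{2\pi}\exp\Big(\frac{e(K)}{\varepsilon^2}\Big)C_\varepsilon(T,K)-\varepsilon\varphi_1\Big(\frac{g^{-1}(K)}{\varepsilon}\Big)a_0(K)q(K)^2R_N(\varepsilon,K)\Big|\le C_1\varepsilon^{N+1},
\qquad \varepsilon\in(0,1],\ K\in[x_0^1,K_1],
\]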

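where

\[
R_N(\varepsilon,K) = \sum_{\substack{n,m\ge0,\ n+m\ge1\\ 2n+m+1\le N}}
\frac{c_{n,m}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))}\,\frac{\varphi_{m+1}(g^{-1}(K)/\varepsilon)}{\varphi_1(g^{-1}(K)/\varepsilon)}\,\varepsilon^{2n+m}. \tag{49}
\]

Here $c_{n,m}\in C(\mathbf R)$ is given by

\[
c_{n,m}(x) = \sum_{k=0}^{m}\frac{1}{(k+1)!\,(m-k)!}\Big(\frac{d}{dx}\Big)^{k+1}g(x)\cdot\Big(\frac{d}{dx}\Big)^{m-k}A_n(x), \tag{50}
\]

where

\[
A_k(x) = a_{2k}(g(x))\,g'(x),\qquad k\in\mathbf N,\ x\in[x_0^1,K_1]. \tag{51}
\]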

We prepare the following lemma for the proof of Theorem 4.


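Lemma 4
\[
A_0(x_0^1) = 1.
\]

Proof Since

\[
1 = \int_{-\infty}^{\infty}p_\varepsilon(y)\,dy = \frac{1}{(2\pi\varepsilon^2)^{1/2}}\int_{-\infty}^{\infty}a_\varepsilon(y)\exp\Big(-\frac{e(y)}{\varepsilon^2}\Big)dy,
\]

we have

\[
1 = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty}a_\varepsilon(g(\varepsilon y))\exp\Big(-\frac{y^2}{2}\Big)g'(\varepsilon y)\,dy.
\]

Since the right-hand side is bounded, taking the limit $\varepsilon\downarrow0$, we have
$a_0(g(0))\,g'(0) = 1$.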

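Proof (Proof of Theorem 4) We can divide the value of a call option into two parts:

\[
C_\varepsilon(T,K) = \tilde C_\varepsilon(T,K)+R_\varepsilon(K_0),
\]

where

\[
\tilde C_\varepsilon(T,K) = \int_K^{K_0}(y-K)\,p_\varepsilon(y)\,dy
= \int_K^{K_0}(y-K)\Big(\frac{1}{2\pi\varepsilon^2}\Big)^{1/2}\exp\Big(-\frac{e(y)}{\varepsilon^2}\Big)a_\varepsilon(y)\,dy,
\]

and

\[
R_\varepsilon(K_0) = E[X_\varepsilon^1(T)-K : X_\varepsilon^1(T)>K_0].
\]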

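Since $e(g(x)) = \frac{x^2}{2}$, we have

\[
\tilde C_\varepsilon(T,K) = \int_{g^{-1}(K)}^{g^{-1}(K_0)}(g(x)-K)\Big(\frac{1}{2\pi\varepsilon^2}\Big)^{1/2}\exp\Big(-\frac{x^2}{2\varepsilon^2}\Big)a_\varepsilon(g(x))\,g'(x)\,dx.
\]

Let $A_\varepsilon(x) = a_\varepsilon(g(x))g'(x)$ and $\tilde K_\varepsilon = \frac1\varepsilon(g^{-1}(K_0)-g^{-1}(K))$. Putting $x = \varepsilon z+g^{-1}(K)$, we have

\[
\exp\Big(\frac{g^{-1}(K)^2}{2\varepsilon^2}\Big)\tilde C_\varepsilon(T,K)
= \int_0^{\tilde K_\varepsilon}\big(g(\varepsilon z+g^{-1}(K))-K\big)\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{z^2}{2}-\frac{z\,g^{-1}(K)}{\varepsilon}\Big)A_\varepsilon(\varepsilon z+g^{-1}(K))\,dz.
\]

We define

\[
\tilde A_{\varepsilon,n}(x) = \bar a_{\varepsilon,n}(g(x))\,g'(x) = \sum_{k=0}^{n}A_k(x)\varepsilon^{2k}.
\]

We also define

\[
\tilde C_{\varepsilon,n}(T,K)
= \exp\Big(-\frac{g^{-1}(K)^2}{2\varepsilon^2}\Big)\int_0^{\tilde K_\varepsilon}\big(g(\varepsilon z+g^{-1}(K))-K\big)\frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{z^2}{2}-\frac{z\,g^{-1}(K)}{\varepsilon}\Big)
\tilde A_{\varepsilon,n}(\varepsilon z+g^{-1}(K))\,dz.
\]

Then there exist constants $C_1,C_2>0$ such that

\[
\exp\Big(\frac{g^{-1}(K)^2}{2\varepsilon^2}\Big)\big|\tilde C_\varepsilon(T,K)-\tilde C_{\varepsilon,n}(T,K)\big|\le C_1\varepsilon^{2n+2}.
\]

Since

\[
\Big|\big(g(\varepsilon z+g^{-1}(K))-K\big)\tilde A_{\varepsilon,n}(\varepsilon z+g^{-1}(K))
-\sum_{\substack{n,m\ge0\\ 2n+m+1\le N}}c_{n,m}(g^{-1}(K))\,\varepsilon^{2n+m+1}z^{m+1}\Big|\le C_2\varepsilon^{N+1},\qquad K\in[x_0^1,K_1],
\]

we have

\[
\Big|\exp\Big(\frac{e(K)}{\varepsilon^2}\Big)\tilde C_{\varepsilon,n}(T,K)
-\sum_{\substack{n,m\ge0\\ 2n+m+1\le N}}c_{n,m}(g^{-1}(K))\,\varepsilon^{2n+m+1}\frac{1}{\sqrt{2\pi}}\varphi_{m+1}\Big(\frac{g^{-1}(K)}{\varepsilon}\Big)\Big|
\le R\,\varepsilon^{N+1},\qquad K\in[x_0^1,K_1].
\]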

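For any $\delta>0$, we have

\[
R_\varepsilon(K_0)\le E[X_\varepsilon^1(T);\,X_\varepsilon^1(T)>K_0]
\le E[X_\varepsilon^1(T)^{1/\delta}]^{\delta}\,P(X_\varepsilon^1(T)>K_0)^{1-\delta}.
\]

Therefore we have

\[
\lim_{\varepsilon\downarrow0}\varepsilon^2\log R_\varepsilon(K_0)\le\lim_{\varepsilon\downarrow0}\varepsilon^2(1-\delta)\log P(X_\varepsilon^1(T)>K_0) = -(1-\delta)e(K_0).
\]

Since $e(K_0)>e(K_1)$, we have

\[
\limsup_{\varepsilon\downarrow0}\varepsilon^2\log R_\varepsilon(K_0)<-e(K_1).
\]

The function $q$ defined by (19) can be written as

\[
q(K) = g'(g^{-1}(K)) = \Big(\frac{d}{dK}g^{-1}(K)\Big)^{-1}.
\]

Then we have our assertion.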

Finally we prove Theorem 2.


Proof (Proof of Theorem 2) From the definition of $R_2(\varepsilon,K)$ given in (49), we have

\[
R_2(\varepsilon,K) = \varepsilon\,\frac{c_{0,1}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))}\frac{\varphi_2(g^{-1}(K)/\varepsilon)}{\varphi_1(g^{-1}(K)/\varepsilon)}
+\varepsilon^2\,\frac{c_{0,2}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))}\frac{\varphi_3(g^{-1}(K)/\varepsilon)}{\varphi_1(g^{-1}(K)/\varepsilon)}
+\varepsilon^2\,\frac{c_{1,0}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))}.
\]

The second and third derivatives of $g$ at $g^{-1}(K)$ are given as follows:

\[
g''(g^{-1}(K)) = q(K)\,q'(K),\qquad
g'''(g^{-1}(K)) = q(K)\,q'(K)^2+q(K)^2\,q''(K).
\]

Using the definition of $c_{n,m}$ given in (50), we can calculate $c_{0,0}$, $c_{0,1}$, $c_{1,0}$, $c_{0,2}$ explicitly as follows:

\[
\begin{aligned}
c_{0,0}(g^{-1}(K)) &= a_0(K)\,q(K)^2,\\
c_{1,0}(g^{-1}(K)) &= a_2(K)\,q(K)^2,\\
c_{0,1}(g^{-1}(K)) &= a_0'(K)\,q(K)^3+\frac32\,a_0(K)\,q(K)^2q'(K),\\
c_{0,2}(g^{-1}(K)) &= \frac12\,a_0''(K)\,q(K)^4+2a_0'(K)\,q(K)^3q'(K)+\frac76\,a_0(K)\,q(K)^2q'(K)^2
+\frac23\,a_0(K)\,q(K)^3q''(K).
\end{aligned}
\]

Then we have our theorem.

5 Proof of Theorem 3

First, we define smooth functions $\theta_n$, $n\in\mathbf N$, inductively by

\[
\theta_1(x) = \varphi_1(x),\qquad
\theta_{n+1}(x) = -n\theta_n(x)+\theta_n'(x)\,\theta_1(x)\,x. \tag{52}
\]

We define the function $h:[0,1]\times\mathbf R\to\mathbf R$ by

\[
h(t,y)\equiv f^{-1}(tf(y)), \tag{53}
\]

where $f$ is defined by (21). The properties of $h$ are given in Lemma 9. Then we have
the following.
Proposition 3 The implied normal volatilities of call options are given as follows.

\[
\sigma_N^\varepsilon(T,K) = \frac{\varepsilon(K-x_0^1)}{g^{-1}(K)\sqrt T}\exp\Big(-\int_1^{1+l(\varepsilon,K)}\frac1t\,\varphi_1\Big(h\Big(t,\frac{g^{-1}(K)}{\varepsilon}\Big)\Big)dt\Big),\qquad K>x_0^1.
\]

Here

\[
l(\varepsilon,K) = (1+R(\varepsilon,K))(1+r(K))-1,
\]

where

\[
R(\varepsilon,K) = \frac{\sqrt{2\pi}\exp\big(\frac{e(K)}{\varepsilon^2}\big)C_\varepsilon(T,K)}{\varepsilon\,c_{0,0}(g^{-1}(K))\,\varphi_1(g^{-1}(K)/\varepsilon)}-1,
\]

and

\[
r(K) = \frac{g^{-1}(K)\,c_{0,0}(g^{-1}(K))}{K-x_0^1}-1.
\]

$R$ and $r$ satisfy the following, respectively:

\[
|R(\varepsilon,K)-R_N(\varepsilon,K)|\le C\varepsilon^N, \tag{54}
\]

and

\[
\lim_{K\downarrow x_0^1}r(K) = 0.
\]

Proof From Theorem 4 and Lemma 7, we have (54). Using l'Hospital's rule, we
have

\[
\lim_{K\downarrow x_0^1}\frac{g^{-1}(K)\,c_{0,0}(g^{-1}(K))}{K-x_0^1} = g'(x_0^1)\,a_0(x_0^1) = 1.
\]

By the definition of $R$, we can rewrite the value of the call option as

\[
C_\varepsilon(T,K) = f(g^{-1}(K)/\varepsilon)\,g^{-1}(K)\,c_{0,0}(g^{-1}(K))\,(1+R(\varepsilon,K)).
\]

On the other hand, the value of the call option under the normal model is given by

\[
V = (K-x_0^1)\,f\Big(\frac{K-x_0^1}{\sigma\sqrt T}\Big).
\]

Therefore we have

\[
f\Big(\frac{K-x_0^1}{\sigma\sqrt T}\Big) = (1+r(K))(1+R(\varepsilon,K))\,f\Big(\frac{g^{-1}(K)}{\varepsilon}\Big).
\]

Using the definition of h given by (53) and Lemma 9, we have our assertion. 
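The inversion step used in this proof, recovering $\sigma$ from a call value under the normal model, can be illustrated numerically. The sketch below assumes the standard closed-form forward call price for a normally distributed terminal value $X_T\sim N(x_0,\sigma^2T)$ and inverts it by bisection. The function names are hypothetical, and the closed form is used in place of the function $f$ of (21), which is not restated here.

```python
import math

def normal_call(x0, K, sigma, T):
    """Forward call value E[(X_T - K)^+] when X_T ~ N(x0, sigma^2 * T)."""
    s = sigma * math.sqrt(T)
    d = (x0 - K) / s
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-d / math.sqrt(2.0))
    return (x0 - K) * cdf + s * pdf

def implied_normal_vol(price, x0, K, T, lo=1e-8, hi=10.0, tol=1e-12):
    """Bisection on sigma; the normal-model call price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if normal_call(x0, K, mid, T) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For example, pricing with `normal_call(1.0, 1.2, 0.15, 10.0)` and passing the result to `implied_normal_vol` should return a value close to 0.15, up to the bisection tolerance.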
Next we will give the asymptotic expansion of implied volatilities.
Theorem 5 For any $N\in\mathbf N$, there is a constant $C>0$ such that the asymptotic
expansion of implied volatilities satisfies the following:

\[
\Big|\Big(\frac{\varepsilon(K-x_0^1)}{g^{-1}(K)\sqrt T}\Big)^{-1}\sigma_N(T,K)
-\exp\Big(\sum_{n=0}^{N}\frac{l_N(\varepsilon,K)^{n+1}}{(n+1)!}\,\theta_{n+1}\Big(\frac{g^{-1}(K)}{\varepsilon}\Big)\Big)\Big|
<C(\varepsilon+|K-x_0^1|)^{N+1},\qquad K\in[x_0^1,K_1].
\]

Here

\[
l_N(\varepsilon,K) = (1+R_N(\varepsilon,K))(1+r(K))-1, \tag{55}
\]

where

\[
r(K) = \frac{g^{-1}(K)\,c_{0,0}(g^{-1}(K))}{K-x_0^1}-1. \tag{56}
\]
K − x01

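Proof Using Lemma 9, we have

\[
\Big(\frac{\partial}{\partial t}\Big)^{n}\frac1t\,\varphi_1(h(t,y))\Big|_{t=1} = \theta_n(y),\qquad n\ge1.
\]

Therefore, with $y = g^{-1}(K)/\varepsilon$,

\[
\begin{aligned}
&\Big|\int_1^{1+l(\varepsilon,K)}\frac1t\,\theta_1(h(t,y))\,dt-\sum_{n=0}^{N}\int_1^{1+l_N(\varepsilon,K)}\frac{\theta_n(y)}{n!}(t-1)^n\,dt\Big|\\
&\quad\le\Big|\int_1^{1+l(\varepsilon,K)}\frac1t\,\theta_1(h(t,y))\,dt-\int_1^{1+l_N(\varepsilon,K)}\frac1t\,\theta_1(h(t,y))\,dt\Big|\\
&\qquad+\Big|\int_1^{1+l_N(\varepsilon,K)}\frac1t\,\theta_1(h(t,y))\,dt-\sum_{n=0}^{N}\int_1^{1+l_N(\varepsilon,K)}\frac{\theta_n(y)}{n!}(t-1)^n\,dt\Big|\\
&\quad\le C_1|l(\varepsilon,K)-l_N(\varepsilon,K)|+C_2|l_N(\varepsilon,K)|^{N}\le C(\varepsilon+|K-x_0^1|)^{N}.
\end{aligned}
\]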


Finally we prove Theorem 3.


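Lemma 5 The derivatives of $q$, $a_0$, $a_2$ at $x_0^1$ are given as follows:

\[
\begin{aligned}
q(x_0^1) &= \frac{1}{\sqrt{c_1}},\qquad
\frac{q'(x_0^1)}{q(x_0^1)} = -\frac23\frac{c_2}{c_1},\qquad
\frac{q''(x_0^1)}{q(x_0^1)} = \frac{11}{9}\Big(\frac{c_2}{c_1}\Big)^2-\frac32\frac{c_3}{c_1},\\
\frac{a_0'(x_0^1)}{a_0(x_0^1)} &= \frac{c_2}{c_1},\qquad
\frac{a_0''(x_0^1)}{a_0(x_0^1)} = c_1^2L-\Big(\frac{c_2}{c_1}\Big)^2+\frac{3c_3}{c_1},\\
\frac{a_2(x_0^1)}{a_0(x_0^1)} &= \frac{1}{c_1}\Big(-\frac{c_1^2L}{2}+\frac23\Big(\frac{c_2}{c_1}\Big)^2-\frac34\frac{c_3}{c_1}\Big),
\end{aligned}
\]

where $c_i$ ($i = 1,2,3$) are given by (41).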

Proof Since

\[
e(g(x)) = \frac12x^2,
\]

and $g'(x)>0$, the derivatives are given by

\[
\begin{aligned}
x &= e'(g(x))\,g'(x),\\
1 &= e''(g(x))\,g'(x)^2+e'(g(x))\,g''(x),\\
0 &= e'''(g(x))\,g'(x)^3+3e''(g(x))\,g'(x)\,g''(x)+e'(g(x))\,g'''(x),\\
0 &= e^{(4)}(g(x))\,g'(x)^4+6e'''(g(x))\,g'(x)^2g''(x)+3e''(g(x))\,g''(x)^2
+4e''(g(x))\,g'(x)\,g'''(x)+e'(g(x))\,g^{(4)}(x).
\end{aligned}
\]

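Furthermore, since

\[
e'(x_0^1) = 0,\qquad e''(x_0^1) = \frac{1}{b_1},\qquad e'''(x_0^1) = -\frac{2b_2}{b_1^3},\qquad
e^{(4)}(x_0^1) = -\frac{6b_3}{b_1^4}+\frac{12b_2^2}{b_1^5},
\]

we have

\[
g'(0) = \sqrt{b_1},\qquad g''(0) = \frac23\frac{b_2}{b_1},\qquad g'''(0) = \frac{b_1^{-5/2}}{6}\,(9b_1b_3-8b_2^2).
\]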
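Lemma 6
\[
|R_2(\varepsilon,K)-R_2^0(\varepsilon,K)|\le C(\varepsilon+|K-x_0^1|)^3,\qquad
|r(K)-r^0(K)|\le C|K-x_0^1|^3,
\]

where

\[
\begin{aligned}
R_2^0(\varepsilon,K) &= \frac{\varepsilon(K-x_0^1)}{\sqrt{c_1}}\Big(c_1^2L-\frac56\Big(\frac{c_2}{c_1}\Big)^2+\frac34\frac{c_3}{c_1}\Big)\frac{\varphi_2(g^{-1}(K)/\varepsilon)}{\varphi_1(g^{-1}(K)/\varepsilon)}\\
&\quad+\frac{\varepsilon^2}{c_1}\Big(\frac{c_1^2L}{2}-\frac12\Big(\frac{c_2}{c_1}\Big)^2+\frac12\frac{c_3}{c_1}\Big)\frac{\varphi_3(g^{-1}(K)/\varepsilon)}{\varphi_1(g^{-1}(K)/\varepsilon)}\\
&\quad+\frac{\varepsilon^2}{c_1}\Big(-\frac{c_1^2L}{2}+\frac23\Big(\frac{c_2}{c_1}\Big)^2-\frac34\frac{c_3}{c_1}\Big),
\end{aligned}
\]

and

\[
r^0(K) = \Big(-\frac13\Big(\frac{c_2}{c_1}\Big)^2+\frac14\frac{c_3}{c_1}+\frac{c_1^2L}{2}\Big)(K-x_0^1)^2.
\]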

Proof We calculate each term of $R_2$ given by (20). From Lemma 7, the functions
$\varphi_2/\varphi_1$ and $\varphi_3/\varphi_1$ are bounded above. Since the first term is $O(\varepsilon)$ and the other terms are
$O(\varepsilon^2)$, it is enough to calculate the first order in $K$ of the first term and the zeroth order of
the other terms. Using Lemma 5, we have

\[
\frac{c_{0,1}(x_0^1)}{c_{0,0}(x_0^1)} = 0,
\]

and the first derivative is given by

\[
\frac{d}{dK}\frac{c_{0,1}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))}
= q(K)\Big(\frac{a_0''(K)}{a_0(K)}-\Big(\frac{a_0'(K)}{a_0(K)}\Big)^2+\frac32\frac{q''(K)}{q(K)}+\frac{a_0'(K)}{a_0(K)}\frac{q'(K)}{q(K)}\Big).
\]

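Using Lemma 5 again, we have

\[
\begin{aligned}
\frac{c_{0,1}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))} &\underset{1}{\sim}\frac{(K-x_0^1)}{\sqrt{c_1}}\Big(c_1^2L-\frac56\Big(\frac{c_2}{c_1}\Big)^2+\frac34\frac{c_3}{c_1}\Big),\\
\frac{c_{0,2}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))} &\underset{0}{\sim}\frac{1}{c_1}\Big(\frac{c_1^2L}{2}-\frac12\Big(\frac{c_2}{c_1}\Big)^2+\frac12\frac{c_3}{c_1}\Big),\\
\frac{c_{1,0}(g^{-1}(K))}{c_{0,0}(g^{-1}(K))} &\underset{0}{\sim}\frac{1}{c_1}\Big(-\frac{c_1^2L}{2}+\frac23\Big(\frac{c_2}{c_1}\Big)^2-\frac34\frac{c_3}{c_1}\Big).
\end{aligned}
\]

We can calculate $r(K)$ in the same way and we have our results.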

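Proof (Proof of Theorem 3) Using (55) we have

\[
l_2(\varepsilon,K)\underset{2}{\sim}R_2^0(\varepsilon,K)+r^0(K).
\]

Since $R_2^0$ and $r^0$ are of the second order in $\varepsilon$, $K$, we have

\[
\sum_{n=0}^{2}\frac{l_2(\varepsilon,K)^{n+1}}{(n+1)!}\,\theta_{n+1}\Big(\frac{g^{-1}(K)}{\varepsilon}\Big)
\underset{2}{\sim}\big(R_2^0(\varepsilon,K)+r^0(K)\big)\,\varphi_1\Big(\frac{g^{-1}(K)}{\varepsilon}\Big).
\]

Hence we have our result.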

6 Examples

In this section, we apply our results to some known models.

6.1 Local Volatility Models

We assume the following model. Let $\sigma:\mathbf R\to\mathbf R_+$ be a smooth function whose
derivatives of any order are bounded. Let $\lambda$ be a continuous $\mathbf R_+$-valued function
defined on $[0,T]$.

\[
dX_\varepsilon(t) = \varepsilon\lambda(t)\,\sigma(X_\varepsilon(t))\,dW_t,\qquad X_\varepsilon(0) = x_0.
\]

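In this case we can solve for the energy as follows:

\[
e(y) = \frac{1}{2\Lambda}\Big(\int_{x_0}^{y}\frac{dx}{\sigma(x)}\Big)^2,
\qquad\text{where}\qquad
\Lambda = \int_0^T\lambda^2(t)\,dt.
\]

The minimum energy path $h$ is given by

\[
h(t) = \frac{1}{\Lambda}\Big(\int_{x_0}^{y}\frac{dx}{\sigma(x)}\Big)\int_0^t\lambda(s)\,ds.
\]

We can easily calculate the coefficients.

\[
\begin{aligned}
b_1 &= \sigma(x_0)^2\Lambda,\qquad b_2 = \frac32\,\sigma(x_0)^3\sigma'(x_0)\Lambda^2,\\
b_3 &= \Big(\frac83\,\sigma(x_0)^4\sigma'(x_0)^2+\frac23\,\sigma(x_0)^5\sigma''(x_0)\Big)\Lambda^3,\\
L &= \Big(\frac12\,\sigma(x_0)^2\sigma'(x_0)^2+\frac12\,\sigma(x_0)^3\sigma''(x_0)\Big)\Lambda^2,\qquad
g^{-1}(y) = \frac{1}{\sqrt\Lambda}\int_{x_0}^{y}\frac{dx}{\sigma(x)}.
\end{aligned}
\]

Then, using Theorems 1 and 3, we can calculate the density function and implied
normal volatilities. We illustrate some cases.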
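Example 1 (CEV model) This is the case $\lambda(t)\equiv\alpha$ and

\[
\sigma(x) = x^\beta.
\]

Each term is given by

\[
\begin{aligned}
\Lambda &= \alpha^2T,\qquad b_1 = x_0^{2\beta}\Lambda,\qquad b_2 = \frac32\,\beta x_0^{4\beta-1}\Lambda^2,\qquad
b_3 = \frac23\,(\beta^2-\beta+4\beta^2)\,x_0^{6\beta-2}\Lambda^3,\\
L &= \Big(\beta^2-\frac\beta2\Big)x_0^{4\beta-2}\Lambda^2,\qquad
e''(y) = \frac{\beta(1+\beta)}{2\alpha^2T\,y^{\beta+2}},\\
g^{-1}(y) &= \begin{cases}
\dfrac{1}{\sqrt\Lambda}\,\dfrac{y^{1-\beta}-x_0^{1-\beta}}{1-\beta} & (\beta\ne1)\\[2ex]
\dfrac{1}{\sqrt\Lambda}\,\log\Big(\dfrac{y}{x_0}\Big) & (\beta = 1).
\end{cases}
\end{aligned}
\]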
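The general local volatility formulas for $e(y)$ and $g^{-1}(y)$ can be checked against the CEV closed form numerically. The following is a minimal sketch, assuming a composite Simpson rule for the integral of $1/\sigma$; the function names and the sample parameters in the usage note are hypothetical.

```python
import math

def integral_inv_sigma(sigma, x0, y, n=2000):
    """Composite Simpson approximation of the integral of dx / sigma(x) over [x0, y] (n even)."""
    h = (y - x0) / n
    total = 1.0 / sigma(x0) + 1.0 / sigma(y)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) / sigma(x0 + i * h)
    return total * h / 3.0

def energy(sigma, x0, y, Lam):
    """e(y) = (integral of dx / sigma(x))^2 / (2 * Lambda)."""
    return integral_inv_sigma(sigma, x0, y) ** 2 / (2.0 * Lam)

def g_inverse_cev(y, x0, beta, Lam):
    """Closed-form g^{-1}(y) for sigma(x) = x**beta."""
    if beta == 1.0:
        return math.log(y / x0) / math.sqrt(Lam)
    return (y ** (1.0 - beta) - x0 ** (1.0 - beta)) / ((1.0 - beta) * math.sqrt(Lam))
```

With, say, $x_0 = 1$, $y = 1.3$, $\beta = 0.5$ and $\Lambda = \alpha^2T = 0.225$, the quantity `energy(lambda x: x ** 0.5, 1.0, 1.3, 0.225)` should agree with `g_inverse_cev(1.3, 1.0, 0.5, 0.225) ** 2 / 2`, which is the relation $e(g(x)) = x^2/2$ evaluated at $x = g^{-1}(y)$.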

Example 2 (Displaced diffusion) This is the case λ(t) ≡ σ and

σ (x) = q x + (1 − q)x0 .

Fig. 1 Implied volatility smile of displaced diffusion, asymptotic expansion versus analytic solution
with x0 = 1.0, q = 0.5, σ = 0.15, T = 10

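Each term is given by (Fig. 1)

\[
\begin{aligned}
\Lambda &= \sigma^2T,\qquad b_1 = x_0^2\Lambda,\qquad b_2 = \frac32\,x_0^3q\Lambda^2,\qquad b_3 = \frac83\,x_0^4q^2\Lambda^3,\qquad L = \frac12\,x_0^2q^2\Lambda^2,\\
g^{-1}(y) &= \frac{1}{\sqrt\Lambda}\int_{x_0}^{y}\frac{dx}{qx+(1-q)x_0} = \frac{1}{q\sqrt\Lambda}\log\Big(\frac{qy+(1-q)x_0}{x_0}\Big),\\
e''(y) &= \frac{1-g^{-1}(y)\,q\sqrt\Lambda}{\Lambda\,(qy+(1-q)x_0)^2}.
\end{aligned}
\]

The Black-Scholes model is the case $q = 1$. We present numerical results of the
asymptotic expansion formula, comparing it with the analytical solution.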

6.2 SABR Model

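We investigate the following model, which is called the SABR model.

\[
\begin{aligned}
dX_\varepsilon(t) &= \varepsilon\,\alpha_\varepsilon(t)\,\sigma(X_\varepsilon(t))\big(\rho\,dW(t)+\sqrt{1-\rho^2}\,dZ(t)\big),\\
d\alpha_\varepsilon(t) &= \varepsilon\nu\,\alpha_\varepsilon(t)\,dW(t),\\
X_\varepsilon(0) &= x_0,\qquad \alpha_\varepsilon(0) = \alpha.
\end{aligned}
\]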

This model was investigated in Hagan and Woodward [6, 14]. The energy function
was given in Hagan et al. [8] as follows.

\[
e(y) = \frac{1}{2\nu^2T}\Big(\log\frac{\sqrt{1-2\rho\zeta+\zeta^2}-\rho+\zeta}{1-\rho}\Big)^2 = \frac{\hat x(\zeta(y))^2}{2\nu^2T},
\]

where

\[
\zeta(y) = -\frac{\nu}{\alpha}\int_{x_0}^{y}\frac{dz}{\sigma(z)}.
\]

In Theorem 3.1 [14], we also gave the energy function by solving Hamilton equations.
Then the parameters are given by (Fig. 2)

\[
\begin{aligned}
b_1 &= \alpha^2\sigma(x_0)^2T,\qquad b_2 = \frac32\,\sigma(x_0)^3\alpha^3\big(\alpha\sigma'(x_0)+\nu\rho\big)T^2,\\
b_3 &= \Big(\frac83\,\alpha^6\sigma(x_0)^4\sigma'(x_0)^2+\frac23\,\alpha^6\sigma(x_0)^5\sigma''(x_0)+6\nu\rho\,\sigma(x_0)^4\sigma'(x_0)\alpha^5
+2\nu^2\rho^2\sigma(x_0)^4\alpha^4+\frac23\,\alpha^4\sigma(x_0)^4\nu^2\Big)T^3,\\
L &= \frac{\alpha^2\sigma(x_0)^2T^2}{2}\Big(\alpha^2\big(\sigma'(x_0)^2+\sigma(x_0)\sigma''(x_0)\big)+4\nu\rho\alpha\sigma'(x_0)+\nu^2\Big),\\
g^{-1}(y) &= \frac{1}{\nu\sqrt T}\log\Big(\frac{\sqrt{1-2\rho\zeta(y)+\zeta(y)^2}-\rho+\zeta(y)}{1-\rho}\Big).
\end{aligned}
\]

We present numerical results of the asymptotic expansion formula, comparing it with
Monte Carlo simulation. Here we assume $\sigma(x) = x^\beta$.

Fig. 2 Implied volatility smile of SABR model, asymptotic expansion versus Monte Carlo simu-
lation with x0 = 1, α = 0.15, β = 0.5, ν = 0.2, ρ = −0.2, T = 10.
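The SABR energy function above is straightforward to evaluate. The following is a minimal sketch for $\sigma(x) = x^\beta$ with $\beta\ne1$, using the sign convention for $\zeta$ as written above; the function names are hypothetical. As a sanity check, for small $\nu$ the value should approach the local volatility energy $\big(\int_{x_0}^{y}dz/\sigma(z)\big)^2/(2\alpha^2T)$.

```python
import math

def x_hat(zeta, rho):
    """x_hat(zeta) = log((sqrt(1 - 2*rho*zeta + zeta^2) - rho + zeta) / (1 - rho))."""
    root = math.sqrt(1.0 - 2.0 * rho * zeta + zeta * zeta)
    return math.log((root - rho + zeta) / (1.0 - rho))

def zeta_cev(y, x0, alpha, beta, nu):
    """zeta(y) = -(nu / alpha) * integral_{x0}^{y} dz / z**beta, for beta != 1."""
    return -(nu / alpha) * (y ** (1.0 - beta) - x0 ** (1.0 - beta)) / (1.0 - beta)

def sabr_energy(y, x0, alpha, beta, nu, rho, T):
    """e(y) = x_hat(zeta(y))^2 / (2 * nu^2 * T)."""
    return x_hat(zeta_cev(y, x0, alpha, beta, nu), rho) ** 2 / (2.0 * nu * nu * T)
```

With the parameters of Fig. 2 ($x_0 = 1$, $\alpha = 0.15$, $\beta = 0.5$, $\nu = 0.2$, $\rho = -0.2$, $T = 10$), `sabr_energy(1.3, 1.0, 0.15, 0.5, 0.2, -0.2, 10.0)` gives the energy at strike 1.3, and the energy vanishes at $y = x_0$.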

Acknowledgments The author would like to thank Professor Shigeo Kusuoka for useful
discussions.

Appendix 1

In this section, we investigate some properties of the functions defined in Sect. 1. First
we consider $\varphi_n$, $n\ge0$, defined by (18).
Lemma 7 The functions $\varphi_n$ have the following properties.
(1) $\varphi_n(x)>0$, $x\ge0$.
(2) $\lim_{x\to\infty}x^{n+1}\varphi_n(x) = n!$.
(3) $\sup_x\dfrac{\varphi_n(x)}{\varphi_1(x)}<\infty$, $n\ge1$.

Proof (1) is easy to check. We prove (2). Putting $y = xz$, we have

\[
\varphi_n(x) = \int_0^\infty\exp\Big(-\frac{y^2}{2x^2}-y\Big)\Big(\frac yx\Big)^n\frac{dy}{x}
= \frac{1}{x^{n+1}}\int_0^\infty y^n\exp\Big(-y-\frac{y^2}{2x^2}\Big)dy.
\]

Then we have

\[
\lim_{x\to\infty}x^{n+1}\varphi_n(x) = \int_0^\infty y^ne^{-y}\,dy = n!.
\]

(3) is an easy consequence of (1) and (2).

The following is easy to check.


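Lemma 8 The functions $\{\varphi_n\}$ satisfy the following recurrence relations.

\[
\varphi_{n+1}(x) = -x\varphi_n(x)+n\varphi_{n-1}(x),\qquad
\varphi_n'(x) = -\varphi_{n+1}(x).
\]

Example 3 $\varphi_i$ ($0\le i\le3$) are given as follows:

\[
\varphi_0(x) = \exp\Big(\frac{x^2}{2}\Big)\int_x^\infty\exp\Big(-\frac{z^2}{2}\Big)dz,
\]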
ϕ1 (x) = −xϕ0 (x) + 1,
ϕ2 (x) = (x 2 + 1)ϕ0 (x) − x,
ϕ3 (x) = −(x 3 + 3x)ϕ0 (x) + x 2 + 2.
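The recurrence of Lemma 8 and the closed forms of Example 3 can be cross-checked numerically. The sketch below assumes the integral representation $\varphi_n(x) = \int_0^\infty z^n\exp(-z^2/2-xz)\,dz$, read off from the substitution in the proof of Lemma 7 (definition (18) itself is not restated here), and compares it with the forward recurrence started from $\varphi_0$ and $\varphi_1 = 1-x\varphi_0$. The function names and quadrature parameters are hypothetical choices.

```python
import math

def phi0(x):
    """phi_0(x) = exp(x^2/2) * integral_x^inf exp(-z^2/2) dz, via the complementary error function."""
    return math.exp(0.5 * x * x) * math.sqrt(0.5 * math.pi) * math.erfc(x / math.sqrt(2.0))

def phi(n, x):
    """phi_n(x) from phi_{k+1} = -x*phi_k + k*phi_{k-1}, with phi_1 = 1 - x*phi_0."""
    prev, cur = phi0(x), 1.0 - x * phi0(x)
    if n == 0:
        return prev
    for k in range(1, n):
        prev, cur = cur, -x * cur + k * prev
    return cur

def phi_quad(n, x, upper=12.0, steps=24000):
    """Midpoint rule for the integral of z^n * exp(-z^2/2 - x*z) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += z ** n * math.exp(-0.5 * z * z - x * z)
    return total * h
```

The forward recurrence suffers from cancellation for large $x$, so this sketch is intended for moderate arguments only; for the limit in Lemma 7 (2) the quadrature form is the safer choice.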

Next we consider the function $h\in C^\infty([0,1]\times\mathbf R_+)$ defined by (53).

Lemma 9 The $n$-th derivative of $\log h(t,y)$ with respect to $t$ is given by

\[
\Big(\frac{\partial}{\partial t}\Big)^{n}\log h(t,y) = \frac{1}{t^n}\,\theta_n(h(t,y)),\qquad t\in[0,1],\ y>0,
\]

where $\theta_n\in C_b([0,\infty])$, $n\ge1$, are the functions defined in (52), that is,

\[
\theta_1(x) = \varphi_1(x),\qquad
\theta_{n+1}(x) = -n\theta_n(x)+\theta_n'(x)\,\theta_1(x)\,x.
\]

Proof In the case $n = 1$, since $f(h(t,y)) = tf(y)$, we have

\[
\frac{\partial h}{\partial t}(t,y) = \frac{f(h(t,y))}{t\,f'(h(t,y))}.
\]

Since

\[
f'(x) = -\Big(\frac1x+x+\frac{\varphi_2(x)}{\varphi_1(x)}\Big)f(x)<0,\qquad x>0,
\]

we have

\[
\theta_1(x) = \frac{f(x)}{x\,f'(x)} = \Big(1+x^2+x\,\frac{\varphi_2(x)}{\varphi_1(x)}\Big)^{-1} = \varphi_1(x).
\]

It is easy to check that $\theta_1\in C_b([0,\infty])$ and $x\theta_1(x)\in C_b([0,\infty])$. We have

\[
\frac{\partial}{\partial t}\log h(t,y) = \frac1t\,\theta_1(h(t,y)).
\]

Since

\[
\frac{\partial}{\partial t}\Big(\frac{1}{t^n}\,\theta_n(h(t,y))\Big) = \frac{1}{t^{n+1}}\Big(-n\theta_n(h(t,y))+\theta_n'(h(t,y))\,\theta_1(h(t,y))\,h(t,y)\Big),
\]

it is easy to prove our lemma.

Appendix 2

In this section, we summarize the main theorem in Kusuoka and Osajima [13]. See
[13] for the definitions.
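Let $f,g\in G^\infty(\mathcal A;\mathbf R)$ and $F\in G^\infty(\mathcal A;\mathbf R^N)$ be completely $P$-regular functions
and let $Y$ be a compact subset of $\mathbf R^N$. We assume the following.
(A1) There is an $\alpha>0$ such that

\[
\sup_{s\in(0,1]}s\log\Big(\int_\Theta\exp\Big(\frac{(1+\alpha)f(s,\theta)}{s}\Big)\mu_s(d\theta)\Big)<\infty.
\]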

We define $e:\mathbf R^N\to[-\infty,\infty]$ by

\[
e(x)\equiv\inf\Big\{\frac{\|h\|^2}{2}-f(0,h) : F(0,h) = x\Big\},\qquad x\in\mathbf R^N.
\]

We also assume the following.
(A2) For each $y\in Y$,

\[
M(y)\equiv\{h\in H;\ F(0,h) = y\}\ne\emptyset
\]

and

\[
e(y) = \frac{\|h(y)\|^2}{2}-f(0,h(y))
\]

for precisely one $h(y)\in M(y)$.
We assume moreover the following.
(A3) T (y) ≡ D F(0, h(y)) has rank N for every y ∈ Y.
Let $\pi(y) = T(y)^*(T(y)T(y)^*)^{-1}T(y)$, $y\in Y$. Then $\pi(y)$ is an orthogonal projection
in $H$. Let $\pi(y)^\perp = I_H-\pi(y)$. Then $\pi(y)^\perp$ is also an orthogonal projection in $H$
onto $\ker T(y)$. Let $V(y):H\times H\to\mathbf R$ be the bilinear form given by

\[
\begin{aligned}
V(y)(h,h') &= D^2f(0,h(y))(\pi(y)^\perp h,\pi(y)^\perp h')\\
&\quad+\big(h(y)-Df(0,h(y)),\ T(y)^*(T(y)T(y)^*)^{-1}D^2F(0,h(y))(\pi(y)^\perp h,\pi(y)^\perp h')\big)_H.
\end{aligned}
\]

We furthermore assume the following.

(A4) For all $y\in Y$ and $h\in H\setminus\{0\}$,

\[
V(y)(h,h)<\|h\|^2.
\]

Finally we define

\[
A(s,\theta) = DF(s,\theta)\,DF(s,\theta)^* = \big((DF_i(s,\theta),DF_j(s,\theta))_H\big)_{1\le i,j\le N}
\]

and assume the following.

(A5) For any $p\in[1,\infty)$,

\[
\lim_{s\downarrow0}s\log\Big(\int_\Theta|\det A(s,\theta)|^{-p}\mu_s(d\theta)\Big)\le0.
\]

Then Kusuoka-Stroock [12] proved the following.



Theorem 6 For each $s\in(0,1]$, the signed measure $P_s(\cdot)$ on $\mathbf R^N$ given by

\[
P_s(\Gamma) = \int_{F(s,\theta)\in\Gamma}g(s,\theta)\exp\Big(\frac{f(s,\theta)}{s}\Big)\mu_s(d\theta),\qquad\Gamma\in\mathcal B(\mathbf R^N),
\]

admits a smooth density $p_s(\cdot)$ with respect to Lebesgue measure. Moreover, there
exist sequences $\{a_n\}_{n=0}^\infty\subseteq C(Y;\mathbf R)$ and $\{K_n\}_{n=0}^\infty\subseteq(0,\infty)$ with the property that,
for every $n\in\mathbf N$,

\[
\Big|(2\pi s)^{N/2}e^{e(y)/s}p_s(y;0)-\sum_{m=0}^{n}s^{m/2}a_m(y)\Big|\le K_ns^{(n+1)/2},\qquad(s,y)\in(0,1]\times Y.
\]

The main theorem in Kusuoka-Osajima [13] is the following.

Theorem 7 $e$ is smooth in a neighborhood of $Y$ and

\[
a_0(y) = (\det\nabla^2e(y))^{1/2}\det{}_2(I_H-B(y))^{-1/2}\exp\Big(\sum_{i=1}^{N}\frac{\partial e}{\partial y_i}(y)\,AF^i(0,h(y))+Af(0,h(y))\Big)
\]

for $y\in Y$, where

\[
B(y)\equiv\sum_{i=1}^{N}\frac{\partial e}{\partial y_i}(y)\,D^2F^i(0,h(y))+D^2f(0,h(y)),\qquad y\in Y.
\]

Here we identify a continuous symmetric bilinear form $B:H\times H\to\mathbf R$ with the
bounded symmetric linear operator $\tilde B:H\to H$ given by

\[
(\tilde Bh,k)_H = B(h,k),\qquad h,k\in H,
\]

and $\det_2$ is the Carleman-Fredholm determinant (cf. Dunford and Schwartz [5],
p. 1106).

Appendix 3

In this section, we discuss the implied volatilities for the case $K<x_0^1$. We
define the forward value of a put option of strike rate $K$ and maturity $T$ by

\[
P_\varepsilon(T,K) = E[(K-X_\varepsilon^1(T))^+].
\]

By put-call parity, the implied volatility of the put option is the same as
the implied volatility of a call option with strike rate $K$ and maturity $T$. Since

\[
P_\varepsilon(T,K) = E[(-X_\varepsilon^1(T)-(-K))^+] = E[(-(X_\varepsilon^1(T)-x_0^1)-(-(K-x_0^1)))^+],
\]

it is enough to discuss the case $x_0^1 = 0$.

Let $x = (x^1,\dots,x^n)\in\mathbf R^n$. We denote $\bar x = (-x^1,x^2,\dots,x^n)$. We define
$\bar X_\varepsilon(t) = \overline{X_\varepsilon(t)}$. Then we have

\[
d\bar X_\varepsilon^i(t) = \varepsilon\sum_{k=1}^{d}\bar V_k^i(t,\bar X_\varepsilon(t))\,dW_k(t)+\bar V_0^i(t,\bar X_\varepsilon(t))\,dt,\qquad1\le i\le N,
\]

where

\[
\bar V_k^j(t,x) = \begin{cases}
-V_k^1(t,\bar x) & (1\le k\le d)\\
V_k^j(t,\bar x) & (1\le k\le d,\ j\ne1).
\end{cases}
\]

Since the associated Riemannian metric $\bar g^{ij}(t,x) = \sum_{k=1}^{d}\bar V_k^i(t,x)\bar V_k^j(t,x)$ is
given by

\[
\bar g^{11}(t,x) = g^{11}(t,x),\qquad\bar g^{1i}(t,x) = -g^{1i}(t,x)\ (i\ne1),\qquad\bar g^{ij}(t,x) = g^{ij}(t,x)\ (i,j\ne1),
\]

we have

\[
\bar b_1 = b_1,\qquad\bar b_2 = -b_2,\qquad\bar b_3 = b_3,\qquad\bar L = L.
\]

Therefore Theorems 1 and 3 still hold for $K<x_0^1$.

References

1. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57(10), 1352–1373 (2004)
2. Bismut, J.M.: Large Deviations and the Malliavin Calculus. Birkhauser, Boston (1984)
3. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1) (2014)
4. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350
(2014)
5. Dunford, N., Schwartz, J.T.: Linear Operators, Part II. Wiley, New York (1988)
6. Hagan, P.S., Woodward, D.E.: Equivalent black volatilities. Appl. Math. Financ. 6, 147–157
(1999)
7. Hagan, P.S., Kumar, D., Lesniewski, S., Woodward, D.E.: Managing smile risk. Wilmott Mag.
18(11), 84–108 (2002)
8. Hagan, P.S., Lesniewski, S., Woodward, D.E.: Probability distribution in the SABR model of
stochastic volatility. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.)
Large Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics
and Statistics, vol. 110 (2015)

9. Henry-Labordère, P.: A General Asymptotic Implied Volatility for Stochastic Volatility Models,
preprint, http://arxiv.org/abs/cond-mat/0504317 (2005)
10. Kunitomo, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11, 117–151 (2001)
11. Kusuoka, S., Stroock, D.W.: Applications of Malliavin Calculus, Part I. In: Ito, K. (ed.) Pro-
ceedings of the Taniguchi International Symposium on Stochastic Analysis, Kyoto and Katata,
1982, pp. 271–360. Kinokuniya, Tokyo (1984)
12. Kusuoka, S., Stroock, D.W.: Precise asymptotics of certain Wiener functionals. J. Funct. Anal.
99, 1–74 (1991)
13. Kusuoka, S., Osajima, Y.: A remark on the asymptotic expansion of density function of Wiener
functionals. J. Funct. Anal. 255, 2545–2562 (2007)
14. Osajima, Y.: The Asymptotic Expansion Formula of Implied Volatility for Dynamic SABR
model and FX hybrid model, BNP Paribas, Date posted: 26 Feb 2007 SSRN working paper
series
15. Shigekawa, I.: Stochastic analysis. Am. Math. Soc. (2004)
16. Siopacha, M., Teichmann, J.: Weak and strong Taylor methods for numerical solutions of
stochastic differential equations. Quant. Financ. 11(4), 517–528 (2011)
17. Watanabe, S.: Analysis of wiener functionals (Malliavin calculus) and its application to heat
kernels. Ann. Probab. 15, 1–39 (1987)
18. Yoshida, N.: Asymptotic expansions of maximum likelihood estimators for small diffusions
via the theory of Malliavin-Watanabe. Probab. Theory Relat. Fields 92, 275–311 (1992)
Implied Volatility of Basket Options
at Extreme Strikes

Archil Gulisashvili and Peter Tankov

Abstract In the paper, we characterize the asymptotic behavior of the implied
volatility of a basket call option at large and small strikes in a variety of settings with
increasing generality. First, we obtain an asymptotic formula with an error bound
for the left wing of the implied volatility, under the assumption that the dynamics
of asset prices are described by the multidimensional Black-Scholes model. Next,
we find the leading term of asymptotics of the implied volatility in the case where
the asset prices follow the multidimensional Black-Scholes model with time change
by an independent increasing stochastic process. Finally, we deal with a general
situation in which the dependence between the assets is described by a given copula
function. In this setting, we obtain a model-free tail-wing formula that links the
implied volatility to a special characteristic of the copula called the weak lower tail
dependence function.

Keywords Implied volatility asymptotics · Basket options · Index options · Large/small strikes · Time change · Copula

We thank the anonymous reviewer for the careful reading of our manuscript and many constructive
comments.

A. Gulisashvili
Department of Mathematics, Ohio University, Athens, OH, USA
P. Tankov (B)
Laboratoire de Probabilités et Modèles Aléatoires, Université Paris Diderot, Paris, France
P. Tankov
International Laboratory of Quantitative Finance,
National Research University “Higher School of Economics”, Moscow, Russia

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_6

1 Introduction

In option markets, prices of vanilla call and put options are commonly quoted in terms
of their implied volatility I (T, K ), defined as the value of the volatility parameter
which must be substituted into the Black-Scholes option pricing formula to obtain
the quoted option price. Similarly, given a risk-neutral model, one can define the
function (T, K ) → I (T, K ) from the prices of vanilla options computed for that
model. However, since in most stochastic asset price models the implied volatility
function is not known explicitly, it becomes important to obtain efficient and accurate
asymptotic approximations for it. Such approximations are useful for at least two
reasons. First, they may shed light on the qualitative behavior of the implied volatility
in the asset price model, and also on the effect of different model parameters on
the shape of the model-generated implied volatility surface. Second, they allow one to
perform an approximate calibration of the model by comparing the market implied
volatility with the asymptotic approximation. Such preliminary estimates can be
used as intelligent guesses in the construction of a numerical calibration algorithm
to accelerate its convergence.
Approximations to the implied volatility have been studied by many authors in
a variety of asymptotic regimes, both in specific models and in model-independent
settings. One of the early references on the subject is the book by Lewis [31] dealing
with stochastic volatility models. Various model-free formulas describing the wing
behavior of the implied volatility were obtained in the last decade. To our knowledge,
Lee's celebrated moment formulas were the first model-independent asymptotic
formulas for the implied volatility at extreme strikes (see [30]). Lee's results were
later refined by Benaim and Friz [8, 9] and Gulisashvili [22–24]. In Gao and Lee
[19], higher order asymptotic formulas for the implied volatility at extreme strikes
were found, and in Tehranchi [41], uniform estimates for the implied volatility are
obtained. Small-time behavior of implied volatility is analyzed, among other papers,
in [11] (in local volatility models), [17] (for the Heston stochastic volatility model),
[33] (for jump-diffusions), and in [2, 16, 34, 38] (for exponential Lévy models).
Formulae for the implied volatility far from maturity are given in [18] (for the Hes-
ton model) and [40] (model-independent). Finally, sharp price and implied volatility
approximations for various models have been obtained as “expansions around the
Black-Scholes model” in [10, 21].
Implied volatility is also quoted in the market for options on a basket of stocks.
Note that the Black-Scholes formula can be applied to price a vanilla option by
considering the entire basket (index) as a log-normal random variable. In particular,
options on stock indices or major exchange traded funds are often liquid and quoted
in terms of their implied volatility. Several studies [5, 13, 29] explore the relationship
between the implied volatilities of index options and those of the constituents, with
the aim of designing dispersion trading strategies. Another example is provided
by swaptions, which are also quite liquid, often quoted in terms of their implied
volatility, and can be interpreted as basket options on the underlying Libor rates
[4, 37]. A tractable relationship between swaption and caplet implied volatilities

could be used to design a calibration procedure for the correlation structure of the
Libor rates.
In the above cases, finding reliable asymptotic approximations to the implied
volatility can be even more important, since calculating the exact value numerically
can be computationally very expensive due to the large dimension of the basket.
Approximations based on the small-noise asymptotics in multidimensional local
volatility models have been developed in [5] and more recently refined in [7], but in
other asymptotic regimes, much less is known about multi-asset options, than in the
single-asset case.
Our main goal in the present paper is to characterize the asymptotic behavior of
the implied volatility of a call option on a basket of stocks (with positive weights)
for large and small strikes. Three different classes of multidimensional risk-neutral
models with increasing generality are considered in the paper. In Sect. 3, we discuss
the case of correlated log-normal assets, in other words, the assets which follow the
multidimensional Black-Scholes model. Using a recent characterization of the tail
behavior of sums of correlated log-normal random variables [27], we obtain a sharp
asymptotic formula with error estimates for the implied volatility at small strikes. On
the other hand, the asymptotics of the implied volatility at large strikes can be easily
characterized using the results obtained in [3]. It turns out that for very large strikes,
the implied volatility of a basket call option converges to the highest volatility among
the stocks in the basket.
Section 4 deals with the case where the assets follow the multidimensional Black-
Scholes model time-changed by an independent increasing stochastic process. It is
assumed in this section that the marginal density of the time-change process decays
at infinity like the function $s\mapsto s^{\alpha}e^{-\theta s}$ with $\alpha\in\mathbf R$ and $\theta>0$. The class of such
models includes standard multidimensional extensions of various exponential Lévy
models, for instance, of the variance gamma model, the normal inverse Gaussian
model, or the generalized hyperbolic model. These extensions were previously
discussed in, e.g., [15, 32, 36]. To our knowledge, for such a class of multidimensional
models, the tail behavior of marginal distributions has not been studied before. In
Sect. 4, we provide two-sided estimates for the distribution function of the asset price
in the time-changed multidimensional Black-Scholes model, and use these estimates
to find the leading term in the asymptotic expansion of the implied volatility.
Finally, in Sect. 5, we deal with the case where the assets in the basket are corre-
lated, and the dependence structure is described by a given copula function (we refer
the reader to the book [12] for details on this modeling approach). Here we obtain
an asymptotic formula that can be considered as a generalization to the multidimen-
sional setting of one of the tail-wing formulae established in [9]. The new tail-wing
formula uses a special characteristic of the copula called weak lower tail dependence
function. This notion was recently introduced in [39].

Remarks on the notation used in the paper

• Let $f$ and $g$ be functions defined on $\mathbb{R}$, and let $a \in [-\infty, \infty]$. Throughout the
present paper, we write "$f \sim g$ as $x \to a$" provided that

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = 1. \]

We also use the notation "$f \lesssim g$ as $x \to a$" if

\[ \limsup_{x \to a} \frac{f(x)}{g(x)} \le 1, \]

and write "$f(x) \approx g(x)$ as $x \to a$" if there exist $c_1 > 0$ and $c_2 > 0$ such that

\[ c_1 g(x) \le f(x) \le c_2 g(x) \]

for all $x$ in some neighborhood of $a$.


• A positive function $f$ defined on $[a, \infty)$ for some $a > 0$ is called regularly varying
at infinity with index $\alpha \in \mathbb{R}$ if for any $\lambda > 0$,

\[ \lim_{x \to \infty} \frac{f(\lambda x)}{f(x)} = \lambda^{\alpha}. \]

The class of all regularly varying functions with index $\alpha$ is denoted
by $R_{\alpha}$. The elements of the class $R_0$ are called slowly varying functions. Regularly
varying functions at zero can be defined similarly.
• The following set will be used in the paper:

\[ \Delta_d := \Big\{ w \in \mathbb{R}^d : w_i \ge 0,\ i = 1, \dots, d,\ \text{and}\ \sum_{i=1}^{d} w_i = 1 \Big\}. \]

• Let $w \in \Delta_d$. We set

\[ E(w) := -\sum_{i=1}^{d} w_i \log w_i, \tag{1} \]

with the convention $x \log x = 0$ for $x = 0$.


Implied Volatility of Basket Options at Extreme Strikes 179

2 Model-Free Formulae for the Implied Volatility

Let $X_t$ be a non-negative martingale on a filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$.
Consider a stochastic model where the process $X$ models the price dynamics
of an asset. Define the call and put pricing functions in the price model described
above by

C(T, K ) = E[(X T − K )+ ] and P(T, K ) = E[(K − X T )+ ], (2)

respectively. Here T > 0 is the maturity, while K > 0 is the strike price.
The implied volatility $(T, K) \mapsto I(T, K)$ is determined from the following
equality:
\[ C(T, K) = C_{BS}(T, K, \sigma = I(T, K)), \]

where the symbol CBS stands for the Black-Scholes call pricing function. In the
sequel, the maturity T will be fixed, and the implied volatility will be considered as
a function of only the strike price.
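The definition of the implied volatility can be made concrete with a short numerical sketch. The code below is only an illustration under stated assumptions: zero interest rate and $X_0 = 1$ as in this section, a plain bisection solver, and hypothetical helper names (`norm_cdf`, `bs_call`, `implied_vol`) that do not come from the paper.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_call(T, K, sigma, x0=1.0):
    """Black-Scholes call price C_BS(T, K, sigma) with zero interest rate."""
    v = sigma * math.sqrt(T)
    d1 = (math.log(x0 / K) + 0.5 * v * v) / v
    d2 = d1 - v
    return x0 * norm_cdf(d1) - K * norm_cdf(d2)

def implied_vol(price, T, K, x0=1.0, lo=1e-6, hi=5.0, tol=1e-12):
    """Solve bs_call(T, K, sigma) = price for sigma by bisection.

    The call price is increasing in sigma, so bisection applies as long as
    the root lies in the (assumed) bracket [lo, hi].
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(T, K, mid, x0) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For instance, feeding a Black-Scholes price computed with volatility 0.3 back into `implied_vol` should recover 0.3 up to the solver tolerance; for a general model one would pass the model price $C(T, K)$ instead.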
We will next formulate two model-free asymptotic formulas, characterizing the
left-wing behavior of the implied volatility in terms of the put pricing function. These
formulas will be needed below. Suppose the initial condition for the price process is
X 0 = 1. Suppose also that the asset price model does not have atoms at zero. The
previous assumption means that P(X T = 0) = 0. Then the following asymptotic
formula (a zero-order formula for the implied volatility) holds:
\[
I(K) = \frac{\sqrt{2}}{\sqrt{T}} \left[ \sqrt{\log\frac{1}{\tilde P(K)} - \frac{1}{2}\log\log\frac{K}{\tilde P(K)}}
 - \sqrt{\log\frac{K}{\tilde P(K)} - \frac{1}{2}\log\log\frac{K}{\tilde P(K)}} \right]
 + O\left( \left( \log\frac{K}{\tilde P(K)} \right)^{-1/2} \right) \tag{3}
\]
as $K \to 0$. Here $\tilde P$ is a positive function satisfying the condition $P(K) \approx \tilde P(K)$ as
$K \to 0$. Formula (3) was established in [22] (see also Theorem 9.29 in [24]). The
fact that the absence of atoms is a necessary condition for the validity of formula (3)
was noticed in [14] (see also [25]).

The next asymptotic formula (a first-order formula for the implied volatility) can
be easily deduced from the results formulated in [24, Sects. 9.6 and 9.9]:
\[
I(K) = \frac{\sqrt{2}}{\sqrt{T}} \sqrt{\log\frac{1}{P(K)} - \frac{1}{2}\log\log\frac{K}{P(K)} + \log B(K)}
 - \frac{\sqrt{2}}{\sqrt{T}} \sqrt{\log\frac{K}{P(K)} - \frac{1}{2}\log\log\frac{K}{P(K)} + \log B(K)}
 + O\left( \log\log\frac{K}{P(K)} \left( \log\frac{K}{P(K)} \right)^{-3/2} \right) \tag{4}
\]
as $K \to 0$, where
\[
B(K) = \frac{\sqrt{\log\frac{1}{P(K)}} - \sqrt{\log\frac{K}{P(K)}}}{2\sqrt{\pi}\,\sqrt{\log\frac{1}{P(K)}}}. \tag{5}
\]

Formula (4) takes into account the results obtained in [19]. It provides more terms
in the asymptotic expansion of the implied volatility at small strikes than formula
(3) with $\tilde P = P$. More information on model-free formulas for the implied volatility
can be found in [24].

3 Basket Options in Multidimensional Black-Scholes Model

Our goal in the present section is to characterize the asymptotic behavior of the
implied volatility at small strikes in the case of a basket option of European style in
the n-dimensional driftless Black-Scholes model. We assume that the interest rate is
equal to zero. Let $S^1, \dots, S^n$ be a basket of assets such that

\[ \log \mathbf{S}_t = \log \mathbf{S}_0 - \frac{\operatorname{diag}(B)\, t}{2} + B^{1/2} W_t, \]

where $\mathbf{S}_t = (S_t^1, \dots, S_t^n)$, $\mathbf{S}_0 = (S_0^1, \dots, S_0^n)$, $W$ is an $n$-dimensional standard
Brownian motion, $B$ is the covariance matrix, and $\operatorname{diag}(B)$ stands for the main
diagonal of $B$. We denote by $(\lambda_1, \dots, \lambda_n) \in \Delta_n$ the weight vector associated with
the assets in the basket.
Consider the price process of the following form:

\[ S_t = \sum_{i=1}^{n} \lambda_i S_t^i, \quad t \ge 0. \tag{6} \]

The initial condition for the process $S$ is given by $S_0 = \sum_{i=1}^{n} \lambda_i S_0^i$, and we will
assume in the sequel that $S_0^i = 1$ for all $1 \le i \le n$. The previous condition implies
that $S_0 = 1$. Therefore,

\[ S_t = \sum_{i=1}^{n} \exp\{Y_t^i\}, \tag{7} \]

where

\[ Y_t^i = \log \lambda_i - \frac{b_{ii} t}{2} + \sum_{j=1}^{n} \beta_{ij} W_t^j, \quad 1 \le i \le n. \tag{8} \]

In (8), the symbols $\beta_{ij}$ stand for the elements of the matrix $B^{1/2}$. We also set

\[ \mu_{i,t} = \log \lambda_i - \frac{b_{ii} t}{2}, \quad 1 \le i \le n. \tag{9} \]

It is clear that the following equality holds: $\exp\{Y_t^i\} = \lambda_i S_t^i$, $t > 0$, $1 \le i \le n$.

3.1 Asymptotics of Put Pricing Functions in Multidimensional Black-Scholes Model

The distribution density of the random variable $S_T$ will be denoted by $p_T$. An asymptotic
formula for $p_T$ was recently established in [27]. Let us briefly recall the notation
used in that paper. Let $\bar w \in \Delta_n$ be the unique vector such that

\[ \bar w^{\perp} B \bar w = \min_{w \in \Delta_n} w^{\perp} B w. \tag{10} \]

The existence and uniqueness of $\bar w$ follow from the non-degeneracy of the matrix
$B$. We let

\[ \bar n := \operatorname{Card}\{i = 1, \dots, n : \bar w_i \ne 0\}, \quad \bar I := \{i = 1, \dots, n : \bar w_i \ne 0\} := \{\bar k(1), \dots, \bar k(\bar n)\}, \]

$\bar\mu \in \mathbb{R}^{\bar n}$ with $\bar\mu_i = \mu_{\bar k(i)}$, and $\bar B \in M_{\bar n}(\mathbb{R})$ with $\bar B_{ij} = B_{\bar k(i), \bar k(j)}$. The inverse matrix
of $\bar B$ is denoted by $\bar B^{-1}$, and the elements and the row sums of $\bar B^{-1}$ are denoted
by $\bar a_{ij}$ and $\bar A_k := \sum_{j=1}^{\bar n} \bar a_{kj}$, respectively. Since the variables $Y_1, \dots, Y_n$ in (7) are
exchangeable, we can assume with no loss of generality that for the covariance matrix
$B$, $\bar I = \{1, \dots, \bar n\}$ with $\bar n \le n$. By the strict convexity of the objective function, the
minimizer of $\min_{w \in \Delta_{\bar n}} w^{\perp} \bar B w$ coincides with the first $\bar n$ components of $\bar w$ and therefore
belongs to the interior of the set $\mathbb{R}^{\bar n}_{+}$. The minimizer over $\Delta_{\bar n}$ then coincides with the
minimizer over the set $\{w \in \mathbb{R}^{\bar n} : \sum_{i=1}^{\bar n} w_i = 1\}$, which means that

\[ (\bar w_i)_{i=1,\dots,\bar n} = \frac{\bar B^{-1} \mathbf{1}}{\mathbf{1}^{\perp} \bar B^{-1} \mathbf{1}}, \]

or, equivalently,

\[ \bar w_k = \frac{\bar A_k}{\sum_{i=1}^{\bar n} \bar A_i}, \quad k = 1, \dots, \bar n. \tag{11} \]

Since $\sum_{i=1}^{\bar n} \bar A_i > 0$ (the matrix $\bar B^{-1}$ is positive definite), this implies that $\bar A_k > 0$
for $k = 1, \dots, \bar n$.
We will next formulate a condition under which the asymptotic formula for the
density pT holds.

Assumption (A) For every $i \in \{1, \dots, n\} \setminus \bar I$, $(e^i - \bar w)^{\perp} B \bar w \ne 0$, where $e^i \in \mathbb{R}^n$
satisfies $e^i_j = 1$ if $i = j$ and $e^i_j = 0$ otherwise.
Assumption (A) is a natural nondegeneracy condition for our problem. The following
straightforward equality gives a relation between the optimization problem in (10)
and a similar problem without the normalization constraint:

\[ \inf_{w \in \Delta_n,\ r \ge 0} \left\{ \frac{r^2}{2} w^{\perp} B w - r \right\} = \inf_{v \in \mathbb{R}^n :\ v_i \ge 0,\ i = 1, \dots, n} \left\{ \frac{1}{2} v^{\perp} B v - \mathbf{1}^{\perp} v \right\}. \tag{12} \]

A minimizer $\bar v$ of the right-hand side can therefore be constructed from the minimizer
$\bar w$ of (10) as follows:

\[ \bar v = \frac{\bar w}{\bar w^{\perp} B \bar w}. \]

Now, introducing the vector $\lambda \in \mathbb{R}^n$ of Lagrange multipliers for the positivity constraints
on the right-hand side of (12), we get the Lagrangian $\frac{1}{2} v^{\perp} B v - \mathbf{1}^{\perp} v - \lambda^{\perp} v$.
At the extremum, therefore, $B \bar v = \mathbf{1} + \lambda$, or in other words,

\[ \frac{B \bar w}{\bar w^{\perp} B \bar w} = \mathbf{1} + \lambda. \]

Therefore, Assumption (A) simply states that for the constraints which are saturated,
the Lagrange multipliers are not equal to zero (since the constraints are inequalities,
this is equivalent to the strict positivity of the multipliers). This is generally true,
except when the solution of the unconstrained problem belongs to the boundary of
the domain defined by the constraints. Assumption (A) is not restrictive and is satisfied
in most applications. Note that if the row sums of the covariance matrix $\bar B$ satisfy
$A_i > 0$, $1 \le i \le n$, then Assumption (A) holds.

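It was established in [27] that under Assumption (A), the following asymptotic
formula is valid for the density $p_T$ of the price $S_T$ of the basket:
\[
p_T(x) = C_T \left( \log\frac{1}{x} \right)^{\frac{1 - \bar n}{2}}
 x^{-1 + \frac{1}{T} \sum_{k=1}^{\bar n} \bar A_k \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_k} + \bar\mu_{k,T} \right)}
 \exp\left\{ -\frac{1}{2T} (\bar A_1 + \cdots + \bar A_{\bar n}) \log^2\frac{1}{x} \right\}
 \left( 1 + O\left( \left( \log\frac{1}{x} \right)^{-1} \right) \right) \tag{13}
\]
as $x \to 0$, where the constant $C_T$ is given by
\[
C_T = \frac{1}{\sqrt{2\pi T}} \sqrt{\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{|\bar B|\, \bar A_1 \cdots \bar A_{\bar n}}}
 \exp\left\{ -\frac{1}{2T} \sum_{i,j=1}^{\bar n} \bar a_{ij}
 \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_i} + \bar\mu_{i,T} \right)
 \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_j} + \bar\mu_{j,T} \right) \right\}. \tag{14}
\]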

Using formula (13), we can characterize the asymptotic behavior of the put pricing
function P at small strikes. This can be done as follows. Consider the fractional
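integral of order two defined by
\[ F_2 M(\sigma) = \int_{\sigma}^{\infty} (\tau - \sigma) M(\tau)\, d\tau, \tag{15} \]

where $M$ is a positive function on $(0, \infty)$. Since

\[ P(K) = \int_0^K (K - x)\, p_T(x)\, dx, \]

it is not hard to see that

\[ P(K) = S^{-1} F_2 M(S), \quad \text{where } S = K^{-1} \text{ and } M(y) = y^{-3} p_T\left( y^{-1} \right). \tag{16} \]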
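The substitution behind (16) can be written out explicitly. The following is only a verification sketch based on the change of variables $x = \tau^{-1}$; it uses the notation of (15) and (16) and nothing else.

```latex
\begin{align*}
P(K) &= \int_0^K (K - x)\, p_T(x)\, dx
      = \int_S^{\infty} \left( \frac{1}{S} - \frac{1}{\tau} \right) p_T(\tau^{-1})\, \tau^{-2}\, d\tau
      \qquad (x = \tau^{-1},\ S = K^{-1}) \\
     &= S^{-1} \int_S^{\infty} (\tau - S)\, \tau^{-3} p_T(\tau^{-1})\, d\tau
      = S^{-1} \int_S^{\infty} (\tau - S)\, M(\tau)\, d\tau
      = S^{-1} F_2 M(S).
\end{align*}
```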

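Using (13), we get
\[ M(y) = M_1(y) \left( 1 + O\left( (\log y)^{-1} \right) \right) \tag{17} \]
as $y \to \infty$, where
\[
M_1(y) = C_T (\log y)^{\frac{1 - \bar n}{2}}
 y^{-2 - T^{-1} \sum_{k=1}^{\bar n} \bar A_k \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_k} + \mu_{k,T} \right)}
 \exp\left\{ -\frac{1}{2T} (\bar A_1 + \cdots + \bar A_{\bar n}) \log^2 y \right\}, \quad y > y_0. \tag{18}
\]

In (18), the constant $C_T$ is given by (14). Set

\[ M_2(y) = (\log y)^{-1} M_1(y). \tag{19} \]

It follows from (17) that there exist $c > 0$ and $y_1 > 0$ such that

\[ |M(y) - M_1(y)| \le c M_2(y), \quad y > y_1. \tag{20} \]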

In [26], a general asymptotic formula was obtained for fractional integrals (see
also Theorem 5.3 in [24]). We will next formulate this general result. Suppose

\[ M(y) = a(y) e^{-b(y)} \quad \text{for all } y \ge c, \]

where $c > 0$ is some number. Suppose also that the following conditions hold:
1. $y |a'(y)| \le \gamma a(y)$ for some $\gamma > 0$ and all $y > c$.
2. $b(y) = B(\log y)$, where $B$ is a positive increasing function on $(c, \infty)$ such that
$B''(y) \approx 1$ as $y \to \infty$.
Then as $\sigma \to \infty$,

\[ F_2 M(\sigma) = \frac{M(\sigma)}{b'(\sigma)^2} \left( 1 + O\left( (\log \sigma)^{-1} \right) \right). \tag{21} \]

The functions $M_1$ and $M_2$ defined in (18) and (19) satisfy the
conditions in the theorem formulated above. Applying this theorem, we obtain

\[ F_2 M_i(\sigma) = \frac{M_i(\sigma)}{b'(\sigma)^2} \left( 1 + O\left( (\log \sigma)^{-1} \right) \right) \tag{22} \]

as $\sigma \to \infty$, where $i = 1, 2$ and

\[ b(u) = \frac{1}{2T} (\bar A_1 + \cdots + \bar A_{\bar n}) \log^2 u. \tag{23} \]

It follows from (19), (20), and (22) that

\[ F_2 M(\sigma) = F_2 M_1(\sigma) + O(F_2 M_2(\sigma)) = \frac{M_1(\sigma)}{b'(\sigma)^2} \left( 1 + O\left( (\log \sigma)^{-1} \right) \right) \tag{24} \]

as σ → ∞. Now, using (16), (23), and (24), we establish the following assertion.
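Theorem 1 Let $P$ be the price of the put option defined in (2), and suppose Assumption (A)
holds for the covariance matrix $B$ (see [27]). Then, as $K \to 0$,
\[
P(K) = \delta_0 \left( \log\frac{1}{K} \right)^{\delta_1} \left( \frac{1}{K} \right)^{\delta_2}
 \exp\left\{ -\delta_3 \log^2\frac{1}{K} \right\}
 \left( 1 + O\left( \left( \log\frac{1}{K} \right)^{-1} \right) \right), \tag{25}
\]

where
\[
\delta_0 = \frac{C_T T^2}{(\bar A_1 + \cdots + \bar A_{\bar n})^2}, \quad \delta_1 = -\frac{3 + \bar n}{2},
\]
\[
\delta_2 = -1 - \frac{1}{T} \sum_{k=1}^{\bar n} \bar A_k \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_k} + \mu_{k,T} \right), \quad
\delta_3 = \frac{1}{2T} (\bar A_1 + \cdots + \bar A_{\bar n}),
\]

and $C_T$ is given by (14).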


Formula (25) will be used in the next subsection to characterize the left-wing
behavior of the implied volatility associated with a basket option in the multidimen-
sional Black-Scholes model.

3.2 Left-Wing Asymptotic Behavior of the Implied Volatility Associated with Basket Options

The next statement characterizes the asymptotic behavior of the implied volatility
for small strikes.
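Theorem 2 Suppose Assumption (A) holds for the covariance matrix $B$. Then, as
$K \to 0$,
\[
I(K) = \frac{1}{\sqrt{\bar A_1 + \cdots + \bar A_{\bar n}}}
 - \frac{2 \sum_{k=1}^{\bar n} \bar A_k \left( \log\frac{\bar A_1 + \cdots + \bar A_{\bar n}}{\bar A_k} + \mu_{k,T} \right) + T}{2 (\bar A_1 + \cdots + \bar A_{\bar n})^{3/2}} \left( \log\frac{1}{K} \right)^{-1}
 - \frac{T (\bar n - 1)}{2 (\bar A_1 + \cdots + \bar A_{\bar n})^{3/2}} \left( \log\frac{1}{K} \right)^{-2} \log\log\frac{1}{K}
 + O\left( \left( \log\frac{1}{K} \right)^{-2} \right). \tag{26}
\]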

Remark 1 The leading term in the implied volatility expression above can also be
written as
\[ \lim_{K \downarrow 0} I(K) = \frac{1}{\sqrt{\bar A_1 + \cdots + \bar A_{\bar n}}} = \sqrt{\min_{w \in \Delta_n} w^{\perp} B w}. \tag{27} \]

Formula (27) for the leading term of the implied volatility holds even if assumption
(A) is not satisfied—in this case, this formula can be obtained as a corollary of
Theorem 9 of this paper.
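The second equality in (27) can be checked directly from the representation of the minimizer preceding (11). The short computation below is a sketch that uses only the symmetry of $\bar B$ and the fact that $\bar w$ vanishes outside $\bar I$, so that $\bar w^{\perp} B \bar w$ reduces to a quadratic form in $\bar B$.

```latex
\begin{align*}
\min_{w \in \Delta_n} w^{\perp} B w
 = \bar w^{\perp} B \bar w
 = \frac{\mathbf{1}^{\perp} \bar B^{-1}\, \bar B\, \bar B^{-1} \mathbf{1}}{(\mathbf{1}^{\perp} \bar B^{-1} \mathbf{1})^2}
 = \frac{1}{\mathbf{1}^{\perp} \bar B^{-1} \mathbf{1}}
 = \frac{1}{\bar A_1 + \cdots + \bar A_{\bar n}} .
\end{align*}
```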

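Proof It follows from (25) that as $K \to 0$,

\[
\log\frac{1}{P(K)} = \log\frac{1}{\delta_0} - \delta_1 \log\log\frac{1}{K} - \delta_2 \log\frac{1}{K} + \delta_3 \log^2\frac{1}{K}
 + O\left( \left( \log\frac{1}{K} \right)^{-1} \right) \tag{28}
\]

and
\[
\log\frac{K}{P(K)} = \log\frac{1}{\delta_0} - \delta_1 \log\log\frac{1}{K} - (\delta_2 + 1) \log\frac{1}{K}
 + \delta_3 \log^2\frac{1}{K} + O\left( \left( \log\frac{1}{K} \right)^{-1} \right), \tag{29}
\]

where $\delta_0$, $\delta_1$, $\delta_2$, and $\delta_3$ are as in Theorem 1. Moreover, the error term in (4)
can be represented as follows:
\[ O\left( \log\log\frac{1}{K} \left( \log\frac{1}{K} \right)^{-3} \right). \tag{30} \]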

We will next characterize the asymptotic behavior of $\log B(K)$ as $K \to 0$. Denote
the functions on the right-hand side of (28) and (29) by $V_1(K)$ and $V_2(K)$, respectively.
Then, using (5), (28), and (29), we obtain
\[ \log B(K) = \log\frac{1}{2\sqrt{\pi}} + \log\left( 1 - \sqrt{1 - \frac{V_1(K) - V_2(K)}{V_1(K)}} \right). \]

It is easy to see that $\log(1 - \sqrt{1 - h}) = \log\frac{h}{2} + O(h)$ as $h \to 0$. Put $h = \frac{V_1(K) - V_2(K)}{V_1(K)}$.
Then we have
\[ \log B(K) = \log\frac{1}{2\sqrt{\pi}} + \log\frac{V_1(K) - V_2(K)}{2 V_1(K)} + O\left( \left( \log\frac{1}{K} \right)^{-1} \right), \]

and hence
\[ \log B(K) = \log\frac{1}{4\sqrt{\pi}\, \delta_3} - \log\log\frac{1}{K} + O\left( \left( \log\frac{1}{K} \right)^{-1} \right) \tag{31} \]

as $K \to 0$.
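Our next goal is to simplify formula (4) by taking into account (28), (29), and
(31), and replacing the error term by the expression in (30). We can drop the terms
$O\left( \left( \log\frac{1}{K} \right)^{-1} \right)$ in (28), (29), and (31), using the mean value theorem. This will
introduce an error term $O\left( \left( \log\frac{1}{K} \right)^{-2} \right)$ in the formula that follows from formula
(4). Thus
\[
I(K) = \frac{\sqrt{2}}{\sqrt{T}} \sqrt{\tilde V_1(K) - \frac{1}{2} \log \tilde V_2(K) + \log\frac{1}{4\sqrt{\pi}\, \delta_3} - \log\log\frac{1}{K}}
 - \frac{\sqrt{2}}{\sqrt{T}} \sqrt{\tilde V_2(K) - \frac{1}{2} \log \tilde V_2(K) + \log\frac{1}{4\sqrt{\pi}\, \delta_3} - \log\log\frac{1}{K}}
 + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{32}
\]

as $K \to 0$, where $\tilde V_1(K)$ and $\tilde V_2(K)$ denote the functions on the right-hand side of
(28) and (29), respectively, without the terms $O\left( \left( \log\frac{1}{K} \right)^{-1} \right)$. Next, using the mean
value theorem, we see that it is possible to replace $\tilde V_2(K)$ in the expression $\log \tilde V_2(K)$
in formula (32) by $\delta_3 \log^2\frac{1}{K}$. Now, taking into account the definitions of $\tilde V_1(K)$ and
$\tilde V_2(K)$, we obtain
\[
I(K) = \frac{\sqrt{2}}{\sqrt{T}} \sqrt{-\log\left( 4\sqrt{\pi}\, \delta_0 \delta_3^{3/2} \right) - (\delta_1 + 2) \log\log\frac{1}{K} - \delta_2 \log\frac{1}{K} + \delta_3 \log^2\frac{1}{K}}
 - \frac{\sqrt{2}}{\sqrt{T}} \sqrt{-\log\left( 4\sqrt{\pi}\, \delta_0 \delta_3^{3/2} \right) - (\delta_1 + 2) \log\log\frac{1}{K} - (\delta_2 + 1) \log\frac{1}{K} + \delta_3 \log^2\frac{1}{K}}
 + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{33}
\]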
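as $K \to 0$. Put
\[
h_1(K) = \frac{-\log\left( 4\sqrt{\pi}\, \delta_0 \delta_3^{3/2} \right) - (\delta_1 + 2) \log\log\frac{1}{K} - \delta_2 \log\frac{1}{K}}{\delta_3 \log^2\frac{1}{K}}
\]

and
\[
h_2(K) = \frac{-\log\left( 4\sqrt{\pi}\, \delta_0 \delta_3^{3/2} \right) - (\delta_1 + 2) \log\log\frac{1}{K} - (\delta_2 + 1) \log\frac{1}{K}}{\delta_3 \log^2\frac{1}{K}}.
\]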
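It follows from (33) that
\[
I(K) = \frac{\sqrt{2}\sqrt{\delta_3}}{\sqrt{T}} \log\frac{1}{K} \left[ \sqrt{1 + h_1(K)} - \sqrt{1 + h_2(K)} \right] + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{34}
\]

as $K \to 0$. Next, using the formula $\sqrt{1 + h} = 1 + \frac{1}{2} h - \frac{1}{8} h^2 + O(h^3)$ as $h \to 0$ in
(34), we get
\[
I(K) = \frac{1}{\sqrt{2T\delta_3}} + \frac{1 + 2\delta_2}{4\delta_3 \sqrt{2T\delta_3}} \left( \log\frac{1}{K} \right)^{-1}
 + \frac{\delta_1 + 2}{2\delta_3 \sqrt{2T\delta_3}} \log\log\frac{1}{K} \left( \log\frac{1}{K} \right)^{-2}
 + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{35}
\]

as $K \to 0$. Finally, plugging the values of $\delta_1$, $\delta_2$, and $\delta_3$ given in Theorem 1 into
formula (35), we obtain formula (26).
This completes the proof of Theorem 2.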

Remark 2 (Implied volatility in the multidimensional Black-Scholes model for large
strikes.) From Theorem 1 in [3], it follows that

\[ P[S_t \ge K] \sim \frac{m_n \sigma \sqrt{t}}{\sqrt{2\pi} \log K} \exp\left\{ -\frac{(\log K - \mu)^2}{2\sigma^2 t} \right\}, \quad K \to \infty, \]

where $\sigma^2 = \max_{k=1,\dots,n} B_{kk}$, $\mu = \max_{k : B_{kk} = \sigma^2} \mu_{k,t}$, and $m_n = \#\{k : B_{kk} = \sigma^2, \mu_{k,t} = \mu\}$.
From this result, we easily deduce that

\[ E[(S_t - K)^+] \approx \frac{K}{\log^2 K} \exp\left\{ -\frac{(\log K - \mu)^2}{2\sigma^2 t} \right\}, \quad K \to \infty. \]

Applying Corollary 2.4 in [22] (which is nothing but the right-tail version of formula
(3)), we conclude that
\[ I(K) = \sigma + O\left( \frac{\psi(K)}{\log K} \right) \]

as $K \to +\infty$, where $\psi$ is any function satisfying $\psi(K) \to +\infty$ as $K \to +\infty$.
The function $\psi$ can be removed from the error estimate in the previous formula,
using Lemma 3.1, part 1, in [24]. The resulting formula is as follows:
\[ I(K) = \sigma + O\left( \frac{1}{\log K} \right) \]

as $K \to \infty$.

Numerical illustration In this part of the paper we compare the theoretical left-
tail limit of the implied volatility given by Formula (27) with the numerical values
computed by Monte Carlo in the multidimensional Black-Scholes model. Figure 1
plots the implied volatility of two basket call options as function of the strike price
with 2-standard deviation confidence intervals (for 5 million paths), as well as the
horizontal line corresponding to the theoretical limit.
In the left graph, the basket contains two independent identical assets following
the Black-Scholes model with volatility σ = 0.3. In the right graph, the basket
contains ten identical assets following the multidimensional Black-Scholes model,
where the volatility of every component is σ = 0.3 and the correlation between the
log-prices of different components is ρ = 0.5. The maturity of the options is T = 0.2
years in both graphs.
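The theoretical limits drawn as horizontal lines can be reproduced from Formula (27). The sketch below is an illustration only: the function name is hypothetical, and it assumes that for identical, equicorrelated assets the minimizer of $w^{\perp} B w$ over the simplex is the uniform weight vector (which follows from symmetry and strict convexity when the covariance matrix is non-degenerate).

```python
import math

def left_wing_limit_equicorrelated(n, sigma, rho):
    """Zero-order left-wing limit sqrt(min_w w'Bw) from Formula (27).

    ASSUMPTION: n identical assets with volatility sigma and common
    correlation rho (with -1/(n-1) < rho < 1), so that the minimizer over
    the simplex is w_i = 1/n and
        w'Bw = sigma^2 * (1/n + rho * (1 - 1/n)).
    """
    quad_form = sigma ** 2 * (1.0 / n + rho * (1.0 - 1.0 / n))
    return math.sqrt(quad_form)

# Parameters of the two graphs described in the text.
left_graph = left_wing_limit_equicorrelated(2, 0.3, 0.0)    # sigma / sqrt(2)
right_graph = left_wing_limit_equicorrelated(10, 0.3, 0.5)  # sqrt(0.0495)
```

Under these assumptions the limit is about 0.212 for the two independent assets and about 0.222 for the ten correlated assets, both well below the single-asset volatility $\sigma = 0.3$.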
We observe that in both cases the volatility is almost constant as a function of
strike (note the scale on the vertical axis), and for all strikes it is very close to the
theoretical limit of Formula (27). We only show the zero-order term of the expansion
in Theorem 2 because the higher-order terms do not lead to an improvement of the
approximation for the strikes shown in the graph. Indeed, the higher-order terms in
this expansion have a singularity at $K = 1$ and take "reasonable" values only when
$\log\frac{1}{K}$ is very large.
For comparison, we also plot the implied volatility in the right wing in Fig. 2.
According to Remark 2, in the right wing, the implied volatility must converge to
σ = 0.3. However, from the graph in Fig. 2 we see that this convergence is very slow:
for all strike values for which option prices may be computed without sophisticated

Fig. 1 Implied volatility of a basket call option in the multidimensional Black-Scholes model
together with the theoretical 0-order approximation for the left wing. Left: option on a basket of 2
identical assets. Right: option on a basket of 10 identical assets

Fig. 2 Right wing of the implied volatility of a basket call option in the two-dimensional Black-
Scholes model together with the theoretical 0-order left-wing approximation

variance reduction, the implied volatility, although it increases slightly with strike,
remains close to its left-wing limit.

3.3 The Case Where n = 2

The detailed discussion of the behavior of the distribution of the sum of two log-
normal variables can be found in [20, 27]. The covariance matrix in this case is as
follows: $B = [b_{ij}]$, where $b_{11} = \sigma_1^2$, $b_{12} = b_{21} = \rho\sigma_1\sigma_2$, $b_{22} = \sigma_2^2$ with $\sigma_1 > 0$,
$\sigma_2 > 0$, and the correlation coefficient satisfies $-1 < \rho < 1$. We will also assume
$\sigma_1 \ge \sigma_2$. Note that the case where $\rho < \frac{\sigma_2}{\sigma_1}$ is a regular case, and Assumption (A)
holds. In the case where $\rho > \frac{\sigma_2}{\sigma_1}$, we have to rearrange the rows and the columns
of $B$ (see the example in Sect. 2.1 of [27]). Then $\bar B = (\sigma_2^2)$, and Assumption (A)
holds. The case where $\rho = \frac{\sigma_2}{\sigma_1}$ is exceptional. Here Assumption (A) does not hold.
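The following asymptotic formulas for the implied volatility follow from (26):
• Suppose $\rho > \frac{\sigma_2}{\sigma_1}$. Then

\[ I(K) = \sigma_2 - \sigma_2 \log\lambda_2 \left( \log\frac{1}{K} \right)^{-1} + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{36} \]

as $K \to 0$.
• Suppose $\rho < \frac{\sigma_2}{\sigma_1}$. Then

\[
I(K) = \sigma_\infty - \sigma_\infty \left[ \frac{T}{2} \sigma_\infty^2 + \left( \log\lambda_1 - \frac{\sigma_1^2 T}{2} - \log\bar v \right) \bar v
 + \left( \log\lambda_2 - \frac{\sigma_2^2 T}{2} - \log(1 - \bar v) \right) (1 - \bar v) \right] \left( \log\frac{1}{K} \right)^{-1}
 - \frac{T}{2} \sigma_\infty^3 \frac{\log\log\frac{1}{K}}{\log^2\frac{1}{K}} + O\left( \left( \log\frac{1}{K} \right)^{-2} \right) \tag{37}
\]

as $K \to 0$, where

\[ \sigma_\infty = \frac{\sigma_1 \sigma_2 \sqrt{1 - \rho^2}}{\sqrt{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}} \quad \text{and} \quad \bar v = \frac{\sigma_2 (\sigma_2 - \rho\sigma_1)}{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}. \]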
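As a consistency check, formula (36) can be recovered from (26). The computation below is a sketch for the case $\rho > \sigma_2/\sigma_1$, where $\bar n = 1$ and $\bar B = (\sigma_2^2)$, so that $\bar A_1 = \sigma_2^{-2}$ and, by (9), the relevant drift is $\log\lambda_2 - \sigma_2^2 T/2$.

```latex
\begin{align*}
\frac{1}{\sqrt{\bar A_1}} &= \sigma_2, \qquad \bar n - 1 = 0, \\
\frac{2 \bar A_1 \left( \log\frac{\bar A_1}{\bar A_1} + \log\lambda_2 - \frac{\sigma_2^2 T}{2} \right) + T}{2 \bar A_1^{3/2}}
 &= \frac{\sigma_2^3}{2} \left( \frac{2 \log\lambda_2}{\sigma_2^2} - T + T \right)
 = \sigma_2 \log\lambda_2 .
\end{align*}
```

Hence the first-order coefficient in (26) equals $\sigma_2 \log\lambda_2$ and the $\log\log$ term vanishes, which is exactly (36).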

Therefore, the behavior of the implied volatility experiences a qualitative change
(phase transition) at $\rho^* = \frac{\sigma_2}{\sigma_1}$. Indeed, for $\rho < \rho^*$, the expression in formula (37),
approximating the left wing of the implied volatility, depends on the correlation coefficient,
while for $\rho > \rho^*$ the left wing is approximated by a correlation-independent
expression (see (36)).
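The phase transition can also be seen numerically by solving the minimization problem (10) for $n = 2$. The sketch below is illustrative only (the function name is hypothetical): it minimizes $w^{\perp} B w$ on the segment $w_1 + w_2 = 1$, $w_i \ge 0$, by clipping the unconstrained minimizer, which is legitimate because the problem is a one-dimensional convex quadratic.

```python
import math

def two_asset_minimizer(sigma1, sigma2, rho):
    """Return (w1, w2, sigma_left) with (w1, w2) the minimizer of w'Bw over
    the simplex and sigma_left = sqrt(min w'Bw), the leading term of I(K)."""
    b11, b22, b12 = sigma1 ** 2, sigma2 ** 2, rho * sigma1 * sigma2
    denom = b11 + b22 - 2.0 * b12      # positive when |rho| < 1
    w1 = (b22 - b12) / denom           # minimizer on the line w1 + w2 = 1
    w1 = min(1.0, max(0.0, w1))        # clip to [0, 1] (1-D convex problem)
    w2 = 1.0 - w1
    q = w1 * w1 * b11 + 2.0 * w1 * w2 * b12 + w2 * w2 * b22
    return w1, w2, math.sqrt(q)

# Example values (assumed, not from the paper): sigma1 = 0.4, sigma2 = 0.3,
# so the critical correlation is rho* = 0.75.
regular = two_asset_minimizer(0.4, 0.3, 0.0)     # rho < rho*: interior minimizer
degenerate = two_asset_minimizer(0.4, 0.3, 0.9)  # rho > rho*: w = (0, 1)
```

For $\rho < \rho^*$ the first weight agrees with $\bar v$ and the third output with $\sigma_\infty$; for $\rho > \rho^*$ the minimizer sits at the vertex $(0, 1)$ and the leading term is $\sigma_2$, independently of $\rho$.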
We will next discuss the asymptotic behavior of the implied volatility in the
exceptional case where n = 2 and ρ = ρ∗ . The following formula holds for the
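distribution density $p_T$ in the exceptional case (see [20]):

\[
p_T(x) \approx x^{\frac{\mu_{2,T}}{T\sigma_2^2} - 1} \left( \log\frac{1}{x} \right)^{-\frac{1}{T(\sigma_1^2 - \sigma_2^2)}} \left( \log\log\frac{1}{x} \right)^{-\frac{1}{2}}
\]
\[
\times \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left[ \log\left( \frac{1}{\rho^2} - 1 \right) + \log\log\frac{1}{x}
 - \log\left( \log\left( \frac{1}{\rho^2} - 1 \right) + \log\log\frac{1}{x} \right) + \mu_{1,T} - \mu_{2,T} \right]^2 \right\}
\]
\[
\times \exp\left\{ -\frac{\log^2\frac{1}{x}}{2T\sigma_2^2} \right\} \tag{38}
\]

as $x \to 0$. Recall that we assume that $\mu = 0$. Recall also that $\mu_{1,T}$ and $\mu_{2,T}$ are
defined in (9).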
Remark 3 Formula (38) can be derived from formula (B20) established at the end
of the proof of part (ii) of Theorem 2.3 in [20]. Note that in the present paper we
assume σ1 ≥ σ2 , while in [20], σ1 ≤ σ2 .

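Set
\[ V_{1,T} = \log\left( \frac{1}{\rho^2} - 1 \right) + \mu_{1,T} - \mu_{2,T} \quad \text{and} \quad V_2 = \log\left( \frac{1}{\rho^2} - 1 \right). \tag{39} \]

It is not hard to see using the mean value theorem that

\[ \log^2\left( V_2 + \log\log\frac{1}{x} \right) - \left( \log\log\log\frac{1}{x} \right)^2 = o(1) \]

as $x \to 0$. Hence
\[
\exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \log^2\left( V_2 + \log\log\frac{1}{x} \right) \right\}
 \sim \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left( \log\log\log\frac{1}{x} \right)^2 \right\}
\]

as $x \to 0$. In addition,
\[
\exp\left\{ \frac{1}{T(\sigma_1^2 - \sigma_2^2)} \log\log\frac{1}{x}\, \log\left( V_2 + \log\log\frac{1}{x} \right) \right\}
 \approx \left( \log\frac{1}{x} \right)^{\frac{1}{T(\sigma_1^2 - \sigma_2^2)}} \exp\left\{ \frac{1}{T(\sigma_1^2 - \sigma_2^2)} \log\log\frac{1}{x}\, \log\log\log\frac{1}{x} \right\}
\]

as $x \to 0$. Therefore, (38) implies the following estimate for the density $p_T$:

\[
p_T(x) \approx \left( \frac{1}{x} \right)^{1 - \frac{\mu_{2,T}}{T\sigma_2^2}} \left( \log\frac{1}{x} \right)^{-\frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)}} \left( \log\log\frac{1}{x} \right)^{\frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)} - \frac{1}{2}}
\]
\[
\times \exp\left\{ -\frac{\log^2\frac{1}{x}}{2T\sigma_2^2} \right\} \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left( \log\log\frac{1}{x} \right)^2 \right\}
 \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left( \log\log\log\frac{1}{x} \right)^2 \right\}
\]
\[
\times \exp\left\{ \frac{1}{T(\sigma_1^2 - \sigma_2^2)} \log\log\frac{1}{x}\, \log\log\log\frac{1}{x} \right\} \tag{40}
\]

as $x \to 0$.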

Our next goal is to obtain a two-sided estimate for the put pricing function P,
by taking into account formula (40). We will use the ideas employed in the proof of
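Theorem 1. Let us set

\[
B(u) = \frac{u^2}{2T\sigma_2^2} + \frac{\log^2 u}{2T(\sigma_1^2 - \sigma_2^2)} + \frac{(\log\log u)^2}{2T(\sigma_1^2 - \sigma_2^2)} - \frac{(\log u)(\log\log u)}{T(\sigma_1^2 - \sigma_2^2)}
\]

and
\[
a(y) = y^{-2 - \frac{\mu_{2,T}}{T\sigma_2^2}} (\log y)^{-\frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)}} (\log\log y)^{\frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)} - \frac{1}{2}}.
\]

It is not hard to see that the restrictions under which formula (21) is valid are
satisfied. In addition, for the function $b(x) = B(\log x)$, we have $b'(x) \approx \frac{\log x}{x}$ as
$x \to \infty$. Now, reasoning as in the proof of Theorem 1, we obtain the following
formula: $P(K) \approx \tilde P(K)$ as $K \to 0$, where

\[
\tilde P(K) = \left( \frac{1}{K} \right)^{-1 - \frac{\mu_{2,T}}{T\sigma_2^2}} \left( \log\frac{1}{K} \right)^{-2 - \frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)}} \left( \log\log\frac{1}{K} \right)^{\frac{V_{1,T}}{T(\sigma_1^2 - \sigma_2^2)} - \frac{1}{2}}
\]
\[
\times \exp\left\{ -\frac{\log^2\frac{1}{K}}{2T\sigma_2^2} \right\} \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left( \log\log\frac{1}{K} \right)^2 \right\}
 \exp\left\{ -\frac{1}{2T(\sigma_1^2 - \sigma_2^2)} \left( \log\log\log\frac{1}{K} \right)^2 \right\}
\]
\[
\times \exp\left\{ \frac{1}{T(\sigma_1^2 - \sigma_2^2)} \log\log\frac{1}{K}\, \log\log\log\frac{1}{K} \right\} \tag{41}
\]

as $K \to 0$. Next, using (3) with $\tilde P$ given by (41), and making numerous simplifications,
we obtain the following asymptotic formula for the implied volatility in the
exceptional case:
\[ I(K) = \sigma_2 + O\left( \left( \log\frac{1}{K} \right)^{-1} \right) \tag{42} \]

as $K \to 0$. Comparing formula (42) with formulas (36) and (37), we see that the
behavior of the implied volatility at the critical point $\rho = \frac{\sigma_2}{\sigma_1}$, where the qualitative
change happens, is similar to that in the case where $\rho > \frac{\sigma_2}{\sigma_1}$.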

4 Time-Changed Multidimensional Black-Scholes Model

Recall that in Sect. 3, we introduced the price process S for a basket of assets (see
formula (6)). The present section deals with time changes in such processes. Suppose
$\tau_t$, $t \ge 0$, is a non-negative non-decreasing stochastic process on $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$
(a time change). Then, the time-changed process S has the following form: t → Sτt .
We only consider time changes which are independent of the price process S. In the
next subsections, two-sided estimates for marginal distribution functions of time-
changed price processes such as above will be established. Moreover, the leading
term in the asymptotic expansion of the implied volatility associated with a time-
changed price process t → Sτt in the n-dimensional Black-Scholes model will be
found.

4.1 Bounds on Distribution Functions of Sums of Log-Normal Mixtures

The next assertion provides an upper bound for the distribution function of a random
variable imitating the random variable Sτt for fixed t > 0. The additional drift vector
μ̃ will be needed later to ensure the martingale property.
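Theorem 3 (Upper bound) Let $Y$ be a centered Gaussian vector with covariance
matrix $B = [b_{ij}]_{1 \le i, j \le n}$, and let $\mu \in \mathbb{R}^n$ and $\tilde\mu \in \mathbb{R}^n$. Suppose $Z$ is a random
variable with values in $(0, \infty)$, which has a density $\rho(x)$ satisfying $\rho(s) \le c s^{\alpha} e^{-\theta s}$
for $s \ge 1$, where $\theta > 0$, $c > 0$ and $\alpha \in \mathbb{R}$ are constants. Then, there exists $C > 0$
such that as $k \to +\infty$,

\[ P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{Z} + \mu_i Z + \tilde\mu_i} \le e^{-k} \Big] \lesssim C k^{\alpha} e^{-c^* k}, \]

where
\[ c^* = \min_{t \ge 0} \max_{w \in \Delta_n} \left\{ \theta t + \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t} \right\}. \tag{43} \]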

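Proof In this proof, $C$ denotes a constant which may change from line to line. For
$k > 0$, set
\[ F_t(k) = P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{kt} + \mu_i kt + \tilde\mu_i} \le e^{-k} \Big]. \]

Fix $w \in \Delta_n$, and let $t$ be such that $1 + t\mu^{\perp} w > 0$. Then, by Jensen's inequality,
\begin{align*}
P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{kt} + \mu_i kt + \tilde\mu_i} \le e^{-k} \Big]
 &\le P\Big[ \sqrt{kt} \sum_{i=1}^{n} w_i Y_i + kt\mu^{\perp} w + \tilde\mu^{\perp} w + E(w) \le -k \Big] \\
 &= N\left( -\frac{k + tk\mu^{\perp} w + \tilde\mu^{\perp} w + E(w)}{\sqrt{w^{\perp} B w\, kt}} \right) \\
 &\le \frac{C\sqrt{t}}{(1 + t\mu^{\perp} w)\sqrt{k}} \exp\left\{ -\frac{(k + tk\mu^{\perp} w + \tilde\mu^{\perp} w + E(w))^2}{2 w^{\perp} B w\, kt} \right\} \\
 &= \frac{C\sqrt{t}}{(1 + t\mu^{\perp} w)\sqrt{k}} \exp\left\{ -k \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t} \right\} \exp\left\{ -\frac{(\tilde\mu^{\perp} w + E(w))^2}{2 w^{\perp} B w\, kt} \right\} \\
 &\quad \times \exp\left\{ -\frac{E(w) + \tilde\mu^{\perp} w}{w^{\perp} B w\, t} \right\} \exp\left\{ -\frac{\mu^{\perp} w (E(w) + \tilde\mu^{\perp} w)}{w^{\perp} B w} \right\} \\
 &\le \frac{C\sqrt{t}}{(1 + t\mu^{\perp} w)\sqrt{k}} \exp\left\{ -k \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t} \right\} \exp\left\{ -\frac{\tilde\mu^{\perp} w}{w^{\perp} B w\, t} \right\},
\end{align*}

where $E(w)$ is defined by (1).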


Consider the following function:

\[ F(t, w) = \theta t + \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t}. \]
The following lemma establishes some properties of this function. The proof is given
in the appendix.
Lemma 1 There exists a unique couple $(\bar t, \bar w)$, with $\bar t \in (0, \infty)$ and $\bar w \in \Delta_n$, such
that
\[ F(\bar t, \bar w) = \min_{t > 0} \max_{w \in \Delta_n} F(t, w). \]

In addition, the function

\[ f(t) = F(t, \bar w) \]

has a unique minimum at the point $\bar t$.


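We clearly have $1 + \bar t \mu^{\perp} \bar w > 0$. Indeed, if $1 + \bar t \mu^{\perp} \bar w < 0$ then $f\left( -\frac{1}{\mu^{\perp} \bar w} \right) < f(\bar t)$,
which contradicts the fact that $\bar t$ is the minimizer. If $1 + \bar t \mu^{\perp} \bar w = 0$ then $f(\bar t) = \theta \bar t$,
which also leads to a contradiction. Let
\[
\overline{T} = \begin{cases} -\dfrac{1}{\mu^{\perp} \bar w}, & \mu^{\perp} \bar w < 0, \\ +\infty, & \text{otherwise.} \end{cases}
\]

Remark that if $\overline{T} < \infty$, then $f(\overline{T}) = \theta \overline{T} > f(\bar t)$. Let us also choose $\underline{T}$ small
enough so that
\[ 1 - |\mu^{\perp} \bar w|\, \underline{T} \ge \frac{1}{2} \quad \text{and} \quad \frac{1}{8 \bar w^{\perp} B \bar w\, \underline{T}} > f(\bar t), \]

and assume that $k$ is large enough so that $k + 8\tilde\mu^{\perp} \bar w > 0$. We bound the distribution
function of the Gaussian mixture from above as follows:
\begin{align*}
P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{Z} + \mu_i Z + \tilde\mu_i} \le e^{-k} \Big] &= E[F_{Z/k}(k)]
 = \int_0^{\infty} \rho(s) F_{s/k}(k)\, ds = k \int_0^{\infty} \rho(tk) F_t(k)\, dt \tag{44} \\
 &\le k \max_{0 \le t \le \underline{T}} F_t(k) + k \int_{\underline{T}}^{\overline{T}} \frac{C (tk)^{\alpha} \sqrt{t}}{\sqrt{k}\, (1 + \mu t)^{\perp} \bar w}\, e^{-k f(t)}\, dt + ck \int_{\overline{T}}^{\infty} e^{-tk\theta} (tk)^{\alpha}\, dt. \tag{45}
\end{align*}

Now, by the choice of $\underline{T}$, the first term on the right-hand side of the last inequality
in (45) satisfies
\[ k \max_{0 \le t \le \underline{T}} F_t(k) \le C\sqrt{k}\, e^{-\beta k} \]

with $\beta > f(\bar t)$. The second term is computed using Laplace's method. As $k \to +\infty$,
up to a constant,
\[ k \int_{\underline{T}}^{\overline{T}} \frac{C (tk)^{\alpha} \sqrt{t}}{\sqrt{k}\, (1 + \mu t)^{\perp} \bar w}\, e^{-k f(t)}\, dt \sim C k^{\alpha} e^{-k f(\bar t)}. \]

Finally, the last term is negligible by the choice of $\overline{T}$.
The proof of Theorem 3 is thus completed.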
Our next goal is to establish a lower estimate complementing the estimate in
Theorem 3. Note that the estimates in Theorems 3 and 4 are off by the factor $k^{-n}$.
Theorem 4 (Lower bound) Let $Y$ be a centered Gaussian vector with covariance
matrix $B$ and let $\mu \in \mathbb{R}^n$ and $\tilde\mu \in \mathbb{R}^n$. Let $Z$ be a random variable with values in
$(0, \infty)$, which has a density $\rho(x)$ satisfying $\rho(s) \ge c s^{\alpha} e^{-\theta s}$ for $s \ge 1$, where $\theta > 0$,
$c > 0$ and $\alpha \in \mathbb{R}$ are constants. Then, there exists $C > 0$ such that as $k \to +\infty$,

\[ P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{Z} + \mu_i Z + \tilde\mu_i} \le e^{-k} \Big] \gtrsim C k^{\alpha - n} e^{-c^* k}, \]

where $c^*$ is given by (43).



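Proof It is clear that

\[
P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{kt} + \mu_i kt + \tilde\mu_i} \le e^{-k} \Big] \ge P\big[ Y_i \sqrt{kt} + \mu_i kt + \tilde\mu_i \le -k - \log n,\ i = 1, \dots, n \big].
\]

By Proposition 3.2 in [28], the above probability can be bounded from below (very
roughly) as follows:

\[
P\big[ Y_i \sqrt{kt} + \mu_i kt + \tilde\mu_i \le -k - \log n,\ i = 1, \dots, n \big] \ge \frac{C}{(1 + k(1 + t))^n} \exp\{-\alpha_t / 2\},
\]

where
\begin{align*}
\alpha_t &= \min_{x \ge \frac{1}{\sqrt{kt}} ((k + \log n)\mathbf{1} + kt\mu + \tilde\mu)} x^{\perp} B^{-1} x \\
 &= \max_{u \in \mathbb{R}^n_{+}} \left\{ -\frac{1}{2} u^{\perp} B u + u^{\perp} \frac{1}{\sqrt{kt}} ((k + \log n)\mathbf{1} + kt\mu + \tilde\mu) \right\} \\
 &= \max_{w \in \Delta_n} \frac{(k + \log n + kt\mu^{\perp} w + \tilde\mu^{\perp} w)^2}{2 w^{\perp} B w\, kt} \\
 &\le \max_{w \in \Delta_n} k \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t}
 + \max_{w \in \Delta_n} \frac{(1 + t\mu^{\perp} w)(\log n + \tilde\mu^{\perp} w)}{w^{\perp} B w\, t}
 + \max_{w \in \Delta_n} \frac{(\log n + \tilde\mu^{\perp} w)^2}{2 w^{\perp} B w\, kt}.
\end{align*}

Finally, we bound the distribution function of the Gaussian mixture from below as
follows:
\begin{align*}
P\Big[ \sum_{i=1}^{n} e^{Y_i \sqrt{Z} + \mu_i Z + \tilde\mu_i} \le e^{-k} \Big]
 &= k \int_0^{\infty} \rho(tk) F_t(k)\, dt \ge ck \int_{\bar t - 1/k}^{\bar t + 1/k} (tk)^{\alpha} e^{-\theta tk} F_t(k)\, dt \\
 &\ge \frac{C k (\bar t k)^{\alpha}}{(1 + k(1 + \bar t))^n} \int_{\bar t - 1/k}^{\bar t + 1/k} \exp\left\{ -\theta \bar t k - k \max_{w \in \Delta_n} \frac{(1 + t\mu^{\perp} w)^2}{2 w^{\perp} B w\, t} \right\} dt \\
 &\ge \frac{C (\bar t k)^{\alpha}}{(1 + k(1 + \bar t))^n} \exp\left\{ -\theta \bar t k - k \max_{w \in \Delta_n} \frac{(1 + \bar t \mu^{\perp} w)^2}{2 w^{\perp} B w\, \bar t} \right\}
 = \frac{C k^{\alpha} e^{-k f(\bar t)}}{(1 + k(1 + \bar t))^n}.
\end{align*}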

Remark 4 Theorems 3 and 4 show that under their assumptions, the dominating
factor describing the decay of the left tail of the price of a portfolio of assets is
exponential with the decay rate equal to the constant $c^*$. For example, for $n = 1$, we
have
\[ c^* = \min_{t \ge 0} \left\{ \theta t + \frac{(1 + \mu t)^2}{2\sigma^2 t} \right\} = \frac{\sqrt{2\theta\sigma^2 + \mu^2} + \mu}{\sigma^2}. \]
In symmetric models with $\mu = 0$, the formula for $c^*$ simplifies to

\[ c^* = \sqrt{\frac{2\theta}{\min_{w \in \Delta_n} w^{\perp} B w}}. \]
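The closed-form expression for $c^*$ in the case $n = 1$ can be checked against a direct numerical minimization. The sketch below is an illustration with hypothetical function names and assumed parameter values; it uses a golden-section search, which applies because $t \mapsto \theta t + (1 + \mu t)^2 / (2\sigma^2 t)$ is convex on $(0, \infty)$.

```python
import math

def c_star_closed_form(theta, sigma, mu):
    """Closed-form decay rate for n = 1: (sqrt(2*theta*sigma^2 + mu^2) + mu) / sigma^2."""
    return (math.sqrt(2.0 * theta * sigma ** 2 + mu ** 2) + mu) / sigma ** 2

def c_star_numeric(theta, sigma, mu, t_lo=1e-8, t_hi=100.0, iters=200):
    """Minimize g(t) = theta*t + (1 + mu*t)^2 / (2*sigma^2*t) over [t_lo, t_hi]
    by golden-section search (the bracket is an assumption of this sketch)."""
    def g(t):
        return theta * t + (1.0 + mu * t) ** 2 / (2.0 * sigma ** 2 * t)
    r = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = t_lo, t_hi
    for _ in range(iters):
        c = b - r * (b - a)
        d = a + r * (b - a)
        if g(c) < g(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return g(0.5 * (a + b))

def c_star_symmetric(theta, min_quad_form):
    """Decay rate for mu = 0: sqrt(2*theta / min_w w'Bw)."""
    return math.sqrt(2.0 * theta / min_quad_form)
```

With, say, $\theta = 10$, $\sigma = 0.3$ and $\mu = -0.05$, the two computations of $c^*$ should agree up to the accuracy of the search, and for $\mu = 0$ both reduce to `c_star_symmetric(theta, sigma ** 2)`.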

4.2 Implied Volatility Asymptotics

Let $S^1, \dots, S^n$ be assets such that

\[ \log \mathbf{S}_t = \log \mathbf{S}_0 + \tilde\mu t + \mu \tau_t + B^{1/2} W_{\tau_t}, \tag{46} \]

where we use the same notation as in the beginning of Sect. 3. Let S denote the price
process of the basket. Fix a maturity T > 0, and suppose the random variable τT has
a density $\rho_T$. Suppose also that there exist $c_1 > 0$, $c_2 > 0$, $\theta > 0$ and $\alpha \in \mathbb{R}$ such
that
\[ c_1 s^{\alpha} e^{-\theta s} \le \rho_T(s) \le c_2 s^{\alpha} e^{-\theta s}, \quad s \ge 1. \tag{47} \]

We assume that for every $i = 1, \dots, n$,

\[ \theta > \mu_i + \frac{B_{ii}}{2}. \tag{48} \]
This assumption implies that there exists $\varepsilon > 0$ such that

\[ E[(S_T^i)^{1 + \varepsilon}] < \infty. \]

We then assume further that $\tilde\mu_i$ is chosen in such a way that

\[ E[S_T^i] = S_0^i. \tag{49} \]

It follows from Theorems 3 and 4 that there exist $C_1 > 0$, $C_2 > 0$, and $y_0 > 0$
such that
\[ C_1 y^{c^*} \left( \log\frac{1}{y} \right)^{\alpha - n} \le P[S_{\tau_T} \le y] \le C_2 y^{c^*} \left( \log\frac{1}{y} \right)^{\alpha}, \quad y < y_0. \tag{50} \]

Since we have
\[ P(K) = E\left[ (K - S_{\tau_T})^+ \right] = \int_0^K P[S_{\tau_T} \le y]\, dy, \]
the estimates in (50) imply that there exist $C_3 > 0$, $C_4 > 0$, and $K_0 > 0$ such that
\[ C_3 K^{c^* + 1} \left( \log\frac{1}{K} \right)^{\alpha - n} \le P(K) \le C_4 K^{c^* + 1} \left( \log\frac{1}{K} \right)^{\alpha}, \quad K < K_0. \tag{51} \]

Note that the put pricing function in (51) is squeezed between two regularly varying
functions with the same index of regular variation at zero. Such estimates allow one
to find the leading term in the asymptotic expansion of the implied volatility near
zero.
Theorem 5 Suppose condition (47) holds for the time-change process $\tau$ and that
the assumptions (48) and (49) are satisfied. Then the following asymptotic formula
holds for the implied volatility in the time-changed n-dimensional Black-Scholes model:
\[ I(K) \sim \left( \frac{\psi(c^*)}{T} \right)^{1/2} \sqrt{\log\frac{1}{K}} \]

as $K \to 0$, where the function $\psi$ is defined by

\[ \psi(u) = 2 - 4\left( \sqrt{u^2 + u} - u \right), \quad u > 0, \tag{52} \]

and the constant $c^*$ is given by Formula (43).

Proof Theorem 5 follows from (51) and Theorem 10.28 in [24]. 
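To get a feeling for the size of the slope in Theorem 5, the function $\psi$ from (52) can be evaluated for a symmetric model. The sketch below is illustrative only: the function names are hypothetical, it assumes $\mu = 0$ so that $c^* = \sqrt{2\theta / \min_w w^{\perp} B w}$ (Remark 4), and the sample values ($\theta = 10$, two independent assets with $\sigma = 0.3$, $T = 0.2$) are chosen for illustration.

```python
import math

def psi(u):
    """Function psi from (52): psi(u) = 2 - 4*(sqrt(u^2 + u) - u), u > 0."""
    return 2.0 - 4.0 * (math.sqrt(u * u + u) - u)

def left_wing_slope(theta, min_quad_form, T):
    """Return (c_star, slope), where slope = psi(c_star) / T is the limiting
    slope of I(K)^2 against log(1/K) predicted by Theorem 5.

    ASSUMPTION: symmetric model (mu = 0), so c_star = sqrt(2*theta / min w'Bw).
    """
    c_star = math.sqrt(2.0 * theta / min_quad_form)
    return c_star, psi(c_star) / T

# Two independent identical assets: min w'Bw = sigma^2 / 2 = 0.045.
c_star, slope = left_wing_slope(theta=10.0, min_quad_form=0.045, T=0.2)
```

Since $\psi$ decreases from 2 (as $u \to 0$) to 0 (as $u \to \infty$), a large decay rate $c^*$ corresponds to a nearly flat left wing of the squared implied volatility.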

Remark 5 Condition (47) holds for many processes commonly used as stochastic
time changes, e.g., for the gamma process, the inverse Gaussian process, or the
generalized inverse Gaussian process. The latter process is used as time change in
the generalized hyperbolic Lévy model. Recall that the density of the gamma process
is given by

\[ \rho_t(s) = \frac{\lambda^{ct}}{\Gamma(ct)} s^{ct - 1} e^{-\lambda s}, \tag{53} \]

while the density of the inverse Gaussian process is as follows:

\[ \rho_t(s) = \frac{ct}{s^{3/2}}\, e^{2ct\sqrt{\pi\lambda} - \lambda s - \pi c^2 t^2 / s}. \]

In the previous formulas, the symbols λ and c stand for the parameters of the distri-
butions.

We close this section with a counterpart of Theorem 5 for the right tail, which can
be deduced from Theorem 10 proved in the next section.

Theorem 6 Suppose condition (47) holds for the time-change process $\tau$ and that the
assumptions (48) and (49) are satisfied. Then the following asymptotic formula holds
for the implied volatility in a time-changed n-dimensional Black-Scholes model:

\[ I(K) \sim \left( \frac{\psi(c_{\min} - 1)}{T} \right)^{1/2} \sqrt{\log K} \]

as $K \to +\infty$, where

\[ c_{\min} = \min_{i = 1, \dots, n} \frac{\sqrt{2\theta B_{ii} + \mu_i^2} - \mu_i}{B_{ii}}. \]

Proof Let $G_i(x) = P[\log S_T^i \ge x]$. By Theorems 3 and 4 applied with $n = 1$, there exist constants $C_1$
and $C_2$ such that
\[ C_1 x^{\alpha - 1} e^{-c_i x} \lesssim G_i(x) \lesssim C_2 x^{\alpha} e^{-c_i x} \]

as $x \to +\infty$, where
\[ c_i = \frac{\sqrt{2\theta B_{ii} + \mu_i^2} - \mu_i}{B_{ii}}. \]

Note that in the single-asset case Theorems 3 and 4 can also be applied to the right
tail, by symmetry. It follows that

\[ \log G_i(x) \sim -c_i x \]

as $x \to +\infty$, and the proof may be completed by applying Theorem 10.

Numerical illustration In this part of the paper we illustrate the asymptotic result of
Theorem 5 with a numerical example. Figure 3 plots the squared implied volatility of

Fig. 3 Implied volatility of a basket call option together with the theoretical asymptote. Left: option
on a basket of 2 identical assets. Right: option on a basket of 10 identical assets. The logarithms of
asset prices follow the multidimensional variance gamma model

two basket call options computed by Monte Carlo as function of the strike price on
logarithmic scale, as well as the theoretical asymptote with slope given by Theorem
5. Note that Theorem 5 only provides the value of the limiting slope of the squared
implied volatility. Therefore, the performance of the asymptotic results should be
evaluated by comparing the slope of the wing of the smile with the slope of the
straight line. The intercept of the straight line has been chosen to keep the straight line
relatively close to the curve, solely for the purpose of visualisation. The confidence
intervals for 5 million simulated paths are very tight and not shown on the graphs.
We focus on the left wing of the smile since in the left wing the slope of the
smile is correlation-dependent, and therefore can in principle be used to calibrate the
correlation structure. Also, numerical experiments for the right wing (not presented
in the paper) show that one needs to go much further into the tail to observe the
asymptotic behavior predicted by Theorem 6.
In this numerical illustration, the time change follows the variance gamma law
(53) with λ = 10 and c = 10. In the left graph, the basket contains two identical
assets with price processes given by (46), where we take μ ≡ 0, S0i = 50 for i = 1, 2
and the covariance matrix which satisfies Bii = σ 2 with σ = 0.3 for i = 1, 2 and
Bij = 0 for i = j. In the right graph, the basket contains ten identical assets with
price processes given by (46), where we take μ ≡ 0, S0i = 10 for i = 1, . . . , 10 and
the covariance matrix which satisfies Bii = σ 2 for i = 1, . . . , 10 and Bij = ρσ 2
with ρ = 0.5 for i = j. The maturity of the options is T = 0.2 years in both graphs.

5 Assets with Dependence Structure Defined by a Copula

A popular approach to pricing European style multi-asset options is to calibrate
full-fledged models for marginal distributions of asset prices, and then use a copula
function from a simple parametric family to model the dependence structure. This
is because information about the marginal distributions can be extracted from the
prices of single asset options, which are liquidly traded, but the market quotes offer
very little information about the dependence.

5.1 A Very Brief Primer on Copulas

Recall that the copula of a random vector $(X_1, \dots, X_n)$ is a function $C : [0, 1]^n \to [0, 1]$
satisfying the following conditions:
• dC is a positive measure in the sense of Lebesgue-Stieltjes integration.
• $C(u_1, \dots, u_n) = 0$ when $u_k = 0$ for at least one k.
• $C(u_1, \dots, u_n) = u_k$ when $u_i = 1$ for all i ≠ k.

In addition, it is supposed that

\[ P[X_1 \le x_1, \dots, X_n \le x_n] = C(P[X_1 \le x_1], \dots, P[X_n \le x_n]), \quad (x_1, \dots, x_n) \in \mathbb{R}^n. \]

A copula exists by Sklar’s theorem and is uniquely defined in the case where the
marginal distributions of X 1 , . . . , X n are continuous. We refer to [35] for more details
on copulas.
The Gaussian copula with correlation matrix R is the unique copula of any
Gaussian vector with correlation matrix R and nonconstant components (it does
not depend on the mean vector and on the variances of the components).
Given a function φ : [0, 1] → [0, ∞] which is continuous, strictly decreasing
and such that its inverse φ−1 is completely monotonic, the Archimedean copula with
generator φ is defined by

C(u 1 , . . . , u n ) = φ−1 (φ(u 1 ) + · · · + φ(u n )).
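The construction can be sketched in a few lines of Python. The helper `archimedean_copula` and the choice of the Clayton generator $\varphi(u) = (u^{-\theta} - 1)/\theta$, whose inverse $(1 + \theta s)^{-1/\theta}$ is completely monotonic for θ > 0, are illustrative assumptions and do not appear in the original text.

```python
def archimedean_copula(phi, phi_inv):
    """Build C(u_1, ..., u_n) = phi_inv(phi(u_1) + ... + phi(u_n)) from a generator."""
    def copula(*u):
        # phi(0) may be infinite; the copula is 0 as soon as one argument is 0.
        if any(x == 0.0 for x in u):
            return 0.0
        return phi_inv(sum(phi(x) for x in u))
    return copula

theta = 2.0  # hypothetical Clayton parameter
clayton = archimedean_copula(
    phi=lambda u: (u ** -theta - 1.0) / theta,
    phi_inv=lambda s: (1.0 + theta * s) ** (-1.0 / theta),
)

print(clayton(0.5, 0.5))       # bivariate value
print(clayton(0.3, 1.0, 1.0))  # boundary condition: should reproduce 0.3
print(clayton(0.3, 0.0))       # boundary condition: should be 0
```

The last two calls check the boundary conditions listed in the definition of a copula above.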

Definition 1 The weak lower tail dependence function $\chi(\alpha_1, \dots, \alpha_n)$ of a copula
C is defined by

\[ \chi(\alpha_1, \dots, \alpha_n) = \lim_{u \to 0} \frac{\min_i \log u^{\alpha_i}}{\log C(u^{\alpha_1}, \dots, u^{\alpha_n})}, \]

provided that the limit exists and is finite for all $\alpha_1, \dots, \alpha_n \ge 0$ such that $\alpha_k > 0$
for at least one k.
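For intuition, the ratio in Definition 1 can be evaluated at a fixed small u for two elementary copulas. For the independence copula the ratio equals $\max_i \alpha_i / \sum_i \alpha_i$ for every u, and for the comonotone copula $C(u_1, \dots, u_n) = \min_i u_i$ it equals 1. The function names and the values of α below are illustrative assumptions.

```python
import math

def chi_ratio(copula, alphas, u):
    """Ratio min_i log(u^alpha_i) / log C(u^alpha_1, ..., u^alpha_n) at a fixed u in (0, 1)."""
    args = [u ** a for a in alphas]
    return min(a * math.log(u) for a in alphas) / math.log(copula(*args))

def independence(*u):
    """Independence copula: product of the arguments."""
    return math.prod(u)

def comonotone(*u):
    """Comonotone (upper Frechet) copula: minimum of the arguments."""
    return min(u)

alphas = (1.0, 2.0, 0.5)  # hypothetical exponents
chi_indep = chi_ratio(independence, alphas, 1e-6)
chi_comon = chi_ratio(comonotone, alphas, 1e-6)
print(chi_indep, chi_comon)
```

Note that $\log u < 0$, so the minimum in the numerator is attained at the largest $\alpha_i$.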

We will next formulate several known assertions (see [39]).


Theorem 7 Let $X_1, \dots, X_n$ be random variables with state space (0, ∞), marginal
distribution functions $F_1, \dots, F_n$, and a copula C. Suppose that for every k =
1, . . . , n, the function $F_k$ is slowly varying at zero, and there exist constants $\eta_k$,
1 ≤ k ≤ n, and a function F such that

\[ \log F_k(x) \sim \eta_k \log F(x), \quad 1 \le k \le n. \]

Suppose also that the copula C admits a weak lower tail dependence function χ.
Then,

\[ \lim_{x \downarrow 0} \frac{\log P[X_1 + \cdots + X_n \le x]}{\min_i \log P[X_i \le x]} = \frac{1}{\chi(\eta_1, \dots, \eta_n)}. \]

Theorem 8
• Assume that a copula function C has strong tail dependence in the left tail, meaning
that the limit

\[ \lambda_L = \lim_{u \downarrow 0} \frac{C(u, \dots, u)}{u} \]

exists and satisfies $\lambda_L > 0$. Then, the weak lower tail dependence function of C
satisfies $\chi(\alpha_1, \dots, \alpha_n) = 1$.
• Let C be a Gaussian copula with correlation matrix R such that det R ≠ 0. Then,

\[ \chi(\alpha_1, \dots, \alpha_n) = \max_i \alpha_i \, \min_{w \in \Delta_n} w^T \Sigma w, \quad \text{for all } \alpha_1, \dots, \alpha_n > 0, \]

where the matrix Σ has entries $\Sigma_{ij} = \frac{R_{ij}}{\sqrt{\alpha_i \alpha_j}}$, 1 ≤ i, j ≤ n.
• Let C be an Archimedean copula with a generator function φ such that $\log \varphi^{-1}$ is
regularly varying at ∞ with index λ > 0. Then,

\[ \chi(\alpha_1, \dots, \alpha_n) = \frac{\max(\alpha_1, \dots, \alpha_n)}{\left( \alpha_1^{1/\lambda} + \cdots + \alpha_n^{1/\lambda} \right)^{\lambda}}. \]
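The Archimedean case of Theorem 8 can be checked by hand on one example. The sketch below assumes the Gumbel generator $\varphi(u) = (-\log u)^{\theta}$ with θ ≥ 1, for which $\log \varphi^{-1}(s) = -s^{1/\theta}$, so the index is λ = 1/θ. For this copula the ratio in Definition 1 does not depend on u, so the closed-form expression and the definition should agree up to rounding. The choice of copula and all parameter values are illustrative.

```python
import math

def gumbel_copula(theta):
    """Gumbel copula with generator phi(u) = (-log u)^theta, theta >= 1."""
    def copula(*u):
        return math.exp(-sum((-math.log(x)) ** theta for x in u) ** (1.0 / theta))
    return copula

def chi_archimedean(alphas, lam):
    """Closed form from Theorem 8: max(alpha) / (sum_i alpha_i^(1/lam))^lam."""
    return max(alphas) / sum(a ** (1.0 / lam) for a in alphas) ** lam

def chi_from_definition(copula, alphas, u):
    """Ratio from Definition 1 evaluated at a fixed u in (0, 1)."""
    args = [u ** a for a in alphas]
    return min(a * math.log(u) for a in alphas) / math.log(copula(*args))

theta = 2.0          # hypothetical Gumbel parameter, so lam = 1 / theta = 0.5
alphas = (1.0, 2.0)  # hypothetical exponents
closed_form = chi_archimedean(alphas, lam=1.0 / theta)
by_definition = chi_from_definition(gumbel_copula(theta), alphas, u=1e-3)
print(closed_form, by_definition)
```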

5.2 Copulas and the Implied Volatility Asymptotics

In this subsection, we study the left-wing behavior of the implied volatility associated
with a basket call option. Recall that we denoted by (Y1 , . . . , Yn ) the vector of
logarithmic returns of the risky assets, and by (λ1 , . . . , λn ) the corresponding vector
of weights. Let C be the copula of the vector (Y1 , . . . , Yn ), and G i be the distribution
function of Yi for i = 1, . . . , n. The implied volatility is considered in this section
as a function k → I (−k) of the variable −k, where k is the log-strike defined by
k = log K . The tail-wing formulas due to Benaim and Friz (see [9]) play an important
role in the sequel.
Theorem 9 Let α > 0, and assume that the following are true:
• There exists ε > 0 such that $E[e^{-\varepsilon Y_i}] < \infty$, i = 1, . . . , n.
• For every 1 ≤ i ≤ n, the function $k \mapsto -\log G_i(-k)$, $k > k_0$, belongs to the class
$R_\alpha$ of regularly varying functions, and there exist positive constants $\eta_1, \dots, \eta_n$
and a function G such that

\[ \log G_i(-k) \sim \eta_i \log G(-k) \quad \text{as } k \to \infty. \tag{54} \]

• The copula C admits a weak lower tail dependence function χ.

Then,

\[ \frac{I(-k)^2 T}{k} \sim \psi\left( -\frac{\log G(-k)}{k} \, \frac{\max_i \eta_i}{\chi(\eta_1, \dots, \eta_n)} \right) \tag{55} \]

as k → ∞, where the function ψ is defined in (52).

Proof The distribution function $F_i$ of the random variable $\lambda_i S_i$ is given by

\[ F_i(x) = G_i(\log x - \log \lambda_i). \]

Since the function $\log G_i$ is regularly varying at −∞, it is clear that $\log F_i$ is slowly
varying at zero and

\[ \log F_i(x) \sim \log G_i(\log x) \sim \eta_i \log G(\log x) \]

as x → 0. It follows from Theorem 7 that

\[ \log F(x) \sim \frac{\max_i \eta_i}{\chi(\eta_1, \dots, \eta_n)} \log G(\log x) \quad \text{as } x \to 0, \]

where F is the distribution function of $\sum_{i=1}^n \lambda_i S_i$. Equivalently

\[ \log F(e^{-k}) \sim \frac{\max_i \eta_i}{\chi(\eta_1, \dots, \eta_n)} \log G(-k) \quad \text{as } k \to \infty, \]

and hence

\[ -\frac{\log F(e^{-k})}{k} \sim -\frac{\log G(-k)}{k} \, \frac{\max_i \eta_i}{\chi(\eta_1, \dots, \eta_n)} \quad \text{as } k \to \infty. \tag{56} \]

It follows from the assumptions in Theorem 9 that $\log G(-k) \in R_\alpha$ as k → ∞.
Therefore $\log F(e^{-k}) \in R_\alpha$ as well. Next, using the tail-wing formula of Benaim
and Friz (see Theorem 2 in [9]), we obtain

\[ \frac{I(-k)^2 T}{k} \sim \psi\left( -\frac{\log F(e^{-k})}{k} \right) \quad \text{as } k \to \infty. \tag{57} \]

We will need the following lemma.


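Lemma 2 Let ψ be the function defined by (52), and suppose $\rho_1$ and $\rho_2$ are positive
functions on (0, ∞) such that

\[ \frac{\rho_1(x)}{\rho_2(x)} \to 1 \quad \text{as } x \to \infty. \tag{58} \]

Then

\[ \frac{\psi(\rho_1(x))}{\psi(\rho_2(x))} \to 1 \quad \text{as } x \to \infty. \tag{59} \]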

Proof It is not hard to see that for all u ≥ 0,

\[ \psi(u) = \frac{2}{(\sqrt{u + 1} + \sqrt{u})^2}. \tag{60} \]

The equality in (60) describes the structure of the function ψ better than the original
definition.

Fix 0 < ε < 1. Then, using (60) and the inequality $1 < \frac{1}{1-\varepsilon}$, we get

\[ \psi((1 - \varepsilon)u) = \frac{2}{(1 - \varepsilon)\left( \sqrt{u + \frac{1}{1-\varepsilon}} + \sqrt{u} \right)^2} \le \frac{2}{(1 - \varepsilon)(\sqrt{u + 1} + \sqrt{u})^2} = \frac{1}{1 - \varepsilon}\, \psi(u). \]

Similarly

\[ \psi((1 + \varepsilon)u) \ge \frac{1}{1 + \varepsilon}\, \psi(u). \]

Therefore,

\[ \frac{1}{1 + \varepsilon}\, \psi(u) \le \psi((1 + \varepsilon)u) \le \psi((1 - \varepsilon)u) \le \frac{1}{1 - \varepsilon}\, \psi(u). \tag{61} \]

It follows from (58) that for every ε > 0 there exists $x_\varepsilon > 0$ such that

\[ (1 - \varepsilon)\rho_2(x) \le \rho_1(x) \le (1 + \varepsilon)\rho_2(x) \]

for all $x > x_\varepsilon$. Since the function ψ decreases on (0, ∞), we have

\[ \psi((1 + \varepsilon)\rho_2(x)) \le \psi(\rho_1(x)) \le \psi((1 - \varepsilon)\rho_2(x)) \]

for all $x > x_\varepsilon$. Now, using (61), we obtain

\[ \frac{1}{1 + \varepsilon}\, \psi(\rho_2(x)) \le \psi(\rho_1(x)) \le \frac{1}{1 - \varepsilon}\, \psi(\rho_2(x)) \]

for all $x > x_\varepsilon$, and (59) follows. □

Finally, it is not hard to see that (56), (57), and Lemma 2 imply (55).
This completes the proof of Theorem 9. □
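The two facts about ψ used in the proof of Lemma 2 are easy to check numerically. The sketch below compares the representation (60) with the form $\psi(u) = 2 - 4(\sqrt{u^2 + u} - u)$, which is assumed here to be definition (52) since that formula lies outside this section, and then tests the chain of inequalities (61) on a grid. The grid and the values of ε are arbitrary choices.

```python
import math

def psi_assumed_52(u):
    """Assumed form of definition (52): 2 - 4*(sqrt(u^2 + u) - u)."""
    return 2.0 - 4.0 * (math.sqrt(u * u + u) - u)

def psi(u):
    """Representation (60): 2 / (sqrt(u + 1) + sqrt(u))^2."""
    return 2.0 / (math.sqrt(u + 1.0) + math.sqrt(u)) ** 2

def check_61(u, eps):
    """Chain (61): psi(u)/(1+eps) <= psi((1+eps)u) <= psi((1-eps)u) <= psi(u)/(1-eps)."""
    return (psi(u) / (1.0 + eps) <= psi((1.0 + eps) * u)
            <= psi((1.0 - eps) * u) <= psi(u) / (1.0 - eps))

grid = [0.01 * 1.5 ** j for j in range(40)]  # roughly 0.01 up to 7e4
max_gap = max(abs(psi(u) - psi_assumed_52(u)) for u in grid)
all_ok = all(check_61(u, eps) for u in grid for eps in (0.01, 0.1, 0.5))
print(max_gap, all_ok)
```

The form (60) is also the numerically safer one for large u, because the assumed form (52) subtracts two nearly equal quantities.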
The next example shows that condition (54) does not prevent one from choosing
different marginal laws for different components of the process (Y1 , . . . , Yn ) as long
as these laws have a similar tail behavior.
Example 1 Let us consider the following multidimensional extension of the example
given in Sect. 5.2 of [9]. We assume that for i = 1, . . . , n, the distribution of the
random variable Yi is normal inverse Gaussian, more precisely, NIG(αi , βi , μi , δi ).
It is also supposed that the parameters satisfy αi > |βi | > 0 and δi > 0. This means
that the moment generating function of $Y_i$ is given by

\[ M_i(z) = \exp\left( \delta_i \left( \sqrt{\alpha_i^2 - \beta_i^2} - \sqrt{\alpha_i^2 - (\beta_i + z)^2} \right) + \mu_i z \right). \]

We refer the reader to [6] for more details on the normal inverse Gaussian distribution.
In particular, it follows that Yi has a density gi which satisfies the following condition:
\[ g_i(k) \sim C_i |k|^{-3/2} e^{-\alpha_i |k| + \beta_i k}, \quad k \to \pm\infty, \]

where Ci is a constant. Using Theorem 2 in [9], we see that − log G i (−k) ∈ Rα as


k → +∞, and also

− log G i (−k) ∼ − log gi (−k) ∼ (βi − αi )k, k → +∞.

Therefore, the condition in (54) holds with $\eta_i = \alpha_i - \beta_i$ and $G(k) = e^k$.


Assuming that the dependence structure of $(Y_1, \dots, Y_n)$ is described by the
Gaussian copula with correlation matrix R, we see that

\[ \frac{I(-k)^2 T}{k} \sim \psi\left( \frac{1}{\inf_{w \in \Delta_n} w^{\perp} \Sigma w} \right), \quad k \to +\infty, \tag{62} \]

where the matrix $\Sigma = [\Sigma_{ij}]$ is such that

\[ \Sigma_{ij} = \frac{R_{ij}}{\sqrt{(\alpha_i - \beta_i)(\alpha_j - \beta_j)}}. \]

In other words, the implied variance is asymptotically linear, with a correlation-dependent
limiting slope, which is given by the right-hand side of (62).
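The limiting slope in (62) requires minimising a quadratic form over the simplex. A minimal sketch follows, assuming that $\Delta_n$ denotes the unit simplex and taking the tail exponents $\eta_i$ from (54) as given inputs. The projected-gradient solver, the function names, and the numerical values are illustrative choices and are not the method of the original text.

```python
import math

def psi(u):
    """Representation (60) of psi."""
    return 2.0 / (math.sqrt(u + 1.0) + math.sqrt(u)) ** 2

def project_simplex(v):
    """Euclidean projection of v onto {w >= 0, sum(w) = 1} (sort-based algorithm)."""
    shift, cumulative = 0.0, 0.0
    for j, x in enumerate(sorted(v, reverse=True), start=1):
        cumulative += x
        candidate = (cumulative - 1.0) / j
        if x - candidate > 0.0:
            shift = candidate
    return [max(x - shift, 0.0) for x in v]

def min_quadratic_on_simplex(S, iters=5000):
    """Minimise w^T S w over the simplex by projected gradient descent (S assumed PSD)."""
    n = len(S)
    w = [1.0 / n] * n
    # 1 / L, where L bounds the largest eigenvalue of the Hessian 2S.
    step = 1.0 / (2.0 * max(sum(abs(x) for x in row) for row in S))
    for _ in range(iters):
        grad = [2.0 * sum(S[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = project_simplex([w[i] - step * grad[i] for i in range(n)])
    value = sum(w[i] * S[i][j] * w[j] for i in range(n) for j in range(n))
    return value, w

def left_wing_slope(eta, R):
    """psi(1 / min_w w^T Sigma w) with Sigma_ij = R_ij / sqrt(eta_i * eta_j)."""
    n = len(eta)
    sigma = [[R[i][j] / math.sqrt(eta[i] * eta[j]) for j in range(n)] for i in range(n)]
    value, w = min_quadratic_on_simplex(sigma)
    return psi(1.0 / value), w

# Hypothetical example: two margins with exponents 4 and 4, correlation 0.5.
slope, weights = left_wing_slope([4.0, 4.0], [[1.0, 0.5], [0.5, 1.0]])
print(slope, weights)
```

For independent components the minimiser is proportional to η, so that the minimum equals $1/\sum_i \eta_i$; this gives a convenient check of the solver.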

Remark 6 In this remark, we compare the asymptotic formulas for the implied
volatility obtained in Sects. 3 and 5 (see Theorems 2 and 9). The latter theorem
is more general than the former one. It provides the leading term in the asymptotic
expansion of the implied volatility under certain restrictions on the marginal distri-
butions of log-returns and the corresponding copula, and applies to many special
models. In the case of a Gaussian copula and log-normal marginal distributions,
all the conditions in Theorem 9 are satisfied, and the leading term is equal to the
constant $(\bar{A}_1 + \cdots + \bar{A}_{\bar{n}})^{-1/2}$. This follows from Theorem 9, the second equality in
formula (27), and the second statement in Theorem 8. The advantage of Theorem 9
is its generality, while the disadvantage is that the asymptotic formula for the implied
volatility contains only the leading term and no error estimate. On the other hand,
Theorem 2 applies only to the case of Gaussian copula and lognormal margins under
a not very restrictive condition (A), but provides a sharp asymptotic formula for the
implied volatility with several terms and an error estimate.

For the sake of completeness, we include a proposition that is a counterpart of
Theorem 9 in the case of the right tail. This proposition turns out to be somewhat
trivial: the leading order of the implied volatility is determined by a single component
with the fattest tail, and it does not depend on the copula. Let us denote by $\bar{G}_i$ the
survival function of $Y_i$, i.e., the function $\bar{G}_i(x) = P[Y_i \ge x]$.

Theorem 10 Let α > 0, and suppose that the following assumptions hold:
• There exists ε > 0 such that $E[e^{(1+\varepsilon)Y_i}] < \infty$ for i = 1, . . . , n.
• For each i = 1, . . . , n, the function $k \mapsto -\log \bar{G}_i(k)$ belongs to the class $R_\alpha$ at
infinity.
Then,

\[ \frac{I(k)^2 T}{k} \sim \psi\left( -1 - \frac{1}{k} \max_i \log \bar{G}_i(k) \right) \quad \text{as } k \to +\infty. \tag{63} \]

Proof Set $X_i = v_i e^{Y_i}$. Then we get

\[ P[X_1 + \cdots + X_n \ge x] \ge \max_i P[X_i \ge x], \]

\[ P[X_1 + \cdots + X_n \ge x] \le P\left[\exists i : X_i \ge \frac{x}{n}\right] \le \sum_{i=1}^n P\left[X_i \ge \frac{x}{n}\right] \le n \max_i P\left[X_i \ge \frac{x}{n}\right]. \]

Since for each i, the function $\log \bar{G}_i$ is regularly varying at infinity, it follows that
the function $x \mapsto \log P[X_i \ge x]$ is slowly varying. Because these logarithms are
negative, this gives, for any ε ∈ (0, 1) and x sufficiently large,

\[ \max_i \log P[X_i \ge x/n] \le (1 - \varepsilon) \max_i \log P[X_i \ge x]. \]

Finally,

\[ \lim_{x \to +\infty} \frac{\log P[X_1 + \cdots + X_n \ge x]}{\max_i \log P[X_i \ge x]} = 1, \]

and formula (63) follows from Theorem 1 in [9] with a similar proof to that of
Theorem 9. □

Numerical illustration In this part of the paper we illustrate the asymptotic result of
Theorem 9 with a numerical experiment. Figure 4 plots the squared implied volatility
of two basket call options computed by Monte Carlo as a function of the strike price on a
logarithmic scale, as well as the theoretical asymptote with slope given by Theorem 9.
Once again, we focus on the left wing of the smile since the slope of the left wing
depends on the correlation of the Gaussian copula while the slope of the right wing
does not.
In both graphs, the basket contains assets with price processes

\[ S_T^i = S_0^i e^{\tilde{\mu}_i T + X_T^i}, \]

where $\tilde{\mu}_i$ is chosen so that $E[S_T^i] = S_0^i$ and $X^i$ is the variance gamma process with
characteristic function

\[ E[e^{iuX_T^i}] = \left( 1 - iu\kappa_i^{-1}\mu_i + \frac{\sigma_i^2 u^2 \kappa_i^{-1}}{2} \right)^{-\kappa_i T}. \]

Fig. 4 Implied volatility of a basket call option together with the theoretical asymptote. Left: option
on a basket of 2 identical assets. Right: option on a basket of 10 identical assets. The logarithms of
asset prices follow the variance gamma model with dependence given by a Gaussian copula

For the numerical illustration we take $\mu_i = 0$, $\sigma_i = 0.3$ and $\kappa_i = 10$ for all i.
In the left graph, the basket contains two assets; we take $S_0^i = 50$ for i = 1, 2
and assume that the assets are independent. In the right graph, the basket contains
ten assets; we take $S_0^i = 10$ for i = 1, . . . , 10 and assume that the dependence
structure of the assets is given by the Gaussian copula with correlation matrix R
with elements $R_{ij} = \rho + (1 - \rho)1_{i=j}$, where we took ρ = 0.5. The maturity of the
options is T = 0.2 years in both graphs.
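In the variance gamma model, similarly to Example 1, it can be shown (see e.g.,
[1]) that

\[ \log G_i(k) \sim (\alpha_i + \beta_i)k, \quad k \to -\infty \]

with coefficients $\alpha_i$ and $\beta_i$ given by

\[ \alpha_i = \sqrt{\frac{\mu_i^2}{\sigma_i^4} + \frac{2\kappa_i}{\sigma_i^2}}, \quad \text{and} \quad \beta_i = \frac{\mu_i}{\sigma_i^2}. \]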

Therefore, the limiting behavior of the implied volatility in this model is also given by
(62). We see that the numerical illustration agrees well with the theoretical prediction.
Compared to the numerical example of Sect. 4, we see that the slope of the implied
volatility is steeper in the multidimensional VG model than in the copula model.
This happens because the multidimensional VG model introduces additional positive
dependence between the assets through the common time change process.
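The theoretical slopes for the two panels of this illustration can be reproduced with a short computation. The sketch below uses the stated parameters ($\mu_i = 0$, $\sigma_i = 0.3$, $\kappa_i = 10$) and takes $\eta = \alpha + \beta$ as the common left-tail exponent. For identical margins and $R_{ij} = \rho + (1 - \rho)1_{i=j}$ one has $w^{\perp} R w = \rho + (1 - \rho)\sum_i w_i^2$ on the simplex, so the minimiser is the equal-weight vector; this shortcut, the assumption that $\Delta_n$ is the unit simplex, and the function names are ours rather than the authors'.

```python
import math

def psi(u):
    """Representation (60) of psi."""
    return 2.0 / (math.sqrt(u + 1.0) + math.sqrt(u)) ** 2

def vg_tail_coefficients(mu, sigma, kappa):
    """alpha = sqrt(mu^2/sigma^4 + 2*kappa/sigma^2), beta = mu/sigma^2."""
    alpha = math.sqrt(mu ** 2 / sigma ** 4 + 2.0 * kappa / sigma ** 2)
    beta = mu / sigma ** 2
    return alpha, beta

def equicorrelated_slope(eta, n, rho):
    """Slope psi(1 / min_w w^T Sigma w) for identical margins and equicorrelation rho."""
    # Equal weights w_i = 1/n give w^T R w = rho + (1 - rho) / n.
    min_quadratic = (rho + (1.0 - rho) / n) / eta
    return psi(1.0 / min_quadratic)

alpha, beta = vg_tail_coefficients(mu=0.0, sigma=0.3, kappa=10.0)
eta = alpha + beta
slope_left_panel = equicorrelated_slope(eta, n=2, rho=0.0)    # two independent assets
slope_right_panel = equicorrelated_slope(eta, n=10, rho=0.5)  # ten assets, rho = 0.5
print(eta, slope_left_panel, slope_right_panel)
```

These numbers are the limiting values of $I(-k)^2 T / k$ and can be compared with the slope of the straight lines in Fig. 4.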

Proof of Lemma 1

The function F satisfies

\[ F(t, w) = \max_{\lambda > 0} \left\{ \theta t + \lambda w^{\perp}(\mathbf{1} + \mu t) - \frac{\lambda^2 w^{\perp} B w\, t}{2} \right\}, \]

where $\mathbf{1}$ stands for the n-dimensional vector with all elements equal to 1. Therefore,

\[ \max_{w \in \Delta_n} F(t, w) = \max_{u \in \mathbb{R}^n_+} \tilde{F}(t, u), \]

with

\[ \tilde{F}(t, u) = \theta t + u^{\perp}(\mathbf{1} + \mu t) - \frac{u^{\perp} B u\, t}{2}. \]

Since for every t > 0, $\tilde{F}(t, u)$ is strictly concave in u, there exists a unique $\bar{u}(t) \in \mathbb{R}^n_+$
with $\bar{u}(t) \ne 0$ such that $\tilde{F}(t, \bar{u}) = \max_{u \in \mathbb{R}^n_+} \tilde{F}(t, u)$. This in turn implies that there
exists a unique $\bar{w}(t)$ such that $F(t, \bar{w}) = \max_{w \in \Delta_n} F(t, w)$. It is also easy to see
that $\bar{u}(t)$ depends continuously on t.
Let $\bar{f}(t) = \tilde{F}(t, \bar{u}(t))$. We would like to show that $\bar{f}$ is differentiable in t and
compute its derivative. The vector $\bar{u}(t)$ may be characterized as follows: for i = 1, . . . , n,

\[ [\mathbf{1} + \mu t - tB\bar{u}(t)]_i = 0 \quad \text{if } \bar{u}(t)_i > 0, \tag{64} \]

\[ [\mathbf{1} + \mu t - tB\bar{u}(t)]_i \le 0 \quad \text{if } \bar{u}(t)_i = 0. \tag{65} \]

Let I(t) denote the set of indices i ∈ {1, . . . , n} such that $\bar{u}(t)_i > 0$, and, for a
vector $x \in \mathbb{R}^n$, let $x_{I(t)}$ denote the subset of components of x with indices in I(t):
$x_{I(t)} = \{x_i : i \in I(t)\}$. Furthermore, let $B_{I(t),I(t)}$ denote the submatrix of the
covariance matrix, containing the elements $b_{ij}$ with i ∈ I(t) and j ∈ I(t). Then, the
vector $\bar{u}(t)$ satisfies

\[ \bar{u}(t)_{I(t)} = \frac{1}{t} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t)_{I(t)}, \quad \bar{u}(t)_{\tilde{I}(t)} = 0, \]

where the set $\tilde{I}(t)$ contains the indices i ∈ {1, . . . , n} which are not in I(t).
Now, fix t ∈ (0, ∞) and for t′ ∈ (0, ∞), define

\[ v(t')_{I(t)} = \frac{1}{t'} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t')_{I(t)}, \quad v(t')_{\tilde{I}(t)} = 0. \]

First, assume that for all i such that $\bar{u}(t)_i = 0$, either $[\mathbf{1} + \mu t - tB\bar{u}(t)]_i < 0$ (with
strict inequality) or

\[ [\mathbf{1} + \mu t' - t'Bv(t')]_i = 0 \]

for all t′ ∈ (0, ∞). We shall call this Assumption 1. Then we can find δ > 0, such
that for every t′ ∈ (0, ∞) with |t′ − t| < δ, v(t′) satisfies the characterization (64)
and (65). Therefore, $v(t') = \bar{u}(t')$. This means that

\[ \bar{f}(t') = \theta t' + \frac{1}{2t'} (\mathbf{1} + \mu t')^{\perp}_{I(t)} B^{-1}_{I(t),I(t)} (\mathbf{1} + \mu t')_{I(t)}. \]

Therefore, $\bar{f}$ is differentiable at t with first derivative given by

\[ \bar{f}'(t) = \theta - \frac{1}{2t^2} \mathbf{1}^{\perp}_{I(t)} B^{-1}_{I(t),I(t)} \mathbf{1}_{I(t)} + \frac{1}{2} \mu^{\perp}_{I(t)} B^{-1}_{I(t),I(t)} \mu_{I(t)} = \theta - \frac{1}{2t} \bar{u}(t)^{\perp} (\mathbf{1} - \mu t) \tag{66} \]

and second derivative

\[ \bar{f}''(t) = \frac{1}{t^3} \mathbf{1}^{\perp}_{I(t)} B^{-1}_{I(t),I(t)} \mathbf{1}_{I(t)}. \]

Now assume that there exists at least one i such that $\bar{u}(t)_i = 0$ and $[\mathbf{1} + \mu t - tB\bar{u}(t)]_i = 0$, or, equivalently,

\[ [\mathbf{1} + \mu t' - t'Bv(t')]_i = 0 \]

with t′ = t. The case when the above equality holds for all t′ is covered by Assumption 1.
Since the left-hand side is linear in t′, this means that for a given index set
I(t) and for a given i, there exists only one t′ ∈ (0, ∞) which satisfies the above
equality. Since the number of possible index sets is finite, we conclude that there is
at most a finite number of elements t ∈ (0, ∞) which do not satisfy Assumption
1. But then, we can conclude by continuity that $\bar{f}$ is strictly convex (which entails
uniqueness of $\bar{t}$) and differentiable for all t ∈ (0, ∞), with the derivative given by
(66) or alternatively by

\[ \bar{f}'(t) = \theta - \frac{1}{2t^2\, \bar{w}(t)^{\perp} B \bar{w}(t)} + \frac{(\bar{w}(t)^{\perp} \mu)^2}{2\, \bar{w}(t)^{\perp} B \bar{w}(t)}. \]

Comparing this with the derivative of f, which is easily computed, we see that at the
point $\bar{t}$, these derivatives coincide. Since this point is characterized by the first order
condition $\bar{f}'(\bar{t}) = 0$, and the function f is strictly convex, f also attains its unique
minimum at $\bar{t}$.
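The derivative formula (66) can be checked numerically in the simplest situation, where all components of $\bar{u}(t)$ are positive, so that I(t) = {1, 2} for n = 2. The sketch below compares (66) with a central finite difference of $\bar{f}$ and locates the stationary point $\bar{t}$ in closed form. The 2×2 matrix B, the drift μ, and θ are hypothetical values chosen so that the interior case applies.

```python
import math

def inv2(B):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(x, M, y):
    """Bilinear form x^T M y for 2-vectors."""
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def is_interior(t, B_inv, mu):
    """True if u_bar(t) = (1/t) B^{-1} (1 + mu t) has strictly positive components."""
    a = [1.0 + m * t for m in mu]
    return all(sum(B_inv[i][j] * a[j] for j in range(2)) > 0.0 for i in range(2))

def f_bar(t, theta, B_inv, mu):
    """f_bar(t) = theta*t + (1/(2t)) (1 + mu t)^T B^{-1} (1 + mu t) in the interior case."""
    a = [1.0 + m * t for m in mu]
    return theta * t + quad(a, B_inv, a) / (2.0 * t)

def f_bar_prime(t, theta, B_inv, mu):
    """First expression in (66) with I(t) = {1, 2}."""
    ones = [1.0, 1.0]
    return theta - quad(ones, B_inv, ones) / (2.0 * t * t) + 0.5 * quad(mu, B_inv, mu)

theta = 10.0                                   # hypothetical
mu = [0.1, 0.05]                               # hypothetical drift vector
B_inv = inv2([[0.09, 0.045], [0.045, 0.09]])   # sigma = 0.3, rho = 0.5
ones = [1.0, 1.0]
# Solving f_bar'(t) = 0 in the interior case gives t_bar in closed form.
t_bar = math.sqrt(quad(ones, B_inv, ones) / (2.0 * theta + quad(mu, B_inv, mu)))
print(t_bar, f_bar_prime(t_bar, theta, B_inv, mu))
```

When some component of $\bar{u}(t)$ vanishes, the same formulas apply with B and μ restricted to the active index set I(t), which this sketch does not attempt to determine.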

References

1. Albin, J.M.P., Sundén, M.: On the asymptotic behaviour of Lévy processes, part I: subexpo-
nential and exponential processes. Stoch. Process. Appl. 119, 281–304 (2009)
2. Andersen, L., Lipton, A.: Asymptotics for exponential Lévy processes and their volatility smile:
survey and new results. Int. J. Theor. Appl. Financ. 16, 1350001-1–1350001-98 (2013)
3. Asmussen, S., Rojas-Nandayapa, L.: Asymptotics of sums of lognormal random variables with
Gaussian copula. Stat. Probab. Lett. 78, 2709–2714 (2008)
4. d’Aspremont, A.: Interest rate model calibration using semidefinite programming. Appl. Math.
Financ. 10, 183–213 (2003)
5. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstruction of volatility: pricing index
options using the steepest-descent approximation. Risk Mag. 15, 87–91 (2002)
6. Barndorff-Nielsen, O.: Processes of normal inverse Gaussian type. Financ. Stoch. 2, 41–68
(1998)
7. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol baskets.
Commun. Pure Appl. Math. 67, 1618–1657 (2014)
8. Benaim, S., Friz, P.: Smile asymptotics II: models with known MGF. J. Appl. Probab. 45, 16–32
(2008)
9. Benaim, S., Friz, P.: Regular variation and smile asymptotics. Math. Financ. 19, 1–12 (2009)
10. Benhamou, E., Gobet, E., Miri, M.: Smart expansion and fast calibration for jump diffusions.
Financ. Stoch. 13, 563–589 (2009)
11. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
12. Cherubini, U., Luciano, E., Vecchiato, W.: Copula Methods in Finance. Wiley, Chichester
(2004)
13. Cont, R., Deguest, R.: Equity correlations implied by index options: estimation and model
uncertainty analysis. Math. Financ. 23, 496–530 (2013)
14. De Marco, S., Hillairet, C., Jacquier, A.: Shapes of implied volatility with positive mass at zero
(2013). arXiv:1310.1020
15. Eberlein, E., Madan, D.B.: On correlating Lévy processes. J. Risk 13, 3–16 (2010)
16. Figueroa-López, J., Forde, M.: The small-maturity smile for exponential Lévy models. SIAM
J. Financ. Math. 3, 33–65 (2012)
17. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
Int. J. Theor. Appl. Financ. 12, 861–876 (2009)
18. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15,
755–780 (2011)
19. Gao, K., Lee, R.: Asymptotics of implied volatility to arbitrary order. Financ. Stoch. 18, 349–
392 (2014)
20. Gao, X., Xu, H., Ye, D.: Asymptotic behavior of tail density for sum of correlated lognormal
variables. Int. J. Math. Math. Sci. 2009, p. 28 (2009)
21. Gobet, E., Miri, M.: Time dependent Heston model. SIAM J. Financ. Math. 1, 289 (2010)
22. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1, 609–641 (2010)
23. Gulisashvili, A.: Asymptotic equivalence in Lee’s moment formulas for the implied volatility,
asset price models without moment explosions, and Piterbarg’s conjecture. Int. J. Theor. Appl.
Financ. 15, 1250020 (2012)
24. Gulisashvili, A.: Analytically Tractable Stochastic Stock Price Models. Springer, Berlin (2012)
25. Gulisashvili, A.: Left-wing asymptotics of the implied volatility in the presence of atoms. Int.
J. Theor. Appl. Finan. 18(2) (2015)
26. Gulisashvili, A., Stein, E.M.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61, 287–315 (2010)
27. Gulisashvili, A., Tankov, P.: Tail behavior of sums and differences of log-normal random
variables. Bernoulli (to appear)

28. Hashorva, E., Hüsler, J.: On multivariate Gaussian tails. Ann. Inst. Stat. Math. 55, 507–522
(2003)
29. Jourdain, B., Sbai, M.: Coupling index and stocks. Quant. Financ. 12, 805–818 (2012)
30. Lee, R.: The moment formula for implied volatility at extreme strikes. Math. Financ. 14, 469–
480 (2004)
31. Lewis, A.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach (2000)
32. Luciano, E., Schoutens, W.: A multivariate jump-driven financial asset model. Quant. Financ.
6, 385–402 (2006)
33. Medvedev, A., Scaillet, O.: Approximation and calibration of short-term implied volatilities
under jump-diffusion stochastic volatility. Rev. Financ. Stud. 20, 427–459 (2007)
34. Mijatović, A., Tankov, P.: A new look at short-term implied volatility in asset price models
with jumps. Math. Financ., to appear
35. Nelsen, R.: An Introduction to Copulas. Springer, New York (1999)
36. Prause, K.: The generalized hyperbolic model: estimation, financial derivatives, and risk mea-
sures, Ph.D. thesis, University of Freiburg (1999)
37. Schoenmakers, J.: Robust Libor Modelling and Pricing of Derivative Products. CRC Press,
Boca Raton (2005)
38. Tankov, P.: Pricing and Hedging in Exponential Lévy Models: Review of Recent Results.
Paris-Princeton Lectures on Mathematical Finance. Springer, Berlin (2010)
39. Tankov, P.: Large deviation asymptotics for the left tail of the sum of dependent positive random
variables (2014). arXiv:1402.4683
40. Tehranchi, M.R.: Asymptotics of implied volatility far from maturity. J. Appl. Probab. 46,
629–650 (2009)
41. Tehranchi, M.R.: Uniform bounds for Black-Scholes implied volatility, Pre-print (2014)
Small-Time Asymptotics
for the At-the-Money Implied Volatility
in a Multi-dimensional Local
Volatility Model

Christian Bayer and Peter Laurence

Abstract We consider a basket or spread option based on a multi-dimensional local
volatility model. Bayer and Laurence (Commun. Pure. Appl. Math., 67(10), 2014,
[5]) derived highly accurate analytic formulas for prices and implied volatilities of
such options when the options are not at the money. We now extend these results to
the ATM case. Moreover, we also derive similar formulas for the local volatility of
the basket.

Keywords Basket options · Spread options · Implied volatility · Asymptotic
formulas · Heat kernel expansion

2010 Mathematics Subject Classification Primary 91G60 · Secondary 91G20,
58J90

To the memory of Peter Laurence, who passed away unexpectedly during the final stage of the
preparation of this manuscript.

C. Bayer
Weierstrass Institute, Mohrenstrasse 39, 10117 Berlin, Germany
e-mail: [email protected]
P. Laurence
Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street,
New York, NY 10012, USA
P. Laurence
Dipartimento di Matematica, Università di Roma 1, Piazzale Aldo Moro, 2,
00185 Rome, Italy

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_7

1 Introduction

For a local volatility type model for a basket of stocks, whose forward prices are
given by

\[ dF_i(t) = \sigma_i(F_i(t))\, dW_i(t), \quad i = 1, \dots, n, \tag{1.1} \]

\[ d\langle W_i, W_j \rangle(t) = \rho_{ij}\, dt, \quad i, j = 1, \dots, n, \tag{1.2} \]

with a given correlation matrix ρ, we consider basket options with a payoff

\[ P(\mathbf{F}) = \left( \sum_{i=1}^n w_i F_i - K \right)^+, \]

where we generally denote in bold face a vector of the corresponding italic com-
ponents, as in F = (F1 , . . . , Fn ). Since we only assume that at least one of the
weights w1 , . . . , wn is positive, we will refer to options of that type as generalized
spread options.
The purpose of this paper is to provide an explicit first order accurate short time
expansion of the price CB (F0 , K , T ) of the above option using the heat kernel ex-
pansion technique (see, for instance, [12, 13, 20]) when the option is at the money.
Moreover, from the asymptotic formula for the option price we also obtain an asymp-
totic formula for the implied and for the local volatility.1 Thereby we complement
the results obtained in [5], where a first order accurate asymptotic formula was given
when the option is not at the money. (The zero order accurate formula is well-known,
see, for instance [1]. When the option is not at the money, alternative first order ac-
curate results can be found in [12].) Such asymptotic formulas are highly relevant,
in particular when the dimension of the model is high (say n > 3), since then tradi-
tional (simulation or PDE) techniques to compute CB fail or are at least very time
consuming. In fact, for a wide range of different parameters, [5] show numerically
that their asymptotic formula is remarkably close to the true price as given by the
model, even for not so small maturities T (like 5 or even 10 years), for dimensions of
up to n = 100 (or even more). The same holds true when the option is at the money,
see Sect. 6.
We now sketch the procedure for deriving the asymptotic formulas, highlighting
the differences to the non-ATM case.
• In the first step, we derive a Carr-Jarrow formula for the basket option price,
separating the price into the intrinsic value of the option (which vanishes in the
ATM case) and an integral over the arrival manifold $\{\sum_i w_i F_i = K\}$ with respect
to the transition density $p(\mathbf{F}_0, \mathbf{F}, T)$. This is done in Sect. 2.
• The first terms in the heat kernel expansion of $p(\mathbf{F}_0, \mathbf{F}, T)$ are computed. In the
non-ATM case, a zero-order heat kernel expansion was sufficient to get first order
accurate formulas for the implied volatilities. At the money, we actually need to
add one additional term in the heat kernel expansion. The heat kernel coefficients
are computed in Lemma 3.6.
• The aforementioned integral on the arrival manifold is essentially an integral with
respect to the rapidly decaying kernel $\exp\left(-d(\mathbf{F}_0, \mathbf{F})^2/(2T)\right)$, where d denotes
the Riemannian (geodesic) distance induced by the stock price process. Hence,
the integral can be approximated using Laplace's expansion for T → 0, which
involves the minimizer $\mathbf{F}^*$ of $\mathbf{F} \mapsto d(\mathbf{F}_0, \mathbf{F})^2$ subject to $\sum_i w_i F_i = K$. In the
general case, this minimizer has to be computed numerically, while it is obviously
given by $\mathbf{F}^* = \mathbf{F}_0$ when the option is at the money. On the other hand, the formulas
are much longer and more complex due to the higher order heat kernel expansion
used, see Proposition 3.4 together with Lemmas 3.3 and 3.7.
• In Sect. 4, we use the same Laplace's expansion technique to derive the local
volatility of the basket, see Proposition 4.1.
• Finally, in Sect. 5, an asymptotic expansion for the implied volatilities is computed
by a comparison of coefficients between the asymptotic expansion of the basket
price derived in Proposition 5.1 and asymptotic expansions of the Black-Scholes
and Bachelier formulas, respectively, see Eqs. (5.2)–(5.4).

1 Since we consider spread options here (for which $\sum_i w_i F_{0,i}$ may be negative), we derive implied
volatilities both in the Black-Scholes and in the Bachelier sense.
An alternative way to derive the asymptotic expansion for at-the-money options
would be to start from the non-at-the-money formulas and pass to the limit. This
would involve indeterminate terms of the form "0/0", which would need to be resolved by
l'Hôpital's rule. In particular, we would have to compute limits of derivatives of the
optimal configuration, which are not known in closed form when the option is not
at-the-money. Still, one could follow that approach using similar techniques as in
[4], but the derivation would hardly be any simpler than directly starting from scratch
again (the course of action chosen in this article).
In Sect. 6 we present numerical examples for one particular choice of a local
volatility model, namely the CEV model, corresponding to $\sigma_i(F_i) = \xi_i F_i^{\beta_i}$,
$0 \le \beta_i \le 1$, 1 ≤ i ≤ n. The numerical observations support the claimed accuracy
of the asymptotic price formulas. In fact, comparisons with highly accurate reference
solutions show that the asymptotic formulas indeed have the suggested rates of
convergence as T → 0. Even more, they indicate that the formulas, in particular the first
order formula, are highly accurate even for large maturities such as T = 10 years,
thereby confirming the observations in [5].

Remark 1.1 In the same spirit as [5], the aim of this paper is to give informal deriva-
tions of fast and accurate asymptotic formulas. Indeed, there are several steps, in
which our derivations are not fully rigorous. In particular, most local volatility models
(like the CEV model) exhibit singular behaviour at the boundary of the domain
$\mathbb{R}^n_+$ which can inhibit the validity of the heat kernel expansion, and, a fortiori, also
the Laplace expansion applied later. It is clearly possible to rigorously justify both
expansions under appropriate (uniform) ellipticity assumptions (see, for instance,
[20] for the validity of the heat kernel expansion and [6] for a rigorous version of
Laplace’s expansion). An extension to general or some specific local volatility mod-
els, however, seems to be a difficult task, see also the comments in [5, Sect. 4], and,
in particular, the results of [2]. Thus, we believe that a more “hands-on” approach
can be justified in this particular case. For related problems see also [3, 8, 9].

2 Basket Carr-Jarrow Formula



Consider a basket $B = \sum_i w_i F_i$ with weights $w_i \in \mathbb{R}$. Following [5, 7], we are now
going to derive a Carr-Jarrow formula for the price of a generalized spread option on
the basket, i.e., a decomposition of the price of the option into the intrinsic value and
an integral over the arrival manifold {B = K}. Take the Itô derivative of the basket's
price:

\[ d\sum_{i=1}^n w_i F_i(t) = \sum_{i=1}^n w_i \sigma_i(F_i(t))\, dW_i(t) = \sqrt{\underbrace{\sum_{i,j=1}^n w_i w_j \sigma_i(F_i(t)) \sigma_j(F_j(t)) \rho_{ij}}_{\sigma_{N,B}^2}}\; d\bar{W}(t), \]

for a new Brownian motion $\bar{W}$. Here we have used the notation $\sigma_{N,B}$ to indicate the
"normal volatility" of the basket which must not be confused with the lognormal
(Black) volatility $\sigma_B = \frac{\sigma_{N,B}}{\sum_{i=1}^n w_i F_i}$ used in reference [1]. Therefore, by the Itô-Tanaka
formula we have

\[ d\left( \sum_{i=1}^n w_i F_i(t) - K \right)^+ = \sum_{i=1}^n w_i 1_{\sum_j w_j F_j(t) > K}\, dF_i(t) + \frac{1}{2} \delta_{\{\mathbf{F} : \sum_i w_i F_i(t) = K\}}\, \sigma_{N,B}^2(\mathbf{F}(t))\, dt. \]

Integrating we obtain

\[ \left( \sum_{i=1}^n w_i F_i(T) - K \right)^+ = \left( \sum_{i=1}^n w_i F_i(0) - K \right)^+ + \sum_{i=1}^n \int_0^T w_i 1_{\sum_j w_j F_j(u) > K}\, dF_i(u) + \frac{1}{2} \int_0^T \delta_{\{\mathbf{F}(u) : \sum_i w_i F_i(u) = K\}}\, \sigma_{N,B}^2(\mathbf{F}(u))\, du. \]

Letting $E_K = \{F \in \mathbb{R}^n_+ : \sum_{i=1}^n w_i F_i = K\}$ and taking conditional expectations with
respect to the filtration $\mathcal{F}_0$ at time 0, we obtain, assuming $F_i(t)$ is a martingale for
each $i$ (see footnote 2):

2 In many cases of interest, Fi (t) is only a local martingale and not a martingale. But the discrepancy
is not "felt" for short times, since the set of paths that can reach the boundary has small probability
in this limit. This is known as the principle of "not feeling the boundary" for small times and is borne
out by our numerical results. More surprisingly, the boundary is not felt even for quite large times.
Small-Time Asymptotics for the At-the-Money Implied Volatility … 217

\[
C_B(F_0, K, T) = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+ + \frac{1}{2}\int_0^T E\big[\sigma_{N,B}^2\,\delta_{E_K}(B_t)\big]\,dt.
\]

Letting $u(F) := \sum_{i=1}^n w_i F_i$ and recalling $|\nabla u| = |w|$ (where $|\cdot|$ denotes the
Euclidean norm) we can re-express this as

\[
C_B(F_0, K, T) = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \frac{1}{2}\frac{1}{|w|}\int_0^T\!\!\int_{\mathbb{R}^n} |\nabla u(F)|\,\sigma_{N,B}^2(F)\,\delta_0(u(F) - K)\,p(F_0, F, t)\,dF\,dt.
\]

By the co-area formula (see [10])

\[
\int_{\mathbb{R}^n} |\nabla u(x)|\,g(x)\,dx = \int_{-\infty}^{\infty}\int_{u^{-1}(\{s\})} g(x)\,H_{n-1}(dx)\,ds
\]

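(where $H_{n-1}$ denotes the Hausdorff measure on $u^{-1}(\{s\})$), we arrive at

\[
\begin{aligned}
C_B(F_0, K, T) &= \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \frac{1}{2}\frac{1}{|w|}\int_0^T\!\!\int_{-\infty}^{\infty} \delta_0(s - K)\int_{E_s} \sigma_{N,B}^2(F)\,p(F_0, F, t)\,H_{n-1}(dF)\,ds\,dt \\
&= \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \frac{1}{2}\int_0^T \frac{1}{|w|}\int_{E_K} \sigma_{N,B}^2(F)\,p(F_0, F, t)\,H_{n-1}(dF)\,dt.
\end{aligned}
\]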

Note that Hn−1 coincides with the (n − 1)-dimensional Lebesgue measure on E K .

Proposition 2.1 The value of a call option on a basket B is given by

\[
C_B(F_0, K, T) = \Big(\sum_{i=1}^n w_i F_i(0) - K\Big)^+
+ \frac{1}{2}\int_0^T \frac{1}{|w|}\int_{E_K} \sum_{i,j=1}^n w_i w_j \sigma_i(F_i)\sigma_j(F_j)\rho_{ij}\,p(F_0, F, u)\,H_{n-1}(dF)\,du. \tag{2.1}
\]
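As a sanity check of formula (2.1), the following worked example looks at the simplest possible case. It is a sketch under assumptions that are not part of the text above: a single asset ($n = 1$, $w_1 = 1$) with constant normal volatility $\sigma$ (the Bachelier model), so that $E_K = \{K\}$, $H_0$ is the counting measure, and the transition density is Gaussian.

```latex
% One-dimensional Bachelier sanity check of (2.1); sigma constant is an assumption.
\[
p(F_0, F, u) = \frac{1}{\sigma\sqrt{2\pi u}}\exp\!\Big(-\frac{(F - F_0)^2}{2\sigma^2 u}\Big),
\qquad
C(F_0, K, T) = (F_0 - K)^+ + \frac{1}{2}\int_0^T \sigma^2\, p(F_0, K, u)\,du .
\]
% At the money (K = F_0) the exponential equals one and the integral is elementary:
\[
C(F_0, F_0, T) = \frac{\sigma}{2\sqrt{2\pi}}\int_0^T \frac{du}{\sqrt{u}}
= \frac{\sigma\sqrt{T}}{\sqrt{2\pi}} ,
\]
% which is the familiar at-the-money Bachelier call price.
```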

Using the formula for the basket's local volatility, [1, 12], expressed in the notation
introduced above, after canceling common factors we also have the following.

Proposition 2.2 The local volatility of the basket option is given by:

\[
\sigma_{loc}^2(K, T)\,K^2 =
\frac{\displaystyle\int_{E(K)} \sum_{i,j=1}^n w_i w_j \sigma_i(F_i)\sigma_j(F_j)\rho_{ij}\,p(F_0, F, T)\,H_{n-1}(dF)}
{\displaystyle\int_{E(K)} p(F_0, F, T)\,H_{n-1}(dF)}.
\]

3 A General Asymptotic Expansion Procedure

The starting points are the basket Carr-Jarrow formula of Proposition 2.1 for the calculation
of option prices and the formula of Proposition 2.2 for the calculation of local
volatilities. The next step is to approximate the transition density there using the heat
kernel. For reasons that will become clear in the course of the asymptotics, it will be
necessary to use the so-called geometric expansion

\[
p(F_0, F, t) = \frac{1}{(2\pi t)^{n/2}}\sqrt{\det g(F)}\;e^{-\frac{d^2(F_0, F)}{2t}}\,\big(u_0(F_0, F) + t\,u_1(F_0, F) + o(t)\big). \tag{3.1}
\]

For a detailed exposition of the geometrical underpinning of (3.1) we refer to [5, 12,
13, 17, 20]. Here, we just give a very quick reminder.

Remark 3.1 We shall assume that the process Ft is locally elliptic in the sense that
ρ is invertible and σi (Fi ) > 0 for any Fi > 0 and any i, i.e., for F in the interior of
the domain of the process Ft . A rigorous heat kernel expansion for locally elliptic
diffusions is given in [2], with the restriction that the expansion is only valid for
compact subsets of Rn+ .

The state space $\mathbb{R}^n_+$ is equipped with a Riemannian metric by defining the inverse
$g^{-1}$ of the metric tensor by

\[
g^{ij}(F) = \sigma_i(F_i)\,\rho_{ij}\,\sigma_j(F_j), \qquad 1 \le i, j \le n.
\]

Hence, the metric tensor itself is given by

\[
g_{ij}(F) = \sigma_i(F_i)^{-1}\,\rho^{ij}\,\sigma_j(F_j)^{-1}, \qquad 1 \le i, j \le n,
\]

with determinant

\[
\det g(F) = \det\rho^{-1}\prod_{k=1}^n \sigma_k(F_k)^{-2}
\]

(where $\rho^{ij}$ denotes the $(i, j)$-component of the inverse matrix $\rho^{-1}$ of the correlation
matrix $\rho$). The (geodesic) distance between two points $F_0$ and $F$ is denoted
by $d(F_0, F)$.
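The relation between the inverse metric, the metric tensor, and its determinant can be checked numerically. The following is a minimal sketch in plain Python; the volatility values in `sigma` are hypothetical placeholders, the helper names `det` and `inverse` are introduced here for illustration only, and the correlation matrix is the one used later in the numerical example.

```python
import math

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Matrix inverse via Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [x / piv for x in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]

# Hypothetical local volatilities sigma_i(F_i) at some fixed point F.
sigma = [1.7, 3.3, 1.5]
rho = [[1.0, 0.9167390, 0.7425194],
       [0.9167390, 1.0, 0.8099573],
       [0.7425194, 0.8099573, 1.0]]
n = len(sigma)

# Inverse metric g^{ij} = sigma_i rho_ij sigma_j, and the metric tensor as its inverse.
g_inv = [[sigma[i] * rho[i][j] * sigma[j] for j in range(n)] for i in range(n)]
g = inverse(g_inv)
rho_inv = inverse(rho)

det_g_direct = det(g)
det_g_formula = det(rho_inv) * math.prod(s ** -2 for s in sigma)
print(det_g_direct, det_g_formula)
```

Both printed numbers should agree up to rounding, and each entry `g[i][j]` should match `rho_inv[i][j] / (sigma[i] * sigma[j])`.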

The specific form of these quantities in the setting of local volatility models has
no relevance in our initial asymptotic derivations, which can be obtained for generic
versions of these. So, to lighten the notation and streamline the presentation, we
first derive the asymptotic expansions without any specific reference to these and
then plug in the specific form only at the end of the process in order to produce the
required concrete asymptotic expansions.
Plugging the heat kernel expansion (3.1) into the expressions in Propositions 2.1
and 2.2, respectively, we see that we have to compute expressions of the form

\[
\frac{1}{(2\pi t)^{n/2}}\int_{E_K} \Psi(F)\exp\!\Big(-\frac{d(F_0, F)^2}{2t}\Big)\,H_{n-1}(dF), \tag{3.2}
\]

where

\[
\Psi(F) = \bar u_i(F_0, F) := \sqrt{\det g(F)}\;\sigma_{N,B}^2(F)\,u_i(F_0, F), \qquad i = 0, 1, \tag{3.3}
\]

for the option price and for the numerator in Proposition 2.2 and

\[
\Psi(F) = \hat u_i(F_0, F) := \sqrt{\det g(F)}\;u_i(F_0, F), \qquad i = 0, 1, \tag{3.4}
\]

for the denominator in Proposition 2.2.


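The integral on the $(n-1)$-dimensional subspace $E_K$ of $\mathbb{R}^n$ can be transformed into
an integral over $\mathbb{R}^{n-1}$ by eliminating one of the variables. We choose to eliminate
the $n$th one, using the constraint

\[
F_n(F_1, \ldots, F_{n-1}, K) = \frac{1}{w_n}\Big(K - \sum_{i=1}^{n-1} w_i F_i\Big). \tag{3.5}
\]

We denote

\[
G = (F_1, \ldots, F_{n-1}) \in \mathbb{R}^{n-1}_+, \qquad
\mathcal{G}_K = \Big\{G \in \mathbb{R}^{n-1}_+ : \sum_{i=1}^{n-1} w_i F_i < K\Big\},
\]

so that the intersection of our hyperplane with the positive orthant is

\[
E_K \cap \mathbb{R}^n_+ = \Big\{F \in \mathbb{R}^n_+ : F = \Big(G, \frac{1}{w_n}\Big(K - \sum_{i=1}^{n-1} w_i F_i\Big)\Big),\ G \in \mathcal{G}_K\Big\}.
\]

Note that the set $\mathcal{G}_K$ is introduced in order to ensure that $F_n$ in (3.5) is non-negative,
as it needs to be. The set $E_K$ is an $(n-1)$-dimensional hyperplane in $\mathbb{R}^n_+$.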
Note that, when we parametrize the hyperplane $E_K$ using $(F_1, \ldots, F_{n-1})$, as
in (3.5),

\[
F_K(F_1, \ldots, F_{n-1}) = (F_1, \ldots, F_{n-1}, F_n(F_1, \ldots, F_{n-1}, K)),
\]

we will always assume that the weight multiplying $F_n$ is positive. This can always
be achieved by choosing as the $n$th asset one of the assets with a positive weight.
Then for the surface measure, we have

\[
dH_{n-1} = \sqrt{1 + |\nabla F_n|^2}\;dF_1 \cdots dF_{n-1} = \frac{|w|}{|w_n|}\,dF_1 \cdots dF_{n-1}.
\]
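The factor $|w|/|w_n|$ in the surface measure follows directly from the parametrization (3.5). The short computation below spells this out; it only uses the definition of $F_n$ and of the Euclidean norm $|w|$.

```latex
% Gradient of the graph function F_n from (3.5) and the resulting area element.
\[
\nabla F_n = -\frac{1}{w_n}\,(w_1, \ldots, w_{n-1}),
\qquad
1 + |\nabla F_n|^2 = \frac{w_n^2 + \sum_{i=1}^{n-1} w_i^2}{w_n^2} = \frac{|w|^2}{w_n^2},
\qquad
\sqrt{1 + |\nabla F_n|^2} = \frac{|w|}{|w_n|}.
\]
```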

In this notation, with $\Lambda = d^2/2$, the integral (3.2) reads

\[
\frac{1}{(2\pi t)^{n/2}}\frac{|w|}{|w_n|}\int_{\mathcal{G}_K} e^{-\Lambda(F_0, F_K(G))/t}\,\Psi(F_K(G))\,dF_1 \cdots dF_{n-1}
= \frac{1}{(2\pi t)^{n/2}}\frac{|w|}{|w_n|}\int_{\mathcal{G}_K} e^{-\Lambda(G)/t}\,\Psi(G)\,dG, \tag{3.6}
\]

using the notation $\Lambda(G) := \Lambda(F_0, F_K(G))$ and (by abuse of notation) $\Psi(G) :=
\Psi(F_K(G))$. We now use Laplace asymptotics for multiple integrals. The main
contribution comes from a neighborhood of the minimum point

\[
G^* = \operatorname*{arg\,min}_{G \in \mathcal{G}_K} d^2\big(F_0, (G, F_n(G, K))\big), \tag{3.7}
\]

at which the minimal value $d^2(F_0, E_K)$ is attained.

Set $F^*_K = (G^*, F_n(G^*, K))$. (Of course, when the option is at the money, we have
$G^* = (F_{0,1}, \ldots, F_{0,n-1})$.)
Order Zero. The zero-th order term in the Laplace expansion of

\[
\int_{\mathcal{G}_K} e^{-\Lambda(G)/t}\,\Psi(G)\,dG
\]

is identical to the one in [5] except that in the present setting we have $d(F_0, F^*_K) = 0$.
We get, as in [5],

\[
t^{\frac{n-1}{2}}\,\Psi(G^*) \int_{\mathbb{R}^{n-1}} e^{-\frac{z^T Q z}{2}}\,dz
= t^{\frac{n-1}{2}}\,\Psi(G^*)\,\frac{(2\pi)^{\frac{n-1}{2}}}{(\det Q)^{\frac{1}{2}}},
\]

where $Q = D^2\Lambda(G^*)$ is the Hessian of $\Lambda$ at the minimum point. Thus, bringing
back the missing factor and taking into account that $F^*_K = F_0$ in the current (ATM)
setting, we see that the lowest order term in the Laplace expansion of (3.2) is

\[
h_0^{\Psi} := \frac{|w|}{|w_n|}\frac{1}{\sqrt{2\pi t \det Q}}\,\Psi(F_0). \tag{3.8}
\]

Order One. For obtaining first order implied or local volatility terms in the ATM
regime, we need to push the Laplace expansion one step further, i.e., we need one
additional term for

\[
\int_{\mathcal{G}_K} e^{-\Lambda(G)/t}\,\Psi(G)\,dG.
\]

Hence, we apply the (multi-variate) Taylor expansion for $\Lambda(G) := \Lambda(F_0, F_K(G))$
up to order 4 around the minimizer $G^*$, which can be expressed in tensor notation
as

\[
\Lambda(G) = \Lambda(G^*) + \underbrace{D\Lambda(G^*)(G - G^*)}_{=0} + \frac{1}{2}D^2\Lambda(G^*)(G - G^*)^{\otimes 2}
+ \frac{1}{6}D^3\Lambda(G^*)(G - G^*)^{\otimes 3} + \frac{1}{24}D^4\Lambda(G^*)(G - G^*)^{\otimes 4} + \cdots,
\]

with

\[
D^k\Lambda(x)\,y^{\otimes k} := \sum_{i_1, \ldots, i_k} \frac{\partial^k \Lambda}{\partial x_{i_1} \cdots \partial x_{i_k}}(x)\,y_{i_1} \cdots y_{i_k}.
\]

(This notation makes sense as any multi-linear map on a vector space, such as
$D^k\Lambda(x)$, corresponds to a linear map, here also denoted by $D^k\Lambda(x)$, on the
tensor product space.) Of course, we are aware that when the option is at the money,
the optimal configuration is the same as the initial configuration $F_0$. Nonetheless, we
think that using a different symbol for the optimal configuration at this stage leads
to a clearer exposition of the underlying ideas. Likewise, we apply Taylor expansion
up to second order for the map $\Psi(G)$ around $G^*$,

\[
\Psi(G) = \Psi(G^*) + D\Psi(G^*)(G - G^*) + \frac{1}{2}D^2\Psi(G^*)(G - G^*)^{\otimes 2} + \cdots.
\]
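In the end, we are interested in small-time asymptotics, so we change variables

\[
z := \frac{1}{\sqrt{t}}\,(G - G^*),
\]

so that we can express the above Taylor expansions as expansions in $t$,

\[
\frac{1}{t}\Lambda(G) = \frac{1}{t}\Lambda(G^*) + \frac{1}{2}D^2\Lambda(G^*)z^{\otimes 2} + \frac{1}{6}D^3\Lambda(G^*)z^{\otimes 3}\sqrt{t}
+ \frac{1}{24}D^4\Lambda(G^*)z^{\otimes 4}\,t + o(t),
\]

and

\[
\Psi(G) = \Psi(G^*) + D\Psi(G^*)z\sqrt{t} + \frac{1}{2}D^2\Psi(G^*)z^{\otimes 2}\,t + o(t).
\]

Using the above Taylor expansions, the change of variables, and

\[
e^{a\sqrt{t} + bt} = 1 + a\sqrt{t} + \Big(\frac{a^2}{2} + b\Big)t + o(t),
\]

we obtain

\[
\begin{aligned}
\int_{\mathcal{G}_K} e^{-\frac{\Lambda(F_0, F_K(G))}{t}}\,\Psi(G)\,dG
= {} & t^{(n-1)/2}\,e^{-\Lambda(G^*)/t}\int_{(\mathcal{G}_K - G^*)/\sqrt{t}} e^{-\frac{1}{2}D^2\Lambda(G^*)z^{\otimes 2}} \\
& \times \Big[1 - \frac{1}{6}D^3\Lambda(G^*)z^{\otimes 3}\sqrt{t}
+ \Big(\frac{1}{2}\Big(\frac{1}{6}D^3\Lambda(G^*)z^{\otimes 3}\Big)^2 - \frac{1}{24}D^4\Lambda(G^*)z^{\otimes 4}\Big)t + o(t)\Big] \\
& \times \Big[\Psi(G^*) + D\Psi(G^*)z\sqrt{t} + \frac{1}{2}D^2\Psi(G^*)z^{\otimes 2}\,t + o(t)\Big]\,dz.
\end{aligned} \tag{3.9}
\]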

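In the next step, we approximate the integral by replacing the domain of integration
$(\mathcal{G}_K - G^*)/\sqrt{t}$ by $\mathbb{R}^{n-1}$. Then we can see that the integration kernel in (3.9) is
Gaussian with vanishing mean, so that the integral of any odd monomial with respect
to the kernel vanishes. Thus, we obtain the expansion

\[
\frac{|w|}{|w_n|}\frac{1}{(2\pi t)^{n/2}}\int_{\mathcal{G}_K} e^{-\frac{\Lambda(F_0, F_K(G))}{t}}\,\Psi(G)\,dG = h_0^{\Psi} + h_1^{\Psi}\,t + o(t), \tag{3.10}
\]

with $h_0^{\Psi}$ defined in (3.8) and

\[
\begin{aligned}
h_1^{\Psi} := \frac{|w|}{|w_n|}\frac{1}{(2\pi)^{n/2}\sqrt{t}}\int_{\mathbb{R}^{n-1}} e^{-\frac{1}{2}z^T Q z}
\Big[ & \frac{1}{2}D^2\Psi(G^*)z^{\otimes 2} - \frac{1}{6}D^3\Lambda(G^*)z^{\otimes 3} \times D\Psi(G^*)z \\
& + \frac{1}{2}\Big(\frac{1}{6}D^3\Lambda(G^*)z^{\otimes 3}\Big)^2\Psi(G^*) - \frac{1}{24}D^4\Lambda(G^*)z^{\otimes 4}\,\Psi(G^*)\Big]\,dz.
\end{aligned} \tag{3.11}
\]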

Here we assume that $\Psi$ is polynomially bounded and $F_0 > 0$ (i.e., all components
of $F_0$ are strictly positive). Indeed, under these assumptions we observe that the error
in the approximation (3.10) decays, in fact, like $e^{-1/t}$ by properties of the normal
distribution.
Using Isserlis' Theorem (see [14]), the Eq. (3.11) for $h_1^{\Psi}$ can be computed explicitly.

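Lemma 3.2 (Isserlis' theorem for fourth and sixth moments) For a covariance
matrix $\Sigma \in \mathbb{R}^{d \times d}$ let $T^2(\Sigma) \in (\mathbb{R}^d)^{\otimes 4}$ and $T^3(\Sigma) \in (\mathbb{R}^d)^{\otimes 6}$ be the tensors defined
by

\[
T^2(\Sigma)_{i_1, \ldots, i_4} = \Sigma_{i_1 i_2}\Sigma_{i_3 i_4} + \Sigma_{i_1 i_3}\Sigma_{i_2 i_4} + \Sigma_{i_1 i_4}\Sigma_{i_2 i_3}
\]

and

\[
\begin{aligned}
T^3(\Sigma)_{i_1, \ldots, i_6} = {} & \Sigma_{i_1 i_2}\Sigma_{i_3 i_4}\Sigma_{i_5 i_6} + \Sigma_{i_1 i_2}\Sigma_{i_3 i_5}\Sigma_{i_4 i_6} + \Sigma_{i_1 i_2}\Sigma_{i_3 i_6}\Sigma_{i_4 i_5} \\
& + \Sigma_{i_1 i_3}\Sigma_{i_2 i_4}\Sigma_{i_5 i_6} + \Sigma_{i_1 i_3}\Sigma_{i_2 i_5}\Sigma_{i_4 i_6} + \Sigma_{i_1 i_3}\Sigma_{i_2 i_6}\Sigma_{i_4 i_5} \\
& + \Sigma_{i_1 i_4}\Sigma_{i_2 i_3}\Sigma_{i_5 i_6} + \Sigma_{i_1 i_4}\Sigma_{i_2 i_5}\Sigma_{i_3 i_6} + \Sigma_{i_1 i_4}\Sigma_{i_2 i_6}\Sigma_{i_3 i_5} \\
& + \Sigma_{i_1 i_5}\Sigma_{i_2 i_3}\Sigma_{i_4 i_6} + \Sigma_{i_1 i_5}\Sigma_{i_2 i_4}\Sigma_{i_3 i_6} + \Sigma_{i_1 i_5}\Sigma_{i_2 i_6}\Sigma_{i_3 i_4} \\
& + \Sigma_{i_1 i_6}\Sigma_{i_2 i_3}\Sigma_{i_4 i_5} + \Sigma_{i_1 i_6}\Sigma_{i_2 i_4}\Sigma_{i_3 i_5} + \Sigma_{i_1 i_6}\Sigma_{i_2 i_5}\Sigma_{i_3 i_4},
\end{aligned}
\]

$1 \le i_1, \ldots, i_6 \le d$. For $Z \sim N(0, \Sigma)$ we have

\[
E\big[Z^{\otimes 4}\big] = T^2(\Sigma), \qquad E\big[Z^{\otimes 6}\big] = T^3(\Sigma).
\]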

Hence, we can get an explicit formula also for $h_1^{\Psi}$ in terms of derivatives of $\Lambda$
and $\Psi$, which are easy to compute, but lead to quite long formulas that depend on
the individual choice of the local volatility model. These formulas are not included
here, as they essentially boil down to exercises in the product rule.
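The moment tensors in Isserlis' theorem are sums over all ways of pairing up the indices (3 pairings for four indices, 15 for six). The sketch below enumerates these pairings in plain Python; the function names `pairings` and `isserlis_moment` and the example covariance matrix are illustrative choices, not taken from the text.

```python
def pairings(idx):
    """Yield every perfect matching of the list idx as a list of index pairs."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def isserlis_moment(cov, indices):
    """E[Z_{i1} ... Z_{im}] for Z ~ N(0, cov), as a sum over pairings of positions.

    For an odd number of indices there is no perfect matching, so the sum is 0.
    """
    total = 0.0
    for match in pairings(list(range(len(indices)))):
        term = 1.0
        for a, b in match:
            term *= cov[indices[a]][indices[b]]
        total += term
    return total

# Hypothetical 2x2 covariance matrix used only to exercise the functions.
cov = [[2.0, 0.5],
       [0.5, 1.0]]

n_pairings_4 = len(list(pairings([0, 1, 2, 3])))
n_pairings_6 = len(list(pairings([0, 1, 2, 3, 4, 5])))
fourth = isserlis_moment(cov, (0, 0, 1, 1))        # component of T^2(cov)
sixth = isserlis_moment(cov, (0, 0, 0, 0, 0, 0))   # component of T^3(cov)
print(n_pairings_4, n_pairings_6, fourth, sixth)
```

Evaluating `isserlis_moment` on a four-index or six-index tuple gives the corresponding component of $T^2$ or $T^3$, so contracting with the derivative tensors in (3.11) reduces to nested sums over indices.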
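Lemma 3.3 With the short-hand notation $\partial_{i_1, \ldots, i_k} := \frac{\partial^k}{\partial F_{i_1} \cdots \partial F_{i_k}}$, we have

\[
\begin{aligned}
h_1^{\Psi} = \frac{|w|}{|w_n|}\frac{1}{\sqrt{2\pi t \det Q}}\Big[ & \frac{1}{2}D^2\Psi(G^*)\,Q^{-1}
- \frac{1}{6}\sum_{i_1, \ldots, i_4} (\partial_{i_1, i_2, i_3}\Lambda)(G^*)\,(\partial_{i_4}\Psi)(G^*)\,T^2(Q^{-1})_{i_1, \ldots, i_4} \\
& + \frac{1}{72}\,\Psi(G^*)\sum_{i_1, \ldots, i_6} (\partial_{i_1, i_2, i_3}\Lambda)(G^*)\,(\partial_{i_4, i_5, i_6}\Lambda)(G^*)\,T^3(Q^{-1})_{i_1, \ldots, i_6} \\
& - \frac{1}{24}\,\Psi(G^*)\sum_{i_1, \ldots, i_4} (\partial_{i_1, \ldots, i_4}\Lambda)(G^*)\,T^2(Q^{-1})_{i_1, \ldots, i_4}\Big].
\end{aligned}
\]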

These results are summarized in

Proposition 3.4 Assume that we have a locally elliptic local volatility model such
that the heat kernel expansion (3.1) holds, initial stock prices $F_0$ are strictly
positive, and $\Psi$ is polynomially bounded. Moreover, we assume that the minimization
problem (3.7) has a unique solution. Then we have the Laplace expansion

\[
\frac{1}{(2\pi t)^{n/2}}\int_{E_K} \Psi(F)\exp\!\Big(-\frac{d(F_0, F)^2}{2t}\Big)\,H_{n-1}(dF) = h_0^{\Psi} + t\,h_1^{\Psi} + o(t)
\]

with $h_0^{\Psi}$ given in (3.8) and $h_1^{\Psi}$ given in Lemma 3.3.

Remark 3.5 We note that the key assumptions of Proposition 3.4 are not easy to
verify in general. We refer to [5] for elements of a proof of the heat kernel expansion
and to [3] for further discussion.

The last ingredient needed for the asymptotic expansions of both implied and
local volatilities are the heat kernel coefficients u 0 and u 1 . As we are assuming the
options to be ATM, we only need the heat kernel coefficients on the diagonal.

Lemma 3.6 For a local volatility model, we have the following formulas for the
heat kernel coefficients on the diagonal:

\[
u_0(F, F) = 1, \qquad
u_1(F, F) = \frac{1}{4}\sum_{i=1}^n \sigma_i(F_i)\,\sigma_i''(F_i) - \frac{1}{8}\sum_{i,j=1}^n \sigma_i'(F_i)\,\rho^{ij}\,\sigma_j'(F_j),
\]

where, as usual, $\rho^{ij}$ denotes the $(i, j)$-component of $\rho^{-1}$.

Proof Note that the infinitesimal generator $A$ of the process $F(t)$ can be expressed
(using the summation convention) as

\[
A = \frac{1}{2}\Delta_g - \frac{1}{2}f^i(F)\frac{\partial}{\partial F_i},
\]

where

\[
\Delta_g = \frac{1}{\sqrt{\det g}}\frac{\partial}{\partial F_i}\Big(g^{ij}\sqrt{\det g}\,\frac{\partial}{\partial F_j}\Big)
\]

denotes the Laplace-Beltrami operator associated to $g$ and the vector field $f$ is given
by

\[
f^i(F) = \sigma_i(F_i)\,\sigma_i'(F_i), \qquad i = 1, \ldots, n.
\]

As indicated in (3.1), the transition density of the process $F(t)$ satisfies (under certain
assumptions, see Proposition 3.4 and Remark 3.5)

\[
p(F_0, F, T) = \frac{1}{(2\pi T)^{n/2}}\sqrt{\det g(F)}\;e^{-\frac{d(F_0, F)^2}{2T}}\big(u_0(F_0, F) + T\,u_1(F_0, F) + o(T)\big),
\]

where $d(F_0, F)$ is the geodesic distance between $F_0$ and $F$, and $u_0$ and $u_1$ are the heat
kernel coefficients.
The order zero heat kernel coefficient is given by
$u_0(F_0, F) = \sqrt{\Delta(F_0, F)}\;e^{-\frac{1}{2}\int_\gamma \langle f, \dot\gamma\rangle_g}$, where
$\int_\gamma \langle f, \dot\gamma\rangle_g$ is understood as the integral along the geodesic $\gamma$ joining
$F_0$ and $F$, and $\Delta(F_0, F)$ is the Van Vleck-De Witt determinant,

\[
\Delta(F_0, F) = \frac{1}{\sqrt{\det g(F_0)\det g(F)}}\det\Big(-\frac{1}{2}\frac{\partial^2 d^2(F_0, F)}{\partial F_0\,\partial F}\Big).
\]

On the diagonal, we clearly have $\int_\gamma \langle f, \dot\gamma\rangle_g = 0$, and for any local volatility model we
have $\Delta(F_0, F) \equiv 1$, as the geometry is isomorphic to the Euclidean geometry by the
coordinate transformation $F \to Ly$, where $L\rho L^T = \mathrm{Id}$ and $y_i := \int_0^{F_i} \sigma_i(u)^{-1}\,du$.
Hence, $u_0(F, F) = 1$.
For the first order heat kernel coefficient, we refer to [16, Eq. (4.1)], where it is
shown that

\[
u_1(F, F) = \frac{1}{6}\kappa + \frac{1}{4}\operatorname{div}_g f(F) - \frac{1}{8}|f(F)|_g^2.
\]

Here, $\kappa$ denotes the scalar curvature, which vanishes for local volatility models due
to the isomorphism with the Euclidean geometry already used above. (Note that [16]
consider the heat kernel corresponding to $\Delta + f$, whereas we consider the operator
$\frac{1}{2}\Delta + \frac{1}{2}f$. Hence, we evaluate the formula obtained in [16, Eq. (4.1)] at $t/2$ instead
of $t$.) For the remaining terms we have

\[
\operatorname{div}_g f(F) = \frac{1}{\sqrt{\det g(F)}}\frac{\partial}{\partial F_i}\Big(f^i(F)\sqrt{\det g(F)}\Big) = \sigma_i(F_i)\,\sigma_i''(F_i),
\qquad
|f(F)|_g^2 = g_{ij}(F)\,f^i(F)\,f^j(F) = \sigma_i'(F_i)\,\rho^{ij}\,\sigma_j'(F_j). \qquad \square
\]

Finally, we can explicitly compute the determinant of the Hessian $Q$ of $\Lambda$ at
$G^* = (F_{0,1}, \ldots, F_{0,n-1})$ in the ATM regime.

Lemma 3.7 The Hessian $Q$ of $\Lambda$ satisfies

\[
\det Q = \frac{\sum_{i,j=1}^n w_i\sigma_i(F_{0,i})\,\rho_{ij}\,w_j\sigma_j(F_{0,j})}{w_n^2\det\rho\,\prod_{i=1}^n \sigma_i(F_{0,i})^2}
= \sigma_{N,B}^2(F_0)\det g(F_0)/w_n^2.
\]

The proof of Lemma 3.7 is deferred to the Appendix.

4 Basket Local Volatility

The numerator in the right hand side of the formula in Proposition 2.2 is given by

\[
\frac{1}{(2\pi t)^{n/2}}\int_{E_K} \big(\bar u_0(F_0, F) + t\,\bar u_1(F_0, F)\big)\exp\!\Big(-\frac{d(F_0, F)^2}{2t}\Big)\,H_{n-1}(dF)
= h_0^{\bar u_0} + t\big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) + o(t),
\]

where, by abuse of notation, we denote the function $F \mapsto \bar u_i(F_0, F)$ by $\bar u_i$ again, $i = 0, 1$.
For the denominator, we get

\[
\frac{1}{(2\pi t)^{n/2}}\int_{E_K} \big(\hat u_0(F_0, F) + t\,\hat u_1(F_0, F)\big)\exp\!\Big(-\frac{d(F_0, F)^2}{2t}\Big)\,H_{n-1}(dF)
= h_0^{\hat u_0} + t\big(h_1^{\hat u_0} + h_0^{\hat u_1}\big) + o(t).
\]

As

\[
\frac{a_1 + b_1 t + o(t)}{a_2 + b_2 t + o(t)} = \frac{a_1}{a_2} + \frac{a_2 b_1 - a_1 b_2}{a_2^2}\,t + o(t),
\]

we arrive at

\[
\sigma_{loc}(K, T)^2 K^2 = \frac{h_0^{\bar u_0}}{h_0^{\hat u_0}}
+ \frac{h_0^{\hat u_0}\big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) - h_0^{\bar u_0}\big(h_1^{\hat u_0} + h_0^{\hat u_1}\big)}{\big(h_0^{\hat u_0}\big)^2}\,T + o(T).
\]

As $\bar u_0 = \sigma_{N,B}^2\,\hat u_0$, we can easily simplify

\[
\frac{h_0^{\bar u_0}}{h_0^{\hat u_0}} = \sigma_{N,B}^2(F_0).
\]

For the first order term, we note that all the terms $h_i^{\bar u_j}$ and $h_i^{\hat u_j}$ have the common
factor $\frac{|w|}{|w_n|}\frac{1}{\sqrt{2\pi T \det Q}}$, which, hence, cancels out in the first order term. In particular,
this implies that the "first order term" is really first order in $T$. Thus, we get

Proposition 4.1 For $K = \bar F_0 = \sum_{i=1}^n w_i F_{0,i}$, the basket local volatility has the
asymptotic expansion $\sigma_{loc}^2(T, K) = \sigma_{loc,0}^2(K) + \sigma_{loc,1}^2(K)\,T + o(T)$, with

\[
\sigma_{loc,0}^2(K) = \frac{\sigma_{N,B}^2(F_0)}{K^2}, \qquad
\sigma_{loc,1}^2(K) = \frac{h_0^{\hat u_0}\big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) - h_0^{\bar u_0}\big(h_1^{\hat u_0} + h_0^{\hat u_1}\big)}{\big(h_0^{\hat u_0}\big)^2 K^2}.
\]

We recall the definition

\[
\sigma_{N,B}(F)^2 = \sum_{i,j=1}^n w_i\sigma_i(F_i)\,\rho_{ij}\,w_j\sigma_j(F_j).
\]

5 Implied Volatility

The strategy for obtaining an asymptotic expansion for the implied volatility is as
follows: we first compute an asymptotic expansion of the basket option price in our
local volatility model, then we compare coefficients with the short time expansion
of the corresponding call option price in the Black-Scholes or Bachelier model,
respectively. Hence, we first apply our general asymptotic expansion obtained in
Proposition 3.4 to the Carr-Jarrow formula from Proposition 2.1 and obtain, for $K = \bar F_0$,

\[
\begin{aligned}
C_B(F_0, K, T) &= \frac{1}{2|w|}\int_0^T \Big[h_0^{\bar u_0} + t\big(h_1^{\bar u_0} + h_0^{\bar u_1}\big) + o(\sqrt{t})\Big]\,dt \\
&= \frac{1}{2}\int_0^T \Big[\frac{g_0^{\bar u_0}}{\sqrt{t}} + \sqrt{t}\big(g_1^{\bar u_0} + g_0^{\bar u_1}\big) + o(\sqrt{t})\Big]\,dt \\
&= g_0^{\bar u_0}\sqrt{T} + \frac{1}{3}\big(g_1^{\bar u_0} + g_0^{\bar u_1}\big)T^{3/2} + o\big(T^{3/2}\big),
\end{aligned}
\]

where

\[
g_i^{\bar u_j} := \frac{\sqrt{t}}{|w|}\,h_i^{\bar u_j}, \qquad i, j = 0, 1, \tag{5.1}
\]

is independent of $t$. Finally, using (3.8) together with (3.3), and Lemma 3.7, we get

\[
g_0^{\bar u_0} = \frac{\sigma_{N,B}^2(F_0)\sqrt{\det g(F_0)}}{|w_n|\sqrt{2\pi\det Q}} = \frac{\sigma_{N,B}(F_0)}{\sqrt{2\pi}}.
\]

Proposition 5.1 The expansion of the call prices (at-the-money) in drift-less local
volatility models is asymptotically equivalent, to first order, to

\[
C_B(F_0, K, T) = \frac{\sigma_{N,B}(F_0)}{\sqrt{2\pi}}\sqrt{T} + \frac{1}{3}\big(g_1^{\bar u_0} + g_0^{\bar u_1}\big)T^{3/2} + o(T^{3/2})
\]

as $T \to 0$.
In the final step, we compute an expansion of the implied volatility with respect
to either the Black-Scholes or the Bachelier model. Let us consider the prices of call options
with stock price $\bar F_0 = \sum_{i=1}^n w_i F_{0,i} = K$ in the Black-Scholes and Bachelier models,
assuming that the respective volatilities are of the form $\sigma_{BS} = \sigma_{BS,0} + T\sigma_{BS,1}$ and
$\sigma_{Bach} = \sigma_{Bach,0} + T\sigma_{Bach,1}$. We obtain the well known formulas

\[
C_{BS}(\bar F_0, K, T) = C_{BS}(K, K, T) =
\frac{K}{\sqrt{2\pi}}\,\sigma_{BS,0}\sqrt{T} + \frac{K}{\sqrt{2\pi}}\Big[\sigma_{BS,1} - \frac{1}{24}\sigma_{BS,0}^3\Big]T^{3/2} + o(T^{3/2}),
\]

\[
C_{Bach}(\bar F_0, K, T) = C_{Bach}(K, K, T) =
\frac{K}{\sqrt{2\pi}}\,\sigma_{Bach,0}\sqrt{T} + \frac{K}{\sqrt{2\pi}}\,\sigma_{Bach,1}T^{3/2} + o(T^{3/2}).
\]
5.1 Zeroth Order Implied Volatility

Although they are well known, we recall the zeroth order implied volatility coefficients
and some of their properties. By comparison of coefficients, see Proposition 5.1 and
the above expansions for $C_{BS}$ and $C_{Bach}$, respectively, we find that

\[
\sigma_{BS,0} = \sigma_{Bach,0} = \frac{1}{|w_n| K}\,\bar u_0(F_0, F_0)\,(\det Q)^{-\frac{1}{2}} = \frac{\sigma_{N,B}(F_0)}{\bar F_0}, \tag{5.2}
\]

where we also used $\bar F_0 = K$. Note, in particular, that the basket implied volatility
(5.2) can be interpreted as a weighted mean of the individual components' (ATM)
implied volatilities in the sense that
$(\sigma_{BS,0})^2 = \sum_{i,j=1}^n \rho_{ij}\,w_i\frac{F_{0,i}}{K}\sigma_{BS,0}^i\,w_j\frac{F_{0,j}}{K}\sigma_{BS,0}^j$.
Remark 5.2 The right hand side in Eq. (5.2) is nothing but the local volatility of the
basket $\sum_{i=1}^n w_i F_i$ at $F_0$ in the Black-Scholes (i.e., log-normal) sense. Hence, we
have obtained that the zero order term in the small time expansion of the implied
volatility of the basket is equal to its local volatility when we consider an ATM option.
That result is not surprising in light of [11], where similar results were obtained (in
one-dimensional models). In this sense, one could even take (5.2) as an ex-post
justification of Lemma 3.7.

5.2 First Order Implied Volatility

The first order implied volatilities in the Black-Scholes and the Bachelier model do
not coincide any more. Indeed, we immediately have the first order correction term
in the Bachelier model

\[
\sigma_{Bach,1} = \frac{\sqrt{2\pi}}{3K}\big(g_1^{\bar u_0} + g_0^{\bar u_1}\big). \tag{5.3}
\]

On the other hand, for the Black-Scholes model we have

\[
\sigma_{BS,1} = \frac{\sqrt{2\pi}}{3K}\big(g_1^{\bar u_0} + g_0^{\bar u_1}\big) + \frac{\sigma_{BS,0}^3}{24} = \sigma_{Bach,1} + \frac{\sigma_{BS,0}^3}{24}, \tag{5.4}
\]

implying that implied volatility quoted in the Black-Scholes framework is strictly
larger than the implied volatility in the Bachelier framework up to first order (the
prices are, of course, equal up to first order).

6 Numerical Results

6.1 The CEV Model

As in [5], we consider the CEV model for the numerical examples. The CEV model
is a special case of the general local volatility model considered so far, where the
local volatilities are given by

\[
\sigma_i(F_i) = \xi_i F_i^{\beta_i}, \qquad i = 1, \ldots, n,
\]

for some parameters $\xi_i \ge 0$ and $\beta_i > 0$. In fact, the most realistic scenario here is
$0 < \beta_i \le 1$. Note that we allow $\beta_i < 1/2$, which implies degenerate densities of $F_t$
at the boundary.
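For the CEV specification the derivatives entering the diagonal heat kernel coefficient of Lemma 3.6 are explicit. The worked formula below is a sketch: it assumes the reading $u_1(F,F) = \frac14\sum_i \sigma_i\sigma_i'' - \frac18\sum_{i,j}\sigma_i'\rho^{ij}\sigma_j'$, which is what the proof of Lemma 3.6 yields, with $\rho^{ij}$ the entries of $\rho^{-1}$.

```latex
% CEV derivatives: sigma_i(F_i) = xi_i F_i^{beta_i}.
\[
\sigma_i'(F_i) = \xi_i\beta_i F_i^{\beta_i - 1},
\qquad
\sigma_i(F_i)\,\sigma_i''(F_i) = \xi_i^2\,\beta_i(\beta_i - 1)\,F_i^{2\beta_i - 2},
\]
% Substituting into the diagonal coefficient u_1:
\[
u_1(F, F) = \frac{1}{4}\sum_{i=1}^n \xi_i^2\,\beta_i(\beta_i - 1)\,F_i^{2\beta_i - 2}
- \frac{1}{8}\sum_{i,j=1}^n \xi_i\beta_i F_i^{\beta_i - 1}\,\rho^{ij}\,\xi_j\beta_j F_j^{\beta_j - 1}.
\]
% For beta_i = 1 (log-normal case) the first sum vanishes.
```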

6.2 Implementation of the Approximate Formulas and Simulation

Implementation of the zero order terms of the implied volatilities in either Black-
Scholes or Bachelier setting is, of course, easy using (5.2). On the other hand, the
formulas for $\sigma_{BS,1}$ and $\sigma_{Bach,1}$ are much less straightforward to implement. While the
formulas in the ATM case are fully explicit (unlike in [5]), an efficient implementation
is much less trivial. The formula for $h_1^{\Psi}$ in Lemma 3.3, for instance, depends on the
derivatives up to order four of the squared Riemannian distance at $F_0$ and on the
Jacobi matrix of $F \mapsto u_0(F_0, F)$. Already the evaluation of the $(n-1) \times (n-1) \times
(n-1) \times (n-1)$ tensor $D^4\Lambda$ can be very time-consuming if a naive implementation is
used, which does not take into account that most derivatives actually vanish. But even
when more efficient implementations are used, the sheer size of the tensor may impose
limitations on the dimension of the problem. So far, we have implemented (3.11) in
Mathematica using symbolic differentiation of the squared Riemannian distance and
the zeroth order heat kernel coefficient $u_0$, which works for small dimensions, up to
$n = 5$, say.
As in the paper [5], we compare the approximate prices against prices obtained
from sophisticated Monte Carlo simulation. Here, the CEV-SDE is discretized using
the Ninomiya-Victoir scheme [18], which is a second order weak approximation
scheme based on a splitting of the generator. Strictly speaking, the CEV process
violates the strong regularity assumption of that scheme, especially at the boundary
of the domain, but, as often in equity modelling, we do empirically observe second
order convergence for CEV-baskets, yet another beneficial effect of "not feeling
the boundary". For variance reduction, we combine the discretization with the mean
value Monte Carlo method, see [19]. This is a variant of the control variate technique,
where a linear combination of one-dimensional geometrical Brownian motions is
used as control variate. More precisely, we freeze each component but one of the
basket, and replace the dynamics of the remaining component by a corresponding Black-
Scholes dynamics. In the resulting model, the true option price can be explicitly
calculated. Finally, we choose a linear combination of those partially frozen models
so as to minimize the variance of the Monte Carlo estimator.
The expectation of the random variable obtained by combining the Ninomiya-
Victoir discretization of the CEV process and the mean value Monte Carlo method
is then approximated using Sobol numbers. In some sense, this contradicts the above
motivation for the variance reduction, but we do find empirically that the integration
error for a Quasi Monte Carlo estimator is also reduced by the variance reduction, i.e.,
the variance reduction also seems to reduce the number of most relevant dimensions
of the integration problem. Finally, we sacrifice some of the accuracy available by the
combination of the three techniques mentioned so far by introducing a random shift
of the Sobol numbers, i.e., we use the Randomized Quasi Monte Carlo technique,
see L'Ecuyer [15]. In this way, we can obtain reliable computable error bounds for
the integration error.

6.3 Numerical Example

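We consider a three-dimensional spread option, which is determined by the following
parameters:

\[
F_0 = \begin{pmatrix} 8 \\ 17 \\ 12 \end{pmatrix}, \quad
\sigma = \begin{pmatrix} 0.4 \\ 0.8 \\ 0.7 \end{pmatrix}, \quad
\beta = \begin{pmatrix} 0.7 \\ 0.5 \\ 0.3 \end{pmatrix}, \quad
w = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix},
\]

with a correlation matrix

\[
\rho = \begin{pmatrix}
1 & 0.9167390 & 0.7425194 \\
0.9167390 & 1 & 0.8099573 \\
0.7425194 & 0.8099573 & 1
\end{pmatrix}.
\]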

We compute the ATM price, i.e., the option price at $K = 21$, for maturities
$T \in \{0.5, 1, 2, 5, 10\}$ years, which we compare with the zeroth and first order prices
in the corresponding Bachelier model. We also report $\sigma_{Bach,0} = 0.1487036$ and
$\sigma_{Bach,1} = -6.72781 \times 10^{-5}$. Note that the "error bounds" reported in Tables 1
and 2 are upper estimates for the integration error (i.e., quasi Monte Carlo error) for
the reference values. Hence, numbers obtained from the first order approximation
formula are within the error bounds around the reference values.
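The zeroth order quantities of this example can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: the vector labelled $\sigma$ in the parameter list is read as the CEV coefficients $\xi_i$, the value of $\sigma_{Bach,1}$ is copied from the text rather than recomputed (it would require the full formula of Lemma 3.3), and prices are formed from the Bachelier expansion with volatility $\sigma_{Bach,0} + T\sigma_{Bach,1}$. The function names are hypothetical.

```python
import math

F0 = [8.0, 17.0, 12.0]
xi = [0.4, 0.8, 0.7]       # listed as "sigma" in the parameters; read here as CEV xi_i
beta = [0.7, 0.5, 0.3]
w = [-1.0, 1.0, 1.0]
rho = [[1.0, 0.9167390, 0.7425194],
       [0.9167390, 1.0, 0.8099573],
       [0.7425194, 0.8099573, 1.0]]
SIGMA_BACH_1 = -6.72781e-5  # value reported in the text, not recomputed here

def normal_basket_vol(F, xi, beta, w, rho):
    """sigma_{N,B}(F) = sqrt(sum_ij w_i sigma_i rho_ij w_j sigma_j) with CEV sigma_i."""
    a = [w[i] * xi[i] * F[i] ** beta[i] for i in range(len(F))]
    var = sum(a[i] * rho[i][j] * a[j]
              for i in range(len(F)) for j in range(len(F)))
    return math.sqrt(var)

K = sum(wi * fi for wi, fi in zip(w, F0))                 # ATM strike
sigma_bach_0 = normal_basket_vol(F0, xi, beta, w, rho) / K  # formula (5.2)
sigma_bs_1 = SIGMA_BACH_1 + sigma_bach_0 ** 3 / 24.0        # relation (5.4)

def bachelier_price(T, order):
    """ATM price K * vol * sqrt(T) / sqrt(2*pi), with vol truncated at the given order."""
    vol = sigma_bach_0 + (SIGMA_BACH_1 * T if order >= 1 else 0.0)
    return K * vol * math.sqrt(T) / math.sqrt(2.0 * math.pi)

for T in (0.5, 1.0, 2.0, 5.0, 10.0):
    print(T, round(bachelier_price(T, 0), 5), round(bachelier_price(T, 1), 5))
```

The printed columns can be compared with the zeroth and first order prices in Table 1; the Monte Carlo reference prices themselves are not reproduced by this sketch.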
In Fig. 1, we plot (linear interpolations of) the relative errors of the zeroth and
first order approximate pricing formulas close to the money (as obtained in [5]) and
compare them to the ATM-formulas represented by circles. We see that the accuracy
is extremely good in both cases, and that our approximation formulas for ATM CEV-
basket options nicely interpolate the formulas available away from the money. Indeed,
deviations from the non-ATM values only appear at very small orders of magnitude
in the logarithmic scale of Fig. 1 (where the Monte Carlo error contained in the
reference values probably dominates). For the sake of completeness, Fig. 2 reports
the absolute errors of the respective asymptotic formulas over a wide range of strike
prices, indicating that the asymptotic formulas exhibit their worst quality ATM.

Table 1 Prices
Time   Price     0th order price   1st order price   Error bound
0.5    0.88073   0.88092           0.88072           2.43e-05
1      1.24525   1.24581           1.24524           4.63e-05
2      1.76023   1.76184           1.76024           8.90e-05
5      2.77895   2.78571           2.77941           3.21e-04
10     3.91968   3.93959           3.92176           5.92e-04
Error bounds given correspond to the (quasi) Monte Carlo error in the numerical scheme. The
discretization error is of higher order

Table 2 Relative errors
Time   0th order rel. error   1st order rel. error   Error bound
0.5    2.19e-04               6.85e-06               2.43e-05
1      4.49e-04               3.80e-06               4.63e-05
2      9.15e-04               9.02e-06               8.90e-05
5      2.43e-03               1.65e-04               3.21e-04
10     5.08e-03               5.33e-04               5.92e-04
Error bounds given correspond to the (quasi) Monte Carlo error in the numerical scheme. The
discretization error is of higher order

Appendix A: Proof of Lemma 3.7

We present a proof of Lemma 3.7. Recall that we want to compute the determinant
of the Hessian $Q$ of the map

\[
\Lambda(G) := \frac{1}{2}\,d\big(F_0, (G, F_n(G, K))\big)^2
\]

evaluated at $G = (F_{0,1}, \ldots, F_{0,n-1})$. Let $S_i(x)$ denote the anti-derivative of $1/\sigma_i$
satisfying (for simplicity) $S_i(F_{0,i}) = 0$. Now consider the change of variables $F \to y$
with $y_i := S_i(F_i)$, $i = 1, \ldots, n$. As verified in [5], this transformation turns the
Riemannian geometry introduced above into an (almost) Euclidean geometry, with

\[
d(F_0, F)^2 = y^T\rho^{-1}y.
\]
[Figure 1: relative error (logarithmic scale) plotted against the strike K between 20.90 and 21.10, for T = 0.5, 1.0, 2.0, 5.0 and 10.0]
Fig. 1 Relative errors. Solid lines correspond to prices obtained from (non-ATM) zeroth order
approximate formulas, dashed lines to (non-ATM) first order approximate formulas. The
corresponding ATM-approximate prices are represented by circles and other symbols. Note that the
option is ATM for K = 21

Of course, the constraint on $F$ translates into a constraint on $y$, which can be removed
by eliminating one variable. Indeed, setting $x := (y_1, \ldots, y_{n-1})$, we get

\[
y_n(x) = S_n(F_n) = S_n\Big(\frac{1}{w_n}\Big(K - \sum_{j=1}^{n-1} w_j S_j^{-1}(y_j)\Big)\Big).
\]

This way, we understand $\Lambda(G)$ as a function $\varphi(x)$ in the new (reduced) coordinates,
and obtain for the Hessian

\[
H_G\Lambda(G) = J(G)^T\,H_x\varphi(x)\,J(G),
\]

where $H_G$ and $H_x$ denote the Hessians in the $G$- and $x$-coordinates, respectively,
and $J(G)$ denotes the Jacobian matrix of the change of coordinates $G \to x$. As
$S_i' = 1/\sigma_i$, we have $J(G) = \mathrm{diag}(1/\sigma_1(F_1), \ldots, 1/\sigma_{n-1}(F_{n-1}))$. Regarding the
matrix $H_x\varphi$, an elementary calculation, using the fact that $F = F_0$ corresponds to
$y = 0$, gives

\[
H_x\varphi(0) = \Big(\rho^{ij} - \frac{w_j\sigma_j(F_{0,j})}{w_n\sigma_n(F_{0,n})}\rho^{in} - \frac{w_i\sigma_i(F_{0,i})}{w_n\sigma_n(F_{0,n})}\rho^{jn}
+ \frac{w_i\sigma_i(F_{0,i})\,w_j\sigma_j(F_{0,j})}{w_n^2\sigma_n(F_{0,n})^2}\rho^{nn}\Big)_{i,j=1}^{n-1}.
\]

[Figure 2: absolute error (logarithmic scale) plotted against the strike K between 10 and 30, for T = 0.5, 1.0, 2.0, 5.0 and 10.0]
Fig. 2 Absolute errors. Solid lines correspond to prices obtained from (non-ATM) zeroth order
approximate formulas, dashed lines to (non-ATM) first order approximate formulas. The
corresponding ATM-approximate prices are represented by circles and other symbols. Note that the
option is ATM for K = 21

From the structure of the above expression and the expression in Lemma 3.7, we see
that we may assume that $w_i = 1$, $i = 1, \ldots, n$, and $\sigma_n(F_{0,n}) = 1$. In this case, we
are left to prove that the determinant of the matrix

\[
A := \big(\rho^{ij} - \rho^{in}s_j - \rho^{jn}s_i + \rho^{nn}s_i s_j\big)_{i,j=1}^{n-1}
\]

is equal to the expression $a := s^T\rho s/\det\rho$, where we used the short-hand notation
$s_i = \sigma_i(F_{0,i})$, $i = 1, \ldots, n-1$, and $s_n = 1$, and $s = (s_1, \ldots, s_n)$.
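Before going through the combinatorial proof, the identity $\det A = s^T\rho s/\det\rho$ can be checked numerically. The sketch below does this in plain Python for the correlation matrix of the numerical example; the vector `s` is an arbitrary hypothetical choice with last component 1, the entries of $A$ are built from the inverse matrix $\rho^{-1}$, and the helpers `det` and `inverse` are illustrative.

```python
def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return d

def inverse(m):
    """Matrix inverse via Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        piv = a[c][c]
        a[c] = [x / piv for x in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [x - f * y for x, y in zip(a[r], a[c])]
    return [row[n:] for row in a]

rho = [[1.0, 0.9167390, 0.7425194],
       [0.9167390, 1.0, 0.8099573],
       [0.7425194, 0.8099573, 1.0]]
n = len(rho)
P = inverse(rho)          # P[i][j] plays the role of rho^{ij}
s = [0.7, -1.3, 1.0]      # hypothetical values; the last entry s_n must equal 1
last = n - 1

# A_ij = rho^{ij} - rho^{in} s_j - rho^{jn} s_i + rho^{nn} s_i s_j for i, j < n.
A = [[P[i][j] - P[i][last] * s[j] - P[j][last] * s[i] + P[last][last] * s[i] * s[j]
      for j in range(n - 1)] for i in range(n - 1)]

lhs = det(A)
rhs = sum(s[i] * rho[i][j] * s[j] for i in range(n) for j in range(n)) / det(rho)
print(lhs, rhs)
```

The two printed values should coincide up to rounding; changing the first $n-1$ entries of `s` leaves the agreement intact, in line with the polynomial identity proved below.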
As both $\det A$ and $a$ are polynomials in $s_1, \ldots, s_{n-1}$, we prove this equality by
establishing that they have the same coefficients. Here, Cramer's rule is the essential
tool:

\[
B^{-1} = \frac{1}{\det B}\,\mathrm{Adj}(B),
\]

where the adjugate matrix $\mathrm{Adj}\,B$ is the transpose of the matrix of co-factors, i.e.,

\[
(\mathrm{Adj}\,B)_{ij} = (-1)^{i+j}\det B_{\hat\jmath\hat\imath},
\]

with $B_{\hat\jmath\hat\imath}$ being obtained from $B$ by removing the $j$th row and the $i$th column. By
symmetry, we hence have

\[
\frac{\rho_{ij}}{\det\rho} = (-1)^{i+j}\det\rho^{-1}_{\hat\imath\hat\jmath}, \qquad \forall (i, j) \in \{1, \ldots, n-1\}^2, \tag{A.1}
\]

where $\rho^{-1}_{\hat\imath\hat\jmath}$ is understood in the sense of $(\rho^{-1})_{\hat\imath\hat\jmath}$.
Let us also establish a few notations. Let $S_{n-1}$ be the set of all permutations of
$\{1, \ldots, n-1\}$ and let, similarly, $S(A; B)$ denote the set of all bijective maps from
$A \subset \mathbb{N}$ to $B \subset \mathbb{N}$, with $A$, $B$ having the same (finite) size. Moreover, the definition of
the signature sign is extended to $S(A; B)$ in the obvious way (as being $\pm 1$ depending
on the number of inversions being even or odd). Moreover, for a monomial $x$ in the
variables $s_1, \ldots, s_{n-1}$ we denote by $\pi_x p$ the coefficient of any polynomial $p$ w.r.t.
the monomial $x$. In order to establish Lemma 3.7, we need to prove that

\[
\forall x \in \bigcup_{k=0}^{2(n-1)} \{s_1, \ldots, s_{n-1}\}^k : \quad \pi_x\det A = \pi_x a.
\]

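We distinguish different cases according to the degree.

Case 0. For $\deg x = 0$, i.e., $x = 1$, we have

\[
\pi_1\det A = \sum_{\sigma \in S_{n-1}} \mathrm{sign}(\sigma)\prod_{i=1}^{n-1}\rho^{i\sigma(i)}
= \det\rho^{-1}_{\hat n\hat n} = \mathrm{Adj}(\rho^{-1})_{nn} = \frac{\rho_{nn}}{\det\rho} = \pi_1 a.
\]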

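Case 1. For some fixed $s_k$ we have

\[
\begin{aligned}
\pi_{s_k}\det A &= \sum_{\sigma \in S_{n-1}} \mathrm{sign}(\sigma)(-1)\Big[\rho^{\sigma^{-1}(k)n}\prod_{i \in \{1, \ldots, n-1\}\setminus\{\sigma^{-1}(k)\}}\rho^{i\sigma(i)}
+ \rho^{\sigma(k)n}\prod_{i \in \{1, \ldots, n-1\}\setminus\{k\}}\rho^{i\sigma(i)}\Big] \\
&= -2\sum_{\sigma \in S_{n-1}} \mathrm{sign}(\sigma)\,\rho^{\sigma(k)n}\prod_{i \in \{1, \ldots, n-1\}\setminus\{k\}}\rho^{i\sigma(i)}
\end{aligned}
\]

by symmetry of $\rho^{-1}$. There is a one-to-one correspondence between $S_{n-1}$ and
$S(\{1, \ldots, n\}\setminus\{k\}; \{1, \ldots, n-1\})$ given by $\sigma \mapsto \tilde\sigma$ defined by

\[
\tilde\sigma(i) = \begin{cases} \sigma(i), & i \in \{1, \ldots, n-1\}\setminus\{k\}, \\ \sigma(k), & i = n. \end{cases}
\]

Moreover, one can see that $\mathrm{sign}(\tilde\sigma) = (-1)^{k+n-1}\,\mathrm{sign}(\sigma)$. Hence, we obtain

\[
\begin{aligned}
\pi_{s_k}\det A &= -2\sum_{\sigma \in S_{n-1}} \mathrm{sign}(\sigma)\,\rho^{n\tilde\sigma(n)}\prod_{i \in \{1, \ldots, n-1\}\setminus\{k\}}\rho^{i\tilde\sigma(i)} \\
&= 2(-1)^{k+n}\sum_{\tilde\sigma \in S(\{1, \ldots, n\}\setminus\{k\}; \{1, \ldots, n-1\})} \mathrm{sign}(\tilde\sigma)\,\rho^{n\tilde\sigma(n)}\prod_{i \in \{1, \ldots, n-1\}\setminus\{k\}}\rho^{i\tilde\sigma(i)} \\
&= 2(-1)^{k+n}\det\rho^{-1}_{\hat k\hat n}
= 2\,\mathrm{Adj}(\rho^{-1})_{kn} = \frac{2\rho_{kn}}{\det\rho} = \pi_{s_k} a.
\end{aligned}
\]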

Case 2. We consider x = sk sl . For simplicity, we assume k = l (k = l works


analogously). We have

 
πs 2 det A = sign(σ) ⎣1k=σ(k) ρnn ρiσ(i) +
k
σ∈Sn−1 i∈{1,...,n−1}\{k}

−1 (k)n 
+ 1k=σ(k) ρσ(k)n ρσ ρiσ(i) ⎦ .
i∈{1,...,n−1}\{k,σ −1 (k)}

We construct a bijective map from Sn−1 to S({1, . . . , n} \ {k}; {1, . . . , n} \ {k}) by


mapping σ ∈ Sn−1 to σ̃ defined by

σ(i), i ∈ {1, . . . , n − 1} \ {k},
σ̃(i) =
n, i = n,

for the case k = σ(k) and



⎪ −1
⎨σ(i), i ∈ {1, . . . , n − 1} \ {k, σ (k)},
σ̃(i) = n, i = σ −1 (k),


σ(k), i = n,

else. Note that it is easy to see that sign(σ) = sign(σ̃). Hence, we have
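$$\begin{aligned}
\pi_{s_k^2} \det A &= \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) \prod_{i \in \{1,\ldots,n\} \setminus \{k\}} \rho^{i\tilde\sigma(i)} \\
&= \sum_{\tilde\sigma \in S(\{1,\ldots,n\} \setminus \{k\}; \{1,\ldots,n\} \setminus \{k\})} \operatorname{sign}(\tilde\sigma) \prod_{i \in \{1,\ldots,n\} \setminus \{k\}} \rho^{i\tilde\sigma(i)} \\
&= \det \rho^{-1}_{\hat k \hat k} = \pi_{s_k^2} a.
\end{aligned}$$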

Higher order terms. Regarding the higher order terms, we note that $\pi_x a = 0$ for any monomial of degree larger than two. Therefore, the same should be true for det A, where it does not seem to follow from an obvious argument. Note that we only need to consider monomials in which each individual variable $s_k$ appears at most two times, as any other monomial cannot appear in det A by the definition of A and of the determinant. But any coefficient of det A with respect to such monomials can be understood as the determinant of a matrix $\widetilde{\rho^{-1}}$, which is obtained from $\rho^{-1}$ by omitting one row and one column and by replacing some rows/columns by copies of other rows/columns. Of course, any such matrix $\widetilde{\rho^{-1}}$ has vanishing determinant, implying that $\pi_x \det A = 0$. For concreteness, we indicate this mechanism by appealing to two special cases. First, take $x = s_k^2 s_l$, $l \neq k$. Similarly to the case of $x = s_k$, one can show that

$$\begin{aligned}
\pi_{s_k^2 s_l} \det A = -2 \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) \Bigg[ & 1_{k = \sigma(k)}\, \rho^{nn} \rho^{\sigma^{-1}(l)\,n} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k, l\}} \rho^{i\sigma(i)} \\
& + 1_{k \neq \sigma(k)}\, \rho^{\sigma(k)\,n} \rho^{\sigma^{-1}(k)\,n} \rho^{\sigma^{-1}(l)\,n} \prod_{i \in \{1,\ldots,n-1\} \setminus \{k, \sigma^{-1}(k), \sigma^{-1}(l)\}} \rho^{i\sigma(i)} \Bigg],
\end{aligned}$$

which is (a multiple of) the determinant of $\widetilde{\rho^{-1}}_{\hat k \hat k}$, which is obtained from $\rho^{-1}_{\hat k \hat k}$ by replacing the lth row by the last row. As the last row appears twice in $\widetilde{\rho^{-1}}_{\hat k \hat k}$, the determinant, and hence $\pi_{s_k^2 s_l} \det A$, vanishes.
The mechanism is even more transparent for the most extreme monomial $x = s_1^2 \cdots s_{n-1}^2$. In this case,

$$\pi_{s_1^2 \cdots s_{n-1}^2} \det A = \sum_{\sigma \in S_{n-1}} \operatorname{sign}(\sigma) (\rho^{nn})^{n-1} = 0,$$

as the determinant of the (n − 1) × (n − 1) matrix with all entries being equal to $\rho^{nn}$.

References

1. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Application of large deviation methods to
the pricing of index options in finance. C. R. Math. Acad. Sci. Paris 336(3), 263–266 (2003)
2. Azencott, R.: Densité des diffusions en temps petit: développements asymptotiques. I. Seminar
on Probability, XVIII. Lecture Notes in Mathematics, vol. 1059, pp. 402–498. Springer, Berlin
(1984)
3. Bayer, C., Friz, P., Laurence, P.: On the Probability Density Function of Baskets. Springer
Proceedings in Mathematics & Statistics (2014)
4. Bayer, C., Laurence, P.: Calculation of greeks for basket options. Working paper
5. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol baskets.
Commun. Pure Appl. Math. 67(10), 1618–1657 (2014)
6. Breitung, K., Hohenbichler, M.: Asymptotic approximations for multivariate integrals with an
application to multinormal probabilities. J. Multivar. Anal. 30, 80–97 (1989)
7. Carr, Peter P., Jarrow, Robert A.: The stop-loss start-gain paradox and option valuation: a new
decomposition into intrinsic and time value. Rev. Financ. Stud. 3(3), 469–492 (1990)
8. Deuschel, J., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and
stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1), 40–82
(2013)
9. Deuschel, J., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions and
stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350 (2013)
10. Evans, L.C., Gariepy, R.F.: Measure Theory and Fine Properties of Functions. Studies in Ad-
vanced Mathematics. CRC Press, Boca Raton (1992)
11. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
12. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance: Advanced Methods in
Option Pricing. Chapman & Hall/CRC Financial Mathematics Series. CRC Press, Boca Raton
(2009)
13. Hsu, P.: Heat kernel on noncomplete manifolds. Indiana Univ. Math. J. 39(2), 431–442 (1990)
14. Isserlis, L.: On a formula for the product-moment coefficient of any order of a normal frequency
distribution in any number of variables. Biometrika 12(1/2), 134–139 (1918)
15. L’Ecuyer, P.: Quasi-Monte Carlo methods with applications in finance. Financ. Stoch. 13(3),
307–349 (2009)
16. McKean Jr., H.P., Singer, I.M.: Curvature and the eigenvalues of the Laplacian. J. Differ. Geom.
1(1), 43–69 (1967)
17. Minakshisundaram, S., Pleijel, Å.: Some properties of the eigenfunctions of the Laplace-
operator on Riemannian manifolds. Can. J. Math. 1, 242–256 (1949)
18. Ninomiya, S., Victoir, N.: Weak approximation of stochastic differential equations and appli-
cation to derivative pricing. Appl. Math. Financ. 15(1–2), 107–121 (2008)
19. Pellizzari, P.: Efficient Monte Carlo pricing of European options using mean value control
variates. Decis. Econ. Financ. 24(2), 107–126 (2001)
20. Yosida, K.: On the fundamental solution of the parabolic equation in a Riemannian space.
Osaka Math. J. 5, 65–74 (1953)
A Remark on Gatheral’s ‘Most-Likely Path
Approximation’ of Implied Volatility

Martin Keller-Ressel and Josef Teichmann

Abstract We give a new proof of the representation of implied volatility as a


time-average of weighted expectations of local or stochastic volatility. With this
proof we clarify the question of existence of ‘forward implied variance’ in the orig-
inal derivation of Gatheral, who introduced this representation in his book ‘The
Volatility Surface’.

Keywords Implied volatility · Local volatility · Most-likely path

1 Gatheral’s Most-Likely Path Approximation

In his book ‘The Volatility Surface: A Practitioner’s Guide’, Jim Gatheral presents
an approximation formula for the implied volatility of a European option, when the
underlying stock follows a general diffusion process

$$\frac{dS_t}{S_t} = \mu(t, S_t)\,dt + \sigma(t, S_t)\,dW_t. \tag{1}$$

The ‘most-likely path approximation’ to implied Black-Scholes volatility in this


model consists of two parts: The first part is the assertion that implied variance—the
square of implied volatility—can be written as a time-average of weighted expecta-
tions of σ 2 (t, St ):

M. Keller-Ressel (B)
Fachrichtung Mathematik, Institut f. Math. Stochastik, TU Dresden,
01062 Dresden, Germany
e-mail: [email protected]
J. Teichmann
Department of Mathematics, ETH Zürich,
Rämistrasse 101, 8092 Zürich, Switzerland
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 239


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_8
$$\sigma_{imp}^2(K, T) = \frac{1}{T} \int_0^T E^{G_t}\left[\sigma^2(t, S_t)\right] dt. \tag{2}$$

Here, the measures Gt are given by their Radon-Nikodym derivatives with respect
to the risk-neutral measure Q,

$$\frac{dG_t}{dQ} = \frac{S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))}{E\left[S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}, \tag{3}$$

where $\sigma_{K,T}(t)$ is a function that is yet to be specified, $\Gamma_{BS}$ denotes the Black-Scholes
Gamma and expectations are always taken to be under the risk-neutral pricing mea-
sure. Let us emphasize that (2) is an exact formula, and that it is the second part of
the method where the approximation happens: Gatheral argues that the density (3)
is concentrated (as a function of (t, S)) close to a narrow ridge connecting today’s
stock price S0 to the strike price K at time T , and claims that a good approximation
to (2) is to evaluate it as if the density was entirely concentrated on this ridge.1 In the
terminology of Gatheral this ridge is called the most-likely path and the described
approximation method the most-likely path approximation. Extensions of the repre-
sentation (3) have been proposed e.g. by Guyon and Henry-Labordère [2] for implied
correlations.
In this note we will only be concerned with the first part of Gatheral’s method,
i.e. the derivation of the exact Eq. (2), and in particular the definition of the yet
unknown function σ K ,T (t). Gatheral [1] defines on p. 27 first the ‘Black-Scholes
forward implied variance’ v K ,T (t) by
 
$$v_{K,T}(t) = \frac{E\left[\sigma^2(t, S_t)\, S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}{E\left[S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}, \tag{4}$$

and then, in the equation below, the quantity $\sigma_{K,T}(t)$ by

$$\sigma_{K,T}^2(t) = \frac{1}{T - t} \int_t^T v_{K,T}(u)\,du. \tag{5}$$

Differentiating (5) and inserting into (4) yields an ordinary differential equation for
σ K ,T (t). This definition through an ODE leaves open the question whether (and
under which conditions) the quantities v K ,T (t) and σ K ,T (t) actually exist.2 We will
show that a simpler definition of σ K ,T (t) can be given, which clarifies the problem of
existence, implies Eqs. (4) and (5) and finally leads to a proof of the implied volatility
representation (2).
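As a brief sketch of the ODE mentioned above (a worked step written out here, not quoted from Gatheral), multiply (5) by T − t and differentiate in t; then insert (4) for the right-hand side:

$$\frac{d}{dt}\Big[(T - t)\,\sigma_{K,T}^2(t)\Big] = -v_{K,T}(t) = -\frac{E\left[\sigma^2(t, S_t)\, S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}{E\left[S_t^2\, \Gamma_{BS}(S_t, \sigma_{K,T}(t))\right]}.$$

The unknown $\sigma_{K,T}(t)$ appears on both sides, which is why existence of a solution is not automatic.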

1 See Gatheral [1, p. 29ff] for details.
2 See also Lee [3, Sect. 2.3], who remarks that the proof in Gatheral [1] hinges upon the assumption of the existence of $v_{K,T}(t)$.



2 A New Proof of the Implied Volatility Representation

For our proof of the implied volatility representation we assume that the stock price
follows an Itô-process with respect to the risk-neutral measure Q (with respect to
which all expectations are taken) of the form

$$\frac{dS_t}{S_t} = r\,dt + \sigma_t\,dW_t, \tag{6}$$

such that the discounted stock price (e−rt St )0≤t≤T is a square-integrable martin-
gale. The volatility process σ is a general predictable, W -integrable process. This
setup covers in particular local volatility models, where σt = σ(t, St ) and stochastic
volatility models, where $\sigma_t = \sigma(t, V_t)$ and $V_t$ is a stochastic factor driving the volatility. We fix a terminal time T and assume that S is non-deterministic in the sense that
$P(S_t \neq S_T) > 0$ for all $t \in [0, T)$. Fixing also a strike price K, we are ultimately
interested in the implied Black-Scholes volatility σimp (T, K ) for a European option
with expiry T and strike K in the above model.

2.1 A Regime-Switching Model and Implied Forward Total Variance

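To start our derivation, we associate for each $u \in [0, T]$ and $\Sigma^u \geq 0$ the ‘regime-switching’ process $S^u$ to S, given by

$$\begin{aligned}
\frac{dS_t^u}{S_t^u} &= r\,dt + \sigma_t\,dW_t, & t \in [0, u], \\
\frac{dS_t^u}{S_t^u} &= r\,dt + \Sigma^u\,dW_t, & t \in [u, T].
\end{aligned} \tag{7}$$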

The process $S^u$ switches, at time t = u, from the dynamics (6) to Black-Scholes dynamics with constant volatility $\Sigma^u$. It should be obvious that $S^T = S$, while $S^0$ is simply a Black-Scholes model with volatility $\Sigma^0$. In what follows, it will be helpful to consider the total variance $w_u = (T - u)(\Sigma^u)^2$ instead of $\Sigma^u$. By simple conditioning, the price of a put option on $S^u$ with strike K and maturity T is given by

$$\begin{aligned}
e^{-rT} E\left[(K - S_T^u)^+\right] &= e^{-ru} E\left[e^{-r(T-u)} E\left[(K - S_T^u)^+ \mid \mathcal{F}_u\right]\right] \\
&= e^{-ru} E\left[P_{BS}(u, S_u, T, K; w_u)\right],
\end{aligned}$$

where $P_{BS}(u, S, T, K; w)$ is the Black-Scholes put-price parametrized by total variance, i.e.

$$P_{BS}(u, S, T, K; w) = e^{-r(T-u)} K\, \Phi(-d_2) - S\, \Phi(-d_1)$$

with Φ the standard normal distribution function and

$$d_{1,2}(w) = \frac{\log\left(e^{r(T-u)} S / K\right)}{\sqrt{w}} \pm \frac{\sqrt{w}}{2}.$$

Definition 2.1 For $u \in [0, T)$ we define the implied forward total variance $\hat w_u = \hat w_u(T, K) \geq 0$ as the solution of

$$e^{-ru} E\left[P_{BS}(u, S_u, T, K; \hat w_u)\right] = e^{-rT} E\left[(K - S_T)^+\right], \tag{8}$$

i.e. $\hat w_u$ is the total variance $w_u = (T - u)(\Sigma^u)^2$ that has to be chosen in the regime-switching model (7) such that the resulting put-price coincides with the put-price from the original model (6).

Proposition 2.2 There exists a unique positive deterministic function $u \mapsto \hat w_u$, such that the equality

$$e^{-ru} E\left[P_{BS}(u, S_u, T, K; \hat w_u)\right] = e^{-rT} E\left[(K - S_T)^+\right] \tag{9}$$

is satisfied for all $u \in [0, T]$.

Proof For w = 0, the Black-Scholes price $e^{-ru} P_{BS}(u, S_u, T, K; w)$ equals $e^{-ru}(e^{-r(T-u)} K - S_u)^+$. Since $(e^{-ru} S_u)_{0 \leq u \leq T}$ is a martingale, we have by Jensen’s inequality that

$$e^{-ru} E\left[P_{BS}(u, S_u, T, K; 0)\right] = e^{-ru} E\left[(e^{-r(T-u)} K - S_u)^+\right] \leq e^{-rT} E\left[(K - S_T)^+\right].$$

For $w \to \infty$ the Black-Scholes price $P_{BS}(u, S_u, T, K; w)$ approaches $e^{-r(T-u)} K$. In this case we get

$$e^{-ru} E\left[P_{BS}(u, S_u, T, K; \infty)\right] = e^{-rT} K \geq e^{-rT} E\left[(K - S_T)^+\right].$$

In addition, $w \mapsto P_{BS}(t, S_t, T, K; w)$ is for any given $S_t$ a continuous and strictly monotone increasing function (here we need the non-degeneracy assumption on S), hence so is $w \mapsto E\left[P_{BS}(t, S_t, T, K; w)\right]$. Therefore we conclude that (9) has a unique solution $\hat w_u$ for each $u \in [0, T]$. □
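The existence argument above is constructive: since the map $w \mapsto E[P_{BS}(u, S_u, T, K; w)]$ is continuous and increasing, $\hat w_u$ can be found by bisection. The following is a minimal sketch for the simplest case u = 0, where $S_0$ is deterministic and the expectation in (9) collapses to a single Black-Scholes price. The function names, the bracket `w_hi`, and the numerical parameters are illustrative assumptions.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def put_bs_total_variance(u, S, T, K, w, r):
    """Black-Scholes put price parametrized by total variance w."""
    disc = math.exp(-r * (T - u))
    if w <= 0.0:
        # Limit w -> 0: discounted intrinsic value of the forward.
        return max(disc * K - S, 0.0)
    sw = math.sqrt(w)
    d1 = math.log(S / (disc * K)) / sw + 0.5 * sw
    d2 = d1 - sw
    return disc * K * norm_cdf(-d2) - S * norm_cdf(-d1)


def implied_total_variance(price, S0, T, K, r, w_hi=25.0, tol=1e-14):
    """Solve P_BS(0, S0, T, K; w) = price for w by bisection on [0, w_hi]."""
    lo, hi = 0.0, w_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if put_bs_total_variance(0.0, S0, T, K, mid, r) < price:
            lo = mid  # price too low: need more total variance
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip with hypothetical market data: start from sigma, recover w = sigma^2 * T.
S0, K, T, r, sigma = 100.0, 95.0, 0.75, 0.02, 0.3
price = put_bs_total_variance(0.0, S0, T, K, sigma ** 2 * T, r)
w0 = implied_total_variance(price, S0, T, K, r)
```

For u > 0 the same bisection applies once the expectation over $S_u$ is estimated, for example by simulation; that step is model-specific and is left out of this sketch.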

Remark 2.3 Notice that the previous proof holds in fact for semi-martingales S, such
that (exp(−r t)St )0≤t≤T is a martingale, so neither square integrability nor absence
of jumps are needed. However, we do not get regularity assertions for u → ŵu .

2.2 Main Result

We now present our main result on the implied forward total variance ŵu . Here
the assumption of continuous trajectories is really needed, as well as the following
L 2 -continuity assumption:
Assumption 2.4 We assume that $\sigma_u$ is mean-square continuous, i.e. the map
$[0, T] \ni u \mapsto \sigma_u^2 \in L^2(\Omega, Q)$ is continuous with respect to the $L^2$-topology.

Theorem 2.5 Under Assumption 2.4 the mapping $u \mapsto \hat w_u$ is in $C^1[0, T) \cap C^0[0, T]$ and satisfies the ODE

$$\frac{\partial \hat w_u}{\partial u} = -\frac{E\left[\phi(d_2(\hat w_u))\, \sigma_u^2\right]}{E\left[\phi(d_2(\hat w_u))\right]}, \quad u \in [0, T), \tag{10}$$

with terminal condition $\lim_{u \to T} \hat w_u = 0$, where φ denotes the standard normal density. For u = 0 it holds that

$$\hat w_0(T, K) = T \sigma_{imp}^2(T, K),$$

where $\sigma_{imp}(T, K)$ is the implied Black-Scholes volatility for time-to-maturity T and strike K in (6).

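Remark 2.6 Equation (10) can be rewritten as (2). Alternatively, it can be written as

$$-\frac{\partial \hat w_u}{\partial u} = E\left[\sigma_u^2\right] + \operatorname{Cov}\left(\frac{\phi(d_2(\hat w_u))}{E\left[\phi(d_2(\hat w_u))\right]},\, \sigma_u^2\right),$$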

i.e., the rate of decrease in total implied variance is given by expected instanta-
neous stochastic volatility plus a correction term that accounts for correlation effects
between σu and Su in a highly non-linear way.
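Both rewritings in Remark 2.6 follow from one elementary step, sketched here with the auxiliary weight $Z_u$ (a symbol introduced only for this illustration). Set $Z_u = \phi(d_2(\hat w_u)) / E[\phi(d_2(\hat w_u))]$, so that $Z_u \geq 0$ and $E[Z_u] = 1$. Then

$$-\frac{\partial \hat w_u}{\partial u} = E\left[Z_u \sigma_u^2\right] = E[Z_u]\, E\left[\sigma_u^2\right] + \operatorname{Cov}\left(Z_u, \sigma_u^2\right) = E\left[\sigma_u^2\right] + \operatorname{Cov}\left(Z_u, \sigma_u^2\right).$$

Integrating (10) over [0, T] and using $\hat w_T = 0$ and $\hat w_0 = T \sigma_{imp}^2(T, K)$ gives

$$\sigma_{imp}^2(T, K) = \frac{1}{T} \int_0^T E\left[Z_u \sigma_u^2\right] du,$$

which has the form of (2) with $dG_u / dQ = Z_u$. This agrees with (3) because $S^2\, \Gamma_{BS}$ equals $\phi(d_2)$ times a deterministic factor, which cancels in the ratio.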

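Proof We set

$$F(u, w) = e^{-ru} E\left[P_{BS}(u, S_u, T, K; w)\right].$$

Note that the derivative of $P_{BS}$ with respect to total variance w is given by

$$\frac{\partial}{\partial w} P_{BS}(u, S, T, K; w) = \frac{1}{2\sqrt{w}}\, S\, \phi(d_1),$$

which, inserting $S = S_u$, is uniformly integrable in w on each interval $(\epsilon, \infty)$, $\epsilon > 0$. Hence for $w \in (0, \infty)$, using $S \phi(d_1) = e^{-r(T-u)} K \phi(d_2)$,

$$\frac{\partial}{\partial w} F(u, w) = \frac{e^{-ru}}{2\sqrt{w}} E\left[S_u\, \phi(d_1(w))\right] = \frac{e^{-rT} K}{2\sqrt{w}} E\left[\phi(d_2(w))\right]. \tag{11}$$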
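The two facts behind (11), the derivative of the put price with respect to total variance and the identity $S \phi(d_1) = e^{-r(T-u)} K \phi(d_2)$, can be checked numerically. The following is a small sketch using a central finite difference; the parameter values and helper names are arbitrary illustrative choices.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d12(u, S, T, K, w, r):
    """Return (d1, d2) for total variance w > 0."""
    m = math.log(math.exp(r * (T - u)) * S / K)
    sw = math.sqrt(w)
    return m / sw + 0.5 * sw, m / sw - 0.5 * sw


def put(u, S, T, K, w, r):
    """Black-Scholes put parametrized by total variance w > 0."""
    d1, d2 = d12(u, S, T, K, w, r)
    return math.exp(-r * (T - u)) * K * norm_cdf(-d2) - S * norm_cdf(-d1)


# Hypothetical inputs (assumptions for illustration only).
u, S, T, K, w, r = 0.25, 105.0, 1.0, 100.0, 0.05, 0.03
d1, d2 = d12(u, S, T, K, w, r)

# Analytic derivative of the put with respect to total variance.
analytic = S * norm_pdf(d1) / (2.0 * math.sqrt(w))

# Central finite difference in w.
eps = 1e-6
numeric = (put(u, S, T, K, w + eps, r) - put(u, S, T, K, w - eps, r)) / (2.0 * eps)

# Identity linking phi(d1) and phi(d2).
lhs = S * norm_pdf(d1)
rhs = math.exp(-r * (T - u)) * K * norm_pdf(d2)
```

Here `analytic` and `numeric` should agree up to discretization error, and `lhs` and `rhs` up to rounding.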

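Applying Ito’s formula and using the martingale property of S we obtain

$$\frac{\partial}{\partial u} F(u, w) = e^{-ru} E\left[-r P_{BS} + \frac{\partial}{\partial u} P_{BS} + \frac{\partial}{\partial S} P_{BS}\, r S_u + \frac{1}{2} \frac{\partial^2}{\partial S^2} P_{BS}\, S_u^2 \sigma_u^2\right]. \tag{12}$$

Parameterized by total implied variance, the Black-Scholes put-price $P_{BS}$ satisfies

$$-r P_{BS} + \frac{\partial}{\partial u} P_{BS} + r S \frac{\partial}{\partial S} P_{BS} = 0,$$

such that (12) simplifies to

$$\frac{\partial}{\partial u} F(u, w) = \frac{1}{2} e^{-ru} E\left[\frac{\partial^2}{\partial S^2} P_{BS}\, S_u^2 \sigma_u^2\right] = \frac{e^{-rT} K}{2\sqrt{w}} E\left[\phi(d_2(w))\, \sigma_u^2\right]. \tag{13}$$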

Note that due to Assumption 2.4 both ∂u F(u, w) and ∂w F(u, w) are continuous.
Furthermore, recall that $\hat w_u$ is given in Definition 2.1 by the implicit equation

$$F(u, \hat w_u) = e^{-rT} E\left[(K - S_T)^+\right], \tag{14}$$

where the right hand side depends neither on u nor on ŵu . Let us first examine the
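boundary behavior of F(u, w). We easily derive that

$$\begin{aligned}
\lim_{w \to 0} F(u, w) &= E\left[\left(e^{-rT} K - e^{-ru} S_u\right)^+\right], \\
\lim_{w \to \infty} F(u, w) &= e^{-rT} K, \\
\lim_{u \to 0} F(u, w) &= P_{BS}(0, S_0, T, K; w), \\
\lim_{u \to T} F(u, w) &= e^{-rT} E\left[\Phi(-d_2(w)) K - \Phi(-d_1(w)) S_T\right].
\end{aligned}$$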

By Jensen’s inequality and the assumptions on the non-degeneracy of S it holds that

$$E\left[\left(e^{-rT} K - e^{-ru} S_u\right)^+\right] < e^{-rT} E\left[(K - S_T)^+\right] < e^{-rT} K$$

for all $u \in [0, T)$. From (11) we see that $\partial_w F(u, w) > 0$ and hence $w \mapsto F(u, w)$ is increasing for $w \in (0, \infty)$. Altogether, it follows that for each $u \in [0, T]$ a unique $\hat w_u$ solving (14) exists. In addition, by the implicit function theorem, $\hat w_u$ is in $C^1[0, T) \cap C^0[0, T]$ with derivative

$$\frac{\partial}{\partial u} \hat w_u = -\left.\frac{\partial_u F(u, w)}{\partial_w F(u, w)}\right|_{w = \hat w_u} = -\frac{E\left[\phi(d_2(\hat w_u))\, \sigma_u^2\right]}{E\left[\phi(d_2(\hat w_u))\right]},$$

where we have combined (11) and (13). The initial and terminal conditions for $\hat w_u$ at u = 0 and u = T can be derived from the above boundary conditions for F(u, w). Indeed,

$$P_{BS}(0, S_0, T, K; \hat w_0) = P(K, T)$$

implies that $\hat w_0 = T \sigma_{imp}^2$, where $\sigma_{imp}$ is the Black-Scholes implied volatility corresponding to the put-price P(K, T). Finally

$$e^{-rT} E\left[\Phi(-d_2(w)) K - \Phi(-d_1(w)) S_T\right] = P(K, T) = e^{-rT} E\left[(K - S_T)^+\right]$$

implies that w = 0 and hence both boundary conditions for $\hat w_u$ follow. □

Acknowledgments MKR acknowledges funding from the Excellence Initiative of the German
Research Foundation (DFG).

References

1. Gatheral, J.: The Volatility Surface. Wiley Finance (2006)


2. Guyon, J., Henry-Labordère, P.: Nonlinear Option Pricing. CRC Press, Boca Raton (2013)
3. Lee, R.: Implied volatility: statics, dynamics, and probabilistic interpretation. Recent Advances
in Applied Probability. Springer, New York (2004)
Implied Volatility from Local Volatility:
A Path Integral Approach

Tai-Ho Wang and Jim Gatheral

Abstract Assuming local volatility, we derive an exact Brownian bridge


representation for the transition density; an exact expression for the transition den-
sity in terms of a path integral then follows. By Taylor-expanding around a certain
path, we obtain a generalization of the heat kernel expansion of the density which
coincides with the classical one in the time-homogeneous case, but is more accurate
and natural in the time inhomogeneous case. As a further application of our path
integral representation, we obtain an improved most-likely-path approximation for
implied volatility in terms of local volatility.

Keywords Small time asymptotic expansion · Heat kernels expansion · Implied


volatility · Local volatility model · Most likely path · Path integral

1 Introduction

Because of their consistency with the known prices of European options, and despite
their unrealistic dynamical implications, local volatility models continue to be used
in practice as powerful tools for risk management of equity derivatives portfolios.
Under the forward measure (with no drift), local volatility models take the form

$$\frac{dS_t}{S_t} = \sigma(S_t, t)\,dB_t, \tag{1.1}$$

In memory of our long term collaborator and friend, a passionate mathematician, Peter Laurence.

T.-H. Wang (B) · J. Gatheral


Department of Mathematics, Baruch College, CUNY 1 Bernard Baruch Way,
New York, NY 10010, USA
e-mail: [email protected]
J. Gatheral
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 247


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_9

where Bt is a Brownian motion and σ is a local volatility function that depends only
on the underlying level S and the time t.
Assume that prices of European options of all strikes K and expirations T are
given or equivalently that the Black-Scholes implied volatility function σBS (K , T )
is known. In that case, it is straightforward to compute the local volatility function
σ from, for example, Eq. (1.10) of Gatheral [6]:

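$$\sigma^2(K, T) = \frac{\dfrac{\partial w}{\partial T}}{\left(1 - \dfrac{k}{2w} \dfrac{\partial w}{\partial k}\right)^2 - \dfrac{1}{4}\left(\dfrac{1}{4} + \dfrac{1}{w}\right)\left(\dfrac{\partial w}{\partial k}\right)^2 + \dfrac{1}{2} \dfrac{\partial^2 w}{\partial k^2}} \tag{1.2}$$

where k denotes the log-strike $k := \log(K/S)$ and w the Black-Scholes implied total variance, given by $w(K, T) := \sigma_{BS}^2(K, T)\, T$.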
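Formula (1.2) translates directly into code once the total variance and its partial derivatives are available. The following is a minimal sketch; the function name and argument names are illustrative assumptions, and in practice the derivatives would come from a fitted, arbitrage-free implied variance surface.

```python
def local_variance(k, w, dw_dT, dw_dk, d2w_dk2):
    """Local variance sigma^2(K, T) from implied total variance w via (1.2).

    k        : log-strike log(K / S)
    w        : implied total variance sigma_BS^2 * T (must be positive)
    dw_dT    : partial derivative of w with respect to expiry T
    dw_dk    : partial derivative of w with respect to log-strike k
    d2w_dk2  : second partial derivative of w with respect to k
    """
    denominator = (
        (1.0 - 0.5 * k / w * dw_dk) ** 2
        - 0.25 * (0.25 + 1.0 / w) * dw_dk ** 2
        + 0.5 * d2w_dk2
    )
    return dw_dT / denominator


# Sanity check with a flat smile: w = sigma^2 * T, so dw/dT = sigma^2
# and all strike derivatives vanish; the local variance is sigma^2.
sigma, T = 0.2, 1.5
flat = local_variance(k=0.1, w=sigma ** 2 * T, dw_dT=sigma ** 2,
                      dw_dk=0.0, d2w_dk2=0.0)
```

A non-positive denominator signals butterfly arbitrage in the input surface, so a production version would likely guard against it.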

In practice, we observe option prices for only a finite set of strikes and expirations.
Moreover (see for example Gatheral and Jacquier [7]), it is very hard if not impossible
to find a functional form for implied volatility that both matches observed prices and
is free from static arbitrage. One alternative approach is to assume a parameterized
functional form for the local volatility function σ (S, t) and price a finite set of
European options, tuning the parameters of the function until a satisfactory fit is
achieved. Such calibration of local volatility models to given option prices is in
practice typically performed using numerical PDE techniques. However, numerical
PDE techniques are slow and moreover are not practical in higher dimensions.
Alternatively, to achieve better understanding of the qualitative properties of local
volatility models, and potentially faster calibration, both academics and practitioners
have exploited asymptotic expansions of implied volatility in terms of local volatility.
First, Berestycki et al. [2] solved the nonlinear PDE (1.2) for the implied total vari-
ance w in the small time to expiration limit, obtaining an exact expression for implied
volatility as an integral of local volatility. Subsequently, this asymptotic approxima-
tion was extended, to first order in time to expiry τ = T − t by Henry-Labordère (see
the article in this volume and also Henry-Labordère [12]), and then to second order in
Gatheral et al. [9] using the heat kernel expansion. Jordan and Tier [14] apply similar
methods to derive an asymptotic solution for the SABR and CEV models. In related
work, the paper of Cheng et al. [5] derives an operator expansion of the density,
which up to first order agrees with prior expansions obtained using the heat kernel
expansion. As an earlier example of work in a similar spirit to the most-likely-path
approach of our paper, Baldi and Caramellino [1] develop a small-time expansion
for the hitting probability of a one-dimensional diffusion.
Our contribution in this paper is to derive an exact Brownian bridge representation
for the transition density, from which an exact expression for the transition density
in terms of a path integral follows. Indeed, the path integral representation of the
density has often been used as a powerful tool for the derivation of improved asymp-
totic expansions of the transition density. For example, in the foregoing, we apply
a technique from the paper of Goovaerts et al. [10]. An earlier paper by Linetsky
[16] provides a more general survey of the application of path integral techniques to
option pricing.

By replacing all paths that contribute to the path integral by the most-likely-path,
the unique path that minimizes the action functional in the path integral formulation,
we obtain a new approximation to the transition density which is both more accurate
and natural than the classical heat kernel version. As an application, we obtain an
improved most-likely-path approximation for implied volatility in terms of local
volatility.
The most-likely-path (MLP) approach has been used to analyze the asymptotic
behavior of implied volatility in stochastic volatility models in Gatheral [6]; this
analysis is further elaborated in an article by Keller-Ressel and Teichmann [15] in
these proceedings. Guyon and Henry-Labordère [11] and Reghai [17] both explore
alternative definitions of the most likely path, achieving improved accuracy by con-
sidering fluctuations around the MLP. In particular, Guyon and Henry-Labordère
[11] compare and contrast various approximations in a unified setting. Though the
approach of Guyon and Henry-Labordère [11] differs from our path integral approach
in the current paper, it is worth mentioning that their heat kernel approximation is
closely related to ours. Once again however, our path integral approach leads to an
unambiguously natural definition of the most-likely-path.
Our paper is organized as follows. In Sect. 2, we derive Brownian bridge and path
integral representations for the transition density of one dimensional diffusions. As
an application, in Sect. 3, we present a novel probabilistic derivation of the heat kernel
expansion, also referred to as the WKB method in the physics literature. For time
homogeneous diffusions, this new expansion recovers the conventional heat kernel
expansion; however, in the time-inhomogeneous case, the two expansions differ a
little. In Sect. 4, we present heuristic derivations of known small time asymptotic
expansion of implied volatility to zeroth order. From the path integral perspective,
these known approximations are suboptimal in the sense that they correspond to
computing the optimal path of an approximate but incomplete action functional. By
considering the optimal path of the exact action functional, we show how an optimal
approximation may be computed. An interesting feature of the optimal approximation
is that it recovers the implied volatility of the time dependent Black-Scholes model
exactly, which so far, to the best of our knowledge, none of the existing small time
approximations are able to achieve. Finally, in Sect. 5, we summarize and conclude.
Throughout the text, $B_t$ denotes the standard Brownian motion defined on the filtered probability space $(\Omega, \mathcal{F}_t, P)$ satisfying the usual conditions. $X_t$ denotes the Brownian motion with some drift h. $p^X(T, y | t, x)$ denotes the transition density of X from x at time t to y at time T and similarly $p^S(T, s_T | t, s_t)$ is the transition density from $s_t$ to $s_T$ of the process $S_t$. Moreover, dot will always refer to the partial derivative with respect to the time variable and prime to the space variables x or s.

2 Path Integral Representations for Transition Density

In this section, we derive path integral representations of the transition density and
of the call prices under local volatility, which will in turn yield the most-likely-path
approximation to implied volatility. The key ingredient in this derivation is a Brown-
ian bridge representation for the transition density, which though straightforward,
does not appear to be well-known.
We start with the case of one-dimensional Brownian motion with general but
Markovian drift. We reduce the more general diffusion case which concerns us here
to this one by applying the well-known Lamperti change of variable.

2.1 Brownian Bridge Representations

Two Brownian bridge representations for the transition density of Brownian motion
with general but Markovian, smooth and bounded, drift are derived in Theorem 1. The
first expression, (2.1), will be used in the derivation of the path integral representation
for transition density in Sect. 2.2 and the second, (2.2), will be used to derive the heat
kernel expansion of transition density in Sect. 3.
Theorem 1 Let $X_t$ be a Brownian motion with drift driven by

$$dX_t = dB_t + h(X_t, t)\,dt,$$

where the drift h is assumed smooth and bounded. Let H be an antiderivative of h with respect to x, i.e., $\partial_x H(x, t) = h(x, t)$, for all x and t. The transition density $p^X$ of $X_t$ has the following two equivalent Brownian bridge representations:

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, \tilde E_{x,y}\left[e^{\int_t^T h(X_s, s)\,dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}\right] \tag{2.1}$$

and

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y, T) - H(x, t)}\, \tilde E_{x,y}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_s, s) + h_x(X_s, s) + 2 H_t(X_s, s)\right) ds}\right], \tag{2.2}$$

where φ is the Gaussian density $\phi(t, \xi) = \frac{1}{\sqrt{2\pi t}} e^{-\frac{\xi^2}{2t}}$. The notation $\tilde E_{x,y}[\cdot]$ denotes the expectation under the Brownian bridge measure from x to y.
Proof Note that $X_t$ under the original measure P is a Brownian motion with drift h. Define a new probability measure $\tilde P$ through the Radon-Nikodym derivative

$$\frac{d\tilde P}{dP} = e^{-\int_t^T h(X_s, s)\,dB_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}.$$

By the Girsanov theorem, $X_t$ is a Brownian motion under $\tilde P$. Given any bounded measurable function f, we have, since $dB_t = dX_t - h(X_t, t)\,dt$,

$$E_{t,x}[f(X_T)] = \tilde E_{t,x}\left[f(X_T) \frac{dP}{d\tilde P}\right] = \tilde E_{t,x}\left[f(X_T)\, e^{\int_t^T h(X_s, s)\,dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}\right],$$

where, for notational simplicity, $E_{t,x}[\cdot]$ denotes the conditional expectation $E[\cdot | X_t = x]$, and similarly for $\tilde E_{t,x}[\cdot]$. It follows that, for any bounded measurable function f,

$$\int f(y)\, p^X(T, y | t, x)\,dy = \int f(y)\, \tilde E_{x,y}\left[e^{\int_t^T h(X_s, s)\,dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}\right] \phi(T - t, y - x)\,dy,$$

where $\tilde E_{x,y}[\cdot] = \tilde E[\cdot | X_t = x, X_T = y]$. Consequently, (2.1) follows, i.e.,

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, \tilde E_{x,y}\left[e^{\int_t^T h(X_s, s)\,dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}\right].$$

Furthermore, Ito’s formula implies that

$$\int_t^T h(X_s, s)\,dX_s = H(X_T, T) - H(X_t, t) - \int_t^T \left(H_t(X_s, s) + \frac{h_x(X_s, s)}{2}\right) ds,$$

where we recall that H is an antiderivative of h with respect to x. Thus,

$$e^{\int_t^T h(X_s, s)\,dX_s} = e^{H(X_T, T) - H(X_t, t) - \int_t^T \left(H_t(X_s, s) + \frac{h_x(X_s, s)}{2}\right) ds}.$$

We further rewrite the transition density as

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y, T) - H(x, t)}\, \tilde E_{x,y}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_s, s) + h_x(X_s, s) + 2 H_t(X_s, s)\right) ds}\right].$$

This completes the proof of (2.2). □
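As a quick sanity check of (2.2), consider the simplest case of a constant drift $h \equiv c$ (an illustrative special case, not discussed in the text; strictly, boundedness of h holds trivially here). Then $H(x, t) = cx$, $h_x = 0$ and $H_t = 0$, so the integrand in the exponent is the constant $c^2$ and the bridge expectation is deterministic:

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{c(y - x) - \frac{1}{2} c^2 (T - t)} = \frac{1}{\sqrt{2\pi (T - t)}}\, e^{-\frac{(y - x - c(T - t))^2}{2(T - t)}},$$

which is the Gaussian density of Brownian motion with drift c, as expected.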

Remark 1 We remark that the conditional expectations in both (2.1) and (2.2) are
under the Brownian bridge measure since X t is a Brownian motion under P̃. One
intriguing feature of the representation (2.2) is that, if we Taylor expand the condi-
tional expectation for small T − t around the straight line connecting the initial and
terminal points, we recover the heat kernel expansion in the time-homogeneous case
and probably do better than the heat kernel expansion in the time-inhomogeneous
case. See Sect. 3 for more detailed discussions on the heat kernel expansion.

Now for the general diffusion case, consider the process St driven by the stochastic
differential equation (SDE)

d St = μ(St , t)dt + a(St , t)d Bt , S0 = s0 ,

where for simplicity, we assume the coefficients μ and a are Lipschitz and of linear growth; a is further assumed to be bounded strictly away from zero. By applying the Lamperti transformation $x = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$, the process $S_t$ is transformed into a Brownian motion with drift. Specifically, denote the transformation from s to x by $x = \varphi(s, t) = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$. Applying Ito’s formula to $X_t = \varphi(S_t, t)$ yields

$$\begin{aligned}
dX_t &= d\varphi(S_t, t) \\
&= \left[\dot\varphi(S_t, t) + \mu(S_t, t)\, \varphi_s(S_t, t) + \frac{a^2(S_t, t)}{2} \varphi_{ss}(S_t, t)\right] dt + \varphi_s(S_t, t)\, a(S_t, t)\,dB_t \\
&= \left[\dot\varphi(S_t, t) + \frac{\mu(S_t, t)}{a(S_t, t)} - \frac{a_s(S_t, t)}{2}\right] dt + dB_t \\
&= dB_t + h(X_t, t)\,dt,
\end{aligned}$$

where subindices of φ and a refer to partial derivatives. The function h is defined as $h(x, t) = \dot\varphi(s, t) + \frac{\mu(s, t)}{a(s, t)} - \frac{a_s(s, t)}{2}$, with $s = \varphi^{-1}(x, t)$. The transition densities $p^S$ for $S_t$ and $p^X$ for $X_t$ are then related as

$$p^S(T, s_T | t, s_t) = \frac{1}{a(s_T, T)}\, p^X(T, x_T | t, x_t),$$

with $x_T = \varphi(s_T, T)$ and $x_t = \varphi(s_t, t)$. Thus, the transition from the Brownian bridge representation for $p^X$ to a similar representation for $p^S$ is straightforward by applying Theorem 1. Theorem 2 formalizes this result.
Theorem 2 Let $S_t$ be the diffusion process driven by the stochastic differential equation

$$dS_t = \mu(S_t, t)\,dt + a(S_t, t)\,dB_t, \quad S_0 = s_0.$$

Denote the Lamperti transformation from s to x by $x = \varphi(s, t) = \int_{s_0}^{s} \frac{d\xi}{a(\xi, t)}$. Define the function h by $h(x, t) = \dot\varphi(s, t) + \frac{\mu(s, t)}{a(s, t)} - \frac{a_s(s, t)}{2}$, with $s = \varphi^{-1}(x, t)$, where subindices refer to corresponding partial derivatives. Let H be an antiderivative of h with respect to x, namely, $\partial_x H(x, t) = h(x, t)$, for all x and t. Then the transition density $p^S$ of $S_t$ from $(t, s_t)$ to $(T, s_T)$ has the following Brownian bridge representations:

$$p^S(T, s_T | t, s_t) = \frac{\phi(T - t, \varphi(s_T, T) - \varphi(s_t, t))}{a(s_T, T)}\, \tilde E_{\varphi(s_t, t), \varphi(s_T, T)}\left[e^{\int_t^T h(X_s, s)\,dX_s - \frac{1}{2} \int_t^T h^2(X_s, s)\,ds}\right] \tag{2.3}$$

and

$$p^S(T, s_T | t, s_t) = \frac{\phi(T - t, \varphi(s_T, T) - \varphi(s_t, t))}{a(s_T, T)}\, e^{H(\varphi(s_T, T), T) - H(\varphi(s_t, t), t)}\, \tilde E_{\varphi(s_t, t), \varphi(s_T, T)}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_s, s) + h_x(X_s, s) + 2 H_t(X_s, s)\right) ds}\right], \tag{2.4}$$

where again φ denotes the Gaussian density $\phi(t, \xi) = \frac{1}{\sqrt{2\pi t}} e^{-\frac{\xi^2}{2t}}$. As before, the notation $\tilde E_{x,y}[\cdot]$ denotes the expectation under the Brownian bridge measure from x to y.
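To illustrate Theorem 2, consider the driftless Black-Scholes case $\mu = 0$, $a(s, t) = \sigma s$ with constant $\sigma > 0$ (a worked special case added here for orientation; it is not part of the original text). The Lamperti map is $\varphi(s, t) = \frac{1}{\sigma} \log(s / s_0)$, so $\dot\varphi = 0$, $a_s = \sigma$, $h \equiv -\sigma/2$, $H(x, t) = -\sigma x / 2$, $h_x = 0$ and $H_t = 0$. With $x_T - x_t = \frac{1}{\sigma} \log(s_T / s_t)$, representation (2.4) gives

$$p^S(T, s_T | t, s_t) = \frac{\phi(T - t, x_T - x_t)}{\sigma s_T}\, e^{-\frac{\sigma}{2}(x_T - x_t) - \frac{\sigma^2}{8}(T - t)} = \frac{1}{\sigma s_T \sqrt{2\pi (T - t)}} \exp\left(-\frac{\left(\log(s_T / s_t) + \frac{1}{2} \sigma^2 (T - t)\right)^2}{2 \sigma^2 (T - t)}\right),$$

which is the familiar lognormal transition density. The second equality follows by completing the square in the exponent.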

Note that the $X_t$ process in both expressions (2.3) and (2.4) is a Brownian bridge from $x_t = \varphi(s_t, t)$ to $x_T = \varphi(s_T, T)$. One application of such Brownian bridge representations of transition densities is to devise more efficient simulation schemes. For example, for some given function f, we may compute numerically the expectation $E_{t,s_t}[f(S_T)]$ in x-space as

$$\begin{aligned}
E_{t,s_t}[f(S_T)] &= E_{t,x_t}\left[f(\varphi^{-1}(X_T))\right] = \int f \circ \varphi^{-1}(x_T, T)\, p^X(T, x_T | t, x_t)\,dx_T \\
&= \int f \circ \varphi^{-1}(x_T, T)\, \phi(T - t, x_T - x_t)\, e^{H(x_T, T) - H(x_t, t)} \\
&\qquad \times \tilde E_{x_t, x_T}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_\tau, \tau) + h_x(X_\tau, \tau) + 2 H_t(X_\tau, \tau)\right) d\tau}\right] dx_T \\
&= E\left[f \circ \varphi^{-1}(Y, T)\, e^{H(Y, T) - H(x_t, t)}\, \tilde E_{x_t, Y}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_\tau, \tau) + h_x(X_\tau, \tau) + 2 H_t(X_\tau, \tau)\right) d\tau}\right]\right],
\end{aligned}$$

where Y is a normal random variable with mean $x_t$ and variance T − t. Therefore, if there is an efficient method to calculate or approximate the conditional expectation $\tilde E_{x_t, Y}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_\tau, \tau) + h_x(X_\tau, \tau) + 2 H_t(X_\tau, \tau)\right) d\tau}\right]$ in the Brownian bridge measure, the expectation $E_{t,s_t}[f(S_T)]$ could potentially be computed more efficiently. Since $X_t$ is a Brownian bridge from $x_t$ to Y, one obvious approximation is to simply replace the integral in the exponent with the integrand evaluated along the straight line $x_\tau = \frac{T - \tau}{T - t} x_t + \frac{\tau - t}{T - t} Y$ for $\tau \in [t, T]$. In other words,

$$\tilde E_{x_t, Y}\left[e^{-\frac{1}{2} \int_t^T \left(h^2(X_\tau, \tau) + h_x(X_\tau, \tau) + 2 H_t(X_\tau, \tau)\right) d\tau}\right] \approx e^{-\frac{1}{2} \int_t^T \left(h^2(x_\tau, \tau) + h_x(x_\tau, \tau) + 2 H_t(x_\tau, \tau)\right) d\tau}. \tag{2.5}$$

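Hence,

$$E_{t,s_t}[f(S_T)] \approx E\left[ e^{-\frac{1}{2}\int_t^T \left[ h^2(x_\tau,\tau) + h_x(x_\tau,\tau) + 2H_t(x_\tau,\tau) \right] d\tau}\, f \circ \varphi^{-1}(Y, T)\, e^{H(Y,T) - H(x_t,t)} \right],$$

where the exponential factor stays inside the expectation because the straight line $x_\tau$ depends on $Y$; after which we need only simulate the normal random variable $Y$.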
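The recipe above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes a time-homogeneous drift (so $H_t = 0$), works directly in $x$-space (so $\varphi$ is the identity), and replaces the simulation of $Y$ by a trapezoidal quadrature so that the result is deterministic. The function name `straight_line_expectation` and the Ornstein-Uhlenbeck test drift $h(x) = -\kappa x$ are assumptions made purely for illustration.

```python
import math

def straight_line_expectation(f, h, h_x, H, x_t, dt, n_y=801, n_tau=100, width=8.0):
    """Approximate E[f(X_T) | X_t = x_t] for dX = dB + h(X) dt using (2.5).

    Hypothetical helper: the bridge expectation is replaced by the integrand
    evaluated along the straight line from x_t to Y, and the outer Gaussian
    expectation over Y ~ N(x_t, dt) is done by trapezoidal quadrature.
    """
    sd = math.sqrt(dt)
    dy = 2.0 * width * sd / (n_y - 1)
    total = 0.0
    for i in range(n_y):
        y = x_t - width * sd + i * dy
        # midpoint rule for int_t^T (h^2 + h_x)(x_tau) dtau along the straight line
        integral = 0.0
        for j in range(n_tau):
            x = x_t + (j + 0.5) / n_tau * (y - x_t)
            integral += (h(x) ** 2 + h_x(x)) * dt / n_tau
        gauss = math.exp(-(y - x_t) ** 2 / (2.0 * dt)) / math.sqrt(2.0 * math.pi * dt)
        w = dy if 0 < i < n_y - 1 else 0.5 * dy  # trapezoid end weights
        total += w * gauss * f(y) * math.exp(H(y) - H(x_t) - 0.5 * integral)
    return total

# Illustrative drift (an assumption): Ornstein-Uhlenbeck, h(x) = -kappa * x
kappa = 1.0
ou = dict(h=lambda x: -kappa * x,
          h_x=lambda x: -kappa,
          H=lambda x: -0.5 * kappa * x * x)
mass = straight_line_expectation(lambda y: 1.0, x_t=0.5, dt=0.1, **ou)
mean = straight_line_expectation(lambda y: y, x_t=0.5, dt=0.1, **ou)
exact_mean = 0.5 * math.exp(-kappa * 0.1)
```

For this drift the exact conditional mean is known in closed form, so `mean` can be compared with `exact_mean`; the discrepancy should be of second order in $T - t$.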


The straight-line approximation in (2.5) seems somewhat ad hoc. Would we do
better with another path? Why not add two extra paths to take some account of the
variability of the random paths in the full integral? More generally, is there an optimal
or systematic way of picking these paths? The path integral representation in Sect. 2.2
may provide a partial answer to this question. The Brownian bridge representations
(2.1) and (2.3) play a key role in the derivation of this path integral representation.

2.2 The Path Integral Representation of the Density

In this section, we provide a formal derivation of the path integral representation exploiting the Brownian bridge representations of Sect. 2.1 and the Chapman-Kolmogorov equation. As in Sect. 2.1, we will use the notations $\varphi(s_t, t)$ and $x_t$ interchangeably.

Let $\{t = t_0 < t_1 < \cdots < t_n = T\}$ be a partition of the time interval $[t, T]$ with $\Delta t_i = t_i - t_{i-1} = \Delta t = \frac{T-t}{n}$, for $i = 1, \ldots, n$. By iteratively applying the Chapman-Kolmogorov equation, the transition density $p^S(T, s_T | t, s_t)$ can be written as

$$p^S(T, s_T | t, s_t) = \int \cdots \int \prod_{i=1}^{n} p^S(t_i, s_i | t_{i-1}, s_{i-1})\, ds_1 \cdots ds_{n-1}, \tag{2.6}$$

where we set $s_0 = s_t$ and $s_n = s_T$. Recall from (2.3) that the transition density $p^S$ of $S_t$ from $(t_{i-1}, s_{i-1})$ to $(t_i, s_i)$ has the Brownian bridge representation

$$p^S(t_i, s_i | t_{i-1}, s_{i-1}) = \frac{\phi(\Delta t, \varphi(s_i, t_i) - \varphi(s_{i-1}, t_{i-1}))}{a(s_i, t_i)}\, \tilde{E}_{\varphi(s_{i-1},t_{i-1}),\varphi(s_i,t_i)}\left[ e^{\int_{t_{i-1}}^{t_i} h(X_\tau,\tau)\,dX_\tau - \frac{1}{2}\int_{t_{i-1}}^{t_i} h^2(X_\tau,\tau)\,d\tau} \right],$$

where $X_\tau$ is a Brownian bridge from $\varphi(s_{i-1}, t_{i-1})$ to $\varphi(s_i, t_i)$. We next compute the limit of (2.6), as $\Delta t \to 0^+$ (or equivalently $n \to \infty$), assuming that, for $i = 1, \ldots, n$, the $s_i$'s form a discretization of a differentiable curve $s_\tau$, for $\tau \in [t, T]$.

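We have

$$\lim_{n\to\infty} \prod_{i=1}^{n} \tilde{E}_{\varphi(s_{i-1},t_{i-1}),\varphi(s_i,t_i)}\left[ e^{\int_{t_{i-1}}^{t_i} h(X_\tau,\tau)\,dX_\tau - \frac{1}{2}\int_{t_{i-1}}^{t_i} h^2(X_\tau,\tau)\,d\tau} \right] = e^{\int_t^T h(\varphi(s_\tau,\tau),\tau)\,\dot{x}_\tau\,d\tau - \frac{1}{2}\int_t^T h^2(\varphi(s_\tau,\tau),\tau)\,d\tau}$$

and

$$\begin{aligned}
\lim_{n\to\infty} \prod_{i=1}^{n} e^{-\frac{1}{2\Delta t}\left[\varphi(s_i,t_i) - \varphi(s_{i-1},t_{i-1})\right]^2}
&= \lim_{n\to\infty} e^{-\frac{1}{2\Delta t}\sum_{i=1}^{n}\left[\varphi(s_i,t_i) - \varphi(s_{i-1},t_{i-1})\right]^2} \\
&= \lim_{n\to\infty} e^{-\frac{1}{2}\sum_{i=1}^{n} \frac{1}{\Delta t}\left[\varphi'_{i-1}\,\Delta s_i + \dot\varphi_{i-1}\,\Delta t + O\left(\Delta s_i^2 + \Delta t^2\right)\right]^2} \\
&= e^{-\frac{1}{2}\int_t^T \left[\varphi'(s_\tau,\tau)\,\dot{s}_\tau + \dot\varphi(s_\tau,\tau)\right]^2 d\tau},
\end{aligned}$$

where $\Delta s_i = s_i - s_{i-1}$, $\varphi' = \partial_s \varphi$, and the subscript $i-1$ denotes evaluation at $(s_{i-1}, t_{i-1})$.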

Substitution into (2.6) and taking the limit $n \to \infty$ yields the following path integral representation for the transition density $p^S$:

$$p^S(T, s_T | t, s_t) = \int_{C_s} e^{-\frac{1}{2}\int_t^T \left[\varphi'(s_\tau,\tau)\,\dot{s}_\tau + \dot\varphi(s_\tau,\tau) - h(\varphi(s_\tau,\tau),\tau)\right]^2 d\tau}\, D[s], \tag{2.7}$$

where

$$D[s] = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi\Delta t}\; a(s_T, T)} \prod_{i=1}^{n-1} \frac{ds_i}{\sqrt{2\pi\Delta t}\; a(s_i, t_i)},$$

and $C_s$ denotes the collection of all differentiable curves from $(t, s_t)$ to $(T, s_T)$. Equivalently, because $dx_i = \frac{ds_i}{a(s_i,t_i)}$, we may rewrite the path integral representation (2.7) more neatly and simply in $x$-space as

$$p^S(T, s_T | t, s_t) = \frac{1}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T \left[\dot{x}_\tau - h(x_\tau,\tau)\right]^2 d\tau}\, D[x], \tag{2.8}$$

where

$$D[x] = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi\Delta t}} \prod_{i=1}^{n-1} \frac{dx_i}{\sqrt{2\pi\Delta t}},$$

and $C_x$ denotes the collection of all differentiable curves from $(t, x_t)$ to $(T, x_T)$.
We shall henceforth deal mostly with the simpler expression (2.8). Heuristically, one could think of the path integral representation (2.8) of the density as an exponentially-weighted average over all possible differentiable curves connecting $x_t$ to $x_T$. $D[x]$ could then be regarded as the “Lebesgue” measure on the space of differentiable curves connecting $x_t$ to $x_T$, though mathematically such a measure does not really exist.
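To make the exponential weight concrete, the sketch below evaluates a discretized version of the action $\frac{1}{2}\int_t^T [\dot{x}_\tau - h(x_\tau,\tau)]^2 d\tau$ for a given polygonal path. It is only an illustration under stated assumptions: the function name `discrete_action`, the left-endpoint discretization, and the two sample paths are choices made here, not part of the original derivation.

```python
import math

def discrete_action(path, t, T, h):
    """Left-endpoint discretization of 0.5 * int_t^T (xdot - h(x, tau))^2 dtau.

    `path` lists x_{t_0}, ..., x_{t_n} on a uniform grid of [t, T];
    `h(x, tau)` is the drift in x-space. (Hypothetical helper.)
    """
    n = len(path) - 1
    dt = (T - t) / n
    total = 0.0
    for i in range(1, n + 1):
        tau = t + (i - 1) * dt
        velocity = (path[i] - path[i - 1]) / dt
        total += 0.5 * (velocity - h(path[i - 1], tau)) ** 2 * dt
    return total

n = 100
zero_drift = lambda x, tau: 0.0
# straight line from x_t = 0 to x_T = 1, and a perturbed path with the same endpoints
line = [i / n for i in range(n + 1)]
bent = [x + 0.2 * math.sin(math.pi * i / n) for i, x in enumerate(line)]
a_line = discrete_action(line, 0.0, 0.5, zero_drift)
a_bent = discrete_action(bent, 0.0, 0.5, zero_drift)
```

With zero drift the straight line has action $(x_T - x_t)^2 / (2(T-t))$, which is the exponent of the Gaussian density; any other path with the same endpoints has a larger action and therefore a smaller weight $e^{-\text{action}}$.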
Assume now that under the pricing measure (assuming zero interest rate and
dividend yield), the price St of the underlying is driven by the SDE of local volatility
type
d St = a(St , t)d Bt .

The path integral representation (2.7) of the transition density $p^S$ in this case has the following simpler form:

$$p^S(T, s_T | t, s_t) = \int_{C_s} e^{-\frac{1}{2}\int_t^T \left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)} + \frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s].$$

Integrating the payoff function over the transition density, the path integral representation for the call price is immediate:

$$C(t, s_t, K, T) = \int_K^\infty (s_T - K) \int_{C_s} e^{-\frac{1}{2}\int_t^T \left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)} + \frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s]\, ds_T,$$

or equivalently in $x$-space,

$$C(t, s_t, K, T) = \int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T \left|\dot{x}_\tau - h(x_\tau,\tau)\right|^2 d\tau}\, D[x]\, ds_T, \tag{2.9}$$

where $h(x, t) = \dot\varphi(s, t) - \frac{a_s(s,t)}{2}$.

3 Probabilistic Derivation of the Heat Kernel Expansion

The heat kernel expansion is a small time asymptotic expansion of the fundamental
solution of the heat equation over a Riemannian manifold. Reexpressing the transition
density of a diffusion process in terms of this fundamental solution leads naturally to a
small time asymptotic expansion of the transition density. This topic is well-studied
in the Riemannian geometry literature, see Chavel [4] for a geometric analytical
approach and Hsu [13] for a probabilistic approach. In the physics literature, the heat
kernel approach to deriving small time asymptotic expansions is also known as the
WKB method or the ray solution, see Jordan and Tier [14]. Deriving such expansions
in one dimension is much simpler than in higher dimensions where no analogue of
the Lamperti transformation exists.
Though the heat kernel expansion is very well-known, the Brownian bridge representation (2.4) of Theorem 2 leads to a novel probabilistic derivation which we will now present. To fix ideas and illustrate the methodology employed, we start with the case of Brownian motion with drift; as before, the general diffusion case follows via the Lamperti transformation. To minimize mathematical technicalities, we shall assume (at least in this section) that all functions are bounded with bounded derivatives.

3.1 Heat Kernel Expansion for Brownian Motion with Drift

Theorem 3 Let $X_t$ be the Brownian motion with drift $h$, i.e., $X_t$ satisfies the SDE $dX_t = dB_t + h(X_t, t)\,dt$. Denote by $H$ an antiderivative of $h$ with respect to $x$, namely, $\partial_x H(x, t) = h(x, t)$ for all $x$ and $t$. The transition density $p^X$ of $X_t$ has, as $t \to T$, the following small time asymptotic expansion:

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y,T) - H(x,t)} \times \left[ 1 - \frac{1}{2}\int_t^T \left[ h^2(x_s^*, s) + h_x(x_s^*, s) + 2H_t(x_s^*, s) \right] ds + O(T - t)^2 \right], \tag{3.1}$$

where $\phi$ is the Gaussian density $\phi(t, \xi) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{\xi^2}{2t}}$ and $x_s^*$ denotes the straight line from $(t, x)$ to $(T, y)$, i.e., $x_s^* = x + \frac{s-t}{T-t}(y - x)$ for $s \in [t, T]$.

Notice that in the time-inhomogeneous case h = h(x, t), the approximation (3.1)
is different from the heat kernel expansion (see, for example, (3.3), (3.6), and (3.7)
on page 603 of Gatheral et al. [9]) in that the approximation in (3.1) involves an
integration from t to T whereas, in the classical heat kernel expansion, all quantities
are evaluated at the fixed initial time t. Of course, in the time homogeneous case where
the drift h = h(x) has no explicit dependence on t, the expansion (3.1) coincides
with the classical heat kernel expansion as formalized in the following corollary.

Corollary 1 (Heat kernel expansion for Brownian motion with drift) For Brownian motion with time homogeneous drift $h = h(x)$, the transition density $p^X$ of $X_t$ from $(t, x)$ to $(T, y)$ has the asymptotic expansion up to first order

$$p(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y) - H(x)} \left[ 1 - \frac{T-t}{2(y - x)} \int_x^y \left[ h^2(\xi) + h'(\xi) \right] d\xi + O(T - t)^2 \right],$$

which coincides with the classical heat kernel expansion up to first order (see, for instance, Gatheral et al. [9]).
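As a sanity check of the first-order expansion in the time-homogeneous case, the sketch below compares it with a drift whose transition density is known in closed form. The choice of an Ornstein-Uhlenbeck drift $h(x) = -\kappa x$ and all function names are assumptions made for illustration only.

```python
import math

def phi(t, xi):
    """Gaussian density with variance t."""
    return math.exp(-xi * xi / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def heat_kernel_first_order(x, y, dt, h, h_prime, H, n=200):
    """First-order expansion for dX = dB + h(X) dt (time-homogeneous drift).

    The time integral of h^2 + h' along the straight line from x to y is
    computed with the midpoint rule. (Hypothetical helper.)
    """
    integral = 0.0
    for j in range(n):
        xs = x + (j + 0.5) / n * (y - x)
        integral += (h(xs) ** 2 + h_prime(xs)) * dt / n
    return phi(dt, y - x) * math.exp(H(y) - H(x)) * (1.0 - 0.5 * integral)

def ou_exact(x, y, dt, kappa):
    """Exact transition density of dX = dB - kappa * X dt."""
    mean = x * math.exp(-kappa * dt)
    var = (1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

kappa = 1.0
approx = heat_kernel_first_order(
    0.3, 0.5, 0.05,
    h=lambda x: -kappa * x,
    h_prime=lambda x: -kappa,
    H=lambda x: -0.5 * kappa * x * x,
)
exact = ou_exact(0.3, 0.5, 0.05, kappa)
```

For small $T - t$ the ratio `approx / exact` should differ from one only at second order in $T - t$.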

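Proof In this case, $H_t = 0$ because $h_t = 0$. The integral in (3.1) can be evaluated as

$$\begin{aligned}
\int_t^T \left[ h^2(x_s^*) + h'(x_s^*) \right] ds
&= \int_t^T \left[ h^2\left(x + \frac{s-t}{T-t}(y - x)\right) + h'\left(x + \frac{s-t}{T-t}(y - x)\right) \right] ds \\
&= \frac{T-t}{y-x} \int_x^y \left[ h^2(\xi) + h'(\xi) \right] d\xi,
\end{aligned}$$

where in the last equation we used the change of variable $\xi = x + \frac{s-t}{T-t}(y - x)$.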

Let Yt denote the Brownian bridge from x at time t to y at time T and Ẽx,y [·]
be the expectation under the Brownian bridge measure. The proof of the asymptotic
expansion (3.1) requires the following two lemmas.

Lemma 1 For a bounded function $g = g(x, s)$, $|g| \le M$ say, we have the following estimate:

$$\tilde{E}_{x,y}\left[ e^{\int_t^T g(Y_s,s)\,ds} \right] = 1 + \int_t^T \tilde{E}_{x,y}\left[ g(Y_s, s) \right] ds + O(T - t)^2.$$

Proof The proof is based on a clever application of the convex order for random variables first observed, to our knowledge, in the paper by Goovaerts et al. [10] (see Proposition 6.2 on p. 348). Denote by $Q_{g(Y_s,s)}(q)$ the $q$th quantile of the random variable $g(Y_s, s)$. Since exponential functions are convex, it follows from Proposition 6.2 of Goovaerts et al. [10] that

$$\tilde{E}_{x,y}\left[ e^{\int_t^T g(Y_s,s)\,ds} \right] \le \int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq.$$

We establish an upper bound for the right hand side. First we Taylor expand the integrand and rewrite the integral as

$$\int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq = \sum_{k=0}^{\infty} \frac{1}{k!} \int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^k dq.$$

An upper bound for $\int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^k dq$ is then determined as

$$\begin{aligned}
\int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^k dq
&\le (T - t)^{k-1} \int_0^1 \int_t^T \left| Q_{g(Y_s,s)}(q) \right|^k ds\, dq \quad \text{(by Hölder's inequality)} \\
&= (T - t)^{k-1} \int_t^T \tilde{E}_{x,y}\left[ |g(Y_s, s)|^k \right] ds \\
&\le M^k (T - t)^k \quad \text{(since } |g| \le M\text{)}.
\end{aligned}$$

Thus,

$$\begin{aligned}
\int_0^1 e^{\int_t^T Q_{g(Y_s,s)}(q)\,ds}\, dq
&= 1 + \int_0^1 \int_t^T Q_{g(Y_s,s)}(q)\,ds\, dq + \sum_{k=2}^{\infty} \frac{1}{k!} \int_0^1 \left( \int_t^T Q_{g(Y_s,s)}(q)\,ds \right)^k dq \\
&\le 1 + \int_t^T \tilde{E}_{x,y}\left[ g(Y_s, s) \right] ds + \sum_{k=2}^{\infty} \frac{1}{k!} M^k (T - t)^k \\
&\le 1 + \int_t^T \tilde{E}_{x,y}\left[ g(Y_s, s) \right] ds + M^2 (T - t)^2 e^{M(T-t)},
\end{aligned}$$

which completes the proof.

Lemma 2 asserts that the time integral of the conditional expectation in Lemma 1
is approximately, up to order (T − t)2 , equal to the integral along a straight line
connecting x at time t to y at time T .
Lemma 2 For a bounded function $g = g(x, s)$ with bounded second partial derivative with respect to $x$, the following asymptotic holds:

$$\int_t^T \tilde{E}_{x,y}\left[ g(Y_s, s) \right] ds = \int_t^T g(x_s, s)\,ds + O(T - t)^2,$$

where $x_s$ denotes the straight line $x_s = x + \frac{s-t}{T-t}(y - x)$ from $(t, x)$ to $(T, y)$.

Proof Taylor's theorem implies that

$$g(Y_s, s) = g(x_s, s) + g_x(x_s, s)(Y_s - x_s) + \frac{g_{xx}(\xi_s, s)}{2}(Y_s - x_s)^2,$$

for some $\xi_s$ between $Y_s$ and $x_s$. Since $Y_s$ is a Brownian bridge from $(t, x)$ to $(T, y)$, $Y_s$ is normally distributed: $Y_s \sim N\left(x_s, \frac{(s-t)(T-s)}{T-t}\right)$. Therefore,

$$\begin{aligned}
\tilde{E}_{x,y}\left[ g(Y_s, s) \right] &= g(x_s, s) + g_x(x_s, s)\, \tilde{E}_{x,y}\left[ Y_s - x_s \right] + \frac{g_{xx}(\xi_s, s)}{2}\, \tilde{E}_{x,y}\left[ (Y_s - x_s)^2 \right] \\
&= g(x_s, s) + \frac{g_{xx}(\xi_s, s)}{2}\, \frac{(s-t)(T-s)}{T-t}.
\end{aligned}$$

Hence, by the assumption that $|g_{xx}| \le K$,

$$\begin{aligned}
\int_t^T \tilde{E}_{x,y}\left[ g(Y_s, s) \right] ds &= \int_t^T g(x_s, s)\,ds + \int_t^T \frac{g_{xx}(\xi_s, s)}{2}\, \frac{(s-t)(T-s)}{T-t}\, ds \\
&\le \int_t^T g(x_s, s)\,ds + \frac{K}{2} \int_t^T \frac{(s-t)(T-s)}{T-t}\, ds \\
&= \int_t^T g(x_s, s)\,ds + \frac{K}{12}(T - t)^2.
\end{aligned}$$
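The constant $\frac{K}{12}$ in the last step comes from integrating the Brownian bridge variance over $[t, T]$. A short worked computation, using the substitution $u = \frac{s-t}{T-t}$ (a variable introduced here only for this calculation), reads:

```latex
\int_t^T \frac{(s-t)(T-s)}{T-t}\, ds
  = (T-t)^2 \int_0^1 u(1-u)\, du
  = (T-t)^2 \left( \frac{1}{2} - \frac{1}{3} \right)
  = \frac{(T-t)^2}{6},
\qquad\text{so}\qquad
\frac{K}{2} \cdot \frac{(T-t)^2}{6} = \frac{K}{12}(T-t)^2 .
```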

The proof of Theorem 3 is now straightforward.

Proof of Theorem 3 By combining the two asymptotics in Lemmas 1 and 2 with $g(x, s) = -\frac{1}{2}\left[ h^2(x, s) + h_x(x, s) + 2H_t(x, s) \right]$, under the assumption that $g$ is bounded with bounded second partial derivative with respect to $x$, we obtain

$$\tilde{E}_{x,y}\left[ e^{-\frac{1}{2}\int_t^T \left[ h^2(X_s,s) + h_x(X_s,s) + 2H_t(X_s,s) \right] ds} \right] = 1 - \frac{1}{2}\int_t^T \left[ h^2(x_s, s) + h_x(x_s, s) + 2H_t(x_s, s) \right] ds + O(T - t)^2.$$

Recall expression (2.2) for the transition density:

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y,T) - H(x,t)} \times \tilde{E}_{x,y}\left[ e^{-\frac{1}{2}\int_t^T \left[ h^2(X_s,s) + h_x(X_s,s) + 2H_t(X_s,s) \right] ds} \right].$$

Substituting the approximation of the conditional expectation above, we obtain

$$p^X(T, y | t, x) = \phi(T - t, y - x)\, e^{H(y,T) - H(x,t)} \times \left[ 1 - \frac{1}{2}\int_t^T \left[ h^2(x_s, s) + h_x(x_s, s) + 2H_t(x_s, s) \right] ds + O(T - t)^2 \right].$$

3.2 Heat Kernel Expansion for Nondegenerate Diffusions

For general nondegenerate diffusions, consider the process St driven by the SDE:

d St = a(St , t)d Bt + μ(St , t)dt. (3.2)

Again the Lamperti transformation allows us to carry over the small time asymptotic expansion (3.1) in $x$-space to $s$-space. Specifically, recall that the Lamperti transformation $x_t = \varphi(s_t, t) = \int_{s_0}^{s_t} \frac{d\xi}{a(\xi,t)}$ transforms the SDE (3.2) into a Brownian motion with drift $dX_t = dB_t + h(X_t, t)\,dt$, where $h(x_t, t) = \dot\varphi(s_t, t) + \frac{\mu(s_t,t)}{a(s_t,t)} - \frac{a_s(s_t,t)}{2}$, and that the transition densities $p^S$ for $S_t$ and $p^X$ for $X_t$ are related by

$$p^S(T, s_T | t, s_t) = \frac{1}{a(s_T, T)}\, p^X(T, x_T | t, x_t),$$

with $x_T = \varphi(s_T, T)$ and $x_t = \varphi(s_t, t)$. Hence, a small time asymptotic expansion as $t \to T^-$ for $p^S$ can be obtained by simply applying the expansion (3.1). This argument is formalized in Theorem 4.

Theorem 4 The transition density $p^S$ of the process $S_t$ driven by the SDE

$$dS_t = a(S_t, t)\,dB_t + \mu(S_t, t)\,dt$$

has the small time asymptotic expansion as $t \to T^-$

$$p^S(T, s_T | t, s_t) = \frac{\phi(T - t, \varphi(s_T, T) - \varphi(s_t, t))}{a(s_T, T)}\, e^{H(\varphi(s_T,T),T) - H(\varphi(s_t,t),t)} \times \left[ 1 - \frac{1}{2}\int_t^T \left[ h^2(\varphi_\tau, \tau) + h_x(\varphi_\tau, \tau) + 2H_t(\varphi_\tau, \tau) \right] d\tau + O(T - t)^2 \right], \tag{3.3}$$

where $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t, t) + \frac{\tau-t}{T-t}\,\varphi(s_T, T)$.

We stress once again that in the time-inhomogeneous case, a = a(s, t), the
expansion in (3.3) is not identical to the classical heat kernel expansion as it involves
an integral along the path ϕτ . On the other hand, in the time-homogeneous case a =
a(s), (3.3) does recover the classical heat kernel expansion. In this sense therefore,
we have derived a natural generalization of the classical heat kernel expansion.

Corollary 2 (Heat kernel expansion for time-homogeneous diffusions) The transition density $p^S$ of the process $S_t$ driven by the time-homogeneous SDE

$$dS_t = a(S_t)\,dB_t + \mu(S_t)\,dt$$

has the small time asymptotic expansion as $t \to T^-$ up to first order

$$p(T, s_T | t, s_t) = \frac{\phi(T - t, \varphi(s_T) - \varphi(s_t))}{a(s_T)}\, e^{H\circ\varphi(s_T) - H\circ\varphi(s_t)} \times \left[ 1 - \frac{T-t}{2(\varphi(s_T) - \varphi(s_t))} \int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'\circ\varphi(s) \right] \frac{ds}{a(s)} + O(T - t)^2 \right], \tag{3.4}$$

where $\varphi(s) = \int_{s_0}^{s} \frac{d\xi}{a(\xi)}$, $h\circ\varphi(s) = \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2}$, and $H$ is an antiderivative of $h$. $\phi$ denotes the Gaussian density $\phi(t, \xi) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{\xi^2}{2t}}$. The small time asymptotic expansion coincides with the classical heat kernel expansion up to first order.

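Proof We verify that the expansion (3.4) is indeed the classical heat kernel expansion. The classical heat kernel expansion up to first order (see, for instance, Gatheral et al. [9]) reads in our notation

$$p(T, s_T | t, s_t) \approx \frac{\phi(T - t, \varphi(s_T) - \varphi(s_t))}{a(s_T)}\, u(s_t, s_T) \times \left[ 1 + \frac{T-t}{\varphi(s_T) - \varphi(s_t)} \int_{s_t}^{s_T} \frac{Lu(s, s_T)}{u(s, s_T)}\, \frac{ds}{a(s)} \right],$$

where $u(s, s_T) = e^{\int_s^{s_T} \frac{\mu(\eta)}{a^2(\eta)}\, d\eta} \sqrt{\frac{a(s)}{a(s_T)}}$ and $L = \frac{a^2(s)}{2}\,\partial_s^2 + \mu(s)\,\partial_s$ is the infinitesimal generator associated with the process $S_t$. In this case, the asymptotic expansion (3.3) reduces to

$$\frac{\phi(T - t, \varphi(s_T) - \varphi(s_t))}{a(s_T)}\, e^{H\circ\varphi(s_T) - H\circ\varphi(s_t)} \times \left[ 1 - \frac{1}{2}\int_t^T \left[ h^2(\varphi_\tau) + h'(\varphi_\tau) \right] d\tau \right],$$

where $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T)$. Therefore, it suffices to show that

$$e^{H\circ\varphi(s_T) - H\circ\varphi(s_t)} = u(s_t, s_T) \tag{3.5}$$

and

$$-\frac{1}{2}\int_t^T \left[ h^2(\varphi_\tau) + h'(\varphi_\tau) \right] d\tau = \frac{T-t}{\varphi(s_T) - \varphi(s_t)} \int_{s_t}^{s_T} \frac{Lu(s, s_T)}{u(s, s_T)}\, \frac{ds}{a(s)}. \tag{3.6}$$

For (3.5), since $h\circ\varphi(s) = \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2}$, $\varphi'(s) = \frac{1}{a(s)}$, and $H$ is an antiderivative of $h$, we have

$$\begin{aligned}
H\circ\varphi(s_T) - H\circ\varphi(s_t) &= \int_{\varphi(s_t)}^{\varphi(s_T)} h(\xi)\,d\xi = \int_{s_t}^{s_T} h\circ\varphi(s)\,d\varphi(s) \\
&= \int_{s_t}^{s_T} \left[ \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2} \right] \frac{ds}{a(s)} = \int_{s_t}^{s_T} \frac{\mu(s)}{a^2(s)}\,ds - \frac{1}{2}\log\frac{a(s_T)}{a(s_t)}.
\end{aligned}$$

Therefore,

$$e^{H\circ\varphi(s_T) - H\circ\varphi(s_t)} = e^{\int_{s_t}^{s_T} \frac{\mu(s)}{a^2(s)}\,ds} \sqrt{\frac{a(s_t)}{a(s_T)}} = u(s_t, s_T).$$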
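As for (3.6), since $\varphi_\tau = \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T)$, we have

$$\begin{aligned}
\int_t^T \left[ h^2(\varphi_\tau) + h'(\varphi_\tau) \right] d\tau
&= \int_t^T \left[ h^2\left( \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T) \right) + h'\left( \frac{T-\tau}{T-t}\,\varphi(s_t) + \frac{\tau-t}{T-t}\,\varphi(s_T) \right) \right] d\tau \\
&= \frac{T-t}{\varphi(s_T) - \varphi(s_t)} \int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s).
\end{aligned}$$

Note that $d\varphi(s) = \frac{ds}{a(s)}$ and

$$h'\circ\varphi(s) = \frac{1}{\varphi'(s)}\, \frac{d}{ds}\left[ h\circ\varphi(s) \right] = a(s) \times \frac{d}{ds}\left[ \frac{\mu(s)}{a(s)} - \frac{a'(s)}{2} \right],$$

consequently,

$$\int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s) = \int_{s_t}^{s_T} \left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^2 + a\left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] \frac{ds}{a(s)},$$

where we suppressed the dependence on $s$ for notational simplicity. On the other hand, for the right hand side of (3.6), by straightforward calculation we have

$$\begin{aligned}
Lu(s, s_T) &= \frac{a^2(s)}{2}\,\partial_s^2 u(s, s_T) + \mu(s)\,\partial_s u(s, s_T) \\
&= \left[ -\frac{\mu^2}{2a^2} - \frac{(a')^2}{8} - \frac{\mu'}{2} + \frac{\mu a'}{a} + \frac{a a''}{4} \right] u(s, s_T) \\
&= -\frac{1}{2}\left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^2 + a\left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] u(s, s_T).
\end{aligned}$$

It follows that

$$\begin{aligned}
\int_{s_t}^{s_T} \frac{Lu(s, s_T)}{u(s, s_T)}\, \frac{ds}{a(s)} &= -\frac{1}{2}\int_{s_t}^{s_T} \left[ \left( \frac{\mu}{a} - \frac{a'}{2} \right)^2 + a\left( \frac{\mu}{a} - \frac{a'}{2} \right)' \right] \frac{ds}{a(s)} \\
&= -\frac{1}{2}\int_{s_t}^{s_T} \left[ h^2(\varphi(s)) + h'(\varphi(s)) \right] d\varphi(s),
\end{aligned}$$

which completes the proof of (3.6).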



4 Implied Volatility Approximation

The implied volatility $\sigma_{BS} = \sigma_{BS}(K, T)$ is defined implicitly by solving the nonlinear equation

$$C(s, t, K, T) = C_{BS}(s, t, K, T, \sigma_{BS}(K, T)), \tag{4.1}$$

where the function $C_{BS}$ on the right hand side is the celebrated Black-Scholes pricing formula for call options (assuming zero interest rate and dividend yield):

$$C_{BS}(s, t, K, T, \sigma_{BS}) = s\,N(d_1) - K\,N(d_2)$$

with $d_1 = \frac{\log s - \log K}{\sigma_{BS}\sqrt{T-t}} + \frac{\sigma_{BS}\sqrt{T-t}}{2}$, $d_2 = d_1 - \sigma_{BS}\sqrt{T-t}$, and $N(\cdot)$ is the cumulative normal distribution function. The Black-Scholes formula is monotonic increasing in
the volatility parameter σBS , and for this reason amongst others, it is often market
practice to quote options in terms of Black-Scholes implied volatility. Moreover,
practitioners often calibrate their option pricing models to implied volatilities rather
than price quotes. In this regard, efficient and accurate approximations of implied
volatility not only permit faster calibration of option pricing models but also help
build intuition.
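Because the Black-Scholes price is monotonic in volatility, the defining equation (4.1) can be inverted numerically by bisection. The following is a minimal sketch under the zero-rate, zero-dividend assumption of the text; the function names and the bracketing interval `[1e-6, 5.0]` are illustrative choices, not part of the original.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_call(s, K, tau, sigma):
    """Black-Scholes call price with zero rate and dividend; tau = T - t."""
    v = sigma * math.sqrt(tau)
    d1 = math.log(s / K) / v + 0.5 * v
    return s * norm_cdf(d1) - K * norm_cdf(d1 - v)

def implied_vol(price, s, K, tau, lo=1e-6, hi=5.0, tol=1e-12):
    """Solve bs_call(s, K, tau, sigma) = price for sigma by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, K, tau, mid) < price:
            lo = mid   # price too low: volatility must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# round trip: price an option at 25% volatility, then recover the volatility
recovered = implied_vol(bs_call(100.0, 110.0, 0.5, 0.25), 100.0, 110.0, 0.5)
```

Bisection is slow compared with Newton-type iterations but cannot diverge, which makes it a convenient reference when testing the asymptotic approximations discussed below.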
Conventionally, asymptotic expansions of implied volatility for small time to
expiry (to lowest order) are generated by matching exponents in respectively, an
asymptotic approximation for a far out-of-the-money (OTM) option under Black-
Scholes, and an asymptotic approximation to the option price from direct integration
over the (approximated) density. For such far out-of-the-money (OTM) options, as
time approaches expiry, the event that the underlying will end up in-the-money at
expiry is a rare event. According to the theory of large deviations, such a rare event
has exponentially small probability, so the option price is of the form

$$\int_K^\infty e^{-\frac{d(x)}{T-t}}\, f(x)\,dx. \tag{4.2}$$

As $t \to T^-$, the main contribution to the integral comes from the minimum point of $d$, which in this case is the boundary point of the support of $f$ because, in the OTM case, $d(x)$ is strictly increasing in $x$, and $f(x)$ has the payoff function as a factor (see (4.4)). To zeroth order, the Laplace asymptotic formula (for example, see (5.2.23) on p. 193 of Bleistein and Handelsman [3]) then reads

$$\int_K^\infty e^{-\frac{d(x)}{T-t}}\, f(x)\,dx \approx (T - t)^2\, e^{-\frac{d(K)}{T-t}}\, \frac{f'(K)}{|d'(K)|^2} \tag{4.3}$$

as $t \to T^-$, provided $f'(K)$ and $d'(K)$ are nonzero. Thus, the small time asymptotic expansion of the implied volatility is obtained by applying the Laplace asymptotic formula (4.3) to both sides of (4.1) and then matching the corresponding coefficients. As one might expect, the dominating term of such expansions is typically the zeroth order term. Our objective in this section is to demonstrate how to implement this matching procedure from the path integral perspective.
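Recasting Eq. (4.1) for implied volatility using our path integral representation of the density, and using our earlier representation (2.9) of the call price, we obtain

$$\begin{aligned}
\int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T \left|\dot{x}_\tau - h(x_\tau,\tau)\right|^2 d\tau}\, D[x]\, ds_T
&= \int_K^\infty \frac{s_T - K}{\sigma_{BS}\, s_T}\, e^{-\frac{\sigma_{BS}}{2}(x_T - x_t) - \frac{\sigma_{BS}^2}{8}(T-t)} \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau|^2 d\tau}\, D[x]\, ds_T \\
&= \int_K^\infty \frac{s_T - K}{\sqrt{2\pi(T-t)}\,\sigma_{BS}\, s_T}\, e^{-\frac{1}{2}\left( \frac{\log s_T - \log s_t}{\sigma_{BS}\sqrt{T-t}} + \frac{\sigma_{BS}\sqrt{T-t}}{2} \right)^2} ds_T.
\end{aligned} \tag{4.4}$$

In the two lines on the right hand side, $x$ denotes the Black-Scholes Lamperti variable $x = \frac{\log s}{\sigma_{BS}}$. Equation (4.4) provides an implicit expression for Black-Scholes implied volatility in terms of local volatility. In what follows, we first show how to recover from (4.4) the heat kernel approximations of Gatheral et al. [9] and the most-likely-path approximation of Gatheral and Wang [8]. Finally, in Sect. 4.3, we show how to improve on these approximations by adopting the path integral perspective.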

4.1 Recovery of the Berestycki-Busca-Florent (BBF) Formula

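To rederive the results in Berestycki et al. [2] and Gatheral et al. [9] from (4.4), we approximate both sides of (4.4) as Laplace type integrals as in (4.2). The path integral on the left hand side of (4.4) is approximated as follows:

$$\begin{aligned}
\int_{C_x} e^{-\frac{1}{2}\int_t^T \left|\dot{x}_\tau - h(x_\tau,\tau)\right|^2 d\tau}\, D[x]
&= \int_{C_x} e^{-\frac{1}{2}\left[ \int_t^T |\dot{x}_\tau|^2 d\tau - 2\int_t^T h(x_\tau,\tau)\,dx_\tau + \int_t^T h^2(x_\tau,\tau)\,d\tau \right]}\, D[x] \\
&\approx \int_{C_x} e^{-\frac{1}{2}\int_t^T |\dot{x}_\tau|^2 d\tau} \left\{ 1 - \frac{1}{2}\left[ -2\int_t^T h(x_\tau, \tau)\,dx_\tau + \int_t^T h^2(x_\tau, \tau)\,d\tau \right] \right\} D[x] \\
&\approx e^{-\frac{(x_T - x_t)^2}{2(T-t)}} \left[ 1 + O(T - t) \right],
\end{aligned}$$

where in the last step we approximated the path integral by evaluating the integral in the exponent along a single path: the straight line connecting $x_t$ and $x_T$. Recall that $x_t = \varphi(s_t, t) = \int_{s_0}^{s_t} \frac{d\xi}{a(\xi,t)}$. Substitution back into the left hand side of (4.4) gives

$$\int_K^\infty \frac{s_T - K}{a(s_T, T)} \int_{C_x} e^{-\frac{1}{2}\int_t^T \left|\dot{x}_\tau - h(x_\tau,\tau)\right|^2 d\tau}\, D[x]\, ds_T \approx \int_K^\infty e^{-\frac{|\varphi(s_T,T) - \varphi(s_t,t)|^2}{2(T-t)}} \left[ 1 + O(T - t) \right] \frac{s_T - K}{a(s_T, T)}\, ds_T,$$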
which is of Laplace type as in (4.2). Applying the Laplace asymptotic formula (4.3), we obtain that, up to a factor,

$$C(s, t, K, T) \approx e^{-\frac{|\varphi(K,T) - \varphi(s,t)|^2}{2(T-t)}}. \tag{4.5}$$

Likewise, the Black-Scholes price on the right hand side of (4.4) is given, up to a factor, by

$$C_{BS}(s, t, K, T) \approx e^{-\frac{|\log K - \log s|^2}{2\sigma_{BS}^2 (T-t)}}. \tag{4.6}$$

Finally, by matching the exponents in (4.5) and (4.6), we obtain the zeroth order approximation of the implied volatility as

$$\sigma_{BS} \approx \frac{\log K - \log s}{\varphi(K, T) - \varphi(s, t)}.$$

In the time homogeneous case,

$$\varphi(K) - \varphi(s) = \int_s^K \frac{d\xi}{a(\xi)}$$

and we recover the BBF formula as in Berestycki et al. [2] and Gatheral et al. [9].
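In the time homogeneous case the BBF formula is a one-dimensional quadrature, as the following sketch illustrates. The function name `bbf_implied_vol`, the midpoint rule, and the sample volatility function are assumptions made here for illustration; note that `a` is the coefficient in $dS_t = a(S_t)\,dB_t$, so Black-Scholes corresponds to $a(s) = \sigma s$.

```python
import math

def bbf_implied_vol(s, K, a, n=1000):
    """Zeroth-order (BBF) implied volatility for dS = a(S) dB.

    Computes (log K - log s) / int_s^K dxi / a(xi) with the midpoint rule.
    At the money the limit a(s) / s is returned. (Hypothetical helper.)
    """
    if K == s:
        return a(s) / s
    dxi = (K - s) / n
    integral = sum(dxi / a(s + (j + 0.5) * dxi) for j in range(n))
    return math.log(K / s) / integral

# Black-Scholes check: a(s) = 0.2 * s should return 0.2 at every strike
vol_up = bbf_implied_vol(100.0, 120.0, lambda x: 0.2 * x)
vol_down = bbf_implied_vol(100.0, 80.0, lambda x: 0.2 * x)
```

For $K < s$ both the numerator and the integral change sign, so the ratio stays positive.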

4.2 Recovery of the Variational-Most-Likely-Path (vMLP) Approximation of Gatheral and Wang [8]

The path integral term in (4.4) is in $x$-space. Alternatively, in $s$-space it reads

$$\int_{C_s} e^{-\frac{1}{2}\int_t^T \left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)} + \frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s],$$

where

$$D[s] = \lim_{n\to\infty} \frac{1}{\sqrt{2\pi\Delta t}\; a(s_T, T)} \prod_{i=1}^{n-1} \frac{ds_i}{\sqrt{2\pi\Delta t}\; a(s_i, t_i)}.$$

Hence, we can rewrite the left hand side of (4.4) in $s$-space as

$$C(t, s_t, K, T) = \int_K^\infty (s_T - K) \int_{C_s} e^{-\frac{1}{2}\int_t^T \left[\frac{\dot{s}_\tau}{a(s_\tau,\tau)} + \frac{a_s(s_\tau,\tau)}{2}\right]^2 d\tau}\, D[s]\, ds_T.$$

The variational most-likely-path approximation of implied volatility developed in Gatheral and Wang [8] is obtained by dropping the second term $\frac{a_s(s_\tau,\tau)}{2}$ in the path integral and evaluating the resulting path integral along the path that maximizes the functional

$$e^{-\frac{1}{2}\int_t^T \left| \frac{\dot{s}_\tau}{a(s_\tau,\tau)} \right|^2 d\tau}.$$

In other words,

$$C(s, t, K, T) \approx \int_K^\infty (s_T - K)\, e^{-\frac{1}{2}\int_t^T \left| \frac{\dot{s}_\tau^*}{a(s_\tau^*,\tau)} \right|^2 d\tau}\, ds_T,$$

where $s_\tau^*$ is the optimal path that minimizes the action functional $\int_t^T \left| \frac{\dot{s}_\tau}{a(s_\tau,\tau)} \right|^2 d\tau$ subject to the constraints that initial and terminal points are fixed at $s_t$ and $s_T$ respectively. Moreover, since the resulting integral is of Laplace type, the call price is given asymptotically, up to a factor, by

$$C(s, t, K, T) \approx e^{-\frac{1}{2}\int_t^T \left| \frac{\dot{s}_\tau^*}{a(s_\tau^*,\tau)} \right|^2 d\tau},$$

where the optimal path $s_\tau^*$ has initial and terminal points $s$ and $K$ respectively. Finally, by matching the exponent with the Black-Scholes asymptotic as in (4.6), the zeroth order approximation of implied volatility is given by

$$\sigma_{BS} \approx \frac{|\log K - \log s|}{\sqrt{T-t}} \left( \int_t^T \left| \frac{\dot{s}_\tau^*}{a(s_\tau^*, \tau)} \right|^2 d\tau \right)^{-\frac{1}{2}},$$

which recovers the variational most-likely-path approximation of the implied volatility presented in Gatheral and Wang [8].

4.3 New and Improved Most-Likely-Path (MLP) Approximation

As is obvious from our presentation, the approximations obtained in Gatheral et al. [9] and in Gatheral and Wang [8] are suboptimal from the perspective of our path integral representation (4.4) in the sense that they both drop terms. This suggests that we should define the path-integral-most-likely-path to be the path that minimizes the full action functional

$$\frac{1}{2}\int_t^T \left| \dot{x}_\tau - h(x_\tau, \tau) \right|^2 d\tau \tag{4.7}$$

or equivalently in $s$-space the functional

$$\frac{1}{2}\int_t^T \left| \frac{\dot{s}_\tau}{a(s_\tau, \tau)} + \frac{a_s(s_\tau, \tau)}{2} \right|^2 d\tau \tag{4.8}$$

without dropping terms. The Euler-Lagrange equation associated with the functional in (4.7) is

$$\ddot{x}_\tau = h\,h_x + h_t \tag{4.9}$$

with boundary conditions $x_t$ and $x_T$ at times $t$ and $T$ respectively. Matching exponents as before gives

$$\left| \frac{\log K - \log s}{\sigma_{BS}\sqrt{T-t}} + \frac{\sigma_{BS}\sqrt{T-t}}{2} \right|^2 = \int_t^T \left[ \dot{x}_\tau^* - h(x_\tau^*, \tau) \right]^2 d\tau, \tag{4.10}$$

where $x_\tau^*$ is the optimal path which minimizes the functional (4.7) (or equivalently solves (4.9)) with initial and terminal points given by $\varphi(s, t)$ and $\varphi(K, T)$ respectively. Solving (4.10) for $\sigma_{BS}$ yields our new-and-improved zeroth order approximation for implied volatility.

To illustrate the accuracy of our new approximation (4.10), consider the case of time dependent Black-Scholes, where rather pleasingly, (4.10) gives the exact solution. Note in passing that, to the best of our knowledge, none of the existing small time approximations is able to recover this very simple case.

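Example 1 (Implied volatility in the time dependent Black-Scholes model) Assume the price $S_t$ of the underlying satisfies the following under the pricing measure:

$$dS_\tau = \sigma(\tau)\, S_\tau\, dB_\tau, \qquad S_t = s_t.$$

In order to apply (4.10), we proceed as follows:

(a) Transform the model into $x$-space.
(b) Solve the Euler-Lagrange equation (4.9) for the optimal path.
(c) Evaluate the action functional (4.7) along the optimal path, substitute into (4.10) and solve for the implied volatility.

(a) Transform into $x$-space: In this case, $x = \varphi(s, t) = \int_{s_0}^{s} \frac{1}{\sigma(t)\,\xi}\, d\xi = \frac{\log s - \log s_0}{\sigma(t)}$. Dropping the explicit dependence on $t$ for ease of notation, and applying Ito's formula to $X_t = \varphi(S_t, t)$ we obtain

$$\begin{aligned}
dX_t &= \dot\varphi(S_t, t)\,dt + \varphi_s(S_t, t)\,dS_t + \frac{1}{2}\varphi_{ss}(S_t, t)\,d[S]_t \\
&= dB_t - \left( \frac{\sigma'}{\sigma}\, X_t + \frac{\sigma}{2} \right) dt.
\end{aligned}$$

Thus $h(x, t) = -\frac{\sigma}{2} - \frac{\sigma'}{\sigma}\, x$.

(b) Solve the Euler-Lagrange equation: The associated Euler-Lagrange equation (4.9) in this case reads

$$\ddot{x} = h\,h_x + h_t = \left[ \left( \frac{\sigma'}{\sigma} \right)^2 - \left( \frac{\sigma'}{\sigma} \right)' \right] x.$$

With the change of variable $x = \frac{z}{\sigma}$, the above ODE for $x$ is transformed into the following ODE for $z$:

$$\ddot{z} - \frac{2\sigma'}{\sigma}\, \dot{z} = 0 \implies \frac{d}{d\tau}\left( \frac{\dot{z}}{\sigma^2} \right) = 0.$$

With boundary conditions $z_t = \sigma_t x_t$ and $z_T = \sigma_T x_T$, the solution to the Euler-Lagrange equation is given by

$$\sigma_\tau x_\tau = z_\tau = \sigma_t x_t + \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds} \int_t^\tau \sigma^2(s)\,ds.$$

(c) Solve for implied volatility: It follows that the functional (4.7) evaluated along the optimal path, taking into account that $\frac{\dot{z}}{\sigma^2} = \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds}$ is a constant, is given by

$$\begin{aligned}
\int_t^T \left| \dot{x}_\tau - h(x_\tau, \tau) \right|^2 d\tau
&= \int_t^T \left| \dot{x}_\tau + \frac{\sigma}{2} + \frac{\sigma'}{\sigma}\, x \right|^2 d\tau \\
&= \int_t^T \left| \frac{\partial_\tau(\sigma x)}{\sigma} + \frac{\sigma}{2} \right|^2 d\tau = \int_t^T \left| \frac{\dot{z}}{\sigma} + \frac{\sigma}{2} \right|^2 d\tau \\
&= \left( \frac{\sigma_T x_T - \sigma_t x_t}{\int_t^T \sigma^2(s)\,ds} + \frac{1}{2} \right)^2 \int_t^T \sigma_\tau^2\, d\tau \\
&= \left( \frac{\log s_T - \log s_t}{\sqrt{\int_t^T \sigma^2(s)\,ds}} + \frac{1}{2}\sqrt{\int_t^T \sigma^2(s)\,ds} \right)^2.
\end{aligned}$$

Finally, substituting this last expression (with $s_T = K$) into (4.10) gives the well-known result

$$\sigma_{BS}^2 = \frac{1}{T-t}\int_t^T \sigma^2(s)\,ds,$$

which is exact.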
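The last step of Example 1, solving (4.10) for $\sigma_{BS}$ once the action is known, can be sketched as follows. Writing $w = \sigma_{BS}\sqrt{T-t}$ and $k = \log K - \log s$, (4.10) becomes a quadratic in $w$. The sketch assumes $k > 0$ and picks the smaller root, which is the relevant one when $w^2 < 2k$ (short maturities); that root selection, the function names, and the sample volatility curve $\sigma(u) = 0.2 + 0.1u$ are assumptions made here for illustration.

```python
import math

def total_variance(sigma, t, T, n=1000):
    """Midpoint-rule approximation of int_t^T sigma(u)^2 du."""
    dt = (T - t) / n
    return sum(sigma(t + (j + 0.5) * dt) ** 2 for j in range(n)) * dt

def solve_for_implied_vol(action, k, tau):
    """Solve (k / w + w / 2)^2 = action for w = sigma_BS * sqrt(tau), k > 0.

    Equivalent to w^2 - 2*sqrt(action)*w + 2*k = 0; the smaller root is
    returned (assumed regime: w^2 < 2k). (Hypothetical helper.)
    """
    w = math.sqrt(action) - math.sqrt(max(action - 2.0 * k, 0.0))
    return w / math.sqrt(tau)

sigma = lambda u: 0.2 + 0.1 * u          # illustrative deterministic volatility
s, K, t, T = 100.0, 120.0, 0.0, 1.0
k = math.log(K / s)
var = total_variance(sigma, t, T)
# action along the optimal path, using the closed form derived in step (c)
action = (k / math.sqrt(var) + 0.5 * math.sqrt(var)) ** 2
sigma_bs = solve_for_implied_vol(action, k, T - t)
```

The recovered `sigma_bs` coincides with the root-mean-square volatility $\sqrt{\frac{1}{T-t}\int_t^T \sigma^2(s)\,ds}$, in line with the exactness claimed in the example.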

5 Conclusion

We have shown, up to first order in τ = T − t, that the classical heat kernel expan-
sion can be derived using a novel probabilistic approach. This new probabilistic
derivation of the heat kernel expansion inspires a path integral representation of the
transition density; natural definitions of the most-likely-path approximation of the
transition density, the call price, and the implied volatility then follow. In the time
homogeneous case, we recover well-known classical results. However, in the time
inhomogeneous case, we obtain a new asymptotic expansion that generalizes the
classical one. We showed how the lowest order approximation of Berestycki, Busca
and Florent as well as the higher order approximations of Gatheral et al. [9] and
Gatheral and Wang [8] correspond to dropping terms in our lowest order path inte-
gral representation. We further showed that by restoring the dropped terms, our new
representation recovers the exact expression for Black-Scholes implied volatility in
the time-dependent Black-Scholes model, which no existing asymptotic expansion
technique has so far been able to achieve, to the best of our knowledge. Further applications of this promising approach to the important practical problem of accurately approximating implied volatility under local volatility are left for future research.

Acknowledgments We thank the anonymous reviewer for his helpful and constructive comments.
We are also grateful for helpful discussions with the participants of the following seminars: Math
Finance and PDE Seminar at Rutgers University, Probability Seminar at TU Berlin, Probability
Seminar at Academia Sinica, Mathematics Colloquium at Ritsumeikan University, Mathematical
Finance Seminar at Osaka University. All errors are our own responsibility.

References

1. Baldi, P., Caramellino, L.: Asymptotics of hitting probabilities for general one-dimensional
diffusions. Ann. Appl. Probab. 12, 1071–1095 (2002)
2. Berestycki, H., Busca, J., Florent, I.: Asymptotics and calibration of local volatility models.
Quant. Financ. 2, 61–69 (2002)
3. Bleistein, N., Handelsman, R.A.: Asymptotic Expansions of Integrals. Dover Publications,
New York (1986)
4. Chavel, I.: Eigenvalues in Riemannian geometry. Pure and Applied Mathematics, Book 115,
Academic Press (1984)
5. Cheng, W., Costanzino, N., Liechty, J., Mazzucato, A.L., Nistor, V.: Closed-form asymptotics
and numerical approximations of 1D parabolic equations with applications to option pricing.
SIAM J. Financ. Math. 2(1), 901–934 (2011)
6. Gatheral, J.: The Volatility Surface: A Practitioner’s Guide, Wiley Finance (2006)
7. Gatheral, J., Jacquier, A.: Arbitrage-free SVI volatility surfaces. Quant. Financ. 14(1), 59–71
(2014)
8. Gatheral, J., Wang, T.-H.: The heat kernel most-likely-path approximation. Int. J. Theor. Appl.
Financ. 15(1), 1250001 (2012)
9. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22(4), 591–620 (2012)
10. Goovaerts, M., De Schepper, A., Decamps, M.: Closed-form approximations for diffusion
densities: a path integral approach. J. Comput. Appl. Math. 164–165, 337–364 (2004)

11. Guyon, J., Henry-Labordère, P.: From spot volatilities to implied volatilities. Risk Mag. pp.
79–84 (2011)
12. Henry-Labordère, P.: Analysis, Geometry, and Modeling in Finance. Chapman & Hall/CRC,
Financial Mathematics Series (2008)
13. Hsu, E.P.: Stochastic Analysis on Manifolds. Graduate Studies in Mathematics, American
Mathematical Society (2002)
14. Jordan, R., Tier, C.: Asymptotic approximations to deterministic and stochastic volatility mod-
els. SIAM J. Financ. Math. 2(1), 935–964 (2011)
15. Keller-Ressel, M., Teichmann, J.: A remark on Gatheral’s ’most-likely path approximation’ of
implied volatility. In: Springer Proceedings in Mathematics & Statistics (2014)
16. Linetsky, V.: The path integral approach to financial modeling and options pricing. Comput.
Econ. 11(1–2), 129–163 (1997)
17. Reghai, A.: The hybrid most likely path. Risk Mag. 34–35 (2006)
Extrapolation Analytics for Dupire’s Local Volatility

Peter Friz and Stefan Gerhold

Abstract We consider wing asymptotics of local volatility surfaces. While our recent paper in the journal Risk (De Marco et al. Risk 2:82–87, 2013, [3]) discusses our approximation formula from a practical and numerical perspective, the present paper focuses on rigorous proofs of the approximations. We apply the saddle point method (Heston model) and Hankel contour integration (variance gamma model).

Keywords Local volatility · Saddle point methods · Contour integration

1 Introduction

One of the main objectives in option pricing theory is to price exotic derivatives con-
sistently with observed vanilla prices. According to the seminal work of Dupire [5],
this can in principle be achieved, for a one-dimensional underlying, by a model with
dynamics $dS_t/S_t = \sigma(S_t,t)\,dW_t$. As opposed to stochastic volatility models, here
the volatility is a deterministic function of time and current underlying price. Any
given smooth call price surface $C(K,T)$, for strikes $K > 0$ and maturities $T > 0$,

A preprint of this article circulated under the title “Don’t stay local—extrapolation analytics for
Dupire’s local volatility”.

P. Friz (B)
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]
P. Friz
Weierstraß-Institut für Angewandte Analysis und Stochastik,
Berlin, Germany
S. Gerhold
Financial and Actuarial Mathematics, Vienna University of Technology, Wiedner Hauptstraße
8/105-1 A-1040, Vienna, Austria
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 273


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_10

can be recovered by a so-called local volatility model $dS_t/S_t = \sigma_{\mathrm{loc}}(S_t,t)\,dW_t$,
where the volatility function is given by Dupire’s formula [5]

$$\sigma_{\mathrm{loc}}^2(K,T) = \frac{2\,\partial_T C}{K^2\,\partial_{KK} C}. \tag{1}$$

Exotic options can then be priced by Monte Carlo simulation. Local volatility models
are of considerable practical importance, and serve as building blocks for more
advanced models, e.g. local-stochastic-volatility (LSV) models.
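
As a rough numerical illustration of Dupire’s formula (1), the following sketch applies central finite differences to a Black–Scholes call surface with zero interest rate, for which the local volatility should simply reproduce the constant input volatility. The function names, the spot 100, the strike 110 and the step sizes are assumptions chosen for illustration only; they do not come from the text.

```python
import math


def bs_call(S0, K, T, sigma):
    """Black-Scholes call price with zero interest rate (illustrative test surface)."""
    d1 = (math.log(S0 / K) + 0.5 * sigma**2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # standard normal cdf
    return S0 * N(d1) - K * N(d2)


def dupire_local_vol(call, K, T, dK=0.5, dT=1e-3):
    """Local volatility from formula (1), with derivatives replaced by central differences."""
    dC_dT = (call(K, T + dT) - call(K, T - dT)) / (2.0 * dT)
    d2C_dK2 = (call(K + dK, T) - 2.0 * call(K, T) + call(K - dK, T)) / dK**2
    return math.sqrt(2.0 * dC_dT / (K**2 * d2C_dK2))


sigma_true = 0.2  # hypothetical constant volatility
call = lambda K, T: bs_call(100.0, K, T, sigma_true)
sigma_loc = dupire_local_vol(call, K=110.0, T=1.0)
print(sigma_loc)  # should be close to sigma_true
```

Finite differences of this kind tend to become unstable far in the wings, where the density in the denominator is tiny; this is the regime in which asymptotic approximations are meant to help.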
In the present paper, we consider local volatility surfaces that arise from call
prices that are generated by some model for the underlying. Our aim is to turn the
knowledge of that model’s mgf (moment generating function; of log-spot X T ) into
asymptotic results of the corresponding local volatility surface. In [3], we described
two applications of such approximations. One is to the design of local volatility
parametrizations, whose asymptotic behavior may be matched to our results. Another
application concerns model risk. Consider pricing under an “advanced” model (affine
stochastic volatility, Lévy, etc.; anything with known mgf) versus a local volatility
model. The relative difference between the two prices has been named the “toxicity index”
in [13]. Roughly speaking, it measures the distance of the trade from vanilla options.
The most consistent way to calculate this index is to use the local volatility model
generated by the “advanced” model, because only then all vanillas will have zero
toxicity. When computing the local volatility surface, our accurate approximations
can then profitably replace other numerical methods in regimes where the latter
become unstable (see [3] for details).
We suppose that the underlying price process St = exp(X t ) is a martingale under
the pricing measure P and write C(K , T ) for its call price surface. For simplicity we
assume zero interest rate throughout. If C is sufficiently smooth, then the associated
local volatility function is given by Dupire’s formula (1). Recall the main asymptotic
formula from [3]:

$$\sigma_{\mathrm{loc}}^2(K,T) \sim \left.\frac{2\,\partial_T m(s,T)}{s(s-1)}\right|_{s=\hat{s}(k,T)}, \tag{2}$$

where $k$ denotes log-strike, and $\hat{s} = \hat{s}(k,T)$ is determined as the solution of the saddle
point equation

$$\frac{\partial}{\partial s}\, m(s,T) = k. \tag{3}$$

Here, $m(s,T) := \log M(s,T)$ is the logarithm of the moment generating function
(mgf) $M$, which is defined by $M(s,T) := E[\exp(sX_T)]$ and is analytic in the (maximal)
strip $s_-(T) < \mathrm{Re}(s) < s_+(T)$. The numbers $s_-$ and $s_+$ are called critical
exponents. In this note, we will use (2) for $K \to \infty$, but other asymptotic regimes
can also be covered [3, 8]; it is thus not only a local-volatility analogue of Lee’s
moment formula [11], but works also for maturity (or joint) asymptotics.

As described in [3], formula (2) results from saddle point approximations of the
numerator and denominator of Dupire’s formula, after inserting the Fourier
representation of the call price:

$$\sigma_{\mathrm{loc}}^2(K,T) = \frac{2\,\partial_T C}{K^2\,\partial_{KK} C} = \frac{2\int_{-i\infty}^{i\infty} \frac{\partial_T m(s,T)}{s(s-1)}\, e^{-ks} M(s,T)\,ds}{\int_{-i\infty}^{i\infty} e^{-ks} M(s,T)\,ds}. \tag{4}$$

(The real parts of the contours are in $(1, s_+)$.)


Whereas the focus of [3] is on numerical tests and applications, the present note
gives proofs for the validity of (2), in the setting of the Heston and of the variance
gamma model. As regards methodology, the proof for the Heston model uses a
classical saddle point approach. Its most interesting ingredient, similarly to [7], is
the use of ODE comparison results to furnish the necessary tail estimates, without
taking recourse to the explicit form of the Heston mgf. The analysis is thus well
suited to extension towards other affine stochastic volatility models. For the variance
gamma model, the saddle point method is not appropriate. We apply another classical
contour integration approach, based on Hankel contours, which seems to be new in
mathematical finance.

2 The Heston Model

Even though practitioners seem to prefer local-stochastic-volatility models nowadays
over the classical Heston model, it might still be useful for the two applications
outlined in the introduction (model risk and parametrization design; recall that the
large maturity Heston smile motivates the popular SVI parametrization of implied
volatility [9]). The dynamics of the Heston model are

$$dS_t = S_t\sqrt{V_t}\,dW_t, \qquad S_0 = s_0 > 0,$$

$$dV_t = (a + bV_t)\,dt + c\sqrt{V_t}\,dZ_t, \qquad V_0 = v_0 > 0,$$

with $a \ge 0$, $b \le 0$, $c > 0$, and $d\langle W, Z\rangle_t = \rho\,dt$ with $\rho \in (-1,1)$.

Theorem 1 In the Heston model with $\rho \le 0$ (the relevant regime in practice, at least
for equity models), the asymptotic equivalence (2) holds for $k \to \infty$. The explicit
leading term is

$$\sigma_{\mathrm{loc}}^2(K,T) \sim \frac{2}{s_+(s_+-1)\,R_1/R_2} \times k, \qquad k \to \infty, \tag{5}$$

where $k = \log(K/S_0)$, $s_+ \equiv s_+(T)$ and

$$\begin{aligned}
R_1 = {} & T c^2 s_+(s_+-1)\bigl[c^2(2s_+-1) - 2\rho c(s_+\rho c + b)\bigr] \\
& - 2(s_+\rho c + b)\bigl[c^2(2s_+-1) - 2\rho c(s_+\rho c + b)\bigr] \\
& + 4\rho c\bigl[c^2 s_+(s_+-1) - (s_+\rho c + b)^2\bigr],
\end{aligned} \tag{6}$$

$$R_2 = 2c^2 s_+(s_+-1)\bigl[c^2 s_+(s_+-1) - (s_+\rho c + b)^2\bigr]. \tag{7}$$

Proof It was shown in [3] that the right hand side of (5) asymptotically equals the
right hand side of (2). It thus remains to show that (2) holds for the Heston model as
k → ∞.
By the exponential decay of the Heston mgf towards ±i∞, the second equality
in formula (4) is correct for the Heston model. For the saddle point analysis of (4),
we employ the approximate saddle point

$$\hat{s}_{\mathrm{approx}}(k) := s_+ - \beta k^{-1/2},$$

where $\beta = \frac{\sqrt{2v_0}}{c\sqrt{\sigma}}$, $\sigma$ denotes the critical slope

$$\sigma(T) = -\frac{\partial T^*}{\partial s}\bigl(s_+(T)\bigr),$$

and

$$T^*(s) = \sup\{t \ge 0 : E[e^{sX_t}] < \infty\}.$$

This is the same approximate saddle point as in [7]; see there for more details on its
choice, and the definition of σ(T ) and T ∗ (s). (In [7], our ŝapprox was called simply ŝ,
since the exact saddle point of the denominator of (4), defined in (3), did not occur.)
This approximate saddle may be used for both integrals in (4). As for the denominator,
this was carried out in detail in [7], where an expansion of the Heston density ∂ K K C
was determined. The analysis of the numerator in (4) is similar, except that a new
tail estimate is required. But first we discuss the local expansion around the saddle
point. Let us fix a number $\alpha \in (\tfrac{2}{3}, \tfrac{3}{4})$ and define $h(k) = k^{-\alpha}$. Then, in the central
range $|s - \hat{s}_{\mathrm{approx}}(k)| \le h(k)$, we have

$$\frac{1}{s(s-1)} = \frac{1}{s_+(s_+-1)} + O(s_+ - s) = \frac{1}{s_+(s_+-1)}\bigl(1 + O(k^{-1/2})\bigr)$$

and (cf. formula (19) in [3])

$$\begin{aligned}
2\,\frac{\partial}{\partial T}\, m(s,T) &= \frac{2\beta^2}{\sigma(s_+-s)^2} + O\Bigl(\frac{1}{s_+-s}\Bigr) \\
&= \frac{2\beta^2}{\sigma}\bigl(\beta k^{-1/2} + O(k^{-\alpha})\bigr)^{-2} + O(k^{1/2}) \\
&= \frac{2k}{\sigma}\bigl(1 + O(k^{1/2-\alpha})\bigr).
\end{aligned}$$

Therefore, the local expansions of the two integrands in (4) agree, up to a factor that
is given by

$$\frac{2\,\partial_T m(s,T)}{s(s-1)} = \frac{2k}{\sigma s_+(s_+-1)}\bigl(1 + O(k^{1/2-\alpha})\bigr), \tag{8}$$

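where the error term holds uniformly w.r.t. the integration variable $s$. According to
Theorem 1.2 of [7], we have

$$\frac{1}{2i\pi}\int_{\hat{s}_{\mathrm{approx}} - ih(k)}^{\hat{s}_{\mathrm{approx}} + ih(k)} e^{-ks} M(s,T)\,ds \sim A_1\, e^{(1-A_3)k + A_2\sqrt{k}}\, k^{-3/4 + a/c^2} \tag{9}$$

for certain constants $A_1$, $A_2 = 2\beta$, and $A_3 = s_+ + 1$. Analogously, we derive from
(8) that

$$\frac{1}{2i\pi}\int_{\hat{s}_{\mathrm{approx}} - ih(k)}^{\hat{s}_{\mathrm{approx}} + ih(k)} \frac{2\,\partial_T m(s,T)}{s(s-1)}\, e^{-ks} M(s,T)\,ds \sim \frac{2k}{\sigma s_+(s_+-1)} \times A_1\, e^{(1-A_3)k + A_2\sqrt{k}}\, k^{-3/4 + a/c^2}. \tag{10}$$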

Dividing (10) by (9) shows our claim (5), provided that the tails |s − ŝapprox (k)| >
h(k) of the integrals can be discarded. For the denominator of (4), this was shown in
Lemma A.3 of [7]. So we proceed with the numerator. We consider only the upper
tail, as the lower one is handled by symmetry. By Lemma A.3 of [7], there is a
constant $B > 0$ such that

$$\left|\int_{\hat{s}_{\mathrm{approx}} + ih(k)}^{\hat{s}_{\mathrm{approx}} + iB} e^{-ks} M(s,T)\,ds\right| \le e^{(1-A_3)k} \exp\bigl(A_2\sqrt{k} - \tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha} + O(\log k)\bigr). \tag{11}$$

From formula (18) in [3] we obtain

$$\left|\frac{\partial_T m(s,T)}{s(s-1)}\right| \le \mathrm{const} \times k$$

for all $s$ on the contour in (11). This estimate can be absorbed into the factor
$\exp(O(\log k))$ in (11), so that we conclude

$$\left|\int_{\hat{s}_{\mathrm{approx}} + ih(k)}^{\hat{s}_{\mathrm{approx}} + iB} \frac{\partial_T m(s,T)}{s(s-1)}\, e^{-ks} M(s,T)\,ds\right| \le e^{(1-A_3)k} \exp\bigl(A_2\sqrt{k} - \tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha} + O(\log k)\bigr). \tag{12}$$

This grows slower than the right hand side of (10) (compare the relevant factors
$k^{-3/4+a/c^2}$ resp. $\exp(-\tfrac{1}{2}\beta^{-1} k^{3/2-2\alpha})$). As for $\mathrm{Im}(s) > B$, it was shown in [7]
(Lemma A.2) that

$$\left|\int_{\hat{s}_{\mathrm{approx}} + iB}^{\hat{s}_{\mathrm{approx}} + i\infty} e^{-ks} M(s,T)\,ds\right| = O\bigl(\exp((1-A_3)k + \beta\sqrt{k})\bigr).$$

This was deduced from the exponential decay of M(s, T ) for large Im(s) (Lemma
A.1 in [7]). The following lemma implies that the new factor ∂T m(s, T )/(s(s − 1))
grows only polynomially, so that the exponential decay of the integrand persists for
the numerator of (4). This finishes the proof of Theorem 1. 

To state the lemma, recall that $m(s,t) = \phi(s,t) + v_0\,\psi(s,t)$, where $\phi$ and $\psi$
satisfy the Riccati equations

$$\dot{\phi} = a\psi, \qquad \phi(0) = 0,$$

$$\dot{\psi} = \tfrac{1}{2}(s^2 - s) + \tfrac{1}{2}c^2\psi^2 + b\psi + s\rho c\,\psi, \qquad \psi(0) = 0.$$

We have to show that $\dot{m}$ grows only polynomially as $\mathrm{Im}(s) \to \infty$. Because of
the Riccati equations, it suffices to show this for $\psi$. Let us write $\psi = f + ig$ and
$s = \xi + iy$.
Lemma 2 Let $T > 0$, and assume that the real part $\xi$ of $s$ stays bounded in some
interval $1 \le \xi \le \xi_{\max}$. Then, there are positive constants $C_{i,T}$ ($i = 1, 2, 3$) such that
for $y \ge y_0$, where $y_0$ depends only on $\xi_{\max}$ and the other (fixed) model parameters
of the Heston model,

$$-C_{3,T}\, y^2 \le f(t) \le -C_{1,T}\, y,$$

$$0 \le g(t) \le C_{2,T}\, y.$$

In fact, we can take

$$C_{1,T} = 1/(3c), \qquad C_{2,T} = \tfrac{1}{2}(2\xi_{\max} - 1)\,T, \qquad C_{3,T} = T\Bigl(1 + \frac{c^2}{2}\, C_{2,T}^2\Bigr).$$

Proof It follows from the proof of Lemma A.1 in [7] that (e.g. with $T\theta = \frac{1}{c}\sqrt{1/6} \ge \frac{1}{3c}$)

$$f(t) \le -T\theta y = -\frac{1}{c}\sqrt{1/6}\; y \le -\frac{1}{3c}\, y =: -C_{1,T}\, y.$$

We next provide a similar upper estimate for $g$. To this end we first show that $g = g(t)$
remains $\ge 0$ for all times $t > 0$. The differential equation for $g$,

$$\dot{g} = \tfrac{1}{2}(2\xi y - y) + c^2 f g - \gamma g, \qquad g(0) = 0,$$

implies the first order Euler estimate

$$g(t) = g(0) + \Bigl[\tfrac{1}{2}(2\xi y - y) + c^2 f(0) g(0) - \gamma g(0)\Bigr] t + o(t) = \underbrace{\tfrac{1}{2}(2\xi y - y)}_{>0}\; t + o(t),$$

and hence $g$ is positive (even strictly so) on some interval $(0, \varepsilon_1)$. Assume this interval
is maximal in the sense that $g(\varepsilon_1) = 0$ and $g$ is (strictly) negative on some further
interval $(\varepsilon_1, \varepsilon_2)$. Clearly then $\dot{g}(\varepsilon_1) \le 0$, which contradicts the information from the
differential equation: indeed, using $g(\varepsilon_1) = 0$, we obtain the contradiction

$$\dot{g}(\varepsilon_1) = \underbrace{\tfrac{1}{2}(2\xi y - y)}_{>0}.$$

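The observation that $g \ge 0$ is useful to us, since it leads, together with $f \le -C_{1,T}\, y$
and $\gamma \ge 0$, to the differential inequality

$$\begin{aligned}
\dot{g} &= \tfrac{1}{2}(2\xi y - y) + c^2 f g - \gamma g \\
&\le \tfrac{1}{2}(2\xi y - y) - \bigl(c^2 C_{1,T}\, y + \gamma\bigr) g \\
&\le \tfrac{1}{2}(2\xi y - y),
\end{aligned}$$

and hence to the upper estimate

$$\forall\, 0 \le t \le T: \quad g(t) \le \tfrac{1}{2}(2\xi_{\max} - 1)\,T \times y =: C_{2,T}\, y.$$

We can feed this upper estimate on $g$ back in the differential equation for $f$ to obtain
a lower estimate

$$\begin{aligned}
\dot{f} &= \tfrac{1}{2}\bigl(\xi^2 - y^2 - \xi\bigr) + \frac{c^2}{2}\bigl(f^2 - g^2\bigr) - \gamma f \\
&\ge \tfrac{1}{2}\bigl(\xi^2 - y^2 - \xi\bigr) + \frac{c^2}{2} f^2 - \frac{c^2}{2} C_{2,T}^2\, y^2 - \gamma f \\
&= -\tfrac{1}{2}\bigl(1 + c^2 C_{2,T}^2\bigr) y^2 + \tfrac{1}{2}\bigl(\xi^2 - \xi\bigr) - \gamma f + \frac{c^2}{2} f^2 \\
&\ge -\tfrac{1}{2}\bigl(1 + c^2 C_{2,T}^2\bigr) y^2 + \tfrac{1}{2}\bigl(\xi^2 - \xi\bigr) - \gamma f \\
&\ge -\Bigl(1 + \frac{c^2}{2} C_{2,T}^2\Bigr) y^2 - \gamma f,
\end{aligned}$$

where in the last step we assume that $y$ is large enough so that the extra amount
subtracted (at least: $\tfrac{1}{2}y^2$) is larger than $\tfrac{1}{2}|\xi^2 - \xi|$, which remains bounded. We also
know that $f(t) \le -C_{1,T}\, y \le 0$ for all $0 \le t \le T$. It follows that $-\gamma f \ge 0$ and
omission leads to our final lower bound on $\dot{f}$, namely

$$\dot{f} \ge -\Bigl(1 + \frac{c^2}{2} C_{2,T}^2\Bigr) y^2.$$

This entails immediately

$$f(t) \ge -T\Bigl(1 + \frac{c^2}{2} C_{2,T}^2\Bigr) y^2 =: -C_{3,T}\, y^2. \qquad \square$$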

3 The Variance Gamma Model

The mgf of the variance gamma model is

$$M(s,T) = e^{Tbs}\bigl(1 - \theta\nu s - \tfrac{1}{2}\sigma^2\nu s^2\bigr)^{-T/\nu},$$

where $\sigma, \nu > 0$ and $\theta \in \mathbb{R}$. The “drift” $b = \nu^{-1}\log(1 - \theta\nu - \tfrac{1}{2}\sigma^2\nu)$ is chosen such
that $S = e^X$ becomes a martingale (w.l.o.g., $S_0 = 1$). For fixed $T$ with $0 < T/\nu < \tfrac{1}{2}$,
the density $\partial_{KK}C(K,T)$ of $S_T$ has a singularity at the origin. Indeed, it behaves as
$\approx |k|^{2T/\nu - 1}$, which easily follows from the integral representation of the density [1]
(as always, $k = \log K$). At the money, the denominator of the Dupire formula (1)
thus explodes for small $T$. If $T/\nu > \tfrac{1}{2}$, then the density is continuous. This lack of
smoothness is just an additional issue on top of a common feature of jump models:
The associated local volatility surface explodes as $T \to 0$, and so the local volatility
SDE

$$dS/S = \sigma_{\mathrm{loc}}(S,t)\,dW \tag{13}$$

does not make sense on $[0, \infty) \ni t$.


However, following [8], we can start a Monte Carlo simulation of (13) at a time
T0 > 0 (here, T0 > ν/2) instead of time zero. With the appropriate stochastic initial
value, sampled from the density ∂ K K C(K , T0 ), we recover call prices from time T0
on. (T0 is called ε in [8].) This gives a meaning to the local volatility surface of a
jump model, without appealing to the practically challenging approach of local Lévy
models [2]. Our aim is not to make this fully rigorous for the variance gamma model
(or other jump models), which would require showing that (13) admits a unique strong
solution on $[T_0, \infty)$. Our focus, instead, is on a rigorous proof that (2) is valid in
this setting. To ensure the validity of the Fourier representations of density and call
price, we even assume $T/\nu > 1$ (instead of $T/\nu > \tfrac{1}{2}$).
Theorem 3 In the variance gamma model, formula (2) holds for $k = \log K \to \infty$.
The explicit leading term is

$$\sigma_{\mathrm{loc}}^2(K,T) \sim \frac{2\log(k/T)}{\nu s_+(s_+-1)}, \qquad k \to \infty. \tag{14}$$

Note that the numerator of (14) is ∼ 2 log k. We kept the T -dependence, because
the same analysis works for fixed k and T → 0, and in fact for any asymptotic regime
with k/T → ∞. This is a common feature of Lévy models, since the right-hand side
of (2) depends on k and T only through k/T .
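
To make the statement of Theorem 3 more tangible, here is a minimal numerical sketch that solves the saddle point equation (3) for the variance gamma mgf by bisection, evaluates the right-hand side of (2), and compares it with the leading term (14). The parameter values (σ = 0.2, ν = 0.5, θ = −0.1, T = 1) and the function name `vg_formula2` are assumptions made purely for illustration. Since the relative error decays only like 1/log(k/T), the ratio approaches 1 slowly.

```python
import math


def vg_formula2(k, T=1.0, sigma=0.2, nu=0.5, theta=-0.1):
    """Right-hand side of (2) and leading term (14) for the variance gamma model.

    Returns (delta, formula2, leading), where delta = s_plus - s_hat.
    All default parameter values are hypothetical.
    """
    root = math.sqrt(2.0 * nu * sigma**2 + nu**2 * theta**2)
    s_plus = (-nu * theta + root) / (nu * sigma**2)
    s_minus = (-nu * theta - root) / (nu * sigma**2)
    b = math.log(1.0 - theta * nu - 0.5 * sigma**2 * nu) / nu  # martingale drift

    def q(delta):
        # 1 - theta*nu*s - sigma^2*nu*s^2/2 in factored form, with s = s_plus - delta
        return 0.5 * sigma**2 * nu * delta * (s_plus - delta - s_minus)

    def ds_m(delta):
        # derivative of m(s, T) with respect to s, evaluated at s = s_plus - delta
        s = s_plus - delta
        return T * b + T * (theta + sigma**2 * s) / q(delta)

    # ds_m decreases as delta grows (m is convex in s): bisect on a log scale
    lo, hi = 1e-15, s_plus
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if ds_m(mid) > k:
            lo = mid
        else:
            hi = mid
    delta = math.sqrt(lo * hi)
    s_hat = s_plus - delta

    dT_m = b * s_hat - math.log(q(delta)) / nu  # derivative of m with respect to T
    formula2 = 2.0 * dT_m / (s_hat * (s_hat - 1.0))
    leading = 2.0 * math.log(k / T) / (nu * s_plus * (s_plus - 1.0))
    return delta, formula2, leading


d2, f2, l2 = vg_formula2(1e2)
d6, f6, l6 = vg_formula2(1e6)
print(f2 / l2, f6 / l6)  # the ratio moves slowly towards 1 as k grows
```

The sketch works with δ = s₊ − ŝ rather than with ŝ itself, so that the factor (s₊ − s) near the singularity is not lost to floating-point cancellation.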
Proof We write the moment generating function as

$$M(s,T) = e^{bTs}\bigl(\tfrac{1}{2}\sigma^2\nu\,(s_+ - s)(s - s_-)\bigr)^{-T/\nu},$$

where the critical moments are

$$s_\pm = \frac{-\nu\theta \pm \sqrt{2\nu\sigma^2 + \nu^2\theta^2}}{\nu\sigma^2}.$$

We analyze the denominator of (4), i.e., the density. The arguments for the numerator
are analogous (see below). The shift $k \to k + bT$ makes it clear that we may w.l.o.g.
assume that $b = 0$. The main part of the saddle point equation (3) is $T/(\nu(s_+ - s)) = k$,
and so

$$\hat{s} = s_+ - \frac{T}{\nu k} + O(k^{-2}).$$

The saddle point approximation of the density then is

$$\frac{1}{2i\pi}\int_{-i\infty}^{i\infty} e^{-ks} M(s,T)\,ds \approx \frac{\exp\bigl(m(\hat{s},T) - k\hat{s}\bigr)}{\sqrt{2\pi\, m''(\hat{s},T)}}. \tag{15}$$

The interesting point now is that (15) is wrong for the variance gamma model,
inasmuch as asymptotic equality does not hold. The algebraic singularity of the mgf
is not pronounced enough to make the saddle point method work; see also the remark
after the proof. For a correct analysis, we use an integration contour as in Fig. 1. The
U-shaped notch, denoted by C(k), extends a bit to the right of the singularity s+ ,
and captures enough asymptotic information from it. By transformation into a so
called Hankel path, Hankel’s representation of the Gamma function can be invoked
after termwise integration of a local expansion. This “Hankel contour approach” is
well known in analytic combinatorics, in particular, from the so-called singularity
analysis of generating functions [6].
Let us first argue that the integrals over the dashed lines in Fig. 1 can be discarded.
By symmetry, it suffices to consider the upper one. The real part of $s$ is then
$\mathrm{Re}(s) = s_+ + (\log k)/k$. First suppose that $s$ is away from the singularity, say $\mathrm{Im}(s) > 1$. The
integral of $((s_+ - s)(s - s_-))^{-T/\nu}$ over this part of the contour is $O(1)$, and so we
get the bound $O(e^{-k\mathrm{Re}(s)}) = O(e^{-ks_+}/k)$. Now consider $s$ with $1/k \le \mathrm{Im}(s) < 1$.
We estimate the resulting integral by the length of the contour, which is $O(1)$, times
the absolute value of the integrand at the lower endpoint $s = s_+ + (\log k)/k + i/k$.
The latter is easily seen to be $O(e^{-ks_+} k^{T/\nu-1} (\log k)^{-T/\nu})$.

Fig. 1 The contour $\mathcal{C}(k)$, a small notch embracing the critical moment $s_+$ (axes: Re(s) and Im(s); marked lengths: $(\log k)/k$ and $1/k$)
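We will now show that the integral over $\mathcal{C}(k)$ is of order $e^{-ks_+} k^{T/\nu-1}$, so that the
tail estimates we have just derived are good enough. The factor $(s - s_-)$ is locally
almost constant; we have, uniformly for $s \in \mathcal{C}(k)$,

$$M(s,T) \sim c_1 (s_+ - s)^{-T/\nu}, \qquad k \to \infty,$$

where $c_1 = c_1(T) = (\sigma^2\nu(s_+ - s_-)/2)^{-T/\nu}$. Therefore,

$$\int_{\mathcal{C}(k)} e^{-ks} M(s,T)\,ds \sim \int_{\mathcal{C}(k)} e^{-ks} \frac{c_1}{(s_+ - s)^{T/\nu}}\,ds.$$

The change of variables $s = s_+ - w/k$ transforms this into

$$\frac{e^{-ks_+}}{k}\int_{\mathcal{H}(k)} e^{w} c_1 \Bigl(\frac{k}{w}\Bigr)^{T/\nu} dw = c_1 \frac{e^{-ks_+}}{k^{1-T/\nu}} \int_{\mathcal{H}(k)} e^{w} w^{-T/\nu}\,dw \sim c_1 \frac{e^{-ks_+}}{k^{1-T/\nu}} \int_{\mathcal{H}(\infty)} e^{w} w^{-T/\nu}\,dw.$$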

Fig. 2 The integration contours $\mathcal{H}(k)$ and $\mathcal{H}(\infty)$. The dots should indicate that the contour $\mathcal{H}(\infty)$ extends to $-\infty$

The integration paths are displayed in Fig. 2. The right one, $\mathcal{H}(\infty)$, is called a Hankel
contour; $\mathcal{H}(k)$ is a Hankel contour truncated at $\mathrm{Re}(s) = -\log k$. Now recall Hankel’s
representation for the Gamma function [12]:

$$\frac{1}{2i\pi}\int_{\mathcal{H}(\infty)} e^{w} w^{-z}\,dw = \frac{1}{\Gamma(z)}.$$

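We thus arrive at

$$\frac{1}{2i\pi}\int_{-i\infty}^{i\infty} e^{-ks} M(s,T)\,ds \sim \frac{c_1}{\Gamma(T/\nu)}\, e^{-ks_+} k^{T/\nu-1}. \tag{16}$$

The numerator of (4) can be treated analogously, with a very similar tail estimate.
The contribution of the new factor to the local expansion is

$$2\,\frac{\partial_T m(s,T)}{s(s-1)} \sim \frac{2/\nu}{s_+(s_+-1)} \log\frac{1}{s_+ - s} = \frac{2/\nu}{s_+(s_+-1)} \log\frac{k}{w} \sim \frac{2\log k}{\nu s_+(s_+-1)},$$

and so

$$\frac{1}{2i\pi}\int_{-i\infty}^{i\infty} 2\,\frac{\partial_T m(s,T)}{s(s-1)}\, e^{-ks} M(s,T)\,ds \sim \frac{2\log k}{\nu s_+(s_+-1)} \times \frac{c_1}{\Gamma(T/\nu)}\, e^{-ks_+} k^{T/\nu-1}. \tag{17}$$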

Dividing (17) by (16) yields the desired result. 


As mentioned in the preceding proof, the saddle point formula (15) is not an
asymptotic equivalence for the variance gamma model. But, as we have shown, our
formula (2) is still correct. What happens is that (15), and its counterpart for the
numerator of (4), are almost correct: They are only off by a constant factor. (This
phenomenon has already been observed for similar integrals in [4].) This constant
factor is the same for both integrals, and thus cancels in the quotient (4). Therefore,
our asymptotic formula (2) extends well beyond models where the saddle point
method is applicable. In fact, we conjecture that the formula holds whenever the mgf
explodes close to the singularity s+ .

4 Other Jump Models

Without giving proofs, we briefly discuss local volatility asymptotics for two other
jump models. The mgf of Kou’s double exponential Lévy jump diffusion is given by

$$M(s,T) = \exp\Bigl(T\Bigl(bs + \frac{\sigma^2 s^2}{2} + \lambda\Bigl(\frac{\lambda_+ p}{\lambda_+ - s} + \frac{\lambda_-(1-p)}{\lambda_- + s} - 1\Bigr)\Bigr)\Bigr).$$

The critical moment is $s_+ = \lambda_+$, and the saddle point is located at

$$\hat{s} \approx s_+ - \sqrt{\frac{\lambda\lambda_+ p\,T}{k}}.$$

The singularity type, the same as in the Heston model, is amenable to the saddle
point method. Formula (2) can thus certainly be verified, and yields

$$\sigma_{\mathrm{loc}}^2(K,T) \sim \frac{2\sqrt{\lambda p}}{\sqrt{\lambda_+ T}\,(\lambda_+ - 1)}\, k^{1/2}, \qquad k \to \infty.$$

For T → 0, the blowup of local volatility is of order T −1/2 . (Just as the Hankel
contour analysis in the proof of Theorem 3 can be carried out for any asymptotic
regime with k/T → ∞, the same is true when applying the saddle point method to
the local volatility surface of a Lévy model.)
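
The Kou asymptotics stated above can be checked numerically along the same lines. The sketch below solves the saddle point equation (3) by bisection on (1, λ₊), evaluates the right-hand side of (2), and compares the result with the approximate saddle point and with the k^{1/2} leading term. The parameter values and the function name `kou_formula2` are hypothetical, and the drift b is fixed by the martingale condition m(1, T) = 0, which is an assumption consistent with the martingale setting of the introduction.

```python
import math


def kou_formula2(k, T=1.0, sigma=0.2, lam=1.0, p=0.5, lam_p=10.0, lam_m=5.0):
    """Formula (2) for Kou's model; returns (s_hat, s_approx, formula2, leading).

    All default parameter values are hypothetical.
    """
    # drift from the martingale condition m(1, T) = 0 (requires lam_p > 1)
    b = -0.5 * sigma**2 - lam * (
        lam_p * p / (lam_p - 1.0) + lam_m * (1.0 - p) / (lam_m + 1.0) - 1.0
    )

    def dT_m(s):
        # m(s, T) is linear in T for a Levy model, so dT_m = m / T
        return (b * s + 0.5 * sigma**2 * s**2
                + lam * (lam_p * p / (lam_p - s)
                         + lam_m * (1.0 - p) / (lam_m + s) - 1.0))

    def ds_m(s):
        return T * (b + sigma**2 * s
                    + lam * (lam_p * p / (lam_p - s)**2
                             - lam_m * (1.0 - p) / (lam_m + s)**2))

    # ds_m is increasing on (1, lam_p) and blows up at lam_p: plain bisection
    lo, hi = 1.0, lam_p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ds_m(mid) < k:
            lo = mid
        else:
            hi = mid
    s_hat = 0.5 * (lo + hi)

    s_approx = lam_p - math.sqrt(lam * lam_p * p * T / k)
    formula2 = 2.0 * dT_m(s_hat) / (s_hat * (s_hat - 1.0))
    leading = 2.0 * math.sqrt(lam * p) / (math.sqrt(lam_p * T) * (lam_p - 1.0)) * math.sqrt(k)
    return s_hat, s_approx, formula2, leading


s_hat, s_approx, formula2, leading = kou_formula2(1e4)
print(s_hat, s_approx, formula2 / leading)
```

Because the singularity here is a simple pole in m rather than a logarithm, the relative error decays like k^{-1/2}, noticeably faster than in the variance gamma case.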
Finally, we consider the normal inverse Gaussian (NIG) model. The mgf

$$M(s,T) = \exp\Bigl(Tbs + \delta T\bigl(\sqrt{\alpha^2 - \beta^2} - \sqrt{\alpha^2 - (\beta + s)^2}\bigr)\Bigr)$$

has no blow-up at the critical moment

$$s_+ = \alpha - \beta,$$

but a square-root type singularity, with local expansion

$$M(s,T) \approx e^{Tbs_+ + \delta T\sqrt{\alpha^2 - \beta^2}}\bigl(1 - \delta T\sqrt{2\alpha}\,\sqrt{s_+ - s}\bigr). \tag{18}$$

It is still true that $\sigma_{\mathrm{loc}}^2(K,T)$ asymptotically depends, via (4), on the local behavior
of $M(s,T)$ near $s_+$. However, the approximation (2) hinges on the first term of the
local expansion of $M(s,T)$. It therefore fails to capture the asymptotics of $\sigma_{\mathrm{loc}}^2(K,T)$
here, which depend on the first singular term (the term $\sqrt{s_+ - s}$ in (18)). The NIG
model is thus one of the few examples where (2) is wrong. (It gives the qualitatively
correct result of convergence to a constant, but a wrong one.) The Hankel contour
analysis in the proof of Theorem 3 can be adapted to handle this situation. The result
is that local volatility tends to a constant for k → ∞. This fact may be understood
by comparing the NIG marginals with those of Heston’s in the time T → ∞ regime
(this link is made precise in [10]). In particular, the result is then consistent with the
Heston asymptotics (5) of local vol, given that the O(k) term carries a factor ≈ 1/T
which tends to zero as T → ∞.

Acknowledgments We thank M. Drmota, J. Morgenbesser, and the referee for helpful comments,
and gratefully acknowledge financial support from MATHEON (P. Friz) resp. the Austrian Science
Fund (FWF) under grant P 24880-N25 (S. Gerhold).

References

1. Carr, P., Chang, E., Madan, D.: The variance gamma process and option pricing. Eur. Financ.
Rev. 2, 79–105 (1998)
2. Carr, P., Geman, H., Madan, D.P., Yor, M.: From local volatility to local Lévy models. Quant.
Financ. 4, 581–588 (2004)
3. De Marco, S., Friz, P., Gerhold, S.: Rational shapes of local volatility. Risk 2, 82–87 (2013)
4. Drmota, M., Soria, M.: Marking in combinatorial constructions: generating functions and
limiting distributions. Theor. Comput. Sci. 144, 67–99 (1995). Special volume on mathematical
analysis of algorithms
5. Dupire, B.: Pricing with a smile. Risk 7, 18–20 (1994)
6. Flajolet, P., Odlyzko, A.: Singularity analysis of generating functions. SIAM J. Discret. Math.
3, 216–240 (1990)
7. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11, 1151–1164 (2011)
8. Friz, P.K., Gerhold, S., Yor, M.: How to make Dupire’s local volatility work with jumps. Quant.
Financ. 14, 1327–1331 (2014)
9. Gatheral, J., Jacquier, A.: Convergence of Heston to SVI. Quant. Financ. 11, 1129–1132 (2011)
10. Keller-Ressel, M.: Moment explosions and long-term behavior of affine stochastic volatility
models. Math. Financ. 21, 73–98 (2011)
11. Lee, R.W.: The moment formula for implied volatility at extreme strikes. Math. Financ. 14,
469–480 (2004)
12. Olver, F.W.J., Lozier, D.W., Boisvert, R.F., Clark, C.W. (eds.): NIST Handbook of Mathematical
Functions. U.S. Department of Commerce National Institute of Standards and Technology,
Washington, DC (2010)
13. Reghai, A.: Model evolution. Presentation at the Parisian Model Validation seminar. https://
sites.google.com/site/projeteuclide/les-seminaires-vmf/archives-vmf (2011)
The Gärtner-Ellis Theorem,
Homogenization, and Affine
Processes

Archil Gulisashvili and Josef Teichmann

Abstract We obtain a first order extension of the large deviation estimates in the
Gärtner-Ellis theorem. In addition, for a given family of measures, we find a spe-
cial family of functions having a similar Laplace principle expansion up to order
one to that of the original family of measures. The construction of the special fam-
ily of functions mentioned above is based on heat kernel expansions. Some of the
ideas employed in the paper come from the theory of affine stochastic processes.
For instance, we provide an explicit expansion with respect to the homogenization
parameter of the rescaled cumulant generating function in the case of a generic
continuous affine process. We also compute the coefficients in the homogenization
expansion for the Heston model that is one of the most popular stock price models
with stochastic volatility.

Keywords Affine process · Large deviation principle · Heat kernel expansion ·


Short time asymptotics · Laplace method · Small maturity limit in affine models

2010 Mathematics Subject Classification 60F10 · 35K08

1 Introduction

The large deviations theory has found numerous applications in mathematical finance
(see, e.g., [19]). For instance, using the methods of the large deviations theory, one
can estimate various important characteristics of financial models such as tails of
asset price distributions, option pricing functions, and the implied volatility (see,
e.g., [7–11, 13, 15] and the references therein). A popular source of information on

A. Gulisashvili (B)
Department of Mathematics, Ohio University, Athens, OH, USA
e-mail: [email protected]
J. Teichmann
Department of Mathematics, ETH Zürich, Zürich, Switzerland
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 287


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_11

the large deviations theory is the book [4] by Dembo and Zeitouni. A useful result
in the theory is the Gärtner-Ellis theorem (see [6, 12], see also [4]). This theorem
allows one to infer the upper and lower estimates in the large deviation principle knowing
the properties of the limiting cumulant generating function.
We will next provide a brief overview of the contents of the paper. In Sect. 2, a
new notion of Laplace principle equivalent expansions for families of functions and
measures is introduced. This notion is motivated by the homogenization expansion
of the rescaled cumulant generating function associated with an affine stochastic
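process $X$, that is, the function $\Lambda$ defined by

$$\Lambda(\varepsilon, u) = \varepsilon \log E\Bigl[\exp\Bigl(-\frac{u}{\varepsilon} X_\varepsilon\Bigr)\Bigr] = \varepsilon \log \int_{\mathbb{R}} \exp\Bigl(-\frac{u}{\varepsilon} z\Bigr)\, p_\varepsilon(dz).$$

Actually, the homogenization expansion mentioned above is nothing else but the real
analytic expansion of the function $\Lambda$ with respect to the parameter $\varepsilon$ (see Sect. 4).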
In Sect. 3, we gather definitions and known facts from the theory of general affine
processes, while in Sect. 4, the homogenization procedure is described in all details
for continuous affine processes. The main general results obtained in the paper are
contained in Sect. 2 (see Theorems 2.4 and 2.7). Theorem 2.4 states that for any family
of measures on the real line, satisfying the conditions in the Gärtner-Ellis theorem,
and such that the homogenization expansion exists, we can find a special family of
functions that is Laplace principle equivalent to the original family of measures. The
structure of the function family in Theorem 2.4 resembles the first two terms in the
heat kernel expansions on Riemannian manifolds (notice that we face a degenerate
situation here, so we could not apply heat kernel expansion directly). Theorem 2.7 is a
generalization of the Gärtner-Ellis theorem. It is shown in Theorem 2.7 that under the
same conditions as in Theorem 2.4, the first order large deviation estimates are valid.
Finally, in Sect. 5, we compute the coefficients in the homogenization expansion for
the correlated Heston model that is one of the most popular stochastic stock price
models with stochastic volatility.

2 Distributions with Equivalent Laplace


Principle Expansions

Laplace’s principle is an asymptotic expansion technique, which allows one to
approximate integrals of the form

$$\int_a^b f(z) \exp\Bigl(-\frac{\phi(z)}{\varepsilon}\Bigr)\,dz \tag{2.1}$$

as $\varepsilon \to 0$. We will next formulate a rather general version of Laplace’s principle that
will be used in the sequel. Suppose the following conditions hold:

• The functions $f$ and $\phi$ in (2.1) are continuous on the interval $(a, b)$, and the integral
in (2.1) converges absolutely for all $0 < \varepsilon < \varepsilon_0$.
• The function $\phi$ has a unique absolute minimum that occurs at $z = z_0$ with
$a < z_0 < b$.
• The function $\phi$ is strictly convex in a neighborhood of $z_0$.
• The function $\phi$ is four times continuously differentiable in a neighborhood of $z_0$,
and

$$\phi(z) = \phi(z_0) + \sum_{n=2}^{4} \frac{\partial^n \phi(z_0)}{n!}(z - z_0)^n + O\bigl((z - z_0)^5\bigr) \tag{2.2}$$

as $z \to z_0$.
• The formula in (2.2) can be differentiated. More exactly, the condition

$$\partial\phi(z) = \sum_{n=2}^{4} \frac{\partial^n \phi(z_0)}{(n-1)!}(z - z_0)^{n-1} + O\bigl((z - z_0)^4\bigr), \qquad z \to z_0, \tag{2.3}$$

holds.
• The function $f$ is twice continuously differentiable in a neighborhood of $z_0$, and

$$f(z) = \sum_{n=0}^{2} \frac{\partial^n f(z_0)}{n!}(z - z_0)^n + O\bigl((z - z_0)^3\bigr) \tag{2.4}$$

as $z \to z_0$.
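Then, as $\varepsilon \to 0$,

$$\begin{aligned}
\int_a^b f(z) \exp\Bigl(-\frac{\phi(z)}{\varepsilon}\Bigr)\,dz
= \exp\Bigl(-\frac{\phi(z_0)}{\varepsilon}\Bigr) \sqrt{\frac{2\pi\varepsilon}{\partial^2\phi(z_0)}}
\Bigl[ & f(z_0) + \varepsilon\Bigl( \frac{\partial^2 f(z_0)}{2\,\partial^2\phi(z_0)} + \frac{5(\partial^3\phi(z_0))^2 f(z_0)}{24(\partial^2\phi(z_0))^3} \\
& - \frac{\partial^4\phi(z_0)\, f(z_0)}{8(\partial^2\phi(z_0))^2} - \frac{\partial^3\phi(z_0)\,\partial f(z_0)}{2(\partial^2\phi(z_0))^2} \Bigr) + O\bigl(\varepsilon^2\bigr) \Bigr].
\end{aligned} \tag{2.5}$$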

Formula (2.5) can be derived by following the proof of Theorem 8.1 in [18].
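
As a sanity check of the first-order expansion (2.5), the following sketch compares it with a direct numerical quadrature for one concrete pair of functions. The choices φ(z) = cosh(z) − 1 and f(z) = 1/(1 + z²) on (−3, 3), with minimum at z₀ = 0, are assumptions made only for illustration, as are the function name and the quadrature settings. For this pair, ∂²φ(0) = 1, ∂³φ(0) = 0, ∂⁴φ(0) = 1, f(0) = 1, ∂f(0) = 0 and ∂²f(0) = −2.

```python
import math


def laplace_check(eps, a=-3.0, b=3.0, n=20000):
    """Compare the integral (2.1) with the leading and first-order terms of (2.5).

    Test functions (hypothetical): phi(z) = cosh(z) - 1, f(z) = 1 / (1 + z^2), z0 = 0.
    Returns (numerical value, leading-order value, first-order value).
    """
    f = lambda z: 1.0 / (1.0 + z * z)
    phi = lambda z: math.cosh(z) - 1.0

    # composite trapezoidal rule; the integrand is negligible at the endpoints
    h = (b - a) / n
    total = 0.5 * (f(a) * math.exp(-phi(a) / eps) + f(b) * math.exp(-phi(b) / eps))
    for i in range(1, n):
        z = a + i * h
        total += f(z) * math.exp(-phi(z) / eps)
    numerical = h * total

    # derivatives at z0 = 0 for the chosen test functions
    phi0, d2phi, d3phi, d4phi = 0.0, 1.0, 0.0, 1.0
    f0, d1f, d2f = 1.0, 0.0, -2.0

    prefactor = math.exp(-phi0 / eps) * math.sqrt(2.0 * math.pi * eps / d2phi)
    correction = (d2f / (2.0 * d2phi)
                  + 5.0 * d3phi**2 * f0 / (24.0 * d2phi**3)
                  - d4phi * f0 / (8.0 * d2phi**2)
                  - d3phi * d1f / (2.0 * d2phi**2))
    leading = prefactor * f0
    first_order = prefactor * (f0 + eps * correction)
    return numerical, leading, first_order


numerical, leading, first_order = laplace_check(0.01)
print(leading / numerical - 1.0, first_order / numerical - 1.0)
```

With ε = 0.01, the relative error of the leading term should be of order ε, while the first-order term should reduce it to order ε².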
Let us next assume that weaker differentiability restrictions than those listed above
are imposed on the functions f and φ:
• The function $\phi$ is twice continuously differentiable in a neighborhood of $z_0$, and

$$\phi(z) = \phi(z_0) + \frac{\partial^2\phi(z_0)}{2}(z - z_0)^2 + O\bigl((z - z_0)^3\bigr) \tag{2.6}$$

as $z \to z_0$.
• The formula in (2.6) can be differentiated. More exactly, the condition

$$\partial\phi(z) = \partial^2\phi(z_0)(z - z_0) + O\bigl((z - z_0)^2\bigr) \quad \text{as } z \to z_0 \tag{2.7}$$

holds.
• The function $f$ is such that

$$f(z) = f(z_0) + O(z - z_0) \quad \text{as } z \to z_0. \tag{2.8}$$

Then, as $\varepsilon \to 0$,

$$\int_a^b f(z) \exp\Bigl(-\frac{\phi(z)}{\varepsilon}\Bigr)\,dz = \exp\Bigl(-\frac{\phi(z_0)}{\varepsilon}\Bigr) \sqrt{\frac{2\pi\varepsilon}{\partial^2\phi(z_0)}}\,\bigl[f(z_0) + O(\varepsilon)\bigr]. \tag{2.9}$$

Remark 2.1 Using the Taylor formula, we see that (2.2), (2.3), and (2.4) hold pro-
vided that the function f is three times continuously differentiable and the function φ
is five times continuously differentiable near z 0 . Similarly, (2.6), (2.7), and (2.8) hold
if f is continuously differentiable and φ is three times continuously differentiable
near z 0 .
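Let $p = \{p_\varepsilon\}_{\varepsilon > 0}$ be a family of probability measures on $\mathbb{R}$. The following assumption
is modeled on the behavior of the family of moment generating functions of the
affine process and on the homogenization ideas (see Sect. 4 for more details):

$$\int_{\mathbb{R}} \exp\Bigl(-\frac{u}{\varepsilon} z\Bigr)\, p_\varepsilon(dz) = \exp\Bigl(\frac{\Lambda^{(0)}(u)}{\varepsilon}\Bigr) \exp\bigl(\Lambda^{(1)}(u)\bigr) \bigl[1 + \varepsilon\Lambda^{(2)}(u) + O(\varepsilon^2)\bigr] \tag{2.10}$$

as $\varepsilon \to 0$, where $\Lambda^{(i)}$, $0 \le i \le 2$, are continuous functions on the domain $I$. The
big O estimate in (2.10) is uniform on all closed intervals contained in $I$.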
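It is not hard to see that the functions $\Lambda^{(i)}$, $0 \le i \le 2$, in (2.10) can be recovered
from the following formulas:

$$\Lambda^{(0)}(u) = \lim_{\varepsilon \to 0} \varepsilon \log \int_{\mathbb{R}} \exp\Bigl(-\frac{u}{\varepsilon} z\Bigr)\, p_\varepsilon(dz), \tag{2.11}$$

$$\exp\bigl(\Lambda^{(1)}(u)\bigr) = \lim_{\varepsilon \to 0} \exp\Bigl(-\frac{\Lambda^{(0)}(u)}{\varepsilon}\Bigr) \int_{\mathbb{R}} \exp\Bigl(-\frac{u}{\varepsilon} z\Bigr)\, p_\varepsilon(dz), \tag{2.12}$$

and

$$\exp\bigl(\Lambda^{(1)}(u)\bigr)\,\Lambda^{(2)}(u) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\Bigl[\exp\Bigl(-\frac{\Lambda^{(0)}(u)}{\varepsilon}\Bigr) \int_{\mathbb{R}} \exp\Bigl(-\frac{u}{\varepsilon} z\Bigr)\, p_\varepsilon(dz) - \exp\bigl(\Lambda^{(1)}(u)\bigr)\Bigr]. \tag{2.13}$$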

It will be assumed throughout the rest of the paper that the conditions in the
Gärtner-Ellis theorem hold. More precisely, we suppose that the following are true:
• The function $\Lambda^{(0)}$ defined in (2.11) exists as an extended real number for all $u \in \mathbb{R}$.
We denote by $I$ the maximum open interval such that the number $\Lambda^{(0)}(u)$ is finite
for all $u \in I$.
• The point $u = 0$ belongs to the interval $I$.
• The function $\Lambda^{(0)}$ is continuously differentiable on $I$, the derivative $\partial_u\Lambda^{(0)}$ is a
strictly increasing function on $I$, and the range of the function $\partial_u\Lambda^{(0)}$ is $\mathbb{R}$.
The previous restrictions concern only the function $\Lambda^{(0)}$. By the Gärtner-Ellis
theorem, they imply the validity of the large deviation principle for the family $p$.
More information on the Gärtner-Ellis theorem can be found in [4]. The existence
of the functions $\Lambda^{(1)}$ and $\Lambda^{(2)}$ (these functions are determined from (2.12) and
(2.13), respectively) signals that certain refinements of large deviation results may
be possible.
Remark 2.2 In the paper [16] of Jacquier and Roome, an assumption similar to that
in (2.10) is imposed on the rescaled cumulant generating function (see (2.1) in [16]).
Moreover, there are more similarities between the assumptions in the present section
and those in Sect. 2 of [16]. Note that the main results obtained in [16] concern the
asymptotic behavior of forward start options and forward smiles.
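The function $\Lambda^{(0)}$ is strictly convex on $I$. Let us define an appropriate Legendre-Fenchel transform of $\Lambda^{(0)}$; more precisely, we put
$$\left[\Lambda^{(0)}\right]^*(z) = -\inf_{u\in I}\left(uz + \Lambda^{(0)}(u)\right), \quad z \in \mathbb{R}.$$
It is clear that there exists a unique minimizer $z \mapsto u^*(z)$ in the problem described above, satisfying the condition
$$\partial_u\Lambda^{(0)}(u^*(z)) = -z. \tag{2.14}$$
It follows that
$$\left[\Lambda^{(0)}\right]^*(z) = -zu^*(z) - \Lambda^{(0)}(u^*(z)). \tag{2.15}$$
Since $\Lambda^{(0)}(0) = 0$, we have $\left[\Lambda^{(0)}\right]^*(z) \ge 0$. It is well-known that the function $\left[\Lambda^{(0)}\right]^*$ is strictly convex on $\mathbb{R}$. The previous statements, (2.14), and (2.15) imply that $\left[\Lambda^{(0)}\right]^*(z) = 0$ if $z = -\partial_u\Lambda^{(0)}(0)$, and $\left[\Lambda^{(0)}\right]^*(z) > 0$ if $z \ne -\partial_u\Lambda^{(0)}(0)$.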
292 A. Gulisashvili and J. Teichmann

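Next, set
$$d(z) = \sqrt{2\left[\Lambda^{(0)}\right]^*(z)}. \tag{2.16}$$
It is clear that
$$\frac{d^2(z)}{2} = \left[\Lambda^{(0)}\right]^*(z). \tag{2.17}$$
Therefore,
$$d(z) = \sqrt{-2\left[zu^*(z) + \Lambda^{(0)}(u^*(z))\right]}. \tag{2.18}$$
By the strict convexity of the function $\Lambda^{(0)}$,
$$\inf_{z\in\mathbb{R}}\left(uz + \frac{d^2(z)}{2}\right) = -\Lambda^{(0)}(u), \quad u \in I.$$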
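The objects $u^*(z)$, $\left[\Lambda^{(0)}\right]^*(z)$ and $d(z)$ can be computed numerically from (2.14)-(2.16) once $\Lambda^{(0)}$ and its derivative are available. The sketch below is one possible way to do this, using bisection; the function names and the toy choice $\Lambda^{(0)}(u) = s^2u^2/2 + mu$ (for which $\left[\Lambda^{(0)}\right]^*(z) = (z+m)^2/(2s^2)$) are assumptions made for illustration only.

```python
import math

def u_star(z, dlam0, lo, hi, tol=1e-13):
    """Solve dlam0(u) = -z, cf. (2.14), by bisection on [lo, hi].

    Assumes dlam0 is strictly increasing and that the root lies in [lo, hi].
    """
    g = lambda u: dlam0(u) + z
    if g(lo) > 0.0 or g(hi) < 0.0:
        raise ValueError("root of (2.14) is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def legendre(z, lam0, dlam0, lo=-100.0, hi=100.0):
    """The transform in (2.15): -z*u*(z) - lam0(u*(z))."""
    u = u_star(z, dlam0, lo, hi)
    return -z * u - lam0(u)

def d(z, lam0, dlam0, lo=-100.0, hi=100.0):
    """The function d from (2.16); max(..., 0) guards against rounding below zero."""
    return math.sqrt(max(2.0 * legendre(z, lam0, dlam0, lo, hi), 0.0))

# Toy example (hypothetical): lam0(u) = s^2 u^2 / 2 + m u.
S_TOY, M_TOY = 0.3, 0.1
toy_lam0 = lambda u: 0.5 * S_TOY**2 * u**2 + M_TOY * u
toy_dlam0 = lambda u: S_TOY**2 * u + M_TOY

if __name__ == "__main__":
    for z in (-0.1, 0.2, 0.5):
        print(z, legendre(z, toy_lam0, toy_dlam0), (z + M_TOY)**2 / (2 * S_TOY**2))
```

In the toy example the transform vanishes at $z = -m = -\partial_u\Lambda^{(0)}(0)$, in line with the statement following (2.15).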

Let $p$ be a family of Borel probability measures satisfying condition (2.10). Our next goal is to find a special family of functions $f = \{f_\varepsilon\}_{\varepsilon>0}$ on $\mathbb{R}$, for which the asymptotic behavior of rescaled moment generating functions resembles the behavior described in formula (2.10). It would be tempting to try to find an appropriate family $f$ among the families of functions satisfying the following condition as $\varepsilon \to 0$:
$$\int_{\mathbb{R}} \exp\left\{-\frac{u}{\varepsilon}z\right\} f_\varepsilon(z)\,dz = \exp\left\{\frac{\Lambda^{(0)}(u)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u)\right\} \times \left(1 + \varepsilon\Lambda^{(2)}(u) + O(\varepsilon^2)\right) \tag{2.19}$$
uniformly on compact subintervals of $I$, where the functions $\Lambda^{(k)}$, $0 \le k \le 2$, are the same as in (2.10). However, we cannot always guarantee the existence of the integral on the left-hand side of formula (2.19) due to the lack of control of the tail behavior of the function $f_\varepsilon$. The remedy here is to localize the condition in (2.19).
Definition 2.3 Let $p$ be a family of Borel probability measures such that (2.10) holds. We say that a family $f$ of continuous functions on $\mathbb{R}$ is Laplace principle equivalent up to order 1 to the family $p$ provided that the following conditions hold:
(i) For every $n \ge 1$ there exists a proper open subinterval $J_n \subset I$ of the interval $I$ such that as $\varepsilon \to 0$,
$$\int_{-n}^{n} \exp\left\{-\frac{u}{\varepsilon}z\right\} f_\varepsilon(z)\,dz = \exp\left\{\frac{\Lambda^{(0)}(u)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u)\right\}\left(1 + \varepsilon\Lambda^{(2)}(u) + O_{n,u}(\varepsilon^2)\right)$$
for all $u \in J_n$.
(ii) The sequence of intervals $J_n$, $n \ge 1$, is increasing and $\bigcup_{n=1}^{\infty} J_n = I$.

The next statement explains how to construct the family $f$. The ansatz, defining the structure of the function $f_\varepsilon$ in formula (2.20), is based on the classical theory of heat kernel expansions.
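Theorem 2.4 Let $p$ be a family of Borel probability measures on $\mathbb{R}$ satisfying (2.10), and suppose the conditions in the Gärtner-Ellis theorem hold. Suppose also that the function $\Lambda^{(0)}$ is five times continuously differentiable on $I$, the function $\Lambda^{(1)}$ is three times continuously differentiable on $I$, and the function $\Lambda^{(2)}$ is continuously differentiable on $I$. Define a family $f$ of functions as follows:
$$f_\varepsilon(z) = \frac{1}{\sqrt{2\pi\varepsilon}} \exp\left\{-\frac{d^2(z)}{2\varepsilon}\right\}\left(C_0(z) + \varepsilon C_1(z)\right), \quad \varepsilon > 0, \tag{2.20}$$
where $d$ is given by (2.18),
$$C_0(z) = \frac{\exp\left\{\Lambda^{(1)}(u^*(z))\right\}}{\sqrt{\partial_u^2\Lambda^{(0)}(u^*(z))}},$$
and
$$C_1(z) = C_0(z)\,\Lambda^{(2)}(u^*(z)) - \frac{\partial_z^2 C_0(z)\,\partial_u^2\Lambda^{(0)}(u^*(z))}{2} - \frac{5\,C_0(z)\left[\partial_u^3\Lambda^{(0)}(u^*(z))\right]^2}{24\left[\partial_u^2\Lambda^{(0)}(u^*(z))\right]^3}$$
$$\qquad + \frac{C_0(z)\left(3\left[\partial_u^3\Lambda^{(0)}(u^*(z))\right]^2 - \partial_u^2\Lambda^{(0)}(u^*(z))\,\partial_u^4\Lambda^{(0)}(u^*(z))\right)}{8\left[\partial_u^2\Lambda^{(0)}(u^*(z))\right]^3} + \frac{\partial_z C_0(z)\,\partial_u^3\Lambda^{(0)}(u^*(z))}{2\,\partial_u^2\Lambda^{(0)}(u^*(z))}.$$
Then the family $f$ is Laplace principle equivalent up to order 1 to the family $p$.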

Proof The differentiability restrictions on the functions $\Lambda^{(i)}$, $0 \le i \le 2$, in the formulation of Theorem 2.4 are imposed because otherwise the functions $C_0$ and $C_1$ are not defined. Note that the function $z \mapsto u^*(z)$ is three times continuously differentiable on the real line. The previous statement easily follows from (2.14).
The proof of Theorem 2.4 is based on the following construction, which uses Laplace's principle. For every $n \ge 1$, we have
$$\int_{-n}^{n} \exp\left\{-\frac{u}{\varepsilon}z\right\} f_\varepsilon(z)\,dz = \frac{1}{\sqrt{2\pi\varepsilon}} \int_{-n}^{n} \exp\left\{-\frac{1}{\varepsilon}\left(uz + \frac{d^2(z)}{2}\right)\right\}\left(C_0(z) + \varepsilon C_1(z)\right)dz. \tag{2.21}$$
Set
$$\varphi_u(z) = uz + \frac{d^2(z)}{2}. \tag{2.22}$$
Laplace's principle will be applied twice to the family of integrals appearing on the right-hand side of (2.21). The first time, formula (2.5) with $f = C_0$ and $\varphi = \varphi_u$ will be used, while the second time, formula (2.9) will be used with $f = C_1$ and $\varphi = \varphi_u$.

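The critical point $z^*(u)$ of the function $\varphi_u$ given by (2.22) is the solution to the equation $\partial_z\left[\Lambda^{(0)}\right]^*(z) = -u$. It is not hard to see that $z = z^*(u)$ if and only if $u = u^*(z)$. It follows from (2.14) that
$$z^*(u) = -\partial\Lambda^{(0)}(u), \quad u \in I. \tag{2.23}$$
The next formulas can be derived using (2.17), (2.15), (2.22), and (2.23). We have
$$\partial_z^2\varphi_u(z^*(u)) = \frac{1}{\partial^2\Lambda^{(0)}(u)}, \tag{2.24}$$
$$\partial_z^3\varphi_u(z^*(u)) = \frac{\partial^3\Lambda^{(0)}(u)}{\left[\partial^2\Lambda^{(0)}(u)\right]^3}, \tag{2.25}$$
and
$$\partial_z^4\varphi_u(z^*(u)) = \frac{3\left[\partial^3\Lambda^{(0)}(u)\right]^2 - \partial^2\Lambda^{(0)}(u)\,\partial^4\Lambda^{(0)}(u)}{\left[\partial^2\Lambda^{(0)}(u)\right]^5}. \tag{2.26}$$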

Let us define the intervals $J_n$ appearing in Definition 2.3 as follows:
$$J_n = \{u \in I : z^*(u) \in (-n, n)\}, \quad n \ge 1.$$
It is not hard to see that condition (ii) in Definition 2.3 is satisfied. Next, using (2.5) and (2.21), we obtain
$$\int_{-n}^{n} \exp\left\{-\frac{u}{\varepsilon}z\right\} f_\varepsilon(z)\,dz = \exp\left\{-\frac{\varphi_u(z^*)}{\varepsilon}\right\}\frac{1}{\sqrt{\partial_z^2\varphi_u(z^*)}}\Bigg[C_0(z^*) + \varepsilon\Bigg(C_1(z^*) + \frac{\partial_z^2 C_0(z^*)}{2\,\partial_z^2\varphi_u(z^*)} + \frac{5\left(\partial_z^3\varphi_u(z^*)\right)^2 C_0(z^*)}{24\left(\partial_z^2\varphi_u(z^*)\right)^3}$$
$$\qquad - \frac{\partial_z^4\varphi_u(z^*)\,C_0(z^*)}{8\left(\partial_z^2\varphi_u(z^*)\right)^2} - \frac{\partial_z^3\varphi_u(z^*)\,\partial_z C_0(z^*)}{2\left(\partial_z^2\varphi_u(z^*)\right)^2}\Bigg) + O_{n,u}(\varepsilon^2)\Bigg] \tag{2.27}$$
as $\varepsilon \to 0$, where $z^* = z^*(u)$. Note that the differentiability conditions in Theorem 2.4 allow us to use formulas (2.5) and (2.9) with the functions $f$ and $\varphi$ chosen above.
We will next compare the formulas in (2.10) and (2.27). Note that
$$\varphi_u(z^*(u)) = u z^*(u) - z^*(u)\,u^*(z^*(u)) - \Lambda^{(0)}(u^*(z^*(u))) = -\Lambda^{(0)}(u).$$
This shows that if we choose the function $d$ as in (2.16), then the first factors in formulas (2.10) and (2.27) coincide. Moreover, by (2.24), the functions $C_0$ and $C_1$ have to be chosen so that
$$C_0(z^*(u)) = \frac{\exp\left\{\Lambda^{(1)}(u)\right\}}{\sqrt{\partial_u^2\Lambda^{(0)}(u)}} \tag{2.28}$$
and
$$C_1(z^*) = C_0(z^*)\,\Lambda^{(2)}(u) - \frac{\partial_z^2 C_0(z^*)}{2\,\partial_z^2\varphi_u(z^*)} - \frac{5\left(\partial_z^3\varphi_u(z^*)\right)^2 C_0(z^*)}{24\left(\partial_z^2\varphi_u(z^*)\right)^3} + \frac{\partial_z^4\varphi_u(z^*)\,C_0(z^*)}{8\left(\partial_z^2\varphi_u(z^*)\right)^2} + \frac{\partial_z^3\varphi_u(z^*)\,\partial_z C_0(z^*)}{2\left(\partial_z^2\varphi_u(z^*)\right)^2}, \tag{2.29}$$
where $z^* = z^*(u)$.
The representations of the functions $C_0$ and $C_1$ given in Theorem 2.4 can be obtained by plugging $u = u^*(z)$ into (2.28) and (2.29), and simplifying the resulting formulas. Equalities (2.23)–(2.26) are taken into account in the simplifications.
This completes the proof of Theorem 2.4. □
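The ansatz (2.20) can be checked by hand in the simplest possible case. The sketch below is an illustration under assumptions not made in the text: it takes the centered Gaussian family $p_\varepsilon = N(0, s^2\varepsilon)$ with a hypothetical scale `s`, for which $\Lambda^{(0)}(u) = s^2u^2/2$, $\Lambda^{(1)} = \Lambda^{(2)} = 0$, $d(z) = |z|/s$, and the exact density shows that the leading coefficient equals $1/s = \left[\partial_u^2\Lambda^{(0)}\right]^{-1/2}$, with no first-order correction. The code integrates the localized Laplace transform from Definition 2.3(i) with the trapezoidal rule and divides by $\exp\{\Lambda^{(0)}(u)/\varepsilon\}$.

```python
import math

def laplace_ratio(u, eps, n, s=0.3, steps=20000):
    """Localized integral in Definition 2.3(i) divided by exp(Lambda0(u)/eps).

    Gaussian toy case: phi_u(z) = u*z + z^2/(2 s^2), C0 = 1/s, C1 = 0.
    The factor exp(-phi_u(z*)/eps) = exp(Lambda0(u)/eps) is removed analytically
    so that no overflow occurs for small eps.
    """
    z_star = -s**2 * u                               # critical point, cf. (2.23)
    phi_min = u * z_star + z_star**2 / (2 * s**2)    # equals -Lambda0(u)
    c0 = 1.0 / s
    h = 2.0 * n / steps
    total = 0.0
    for i in range(steps + 1):
        z = -n + i * h
        phi = u * z + z**2 / (2 * s**2)
        weight = 0.5 if i in (0, steps) else 1.0     # trapezoidal end weights
        total += weight * math.exp(-(phi - phi_min) / eps)
    return c0 * total * h / math.sqrt(2 * math.pi * eps)

if __name__ == "__main__":
    # z*(u) = -0.09 lies in (-1, 1): the ratio is close to 1.
    print(laplace_ratio(1.0, 0.01, n=1))
    # z*(u) = -1.8 lies outside (-1, 1): u is not in J_1 and the ratio collapses.
    print(laplace_ratio(20.0, 0.01, n=1))
```

The second call illustrates why the intervals $J_n = \{u : z^*(u) \in (-n, n)\}$ are needed: outside $J_n$ the truncated integral no longer captures the critical point.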
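Remark 2.5 We have already established that $\left[\Lambda^{(0)}\right]^*(y) \ge 0$ for all $y \in \mathbb{R}$. Since (2.14) and (2.15) hold, we have
$$\partial\left[\Lambda^{(0)}\right]^*(y) = -u^*(y)$$
for all $y \in \mathbb{R}$. Hence the infimum of the function $\left[\Lambda^{(0)}\right]^*$ on the real line is attained at the point $y$ such that $u^*(y) = 0$. By (2.23), this point is given by $y = z^*(0) = -\partial\Lambda^{(0)}(0)$. Moreover,
$$\inf_{y\in\mathbb{R}}\left[\Lambda^{(0)}\right]^*(y) = -\Lambda^{(0)}(0) = 0.$$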

Remark 2.6 A heuristic conclusion that can be reached using Theorem 2.4 is that the
family f is a small-time approximation to the family p in a certain very weak sense.
Finding such approximations is an important problem. We consider our results as
first modest steps in going beyond the celebrated Gärtner-Ellis theorem.

The next assertion provides a first order large deviation estimate in the Gärtner-Ellis theorem for families of measures satisfying condition (2.10). Higher order estimates can also be found, but we do not include them in the present paper. Let $A$ be a bounded Borel set. Denote by $\bar{A}$ the closure of the set $A$, and let
$$a^+ = \sup_{z\in A}\{z\} \quad\text{and}\quad a^- = \inf_{z\in A}\{z\}.$$
Then we have $a^+, a^- \in \bar{A}$.
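Theorem 2.7 Let $p$ be a family of Borel probability measures on $\mathbb{R}$ such that (2.10) holds. Suppose also that the function $\Lambda^{(0)}$ is twice continuously differentiable on $I$ and the conditions in the Gärtner-Ellis theorem hold (see the conditions listed after formula (2.13)). Suppose also that $A \subset \mathbb{R}$ is a bounded Borel set, and $x \in A$. Then the following are true:

(i) If $x \ge \partial\Lambda^{(0)}(0)$, then as $\varepsilon \to 0$,
$$p_\varepsilon(A) \le \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(a^+ - x)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u^*(x))\right\} \times \left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right). \tag{2.30}$$

(ii) If $x < \partial\Lambda^{(0)}(0)$, then as $\varepsilon \to 0$,
$$p_\varepsilon(A) \le \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - |u^*(x)|(x - a^-)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u^*(x))\right\} \times \left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right). \tag{2.31}$$

The big O estimates in (2.30) and (2.31) are uniform with respect to $x \in A$.

Remark 2.8 The conditions $x \ge \partial\Lambda^{(0)}(0)$ and $x < \partial\Lambda^{(0)}(0)$ are equivalent to $u^*(x) \ge 0$ and $u^*(x) < 0$, respectively.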

Theorem 2.9 Let $p$ be a family of Borel probability measures on $\mathbb{R}$ such that (2.10) holds. Suppose also that the function $\Lambda^{(0)}$ is twice continuously differentiable on $I$ and the conditions in the Gärtner-Ellis theorem hold (see the conditions listed after formula (2.13)). Suppose also that $A \subset \mathbb{R}$ is a bounded open set, and $x \in A$. Then the following are true:

(i) Let $x \ge \partial\Lambda^{(0)}(0)$. Then there exists a constant $\gamma_A > 0$ depending on the set $A$ such that as $\varepsilon \to 0$,
$$p_\varepsilon(A) \ge \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) + u^*(x)(x - a^-)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u^*(x))\right\} \times \left(1 - \exp\left\{-\frac{\gamma_A}{\varepsilon}\right\}\right)\left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right). \tag{2.32}$$

(ii) If $x < \partial\Lambda^{(0)}(0)$, then as $\varepsilon \to 0$,
$$p_\varepsilon(A) \ge \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) + |u^*(x)|(a^+ - x)}{\varepsilon}\right\}\exp\left\{\Lambda^{(1)}(u^*(x))\right\} \times \left(1 - \exp\left\{-\frac{\gamma_A}{\varepsilon}\right\}\right)\left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right). \tag{2.33}$$

The constant $\gamma_A$ in (2.33) is the same as in (2.32), and the big O estimates in (2.32) and (2.33) are uniform with respect to $x \in A$.

Remark 2.10 Note that applying the transformation $\limsup_{\varepsilon\to 0} \varepsilon \log p_\varepsilon(A)$ to the upper estimates in Theorem 2.7, we obtain the upper estimate in the large deviation principle for any bounded Borel set $A$. This gives a little more than the upper estimate in the Gärtner-Ellis theorem. However, we should not forget that formula (2.30) was derived under the restriction (2.10), which is stronger than the assumptions in the Gärtner-Ellis theorem.

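Proof of Theorem 2.7 We borrow some ideas from the proofs of Cramér's theorem and the Gärtner-Ellis theorem given in [4]. The proofs of the upper estimates in those theorems use Chebyshev's inequality. In our case, due to a special structure of the problem, we can provide a slightly more direct proof.
Suppose the conditions in Theorem 2.7 hold, and let $u \in I$ and $\varepsilon > 0$. Then we have
$$\int_A \exp\left\{-\frac{uz}{\varepsilon}\right\} p_\varepsilon(dz) \ge p_\varepsilon(A) \inf_{z\in A} \exp\left\{-\frac{uz}{\varepsilon}\right\}. \tag{2.34}$$
It follows from (2.34) that for every $u \in I$ there exists $\xi(u) \in \bar{A}$ such that
$$p_\varepsilon(A) \le \exp\left\{\frac{u\xi(u)}{\varepsilon}\right\}\int_A \exp\left\{-\frac{uz}{\varepsilon}\right\} p_\varepsilon(dz) = \exp\left\{-\frac{\Lambda^{(0)}(u)}{\varepsilon}\right\}\int_A \exp\left\{-\frac{u}{\varepsilon}z\right\} p_\varepsilon(dz) \times \exp\left\{\frac{\Lambda^{(0)}(u) + xu + u(\xi(u) - x)}{\varepsilon}\right\}.$$
Indeed, we can take $\xi(u) = a^+$ if $u \ge 0$ and $\xi(u) = a^-$ if $u < 0$.
Next, by plugging $u = u^*(x)$ into the previous equalities and taking into account condition (2.10), we get
$$p_\varepsilon(A) \le \exp\left\{-\frac{\Lambda^{(0)}(u^*(x))}{\varepsilon}\right\}\int_A \exp\left\{-\frac{u^*(x)}{\varepsilon}z\right\} p_\varepsilon(dz) \times \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(\xi(u^*(x)) - x)}{\varepsilon}\right\}$$
$$\le \exp\left\{\Lambda^{(1)}(u^*(x))\right\}\exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(\xi(u^*(x)) - x)}{\varepsilon}\right\} \times \left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right) \tag{2.35}$$
as $\varepsilon \to 0$. Now, it is not hard to see that (2.35) implies Theorem 2.7.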
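The Chebyshev-type step behind (2.34) and (2.35) can be illustrated numerically. The sketch below is a toy check under assumptions not taken from the text: $p_\varepsilon = N(0, s^2\varepsilon)$ with a hypothetical scale `S`, and $A = (a^-, a^+)$ with $0 < a^- < a^+$. As in the second inequality of (2.35), the integral over $A$ is replaced by the integral over the whole line, whose logarithm is $s^2u^2/(2\varepsilon)$ here; for this family $u^*(x) = -x/s^2$ by (2.14).

```python
import math

S = 0.3  # hypothetical scale: p_eps = N(0, S^2 * eps)

def prob_interval(a_minus, a_plus, eps):
    """p_eps((a_minus, a_plus)) for 0 < a_minus < a_plus, via erfc for tail accuracy."""
    scale = S * math.sqrt(2.0 * eps)
    return 0.5 * (math.erfc(a_minus / scale) - math.erfc(a_plus / scale))

def chebyshev_log_bound(u, a_minus, a_plus, eps):
    """log of exp(u*xi(u)/eps) * int_R exp(-u z/eps) p_eps(dz), xi(u) as in the proof."""
    xi = a_plus if u >= 0 else a_minus
    log_mgf = 0.5 * S**2 * u**2 / eps
    return u * xi / eps + log_mgf

def u_star(x):
    """Solution of (2.14) for Lambda0(u) = S^2 u^2 / 2."""
    return -x / S**2

if __name__ == "__main__":
    a_minus, a_plus, eps = 0.5, 1.0, 0.01
    log_p = math.log(prob_interval(a_minus, a_plus, eps))
    for x in (0.6, 0.75, 0.9):
        print(x, log_p, chebyshev_log_bound(u_star(x), a_minus, a_plus, eps))
```

In every row the exact log-probability stays below the bound; the bound is tightest when $x$ is close to the endpoint $a^-$ nearest to the minimizer of the rate function.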

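Proof of Theorem 2.9 The lower bounds given in Theorem 2.9 are more delicate. Here we start with the estimate
$$\int_A \exp\left\{-\frac{uz}{\varepsilon}\right\} p_\varepsilon(dz) \le p_\varepsilon(A) \sup_{z\in A} \exp\left\{-\frac{uz}{\varepsilon}\right\}$$
instead of the estimate in (2.34). This implies that
$$p_\varepsilon(A) \ge \exp\left\{\frac{u\eta(u)}{\varepsilon}\right\}\int_A \exp\left\{-\frac{uz}{\varepsilon}\right\} p_\varepsilon(dz) = \exp\left\{-\frac{\Lambda^{(0)}(u)}{\varepsilon}\right\}\int_A \exp\left\{-\frac{uz}{\varepsilon}\right\} p_\varepsilon(dz) \times \exp\left\{\frac{\Lambda^{(0)}(u) + xu + u(\eta(u) - x)}{\varepsilon}\right\}$$
for all $u \in I$, where $\eta(u) = a^-$ if $u \ge 0$ and $\eta(u) = a^+$ if $u < 0$. Therefore
$$p_\varepsilon(A) \ge \exp\left\{-\frac{\Lambda^{(0)}(u^*(x))}{\varepsilon}\right\}\int_A \exp\left\{-\frac{u^*(x)}{\varepsilon}z\right\} p_\varepsilon(dz) \times \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(\eta(u^*(x)) - x)}{\varepsilon}\right\}. \tag{2.36}$$
Our next goal is to use the change of measure method. Consider a new family $\tilde{p}$ of probability measures defined by
$$\tilde{p}_\varepsilon(dz) = \frac{\exp\left\{-\frac{u^*(x)z}{\varepsilon}\right\} p_\varepsilon(dz)}{\int_{\mathbb{R}} \exp\left\{-\frac{u^*(x)z}{\varepsilon}\right\} p_\varepsilon(dz)}, \quad \varepsilon > 0.$$
Note that the family $\tilde{p}$ depends on $x$. Then inequality (2.36) and condition (2.10) imply that
$$p_\varepsilon(A) \ge \exp\left\{-\frac{\Lambda^{(0)}(u^*(x))}{\varepsilon}\right\}\int_{\mathbb{R}} \exp\left\{-\frac{u^*(x)}{\varepsilon}z\right\} p_\varepsilon(dz)\;\tilde{p}_\varepsilon(A) \times \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(\eta(u^*(x)) - x)}{\varepsilon}\right\}$$
$$= \exp\left\{\Lambda^{(1)}(u^*(x))\right\}\left(1 + \varepsilon\Lambda^{(2)}(u^*(x)) + O(\varepsilon^2)\right)\tilde{p}_\varepsilon(A) \times \exp\left\{-\frac{\left[\Lambda^{(0)}\right]^*(x) - u^*(x)(\eta(u^*(x)) - x)}{\varepsilon}\right\} \tag{2.37}$$
as $\varepsilon \to 0$.
We will next estimate the quantity
$$\tilde{p}_\varepsilon(A) = 1 - \tilde{p}_\varepsilon(A^c) \tag{2.38}$$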

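from below. This will be done using the upper estimate in the Gärtner-Ellis theorem. Let us denote by $\tilde{\Lambda}^{(0)}$ the function defined by (2.11) for the family $\tilde{p}$ instead of the family $p$. Then it is not hard to see that
$$\tilde{\Lambda}^{(0)}(v) = \Lambda^{(0)}(v + u^*(x)) - \Lambda^{(0)}(u^*(x)), \quad v \in \tilde{I}, \tag{2.39}$$
where $\tilde{I} = I - u^*(x)$. The function $\tilde{\Lambda}^{(0)}$ and the interval $\tilde{I}$ depend on $x$. It is clear that $0 \in \tilde{I}$. Moreover,
$$\left[\tilde{\Lambda}^{(0)}\right]^*(y) = -\inf_{v\in\tilde{I}}\left(yv + \tilde{\Lambda}^{(0)}(v)\right) \ge 0.$$
Next, taking into account that $A^c$ is a closed set, and using the upper large deviations estimate in the Gärtner-Ellis theorem (see Theorem 2.3.6 in [4]), we obtain
$$\limsup_{\varepsilon\to 0} \varepsilon \log \tilde{p}_\varepsilon(A^c) \le -\inf_{y\in A^c}\left[\tilde{\Lambda}^{(0)}\right]^*(y).$$
Set $\delta_A = \inf_{y\in A^c}\left[\tilde{\Lambda}^{(0)}\right]^*(y)$. Using Remark 2.5 and (2.39), we see that the unique infimum of the function $\left[\tilde{\Lambda}^{(0)}\right]^*$ on the real line is attained at the point
$$y = -\partial\tilde{\Lambda}^{(0)}(0) = -\partial\Lambda^{(0)}(u^*(x)) = x,$$
and is equal to zero. Since $x \notin A^c$, and the set $A^c$ is closed, we have $\delta_A > 0$. Therefore, for every $\tau > 0$, there exists $\varepsilon_\tau > 0$ such that
$$\tilde{p}_\varepsilon(A^c) \le \exp\left\{\frac{-\delta_A + \tau}{\varepsilon}\right\}, \quad 0 < \varepsilon < \varepsilon_\tau. \tag{2.40}$$
Fix any number $\tau$ with $0 < \tau < \delta_A$, and set $\gamma_A = \delta_A - \tau$. Then (2.38) and (2.40) imply the following estimate:
$$\tilde{p}_\varepsilon(A) \ge 1 - \exp\left\{-\frac{\gamma_A}{\varepsilon}\right\}, \quad 0 < \varepsilon < \varepsilon_\tau. \tag{2.41}$$
Finally, using (2.37) and (2.41), we establish estimate (2.32).

The proof of Theorem 2.9 is thus completed.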

3 Affine Processes

Let $D$ be a non-empty Borel subset of the real Euclidean space $\mathbb{R}^d$, equipped with the Borel σ-algebra $\mathcal{D}$, and assume that the affine hull of $D$ is the full space $\mathbb{R}^d$. To $D$ we add a point $\delta$ that serves as a 'cemetery state'. Define
$$\widehat{D} = D \cup \{\delta\}, \qquad \widehat{\mathcal{D}} = \sigma(\mathcal{D}, \{\delta\}),$$
and equip $\widehat{D}$ with the Alexandrov topology, in which any open set with a compact complement in $D$ is declared an open neighborhood of $\delta$.¹ Any continuous function $f$ defined on $D$ is extended to $\widehat{D}$ by setting $f(\delta) = 0$.
Let $(\Omega, \mathcal{F}, \mathbb{F})$ be a filtered measurable space, on which a family $(\mathbb{P}^x)_{x\in\widehat{D}}$ of probability measures is defined, and assume that $\mathcal{F}$ is $\mathbb{P}^x$-complete for all $x \in \widehat{D}$ and that the filtration $\mathbb{F}$ is right continuous. Finally, let $X$ be a càdlàg process taking values in $\widehat{D}$, whose transition kernel
$$p_t(x, A) = \mathbb{P}^x(X_t \in A), \quad (t \ge 0,\ x \in \widehat{D},\ A \in \widehat{\mathcal{D}})$$
is a normal time-homogeneous Markov kernel, for which $\delta$ is absorbing. That is, $p_t(x, \cdot)$ satisfies the following conditions:

(a) $x \mapsto p_t(x, A)$ is $\widehat{\mathcal{D}}$-measurable for each $(t, A) \in \mathbb{R}_{\ge 0} \times \widehat{\mathcal{D}}$,
(b) $p_0(x, \{x\}) = 1$ for all $x \in \widehat{D}$,
(c) $p_t(\delta, \{\delta\}) = 1$ for all $t \ge 0$,
(d) $p_t(x, \widehat{D}) = 1$ for all $(t, x) \in \mathbb{R}_{\ge 0} \times \widehat{D}$, and
(e) the Chapman-Kolmogorov equation
$$p_{t+s}(x, d\xi) = \int_{\widehat{D}} p_t(y, d\xi)\, p_s(x, dy)$$
holds for each $t, s \ge 0$ and $(x, d\xi) \in \widehat{D} \times \widehat{\mathcal{D}}$.

¹ Note that the topology of $\widehat{D}$ enters our assumptions in a subtle way: We require later that $X$ is càdlàg on $\widehat{D}$, which is a property for which the topology matters.
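We equip $\mathbb{R}^d$ with the canonical inner product $\langle\cdot,\cdot\rangle$, and associate to $D$ the set $\mathcal{U} \subseteq \mathbb{C}^d$ defined by
$$\mathcal{U} = \left\{u \in \mathbb{C}^d : \sup_{x\in D} \operatorname{Re}\langle u, x\rangle < \infty\right\}.$$
Note that the set $\mathcal{U}$ is the set of complex vectors $u$ such that the exponential function $x \mapsto e^{\langle u, x\rangle}$ is bounded on $D$. It is easy to see that $\mathcal{U}$ is a convex cone and always contains the set of purely imaginary vectors $i\mathbb{R}^d$.
Definition 3.1 (Affine processes) A stochastic process $X$ is called affine with state space $D$, if the transition kernel $p_t(x, d\xi)$ of $X$ satisfies the following conditions:
(i) It is stochastically continuous, i.e., $\lim_{s\to t} p_s(x, \cdot) = p_t(x, \cdot)$ weakly for all $t \ge 0$, $x \in D$.
(ii) The Fourier-Laplace transform of the kernel depends on the initial state in the following way: there exist functions $\Phi : \mathbb{R}_{\ge 0} \times \mathcal{U} \to \mathbb{C}$ and $\psi : \mathbb{R}_{\ge 0} \times \mathcal{U} \to \mathbb{C}^d$, such that
$$\int_D e^{\langle\xi, u\rangle} p_t(x, d\xi) = \Phi(t, u)\exp\left(\langle x, \psi(t, u)\rangle\right) \tag{3.1}$$
for all $t \in \mathbb{R}_{\ge 0}$, $x \in D$, and $u \in \mathcal{U}$.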


Remark 3.2 Note that the previous definition does not specify $\psi(t, u)$ in a unique way. However, there is a natural unique choice for $\psi$ that will be discussed in Proposition 3.3. Also note that as long as $\Phi(t, u)$ is non-zero, there exists $\phi(t, u)$ such that $\Phi(t, u) = e^{\phi(t,u)}$, and equality (3.1) becomes
$$\int_D e^{\langle\xi, u\rangle} p_t(x, d\xi) = \exp\left\{\phi(t, u) + \langle x, \psi(t, u)\rangle\right\}. \tag{3.2}$$
This is essentially the definition that was used in [5]. Condition (3.2) means that the Fourier-Laplace transform of the transition function is the exponential of an affine function of $x$. This fact is usually interpreted as the reason for the name 'affine process', even though affine functions also appear in other aspects of affine processes, e.g., in the coefficients of the infinitesimal generator, or in the differentiated semimartingale characteristics. We prefer to use equality (3.1) instead of equality (3.2), since the former equality leads to a slightly more general definition that avoids the necessity of the a priori assumption that the left-hand side of (3.1) is non-zero for all $t$ and $u$.
Before we start exploring the first simple consequences of Definition 3.1, additional notation will be introduced. For any $u \in \mathcal{U}$, set $\sigma(u) := \inf\{t \ge 0 : \Phi(t, u) = 0\}$ and $\mathcal{Q} := \{(t, u) \in \mathbb{R}_{\ge 0} \times \mathcal{U} : t < \sigma(u)\}$, and let $\phi$ be a function on $\mathcal{Q}$ such that
$$\Phi(t, u) = e^{\phi(t,u)} \quad\text{for all } (t, u) \in \mathcal{Q}.$$
The uniqueness of $\phi$ will be discussed below. The functions $\phi$ and $\psi$ have the following properties (see [17]):

Proposition 3.3 Let X be an affine process on D. Then


(i) The condition σ(u) > 0 holds for any u ∈ U.
(ii) The functions φ and ψ are uniquely defined on Q under the restriction that they
are jointly continuous and satisfy φ(0, 0) = ψ(0, 0) = 0.
(iii) The function ψ maps Q into U.
(iv) The functions φ and ψ satisfy the semi-flow property. For any u ∈ U and
t, s ≥ 0 with t + s ≤ σ(u), the following conditions hold:

φ(t + s, u) = φ(t, u) + φ(s, ψ(t, u)), φ(0, u) = 0


ψ(t + s, u) = ψ(t, ψ(s, u)), ψ(0, u) = u

Remark 3.4 In the sequel, the functions φ and ψ will always be chosen according to
Proposition 3.3.

We now introduce the important notion of regularity.

Definition 3.5 An affine process $X$ is called regular if the derivatives
$$F(u) = \left.\frac{\partial\phi(t, u)}{\partial t}\right|_{t=0+}, \qquad R(u) = \left.\frac{\partial\psi(t, u)}{\partial t}\right|_{t=0+}$$
exist for all $u \in \mathcal{U}$ and are continuous at $u = 0$.

The next statement illustrates why regularity is a crucial property. This statement was originally established in [5] for affine processes on the state space $\mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$.

Proposition 3.6 Let $X$ be a regular affine process. Then there exist $\mathbb{R}^d$-vectors $b, \beta^1, \ldots, \beta^d$; $d \times d$-matrices $a, \alpha^1, \ldots, \alpha^d$; real numbers $c, \gamma^1, \ldots, \gamma^d$, and signed Borel measures $m, \mu^1, \ldots, \mu^d$ on $\mathbb{R}^d \setminus \{0\}$ such that the functions $F(u)$ and $R(u)$ can be represented as follows:
$$F(u) = \frac{1}{2}\langle u, au\rangle + \langle b, u\rangle - c + \int_{\mathbb{R}^d\setminus\{0\}}\left(e^{\langle\xi, u\rangle} - 1 - \langle h(\xi), u\rangle\right)m(d\xi), \tag{3.3a}$$
$$R_i(u) = \frac{1}{2}\langle u, \alpha^i u\rangle + \langle\beta^i, u\rangle - \gamma^i + \int_{\mathbb{R}^d\setminus\{0\}}\left(e^{\langle\xi, u\rangle} - 1 - \langle h(\xi), u\rangle\right)\mu^i(d\xi). \tag{3.3b}$$
In the previous formulas, $h(x) = x\mathbf{1}_{\{\|x\|\le 1\}}$ is a truncation function. In addition, for all $x \in D$, the quantities
$$A(x) = a + x_1\alpha^1 + \cdots + x_d\alpha^d, \tag{3.4a}$$
$$B(x) = b + x_1\beta^1 + \cdots + x_d\beta^d, \tag{3.4b}$$
$$C(x) = c + x_1\gamma^1 + \cdots + x_d\gamma^d, \tag{3.4c}$$
$$\nu(x, d\xi) = m(d\xi) + x_1\mu^1(d\xi) + \cdots + x_d\mu^d(d\xi) \tag{3.4d}$$
have the following properties: $A(x)$ is positive semidefinite, $C(x) \le 0$, and
$$\int_{\mathbb{R}^d\setminus\{0\}}\left(\|\xi\|^2 \wedge 1\right)\nu(x, d\xi) < \infty.$$
Moreover, for $u \in \mathcal{U}$ and $t \in [0, \sigma(u))$, the functions $\phi$ and $\psi$ satisfy the following ordinary differential equations:
$$\frac{\partial}{\partial t}\phi(t, u) = F(\psi(t, u)), \quad \phi(0, u) = 0, \tag{3.5a}$$
$$\frac{\partial}{\partial t}\psi(t, u) = R(\psi(t, u)), \quad \psi(0, u) = u. \tag{3.5b}$$
Remark 3.7 The Eqs. (3.5) are called generalized Riccati equations, since they are classical Riccati equations when $m(d\xi) = \mu^i(d\xi) = 0$. Moreover, Eqs. (3.3) and (3.4) imply that $u \mapsto F(u) + \langle R(u), x\rangle$ is a function of Lévy-Khintchine form for each $x \in D$.
Proof See [17]. □
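The Riccati equations (3.5) and the semi-flow property in Proposition 3.3 can be checked on a one-dimensional example with closed-form solutions. The sketch below assumes the square-root (CIR) diffusion $dV_t = (a - bV_t)\,dt + \sigma\sqrt{V_t}\,dW_t$ on $D = \mathbb{R}_{\ge 0}$, for which $F(u) = au$ and $R(u) = \tfrac{1}{2}\sigma^2u^2 - bu$; the parameter values are hypothetical, and the argument $u$ is taken real and non-positive so that it lies in $\mathcal{U}$.

```python
import math

# Hypothetical CIR parameters: dV = (A - B*V) dt + SIGMA*sqrt(V) dW.
A, B, SIGMA = 0.3, 1.5, 0.4

def F(u):
    return A * u

def R(u):
    return 0.5 * SIGMA**2 * u**2 - B * u

def _h(t, u):
    # Common denominator of the closed-form solutions; >= 1 for u <= 0.
    return 1.0 - (SIGMA**2 * u / (2.0 * B)) * (1.0 - math.exp(-B * t))

def psi(t, u):
    """Closed-form solution of (3.5b) for the quadratic R above."""
    return u * math.exp(-B * t) / _h(t, u)

def phi(t, u):
    """Closed-form solution of (3.5a): integral of F(psi) from 0 to t."""
    return -(2.0 * A / SIGMA**2) * math.log(_h(t, u))

if __name__ == "__main__":
    u, t, s = -2.0, 0.7, 0.4
    # Semi-flow property, Proposition 3.3 (iv).
    print(psi(t + s, u), psi(t, psi(s, u)))
    print(phi(t + s, u), phi(t, u) + phi(s, psi(t, u)))
    # Initial slopes reproduce R and F (Definition 3.5).
    h = 1e-6
    print((psi(h, u) - u) / h, R(u))
    print(phi(h, u) / h, F(u))
```

Each printed pair should agree, the last two up to the finite-difference error of order `h`.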
In general, the parameters $(a, \alpha^i, b, \beta^i, c, \gamma^i, m, \mu^i)_{i\in\{1,\ldots,d\}}$ appearing in the representations of $F$ and $R$ in (3.3a) and (3.3b) have to satisfy additional conditions, called the admissibility conditions. These conditions guarantee the existence of an affine Markov process $X$ with state space $D$ and with prescribed $F$ and $R$. It is clear that such conditions should depend strongly on the geometry of the (boundary of the) state space $D$. Finding such (necessary and sufficient) conditions on the parameters for different types of state spaces has been the focus of several publications. For $D = \mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$, the admissibility conditions were derived in [5]. For the cone of semi-definite matrices $D = S_d^+$, such conditions were found in [2], and for symmetric irreducible cones, the admissibility conditions were found in [3]. Finally, for affine diffusions ($m = \mu^i = 0$) on polyhedral cones and on quadratic state spaces, the admissibility conditions were given in [20].

Definition 3.8 We call the state space $D = \mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$ with $m, n \ge 0$ the canonical state space.
Affine processes on canonical state spaces are completely characterized in [5] in terms of the admissibility conditions imposed on $F$ and $R$. Affine processes on canonical state spaces have continuous trajectories (such processes are called continuous affine processes) if and only if the functions $F$ and $R$ satisfy the admissibility conditions and are polynomials of degree at most 2 (see Proposition 3.6).

4 Homogenization Procedure

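In this section, we consider continuous affine processes on the canonical state space $D = \mathbb{R}^m_{\ge 0} \times \mathbb{R}^n$. We will next introduce a natural homogenization procedure, which allows us to analyze the short-time asymptotics of the law of continuous affine processes. In the case of affine processes, the homogenization leads in fact to real analytic expansions with respect to the homogenization parameter.
The following lemmas introduce the homogenization procedure.
Lemma 4.1 Let $\psi : \mathcal{U} \times \mathbb{R}_{\ge 0} \to \mathcal{U}$ be the unique solution of the equation
$$\frac{\partial}{\partial t}\psi(u, t) = R\left(\psi(u, t)\right), \quad \psi(u, 0) = u \in \mathcal{U},$$
where $R : \mathcal{U} \to \mathbb{C}^d$ is a quadratic polynomial. Then, for every $\varepsilon > 0$, the function
$$\psi^\varepsilon(u, t) := \varepsilon\,\psi\left(\frac{u}{\varepsilon}, \varepsilon t\right)$$
solves the equation
$$\frac{\partial}{\partial t}\psi^\varepsilon(u, t) = R^\varepsilon\left(\psi^\varepsilon(u, t)\right), \quad \psi^\varepsilon(u, 0) = u$$
with $R^\varepsilon(u) := \varepsilon^2 R\left(\varepsilon^{-1}u\right)$ for $u \in \mathcal{U}$.
Analogously, let $\phi : \mathcal{U} \times \mathbb{R}_{\ge 0} \to \mathbb{C}$ be the unique solution of the equation
$$\frac{\partial}{\partial t}\phi(u, t) = F\left(\psi(u, t)\right), \quad \phi(u, 0) = 0.$$
Then, for every $\varepsilon > 0$, the function
$$\phi^\varepsilon(u, t) := \varepsilon\,\phi\left(\frac{u}{\varepsilon}, \varepsilon t\right)$$
solves the equation
$$\frac{\partial}{\partial t}\phi^\varepsilon(u, t) = F^\varepsilon\left(\psi^\varepsilon(u, t)\right), \quad \phi^\varepsilon(u, 0) = 0$$
with $F^\varepsilon(u) := \varepsilon^2 F\left(\varepsilon^{-1}u\right)$ for $u \in \mathcal{U}$.
The proof of Lemma 4.1 is simple, and we leave it as an exercise for the reader.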
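The scaling in Lemma 4.1 can be verified numerically on a scalar example. The sketch below is an illustration under assumptions not made in the text: it uses the quadratic vector field $R(u) = \tfrac{1}{2}\sigma^2u^2 - bu$ of a square-root diffusion with hypothetical parameters, whose flow is known in closed form, and checks by a central finite difference that $\psi^\varepsilon(u,t) = \varepsilon\psi(u/\varepsilon, \varepsilon t)$ solves the rescaled equation with $R^\varepsilon(u) = \varepsilon^2R(u/\varepsilon) = \tfrac{1}{2}\sigma^2u^2 - \varepsilon bu$. It also compares $\psi^\varepsilon$ with the flow $u/(1 - \sigma^2ut/2)$ of the limiting field $\tfrac{1}{2}\sigma^2u^2$.

```python
import math

B, SIGMA = 1.5, 0.4  # hypothetical parameters

def R(u):
    return 0.5 * SIGMA**2 * u**2 - B * u

def psi(u, t):
    """Closed-form flow of d/dt psi = R(psi), psi(u, 0) = u (argument order of Lemma 4.1)."""
    one_minus_exp = -math.expm1(-B * t)      # 1 - exp(-B t), accurate for small t
    return u * math.exp(-B * t) / (1.0 - SIGMA**2 * u * one_minus_exp / (2.0 * B))

def psi_eps(u, t, eps):
    """Homogenized flow: eps * psi(u/eps, eps*t)."""
    return eps * psi(u / eps, eps * t)

def R_eps(u, eps):
    """Rescaled vector field: eps^2 * R(u/eps)."""
    return eps**2 * R(u / eps)

def psi_0(u, t):
    """Flow of the limiting field R0(u) = SIGMA^2 u^2 / 2."""
    return u / (1.0 - 0.5 * SIGMA**2 * u * t)

if __name__ == "__main__":
    u, t, eps, h = -2.0, 1.0, 0.1, 1e-5
    lhs = (psi_eps(u, t + h, eps) - psi_eps(u, t - h, eps)) / (2 * h)
    print(lhs, R_eps(psi_eps(u, t, eps), eps))     # both sides of the rescaled ODE
    for e in (1e-1, 1e-2, 1e-3):
        print(e, psi_eps(u, t, e) - psi_0(u, t))   # difference shrinks like O(eps)
```

The linear decay of the last column is consistent with a power series expansion of $\psi^\varepsilon$ in $\varepsilon$ around the limiting flow.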
Lemma 4.2 Under the previous assumptions, the limit $\lim_{\varepsilon\to 0}\psi^\varepsilon = \psi^{(0)}$ exists uniformly on compact sets in $\mathcal{U} \times \mathbb{R}_{\ge 0}$. Furthermore,
$$\psi^\varepsilon(u, t) = \psi^{(0)}(u, t) + \varepsilon\psi^{(1)}(u, t) + \sum_{n\ge 2}\varepsilon^n\psi^{(n)}(u, t) \tag{4.1}$$
is a convergent power series expansion for small $\varepsilon > 0$. The coefficient functions in (4.1) satisfy certain ordinary differential equations; in particular,
$$\frac{\partial}{\partial t}\psi^{(0)}(u, t) = R^{(0)}\left(\psi^{(0)}(u, t)\right), \quad \psi^{(0)}(u, 0) = u,$$
and
$$\frac{\partial}{\partial t}\psi^{(1)}(u, t) = \left.\frac{\partial}{\partial\varepsilon}\right|_{\varepsilon=0} R^\varepsilon\left(\psi^{(0)}(u, t) + \varepsilon\psi^{(1)}(u, t)\right), \quad \psi^{(1)}(u, 0) = 0.$$
For $n \ge 2$, the equations for the coefficient functions involve higher order derivatives. In complete analogy, the limit $\lim_{\varepsilon\to 0}\phi^\varepsilon = \phi^{(0)}$ exists uniformly on compact sets in $\mathcal{U} \times \mathbb{R}_{\ge 0}$. Furthermore,
$$\phi^\varepsilon(u, t) = \phi^{(0)}(u, t) + \varepsilon\phi^{(1)}(u, t) + \sum_{n\ge 2}\varepsilon^n\phi^{(n)}(u, t)$$
for small enough values of $\varepsilon$.

Proof Observe that $R^\varepsilon = R^{(0)} + \varepsilon R^{(1)} + \frac{\varepsilon^2}{2}R^{(2)}$ and $F^\varepsilon = F^{(0)} + \varepsilon F^{(1)} + \frac{\varepsilon^2}{2}F^{(2)}$. Hence, the vector fields appearing in the equations in Lemma 4.2 are polynomial in $u$ and $\varepsilon$. Standard results on differential equations with polynomial vector fields yield the assertions in Lemma 4.2, in particular, the real analyticity of the solution with respect to $\varepsilon$. □

Let $X$ be an affine diffusion process with the corresponding functions $F$ and $R$. We can extend the solutions of the Riccati equations described above to maximal domains for $u \in \mathbb{R}^d$, i.e., consider maximal local flows on $\mathbb{R}^d$ with the vector fields $F^\varepsilon$ and $R^\varepsilon$. We denote by $\hat{\Lambda}^{(i)}$, $i \ge 0$, the functions appearing in the following power series expansion in $\varepsilon$:
$$\hat{\Lambda}^{(0)}(u) + \varepsilon\hat{\Lambda}^{(1)}(u) + \ldots := \phi^\varepsilon(-u, 1) + \langle x, \psi^\varepsilon(-u, 1)\rangle, \tag{4.2}$$
where $\phi^\varepsilon$ and $\psi^\varepsilon$ are the solutions of the Riccati equations appearing in the previous lemmas. Note that we suppress the dependence on the initial value $x$ on the left-hand side of (4.2). The functions $\hat{\Lambda}^{(i)}$ exist as extended real numbers for $u \in \mathbb{R}^d$.
Remark 4.3 If the expression on the right-hand side of (4.2) is finite, then the power series on the left-hand side converges absolutely for sufficiently small values of $\varepsilon$.

Remark 4.4 For continuous affine processes, the homogenization procedure leads to the following representation:
$$\mathbb{E}\left[\exp\left\{-\left\langle\frac{u}{\varepsilon}, X_\varepsilon\right\rangle\right\}\right] = \int_D \exp\left\{-\left\langle\frac{u}{\varepsilon}, z\right\rangle\right\} p_\varepsilon(dz) = \exp\left\{\frac{\hat{\Lambda}^{(0)}(u)}{\varepsilon} + \hat{\Lambda}^{(1)}(u) + \ldots\right\}, \tag{4.3}$$
where $u$ is such that the expressions on both sides of (4.3) are finite for small enough values of $\varepsilon$.

The representation in (4.3), valid for any continuous affine process, motivated us to introduce condition (2.10) used in the previous sections. However, the expansion in (4.3) is a little different from that in (2.10).

5 Example: The Heston Model

In this section, we find explicit formulas for the functions $\Lambda^{(i)}$, $0 \le i \le 2$, associated with the log-price process in the Heston model. Let us consider the following correlated Heston model:
$$dX_t = (r + kV_t)\,dt + \sqrt{V_t}\,dW_{1,t},$$
$$dV_t = (a - bV_t)\,dt + \sigma\sqrt{V_t}\,dW_{2,t}, \tag{5.1}$$
where $r, k \in \mathbb{R}$, $a, b \ge 0$, $\sigma > 0$, and $W_{1,t}$ and $W_{2,t}$ are standard Brownian motions with $d\langle W_1, W_2\rangle_t = \rho\,dt$. We assume that the correlation coefficient $\rho$ satisfies the condition $-1 < \rho < 1$. In (5.1), $X$ is the log-price process, and $V$ is the variance process. The initial conditions for the processes $X$ and $V$ are denoted by $x_0$ and $v_0$, respectively. The Heston model was introduced in [14]. Note that in the present paper we consider the Heston model in which both the log-price and the variance equations contain drift terms generated by affine functions. Very often, e.g., in [7–10, 16], a special Heston model where $k = -\frac{1}{2}$ and $r = 0$ is studied. An extended Heston model, in which the defining equations contain affine drift terms, is discussed in [15].
The process $X$ is not an affine process. It is a projection of the two-dimensional affine process $(X, V)$ onto the first coordinate. The moment generating function of $X_t$ is given by $M_t(u) = \mathbb{E}\left[\exp\{uX_t\}\right] = \exp\{C(u, t) + D(u, t)v_0 + ux_0\}$, where
$$C(u, t) = rut + \frac{a}{\sigma^2}\left[(b - \rho\sigma u + d(u))t - 2\log\frac{1 - g(u)e^{d(u)t}}{1 - g(u)}\right],$$
$$D(u, t) = \frac{b + d(u) - \rho\sigma u}{\sigma^2}\cdot\frac{1 - e^{d(u)t}}{1 - g(u)e^{d(u)t}},$$
$$g(u) = \frac{b - \rho\sigma u + d(u)}{b - \rho\sigma u - d(u)},$$
and
$$d(u) = \sqrt{(\rho\sigma u - b)^2 - \sigma^2(2ku + u^2)}$$
(see [1]). Here and in the sequel, the symbol $\sqrt{\cdot}$ stands for the principal square root function. We will explain below the meaning of the logarithmic function appearing in the expression for the function $C$ (see the discussion after formula (5.7)). Note that for $u = 0$, the expressions for the functions $C$ and $D$ should be understood in the limiting sense. More precisely,
$$C(0, t) = \lim_{u\to 0} C(u, t) = 0 \quad\text{and}\quad D(0, t) = \lim_{u\to 0} D(u, t) = 0$$
for all $t > 0$.


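It is clear that
$$\mathbb{E}\left[\exp\left\{-\frac{u}{t}X_t\right\}\right] = \exp\left\{C\left(-\frac{u}{t}, t\right) + D\left(-\frac{u}{t}, t\right)v_0 - \frac{u}{t}x_0\right\}.$$
Denote $\Lambda(u, t) = t\log\mathbb{E}\left[\exp\left\{-\frac{u}{t}X_t\right\}\right]$. Then
$$\Lambda(u, t) = tC\left(-\frac{u}{t}, t\right) + tD\left(-\frac{u}{t}, t\right)v_0 - ux_0. \tag{5.2}$$
Next, set $A(u) = b - \rho\sigma u$. It is not hard to see that
$$D(u, t) = \frac{1}{\sigma^2}(A(u) + d(u))\frac{1 - e^{d(u)t}}{1 - \frac{A(u)+d(u)}{A(u)-d(u)}e^{d(u)t}} = \frac{1}{\sigma^2}\left(A(u)^2 - d(u)^2\right)\frac{\sinh\frac{d(u)t}{2}}{d(u)\cosh\frac{d(u)t}{2} + A(u)\sinh\frac{d(u)t}{2}}.$$
Moreover,
$$C(u, t) = rut + \frac{a}{\sigma^2}\left[(A(u) + d(u))t - 2\log\frac{1 - \frac{A(u)+d(u)}{A(u)-d(u)}e^{d(u)t}}{1 - \frac{A(u)+d(u)}{A(u)-d(u)}}\right] = rut + \frac{a}{\sigma^2}\left[A(u)t - 2\log\frac{d(u)\cosh\frac{d(u)t}{2} + A(u)\sinh\frac{d(u)t}{2}}{d(u)}\right].$$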
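Using the previous formula, we obtain
$$C\left(-\frac{u}{t}, t\right) = -ru + \frac{a}{\sigma^2}\left[bt + \rho\sigma u - 2\log\frac{d\left(-\frac{u}{t}\right)t\cosh\frac{d(-\frac{u}{t})t}{2} + (bt + \rho\sigma u)\sinh\frac{d(-\frac{u}{t})t}{2}}{d\left(-\frac{u}{t}\right)t}\right]. \tag{5.3}$$
We also have
$$A\left(-\frac{u}{t}\right) = b + \rho\sigma\frac{u}{t},$$
$$A^2\left(-\frac{u}{t}\right) = b^2 + 2b\rho\sigma\frac{u}{t} + \rho^2\sigma^2\frac{u^2}{t^2},$$
$$d^2\left(-\frac{u}{t}\right) = -\frac{u^2(1 - \rho^2)\sigma^2}{t^2} + \frac{2\sigma u(k\sigma + b\rho)}{t} + b^2,$$
$$\frac{1}{\sigma^2}\left[A^2\left(-\frac{u}{t}\right) - d^2\left(-\frac{u}{t}\right)\right] = \frac{u^2}{t^2} - \frac{2ku}{t},$$
and
$$D\left(-\frac{u}{t}, t\right) = \left(\frac{u^2}{t^2} - \frac{2ku}{t}\right)\frac{\sinh\frac{d(-\frac{u}{t})t}{2}}{d\left(-\frac{u}{t}\right)\cosh\frac{d(-\frac{u}{t})t}{2} + A\left(-\frac{u}{t}\right)\sinh\frac{d(-\frac{u}{t})t}{2}}. \tag{5.4}$$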

Let us denote by $Z$ the set of real numbers $u$ such that the expressions on the right-hand sides of (5.3) and (5.4) are finite for all small enough values of $t$, and put
$$\hat{S}(u, t) = d\left(-\frac{u}{t}\right)\frac{t}{2}.$$
It is easy to see that
$$\hat{S}(u, t) = \frac{1}{2}\sqrt{-u^2(1 - \rho^2)\sigma^2 + 2tu(k\sigma^2 + b\rho\sigma) + t^2b^2}.$$
In the previous formula, $t$ is a real number. Therefore, for every real number $u \ne 0$, $\hat{S}(u, t)$ is purely imaginary for all numbers $t$ with $|t|$ small enough. For such $u$ and $t$, $\hat{S}(u, t) = iS(u, t)$, where
$$S(u, t) = \frac{1}{2}\sqrt{u^2(1 - \rho^2)\sigma^2 - 2tu(k\sigma^2 + b\rho\sigma) - t^2b^2} \tag{5.5}$$
is a real number. It follows that
$$tC\left(-\frac{u}{t}, t\right) = -tru + t\frac{a}{\sigma^2}\left[bt + \rho\sigma u - 2\log\frac{2S(u, t)\cos S(u, t) + (bt + \rho\sigma u)\sin S(u, t)}{2S(u, t)}\right] \tag{5.6}$$
and
$$tD\left(-\frac{u}{t}, t\right) = \left(u^2 - 2tku\right)\frac{\sin S(u, t)}{2S(u, t)\cos S(u, t) + (bt + \rho\sigma u)\sin S(u, t)}. \tag{5.7}$$

Our next goal is to introduce an additional condition under which the logarithmic function appearing in formula (5.6) exists, and the expressions on the right-hand sides of (5.6) and (5.7) are finite. Recall that we have assumed that $u \ne 0$ and $|t|$ is small enough. Set
$$\widetilde{S}(u) = \lim_{t\to 0}\left[2S(u, t)\cos S(u, t) + (bt + \rho\sigma u)\sin S(u, t)\right].$$
Then, we have
$$\lim_{t\to 0} S(u, t) = \frac{1}{2}|u|\sigma\sqrt{1 - \rho^2}$$
and
$$\widetilde{S}(u) = |u|\sigma\sqrt{1 - \rho^2}\cos\frac{|u|\sigma\sqrt{1 - \rho^2}}{2} + u\sigma\rho\sin\frac{|u|\sigma\sqrt{1 - \rho^2}}{2}.$$
Let $\rho \ne 0$, and assume that
$$-\frac{2}{\sigma\sqrt{1 - \rho^2}}\arctan\frac{\sqrt{1 - \rho^2}}{\rho} < u < \frac{2}{\sigma\sqrt{1 - \rho^2}}\left(\pi - \arctan\frac{\sqrt{1 - \rho^2}}{\rho}\right). \tag{5.8}$$
The restriction in (5.8) means that the variable $u$ is bounded from below by the largest negative root of the function
$$\overline{S}(u) = \sqrt{1 - \rho^2}\cos\frac{u\sigma\sqrt{1 - \rho^2}}{2} + \rho\sin\frac{u\sigma\sqrt{1 - \rho^2}}{2},$$
and from above by the smallest positive root of the same function. Note that $\overline{S}(0) > 0$. Therefore, we have $\overline{S}(u) > 0$ for all $u$ satisfying the condition in (5.8).
It is easy to see that $\widetilde{S}(u) = \sigma|u|\,\overline{S}(u)$ for all $u \ne 0$ satisfying the condition in (5.8). Hence, $\widetilde{S}(u) > 0$ under the same restrictions on $u$. It follows from (5.7) that for all $u \ne 0$ such that (5.8) holds, the right-hand side of (5.7) is eventually finite as $t \to 0$, and moreover
$$\lim_{t\to 0} tD\left(-\frac{u}{t}, t\right) = \frac{u\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}}{\sigma\left[\sqrt{1 - \rho^2}\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right]}. \tag{5.9}$$
In addition, the expression under the logarithm sign in (5.6) is eventually positive, and
$$\lim_{t\to 0} tC\left(-\frac{u}{t}, t\right) = 0. \tag{5.10}$$
In the case where $\rho = 0$, the condition in (5.8) becomes
$$-\frac{\pi}{\sigma} < u < \frac{\pi}{\sigma}. \tag{5.11}$$
The analysis here proceeds similarly to that in the previous case.
The next statement provides explicit expressions for the function $\Lambda^{(0)}$. This statement was obtained in [8] (see formula (2) in [8], see also [10]) in a special case where $k = -\frac{1}{2}$ and $r = 0$.
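Theorem 5.1 Suppose $\rho \ne 0$ and condition (5.8) holds. Then $u \in Z$ and the following formula is valid:
$$\Lambda^{(0)}(u) = \frac{v_0 u\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}}{\sigma\left[\sqrt{1 - \rho^2}\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right]} - x_0 u. \tag{5.12}$$
If $\rho = 0$ and condition (5.11) holds, then $u \in Z$ and
$$\Lambda^{(0)}(u) = \frac{v_0 u}{\sigma}\tan\frac{u\sigma}{2} - x_0 u.$$
Theorem 5.1 follows from (5.2), (5.9), and (5.10).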
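The convergence of $\Lambda(u, t)$ in (5.2) to $\Lambda^{(0)}(u)$ in (5.12) can be observed numerically from the real-valued representations (5.5)-(5.7). The sketch below is an illustration only: the parameter values are hypothetical, $\rho$ is taken positive, and $u$ is kept small so that the square root in (5.5) and the logarithm in (5.6) are well defined.

```python
import math

# Hypothetical Heston parameters (illustrative only).
R_RATE, K, A, B, SIGMA, RHO = 0.02, -0.5, 0.06, 1.5, 0.5, 0.4
V0, X0 = 0.04, 0.1

def S(u, t):
    """S(u, t) from (5.5)."""
    return 0.5 * math.sqrt(u * u * (1 - RHO**2) * SIGMA**2
                           - 2 * t * u * (K * SIGMA**2 + B * RHO * SIGMA)
                           - t * t * B * B)

def _denominator(u, t):
    s = S(u, t)
    return 2 * s * math.cos(s) + (B * t + RHO * SIGMA * u) * math.sin(s)

def t_C(u, t):
    """t * C(-u/t, t) from (5.6)."""
    log_term = math.log(_denominator(u, t) / (2 * S(u, t)))
    return -t * R_RATE * u + t * (A / SIGMA**2) * (B * t + RHO * SIGMA * u - 2 * log_term)

def t_D(u, t):
    """t * D(-u/t, t) from (5.7)."""
    return (u * u - 2 * t * K * u) * math.sin(S(u, t)) / _denominator(u, t)

def Lambda(u, t):
    """Lambda(u, t) from (5.2)."""
    return t_C(u, t) + t_D(u, t) * V0 - u * X0

def Lambda0(u):
    """Lambda^(0)(u) from (5.12)."""
    theta = 0.5 * SIGMA * math.sqrt(1 - RHO**2)
    denom = SIGMA * (math.sqrt(1 - RHO**2) * math.cos(theta * u) + RHO * math.sin(theta * u))
    return V0 * u * math.sin(theta * u) / denom - X0 * u

if __name__ == "__main__":
    for u in (1.0, -1.0):
        for t in (1e-1, 1e-2, 1e-3, 1e-4):
            print(u, t, Lambda(u, t) - Lambda0(u))   # difference decays roughly like O(t)
```

The roughly linear decay of the difference in $t$ reflects the first-order term $t\Lambda^{(1)}(u)$ in the expansion of $\Lambda(u, t)$.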
Recall that for x ∈ R, the ∗
√ critical point u (x) is the solution of the equation
σ 1−ρ2
∂u (0) (u) = −x. Put θ = 2 . Then, using (5.12), we obtain
#
(0) v0 ρ[1 − cos(2θu)] + 1 − ρ2 sin(2θu) + σ(1 − ρ2 )u
∂u  (u) = # − x0 .
2σ ( 1 − ρ2 cos(θu) + ρ sin(θu))2

In a special case where ρ = 0, we have

v0 sin(2θu) + σu
∂u (0) (u) = − x0 .
σ 1 + cos(2θu)
The Gärtner-Ellis Theorem, Homogenization, and Affine Processes 311

In the following two statements, we provide formulas for the critical point $u^*(x)$ and the second derivative of the function $\Lambda^{(0)}(u)$. These results can be used in the asymptotic formulas established in the previous sections in the case of the Heston model.
Lemma 5.2 Suppose $\rho \neq 0$ and condition (5.8) holds. Then, for every $x \in \mathbb{R}$, the critical point $u^*(x)$ is the unique solution to the equation
\[
\frac{\rho[1-\cos(2\theta u)] + \sqrt{1-\rho^2}\,\sin(2\theta u) + \sigma(1-\rho^2)u}{\left(\sqrt{1-\rho^2}\,\cos(\theta u) + \rho\,\sin(\theta u)\right)^2} = \frac{2\sigma}{v_0}(x_0 - x).
\]
If $\rho = 0$ and condition (5.11) holds, then for every $x \in \mathbb{R}$, $u^*(x)$ is the unique solution to the equation
\[
\frac{\sin(2\theta u) + \sigma u}{1+\cos(2\theta u)} = \frac{\sigma}{v_0}(x_0 - x).
\]
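The equation in Lemma 5.2 has no closed-form solution, so in practice $u^*(x)$ would be found numerically. The sketch below is one possible approach, not the authors' method: it brackets the root by bisection on the admissible interval, relying on the uniqueness stated in the lemma. It uses the elementary identity $\sqrt{1-\rho^2}\cos y + \rho\sin y = \cos(y - \arcsin\rho)$ to locate the endpoints of the interval; all function names and parameter values are illustrative assumptions.

```python
import math

def lambda0(u, v0, x0, sigma, rho):
    """Formula (5.12)."""
    root = math.sqrt(1.0 - rho**2)
    y = u * sigma * root / 2.0
    return v0 * u * math.sin(y) / (sigma * (root * math.cos(y) + rho * math.sin(y))) - x0 * u

def d_lambda0(u, v0, x0, sigma, rho):
    """Closed-form first derivative of Lambda^(0), with theta = sigma*sqrt(1-rho^2)/2."""
    root = math.sqrt(1.0 - rho**2)
    theta = sigma * root / 2.0
    num = (rho * (1.0 - math.cos(2 * theta * u)) + root * math.sin(2 * theta * u)
           + sigma * (1.0 - rho**2) * u)
    den = (root * math.cos(theta * u) + rho * math.sin(theta * u)) ** 2
    return v0 / (2.0 * sigma) * num / den - x0

def domain(sigma, rho):
    """Interval between the roots nearest to 0 of sqrt(1-rho^2)cos(theta u) + rho sin(theta u)."""
    theta = sigma * math.sqrt(1.0 - rho**2) / 2.0
    alpha = math.asin(rho)
    return (alpha - math.pi / 2) / theta, (alpha + math.pi / 2) / theta

def critical_point(x, v0, x0, sigma, rho, tol=1e-12):
    """Bisection for d_lambda0(u) = -x; the derivative tends to -inf/+inf at the endpoints."""
    lo, hi = domain(sigma, rho)
    pad = 1e-9 * (hi - lo)          # stay strictly inside the open interval
    lo, hi = lo + pad, hi - pad
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_lambda0(mid, v0, x0, sigma, rho) + x > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    params = dict(v0=0.04, x0=0.0, sigma=0.5, rho=-0.6)  # hypothetical values
    print(critical_point(0.1, **params))
```

Bisection is chosen here only because it needs no derivative information beyond the sign of the residual; a Newton iteration using Lemma 5.3 would be a natural alternative.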

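Lemma 5.3 Suppose $\rho \neq 0$ and condition (5.8) holds. Then
\[
\partial_u^2 \Lambda^{(0)}(u) = \frac{v_0\, S(u)}{2\sigma\left[\sqrt{1-\rho^2}\,\cos(\theta u) + \rho\,\sin(\theta u)\right]^3},
\]
where
\[
S(u) = \left(2\theta + \sigma\sqrt{1-\rho^2}\right)\left[\rho\sqrt{1-\rho^2}\,\sin(\theta u) + (1-\rho^2)\cos(\theta u)\right] + 2\sigma\theta(1-\rho^2)u\left[\sqrt{1-\rho^2}\,\sin(\theta u) - \rho\,\cos(\theta u)\right].
\]
If $\rho = 0$ and condition (5.11) holds, then
\[
\partial_u^2 \Lambda^{(0)}(u) = \frac{v_0}{2\sigma}\,\frac{(2\theta+\sigma)\cos(\theta u) + 2\theta\sigma u\,\sin(\theta u)}{\cos^3(\theta u)}.
\]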

Lemmas 5.2 and 5.3 are straightforward, and their proofs are omitted.
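We will next compute the functions $\Lambda^{(1)}$ and $\Lambda^{(2)}$. Recall that
\[
\int \exp\left\{-\frac{u}{t}\,z\right\} p_t(dz) = \exp\left\{\frac{\Lambda^{(0)}(u)}{t}\right\}\exp\left\{\Lambda^{(1)}(u)\right\}\left(1 + t\Lambda^{(2)}(u) + \dots\right).
\]
Therefore,
\[
\Lambda(u,t) = \Lambda^{(0)}(u) + t\Lambda^{(1)}(u) + t\log\left(1 + t\Lambda^{(2)}(u) + \dots\right).
\]
By differentiating the previous formula with respect to $t$, we obtain
\[
\Lambda^{(1)}(u) = \lim_{t\to 0}\frac{\partial}{\partial t}\Lambda(u,t)
\]
and
\[
\Lambda^{(2)}(u) = \frac{1}{2}\lim_{t\to 0}\frac{\partial^2}{\partial t^2}\Lambda(u,t). \tag{5.13}
\]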

Let us fix $u \neq 0$ as in Theorem 5.1. Then the function $t \mapsto S(u,t)$ defined by (5.5) is real analytic in $t$ in a small neighborhood of $t = 0$, depending on $u$. Using the Taylor formula, we obtain
\[
S(u,t) = c_0(u) + c_1(u)t + \frac{1}{2}c_2(u)t^2 + O(t^3) \tag{5.14}
\]
as $t \to 0$, where the $O$-estimate depends on $u$, and the coefficients are given by
\[
c_0(u) = \frac{|u|\sigma}{2}\sqrt{1-\rho^2}, \tag{5.15}
\]
\[
c_1(u) = -\frac{|u|}{u}\,\frac{k\sigma + b\rho}{2\sqrt{1-\rho^2}}, \tag{5.16}
\]
and
\[
c_2(u) = -\frac{|u|}{u^2}\,\frac{b^2(1-\rho^2) + (k\sigma+b\rho)^2}{2\sigma(1-\rho^2)^{3/2}}.
\]

Our next goal is to expand the functions $t \mapsto \sin S(u,t)$ and $t \mapsto \cos S(u,t)$. Using the Taylor formula and (5.14), we get
\[
\sin S(u,t) = U_0(u) + U_1(u)t + \frac{1}{2}U_2(u)t^2 + O(t^3) \tag{5.17}
\]
as $t \to 0$, where
\[
U_0(u) = \sin c_0(u), \tag{5.18}
\]
\[
U_1(u) = c_1(u)\cos c_0(u), \tag{5.19}
\]
and
\[
U_2(u) = c_2(u)\cos c_0(u) - c_1(u)^2\sin c_0(u).
\]
Similarly,
\[
\cos S(u,t) = W_0(u) + W_1(u)t + \frac{1}{2}W_2(u)t^2 + O(t^3) \tag{5.20}
\]
as $t \to 0$, where
\[
W_0(u) = \cos c_0(u), \qquad W_1(u) = -c_1(u)\sin c_0(u),
\]
and
\[
W_2(u) = -\left[c_2(u)\sin c_0(u) + c_1(u)^2\cos c_0(u)\right].
\]
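The expansions (5.17) and (5.20) can be sanity-checked numerically. The sketch below treats $c_0, c_1, c_2$ as arbitrary numbers (the sample values are assumptions for illustration, not Heston-calibrated quantities), takes $S(t) = c_0 + c_1 t + \frac{1}{2}c_2 t^2$, and measures the truncation error, which should shrink like $t^3$.

```python
import math

def trig_coefficients(c0, c1, c2):
    """Coefficients (U0, U1, U2) and (W0, W1, W2) as in (5.17)-(5.20)."""
    U = (math.sin(c0), c1 * math.cos(c0), c2 * math.cos(c0) - c1**2 * math.sin(c0))
    W = (math.cos(c0), -c1 * math.sin(c0), -(c2 * math.sin(c0) + c1**2 * math.cos(c0)))
    return U, W

def truncation_errors(c0, c1, c2, t):
    """Absolute errors of the second-order expansions of sin S and cos S at time t."""
    S = c0 + c1 * t + 0.5 * c2 * t**2
    U, W = trig_coefficients(c0, c1, c2)
    sin_approx = U[0] + U[1] * t + 0.5 * U[2] * t**2
    cos_approx = W[0] + W[1] * t + 0.5 * W[2] * t**2
    return abs(math.sin(S) - sin_approx), abs(math.cos(S) - cos_approx)

if __name__ == "__main__":
    for t in (1e-2, 5e-3):
        print(t, truncation_errors(0.3, -0.7, 1.1, t))
```

Halving $t$ should reduce both errors by roughly a factor of eight, consistent with the $O(t^3)$ remainder.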

We will next expand the functions $t \mapsto tD(-\frac{u}{t}, t)$ and $t \mapsto tC(-\frac{u}{t}, t)$. It follows from (5.14), (5.17), and (5.20) that
\[
2S(u,t)\cos S(u,t) + (bt + \rho\sigma u)\sin S(u,t) = V_0(u) + V_1(u)t + \frac{1}{2}V_2(u)t^2 + O(t^3) \tag{5.21}
\]
as $t \to 0$, where
\[
V_0(u) = 2c_0(u)W_0(u) + \rho\sigma u\,U_0(u),
\]
\[
V_1(u) = 2c_0(u)W_1(u) + 2c_1(u)W_0(u) + bU_0(u) + \rho\sigma u\,U_1(u),
\]
and
\[
V_2(u) = 2c_0(u)W_2(u) + 4c_1(u)W_1(u) + 2c_2(u)W_0(u) + 2bU_1(u) + \rho\sigma u\,U_2(u).
\]
It is not hard to see that
\[
V_0(u) = 2c_0(u)\cos c_0(u) + \rho\sigma u\,\sin c_0(u), \tag{5.22}
\]
\[
V_1(u) = (2+\rho\sigma u)c_1(u)\cos c_0(u) + \left(b - 2c_0(u)c_1(u)\right)\sin c_0(u), \tag{5.23}
\]
and
\[
V_2(u) = \left[2c_2(u) + 2bc_1(u) + \rho\sigma u\,c_2(u) - 2c_0(u)c_1(u)^2\right]\cos c_0(u) - \left[2c_0(u)c_2(u) + 4c_1(u)^2 + \rho\sigma u\,c_1(u)^2\right]\sin c_0(u).
\]

Therefore,
\[
tD\!\left(-\frac{u}{t}, t\right) = (u^2 - 2tku)\,\frac{U_0(u) + U_1(u)t + \frac{1}{2}U_2(u)t^2 + O(t^3)}{V_0(u) + V_1(u)t + \frac{1}{2}V_2(u)t^2 + O(t^3)} \tag{5.24}
\]
as $t \to 0$ (see (5.7), (5.17), and (5.21)).
Set
\[
tD\!\left(-\frac{u}{t}, t\right) = T_0(u) + T_1(u)t + \frac{1}{2}T_2(u)t^2 + O(t^3) \tag{5.25}
\]
as $t \to 0$. Then, (5.24) and (5.25) give
\[
T_0(u)V_0(u) = u^2U_0(u),
\]
\[
T_0(u)V_1(u) + T_1(u)V_0(u) = u^2U_1(u) - 2kuU_0(u),
\]
and
\[
\frac{1}{2}T_0(u)V_2(u) + T_1(u)V_1(u) + \frac{1}{2}T_2(u)V_0(u) = \frac{1}{2}u^2U_2(u) - 2kuU_1(u).
\]
It follows from the previous equalities that
\[
T_0(u) = \frac{u^2U_0(u)}{V_0(u)},
\]
\[
T_1(u) = \frac{u^2U_1(u)V_0(u) - 2kuU_0(u)V_0(u) - u^2U_0(u)V_1(u)}{V_0(u)^2},
\]
and
\[
T_2(u) = \frac{Q(u)}{V_0(u)^3},
\]
where
\[
\begin{aligned}
Q(u) ={}& u^2U_2(u)V_0(u)^2 - 4kuU_1(u)V_0(u)^2 - u^2U_0(u)V_0(u)V_2(u) \\
& - 2u^2U_1(u)V_0(u)V_1(u) + 4kuU_0(u)V_0(u)V_1(u) + 2u^2U_0(u)V_1(u)^2.
\end{aligned} \tag{5.26}
\]

Therefore, the following asymptotic formula holds:
\[
tD\!\left(-\frac{u}{t}, t\right) = \frac{u^2U_0(u)}{V_0(u)} + t\,\frac{u^2U_1(u)V_0(u) - 2kuU_0(u)V_0(u) - u^2U_0(u)V_1(u)}{V_0(u)^2} + \frac{t^2}{2}\,\frac{Q(u)}{V_0(u)^3} + O(t^3) \tag{5.27}
\]
as $t \to 0$.
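The coefficients $T_0$, $T_1$, $T_2$ come from dividing one truncated power series by another. The sketch below checks the closed forms, including $Q(u)$ from (5.26), against a direct evaluation of the quotient in (5.24) with the $O(t^3)$ terms dropped. The numerical values of $U_i$, $V_i$, $u$ and $k$ are arbitrary placeholders, assumed only for the purpose of the check.

```python
def t_coefficients(U, V, u, k):
    """Closed forms for T0, T1, T2; U and V are triples (X0, X1, X2)."""
    U0, U1, U2 = U
    V0, V1, V2 = V
    T0 = u * u * U0 / V0
    T1 = (u * u * U1 * V0 - 2 * k * u * U0 * V0 - u * u * U0 * V1) / V0**2
    Q = (u * u * U2 * V0**2 - 4 * k * u * U1 * V0**2 - u * u * U0 * V0 * V2
         - 2 * u * u * U1 * V0 * V1 + 4 * k * u * U0 * V0 * V1 + 2 * u * u * U0 * V1**2)
    return T0, T1, Q / V0**3

def quotient(U, V, u, k, t):
    """Right-hand side of (5.24) with the O(t^3) remainders omitted."""
    num = U[0] + U[1] * t + 0.5 * U[2] * t**2
    den = V[0] + V[1] * t + 0.5 * V[2] * t**2
    return (u * u - 2 * t * k * u) * num / den

if __name__ == "__main__":
    U, V, u, k = (0.3, -0.2, 0.5), (1.2, 0.4, -0.3), 0.8, -0.5  # placeholder values
    T0, T1, T2 = t_coefficients(U, V, u, k)
    t = 1e-3
    print(quotient(U, V, u, k, t), T0 + T1 * t + 0.5 * T2 * t**2)
```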
Now, we turn our attention to the function $t \mapsto tC(-\frac{u}{t}, t)$. Using (5.6), we see that
\[
tC\!\left(-\frac{u}{t}, t\right) = -tru + \frac{ta}{\sigma^2}\left[bt + \rho\sigma u - 2\log\frac{V_0(u) + V_1(u)t + O(t^2)}{2c_0(u) + 2c_1(u)t + O(t^2)}\right]. \tag{5.28}
\]
Set
\[
\frac{V_0(u) + V_1(u)t + O(t^2)}{2c_0(u) + 2c_1(u)t + O(t^2)} = L_0(u) + L_1(u)t + O(t^2)
\]
as $t \to 0$. It is not hard to see that
\[
L_0(u) = \frac{V_0(u)}{2c_0(u)}, \tag{5.29}
\]
\[
L_1(u) = \frac{c_0(u)V_1(u) - c_1(u)V_0(u)}{2c_0(u)^2}. \tag{5.30}
\]
We also have
\[
\log\left[L_0(u) + L_1(u)t + O(t^2)\right] = \log L_0(u) + \frac{L_1(u)}{L_0(u)}\,t + O(t^2) \tag{5.31}
\]
as $t \to 0$. It follows from (5.28)–(5.31) that
\[
tC\!\left(-\frac{u}{t}, t\right) = \left[\frac{a\rho u}{\sigma} - ru - \frac{2a}{\sigma^2}\log\frac{V_0(u)}{2c_0(u)}\right]t + \left[\frac{ab}{\sigma^2} - \frac{2a}{\sigma^2}\,\frac{c_0(u)V_1(u) - c_1(u)V_0(u)}{c_0(u)V_0(u)}\right]t^2 + O(t^3) \tag{5.32}
\]
as $t \to 0$.
Next, we will find explicit expressions for the functions $\Lambda^{(1)}$ and $\Lambda^{(2)}$. Suppose $\rho \neq 0$, $u \neq 0$, and condition (5.8) holds. Then
\[
\Lambda^{(1)}(u) = \frac{a\rho u}{\sigma} - ru - \frac{2a}{\sigma^2}\log\frac{V_0(u)}{2c_0(u)} + v_0\,\frac{u^2U_1(u)V_0(u) - 2kuU_0(u)V_0(u) - u^2U_0(u)V_1(u)}{V_0(u)^2}. \tag{5.33}
\]
Formula (5.33) can be established using (5.27) and (5.32).


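The next statement provides an explicit expression for the function $\Lambda^{(1)}$ in terms of the Heston model parameters.
Theorem 5.4 Suppose $\rho \neq 0$, $u \neq 0$, and condition (5.8) holds. Then
\[
\begin{aligned}
\Lambda^{(1)}(u) ={}& \frac{a\rho u}{\sigma} - ru - \frac{2a}{\sigma^2}\log\frac{\sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}}{\sqrt{1-\rho^2}} \\
& + v_0\,\frac{E_1(u)\cos^2\frac{u\sigma\sqrt{1-\rho^2}}{2} + E_2(u)\cos\frac{u\sigma\sqrt{1-\rho^2}}{2}\sin\frac{u\sigma\sqrt{1-\rho^2}}{2} + E_3(u)\sin^2\frac{u\sigma\sqrt{1-\rho^2}}{2}}{\sigma^2\left(\sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right)^2},
\end{aligned} \tag{5.34}
\]
where
\[
E_1(u) = -\frac{u\sigma(k\sigma+b\rho)}{2},
\]
\[
E_2(u) = -2k\sigma\sqrt{1-\rho^2} + \frac{k\sigma+b\rho}{\sqrt{1-\rho^2}},
\]
and
\[
E_3(u) = -\left(2k\rho\sigma + b + u\sigma\,\frac{k\sigma+b\rho}{2}\right).
\]
If $\rho = 0$, then formula (5.34) holds for all $u$ satisfying condition (5.11).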


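Proof Taking into account (5.33), (5.15), (5.16), (5.18), (5.19), (5.22), (5.23), we obtain
\[
\begin{aligned}
\Lambda^{(1)}(u) ={}& \frac{a\rho u}{\sigma} - ru - \frac{2a}{\sigma^2}\log\frac{|u|\sigma\sqrt{1-\rho^2}\,\cos\frac{|u|\sigma\sqrt{1-\rho^2}}{2} + \rho\sigma u\,\sin\frac{|u|\sigma\sqrt{1-\rho^2}}{2}}{|u|\sigma\sqrt{1-\rho^2}} \\
& + v_0\,\frac{\widetilde{E}_1(u)\cos^2\frac{|u|\sigma\sqrt{1-\rho^2}}{2} + \widetilde{E}_2(u)\cos\frac{|u|\sigma\sqrt{1-\rho^2}}{2}\sin\frac{|u|\sigma\sqrt{1-\rho^2}}{2} + \widetilde{E}_3(u)\sin^2\frac{|u|\sigma\sqrt{1-\rho^2}}{2}}{\left(|u|\sigma\sqrt{1-\rho^2}\,\cos\frac{|u|\sigma\sqrt{1-\rho^2}}{2} + \rho\sigma u\,\sin\frac{|u|\sigma\sqrt{1-\rho^2}}{2}\right)^2},
\end{aligned}
\]
where
\[
\widetilde{E}_1(u) = -\frac{u^3}{2}\,\sigma(k\sigma+b\rho),
\]
\[
\widetilde{E}_2(u) = u\left[-\rho\sigma u|u|\,\frac{k\sigma+b\rho}{2\sqrt{1-\rho^2}} - 2k|u|\sigma\sqrt{1-\rho^2} + (2+\rho\sigma u)|u|\,\frac{k\sigma+b\rho}{2\sqrt{1-\rho^2}}\right],
\]
and
\[
\widetilde{E}_3(u) = -u^2\left(2k\rho\sigma + b + u\sigma\,\frac{k\sigma+b\rho}{2}\right).
\]
Next, replacing $|u|$ by $u$ in the previous formulas (it is not hard to see that this can be done) and making several cancellations, we obtain formula (5.34).
This completes the proof of Theorem 5.4.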
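The passage from (5.33) to (5.34), including the replacement of $|u|$ by $u$, can be checked numerically. The following sketch evaluates both expressions for one positive and one negative value of $u$; the parameter names mirror the symbols in the text, while the function names and numerical values are illustrative assumptions.

```python
import math

def lambda1_from_series(u, a, b, k, r, v0, sigma, rho):
    """Formula (5.33), built from c0, c1, U0, U1, V0, V1 as in (5.15)-(5.23)."""
    root = math.sqrt(1.0 - rho**2)
    c0 = abs(u) * sigma * root / 2.0
    c1 = -(abs(u) / u) * (k * sigma + b * rho) / (2.0 * root)
    U0, U1 = math.sin(c0), c1 * math.cos(c0)
    V0 = 2 * c0 * math.cos(c0) + rho * sigma * u * math.sin(c0)
    V1 = (2 + rho * sigma * u) * c1 * math.cos(c0) + (b - 2 * c0 * c1) * math.sin(c0)
    T1 = (u * u * U1 * V0 - 2 * k * u * U0 * V0 - u * u * U0 * V1) / V0**2
    return a * rho * u / sigma - r * u - 2 * a / sigma**2 * math.log(V0 / (2 * c0)) + v0 * T1

def lambda1_closed_form(u, a, b, k, r, v0, sigma, rho):
    """Formula (5.34) with E1, E2, E3 as in Theorem 5.4."""
    root = math.sqrt(1.0 - rho**2)
    phi = u * sigma * root / 2.0
    c, s = math.cos(phi), math.sin(phi)
    D = root * c + rho * s
    q = k * sigma + b * rho
    E1 = -u * sigma * q / 2.0
    E2 = -2 * k * sigma * root + q / root
    E3 = -(2 * k * rho * sigma + b + u * sigma * q / 2.0)
    frac = (E1 * c * c + E2 * c * s + E3 * s * s) / (sigma**2 * D**2)
    return a * rho * u / sigma - r * u - 2 * a / sigma**2 * math.log(D / root) + v0 * frac

if __name__ == "__main__":
    p = dict(a=0.08, b=-1.5, k=-0.5, r=0.01, v0=0.04, sigma=0.4, rho=-0.5)  # hypothetical
    for u in (0.9, -1.3):
        print(u, lambda1_from_series(u, **p), lambda1_closed_form(u, **p))
```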
Our final goal in the present section is to find an explicit formula for the function $\Lambda^{(2)}$ in terms of the Heston model parameters. It follows from (5.2), (5.13), (5.27), and (5.32) that
\[
\Lambda^{(2)}(u) = \frac{ab}{\sigma^2} - \frac{2a}{\sigma^2}\,\frac{c_0(u)V_1(u) - c_1(u)V_0(u)}{c_0(u)V_0(u)} + \frac{v_0}{2}\,\frac{Q(u)}{V_0(u)^3}, \tag{5.35}
\]
where $Q(u)$ is given by (5.26). Now, it is clear how to obtain an explicit expression for the function $\Lambda^{(2)}$ in terms of the Heston model parameters. It suffices to transform the formula in (5.35), using the explicit expressions for the functions $c_i$, $U_i$, $V_i$ with $i = 0, 1, 2$, and the function $Q$. Let us also note that the value of the function on the right-hand side of formula (5.35) does not change if we replace $|u|$ by $u$. Taking into account what was said above, and making long but straightforward computations, we see that the following statement holds.
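Theorem 5.5 Suppose $\rho \neq 0$, $u \neq 0$, and condition (5.8) holds. Then
\[
\begin{aligned}
\Lambda^{(2)}(u) ={}& \frac{ab}{\sigma^2} - \frac{a\,I_0(u)}{\sigma^3(1-\rho^2)u\left(\sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right)} \\
& + \frac{v_0\,I_1(u)}{2\sigma^2\left(\sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right)^2} \\
& + \frac{v_0\,I_2(u)I_3(u)}{2u\sigma^3\left(\sqrt{1-\rho^2}\,\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \rho\,\sin\frac{u\sigma\sqrt{1-\rho^2}}{2}\right)^3},
\end{aligned} \tag{5.36}
\]
where
\[
I_0(u) = -u\rho\sigma\sqrt{1-\rho^2}\,(k\sigma+b\rho)\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + \left[2b + 2\rho k\sigma + u\sigma(k\sigma+b\rho)(1-\rho^2)\right]\sin\frac{u\sigma\sqrt{1-\rho^2}}{2},
\]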

 
#
u b2 (1 − ρ2 ) + (kσ + bρ)2 (kσ + bρ)2 uσ 1 − ρ2
I1 (u) = − + sin 2
2(1 − ρ2 ) 1 − ρ2 2
  # #
b2 (1 − ρ2 ) + (kσ + bρ)2 b(kσ + bρ) uσ 1 − ρ2 uσ 1 − ρ2
+ 3
+ # sin cos ,
σ(1 − ρ2 ) 2 1 − ρ2 2 2

  #
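\[
I_2(u) = 2\left[2k\sigma\sqrt{1-\rho^2} - \frac{(2+\rho\sigma u)(k\sigma+b\rho)}{2\sqrt{1-\rho^2}}\right]\cos\frac{u\sigma\sqrt{1-\rho^2}}{2} + 2\left[2k\rho\sigma + b + \frac{u\sigma(k\sigma+b\rho)}{2}\right]\sin\frac{u\sigma\sqrt{1-\rho^2}}{2},
\]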
2 2

and
#
#
2 uσ 1 − ρ
2
I3 (u) = − uσ 1 − ρ + b sin
2
# 2 #
kσ + bρ uσ 1 − ρ2 uσ 1 − ρ2
−# sin cos .
1 − ρ2 2 2

If ρ = 0, then formula (5.36) holds for all u satisfying condition (5.11).

Proof The second term on the right-hand side of (5.36) can be obtained from the corresponding term in (5.35) by taking into account (5.15), (5.16), (5.22), and (5.23). Next, using (5.26), we see that
\[
\frac{Q(u)}{V_0(u)^3} = \frac{u^2\left[U_2(u)V_0(u) - U_0(u)V_2(u)\right]}{V_0(u)^2} + \frac{\left[4kuV_0(u) + 2u^2V_1(u)\right]\left[U_0(u)V_1(u) - U_1(u)V_0(u)\right]}{V_0(u)^3}. \tag{5.37}
\]
Moreover,
\[
U_2(u)V_0(u) - U_0(u)V_2(u) = 2c_0(u)c_2(u) + 4c_1(u)^2\sin^2 c_0(u) - 2\left[c_2(u) + bc_1(u)\right]\sin c_0(u)\cos c_0(u),
\]
\[
4kuV_0(u) + 2u^2V_1(u) = 2u\left[4kc_0(u) + u(2+\rho\sigma u)c_1(u)\right]\cos c_0(u) + 2u^2\left[2k\rho\sigma + b - 2c_0(u)c_1(u)\right]\sin c_0(u),
\]
and
\[
U_0(u)V_1(u) - U_1(u)V_0(u) = b\sin^2 c_0(u) - 2c_0(u)c_1(u) + 2c_1(u)\sin c_0(u)\cos c_0(u).
\]
Set
\[
I_1(u) = U_2(u)V_0(u) - U_0(u)V_2(u),
\]
\[
I_2(u) = u^{-2}\left[4kuV_0(u) + 2u^2V_1(u)\right],
\]
and
\[
I_3(u) = U_0(u)V_1(u) - U_1(u)V_0(u).
\]
Next, taking into account (5.35), (5.37), and using the explicit expressions for the functions $c_i$, $U_i$, and $V_i$ with $i = 0, 1, 2$, which were found above, we obtain (5.36).
This completes the proof of Theorem 5.5.
Remark 5.6 The present remark concerns the continuity of the functions $\Lambda^{(i)}$ with $i = 0, 1, 2$ on their domain. Recall that $\Lambda^{(i)}(0) = 0$. It follows from Theorems 5.1, 5.4, and 5.5 that the functions $\Lambda^{(i)}$ are continuous on their domain with the possible exception of the point $u = 0$. However, it is not hard to see, using the explicit expressions for the functions $\Lambda^{(i)}$ provided in the theorems mentioned above, that
\[
\lim_{u\to 0}\Lambda^{(i)}(u) = 0 \quad \text{for } i = 0, 1, 2.
\]

Acknowledgments We gratefully acknowledge the support by the Institute for Mathematical Research (FIM) and the ETH foundation. We would also like to thank Antoine Jacquier for reading the paper and making very helpful comments.

References

1. Aït-Sahalia, Y., Yu, J.: Saddlepoint approximations for continuous-time Markov processes. J.
Econom. 134, 507–551 (2006)
2. Cuchiero, C., Filipovic, D., Mayerhofer, E., Teichmann, J.: Affine processes on positive semi-
definite matrices. Ann. Appl. Probab. 21, 397–463 (2011)
3. Cuchiero, C., Keller-Ressel, M., Mayerhofer, E., Teichmann, J.: Affine processes on symmetric
cones. J. Theor. Probab. doi:10.1007/s10959-014-0580-x (2014)
4. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Jones and Bartlett
Publishers Inc., Boston (1993)
5. Duffie, D., Filipovic, D., Schachermayer, W.: Affine processes and applications in finance.
Ann. Appl. Probab. 13, 984–1053 (2003)
6. Ellis, R.S.: Large deviations for a general class of random vectors. Ann. Probab. 12, 1–12
(1984)
7. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
IJTAF 12, 861–876 (2009)
8. Forde, M., Jacquier, A.: The large-maturity smile for the Heston model. Financ. Stoch. 15,
755–780 (2011)
9. Forde, M., Jacquier, A., Mijatović, A.: Asymptotic formulae for implied volatility in the Heston
model. Proc. R. Soc. A 466, 3593–3620 (2010)
10. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3, 690–708 (2012)
11. Forde, M., Kumar, R.: Large-time option pricing for a general stochastic volatility model with
a stochastic interest rate, using the Donsker-Varadhan LDP, Preprint (2013)
12. Gärtner, J.: On large deviations from the invariant measure. Theory Probab. Appl. 22, 24–39
(1977)
13. Gulisashvili, A., Laurence, P.: The Heston Riemannian distance function. Journal de Mathé-
matiques Pures et Appliquées 101, 303–329 (2014)
14. Heston, S.L.: A closed-form solution for options with stochastic volatility, with applications
to bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
15. Jacquier, A., Mijatović, A.: Large deviations for the extended Heston model: the large-time
case. Asia-Pacific Finan. Markets 21(3), 263–280 (2014)

16. Jacquier, A., Roome, P.: Asymptotics of forward implied volatility. SIAM J. Finan. Math. 6(1),
307–351 (2015)
17. Keller-Ressel, M., Teichmann, J., Schachermayer, W.: Regularity of affine processes on general
state spaces. Electron. J. Probab. 18, 1–17 (2013)
18. Olver, F.W.J.: Asymptotics and Special Functions. A K Peters Ltd., Wellesley (1997)
19. Pham, H.: Some applications and methods of large deviations in finance and insurance. Paris-
Princeton Lectures on Mathematical Finance 2004. Lecture Notes in Mathematics, vol. 1919,
pp. 191–244 (2007)
20. Spreij, P., Veerman, E.: Affine diffusions with non-canonical state space. Stoch. Anal. Appl.
30, 605–641 (2012)
Asymptotics for d-Dimensional Lévy-Type
Processes

Matthew Lorig, Stefano Pagliarani and Andrea Pascucci

Abstract We consider a general d-dimensional Lévy-type process with killing. Combining the classical Dyson series approach with a novel polynomial expansion of the generator A(t) of the Lévy-type process, we derive a family of asymptotic approximations for transition densities and European-style option prices. Examples of stochastic volatility models with jumps are provided in order to illustrate the numerical accuracy of our approach. The methods described in this paper extend the results from Corielli et al. (SIAM J Financ Math 1:833–867, 2010, [4]), Pagliarani and Pascucci (Int. J. Theor. Appl. Financ. 16(8):1–35, 2013, [20]) and Lorig et al. (Analytical expansions for parabolic equations, 2013, [13]) for Markov diffusions to Markov processes with jumps.

Keywords Multi-dimensional Lévy-type process with killing · Asymptotic approximation · Integro-differential equation · Lévy processes · Parametrix

1 Introduction

In a multi-dimensional Markovian setting, the time evolution of a market model is usually described by the solution X of a Lévy-Itô stochastic differential equation (SDE). Such a model allows for features commonly seen in markets, such as stochastic volatility, jumps, default, co-integration and correlation. Many quantities of interest (e.g., option prices, net present values) can be expressed as expectations of

M. Lorig
Department of Applied Mathematics, University of Washington, Seattle, WA, USA
e-mail: [email protected]
S. Pagliarani
Ecole Polytechnique, CMAP, Route de Saclay, 91128 Palaiseau Cedex, France
e-mail: [email protected]
A. Pascucci (B)
Dipartimento di Matematica, Università di Bologna, Bologna, Italy
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 321


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_12
322 M. Lorig et al.

the form $u(t,x) := \mathbb{E}[\varphi(X_T) \mid X_t = x]$. Under mild conditions, the function $u(t,x)$ is the unique classical solution of a partial integro-differential equation (PIDE). Unfortunately, closed-form and even semi-closed-form solutions of these PIDEs are available only in rare cases. As such, it is important to develop general methods for finding analytical approximations for the solutions of these PIDEs.
Within the mathematical finance literature, a number of different approaches have
been taken for finding approximate transition densities and option prices for markets
described by Markov processes. Most of these techniques involve expansions that
exploit a small parameter or a limiting case. For example, Benhamou et al. [1] develop
analytical approximations for models with local volatility and Gaussian jumps in the
small diffusion and small jump frequency/size limits (see also the recent review
paper by Bompis and Gobet [2]). Deuschel et al. [5] obtain densities for diffusion
processes in a small noise limit. Fouque et al. [6] find option prices for Black-Scholes-
like multiscale models where volatility is driven by two factors, one running on a
fast scale, one running on a slow scale. Lorig [11], Lorig and Lozano-Carbassé [12]
extend these multiscale techniques to more general diffusions and to the exponential
Lévy setting.
Recently, Pagliarani and Pascucci [19] introduced a method for finding asymptotic solutions of parabolic PDEs. The approach, called the adjoint expansion method, was extended by Lorig et al. [15] and Pagliarani et al. [21] to models with jumps, and was further generalized by Lorig et al. [13] to a family of asymptotic expansions for a d-dimensional market described by an Itô SDE (i.e., a Markov market with no jumps). The method consists of expanding the pricing PDE in polynomial basis functions, which results in a nested sequence of Cauchy problems, and deriving analytical solutions for these nested Cauchy problems. In this paper, we extend the results of Lorig et al. [13, 15] and Pagliarani et al. [21] to the PIDEs that arise when markets are described by a d-dimensional Lévy-Itô SDE. Results presented here also simplify results from Lorig et al. [13, 15] and Pagliarani et al. [21].
The rest of this paper proceeds as follows. In Sect. 2 we present a general d-
dimensional market model. We also describe the kinds of derivative-assets we wish
to price, and we relate the price of such derivative-assets to the solution of a parabolic
PIDE. In Sect. 3 we introduce the idea of polynomial expansions of the pricing PIDE
and in Sect. 4, we derive a family of analytical price approximations—one for each
polynomial expansion of the pricing PIDE. Lastly, in Sect. 5 we provide a numerical
example, illustrating the versatility and accuracy of our methods.

2 Market Model

We take, as given, an equivalent martingale measure $Q$ defined on a complete filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t, t \ge 0\}, Q)$. All stochastic processes defined below live on this probability space and all expectations are taken with respect to $Q$. The risk-neutral dynamics of our market are described by the following d-dimensional Markov Lévy-type process
\[
dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t + \int_{\mathbb{R}^d} z\, d\widetilde{N}(t, X_{t-}, dt, dz).
\]
Here $W$ is a standard m-dimensional Brownian motion, and $\widetilde{N}(\cdot, \cdot, dt, dz)$, given by
\[
\widetilde{N}(t, x, dt, dz) = N(t, x, dt, dz) - \nu(t, x, dz)\,dt, \qquad (t,x) \in \mathbb{R}_+ \times \mathbb{R}^d,
\]
is a family of compensated Poisson measures on $\mathcal{B}(\mathbb{R}) \otimes \mathcal{B}(\mathbb{R}^d)$. The drift vector $\mu$ and volatility matrix $\sigma$ map $\mu : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d\times m}$, respectively. We assume the Lévy kernel $\nu$ satisfies
\[
\int_{\mathbb{R}^d} \min\left(|z|, |z|^2\right)\bar{\nu}(dz) < \infty, \qquad \bar{\nu}(dz) := \sup_{(t,x)\in\mathbb{R}_+\times\mathbb{R}^d} \nu(t, x, dz), \tag{2.1}
\]
which is rather standard for Lévy-type models. The components of $X$ could represent a number of things, such as economic factors, asset prices, indices, or functions of these quantities. We assume a risk-free interest rate of the form $r(t, X_t)$, where $r : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}$. We also introduce a random time $\zeta$, which is given by
\[
\zeta = \inf\left\{ t \ge 0 : \int_0^t \gamma(s, X_s)\,ds \ge \mathcal{E} \right\}, \qquad \gamma : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}_+,
\]
with $\mathcal{E}$ exponentially distributed and independent of $X$. The random time $\zeta$ could represent the default time of an asset, the arrival of an economic shock, etc.
Denote by $V$ the no-arbitrage price of a European derivative expiring at time $T$ with payoff
\[
H(X_T)\, I_{\{\zeta > T\}} + G(X_T)\, I_{\{\zeta \le T\}} = \left(H(X_T) - G(X_T)\right) I_{\{\zeta > T\}} + G(X_T).
\]
It is well known (see, for instance, Jeanblanc et al. [9]) that
\[
V_t = \mathbb{E}\left[e^{-\int_t^T r(s, X_s)\,ds}\, G(X_T) \,\middle|\, X_t\right] + I_{\{\zeta > t\}}\, \mathbb{E}\left[e^{-\int_t^T (r(s, X_s) + \gamma(s, X_s))\,ds}\left(H(X_T) - G(X_T)\right) \,\middle|\, X_t\right], \qquad t < T. \tag{2.2}
\]

Thus, to value a European-style option, one must compute functions of the form
\[
u(t,x) := \mathbb{E}\left[e^{-\int_t^T \lambda(s, X_s)\,ds}\,\varphi(X_T) \,\middle|\, X_t = x\right]. \tag{2.3}
\]
Under mild assumptions (see, for instance, Pascucci [22]), the function $u$, defined by (2.3), satisfies the Kolmogorov backward equation
\[
(\partial_t + A(t))u = 0, \qquad u(T, x) = \varphi(x), \qquad x \in \mathbb{R}^d, \tag{2.4}
\]
where the operator $A(t)$ is given explicitly by
\[
A(t) = \int_{\mathbb{R}^d} \nu(t, x, dz)\left(e^{\langle z, \nabla_x\rangle} - 1 - \langle z, \nabla_x\rangle\right) + \frac{1}{2}\sum_{i,j=1}^d \left(\sigma\sigma^T\right)_{ij}(t,x)\,\partial_{x_i}\partial_{x_j} + \sum_{i=1}^d \mu_i(t,x)\,\partial_{x_i} - \lambda(t,x), \tag{2.5}
\]
with
\[
\langle z, x\rangle := \sum_{i=1}^d z_i x_i, \qquad \nabla_x := (\partial_{x_1}, \partial_{x_2}, \dots, \partial_{x_d}), \qquad e^{\langle z, \nabla_x\rangle} f(x) := f(x+z).
\]
The formal representation of the shift operator $e^{\langle z, \nabla_x\rangle}$ is motivated by the fact that its Taylor expansion applied to the function $f(x)$ gives the Taylor expansion of $f(x+z)$ about the point $x$. As in Øksendal and Sulem [18, Chap. 1], we regard the domain of $A(t)$ to be all functions $f : \mathbb{R}^d \to \mathbb{R}$ such that $A(t)f(x)$ exists and is finite for all $x \in \mathbb{R}^d$.

Remark 2.1 (Martingale property) Let us denote by $X^{(i)}$ the $i$th component of the vector $X$ and assume that
\[
\int_{|z|\ge 1} e^{z_i}\,\bar{\nu}(dz) < \infty
\]
for some $i \le d$, with $\bar{\nu}$ as in (2.1). If $S_t := I_{\{\zeta > t\}}\, e^{X_t^{(i)}}$ is supposed to be a traded asset then, in order for $S$ to be a martingale, the drift $\mu_i$ must satisfy
\[
\mu_i(t,x) = \gamma(t,x) - \int_{\mathbb{R}^d} \nu(t, x, dz)\left(e^{z_i} - 1 - z_i\right) - \frac{1}{2}\left(\sigma\sigma^T\right)_{ii}(t,x).
\]
To see this, set $H(x) = e^{x_i}$, $G(x) = 0$ and impose $V_t = S_t$ in (2.2).

3 General Expansion Basis

Let us start by rewriting the differential operator (2.5) in the more compact form
\[
A(t) := \int_{\mathbb{R}^d} \nu(t, x, dz)\left(e^{\langle z, \nabla_x\rangle} - 1 - \langle z, \nabla_x\rangle\right) + \sum_{|\alpha|\le 2} a_\alpha(t,x)\,D_x^\alpha, \qquad t \in \mathbb{R},\ x \in \mathbb{R}^d,
\]
where, by standard notation,
\[
\alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{N}_0^d, \qquad |\alpha| = \sum_{i=1}^d \alpha_i, \qquad D_x^\alpha = \partial_{x_1}^{\alpha_1}\cdots\partial_{x_d}^{\alpha_d}.
\]
In this section we introduce a family of expansion schemes for $A(t)$, which we shall use to construct closed-form approximate solutions (one for each family) of (2.4).
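Definition 3.1 For $|\alpha| \le 2$ and $n \le N \in \mathbb{N}_0$, let $a_{\alpha,n} = a_{\alpha,n}(t,x)$ and $\nu_n = \nu_n(t, x, dz)$ be such that the following hold:
(i) For any $t \in [0,T]$, $a_{\alpha,n}(t, \cdot)$ are polynomial functions with $a_{\alpha,0}(t,x) \equiv a_{\alpha,0}(t)$, and for any $x \in \mathbb{R}^d$ the functions $a_{\alpha,n}(\cdot, x)$ belong to $L^\infty([0,T])$.
(ii) For any $t \in [0,T]$, $x \in \mathbb{R}^d$, we have
\[
\nu_n(t, x, dz) = \sum_{|\beta|\le M_n} x^\beta\, \nu_{n,\beta}(t, dz), \qquad M_n \in \mathbb{N}_0, \tag{3.1}
\]
where each $\nu_{n,\beta}(t, dz)$ satisfies condition (2.1). Moreover, $M_0 = 0$, $\nu_0 \ge 0$ and
\[
\int_{|z|\ge 1} e^{\lambda|z|}\,\nu_0(t, dz) < \infty, \qquad t \in [0,T], \tag{3.2}
\]
for some positive $\lambda$.
Then we say that $(A_n(t))_{0\le n\le N}$, defined by
\[
\begin{aligned}
A_n(t)f(x) &= \sum_{|\beta|\le M_n} x^\beta \int_{\mathbb{R}^d} \nu_{n,\beta}(t, dz)\left(e^{\langle z, \nabla_x\rangle} - 1 - \langle z, \nabla_x\rangle\right)f(x) + \sum_{|\alpha|\le 2} a_{\alpha,n}(t,x)\,D_x^\alpha f(x) \\
&\equiv \int_{\mathbb{R}^d} \nu_n(t, x, dz)\left(e^{\langle z, \nabla_x\rangle} - 1 - \langle z, \nabla_x\rangle\right)f(x) + \sum_{|\alpha|\le 2} a_{\alpha,n}(t,x)\,D_x^\alpha f(x),
\end{aligned} \tag{3.3}
\]
is an $N$th order polynomial expansion of $A(t)$.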


Definition 3.1 allows for very general polynomial specifications. The idea is to choose an expansion $(A_n(t))$ that closely approximates $A(t)$. The precise sense of this approximation will depend on the application. Below, we present three polynomial expansions. The first two expansion schemes provide an accurate approximation of $A(t)$ in a pointwise local sense, under the assumption of smooth coefficients. The last expansion scheme approximates $A(t)$ in a global sense and can be applied even in the case of discontinuous coefficients.

Example 3.2 (Taylor polynomial expansion) Assume the coefficients $a_\alpha(t, \cdot) \in C^N(\mathbb{R}^d)$ and that the compensator $\nu$ takes the form
\[
\nu(t, x, dz) = h(t, x, z)\,\bar{\nu}(dz),
\]
where $h(t, \cdot, z) \in C^N(\mathbb{R}^d)$ with $h \ge 0$, and $\bar{\nu}$ is a Lévy measure. Then, for any fixed $\bar{x} \in \mathbb{R}^d$ and $n \le N$, we define $\nu_n$ and $a_{\alpha,n}$ as the $n$th order term of the Taylor expansions of $\nu$ and $a_\alpha$, respectively, in the spatial variables $x$ around the point $\bar{x}$. That is, we set
\[
\nu_n(t, x, dz) = \sum_{|\beta|=n} \frac{D_x^\beta h(t, \bar{x}, z)}{\beta!}\,(x-\bar{x})^\beta\,\bar{\nu}(dz),
\]
\[
a_{\alpha,n}(t, x) = \sum_{|\beta|=n} \frac{D_x^\beta a_\alpha(t, \bar{x})}{\beta!}\,(x-\bar{x})^\beta, \qquad |\alpha| \le 2,
\]
where as usual $\beta! = \beta_1!\cdots\beta_d!$ and $x^\beta = x_1^{\beta_1}\cdots x_d^{\beta_d}$. The expansion proposed in Lorig et al. [14, 17] is the particular case when $\nu \equiv 0$, whereas the expansion proposed in Lorig et al. [15, 16] is a particular case when $d = 1$.
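To make the Taylor scheme concrete, here is a minimal one-dimensional sketch of the terms $a_{\alpha,n}$. It assumes an illustrative coefficient $a(x) = e^{cx}$, chosen only because its derivatives are known in closed form; neither this coefficient nor the function names come from the original text.

```python
import math

def taylor_term(n, x, x_bar, c=2.0):
    """n-th order Taylor term of the illustrative coefficient a(x) = exp(c*x) around x_bar."""
    nth_derivative = c**n * math.exp(c * x_bar)
    return nth_derivative * (x - x_bar) ** n / math.factorial(n)

def partial_sum(N, x, x_bar, c=2.0):
    """Sum of the terms of order 0..N, i.e. the N-th order polynomial approximation."""
    return sum(taylor_term(n, x, x_bar, c) for n in range(N + 1))

if __name__ == "__main__":
    x, x_bar = 0.1, 0.0
    for N in (0, 1, 2, 4):
        print(N, partial_sum(N, x, x_bar), math.exp(2.0 * x))
```

Note that the zeroth-order term does not depend on $x$, in line with the requirement $a_{\alpha,0}(t,x) \equiv a_{\alpha,0}(t)$ in Definition 3.1.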

Example 3.3 (Time-dependent Taylor polynomial expansion) Under the assumptions of Example 3.2, fix a trajectory $\bar{x} : \mathbb{R}_+ \to \mathbb{R}^d$. We then define $\nu_n(t, x, dz)$ and $a_{\alpha,n}(t, x)$ as the $n$th order term of the Taylor expansions of $\nu(t, x, dz)$ and $a_\alpha(t, x)$, respectively, around $\bar{x}(t)$. More precisely, we set
\[
\nu_n(t, x, dz) = \sum_{|\beta|=n} \frac{D_x^\beta h(t, \bar{x}(t), z)}{\beta!}\,(x-\bar{x}(t))^\beta\,\bar{\nu}(dz),
\]
\[
a_{\alpha,n}(t, x) = \sum_{|\beta|=n} \frac{D_x^\beta a_\alpha(t, \bar{x}(t))}{\beta!}\,(x-\bar{x}(t))^\beta, \qquad |\alpha| \le 2.
\]
This expansion for the coefficients allows the expansion point $\bar{x}$ of the Taylor series to evolve in time according to the evolution of the underlying process $X_t$. For instance, one could choose $\bar{x}(t) = \mathbb{E}[X_t]$. In Lorig et al. [14] this choice results in a highly accurate approximation for option prices and implied volatility in the Heston [8] model.

Example 3.4 (Hermite polynomial expansion) Hermite expansions can be useful when the diffusion coefficients are discontinuous. A remarkable example in financial mathematics is given by Dupire's local volatility formula for models with jumps (see Friz et al. [7]). In some cases, e.g., the well-known Variance-Gamma model, the fundamental solution (i.e., the transition density of the underlying stochastic model) has singularities. In such cases, it is natural to approximate it in some $L^p$ norm rather than in the pointwise sense. For the Hermite expansion centered at $\bar{x}$, one sets
\[
\nu_n(t, x, dz) = \sum_{|\beta|=n} \left\langle H_\beta(\cdot - \bar{x}), \nu(t, \cdot, dz)\right\rangle H_\beta(x - \bar{x}),
\]
\[
a_{\alpha,n}(t, x) = \sum_{|\beta|=n} \left\langle H_\beta(\cdot - \bar{x}), a_\alpha(t, \cdot)\right\rangle H_\beta(x - \bar{x}), \qquad |\alpha| \le 2,
\]
where the inner product $\langle\cdot,\cdot\rangle$ is an integral over $\mathbb{R}^d$ with a Gaussian weighting centered at $\bar{x}$ and the functions $H_\beta(x) = H_{\beta_1}(x_1)\cdots H_{\beta_d}(x_d)$, where $H_n$ is the $n$th one-dimensional Hermite polynomial (properly normalized so that $\langle H_\alpha, H_\beta\rangle = \delta_{\alpha,\beta}$, with $\delta_{\alpha,\beta}$ being the Kronecker delta).

4 Formal Solution Via Dyson Series

In this section we present a heuristic argument to pass from an expansion of the operator $A(t)$ in (2.5) to an expansion for $u$, the solution of problem (2.4). The following argument is not intended to be rigorous. Rather, the computations that follow provide motivation for the price expansion given in Definition 4.1. Throughout this section, we will generally omit $x$-dependence, except where it is needed for clarity. To begin, we presume that the operator $A(t)$ can be formally written as
\[
A(t) = A_0(t) + B(t), \qquad B(t) = \sum_{n=1}^\infty A_n(t). \tag{4.1}
\]

We insert expansion (4.1) for $A(t)$ into Cauchy problem (2.4) and find
\[
(\partial_t + A_0(t))u(t) = -B(t)u(t), \qquad u(T) = \varphi.
\]
Note that, by construction, $A_0(t)$ is the generator of an additive process. Therefore, by Duhamel's principle, we have
\[
u(t) = P_0(t,T)\varphi + \int_t^T dt_1\, P_0(t,t_1)B(t_1)u(t_1), \tag{4.2}
\]
where $P_0(t,T)$ is the semigroup of operators generated by $A_0(t)$. Inserting expression (4.2) for $u$ into the right-hand side of (4.2) and iterating, we obtain
\[
\begin{aligned}
u(t) &= P_0(t,T)\varphi + \int_t^T dt_1\, P_0(t,t_1)B(t_1)P_0(t_1,T)\varphi \\
&\quad + \int_t^T dt_1 \int_{t_1}^T dt_2\, P_0(t,t_1)B(t_1)P_0(t_1,t_2)B(t_2)u(t_2) \\
&= \cdots
\end{aligned}
\]
\[
= P_0(t,T)\varphi + \sum_{k=1}^\infty \int_t^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{k-1}}^T dt_k\, P_0(t,t_1)B(t_1)P_0(t_1,t_2)B(t_2)\cdots P_0(t_{k-1},t_k)B(t_k)P_0(t_k,T)\varphi \tag{4.3}
\]
\[
= P_0(t,T)\varphi + \sum_{n=1}^\infty \sum_{k=1}^n \int_t^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{k-1}}^T dt_k \sum_{i\in I_{n,k}} P_0(t,t_1)A_{i_1}(t_1)P_0(t_1,t_2)A_{i_2}(t_2)\cdots P_0(t_{k-1},t_k)A_{i_k}(t_k)P_0(t_k,T)\varphi, \tag{4.4}
\]
\[
I_{n,k} = \{ i = (i_1, i_2, \dots, i_k) \in \mathbb{N}^k \mid i_1 + i_2 + \cdots + i_k = n \}. \tag{4.5}
\]
The second-to-last equality (4.3) is known as the Dyson series expansion of $u$ (see, for instance, Sect. 5.7 of Sakurai and Tuan [23] or Chap. IX.2.6 of Kato [10]). To obtain (4.4) from (4.3) we have used (4.1) to replace $B(t)$ by the infinite sum $\sum_{n=1}^\infty A_n(t)$, and we have partitioned on the sum of the subscripts of the $(A_{i_k})$. Expansion (4.4) motivates the following definition.
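A toy case may help to see why the Dyson series (4.3) is plausible. Assume, purely for illustration, that $A_0$ and $B$ are constant real numbers $a_0$ and $b$ (so every operator commutes). Then $P_0(t,T) = e^{a_0(T-t)}$, the $k$-fold time-ordered integral equals $(T-t)^k/k!$, and the series must sum to $e^{(a_0+b)(T-t)}$. The sketch below checks this; it is not a discretization of the general operator-valued series.

```python
import math

def dyson_partial_sum(a0, b, tau, K):
    """Partial sum of the scalar Dyson series up to k = K, with tau = T - t.

    For constant scalars the k-th term is exp(a0*tau) * b**k * tau**k / k!,
    because the ordered region t <= t1 <= ... <= tk <= T has volume tau**k / k!.
    """
    return math.exp(a0 * tau) * sum((b * tau) ** k / math.factorial(k) for k in range(K + 1))

if __name__ == "__main__":
    a0, b, tau = -0.3, 0.2, 1.5  # hypothetical values
    for K in (0, 1, 2, 6):
        print(K, dyson_partial_sum(a0, b, tau, K), math.exp((a0 + b) * tau))
```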
Definition 4.1 For a fixed $N$th order polynomial expansion $(A_n(t))_{0\le n\le N}$ satisfying Definition 3.1, we define $\bar{u}_N$, the $N$th order price approximation of $u$, as
\[
\bar{u}_N := \sum_{n=0}^N u_n, \tag{4.6}
\]
where
\[
u_0(t) := P_0(t,T)\varphi,
\]
\[
u_n(t) := \sum_{k=1}^n \int_t^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{k-1}}^T dt_k \sum_{i\in I_{n,k}} P_0(t,t_1)A_{i_1}(t_1)P_0(t_1,t_2)A_{i_2}(t_2)\cdots P_0(t_{k-1},t_k)A_{i_k}(t_k)P_0(t_k,T)\varphi, \qquad n \ge 1. \tag{4.7}
\]
Here, $P_0(t,T)$ is the semigroup generated by $A_0(t)$ and $I_{n,k}$ is as given in (4.5).
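The index set $I_{n,k}$ in (4.5) consists of the ordered ways of writing $n$ as a sum of $k$ positive integers. A small sketch that enumerates it is given below; the function name is an illustrative choice, and the brute-force filter is adequate only for the small orders $n$ used in practice.

```python
from itertools import product

def index_set(n, k):
    """All tuples (i_1, ..., i_k) of positive integers with i_1 + ... + i_k = n."""
    return [i for i in product(range(1, n + 1), repeat=k) if sum(i) == n]

if __name__ == "__main__":
    # The terms entering u_3: one tuple for k = 1, two for k = 2, one for k = 3.
    for k in (1, 2, 3):
        print(k, index_set(3, k))
```

For example, $u_2$ collects the single-operator term with $A_2$ (from $k=1$) and the two-operator term with $A_1 A_1$ (from $k=2$).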

In Sects. 4.1 and 4.2 we will provide explicit expressions for $u_0$ and $(u_n)_{n\ge 1}$, respectively.

4.1 Expression for u0

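In what follows, it will be helpful to recall the definition of the Fourier and inverse Fourier transforms. For any function $\varphi$ in the Schwartz class, we define
\[
\text{Fourier transform:} \quad \mathcal{F}[\varphi](\xi) = \hat{\varphi}(\xi) = \int_{\mathbb{R}^d} dx\, \varphi(x)\,e^{i\langle\xi, x\rangle},
\]
\[
\text{Inverse transform:} \quad \mathcal{F}^{-1}[\hat{\varphi}](x) = \varphi(x) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} d\xi\, \hat{\varphi}(\xi)\,e^{-i\langle\xi, x\rangle}.
\]
Recall that by construction $M_0 = 0$ (cf. Definition 3.1) and therefore the operator $A_0(t)$ has time-dependent coefficients which are independent of $x$. Then the action of the semigroup of operators $P_0(t,T)$ of $A_0(t)$ is well known:
\[
u_0(t) := P_0(t,T)\varphi = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} \hat{P}_0(t, x, T, \xi)\,\hat{\varphi}(-\xi)\,d\xi, \tag{4.8}
\]
where
\[
\hat{P}_0(t, x, T, \xi) := e^{i\langle\xi, x\rangle + \Phi_0(t, T, \xi)}, \tag{4.9}
\]
with
\[
\Phi_0(t, T, \xi) = \sum_{|\alpha|\le 2} (i\xi)^\alpha \int_t^T ds\, a_{\alpha,0}(s) + \Psi_0(t, T, \xi), \tag{4.10}
\]
and
\[
\Psi_0(t, T, \xi) = \int_t^T \int_{\mathbb{R}^d} \left(e^{i\langle\xi, z\rangle} - 1 - i\langle\xi, z\rangle\right)\nu_0(s, dz)\,ds.
\]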
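As an illustration of (4.9) and (4.10), the sketch below evaluates $\hat{P}_0$ in dimension $d = 1$ under strong simplifying assumptions that are not part of the original text: the coefficients $a_{0,0}$, $a_{1,0}$, $a_{2,0}$ are constant in time, and $\nu_0$ is a constant jump intensity times a Gaussian density, for which the integral against $e^{i\xi z}$ is the Gaussian characteristic function. All names and numerical values are hypothetical.

```python
import cmath
import math

def phi0(tau, xi, a0, a1, a2, jump_rate, jump_mean, jump_std):
    """Exponent (4.10) for constant coefficients over a time span tau = T - t."""
    diffusion = tau * (a0 + 1j * xi * a1 + (1j * xi) ** 2 * a2)
    # Integral of (exp(i*xi*z) - 1 - i*xi*z) against jump_rate * N(jump_mean, jump_std^2).
    gaussian_cf = cmath.exp(1j * jump_mean * xi - 0.5 * (jump_std * xi) ** 2)
    jumps = tau * jump_rate * (gaussian_cf - 1 - 1j * xi * jump_mean)
    return diffusion + jumps

def p0_hat(x, tau, xi, **coeffs):
    """Zeroth-order characteristic function (4.9) in one dimension."""
    return cmath.exp(1j * xi * x + phi0(tau, xi, **coeffs))

if __name__ == "__main__":
    coeffs = dict(a0=-0.05, a1=0.1, a2=0.02, jump_rate=0.3, jump_mean=-0.1, jump_std=0.2)
    print(p0_hat(0.4, 1.5, 2.0, **coeffs))
```

At $\xi = 0$ only the zeroth-order coefficient survives, so the value reduces to the discount-type factor $e^{a_{0,0}(T-t)}$.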

Remark 4.2 We introduce $\hat{P}$ and $e_\xi$, the characteristic function and oscillating exponential, respectively:
\[
\hat{P}(t, x, T, \xi) := \mathbb{E}\left[e^{\int_t^T a_{0,0}(s, X_s)\,ds}\, e^{i\langle\xi, X_T\rangle} \,\middle|\, X_t = x\right], \qquad e_\xi(x) = e^{i\langle\xi, x\rangle}, \tag{4.11}
\]
where $a_{0,0}$ is short-hand for $a_{(0,0,\dots,0),0}$. From (2.3) we observe that $\hat{P}(t, x, T, \xi)$ is obtained as the special case $\varphi = e_\xi$. We note that $\hat{P}_0(t, x, T, \xi)$ in (4.9) represents the 0th order approximation of $\hat{P}(t, x, T, \xi)$. More generally, we denote by $\hat{P}_n(t, x, T, \xi)$ the $n$th order approximation of $\hat{P}(t, x, T, \xi)$, obtained by setting $\varphi = e_\xi$ in (4.7).

4.2 Expression for un

Remarkably, as the following proposition shows, every $u_n(t)$ can be expressed as a pseudo-differential operator $L_n(t,T)$ acting on $u_0(t)$.
Proposition 4.3 Assume that $\varphi$ belongs to the Schwartz class, and that $\Phi_0$ in (4.10) is a smooth function of the variable $\xi$. Then the function $u_n$ defined in (4.7) is given explicitly by
\[
u_n(t) = L_n(t,T)u_0(t), \tag{4.12}
\]
where $u_0$ is given by (4.8) and
\[
L_n(t,T) = \sum_{k=1}^n \int_t^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{k-1}}^T dt_k \sum_{i\in I_{n,k}} G_{i_1}(t,t_1)G_{i_2}(t,t_2)\cdots G_{i_k}(t,t_k), \tag{4.13}
\]
with $I_{n,k}$ as defined in (4.5) and
\[
G_j(t,t_k) := A_j(t_k, \mathcal{M}(t,t_k)) = \int_{\mathbb{R}^d} \nu_j(t_k, \mathcal{M}(t,t_k), dz)\left(e^{\langle z, \nabla_x\rangle} - 1 - \langle z, \nabla_x\rangle\right) + \sum_{|\alpha|\le 2} a_{\alpha,j}(t_k, \mathcal{M}(t,t_k))\,D_x^\alpha, \tag{4.14}
\]
\[
\mathcal{M}(t,t_k) := x + \int_t^{t_k}\!\!\int_{\mathbb{R}^d} z\left(e^{\langle z, \nabla_x\rangle} - 1\right)\nu_0(s, dz)\,ds + \int_t^{t_k} m(s)\,ds + \int_t^{t_k} C(s)\nabla_x\,ds, \tag{4.15}
\]
\[
m(s) = \left(a_{(1,0,\dots,0),0}(s),\ a_{(0,1,\dots,0),0}(s),\ \dots,\ a_{(0,0,\dots,1),0}(s)\right),
\]
\[
C(s) = \begin{pmatrix}
2a_{(2,0,\dots,0),0}(s) & a_{(1,1,\dots,0),0}(s) & \cdots & a_{(1,0,\dots,1),0}(s) \\
a_{(1,1,\dots,0),0}(s) & 2a_{(0,2,\dots,0),0}(s) & \cdots & a_{(0,1,\dots,1),0}(s) \\
\vdots & \vdots & \ddots & \vdots \\
a_{(1,0,\dots,1),0}(s) & a_{(0,1,\dots,1),0}(s) & \cdots & 2a_{(0,0,\dots,2),0}(s)
\end{pmatrix}.
\]
Moreover, the components of $\mathcal{M}(t,t_k)$ commute. Therefore the operators $(G_j(t,t_k))$, which are polynomials in $\mathcal{M}(t,t_k)$ by construction, are well defined.

Proof The proof consists in showing that the operator $G_j(t,t_k)$ in (4.14) satisfies
\[
P_0(t,t_k)A_j(t_k) = G_j(t,t_k)P_0(t,t_k). \tag{4.16}
\]
Assuming (4.16) holds, we can use the fact that $P_0$ is a semigroup,
\[
P_0(t,T) = P_0(t,t_1)P_0(t_1,t_2)\cdots P_0(t_{k-1},t_k)P_0(t_k,T), \qquad t \le t_1 \le \cdots \le t_k \le T,
\]
and we can re-write (4.7) as
\[
u_n(t) = \sum_{k=1}^n \int_t^T dt_1 \int_{t_1}^T dt_2 \cdots \int_{t_{k-1}}^T dt_k \sum_{i\in I_{n,k}} G_{i_1}(t,t_1)G_{i_2}(t,t_2)\cdots G_{i_k}(t,t_k)P_0(t,T)\varphi,
\]
from which (4.12) and (4.13) follow directly. Thus, we only need to show that $G_j(t,t_k)$ satisfies (4.16). It is sufficient to investigate how the operator $P_0(t,t_k)A_j(t_k)$ acts on the oscillating exponential in (4.11). First, we note that
\[
P_0(t,t_k)e_\xi(x) = e^{\Phi_0(t,t_k,\xi)}e_\xi(x), \tag{4.17}
\]
where $\Phi_0(t,t_k,\xi)$, as given in (4.10), is a smooth function by condition (3.2). Next, we observe that the operator $\mathcal{M}(t,t_k)$ in (4.15) can be written
\[
\mathcal{M}(t,t_k) = M(t,t_k,-i\nabla_x), \qquad M(t,t_k,\xi) = -i\nabla_\xi\left(\Phi_0(t,t_k,\xi) + i\langle\xi, x\rangle\right). \tag{4.18}
\]
Denote by $\mathcal{M}_j$ and $M_j$ the $j$th component of $\mathcal{M}$ and $M$, respectively. Then, using (4.18) we have
\[
\begin{aligned}
(-i\partial_{\xi_i})(-i\partial_{\xi_j})\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) &= (-i\partial_{\xi_i})M_j(t,t_k,\xi)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&= \mathcal{M}_j(t,t_k)(-i\partial_{\xi_i})\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&= \mathcal{M}_j(t,t_k)M_i(t,t_k,\xi)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&= \mathcal{M}_j(t,t_k)\mathcal{M}_i(t,t_k)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x).
\end{aligned} \tag{4.19}
\]
More generally, for any multi-index $\beta$ we have
\[
(-i\nabla_\xi)^\beta\, e^{\Phi_0(t,t_k,\xi)}e_\xi(x) = \left(\mathcal{M}(t,t_k)\right)^\beta e^{\Phi_0(t,t_k,\xi)}e_\xi(x). \tag{4.20}
\]

From (4.19) we deduce that the operators $\mathcal{M}_i$ and $\mathcal{M}_j$ commute when applied to $e^{\Phi_0(t,t_k,\xi)}e_\xi(x)$, because so do $\partial_{\xi_i}$ and $\partial_{\xi_j}$. Consequently, $\mathcal{M}_i$ and $\mathcal{M}_j$ also commute when applied to $e_\xi(x)$ or any function that admits a representation as a Fourier transform. To see this observe that

$$\mathcal{M}_j(t,t_k)\,\mathcal{M}_i(t,t_k)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) = \mathcal{M}_i(t,t_k)\,\mathcal{M}_j(t,t_k)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x).$$

Therefore, since $\mathcal{M}_j(t,t_k)$ acts on $x$ and not on $\xi$, we have

$$\mathcal{M}_j(t,t_k)\,\mathcal{M}_i(t,t_k)\,e_\xi(x) = \mathcal{M}_i(t,t_k)\,\mathcal{M}_j(t,t_k)\,e_\xi(x).$$

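Finally, we compute

$$\begin{aligned}
\mathcal{P}_0(t,t_k)\,\mathcal{A}_j(t_k)\,e_\xi(x)
&= \mathcal{P}_0(t,t_k)\int_{\mathbb{R}^d} \nu_j(t_k,x,dz)\left(e^{\langle z,\nabla_x\rangle} - 1 - \langle z,\nabla_x\rangle\right)e_\xi(x) \\
&\quad + \mathcal{P}_0(t,t_k)\sum_{|\alpha|\le 2} a_{\alpha,j}(t_k,x)\,D_x^{\alpha}e_\xi(x) && \text{(by (3.3))} \\
&= \mathcal{P}_0(t,t_k)\int_{\mathbb{R}^d} \left(e^{i\langle z,\xi\rangle} - 1 - i\langle z,\xi\rangle\right)\nu_j(t_k,x,dz)\,e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} (i\xi)^{\alpha}\,\mathcal{P}_0(t,t_k)\,a_{\alpha,j}(t_k,x)\,e_\xi(x) \\
&= \int_{\mathbb{R}^d} \left(e^{i\langle z,\xi\rangle} - 1 - i\langle z,\xi\rangle\right)\nu_j(t_k,-i\nabla_\xi,dz)\,\mathcal{P}_0(t,t_k)\,e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} (i\xi)^{\alpha}\,a_{\alpha,j}(t_k,-i\nabla_\xi)\,\mathcal{P}_0(t,t_k)\,e_\xi(x) \\
&= \int_{\mathbb{R}^d} \left(e^{i\langle z,\xi\rangle} - 1 - i\langle z,\xi\rangle\right)\nu_j(t_k,-i\nabla_\xi,dz)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} (i\xi)^{\alpha}\,a_{\alpha,j}(t_k,-i\nabla_\xi)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) && \text{(by (4.17))} \\
&= \int_{\mathbb{R}^d} \left(e^{i\langle z,\xi\rangle} - 1 - i\langle z,\xi\rangle\right)\nu_j(t_k,\mathcal{M}(t,t_k),dz)\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} (i\xi)^{\alpha}\,a_{\alpha,j}(t_k,\mathcal{M}(t,t_k))\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) && \text{(by (4.20))} \\
&= \int_{\mathbb{R}^d} \nu_j(t_k,\mathcal{M}(t,t_k),dz)\left(e^{\langle z,\nabla_x\rangle} - 1 - \langle z,\nabla_x\rangle\right)e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} a_{\alpha,j}(t_k,\mathcal{M}(t,t_k))\,D_x^{\alpha}\,e^{\Phi_0(t,t_k,\xi)}e_\xi(x) \\
&= \int_{\mathbb{R}^d} \nu_j(t_k,\mathcal{M}(t,t_k),dz)\left(e^{\langle z,\nabla_x\rangle} - 1 - \langle z,\nabla_x\rangle\right)\mathcal{P}_0(t,t_k)\,e_\xi(x) \\
&\quad + \sum_{|\alpha|\le 2} a_{\alpha,j}(t_k,\mathcal{M}(t,t_k))\,D_x^{\alpha}\,\mathcal{P}_0(t,t_k)\,e_\xi(x) && \text{(by (4.17))} \\
&= \mathcal{G}_j(t,t_k)\,\mathcal{P}_0(t,t_k)\,e_\xi(x), && \text{(by (4.14))}
\end{aligned}$$

which concludes the proof. □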

Remark 4.4 Error bounds for the Taylor approximation ū N in the scalar case d = 1
can be found in Lorig et al. [15, 16].

4.3 Fourier Representation for u n

Using (4.8), (4.9) and (4.12) we have

$$u_n(t,x) = \mathcal{L}_n(t,T)\,u_0(t,x) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} e^{\Phi_0(t,T,\xi)}\left(\mathcal{L}_n(t,T)\,e^{i\langle\xi,x\rangle}\right)\hat\varphi(-\xi)\,d\xi.$$

The term in parentheses, $\mathcal{L}_n(t,T)e^{i\langle\xi,x\rangle}$, can be computed explicitly. However, $\mathcal{L}_n(t,T)$ is, in general, an integro-differential operator (when $X$ is a diffusion, $\mathcal{L}_n(t,T)$ is simply a differential operator). Thus, for models with jumps, computing $\mathcal{L}_n(t,T)e^{i\langle\xi,x\rangle}$ is a challenge. Remarkably, we will show that there exists a first order differential operator $\hat{\mathcal{L}}_n^{\xi}(t,T)$ such that

$$\mathcal{L}_n^{x}(t,T)\,e^{i\langle\xi,x\rangle} = \hat{\mathcal{L}}_n^{\xi}(t,T)\,e^{i\langle\xi,x\rangle}, \tag{4.21}$$

where, for clarity, we have explicitly indicated using superscripts that $\mathcal{L}_n^{x}(t,T)$ acts on $x$ and $\hat{\mathcal{L}}_n^{\xi}(t,T)$ acts on $\xi$. With a slight abuse of terminology, we call $\hat{\mathcal{L}}_n^{\xi}$ the symbol¹ of the operator $\mathcal{L}_n^{x}(t,T)$ in (4.13).
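Let us consider the operator $\mathcal{M}^{x}(t,t_k) \equiv \mathcal{M}(t,t_k)$ in (4.15) and denote by $\mathcal{M}_i^{x}(t,t_k)$ its ith component. The symbol $\hat{\mathcal{M}}_i^{\xi}(t,t_k)$ of $\mathcal{M}_i^{x}(t,t_k)$ is defined analogously to (4.21), that is

$$\mathcal{M}_i^{x}(t,t_k)\,e^{i\langle\xi,x\rangle} = \hat{\mathcal{M}}_i^{\xi}(t,t_k)\,e^{i\langle\xi,x\rangle}.$$

Explicitly, we have

$$\hat{\mathcal{M}}_i^{\xi}(t,t_k) = F_i(\xi,t,t_k) - i\partial_{\xi_i}, \qquad i = 1,\dots,d,$$

where the function $F$ is defined as

$$F_i(\xi,t,t_k) = \int_t^{t_k}\!\!\int_{\mathbb{R}^d} z_i\left(e^{i\langle z,\xi\rangle} - 1\right)\nu_0(s,dz)\,ds + \int_t^{t_k} m_i(s)\,ds + i\int_t^{t_k} \left(C(s)\xi\right)_i ds.$$

We note that, while $\mathcal{M}^{x}$ is a first order integro-differential operator, its symbol $\hat{\mathcal{M}}^{\xi}$ is a first order differential operator. For this reason, it is more convenient to use the symbol $\hat{\mathcal{M}}^{\xi}$ instead of the operator $\mathcal{M}^{x}$. Note also that

$$\begin{aligned}
\mathcal{M}_i^{x}(t,t_k)\,\mathcal{M}_j^{x}(t,t_k)\,e^{i\langle\xi,x\rangle}
&= \mathcal{M}_i^{x}(t,t_k)\,\hat{\mathcal{M}}_j^{\xi}(t,t_k)\,e^{i\langle\xi,x\rangle} \\
&= \hat{\mathcal{M}}_j^{\xi}(t,t_k)\,\mathcal{M}_i^{x}(t,t_k)\,e^{i\langle\xi,x\rangle} \\
&= \hat{\mathcal{M}}_j^{\xi}(t,t_k)\,\hat{\mathcal{M}}_i^{\xi}(t,t_k)\,e^{i\langle\xi,x\rangle}.
\end{aligned}$$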
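As a quick consistency check on the symbol of $\mathcal{M}$, one can work out the special case $d = 1$ without jumps ($\nu_0 \equiv 0$). The shorthands $\bar m$ and $\bar C$ below are introduced only for this illustration and are not notation from the text; the computation is a sketch under these simplifying assumptions.

```latex
\bar m := \int_t^{t_k} m(s)\,ds, \qquad \bar C := \int_t^{t_k} C(s)\,ds,
\qquad \mathcal{M}^{x}(t,t_k) = x + \bar m + \bar C\,\partial_x .

% Action of the operator on the exponential:
\mathcal{M}^{x}(t,t_k)\,e^{i\xi x} = \left(x + \bar m + i\bar C\,\xi\right) e^{i\xi x}.

% Action of the symbol, with F(\xi,t,t_k) = \bar m + i\bar C\,\xi :
\hat{\mathcal{M}}^{\xi}(t,t_k)\,e^{i\xi x}
  = \left(\bar m + i\bar C\,\xi - i\partial_\xi\right) e^{i\xi x}
  = \left(\bar m + i\bar C\,\xi + x\right) e^{i\xi x}.
```

Both expressions agree because $-i\partial_\xi e^{i\xi x} = x\,e^{i\xi x}$, which is exactly the mechanism that trades multiplication by $x$ for differentiation in $\xi$.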

¹ The operator $\hat{\mathcal{L}}_n^{\xi}$ is not a function as in the classical theory of pseudo-differential calculus. However, $e^{-i\langle\xi,x\rangle}\hat{\mathcal{L}}_n^{\xi}e^{i\langle\xi,x\rangle}$ is the symbol of $\mathcal{L}_n^{x}(t,T)$.

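Since $\mathcal{M}_i^{x}$ and $\mathcal{M}_j^{x}$ commute when applied to a function that admits a Fourier representation, $\hat{\mathcal{M}}_j^{\xi}$ and $\hat{\mathcal{M}}_i^{\xi}$ also commute when applied to such functions. In particular, the operator $\left(\hat{\mathcal{M}}^{\xi}(t,t_k)\right)^{\beta}$, for $\beta\in\mathbb{N}_0^d$, is well defined and we have

$$\left(\hat{\mathcal{M}}^{\xi}(t,t_k)\right)^{\beta} e^{i\langle\xi,x\rangle} = \left(\mathcal{M}(t,t_k)\right)^{\beta} e^{i\langle\xi,x\rangle}. \tag{4.22}$$

From identity (4.22) we obtain directly the expression of the symbol of $\mathcal{G}_j$ in (4.14). Indeed, recalling the expression (3.1) of $\nu_j$ we have

$$\hat{\mathcal{G}}_j^{\xi}(t,t_k) = \sum_{|\beta|\le M_j}\int_{\mathbb{R}^d}\left(e^{i\langle z,\xi\rangle} - 1 - i\langle z,\xi\rangle\right)\nu_{j,\beta}(t_k,dz)\left(\hat{\mathcal{M}}^{\xi}(t,t_k)\right)^{\beta} + \sum_{|\alpha|\le 2}(i\xi)^{\alpha}\,a_{\alpha,j}\!\left(t_k,\hat{\mathcal{M}}^{\xi}(t,t_k)\right).$$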

Thus we have proved the following lemma.

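Lemma 4.5 We have

$$\hat{\mathcal{L}}_n^{\xi}(t,T) = \sum_{k=1}^{n}\int_t^T dt_1\int_{t_1}^T dt_2\cdots\int_{t_{k-1}}^T dt_k \sum_{i\in I_{n,k}} \hat{\mathcal{G}}_{i_1}^{\xi}(t,t_1)\,\hat{\mathcal{G}}_{i_2}^{\xi}(t,t_2)\cdots\hat{\mathcal{G}}_{i_k}^{\xi}(t,t_k), \tag{4.23}$$

where $I_{n,k}$ is as defined in (4.5).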

The following theorem extends the Fourier pricing formula (4.8) to higher order
approximations.
Theorem 4.6 Under the assumptions of Proposition 4.3, for any $n\ge 1$ we have

$$u_n(t) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat P_n(t,x,T,\xi)\,\hat\varphi(-\xi)\,d\xi, \tag{4.24}$$

where $\hat P_n(t,x,T,\xi)$ is the nth order term of the approximation of the characteristic function of $X$ (cf. Remark 4.2). Explicitly, we have

$$\hat P_n(t,x,T,\xi) := \hat P_0(t,x,T,\xi)\left(e^{-i\langle\xi,x\rangle}\,\hat{\mathcal{L}}_n^{\xi}(t,T)\,e^{i\langle\xi,x\rangle}\right),$$

where $\hat P_0(t,x,T,\xi)$ is the 0th order approximation in (4.9) and $\hat{\mathcal{L}}_n^{\xi}(t,T)$ is the differential operator defined in (4.23).

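Proof We first note that, since the approximating operator $\mathcal{L}_n^{x}$ acts on the $x$ variables, it commutes² with the Fourier pricing operator (4.8). Thus, by (4.12) combined with (4.8), we get

$$\begin{aligned}
u_n(t) = \mathcal{L}_n^{x}(t,T)\,u_0(t)
&= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\mathcal{L}_n^{x}(t,T)\,e^{i\langle\xi,x\rangle+\Phi_0(t,T,\xi)}\,\hat\varphi(-\xi)\,d\xi \\
&= \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat P_0(t,x,T,\xi)\left(e^{-i\langle\xi,x\rangle}\,\hat{\mathcal{L}}_n^{\xi}(t,T)\,e^{i\langle\xi,x\rangle}\right)\hat\varphi(-\xi)\,d\xi,
\end{aligned}$$

and the claim follows from (4.21). □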



Remark 4.7 Computing the term in parentheses above, $e^{-i\langle\xi,x\rangle}\,\hat{\mathcal{L}}_n^{\xi}(t,T)\,e^{i\langle\xi,x\rangle}$, is a straightforward exercise since the symbol $\hat{\mathcal{L}}_n^{\xi}(t,T)$, given in (4.23), is a differential operator.

Remark 4.8 In the case of non-integrable payoffs (e.g. Call and Put options), the Fourier representation (4.24) can be easily extended by considering the Fourier transform on a line in the complex plane, $\xi = \xi_r + i\xi_i$. For instance, since the Call option payoff $\varphi(x) = (e^{x} - e^{k})^{+}$ is not integrable, its Fourier transform $\hat\varphi(-\xi)$ must be computed in a generalized sense by fixing an imaginary component of the Fourier variable $\xi_i < -1$.
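The generalized transform of the call payoff can be checked numerically. The sketch below assumes the convention $\hat\varphi(\xi) = \int e^{-i\xi x}\varphi(x)\,dx$ (the transform convention is fixed outside this section, so this is an assumption) and compares a plain trapezoidal quadrature with the closed form $-e^{k-ik\xi}/(i\xi+\xi^2)$, valid for $\xi_i < -1$. The function names and the truncation parameters `width` and `steps` are illustrative choices.

```python
import cmath
import math


def call_payoff_transform(xi, k):
    """Closed form of int e^{-i xi x} (e^x - e^k)^+ dx, valid for Im(xi) < -1."""
    return -cmath.exp(k - 1j * k * xi) / (1j * xi + xi * xi)


def call_payoff_transform_numeric(xi, k, width=40.0, steps=40000):
    """Trapezoidal approximation of the same integral on [k, k + width].

    The payoff vanishes for x < k, and for Im(xi) < -1 the integrand decays
    exponentially as x grows, so truncating at k + width is harmless.
    """
    h = width / steps
    total = 0.0 + 0.0j
    for n in range(steps + 1):
        x = k + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * xi * x) * (math.exp(x) - math.exp(k))
    return total * h
```

For example, with `xi = 0.5 - 2j` and `k = 0.1` the two functions should agree up to the quadrature error, whereas for $\xi_i \ge -1$ the numerical integral does not converge as `width` grows.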

Remark 4.9 Observe that the Nth order approximation (4.6)–(4.24) requires only a single Fourier inversion

$$\bar u_N(t,x) = \sum_{n=0}^{N} u_n(t,x) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\sum_{n=0}^{N}\hat P_n(t,x,T,\xi)\,\hat\varphi(-\xi)\,d\xi.$$

Moreover, when evaluating the inverse transform, the number of dimensions over which one must integrate numerically is equal to the number of components of $x$ that appear in the option payoff $\varphi$. This is due to the fact that the Fourier transform of a constant is a Dirac delta function. In particular, let $\varphi(x) \equiv \bar\varphi(\bar x)$ with $\bar x = (x_1,\dots,x_{d'})$, for some $d' < d$. Then we have $\hat\varphi(\xi) = (2\pi)^{d-d'}\,\hat{\bar\varphi}(\bar\xi)\,\delta_0(\xi_{d'+1})\cdots\delta_0(\xi_d)$ with $\bar\xi = (\xi_1,\dots,\xi_{d'})$, and thus

$$\bar u_N(t,x) = \frac{1}{(2\pi)^{d'}}\int_{\mathbb{R}^{d'}}\sum_{n=0}^{N}\hat P_n\!\left(t,x,T,(\bar\xi,0)\right)\hat{\bar\varphi}(-\bar\xi)\,d\bar\xi.$$

2 This was one of the main points of the adjoint expansion method proposed by Pagliarani et al.
[21].

5 Example: Heston Model with Stochastic Jump-Intensity

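Consider the following model for an asset $S = e^{X}$, written under the pricing measure $\mathbb{Q}$ assuming zero interest rates:

$$dX_t = \left(-\frac{1}{2} - \int_{\mathbb{R}}\nu(d\zeta)\left(e^{\zeta} - 1 - \zeta\right)\right)Z_t\,dt + \sqrt{Z_t}\,dW_t + \int_{\mathbb{R}}\zeta\,d\tilde N(t,Z_t,dt,d\zeta),$$

$$dZ_t = \kappa(\theta - Z_t)\,dt + \delta\sqrt{Z_t}\,dB_t, \qquad d\langle W,B\rangle_t = \rho\,dt.$$

Note that, just as in the Heston model, the instantaneous volatility of $X$ is given by $\sqrt{Z_t}$, where $Z$ is a CIR process. Likewise, the instantaneous arrival rate of jumps of size $d\zeta$ is given by $Z_t\,\nu(d\zeta)$, where $\nu$ is a Lévy measure satisfying all of the usual integrability conditions. The generator $\mathcal{A}$ of the process $(X,Z)$ is given by

$$\mathcal{A} = z\left(\mu\,\partial_x + \frac{1}{2}\partial_x^2 + \int_{\mathbb{R}}\nu(d\zeta)\left(e^{\zeta\partial_x} - 1 - \zeta\partial_x\right)\right) + \kappa(\theta - z)\,\partial_z + \frac{1}{2}\delta^2 z\,\partial_z^2 + \rho\,\delta\,z\,\partial_x\partial_z,$$

$$\mu = -\frac{1}{2} - \int_{\mathbb{R}}\nu(d\zeta)\left(e^{\zeta} - 1 - \zeta\right).$$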

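The characteristic function $\hat P(t,x,z,T,\xi) := \mathbb{E}[e^{i\xi X_T}\,|\,X_t = x, Z_t = z]$ is obtained in Carr and Wu [3] by expressing the process $X$ as a time-changed Lévy process. One can also obtain the characteristic function by solving for the Fourier transform of the fundamental solution corresponding to the operator $(\partial_t + \mathcal{A})$. We have

$$\hat P(t,x,z,T,\xi) = e^{i\xi x + C(T-t,\xi) + z D(T-t,\xi)},$$

$$C(\tau,\xi) = \frac{\kappa\theta}{\delta^2}\left(\left(\kappa - \rho\delta i\xi + d(\xi)\right)\tau - 2\log\left[\frac{1 - f(\xi)e^{d(\xi)\tau}}{1 - f(\xi)}\right]\right),$$

$$D(\tau,\xi) = \frac{\kappa - \rho\delta i\xi + d(\xi)}{\delta^2}\;\frac{1 - e^{d(\xi)\tau}}{1 - f(\xi)e^{d(\xi)\tau}},$$

$$f(\xi) = \frac{\kappa - \rho\delta i\xi + d(\xi)}{\kappa - \rho\delta i\xi - d(\xi)},$$

$$d(\xi) = \sqrt{-2\delta^2\psi(\xi) + (\kappa - \rho i\xi\delta)^2},$$

$$\psi(\xi) = i\mu\xi - \frac{1}{2}\xi^2 + \int_{\mathbb{R}}\nu(d\zeta)\left(e^{i\xi\zeta} - 1 - i\xi\zeta\right).$$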

With an explicit expression for $\hat P(t,x,z,T,\xi)$ available, the price of a European call option can be computed using standard Fourier methods:

$$u(t,x,z) = \frac{1}{2\pi}\int_{\mathbb{R}} d\xi_r\,\hat P(t,x,z,T,\xi)\,\hat\varphi(-\xi), \qquad \hat\varphi(\xi) = \frac{-e^{k-ik\xi}}{i\xi+\xi^2}, \qquad \xi = \xi_r + i\xi_i,\quad \xi_i < -1. \tag{5.1}$$

Note that, since the call option payoff $\varphi(x) = (e^{x} - e^{k})^{+}$ is not in $L^1(\mathbb{R})$, its Fourier transform $\hat\varphi(\xi)$ must be computed in a generalized sense by fixing an imaginary component of the Fourier variable $\xi_i < -1$.
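To illustrate the mechanics of a pricing integral of the form (5.1), the following sketch swaps in the Black-Scholes characteristic function (a deliberate simplification, not the model of this section) so that the result can be compared with a closed-form price. It integrates the characteristic function times the closed-form expression $-e^{k-ik\xi}/(i\xi+\xi^2)$ along the line $\operatorname{Im}\xi = \xi_i < -1$; this reading of the transform convention is an assumption, and the names `fourier_call_price`, `cutoff` and `steps` are hypothetical.

```python
import cmath
import math


def bs_char_fn(xi, x, sigma, tau):
    """E[exp(i xi X_T)] for X_T = x - sigma^2 tau / 2 + sigma W_tau (zero rates)."""
    return cmath.exp(1j * xi * (x - 0.5 * sigma ** 2 * tau)
                     - 0.5 * sigma ** 2 * tau * xi * xi)


def payoff_transform(xi, k):
    """Generalized transform of the call payoff (e^x - e^k)^+ for Im(xi) < -1."""
    return -cmath.exp(k - 1j * k * xi) / (1j * xi + xi * xi)


def fourier_call_price(char_fn, k, xi_imag=-2.0, cutoff=60.0, steps=12000):
    """Trapezoidal rule over xi_r in [-cutoff, cutoff] along Im(xi) = xi_imag."""
    h = 2.0 * cutoff / steps
    total = 0.0 + 0.0j
    for n in range(steps + 1):
        xi = complex(-cutoff + n * h, xi_imag)
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * char_fn(xi) * payoff_transform(xi, k)
    return (total * h).real / (2.0 * math.pi)


def bs_call_price(x, k, sigma, tau):
    """Black-Scholes call price in log variables, zero interest rate."""
    sd = sigma * math.sqrt(tau)
    d1 = (x - k) / sd + 0.5 * sd
    d2 = d1 - sd
    cdf = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return math.exp(x) * cdf(d1) - math.exp(k) * cdf(d2)
```

A typical call is `fourier_call_price(lambda xi: bs_char_fn(xi, 0.0, 0.2, 1.0), 0.05)`, which should match `bs_call_price(0.0, 0.05, 0.2, 1.0)` closely. Any other characteristic function that is finite on the chosen line can be passed in the same way.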
Also of interest are sensitivities of option prices, or Greeks. In particular, consider the $\Delta$ and the $\Gamma$, which are defined as

$$\Delta(t,x,z) := \partial_s u(t,x(s),z) = e^{-x}\,\partial_x u(t,x,z), \tag{5.2}$$

$$\Gamma(t,x,z) := \partial_s^2 u(t,x(s),z) = e^{-2x}\left(\partial_x^2 - \partial_x\right)u(t,x,z), \tag{5.3}$$

where we have used $x(s) = \log s$. When computing terms of the form $\partial_x^m u(t,x,z)$, observe that the differential operator $\partial_x^m$ acts only on the characteristic function $\hat P$ appearing in (5.1) and not on the Fourier transform $\hat\varphi$ of the payoff $\varphi$. Likewise, when using Theorem 4.6 to compute $\partial_x^m \bar u_n(t,x,z) = \sum_{i=0}^{n}\partial_x^m u_i(t,x,z)$, the differential operator $\partial_x^m$ acts only on $\hat P_i$ in (4.24).
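The change of variables behind (5.2) and (5.3) can be sanity-checked with finite differences. The sketch below uses the Black-Scholes call price as a stand-in for $u$ (an assumption made only so that reference values for Delta and Gamma are available in closed form); the helper name `delta_gamma_from_log_price` and the step size `h` are illustrative.

```python
import math


def bs_call(x, k, sigma, tau):
    """Black-Scholes call price as a function of log spot x (zero rates)."""
    sd = sigma * math.sqrt(tau)
    d1 = (x - k) / sd + 0.5 * sd
    cdf = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return math.exp(x) * cdf(d1) - math.exp(k) * cdf(d1 - sd)


def delta_gamma_from_log_price(price, x, h=1e-3):
    """Apply (5.2) and (5.3) with central differences in the log spot x."""
    u_minus, u_mid, u_plus = price(x - h), price(x), price(x + h)
    u_x = (u_plus - u_minus) / (2.0 * h)
    u_xx = (u_plus - 2.0 * u_mid + u_minus) / (h * h)
    delta = math.exp(-x) * u_x
    gamma = math.exp(-2.0 * x) * (u_xx - u_x)
    return delta, gamma
```

In the pricing setting of this section one would instead differentiate under the Fourier integral, since $\partial_x^m$ only hits the factor $e^{i\xi x}$ in the characteristic function; the finite-difference version is merely the simplest way to confirm the formulas.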

Now, we specialize to the case where jumps are normally distributed:

$$\nu(d\zeta) = \frac{\lambda}{\sqrt{2\pi s^2}}\exp\left(\frac{-(\zeta - m)^2}{2s^2}\right)d\zeta.$$
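For normally distributed jumps the integral in $\psi(\xi)$ is available in closed form, $\int_{\mathbb{R}}\nu(d\zeta)(e^{i\xi\zeta} - 1 - i\xi\zeta) = \lambda\,(e^{im\xi - s^2\xi^2/2} - 1 - i\xi m)$, so the characteristic function given earlier in this section can be evaluated directly. The following is a minimal sketch under that specialization; the function names are hypothetical, the complex square root and logarithm use principal branches (which may need care for long maturities), and the formula is singular where $\psi(\xi) = 0$ (for instance at $\xi = 0$ and $\xi = -i$), because then $\kappa - \rho\delta i\xi - d(\xi) = 0$.

```python
import cmath
import math


def levy_exponent(xi, lam, m, s):
    """psi(xi) for the Gaussian jump measure with intensity lam, mean m, std s."""
    mu = -0.5 - lam * (math.exp(m + 0.5 * s * s) - 1.0 - m)
    jump_part = lam * (cmath.exp(1j * m * xi - 0.5 * s * s * xi * xi)
                       - 1.0 - 1j * xi * m)
    return 1j * mu * xi - 0.5 * xi * xi + jump_part


def riccati_coefficients(tau, xi, kappa, theta, delta, rho, lam, m, s):
    """Return (C(tau, xi), D(tau, xi)) from the closed-form expressions."""
    b = kappa - rho * delta * 1j * xi
    d = cmath.sqrt(b * b - 2.0 * delta ** 2 * levy_exponent(xi, lam, m, s))
    f = (b + d) / (b - d)
    growth = cmath.exp(d * tau)
    D = (b + d) / delta ** 2 * (1.0 - growth) / (1.0 - f * growth)
    C = kappa * theta / delta ** 2 * (
        (b + d) * tau - 2.0 * cmath.log((1.0 - f * growth) / (1.0 - f)))
    return C, D


def char_fn(xi, x, z, tau, **params):
    """exp(i xi x + C + z D), the characteristic function of X_T given (x, z)."""
    C, D = riccati_coefficients(tau, xi, **params)
    return cmath.exp(1j * xi * x + C + z * D)
```

One way to gain confidence in such an implementation is to check that $D$ and $C$ satisfy the ordinary differential equations implied by $(\partial_t + \mathcal{A})\hat P = 0$, namely $\partial_\tau D = \psi - (\kappa - \rho\delta i\xi)D + \tfrac12\delta^2 D^2$ and $\partial_\tau C = \kappa\theta D$, and that $\psi(-i) = 0$, which reflects the martingale property of $e^{X}$.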

In Fig. 1 we plot the implied volatility $\sigma$ corresponding to the exact price $u$ as well as the implied volatility $\bar\sigma_2$ corresponding to our second order approximation $\bar u_2$. To compute $\sigma$ we first compute option prices using (5.1); we then invert the Black-Scholes equation numerically in order to obtain the implied volatility $\sigma$. To compute our second order approximation of implied volatility $\bar\sigma_2$ we first compute our second order approximation for prices $\bar u_2$ using Theorem 4.6; we then invert the Black-Scholes equation numerically in order to obtain $\bar\sigma_2$. Values from Fig. 1 can be found in Table 1. In Fig. 2 we plot the exact $\Delta$ as well as our second order approximation $\bar\Delta_2$. In Fig. 3 we plot the exact $\Gamma$ as well as our second order approximation $\bar\Gamma_2$. Values from Figs. 2 and 3 are given in Tables 2 and 3 respectively. Exact Greeks are computed by combining (5.1)–(5.3). Approximate Greeks are computed by combining Theorem 4.6 and Eqs. (5.2) and (5.3).
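The numerical inversion of the Black-Scholes formula mentioned above can be done in many ways; the text does not say which root finder was used. The sketch below uses plain bisection, which is slow but robust because the call price is increasing in volatility. The bracket `[lo, hi]`, the tolerance and the function names are illustrative assumptions.

```python
import math


def bs_call(x, k, sigma, tau):
    """Black-Scholes call price with log spot x, log strike k, zero rates."""
    sd = sigma * math.sqrt(tau)
    d1 = (x - k) / sd + 0.5 * sd
    cdf = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))
    return math.exp(x) * cdf(d1) - math.exp(k) * cdf(d1 - sd)


def implied_vol(price, x, k, tau, lo=1e-6, hi=5.0, tol=1e-12):
    """Bisection on sigma; assumes bs_call(lo) <= price <= bs_call(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(x, k, mid, tau) < price:
            lo = mid  # price too low, so the volatility must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Applying `implied_vol` to an exact price and to an approximate price for the same strike and maturity gives the two implied volatilities whose relative difference is reported in Table 1.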

[Figure 1: four panels, t = 0.10, t = 0.25, t = 0.50 and t = 1.00; implied volatility plotted against log strike k. Panel annotation: ν(dζ) = λ/√(2πs²) · exp(−(ζ − m)²/(2s²)) dζ.]

Fig. 1 For the model considered in Sect. 5, we plot the implied volatility σ corresponding to the
exact option price u (solid black) as well as the implied volatility σ̄2 corresponding to our second
order option price approximation ū 2 (dashed black). The units of the horizontal axis are log strike
k := log K . Approximate prices are computed using the Taylor series expansion of A(t) as described
in Example 3.2. We assume the Lévy measure ν is as parametrized above. The following parameters
are used in all four plots: κ = 1.15, θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, x = 0, m = −0.1,
s = 0.2, λ = 2.0

Table 1 Exact implied vols σ, second order approximation σ̄_2 and relative error |(σ̄_2 − σ)/σ|

k − x                  −0.2     −0.15    −0.1     −0.05    0.00     0.05     0.1      0.15     0.2
t = 0.10   σ           0.2797   0.2478   0.2269   0.2133   0.2028   0.1940   0.1881   0.1960   0.2296
           σ̄_2         0.2795   0.2483   0.2271   0.2132   0.2028   0.1939   0.1877   0.1963   0.2324
           rel. err.   0.0006   0.0018   0.0009   0.0003   0.0002   0.0001   0.0020   0.0018   0.0120
t = 0.25   σ           0.2441   0.2323   0.2217   0.2120   0.2028   0.1941   0.1863   0.1805   0.1803
           σ̄_2         0.2456   0.2328   0.2215   0.2116   0.2025   0.1939   0.1859   0.1793   0.1799
           rel. err.   0.0059   0.0018   0.0013   0.0020   0.0013   0.0009   0.0021   0.0067   0.0027
t = 0.50   σ           0.2348   0.2266   0.2183   0.2101   0.202    0.1940   0.1864   0.1796   0.1743
           σ̄_2         0.2350   0.2254   0.2168   0.2088   0.201    0.1933   0.1856   0.1783   0.1723
           rel. err.   0.0005   0.0049   0.0069   0.0063   0.004    0.0037   0.0040   0.0070   0.0116
t = 1.00   σ           0.2268   0.2204   0.2138   0.2072   0.2005   0.1939   0.1875   0.1813   0.1757
           σ̄_2         0.2217   0.2149   0.2089   0.2031   0.1973   0.1914   0.1854   0.1794   0.1740
           rel. err.   0.0227   0.0246   0.0230   0.0197   0.0160   0.0130   0.0111   0.0103   0.0096

Parameters are the same as those in Fig. 1

[Figure 2: four panels, t = 0.10, t = 0.25, t = 0.50 and t = 1.00; Delta plotted against x.]

Fig. 2 For the model considered in Sect. 5, we plot the Delta Δ corresponding to the exact option price u (solid black) as well as the Delta Δ̄_2 corresponding to our second order option price approximation ū_2 (dashed black). The units of the horizontal axis are x. Approximate prices are computed using the Taylor series expansion of A(t) as described in Example 3.2. We assume the Lévy measure ν is as given in Fig. 1. The following parameters are used in all four plots: κ = 1.15, θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, k = 0, m = −0.1, s = 0.2, λ = 2.0

[Figure 3: four panels, t = 0.10, t = 0.25, t = 0.50 and t = 1.00; Gamma plotted against x.]

Fig. 3 For the model considered in Sect. 5, we plot the Gamma Γ corresponding to the exact option price u (solid black) as well as the Gamma Γ̄_2 corresponding to our second order option price approximation ū_2 (dashed black). The units of the horizontal axis are x. Approximate prices are computed using the Taylor series expansion of A(t) as described in Example 3.2. We assume the Lévy measure ν is as given in Fig. 1. The following parameters are used in all four plots: κ = 1.15, θ = 0.04, δ = 0.2, ρ = −0.7, z = θ, k = 0, m = −0.1, s = 0.2, λ = 2.0
Table 2 Exact Delta Δ, second order approximation Δ̄_2 and relative error |(Δ̄_2 − Δ)/Δ|

x                      −0.2     −0.15    −0.1     −0.05    0.00     0.05     0.1      0.15     0.2
t = 0.10   Δ           0.0008   0.00516  0.05084  0.2312   0.5370   0.8024   0.9385   0.9845   0.9959
           Δ̄_2         0.0009   0.00478  0.05081  0.2313   0.5368   0.8026   0.9387   0.9843   0.9958
           rel. err.   0.1309   0.07358  0.00048  0.0006   0.0003   0.0002   0.0002   0.0002   0.0000
t = 0.25   Δ           0.01311  0.05708  0.1690   0.3503   0.5559   0.7329   0.8563   0.9293   0.9672
           Δ̄_2         0.0114   0.05674  0.1696   0.3502   0.5552   0.7330   0.8576   0.9306   0.9673
           rel. err.   0.1305   0.00585  0.0035   0.0004   0.0012   0.0000   0.0014   0.0014   0.0000
t = 0.50   Δ           0.06608  0.1506   0.2767   0.4260   0.5739   0.7018   0.8014   0.8731   0.9215
           Δ̄_2         0.06425  0.1508   0.2766   0.4246   0.5719   0.7007   0.8027   0.8766   0.9256
           rel. err.   0.02773  0.0014   0.0003   0.0032   0.0034   0.0015   0.0015   0.0040   0.0044
t = 1.00   Δ           0.1708   0.2667   0.3760   0.4878   0.5927   0.6849   0.7618   0.8234   0.8713
           Δ̄_2         0.1662   0.2627   0.3710   0.4814   0.5857   0.6791   0.7595   0.8262   0.8789
           rel. err.   0.0268   0.01496  0.0131   0.0130   0.0117   0.0084   0.0030   0.0033   0.0088

Parameters are the same as those in Fig. 2

Table 3 Exact Gamma Γ, second order approximation Γ̄_2 and relative error |(Γ̄_2 − Γ)/Γ|

x                      −0.2     −0.15    −0.1     −0.05    0.00     0.05     0.1      0.15     0.2
t = 0.10   Γ           0.01828  0.2978   2.159    5.539    6.288    3.831    1.446    0.3779   0.0780
           Γ̄_2         0.01197  0.2897   2.1760   5.5300   6.288    3.841    1.437    0.3748   0.0821
           rel. err.   0.3452   0.0273   0.0077   0.0015   0.0001   0.0025   0.0061   0.0082   0.0518
t = 0.25   Γ           0.5185   1.705    3.337    4.275    3.967    2.884    1.738    0.906    0.4229
           Γ̄_2         0.5267   1.747    3.334    4.255    3.969    2.907    1.754    0.8925   0.4016
           rel. err.   0.0157   0.024    0.0009   0.0046   0.0003   0.0079   0.0094   0.0149   0.0503
t = 0.50   Γ           1.514    2.488    3.135    3.206    2.802    2.174    1.54     1.017    0.635
           Γ̄_2         1.585    2.508    3.109    3.182    2.804    2.208    1.588    1.045    0.6244
           rel. err.   0.0468   0.0079   0.0081   0.0076   0.0007   0.015    0.0309   0.0279   0.0167
t = 1.00   Γ           2.095    2.425    2.483    2.306    1.985    1.612    1.251    0.9364   0.6814
           Γ̄_2         2.134    2.418    2.452    2.280    1.988    1.656    1.331    1.028    0.7511
           rel. err.   0.0183   0.0032   0.0124   0.0110   0.0015   0.0276   0.0644   0.097    0.1023

Parameters are the same as those in Fig. 3

6 Conclusion

In this paper we derive a family of asymptotic expansions for European option prices when the underlying is modeled as a d-dimensional time inhomogeneous Lévy-type process. By combining the classical Dyson series expansion with a novel polynomial expansion of the generator, we obtain two equivalent representations for the approximate option price: (i) as an integro-differential operator acting on the order zero price, and (ii) as a Fourier transform. We implement our pricing approximation on a Heston-like model which allows for both stochastic volatility and stochastic jump intensity. We find that our second order expansion provides an excellent approximation for prices (as seen through corresponding implied volatilities), as well as for the Greeks Δ and Γ.

References

1. Benhamou, E., Gobet, E., Miri, M.: Smart expansion and fast calibration for jump diffusions.
Financ. Stoch. 13(4), 563–589 (2009)
2. Bompis, R., Gobet, E.: Asymptotic and non asymptotic approximations for option valuation.
Recent Developments in Computational Finance. Foundations, Algorithms and Applications,
pp. 159–241. World Scientific, Hackensack (2013)
3. Carr, P., Wu, L.: Time-changed Lévy processes and option pricing. J. Financ. Econ. 71(1),
113–141 (2004)
4. Corielli, F., Foschi, P., Pascucci, A.: Parametrix approximation of diffusion transition densities.
SIAM J. Financ. Math. 1, 833–867 (2010)
5. Deuschel, J.-D., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1),
40–82 (2014)

6. Fouque, J.-P., Papanicolaou, G., Sircar, R., Solna, K.: Multiscale Stochastic Volatility for Equity,
Interest Rate, and Credit Derivatives. Cambridge University Press, Cambridge (2011)
7. Friz, P. K., Gerhold, S., Yor, M.: How to make Dupire’s local volatility work with jumps. Quant.
Financ. 14(8), 1327–1331 (2014)
8. Heston, S.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
9. Jeanblanc, M., Yor, M., Chesney, M.: Mathematical Methods for Financial Markets. Springer,
London (2009)
10. Kato, T.: Perturbation Theory for Linear Operators. Classics in Mathematics. Springer, Berlin
(1995). Reprint of the 1980 edition
11. Lorig, M.: Pricing derivatives on multiscale diffusions: an eigenfunction expansion approach.
Math. Finance 24(2), 331–363 (2014)
12. Lorig, M., Lozano-Carbassé, O.: Multiscale exponential Lévy models. Quant. Finance 15(1),
91–100 (2015)
13. Lorig, M., Pagliarani, S., Pascucci, A.: Analytical expansions for parabolic equations. SIAM
J. Appl. Math. 75(2), 468–491 (2015)
14. Lorig, M., Pagliarani, S., Pascucci, A.: Explicit implied volatilities for multifactor local-
stochastic volatility models. Math. Finance (to appear) (2015). ArXiv preprint arXiv:1306.5447
15. Lorig, M., Pagliarani, S., Pascucci, A.: A family of density expansions for Lévy-type processes
with default. Ann. Appl. Probab. 25(1), 235–267 (2015)
16. Lorig, M., Pagliarani, S., Pascucci, A.: Pricing approximations and error estimates for local
Lévy-type models with default. Comp. Math. App. 69(10), 1189–1219 (2015)
17. Lorig, M., Pagliarani, S., Pascucci, A.: A Taylor series approach to pricing and implied vol for
LSV models. J. Risk 17(2), 1–17 (2014)
18. Øksendal, B., Sulem, A.: Applied Stochastic Control of Jump Diffusions. Springer, Berlin
(2005)
19. Pagliarani, S., Pascucci, A.: Analytical approximation of the transition density in a local volatil-
ity model. Cent. Eur. J. Math. 10(1), 250–270 (2012)
20. Pagliarani, S., Pascucci, A.: Local stochastic volatility with jumps: analytical approximations.
Int. J. Theor. Appl. Financ. 16(8), 1–35 (2013)
21. Pagliarani, S., Pascucci, A., Riga, C.: Adjoint expansions in local Lévy models. SIAM J. Financ.
Math. 4, 265–296 (2013)
22. Pascucci, A.: PDE and Martingale Methods in Option Pricing. Bocconi & Springer Series, vol.
2. Springer, Milan (2011)
23. Sakurai, J.J., Tuan, S.F.: Modern Quantum Mechanics, vol. 104. Addison-Wesley, Reading
(Mass.) (1994)
Asymptotic Expansion Approach in Finance

Akihiko Takahashi

Abstract This paper provides a survey on an asymptotic expansion approach to


valuation and hedging problems in finance. The asymptotic expansion is a widely
applicable methodology for analytical approximations of expectations of certain
Wiener functionals. Hence not only academic researchers but also practitioners have
been applying the scheme to a variety of problems in finance such as pricing and
hedging derivatives under high-dimensional stochastic environments. The present
note gives an overview of the approach.

Keywords Asymptotic expansion · Derivatives · Option pricing · Hedge · Greeks ·


Stochastic volatility · Interest rate · Term structure model · Malliavin calculus ·
Watanabe theory

1 Introduction

Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\in[0,T]}, P)$ denote a probability space with filtration, on which an $r$-dimensional standard Wiener process $W$ is defined, where $P$ is an appropriate pricing measure (a risk neutral measure) in finance, and $T$ denotes some positive constant. Now, let $F(\omega)$ be a Wiener functional; then $V$, the security or portfolio value, can be expressed as $V = E[F(\omega)]$ under certain conditions. Evaluating this expectation is one of the main issues in finance. Moreover, if $F$ depends on a parameter $\theta$, computation of $\frac{\partial V}{\partial\theta} = \frac{\partial}{\partial\theta}E[F(\omega;\theta)]$, the sensitivity of the security value with respect to a change in this parameter (a so-called Greek), is also an important task in practice.

I dedicate this note to the late Professor Peter Laurence and Koji Takahashi.
I am very grateful to Professor Fujii, Professor Shiraya, Professor Takehara, Dr. Toda,
Dr. Tsuzuki and Professor Yamada, my coauthors in the original articles, which are main bases
for this survey.

A. Takahashi (B)
Graduate School of Economics, The University of Tokyo, 7-3-1, Hongo,
Tokyo, Bunkyo-ku 113-0033, Japan
e-mail: [email protected]

© Springer International Publishing Switzerland 2015


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_13

As an example, let us consider a d-dimensional diffusion process $X^{(\epsilon)}$ which is obtained as a strong solution to the stochastic differential equation

$$dX_t^{(\epsilon)} = V_0(X_t^{(\epsilon)},\epsilon)\,dt + V(X_t^{(\epsilon)},\epsilon)\,dW_t, \quad t\in[0,T]; \qquad X_0^{(\epsilon)} = x_0,$$

where $\epsilon\in[0,1]$ is a known parameter. Here, the coefficients are assumed to satisfy some regularity conditions. In finance, many problems of pricing derivatives and evaluating portfolios in investment theory reduce to computing $E[f(X_T^{(\epsilon)})]$, the expectation of $f(X_T^{(\epsilon)})$, which is a function of $X_T^{(\epsilon)}$.
In finance applications, it is important to deal with not only a smooth function
f (x) but also non-smooth one. For example, when various options are evaluated,
f is expressed as f = T ◦ g, where T (x) = max{x, 0} and g stands for a smooth
function of Rd → R. In general, it is difficult to represent this expectation explicitly
except for special cases. Hence, numerical methods such as Monte Carlo simulations
or numerical solutions of partial differential equations (PDEs) are employed and
various speeding up techniques are developed, since fast and precise computation is
required in practice.
As a different approach, an approximation of the expectation by an asymptotic expansion of the stochastic differential equation around $\epsilon = 0$ may be considered. Furthermore, because $\frac{\partial}{\partial x_0}E[f(X_T^{(\epsilon)})]$ and $\frac{\partial}{\partial\epsilon}E[f(X_T^{(\epsilon)})]$, the sensitivities of the security value with respect to changes in the initial value $x_0$ and in the parameter $\epsilon$, are important indicators for practical purposes, highly accurate approximations of them are valuable. Moreover, some schemes that combine Monte Carlo simulations with low-order asymptotic expansions have been developed, since the asymptotic expansion up to the first or second order can be easily evaluated. Those schemes are able to improve both the efficiency of Monte Carlo simulations and the accuracy of the approximations obtained by the asymptotic expansions.
An asymptotic expansion approach in finance has been developed over the past two decades, and it is mathematically justified by Watanabe theory (Watanabe [111]) in Malliavin calculus (e.g. Malliavin [64], Chap. V-8 in Ikeda and Watanabe [39], Nualart [72]). To the best of our knowledge, the asymptotic expansion technique was first applied to finance for the evaluation of average options, which are popular derivatives in commodity markets. Kunitomo and Takahashi [48] and Takahashi [85] derive approximation formulas for average options by an asymptotic expansion method based on log-normal approximations of average price distributions, when the underlying asset prices follow geometric Brownian motions. Yoshida [119] derives an asymptotic expansion of an average option price around a normal distribution for a general diffusion model, which is a byproduct of his result in statistics [118] based on the Watanabe theory.
Thereafter, the asymptotic expansion approach has been applied to a broad class of valuation problems in finance, which includes pricing options with stochastic volatility models, pricing options under Heath-Jarrow-Morton (HJM) models [37] or Libor market models (LMM) (Brace, Gatarek and Musiela [7], Jamshidian [43]) of interest rates, and pricing so-called exotic-type options such as basket and barrier options in addition to average options.
For instance, please see Kawai [44], Kobayashi, Takahashi and Tokioka [45],
Kunitomo and Takahashi [49–51], Li [59], Matsuoka, Takahashi and Uchida [66],
Muroi [67], Nishiba [71], Osajima [75], Shiraya and Takahashi [78–80], Shiraya,
Takahashi and Toda [81], Shiraya, Takahashi and Yamada [83], Shiraya, Takahashi
and Yamazaki [82], Takahashi and Matsushima [88], Takahashi and Saito [89], Taka-
hashi and Takehara [90–94], Takahashi, Takehara and Toda [90, 91], Takahashi and
Tsuzuki [98], Takahashi and Uchida [99], Takahashi and Yamada [100–104], Taka-
hashi and Yoshida [106, 107], Takahashi and Takehara [92, 93], Violante [110], Xu
and Zheng [112, 113], and Takahashi [86, 87].
We briefly introduce some of the above works in Sect. 3.6. Moreover, we remark
that the asymptotic expansion approach is employed by Yamanobe [116, 117] in
physics for analyses of the impulse-driven stochastic biological oscillator and global
dynamics of a stochastic neuronal oscillator.
We also note that there exist many other types of expansion/perturbation
methods which have turned out to be useful for applications in finance. For example,
see Bayer and Laurence [2], Ben Arous and Laurence [3], Benaim, Friz and
Lee [4], Col, Gnoatto and Grasselli [9], Davydov and Linetsky [11], Deuschel, Friz,
Jacquier and Violante [12, 13], Forde and Jacquier [18], Forde, Jacquier and Lee [17],
Foschi, Pagliarani, Pascucci [19], Fouque, Papanicolaou and Sircar [20, 21], Fujii
[24], Fujii and Takahashi [25–27, 29], Gatheral, Hsu, Laurence, Ouyang, and Wang
[30], Gnoatto and Grasselli [31], Gulisashvili [32], Hagan, Kumar, Lesniewski and
Woodward [33], Henry-Labordère [38], Kato, Takahashi and Yamada [46, 47],
Kusuoka and Osajima [57], Lee [58], Lipton [60], Linetsky [61], Osajima [76],
Pagliarani and Pascucci [77], Siopacha and Teichmann [84], Yamamoto, Sato and
Takahashi [114], Yamamoto and Takahashi [115], and references therein.
The organization of the paper is as follows. The next section describes the outline
of the asymptotic expansion approach in a general diffusion setting. Then, Sect. 3
explains a computational scheme for the expansion method. Section 4 provides an
extension of the general computational scheme in the previous section, and Sect. 5
briefly introduces two improvement schemes for the expansion method. Section 6
extends the approach to non-diffusion Wiener functionals by using an instantaneous
forward rates model as an example. Sections 7 and 8 introduce an asymptotic expan-
sion in jump-diffusion models and a perturbation scheme in forward backward sto-
chastic differential equations (FBSDEs). Section 9 concludes.

2 Asymptotic Expansion in General Diffusion Setting

Following [87, 96], this section briefly describes an asymptotic expansion method
in a general diffusion setting.

Let us consider a d-dimensional diffusion process $X_t^{(\epsilon)} = (X_t^{(\epsilon),1},\dots,X_t^{(\epsilon),d})^{\top}$ which is the solution to the following stochastic differential equation:

$$dX_t^{(\epsilon),j} = V_0^{j}(X_t^{(\epsilon)},\epsilon)\,dt + \epsilon V^{j}(X_t^{(\epsilon)})\,dW_t \quad (j = 1,\dots,d), \qquad X_0^{(\epsilon)} = x_0\in\mathbb{R}^d, \tag{1}$$

where $W = (W^1,\dots,W^r)^{\top}$ is an $r$-dimensional standard Wiener process, and $\epsilon\in(0,1]$ is a known parameter. Here, $x^{\top}$ denotes the transpose of $x$. Next, let us define $V_0 = (V_0^1,\dots,V_0^d)^{\top} : \mathbb{R}^d\times(0,1]\to\mathbb{R}^d$ and $V : \mathbb{R}^d\to\mathbb{R}^d\otimes\mathbb{R}^r$ whose jth row is $V^{j}$, $j = 1,\dots,d$. Suppose also that $V_0$ and $V$ satisfy some regularity conditions. (For example, $V_0$ and $V$ are smooth functions with bounded derivatives of all orders.)
Next, let a function $g : \mathbb{R}^d\to\mathbb{R}$ be smooth and let all of its derivatives have polynomial growth. Then, a smooth Wiener functional $g(X_T^{(\epsilon)})$ has the asymptotic expansion

$$g(X_T^{(\epsilon)}) \sim g_{0T} + \epsilon g_{1T} + \epsilon^2 g_{2T} + \cdots$$

in $\mathbf{D}^{\infty}$ as $\epsilon\downarrow 0$, where $g_{0T}, g_{1T}, g_{2T},\dots\in\mathbf{D}^{\infty}$. For any $k\in\mathbb{N}$, $q\in(1,\infty)$ and $s > 0$, this expansion means that

$$\frac{1}{\epsilon^{k}}\left\| g(X_T^{(\epsilon)}) - \left(g_{0T} + \epsilon g_{1T} + \cdots + \epsilon^{k-1}g_{k-1,T}\right)\right\|_{q,s} = O(1) \quad (\text{as } \epsilon\downarrow 0),$$

where $\|G\|_{q,s}$ represents the sum of $L^q$-norms of Malliavin derivatives of a Wiener functional $G$ up to the sth order. Further, a Banach space $\mathbf{D}_{q,s} = \mathbf{D}_{q,s}(\mathbb{R})$ can be regarded as the totality of random variables bounded with respect to the $(q,s)$-norm $\|\cdot\|_{q,s}$, and $\mathbf{D}^{\infty} = \cap_{s>0}\cap_{1<q<\infty}\mathbf{D}_{q,s}$. The coefficients $g_{nT}\in\mathbf{D}^{\infty}$ $(n = 0,1,\dots)$ in the expansion can be obtained by Taylor's formula and represented based on multiple Wiener-Itô integrals. For the details of definitions and proofs above, please consult Watanabe [111], Chap. V of Ikeda and Watanabe [39], Malliavin [64], or Chap. 7 of Malliavin and Thalmaier [65].

Remark 1 As an example of applications in finance, suppose $X^{(\epsilon)}$ consists of $n$ stocks, $X^{(\epsilon)} = (S_1^{(\epsilon)},\dots,S_n^{(\epsilon)})^{\top}$, and $g(\cdot)$ is their weighted sum $g(x) = w_1x_1 + \cdots + w_nx_n$ for $x = (x_1,\dots,x_n)^{\top}$ with constant weights $w_i$ $(i = 1,\dots,n)$. Then, $g(x)$ would represent the spread, the average or the basket price of the stock prices.
As another example, we can take $X^{(\epsilon)}$ to be a vector of $N$ forward Libor rates, $X^{(\epsilon)} = (L_1^{(\epsilon)},\dots,L_N^{(\epsilon)})^{\top}$, and

$$g(X_T^{(\epsilon)}) = SR_T^{(\epsilon)} = \frac{1 - \prod_{j=0}^{N-1}\frac{1}{1+\tau L_{jT}^{(\epsilon)}}}{\tau\sum_{i=0}^{N-1}\prod_{j=0}^{i}\frac{1}{1+\tau L_{jT}^{(\epsilon)}}},$$

that is, a swap rate with inception date $T$ and maturity date $T_N = T + N\tau$. Here, $L_{jT}^{(\epsilon)}$ stands for the forward Libor rate at $T$ fixing at $T + j\tau$ with tenor $\tau$.

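Let $A_{kt} = \frac{1}{k!}\left.\frac{\partial^{k}X_t^{(\epsilon)}}{\partial\epsilon^{k}}\right|_{\epsilon=0}$ and let $A_{kt}^{j}$, $j = 1,\dots,d$, denote the jth element of $A_{kt}$. In particular, $A_{1t}$ is represented by

$$A_{1t} = \int_0^t Y_tY_u^{-1}\left(\partial_\epsilon V_0(X_u^{(0)},0)\,du + V(X_u^{(0)})\,dW_u\right), \tag{2}$$

where $Y$ denotes the solution to the ordinary differential equation

$$dY_t = \partial V_0(X_t^{(0)},0)\,Y_t\,dt; \qquad Y_0 = I_d.$$

Here, $\partial V_0$ denotes the $d\times d$ matrix whose $(j,k)$-element is $\partial_kV_0^{j} = \frac{\partial V_0^{j}(x,\epsilon)}{\partial x_k}$, $V_0^{j}$ is the jth element of $V_0$, and $I_d$ denotes the $d\times d$ identity matrix.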
j
For k ≥ 2, Akt , j = 1, . . . , d is recursively determined by the following equation:
 t
1
∂k V0 (X (0) , 0)du
j j
Akt =
k! 0
⎛ ⎞
 (l)
k   t β
1 1 ⎝ β
Al jju ⎠ ∂  ∂k−l V0 (X u(0) , 0)du
d j
+
(k − l)! β! 0 dβ
l=1 lβ ,dβ j=1
⎛ ⎞
(k−1)
  t β
1 ⎝ β
Al jju ⎠ ∂  V j (X u(0) )dWu ,
d
+ (3)
β! 0 dβ
lβ ,dβ j=1

∂l β ∂β
where ∂l = ∂l
, ∂ = ∂xd1 ···∂xdβ ,

(l)
 
l  
:= (4)
lβ ,dβ β=1 lβ ∈L l,β dβ ∈{1,...,d}β

for l ≥ 1, ⎧ ⎫
⎨ β
 ⎬
L l,β := lβ = (l1 , . . . , lβ ); l j = l; (l, l j , β ∈ N) , (5)
⎩ ⎭
j=1

and for l = 0,

(0)
   
= .
lβ ,dβ β=0 l0 =(∅) d0 =(∅)
350 A. Takahashi

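Then, $g_{0T}$ and $g_{1T}$ can be written as
\[
g_{0T} = g(X_T^{(0)}),
\]
\[
g_{1T} = \sum_{j=1}^{d} \partial_j g(X_T^{(0)}) A_{1T}^j,
\]
where $\partial_j g(x) = \frac{\partial}{\partial x_j} g(x)$, $j = 1, \ldots, d$.

For $n \ge 2$, $g_{nT}$ is expressed as follows:
\[
g_{nT} = \sum_{\vec{l}_\beta, \vec{d}_\beta}^{(n)} \frac{1}{\beta!} \partial_{\vec{d}_\beta}^{\beta} g(X_T^{(0)}) A_{l_1 T}^{d_1} \cdots A_{l_\beta T}^{d_\beta}. \tag{6}
\]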

Here, we note that each $A_{lt}^i$ $(i = 1, \ldots, d$, $l = 1, 2, \ldots, k$, $0 \le t \le T)$ has finite moments of all orders due to a grading structure. We describe the definition of the grading structure by following pp. 45–47 in Bichteler, Gravereaux and Jacod [5]. Consider the stochastic differential equation of the form:
\[
dS_t = \mu(S_t, t)\,dt + \sigma(S_t, t)\,dW_t; \quad S_0 = s_0 \in \mathbb{R}^d, \tag{7}
\]
where $\mu : \mathbb{R}^d \times \mathbb{R}_+ \to \mathbb{R}^d$ and $\sigma : \mathbb{R}^d \times \mathbb{R}_+ \to \mathbb{R}^d \otimes \mathbb{R}^r$.
Definition 1 A grading of $\mathbb{R}^d$ is a decomposition $\mathbb{R}^d = \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_q}$ with $d = d_1 + \cdots + d_q$. The coordinates of a point in $\mathbb{R}^d$ are always arranged in increasing order along the subspaces $\mathbb{R}^{d_i}$, and we set $M_0 = 0$ and $M_l = d_1 + \cdots + d_l$ for $1 \le l \le q$. We say that the coefficients $\mu$ and $\sigma$ are graded according to the grading $\mathbb{R}^d = \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_q}$ if $\mu^i(x, t)$ and $\sigma_j^i(x, t)$, $j = 1, \ldots, r$ depend on $x$ only through the coordinates $(x^k)_{1 \le k \le M_p}$ when $M_{p-1} \le i \le M_p$.
Theorem 1 We assume that the coefficients $\mu$ and $\sigma$ in (7) are graded according to $\mathbb{R}^d = \mathbb{R}^{d_1} \times \cdots \times \mathbb{R}^{d_q}$. Moreover, for $F(x, t) = \mu(x, t)$ or $\sigma_j(x, t)$, $j = 1, \ldots, r$, we assume that $F$ is differentiable in $x$ on $\mathbb{R}^d$ and
1. $|F^i(0, t)| \le Z_t$ for $i = 1, \ldots, d$
2. $\left| \frac{\partial}{\partial x_j} F^i(x, t) \right| \le \hat{Z}_t (1 + |x|^\theta)$ for all $i, j$
3. $\left| \frac{\partial}{\partial x_j} F^i(x, t) \right| \le \zeta$ if $M_{p-1} \le i, j \le M_p$ for some $p \le q$,

where $\zeta, \theta \ge 0$ are constants, and $Z$, $\hat{Z}$ are predictable processes such that $\|Z\|_p$ and $\|\hat{Z}\|_p$ are finite for all $p \ge 1$, where $\|Z\|_p = \left( \int_0^T E[|Z_t|^p]\,dt \right)^{1/p}$. Then (7) has a unique solution $S$, and for every $p \ge 1$ there are constants $c_p$ and $\gamma_p$ depending only upon $(\zeta, \theta, \{\|\hat{Z}\|_p\}_{p \ge 1})$, such that
\[
\left\| \sup_{0 \le t \le T} |S_t| \right\|_{L^p} \le c_p \left( |s_0| + \|Z\|_{\gamma_p} \right).
\]

For the details of the definition and theorem above, see pp. 45–47 in Bichteler, Gravereaux and Jacod [5].
Applying Theorem 1 to the system of stochastic differential equations consisting of $A_{lt}^i$ $(i = 1, \ldots, d$, $l = 1, \ldots, k$, $0 \le t \le T)$ as well as any products of them, we obtain the following lemma.

Lemma 1 Each coefficient in the expansion, $A_{lt}^i$ $(i = 1, \ldots, d$, $l = 1, \ldots, k$, $0 \le t \le T)$, has finite moments of all orders.

(Proof) We consider the system of stochastic differential equations (SDEs) for $A_1^1, \ldots, A_1^d, A_1^1 A_1^1, \ldots, A_1^d A_1^d, A_2^1, \ldots, A_2^d, \ldots$. Then, the coefficients of the SDEs are represented by the derivatives at $\epsilon = 0$ of $\tilde{V}_0(X_u^{(\epsilon)}, \epsilon)$ and $\tilde{V}(X_u^{(\epsilon)})$, which are bounded in $[0, T]$. Moreover, it is easily shown that the coefficients of the equations are graded and satisfy the conditions in Theorem 1. Hence each coefficient in the expansion, $A_{kt}^i$, has finite moments of all orders. □

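Next, let us normalize $g(X_T^{(\epsilon)})$ to
\[
G^{(\epsilon)} = \frac{g(X_T^{(\epsilon)}) - g_{0T}}{\epsilon}
\]
for $\epsilon \in (0, 1]$. Then, we have
\[
G^{(\epsilon)} \sim g_{1T} + \epsilon g_{2T} + \cdots
\]
in $D^\infty$.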
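Next, for $h \in H$, where $H$ denotes the Cameron-Martin subspace of the $r$-dimensional Wiener space, the $H$-derivative of $G^{(\epsilon)}$ is expressed as
\[
D_h G^{(\epsilon)} = \frac{1}{\epsilon} \sum_{i=1}^{d} \partial_i g(X_T^{(\epsilon)}) D_h X_T^{(\epsilon),i} = \sum_{i=1}^{d} \partial_i g(X_T^{(\epsilon)}) \int_0^T \left[ Y_T^{(\epsilon)} (Y_t^{(\epsilon)})^{-1} V(X_t^{(\epsilon)}) \dot{h}_t \right]^i dt,
\]
where $Y^{(\epsilon)}$ is the $\mathbb{R}^d \otimes \mathbb{R}^d$-valued stochastic process which is the solution to the stochastic differential equation:
\[
dY_t^{(\epsilon)} = \partial V_0(X_t^{(\epsilon)}, \epsilon) Y_t^{(\epsilon)}\,dt + \epsilon \sum_{i=1}^{m} \partial V_i(X_t^{(\epsilon)}) Y_t^{(\epsilon)}\,dw_t^i; \quad Y_0^{(\epsilon)} = I_d.
\]
In fact, $Y_t = Y_t^{(0)}$. Here, $\partial V_i$ $(i = 0, 1, \ldots, m)$ denotes the $d \times d$ matrix whose $(j, k)$-element is $\partial_k V_i^j$ $\left( \partial_k = \frac{\partial}{\partial x_k} \right)$.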
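Moreover, with the notation $\hat{V}_t^{(\epsilon)}$ defined by
\[
\hat{V}_t^{(\epsilon)} = \left( \partial g(X_T^{(\epsilon)}) \right)' \left[ Y_T^{(\epsilon)} (Y_t^{(\epsilon)})^{-1} V(X_t^{(\epsilon)}) \right],
\]
where $\left( \partial g(X_T^{(\epsilon)}) \right)' = (\partial_1 g(X_T^{(\epsilon)}), \ldots, \partial_d g(X_T^{(\epsilon)}))$, the Malliavin (co)variance of $G^{(\epsilon)}$ is given by
\[
\sigma_{G^{(\epsilon)}} = \int_0^T \hat{V}_t^{(\epsilon)} (\hat{V}_t^{(\epsilon)})'\,dt. \tag{8}
\]
Moreover, let
\[
\hat{V}_t := \hat{V}_t^{(0)} = \left( \partial g(X_T^{(0)}) \right)' \left[ Y_T Y_t^{-1} V(X_t^{(0)}) \right]
\]
and make the following assumption:
\[
\text{(Assumption 1)} \quad \Sigma_T = \int_0^T \hat{V}_t \hat{V}_t'\,dt > 0.
\]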

Note that $g_{1T}$ follows a normal distribution with variance $\Sigma_T$, and the density function of $g_{1T}$, denoted by $f_{g_{1T}}(x)$, is given as
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right),
\]
where
\[
C := \left( \partial g(X_T^{(0)}) \right)' \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\,dt. \tag{9}
\]
Since $\Sigma_T$ is the variance of the random variable $g_{1T}$, which follows a normal distribution, (Assumption 1) means that the distribution of $g_{1T}$ does not degenerate. In applications this condition is easy to check in most cases, which makes it important for practical purposes.
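As a small illustration of how $\Sigma_T$ and $C$ are evaluated in practice, the following sketch computes both integrals by the trapezoidal rule. The one-dimensional model is an assumption made here for illustration only: $V_0(x, \epsilon) = \mu x$ (no dependence on $\epsilon$), $V(x) = \sigma x$ and $g(x) = x$, so that $X_t^{(0)} = x_0 e^{\mu t}$, $Y_t = e^{\mu t}$, and the closed forms are $\Sigma_T = \sigma^2 x_0^2 e^{2\mu T} T$ and $C = 0$. The function name is hypothetical.

```python
import math


def sigma_T_and_C(x0: float, mu: float, sigma: float, T: float, steps: int = 1000):
    """Trapezoidal evaluation of Sigma_T and C for the illustrative 1-d model
    V_0(x, eps) = mu * x, V(x) = sigma * x, g(x) = x (an assumption, see lead-in)."""
    dt = T / steps
    y_T = math.exp(mu * T)           # Y_T
    dg = 1.0                         # derivative of g(x) = x
    sigma_T = 0.0
    c = 0.0
    for k in range(steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, steps) else 1.0
        x_t = x0 * math.exp(mu * t)  # deterministic limit X_t^(0)
        y_t = math.exp(mu * t)       # Y_t
        v_hat = dg * y_T / y_t * sigma * x_t
        d_eps_v0 = 0.0               # V_0 does not depend on eps in this model
        sigma_T += weight * v_hat * v_hat * dt
        c += weight * dg * y_T / y_t * d_eps_v0 * dt
    return sigma_T, c


sigma_T, c = sigma_T_and_C(x0=100.0, mu=0.01, sigma=0.2, T=1.0)
closed_form = 0.2 ** 2 * 100.0 ** 2 * math.exp(2 * 0.01 * 1.0) * 1.0
```

Checking (Assumption 1) then amounts to confirming that the computed `sigma_T` is strictly positive.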
Next, let us briefly introduce a truncated version of the Watanabe theory [111] based on Yoshida [118, 119]. Under (Assumption 1), $\sigma_{G^{(\epsilon)}}$ is uniformly non-degenerate on $\{|\eta_c^{(\epsilon)}| \le 1\}$; that is, it can be shown that there exists a positive real number $c_0 > 0$ such that for any $c > c_0$ and $p > 1$,
\[
\sup_{\epsilon \in (0,1]} E\left[ 1_{\{|\eta_c^{(\epsilon)}| \le 1\}} (|\sigma_{G^{(\epsilon)}}|)^{-p} \right] < \infty, \tag{10}
\]
where $\eta_c^{(\epsilon)} = c \int_0^T |\hat{V}_t^{(\epsilon)} - \hat{V}_t|\,dt$.
Let $\mathcal{S}$ be the real Schwartz space of rapidly decreasing $C^\infty$-functions on $\mathbb{R}$ and $\mathcal{S}'$ be its dual space. Then, for $\Phi : \mathbb{R} \to \mathbb{R}$, $\Phi \in \mathcal{S}'$, a composite function $\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)} = \psi(\eta_c^{(\epsilon)}) \Phi(G^{(\epsilon)})$ is well-defined as an element of $\tilde{D}^{-\infty} = \cup_{s<0} \cap_{1<p<\infty} D_{p,s}$. Here, $\psi(x)$, $x \in \mathbb{R}$ denotes a smooth function with $0 \le \psi(x) \le 1$, defined as $\psi(x) = 1$ for $|x| \le 1/2$ and $\psi(x) = 0$ for $|x| \ge 1$. Here, the Banach space $D_{p,s}$, $s < 0$ is the dual space of $D_{q,-s}(\mathbb{R})$ $(q = p/(p-1))$.

Moreover, the coupling with the function 1 is well-defined; it is called the generalized expectation and is written as $E[\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)}]$. Further, $\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)}$ can be expanded in $\tilde{D}^{-\infty}$.
In addition, it can be shown that $\{\eta_c^{(\epsilon)}(w);\ \epsilon \in (0, 1]\} \subset D^\infty$, that $\eta_c^{(\epsilon)}(w)$ is $O(1)$ in $D^\infty$ as $\epsilon \downarrow 0$, and that for any $a_0 > 0$ there exist positive constants $a_i$, $i = 1, 2, 3$ such that $P(\{|\eta_c^{(\epsilon)}| > a_0\}) \le a_1 \exp(-a_2 \epsilon^{-a_3})$. Hence, for any $k = 1, 2, \ldots$, we have
\[
\lim_{\epsilon \downarrow 0} \frac{P(|\eta_c^{(\epsilon)}| > \frac{1}{2})}{\epsilon^k} < \infty.
\]
This means that the probability of the events truncated by $\psi(\eta_c^{(\epsilon)})$ is smaller than any polynomial order of $\epsilon$. Then, in the expansion of $\psi(\eta_c^{(\epsilon)}) \Phi \circ G^{(\epsilon)}$, the coefficients expressed as generalized Wiener functionals belonging to $\tilde{D}^{-\infty}$ can be written by applying Taylor's formula to $\Phi(g_{0T} + \epsilon g_{1T} + \epsilon^2 g_{2T} + \cdots)$. Therefore, the asymptotic expansion of the expectation $E[\Phi(G^{(\epsilon)})]$ can be obtained relatively easily. For the details of the Watanabe theory and its truncated version above, please consult Watanabe [111] and Yoshida [118, 119]. For its application to valuation problems in finance, please also see [50].
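In particular, if we take the delta function at $y \in \mathbb{R}$, $\delta_y$, as $\Phi$, that is $\Phi(x) = \delta_y(x)$, we obtain an asymptotic expansion of the density function of $G^{(\epsilon)}$. Moreover, because functions such as $\Phi(x) = \max\{x, 0\}$, which are measurable but not smooth, frequently appear in finance, the framework mentioned above is necessary for the asymptotic expansion.
For instance, when we take $\max\{x, 0\}$, $\min\{x, 0\}$ or $\delta_y(x)$ as $\Phi(x)$ for a useful application in finance, the expectation of $\Phi(G^{(\epsilon)})$ is expanded as follows: for $N = 0, 1, 2, \ldots$,
\[
\begin{aligned}
E[\Phi(G^{(\epsilon)})] &= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} E\left[ \Phi^{(m)}(g_{1T}) \left( \prod_{j=1}^{m} g_{(k_j+1)T} \right) \right] + o(\epsilon^N) \\
&= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} E\left[ \Phi^{(m)}(g_{1T}) X^{\vec{k}_m} \right] + o(\epsilon^N) \\
&= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} \int_{-\infty}^{\infty} \Phi^{(m)}(x) E[X^{\vec{k}_m} \mid g_{1T} = x] f_{g_{1T}}(x)\,dx + o(\epsilon^N) \\
&= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} \int_{-\infty}^{\infty} \Phi(x) (-1)^m \frac{d^m}{dx^m} \left\{ E[X^{\vec{k}_m} \mid g_{1T} = x] f_{g_{1T}}(x) \right\} dx + o(\epsilon^N),
\end{aligned} \tag{11}
\]
where $\Phi^{(m)}(g_{1T}) = \frac{d^m \Phi(x)}{dx^m} \Big|_{x = g_{1T}}$, $\sum_{\vec{k}_m}^{(n)} = \sum_{m=1}^{n} \sum_{\vec{k}_m \in L_{n,m}}$, and
\[
X^{\vec{k}_m} := \prod_{j=1}^{m} g_{(k_j+1)T}. \tag{12}
\]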

In order to compute the asymptotic expansion (11), we need to evaluate the conditional expectations of the form
\[
E\left[ \tilde{X}^{\vec{k}_m} \mid g_{1T} = x \right],
\]
where $\tilde{X}^{\vec{k}_m}$ is represented by a product of multiple Wiener-Itô integrals.
In the preceding works on applications of the asymptotic expansion, the conditional expectations in (11) were computed directly with formulas, including multi-dimensional ones, given for example in [85, 86]. Those works give the formulas only up to the third order. More recently, [95] has developed a high-order computation scheme for the conditional expectations by using the fact that each of $\{A_{k,t}^j\}_{j,k}$, $\{g_{nT}\}_n$ and also $\{X^{\vec{k}_m}\}_{\vec{k}_m}$ can be decomposed into a finite sum of iterated multiple Wiener-Itô integrals by applying Itô's formula together with certain properties of iterated multiple Wiener-Itô integrals. (Please see Sect. 4 of [95] for the details.)
On the other hand, as shown in the next section, we can develop an alternative method which does not evaluate the conditional expectations directly.

3 Computational Scheme

This section follows [96] to introduce a computational scheme for the asymptotic
expansion, which is an alternative to the direct calculation method for the conditional
expectations given in [95].

3.1 Preparation

To compute the conditional expectations on the right hand side of (11), we use the
following lemma which can be derived from a property of Hermite polynomials and
leads us to compute the unconditional expectations instead of the conditional ones.

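Lemma 2 Let $(\Omega, \mathcal{F}, P)$ be a probability space. Suppose that $X \in L^2(\Omega, P)$ and $Z$ is a random variable with Gaussian distribution with mean 0 and variance $\Sigma$. Then, the conditional expectation $E[X \mid Z = x]$ has the following expansion in $L^2(\mathbb{R}, \mu)$, where $\mu$ is the Gaussian measure on $\mathbb{R}$ with mean 0 and variance $\Sigma$:
\[
E[X \mid Z = x] = \sum_{n=0}^{\infty} \frac{a_n}{\Sigma^n} H_n(x; \Sigma), \tag{13}
\]
where $H_n(x; \Sigma)$ is the Hermite polynomial of degree $n$, which is defined as
\[
H_n(x; \Sigma) = (-\Sigma)^n e^{x^2/2\Sigma} \frac{d^n}{dx^n} e^{-x^2/2\Sigma},
\]
and the coefficients $a_n$ are given by
\[
a_n = \frac{1}{n!} \frac{1}{i^n} \frac{\partial^n}{\partial \xi^n} \left\{ e^{\frac{\xi^2}{2}\Sigma} E[e^{i\xi Z} X] \right\} \Big|_{\xi=0}, \quad (i = \sqrt{-1}). \tag{14}
\]

(Proof) Since the system of Hermite polynomials $\{H_n(x; \Sigma)\}$ is an orthogonal basis of $L^2(\mathbb{R}, \mu)$, and $E[X \mid Z = x] \in L^2(\mathbb{R}, \mu)$, we have the following unique expansion of $E[X \mid Z = x]$ in $L^2(\mathbb{R}, \mu)$:
\[
E[X \mid Z = x] = \sum_{n=0}^{\infty} \frac{a_n}{\Sigma^n} H_n(x; \Sigma).
\]
Since we have another Taylor expansion
\[
e^{i\xi x} = e^{-\frac{\xi^2}{2}\Sigma} \sum_{n=0}^{\infty} \frac{H_n(x; \Sigma)}{n!} (i\xi)^n,
\]
then,
\[
\begin{aligned}
e^{\frac{\xi^2}{2}\Sigma} E[e^{i\xi Z} X] &= e^{\frac{\xi^2}{2}\Sigma} \int_{\mathbb{R}} e^{i\xi x} E[X \mid Z = x]\,\mu(dx) \\
&= \int_{\mathbb{R}} \sum_{m=0}^{\infty} \frac{H_m(x; \Sigma)}{m!} (i\xi)^m \sum_{n=0}^{\infty} \frac{a_n}{\Sigma^n} H_n(x; \Sigma)\,\mu(dx) \\
&= \sum_{n=0}^{\infty} a_n (i\xi)^n.
\end{aligned}
\]
Comparing with the coefficients of the Taylor series of $e^{\frac{\xi^2}{2}\Sigma} E[e^{i\xi Z} X]$ around 0 with respect to $\xi$, we see that $a_n$ can be written as (14). □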
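Lemma 2 can be checked numerically on a toy case. The sketch below takes $X = Z^2$, so that $E[X \mid Z = x] = x^2$, and computes the coefficients through the projection $a_n = E[E[X \mid Z] H_n(Z; \Sigma)]/n!$. This projection formula is not stated in the text; it is an equivalent form that follows from the orthogonality relation $\int H_m H_n\,d\mu = n!\,\Sigma^n \delta_{mn}$ used in the proof. All function names and the quadrature settings are assumptions for illustration.

```python
import math


def hermite(n: int, x: float, var: float) -> float:
    """H_n(x; var) via the recurrence H_{k+1} = x H_k - k var H_{k-1}."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * var * h_prev
    return h


def gaussian_expectation(f, var: float, half_width: float = 12.0, steps: int = 20000) -> float:
    """E[f(Z)] for Z ~ N(0, var), by the trapezoidal rule on +/- half_width std devs."""
    sd = math.sqrt(var)
    a = -half_width * sd
    dx = 2.0 * half_width * sd / steps
    total = 0.0
    for k in range(steps + 1):
        x = a + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(x) * math.exp(-x * x / (2.0 * var))
    return total * dx / math.sqrt(2.0 * math.pi * var)


def projection_coefficient(cond_exp, n: int, var: float) -> float:
    """a_n = E[cond_exp(Z) H_n(Z; var)] / n!  (equivalent to (14) by orthogonality)."""
    return gaussian_expectation(lambda x: cond_exp(x) * hermite(n, x, var), var) / math.factorial(n)


var = 0.5
coeffs = [projection_coefficient(lambda x: x * x, n, var) for n in range(4)]

# Rebuild E[X | Z = x] from the series (13) at a test point.
x_test = 1.3
rebuilt = sum(a / var ** n * hermite(n, x_test, var) for n, a in enumerate(coeffs))
```

For this choice one expects $a_0 = \Sigma$, $a_2 = \Sigma^2$ and all other coefficients to vanish, since $x^2 = H_2(x; \Sigma) + \Sigma H_0(x; \Sigma)$.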
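Next, we write $\hat{V}_t = (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)})$ as $\hat{V}(X_t^{(0)})$. Then, we define $\hat{g}_1 = \{\hat{g}_{1t};\ t \in \mathbb{R}_+\}$ and $Z^\xi = \{Z_t^\xi;\ t \in \mathbb{R}_+\}$ as the stochastic processes
\[
\hat{g}_{1t} = \int_0^t \hat{V}(X_u^{(0)})\,dW_u
\]
and
\[
Z_t^\xi = \exp\left( i\xi \hat{g}_{1t} + \frac{\xi^2}{2} \Sigma_t \right),
\]
respectively, where $\Sigma_t := \int_0^t \hat{V}(X_u^{(0)}) \hat{V}(X_u^{(0)})'\,du$.
Then, from Lemma 2, the conditional expectations appearing on the right hand side of Eq. (11) are expressed as
\[
E[X^{\vec{k}_m} \mid g_{1T} = x] = E[X^{\vec{k}_m} \mid \hat{g}_{1T} = x - C] = \sum_{l=0}^{\infty} \frac{a_l^{\vec{k}_m}}{\Sigma_T^l} H_l(x - C; \Sigma_T), \tag{15}
\]
where
\[
a_l^{\vec{k}_m} = \frac{1}{l!} \frac{1}{i^l} \frac{\partial^l}{\partial \xi^l} \left\{ E[X^{\vec{k}_m} Z_T^\xi] \right\} \Big|_{\xi=0}. \tag{16}
\]
Note that with this expression we now need to compute the unconditional expectations $E[X^{\vec{k}_m} Z_T^\xi]$ instead of the conditional expectations.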

3.2 Asymptotic Expansion of Density Function

In this subsection, we explain a new computational method by deriving a general formula for the expansion (11) with an arbitrary specification of its order $N$. In particular, we show that the coefficients in the expansion are obtained through a system of ordinary differential equations that is solved easily.
First, we define $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi)$ for $\vec{l}_\beta \in L_{n,\beta}$ and $\vec{d}_\beta \in \{1, \ldots, d\}^\beta$ $(n \ge \beta \ge 1)$ as
\[
\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi) = E\left[ \left( \prod_{j=1}^{\beta} A_{l_j t}^{d_j} \right) Z_t^\xi \right], \tag{17}
\]
and for $n = 0$ as
\[
\eta_{(\emptyset)}^{(\emptyset)}(t; \xi) = E\left[ Z_t^\xi \right]. \tag{18}
\]
Then, by using (6) we write the unconditional expectations $E[X^{\vec{k}_m} Z_T^\xi]$ in (16) in terms of $\eta$ as follows:
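\[
\begin{aligned}
E[X^{\vec{k}_m} Z_T^\xi] &= E\left[ \left( \prod_{j=1}^{m} g_{(k_j+1)T} \right) Z_T^\xi \right] \\
&= E\left[ \left( \prod_{j=1}^{m} \left\{ \sum_{\vec{l}^{\,j}_{\beta_j}, \vec{d}^{\,j}_{\beta_j}}^{(k_j+1)} \frac{1}{\beta_j!} \partial_{\vec{d}^{\,j}_{\beta_j}}^{\beta_j} g(X_T^{(0)}) A_{l_1^j T}^{d_1^j} \cdots A_{l_{\beta_j}^j T}^{d_{\beta_j}^j} \right\} \right) Z_T^\xi \right] \\
&= \sum_{\vec{l}^{\,1}_{\beta_1}, \vec{d}^{\,1}_{\beta_1}}^{(k_1+1)} \cdots \sum_{\vec{l}^{\,m}_{\beta_m}, \vec{d}^{\,m}_{\beta_m}}^{(k_m+1)} \left( \prod_{j=1}^{m} \frac{1}{\beta_j!} \partial_{\vec{d}^{\,j}_{\beta_j}}^{\beta_j} g(X_T^{(0)}) \right) \eta_{\vec{l}^{\,1}_{\beta_1} \otimes \cdots \otimes \vec{l}^{\,m}_{\beta_m}}^{\vec{d}^{\,1}_{\beta_1} \otimes \cdots \otimes \vec{d}^{\,m}_{\beta_m}}(T; \xi),
\end{aligned} \tag{19}
\]
where
\[
\vec{d}^{\,i}_{\beta_i} \otimes \vec{d}^{\,j}_{\beta_j} := (d_1^i, \ldots, d_{\beta_i}^i, d_1^j, \ldots, d_{\beta_j}^j),
\]
\[
\vec{l}^{\,i}_{\beta_i} \otimes \vec{l}^{\,j}_{\beta_j} := (l_1^i, \ldots, l_{\beta_i}^i, l_1^j, \ldots, l_{\beta_j}^j).
\]
So, we have to calculate $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(T; \xi)$ to evaluate the asymptotic expansion (11).

In the following, we derive a system of ODEs satisfied by these $\{\eta_{\vec{l}_\beta}^{\vec{d}_\beta}\}$. Before showing a general result, we first derive the ODEs for a few leading low-order terms explicitly to give a better intuition for the key idea of our method. In particular, let us consider the evaluation of $\eta_{(2)}^{j}(T; \xi) = E[A_{2T}^j Z_T^\xi]$, which appears in the $\epsilon$-order. Here, for simplicity, we assume that $V_0$ does not depend on $\epsilon$, and write $V_0(x, \epsilon)$ as $V_0(x)$. In this case, we first note that the SDEs of $A_{1t}^j$ and $A_{2t}^j$ $(j = 1, \ldots, d)$ are given as follows:
\[
dA_{1t}^j = \sum_{j'=1}^{d} A_{1t}^{j'} \partial_{j'} V_0^j(X_t^{(0)})\,dt + V^j(X_t^{(0)})\,dW_t, \tag{20}
\]
\[
\begin{aligned}
dA_{2t}^j ={}& \left[ \sum_{j'=1}^{d} A_{2t}^{j'} \partial_{j'} V_0^j(X_t^{(0)}) + \frac{1}{2} \sum_{j',k'=1}^{d} A_{1t}^{j'} A_{1t}^{k'} \partial_{j'} \partial_{k'} V_0^j(X_t^{(0)}) \right] dt \\
&+ \sum_{j'=1}^{d} A_{1t}^{j'} \partial_{j'} V^j(X_t^{(0)})\,dW_t.
\end{aligned} \tag{21}
\]
Also, the SDE of $Z_t^\xi$ is expressed as:
\[
dZ_t^\xi = (i\xi) \hat{V}(X_t^{(0)}) Z_t^\xi\,dW_t. \tag{22}
\]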
Then, applying Itô's formula to $A_{2t}^j Z_t^\xi$, we have
\[
\begin{aligned}
d(A_{2t}^j Z_t^\xi) ={}& A_{2t}^j\,dZ_t^\xi + Z_t^\xi\,dA_{2t}^j + d\langle A_2^j, Z^\xi \rangle_t \\
={}& \Bigg\{ (i\xi) \sum_{j'=1}^{d} A_{1t}^{j'} Z_t^\xi \hat{V}(X_t^{(0)}) \partial_{j'} V^j(X_t^{(0)})' + \sum_{j'=1}^{d} A_{2t}^{j'} Z_t^\xi \partial_{j'} V_0^j(X_t^{(0)}) \\
&\quad + \frac{1}{2} \sum_{j',k'=1}^{d} A_{1t}^{j'} A_{1t}^{k'} Z_t^\xi \partial_{j'} \partial_{k'} V_0^j(X_t^{(0)}) \Bigg\} dt \\
&+ \left\{ (i\xi) A_{2t}^j Z_t^\xi \hat{V}(X_t^{(0)}) + \sum_{j'=1}^{d} A_{1t}^{j'} Z_t^\xi \partial_{j'} V^j(X_t^{(0)}) \right\} dW_t.
\end{aligned}
\]
Since the last term is a martingale, taking expectations on both sides, we have the following ordinary differential equation for $\eta_{(2)}^j$:
\[
\begin{aligned}
\frac{d}{dt} \eta_{(2)}^j(t; \xi) ={}& (i\xi) \sum_{j'=1}^{d} \eta_{(1)}^{j'}(t; \xi) \hat{V}(X_t^{(0)}) \partial_{j'} V^j(X_t^{(0)})' \\
&+ \sum_{j'=1}^{d} \eta_{(2)}^{j'}(t; \xi) \partial_{j'} V_0^j(X_t^{(0)}) + \frac{1}{2} \sum_{j',k'=1}^{d} \eta_{(1,1)}^{j',k'}(t; \xi) \partial_{j'} \partial_{k'} V_0^j(X_t^{(0)}).
\end{aligned}
\]
Here, $\eta_{(1)}^j$ $(j = 1, \ldots, d)$ appearing on the right hand side of the above ODE are evaluated in a similar manner:
\[
\begin{aligned}
d(A_{1t}^j Z_t^\xi) ={}& A_{1t}^j\,dZ_t^\xi + Z_t^\xi\,dA_{1t}^j + d\langle A_1^j, Z^\xi \rangle_t \\
={}& \left\{ (i\xi) Z_t^\xi \hat{V}(X_t^{(0)}) V^j(X_t^{(0)})' + \sum_{j'=1}^{d} A_{1t}^{j'} Z_t^\xi \partial_{j'} V_0^j(X_t^{(0)}) \right\} dt \\
&+ \left\{ (i\xi) A_{1t}^j Z_t^\xi \hat{V}(X_t^{(0)}) + Z_t^\xi V^j(X_t^{(0)}) \right\} dW_t,
\end{aligned}
\]
hence, we have
\[
\frac{d}{dt} \eta_{(1)}^j(t; \xi) = (i\xi) \hat{V}(X_t^{(0)}) V^j(X_t^{(0)})' + \sum_{j'=1}^{d} \eta_{(1)}^{j'}(t; \xi) \partial_{j'} V_0^j(X_t^{(0)}).
\]
$\eta_{(1,1)}^{j,k}$ and other higher-order terms can be evaluated in the same way. The key observation is that each ODE does not involve any higher-order terms; only terms of lower or the same order appear on the right hand side of the ODE. So, one can easily solve (analytically or numerically) the system of ODEs and evaluate the expectations.
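To make the ODE approach concrete, the following sketch integrates the first-order equation for $\eta_{(1)}$ with a classical Runge-Kutta scheme. The model is an assumption chosen only for illustration: $d = 1$, $V_0(x) = \mu x$, $V(x) = \sigma x$, $g(x) = x$, for which $\hat{V}(X_t^{(0)}) = \sigma x_0 e^{\mu T}$ is constant in $t$ and the exact solution at maturity is $\eta_{(1)}(T; \xi) = i\xi \Sigma_T$ with $\Sigma_T = \sigma^2 x_0^2 e^{2\mu T} T$. The function name and step count are hypothetical.

```python
import math


def solve_eta1(xi: float, x0: float, mu: float, sigma: float, T: float, steps: int = 200) -> complex:
    """RK4 integration of d/dt eta = (i xi) V_hat(t) V(X_t^(0)) + mu * eta, eta(0) = 0,
    for the illustrative 1-d model V_0(x) = mu x, V(x) = sigma x, g(x) = x."""
    v_hat = sigma * x0 * math.exp(mu * T)       # Y_T Y_t^{-1} V(X_t^(0)), constant in t here

    def rhs(t: float, eta: complex) -> complex:
        v = sigma * x0 * math.exp(mu * t)       # V(X_t^(0))
        return 1j * xi * v_hat * v + mu * eta   # dV_0/dx = mu

    eta = 0j
    dt = T / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, eta)
        k2 = rhs(t + dt / 2, eta + dt / 2 * k1)
        k3 = rhs(t + dt / 2, eta + dt / 2 * k2)
        k4 = rhs(t + dt, eta + dt * k3)
        eta += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return eta


xi, x0, mu, sigma, T = 0.7, 1.0, 0.05, 0.3, 1.0
eta_T = solve_eta1(xi, x0, mu, sigma, T)
exact = 1j * xi * sigma ** 2 * x0 ** 2 * math.exp(2 * mu * T) * T
```

Because the right hand side involves only the known deterministic path and terms of the same order, the same loop extends to higher-order $\eta$ by stacking the unknowns into one vector and solving from lower to higher order.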
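The following proposition provides a way to calculate general $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(T; \xi)$ as a solution to a system of ordinary differential equations:
Proposition 1 For $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi)$ defined in (17), the following system of ordinary differential equations is satisfied:
\[
\begin{aligned}
\frac{d}{dt} \eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi) ={}& \sum_{k=1}^{\beta} \frac{1}{l_k!} \eta_{\vec{l}_{\beta/k}}^{\vec{d}_{\beta/k}}(t; \xi)\, \partial_\epsilon^{l_k} V_0^{d_k}(X_t^{(0)}, 0) \\
&+ \sum_{k=1}^{\beta} \sum_{l=1}^{l_k} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l)} \frac{1}{(l_k - l)!} \frac{1}{\gamma!} \eta_{(\vec{l}_{\beta/k}) \otimes \vec{m}_\gamma}^{(\vec{d}_{\beta/k}) \otimes \vec{\tilde{d}}_\gamma}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} \partial_\epsilon^{l_k - l} V_0^{d_k}(X_t^{(0)}, 0) \\
&+ \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \sum_{\vec{m}_\delta, \vec{\hat{d}}_\delta}^{(l_m - 1)} \frac{1}{\gamma!\,\delta!} \eta_{(\vec{l}_{\beta/k,m}) \otimes \vec{m}_\gamma \otimes \vec{m}_\delta}^{(\vec{d}_{\beta/k,m}) \otimes \vec{\tilde{d}}_\gamma \otimes \vec{\hat{d}}_\delta}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)})\, \partial_{\vec{\hat{d}}_\delta}^{\delta} V^{d_m}(X_t^{(0)})' \\
&+ (i\xi) \sum_{k=1}^{\beta} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \frac{1}{\gamma!} \eta_{(\vec{l}_{\beta/k}) \otimes \vec{m}_\gamma}^{(\vec{d}_{\beta/k}) \otimes \vec{\tilde{d}}_\gamma}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)})\, \hat{V}(X_t^{(0)})',
\end{aligned} \tag{23}
\]
where $\sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l)}$ is defined in (4), and
\[
\begin{aligned}
\vec{l}_{\beta/k} &:= (l_1, \ldots, l_{k-1}, l_{k+1}, \ldots, l_\beta), \\
\vec{l}_{\beta/k,n} &:= (l_1, \ldots, l_{k-1}, l_{k+1}, \ldots, l_{n-1}, l_{n+1}, \ldots, l_\beta), \quad 1 \le k < n \le \beta, \\
\vec{l}_\beta \otimes \vec{m}_\gamma &:= (l_1, \ldots, l_\beta, m_1, \ldots, m_\gamma)
\end{aligned}
\]
for $\vec{l}_\beta = (l_1, \ldots, l_\beta)$ and $\vec{m}_\gamma = (m_1, \ldots, m_\gamma)$.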
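(Proof) We first apply Itô's formula to $\prod_{j=1}^{\beta} A_{l_j t}^{d_j}$ by using (3) to obtain the following:
\[
\begin{aligned}
d\left( \prod_{j=1}^{\beta} A_{l_j t}^{d_j} \right) ={}& \sum_{k=1}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k}}^{\beta} A_{l_j t}^{d_j} \Bigg) dA_{l_k t}^{d_k} + \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k,m}}^{\beta} A_{l_j t}^{d_j} \Bigg) d\langle A_{l_k}^{d_k}, A_{l_m}^{d_m} \rangle_t \\
={}& \sum_{k=1}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k}}^{\beta} A_{l_j t}^{d_j} \Bigg) \frac{1}{l_k!} \partial_\epsilon^{l_k} V_0^{d_k}(X_t^{(0)}, 0)\,dt \\
&+ \sum_{k=1}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k}}^{\beta} A_{l_j t}^{d_j} \Bigg) \sum_{l=1}^{l_k} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l)} \frac{1}{(l_k - l)!} \frac{1}{\gamma!} \Bigg( \prod_{j'=1}^{\gamma} A_{m_{j'} t}^{\tilde{d}_{j'}} \Bigg) \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} \partial_\epsilon^{l_k - l} V_0^{d_k}(X_t^{(0)}, 0)\,dt \\
&+ \sum_{k=1}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k}}^{\beta} A_{l_j t}^{d_j} \Bigg) \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \frac{1}{\gamma!} \Bigg( \prod_{j'=1}^{\gamma} A_{m_{j'} t}^{\tilde{d}_{j'}} \Bigg) \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)})\,dW_t \\
&+ \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \Bigg( \prod_{\substack{j=1 \\ j \ne k,m}}^{\beta} A_{l_j t}^{d_j} \Bigg) \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \sum_{\vec{m}_\delta, \vec{\hat{d}}_\delta}^{(l_m - 1)} \frac{1}{\gamma!\,\delta!} \\
&\qquad \times \Bigg( \prod_{j'=1}^{\gamma} A_{m_{j'} t}^{\tilde{d}_{j'}} \Bigg) \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)}) \Bigg( \prod_{j'=1}^{\delta} A_{m_{j'} t}^{\hat{d}_{j'}} \Bigg) \partial_{\vec{\hat{d}}_\delta}^{\delta} V^{d_m}(X_t^{(0)})'\,dt.
\end{aligned} \tag{24}
\]
Note also that $dZ_t^\xi = (i\xi) \hat{V}(X_t^{(0)}) Z_t^\xi\,dW_t$. Then, applying Itô's formula again to $\left( \prod_{j=1}^{\beta} A_{l_j t}^{d_j} \right) Z_t^\xi$ and taking expectations on both sides, we obtain the result. □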

Remark 2 Due to $\eta_{(\emptyset)}^{(\emptyset)}(t; \xi) = E[Z_t^\xi] = 1$ and the hierarchical structure of the ODEs with respect to $n = \sum_{j=1}^{\beta} l_j$, one can easily solve these ODEs successively from lower-order terms to higher-order terms with initial conditions $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(0; \xi) = 0$ for $(\vec{l}_\beta, \vec{d}_\beta) \ne (\emptyset, \emptyset)$.

Remark 3 Further, due to the structure of the system of differential equations, it is easily shown by induction that each $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi)$ is expressed as a polynomial of degree $n = \sum_{j=1}^{\beta} l_j$ with respect to $(i\xi)$. Then, we can also show that $E[X^{\vec{k}_m} Z_T^\xi]$ is a polynomial of degree $(n + m)$ with respect to $(i\xi)$, and thus $a_l^{\vec{k}_m} = 0$ $(l > n + m)$ for $\vec{k}_m \in L_{n,m}$. This ensures the convergence of the infinite sum in (15), which in fact reduces to a finite sum.

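Then, from Lemma 2 and (11), we have the following expression of $E[\Phi(G^{(\epsilon)})]$:
\[
\begin{aligned}
E[\Phi(G^{(\epsilon)})] &= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} \int_{\mathbb{R}} \Phi(x) (-1)^m \frac{d^m}{dx^m} \left\{ \sum_{l=0}^{n+m} \frac{a_l^{\vec{k}_m}}{\Sigma_T^l} H_l(x - C; \Sigma_T) f_{g_{1T}}(x) \right\} dx + o(\epsilon^N) \\
&= \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} \int_{\mathbb{R}} \Phi(x) \left\{ \sum_{l=0}^{n+m} \frac{a_l^{\vec{k}_m}}{\Sigma_T^{l+m}} H_{l+m}(x - C; \Sigma_T) f_{g_{1T}}(x) \right\} dx + o(\epsilon^N).
\end{aligned}
\]
Here we have used the well-known property of the Hermite polynomial:
\[
\frac{d^m}{dx^m} \left\{ H_l(x - C; \Sigma_T) f_{g_{1T}}(x) \right\} = \left( \frac{-1}{\Sigma_T} \right)^m H_{l+m}(x - C; \Sigma_T) f_{g_{1T}}(x).
\]
In particular, letting $\Phi$ be the delta function at $x \in \mathbb{R}$, $\delta_x$, we obtain the asymptotic expansion of the density of $G^{(\epsilon)}$:
\[
f_{G^{(\epsilon)}}(x) = E[\delta_x(G^{(\epsilon)})] = \sum_{n=0}^{N} \epsilon^n \sum_{\vec{k}_m}^{(n)} \frac{1}{m!} \sum_{l=0}^{n+m} \frac{a_l^{\vec{k}_m}}{\Sigma_T^{l+m}} H_{l+m}(x - C; \Sigma_T) f_{g_{1T}}(x) + o(\epsilon^N). \tag{25}
\]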

We summarize the discussion above as the following theorem:

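Theorem 2 Let $X^{(\epsilon)}$ be the solution to the stochastic differential equation (1). Suppose a function $g : \mathbb{R}^d \to \mathbb{R}$ is smooth and all of its derivatives have polynomial growth. Then, the asymptotic expansion of the density function of $G^{(\epsilon)} = \frac{g(X_T^{(\epsilon)}) - g(X_T^{(0)})}{\epsilon}$ up to the $\epsilon^N$-order is given by
\[
f_{G^{(\epsilon)}}(x) = f_{g_{1T}}(x) + \sum_{n=1}^{N} \epsilon^n \left( \sum_{m=0}^{3n} C_{nm} H_m(x - C; \Sigma_T) \right) f_{g_{1T}}(x) + o(\epsilon^N), \tag{26}
\]
where
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right) \tag{27}
\]
with
\[
\begin{aligned}
C &= \left( \partial g(X_T^{(0)}) \right)' \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\,dt, \\
\Sigma_T &= \int_0^T \hat{V}(X_t^{(0)}) \hat{V}(X_t^{(0)})'\,dt > 0, \\
\hat{V}(X_t^{(0)}) &= (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)}).
\end{aligned}
\]
$H_n(x; \Sigma)$ is the Hermite polynomial of degree $n$ with parameter $\Sigma$, which is defined as
\[
H_n(x; \Sigma) = (-\Sigma)^n e^{x^2/2\Sigma} \frac{d^n}{dx^n} e^{-x^2/2\Sigma}, \tag{28}
\]
and
\[
\begin{aligned}
C_{nm} ={}& \frac{1}{\Sigma_T^m} \sum_{\vec{k}_\delta}^{(n)} \sum_{\vec{l}^{\,1}_{\beta_1}, \vec{d}^{\,1}_{\beta_1}}^{(k_1+1)} \cdots \sum_{\vec{l}^{\,\delta}_{\beta_\delta}, \vec{d}^{\,\delta}_{\beta_\delta}}^{(k_\delta+1)} \frac{1}{\delta!\,(m - \delta)!} \left( \prod_{j=1}^{\delta} \frac{1}{\beta_j!} \partial_{\vec{d}^{\,j}_{\beta_j}}^{\beta_j} g(X_T^{(0)}) \right) \\
&\times \frac{1}{i^{m-\delta}} \frac{\partial^{m-\delta}}{\partial \xi^{m-\delta}} \left\{ \eta_{\vec{l}^{\,1}_{\beta_1} \otimes \cdots \otimes \vec{l}^{\,\delta}_{\beta_\delta}}^{\vec{d}^{\,1}_{\beta_1} \otimes \cdots \otimes \vec{d}^{\,\delta}_{\beta_\delta}}(T; \xi) \right\} \Bigg|_{\xi=0}, \quad \left( i = \sqrt{-1} \right),
\end{aligned} \tag{29}
\]
where only the terms with $\delta \le m$ contribute. The $\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(T; \xi)$ are obtained as a solution to the following system of ODEs:
\[
\begin{aligned}
\frac{d}{dt} \eta_{\vec{l}_\beta}^{\vec{d}_\beta}(t; \xi) ={}& \sum_{k=1}^{\beta} \frac{1}{l_k!} \eta_{\vec{l}_{\beta/k}}^{\vec{d}_{\beta/k}}(t; \xi)\, \partial_\epsilon^{l_k} V_0^{d_k}(X_t^{(0)}, 0) \\
&+ \sum_{k=1}^{\beta} \sum_{l=1}^{l_k} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l)} \frac{1}{(l_k - l)!} \frac{1}{\gamma!} \eta_{(\vec{l}_{\beta/k}) \otimes \vec{m}_\gamma}^{(\vec{d}_{\beta/k}) \otimes \vec{\tilde{d}}_\gamma}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} \partial_\epsilon^{l_k - l} V_0^{d_k}(X_t^{(0)}, 0) \\
&+ \sum_{\substack{k,m=1 \\ k<m}}^{\beta} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \sum_{\vec{m}_\delta, \vec{\hat{d}}_\delta}^{(l_m - 1)} \frac{1}{\gamma!\,\delta!} \eta_{(\vec{l}_{\beta/k,m}) \otimes \vec{m}_\gamma \otimes \vec{m}_\delta}^{(\vec{d}_{\beta/k,m}) \otimes \vec{\tilde{d}}_\gamma \otimes \vec{\hat{d}}_\delta}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)})\, \partial_{\vec{\hat{d}}_\delta}^{\delta} V^{d_m}(X_t^{(0)})' \\
&+ (i\xi) \sum_{k=1}^{\beta} \sum_{\vec{m}_\gamma, \vec{\tilde{d}}_\gamma}^{(l_k - 1)} \frac{1}{\gamma!} \eta_{(\vec{l}_{\beta/k}) \otimes \vec{m}_\gamma}^{(\vec{d}_{\beta/k}) \otimes \vec{\tilde{d}}_\gamma}(t; \xi)\, \partial_{\vec{\tilde{d}}_\gamma}^{\gamma} V^{d_k}(X_t^{(0)})\, \hat{V}(X_t^{(0)})', \\
\eta_{\vec{l}_\beta}^{\vec{d}_\beta}(0; \xi) ={}& 0 \ \text{ for } (\vec{l}_\beta, \vec{d}_\beta) \ne (\emptyset, \emptyset), \qquad \eta_{(\emptyset)}^{(\emptyset)}(t; \xi) = 1 \ \text{ for } (\vec{l}_\beta, \vec{d}_\beta) = (\emptyset, \emptyset).
\end{aligned} \tag{30}
\]
Here, we use the following notations:
\[
\begin{aligned}
\vec{l}_{\beta/k} &:= (l_1, \ldots, l_{k-1}, l_{k+1}, \ldots, l_\beta), \\
\vec{l}_{\beta/k,n} &:= (l_1, \ldots, l_{k-1}, l_{k+1}, \ldots, l_{n-1}, l_{n+1}, \ldots, l_\beta), \quad 1 \le k < n \le \beta, \\
\vec{l}_\beta \otimes \vec{m}_\gamma &:= (l_1, \ldots, l_\beta, m_1, \ldots, m_\gamma)
\end{aligned}
\]
for $\vec{l}_\beta = (l_1, \ldots, l_\beta)$ and $\vec{m}_\gamma = (m_1, \ldots, m_\gamma)$.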

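Remark 4 In particular, in order to calculate the expansion above up to the $\epsilon^2$-order, we need the Hermite polynomials $H_n(x; \Sigma)$ up to $n = 6$, which are given as follows:
\[
\begin{aligned}
H_0(x; \Sigma) &= 1, \\
H_1(x; \Sigma) &= x, \\
H_2(x; \Sigma) &= x^2 - \Sigma, \\
H_3(x; \Sigma) &= x^3 - 3\Sigma x, \\
H_4(x; \Sigma) &= x^4 - 6\Sigma x^2 + 3\Sigma^2, \\
H_5(x; \Sigma) &= x^5 - 10\Sigma x^3 + 15\Sigma^2 x, \\
H_6(x; \Sigma) &= x^6 - 15\Sigma x^4 + 45\Sigma^2 x^2 - 15\Sigma^3.
\end{aligned}
\]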
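The table in Remark 4 can be regenerated for any order from the three-term recurrence $H_{n+1}(x; \Sigma) = x H_n(x; \Sigma) - n\Sigma H_{n-1}(x; \Sigma)$. The sketch below is an illustration with a hypothetical function name; it returns each polynomial as a list of coefficients in ascending powers of $x$.

```python
from typing import List


def hermite_coefficients(n_max: int, var: float) -> List[List[float]]:
    """Coefficient lists (ascending powers of x) of H_0, ..., H_{n_max} with parameter var,
    built from H_{n+1} = x H_n - n var H_{n-1}."""
    polys = [[1.0], [0.0, 1.0]]
    for n in range(1, n_max):
        x_times_hn = [0.0] + polys[n]          # multiply H_n by x
        hn_minus_1 = polys[n - 1] + [0.0, 0.0]  # pad H_{n-1} to the same length
        polys.append([a - n * var * b for a, b in zip(x_times_hn, hn_minus_1)])
    return polys[: n_max + 1]


table = hermite_coefficients(6, var=2.0)
# With var = 2: H_4 = x^4 - 12 x^2 + 12 and H_6 = x^6 - 30 x^4 + 180 x^2 - 120.
```

An expansion to the $\epsilon^N$-order needs the polynomials up to degree $3N$, so `hermite_coefficients(3 * N, sigma_T)` would supply everything required by (26).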

3.3 Remarks on the Asymptotic Expansion for Multi-dimensional Density Functions

We can also apply the conditional expectation formulas for the multi-dimensional
case in Lemma 1.1 of [85] and Lemma 2.1 of [86] to derive an asymptotic expansion
up to the third order of the multi-dimensional density functions. This is particularly
useful for pricing exotic-type options such as barrier options with discrete monitoring
(e.g. [83]), and pricing Bermudan-type or approximate American-type derivatives
(e.g. Nishiba [71]).
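Moreover, we obtain the following result as an extension of Lemma 2, which easily leads to an asymptotic expansion of a multi-dimensional density function in a manner similar to the one-dimensional case in Theorem 2.

Lemma 3 Let $(\Omega, \mathcal{F}, P)$ be a probability space. Suppose that $X \in L^2(\Omega, P)$ and $Z$ is a $d$-dimensional random variable with Gaussian distribution with mean $\vec{0}$ ($d$-dimensional zero vector) and variance-covariance matrix $\Sigma$. Then, the conditional expectation $E[X \mid Z = \vec{x}]$ for $\vec{x} \in \mathbb{R}^d$ has the following expansion in $L^2(\mathbb{R}^d, \mu)$, where $\mu$ is the Gaussian measure on $\mathbb{R}^d$ with mean $\vec{0}$ and variance-covariance matrix $\Sigma$:
\[
E[X \mid Z = \vec{x}] = \sum_{|\vec{n}|=0}^{\infty} a_{\vec{n}} H_{\vec{n}}(\vec{x}; \Sigma), \tag{31}
\]
where $\vec{n} = (n_1, n_2, \ldots, n_d)$, $|\vec{n}| = n_1 + n_2 + \cdots + n_d$, $\vec{n}! = n_1!\,n_2! \cdots n_d!$ and
\[
a_{\vec{n}} = \frac{1}{\vec{n}!} \frac{1}{i^{|\vec{n}|}} \frac{\partial^{\vec{n}}}{\partial \vec{\xi}^{\,\vec{n}}} \left\{ e^{\frac{1}{2} (\vec{\xi})' \Sigma \vec{\xi}} E\left[ e^{i (\vec{\xi})' Z} X \right] \right\} \Bigg|_{\vec{\xi} = \vec{0}}, \quad \left( i = \sqrt{-1} \right). \tag{32}
\]
Here, $(\vec{\xi})'$ denotes the transpose of $\vec{\xi}$. $H_{\vec{n}}(\vec{x}; \Sigma)$ stands for the $d$-dimensional multiple Hermite polynomial of degree $|\vec{n}|$ with $\vec{n} = (n_1, n_2, \ldots, n_d)$:
\[
H_{\vec{n}}(\vec{x}; \Sigma) = \frac{1}{n[\vec{x} : \Sigma]} \left( -\frac{\partial}{\partial x_1} \right)^{n_1} \left( -\frac{\partial}{\partial x_2} \right)^{n_2} \cdots \left( -\frac{\partial}{\partial x_d} \right)^{n_d} n[\vec{x} : \Sigma]; \quad \vec{x} = (x_1, x_2, \ldots, x_d), \tag{33}
\]
where
\[
n[\vec{x} : \Sigma] = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} \vec{x}' \Sigma^{-1} \vec{x} \right). \tag{34}
\]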

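(Proof) Basically, we can argue as in the proof of Lemma 2. Indeed, we first note that the following systems of Hermite polynomials form a complete biorthogonal system in $L^2(\mathbb{R}^d, \mu)$:
\[
\begin{aligned}
&\{ H_{\vec{n}}(\vec{x} : \Sigma) : \vec{n} = (n_1, n_2, \ldots, n_d);\ n_i = 0, 1, 2, \ldots, (i = 1, 2, \ldots, d) \}, \\
&\{ \tilde{H}_{\vec{n}}(\vec{x} : \Sigma) : \vec{n} = (n_1, n_2, \ldots, n_d);\ n_i = 0, 1, 2, \ldots, (i = 1, 2, \ldots, d) \},
\end{aligned}
\]
where $H_{\vec{n}}(\vec{x} : \Sigma)$ is given by (33) and $\tilde{H}_{\vec{n}}(\vec{x} : \Sigma)$ is defined as follows:
\[
\tilde{H}_{\vec{n}}(\vec{x}; \Sigma) = \frac{1}{n[\vec{x} : \Sigma]} \left( -\frac{\partial}{\partial y_1} \right)^{n_1} \left( -\frac{\partial}{\partial y_2} \right)^{n_2} \cdots \left( -\frac{\partial}{\partial y_d} \right)^{n_d} n[\vec{x} : \Sigma], \quad \vec{y} = (y_1, y_2, \ldots, y_d) = \Sigma^{-1} \vec{x}. \tag{35}
\]
Thus, we have the following expansion of $E[X \mid Z = \vec{x}]$ in $L^2(\mathbb{R}^d, \mu)$:
\[
E[X \mid Z = \vec{x}] = \sum_{|\vec{n}|=0}^{\infty} a_{\vec{n}} H_{\vec{n}}(\vec{x}; \Sigma).
\]
On the other hand, we know the relation:
\[
\sum_{|\vec{j}|=0}^{\infty} \frac{(i\vec{\xi})^{\vec{j}}}{\vec{j}!} \tilde{H}_{\vec{j}}(\vec{x}; \Sigma) = e^{i \vec{\xi}' \vec{x}} e^{\frac{1}{2} \vec{\xi}' \Sigma \vec{\xi}}, \tag{36}
\]
where $(i\vec{\xi})^{\vec{j}} = (i\xi_1)^{j_1} (i\xi_2)^{j_2} \cdots (i\xi_d)^{j_d}$. Hence,
\[
e^{i \vec{\xi}' \vec{x}} = e^{-\frac{1}{2} \vec{\xi}' \Sigma \vec{\xi}} \sum_{|\vec{j}|=0}^{\infty} \frac{(i\vec{\xi})^{\vec{j}}}{\vec{j}!} \tilde{H}_{\vec{j}}(\vec{x} : \Sigma).
\]
It is also well known that
\[
\int_{\mathbb{R}^d} H_{\vec{m}}(\vec{x} : \Sigma) \tilde{H}_{\vec{n}}(\vec{x} : \Sigma)\, n[\vec{x} : \Sigma]\,d\vec{x} =
\begin{cases}
\vec{m}! & (\text{if } \vec{m} = \vec{n}), \\
0 & (\text{if } \vec{m} \ne \vec{n}).
\end{cases} \tag{37}
\]
Therefore,
\[
\begin{aligned}
e^{\frac{1}{2} \vec{\xi}' \Sigma \vec{\xi}} E\left[ e^{i \vec{\xi}' Z} X \right] &= e^{\frac{1}{2} \vec{\xi}' \Sigma \vec{\xi}} E\left[ e^{i \vec{\xi}' Z} E[X \mid Z] \right] \\
&= \int_{\mathbb{R}^d} \left\{ \sum_{|\vec{j}|=0}^{\infty} \frac{(i\vec{\xi})^{\vec{j}}}{\vec{j}!} \tilde{H}_{\vec{j}}(\vec{x} : \Sigma) \right\} \times \left\{ \sum_{|\vec{n}|=0}^{\infty} a_{\vec{n}} H_{\vec{n}}(\vec{x} : \Sigma) \right\} \mu(d\vec{x})
\end{aligned} \tag{38}
\]
\[
= \sum_{|\vec{n}|=0}^{\infty} a_{\vec{n}}\, i^{|\vec{n}|} (\vec{\xi})^{\vec{n}}; \quad \left( (\vec{\xi})^{\vec{n}} = \xi_1^{n_1} \xi_2^{n_2} \cdots \xi_d^{n_d} \right), \tag{39}
\]
and differentiating both sides of the equation above to the $\vec{n} = (n_1, \ldots, n_d)$th order with respect to $\vec{\xi} = (\xi_1, \ldots, \xi_d)$ at $\vec{\xi} = \vec{0}$, we obtain (32) and hence the result, (31)–(34). □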

3.4 Expansion of Option Prices

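Now, we apply the approximate density function in Theorem 2 obtained by the asymptotic expansion technique to option pricing.
In particular, we consider a plain vanilla option on the underlying asset price process $(g(X_t^{(\epsilon)}))_{t \in [0,T]}$, where $(X_t^{(\epsilon)})_{t \in [0,T]}$ is the solution to the stochastic differential equation (1). As an example, we obtain an approximation of a call option price as follows.
Theorem 3 An asymptotic expansion up to the $\epsilon^{(N+1)}$-order of a call option price at time 0 with maturity $T$ and strike price $K$, where $K = g(X_T^{(0)}) - \epsilon y$ for arbitrary $y \in \mathbb{R}$, is given as follows:
\[
\begin{aligned}
C(K, T) ={}& \epsilon P(0, T) \left\{ \sqrt{\Sigma_T}\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) + C N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) + y N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \right\} \\
&+ \sum_{n=1}^{N} \epsilon^{n+1} P(0, T) C_{n0} \left\{ \sqrt{\Sigma_T}\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) + C N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \right\} \\
&+ \sum_{n=1}^{N} \epsilon^{n+1} P(0, T) C_{n1} \left\{ \Sigma_T N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) - \sqrt{\Sigma_T}\, y\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \right\} \\
&+ \sum_{n=1}^{N} \sum_{m=2}^{3n} \epsilon^{n+1} P(0, T) C_{nm} \Bigg\{ -y \sqrt{\Sigma_T}\, H_{m-1}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \\
&\qquad\qquad + \Sigma_T^{3/2} H_{m-2}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \Bigg\} \\
&+ y \sum_{n=1}^{N} \epsilon^{n+1} P(0, T) C_{n0} N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \\
&+ y \sum_{n=1}^{N} \sum_{m=1}^{3n} \epsilon^{n+1} P(0, T) C_{nm} \sqrt{\Sigma_T}\, H_{m-1}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) + o(\epsilon^{(N+1)}).
\end{aligned} \tag{40}
\]
Here, $C_{nm}$ is given by (29), and $H_n(x; \Sigma)$ is the Hermite polynomial of degree $n$ with parameter $\Sigma$, which is defined as
\[
H_n(x; \Sigma) = (-\Sigma)^n e^{x^2/2\Sigma} \frac{d^n}{dx^n} e^{-x^2/2\Sigma}.
\]
$C$ and $\Sigma_T$ are given respectively by
\[
C = \left( \partial g(X_T^{(0)}) \right)' \int_0^T Y_T Y_t^{-1} \partial_\epsilon V_0(X_t^{(0)}, 0)\,dt
\]
and
\[
\Sigma_T = \int_0^T \hat{V}(X_t^{(0)}) \hat{V}(X_t^{(0)})'\,dt,
\]
where
\[
\hat{V}(X_t^{(0)}) = (\partial g(X_T^{(0)}))' Y_T Y_t^{-1} V(X_t^{(0)}).
\]
Also, $P(0, T)$ denotes the price at time 0 of a zero coupon bond with maturity $T$. $N(x)$ stands for the standard normal distribution function, and its density function is given by $n(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.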
2

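(Proof) We first note that the call price is expanded as follows:
\[
\begin{aligned}
C(K, T) &= P(0, T) E[\max\{g(X_T^{(\epsilon)}) - K, 0\}] \\
&= \epsilon P(0, T) E\left[ \max\left\{ \left( \frac{g(X_T^{(\epsilon)}) - g(X_T^{(0)})}{\epsilon} \right) + \left( \frac{g(X_T^{(0)}) - K}{\epsilon} \right), 0 \right\} \right] \\
&= \epsilon P(0, T) E\left[ \max\left\{ G^{(\epsilon)} + y, 0 \right\} \right] \\
&= \epsilon P(0, T) \int_{-y}^{\infty} (x + y) f_{G^{(\epsilon)},N}(x)\,dx + o(\epsilon^{(N+1)}).
\end{aligned} \tag{41}
\]
Here, $f_{G^{(\epsilon)},N}$ is the asymptotic expansion of the density of $G^{(\epsilon)}$ up to the $\epsilon^N$-order, which is given by the first two terms on the right hand side of (26) in Theorem 2:
\[
f_{G^{(\epsilon)},N}(x) = f_{g_{1T}}(x) + \sum_{n=1}^{N} \epsilon^n \left( \sum_{m=0}^{3n} C_{nm} H_m(x - C; \Sigma_T) \right) f_{g_{1T}}(x), \tag{42}
\]
where
\[
f_{g_{1T}}(x) = \frac{1}{\sqrt{2\pi\Sigma_T}} \exp\left( -\frac{(x - C)^2}{2\Sigma_T} \right).
\]
Next, we note the well-known properties of the Hermite polynomials:
\[
\begin{aligned}
\frac{d}{dx} H_n(x; \Sigma) &= n H_{n-1}(x; \Sigma), \\
\frac{d^m}{dx^m} \left\{ H_n(x; \Sigma)\, n(x; \Sigma) \right\} &= \left( \frac{-1}{\Sigma} \right)^m H_{n+m}(x; \Sigma)\, n(x; \Sigma), \\
H_{n+1}(x; \Sigma) &= x H_n(x; \Sigma) - n\Sigma H_{n-1}(x; \Sigma),
\end{aligned} \tag{43}
\]
where $n(x; \Sigma) = \frac{1}{\sqrt{2\pi\Sigma}} e^{-\frac{x^2}{2\Sigma}}$.
Then, we can obtain the following expressions for the integrals appearing on the right hand side of (41):
\[
\begin{aligned}
\int_{-y}^{\infty} f_{g_{1T}}(x)\,dx &= N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right), \\
\int_{-y}^{\infty} x f_{g_{1T}}(x)\,dx &= \sqrt{\Sigma_T}\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) + C N\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right), \\
\int_{-y}^{\infty} H_m(x - C; \Sigma_T) f_{g_{1T}}(x)\,dx &= \sqrt{\Sigma_T}\, H_{m-1}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right); \quad m \ge 1, \\
\int_{-y}^{\infty} x H_m(x - C; \Sigma_T) f_{g_{1T}}(x)\,dx &= -\sqrt{\Sigma_T}\, y\, H_{m-1}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right) \\
&\quad + \Sigma_T^{3/2} H_{m-2}(-(y + C); \Sigma_T)\, n\!\left( \frac{y + C}{\sqrt{\Sigma_T}} \right); \quad m \ge 2.
\end{aligned} \tag{44}
\]
Substituting these expressions into (41) gives (40). □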

Remark 5 In practical applications, the underlying model is usually given in a non-perturbed form:
\[
d\hat{X}_t^j = \hat{V}_0^j(\hat{X}_t)\,dt + \hat{V}^j(\hat{X}_t)\,dW_t \quad (j = 1, \ldots, d), \qquad \hat{X}_0 = x_0 \in \mathbb{R}^d. \tag{45}
\]
Then, in order to apply the asymptotic expansion method, we may rewrite the model, for instance, as
\[
dX_t^{(\epsilon),j} = V_0^j(X_t^{(\epsilon)})\,dt + \epsilon V^j(X_t^{(\epsilon)})\,dW_t \quad (j = 1, \ldots, d), \qquad X_0^{(\epsilon)} = x_0 \in \mathbb{R}^d, \tag{46}
\]
where by rescaling $\hat{V}^j(x)$ we set $V^j(x)$ so that $\hat{V}^j(x) = \epsilon V^j(x)$ for some $\epsilon \in (0, 1]$. Consequently, an approximate call price under the original model (45) is obtained by (40) without the $o(\epsilon^{N+1})$ term.
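The rescaling in Remark 5 and the leading term of (40) can be illustrated on a lognormal model. Everything specific in the sketch below is an assumption made for illustration: the non-perturbed model is $d\hat{X}_t = r\hat{X}_t\,dt + \sigma\hat{X}_t\,dW_t$, rewritten with $\epsilon = \sigma$ and $V(x) = x$, the payoff function is $g(x) = x$, and $P(0, T) = e^{-rT}$. Under these choices $C = 0$ and $\Sigma_T = x_0^2 e^{2rT} T$, and only the first line of (40) is kept. The Black-Scholes price is included purely as a benchmark.

```python
import math


def norm_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def leading_order_call(x0: float, strike: float, r: float, vol: float, T: float) -> float:
    """First line of (40) for the illustrative lognormal model (see lead-in assumptions)."""
    eps = vol                          # rescaling: V_hat(x) = eps * V(x) with V(x) = x
    fwd = x0 * math.exp(r * T)         # g(X_T^(0))
    sigma_T = fwd ** 2 * T             # Sigma_T for V(x) = x
    c = 0.0                            # V_0 does not depend on eps
    y = (fwd - strike) / eps           # strike written as K = g(X_T^(0)) - eps * y
    d = (y + c) / math.sqrt(sigma_T)
    discount = math.exp(-r * T)        # assumed P(0, T)
    return eps * discount * (math.sqrt(sigma_T) * norm_pdf(d) + (c + y) * norm_cdf(d))


def black_scholes_call(x0: float, strike: float, r: float, vol: float, T: float) -> float:
    """Standard Black-Scholes call price, used only as a reference value."""
    d1 = (math.log(x0 / strike) + (r + 0.5 * vol * vol) * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return x0 * norm_cdf(d1) - strike * math.exp(-r * T) * norm_cdf(d2)


approx = leading_order_call(100.0, 100.0, 0.0, 0.2, 1.0)
exact = black_scholes_call(100.0, 100.0, 0.0, 0.2, 1.0)
```

In this setting the leading term is a normal (Bachelier-type) approximation of the lognormal price; the higher-order terms of (40) with the coefficients $C_{nm}$ would be needed to close the remaining gap.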

3.5 Application to Computation of Greeks

We already have a so-called closed form approximate formula (40) for the option price, and hence are able to obtain approximations of its Greeks (that is, sensitivities to changes in the parameters of a model) in closed form as well (or at least with an easy numerical method such as the difference quotient method applied to the approximate option pricing formula).
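The difference quotient method mentioned above can be sketched as follows. The pricing function is a stand-in chosen for illustration: it is the leading term of (40) under the assumed lognormal model $d\hat{X}_t = \sigma\hat{X}_t\,dW_t$ with zero interest rate, $\epsilon = \sigma$, $V(x) = x$ and $g(x) = x$, which reduces to $\sigma x_0 \sqrt{T}\,n(d) + (x_0 - K) N(d)$ with $d = (x_0 - K)/(\sigma x_0 \sqrt{T})$. The function names and the bump size are hypothetical.

```python
import math


def norm_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approx_call(x0: float, strike: float, vol: float, T: float) -> float:
    """Leading-order approximate call price (illustrative model, zero rates)."""
    stdev = vol * x0 * math.sqrt(T)
    d = (x0 - strike) / stdev
    return stdev * norm_pdf(d) + (x0 - strike) * norm_cdf(d)


def central_difference(f, x: float, h: float) -> float:
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2 h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


strike, vol, T = 100.0, 0.2, 1.0
delta = central_difference(lambda s: approx_call(s, strike, vol, T), 100.0, 1e-3)
vega = central_difference(lambda v: approx_call(100.0, strike, v, T), vol, 1e-5)

# For this stand-in formula the derivative in x0 simplifies to
# vol * sqrt(T) * n(d) + N(d), which gives a closed-form value to compare against.
delta_closed = vol * math.sqrt(T) * norm_pdf(0.0) + norm_cdf(0.0)
```

The same wrapper applies unchanged to any parameter of the approximate formula, which is the practical appeal of having the price in closed form.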
For instance, [68] implements direct differentiations of the approximate formulas for option values under a time-homogeneous general local volatility model, and obtains closed form approximate formulas for the Deltas and Vegas. Moreover, [68] applies a similar technique to computing the Deltas and Vegas for average options with continuous monitoring, and gets their closed form approximate formulas as well. They also confirm the validity of the approximations through numerical experiments in the CEV model.
By deriving asymptotic expansions of characteristic functions of option values, [93, 94] propose a new expansion scheme for pricing options on long-term currencies under a Libor market model (LMM) and a general diffusion stochastic volatility model with jumps of spot exchange rates. Furthermore, applying the approximate formulas, they provide analytical (closed form) approximations for the Deltas and Gammas of the options. Please see [93, 94] for the details.
Alternatively, for a parameter $\theta$, the sensitivity of a call price $C(K, T)$ with respect to a change in $\theta$ is expressed as follows:
\[
\begin{aligned}
\frac{\partial}{\partial\theta} C(K, T) &= \frac{\partial}{\partial\theta} \left\{ P(0, T) E[\max\{g(X_T^{(\epsilon)}) - K, 0\}] \right\} \\
&= \frac{\partial}{\partial\theta} \left\{ \epsilon P(0, T) E\left[ \max\left\{ G^{(\epsilon)} + y, 0 \right\} \right] \right\} \\
&= \epsilon \frac{\partial}{\partial\theta} \{P(0, T)\}\, E\left[ \max\left\{ G^{(\epsilon)} + y, 0 \right\} \right] + \epsilon P(0, T) \frac{\partial}{\partial\theta} E\left[ \max\left\{ G^{(\epsilon)} + y, 0 \right\} \right] \\
&= \frac{\partial \{P(0, T)\}}{\partial\theta} \frac{C^{AE}(K, T)}{P(0, T)} + \epsilon P(0, T) E\left[ \left( \frac{\partial G^{(\epsilon)}}{\partial\theta} + \frac{\partial y}{\partial\theta} \right) 1_{\{G^{(\epsilon)} > -y\}} \right],
\end{aligned} \tag{47}
\]

where $C^{AE}(K, T)$ stands for the approximate call price with strike $K$ and maturity $T$, which is obtained by the asymptotic expansion.
Then, we are able to obtain an approximation of the sensitivity by a direct application of the asymptotic expansion to the above equation, in particular to the second term in the last line. For example, in a one-dimensional diffusion setting, that is, a general time-homogeneous local volatility model, [66] successfully applies the expansion technique to the computation of the Deltas and the Vegas with numerical experiments.
More generally, we note that a method similar to the one used for option pricing in the previous subsection can be applied to Greeks, since we can take $\Phi \in \mathcal{S}'$ for $E[\Phi(G^{(\epsilon)})]$ in (11) and apply the integration-by-parts method in Malliavin calculus. Recently, [103] takes this approach and derives asymptotic expansions of Greeks around the Black-Scholes model in a stochastic volatility environment, and develops a unified method for precise estimates of the expansion errors. In particular, they make use of the so-called Kusuoka-Stroock functions introduced by Kusuoka [52], which are a powerful tool to clarify the order of a Wiener functional with respect to the time parameter $t$ in a unified manner. Then, they estimate the error bounds for the Malliavin weights of both the coefficient and the residual terms in the expansions.

3.6 Approximations of Asset Values Under Diffusion Processes

The framework of the asymptotic expansion can be applied not only to the simple
cases mentioned above, but also to the evaluation of a much broader range of asset and
security values. In particular, there are many cases where the asymptotic expansion
can be applied to approximate their values when the underlying asset prices of financial
securities, cash flows and interest rates are expressed as functions of a
random vector $X^{(\epsilon)}$ that follows a diffusion process. The method is almost the same
as the one illustrated above and hence is omitted. In this subsection, we only review
how to represent the values of financial assets.
First, just as in the previous subsections, we consider a d-dimensional diffusion
process $X^{(\epsilon)}$ defined as the strong solution to the stochastic differential equation (1).
As an example, the present value $V$ of a financial asset which generates a cash flow
at the maturity date $T$ is represented as

$$ V = E\left[ e^{-\int_0^T R_2(X_u^{(\epsilon)})\,du}\, F\big(g(X_T^{(\epsilon)})\big) \right], \tag{48} $$

where $g$ denotes the underlying asset price and $F$ is the cash flow which characterizes
the asset to be evaluated. Note that the underlying asset price $g$ follows a diffusion
process whose drift term (the coefficient of the $dt$ term) is
$R_1(X_t^{(\epsilon)})g - D(X_t^{(\epsilon)})$ under an equivalent martingale measure.
Moreover, $R_1$ at time $t \in [0, T]$ is represented as

$$ R_1(X_t^{(\epsilon)}) = r(X_t^{(\epsilon)}) + \sum_{j=1}^{J_1} s_{1j}(X_t^{(\epsilon)}), $$

where $r$ denotes the risk-free interest rate and $s_{1j}$, $j = 1, \ldots, J_1$, stand for various
spreads (the differences from the risk-free rate) such as credit spreads and liquidity
spreads. Suppose also that those are expressed as functions of the variable $X^{(\epsilon)}$.
Further, $D(X_t^{(\epsilon)})$ denotes a payoff generated by the underlying asset, such as a dividend or
an interest payment, and is also represented as a function of the variable $X^{(\epsilon)}$. Meanwhile,
the discount rate at time $t$, that is, $R_2(X_t^{(\epsilon)})$, of the target asset $F$ to be evaluated is
also expressed as

$$ R_2(X_t^{(\epsilon)}) = r(X_t^{(\epsilon)}) + \sum_{j=1}^{J_2} s_{2j}(X_t^{(\epsilon)}), $$

where $s_{2j}$, $j = 1, \ldots, J_2$, are various spreads related to the objective asset or security.
We again assume that those are expressed as functions of the variable $X^{(\epsilon)}$.
As an example, let $F = 1$ in (48) for a zero-coupon bond with face value
1 and maturity date $T$. Also, let $V_i$ denote the price of the zero-coupon bond
with maturity $T_i$. Then $V$, the value of a coupon bond with maturity $T_N$
and coupon (and principal) payments $c_i$ at $T_i$ ($i = 1, \ldots, N$, $T_1 < \cdots < T_N$), is
represented by the equation $V = \sum_{i=1}^{N} c_i V_i$. Moreover, the present value of a call
option on the coupon bond with option maturity $T$ ($< T_1$) can be evaluated if we
set $F(x) = (x - K)^+$ and $g(X_T^{(\epsilon)}) = \sum_{i=1}^{N} c_i g_i(X_T^{(\epsilon)})$ in equation (48), where
$g_i(X_T^{(\epsilon)})$, $i = 1, \ldots, N$, are given by

$$ g_i(X_T^{(\epsilon)}) = E\left[ e^{-\int_T^{T_i} R_1(X_u^{(\epsilon)})\,du} \,\middle|\, X_T^{(\epsilon)} \right]. $$

Finally, we briefly review applications of the asymptotic expansion technique to
numerical problems in finance that cannot be presented in detail in the present note
because of space limitations.
Takahashi and Yoshida [106] apply an asymptotic expansion to a dynamic investment
problem with utility maximization for the asset value at the end of the investment
period, and derive an approximation formula for evaluating the optimal portfolio.
Although the optimal portfolio has been numerically evaluated as a function of derivatives
of the solution to some Bellman equation except for special cases, it is a hard
task to implement when the number of assets is large. Takahashi and Yoshida
[106] provide an approximation based on the representation which Ocone-Karatzas
[74] derive by using the so-called Clark-Ocone formula. Moreover, [45] applies this
method to a dynamic bond portfolio problem.
For evaluating the expectation of a Wiener functional by Monte Carlo
simulation, [107] proposes a new estimator with a control variate whose
expectation is obtained explicitly by an asymptotic expansion and which is highly
correlated with the target Wiener functional. With this estimator the simulation
converges much faster, and the approximation error of a low-order asymptotic
expansion, such as the first or second order, is reduced. For
extensions of this method, please see [51, 88, 99].
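The control-variate idea can be illustrated with a minimal sketch. It is not the estimator of [107]: as an assumption made only for illustration, the role of the expansion term is played by the call price in a normal (Bachelier-type) model, whose expectation is known in closed form, and the target is a call under a log-normal model with zero interest rate. All function names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def black_scholes_call(s0, k, sigma, t):
    """Exact log-normal call price with zero interest rate (reference value only)."""
    v = sigma * math.sqrt(t)
    d1 = (math.log(s0 / k) + 0.5 * v * v) / v
    return s0 * normal_cdf(d1) - k * normal_cdf(d1 - v)


def normal_model_call(s0, k, sigma, t):
    """Closed form of E[(s0 + s0*sigma*sqrt(t)*Z - k)^+]; stands in for the expansion term."""
    v = s0 * sigma * math.sqrt(t)
    d = (s0 - k) / v
    return (s0 - k) * normal_cdf(d) + v * normal_pdf(d)


def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def mc_with_control_variate(s0, k, sigma, t, n_paths, seed=0):
    """Return (plain estimate, plain variance, control-variate estimate, its variance)."""
    rng = random.Random(seed)
    v = sigma * math.sqrt(t)
    plain, diff = [], []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        target = max(s0 * math.exp(-0.5 * v * v + v * z) - k, 0.0)
        control = max(s0 * (1.0 + v * z) - k, 0.0)  # same Z, so highly correlated
        plain.append(target)
        diff.append(target - control)
    cv_estimate = sum(diff) / n_paths + normal_model_call(s0, k, sigma, t)
    return sum(plain) / n_paths, sample_variance(plain), cv_estimate, sample_variance(diff)
```

Only the difference between the target payoff and the control payoff is simulated, so the per-sample variance drops when the two are strongly correlated; the known expectation of the control is added back analytically.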
For pricing American options, [89] extends a well-known decomposition formula
for an American option value by Carr-Jarrow-Myneni [8], and proposes an approximation
of the value by making use of the approximate density function of the underlying
asset, which is obtained by the asymptotic expansion.
Moreover, because of the generality and unified nature of this approach with analytical
(so-called closed-form) formulas, the asymptotic expansion method has been
applied to a broad class of valuation models which have recently become popular in
practice. In particular, compared with other numerical approximation schemes such as
Monte Carlo simulation and numerical methods for partial differential
equations (PDEs), it has an advantage in high-dimensional problems. We list
the following works as examples.
Applying the framework described above to default risk models, Muroi [67]
derives asymptotic expansions for approximations of CDS (credit default swap)
spreads.
Shiraya et al. [82] apply the expansion technique to obtain an approximation
of swaption values under the Libor market model (LMM) of interest rates (Brace,
Gatarek and Musiela [7], Jamshidian [43]) with local-stochastic volatility models.
Takahashi and Takehara [90–92] develop asymptotic expansion formulas for pricing
long-term currency options with a Libor market model (LMM) of interest rates
and diffusion or jump-diffusion stochastic volatility processes of spot exchange rates.
Moreover, [92] presents a new characteristic-function-based Monte Carlo simulation
scheme with the asymptotic expansion as a control variate.
Takahashi and Takehara [96] develop a general computation scheme for the high-order
expansion method explained in this section, and apply it to the SABR model
(Hagan, Kumar, Lesniewski, and Woodward [33]). They derive the expansions of
the option prices up to the fifth order to show that the higher order expansion improves
the approximations.
Takahashi and Takehara [108] and Takahashi et al. [109] also apply this scheme to
long-term currency options, such as those with 10 year maturity, under a Libor market
model (LMM) of interest rates and stochastic volatility processes of spot exchange
rates. Again, they confirm that the fourth or the fifth order expansion provides
better approximations than the lower order ones.
Furthermore, we are able to apply the expansion method to pricing so-called
exotic options. For instance, [78] derives expansions of average options with
discrete monitoring under stochastic volatility models in order to obtain approximate
prices of commodity average options. Moreover, they calibrate the model to
real futures plain-vanilla option prices of the underlying commodities, and evaluate
average options based on the parameters obtained by the calibration.
Shiraya et al. [83] develop new approximation formulas for pricing single and
double barrier options with discrete monitoring under stochastic volatility models.
In addition, they demonstrate the validity of the formulas through numerical experiments.
Shiraya et al. [81] present a new approximation scheme for pricing continuous
barrier options in a stochastic volatility environment. In particular, they make use of
a static hedging scheme and the fifth order expansions of the vanilla options to
obtain accurate approximate prices. Further, they derive the fifth order expansions
for pricing average options with continuous monitoring under stochastic volatility
models to achieve very precise approximations.
Shiraya and Takahashi [79] develop a general scheme for the evaluation of so-called
multi-asset cross currency options. In particular, they derive the expansions of
basket option prices with 100 underlying assets (200 state variables with their stochastic
volatilities), and cross currency average/basket options with discrete monitoring
under stochastic volatility models to obtain accurate approximations.
Kato et al. [46, 47] develop a new expansion scheme for solutions of Cauchy-
Dirichlet problems for second order parabolic partial differential equations (PDEs)
and apply it to pricing down-and-out/up-and-out barrier options with continuous
monitoring under stochastic volatility models.
4 Extension

This section follows [97], which presents an extension of the general computational
scheme of the asymptotic expansion described in the previous section. In particular,
by a change of variable technique and by various ways of setting the perturbation
parameters in the expansion, we gain flexibility in setting the benchmark
distribution around which the expansion is made, together with an automatic way of
computing the expansion up to any order. For instance, we introduce expansions
called the log-normal expansion and the CEV expansion.

4.1 Change of Variable and Perturbation

We consider a d-dimensional diffusion process $X_t = (X_t^1, \ldots, X_t^d)$ which is the
solution to the following stochastic differential equation:

$$ dX_t^j = V_0^j(X_t)\,dt + V^j(X_t)\,dW_t \quad (j = 1, \ldots, d), \qquad X_0 = x_0 \in \mathbb{R}^d, \tag{49} $$

where $W = (W^1, \ldots, W^r)$ is an r-dimensional standard Wiener process;
$V_0^j : \mathbb{R}^d \to \mathbb{R}$ and $V^j : \mathbb{R}^d \to \mathbb{R}^r$ are smooth functions with bounded derivatives of all
orders.
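Next, let $C : \mathbb{R}^d \to \mathbb{R}^d$ be a $C^2$-function which has a unique inverse function
$C^{-1}$, and define $\tilde{X}_t$ as $\tilde{X}_t = C(X_t)$. Then, the dynamics of $\tilde{X}$ are given by

$$ d\tilde{X}_t^j = \tilde{V}_0^j(\tilde{X}_t)\,dt + \tilde{V}^j(\tilde{X}_t)\,dW_t \quad (j = 1, \ldots, d), \qquad \tilde{X}_0 = \tilde{x}_0, \tag{50} $$

where

$$ \tilde{V}_0^j(\tilde{x}) := \sum_{j'=1}^{d} \partial_{j'} C^j(C^{-1}(\tilde{x}))\, V_0^{j'}(C^{-1}(\tilde{x})) + \frac{1}{2} \sum_{j',k'=1}^{d} \partial_{j'k'} C^j(C^{-1}(\tilde{x}))\, V^{j'}(C^{-1}(\tilde{x}))\, V^{k'}(C^{-1}(\tilde{x}))^{\top}, $$

$$ \tilde{V}^j(\tilde{x}) := \sum_{j'=1}^{d} \partial_{j'} C^j(C^{-1}(\tilde{x}))\, V^{j'}(C^{-1}(\tilde{x})), $$

and $\tilde{x}_0 = C(x_0)$. ($V^{k'}(C^{-1}(\tilde{x}))^{\top}$ denotes the transpose of $V^{k'}(C^{-1}(\tilde{x}))$.)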


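Next, we introduce a perturbation parameter $\epsilon \in (0, 1]$ as follows:

$$ \tilde{X}_t \to \tilde{X}_t^{(\epsilon)}, \qquad \tilde{V}_0^j(\tilde{x}) \to \tilde{V}_0^{(\epsilon),j}(\tilde{x}, \epsilon), \qquad \tilde{V}^j(\tilde{x}) \to \epsilon \tilde{V}^j(\tilde{x}), $$

and hence, the dynamics of $\tilde{X}^{(\epsilon)}$ are expressed as

$$ d\tilde{X}_t^{(\epsilon),j} = \tilde{V}_0^{(\epsilon),j}(\tilde{X}_t^{(\epsilon)}, \epsilon)\,dt + \epsilon \tilde{V}^j(\tilde{X}_t^{(\epsilon)})\,dW_t \quad (j = 1, \ldots, d). \tag{51} $$

Then, we are able to apply the technique developed in the previous section to the
transformed SDE (51).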

4.2 Applications to Option Pricing Under Local-Stochastic Volatility Model

We assume that the underlying process is the unique solution to the following SDE:

$$ \begin{aligned} dS_t &= \sigma(X_t)\,h(S_t)\,dW_t, \\ dX_t^j &= V_0^j(X_t)\,dt + V^j(X_t)\,dW_t \quad (j = 2, \ldots, d), \\ S_0 &= s_0 \in \mathbb{R}, \quad X_0 = x_0 \in \mathbb{R}^{d-1}, \end{aligned} \tag{52} $$

where $\sigma : \mathbb{R}^{d-1} \to \mathbb{R}^r$, $h : \mathbb{R} \to \mathbb{R}$, and $W$ is an r-dimensional Brownian motion.
Then, we evaluate a call option with strike $K$ and maturity $T$ whose underlying
price process is given by $S$. Assuming a zero interest rate for simplicity, the
call price $\mathrm{Call}(K, T)$ with strike price $K$ and maturity $T$ is obtained by

$$ \mathrm{Call}(K, T) = E[(S_T - K)^+]. \tag{53} $$

First, for $x = (x^1, x^2, \ldots, x^d)$, let

$$ C(x) = (C_1(x^1), x^2, \ldots, x^d), $$

where $C_1 : \mathbb{R} \to \mathbb{R}$ is an invertible $C^2$-function. Then, $\tilde{S}_t = C_1(S_t)$ is the
solution to the following SDE:

$$ d\tilde{S}_t = \frac{1}{2} |\sigma(X_t)|^2\, h(C_1^{-1}(\tilde{S}_t))^2\, C_1^{(2)}(C_1^{-1}(\tilde{S}_t))\,dt + \sigma(X_t)\, h(C_1^{-1}(\tilde{S}_t))\, C_1^{(1)}(C_1^{-1}(\tilde{S}_t))\,dW_t, \qquad \tilde{s}_0 = C_1(s_0), \tag{54} $$

where $C_1^{(1)}(x) := \frac{d}{dx} C_1(x)$ and $C_1^{(2)}(x) := \frac{d^2}{dx^2} C_1(x)$.
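Next, we introduce a perturbation parameter $\epsilon$ as follows:

$$ \begin{aligned} d\tilde{S}_t^{(\epsilon)} &= \frac{\eta(\epsilon)}{2} |\sigma(X_t^{(\epsilon)})|^2\, h(C_1^{-1}(\tilde{S}_t^{(\epsilon)}))^2\, C_1^{(2)}(C_1^{-1}(\tilde{S}_t^{(\epsilon)}))\,dt + \epsilon\, \sigma(X_t^{(\epsilon)})\, h(C_1^{-1}(\tilde{S}_t^{(\epsilon)}))\, C_1^{(1)}(C_1^{-1}(\tilde{S}_t^{(\epsilon)}))\,dW_t, \\ dX_t^{(\epsilon),j} &= V_0^j(X_t^{(\epsilon)}, \epsilon)\,dt + \epsilon V^j(X_t^{(\epsilon)})\,dW_t \quad (j = 2, \ldots, d), \end{aligned} \tag{55} $$

where $\eta(\epsilon) = \epsilon^k$ and $k$ is a nonnegative integer, $k = 0, 1, 2, \ldots$. Note that

$$ S_t = C_1^{-1}(\tilde{S}_t) = C_1^{-1}(\tilde{S}_t^{(1)}), $$

where $\tilde{S}_t^{(1)} = \tilde{S}_t^{(\epsilon)}|_{\epsilon=1}$.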
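According to Theorem 2 in the previous section, we already have an asymptotic
expansion of the density function of $G^{(\epsilon)} = \frac{\tilde{S}_T^{(\epsilon)} - \tilde{S}_T^{(0)}}{\epsilon}$ up to $\epsilon^N$-order, denoted by
$f_{G^{(\epsilon)},N}(x)$.
Therefore, an approximation formula of the call price is given as follows:

$$ \mathrm{Call}(K, T) = E[(S_T - K)^+] = E\left[ \left( C_1^{-1}(\tilde{S}_T^{(1)}) - K \right)^+ \right] \tag{56} $$

$$ \approx \int_y^{\infty} \left( C_1^{-1}(x + \tilde{S}_T^{(0)}) - K \right) f_{G^{(1)},N}(x)\,dx, \tag{57} $$

where $y = C_1(K) - \tilde{S}_T^{(0)}$.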
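A simple example is the following. Set the local volatility function to be linear:

$$ \begin{aligned} dS_t &= \sigma(X_t)\,S_t\,dW_t, \\ dX_t^j &= V_0^j(X_t)\,dt + V^j(X_t)\,dW_t \quad (j = 2, \ldots, d). \end{aligned} \tag{58} $$

For $x = (x^1, x^2, \ldots, x^d)$, let

$$ C(x) = (\log x^1, x^2, \ldots, x^d), $$

and set $\eta(\epsilon) = \epsilon^k$ where $k$ is 0, 1 or 2. Then, we have $\tilde{S}_t^{(\epsilon)} = \log S_t^{(\epsilon)}$, where

$$ \begin{aligned} d\tilde{S}_t^{(\epsilon)} &= -\frac{\epsilon^k}{2} |\sigma(X_t^{(\epsilon)})|^2\,dt + \epsilon\, \sigma(X_t^{(\epsilon)})\,dW_t, \\ dX_t^{(\epsilon),j} &= V_0^j(X_t^{(\epsilon)}, \epsilon)\,dt + \epsilon V^j(X_t^{(\epsilon)})\,dW_t \quad (j = 2, \ldots, d). \end{aligned} \tag{59} $$

This case corresponds to existing work (e.g. [91, 92, 95, 96, 100]).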
4.3 Examples

This subsection shows more specific examples in the local-stochastic volatility
model.

4.3.1 CEV Model

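The first example is the well-known CEV (Constant Elasticity of Variance) model
(Cox [10]):

$$ dS_t = \sigma \left( S_t^{\beta} S_0^{1-\beta} \right) dW_t, \quad \sigma \text{ and } S_0 \text{ are positive constants}, \ \beta \in [0, 1], \tag{60} $$

where the term $S_0^{1-\beta}$ keeps the level of $\sigma$ at the same order for different $\beta$. For
$x > 0$, let us take the change of variable function to be $C(x) = \log(x/S_0)$, that is
$x = C^{-1}(\tilde{x}) = S_0 \exp(\tilde{x})$. Hence, $\tilde{S}_t = \log \frac{S_t}{S_0}$ and we have

$$ d\tilde{S}_t = -\frac{1}{2} \sigma^2 e^{2(\beta-1)\tilde{S}_t}\,dt + \sigma e^{(\beta-1)\tilde{S}_t}\,dW_t; \quad \tilde{S}_0 = 0. \tag{61} $$

Next, we again introduce a perturbation $\epsilon \in [0, 1]$ as follows:

$$ d\tilde{S}_t^{(\epsilon)} = -\frac{\eta(\epsilon)}{2} \sigma^2 e^{2(\beta-1)\tilde{S}_t^{(\epsilon)}}\,dt + \epsilon\, \sigma e^{(\beta-1)\tilde{S}_t^{(\epsilon)}}\,dW_t; \quad \tilde{S}_0^{(\epsilon)} = 0, \tag{62} $$

where $\eta(\epsilon) = \epsilon^j$ and $j$ is a nonnegative integer.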


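Because

$$ S_T = C^{-1}(\tilde{S}_T^{(1)}) = S_0 \exp(\tilde{S}_T^{(1)}) = S_0 \exp\left( G^{(1)} + \tilde{S}_T^{(0)} \right), $$

an approximation formula of the call price with strike $K$ and maturity $T$ is given as
follows:

$$ \mathrm{Call}(K, T) = E[(S_T - K)^+] = E\left[ \left( S_0 \exp\left( G^{(1)} + \tilde{S}_T^{(0)} \right) - K \right)^+ \right] \approx \int_y^{\infty} \left( S_0 \exp\left( x + \tilde{S}_T^{(0)} \right) - K \right) f_{G^{(1)},N}(x)\,dx; \tag{63} $$

$$ y = C(K) - \tilde{S}_T^{(0)} = \log \frac{K}{S_0} - \tilde{S}_T^{(0)}. \tag{64} $$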
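As a sanity check of the structure of (63) and (64), consider the special case $\beta = 1$ with $\eta(\epsilon) = \epsilon^0 = 1$: then (62) gives $\tilde{S}_T^{(0)} = -\sigma^2 T/2$ and $G^{(1)} = \sigma W_T$, so the density in (63) is exactly $N(0, \sigma^2 T)$ and the integral should reproduce the Black-Scholes price with zero interest rate. The sketch below evaluates the integral with a trapezoidal rule; the function names, the truncation at ten standard deviations and the grid size are illustrative assumptions, not part of the original scheme.

```python
import math


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, sigma, t):
    """Closed-form call price with zero interest rate."""
    v = sigma * math.sqrt(t)
    d1 = (math.log(s0 / k) + 0.5 * v * v) / v
    return s0 * normal_cdf(d1) - k * normal_cdf(d1 - v)


def lognormal_leading_order_call(s0, k, sigma, t, n_steps=20000, n_std=10.0):
    """Trapezoidal evaluation of int_y^inf (s0*exp(x + s_tilde0) - k) f(x) dx,
    where f is the N(0, sigma^2 t) density and s_tilde0 = -sigma^2 t / 2
    (the beta = 1, eta = 1 case)."""
    var = sigma * sigma * t
    s_tilde0 = -0.5 * var
    y = math.log(k / s0) - s_tilde0            # lower limit, as in (64)
    upper = y + n_std * math.sqrt(var)         # truncation of the infinite range

    def integrand(x):
        density = math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)
        return (s0 * math.exp(x + s_tilde0) - k) * density

    h = (upper - y) / n_steps
    total = 0.5 * (integrand(y) + integrand(upper))
    for i in range(1, n_steps):
        total += integrand(y + i * h)
    return total * h
```

For $\beta < 1$ the same integral would be evaluated with the expanded density $f_{G^{(1)},N}$ in place of the exact normal density used here.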

Note that $f_{g_{1T}}$, the first term in the asymptotic expansion of the density $f_{G^{(\epsilon)}}$, is a
normal density and hence, the underlying asset price is expanded around a log-normal
distribution. Thus, we could call this case a log-normal asymptotic expansion. We
also remark that the case of $\eta(\epsilon) = \epsilon^0 = 1$ is harder to evaluate than the other
cases, which is essentially due to the difficulty in computing $\tilde{S}_t^{(0)}$ for $\eta(\epsilon) = 1$.
• On the Validity of the Asymptotic Expansion for CEV model
Previous works such as [85, 86, 107] have considered an asymptotic expansion
of (average and vanilla) option prices based on the following type of perturbed
process: for $\beta \in [1/2, 1)$,

$$ dS_t^{(\epsilon)} = \epsilon\, (S_t^{(\epsilon)} \vee 0)^{\beta}\,dW_t; \quad S_0^{(\epsilon)} = s_0. \tag{65} $$

Although the coefficient function in this model is not smooth at 0, the asymptotic
expansion method is still applicable. For instance, we could use a smooth
modification technique (e.g. [106, 107]). That is, let us take a modified process
$(\tilde{S}_t^{(\epsilon)})_{t \in [0,T]}$ of $(S_t^{(\epsilon)})_{t \in [0,T]}$ as follows:

$$ d\tilde{S}_t^{(\epsilon)} = \epsilon\, g(\tilde{S}_t^{(\epsilon)})\,dW_t. \tag{66} $$

Here, $g(x)$ is a smooth modification of $(x \vee 0)^{\beta}$ such that $g(x) = x^{\beta}$ when
$x \geq a_1$ for some small $a_1 \in (0, a)$ with $a = \frac{1}{2} s_0$, and $g(x) = 0$ when $x \leq a_2$ for
some $a_2 \in (0, a_1)$. Specifically, we may set $g(x)$ as follows. For $t \in [0, T]$,

$$ \begin{aligned} g(x) &= h(x)\,x^{\beta}, \\ h(x) &= \frac{\psi(x - a_2)}{\psi(x - a_2) + \psi(a_1 - x)}, \quad 0 < a_2 < a_1, \\ \psi(x) &= e^{-1/x} \ \text{for } x > 0, \qquad \psi(x) = 0 \ \text{for } x \leq 0. \end{aligned} $$
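The smooth modification above can be written down directly. The following sketch implements $\psi$, $h$ and $g$ as defined; the function names are illustrative, and the code assumes that $a_1 - a_2$ is not so small that both $\psi$ terms underflow to zero in floating point.

```python
import math


def psi(x):
    """psi(x) = exp(-1/x) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0


def cutoff(x, a1, a2):
    """h(x): equals 0 for x <= a2 and 1 for x >= a1, with 0 < a2 < a1."""
    numerator = psi(x - a2)
    return numerator / (numerator + psi(a1 - x))


def g_modified(x, beta, a1, a2):
    """g(x) = h(x) * x**beta: agrees with x**beta for x >= a1 and vanishes for x <= a2."""
    h = cutoff(x, a1, a2)
    if h == 0.0:
        return 0.0  # avoids evaluating x**beta for x <= 0
    return h * x ** beta
```

With $s_0 = 1$, for instance, one may take $a = 0.5$, $a_1 = 0.4$ and $a_2 = 0.2$; the modified coefficient then coincides with $x^{\beta}$ on $[0.4, \infty)$ and is identically zero on $(-\infty, 0.2]$.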
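Suppose that for an $\mathbb{R}$-valued function $f$, $E\left[ |f(S^{(\epsilon)})|^2 \right] < \infty$ and
$E\left[ |f(\tilde{S}^{(\epsilon)})|^2 \right] < \infty$ (e.g. we can take option payoff functions as $f$ in our setting). Then, we have

$$ E\left[ \left| f(S^{(\epsilon)}) - f(\tilde{S}^{(\epsilon)}) \right| 1_{\{S^{(\epsilon)} \neq \tilde{S}^{(\epsilon)}\}} \right] \leq \left( E\left[ |f(S^{(\epsilon)})|^2 \right]^{\frac{1}{2}} + E\left[ |f(\tilde{S}^{(\epsilon)})|^2 \right]^{\frac{1}{2}} \right) \times P\left( \{S^{(\epsilon)} \neq \tilde{S}^{(\epsilon)}\} \right)^{\frac{1}{2}}. $$

It also holds that

$$ \begin{aligned} P\left( \{S^{\epsilon} \neq \tilde{S}^{\epsilon}\} \right) &= P\left( \{S_t^{\epsilon} \leq a_1 \ \text{for some } t \in [0, T]\} \right) \\ &\leq P\left( \left\{ \sup_{0 \leq t \leq T} |S_t^{\epsilon} - S_t^{0}| > a \right\} \right) \\ &\quad + P\left( \{S_t^{\epsilon} \leq a_1 \ \text{for some } t \in [0, T]\} \cap \left\{ \sup_{0 \leq t \leq T} |S_t^{\epsilon} - S_t^{0}| \leq a \right\} \right). \end{aligned} $$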
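We can easily see that the second term after the last inequality is 0, since
$S_t^{0} = s_0$ and hence $\sup_{0 \leq t \leq T} |S_t^{\epsilon} - S_t^{0}| \leq a$ implies $S_t^{\epsilon} \geq s_0 - a = a > a_1$. The first term
is smaller than any $\epsilon^n$ for $n = 1, 2, \ldots$ by the following lemma, a large deviation
inequality:
Lemma 4 Suppose that $Z_t^{\epsilon}$, $t \in [0, T]$, is the solution to the SDE:

$$ dZ_t^{\epsilon} = \mu(Z_t^{\epsilon})\,dt + \epsilon\, \sigma(Z_t^{\epsilon})\,dW_t, $$

where $\mu(z)$ satisfies the Lipschitz and linear growth conditions, and $\sigma(z)$ satisfies
the linear growth condition. We assume that the unique strong solution exists. Then,
there exist positive constants $c_1$ and $c_2$ independent of $\epsilon$ such that

$$ P\left( \left\{ \sup_{0 \leq s \leq T} |Z_s^{\epsilon} - Z_s^{0}| > c \right\} \right) \leq c_1 \exp(-c_2 \epsilon^{-2}) \tag{67} $$

for all $c > 0$.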


The lemma can be proved by a slight modification of Lemma 5.3 in [119] or
Lemma 7.1 in [50]. Note also that $S^{\epsilon}$ and $\tilde{S}^{\epsilon}$ satisfy the conditions in the lemma
above.
Hence,

$$ E\left[ \left| f(S^{(\epsilon)}) - f(\tilde{S}^{(\epsilon)}) \right| \right] = o(\epsilon^n), \quad n = 1, 2, \ldots. \tag{68} $$

Therefore, the difference between $f(S^{(\epsilon)})$ and $f(\tilde{S}^{(\epsilon)})$ is negligible in a small
disturbance asymptotic theory, and hence we could apply an asymptotic expansion
to $E\left[ f(\tilde{S}^{(\epsilon)}) \right]$ instead of $E\left[ f(S^{(\epsilon)}) \right]$.
In particular, [107] considered the case that $\beta = 1/2$ and
$f(x) = \left( \frac{1}{T} \int_0^T x_t\,dt - K \right)^+$, $x = S^{(\epsilon)}, \tilde{S}^{(\epsilon)}$ (an average call option's payoff). A similar modification
could be applied to the asymptotic expansions for transformed processes in this
section. Please also see [88] for numerical experiments under the smooth and
bounded modification of this kind for volatility functions in a HJM-type model of
interest rates.

4.3.2 SABR Model

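Next, let us consider the stochastic volatility model known as the SABR [33] (or λ-SABR
[38]) model:

$$ \begin{aligned} dS_t &= \sigma_t \left( S_t^{\beta} S_0^{1-\beta} \right) dW_t^1; \quad S_0 > 0, \\ d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0, \end{aligned} \tag{69} $$

where $\beta \in [0, 1]$, $\lambda \geq 0$, $\theta > 0$, $\nu > 0$, and $W = (W^1, W^2)$ is a two-dimensional
Wiener process with correlation $\rho \in [0, 1]$.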
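• Log-normal Asymptotic Expansion
Let us take a log-normal asymptotic expansion for the underlying asset price $S$,
that is, for $x_1 > 0$, set $C(x_1, x_2) = (\log(x_1/S_0), x_2)$ and $\tilde{S}_t = \log \frac{S_t}{S_0}$:

$$ \begin{aligned} d\tilde{S}_t &= -\frac{1}{2} \sigma_t^2 e^{2(\beta-1)\tilde{S}_t}\,dt + \sigma_t e^{(\beta-1)\tilde{S}_t}\,dW_t^1; \quad \tilde{S}_0 = 0, \\ d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0. \end{aligned} \tag{70} $$

Next, we again introduce a perturbation $\epsilon \in [0, 1]$ as follows:

$$ \begin{aligned} d\tilde{S}_t^{(\epsilon)} &= -\frac{\eta_1(\epsilon)}{2} (\sigma_t^{(\epsilon)})^2 e^{2(\beta-1)\tilde{S}_t^{(\epsilon)}}\,dt + \epsilon\, \sigma_t^{(\epsilon)} e^{(\beta-1)\tilde{S}_t^{(\epsilon)}}\,dW_t^1; \quad \tilde{S}_0^{(\epsilon)} = 0, \\ d\sigma_t^{(\epsilon)} &= \eta_2(\epsilon) \lambda(\theta - \sigma_t^{(\epsilon)})\,dt + \epsilon\, \nu \sigma_t^{(\epsilon)}\,dW_t^2; \quad \sigma_0^{(\epsilon)} = \sigma_0, \end{aligned} \tag{71} $$

where $\eta_i(\epsilon) = \epsilon^{j_i}$, $i = 1, 2$, and $j_i$ is a nonnegative integer.
For instance, typical cases are $\eta_1(\epsilon) = \epsilon^0 = 1$ with $\eta_2(\epsilon) = \epsilon$ (an extension of
the log-normal asymptotic expansion in [95, 100]), or $\eta_2(\epsilon) = \epsilon^2$ (an extension of
[90] to the CEV-type local volatility).
An approximation formula of the call price with strike $K$ and maturity $T$ is given
as follows:

$$ \mathrm{Call}(K, T) = E[(S_T - K)^+] = E\left[ \left( S_0 \exp\left( G^{(1)} + \tilde{S}_T^{(0)} \right) - K \right)^+ \right] \approx \int_y^{\infty} \left( S_0 \exp\left( x + \tilde{S}_T^{(0)} \right) - K \right) f_{G^{(1)},N}(x)\,dx; \tag{72} $$

$$ y = C(K) - \tilde{S}_T^{(0)} = \log \frac{K}{S_0} - \tilde{S}_T^{(0)}. \tag{73} $$

Again, we note that the case of $\eta_1(\epsilon) = \epsilon^0 = 1$ is harder to evaluate than the
other cases, which results from the difficulty in computing $\tilde{S}_t^{(0)}$ for $\eta_1(\epsilon) = 1$.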
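• CEV Asymptotic Expansion
Let us take the change of variable function $C$ as $C(x_1, x_2) = (C_1(x_1), x_2)$ for $(x_1, x_2)$,
where for $x > 0$ and $\beta \in [0, 1)$,

$$ C_1(x) = \frac{1}{1 - \beta} \frac{x^{1-\beta}}{S_0^{1-\beta}} = \int_0^x \frac{dz}{z^{\beta} S_0^{1-\beta}}. \tag{74} $$

That is,

$$ C_1^{-1}(\tilde{x}) = S_0 (1 - \beta)^{\frac{1}{1-\beta}}\, \tilde{x}^{\frac{1}{1-\beta}}. \tag{75} $$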
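Then, as $\tilde{S}_t = C_1(S_t)$, we have

$$ \begin{aligned} d\tilde{S}_t &= -\frac{1}{2} \frac{\beta}{1 - \beta} \sigma_t^2 \frac{1}{\tilde{S}_t}\,dt + \sigma_t\,dW_t^1; \quad \tilde{S}_0 = \frac{1}{1 - \beta} > 0, \\ d\sigma_t &= \lambda(\theta - \sigma_t)\,dt + \nu \sigma_t\,dW_t^2; \quad \sigma_0 > 0. \end{aligned} \tag{76} $$

Again, we set a perturbed process as follows:

$$ \begin{aligned} d\tilde{S}_t^{(\epsilon)} &= -\frac{\eta_1(\epsilon)}{2} \frac{\beta}{1 - \beta} (\sigma_t^{(\epsilon)})^2 \frac{1}{\tilde{S}_t^{(\epsilon)}}\,dt + \epsilon\, \sigma_t^{(\epsilon)}\,dW_t^1; \quad \tilde{S}_0^{(\epsilon)} = \frac{1}{1 - \beta}, \\ d\sigma_t^{(\epsilon)} &= \eta_2(\epsilon) \lambda(\theta - \sigma_t^{(\epsilon)})\,dt + \epsilon\, \nu \sigma_t^{(\epsilon)}\,dW_t^2; \quad \sigma_0^{(\epsilon)} = \sigma_0, \end{aligned} \tag{77} $$

where $\eta_i(\epsilon) = \epsilon^{j_i}$, $i = 1, 2$, and $j_i$ is a nonnegative integer.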

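For illustrative purposes, let us set $\eta_1(\epsilon) = \eta_2(\epsilon) = \epsilon$. That is,

$$ \begin{aligned} d\tilde{S}_t^{(\epsilon)} &= -\frac{\epsilon}{2} \frac{\beta}{1 - \beta} (\sigma_t^{(\epsilon)})^2 \frac{1}{\tilde{S}_t^{(\epsilon)}}\,dt + \epsilon\, \sigma_t^{(\epsilon)}\,dW_t^1; \quad \tilde{S}_0^{(\epsilon)} = \frac{1}{1 - \beta}, \\ d\sigma_t^{(\epsilon)} &= \epsilon \lambda(\theta - \sigma_t^{(\epsilon)})\,dt + \epsilon\, \nu \sigma_t^{(\epsilon)}\,dW_t^2; \quad \sigma_0^{(\epsilon)} = \sigma_0. \end{aligned} \tag{78} $$

In this case, as $\tilde{S}_t^{(0)} = \frac{1}{1-\beta}$ and $\sigma_t^{(0)} = \sigma_0$ for all $t \in [0, T]$, the first two terms in
the asymptotic expansion, $\tilde{g}_{1t} = \frac{1}{1-\beta} + \left. \frac{\partial}{\partial \epsilon} \tilde{S}_t^{(\epsilon)} \right|_{\epsilon=0}$, follow a Gaussian process:

$$ d\tilde{g}_{1t} = \frac{-\beta \sigma_0^2}{2}\,dt + \sigma_0\,dW_t^1; \quad \tilde{g}_{10} = \frac{1}{1 - \beta}. \tag{79} $$

Then, by applying Itô's formula to

$$ \hat{g}_{1t} := C_1^{-1}(\tilde{g}_{1t}) = S_0 (1 - \beta)^{\frac{1}{1-\beta}}\, \tilde{g}_{1t}^{\frac{1}{1-\beta}}, \tag{80} $$

and using

$$ \tilde{g}_{1t} = \frac{1}{1 - \beta} \frac{\hat{g}_{1t}^{1-\beta}}{S_0^{1-\beta}}, \tag{81} $$

we formally obtain the SDE of $\hat{g}_{1t}$, though it is generally well-defined only for
$\tilde{g}_{1t} \geq 0$:

$$ d\hat{g}_{1t} = \frac{\beta \sigma_0^2 S_0^{1-\beta}}{2}\, \hat{g}_{1t}^{\beta} \left( -1 + S_0^{1-\beta} \hat{g}_{1t}^{\beta-1} \right) dt + \sigma_0 S_0^{1-\beta} \hat{g}_{1t}^{\beta}\,dW_t^1; \quad \hat{g}_{10} = S_0. \tag{82} $$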
Here, because the diffusion coefficient of $\hat{g}_{1t}$ is given by $\sigma_0 S_0^{1-\beta} (\hat{g}_{1t})^{\beta}$ and we may
think that $S$ is expanded around $\hat{g}_1$, we call this case a CEV asymptotic expansion
(though $\hat{g}_1$ is not exactly a CEV process).
In particular, when $\beta = 1/2$,

$$ d\hat{g}_{1t} = \frac{\sigma_0^2}{4} \left( -\sqrt{S_0 \hat{g}_{1t}} + S_0 \right) dt + \sigma_0 \sqrt{S_0 \hat{g}_{1t}}\,dW_t^1; \quad \hat{g}_{10} = S_0, \tag{83} $$

and because

$$ \hat{g}_{1T} = \frac{S_0}{4} \tilde{g}_{1T}^2, \tag{84} $$

$\hat{g}_{1T}/(S_0 \sigma_0^2 T/4)$ follows a non-central $\chi^2$ distribution, around which the original
underlying asset price $S_T$ is expanded.
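The change of variable (74), its inverse (75) and the $\beta = 1/2$ relation (84) are easy to check numerically. The sketch below is illustrative only; the function names are hypothetical, and the mean of $\hat{g}_{1T}$ uses the fact, read off from (79), that $\tilde{g}_{1T}$ is Gaussian with mean $2 - \sigma_0^2 T/4$ and variance $\sigma_0^2 T$ when $\beta = 1/2$.

```python
def c1(x, s0, beta):
    """Change of variable (74): C_1(x) = (x / s0)**(1 - beta) / (1 - beta), for beta in [0, 1)."""
    return (x / s0) ** (1.0 - beta) / (1.0 - beta)


def c1_inv(x_tilde, s0, beta):
    """Inverse (75): s0 * ((1 - beta) * x_tilde)**(1 / (1 - beta))."""
    return s0 * ((1.0 - beta) * x_tilde) ** (1.0 / (1.0 - beta))


def g_hat_mean_beta_half(s0, sigma0, t):
    """E[g_hat_{1T}] for beta = 1/2, from g_hat = (s0 / 4) * g_tilde**2 as in (84)
    and g_tilde_{1T} ~ N(2 - sigma0**2 * t / 4, sigma0**2 * t) as implied by (79)."""
    mean = 2.0 - 0.25 * sigma0 ** 2 * t
    variance = sigma0 ** 2 * t
    return 0.25 * s0 * (mean * mean + variance)
```

For example, with $S_0 = 100$, $\sigma_0 = 0.2$ and $T = 1$ the mean of $\hat{g}_{1T}$ is $25 \times (1.99^2 + 0.04) = 100.0025$, so the benchmark variable stays very close to $S_0$ on average.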
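Finally, for $\eta_i(\epsilon) = \epsilon^{j_i}$, $i = 1, 2$, where $j_i$ is a nonnegative integer, an approximation
formula of the call price with strike $K$ and maturity $T$ is obtained as follows:

$$ \begin{aligned} \mathrm{Call}(K, T) = E[(S_T - K)^+] &= E\left[ \left( C_1^{-1}(\tilde{S}_T) - K \right)^+ \right] \\ &= E\left[ \left( S_0 (1 - \beta)^{\frac{1}{1-\beta}} (\tilde{S}_T)^{\frac{1}{1-\beta}} - K \right)^+ \right] \\ &= E\left[ \left( S_0 (1 - \beta)^{\frac{1}{1-\beta}} (\tilde{S}_T^{(1)})^{\frac{1}{1-\beta}} - K \right)^+ \right] \\ &= E\left[ \left( S_0 (1 - \beta)^{\frac{1}{1-\beta}} (G^{(1)} + \tilde{S}_T^{(0)})^{\frac{1}{1-\beta}} - K \right)^+ \right] \\ &\approx \int_y^{\infty} \left( S_0 (1 - \beta)^{\frac{1}{1-\beta}} (x + \tilde{S}_T^{(0)})^{\frac{1}{1-\beta}} - K \right) f_{G^{(1)},N}(x)\,dx; \end{aligned} \tag{85} $$

$$ y = C_1(K) - \tilde{S}_T^{(0)} = \frac{1}{1 - \beta} \left( \frac{K}{S_0} \right)^{1-\beta} - \tilde{S}_T^{(0)}. \tag{86} $$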

As numerical examples, [97] examines normal, log-normal and CEV expansions
up to the third order for approximations of option prices under the SABR model;
the results indicate that the CEV expansion provides the most stable approximations. We
also observe that the CEV expansion becomes more precise, with the same level of
absolute errors across the whole range of β, as the order of the expansion increases.
Thus, we expect that a higher order CEV expansion will produce better and more
stable approximations than the other expansions, though further investigation seems
necessary. Please see the original paper [97] for the details of the numerical experiment.
Remark 6 If necessary, applying a technique similar to the one mentioned in Sect. 4.3.1,
we could use the asymptotic expansion for a model with a smooth (and bounded)
modification of the underlying processes. For a concrete example please
see Remark 3 in [97].

5 Improvement Scheme for Asymptotic Expansion

Although the asymptotic expansion up to the fifth order is known to be sufficiently
accurate for option pricing (e.g. [81, 95, 96, 108, 109]), one of the main criticisms
against the method would be that the approximate density function takes negative
values, typically in its tails, that is, in some region of the deep Out-of-The-Money (OTM)
strikes, which could create an arbitrage opportunity in option trading. Also, even if the domain
of the true density is restricted to be positive, the domain of its approximation may
include negative values unless an appropriate boundary condition is imposed. To
overcome these problems, we briefly introduce two recent studies related to the
present asymptotic expansion approach.

5.1 New Improvement Scheme for Approximation Methods of Probability Density Functions

Takahashi and Tsuzuki [98] develop a new scheme for improving density approximation
methods, which also provides precise approximations of option values.
Specifically, the scheme is inspired by the idea of the Hilbert space projection theorem,
and the so-called "Dykstra's cyclic projections algorithm" is applied for its implementation.
(Please consult Deutsch [14] for the details of the algorithm.) We also
remark that the scheme can be easily implemented in practice, since it needs only
the market data used for the usual calibration, such as option prices at various strikes.
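To make the algorithmic ingredient concrete, the following is a minimal sketch of Dykstra's cyclic projections for two simple convex sets: nonnegative vectors, and vectors whose weighted sum equals one. It is a toy illustration of the algorithm only, not the scheme of [98]; the choice of constraint sets, the grid weights and all names are assumptions made for the example.

```python
def project_nonnegative(x):
    """Projection onto the set {x : x_i >= 0 for all i}."""
    return [max(v, 0.0) for v in x]


def make_unit_mass_projection(weights):
    """Projection onto the hyperplane {x : sum_i weights[i] * x[i] = 1}."""
    norm_sq = sum(w * w for w in weights)

    def project(x):
        excess = sum(w * v for w, v in zip(weights, x)) - 1.0
        return [v - excess * w / norm_sq for w, v in zip(weights, x)]

    return project


def dykstra(x0, projections, n_iter=200):
    """Dykstra's algorithm: cycle through the projections, carrying one
    correction vector per set so that the limit is the projection of x0
    onto the intersection of the sets."""
    x = list(x0)
    corrections = [[0.0] * len(x) for _ in projections]
    for _ in range(n_iter):
        for i, project in enumerate(projections):
            y = [v + c for v, c in zip(x, corrections[i])]
            x = project(y)
            corrections[i] = [a - b for a, b in zip(y, x)]
    return x
```

Applied to a discretized approximate density with a few negative values, for instance `dykstra([-0.1, 0.3, 0.5, 0.4, -0.05], [project_nonnegative, make_unit_mass_projection([1.0] * 5)])`, the iteration returns the closest vector, in the Euclidean sense, that is nonnegative and has unit mass. Further convex conditions could be added as extra projections in the same loop.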
Furthermore, numerical experiments for vanilla option pricing under the SABR model
demonstrate the validity of the scheme. In fact, in terms of approximation accuracy,
this scheme improves the third and fifth order asymptotic expansions while preserving
the required conditions, such as nonnegative densities under an appropriate forward
measure.
We finally remark that the scheme is general and flexible enough to include any set
of conditions and information that one would like to impose on an approximate density,
and it can be applied to approximation methods other than the asymptotic expansion
method. For example, a number of studies have sought to extend the
SABR model while fixing the problem of the negative densities in the method of
[33]. (For instance, see Doust [15].) We note that the scheme is also a candidate for
handling this issue. Also, the estimate of the absorption probability based on Monte
Carlo simulations as in [15] can be consistently incorporated in the scheme.
5.2 A Weak Approximation with Asymptotic Expansion and Multidimensional Malliavin Weights

Takahashi and Yamada [105] develop a new weak approximation scheme for
expectations of functions of the solutions to SDEs. In particular, the scheme connects
approximate operators constructed from the asymptotic expansion. More
concretely, a diffusion semigroup is defined as the expectation of an appropriate
function of the solution to a certain SDE: for example, $P_t f(x) = E[f(X_t^{x,\epsilon})]$
with the solution $X_t^{x,\epsilon}$ of an SDE with perturbation parameter $\epsilon$ and a function $f$.
Then, we approximate $P_t$ by an operator $Q_{t,m}$ which is constructed from the
asymptotic expansion up to a certain order $m$. Thus, given a partition of $[0, T]$,
$\pi = \{(t_0, t_1, \ldots, t_n) : 0 = t_0 < t_1 < \cdots < t_n = T\}$, we are able to approximate
$P_T f(x)$ by connecting the expansion-based approximations sequentially with the use of
multi-dimensional Malliavin weights: that is, roughly speaking, with
$s_k = t_k - t_{k-1}$, $k = 1, \ldots, n$,

$$ P_T f(x) \simeq Q_{s_n,m}\, Q_{s_{n-1},m} \cdots Q_{s_1,m} f(x). $$

The present research justifies this idea by applying Malliavin calculus, particularly
the theories developed by Watanabe [111] and Kusuoka [52–54]. In computation, in order
to evaluate the Malliavin weights, the paper makes use of conditional expectation
formulas for multi-dimensional asymptotic expansions in [86].
Moreover, the paper shows through numerical examples for option pricing under
local and stochastic volatility models that a very coarse partition, such as n = 2, is mostly
enough to substantially reduce the errors at deep OTM strikes of expansions of the first
or second order (m = 1, 2).

6 Asymptotic Expansion in an Instantaneous Forward Rates Model

Among the main stochastic models in finance, there exist models in which the stochastic
processes of the underlying variables do not belong to the class of diffusion processes.
This section illustrates an instantaneous forward rates model as a typical example.

6.1 Asymptotic Expansion for General Wiener Functionals

Watanabe [111] derives an asymptotic expansion for general Wiener functionals. As
an example of Watanabe's expansion, [100] shows the following result:
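Theorem 4 Let us consider a family of smooth Wiener functionals $F^{\epsilon} = (F_1^{\epsilon}, \ldots, F_n^{\epsilon})$,
$F_i^{\epsilon} \in \mathbb{D}^{\infty}$ ($i = 1, \ldots, n$), such that $F_i^{\epsilon}$ has an asymptotic expansion in $\mathbb{D}^{\infty}$.
Moreover, $F^{\epsilon}$ satisfies the uniformly non-degenerate condition:

$$ \limsup_{\epsilon \downarrow 0} \left\| (\det \sigma_{F^{\epsilon}})^{-1} \right\|_{L^p} < \infty, \quad \text{for all } p < \infty, \tag{87} $$

where $\sigma_{F^{\epsilon}}$ stands for the Malliavin covariance matrix of $F^{\epsilon}$. Then, for a Schwartz
distribution $T \in \mathcal{S}'(\mathbb{R}^n)$, we have an asymptotic expansion in $\mathbb{R}$:

$$ \left| E[T(F^{\epsilon})] - \left\{ \int_{\mathbb{R}^n} T(x)\,p^{F^0}(x)\,dx + \sum_{j=1}^{N} \epsilon^j \sum_{k}^{(j)} \int_{\mathbb{R}^n} T(x)\, E\left[ H_{\alpha^{(k)}}\left( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right) \middle| F^0 = x \right] p^{F^0}(x)\,dx \right\} \right| = O(\epsilon^{N+1}). \tag{88} $$

Equivalently,

$$ \left| E[T(F^{\epsilon})] - \left\{ \int_{\mathbb{R}^n} T(x)\,p^{F^0}(x)\,dx + \sum_{j=1}^{N} \epsilon^j \sum_{k}^{(j)} (-1)^k \int_{\mathbb{R}^n} T(x)\, \partial_{\alpha^{(k)}}^{k} \left\{ E\left[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \middle| F^0 = x \right] p^{F^0}(x) \right\} dx \right\} \right| = O(\epsilon^{N+1}), \tag{89} $$

where $F_i^{0,k} := \frac{1}{k!} \left. \frac{d^k}{d\epsilon^k} F_i^{\epsilon} \right|_{\epsilon=0}$, $k \in \mathbb{N}$ ($i = 1, \ldots, n$), $\alpha^{(k)}$ denotes a multi-index,
$\alpha^{(k)} = (\alpha_1, \ldots, \alpha_k)$, and

$$ \sum_{k}^{(j)} \equiv \sum_{k=1}^{j} \ \sum_{\beta_1 + \cdots + \beta_k = j,\ \beta_i \geq 1} \ \sum_{\alpha^{(k)} \in \{1, \ldots, n\}^k} \frac{1}{k!}. $$

$p^{F^0}(x)$ stands for the density function of $F^0$. The Malliavin weight $H_{\alpha^{(k)}}$ is recursively
defined as follows:

$$ H_{\alpha^{(k)}}(F, G) = H_{(\alpha_k)}(F, H_{\alpha^{(k-1)}}(F, G)), \tag{90} $$

where

$$ H_{(l)}(F, G) = D^{*}\left( \sum_{i=1}^{n} G \gamma_{li}^{F} D F_i \right). \tag{91} $$

Here, $F_i \in \mathbb{D}^{\infty}$, $G \in \mathbb{D}^{\infty}$, $D^{*}\left( \sum_{i=1}^{n} G \gamma_{li}^{F} D F_i \right)$ is the divergence of $\sum_{i=1}^{n} G \gamma_{li}^{F} D F_i$,
$D F_i$ is the Malliavin derivative of $F_i$, and $\gamma^{F} = (\gamma_{ij}^{F})_{1 \leq i, j \leq n}$ denotes the
inverse matrix of the Malliavin covariance matrix of $F$. Moreover, we use the notation
$\int T(x) g(x)\,dx$ for $T \in \mathcal{S}'(\mathbb{R}^n)$ and $g \in \mathcal{S}(\mathbb{R}^n)$, meaning ${}_{\mathcal{S}'}\langle T, g \rangle_{\mathcal{S}}$. (See
Sect. 2 of [100] for the details of those definitions.)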
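The combinatorial sum $\sum_{k}^{(j)}$ in Theorem 4 can be enumerated explicitly, which is useful when an expansion is implemented term by term. The sketch below lists, for a given order $j$ and dimension $n$, every triple of $k$, composition $(\beta_1, \ldots, \beta_k)$ and multi-index $\alpha^{(k)}$, together with the weight $1/k!$. The function names are hypothetical and the code only enumerates index sets; it does not compute any Malliavin weights.

```python
from itertools import product
from math import factorial


def compositions(j, k):
    """All tuples (beta_1, ..., beta_k) with beta_i >= 1 and beta_1 + ... + beta_k = j."""
    if k == 1:
        return [(j,)] if j >= 1 else []
    result = []
    for first in range(1, j - k + 2):
        for rest in compositions(j - first, k - 1):
            result.append((first,) + rest)
    return result


def expansion_terms(j, n):
    """Index set of the order-epsilon^j sum: tuples (k, beta, alpha, weight) with weight = 1/k!."""
    terms = []
    for k in range(1, j + 1):
        for beta in compositions(j, k):
            for alpha in product(range(1, n + 1), repeat=k):
                terms.append((k, beta, alpha, 1.0 / factorial(k)))
    return terms
```

For a scalar functional ($n = 1$) the number of terms at order $j$ is $2^{j-1}$, the number of compositions of $j$; for $n = 2$ and $j = 2$ there are six terms, two with $k = 1$ and four with $k = 2$.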
Remark 7 The asymptotic expansion formula (89) is the formula developed by
Watanabe [111]. Hence, this theorem shows that the expansion (88), based on the push-down
(conditional expectation) of Malliavin weights (divergences), is equivalent to
Watanabe's formula.
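(Proof) We use $\alpha$ as an abbreviation of $\alpha^{(k)}$ in the proof, and the notation
$\langle \cdot, \cdot \rangle_{p^{F^0}(x)dx}$ is defined as follows:

$$ \left\langle T, E^{F^0}[\cdot] \right\rangle_{p^{F^0}(x)dx} := {}_{\mathcal{S}'}\left\langle T, E^{F^0}[\cdot]\, p^{F^0} \right\rangle_{\mathcal{S}}. $$

Under the uniformly non-degenerate condition of $F^{\epsilon} \in \mathbb{D}^{\infty}(\mathbb{R}^n)$, the lifting up
of $T \in \mathcal{S}'(\mathbb{R}^n)$, that is, $(E^{F^{\epsilon}})^{*} T$, has the asymptotic expansion in distributions on
the Wiener space $\mathbb{D}^{-\infty}$, that is, for $N \in \mathbb{N}$, there exists $s \in \mathbb{N}$ such that

$$ \left\| (E^{F^{\epsilon}})^{*} T - \left\{ T \circ F^0 + \sum_{j=1}^{N} \epsilon^j \sum_{k}^{(j)} (\partial_{\alpha}^{k} T) \circ F^0 \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right\} \right\|_{\mathbb{D}^{q,-s}} = O(\epsilon^{N+1}), \quad \epsilon \in (0, 1], \ q < \infty. \tag{92} $$

Then, there exists an asymptotic expansion of $\left\langle (E^{F^{\epsilon}})^{*} T, 1 \right\rangle_{\mathbb{D}^{-\infty} \times \mathbb{D}^{\infty}}$.
The push-down of the divergence is computed as follows:

$$ \begin{aligned} \left\langle \partial_{\alpha}^{k} T(F^0), \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right\rangle_{\mathbb{D}^{-\infty} \times \mathbb{D}^{\infty}} &= \left\langle T(F^0), H_{\alpha}\left( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right) \right\rangle_{\mathbb{D}^{-\infty} \times \mathbb{D}^{\infty}} \\ &= \left\langle T, E^{F^0}\left[ H_{\alpha}\left( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right) \right] \right\rangle_{p^{F^0}(x)dx} \\ &= \int_{\mathbb{R}^n} T(x)\, E\left[ H_{\alpha}\left( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right) \middle| F^0 = x \right] p^{F^0}(x)\,dx. \end{aligned} \tag{93} $$

On the other hand,

$$ \begin{aligned} \left\langle \partial_{\alpha}^{k} T(F^0), \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right\rangle_{\mathbb{D}^{-\infty} \times \mathbb{D}^{\infty}} &= \left\langle \partial_{\alpha}^{k} T, E^{F^0}\left[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right] \right\rangle_{p^{F^0}(x)dx} \\ &= \left\langle T, (\partial^{*})_{\alpha}^{k} E^{F^0}\left[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right] \right\rangle_{p^{F^0}(x)dx} \\ &= (-1)^k \int_{\mathbb{R}^n} T(x)\, \partial_{\alpha}^{k} \left\{ E\left[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \middle| F^0 = x \right] p^{F^0}(x) \right\} dx. \end{aligned} \tag{94} $$

Here, $(\partial^{*})_{\alpha}^{k}$ means $(\partial^{*})_{\alpha}^{k} = \partial_{\alpha_1}^{*} \cdots \partial_{\alpha_k}^{*}$ (k times), and $\partial_{\alpha_i}^{*}$ denotes the divergence
operator on the space $(\mathbb{R}^n, p^{F^0}(x)dx)$. □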

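Corollary 1 The asymptotic expansion of the density function of $F^{\epsilon}$, $p^{F^{\epsilon}}(y)$, is
expressed with the push-down of the Malliavin weights as follows:

$$ p^{F^{\epsilon}}(y) = p^{F^0}(y) + \sum_{j=1}^{m} \epsilon^j \sum_{k}^{(j)} E\left[ H_{\alpha^{(k)}}\left( F^0, \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \right) \middle| F^0 = y \right] p^{F^0}(y) + O(\epsilon^{m+1}), \tag{95} $$

where $p^{F^0}(y)$ is the density function of $F^0$. An alternative expression is given as
follows:

$$ p^{F^{\epsilon}}(y) = p^{F^0}(y) + \sum_{j=1}^{m} \epsilon^j \sum_{k}^{(j)} (-1)^k\, \partial_{\alpha^{(k)}}^{k} \left\{ E\left[ \prod_{l=1}^{k} F_{\alpha_l}^{0,\beta_l} \middle| F^0 = y \right] p^{F^0}(y) \right\} + O(\epsilon^{m+1}). \tag{96} $$

(Proof) Take a delta function $\delta_y \in \mathcal{S}'(\mathbb{R}^n)$ in the theorem above. □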

6.2 Instantaneous Forward Rates Model

As a typical stochastic model for pricing interest rate derivatives, there is the
model developed by Heath-Jarrow-Morton [37], the so-called HJM model, which is
formulated in terms of forward rates over infinitesimal terms,
that is, the instantaneous forward rates $\{f(s, t) : 0 \leq s \leq t \leq T\}$. Here, $s$ is the time
when the forward rate is fixed and $t$ denotes the inception time when the forward
rate is applied.
The stochastic processes for the instantaneous forward rates are considered in
the framework of the asymptotic expansion by introducing a parameter $\epsilon \in [0, 1]$.
For example, let $W$ be an m-dimensional standard Wiener process and let $f(0, t)$,
$t \in [0, T]$, be a given Lipschitz continuous function of $t$. Then, under the equivalent
martingale measure, the stochastic processes $\{f^{(\epsilon)}(s, t) : 0 \leq s \leq t \leq T\}$ are
solutions to the following stochastic integral equations:

$$ f^{(\epsilon)}(s, t) = f(0, t) + \epsilon^2 \int_0^s \sum_{i=1}^{m} \sigma_i(f^{(\epsilon)}(v, t), v, t) \left[ \int_v^t \sigma_i(f^{(\epsilon)}(v, y), v, y)\,dy \right] dv + \epsilon \sum_{i=1}^{m} \int_0^s \sigma_i(f^{(\epsilon)}(v, t), v, t)\,dW_i(v); \quad \epsilon \in [0, 1], \tag{97} $$

where the volatility functions $\{\sigma_i(x, s, t);\ i = 1, \ldots, m\}$ are smooth and satisfy the
regularity conditions which guarantee that equation (97) has a unique strong
solution. It is to be noted that the drift term (the coefficient of the $dv$ term) of $f^{(\epsilon)}(s, t)$
depends on $\{f^{(\epsilon)}(v, y);\ 0 \leq v < s,\ v \leq y < t\}$. Moreover, the stochastic process
of the instantaneous short-term interest rate $r^{(\epsilon)}(t)$ is determined by the relation
$r^{(\epsilon)}(t) = f^{(\epsilon)}(t, t)$.
For this model, the approximations of the values for interest rate derivatives can
still be considered in a unified framework with derivation of asymptotic expansions
of the instantaneous forward rates when  ↓ 0 and with use of the relation between
the instantaneous forward rates and a zero-coupon bond price:
  T
()
P (t, T ) = exp − f () (t, u)du . (98)
t
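As a small numerical illustration of relation (98), the following sketch computes a zero-coupon bond price from a given instantaneous forward curve by the trapezoidal rule. The function name `zero_coupon_price` and the flat 3% curve are assumptions made purely for illustration; they do not come from the text.

```python
import numpy as np

def zero_coupon_price(forward_curve, t, T, n_steps=1000):
    """P(t, T) = exp(-integral_t^T f(t, u) du), via the trapezoidal rule."""
    u = np.linspace(t, T, n_steps + 1)
    f = forward_curve(u)
    du = (T - t) / n_steps
    integral = du * (f.sum() - 0.5 * (f[0] + f[-1]))
    return float(np.exp(-integral))

# Hypothetical flat 3% forward curve: P(0, 2) should equal exp(-0.06).
flat_curve = lambda u: np.full_like(u, 0.03)
price = zero_coupon_price(flat_curve, 0.0, 2.0)
```

For a flat curve the quadrature is exact, so the result can be compared directly with the closed-form value.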

As an example, we consider pricing an option on a coupon bond (or a swaption),
which is a standard interest rate derivative. The payoff at maturity of a call option is
given by

$$
V_c(T) = \max\left\{ \sum_{i=1}^{n} c_i P^{(\epsilon)}(t,T_i) - K,\ 0 \right\},
$$

where $0 \le T \le T_1 < \cdots < T_n$, $c_i$ $(i = 1,\dots,n)$ are positive constants and $K$ $(> 0)$
is a strike price. Then, its present value is given by

$$
V_c(0) = E\left[ e^{-\int_0^T r_u^{(\epsilon)}\,du}\, V_c(T) \right]. \tag{99}
$$

When $\epsilon \downarrow 0$, the forward rate $f^{(\epsilon)}(s,t)$ is expanded around $f(0,t)$ as

$$
f^{(\epsilon)}(s,t) \sim f(0,t) + \epsilon f_1(s,t) + \epsilon^2 f_2(s,t) + \cdots \quad \text{in } D_\infty, \tag{100}
$$

where the coefficients of $\epsilon^n$, $n = 1, 2, \dots$, that is, $f_1(t,u), f_2(t,u), \dots$, are also in
$D_\infty$.
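As a result, we obtain an expansion of the zero-coupon bond price $P^{(\epsilon)}(t,T)$
around the current forward bond price $\frac{P(0,T)}{P(0,t)}$, and an expansion of the discount
factor $\exp\left( -\int_0^T r^{(\epsilon)}(t)\,dt \right)$ around the current zero-coupon bond price $P(0,T)$ as
follows:

$$
P^{(\epsilon)}(t,T) \sim \frac{P(0,T)}{P(0,t)} \left\{ 1 - \epsilon \int_t^T f_1(t,u)\,du - \epsilon^2 \int_t^T f_2(t,u)\,du
+ \frac{1}{2}\epsilon^2 \left( \int_t^T f_1(t,u)\,du \right)^2 + \cdots \right\} \quad \text{in } D_\infty, \tag{101}
$$

$$
e^{-\int_0^T r^{(\epsilon)}(s)\,ds} \sim P(0,T) \left\{ 1 - \epsilon \int_0^T f_1(t,t)\,dt - \epsilon^2 \int_0^T f_2(t,t)\,dt
+ \epsilon^2 \frac{1}{2} \left( \int_0^T f_1(t,t)\,dt \right)^2 + \cdots \right\} \quad \text{in } D_\infty, \tag{102}
$$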
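The structure of (101) and (102) is that of a second-order Taylor expansion of an exponential. The sketch below checks this for scalars: `a` and `b` stand in for the integrals of $f_1$ and $f_2$ and are hypothetical numbers, not values from the text. If the truncation is correct, the remainder should behave like $\epsilon^3$, so halving $\epsilon$ should shrink the error by a factor close to 8.

```python
import math

def discount_exact(eps, a, b):
    """exp(-eps*a - eps^2*b), with a and b playing the role of the two integrals."""
    return math.exp(-eps * a - eps**2 * b)

def discount_second_order(eps, a, b):
    """1 - eps*a - eps^2*b + (1/2)*eps^2*a^2, the bracket in (101) and (102)."""
    return 1.0 - eps * a - eps**2 * b + 0.5 * eps**2 * a**2

a, b = 0.7, -0.3  # hypothetical values of the two integrals
errors = [abs(discount_exact(e, a, b) - discount_second_order(e, a, b))
          for e in (0.1, 0.05)]
ratio = errors[0] / errors[1]  # near 2**3 = 8 for a third-order remainder
```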

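where $f_i(s,t)$, $i = 1, 2$, are given by

$$
f_1(s,t) = \left. \frac{\partial f^{(\epsilon)}(s,t)}{\partial \epsilon} \right|_{\epsilon=0} = \int_0^s \sum_{i=1}^{m} \sigma_i^{(0)}(v,t)\,dW_i(v),
$$

$$
f_2(s,t) = \frac{1}{2} \left. \frac{\partial^2 f^{(\epsilon)}(s,t)}{\partial \epsilon^2} \right|_{\epsilon=0}
= \int_0^s b^{(0)}(v,t)\,dv + \int_0^s \sum_{i=1}^{m} \partial\sigma_i^{(0)}(v,t)\, f_1(v,t)\,dW_i(v).
$$

Here, $\sigma_i^{(0)}(v,t) = \sigma_i(f^{(0)}(v,t),v,t)$, and $b^{(0)}(v,t)$ and $\partial\sigma_i^{(0)}(v,t)$ are defined as

$$
b^{(0)}(v,t) = \sum_{i=1}^{m} \sigma_i(f^{(0)}(v,t),v,t) \int_v^t \sigma_i(f^{(0)}(v,y),v,y)\,dy,
$$

$$
\partial\sigma_i^{(0)}(v,t) = \left. \frac{\partial \sigma_i(x,v,t)}{\partial x} \right|_{x = f(0,t)}.
$$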
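Therefore, in a similar way as in the framework for diffusion cases in the previous
sections, we define $X_t^{(\epsilon),1}$ and $X_t^{(\epsilon),i}$ $(i = 2,\dots,n)$ as

$$
X_t^{(\epsilon),1} = \exp\left( -\int_0^t r^{(\epsilon)}(u)\,du \right) \tag{103}
$$

$$
X_t^{(\epsilon),i} = P^{(\epsilon)}(t,T_i) = \exp\left( -\int_t^{T_i} f^{(\epsilon)}(t,u)\,du \right), \quad i = 2,\dots,n. \tag{104}
$$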

Then, the payoff at maturity of the call option on a coupon bond is written as

$$
V_c(T) = \max\left\{ \sum_{i=2}^{n} c_i X_t^{(\epsilon),i} - K,\ 0 \right\}. \tag{105}
$$

Moreover, let $x = (x_1, x_2, \dots, x_n)$ and define $g(x)$ as

$$
g(x) = x_1 \left( \sum_{i=2}^{n} c_i x_i - K \right). \tag{106}
$$

In this way, we are able to employ a technique for pricing derivatives similar to
that in the case of diffusion processes. For example, with redefinition of variables such as
$T$, the approximation of the option price $V_c(0)$ in (99) can be obtained by
almost the same asymptotic expansion method as in the previous sections. In fact, by
using the above expansions of instantaneous forward rates, zero-coupon bond prices
and the discount factor, we can apply the expansion to $E[\max\{g(X_T^{(\epsilon)}), 0\}]$, where
$X_T^{(\epsilon)} = (X_T^{(\epsilon),1}, X_T^{(\epsilon),2}, \dots, X_T^{(\epsilon),n})$.
For the details and numerical examples, please see [49, 50, 88]. In particular,
[88] carries out numerical experiments under a smooth and bounded modification
of two-factor CEV-type volatility functions (as explained in Sect. 4.3.1), using
the variance reduction technique proposed in [107], to demonstrate the effectiveness
of the method. We remark that the boundedness of the volatility functions
$\{\sigma_i(x,s,t);\ i = 1,\dots,m\}$ for the instantaneous forward rates $f^{(\epsilon)}(s,t)$ is one of the
sufficient conditions that guarantee the existence of the unique strong solution of the
stochastic integral Eq. (97).
For the evaluation of various other interest rate derivatives, approximations based
on the asymptotic expansion approach can be derived in a similar manner. Moreover,
an example of an approximate formula for derivative prices that depend on the
instantaneous forward rates in the HJM model and on other variables following general
diffusion processes is given by [85].

7 Asymptotic Expansion in Jump and Jump-Diffusion Models

So far, we have used stochastic models whose randomness is generated only by
Wiener processes. However, we are also able to apply the asymptotic expansion
approach to stochastic processes that include jumps in their sample paths. This section
provides a very brief review. For the details, please see the cited papers.
From the mathematical viewpoint, Yoshida [120] presented an extension
of Watanabe theory to develop a framework that establishes the validity of asymptotic
expansions in Wiener-Poisson spaces, which can be applied to jump-diffusion models
under some regularity conditions. Hayashi [34] applied a Malliavin calculus of jump
type to prove an asymptotic expansion theorem for functionals of a Poisson random
measure, and Hayashi [35] derived the coefficients in the expansion of a call option
price under a pure jump model. Moreover, Hayashi and Ishikawa [36] proved an
asymptotic expansion formula for the compositions of a smooth Wiener-Poisson
functional with Schwartz distributions.

In direct applications to finance problems, [51, 87] derived asymptotic expansions
to approximate bond prices and/or plain-vanilla option prices under jump-diffusion
models with local volatility.
Subsequently, [93, 94] found a new expansion scheme for pricing long-term European
currency options under a Libor market model (LMM) and a general diffusion
stochastic volatility model with jumps in spot exchange rates. In particular, thanks to
the linear structure of the underlying asset price process in their model, they separated
out the jump component, which has a known characteristic function, and applied the expansion
technique developed for the diffusion models. Also, [100] took a Malliavin calculus
approach to derive asymptotic expansions of vanilla option prices in a jump-diffusion
model with stochastic volatility.
Recently, [80] generalized preceding studies such as [51, 87] and
[100] within the asymptotic expansion approach, and developed a new approximation
formula for pricing basket options in a local-stochastic volatility model with jumps.
In particular, the model admits local volatility functions and jump components not
only in the underlying asset price processes, but also in the volatility processes. Moreover,
they carried out numerical experiments to confirm the validity of the method.
Please see the paper for the details.
As an example of asymptotic expansions of option prices under jump-diffusion
models, the next subsection outlines the method by using a simplified
version of [80].

7.1 Pricing Basket Options Under Local Stochastic Volatility with Jumps

First, we define the model of the underlying asset prices and their volatility
processes, which is used for pricing European-type basket options. In particular,
suppose that the filtered probability space $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\}_{t \ge 0})$ is given, where $P$ is an
equivalent martingale measure and the filtration satisfies the usual conditions. The
risk-free interest rate is assumed to be a nonnegative constant $r$ for simplicity. Then,
$(S_t^i)_{t \in [0,T]}$ and $(\sigma_t^i)_{t \in [0,T]}$, $i = 1,\dots,d$, represent the underlying asset prices and
their volatilities for $t \in [0,T]$, respectively. In particular, let us assume that $S_T^i$ and
$\sigma_T^i$ are given by the solutions of the following stochastic integral equations:

$$
S_T^i = s_0^i + \int_0^T \alpha^i S_{t-}^i\,dt + \int_0^T \phi_{S^i}\left( \sigma_{t-}^i, S_{t-}^i \right) dW_t^{S^i}
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h_{S^i,l,j}\, S^i_{\tau_{j,l}-} - \int_0^T \Lambda_l S_{t-}^i E[h_{S^i,l,j}]\,dt \right), \tag{107}
$$

$$
\sigma_T^i = \sigma_0^i + \int_0^T \lambda^i(\theta^i - \sigma_{t-}^i)\,dt + \int_0^T \phi_{\sigma^i}\left( \sigma_{t-}^i \right) dW_t^{\sigma^i}
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h_{\sigma^i,l,j}\, \sigma^i_{\tau_{j,l}-} - \int_0^T \Lambda_l \sigma_{t-}^i E[h_{\sigma^i,l,j}]\,dt \right), \tag{108}
$$

where $s_0^i$ and $\sigma_0^i$, $i = 1,\dots,d$, are given constants. The notation is defined
as follows:
• $\alpha^i$ $(i = 1,\dots,d)$ are constants.
• $\lambda^i$ and $\theta^i$ $(i = 1,\dots,d)$ are nonnegative constants.
• $\phi_{S^i}(x,y)$ and $\phi_{\sigma^i}(x)$ are functions satisfying appropriate regularity conditions.
• $W^{S^i}$ and $W^{\sigma^i}$ $(i = 1,\dots,d)$ are correlated Brownian motions.
• Each $N_l$ $(l = 1,\dots,n)$ is a Poisson process with constant intensity $\Lambda_l$. $N_l$,
$l = 1,\dots,n$, are independent, and also independent of all $W^{S^i}$ and $W^{\sigma^i}$.
• $\tau_{j,l}$ stands for the $j$th jump time of $N_l$.
• For each $l = 1,\dots,n$ and $i = 1,\dots,d$, both $\left( \sum_{j=1}^{N_{l,t}} h_{S^i,l,j} \right)_{t \ge 0}$ and
$\left( \sum_{j=1}^{N_{l,t}} h_{\sigma^i,l,j} \right)_{t \ge 0}$ are compound Poisson processes. ($\sum_{j=1}^{N_{l,t}} \equiv 0$ when $N_{l,t} = 0$.)
• For each $l$ and $x^i$, $h_{x^i,l,j}$ $(j \in \mathbb{N})$ are independent and identically distributed
random variables, where $x^i$ stands for one of $S^i$ and $\sigma^i$ $(i = 1,\dots,d)$.
– For the log-normal jump case, $h_{x^i,l,j} = e^{Y_{x^i,l,j}} - 1$, where
$Y_{x^i,l,j}$ is a random variable which follows a normal distribution with mean $m_{x^i,l}$
and variance $\gamma^2_{x^i,l}$, that is, $N(m_{x^i,l}, \gamma^2_{x^i,l})$ for all $j$.
• $h_{x^i,l,j}$ and $h_{x^{i'},l',j'}$ $(l \ne l')$ are independent.
• $h_{x^i,l,j}$ and $h_{x^{i'},l',j'}$ $(j \ne j')$ are independent.
• $N_l$ and $h_{x^i,l',j}$ are independent.
• For the same $l$ and $j$, $h_{S^i,l,j}$ and $h_{\sigma^{i'},l,j}$ $(i, i' = 1,\dots,d)$ are allowed to be
dependent, that is, $Y_{S^i,l,j}$ and $Y_{\sigma^{i'},l,j}$ $(i, i' = 1,\dots,d)$ are generally correlated.
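The jump terms in (107) and (108) are compensated so that they have zero mean. A minimal sketch of this compensation, for a single jump source with log-normal jump sizes and without the multiplication by the pre-jump asset value, is given below. The function name and all parameter values are hypothetical; the only fact used is $E[e^{Y}] = e^{m + \gamma^2/2}$ for $Y \sim N(m, \gamma^2)$.

```python
import numpy as np

def compensated_jump_sum(intensity, T, m, gamma, n_paths, rng):
    """Samples of sum_{j<=N_T} h_j - intensity*T*E[h], with h_j = exp(Y_j) - 1."""
    counts = rng.poisson(intensity * T, size=n_paths)
    mean_h = np.exp(m + 0.5 * gamma**2) - 1.0  # E[exp(Y)] - 1 for Y ~ N(m, gamma^2)
    totals = np.empty(n_paths)
    for p, k in enumerate(counts):
        y = rng.normal(m, gamma, size=k)       # k jump sizes on this path (k may be 0)
        totals[p] = np.sum(np.exp(y) - 1.0)
    return totals - intensity * T * mean_h

rng = np.random.default_rng(0)
samples = compensated_jump_sum(intensity=1.5, T=1.0, m=-0.1, gamma=0.2,
                               n_paths=50_000, rng=rng)
sample_mean = samples.mean()  # should be close to zero
```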
Remark 8 By specifying the functions $\phi_S$ and $\phi_\sigma$, we can express various types of
local-stochastic volatility models. For example, the model with $\phi_S(\sigma,S) = (aS^2 +
bS + c)\sqrt{\sigma}$ and $\phi_\sigma(\sigma) = \sqrt{\sigma}$ corresponds to an extension of the Quadratic Heston
model. The model with $\phi_S(\sigma,S) = S^{\beta_S}\sigma$ and $\phi_\sigma(\sigma) = \sigma$ corresponds to an extended
SABR ($\lambda$-SABR) model, and the one with $\phi_S(\sigma,S) = S^{\beta_S}\sigma$ and $\phi_\sigma(\sigma) = \sigma^{\beta_\sigma}$
corresponds to a local volatility on volatility with jumps model.
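Next, we introduce perturbations to the model (107) and (108). That is, for a
known parameter $\epsilon \in [0,1]$ we consider the following stochastic integral equations:
for $i = 1,\dots,d$,

$$
S_T^{i,(\epsilon)} = s_0^i + \int_0^T \alpha^i S_{t-}^{i,(\epsilon)}\,dt + \epsilon \int_0^T \phi_{S^i}\left( \sigma_{t-}^{i,(\epsilon)}, S_{t-}^{i,(\epsilon)} \right) dW_t^{S^i}
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h^{(\epsilon)}_{S^i,l,j}\, S^{i,(\epsilon)}_{\tau_{j,l}-} - \int_0^T \Lambda_l S_{t-}^{i,(\epsilon)} E[h^{(\epsilon)}_{S^i,l,1}]\,dt \right), \tag{109}
$$

$$
\sigma_T^{i,(\epsilon)} = \sigma_0^i + \int_0^T \lambda^i(\theta^i - \sigma_{t-}^{i,(\epsilon)})\,dt + \epsilon \int_0^T \phi_{\sigma^i}\left( \sigma_{t-}^{i,(\epsilon)} \right) dW_t^{\sigma^i}
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h^{(\epsilon)}_{\sigma^i,l,j}\, \sigma^{i,(\epsilon)}_{\tau_{j,l}-} - \int_0^T \Lambda_l \sigma_{t-}^{i,(\epsilon)} E[h^{(\epsilon)}_{\sigma^i,l,1}]\,dt \right), \tag{110}
$$

where $h^{(\epsilon)}_{x^i,l,j} = e^{\epsilon Y_{x^i,l,j}} - 1$, that is, we assume that the jump size follows a log-normal
distribution, $\epsilon Y_{x^i,l,j} \sim N(\epsilon m_{x^i,l}, \epsilon^2 \gamma^2_{x^i,l})$.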
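We assume the asymptotic expansions of $S_T^{i,(\epsilon)}$ and $\sigma_T^{i,(\epsilon)}$ around $\epsilon = 0$ as follows:

$$
S_T^{i,(\epsilon)} = S_T^{i,(0)} + \epsilon S_T^{i,(1)} + \frac{\epsilon^2}{2!} S_T^{i,(2)} + \cdots, \tag{111}
$$

$$
\sigma_T^{i,(\epsilon)} = \sigma_T^{i,(0)} + \epsilon \sigma_T^{i,(1)} + \frac{\epsilon^2}{2!} \sigma_T^{i,(2)} + \cdots, \tag{112}
$$

$$
h^{(\epsilon)}_{x^i,l,j} = h^{(0)}_{x^i,l,j} + \epsilon h^{(1)}_{x^i,l,j} + \frac{\epsilon^2}{2!} h^{(2)}_{x^i,l,j} + \cdots, \tag{113}
$$

where $S_t^{i,(\iota)} := \left. \frac{\partial^\iota S_t^{i,(\epsilon)}}{\partial \epsilon^\iota} \right|_{\epsilon=0}$, $\sigma_t^{i,(\iota)} := \left. \frac{\partial^\iota \sigma_t^{i,(\epsilon)}}{\partial \epsilon^\iota} \right|_{\epsilon=0}$, $h^{(\iota)}_{x^i,l,j} := \left. \frac{\partial^\iota h^{(\epsilon)}_{x^i,l,j}}{\partial \epsilon^\iota} \right|_{\epsilon=0}$.
We also suppose that $(W^{S^1},\dots,W^{S^d},W^{\sigma^1},\dots,W^{\sigma^d}) = \Sigma \cdot Z$, where $\Sigma$ is a
$2d \times 2d$ correlation matrix, and $Z$ is a $2d$-dimensional (independent) Brownian
motion.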
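To simplify the expressions, we introduce the following notation:
• $\Phi_{S^i,j} := \phi_{S^i}(\sigma^i,S^i)(\Sigma)_{i,j}$ and $\Phi_{\sigma^i,j} := \phi_{\sigma^i}(\sigma^i)(\Sigma)_{d+i,j}$, where $(\Sigma)_{i,j}$ denotes
the $(i,j)$-element of $\Sigma$.
• $\Phi_{S^i} := (\Phi_{S^i,1},\dots,\Phi_{S^i,2d})$ and $\Phi_{\sigma^i} := (\Phi_{\sigma^i,1},\dots,\Phi_{\sigma^i,2d})$ are $2d$-dimensional
vectors.
• $\Phi_S := (\Phi_{S^1},\dots,\Phi_{S^d})$ and $\Phi_\sigma := (\Phi_{\sigma^1},\dots,\Phi_{\sigma^d})$ are $d \times 2d$ matrices.
• We define an operator "$*$" as follows: When $A$ and $B$ are $d \times 2d$ matrices,

$$
A * B := \begin{bmatrix} (A)_{1,1}(B)_{1,1} & \cdots & (A)_{1,2d}(B)_{1,2d} \\ \vdots & \ddots & \vdots \\ (A)_{d,1}(B)_{d,1} & \cdots & (A)_{d,2d}(B)_{d,2d} \end{bmatrix}. \tag{114}
$$

When $A$ is a $d \times 2d$ matrix and $B$ is a $d$-dimensional vector,

$$
A * B = B * A := \begin{bmatrix} (A)_{1,1}(B)_{1} & \cdots & (A)_{1,2d}(B)_{1} \\ \vdots & \ddots & \vdots \\ (A)_{d,1}(B)_{d} & \cdots & (A)_{d,2d}(B)_{d} \end{bmatrix}. \tag{115}
$$

When $A$ and $B$ are $d$-dimensional vectors,

$$
A * B := \begin{bmatrix} (A)_{1}(B)_{1} \\ \vdots \\ (A)_{d}(B)_{d} \end{bmatrix}. \tag{116}
$$

• We also define $\partial_x \Phi_S$ ($x = S$ or $\sigma$) as

$$
\partial_x \Phi_S := \begin{bmatrix} \frac{\partial}{\partial x^1}(\Phi_S)_{1,1} & \cdots & \frac{\partial}{\partial x^1}(\Phi_S)_{1,2d} \\ \vdots & \ddots & \vdots \\ \frac{\partial}{\partial x^d}(\Phi_S)_{d,1} & \cdots & \frac{\partial}{\partial x^d}(\Phi_S)_{d,2d} \end{bmatrix}, \tag{117}
$$

where $(\Phi_S)_{i,j}$ denotes the $(i,j)$-element of the $d \times 2d$ matrix $\Phi_S$.
• Let us introduce the following notation:
$S_t = (S_t^1,\dots,S_t^d)$, $\sigma_t = (\sigma_t^1,\dots,\sigma_t^d)$,
$h^{(i)}_{S,l,j} = (h^{(i)}_{S^1,l,j},\dots,h^{(i)}_{S^d,l,j})$, $h^{(i)}_{\sigma,l,j} = (h^{(i)}_{\sigma^1,l,j},\dots,h^{(i)}_{\sigma^d,l,j})$,
$e^{\alpha t} = (e^{\alpha^1 t},\dots,e^{\alpha^d t})$ and $e^{\lambda t} = (e^{\lambda^1 t},\dots,e^{\lambda^d t})$.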
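The three cases of the operator "∗" in (114), (115) and (116) amount to element-wise multiplication in which a $d$-dimensional vector scales the rows of a $d \times 2d$ matrix. The following sketch expresses this with NumPy broadcasting; the helper name `star` and the sample arrays are assumptions chosen for illustration.

```python
import numpy as np

def star(A, B):
    """Element-wise product of (114)-(116); a d-vector scales the rows of a d x 2d matrix."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    if A.ndim == 2 and B.ndim == 1:
        return A * B[:, None]   # case (115): matrix * vector
    if A.ndim == 1 and B.ndim == 2:
        return A[:, None] * B   # case (115): vector * matrix
    return A * B                # cases (114) and (116)

d = 2
A = np.arange(1.0, 1.0 + d * 2 * d).reshape(d, 2 * d)  # d x 2d matrix
v = np.array([10.0, 100.0])                           # d-dimensional vector
row_scaled = star(A, v)
```

With this convention, an expression such as $e^{\alpha T} * s_0$ is simply the component-wise product of two vectors.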

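Based on these preparations, we obtain the next proposition.

Proposition 2 The coefficients $S_T^{(i)}$, $\sigma_T^{(i)}$ and $h^{(i)}_{x,l,j}$ $(x = S, \sigma)$, $i = 0, 1, 2$, in the
expansions (111), (112) and (113) are given as follows:

$$
S_T^{(0)} = e^{\alpha T} * s_0, \tag{118}
$$

$$
\sigma_T^{(0)} = \theta + (\sigma_0 - \theta) * e^{-\lambda T}, \tag{119}
$$

$$
h^{(0)}_{x,l,j} = 0, \tag{120}
$$

$$
S_T^{(1)} = \int_0^T e^{\alpha(T-t)} * \Phi_S\left( \sigma_{t-}^{(0)}, S_{t-}^{(0)} \right) dZ_t
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h^{(1)}_{S,l,j} - \Lambda_l T E\left[ h^{(1)}_{S,l,1} \right] \right) * S_T^{(0)}, \tag{121}
$$

$$
\sigma_T^{(1)} = \int_0^T e^{-\lambda(T-t)} * \Phi_\sigma\left( \sigma_{t-}^{(0)} \right) dZ_t
+ \sum_{l=1}^{n} \left( \sum_{j=1}^{N_{l,T}} h^{(1)}_{\sigma,l,j} * e^{-\lambda(T-\tau_{j,l})} * \sigma^{(0)}_{\tau_{j,l}-}
- \Lambda_l E\left[ h^{(1)}_{\sigma,l,1} \right] * e^{-\lambda T} * \int_0^T e^{\lambda t} * \sigma_{t-}^{(0)}\,dt \right), \tag{122}
$$

$$
h^{(1)}_{x,l,j} = Y_{x,l,j} := (Y_{x^1,l,j},\dots,Y_{x^d,l,j}), \tag{123}
$$

$$
S_T^{(2)} = \int_0^T e^{\alpha(T-t)} * \partial_S \Phi_S\left( \sigma_{t-}^{(0)}, S_{t-}^{(0)} \right) * S_{t-}^{(1)}\,dZ_t
+ \int_0^T e^{\alpha(T-t)} * \partial_\sigma \Phi_S\left( \sigma_{t-}^{(0)}, S_{t-}^{(0)} \right) * \sigma_{t-}^{(1)}\,dZ_t
$$

$$
\qquad + \sum_{l=1}^{n} \left\{ \left( \sum_{j=1}^{N_{l,T}} h^{(2)}_{S,l,j} - \Lambda_l T E\left[ h^{(2)}_{S,l,1} \right] \right) * S_T^{(0)}
+ \sum_{j=1}^{N_{l,T}} h^{(1)}_{S,l,j} * e^{\alpha(T-\tau_{j,l})} * S^{(1)}_{\tau_{j,l}-}
- \Lambda_l E\left[ h^{(1)}_{S,l,1} \right] * e^{\alpha T} * \int_0^T e^{-\alpha t} * S_{t-}^{(1)}\,dt \right\}, \tag{124}
$$

$$
h^{(2)}_{x,l,j} = Y_{x,l,j} * Y_{x,l,j}. \tag{125}
$$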
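The deterministic leading terms (118) and (119) and the jump-size coefficients (120), (123) and (125) are simple enough to evaluate directly. The sketch below does so component-wise, assuming that the perturbed jump size has the form $e^{\epsilon Y} - 1$, which is what the coefficients $0$, $Y$ and $Y * Y$ suggest. All function names and parameter values are hypothetical.

```python
import numpy as np

def zeroth_order(s0, sigma0, alpha, lam, theta, T):
    """Deterministic leading terms (118) and (119), computed component-wise."""
    S0_T = np.exp(alpha * T) * s0
    sigma0_T = theta + (sigma0 - theta) * np.exp(-lam * T)
    return S0_T, sigma0_T

def jump_size_expansion(eps, Y):
    """h^(0) + eps*h^(1) + eps^2/2! * h^(2) with h^(0)=0, h^(1)=Y, h^(2)=Y*Y."""
    return eps * Y + 0.5 * eps**2 * Y * Y

# Hypothetical two-asset parameters.
S0_T, sigma0_T = zeroth_order(np.array([100.0, 50.0]), np.array([0.2, 0.3]),
                              np.array([0.01, 0.02]), np.array([1.0, 2.0]),
                              np.array([0.25, 0.25]), T=1.0)
Y = np.array([0.3, -0.2])
eps = 0.1
# Gap between exp(eps*Y) - 1 and its second-order expansion: of order (eps*Y)^3 / 6.
gap = np.abs(np.expm1(eps * Y) - jump_size_expansion(eps, Y)).max()
```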

Next, let us define the payoff of a basket call option with strike price $K$ as

$$
(g(x) - K)^+ \ (:= \max\{g(x) - K, 0\}), \qquad g(x) := w \cdot x = \sum_{i=1}^{d} w_i x^i, \tag{126}
$$

where $g(x)$ represents a weighted sum of the underlying asset prices $x^1,\dots,x^d$
with the constant weights $w_1,\dots,w_d$. Here, we set $x := (x^1,\dots,x^d)$ and $w :=
(w_1,\dots,w_d)$.
For an approximation of a basket option price, we first note that $g\left( S_T^{(\epsilon)} \right)$ is
expanded around $\epsilon = 0$ as:

$$
g\left( S_T^{(\epsilon)} \right) = g\left( S_T^{(0)} \right) + \epsilon\, g\left( S_T^{(1)} \right) + \frac{\epsilon^2}{2} g\left( S_T^{(2)} \right) + o(\epsilon^2). \tag{127}
$$

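Then, for a strike price $K = g(S_T^{(0)}) - \epsilon y$ for an arbitrary $y \in \mathbb{R}$, the payoff of the
call option with maturity $T$ is expanded as follows:

$$
\left( g\left( S_T^{(\epsilon)} \right) - K \right)^+ = \epsilon \left( \frac{g(S_T^{(\epsilon)}) - g(S_T^{(0)})}{\epsilon} + y \right)^+
= \epsilon \left( g\left( S_T^{(1)} \right) + \frac{\epsilon}{2} g\left( S_T^{(2)} \right) + y + o(\epsilon) \right)^+
$$

$$
\qquad = \epsilon \left( g(S_T^{(1)}) + y \right)^+ + \frac{\epsilon^2}{2} 1_{\{g(S_T^{(1)}) > -y\}}\, g\left( S_T^{(2)} \right) + o(\epsilon^2). \tag{128}
$$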

We next note that when the number of jumps is $k_l$ $(l = 1,\dots,n)$, that is, on
$\{N_l = k_l\} := \{N_{1,T} = k_1,\dots,N_{n,T} = k_n\}$, $S_T^{(1)}$ in equation (121) becomes

$$
\xi_{\{k_l\}} + \hat{S}_T, \tag{129}
$$

where

$$
\xi_{\{k_l\}} := \sum_{l=1}^{n} (k_l - \Lambda_l T)\, m_{S,l} * S_T^{(0)} \tag{130}
$$

and

$$
\hat{S}_T := \int_0^T e^{\alpha(T-t)} * \Phi_S\left( \sigma_t^{(0)}, S_t^{(0)} \right) dZ_t + \sum_{l=1}^{n} \left( \sum_{j=1}^{k_l} \gamma_{S,l} * \zeta_{S,j,l} * S_T^{(0)} \right). \tag{131}
$$

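Here, we use the following notation:
• $\gamma_{S,l} = (\gamma_{S^1,l},\dots,\gamma_{S^d,l})$
• $\zeta_{S,j,l} = (\zeta_{S^1,j,l},\dots,\zeta_{S^d,j,l})$ is a vector of random variables, where $\zeta_{S^i,j,l}$ follows
$N(0,1)$, that is, the standard normal distribution.
We remark that the distribution of $g(\hat{S}_T)$ is $N\left( 0, \Sigma_T^{\{k_l\}} \right)$, that is, the normal
distribution with mean zero and variance $\Sigma_T^{\{k_l\}}$, whose density function is expressed
as

$$
n\left( x; 0, \Sigma_T^{\{k_l\}} \right) := \frac{1}{\sqrt{2\pi \Sigma_T^{\{k_l\}}}} \exp\left\{ \frac{-x^2}{2\Sigma_T^{\{k_l\}}} \right\}. \tag{132}
$$

Here, $\Sigma_T^{\{k_l\}}$ is defined as follows:

$$
\Sigma_T^{\{k_l\}} := \int_0^T \left( w * e^{\alpha(T-t)} * \Phi_S\left( \sigma_t^{(0)}, S_t^{(0)} \right) \right) \left( w * e^{\alpha(T-t)} * \Phi_S\left( \sigma_t^{(0)}, S_t^{(0)} \right) \right)^{\top} dt
+ \sum_{l=1}^{n} k_l\, (w * \gamma_{S,l} * S_T^{(0)})^{\top}\, \vartheta_{\zeta_{S,l}}\, (w * \gamma_{S,l} * S_T^{(0)}), \tag{133}
$$

where $\vartheta_{\zeta_{S,l}}$ stands for the correlation matrix of $\zeta_{S,j,l} = (\zeta_{S^1,j,l},\dots,\zeta_{S^d,j,l})$, and $x^{\top}$
denotes the transpose of $x$.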
Next, we define

$$
\eta_2(x, \{k_l\}) = E\left[ g\left( S_T^{(2)} \right) \,\middle|\, g(\hat{S}_T) = x,\ \{N_l = k_l\} \right]. \tag{134}
$$

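With those preparations, we approximate the expectation of the basket call payoff
under an equivalent martingale measure in the following way:

$$
E\left[ \left( g\left( S_T^{(\epsilon)} \right) - K \right)^+ \right]
= \epsilon E\left[ E\left[ \left( g(S_T^{(1)}) + y \right)^+ \,\middle|\, g(\hat{S}_T) = x,\ \{N_l = k_l\} \right] \right]
+ \frac{\epsilon^2}{2} E\left[ E\left[ 1_{\{g(S_T^{(1)}) > -y\}}\, g\left( S_T^{(2)} \right) \,\middle|\, g(\hat{S}_T) = x,\ \{N_l = k_l\} \right] \right] + o(\epsilon^2). \tag{135}
$$
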
We also note that the probability of $\{N_l = k_l\} := \{N_{1,T} = k_1,\dots,N_{n,T} = k_n\}$
is expressed as

$$
p_{\{k_l\}} := \prod_{l=1}^{n} \frac{(\Lambda_l T)^{k_l} e^{-\Lambda_l T}}{k_l!}, \tag{136}
$$

which is the product of the probabilities that $N_{l,T}$ has exactly $k_l$ jumps $(l = 1,\dots,n)$,
that is, $\prod_{l=1}^{n} P(\{N_{l,T} = k_l\})$, thanks to the independence of $N_{l,T}$ $(l = 1,\dots,n)$.
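In a numerical implementation, the sum over jump configurations has to be truncated. The sketch below evaluates the probability (136) and enumerates all configurations in which each source has at most `max_jumps` jumps. This per-source cap is one possible truncation chosen here for simplicity (the text organizes the sum by the total number of jumps instead), and the intensities are hypothetical.

```python
from itertools import product
from math import exp, factorial

def jump_count_probability(counts, intensities, T):
    """p_{k_l} of (136): product of independent Poisson probabilities."""
    p = 1.0
    for k, lam in zip(counts, intensities):
        mean = lam * T
        p *= mean**k * exp(-mean) / factorial(k)
    return p

def truncated_configurations(intensities, T, max_jumps):
    """All (k_1, ..., k_n) with each k_l <= max_jumps, paired with p_{k_l}."""
    n = len(intensities)
    return [(k, jump_count_probability(k, intensities, T))
            for k in product(range(max_jumps + 1), repeat=n)]

intensities = [0.5, 1.0]  # hypothetical jump intensities
configs = truncated_configurations(intensities, T=1.0, max_jumps=12)
total_mass = sum(p for _, p in configs)  # close to 1 if the cap is large enough
```

Checking that `total_mass` is close to one gives a quick indication of whether the cap discards a non-negligible part of the distribution.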
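Then, we calculate the coefficients of $\epsilon$ and $\frac{\epsilon^2}{2}$ on the right-hand side of (135) as follows:
The coefficient of $\epsilon$ is given by:

$$
E\left[ E\left[ \left( g\left( S_T^{(1)} \right) + y \right)^+ \,\middle|\, g(\hat{S}_T) = x,\ \{N_l = k_l\} \right] \right]
= \sum_{k=0}^{\infty} \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}} \int_{-(g(\xi_{\{k_l\}}) + y)}^{\infty} \left( x + g(\xi_{\{k_l\}}) + y \right) n(x; 0, \Sigma_T^{\{k_l\}})\,dx, \tag{137}
$$

and the coefficient of $\frac{\epsilon^2}{2}$ is given by:

$$
E\left[ E\left[ 1_{\{g(S_T^{(1)}) > -y\}}\, g\left( S_T^{(2)} \right) \,\middle|\, g(\hat{S}_T) = x,\ \{N_l = k_l\} \right] \right]
= \sum_{k=0}^{\infty} \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}} \int_{-(g(\xi_{\{k_l\}}) + y)}^{\infty} \eta_2(x, \{k_l\})\, n(x; 0, \Sigma_T^{\{k_l\}})\,dx. \tag{138}
$$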

Then, the initial value $C(K,T)$ of the basket call option with maturity $T$ and
strike $K$ is expanded around $\epsilon = 0$ as follows:

$$
C(K,T) = \sum_{k=0}^{\infty} \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}}\, e^{-rT} \left\{ \epsilon \int_{-y_{\{k_l\}}}^{\infty} (x + y_{\{k_l\}})\, n(x; 0, \Sigma_T^{\{k_l\}})\,dx
+ \frac{\epsilon^2}{2} \int_{-y_{\{k_l\}}}^{\infty} \eta_2(x, \{k_l\})\, n(x; 0, \Sigma_T^{\{k_l\}})\,dx \right\} + o(\epsilon^2), \tag{139}
$$

where $y_{\{k_l\}} := g(\xi_{\{k_l\}}) + y$, and $r$ is a constant risk-free rate.
In order to evaluate $\eta_2(x, \{k_l\})$, that is, the conditional expectation defined in (134),
we apply some formulas derived in Lemma 3.2 of [80].
Consequently, with $\epsilon = 1$ we obtain an approximate pricing formula for a basket
call option, which corresponds to an asymptotic expansion of the basket option price
up to the $\epsilon^2$-order.

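Theorem 5 An approximation formula for the initial value $C(K,T)$ of a basket
call option with maturity $T$ and strike price $K$ is given by the following equation:

$$
C(K,T) \approx \sum_{k=0}^{\infty} \sum_{\sum_{l=1}^{n} k_l = k} p_{\{k_l\}}\, e^{-rT} \left\{ y_{\{k_l\}}\, N\left( \frac{y_{\{k_l\}}}{\sqrt{\Sigma_T^{\{k_l\}}}} \right)
+ \left( \Sigma_T^{\{k_l\}} + C_1 \frac{H_1\left( y_{\{k_l\}}; \Sigma_T^{\{k_l\}} \right)}{\Sigma_T^{\{k_l\}}}
+ C_2 \frac{H_2\left( y_{\{k_l\}}; \Sigma_T^{\{k_l\}} \right)}{\left( \Sigma_T^{\{k_l\}} \right)^2} + C_3 \right) n\left( y_{\{k_l\}}; 0, \Sigma_T^{\{k_l\}} \right) \right\}, \tag{140}
$$

where $p_{\{k_l\}} = \prod_{l=1}^{n} \frac{(\Lambda_l T)^{k_l} e^{-\Lambda_l T}}{k_l!}$, $r$ is a constant risk-free rate, $y = g(S_T^{(0)}) - K$,
$y_{\{k_l\}} = g(\xi_{\{k_l\}}) + y$, $N(x)$ denotes the standard normal distribution function and
$n(x; 0, \Sigma) = \frac{1}{\sqrt{2\pi\Sigma}} \exp\left( \frac{-x^2}{2\Sigma} \right)$. Here, $\Sigma_T^{\{k_l\}}$ is given by (133), and $\xi_{\{k_l\}}$ is defined by
(130). $C_1$, $C_2$ and $C_3$ are some constants, which are given with the derivations in
Appendix B of [80]. Moreover, $H_k\left( x; \Sigma_T^{\{k_l\}} \right)$ denotes the $k$th order Hermite polynomial:
in particular, $H_1\left( x; \Sigma_T^{\{k_l\}} \right) = x$ and $H_2\left( x; \Sigma_T^{\{k_l\}} \right) = x^2 - \Sigma_T^{\{k_l\}}$.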
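The first-order integral in (139), $\int_{-y}^{\infty} (x + y)\, n(x; 0, \Sigma)\,dx$, can be evaluated in closed form as $y\,N(y/\sqrt{\Sigma}) + \Sigma\, n(y; 0, \Sigma)$, which is a standard Gaussian identity. The sketch below compares this closed form with a midpoint-rule quadrature. The function names, the truncation of the upper limit at 12 standard deviations, and the sample values of $y$ and $\Sigma$ are assumptions made for illustration.

```python
import math

def normal_pdf(x, var):
    """Density n(x; 0, var) of a centered normal distribution."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def normal_cdf(z):
    """Standard normal distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def leading_term_closed_form(y, var):
    """y*N(y/sqrt(var)) + var*n(y; 0, var)."""
    return y * normal_cdf(y / math.sqrt(var)) + var * normal_pdf(y, var)

def leading_term_quadrature(y, var, n_steps=200_000):
    """Midpoint rule for the integral of (x + y) n(x; 0, var) over [-y, 12 sd]."""
    lower, upper = -y, 12.0 * math.sqrt(var)
    h = (upper - lower) / n_steps
    total = 0.0
    for i in range(n_steps):
        x = lower + (i + 0.5) * h
        total += (x + y) * normal_pdf(x, var)
    return total * h

y, var = 0.4, 2.5  # hypothetical values of y_{k_l} and the variance
closed = leading_term_closed_form(y, var)
numeric = leading_term_quadrature(y, var)
```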

8 Perturbation Scheme in Forward Backward Stochastic Differential Equations (FBSDEs)

FBSDEs have become quite popular in the finance community since El Karoui,
Peng and Quenez [16], especially after the recent financial crises and the subsequent
highly volatile markets, which have led to recognition of the importance of counterparty
risk management, particularly credit value adjustments (CVA).
However, explicit solutions of FBSDEs are known only for simple
linear or quadratic examples. Although several techniques have been proposed in
the last decade, they seem very limited in practical applications, since they rely on
numerical methods for non-linear partial differential equations (PDEs) or on regression-based
Monte Carlo simulations, which are generally very difficult to implement or
quite time-consuming, especially for high-dimensional and long-horizon problems.
Recently, [25] developed a simple analytical approximation scheme for
non-linear FBSDEs, covering not only the so-called decoupled case but also the
coupled case. Fujii and Takahashi [25] introduced a perturbation parameter
into the generator of a backward stochastic differential equation (BSDE) in order to expand
the non-linear terms recursively around a relevant linear FBSDE. In the computation
at each order, [25] explicitly represents the backward components as functions
of the forward components and takes their expectations. Hence, except in cases
where the distributions of the forward process are explicitly known, we need to
approximate those distributions, and so, again, the asymptotic expansion
technique for the forward stochastic differential equation (FSDE) is useful.
Section 8.1 below illustrates the scheme briefly. Fujii and Takahashi
[25] also provided two numerical examples, in which the second-order analytic
approximations work quite well compared with numerical techniques such as the finite
difference method and regression-based Monte Carlo simulation. Please see the
paper for details.
Moreover, their subsequent work [26] applied this scheme to the optimal portfolio
problem in an incomplete market with stochastic volatility, and demonstrated
accurate approximations even for long maturities such as 10 years, as opposed
to regression-based Monte Carlo simulation, which works well only for short
maturities such as one year.
We also note that the method has the great advantage of yielding explicit expressions
for the optimal portfolios and hedging strategies, which is very important in practice.
Furthermore, we can employ the method in general multi-dimensional cases.
In order to reduce the computational burden of this method further, a
scheme with an interacting particle method has recently been developed. Section 8.2
describes the outline. Please also see [29] for an application of the method to American
option pricing.
Furthermore, [104] provides a mathematical foundation for the original scheme
in the decoupled case proposed in [25]. (The justification for the coupled case seems
an important and interesting research topic.) It mainly consists of two parts. That
is, for the BSDE expansion with a perturbed generator, they obtained the coefficients
up to an arbitrary order as the solution to a system of the associated BSDEs
with the base FSDE, and presented the error estimate of the expansion. Accordingly,
they showed a concrete representation for each expansion coefficient of the volatility
component, that is, the martingale integrand in the BSDE. For the FSDE expansion,
they derived an expansion formula with a sharp error estimate for the expectation
of the solution to the base FSDE in terms of a small diffusion. Then, they combined
both results, in particular applying the FSDE expansion formula to the BSDE
expansion coefficients, to obtain the main result, that is, an asymptotic expansion of
FBSDEs with a perturbed generator. In the proofs, [104] effectively applied the
representation results in Ma and Zhang [63] for the BSDE expansion and the properties
of the Kusuoka-Stroock functions in Kusuoka [52] for the FSDE expansion.
In a different stream, [102] proposed a new semi-closed-form approximation
for the solutions of FBSDEs. In particular, applying the asymptotic expansion method
in [100] and [103] to the forward SDEs together with a Picard-type iteration scheme for the
BSDEs, they obtained an error estimate for the approximation. Moreover, they
demonstrated the effectiveness of the method through numerical examples of pricing
options with counterparty risk under local and stochastic volatility models,
where the credit value adjustment (CVA) is taken into account. Roughly speaking,
considering a perturbed forward SDE $X^{\varepsilon}$, $\varepsilon \in (0,1]$, and an associated backward
SDE $(Y^{\varepsilon}, Z^{\varepsilon})$, they have the following recursive asymptotic expansion around some
non-degenerate Gaussian model $\bar{X}^0$. That is, for $k \ge 0$, $N \ge 1$,

$$
Y_t^{\varepsilon,t,x} \simeq u^{\varepsilon,k+1,N}(t,x) = E[g(\bar{X}_T^{0,t,x})]
+ E\left[ \int_t^T f(s, \bar{X}_s^{0,t,x}, Y_s^{\varepsilon,k,N,t,x}, Z_s^{\varepsilon,k,N,t,x})\,ds \right]
$$

$$
\qquad + \sum_{i=1}^{N} \varepsilon^i E[g(\bar{X}_T^{0,t,x})\, \pi_{i,T}^{0,t}]
+ \sum_{i=1}^{N} \varepsilon^i E\left[ \int_t^T f(s, \bar{X}_s^{0,t,x}, Y_s^{\varepsilon,k,N,t,x}, Z_s^{\varepsilon,k,N,t,x})\, \pi_{i,s}^{0,t}\,ds \right], \tag{141}
$$

$$
Z_t^{\varepsilon,t,x} \simeq (\nabla u^{\varepsilon,k+1,N} \sigma)(t,x) = \left\{ E[g(\bar{X}_T^{0,t,x})\, \mathcal{N}_{0,T}^{0,t}]
+ E\left[ \int_t^T f(s, \bar{X}_s^{0,t,x}, Y_s^{\varepsilon,k,N,t,x}, Z_s^{\varepsilon,k,N,t,x})\, \mathcal{N}_{0,s}^{0,t}\,ds \right] \right.
$$

$$
\qquad \left. + \sum_{i=1}^{N} \varepsilon^i E[g(\bar{X}_T^{0,t,x})\, \mathcal{N}_{i,T}^{0,t}]
+ \sum_{i=1}^{N} \varepsilon^i E\left[ \int_t^T f(s, \bar{X}_s^{0,t,x}, Y_s^{\varepsilon,k,N,t,x}, Z_s^{\varepsilon,k,N,t,x})\, \mathcal{N}_{i,s}^{0,t}\,ds \right] \right\} \varepsilon\sigma(t,x), \tag{142}
$$

where $Y_s^{\varepsilon,k,N,t,x} = u^{\varepsilon,k,N}(s, \bar{X}_s^{0,t,x})$ and $Z_s^{\varepsilon,k,N,t,x} = (\nabla_x u^{\varepsilon,k,N} \sigma)(s, \bar{X}_s^{0,t,x})$. Here,
the processes $\pi_{i,t}^{0}$ and $\mathcal{N}_{i,t}^{0}$, $i = 1,\dots,N$, are the Malliavin weights and, in particular,
$\mathcal{N}_{0,t}^{0}$ corresponds to the weight appearing in a representation theorem in Ma and Zhang
[63].

8.1 Expansion with Perturbed Generator in BSDE

This subsection briefly describes the perturbation method following [25]. First, let
us consider the following decoupled FBSDE:

$$
dV_t = -f(X_t, V_t, Z_t)\,dt + Z_t \cdot dW_t, \qquad V_T = \Phi(X_T), \tag{143}
$$

where $V$ takes values in $\mathbb{R}$, $W$ is an $r$-dimensional Wiener process, and $X_t$, valued
in $\mathbb{R}^d$, is assumed to follow a diffusion process, which is the solution to the (forward)
SDE:

$$
dX_t = \gamma_0(X_t)\,dt + \gamma(X_t) \cdot dW_t; \quad X_0 = x. \tag{144}
$$

Hereafter, we assume appropriate regularity conditions that guarantee the mathematical
validity. For example, please see [104] on this point.
In order to approximate the pair $(V_t, Z_t)$ in terms of $X_t$, we extract the linear
term from the generator $f$ and treat the residual non-linear term as a perturbation
to the linear FBSDE. That is, let us introduce a perturbation parameter $\epsilon$, and then
write the equation as

$$
dV_t^{(\epsilon)} = c(X_t)V_t^{(\epsilon)}\,dt - \epsilon\, g(X_t, V_t^{(\epsilon)}, Z_t^{(\epsilon)})\,dt + Z_t^{(\epsilon)} \cdot dW_t, \qquad V_T^{(\epsilon)} = \Phi(X_T). \tag{145}
$$

Here, the above equation with $\epsilon = 1$ corresponds to the original model:

$$
f(X_t, V_t, Z_t) = -c(X_t)V_t + g(X_t, V_t, Z_t). \tag{146}
$$

We remark that, as in the previous asymptotic expansion cases, the residual part $\epsilon g$
should be small for a precise approximation. Hence, one should choose the linear
term $c(X_t)V_t^{(\epsilon)}$ in such a way that the residual non-linear term $g$ becomes as small
as possible.
Now, we are going to expand the solution of BSDE (145) with respect to $\epsilon$. That
is, suppose $V_t^{(\epsilon)}$ and $Z_t^{(\epsilon)}$ are expanded as follows:

$$
V_t^{(\epsilon)} = V_t^{(0)} + \epsilon V_t^{(1)} + \epsilon^2 V_t^{(2)} + \cdots \tag{147}
$$

$$
Z_t^{(\epsilon)} = Z_t^{(0)} + \epsilon Z_t^{(1)} + \epsilon^2 Z_t^{(2)} + \cdots. \tag{148}
$$

For illustrative purposes, let us show the first few steps of the expansion. For the zeroth
order of $\epsilon$, it is easily seen that $V_t^{(0)}$ is a solution to the following equation:

$$
dV_t^{(0)} = c(X_t)V_t^{(0)}\,dt + Z_t^{(0)} \cdot dW_t \tag{149}
$$

$$
V_T^{(0)} = \Phi(X_T). \tag{150}
$$

Then, $V_t^{(0)}$ can be represented as follows:

$$
V_t^{(0)} = E\left[ e^{-\int_t^T c(X_s)\,ds}\, \Phi(X_T) \,\middle|\, \mathcal{F}_t \right], \tag{151}
$$

which is equivalent to the value of a standard European contingent claim with the
terminal payoff $\Phi(X_T)$ and the discount rate $c(X_t)$ under a suitable pricing measure.
Clearly, $V_t^{(0)}$ is a function of $X_t$ due to the Markovian nature of the model. Moreover,
applying Itô's formula (or the Malliavin derivative), we are able to obtain $Z_t^{(0)}$ as a
function of $X_t$ as well.
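When no closed form is available, the zeroth-order value (151) can be estimated by simulation of the forward process. The sketch below does this at time 0 with an Euler scheme for a one-dimensional forward SDE. The function name, the left-point rule for the discount integral, and the test case (a Brownian motion with constant discount rate and quadratic payoff, for which the exact value is known) are assumptions for illustration and are not taken from [25].

```python
import numpy as np

def zeroth_order_value(x0, drift, diffusion, c, payoff, T, n_steps, n_paths, rng):
    """Monte Carlo estimate of E[exp(-int_0^T c(X_s) ds) * payoff(X_T)] with Euler steps."""
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    discount_integral = np.zeros(n_paths)
    for _ in range(n_steps):
        discount_integral += c(x) * dt  # left-point rule for the integral of c
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = x + drift(x) * dt + diffusion(x) * dw
    return float(np.mean(np.exp(-discount_integral) * payoff(x)))

rng = np.random.default_rng(1)
# Hypothetical test case with a known answer: X = x0 + W, constant c, payoff x^2.
estimate = zeroth_order_value(
    x0=1.0, drift=lambda x: np.zeros_like(x), diffusion=lambda x: np.ones_like(x),
    c=lambda x: np.full_like(x, 0.05), payoff=lambda x: x**2,
    T=1.0, n_steps=50, n_paths=200_000, rng=rng)
exact = np.exp(-0.05) * (1.0**2 + 1.0)  # exp(-cT) * (x0^2 + T)
```

For the higher-order terms, such a pathwise estimate would have to be nested inside further time integrals, which is one reason why analytic approximations of these expectations are attractive.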
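Next, let us consider the process $V^{(\epsilon)} - V^{(0)}$:

$$
d\left( V_t^{(\epsilon)} - V_t^{(0)} \right) = c(X_t)\left( V_t^{(\epsilon)} - V_t^{(0)} \right) dt
- \epsilon\, g(X_t, V_t^{(\epsilon)}, Z_t^{(\epsilon)})\,dt + \left( Z_t^{(\epsilon)} - Z_t^{(0)} \right) \cdot dW_t, \qquad V_T^{(\epsilon)} - V_T^{(0)} = 0. \tag{152}
$$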

Now, by extracting the first-order term in $\epsilon$, we can once again recover a linear
FBSDE:

$$
dV_t^{(1)} = c(X_t)V_t^{(1)}\,dt - g(X_t, V_t^{(0)}, Z_t^{(0)})\,dt + Z_t^{(1)} \cdot dW_t, \qquad V_T^{(1)} = 0, \tag{153}
$$

which leads to

$$
V_t^{(1)} = E\left[ \int_t^T e^{-\int_t^u c(X_s)\,ds}\, g(X_u, V_u^{(0)}, Z_u^{(0)})\,du \,\middle|\, \mathcal{F}_t \right]. \tag{154}
$$

Because $V_u^{(0)}$ and $Z_u^{(0)}$ are functions of $X_u$, we obtain $V_t^{(1)}$ as a function of $X_t$,
and also $Z_t^{(1)}$ through Itô's formula (or the Malliavin derivative).
In exactly the same way, we are able to derive corrections of arbitrarily high order.
In particular, due to the $\epsilon$ in front of the non-linear term $g$, the system remains
linear at every order of the approximation. For example, $V_t^{(2)}$, that is, the $\epsilon^2$-order
coefficient of the expansion of $V_t^{(\epsilon)}$, is the solution to the following equation:

$$
dV_t^{(2)} = c(X_t)V_t^{(2)}\,dt - \left( \frac{\partial}{\partial v} g(X_t, V_t^{(0)}, Z_t^{(0)})\, V_t^{(1)}
+ \nabla_z g(X_t, V_t^{(0)}, Z_t^{(0)}) \cdot Z_t^{(1)} \right) dt + Z_t^{(2)} \cdot dW_t, \qquad V_T^{(2)} = 0. \tag{155}
$$

In general, suppose that we have succeeded in representing the backward components
$(V_t, Z_t)$ in terms of $X_t$ up to the $(i-1)$th order. Then, in order to proceed to a higher
order approximation, we need to evaluate expressions of the following form, with some
deterministic function $G(\cdot)$ of the forward components $X_t$:

$$
V_t^{(i)} = E\left[ \int_t^T e^{-\int_t^u c(X_s)\,ds}\, G(X_u)\,du \,\middle|\, \mathcal{F}_t \right]. \tag{156}
$$

Even if it seems impossible to obtain the exact result, we can still obtain an analytic
approximation for $(V_t^{(i)}, Z_t^{(i)})$, again through the asymptotic expansion method.
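To see the recursion at work in the simplest possible setting, the following sketch treats a noise-free analogue of the perturbed equation: there is no forward process and no martingale term, so the problem reduces to the backward ODE $dV/dt = cV - \epsilon g(V)$ with a given terminal value. The zeroth-, first- and second-order terms are then the deterministic counterparts of (151), (154) and the solution of (155) without the $z$-dependence. The constant discount rate, the choice $g(v) = v^2$, and the helper names are all hypothetical; this is only a consistency check of the scheme, not the method of [25] itself.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def linear_correction(source, grid, c):
    """W(t_k) = int_{t_k}^T exp(-c (u - t_k)) source(u) du at every grid point."""
    out = np.zeros_like(grid)
    for k in range(len(grid) - 1):
        u = grid[k:]
        out[k] = trapezoid(np.exp(-c * (u - grid[k])) * source[k:], u)
    return out

c, eps, T, psi = 0.1, 0.2, 1.0, 1.0          # hypothetical parameters
g = lambda v: v**2                            # hypothetical non-linear residual
dg = lambda v: 2.0 * v                        # its derivative in v
grid = np.linspace(0.0, T, 801)

V0 = psi * np.exp(-c * (T - grid))            # zeroth order
V1 = linear_correction(g(V0), grid, c)        # first order
V2 = linear_correction(dg(V0) * V1, grid, c)  # second order
approx = V0[0] + eps * V1[0] + eps**2 * V2[0]

# Reference: integrate dV/dt = c V - eps g(V) backwards from V(T) = psi with RK4.
rhs = lambda v: c * v - eps * g(v)
v, h = psi, -T / 4000
for _ in range(4000):
    k1 = rhs(v)
    k2 = rhs(v + 0.5 * h * k1)
    k3 = rhs(v + 0.5 * h * k2)
    k4 = rhs(v + h * k3)
    v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
reference = v
```

In this toy example each additional order brings the truncated sum closer to the reference value at time 0, which reflects the fact that every order only requires solving a linear problem driven by the lower orders.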

As an example, [26] explicitly derived an approximation formula for the
dynamic optimal portfolio in an incomplete market setting, and confirmed its accuracy
by comparing it with the exact result obtained by the Cole-Hopf transformation (Zariphopoulou
[121]).
Finally, let us provide a brief remark on the approximation of coupled FBSDEs.
Let us consider the following generic coupled non-linear FBSDE:

$$
dV_t = -f(t, X_t, V_t, Z_t)\,dt + Z_t \cdot dW_t, \qquad V_T = \Phi(X_T), \tag{157}
$$

$$
dX_t = \gamma_0(t, X_t, V_t, Z_t)\,dt + \gamma(t, X_t, V_t, Z_t) \cdot dW_t; \quad X_0 = x.
$$

We are able to treat this case in a similar way to the decoupled case by introducing
perturbations to the forward SDE in addition to the one in the BSDE:

$$
dV_t^{(\epsilon)} = c(t, X_t^{(\epsilon)})V_t^{(\epsilon)}\,dt - \epsilon\, g\left( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \right) dt + Z_t^{(\epsilon)} \cdot dW_t, \qquad V_T^{(\epsilon)} = \Phi\left( X_T^{(\epsilon)} \right),
$$

$$
dX_t^{(\epsilon)} = \left( r\left( t, X_t^{(\epsilon)} \right) + \epsilon\, \mu\left( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \right) \right) dt
+ \left( \sigma\left( t, X_t^{(\epsilon)} \right) + \epsilon\, \eta\left( t, X_t^{(\epsilon)}, V_t^{(\epsilon)}, Z_t^{(\epsilon)} \right) \right) \cdot dW_t.
$$

We also note that a similar method can be applied to the coupled case under a PDE
(partial differential equation) formulation based on the so-called four-step scheme
(e.g. Ma-Yong [62]). Please see [25] for the details. Establishing the mathematical
validity of the scheme for the coupled case will be one of the research topics in the
future.

8.2 Perturbation Scheme with Interacting Particle Method

This subsection briefly introduces a new scheme proposed by Fujii and Takahashi
[27]. Except in cases where we are able to obtain fully closed-form expressions, the
high-order expansions of perturbed FBSDEs generally contain multi-dimensional
time integrations of expectation values due to the convoluted nature of the scheme,
which makes standard Monte Carlo simulations too time-consuming. To avoid nested
simulations, one can apply a particle representation inspired by the ideas of branching
diffusion models (e.g. Fujita [23], Ikeda, Nagasawa and Watanabe [40–42],
McKean [69], Nagasawa and Sirao [70]). Then, we are able to provide a straightforward
simulation scheme to solve non-linear FBSDEs at each order of the approximation
based on the perturbation. In particular, compared with a direct application
of the branching diffusion method, the method is expected to be less numerically
intensive, because, thanks to the expansion of the perturbed generator, the system
of interest is already decomposed into a set of linear problems. We illustrate the outline
of the method by following [27].
Again, let us introduce a perturbation parameter $\epsilon$ in the generator of a BSDE as
follows:

$$
dV_s^{(\epsilon)} = -\epsilon f(X_s, V_s^{(\epsilon)}, Z_s^{(\epsilon)})\,ds + Z_s^{(\epsilon)} \cdot dW_s, \qquad V_T^{(\epsilon)} = \Phi(X_T), \tag{158}
$$

where $X_t \in \mathbb{R}^d$ is assumed to follow a generic Markovian forward SDE:

$$
dX_s = \gamma_0(X_s)\,ds + \gamma(X_s) \cdot dW_s; \quad X_t = x_t. \tag{159}
$$

Next, let us fix the initial time as $t$. We denote the Malliavin derivative of $X_u$ $(u \ge t)$
at time $t$ as

$$
D_t X_u \in \mathbb{R}^{r \times d}. \tag{160}
$$

Let us also note that, in terms of the future time $u$, the SDE of $(Y_{t,u})^i_j$ defined by
$(Y_{t,u})^i_j = \partial_{x_t^j} X_u^i$ is given as follows:

$$
d(Y_{t,u})^i_j = \partial_k \gamma_0^i(X_u)(Y_{t,u})^k_j\,du + \partial_k \gamma_a^i(X_u)(Y_{t,u})^k_j\,dW_u^a, \qquad (Y_{t,t})^i_j = \delta^i_j, \tag{161}
$$

where $\partial_k$ denotes the partial differentiation with respect to the $k$th component of
$X$, and $\delta^i_j$ stands for the Kronecker delta. Here, $i$ and $j$ run through $\{1,\dots,d\}$ and
$a$ runs through $\{1,\dots,r\}$, and we adopt the Einstein notation, which assumes summation
over all paired indexes. Then, it is well known that

$$
(D_t X_u^i)_a = (Y_{t,u}\gamma(x_t))^i_a,
$$

where $a \in \{1,\dots,r\}$ is the index of the $r$-dimensional Wiener process.


First, for the -zeroth order, it is easy to see
  
(0) 
Vt = E (X T )Ft (162)
  
(0),a 
Zt = E ∂i (X T )(Yt T γ(X t ))ia Ft . (163)

Then, it is clear that they can be evaluated by standard Monte Carlo simulations.
However, for their use in higher order approximations, it is crucial to obtain analytical
(closed form) approximate expressions for these two quantities, for example based
on the asymptotic expansion technique as before.
In the following, let us suppose that we have obtained the solutions up to a given
order of the asymptotic expansion, and write each of them as a function of xt :
\[
\begin{cases}
V_t^{(0)} = v^{(0)}(x_t) \\
Z_t^{(0)} = z^{(0)}(x_t).
\end{cases} \tag{164}
\]

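Next, for the $\epsilon$-first order coefficient $V_t^{(1)}$, we obtain the expression
\[
\begin{aligned}
V_t^{(1)} &= \int_t^T E\left[ f(X_u, V_u^{(0)}, Z_u^{(0)}) \,\middle|\, \mathcal{F}_t \right] du \\
&= \int_t^T E\left[ f\left(X_u, v^{(0)}(X_u), z^{(0)}(X_u)\right) \,\middle|\, \mathcal{F}_t \right] du.
\end{aligned} \tag{165}
\]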

Then, we define a new process for $(s > t)$ by introducing a deterministic positive
process $\lambda_t$ as follows:
\[
\hat V_{ts}^{(1)} = e^{\int_t^s \lambda_u du}\, V_s^{(1)}. \tag{166}
\]

Here, $\lambda_t$ can be a positive constant in the simplest case. Then, for the fixed initial
time $t$, its SDE is given by
\[
d\hat V_{ts}^{(1)} = \lambda_s \hat V_{ts}^{(1)}\,ds - \lambda_s \hat f_{ts}(X_s, v^{(0)}(X_s), z^{(0)}(X_s))\,ds
 + e^{\int_t^s \lambda_u du}\, Z_s^{(1)} \cdot dW_s,
\]
where
\[
\hat f_{ts}(x, v^{(0)}(x), z^{(0)}(x)) = \frac{1}{\lambda_s}\, e^{\int_t^s \lambda_u du}\, f(x, v^{(0)}(x), z^{(0)}(x)).
\]
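Since we have $\hat V_{tt}^{(1)} = V_t^{(1)}$, one can easily see that the following relation holds:
\[
V_t^{(1)} = E\left[ \int_t^T e^{-\int_t^u \lambda_s ds}\, \lambda_u\, \hat f_{tu}(X_u, v^{(0)}(X_u), z^{(0)}(X_u))\,du \,\middle|\, \mathcal{F}_t \right]. \tag{167}
\]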
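The relation (167) can also be checked directly against (165). The short computation below is only a sketch of that check; it uses nothing beyond the definition of $\hat f$ given above and an interchange of expectation and time integration, which is assumed to be justified.

```latex
\begin{align*}
e^{-\int_t^u \lambda_s ds}\,\lambda_u\,\hat f_{tu}\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr)
 &= e^{-\int_t^u \lambda_s ds}\,\lambda_u\,\frac{1}{\lambda_u}\,
    e^{\int_t^u \lambda_s ds}\, f\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr) \\
 &= f\bigl(x, v^{(0)}(x), z^{(0)}(x)\bigr),
\end{align*}
% so the integrand in (167) is exactly the integrand in (165):
\begin{align*}
E\left[ \int_t^T e^{-\int_t^u \lambda_s ds}\,\lambda_u\,
        \hat f_{tu}\bigl(X_u, v^{(0)}(X_u), z^{(0)}(X_u)\bigr)\,du \,\middle|\, \mathcal{F}_t \right]
 = \int_t^T E\left[ f\bigl(X_u, v^{(0)}(X_u), z^{(0)}(X_u)\bigr) \,\middle|\, \mathcal{F}_t \right] du .
\end{align*}
```

In particular, the value of $V_t^{(1)}$ does not depend on the choice of $\lambda$; the intensity only changes how the time integral is sampled.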

As in standard credit risk modeling (e.g. Bielecki-Rutkowski [6]), this is the present
value of a default payment, where the default intensity is $\lambda_s$ and the default
payoff at $s (> t)$ is $\hat f_{ts}(X_s, v^{(0)}(X_s), z^{(0)}(X_s))$. Thus, we obtain the
following proposition.
Proposition 3 The $V_t^{(1)}$ in (165) can be equivalently expressed as
\[
V_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, \hat f_{t\tau}\left(X_\tau, v^{(0)}(X_\tau), z^{(0)}(X_\tau)\right) \,\middle|\, \mathcal{F}_t \right]. \tag{168}
\]
Here $\tau$ is the interaction time, where the interaction is drawn independently from the
Poisson distribution with an arbitrary deterministic positive intensity process $\lambda_t$, and $\hat f$
is defined as
\[
\hat f_{ts}(x, v^{(0)}(x), z^{(0)}(x)) = \frac{1}{\lambda_s}\, e^{\int_t^s \lambda_u du}\, f(x, v^{(0)}(x), z^{(0)}(x)). \tag{169}
\]

Now, let us consider the $\epsilon$-order coefficient of $Z^{(\epsilon)}$, that is, the component $Z^{(1)}$. It
can be expressed as
\[
Z_t^{(1)} = \int_t^T E\left[ D_t f\left(X_u, v^{(0)}(X_u), z^{(0)}(X_u)\right) \,\middle|\, \mathcal{F}_t \right] du. \tag{170}
\]

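Firstly, we observe that the SDE of the Malliavin derivative of $V^{(1)}$ is given as
follows:
\[
\begin{aligned}
d(D_t V_s^{(1)}) &= -(D_t X_s^i)\,\nabla_i(X_s, v^{(0)}, z^{(0)})\, f(X_s, v^{(0)}, z^{(0)})\,ds + (D_t Z_s^{(1)}) \cdot dW_s; \\
D_t V_t^{(1)} &= Z_t^{(1)},
\end{aligned} \tag{171}
\]
where
\[
\nabla_i(x, v^{(0)}, z^{(0)}) \equiv \partial_i + \partial_i v^{(0)}(x)\,\partial_v + \partial_i z^{a(0)}(x)\,\partial_{z^a}, \tag{172}
\]
\[
f(x, v^{(0)}, z^{(0)}) \equiv f(x, v^{(0)}(x), z^{(0)}(x)). \tag{173}
\]
Then, we define, for $(s > t)$, $\widehat{D_t V_s^{(1)}}$ as
\[
\widehat{D_t V_s^{(1)}} = e^{\int_t^s \lambda_u du}\,(D_t V_s^{(1)}), \tag{174}
\]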

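and its SDE can be written as
\[
\begin{aligned}
d(\widehat{D_t V_s^{(1)}}) ={}& \lambda_s\,(\widehat{D_t V_s^{(1)}})\,ds - \lambda_s\,(D_t X_s^i)\,\nabla_i(X_s, v^{(0)}, z^{(0)})\, \hat f_{ts}(X_s, v^{(0)}, z^{(0)})\,ds \\
& + e^{\int_t^s \lambda_u du}\,(D_t Z_s^{(1)}) \cdot dW_s.
\end{aligned} \tag{175}
\]
Then, we again have
\[
\widehat{D_t V_t^{(1)}} = Z_t^{(1)}. \tag{176}
\]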

Hence,
\[
Z_t^{(1)} = E\left[ \int_t^T e^{-\int_t^u \lambda_s ds}\, \lambda_u\, (D_t X_u^i)\,\nabla_i(X_u, v^{(0)}, z^{(0)})\, \hat f_{tu}(X_u, v^{(0)}, z^{(0)})\,du \,\middle|\, \mathcal{F}_t \right]. \tag{177}
\]

Thus, following the same argument as for the previous proposition, we have the next
result:
Proposition 4 $Z_t^{(1)}$ in (170) is equivalently expressed as
\[
Z_t^{(1),a} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, (Y_{t,\tau}\,\gamma(X_\tau))^i_a\, \nabla_i(X_\tau, v^{(0)}, z^{(0)})\, \hat f_{t\tau}(X_\tau, v^{(0)}, z^{(0)}) \,\middle|\, \mathcal{F}_t \right], \tag{178}
\]
where the definitions of the random time $\tau$ and the positive deterministic process $\lambda$ are
the same as those in the previous proposition.
Now, we are able to obtain a new Monte Carlo scheme. That is, we have a new
particle interpretation of $(V^{(1)}, Z^{(1)})$ as follows:
\[
V_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, \hat f_{t\tau}\left(X_\tau, v^{(0)}, z^{(0)}\right) \,\middle|\, \mathcal{F}_t \right] \tag{179}
\]
\[
Z_t^{(1)} = 1_{\{\tau > t\}}\, E\left[ 1_{\{\tau < T\}}\, (Y_{t,\tau}\,\gamma(X_\tau))^i\, \nabla_i(X_\tau, v^{(0)}, z^{(0)})\, \hat f_{t\tau}(X_\tau, v^{(0)}, z^{(0)}) \,\middle|\, \mathcal{F}_t \right], \tag{180}
\]
which allows an efficient time integration with the following Monte Carlo scheme:
• Run the diffusion processes $X$ and $Y$.
• Carry out a Poisson draw with probability $\lambda_s \Delta s$ at each time $s$ and, if “one” is drawn,
set that time as $\tau$.
• Store the relevant quantities at $\tau$, or store 0 in the case of $(\tau > T)$.
• Repeat the above procedure and take the average of the stored values.
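The steps above can be sketched in a few lines of Python for the first-order value $V_t^{(1)}$ in (179). This is only an illustration under strong simplifying assumptions that are not in the text: a one-dimensional forward process with zero drift and constant volatility `sigma`, a constant intensity `lam`, terminal function $x \mapsto x$ and driver $f(x, v, z) = v$. For this toy choice $v^{(0)}(x) = x$ and $z^{(0)}(x) = \sigma$, and (165) gives $V_t^{(1)} = x_t (T - t)$, which serves as a sanity check. The function name `first_order_value` and all parameter names are hypothetical.

```python
import math
import random


def first_order_value(x0, t, T, lam, sigma, f, v0, z0,
                      n_paths=20_000, n_steps=50, seed=0):
    """Particle estimate of V_t^(1) with a constant interaction intensity lam.

    Toy setting (an assumption of this sketch): dX = sigma dW in one dimension,
    simulated with an Euler scheme on n_steps steps.
    """
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for k in range(n_steps):
            s = t + k * dt
            # Poisson draw with probability lam * dt at time s
            if rng.random() < lam * dt:
                # interaction time tau = s: store f_hat_{t,tau}(X_tau, v0, z0)
                total += math.exp(lam * (s - t)) / lam * f(x, v0(x), z0(x))
                break
            # no interaction: advance the diffusion by one Euler step
            x += sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        # a path with no interaction before T contributes 0
    return total / n_paths


# Toy example: f(x, v, z) = v, v0(x) = x, z0(x) = sigma
sigma = 0.2
estimate = first_order_value(
    x0=1.0, t=0.0, T=1.0, lam=1.0, sigma=sigma,
    f=lambda x, v, z: v,
    v0=lambda x: x,
    z0=lambda x: sigma,
)
print(estimate)  # should be close to x0 * (T - t) = 1, up to Monte Carlo and time-step error
```

The weight $e^{\lambda(\tau - t)}/\lambda$ inside $\hat f$ compensates for the density of the interaction time, which is why the estimate does not depend on `lam` in the limit of small time steps; the choice of `lam` only affects the variance.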
Finally, we remark that the higher order coefficients in the expansions are evaluated
in a similar way. Please see [27] for the details.

9 Conclusion

The present note has reviewed an asymptotic expansion approach in finance,
particularly in terms of computational problems arising in the practice of financial
derivatives. However, owing to space limitations, we have not provided thorough
explanations, especially of recent progress such as the improvement schemes in Sect. 5,
the expansion methods in jump and jump-diffusion models in Sect. 7, and the perturbation
schemes for forward backward stochastic differential equations (FBSDEs) in Sect. 8.
Please see the cited papers for the details.
Moreover, we have not introduced an application of the method to mean-variance
hedging problems in partially observable markets, which is an interesting topic as an
application of stochastic filtering in finance. Please see [29] for the details.

References

1. Alòs, E., Eydeland, A., Laurence, P.: A Kirk’s and a Bachelier’s formula for three asset spread
options. Energy Risk 09(2011), 52–57 (2011)
2. Bayer, C., Laurence, P.: Asymptotics beats Monte Carlo: the case of correlated local vol
baskets. Commun. Pure Appl. Math. (2013). Published online 9 October
3. Ben Arous, G., Laurence, P.: Second order expansion for implied volatility in two factor
local stochastic volatility models and applications to the dynamic λ-SABR model. In: Friz,
P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large Deviations and
Asymptotic Methods in Finance. Springer Proceedings in Mathematics and Statistics, vol.
110. Springer, Berlin (2009)
4. Benaim, S., Friz, P., Lee, R.: On Black-Scholes implied volatility at extreme strikes. In:
Cont, R. (ed.) Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling. Wiley,
Hoboken (2008)
5. Bichteler, K., Gravereaux, J.-B., Jacod, J.: Malliavin Calculus for Processes with Jumps.
Stochastic Monographs. Gordon and Breach Science Publishers, New York (1987)
6. Bielecki, T., Rutkowski, M.: Credit Risk: Modeling, Valuation and Hedging. Springer, Berlin
(2000)
7. Brace, A., Gatarek, D., Musiela, M.: The market model of interest rate dynamics. Math.
Financ. 7, 127–155 (1997)
8. Carr, P., Jarrow, R., Myneni, R.: Alternative characterizations of American put options. Math.
Financ. 2, 87–106 (1992)
9. Col, A.D., Gnoatto, A., Grasselli, M.: Smiles all around: FX joint calibration in a multi-Heston
model. J. Bank. Financ. 37(10), 3799–3818 (2013)
10. Cox, J.: Notes on option pricing I: constant elasticity of diffusions. Unpublished draft, Stanford
University (1975)
11. Davydov, D., Linetsky, V.: Pricing options on scalar diffusions: an eigenfunction expansion
approach. Oper. Res. 51, 185–209 (2003)
12. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffu-
sions and stochastic volatility I: theoretical foundations. Commun. Pure Appl. Math. 67–1,
321–350 (2014)
13. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for dif-
fusions and stochastic volatility II: applications. Commun. Pure Appl. Math. 67–2, 40–82
(2014)
14. Deutsch, F.: Best Approximation in Inner Product Spaces. Springer, New York (2001)
15. Doust, P.: No-arbitrage SABR. J. Comput. Financ. 15(3), 3–31 (2012)
16. El Karoui, N., Peng, S.G., Quenez, M.C.: Backward stochastic differential equations in
finance. Math. Financ. 7, 1–71 (1997)
17. Forde, M., Jacquier, A., Lee, R.: The small-time smile and term structure of implied volatility
under the Heston model. SIAM J. Financ. Math. 3, 690–708 (2012)
18. Forde, M., Jacquier, A.: Small-time asymptotics for implied volatility under the Heston model.
Int. J. Theor. Appl. Financ. 12(6), 861–876 (2009)
19. Foschi, P.P., Pagliarani, S., Pascucci, A.: Approximations for Asian options in local volatility
models. J. Comput. Appl. Math. 237, 442–459 (2013)
20. Fouque, J.-P., Papanicolaou, G., Sircar, K.R.: Financial modeling in a fast mean-reverting
stochastic volatility environment. Asia-Pac. Financ. Mark. 6(1), 37–48 (1999)
21. Fouque, J.-P., Papanicolaou, G., Sircar, K.R.: Derivatives in Financial Markets with Stochastic
Volatility. Cambridge University Press, Cambridge (2000)
22. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11(8), 1151–1164 (2011)
23. Fujita, H.: On the blowing up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$. J.
Fac. Sci. Univ. Tokyo 13, 109–124 (1966)
24. Fujii, M.: Momentum-space approach to asymptotic expansion for stochastic filtering. Ann.
Inst. Stat. Math. 66(1) (2012)

25. Fujii, M., Takahashi, A.: Analytical approximation for non-linear FBSDEs with perturbation
scheme. Int. J. Theor. Appl. Financ. 15(5) (2012)
26. Fujii, M., Takahashi, A.: Perturbative expansion of FBSDE in an incomplete market with
stochastic volatility. Q. J. Financ. 2(3) (2012)
27. Fujii, M., Takahashi, A.: Perturbative expansion technique for non-linear FBSDEs with inter-
acting particle method. Asia-Pacific Finan. Markets (2015)
28. Fujii, M., Sato, S., Takahashi, A.: An FBSDE approach to American option pricing with an
interacting particle method. CARF-F-302 (2012)
29. Fujii, M., Takahashi, A.: Making mean-variance hedging implementable in a partially observ-
able market. Quant. Financ. 14(10), 1709–1724 (2014)
30. Gatheral, J., Hsu, E.P., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatil-
ity in local volatility models. Math. Financ. 22(4), 591–620 (2012)
31. Gnoatto, A., Grasselli, M.: An affine multi-currency model with stochastic volatility and
stochastic interest rates. SIAM J. Financ. Math. 5(1), 493–531 (2014)
32. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1(1), 609–641 (2011)
33. Hagan, P.S., Kumar, D., Lesniewski, A.S., Woodward, D.E.: Managing smile risk. Wilmott
Mag. 15, 84–108 (2002)
34. Hayashi, M.: Asymptotic expansions for functionals of a Poisson random measure. J. Math.
Kyoto Univ. 48(1), 91–132 (2008)
35. Hayashi, M.: Coefficients of asymptotic expansions of SDE with jumps. Asia-Pac. Financ.
Mark. 17(4), 373–380 (2010)
36. Hayashi, M., Ishikawa, Y.: Composition with distributions of Wiener-Poisson variables and
its asymptotic expansion. Mathematische Nachrichten 285(5–6), 619–658 (2011)
37. Heath, D., Jarrow, R., Morton, A.: Bond pricing and the term structure of interest rates: a new
methodology for contingent claims valuation. Econometrica 60, 77–105 (1992)
38. Henry-Labordère, P.: Analysis, Geometry and Modeling in Finance: Advanced Methods in
Options Pricing. Chapman and Hall, Boca Raton (2008)
39. Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes, 2nd edn.
North-Holland/Kodansha, Tokyo (1989)
40. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes. Proc. Jpn. Acad. 41,
816–821 (1965)
41. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes. Proc. Jpn. Acad. 42,
252–257, 370–375, 380–384, 719–724, 1016–1021, 1022–1026 (1966)
42. Ikeda, N., Nagasawa, M., Watanabe, S.: Branching Markov processes I(II). J. Math. Kyoto
Univ. 8, 233–278, 365–410 (1968)
43. Jamshidian, F.: LIBOR and Swap market models and measures. Financ. Stoch. 1, 293–330
(1997)
44. Kawai, A.: A new approximate Swaption formula in the LIBOR market model: an asymptotic
expansion approach. Appl. Math. Financ. 10, 49–74 (2003)
45. Kobayashi, T., Takahashi, A., Tokioka, N.: Dynamic optimality of yield curve strategies. Int.
Rev. Financ. 4, 49–78 (2003) (published in 2005)
46. Kato, T., Takahashi, A., Yamada, T.: A semi-group expansion for pricing barrier options. Int.
J. Stoch. Anal. 2014(268086) (2014)
47. Kato, T., Takahashi, A., Yamada, T.: An asymptotic expansion formula for up-and-out barrier
option price under stochastic volatility model. JSIAM Lett. 5, 17–20 (2013)
48. Kunitomo, N., Takahashi, A.: Pricing average options. Jpn. Financ. Rev. 14, 1–20 (1992). (in
Japanese)
49. Kunitomo, N., Takahashi, A.: The asymptotic expansion approach to the valuation of interest
rate contingent claims. Math. Financ. 11, 117–151 (2001)
50. Kunitomo, N., Takahashi, A.: On validity of the asymptotic expansion approach in contingent
claim analysis. Ann. Appl. Probab. 13(3), 914–952 (2003)
51. Kunitomo, N., Takahashi, A.: Applications of the asymptotic expansion approach based on
Malliavin-Watanabe calculus in financial problems. Stochastic Processes and Applications to
Mathematical Finance, pp. 195–232 (2004)

52. Kusuoka, S.: Malliavin calculus revisited. J. Math. Sci. Univ. Tokyo 10, 261–277 (2003)
53. Kusuoka, S.: Approximation of expectation of diffusion process and mathematical finance.
Taniguchi Conference on Mathematics, Nara, 1998. Advanced Studies in Pure Mathematics,
vol. 31, pp. 147–165. Mathematical Society of Japan, Tokyo (2001)
54. Kusuoka, S.: Approximation of expectation of diffusion process based on Lie algebra and
Malliavin calculus. Adv. Math. Econ. 6, 69–83 (2004)
55. Kusuoka S., Stroock, D.: Applications of the Malliavin Calculus Part I. Stochastic Analysis
(Katata/Kyoto 1982), pp. 271–306 (1984)
56. Kusuoka, S., Stroock, D.: Precise asymptotics of certain Wiener functionals. J. Funct. Anal.
99, 1–74 (1991)
57. Kusuoka, S., Osajima, Y.: A remark on the asymptotic expansion of density function of Wiener
functionals. J. Funct. Anal. 255(9), 2545–2562 (2007)
58. Lee, R.: The moment formula for implied volatility at extreme strikes. Math. Financ. 14(3),
469–480 (2004)
59. Li, C.: Closed-form expansion, conditional expectation, and option valuation. Math. Oper.
Res. 39(2), 487–516 (2014)
60. Lipton, A.: Mathematical Methods for Foreign Exchange: A Financial Engineer’s Approach.
World Scientific Publication, Singapore (2001)
61. Linetsky, V.: Spectral expansions for Asian (average price) options. Oper. Res. 52, 856–867
(2004)
62. Ma, J., Yong, J.: Forward-Backward Stochastic Differential Equations and Their Applications.
Springer, Berlin (2000)
63. Ma, J., Zhang, J.: Representation theorem of backward stochastic differential equations. Ann.
Appl. Probab. 12(4), 1390–1418 (2002)
64. Malliavin, P.: Stochastic Analysis. Springer, Berlin (1997)
65. Malliavin, P., Thalmaier, A.: Stochastic Calculus of Variations in Mathematical Finance.
Springer, Berlin (2006)
66. Matsuoka, R., Takahashi, A., Uchida, Y.: A new computational scheme for computing greeks
by the asymptotic expansion approach. Asia-Pac. Financ. Mark. 11, 393–430 (2004)
67. Muroi, Y.: Pricing contingent claims with credit risk: asymptotic expansion approach. Financ.
Stoch. 9(3), 415–427 (2005)
68. Matsuoka, R., Takahashi, A.: An asymptotic expansion approach to computing Greeks. FSA
Res. Rev. 2005, 72–108 (2005)
69. McKean, H.P.: Application of Brownian motion to the equation of Kolmogorov-Petrovskii-
Piskunov. Commun. Pure Appl. Math. 28, 323–331 (1975)
70. Nagasawa, M., Sirao, T.: Probabilistic treatment of the blowing up of solutions for a nonlinear
integral equation. Trans. Am. Math. Soc. 139, 301–310 (1969)
71. Nishiba, M.: Pricing exotic options and American options: a multidimensional asymptotic
expansion approach. Asia-Pac. Financ. Mark. 20(2), 147–182 (2013)
72. Nualart, D.: The Malliavin Calculus and Related Topics. Springer, Berlin (1995)
73. Nualart, D., Üstünel, A.S., Zakai, M.: On the moments of a multiple Wiener-Itô integral and
the space induced by the polynomials of the integral. Stochastics 25, 233–340 (1988)
74. Ocone, D., Karatzas, I.: A generalized Clark representation formula, with application to
optimal portfolios. Stoch. Stoch. Rep. 34, 187–220 (1991)
75. Osajima, Y.: The asymptotic expansion formula of implied volatility for dynamic SABR model
and FX hybrid model. Preprint, Graduate School of Mathematical Sciences, The University
of Tokyo (2006)
76. Osajima, Y.: General asymptotics of Wiener functionals and application to mathematical
finance. In: Friz, P., Gatheral, J., Gulisashvili, A., Jacquier, A., Teichmann, J. (eds.) Large
Deviations and Asymptotic Methods in Finance. Springer Proceedings in Mathematics and
Statistics, vol. 110 (2015)
77. Pagliarani, S., Pascucci, A.: Local stochastic volatility with jumps. Int. J. Theor. Appl. Financ.
16(8), 1350050 (2013)

78. Shiraya, K., Takahashi, A.: Pricing average options on commodities. J. Futures Mark. 31(5),
407–439 (2011)
79. Shiraya, K., Takahashi, A.: Pricing multi-asset cross currency options. J. Futures Mark. 34(1),
1–19 (2014)
80. Shiraya, K., Takahashi, A.: Pricing basket options under local stochastic volatility with jumps.
CARF-F-336 (2013)
81. Shiraya, K., Takahashi, A., Toda, M.: Pricing barrier and average options under stochastic
volatility environment. J. Comput. Financ. 15(2), 111–148 (2011)
82. Shiraya, K., Takahashi, A., Yamazaki, A.: Pricing swaptions under the LIBOR market model
of interest rates with local-stochastic volatility models. Wilmott 2011(54), 61–73 (2011)
83. Shiraya, K., Takahashi, A., Yamada, T.: Pricing discrete barrier options under stochastic
volatility. Asia-Pac. Financ. Mark. 19(3), 205–232 (2012)
84. Siopacha, M., Teichmann, J.: Weak and strong Taylor methods for numerical solutions of
stochastic differential equations. Quant. Financ. 11(4), 517–528 (2011)
85. Takahashi, A.: Essays on the valuation problems of contingent claims. Unpublished Ph.D.
Dissertation, Haas School of Business, University of California, Berkeley (1995)
86. Takahashi, A.: An asymptotic expansion approach to pricing contingent claims. Asia-Pac.
Financ. Mark. 6, 115–151 (1999)
87. Takahashi, A.: On an asymptotic expansion approach to numerical problems in finance.
Selected Papers on Probability and Statistics, pp. 199–217. American Mathematical Soci-
ety (2009)
88. Takahashi, A., Matsushima, S.: Monte Carlo simulation with an asymptotic expansion in HJM
framework. FSA Research Review 2004, pp. 82–103. Financial Services Agency (2004)
89. Takahashi, A., Saito, T.: An asymptotic expansion approach to pricing American options.
Monet. Econ. Stud. 22, 35–87 (2003). (in Japanese)
90. Takahashi, A., Takehara, K.: An asymptotic expansion approach to currency options with a
market model of interest rates under stochastic volatility processes of spot exchange rates.
Asia-Pac. Financ. Mark. 14, 69–121 (2007)
91. Takahashi, A., Takehara, K.: Fourier transform method with an asymptotic expansion
approach: an applications to currency options. Int. J. Theor. Appl. Financ. 11(4), 381–401
(2008)
92. Takahashi, A., Takehara, K.: A hybrid asymptotic expansion scheme: an application to
currency options. Working paper, CARF-F-116, The University of Tokyo,
http://www.carf.e.u-tokyo.ac.jp/workingpaper/ (2008)
93. Takahashi, A., Takehara, K.: A hybrid asymptotic expansion scheme: an application to long-
term currency options. Int. J. Theor. Appl. Financ. 13(8), 1179–1221 (2010)
94. Takahashi, A., Takehara, K.: Asymptotic expansion approaches in finance: applications to
currency options. Finance and Banking Developments, pp. 185–232. Nova Science Publishers,
New York (2010)
95. Takahashi, A., Takehara, K., Toda, M.: Computation in an asymptotic expansion method.
CARF-F-149 (2009)
96. Takahashi, A., Takehara, K., Toda, M.: A general computation scheme for a high-order asymp-
totic expansion method. Int. J. Theor. Appl. Financ. 15(6) (2012)
97. Takahashi, A., Toda, M.: Note on an extension of an asymptotic expansion scheme. Int. J.
Theor. Appl. Financ. 16(5), 1350031-1–1350031-23 (2013)
98. Takahashi, A., Tsuzuki, Y.: A new improvement scheme for approximation methods of prob-
ability density functions. CARF-F-350. Forthcoming in J. Comput. Financ. (2013)
99. Takahashi, A., Uchida, Y.: New acceleration schemes with the asymptotic expansion in Monte
Carlo simulation. Adv. Math. Econ. 8, 411–431 (2006)
100. Takahashi, A., Yamada, T.: An asymptotic expansion with push-down of Malliavin weights.
SIAM J. Financ. Math. 3, 95–136 (2012)
101. Takahashi, A., Yamada, T.: A remark on approximation of the solutions to partial differential
equations in finance. Recent Adv. Financ. Eng. 2011, 133–181 (2011)

102. Takahashi, A., Yamada, T.: An asymptotic expansion for forward-backward SDEs: a Malliavin
calculus approach. CARF-F-296 (2012)
103. Takahashi, A., Yamada, T.: On error estimates for asymptotic expansions with Malliavin
weights—application to stochastic volatility model-. CARF-F-324. Forthcoming in Math.
Oper. Res. (2013)
104. Takahashi, A., Yamada, T.: An asymptotic expansion for forward-backward SDEs with a
perturbed driver. CARF-F-326 (2013)
105. Takahashi, A., Yamada, T.: A weak approximation with asymptotic expansion and multidi-
mensional Malliavin weights. CARF-F-335. Forthcoming in Ann. Appl. Probab. (2013)
106. Takahashi, A., Yoshida, N.: An asymptotic expansion scheme for optimal investment prob-
lems. Stat. Inference Stoch. Process. 7(2), 153–188 (2004)
107. Takahashi, A., Yoshida, N.: Monte Carlo simulation with asymptotic method. J. Jpn. Stat.
Soc. 35(2), 171–203 (2005)
108. Takehara, K., Takahashi, A., Toda, M.: New unified computation algorithm in a high-order
asymptotic expansion scheme. In: Recent Advances in Financial Engineering (The Proceed-
ings of KIER-TMU International Workshop on Financial Engineering 2009), pp. 231–251
(2010)
109. Takehara, K., Toda, M., Takahashi, A.: Application of a high-order asymptotic expansion
scheme to long-term currency options. Int. J. Bus. Financ. Res. 5(3), 87–100 (2011)
110. Violante, S.P.N.: Asymptotics of Wiener functionals and applications to mathematical finance.
Ph.D. Thesis, Department of Mathematics, Imperial College London (2012)
111. Watanabe, S.: Analysis of Wiener functionals (Malliavin calculus) and its applications to heat
kernels. Ann. Probab. 15, 1–39 (1987)
112. Xu, G., Zheng, H.: Basket options valuation for a local volatility jump-diffusion model with
the asymptotic expansion method. Insur. Math. Econ. 47(3), 415–422 (2010)
113. Xu, G., Zheng, H.: Lower bound approximation to basket option values for local volatility
jump-diffusion models. Int. J. Theor. Appl. Financ. 17, 1–15 (2014)
114. Yamamoto, K., Sato, S., Takahashi, A.: Probability distribution and option pricing for draw-
down in a stochastic volatility environment. Int. J. Theor. Appl. Financ. 13(2), 335–354 (2010)
115. Yamamoto, K., Takahashi, A.: A remark on a singular perturbation method for option pricing
under a stochastic volatility model. Asia-Pac. Financ. Mark. 16(4), 333–345 (2009)
116. Yamanobe, T.: Stochastic phase transition operator. Phys. Rev. E 84, 011924 (2011)
117. Yamanobe, T.: Global dynamics of a stochastic neuronal oscillator. Phys. Rev. E 88, 052709
(2013)
118. Yoshida, N.: Asymptotic expansion for small diffusions via the theory of Malliavin-Watanabe.
Probab. Theor. Relat. Fields 92, 275–311 (1992)
119. Yoshida, N.: Asymptotic expansions for statistics related to small diffusions. J. Jpn. Stat. Soc.
22, 139–159 (1992)
120. Yoshida, N.: Conditional expansions and their applications. Stoch. Process. Appl. 107, 53–81
(2003)
121. Zariphopoulou, T.: A solution approach to valuation with unhedgeable risks. Financ. Stoch.
5, 61–82 (2001)
On Small Time Asymptotics for Rough
Differential Equations Driven by Fractional
Brownian Motions

Fabrice Baudoin and Cheng Ouyang

In memory of Peter Laurence

Abstract We survey existing results concerning the study in small times of the
density of the solution of a rough differential equation driven by fractional Brown-
ian motions. We also slightly improve existing results and discuss some possible
applications to mathematical finance.

Keywords Small maturity limit · Mathematical foundations in non-Markovian
situations · Rough differential equations

1991 Mathematics Subject Classification 28D05 · 60D58

1 Introduction

In this paper, our main goal is to survey some existing results concerning the small-
time asymptotics of the density of rough differential equations driven by fractional
Brownian motions. Even though we do not claim any new results, we slightly improve
some of the existing ones and also point out some possible connections to finance.
We also hope that it will be useful for the reader to have, in one place, the most recent
results concerning the small-time asymptotics questions related to rough differential
equations driven by fractional Brownian motions. Our discussion will mainly be
based, on the one hand, on the papers [5–7] by the two present authors and, on the
other hand, on the papers [27, 28] by Inahama.

The first author of this research was supported in part by NSF Grant DMS 0907326.

F. Baudoin (B)
Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
e-mail: [email protected]
C. Ouyang
Department of Mathematics, Statistics and Computer Science, University of Illinois
at Chicago, Chicago, IL 60607, USA
e-mail: [email protected]

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_14

Random dynamical systems are a well established modeling tool for a variety of
natural phenomena ranging from physics (fundamental and phenomenological) to
chemistry and more recently to biology, economy, engineering sciences and mathe-
matical finance. In many interesting models the lack of any regularity of the external
inputs of the differential equation as functions of time is a technical difficulty that
hampers their mathematical analysis. The theory of rough paths has been initially
developed by T. Lyons [31] in the 1990s to provide a framework to analyze a large
class of driven differential equations and the precise relations between the driving
signal and the output (that is the state, as function of time, of the controlled system).
Rough paths theory provides a perfect framework to study differential equations
driven by Gaussian processes (see [19]). In particular, using rough paths theory,
we may define solutions of stochastic differential equations driven by a fractional
Brownian motion with a parameter $H > 1/4$ (see [15]). Let us then consider the
equation
\[
X_t^x = x + \int_0^t V_0(X_s^x)\,ds + \sum_{i=1}^d \int_0^t V_i(X_s^x)\,dB_s^i, \tag{1.1}
\]
where $x \in \mathbb{R}^n$, $V_0, V_1, \ldots, V_d$ are bounded smooth vector fields and $(B_t)_{t \ge 0}$ is a
$d$-dimensional fractional Brownian motion with Hurst parameter $H \in (\frac{1}{4}, 1)$. A
first basic question is the existence of a smooth density with respect to the Lebesgue
measure for the random variable X tx , t > 0. After multiple works, it is now understood
that the answer to this question is essentially the same as the one for stochastic
differential equations driven by Brownian motions: the random variable X tx admits
a smooth density with respect to the Lebesgue measure if Hörmander’s condition is
satisfied at $x$. More precisely, if $I = (i_1, \ldots, i_k) \in \{0, \ldots, d\}^k$, we denote by $V_I$ the
Lie commutator defined by
\[
V_I = [V_{i_1}, [V_{i_2}, \ldots, [V_{i_{k-1}}, V_{i_k}] \ldots ]],
\]
and
\[
d(I) = k + n(I),
\]
where $n(I)$ is the number of 0 in the word $I$. The basic and fundamental result
concerning the existence of a density for stochastic differential equations driven by
fractional Brownian motions is the following:

Theorem 1.1 ([4, 10, 12, 24]) Assume $H > \frac{1}{4}$ and assume that, at some $x \in \mathbb{R}^n$,
there exists $N$ such that
\[
\mathrm{span}\{V_I(x),\ d(I) \le N\} = \mathbb{R}^n. \tag{1.2}
\]
Then, for any $t > 0$, the law of the random variable $X_t^x$ has a smooth density $p_t(x, y)$
with respect to the Lebesgue measure on $\mathbb{R}^n$.
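As an illustration of condition (1.2), the following Python sketch checks it numerically at one point for a pair of vector fields on $\mathbb{R}^2$. The example fields $V_1 = (1, 0)$ and $V_2 = (0, \sin x_1)$ are an assumption chosen for illustration and do not come from the text; the helper names `jacobian`, `bracket` and `det2` are hypothetical, and the Jacobians are approximated by central finite differences rather than computed symbolically.

```python
import math


def jacobian(V, x, h=1e-5):
    """Central finite-difference Jacobian: J[i][k] = d V^i / d x^k at x."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        vp, vm = V(xp), V(xm)
        for i in range(n):
            J[i][k] = (vp[i] - vm[i]) / (2.0 * h)
    return J


def bracket(V, W):
    """Lie bracket [V, W] = (DW) V - (DV) W, returned as a new vector field."""
    def VW(x):
        JV, JW = jacobian(V, x), jacobian(W, x)
        v, w = V(x), W(x)
        n = len(x)
        return [sum(JW[i][k] * v[k] - JV[i][k] * w[k] for k in range(n))
                for i in range(n)]
    return VW


def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]


# Illustrative bounded smooth fields on R^2 (an assumption of this sketch)
def V1(x):
    return [1.0, 0.0]


def V2(x):
    return [0.0, math.sin(x[0])]


x0 = [0.0, 0.0]
span_level_1 = det2(V1(x0), V2(x0))               # V1, V2 alone: degenerate at x0
span_level_2 = det2(V1(x0), bracket(V1, V2)(x0))  # adding [V1, V2] restores the span
print(span_level_1, span_level_2)
```

At the origin $V_2$ vanishes, so $V_1$ and $V_2$ do not span $\mathbb{R}^2$, while $V_1$ together with $[V_1, V_2] = (0, \cos x_1)$ does. In the notation above the word $I = (1, 2)$ has $d(I) = 2$, so (1.2) holds at the origin with $N = 2$.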

Once the existence and smoothness of the density is established, it is natural to
study properties of this density. In particular, we are interested here in small-time
asymptotics, that is, the analysis of $p_t(x, y)$ when $t \to 0$. Based on the results in
the Brownian motion case [1, 2], and taking into account the scaling property of the
fractional Brownian motion, the following expansion (in particular when $n = d$) is
somehow expected when $x, y$ are close enough to each other:
\[
p_t(x, y) = \frac{1}{(t^H)^d}\, e^{-\frac{d^2(x,y)}{2t^{2H}}} \left( \sum_{i=0}^{N} c_i(x, y)\, t^{2iH} + r_{N+1}(t, x, y)\, t^{2(N+1)H} \right). \tag{1.3}
\]
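A simple case in which (1.3) can be verified by hand is the following sketch, under the illustrative assumptions $n = d$, $V_0 = 0$ and constant fields $V_i = e_i$ (the canonical basis vectors), so that $X_t^x = x + B_t$. These assumptions are chosen only for this example; the computation uses only the covariance of the fractional Brownian motion at equal times, $E[(B_t^j)^2] = t^{2H}$.

```latex
% X_t^x = x + B_t with B_t \sim N(0, t^{2H} I_d), hence
\begin{align*}
p_t(x, y)
 &= \frac{1}{(2\pi t^{2H})^{d/2}} \exp\!\left( -\frac{|y - x|^2}{2 t^{2H}} \right) \\
 &= \frac{1}{(t^H)^d}\, e^{-\frac{|x - y|^2}{2 t^{2H}}} \cdot (2\pi)^{-d/2}.
\end{align*}
% Comparison with (1.3): the distance is Euclidean, d(x, y) = |x - y|,
% c_0(x, y) = (2\pi)^{-d/2}, c_i(x, y) = 0 for i \ge 1, and r_{N+1} = 0.
```

In this degenerate example the expansion terminates after the first term; for non-constant vector fields the higher coefficients $c_i$ are in general nonzero.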

Our goal here is to discuss the various assumptions under which such an expansion
is known to be true and also to discuss possible variations. The approach to the
problem is similar to the case of Brownian motion; the main difficulty to overcome
is to study the Laplace method on the path space of the fractional Brownian motion
(see [3] for the Brownian case).
The paper is organized as follows. In Sect. 2 we give some basic results of the
theory of rough paths and of the Malliavin calculus tools that will be needed. In Sect. 3,
we prove a Varadhan-type small-time asymptotics for $\ln p_t(x, y)$. The discussion
is mainly based on [7]. In Sect. 4, we study sufficient conditions under which the
above expansion (1.3) is valid. Our discussion is based on [5, 27, 28]. Finally, in
Sect. 5, we discuss some models in mathematical finance where the asymptotics of
the density for rough differential equations may play an important role.

2 Preliminary Material

For some fixed $H > \frac{1}{4}$, we consider $(\Omega, \mathcal{F}, P)$ the canonical probability space
associated with the fractional Brownian motion (in short fBm) with Hurst parameter
$H$. That is, $\Omega = C_0([0, 1])$ is the Banach space of continuous functions vanishing
at zero equipped with the supremum norm, $\mathcal{F}$ is the Borel sigma-algebra and $P$ is
the unique probability measure on $\Omega$ such that the canonical process $B = \{B_t =
(B_t^1, \ldots, B_t^d),\ t \in [0, 1]\}$ is a fractional Brownian motion with Hurst parameter $H$.
In this context, let us recall that $B$ is a $d$-dimensional centered Gaussian process,
whose covariance structure is induced by
\[
R(t, s) := E\left[ B_s^j B_t^j \right] = \frac{1}{2}\left( s^{2H} + t^{2H} - |t - s|^{2H} \right), \quad s, t \in [0, 1] \text{ and } j = 1, \ldots, d. \tag{2.1}
\]
In particular it can be shown, by a standard application of Kolmogorov’s criterion,
that $B$ admits a continuous version whose paths are $\gamma$-Hölder continuous for any
$\gamma < H$.
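The covariance (2.1) determines the law of $B$ on any finite time grid, which gives a direct (if not the fastest) way to sample one coordinate of an fBm path. The sketch below, using only the Python standard library, builds the covariance matrix on a uniform grid and multiplies its Cholesky factor by independent standard Gaussians. The function names `fbm_cov`, `cholesky` and `sample_fbm`, the grid and the parameter values are illustrative assumptions, not part of the text.

```python
import math
import random


def fbm_cov(s, t, H):
    """Covariance R(t, s) of one coordinate of fBm, as in (2.1)."""
    return 0.5 * (s ** (2 * H) + t ** (2 * H) - abs(t - s) ** (2 * H))


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    return L


def sample_fbm(H, n=50, seed=0):
    """Sample one coordinate of fBm at times 1/n, 2/n, ..., 1."""
    rng = random.Random(seed)
    times = [(k + 1) / n for k in range(n)]
    cov = [[fbm_cov(s, t, H) for t in times] for s in times]
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    path = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return times, path


times, path = sample_fbm(H=0.3, n=50, seed=1)
print(path[-1])  # value of the sampled coordinate at time 1
```

For $H = 1/2$ the function `fbm_cov` reduces to $\min(s, t)$, the covariance of Brownian motion. The Cholesky approach costs $O(n^3)$ operations, so it is meant for small grids only.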

2.1 Rough Paths Theory

In this section, we recall some basic results in rough paths theory. More details can be
found in the monographs [20] and [32]. For $N \in \mathbb{N}$, recall that the truncated algebra
$T^N(\mathbb{R}^d)$ is defined by
\[
T^N(\mathbb{R}^d) = \bigoplus_{m=0}^{N} (\mathbb{R}^d)^{\otimes m},
\]
with the convention $(\mathbb{R}^d)^{\otimes 0} = \mathbb{R}$. The set $T^N(\mathbb{R}^d)$ is equipped with a straightforward
vector space structure plus a multiplication $\otimes$. Let $\pi_m$ be the projection on the
$m$th tensor level. Then $(T^N(\mathbb{R}^d), +, \otimes)$ is an associative algebra with unit element
$1 \in (\mathbb{R}^d)^{\otimes 0}$.
For $s < t$ and $m \ge 2$, consider the simplex $\Delta^m_{st} = \{(u_1, \ldots, u_m) \in [s, t]^m ;\ u_1
< \cdots < u_m\}$, while the simplices over $[0, 1]$ will be denoted by $\Delta^m$. A continuous
map $\mathbf{x} : \Delta^2 \to T^N(\mathbb{R}^d)$ is called a multiplicative functional if for $s < u < t$ one
has $\mathbf{x}_{s,t} = \mathbf{x}_{s,u} \otimes \mathbf{x}_{u,t}$. An important example arises from considering paths $x$ with
finite variation: for $0 < s < t$ we set
\[
\mathbf{x}^m_{s,t} = \sum_{1 \le i_1, \ldots, i_m \le d} \left( \int_{\Delta^m_{st}} dx^{i_1} \cdots dx^{i_m} \right) e_{i_1} \otimes \cdots \otimes e_{i_m}, \tag{2.2}
\]
where $\{e_1, \ldots, e_d\}$ denotes the canonical basis of $\mathbb{R}^d$, and then define the truncated
signature of $x$ as
\[
S_N(x) : \Delta^2 \to T^N(\mathbb{R}^d), \qquad (s, t) \mapsto S_N(x)_{s,t} := 1 + \sum_{m=1}^{N} \mathbf{x}^m_{s,t}.
\]
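For a piecewise-linear path the iterated integrals in (2.2) can be computed exactly, which makes the multiplicative property easy to test numerically. The following Python sketch computes the first two levels of the signature of a piecewise-linear path and checks the relation $\mathbf{x}_{s,t} = \mathbf{x}_{s,u} \otimes \mathbf{x}_{u,t}$ at level $N = 2$. The function names `signature2` and `chen_product` and the example path are illustrative assumptions.

```python
def signature2(points):
    """Levels 1 and 2 of the signature of the piecewise-linear path through points.

    A single linear segment with increment inc has level 1 equal to inc and
    level 2 equal to (inc tensor inc) / 2; segments are glued by the
    multiplicative property.
    """
    d = len(points[0])
    lvl1 = [0.0] * d
    lvl2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(points, points[1:]):
        inc = [qi - pi for pi, qi in zip(p, q)]
        for i in range(d):
            for j in range(d):
                lvl2[i][j] += lvl1[i] * inc[j] + 0.5 * inc[i] * inc[j]
        for i in range(d):
            lvl1[i] += inc[i]
    return lvl1, lvl2


def chen_product(sig_a, sig_b):
    """Product in T^2(R^d) of two elements of the form 1 + level 1 + level 2."""
    a1, a2 = sig_a
    b1, b2 = sig_b
    d = len(a1)
    c1 = [a1[i] + b1[i] for i in range(d)]
    c2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)]
          for i in range(d)]
    return c1, c2


# Example path in R^2: three sides of the unit square
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
full = signature2(pts)                                         # signature over the whole path
glued = chen_product(signature2(pts[:3]), signature2(pts[2:]))  # split at the third point
levy_area = 0.5 * (full[1][0][1] - full[1][1][0])
print(full, glued, levy_area)
```

The two results `full` and `glued` agree, as the multiplicative property requires. The symmetric part of the second level equals half the tensor square of the first level, which is the level-2 form of the statement that signatures of smooth paths take values in a strict subset of $T^N(\mathbb{R}^d)$.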

The function $S_N(x)$ for a smooth function $x$ will be our typical example of a multiplicative functional. Let us stress the fact that those elements take values in the strict subset $G^N(\mathbb{R}^d)\subset T^N(\mathbb{R}^d)$, called the free nilpotent group of step $N$, which is equipped with the classical Carnot-Caratheodory norm, simply denoted by $|\cdot|$. For a path $x\in C([0,1],G^N(\mathbb{R}^d))$, the $p$-variation norm of $x$ is defined to be
$$\|x\|_{p\text{-var};[0,1]} = \sup_{\Pi\subset[0,1]}\left(\sum_i \left|x_{t_i}^{-1}\otimes x_{t_{i+1}}\right|^p\right)^{1/p},$$
where the supremum is taken over all subdivisions $\Pi$ of $[0,1]$.
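
The multiplicative property $x_{s,t}=x_{s,u}\otimes x_{u,t}$ recalled above can be checked by hand for piecewise-linear paths at step $N=2$. The following Python sketch is an illustration under that restriction only. The helper names are hypothetical, and the level-2 term of a straight segment with increment $\Delta$ is taken to be $\frac12\Delta\otimes\Delta$, which is what (2.2) gives for a linear path.

```python
def segment_signature(dx):
    """Step-2 signature (level 1, level 2) of a straight segment with increment dx."""
    d = len(dx)
    level1 = [float(v) for v in dx]
    level2 = [[0.5 * dx[i] * dx[j] for j in range(d)] for i in range(d)]
    return level1, level2


def chen_product(a, b):
    """Truncated tensor product (1, a1, a2) ⊗ (1, b1, b2) in T^2(R^d)."""
    a1, a2 = a
    b1, b2 = b
    d = len(a1)
    level1 = [a1[i] + b1[i] for i in range(d)]
    # Level 2 collects a2, b2 and the cross term a1 ⊗ b1.
    level2 = [[a2[i][j] + b2[i][j] + a1[i] * b1[j] for j in range(d)] for i in range(d)]
    return level1, level2


def signature(points):
    """Step-2 signature of the piecewise-linear path through the given points."""
    d = len(points[0])
    sig = ([0.0] * d, [[0.0] * d for _ in range(d)])  # unit element of T^2(R^d)
    for p, q in zip(points, points[1:]):
        dx = [q[i] - p[i] for i in range(d)]
        sig = chen_product(sig, segment_signature(dx))
    return sig


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
    print(signature(pts))
    # Splitting the path at an intermediate point and multiplying gives the same element.
    print(chen_product(signature(pts[:2]), signature(pts[1:])))
```

The symmetric part of the level-2 term equals half the tensor square of the increment, which is the defining feature of elements of $G^2(\mathbb{R}^d)$; the antisymmetric part is the Lévy area.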


On Small Time Asymptotics for Rough Differential Equations … 417

With these notions in hand, let us briefly define what we mean by geometric rough path (we refer to [20, 32] for a complete overview): for $p\ge1$, an element $x:[0,1]\to G^{\lfloor p\rfloor}(\mathbb{R}^d)$ is said to be a geometric rough path if it is the $p$-var limit of a sequence $S_{\lfloor p\rfloor}(x^m)$. In particular, it is an element of the space
$$C^{p\text{-var};[0,1]}([0,1],G^{\lfloor p\rfloor}(\mathbb{R}^d)) = \{x\in C([0,1],G^{\lfloor p\rfloor}(\mathbb{R}^d)) : \|x\|_{p\text{-var};[0,1]}<\infty\}.$$

Let $\mathbf{x}$ be a geometric $p$-rough path with its approximating sequence $x^m$, that is, $x^m$ is a sequence of smooth functions such that $\mathbf{x}^m = S_{\lfloor p\rfloor}(x^m)$ converges to $\mathbf{x}$ in the $p$-var norm. Fix any $1\le q\le p$ so that $p^{-1}+q^{-1}>1$ and pick any $h\in C^{q\text{-var}}([0,1],\mathbb{R}^d)$. One can define the translation of $\mathbf{x}$ by $h$, denoted by $T_h(\mathbf{x})$, by
$$T_h(\mathbf{x}) = \lim_{m\to\infty} S_{\lfloor p\rfloor}(x^m+h).$$
It can be shown that $T_h(\mathbf{x})$ is an element of $C^{p\text{-var}}([0,1],G^{\lfloor p\rfloor}(\mathbb{R}^d))$. Moreover, one can show that $T_h(\mathbf{x})$ is uniformly continuous in $h$ and $\mathbf{x}$ on bounded sets.

Remark 2.1 A typical situation of the above translation of x by h in the present paper
is when x = B, the fractional Brownian motion lifted as a rough path, and h is a
Cameron-Martin element of B. In this case, we simply denote Th (B) = B + h.

According to the considerations above, in order to prove that a lift of a $d$-dimensional fBm as a geometric rough path exists, it is sufficient to build enough iterated integrals of $B$ by a limiting procedure. Towards this aim, a lot of the information concerning $B$ is encoded in the rectangular increments of the covariance function $R$ (defined by (2.1)), which are given by
$$R^{st}_{uv} \equiv E\big[(B^1_t-B^1_s)(B^1_v-B^1_u)\big].$$

We then call 2-dimensional $\rho$-variation of $R$ the quantity
$$V_\rho(R) \equiv \sup\left\{\left(\sum_{i,j}\left|R^{t_j t_{j+1}}_{s_i s_{i+1}}\right|^\rho\right)^{1/\rho} ;\ (s_i),(t_j)\in\Pi\right\},$$
where $\Pi$ stands again for the set of partitions of $[0,1]$. It is known (see, for example, [20]) that if a process has a covariance function with finite $\rho$-variation for $\rho\in[1,2)$, it admits a lift to a geometric $p$-rough path for all $p>2\rho$. As a consequence, we have the following for fractional Brownian motions:
Proposition 2.2 For a fractional Brownian motion with Hurst parameter H , we
have Vρ (R) < ∞ for all ρ ≥ 1/(2H ). Consequently, for H > 1/4 the process B
admits a lift B as a geometric rough path of order p for any p > 1/H .

2.2 Malliavin Calculus

We introduce the basic framework of Malliavin calculus in this subsection. The reader is invited to read the corresponding chapters in [33] for further details. Let $\mathcal{E}$ be the space of $\mathbb{R}^d$-valued step functions on $[0,1]$, and $\mathcal{H}$ the closure of $\mathcal{E}$ for the scalar product:
$$\big\langle (1_{[0,t_1]},\dots,1_{[0,t_d]}),\ (1_{[0,s_1]},\dots,1_{[0,s_d]})\big\rangle_{\mathcal{H}} = \sum_{i=1}^d R(t_i,s_i).$$

We denote by $K_H^*$ the isometry between $\mathcal{H}$ and $L^2([0,1])$. When $H>\frac12$ it can be shown that $L^{1/H}([0,1],\mathbb{R}^d)\subset\mathcal{H}$, and when $\frac14<H<\frac12$ one has
$$C^\gamma\subset\mathcal{H}\subset L^2([0,1])$$
for all $\gamma>\frac12-H$.


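We remark that $\mathcal{H}$ is the reproducing kernel Hilbert space for $B$. Let $\mathcal{H}_H$ be the Cameron-Martin space of $B$. One proves that the operator $\mathcal{R} := \mathcal{R}_H : \mathcal{H}\to\mathcal{H}_H$ given by
$$\mathcal{R}\psi := \int_0^\cdot K_H(\cdot,s)\,[K_H^*\psi](s)\,ds \tag{2.3}$$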

defines an isometry between $\mathcal{H}$ and $\mathcal{H}_H$. Let us now quote from [20, Chap. 15] a result relating the 2-d regularity of $R$ and the regularity of $\mathcal{H}_H$.
Proposition 2.3 Let $B$ be a fBm with Hurst parameter $\frac14<H<\frac12$. Then one has $\mathcal{H}_H\subset C^{\rho\text{-var}}$ for $\rho>(H+1/2)^{-1}$. Furthermore, the following quantitative bound holds:
$$\|h\|_{\mathcal{H}_H}\ge\frac{\|h\|_{\rho\text{-var}}}{(V_\rho(R))^{1/2}}.$$

Remark 2.4 The above proposition shows that for fBm we have $\mathcal{H}_H\subset C^{\rho\text{-var}}$ for $\rho>(H+1/2)^{-1}$. Hence an integral of the form $\int h\,dB$ can be interpreted in the Young sense by means of $p$-variation techniques.

Remark 2.5 Under the same conditions, the above embedding can be sharpened to
H H ⊂ C ρ−var for all ρ ≥ (H + 1/2)−1 . We refer interested readers to [17] for more
details.

An $\mathcal{F}$-measurable real valued random variable $F$ is then said to be cylindrical if it can be written, for a given $n\ge1$, as
$$F = f\big(B(\phi^1),\dots,B(\phi^n)\big) = f\left(\int_0^1\langle\phi^1_s,dB_s\rangle,\dots,\int_0^1\langle\phi^n_s,dB_s\rangle\right),$$

where φi ∈ H and f : Rn → R is a C ∞ bounded function with bounded derivatives.


The set of cylindrical random variables is denoted S.
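The Malliavin derivative is defined as follows: for $F\in\mathcal{S}$, the derivative of $F$ is the $\mathbb{R}^d$-valued stochastic process $(\mathbf{D}_tF)_{0\le t\le1}$ given by
$$\mathbf{D}_tF = \sum_{i=1}^n \phi^i(t)\,\frac{\partial f}{\partial x_i}\big(B(\phi^1),\dots,B(\phi^n)\big).$$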

More generally, we can introduce iterated derivatives. If $F\in\mathcal{S}$, we set
$$\mathbf{D}^k_{t_1,\dots,t_k}F = \mathbf{D}_{t_1}\cdots\mathbf{D}_{t_k}F.$$

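For any $p\ge1$, it can be checked that the operator $\mathbf{D}^k$ is closable from $\mathcal{S}$ into $L^p(\Omega;\mathcal{H}^{\otimes k})$. We denote by $\mathbb{D}^{k,p}$ the closure of the class of cylindrical random variables with respect to the norm
$$\|F\|_{k,p} = \left(E\big[|F|^p\big]+\sum_{j=1}^k E\Big[\big\|\mathbf{D}^jF\big\|^p_{\mathcal{H}^{\otimes j}}\Big]\right)^{\frac1p},$$
and
$$\mathbb{D}^\infty = \bigcap_{p\ge1}\bigcap_{k\ge1}\mathbb{D}^{k,p}.$$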

Definition 2.6 Let $F=(F^1,\dots,F^n)$ be a random vector whose components are in $\mathbb{D}^\infty$. Define the Malliavin matrix of $F$ by
$$\gamma_F = \big(\langle \mathbf{D}F^i,\mathbf{D}F^j\rangle_{\mathcal{H}}\big)_{1\le i,j\le n}.$$
Then $F$ is called non-degenerate if $\gamma_F$ is invertible a.s. and
$$(\det\gamma_F)^{-1}\in\bigcap_{p\ge1}L^p(\Omega).$$

It is a classical result that the law of a non-degenerate random vector $F=(F^1,\dots,F^n)$ admits a smooth density with respect to the Lebesgue measure on $\mathbb{R}^n$. Furthermore, the following integration by parts formula allows one to obtain more quantitative estimates:

Proposition 2.7 Let $F=(F^1,\dots,F^n)$ be a non-degenerate random vector whose components are in $\mathbb{D}^\infty$, and $\gamma_F$ the Malliavin matrix of $F$. Let $G\in\mathbb{D}^\infty$ and $\varphi$ be a function in the space $C_p^\infty(\mathbb{R}^n)$. Then for any multi-index $\alpha\in\{1,2,\dots,n\}^k$, $k\ge1$, there exists an element $H_\alpha = H_\alpha(F,G)\in\mathbb{D}^\infty$ depending on $F$ and $G$ such that
$$E[\partial_\alpha\varphi(F)\,G] = E[\varphi(F)\,H_\alpha].$$



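Moreover, the elements $H_\alpha$ are recursively given by
$$H_{(i)} = \sum_{j=1}^{n}\delta\big(G\,(\gamma_F^{-1})_{ij}\,\mathbf{D}F^j\big),\qquad H_\alpha = H_{(\alpha_k)}\big(H_{(\alpha_1,\dots,\alpha_{k-1})}\big),$$
and for $1\le p<q<\infty$ we have
$$\|H_\alpha\|_{L^p}\le C_{p,q}\,\big\|\gamma_F^{-1}\mathbf{D}F\big\|^k_{k,2^{k-1}r}\,\|G\|_{k,q},$$
where $\frac1p=\frac1q+\frac1r$.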
Remark 2.8 By the estimates for $H_\alpha$ above, one can conclude that there exist constants $\beta,\gamma>1$ and integers $m,r$ such that
$$\|H_\alpha\|_{L^p}\le C_{p,q}\,\big\|\det\gamma_F^{-1}\big\|^m_{L^\beta}\,\|\mathbf{D}F\|^r_{k,\gamma}\,\|G\|_{k,q}.$$

2.3 Differential Equations Driven by Fractional Brownian Motions

Let $B$ be a $d$-dimensional fractional Brownian motion with Hurst parameter $H>\frac14$. Fix a small parameter $\varepsilon\in(0,1]$, and consider the solution $X_t^\varepsilon$ to the stochastic differential equation
$$X_t^\varepsilon = x+\varepsilon\sum_{i=1}^d\int_0^tV_i(X_s^\varepsilon)\,dB_s^i+\int_0^tV_0(\varepsilon,X_s^\varepsilon)\,ds, \tag{2.4}$$
where the vector fields $V_1,\dots,V_d$ are $C^\infty$-bounded vector fields on $\mathbb{R}^n$ and $V_0(\varepsilon,\cdot)$ is $C^\infty$-bounded uniformly in $\varepsilon\in[0,1]$.
Proposition 2.2 ensures the existence of a lift of B as a geometrical rough path.
The general rough paths theory (see e.g. [20, 22]) together with some integrability
results (see e.g. [12, 18]) allow us to state the following proposition:
Proposition 2.9 Consider Eq. (2.4) driven by a $d$-dimensional fBm $B$ with Hurst parameter $H>\frac14$, and assume that the vector fields $V_i$ are $C^\infty$-bounded. Then
(i) For each $\varepsilon\in(0,1]$, Eq. (2.4) admits a unique continuous solution $X^\varepsilon$ of finite $p$-variation in the rough paths sense, for any $p>\frac1H$.
(ii) There exists $\lambda>0$ such that
$$E\exp\Big(\lambda\sup_{t\in[0,1],\,\varepsilon\in(0,1]}|X_t^\varepsilon|^{(2H+1)\wedge2}\Big)<\infty. \tag{2.5}$$

Once Eq. (2.4) is solved, the vector $X_t^\varepsilon$ is a typical example of a random variable which can be differentiated in the Malliavin sense. We shall express this Malliavin derivative in terms of the Jacobian $\mathbf{J}^\varepsilon$ of the equation, which is defined by the relation
$$\mathbf{J}_t^{\varepsilon,ij} = \partial_{x_j}X_t^{\varepsilon,i}.$$

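Setting $DV_j$ for the Jacobian of $V_j$ seen as a function from $\mathbb{R}^n$ to $\mathbb{R}^n$, let us recall that $\mathbf{J}^\varepsilon$ is the unique solution to the linear equation
$$\mathbf{J}_t^\varepsilon = \mathrm{Id}_n+\varepsilon\sum_{j=1}^d\int_0^tDV_j(X_s^\varepsilon)\,\mathbf{J}_s^\varepsilon\,dB_s^j, \tag{2.6}$$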

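and that the following results hold true (see [10, 11, 34] for further details):
Proposition 2.10 Let $X^\varepsilon$ be the solution to Eq. (2.4) and suppose the $V_i$'s are $C^\infty$-bounded. Then for every $i=1,\dots,n$, $t>0$, and $x\in\mathbb{R}^n$, we have $X_t^{\varepsilon,i}\in\mathbb{D}^\infty$ and
$$\mathbf{D}^j_sX_t^\varepsilon = \mathbf{J}^\varepsilon_{st}V_j(X_s^\varepsilon),\qquad j=1,\dots,d,\quad 0\le s\le t,$$
where $\mathbf{D}^j_sX_t^{\varepsilon,i}$ is the $j$th component of $\mathbf{D}_sX_t^{\varepsilon,i}$, $\mathbf{J}_t^\varepsilon=\partial_xX_t^\varepsilon$ and $\mathbf{J}^\varepsilon_{st}=\mathbf{J}^\varepsilon_t(\mathbf{J}^\varepsilon_s)^{-1}$.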

Let us now quote the recent result [12], which gives a useful estimate for moments of the Jacobian of rough differential equations driven by Gaussian processes.
Proposition 2.11 Consider a fractional Brownian motion $B$ with Hurst parameter $H>\frac14$ and $p>\frac1H$. Then for any $\eta\ge1$, there exists a finite constant $c_\eta$ such that the Jacobian $\mathbf{J}^\varepsilon$ defined in Proposition 2.10 satisfies:
$$E\Big[\sup_{\varepsilon\in[0,1]}\|\mathbf{J}^\varepsilon\|^\eta_{p\text{-var};[0,1]}\Big] = c_\eta. \tag{2.7}$$

Proof The integrability of $\mathbf{J}^\varepsilon$ is only proved in [12] when $\varepsilon=1$. On the other hand, the estimates of $\mathbf{J}$ in [12] depend only on the supremum norm of the vector fields and their derivatives. In our case, the vector fields in Eq. (2.4) are the $\varepsilon V_i$, which together with their derivatives are bounded uniformly in $\varepsilon\in(0,1)$. Hence the uniform integrability of $\mathbf{J}^\varepsilon$ (in $\varepsilon$) follows. □

Finally, we close the discussion of this section by the following large deviation principle that will be needed later. Let $\Phi:\mathcal{H}_H\to C([0,1],\mathbb{R}^n)$ be given by solving the ordinary differential equation
$$\Phi_t(h) = x+\sum_{i=1}^d\int_0^tV_i(\Phi_s(h))\,dh^i_s+\int_0^tV_0(0,\Phi_s(h))\,ds. \tag{2.8}$$

Theorem 2.12 Let $\Phi$ be given in (2.8), which is a differentiable mapping from $\mathcal{H}_H$ to $C([0,1],\mathbb{R}^n)$. Introduce the following function on $\mathbb{R}^n$
$$I(y) = \inf_{\Phi_1(h)=y}\frac12\|h\|^2_{\mathcal{H}_H}.$$
Recall that $X_1^\varepsilon$ is the solution to Eq. (2.4). Then $X_1^\varepsilon$ satisfies a large deviation principle with rate function $I(y)$.

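Proof Fix any $p>\frac1H$. It is known (see [20]) that $\varepsilon\mathbf{B}$ as a $G^{\lfloor p\rfloor}(\mathbb{R}^d)$-valued rough path satisfies a large deviation principle in $p$-variation topology with good rate function given by
$$J(h)=\begin{cases}\frac12\|h\|^2_{\mathcal{H}_H} & \text{if } h\in\mathcal{H}_H,\\ +\infty & \text{otherwise.}\end{cases}$$
It is clear that $\Phi_1(\cdot):G^{\lfloor p\rfloor}(\mathbb{R}^d)\to\mathbb{R}^n$ is continuous. Since $X_1^\varepsilon=\Phi_1(\varepsilon\mathbf{B})$, the claimed result follows from the contraction principle. □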

3 Varadhan Asymptotics

In this section, we are interested in a family of stochastic differential equations driven by fractional Brownian motions $B$ (with Hurst parameter $H>\frac14$) of the following form
$$X_t^\varepsilon = x+\varepsilon\sum_{i=1}^d\int_0^tV_i(X_s^\varepsilon)\,dB_s^i.$$

We define a map $\Phi:\mathcal{H}_H\to C[0,1]$ by solving the ordinary differential equation
$$\Phi_t(h) = x+\sum_{i=1}^d\int_0^tV_i(\Phi_s(h))\,dh^i_s.$$

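Clearly, we have $X_t^\varepsilon=\Phi_t(\varepsilon B)$. Denote by $\gamma_{\Phi_1}(h)$ the deterministic Malliavin matrix of $\Phi_1(h)$, i.e.,
$$\gamma^{ij}_{\Phi_1}(h) = \langle \mathbf{D}\Phi^i_1(h),\mathbf{D}\Phi^j_1(h)\rangle_{\mathcal{H}}.$$

Introduce the following functions on $\mathbb{R}^n$, which depend on $\Phi$:
$$d^2(y) = I(y) = \inf_{\Phi_1(h)=y}\frac12\|h\|^2_{\mathcal{H}_H},\qquad\text{and}\qquad d_R^2(y) = \inf_{\Phi_1(h)=y,\ \det\gamma_{\Phi_1}(h)>0}\frac12\|h\|^2_{\mathcal{H}_H}.$$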

In the absence of the drift term (V0 = 0) in our setting in this section, one can show
that the above two distances coincide.
Lemma 3.1 For every y ∈ Rn , we have d(y) = d R (y).

Proof We follow an argument of Léandre (see [30]). By using Theorem I.2 in [30] and the isometry between the Cameron-Martin space of the fractional Brownian motion and the Cameron-Martin space of the Brownian motion, we see that for every $\varepsilon>0$, there exists $h\in\mathcal{H}_H$ such that $\|h\|_{\mathcal{H}_H}\le\varepsilon$ and $\det\gamma_{\Phi_1}(h)>0$. Then arguing as in the Remark after Proposition II.1 in [30], we can for every $\eta>0$ and $y\in\mathbb{R}^n$ construct $h\in\mathcal{H}_H$ such that $\Phi_1(h)=y$, $\det\gamma_{\Phi_1}(h)>0$ and
$$\frac12\|h\|^2_{\mathcal{H}_H}\le d^2(y)+\eta.$$
□
Throughout the section, we assume that Hypothesis 3.2 below is satisfied. Let us first introduce some notation. Let $\mathcal{A}=\{\emptyset\}\cup\bigcup_{k=1}^\infty\{1,2,\dots,n\}^k$ and $\mathcal{A}_1=\mathcal{A}\setminus\{\emptyset\}$. We say that $I\in\mathcal{A}$ is a word of length $k$ if $I=(i_1,\dots,i_k)$ and we write $|I|=k$. If $I=\emptyset$, then we denote $|I|=0$. For any integer $l\ge1$, we denote by $\mathcal{A}(l)$ the set $\{I\in\mathcal{A};\ |I|\le l\}$ and by $\mathcal{A}_1(l)$ the set $\{I\in\mathcal{A}_1;\ |I|\le l\}$. We also define an operation $*$ on $\mathcal{A}$ by $I*J=(i_1,\dots,i_k,j_1,\dots,j_l)$ for $I=(i_1,\dots,i_k)$ and $J=(j_1,\dots,j_l)$ in $\mathcal{A}$. We define vector fields $V_{[I]}$ inductively by
$$V_{[j]}=V_j,\qquad V_{[I*j]}=[V_{[I]},V_j],\qquad j=1,\dots,d.$$

Hypothesis 3.2 (Uniform hypoelliptic condition) The vector fields $V_1,\dots,V_d$ are in $C_b^\infty(\mathbb{R}^n)$ and they form a uniform hypoelliptic system in the sense that there exist an integer $l$ and a constant $\lambda>0$ such that
$$\sum_{I\in\mathcal{A}_1(l)}\langle V_{[I]}(x),u\rangle^2_{\mathbb{R}^n}\ge\lambda\|u\|^2 \tag{3.1}$$
holds for any $x,u\in\mathbb{R}^n$.

Under this assumption the main result proved in [7] is the following Varadhan’s
type estimate:

Theorem 3.3 Let us denote by $p_\varepsilon(y)$ the density of $X_1^\varepsilon$. Then
$$\lim_{\varepsilon\downarrow0}\varepsilon^2\log p_\varepsilon(y) = -d^2(y). \tag{3.2}$$
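
A case where this estimate can be verified directly is $d=n=1$ with $V_1\equiv1$, which satisfies Hypothesis 3.2 with $l=1$. Then $X_1^\varepsilon=x+\varepsilon B_1$ with $B_1\sim N(0,1)$ for every $H$ (since $R(1,1)=1$), the density is Gaussian, and $d^2(y)=(y-x)^2/2$. The Python sketch below is an illustration of this special case only; the function names are hypothetical.

```python
import math


def log_density(y: float, x: float, eps: float) -> float:
    """Log of the N(x, eps^2) density at y, i.e. the law of X_1^eps = x + eps * B_1."""
    return -0.5 * math.log(2.0 * math.pi * eps ** 2) - (y - x) ** 2 / (2.0 * eps ** 2)


def scaled_log_density(y: float, x: float, eps: float) -> float:
    """The quantity eps^2 * log p_eps(y) appearing in the Varadhan estimate."""
    return eps ** 2 * log_density(y, x, eps)


if __name__ == "__main__":
    x, y = 0.0, 1.5
    for eps in (0.5, 0.1, 0.02):
        # The second column approaches -(y - x)^2 / 2 as eps decreases.
        print(eps, scaled_log_density(y, x, eps), -0.5 * (y - x) ** 2)
```

The correction term is $-\frac{\varepsilon^2}{2}\log(2\pi\varepsilon^2)$, which vanishes as $\varepsilon\downarrow0$.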

The two key ingredients in proving Theorem 3.3 are an estimate for the Malliavin derivative $\mathbf{D}X_1^\varepsilon$ and an estimate of the Malliavin matrix $\gamma_{X_1^\varepsilon}$ of $X_1^\varepsilon$. Building on previous results from [8], the following estimates were obtained in [7]:

Lemma 3.4 Assume Hypothesis 3.2. For $H>\frac14$, we have
(1) $\sup_{\varepsilon\in(0,1]}\|X_1^\varepsilon\|_{k,r}<\infty$ for each $k\ge1$ and $r\ge1$.
(2) $\|\gamma^{-1}_{X_1^\varepsilon}\|_r\le c_r\,\varepsilon^{-2l}$ for any $r\ge1$.

Proof of Theorem 3.3 We first show that
$$\liminf_{\varepsilon\downarrow0}\varepsilon^2\log p_\varepsilon(y)\ge-d_R^2(y). \tag{3.3}$$

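Fix $y\in\mathbb{R}^n$. We only need to consider the case $d_R^2(y)<\infty$, since if $d_R^2(y)=\infty$ the statement is trivial. Fix any $\eta>0$ and let $h\in\mathcal{H}_H$ be such that $\Phi_1(h)=y$, $\det\gamma_{\Phi_1}(h)>0$, and $\|h\|^2_{\mathcal{H}_H}\le d_R^2(y)+\eta$. Let $f\in C_0^\infty(\mathbb{R}^n)$. By the Cameron-Martin theorem for fractional Brownian motions, we have
$$E f(X_1^\varepsilon) = e^{-\frac{\|h\|^2_{\mathcal{H}_H}}{2\varepsilon^2}}\,E\Big[f(\Phi_1(\varepsilon B+h))\,e^{-\frac{B(h)}{\varepsilon}}\Big].$$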

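Consider then a function $\chi\in C^\infty(\mathbb{R})$, $0\le\chi\le1$, such that $\chi(t)=0$ if $t\notin[-2\eta,2\eta]$, and $\chi(t)=1$ if $t\in[-\eta,\eta]$. Then, if $f\ge0$, we have
$$E f(X_1^\varepsilon)\ge e^{-\frac{\|h\|^2_{\mathcal{H}_H}+4\eta}{2\varepsilon^2}}\,E\big[\chi(\varepsilon B(h))\,f(\Phi_1(\varepsilon B+h))\big].$$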

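Hence, we obtain
$$\varepsilon^2\log p_\varepsilon(y)\ge-\Big(\frac12\|h\|^2_{\mathcal{H}_H}+2\eta\Big)+\varepsilon^2\log E\big[\chi(\varepsilon B(h))\,\delta_y(\Phi_1(\varepsilon B+h))\big]. \tag{3.4}$$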

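On the other hand, we have
$$E\big[\chi(\varepsilon B(h))\,\delta_y(\Phi_1(\varepsilon B+h))\big] = \varepsilon^{-n}E\Big[\chi(\varepsilon B(h))\,\delta_0\Big(\frac{\Phi_1(\varepsilon B+h)-\Phi_1(h)}{\varepsilon}\Big)\Big].$$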

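Note that
$$Z_1(h)=\lim_{\varepsilon\downarrow0}\frac{\Phi_1(\varepsilon B+h)-\Phi_1(h)}{\varepsilon}$$
is an $n$-dimensional random vector in the first Wiener chaos with variance $\gamma_{\Phi_1}(h)>0$. Hence $Z_1(h)$ is non-degenerate and one can then prove that
$$\lim_{\varepsilon\downarrow0}E\Big[\chi(\varepsilon B(h))\,\delta_0\Big(\frac{\Phi_1(\varepsilon B+h)-\Phi_1(h)}{\varepsilon}\Big)\Big] = E\,\delta_0(Z_1(h)).$$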

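Therefore,
$$\lim_{\varepsilon\downarrow0}\varepsilon^2\log E\big[\chi(\varepsilon B(h))\,\delta_y(\Phi_1(\varepsilon B+h))\big]=0.$$

Letting $\varepsilon\downarrow0$ in (3.4) we obtain
$$\liminf_{\varepsilon\downarrow0}\varepsilon^2\log p_\varepsilon(y)\ge-\Big(\frac12\|h\|^2_{\mathcal{H}_H}+2\eta\Big)\ge-\big(d_R^2(y)+3\eta\big).$$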

Since η > 0 is arbitrary, this completes the proof of (3.3).


Next, we show that
$$\limsup_{\varepsilon\downarrow0}\varepsilon^2\log p_\varepsilon(y)\le-d^2(y). \tag{3.5}$$

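Fix a point $y\in\mathbb{R}^n$ and consider a function $\chi\in C_0^\infty(\mathbb{R}^n)$, $0\le\chi\le1$, such that $\chi$ is equal to one in a neighborhood of $y$. The density of $X_1^\varepsilon$ at the point $y$ is given by
$$p_\varepsilon(y)=E\big(\chi(X_1^\varepsilon)\,\delta_y(X_1^\varepsilon)\big).$$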

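By the integration by parts formula of Proposition 2.7, we can write
$$\begin{aligned}E\,\chi(X_1^\varepsilon)\delta_y(X_1^\varepsilon)&=E\big[1_{\{X_1^\varepsilon>y\}}H_{(1,2,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))\big]\\&\le E\big|H_{(1,2,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))\big|\\&=E\big[|H_{(1,2,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))|\,1_{\{X_1^\varepsilon\in\operatorname{supp}\chi\}}\big]\\&\le P(X_1^\varepsilon\in\operatorname{supp}\chi)^{\frac1q}\,\|H_{(1,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))\|_p,\end{aligned}$$
where $\frac1p+\frac1q=1$. By Remark 2.8 we know that
$$\|H_{(1,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))\|_p\le C_{p,q}\,\big\|\gamma^{-1}_{X_1^\varepsilon}\big\|^m_\beta\,\|\mathbf{D}X_1^\varepsilon\|^r_{k,\gamma}\,\|\chi(X_1^\varepsilon)\|_{k,q},$$
for some constants $\beta,\gamma>0$ and integers $k,m,r$. Thus, by Lemma 3.4 we have
$$\lim_{\varepsilon\downarrow0}\varepsilon^2\log\|H_{(1,\dots,n)}(X_1^\varepsilon,\chi(X_1^\varepsilon))\|_p=0.$$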

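Finally, by Theorem 2.12, a large deviation principle for $X_1^\varepsilon$ ensures that for small $\varepsilon$ we have
$$P(X_1^\varepsilon\in\operatorname{supp}\chi)^{\frac1q}\le e^{-\frac{1}{q\varepsilon^2}\left(\inf_{y\in\operatorname{supp}\chi}d^2(y)+o(1)\right)}.$$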

This gives us (3.5).


Combining Lemma 3.1, (3.3) and (3.5), the proof of Theorem 3.3 is thus completed. □

4 Small-Time Kernel Expansion

4.1 Laplace Approximation

Fix $H>\frac14$ and consider Eq. (2.4). For the convenience of our discussion, in what follows, we write the above equation in the following form
$$X_t^\varepsilon=x+\varepsilon\int_0^t\sigma(X_s^\varepsilon)\,dB_s+\int_0^tb(\varepsilon,X_s^\varepsilon)\,ds,$$
where $\sigma$ is a smooth $d\times d$ matrix and $b$ a smooth function from $\mathbb{R}_+\times\mathbb{R}^d$ to $\mathbb{R}^d$. We also assume that $\sigma$ and $b$ have bounded derivatives to any order.
Fix $p>\frac1H$. Let $F$ and $f$ be two bounded infinitely Fréchet differentiable functionals on $C^{p\text{-var};[0,1]}([0,1],\mathbb{R}^d)$ with bounded derivatives (as linear operators) to any order. We are interested in studying the asymptotic behavior of
$$J(\varepsilon)=E\big[f(X^\varepsilon)\exp\{-F(X^\varepsilon)/\varepsilon^2\}\big],\quad\text{as }\varepsilon\downarrow0.$$

Recall that for each $k\in\mathcal{H}_H$, $\Phi(k)$ is the deterministic Itô map defined in (2.8). Set
$$\Lambda(\phi)=\inf\Big\{\frac12\|k\|^2_{\mathcal{H}_H},\ \phi=\Phi(k),\ k\in\mathcal{H}_H\Big\}.$$

Throughout our discussion we make the following assumptions:


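Assumption 4.1 • H1: $F+\Lambda$ attains its minimum at a finite number of paths $\phi_1,\phi_2,\dots,\phi_n$ on $P(\mathbb{R}^d)$.
• H2: For each $i\in\{1,2,\dots,n\}$, we have $\phi_i=\Phi(\gamma_i)$ and $\gamma_i$ is a non-degenerate minimum of the functional $F\circ\Phi+\frac12\|\cdot\|^2_{\mathcal{H}_H}$, i.e.:
$$\forall k\in\mathcal{H}_H\setminus\{0\},\qquad d^2\Big(F\circ\Phi+\frac12\|\cdot\|^2_{\mathcal{H}_H}\Big)(\gamma_i)\,k^2>0.$$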

The following theorem is the main result of this section.

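Theorem 4.2 Under the assumptions H1 and H2 above, we have
$$J(\varepsilon)=e^{-\frac{a}{\varepsilon^2}}e^{-\frac{c}{\varepsilon}}\big(\alpha_0+\alpha_1\varepsilon+\cdots+\alpha_N\varepsilon^N+O(\varepsilon^{N+1})\big).$$
Here
$$a=\inf\{F+\Lambda(\phi),\ \phi\in P(\mathbb{R}^d)\}=\inf\Big\{F\circ\Phi(k)+\frac12\|k\|^2_{\mathcal{H}_H},\ k\in\mathcal{H}_H\Big\}$$
and
$$c=\inf\{dF(\phi_i)Y_i,\ i\in\{1,2,\dots,n\}\},$$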

where Yi is the solution of

dYi (s) = ∂x σ(φi (s))Yi (s)dγi (s) + ∂ε b(0, φi (s))ds + ∂x b(0, φi (s))Yi (s)ds

with Yi (0) = 0.
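
To see the shape of this expansion in a case where everything is explicit, one can replace the path-valued $X^\varepsilon$ by the scalar $\varepsilon Z$ with $Z\sim N(0,1)$, take $f\equiv1$ and $F(x)=(x-1)^2/2$. This is only a one-dimensional toy analogue and not the setting of the theorem (in particular $F$ is not bounded): the Cameron-Martin norm becomes $|k|$, the Itô map is the identity, $a=\inf_k\{F(k)+k^2/2\}=1/4$ is attained at $k=1/2$, $c=0$, and completing the square gives $J(\varepsilon)=e^{-a/\varepsilon^2}/\sqrt2$ exactly. The Python sketch below checks this numerically; the function name and quadrature parameters are illustrative assumptions.

```python
import math


def laplace_toy(eps: float, n: int = 20001, lo: float = -4.0, hi: float = 5.0) -> float:
    """Trapezoid approximation of E[exp(-F(eps*Z)/eps^2)], F(x) = (x-1)^2/2, Z ~ N(0,1)."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        # Integrand in the variable x = eps*z: exp(-F(x)/eps^2) times the
        # unnormalised Gaussian density exp(-x^2 / (2 eps^2)) of eps*Z.
        total += weight * math.exp(-((x - 1.0) ** 2 + x ** 2) / (2.0 * eps ** 2))
    return total * h / (eps * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    a = 0.25
    for eps in (0.5, 0.3, 0.2):
        # Rescaling by exp(a / eps^2) leaves the constant alpha_0 = 1/sqrt(2).
        print(eps, laplace_toy(eps) * math.exp(a / eps ** 2), 1.0 / math.sqrt(2.0))
```

In this toy case all higher coefficients $\alpha_1,\alpha_2,\dots$ vanish because the exponent is exactly quadratic.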

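In what follows, we sketch the proof of the above Laplace approximation in the case $H>\frac12$. Remarks on the rough case $\frac14<H<\frac12$ will be provided afterwards. Without loss of generality, we may assume that $F+\Lambda$ attains its minimum at a unique path $\phi$. There exists a $\gamma\in\mathcal{H}_H$ such that
$$\phi=\Phi(\gamma),\quad\text{and}\quad\Lambda(\phi)=\frac12\|\gamma\|^2_{\mathcal{H}_H},$$
and
$$a\overset{\text{def}}{=}\inf\{F+\Lambda(\phi),\ \phi\in P(\mathbb{R}^d)\}=\inf\Big\{F\circ\Phi(k)+\frac12\|k\|^2_{\mathcal{H}_H},\ k\in\mathcal{H}_H\Big\}.$$

Moreover, by assumption H2, for all nonzero $k\in\mathcal{H}_H$:
$$d^2\Big(F\circ\Phi+\frac12\|\cdot\|^2_{\mathcal{H}_H}\Big)(\gamma)\,k^2>0.$$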
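Consider the following stochastic differential equation
$$Z_t^\varepsilon=x+\int_0^t\sigma(Z_s^\varepsilon)(\varepsilon\,dB_s+d\gamma_s)+\int_0^tb(\varepsilon,Z_s^\varepsilon)\,ds.$$

It is clear that $Z^0=\phi$. Denoting $Z_t^{m,\varepsilon}=\partial_\varepsilon^mZ_t^\varepsilon$ and considering the Taylor expansion with respect to $\varepsilon$ near $\varepsilon=0$, we obtain
$$Z^\varepsilon=\phi+\sum_{j=1}^N\frac{g_j\varepsilon^j}{j!}+\varepsilon^{N+1}R^\varepsilon_{N+1},$$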
where g j = Z j,0 . Explicitly, we have

dg1 (s) = σ(φs )d Bs + ∂x σ(φs )g1 (s)dγs + ∂x b(0, φs )g1 (s)ds + ∂ε b(0, φs )ds.

Now the proof is divided into the following steps.


Step 1: By the large deviation principle, the sample paths that contribute to the asymptotics of $J(\varepsilon)$ lie in the neighborhoods of the minimizers of $F+\Lambda$. More precisely, for $\rho>0$, denote by $B(\phi,\rho)$ the open ball (under the $\lambda$-Hölder topology for a fixed $\lambda<H$) centered at $\phi$ with radius $\rho$. There exist $d>a$ and $\varepsilon_0>0$ such that for all $\varepsilon\le\varepsilon_0$
$$\Big|J(\varepsilon)-E\big[f(X_T^\varepsilon)e^{-F(X_T^\varepsilon)/\varepsilon^2},\ X^\varepsilon\in B(\phi,\rho)\big]\Big|\le e^{-d/\varepsilon^2}.$$
Hence, letting
$$J_\rho(\varepsilon)=E\big[f(X_T^\varepsilon)e^{-F(X_T^\varepsilon)/\varepsilon^2},\ X^\varepsilon\in B(\phi,\rho)\big],$$
to study the asymptotic behavior of $J(\varepsilon)$ as $\varepsilon\downarrow0$, it suffices to study that of $J_\rho(\varepsilon)$.


Step 2: Let $\theta(\varepsilon)=F(Z^\varepsilon)$ and write
$$\theta(\varepsilon)=\theta(0)+\varepsilon\theta'(0)+\frac12\varepsilon^2\theta''(0)+\varepsilon^3R(\varepsilon).$$
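By the Cameron-Martin theorem for fractional Brownian motions, we have
$$\begin{aligned}J_\rho(\varepsilon)&=E\Big\{f(Z^\varepsilon)\exp\Big(-\frac{F(Z^\varepsilon)}{\varepsilon^2}\Big)\exp\Big(-\frac1\varepsilon\int_0^T\big((K_H^*)^{-1}(\dot{K}_H^{-1}\gamma)\big)_s\,dB_s-\frac{\|\gamma\|^2_{\mathcal{H}_H}}{2\varepsilon^2}\Big);\ Z^\varepsilon\in B(\phi,\rho)\Big\}\\&=E\Big[\exp\Big\{-\frac{1}{\varepsilon^2}\Big(F(\phi)+\frac12\|\gamma\|^2_{\mathcal{H}_H}\Big)\Big\}\exp\Big[-\frac{\theta'(0)+\int_0^T\big((K_H^*)^{-1}(\dot{K}_H^{-1}\gamma)\big)_s\,dB_s}{\varepsilon}\Big]\\&\qquad\quad\exp\Big\{-\frac12\theta''(0)\Big\}\cdot f(Z^\varepsilon)e^{-\varepsilon R(\varepsilon)};\ Z^\varepsilon\in B(\phi,\rho)\Big].\end{aligned}\tag{4.1}$$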

Step 3: It is clear that to prove Theorem 4.2, it suffices to analyze the four terms in the expectation above. First of all, it is apparent that the first term (of order $-2$) is
$$\exp\Big\{-\frac{1}{\varepsilon^2}\Big(F(\phi)+\frac12\|\gamma\|^2_{\mathcal{H}_H}\Big)\Big\}=e^{-\frac{a}{\varepsilon^2}}, \tag{4.2}$$
which gives the leading term of the Varadhan asymptotics.


The second term (of order $-1$) is deterministic. Indeed, since $\gamma$ is a critical point of $F\circ\Phi+\frac12\|\cdot\|^2_{\mathcal{H}_H}$ and $\|k\|_{\mathcal{H}_H}=\|K_H^{-1}k\|_{\mathcal{H}}$, we have
$$dF(\phi)(d\Phi(\gamma)k)=-\int_0^T\big((K_H^*)^{-1}(\dot{K}_H^{-1}\gamma)\big)_s\,dk_s.$$

By the continuity of Young's integral with respect to the driving path, the above extends to
$$dF(\phi)(d\Phi(\gamma)B)=-\int_0^T\big((K_H^*)^{-1}(\dot{K}_H^{-1}\gamma)\big)_s\,dB_s.$$

On the other hand, note
$$\theta'(0)=dF(\phi)g_1,$$

and
$$g_1=d\Phi(\gamma)B+Y.$$

Here Y is the solution of

dYs = ∂x σ(φs )Ys dγs + ∂ε b(0, φs )ds + ∂x b(0, φs )Ys ds, Y (0) = 0.

We obtain
$$\exp\Big[-\frac{\theta'(0)+\int_0^T\big((K_H^*)^{-1}(\dot{K}_H^{-1}\gamma)\big)_s\,dB_s}{\varepsilon}\Big]=\exp\Big\{-\frac{dF(\phi)Y}{\varepsilon}\Big\}. \tag{4.3}$$

For the third term (of order 0), one can show that there exists a $\beta>0$ such that
$$E\exp\Big\{-(1+\beta)\frac12\theta''(0)\Big\}<\infty. \tag{4.4}$$

Let us emphasize that in order to show the above integrability of $\theta''(0)$, one needs to use assumption H2 and prove that $d^2F\circ\Phi(\gamma)(k^1,k^2)$ is Hilbert-Schmidt. For more details, we refer the reader to [5] for the case when $H>\frac12$, and to [27] when $\frac14<H<\frac12$. Moreover, one can prove the following integrability of $R(\varepsilon)$.

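Lemma 4.3 There exist $\alpha>0$ and $\varepsilon_0>0$ such that
$$\sup_{0\le\varepsilon\le\varepsilon_0}E\big[e^{(1+\alpha)|\varepsilon R(\varepsilon)|};\ Z^\varepsilon\in B(\phi,\rho)\big]<\infty.$$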

Lemma 4.3 and (4.4) allow us to analyze the third and fourth terms and show
$$E\Big[f(Z^\varepsilon)e^{-\frac12\theta''(0)-\varepsilon R(\varepsilon)};\ Z^\varepsilon\in B(\phi,\rho)\Big]=\sum_{m=0}^N\alpha_m\varepsilon^m+O(\varepsilon^{N+1}). \tag{4.5}$$
Finally, combining (4.1)–(4.3), and (4.5), the proof of Theorem 4.2 is complete. □

Remark 4.4 In applications (see the next section), one may also be interested in an SDE which involves a fractional order term of $\varepsilon$,
$$X_t^\varepsilon=x+\varepsilon\int_0^t\sigma(X_s^\varepsilon)\,dB_s+\varepsilon^{\frac1H}\int_0^tb(\varepsilon,X_s^\varepsilon)\,ds. \tag{4.6}$$

For this purpose, let us first introduce
$$\Lambda_1=\Big\{n_1+\frac{n_2}{H}\ \Big|\ n_1,n_2=0,1,2,\dots\Big\}, \tag{4.7}$$
the set of fractional orders. Let $0=\kappa_0<\kappa_1<\kappa_2<\cdots$ be all elements of $\Lambda_1$ in increasing order. When $H>\frac12$, we have
$$(\kappa_0,\kappa_1,\kappa_2,\kappa_3,\kappa_4,\dots)=\Big(0,1,\frac1H,2,1+\frac1H,\dots\Big). \tag{4.8}$$
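
The ordering in (4.8) can be reproduced mechanically. The following Python sketch enumerates the elements $n_1+n_2/H$ of the index set in (4.7) up to a cutoff, using exact rational arithmetic; the function name `fractional_orders` and the cutoff argument are illustrative assumptions, and the example value $H=3/4$ is an arbitrary choice with $H>\frac12$.

```python
from fractions import Fraction


def fractional_orders(hurst: Fraction, max_value: Fraction):
    """Sorted elements n1 + n2/H (n1, n2 = 0, 1, 2, ...) that do not exceed max_value."""
    inv_h = 1 / hurst
    values = set()
    n2 = 0
    while n2 * inv_h <= max_value:
        n1 = 0
        while n1 + n2 * inv_h <= max_value:
            values.add(n1 + n2 * inv_h)
            n1 += 1
        n2 += 1
    return sorted(values)


if __name__ == "__main__":
    # For H = 3/4 the first orders are 0, 1, 1/H = 4/3, 2, 1 + 1/H = 7/3.
    print(fractional_orders(Fraction(3, 4), Fraction(7, 3)))
```

Using `Fraction` avoids spurious duplicates that floating-point sums such as $n_1+n_2/H$ could otherwise produce.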

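Set
$$\Lambda_2=\{\kappa-2\mid\kappa\in\Lambda_1\setminus\{0\}\},$$
and define
$$\Lambda_3=\{a_1+a_2+\cdots+a_m\mid m\in\mathbb{N}_+\text{ and }a_1,\dots,a_m\in\Lambda_1\}$$
and
$$\Lambda_3'=\{a_1+a_2+\cdots+a_m\mid m\in\mathbb{N}_+\text{ and }a_1,\dots,a_m\in\Lambda_2\}.$$
Finally let
$$\Lambda_4=\{a+b\mid a\in\Lambda_3,\ b\in\Lambda_3'\}$$
and denote by $\{0=\lambda_0<\lambda_1<\lambda_2<\dots\}$ all the elements of $\Lambda_4$ in increasing order. Let us note that the set $\Lambda_3$ characterizes the powers of $\varepsilon$ coming from the term $f(Z^\varepsilon)$ in (4.1) and $\Lambda_3'$ characterizes that of $e^{-\varepsilon R(\varepsilon)}$.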
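Similarly as before, we consider
$$Z_t^\varepsilon=x+\int_0^t\sigma(Z_s^\varepsilon)(\varepsilon\,dB_s+d\gamma_s)+\varepsilon^{\frac1H}\int_0^tb(\varepsilon,Z_s^\varepsilon)\,ds. \tag{4.9}$$

It can be proved that $Z^\varepsilon$ has the following expansion in $\varepsilon$,
$$Z^\varepsilon=\phi+\sum_{j=1}^Ng_{\kappa_j}\varepsilon^{\kappa_j}+\varepsilon^{\kappa_{N+1}}R^\varepsilon_{\kappa_{N+1}}.$$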

Note that in (4.8), indices up to degree two are $(0,1,1/H,2)$. There is an extra term $1/H$ compared to the case without fractional order. Hence when plugging (4.9) into Step 2 of the proof of Theorem 4.2, there is an extra (but deterministic) term
$$\exp\Big\{-\frac{dF(\phi)g_{\kappa_2}}{\varepsilon^{2-\frac1H}}\Big\},$$

where gκ2 satisfies

dgκ2 (s) = ∂x σ(φs )gκ2 (s)dγs + b(0, φs )ds, gκ2 (0) = 0.

It is not hard to see that the other terms up to degree two remain the same, and that
although higher order terms are different they could be handled similarly as before.
Hence we obtain
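Theorem 4.5 Let $X^\varepsilon$ satisfy (4.6). We have
$$E\big[f(X^\varepsilon)e^{-F(X^\varepsilon)/\varepsilon^2}\big]=e^{-\frac{a}{\varepsilon^2}}e^{-\frac{c}{\varepsilon}}\exp\Big\{-\frac{d}{\varepsilon^{2-\frac1H}}\Big\}\Big(\alpha_{\lambda_0}+\alpha_{\lambda_1}\varepsilon^{\lambda_1}+\cdots+\alpha_{\lambda_N}\varepsilon^{\lambda_N}+O(\varepsilon^{\lambda_{N+1}})\Big).$$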

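Here
$$a=\inf\Big\{F\circ\Phi(k)+\frac12\|k\|^2_{\mathcal{H}_H},\ k\in\mathcal{H}_H\Big\},\qquad c=dF(\phi)Y,\qquad\text{and}\qquad d=dF(\phi)g_{\kappa_2},$$
where $Y$ and $g_{\kappa_2}$ satisfy
$$dY(s)=\partial_x\sigma(\phi(s))Y(s)\,d\gamma(s)+\partial_\varepsilon b(0,\phi(s))\,ds+\partial_xb(0,\phi(s))Y(s)\,ds,\quad Y(0)=0,$$
and
$$dg_{\kappa_2}(s)=\partial_x\sigma(\phi_s)g_{\kappa_2}(s)\,d\gamma_s+b(0,\phi_s)\,ds,\quad g_{\kappa_2}(0)=0.$$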

Remark 4.6 Theorem 4.2 for the rough case $\frac14<H<\frac12$ was proved by Inahama [27]. In this case, the equation is understood in the rough path sense. Thanks to Proposition 2.3, the equations for $g_i$ and $R_i$ are understood in the sense of Young pairing.
In [27] the author also discussed RDEs with fractional orders of $\varepsilon$, in which the index set $\Lambda_1$ was introduced. The main idea of the proof for the rough case is the same as that outlined above, but the major difficulty is to show that $d^2F\circ\Phi(\gamma)(k^1,k^2)$ is Hilbert-Schmidt. This is easier when $H>\frac12$, since in this case $\partial_tK(t,s)$ is integrable, and one can easily obtain a nice representation for $d^2F\circ\Phi(\gamma)(k^1,k^2)$.

4.2 Expansion of the Density Function

Consider
$$X_t=x+\sum_{i=1}^d\int_0^tV_i(X_s)\,dB_s^i+\int_0^tV_0(X_s)\,ds. \tag{4.10}$$

We are interested in studying the small-time asymptotic behavior of $X_t$. It is clear that by the self-similarity of $B$, this is equivalent to studying the asymptotic behavior of $X_1^\varepsilon$ (for small $\varepsilon$), which satisfies
$$X_t^\varepsilon=x+\varepsilon\sum_{i=1}^d\int_0^tV_i(X_s^\varepsilon)\,dB_s^i+\varepsilon^{\frac1H}\int_0^tV_0(X_s^\varepsilon)\,ds.$$
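
The origin of the exponent $1/H$ can be seen from a formal time change. The following is a sketch of the scaling argument, in which $c>0$ is an auxiliary time-scale parameter that is not part of the original notation. Substituting $s=cu$ in (4.10) gives
$$\begin{aligned}X_{ct}&=x+\sum_{i=1}^d\int_0^{ct}V_i(X_s)\,dB^i_s+\int_0^{ct}V_0(X_s)\,ds\\&=x+\sum_{i=1}^d\int_0^{t}V_i(X_{cu})\,dB^i_{cu}+c\int_0^{t}V_0(X_{cu})\,du.\end{aligned}$$
Since $(B_{cu})_{u\ge0}$ has the same law as $(c^HB_u)_{u\ge0}$, the process $(X_{ct})_{t\ge0}$ has the same law as $X^\varepsilon$ with $\varepsilon=c^H$, that is, $c=\varepsilon^{1/H}$. In particular $X_c$ and $X_1^\varepsilon$ have the same law, so small time $t$ corresponds to the small parameter $\varepsilon=t^H$.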

In what follows, we use the Laplace approximation to obtain a short time asymptotic expansion for the density of $X_1^\varepsilon$ in the case when $H>\frac12$. For this purpose, we need the following assumption.
Assumption 4.7 • A1: For every $x\in\mathbb{R}^d$, the vectors $V_1(x),\dots,V_d(x)$ form a basis of $\mathbb{R}^d$.
• A2: There exist smooth and bounded functions $\omega^l_{ij}$ such that:
$$[V_i,V_j]=\sum_{l=1}^d\omega^l_{ij}V_l,$$
and
$$\omega^l_{ij}=-\omega^j_{il}.$$

Assumption A1 is the standard ellipticity condition. Due to the second assumption A2, the geodesics are easily described. If $k:\mathbb{R}_{\ge0}\to\mathbb{R}^d$ is an $\alpha$-Hölder path with $\alpha>1/2$ such that $k(0)=0$, we denote by $\Phi(x,k)$ the solution of the ordinary differential equation:
$$x_t=x+\sum_{i=1}^d\int_0^tV_i(x_s)\,dk_s^i.$$

Whenever there is no confusion, we always suppress the starting point $x$ and denote it simply by $\Phi(k)$ as before. Then we have (see Lemma 4.2 in [5])
Lemma 4.8 $\Phi(x,k)$ is a geodesic if and only if $k(t)=tu$ for some $u\in\mathbb{R}^d$.
As a consequence of the previous lemma, we then have the following key result (Proposition 4.3 in [5]):

Proposition 4.9 Let $T>0$. For $x,y\in\mathbb{R}^d$,
$$\inf_{k\in\mathcal{H}_H,\ \Phi_T(x,k)=y}\|k\|^2_{\mathcal{H}_H}=\frac{d^2(x,y)}{T^{2H}}.$$

Lemma 4.10 For any $x\in\mathbb{R}^d$, there exists a neighborhood $V$ of $x$ and a bounded smooth function $F(x,y,z)$ on $V\times V\times\mathbb{R}^d$ such that:
(1) For any $(x,y)\in V\times V$ the infimum
$$\inf\Big\{F(x,y,z)+\frac{d(x,z)^2}{2},\ z\in M\Big\}=0$$
is attained at the unique point $y$. Moreover, it is a non-degenerate minimum. Hence there exists a unique $k^0\in\mathcal{H}_H$ such that (a): $\Phi_1(x_0,k^0)=y_0$; (b): $d(x_0,y_0)=\|k^0\|_{\mathcal{H}_H}$; and (c): $k^0$ is a non-degenerate minimum of the functional $k\mapsto F(\Phi_1(x_0,k))+\frac12\|k\|^2_{\mathcal{H}_H}$ on $\mathcal{H}_H$.
(2) For each $(x,y)\in V\times V$, there exists a ball centered at $y$ with radius $r$ independent of $x,y$ such that $F(x,y,\cdot)$ is a constant outside of the ball.

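Let $F$ be as in the above lemma and $p_\varepsilon(x,y)$ the density function of $X_1^\varepsilon$. By the inversion of the Fourier transform we have
$$\begin{aligned}p_\varepsilon(x,y)e^{-\frac{F(x,y,y)}{\varepsilon^2}}&=\frac{1}{(2\pi)^d}\int e^{-i\zeta\cdot y}\,d\zeta\int e^{i\zeta\cdot z}e^{-\frac{F(x,y,z)}{\varepsilon^2}}p_\varepsilon(x,z)\,dz\\&=\frac{1}{(2\pi\varepsilon)^d}\int e^{-i\frac{\zeta\cdot y}{\varepsilon}}\,d\zeta\int e^{i\frac{\zeta\cdot z}{\varepsilon}}e^{-\frac{F(x,y,z)}{\varepsilon^2}}p_\varepsilon(x,z)\,dz\\&=\frac{1}{(2\pi\varepsilon)^d}\int d\zeta\,E_x\Big[e^{\frac{i\zeta\cdot(X_1^\varepsilon-y)}{\varepsilon}}e^{-\frac{F(x,y,X_1^\varepsilon)}{\varepsilon^2}}\Big].\end{aligned}\tag{4.11}$$

It is clear that by applying the Laplace approximation to the expectation in the last equation above and switching the order of integration (with respect to $\zeta$) and summation, we obtain an asymptotic expansion for the density function $p_\varepsilon(x,y)$.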

Remark 4.11 One might wonder why we do not construct, for each fixed $x,y$, a function $F$ which minimizes (at $z=y$)
$$F(x,y,z)+\frac{D(x,z)^2}{2}$$
in Lemma 4.10, where
$$D^2(x,y)=\inf_{k\in\mathcal{H}_H,\ \Phi_1(x,k)=y}\|k\|^2_{\mathcal{H}_H}.$$

After all, $D(x,y)$ seems the natural "distance" for the system (4.10), instead of the Riemannian distance $d(x,y)$. The problem with $D(x,y)$ is that it is not clear whether it is differentiable, while the construction of $F$ in Lemma 4.10 needs some differentiability of $D(x,y)$. This is indeed one of the reasons why we impose the structure assumption A2, so that $D(x,y)=d(x,y)$ (the content of Proposition 4.9). With this identification, we know $D(x,y)$ is smooth for all $x\ne y$.

Remark 4.12 In order to show Proposition 4.9, we used the fact that $\partial K(t,s)/\partial t$ is integrable, which is only true for the smooth case $H>\frac12$. Hence although Inahama proved the Laplace approximation for $\frac14<H<\frac12$ in [27], we cannot repeat the proof in this section to produce an expansion of the density function for the rough case.

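Recall the definition of $\Lambda_1$ in Remark 4.4 and similarly set
$$\Lambda_2=\{\kappa-1\mid\kappa\in\Lambda_1\setminus\{0\}\}$$
and
$$\Lambda_2'=\{\kappa-2\mid\kappa\in\Lambda_1\setminus\{0\}\}.$$
Next define
$$\Lambda_3=\{a_1+a_2+\cdots+a_m\mid m\in\mathbb{N}_+\text{ and }a_1,\dots,a_m\in\Lambda_2\}$$
and
$$\Lambda_3'=\{a_1+a_2+\cdots+a_m\mid m\in\mathbb{N}_+\text{ and }a_1,\dots,a_m\in\Lambda_2'\}.$$
Finally, set
$$\Lambda_4=\{a+b\mid a\in\Lambda_3,\ b\in\Lambda_3'\}$$
and denote by $\{0=\lambda_0<\lambda_1<\lambda_2<\dots\}$ all the elements of $\Lambda_4$ in increasing order. Similarly as before, powers of $\varepsilon$ in the index set $\Lambda_3$ come from the term $\exp\{i\zeta\cdot(X_1^\varepsilon-y)/\varepsilon\}$ in (4.11) and powers in $\Lambda_3'$ come from $\exp\{-F(x,y,X_1^\varepsilon)/\varepsilon^2\}$.
Our main result of this section is the following (by letting $\varepsilon=t^H$).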
Theorem 4.13 Fix $x\in\mathbb{R}^d$. Suppose that Assumption 4.7 is satisfied. Then in a neighborhood $V$ of $x$, the density function $p(t;x,y)$ of $X_t$ in (4.10) has the following asymptotic expansion near $t=0$:
$$p(t;x,y)=\frac{1}{(t^H)^d}\,e^{-\frac{d^2(x,y)}{2t^{2H}}+\frac{\beta}{t^{2H-1}}}\Big(\sum_{i=0}^Nc_i(x,y)t^{\lambda_iH}+r_{N+1}(t,x,y)t^{\lambda_{N+1}H}\Big),\quad y\in V.$$

Here $\beta$ is some constant and $d(x,y)$ is the Riemannian distance between $x$ and $y$ determined by $V_1,\dots,V_d$. Moreover, we can choose $V$ such that the $c_i(x,y)$ are $C^\infty$ in $V\times V\subset\mathbb{R}^d\times\mathbb{R}^d$, and for all multi-indices $\alpha$ and $\beta$
$$\sup_{t\le t_0}\ \sup_{(x,y)\in V\times V}|\partial_x^\alpha\partial_y^\beta r_{N+1}(t,x,y)|<\infty$$
for some $t_0>0$.


Remark 4.14 The differentiability of $c_i(x,y)$ and $r_{N+1}$ in the above theorem and the legitimacy of the Fourier inversion in (4.11) are obtained by Malliavin calculus and some uniform estimates of the coefficients in the Laplace approximation. We refer the reader to [5] for details.
Remark 4.15 Our result assumes the ellipticity condition and a strong structure condition (Assumption 4.7). Later Inahama [28, 29] proved the kernel expansion (for $H>\frac13$) under some mild conditions on the vector fields. He takes a different approach

and uses Watanabe distribution theory. Hence he is able to work with D(x, y) intro-
duced in Remark 4.11 directly and avoids the technical assumption A2 of Assumption
4.7. On the other hand, the smoothness of coefficient and the uniform estimate for
the remainder terms in the expansion are not provided in [28, 29].

5 Application to Mathematical Finance

Fractional Brownian motions have been used in financial models to introduce mem-
ory. In this section, we give two examples of such models and remark on how the
methods and results in the previous sections could be applied to the study of such
models.

5.1 One Dimensional Models

Memory can be introduced into the stock price process directly. In particular, the so-called fractional Black and Scholes model is given by
$$S_t=S_0\exp\Big(\mu t+\sigma B_t^H-\frac{\sigma^2}{2}t^{2H}\Big), \tag{5.1}$$

where B H is a fractional Brownian motion with Hurst parameter H , μ the mean rate
of return and σ > 0 the volatility. Let r be the interest rate. The price for the risk-free
bond is given by er t .
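
Since $B_t^H\sim N(0,t^{2H})$, the correction term $-\frac{\sigma^2}{2}t^{2H}$ in (5.1) is exactly what makes $E[S_t]=S_0e^{\mu t}$. The Python sketch below checks this by Monte Carlo at a single time $t$, using only the marginal law of $B_t^H$; the function name, parameter values and sample size are illustrative assumptions.

```python
import math
import random


def simulate_mean(s0, mu, sigma, hurst, t, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E[S_t] in (5.1), sampling B_t^H ~ N(0, t^{2H})."""
    rng = random.Random(seed)
    std = t ** hurst  # standard deviation of B_t^H
    drift = mu * t - 0.5 * sigma ** 2 * t ** (2 * hurst)
    total = 0.0
    for _ in range(n_samples):
        total += s0 * math.exp(drift + sigma * rng.gauss(0.0, std))
    return total / n_samples


if __name__ == "__main__":
    estimate = simulate_mean(100.0, 0.05, 0.2, 0.7, 2.0)
    print(estimate, 100.0 * math.exp(0.05 * 2.0))
```

Only the one-dimensional marginal is sampled here, so the sketch says nothing about the path properties (memory, arbitrage) discussed in this section.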
More generally, one can also consider a fractional local volatility model

d St = St (μdt + σ(St )d BtH ).

Here the stochastic integration with respect to B H could be understood in the sense
of rough path theory. After a simple change of variable X t = log St , one obtains

d X t = μdt + σ(e X t )d BtH .

There has been an intensive study recently of option prices and implied volatilities for
options with short maturity (e.g. [9, 16, 21]). Since the above equation is a special
case of (4.10), we can use the results obtained in the previous sections to obtain
short-time asymptotic behavior of such models.
A drawback of the finance models discussed above is that they lead to the existence of arbitrage opportunities. For example, let the couple $(\alpha_t,\beta_t)$, $t\in[0,T]$, be a portfolio with $\alpha_t$ the amount of bonds and $\beta_t$ the amount of stocks at time $t$. One can construct an arbitrage in the fractional Black and Scholes model by (for simplicity, we assume $\mu=r=0$)
$$\beta_t=S_t-S_0,\quad\text{and}\quad\alpha_t=\int_0^t\beta_s\,dS_s-\beta_tS_t.$$

Let $V_t$ be the value of the portfolio at time $t$. It is not hard to see that this is a self-financing portfolio that satisfies $V_0=0$ and $V_t=\frac12(S_t-S_0)^2$ for all $t>0$, and hence it is an arbitrage. For more discussion on arbitrage in models given by fractional Brownian motions, we refer the reader to [35].
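
The mechanism behind this arbitrage is that, for a price path regular enough for Young integration (as when $H>\frac12$), the ordinary chain rule holds, so $\int_0^t(S_s-S_0)\,dS_s=\frac12(S_t-S_0)^2\ge0$. The Python sketch below illustrates this with left-point Riemann sums along a smooth deterministic path, which is used here only as a stand-in for such a regular path; the function names and the particular path are hypothetical.

```python
import math


def smooth_price_path(n: int):
    """A smooth deterministic price path on [0, 1], sampled at n + 1 points."""
    return [1.0 + 0.3 * math.sin(3.0 * i / n) for i in range(n + 1)]


def portfolio_value(prices):
    """Left-point Riemann sum for the gains of holding beta_s = S_s - S_0 shares."""
    s0 = prices[0]
    value = 0.0
    for prev, cur in zip(prices, prices[1:]):
        value += (prev - s0) * (cur - prev)
    return value


if __name__ == "__main__":
    path = smooth_price_path(100_000)
    # The two numbers agree up to half the sum of squared increments, which is small here.
    print(portfolio_value(path), 0.5 * (path[-1] - path[0]) ** 2)
```

For a Brownian path the sum of squared increments does not vanish in the limit, which is why the same strategy produces no arbitrage in the classical Black and Scholes model.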

5.2 Stochastic Volatility Models

Stochastic volatility models were introduced to capture both the volatility smile and
the correct dynamics of the volatility smile (see [23] for instance). For these models,
modeling the volatility process is one of the key factors. In [14], the authors proposed
a long memory specification of the volatility process in order to capture the steepness
of long term volatility smiles without over increasing the short run persistence.
The following stochastic volatility model based on the fractional Ornstein-Uhlenbeck
process provides another way of introducing long memory into the volatility
process:
d St = μSt dt + σt St dWt ,

where σt = f (Yt ) and Yt is a fractional Ornstein-Uhlenbeck process:

dYt = α(m − Yt )dt + βt d BtH .

In the above, $W_t$ is a standard Brownian motion and $B^H_t$ is an independent (of $W_t$)
fractional Brownian motion with Hurst parameter $H > \frac{1}{2}$. Examples of functions $f$
are $f(x) = e^x$ and $f(x) = |x|$.
Comte and Renault [13] studied this type of stochastic volatility model, which
introduces long memory and mean reversion in the Hull and White setting [26]. The
long memory property allows this model to capture the well-documented evidence of
persistence of the stochastic feature of Black and Scholes implied volatilities when
time to maturity increases.
Unlike the one-dimensional models mentioned above, the fractional Ornstein-Uhlenbeck
model is arbitrage free, since the stock price process is driven by a standard
Brownian motion. In [25], Hu proved that for this model the market is incomplete
and the martingale measures are not unique. Set $\gamma_t = (r - \mu)/\sigma_t$ and

$$\frac{dQ}{dP} = \exp\left(\int_0^T \gamma_t \, dW_t - \frac{1}{2} \int_0^T |\gamma_t|^2 \, dt\right).$$

Then $Q$ is the minimal martingale measure associated with $P$. Moreover, the
risk-minimizing hedging price at $t = 0$ of a European call option with payoff $(S_T - K)^+$
is given by

$$C_0 = e^{-rT}\, E_Q\left[(S_T - K)^+\right].$$

The fractional Ornstein-Uhlenbeck model takes a generalized form of Eq. (4.10),
which is studied in the previous sections. It is a system of SDEs driven by fractional
Brownian motions, but with varying Hurst parameter H. We believe that the methods
discussed above can be extended to study small-time asymptotics of these models.

References

1. Azencott, R.: Densité des diffusions en temps petit: développements asymptotiques. I. Seminar
on probability, XVIII. Lecture Notes in Mathematics, vol. 1059, pp. 402-498. Springer, Berlin
(1984)
2. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Ann. Sci. École Norm. Sup. (4) 21(3), 307–331 (1988)
3. Ben Arous, G.: Méthode de Laplace et de la phase stationnaire sur l’espace de Wiener, Sto-
chastics 25(3), 125–153 (1988)
4. Baudoin, F., Hairer, M.: A version of Hörmander’s theorem for the fractional Brownian motion.
Probab. Theory Relat. Fields 139, 373–395 (2007)
5. Baudoin, F., Ouyang, C.: Small-time kernel expansion for solutions of stochastic differential
equations driven by fractional Brownian motions. Stoch. Process. Appl. 121(4), 759–792 (2011)
6. Baudoin, F., Ouyang, C.: Gradient bounds for solutions of stochastic differential equa-
tions driven by fractional Brownian motions. Malliavin Calculus and Stochastic Analysis:
A Festschrift in Honor of David Nualart. Springer, Berlin (2012)
7. Baudoin, F., Ouyang, C., Zhang, X.: Varadhan estimates for RDEs driven by fractional Brown-
ian motions. Stoch. Proc. Appl. 125(2), 634–652 (2015)
8. Baudoin, F., Ouyang, C., Zhang, X.: Smoothing effect of rough differential equations driven
by fractional Brownian motions. Ann. Inst. Henri Poincare Probab. Statist. (2013)
9. Berestycki, H., Busca, J., Florent, I.: Computing the implied volatility in stochastic volatility
models. Commun. Pure Appl. Math. 57, 1352–1373 (2004)
10. Cass, T., Friz, P.: Densities for rough differential equations under Hörmander condition. Ann.
Math. 171(3), 2115–2141 (2010)
11. Cass, T., Friz, P., Victoir, N.: Non-degeneracy of Wiener functionals arising from rough differ-
ential equations. Trans. Am. Math. Soc. 361, 3359–3371 (2009)
12. Cass, T., Litterer, C., Lyons, T.: Integrability and tail estimates for Gaussian rough differential
equations. Ann. Probab. 41(4), 3026–3050 (2013)
13. Comte, F., Renault, E.: Long memory in continuous-time stochastic volatility models. Math.
Financ. 8, 291–323 (1998)
14. Comte, F., Coutin, L., Renault, E.: Affine fractional stochastic volatility models. Ann. Financ.
8(2–3), 337–378 (2012)
15. Coutin, L., Qian, Z.M.: Stochastic analysis, rough path analysis and fractional Brownian
motions. Probab. Theory Relat. Fields 122(1), 108–140 (2002)
16. Feng, J., Forde, M., Fouque, J.P.: Short maturity asymptotics for a fast mean reverting Heston
stochastic volatility model. SIAM J. Financ. Math. 1, 126–141 (2010)
17. Friz, P., Gess, B., Gulisashvili, A., Riedel, S.: The Jain-Monrad criterion for rough paths and
applications to random Fourier series and non-Markovian Hörmander theory. Ann. Probab.
(2013)
18. Friz, P., Riedel, S.: Integrability of (non-)linear rough differential equations and integrals.
Stoch. Anal. Appl. 31(2), 336–358 (2013)
19. Friz, P., Victoir, N.: Differential equations driven by Gaussian signals. Ann. Inst. Henri Poincare
Probab. Stat. 46(2), 369–413 (2010)

20. Friz, P., Victoir, N.: Multidimensional Stochastic Processes as Rough Paths. Cambridge
University Press, Cambridge (2010)
21. Gatheral, J., Hsu, E., Laurence, P., Ouyang, C., Wang, T.-H.: Asymptotics of implied volatility
in local volatility models. Math. Financ. 22, 591–620 (2012)
22. Gubinelli, M.: Controlling rough paths. J. Funct. Anal. 216, 86–140 (2004)
23. Hagan, P., Kumar, D., Lesniewski, A., Woodward, D.: Managing Smile Risk. Wilmott Mag.
(2003)
24. Hairer, M., Pillai, N.S.: Regularity of laws and ergodicity of hypoelliptic SDEs driven by rough
paths. Ann. Inst. Henri Poincaré Probab. Stat. 47(2), 601–628 (2011)
25. Hu, Y.: Integral transformations and anticipative calculus for fractional Brownian motions.
Mem. Am. Math. Soc. 175(825), 324 (2005)
26. Hull, J., White, A.: The pricing of options on assets with stochastic volatilities. J. Financ. 3,
281–300 (1987)
27. Inahama, Y.: Laplace approximation for rough differential equation driven by fractional Brown-
ian motion. Ann. Probab. 41(1), 170–205 (2013)
28. Inahama, Y.: Short time kernel asymptotics for young SDE by means of Watanabe distribution
theory. To appear in J. Math. Soc. Jpn. (2013)
29. Inahama, Y.: Short time kernel asymptotics for rough differential equation driven by fractional
Brownian motion. Preprint (2014)
30. Léandre, R.: Minoration en temps petit de la densité d’une diffusion dégénérée. J. Funct. Anal.
74, 399–414 (1987)
31. Lyons, T.: Differential equations driven by rough signals. Rev. Mat. Iberoam. 14(2), 215–310
(1998)
32. Lyons, T., Qian, Z.: System Control and Rough Paths. Oxford University Press, Oxford (2002)
33. Nualart, D.: The Malliavin Calculus and Related Topics, 2nd edn. Probability and its Applica-
tions. Springer, Berlin (2006)
34. Nualart, D., Saussereau, B.: Malliavin calculus for stochastic differential equations driven by
a fractional Brownian motion. Stoch. Process. Appl. 119(2), 391–409 (2009)
35. Rogers, L.C.G.: Arbitrage with fractional Brownian motion. Math. Financ. 7(1), 95–105 (1997)
On Singularities in the Heston Model

Vladimir Lucic

Abstract In this note we provide a characterization of the singularities of the
Heston characteristic function. In particular, we show that all the singularities are
pure imaginary.

Keywords Heston · Complex singularities

1 Problem Formulation

Consider the Heston stochastic volatility model, which under the risk-neutral measure
and with zero drift has the following dynamics

$$dS_t = S_t \sqrt{v_t}\, dW^{(1)}_t,$$
$$dv_t = \lambda(\bar v - v_t)\, dt + \eta \sqrt{v_t}\left(\rho\, dW^{(1)}_t + \sqrt{1 - \rho^2}\, dW^{(2)}_t\right),$$

where the parameters $\lambda$, $\eta$, and $\bar v$ are nonnegative, $\rho \in [-1, 1]$, and the initial values
$S_0$ and $v_0$ are positive.
The Heston characteristic function is defined as

$$\phi_H(u, \tau) = E\left[e^{iu \log(S_\tau/S_0)}\right], \quad \alpha < \Im(u) < \beta.$$

Results of Heston [5] and Lewis [7] show that on the strip of convergence
$\alpha < \Im(u) < \beta$ the Heston characteristic function coincides with

$$\phi(u, \tau) = e^{C(u,\tau)\bar v + D(u,\tau) v_0}, \quad u \in \mathbb{C},$$

V. Lucic (B)
Quantitative Analytics, Barclays, 5 The North Colonnade, Canary Wharf,
London E14 4BB, UK
e-mail: [email protected]

© Springer International Publishing Switzerland 2015 439


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_15

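where

$$D(u, \tau) = r_- \frac{1 - e^{-d\tau}}{1 - g e^{-d\tau}}, \qquad C(u, \tau) = \lambda\left(r_- \tau - \frac{2}{\eta^2} \log\frac{1 - g e^{-d\tau}}{1 - g}\right),$$
$$r_\pm = \frac{\beta \pm d}{\eta^2}, \qquad d = \sqrt{\beta^2 + 2\alpha\eta^2}, \qquad g = \frac{r_-}{r_+},$$
$$\alpha = \frac{u^2}{2} - \frac{iu}{2}, \qquad \beta = \lambda + \rho\eta i u.$$

With the customary abuse of terminology, we'll refer to $\phi(u, \tau)$, $u \in \mathbb{C}$, as the Heston
characteristic function.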
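
The closed-form expression above can be evaluated directly with complex arithmetic. The following is a minimal sketch that transcribes the formulas for $C$, $D$, $r_\pm$, $d$, $g$, $\alpha$ and $\beta$ as written; the function name and the parameter values in the usage note are hypothetical. It relies on the principal branches of the complex square root and logarithm, which is an assumption that can fail for large $\tau$ or for $u$ far from the origin.

```python
import cmath


def heston_phi(u, tau, lam, eta, rho, v_bar, v0):
    """Evaluate phi(u, tau) = exp(C * v_bar + D * v0) from the closed-form expressions.

    Principal branches of sqrt and log are used (an assumption of this sketch).
    """
    alpha = u * u / 2 - 1j * u / 2
    beta = lam + rho * eta * 1j * u
    d = cmath.sqrt(beta * beta + 2 * alpha * eta * eta)
    r_minus = (beta - d) / eta ** 2
    r_plus = (beta + d) / eta ** 2
    g = r_minus / r_plus
    decay = cmath.exp(-d * tau)
    big_d = r_minus * (1 - decay) / (1 - g * decay)
    big_c = lam * (r_minus * tau - 2 / eta ** 2 * cmath.log((1 - g * decay) / (1 - g)))
    return cmath.exp(big_c * v_bar + big_d * v0)
```

With, say, `lam=1.5, eta=0.5, rho=-0.7, v_bar=0.04, v0=0.04, tau=1.0`, the function returns 1 at `u = 0` and at `u = 1j` (where $\alpha = 0$), up to rounding. The singularities studied in this note are the points where the denominator `1 - g * decay` vanishes.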
Using a result1 of Lukacs [8], Lewis [7] points out that φ(u, τ ) has singularities
on the imaginary axis at the boundaries of the strip of convergence. Whether there
are any other singularities (necessarily complex-conjugate) on that boundary could
not be readily established. Furthermore, no conclusions can be made about singular-
ities outside of the strip of convergence. The purpose of this note is to provide full
characterization of the singularities of φ(u, τ ).

2 Main Result

The following theorem, although presented as an existence result, allows for con-
struction of the singularities of φ(u, τ ) via standard numerical methods.
Theorem 2.1 All singularities of φ(u, τ ) are pure imaginary.
Proof Assume η > 0, as for η = 0 we have the Black-Scholes model whose char-
acteristic function is free of singularities (see, e.g., Lewis [7]).
To simplify notation we put $is = u$ and show that the (essential) singularities of
$\phi(is, \tau)$ are real. To this end, we show that the transcendental equation

$$\frac{r_+}{r_-} = e^{-d\tau}, \tag{2.1}$$

where

$$\beta = \lambda - \rho\eta s, \tag{2.2a}$$
$$d = \sqrt{\beta^2 - \eta^2 s(s - 1)}, \tag{2.2b}$$
$$r_\pm = \frac{\beta \pm d}{\eta^2}, \tag{2.2c}$$

has only real roots.

1 As noted in Lukacs [8], this is a corollary of a more general result on Laplace transforms, e.g.
Theorem II.5b of Widder [9].

We consider (2.1) and (2.2) as a system in $d$ and $s$. Equation (2.1) can be written as

$$d = (-\lambda + \rho\eta s) \tanh(\tau d/2). \tag{2.3}$$

From (2.2a) and (2.2b) we get

$$-(1 - \rho^2)\eta^2 s^2 + s(\eta^2 - 2\rho\eta\lambda) + \lambda^2 - d^2 = 0, \tag{2.4}$$

so, with $q := \sqrt{1 - \rho^2}$, we can express $s$ in terms of $d$: for $q \neq 0$

$$s_{1/2} = \frac{\eta - 2\rho\lambda \pm \sqrt{(\eta - 2\rho\lambda)^2 + 4q^2(\lambda^2 - d^2)}}{2q^2\eta}, \tag{2.5}$$

and for $q = 0$ and $2\rho\lambda - \eta \neq 0$

$$s = \frac{d^2 - \lambda^2}{\eta^2 - 2\lambda\rho\eta}. \tag{2.6}$$

If $q = 0$ and $2\rho\lambda - \eta = 0$, from (2.1), (2.2), and (2.4) we obtain $d = \lambda$, $\rho = 1$,
$\eta = 2\lambda$, which implies that the only singularity is

$$s = \frac{1}{1 - e^{-\lambda\tau}}.$$

If $d = 0$ we have equality in (2.3), while from (2.5) and (2.6) it follows that the roots
in $s$ are real.
For $d \neq 0$ substituting (2.5) in (2.3) yields

$$d = \left(-\lambda + \rho\, \frac{\eta - 2\rho\lambda \pm \sqrt{(\eta - 2\rho\lambda)^2 + 4q^2(\lambda^2 - d^2)}}{2q^2}\right) \tanh(\tau d/2),$$

while substituting (2.6) in (2.3) gives

$$d = \left(-\lambda + \rho\, \frac{d^2 - \lambda^2}{\eta - 2\rho\lambda}\right) \tanh(\tau d/2),$$

which imply, respectively,

$$\left(2dq^2 \coth(\tau d/2) + 2\lambda - \rho\eta\right)^2 = \rho^2\left((\eta - 2\rho\lambda)^2 + 4q^2(\lambda^2 - d^2)\right), \tag{2.7}$$

and

$$d \coth(\tau d/2) + \lambda = \frac{\rho}{\eta - 2\rho\lambda}\,(d^2 - \lambda^2). \tag{2.8}$$

With

$$\frac{\tau d}{2} = iz, \quad a = \operatorname{sgn}(\rho)\, \frac{\tau(\eta - 2\lambda\rho)}{4q}, \quad b = \frac{\tau\lambda}{2}, \quad c = \frac{|\rho|}{q},$$

Lemma 2.2 implies that the roots of (2.7) are either real or pure imaginary. For the
special case (2.8), Lemma 2.3 with

$$\frac{\tau d}{2} = iz, \quad b = \frac{\tau\lambda}{2}, \quad c = -\frac{2\rho}{\tau(\eta - 2\rho\lambda)}$$

implies that the corresponding roots are also either real or pure imaginary.
Therefore, it follows that for $d \neq 0$ the expression in the brackets in (2.3) is
real (being a ratio of either real or imaginary numbers), which in turn implies that the
solutions of the transcendental equation (2.1) are real in $s$.
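
The algebra leading from (2.2a) and (2.2b) to (2.4) and (2.5) can be checked numerically. The sketch below is only a sanity check under the assumption $q \neq 0$ and $\eta > 0$; the helper names are hypothetical. It computes both roots $s_{1/2}$ from (2.5) for a given $d$ and evaluates the residual $\beta^2 - \eta^2 s(s-1) - d^2$, which should vanish.

```python
import cmath


def s_roots(d, lam, eta, rho):
    """Both roots of (2.4) in s, via formula (2.5); requires rho**2 != 1 and eta > 0."""
    q_sq = 1 - rho ** 2
    disc = cmath.sqrt((eta - 2 * rho * lam) ** 2 + 4 * q_sq * (lam ** 2 - d ** 2))
    return [(eta - 2 * rho * lam + sign * disc) / (2 * q_sq * eta) for sign in (1, -1)]


def residual(s, d, lam, eta, rho):
    """Residual of d^2 = beta^2 - eta^2 s (s - 1), with beta = lam - rho * eta * s."""
    beta = lam - rho * eta * s
    return beta ** 2 - eta ** 2 * s * (s - 1) - d ** 2
```

For real $d$ with $|d| \le \lambda$, or for pure imaginary $d$, the discriminant in (2.5) is positive, so both values of $s$ returned by `s_roots` are real.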

Lemma 2.2 For real a and real nonnegative b, c the roots of the equation

$$(z \cot(z) + b - ac)^2 = c^2(a^2 + b^2 + z^2), \quad z \in \mathbb{C} \tag{2.9}$$

are real or pure imaginary.

Proof For c = 0 the result follows from Lemma A.6. If c > 0 from Lemma 2.4 we
have that for sufficiently large N equation (2.9) has 4N + 2 roots inside the square
with vertices (N + 1/2)(±π, ±iπ). On the other hand, from Lemmas A.1 and A.3
it follows that there are 4N + 2 real or pure imaginary roots inside the same square,
so the result follows. 

Lemma 2.3 For real nonnegative b and real c the roots of the equation

$$z \cot(z) + b = c(b^2 + z^2), \quad z \in \mathbb{C} \tag{2.10}$$

are real or pure imaginary.

Proof For $c = 0$ the result follows from Lemma A.6. Putting $a = 0$ in Lemma 2.4
we conclude that for every $c \neq 0$ and sufficiently large $N$ the equation

$$(z \cot(z) + b)^2 = c^2(b^2 + z^2)^2, \quad z \in \mathbb{C}$$

has 4N + 4 roots inside the square with vertices (N + 1/2)(±π, ±iπ). On the other
hand, from Lemmas A.2 and A.4 it follows that both equations

z cot(z) + b = ±c(b2 + z 2 )

have 2N + 2 real or pure imaginary roots inside the same square, whence the result
follows. 

In the next lemma2 we make repeated use of Rouché's theorem (e.g., Hille [6]
[Theorem 9.2.3]).
Lemma 2.4 Let $C_N$, $N \in \mathbb{N}$, denote the square in the complex plane with vertices at
$(N + \frac{1}{2})(\pm\pi, \pm i\pi)$. Then for real $a$, nonnegative $b$, $c$, and $d = 1, 2$ there exists
$N_0 \in \mathbb{N}$ such that for every integer $N > N_0$ the equation

$$(z \cot(z) + b - ac)^2 = c^2(a^2 + b^2 + z^2)^d, \quad z \in \mathbb{C} \tag{2.11}$$

has $4N + 2d$ roots inside $C_N$.


Proof Consider the case $d = 1$, $c > 1$ and the case $d = 2$, $c > 0$ together. On the
right vertical side of $C_N$ we have

$$|\cot(z)| = \left|\cot\left(\frac{\pi}{2} + N\pi + iy\right)\right| = |\tan(iy)| = \frac{|e^y - e^{-y}|}{e^y + e^{-y}} < 1, \tag{2.12}$$

while on the upper horizontal side we have

$$|\cot(z)| = \left|\frac{e^{2iz} + 1}{e^{2iz} - 1}\right| = \left|\frac{1 + e^{-(2N+1)\pi} e^{2ix}}{1 - e^{-(2N+1)\pi} e^{2ix}}\right| \le \frac{1 + e^{-(2N+1)\pi}}{1 - e^{-(2N+1)\pi}}.$$

Together with (2.12) and the fact that $|\cot(z)| = |\cot(-z)|$ this implies

$$|\cot(z)| \le \frac{1 + e^{-(2N+1)\pi}}{1 - e^{-(2N+1)\pi}} =: k_N, \quad z \in C_N.$$

For $z \in C_N$ we have

$$\frac{|(z \cot(z) + b - ac)^2|}{|c^2(a^2 + b^2 + z^2)^d|} \le \frac{(|z \cot(z)| + |b - ac|)^2}{|c^2(a^2 + b^2 + z^2)^d|} \le \left(\frac{k_N}{c} + \frac{|b - ac|}{|cz|}\right)^2 \frac{|z|^2}{|a^2 + b^2 + z^2|^d}.$$

Since $\lim_{N\to\infty} k_N = 1$, the last expression tends to $(2 - d)/c^2 < 1$ uniformly in $z$ as
$N \to \infty$, so for sufficiently large $N$ we have

$$|(z \cot(z) + b - ac)^2| < |c^2(a^2 + b^2 + z^2)^d|, \quad z \in C_N.$$

Therefore, by Rouché's theorem the number of roots of (2.11) inside $C_N$ is equal
to the number of poles of $z \mapsto (z \cot(z) + b - ac)^2 - c^2(a^2 + b^2 + z^2)^d$ inside
$C_N$ plus the number of zeros of $z \mapsto c^2(a^2 + b^2 + z^2)^d$ inside $C_N$ (considering
their multiplicities). For sufficiently large $N$ those two numbers are $4N$ and $2d$
respectively, whence the equation (2.11) has $4N + 2d$ roots inside $C_N$.

2 A weaker version of this result (dealing with the case of real roots only) appears as Problem E1295
in American Mathematical Monthly, Vol. 65, No. 6, p. 450.

Consider now $d = 1$, $0 < c < 1$. Let $D_N$ be the square with vertices at $(\pm N\pi, \pm Ni\pi)$,
and let $D_N^\varepsilon$ denote $D_N$ extended with semicircles of radius $\varepsilon$ so that the poles of
$\cot(z)$ at $\pm N\pi$ are inside $D_N^\varepsilon$, but the real zeros of (2.11) in $(N\pi, (N + 1/2)\pi)$
and $(-(N + 1/2)\pi, -N\pi)$ described in Lemma A.1 remain outside. For ease of
exposition in what follows we make $\varepsilon$ smaller if necessary, which can be done without
invalidating previously established statements.
Similarly as before, on the right vertical side of $D_N$ we have

$$|\tan(z)| = |\tan(N\pi + iy)| = |\tan(iy)| = \frac{|e^y - e^{-y}|}{e^y + e^{-y}} < 1, \tag{2.13}$$

while on the upper horizontal side we have

$$|\tan(z)| = \left|\frac{e^{2iz} - 1}{e^{2iz} + 1}\right| = \left|\frac{1 - e^{-2N\pi} e^{2ix}}{1 + e^{-2N\pi} e^{2ix}}\right| \le \frac{1 + e^{-2N\pi}}{1 - e^{-2N\pi}}. \tag{2.14}$$

Together with (2.13) and the fact that $|\cot(z)| = |\cot(-z)|$ this implies that for
sufficiently small $\varepsilon > 0$

$$|\cot(z)| \ge \frac{1 - e^{-2N\pi}}{1 + e^{-2N\pi}} =: k_N, \quad z \in D_N^\varepsilon. \tag{2.15}$$

On $D_N^\varepsilon$ we have

$$\frac{|c^2(a^2 + b^2 + z^2)|}{|(z \cot(z) + b - ac)^2|} \le \frac{c^2}{\left(|\cot(z)| - \left|\frac{b - ac}{z}\right|\right)^2} + \frac{|c^2(a^2 + b^2)|}{\left(|z||\cot(z)| - |b - ac|\right)^2},$$

so for $N$ large enough

$$\frac{|c^2(a^2 + b^2 + z^2)|}{|(z \cot(z) + b - ac)^2|} \le \frac{c^2}{\left(k_N - \left|\frac{b - ac}{z}\right|\right)^2} + \frac{|c^2(a^2 + b^2)|}{\left(|z| k_N - |b - ac|\right)^2}.$$

Since $\lim_{N\to\infty} k_N = 1$, the last expression tends to $c^2 < 1$ uniformly in $z$ as $N \to \infty$,
so for sufficiently large $N$ we have

$$|c^2(a^2 + b^2 + z^2)| < |(z \cot(z) + b - ac)^2|, \quad z \in D_N^\varepsilon.$$

Therefore, by Rouché's theorem the number of roots of (2.11) inside $D_N^\varepsilon$ is equal
to the number of poles of $z \mapsto (z \cot(z) + b - ac)^2 - c^2(a^2 + b^2 + z^2)$ inside $D_N^\varepsilon$
plus the number of zeros minus the number of poles of $z \mapsto (z \cot(z) + b - ac)^2$
inside $D_N^\varepsilon$ (considering their multiplicities). The two mappings have common poles,
so we are left with the number of zeros of the second mapping, which for sufficiently
small $\varepsilon$, according to Lemma A.6, is $4N$. Therefore, from Lemma A.5, and taking
into account two real zeros in $(-(N + 1/2)\pi, -N\pi) \cup (N\pi, (N + 1/2)\pi)$ whose
existence is established in Lemma A.1, we conclude that for $d = 1$, $0 < c < 1$ and
sufficiently large $N$ there are $4N + 2$ zeros inside $C_N$.
Finally, consider the case $c = 1$, $d = 1$. Put $\alpha = b - ac$, $\beta^2 = a^2 + b^2$, so that
we get

$$\frac{\cos(2z)}{\sin^2(z)}\, z^2 + \alpha z\, \frac{\sin(2z)}{\sin^2(z)} - (\beta^2 - \alpha^2) = 0, \tag{2.16}$$

or, equivalently,

$$2 \cot(2z) \cot(z) z^2 + 2\alpha z \cot(z) - (\beta^2 - \alpha^2) = 0.$$

On $D_N^\varepsilon$ we have

$$\frac{|2\alpha z \cot(z) - (\beta^2 - \alpha^2)|}{|2 \cot(2z) \cot(z) z^2|} \le \frac{\left|2\alpha - \frac{\beta^2 - \alpha^2}{z \cot(z)}\right|}{2|z||\cot(2z)|} \le \frac{|2\alpha| + \frac{|\beta^2 - \alpha^2|}{|z| k_N}}{2|z| k_{2N}}.$$

Since $\lim_{N\to\infty} k_N = 1$, the last expression tends to zero uniformly in $z$ as $N \to \infty$,
so for sufficiently large $N$ we obtain

$$|2\alpha z \cot(z) - (\beta^2 - \alpha^2)| < |2 \cot(2z) \cot(z) z^2|, \quad z \in D_N^\varepsilon,$$

that is,

$$\left|\frac{\cos(2z)}{\sin^2(z)}\, z^2\right| > \left|\alpha z\, \frac{\sin(2z)}{\sin^2(z)} - (\beta^2 - \alpha^2)\right|, \quad z \in D_N^\varepsilon.$$

Thus, by Rouché's theorem the number of roots of (2.16) inside $D_N^\varepsilon$
equals the number of zeros of $z \mapsto \frac{\cos(2z)}{\sin^2(z)}\, z^2$ inside $D_N^\varepsilon$, which is $4N$. Therefore,
reasoning as in the previous part of the proof we conclude that for $c = 1$ we have
$4N + 2$ zeros of (2.11) inside $C_N$ for $N$ large enough.

Acknowledgments I wish to thank Tomislav Šekara of University of Belgrade and the anonymous
referee for their comments and suggestions.

Addendum

The first version of this paper appeared on SSRN in 2007 (following the author’s
investigation into applicability of the Talbot’s numerical inversion method in trans-
form analysis of option prices). Since then several publications have appeared using
the main result of the present work, which we list below for completeness.

Based on Theorem 2.1, in Ferreiro-Castilla [2] and del Baño Rollin et al. [1] a
smoothness result for the density of the log-spot in the Heston model is presented,
together with an alternative proof of our main result. Theorem 2.1 was also used in
Friz et al. [3] in the study of the asymptotic behaviour of the stock price density in
the negatively correlated Heston model. Finally, Lemma 6.1 from Gulisashvili et al.
[4], used in the study of the asymptotic behaviour of the mixing distribution density
in the uncorrelated Heston model, is quite close in spirit to the results presented here.

Appendix

Lemma A.1 For N sufficiently large, equation (2.9) has 4N −2 real roots in (−(N +
1/2)π, −π) ∪ (π, (N + 1/2)π).

Proof By Lemma A.5 for every $N > 1$ equation (2.9) has two real roots in each of
the intervals $(-(k + 1)\pi, -k\pi)$ and $(k\pi, (k + 1)\pi)$, $k = 1, 2, \dots, N - 1$.
Rewrite (2.9) as

$$z \cot(z) = -(b - ac) \pm c\sqrt{a^2 + b^2 + z^2}. \tag{A.1}$$

For $N > 0$

$$\lim_{z \to N\pi^+} z \cot(z) = +\infty, \qquad z \cot(z)\big|_{z = N\pi + \pi/2} = 0, \tag{A.2}$$

so we conclude that for sufficiently large $N$ the equation with plus sign has one real
root in $(N\pi, (N + 1/2)\pi)$, hence by symmetry in $(-(N + 1/2)\pi, -N\pi)$.

Lemma A.2 For N sufficiently large equation (2.10) has 2N real roots in (−(N +
1/2)π, −π) ∪ (π, (N + 1/2)π) if c > 0, and 2N − 2 real roots if c < 0.

Proof By Lemma A.5 for every N > 1 equation (2.10) has one real root in each of
the intervals (−(k + 1)π, −kπ) and (kπ, (k + 1)π), k = 1, 2, . . . N − 1.
Rewrite (2.10) as
z cot(z) = −b + c(b2 + z 2 ). (A.3)

From (A.2) and (A.3) we conclude that if c > 0 for sufficiently large N equa-
tion (2.10) has one real root in (N π, (N + 1/2)π), hence by symmetry in (−(N +
1/2)π, −N π). 

Lemma A.3 For real a, nonnegative b, and c > 0 equation (2.9) has either four
real roots in (−π, π), or two real roots in (−π, π) and two imaginary roots.

Proof The proof follows by simple geometrical considerations. For $z = 0$ the right-
hand side of (A.1) assumes two values

$$\alpha_1 := -(b - ac) + c\sqrt{a^2 + b^2}, \qquad \alpha_2 := -(b - ac) - c\sqrt{a^2 + b^2}.$$

Since $ac - c\sqrt{b^2 + a^2} \le 0$ we have $\alpha_2 \le 0$. On the other hand, the function
$x \mapsto x \cot(x)$ equals one at the origin and strictly decreases on $[0, \pi)$, with a discontinuity
of the second kind at $\pi$. Thus, (A.1) has one real root corresponding to the intersection
of $x \mapsto x \cot(x)$ and $x \mapsto -(b - ac) - c\sqrt{a^2 + b^2 + x^2}$ on $(0, \pi)$.
If $\alpha_1 < 1$ following the same argument we conclude that there is another real root
in $(0, \pi)$ corresponding to the intersection of $x \mapsto x \cot(x)$ and
$x \mapsto -(b - ac) + c\sqrt{a^2 + b^2 + x^2}$. If $\alpha_1 = 1$ we have a double root at zero.
Thus, based on the above considerations and the symmetry around the origin it
follows that in $(-\pi, \pi)$ equation (A.1) has four real roots if $\alpha_1 \le 1$, and two real roots
if $\alpha_1 > 1$. Therefore, to complete the proof we show that (A.1) has two imaginary
roots if $\alpha_1 > 1$.
Put $z = iy$, $y \in \mathbb{R}$ in (A.1) to get

$$y \coth(y) = -(b - ac) \pm c\sqrt{a^2 + b^2 - y^2}. \tag{A.4}$$

On the left-hand side we have a continuous function equal to one at the origin that
tends to infinity as $y$ increases. Note that $\alpha_1 > 1$ implies $a^2 + b^2 > 0$. Thus, on
the right-hand side we have a semi-circle starting at $(0, \alpha_1)$ on the ordinate, entering
into the right half-plane, and ending at $(0, \alpha_2)$ on the ordinate, half-encircling the
point $(0, 1)$ (as $\alpha_1 > 1$ and $\alpha_2 \le 0$). Therefore, there must exist $y_0 > 0$ for which
the equality holds in (A.4). Since $-y_0$ also solves (A.4), we have two imaginary
solutions.

Lemma A.4 Assume b ≥ 0. For c > 0 equation (2.10) has either two real roots
in (−π, π) or two imaginary roots. If c < 0 equation (2.10) has two real roots in
(−π, π) and two imaginary roots.

Proof At $z = 0$ the right-hand side of (A.3) equals $-b + cb^2$. The function
$x \mapsto x \cot(x)$ equals one at the origin and strictly decreases on $[0, \pi)$, with a discontinuity of
the second kind at $\pi$. Thus, if $c < 0$, or $c > 0$ and $-b + cb^2 < 1$, there is one real
root in $(0, \pi)$, hence by symmetry in $(-\pi, 0)$. If $-b + cb^2 = 1$ we have a double
root at the origin. Next, with $z = iy$, $y \in \mathbb{R}$, equation (2.10) becomes

y coth(y) = −b + c(b2 − y 2 ). (A.5)

Therefore, if c > 0 and −b+cb2 > 1 the right-hand side dominates the left-hand side
at the origin, while the opposite is true for sufficiently large y. From the continuity of
the two functions it then follows that (A.5) has one positive root, hence by symmetry
one negative root. Finally, if c < 0 the left-hand side dominates the right-hand side at
the origin, while the opposite is true for sufficiently large y, giving a pair of imaginary
roots. 

Lemma A.5 For every positive integer k equation (2.11) has two real roots in each
of the intervals (−(k + 1)π, −kπ) and (kπ, (k + 1)π).

Proof The result follows from the fact that on each of those intervals the range of
the map x → x cot(x) is the whole real line, while the maps x → −b + ac ± c(a 2 +
b2 + x 2 )d/2 are bounded. 

Lemma A.6 For a ∈ R the equation

$$z \cot(z) = a, \quad z \in \mathbb{C} \tag{A.6}$$

has 2N roots inside the square with vertices (±N π, ±N iπ). The roots are real or
pure imaginary.

Proof For $a = 0$ the roots are the zeros of $\cos(z)$. If $a \neq 0$, from (2.13) and (2.14)
we conclude that for sufficiently large $N$

$$|a \tan(z)| < |z|, \quad z \in D_N.$$

Thus, by Rouché's theorem

$$z = a \tan(z)$$

has $2N + 1$ roots inside the square with vertices $(\pm N\pi, \pm Ni\pi)$. If $k > 0$ it has two
real roots in $(-(k + 1)\pi/2, -k\pi/2) \cup (k\pi/2, (k + 1)\pi/2)$ if either $a > 0$ and $k$ is
even, or $a < 0$ and $k$ is odd.
On the other hand, for $a > 0$ there are three roots in $(-\pi, \pi)$ (counting their multiplicities)
if $0 < a \le 1$ and one root if $a > 1$. In the latter case there are two imaginary roots
(cf. example on p. 255 of Hille [6]). Since (A.6) has one root less at the origin, the
result follows.
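
The dichotomy between real and pure imaginary roots of (A.6) can be observed numerically. The sketch below is an illustration only, with hypothetical helper names and a hand-written bisection. It uses two elementary facts: $x \cot(x)$ decreases from 1 to $-\infty$ on $(0, \pi)$, so a real root in $(0, \pi)$ exists when $a < 1$; and with $z = iy$ equation (A.6) becomes $y \coth(y) = a$, whose left-hand side is at least 1, so a nonzero imaginary root exists when $a > 1$.

```python
import cmath
import math


def bisect(func, lo, hi, tol=1e-13):
    """Bisection for a sign change of func on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def z_cot_z(z):
    """Left-hand side of (A.6) for complex z."""
    return z * cmath.cos(z) / cmath.sin(z)


def real_root(a):
    """Root of x cot(x) = a in (0, pi); assumes a < 1."""
    return bisect(lambda x: x / math.tan(x) - a, 1e-9, math.pi - 1e-9)


def imaginary_root(a):
    """Returns z = i*y with y > 0 and y coth(y) = a; assumes a > 1."""
    # y coth(y) > y, so the function exceeds a at y = a + 1.
    y = bisect(lambda t: t / math.tanh(t) - a, 1e-9, a + 1.0)
    return 1j * y
```

For instance, `real_root(0.5)` returns a point in $(0, \pi)$ and `imaginary_root(2.0)` returns a point on the positive imaginary axis; by symmetry the negatives of both are roots as well.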

References

1. del Baño Rollin, S., Ferreiro-Castilla, A., Utzet, F.: On the density of log-spot in the Heston
volatility model. Stoch. Process. Appl. 120, 2037–2063 (2010)
2. Ferreiro-Castilla, A.: Stochastic Calculus and Analytic Characteristic Functions: Applications
to Finance. Ph.D. thesis, Universitat Autònoma de Barcelona (2011)
3. Friz, P., Gerhold, S., Gulisashvili, A., Sturm, S.: On refined volatility smile expansion in the
Heston model. Quant. Financ. 11, 1151–1164 (2011)
4. Gulisashvili, A., Stein, E.M.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61, 287–315 (2010)
5. Heston, S.L.: A closed-form solution for options with stochastic volatility with applications to
bond and currency options. Rev. Financ. Stud. 6(2), 327–343 (1993)
6. Hille, E.: Analytic Function Theory, vol. 1. Blaisdell, New York (1965)
7. Lewis, A.L.: Option Valuation Under Stochastic Volatility. Finance Press, Newport Beach (2000)
8. Lukacs, E.: Characteristic Functions. Charles Griffin & Co., London (1970)
9. Widder, D.V.: The Laplace Transform. Princeton University Press, Princeton (1946)
On the Probability Density Function
of Baskets

Christian Bayer, Peter K. Friz and Peter Laurence

Abstract The state price density of a basket, even under uncorrelated Black–Scholes
dynamics, does not allow for a closed form density. (This may be rephrased as a
statement on the sum of lognormals and is especially annoying since such sums are used most
frequently in Financial and Actuarial Mathematics.) In this note we discuss short
time and small volatility expansions, respectively. The method works for general
multi-factor models with correlations and leads to the analysis of a system of
ordinary (Hamiltonian) differential equations. Surprisingly perhaps, even in the two asset
Black–Scholes situation (with its flat geometry), the expansion can degenerate at
a critical (basket) strike level; a phenomenon which seems to have gone unnoticed
in the literature to date. Explicit computations relate this to a phase transition from
a unique to more than one “most-likely” paths (along which the diffusion, if
suitably conditioned, concentrates in the afore-mentioned regimes). This also provides a
(quantifiable) understanding of how precisely a presently out-of-money basket option
may still end up in-the-money.

Keywords Sums of lognormals · Focality · Pricing of butterfly spreads on baskets

C. Bayer
Weierstrass Institute, Mohrenstrasse 39, 10117 Berlin, Germany
e-mail: [email protected]
P.K. Friz (B)
Institut für Mathematik, Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]
P.K. Friz
Weierstraß-Institut für Angewandte Analysis und Stochastik, Berlin, Germany
P. Laurence
Dipartimento di Matematica, Università di Roma 1 Piazzale Aldo Moro 2,
00185 Rome, Italy
P. Laurence
Courant Institute of Mathematical Sciences, New York University,
251 Mercer Street, New York, NY 10012, USA

© Springer International Publishing Switzerland 2015 449


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_16

1 Introduction

As is well known, the sum of independent log-normal variables does not admit a
closed-form density. And yet, there are countless applications in Finance and
Actuarial Mathematics where such sums play a crucial role; consider for instance the law
of a Black–Scholes basket B at time T, i.e. the weighted average of d geometric
Brownian motions.
As a consequence, there is a natural interest in approximations and expansions,
see e.g. [9] and the references therein. This article contains a detailed investigation
in small volatility and short time regimes. Forthcoming work of A. Gulisashvili
and P. Tankov [12] deals with tail asymptotics. Our methods are not restricted to
the geometric Brownian motion case: in principle, each Black–Scholes component
could be replaced by the asset price in a stochastic volatility model, such as the
Stein–Stein model [16], with full correlation between all assets and their volatilities.
In the end, explicit solutions only depend on the analytical tractability of a system of
ordinary differential equations. If such tractability is not given, one can still proceed
with numerical ODE solvers.
As a matter of fact, our aim here is not to push the generality in which our methods
work: one can and should expect involved answers in complicated models. Rather, our
main (and somewhat surprising) insight is that unexpected phenomena are already
present in the simplest possible setting: to this end, our first focus will be on the
case of $d = 2$ independent Black–Scholes assets (without drift and correlation, with
unit spot and unit volatility). To be more specific, if $C_B$ denotes the fair value of an
(out-of-money) call option on the basket $B$ struck at $K$, one naturally expects, for a
small maturity $T$,

$$\frac{\partial^2}{\partial K^2} C_B(K, T) \sim (\text{const})\, \frac{1}{\sqrt{T}} \exp\left(-\frac{\Lambda(K)}{T}\right).$$

And yet, while true for most strikes, it fails for $K = K^*$; in fact,

$$\left.\frac{\partial^2}{\partial K^2} C_B(K, T)\right|_{K = K^*} \sim (\text{const})\, \frac{1}{T^{3/4}} \exp\left(-\frac{\Lambda(K^*)}{T}\right).$$

To the best of our knowledge, and despite the seeming triviality of the situation (two
independent Black–Scholes assets!), the existence of a “special” strike level K ∗ , at
which the value of a basket option (here: butterfly spread1 ) has a “special” decay
behavior, as maturity approaches 0, seems to be new. There are different proofs
of this fact; the most elementary argument—based on the analysis of a convolu-
tion integral—is given in Sect. 2. However, this approach—while telling us what
happens—does not tell us how it happens.
1 Extensions to spreads and vanilla options are possible and will be discussed elsewhere.

The main contribution of this note is precisely a good understanding of the latter.
In fact, there is a clear picture that comes with $K^*$. For $K < K^*$ and conditional on the
option to expire on the money, there is a unique “most likely” path around which the
underlying asset price process will concentrate as maturity approaches 0. For K >
K ∗ , however, this ceases to be true: there will be two distinct (here: equally likely)
paths around which concentration occurs. What underlies this interpretation is that
large deviation theory not only characterizes the probability of unlikely events (such
as expiration in-the-money, if presently out-of-the-money, as time to maturity goes to
zero) but also the mechanism via which these events can occur. Such understanding
was already crucial in previous works on baskets aiming at quantification of basket
(implied vol) skew relative to its components, starting with [1, 2]. As a matter of
fact, the analysis in these papers relied on the statement that “generically there is
a unique arrival point (of a unique energy minimizing path) on the (basket-strike)
arrival manifold”. The situation, however, even in the Black–Scholes model, is more
involved. And indeed, we shall establish existence of a critical strike K ∗ , at which
one sees the phase-transition from one to two energy minimizing, “most likely”,
paths.2 And this information will have meaning to traders (as long as they believe in
a diffusion model as maturity approaches 0, which may or may not be a good idea
…) as it tells them the possible scenarios in which an out-of-the money basket option
may still expire in the money.
Let us conclude this introduction with a few technical notes. We view the evolution
of the basket price—even in the Black–Scholes model—as a stochastic volatility
evolution model; by which we mean d Bt /Bt = σ (t, ω)dWt (as opposed to a local
vol evolution where σ = σ (t, Bt )). This should explain why the methods developed
in Part I of [6, 7] for the analysis of stochastic volatility models (then used in Part
II, [7], to solve the concrete smile problem (shape of the wings) for the correlated
Stein–Stein model), are also adequate for the analysis of baskets.

2 Computations Based on Saddle-Point Method


 
In terms of a standard $d$-dimensional Wiener process $(W^1, \dots, W^d)$,

$$B_T = \sum_{i=1}^{d} S_0^i \exp\left(\mu_i T + \sigma_i W_T^i\right).$$

Write $f = f_T(K)$ for the probability density function of $B_T$, i.e. for $P[B_T \in [K, K + dK]]/dK$. Of course, it is given by some $(d-1)$-dimensional convolution
integral; explicit asymptotic expansions are, in principle, possible with the

2 It can be shown that, sufficiently close to the arrival manifold, there is in fact a unique energy
minimizing path. The (near-the-money) analysis of [1, 2] is then justified.
452 C. Bayer et al.

saddle point method. It will be enough for our purposes to illustrate the method
in the afore-mentioned simplest possible setting:

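$$d = 2, \quad S_0^1 = S_0^2 = 1, \quad \mu_1 = \mu_2 = 0, \quad \sigma_1 = \sigma_2 = 1.$$

In other words, $B_T = \exp(W_T^1) + \exp(W_T^2)$. We claim that for some constant
$c_0 = c_0(K) > 0$

$$f(K) = \begin{cases} \exp\left(-\dfrac{\Lambda(K)}{T}\right) \dfrac{1}{\sqrt{T}} \left(c_0 + O(T)\right), & \text{when } K \neq K^*, \quad \text{(1a)} \\[2mm] \exp\left(-\dfrac{\Lambda(K^*)}{T}\right) \dfrac{1}{T^{3/4}} \left(c_0 + O(T)\right), & \text{when } K = K^*, \quad \text{(1b)} \end{cases} \tag{1}$$

with

$$K^* = 2e \approx 5.43656$$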

and

$$\Lambda(K) = \tfrac{1}{2} \inf\{h_K(x) \mid x \in [0, K]\}$$

with

$$h_K(x) := (\log x)^2 + (\log(K - x))^2. \tag{2}$$

Note that for $K \le K^*$ we can explicitly solve this minimization problem and obtain
$\Lambda(K) = \log(K/2)^2$ with corresponding minimizer $x^* = K/2$, which is
the single local extremum of $h_K$. For $K > K^*$, we have two global minima, which
cannot be given in closed form, and hence $\Lambda(K)$ can only be computed numerically
(Fig. 1).

[Figure 1: three panels (a), (b), (c)]

Fig. 1 Plot of $h_K$ for different choices of $K$. (a) For $K < K^*$ there is a unique global minimum at
$x^* = K/2$ which is non-degenerate in the sense that $h_K''(x^*) > 0$. (b) For $K = K^*$ there is a unique
global minimum at $x^* = K/2$ which is degenerate in the sense that $h_K''(x^*) = 0$. (c) For $K > K^*$,
$x = K/2$ gives a local maximum. There are two symmetric global minimizers, which are not given
in closed form
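The numerical minimization mentioned for $K > K^*$ can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `h` and `minimize_h`, the grid size and the golden-section refinement are all choices made here. The sketch assumes the normalization $\Lambda(K) = \tfrac12 \min h_K$, which is what the exponent $h_K/(2T)$ in the integrand and the closed form $\Lambda(K) = \log(K/2)^2$ for $K \le K^*$ suggest.

```python
import math

def h(K, x):
    """h_K(x) = (log x)^2 + (log(K - x))^2 on (0, K), cf. (2)."""
    return math.log(x) ** 2 + math.log(K - x) ** 2

def minimize_h(K, n_grid=2000, iters=100):
    """Return (x_min, Lambda) with x_min <= K/2; h_K is symmetric about K/2.

    A coarse grid search on (0, K/2] is refined by golden-section search.
    Lambda is taken as h_K(x_min) / 2 (assumed normalization, see lead-in).
    """
    half = K / 2.0
    grid = [half * (i + 1) / n_grid for i in range(n_grid)]
    i_best = min(range(n_grid), key=lambda i: h(K, grid[i]))
    lo = grid[max(i_best - 1, 0)]
    hi = grid[min(i_best + 1, n_grid - 1)]
    g = (math.sqrt(5.0) - 1.0) / 2.0  # golden ratio conjugate
    for _ in range(iters):
        a = hi - g * (hi - lo)
        b = lo + g * (hi - lo)
        if h(K, a) < h(K, b):
            hi = b
        else:
            lo = a
    x_min = 0.5 * (lo + hi)
    return x_min, 0.5 * h(K, x_min)

if __name__ == "__main__":
    for K in (4.0, 2.0 * math.e, 8.0):
        x_min, lam = minimize_h(K)
        print(K, x_min, lam, math.log(K / 2.0) ** 2)
```

Below the critical strike the search returns $x^* = K/2$; above it the minimizer moves away from $K/2$ and the energy drops below $\log(K/2)^2$, with the mirror point $K - x^*$ giving the second global minimum.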
On the Probability Density Function of Baskets 453

The stock price $S_T^i$ has a log-normal distribution with parameters $\mu_i = 0$ and
$\xi_i = \sigma_i \sqrt{T} = \sqrt{T}$, where the density of the log-normal distribution is given by

$$f_{\mu,\xi}(x) = \frac{1}{\sqrt{2\pi}\,\xi x} \exp\left(-\frac{(\log x - \mu)^2}{2\xi^2}\right). \tag{3}$$

Obviously, the density of the sum of these two independent log-normal random
variables satisfies

$$f(K) = \int_0^K f_{\mu_1,\xi_1}(K - x)\, f_{\mu_2,\xi_2}(x)\, dx. \tag{4}$$

Using our special parameters, the integrand is of the form

$$f_{\mu_1,\xi_1}(K - x)\, f_{\mu_2,\xi_2}(x) = \frac{1}{2\pi T x (K - x)} \exp\left(-\frac{h_K(x)}{2T}\right).$$

In order to apply the Laplace approximation to (4), we compute the minimizer for
$h_K$, which is found by the first order condition

$$h_K'(x) = 0 \iff \frac{\log x}{x} - \frac{\log(K - x)}{K - x} = 0. \tag{5}$$

Clearly, this equation is solved by choosing $x^* = K/2$, which is the unique global
minimizer iff $K \le 2e$ and a local maximizer otherwise, in which case we have two
global minima $x_1^* < K/2 < x_2^*$. Assuming $K \le 2e$, we can check degeneracy of
that minimum directly by computing

$$h_K''(x^*) = h_K''(K/2) = 16\, \frac{1 - \log(K/2)}{K^2}$$

and

$$h_K''(x^*) = 0 \iff K = 2e. \tag{6}$$

With more work one can see that also the global minima $x_1^*, x_2^*$, in the case $K > 2e$,
are non-degenerate. Hence, whenever $K \neq 2e$ a standard Laplace method leads to
the expansion (1)a. In the remainder of this section, we consider the degenerate case
and establish (1)b.

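Choosing $K = 2e$ and, correspondingly, $x^* = e$, we obtain the Taylor expansion

$$h_K(x) = h_K(x^*) + \frac{h_K^{(4)}(x^*)}{24} (x - x^*)^4 + O((x - x^*)^5).$$

With $h_K(x^*) = 2$ and $h_K^{(4)}(x^*) = 20e^{-4}$, we obtain the Laplace approximation

$$\begin{aligned} f(K) &= \int_0^K \frac{1}{2\pi T (K - x) x} \exp\left(-\frac{h_K(x)}{2T}\right) dx \\ &= \frac{1}{2\pi T e^2} \exp\left(-\frac{1}{T}\right) \int_0^K \exp\left(-\frac{5e^{-4} (x - K/2)^4}{12T}\right) dx \, (1 + O(T)) \\ &= \frac{3^{1/4}\, \Gamma(1/4)}{5^{1/4}\, 2\sqrt{2}\, \pi e} \exp\left(-\frac{1}{T}\right) \frac{1}{T^{3/4}} \, (1 + O(T)), \end{aligned}$$

where we used

$$\int_{-\infty}^{\infty} \exp(-\alpha x^4)\, dx = \frac{\Gamma(1/4)}{2\alpha^{1/4}}, \quad \alpha > 0.$$

Thus, we arrive at (1)b.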
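The degenerate expansion can be checked numerically. The following is a rough sketch, assuming a plain composite Simpson rule is accurate enough for the convolution integral (4) at $K = 2e$; the names `scaled_density` and `leading_term`, the endpoint cut-off and the grid size are illustrative choices, not part of the original text. The common factor $\exp(-1/T)$ is divided out on both sides to avoid floating-point underflow.

```python
import math

K_STAR = 2.0 * math.e

def scaled_density(T, n=20000):
    """Simpson approximation of exp(1/T) * f(K*) based on the integral (4)."""
    a, b = 1e-9, K_STAR - 1e-9  # stay off the endpoints, where log diverges
    step = (b - a) / n          # n must be even for Simpson's rule

    def integrand(x):
        h_k = math.log(x) ** 2 + math.log(K_STAR - x) ** 2
        # h_K(x*) = 2 at K = 2e, so (h_k - 2) / (2T) is the rescaled exponent
        weight = math.exp(-(h_k - 2.0) / (2.0 * T))
        return weight / (2.0 * math.pi * T * x * (K_STAR - x))

    total = integrand(a) + integrand(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(a + i * step)
    return total * step / 3.0

def leading_term(T):
    """exp(1/T) times the leading term of (1)b, with the constant derived above."""
    c = 3 ** 0.25 * math.gamma(0.25) / (
        5 ** 0.25 * 2.0 * math.sqrt(2.0) * math.pi * math.e)
    return c * T ** -0.75

if __name__ == "__main__":
    for T in (0.04, 0.01, 0.0025):
        print(T, scaled_density(T) / leading_term(T))
```

The printed ratios should approach 1 as $T$ decreases, which is one way to see the $T^{-3/4}$ prefactor at the critical strike.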

3 Large Deviations Approach

Our main tools here are the novel marginal density expansions in the small-noise regime [6].
These were used to compute the large-strike behavior of implied volatility in
the correlated Stein–Stein model; [11, 16].3
In fact, the technical assumptions of [6] were satisfied in the analysis of the Stein–Stein
model, whereas in the (seemingly) trivial case of two IID Black–Scholes assets,
the technical assumptions of [6] are indeed violated for a critical strike $K = K^*$.
The necessity of this condition is then highlighted by the fact, as was seen in the
previous section, that

$$\left.\frac{\partial^2}{\partial K^2} C_B(K, T)\right|_{K = K^*} \nsim (\text{const}) \exp\left(-\frac{\Lambda(K^*)}{T}\right) \frac{1}{T^{1/2}}.$$

The computation of $K^*$ can be achieved either via a geometric construction borrowed
from Riemannian geometry, which relies on the Weingarten map, or by some (fairly)
elementary analysis of a system of Hamiltonian ODEs. In fact, the Hamiltonian
point of view extends naturally when one introduces correlation, local and even
stochastic volatility. Explicit answers then depend on the analytical tractability of
these (boundary value) ODE problems. (Of course, the numerical solution of such
problems is well-known.)

3 Similar investigations have recently been conducted in the Heston model; [10, 13] and the
references therein.
 
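In the following, we review [6]. Consider a $d$-dimensional diffusion $(X_t^\varepsilon)_{t \ge 0}$ given
by the stochastic differential equation

$$dX_t^\varepsilon = b(\varepsilon, X_t^\varepsilon)\, dt + \varepsilon \sigma(X_t^\varepsilon)\, dW_t, \quad \text{with } X_0^\varepsilon = x_0^\varepsilon \in \mathbb{R}^d, \tag{7}$$

and where $W = (W^1, \dots, W^m)$ is an $m$-dimensional Brownian motion. Unless
otherwise stated, we assume $b : [0, 1) \times \mathbb{R}^d \to \mathbb{R}^d$, $\sigma = (\sigma_1, \dots, \sigma_m) : \mathbb{R}^d \to \mathrm{Lin}(\mathbb{R}^m, \mathbb{R}^d)$ and $x_0^\cdot : [0, 1) \to \mathbb{R}^d$ to be smooth, bounded with bounded derivatives
of all orders. Set $\sigma_0 = b(0, \cdot)$ and assume that, for every multi-index $\alpha$, the drift
vector fields $b(\varepsilon, \cdot)$ converge to $\sigma_0$ in the sense4

$$\partial_x^\alpha b(\varepsilon, \cdot) \to \partial_x^\alpha b(0, \cdot) = \partial_x^\alpha \sigma_0(\cdot) \quad \text{uniformly on compacts as } \varepsilon \downarrow 0. \tag{8}$$

We shall also assume that

$$\partial_\varepsilon b(\varepsilon, \cdot) \to \partial_\varepsilon b(0, \cdot) \quad \text{uniformly on compacts as } \varepsilon \downarrow 0 \tag{9}$$

and

$$x_0^\varepsilon = x_0 + \varepsilon \hat{x}_0 + o(\varepsilon) \quad \text{as } \varepsilon \downarrow 0. \tag{10}$$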

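Theorem 1 (Small noise) Let $(X^\varepsilon)$ be the solution process to

$$dX_t^\varepsilon = b(\varepsilon, X_t^\varepsilon)\, dt + \varepsilon \sigma(X_t^\varepsilon)\, dW_t, \quad \text{with } X_0^\varepsilon = x_0^\varepsilon \in \mathbb{R}^d.$$

Assume $b(\varepsilon, \cdot) \to \sigma_0(\cdot)$ in the sense of (8), (9), and $X_0^\varepsilon \equiv x_0^\varepsilon \to x_0$ as $\varepsilon \to 0$
in the sense of (10). Assume non-degeneracy of $\sigma$ in the sense that $\sigma.\sigma^T$ is strictly
positive definite everywhere in space.5 Fix $y \in \mathbb{R}^l$, $N_y := (y, \cdot)$ and let $\mathcal{K}_y$ be
the space of all $h \in H$, the Cameron–Martin space of absolutely continuous paths
with derivatives in $L^2([0, T], \mathbb{R}^m)$, s.t. the solution to

$$d\phi_t^h = \sigma_0(\phi_t^h)\, dt + \sum_{i=1}^{m} \sigma_i(\phi_t^h)\, dh_t^i, \quad \phi_0^h = x_0 \in \mathbb{R}^d$$

satisfies $\phi_T^h \in N_y$. In a neighborhood of $y$, assume smoothness of 6

$$\Lambda(y) = \inf\left\{\tfrac{1}{2} \|h\|_H^2 : h \in \mathcal{K}_y\right\}.$$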

4 If (7) is understood in Stratonovich sense, so that $dW$ is replaced by $\circ\, dW$, the drift vector field
$b(\varepsilon, \cdot)$ is changed to $\tilde{b}(\varepsilon, \cdot) = b(\varepsilon, \cdot) - (\varepsilon^2/2) \sum_{i=1}^{m} \sigma_i \cdot \partial \sigma_i$. In particular, $\sigma_0$ is also the limit of
$\tilde{b}(\varepsilon, \cdot)$ in the sense of (8).
5 This may be relaxed to a weak Hörmander condition with an explicit controllability condition.
6 If $\#\mathcal{K}_y^{\min} = 1$ smoothness of the energy can be shown and need not be assumed; [6]. Note also
that in our application to tail asymptotics, with $\theta$-scaling, $\theta \in \{1, 2\}$, the energy must be linear resp.
quadratic (by scaling) and hence smooth.

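Assume also (i) there are only finitely many minimizers, i.e. $\#\mathcal{K}_y^{\min} < \infty$ where

$$\mathcal{K}_y^{\min} := \left\{h_0 \in \mathcal{K}_y : \tfrac{1}{2} \|h_0\|_H^2 = \Lambda(y)\right\};$$

(ii) $x_0$ is non-focal for $N_y$ in the sense of [6]. (We shall review below how to check
this.) Then there exists $c_0 = c_0(x_0, y, T) > 0$ such that

$$Y_T^\varepsilon = \Pi_l X_T^\varepsilon = \left(X_T^{\varepsilon,1}, \dots, X_T^{\varepsilon,l}\right), \quad 1 \le l \le d,$$

admits a density with expansion

$$f^\varepsilon(y, T) = e^{-\frac{\Lambda(y)}{\varepsilon^2}}\, e^{\frac{\max\{\Lambda'(y) \cdot \hat{Y}_T(h_0)\, :\, h_0 \in \mathcal{K}_y^{\min}\}}{\varepsilon}}\, \varepsilon^{-l} \left(c_0 + O(\varepsilon)\right) \quad \text{as } \varepsilon \downarrow 0,$$

where $\Lambda'$ denotes the gradient of $\Lambda$.

Here $\hat{Y} = \hat{Y}(h_0) = (\hat{Y}^1, \dots, \hat{Y}^l)$ is the projection, $\hat{Y} = \Pi_l \hat{X}$, of the solution to
the following (ordinary) differential equation

$$d\hat{X}_t = \left(\partial_x b\left(0, \phi_t^{h_0}(x_0)\right) + \partial_x \sigma\left(\phi_t^{h_0}(x_0)\right) \dot{h}_0(t)\right) \hat{X}_t\, dt + \partial_\varepsilon b\left(0, \phi_t^{h_0}(x_0)\right) dt, \quad \hat{X}_0 = \hat{x}_0. \tag{11}$$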

Remark 2 (Localization) The assumptions on the coefficients $b$, $\sigma$ in Theorem 1
(smooth, bounded with bounded derivatives of all orders) are typical in this context
(cf. Ben Arous [3, 4] for instance) but rarely met in practical examples from finance.
This difficulty can be resolved by a suitable localization. For instance, as detailed in
[6], an estimate of the form

$$\lim_{R \to \infty} \limsup_{\varepsilon \to 0} \varepsilon^2 \log P[\tau_R \le T] = -\infty, \tag{12}$$

with $\tau_R := \inf\left\{t \in [0, T] : \sup_{s \in [0, t]} |X_s^\varepsilon| \ge R\right\}$, allows one to bypass the boundedness
assumptions.

3.1 Short Time Asymptotics

The reduction of short time expansions to small noise expansions by Brownian
scaling is classical. In the present context, we have the following statement, taken
from [6, Sect. 2.1].

Corollary 3 (Short time) Consider $dX_t = b(X_t)\, dt + \sigma(X_t)\, dW$, started at $X_0 = x_0 \in \mathbb{R}^d$, with $C^\infty$-bounded vector fields which are non-degenerate in the sense
that $\sigma.\sigma^T$ is strictly positive definite everywhere in space. Fix $y \in \mathbb{R}^l$, $N_y := (y, \cdot)$
and assume (i), (ii) as in Theorem 1. Let $f(t, \cdot) = f(t, y)$ be the density of $Y_t = (X_t^1, \dots, X_t^l)$. Then

$$f(t, y) \sim (\text{const})\, \frac{1}{t^{l/2}} \exp\left(-\frac{d^2(x_0, y)}{2t}\right) \quad \text{as } t \downarrow 0,$$

where $d(x_0, y)$ is the sub-Riemannian distance, based on $(\sigma_1, \dots, \sigma_m)$, from the
point $x_0$ to the affine subspace $N_y$.

3.2 Computational Aspects

We present here the mechanics of the actual computations, in the spirit of the Pontryagin
maximum principle (e.g. [15]). For details we refer to [6].

• The Hamiltonian. Based on the SDE (7), with diffusion vector fields $\sigma_1, \dots, \sigma_m$
and drift vector field $\sigma_0$ (in the $\varepsilon \to 0$ limit) we define the Hamiltonian

$$H(x, p) := \langle p, \sigma_0(x) \rangle + \frac{1}{2} \sum_{i=1}^{m} \langle p, \sigma_i(x) \rangle^2 = \langle p, \sigma_0(x) \rangle + \frac{1}{2} \left\langle p, (\sigma \sigma^T)(x)\, p \right\rangle.$$

Note that the driving Brownian motions $W^1, \dots, W^m$ were assumed to be independent.
Many stochastic models, notably in finance, are written in terms of
correlated Brownian motions, i.e. with a non-trivial correlation matrix $\Omega = (\omega_{i,j} : 1 \le i, j \le m)$, where $d\langle W^i, W^j \rangle_t = \omega_{i,j}\, dt$. The Hamiltonian then
becomes

$$H(x, p) = \langle p, \sigma_0(x) \rangle + \frac{1}{2} \left\langle p, (\sigma \Omega \sigma^T)(x)\, p \right\rangle. \tag{13}$$

• The Hamiltonian ODEs. The following system of ordinary differential equations,

$$\begin{pmatrix} \dot{x}(t) \\ \dot{p}(t) \end{pmatrix} = \begin{pmatrix} \partial_p H(x(t), p(t)) \\ -\partial_x H(x(t), p(t)) \end{pmatrix}, \tag{14}$$

gives rise to a solution flow, denoted by $H_{t \leftarrow 0}$, so that

$$H_{t \leftarrow 0}(x_0, p_0)$$

is the unique solution to the above ODE with initial data $(x_0, p_0)$. Our standing
(regularity) assumptions are more than enough to guarantee uniqueness and local
ODE existence. As in [5, p. 37], the vector field $(\partial_p H, -\partial_x H)$ is complete, i.e.,
one has global existence. It can be useful to start the flow backwards with time-$T$
terminal data, say $(x_T, p_T)$; we then write

$$H_{t \leftarrow T}(x_T, p_T)$$

for the unique solution to (14) with given time-$T$ terminal data. Of course,

$$H_{t \leftarrow T}\left(H_{T \leftarrow 0}(x_0, p_0)\right) = H_{t \leftarrow 0}(x_0, p_0).$$

• Solving the Hamiltonian ODEs as boundary value problem. Given the target
manifold $N_a = (a, \cdot)$, the analysis in [6] requires solving the Hamiltonian ODEs
(14) with mixed initial, terminal and transversality conditions,

$$x(0) = x_0 \in \mathbb{R}^d, \qquad x(T) = (y, \cdot) \in \mathbb{R}^l \oplus \mathbb{R}^{d-l}, \qquad p(T) = (\cdot, 0) \in \mathbb{R}^l \oplus \mathbb{R}^{d-l}. \tag{15}$$

Note that this is a $2d$-dimensional system of ordinary differential equations, subject
to $d + l + (d - l) = 2d$ conditions. In general, boundary problems for such ODEs
may have more than one, exactly one or no solution. In the present setting, there
will always be one or more than one solution. After all, we know by [6] that there
exists at least one minimizing control $h_0$, which can be reconstructed via the
solution of the Hamiltonian ODEs, as explained in the following step.
• Finding the minimizing controls. The Hamiltonian ODEs, as boundary value
problem, are effectively first order conditions (for minimality) and thus yield candidates
for the minimizing control $h_0 = h_0(\cdot)$, given by

$$\dot{h}_0 = \begin{pmatrix} \langle \sigma_1(x(\cdot)), p(\cdot) \rangle \\ \vdots \\ \langle \sigma_m(x(\cdot)), p(\cdot) \rangle \end{pmatrix}. \tag{16}$$

Each such candidate is indeed admissible in the sense $h_0 \in \mathcal{K}_a$ but may fail
to be a minimizer. We thus compute the energy $\tfrac{1}{2} \|h_0\|_H^2 = H(x_0, p_0)$ for each
candidate and identify those ("$h_0 \in \mathcal{K}_a^{\min}$") with minimal energy. The procedure
via Hamiltonian flows also yields a unique $p_0 = p_0(h_0)$. If $\sigma_0 = 0$, as in our
case, the energy is equal to $H(x_0, p_0)$, otherwise the formula is slightly more
complicated.
• Checking non-focality. By definition [6], $x_0$ is non-focal for $N = (y, \cdot)$ along
$h_0 \in \mathcal{K}_a^{\min}$ in the sense that, with $(x_T, p_T) := H_{T \leftarrow 0}(x_0, p_0(h_0)) \in T^* \mathbb{R}^d$,
   
$$\partial_{(z,q)}\Big|_{(z,q)=(0,0)}\, \pi H_{0 \leftarrow T}\left(x_T + \begin{pmatrix} 0 \\ z \end{pmatrix},\ p_T + (q, 0)\right)$$

is non-degenerate (as $d \times d$ matrix; here we think of $(z, q) \in \mathbb{R}^{d-l} \times \mathbb{R}^l \cong \mathbb{R}^d$
and recall that $\pi$ denotes the projection from $T^* \mathbb{R}^d$ onto $\mathbb{R}^d$; in coordinates
$\pi(x, p) = x$). Note that in the point-point setting, $x_T = y$ is fixed and only perturbations
of the arrival "velocity" $p_T$ (without restrictions, i.e. without transversality
condition) are considered. Non-degeneracy of the resulting map should then
be called non-conjugacy (between two points; here: $x_T$ and $x_0$). In the absence of
the drift vector field $\sigma_0$, this is consistent with the usual meaning of non-conjugacy;
after identifying tangent and cotangent space, $\partial_q|_{q=0}\, \pi H_{0 \leftarrow T}$ is precisely the differential
of the exponential map.
• The explicit marginal density expansion. We then have

$$f^\varepsilon(y, T) = e^{-c_1/\varepsilon^2}\, e^{c_2/\varepsilon}\, \varepsilon^{-l} \left(c_0 + O(\varepsilon)\right) \quad \text{as } \varepsilon \downarrow 0,$$

with $c_1 = \Lambda(y)$. The second-order exponential constant $c_2$ then requires the
solution of finitely many ($\#\mathcal{K}_a^{\min} < \infty$) auxiliary ODEs, cf. Theorem 1.

4 Analysis of the Black–Scholes Basket

For a general multi-dimensional Black–Scholes model, we have a Hamiltonian

$$H(x, p) = \frac{1}{2} \left\langle p, \left(\sigma(x) \sigma(x)^T\right) p \right\rangle,$$

with $\sigma(x) = (\sigma^1 x^1, \dots, \sigma^m x^m)$. While the corresponding Hamiltonian ODEs can
be solved in closed form, the boundary conditions lead to systems of non-linear
equations, which we cannot solve explicitly any more. While numerical solutions
are, of course, possible, we restrict ourselves to the extremely simple setting of
Sect. 2, in order to keep maximal tractability.
Consequently, we have the Hamiltonian $H(x, p) = \frac{1}{2} \left((\sigma x^1 p^1)^2 + (\sigma x^2 p^2)^2\right)$.
The solutions of the Hamiltonian ODEs started at $(x_0, p_0)$ satisfy

$$H_{t \leftarrow 0}(x_0, p_0) = \begin{pmatrix} x_0^1\, e^{\sigma^2 x_0^1 p_0^1 t} \\ x_0^2\, e^{\sigma^2 x_0^2 p_0^2 t} \\ p_0^1\, e^{-\sigma^2 x_0^1 p_0^1 t} \\ p_0^2\, e^{-\sigma^2 x_0^2 p_0^2 t} \end{pmatrix}, \tag{17}$$

which can be easily seen from the observation that $H$ is constant along solutions of
the Hamiltonian ODEs together with symmetry between $(x^1, p^1)$ and $(x^2, p^2)$. This
immediately implies that the inverse flow is given by

$$H_{0 \leftarrow t}(x_t, p_t) = \begin{pmatrix} x_t^1\, e^{-\sigma^2 x_t^1 p_t^1 t} \\ x_t^2\, e^{-\sigma^2 x_t^2 p_t^2 t} \\ p_t^1\, e^{\sigma^2 x_t^1 p_t^1 t} \\ p_t^2\, e^{\sigma^2 x_t^2 p_t^2 t} \end{pmatrix}. \tag{18}$$
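The closed-form flow can be cross-checked against a direct numerical integration of the Hamiltonian ODEs (14) for $H(x,p) = \tfrac12\left((\sigma x^1 p^1)^2 + (\sigma x^2 p^2)^2\right)$. The sketch below is only illustrative: the classical fourth-order Runge–Kutta scheme, the step count and the function names are assumptions made here, not part of the original analysis.

```python
import math

def rhs(state, sigma):
    """Right-hand side (dH/dp, -dH/dx) for H = 0.5 * sum_i (sigma x^i p^i)^2."""
    x1, x2, p1, p2 = state
    s2 = sigma ** 2
    return [s2 * x1 * x1 * p1, s2 * x2 * x2 * p2,
            -s2 * x1 * p1 * p1, -s2 * x2 * p2 * p2]

def rk4_flow(state, sigma, t, steps=2000):
    """Integrate the Hamiltonian ODEs from time 0 to t with classical RK4."""
    dt = t / steps
    state = list(state)
    for _ in range(steps):
        k1 = rhs(state, sigma)
        k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)], sigma)
        k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)], sigma)
        k4 = rhs([s + dt * k for s, k in zip(state, k3)], sigma)
        state = [s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def closed_form_flow(state, sigma, t):
    """Flow (17); calling it with -t on the time-t state gives the inverse flow (18)."""
    x1, x2, p1, p2 = state
    e1 = math.exp(sigma ** 2 * x1 * p1 * t)
    e2 = math.exp(sigma ** 2 * x2 * p2 * t)
    return [x1 * e1, x2 * e2, p1 / e1, p2 / e2]

if __name__ == "__main__":
    start = [1.0, 1.0, 0.7, 0.3]  # arbitrary illustrative initial data
    print(rk4_flow(start, 0.8, 1.0))
    print(closed_form_flow(start, 0.8, 1.0))
```

The inverse flow works with the same exponent because each product $x^i p^i$ is conserved along the flow.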

Now we introduce the boundary conditions. Note that, contrary to Theorem 1, we
now project to the linear subspace $\{x : x^1 + x^2 = K\}$. Thus, the terminal condition
on $x$ translates into $x_T^1 + x_T^2 = K$ (we need to end at the target manifold), whereas
the transversality condition translates to $p_T$ being orthogonal to the target manifold.
Evaluating these conditions at $T = 1$, we get

$$x_0^1 = S_0^1 = 1, \qquad x_0^2 = S_0^2 = 1, \qquad x_1^1 + x_1^2 = K, \qquad p_1^1 - p_1^2 = 0.$$

It is a pleasant exercise to check that solving for $x_1^1 =: x$ and $x_1^2 = K - x$ then
leads exactly to the first order condition (5) encountered in Sect. 2. With identical
arguments, assuming $K \le 2e$ from here on (and disregarding the case $K > 2e$ where
closed form computations are not available), we find that the optimal configuration
must satisfy $x_1^* = (K/2, K/2)$. Inserting this value into the first two components
of (17), we obtain the equation

$$\frac{K}{2} = e^{\sigma^2 p_0^i} \iff p_0^i = \log\left(\frac{K}{2}\right) / \sigma^2, \quad i = 1, 2.$$

This implies that $p_1^* = \left(\frac{2}{\sigma^2 K} \log(K/2),\ \frac{2}{\sigma^2 K} \log(K/2)\right)$. Moreover, we see that the
minimizing control satisfies

$$\dot{h}_0(t) = \begin{pmatrix} \sigma x^1(t) p^1(t) \\ \sigma x^2(t) p^2(t) \end{pmatrix} = \begin{pmatrix} \sigma p_0^1 \\ \sigma p_0^2 \end{pmatrix} = \begin{pmatrix} \log(K/2)/\sigma \\ \log(K/2)/\sigma \end{pmatrix}, \tag{19}$$

see (16), implying that the minimal energy is given by

$$\Lambda(K) = \frac{1}{2} \|h_0\|_H^2 = \frac{\log(K/2)^2}{\sigma^2} = H(x_0, p_0). \tag{20}$$
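For $K > 2e$ the boundary value problem has several solutions, and the candidates have to be compared by energy. The sketch below illustrates this for unit spots: it locates all roots of the first order condition (5) by a sign-change scan plus bisection and evaluates $H(x_0, p_0)$ with $p_0^i = \log(x_1^i)/\sigma^2$. The grid resolution and the function names are assumptions made for illustration only.

```python
import math

def residual(K, x):
    """Left-hand side of the first order condition (5)."""
    return math.log(x) / x - math.log(K - x) / (K - x)

def bisect(f, lo, hi, iters=200):
    """Bisection for a root of f in [lo, hi], assuming a sign change."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def candidates(K, sigma=1.0, n_grid=4000):
    """All solutions x = x_1^1 of (5) in (0, K) with momenta p_0 and energies.

    Uses x_0 = (1, 1), so x_1^i = exp(sigma^2 p_0^i) by the flow (17).
    """
    out = []
    xs = [K * (i + 0.5) / n_grid for i in range(n_grid)]
    for a, b in zip(xs, xs[1:]):
        if residual(K, a) * residual(K, b) < 0:
            x = bisect(lambda u: residual(K, u), a, b)
            p0 = (math.log(x) / sigma ** 2, math.log(K - x) / sigma ** 2)
            energy = 0.5 * sigma ** 2 * (p0[0] ** 2 + p0[1] ** 2)  # H(x_0, p_0)
            out.append((x, p0, energy))
    return out

if __name__ == "__main__":
    for K in (4.0, 8.0):
        for x, p0, energy in candidates(K):
            print(K, x, p0, energy)
```

For $K = 4$ the only candidate is $x = K/2$. For $K = 8$ the scan finds three candidates, and the two off-center ones carry the smaller energy, in line with the one-to-two transition of minimizing paths described in the introduction.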

Regarding focality, we have to check that the matrix

$$M(x_1, p_1) := \begin{pmatrix} \partial_\varepsilon\big|_{\varepsilon=0} H^1_{0 \leftarrow 1}(x_1 + \varepsilon(1, -1), p_1) & \partial_\eta\big|_{\eta=0} H^1_{0 \leftarrow 1}(x_1, p_1 + \eta(1, 1)) \\ \partial_\varepsilon\big|_{\varepsilon=0} H^2_{0 \leftarrow 1}(x_1 + \varepsilon(1, -1), p_1) & \partial_\eta\big|_{\eta=0} H^2_{0 \leftarrow 1}(x_1, p_1 + \eta(1, 1)) \end{pmatrix} \tag{21}$$

is non-degenerate when evaluated at the optimal configuration $(x_1^*, p_1^*)$. A simple
calculation shows that

$$M(x_1, p_1) = \begin{pmatrix} e^{-\sigma^2 x_1^1 p_1^1} - x_1^1 p_1^1 \sigma^2 e^{-\sigma^2 x_1^1 p_1^1} & -\sigma^2 (x_1^1)^2 e^{-\sigma^2 x_1^1 p_1^1} \\ -e^{-\sigma^2 x_1^2 p_1^2} + x_1^2 p_1^2 \sigma^2 e^{-\sigma^2 x_1^2 p_1^2} & -\sigma^2 (x_1^2)^2 e^{-\sigma^2 x_1^2 p_1^2} \end{pmatrix},$$

implying that

$$M(x_1^*, p_1^*) = \begin{pmatrix} \frac{2}{K} (1 - \log(K/2)) & -\frac{\sigma^2 K}{2} \\ \frac{2}{K} (-1 + \log(K/2)) & -\frac{\sigma^2 K}{2} \end{pmatrix},$$

and we can conclude that

$$\det M(x_1^*, p_1^*) = 2\sigma^2 \left(\log(K/2) - 1\right),$$

which is zero if and only if $K = 2e$. We summarize the results of this calculation as
follows:
• In the generic case $K \neq 2e$, the non-focality condition of Theorem 1 holds true,
and we obtain (from Corollary 3) the following (short time) density expansion of
$B_T = \exp(\sigma W_T^1) + \exp(\sigma W_T^2)$:

$$K \mapsto \exp\left(-\frac{\Lambda(K)}{T}\right) \frac{1}{\sqrt{T}} \left(c_0 + O(T)\right).$$

When specialized to unit volatility, we recover precisely (1)a.
• For $K = 2e$, the initial stock price is focal for the minimizing configuration, so
the non-focality condition of Theorem 1 fails. And indeed, we want it to fail, for
the actual expansion in this case, namely (1)b, is not at all of the generic form
predicted by our theorem.
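The focality check can also be done numerically, which is the route one would take when no closed form is available. The sketch below approximates the matrix (21) by central differences of the projected inverse flow (18) at $t = 1$ and compares its determinant with $2\sigma^2(\log(K/2) - 1)$. The step size `eps` and the function names are illustrative assumptions.

```python
import math

def projected_inverse_flow(x1, p1, sigma):
    """x-components of the inverse flow (18) at t = 1, i.e. pi H_{0<-1}(x_1, p_1)."""
    return [x * math.exp(-sigma ** 2 * x * p) for x, p in zip(x1, p1)]

def focality_matrix(K, sigma=1.0, eps=1e-6):
    """Central-difference approximation of M in (21) at the symmetric configuration."""
    x = [K / 2.0, K / 2.0]
    p_val = 2.0 * math.log(K / 2.0) / (sigma ** 2 * K)
    p = [p_val, p_val]

    def column(dx, dp):
        plus = projected_inverse_flow(
            [x[i] + eps * dx[i] for i in range(2)],
            [p[i] + eps * dp[i] for i in range(2)], sigma)
        minus = projected_inverse_flow(
            [x[i] - eps * dx[i] for i in range(2)],
            [p[i] - eps * dp[i] for i in range(2)], sigma)
        return [(a - b) / (2.0 * eps) for a, b in zip(plus, minus)]

    col_eps = column((1.0, -1.0), (0.0, 0.0))  # move along the target manifold
    col_eta = column((0.0, 0.0), (1.0, 1.0))   # move p along the normal direction
    return [[col_eps[0], col_eta[0]], [col_eps[1], col_eta[1]]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

if __name__ == "__main__":
    for K in (4.0, 2.0 * math.e, 5.0):
        print(K, det2(focality_matrix(K)), 2.0 * (math.log(K / 2.0) - 1.0))
```

The determinant changes sign at $K = 2e$, so a numerical scan over strikes would locate the focal strike as a root of $K \mapsto \det M$.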

Remark 4 It is immediate to use this analysis to deal also with the case of non-unit
(but identical) spots $S_0^1 = S_0^2$ by scaling the Black–Scholes dynamics accordingly, i.e.,
by replacing $K$ with $K/S_0^1$. Hence, in this case focality happens when $\log\left(\frac{K}{2 S_0^1}\right) = 1$,
i.e., when $K = 2 S_0^1 e$.

Remark 5 The question arises whether the critical ("focal") case $K = 2e$, with atypical
algebraic factor $T^{-3/4}$, cf. (1)b, can also be recovered by a general theorem. Related
results in [14] and also [17] suggest that this may indeed be the case but would
require substantial additional work.

5 Extensions: Correlation, Local and Stochastic Vol

5.1 Analysis of the Black–Scholes Basket, Small Noise

In Sect. 4 we analyzed the density of a simple Black–Scholes basket with dynamics

$$dB_t = S_t^1 \sigma\, dW_t^1 + S_t^2 \sigma\, dW_t^2.$$

As explained in Sect. 3 the analysis is really based on a small noise (small vol)
expansion of

$$dB_t^\varepsilon = S_t^{1,\varepsilon} \varepsilon \sigma\, dW_t^1 + S_t^{2,\varepsilon} \varepsilon \sigma\, dW_t^2,$$

run until time $T = 1$. Consider now a situation with small rates, also of order $\varepsilon$. In
other words,

$$dS_t^{i,\varepsilon} = \varepsilon r S_t^{i,\varepsilon}\, dt + S_t^{i,\varepsilon} \varepsilon \sigma\, dW^i,$$

and then $B_t^\varepsilon = S_t^{1,\varepsilon} + S_t^{2,\varepsilon}$ as before. We still assume $S_0^i = 1$. A look at Theorem 1
(now we cannot use Corollary 3) reveals that the entire leading order computation
remains unchanged (at least at unit time and with trivial changes otherwise). The
resulting (now: small noise) density expansion of $B_T|_{T=1}$ is more involved and
takes the form

$$K \mapsto \exp\left(-\frac{\Lambda(K)}{\varepsilon^2}\right) \exp\left(\frac{2r \log(K/2)}{\sigma^2}\, \frac{1}{\varepsilon}\right) \frac{1}{\varepsilon} \left(c_0 + O(\varepsilon)\right). \tag{22}$$

Here $\Lambda(K)$ is given in closed form, cf. (20), so that $\Lambda'(K) = \frac{2 \log(K/2)}{\sigma^2 K}$ is also
explicitly known. Furthermore, under similar restrictions on $K$ as before, $h_0$ is (still)
given by (19), so that

$$\phi_t^{h_0} = \begin{pmatrix} (K/2)^t \\ (K/2)^t \end{pmatrix}.$$

Thus, the ODE for $\hat{X}$ (see Theorem 1) is given by

$$\frac{d\hat{X}_t}{dt} = \log(K/2)\, \hat{X}_t + r \begin{pmatrix} (K/2)^t \\ (K/2)^t \end{pmatrix}, \quad \hat{X}_0 = \hat{x}_0 = 0,$$

which has the solution

$$\hat{X}_t^i = r\, t \left(\frac{K}{2}\right)^t,$$

implying that $\hat{Y}_1 = \hat{X}_1^1 + \hat{X}_1^2 = rK$. Thus, the second exponential term, $\Lambda'(K)\, \hat{Y}_1 / \varepsilon$, has
the form given above.

5.2 Basket Analysis Under Local, Stochastic Vol etc.

One can immediately write down the Hamiltonian associated to, say, two or $d > 2$
assets, each of which is governed by local vol dynamics or stochastic vol, based
on additional factors. In general, however, one will be stuck with the analysis of
the resulting boundary value problem for the Hamiltonian ODEs; numerical (e.g.
shooting) methods will have to be used. In some models, including the Stein–Stein
model, we believe (due to the analysis carried out in [7]) that, in special cases, closed
form answers are possible but we will not pursue this here. Instead, we continue with
a few more computations in the Black–Scholes case for $d$ assets.

5.3 Multi-variate Black–Scholes Models

In the multi-variate case $d > 2$ of a general, $d$-dimensional Black–Scholes model
with correlation matrix $(\rho_{ij})$, the Hamiltonian has the form

$$H(x, p) = \frac{1}{2} \sum_{i,j=1}^{d} \rho_{ij}\, \sigma^i p^i x^i\, \sigma^j x^j p^j.$$

Thus, the Hamiltonian ODEs have the form

$$\dot{x}^l = \sigma^l x^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p^i x^i, \quad l = 1, \dots, d,$$

$$\dot{p}^l = -\sigma^l p^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p^i x^i, \quad l = 1, \dots, d.$$

Consequently, it is again easy to see that $\frac{\partial}{\partial t}\left(x^l(t) p^l(t)\right) = 0$, implying that $x^l(t) p^l(t) = x_0^l p_0^l$. The Hamiltonian flow has the form
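$$H_{t \leftarrow 0}(x_0, p_0) = \begin{pmatrix} \left(x_0^l \exp\left(\sigma^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p_0^i x_0^i\, t\right)\right)_{l=1}^{d} \\ \left(p_0^l \exp\left(-\sigma^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p_0^i x_0^i\, t\right)\right)_{l=1}^{d} \end{pmatrix}. \tag{23}$$

Using again that $p^l(t) x^l(t) = p^l(0) x^l(0)$ for any $l$, we obtain the inverse Hamiltonian
flow

$$H_{0 \leftarrow t}(x_t, p_t) = \begin{pmatrix} \left(x_t^l \exp\left(-\sigma^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p_t^i x_t^i\, t\right)\right)_{l=1}^{d} \\ \left(p_t^l \exp\left(\sigma^l \sum_{i=1}^{d} \rho_{li}\, \sigma^i p_t^i x_t^i\, t\right)\right)_{l=1}^{d} \end{pmatrix}. \tag{24}$$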

The boundary conditions, at $T = 1$, are now given by

$$x_0 = S_0, \tag{25a}$$

$$\sum_{l=1}^{d} x^l(1) = K, \tag{25b}$$

$$p^1(1) = p^2(1) = \dots = p^d(1). \tag{25c}$$

Indeed, the transversality condition (25c) says that the final momentum $p(1)$ is
orthogonal to the surface $\left\{\sum_{l=1}^{d} y^l = K\right\}$, whose tangent space is spanned by the
collection of vectors $e_1 - e_l$, $l = 2, \dots, d$, with $e_1, \dots, e_d$ the standard basis of $\mathbb{R}^d$.
The equations (25) are certainly not difficult to solve numerically, but an explicit
solution is not available, neither in the general case nor in the case of $d$ uncorrelated
assets.
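As an illustration of such a numerical solution, consider $d$ uncorrelated assets ($\rho$ the identity). Since $c_l := x^l p^l$ is conserved, (23) gives $x^l(1) = S_0^l e^{(\sigma^l)^2 c_l}$ and $p^l(1) = c_l / x^l(1)$, so (25c) reads $c_l e^{-(\sigma^l)^2 c_l} / S_0^l = \lambda$ for a common value $\lambda$. The sketch below solves this by nested bisection. It is restricted, as an assumption made here, to the branch $c_l \in [0, 1/(\sigma^l)^2]$ on which the map is monotone, so it returns one candidate solution of (25) and does not by itself decide whether that candidate is the energy minimizer. All names are hypothetical.

```python
import math

def _bisect(f, lo, hi, iters=200):
    """Root of a non-decreasing function f on [lo, hi] with f(lo) <= 0 <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_uncorrelated(K, S0, sigma):
    """Candidate (x(1), p(1)) solving (25a)-(25c) for independent assets.

    Only the branch c_l <= 1 / sigma_l^2 is searched (assumption, see lead-in).
    """
    d = len(S0)

    def c_of(lam, l):
        # Solve c * exp(-sigma_l^2 c) = lam * S0_l for c in [0, 1 / sigma_l^2].
        s2 = sigma[l] ** 2
        return _bisect(lambda c: c * math.exp(-s2 * c) - lam * S0[l], 0.0, 1.0 / s2)

    def terminal(lam):
        return [S0[l] * math.exp(sigma[l] ** 2 * c_of(lam, l)) for l in range(d)]

    # Largest lam for which every asset still has a solution on this branch.
    lam_max = min(1.0 / (math.e * sigma[l] ** 2 * S0[l]) for l in range(d))
    if not sum(S0) <= K <= sum(terminal(lam_max)):
        raise ValueError("K is outside the range covered by this branch")
    lam = _bisect(lambda u: sum(terminal(u)) - K, 0.0, lam_max)
    return terminal(lam), [lam] * d

if __name__ == "__main__":
    print(solve_uncorrelated(6.0, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
    print(solve_uncorrelated(3.0, [1.0, 1.0], [0.5, 1.0]))
```

In the symmetric example the output reproduces $x^l(1) = K/d$ and $p^l(1) = \frac{d}{\sigma^2 K}\log(K/d)$. The branch restriction is exactly where the map $c \mapsto c\, e^{-\sigma^2 c}$ stops being invertible, which is related to the focality phenomenon discussed in the text.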

Remark 6 The main point of this calculation is that while explicit solutions are no
longer possible in a general Black-Scholes model, the phenomenon (1) potentially
appears in all Black-Scholes models. Moreover, we stress that the non-focality con-
ditions are easily checked numerically.

Remark 7 Note that the discretely monitored Asian option can be considered as
a special case of a basket option on correlated assets. Indeed, let us consider an
option on

$$\frac{1}{N} \sum_{i=1}^{N} S_{t_i}, \quad \text{with (for simplicity) } t_i = i \Delta t, \quad i = 1, \dots, N.$$

For each individual $i \in \{1, \dots, N\}$ we have, for fixed $\Delta t > 0$, the equality in law

$$S_{t_i} = S_0\, e^{\sigma B_{i \Delta t} - \frac{1}{2} \sigma^2 i \Delta t} = S_0\, e^{\sigma^i W^i_{\Delta t} - \frac{1}{2} (\sigma^i)^2 \Delta t}$$
for $\sigma^i := \sqrt{i}\, \sigma$ and $W^i_{\Delta t} := B_{i \Delta t} / \sqrt{i}$. In law, the vector $(W^1_{\Delta t}, \dots, W^N_{\Delta t})$ corresponds
to the marginal distribution of an $N$-dimensional Brownian motion at time
$\Delta t$ with correlation $\rho_{ij} = \frac{\min(i, j)}{\sqrt{ij}}$, $1 \le i, j \le N$. Thus, the Asian option corresponds
to an option on the basket with $S_0^i \equiv S_0$, $\sigma^i$ as above and a correlation matrix
$\rho_{ij}$ with maturity $\Delta t$. Moreover, the asymptotic expansion of the price of the Asian
option as $\Delta t \to 0$ corresponds to the short-time asymptotics of the basket.
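The correlation structure of Remark 7 is easy to tabulate. The sketch below builds $\rho_{ij} = \min(i,j)/\sqrt{ij}$ and computes its Cholesky factor with a textbook algorithm; both function names are illustrative. The factor is what the change of coordinates in the geometric section would require for this basket.

```python
import math

def asian_correlation(N):
    """Correlation matrix rho_ij = min(i, j) / sqrt(i * j), 1 <= i, j <= N."""
    return [[min(i, j) / math.sqrt(i * j) for j in range(1, N + 1)]
            for i in range(1, N + 1)]

def cholesky(a):
    """Lower-triangular L with L L^T = a; raises ValueError if a is not positive definite."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

if __name__ == "__main__":
    rho = asian_correlation(4)
    for row in cholesky(rho):
        print([round(v, 6) for v in row])
```

The factor has entries $1/\sqrt{i}$ in row $i$ up to the diagonal, reflecting that $B_{i\Delta t}/\sqrt{i}$ is a normalized sum of $i$ independent increments.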

Remark 8 A small-noise asymptotic expansion of the continuous Asian option on
$\int_0^T S_t\, dt$ is also possible by the techniques of Sect. 3 (with ellipticity conditions
replaced by weak Hörmander conditions). Essentially, this is equivalent to letting
$N \to \infty$ in Remark 7, but more direct.

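As in the two-dimensional case, the boundary conditions can be solved explicitly
in the fully symmetric case, when $\sigma^l \equiv \sigma$ and, say, $S_0^l \equiv 1$. For suitable $K$ the
optimal configuration is

$$x_0^* = (1, \dots, 1)^T, \qquad x_1^* = (K/d, \dots, K/d)^T,$$

$$p_0^* = \left(\frac{\log(K/d)}{\sigma^2}, \dots, \frac{\log(K/d)}{\sigma^2}\right)^T, \qquad p_1^* = \left(\frac{d}{\sigma^2 K} \log(K/d), \dots, \frac{d}{\sigma^2 K} \log(K/d)\right)^T.$$

Introducing

$$q = \varepsilon_1 \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad z = \begin{pmatrix} \varepsilon_2 + \dots + \varepsilon_d \\ -\varepsilon_2 \\ \vdots \\ -\varepsilon_d \end{pmatrix},$$

we obtain (for the case of $d$ uncorrelated assets)

$$M(x_1, p_1) := \partial_{(z,q)}\big|_{(z,q)=0}\, \pi H_{0 \leftarrow 1}(x_1 + z, p_1 + q) = \begin{pmatrix} a_1 & \mathbf{b} \\ \mathbf{a} & G \end{pmatrix},$$

where $\mathbf{a} = (a_2, \dots, a_d)^T \in \mathbb{R}^{(d-1) \times 1}$, $\mathbf{b} = b\,(1, \dots, 1) \in \mathbb{R}^{1 \times (d-1)}$, $G = \mathrm{diag}(g_2, \dots, g_d) \in \mathbb{R}^{(d-1) \times (d-1)}$ with

$$a_l = -(\sigma^l)^2 (x_1^l)^2\, e^{-(\sigma^l)^2 p_1^l x_1^l}, \quad l = 1, \dots, d,$$

$$b = \left(1 - (\sigma^1)^2 x_1^1 p_1^1\right) e^{-(\sigma^1)^2 p_1^1 x_1^1},$$

$$g_l = -\left(1 - (\sigma^l)^2 x_1^l p_1^l\right) e^{-(\sigma^l)^2 p_1^l x_1^l}, \quad l = 2, \dots, d.$$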

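In the symmetric case, we can evaluate $M$ at the optimal configuration and obtain

$$M(x_1^*, p_1^*) = \begin{pmatrix} -\sigma^2 \frac{K}{d} & \left(1 - \log(K/d)\right) \frac{d}{K} & \cdots & \left(1 - \log(K/d)\right) \frac{d}{K} \\ -\sigma^2 \frac{K}{d} & -\left(1 - \log(K/d)\right) \frac{d}{K} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\sigma^2 \frac{K}{d} & 0 & \cdots & -\left(1 - \log(K/d)\right) \frac{d}{K} \end{pmatrix},$$

whose determinant can be seen to be

$$\det M(x_1^*, p_1^*) = (-1)^d\, \sigma^2 K \left(\frac{d}{K} \left(1 - \log(K/d)\right)\right)^{d-1}.$$

Thus, the non-focality condition fails if and only if $K = de$. Moreover, we obtain
the energy

$$\Lambda(K) = H(x_0^*, p_0^*) = \frac{d}{2}\, \frac{\log(K/d)^2}{\sigma^2}.$$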
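The determinant formula can be verified numerically for small $d$. The sketch below assembles the symmetric matrix from the entries $a_l = -\sigma^2 K/d$ and $b = (1 - \log(K/d))\, d/K$ and evaluates the determinant by Gaussian elimination with partial pivoting. The helper names are hypothetical and the elimination routine is a generic textbook one.

```python
import math

def focality_matrix(K, d, sigma=1.0):
    """M(x_1*, p_1*) for d symmetric uncorrelated assets, built from a, b and g = -b."""
    a = -sigma ** 2 * K / d
    b = (1.0 - math.log(K / d)) * d / K
    rows = [[a] + [b] * (d - 1)]
    for l in range(1, d):
        row = [a] + [0.0] * (d - 1)
        row[l] = -b
        rows.append(row)
    return rows

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return result

def closed_form_det(K, d, sigma=1.0):
    """(-1)^d sigma^2 K ((d / K) (1 - log(K / d)))^(d - 1)."""
    return (-1) ** d * sigma ** 2 * K * (d / K * (1.0 - math.log(K / d))) ** (d - 1)

if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, det(focality_matrix(5.0, d)), closed_form_det(5.0, d))
```

At $K = de$ the entry $b$ vanishes, the matrix has rank one, and the determinant is zero for every $d \ge 2$.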

6 A Geometric Approach to Focality

In this final section we take a more geometrical look at the non-focality condition
appearing in Sect. 3.2. Consider the Black–Scholes model

$$dS_t^i = \sigma^i S_t^i\, dW_t^i, \qquad d\langle W^i, W^j \rangle_t = \rho_{i,j}\, dt.$$

We change parameters $S \to y \to x$, by

$$y^i := \frac{\log\left(S^i / S_0^i\right)}{\sigma^i}, \qquad x^i = L^{ip} y^p, \quad i = 1, \dots, d,$$

where $\rho$ denotes the correlation matrix of $W$ and $\rho = L L^T$ its Cholesky factorization.
Obviously, $S^i = S_0^i e^{\sigma^i y^i}$. In terms of the $x$-coordinates we have

$$x^i = x^i(S) = L^{ip} \log\left(S^p / S_0^p\right) / \sigma^p, \qquad S^i = S^i(x) = S_0^i\, e^{\sigma^i L_{ip} x^p}.$$

The advantage of using the chart $x$ is that the corresponding Riemannian metric
tensor is the usual Euclidean metric tensor. Thus, we simply have

$$d(S_0, S) = |x_0 - x|$$

and the geodesics are straight lines as seen from the $x$-chart. Note furthermore that
$S = S_0$ is transformed to $x = 0$.
The payoff function of the option is given by $\left(\sum_i w_i S_T^i - K\right)^+$. We normalize
$w_i \equiv 1$ and $T \equiv 1$. The strike surface $F = \left\{S \in \mathbb{R}_+^d \,\middle|\, \sum_{i=1}^{d} S^i = K\right\}$, which is
(a sub-set of) a hyperplane in $S$ coordinates is, however, transformed to a much more
complicated submanifold in $x$ coordinates. Re-phrasing the equation $\sum_i S^i = K$ in
$y$-coordinates and solving for $y^d$ gives

$$y^d = \log\left(\left(K - \sum_{i=1}^{d-1} S_0^i\, e^{\sigma^i \sum_{p=1}^{d} L_{ip} x^p}\right) / S_0^d\right) / \sigma^d,$$

with $(L^{ij}) = (L_{ij})^{-1}$, which implies, using that $L$ and $L^{-1}$ are lower-triangular
matrices,

$$L_{dd}\, x^d = \log\left(\left(K - \sum_{i=1}^{d-1} S_0^i\, e^{\sigma^i \sum_{p=1}^{i} L_{ip} x^p}\right) / S_0^d\right) / \sigma^d - \sum_{k=1}^{d-1} L_{dk} x^k.$$

For sake of clarity, let us introduce the notation $q = (q^1, \dots, q^{d-1}) := (x^1, \dots, x^{d-1})$. A parametrization of the strike surface $F$ is then given by the map $\varphi : U \subset \mathbb{R}^{d-1} \to \mathbb{R}^d$ with

$$U := \left\{q \in \mathbb{R}^{d-1} \,\middle|\, \sum_{i=1}^{d-1} S_0^i\, e^{\sigma^i \sum_{p=1}^{i} L_{ip} q^p} < K\right\},$$

and

$$\varphi(q) := \left(q,\ \frac{1}{L_{dd}} \left[\log\left(\left(K - \sum_{i=1}^{d-1} S_0^i\, e^{\sigma^i \sum_{p=1}^{i} L_{ip} q^p}\right) / S_0^d\right) / \sigma^d - \sum_{k=1}^{d-1} L_{dk} q^k\right]\right).$$

Note that by the change of coordinates, we are implicitly assuming that $S^i > 0$ for
all $i$. Moreover, the standard basis $e_1(p), \dots, e_{d-1}(p)$ of the tangent space $T_p F$ to $F$
at $p = \varphi(q)$ is given by the columns of the Jacobi matrix of $\varphi$ evaluated at $q$, more
precisely we have

$$e_i(p) = \left(\left(\delta_i^j\right)_{j=1}^{d-1},\ -\frac{1}{L_{dd}} \left[\frac{1}{\sigma^d}\, \frac{\sum_{j=i}^{d-1} \sigma^j L_{ji} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}}{K - \sum_{j=1}^{d-1} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}} + L_{di}\right]\right)$$

for $i = 1, \dots, d - 1$ and $p = \varphi(q)$. Consequently, the normal vector field $N$ to $F$ at
$p = \varphi(q)$ is given by

$$N(p) = \alpha(p) \left(\left(\frac{1}{L_{dd}} \left[\frac{1}{\sigma^d}\, \frac{\sum_{j=i}^{d-1} \sigma^j L_{ji} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}}{K - \sum_{j=1}^{d-1} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}} + L_{di}\right]\right)_{i=1}^{d-1},\ 1\right) = N \circ \varphi(q),$$

where $\alpha$ is a normalization factor guaranteeing that $|N(p)| = 1$, i.e.,

$$\alpha(p) = \left(1 + \sum_{i=1}^{d-1} \frac{1}{(L_{dd})^2} \left[\frac{1}{\sigma^d}\, \frac{\sum_{j=i}^{d-1} \sigma^j L_{ji} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}}{K - \sum_{j=1}^{d-1} S_0^j\, e^{\sigma^j \sum_{r=1}^{j} L_{jr} q^r}} + L_{di}\right]^2\right)^{-1/2}.$$

The Weingarten map or shape operator $L_p : T_p F \to T_p F$ is defined by

$$L_p\left(d\varphi_{\varphi^{-1}(p)}(v)\right) = -d(N \circ \varphi)\left(\varphi^{-1}(p)\right) \cdot v,$$

$v \in \mathbb{R}^{d-1} = T_{\varphi^{-1}(p)} U$, see [8]. In other words, for $\varphi(q) = p$, we interpret $N$ as
a map in $q$ and $-L_p$ is the directional derivative of that map. We study the Weingarten
map since it gives us the curvature of the surface $F$. Indeed, the eigenvalues
$k_1(p), \dots, k_{d-1}(p)$ of the linear map $L_p : T_p F \to T_p F$ are called principal curvatures
of $F$. Then the focal points of $F$ at $p$ are given by

$$\left\{p + \frac{1}{k_i(p)} N(p) \,\middle|\, 1 \le i \le d - 1 \text{ such that } k_i(p) \neq 0\right\}.$$

In order to compute the eigenvalues of the shape operator, we need to compute
the representation of $L_p$ in the standard basis $(e_1(p), \dots, e_{d-1}(p))$. Let us denote
this matrix by $L(p)$, then we obviously have

$$L(\varphi(q))_{ij} = -\left\langle \frac{\partial}{\partial q^j} (N \circ \varphi)(q),\ e_i(\varphi(q)) \right\rangle, \quad i, j = 1, \dots, d - 1.$$

The principal curvatures $k_1(p), \dots, k_{d-1}(p)$ are, thus, the eigenvalues of the $(d - 1)$-dimensional
matrix $L(p)$.
Since the calculations become too complicated in the general case, we now again
concentrate on the case of two uncorrelated assets, i.e., $d = 2$ and $\rho = L = I_2$. In
this case, we have

$$e_1(p) = \left(1,\ -\frac{\sigma^1 S_0^1 e^{\sigma^1 q^1}}{\sigma^2 \left(K - S_0^1 e^{\sigma^1 q^1}\right)}\right),$$

$$N(\varphi(q)) = \frac{\left(\sigma^1 S_0^1 e^{\sigma^1 q^1},\ \sigma^2 \left(K - S_0^1 e^{\sigma^1 q^1}\right)\right)}{\sqrt{(\sigma^1)^2 (S_0^1)^2 e^{2\sigma^1 q^1} + (\sigma^2)^2 \left(K - S_0^1 e^{\sigma^1 q^1}\right)^2}}.$$

Thus, the Weingarten map is given by

$$L_p(v e_1(p)) = v \kappa(p) e_1(p),$$

where for $q = (q^1) \in \mathbb{R}$

$$\kappa(\varphi(q)) = k_1(\varphi(q)) = \frac{K (\sigma^1)^2 (\sigma^2)^2 S_0^1 e^{\sigma^1 q^1} \left(S_0^1 e^{\sigma^1 q^1} - K\right)}{\left[(\sigma^1)^2 (S_0^1)^2 e^{2\sigma^1 q^1} + (\sigma^2)^2 \left(S_0^1 e^{\sigma^1 q^1} - K\right)^2\right]^{3/2}}$$

is the curvature of the curve $F$ in $\mathbb{R}^2$. We see that $\kappa = 0$ if and only if $K = S_0^1 e^{\sigma^1 q^1}$,
i.e., at the boundary of the surface $F$. Otherwise, $\kappa$ is negative.
Here, both components of $N(p)$ are positive on $F$. Consequently, for any $p = \varphi(q) \in F$ there is precisely one focal point $f = f(p) \in \mathbb{R}^2$, which is given by

$$f^1 = q^1 + \frac{S_0^1 e^{\sigma^1 q^1} \left(2 (\sigma^2)^2 K - \left((\sigma^1)^2 + (\sigma^2)^2\right) S_0^1 e^{\sigma^1 q^1}\right) - (\sigma^2)^2 K^2}{\sigma^1 (\sigma^2)^2 K \left(K - S_0^1 e^{\sigma^1 q^1}\right)},$$

$$f^2 = \frac{1}{\sigma^2} \log\left(\frac{K - S_0^1 e^{\sigma^1 q^1}}{S_0^2}\right) + \frac{2\sigma^2}{(\sigma^1)^2} - \frac{\sigma^2 K e^{-\sigma^1 q^1}}{(\sigma^1)^2 S_0^1} - \frac{\left((\sigma^1)^2 + (\sigma^2)^2\right) S_0^1 e^{\sigma^1 q^1}}{(\sigma^1)^2 \sigma^2 K}.$$

Denoting $p = (x^1, x^2)$ and re-introducing the short-cut notation $S^i = S_0^i e^{\sigma^i x^i}$,
$i = 1, 2$, (noting that $S^1 + S^2 = K$) we can express $f$ as

$$f^1 = x^1 + \frac{S^1 \left(2 (\sigma^2)^2 K - \left((\sigma^1)^2 + (\sigma^2)^2\right) S^1\right) - (\sigma^2)^2 K^2}{\sigma^1 (\sigma^2)^2 K S^2},$$

$$f^2 = x^2 + \frac{S^1 \left(2 (\sigma^2)^2 K - \left((\sigma^1)^2 + (\sigma^2)^2\right) S^1\right) - (\sigma^2)^2 K^2}{(\sigma^1)^2 \sigma^2 K S^1}.$$

In the current setting, let $q^*$ be the optimal configuration in $q$-coordinates, i.e.,
the point on $F$ with smallest Euclidean norm. Then the non-focality condition of
Theorem 1 is satisfied, if 0 is not a focal point to $\varphi(q^*)$, see the discussion in the
proof of [6, Prop. 6].

Remark 9 As both components of the normal vector $N$ are non-negative on $F$ and the
curvature $\kappa$ is negative, 0 can only be a focal point if $F$ has a non-empty intersection
with the positive quadrant. Inserting into the parametrization of $F$, we see that this
can only be the case if $K > S_0^1 + S_0^2$. In other words: if the option is in the money, then
the non-focality condition is always satisfied (in the two-dimensional, uncorrelated
case).

Let us again use the parameters of Sect. 2, i.e., $S_0^1 = S_0^2 = 1$, $\sigma^1 = \sigma^2 = \sigma$. Then
we consider $S^* = (K/2, K/2)$, which translates into $x^* = \left(\frac{\log(K/2)}{\sigma}, \frac{\log(K/2)}{\sigma}\right)$.
Inserting into the formulas for the focal points, we obtain

$$f^1(x^*) = f^2(x^*) = \frac{\log\left(\frac{K}{2}\right) - 1}{\sigma}.$$

So, 0 is focal to the optimal configuration, if and only if

$$K = 2e,$$

and we recover, once more, the results of Sects. 2 and 4 (recall that $S_0$ corresponds
to 0 in $x$-coordinates).
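For a plane curve the focal point $p + N(p)/\kappa(p)$ is the center of the osculating circle, so the closed-form expressions for $f$ can be compared with a finite-difference center of curvature of the graph $q \mapsto x^2(q)$. The sketch below does this for two uncorrelated assets. The function names, the finite-difference step and the test parameters are assumptions made for illustration.

```python
import math

def focal_point(S1, K, s1, s2, S01=1.0, S02=1.0):
    """Focal point of the strike curve at the point with first asset value S1.

    Uses f = x - D / (...) with D = (s1 S^1)^2 + (s2 S^2)^2, which is the
    numerator of the closed-form expressions with the sign pulled out.
    """
    S2 = K - S1
    x1 = math.log(S1 / S01) / s1
    x2 = math.log(S2 / S02) / s2
    D = (s1 * S1) ** 2 + (s2 * S2) ** 2
    f1 = x1 - D / (s1 * s2 ** 2 * K * S2)
    f2 = x2 - D / (s1 ** 2 * s2 * K * S1)
    return f1, f2

def center_of_curvature(q, K, s1, s2, S01=1.0, S02=1.0, step=1e-4):
    """Center of the osculating circle of the graph q -> x^2(q), by finite differences."""
    def y(u):
        return math.log((K - S01 * math.exp(s1 * u)) / S02) / s2
    d1 = (y(q + step) - y(q - step)) / (2.0 * step)
    d2 = (y(q + step) - 2.0 * y(q) + y(q - step)) / step ** 2
    return q - d1 * (1.0 + d1 ** 2) / d2, y(q) + (1.0 + d1 ** 2) / d2

if __name__ == "__main__":
    # Symmetric critical case: the focal point of S* = (e, e) is the origin.
    print(focal_point(math.e, 2.0 * math.e, 1.0, 1.0))
    # Asymmetric case: closed form versus finite differences.
    q = math.log(1.3) / 0.5
    print(focal_point(1.3, 3.0, 0.5, 0.8), center_of_curvature(q, 3.0, 0.5, 0.8))
```

In the symmetric case with general $K$ both components reduce to $(\log(K/2) - 1)/\sigma$, matching the displayed formula.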
In Figs. 2 and 3 the focal points are visualized for two different configurations
of two uncorrelated baskets. We plot the surface F as a submanifold of R2 . We
have seen above that for any p ∈ F there is precisely one focal point f(p). Hence,
we additionally plot the surface {f(p)|p ∈ F}—more precisely, part of this surface.
In Fig. 2 we show the case constructed above where the non-focality condition is
violated. In Fig. 3 the option is ITM. As explained above, in the ITM case the manifold
F does not intersect the positive quadrant, implying that the non-focality condition
is satisfied.

[Figure 2: two panels in the q-chart, both axes ranging from −2 to 2. (a) "Optimal configuration", showing F, the optimal path and the optimal configuration. (b) "Focal points", showing F and the focal points.]

Fig. 2 Optimal configuration and focal points for two independent assets with $\sigma^1 = \sigma^2 = 1$,
$S_0 = (1, 1)$, $K = 2e$. (a) The dashed line depicts the optimal path between the spot price $S_0$ (0 in the
q-chart) and the optimal configuration. (b) Dotted lines connect some selected points on the manifold
F with the corresponding focal points. Points marked with a triangle visualize the construction of
the focal points. We see that 0 is, indeed, focal to the optimal configuration

[Figure 3, panels (a) "Optimal configuration (in the money regime)" and (b) "Focal points": both panels show the manifold F on axes ranging from −3.0 to 0.0; panel (a) also marks the optimal path and the optimal configuration, panel (b) the focal points.]

Fig. 3 Optimal configuration and focal points for two independent assets with $\sigma^1 = \sigma^2 = 1$,
$S_0 = (1, 1)$, $K = 2/e$. (a) The dashed line depicts the optimal path between the spot price $S_0$
(0 in the q-chart) and the optimal configuration. (b) Dotted lines connect some selected points on
the manifold F with the corresponding focal points. Points marked with a triangle visualize the
construction of the focal points. This example illustrates the fact that the non-focality condition
always holds when the basket option is in the money

Acknowledgments Martin Forde kindly informed us about some misleading formulations in a
previous version. P.K.F. has received partial funding from the European Research Council under
the European Union's Seventh Framework Program (FP7/2007-2013) / ERC grant agreement nr.
258237.

References

1. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Application of large deviation methods to
the pricing of index options in finance. Comptes Rendus de l'Académie des Sciences, Series I,
Mathematique (2003)
2. Avellaneda, M., Boyer-Olson, D., Busca, J., Friz, P.: Reconstructing volatility. RISK (2004)
3. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Annales Scientifiques de l'Ecole Normale Supérieure 4(21), 307–331 (1988)
4. Ben Arous, G.: Méthodes de Laplace et de la phase stationnaire sur l'espace de Wiener.
Stochastics 25, 125–153 (1988)
5. Bismut, J.M.: Malliavin Calculus and Large Deviations. Birkhäuser, Boston (1984)
6. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part I: theoretical foundations. Commun. Pure Appl. Math. 67(1),
40–82 (2014)
7. Deuschel, J.D., Friz, P.K., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, part II: applications. Commun. Pure Appl. Math. 67(2), 321–350
(2014)
8. do Carmo, M.P.: Differential Geometry of Curves and Surfaces. Prentice-Hall, Englewood
Cliffs (1976)

9. Dufresne, D.: The log-normal approximation in financial and other computations. Adv. Appl.
Probab. 36, 747–773 (2004)
10. Gulisashvili, A.: Analytically tractable stochastic stock price models. Springer Finance.
Springer, London (2012)
11. Gulisashvili, A., Stein, E.: Asymptotic behavior of the stock price distribution density and
implied volatility in stochastic volatility models. Appl. Math. Optim. 61(3), 287–315.
doi:10.1007/s00245-009-9085-x
12. Gulisashvili, A., Tankov, P.: Tail behavior of sums and differences of log-normal random
variables, Bernoulli, to appear
13. Heston, S.: A closed-form solution for options with stochastic volatility, with application to
bond and currency options. Rev. Financ. Stud. 6, 327–343 (1993)
14. Molchanov, S.A.: Diffusion processes and Riemannian geometry. Russ. Math. Surv. 30(1),
1–63 (1975)
15. Seierstad, A., Sydsaeter, K.: Optimal Control Theory with Economic Applications. Advanced
Textbooks in Economics, vol. 24. North-Holland, Amsterdam (1987)
16. Stein, E.M., Stein, J.C.: Stock price distributions with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4, 727–752 (1991)
17. Takanobu, S., Watanabe, S.: Asymptotic expansion formulas of the Schilder type for a class
of conditional Wiener functional integration. In: Elworthy, K.D., Ikeda, N. (eds.) Asymptotic
problems in probability theory: Wiener functionals and asymptotics. Pitman Research
Notes in Mathematics Series, vol. 284, pp. 194–241. (1993)
On Small-Noise Equations with Degenerate
Limiting System Arising from Volatility
Models

Giovanni Conforti, Stefano De Marco and Jean-Dominique Deuschel

Abstract The one-dimensional SDE with non-Lipschitz diffusion coefficient

\[ dX_t = b(X_t)\,dt + \sigma X_t^{\gamma}\,dB_t, \qquad X_0 = x, \quad \gamma < 1 \qquad (1) \]

is widely studied in mathematical finance. Several works have proposed asymptotic
analysis of densities and implied volatilities in models involving instances of (1),
based on a careful implementation of saddle-point methods and (essentially) the
explicit knowledge of Fourier transforms. Recent research on tail asymptotics for
heat kernels (Deuschel et al. Comm. in Pure and Applied Math., 67(1):40–82, 2014,
[11]) suggests working with the rescaled variable $X^\varepsilon := \varepsilon^{1/(1-\gamma)} X$: this turns
a space asymptotic problem into a small-ε problem, and the process $X^\varepsilon$ satisfies
an SDE in Wentzell–Freidlin form (i.e. with driving noise $\varepsilon\,dB$). We prove a pathwise
large deviation principle for the process $X^\varepsilon$ as $\varepsilon \to 0$. As will be seen, the limiting
ODE governing the large deviations admits infinitely many solutions, a non-standard
situation in the Wentzell–Freidlin theory. As for applications, the ε-scaling allows one
to derive leading order asymptotics for path functionals: while on the one hand the
resulting formulae are confirmed by the CIR-CEV benchmarks, on the other hand
the large deviation approach (i) applies to equations with a more general drift term
and (ii) potentially opens the way to heat kernel analysis for higher-dimensional
diffusions involving (1) as a component.

Keywords Pathwise large deviations · Square-root diffusions · Tail asymptotics ·
Freidlin-Wentzell · Large deviations · Degenerate diffusions · CIR process

G. Conforti
Universität Potsdam, Potsdam, Germany
e-mail: [email protected]
S. De Marco (B)
Ecole Polytechnique, Route de Saclay, Palaiseau Cedex 91128, France
e-mail: [email protected]
J.-D. Deuschel
Technische Universität Berlin, Berlin, Germany
e-mail: [email protected]

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_17

1 Introduction

The Wentzell–Freidlin large deviation theory studies the asymptotic behavior of
the distribution on path space of the solution to the equation
$dX^\varepsilon_t = b(X^\varepsilon_t)\,dt + \varepsilon\sigma(X^\varepsilon_t)\,dB_t$, $X^\varepsilon_0 = x$, as $\varepsilon \to 0$, where B is a Brownian motion. When the
coefficients b and σ are, say, Lipschitz functions, it is easy to see (with an application of
Gronwall's Lemma) that the trajectories of $X^\varepsilon$ converge in law to the deterministic
solution of the ordinary differential equation $d\varphi_t = b(\varphi_t)\,dt$, $\varphi_0 = x$. The theory of
large deviations accounts for the rate of this convergence: denoting by W the Wiener
measure, the large deviation principle (LDP)

\[ W(X^\varepsilon \in \Gamma) \approx e^{-\frac{1}{\varepsilon^2} \inf_{\phi \in \Gamma} I(\phi)} \]

holds for subsets Γ of the path space C([0, T]).¹ Denote by $\varphi(h)$ the unique solution of
the ODE $d\varphi_t = b(\varphi_t)\,dt + \sigma(\varphi_t)\,dh_t$, $\varphi_0 = x$, where the control h is an absolutely
continuous path with square integrable derivative $\dot h$. The rate function I is given by
$I(\phi) = \frac{1}{2} |\dot h|^2_{L^2}$, where h is the control steering the trajectory of the deterministic
system along the given path φ, that is $\varphi(h) = \phi$. When the diffusion coefficient σ
is invertible, the control h is identified by $\dot h_t = \sigma(\varphi_t)^{-1}(\dot\varphi_t - b(\varphi_t))$, yielding the
typical form of the rate function

\[ I(\phi) = \frac{1}{2} \int_0^T \frac{(\dot\phi_t - b(\phi_t))^2}{\sigma(\phi_t)^2}\,dt. \]

The intuition behind such a result is that we can write $X^\varepsilon(\omega) = X(\varepsilon\omega)$, where X
is the 'pathwise' solution of $dX = b(X)\,dt + \sigma(X)\,dB$, $X_0 = x$. If we accept that such
a map X exists and is regular enough, then the contraction principle in conjunction
with Schilder's theorem for large deviations of Brownian paths [12, Chap. 1] provides
the LDP and the rate function for $X^\varepsilon$. The standard assumptions under which such a
program is carried out are conditions of global Lipschitz continuity and ellipticity for the
coefficients, see [10, 12]. Several works have aimed at weakening these assumptions
and extending the class of equations for which the LDP holds. Dependence on ε in
both the drift and the starting point can be introduced, and global Lipschitz continuity
can be replaced with (essentially) local Lipschitz continuity and conditions for the
non-explosion of the solution (building on the idea of Azencott [3] to exploit the
quasi-continuity property of the Itô map, which only relies on local properties of the
equation coefficients). We refer to [4] for a nice recent summary of sets of conditions
under which the Wentzell–Freidlin estimate holds.

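¹ The precise statement here is $-\inf_{\phi \in \Gamma^\circ} I(\phi) \le \liminf_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in \Gamma) \le \limsup_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in \Gamma) \le -\inf_{\phi \in \bar\Gamma} I(\phi)$.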

Recent research on heat kernel asymptotics [11] focuses on the tail behavior for
correlated stochastic volatility models. Exploiting the space-scaling properties of
the log-price process $Y_t$ in some parametric models (namely: there exists θ > 0
such that the rescaled variable $Y^\varepsilon_t := \varepsilon^\theta Y_t$ has the same law as the log-price in a
stochastic volatility model with driving noise $\varepsilon\,dB_t$), the approach of [11] is to convert
the asymptotic problem for the tail distribution, $W(Y_t > R)$ as R → ∞, to the
problem of small-noise probabilities, $W(Y^\varepsilon_t > 1)$ as ε → 0. Then, a large deviation
principle for the rescaled process serves as a building block to study the asymptotic
behavior of the corresponding heat kernel (using the tools of Malliavin calculus and
the Laplace method on path space, see [5, 7]). This approach can be fully justified,
and explicit computations are possible, for the stochastic volatility model of Stein
and Stein [25] (also known as Schöbel–Zhu [24] in the correlated case), where the
stochastic volatility follows an Ornstein–Uhlenbeck process with constant diffusion
coefficient, which is the main case-study of [11]. As pointed out in [11, Sect. 5.3], in
the framework of models where the volatility has square-root diffusion coefficient
(main example: Heston), or more generally a diffusion coefficient of the form $x^\gamma$,
γ < 1 (as in [2, 21]), such a space-scaling leads to a situation where the
same approach is no longer justified (and a formal application of the resulting
expansion even leads to a wrong conclusion). Quoting [11, Sect. 5.3], “curiously
then even a large deviation principle for (the rescaled volatility process) as given
above presently lacks justification”.
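To be more specific, consider the equation $dX_t = (\alpha + \beta X_t)\,dt + \sigma X_t^{\gamma}\,dB_t$ with
positive initial condition $X_0 = x > 0$. Looking for a value of θ such that $\varepsilon^\theta X$ satisfies
an equation with small noise ε leads one to define the rescaled process $X^\varepsilon := \varepsilon^{1/(1-\gamma)} X$,
which indeed satisfies the equation

\[ dX^\varepsilon_t = (\alpha_\varepsilon + \beta X^\varepsilon_t)\,dt + \varepsilon\sigma (X^\varepsilon_t)^{\gamma}\,dB_t, \qquad X^\varepsilon_0 = x^\varepsilon \qquad (1.1) \]

with
\[ \alpha_\varepsilon := \varepsilon^{1/(1-\gamma)} \alpha, \qquad x^\varepsilon := \varepsilon^{1/(1-\gamma)} x. \]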
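The choice of the exponent can be checked by hand. The short computation below is a sketch that only uses the linearity of the stochastic differential, with constant α as in the equation above; it shows why θ = 1/(1−γ) is the value that produces a noise coefficient of size ε.

```latex
\begin{align*}
dX^\varepsilon_t &= \varepsilon^{\theta}\,dX_t
  = \big(\varepsilon^{\theta}\alpha + \beta X^\varepsilon_t\big)\,dt
    + \varepsilon^{\theta}\sigma X_t^{\gamma}\,dB_t, \\
\varepsilon^{\theta}\sigma X_t^{\gamma}
  &= \varepsilon^{\theta}\sigma\big(\varepsilon^{-\theta}X^\varepsilon_t\big)^{\gamma}
   = \varepsilon^{\theta(1-\gamma)}\,\sigma\,(X^\varepsilon_t)^{\gamma}.
\end{align*}
% The noise coefficient equals \varepsilon exactly when \theta(1-\gamma) = 1,
% i.e. \theta = 1/(1-\gamma); the constant drift term becomes \varepsilon^{1/(1-\gamma)}\alpha.
```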

Of course, this change of variables allows one to write $W(X_t > R) = W(X^\varepsilon_t > 1)$
using $\varepsilon = R^{-(1-\gamma)}$ (indeed, $X^\varepsilon_t > 1$ if and only if $X_t > \varepsilon^{-1/(1-\gamma)}$). As mentioned above, the question is whether a large deviation
principle holds at all for $W(X^\varepsilon_t \in \cdot)$ as ε → 0. Note that both the initial condition
$x^\varepsilon$ and the constant term $\alpha_\varepsilon$ in the drift coefficient tend to zero as ε → 0. On the
one hand, it is not difficult to see that $X^\varepsilon \to 0$ in law with respect to the uniform
topology on C([0, T]). On the other hand, writing down formally the limiting ODE
that should govern the large deviations, one gets

\[ \dot\varphi_t = \beta\varphi_t + \sigma|\varphi_t|^{\gamma}\,\dot h_t, \qquad \varphi_0 = 0. \qquad (1.2) \]



Equation (1.2) is known to admit infinitely many solutions. When $\dot h_t \ge 0$, the set
of solutions contains the one-parameter family
\[ \varphi^{(\theta)}_t = e^{\beta t}\left(\sigma(1-\gamma)\int_\theta^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds\right)^{1/(1-\gamma)} 1_{\{t \ge \theta\}}, \]
with θ ≥ 0.² Hence the very definition of the map $h \mapsto \varphi(h)$
associating the control with the corresponding solution of the ODE is no longer
possible.
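The non-uniqueness can be observed numerically. The sketch below evaluates the one-parameter family for the constant control $\dot h \equiv 1$ (so that the time integral is available in closed form) and checks by central differences that each member solves (1.2). The function names and the parameter values are assumptions chosen for illustration.

```python
import math

def phi_family(t, theta, beta, gamma, sigma):
    """phi^(theta)_t for the constant control h' = 1 (time integral in closed form)."""
    if t < theta:
        return 0.0                                   # the path waits at zero until time theta
    k = beta * (1.0 - gamma)
    if k == 0.0:
        integral = t - theta
    else:
        integral = (math.exp(-k * theta) - math.exp(-k * t)) / k
    return math.exp(beta * t) * (sigma * (1.0 - gamma) * integral) ** (1.0 / (1.0 - gamma))

def ode_residual(t, theta, beta, gamma, sigma, step=1e-5):
    """|phi' - (beta*phi + sigma*|phi|^gamma)| at time t, derivative by central differences."""
    up = phi_family(t + step, theta, beta, gamma, sigma)
    down = phi_family(t - step, theta, beta, gamma, sigma)
    here = phi_family(t, theta, beta, gamma, sigma)
    return abs((up - down) / (2.0 * step) - (beta * here + sigma * abs(here) ** gamma))

beta, gamma, sigma = 0.5, 0.6, 0.8                   # assumed illustrative parameters
thetas = [0.0, 0.3, 0.7]
values_at_1 = [phi_family(1.0, th, beta, gamma, sigma) for th in thetas]
residuals = [ode_residual(1.0, th, beta, gamma, sigma) for th in thetas]

# Square-root case with beta = 0: phi = sigma^2/4 * (t - theta)^2 for t >= theta.
textbook = phi_family(2.0, 1.0, 0.0, 0.5, 2.0)
```

All three paths start from 0 and are driven by the same control, yet they take different values at t = 1, which is exactly why the map from the control to the solution is not well defined here.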
We will occasionally refer to this situation as “degenerate”. Let us note straight
away that large deviations for diffusions with non-Lipschitz coefficients have been
studied in Baldi and Caramellino [4], Donati-Martin et al. [13], Klebaner and Liptser
[19] and Robertson [23]. In [4, Theorem 1.2] a large deviation principle is derived
for the family of equations $dX^\varepsilon_t = b(X^\varepsilon_t)\,dt + \varepsilon\sigma(X^\varepsilon_t)\,dB_t$, $X^\varepsilon_0 = x > 0$ (note the
strictly positive initial condition), where the function σ(·) roughly behaves like $\sigma x^\gamma$
(see [4, Assumption (A1.1)] for precise conditions) and b : [0, ∞) → R is a locally
Lipschitz function with sub-linear growth and b(0) > 0. The conditions for both a
drift term b and an initial datum independent of ε, such that b(0) > 0 and x > 0, are
violated in the situation we consider here. In [13], b(0) = 0 and x = 0 are allowed,
but the analysis is limited to the square-root case γ = 1/2, and b and x remain
independent of ε. Note in this respect that setting b(0) = x = 0 implies $X^\varepsilon \equiv 0$
for all ε, and in this case a LDP trivially holds with the rate function I(0) = 0,
I(φ) = ∞ for φ ≢ 0 (as stated in [13, Theorem 1.3]); this is in contrast with (1.1), where
both $b^\varepsilon(0) = \alpha_\varepsilon$ and $x^\varepsilon$ do tend to zero as ε → 0, but coming from strictly positive
values, so that the solution of the SDE is non-trivial for every value of ε. In both
these works, uniqueness for the limiting ODE is a key point (it appears as a part
of [4, Assumption (A2.3)] and is exploited in [13, Sect. 5]). In order to study the
asymptotic behavior of the ruin probability $W(\tau_0 \le T)$ with $\tau_0 = \inf\{t : X_t = 0\}$
as the initial condition x tends to infinity, Klebaner and Liptser [19] exploit a similar
space scaling by working with the ‘normed’ process $X^x_t = X_t/x$, and show that a
LDP holds for the process $X^x$ as x → ∞. The major difference with our setting is
that the initial condition $X^x_0 = 1$ in [19] is fixed and does not tend to zero as $x^\varepsilon$ does in
(1.1), which is one of the difficulties to overcome in our analysis. Robertson [23]
derives a LDP for a class of stochastic volatility models, including the Heston model
with square-root volatility process. One of the assumptions used there is that the
small noise problem for the volatility process has the same form as in Donati-Martin
et al. [13], see [23, Assumption 2.1], and the work carried out is to transfer the LDP
to the second component of the process (the log-price). Therefore, the work of [23]
does not cover small-noise problems in the form of (1.1).
We establish a LDP for a generalized version of Eq. (1.1), allowing α to be a
function of the process. That is, we start from Eq. (1) under the assumptions:
(H1) γ ∈ [1/2, 1), σ > 0, x > 0.
(H2) b(y) = α(y) + β y, where α is a Lipschitz continuous and bounded function,
and α(y) ≥ 0 in a neighbourhood of 0.

² When β = 0, γ = 1/2 and $\dot h \equiv 1$, one retrieves the textbook example of an ODE for which uniqueness
fails, $\dot\varphi_t = \sigma\sqrt{|\varphi_t|}$, whose solutions from $\varphi_0 = 0$ are given by the one-parameter family
$\varphi^{(\theta)}_t = \frac{\sigma^2}{4}(t - \theta)^2\, 1_{\{t \ge \theta\}}$.

Under (H1)–(H2), (1) is known to admit a positive solution, which is pathwise unique
by Yamada and Watanabe’s uniqueness theorem.

Theorem 1.1 Assume conditions (H1)–(H2), and let $(X_t)_{t \ge 0}$ be the unique strong
solution to (1). Set $X^\varepsilon := \varepsilon^{1/(1-\gamma)} X$; then $X^\varepsilon$ satisfies (1.1) with the constant α
replaced by the function α(·). Then, the family $\{X^\varepsilon\}_\varepsilon$ satisfies a large deviation
principle on the path space $C([0, T], \mathbb{R}_+)$ with inverse speed $\varepsilon^2$ and rate function
\[ I_T(\varphi) = \frac{1}{2\sigma^2} \int_0^T \left(\frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^{\gamma}}\right)^2 1_{\{\varphi_t \ne 0\}}\,dt, \]
and $I_T(\varphi) = +\infty$ whenever $\varphi(0) \ne 0$ or ϕ is not absolutely continuous.

Let us note that in the definition of $I_T$ above, the expression $\frac{1}{\varphi_t^{\gamma}} 1_{\{\varphi_t \ne 0\}}$ is intended
to be well defined for any $\varphi_t \in \mathbb{R}_+$, and it is equal to zero when $\varphi_t = 0$.
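To make the convention at ϕ = 0 concrete, here is a minimal numerical sketch of $I_T$ on a time grid. The function `rate_I_T`, the midpoint discretization and the test path (which waits at zero until θ = 0.5 and then follows the square-root solution $\frac{\sigma^2}{4}(t-\theta)^2$) are illustrative assumptions. For that path with β = 0, γ = 1/2 the integrand equals σ² after time θ, so the exact value is (T − θ)/2.

```python
import math

def rate_I_T(path, dt, beta, gamma, sigma):
    """Discretized I_T with the convention (1/phi^gamma) * 1_{phi != 0} = 0 at phi = 0."""
    if path[0] != 0.0 or min(path) < 0.0:
        return math.inf                  # I_T = +infinity unless phi(0) = 0 (and phi >= 0)
    total = 0.0
    for left, right in zip(path[:-1], path[1:]):
        mid = 0.5 * (left + right)
        if mid == 0.0:
            continue                     # the path sits at zero on this cell: no cost
        slope = (right - left) / dt
        total += ((slope - beta * mid) / mid ** gamma) ** 2 * dt
    return total / (2.0 * sigma ** 2)

n, T, theta = 10000, 1.0, 0.5            # assumed grid and waiting time
dt = T / n
sigma = 2.0
grid = [T * i / n for i in range(n + 1)]
wait_then_move = [0.25 * sigma ** 2 * max(t - theta, 0.0) ** 2 for t in grid]

cost_wait_then_move = rate_I_T(wait_then_move, dt, 0.0, 0.5, sigma)
cost_zero = rate_I_T([0.0] * (n + 1), dt, 0.0, 0.5, sigma)
cost_bad_start = rate_I_T([1.0] * (n + 1), dt, 0.0, 0.5, sigma)
```

The time spent at zero is free, the null path has zero cost, and a path that does not start at zero is assigned infinite cost.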
It is easy to see that the unique zero of $I_T$ is ϕ ≡ 0, consistently with the
fact that $X^\varepsilon \xrightarrow{W} 0$ as ε → 0. Roughly speaking, Theorem 1.1 allows one to write
$W(X^\varepsilon \in \Gamma) = \exp\left(-\frac{1}{\varepsilon^2}\left(\inf_{\phi \in \Gamma} I_T(\phi) + \psi(\varepsilon)\right)\right)$ for subsets Γ of C([0, T]) such
that $\inf_{\phi \in \bar\Gamma} I_T(\phi) = \inf_{\phi \in \Gamma^\circ} I_T(\phi)$, where the function ψ(ε) vanishes as ε → 0; we
refer to Theorem 2.1 in Sect. 2 for the precise statements.
According to our definition of $X^\varepsilon$, one has $W(X^\varepsilon_t \ge 0, \forall t \ge 0, \forall \varepsilon > 0) =
W(X_t \ge 0, \forall t \ge 0) = 1$. A criterion for the strict positivity of the trajectories of
$X^\varepsilon$, based on Feller's test for explosion, can also be given (see [9, Proposition 3.1]:
when γ > 1/2, α(0) > 0 implies $W(X^\varepsilon_t > 0, t \ge 0) = 1$, while for γ = 1/2, the
same conclusion is guaranteed by $2\alpha(y)/\sigma^2 \ge 1$ for y in a right neighborhood of
zero, yielding the familiar Feller condition $2\alpha/\sigma^2 \ge 1$ when α is constant). Note
that Theorem 1.1 does not assume any of these conditions for the non-attainability
of zero; in particular for the CIR diffusion, we do not assume the Feller condition on
the coefficients α and σ.
From Theorem 1.1, tail asymptotics for some functionals of the process X can
be derived (which is exactly why the ε-scaling leading to $X^\varepsilon$ was introduced!). The
pathwise LDP allows one to consider path functionals of the process, such as the running
supremum, or the time average.

Theorem 1.2 Let $(X_t)_{t \ge 0}$ be the unique strong solution to (1) under conditions
(H1)–(H2), and let T > 0. Then, as R → ∞,

\[ W(X_T \ge R) = e^{-R^{2(1-\gamma)}(c_T + o(1))} \qquad (1.3) \]

and

\[ W\Big(\sup_{t \in [0,T]} X_t \ge R\Big) = e^{-R^{2(1-\gamma)}(c_T + o(1))} \qquad (1.4) \]

and

\[ W\Big(\frac{1}{T}\int_0^T X_t\,dt \ge R\Big) = e^{-R^{2(1-\gamma)}(\nu_T + o(1))}. \qquad (1.5) \]

The constants $c_T$ and $\nu_T$ are explicitly known in terms of the model parameters; they
are provided below in Proposition 2.5 and, for the case γ = 1/2, in Proposition 3.14, respectively.

The estimates in Theorem 1.2 can be compared with the explicit formulae available
for cumulative distributions and critical exponents in the CIR and CEV models:
these consistency checks are done in Sects. 2.1 and 3.4, showing that the estimates
in Theorem 1.2 are correct on the log-scale. While in the one-dimensional setting
the large deviation approach provided by Theorem 1.1 applies to equations with a more
general drift term than a purely affine function, it also opens the way to heat kernel
analysis for higher-dimensional diffusions involving (1) as a component, which is
exactly the case left open in [11].
Let us finally note that, due to the non uniqueness of solutions for the limiting
system, the problem we consider here appears to be related to the issue of regulariza-
tion by noise of ODEs. Leaving further discussions to future work, let us just point
out here a structural difference with that setting: in that context, one considers an
SDE of the form $dX^\varepsilon_t = b(X^\varepsilon_t)\,dt + \varepsilon\,dB_t$, with unit dispersion coefficient, seen as a
perturbation of the deterministic system $\dot x_t = b(x_t)$ with non-Lipschitz drift b (e.g.
$b(x) = \mathrm{sign}(x)|x|^\gamma$). Among the possible solutions of the deterministic system, one
then looks at the (few) ones supporting the limiting law of $X^\varepsilon$, obtaining the so-called
zero noise limits of the equation; see [27] and references therein. In our framework,
the equation for $X^\varepsilon$ already possesses a Lipschitz continuous drift $b(x) = \alpha_\varepsilon + \beta x$.
Correspondingly, the limiting system $\dot x_t = \beta x_t$, $x_0 = 0$, already has a unique solution
(here: the null path x = 0), which then gives the unique weak limit for $X^\varepsilon$ (in
contrast to [27, Corollary 1.2], where the limit is a probability distribution supported
on two trajectories). As we pointed out, the difficulties in our setting come from the
non-Lipschitz diffusion coefficient and appear at the level of the definition of the rate
function via the control system (1.2).
In the remainder of the document, Sect. 2 is devoted to the proof of Theorem 1.1,
while in Sect. 3.4 we prove the different statements of Theorem 1.2. We collect in
Appendix A the proofs of some of the more technical material.

2 Main Theoretical Estimates

Let $\Omega := C([0, T], \mathbb{R})$, $\Omega_{\ge 0} := C([0, T], \mathbb{R}_+)$ denote the space of continuous
(resp. continuous non-negative) functions on [0, T]. $(\Omega, \mathcal{F}_t, \mathcal{F})$ denotes the canonical
Wiener space, W the Wiener measure on $(\Omega, \mathcal{F}_t, \mathcal{F})$, and E the expectation under W.
We denote $H = \{h \in AC([0, T], \mathbb{R}) : \dot h \in L^2\}$ the space of absolutely continuous
paths on [0, T] with square-integrable derivative (usually referred to as Cameron-Martin
space). For a set of coefficients α(·), β, γ, σ satisfying conditions (H1)–(H2),
we denote X the W-almost surely unique strong solution of (1). We define the rescaled
process $X^\varepsilon := \varepsilon^{\frac{1}{1-\gamma}} X$; it is clear that $X^\varepsilon$ solves Eq. (1.1) with coefficients identified
by $\alpha_\varepsilon(x) = \varepsilon^{1/(1-\gamma)} \alpha(x)$ and $x^\varepsilon = \varepsilon^{1/(1-\gamma)} x$. Denote $b^\varepsilon(x) := \alpha_\varepsilon(x) + \beta x$.
The following theorem gives the precise LDP announced in Theorem 1.1 in the
Introduction. We recall that the expression $\frac{1}{y^{\gamma}} 1_{\{y \ne 0\}}$ is well defined for any $y \in \mathbb{R}_+$,
and it is equal to zero when y = 0.

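Theorem 2.1 Let $X^\varepsilon$ be the unique strong solution to (1.1). Then,

\[
\begin{aligned}
\limsup_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in F) &\le -\inf_{\varphi \in F} I_T(\varphi), \\
\liminf_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in G) &\ge -\inf_{\varphi \in G} I_T(\varphi)
\end{aligned}
\qquad (2.1)
\]

for every closed set $F \subseteq \Omega_{\ge 0}$ and every open set $G \subseteq \Omega_{\ge 0}$, where the rate function
$I_T(\varphi)$ is defined by
\[ I_T(\varphi) := \frac{1}{2\sigma^2} \int_0^T \left(\frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^{\gamma}}\right)^2 1_{\{\varphi_t \ne 0\}}\,dt, \qquad (2.2) \]

and $I_T(\varphi) = +\infty$ whenever $\varphi(0) \ne 0$ or ϕ is not absolutely continuous.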

Remark 2.2 We could state the large deviation principle of Theorem 2.1 on $\Omega =
C([0, T], \mathbb{R})$, setting the rate function $I_T(\varphi)$ to +∞ whenever $\varphi \notin \Omega_{\ge 0}$. Since the
process $X^\varepsilon$ is known to be positive W-a.s. for every ε > 0, with such a definition
of the rate function the LDP (2.1) holds for every closed subset F and every open
subset G of Ω.

Remark 2.3 As pointed out in the Introduction, the rate function for a family $\{X^\varepsilon\}_\varepsilon$
satisfying $dX^\varepsilon_t = b(X^\varepsilon_t)\,dt + \varepsilon\sigma(X^\varepsilon_t)\,dB_t$, $X^\varepsilon_0 = x$, can be written as

\[ I_T(\varphi) = \inf\left\{ \frac{1}{2} |\dot h|^2_{L^2} : h \in H,\ \varphi(h) = \varphi \right\} \qquad (2.3) \]

where ϕ(h) is the solution to the limiting ODE controlled by h, $\dot\varphi = b(\varphi) + \sigma(\varphi)\dot h$
and $\varphi_0 = x$, provided this solution is unique. In our setting, consider ϕ ∈ S(u),
where now S(u) denotes the set of positive solutions of the degenerate ODE (1.2)
with control parameter h = u ∈ H: on the set {ϕ > 0}, u is uniquely determined by
ϕ via $\dot u_t = \frac{\dot\varphi_t - \beta\varphi_t}{\sigma\varphi_t^{\gamma}}$ (solving (1.2) for the control); on the set {ϕ = 0}, the function ϕ is seen to satisfy Eq. (1.2) for
any control parameter h. This means that the set of h such that ϕ ∈ S(h) contains
the infinitely many elements given by

\[ \dot h_t = \frac{\dot\varphi_t - \beta\varphi_t}{\sigma\varphi_t^{\gamma}}\, 1_{\{\varphi_t > 0\}} + \frac{d\tilde h_t}{dt}\, 1_{\{\varphi_t = 0\}}, \qquad \tilde h \in H. \]

The control $h^0$ achieving the minimum norm is obtained setting $\tilde h \equiv 0$. This gives
$\frac{1}{2} |\dot h^0|^2_{L^2} = \inf\left\{\frac{1}{2} |\dot h|^2_{L^2} : h \in H,\ \varphi \in S(h)\right\} = I_T(\varphi)$ for the rate function $I_T$ defined
in (2.2).

Remark 2.4 Assume that b : [0, ∞) → R is a locally Lipschitz function with
sublinear growth and b(0) > 0, and that $X^\varepsilon$ satisfies $dX^\varepsilon_t = b(X^\varepsilon_t)\,dt + \varepsilon\sigma (X^\varepsilon_t)^{\gamma}\,dB_t$
and $X^\varepsilon_0 = x > 0$. Then it is known from [4, Theorem 2.1] or [8, Theorem 4.2] that
$X^\varepsilon$ satisfies a LDP with rate function
\[ J_T(\varphi) := \frac{1}{2\sigma^2} \int_0^T \left(\frac{\dot\varphi_t - b(\varphi_t)}{\varphi_t^{\gamma}}\right)^2 dt, \]

and $J_T(\varphi) = \infty$ if ϕ is not absolutely continuous, where one classically agrees that
$1/\varphi_t$ is equal to +∞ if $\varphi_t = 0$. We stress that the latter rate function is radically
different from $I_T$ defined in (2.2): whenever ϕ = 0 on some non-trivial interval
K ⊂ [0, T], then $J_T(\varphi) = \infty$, while in such a case the integrand in (2.2) gives zero
contribution to $I_T$ on K. In other words, while trajectories with a zero-set of positive
measure require infinite energy to be followed, in the small-noise limit, by the process
of this remark (with b(0) > 0 and fixed x > 0), they are favoured by the rate function
$I_T$ of the rescaled process $X^\varepsilon$ in (1.1).

2.1 Tail Asymptotics

The space-scaling $X^\varepsilon = \varepsilon^{1/(1-\gamma)} X$ together with the large deviation principle (2.1)
allows one to work out tail asymptotics for functionals of the process X. The following
proposition provides the precise constants appearing in Theorem 1.2 in the
Introduction.

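Proposition 2.5 The asymptotic formulas (1.3) and (1.4) in Theorem 1.2 hold with
the constant $c_T$ given by
\[
c_T = \begin{cases}
\dfrac{\beta e^{-2\beta(1-\gamma)T}}{\sigma^2 (1-\gamma)\left(1 - e^{-2\beta(1-\gamma)T}\right)} & \text{if } \beta \ne 0, \\[2ex]
\dfrac{1}{2\sigma^2 (1-\gamma)^2 T} & \text{if } \beta = 0.
\end{cases}
\qquad (2.4)
\]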

One can see that cT does not depend on the function α(·) in the drift of X , nor on
the initial condition x.
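A heuristic way to see where (2.4) comes from is to minimize $I_T$ over paths with $\varphi_0 = 0$ and $\varphi_T = 1$, the level that corresponds to $X_T \ge R$ after rescaling. The computation below is only a sketch under that assumption and is not necessarily the route taken in the proof; the symbols ψ, u and k are introduced here for illustration.

```latex
\begin{align*}
\psi_t := \varphi_t^{1-\gamma},\quad k := \beta(1-\gamma)
 &\;\Longrightarrow\;
 \frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^{\gamma}}
   = \frac{\dot\psi_t - k\psi_t}{1-\gamma},
 \qquad
 I_T(\varphi) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T u_t^2\,dt,
 \quad u_t := (\dot\psi_t - k\psi_t)\,1_{\{\psi_t \neq 0\}}, \\
1 = \psi_T = \int_0^T e^{k(T-s)}u_s\,ds
 &\;\Longrightarrow\;
 1 \le \int_0^T e^{2k(T-s)}\,ds \int_0^T u_s^2\,ds
   = \frac{e^{2kT}-1}{2k}\int_0^T u_s^2\,ds \qquad (k \neq 0), \\
\inf_{\varphi_T = 1} I_T(\varphi)
 &= \frac{1}{2\sigma^2(1-\gamma)^2}\cdot\frac{2k}{e^{2kT}-1}
  = \frac{\beta\,e^{-2\beta(1-\gamma)T}}{\sigma^2(1-\gamma)\big(1-e^{-2\beta(1-\gamma)T}\big)} = c_T .
\end{align*}
% Equality in Cauchy-Schwarz holds for u_s proportional to e^{k(T-s)}.
% For k = 0 the first integral equals T, which gives 1/(2\sigma^2(1-\gamma)^2 T).
```

Neither α(·) nor x enters this computation, which is consistent with the observation above.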

Remark 2.6 Some comments are in order.

(i) Comparison with explicit formulae for the CEV process. The asymptotic
behavior (1.3) can be compared with the explicit formulae available for the density
of the CEV process. When α ≡ 0 in (1), X can be obtained as a deterministic
time-change of a power of a squared Bessel process (see [16, Sect. 6.4.3]). As
a consequence, for every T > 0 the random variable $X_T$ is known to admit a
density with respect to the Lebesgue measure on the positive real line, given by

\[
\begin{aligned}
f_{X_T}(y) = {} & \frac{1-\gamma}{d(T)}\, e^{\beta(-2(1-\gamma)+1/2)T} \exp\left(-\frac{1}{2d(T)}\left(x^{2(1-\gamma)} + y^{2(1-\gamma)} e^{-2\beta(1-\gamma)T}\right)\right) \\
& \times x^{1/2}\, y^{-2\gamma+1/2}\, I_{1/(2(1-\gamma))}\left(\frac{1}{d(T)}\, x^{1-\gamma} y^{1-\gamma} e^{-\beta(1-\gamma)T}\right), \qquad y > 0,
\end{aligned}
\qquad (2.5)
\]

where $I_\nu$ is the modified Bessel function of the first kind of index ν > 0, and
$d(T) = \frac{(1-\gamma)\sigma^2}{2\beta}\left(1 - e^{-2\beta(1-\gamma)T}\right)$ (note en passant that one has d(T) > 0 for
every choice of the sign of β).³ The formula (2.5) is also valid for β = 0, when
one replaces all the β-dependent constants with their limits as β → 0, such as
$d(T)|_{\beta=0} = (1-\gamma)^2 \sigma^2 T$. Using the asymptotic behavior (see [1, Sect. 9.7.1])
of the modified Bessel function $I_\nu(z) \sim \frac{e^z}{\sqrt{2\pi z}}$ as z → ∞ for fixed ν > 0, one
immediately obtains

\[ \log f_{X_T}(y) =: g(y) \sim -\frac{e^{-2\beta(1-\gamma)T}}{2d(T)}\, y^{2(1-\gamma)} = -c_T\, y^{2(1-\gamma)}, \qquad y \to \infty, \]

with the constant $c_T$ defined in (2.4). Using some standard tools of regular
variation [6], one can then easily prove that $\log W(X_T > y) = \log \int_y^\infty e^{g(z)}\,dz \sim
g(y) \sim -c_T\, y^{2(1-\gamma)}$ as y → ∞, thus showing that estimate (1.3) is exact on the
log-scale.
(ii) The asymptotic estimate $f_{X_T}(y) \le A_T\, e^{-a_T y^{2(1-\gamma)}}$, y > 1, for the density of
$X_T$ was proven in [9] for the solutions of a class of SDEs containing (1) under
conditions (H1)–(H2) (namely, in [9] the coefficients β and γ are also allowed
to depend smoothly on X), relying on techniques of Malliavin calculus and
transformations for 1-dimensional SDEs. The constant $a_T$ provided there is not
optimal. While the estimates in [9] remain valid for more general equations, the
large deviation principle in Theorem 2.1 allows one to obtain a sharp estimate on
the log-scale.
The asymptotic behavior $W\left(\frac{1}{T}\int_0^T X_t\,dt \ge R\right) = \exp\left(-R^{2(1-\gamma)}(\nu_T + o(1))\right)$ for the
time average of the process can also be proven using Theorem 2.1: see Proposition
3.14 in Sect. 3.4, where an expression of the constant $\nu_T$ is provided in the case
γ = 1/2.
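The identification of the density exponent with $c_T$ in item (i) can be checked numerically. The sketch below codes (2.4) and d(T) as stated above, using `math.expm1` for numerical stability near β = 0; the function names and the parameter values are assumptions made for illustration.

```python
import math

def c_T(beta, gamma, sigma, T):
    """Constant c_T of (2.4); expm1 keeps the beta != 0 branch stable near beta = 0."""
    if beta == 0.0:
        return 1.0 / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2 * T)
    k = 2.0 * beta * (1.0 - gamma) * T
    # beta e^{-k} / (sigma^2 (1-gamma) (1 - e^{-k})) = beta / (sigma^2 (1-gamma) (e^{k} - 1))
    return beta / (sigma ** 2 * (1.0 - gamma) * math.expm1(k))

def d(beta, gamma, sigma, T):
    """d(T) from Remark 2.6(i), with its beta -> 0 limit (1-gamma)^2 sigma^2 T."""
    if beta == 0.0:
        return (1.0 - gamma) ** 2 * sigma ** 2 * T
    k = 2.0 * beta * (1.0 - gamma) * T
    return (1.0 - gamma) * sigma ** 2 / (2.0 * beta) * (-math.expm1(-k))

def density_exponent(beta, gamma, sigma, T):
    """Coefficient e^{-2 beta (1-gamma) T} / (2 d(T)) of y^{2(1-gamma)} in -log f_{X_T}(y)."""
    return math.exp(-2.0 * beta * (1.0 - gamma) * T) / (2.0 * d(beta, gamma, sigma, T))

gamma, sigma, T = 0.5, 1.0, 1.0          # assumed illustrative parameters
betas = [-1.5, 0.0, 0.7]
pairs = [(c_T(b, gamma, sigma, T), density_exponent(b, gamma, sigma, T)) for b in betas]
all_d_positive = all(d(b, gamma, sigma, T) > 0.0 for b in betas)
near_zero_gap = abs(c_T(1e-9, gamma, sigma, T) - c_T(0.0, gamma, sigma, T))
```

The two expressions agree for negative, zero and positive β, d(T) stays positive for every sign of β, and the β ≠ 0 branch of (2.4) connects continuously to the β = 0 branch.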

³ When γ ∈ [1/2, 1), the law of $X_T$ also possesses an atom at zero, $P(X_T = 0) = m_T > 0$, and an
explicit formula for the mass $m_T$ is available (see again [16, Chap. 6]). From our point of view, this
only means that the density $f_{X_T}$ does not integrate to 1 on (0, ∞), without affecting our analysis
of the tail asymptotics at ∞.

3 Proof of the Main Estimates

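We prove the large deviation principle in Theorem 2.1 by first showing the exponential
tightness of the family $\{X^\varepsilon\}_\varepsilon$, namely for every m < 0 there exists a compact set
$K_m \subset C([0, T])$ such that $\limsup_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in K_m^c) \le m$. We then prove the
weak upper bound

\[ \limsup_{R \to 0} \limsup_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in B(\varphi, R)) \le -I_T(\varphi) \qquad \forall \varphi \in \Omega_{\ge 0}, \]

and the weak lower bound

\[ \liminf_{R \to 0} \liminf_{\varepsilon \to 0} \varepsilon^2 \log W(X^\varepsilon \in B(\varphi, R)) \ge -I_T(\varphi) \qquad \forall \varphi \in \Omega_{\ge 0} \]

where B(ϕ, R) denotes the closed ball in C([0, T]) of radius R, $B(\varphi, R) := \{\tilde\varphi :
|\tilde\varphi - \varphi|_\infty \le R\}$. It is a general fact that exponential tightness combined with the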
weak upper bound yields the large deviation upper bound in (2.1) for any closed set
after a covering argument (see [12, Chaps. 1 and 2]). On the other hand, the weak
lower bound trivially provides the full lower bound in (2.1), observing that open sets
are neighborhoods of their points.

3.1 Exponential Tightness

We prove the exponential tightness considering balls in the Hölder norm
$\|\omega\|_\eta := \sup_{s,t \le T,\, s \ne t} \frac{|\omega_t - \omega_s|}{|t - s|^{\eta}}$ and a natural bound on the initial condition $\omega_0$. More precisely,
we define
\[ K_R := \{\|\omega\|_\eta \le R\} \cap \{\omega_0 \in (0, x]\}. \qquad (3.1) \]

It is classical that these sets are compact in C([0, T]).

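Proposition 3.1 The family of measures $W(X^\varepsilon \in \cdot)$ is exponentially tight in scale
$\varepsilon^2$, i.e.
\[ \lim_{R \to +\infty} \limsup_{\varepsilon \to 0} \varepsilon^2 \log W\left(X^\varepsilon \in K_R^c\right) = -\infty \]
for every $0 < \eta < \frac{1}{2}$.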

We follow [13] in the proof of Proposition 3.1. First, let us observe that for ε ≤ 1,
W (X 0ε ∈ (0, x]) = 1 so that we just need to estimate the Hölder norm of X ε . To this
end, we use a version of Garsia-Rodemich-Rumsey’s Lemma, and the existence of
exponential moments for a process bounding X ε from above.

Lemma 3.2 Consider $(\tilde X_t, t \ge 0)$ the strong solution to

\[ d\tilde X_t = (|\alpha|_\infty + |\beta| \tilde X_t)\,dt + \sigma (\tilde X_t)^{\gamma}\,dB_t, \qquad \tilde X_0 = x \]

and define $\tilde X^\varepsilon := \varepsilon^{1/(1-\gamma)} \tilde X$. Then, there exist positive constants c and C such that:
\[ E\left[\exp\left(c\,\varepsilon^{-2} (\tilde X^\varepsilon_t)^{2(1-\gamma)}\right)\right] \le C, \qquad \forall t \in [0, T],\ \forall \varepsilon > 0. \qquad (3.2) \]

Proof According to the definition of $\tilde X^\varepsilon$, one has $\varepsilon^{-2} (\tilde X^\varepsilon_t)^{2(1-\gamma)} = \tilde X_t^{2(1-\gamma)}$, so that
(3.2) holds if and only if $E\left[\exp\left(c\,\tilde X_t^{2(1-\gamma)}\right)\right] \le C$ for all t ∈ [0, T]. When γ = 1/2,
(3.2) follows from the asymptotic behavior of the density of the CIR process for
large arguments (see e.g. [16, Sect. 6.3.2, p. 358]); for general γ and β = 0, from the
asymptotic behavior of the density of the classical CEV process as stated for example
in [16, Lemma 6.4.3.1, p. 368]. For general γ and β, we rely on a slight generalization
of the proof of [9, Proposition 3.3]; we leave the details to Appendix A.

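The next proposition is a direct consequence of Garsia-Rodemich-Rumsey's
Lemma; see Appendix A for a statement of this lemma and a proof of Proposition 3.3.

Proposition 3.3 Let ω ∈ Ω. Fix ε, R > 0, $\eta \in (0, \frac{1}{2})$. Assume that:

\[ \int_0^T \int_0^T \exp\left(\frac{|\omega_t - \omega_s|}{\varepsilon^2 \sqrt{|t - s|}}\right) ds\,dt \le K_{\varepsilon,\eta}(R) \qquad (3.3) \]

with $K_{\varepsilon,\eta}(R) := \frac{1}{4} \exp\left(T^{\eta-1/2}\left(\frac{R}{8\varepsilon^2} - 4T^{1/2-\eta} - K_\eta\right)\right) - \frac{1}{4} T^2$ and $K_\eta :=
\sup_{u \in [0,T]} 2u^{1/2-\eta} \log(u^{-1}) < \infty$. Then,

\[ \|\omega\|_\eta \le R. \qquad (3.4) \]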

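In the proof of Proposition 3.1, we exploit a localization procedure: for any ε > 0
and n ∈ N, define the process $X^{\varepsilon,n}$ as the strong solution of the SDE with truncated
coefficients:

\[ dX^{\varepsilon,n}_t = b^\varepsilon(X^{\varepsilon,n}_t \wedge n)\,dt + \varepsilon\sigma \left(X^{\varepsilon,n}_t \wedge n\right)^{\gamma} dB_t, \qquad X^{\varepsilon,n}_0 = x^\varepsilon. \qquad (3.5) \]

The paths of $X^{\varepsilon,n}$ can be decomposed into their martingale part and locally bounded
variation part
\[ dX^{\varepsilon,n}_t = dA^{\varepsilon,n}_t + dM^{\varepsilon,n}_t \]

with $dM^{\varepsilon,n}_t = \varepsilon\sigma (X^{\varepsilon,n}_t \wedge n)^{\gamma}\,dB_t$ and $dA^{\varepsilon,n}_t = b^\varepsilon(X^{\varepsilon,n}_t \wedge n)\,dt$. We shall also define
for every n, ε the stopping time $T^{\varepsilon,n} := \inf\{t \ge 0 : X^\varepsilon_t \ge n\}$. By the pathwise
uniqueness for Eq. (1) (equivalently, (3.5)), we have that up to time $T^{\varepsilon,n}$ the processes
$(X^\varepsilon_t)_{t \in [0,T]}$ and $(X^{\varepsilon,n}_t)_{t \in [0,T]}$ coincide almost surely. More precisely, ∀n ∈ N and
ε > 0
\[ W\left(X^\varepsilon_{t \wedge T^{\varepsilon,n}} = X^{\varepsilon,n}_{t \wedge T^{\varepsilon,n}},\ \forall t \in [0, T]\right) = 1. \qquad (3.6) \]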

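Proof of Proposition 3.1 Let us fix $\eta \in (0, \frac{1}{2})$. By (3.6),

\[
\begin{aligned}
W\left(\|X^\varepsilon\|_\eta \ge R\right) &\le W\left(\|X^{\varepsilon,n}\|_\eta \ge R,\ T^{\varepsilon,n} \ge T\right) + W\left(T^{\varepsilon,n} \le T\right) \\
&\le W\left(\|X^{\varepsilon,n}\|_\eta \ge R\right) + W\left(T^{\varepsilon,n} \le T\right).
\end{aligned}
\qquad (3.7)
\]

Let us estimate the first term in (3.7). Using Proposition 3.3 and Markov's inequality
we have for every ε, n:
\[
\begin{aligned}
W\left(\|M^{\varepsilon,n}\|_\eta \ge R\right) &\le W\left(\int_0^T \int_0^T \exp\left(\varepsilon^{-2} \frac{|M^{\varepsilon,n}_t - M^{\varepsilon,n}_s|}{\sqrt{|t - s|}}\right) ds\,dt \ge K_{\varepsilon,\eta}(R)\right) \\
&\le \frac{1}{K_{\varepsilon,\eta}(R)} \int_0^T \int_0^T E\left[\exp\left(\varepsilon^{-2} \frac{|M^{\varepsilon,n}_t - M^{\varepsilon,n}_s|}{\sqrt{|t - s|}}\right)\right] ds\,dt.
\end{aligned}
\]
Applying the exponential martingale inequality $E\left[\exp(\lambda M_t)\right] \le E\left[\exp\left(2\lambda^2 \langle M \rangle_t\right)\right]^{1/2}$
[22, Chap. IV] with $\lambda = \frac{1}{\varepsilon^2 \sqrt{|t-s|}}$, for t > s one has

\[
\begin{aligned}
E\left[\exp\left(\frac{|M^{\varepsilon,n}_t - M^{\varepsilon,n}_s|}{\varepsilon^2 \sqrt{t - s}}\right)\right] &\le 2\, E\left[\exp\left(\frac{2\sigma^2}{\varepsilon^2 (t - s)} \int_s^t \left(X^{\varepsilon,n}_r \wedge n\right)^{2\gamma} dr\right)\right]^{1/2} \\
&\le 2 \exp\left(\sigma^2 \varepsilon^{-2} n^{2\gamma}\right).
\end{aligned}
\]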
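For completeness, here is a short sketch of why the exponential martingale inequality holds, written for a continuous local martingale M with $M_0 = 0$ and quadratic variation $\langle M \rangle$ (standard notation, assumed here). It combines the Cauchy–Schwarz inequality with the supermartingale property of stochastic exponentials.

```latex
\begin{align*}
E\big[e^{\lambda M_t}\big]
 &= E\Big[e^{\lambda M_t - \lambda^2\langle M\rangle_t}\;e^{\lambda^2\langle M\rangle_t}\Big]
 \le E\Big[e^{2\lambda M_t - 2\lambda^2\langle M\rangle_t}\Big]^{1/2}
     E\Big[e^{2\lambda^2\langle M\rangle_t}\Big]^{1/2}
 \le E\Big[e^{2\lambda^2\langle M\rangle_t}\Big]^{1/2}.
\end{align*}
% The first factor is the expectation of the stochastic exponential of 2\lambda M,
% a non-negative supermartingale started at 1, hence at most 1.
% The factor 2 in front comes from e^{|x|} \le e^{x} + e^{-x}, applied to the increment of M.
```

Applied to the increment $M^{\varepsilon,n}_t - M^{\varepsilon,n}_s$, whose quadratic variation is at most $\varepsilon^2\sigma^2 n^{2\gamma}(t-s)$, this gives the bound $2\exp(\sigma^2\varepsilon^{-2}n^{2\gamma})$.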

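Therefore, using the definition of the constant $K_{\varepsilon,\eta}(R)$ in Proposition 3.3

\[ \limsup_{\varepsilon \to 0} \varepsilon^2 \log W\left(\|M^{\varepsilon,n}\|_\eta \ge R\right) \le -T^{\eta-1/2}\, \frac{R}{8} + \sigma^2 n^{2\gamma}. \qquad (3.8) \]

For the bounded variation part $A^{\varepsilon,n}$, we observe that

\[ W\left(\|A^{\varepsilon,n}\|_\eta \ge R\right) \le W\left(T^{1-\eta} \sup_{t \in [0,T]} b^\varepsilon\left(X^{\varepsilon,n}_t \wedge n\right) \ge R\right). \]

Under hypothesis (H), $b^\varepsilon(x) \le |\alpha|_\infty + \beta x$ for every x. Therefore, for every ε, n
\[ W\left(\|A^{\varepsilon,n}\|_\eta \ge R\right) \le W\left(T^{1-\eta} (|\alpha|_\infty + \beta n) \ge R\right) = 0, \qquad (3.9) \]

where the last identity holds as soon as $R > T^{1-\eta} (|\alpha|_\infty + \beta n)$.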
On Small-Noise Equations with Degenerate Limiting System … 485

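We now deal with the second term in (3.7). It follows from the comparison theorem
for one-dimensional SDEs [17, Proposition 5.2.18] that $X^\varepsilon_t \le \tilde X^\varepsilon_t$, $t \le T$, almost
surely, where $\tilde X^\varepsilon$ is defined in Lemma 3.2. For every fixed γ and a > 0, it is a simple
exercise to show that the function $y \mapsto \exp(a\varepsilon^{-2}(1+y)^{2(1-\gamma)})$, y > 0, is increasing
and convex⁴ if ε is small enough. For such values of ε, since $\tilde X^\varepsilon_t$ is a submartingale, so
is $\exp\big(a\varepsilon^{-2}(1+\tilde X^\varepsilon_t)^{2(1-\gamma)}\big)$. Then, we can apply Markov's inequality and Doob's
$L^2$-inequality, obtaining:
$$W(T^{\varepsilon,n} \le T) = W\Big(\sup_{t\in[0,T]} X^\varepsilon_t \ge n\Big) \le W\Big(\sup_{t\in[0,T]} \exp\big(a\varepsilon^{-2}(1+\tilde X^\varepsilon_t)^{2(1-\gamma)}\big) \ge \exp\big(a\varepsilon^{-2}(1+n)^{2(1-\gamma)}\big)\Big)$$
$$\le \exp\big(-a\varepsilon^{-2}(1+n)^{2(1-\gamma)}\big) \times 4\,E\Big[\exp\big(a\varepsilon^{-2}(1+\tilde X^\varepsilon_T)^{2(1-\gamma)}\big)\Big].$$ (3.10)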

Using the elementary inequality $\exp(a(1+y)^{2(1-\gamma)}) \le \exp(a 2^{2(1-\gamma)}) + \exp(a(2y)^{2(1-\gamma)})$, and choosing a such that $a \times 2^{2(1-\gamma)} = c$, where c is the constant in
Lemma 3.2, it follows from this lemma and estimate (3.10) that
$$W(T^{\varepsilon,n} \le T) \le \exp\big(-a\varepsilon^{-2} n^{2(1-\gamma)}\big) \times 4\big(\exp(c\varepsilon^{-2}) + C\big),$$ (3.11)


where C is the second constant in Lemma 3.2. Now choosing $n := \lfloor\sqrt{R}\rfloor$, the
condition under which (3.9) holds true is satisfied for R large enough. Passing to the
limit as ε → 0 in (3.7) and using (3.8), (3.9) and (3.11), we obtain
$$\limsup_{\varepsilon\to 0}\ \varepsilon^2 \log W(\|X^\varepsilon\|_\eta \ge R) \le \max\Big\{-\frac{R}{8} + \sigma^2 R^{\gamma},\ -aR^{(1-\gamma)} + c\Big\}.$$

Letting R → ∞, the conclusion follows.

3.2 Weak Upper Bound

This section is devoted to the proof of the following proposition.

4 The second derivative reads $e^{a\varepsilon^{-2}(1+y)^{2(1-\gamma)}} \times 2a\varepsilon^{-2}(1-\gamma)(1+y)^{-2\gamma} \times \big[1 - 2\gamma + \frac{2a}{\varepsilon^2}(1-\gamma)(1+y)^{2(1-\gamma)}\big]$.

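Proposition 3.4 For all $\varphi \in \Omega_{\ge 0} \cap H$:
$$\limsup_{R\to 0}\ \limsup_{\varepsilon\to 0}\ \varepsilon^2 \log W\big(X^\varepsilon \in B(\varphi, R)\big) \le -I_T(\varphi).$$ (3.12)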

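For every h ∈ H, ε > 0 and $\phi \in \Omega_{\ge 0}$, define
$$F^\varepsilon(\phi, h) := h_T\phi_T - h_0\phi_0 - h_T\int_0^T b_\varepsilon(\phi_s)\,ds - \int_0^T \Big(\phi_s - \int_0^s b_\varepsilon(\phi_r)\,dr\Big)\dot h_s\,ds - \frac{\sigma^2}{2}\int_0^T h_s^2\,\phi_s^{2\gamma}\,ds.$$ (3.13)

By setting ε = 0 in (3.13), we can define the functional $F^0(\phi, h)$. Note that, for every h ∈ H, $F^\varepsilon(\cdot, h)$
is continuous on the whole space $\Omega_{\ge 0}$ with respect to the sup-norm topology,
and converges to $F^0(\cdot, h)$ uniformly on $\Omega_{\ge 0}$ as ε → 0.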

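Remark 3.5 Applying the integration by parts formula to the product $h_t X^\varepsilon_t$, one has
$$\varepsilon\sigma\int_0^T h_t (X^\varepsilon_t)^\gamma\,dB_t = h_T X^\varepsilon_T - h_0 x^\varepsilon_0 - \int_0^T \big[\dot h_t X^\varepsilon_t + h_t b_\varepsilon(X^\varepsilon_t)\big]\,dt$$
$$= h_T X^\varepsilon_T - h_0 x^\varepsilon_0 - h_T\int_0^T b_\varepsilon(X^\varepsilon_t)\,dt - \int_0^T \dot h_t\Big(X^\varepsilon_t - \int_0^t b_\varepsilon(X^\varepsilon_s)\,ds\Big)\,dt,$$

hence
$$F^\varepsilon(X^\varepsilon_\cdot, h) = \varepsilon\sigma\int_0^T h_s (X^\varepsilon_s)^\gamma\,dB_s - \frac{\sigma^2}{2}\int_0^T h_s^2 (X^\varepsilon_s)^{2\gamma}\,ds.$$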

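According to Remark 3.5, the random variable
$$M^{\varepsilon,h}_T(\omega) := \exp\Big(\frac{1}{\varepsilon^2} F^\varepsilon\big(X^\varepsilon(\omega), h\big)\Big)$$ (3.14)
is the value at time T of the local exponential martingale associated to $\frac{\sigma}{\varepsilon}\int_0^{\cdot} h_s (X^\varepsilon_s)^\gamma\,dB_s$. It should be stressed that, for any h ∈ H and ε > 0, the functionals $F^\varepsilon(\phi, h)$
and $M^{\varepsilon,h}_T(\phi)$ are well defined for every $\phi \in \Omega_{\ge 0}$, and not only almost surely.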
Proof of Proposition 3.4 Since any positive local martingale is a supermartingale,
we have
$$E\big(M^{\varepsilon,h}_T\big) \le 1.$$ (3.15)

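Fix now a trajectory $\varphi \in \Omega_{\ge 0}$. Using the remark above:
$$W(X^\varepsilon \in B(\varphi, R)) = E\Big[e^{-\frac{1}{\varepsilon^2}F^\varepsilon(X^\varepsilon, h)}\, M^{\varepsilon,h}_T\, \mathbf{1}_{\{X^\varepsilon \in B(\varphi,R)\}}\Big] \le \sup_{\phi\in B(\varphi,R)} \exp\Big(-\frac{1}{\varepsilon^2}F^\varepsilon(\phi, h)\Big)\, E\big(M^{\varepsilon,h}_T\big) \le \sup_{\phi\in B(\varphi,R)} \exp\Big(-\frac{1}{\varepsilon^2}F^\varepsilon(\phi, h)\Big).$$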

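Since $\sup_{\phi\in B(\varphi,R)} |F^\varepsilon(\phi, h) - F^0(\phi, h)| \to 0$, we have that
$$\limsup_{\varepsilon\to 0}\ \varepsilon^2 \log W(X^\varepsilon \in B(\varphi, R)) \le \sup_{\phi\in B(\varphi,R)} \big(-F^0(\phi, h)\big).$$

Therefore, by the continuity of $\phi \mapsto F^0(\phi, h)$,
$$\limsup_{R\to 0}\ \limsup_{\varepsilon\to 0}\ \varepsilon^2 \log W(X^\varepsilon \in B(\varphi, R)) \le -F^0(\varphi, h), \qquad \forall h \in H.$$

In the next proposition we prove that
$$\sup_{h\in H} F^0(\varphi, h) = I_T(\varphi),$$
which concludes the proof of (3.12).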

Proposition 3.6 For all $\varphi \in \Omega_{\ge 0}$ we have that
$$\sup_{h\in H} F^0(\varphi, h) = I_T(\varphi).$$ (3.16)

Proof Assume $\varphi \in \Omega_{\ge 0} \cap H$ is such that $I_T(\varphi) < \infty$. Then, the function u defined by
$u_0 = 0$, $\dot u_s = \frac{\dot\varphi_s - b\varphi_s}{\sigma\varphi_s^\gamma}\mathbf{1}_{\varphi_s \ne 0}$ is by definition an element of H, and φ satisfies by construction the ODE (1.2) with control u. Repeating the computations in Remark 3.5,
one can see that
$$F^0(\varphi, h) = \sigma\int_0^T h_s\,\varphi_s^\gamma\,\dot u_s\,ds - \frac{\sigma^2}{2}\int_0^T h_s^2\,\varphi_s^{2\gamma}\,ds.$$

Note that $F^0(\varphi, h)$ is concave in h, hence if it has a critical point, this must be a
maximum. The Fréchet differential $D_h F^0(\varphi, h)$ at h, applied to a generic element
k ∈ H, reads
$$D_h F^0(\varphi, h)[k] = \sigma\int_0^T k_s\big(\varphi_s^\gamma\,\dot u_s - \sigma h_s\,\varphi_s^{2\gamma}\big)\,ds.$$

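Therefore, $D_h F^0(\varphi, h)|_{h=h^*} = 0$ at any $h^*$ such that $h^*_s = \frac{\dot u_s}{\sigma\varphi_s^\gamma}$ on $\{s : \varphi_s \ne 0\}$
(while $h_s$ can take any arbitrary value on $\{s : \varphi_s = 0\}$). For such $h^*$, one has
$$F^0(\varphi, h^*) = \int_0^T (\dot u_s)^2\,\mathbf{1}_{\varphi_s\ne 0}\,ds - \frac12\int_0^T (\dot u_s)^2\,\mathbf{1}_{\varphi_s\ne 0}\,ds = \frac12\int_0^T (\dot u_s)^2\,\mathbf{1}_{\varphi_s\ne 0}\,ds = I_T(\varphi).$$

On the other hand, if φ is absolutely continuous and such that $I_T(\varphi) = +\infty$, one can
approximate the function $\frac{\dot\varphi_s - \beta\varphi_s}{\varphi_s}$ with a sequence $h^n \in H$ such that $F^0(\varphi, h^n) \to +\infty$.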
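The variational identity (3.16) can be checked numerically on a simple path. The following is a minimal sketch, not part of the original argument: it assumes the limiting drift is $\beta\varphi$ (as in the ODE of Remark 3.12), uses the illustrative path $\varphi_t = t^2$ with $\gamma = 1/2$, $\beta = 0$, $\sigma = 2$, and evaluates the integrals with a midpoint rule. The function names `f0`, `rate` and `h_star` are hypothetical. Note that the formal maximizer $h^*$ is singular at $t = 0$ for this path, so it is only evaluated on the midpoint grid.

```python
import math

# Illustrative parameters (assumptions, chosen for a closed-form check).
T, BETA, SIGMA, GAMMA, N = 1.0, 0.0, 2.0, 0.5, 2000

def phi(s):
    return s * s            # test path phi_t = t^2

def dphi(s):
    return 2.0 * s          # its derivative

def u_dot(s):
    """Control speed (dphi - beta*phi) / (sigma * phi^gamma) on {phi > 0}."""
    p = phi(s)
    return (dphi(s) - BETA * p) / (SIGMA * p ** GAMMA)

def f0(h):
    """Midpoint-rule value of F^0(phi, h) in the form obtained after Remark 3.5."""
    dt = T / N
    total = 0.0
    for i in range(N):
        s = (i + 0.5) * dt
        g = h(s) * phi(s) ** GAMMA
        total += (SIGMA * g * u_dot(s) - 0.5 * SIGMA ** 2 * g ** 2) * dt
    return total

def rate():
    """Midpoint-rule value of I_T(phi) = 1/2 * int (u_dot)^2 ds."""
    dt = T / N
    return sum(0.5 * u_dot((i + 0.5) * dt) ** 2 * dt for i in range(N))

def h_star(s):
    """Formal critical point h* = u_dot / (sigma * phi^gamma)."""
    return u_dot(s) / (SIGMA * phi(s) ** GAMMA)

if __name__ == "__main__":
    print(rate(), f0(h_star), f0(lambda s: h_star(s) + 1.0))
```

For this path $\dot u \equiv 1$, so $I_T(\varphi) = T/2$; any perturbation of $h^*$ lowers $F^0$, in line with the concavity argument in the proof.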

3.3 Weak Lower Bound

This section is devoted to the proof of


Proposition 3.7 For all $\varphi \in \Omega_{\ge 0}$, we have
$$\liminf_{R\to 0}\ \liminf_{\varepsilon\to 0}\ \varepsilon^2 \log W\big(X^\varepsilon \in B(\varphi, R)\big) \ge -I_T(\varphi).$$ (3.17)

In the spirit of Lamperti's transformation, we introduce the process $Y^\varepsilon := (X^\varepsilon)^{1-\gamma}$.
$Y^\varepsilon$ satisfies an SDE with constant diffusion coefficient and a drift coefficient that we
will be able to control. We will prove a large deviation weak lower bound for $Y^\varepsilon$,
and then transfer it to $X^\varepsilon$ by means of the contraction principle.
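Proposition 3.8 Define
$$I_T(\psi) := \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2\,dt$$
for $\psi \in \Omega_{\ge 0}$, where $I_T(\psi) = +\infty$ if $\psi(0) \ne 0$ or ψ is not absolutely continuous.
Then, for all ψ such that $I_T(\psi) < +\infty$, one has
$$\liminf_{R\to 0}\ \liminf_{\varepsilon\to 0}\ \varepsilon^2 \log W\big(Y^\varepsilon \in B(\psi, R)\big) \ge -I_T(\psi).$$ (3.18)

In other words, the family $Y^\varepsilon$ satisfies a large deviation weak lower bound on
$C([0, T], \mathbb{R}_+)$, with rate function $I_T(\psi)$.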
Once we are provided with Proposition 3.8, it is straightforward to prove the weak
lower bound for X ε .
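Proof of Proposition 3.7 Consider $\psi \in \Omega_{\ge 0}$ absolutely continuous. By Lemma
3.45 in [20], $\dot\psi = 0$ a.s. on $\{\psi = 0\}$. Therefore, $I_T$ defined in Proposition 3.8
can be rewritten as $I_T(\psi) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2\,\mathbf{1}_{\psi_t\ne 0}\,dt$. Using the
definition of $Y^\varepsilon$ and (3.18), since the map $\psi \mapsto \varphi = \psi^{\frac{1}{1-\gamma}}$ is continuous on $\Omega_{\ge 0}$,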

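we can apply the contraction principle and obtain that $W(X^\varepsilon \in \cdot)$ satisfies a large
deviation weak lower bound with rate function $\bar I_T$. Let us describe $\bar I_T(\varphi)$ when φ is
absolutely continuous and such that $I_T(\varphi) < \infty$ (where $I_T$ was defined in (2.2)). Let
$\psi_t = \varphi_t^{1-\gamma}$. On $\{\varphi = 0\}$, one has ψ = 0 as well, while for a point t in the open set
$\{\varphi > 0\}$ such that $\dot\varphi_t$ exists, one has $\dot\psi_t = (1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma}$. Then, noting that $I_T(\varphi) < \infty$
implies that $\frac{\dot\varphi_t}{\varphi_t^\gamma}\mathbf{1}_{\varphi_t>0}$ is integrable on [0, T], ψ is also absolutely continuous on [0, T]
(see [20, Corollary 3.41]), with derivative $\dot\psi_t = (1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma}\mathbf{1}_{\varphi_t>0}$. This yields
$$\bar I_T(\varphi) = I_T(\psi(\varphi)) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \Big((1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma} - \beta(1-\gamma)\varphi_t^{1-\gamma}\Big)^2\,\mathbf{1}_{\varphi_t\ne 0}\,dt = I_T(\varphi) < \infty.$$ (3.19)

If $I_T(\varphi) = \infty$, there is nothing to prove in (3.17), and the claim follows.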
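The identity (3.19) between the two rate functions can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes $\gamma = 1/2$, an arbitrary smooth path $\psi_t = t + 0.3t^2$, sets $\varphi = \psi^{1/(1-\gamma)} = \psi^2$, and compares midpoint-rule values of the two integrals. The names `rate_phi` and `rate_psi` and all parameter values are hypothetical.

```python
import math

# Illustrative parameters (assumptions).
T, BETA, SIGMA, GAMMA, N = 1.0, -0.7, 1.5, 0.5, 5000

def psi(t):
    return t + 0.3 * t * t          # transformed path, psi_0 = 0

def dpsi(t):
    return 1.0 + 0.6 * t

def phi(t):
    return psi(t) ** (1.0 / (1.0 - GAMMA))   # phi = psi^{1/(1-gamma)}

def dphi(t):
    # Chain rule for phi = psi^{1/(1-gamma)}.
    p = 1.0 / (1.0 - GAMMA)
    return p * psi(t) ** (p - 1.0) * dpsi(t)

def rate_phi():
    """I_T(phi) = 1/(2 sigma^2) * int ((dphi - beta*phi)/phi^gamma)^2 1_{phi>0} dt."""
    dt = T / N
    total = 0.0
    for i in range(N):
        t = (i + 0.5) * dt
        p = phi(t)
        if p > 0.0:
            total += ((dphi(t) - BETA * p) / p ** GAMMA) ** 2 * dt
    return total / (2.0 * SIGMA ** 2)

def rate_psi():
    """Rate function of Proposition 3.8 evaluated at psi."""
    dt = T / N
    kappa = BETA * (1.0 - GAMMA)
    total = sum((dpsi((i + 0.5) * dt) - kappa * psi((i + 0.5) * dt)) ** 2 * dt
                for i in range(N))
    return total / (2.0 * SIGMA ** 2 * (1.0 - GAMMA) ** 2)

if __name__ == "__main__":
    print(rate_phi(), rate_psi())
```

Both quadratures use the same grid, so the two values agree up to floating-point rounding, which reflects the pointwise identity $\dot\psi - \beta(1-\gamma)\psi = (1-\gamma)(\dot\varphi - \beta\varphi)/\varphi^\gamma$ on $\{\varphi > 0\}$.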

3.3.1 Proof of Proposition 3.8

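This section is devoted to the proof of the large deviation weak lower bound for
the process $Y^\varepsilon$ in (3.18). While postponing some of the most technical elements to
Appendix A, we will make use here of the following notation: for every h ∈ H, y ∈ R,
we define $S_y(h)$ to be the unique solution on [0, T] of the ODE
$$\dot\psi_t = \beta(1-\gamma)\psi_t + \sigma(1-\gamma)\dot h_t, \qquad \psi_0 = y.$$ (3.20)

We denote $W^{\varepsilon,h}$ the measure on Ω associated to the Girsanov shift $-\frac{1}{\varepsilon}\int_0^T \dot h_t\,dt$,
$$\frac{dW^{\varepsilon,h}}{dW}(\omega) = \exp\Big(\frac{1}{\varepsilon}\int_0^T \dot h_t\,dB_t - \frac{1}{2\varepsilon^2}\int_0^T \dot h_t^2\,dt\Big).$$ (3.21)
An application of Girsanov's Theorem shows that $W(X^{\varepsilon,h} \in \cdot) = W^{\varepsilon,h}(X^\varepsilon \in \cdot)$,
where $X^{\varepsilon,h}$ solves:
$$dX^{\varepsilon,h}_t = b_\varepsilon(X^{\varepsilon,h}_t)\,dt + \sigma|X^{\varepsilon,h}_t|^\gamma\,\dot h_t\,dt + \varepsilon\sigma|X^{\varepsilon,h}_t|^\gamma\,dB_t, \qquad X^{\varepsilon,h}_0 = \varepsilon^{\frac{1}{1-\gamma}}x.$$ (3.22)

We also define the process $Y^{\varepsilon,h} := |X^{\varepsilon,h}|^{1-\gamma}$.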

Remark 3.9 Note that for (3.22) there exists a weak solution, which we construct
directly from a solution of (1.1) applying Girsanov’s Theorem. Since pathwise
uniqueness holds for the couple (b, σ), another application of the same theorem
shows that pathwise uniqueness for (1.1) implies pathwise uniqueness for (3.22).
Therefore we can always assume that X ε,h solves (3.22) with the Brownian motion B.

Two main ingredients enter in the proof of Proposition 3.8: the convergence in
law (under some conditions on h) of the process Y ε,h to the deterministic limit
S0 (h) under the measure W (equivalently: the weak convergence of the measure
W ε,h (Y ε ∈ .) to δ S0 (h) ), and a lower bound for the probability W (Y ε ∈ B(ψ, R))
depending explicitly on the relative entropy between the two measures W ε,h and W .
This is the content of the two following lemmas.

Lemma 3.10 (Convergence in law of $Y^{\varepsilon,h}_\cdot$) Let h ∈ H be such that

(i) $S_0(h)_t > 0$, ∀t ∈ (0, T]; (ii) $\dot h_t > k$ in a neighborhood of 0, for some k > 0. (3.23)

Then, the process $Y^{\varepsilon,h}$ converges in law to $S_0(h)$ under W, as ε → 0.

Lemma 3.11 (Relative entropy bound) Let $(\Omega, \mathcal{F})$ be a probability space and P, Q
two probability measures on $(\Omega, \mathcal{F})$ such that dQ = F dP. The relative entropy
H(Q|P) is defined as
$$H(Q|P) := \int F\log(F)\,dP.$$

Then, for all $A \in \mathcal{F}$ we have:
$$\log\Big(\frac{P(A)}{Q(A)}\Big) \ge -\frac{e^{-1} + H(Q|P)}{Q(A)}.$$ (3.24)

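Proof Applying Jensen's inequality, one has
$$\log\Big(\frac{P(A)}{Q(A)}\Big) \ge \log\Big(\int_A F^{-1}\,\frac{dQ}{Q(A)}\Big) \ge -\frac{1}{Q(A)}\int_A \log(F)\,dQ \ge -\frac{1}{Q(A)}\int_A \big(\log(F)F\big)_+\,dP.$$

Using the elementary fact that $\inf_{x\ge 0} x\log(x) \ge -\frac{1}{e}$:
$$-\frac{1}{Q(A)}\int_A \big(\log(F)F\big)_+\,dP \ge -\frac{e^{-1} + H(Q|P)}{Q(A)},$$

which proves (3.24).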
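The bound (3.24) is easy to test on a finite probability space, where every event can be enumerated. The following sketch is only an illustration: the discrete setting, the random weights and the function names `relative_entropy` and `worst_slack` are assumptions made for the example.

```python
import itertools
import math
import random

def relative_entropy(q, p):
    """H(Q|P) = sum_i q_i log(q_i / p_i), i.e. int F log F dP with F = q/p."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)

def random_distribution(rng, n):
    """Strictly positive random probability vector of length n."""
    w = [rng.random() + 0.01 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

def worst_slack(p, q):
    """Minimum over all events A with Q(A) > 0 of
    log(P(A)/Q(A)) + (e^{-1} + H(Q|P)) / Q(A); (3.24) says this is >= 0."""
    h = relative_entropy(q, p)
    n = len(p)
    worst = float("inf")
    for r in range(1, n + 1):
        for event in itertools.combinations(range(n), r):
            pa = sum(p[i] for i in event)
            qa = sum(q[i] for i in event)
            if qa > 0.0:
                slack = math.log(pa / qa) + (math.exp(-1.0) + h) / qa
                worst = min(worst, slack)
    return worst

if __name__ == "__main__":
    rng = random.Random(0)
    p = random_distribution(rng, 6)
    q = random_distribution(rng, 6)
    print(worst_slack(p, q))
```

The slack is typically far from zero; the bound is useful in the proof because it is applied to events whose $Q$-probability tends to 1.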



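The relative entropy $H(W^{\varepsilon,h}|W)$ is easily computed using the martingale property
of $F^{\varepsilon,h}_t = \exp\big(\frac{1}{\varepsilon}\int_0^t \dot h_s\,dB_s - \frac{1}{2\varepsilon^2}\int_0^t \dot h_s^2\,ds\big)$ and the Itô isometry:
$$H(W^{\varepsilon,h}|W) = E\Big[F^{\varepsilon,h}_T\Big(\frac{1}{\varepsilon}\int_0^T \dot h_t\,dB_t - \frac{1}{2\varepsilon^2}\int_0^T \dot h_t^2\,dt\Big)\Big]$$
$$= E\Big[\frac{1}{\varepsilon}\int_0^T F^{\varepsilon,h}_t\,\dot h_t\,dB_t \times \frac{1}{\varepsilon}\int_0^T \dot h_t\,dB_t\Big] - \frac{1}{2\varepsilon^2}\int_0^T \dot h_t^2\,dt = \frac{1}{\varepsilon^2}\int_0^T \dot h_t^2\,dt - \frac{1}{2\varepsilon^2}\int_0^T \dot h_t^2\,dt,$$

therefore
$$H\big(W^{\varepsilon,h}|W\big) = \frac{1}{2\varepsilon^2}\int_0^T \dot h_t^2\,dt.$$ (3.25)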
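Formula (3.25) has a transparent discrete analogue: on a time grid, the Brownian increments are independent $N(0, \Delta t)$ variables under $W$, and the Girsanov shift moves the mean of the i-th increment to $\dot h_{t_i}\Delta t/\varepsilon$. Summing the Gaussian relative entropies over the grid gives a Riemann sum for (3.25). The sketch below illustrates this under these discretization assumptions; the name `girsanov_entropy` and the test control $\dot h_t = \cos t$ are hypothetical.

```python
import math

def girsanov_entropy(hdot, eps, T=1.0, n=10000):
    """Relative entropy of the shifted law of the discretized Brownian increments.

    Each increment is N(m_i, dt) under the shifted measure and N(0, dt) under W,
    with m_i = hdot(t_i) * dt / eps; the Gaussian relative entropy of such a pair
    is m_i^2 / (2 dt), and independence makes the total a plain sum.
    """
    dt = T / n
    total = 0.0
    for i in range(n):
        m = hdot((i + 0.5) * dt) * dt / eps
        total += m * m / (2.0 * dt)
    return total

if __name__ == "__main__":
    eps, T = 0.1, 1.0
    exact = (T / 2.0 + math.sin(2.0 * T) / 4.0) / (2.0 * eps ** 2)  # (3.25) for hdot = cos
    print(girsanov_entropy(math.cos, eps, T), exact)
```

The entropy scales like $\varepsilon^{-2}$, which is the scaling needed for the $\varepsilon^2\log$ normalization in the lower bound.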

The proof of Lemma 3.10 is postponed to Appendix A; using this lemma and
Lemma 3.11, we can now complete the proof of Proposition 3.8, and with it the proof
of the large deviation weak lower bound for the process $X^\varepsilon$.
Proof of Proposition 3.8 If $I_T(\psi) = \infty$, (3.18) is trivially true. Then, consider
$\psi \in \Omega_{\ge 0}$ such that $I_T(\psi) < \infty$, and define h ∈ H by setting $\dot h_t = \frac{\dot\psi_t - \beta(1-\gamma)\psi_t}{\sigma(1-\gamma)}$, so
that $S_0(h) = \psi$.
Step 1. Assume that h is such that (3.23) holds true. An application of the relative
entropy bound (3.24) with P = W, Q = $W^{\varepsilon,h}$ yields
$$\varepsilon^2 \log W\big(Y^\varepsilon \in B(\psi, R)\big) \ge -\varepsilon^2\,\frac{e^{-1} + H(W^{\varepsilon,h}|W)}{W^{\varepsilon,h}(Y^\varepsilon \in B(\psi, R))} + \varepsilon^2 \log W^{\varepsilon,h}\big(Y^\varepsilon \in B(\psi, R)\big).$$

Using $W^{\varepsilon,h}(Y^\varepsilon \in B(\psi, R)) = W(Y^{\varepsilon,h} \in B(\psi, R)) \to 1$ for every R > 0 by
Lemma 3.10, and the expression of $H(W^{\varepsilon,h}|W)$ from (3.25), taking the limit as
ε → 0 we obtain (3.18).
Step 2. Assume now $\psi \in C^1([0, T])$. Let h be defined as above, and define $h^n \in H$,
n ∈ N, by
$$\dot h^n_t := \dot h_t + 1/n.$$ (3.26)

We claim that, for all n ∈ N, $h^n$ satisfies (3.23). Let us first prove that condition (ii) in
(3.23) holds. Observe that ψ ≥ 0 and $\psi_0 = 0$ imply $\dot\psi_0 \ge 0$, hence $\dot h^n_0 \ge 1/n$.
By the continuity of $\dot h^n$, ensured by the fact that $\psi \in C^1([0, T])$, it follows that
condition (ii) in (3.23) holds with, say, k = 1/(2n). In order to prove condition (i), we
observe that the comparison principle for ODEs implies that, for all t ∈ (0, T], $S_0(h^n)_t > S_0(h)_t = \psi_t \ge 0$; condition (i) is then proved. Furthermore, by the continuity of the
solution to (3.20) with respect to the control parameter h, one has
$$\|S_0(h^n) - \psi\|_\infty \to 0 \quad \text{as } n \to \infty.$$ (3.27)



It follows from (3.27) that, for any R > 0,
$$W\big(Y^\varepsilon \in B(\psi, R)\big) \ge W\big(Y^\varepsilon \in B(S_0(h^n), R/2)\big)$$ (3.28)
if n is large enough. In the first part of the proof, we have shown that the weak lower
bound holds for $W(Y^\varepsilon \in B(S_0(h^n), R/2))$; then, taking the limits as ε → 0 and
R → 0 in (3.28), one has
$$\liminf_{R\to 0}\ \liminf_{\varepsilon\to 0}\ \varepsilon^2 \log W\big(Y^\varepsilon \in B(\psi, R)\big) \ge -I_T(S_0(h^n)) \quad \text{for every } n \in \mathbb{N}.$$

Since $I_T(S_0(h^n)) = \frac12\int_0^T (\dot h^n_t)^2\,dt \to \frac12\int_0^T (\dot h_t)^2\,dt = I_T(\psi)$, the bound (3.18) follows. Finally, a standard density argument of $C^1([0, T])$ functions in $C([0, T])$ allows
one to extend the claim to any $\psi \in \Omega_{\ge 0}$ such that $I_T(\psi) < +\infty$.
Remark 3.12 In a classical situation, the claim would be the lower bound (3.17) for a
process $X^\varepsilon$ satisfying, say, $dX^\varepsilon = b_\varepsilon(X^\varepsilon)\,dt + \varepsilon\sigma(X^\varepsilon)\,dB$ with Lipschitz coefficients σ
and $b_\varepsilon \to b_0$, and $X^\varepsilon_0 = x^\varepsilon \to x$. In this setting, fixing a control h ∈ H and defining
$X^{\varepsilon,h}$ from $X^\varepsilon$ by shifting the Brownian motion B as in (3.22), it is straightforward (in
fact: an application of Gronwall's Lemma) to show that $X^{\varepsilon,h}$ converges in law to the
unique solution of the deterministic limit equation $d\varphi = b_0(\varphi)\,dt + \sigma(\varphi)\,dh$, $\varphi_0 = x$.
In the present (degenerate) situation, the deterministic limit equation for the process
$X^{\varepsilon,h}$ (obtained setting ε = 0 in (3.22)) coincides with the ODE (1.2), which admits
infinitely many solutions. When circumventing this problem by passing through the
transformed process $Y^{\varepsilon,h}$, we actually show that the convergence in law of $X^{\varepsilon,h}$ to
a particular solution $\varphi^*$ of the limiting equation is restored. Indeed, assume as in
Lemma 3.10 that h is such that the unique solution ψ of the well-posed equation
(3.20) with y = 0 is positive for every t > 0, and $Y^{\varepsilon,h}$ converges in law to ψ. The
function ψ is easily computed, namely $\psi_t = \sigma(1-\gamma)e^{\beta(1-\gamma)t}\int_0^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds$. By
definition, one has $X^{\varepsilon,h} = (Y^{\varepsilon,h})^{\frac{1}{1-\gamma}} \to \psi^{\frac{1}{1-\gamma}} =: \varphi^*$ in law under W. By direct computation, $\varphi^*$
is absolutely continuous and such that $\varphi^*_0 = 0$ and $\dot\varphi^* = \beta\varphi^* + \sigma(\varphi^*)^\gamma\,\dot h$, hence $\varphi^*$
is a solution to (1.2); in particular,
$$\varphi^*_t := e^{\beta t}\Big(\sigma(1-\gamma)\int_0^t e^{-\beta(1-\gamma)s}\,\dot h_s\,ds\Big)^{\frac{1}{1-\gamma}}.$$ (3.29)

Therefore, in the small noise limit, the stochastic dynamics (3.22) performs a selection among the solutions of the limiting deterministic system (1.2), selecting the
strictly positive one, $\varphi^*$. This looks reasonable in light of the fact that, though converging to zero, the drift parameter $\alpha_\varepsilon$ and the initial condition $x^\varepsilon$ of the process
remain strictly positive for all ε > 0.⁵ Figure 1 shows the convergence of simulated

5 By perturbing the initial condition and the drift in (1.2), one can retrieve the trajectory $\varphi^*$ in (3.29)
as the limit as ρ → 0 of the solution of the equation $d\varphi_t = (\rho + \beta\varphi_t)\,dt + \sigma\varphi_t^\gamma\,dh$, $\varphi_0 = \rho$, for
which existence and uniqueness hold.

[Figure 1: plot not reproduced; in-figure label: "Discarded degenerate ODE solution"]

Fig. 1 An illustration of the convergence of the process $X^{\varepsilon,h}$ in (3.22) to a particular solution $\varphi^*$ of
the limiting deterministic system. Trajectories have been simulated for different values of the noise
parameter ε and γ = 1/2, α(x) ≡ 1, β = 0, σ = 2, $\dot h = 1$, x = 0

trajectories of the process $X^{\varepsilon,h}$ to $\varphi^*$ in (3.29) as ε → 0, for a given choice of the
control parameter h.
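The selected solution (3.29) and the kind of experiment summarized in Fig. 1 can be sketched in a few lines. This is a hedged illustration, not the authors' simulation code: `phi_star` evaluates (3.29) by a midpoint rule, `ode_residual` checks by finite differences that it solves $\dot\varphi = \beta\varphi + \sigma\varphi^\gamma\dot h$, and `simulate` is a plain Euler-Maruyama scheme for (3.22) with clipping at zero. The drift form $b_\varepsilon(x) = \varepsilon^{1/(1-\gamma)}\alpha + \beta x$ for constant α is an assumption inferred from the expression of $\tilde b_\varepsilon$ in Appendix A, and the clipping is a numerical convenience.

```python
import math
import random

def phi_star(t, hdot, beta, sigma, gamma, n=4000):
    """Formula (3.29), with the time integral computed by a midpoint rule."""
    kappa = beta * (1.0 - gamma)
    ds = t / n
    integral = sum(math.exp(-kappa * (i + 0.5) * ds) * hdot((i + 0.5) * ds)
                   for i in range(n)) * ds
    return math.exp(beta * t) * (sigma * (1.0 - gamma) * integral) ** (1.0 / (1.0 - gamma))

def ode_residual(t, hdot, beta, sigma, gamma, delta=1e-4):
    """Central-difference derivative of phi* minus beta*phi* + sigma*phi*^gamma*hdot."""
    d = (phi_star(t + delta, hdot, beta, sigma, gamma)
         - phi_star(t - delta, hdot, beta, sigma, gamma)) / (2.0 * delta)
    p = phi_star(t, hdot, beta, sigma, gamma)
    return d - (beta * p + sigma * p ** gamma * hdot(t))

def simulate(eps, T, steps, hdot, beta, sigma, gamma, alpha=1.0, x=0.0, seed=0):
    """Euler-Maruyama path of (3.22); constant alpha and clipping at 0 are assumptions."""
    rng = random.Random(seed)
    dt = T / steps
    scale = eps ** (1.0 / (1.0 - gamma))
    path = [scale * x]
    for i in range(steps):
        cur = path[-1]
        root = abs(cur) ** gamma
        drift = scale * alpha + beta * cur + sigma * root * hdot(i * dt)
        noise = eps * sigma * root * rng.gauss(0.0, math.sqrt(dt))
        path.append(max(cur + drift * dt + noise, 0.0))
    return path

if __name__ == "__main__":
    one = lambda s: 1.0
    # Parameters of Fig. 1: gamma = 1/2, beta = 0, sigma = 2, hdot = 1, so phi*_t = t^2.
    for eps in (0.5, 0.1, 0.02):
        path = simulate(eps, 5.0, 5000, one, 0.0, 2.0, 0.5)
        print(eps, path[-1], phi_star(5.0, one, 0.0, 2.0, 0.5))
```

With the parameters of Fig. 1, (3.29) reduces to $\varphi^*_t = t^2$, which matches the vertical range of the figure at $t = 5$.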

Remark 3.13 (Lower bound from the upper bound) In general, the weak convergence of the controlled process $X^{\varepsilon,h}$ can be shown exploiting the large deviation
upper bound. This goes as follows: in the notation of Remark 3.12, assume $X^\varepsilon$ satisfies $dX^\varepsilon = b_\varepsilon(X^\varepsilon)\,dt + \varepsilon\sigma(X^\varepsilon)\,dB$ with Lipschitz coefficients, and define $X^{\varepsilon,h}$ from
$X^\varepsilon$ as in (3.22). Assume one has proven a large deviation upper bound analogous to
(3.12) for the process $X^{\varepsilon,h}$, with a good rate function $I^h$ depending on the control
parameter h, $I^h(\psi) := \frac12\int_0^T \Big(\frac{\dot\psi_t - b_0(\psi_t) - \sigma(\psi_t)\dot h_t}{\sigma(\psi_t)}\Big)^2 dt$. It is clear that $I^h$ admits as a
unique zero the solution φ(h) of $\dot\psi_t = b_0(\psi_t) + \sigma(\psi_t)\dot h_t$. Using the compactness of
the level sets of $I^h$ and the large deviation upper bound, it is easy to conclude that
$$\lim_{\varepsilon\to 0} W\big(X^{\varepsilon,h} \notin B(\varphi(h), R)\big) = 0 \qquad \forall R > 0,$$

hence $X^{\varepsilon,h} \to \varphi(h)$ in law. This provides a way of "bootstrapping" the large deviation lower bound from the upper bound (via weak convergence, together with the
bound on relative entropy in Lemma 3.11). When the limit ODE has several solutions, this approach is no longer possible: in the present case, the rate function $I^h(\psi) = \frac12\int_0^T \Big(\frac{\dot\psi_t - \beta\psi_t - \psi_t^\gamma\dot h_t}{\psi_t^\gamma}\Big)^2 \mathbf{1}_{\{\psi_t>0\}}\,dt$ has uncountably many zeroes, corresponding to the possible solutions of the degenerate ODE (1.2). While one
expects that converging subsequences of the family of measures $\{W(X^{\varepsilon,h} \in \cdot)\}_\varepsilon$
converge to a probability distribution supported by the set of solutions, it is not obvious a priori how to restore a unique limit for $X^{\varepsilon,h}$ (which is why we pass through
the transformed process $Y^{\varepsilon,h}$). When uniqueness for the limiting equation is granted,
such an approach remains efficient, and applies outside the Markovian framework
(see [8] for a treatment of delayed equations; in the setting of [8], uniqueness of solutions for the deterministic system is essential, and enters via their condition (H4)).

3.4 Proof of Tail Estimates

In this section, we prove the asymptotic estimates that have been stated in Sect. 2.1
and that follow from Theorem 2.1.
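Proof of Proposition 2.5 Setting $\varepsilon := R^{-(1-\gamma)}$ into (2.1), one has
$$\limsup_{R\to+\infty} R^{-2(1-\gamma)} \log W(X_T \ge R) = \limsup_{\varepsilon\to 0}\ \varepsilon^2 \log W\big(X^\varepsilon_T \ge 1\big) \le -P$$
where
$$P = \inf\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \varphi_T \ge 1\} = \inf_{y\ge 1}\ \inf\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \varphi_T \ge y\} =: \inf_{y\ge 1} P(y).$$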

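Fix y ≥ 1 and a function φ in the admissible set of P(y), such that $I_T(\varphi) < \infty$. Set
$\psi_t = \varphi_t^{1-\gamma}$. On $\{\varphi = 0\}$, one has ψ = 0 as well, while for a point t in the open set
$\{\varphi > 0\}$ such that $\dot\varphi_t$ exists, one has $\dot\psi_t = (1-\gamma)\frac{\dot\varphi_t}{\varphi_t^\gamma}$. Then, noting that $I_T(\varphi) < \infty$
implies that $\frac{\dot\varphi_t}{\varphi_t^\gamma}\mathbf{1}_{\varphi_t>0}$ is integrable on [0, T], ψ is also absolutely continuous on
[0, T] (see [20, Corollary 3.41]). Moreover, $I_T(\varphi) = \frac{1}{2\sigma^2}\int_0^T \Big(\frac{\dot\varphi_t - \beta\varphi_t}{\varphi_t^\gamma}\Big)^2 \mathbf{1}_{\varphi_t>0}\,dt = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2\,\mathbf{1}_{\psi_t>0}\,dt$. Noting that the inverse transformation $\varphi = \psi^{\frac{1}{1-\gamma}}$ also maps AC positive functions to AC positive functions (as $\frac{1}{1-\gamma} > 1$), one
has
$$P(y) = \inf\Big\{\frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi_t - \beta(1-\gamma)\psi_t\big)^2\,\mathbf{1}_{\psi_t>0}\,dt : \psi \text{ is abs. cont.},\ \psi_0 = 0,\ \psi \ge 0,\ \psi_T = y^{1-\gamma}\Big\}.$$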

When β = 0, the minimizer of this problem is $\psi^*_t(y) = y^{1-\gamma}\,t/T$. When β ≠ 0, the
solution of the Euler-Lagrange equation associated with the Lagrangian $(\dot\psi - \beta(1-\gamma)\psi)^2$ and the boundary conditions $\psi_0 = 0$, $\psi_T = y^{1-\gamma}$ yields the minimizer
$$\psi^*_t(y) = \frac{y^{1-\gamma}}{e^{\beta(1-\gamma)T} - e^{-\beta(1-\gamma)T}}\big(e^{\beta(1-\gamma)t} - e^{-\beta(1-\gamma)t}\big).$$

In both cases, $\psi^*_t(y) > 0$ for all t ∈ (0, T], and the positivity constraint in P(y)
can be dropped. Using the monotonicity of $\psi^*$ w.r.t. y, this yields $\inf_{y\ge 1} P(y) = P(1) = \frac{1}{2\sigma^2(1-\gamma)^2}\int_0^T \big(\dot\psi^*_t(1) - \beta(1-\gamma)\psi^*_t(1)\big)^2\,dt$. An application of the large
deviation lower bound (2.1) gives $\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W(X_T > R) = \liminf_{\varepsilon\to 0} \varepsilon^2 \log W(X^\varepsilon_T > 1) \ge -\inf_{y>1} P(y) = -P(1)$. Finally, the explicit

evaluation of the integral in P(1) over the function ψ ∗ yields the expression of the
constant cT in (2.4).
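The evaluation of P(1) along the minimizer can be reproduced numerically. The sketch below is an illustration under stated assumptions: the closed form $\beta / \big(\sigma^2(1-\gamma)(e^{2\beta(1-\gamma)T} - 1)\big)$ used in `closed_form` is obtained here by integrating $(\dot\psi^* - \beta(1-\gamma)\psi^*)^2$ by hand and has not been compared against (2.4), which lies outside this section. The names `cost`, `closed_form` and `minimizer`, and the parameter values, are hypothetical.

```python
import math

def cost(psi, dpsi, T, beta, sigma, gamma, n=20000):
    """Midpoint-rule value of 1/(2 sigma^2 (1-gamma)^2) * int (dpsi - kappa*psi)^2 dt."""
    kappa = beta * (1.0 - gamma)
    dt = T / n
    total = sum((dpsi((i + 0.5) * dt) - kappa * psi((i + 0.5) * dt)) ** 2
                for i in range(n)) * dt
    return total / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2)

def closed_form(T, beta, sigma, gamma):
    """Hand-computed value of P(1); the beta = 0 branch is the linear-minimizer case."""
    if beta == 0.0:
        return 1.0 / (2.0 * sigma ** 2 * (1.0 - gamma) ** 2 * T)
    return beta / (sigma ** 2 * (1.0 - gamma) * math.expm1(2.0 * beta * (1.0 - gamma) * T))

def minimizer(T, beta, gamma):
    """psi*_t(1) and its derivative for beta != 0, written with sinh/cosh."""
    kappa = beta * (1.0 - gamma)
    psi = lambda t: math.sinh(kappa * t) / math.sinh(kappa * T)
    dpsi = lambda t: kappa * math.cosh(kappa * t) / math.sinh(kappa * T)
    return psi, dpsi

if __name__ == "__main__":
    T, beta, sigma, gamma = 1.0, -0.5, 2.0, 0.5
    psi, dpsi = minimizer(T, beta, gamma)
    print(cost(psi, dpsi, T, beta, sigma, gamma), closed_form(T, beta, sigma, gamma))
```

Adding a perturbation that vanishes at both endpoints, such as a multiple of $\sin(\pi t/T)$, increases the cost, as expected for the minimizer of a convex quadratic functional.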
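Let us consider the running maximum process. Another application of the large
deviation principle (2.1) with $\varepsilon = R^{-(1-\gamma)}$ gives
$$\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W\Big(\sup_{t\in[0,T]} X_t > R\Big) \ge -\underline{c}_T$$
where $\underline{c}_T := \inf\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \sup_{t\in[0,T]}\varphi_t > 1\}$. Since $W(\sup_{t\in[0,T]} X_t > R) \ge W(X_t > R)$ for every t ≤ T, one has $\underline{c}_T \le \inf_{t\in[0,T]} c_t = c_T$,
where the last identity holds because $c_t$ is a decreasing function of t. On the other hand,
$\limsup_{R\to+\infty} R^{-2(1-\gamma)} \log W(\sup_{t\in[0,T]} X_t \ge R) \le -\overline{c}_T := -\inf\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \sup_{t\in[0,T]}\varphi_t \ge 1\}$. Since
$$\overline{c}_T = \inf\Big\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \sup_{t\in[0,T]}\varphi_t = 1,\ \varphi_t \ge 0\Big\} \ge \inf_{t\in[0,T]}\ \inf\{I_t(\phi) : \phi \text{ is abs. cont. on } [0,t],\ \phi_0 = 0,\ \phi \ge 0,\ \phi_t = 1\} = \inf_{t\in[0,T]} c_t = c_T,$$
one has $\underline{c}_T = \overline{c}_T = c_T$, and the claim is proved.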


As addressed in Sect. 2.1, Theorem 2.1 can also be used to obtain the leading-order asymptotics for the distribution of the time average of the process. Such a result
can be used to derive the leading-order behavior of the implied volatility of Asian
options $E\big[\big(\frac{1}{T}\int_0^T X_t\,dt - K\big)^+\big]$ for large strike K.

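Proposition 3.14 Estimate (1.5) in Theorem 1.2 holds with $\nu_T > 0$. When γ = 1/2,
the constant $\nu_T$ is given by
$$\nu_T = \begin{cases} \frac{1}{2\sigma^2}\Big(T\beta^2 + \frac{4\omega^2}{T}\Big) & \text{if } T\beta/2 < 1 \\ \frac{1}{2\sigma^2}\Big(T\beta^2 - \frac{4\omega^2}{T}\Big) & \text{if } T\beta/2 \ge 1 \end{cases}$$ (3.30)
where
$$\omega = \begin{cases} \text{the } \omega \in (0, \pi) \text{ such that } \omega\cos\omega = \frac{T\beta}{2}\sin(\omega) & \text{if } T\beta/2 < 1 \\ 0 & \text{if } T\beta/2 = 1 \\ \text{the } \omega \in (0, \infty) \text{ such that } \omega\cosh(\omega) = \frac{T\beta}{2}\sinh(\omega) & \text{if } T\beta/2 > 1. \end{cases}$$ (3.31)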

Remark 3.15 Following the lines of the proof of Proposition 3.14, one can prove
the analogous asymptotic relation for a general time-average functional $\int_0^T X_t\,\mu(dt)$,
where μ is a bounded signed measure on [0, T]. One gets
$$W\Big(\int_0^T X_t\,\mu(dt) \ge R\Big) = e^{-R^{2(1-\gamma)}(V_T + \psi(R))} \quad \text{as } R \to \infty,$$
where $V_T$ is characterised by the variational formula $V_T := \inf\{I_T(\varphi) : \int_0^T \varphi_t\,\mu(dt) \ge 1,\ \varphi_t \ge 0,\ \forall t \in [0, T]\}$.

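Proof of Proposition 3.14 An application of the large deviation principle (2.1)
with $\varepsilon := R^{-(1-\gamma)}$ yields $\limsup_{R\to+\infty} R^{-2(1-\gamma)} \log W\big(\frac{1}{T}\int_0^T X_t\,dt \ge R\big) = \limsup_{\varepsilon\to 0} \varepsilon^2 \log W\big(\frac{1}{T}\int_0^T X^\varepsilon_t\,dt \ge 1\big) \le -\nu_T$, with $\nu_T = \inf\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \frac{1}{T}\int_0^T \varphi_t\,dt \ge 1\}$. Proceeding as in the proof of Proposition 2.5, and in particular exploiting the endomorphism of $AC([0, T], \mathbb{R}_+)$ $\varphi \mapsto \psi = \varphi^{1-\gamma}$ together
with the chain rule $\dot\psi = (1-\gamma)\,\dot\varphi/\varphi^\gamma\,\mathbf{1}_{\varphi>0}$, one has
$$\nu_T = \inf\Big\{I_T(\varphi) : \varphi_0 = 0,\ \varphi \ge 0,\ \frac{1}{T}\int_0^T \varphi_t\,dt \ge 1\Big\}$$
$$= \inf\Big\{\frac{T}{2\sigma^2(1-\gamma)^2}\int_0^1 \big(\dot\psi_{Tt} - \beta(1-\gamma)\psi_{Tt}\big)^2\,dt : \psi_0 = 0,\ \psi \ge 0,\ \int_0^1 \psi_{Tt}^{1/(1-\gamma)}\,dt \ge 1\Big\}$$
$$= \inf\Big\{\frac{1}{2T\sigma^2(1-\gamma)^2}\int_0^1 \Big(\frac{d}{dt}(\psi_{Tt}) - T\beta(1-\gamma)\psi_{Tt}\Big)^2\,dt : \psi_0 = 0,\ \psi \ge 0,\ \int_0^1 \psi_{Tt}^{1/(1-\gamma)}\,dt \ge 1\Big\}$$
$$= \inf_{\eta\ge 1}\ \inf\Big\{\frac{1}{2T\sigma^2(1-\gamma)^2}\int_0^1 \big(\dot\phi_t - T\beta(1-\gamma)\phi_t\big)^2\,dt : \phi_0 = 0,\ \phi \ge 0,\ \int_0^1 \phi_t^{1/(1-\gamma)}\,dt = \eta\Big\} =: \inf_{\eta\ge 1} J(\eta).$$

When γ = 1/2, the latter variational problem was studied in [12, Exercise
2.1.13]. The explicit solution for J provides the expression of the constant $\nu_T = \inf_{\eta\ge 1} J(\eta) = J(1)$ given in (3.30). The large deviation lower bound yields
$\liminf_{R\to+\infty} R^{-2(1-\gamma)} \log W\big(\frac{1}{T}\int_0^T X_t\,dt > R\big) = \liminf_{\varepsilon\to 0} \varepsilon^2 \log W\big(\frac{1}{T}\int_0^T X^\varepsilon_t\,dt > 1\big) \ge -J(1) = -\nu_T$, and the claim is proved.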
Consistency check with the explicit formulae for the integrated CIR process.
Let us consider the case γ = 1/2, and compare Proposition 3.14 with the moment
explosion of the integrated CIR process, corresponding to α(x) ≡ α ≥ 0 in condition (H2). We focus on the (common) case of a mean-reverting drift, i.e. β < 0;
computations for β > 0 are similar. Estimate (1.5) establishes that $\frac{1}{T}\int_0^T X_t\,dt$ has
finite exponential moments up to order $\nu_T$: more precisely,
$$u^* := \sup\Big\{u > 0 : E\Big[\exp\Big(\frac{u}{T}\int_0^T X_t\,dt\Big)\Big] < \infty\Big\} = \sup\Big\{\nu > 0 : P\Big(\frac{1}{T}\int_0^T X_t\,dt > x\Big) = O(e^{-\nu x}) \text{ as } x \to \infty\Big\} = \nu_T$$ (3.32)

(for the central identity, see for example [15, Sect. 4]); in other words, $\nu_T$ is the positive critical exponent of $\frac{1}{T}\int_0^T X_t\,dt$. Critical exponents for integrated CIR have been
assessed by [2, 14, 18] relying (essentially) on the affine structure of the process. It
is typical to obtain $u^*$ by inverting an explicit explosion time: following [2, Corollary
3.3], $E[\exp(\frac{u}{T}\int_0^T X_t\,dt)]$ is always finite if $u \le T\beta^2/(2\sigma^2)$, and if $u > T\beta^2/(2\sigma^2)$,
the expectation is finite for $T < T^*(u)$ and infinite for $T > T^*(u)$, where $T^*$ reads
$$T^*(u) = 2\,\frac{\pi + \arctan\big(\frac{\gamma(u)}{\beta}\big)}{\gamma(u)},$$
where $\gamma(u) = \sqrt{2\sigma^2\frac{u}{T} - \beta^2}$. Fixing T and using the monotonicity of $T^*$, this means
that the expectation becomes infinite for $u > u^*$ with $u^*$ the solution to
$$\pi + \arctan\Big(\frac{\gamma(u)}{\beta}\Big) = \gamma(u)\,\frac{T}{2}.$$ (3.33)

As an equation in γ, it is easy to see that (3.33) has a unique root $\gamma^*$ on $\mathbb{R}_+$ such that
$\frac{T}{2}\gamma^* \in (\frac{\pi}{2}, \pi)$. From the definition of γ,
$$u^* = \frac{1}{2\sigma^2}\big(T\beta^2 + T(\gamma^*)^2\big) = \frac{1}{2\sigma^2}\Big(T\beta^2 + \frac{4}{T}\Big(\frac{T\gamma^*}{2}\Big)^2\Big) = \frac{1}{2\sigma^2}\Big(T\beta^2 + \frac{4}{T}(\omega^*)^2\Big)$$
setting $\omega^* = \frac{T\gamma^*}{2}$. From (3.33), $\omega^*$ is the unique solution to $\omega = \pi + \arctan\big(\frac{2\omega}{T\beta}\big)$,
which is equivalent to $\tan(\omega) = \frac{2\omega}{T\beta}$ together with $\omega \in (\frac{\pi}{2}, \pi)$: one sees that this
definition coincides with the one for ω in (3.31) (noticing we are in the first case
when β < 0).
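The consistency check above can be carried out numerically for a mean-reverting drift. The sketch below is an illustration with hypothetical function names and parameter values: it solves $\omega = \pi + \arctan(2\omega/(T\beta))$ by bisection on $(\pi/2, \pi)$, verifies that the root satisfies the first case of (3.31), and checks that the resulting $\nu_T$ from (3.30) is mapped back to the horizon $T$ by the explosion time $T^*(u)$.

```python
import math

def omega_star(T, beta):
    """Root of w = pi + arctan(2w / (T*beta)) in (pi/2, pi), for beta < 0, by bisection."""
    if beta >= 0.0:
        raise ValueError("this sketch covers the mean-reverting case beta < 0 only")
    g = lambda w: w - math.pi - math.atan(2.0 * w / (T * beta))
    lo, hi = math.pi / 2.0, math.pi      # g(lo) < 0 < g(hi), and g is increasing
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nu_T(T, beta, sigma):
    """First case of (3.30): (T beta^2 + 4 w^2 / T) / (2 sigma^2)."""
    w = omega_star(T, beta)
    return (T * beta ** 2 + 4.0 * w ** 2 / T) / (2.0 * sigma ** 2)

def explosion_time(u, T, beta, sigma):
    """T*(u) = 2 (pi + arctan(gamma(u)/beta)) / gamma(u), gamma(u) = sqrt(2 sigma^2 u / T - beta^2)."""
    gam = math.sqrt(2.0 * sigma ** 2 * u / T - beta ** 2)
    return 2.0 * (math.pi + math.atan(gam / beta)) / gam

if __name__ == "__main__":
    T, beta, sigma = 1.0, -1.0, 1.0
    w = omega_star(T, beta)
    print(w, nu_T(T, beta, sigma), explosion_time(nu_T(T, beta, sigma), T, beta, sigma))
```

Bisection is used instead of a library root finder only to keep the example dependency-free; the function $\omega \mapsto \omega - \pi - \arctan(2\omega/(T\beta))$ is increasing for $\beta < 0$, so the root in the bracket is unique.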

Acknowledgments We would like to thank an anonymous referee for the careful reading of the
paper and for several valuable comments which helped to improve the presentation. We thank
Peter Friz for stimulating discussions and Antoine Jacquier for useful references on integrated
CIR processes. SDM (affiliated with TU-Berlin when this work was started) acknowledges partial
financial support from Matheon. GC acknowledges financial support from Berlin Mathematical
School. SDM and GC acknowledge financial support for travel expenses from the research program
‘Chaire Risques Financiers’ of the Fondation du Risque.

Appendix A

We complete the proof of Lemma 3.2 here.

Proof of Lemma 3.2 Let us define an auxiliary process $\bar X$ by
$$d\bar X_t = |\alpha|_\infty\,dt + \sigma\exp(-(1-\gamma)|\beta|t)\,\bar X_t^\gamma\,dB_t, \qquad \bar X_0 = x;$$
after a simple application of the product rule, one has that the process $Z_t := \exp(|\beta|t)\bar X_t$ is a solution to
$$dZ_t = \big(|\alpha|_\infty\exp(|\beta|t) + |\beta|Z_t\big)\,dt + \sigma Z_t^\gamma\,dB_t, \qquad Z_0 = x.$$

Since $|\alpha|_\infty\exp(|\beta|t) \ge |\alpha|_\infty$, an application of the comparison principle for SDEs
[17, Proposition 5.2.18] yields $Z_t \ge \tilde X_t$, for all t ≥ 0. Therefore, if $\bar X_t^{2(1-\gamma)}$ admits
(some) exponential moments, so does $Z_t^{2(1-\gamma)}$ and, by comparison, $\tilde X_t^{2(1-\gamma)}$. In this
sense, the process $\bar X$ is not covered by Proposition 3.3 in [9], since the latter deals
with the case of a diffusion coefficient that does not depend on time (see [9, Eq.
(3.1)]); nonetheless, the essential condition that [9, Proposition 3.3] relies on is the
presence of a non-strictly positive slope coefficient, say b in the drift term a + bX
(cf. [9, Eq. (3.3)]). Since this is the case for the process $\bar X$ (which has zero slope
coefficient b), it is straightforward to extend the proof to the present setting: in
particular, in the spirit of Lamperti's change-of-variable argument, one still defines
the function $\varphi(x) = \int_0^x \frac{dy}{\sigma y^\gamma} = \frac{1}{\sigma(1-\gamma)}x^{1-\gamma}$ and studies the process $\tilde\varphi(\bar X_t)$, where
the function $\tilde\varphi$ is a modification of φ identically null around zero. Itô's formula
shows that $\tilde\varphi(\bar X_t)$ is an Itô process with bounded quadratic variation and a bounded
drift term; the existence of quadratic exponential moments for $\tilde\varphi(\bar X_t)$, then, is a
consequence of the Dubins–Schwarz time-change argument and Fernique's theorem.
As a consequence, there exist c′, C > 0 such that $\sup_{t\le T} E[\exp(c'\bar X_t^{2(1-\gamma)})] \le C$;
it follows that $\sup_{t\le T} E[\exp(c\tilde X_t^{2(1-\gamma)})] \le \sup_{t\le T} E[\exp(cZ_t^{2(1-\gamma)})] \le C$ with $c := c'\exp(-2|\beta|(1-\gamma)T)$, and the claim is proved.
We report the statement given in [26, Chap. 2, Theorem 2.13].

Lemma A.1 (Garsia-Rodemich-Rumsey's Lemma) Let p and Ψ be continuous, strictly increasing functions on [0, +∞) such that p(0) = Ψ(0) = 0 and
$\lim_{t\to+\infty}\Psi(t) = +\infty$. If ω ∈ Ω is such that
$$\int_0^T\!\!\int_0^T \Psi\Big(\frac{|\omega_t - \omega_s|}{p(|t-s|)}\Big)\,ds\,dt \le K,$$ (A.1)
then
$$|\omega_t - \omega_s| \le 8\int_0^{|t-s|} \Psi^{-1}\Big(\frac{4K}{u^2}\Big)\,dp(u).$$ (A.2)

Lemma A.1 allows us to prove Proposition 3.3:

Proof of Proposition 3.3 Assume that (3.3) holds true with the left hand side replaced
by K > 0. Applying Lemma A.1 with the choice of functions $\Psi(y) = \exp(\varepsilon^{-2}y) - 1$,
$p(y) = \sqrt{y}$, one has for all s, t
$$|\omega_t - \omega_s| \le 8\int_0^{|t-s|}\Psi^{-1}\Big(\frac{4K}{u^2}\Big)\,dp(u) = 8\varepsilon^2\int_0^{|t-s|}\log\Big(\frac{4K}{u^2} + 1\Big)\,dp(u)$$
$$\le 8\varepsilon^2\Big(\int_0^{|t-s|}\log\big(4K + T^2\big)\,dp(u) + \int_0^{|t-s|}\log\big(u^{-2}\big)\,dp(u)\Big)$$
$$\le 8\varepsilon^2\Big(\sqrt{|t-s|}\,\log\big(4K + T^2\big) + \sqrt{|t-s|}\,\big(4 - 2\log(|t-s|)\big)\Big).$$

Dividing on both sides by $(t-s)^\eta$ and taking suprema we obtain
$$\|\omega\|_\eta \le 8\varepsilon^2\Big(\log\big(4K + T^2\big)T^{1/2-\eta} + 4T^{1/2-\eta} + K_\eta\Big).$$

Since the right hand side in the last estimate is $K^{-1}_{\varepsilon,\eta}(K)$, (3.3) yields (3.4).
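Finally, we prove Lemma 3.10.
Proof of Lemma 3.10 Denote $T^\varepsilon$ the stopping time
$$T^\varepsilon(\omega) = \inf\Big\{t \ge 0 : \omega_t \le \frac12\varepsilon x^{1-\gamma}\Big\}.$$ (A.3)

We can apply Itô's formula to the function $f(x) = x^{1-\gamma}$ up to time $T^\varepsilon(Y^{\varepsilon,h})$, and
obtain
$$Y^{\varepsilon,h}_t - \varepsilon x^{1-\gamma} = \int_0^t \tilde b_\varepsilon(Y^{\varepsilon,h}_s)\,ds + \sigma(1-\gamma)h_t + \varepsilon\sigma(1-\gamma)B_t, \qquad \forall t \le T^\varepsilon(Y^{\varepsilon,h}),\ \text{a.s.},$$ (A.4)
where $\tilde b_\varepsilon$ is given by
$$\tilde b_\varepsilon(y) := (1-\gamma)\,\varepsilon^{\frac{1}{1-\gamma}}\,\alpha\big(\varepsilon^{-\frac{1}{1-\gamma}}y^{\frac{1}{1-\gamma}}\big)\frac{1}{y^{\frac{\gamma}{1-\gamma}}} - \frac{\sigma^2\gamma(1-\gamma)}{2}\,\varepsilon^2\,\frac{1}{y} + \beta(1-\gamma)y.$$ (A.5)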

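We need to prove
$$\lim_{\varepsilon\to 0} W\Big(\sup_{t\in[0,T]} |Y^{\varepsilon,h}_t - S_0(h)_t| \le R\Big) = 1 \qquad \forall R > 0.$$ (A.6)

In order to simplify the notation, there is no ambiguity in writing Y instead of $Y^{\varepsilon,h}$
inside this proof.
Step 1. We first prove (A.6) under the assumption
$$k := \inf_{t\in[0,T]} \dot h_t > 0.$$ (A.7)

Let us first show that
$$\lim_{\varepsilon\to 0} W\big(T^\varepsilon(Y^{\varepsilon,h}) \le T\big) = 0.$$ (A.8)

A direct computation shows that there exists a constant c > 0 depending on x, σ, α(·)
such that:
$$\inf_{y \ge \frac12\varepsilon x^{1-\gamma}} \big(\tilde b_\varepsilon(y) - \beta(1-\gamma)y\big) \ge -c\varepsilon.$$ (A.9)

Define $(Z_t)_{t\in[0,T]}$ by
$$Z_t = \varepsilon x^{1-\gamma} + (-c\varepsilon + \sigma(1-\gamma)k)\,t + \beta(1-\gamma)\int_0^t Z_s\,ds + \varepsilon\sigma(1-\gamma)B_t.$$ (A.10)

Using (A.9), it follows from the comparison principle for SDEs that
$$Y_t \ge Z_t \qquad \forall t \le T^\varepsilon(Y),\ \text{a.s.}$$ (A.11)

We claim that
$$W\big(T^\varepsilon(Z) \le T\big) \to 0$$ (A.12)
holds true. Since $W(T^\varepsilon(Y) \le T) \le W(T^\varepsilon(Z) \le T)$ by (A.11), then (A.8) holds.
We prove (A.12) later on. Now, it follows from the definition of $S_0(h)_t$ and an
application of Gronwall's Lemma that
$$|Y_t - S_0(h)_t| \le \varepsilon\Big(c + \sigma(1-\gamma)\sup_{t\in[0,T]}|B_t|\Big)e^{|\beta|(1-\gamma)T} =: \Theta^\varepsilon_T \qquad \forall t \le T^\varepsilon(Y),$$
therefore, for any R > 0 and ε small enough,
$$W\Big(\sup_{t\in[0,T]} |Y_t - S_0(h)_t| \le R\Big) \ge W\Big(\Big\{\sup_{t\in[0,T^\varepsilon(Y)]} |Y_t - S_0(h)_t| \le \Theta^\varepsilon_T\Big\} \cap \{T^\varepsilon(Y) \ge T\} \cap \big\{\Theta^\varepsilon_T \le R\big\}\Big) \ge W\Big(\{T^\varepsilon(Y) \ge T\} \cap \big\{\Theta^\varepsilon_T \le R\big\}\Big).$$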

Since both the events in the right hand side of the last inequality have probability
converging to 1, (A.6) follows, and Lemma 3.10 is proved under condition (A.7).
Step 2. We assume that (A.7) holds only on the time interval [0, ρ], that is $\dot h_t \ge k$
for every t ≤ ρ, for some k, ρ > 0. Repeating the argument of Step 1 with T = ρ,
we have
$$\lim_{\varepsilon\to 0} W\Big(\sup_{t\in[0,\rho]} |Y_t - S_0(h)_t| \le R\Big) = 1, \qquad \forall R > 0.$$ (A.13)

We apply estimate (A.13) together with a localization argument. Define a time-shift operator $\tau_\rho\omega$, for every ω ∈ Ω, by $(\tau_\rho\omega)_t = \omega_{\rho+t}$ for all t ∈ [0, T − ρ]. For
any fixed y > 0, denote $X^{y,\rho}$ the strong solution of the SDE:
$$X^{y,\rho}_t = y^{\frac{1}{1-\gamma}} + \int_0^t \big(b_\varepsilon(X^{y,\rho}_s) + \sigma|X^{y,\rho}_s|^\gamma\,\dot h_{\rho+s}\big)\,ds + \varepsilon\sigma\int_0^t |X^{y,\rho}_s|^\gamma\,dB_s$$
and set
$$Y^{y,\rho} := (X^{y,\rho})^{1-\gamma}.$$

Note that $Y^{y,\rho}$ is well defined since $X^{y,\rho}_t \ge 0$ for all t ∈ [0, T], W-almost surely.
If h = 0 the non-negativity of the trajectories of $X^{y,\rho}$ follows from an application of
Proposition 3.1 in [9] and extends to h ∈ H by an application of the Girsanov
theorem. By definition of Y and $Y^{y,\rho}$, the Markov property yields
$$E\big(f(\tau_\rho Y)\,|\,\mathcal{F}_\rho\big) = E\big(f(Y^{Y_\rho,\rho})\big).$$

By the continuity of the map $(h, y) \mapsto S_y(h)$ we can choose R′ > 0 such that
$$\sup_{y\in B(S_0(h)_\rho, R')}\ \sup_{t\in[0,T-\rho]} |S_y(\tau_\rho h)_t - S_{S_0(h)_\rho}(\tau_\rho h)_t| \le \frac{R}{2}.$$ (A.14)

Therefore, using (A.14) the following inclusion of events holds (assume w.l.o.g. $R' \le \frac{R}{2}$):
$$\Big\{\sup_{t\in[0,T]} |Y_t - S_0(h)_t| \le R\Big\} \supseteq \Big\{\sup_{t\in[0,\rho]} |Y_t - S_0(h)_t| \le R'\Big\} \cap \Big\{\sup_{t\in[0,T-\rho]} |\tau_\rho(Y)_t - S_{Y_\rho}(\tau_\rho h)_t| \le \frac{R}{2}\Big\}.$$

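Applying the Markov property,
$$W\Big(\sup_{t\in[0,T]} |Y_t - S_0(h)_t| \le R\Big) \ge E\Big[\mathbf{1}_{\{\sup_{t\in[0,\rho]} |Y_t - S_0(h)_t| \le R'\}}\ W\Big(\sup_{t\in[0,T-\rho]} |Y^{Y_\rho,\rho}_t - S_{Y_\rho}(\tau_\rho h)_t| \le \frac{R}{2}\Big)\Big]$$
$$\ge W\Big(\sup_{t\in[0,\rho]} |Y_t - S_0(h)_t| \le R'\Big) \times \inf_{y\in B(S_0(h)_\rho, R')} W\Big(\sup_{t\in[0,T-\rho]} |Y^{y,\rho}_t - S_y(\tau_\rho h)_t| \le \frac{R}{2}\Big).$$ (A.15)

We want to show that
$$\lim_{\varepsilon\to 0}\ \inf_{y\in B(S_0(h)_\rho, R')} W\Big(\sup_{t\in[0,T-\rho]} |Y^{y,\rho}_t - S_y(\tau_\rho h)_t| \le \frac{R}{2}\Big) = 1.$$ (A.16)

It follows from the hypothesis $S_0(h)_t > 0$ ∀t > 0 and the continuity of the map
$(y, h) \mapsto S_y(h)$ that, if R′, R are small enough,
$$y^* := \inf_{y\in B(S_0(h)_\rho, R')}\ \inf_{t\in[0,T-\rho]} S_y(\tau_\rho h)_t - \frac{R}{2} > 0.$$ (A.17)

Define $U^{y,\rho}$ as the unique strong solution of the SDE:
$$U^{y,\rho}_t = y + \int_0^t \big(\tilde b^u_\varepsilon(U^{y,\rho}_s) + \sigma(1-\gamma)\dot h_{s+\rho}\big)\,ds + \varepsilon\sigma(1-\gamma)B_t,$$
where
$$\tilde b^u_\varepsilon(y) = \begin{cases} \tilde b_\varepsilon(y) & \text{if } y \ge y^* \\ \beta(1-\gamma)y + (1-\gamma)\,\varepsilon^{\frac{1}{1-\gamma}}\,\alpha\big(\varepsilon^{-\frac{1}{1-\gamma}}(y^*)^{\frac{1}{1-\gamma}}\big)\frac{1}{(y^*)^{\frac{\gamma}{1-\gamma}}} - \frac{\sigma^2\gamma(1-\gamma)}{2}\,\varepsilon^2\,\frac{1}{y^*} & \text{if } y < y^*. \end{cases}$$

Then one has
$$W\Big(\sup_{t\in[0,T-\rho]} |Y^{y,\rho}_t - S_y(\tau_\rho h)_t| \le \frac{R}{2}\Big) = W\Big(\sup_{t\in[0,T-\rho]} |U^{y,\rho}_t - S_y(\tau_\rho h)_t| \le \frac{R}{2}\Big).$$ (A.18)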

Now observing that $\tilde b^u_\varepsilon$ is globally Lipschitz continuous ∀ε > 0 and $C^\varepsilon := \sup_{y\in\mathbb{R}} |\tilde b^u_\varepsilon(y) - \beta(1-\gamma)y| \to 0$, an application of Gronwall's lemma gives
$$E\Big[\sup_{t\in[0,T-\rho]} \big|U^{y,\rho}_t - S_y(\tau_\rho h)_t\big|\Big] \le \big(C^\varepsilon T + 2\varepsilon\sigma(1-\gamma)\sqrt{T}\big)\exp(|\beta(1-\gamma)|T).$$ (A.19)

By letting ε → 0 and applying the Markov inequality, observing that the right hand
side of (A.19) does not depend on y, we have proven (A.16). By letting ε → 0 in
(A.15) and applying (A.13) and (A.16), the proof of Lemma 3.10 is complete.
Proof of (A.12). Observe that $\tilde Z := \frac{1}{\varepsilon} Z$ is an Ornstein-Uhlenbeck process,

\[
\tilde Z_t = x^{1-\gamma} + \mu_\varepsilon t + \beta(1-\gamma) \int_0^t \tilde Z_s\, ds + \sigma(1-\gamma) B_t, \tag{A.20}
\]

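where $\mu_\varepsilon := \frac{1}{\varepsilon}(-c\varepsilon + \sigma(1-\gamma)k) = -c + \frac{\sigma(1-\gamma)k}{\varepsilon}$. It is immediate by the definition of $\tilde Z$ that $W(T(Z^\varepsilon) \le T) = W\big( \inf_{t \in [0,T]} \tilde Z_t \le \frac{x^{1-\gamma}}{2} \big)$. The explicit representation of $\tilde Z$ reads

\[
\tilde Z_t := x^{1-\gamma} e^{\beta(1-\gamma)t} + f_\varepsilon(t) + \sigma(1-\gamma) \exp(\beta(1-\gamma)t) \int_0^t \exp(-\beta(1-\gamma)s)\, dB_s \tag{A.21}
\]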

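with $f_\varepsilon(t) = -\mu_\varepsilon \frac{1 - \exp(\beta(1-\gamma)t)}{\beta(1-\gamma)}$. Consider a deterministic time $\tau_\varepsilon$ with $\tau_\varepsilon \to 0$ as $\varepsilon \to 0$, to be chosen precisely later on. Noting that $f_\varepsilon$ is a decreasing function, for $\tau_\varepsilon \le t \le T$ one has

\[
\tilde Z_t \ge f_\varepsilon(\tau_\varepsilon) - \Big| \sigma(1-\gamma) \int_0^t \exp(-\beta(1-\gamma)s)\, dB_s \Big|; \tag{A.22}
\]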

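hence, using Markov's inequality and Doob's inequality,

\begin{align*}
W\Big( \inf_{t \in [\tau_\varepsilon, T]} \tilde Z_t \le x^{1-\gamma}/2 \Big)
&\le W\Big( \sup_{t \in [\tau_\varepsilon, T]} \Big| \sigma(1-\gamma) \int_0^t \exp(-\beta(1-\gamma)s)\, dB_s \Big| \ge f_\varepsilon(\tau_\varepsilon) - x^{1-\gamma}/2 \Big) \\
&\le C\sigma(1-\gamma) \big( f_\varepsilon(\tau_\varepsilon) - x^{1-\gamma}/2 \big)^{-1} \Big( \int_0^T \exp(-2\beta(1-\gamma)s)\, ds \Big)^{\frac{1}{2}}.
\end{align*}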


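Now, the choice $\tau_\varepsilon = \varepsilon$ gives $f_\varepsilon(\tau_\varepsilon) \sim \mu_\varepsilon \tau_\varepsilon \to \infty$ as $\varepsilon \to 0$, so that $\big( f_\varepsilon(\tau_\varepsilon) - x^{1-\gamma}/2 \big)^{-1} \to 0$. On the other hand, $\inf_{t \in [0,\tau_\varepsilon]} \tilde Z_t \to x^{1-\gamma}$ a.s. as $\varepsilon \to 0$, hence $W\big( \inf_{t \in [0,\tau_\varepsilon]} \tilde Z_t \le x^{1-\gamma}/2 \big) \to 0$ as $\varepsilon \to 0$, and the claim is proven.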

References

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs,
and Mathematical Tables, 10th edn. Dover, New York (1972)
2. Andersen, L., Piterbarg, V.: Moment explosions in stochastic volatility models. Financ. Stoch.
11, 29–50 (2007)
3. Azencott, R.: Grandes déviations et applications. Ecole d’été de Probabilités de Saint-Flour
VIII-1978. Lecture Notes in Mathematics, vol. 774, pp. 1–176. Springer, Berlin (1980)
4. Baldi, P., Caramellino, L.: General Freidlin-Wentzell large deviations and positive diffusions.
Stat. Probab. Lett. 81, 1218–1229 (2011)
5. Ben Arous, G.: Développement asymptotique du noyau de la chaleur hypoelliptique hors du
cut-locus. Annales scientifiques de l’Ecole Normale Supérieure 4(21), 307–331 (1988)
6. Bingham, N.H., Goldie, C.M., Teugels, J.L.: Regular Variation. Cambridge University Press,
Cambridge (1987)
7. Bismut, J.-M.: Large Deviations and the Malliavin Calculus. Birkhäuser, Boston (1984)
8. Chiarini, A., Fischer, M.: On large deviations for small noise Itô processes. Adv. Appl. Probab.
46(4), 1126–1147 (2014)
9. De Marco, S.: Smoothness and asymptotic estimates of densities for SDEs with locally smooth
coefficients and applications to square root-type diffusions. Ann. Appl. Probab. 4(21), 1282–
1321 (2011)
10. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Applications
of Mathematics, Springer, New York (1998)
11. Deuschel, J.-D., Friz, P., Jacquier, A., Violante, S.: Marginal density expansions for diffusions
and stochastic volatility, II: Applications. Comm. in Pure and Applied Math. 67(1), 40–82
(2014)
12. Deuschel, J.-D., Stroock, D.W.: Large Deviations. Pure and Applied Mathematics. American
Mathematical Society, New York, London. Revised edition of: An introduction to the theory
of large deviations/D.W. Stroock. cop.1984 (2000)
13. Donati-Martin, C., Rouault, A., Yor, M., Zani, M.: Large deviations for squares of Bessel and
Ornstein-Uhlenbeck processes. Probab. Theory Relat. Fields 129, 261–289 (2004)
14. Dufresne, D.: The integrated square-root process. Research Paper no. 90, Centre for Actuarial
Studies, University of Melbourne (2001)
15. Gulisashvili, A.: Asymptotic formulas with error estimates for call pricing functions and the
implied volatility at extreme strikes. SIAM J. Financ. Math. 1(1), 609–641 (2010)
16. Jeanblanc, M., Yor, M., Chesney, M.: Mathematical Methods for Financial Markets. Springer
finance. Springer, Dordrecht, Heidelberg, London (2009)
17. Karatzas, I., Shreve, S.: Brownian Motion and Stochastic Calculus, 2nd edn. Springer, New
York (1991)
18. Keller-Ressel, M.: Moment explosions and long-term behavior of affine stochastic volatility
models. Math. Financ. 21, 73–98 (2011)
19. Klebaner, F., Liptser, R.: Asymptotic analysis of ruin in the constant elasticity of variance
model. Theory Probab. Appl. 55(2), 291–297 (2011)
20. Leoni, G.: A First Course in Sobolev Spaces. Graduate studies in mathematics. American
Mathematical Society, Cambridge (2009)
21. Lions, P.-L., Musiela, M.: Correlations and bounds for stochastic volatility models. Annales
de l’Institut H. Poincaré 24, 1–16 (2007)

22. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, New
York (1999)
23. Robertson, S.: Sample path large deviations and optimal importance sampling for stochastic
volatility models. Stoch. Process. Appl. 120(1), 66–83 (2010)
24. Schöbel, R., Zhu, J.: Stochastic volatility with an Ornstein-Uhlenbeck process: an extension.
Eur. Financ. Rev. 3(1), 23–46 (1999)
25. Stein, E.M., Stein, J.C.: Stock price distribution with stochastic volatility: an analytic approach.
Rev. Financ. Stud. 4, 727–752 (1991)
26. Stroock, D., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Grundlehren der math-
ematischen Wissenschaften. Fundamental Principles of Mathematical Sciences, vol. 233.
Springer, Berlin (1979). Reprinted in 2006
27. Trevisan, D.: Zero noise limits using local times. Electron. Commun. Probab. 18(31), 1–7
(2013)
Long Time Asymptotics for Optimal
Investment

Huyên Pham

Abstract This survey reviews the portfolio selection problem over a long-term horizon.
We consider two objectives: (i) maximize the probability of outperforming a target
growth rate of the wealth process; (ii) minimize the probability of falling below a target
growth rate. We study the asymptotic behavior of these criteria, formulated as large
deviations control problems, which we solve by a duality method leading to ergodic
risk-sensitive portfolio optimization problems. Special emphasis is placed on linear
factor models, where explicit solutions are obtained.

Keywords Long-term investment · Large deviations · Risk-sensitive control ·
Ergodic HJB equation · Risk-sensitive control problems · Hamilton-Jacobi-Bellman
equations · Large-time asymptotic

MSC Classification (2000) 60F10 · 91G10 · 93E20

1 Introduction

Dynamic portfolio selection looks for strategies maximizing some performance cri-
terion. It is a main topic in mathematical finance, first solved in continuous time in
the seminal paper [13], and extended in various directions by taking into account
stochastic investment opportunities, market imperfections and/or transaction costs.

Contribution to Springer Proceedings in Asymptotic Methods in Finance (Editors Friz-Gatheral-Gulisashvili-Jacquier-Teichmann), in memory of Peter Laurence.

H. Pham (B)
Laboratoire de Probabilités et Modèles Aléatoires CNRS, UMR 7599,
Université Paris Diderot, Paris, France
e-mail: [email protected]
URL: http://www.math.univ-paris-diderot.fr
H. Pham
CREST-ENSAE, Malakoff, France

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_18

We refer for instance to the textbooks [10, 11] or [19], and the recent survey paper
[12] for developments on this subject.
The classical criterion for investment decisions is the maximization of expected utility
from terminal wealth, which requires specifying, on the one hand, the utility function
representing the investor's preferences, which is subjective by nature, and, on the other
hand, a finite horizon. We consider in this paper an alternative behavioral foundation,
with an objective criterion over the long term. More precisely, we are concerned
with the performance of a portfolio relative to a given target, and are interested in
maximizing (resp. minimizing) the probability of outperforming (resp. falling below)
a target growth rate when the time horizon goes to infinity. Such a criterion, formulated
as a large deviations portfolio optimization problem, was proposed by [22] in
a static framework, studied in a continuous-time framework for the maximization
of the upside chance probability by [17], and then by [9]; see also [21] for discrete-time
models and [18] for a survey paper. The asymptotics of minimizing the downside
risk probability are studied in [8, 15].
Large deviations portfolio optimization is a nonstandard stochastic control problem,
and is tackled by a duality approach. The dual control problem is an ergodic
risk-sensitive portfolio optimization problem, studied in [6] by dynamic programming
PDE methods in a Markovian setting (see also [7]), and leads to particularly tractable
results with time-homogeneous policies. A nice feature of the duality approach is that
it also relates the target level in the objective probability of upside chance maximization
or downside risk minimization to the subjective degree of risk aversion, and hence
makes the utility function of the investor endogenous.
The rest of this paper is organized as follows. Section 2 formulates the large devi-
ations criterion. In Sect. 3, we state the general duality relation for the large devi-
ations optimization problem, both for the upside chance probability maximization
and downside risk minimization. We illustrate in Sect. 4 our results in the Black-
Scholes toy model with constant proportion portfolio. Finally, we consider in Sect. 5
a factor model for assets price, and characterize the optimal strategy of the large
deviations optimization problem via the resolution of an ergodic Hamilton-Jacobi-
Bellman equation from the risk-sensitive dual control. Explicit solutions are provided
in the linear Gaussian factor model.

2 Large Deviations Criterion

We study a portfolio choice criterion that is preference-free, i.e. objective, and
horizon-free, i.e. formulated over long-term investment. It is expressed as a large deviations
criterion, which we now describe in an abstract set-up. On a filtered probability space
$(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ supporting all the random quantities appearing in the sequel,
we consider a frictionless financial market with $d$ assets of positive price process $S =
(S^1, \ldots, S^d)$. An agent invests at any time $t$ a fraction $\pi_t$ of her wealth in the
assets, based on the available information $\mathcal{F}_t$. We denote by $\mathcal{A}$ the set of admissible
control strategies $\pi = (\pi_t)_{t \ge 0}$, and by $X^\pi$ the associated positive wealth process, with
dynamics:

\[
dX^\pi_t = X^\pi_t\, \pi_t^\top \mathrm{diag}(S_t)^{-1}\, dS_t, \quad t \ge 0, \tag{2.1}
\]

where $\mathrm{diag}(S_t)^{-1}$ denotes the diagonal $d \times d$ matrix with $i$-th diagonal term $1/S^i_t$.
We then define the so-called growth rate portfolio, i.e. the logarithm of the wealth
process $X^\pi$:

\[
L^\pi_t := \ln X^\pi_t, \quad t \ge 0.
\]

We denote by $\bar L^\pi$ the average growth rate portfolio over time:

\[
\bar L^\pi_t := \frac{L^\pi_t}{t}, \quad t > 0.
\]
We shall then consider two problems on the long time asymptotics for the average
growth rate:
(i) Upside chance probability: given a target growth rate $\ell$, the agent wants to
maximize over portfolio strategies $\pi \in \mathcal{A}$

\[
\mathbb{P}\big[ \bar L^\pi_T \ge \ell \big] \quad \text{when } T \to \infty.
\]

(ii) Downside risk probability: given a target growth rate $\ell$, the agent wants to
minimize over portfolio strategies $\pi \in \mathcal{A}$

\[
\mathbb{P}\big[ \bar L^\pi_T \le \ell \big] \quad \text{when } T \to \infty.
\]

Actually, when the time horizon $T$ goes to infinity, the probabilities of upside chance
or downside risk typically decay exponentially in time, and we are led to the
following mathematical formulations of the large deviations criterion:

\[
v_+(\ell) := \sup_{\pi \in \mathcal{A}} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \ge \ell \big], \tag{2.2}
\]
\[
v_-(\ell) := \inf_{\pi \in \mathcal{A}} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \le \ell \big]. \tag{2.3}
\]

This criterion depends on the objective probability $\mathbb{P}$ and the target growth rate $\ell$, but
involves neither an exogenous utility function nor a finite horizon. The large deviations control
problems (2.2) and (2.3) are nonstandard in the literature on stochastic control, and
we shall study them by a duality approach.

3 Duality

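We derive in this section the dual formulation of the large deviations criterion introduced
in (2.2) and (2.3). Given $\pi \in \mathcal{A}$, if the average growth rate portfolio $\bar L^\pi_T$ satisfies
a large deviations principle, then large deviations theory states that its rate function
$I(\cdot, \pi)$ should be related to its limiting log-Laplace transform $\Gamma(\cdot, \pi)$ by duality via
the Gärtner-Ellis theorem:

\[
I(\ell, \pi) = \sup_{\theta} \big[ \theta\ell - \Gamma(\theta, \pi) \big], \tag{3.1}
\]

where $I(\cdot, \pi)$ is the rate function associated to the LDP of $\bar L^\pi_T$:

\[
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \ge \ell \big] = -\inf_{\ell' \ge \ell} I(\ell', \pi) = -I(\ell, \pi), \quad \ell \ge \lim_{T \to \infty} \bar L^\pi_T, \tag{3.2}
\]

and $\Gamma(\cdot, \pi)$ is the limiting log-Laplace transform of $\bar L^\pi_T$:

\[
\Gamma(\theta, \pi) := \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big], \quad \theta \in \mathbb{R}.
\]

The issue is now to extend the duality relation (3.1) when optimizing over the control
$\pi$. To fix ideas, let us formally derive from (3.1) and (3.2) the maximization of the
upside chance probability:

\begin{align*}
\sup_{\pi} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \ge \ell \big]
&= \sup_{\pi} \big[ -I(\ell, \pi) \big] \\
&= \sup_{\pi} \Big[ -\sup_{\theta} \big[ \theta\ell - \Gamma(\theta, \pi) \big] \Big] \\
&= \sup_{\pi} \inf_{\theta} \big[ \Gamma(\theta, \pi) - \theta\ell \big] \\
\text{(if we can invert sup and inf)} \quad &= \inf_{\theta} \sup_{\pi} \big[ \Gamma(\theta, \pi) - \theta\ell \big].
\end{align*}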

We thus expect that

\[
v_+(\ell) = \inf_{\theta} \big[ \Lambda_+(\theta) - \theta\ell \big], \tag{3.3}
\]

where $\Lambda_+$ is defined by

\[
\Lambda_+(\theta) = \sup_{\pi} \Gamma(\theta, \pi).
\]

In other words, we should have a duality relation between the value function $v_+$ of the
large deviations control problem and the value function $\Lambda_+$, which is known in the
mathematical finance literature as an ergodic risk-sensitive portfolio optimization
problem.
Let us now state rigorously the duality relation in an abstract (model-free) setting.
We first consider the upside chance large deviations probability, and define the
corresponding dual control problem:

\[
\Lambda_+(\theta) := \sup_{\pi \in \mathcal{A}} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big], \quad \theta \ge 0. \tag{3.4}
\]

We easily see from Hölder's inequality that $\Lambda_+$ is convex on $\mathbb{R}_+$. The following result
is due to [17].
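The convexity claim can be unpacked in a few lines. The following is a sketch of the standard argument rather than a step taken from the source; it writes $\Gamma(\theta, \pi)$ for the limsup appearing in (3.4) for a fixed strategy $\pi$, so that $\Lambda_+ = \sup_\pi \Gamma(\cdot, \pi)$.

```latex
% Sketch: convexity of \Lambda_+ via Hoelder's inequality with exponents 1/\lambda and 1/(1-\lambda).
% Notation (assumption): \Gamma(\theta,\pi) is the limsup in (3.4) for a fixed \pi.
\begin{align*}
&\text{Let } \theta_1, \theta_2 \ge 0,\ \lambda \in (0,1),\ \theta = \lambda\theta_1 + (1-\lambda)\theta_2. \text{ Then} \\
&\mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big]
 = \mathbb{E}\Big[ \big( e^{\theta_1 T \bar L^\pi_T} \big)^{\lambda} \big( e^{\theta_2 T \bar L^\pi_T} \big)^{1-\lambda} \Big]
 \le \Big( \mathbb{E}\big[ e^{\theta_1 T \bar L^\pi_T} \big] \Big)^{\lambda} \Big( \mathbb{E}\big[ e^{\theta_2 T \bar L^\pi_T} \big] \Big)^{1-\lambda}. \\
&\text{Taking } \tfrac{1}{T}\ln \text{ and } \limsup_{T\to\infty} \text{ (the limsup of a sum is at most the sum of the limsups):} \\
&\Gamma(\theta, \pi) \le \lambda\, \Gamma(\theta_1, \pi) + (1-\lambda)\, \Gamma(\theta_2, \pi)
 \le \lambda\, \Lambda_+(\theta_1) + (1-\lambda)\, \Lambda_+(\theta_2). \\
&\text{Taking the supremum over } \pi \in \mathcal{A} \text{ on the left gives }
 \Lambda_+(\theta) \le \lambda\, \Lambda_+(\theta_1) + (1-\lambda)\, \Lambda_+(\theta_2).
\end{align*}
```

Note that the same strategy $\pi$ appears on both sides of Hölder's inequality here, which is why no convexity assumption on $\mathcal{A}$ is needed in the upside case.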

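Theorem 3.1 Suppose that $\Lambda_+$ is finite and differentiable on $(0, \bar\theta)$ for some $\bar\theta \in
(0, \infty]$, and there exists $\hat\pi(\theta) \in \mathcal{A}$ solution to $\Lambda_+(\theta)$ for any $\theta \in (0, \bar\theta)$. Then, for all
$\ell < \Lambda_+'(\bar\theta)$, we have:

\[
v_+(\ell) = \inf_{\theta \in [0, \bar\theta)} \big[ \Lambda_+(\theta) - \theta\ell \big].
\]

Moreover, an optimal control for $v_+(\ell)$, when $\ell \in (\Lambda_+'(0), \Lambda_+'(\bar\theta))$, is

\[
\pi^{+,\ell} = \hat\pi(\theta(\ell)), \quad \text{with } \Lambda_+'(\theta(\ell)) = \ell,
\]

while a nearly-optimal control for $v_+(\ell) = 0$, when $\ell \le \Lambda_+'(0)$, is:

\[
\pi^{+(n)} = \hat\pi(\theta_n), \quad \text{with } \theta_n = \theta\Big( \Lambda_+'(0) + \frac{1}{n} \Big) \xrightarrow{\,n \to \infty\,} 0,
\]

in the sense that

\[
\lim_{n \to \infty} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^{\pi^{+(n)}}_T \ge \ell \big] = v_+(\ell).
\]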

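Proof Step 1: Let us consider the Fenchel-Legendre transform of the convex function
$\Lambda_+$ on $[0, \bar\theta)$:

\[
\Lambda_+^*(\ell) = \sup_{\theta \in [0, \bar\theta)} \big[ \theta\ell - \Lambda_+(\theta) \big], \quad \ell \in \mathbb{R}. \tag{3.5}
\]

Since $\Lambda_+$ is $C^1$ on $(0, \bar\theta)$, it is well-known (see e.g. Lemma 2.3.9 in [4]) that the
function $\Lambda_+^*$ is convex, nondecreasing and satisfies:

\[
\Lambda_+^*(\ell) =
\begin{cases}
\theta(\ell)\ell - \Lambda_+(\theta(\ell)), & \text{if } \Lambda_+'(0) < \ell < \Lambda_+'(\bar\theta), \\
0, & \text{if } \ell \le \Lambda_+'(0),
\end{cases} \tag{3.6}
\]

\[
\theta(\ell)\ell - \Lambda_+^*(\ell) > \theta(\ell)\ell' - \Lambda_+^*(\ell'), \quad \forall\, \Lambda_+'(0) < \ell < \Lambda_+'(\bar\theta),\ \forall\, \ell' \ne \ell, \tag{3.7}
\]

where $\theta(\ell) \in (0, \bar\theta)$ is s.t. $\Lambda_+'(\theta(\ell)) = \ell \in (\Lambda_+'(0), \Lambda_+'(\bar\theta))$. Moreover, $\Lambda_+^*$ is
continuous on $(-\infty, \Lambda_+'(\bar\theta))$.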
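Step 2: Upper bound. For all $\ell \in \mathbb{R}$, $\pi \in \mathcal{A}$, an application of Chebycheff's inequality
yields:

\[
\mathbb{P}[\bar L^\pi_T \ge \ell] \le \exp(-\theta\ell T)\, \mathbb{E}[\exp(\theta T \bar L^\pi_T)], \quad \forall\, \theta \in [0, \bar\theta),
\]

and so

\[
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}[\bar L^\pi_T \ge \ell] \le -\theta\ell + \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{E}[\exp(\theta T \bar L^\pi_T)], \quad \forall\, \theta \in [0, \bar\theta).
\]

By definitions of $\Lambda_+$ and $\Lambda_+^*$, we deduce:

\[
\sup_{\pi \in \mathcal{A}} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}[\bar L^\pi_T \ge \ell] \le -\Lambda_+^*(\ell). \tag{3.8}
\]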

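Step 3: Lower bound. Consider first the case $\ell \in (\Lambda_+'(0), \Lambda_+'(\bar\theta))$, and let us define
the probability measure $\mathbb{Q}_T$ on $(\Omega, \mathcal{F}_T)$ via:

\[
\frac{d\mathbb{Q}_T}{d\mathbb{P}} = \exp\Big[ \theta(\ell) L^{\pi^{+,\ell}}_T - \Gamma_T(\theta(\ell), \pi^{+,\ell}) \Big], \tag{3.9}
\]

where

\[
\Gamma_T(\theta, \pi) = \ln \mathbb{E}[\exp(\theta T \bar L^\pi_T)], \quad \theta \in [0, \bar\theta),\ \pi \in \mathcal{A}.
\]

For any $\varepsilon > 0$, we have:

\begin{align*}
\frac{1}{T} \ln \mathbb{P}\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big]
&= \frac{1}{T} \ln \int \frac{d\mathbb{P}}{d\mathbb{Q}_T}\, 1_{\{\ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon\}}\, d\mathbb{Q}_T \\
&\ge -\theta(\ell)(\ell + \varepsilon) + \frac{1}{T} \Gamma_T(\theta(\ell), \pi^{+,\ell})
 + \frac{1}{T} \ln \mathbb{Q}_T\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big],
\end{align*}

where we use (3.9) in the last inequality. By definition of the dual problem, this
yields:

\begin{align*}
\liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big]
&\ge -\theta(\ell)(\ell + \varepsilon) + \Lambda_+(\theta(\ell))
 + \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big] \\
&\ge -\Lambda_+^*(\ell) - \theta(\ell)\varepsilon
 + \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big], \tag{3.10}
\end{align*}

where the second inequality follows by the definition of $\Lambda_+^*$ (and actually holds with
equality due to (3.6)). We now show that:

\[
\liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big] = 0. \tag{3.11}
\]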
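Denote by $\tilde\Gamma_T$ the c.g.f. under $\mathbb{Q}_T$ of $L^{\pi^{+,\ell}}_T$. For all $\zeta \in \mathbb{R}$, we have by (3.9):

\begin{align*}
\tilde\Gamma_T(\zeta) &:= \ln \mathbb{E}^{\mathbb{Q}_T}\big[ \exp(\zeta L^{\pi^{+,\ell}}_T) \big] \\
&= \Gamma_T(\theta(\ell) + \zeta, \pi^{+,\ell}) - \Gamma_T(\theta(\ell), \pi^{+,\ell}).
\end{align*}

Therefore, by definition of the dual control problem (3.4), we have for all $\zeta \in
[-\theta(\ell), \bar\theta - \theta(\ell))$:

\[
\limsup_{T \to \infty} \frac{1}{T} \tilde\Gamma_T(\zeta) \le \Lambda_+(\theta(\ell) + \zeta) - \Lambda_+(\theta(\ell)). \tag{3.12}
\]

As in Step 2 of this proof, by Chebycheff's inequality, we have for all $\zeta \in [0, \bar\theta -
\theta(\ell))$:

\begin{align*}
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \ge \ell + \varepsilon \big]
&\le -\zeta(\ell + \varepsilon) + \limsup_{T \to \infty} \frac{1}{T} \tilde\Gamma_T(\zeta) \\
&\le -\zeta(\ell + \varepsilon) + \Lambda_+(\zeta + \theta(\ell)) - \Lambda_+(\theta(\ell)),
\end{align*}

where the second inequality follows from (3.12). We deduce

\begin{align*}
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \ge \ell + \varepsilon \big]
&\le -\sup\{ \zeta(\ell + \varepsilon) - \Lambda_+(\zeta) : \zeta \in [\theta(\ell), \bar\theta) \}
 - \Lambda_+(\theta(\ell)) + \theta(\ell)(\ell + \varepsilon) \\
&\le -\Lambda_+^*(\ell + \varepsilon) - \Lambda_+(\theta(\ell)) + \theta(\ell)(\ell + \varepsilon) \\
&= -\Lambda_+^*(\ell + \varepsilon) + \Lambda_+^*(\ell) + \varepsilon\theta(\ell), \tag{3.13}
\end{align*}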

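where the second inequality and the last equality follow from (3.6). Similarly, we
have for all $\zeta \in [-\theta(\ell), 0]$:

\begin{align*}
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \le \ell - \varepsilon \big]
&\le -\zeta(\ell - \varepsilon) + \limsup_{T \to \infty} \frac{1}{T} \tilde\Gamma_T(\zeta) \\
&\le -\zeta(\ell - \varepsilon) + \Lambda_+(\theta(\ell) + \zeta) - \Lambda_+(\theta(\ell)),
\end{align*}

and so:

\begin{align*}
\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \le \ell - \varepsilon \big]
&\le -\sup\{ \zeta(\ell - \varepsilon) - \Lambda_+(\zeta) : \zeta \in [0, \theta(\ell)] \}
 - \Lambda_+(\theta(\ell)) + \theta(\ell)(\ell - \varepsilon) \\
&\le -\Lambda_+^*(\ell - \varepsilon) + \Lambda_+^*(\ell) - \varepsilon\theta(\ell). \tag{3.14}
\end{align*}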

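By (3.13) and (3.14), we then get:

\begin{align*}
&\limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\Big[ \big\{ \bar L^{\pi^{+,\ell}}_T \le \ell - \varepsilon \big\} \cup \big\{ \bar L^{\pi^{+,\ell}}_T \ge \ell + \varepsilon \big\} \Big] \\
&\quad \le \max\Big\{ \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \ge \ell + \varepsilon \big];\ \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{Q}_T\big[ \bar L^{\pi^{+,\ell}}_T \le \ell - \varepsilon \big] \Big\} \\
&\quad \le \max\Big\{ -\Lambda_+^*(\ell + \varepsilon) + \Lambda_+^*(\ell) + \varepsilon\theta(\ell);\ -\Lambda_+^*(\ell - \varepsilon) + \Lambda_+^*(\ell) - \varepsilon\theta(\ell) \Big\} \\
&\quad < 0,
\end{align*}

where the strict inequality follows from (3.7). This implies that $\mathbb{Q}_T[\{ \bar L^{\pi^{+,\ell}}_T \le \ell - \varepsilon \}
\cup \{ \bar L^{\pi^{+,\ell}}_T \ge \ell + \varepsilon \}] \to 0$ and hence $\mathbb{Q}_T[\ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon] \to 1$ as $T$ goes to
infinity. In particular (3.11) is satisfied, and by sending $\varepsilon$ to zero in (3.10), we get for
any $\ell' < \ell < \Lambda_+'(\bar\theta)$:

\begin{align*}
\liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^{\pi^{+,\ell}}_T > \ell' \big]
&\ge \lim_{\varepsilon \to 0} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \ell - \varepsilon < \bar L^{\pi^{+,\ell}}_T < \ell + \varepsilon \big] \\
&\ge -\Lambda_+^*(\ell).
\end{align*}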

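By continuity of $\Lambda_+^*$ on $(-\infty, \Lambda_+'(\bar\theta))$, we obtain

\[
\liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^{\pi^{+,\ell}}_T \ge \ell \big] \ge -\Lambda_+^*(\ell).
\]

This last inequality combined with (3.8) proves the assertion for $v_+(\ell)$ when $\ell \in
(\Lambda_+'(0), \Lambda_+'(\bar\theta))$.
Now, consider the case $\ell \le \Lambda_+'(0)$, and define $\ell_n = \Lambda_+'(0) + \frac{1}{n}$, $\pi^{+(n)} = \hat\pi(\theta(\ell_n))$.
Then, by the same arguments as in (3.10) with $\ell_n \in (\Lambda_+'(0), \Lambda_+'(\bar\theta))$, we have

\begin{align*}
\liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^{\pi^{+(n)}}_T \ge \ell \big]
&\ge \lim_{\varepsilon \to 0} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \ell_n - \varepsilon < \bar L^{\pi^{+(n)}}_T < \ell_n + \varepsilon \big] \\
&\ge -\Lambda_+^*(\ell_n).
\end{align*}

By sending $n$ to infinity, together with the continuity of $\Lambda_+^*$, we get

\[
\liminf_{n \to \infty} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^{\pi^{+(n)}}_T \ge \ell \big] \ge -\Lambda_+^*(\Lambda_+'(0)) = 0,
\]

which, combined with (3.8), ends the proof.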

Remark 3.1 Theorem 3.1 shows that the upside chance large deviations control problem
can be solved via the resolution of the dual control problem. When the target
growth rate level $\ell$ is smaller than $\Lambda_+'(0)$, one can achieve almost surely over the
long term an average growth rate above $\ell$, in the sense that $v_+(\ell) = 0$, with a nearly
optimal portfolio strategy which does not depend on this level. When the target level
$\ell$ lies between $\Lambda_+'(0)$ and $\Lambda_+'(\bar\theta)$, the optimal strategy depends on this level and is
obtained from the optimal strategy for the dual control problem $\Lambda_+(\theta)$ at the point $\theta =
\theta(\ell)$. When $\Lambda_+'(\bar\theta) = \infty$, i.e. $\Lambda_+$ is steep, we have a complete resolution of the large
deviations control problem for all values of $\ell$. Otherwise, the problem remains open
for $\ell > \Lambda_+'(\bar\theta)$.

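Let us next consider the downside risk probability, and define the corresponding
dual control problem:

\[
\Lambda_-(\theta) := \inf_{\pi \in \mathcal{A}} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big], \quad \theta \le 0. \tag{3.15}
\]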

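Convexity of $\Lambda_-$ is not as straightforward as for $\Lambda_+$, and requires the additional
condition that the set of admissible controls $\mathcal{A}$ is convex. Indeed, under this condition,
we observe from the dynamics (2.1) that a convex combination of wealth processes is
a wealth process. Thus, for any $\theta_1, \theta_2 \in (-\infty, 0)$, $\lambda \in (0, 1)$, $\pi^1, \pi^2 \in \mathcal{A}$, there exists
$\pi \in \mathcal{A}$ such that:

\[
\frac{\lambda\theta_1}{\lambda\theta_1 + (1-\lambda)\theta_2} X^{\pi^1}_T + \frac{(1-\lambda)\theta_2}{\lambda\theta_1 + (1-\lambda)\theta_2} X^{\pi^2}_T = X^\pi_T.
\]

By concavity of the logarithm function, we then obtain

\[
\ln X^\pi_T \ge \frac{\lambda\theta_1}{\lambda\theta_1 + (1-\lambda)\theta_2} \ln X^{\pi^1}_T + \frac{(1-\lambda)\theta_2}{\lambda\theta_1 + (1-\lambda)\theta_2} \ln X^{\pi^2}_T,
\]

and so, by setting $\theta = \lambda\theta_1 + (1-\lambda)\theta_2 < 0$:

\[
\theta T \bar L^\pi_T \le \lambda\theta_1 T \bar L^{\pi^1}_T + (1-\lambda)\theta_2 T \bar L^{\pi^2}_T.
\]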

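Taking exponential and expectation on both sides of this relation, and using Hölder's
inequality, we get:

\[
\mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big] \le \Big( \mathbb{E}\big[ e^{\theta_1 T \bar L^{\pi^1}_T} \big] \Big)^{\lambda} \Big( \mathbb{E}\big[ e^{\theta_2 T \bar L^{\pi^2}_T} \big] \Big)^{1-\lambda}.
\]

Taking logarithm, dividing by $T$, sending $T$ to infinity, and since $\pi^1, \pi^2$ are arbitrary
in $\mathcal{A}$, we obtain by definition of $\Lambda_-$:

\[
\Lambda_-(\theta) \le \lambda\Lambda_-(\theta_1) + (1-\lambda)\Lambda_-(\theta_2),
\]

i.e. the convexity of $\Lambda_-$ on $\mathbb{R}_-$. Since $\Lambda_-(0) = 0$, the convex function $\Lambda_-$ is either
infinite on $(-\infty, 0)$ or finite on $\mathbb{R}_-$. We now state the duality relation for the downside
risk large deviations probability, whose proof can be found in [15].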

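Theorem 3.2 Suppose that $\Lambda_-$ is differentiable on $(-\infty, 0)$, and there exists $\hat\pi(\theta)
\in \mathcal{A}$ solution to $\Lambda_-(\theta)$ for any $\theta < 0$. Then, for all $\ell < \Lambda_-'(0)$, we have:

\[
v_-(\ell) = \inf_{\theta \le 0} \big[ \Lambda_-(\theta) - \theta\ell \big],
\]

and an optimal control for $v_-(\ell)$, when $\ell \in (\Lambda_-'(-\infty), \Lambda_-'(0))$, is:

\[
\pi^{-,\ell} = \hat\pi(\theta(\ell)), \quad \text{with } \Lambda_-'(\theta(\ell)) = \ell,
\]

while $v_-(\ell) = -\infty$ when $\ell < \Lambda_-'(-\infty)$.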

Remark 3.2 Theorem 3.2 shows that the downside risk large deviations control problem
can be solved via the resolution of the dual control problem. When the target
growth rate level $\ell$ is smaller than $\Lambda_-'(-\infty)$, one can find a portfolio strategy
such that the average growth rate almost never falls below $\ell$ over the long term, in the
sense that $v_-(\ell) = -\infty$. When the target level $\ell$ lies between $\Lambda_-'(-\infty)$ and $\Lambda_-'(0)$,
the optimal strategy depends on this level and is obtained from the optimal strategy
for the dual control problem $\Lambda_-(\theta)$ at the point $\theta = \theta(\ell)$.

Interpretation of the dual problem

For $\theta \ne 0$, the dual problem can be written as

\[
\frac{1}{\theta} \Lambda_\pm(\theta) = \sup_{\pi \in \mathcal{A}} \limsup_{T \to \infty} J_T(\theta, \pi),
\]

with

\[
J_T(\theta, \pi) := \frac{1}{\theta T} \ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big],
\]

and is known in the literature as a risk-sensitive control problem. A Taylor expansion
around $\theta = 0$ highlights the role played by the risk sensitivity parameter $\theta$:

\[
J_T(\theta, \pi) \simeq \mathbb{E}\big[ \bar L^\pi_T \big] + \frac{\theta T}{2} \mathrm{Var}(\bar L^\pi_T) + O(\theta^2).
\]

This relation shows that risk-sensitive control amounts to making the Markowitz
problem dynamic: one maximizes the expected average growth rate subject to
a constraint on its variance. The risk-sensitive portfolio criterion on a finite horizon $T$ has
been studied in [2, 3], and in the ergodic case $T \to \infty$ by [6, 16].
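As a quick sanity check of the mean-variance reading of the risk-sensitive criterion, here is a worked instance under the assumption, made only for illustration, that the average growth rate $\bar L^\pi_T$ is Gaussian with mean $m$ and variance $s^2/T$. The symbols $m$ and $s$ are not from the source; $J_T(\theta,\pi)$ denotes $\frac{1}{\theta T}\ln \mathbb{E}[e^{\theta T \bar L^\pi_T}]$.

```latex
% Illustration only. Assumption: \bar L^\pi_T \sim \mathcal{N}(m, s^2/T).
\begin{align*}
\ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big]
 &= \theta T\, m + \tfrac{1}{2}\, \theta^2 T^2\, \frac{s^2}{T}
  = \theta T\, m + \tfrac{1}{2}\, \theta^2 T\, s^2, \\
J_T(\theta, \pi)
 &= m + \frac{\theta}{2}\, s^2
  = \mathbb{E}\big[ \bar L^\pi_T \big] + \frac{\theta T}{2}\, \mathrm{Var}\big( \bar L^\pi_T \big).
\end{align*}
```

In this Gaussian case the second-order cumulant expansion is exact, with no remainder term. A negative $\theta$ penalizes the variance of the growth rate, while a positive $\theta$ rewards it.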
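Endogenous utility function
Recalling that the growth rate is the logarithm of the wealth process, the duality relation for
the upside large deviations probability means formally that for a large horizon $T$:

\begin{align*}
\mathbb{P}\big[ \bar L^{\pi^{+,\ell}}_T \ge \ell \big] &\simeq \exp\big( v_+(\ell) T \big) \\
&= \exp\big( \Lambda_+(\theta(\ell)) T - \theta(\ell)\ell T \big) \\
&\simeq \mathbb{E}\Big[ \big( X^{\pi^{+,\ell}}_T \big)^{\theta(\ell)} \Big] e^{-\theta(\ell)\ell T}, \quad \text{with } \theta(\ell) > 0.
\end{align*}

Similarly, we have for the downside risk probability:

\[
\mathbb{P}\big[ \bar L^{\pi^{-,\ell}}_T \le \ell \big] \simeq \mathbb{E}\Big[ \big( X^{\pi^{-,\ell}}_T \big)^{\theta(\ell)} \Big] e^{-\theta(\ell)\ell T}, \quad \text{with } \theta(\ell) < 0.
\]

In other words, the target growth rate level $\ell$ determines endogenously the risk
aversion parameter $1 - \theta(\ell)$ of an agent with Constant Relative Risk Aversion (CRRA)
utility function and large investment horizon. Moreover, the optimal strategy $\pi^{\pm,\ell}$
for $v_\pm(\ell)$ is expected to provide a good approximation for the solution to the CRRA
utility maximization problem

\[
\sup_{\pi \in \mathcal{A}} \mathbb{E}\big[ (X^\pi_T)^{\theta(\ell)} \big],
\]

with a large but finite time horizon.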

4 A Toy Model: The Black Scholes Case

We illustrate the results of the previous section in a toy example, namely the Black-Scholes
model, with one stock of price process

\[
dS_t = S_t \big( b\, dt + \sigma\, dW_t \big), \quad t \ge 0.
\]

We also consider an agent with constant proportion portfolio strategies. In other
words, the set of admissible controls $\mathcal{A}$ is equal to $\mathbb{R}$. Given a constant proportion $\pi
\in \mathbb{R}$ invested in the stock, and starting w.l.o.g. with unit capital, the average growth
rate portfolio of the agent is equal to

\[
\bar L^\pi_T = \frac{L^\pi_T}{T} = b\pi - \frac{\sigma^2\pi^2}{2} + \sigma\pi \frac{W_T}{T}.
\]

It follows that $\bar L^\pi_T$ is distributed according to a Gaussian law:

\[
\bar L^\pi_T \sim \mathcal{N}\Big( b\pi - \frac{\sigma^2\pi^2}{2},\ \frac{\sigma^2\pi^2}{T} \Big),
\]

and its (limiting) log-Laplace function is equal to

\[
\Gamma(\theta, \pi) := \big( \lim_{T \to \infty} \big) \frac{1}{T} \ln \mathbb{E}\big[ e^{\theta T \bar L^\pi_T} \big] = \theta \Big( b\pi - (1 - \theta) \frac{\sigma^2\pi^2}{2} \Big).
\]
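• Upside chance probability.
The dual control problem in the upside case is then given by

\[
\Lambda_+(\theta) = \sup_{\pi \in \mathbb{R}} \Gamma(\theta, \pi) =
\begin{cases}
\infty, & \text{if } \theta \ge 1, \\
\Gamma(\theta, \hat\pi(\theta)) = \dfrac{b^2}{2\sigma^2} \dfrac{\theta}{1 - \theta}, & \text{if } 0 \le \theta < 1,
\end{cases}
\]

with

\[
\hat\pi(\theta) = \frac{b}{\sigma^2(1 - \theta)}.
\]

Hence, $\Lambda_+$ is differentiable on $[0, 1)$ with $\Lambda_+'(0) = \frac{b^2}{2\sigma^2}$ and $\Lambda_+'(1) = \infty$, i.e. $\Lambda_+$ is
steep. From Theorem 3.1, the value function of the upside large deviations probability
is explicitly computed as:

\begin{align*}
v_+(\ell) &:= \sup_{\pi \in \mathbb{R}} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \ge \ell \big] \\
&= \inf_{0 \le \theta < 1} \big[ \Lambda_+(\theta) - \theta\ell \big] \\
&= \begin{cases}
0, & \text{if } \ell \le \Lambda_+'(0) = \dfrac{b^2}{2\sigma^2}, \\
-\Big( \sqrt{\Lambda_+'(0)} - \sqrt{\ell} \Big)^2, & \text{if } \ell > \Lambda_+'(0),
\end{cases}
\end{align*}

with an optimal strategy:

\[
\pi^{+,\ell} =
\begin{cases}
\dfrac{b}{\sigma^2}, & \text{if } \ell \le \Lambda_+'(0), \\
\sqrt{\dfrac{2\ell}{\sigma^2}}, & \text{if } \ell > \Lambda_+'(0).
\end{cases}
\]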

Notice that, when $\ell \le \Lambda_+'(0)$, we have not only a nearly optimal control as stated
in Theorem 3.1, but an optimal control given by $\pi^+ = b/\sigma^2$, which is precisely the
optimal portfolio for the classical Merton problem with logarithmic utility function.
Indeed, in this model, we have by the law of large numbers: $\bar L^{\pi^+}_T \to \frac{b^2}{2\sigma^2} = \Lambda_+'(0)$, as
$T$ goes to infinity, and so $\lim_{T \to \infty} \frac{1}{T} \ln \mathbb{P}[\bar L^{\pi^+}_T \ge \ell] = 0 = v_+(\ell)$. Otherwise, when
$\ell > \Lambda_+'(0)$, the optimal strategy depends on $\ell$, and the larger the target growth rate
level, the more one has to invest in the stock.
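The closed form for the upside value function can be checked numerically. The following is a minimal sketch, not part of the source: it minimizes $\Lambda_+(\theta) - \theta\ell$ over a grid on $[0,1)$, with $\Lambda_+(\theta) = \frac{b^2}{2\sigma^2}\frac{\theta}{1-\theta}$, and compares the result with $-(\sqrt{\Lambda_+'(0)} - \sqrt{\ell})^2$. The function names and the parameter values are illustrative assumptions, and $b > 0$ is assumed throughout.

```python
import math


def lambda_plus(theta, b, sigma):
    """Dual value b^2/(2 sigma^2) * theta/(1 - theta), valid for 0 <= theta < 1."""
    return b * b / (2.0 * sigma * sigma) * theta / (1.0 - theta)


def v_plus_closed_form(ell, b, sigma):
    """Closed-form upside value: 0 below the threshold, -(sqrt(ell) - sqrt(a))^2 above."""
    a = b * b / (2.0 * sigma * sigma)  # threshold, the derivative of lambda_plus at 0
    if ell <= a:
        return 0.0
    return -(math.sqrt(ell) - math.sqrt(a)) ** 2


def v_plus_numeric(ell, b, sigma, n=100_000):
    """Brute-force minimisation of lambda_plus(theta) - theta * ell on a grid of [0, 1)."""
    best = float("inf")
    for i in range(n):
        theta = i / n
        best = min(best, lambda_plus(theta, b, sigma) - theta * ell)
    return best


def optimal_proportion(ell, b, sigma):
    """Optimal constant proportion (assumes b > 0): pi_hat evaluated at theta(ell)."""
    a = b * b / (2.0 * sigma * sigma)
    if ell <= a:
        return b / sigma**2
    theta = 1.0 - math.sqrt(a / ell)  # solves a / (1 - theta)^2 = ell
    return b / (sigma**2 * (1.0 - theta))


if __name__ == "__main__":
    b, sigma = 0.08, 0.2  # hypothetical drift and volatility
    for ell in (0.05, 0.1, 0.2, 0.5):
        print(ell, v_plus_numeric(ell, b, sigma), v_plus_closed_form(ell, b, sigma),
              optimal_proportion(ell, b, sigma))
```

With these illustrative parameters the threshold $b^2/(2\sigma^2)$ equals 0.08, so the first target falls in the regime where the value is zero and the remaining ones in the regime where the optimal proportion is $\sqrt{2\ell}/\sigma$.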
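• Downside risk probability.
The dual control problem in the downside case is then given by

\[
\Lambda_-(\theta) = \inf_{\pi \in \mathbb{R}} \Gamma(\theta, \pi) = \Gamma(\theta, \hat\pi(\theta)) = \frac{b^2}{2\sigma^2} \frac{\theta}{1 - \theta}, \quad \theta \le 0,
\]

with

\[
\hat\pi(\theta) = \frac{b}{\sigma^2(1 - \theta)}.
\]

Hence, $\Lambda_-$ is differentiable on $\mathbb{R}_-$ with $\Lambda_-'(-\infty) = 0$ and $\Lambda_-'(0) = \frac{b^2}{2\sigma^2}$. From
Theorem 3.2, the value function of the downside large deviations probability is
explicitly computed as:

\begin{align*}
v_-(\ell) &:= \inf_{\pi \in \mathbb{R}} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{P}\big[ \bar L^\pi_T \le \ell \big] \\
&= \inf_{\theta \le 0} \big[ \Lambda_-(\theta) - \theta\ell \big] \\
&= \begin{cases}
-\infty, & \text{if } \ell < 0, \\
-\Big( \sqrt{\Lambda_-'(0)} - \sqrt{\ell} \Big)^2, & \text{if } 0 \le \ell \le \Lambda_-'(0) = \dfrac{b^2}{2\sigma^2},
\end{cases}
\end{align*}

with an optimal strategy:

\[
\pi^{-,\ell} = \sqrt{\frac{2\ell}{\sigma^2}}, \quad \text{if } 0 \le \ell \le \Lambda_-'(0).
\]

Moreover, when $\ell < 0$, and by choosing $\pi^- = 0$, we have $\bar L^{\pi^-}_T = 0$, so that
$\mathbb{P}[\bar L^{\pi^-}_T \le \ell] = 0$, and thus $v_-(\ell) = -\infty$. In other words, when the target growth
rate $\ell < 0$, by doing nothing, we have an optimal strategy for $v_-(\ell)$.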
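To see the downside rate emerge from the Gaussian law of the average growth rate, one can evaluate $\frac{1}{T}\ln \mathbb{P}[\bar L^\pi_T \le \ell]$ exactly for a large but finite $T$. The sketch below is an illustration only: the helper names and parameter values are assumptions, it requires $\pi \ne 0$ and $b > 0$, and the agreement with $-(\sqrt{b^2/(2\sigma^2)} - \sqrt{\ell})^2$ holds only up to a correction of order $\ln(T)/T$.

```python
import math


def downside_log_rate(ell, pi, b, sigma, T):
    """(1/T) ln P[Lbar_T <= ell] for a constant proportion pi != 0, using
    Lbar_T ~ N(b*pi - sigma^2 pi^2 / 2, sigma^2 pi^2 / T)."""
    mean = b * pi - 0.5 * sigma**2 * pi**2
    std = sigma * abs(pi) / math.sqrt(T)
    z = (ell - mean) / std
    # Standard normal CDF at z; underflows to 0 for very negative z (roughly z < -37).
    prob = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(prob) / T


def v_minus_closed_form(ell, b, sigma):
    """Closed-form downside value for 0 <= ell <= b^2/(2 sigma^2), and -inf for ell < 0."""
    a = b * b / (2.0 * sigma * sigma)
    if ell < 0.0:
        return -math.inf
    if ell > a:
        raise ValueError("closed form only stated for ell <= b^2 / (2 sigma^2)")
    return -(math.sqrt(a) - math.sqrt(ell)) ** 2


# Hypothetical parameters: threshold b^2/(2 sigma^2) = 0.08, target strictly inside (0, 0.08).
b, sigma, ell, T = 0.08, 0.2, 0.02, 10_000.0
pi_opt = math.sqrt(2.0 * ell) / sigma   # candidate optimal proportion from the text
pi_log = b / sigma**2                   # log-utility (Merton) proportion, for comparison
rate_opt = downside_log_rate(ell, pi_opt, b, sigma, T)
rate_log = downside_log_rate(ell, pi_log, b, sigma, T)

if __name__ == "__main__":
    print("finite-T rate, pi_opt:", rate_opt)
    print("finite-T rate, pi_log:", rate_log)
    print("closed-form v_minus  :", v_minus_closed_form(ell, b, sigma))
```

The comparison with the log-utility proportion is only meant to show that the proportion $\sqrt{2\ell}/\sigma$ yields a faster decay of the shortfall probability for this particular target.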

Remark 4.1 The above direct calculations rely on the fact that we restrict the portfolio
$\pi$ to be a constant proportion. Actually, the explicit forms of the value function and
optimal strategy remain the same if we allow a priori portfolio strategies $\pi \in \mathcal{A}$ to
change over time based on the available information, i.e. to be $\mathbb{F}$-predictable. This
requires more advanced tools from stochastic control and PDEs, to be presented in
the sequel in a more general framework.

5 Factor Model

We consider a market model with one riskless asset price $S^0 = 1$, and $d$ stocks of
price process $S$ governed by

\begin{align*}
dS_t &= \mathrm{diag}(S_t) \big( b(Y_t)\, dt + \sigma(Y_t)\, dW_t \big), \\
dY_t &= \eta(Y_t)\, dt + \gamma(Y_t)\, dW_t,
\end{align*}

where $Y$ is a factor process valued in $\mathbb{R}^m$, and $W$ is a $(d + m)$-dimensional standard
Brownian motion. The coefficients $b, \sigma, \eta, \gamma$ are assumed to satisfy regularity
conditions ensuring existence of a unique strong solution to the above stochastic
differential equation, and $\sigma$ is also of full rank, i.e. the $d \times d$ matrix $\sigma\sigma^\top$ is invertible.
A portfolio strategy $\pi$ is an $\mathbb{R}^d$-valued adapted process, representing the fraction
of wealth invested in the $d$ stocks. The admissibility condition for $\pi$ in $\mathcal{A}$ will be
made precise later, but for the moment $\pi$ is required to satisfy the integrability conditions:

\[
\int_0^T |\pi_t^\top b(Y_t)|\, dt + \int_0^T |\pi_t^\top \sigma(Y_t)|^2\, dt < \infty, \quad \text{a.s. for all } T > 0.
\]

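The growth rate portfolio is then given by:

\[
L^\pi_T = \int_0^T \Big( \pi_t^\top b(Y_t) - \frac{\pi_t^\top \sigma\sigma^\top(Y_t) \pi_t}{2} \Big)\, dt + \int_0^T \pi_t^\top \sigma(Y_t)\, dW_t.
\]

For any $\theta \in \mathbb{R}$ and $\pi$, we compute the log-Laplace function of the growth rate
portfolio:

\begin{align*}
\Gamma_T(\theta, \pi) &:= \ln \mathbb{E}\big[ e^{\theta L^\pi_T} \big] \\
&= \ln \mathbb{E}\Big[ \mathcal{E}\Big( \int_0^{\cdot} \theta \pi_t^\top \sigma(Y_t)\, dW_t \Big)_T\, e^{\theta \int_0^T f(\theta, Y_t, \pi_t)\, dt} \Big],
\end{align*}

where $\mathcal{E}(\cdot)$ denotes the Doléans-Dade exponential, and $f$ is the function:

\[
f(\theta, y, \pi) = \pi^\top b(y) - \frac{1 - \theta}{2} \pi^\top \sigma\sigma^\top(y) \pi.
\]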
We now impose the admissibility condition that $\pi$ lies in $\mathcal{A}$ if the Doléans-Dade local
martingale $\big( \mathcal{E}\big( \int_0^{\cdot} \theta \pi_u^\top \sigma(Y_u)\, dW_u \big)_t \big)_{0 \le t \le T}$ is a true martingale for any $T > 0$, which is
ensured, for instance, by the Novikov condition. In this case, this Doléans-Dade
exponential defines a probability measure $\mathbb{Q}^\pi$ equivalent to $\mathbb{P}$ on $(\Omega, \mathcal{F}_T)$, and we
have:

\[
\Gamma_T(\theta, \pi) = \ln \mathbb{E}^{\mathbb{Q}^\pi}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\, dt \Big) \Big],
\]

where $Y$ is governed under $\mathbb{Q}^\pi$ by

\[
dY_t = \big( \eta(Y_t) + \theta \gamma(Y_t) \sigma^\top(Y_t) \pi_t \big)\, dt + \gamma(Y_t)\, dW^\pi_t,
\]

with $W^\pi$ a $\mathbb{Q}^\pi$-Brownian motion from Girsanov's theorem.

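We then consider the dual control problems:

• Upside chance: for $\theta \ge 0$,

\[
\Lambda_+(\theta) = \sup_{\pi \in \mathcal{A}} \limsup_{T \to \infty} \frac{1}{T} \ln \mathbb{E}^{\mathbb{Q}^\pi}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\, dt \Big) \Big].
\]

• Downside risk: for $\theta \le 0$,

\[
\Lambda_-(\theta) = \inf_{\pi \in \mathcal{A}} \liminf_{T \to \infty} \frac{1}{T} \ln \mathbb{E}^{\mathbb{Q}^\pi}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\, dt \Big) \Big].
\]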

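These problems are known in the literature as ergodic risk-sensitive control problems,
and are studied by dynamic programming methods in [1, 5, 14]. Let us now formally
derive the ergodic equations associated to these risk-sensitive control problems. We
consider the finite horizon risk-sensitive stochastic control problems:

\begin{align*}
u_+(T, y; \theta) &= \sup_{\pi \in \mathcal{A}} \mathbb{E}^{\mathbb{Q}^\pi}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\, dt \Big) \,\Big|\, Y_0 = y \Big], \quad \theta \ge 0, \\
u_-(T, y; \theta) &= \inf_{\pi \in \mathcal{A}} \mathbb{E}^{\mathbb{Q}^\pi}\Big[ \exp\Big( \theta \int_0^T f(\theta, Y_t, \pi_t)\, dt \Big) \,\Big|\, Y_0 = y \Big], \quad \theta \le 0,
\end{align*}

and by using the formal substitution:

\[
\ln u_\pm(T, y; \theta) \simeq \Lambda_\pm(\theta) T + \varphi_\pm(y; \theta), \quad \text{for large } T,
\]

in the corresponding Hamilton-Jacobi-Bellman (HJB) equations for $u_\pm$:

\[
\frac{\partial u_\pm}{\partial T} = \sup_{\pi \in \mathbb{R}^d} \Big[ \theta f(\theta, y, \pi) u_\pm + \big( \eta(y) + \theta \gamma(y) \sigma^\top(y) \pi \big)^\top D_y u_\pm + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D^2_y u_\pm \big) \Big],
\]

we obtain the ergodic HJB equation for the pair $(\Lambda_\pm(\theta), \varphi_\pm(\cdot, \theta))$ as: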

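\begin{align*}
\Lambda(\theta) ={}& \eta(y)^\top D_y\varphi + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D^2_y\varphi \big) + \frac{1}{2} \big| \gamma^\top(y) D_y\varphi \big|^2 \\
&+ \theta \sup_{\pi \in \mathbb{R}^d} \Big[ \pi^\top \big( b(y) + \sigma(y)\gamma^\top(y) D_y\varphi \big) - \frac{1 - \theta}{2} \pi^\top \sigma\sigma^\top(y) \pi \Big],
\end{align*}

which is well-defined for $\theta < 1$. In the above equation $\Lambda(\theta)$ is a candidate for $\Lambda_\pm(\theta)$
while $\varphi$ is a candidate solution for $\varphi_\pm$. This can be rewritten as a semi-linear ergodic
PDE with quadratic growth in the gradient:

\begin{align*}
\Lambda(\theta) ={}& \Big( \eta(y) + \frac{\theta}{1 - \theta} \gamma\sigma^\top(\sigma\sigma^\top)^{-1} b(y) \Big) \cdot D_y\varphi + \frac{1}{2} \mathrm{tr}\big( \gamma\gamma^\top(y) D^2_y\varphi \big) \\
&+ \frac{1}{2} D_y\varphi^\top \gamma(y) \Big( I_{d+m} + \frac{\theta}{1 - \theta} \sigma^\top(\sigma\sigma^\top)^{-1}\sigma(y) \Big) \gamma^\top(y) D_y\varphi \\
&+ \frac{\theta}{2(1 - \theta)} b^\top(\sigma\sigma^\top)^{-1} b(y), \tag{5.1}
\end{align*}

and a candidate for optimal feedback control of the dual problem:

\[
\hat\pi(y; \theta) = \frac{1}{1 - \theta} (\sigma\sigma^\top)^{-1}(y) \big( b(y) + \sigma\gamma^\top(y) D_y\varphi(y; \theta) \big). \tag{5.2}
\]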
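For completeness, the supremum over $\pi$ in the ergodic HJB equation can be computed from a first-order condition. The following worked step is a sketch, assuming $\theta < 1$ so that the quadratic form in $\pi$ is strictly concave, and writing $q := b(y) + \sigma(y)\gamma^\top(y) D_y\varphi$, a shorthand that is not used in the source.

```latex
% Sketch of the pointwise maximisation; q is an illustrative shorthand, theta < 1 assumed.
\begin{align*}
&\sup_{\pi \in \mathbb{R}^d} \Big[ \pi^\top q - \frac{1-\theta}{2}\, \pi^\top \sigma\sigma^\top \pi \Big]:
 \qquad q - (1-\theta)\, \sigma\sigma^\top \hat\pi = 0
 \;\Longrightarrow\;
 \hat\pi = \frac{1}{1-\theta}\, (\sigma\sigma^\top)^{-1} q, \\
&\hat\pi^\top q - \frac{1-\theta}{2}\, \hat\pi^\top \sigma\sigma^\top \hat\pi
 = \frac{1}{1-\theta}\, q^\top (\sigma\sigma^\top)^{-1} q - \frac{1}{2(1-\theta)}\, q^\top (\sigma\sigma^\top)^{-1} q
 = \frac{1}{2(1-\theta)}\, q^\top (\sigma\sigma^\top)^{-1} q, \\
&\theta \cdot \frac{1}{2(1-\theta)}\, q^\top (\sigma\sigma^\top)^{-1} q
 = \frac{\theta}{2(1-\theta)}\, b^\top (\sigma\sigma^\top)^{-1} b
 + \frac{\theta}{1-\theta}\, \big( \gamma\sigma^\top (\sigma\sigma^\top)^{-1} b \big) \cdot D_y\varphi
 + \frac{\theta}{2(1-\theta)}\, D_y\varphi^\top \gamma\sigma^\top (\sigma\sigma^\top)^{-1} \sigma\gamma^\top D_y\varphi .
\end{align*}
```

The three terms on the last line are the zeroth-order term, the extra drift term and the extra gradient-quadratic term of the semi-linear form of the equation; adding the remaining $\frac{1}{2}|\gamma^\top D_y\varphi|^2$ produces the factor $I_{d+m} + \frac{\theta}{1-\theta}\sigma^\top(\sigma\sigma^\top)^{-1}\sigma$.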

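We now face the questions:

• Existence of a pair solution $(\Lambda(\theta), \varphi(\cdot, \theta))$ to the ergodic PDE (5.1)?
• Do we have $\Lambda(\theta) = \Lambda_\pm(\theta)$, and what is the domain of $\Lambda$?
We give some assumptions, which allow us to answer the above issues.
(H1) $b$, $\sigma$, $\eta$ and $\gamma$ are smooth $C^2$ and globally Lipschitz.
(H2) $\sigma\sigma^\top(y)$ and $\gamma\gamma^\top(y)$ are uniformly elliptic: there exist $\delta_1, \delta_2 > 0$ s.t.

\[
\delta_1 |\xi|^2 \le \xi^\top \sigma\sigma^\top(y) \xi \le \delta_2 |\xi|^2, \quad \forall\, \xi, y \in \mathbb{R}^m,
\]
\[
\delta_1 |\xi|^2 \le \xi^\top \gamma\gamma^\top(y) \xi \le \delta_2 |\xi|^2, \quad \forall\, \xi, y \in \mathbb{R}^m.
\]

(H3) There exist $c_1 > 0$ and $c_2 \ge 0$ s.t.

\[
b^\top(\sigma\sigma^\top)^{-1} b(y) \ge c_1 |y|^2 - c_2, \quad \forall\, y \in \mathbb{R}^m.
\]

(H4) Stability condition: there exist $c_3 > 0$ and $c_4 \ge 0$ s.t.

\[
\big( \eta(y) - \gamma\sigma^\top(\sigma\sigma^\top)^{-1} b(y) \big) \cdot y \le -c_3 |y|^2 + c_4.
\]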
According to [1] (see also [15, 20]), the next result states the existence of a smooth
solution to the ergodic equation.
Proposition 5.1 Under (H1)–(H4), there exists for any $\theta < 1$ a solution
$(\Lambda(\theta), \varphi(\cdot; \theta))$, with $\varphi(\cdot; \theta)$ $C^2$, to the ergodic HJB equation s.t.:
• For $\theta < 0$, $\varphi(\cdot; \theta)$ is upper-bounded:

\[
\varphi(y; \theta) \longrightarrow -\infty, \quad \text{as } |y| \to \infty,
\]

• For $\theta \in (0, 1)$, $\varphi(\cdot; \theta)$ is lower-bounded:

\[
\varphi(y; \theta) \longrightarrow \infty, \quad \text{as } |y| \to \infty,
\]

and

\[
\big| D_y\varphi(y; \theta) \big| \le C_\theta (1 + |y|).
\]
We now relate a solution to the ergodic equation to the dual risk-sensitive con-
trol problem. In other words, this means the convergence of the finite horizon risk-
sensitive stochastic control to the component  of the ergodic equation. We distin-
guish the downside and upside cases.
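• Downside risk: In this case, it is shown in [15] that for all $\theta < 0$, the solution
$(\Lambda(\theta), \varphi(\cdot;\theta))$ to (5.1), with $\varphi(\cdot;\theta)$ $C^2$ and upper-bounded, is unique (up to an additive
constant for $\varphi(\cdot;\theta)$), and we have:

$$\Lambda(\theta) = \Lambda_-(\theta), \quad \theta < 0.$$

Moreover, there is an admissible optimal feedback control $\hat\pi(\cdot;\theta)$ for $\Lambda_-(\theta)$ given
by (5.2), for which the factor process $Y$ is ergodic under $\mathbb{Q}^{\hat\pi}$. It is also proved
in [15] that $\Lambda = \Lambda_-$ is differentiable on $(-\infty, 0)$. Therefore, from Theorem 3.2, the
solution to the downside risk large deviations probability is given by:

$$v_-(\ell) = \inf_{\theta \le 0}\big[\Lambda(\theta) - \theta\ell\big], \quad \ell < \Lambda'(0),$$

with an optimal control:

$$\pi_t^{-,\ell} = \hat\pi(Y_t;\theta(\ell)), \quad \Lambda'(\theta(\ell)) = \ell, \quad \forall \ell \in (\Lambda'(-\infty), \Lambda'(0)),$$

while $v_-(\ell) = -\infty$ for $\ell < \Lambda'(-\infty)$.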


• Upside chance: In this case, $0 < \theta < 1$, the solution
$(\Lambda(\theta), \varphi(\cdot;\theta))$ to the ergodic equation, with $\varphi(\cdot;\theta)$ $C^2$ and lower-bounded, is not unique, even up
to an additive constant, as pointed out in [6]. In general, we only have a verification
type result, which states that if the process $Y$ is ergodic under $\mathbb{Q}^{\hat\pi}$, then

$$\Lambda(\theta) = \Lambda_+(\theta),$$

and $\hat\pi(\cdot;\theta)$ is an optimal feedback control for $\Lambda_+(\theta)$.


In the next paragraph, we consider a linear factor model for which explicit calcu-
lations can be derived.

5.1 Linear Gaussian Factor Model

We consider the linear factor model:

$$dS_t = \mathrm{diag}(S_t)\big((B_1Y_t + B_0)\,dt + \sigma\,dW_t\big) \quad \text{in } \mathbb{R}^d,$$

$$dY_t = KY_t\,dt + \gamma\,dW_t \quad \text{in } \mathbb{R}^m,$$

with $K$ a stable $m \times m$ matrix, $B_1$ a constant $d \times m$ matrix, $B_0$ a non-zero vector in
$\mathbb{R}^d$, $\sigma$ a $d \times (d+m)$ matrix of rank $d$, and $\gamma$ a nonzero $m \times (d+m)$ matrix. We
are searching for a candidate solution to the ergodic equation (5.1) in the quadratic
form:

$$\varphi(y;\theta) = \frac12\,C(\theta)y\cdot y + D(\theta)\cdot y, \quad y \in \mathbb{R}^m,$$

for some $m \times m$ matrix $C(\theta)$ and some vector $D(\theta) \in \mathbb{R}^m$. Plugging this form of $\varphi$ into (5.1), we find
that $C(\theta)$ must solve the algebraic Riccati equation:

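$$\frac12\,C(\theta)\,\gamma\Big(I_{d+m} + \frac{\theta}{1-\theta}\,\sigma^\top(\sigma\sigma^\top)^{-1}\sigma\Big)\gamma^\top C(\theta) + \Big(K + \frac{\theta}{1-\theta}\,\gamma\sigma^\top(\sigma\sigma^\top)^{-1}B_1\Big)^{\!\top} C(\theta) + \frac12\,\frac{\theta}{1-\theta}\,B_1^\top(\sigma\sigma^\top)^{-1}B_1 = 0, \tag{5.3}$$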

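while $D(\theta)$ is determined by

$$\Big(K + \frac{\theta}{1-\theta}\,\gamma\sigma^\top(\sigma\sigma^\top)^{-1}B_1 + \gamma\Big(I_{d+m} + \frac{\theta}{1-\theta}\,\sigma^\top(\sigma\sigma^\top)^{-1}\sigma\Big)\gamma^\top C(\theta)\Big)^{\!\top} D(\theta) + \frac{\theta}{1-\theta}\,\big(\sigma\gamma^\top C(\theta) + B_1\big)^\top(\sigma\sigma^\top)^{-1}B_0 = 0.$$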

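Then, $\Lambda(\theta)$ is given by:

$$\Lambda(\theta) = \frac12\,\mathrm{tr}\big(\gamma\gamma^\top C(\theta)\big) + \frac12\,D(\theta)^\top\gamma\Big(I_{d+m} + \frac{\theta}{1-\theta}\,\sigma^\top(\sigma\sigma^\top)^{-1}\sigma\Big)\gamma^\top D(\theta) + \frac{\theta}{1-\theta}\,B_0^\top(\sigma\sigma^\top)^{-1}\sigma\gamma^\top D(\theta) + \frac12\,\frac{\theta}{1-\theta}\,B_0^\top(\sigma\sigma^\top)^{-1}B_0,$$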

and a candidate for the optimal feedback control is:

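$$\hat\pi(y;\theta) = \frac{1}{1-\theta}\,(\sigma\sigma^\top)^{-1}\Big[\big(B_1 + \sigma\gamma^\top C(\theta)\big)y + B_0 + \sigma\gamma^\top D(\theta)\Big].$$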

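In [6], it is shown that there exists some positive $\bar\theta$ small enough s.t. for $\theta < \bar\theta$, there
exists a solution $C(\theta)$ to the Riccati equation (5.3) s.t. $Y$ is ergodic under $\mathbb{Q}^{\hat\pi}$, and
so by the verification theorem, $\Lambda(\theta) = \Lambda_\pm(\theta)$. In the one-dimensional asset and factor
model, as studied in [17], we obtain more precise results. Indeed, in this case $d = m = 1$,
the Riccati equation is a second-order polynomial equation in $C(\theta)$, which
admits two explicit roots given by:

$$C_\pm(\theta) = -\frac{K}{|\gamma|^2}\;\frac{1 - \theta\Big(1 - \rho\,\dfrac{|\gamma|B_1}{K|\sigma|}\Big) \pm \sqrt{(1-\theta)(1-\theta\beta)}}{1 - \theta(1-\rho^2)},$$

for all $\theta \le \bar\theta$, with

$$\bar\theta = \frac{1}{\beta} \wedge 1, \qquad \beta = 1 - \rho^2 + \Big(\rho - \frac{|\gamma|B_1}{K|\sigma|}\Big)^2 > 0,$$

where $|\gamma|$ (resp. $|\sigma|$) is the Euclidean norm of $\gamma$ (resp. $\sigma$), and $\rho \in [-1,1]$ is the
correlation between $S$ and $Y$, i.e. $\rho = \frac{\gamma\sigma^\top}{|\gamma||\sigma|}$. Actually, only the solution $C(\theta) = C_-(\theta)$
is relevant in the sense that for this root, $Y$ is ergodic under $\mathbb{Q}^{\hat\pi}$, and thus by the
verification theorem:

$$\Lambda_\pm(\theta) = \Lambda(\theta) = \frac12|\gamma|^2C_-(\theta) + \frac12|\gamma|^2D(\theta)^2\Big(1 + \frac{\theta}{1-\theta}\rho^2\Big) + \frac{\theta}{1-\theta}\,\rho|\gamma|D(\theta)\frac{B_0}{|\sigma|} + \frac12\,\frac{\theta}{1-\theta}\,\frac{B_0^2}{|\sigma|^2}, \quad \theta < \bar\theta,$$

where

$$D(\theta) = -\frac{B_0}{K|\sigma|}\;\frac{\theta\Big(\rho|\gamma|C_-(\theta) + \dfrac{B_1}{|\sigma|}\Big)}{\sqrt{(1-\theta)(1-\theta\beta)}},$$

and with optimal control for $\Lambda_\pm(\theta)$ given by:

$$\hat\pi(y;\theta) = \frac{1}{(1-\theta)|\sigma|}\Big[\Big(\frac{B_1}{|\sigma|} + \rho|\gamma|C_-(\theta)\Big)y + \frac{B_0}{|\sigma|} + \rho|\gamma|D(\theta)\Big].$$

Moreover, it is also proved in [17] that

$$\Lambda'(0) = \frac{B_0^2}{2|\sigma|^2} - \frac{B_1^2|\gamma|^2}{4|\sigma|^2K} > 0$$

(recall that $K < 0$), and the function $\Lambda$ is steep, i.e.

$$\lim_{\theta\uparrow\bar\theta}\Lambda'(\theta) = \infty.$$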
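The square root $\sqrt{(1-\theta)(1-\theta\beta)}$ in the roots $C_\pm(\theta)$ can be checked directly from (5.3). The following is a worked sketch of that step for $d = m = 1$; the shorthand $a$ is introduced here for readability and does not appear in the original text.

```latex
% Scalar case d = m = 1. Shorthand (introduced here): a := |\gamma| B_1 / (K |\sigma|).
% Multiplying (5.3) by (1 - \theta) and using \gamma\sigma^\top = \rho|\gamma||\sigma| gives
\begin{align*}
 &\tfrac12\,|\gamma|^2\bigl(1-\theta(1-\rho^2)\bigr)\,C^2
  + K\bigl(1-\theta(1-\rho a)\bigr)\,C
  + \tfrac12\,\theta\,\frac{B_1^2}{|\sigma|^2} = 0, \\[4pt]
 &\frac{\text{discriminant}}{K^2}
  = \bigl(1-\theta+\theta\rho a\bigr)^2 - \theta a^2\bigl(1-\theta(1-\rho^2)\bigr) \\
 &\qquad\qquad\;\;
  = (1-\theta)\bigl[\,1-\theta\,(1-2\rho a+a^2)\bigr]
  = (1-\theta)(1-\theta\beta), \\[4pt]
 &\beta = 1-2\rho a+a^2 = 1-\rho^2+(\rho-a)^2 .
\end{align*}
```

In particular the roots are real exactly when $(1-\theta)(1-\theta\beta) \ge 0$, which is consistent with the restriction $\theta \le \bar\theta = \frac{1}{\beta} \wedge 1$.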

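From Theorems 3.1 and 3.2, the solutions to the upside chance and downside risk
large deviations probability are given by:

$$v_+(\ell) = \inf_{0 \le \theta < \bar\theta}\big[\Lambda(\theta) - \theta\ell\big], \quad \ell \in \mathbb{R},$$

$$v_-(\ell) = \inf_{\theta \le 0}\big[\Lambda(\theta) - \theta\ell\big], \quad \ell < \Lambda'(0),$$

with optimal control and nearly optimal control for $v_+(\ell)$:

$$\pi_t^{+,\ell} = \hat\pi(Y_t;\theta(\ell)), \quad \Lambda'(\theta(\ell)) = \ell, \quad \text{when } \ell > \Lambda'(0),$$

$$\pi_t^{+(n)} = \hat\pi(Y_t;\theta_n), \quad \text{with } \theta_n = \theta\big(\Lambda'(0) + \tfrac1n\big) \xrightarrow{n\to\infty} 0, \quad \text{when } \ell \le \Lambda'(0),$$

and optimal control for $v_-(\ell)$:

$$\pi_t^{-,\ell} = \hat\pi(Y_t;\theta(\ell)), \quad \Lambda'(\theta(\ell)) = \ell, \quad \forall \ell \in (\Lambda'(-\infty), \Lambda'(0)).$$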

5.2 Examples

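• Black-Scholes model. This corresponds to the case where $B_1 = 0$. Then,
$\beta = \bar\theta = 1$, $C_-(\theta) = D(\theta) = 0$, and so

$$\Lambda_\pm(\theta) = \Lambda(\theta) = \frac12\,\frac{\theta}{1-\theta}\,\frac{B_0^2}{|\sigma|^2}, \quad \forall \theta < 1.$$

We thus obtain the same optimal strategy as described in Sect. 4.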
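As a rough numerical illustration of the duality formula in this case, the sketch below minimizes $\Lambda(\theta) - \theta\ell$ over a grid of $\theta \in [0,1)$ and compares the result with the closed form $-(\sqrt{\ell} - \sqrt{c})^2$, $c = B_0^2/(2|\sigma|^2)$. That closed form is obtained here by solving the first-order condition $\Lambda'(\theta) = \ell$ and is not stated in the text; the function names and the grid size are illustrative assumptions.

```python
import math


def lambda_bs(theta, b0, sigma_norm):
    """Lambda(theta) in the Black-Scholes case B_1 = 0, for theta < 1."""
    return 0.5 * theta / (1.0 - theta) * b0 ** 2 / sigma_norm ** 2


def v_plus_grid(ell, b0, sigma_norm, n_grid=100_000):
    """Grid approximation of inf over theta in [0, 1) of Lambda(theta) - theta * ell."""
    best = float("inf")
    for i in range(n_grid):
        theta = i / n_grid  # grid points 0, 1/n, ..., (n-1)/n, all strictly below 1
        best = min(best, lambda_bs(theta, b0, sigma_norm) - theta * ell)
    return best


def v_plus_closed(ell, b0, sigma_norm):
    """Closed form derived here (an assumption of this sketch, not quoted from the text)."""
    c = 0.5 * b0 ** 2 / sigma_norm ** 2  # c equals Lambda'(0)
    if ell <= c:
        return 0.0
    return -(math.sqrt(ell) - math.sqrt(c)) ** 2
```

For targets $\ell \le \Lambda'(0)$ the infimum is attained at $\theta = 0$ and equals zero, which the grid search reproduces exactly.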


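• Platen-Rebolledo model. In this model, the logarithm of the stock price $S$ is
governed by an Ornstein-Uhlenbeck process $Y$, and this corresponds to the case
where $B_1 = K < 0$, $B_0 = \frac12|\gamma|^2 > 0$, $\gamma = \sigma$, and thus $\rho = 1$. Then, $\beta = 0$, $\bar\theta = 1$,

$$C_-(\theta) = \frac{|K|}{|\sigma|^2}\big(1 - \sqrt{1-\theta}\big), \qquad D(\theta) = -\frac12\,\theta,$$

and so

$$\Lambda(\theta) = \frac{|K|}{2}\big(1 - \sqrt{1-\theta}\big) + \theta\,\frac{|\sigma|^2}{8}, \quad \theta < 1,$$

$$\Lambda'(0) = \bar\ell := \frac{|K|}{4} + \frac{|\sigma|^2}{8}, \qquad \Lambda'(-\infty) = \underline{\ell} := \frac{|\sigma|^2}{8},$$

$$\theta(\ell) = 1 - \Big(\frac{\bar\ell - \underline{\ell}}{\ell - \underline{\ell}}\Big)^2, \quad \forall \ell > \underline{\ell}.$$

The solution to the upside chance large deviations probability is then given by:

$$v_+(\ell) = \begin{cases} -\dfrac{(\ell - \bar\ell)^2}{\ell - \bar\ell + \frac{|K|}{4}}, & \text{if } \ell > \bar\ell, \\[6pt] 0, & \text{if } \ell \le \bar\ell, \end{cases}$$

with optimal (resp. nearly optimal) portfolio strategy:

$$\pi_t^{+,\ell} = \frac{K - 4(\ell - \bar\ell)}{|\sigma|^2}\,Y_t + \frac12, \quad \text{if } \ell > \bar\ell,$$

$$\pi_t^{+(n)} = \frac{K - 1/n}{|\sigma|^2}\,Y_t + \frac12, \quad \text{if } \ell \le \bar\ell.$$

The solution to the downside risk large deviations probability is given by:

$$v_-(\ell) = \begin{cases} -\dfrac{(\bar\ell - \ell)^2}{\ell - \underline{\ell}}, & \text{if } \underline{\ell} < \ell \le \bar\ell, \\[6pt] -\infty, & \text{if } \ell \le \underline{\ell}, \end{cases}$$

with optimal portfolio strategy:

$$\pi_t^{-,\ell} = -\frac{4(\ell - \underline{\ell})}{|\sigma|^2}\,Y_t + \frac12, \quad \text{if } \underline{\ell} < \ell \le \bar\ell.$$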
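The closed forms of this example can be cross-checked against the general one-dimensional formulas of Sect. 5.1. The sketch below is a minimal check under stated assumptions: it codes $C_-(\theta)$, $D(\theta)$ and $\Lambda(\theta)$ for $d = m = 1$, with `sig` and `gam` standing for the norms $|\sigma|$ and $|\gamma|$, and approximates the infimum defining $v_+$ on a grid. All function names, the parameter values in the usage note and the grid size are illustrative and not taken from the text.

```python
import math


def c_minus(theta, K, B1, sig, gam, rho):
    """Root C_-(theta) of the scalar Riccati equation; sig, gam are |sigma|, |gamma|."""
    a = gam * B1 / (K * sig)
    beta = 1.0 - rho ** 2 + (rho - a) ** 2
    root = math.sqrt((1.0 - theta) * (1.0 - theta * beta))
    num = 1.0 - theta * (1.0 - rho * a) - root
    return -K / gam ** 2 * num / (1.0 - theta * (1.0 - rho ** 2))


def d_coef(theta, K, B0, B1, sig, gam, rho):
    """Coefficient D(theta) associated with the root C_-(theta)."""
    a = gam * B1 / (K * sig)
    beta = 1.0 - rho ** 2 + (rho - a) ** 2
    root = math.sqrt((1.0 - theta) * (1.0 - theta * beta))
    C = c_minus(theta, K, B1, sig, gam, rho)
    return -B0 / (K * sig) * theta * (rho * gam * C + B1 / sig) / root


def big_lambda(theta, K, B0, B1, sig, gam, rho):
    """Lambda(theta) from the general one-dimensional formula."""
    r = theta / (1.0 - theta)
    C = c_minus(theta, K, B1, sig, gam, rho)
    D = d_coef(theta, K, B0, B1, sig, gam, rho)
    return (0.5 * gam ** 2 * C
            + 0.5 * gam ** 2 * D ** 2 * (1.0 + r * rho ** 2)
            + r * rho * gam * D * B0 / sig
            + 0.5 * r * B0 ** 2 / sig ** 2)


def lambda_pr(theta, K, sig):
    """Closed form of Lambda(theta) in the Platen-Rebolledo case."""
    return 0.5 * abs(K) * (1.0 - math.sqrt(1.0 - theta)) + theta * sig ** 2 / 8.0


def v_plus_grid(ell, lam, n_grid=20_000):
    """Grid approximation of inf over theta in [0, 1) of lam(theta) - theta * ell."""
    return min(lam(i / n_grid) - (i / n_grid) * ell for i in range(n_grid))
```

With, say, $K = -2$ and $|\sigma| = |\gamma| = 0.4$ (so $B_1 = K$, $B_0 = 0.08$, $\rho = 1$), `big_lambda` should agree with `lambda_pr` up to rounding, and `v_plus_grid` should be close to $-(\ell - \bar\ell)^2/(\ell - \underline{\ell})$ for $\ell > \bar\ell$.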

References

1. Bensoussan, A., Frehse, J.: On Bellman equations of ergodic control in R N . J. Reine Angew.
Math. 429, 125–160 (1992)
2. Bielecki, T.R., Pliska, S.R.: Risk-sensitive dynamic asset management. Appl. Math. Optim.
39, 337–360 (1999)
3. Davis, M., Lleo, S.: Risk-sensitive benchmarked asset management. Quant. Financ. 8, 415–426
(2008)
4. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer,
New York (1998)
5. Fleming, W., McEneaney, W.: Risk sensitive control on an infinite horizon. SIAM J. Control
Optim. 33, 1881–1915 (1995)
6. Fleming, W., Sheu, S.J.: Risk sensitive control and an optimal investment model. Math. Financ.
10, 197–213 (2000)
7. Guasoni, P., Robertson, S.: Portfolios and risk premia for the long run. Ann. Appl. Probab. 22,
239–284 (2012)
8. Hata, H., Nagai, H., Sheu, S.J.: Asymptotics of the probability minimizing a down-side risk.
Ann. Appl. Probab. 20, 52–89 (2010)

9. Hata, H., Sekine, J.: Solving long term investment problems with Cox-Ingersoll-Ross interest
rates. Adv. Math. Econ. 8, 231–255 (2005)
10. Karatzas, I., Shreve, S.: Methods of Mathematical Finance. Springer, New York (1998)
11. Korn, R.: Optimal Portfolios: Stochastic Models for Optimal Investment and Risk Management
in Continuous-time. World Scientific, Singapore (1997)
12. Liu, R., Muhle-Karbe, J.: Portfolio choice with stochastic investment opportunities: a user’s
guide, Preprint (2013)
13. Merton, R.: Optimum consumption and portfolio rules in a continuous-time model. J. Econ.
Theory 3, 373–413 (1971)
14. Nagai, H.: Bellman equations of risk sensitive control. SIAM J. Control Optim. 34, 74–101
(1996)
15. Nagai, H.: Downside risk minimization via a large deviation approach. Ann. Appl. Probab. 22,
608–669 (2012)
16. Nagai, H., Peng, S.: Risk-sensitive dynamic portfolio optimization with partial information on
infinite time horizon. Ann. Appl. Probab. 12, 173–195 (2002)
17. Pham, H.: A large deviations approach to optimal long term investment. Financ. Stoch. 7,
169–195 (2003)
18. Pham, H.: Some applications and methods of large deviations in finance and insurance. Paris-
Princeton Lectures on Mathematical Finance, Lecture Notes in Mathematics, vol. 1919 (2007)
19. Pham, H.: Continuous Time Stochastic Control and Optimization with Financial Applications.
SMAP. Springer, New York (2009)
20. Robertson, S., Xing, H.: Large time behavior of solutions to semi-linear equations with
quadratic growth in the gradient. SIAM J. Control Optim. 53(1), 185–212 (2015)
21. Stettner, L.: Duality and risk-sensitive portfolio optimization. In: Yin, G., Zhang, Q. (eds.)
Mathematics of Finance, Contemporary Mathematics, vol. 351, pp. 333–347 (2004)
22. Stutzer, M.: Portfolio choice with endogenous utility: a large deviations approach. J. Econom.
116, 365–386 (2003)
Systemic Risk and Default Clustering
for Large Financial Systems

Konstantinos Spiliopoulos

Abstract As is known in the financial risk and macroeconomics literature,
risk-sharing in large portfolios may increase the probability of creation of default
clusters and of systemic risk. We review recent developments on mathematical and
computational tools for the quantification of such phenomena. Limiting analysis, such
as laws of large numbers and central limit theorems, allows us to approximate the
distribution in large systems and to study quantities such as the loss distribution in large
portfolios. Large deviations analysis allows us to study the tail of the loss distribution
and to identify pathways to default clustering. Sensitivity analysis allows us to
understand the most likely ways in which different effects, such as contagion and
systematic risks, combine to lead to large default rates. Such results could give useful
insights into how to optimally safeguard against such events.

Keywords Systemic risk · Default clustering · Large portfolios · Loss distribution ·
Asymptotic methods · Rare events

1 Introduction

The past several years have made clear the need to better understand the behaviour of
large interconnected financial systems. Almost all areas of modern life are touched
by a financial crisis. The financial crisis of 2007–2009 brought into focus the
networked structure of the financial world. It challenged the mathematical finance
community to understand connectedness in financial systems. Understanding
systemic risk, i.e., the risk that a large number of components of an interconnected
financial system fail within a short time, leading to the failure of the system itself,
has become an important issue to investigate.
Interconnections often make a system robust, but they can also act as conduits
for risk. Even seemingly unrelated things may become related: risk
restrictions may, for example, force a sale of one type of well-performing asset to
compensate for the poor behavior of another asset. Thus, appropriate mathematical
models need to be developed in order to help in the understanding of how risk can
propagate between financial objects.

K. Spiliopoulos (B)
Department of Mathematics & Statistics, Boston University, Boston, MA 02215, USA
e-mail: [email protected]

© Springer International Publishing Switzerland 2015
P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_19
It is possible that initial shocks could trigger contagion effects (e.g., [1]). Examples
of such shocks include changes in interest rates, in currency values or in
commodity prices, or a reduction in global economic growth. Then, there may be a
transmission mechanism which causes other institutions in the system to be affected
by the initial shock. An example of such a mechanism is financial linkages among
economies. Another reason could simply be investor irrationality. In either case,
systemic risk causes the perceived risk-return trade-off in the economy to change.
Uncertainty becomes an issue and market participants fear subsequent losses in
asset prices with a large dispersion in regards to the magnitude of the crisis. Reduced-form
point process models of correlated default are often used (a) to assess
portfolio credit risk and (b) to value securities exposed to correlated default risk.
The workhorses of these models are counting processes. In this work we focus on
using dynamic portfolio credit risk models to study large portfolio asymptotics and
default clustering.
Large portfolio asymptotics were first studied in [2]. The model in [2] is a static
model of a homogeneous pool in which firms default independently of one another
conditional on a normally distributed random variable representing a systematic risk
factor. Alternative distributions of the systematic factor were examined in [3, 4] and
the case of heterogeneous portfolios was studied in [5]. In [6], the authors extend the
model of [2] dynamically and the systematic risk factor follows a Brownian motion.
In [6], the authors study a structural model for distance to default process in a pool
of names. A firm defaults when the default process hits zero. Exploiting conditional
independence of defaults, [7, 8] have studied the tail of the loss distribution in the
static case. Large deviations arguments were also used in [9] to study stochastic
recovery effects on large static pools of credit assets.
Reduced-form models of correlated default timing have appeared in the finance
literature under different forms. Giesecke and Weber [10] take the intensity of a
name as a function of the state of the names in a specified neighborhood of that
name. The authors in [11, 12] take the intensity to be a function of the portfolio
loss and each name can be either in a good or in a distressed financial state. These
papers prove laws of large numbers for the portfolio loss distribution and develop
Gaussian approximations to the portfolio loss distribution based on central limit
theorems. Cvitanić et al. [13] consider the typical behavior of a mean field system
with permanent default impact.
Sircar and Zariphopoulou [14] study large portfolio asymptotics for utility indif-
ference valuation of securities exposed to the losses in the pool. In [15], the authors
study systematic risk via a mean field model of interacting agents. Using a model
of a two well potential, agents can move freely from a healthy state to a failed state.
The authors study probabilities of transition from the healthy to the failed state using
large deviations ideas. In [16] the authors propose and study a model for inter-bank
lending and study its stochastic stability.

The authors in [17] employ jump-diffusion models driven by Hawkes processes


to empirically study default clustering and the time dimension of systemic risk.
Duan [18] proposes a hierarchical model with individual shocks and group specific
shocks. The work of [19] reviews intensity models that are governed by exogenous
and endogenous Markov Chains. In [20], the authors proposed a dynamic point
process model of correlated default timing in a portfolio of firms (“names”). The
model incorporates different sources of default clustering identified in recent empir-
ical research, including idiosyncratic risks, exposure to systematic risk factors and
contagion in financial markets, see [21, 22]. Based on the weak convergence ideas of
[20], the authors in [23] obtain and study formulas for the bilateral counterparty val-
uation adjustment of a credit default swaps portfolio referencing an asymptotically
large number of entities.
The model in [20] can be naturally understood as an interacting particle system
that is influenced by an exogenous source of randomness. There is a central source of
interconnections and failure of any of the components stresses the central ‘bus’, which
in turn can cause the failure of other components (a contagion effect). Computing
the distribution of the loss from default in such models tends to be a difficult task,
and while Monte Carlo simulation methods are broadly applicable, they can be slow
for the large portfolios or long time horizons that are commonly of interest in practice.
Mathematical and computational tools for the approximation of the distribution of
the loss from default in large heterogeneous portfolios were then developed in [24],
Gaussian correction theory was developed in [25], and the analysis of tail events and
most likely paths to failure via the lens of large deviations theory was developed
in [26]. We remark here that, to a large extent, systemic risk refers to the tail of the
distribution. The authors in [27] combine the large pool asymptotic results of [1,
3, 4, 9, 10, 24–34] with maximum likelihood ideas to construct tractable statistical
inference procedures for parameter estimation in large financial systems.
Such mathematical results lead to new computational tools for the measurement
and prediction of risk in high-dimensional financial networks. These tools mainly
include approximations of the distribution of losses from defaults and of portfolio
risk measures, and efficient computational tools for the analysis of extreme default
events. The mathematical results also yield important insights into the behavior of
systemic risk as a function of the characteristics of the names in the system, and in
particular their interaction.
Financial institutions (banks, pension funds, etc.) often hold large portfolios in
order to diversify away a number of idiosyncratic effects of individual assets. Deposit
insurance premia depend upon meaningful models and assessment of the macroeco-
nomic effect of the various phenomena that drive defaults. Development of related
mathematical and computational tools can help inform the design of regulatory pol-
icy, improve the pricing of federal deposit insurance, and lead to more accurate risk
measurement at financial institutions.
In this paper, we focus on dynamic default timing models for large financial
systems that fall into the category of intensity models in portfolio credit risk. Based
on the default timing model developed in [20], we address several of the issues just
mentioned that are typically of interest. The mathematical and computational
tools developed allow us to draw financially relevant conclusions about the behavior of
such large financial systems.
Although the primary interest of this work is risk in financial systems, models of
the type discussed in this paper are generic enough to allow for modifications that
make them relevant in other domains, including systems reliability, insurance and
epidemiology. In reliability, a large system of interacting components might have
a central connection, and be influenced by an external environment (temperature,
for example). The failure of an individual component (which could be governed by
an intensity model appropriate for the particular application) increases the stress
on the central connection and thus the other components, making the entire system
more likely to fail. In insurance, the system could represent a pool of insurance
policies. The effect of wildfires might, in that example, be modelled by a contagion
term. Systematic risk in the form of environmental conditions has an impact on the
whole pool.
The rest of the article is structured as follows. In Sect. 2 we describe the correlated
default timing model proposed in [20]. Section 3 studies the typical behavior of the loss
distribution in such portfolios as the number of names (agents) in the pool grows to
infinity. Section 4 focuses on developing the Gaussian correction theory. As we shall
see there, Gaussian corrections are very useful because they make the approximations
accurate even for portfolios of relatively small sizes. In Sect. 5, we study the tail of
the loss distribution using arguments from the large deviations theory. We also study
the most likely path to systemic failure and to the development of default clusters.
An understanding of the preferred paths to large default rates and the most likely
path to the creation of default clusters can give useful insights into how to optimally
safeguard against such events. Importance sampling techniques can then be used to
construct asymptotically efficient estimators for tail event probabilities, see Sect. 6.
Conclusions are in Sect. 7. A large part of the material presented in this work, but
not all, is related to recent work of the author described in [20, 24–26].

2 A Dynamic Correlated Default Timing Model

One of the issues of fundamental importance in financial markets is systemic risk,


which may be understood as the likelihood of failure of a substantial fraction of firms
in the economy. There are a number of ways of interpreting this, but our focus will
be the behavior of actual defaults. Defaults are discrete events, so one can frame the
interest within the language of point processes. Empirically, defaults tend to happen
in groups; feedback and exposure to market forces (along the lines of “regimes”)
tend to produce correlation among defaults.
Let us fix a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ where all random variables will be
defined. Denote by $\tau^n$ the stopping time at which the $n$th component (or particle) in
our system fails. Then, as $\delta \downarrow 0$, a failure time $\tau^n$ has intensity process $\lambda^n$, which
satisfies

$$\mathbb{P}\{\tau^n \in (t, t+\delta] \mid \mathcal{F}_t,\ \tau^n > t\} \approx \lambda^n_t\,\delta, \tag{1}$$

where $\mathcal{F}_t$ is the sigma-algebra generated by the entire system up to time $t$. Hence, we
essentially have that the process defined by $1_{\{\tau^n \le t\}} - \int_0^t \lambda^n_s 1_{\{\tau^n > s\}}\,ds$ is a martingale.
Motivated by the empirical studies in [21, 22], we may model the intensity $\lambda^n$ in
such a way that it depends on three factors: a mean-reverting idiosyncratic source
of risk, the portfolio loss rate and a systematic risk factor. Heterogeneity can be
addressed by allowing the intensity parameters of each name to be different. The
mean-reverting character of the idiosyncratic source of risk guarantees that
a default in the pool has only a transient effect on the default intensities of the
surviving names. The dependence on the portfolio loss rate, denoted by $L^N_\cdot$, is the
term responsible for the contagion effects, whereas the systematic risk factor,
denoted by $X_\cdot$, is an exogenous source of risk. To be precise, the default intensities
$\lambda^n$ are governed by the following interacting system of stochastic differential
equations (SDEs):

$$d\lambda^n_t = -\alpha_n(\lambda^n_t - \bar\lambda_n)\,dt + \sigma_n\sqrt{\lambda^n_t}\,dW^n_t + \beta^C_n\,dL^N_t + \varepsilon\beta^S_n\lambda^n_t\,dX_t, \qquad \lambda^n_0 = \lambda^n_\circ, \tag{2}$$

where $\{W^n\}_{n\in\mathbb{N}}$ is a countable collection of independent standard Brownian
motions.
The process $L^N_t$ represents the empirical failure rate in the system, i.e.,

$$L^N_t = \frac{1}{N}\sum_{n=1}^{N} 1_{\{\tau^n \le t\}}, \tag{3}$$

where, letting $\{e_n\}_{n\in\mathbb{N}}$ be an i.i.d. collection of standard exponential random
variables, we have

$$\tau^n = \inf\Big\{t \ge 0 : \int_0^t \lambda^n_s\,ds \ge e_n\Big\}. \tag{4}$$

The process $X_t$ represents the systematic risk, which can be modeled as the
solution to some SDE

$$dX_t = b_0(X_t)\,dt + \sigma_0(X_t)\,dV_t, \qquad X_0 = x_\circ, \tag{5}$$

where $V$ is a standard Brownian motion which is independent of the $W^n$'s and $e_n$'s.
Plausible models for $X_t$ could be an Ornstein-Uhlenbeck process or a
Cox-Ingersoll-Ross (CIR) process.
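The system (2)–(5) can be simulated directly for a finite pool, which makes the roles of the threshold rule (4) and the contagion term $\beta^C\,dL^N$ concrete. The following is a minimal sketch under several assumptions that are not taken from the text: a homogeneous pool, an Euler scheme with truncation of the intensity at zero, and an Ornstein-Uhlenbeck choice $dX_t = \kappa(\bar x - X_t)\,dt + \sigma_0\,dV_t$ for the systematic factor. The function name and all parameter defaults are illustrative.

```python
import math
import random


def simulate_loss(N=200, T=1.0, dt=1e-3, alpha=4.0, lam_bar=0.2, sigma=0.9,
                  lam0=0.2, beta_C=2.0, beta_S=1.0, eps=1.0,
                  kappa=1.0, x_bar=1.0, sig0=0.2, x0=1.0, seed=0):
    """Euler sketch of the intensity system; returns the path of the loss rate L^N."""
    rng = random.Random(seed)
    lam = [lam0] * N                      # current intensities lambda^n_t
    integ = [0.0] * N                     # integrated intensities, see (4)
    thresh = [rng.expovariate(1.0) for _ in range(N)]  # standard exponentials e_n
    alive = [True] * N
    x = x0
    defaults = 0
    path = [0.0]
    sqdt = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        # Systematic factor: Ornstein-Uhlenbeck increment (an assumption of this sketch).
        dX = kappa * (x_bar - x) * dt + sig0 * rng.gauss(0.0, sqdt)
        x += dX
        # Default rule (4): a name fails once its integrated intensity exceeds e_n.
        new_defaults = 0
        for n in range(N):
            if alive[n]:
                integ[n] += lam[n] * dt
                if integ[n] >= thresh[n]:
                    alive[n] = False
                    new_defaults += 1
        dL = new_defaults / N
        # Intensity update (2) for the survivors, truncated at zero.
        for n in range(N):
            if alive[n]:
                l = lam[n]
                l += (-alpha * (l - lam_bar) * dt
                      + sigma * math.sqrt(l) * rng.gauss(0.0, sqdt)
                      + beta_C * dL
                      + eps * beta_S * l * dX)
                lam[n] = max(l, 0.0)
        defaults += new_defaults
        path.append(defaults / N)
    return path
```

Calling `simulate_loss(N=1000)` returns the loss path on the time grid; its last entry is one sample of $L^N_T$. Repeating the call with different seeds gives an empirical loss distribution, at a cost that grows with both $N$ and the number of time steps.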
In the case $\beta^C_n = \beta^S_n = 0$ for all $n \in \{1, \dots, N\}$, one recovers the classical
CIR process model in credit risk, e.g., [35]. Namely, the intensity SDE (2)
extends the widely used CIR process by including two additional terms that generate
correlation between failure times. The term $\varepsilon\beta^S_n\lambda^n_t\,dX_t$ induces correlated diffusive
movements of the component intensities; the process $X$ represents the state
of the macro-economy, which affects all assets in the pool. The term $\beta^C_n\,dL^N_t$ introduces
a feedback (contagion) effect. The standard term $-\alpha_n(\lambda^n_t - \bar\lambda_n)\,dt$ is a mean-reverting
term allowing the component to “heal” after a shock (i.e., a failure). This
parsimonious formulation allows us to take advantage of the wealth of knowledge
about CIR-type processes. The parameter $\varepsilon > 0$ allows us to focus later on rare
events.
The process $L^N$ of (3), which simply gives us the fraction of components which
have already failed by time $t$, affects each of the remaining components in a natural
way. Each failure corresponds to a Dirac function in the measure $dL^N$; the term
$\beta^C_n\,dL^N_t$ thus leads to upward impulses in the $\lambda^n$'s, which lead (via (4)) to sooner
failure of the remaining functioning components. We might think of a central “bus”
in a system of components. Each of the components depends on this bus, which is in
turn sensitive to failures in the various components. In the financial application that
was considered in [20], this feedback mechanism is empirically observed to be an
important channel for the clustering of defaults in the U.S. (see [21]).
In order to allow for heterogeneity, the parameters in (2) depend on the index $n$.
Define the “type”

$$p^n_t = (\lambda^n_t, \alpha_n, \bar\lambda_n, \sigma_n, \beta^C_n, \beta^S_n) \tag{6}$$

for each $n \in \mathbb{N}$ and $t \ge 0$. The $p^n_t$'s take values in $\mathcal{P} = \mathbb{R}^3_+ \times \mathbb{R} \times \mathbb{R}_+ \times \mathbb{R} \subset \mathbb{R}^6$. The
parameters $(\lambda^n_0, \alpha_n, \bar\lambda_n, \sigma_n, \beta^C_n, \beta^S_n)$ are assumed to be bounded uniformly in $n \in \mathbb{N}$.
We can capture the heterogeneity of the system by defining $U^N = \frac{1}{N}\sum_{n=1}^{N}\delta_{p^n_0}$
and assuming that this empirical type frequency has a (weak) limit. In particular, we
make the following assumption.
Assumption 2.1 We assume that $U = \lim_{N\to\infty} U^N$ exists (in $\mathscr{P}(\mathcal{P})$).
Proposition 3.3 in [20] guarantees that, assuming that the SDE for the process $X$ has a
unique strong solution, the system (2)–(5) has a unique
strong solution such that $\lambda^n_t \ge 0$ for every $N \in \mathbb{N}$, $n \in \{1, \dots, N\}$ and $t \ge 0$. The
model (2)–(5) is a mean-field type model; the feedback occurs through the empirical
average of the pool of names. It is somewhat similar to certain genetic models (most
notably the Fleming-Viot process; see [36], [37, Chap. 10], and [38]). However, as
demonstrated in [20, 24], the structure of the system (2)–(5) presents several
difficulties that bring the analysis of such systems outside the scope of the standard
setup.

3 Typical Behavior: Law of Large Numbers

The system (2)–(5) can naturally be understood as an interacting particle system. This
suggests how to understand its large-scale behavior. The structure of the feedback (the
empirical average L N ) is of mean-field type (roughly within the class of McKean-
Vlasov models; see [31, 39]). An understanding of “typical” behavior of a system
as N → ∞ is fundamental in identifying “atypical” or “rare” events.

To formulate the law of large numbers result, we define the empirical distribution
of the $p^n$'s corresponding to the names that have survived up to time $t$, as follows:

$$\mu^N_t = \frac{1}{N}\sum_{n=1}^{N}\delta_{p^n_t}\,1_{\{\tau^n > t\}}.$$

This captures the entire dynamics of the model (including the effect of the heterogeneities).
We can directly calculate the failure rate from the $\mu^N$'s:

$$L^N_t = 1 - \mu^N_t(\mathcal{P}), \qquad t \ge 0. \tag{7}$$

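Let us then identify the limit of $\mu^N_t(\mathcal{P})$ as $N \to \infty$. This is a law of large numbers
(LLN) result and it identifies the baseline “typical” behavior of the system. For
$f \in C^2(\mathcal{P})$, let

$$\begin{aligned}
(\mathcal{L}_1 f)(p) &= \frac12\sigma^2\lambda\,\frac{\partial^2 f}{\partial\lambda^2}(p) - \alpha(\lambda - \bar\lambda)\,\frac{\partial f}{\partial\lambda}(p) - \lambda f(p), \\
(\mathcal{L}_2 f)(p) &= \beta^C\,\frac{\partial f}{\partial\lambda}(p), \\
(\mathcal{L}_3^x f)(p) &= \varepsilon\beta^S\lambda\,b_0(x)\,\frac{\partial f}{\partial\lambda}(p) + \frac{\varepsilon^2}{2}(\beta^S)^2\lambda^2\sigma_0^2(x)\,\frac{\partial^2 f}{\partial\lambda^2}(p), \\
(\mathcal{L}_4^x f)(p) &= \varepsilon\beta^S\lambda\,\sigma_0(x)\,\frac{\partial f}{\partial\lambda}(p) \quad\text{and}\quad Q(p) = \lambda
\end{aligned} \tag{8}$$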

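for $p = (\lambda, \alpha, \bar\lambda, \sigma, \beta^C, \beta^S)$. The generator $\mathcal{L}_1$ corresponds to the diffusive part of
the intensity with killing rate $\lambda$, and $\mathcal{L}_2$ is the macroscopic effect of contagion on
the surviving intensities at any given time. The operators $\mathcal{L}_3^x$ and $\mathcal{L}_4^x$ capture the
dynamics due to the exogenous systematic risk $X$. Then $\mu^N$ tends in distribution (in
the natural topology of subprobability measures on $\mathcal{P}$) to a measure-valued process
$\bar\mu$. Letting

$$\langle f, \mu\rangle = \int_{p\in\mathcal{P}} f(p)\,\mu(dp)$$

for all $f \in C^2(\mathcal{P})$, the limit $\bar\mu$ satisfies the stochastic evolution equation

$$d\langle f, \bar\mu_t\rangle = \Big\{\langle\mathcal{L}_1 f, \bar\mu_t\rangle + \langle Q, \bar\mu_t\rangle\langle\mathcal{L}_2 f, \bar\mu_t\rangle + \langle\mathcal{L}_3^{X_t} f, \bar\mu_t\rangle\Big\}\,dt + \langle\mathcal{L}_4^{X_t} f, \bar\mu_t\rangle\,dV_t \quad \text{a.s.} \tag{9}$$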

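With sufficient regularity, this is equivalent to the stochastic integro-partial differential
equation (SIPDE)

$$d\upsilon = \Big\{\mathcal{L}_1^*\upsilon + \langle Q, \upsilon\rangle\,\mathcal{L}_2^*\upsilon + \mathcal{L}_3^{X_t,*}\upsilon\Big\}\,dt + \mathcal{L}_4^{X_t,*}\upsilon\,dV_t \quad \text{a.s.} \tag{10}$$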

where $*$ denotes adjoint in the appropriate sense (for notational simplicity, we have
written (10) to include the types as one of the coordinates; in a heterogeneous
collection in practice we would often use only $\lambda$ in solving (10)). We recall the rigorous
statement in Theorem 3.1.
The SIPDE (10) gives us a “large system approximation” of the failure rate:

$$L^N_t \approx 1 - \bar\mu_t(\mathcal{P}) = 1 - \int_{\mathcal{P}}\upsilon(t, p)\,dp. \tag{11}$$

The computation of the first-order approximation (11) suggested by the LLN requires
solving the SIPDE (10) governing the density of the limiting measure. In [24] a
numerical method for this purpose is proposed. The method is based on an infinite
system of SDEs for certain moments of the limiting measure. These SDEs are
driven by the systematic risk process X and a truncated system can be solved using
a discretization or random ODE scheme. The solution to the SDE system leads to
the solution to the SIPDE via an inverse moment problem.
The approximation (11) has significant computational advantages over a naive
Monte Carlo simulation of the high-dimensional original stochastic system (2)–(5)
and its accuracy is demonstrated in the left of Fig. 1 for a specific choice of parameters.
It also provides information about catastrophic failure.
The tail represents extreme default scenarios, and these are at the center of risk
measurement and management applications in practice. The analysis of the limiting
distribution generates important insights into the behavior of the tails as a function of
the characteristics of the system (2)–(5). For example, we see that the tail is heavily
influenced by the sensitivity of a name to the variations of the systematic risk $X$. The
bigger the sensitivity, the fatter the tail and the larger the likelihood of large losses
in the system (see the right of Fig. 1). Insights of this type can help understand the
role of contagion and systematic risk, and how they interact to produce atypically
large failure rates. This, in turn, leads to ways to minimize or “manage” catastrophic
failures.

[Figure 1: two panels. Left panel: curves for N = 250, 1000, 5000, 10000 and the asymptotic limit; horizontal axis “Portfolio Loss”. Right panel: curves for β^S = 1, 2, 3, 4; horizontal axis “Limiting Portfolio Loss”.]

Fig. 1 On the left: comparison of distributions of failure rate $L^N_t$ for different $N$ at $t = 1$. Parameter
choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) = (0.9, 4, 0.2, 0.2, 4, 8)$. On the right: comparison of distribution of
limiting failure rate $1 - \bar\mu_t(\mathcal{P})$ for different values of the systematic risk sensitivity $\beta^S$ at $t = 1$.
Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C) = (0.9, 4, 0.2, 0.2, 2)$
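Let us next present the statement of the mathematical result. We denote by $E$ the
collection of sub-probability measures (i.e., defective probability measures) on $\mathcal{P}$;
i.e., $E$ consists of those Borel measures $\nu$ on $\mathcal{P}$ such that $\nu(\mathcal{P}) \le 1$.

Theorem 3.1 (Theorem 3.1 in [24]) We have that $\mu^N_\cdot$ converges in distribution to
$\bar\mu_\cdot$ in $D_E[0, T]$. The evolution of $\bar\mu_\cdot$ is given by the measure evolution equation

$$d\langle f, \bar\mu_t\rangle_E = \Big\{\langle\mathcal{L}_1 f, \bar\mu_t\rangle_E + \langle Q, \bar\mu_t\rangle_E\langle\mathcal{L}_2 f, \bar\mu_t\rangle_E + \langle\mathcal{L}_3^{X_t} f, \bar\mu_t\rangle_E\Big\}\,dt + \langle\mathcal{L}_4^{X_t} f, \bar\mu_t\rangle_E\,dV_t, \quad \forall f \in C^\infty(\mathcal{P}) \ \text{a.s.}$$

Suppose there is a solution of the nonlinear SPDE

$$d\upsilon(t, p) = \Big\{\mathcal{L}_1^*\upsilon(t, p) + \mathcal{L}_3^{*,X_t}\upsilon(t, p) + \Big(\int_{p'\in\mathcal{P}} Q(p')\upsilon(t, p')\,dp'\Big)\mathcal{L}_2^*\upsilon(t, p)\Big\}\,dt + \mathcal{L}_4^{*,X_t}\upsilon(t, p)\,dV_t, \quad t > 0,\ p \in \mathcal{P}, \tag{12}$$

where the $\mathcal{L}_i^*$ denote adjoint operators, with initial condition

$$\lim_{t\downarrow 0}\upsilon(t, p)\,dp = U(dp).$$

Then

$$\bar\mu_t = \upsilon(t, p)\,dp.$$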

We close this section by briefly describing the method of moments that leads to
the numerical computation of the loss from default. We focus our discussion on the
homogeneous case and we refer the reader to [24] for the general case.
Firstly, we remark that the SPDE (12) can be supplied with appropriate boundary
conditions, which, as mentioned in [24], are

$$\upsilon(t, \lambda = 0) = \upsilon(t, \lambda = \infty) = 0.$$

Secondly, it turns out that for $k \in \mathbb{N}$, the moments $u_k(t) = \int_0^\infty\lambda^k\upsilon(t, \lambda)\,d\lambda$ exist
almost surely. By (11), it is clear that we want to compute $u_0(t)$. In particular, note that
the limiting loss is $L_t = 1 - u_0(t)$.

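By an integration by parts and using the boundary conditions at $\lambda = 0$ and at
$\lambda = \infty$, we can prove that the moments satisfy the following system of stochastic differential
equations:

$$\begin{aligned}
du_k(t) = {} & \Big\{u_k(t)\Big(-\alpha k + \beta^S b_0(X_t)k + 0.5(\beta^S)^2\sigma_0^2(X_t)k(k-1)\Big) \\
& + u_{k-1}(t)\Big(0.5\sigma^2k(k-1) + \alpha\bar\lambda k + \beta^C k\,u_1(t)\Big) - u_{k+1}(t)\Big\}\,dt \\
& + \beta^S\sigma_0(X_t)k\,u_k(t)\,dV_t,
\end{aligned} \tag{13}$$

$$u_k(0) = \int_0^\infty\lambda^k\Lambda_\circ(\lambda)\,d\lambda,$$

where $\Lambda_\circ(\lambda) = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\delta_{\lambda^n_0}(\lambda)$.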
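The form of (13) can be recovered by testing the evolution equation (9) against $f(\lambda) = \lambda^k$ in the homogeneous case. The block below is a sketch of that computation using the operators in (8); reading (13) as the case $\varepsilon = 1$ is an inference from comparing the two formulas, not a statement made in the text.

```latex
% Apply the operators in (8) to f(\lambda) = \lambda^k (homogeneous case).
\begin{align*}
 \mathcal{L}_1\lambda^k &= \tfrac12\sigma^2 k(k-1)\,\lambda^{k-1}
   - \alpha k\,\lambda^{k} + \alpha\bar\lambda k\,\lambda^{k-1} - \lambda^{k+1}, \\
 \mathcal{L}_2\lambda^k &= \beta^C k\,\lambda^{k-1}, \\
 \mathcal{L}_3^{x}\lambda^k &= \Bigl(\varepsilon\beta^S b_0(x)\,k
   + \tfrac{\varepsilon^2}{2}(\beta^S)^2\sigma_0^2(x)\,k(k-1)\Bigr)\lambda^{k}, \\
 \mathcal{L}_4^{x}\lambda^k &= \varepsilon\beta^S\sigma_0(x)\,k\,\lambda^{k}.
\end{align*}
% Integrating against the limiting measure, with u_j(t) = <\lambda^j, \bar\mu_t> and <Q, \bar\mu_t> = u_1(t):
\begin{align*}
 du_k &= \Bigl\{ u_k\bigl(-\alpha k + \varepsilon\beta^S b_0(X_t)k
        + \tfrac{\varepsilon^2}{2}(\beta^S)^2\sigma_0^2(X_t)k(k-1)\bigr) \\
      &\qquad + u_{k-1}\bigl(\tfrac12\sigma^2k(k-1) + \alpha\bar\lambda k + \beta^C k\,u_1\bigr)
        - u_{k+1}\Bigr\}\,dt
        + \varepsilon\beta^S\sigma_0(X_t)\,k\,u_k\,dV_t .
\end{align*}
```

Setting $\varepsilon = 1$ reproduces (13) term by term; the contagion term $\beta^C k\,u_1 u_{k-1}$ is the only nonlinearity, and the killing term $-\lambda f$ in $\mathcal{L}_1$ is what couples $u_k$ to $u_{k+1}$.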
The system (13) is not closed, since to determine $u_k(t)$ one needs to
know $u_{k+1}(t)$. So, in practice one must perform a truncation at some level $k = K$,
where we let $u_{K+1} = u_K$ (that is, we use the first $K+1$ moments). As shown in
[24], a relatively small number of moments suffices to compute the zeroth
moment $u_0(t)$ with good accuracy. Then, by solving backwards, one computes $u_0(t)$
and from this one gets the limiting loss

\[
L_t = 1 - u_0(t).
\]
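The truncation just described can be sketched numerically. The following is a minimal illustration, not the implementation of [24]: it assumes $\beta^S = 0$ (so that $X$ drops out and (13) becomes a deterministic ODE system), assumes that all names start from the same intensity `lam0` (so that $u_k(0) = \lambda_0^k$), and uses a classical Runge-Kutta step, which is a choice made here for simplicity. The function names `moment_rhs` and `limiting_loss` are hypothetical.

```python
import numpy as np


def moment_rhs(u, alpha, lam_bar, sigma, beta_c):
    """Right-hand side of the truncated moment system (13) with beta^S = 0.

    u[k] approximates the k-th moment u_k(t), k = 0..K, and the system is
    closed with u_{K+1} = u_K as described in the text (requires K >= 1).
    """
    K = len(u) - 1
    du = np.empty_like(u)
    for k in range(K + 1):
        u_prev = u[k - 1] if k >= 1 else 0.0      # coefficient vanishes for k = 0
        u_next = u[k + 1] if k < K else u[K]      # closure u_{K+1} = u_K
        du[k] = (-alpha * k * u[k]
                 + u_prev * (0.5 * sigma**2 * k * (k - 1)
                             + alpha * lam_bar * k
                             + beta_c * k * u[1])
                 - u_next)
    return du


def limiting_loss(T, lam0, alpha, lam_bar, sigma, beta_c, K=10, n_steps=2000):
    """Approximate L_T = 1 - u_0(T) by integrating the truncated system (RK4)."""
    # Assumption: point mass at lam0 for the initial intensities.
    u = np.array([lam0**k for k in range(K + 1)], dtype=float)
    h = T / n_steps
    args = (alpha, lam_bar, sigma, beta_c)
    for _ in range(n_steps):
        k1 = moment_rhs(u, *args)
        k2 = moment_rhs(u + 0.5 * h * k1, *args)
        k3 = moment_rhs(u + 0.5 * h * k2, *args)
        k4 = moment_rhs(u + h * k3, *args)
        u = u + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return 1.0 - u[0]
```

A simple sanity check is the degenerate case $\alpha = \sigma = \beta^C = 0$ with $\lambda_0 = 1$: the intensity is then constant, every moment equals $e^{-t}$, the closure is exact, and `limiting_loss(1.0, 1.0, 0.0, 0.2, 0.0, 0.0)` should agree with $1 - e^{-1}$ up to integration error.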

4 Central Limit Theorem Correction

The asymptotics of (10) give, via (11), the limiting behavior of the system as the
number of components becomes large. Starting with that result, the results in [25]
develop a Gaussian fluctuation theory analogous to central limit theory (see for
example [11, 12, 32, 40] for some related literature). This result provides the leading
order asymptotic correction to the law of large numbers approximation developed
in Sect. 3. In practical terms, the usefulness of such a result is twofold: (a) the
approximation is accurate even for portfolios of moderate size, see [25], and (b)
one can make use of the approximation to develop tractable statistical inference
procedures for the statistical calibration of such models, see [27].
To be more precise, let us define the signed measure

\[
\Xi^N_t = \sqrt{N}\,\big(\mu^N_t - \bar\mu_t\big).
\]

Conditional on the exogenous systematic risk process $X$, a central limit
theorem applies as $N\to\infty$, and $\bar\Xi = \lim_{N\to\infty}\Xi^N$ exists in an appropriate space of distributions
and is Gaussian. Unconditionally, it may not be Gaussian but is of mean zero (since
we have removed the bias $\bar\mu$ from $\mu^N$).
The usefulness of the fluctuation analysis is that it leads to a second-order
approximation to the distribution of the portfolio loss $L^N$ in large pools. The fluctuation
analysis yields an approximation which improves the first-order approximation (11)
suggested by the LLN, especially for smaller system sizes $N$.
In particular, Theorem 4.1 implies that

\[
\mathbb{P}\big(\sqrt{N}\,(L^N_t - L_t) \ge \ell\big) \approx \mathbb{P}\big(\bar\Xi_t(\mathcal{P}) \le -\ell\big)
\]

for large $N$. This motivates the approximation

\[
\mu^N_t = \frac{1}{\sqrt{N}}\,\Xi^N_t + \bar\mu_t \overset{d}{\approx} \frac{1}{\sqrt{N}}\,\bar\Xi_t + \bar\mu_t,
\]

which then implies the following second-order approximation for the portfolio loss:

\[
L^N_t \overset{d}{\approx} L_t - \frac{1}{\sqrt{N}}\,\bar\Xi_t(\mathcal{P}). \tag{14}
\]

The numerical computation of the second-order approximation (14) suggested
by the fluctuation analysis is amenable to a moment method similar to that used for
computing the first-order approximation (11). In addition to solving the LLN SPDE,
we would also need to solve for the fluctuation limit. This limit is governed by a
stochastic evolution equation, which gives rise to an additional system of “fluctuation
moments.” This system is driven by the exogenous systematic risk process $X$ and
the martingale $\bar M_t$ in Theorem 4.1, which is conditionally Gaussian given $X$.
The left panel of Fig. 2 compares the approximate loss distribution with the actual loss
distribution for specific parameter choices. It is evident from the numerical comparisons
that the second-order approximation has increased accuracy, especially for smaller
portfolios and in the tail of the distribution. The right panel of Fig. 2 compares the 95
and 99 percent value at risk (VaR) of the actual loss, of the LLN approximation (11),
and of approximation (14) for a pool of N = 1,000 names. It is also evident from the
figure that the approximation for the VaR based on (14) is much more accurate than
the law of large numbers approximation.
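The effect of the second-order correction on VaR can be illustrated in the simplest special case. The sketch below assumes independent, homogeneous names ($\beta^C = \beta^S = 0$), so that $N L^N_T$ is Binomial$(N, p)$ and the fluctuation limit reduces to a centered Gaussian with variance $p(1-p)$; it is not the general computation behind Fig. 2. The function names and the parameter values in the usage note are illustrative assumptions.

```python
import math
from statistics import NormalDist


def var_first_order(p):
    """LLN approximation: the loss is treated as the constant p."""
    return p


def var_second_order(p, n, q):
    """CLT-corrected VaR at level q: p + z_q * sqrt(p (1 - p) / n)."""
    z = NormalDist().inv_cdf(q)
    return p + z * math.sqrt(p * (1.0 - p) / n)


def var_exact(p, n, q):
    """Exact VaR: smallest k/n such that P{Binomial(n, p) <= k} >= q."""
    cdf = 0.0
    for k in range(n + 1):
        # Binomial pmf evaluated in log space to avoid overflow for large n.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1.0 - p))
        cdf += math.exp(log_pmf)
        if cdf >= q:
            return k / n
    return 1.0
```

For instance, with the hypothetical values `p = 0.1`, `n = 1000` and `q = 0.99`, the first-order value is simply `p`, whereas `var_second_order` lands much closer to `var_exact`, in line with the qualitative message of the right panel of Fig. 2.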
Let us close this section with a few words on the actual mathematical result. It
turns out that the convergence $\bar\Xi = \lim_{N\to\infty}\Xi^N$ happens in an appropriate weighted
Hilbert space, which we denote by $W_0^{J}(w,\rho)$, with $w$ and $\rho$ the appropriate weight
functions and $J\in\mathbb{N}$; $W_0^{-J}(w,\rho)$ will be its dual. Such weighted Sobolev spaces
were introduced in [33] and further generalized in [41] to study stochastic partial
differential equations with unbounded coefficients. These weighted spaces turn out
to be convenient for the present situation, see [25].
[Figure 2. Left panel: actual loss distribution, first-order approximation and second-order approximation, plotted against the loss for N = 150, 250 and 1000. Right panel: actual VaR, first-order approximation VaR and second-order approximation VaR at the 95% and 99% levels, plotted against time.]

Fig. 2 On the left: comparison of approximate and actual loss distributions of the failure rate $L^N_t$
for different $N$ at $t = 0.5$. Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) = (0.9, 4, 0.2, 0.2, 1, 1)$. On
the right: comparison of approximate and actual VaR. Parameter choices: $(\sigma, \alpha, \bar\lambda, \lambda_0, \beta^C, \beta^S) = (0.9, 4, 0.2, 0.2, 1, 1)$.
In both cases, $X$ is an OU process with reversion speed 2, volatility 1, initial
value 1 and mean 1

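In order to state the convergence result, we introduce some operators. Let $p \in \mathcal{P} \subset \mathbb{R}^6$ and, for $f, g \in C_b^2(\mathcal{P})$, define

\[
\begin{aligned}
(\mathcal{G}_{x,\mu} f)(p) &= (\mathcal{L}_1 f)(p) + (\mathcal{L}_3^{x} f)(p) + \langle Q, \mu\rangle\,(\mathcal{L}_2 f)(p) + \langle \mathcal{L}_2 f, \mu\rangle\, Q(p), \\
(\mathcal{L}_5(f,g))(p) &= \sigma^2\,\frac{\partial f}{\partial\lambda}(p)\,\frac{\partial g}{\partial\lambda}(p)\,\lambda, \\
(\mathcal{L}_6(f,g))(p) &= f(p)\,g(p)\,\lambda, \\
(\mathcal{L}_7 f)(p) &= f(p)\,\lambda.
\end{aligned}
\]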

Then, we have the following theorem related to the fluctuations analysis.


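Theorem 4.1 (Theorem 4.1 in [25]) For $J > 0$ large enough and for appropriate
weight functions $(w,\rho)$, the sequence $\{\Xi^N_t,\ t\in[0,T]\}_{N\in\mathbb{N}}$ is relatively compact
in $D_{W_0^{-J}(w,\rho)}[0,T]$. The limit accumulation point of $\Xi^N$,
denoted by $\bar\Xi$, is unique in $W_0^{-J}(w,\rho)$ and satisfies the stochastic evolution equation

\[
\langle f, \bar\Xi_t\rangle = \langle f, \bar\Xi_0\rangle + \int_0^t \langle \mathcal{G}_{X_s,\bar\mu_s} f, \bar\Xi_s\rangle\,ds + \int_0^t \langle \mathcal{L}_4^{X_s} f, \bar\Xi_s\rangle\,dV_s + \langle f, \bar M_t\rangle, \quad \text{a.s.} \tag{15}
\]

for any $f \in W_0^{J}(w,\rho)$, where $\bar M$ is a distribution-valued martingale with predictable
variation process

\[
[\langle f, \bar M\rangle]_t = \int_0^t \Big[ \langle \mathcal{L}_5(f,f), \bar\mu_s\rangle + \langle \mathcal{L}_6(f,f), \bar\mu_s\rangle + \langle \mathcal{L}_2 f, \bar\mu_s\rangle^2\,\langle Q, \bar\mu_s\rangle - 2\,\langle \mathcal{L}_7 f, \bar\mu_s\rangle\,\langle \mathcal{L}_2 f, \bar\mu_s\rangle \Big]\,ds.
\]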
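Conditional on the $\sigma$-algebra $\mathcal{V}_t$ that is generated by the $V$-Brownian motion, $\bar M_t$
is centered Gaussian with covariance function, for $f, g \in W_0^{J}(w,\rho)$, given by

\[
\begin{aligned}
\mathrm{Cov}\Big[\langle f, \bar M_{t_1}\rangle, \langle g, \bar M_{t_2}\rangle \,\Big|\, \mathcal{V}_{t_1\vee t_2}\Big]
= \mathbb{E}\Big[ \int_0^{t_1\wedge t_2} \Big[ &\langle \mathcal{L}_5(f,g), \bar\mu_s\rangle + \langle \mathcal{L}_6(f,g), \bar\mu_s\rangle \\
&+ \langle \mathcal{L}_2 f, \bar\mu_s\rangle\,\langle \mathcal{L}_2 g, \bar\mu_s\rangle\,\langle Q, \bar\mu_s\rangle \\
&- \langle \mathcal{L}_7 g, \bar\mu_s\rangle\,\langle \mathcal{L}_2 f, \bar\mu_s\rangle \\
&- \langle \mathcal{L}_7 f, \bar\mu_s\rangle\,\langle \mathcal{L}_2 g, \bar\mu_s\rangle \Big]\,ds \,\Big|\, \mathcal{V}_{t_1\vee t_2}\Big].
\end{aligned}
\tag{16}
\]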

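It is clear that if $\beta^S_n = 0$ for all $n$, then the limiting distribution-valued martingale
$\bar M$ is centered Gaussian with covariance operator given by the (now deterministic)
term within the expectation in (16).
The main idea for the derivation of (15) comes from the proof of the convergence
to the solution of (9). Define

\[
(\mathcal{L}_1^{\circ} f)(p) = \frac{1}{2}\sigma^2\lambda\,\frac{\partial^2 f}{\partial\lambda^2}(p) - \alpha(\lambda - \bar\lambda)\,\frac{\partial f}{\partial\lambda}(p)
\]

for $p = (\lambda, \alpha, \bar\lambda, \sigma, \beta^C, \beta^S)$. Let us also assume for the moment that $\beta^S_n = 0$ for every
$n \in \mathbb{N}$, i.e., let us neglect exposure to the exogenous risk $X$ and focus on contagion.
Then we can write the evolution of $\langle f, \mu^N_t\rangle$ as

\[
\begin{aligned}
d\langle f, \mu^N_t\rangle &= \frac{1}{N}\sum_{n=1}^N (\mathcal{L}_1^{\circ} f)(p^n_t)\,1_{\{t<\tau_n\}}\,dt - \frac{1}{N}\sum_{n=1}^N f(p^n_t)\,\lambda^n_t\,1_{\{t<\tau_n\}}\,dt \\
&\quad + \frac{1}{N}\sum_{n=1}^N \Big\{ f\Big(p^n_t + \frac{\beta^C_n}{N}\,e_1\Big) - f(p^n_t) \Big\}\,1_{\{t<\tau_n\}} \sum_{m=1}^N \lambda^m_t\,1_{\{t<\tau_m\}}\,dt + dM_t \\
&\approx \langle \mathcal{L}_1 f, \mu^N_t\rangle\,dt + \langle \mathcal{L}_2 f, \mu^N_t\rangle\,\langle Q, \mu^N_t\rangle\,dt + dM_t,
\end{aligned}
\]

where $M$ is a martingale which may change from line to line. This leads to (9) when
$\beta^S_n = 0$ for every $n \in \mathbb{N}$, see [20].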
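To get the Gaussian correction, we see that

\[
d\langle f, \Xi^N_t\rangle \approx \Big\{ \langle \mathcal{L}_1 f, \Xi^N_t\rangle + \langle \mathcal{L}_2 f, \Xi^N_t\rangle\,\langle Q, \mu^N_t\rangle + \langle \mathcal{L}_2 f, \bar\mu_t\rangle\,\langle Q, \Xi^N_t\rangle \Big\}\,dt + dM_t,
\]

where $M$ is a martingale. For large $N$, $M$ should be Gaussian, in which case $\Xi^N$ is
indeed a Gaussian process. Putting the systematic risk process $X$ back into (2)–(5),
one recovers the result of Theorem 4.1.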

5 Analysis of Tail Events: Large Deviations

Once we have identified what is typical, we can study the structure of atypically
large failure rates. Large deviations outlines a circle of ideas and calculations for
understanding the origination and transformation of rare events (see [42, 43]). Large
deviation arguments allow us to identify the “dominant” way that rare events will
occur in complex systems. This is the feature that is being exploited in [26], i.e., how
different sources of stochasticity can lead to system collapse.
By the discussion in Sect. 3, the pool has a default rate $L_T = 1 - \bar\mu_T(\mathcal{P})$ at time $T$.
Let us fix $\ell > L_T$. Then $\lim_{N\to\infty}\mathbb{P}\{L^N_T \ge \ell\} = 0$; it is a rare
event that the default rate in the pool exceeds $\ell$. We want to understand as much as
possible about $\{L^N_T \ge \ell\}$.
Using the theory of large deviations, we can understand both how rare this event
is and what the “most likely” way is for this rare event to occur. Events far from
equilibrium crucially depend on how rare events propagate through the system. Large
deviations theory gives rigorous ways to understand these effects, and we want to use this
machinery to understand the structure of atypically large default clusters in the
portfolio. A reference for large deviations is [44].
If we have that

\[
\mathbb{P}\{L^N \approx \varphi\} \approx \exp\left[-N I(\varphi)\right], \quad \text{as } N\to\infty,
\]

for some appropriate functional $I$, then by the contraction principle we should
have that

\[
\mathbb{P}\{L^N_T \approx \ell\} \approx \exp\left[-N I'(\ell)\right], \quad \text{as } N\to\infty, \tag{17}
\]

where

\[
I'(\ell) = \inf\{I(\varphi) : \varphi(T) = \ell\} \tag{18}
\]

(in other words, $I'$ is the large deviations rate function for $L^N_T$). This gives us the rate
at which the tail of the default rate $L^N_T$ decays as the diversification parameter grows.
More importantly, though, the variational problem (18) gives us the preferred way
in which atypically large default rates occur. Namely, if there is a $\varphi^* : [0,T] \to [0,1]$
such that

\[
I'(\ell) = I(\varphi^*),
\]

then for any $\delta > 0$, the Gibbs conditioning principle suggests that

\[
\lim_{N\to\infty}\mathbb{P}\big\{\|L^N_\cdot - \varphi^*\| \ge \delta \,\big|\, L^N_T \ge \ell\big\} = 0.
\]

Insights into large deviations of (2)–(5) have been developed in [26] when $\varepsilon \downarrow 0$
and when $\varepsilon = O(1)$ as $N \nearrow \infty$. We note here that in the case $\varepsilon = O(1)$, the large
deviations principle is conditional on the systematic risk $X$. Such results allow us to
study the comparative effect of the systematic risk process $X$ and of the contagion
feedback on the tails of the loss distribution.
Before presenting the result, let us first investigate numerically a test case, which
is indicative of the kind of results that large deviations theory can give us. Apart
from approximating the tail of the distribution, large deviations can give quantitative
insights into the most likely path to failure of a system.
For presentation purposes and for the rest of this section, we assume that $\varepsilon = \varepsilon_N \downarrow 0$
as $N \uparrow \infty$. Consider a heterogeneous test portfolio composed initially of
N = 200 names. Let us assume that we can separate the names in the portfolio
into three types: Type A is 16.67 % of the names, Type B is 33.33 % of the names
and Type C is 50 % of the names. For presentation purposes, we assume that all
parameters but the contagion parameter are the same among the different types. The
parameter choices are given in Table 1.
It is instructive to compare the different cases, based on whether there are contagion
effects in the default intensities or not. In particular, we compare two different
cases: (a) systematic risk only, $\beta^S \ne 0$, $\beta^C = 0$, and (b) systematic risk and contagion,
$\beta^S \ne 0$, $\beta^C \ne 0$. In each case, the time horizon is $T = 1$.
Using the methods of Sect. 3, one can compute the typical loss in such a pool
at time $T = 1$. If contagion effects are not present, i.e., if $\beta^C_A = \beta^C_B = \beta^C_C = 0$,
then the typical loss in such a portfolio at time $T = 1$ is $L_T = 42.5$ %. If, on the
other hand, contagion (feedback) effects are present and the $\beta^C$ parameters take the
values of Table 1, then the typical loss in such a portfolio at time $T = 1$
increases to $L_T = 72.1$ %. In Fig. 3, we plot the large deviations rate functions for
each of the two different cases. As we saw in the beginning of this section, the rate
function governs the asymptotics of the tail of the loss distribution. Notice that in
every case, the rate function is convex and vanishes at the corresponding law
of large numbers limit.
Moreover, since the contagion parameter of Type A is higher than the contagion
parameter of Type B or C, one expects that names of Type A will be more prone
to the contagious impact of defaults. Indeed, after computing the rate function and
the associated extremals, as defined by large deviations theory, one gets the most
likely paths to failure shown in Figs. 4 and 5. The $\varphi(t)$ trajectories correspond to the
contagion extremals for each of the three types, whereas $\psi(t)$ corresponds to the
systematic risk extremal.
One can draw two conclusions from Figs. 4 and 5. The first conclusion is related
to the $\varphi$ extremals (Fig. 4). We notice that at any given time $t$, the extremal for Type
A is bigger than the extremal for Type B, which in turn is bigger than the extremal of
Type C. This implies that unlikely large losses for components of Type A are more
likely than unlikely large losses for components of Type B, which are more likely
than large losses for components of Type C. Thus, components of Type A affect
the pool more than components of Type B, which in turn affect the pool more than
components of Type C, even though Type A composes 16.67 % of the pool, whereas
Type B composes 33.33 % of the pool and Type C composes 50 % of the pool. The
second conclusion is related to the $\psi$ extremals (Fig. 5). We notice that the effect
of the systematic risk is most profound in the beginning but then its significance
decreases.

Table 1 Parameter values for a test portfolio composed of three types of assets. We take $\varepsilon_N = 1/\sqrt{N}$

          α     λ̄    σ     λ0    γ    β^S   β^C
Type A    0.5   2    0.5   0.2   1    1     10
Type B    0.5   2    0.5   0.2   1    1     3
Type C    0.5   2    0.5   0.2   1    1     1

[Figure 3. Rate functions plotted against the default rate, for the cases “Systematic risk only” and “Systematic risk and contagion”.]

Fig. 3 Rate function governing the log-asymptotics of the tail of the loss distribution

[Figure 4. phi extremal plotted against time for Type A, Type B and Type C.]

Fig. 4 Optimal $\varphi(t)$ trajectories for the three different types in the pool for $t \in [0,1]$ and $\ell = 0.81$
[Figure 5. psi extremal plotted against time, for the cases “Systematic risk only” and “Systematic risk and contagion”.]

Fig. 5 Comparing optimal $\psi(t)$ trajectories in the case of absence and presence of the contagion
effects for $t \in [0,1]$ and $\ell = 0.81$

Namely, if a large cluster were to occur, the systematic risk factor is likely to
play an important role in the beginning, but then the contagion effects become more
important. Assets of Type A are likely to contribute to the default clustering effect
more, followed by assets of Type B and the ones that will contribute the least to the
default cluster are assets of Type C.
As is also seen in the numerical experiments done in [26], the large deviations
analysis helps quantify the effect that the contagion and the systematic risk factor have
on the behavior of the extremals (the most likely path to failure). An understanding
of the role of the preferred paths to large default rates and the most likely ways in
which contagion and systematic risk combine to lead to large default rates would
give useful insights into how to optimally hedge against such events.
Let us next motivate the development of the large deviations principle
for the default timing model (2)–(5) that is considered in this paper.
We denote scenarios, i.e., defaults, that are not in $[0,T]$ by an abstract point $\star$ not
in $[0,T]$ and define the Polish space

\[
\mathcal{T} = [0,T] \cup \{\star\}.
\]

To motivate things, let us first assume for simplicity that $\beta^C = \beta^S = 0$ and that
the system is homogeneous, i.e., that $p_n = p$ for all $n$. Define

\[
d\lambda_t = -\alpha(\lambda_t - \bar\lambda)\,dt + \sigma\sqrt{\lambda_t}\,dW_t, \qquad t > 0,
\]

with $\lambda_0 = \lambda_\circ$. This Feller diffusion will represent the conditional intensity of a
“randomly-selected” component of our (homogeneous and independent) system.
Define the measure $\mu_0 \in \mathscr{P}(\mathbb{R}_+)$ by setting

\[
\mu_0[0,t] = 1 - \mathbb{E}\Big[\exp\Big(-\int_0^t \lambda_s\,ds\Big)\Big]
\]

for all $t > 0$; $\mu_0$ is the common law of the default times $\tau_n$.
In the independent case, i.e., when $\beta^C = 0$, the standard Sanov's theorem [44]
implies that $\{dL^N\}_{N\in\mathbb{N}}$ has a large deviations principle with rate function

\[
H(\nu,\mu_0) = \int_{t\in\mathcal{T}} \ln\frac{d\nu}{d\mu_0}(t)\,\nu(dt)
\]

if $\nu \ll \mu_0$ and $H(\nu,\mu_0) = \infty$ if $\nu \not\ll \mu_0$ (i.e., $H(\nu,\mu_0)$ is the relative entropy of $\nu$
with respect to $\mu_0$). By the contraction principle, the rate function for $L^N_T$ is

\[
I^{\mathrm{ind},\prime}(\ell) = \inf\big\{ H(\nu,\mu_0) : \nu \in \mathscr{P}(\mathbb{R}_+),\ \nu[0,t] = \varphi(t) \text{ for all } t\in[0,T] \text{ and } \nu[0,T] = \ell \big\}.
\]

In the independent case, we can actually compute both the extremal $\varphi$ that achieves
the infimum and the corresponding rate function $I^{\mathrm{ind},\prime}(\ell)$ in closed form.
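Assume that $\mu_0[0,T] \in (0,1)$ and $\ell \in (0,1)$. Fix $\nu \in \mathscr{P}(\mathcal{T})$ such that $\nu[0,T] = \ell$.
Define

\[
\mu_{0,-}(A) = \frac{\mu_0(A\cap[0,T])}{\mu_0[0,T]} \quad\text{and}\quad \nu_-(A) = \frac{\nu(A\cap[0,T])}{\ell}
\]

for all $A \in \mathcal{B}[0,T]$. Then $\mu_{0,-}$ and $\nu_-$ are in $\mathscr{P}[0,T]$. We can write that

\[
H(\nu,\mu_0) = \ell\,H_{[0,T]}(\nu_-,\mu_{0,-}) + \ell\ln\frac{\ell}{\mu_0[0,T]} + \nu\{\star\}\ln\frac{\nu\{\star\}}{\mu_0\{\star\}}, \tag{19}
\]

where $H_{[0,T]}$ is entropy on $\mathscr{P}[0,T]$. We can minimize the $H_{[0,T]}$ term by setting $\nu_- = \mu_{0,-}$,
and, since $\nu\{\star\} = 1-\ell$, we get that

\[
\begin{aligned}
I^{\mathrm{ind},\prime}(\ell) &= \ell\ln\frac{\ell}{\mu_0[0,T]} + (1-\ell)\ln\frac{1-\ell}{\mu_0\{\star\}} \\
&= \ell\ln\frac{\ell}{\mu_0[0,T]} + (1-\ell)\ln\frac{1-\ell}{1-\mu_0[0,T]}.
\end{aligned}
\tag{20}
\]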
This is in fact obvious; $L^N_T = \frac{1}{N}\sum_{n=1}^N 1_{\{\tau_n\le T\}}$, and in this case the $1_{\{\tau_n\le T\}}$ are i.i.d.
Bernoulli random variables with common bias $\mu_0[0,T]$. The rate function $I^{\mathrm{ind},\prime}(\ell)$
of (20) is the entropy of Bernoulli coin flips. Of more interest, however, is the optimal
path. In setting $\nu_- = \mu_{0,-}$ in (19), we essentially identify the optimal path

\[
\varphi(t) = \ell\,\frac{\mu_0[0,t]}{\mu_0[0,T]},
\]

where the last relation holds since we also require $\varphi(T) = \ell$.
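The closed-form expressions of the independent case are easy to evaluate. The following minimal sketch computes the rate function (20) and the optimal path on a time grid; it takes the values $\mu_0[0,t_j]$ as given inputs, and the function names as well as the constant-intensity example in the usage note are assumptions made for illustration only.

```python
import math


def rate_independent(ell, p):
    """Rate function (20): relative entropy of Bernoulli(ell) w.r.t. Bernoulli(p),
    where p stands for mu_0[0, T]."""
    if not (0.0 < ell < 1.0 and 0.0 < p < 1.0):
        raise ValueError("ell and p must lie strictly between 0 and 1")
    return ell * math.log(ell / p) + (1.0 - ell) * math.log((1.0 - ell) / (1.0 - p))


def optimal_path(ell, mu0_cdf):
    """Optimal path phi(t_j) = ell * mu_0[0, t_j] / mu_0[0, T] on a time grid.

    mu0_cdf lists the values mu_0[0, t_j]; its last entry is mu_0[0, T].
    """
    total = mu0_cdf[-1]
    return [ell * m / total for m in mu0_cdf]
```

As a usage example under a hypothetical constant intensity $\lambda$, one has $\mu_0[0,t] = 1 - e^{-\lambda t}$, so `optimal_path(ell, [1 - math.exp(-lam * t) for t in grid])` rescales the typical default curve so that it ends at `ell`, and `rate_independent(ell, p)` vanishes exactly when `ell == p`.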


It turns out that one can extend this result to give a generalized Sanov's theorem
for the case $\beta^C > 0$, where $dL^N$ feeds back into the dynamics of the $\lambda^n$'s. The
case $\beta^S > 0$ can be treated using a conditioning argument and the well-developed
theory of large deviations for small noise diffusions. For the heterogeneous case,
one needs an additional variational step which minimizes over all the possible ways
that losses are distributed among systems of different types. Even though an explicit
closed form expression for the extremals and for the corresponding rate function is
no longer possible, one can still compute them numerically. Let us make
this discussion precise.
To fix the discussion, let us assume (see [26] for the general case) that the exogenous
risk $X$ is of Ornstein-Uhlenbeck type, i.e.,

\[
dX_t = -\gamma X_t\,dt + dV_t, \qquad X_0 = x_\circ.
\]

Let $W^*$ be a reference Brownian motion. Fix a name in the pool $p = (\lambda_\circ, \alpha, \bar\lambda, \sigma, \beta^C, \beta^S) \in \mathcal{P}$
and a time horizon $T > 0$.
The Freidlin-Wentzell theory of large deviations for SDEs gives us a natural
starting point. In the Freidlin-Wentzell analysis, a dominant ODE is subjected to a
small diffusive perturbation; informally, the Freidlin-Wentzell theory tells us that
if we want to find the probability that the randomly-perturbed path is close to a
reference trajectory, we should use that reference trajectory in the dynamics. This
leads to the correct LDP rate function for the original SDE. If we want to find the
asymptotics of the probability that $dL^N \approx d\varphi$, $\varepsilon_N\,dX \approx d\psi$ for some absolutely
continuous functions $\varphi$ and $\psi$, i.e., $\varphi, \psi \in AC([0,T],\mathbb{R})$, we should consider the
stochastic hazard functions

\[
d\lambda^{\varphi,\psi}_t = -\alpha(\lambda^{\varphi,\psi}_t - \bar\lambda)\,dt + \sigma\sqrt{\lambda^{\varphi,\psi}_t}\,dW^*_t + \beta^C\,d\varphi(t) + \beta^S\lambda^{\varphi,\psi}_t\,d\psi(t), \quad t\in[0,T], \qquad \lambda^{\varphi,\psi}_0 = \lambda_\circ.
\]

This will represent the conditional intensity of a “randomly-selected” name in our
pool. Define next

\[
f^{p}_{\varphi,\psi}(t) = \mathbb{E}\Big[\lambda^{\varphi,\psi}_t \exp\Big(-\int_{s=0}^{t}\lambda^{\varphi,\psi}_s\,ds\Big)\Big],
\]

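where we have used the superscript $p$ to denote the dependence on the particular
type. Then for every $t\in[0,T]$ we have that

\[
\int_{s=0}^{t} f^{p}_{\varphi,\psi}(s)\,ds = 1 - \mathbb{E}\Big[\exp\Big(-\int_{s=0}^{t}\lambda^{\varphi,\psi}_s\,ds\Big)\Big] = \mathbb{P}\Big\{\int_{s=0}^{t}\lambda^{\varphi,\psi}_s\,ds > \mathfrak{e}\Big\},
\]

where $\mathfrak{e}$ is an exponential(1) random variable which is independent of $W^*$. In other
words, $f^{p}_{\varphi,\psi}$ is the density (up to time $T$) of a default time whose conditional intensity
is $\lambda^{\varphi,\psi}$. In fact, due to the affine structure of the model, we have an explicit expression
for $f^{p}_{\varphi,\psi}$ (see Lemma 4.1 in [26]).
For given trajectories $\varphi$ and $\psi$ in $AC([0,T];\mathbb{R})$, define $\mu^{p}_{\varphi,\psi} \in \mathscr{P}(\mathcal{T})$ as

\[
\mu^{p}_{\varphi,\psi}(A) = \int_{t\in A\cap[0,T]} f^{p}_{\varphi,\psi}(t)\,dt + \delta_{\star}(A)\Big(1 - \int_0^T f^{p}_{\varphi,\psi}(t)\,dt\Big)
\]

for all $A \in \mathcal{B}(\mathcal{T})$.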


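At a heuristic level one can derive the large deviations principle as follows. Let
us assume that we can establish that

\[
\mathbb{P}\{L^N \approx \varphi \,|\, X^N \approx \psi\} \approx \exp\left[-N I^{\circ}(\varphi,\psi)\right]
\]

and that $X^N_\cdot = \varepsilon_N X_\cdot$, $N < \infty$, also has a large deviations principle in $C([0,T];\mathbb{R})$
with action functional $J_X$; i.e.,

\[
\mathbb{P}\{X^N \approx \psi\} \approx \exp\Big[-\frac{1}{\varepsilon_N^2} J_X(\psi)\Big]
\]

as $N \nearrow \infty$. Then, we should have that

\[
\mathbb{P}\{L^N \approx \varphi,\ X^N \approx \psi\} \approx \exp\Big[-N I^{\circ}(\varphi,\psi) - \frac{1}{\varepsilon_N^2} J_X(\psi)\Big].
\]

In fact, the previous heuristics can be carried out rigorously, and in the end one derives
the following large deviations result.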
Theorem 5.1 (Theorem 3.8 in [26]) Consider the system defined in (2)–(5) with
$\lim_{N\to\infty}\varepsilon_N = 0$ such that $\lim_{N\to\infty} N\varepsilon_N^2 = c \in (0,\infty)$ and let $T < \infty$. Under
the appropriate assumptions, the family $\{L^N_T,\ N\in\mathbb{N}\}$ satisfies the large deviation
principle with rate function

\[
\begin{aligned}
I'(\ell) = \inf\Big\{ I(\varphi,\psi) :\ &\varphi \in C(\mathcal{P}\times[0,T]),\ \psi \in C([0,T]),\ \psi(0) = \varphi(p,0) = 0, \\
&\bar\varphi(s) = \int_{\mathcal{P}} \varphi(p,s)\,U(dp),\ \bar\varphi(T) = \ell \Big\},
\end{aligned}
\]

where if $\varphi \in AC(\mathcal{P}\times[0,T])$, $\psi \in AC([0,T])$, $\psi(0) = \varphi(p,0) = 0$, then

\[
I(\varphi,\psi) = \int_{\mathcal{P}} H\big(\varphi(p), \mu^{p}_{\bar\varphi,\psi}\big)\,U(dp) + \frac{1}{c} J_X(\psi)
\]

and $I(\varphi,\psi) = \infty$ otherwise. Here, $J_X(\psi)$ is the rate function for the process
$\{\varepsilon_N X,\ N < \infty\}$. Namely, for $\psi \in AC([0,T];\mathbb{R})$ with $\psi(0) = 0$ we have

\[
J_X(\psi) = \frac{1}{2}\int_0^T \big|\dot\psi(s) + \gamma\psi(s)\big|^2\,ds
\]

and $J_X(\psi) = \infty$ otherwise. $I'(\ell)$ has compact level sets.
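When the extremals are computed numerically, the action functional $J_X$ has to be evaluated on discretized paths. The following is a minimal sketch of one way to do so, assuming a uniform time grid and a midpoint rule; the discretization and the function name `ou_action` are choices made here for illustration and are not taken from [26].

```python
import numpy as np


def ou_action(psi, T, gamma):
    """Midpoint-rule approximation of J_X(psi) = 0.5 * int_0^T (psi' + gamma psi)^2 ds.

    psi holds the values of the path on a uniform grid over [0, T], with psi[0] = 0.
    """
    psi = np.asarray(psi, dtype=float)
    h = T / (len(psi) - 1)                 # grid spacing
    dpsi = np.diff(psi) / h                # finite-difference derivative per cell
    mid = 0.5 * (psi[1:] + psi[:-1])       # path value at cell midpoints
    return 0.5 * h * np.sum((dpsi + gamma * mid) ** 2)
```

For the linear path $\psi(t) = t$ on $[0,1]$ the exact values are $J_X(\psi) = 1/2$ when $\gamma = 0$ and $J_X(\psi) = 7/6$ when $\gamma = 1$, which gives a quick check of the discretization; the zero path has zero cost, consistent with $X^N = \varepsilon_N X$ collapsing to zero in the limit.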


If the heterogeneous portfolio is composed of $K$ different types of assets with
homogeneity within each type, then Theorem 5.1 simplifies to the following expression.
For $\xi, \varphi, \psi \in AC([0,T])$ let us define the functional

\[
g^{p}(\xi,\varphi,\psi) = \int_0^T \ln\Big(\frac{\dot\xi(t)}{f^{p}_{\varphi,\psi}(t)}\Big)\,\dot\xi(t)\,dt + \ln\Big(\frac{1-\xi(T)}{1-\int_0^T f^{p}_{\varphi,\psi}(t)\,dt}\Big)\,(1-\xi(T)).
\]

Due to the affine structure of the model, we have an explicit expression for $f^{p}_{\varphi,\psi}$ (see
Lemma 4.1 in [26]).
Assume that $\kappa_i$ % of the names are of type $A_i$ with $i = 1,\dots,K$ and $\sum_{i=1}^K \kappa_i = 100$.
Setting $\varphi(p,s) = \sum_{i=1}^K \frac{\kappa_i}{100}\varphi_{A_i}(s)\chi_{\{p_{A_i}\}}(p)$, we get the following simplified
expression for the rate function:

\[
\begin{aligned}
I'(\ell) = \inf\Big\{ &\sum_{i=1}^K \frac{\kappa_i}{100}\, g^{p_{A_i}}(\varphi_{A_i},\varphi,\psi) + \frac{1}{c} J_X(\psi) :\ \varphi(t) = \sum_{i=1}^K \frac{\kappa_i}{100}\varphi_{A_i}(t) \text{ for every } t\in[0,T], \\
&\varphi(T) = \ell,\ \varphi_{A_i}(0) = \psi(0) = 0,\ \varphi_{A_i}, \psi \in AC([0,T]) \text{ for every } i = 1,\dots,K \Big\}.
\end{aligned}
\]

An optimization algorithm can then be employed to solve the minimization problem
associated with $I'(\ell)$ and compute the extremals $\varphi_{A_i}$ for $i = 1,\dots,K$ and $\psi$. This
is the formula that the numerical example presented in Figs. 4 and 5 was based on.
In the numerical example that was considered there we had three types, i.e., $K = 3$.
The large deviations results have a number of important applications. First, they
lead to an analytical approximation of the tail of the distribution of the failure rate
$L^N$ for large systems. These approximations complement the first- and second-order
approximations suggested by the law of large numbers and fluctuations analysis of
Sects. 3 and 4, respectively, and facilitate the estimation of the likelihood of systemic
collapse. Second, the large deviations results provide an understanding of the
“preferred” ways of collapse, which can also be used to design “stress tests” for the
system. In particular, this understanding can guide the selection of meaningful stress
scenarios to be analyzed. Third, they can motivate the design of asymptotically
efficient importance sampling schemes for the tail of the portfolio loss. We discuss
some of the related issues in Sect. 6.

6 Monte Carlo Methods for Estimation of Tail Events: Importance Sampling

Suppose we want to estimate $\mathbb{P}\{L^N_T \ge \ell\}$ by simulation, where
$\lim_{N\to\infty}\mathbb{P}\{L^N_T \ge \ell\} = 0$ again holds. Accurate estimates of such rare-event probabilities are important
in many application areas of our system (2)–(5), including credit risk management,
insurance, communications and reliability. Monte Carlo methods are widely used
to obtain such estimates in large complex systems such as ours; see, for example,
[29, 30, 45–51].
Standard Monte Carlo sampling techniques perform very poorly in estimating
rare events (for which, by definition, most samples can be discarded). Importance
sampling, which involves a change of measure, can be used to address this issue.
In general, large deviations theory provides an optimal way to ‘tilt’ measures. The
variational problems identified by large deviations usually lead to measure transfor-
mations under which pre-specified rare events become much more likely, but which
give unbiased estimates of probabilities of interest; see for example [28, 34, 52–56].
Let $\Gamma^N$ be any unbiased estimator of $\mathbb{P}\{L^N_T \ge \ell\}$ that is defined on some
probability space with probability measure $\mathbb{Q}$. In other words, $\Gamma^N$ is a random variable
such that $\mathbb{E}^{\mathbb{Q}}\Gamma^N = \mathbb{P}\{L^N_T \ge \ell\}$, where $\mathbb{E}^{\mathbb{Q}}$ is the expectation operator associated
with $\mathbb{Q}$. In our setting, it takes the form

\[
\Gamma^N = 1_{\{L^N_T \ge \ell\}}\,\frac{d\mathbb{P}}{d\mathbb{Q}},
\]

where $\frac{d\mathbb{P}}{d\mathbb{Q}}$ is the associated Radon-Nikodym derivative.
Importance sampling involves the generation of independent copies of $\Gamma^N$ under
$\mathbb{Q}$; the estimate is the sample mean. The specific number of samples required depends
on the desired accuracy, which is measured by the variance of the sample mean.
However, since the samples are independent, it suffices to consider the variance of
a single sample. Because of unbiasedness, minimizing the variance is equivalent to
minimizing the second moment. By Jensen's inequality,
$\mathbb{E}^{\mathbb{Q}}(\Gamma^N)^2 \ge (\mathbb{E}^{\mathbb{Q}}\Gamma^N)^2 = \mathbb{P}\{L^N_T \ge \ell\}^2$, so in view of (17) the second moment
cannot decay faster than $\exp[-2N I'(\ell)]$. Hence, if

\[
\lim_{N\to\infty}\frac{1}{N}\ln \mathbb{E}^{\mathbb{Q}}(\Gamma^N)^2 = -2I'(\ell),
\]

then $\Gamma^N$ achieves this best decay rate and is said to be asymptotically optimal. One
wants to choose $\mathbb{Q}$ such that asymptotic optimality is attained.
To motivate things, let us assume for the moment that $\beta^C = \beta^S = 0$ and that
the system is homogeneous, i.e., that $p_n = p$ for all $n$. In the independent and
homogeneous case, $\xi_n = 1_{\{\tau_n\le T\}}$ are i.i.d. random variables such that for every
$t\in[0,T]$

\[
\mathbb{P}\{\tau_n \le t\} = \mathbb{P}\Big\{\int_0^t \lambda^{0,0}_s\,ds > \mathfrak{e}\Big\} = 1 - \mathbb{E}\Big[\exp\Big(-\int_0^t \lambda^{0,0}_s\,ds\Big)\Big] = \int_0^t f_{0,0}(s)\,ds.
\]

For notational convenience, we shall define

\[
p = \int_0^T f_{0,0}(s)\,ds.
\]

It is easy to see that

\[
N L^N_T \sim \mathrm{Binomial}(N, p).
\]

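To minimize the variance, we need to increase the probability of defaults. Define

\[
\Lambda^N(\theta;t) = \ln \mathbb{E}\big[e^{\theta L^N_t}\big].
\]

A simple computation shows that

\[
\bar\Lambda(\theta;T) = \lim_{N\to\infty}\frac{1}{N}\Lambda^N(N\theta;T) = \ln\big(p\,(e^{\theta}-1) + 1\big).
\]

Define

\[
p_\theta = \frac{p\,e^{\theta}}{1 + p\,(e^{\theta}-1)}.
\]

Clearly $p_0 = p$. Notice that the density of a Binomial$(N,p)$ with respect to a
Binomial$(N,p_\theta)$ is

\[
Z_\theta = \prod_{n=1}^N \Big(\frac{p}{p_\theta}\Big)^{\xi_n}\Big(\frac{1-p}{1-p_\theta}\Big)^{1-\xi_n} = \prod_{n=1}^N \big(1 + p\,(e^{\theta}-1)\big)\,e^{-\theta\xi_n} = e^{N\left(-\theta L^N_T + \bar\Lambda(\theta;T)\right)}.
\]

Therefore, for $\theta$ fixed, the suggestion is to simulate under a new measure
under which $N L^N_T \sim \mathrm{Binomial}(N, p_\theta)$ and to return the estimator

\[
\Gamma = \frac{1}{M}\sum_{i=1}^M 1_{\{L^{N,i}_T \ge \ell\}}\,e^{N\left(-\theta L^{N,i}_T + \bar\Lambda(\theta;T)\right)}.
\]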

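It is clear that this estimator is unbiased. We want to choose $\theta$ that minimizes the
variance, or equivalently the second moment. For this purpose, we define the second
moment of a single sample

\[
Q(\ell,\theta) = \mathbb{E}_\theta\Big[1_{\{L^N_T \ge \ell\}}\,e^{2N\left(-\theta L^N_T + \bar\Lambda(\theta;T)\right)}\Big].
\]

Notice that, for $\theta \ge 0$,

\[
-\frac{1}{N}\ln Q(\ell,\theta) \ge -2\,\frac{1}{N}\,N\big(-\theta\ell + \bar\Lambda(\theta;T)\big) = 2\big(\theta\ell - \bar\Lambda(\theta;T)\big).
\]

Due to convexity of $\bar\Lambda(\theta;T)$, the maximizer over $\theta\in[0,\infty)$ of the
lower bound is at $\theta^*$ such that $\frac{\partial\bar\Lambda(\theta^*;T)}{\partial\theta} = \ell$. In particular (recall that $\frac{\partial\bar\Lambda(0;T)}{\partial\theta} = p$),
we have

\[
\theta^* =
\begin{cases}
\ln\dfrac{\ell\,(1-p)}{p\,(1-\ell)}, & \text{if } \ell > p, \\[2mm]
0, & \text{if } \ell < p.
\end{cases}
\]

This construction means that under the new measure, we have

\[
\mathbb{P}_{\theta^*}\{\tau_n \le T\} = p_{\theta^*} = \ell.
\]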
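For completeness, the closed form of $\theta^*$ for $\ell > p$ can be checked by a short computation that uses only the definitions of $\bar\Lambda$ and $p_\theta$ given above; it is included here as a worked step rather than as part of the original argument.

```latex
\begin{align*}
\frac{\partial\bar\Lambda(\theta;T)}{\partial\theta}
  &= \frac{p\,e^{\theta}}{1 + p\,(e^{\theta}-1)} = p_\theta, \\
p_{\theta^*} = \ell
  &\iff p\,e^{\theta^*} = \ell\,(1-p) + \ell\,p\,e^{\theta^*} \\
  &\iff p\,e^{\theta^*}(1-\ell) = \ell\,(1-p) \\
  &\iff \theta^* = \ln\frac{\ell\,(1-p)}{p\,(1-\ell)} .
\end{align*}
```

In particular $\theta^* > 0$ exactly when $\ell > p$, so the tilt increases the default probability of each name from $p$ to $\ell$.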

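In fact, we have the following theorem.

Theorem 6.1 Let $\theta^* > 0$ be such that $\frac{\partial\bar\Lambda(\theta^*;T)}{\partial\theta} = \ell$. Then asymptotic optimality holds,
in the sense that

\[
\lim_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta^*) = 2I^{\mathrm{ind},\prime}(\ell),
\]

where $I^{\mathrm{ind},\prime}(\ell)$ is defined in (20).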

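Proof By Jensen's inequality we clearly have the upper bound. Namely, for every
$\theta\in[0,\infty)$,

\[
\limsup_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta) \le 2I^{\mathrm{ind},\prime}(\ell). \tag{21}
\]

Now, we need to prove that the lower bound is achieved for $\theta = \theta^*$, i.e., that

\[
\liminf_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta^*) \ge 2I^{\mathrm{ind},\prime}(\ell). \tag{22}
\]

Recalling that $\theta^* = \ln\frac{\ell(1-p)}{p(1-\ell)}$ and $p = \int_{s=0}^T f_{0,0}(s)\,ds$, we easily see that

\[
\begin{aligned}
\liminf_{N\to\infty} -\frac{1}{N}\ln Q(\ell,\theta^*) &\ge 2\big(\theta^*\ell - \bar\Lambda(\theta^*;T)\big) \\
&= 2\Big(\theta^*\ell - \ln\big(p\,(e^{\theta^*}-1) + 1\big)\Big) \\
&= 2\Big(\ell\ln\frac{\ell}{p} + (1-\ell)\ln\frac{1-\ell}{1-p}\Big) \\
&= 2I^{\mathrm{ind},\prime}(\ell).
\end{aligned}
\]

This concludes the proof of the theorem. $\square$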
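The tilted scheme of the independent, homogeneous case can be sketched in a few lines. The code below is a minimal illustration under the assumptions of this subsection ($\beta^C = \beta^S = 0$, so that $N L^N_T$ is Binomial$(N,p)$); the threshold is passed as an integer count `k_min` with $\ell = k_{\min}/N > p$, the indicator uses $\ge$, and the function names and NumPy-based sampling are choices made here, not taken from the cited references.

```python
import math

import numpy as np


def tilted_estimate(n, p, k_min, n_samples, seed=0):
    """Importance sampling estimate of P{N L_T >= k_min} for N L_T ~ Binomial(n, p).

    Samples are drawn under the tilted measure Binomial(n, p_theta*) and
    reweighted by Z_theta = exp(N (-theta L + Lambda_bar(theta))).
    """
    ell = k_min / n
    theta = math.log(ell * (1.0 - p) / (p * (1.0 - ell)))      # theta*, needs ell > p
    log_mgf = math.log(p * (math.exp(theta) - 1.0) + 1.0)       # Lambda_bar(theta; T)
    p_theta = p * math.exp(theta) / (1.0 + p * (math.exp(theta) - 1.0))  # equals ell
    rng = np.random.default_rng(seed)
    counts = rng.binomial(n, p_theta, size=n_samples)            # N L_T under Q
    weights = np.exp(-theta * counts + n * log_mgf)              # likelihood ratio
    return float(np.mean(np.where(counts >= k_min, weights, 0.0)))


def exact_tail(n, p, k_min):
    """Exact binomial tail probability, used only as a reference for small n."""
    return sum(math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
               for k in range(k_min, n + 1))
```

For example, with the hypothetical values `n = 200`, `p = 0.1` and `k_min = 40` (so $\ell = 0.2$), `tilted_estimate(200, 0.1, 40, 20000)` can be compared against `exact_tail(200, 0.1, 40)`; a plain Monte Carlo estimate with the same number of samples would see very few exceedances of the threshold.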

In the heterogeneous case, i.e., if $p_n$ can be different for each $n\in\mathbb{N}$, then
$N L^N_T = \sum_{n=1}^N 1_{\{\tau_n\le T\}}$ is no longer Binomial, but it is a sum of independent (but not identically
distributed) Bernoulli random variables with success probability

\[
p_n = \int_0^T f^{p_n}_{0,0}(s)\,ds
\]

indexed by $n$. Due to independence, methods similar to the one described above can
be used to construct asymptotically efficient importance sampling schemes in the
heterogeneous case.
The scheme just presented essentially amounts to a twist in the intensity of the
defaults. However, in contrast to the independent case, i.e., when $\beta^C = \beta^S = 0$, the
situation in the general dependent case $\beta^C, \beta^S \ne 0$ is more complicated. Notice also
that if at least one of the $\beta^C_n$'s is not zero, then the model (2)–(5) does not fall into the
category of doubly-stochastic models, so techniques such as the ones used in [45] do
not apply. Also, interacting particle schemes for Markov chain
models, such as the ones developed in [29, 47], do not readily apply to such intensity
models. The re-sampling schemes of [48] could apply in this setting, but one would
need to construct an appropriate mimicking Markov chain, and it is not
clear how to do this in the current setting.
We briefly present here an importance sampling scheme for the case where there
exists at least one $\beta^C_n \ne 0$, which also applies independently of whether the systematic
effects are present in the model or not. The suggested measure change essentially
mimics the principal idea behind the measure change for the independent case. To
be more precise, one directly twists the intensity of $N L^N_T = \sum_{n=1}^N 1_{\{\tau_n\le T\}}$.
Let $\{S_k\}$ be the arrival times of $N L^N$ and notice that $\{L^N_T \ge \ell\} = \{S_{\lceil N\ell\rceil} \le T\}$.
Let $M^n_s = 1_{\{\tau_n > s\}}$ and let $\theta^N_s \ge 1$ be some progressively measurable twisting process.
Then, define the measure $\mathbb{Q}$ via the Radon-Nikodym derivative

\[
Z_N = \exp\Big(-\int_0^{S_{\lceil N\ell\rceil}} \log\theta^N_{s-}\,d(N L^N_s) - \int_0^{S_{\lceil N\ell\rceil}} \big(1-\theta^N_s\big)\sum_{n=1}^N \lambda^n_s M^n_s\,ds\Big).
\]

It is known that if $\mathbb{E}\Big[e^{-\sum_{k=1}^{\lceil N\ell\rceil}\log\theta^N_{S_k-}}\Big] < \infty$, then $\mathbb{Q}$ defined by $\frac{d\mathbb{P}}{d\mathbb{Q}} = Z_N$
is a probability measure, and it can be shown that $N L^N_s$ admits $\mathbb{Q}$-intensity
$\theta^N_s\sum_{n=1}^N \lambda^n_s M^n_s$ on the interval $[0, S_{\lceil N\ell\rceil})$.
This construction gives us some freedom in choosing the twisting
process $\theta^N_s$ appropriately. Different choices of the twisting process $\theta^N_s$ are of course possible. For
tractability purposes we restrict attention to a one-parameter family and set

\[
\theta^N_s = \frac{\beta N}{\sum_{n=1}^N \lambda^n_s M^n_s} + 1.
\]

For any $\beta \ge 0$ and under the measure induced by $Z_N$, i.e., under $\mathbb{Q}_\beta$, the process
$N L^N_s$ has intensity $\sum_{n=1}^N \lambda^n_s M^n_s + \beta N$ on $[0, S_{\lceil N\ell\rceil})$, i.e., it amounts to an additive
shift of the intensity. Thus, $\beta$ is a superimposed default rate and its role is to increase
the default rate in the whole portfolio.
The purpose then is to optimize the limit as N → ∞ of the upper bound of the
second moment of the resulting estimator over β. This is the measure change that is
investigated in [57], and it is shown there that there is a choice of β = β ∗ for which
asymptotic optimality can be established. Namely, there is a choice of β = β ∗ that
minimizes the second moment of the estimator in the limit as N → ∞. We refer the
interested reader to [57] for implementation details on this change of measure for
related intensity models and for corresponding simulation results.

7 Conclusions

We presented an empirically motivated model of correlated default timing for large
portfolios. Large portfolio analysis allows one to approximate the distribution of the loss
from default, whereas Gaussian corrections make the approximation valid even for
portfolios of moderate size. The results can be used to compute the loss distribution
and to approximate portfolio risk measures such as Value-at-Risk or Expected
Shortfall. Large deviations analysis can then help understand the tail of the loss
distribution and find the most likely paths to systemic failure and to the creation of
default clusters. Such results give useful insights into the behavior of systemic risk
as a function of the characteristics of the names in the portfolio and can also
potentially be used to determine how to optimally safeguard against rare large losses.
Importance sampling techniques can be used to construct asymptotically efficient
estimators for tail event probabilities.

Acknowledgments The author was partially supported by the National Science Foundation (DMS
1312124).
Systemic Risk and Default Clustering for Large Financial Systems 555

References

1. Meinerding, C.: Asset allocation and asset pricing in the face of systemic risk: a literature
overview and assessment. Int. J. Theor. Appl. Financ. (IJTAF) 15(03), 1250023-1–1250023-27
(2012)
2. Vasicek, O.: Limiting loan loss probability distribution. Technical Report, KMV Corporation
(1991)
3. Lucas, A., Klaassen, P., Spreij, P., Straetmans, S.: An analytic approach to credit risk of large
corporate bond and loan portfolios. J. Bank. Financ. 25, 1635–1664 (2001)
4. Schloegl, L., O’Kane, D.: A note on the large homogeneous portfolio approximation with the
student-t copula. Financ. Stoch. 9(4), 577–584 (2005)
5. Gordy, M.B.: A risk-factor model foundation for ratings-based bank capital rules. J. Financ.
Intermed. 12, 199–232 (2003)
6. Bush, N., Hambly, B., Haworth, H., Jin, L., Reisinger, C.: Stochastic evolution equations in
portfolio credit modelling. SIAM J. Financ. Math. 2, 627–664 (2011)
7. Dembo, A., Deuschel, J.-D., Duffie, D.: Large portfolio losses. Financ. Stoch. 8, 3–16 (2004)
8. Glasserman, P., Kang, W., Shahabuddin, P.: Large deviations in multifactor portfolio credit
risk. Math. Financ. 17(3), 345–379 (2007)
9. Spiliopoulos, K., Sowers, R.: Recovery rates in investment-grade pools of credit assets: a large
deviations analysis. Stoch. Process. Appl. 121(12), 2861–2898 (2011)
10. Giesecke, K., Weber, S.: Credit contagion and aggregate losses. J. Econ. Dyn. Control 30,
741–767 (2006)
11. Dai Pra, P., Runggaldier, W., Sartori, E., Tolotti, M.: Large portfolio losses: a dynamic contagion
model. Ann. Appl. Probab. 19, 347–394 (2009)
12. Dai Pra, P., Tolotti, M.: Heterogeneous credit portfolios and the dynamics of the aggregate
losses. Stoch. Process. Appl. 119, 2913–2944 (2009)
13. Cvitanić, J., Ma, J., Zhang, J.: The law of large numbers for self-exciting correlated defaults.
Stoch. Process. Appl. 122(8), 2781–2810 (2012)
14. Sircar, R., Zariphopoulou, T.: Utility valuation of credit derivatives and application to CDOs.
Quant. Financ. 10(2), 195–208 (2010)
15. Garnier, J., Papanicolaou, G., Yang, T.-W.: Large deviations for a mean field model of systemic
risk. SIAM J. Financ. Math. 4, 151–184 (2012)
16. Fouque, J.-P., Ichiba, T.: Stability in a model of inter-bank lending. SIAM J. Financ. Math. 4,
784–803 (2013)
17. Ait-Sahalia, Y., Cacho-Diaz, J., Laeven, R.: Modeling financial contagion using mutually excit-
ing jump processes. To appear in Journal of Financial Economics
18. Duan, J.-C.: Maximum likelihood estimation using price data of the derivative contract. Math.
Financ. 4, 155–167 (1994)
19. Bielecki, T., Crépey, S., Herbertsson, A.: Markov chain models of portfolio credit risk. Oxford
Handbook of Credit Derivatives. Oxford University Press, New York (2011)
20. Giesecke, K., Spiliopoulos, K., Sowers, R.: Default clustering in large portfolios: typical events.
Ann. Appl. Probab. 23(1), 348–385 (2013)
21. Azizpour, S., Giesecke, K., Schwenkler, G.: Exploring the sources of default clustering. Work-
ing paper, Stanford University (2010)
22. Duffie, D., Saita, L., Wang, K.: Multi-period corporate default prediction with stochastic covari-
ates. J. Financ. Econ. 83(3), 635–665 (2006)
23. Bo, L., Capponi, A.: Bilateral credit valuation adjustment for large credit derivatives portfolios.
Financ. Stoch. 18(2), 431–482 (2014)
24. Giesecke, K., Spiliopoulos, K., Sowers, R., Sirignano, J.A.: Large portfolio asymptotics for
loss from default. Math. Financ. 25(1), 77–114 (2015)
25. Spiliopoulos, K., Sirignano, J.A., Giesecke, K.: Fluctuation analysis for the loss from default.
Stoch. Process. Appl. 124(7), 2322–2362 (2014)
26. Spiliopoulos, K., Sowers, R.: Default clustering in large pools: large deviations. SIAM J.
Financ. Math. 6, 86–116 (2015)
556 K. Spiliopoulos

27. Sirignano, J.A., Schwenkler, G., Giesecke, K.: Likelihood estimation for large financial sys-
tems. Working paper, Stanford University (2013)
28. Glasserman, P., Wang, Y.: Counterexamples in importance sampling for large deviations prob-
abilities. Ann. Appl. Probab. 7, 731–746 (1997)
29. Carmona, R., Fouque, J.-P., Douglas, V.: Interacting particle systems for the computation of
rare credit portfolio losses. Financ. Stoch. 13(4), 613–633 (2009)
30. Deng, S., Giesecke, K., Lai, T.L.: Sequential importance sampling and resampling for dynamic
portfolio credit risk. Oper. Res. 60(1), 78–91 (2012)
31. Kotelenez, P.M., Kurtz, T.G.: Macroscopic limits for stochastic partial differential equations
of McKean-Vlasov type. Probab. Theory Relat. Fields 146(1–2), 189–222 (2010)
32. Kurtz, T.G., Xiong, J.: A stochastic evolution equation arising from the fluctuations of a class
of interacting particle systems. Commun. Math. Sci. 2(3), 325–358 (2004)
33. Purtukhia, O.G.: On the equations of filtering of multi-dimensional diffusion processes
(unbounded coefficients). Thesis, Moscow, Lomonosov University 1984 (in Russian) (1984)
34. Sadowsky, S.J.: On Monte Carlo estimation of large deviations probabilities. Ann. Appl. Probab.
6, 399–422 (1996)
35. Duffie, D., Pan, J., Singleton, K.: Transform analysis and asset pricing for affine jump-
diffusions. Econometrica 68, 1343–1376 (2000)
36. Dawson, D.A., Hochberg, K.J.: Wandering random measures in the Fleming-Viot model. Ann.
Probab. 10(3), 554–580 (1982)
37. Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. Wiley, New
York (1986)
38. Fleming, W.H., Viot, M.: Some measure-valued Markov processes in population genetics the-
ory. Indiana Univ. Math. J. 28(5), 817–843 (1979)
39. Gärtner, J.: On the McKean-Vlasov limit for interacting diffusions. Mathematische Nachrichten
137(1), 197–248 (1988)
40. Fernandez, B., Méléard, S.: A Hilbertian approach for fluctuations on the McKean-Vlasov
model. Stoch. Process. Appl. 71, 33–53 (1997)
41. Gyöngy, I., Krylov, N.: Stochastic partial differential equations with unbounded coefficients
and applications, I. Stoch. Stoch. Rep. 32, 53–91 (1990)
42. Freidlin, M., Wentzell, A.: Random Perturbations of Dynamical Systems, 2nd edn. Springer,
New York (1984)
43. Varadhan, S.R.S.: Large Deviations and Applications. CBMS-NSF Regional Conference Series
in Applied Mathematics, vol. 46. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia (1984)
44. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer,
New York (1998)
45. Bassamboo, A., Jain, S.: Efficient importance sampling for reduced form models in credit risk.
In: Proceedings of the 2006 Winter Simulation Conference, pp. 741–748 (2006)
46. Juneja, S., Bassamboo, A., Zeevi, A.: Portfolio credit risk with extremal dependence: asymp-
totic analysis and efficient simulation. Oper. Res. 56(3), 593–606 (2008)
47. Carmona, R., Crépey, S.: Particle methods for the estimation of Markovian credit portfolio loss
distributions. Int. J. Theor. Appl. Financ. 13(4), 577–602 (2010)
48. Giesecke, K., Kakavand, H., Mousavi, M., Takada, H.: Exact and efficient simulation of cor-
related defaults. SIAM J. Financ. Math. 1, 868–896 (2010)
49. Glasserman, P., Li, J.: Importance sampling for portfolio credit risk. Manag. Sci 51(11), 1643–
1656 (2005)
50. Glasserman, P.: Tail approximations for portfolio credit risk. J. Deriv. 12(2), 24–42 (2004)
51. Zhang, X., Blanchet, J., Giesecke, K., Glynn, P.: Affine point processes: asymptotic analysis
and efficient rare-event simulation. Working paper, Stanford University (2011)
52. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Grundlehren
der Mathematischen Wissenschaften. Springer, New York (2007)
53. Bucklew, J.: Introduction to Rare Event Simulation. Grundlehren der Mathematischen Wis-
senschaften. Springer, New York (2004)

54. Dupuis, P., Spiliopoulos, K., Wang, H.: Importance sampling for multiscale diffusions. Multi-
scale Model. Simul. 12, 1–27 (2012)
55. Dupuis, P., Wang, H.: Importance sampling, large deviations and differential games. Stoch.
Stoch. Rep. 76, 481–508 (2004)
56. Dupuis, P., Wang, H.: Subsolutions of an Isaacs equation and efficient schemes for importance
sampling. Math. Oper. Res. 32, 723–757 (2007)
57. Giesecke, K., Shkolnik, A.: Asymptotic optimal importance sampling of default times. Working
paper, Stanford University (2011)
Estimation of Volatility Functionals: The Case of a $\sqrt{n}$ Window

Jean Jacod and Mathieu Rosenbaum

Abstract We consider a multidimensional Itô semimartingale regularly sampled on $[0, t]$ at high frequency $1/\Delta_n$, with $\Delta_n$ going to zero. The goal of this paper is to provide an estimator for the integral over $[0, t]$ of a given function of the volatility matrix, with the optimal rate $1/\sqrt{\Delta_n}$ and minimal asymptotic variance. To achieve this, we use spot volatility estimators based on observations within time intervals of length $k_n \Delta_n$. In [5], this was done with $k_n \to \infty$ and $k_n \sqrt{\Delta_n} \to 0$, and a central limit theorem was given after suitable de-biasing. Here we do the same with the choice $k_n \asymp 1/\sqrt{\Delta_n}$. This results in a smaller bias, although more difficult to eliminate.

Keywords Semimartingale · High frequency data · Volatility estimation · Central


limit theorem · Efficient estimation · Estimation of volatility functionals · Asymptotic
aspects

MSC2010 60F05 · 60G44 · 62F12

1 Introduction

Consider an Itô semimartingale $X_t$, whose squared volatility $c_t$ (a $d \times d$ matrix-valued process if $X$ is $d$-dimensional) is itself another Itô semimartingale. The process $X$ is observed at discrete times $i\Delta_n$ for $i = 0, 1, \ldots$, the time lag $\Delta_n$ being small (high-frequency setting) and eventually going to 0. The aim is to estimate integrated functionals of the volatility, that is $\int_0^t g(c_s)\, ds$ for arbitrary (smooth enough) functions $g$, on the basis of the observations at stage $n$ and within the time interval $[0, t]$.

J. Jacod
Institut de Mathématiques de Jussieu, CNRS – UMR 7586 and Université Pierre et Marie Curie,
4 Place Jussieu, 75252 Paris Cedex 05, France
M. Rosenbaum (B)
Laboratoire de Probabilités et Modèles Aléatoires, CNRS – UMR 7599 and Université Pierre et
Marie Curie, 4 Place Jussieu, 75252 Paris Cedex 05, France

© Springer International Publishing Switzerland 2015 559


P.K. Friz et al. (eds.), Large Deviations and Asymptotic Methods
in Finance, Springer Proceedings in Mathematics & Statistics 110,
DOI 10.1007/978-3-319-11605-1_20

This is of course a quite well understood problem when $g$ is the identity function. In particular, when $X$ is one-dimensional and continuous, $\int_0^t g(c_s)\, ds$ corresponds then to the integrated (squared) volatility, which can be efficiently estimated using the so-called realized volatility, that is the sum of the squared increments of $X$. However, many other functions $g$ are of interest. For example, in the case of the realized volatility mentioned above, the quantity $\int_0^t c_s^2\, ds$, called quarticity and corresponding to $g(x) = x^2$, appears in the asymptotic variance of the estimator. Therefore, estimating the quarticity becomes necessary if one wants to build confidence intervals for the integrated volatility. Actually, in the context of volatility estimation, for most statistical procedures, the asymptotic variance is a combination of terms of the form $\int_0^t g(c_s)\, ds$, see [4]. Hence the statistician needs to be able to estimate such quantities. Note that the functions $g$ involved in limiting variances are often polynomial. Nevertheless, more complicated expressions may also be found, in particular in the multi-dimensional setting in the presence of jumps. We refer to [5] for more details on the motivation for estimating general integrated functionals of the volatility process.
In [5], we have exhibited estimators which are consistent and asymptotically optimal, in the sense that they asymptotically achieve the best rate $1/\sqrt{\Delta_n}$, and also the minimal asymptotic variance in the cases where optimality is well-defined (namely, when $X$ is continuous and has a Markov type structure, in the sense of [2]). These estimators have this rate and minimal asymptotic variance as soon as the jumps of $X$ are summable, plus some mild technical conditions.
The aim of this paper is to complement [5] with another estimator, of the same
type, but using spot volatility estimators based on a different window size. In this
introduction, we explain the differences between the estimator in [5] and the one
presented here.
For the sake of simplicity, we consider the case when $X$ is continuous and one-dimensional (the discontinuous and multi-dimensional case is considered later), that is of the form
\[ X_t = X_0 + \int_0^t b_s\, ds + \int_0^t \sigma_s\, dW_s \]
and $c_t = \sigma_t^2$ is the squared volatility. Natural estimators for $V(g)_t = \int_0^t g(c_s)\, ds$ are
\[ V(g)^n_t = \Delta_n \sum_{i=1}^{[t/\Delta_n]-k_n+1} g(\hat c^n_i), \quad \text{where} \quad \hat c^n_i = \frac{1}{k_n \Delta_n} \sum_{j=0}^{k_n-1} \left(X_{(i+j)\Delta_n} - X_{(i+j-1)\Delta_n}\right)^2 \tag{1.1} \]
for an arbitrary sequence of integers such that $k_n \to \infty$ and $k_n \Delta_n \to 0$. One knows that $V(g)^n_t \xrightarrow{P} V(g)_t$ (when $g$ is continuous and of polynomial growth).
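The estimator (1.1) is straightforward to compute with a sliding window. The following is a minimal sketch for the one-dimensional continuous case, assuming the increments of $X$ are given as a plain list; the function names `spot_variances` and `plug_in_functional` are introduced here for illustration only.

```python
def spot_variances(increments, k, dt):
    """Local variance estimates c_hat[i-1] for i = 1, ..., n - k + 1, as in (1.1)."""
    squares = [dx * dx for dx in increments]
    window = sum(squares[:k])                # squared increments 1, ..., k
    c_hat = [window / (k * dt)]
    for i in range(1, len(squares) - k + 1):
        window += squares[i + k - 1] - squares[i - 1]   # slide the window by one step
        c_hat.append(window / (k * dt))
    return c_hat


def plug_in_functional(g, increments, k, dt):
    """Riemann-sum estimator dt * sum_i g(c_hat_i) of the integral of g(c_s)."""
    return dt * sum(g(c) for c in spot_variances(increments, k, dt))
```

With `g = lambda x: x` this reduces, up to border terms, to the realized volatility; other choices of `g` (for example `lambda x: x * x` for the quarticity) give the raw, not yet de-biased, estimator.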
The variables $\hat c^n_i$ are spot volatility estimators, and according to [4] we know that $\hat c^n_{[t/\Delta_n]}$ estimates $c_t$, with a rate depending on the "window size" $k_n$. The optimal rate $1/\Delta_n^{1/4}$ is achieved by taking $k_n \asymp 1/\sqrt{\Delta_n}$.¹ When $k_n$ is smaller, the rate is $\sqrt{k_n}$ and the estimation error is a purely "statistical error"; when $k_n$ is bigger, the rate is $1/\sqrt{k_n \Delta_n}$ and the estimation error is due to the variability of the volatility process $c_t$ itself (its volatility and its jumps). With the optimal choice $k_n \asymp 1/\sqrt{\Delta_n}$, the estimation error is a mixture of the statistical error and the error due to the variability of $c_t$.

¹ By $k_n \asymp 1/\sqrt{\Delta_n}$, we mean $a_1/\sqrt{\Delta_n} \le k_n \le a_2/\sqrt{\Delta_n}$, for some $a_1 > 0$ and $a_2 > 0$.

In [5], we have used a "small" window, that is $k_n \ll 1/\sqrt{\Delta_n}$. Somewhat surprisingly, this allows for optimality in the estimation of $\int_0^t g(c_s)\, ds$ (rate $1/\sqrt{\Delta_n}$ and minimal asymptotic variance). However, the price to pay is the need of a de-biasing term to be subtracted from $V(g)^n$, without which the rate is smaller and no Central Limit Theorem is available.
Here, we consider the window size $k_n \asymp 1/\sqrt{\Delta_n}$. This leads to a convergence rate $1/\sqrt{\Delta_n}$ for $V(g)^n$ itself, and the limit is again conditionally Gaussian with the "minimal" asymptotic variance, but with a bias that depends on the volatility of the volatility $c_t$, and on its jumps. It is however possible to subtract from $V(g)^n$ a de-biasing term again, so that the limit becomes (conditionally) centered.
Section 2 is devoted to presenting assumptions and results, and all proofs are
gathered in Sect. 3. The reader is referred to [5] for motivation and various comments
and a detailed discussion of optimality. However, in order to make this paper readable,
we basically give the full proofs, even though a number of partial results have already
been proved in the above-mentioned paper, and with the exception of a few well
designated lemmas.

2 The Results

2.1 Setting and Assumptions

The underlying process $X$ is $d$-dimensional, and observed at the times $i\Delta_n$ for $i = 0, 1, \ldots$, within a fixed interval of interest $[0, t]$. For any process $Y$ we write $\Delta^n_i Y = Y_{i\Delta_n} - Y_{(i-1)\Delta_n}$ for the increment over the $i$th observation interval. We assume that the sequence $\Delta_n$ goes to 0. The precise assumptions on $X$ are as follows.
First, $X$ is an Itô semimartingale on a filtered space $(\Omega, \mathcal F, (\mathcal F_t)_{t \ge 0}, P)$. It can be written in its Grigelionis form, as follows, using a $d$-dimensional Brownian motion $W$ and a Poisson random measure $\mu$ on $\mathbb R_+ \times E$, where $E$ is an auxiliary Polish space and with the (non-random) intensity measure $\nu(dt, dz) = dt \otimes \lambda(dz)$ for some $\sigma$-finite measure $\lambda$ on $E$:
\[ X_t = X_0 + \int_0^t b_s\, ds + \int_0^t \sigma_s\, dW_s + \int_0^t \int_E \delta(s, z)\, 1_{\{\|\delta(s,z)\| \le 1\}}\, (\mu - \nu)(ds, dz) + \int_0^t \int_E \delta(s, z)\, 1_{\{\|\delta(s,z)\| > 1\}}\, \mu(ds, dz). \tag{2.1} \]
This is a vector-type notation: the process $b_t$ is $\mathbb R^d$-valued optional, the process $\sigma_t$ is $\mathbb R^d \otimes \mathbb R^d$-valued optional, $\delta = \delta(\omega, t, z)$ is a predictable $\mathbb R^d$-valued function on $\Omega \times \mathbb R_+ \times E$ and $\|\cdot\|$ is the euclidean norm on $\mathbb R^d$.

The spot volatility process $c_t = \sigma_t \sigma_t^*$ ($*$ denotes transpose) takes its values in the set $\mathcal M^+_d$ of all nonnegative symmetric $d \times d$ matrices. We suppose that $c_t$ is again an Itô semimartingale, which can be written as
\[ c_t = c_0 + \int_0^t \tilde b_s\, ds + \int_0^t \tilde\sigma_s\, dW_s + \int_0^t \int_E \tilde\delta(s, z)\, 1_{\{\|\tilde\delta(s,z)\| \le 1\}}\, (\mu - \nu)(ds, dz) + \int_0^t \int_E \tilde\delta(s, z)\, 1_{\{\|\tilde\delta(s,z)\| > 1\}}\, \mu(ds, dz), \tag{2.2} \]
with the same $W$ and $\mu$ as in (2.1). This is indeed not a restriction: if $X$ and $c$ are two Itô semimartingales, we have a representation as above for the pair $(X, c)$ and, if the dimension of $W$ exceeds the dimension of $X$, one can always add fictitious components to $X$, arbitrarily set to 0, so that the dimensions of $X$ and $W$ agree.
In (2.2), $\tilde b$ and $\tilde\sigma$ are optional and $\tilde\delta$ is as $\delta$; moreover $\tilde b$ and $\tilde\delta$ are $\mathbb R^{d^2}$-valued.
Finally, we need the spot volatility of the volatility and "spot covariation" of the continuous martingale parts of $X$ and $c$, which are
\[ \tilde c_t^{\,ij,kl} = \sum_{m=1}^d \tilde\sigma_t^{ij,m}\, \tilde\sigma_t^{kl,m}, \qquad \tilde c_t^{\prime\, i,jk} = \sum_{l=1}^d \sigma_t^{il}\, \tilde\sigma_t^{jk,l}. \]

The precise assumptions on the coefficients are as follows, with $r$ a real in $[0, 1)$.
Assumption (A’-r): There are a sequence $(J_n)$ of nonnegative bounded $\lambda$-integrable functions on $E$ and a sequence $(\tau_n)$ of stopping times increasing to $\infty$, such that
\[ t \le \tau_n(\omega) \implies \|\delta(\omega, t, z)\|^r \wedge 1 + \|\tilde\delta(\omega, t, z)\|^2 \wedge 1 \le J_n(z). \]
Moreover, the processes $b'_t = b_t - \int \delta(t, z)\, 1_{\{\|\delta(t,z)\| \le 1\}}\, \lambda(dz)$ (which is well defined), $\tilde c_t$ and $\tilde c'_t$ are càdlàg or càglàd, and the maps $t \mapsto \tilde\delta(\omega, t, z)$ are càglàd (recall that $\tilde\delta$ should be predictable), as well as the processes $\tilde b_t + \int \tilde\delta(t, z) \left( \kappa(\|\tilde\delta(t, z)\|) - 1_{\{\|\tilde\delta(t,z)\| \le 1\}} \right) \lambda(dz)$ for one (hence for all) continuous function $\kappa$ on $\mathbb R_+$ with compact support and equal to 1 on a neighborhood of 0.
The bigger $r$, the weaker Assumption (A’-r), and when (A’-0) holds the process $X$ has finitely many jumps on each finite interval. The part of (A’-r) concerning the jumps of $X$ implies that $\sum_{s \le t} \|\Delta X_s\|^r < \infty$ a.s. for all $t < \infty$, and it is in fact "almost" implied by this property. Since $r < 1$, this implies $\sum_{s \le t} \|\Delta X_s\| < \infty$ a.s.

Remark 2.1 (A’-r) above is basically the same as Assumption (A-r) in [5], albeit (slightly) stronger (hence its name): some degree of regularity in time seems to be needed for $\tilde b$, $\tilde c$, $\tilde c'$, $\tilde\delta$ in the present case.

2.2 A First Central Limit Theorem

For defining the estimators of the spot volatility, we first choose a sequence $k_n$ of integers which satisfies, as $n \to \infty$:
\[ k_n \sim \frac{\theta}{\sqrt{\Delta_n}}, \qquad \theta \in (0, \infty), \tag{2.3} \]
and a sequence $u_n$ in $(0, \infty]$. The $\mathcal M^+_d$-valued variables $\hat c^n_i$ are defined, componentwise, as
\[ \hat c_i^{n,lm} = \frac{1}{k_n \Delta_n} \sum_{j=0}^{k_n-1} \Delta^n_{i+j} X^l\, \Delta^n_{i+j} X^m\, 1_{\{\|\Delta^n_{i+j} X\| \le u_n\}}, \tag{2.4} \]
and they implicitly depend on $\Delta_n$, $k_n$, $u_n$.
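The truncated estimator (2.4) can be sketched as follows for a $d$-dimensional process. This is only an illustration: the function name `truncated_spot_covariance` and the data layout (a list of increment vectors) are assumptions made here, and passing `u = float("inf")` corresponds to the untruncated case $u_n \equiv \infty$.

```python
import math


def truncated_spot_covariance(increments, i, k, dt, u):
    """c_hat_i^{n,lm} from (2.4); increments[j] is the d-vector Delta^n_{j+1} X."""
    d = len(increments[0])
    c_hat = [[0.0] * d for _ in range(d)]
    for dx in increments[i - 1 : i - 1 + k]:        # increments i, ..., i + k - 1
        if math.sqrt(sum(v * v for v in dx)) > u:    # euclidean norm above u: discard
            continue
        for l in range(d):
            for m in range(d):
                c_hat[l][m] += dx[l] * dx[m]
    return [[v / (k * dt) for v in row] for row in c_hat]
```

Note that a discarded increment is not replaced: the normalization stays $k_n \Delta_n$, exactly as in (2.4).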


P
One knows that n
c[t/n]
−→ct for any t, and there is an associated Central Limit
1/4
Theorem under (A’-2), with rate 1/n : the choice (2.3) is optimal, in the sense
that it allows us to have the fastest possible
√ rate by a balance between the involved
“statistical error” which is of order
√ 1/ k n , and the variation of ct over the interval
[t, t + kn n ], which is of order kn n because ct is an Itô semimartingale (and
even when it jumps), see [1, 4].
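The balance between the two error sources can be written out as a short heuristic computation. Constants are ignored, and $e(k_n)$ is a symbol introduced here only to denote the order of the total error of the spot estimator.

```latex
e(k_n) \approx \frac{1}{\sqrt{k_n}} + \sqrt{k_n \Delta_n},
\qquad
\frac{1}{\sqrt{k_n}} = \sqrt{k_n \Delta_n}
\iff k_n = \frac{1}{\sqrt{\Delta_n}},
\qquad
e\!\left(\Delta_n^{-1/2}\right) \approx 2\, \Delta_n^{1/4}.
```

A smaller window makes the first term dominate and a larger window makes the second term dominate, which is why the rate $1/\Delta_n^{1/4}$ cannot be improved by another choice of $k_n$.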
By Theorem 9.4.1 of [4], one also knows that under (A’-r) and if $u_n \asymp \Delta_n^{\varpi}$ for some $\varpi \in \left[ \frac{p-1}{2p-r}, \frac{1}{2} \right)$ we have
\[ V(g)^n_t := \Delta_n \sum_{i=1}^{[t/\Delta_n]-k_n+1} g(\hat c^n_i) \;\overset{\text{u.c.p.}}{\Longrightarrow}\; V(g)_t := \int_0^t g(c_s)\, ds \tag{2.5} \]
(convergence in probability, uniformly over each compact interval; by convention $\sum_{i=a}^{b} v_i = 0$ if $b < a$), as soon as the function $g$ on $\mathcal M^+_d$ is continuous with $|g(x)| \le K(1 + \|x\|^p)$ for some constants $K$, $p$. Actually, for this to hold we need much weaker assumptions on $X$, but we do not need this below. Note also that when $X$ is continuous, the truncation in (2.4) is useless: one may use (2.4) with $u_n \equiv \infty$, which reduces to (1.1) in the one-dimensional case.
Now, we want to determine at which rate the convergence (2.5) takes place. This amounts to proving an associated Central Limit Theorem. For an appropriate choice of the truncation levels, such a CLT is available for $V(g)^n$, with the rate $1/\sqrt{\Delta_n}$, but the limit exhibits a bias term. Below, $g$ is a smooth function on $\mathcal M^+_d$, and the two first partial derivatives are denoted as $\partial_{jk} g$ and $\partial^2_{jk,lm} g$, since any $x \in \mathcal M^+_d$ has $d^2$ components $x^{jk}$. The family of all partial derivatives of order $j$ is simply denoted as $\partial^j g$.

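Theorem 2.2 Assume (A’-r) for some $r < 1$. Let $g$ be a $C^3$ function on $\mathcal M^+_d$ such that
\[ \|\partial^j g(x)\| \le K(1 + \|x\|^{p-j}), \qquad j = 0, 1, 2, 3 \tag{2.6} \]
for some constants $K > 0$, $p \ge 3$. Either suppose that $X$ is continuous and $u_n/\Delta_n^{\varepsilon} \to \infty$ for some $\varepsilon < 1/2$ (for example, $u_n \equiv \infty$, so there is no truncation at all), or suppose that
\[ u_n \asymp \Delta_n^{\varpi}, \qquad \frac{2p-1}{2(2p-r)} \le \varpi < \frac{1}{2}. \tag{2.7} \]
Then we have the finite-dimensional (in time) stable convergence in law
\[ \frac{1}{\sqrt{\Delta_n}} \left( V(g)^n_t - V(g)_t \right) \xrightarrow{\mathcal L_f\text{-}s} A^1_t + A^2_t + A^3_t + A^4_t + Z_t, \]
where $Z$ is a process defined on an extension $(\tilde\Omega, \tilde{\mathcal F}, (\tilde{\mathcal F}_t)_{t \ge 0}, \tilde P)$ of $(\Omega, \mathcal F, (\mathcal F_t)_{t \ge 0}, P)$, which conditionally on $\mathcal F$ is a continuous centered Gaussian martingale with variance
\[ \tilde E\left( (Z_t)^2 \mid \mathcal F \right) = \sum_{j,k,l,m=1}^d \int_0^t \partial_{jk} g(c_s)\, \partial_{lm} g(c_s) \left( c_s^{jl} c_s^{km} + c_s^{jm} c_s^{kl} \right) ds, \tag{2.8} \]
and where, with the notation
\[ G(x, y) = \int_0^1 \left( g(x + wy) - w\, g(x + y) - (1 - w)\, g(x) \right) dw, \tag{2.9} \]
we have
\[ \begin{aligned}
A^1_t &= -\frac{\theta}{2} \left( g(c_0) + g(c_t) \right), \\
A^2_t &= \frac{1}{2\theta} \sum_{j,k,l,m=1}^d \int_0^t \partial^2_{jk,lm} g(c_s) \left( c_s^{jl} c_s^{km} + c_s^{jm} c_s^{kl} \right) ds, \\
A^3_t &= -\frac{\theta}{12} \sum_{j,k,l,m=1}^d \int_0^t \partial^2_{jk,lm} g(c_s)\, \tilde c_s^{\,jk,lm}\, ds, \\
A^4_t &= \theta \sum_{s \le t} G(c_{s-}, \Delta c_s).
\end{aligned} \]
Note that $|G(x, y)| \le K(1 + \|x\|)^p \|y\|^2$, so the sum defining $A^4_t$ is absolutely convergent, and vanishes when $c_t$ is continuous.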

Remark 2.3 The bias has four parts:

(1) The first one is due to a border effect: indeed, the formula giving $V(g)^n_t$ contains $[t/\Delta_n] - k_n + 1$ summands only, whereas the natural (unfeasible) approximation $\Delta_n \sum_{i=1}^{[t/\Delta_n]} g(c_{(i-1)\Delta_n})$ contains $[t/\Delta_n]$ summands. The sum of the lacking $k_n - 1$ summands is of order of magnitude $(k_n - 1)\Delta_n$, which goes to 0 and thus does not impair consistency, but it creates an obvious bias after normalization by $1/\sqrt{\Delta_n}$. Removing this source of bias is straightforward: since $g(c_s)$ is "under-represented" when $s$ is close to 0 or to $t$, we add to $V(g)^n_t$ the variable
\[ \frac{(k_n - 1)\Delta_n}{2} \left( g(\hat c^n_1) + g(\hat c^n_{[t/\Delta_n]-k_n+1}) \right). \]
Of course, other weighted averages of $g(\hat c^n_i)$ for $i$ close to 0 or to $[t/\Delta_n] - k_n + 1$ would be possible.
(2) The second part $A^2$ is continuous in time and is present even in the toy model given by $X_t = \sqrt{c}\, W_t$ with $c$ a constant and $\Delta_n = \frac{1}{n}$ and $T = 1$. In this simple case, the interpretation is as follows: instead of taking the "optimal" $g(\hat c_n)$ for estimating $g(c)$, with $\hat c_n = \sum_{i=1}^n (\Delta^n_i X)^2$, one takes $\frac{1}{n} \sum_{i=1}^n g(\hat c^n_i)$ with $\hat c^n_i$ a "local" estimator of $c$. This adds a statistical error which results in a bias. Note that, even in the general case, this bias would disappear, were we taking in (2.3) the (forbidden) value $\theta = \infty$ (with still $k_n \Delta_n \to 0$), at the expense of a slower rate of convergence.
(3) The third and fourth parts $A^3$ and $A^4$ are respectively continuous and purely discontinuous, due to the continuous part and to the jumps of the volatility process $c_t$ itself. These two biases disappear if we take $\theta = 0$ in (2.3) (with still $k_n \to \infty$), again a forbidden value, and again at the expense of a slower rate of convergence.
The only test function $g$ for which the last three biases disappear is the identity $g(x) = x$. This is because, in this case, and up to the border terms, $V(g)^n_t$ is nothing but the realized quadratic variation itself and the spot estimators $\hat c^n_i$ actually merge together and disappear as such.
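The origin of the second bias term can be checked by hand in the toy model of part (2). The computation below is a heuristic second-order Taylor expansion for $d = 1$ and $t = 1$ without truncation; it uses only the fact that, for $X = \sqrt{c}\, W$, the variable $k_n \hat c^n_i / c$ has a chi-square distribution with $k_n$ degrees of freedom.

```latex
E\big[\hat c^n_i\big] = c, \qquad
\operatorname{Var}\big(\hat c^n_i\big) = \frac{2c^2}{k_n}, \qquad
E\big[g(\hat c^n_i)\big] \approx g(c) + \frac{1}{2}\, g''(c)\, \frac{2c^2}{k_n}
 = g(c) + \frac{g''(c)\, c^2}{k_n},
\\[4pt]
\frac{1}{\sqrt{\Delta_n}} \cdot \frac{g''(c)\, c^2}{k_n}
 \;\longrightarrow\; \frac{g''(c)\, c^2}{\theta}
 = \frac{1}{2\theta}\, g''(c)\, \big(c \cdot c + c \cdot c\big) = A^2_1
 \qquad \text{since } k_n \sqrt{\Delta_n} \to \theta .
```

This agrees with the expression for $A^2_t$ in Theorem 2.2 and makes visible why the term vanishes as $\theta \to \infty$.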

Remark 2.4 Observe that (2.7) implies $r < 1$. This restriction is not a surprise, since one needs $r \le 1$ in order to estimate the integrated volatility by the (truncated) realized volatility, with a rate of convergence $1/\sqrt{\Delta_n}$. When $r = 1$, it is likely that the CLT still holds for an appropriate choice of the sequence $u_n$, and with another additional bias, see e.g. [6] for a slightly different context. Here we leave this borderline case aside.

2.3 Estimation of the Bias

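Now we proceed to "remove" the bias, which means subtracting consistent estimators for the bias from $V(g)^n_t$. As written before, we have
\[ A^{n,1}_t = -\frac{k_n \sqrt{\Delta_n}}{2} \left( g(\hat c^n_1) + g(\hat c^n_{[t/\Delta_n]-k_n+1}) \right) \xrightarrow{P} A^1_t \tag{2.10} \]
(this comes from $\hat c^n_1 \xrightarrow{P} c_0$ and $\hat c^n_{[t/\Delta_n]-k_n+1} \xrightarrow{P} c_{t-}$, plus $c_{t-} = c_t$ a.s.). Next, observe that $A^2 = \frac{1}{\theta} V(h)$ for the test function $h$ defined on $\mathcal M^+_d$ by
\[ h(x) = \frac{1}{2} \sum_{j,k,l,m=1}^d \partial^2_{jk,lm} g(x) \left( x^{jl} x^{km} + x^{jm} x^{kl} \right). \]
Therefore
\[ A^{n,2}_t = \frac{1}{k_n \sqrt{\Delta_n}}\, V(h)^n_t \xrightarrow{P} A^2_t. \tag{2.11} \]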

The term $A^3_t$ involves the volatility of the volatility, for which estimators have been provided in the one-dimensional case by Vetter [7]; namely, if $d = 1$ and under suitable technical assumptions (slightly stronger than here), plus the continuity of $X_t$ and $c_t$, he proves that
\[ \frac{3}{2k_n} \sum_{i=1}^{[t/\Delta_n]-2k_n+1} \left( \hat c^n_{i+k_n} - \hat c^n_i \right)^2 \]
converges to $\int_0^t \left( \tilde c_s + \frac{6}{\theta^2} (c_s)^2 \right) ds$. Of course, we need to modify this estimator here, in order to include the function $\partial^2 g$ in the limit and account for the possibilities of having $d \ge 2$ and having jumps in $X$. We propose to take
\[ A^{n,3}_t = -\frac{\sqrt{\Delta_n}}{8} \sum_{i=1}^{[t/\Delta_n]-2k_n+1} \sum_{j,k,l,m=1}^d \partial^2_{jk,lm} g(\hat c^n_i) \left( \hat c^{n,jk}_{i+k_n} - \hat c^{n,jk}_i \right) \left( \hat c^{n,lm}_{i+k_n} - \hat c^{n,lm}_i \right). \tag{2.12} \]
When $X$ and $c$ are continuous, one may expect the convergence to $A^3_t - \frac{1}{2} A^2_t$ (observe that $\frac{\sqrt{\Delta_n}}{8} \sim \frac{3}{2k_n} \cdot \frac{\theta}{12}$), and one may expect the same when $X$ jumps and $c$ is still continuous, because in (2.4) the truncation basically eliminates the jumps of $X$. In contrast, when $c$ jumps, the limit should rather be related to the "full" quadratic variation of $c$. Indeed we have the following theorem.

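Theorem 2.5 Under the assumptions of Theorem 2.2, for all $t \ge 0$ we have
\[ A^{n,3}_t \xrightarrow{P} -\frac{1}{2} A^2_t + A^3_t + A'^4_t, \]
where
\[ A'^4_t = \theta \sum_{s \le t} G'(c_{s-}, \Delta c_s) \]
and
\[ G'(x, y) = -\frac{1}{8} \sum_{j,k,l,m} \int_0^1 \left( \partial^2_{jk,lm} g(x) + \partial^2_{jk,lm} g(x + (1 - w) y) \right) w^2\, y^{jk} y^{lm}\, dw. \tag{2.13} \]

At this stage, it remains to find consistent estimators for $A^4_t - A'^4_t$, which has the form
\[ A^4_t - A'^4_t = \theta \sum_{s \le t} \overline G(c_{s-}, \Delta c_s), \qquad \text{where } \overline G = G - G'. \]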


More generally, we aim at estimating
\[ \mathcal V(F)_t = \sum_{s \le t} F(c_{s-}, \Delta c_s), \]
at least when the function $F$ on $\mathcal M^+_d \times \mathcal M_d$, where $\mathcal M_d$ is the set of all $d \times d$ matrices, is $C^1$ and $|F(x, y)| \le K \|y\|^2$ uniformly in $x$ within any compact set, as is the function $\overline G = G - G'$ above.

The solution to this problem is not as simple as it might appear at first glance. We first truncate from below, taking any sequence $u'_n$ of truncation levels satisfying
\[ u'_n \to 0, \qquad \frac{u'_n}{\Delta_n^{\varpi'}} \to \infty \quad \text{for some } \varpi' \in \left( 0, \frac{1}{8} \right). \tag{2.14} \]
Second, we resort to the following trick. Since $\hat c^n_i$ is "close" to the average of $c_t$ over the interval $(i\Delta_n, (i + k_n)\Delta_n]$, we (somehow wrongly) pretend that, for all $j$:
\[ \begin{aligned}
& \exists\, s \in ((j-1) k_n \Delta_n,\; j k_n \Delta_n] \text{ with } \|\Delta c_s\| > u'_n \\
& \quad \Longleftrightarrow\; \|\hat c^n_{j k_n} - \hat c^n_{(j-2) k_n}\| > u'_n, \quad \Delta c_s \sim \hat c^n_{j k_n} - \hat c^n_{(j-2) k_n}, \\
& \qquad \|\hat c^n_{(j-1) k_n} - \hat c^n_{(j-3) k_n}\| \vee \|\hat c^n_{(j+1) k_n} - \hat c^n_{(j-1) k_n}\| < \|\hat c^n_{j k_n} - \hat c^n_{(j-2) k_n}\|.
\end{aligned} \]
The condition (2.14) implies that for $n$ large enough there is at most one jump of size bigger than $u'_n$ in each interval $((i-1)\Delta_n, (i-1+k_n)\Delta_n]$ within $[0, t]$, and no two consecutive intervals of this form contain such jumps. Despite this, the statement above is of course not true, the main reason being that $\hat c^n_i$ and $c^n_i$ do not exactly agree. However it is "true enough" to allow for the next estimators to be consistent for $\mathcal V(F)_t$:
\[ \mathcal V(F)^n_t = \sum_{j=3}^{[t/k_n\Delta_n]-3} F\left( \hat c^n_{(j-3) k_n + 1},\, \delta^n_j \hat c \right) 1_{\{\|\delta^n_{j-1} \hat c\| \vee \|\delta^n_{j+1} \hat c\| \vee u'_n < \|\delta^n_j \hat c\|\}}, \qquad \text{where } \delta^n_j \hat c = \hat c^n_{j k_n + 1} - \hat c^n_{(j-2) k_n + 1}. \tag{2.15} \]
Since this is a sum of approximately $[t/k_n\Delta_n]$ terms, the rate of convergence of $\mathcal V(F)^n_t$ towards $\mathcal V(F)_t$ in law is probably $1/\Delta_n^{1/4}$ only. However, here we are looking for consistent estimators, and the rate is not of concern to us. Note that, again, the upper limit in the sum above is chosen in such a way that $\mathcal V(F)^n_t$ is computable on the basis of the observations within the interval $[0, t]$.
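The block-comparison rule in (2.15) can be sketched in a few lines for $d = 1$. The code below is a hedged illustration, not the authors' implementation: the names `spot_variances` and `jump_functional` are introduced here, no truncation of the increments of $X$ is applied, and the threshold `u` plays the role of the lower truncation level in (2.14).

```python
def spot_variances(increments, k, dt):
    """Untruncated local variance estimates c_hat[i-1], i = 1, ..., n - k + 1."""
    squares = [dx * dx for dx in increments]
    window = sum(squares[:k])
    c_hat = [window / (k * dt)]
    for i in range(1, len(squares) - k + 1):
        window += squares[i + k - 1] - squares[i - 1]
        c_hat.append(window / (k * dt))
    return c_hat


def jump_functional(F, increments, k, dt, u):
    """Sketch of (2.15) for d = 1: sum F(level before the jump, estimated jump size)."""
    c_hat = spot_variances(increments, k, dt)
    n_blocks = len(increments) // k          # plays the role of [t / (k_n * Delta_n)]

    def delta(j):
        # delta_j = c_hat_{j k + 1} - c_hat_{(j - 2) k + 1} in the 1-based indexing of the text
        return c_hat[j * k] - c_hat[(j - 2) * k]

    total = 0.0
    for j in range(3, n_blocks - 2):         # j = 3, ..., n_blocks - 3
        d_prev, d_here, d_next = abs(delta(j - 1)), abs(delta(j)), abs(delta(j + 1))
        if max(d_prev, d_next, u) < d_here:  # block j carries a strict local maximum above u
            total += F(c_hat[(j - 3) * k], delta(j))
    return total
```

A jump falling exactly on a block boundary produces two equal neighbouring differences and is missed by the strict inequality; this is one concrete way in which the heuristic equivalence above is only "true enough".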
Theorem 2.6 Assume all hypotheses of Theorem 2.2, and let $F$ be a continuous function on $\mathbb R_+ \times \mathbb R$ satisfying, with the same $p \ge 3$ as in (2.7),
\[ |F(x, y)| \le K (1 + \|x\| + \|y\|)^{p-2} \|y\|^2. \tag{2.16} \]
Then for all $t \ge 0$ we have
\[ \mathcal V(F)^n_t \xrightarrow{P} \mathcal V(F)_t. \tag{2.17} \]

2.4 An Unbiased Central Limit Theorem

At this stage, we can set, with the notation (2.11), (2.12) and (2.15), and also (2.9) and (2.13) for $G$ and $G'$:
\[ \overline V(g)^n_t = V(g)^n_t + \frac{k_n \Delta_n}{2} \left( g(\hat c^n_1) + g(\hat c^n_{[t/\Delta_n]-k_n+1}) \right) - \sqrt{\Delta_n} \left( \frac{3}{2} A^{n,2}_t + A^{n,3}_t \right) - k_n \Delta_n\, \mathcal V(G - G')^n_t. \]
We then have the following theorem, which is a straightforward consequence of the three previous theorems and of $k_n \sqrt{\Delta_n} \to \theta$, plus (2.10) and (2.11) and the fact that the function $G - G'$ satisfies (2.16) when $g$ satisfies (2.6).

Theorem 2.7 Under the assumptions of Theorem 2.2, and with $Z$ as in this theorem, for all $t \ge 0$ we have the finite-dimensional stable convergence in law
\[ \frac{1}{\sqrt{\Delta_n}} \left( \overline V(g)^n_t - V(g)_t \right) \xrightarrow{\mathcal L_f\text{-}s} Z_t. \]
Note that $\theta$ no longer explicitly appears in this statement, so one can replace (2.3) by the weaker statement
\[ k_n \asymp \frac{1}{\sqrt{\Delta_n}} \]
(this is easily seen by taking subsequences $n_l$ such that $k_{n_l} \sqrt{\Delta_{n_l}}$ converge to an arbitrary limit in $(0, \infty)$).
It is simple to make this CLT "feasible", that is, usable in practice for determining a confidence interval for $V(g)_t$ at any time $t > 0$. Indeed, we can define the following function on $\mathcal M^+_d$:
\[ \overline h(x) = \sum_{j,k,l,m=1}^d \partial_{jk} g(x)\, \partial_{lm} g(x) \left( x^{jl} x^{km} + x^{jm} x^{kl} \right). \]
We then have $V(\overline h)^n \overset{\text{u.c.p.}}{\Longrightarrow} V(\overline h)$, where $V(\overline h)_t$ is the right hand side of (2.8). Then we readily deduce:

Corollary 2.8 Under the assumptions of the previous theorem, for any $t > 0$ we have the following stable convergence in law, where $Y$ is an $\mathcal N(0, 1)$ variable:
\[ \frac{\overline V(g)^n_t - V(g)_t}{\sqrt{\Delta_n\, V(\overline h)^n_t}} \xrightarrow{\mathcal L\text{-}s} Y, \quad \text{in restriction to the set } \{V(\overline h)_t > 0\}. \]


Finally, let us mention that the estimators $\overline V(g)^n_t$ enjoy exactly the same asymptotic efficiency properties as the estimators in [5], and we refer to this paper for a discussion of this topic.

Example 2.9 (Quarticity) Suppose $d = 1$ and take $g(x) = x^2$, so we want to estimate the quarticity $\int_0^t c_s^2\, ds$. In this case we have
\[ h(x) = 2x^2, \qquad G(x, y) - G'(x, y) = 0. \]
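These two identities follow from a direct computation with $g''(x) = 2$, using the definition of $h$ in Section 2.3 together with (2.9) and (2.13) specialized to $d = 1$:

```latex
h(x) = \tfrac{1}{2}\, g''(x)\, (x \cdot x + x \cdot x) = 2x^2,
\\[4pt]
G(x, y) = \int_0^1 \Big( (x + wy)^2 - w (x + y)^2 - (1 - w) x^2 \Big)\, dw
        = \int_0^1 (w^2 - w)\, y^2\, dw = -\frac{y^2}{6},
\\[4pt]
G'(x, y) = -\frac{1}{8} \int_0^1 (2 + 2)\, w^2 y^2\, dw = -\frac{y^2}{6},
\qquad\text{hence}\qquad G - G' = 0 .
```

In particular the jump-related correction term drops out of the de-biased estimator for this choice of $g$.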

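Then the "optimal" estimator for the quarticity is
\[ \Delta_n \sum_{i=1}^{[t/\Delta_n]-k_n+1} \left( 1 - \frac{3}{k_n} \right) (\hat c^n_i)^2 + \frac{\Delta_n}{4} \sum_{i=1}^{[t/\Delta_n]-2k_n+1} \left( \hat c^n_{i+k_n} - \hat c^n_i \right)^2 + \frac{(k_n - 1)\Delta_n}{2} \left( (\hat c^n_1)^2 + (\hat c^n_{[t/\Delta_n]-k_n+1})^2 \right). \]
The asymptotic variance is $8 \int_0^t c_s^4\, ds$, to be compared with the asymptotic variance of the more usual estimator $\frac{1}{3\Delta_n} \sum_{i=1}^{[t/\Delta_n]} (\Delta^n_i X)^4$, which is $\frac{32}{3} \int_0^t c_s^4\, ds$.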
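The two quarticity estimators of this example can be compared numerically. The sketch below assumes a continuous one-dimensional $X$ and no truncation; the function names are illustrative, and the simulated path with constant volatility in the usage note is a toy assumption, not a setting from the paper.

```python
def spot_variances(increments, k, dt):
    """Untruncated local variance estimates c_hat[i-1], i = 1, ..., n - k + 1."""
    squares = [dx * dx for dx in increments]
    window = sum(squares[:k])
    c_hat = [window / (k * dt)]
    for i in range(1, len(squares) - k + 1):
        window += squares[i + k - 1] - squares[i - 1]
        c_hat.append(window / (k * dt))
    return c_hat


def quarticity_debiased(increments, k, dt):
    """De-biased spot-based estimator of the integrated quarticity (Example 2.9)."""
    c_hat = spot_variances(increments, k, dt)
    n = len(increments)
    main = dt * (1.0 - 3.0 / k) * sum(c * c for c in c_hat)
    # sum over i = 1, ..., n - 2k + 1 of (c_hat_{i+k} - c_hat_i)^2, times dt / 4
    diffs = dt / 4.0 * sum((c_hat[i + k] - c_hat[i]) ** 2 for i in range(n - 2 * k + 1))
    border = (k - 1) * dt / 2.0 * (c_hat[0] ** 2 + c_hat[-1] ** 2)
    return main + diffs + border


def quarticity_standard(increments, dt):
    """The more usual estimator: sum of fourth powers of increments over 3 * dt."""
    return sum(dx ** 4 for dx in increments) / (3.0 * dt)
```

As a usage example, one can draw `n = 10000` Gaussian increments with variance `dt = 1e-4` (so $c \equiv 1$ and the true quarticity on $[0, 1]$ is 1), take `k = 100`, and repeat over many seeds: the empirical variances of the two estimators should be roughly in the ratio $8 : 32/3$ suggested by the asymptotic variances above.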

3 Proofs

3.1 Preliminaries

According to the localization Lemma 4.4.9 of [4] (for the assumption (K) in that lemma), it is enough to show all four Theorems 2.2, 2.5–2.7 under the following stronger assumption.
Assumption (SA’-r): We have (A’-r). Moreover, we have for a $\lambda$-integrable function $J$ on $E$ and a constant $A$:
\[ \|b\|,\ \|b'\|,\ \|\tilde b\|,\ \|c\|,\ \|\tilde c\|,\ \|\tilde c'\|,\ J \le A, \qquad \|\delta(\omega, t, z)\|^r \le J(z), \qquad \|\tilde\delta(\omega, t, z)\|^2 \le J(z). \tag{3.1} \]
In the sequel, we thus suppose that $X$ satisfies (SA’-r), and also that (2.3) holds: these assumptions are typically not recalled. Below, all constants are denoted by $K$, and they vary from line to line. They may implicitly depend on the process $X$ (usually through $A$ in (3.1)). When they depend on an additional parameter $p$, we write $K_p$.
We will usually replace the discontinuous process $X$ by the continuous process

$$X'_t = \int_0^t b'_s\, ds + \int_0^t \sigma_s\, dW_s, \tag{3.2}$$

connected with $X$ by $X_t = X_0 + X'_t + \sum_{s \le t} \Delta X_s$. Note that $b'$ is bounded, and
without loss of generality we will use below its càdlàg version. Note also that, since
the jumps of $c$ are bounded, one can rewrite (2.2) as

$$c_t = c_0 + \int_0^t \tilde b_s\, ds + \int_0^t \tilde\sigma_s\, dW_s + \int_0^t \int_E \tilde\delta(s, z)\, (\mu - \nu)(ds, dz).$$

This amounts to replacing $\tilde b$ in (2.2) by $\tilde b_{t+} + \int_E \tilde\delta(t+, z) \big(\kappa(\|\tilde\delta(t+, z)\|) - 1_{\{\|\tilde\delta(t+, z)\| \le 1\}}\big)\, \lambda(dz)$, where $\kappa$ is a continuous function with compact support,
equal to 1 on the set $[0, A]$. Note that the new process $\tilde b$ is bounded càdlàg.
With any process Z we associate the variables
 
η(Z )t,s = E supv∈(t,t+s] Z t+v − Z t 2 | Ft , (3.3)

and we recall Lemma 4.2 of [5]:

Lemma 3.1 For all t > 0, all bounded càdlàg processes Z , and all sequences
 [t/n ]
vn ≥ 0 of real numbers tending to 0, we have n E i=1 η(Z )(i−1)n ,vn → 0,
and for all 0 ≤ v ≤ s we have E(η(Z )nt+v,s | Ft ) ≤ η(Z )t,s .

3.2 An Auxiliary Result on Itô Semimartingales

In this subsection we give some simple estimates for a d-dimensional semimartingale


 t  t  t
Yt = bsY ds + σsY dWs + δ Y (s, z) (μ − ν)(ds, dz)
0 0 0 E

on some space (, F, (Ft )t≥0 , P), which may be different from the one on which
X is defined, as well as W and μ, but we still suppose that the intensity measure ν is
the same. Note that Y0 = 0 here. We assume that for some constant A and function
J Y we have, with cY = σ Y σ Y,∗ :

bY ≤ A, cY ≤ A2 , δ Y (ω, t, z) 2 ≤ J Y (z) ≤ A2 , J Y (z) λ(dz) ≤ A2 .
E
(3.4)
t
The compensator of the quadratic variation of Y is of the form 0 csY ds, where

ctY = ctY + E δ Y (t, z) δ Y (t, z)∗ λ(dz). Moreover, if the process cY is itself an Itô
semimartingale, the quadratic
 t Ycovariation of the continuous martingale parts of Y
and cY is also of the form 0  cs ds for some process c Y , necessarily bounded if both
Y and cY satisfy (3.4) (and, if Y = X , we have cY = c and  cY = c ).


Lemma 3.2 Below we assume (3.4), and the constant K only depends on A.
(a) We have for t ∈ [0, 1]:
 
E(Yt | F0 ) − tbY  ≤ t η(bY )0,t ≤ K t
0
  √ (3.5)
E(Yt j Y m | F0 ) − tcY, jm  ≤ K t (t + t η(bY )0,t + η(cY )0,t ) ≤ K t,
t 0

and if further E(ctY − c0Y | F0 ) ≤ A2 t for all t, we also have


  √
E(Yt j Y m | F0 ) − tcY, jm  ≤ 2 t 3/2 (2 A2 t + Aη(bY )0,t ) ≤ K t 3/2 . (3.6)
t 0

(b) When Y is continuous, and if E( ctY − c0Y 2 | F0 ) ≤ A4 t for all t, we have


  j k l m 
E Yt Y Y Y | F0 − t 2 (cY, jk cY,lm + cY, jl cY,km + cY, jm cY,kl ) ≤ K t 5/2 . (3.7)
t t t 0 0 0 0 0 0

(c) When cY is a (possibly discontinuous) semimartingale satisfying the same


conditions (3.4) as Y , and if Y itself is continuous, we have
  j k  √
E (Yt Y − tcY, jk )(ctY,lm − cY,lm ) | F0  ≤ K t 3/2 ( t + η(
c Y )0,t ). (3.8)
t 0 0

Proof The first part of (3.5) follows by taking the F0 -conditional expectation in the
t
decomposition Yt = Mt + tb0Y + 0 (bsY − b0Y ) ds, where M is a d-dimensional
martingale with M0 = 0. For the second part, we deduce from Itô’s formula that
Y j Y m is the sum of a martingale vanishing at 0 and of
 t  t  t  t
j j j j j
b0 Ysm ds + b0m Ys ds + Ysm (bs − b0 ) ds + Ys (bsm − b0m ) ds
0 0 0 0
 t
Y, jm Y, jm Y, jm
+ c0 t + (cs − c0 ) ds.
0

Since E( Yt | F0 ) ≤ K A t, as in (3.9), we deduce the second part of (3.5) and also
(3.6) by taking again the conditional expectation and by using the Cauchy-Schwarz
inequality and the first part.
j
Equation (3.7) is a part of Lemma 4.1 of [5]. For (3.8), we first observe that Yt Ytk −
Y, jk Y,lm Y,lm
tc0 = Bt + Mt and ct − c0 = Bt + Mt , with M and M martingales (M
is continuous). The processes B, B , M, M, M , M  and M, M  are absolutely
continuous, with densities bs , bs , h s , h s and h s satisfying, by (3.4) for Y and cY :

|bs | ≤ 2 Ys bsY + csY − c0Y , |bs | ≤ K , |h s | ≤ K Ys 2


, |h s | ≤ K ,

j
where h s = Ys  c Y,k,lm + Ysk
c Y, j;lm . Again as in (3.9) below, E( Yt q | F0 ) ≤
Kq t q/2 for all q, and E( ct − c0Y 2 | F0 ) ≤ K t. This yields E(Bt2 | F0 ) ≤ K t 3 and
Y

E(Mt | F0 ) ≤ K t 2 . Since |Bt | ≤ K t and E(Mt 2 | F0 ) ≤ K t, we deduce that the


2

F0 - conditional expectations of Bt Bt , Bt Mt and Mt Bt are smaller than K t 2 .



Finally E(Mt Mt | F0 ) = E(M, M t | F0 ), and M, M t is the sum of


t j  t j Y,k,lm
c0Y,k,lm 0
 Ys ds + 0 Ys ( cs − c0Y,k,lm ) ds and a similar term with k and j
exchanged. Then using again E( Yt 2 | F0 ) ≤ K t, plus E(Yt | F0 ) ≤ K t
and Cauchy-Schwarz inequality, we obtain that the above conditional expectation is
smaller than K (t 2 + t 3/2 η(c Y )t ). This completes the proof of (3.8). 

3.3 Some Estimates

(1) We begin with well known estimates for X and c, under (3.1) and for s, t ≥ 0
and q ≥ 0, see [4] for details:

E supw∈[0,s] X t+w − X t q | Ft ≤ K q s q/2 , E(X t+s − X t | Fs ) ≤ K s
E supw∈[0,s] ct+w − ct q | Ft ≤ K q s 1∧(q/2) , E(ct+s − ct | Fs ) ≤ K s.
(3.9)

Next, it is much easier (although unfeasible in practice) to replace $\hat c^n_i$ in (2.5) by
the estimators based on the process $X'$ given by (3.2). Namely, we will replace $\hat c^n_i$
by the following:

$$\hat c'^{\,n}_i = \frac{1}{k_n \Delta_n} \sum_{j=0}^{k_n-1} \Delta^n_{i+j} X'\, \Delta^n_{i+j} X'^{*}.$$

The difference between $\hat c^n_i$ and $\hat c'^{\,n}_i$ is estimated by the following inequality, valid
when $u_n \asymp \Delta_n^{\varpi}$ and $q \ge 1$, and where $a_n$ denotes a sequence of numbers (depending
on $u_n$), going to 0 as $n \to \infty$ (this is Eq. 4.8 of [5]):

$$E\big(\|\hat c^n_i - \hat c'^{\,n}_i\|^q\big) \le K_q\, a_n\, \Delta_n^{(2q-r)\varpi + 1 - q}. \tag{3.10}$$
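In dimension $d = 1$ the window average defining the local estimator is a moving mean of squared increments, rescaled by $\Delta_n$. The following is a minimal sketch, assuming the increments are already available as a list; the function name `spot_variance` and its arguments are illustrative and not taken from the paper, and no truncation of large increments is applied.

```python
from typing import List


def spot_variance(increments: List[float], k_n: int, delta_n: float) -> List[float]:
    """Local estimates (1 / (k_n * delta_n)) * sum_{j=0}^{k_n-1} increments[i + j]**2.

    Entry i (0-based) uses increments i, ..., i + k_n - 1, so the output
    has len(increments) - k_n + 1 entries.
    """
    if k_n < 1 or k_n > len(increments):
        raise ValueError("k_n must satisfy 1 <= k_n <= len(increments)")
    squares = [x * x for x in increments]
    window = sum(squares[:k_n])
    estimates = [window / (k_n * delta_n)]
    for i in range(1, len(squares) - k_n + 1):
        # Slide the window: add the newest square, drop the oldest one.
        window += squares[i + k_n - 1] - squares[i - 1]
        estimates.append(window / (k_n * delta_n))
    return estimates
```

For instance, if every increment has absolute value $\sqrt{c\,\Delta_n}$, each local estimate equals $c$ up to rounding.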

(2) The jumps of $c$ also potentially cause troubles. So we will eliminate the “big”
jumps as follows. For any $\rho > 0$ we consider the subset $E_\rho = \{z : J(z) > \rho\}$, which
satisfies $\lambda(E_\rho) < \infty$, and we denote by $\mathcal G^\rho$ the σ-field generated by the variables
$\mu([0, t] \times A)$, where $t \ge 0$ and $A$ runs through all Borel subsets of $E_\rho$. The process

$$N^\rho_t = \mu((0, t] \times E_\rho) \tag{3.11}$$

is a Poisson process and we let $S^\rho_1, S^\rho_2, \ldots$ be its successive jump times, and $\Omega_{n,t,\rho}$
be the set on which $S^\rho_j \notin \{i\Delta_n : i \ge 1\}$ for all $j \ge 1$ such that $S^\rho_j < t$, and
$S^\rho_{j+1} > t \wedge S^\rho_j + (6k_n + 1)\Delta_n$ for all $j \ge 0$ (with the convention $S^\rho_0 = 0$; taking $6k_n$
here instead of the more natural $k_n$ will be needed in the proof of Theorem 2.6, and
makes no difference here). All these objects are $\mathcal G^\rho$-measurable, and $P(\Omega_{n,t,\rho}) \to 1$
as $n \to \infty$, for all $t, \rho > 0$.


We define the processes


 

b(ρ)t = 
bt − 
δ(t+, z) λ(dz), c(ρ)t =  σt∗ +
σt  
δ(t+, z) 
δ(t+, z)∗ λ(dz)
Eρ (E ρ )c

t 
c(ρ)t = ct − 
δ(s, z) μ(ds, dz) = c(1) (ρ)t + c(2) (ρ)t , where
0 Eρ
t t
c(1) (ρ)t = c0 + 0  b(ρ)s ds + 0  σs dWs (3.12)
t 
c (ρ)t = 0 (E ρ )c 
(2) δ(t−, z) (μ − ν)(ds, dz),

2 2
so c(ρ), which is Rd ⊗ Rd -valued, is the càdlàg version of the density of the pre-
dictable quadratic variation of c(ρ). Moreover G ρ = {∅, } and (
b(ρ), c(ρ)) = (
b, c)

when ρ exceeds the bound of the function J . Note also that b(ρ) and c(ρ) are càdlàg.
By Lemma 2.1.5 and Proposition 2.1.10 in [4] applied to each components of X
and c(2) (ρ), plus the property  b(ρ) ≤ K /ρ, for all t ≥ 0, s ∈ [0, 1], ρ ∈ (0, 1],
q ≥ 2, we have

E supw∈[0,s] X t+w − X t q | Ft ∨ G ρ ≤ K q s q/2
E(X t+s − X t | Fs ∨ G ρ ) + E(c(ρ)t+s − c(ρ)t | Fs ∨ G ρ ) ≤ K s

E supw∈[0,s] c(2) (ρ)t+w − c(2) (ρ)t q | Ft ∨ G ρ ≤ K q φρ (s + s q/2 )
  q
E supw∈[0,s] c(ρ)t+w − c(ρ)t q | Ft ∨ G ρ ≤ K q φρ s + s q/2 + ρs q ≤ K q,ρ s.
(3.13)

where φρ = (E ρ )c J (z) λ(dz) → 0 as ρ → 0. Note also that 
b(ρ)t ≤ K /ρ.

(3) For convenience, we put

bin = b(i−1)n , cin = c(i−1)n ,



b(ρ)in = 
b(ρ)(i−1)n , c(ρ)in = c(ρ)(i−1)n , c(ρ)in = c(ρ)(i−1)n , (3.14)
= Fin ∨ G ρ .
n,ρ
Fin = F(i−1)n , Fi

n,ρ
All the above variables are Fi -measurable. Recalling (3.3), and writing
η(Z , (Ht ))t,s if we use the filtration (Ht ) instead of (Ft ), we also set

η(ρ)i,n j = max(η(Y, (G ρ Ft ))(i−1)n , jn : Y = b ,  c ,
b(ρ), c, c(ρ),
η(ρ)in = η(ρ)i,i+2k
n
n
.

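Therefore, Lemma 3.1 yields for all $t, \rho > 0$ and $j, k$ such that $j + k \le 2k_n$:

$$\Delta_n\, E\Big(\sum_{i=1}^{[t/\Delta_n]} \eta(\rho)^n_i\Big) \to 0, \qquad E\big(\eta(\rho)^n_{i+j,k} \mid \mathcal F^{n,\rho}_i\big) \le \eta(\rho)^n_i. \tag{3.15}$$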

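We still need some additional notation. First, define the $\mathcal G^\rho$-measurable (random) set
of integers:

$$L(n, \rho) = \{i = 1, 2, \ldots : N^\rho_{(i+2k_n)\Delta_n} - N^\rho_{(i-1)\Delta_n} = 0\} \tag{3.16}$$

(taking above $2k_n$ instead of $k_n$ is necessary for the proof of Theorem 2.5).
Observe that

$$i \in L(n, \rho),\ 0 \le j \le 2k_n + 1 \ \Rightarrow\ c^n_{i+j} - c^n_i = c(\rho)^n_{i+j} - c(\rho)^n_i. \tag{3.17}$$

Second, we define the following $\mathbb R^d \otimes \mathbb R^d$-valued variables:

$$\begin{aligned}
\alpha^n_i &= \Delta^n_i X'\, \Delta^n_i X'^{*} - c^n_i\, \Delta_n, \\
\beta^n_i &= \hat c'^{\,n}_i - c^n_i = \frac{1}{k_n \Delta_n} \sum_{j=0}^{k_n-1} \big(\alpha^n_{i+j} + (c^n_{i+j} - c^n_i)\, \Delta_n\big), \\
\gamma^n_i &= \hat c'^{\,n}_{i+k_n} - \hat c'^{\,n}_i = \beta^n_{i+k_n} - \beta^n_i + c^n_{i+k_n} - c^n_i.
\end{aligned} \tag{3.18}$$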

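(4) Now we proceed with estimates. (3.13) yields, for all $q \ge 0$:

$$\begin{aligned}
& E\big(\|\alpha^n_i\|^q \mid \mathcal F^{n,\rho}_i\big) \le K_q\, \Delta_n^{q}, \qquad \big\|E\big(\alpha^n_i \mid \mathcal F^{n,\rho}_i\big)\big\| \le K\, \Delta_n^{3/2}, \\
& E\Big(\Big\|\sum_{j=0}^{k_n-1} \alpha^n_{i+j}\Big\|^q \,\Big|\, \mathcal F^{n,\rho}_i\Big) \le K_q\, \Delta_n^{3q/4}, \qquad E\big(\|\hat c'^{\,n}_i\|^q \mid \mathcal F^{n,\rho}_i\big) \le K_q,
\end{aligned} \tag{3.19}$$

the third inequality following from the first two ones, plus Burkholder-Gundy and
Hölder inequalities, and the last inequality from the third one and the boundedness
of $c_t$. Moreover, since the set $\{i \in L(n, \rho)\}$ is $\mathcal G^\rho$-measurable, the last part of (3.13),
(3.17), and Hölder’s inequality, readily yield

$$q \ge 2,\ i \in L(n, \rho) \ \Rightarrow\ E\big(\|\beta^n_i\|^q \mid \mathcal F^{n,\rho}_i\big) \le K_q \Big(\sqrt{\Delta_n}\, \phi_\rho + \Delta_n^{q/4} + \frac{\Delta_n^{q/2}}{\rho^q}\Big). \tag{3.20}$$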

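(5) The previous estimates are not enough for us. We will apply the estimates of
Lemma 3.2 with $Y_t = X'_{(i-1)\Delta_n + t} - X'_{(i-1)\Delta_n}$ for any given pair $n, i$, and with the
filtration $(\mathcal F_{(i-1)\Delta_n + t} \vee \mathcal G^\rho)_{t \ge 0}$. We observe that on the set $A(\rho, n, i) = \{\exists\, j \le 2k_n : i - j \in L(n, \rho)\}$, which is $\mathcal G^\rho$-measurable, and because of (3.17), the process $c^Y$
coincides with $c(\rho)_{(i-1)\Delta_n + t} - c(\rho)_{(i-1)\Delta_n}$ if $t \in [0, \Delta_n]$. Then in restriction to this
set, by (3.6) and (3.7) and by the definition of $\eta(\rho)^n_{i,1}$, we have

$$\begin{aligned}
& \big|E\big(\Delta^n_i X'^{j}\, \Delta^n_i X'^{m} \mid \mathcal F^{n,\rho}_i\big) - c^{n,jm}_i\, \Delta_n\big| \le K_\rho\, \Delta_n^{3/2} \big(\sqrt{\Delta_n} + \eta(\rho)^n_{i,1}\big), \\
& \big|E\big(\Delta^n_i X'^{j}\, \Delta^n_i X'^{k}\, \Delta^n_i X'^{l}\, \Delta^n_i X'^{m} \mid \mathcal F^{n,\rho}_i\big) - \big(c^{n,jk}_i c^{n,lm}_i + c^{n,jl}_i c^{n,km}_i + c^{n,jm}_i c^{n,kl}_i\big) \Delta_n^2\big| \le K_\rho\, \Delta_n^{5/2}
\end{aligned}$$

(the constant above depends on $\rho$, through the bound $K/\rho$ for the drift of $c(\rho)$).
Then a simple calculation gives us

$$\left.\begin{aligned}
& \big\|E\big(\alpha^n_i \mid \mathcal F^{n,\rho}_i\big)\big\| \le K_\rho\, \Delta_n^{3/2} \big(\sqrt{\Delta_n} + \eta(\rho)^n_{i,1}\big) \\
& \big|E\big(\alpha^{n,jk}_i \alpha^{n,lm}_i \mid \mathcal F^{n,\rho}_i\big) - \big(c^{n,jl}_i c^{n,km}_i + c^{n,jm}_i c^{n,kl}_i\big) \Delta_n^2\big| \le K_\rho\, \Delta_n^{5/2}
\end{aligned}\right\} \quad \text{on } A(\rho, n, i). \tag{3.21}$$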

Next, we apply Lemma 3.2 to the process Yt = c(ρ)(i−1)n +t − c(ρ)(i−1)n for


any given pair n, i, and with the filtration (F(i−1)n +t ∨ G ρ )t≥0 . We then deduce
from (3.5), plus again (3.17), that

i ∈ L(n, ρ), 0 ≤ t ≤ kn n ⇒
 n, jklm 
E((c jk jk n,ρ
(i−1)n +t − c(i−1)n )(c(i−1)n +t − c(i−1)n ) | Fi ) − tc(ρ)i
lm lm
(3.22)
≤ K ρ t η(ρ)i,k
n
 n

E(c(i−1) +t − c(i−1) | F n,ρ ) − t b(ρ)in  ≤ K ρ t η(ρ)i,k
n ≤ K p t.
n n i n

Moreover, the Cauchy-Schwarz inequality and (3.19) on the one hand, and (3.8)
applied with the process Yt = X (i−1)n +t − X (i−1)n on the other hand, give us
   n,kl n n,ρ 
E α i b(ρ)ms | Fi  ≤ K n η(ρ)i,1
n
i
i ∈ L(n, ρ) ⇒   n,kl  3/2 √
(3.23)
E α in cms | Fi  ≤ K ρ n ( n + η(ρ)i,1
n,ρ n ).
i

(6) We now proceed to estimates on βin .

Lemma 3.3 We have on the set where i belongs to L(n, ρ):


 n, jklm 
E(β n, jk β n,lm | F n,ρ ) − 1 n, jl n,km
(ci ci + ci ci ) − kn3n
n, jm n,kl
c(ρ)i
i i i kn
√ 1/4
≤K ρ n (n +η(ρ)in )
  √ √
E(β n, jk (cn,lm − cn,lm ) | F n,ρ ) − kn n n, jklm 
c(ρ)i ≤ K ρ n ( n + η(ρ)in ).
i i+kn i i 2

n, jk n,lm
Proof We set ζi,n j = αi+
n
j + (ci+ j − ci )n and write βi
n n βi as

kn −1 kn −2 k
n −1 kn −2 k
n −1
1  n, jk n,lm 1  n, jk n,lm 1  n,lm n, jk
ζ ζ + ζ ζ + ζi,u ζi,v .
kn2 2n i,u i,u
kn2 2n i,u i,v
kn2 2n
u=0 u=0 v=u+1 u=0 v=u+1
(3.24)

For the estimates below, we implicitly assume i ∈ L(n, ρ) and u, v ∈ {0, . . . , kn −1}.
First, we deduce from (3.21) and (3.22), plus (3.23) and successive condition-
ing, that
 
E(ζ n, jk ζ n,lm | F n,ρ ) − (cn, jl cn,km + cn, jm cn,kl )2  ≤ K n5/2 . (3.25)
i,u i,u i i i i i n

Second, if u < v, the same type of arguments and the boundedness of 


b(ρ)t and ct
yield

)n − 
n, jk n,ρ n, jk n, jk n, jk
|E(ζi,v | Fi+u+1 ) − (ci+u+1 − ci b(ρ)i+u+1 2n (v − u − 1)|
3/2 √
≤ K n (kn n + η(ρ)i+v,1n )
n,lm n, jk n, jk n,ρ 3/2 √
|E(αi+u (ci+u+1 − ci+u ) | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 n )
n,lm n, jk n, jk n,ρ 3/2 √
|E(αi+u (ci+u − ci ) | Fi+u )| ≤ K n ( n + ηi+u,1 ) n

n,lm  n, jk 3/2 √
(b(ρ)i+u+1 − 
n, jk n,ρ
|E(αi+u b(ρ)i+u ) | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 n )
n,lm  n, jk n,ρ 3/2 √
|E(αi+u b(ρ)i+u | Fi+u )| ≤ K ρ n ( n + η(ρ)i+u,1 ) n

n,lm n, jk n, jk n,ρ n, jklm


|E((ci+u − cin,lm ) (ci+u+1 − ci ) | Fi ) − c(ρ)i n u| ≤ K ρ n η(ρ)in
− cin,lm ) 
n,lm n, jk n,ρ 1/4
|E((ci+u b(ρ)i+u+1 | Fi )| ≤ K ρ n .
kn −2 kn −1 n,ρ
Since u=0 v=u+1 u = kn3 /6 + O(kn2 ), we easily deduce that the Fi -conditional
n, jklm
expectation of the last term in (3.24) is 1
c(ρ)i k n n ,
up to a remainder term
√ 1/4
6
which is O( n (n + η(ρ)i )), and the same is obviously true for the second
n

term. The first claim of the lemma readily follows from this and (3.24) and (3.25).
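The combinatorial identity used in this proof can be checked exactly: the double sum of $u$ over $0 \le u \le k_n - 2$, $u + 1 \le v \le k_n - 1$ equals $(k_n - 2)(k_n - 1)k_n/6$, which is $k_n^3/6 + O(k_n^2)$. A small sketch, with illustrative function names:

```python
def double_sum(k: int) -> int:
    """Sum of u over all pairs (u, v) with 0 <= u <= k - 2 and u + 1 <= v <= k - 1."""
    return sum(u for u in range(k - 1) for v in range(u + 1, k))


def closed_form(k: int) -> int:
    """Closed form (k - 2)(k - 1)k / 6 of the same double sum."""
    return (k - 2) * (k - 1) * k // 6


for k in (2, 5, 50, 200):
    exact = double_sum(k)
    # The gap to k**3 / 6 is (3k**2 - 2k) / 6, hence at most k**2 / 2 in size.
    print(k, exact, closed_form(k), exact - k ** 3 / 6)
```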
The proof of the second claim is similar. Indeed, we have
kn −1
n, jk n,lm 1   n, jk n, jk n, jk  n,lm
βi (ci+k − cin,lm ) = αi,u + (ci+u − ci )n ci+k − cin,lm
n k n n n
u=0

and
 
E(cn,lm − cn,lm | F n,ρ ) − cn,lm − cn,lm −  n,lm
b(ρ)i+u+1 n (kn − u − 1)
i+kn i i+u+1 i+u+1 i
≤ K n η(ρ)i+u+1,k
n
n −u
.

Using the previous estimates, we conclude as for the first claim. 


Finally, we deduce the following two estimates on the variables of (3.18), for γin
any q ≥ 2:
⎧   n, jk n,lm
⎪ E γ γi
n,ρ n, jl
| Fi ) − k2n (ci cin,km + ci
n, jm n,kl
ci )

⎨ i
2kn n

n, jklm  √  1/8
i ∈ L(n, ρ) ⇒ − 3 c(ρ)i ≤ K ρ n n + η(ρ)in

⎪ √
⎩ q/2
E( γin q | Fi ) ≤ K q n φρ + n + ρnq .
n,ρ q/4

(3.26)
n, jk
To see that the first claim holds, one expands the product γi γin,lm and uses suc-
cessive conditioning, the Cauchy-Schwarz inequality and (3.13), (3.17) and (3.22),


and Lemma 3.3; the contributing terms are

n, jk n, jk n, jk n, jk
βi βin,lm + βi+kn βi+k
n,lm
n
+ (ci+kn − ci n,lm
)(ci+k n
− cin,lm )
n, jk n,lm n, jk n, jk
− βi (ci+k n
− cin,lm ) − βin,lm (ci+kn − ci ).

For the second claim we use (3.13), (3.17) and (3.20), and it holds for all q ≥ 2.

3.4 The Behavior of Some Functionals of c(ρ)

For $\rho > 0$, we set

$$U(\rho)^n_t = \sum_{j=3}^{[t/k_n\Delta_n]-3} \|\mu(\rho)^n_j\|^2\, 1_{\{\|\mu(\rho)^n_j\| > u_n/4\}}, \quad \text{where} \quad \mu(\rho)^n_j = \frac{1}{k_n} \sum_{w=0}^{k_n-1} \big(c(\rho)^n_{jk_n+w} - c(\rho)^n_{(j-2)k_n+w}\big). \tag{3.27}$$

The aim of this subsection is to prove the following lemma.

Lemma 3.4 Under (SA’-r) and (2.14) we have

$$\lim_{\rho \to 0}\ \limsup_{n \to \infty}\ E\big(U(\rho)^n_t\big) = 0.$$

Assumption (SA’-r) is of course not fully used. One only needs the assumptions
concerning the process $c_t$.

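Proof With the notation (3.12), and for $l = 1, 2$ we define $\mu^{(l)}(\rho)^n_j$ and $U^{(l)}(\rho)^n_t$
as above, upon substituting $c(\rho)$ and $u_n/4$ with $c^{(l)}(\rho)$ and $u_n/8$. Since $U(\rho)^n_t \le 4U^{(1)}(\rho)^n_t + 4U^{(2)}(\rho)^n_t$, it suffices to prove the result for each $U^{(l)}(\rho)^n_t$.
First, $\|\mu^{(1)}(\rho)^n_j\|^2\, 1_{\{\|\mu^{(1)}(\rho)^n_j\| > u_n/8\}}$ is smaller than $K \|\mu^{(1)}(\rho)^n_j\|^4 / u_n^2$, whereas
(recalling $\|\tilde b(\rho)\| \le K/\rho$) classical estimates yield $E\big(\|\mu^{(1)}(\rho)^n_j\|^4\big) \le K \Delta_n (1 + \Delta_n/\rho)$. Thus the expectation of $U^{(1)}(\rho)^n_t$ is less than $K \Delta_n^{1/2 - 2\varpi} (1 + \Delta_n/\rho)$, yielding
the result for $U^{(1)}(\rho)^n_t$.
Secondly, we have $U^{(2)}(\rho)^n_t \le \sum_{j=3}^{[t/k_n\Delta_n]} \|\mu^{(2)}(\rho)^n_j\|^2$ and the first part of (3.13)
yields $E\big(\|\mu^{(2)}(\rho)^n_j\|^2\big) \le K \phi_\rho \sqrt{\Delta_n}$. Since the sum has at most $K t/\sqrt{\Delta_n}$ terms and $\phi_\rho \to 0$ as $\rho \to 0$, the result for $U^{(2)}(\rho)^n_t$
follows.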

3.5 A Basic Decomposition

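We start the proof of Theorem 2.2 by giving a decomposition of $V(g)^n - V(g)$, with
quite a few terms. It is based on the key property $\hat c'^{\,n}_i = c^n_i + \beta^n_i$ and on the definition
(3.18) of $\alpha^n_i$ and $\beta^n_i$. A simple calculation shows that $\frac{1}{\sqrt{\Delta_n}} \big(V(g)^n_t - V(g)_t\big) = \sum_{j=1}^{5} V^{n,j}_t$, as soon as $t > k_n \Delta_n$, where (the sums on components below always
extend from 1 to $d$):

$$\begin{aligned}
V^{n,1}_t &= \sqrt{\Delta_n} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \big(g(\hat c^n_i) - g(\hat c'^{\,n}_i)\big), \\
V^{n,2}_t &= \frac{1}{\sqrt{\Delta_n}} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \int_{(i-1)\Delta_n}^{i\Delta_n} \big(g(c^n_i) - g(c_s)\big)\, ds, \\
V^{n,3}_t &= \frac{1}{k_n \sqrt{\Delta_n}} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \sum_{l,m} \partial_{lm} g(c^n_i) \sum_{u=0}^{k_n-1} \alpha^{n,lm}_{i+u}, \\
V^{n,4}_t &= \frac{\sqrt{\Delta_n}}{k_n} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \sum_{l,m} \partial_{lm} g(c^n_i) \sum_{u=1}^{k_n-1} \big(c^{n,lm}_{i+u} - c^{n,lm}_i\big) - \frac{1}{\sqrt{\Delta_n}} \int_{\Delta_n([t/\Delta_n]-k_n+1)}^{t} g(c_s)\, ds, \\
V^{n,5}_t &= \sqrt{\Delta_n} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \Big(g(c^n_i + \beta^n_i) - g(c^n_i) - \sum_{l,m} \partial_{lm} g(c^n_i)\, \beta^{n,lm}_i\Big).
\end{aligned}$$

The leading term is $V^{n,3}$, the bias comes from the terms $V^{n,4}$ and $V^{n,5}$, and the
first two terms are negligible, in the sense that they satisfy

$$j = 1, 2 \ \Rightarrow\ V^{n,j}_t \overset{P}{\longrightarrow} 0 \quad \text{for all } t > 0. \tag{3.28}$$

We end this subsection with the proof of (3.28).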


The case j = 1: (2.6) implies

cin ) − g(
|g( cin )| ≤ K (1 + 
cin + 
cin ) p−1 
cin − 
cin
cin ) p−1 
≤ K (1 +  cin − 
cin + K 
cin − 
cin p
.

Recalling the last part of (3.19), we deduce from (3.10), together with the fact that
1 − r − p(1 − 2 ) < (2−r 2q
)
for all q > 1 small enough and Hölder’s inequality
(2 p−r ) +1− p
cin ) − g(
that E(|g( cin )|) ≤ K an n . Therefore

(2 p−r ) +1/2− p
E sup |Vsn,1 | ≤ K tan n
s≤t

and (3.28) for j = 1 follows.




The case j = 2: Since $g$ is $C^2$ and $c_t$ is an Itô semimartingale with bounded characteristics, the convergence $V^{n,2} \overset{\text{u.c.p.}}{\Longrightarrow} 0$ is well known: see for example the proof of
(5.3.24) in [4], in which one replaces $\rho_{c_s}(f)$ by $g(c_s)$.

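3.6 The Leading Term $V^{n,3}$

Our aim here is to prove that

$$V^{n,3} \overset{\mathcal L\text{-}s}{\Longrightarrow} Z \tag{3.29}$$

(functional stable convergence in law), where $Z$ is the process defined in Theorem 2.2.
A change of order of summation allows us to rewrite $V^{n,3}$ as

$$V^{n,3}_t = \frac{1}{\sqrt{\Delta_n}} \sum_{i=1}^{[t/\Delta_n]} \sum_{l,m} w^{n,lm}_i\, \alpha^{n,lm}_i, \quad \text{where} \quad w^{n,lm}_i = \frac{1}{k_n} \sum_{j=(i-[t/\Delta_n]+k_n-1)^+}^{(i-1)\wedge(k_n-1)} \partial_{lm} g(c^n_{i-j}).$$

Observe that $w^n_i$ and $\alpha^n_i$ are measurable with respect to $\mathcal F^n_i$ and $\mathcal F^n_{i+1}$, respectively, so
by Theorem IX.7.28 of [3] (with $G = 0$ and $Z = 0$ in the notation of that theorem)
it suffices to prove the following four convergences in probability, for all $t > 0$ and
all component indices:

$$\frac{1}{\sqrt{\Delta_n}} \sum_{i=1}^{[t/\Delta_n]-k_n+1} w^{n,lm}_i\, E\big(\alpha^{n,lm}_i \mid \mathcal F^n_i\big) \overset{P}{\longrightarrow} 0 \tag{3.30}$$

$$\frac{1}{\Delta_n} \sum_{i=1}^{[t/\Delta_n]-k_n+1} w^{n,jk}_i\, w^{n,lm}_i\, E\big(\alpha^{n,jk}_i \alpha^{n,lm}_i \mid \mathcal F^n_i\big) \overset{P}{\longrightarrow} \int_0^t \partial_{jk} g(c_s)\, \partial_{lm} g(c_s) \big(c^{jl}_s c^{km}_s + c^{jm}_s c^{kl}_s\big)\, ds \tag{3.31}$$

$$\frac{1}{\Delta_n^2} \sum_{i=1}^{[t/\Delta_n]-k_n+1} \|w^n_i\|^4\, E\big(\|\alpha^n_i\|^4 \mid \mathcal F^n_i\big) \overset{P}{\longrightarrow} 0 \tag{3.32}$$

$$\frac{1}{\sqrt{\Delta_n}} \sum_{i=1}^{[t/\Delta_n]-k_n+1} w^{n,lm}_i\, E\big(\alpha^{n,lm}_i\, \Delta^n_i N \mid \mathcal F^n_i\big) \overset{P}{\longrightarrow} 0, \tag{3.33}$$

where $N = W^j$ for some $j$, or is an arbitrary bounded martingale, orthogonal to $W$.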


For proving these properties, we pick a $\rho$ bigger than the upper bound of the
function $J$, so $\mathcal G^\rho$ becomes the trivial σ-field and $\mathcal F^{n,\rho}_i = \mathcal F^n_i$ and $L(n, \rho) = \mathbb N$. In such
a way, we can apply all estimates of the previous subsections with the conditioning
σ-fields $\mathcal F^n_i$. Therefore (3.19) and the property $\|w^n_i\| \le K$ readily imply (3.30) and
(3.32). In view of the form of $\alpha^n_i$, a usual argument (see e.g. [4]) shows that in fact
$E\big(\alpha^{n,lm}_i\, \Delta^n_i N \mid \mathcal F^n_i\big) = 0$ for all $N$ as above, hence (3.33) holds.
For (3.31), by (3.21) it suffices to prove that

$$\Delta_n \sum_{i=1}^{[t/\Delta_n]-k_n+1} w^{n,jk}_i\, w^{n,lm}_i \big(c^{n,jl}_i c^{n,km}_i + c^{n,jm}_i c^{n,kl}_i\big) \overset{P}{\longrightarrow} \int_0^t \partial_{jk} g(c_s)\, \partial_{lm} g(c_s) \big(c^{jl}_s c^{km}_s + c^{jm}_s c^{kl}_s\big)\, ds.$$

In view of the definition of $w^n_i$, for each $t$ we have $w^{n,jk}_{i(n,t)} \to \partial_{jk} g(c_t)$ and $c^{n,jk}_{i(n,t)} \to c^{jk}_t$ almost surely if $|i(n, t)\Delta_n - t| \le k_n \Delta_n$ (recall that $c$ is almost surely continuous
at $t$, for any fixed $t$), and the above convergence follows by the dominated convergence
theorem, thus ending the proof of (3.29).

3.7 The Term $V^{n,4}$

In this subsection we prove that, for all $t$,

$$V^{n,4}_t \overset{P}{\longrightarrow} \frac{\theta}{2} \sum_{l,m} \int_0^t \partial_{lm} g(c_{s-})\, dc^{lm}_s - \theta\, g(c_t). \tag{3.34}$$

We call $V'^{\,n,4}_t$ and $V''^{\,n,4}_t$, respectively, the first sum, and the last integral, in the
definition of $V^{n,4}_t$. Since $k_n \sqrt{\Delta_n} \to \theta$ and $c$ is a.s. continuous at $t$, it is obvious that
$V''^{\,n,4}_t$ converges almost surely to $-\theta\, g(c_t)$, and it remains to prove the convergence
of $V'^{\,n,4}_t$ to the first term in the right side of (3.34).

We first observe that $c^n_{i+u} - c^n_i = \sum_{v=0}^{u-1} \Delta^n_{i+v} c$. Then, upon changing the order
of summation, we can rewrite $V'^{\,n,4}_t$ as

$$V'^{\,n,4}_t = \sum_{i=1}^{[t/\Delta_n]-1} \sum_{l,m} w^{n,lm}_i\, \Delta^n_i c^{lm}, \qquad w^{n,lm}_i = \frac{\sqrt{\Delta_n}}{k_n} \sum_{u=0\vee(i+k_n-1-[t/\Delta_n])}^{(i-1)\wedge(k_n-2)} (k_n - 1 - u)\, \partial_{lm} g(c^n_{i-u}).$$

In other words, recalling $k_n \sqrt{\Delta_n} \le K$ and $\|\partial g(c_s)\| \le K$, we see that

$$V'^{\,n,4}_t = \sum_{l,m} \int_0^t H(n, t)^{lm}_s\, dc^{lm}_s,$$

where $H(n, t)_s$ is a $d \times d$-dimensional predictable process, bounded uniformly (in
$n, s, \omega$) and given on the set $[k_n \Delta_n, t - k_n \Delta_n]$ by

$$(i-1)\Delta_n < s \le i\Delta_n \ \Rightarrow\ H(n, t)^{lm}_s = \frac{\sqrt{\Delta_n}}{k_n} \sum_{u=0}^{k_n-2} (k_n - 1 - u)\, \partial_{lm} g(c^n_{i-u})$$

(its expression on $[0, k_n \Delta_n)$ and on $(t - k_n \Delta_n, t]$ is more complicated, but not needed,
apart from the fact that it is uniformly bounded). Now, since $\sum_{u=0}^{k_n-2} (k_n - 1 - u) = k_n^2/2 + O(k_n)$ as $n \to \infty$, we observe that $H(n, t)^{lm}_s$ converges to $\frac{\theta}{2}\, \partial_{lm} g(c_{s-})$ for all
$s \in (0, t)$. Since $c$ is a.s. continuous at $t$, we deduce from the dominated convergence
theorem for stochastic integrals that $V'^{\,n,4}_t$ indeed converges in probability to the first
term in the right side of (3.34).
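The normalization behind the limit $\theta/2$ can be made concrete. The sum $\sum_{u=0}^{k_n-2} (k_n - 1 - u)$ equals $k_n(k_n - 1)/2$ exactly, so the total weight $\frac{\sqrt{\Delta_n}}{k_n} \sum_u (k_n - 1 - u)$ is $\sqrt{\Delta_n}\,(k_n - 1)/2$. The sketch below assumes the particular choice $k_n = \lfloor \theta/\sqrt{\Delta_n} \rfloor$, which is only one sequence satisfying $k_n \sqrt{\Delta_n} \to \theta$; the function name is illustrative.

```python
import math


def total_weight(delta_n: float, theta: float) -> float:
    """Return (sqrt(delta_n) / k_n) * sum_{u=0}^{k_n-2} (k_n - 1 - u).

    Assumption for this sketch: k_n = floor(theta / sqrt(delta_n)).
    """
    k_n = int(theta / math.sqrt(delta_n))
    if k_n < 2:
        raise ValueError("delta_n is too large for this theta: need k_n >= 2")
    weight_sum = sum(k_n - 1 - u for u in range(k_n - 1))  # equals k_n * (k_n - 1) / 2
    return math.sqrt(delta_n) * weight_sum / k_n


theta = 1.5
for delta_n in (1e-2, 1e-4, 1e-6):
    # The second column approaches the third one, theta / 2, as delta_n decreases.
    print(delta_n, total_weight(delta_n, theta), theta / 2)
```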

3.8 The Term V n,5

The aim of this subsection is to prove the convergence

P  1 
Vtn,5 −→A2t − 2 A3t +θ g(cs− + wcs ) − g(cs− ) − w ∂lm g(cs− ) cslm dw.
s≤t 0 l,m
(3.35)

[t/n ]−kn +1
We have Vtn,5 = i=1 vin , where
 
vin = n g(cin + βin ) − g(cin ) − ∂lm g(cin ) βin,lm .
l,m

We also set
kn −1 n kn −1 n
αin = kn1n u=0 αi+u , β in = βin − αin = k1n u=1 (ci+u − cin ),
√  
vi n = n g(cin + β in ) − g(cin ) − ∂lm g(cin ) β in,lm , vi n = vin − vi n .
l,m
(3.36)

We take ρ ∈ (0, 1], and will eventually let it go to 0. With the sets L(n, ρ) of
(3.16), we associate

L(n, ρ, t) = {1, . . . , [t/n ] − kn + 1} ∩ L(n, ρ)


L(n, ρ, t) = {1, . . . , [t/n ] − kn + 1}\L(n, ρ).

We split the sum giving Vtn,5 into three terms:

n,ρ
 n,ρ
 n,ρ

Ut = vin , Ut = vi n , Ut = vi n . (3.37)
i∈L(n,ρ,t) i∈L(n,ρ,t) i∈L(n,ρ,t)

(A) The processes U n,ρ . A Taylor expansion and (2.6) give us


⎧ √ 
v(1)in = 2n
n, jk n,lm n,ρ

⎪ j,k,l,m ∂ jk,lm g(ci ) E(βi
2 n
βi | Fi )
⎨ √ 
vin = v(1)in + v(2)in + v(3)in , where v(2)n = n n, jk n,lm
j,k,l,m ∂ jk,lm g(ci ) βi βi − v(1)in
2 n

⎪ i 2
⎩ √
|v(3)in | ≤ K n (1 + βin ) p−3 βin 3 .

Therefore


3 
U n,ρ = U ( j)n,ρ , where U ( j)nt = v( j)in . (3.38)
j=1 i∈L(n,ρ,t)

On the one hand, letting




 1 n, jl n,km n, jm n,kl k n n n, jklm
w(ρ)in = ∂ 2jk,lm g(cin ) √ (ci ci + ci ci ) + c(ρ)i ,
2kn n 6
j,k,l,m


the càdlàg property of c and c(ρ) and kn n → θ imply

]−kn +1
[t/n  t
P ρ θ  jklm
W (ρ)nt := n w(ρ)in −→U (1)t := A2t + ∂ 2jk,lm g(cs ) c(ρ)s ds.
6 0
i=1 j,k,l,m

1/4
On the other hand, Lemma 3.3 yields |v(1)in − n w(ρ)in | ≤ K ρ n (n + η(ρ)in )
when i ∈ L(n, ρ), whereas |w(ρ)in | ≤ K always. Therefore


[t/
n ]  
n,ρ
E |U (1)t − W (ρ)nt | ≤ K ρ n E ( n + η(ρ)in ) + K n E #(L(n, ρ, t)) .
i=1

ρ
Now, √#(L(n, ρ, t)) is not bigger than (2kn +1)Nt , implying that n E(#(L (n, ρ, t)))
≤ K ρ n . Taking advantage of (3.15), we deduce that the above expectation goes
to 0 as n → ∞, and thus
n,ρ P ρ
U (1)t −→ U (1)t . (3.39)
n,ρ n,ρ
Next, v(2)in is Fi+kn -measurable, with vanishing Fi -conditional expectation,
n,ρ
and each set {i ∈ L(n, ρ)} is F0 -measurable. It follows that


   
n,ρ n,ρ
E (U (2)t )2 ≤ 2kn E i∈L(n,ρ,t) E |v(2)i | | Fi
n 2
  n4  √
n,ρ
≤ K k n n E i∈L(n,ρ,t) E |β i | | Fi ≤ K tφρ + K ρ t n ,

where we have applied (3.20) for the last inequality. Another application of the same
estimate gives us
 1/4
E |U (3)nt |) ≤ K tφρ + K ρ tn .

These two results and the property φρ → 0 as ρ → 0 clearly imply


n,ρ n,ρ
lim lim sup E(|U (2)t | + |U (3)t |) = 0. (3.40)
ρ→0 n→∞

ρ ρ
(B) The processes U n,ρ . We will use here the jump times S1 , S2 , . . . of the Poisson
process N ρ , and will restrict our attention to the set n,t,ρ defined before (3.12),
whose probability goes to 1 as n → ∞. On this set, L(n, ρ, t) is the collection of
ρ ρ
all integers i which are between [Sq /n ] − 2kn + 2 and [Sq /n ] + 1, for some q
ρ
between 1 and Nt . Thus
ρ ρ
Nt [Sq /n ]+1
n,ρ
 
Ut = H (n, ρ, q), where H (n, ρ, q) = vi n . (3.41)
ρ
q=1 i=[Sq /n ]−2kn +1

ρ
The behavior of each H (n, ρ, q) is a pathwise question. We fix q and set S = Sq and
an = [S/n ], so S > an n because S is not a multiple of n . For further reference
we consider a case slightly more general than strictly needed here. We have cin → c S−
when an − 6kn + 1 ≤ i ≤ an + 1 and cin → c S when an + 2 ≤ i ≤ an + 6kn ,
uniformly in i (for each given outcome ω). Hence

(kn − an + i − 2)+ ∧ (kn − 1)


β in − c S → 0
kn
uniformly in i ∈ {an − 6kn + 2, . . . , an + 5kn }. (3.42)

Thus, the following convergence holds, uniform in i ∈ {an − 2kn + 1, . . . , an + 1}:


 kn −an +i−2
√1
n
vi n − g c S− + c S − g(c S− )
kn
  jk 
kn −an +i−2
− l,m ∂lm g(c S− ) c S− + kn clm
S → 0,

which implies

n −3
k  
 u u
H (n, ρ, q) − n g c Sq − + c Sq − g(c Sq − ) − ∂lm g(c Sq − ) clm
S →0
kn kn q
u=1 l,m

and by Riemann integration this yields


 1 
H (n, ρ, q) → θ g(c Sq − + wc Sq ) − g(c Sq − ) − w ∂lm g(c Sq − ) clm
S dw.
0 l,m

Henceforth, we have
ρ
Nt 
n,ρ P ρ
 1 
Ut −→Ut := θ g(c Sq − + wc Sq ) − g(c Sq − ) − w ∂lm g(c Sq − ) clm
Sq dw.
q=1 0 l,m
(3.43)

(C) The processes U n,ρ . Since |β in | ≤ K we deduce from (2.6) that |vi n | ≤
√ n,ρ q/4
K n ( αin + αin p ). (3.19) yields E( αin q | Fi ) ≤ K q n for all q > 0.
Therefore  n,ρ 3/4 1/4
E |Ut | ≤ K n E(#(L(, n, ρ, t))) ≤ K ρ n ,

by virtue of what precedes (3.39). We then deduce

n,ρ P
Ut −→ 0. (3.44)

(D) Proof of (3.35). On the one hand, V n,5 = U (1)n,ρ + U (2)n,ρ + U (3)n,ρ +
U n,ρ + U n,ρ ; on the other hand, the dominated convergence theorem (observe that
ρ P
c(ρ)t → 
σt2 for all t) yields that U (1)t −→A2 − 1
2 A3t and

ρ P
 1 
Ut −→θ g(cs− + wcs ) − g(cs− ) − w ∂lm g(cs− ) cslm dw
s≤t 0 l,m


as ρ → 0 (for the latter convergence, note that |g(x +y)−g(x)− l,m ∂lm g(x)y lm | ≤
K y 2 when x, y stay in a compact set). Then the property (3.35) follows from (3.39),
(3.40), (3.43) and (3.44).
(E) Proof of Theorem 2.2.
We are now ready to prove Theorem 2.2. Recall that
$\frac{1}{\sqrt{\Delta_n}} \big(V(g)^n_t - V(g)_t\big) = \sum_{j=1}^{5} V^{n,j}_t$. By virtue of (3.28), (3.29), (3.34), (3.35), it is
enough to check that

$$A^1_t + A^3_t + A^4_t + A^5_t = \frac{\theta}{2} \sum_{l,m} \int_0^t \partial_{lm} g(c_{s-})\, dc^{lm}_s - \theta\, g(c_t) - 2A^3_t + \theta \sum_{s \le t} \int_0^1 \Big(g(c_{s-} + w \Delta c_s) - g(c_{s-}) - w \sum_{l,m} \partial_{lm} g(c_{s-})\, \Delta c^{lm}_s\Big)\, dw.$$

To this aim, we observe that Itô’s formula gives us

$$g(c_t) = g(c_0) + \sum_{l,m} \int_0^t \partial_{lm} g(c_{s-})\, dc^{lm}_s - \frac{6}{\theta} A^3_t + \sum_{s \le t} \Big(g(c_{s-} + \Delta c_s) - g(c_{s-}) - \sum_{l,m} \partial_{lm} g(c_{s-})\, \Delta c^{lm}_s\Big),$$

so the desired equality is immediate (use also $\int_0^1 w\, dw = \frac{1}{2}$), and the proof of
Theorem 2.2 is complete.

3.9 Proof of Theorem 2.5

The proof of Theorem 2.5 follows the same line as in Sect. 3.8, and we begin with
an auxiliary step.
Step (1) Replacing  cin . The summands in the definition (2.12) of An,3
cin by  are
 t
R(ci ,
n ci+kn ), where R(x, y) =
n ∂ 2
j,k,l,m jk,lm g(x)(y jk − x jk )(y lm − x lm ), and

we set
√ [t/n]−2kn +1
n
Atn,3 = − cin ,
R( n
ci+k n
).
8
i=1

We prove here that


P
An,3
t − At
n,3
−→ 0 (3.45)

for all t, and this is done as in the step j = 1 in Sect. 3.5. The function R is C 1 on
R2+ with ∂ j R(x, y) ≤ K (1 + x + y ) p− j for j = 0, 1, by (2.6). Thus

cin ,
|R( n
ci+k n
) − R(
cin , n
ci+k n
)|

≤ K (1 + 
cin +  n
ci+k n
) ) p−1 ( 
cin − 
cin + n
ci+k n
− n
ci+k n
)
+K 
cin − 
cin p
+K n
ci+k n
− n
ci+k n
p
.

Then, exactly as in the case afore-mentioned, we conclude (3.45), and it remains to


prove that, for all t, we have

P 1 2
Atn,3 −→ − A + A3t + At4 .
2 t
Step (2) From now on we use the same notation as in Sect. 3.8, although they denote
different variables or processes. For any ρ ∈ (0, 1] we have A n,3 = U n,ρ + U n,ρ +
U n,ρ , as defined in (3.37), but with

n
vin = − 8 R(cin + βin , ci+k
n
n
+ βi+k
n
n
)

vi n = − 8n R(cin + β in , ci+k
n
n
+ β i+k
n
n
), vi n = vin − vi n .

Recalling γin in (3.18), the decomposition (3.38) holds with


√  n, jk n,lm
n  n,ρ
v(1)in = − 8 j,l,k,m ∂ jl,km
2 g(cin ) E γi γi | Fi
√ 
− 8n
n, jk
v(2)in = j,l,k,m ∂ jl,km
2 g(cin ) γi γin,lm − v(1)in
v(3)in = vin − v(1)in − v(2)in .

Use 
cin − cin = βin and (2.6) and a Taylor expansion to check that

|v(3)in | ≤ K n γin 2
βin (1 + βin ) p−3 .

We also have |v(2)in | ≤ K n γin 2, hence (3.20) and (3.26) yield

n 
E(|v(3)in | | G ρ ) + E(|v(2)in |2 | G ρ ) ≤ K n φρ + n
1/4
+ ,
ρp

and thus (3.40) holds here as well, by the same argument. Moreover, (3.26) again
yields (3.39), with now

ρ
  t θ jklm 1 jl km jm

Ut = − ∂ 2jk,lm g(cs ) c(ρ)s + (cs cs + cs cskl ) ds.
12 4θ
j,k,l,m 0

This goes to A3t − 21 A2t as ρ → 0.


Another application of (2.6) gives us
 
|vi n | ≤ K n 1 + γin 2
αin + αi+k
n
n
+ αin p
+ αi+k
n
n
p
.

Then another application of (3.19), (3.20) and (3.26) yields E(|vi n | | G ρ ) ≤ K n


3/4

and we conclude (3.44) as previously. We are thus left to prove that

n,ρ P ρ ρ P
ρ > 0 ⇒ Ut −→Ut , with, as ρ → 0, Ut −→At4 . (3.46)

Step (3) On the set n,t,ρ we have (3.41) and we study H (n, ρ, q), in the same way
as before, on the set n,t,ρ . We fix q and set S = Sq and an = [S/n ]. We then apply
(3.42) and also cin → c S− or cin → c S , according to whether an −2kn +1 ≤ i ≤ an +1
or an +2 ≤ i ≤ an +kn , to obtain vi n −v in → 0, uniformly in i between an −2kn +1
and an + 1, where


⎪ 0 if an − 2kn + 1 ≤ i ≤ an − 2kn + 2
⎪ (2kn −an +i−2)2 √n 

⎪ jk
⎨− j,k,l,m ∂ jk,lm g(c S− ) c S c S
2 lm
⎪ 8kn2
vi =
n
if an − 2kn + 3 ≤i ≤ an − kn + 1

⎪ 2 √ 
⎪ (a −i+2) kn −an +i+2 jk
j,k,l,m ∂ jk,lm g c S− + c S c S clm
2


n n


8kn2 kn S
if an − kn + 2 ≤ i ≤ an + 1.


We then deduce, by Riemann integration, that


 1
θ   2 jk
H (n, ρ, q) → − ∂ jk,lm g(c Sq − )+∂ 2jk,lm g(c Sq − +(1−w)c Sq ) w 2 c Sq clm
Sq dw,
8 0
j,k,l,m

ρ  Ntρ
which is θG (c Sq − , c Sq ), hence the first part of (3.46), with Ut = θ q=1 G
(c Sqρ − , c Sqρ ). The second part of (3.46) follows from the dominated convergence
theorem, and the proof of Theorem 2.5 is complete.

3.10 Proof of Theorem 2.6

The proof is once more somewhat similar to the proof of Sect. 3.8, although the way
cin by 
we replace  cin and further by αin + βin is different.
(A) Preliminaries. The jth summand in (2.15) involves several estimators  cin , span-
ning the time interval (( j − 3)kn n , ( j + 2)kn n ]. It is thus convenient to replace
the sets L(n, ρ), L(n, ρ, t) and L(n, ρ, t), for ρ, t > 0, by the following ones:
ρ ρ
L (n, ρ) = { j = 3, 4, . . . : N( j+2)kn n − N( j−3)kn n = 0}
L (n, ρ, t) = {3, . . . , [t/kn n ] − 3} ∩ L (n, ρ)
L (n, ρ, t) = {3, . . . , [t/kn n ] − 3} ∩ (N\L (n, ρ)).

n,ρ n,ρ
For any ρ ∈ (0, 1] we write V(F)nt = Vt + V t , where

v nj = F(
cn , δ n
c) 1{ δnj−1c ∨ δnj+1c ∨u n < δnjc }
n,ρ ( j−3)kn +1 jn n,ρ 
Vt = j∈L (n,ρ,t) v j , V t = j∈L (n,ρ,t) v nj .

We also set

δ nj
c =
c n n +1 − 
c(nj−2)kn +1 , δ nj β = β njkn +1 − β(nj−2)kn +1 ,
jk
2
w j = m=−3 
n c( j+m)kn +1 − 
n c( j+m)kn +1 , w jn = (1 + 
n c(nj−3)kn +1 ) p−1 (1 + δ nj
c )2 .

Equation (3.10) and the last part of (3.19) yield


 (2q−r ) +1−q
q ≥ 1 ⇒ E (w nj )q ) ≤ K q n , E((win )q ) ≤ K q . (3.47)

Observe that δ nj


c is analogous to γin , with a doubled time lag, so it satisfies a version
of (3.26) and, for q ≥ 2, we have

 n,ρ  q/4
q/2
n
i ∈ L (n, ρ) ⇒ E δ nj
c q
| F( j−2)kn +1 ) ≤ K q n φρ + n + .
ρq
(3.48)

(B) The processes V n,ρ . (2.16) yields

|v nj | ≤ K (1 + 
c(nj−3)kn +1 ) p−2 δ nj
c 2 1{ δ nj
c >u n } + K δ nj
c p.

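Thus a (tedious) computation shows that, with the notation

$$a^n_j = \big(1 + \|\hat c'^n_{(j-3)k_n+1}\|\big)^{p-2}\,\|\delta^n_j\hat c'\|^2\, 1_{\{\|\delta^n_j\hat c'\| > u_n/2\}}, \qquad a'^n_j = w'^n_j\, w^n_j + (w^n_j)^p + \frac{(w^n_j)^v}{u_n^v},$$

with $v > 0$ arbitrary, we have $|v^n_j| \le K\big(a^n_j + \|\delta^n_j\hat c'\|^p + a'^n_j\big)$ (with $K$ depending on $v$). Therefore we have $|V^{n,\rho}_t| \le K\big(B^{n,\rho}_t + C^{n,\rho}_t + D^n_t\big)$, where

$$B^{n,\rho}_t = \sum_{j \in L'(n,\rho,t)} a^n_j, \qquad C^{n,\rho}_t = \sum_{j \in L'(n,\rho,t)} \|\delta^n_j\hat c'\|^p, \qquad D^n_t = \sum_{j=3}^{[t/k_n\Delta_n]} a'^n_j.$$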

First, (3.47) and Hölder's inequality give us $E(a'^n_j) \le K_{q,v}\,\Delta_n^{l(q,v)}$ for any $q > 1$ and $v > 0$, where (recalling (2.7) and (2.14) for $\varpi$ and $\varpi'$) we have set $l(q,v) = \frac{1 - r\varpi}{q} - \big(p(1 - 2\varpi)\big) \vee \big(v(1 - 2\varpi + \varpi')\big)$. Upon choosing $v$ small enough and $q$ close enough to 1, and in view of (2.7), we see that $l(q,v) > \frac12$, thus implying

$$E(D^n_t) \to 0. \tag{3.49}$$
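The passage from the moment bound to (3.49) is a counting argument. A sketch, assuming (as the $\sqrt{\Delta_n}$ window of the title suggests, though it is not restated in this section) that $k_n\Delta_n$ is of order $\sqrt{\Delta_n}$, so that $D^n_t$ has at most $Kt/\sqrt{\Delta_n}$ nonnegative summands:

```latex
\[
E(D^n_t) \;\le\; \sum_{j=3}^{[t/k_n\Delta_n]} E(a'^n_j)
\;\le\; \frac{t}{k_n\Delta_n}\, K_{q,v}\, \Delta_n^{\,l(q,v)}
\;\le\; K\, t\, \Delta_n^{\,l(q,v) - 1/2} \;\longrightarrow\; 0,
\qquad \text{because } l(q,v) > \tfrac12 .
\]
```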

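Next, we deduce from (3.48) that

$$E\big(C^{n,\rho}_t\big) \le K\,E\Big(\sum_{j \in L'(n,\rho,t)} E\big(\|\delta^n_j\hat c'\|^p \mid \mathcal G^\rho\big)\Big) \le K\,t\,\Big(\phi_\rho + \Delta_n^{p/4 - 1/2} + \frac{\Delta_n^{p/2 - 1/2}}{\rho^p}\Big),$$

and thus, since $p \ge 3$,

$$\lim_{\rho \to 0}\, \limsup_{n \to \infty}\, E\big(|C^{n,\rho}_t|\big) = 0. \tag{3.50}$$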

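The analysis of $B^{n,\rho}_t$ is more complicated. We have $\delta^n_j\hat c' = z^n_j + z'^n_j$, where

$$z^n_j = \alpha^n_{jk_n+1} - \alpha^n_{(j-2)k_n+1}, \qquad z'^n_j = \frac{1}{k_n}\sum_{m=1}^{k_n}\big(c^n_{jk_n+m} - c^n_{(j-2)k_n+m}\big)$$

(recall (3.36) for $\alpha^n_i$), hence

$$a^n_j \le 4\big(1 + \|\hat c'^n_{(j-3)k_n+1}\|\big)^{p-2}\Big(\|z^n_j\|^2\, 1_{\{\|z^n_j\| > u_n/4\}} + \|z'^n_j\|^2\, 1_{\{\|z'^n_j\| > u_n/4\}}\Big).$$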
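The factor 4 and the thresholds $u_n/4$ in this bound come from an elementary case split. It is spelled out here for generic vectors $d = z + z'$ and a generic threshold $u > 0$; these symbols are introduced only for this remark.

```latex
% Case 1: \|z\| \ge \|z'\|. Then \|d\| \le \|z\| + \|z'\| \le 2\|z\|, so \|d\| > u/2 forces \|z\| > u/4, and
\[
\|d\|^2\, 1_{\{\|d\| > u/2\}} \;\le\; 4\,\|z\|^2\, 1_{\{\|z\| > u/4\}} .
\]
% Case 2: \|z'\| > \|z\|. The same argument with the roles of z and z' exchanged gives
\[
\|d\|^2\, 1_{\{\|d\| > u/2\}} \;\le\; 4\,\|z'\|^2\, 1_{\{\|z'\| > u/4\}} .
\]
% In both cases the left side is bounded by the sum of the two right sides:
\[
\|d\|^2\, 1_{\{\|d\| > u/2\}} \;\le\; 4\Big(\|z\|^2\, 1_{\{\|z\| > u/4\}} + \|z'\|^2\, 1_{\{\|z'\| > u/4\}}\Big).
\]
```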


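It easily follows that for all $A > 1$,

$$B^{n,\rho}_t \le 16\,B^{n,\rho,1}_t + 4A^{p-2}\,B^{n,\rho,2}_t + \frac{2^p}{A}\,B^{n,\rho,3}_t, \tag{3.51}$$

where

$$\begin{aligned}
B^{n,\rho,m}_t &= \sum_{j \in L'(n,\rho,t)} a(m)^n_j, \qquad a(1)^n_j = \big(1 + \|\hat c'^n_{(j-3)k_n+1}\|\big)^{p-2}\,\frac{\|z^n_j\|^3}{u_n},\\
a(2)^n_j &= \|z'^n_j\|^2\, 1_{\{\|z'^n_j\| > u_n/4\}}, \qquad a(3)^n_j = \|\hat c'^n_{(j-3)k_n+1}\|^{p-1}\,\|z'^n_j\|^2.
\end{aligned}$$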

On the one hand, (3.19) and Hölder's inequality yield $E\big(a(1)^n_j \mid \mathcal G^\rho\big) \le K\,\Delta_n^{3/4 - \varpi'}$ and, since $\varpi' < \frac14$, we deduce

$$E\big(B^{n,\rho,1}_t\big) \to 0. \tag{3.52}$$

On the other hand, observe that $z'^n_j = \mu(\rho)^n_j$, with the notation (3.27), as soon as $j \in L'(n,\rho)$, so Lemma 3.4 gives us

$$\lim_{\rho \to 0}\, \limsup_{n \to \infty}\, E\big(B^{n,\rho,2}_t\big) = 0. \tag{3.53}$$
ρ→0 n→∞

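Finally, (3.13) shows that $E\big(\|z'^n_j\|^q \mid \mathcal F^{n,\rho}_{(j-2)k_n+1}\big) \le K_{q,\rho}\sqrt{\Delta_n}$ for all $q \ge 2$ and $j \in L'(n,\rho)$, whereas $\hat c'^n_{(j-3)k_n+1}$ is $\mathcal F^n_{(j-2)k_n+1}$-measurable, so (3.13), (3.19) and successive conditioning yield $E\big(a(3)^n_j \mid \mathcal G^\rho\big) \le K_{q,\rho}\sqrt{\Delta_n}$. Then, again as for (3.52), one obtains

$$E\big(B^{n,\rho,3}_t\big) \le K_\rho\, t. \tag{3.54}$$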

At this stage, we gather (3.49)–(3.54) and obtain, by letting first $n \to \infty$, then $\rho \to 0$, then $A \to \infty$, that

$$\lim_{\rho \to 0}\, \limsup_{n \to \infty}\, E\big(|V^{n,\rho}_t|\big) = 0. \tag{3.55}$$

(C) The processes $\overline V^{n,\rho}$. With the previous notation $S^\rho_j$ and $N^\rho_t$, and on the set $\Omega_{n,\rho,t}$, we have

$$\overline V^{n,\rho}_t = \sum_{m=1}^{N^\rho_t}\ \sum_{j=-2}^{2} v^n_{[S^\rho_m/k_n\Delta_n] + j}. \tag{3.56}$$

This is a finite sum (bounded in $n$ for each $\omega$). Letting $S = S^\rho_m$ for $m$ and $\rho$ fixed and $w_n = \frac{S}{k_n\Delta_n} - \big[\frac{S}{k_n\Delta_n}\big]$, we know that for any given $j \in \mathbb Z$ the variables $\hat c^n_{([S/k_n\Delta_n]+j)k_n+1}$ converge in probability to $c_{S-}$ if $j < 0$ and to $c_S$ if $j > 0$, whereas for $j = 0$ we have $\hat c^n_{[S/k_n\Delta_n]k_n+1} - w_n c_{S-} - (1 - w_n)c_S \xrightarrow{P} 0$. This in turn implies

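$$\begin{aligned}
& j < 0 \ \text{or}\ j > 2 \;\Rightarrow\; \delta^n_{[S/k_n\Delta_n]+j}\hat c \xrightarrow{P} 0,\\
& \delta^n_{[S/k_n\Delta_n]}\hat c - (1 - w_n)\Delta c_S \xrightarrow{P} 0, \qquad \delta^n_{[S/k_n\Delta_n]+1}\hat c \xrightarrow{P} \Delta c_S, \qquad \delta^n_{[S/k_n\Delta_n]+2}\hat c - w_n\Delta c_S \xrightarrow{P} 0.
\end{aligned}$$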

By virtue of the definition of $v^n_j$, and since $u_n \to 0$, $w_n$ is almost surely in $(0,1)$, and $F$ is continuous with $F(x,0) = 0$, one readily deduces that

$$v^n_{[S/k_n\Delta_n]+j} \xrightarrow{P} \begin{cases} F(c_{S-}, \Delta c_S) & \text{if } j = 1,\\ 0 & \text{if } j \ne 1. \end{cases}$$
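The three limits around the jump time can be checked on an idealized, noise-free example. The sketch below replaces the estimators by exact block averages of a piecewise-constant path $c$ with a single jump at time $S$. This is an assumption made purely for illustration, and the function name `block_averages` is hypothetical. The doubled-lag differences at the blocks $[S/L]$, $[S/L]+1$ and $[S/L]+2$ (with $L$ the block length) then equal $(1-w)\Delta c_S$, $\Delta c_S$ and $w\,\Delta c_S$ exactly, where $w$ is the fractional part of $S/L$.

```python
def block_averages(c_before, c_after, jump_time, block_len, nblocks):
    """Exact averages over [j*L, (j+1)*L) of a path equal to c_before before
    jump_time and c_after from jump_time on (noise-free idealization)."""
    averages = []
    for j in range(nblocks):
        start = j * block_len
        # Length of the part of block j lying before the jump time.
        time_before = min(max(jump_time - start, 0.0), block_len)
        time_after = block_len - time_before
        averages.append((time_before * c_before + time_after * c_after) / block_len)
    return averages

c_before, c_after = 1.0, 3.0
jump_time, block_len, nblocks = 5.25, 1.0, 10

averages = block_averages(c_before, c_after, jump_time, block_len, nblocks)
J = int(jump_time // block_len)          # index of the block containing the jump
w = jump_time / block_len - J            # fraction of that block before the jump
jump = c_after - c_before

# Differences with a doubled lag, as in the definition of delta_j.
deltas = {j: averages[j] - averages[j - 2] for j in range(2, nblocks)}
```

Since $w \in (0,1)$, the middle difference `deltas[J + 1]` strictly dominates its two neighbours, which is why only the summand with $j = 1$ survives the local-maximum indicator.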

Coming back to (3.56), we deduce that

$$\overline V^{n,\rho}_t \xrightarrow{P} \overline V^\rho_t := \sum_{m=1}^{N^\rho_t} F\big(c_{S^\rho_m-},\, \Delta c_{S^\rho_m}\big). \tag{3.57}$$

In view of (2.16), an application of the dominated convergence theorem gives $\overline V^\rho_t \to V(F)_t$ as $\rho \to 0$. Then (2.17) follows from $V(F)^n_t = V^{n,\rho}_t + \overline V^{n,\rho}_t$ and (3.55) and (3.57). The proof of Theorem 2.6 is complete.

Acknowledgments We are grateful to the referee for his/her very careful reading of the paper.

References

1. Alvarez, A., Panloup, P., Pontier, M., Savy, N.: Estimation of the instantaneous volatility. Stat.
Inference Stoch. Process. 15, 27–59 (2010)
2. Clément, E., Delattre, S., Gloter, A.: An infinite dimensional convolution theorem with applica-
tions to the efficient estimation of the integrated volatility. Stoch. Process. Appl. 123, 2500–2521
(2013)
3. Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes, 2nd edn. Springer, Berlin
(2003)
4. Jacod, J., Protter, P.: Discretization of Processes. Springer, Berlin (2012)
5. Jacod, J., Rosenbaum, M.: Quarticity and other functionals of volatility: efficient estimation.
Ann. Stat. 41, 1462–1484 (2013)
6. Vetter, M.: Limit theorems for bipower variation of semimartingales. Stoch. Process. Appl. 120,
22–38 (2010)
7. Dette, H., Podolskij, M., Vetter, M.: Estimation of integrated volatility in continuous-time finan-
cial models with applications to goodness-of-fit testing. Scand. J. Stat. 33(2), 259–278 (2006)
