
Stochastic Calculus

Lecture notes based on the book by J.-F. Le Gall

Russell Lyons
September 20, 2024

Contents

Preface

The First Day

1 Gaussian Variables and Gaussian Processes
  1.1 Gaussian Random Variables
  1.2 Gaussian Vectors
  1.3 Gaussian Processes and Gaussian Spaces
  1.4 Gaussian White Noise

2 Brownian Motion
  2.1 Pre-Brownian Motion
  2.2 The Continuity of Sample Paths
  2.3 Properties of Brownian Sample Paths
  2.4 The Strong Markov Property of Brownian Motion
  Appendix: The Cameron–Martin Theorem

3 Filtrations and Martingales
  3.1 Filtrations and Processes
  3.2 Stopping Times and Associated 𝜎-Fields
  3.3 Continuous-Time Martingales and Supermartingales
  3.4 Optional Stopping Theorems

4 Continuous Semimartingales
  4.1 Finite-Variation Processes
    4.1.1 Functions with Finite Variation
    4.1.2 Finite-Variation Processes
  4.2 Continuous Local Martingales
  4.3 The Quadratic Variation of a Continuous Local Martingale
  4.4 The Bracket of Two Continuous Local Martingales
  4.5 Continuous Semimartingales

5 Stochastic Integration
  5.1 The Construction of Stochastic Integrals
    5.1.1 Stochastic Integrals for Martingales Bounded in 𝐿²
    5.1.2 Stochastic Integrals for Local Martingales
    5.1.3 Stochastic Integrals for Semimartingales
    5.1.4 Convergence of Stochastic Integrals
  5.2 Itô’s Formula
  5.3 A Few Consequences of Itô’s Formula
    5.3.1 Lévy’s Characterization of Brownian Motion
    5.3.2 Continuous Martingales as Time-Changed Brownian Motions
    5.3.3 The Burkholder–Davis–Gundy Inequalities
  Appendix: The Cameron–Martin and Girsanov Theorems

6 General Theory of Markov Processes
  6.1 General Definitions and the Problem of Existence
  6.2 Feller Semigroups
  6.3 The Regularity of Sample Paths
  6.4 The Strong Markov Property
  Appendix: Locally Compact Polish Spaces are 𝜎-compact

8 Stochastic Differential Equations
  8.1 Motivation and General Definitions
  8.2 The Lipschitz Case
  8.3 Solutions of Stochastic Differential Equations as Markov Processes

7 Brownian Motion and Partial Differential Equations
  7.1 Brownian Motion and the Heat Equation
  7.2 Brownian Motion and Harmonic Functions
  7.3 Harmonic Functions in a Ball and the Poisson Kernel
  7.4 Transience and Recurrence of Brownian Motion
  7.5 Planar Brownian Motion and Holomorphic Functions
  7.6 Asymptotic Laws of Planar Brownian Motion
  Appendix: The Poisson Kernel is Harmonic
  Appendix: Convergence of Harmonic Functions to Boundary Values

Preface

I gave these lectures at Indiana University during the academic year 2017–18. Initially, one
of the students, ChunHsien Lu, typed the notes during class. Later, another student who was not
in the course, Zhifeng Wei, used my handwritten notes to correct and complete the typed notes. I
am very grateful to both of them for all their work. Zhifeng deserves special thanks for figuring
out how to add reasons beautifully to displayed equations, as well as for being attentive in general
to all my typesetting requests. I then did some further editing and added some illustrations and a
bit more material. I would be grateful to learn of any errors or improvements; please email me at
[email protected].
The course was based on the book, Brownian Motion, Martingales, and Stochastic Calculus, by
Jean-François Le Gall. The same theorem and exercise numbers are used here, although I have not
reproduced the exercises. I also added a large number of exercises, especially in order to have some
that were useful for learning new concepts and definitions. I assigned homework once per week,
and have included the dates those assignments were due in order that others may gauge the pace. A
few new problems were added after the course ended; these do not have due dates. Likewise, the last
homework exercises have no due dates because they were given at the end of the term. I spent a
substantial amount of time in class going over solutions to the homework, but no
solutions are presented here. I am grateful to Jean-François for his advice on teaching this course.
This turned out to be one of my most enjoyable teaching experiences ever. I had never taught this
material before, and always promptly forgot it whenever I had learned some of it in the past. This
time, however, teaching it and working hard on the exercises led to actually learning it.
Other differences from Le Gall’s book arise from using somewhat different proofs and
sometimes giving more general results. A couple of proofs are substantially different. In addition, I
covered Chapter 8 on SDEs before Chapter 7 on PDEs. I did not have time to cover Chapter 9 on
local times, nor Sections 5.4–5.6. I later made up for this in part by including appendices on the
Cameron–Martin theorem and Girsanov’s theorem. A couple of appendices provide material I gave
to the students from other sources. Occasionally I refer to Le Gall’s book for details not given in
lecture.
The format of the typed notes tries to reproduce the format of my handwritten notes and most
of what went on the board.

The First Day

We begin with some


Motivation (A special case of Itô’s formula). If (𝐵_𝑡)_{𝑡⩾0} is a standard Brownian motion and
𝑓 ∈ 𝐶²(R), then

    d𝑓(𝐵_𝑡) = 𝑓′(𝐵_𝑡) d𝐵_𝑡 + ½ 𝑓″(𝐵_𝑡) d𝑡.

This is like ordinary calculus, but there is a second term on the right-hand side: |d𝐵_𝑡| ≈ √(d𝑡), so
(d𝐵_𝑡)² ≈ d𝑡. This shows partly why 𝐿²(P) is key.
SDEs (semester 2) are defined via stochastic integration (semester 1). Other relations to PDEs
and harmonic functions are in semester 2, including conformal invariance of complex Brownian
motion.
We will start with preparatory material: Gaussian processes, construction of Brownian motion
and its basic properties, and a quick review of discrete-time martingales. Then we will study new
material on continuous-time martingales and continuous semimartingales.
Before that, recall that a class U of random variables on (Ω, ℱ, P) is uniformly integrable if

    lim_{𝑡→∞} sup_{𝑋∈U} E[ |𝑋| 1_{[|𝑋|>𝑡]} ] = 0.

This holds if (and, it turns out, only if) sup_{𝑋∈U} E[𝜑(|𝑋|)] < ∞ for some function 𝜑 : [0, ∞) →
[0, ∞) with lim_{𝑡→∞} 𝜑(𝑡)/𝑡 = ∞. If 𝑋_𝑛 and 𝑋 are integrable and 𝑋_𝑛 → 𝑋 in probability, then
the following are equivalent:
1. {𝑋_𝑛} is uniformly integrable;
2. E[ |𝑋 − 𝑋_𝑛| ] → 0;
3. E[ |𝑋_𝑛| ] → E[ |𝑋| ].
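A small numerical illustration of how condition 3 fails without uniform integrability (a sketch with made-up variables, assuming numpy is available; the family 𝑋_𝑛 = 𝑛·1_{[𝑈<1/𝑛]} is a standard example, not one from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# X_n = n on an event of probability 1/n, else 0. Then X_n -> 0 in
# probability, but E|X_n| = 1 for every n, so condition 3 fails --
# and indeed {X_n} is not uniformly integrable.
def tail_expectation(n, t, samples=10**6):
    """Monte Carlo estimate of E[|X_n| 1_{|X_n| > t}]."""
    u = rng.random(samples)
    x = np.where(u < 1.0 / n, float(n), 0.0)
    return float(np.mean(np.abs(x) * (np.abs(x) > t)))

# For any fixed cutoff t, the tail expectation stays near 1 as n grows:
# the mass escapes to infinity instead of being truncated away.
est = tail_expectation(1000, 10)
```

Here the supremum over 𝑛 of the tail expectation does not vanish as 𝑡 → ∞, which is exactly the failure of the defining condition.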

Chapter 1

Gaussian Variables and Gaussian Processes

1.1. Gaussian Random Variables


The standard Gaussian (or normal) density is

    𝑝_𝑋 : 𝑥 ↦→ (1/√(2π)) exp{−𝑥²/2}   (𝑥 ∈ R).

The complex Laplace transform of such a random variable, 𝑋, is

    𝑧 ↦→ E[e^{𝑧𝑋}] = ∫_{−∞}^{∞} e^{𝑧𝑥} 𝑝_𝑋(𝑥) d𝑥 = e^{𝑧²/2}   (𝑧 ∈ C).

One sees this by first calculating the integral for 𝑧 ∈ R and then using analytic continuation (see
page 2 of Le Gall’s book). In particular, the characteristic function (Fourier transform) is

    𝜉 ↦→ E[e^{i𝜉𝑋}] = e^{−𝜉²/2}   (𝜉 ∈ R).

Recall that the Fourier transform determines the law of 𝑋 uniquely. By expanding in a Taylor
series, one gets the moments of 𝑋, such as E[𝑋] = 0 and E[𝑋²] = 1.
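A quick Monte Carlo sanity check of these moments and of the characteristic function (an illustrative sketch assuming numpy; the sample size and the value of 𝜉 are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(10**6)  # samples of a standard Gaussian X

# Moments read off from the Taylor expansion of e^{z^2/2}:
# E[X] = 0, E[X^2] = 1, E[X^4] = 3.
m1, m2, m4 = x.mean(), (x**2).mean(), (x**4).mean()

# Characteristic function: E[e^{i xi X}] should match e^{-xi^2/2}.
xi = 1.3
phi_mc = np.mean(np.exp(1j * xi * x))
phi_exact = np.exp(-xi**2 / 2)
```

The empirical averages agree with the exact values up to the usual 𝑛^{−1/2} Monte Carlo error.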

We say 𝑌 ∼ 𝒩(𝑚, 𝜎²) for 𝑚 ∈ R and 𝜎 > 0 if (𝑌 − 𝑚)/𝜎 is standard normal. This is
equivalent to:

    𝑌 has density 𝑦 ↦→ (1/(𝜎√(2π))) exp{−(𝑦 − 𝑚)²/(2𝜎²)}

and to

    𝑌 has Fourier transform 𝜉 ↦→ e^{i𝑚𝜉 − 𝜎²𝜉²/2}.

Note that then E[𝑌] = 𝑚 and Var(𝑌) = 𝜎². If 𝑌 = 𝑚 a.s., we will also say 𝑌 ∼ 𝒩(𝑚, 0).
Using the Fourier transform, one sees that a sum of two independent normal random variables
is also normal.
One proves properties of stochastic processes with a continuous parameter by taking limits in
various senses from finite or countable subsets of parameters. This is how we will use the following:

Proposition 1.1. Suppose 𝑋_𝑛 ∼ 𝒩(𝑚_𝑛, 𝜎_𝑛²) converges to 𝑋 in 𝐿² (i.e., E[|𝑋_𝑛 − 𝑋|²] → 0 as
𝑛 → ∞). Then
(i) 𝑋 ∼ 𝒩(𝑚, 𝜎²) with 𝑚 ≔ lim 𝑚_𝑛 and 𝜎 ≔ lim 𝜎_𝑛;
(ii) 𝑋_𝑛 → 𝑋 in 𝐿^𝑝 for every 𝑝 ∈ (0, ∞).

Proof. (i) That lim 𝑚_𝑛 = E[𝑋] and lim 𝜎_𝑛² = Var(𝑋) does not use that (𝑋_𝑛)_{𝑛⩾1} are Gaussian. The
fact that 𝑋 is Gaussian then follows from using the Fourier transform.
(ii) Because 𝑋_𝑛 has the same distribution as 𝜎_𝑛 𝑁 + 𝑚_𝑛 with 𝑁 ∼ 𝒩(0, 1), we see that

    ∀𝑞 > 0   sup_𝑛 E[ |𝑋_𝑛|^𝑞 ] < ∞,

whence

    sup_𝑛 E[ |𝑋_𝑛 − 𝑋|^𝑞 ] < ∞.

(Recall that ‖·‖_𝑞 satisfies the triangle inequality for 𝑞 ⩾ 1 and ‖·‖_𝑞^𝑞 does for 𝑞 < 1.) Given
𝑝 ∈ (0, ∞), we get that (|𝑋_𝑛 − 𝑋|^𝑝)_{𝑛⩾1} is bounded in 𝐿² (use 𝑞 ≔ 2𝑝) and tends to 0 in
probability because 𝑋_𝑛 → 𝑋 in probability, whence it is uniformly integrable. Therefore,
E[ |𝑋_𝑛 − 𝑋|^𝑝 ] → 0. ∎

1.2. Gaussian Vectors

Let 𝐸 be a Euclidean space, i.e., a finite-dimensional inner-product space. Let 𝑋 be an 𝐸-valued
random variable with E[‖𝑋‖²] < ∞. We claim that there exist some 𝑚_𝑋 ∈ 𝐸 and a non-negative
quadratic form 𝑞_𝑋 on 𝐸 such that

    ∀𝑢 ∈ 𝐸   E[⟨𝑢, 𝑋⟩] = ⟨𝑢, 𝑚_𝑋⟩ and Var(⟨𝑢, 𝑋⟩) = 𝑞_𝑋(𝑢).

We will then write E[𝑋] ≔ 𝑚_𝑋. To see our claim, take an orthonormal basis (𝑒₁, …, 𝑒_𝑑) of 𝐸,
write 𝑋 = Σ_𝑗 𝑋_𝑗 𝑒_𝑗, and define

    𝑚_𝑋 ≔ Σ_𝑗 E[𝑋_𝑗] 𝑒_𝑗,
    𝑞_𝑋(𝑢) ≔ Σ_{𝑗,𝑘} 𝑢_𝑗 𝑢_𝑘 Cov(𝑋_𝑗, 𝑋_𝑘) = Var( Σ_𝑗 𝑢_𝑗 𝑋_𝑗 ) ⩾ 0.

Calculation shows this works.

We also write 𝛾_𝑋 : 𝐸 → 𝐸 for the symmetric linear mapping such that

    ∀𝑢 ∈ 𝐸   𝑞_𝑋(𝑢) = ⟨𝑢, 𝛾_𝑋(𝑢)⟩;

its matrix is (Cov(𝑋_𝑗, 𝑋_𝑘))_{𝑗,𝑘⩽𝑑}. The eigenvalues of 𝛾_𝑋 are non-negative.

We call 𝑋 Gaussian if ⟨𝑢, 𝑋⟩ is Gaussian for every 𝑢 ∈ 𝐸; we also call the components of 𝑋 jointly
Gaussian.

Example. If 𝑋₁, 𝑋₂, …, 𝑋_𝑑 are independent Gaussian, then Σ_𝑗 𝑋_𝑗 𝑒_𝑗 is a Gaussian vector.

If 𝑋 is Gaussian, then ⟨𝑢, 𝑋⟩ ∼ 𝒩(⟨𝑢, 𝑚_𝑋⟩, 𝑞_𝑋(𝑢)), so

    E[e^{i⟨𝑢,𝑋⟩}] = e^{i⟨𝑢,𝑚_𝑋⟩ − 𝑞_𝑋(𝑢)/2}.   (1.1)

We write 𝑋 ∼ 𝒩(𝑚_𝑋, 𝑞_𝑋).

Proposition 1.2. If 𝑋 is Gaussian, (𝑒₁, …, 𝑒_𝑑) is an orthonormal basis of 𝐸, and 𝑋 = Σ_𝑗 𝑋_𝑗 𝑒_𝑗,
then (Cov(𝑋_𝑗, 𝑋_𝑘))_{𝑗,𝑘⩽𝑑} is diagonal if and only if 𝑋₁, 𝑋₂, …, 𝑋_𝑑 are (mutually) independent.

Proof. ⇐: Independence implies pairwise independence. Thus, Cov(𝑋_𝑗, 𝑋_𝑘) = 0 for distinct 𝑗 and
𝑘.
⇒: Conversely, when the covariance matrix is diagonal, the right-hand side of Eq. (1.1) factors
as a product over 𝑗, and independence follows. ∎
In particular, for jointly Gaussian random variables, pairwise independence implies mutual
independence.
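Joint Gaussianity is essential here. A classical counterexample, checked numerically below (a sketch assuming numpy; the construction 𝑌 = 𝑠𝑋 with a random sign 𝑠 is the textbook one, not taken from these notes):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10**6

# X standard normal, Y = s*X with an independent random sign s. Each of X, Y
# is Gaussian and Cov(X, Y) = 0, but (X, Y) is NOT jointly Gaussian, and the
# pair is plainly dependent since |Y| = |X|.
x = rng.standard_normal(n)
s = rng.choice([-1.0, 1.0], size=n)
y = s * x

cov_xy = float(np.mean(x * y))  # near 0: uncorrelated
# A witness of dependence: Cov(|X|, |Y|) = E[X^2] - (E|X|)^2 = 1 - 2/pi > 0.
dep = float(np.mean(np.abs(x) * np.abs(y))
            - np.mean(np.abs(x)) * np.mean(np.abs(y)))
```

So zero covariance gives independence only under the joint Gaussianity hypothesis of Proposition 1.2.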
For simplicity, we now consider centered Gaussian vectors, i.e., ones with mean 0. We will
not use the following:
Theorem 1.3. (i) If 𝛾 is a positive semi-definite linear map on 𝐸, then there exists a Gaussian
vector 𝑋 on 𝐸 such that 𝛾_𝑋 = 𝛾.
(ii) Let 𝑋 ∼ 𝒩(0, 𝛾_𝑋). Let (𝜀₁, …, 𝜀_𝑑) be an orthonormal basis of eigenvectors of 𝛾_𝑋 with
eigenvalues 𝜆₁ ⩾ ⋯ ⩾ 𝜆_𝑟 > 0 = 𝜆_{𝑟+1} = ⋯ = 𝜆_𝑑. Then there exist independent 𝑌_𝑗 ∼ 𝒩(0, 𝜆_𝑗)
such that

    𝑋 = Σ_{𝑗=1}^{𝑟} 𝑌_𝑗 𝜀_𝑗.

The support of the law 𝑃_𝑋 of 𝑋 is the linear span of {𝜀₁, …, 𝜀_𝑟}. Also, 𝑃_𝑋 is absolutely
continuous with respect to Lebesgue measure if and only if 𝑟 = 𝑑, in which case the density of
𝑋 is

    𝑝_𝑋 : 𝑥 ↦→ (1/((2π)^{𝑑/2} √(det 𝛾_𝑋))) e^{−⟨𝑥, 𝛾_𝑋^{−1}(𝑥)⟩/2}.
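Part (ii) is also a sampling recipe: diagonalize 𝛾 and feed independent 𝒩(0, 𝜆_𝑗) coordinates through the eigenvectors. A numerical sketch (assuming numpy; the matrix A below is made up to produce a rank-deficient 𝛾, so 𝑟 < 𝑑 here):

```python
import numpy as np

rng = np.random.default_rng(7)

# A positive semi-definite covariance gamma of rank 2 in R^3.
A = np.array([[1.0, 0.5], [0.0, 1.0], [1.0, -1.0]])
gamma = A @ A.T                    # gamma = A A^T is PSD by construction

# Eigendecomposition gamma = sum_j lambda_j eps_j eps_j^T (Theorem 1.3(ii)).
lam, eps = np.linalg.eigh(gamma)   # columns of eps: orthonormal eigenvectors
lam = np.clip(lam, 0.0, None)      # guard tiny negative round-off

# X = sum_j Y_j eps_j with independent Y_j ~ N(0, lambda_j).
n = 200_000
Y = rng.standard_normal((n, 3)) * np.sqrt(lam)
X = Y @ eps.T

emp_cov = X.T @ X / n              # empirical covariance, approximates gamma
err = float(np.max(np.abs(emp_cov - gamma)))
```

Since one eigenvalue vanishes, the samples lie in a two-dimensional subspace, matching the statement about the support of 𝑃_𝑋.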

1.3. Gaussian Processes and Gaussian Spaces


We will often omit the word “centered”.
Another way to say that (𝑋₁, 𝑋₂, …, 𝑋_𝑑) ∈ R^𝑑 is a Gaussian vector is to say that the linear
span of {𝑋₁, 𝑋₂, …, 𝑋_𝑑} in 𝐿²(Ω, P) consists only of Gaussian random variables.
Definition 1.4. A (centered) Gaussian space is a closed linear subspace of 𝐿 2 (Ω, P) that contains
only centered Gaussian variables.
Definition 1.5. Let 𝑇 be a set and (𝐸, ℰ) be a measurable space. A stochastic process (or random
process) indexed by T with values in 𝐸 is a collection (𝑋𝑡 )𝑡∈𝑇 of 𝐸-valued random variables. If
(𝐸, ℰ) is not specified, then we assume that 𝐸 = R and ℰ = ℬ(R) is its Borel 𝜎-field. Usually,
𝑇 = R+ B [0, ∞).

Definition 1.6. A stochastic process (𝑋_𝑡)_{𝑡∈𝑇} ∈ R^𝑇 is a Gaussian process if for every finite subset 𝑇′
of 𝑇, (𝑋_𝑡)_{𝑡∈𝑇′} ∈ R^{𝑇′} is a Gaussian vector.

By Proposition 1.1, we get


Proposition 1.7. If (𝑋_𝑡)_{𝑡∈𝑇} is a Gaussian process, then the closed linear span of (𝑋_𝑡)_{𝑡∈𝑇} in 𝐿²(Ω, P)
is a Gaussian space, called the Gaussian space generated by (𝑋_𝑡)_{𝑡∈𝑇}. ∎

Exercise (due 8/31). Exercise 1.15 (4 parts).


The undergrad notion that jointly normal, centered random variables (𝑋, 𝑌 ) are independent
if and only if they are orthogonal in 𝐿 2 (i.e., E(𝑋𝑌 ) = 0), which we proved and extended in
Proposition 1.2, has the following further extension:
Theorem 1.9. Let 𝐻 be a centered Gaussian space and K be a collection of linear subspaces of 𝐻.
Then the subspaces of K are (pairwise) orthogonal (⊥) in 𝐿² if and only if the 𝜎-fields 𝜎(𝐾) (𝐾 ∈ K)
are independent (⫫).

Proof. Independence implies orthogonality trivially.

For the converse, it suffices to show that if 𝐾₁, 𝐾₂, …, 𝐾_𝑝 ∈ K are distinct, then 𝜎(𝐾₁), 𝜎(𝐾₂),
…, 𝜎(𝐾_𝑝) are independent, because this is the definition of independence for infinitely many
𝜎-fields. In turn, this follows if we show that (𝜉₁¹, 𝜉₂¹, …, 𝜉_{𝑛₁}¹), …, (𝜉₁^𝑝, 𝜉₂^𝑝, …, 𝜉_{𝑛_𝑝}^𝑝) are
independent for 𝜉_𝑖^𝑗 ∈ 𝐾_𝑗. (This is a standard fact and follows from Dynkin’s 𝜋-𝜆 theorem, which is
called in the book “the monotone class lemma”; see Appendix 1 for that and this application. Halmos’
monotone class lemma is given on page 89 of the book.) Now let (𝜂₁^𝑗, 𝜂₂^𝑗, …, 𝜂_{𝑚_𝑗}^𝑗) be an
orthonormal basis of the span of (𝜉₁^𝑗, 𝜉₂^𝑗, …, 𝜉_{𝑛_𝑗}^𝑗). Orthogonality gives that the vector

    (𝜂₁¹, 𝜂₂¹, …, 𝜂_{𝑚₁}¹, 𝜂₁², 𝜂₂², …, 𝜂_{𝑚₂}², …, 𝜂₁^𝑝, 𝜂₂^𝑝, …, 𝜂_{𝑚_𝑝}^𝑝)

has covariance matrix the identity. This is a Gaussian vector since its components are in 𝐻.
Proposition 1.2 then yields that all 𝜂_𝑖^𝑗 are independent, whence

    (𝜂₁¹, 𝜂₂¹, …, 𝜂_{𝑚₁}¹), …, (𝜂₁^𝑝, 𝜂₂^𝑝, …, 𝜂_{𝑚_𝑝}^𝑝)

are independent. This gives the result. ∎

If 𝑋 : (Ω, ℱ, P) → (𝐸, ℰ) and 𝒢 is a sub-𝜎-field of ℱ, then a regular conditional distribution
for 𝑋 given 𝒢 is a function 𝜇 : Ω × ℰ → [0, 1] such that
(1) for every 𝜔 ∈ Ω, 𝜇(𝜔, ·) is a probability measure,
and
(2) for every 𝐴 ∈ ℰ, 𝜇(·, 𝐴) is a version of P[𝑋 ∈ 𝐴 | 𝒢].
This exists if (𝐸, ℰ) is a standard Borel space (Borel isomorphic to a Borel subset of R), such
as a Borel subset of a Polish space (complete, separable, metrizable space); see Durrett’s book,
Probability: Theory and Examples.

Corollary 1.10. Let 𝐻 be a (centered) Gaussian space and 𝐾 be a closed linear subspace of 𝐻.
Let 𝑝_𝐾 : 𝐻 → 𝐾 be the orthogonal projection. If 𝑋₁, 𝑋₂, …, 𝑋_𝑑 ∈ 𝐻, then the 𝜎(𝐾)-conditional
distribution of (𝑋₁, 𝑋₂, …, 𝑋_𝑑) is

    𝒩( (𝑝_𝐾(𝑋_𝑖))_{𝑖=1}^{𝑑}, 𝑞_{(𝑝_𝐾^⊥(𝑋_𝑖))_{𝑖=1}^{𝑑}} ).

Proof. We have 𝑋_𝑖 = 𝑝_𝐾^⊥(𝑋_𝑖) + 𝑝_𝐾(𝑋_𝑖) for 1 ⩽ 𝑖 ⩽ 𝑑, where the first summand is ⫫ 𝜎(𝐾) and
the second is 𝜎(𝐾)-measurable. ∎

See the book for more details when 𝑑 = 1. Note that here E[𝑋 | 𝜎(𝐾)] = 𝑝_𝐾(𝑋), whereas in
general (outside the context of Gaussian random variables), it is 𝑝_{𝐿²(Ω,𝜎(𝐾),P)}(𝑋).
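A two-dimensional sanity check of the corollary (a sketch assuming numpy; the pair (𝑍, 𝑋) and the coefficients 0.8, 0.6 are made up, with 𝐾 taken to be the span of 𝑍):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6

# Z ~ N(0,1) spans K, and X = 0.8 Z + 0.6 W with W ~ N(0,1) independent,
# so p_K(X) = 0.8 Z and p_K^perp(X) = 0.6 W.
z = rng.standard_normal(n)
w = rng.standard_normal(n)
x = 0.8 * z + 0.6 * w

# Corollary 1.10 predicts: conditionally on sigma(Z), X ~ N(0.8 Z, 0.36).
resid = x - 0.8 * z
cond_var = float(resid.var())           # should be near 0.6^2 = 0.36
orthogonal = float(np.mean(resid * z))  # should be near 0: resid is ⊥ Z
```

The residual is uncorrelated with 𝑍, hence (being jointly Gaussian with it) independent of 𝜎(𝑍) by Theorem 1.9, which is exactly the mechanism of the proof above.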
Exercise (due 9/7). Exercise 1.17.

1.4. Gaussian White Noise


White noise is an engineering term that refers to a signal with constant Fourier transform. In
the case of a stationary stochastic process, we look at the spectral measure (page 11 in the book),
whose Fourier transform is the covariance function; it should be 𝑐 · 𝛿0 . That is, the process should
have no correlations; in the Gaussian case, this is equivalent to independence. This makes most
sense if the index set is Z. But we are interested in R. However, the index set for us will not be R,
but ℬ(R). Motivations include increments of Brownian motion and the Poisson process in R or R2
. . . . Thus, each Borel set 𝐴 ∈ ℬ(R) gives a Gaussian random variable, 𝐺 ( 𝐴). If 𝐴1 ∩ 𝐴2 = ∅,
then we want 𝐺 ( 𝐴1 ) ⫫ 𝐺 ( 𝐴2 ).
Definition 1.12. Let (𝐸, ℰ) be a measurable space and 𝜇 be a measure on (𝐸, ℰ). A Gaussian
white noise with intensity 𝜇 is an isometry 𝐺 from 𝐿²(𝐸, ℰ, 𝜇) into a (centered) Gaussian space.
Thus, for 𝑓, 𝑔 ∈ 𝐿²(𝐸), we have

    Cov(𝐺(𝑓), 𝐺(𝑔)) = ⟨𝑓, 𝑔⟩_{𝐿²},
    Var(𝐺(𝑓)) = ‖𝑓‖²_{𝐿²}.

If 𝐴 ∈ ℰ with 𝜇(𝐴) < ∞, we set 𝐺(𝐴) ≔ 𝐺(1_𝐴) ∼ 𝒩(0, 𝜇(𝐴)). If 𝐴₁, 𝐴₂, …, 𝐴_𝑛 ∈ ℰ with
𝜇(𝐴_𝑗) < ∞, then (𝐺(𝐴₁), 𝐺(𝐴₂), …, 𝐺(𝐴_𝑛)) is a Gaussian vector with covariance

    Cov(𝐺(𝐴_𝑖), 𝐺(𝐴_𝑗)) = 𝜇(𝐴_𝑖 ∩ 𝐴_𝑗).

In particular, if 𝐴₁, 𝐴₂, …, 𝐴_𝑛 are disjoint, then the covariance matrix is diagonal, so by Proposi-
tion 1.2, the variables 𝐺(𝐴₁), 𝐺(𝐴₂), …, 𝐺(𝐴_𝑛) are independent.
If 𝐴 ∈ ℰ with 𝜇(𝐴) < ∞ is partitioned into 𝐴₁, 𝐴₂, … ∈ ℰ, then 1_𝐴 = Σ_𝑗 1_{𝐴_𝑗} in 𝐿², so by
isometry,

    𝐺(𝐴) = Σ_𝑗 𝐺(𝐴_𝑗)   in 𝐿².

Kolmogorov’s theorem shows that we also have almost sure convergence. However, in general, it is
not possible to make 𝐴 ↦→ 𝐺(𝐴) a signed measure almost surely, even when (𝐸, ℰ) = (R, ℬ(R)),
as Corollary 2.17 will show.

Proposition 1.13. Let (𝐸, ℰ) be a measurable space and 𝜇 be a measure on (𝐸, ℰ). There exists a
probability space (Ω, ℱ, P) and a Gaussian white noise on 𝐿²(Ω, ℱ, P) with intensity 𝜇.

Proof. Let (𝑓_𝑖)_{𝑖∈𝐼} be an orthonormal basis for 𝐿²(𝐸, ℰ, 𝜇). Choose a probability space on which
there exist i.i.d. random variables 𝑋_𝑖 ∼ 𝒩(0, 1) (𝑖 ∈ 𝐼). Define 𝐺 : 𝐿²(𝜇) → 𝐿²(P) by 𝐺(𝑓_𝑖) ≔ 𝑋_𝑖.
That is, for 𝑓 ∈ 𝐿²(𝜇), we define

    𝐺(𝑓) ≔ Σ_{𝑖∈𝐼} ⟨𝑓, 𝑓_𝑖⟩ 𝑋_𝑖.

The fact that 𝐺 is an isometry uses only that the variables (𝑋_𝑖)_{𝑖∈𝐼} are orthonormal. The fact that 𝐺
takes values in a Gaussian space uses that the (𝑋_𝑖)_{𝑖∈𝐼} are standard normal and Proposition 1.1(i). ∎
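A discretized version of this construction can be checked numerically (a sketch assuming numpy; the grid size, the sets 𝐴₁, 𝐴₂, and the cell-mass construction are illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(5)

# Discretize [0,1] into m cells and attach independent N(0, 1/m) masses;
# a finite-dimensional stand-in for G restricted to step functions.
m, n = 200, 20_000
dG = rng.standard_normal((n, m)) * np.sqrt(1.0 / m)  # n independent copies

def G(mask):
    """G(1_A) for A given as a 0/1 mask over the m cells."""
    return dG @ mask

grid = (np.arange(m) + 0.5) / m
A1 = (grid < 0.6).astype(float)   # A1 = [0, 0.6)
A2 = (grid >= 0.3).astype(float)  # A2 = [0.3, 1]

cov = float(np.mean(G(A1) * G(A2)))  # should be near Leb(A1 ∩ A2) = 0.3
var1 = float(np.var(G(A1)))          # should be near Leb(A1) = 0.6
```

The empirical covariance matches 𝜇(𝐴₁ ∩ 𝐴₂), the defining property of a white noise with Lebesgue intensity.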
Exercise. Let (𝐸, ℰ) be a measurable space and 𝜇 be a measure on (𝐸, ℰ). Let 𝑓1 , 𝑓2 ∈ 𝐿 2 (𝜇). Let
𝐺 be a Gaussian white noise on 𝐿 2 (Ω, ℱ, P) with intensity 𝜇. Calculate the joint distribution of
𝐺 ( 𝑓1 ) and 𝐺 ( 𝑓2 ) and the conditional distribution of 𝐺 ( 𝑓2 ) given 𝐺 ( 𝑓1 ).
Exercise (due 9/7). Exercise 1.18.
Proposition 1.14. Let 𝐺 be a Gaussian white noise on (𝐸, ℰ) with intensity 𝜇 and let 𝐴 ∈ ℰ have
𝜇(𝐴) < ∞. If for each 𝑛 ∈ N, 𝐴 is partitioned as 𝐴 = ⋃_{𝑗=1}^{𝑘_𝑛} 𝐴_𝑗^𝑛 with

    lim_{𝑛→∞} max_{1⩽𝑗⩽𝑘_𝑛} 𝜇(𝐴_𝑗^𝑛) = 0,

then

    lim_{𝑛→∞} Σ_{𝑗=1}^{𝑘_𝑛} 𝐺(𝐴_𝑗^𝑛)² = 𝜇(𝐴)   in 𝐿²(P).

Proof. The variables 𝐺(𝐴_𝑗^𝑛) ∼ 𝒩(0, 𝜇(𝐴_𝑗^𝑛)) are independent. From page 2 of the book, we know
𝐺(𝐴_𝑗^𝑛)² has variance 2𝜇(𝐴_𝑗^𝑛)². Therefore,

    Var( Σ_{𝑗=1}^{𝑘_𝑛} 𝐺(𝐴_𝑗^𝑛)² ) = 2 Σ_{𝑗=1}^{𝑘_𝑛} 𝜇(𝐴_𝑗^𝑛)² ⩽ 2 ( max_{1⩽𝑗⩽𝑘_𝑛} 𝜇(𝐴_𝑗^𝑛) ) Σ_{𝑗=1}^{𝑘_𝑛} 𝜇(𝐴_𝑗^𝑛) = 2 𝜇(𝐴) · max_{1⩽𝑗⩽𝑘_𝑛} 𝜇(𝐴_𝑗^𝑛) → 0.

Since E[ Σ_{𝑗=1}^{𝑘_𝑛} 𝐺(𝐴_𝑗^𝑛)² ] = 𝜇(𝐴), this variance is precisely E[ ( Σ_{𝑗=1}^{𝑘_𝑛} 𝐺(𝐴_𝑗^𝑛)² − 𝜇(𝐴) )² ]. ∎
If the partitions are successive refinements, then we have almost sure convergence by Doob’s
martingale convergence theorem.
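The concentration in Proposition 1.14 is easy to see numerically (a sketch assuming numpy, with 𝐴 = [0, 1], Lebesgue intensity, and equal cells; the mesh sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(11)

def sum_of_squares(k, reps=500):
    # G(A_j^n) ~ N(0, 1/k) for k equal cells of [0, 1]; rows: realizations.
    inc = rng.standard_normal((reps, k)) * np.sqrt(1.0 / k)
    return (inc**2).sum(axis=1)

s_coarse = sum_of_squares(10)
s_fine = sum_of_squares(10_000)

# Each sum has mean mu(A) = 1; the variance 2 * sum_j mu(A_j)^2 = 2/k
# shrinks as the mesh is refined, so the fine sums concentrate at 1.
spread_coarse = float(s_coarse.std())
spread_fine = float(s_fine.std())
```

This is the discrete shadow of the quadratic variation of Brownian motion, which reappears in Chapter 4.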

Chapter 2

Brownian Motion

Although Exercise 1.18 constructed Brownian motion (on [0, 1]), we will give another
construction that yields more information, via a lemma of Kolmogorov that will also be used in later
chapters.

2.1. Pre-Brownian Motion

The following is a natural extension of Exercise 1.18(3).


Definition 2.1. A pre-Brownian motion (𝐵_𝑡)_{𝑡⩾0} is a stochastic process such that

    𝐵_𝑡 = 𝐺(1_{[0,𝑡]})   (the “c.d.f. of 𝐺”)

for some Gaussian white noise 𝐺 on R₊ whose intensity is Lebesgue measure.

Proposition 2.2. Every pre-Brownian motion is a centered Gaussian process with covariance
𝐾(𝑠, 𝑡) = min{𝑠, 𝑡} ≕ 𝑠 ∧ 𝑡.

Proof. Cov(𝐵_𝑠, 𝐵_𝑡) = Leb([0, 𝑠] ∩ [0, 𝑡]) = 𝑠 ∧ 𝑡. ∎

Proposition 2.3. Let (𝑋_𝑡)_{𝑡⩾0} be a (real-valued) stochastic process. The following are equivalent:
(i) (𝑋_𝑡)_{𝑡⩾0} is a pre-Brownian motion;
(ii) (𝑋_𝑡)_{𝑡⩾0} is a centered Gaussian process with covariance 𝐾(𝑠, 𝑡) = 𝑠 ∧ 𝑡;
(iii) 𝑋₀ = 0 a.s. and for all 0 ⩽ 𝑠 < 𝑡, 𝑋_𝑡 − 𝑋_𝑠 ∼ 𝒩(0, 𝑡 − 𝑠) is independent of 𝜎(𝑋_𝑟, 𝑟 ⩽ 𝑠);
(iv) 𝑋₀ = 0 a.s. and for all 0 = 𝑡₀ < 𝑡₁ < ⋯ < 𝑡_𝑝, the increments 𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}} ∼ 𝒩(0, 𝑡_𝑖 − 𝑡_{𝑖−1}),
1 ⩽ 𝑖 ⩽ 𝑝, are independent.

Proof. (i) ⇒ (ii): Proposition 2.2.

(ii) ⇒ (iii): 𝑋₀ ∼ 𝒩(0, 0); 𝑋_𝑡 − 𝑋_𝑠 = 𝐺([𝑠, 𝑡]) ∼ 𝒩(0, 𝑡 − 𝑠); if 𝐻_𝑠 is the closed linear span
of (𝑋_𝑟)_{0⩽𝑟⩽𝑠} and 𝐻̃_𝑠 the closed linear span of (𝑋_𝑡 − 𝑋_𝑠)_{𝑡⩾𝑠}, then 𝐻_𝑠 ⊥ 𝐻̃_𝑠 since 𝑋_𝑟 ⊥ (𝑋_𝑡 − 𝑋_𝑠)
for 𝑟 ⩽ 𝑠 ⩽ 𝑡 (E[𝑋_𝑟(𝑋_𝑡 − 𝑋_𝑠)] = 𝑟 ∧ 𝑡 − 𝑟 ∧ 𝑠 = 𝑟 − 𝑟 = 0), whence by Theorem 1.9, 𝜎(𝐻_𝑠) ⫫ 𝜎(𝐻̃_𝑠).
(iii) ⇒ (iv): By (iii), 𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}} ⫫ 𝜎(𝑋_{𝑡_𝑗} − 𝑋_{𝑡_{𝑗−1}} ; 1 ⩽ 𝑗 < 𝑖) for each 𝑖 ∈ [1, 𝑝].

(iv) ⇒ (i): We need to define 𝐺(𝑓) for 𝑓 ∈ 𝐿²(R₊). We start with step functions
𝑓 = Σ_{𝑖=1}^{𝑛} 𝜆_𝑖 1_{(𝑡_{𝑖−1},𝑡_𝑖]}, where 0 = 𝑡₀ < 𝑡₁ < ⋯ < 𝑡_𝑛. For such 𝑓, we define

    𝐺(𝑓) ≔ Σ_{𝑖=1}^{𝑛} 𝜆_𝑖 (𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}}).

This does not depend on the representation of 𝑓: to see this, use a common refinement. Similarly, to
see that E[𝐺(𝑓)𝐺(𝑔)] = ∫_{R₊} 𝑓𝑔 for each 𝑓 and 𝑔, use a common refinement. Thus, 𝐺 is an isometry
from step functions on R₊ into the Gaussian space generated by 𝑋. Since step functions are dense in
𝐿²(R₊), we may extend 𝐺 to an isometry on 𝐿²(R₊). By construction, 𝐺((0, 𝑡]) = 𝑋_𝑡 − 𝑋₀ = 𝑋_𝑡. ∎
Exercise. Show that (𝑋𝑡 )𝑡>0 is a pre-Brownian motion iff 𝑋0 = 0 a.s. and (𝑋𝑡 )𝑡>0 is a centered
Gaussian process with ∀0 6 𝑠 < 𝑡 Var(𝑋𝑡 − 𝑋𝑠 ) = 𝑡 − 𝑠.
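Property (iv) is also a simulation recipe: sum independent 𝒩(0, 𝑡_𝑖 − 𝑡_{𝑖−1}) increments and check the covariance of (ii). A sketch assuming numpy, with an arbitrary choice of times:

```python
import numpy as np

rng = np.random.default_rng(2)

t = np.array([0.25, 0.5, 1.0, 2.0])
dt = np.diff(np.concatenate([[0.0], t]))

n = 500_000
inc = rng.standard_normal((n, len(t))) * np.sqrt(dt)  # N(0, t_i - t_{i-1})
B = np.cumsum(inc, axis=1)          # columns are B_{t_1}, ..., B_{t_4}

emp_cov = B.T @ B / n               # empirical covariance matrix
target = np.minimum.outer(t, t)     # K(s, t) = s ∧ t
err = float(np.max(np.abs(emp_cov - target)))
```

The empirical covariance matrix reproduces 𝐾(𝑠, 𝑡) = 𝑠 ∧ 𝑡 up to Monte Carlo error.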
The finite-dimensional distributions of pre-Brownian motion — the laws of (𝐵_{𝑡₁}, 𝐵_{𝑡₂}, …, 𝐵_{𝑡_𝑛})
for 0 < 𝑡₁ < ⋯ < 𝑡_𝑛 — are unique by the equivalence of (i) and (iv) in Proposition 2.3. To be
explicit:

Corollary 2.4. Let (𝐵_𝑡)_{𝑡⩾0} be a pre-Brownian motion and 0 = 𝑡₀ < 𝑡₁ < ⋯ < 𝑡_𝑛. Then
(𝐵_{𝑡₁}, 𝐵_{𝑡₂}, …, 𝐵_{𝑡_𝑛}) has density on R^𝑛

    (𝑥₁, …, 𝑥_𝑛) ↦→ Π_{𝑖=1}^{𝑛} (1/√(2π(𝑡_𝑖 − 𝑡_{𝑖−1}))) exp{ −Σ_{𝑖=1}^{𝑛} (𝑥_𝑖 − 𝑥_{𝑖−1})²/(2(𝑡_𝑖 − 𝑡_{𝑖−1})) },

where 𝑥₀ ≔ 0.

Proof. Independence of increments gives the joint density of the increments. Then we change
variables (𝑦₁, …, 𝑦_𝑛) ↦→ (𝑥₁, …, 𝑥_𝑛) via 𝑥_𝑖 ≔ Σ_{𝑗=1}^{𝑖} 𝑦_𝑗, which has Jacobian determinant 1. ∎
Some simple properties of pre-Brownian motion:

Proposition 2.5. Let (𝐵_𝑡)_{𝑡⩾0} be a pre-Brownian motion.
(i) (−𝐵_𝑡)_{𝑡⩾0} is a pre-Brownian motion.
(ii) For every 𝜆 > 0, (𝐵_𝑡^𝜆)_{𝑡⩾0} defined by 𝐵_𝑡^𝜆 ≔ (1/𝜆)𝐵_{𝜆²𝑡} is a pre-Brownian motion.
(iii) For every 𝑠 ⩾ 0, (𝐵_𝑡^{(𝑠)})_{𝑡⩾0} defined by 𝐵_𝑡^{(𝑠)} ≔ 𝐵_{𝑠+𝑡} − 𝐵_𝑠 is a pre-Brownian motion and is
independent of 𝜎(𝐵_𝑟, 𝑟 ⩽ 𝑠).

Proof. (i) and (ii) follow from (say) Proposition 2.3(ii).
(iii) In the notation of the proof of Proposition 2.3, we have 𝜎(𝐵_𝑡^{(𝑠)}, 𝑡 ⩾ 0) = 𝜎(𝐻̃_𝑠), which we
saw is independent of 𝜎(𝐻_𝑠) = 𝜎(𝐵_𝑟, 𝑟 ⩽ 𝑠). The finite-dimensional distributions are correct as a
special case of those for 𝐵 itself. ∎
We defined 𝐵 in terms of 𝐺, but 𝐺 is also determined by 𝐵: we did this in Proposi-
tion 2.3 (iv) ⇒ (i), using step functions and limits. One sometimes writes

    𝐺(𝑓) = ∫₀^∞ 𝑓(𝑠) d𝐵_𝑠   (𝑓 ∈ 𝐿²(R₊)).

This is called the Wiener integral. However, 𝐺(·) is not an almost sure measure and this integral
makes no sense pointwise. We will extend integration to random 𝑓 in Chapter 5.

Exercise. Nevertheless, one can integrate by parts in the Wiener integral: Suppose that 𝜇 is a finite,
signed measure on (0, 𝑡] for some 𝑡 > 0 and that 𝑓(𝑠) = 𝜇((0, 𝑠]) for 𝑠 ⩽ 𝑡. Set 𝑓(𝑠) ≔ 0 for 𝑠 > 𝑡.
Assume that 𝐵 is continuous a.s. Show that

    𝐺(𝑓) = 𝑓(𝑡)𝐵_𝑡 − ∫_{(0,𝑡]} 𝐵_𝑠 𝜇(d𝑠)   a.s.

2.2. The Continuity of Sample Paths


In Exercise 1.18, we defined Brownian motion 𝐵′ by taking a limit of continuous functions
based on 𝐵, thus getting almost surely a continuous function 𝑡 ↦→ 𝐵′_𝑡(𝜔). In our redevelopment, we
haven’t done that yet. Finite-dimensional distributions cannot guarantee that, since we could always
change the process at an independent 𝑈[0, 1] random time to be 0, say, which would not change
the f.d.d.s, yet would make the process discontinuous. We now discuss such modifications more
generally.
Definition 2.6. Let (𝑋𝑡 )𝑡∈𝑇 be a stochastic process with values in 𝐸. The sample paths of 𝑋 are the
maps 𝑡 ↦→ 𝑋𝑡 (𝜔) for each 𝜔 ∈ Ω.

Definition 2.7. Let (𝑋_𝑡)_{𝑡∈𝑇} and (𝑋̃_𝑡)_{𝑡∈𝑇} be stochastic processes indexed by the same 𝑇 and taking
values in the same 𝐸. We say 𝑋̃ is a modification of 𝑋 if

    ∀𝑡 ∈ 𝑇   P[𝑋̃_𝑡 = 𝑋_𝑡] = 1.

This gives the same finite-dimensional distributions, but that is not enough for us.

Definition 2.8. With the same notation, we say 𝑋̃ is indistinguishable from 𝑋 if

    P[∀𝑡 ∈ 𝑇   𝑋̃_𝑡 = 𝑋_𝑡] = 1.

To be more precise, we use the completion of P here, or, alternatively, the condition is that there
exists a subset 𝑁 ⊆ Ω with P(𝑁) = 0 such that

    ∀𝜔 ∈ 𝑁^c  ∀𝑡 ∈ 𝑇   𝑋̃_𝑡(𝜔) = 𝑋_𝑡(𝜔).

Notice that if 𝑇 is a separable metric space and 𝑋, 𝑋̃ both have continuous sample paths almost
surely, then 𝑋̃ is a modification of 𝑋 if and only if 𝑋̃ is indistinguishable from 𝑋. In case 𝑇 ⊆ R,
the same assertion holds with “continuous” replaced with “right-continuous” or “left-continuous”.
We are going to prove more than continuity, namely, Hölder continuity. In the context of metric
spaces, a function 𝑓 : (𝐸₁, 𝑑₁) → (𝐸₂, 𝑑₂) is Hölder continuous of order 𝛼 if

    ∃𝐶 < ∞  ∀𝑠, 𝑡 ∈ 𝐸₁   𝑑₂(𝑓(𝑠), 𝑓(𝑡)) ⩽ 𝐶 · 𝑑₁(𝑠, 𝑡)^𝛼.

Kolmogorov showed that when 𝑓 is replaced by a stochastic process on a domain in R^𝑘 that satisfies
the above inequality with the left-hand side replaced by the expectation of a power of the distance
and with 𝛼 > 𝑘 on the right-hand side, then the process has almost sure Hölder continuity of some
order higher than 0:

Theorem 2.9 (Kolmogorov’s lemma, or Kolmogorov’s continuity theorem). Consider a stochastic
process 𝑋 = (𝑋_𝑡)_{𝑡∈𝐼} on a bounded rectangle 𝐼 ⊆ R^𝑘 that takes values in a complete metric space
(𝐸, 𝑑). If there exist positive 𝑞, 𝜀, 𝐶 such that

    ∀𝑠, 𝑡 ∈ 𝐼   E[ 𝑑(𝑋_𝑠, 𝑋_𝑡)^𝑞 ] ⩽ 𝐶 |𝑠 − 𝑡|^{𝑘+𝜀},

then there exists a modification 𝑋̃ of 𝑋 whose sample paths are Hölder continuous of order 𝛼 for all
𝛼 ∈ (0, 𝜀/𝑞). Indeed, 𝑋̃ can be chosen to satisfy

    ∀𝛼 < 𝜀/𝑞   E[ ( sup_{𝑠,𝑡∈𝐼, 𝑠≠𝑡} 𝑑(𝑋̃_𝑠, 𝑋̃_𝑡)/|𝑠 − 𝑡|^𝛼 )^𝑞 ] < ∞.   (∗)

Note that for unbounded 𝐼, this gives locally Hölder sample paths. Recall that continuous
sample path modifications are unique up to indistinguishability.

Proof. We do only 𝑘 = 1. We also take 𝐼 = [0, 1] for simplicity; the presence of endpoints would
not matter. Note that Eq. (∗) implies that for each 𝛼 ∈ (0, 𝜀/𝑞), there is a Hölder-𝛼 modification.
Using a sequence 𝛼_𝑗 ↑ 𝜀/𝑞, we get that there is a modification that is Hölder-𝛼_𝑗 for all 𝑗 (by uniqueness
up to indistinguishability). This gives Hölder-𝛼 for all 𝛼 ∈ (0, 𝜀/𝑞).
Now for 𝑠 ≠ 𝑡, the hypothesis yields

    E[ 𝑑(𝑋_𝑠, 𝑋_𝑡)^𝑞 / |𝑠 − 𝑡|^{𝛼𝑞} ] ⩽ 𝐶 |𝑠 − 𝑡|^{1+𝜀} / |𝑠 − 𝑡|^{𝛼𝑞} = 𝐶 |𝑠 − 𝑡|^{1+𝜀−𝛼𝑞}.

Hence, setting

    𝐾(𝜔) ≔ sup_{𝑛⩾1} sup_{1⩽𝑖⩽2^𝑛} ( 𝑑(𝑋_{(𝑖−1)2^{−𝑛}}, 𝑋_{𝑖2^{−𝑛}}) / (2^{−𝑛})^𝛼 )^𝑞,

we obtain

    E[𝐾] ⩽ Σ_{𝑛⩾1} Σ_{1⩽𝑖⩽2^𝑛} E[ ( 𝑑(𝑋_{(𝑖−1)2^{−𝑛}}, 𝑋_{𝑖2^{−𝑛}}) / 2^{−𝑛𝛼} )^𝑞 ]
         ⩽ Σ_{𝑛⩾1} 2^𝑛 · 𝐶 2^{−𝑛(1+𝜀−𝛼𝑞)} = Σ_{𝑛⩾1} 𝐶 2^{−𝑛(𝜀−𝛼𝑞)} < ∞.

We now use:

Lemma 2.10. Let 𝐷 ≔ {𝑖2^{−𝑛} ; 𝑛 ⩾ 1, 0 ⩽ 𝑖 ⩽ 2^𝑛}, 𝑓 : 𝐷 → (𝐸, 𝑑), and 𝛼 > 0. Then

    sup_{𝑠,𝑡∈𝐷, 𝑠≠𝑡} 𝑑(𝑓(𝑠), 𝑓(𝑡))/|𝑠 − 𝑡|^𝛼 ⩽ (2/(1 − 2^{−𝛼})) sup_{𝑛⩾1} sup_{1⩽𝑖⩽2^𝑛} 𝑑(𝑓((𝑖 − 1)2^{−𝑛}), 𝑓(𝑖2^{−𝑛})) / (2^{−𝑛})^𝛼.

Proof. Take a “chain” from 𝑠 to 𝑡 that uses at most two hops of order ℓ for every
ℓ ⩾ 𝑝, where 2^{−𝑝} ⩽ |𝑠 − 𝑡| < 2^{−𝑝+1}. See page 26 of the book for details. ∎
2.2. The Continuity of Sample Paths 11

This gives Eq. (∗) restricted to 𝑠, 𝑡 ∈ 𝐷 for 𝑋̃_𝑟 ≔ 𝑋_𝑟 (𝑟 ∈ 𝐷). In particular, 𝑋 is almost surely
Hölder-𝛼 continuous on 𝐷, hence continuous on 𝐷. Define

    𝑋̃_𝑡(𝜔) ≔ lim_{𝐷∋𝑠→𝑡} 𝑋_𝑠(𝜔)   when 𝐾(𝜔) < ∞

and 𝑋̃_𝑡(𝜔) ≔ 𝑥₀ for some fixed 𝑥₀ ∈ 𝐸 when 𝐾(𝜔) = ∞. Then

    ∀𝜔 ∈ Ω   sup_{𝑠,𝑡∈𝐼, 𝑠≠𝑡} ( 𝑑(𝑋̃_𝑠, 𝑋̃_𝑡)/|𝑠 − 𝑡|^𝛼 )^𝑞 ⩽ (2/(1 − 2^{−𝛼}))^𝑞 𝐾(𝜔). ∎

Corollary 2.11. Pre-Brownian motion has a modification with continuous sample paths. Every
such modification is indistinguishable from one all of whose sample paths are locally Hölder
continuous of order 𝛼 for all 𝛼 < 1/2.
Proof. Recall that a standard normal random variable has a finite 𝑞th moment for each 𝑞 < ∞.
Thus, for 𝑠 < 𝑡, there exists a standard normal 𝑈 such that

    𝐵_𝑡 − 𝐵_𝑠 = √(𝑡 − 𝑠) · 𝑈 ∈ 𝐿^𝑞

with

    E[ |𝐵_𝑡 − 𝐵_𝑠|^𝑞 ] = (𝑡 − 𝑠)^{𝑞/2} · E[ |𝑈|^𝑞 ].

If 𝑞 > 2, we can apply Theorem 2.9 with 𝜀 ≔ 𝑞/2 − 1 to get Hölder continuity with 𝛼 < 𝜀/𝑞 = 1/2 − 1/𝑞.
We may take 𝑞 arbitrarily large. ∎
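The exponent 1/2 is visible numerically: the root-mean-square increment of a simulated path over lag ℎ scales like ℎ^{1/2}. A sketch assuming numpy (grid size, lags, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

m = 2**18
# Discretized Brownian path on [0, 1] with time step 1/m.
B = np.cumsum(rng.standard_normal(m)) / np.sqrt(m)

lags = 2 ** np.arange(4, 12)
rms = np.array([np.sqrt(np.mean((B[h:] - B[:-h]) ** 2)) for h in lags])

# Slope of log(rms) against log(lag): estimates the scaling exponent 1/2,
# consistent with Hölder-alpha paths for every alpha < 1/2 (but not more).
slope = float(np.polyfit(np.log(lags / m), np.log(rms), 1)[0])
```

Of course this only probes mean-square scaling; the almost sure statement is exactly what Theorem 2.9 supplies.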
Remark. The optimal result is known as “Lévy’s modulus of continuity”:

    lim sup_{𝜀↓0} sup_{0⩽𝑡⩽1} |𝐵_{𝑡+𝜀} − 𝐵_𝑡| / √(2𝜀 log(1/𝜀)) = 1   a.s.

Definition 2.12. A Brownian motion is a pre-Brownian motion with continuous sample paths.
We have proved that Brownian motion exists. Since −𝐵, 𝐵^𝜆, and 𝐵^{(𝑠)} have continuous sample paths
when 𝐵 does, the statements of Proposition 2.5 hold when “pre” is removed everywhere.
In order to discuss the law of the sample paths, we use the space 𝐶(R₊, R) of continuous
functions from R₊ to R equipped with the topology 𝜏 of uniform convergence on every compact set.
This topology is Polish (separable and completely metrizable). The corresponding Borel 𝜎-field is
generated by the coordinate maps 𝑤 ↦→ 𝑤(𝑡) (𝑡 ∈ R₊).

Exercise. Check that 𝜏 is Polish and that its Borel 𝜎-field is generated as claimed.
Then 𝜔 ↦→ (𝑡 ↦→ 𝐵_𝑡(𝜔)) is measurable, since composing it with each coordinate map 𝑤 ↦→ 𝑤(𝑠)
gives the measurable 𝐵_𝑠. The pushforward of P is the Wiener measure 𝑊, the law of sample
paths: 𝑊(𝐴) = P[𝐵 ∈ 𝐴] for measurable 𝐴 ⊆ 𝐶(R₊, R). Corollary 2.4, the finite-dimensional
distributions of pre-Brownian motion, gives the finite-dimensional distributions of 𝑊, i.e., the
collection of laws of (𝑤(𝑡₀), 𝑤(𝑡₁), …, 𝑤(𝑡_𝑛)) for 𝑛 ⩾ 0, 0 = 𝑡₀ < 𝑡₁ < ⋯ < 𝑡_𝑛. The cylinder sets
are the sets

    {𝑤 ∈ 𝐶(R₊, R) ; 𝑤(𝑡₀) ∈ 𝐴₀, …, 𝑤(𝑡_𝑛) ∈ 𝐴_𝑛}

for 𝐴₀, …, 𝐴_𝑛 ∈ ℬ(R). The class of cylinder sets is obviously closed under finite intersections; by
definition, this class generates the 𝜎-field, whence by the 𝜋-𝜆 theorem (number 1 on page 262 of
the book), the finite-dimensional distributions of 𝑊 determine 𝑊. Thus, there is only one Wiener
measure.

Figure 2.1: Simulation of Brownian motion
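A simulation like the one in Figure 2.1 takes only a few lines (a sketch assuming numpy; the grid size and seed are arbitrary, and the plotting call in the comment assumes matplotlib is available):

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent N(0, 1/n) increments summed up, as in Proposition 2.3(iv),
# give a discretized Brownian path on [0, 1].
n = 10_000
t = np.linspace(0.0, 1.0, n + 1)
increments = rng.standard_normal(n) * np.sqrt(1.0 / n)
B = np.concatenate([[0.0], np.cumsum(increments)])

# With matplotlib available, the figure would be drawn by:
# import matplotlib.pyplot as plt; plt.plot(t, B); plt.show()
```

By Donsker-type scaling, refining the grid changes nothing in law; the jagged appearance at every scale reflects the failure of Hölder continuity of order ⩾ 1/2.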
Exercise (due 9/14). Exercise 2.25 (time inversion).
Exercise. Suppose that 𝑓 ∈ 𝐿²_loc(R₊), i.e., 𝑓 ∈ 𝐿²([0, 𝑡]) for all 𝑡 > 0. Define the stochastic
process 𝑋 : 𝑡 ↦→ 𝐺(𝑓 1_{[0,𝑡]}) for a Gaussian white noise 𝐺 on R₊ with intensity Lebesgue measure.
Define 𝐴_𝑡 ≔ ∫₀^𝑡 |𝑓(𝑠)|² d𝑠 and 𝜏_𝑡 ≔ inf{𝑠 ⩾ 0 ; 𝐴_𝑠 > 𝑡}. Let 𝛽_𝑡 ≔ 𝑋_{𝜏_𝑡}. Show that (𝛽_𝑡)_{0⩽𝑡<𝐴_∞} is a
pre-Brownian motion restricted to [0, 𝐴_∞). Let 𝐵 be a modification of 𝛽 that is continuous. Write
𝑑(𝑠, 𝑡) ≔ (∫_𝑠^𝑡 |𝑓(𝑢)|² d𝑢)^{1/2}. Show that 𝑌 : 𝑡 ↦→ 𝐵_{𝐴_𝑡} is a modification of 𝑋 that is locally Hölder
continuous of order 𝛼 for all 𝛼 < 1 with respect to the pseudometric 𝑑 on R₊. We call such a process
𝑌 a Wiener integral process.

2.3. Properties of Brownian Sample Paths

Lemma. Let 𝑋1, 𝑋2, …, 𝑋𝑛 be random variables and 𝐴 be an event. Then 𝐴 is independent of 𝜎(𝑋1, 𝑋2, …, 𝑋𝑛) if and only if

    ∀𝑔 ∈ 𝐶c(Rⁿ, R)   E[1_𝐴 𝑔(𝑋1, 𝑋2, …, 𝑋𝑛)] = P(𝐴) E[𝑔(𝑋1, 𝑋2, …, 𝑋𝑛)].

Proof. The law of (𝑋1, 𝑋2, …, 𝑋𝑛), as a Borel probability measure on Rⁿ, is determined by its integrals of 𝑔 ∈ 𝐶c(Rⁿ), as is the law conditional on 𝐴. J
Let (𝐵𝑡)𝑡>0 be a Brownian motion. Write ℱ𝑡 B 𝜎(𝐵𝑠, 𝑠 6 𝑡) and ℱ0+ B ⋂_{𝑠>0} ℱ𝑠. This latter describes how Brownian motion "starts". Are there ways it can start that have non-trivial probability? No:
Theorem 2.13 (Blumenthal’s 0-1 Law). ℱ0+ is trivial in the sense that all its sets have probability 0
or 1.
Proof. We want to show that ℱ0+ is independent of ℱ0+, for which it suffices to show that ℱ0+ is independent of ℱ𝑠 for some 𝑠 > 0, since ℱ0+ ⊆ ℱ𝑠. Take any 𝑠 > 0; since ℱ𝑠 = 𝜎(𝐵𝑡, 0 < 𝑡 6 𝑠) (as 𝐵0 = 0 a.s.), it suffices to show that ℱ0+ is independent of 𝜎(𝐵_{𝑡1}, 𝐵_{𝑡2}, …, 𝐵_{𝑡𝑛}) for any 0 < 𝑡1 < 𝑡2 < ⋯ < 𝑡𝑛 6 𝑠. In light of the above lemma, we calculate, for 𝑔 ∈ 𝐶c(Rⁿ, R) and 𝐴 ∈ ℱ0+,

    E[1_𝐴 𝑔(𝐵_{𝑡1}, 𝐵_{𝑡2}, …, 𝐵_{𝑡𝑛})]
        = lim_{𝜀↓0} E[1_𝐴 𝑔(𝐵_{𝑡1} − 𝐵𝜀, 𝐵_{𝑡2} − 𝐵𝜀, …, 𝐵_{𝑡𝑛} − 𝐵𝜀)]   [bounded convergence theorem]
        = lim_{𝜀↓0} P(𝐴) E[𝑔(𝐵_{𝑡1} − 𝐵𝜀, 𝐵_{𝑡2} − 𝐵𝜀, …, 𝐵_{𝑡𝑛} − 𝐵𝜀)]
              [ℱ0+ ⊆ ℱ𝜀 ⫫ 𝜎(𝐵_{𝑡1} − 𝐵𝜀, …, 𝐵_{𝑡𝑛} − 𝐵𝜀) for 𝜀 < 𝑡1 by Proposition 2.5]
        = P(𝐴) E[𝑔(𝐵_{𝑡1}, 𝐵_{𝑡2}, …, 𝐵_{𝑡𝑛})]. J

As a corollary, we deduce the following:

Proposition 2.14. (i) Almost surely,

    ∀𝜀 > 0   sup_{0 6 𝑠 6 𝜀} 𝐵𝑠 > 0 and inf_{0 6 𝑠 6 𝜀} 𝐵𝑠 < 0.

(ii) Almost surely, lim sup_{𝑡→∞} 𝐵𝑡 = ∞ and lim inf_{𝑡→∞} 𝐵𝑡 = −∞.

Note that these are random variables since we may restrict to rational times.

Proof. (i) We have

    P[∀𝜀 > 0 sup_{0 6 𝑠 6 𝜀} 𝐵𝑠 > 0] = P[⋂_{𝜀>0} [sup_{0 6 𝑠 6 𝜀} 𝐵𝑠 > 0]]
        = lim_{𝜀↓0} P[sup_{0 6 𝑠 6 𝜀} 𝐵𝑠 > 0] > lim_{𝜀↓0} P[𝐵𝜀 > 0] = 1/2,

whence the above probability equals one by Theorem 2.13. Symmetry gives the other result.
(ii) Let 𝑍 B sup𝑡 𝐵𝑡. Recall that 𝐵𝑡^𝜆 B (1/𝜆)𝐵_{𝜆²𝑡} gives a Brownian motion by Proposition 2.5(ii). Thus, the law of 𝑍 is the same as the law of 𝑍/𝜆 for all 𝜆 > 0, which means it is concentrated on {0, ∞}. By part (i), 𝑍 > 0 a.s., whence 𝑍 = ∞ a.s. Therefore, lim sup_{𝑡→∞} 𝐵𝑡 = ∞ a.s. as well. Symmetry gives the other assertion. J

Exercise (due 9/21). Exercise 2.29. In fact, show that for any sequence (𝑡𝑘)_{𝑘>1} ⊂ (0, ∞) with 𝑡𝑘 → 0, we have lim sup_{𝑘→∞} 𝐵_{𝑡𝑘}/√𝑡𝑘 = ∞ and lim inf_{𝑘→∞} 𝐵_{𝑡𝑘}/√𝑡𝑘 = −∞ almost surely.

Exercise. Show that for any sequence (𝑡𝑘)_{𝑘>1} ⊂ (0, ∞) with 𝑡𝑘 → ∞, we have lim sup_{𝑘→∞} 𝐵_{𝑡𝑘}/√𝑡𝑘 = ∞ and lim inf_{𝑘→∞} 𝐵_{𝑡𝑘}/√𝑡𝑘 = −∞ almost surely.
Exercise. Show that the tail 𝜎-field ⋂_{𝑡>0} 𝜎(𝐵𝑠, 𝑠 > 𝑡) is trivial.
Another corollary:
Corollary 2.15. Almost surely, Brownian motion is not monotone on any nontrivial interval.

Proof. By Proposition 2.5(iii),

    ∀𝑡 > 0   P[∀𝜀 > 0 sup_{𝑡 6 𝑠 6 𝑡+𝜀} 𝐵𝑠 > 𝐵𝑡 and inf_{𝑡 6 𝑠 6 𝑡+𝜀} 𝐵𝑠 < 𝐵𝑡] = 1.

Apply this to 𝑡 ∈ Q+. J

Of course, Brownian motion does have local maxima and minima.


We give two last properties that do not depend on Theorem 2.13:
Proposition 2.16. Fix 𝑡 > 0. If 0 = 𝑡_0^𝑛 < 𝑡_1^𝑛 < ⋯ < 𝑡_{𝑝𝑛}^𝑛 = 𝑡 satisfies

    lim_{𝑛→∞} max_{1 6 𝑖 6 𝑝𝑛} (𝑡_𝑖^𝑛 − 𝑡_{𝑖−1}^𝑛) = 0,

then

    lim_{𝑛→∞} Σ_{𝑖=1}^{𝑝𝑛} (𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛})² = 𝑡   in 𝐿²(P).

Proof. Immediate from Proposition 1.14. J
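Proposition 2.16 is easy to observe numerically: along a fine uniform subdivision of [0, 𝑡], the sum of squared increments of a sampled path concentrates near 𝑡, since it has mean 𝑡 and variance 2𝑡²/𝑛. A sketch (NumPy assumed; only the increments need to be sampled):

```python
import numpy as np

rng = np.random.default_rng(1)
t, n = 2.0, 2**18                               # uniform subdivision of [0, t]
incs = rng.normal(0.0, np.sqrt(t / n), size=n)  # B_{t_i} - B_{t_{i-1}}
qv = float(np.sum(incs**2))                     # sum of squared increments
# E[qv] = t and Var[qv] = 2 t^2 / n, so qv is close to t for large n
print(qv)
```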

Corollary 2.17. Almost surely, Brownian motion has infinite variation on every nontrivial interval.

Proof. As in the proof of Corollary 2.15, it suffices to prove this for each interval [0, 𝑡], 𝑡 > 0. By taking a subsequence, we may assume almost sure convergence in Proposition 2.16. Since

    Σ_{𝑖=1}^{𝑝𝑛} (𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛})² 6 max_{1 6 𝑖 6 𝑝𝑛} |𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛}| · Σ_{𝑖=1}^{𝑝𝑛} |𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛}|,

the left-hand side tends to 𝑡 > 0 almost surely, and lim_{𝑛→∞} max_{1 6 𝑖 6 𝑝𝑛} |𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛}| = 0 by (uniform) continuity, the sums Σ_{𝑖=1}^{𝑝𝑛} |𝐵_{𝑡_𝑖^𝑛} − 𝐵_{𝑡_{𝑖−1}^𝑛}| tend to ∞ almost surely, and the result follows. J

Thus, the Wiener integral cannot be defined as an ordinary integral.

2.4. The Strong Markov Property of Brownian Motion


We want to extend the Markov property, that what happens at times before 𝑡 is independent of
the increments after time 𝑡, by replacing 𝑡 with a suitable class of random times, 𝑇. Clearly, such 𝑇
should not “depend on the future”.
Remark. Here 𝑇 denotes a random time, not the index set of the process.
Define ℱ∞ B 𝜎(𝐵 𝑠 , 𝑠 > 0).
Definition 2.18. A [0, ∞]-valued random variable 𝑇 is a stopping time if

∀𝑡 > 0 [𝑇 6 𝑡] ∈ ℱ𝑡 .
Examples. Two examples of stopping times:
(1) 𝑇 ≡ 𝑠;
(2) 𝑇 = 𝑇𝑎 B inf{𝑠 > 0 ; 𝐵𝑠 = 𝑎} is a stopping time, since 𝑇𝑎 6 𝑡 if and only if

    inf_{𝑠∈Q∩[0,𝑡]} |𝐵𝑠 − 𝑎| = 0.

If 𝑇 is a stopping time, then

    ∀𝑡 > 0   [𝑇 < 𝑡] = ⋃_{𝑠∈Q∩[0,𝑡)} [𝑇 6 𝑠] ∈ ℱ𝑡.

What is the 𝜎-field of events “determined up to time 𝑇”? We might guess 𝐴 is such an event
if for each 𝑡 > 0, 𝐴 ∩ [𝑇 = 𝑡] ∈ ℱ𝑡 . But we know it might be problematic to make such a fine
disintegration of 𝐴. Perhaps it would be better to require 𝐴 ∩ [𝑇 6 𝑡] ∈ ℱ𝑡 . Moreover, this is
enough at the intuitive level since then 𝐴 ∩ [𝑇 < 𝑡] ∈ ℱ𝑡 and so 𝐴 ∩ [𝑇 = 𝑡] ∈ ℱ𝑡 .
Definition 2.19. If 𝑇 is a stopping time, the 𝜎-field of the past before 𝑇 is

    ℱ𝑇 B {𝐴 ∈ ℱ∞ ; ∀𝑡 > 0   𝐴 ∩ [𝑇 6 𝑡] ∈ ℱ𝑡}.

It is easy to check that


(1) ℱ𝑇 is a 𝜎-field, and
(2) 𝑇 is ℱ𝑇 -measurable.
What is Brownian motion at time 𝑇? When 𝑇 = ∞, this makes no sense, so define

    𝐵̃_𝑇(𝜔) B 𝐵_{𝑇(𝜔)}(𝜔) if 𝑇(𝜔) < ∞,   𝐵̃_𝑇(𝜔) B 0 if 𝑇(𝜔) = ∞.

We claim that 𝐵̃_𝑇 is ℱ𝑇-measurable. We use the left-continuity of 𝐵 to write

    𝐵̃_𝑇 = lim_{𝑛→∞} Σ_{𝑖>0} 1_{[𝑖/𝑛 6 𝑇 < (𝑖+1)/𝑛]} 𝐵_{𝑖/𝑛} = lim_{𝑛→∞} Σ_{𝑖>0} 1_{[𝑇 < (𝑖+1)/𝑛]} 1_{[𝑖/𝑛 6 𝑇]} 𝐵_{𝑖/𝑛}.

Thus, we see it suffices to show that

∀𝑠 > 0 1 [𝑠6𝑇] 𝐵 𝑠 ∈ ℱ𝑇 .

Indeed, if 𝐴 ∈ ℬ(R) and 0 ∉ 𝐴, then for each 𝑡 > 0,

    [1_{[𝑠 6 𝑇]} 𝐵𝑠 ∈ 𝐴] ∩ [𝑇 6 𝑡] = ∅ if 𝑡 < 𝑠, and = [𝐵𝑠 ∈ 𝐴] ∩ [𝑇 < 𝑠]ᶜ ∩ [𝑇 6 𝑡] if 𝑡 > 𝑠;

in either case, it belongs to ℱ𝑡. In case 0 ∈ 𝐴, just use 𝐴ᶜ in what we just established. This gives our claim.
Theorem 2.20 (Strong Markov Property). Let 𝑇 be a stopping time with P[𝑇 < ∞] > 0. Define

    𝐵̃_𝑡^{(𝑇)} B 𝐵̃_{𝑇+𝑡} − 𝐵̃_𝑇   (𝑡 > 0).

Then under P[· | 𝑇 < ∞], the process (𝐵̃_𝑡^{(𝑇)})_{𝑡>0} is a Brownian motion independent of ℱ𝑇.

Proof. Suppose first 𝑇 < ∞ a.s. The assertions will follow from

    ∀𝐴 ∈ ℱ𝑇 ∀0 6 𝑡1 < ⋯ < 𝑡𝑝 ∀𝐹 ∈ 𝐶c(R^𝑝, R)
    E[1_𝐴 𝐹(𝐵̃_{𝑡1}^{(𝑇)}, 𝐵̃_{𝑡2}^{(𝑇)}, …, 𝐵̃_{𝑡𝑝}^{(𝑇)})] = P(𝐴) · E[𝐹(𝐵_{𝑡1}, 𝐵_{𝑡2}, …, 𝐵_{𝑡𝑝})].   (2.1)

For then 𝐴 B Ω shows that (𝐵̃_𝑡^{(𝑇)})_{𝑡>0} has the same finite-dimensional distributions as Brownian motion, so by Proposition 2.3 is a pre-Brownian motion. Sample paths are continuous, so it is a Brownian motion. Also, Eq. (2.1) shows that (𝐵̃_{𝑡1}^{(𝑇)}, 𝐵̃_{𝑡2}^{(𝑇)}, …, 𝐵̃_{𝑡𝑝}^{(𝑇)}) is independent of ℱ𝑇. So by the 𝜋-𝜆 theorem, 𝐵̃^{(𝑇)} is independent of ℱ𝑇.
To show Eq. (2.1), we use the following notation: ⌈𝑡⌉_𝑛 B ⌈𝑛𝑡⌉/𝑛 > 𝑡. Now the bounded convergence theorem yields

    E[1_𝐴 𝐹(𝐵̃_{𝑡1}^{(𝑇)}, 𝐵̃_{𝑡2}^{(𝑇)}, …, 𝐵̃_{𝑡𝑝}^{(𝑇)})] = lim_{𝑛→∞} E[1_𝐴 𝐹(𝐵̃_{𝑡1}^{(⌈𝑇⌉_𝑛)}, 𝐵̃_{𝑡2}^{(⌈𝑇⌉_𝑛)}, …, 𝐵̃_{𝑡𝑝}^{(⌈𝑇⌉_𝑛)})]
    = lim_{𝑛→∞} Σ_{𝑘=0}^{∞} E[1_𝐴 1_{[(𝑘−1)/𝑛 < 𝑇 6 𝑘/𝑛]} 𝐹(𝐵_{𝑘/𝑛+𝑡1} − 𝐵_{𝑘/𝑛}, 𝐵_{𝑘/𝑛+𝑡2} − 𝐵_{𝑘/𝑛}, …, 𝐵_{𝑘/𝑛+𝑡𝑝} − 𝐵_{𝑘/𝑛})].

Note that

    𝐴 ∩ [(𝑘−1)/𝑛 < 𝑇 6 𝑘/𝑛] = (𝐴 ∩ [𝑇 6 𝑘/𝑛]) ∩ [𝑇 6 (𝑘−1)/𝑛]ᶜ ∈ ℱ_{𝑘/𝑛},

since 𝐴 ∩ [𝑇 6 𝑘/𝑛] ∈ ℱ_{𝑘/𝑛} (as 𝐴 ∈ ℱ𝑇) and [𝑇 6 (𝑘−1)/𝑛] ∈ ℱ_{(𝑘−1)/𝑛} ⊆ ℱ_{𝑘/𝑛}.

Thus, the 𝑘th term in the sum equals

    P(𝐴 ∩ [(𝑘−1)/𝑛 < 𝑇 6 𝑘/𝑛]) · E[𝐹(𝐵_{𝑡1}, 𝐵_{𝑡2}, …, 𝐵_{𝑡𝑝})].

Summing over 𝑘 gives Eq. (2.1).


In case P[𝑇 = ∞] > 0, the same arguments work with 𝐴 ∩ [𝑇 < ∞] in place of 𝐴,
yielding Eq. (2.1) for such sets in ℱ𝑇 , and this gives the result similarly. J

When 𝑇 < ∞ a.s., we will omit the tildes in 𝐵̃_𝑇 and 𝐵̃^{(𝑇)}.
A very nice application of the strong Markov property is the reflection principle.
Theorem 2.21. For 𝑡 > 0, write 𝑆𝑡 B max_{0 6 𝑠 6 𝑡} 𝐵𝑠 > 0. Then

    ∀𝑎 > 0 ∀𝑏 ∈ (−∞, 𝑎]   P[𝑆𝑡 > 𝑎, 𝐵𝑡 6 𝑏] = P[𝐵𝑡 > 2𝑎 − 𝑏].

Moreover, 𝑆𝑡 =𝒟 |𝐵𝑡|.
Figure 2.2: Illustration of the reflection principle

Proof. We use the stopping time 𝑇𝑎 B inf{𝑡 > 0 ; 𝐵𝑡 = 𝑎}. By Proposition 2.14, 𝑇𝑎 < ∞ almost surely. We have by Theorem 2.20,

    P[𝑆𝑡 > 𝑎, 𝐵𝑡 6 𝑏] = P[𝑇𝑎 6 𝑡, 𝐵𝑡 6 𝑏] = P[𝑇𝑎 6 𝑡, 𝐵_{𝑡−𝑇𝑎}^{(𝑇𝑎)} 6 𝑏 − 𝑎]
    = P[𝑇𝑎 6 𝑡] · P[𝐵_{𝑡−𝑇𝑎}^{(𝑇𝑎)} 6 𝑏 − 𝑎 | 𝑇𝑎 6 𝑡]
    = P[𝑇𝑎 6 𝑡] · P[−𝐵_{𝑡−𝑇𝑎}^{(𝑇𝑎)} 6 𝑏 − 𝑎 | 𝑇𝑎 6 𝑡]
    = P[𝑇𝑎 6 𝑡] · P[𝐵_{𝑡−𝑇𝑎}^{(𝑇𝑎)} > 𝑎 − 𝑏 | 𝑇𝑎 6 𝑡] = P[𝑇𝑎 6 𝑡, 𝐵_{𝑡−𝑇𝑎}^{(𝑇𝑎)} > 𝑎 − 𝑏]
    = P[𝑇𝑎 6 𝑡, 𝐵𝑡 > 2𝑎 − 𝑏] = P[𝐵𝑡 > 2𝑎 − 𝑏]

since 2𝑎 − 𝑏 > 𝑎. The crucial fourth equality uses the independence of 𝐵^{(𝑇𝑎)} and ℱ_{𝑇𝑎}. It follows that

    P[𝑆𝑡 > 𝑎] = P[𝑆𝑡 > 𝑎, 𝐵𝑡 > 𝑎] + P[𝑆𝑡 > 𝑎, 𝐵𝑡 6 𝑎] = 2 P[𝐵𝑡 > 𝑎] = P[|𝐵𝑡| > 𝑎]. J
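The consequence P[𝑆𝑡 > 𝑎] = P[|𝐵𝑡| > 𝑎] can be checked by Monte Carlo on a discrete grid (a sketch, NumPy assumed; the discrete maximum slightly undershoots 𝑆𝑡, so the agreement is only approximate):

```python
import numpy as np

rng = np.random.default_rng(2)
t, n, paths, a = 1.0, 500, 10000, 1.0
incs = rng.normal(0.0, np.sqrt(t / n), size=(paths, n))
B = np.cumsum(incs, axis=1)                      # grid values of each path
S = np.maximum(B.max(axis=1), 0.0)               # discrete running maximum
p_max = float(np.mean(S >= a))                   # ~ P[S_t >= a]
p_abs = float(np.mean(np.abs(B[:, -1]) >= a))    # ~ P[|B_t| >= a]
print(p_max, p_abs)
```

Both estimates should be near 2(1 − Φ(1)) ≈ 0.317 for these parameters, up to discretization and sampling error.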

Exercise (due 9/28). (1) Exercise 2.28.


(2) Prove (2.2) in the book.
Corollary 2.22. For all 𝑎 ≠ 0, 𝑇𝑎 =𝒟 𝑎²/𝐵₁² and E[√𝑇𝑎] = ∞.
Proof. We may assume by symmetry that 𝑎 > 0. For each 𝑡 > 0,

    P[𝑇𝑎 6 𝑡] = P[𝑆𝑡 > 𝑎] = P[|𝐵𝑡| > 𝑎]   [Theorem 2.21]
    = P[(𝐵𝑡)² > 𝑎²] = P[𝑡(𝐵1)² > 𝑎²] = P[𝑎²/(𝐵1)² 6 𝑡].

Therefore,

    E[√𝑇𝑎] = E[𝑎/|𝐵1|] = 𝑎 ∫_{−∞}^{∞} 𝑝𝑋(𝑥)/|𝑥| d𝑥 = ∞,

where 𝑝𝑋 is the standard normal density. J
Exercise (due 9/28). Verify the density in Corollary 2.22 in the book.
An amusing and immediate consequence of Corollary 2.22 is that E[𝑇𝑎⁻¹] = 𝑎⁻².
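Since 𝑇𝑎 has the law of 𝑎²/𝐵₁², the identity E[𝑇𝑎⁻¹] = 𝑎⁻² reduces to E[𝐵₁²] = 1, which can be confirmed by sampling (a sketch, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
a = 2.0
B1 = rng.normal(size=10**6)        # standard normal samples
inv_Ta = B1**2 / a**2              # samples of 1/T_a via T_a =_D a^2 / B_1^2
print(float(inv_Ta.mean()))        # close to 1/a^2 = 0.25
```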
Exercise. Show that for 𝑎 ≠ 0, if 𝑆𝑎 B sup{𝑠 ; 𝐵𝑠 = 𝑎𝑠}, then 𝑆𝑎 =𝒟 𝐵₁²/𝑎². Hint: use the result of Exercise 2.25.
Exercise. Let 𝑋𝑡 B ∫_0^𝑡 𝐵𝑠 d𝑠 be integrated Brownian motion. Show that almost surely, lim sup_{𝑡→∞} 𝑋𝑡 = ∞ and lim inf_{𝑡→∞} 𝑋𝑡 = −∞. Hint: For a finite stopping time 𝑇, write 𝑋𝑡 = 𝑋_{𝑡∧𝑇} + (𝑡 − 𝑇)⁺ 𝐵𝑇 + 𝑌_{(𝑡−𝑇)⁺} with 𝑌 a copy of 𝑋 that is independent of ℱ𝑇. Use 𝑇𝑛 B inf{𝑡 > 𝑛 ; 𝐵𝑡 = 0} to show that P[sup𝑡 𝑋𝑡 = ∞] ∈ {0, 1}. Use 𝑇 B inf{𝑡 > 1 ; 𝐵𝑡 = −1} to show that P[sup𝑡 𝑋𝑡 = ∞] = 1.
We now extend Brownian motion to initial values other than 0 and to finite dimensions.
Definition 2.23. If 𝑍 is an R-valued random variable and 𝐵 is a Brownian motion independent of
𝑍, then we call (𝑍 + 𝐵𝑡 )𝑡>0 a real Brownian motion started from 𝑍.
Definition 2.24. If 𝐵¹, …, 𝐵^𝑑 are independent real Brownian motions started from 0, then we call (𝐵𝑡¹, …, 𝐵𝑡^𝑑)_{𝑡>0} a 𝑑-dimensional Brownian motion started from 0. If we add an independent starting vector, 𝑍, then we get 𝑑-dimensional Brownian motion started from 𝑍.
Note that by Corollary 2.4, if 𝐵 is a 𝑑-dimensional Brownian motion (from 0), then for 0 = 𝑡0 < 𝑡1 < ⋯ < 𝑡𝑛 and 𝑥1 = (𝑥_1^1, …, 𝑥_𝑑^1), …, 𝑥𝑛 = (𝑥_1^𝑛, …, 𝑥_𝑑^𝑛), the density of (𝐵_{𝑡1}, …, 𝐵_{𝑡𝑛}) at (𝑥1, …, 𝑥𝑛) is

    Π_{𝑘=1}^{𝑑} [ Π_{𝑖=1}^{𝑛} (2𝜋(𝑡𝑖 − 𝑡_{𝑖−1}))^{−1/2} · exp{−Σ_{𝑖=1}^{𝑛} (𝑥_𝑘^𝑖 − 𝑥_𝑘^{𝑖−1})² / (2(𝑡𝑖 − 𝑡_{𝑖−1}))} ]
    = Π_{𝑖=1}^{𝑛} (2𝜋(𝑡𝑖 − 𝑡_{𝑖−1}))^{−𝑑/2} · exp{−Σ_{𝑖=1}^{𝑛} |𝑥𝑖 − 𝑥_{𝑖−1}|² / (2(𝑡𝑖 − 𝑡_{𝑖−1}))}

(with 𝑥0 B 0).

This is invariant under isometries of R𝑑 . Therefore, the law of 𝑑-dimensional Brownian motion
(started at 0) is invariant under isometries of R𝑑 that fix 0. Thus, we really have 𝐸-valued Brownian
motion for finite-dimensional inner-product spaces, 𝐸.
It is easy to check that Blumenthal's 0-1 law and the strong Markov property hold for 𝑑-dimensional Brownian motion, where now a stopping time is defined with respect to the collection of 𝜎-fields

    ℱ𝑡 B 𝜎((𝐵_𝑠¹, …, 𝐵_𝑠^𝑑), 𝑠 6 𝑡).

The proofs are the same but with more notation.

Appendix: The Cameron–Martin Theorem


Let 𝐵 be a Brownian motion. How does adding drift to 𝐵 change its law? Let 𝐺 be the corresponding Gaussian white noise. Let 𝑓 ∈ 𝐿²(R+), and denote 𝐹𝑡 B ∫_0^𝑡 𝑓(𝑠) d𝑠. We will consider the process 𝑋 B 𝐵 + 𝐹. We claim that the law of 𝑋 is absolutely continuous with respect to the law of 𝐵; in fact, the law of 𝑋 is equal to the law of 𝐵 with respect to e^{𝐺(𝑓) − ‖𝑓‖²_{𝐿²}/2} P; this result is due to Cameron and Martin. To be even more explicit, let 𝑊 be Wiener measure. Recall that ∫ 𝑓(𝑠) d𝑤(𝑠) is defined for 𝑊-a.e. 𝑤 as in Proposition 2.3 (iv)⇒(i), using step functions and limits.
Proposition 5.24. For 𝑓 ∈ 𝐿²(R+) and 𝐹𝑡 B ∫_0^𝑡 𝑓(𝑠) d𝑠, we have for all measurable 𝐴 ⊆ 𝐶(R+, R),

    ∫ d𝑊(𝑤) 1_{[𝑤+𝐹 ∈ 𝐴]} = ∫ d𝑊(𝑤) 1_{[𝑤 ∈ 𝐴]} e^{∫ 𝑓 d𝑤 − ‖𝑓‖²/2}.

That is, the P-law of 𝑋 is absolutely continuous with respect to 𝑊, having Radon–Nikodym derivative 𝑤 ↦→ e^{∫ 𝑓 d𝑤 − ‖𝑓‖²/2}.

This allows us to conclude, for example, that every P-a.s. property of 𝐵 also holds for 𝑋.
The class of 𝐹 that are absolutely continuous with derivative in 𝐿 2 (R+ ) and with 𝐹0 = 0 is
known as the Cameron–Martin space, ℋ. It is easy to see that ℋ is dense in 𝐶∗ (R+ , R) := {𝐹 ∈
𝐶 (R+ , R) ; 𝐹0 = 0}. It follows that the support of 𝑊 is all of 𝐶∗ (R+ , R): Indeed, Proposition
5.24 tells us that the support of 𝑊 is unchanged by addition of any function in ℋ, because the
Radon–Nikodym derivative is nonzero 𝑊-a.s. Thus, if 𝑤0 is one point of the support of 𝑊, then
𝑤0 + ℋ also lies in the support. Since the closure of 𝑤0 + ℋ is 𝐶∗ (R+ , R), our claim follows.
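Proposition 5.24 can be sanity-checked in its simplest one-dimensional shadow. With 𝑓 = 𝑐·1_{[0,1]} we have 𝐺(𝑓) = 𝑐𝐵1 and ‖𝑓‖² = 𝑐², and applying the proposition to 𝐴 = {𝑤 ; 𝑤(1) 6 𝑥} gives P[𝐵1 + 𝑐 6 𝑥] = E[1_{[𝐵1 6 𝑥]} e^{𝑐𝐵1 − 𝑐²/2}]. A numerical sketch (NumPy assumed; the constants are ours):

```python
import math
import numpy as np

# Left side via the normal c.d.f.; right side by trapezoid-rule integration
# of exp(c y - c^2/2) against the standard normal density over (-inf, x].
Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
c, x = 0.7, 0.3
lhs = Phi(x - c)                                 # P[B_1 + c <= x]
y = np.linspace(-12.0, x, 200001)                # integration grid
g = np.exp(c * y - c**2 / 2) * np.exp(-y**2 / 2) / math.sqrt(2 * math.pi)
rhs = float(np.sum((g[1:] + g[:-1]) / 2) * (y[1] - y[0]))  # trapezoid rule
print(lhs, rhs)
```

Here the identity is exact Gaussian algebra (the weighted density is that of 𝒩(𝑐, 1)), so the two numbers agree up to quadrature error.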
Proposition 5.24 is actually a very simple consequence of basic manipulations with Gaussian random variables. Consider any (centered) Gaussian space, 𝐻 ⊂ 𝐿²(P), and any nonzero 𝑌 ∈ 𝐻. Define 𝜙 : 𝐻 → 𝐿²(P) by 𝜙(𝑍) B 𝑍 + ⟨𝑍, 𝑌⟩_{𝐿²(P)}. Obviously 𝜙(𝑍) = 𝑍 whenever 𝑍 ⊥ 𝑌, which is the same as 𝑍 ⫫ 𝑌, whereas 𝜙(𝑌) = 𝑌 + ‖𝑌‖²_{𝐿²(P)}. The law of 𝜙(𝑌), i.e., 𝒩(‖𝑌‖², ‖𝑌‖²), is absolutely continuous with respect to that of 𝑌 with Radon–Nikodym derivative 𝑦 ↦→ e^{𝑦 − ‖𝑌‖²/2}. Therefore, if (𝑍1, …, 𝑍𝑛) ∈ 𝑌⊥, then the law of (𝜙(𝑍1), …, 𝜙(𝑍𝑛), 𝜙(𝑌)) also has Radon–Nikodym derivative (𝑧1, …, 𝑧𝑛, 𝑦) ↦→ e^{𝑦 − ‖𝑌‖²/2} with respect to that of (𝑍1, …, 𝑍𝑛, 𝑌). In other words, the P-law of (𝜙(𝑍1), …, 𝜙(𝑍𝑛), 𝜙(𝑌)) is equal to the e^{𝑌 − ‖𝑌‖²/2} P-law of (𝑍1, …, 𝑍𝑛, 𝑌). Since this determines the finite-dimensional distributions of all of 𝐻, we conclude that the P-law of (𝜙(𝑍))_{𝑍∈𝐻} is equal to the e^{𝑌 − ‖𝑌‖²/2} P-law of (𝑍)_{𝑍∈𝐻}.
Coming back to Brownian motion, let us apply this general result to 𝐻 being the image of the Gaussian white noise, 𝐺, with 𝑌 B 𝐺(𝑓). Note that 𝐹𝑡 = ⟨1_{[0,𝑡]}, 𝑓⟩_{𝐿²} = ⟨𝐵𝑡, 𝐺(𝑓)⟩_{𝐿²(P)}. Thus, we are exactly in the situation just analyzed: 𝑋𝑡 = 𝜙(𝐵𝑡). Therefore, the P-law of 𝑋 is the e^{𝐺(𝑓) − ‖𝑓‖²/2} P-law of 𝐵, as claimed.
Exercise. Deduce that 𝑋 is a Brownian motion with respect to Q B e^{−𝐺(𝑓) − ‖𝑓‖²_{𝐿²}/2} P. Alternatively, give a direct proof of this property by showing that 𝑋 is a pre-Brownian motion for the Gaussian white noise 𝐺̃ : 𝐿²(R+) → 𝐿²(Q) defined by 𝐺̃(ℎ) B 𝐺(ℎ) + ⟨ℎ, 𝑓⟩_{𝐿²}.

If 𝐹 is not in the Cameron–Martin space, then the laws of 𝑋 = 𝐵 + 𝐹 and 𝐵 are mutually singular, a result of Segal. This is obvious if 𝐹0 ≠ 0. When 𝐹0 = 0, note that 𝐵𝑡 ↦→ 𝐹𝑡 extends uniquely to a linear functional, 𝛾, on the linear span 𝑉 of {𝐵𝑡 ; 𝑡 > 0}, because the random variables 𝐵𝑡 are linearly independent. This map 𝛾 is bounded, i.e., ∃𝐶 < ∞ such that |𝛾(𝑍)| 6 𝐶‖𝑍‖ for all 𝑍 ∈ 𝑉, iff 𝛾 extends continuously to the closure of 𝑉, which is equivalent to 𝛾(𝑍) = ⟨𝑍, 𝑌⟩ for some 𝑌 ∈ 𝐻, i.e., 𝐹 ∈ ℋ. Thus, if 𝐹 ∉ ℋ, then there exist 𝑍 ∈ 𝑉 of norm 1 with arbitrarily large 𝛾(𝑍). Let Φ be the c.d.f. of the standard normal distribution. For ‖𝑍‖ = 1, we have P[𝑍 > 𝛾(𝑍)/2] = 1 − Φ(𝛾(𝑍)/2) = P[𝑍 + 𝛾(𝑍) 6 𝛾(𝑍)/2]. This leads us to choose 𝑍𝑛 such that ‖𝑍𝑛‖ = 1 and 𝛼𝑛 B 𝛾(𝑍𝑛) satisfies Σ𝑛 (1 − Φ(𝛼𝑛/2)) < ∞. Let 𝜉𝑛 B 𝑍𝑛 + 𝛾(𝑍𝑛). Then 𝑍𝑛 > 𝛼𝑛/2 for only finitely many 𝑛 a.s., whereas 𝜉𝑛 6 𝛼𝑛/2 for only finitely many 𝑛 a.s. The explicit forms of 𝑍𝑛 and 𝜉𝑛 are 𝑍𝑛 = Σ_{𝑖=1}^{𝑘𝑛} 𝑎_{𝑛,𝑖} 𝐵_{𝑡_{𝑛,𝑖}} and 𝜉𝑛 = Σ_{𝑖=1}^{𝑘𝑛} 𝑎_{𝑛,𝑖} 𝑋_{𝑡_{𝑛,𝑖}} for some constants 𝑎_{𝑛,𝑖} and times 𝑡_{𝑛,𝑖}. Thus the laws of 𝑋 and 𝐵 are mutually singular.


Exercise. Let 𝐹 ∈ ℋ with 𝐹′ having bounded variation on [0, 𝑡] for some 𝑡 > 0. Show that

    lim_{𝜀↓0} P[‖𝐵 − 𝐹‖_{𝐿∞[0,𝑡]} 6 𝜀] / P[‖𝐵‖_{𝐿∞[0,𝑡]} 6 𝜀] = exp{−(1/2)‖𝐹′‖²_{𝐿²[0,𝑡]}}.

Note that the denominator here is positive, because 𝑊 has full support. See the discussion of the exercise on page 79 for the value of the denominator; it is asymptotic to (4/𝜋) exp{−𝜋²𝑡/(8𝜀²)} as 𝜀 ↓ 0.
Exercise. For a function 𝐹 : R+ → R, define

    𝑀(𝐹) B sup Σ𝑖 (𝐹(𝑡_{𝑖+1}) − 𝐹(𝑡𝑖))² / (𝑡_{𝑖+1} − 𝑡𝑖),

where the supremum is over all sequences (𝑡𝑖)𝑖 with 0 6 𝑡1 < 𝑡2 < ⋯.
(1) Show that if 𝐹 ∈ ℋ with derivative 𝐹′, then 𝑀(𝐹) 6 ‖𝐹′‖².
(2) Show that if 𝑀(𝐹) < ∞ and (𝑠𝑖, 𝑡𝑖] are disjoint intervals in R+, then

    Σ𝑖 |𝐹(𝑡𝑖) − 𝐹(𝑠𝑖)| 6 (𝑀(𝐹) Σ𝑖 (𝑡𝑖 − 𝑠𝑖))^{1/2},

and deduce that 𝐹 is absolutely continuous.
(3) Show that if 𝑀(𝐹) < ∞, then 𝐹 ∈ ℋ with ‖𝐹′‖² 6 𝑀(𝐹).
We conclude that 𝑀(𝐹) < ∞ iff 𝐹 ∈ ℋ, in which case 𝑀(𝐹) = ‖𝐹′‖².
Exercise. For 𝐹 : R+ → R, let T𝐹 be the function 𝑡 ↦→ 𝑡𝐹(1/𝑡) for 𝑡 > 0 and 0 ↦→ 0. Note that if 𝐹(0) = 0, then TT𝐹 = 𝐹. Let ⟨𝐹, 𝐾⟩ℋ B ∫_0^∞ 𝐹′(𝑡)𝐾′(𝑡) d𝑡 be the natural inner product on ℋ, making ℋ a Hilbert space.
(1) Show that if 𝐹, 𝐾 ∈ ℋ are continuously differentiable with compact support in (0, ∞), then ⟨𝐹, T𝐾⟩ℋ = ⟨T𝐹, 𝐾⟩ℋ.
(2) Show that if 𝐹 ∈ ℋ is continuously differentiable with compact support in (0, ∞), then ‖T𝐹‖ℋ = ‖𝐹‖ℋ.
(3) Show that if 𝐹 ∈ ℋ, then T𝐹 ∈ ℋ with ‖T𝐹‖ℋ = ‖𝐹‖ℋ.
(4) Give another proof of (3) by using the fact that T𝐵 is a Brownian motion, together with Proposition 5.24.
Chapter 3

Filtrations and Martingales

Please review martingales in discrete time, Appendix A2 of the book.


Here, we select just a few things to review. Time is usually N = {0, 1, . . . }. We are given an
increasing sequence (𝒢𝑛 )𝑛∈N of sub-𝜎-fields. For a sequence (𝑌𝑛 )𝑛∈N of integrable random variables
with 𝑌𝑛 ∈ 𝒢𝑛 , e.g., 𝒢𝑛 B 𝜎(𝑌0 , 𝑌1 , . . . , 𝑌𝑛 ), we call (𝑌𝑛 )𝑛∈N
(1) a martingale if E[𝑌𝑛 | 𝒢𝑚 ] = 𝑌𝑚 when 0 6 𝑚 6 𝑛;
(2) a submartingale if E[𝑌𝑛 | 𝒢𝑚 ] > 𝑌𝑚 when 0 6 𝑚 6 𝑛;
(3) a supermartingale if E[𝑌𝑛 | 𝒢𝑚 ] 6 𝑌𝑚 when 0 6 𝑚 6 𝑛.
If 𝑀𝑎𝑏
𝑌 (𝑛) denotes the number of upcrossings by (𝑌 , 𝑌 , . . . , 𝑌 ) of an interval [𝑎, 𝑏] (𝑎 < 𝑏), then
0 1 𝑛
one version of Doob’s upcrossing inequality is that for a supermartingale, (𝑌𝑛 )𝑛∈N ,
 E (𝑌𝑛 − 𝑎) −
 
 𝑌
∀𝑛 ∈ N ∀𝑎 < 𝑏 E 𝑀𝑎𝑏 (𝑛) 6 .
𝑏−𝑎
There is a corresponding version for submartingales.
The maximal inequality given in Appendix A2 can be hard to find, so here is a proof. The inequality states that for a submartingale or supermartingale (𝑌𝑛)_{𝑛∈N},

    ∀𝑘 ∈ N ∀𝜆 > 0   𝜆 P[max_{0 6 𝑛 6 𝑘} |𝑌𝑛| > 𝜆] 6 E[|𝑌0|] + 2 E[|𝑌𝑘|].

We may assume that 𝑌 is a supermartingale, because if not, then −𝑌 is a supermartingale and |−𝑌| = |𝑌|. The desired inequality will follow from adding the following two inequalities:

    ∀𝑘 ∈ N ∀𝜆 > 0   𝜆 P[max_{0 6 𝑛 6 𝑘} 𝑌𝑛 > 𝜆] 6 E[𝑌0] + E[|𝑌𝑘|]

and

    ∀𝑘 ∈ N ∀𝜆 > 0   𝜆 P[min_{0 6 𝑛 6 𝑘} 𝑌𝑛 6 −𝜆] 6 E[|𝑌𝑘|].

Fix 𝑘 ∈ N and 𝜆 > 0. Let 𝑇 B inf{𝑛 ; 𝑌𝑛 > 𝜆} ∧ 𝑘. By the optional stopping theorem, we have

    E[𝑌0] > E[𝑌𝑇] = E[𝑌𝑇 1_{[max_{0 6 𝑛 6 𝑘} 𝑌𝑛 > 𝜆]}] + E[𝑌𝑇 1_{[max_{0 6 𝑛 6 𝑘} 𝑌𝑛 < 𝜆]}] > 𝜆 P[max_{0 6 𝑛 6 𝑘} 𝑌𝑛 > 𝜆] − E[|𝑌𝑘|],

which gives the first inequality. For the second, define 𝑇 B inf{𝑛 ; 𝑌𝑛 6 −𝜆} ∧ 𝑘 and 𝐴 B [min_{0 6 𝑛 6 𝑘} 𝑌𝑛 6 −𝜆]. By the optional stopping theorem, we have

    E[𝑌𝑘] 6 E[𝑌𝑇] = E[𝑌𝑇 1_𝐴] + E[𝑌𝑘 1_{𝐴ᶜ}],

so E[𝑌𝑘 1_𝐴] 6 E[𝑌𝑇 1_𝐴] 6 −𝜆 P(𝐴), whence 𝜆 P(𝐴) 6 −E[𝑌𝑘 1_𝐴] 6 E[|𝑌𝑘|], which gives the second inequality.
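As a sanity check, the maximal inequality can be tested empirically on simple random walk, which is a martingale and hence both a submartingale and a supermartingale (a Monte Carlo sketch; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(4)
k, lam, paths = 100, 12.0, 50000
steps = rng.choice([-1.0, 1.0], size=(paths, k))
Y = np.concatenate([np.zeros((paths, 1)), np.cumsum(steps, axis=1)], axis=1)
lhs = lam * float(np.mean(np.abs(Y).max(axis=1) >= lam))   # λ P[max |Y_n| ≥ λ]
rhs = float(np.mean(np.abs(Y[:, 0]))) + 2 * float(np.mean(np.abs(Y[:, -1])))
print(lhs, rhs)   # empirically, lhs should not exceed rhs = E|Y_0| + 2 E|Y_k|
```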
3.1. Filtrations and Processes


Let (Ω, ℱ, P) be a probability space.
Definition 3.1. A filtration on (Ω, ℱ, P) is a collection (ℱ𝑡)_{0 6 𝑡 6 ∞} of sub-𝜎-fields of ℱ such that ℱ𝑠 ⊆ ℱ𝑡 for 0 6 𝑠 6 𝑡 6 ∞.
We also call (Ω, ℱ, (ℱ𝑡)_{0 6 𝑡 6 ∞}, P) a filtered probability space.
Example. In Chapter 2, we used the filtration associated to Brownian motion

ℱ𝑡 = 𝜎(𝐵 𝑠 , 0 6 𝑠 6 𝑡), ℱ∞ = 𝜎(𝐵 𝑠 , 𝑠 > 0).

Example. More generally, if (𝑋𝑡 )𝑡>0 is any stochastic process, then its canonical filtration is

ℱ𝑡𝑋 B 𝜎(𝑋𝑠 , 0 6 𝑠 6 𝑡), ℱ∞𝑋 B 𝜎(𝑋𝑠 , 𝑠 > 0).

These are not the only filtrations of interest, since there may be other stochastic processes we
want to include, or other randomness.
Similar to ℱ0+ that we considered in Chapter 2, define

    ℱ𝑡+ B ⋂_{𝑠>𝑡} ℱ𝑠,   ℱ∞+ B ℱ∞.

Clearly, (ℱ𝑡 + )06𝑡6∞ is a filtration and ℱ𝑡 ⊆ ℱ𝑡 + . If ℱ𝑡 = ℱ𝑡 + for each 𝑡 > 0, then we say that
(ℱ𝑡 )06𝑡6∞ is right-continuous.

Example. Let (ℱ𝑡 )𝑡>0 be the canonical filtration of a Poisson process, where the process is modified
so as to be left-continuous. Then ℱ𝑡 ≠ ℱ𝑡 + for every 𝑡 > 0.

A filtration (ℱ𝑡 )06𝑡6∞ is complete if ℱ0 contains every subset of each P-negligible set of ℱ∞ .
Every filtration can be completed to a filtration (ℱ𝑡0)06𝑡6∞ , where ℱ𝑡0 B 𝜎(ℱ𝑡 , N ) and N is the
collection of (ℱ∞ , P)-negligible sets (those 𝐴 ⊆ 𝐵 ∈ ℱ∞ with P(𝐵) = 0).
In discrete time, there are no pesky issues of measurability, other than 𝑋𝑛 ∈ 𝒢𝑛 . Now, however,
there are additional issues. We say (𝑋𝑡 )𝑡>0 is adapted to (ℱ𝑡 )06𝑡6∞ if ∀𝑡 > 0 𝑋𝑡 ∈ ℱ𝑡 . We will
want, e.g., to integrate a stochastic process and get a random variable. This requires some joint
measurability. We will also want the result to be an adapted process. These properties will hold
automatically when (𝑋𝑡 ) has continuous sample paths.
Definition 3.2. A process (𝑋𝑡)𝑡>0 with values in a measurable space (𝐸, ℰ) is (jointly) measurable if

    (𝜔, 𝑡) ↦→ 𝑋𝑡(𝜔) : (Ω × R+, ℱ ⊗ ℬ(R+)) → (𝐸, ℰ)

is measurable.

Fix a filtered probability space.


Definition 3.3. A set 𝐴 ⊆ Ω × R+ is called progressively measurable, written 𝐴 ∈ 𝒫, if

    ∀𝑡 > 0   𝐴 ∩ (Ω × [0, 𝑡]) ∈ ℱ𝑡 ⊗ ℬ([0, 𝑡]).

The set 𝒫 is a 𝜎-field, called the progressive 𝜎-field. We call (𝑋𝑡)𝑡>0 progressive if

    (𝜔, 𝑡) ↦→ 𝑋𝑡(𝜔) : (Ω × R+, 𝒫) → (𝐸, ℰ)

is measurable. Equivalently, (𝑋𝑡)𝑡>0 is progressive if for all 𝑡 > 0,

    (𝜔, 𝑠) ↦→ 𝑋𝑠(𝜔) : (Ω × [0, 𝑡], ℱ𝑡 ⊗ ℬ([0, 𝑡])) → (𝐸, ℰ)

is measurable. Note: every progressive process is measurable and adapted.

Exercise (due 10/5). Let (Ω, ℱ, P) B ([0, 1], ℒ, 𝜇), where ℒ is the collection of Lebesgue-measurable sets and 𝜇 is Lebesgue measure. Let ℒ0 B {𝐵 ∈ ℒ ; 𝜇(𝐵) ∈ {0, 1}}. Let ℱ𝑡 B ℒ0 for each 𝑡 ∈ [0, ∞]. Define

    𝐴 B {(𝑥, 𝑥) ; 0 6 𝑥 6 1/2} ⊆ Ω × R+.

Write 𝑋𝑡(𝜔) B 1_𝐴(𝜔, 𝑡) for 𝑡 > 0. Show that (𝑋𝑡)𝑡>0 is a measurable and adapted process, but is not progressive. Hint: show that for each 𝐶 ∈ 𝒫,

    ∫_{[0,1]} 1_𝐶(𝑥, 𝑥) 𝜇(d𝑥) = ∫_{[0,1]²} 1_𝐶(𝑥, 𝑦) 𝜇^{(2)}(d𝑥, d𝑦).

Proposition 3.4. Let 𝐸 be a metric space. Suppose that (𝑋𝑡)𝑡>0 is a stochastic process with values in (𝐸, ℬ(𝐸)) that is adapted and has right-continuous sample paths. Then 𝑋 is progressive. The same holds if "right-continuous" is replaced by "left-continuous".

Proof. The case of right-continuity is in the book, so we do left-continuity. We approximate (𝑋𝑡)𝑡>0 by processes that are easily seen to be progressive and use that the class of progressive processes is closed under pointwise limits (this uses that 𝐸 is a metric space).
For 𝑛 > 1, define 𝑋𝑡ⁿ B 𝑋_{⌊𝑛𝑡⌋/𝑛}; then lim_{𝑛→∞} 𝑋𝑡ⁿ(𝜔) = 𝑋𝑡(𝜔) for all (𝜔, 𝑡). Also, given 𝑡 > 0 and 𝐵 ∈ ℬ(𝐸),

    {(𝜔, 𝑠) ∈ Ω × [0, 𝑡] ; 𝑋𝑠ⁿ(𝜔) ∈ 𝐵} = ⋃_{0 6 𝑘 6 𝑛𝑡} ({𝜔 ; 𝑋_{𝑘/𝑛}(𝜔) ∈ 𝐵} × ([𝑘/𝑛, (𝑘+1)/𝑛) ∩ [0, 𝑡])) ∈ ℱ𝑡 ⊗ ℬ([0, 𝑡]),

whence (𝑋𝑡ⁿ)𝑡>0 is progressive. Since 𝑋ⁿ → 𝑋 pointwise, so is 𝑋. J


3.2. Stopping Times and Associated 𝜎-Fields

Definition 3.5. A random variable 𝑇 : Ω → [0, ∞] is a stopping time of (ℱ𝑡)_{0 6 𝑡 6 ∞} if ∀𝑡 > 0, [𝑇 6 𝑡] ∈ ℱ𝑡. We write

    ℱ𝑇 B {𝐴 ∈ ℱ∞ ; ∀𝑡 > 0   𝐴 ∩ [𝑇 6 𝑡] ∈ ℱ𝑡}

for the 𝜎-field of the past before 𝑇.

As we saw for Brownian motion, a stopping time 𝑇 for (ℱ𝑡 ) also satisfies

∀𝑡 > 0 [𝑇 < 𝑡] ∈ ℱ𝑡 ,

but this is not sufficient for 𝑇 to be a stopping time (example: use the canonical filtration for a
left-continuous Poisson process and let 𝑇 be the time of the first jump).
Since ℱ𝑡 ⊆ ℱ𝑡 + , an (ℱ𝑡 )-stopping time is also an ℱ𝑡 + -stopping time, but not conversely (same
example).
Proposition 3.6. Write 𝒢𝑡 B ℱ𝑡 + for 𝑡 ∈ [0, ∞].
(i) The following are equivalent:
(a) ∀𝑡 > 0 [𝑇 < 𝑡] ∈ ℱ𝑡 ;
(b) 𝑇 is a (𝒢𝑡 )-stopping time;
(c) ∀𝑡 > 0 𝑇 ∧ 𝑡 ∈ ℱ𝑡 .
(ii) If 𝑇 is a (𝒢𝑡)-stopping time, then

    𝒢𝑇 = {𝐴 ∈ ℱ∞ ; ∀𝑡 > 0   𝐴 ∩ [𝑇 < 𝑡] ∈ ℱ𝑡}.

We write ℱ𝑇+ B 𝒢𝑇.

Proof. (i) (a) ⇒ (b): For all 0 6 𝑡 < 𝑠,

    [𝑇 6 𝑡] = ⋂_{𝑞∈(𝑡,𝑠)∩Q} [𝑇 < 𝑞] ∈ ℱ𝑠,

since each [𝑇 < 𝑞] ∈ ℱ𝑞 ⊆ ℱ𝑠; as 𝑠 > 𝑡 was arbitrary, [𝑇 6 𝑡] ∈ 𝒢𝑡.
(b) ⇒ (c): For all 0 < 𝑠 < 𝑡, [𝑇 ∧ 𝑡 6 𝑠] = [𝑇 6 𝑠] ∈ 𝒢𝑠 ⊆ ℱ𝑡, so 𝑇 ∧ 𝑡 ∈ ℱ𝑡.
(c) ⇒ (a): For all 𝑡 > 0,

    [𝑇 < 𝑡] = ⋃_{𝑞∈(0,𝑡)∩Q} [𝑇 6 𝑞] ∈ ℱ𝑡,

since [𝑇 6 𝑞] = [𝑇 ∧ 𝑡 6 𝑞] ∈ ℱ𝑡 for 𝑞 < 𝑡.
(ii) Similar; see the book. J


Here are some easy properties (see the book for proofs):
(a) If 𝑇 is a stopping time, then ℱ𝑇 ⊆ ℱ𝑇 + , with equality when (ℱ𝑡 ) is right-continuous.
(b) If 𝑇 = 𝑡 is constant, then ℱ𝑇 = ℱ𝑡 and ℱ𝑇 + = ℱ𝑡 + .
(c) If 𝑇 is a stopping time, then 𝑇 ∈ ℱ𝑇 .
(d) Let 𝑇 be a stopping time, 𝐴 ∈ ℱ∞, and

    𝑇^𝐴(𝜔) B 𝑇(𝜔) if 𝜔 ∈ 𝐴,   𝑇^𝐴(𝜔) B ∞ if 𝜔 ∉ 𝐴.

Then 𝐴 ∈ ℱ𝑇 if and only if 𝑇^𝐴 is a stopping time.


(e) If 𝑆 6 𝑇 are stopping times, then ℱ𝑆 ⊆ ℱ𝑇 and ℱ𝑆+ ⊆ ℱ𝑇 + .
(f) If 𝑆 and 𝑇 are stopping times, then 𝑆 ∨ 𝑇 and 𝑆 ∧ 𝑇 are stopping times, ℱ𝑆∧𝑇 = ℱ𝑆 ∩ ℱ𝑇 ,
ℱ𝑆∨𝑇 = 𝜎(ℱ𝑆 , ℱ𝑇 ), [𝑆 6 𝑇] ∈ ℱ𝑆∧𝑇 , and [𝑆 = 𝑇] ∈ ℱ𝑆∧𝑇 .
(g) If (𝑆𝑛 )𝑛 is a monotone increasing sequence of stopping times, then lim𝑛→∞ 𝑆𝑛 is a stopping
time.
(h) If (𝑆𝑛) is a monotone decreasing sequence of stopping times, then 𝑆 B lim_{𝑛→∞} 𝑆𝑛 is an (ℱ𝑡+)-stopping time and ℱ𝑆+ = ⋂𝑛 ℱ_{𝑆𝑛+}.

(i) If (𝑆𝑛) is a monotone decreasing sequence of stopping times that is eventually constant (stabilizes), then 𝑆 B lim_{𝑛→∞} 𝑆𝑛 is a stopping time and ℱ𝑆 = ⋂𝑛 ℱ_{𝑆𝑛}.

(j) Let 𝑇 be a stopping time and 𝑌 : [𝑇 < ∞] → 𝐸. Then 𝑌 ∈ ℱ𝑇 if and only if ∀𝑡 > 0, the restriction 𝑌↾[𝑇 6 𝑡] ∈ ℱ𝑡. (Here, we use implicitly the fact that for any measurable space (Ω, ℱ) and any 𝐴 ⊆ Ω, there is an induced 𝜎-field {𝐴 ∩ 𝐹 ; 𝐹 ∈ ℱ} on 𝐴.)
Exercise (due 10/5). Show that ℱ_{𝑆∨𝑇} = 𝜎(ℱ𝑆, ℱ𝑇). Hint: one may use the fact that

    𝐴 = (𝐴 ∩ [𝑆 6 𝑇]) ∪ (𝐴 ∩ [𝑇 6 𝑆]).
Note that the graph of a measurable function is measurable: if 𝑌 : (Ω, ℱ) → (𝐸, ℰ) is measurable, then id ⊗ 𝑌 : (Ω, ℱ) → (Ω × 𝐸, ℱ ⊗ ℰ), defined by 𝜔 ↦→ (𝜔, 𝑌(𝜔)), is measurable.
Here is our first use of progressive measurability:
Theorem 3.7. Let (𝑋𝑡)𝑡>0 be a progressive (𝐸, ℰ)-valued process and 𝑇 be a stopping time. Then 𝜔 ↦→ 𝑋_{𝑇(𝜔)}(𝜔) C 𝑋𝑇(𝜔), defined on [𝑇 < ∞], is ℱ𝑇-measurable.

Proof. By (j) above, it suffices to verify that ∀𝑡 > 0, 𝑋𝑇↾[𝑇 6 𝑡] ∈ ℱ𝑡. Now 𝑋𝑇↾[𝑇 6 𝑡] is the composition of

    𝜔 ↦→ (𝜔, 𝑇(𝜔) ∧ 𝑡) : ([𝑇 6 𝑡], ℱ𝑡) → ([𝑇 6 𝑡] × [0, 𝑡], ℱ𝑡 ⊗ ℬ([0, 𝑡]))

with

    (𝜔, 𝑠) ↦→ 𝑋𝑠(𝜔) : (Ω × [0, 𝑡], ℱ𝑡 ⊗ ℬ([0, 𝑡])) → (𝐸, ℰ).

Both of these are measurable: the first by our observation about graphs and the measurability of 𝑇 ∧ 𝑡 from Proposition 3.6(i); the second by the definition of progressive measurability. J

Note how ℱ𝑇 -measurability dovetails well with progressive measurability. (Actually, it suffices
that 𝑇 be an (ℱ𝑡 + )-stopping time.)
We will need to approximate a stopping time by a stopping time that takes discrete values.
If 𝑇 is a stopping time, 𝑆 6 𝑇, 𝑆 ∈ ℱ𝑇 , then 𝑆 need not be a stopping time. However, 𝑆 > 𝑇
works:
Proposition 3.8. If 𝑇 is a stopping time, 𝑆 > 𝑇, and 𝑆 ∈ ℱ𝑇, then 𝑆 is a stopping time. If 𝑇 is a stopping time and

    𝑇𝑛 B ⌈2ⁿ𝑇⌉/2ⁿ,

then the 𝑇𝑛 are stopping times with 𝑇𝑛 ↓ 𝑇.

Proof. ∀𝑡 > 0, [𝑆 6 𝑡] = [𝑆 6 𝑡] ∩ [𝑇 6 𝑡] ∈ ℱ𝑡 since [𝑆 6 𝑡] ∈ ℱ𝑇; that is, 𝑆 is a stopping time. The remainder follows since 𝑇 ∈ ℱ𝑇, so 𝜎(𝑇) ⊆ ℱ𝑇. J
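The dyadic approximation in Proposition 3.8 rounds up, so that 𝑇𝑛 > 𝑇 (rounding down would in general not give a stopping time). A tiny illustration of 𝑇𝑛 = ⌈2ⁿ𝑇⌉/2ⁿ ↓ 𝑇 (the function name is ours):

```python
import math

def dyadic_approx(T, n):
    """Smallest dyadic rational of level n that is >= T."""
    return math.ceil(2**n * T) / 2**n

T = 0.7
approx = [dyadic_approx(T, n) for n in range(1, 8)]
print(approx)   # nonincreasing, always >= T, converging to T
```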

Our stopping times will be of the following types:


Proposition 3.9. Let (𝑋𝑡 )𝑡>0 be an adapted process with values in a metric space (𝐸, 𝑑). For 𝐴 ⊆ 𝐸,
write
𝑇𝐴 B inf {𝑡 > 0 ; 𝑋𝑡 ∈ 𝐴}.
(i) If the sample paths of 𝑋 are, at each time, either left-continuous or right-continuous and 𝐴 is
open, then 𝑇𝐴 is an (ℱ𝑡 + )-stopping time.
(ii) If the sample paths of 𝑋 are continuous and 𝐴 is closed, then 𝑇𝐴 is a stopping time.

Proof. (i) ∀𝑡 > 0, [𝑇𝐴 < 𝑡] = ⋃_{𝑠∈[0,𝑡)∩Q} [𝑋𝑠 ∈ 𝐴] ∈ ℱ𝑡, so the result follows immediately from Proposition 3.6(i).
(ii) ∀𝑡 > 0, [𝑇𝐴 6 𝑡] = [inf_{𝑠∈[0,𝑡]∩Q} 𝑑(𝑋𝑠, 𝐴) = 0] ∈ ℱ𝑡. J

Exercise (due 10/12). Give an example of an adapted process for each of the following:
(a) 𝑋 is left-continuous and 𝐴 is open, but 𝑇𝐴 is not a stopping time;
(b) 𝑋 is right-continuous and 𝐴 is open, but 𝑇𝐴 is not a stopping time;
(c) 𝑋 is left-continuous and 𝐴 is closed, but 𝑇𝐴 is not a stopping time.
Much more general sets and processes give stopping times under some common restrictions
on the filtration. Namely, suppose that (ℱ𝑡 )𝑡 is right-continuous and complete and that 𝐸 is a
topological space. Let 𝑋 be 𝐸-valued and progressive and 𝐴 ⊂ 𝐸 be Borel. Then both the following
are stopping times: inf {𝑡 > 0 ; 𝑋𝑡 ∈ 𝐴} and inf {𝑡 > 0 ; 𝑋𝑡 ∈ 𝐴}. This is a consequence of the
debut theorem that if 𝐵 ⊆ Ω × R+ is progressive, then 𝜔 ↦→ inf {𝑡 > 0 ; (𝑡, 𝜔) ∈ 𝐵} is a stopping time. The usual proofs of this use analytic sets and capacities; for a more elementary proof, see
Richard F. Bass, “The measurability of hitting times,” Electron. Commun. Probab. 15 (2010),
99–105, with a correction at Electron. Commun. Probab. 16 (2011), 189–191. We will not use
these extensions.

3.3. Continuous-Time Martingales and Supermartingales


Fix a filtered probability space (Ω, ℱ, (ℱ𝑡)𝑡>0, P). Unless otherwise stated, processes in the remainder of the chapter will be R-valued.


Definition 3.10. Let (𝑋𝑡 )𝑡>0 be an adapted process with 𝑋𝑡 ∈ 𝐿 1 for all 𝑡. We say 𝑋 is a martingale
if
0 6 𝑠 6 𝑡 =⇒ E[𝑋𝑡 | ℱ𝑠 ] = 𝑋𝑠 a.s.
If “ = 𝑋𝑠 ” is replaced by “ 6 𝑋𝑠 ” [“ > 𝑋𝑠 ”], we say 𝑋 is a supermartingale [submartingale].
Many examples are from processes, like Brownian motion, that have independent increments, where an R^𝑑-valued process (𝑍𝑡)𝑡>0 has independent increments with respect to (ℱ𝑡) if 𝑍 is adapted and 0 6 𝑠 6 𝑡 ⇒ 𝑍𝑡 − 𝑍𝑠 ⫫ ℱ𝑠. If 𝑍 is R-valued and has this property, then the following hold:
(i) if ∀𝑡 > 0 𝑍𝑡 ∈ 𝐿¹, then 𝑍̃𝑡 B 𝑍𝑡 − E[𝑍𝑡] is a martingale;
(ii) if ∀𝑡 > 0 𝑍𝑡 ∈ 𝐿², then 𝑌𝑡 B 𝑍̃𝑡² − E[𝑍̃𝑡²] is a martingale;
(iii) if 𝜃 ∈ R and ∀𝑡 > 0 E[exp(𝜃𝑍𝑡)] < ∞, then 𝑋𝑡 B e^{𝜃𝑍𝑡}/E[e^{𝜃𝑍𝑡}] is a martingale.
Proof. These are easy to prove. For example, for (ii), when 0 6 𝑠 6 𝑡,

    E[𝑍̃𝑡² | ℱ𝑠] = E[(𝑍̃𝑡 − 𝑍̃𝑠 + 𝑍̃𝑠)² | ℱ𝑠]
    = 𝑍̃𝑠² + 2𝑍̃𝑠 E[𝑍̃𝑡 − 𝑍̃𝑠 | ℱ𝑠] + E[(𝑍̃𝑡 − 𝑍̃𝑠)² | ℱ𝑠]
    = 𝑍̃𝑠² + E[(𝑍̃𝑡 − 𝑍̃𝑠)²]
    = 𝑍̃𝑠² + E[𝑍̃𝑡²] − 2 E[𝑍̃𝑡 𝑍̃𝑠] + E[𝑍̃𝑠²]
    = 𝑍̃𝑠² + E[𝑍̃𝑡²] − E[𝑍̃𝑠²],

using E[𝑍̃𝑡 𝑍̃𝑠] = E[E[𝑍̃𝑡 | ℱ𝑠] 𝑍̃𝑠] = E[𝑍̃𝑠²] in the last step. Subtracting E[𝑍̃𝑡²] from both sides gives E[𝑌𝑡 | ℱ𝑠] = 𝑌𝑠.
For (iii), see the book—it is even shorter. J


Exercise. On the probability space of [0, 1] with Lebesgue measure, define 𝑋𝑡 := (𝑡 + 1)1 [0,1/(𝑡+1)] .
Show that (𝑋𝑡 )𝑡>0 is a martingale (with respect to some filtration).
We will now derive even more martingales from Brownian motion.
Definition 3.11. A (𝑑-dimensional) Brownian motion that has independent increments with respect
to (ℱ𝑡 ) is called a (𝑑-dimensional) (ℱ𝑡 )-Brownian motion.
From the above, if 𝐵 is an (ℱ𝑡)-Brownian motion, started from a fixed real number, then

    𝐵𝑡 ;   𝐵𝑡² − 𝑡 ;   e^{𝜃𝐵𝑡 − 𝜃²𝑡/2}   (𝜃 ∈ R)

are martingales with continuous sample paths. These last are called exponential martingales of Brownian motion.
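The martingale property forces constant expectation; for the exponential martingale, E[e^{𝜃𝐵𝑡 − 𝜃²𝑡/2}] = 1 for every 𝑡, which one can confirm by sampling 𝐵𝑡 ∼ 𝒩(0, 𝑡) (a sketch, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
theta, t = 1.3, 2.0
Bt = rng.normal(0.0, np.sqrt(t), size=10**6)    # samples of B_t
M = np.exp(theta * Bt - theta**2 * t / 2)       # exponential martingale at t
print(float(M.mean()))   # close to 1, since E[exp(θ B_t)] = exp(θ² t / 2)
```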
Here are some more: Suppose 𝑓 ∈ 𝐿²(R+, Leb). Let 𝐺 be a Gaussian white noise on 𝐿²(R+) and define, as we did earlier,

    𝑍𝑡 B ∫_0^𝑡 𝑓(𝑠) d𝐵𝑠 B 𝐺(𝑓 1_{[0,𝑡]}),

where 𝐵𝑠 = 𝐺(1_{[0,𝑠]}). Then 𝑍 has independent increments with respect to the canonical filtration of Brownian motion. In fact, first, 𝑍𝑡 ∈ ℱ𝑡 because one may approximate 𝑓 1_{[0,𝑡]} in 𝐿² by step functions, and second, when 0 6 𝑠 6 𝑡,

    𝑍𝑡 − 𝑍𝑠 = 𝐺(𝑓 1_{[𝑠,𝑡]}) ⊥ 𝐺(ℎ 1_{[0,𝑠]})

for all ℎ ∈ 𝐿²(R+), whence 𝑍𝑡 − 𝑍𝑠 ⫫ ℱ𝑠 by Theorem 1.9. This yields the martingales

    ∫_0^𝑡 𝑓(𝑠) d𝐵𝑠 ;   (∫_0^𝑡 𝑓(𝑠) d𝐵𝑠)² − ∫_0^𝑡 𝑓(𝑠)² d𝑠,

where ∫_0^𝑡 𝑓(𝑠)² d𝑠 = ‖𝑓 1_{[0,𝑡]}‖²_{𝐿²(R+)}, and, since 𝑍𝑡 ∼ 𝒩(0, ∫_0^𝑡 𝑓(𝑠)² d𝑠),

    exp{𝜃 ∫_0^𝑡 𝑓(𝑠) d𝐵𝑠 − (𝜃²/2) ∫_0^𝑡 𝑓(𝑠)² d𝑠}   (𝜃 ∈ R).

The first of these is a Wiener integral process, so it has a modification with continuous sample
paths by the exercise on page 12. Therefore, so do all the rest. This also follows from Theorem 5.6.
If 𝑁 is a Poisson process with parameter 𝜆 and (ℱ𝑡) is its canonical filtration, then we get the martingales

𝑁𝑡 − 𝜆𝑡;  (𝑁𝑡 − 𝜆𝑡)² − 𝜆𝑡;  exp{𝜃𝑁𝑡 − 𝜆𝑡(e^𝜃 − 1)}  (𝜃 ∈ R).

Of course, these cannot be modified to have continuous sample paths.
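A similar Monte Carlo aside (mine, not from the book) for the Poisson case: at time 𝑡 the three processes should have means 0, 0, and 1, using 𝑁𝑡 ∼ Poisson(𝜆𝑡).

```python
import numpy as np

rng = np.random.default_rng(1)
lam, t, theta, n = 3.0, 2.0, 0.4, 200_000

# N_t ~ Poisson(lam * t) for a rate-lam Poisson process.
N_t = rng.poisson(lam * t, n)

m1 = (N_t - lam * t).mean()                                      # target: 0
m2 = ((N_t - lam * t)**2 - lam * t).mean()                       # target: 0
m3 = np.exp(theta * N_t - lam * t * (np.exp(theta) - 1)).mean()  # target: 1
print(m1, m2, m3)
```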
Exercise (due 10/12). Let 𝐵 be an (ℱ𝑡)-Brownian motion. Write 𝑀𝑡(𝜃) := e^{𝜃𝐵𝑡 − 𝜃²𝑡/2}. Show that for each 𝜃 ∈ R and 𝑛 ∈ N, ((d/d𝜃)ⁿ 𝑀𝑡(𝜃))_{𝑡⩾0} is an (ℱ𝑡)-martingale. By using 𝑛 = 3 and 𝜃 = 0, deduce that (𝐵𝑡³ − 3𝑡𝐵𝑡)_{𝑡⩾0} is an (ℱ𝑡)-martingale.
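For the case 𝑛 = 3, one can check the martingale property numerically (my own illustration; the values of 𝑠, 𝑡, 𝑥 are arbitrary): given 𝐵𝑠 = 𝑥, we have 𝐵𝑡 = 𝑥 + 𝑊 with 𝑊 ∼ 𝒩(0, 𝑡 − 𝑠) independent of ℱ𝑠, so the conditional mean of 𝐵𝑡³ − 3𝑡𝐵𝑡 should equal 𝑥³ − 3𝑠𝑥.

```python
import numpy as np

rng = np.random.default_rng(2)
s, t, x, n = 0.7, 2.0, 1.3, 400_000

# Conditionally on B_s = x, the increment W = B_t - B_s is N(0, t - s).
W = rng.normal(0.0, np.sqrt(t - s), n)
B_t = x + W

lhs = (B_t**3 - 3 * t * B_t).mean()  # Monte Carlo E[B_t^3 - 3 t B_t | B_s = x]
rhs = x**3 - 3 * s * x               # value predicted by the martingale property
print(lhs, rhs)
```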
We now give some properties of (sub/super)martingales. The first is proved exactly as in the discrete case:

Proposition 3.12. Let (𝑋𝑡)_{𝑡⩾0} be adapted and 𝑓 : R → R₊ be convex. Suppose that E[𝑓(𝑋𝑡)] < ∞ for all 𝑡 ⩾ 0.
(i) If (𝑋𝑡) is a martingale, then (𝑓(𝑋𝑡))_{𝑡⩾0} is a submartingale.
(ii) If (𝑋𝑡) is a submartingale and 𝑓 is increasing, then (𝑓(𝑋𝑡))_{𝑡⩾0} is a submartingale.

Our next result is trivial in the discrete case:
Proposition 3.13. Let (𝑋𝑡)_{𝑡⩾0} be a submartingale or supermartingale. Then

∀𝑡 ⩾ 0  sup_{0⩽𝑠⩽𝑡} E[|𝑋𝑠|] < ∞.

Proof. By symmetry, it is enough to prove this when (𝑋𝑡) is a submartingale. We use |𝑋𝑠| = 2𝑋𝑠⁺ − 𝑋𝑠. By Proposition 3.12, (𝑋𝑡⁺) is a submartingale, so

0 ⩽ 𝑠 ⩽ 𝑡 ⟹ E[𝑋𝑠⁺] ⩽ E[𝑋𝑡⁺].

Also, E[𝑋𝑠] ⩾ E[𝑋0]. Hence,

E[|𝑋𝑠|] ⩽ 2 E[𝑋𝑡⁺] − E[𝑋0]. ∎
Our next proofs will use the fact that if (𝑋𝑡) is a (sub/super)martingale and 𝑡1 < 𝑡2 < · · · < 𝑡𝑝, then ((𝑋_{𝑡_𝑖}, ℱ_{𝑡_𝑖}))_{1⩽𝑖⩽𝑝} is a discrete-time (sub/super)martingale.
The following points towards quadratic variation. We call a process (𝑋𝑡) square-integrable if 𝑋𝑡 ∈ 𝐿² for all 𝑡.
Proposition 3.14. Let (𝑋𝑡)_{𝑡⩾0} be a square-integrable martingale and 0 ⩽ 𝑡0 < · · · < 𝑡𝑝. Then

E[Σ_{𝑖=1}^{𝑝} (𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}})² | ℱ_{𝑡_0}] = E[𝑋_{𝑡_𝑝}² − 𝑋_{𝑡_0}² | ℱ_{𝑡_0}] = E[(𝑋_{𝑡_𝑝} − 𝑋_{𝑡_0})² | ℱ_{𝑡_0}]

(call the first equality (1) and the second (2)). Hence, the same holds unconditionally.

Proof. This is a type of Pythagorean theorem and depends on orthogonality. For every 𝑖 ∈ [1, 𝑝],

E[(𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}})² | ℱ_{𝑡_0}] = E[E[(𝑋_{𝑡_𝑖} − 𝑋_{𝑡_{𝑖−1}})² | ℱ_{𝑡_{𝑖−1}}] | ℱ_{𝑡_0}]
 = E[E[𝑋_{𝑡_𝑖}² | ℱ_{𝑡_{𝑖−1}}] − 2𝑋_{𝑡_{𝑖−1}} E[𝑋_{𝑡_𝑖} | ℱ_{𝑡_{𝑖−1}}] + 𝑋_{𝑡_{𝑖−1}}² | ℱ_{𝑡_0}]
 = E[E[𝑋_{𝑡_𝑖}² | ℱ_{𝑡_{𝑖−1}}] − 𝑋_{𝑡_{𝑖−1}}² | ℱ_{𝑡_0}]
 = E[𝑋_{𝑡_𝑖}² − 𝑋_{𝑡_{𝑖−1}}² | ℱ_{𝑡_0}].

Now, sum on 𝑖 to get (1). If we take 𝑝 = 1, we get (2). ∎
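The unconditional identity can be illustrated on a discrete-time example (an aside of mine, not from the book): for a ±1 random walk, both sides equal 𝑡𝑝 − 𝑡0 for any choice of sampling times.

```python
import numpy as np

rng = np.random.default_rng(3)
times = [0, 3, 7, 20]                 # 0 = t_0 < t_1 < ... < t_p
n_paths = 100_000

# A +/-1 random walk is a square-integrable martingale.
steps = rng.choice([-1.0, 1.0], size=(n_paths, times[-1]))
X = np.concatenate([np.zeros((n_paths, 1)), steps.cumsum(axis=1)], axis=1)

lhs = sum((X[:, b] - X[:, a])**2 for a, b in zip(times, times[1:])).mean()
rhs = (X[:, times[-1]]**2 - X[:, times[0]]**2).mean()
print(lhs, rhs)   # both should be close to 20
```

Note that the two quantities agree only in expectation, not path by path: the cross terms vanish by orthogonality.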
Next, we have analogues of discrete-time inequalities. Note that if 𝑓 : [0, 𝑡] → R is right-continuous, then sup_{0⩽𝑠⩽𝑡} 𝑓(𝑠) = sup_{𝑠∈[0,𝑡]∩(Q∪{𝑡})} 𝑓(𝑠), whence we obtain measurability of the supremum for a stochastic process with right-continuous sample paths.
Proposition 3.15. Let (𝑋𝑡)_{𝑡⩾0} be a submartingale or supermartingale with right-continuous sample paths.
(i) (Maximal inequality)

∀𝑡 ⩾ 0 ∀𝜆 > 0  𝜆 P[sup_{0⩽𝑠⩽𝑡} |𝑋𝑠| ⩾ 𝜆] ⩽ E[|𝑋0|] + 2 E[|𝑋𝑡|].

(ii) (Doob's 𝐿^𝑝 inequality) If 𝑋 is a martingale, then

∀𝑡 ⩾ 0 ∀𝑝 > 1  E[sup_{0⩽𝑠⩽𝑡} |𝑋𝑠|^𝑝] ⩽ (𝑝/(𝑝 − 1))^𝑝 E[|𝑋𝑡|^𝑝].
Proof. (i) If 𝐷 is a finite set in [0, 𝑡] with 0, 𝑡 ∈ 𝐷, then the discrete-time inequality yields

𝜆 P[sup_{𝑠∈𝐷} |𝑋𝑠| ⩾ 𝜆] ⩽ E[|𝑋0|] + 2 E[|𝑋𝑡|].

Now take 𝐷 to be countable and dense in [0, 𝑡] with 0, 𝑡 ∈ 𝐷 and write 𝐷 as an increasing union of finite sets 𝐷𝑚 with 0, 𝑡 ∈ 𝐷𝑚. Then take the limit in the above inequality for 𝐷𝑚 as 𝑚 → ∞ to get

𝜆 P[sup_{0⩽𝑠⩽𝑡} |𝑋𝑠| > 𝜆] ⩽ E[|𝑋0|] + 2 E[|𝑋𝑡|],

using right-continuity to replace the supremum over 𝐷 by the supremum over [0, 𝑡]. Finally, use this inequality for a sequence 𝜆𝑛 increasing to 𝜆.
(ii) The proof is similar; now we invoke the monotone convergence theorem. ∎
Remark. If we did not assume right-continuity, we would get the same results for sup𝑠∈𝐷 |𝑋𝑠 | for
any countable 𝐷 ⊆ [0, 𝑡]: we may add {0, 𝑡} to 𝐷 if necessary. In particular, by letting 𝜆 → ∞, we
get from (i) that sup𝑠∈𝐷 |𝑋𝑠 | < ∞ almost surely.
We call a function càdlàg or RCLL if it is right-continuous with left limits everywhere.
Exercise (due 10/12). Let (𝑋𝑡)_{𝑡⩾0} be a process with càdlàg sample paths. Let 𝐴𝑡 be the event that 𝑠 ↦ 𝑋𝑠 is continuous for 𝑠 ∈ [0, 𝑡]. Show that 𝐴𝑡 ∈ ℱ𝑡^𝑋 for all 𝑡 ⩾ 0.
As for discrete times, we prove convergence using upcrossings, where for a function 𝑓 : 𝐼 → R (𝐼 ⊆ R) and 𝑎 < 𝑏, the upcrossing number of 𝑓 along [𝑎, 𝑏] is

𝑀_{𝑎𝑏}^{𝑓}(𝐼) := sup{𝑘 ⩾ 0 ; ∃𝑠𝑖, 𝑡𝑖 ∈ 𝐼 with 𝑠1 < 𝑡1 < · · · < 𝑠𝑘 < 𝑡𝑘 and 𝑓(𝑠𝑖) ⩽ 𝑎, 𝑓(𝑡𝑖) ⩾ 𝑏 for 1 ⩽ 𝑖 ⩽ 𝑘}.
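On a finite sequence of sampled values, the upcrossing number can be computed by a greedy left-to-right scan; the helper below is my own illustration of the definition, not from the book.

```python
def upcrossings(values, a, b):
    """Count upcrossings of [a, b]: the largest k for which there are
    s_1 < t_1 < ... < s_k < t_k with values[s_i] <= a and values[t_i] >= b."""
    count = 0
    seeking_low = True
    for v in values:
        if seeking_low:
            if v <= a:
                seeking_low = False    # found the next s_i with f(s_i) <= a
        elif v >= b:
            count += 1                 # found the matching t_i with f(t_i) >= b
            seeking_low = True
    return count

print(upcrossings([0.0, 1.5, 0.7, 1.2], 0, 1))  # prints 1
```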
In Section 3.4, we use this, as in discrete time, to study convergence as 𝑡 → ∞. Here, we study
right and left limits at finite times. We will use the following easy lemma:
Lemma 3.16. Let 𝐷 ⊆ R₊ be dense and countable. Let 𝑓 : 𝐷 → R satisfy, for every 𝑢 ∈ 𝐷,

sup_{𝑡∈𝐷∩[0,𝑢]} |𝑓(𝑡)| < ∞

and

∀𝑎, 𝑏 ∈ Q with 𝑎 < 𝑏  𝑀_{𝑎𝑏}^{𝑓}(𝐷 ∩ [0, 𝑢]) < ∞.

Then for all 𝑡 ⩾ 0, 𝑓(𝑡⁺) := lim_{𝐷∋𝑠↓𝑡} 𝑓(𝑠) exists, and for all 𝑡 > 0, 𝑓(𝑡⁻) := lim_{𝐷∋𝑠↑𝑡} 𝑓(𝑠) exists. In addition, 𝑡 ↦ 𝑓(𝑡⁺) is càdlàg on R₊. ∎

(Note that 𝑡 ↦ 𝑓(𝑡⁺) has left limits by the upcrossing condition.)
In the proof of the next theorem, we will use the fact that a backward (sub/super)martingale ((𝑌𝑛, ℱ𝑛))_{𝑛⩽0} with sup_{𝑛⩽0} E[|𝑌𝑛|] < ∞ is uniformly integrable. (If you haven't seen this, here's a hint to prove it. Note that it does not matter whether the process is a submartingale or a supermartingale. Assume the former. Show that there is a backward martingale (𝑍𝑛)_{𝑛⩽0} and nonnegative random variables 𝑊𝑛 ∈ ℱ_{𝑛−1} such that 𝑌𝑛 = 𝑍𝑛 + 𝑊𝑛 and 𝑊𝑛 ⩾ 𝑊_{𝑛−1}. Use that every backward martingale is uniformly integrable.)

Recall that if a sequence converges almost surely, then it converges in 𝐿¹ if and only if it is uniformly integrable. The same holds in continuous time.
Theorem 3.17. Let (𝑋𝑡)_{𝑡⩾0} be a supermartingale and 𝐷 be a countable dense subset of R₊.
(i) There exists 𝑁 ⊆ Ω with P(𝑁) = 0 such that for all 𝜔 ∉ 𝑁 and all 𝑡 ⩾ 0,

𝑋_{𝑡⁺}(𝜔) := lim_{𝐷∋𝑠↓𝑡} 𝑋𝑠(𝜔) and 𝑋_{𝑡⁻}(𝜔) := lim_{𝐷∋𝑠↑𝑡} 𝑋𝑠(𝜔)

exist.
(ii) For every 𝑡 ∈ R₊, 𝑋_{𝑡⁺} ∈ 𝐿¹ and satisfies

𝑋𝑡 ⩾ E[𝑋_{𝑡⁺} | ℱ𝑡],

with equality if 𝑡 ↦ E[𝑋𝑡] is right-continuous (e.g., if 𝑋 is a martingale). The process (𝑋_{𝑡⁺})_{𝑡⩾0} is indistinguishable from a process that is an (ℱ_{𝑡⁺})-supermartingale and, if 𝑋 is a martingale, an (ℱ_{𝑡⁺})-martingale.
Proof. (i) Fix 𝑢 ∈ 𝐷. We saw in the remark that

sup_{𝑠∈𝐷∩[0,𝑢]} |𝑋𝑠| < ∞ a.s.

For finite 𝐷′ ⊆ 𝐷 ∩ [0, 𝑢], Doob's upcrossing inequality yields

∀𝑎 < 𝑏  E[𝑀_{𝑎𝑏}^{𝑋}(𝐷′)] ⩽ E[(𝑋𝑢 − 𝑎)⁻]/(𝑏 − 𝑎).

Thus, using an increasing sequence of finite subsets of 𝐷 ∩ [0, 𝑢] whose union is 𝐷 ∩ [0, 𝑢] and using the monotone convergence theorem, we get

E[𝑀_{𝑎𝑏}^{𝑋}(𝐷 ∩ [0, 𝑢])] ⩽ E[(𝑋𝑢 − 𝑎)⁻]/(𝑏 − 𝑎) < ∞,

whence 𝑀_{𝑎𝑏}^{𝑋}(𝐷 ∩ [0, 𝑢]) < ∞ almost surely. Let

𝑁 := [∃𝑢 ∈ 𝐷, sup_{𝑠∈𝐷∩[0,𝑢]} |𝑋𝑠| = ∞ or ∃𝑎, 𝑏 ∈ Q with 𝑎 < 𝑏 and 𝑀_{𝑎𝑏}^{𝑋}(𝐷 ∩ [0, 𝑢]) = ∞].

We have seen that P(𝑁) = 0 as a countable union of sets of probability 0. For 𝜔 ∉ 𝑁, we may apply Lemma 3.16 to get (i).
(ii) Fix 𝑡 ∈ R₊. Choose 𝐷 ∋ 𝑡𝑛 ↓ 𝑡 monotonically, so 𝑋_{𝑡⁺} = lim_{𝑛→∞} 𝑋_{𝑡_𝑛} almost surely. If we re-index time, 𝑌𝑛 := 𝑋_{𝑡_{−𝑛}} (𝑛 ⩽ 0), then (𝑌𝑛)_{𝑛⩽0} is a backward supermartingale. By Proposition 3.13,

sup_{𝑛⩽0} E[|𝑌𝑛|] < ∞,

whence 𝑋_{𝑡_𝑛} → 𝑋_{𝑡⁺} in 𝐿¹ by uniform integrability. In particular, 𝑋_{𝑡⁺} ∈ 𝐿¹.

Since 𝐿¹ convergence implies 𝐿¹ convergence of conditional expectations (because of the inequality E[|E[𝑍1 | 𝒢] − E[𝑍2 | 𝒢]|] ⩽ E[E[|𝑍1 − 𝑍2| | 𝒢]] = E[|𝑍1 − 𝑍2|]), we get

𝑋𝑡 ⩾ E[𝑋_{𝑡_𝑛} | ℱ𝑡] → E[𝑋_{𝑡⁺} | ℱ𝑡] in 𝐿¹.

If 𝑠 ↦ E[𝑋𝑠] is right-continuous, then the expectation of the right-hand side equals E[𝑋𝑡], which is the expectation of the left-hand side, whence the two sides agree almost surely.

Now redefine 𝑋_{𝑡⁺} to be lim_{𝐷∋𝑠↓𝑡} 𝑋𝑠 when the limit exists and 0 elsewhere. This changes 𝑋_{𝑡⁺} on a subset of 𝑁, whence it is indistinguishable from its definition in (i). Furthermore, 𝑋_{𝑡⁺} ∈ ℱ_{𝑡⁺} now. Consider 𝑠 < 𝑡 and choose 𝑠𝑛 ↓ 𝑠 and 𝑡𝑛 ↓ 𝑡 with 𝑠𝑛, 𝑡𝑛 ∈ 𝐷 and 𝑠𝑛 ⩽ 𝑡𝑛. To show that 𝑋_{𝑠⁺} ⩾ E[𝑋_{𝑡⁺} | ℱ_{𝑠⁺}], it suffices to show that

∀𝐴 ∈ ℱ_{𝑠⁺}  E[𝑋_{𝑠⁺} 1_𝐴] ⩾ E[E[𝑋_{𝑡⁺} | ℱ_{𝑠⁺}] 1_𝐴] = E[𝑋_{𝑡⁺} 1_𝐴]

(by considering 𝐴 := [𝑋_{𝑠⁺} < E[𝑋_{𝑡⁺} | ℱ_{𝑠⁺}]]). Indeed, 𝐿¹ convergence yields

E[𝑋_{𝑠⁺} 1_𝐴] = lim_{𝑛→∞} E[𝑋_{𝑠_𝑛} 1_𝐴] ⩾ lim_{𝑛→∞} E[𝑋_{𝑡_𝑛} 1_𝐴] = E[𝑋_{𝑡⁺} 1_𝐴]

[since 𝑋_{𝑠_𝑛} ⩾ E[𝑋_{𝑡_𝑛} | ℱ_{𝑠_𝑛}] and 𝐴 ∈ ℱ_{𝑠⁺} ⊆ ℱ_{𝑠_𝑛}].

Thus, (𝑋_{𝑡⁺}) is an (ℱ_{𝑡⁺})-supermartingale. If (𝑋𝑡) had been a submartingale, then we would have concluded that (𝑋_{𝑡⁺}) is an (ℱ_{𝑡⁺})-submartingale, whence if (𝑋𝑡) is a martingale, (𝑋_{𝑡⁺}) is an (ℱ_{𝑡⁺})-martingale. ∎
Exercise (due 10/19). Let (𝑋𝑡)_{𝑡⩾0} be a supermartingale with càdlàg sample paths. Show that 𝑡 ↦ E[𝑋𝑡] is càdlàg.
There is a kind of converse to this exercise:
Theorem 3.18. Let (ℱ𝑡) be right-continuous and complete (often called "the usual conditions"). If (𝑋𝑡) is a supermartingale such that 𝑡 ↦ E[𝑋𝑡] is right-continuous, then (𝑋𝑡) has a modification that is a supermartingale with càdlàg sample paths.
Proof. Consider the modification of (𝑋_{𝑡⁺}) that we used in the proof of Theorem 3.17(ii). We saw there that 𝑋_{𝑡⁺} ∈ ℱ_{𝑡⁺}, which now equals ℱ𝑡. Since 𝑡 ↦ E[𝑋𝑡] is right-continuous, Theorem 3.17(ii) gives 𝑋𝑡 = E[𝑋_{𝑡⁺} | ℱ𝑡] = 𝑋_{𝑡⁺} almost surely. Thus, use (𝑋_{𝑡⁺}) as the modification of (𝑋𝑡); Lemma 3.16 shows that its sample paths are càdlàg. ∎
3.4. Optional Stopping Theorems

Our first two results really belong in Section 3.3. They are about convergence as time 𝑡 → ∞.
Theorem 3.19. Let 𝑋 be a right-continuous submartingale or supermartingale bounded in 𝐿 1 .
Then there exists 𝑋∞ ∈ 𝐿 1 such that lim𝑡→∞ 𝑋𝑡 = 𝑋∞ almost surely.
Proof. We may assume 𝑋 is a supermartingale. Let 𝐷 be a countable dense subset of R₊. In the proof of Theorem 3.17, we saw that

∀𝑠 ∈ R₊ ∀𝑎 < 𝑏  E[𝑀_{𝑎𝑏}^{𝑋}(𝐷 ∩ [0, 𝑠])] ⩽ E[(𝑋𝑠 − 𝑎)⁻]/(𝑏 − 𝑎).

Taking the supremum over 𝑠 and using the monotone convergence theorem yields

E[𝑀_{𝑎𝑏}^{𝑋}(𝐷)] ⩽ (1/(𝑏 − 𝑎)) sup_{𝑠⩾0} E[(𝑋𝑠 − 𝑎)⁻] < ∞ [by hypothesis].

Apply this to 𝑎, 𝑏 ∈ Q to get 𝑀_{𝑎𝑏}^{𝑋}(𝐷) < ∞ a.s. simultaneously for all 𝑎, 𝑏 ∈ Q, whence 𝑋∞ := lim_{𝐷∋𝑡→∞} 𝑋𝑡 exists in [−∞, ∞] almost surely. By Fatou's lemma, we have

E[|𝑋∞|] ⩽ lim inf_{𝐷∋𝑡→∞} E[|𝑋𝑡|] < ∞,

whence 𝑋∞ ∈ 𝐿¹. Finally, right-continuity shows the conclusion. ∎
Whether convergence holds in 𝐿¹ is decided just as in the case of deterministic time:

Definition 3.20. A martingale 𝑋 is closed if there exists 𝑍 ∈ 𝐿¹ such that

∀𝑡 ⩾ 0  𝑋𝑡 = E[𝑍 | ℱ𝑡] a.s.

Theorem 3.21. Let 𝑋 be a right-continuous martingale. The following are equivalent:
(i) 𝑋 is closed;
(ii) 𝑋 is uniformly integrable;
(iii) 𝑋𝑡 converges almost surely and in 𝐿¹ as 𝑡 → ∞.
In this case,

∀𝑡 ⩾ 0  𝑋𝑡 = E[𝑋∞ | ℱ𝑡] a.s.,

where 𝑋∞ := lim_{𝑡→∞} 𝑋𝑡. ∎
Since we are now interested in 𝑋∞, we will define 𝑋𝑇 even where 𝑇 = ∞: If lim_{𝑡→∞} 𝑋𝑡 = 𝑋∞ almost surely and 𝑇 is a [0, ∞]-valued random variable, then we write

𝑋𝑇(𝜔) := 𝑋_{𝑇(𝜔)}(𝜔),

defined almost surely. We saw in Theorem 3.7 that if 𝑋 is progressive and 𝑇 is a stopping time, then 𝑋𝑇 is ℱ𝑇-measurable on [𝑇 < ∞]. If 𝑋 is adapted, then 𝑋∞ ∈ ℱ∞, whence 𝑋𝑇 is ℱ𝑇-measurable on [𝑇 = ∞]. Therefore, if 𝑋 is a right-continuous submartingale or supermartingale and 𝑇 is a stopping time, then 𝑋𝑇 ∈ ℱ𝑇 (by Proposition 3.4).
One of the main reasons martingales are useful is:
Theorem 3.22 (Optional stopping theorem for martingales). Let 𝑋 be a uniformly integrable, right-continuous martingale. Let 𝑆 ⩽ 𝑇 be stopping times. Then 𝑋𝑆, 𝑋𝑇 ∈ 𝐿¹ and

𝑋𝑆 = E[𝑋𝑇 | ℱ𝑆], (∗)
𝑋𝑇 = E[𝑋∞ | ℱ𝑇],
E[𝑋0] = E[𝑋𝑇] = E[𝑋∞].

Remark. This extends to uniformly integrable, right-continuous supermartingales with "⩾" in the conclusions; see Stochastic Calculus and Applications, second edition, by Samuel N. Cohen and Robert J. Elliott, Theorem 5.3.1.
Note: In the case that 𝑆 and 𝑇 are constants, the first equation is the definition of martingale and the rest are from Theorem 3.21.

Proof. We use the approximations of 𝑆 and 𝑇 from Proposition 3.8, now defined also where [𝑆 = ∞] or [𝑇 = ∞]:

𝑆𝑛 := ⌈2ⁿ𝑆⌉/2ⁿ and 𝑇𝑛 := ⌈2ⁿ𝑇⌉/2ⁿ.

These are stopping times that decrease to 𝑆 and 𝑇 and also satisfy 𝑆𝑛 ⩽ 𝑇𝑛. Thus, we may apply the discrete-time version of this theorem to get

∀𝑛  𝑋_{𝑆_𝑛} = E[𝑋_{𝑇_𝑛} | ℱ_{𝑆_𝑛}].

We want to let 𝑛 → ∞ to get Eq. (∗), i.e., that

∀𝐴 ∈ ℱ𝑆  E[1_𝐴 𝑋𝑆] = E[1_𝐴 𝑋𝑇]

(because indeed 𝑋𝑆 ∈ ℱ𝑆). Now, right-continuity yields 𝑋_{𝑆_𝑛} → 𝑋𝑆 and 𝑋_{𝑇_𝑛} → 𝑋𝑇 almost surely and in 𝐿¹, the latter since

𝑋_{𝑆_𝑛} = E[𝑋∞ | ℱ_{𝑆_𝑛}] and 𝑋_{𝑇_𝑛} = E[𝑋∞ | ℱ_{𝑇_𝑛}]

by the discrete-time theorem. In particular, 𝑋𝑆, 𝑋𝑇 ∈ 𝐿¹. Thus, for each 𝐴 ∈ ℱ𝑆 ⊆ ℱ_{𝑆_𝑛}, we have

E[1_𝐴 𝑋_{𝑆_𝑛}] = E[1_𝐴 𝑋_{𝑇_𝑛}],

and letting 𝑛 → ∞ gives E[1_𝐴 𝑋𝑆] = E[1_𝐴 𝑋𝑇]. This shows Eq. (∗). The second equation follows by applying Eq. (∗) to the pair of stopping times 𝑇 ⩽ ∞ (the constant ∞ is a stopping time), and then the third is immediate. ∎
Uniform integrability is a key assumption (e.g., the conclusion fails for the double-or-nothing martingale). To make the theorem easier to use, we have the following two corollaries.
Corollary 3.23. Let 𝑋 be a right-continuous martingale and 𝑆 ⩽ 𝑇 be bounded stopping times. Then 𝑋𝑆, 𝑋𝑇 ∈ 𝐿¹ and 𝑋𝑆 = E[𝑋𝑇 | ℱ𝑆].

Proof. Suppose 𝑇 ⩽ 𝑟 almost surely. Note that (𝑋_{𝑡∧𝑟})_{𝑡⩾0} is a martingale, closed by 𝑋𝑟. Thus, it is uniformly integrable. Since 𝑆 ∧ 𝑟 = 𝑆 and 𝑇 ∧ 𝑟 = 𝑇, the result follows from applying Theorem 3.22 to (𝑋_{𝑡∧𝑟})_{𝑡⩾0}. ∎
A trivial fact is that if 𝑍 ∈ 𝐿¹, then

∀𝑠, 𝑡 ⩾ 0  𝑍 ∈ ℱ𝑠 ⟹ E[𝑍 | ℱ𝑡] = E[𝑍 | ℱ_{𝑡∧𝑠}].

We now replace 𝑠 by a stopping time, 𝑇:

Proposition. If 𝑇 is a stopping time, then for all 𝑍 ∈ 𝐿¹ and all 𝑡 ⩾ 0,

𝑍 ∈ ℱ𝑇 ⟹ E[𝑍 | ℱ𝑡] = E[𝑍 | ℱ_{𝑡∧𝑇}].
Proof. Let 𝑌 := E[𝑍 | ℱ_{𝑡∧𝑇}]. Since 𝑌 ∈ ℱ_{𝑡∧𝑇} ⊆ ℱ𝑡, it suffices to show that

∀𝐴 ∈ ℱ𝑡  E[1_𝐴 𝑍] = E[1_𝐴 𝑌].

Consider 𝐴 = (𝐴 ∩ [𝑇 ⩽ 𝑡]) ∪ (𝐴 ∩ [𝑇 > 𝑡]). Now, 𝑍 1_{[𝑇⩽𝑡]} ∈ ℱ𝑡 (property (j) of stopping times) and, since 𝑍 ∈ ℱ𝑇 and 𝑇 ∈ ℱ𝑇, also 𝑍 1_{[𝑇⩽𝑡]} ∈ ℱ𝑇, whence 𝑍 1_{[𝑇⩽𝑡]} ∈ ℱ𝑡 ∩ ℱ𝑇 = ℱ_{𝑡∧𝑇} (property (f)). Therefore,

𝑍 1_{[𝑇⩽𝑡]} = E[𝑍 1_{[𝑇⩽𝑡]} | ℱ_{𝑡∧𝑇}] = E[𝑍 | ℱ_{𝑡∧𝑇}] 1_{[𝑇⩽𝑡]} = 𝑌 1_{[𝑇⩽𝑡]}

(using 1_{[𝑇⩽𝑡]} ∈ ℱ_{𝑡∧𝑇}), so

E[𝑍 1_𝐴 1_{[𝑇⩽𝑡]}] = E[𝑌 1_𝐴 1_{[𝑇⩽𝑡]}].

Also, 𝐴 ∩ [𝑇 > 𝑡] ∈ ℱ𝑡, and since 𝐴 ∩ [𝑇 > 𝑡] ∩ [𝑇 ⩽ 𝑠] ∈ ℱ𝑠 for all 𝑠 ⩾ 0, also 𝐴 ∩ [𝑇 > 𝑡] ∈ ℱ𝑇, so again 𝐴 ∩ [𝑇 > 𝑡] ∈ ℱ_{𝑡∧𝑇}, whence by definition of 𝑌,

E[𝑍 1_{𝐴∩[𝑇>𝑡]}] = E[𝑌 1_{𝐴∩[𝑇>𝑡]}].

Adding these last two displays gives the result. ∎
We apply this to stopping a process: if 𝑋 is a process and 𝑇 is a stopping time, the stopped process is (𝑋_{𝑡∧𝑇})_{𝑡⩾0}.

Corollary 3.24. Let 𝑋 be a right-continuous martingale and 𝑇 be a stopping time.
(i) The process (𝑋_{𝑡∧𝑇})_{𝑡⩾0} is a martingale.
(ii) If 𝑋 is uniformly integrable, then so is (𝑋_{𝑡∧𝑇})_{𝑡⩾0}, which is closed by 𝑋𝑇:

𝑋_{𝑡∧𝑇} = E[𝑋𝑇 | ℱ𝑡] a.s. (∗)

Proof. (ii) By Theorem 3.22, 𝑋𝑇 ∈ 𝐿¹, so we may apply the proposition to 𝑍 := 𝑋𝑇 to obtain

E[𝑋𝑇 | ℱ𝑡] = E[𝑋𝑇 | ℱ_{𝑡∧𝑇}].

Also, 𝑡 ∧ 𝑇 is a stopping time (property (f)), so Theorem 3.22 gives E[𝑋𝑇 | ℱ_{𝑡∧𝑇}] = 𝑋_{𝑡∧𝑇}. This gives Eq. (∗), which also implies uniform integrability.
(i) Recall that for every 𝑠 ⩾ 0, (𝑋_{𝑡∧𝑠})_{𝑡⩾0} is a uniformly integrable martingale. Applying Eq. (∗) to this process, we get, for 𝑡 ⩽ 𝑠,

𝑋_{𝑡∧𝑇} = 𝑋_{(𝑡∧𝑠)∧𝑇} = E[𝑋_{𝑇∧𝑠} | ℱ𝑡]. ∎
Exercise (due 10/26). Deduce the proposition from Corollary 3.24 for (ℱ𝑡 ) that is right-continuous
and complete.
Exercise. There is a kind of converse to Theorem 3.22. Suppose that 𝑋𝑡 is defined for all 𝑡 ∈ [0, ∞],
including 𝑡 = ∞. Show that if 𝑋 is progressive and for every finite or infinite stopping time, 𝑇, 𝑋𝑇
is integrable with mean 0, then 𝑋 is a uniformly integrable martingale. Hint: consider 𝐴 ∈ ℱ𝑡 and
define 𝑇 := 𝑡1 𝐴 + ∞1 𝐴c to deduce that 𝑋𝑡 = E[𝑋∞ | ℱ𝑡 ].
Applications

Let 𝐵 be a real Brownian motion. For 𝑎 ∈ R, let 𝑇𝑎 := inf{𝑡 ⩾ 0 ; 𝐵𝑡 = 𝑎}.
(a) For 𝑎, 𝑏 > 0, consider 𝑇 := 𝑇_{−𝑎} ∧ 𝑇𝑏 and the stopped martingale 𝑀𝑡 := 𝐵_{𝑡∧𝑇}. Because |𝑀𝑡| ⩽ 𝑎 ∨ 𝑏, 𝑀 is uniformly integrable, whence

0 = E[𝑀0] = E[𝑀𝑇] = E[𝐵𝑇] = 𝑏 P[𝑇𝑏 < 𝑇_{−𝑎}] − 𝑎 P[𝑇_{−𝑎} < 𝑇𝑏].

Since the two probabilities add to 1, we can solve to find

P[𝑇𝑏 < 𝑇_{−𝑎}] = 𝑎/(𝑎 + 𝑏).

We needed only that Brownian motion is a continuous martingale from 0 that leaves (−𝑎, 𝑏).
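The same optional-stopping argument applies verbatim to a ±1 simple random walk, for which P[hit 𝑏 before −𝑎] = 𝑎/(𝑎 + 𝑏) exactly when 𝑎, 𝑏 are integers; the Monte Carlo sketch below (my own aside, not from the book) checks this with 𝑎 = 2, 𝑏 = 3.

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, n_paths = 2, 3, 50_000

hits_b = 0
for _ in range(n_paths):
    pos = 0
    while -a < pos < b:                        # run until the walk leaves (-a, b)
        pos += 1 if rng.random() < 0.5 else -1
    hits_b += (pos == b)

print(hits_b / n_paths)   # should be close to a / (a + b) = 0.4
```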
(b) For 𝑎, 𝑏 > 0 and 𝑇 := 𝑇_{−𝑎} ∧ 𝑇𝑏, consider the martingale 𝑀𝑡 := 𝐵𝑡² − 𝑡. Again, 𝑀_{𝑡∧𝑇} is a martingale, though no longer bounded. Still, we have from the martingale property that

∀𝑡 ⩾ 0  0 = E[𝑀0] = E[𝑀_{𝑡∧𝑇}],

i.e.,

E[𝐵²_{𝑡∧𝑇}] = E[𝑡 ∧ 𝑇].

We may let 𝑡 → ∞ and use the bounded convergence theorem on the left-hand side and the monotone convergence theorem on the right-hand side to obtain

E[𝐵𝑇²] = E[𝑇].

Using (a), we find that E[𝑇] = 𝑎𝑏.
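The identity E[𝑇] = 𝑎𝑏 also holds exactly for a ±1 simple random walk with integer levels, which gives a quick Monte Carlo check (my own aside, assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(5)
a, b, n_paths = 2, 3, 20_000

total_steps = 0
for _ in range(n_paths):
    pos, steps = 0, 0
    while -a < pos < b:                        # exit time of (-a, b)
        pos += 1 if rng.random() < 0.5 else -1
        steps += 1
    total_steps += steps

print(total_steps / n_paths)   # should be close to a * b = 6
```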
(c) For 𝑎 > 0 and 𝜃 > 0, consider the martingale

𝑁𝑡^𝜃 := e^{𝜃𝐵𝑡 − 𝜃²𝑡/2}.

The stopped process, 𝑁^𝜃_{𝑡∧𝑇_𝑎}, takes values in (0, e^{𝜃𝑎}), so is uniformly integrable, whence

1 = E[𝑁0^𝜃] = E[𝑁^𝜃_{𝑇_𝑎}] = e^{𝜃𝑎} E[e^{−𝜃²𝑇_𝑎/2}]. (∗)

Taking 𝜃 := √(2𝜆) gives the Laplace transform of 𝑇𝑎:

E[e^{−𝜆𝑇_𝑎}] = e^{−𝑎√(2𝜆)}  (𝜆 > 0). (3.7)

Note that if we used 𝜃 = −√(2𝜆) in Eq. (∗), we would get a different result. The reason is that when 𝜃 < 0, 𝑁^𝜃 is not uniformly integrable.
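Eq. (3.7) can be checked by simulating a discretized Brownian path (my own aside, not from the book; with step d𝑡 there is a small downward bias from overshoot at the hitting time, so the agreement is only approximate):

```python
import numpy as np

rng = np.random.default_rng(6)
a, lam, dt, t_max, n = 1.0, 0.5, 0.004, 25.0, 4000

pos = np.zeros(n)
hit_time = np.full(n, np.inf)        # first time each path reaches level a
t, sqrt_dt = 0.0, np.sqrt(dt)
while t < t_max and np.isinf(hit_time).any():
    t += dt
    pos += rng.normal(0.0, sqrt_dt, n)       # Gaussian increments of variance dt
    newly_hit = np.isinf(hit_time) & (pos >= a)
    hit_time[newly_hit] = t

# Paths that have not hit by t_max contribute essentially nothing to the mean.
estimate = np.exp(-lam * hit_time).mean()    # exp(-inf) = 0 for unhit paths
target = np.exp(-a * np.sqrt(2 * lam))       # formula (3.7)
print(estimate, target)
```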
Exercise (due 10/26). Exercise 3.26.
Exercise (due 10/26). For 𝑥 ∈ R^𝑑 and 𝑅 > |𝑥|, let 𝑇_{𝑑,𝑅} := inf{𝑡 ⩾ 0 ; |𝐵𝑡| = 𝑅}, where (𝐵𝑡)_{𝑡⩾0} is a 𝑑-dimensional Brownian motion started from 𝑥. Show that

E[𝑇_{𝑑,𝑅}] = (𝑅² − |𝑥|²)/𝑑.

Extra credit: compute the Laplace transform of 𝑇_{𝑑,𝑅}.
Exercise. Let 𝐵 be 1-dimensional Brownian motion and 𝑇 B 𝑇−𝑎 ∧ 𝑇𝑏 for 𝑎, 𝑏 > 0. Use an earlier
martingale to compute E[𝑇 𝐵𝑇 ].
We’ll later need the continuous-time analogue of the discrete-time optional stopping theorem
for nonnegative supermartingales.
Theorem 3.25. Let 𝑋 be a nonnegative right-continuous supermartingale and 𝑆 ⩽ 𝑇 be stopping times. Then 𝑋𝑆, 𝑋𝑇 ∈ 𝐿¹ and

𝑋𝑆 ⩾ E[𝑋𝑇 | ℱ𝑆].

Also,

E[𝑋𝑆] ⩾ E[𝑋𝑇]

and

E[𝑋𝑆 1_{[𝑆<∞]}] ⩾ E[𝑋𝑇 1_{[𝑆<∞]}] ⩾ E[𝑋𝑇 1_{[𝑇<∞]}].
Proof. The strategy of proof for the martingale case (Theorem 3.22) mostly works, but now we need some extra arguments. We are not assuming uniform integrability, and even for nonnegative martingales, equality need not hold in the conclusion.

We first claim that if 𝑇 is bounded, then E[𝑋𝑆] ⩾ E[𝑋𝑇]. Let 𝑆𝑛 := ⌈2ⁿ𝑆⌉/2ⁿ and 𝑇𝑛 := ⌈2ⁿ𝑇⌉/2ⁿ. Right-continuity ensures that 𝑋_{𝑆_𝑛} → 𝑋𝑆 and 𝑋_{𝑇_𝑛} → 𝑋𝑇 as 𝑛 → ∞. The optional stopping theorem in discrete time for bounded stopping times gives

∀𝑛 ⩾ 0  𝑋_{𝑆_{𝑛+1}} ⩾ E[𝑋_{𝑆_𝑛} | ℱ_{𝑆_{𝑛+1}}]

(note 𝑆_{𝑛+1} ⩽ 𝑆𝑛). This means that ((𝑋_{𝑆_{−𝑛}}, ℱ_{𝑆_{−𝑛}}))_{𝑛⩽0} is a backward supermartingale. The optional stopping theorem also yields E[𝑋_{𝑆_𝑛}] ⩽ E[𝑋0], so this backward supermartingale is 𝐿¹-bounded, whence it converges in 𝐿¹ to 𝑋𝑆. Likewise, 𝑋_{𝑇_𝑛} → 𝑋𝑇 in 𝐿¹.

Since 𝑆𝑛 ⩽ 𝑇𝑛, the optional stopping theorem also implies that

E[𝑋_{𝑆_𝑛}] ⩾ E[𝑋_{𝑇_𝑛}].

Taking 𝑛 → ∞ gives the claim.

Now, we prove the theorem. For any 𝑚 ⩾ 0, we may apply the first part to the bounded stopping times 0 ⩽ 𝑆 ∧ 𝑚 to get E[𝑋_{𝑆∧𝑚}] ⩽ E[𝑋0]. Fatou's lemma then yields E[𝑋𝑆] ⩽ E[𝑋0] < ∞, and similarly 𝑋𝑇 ∈ 𝐿¹.

Property (d) of stopping times says that for 𝐴 ∈ ℱ𝑆,

𝑆^𝐴(𝜔) := 𝑆(𝜔) for 𝜔 ∈ 𝐴, and 𝑆^𝐴(𝜔) := ∞ for 𝜔 ∉ 𝐴

defines a stopping time. Likewise, 𝑇^𝐴 is a stopping time because ℱ𝑆 ⊆ ℱ𝑇. Applying the first part of the proof to the bounded stopping times 𝑆^𝐴 ∧ 𝑚 ⩽ 𝑇^𝐴 ∧ 𝑚 gives

∀𝑚 ⩾ 0  E[𝑋_{𝑆^𝐴∧𝑚}] ⩾ E[𝑋_{𝑇^𝐴∧𝑚}].

Now if 𝑆 > 𝑚, then 𝑇 > 𝑚, so

𝑋_{𝑆^𝐴∧𝑚} 1_{[𝑆>𝑚]} = 𝑋_{𝑇^𝐴∧𝑚} 1_{[𝑆>𝑚]},

whence

E[𝑋𝑆 1_{𝐴∩[𝑆⩽𝑚]}] ⩾ E[𝑋_{𝑇∧𝑚} 1_{𝐴∩[𝑆⩽𝑚]}].

Apply the monotone convergence theorem to the left-hand side and Fatou's lemma to the right-hand side to obtain

E[𝑋𝑆 1_{𝐴∩[𝑆<∞]}] ⩾ E[𝑋𝑇 1_{𝐴∩[𝑆<∞]}].

Since 𝑋𝑆 1_{𝐴∩[𝑆=∞]} = 𝑋𝑇 1_{𝐴∩[𝑆=∞]}, we get

E[𝑋𝑆 1_𝐴] ⩾ E[𝑋𝑇 1_𝐴] = E[E[𝑋𝑇 | ℱ𝑆] 1_𝐴].

Therefore, 𝑋𝑆 ⩾ E[𝑋𝑇 | ℱ𝑆]. ∎
Exercise. Prove that if 𝑋 is a nonnegative right-continuous supermartingale and 𝜆 > 0, then

𝜆 P[sup_{𝑡⩾0} 𝑋𝑡 ⩾ 𝜆] ⩽ E[𝑋0].
Chapter 4

Continuous Semimartingales

A semimartingale is by definition the sum of a local martingale and a finite-variation process, both of which we define and study here. The next chapter studies stochastic integration with respect to continuous semimartingales. A key process studied here will be the quadratic variation process of a continuous local martingale.
4.1. Finite-Variation Processes

4.1.1. Functions with Finite Variation

This is a review from real analysis. Let 𝐼 ⊆ R be an interval. We say 𝑎 : 𝐼 → R has finite (or bounded) variation if

sup{Σ_{𝑖=1}^{𝑝} |𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})| ; 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 ∈ 𝐼} < ∞. (∗)

This is equivalent to the property that there exists a signed Borel measure 𝜇 on 𝐼 such that

∀𝑠 < 𝑡 ∈ 𝐼  𝑎(𝑡) − 𝑎(𝑠) = 𝜇((𝑠, 𝑡]).

Such a 𝜇 is uniquely determined by 𝑎. A signed measure 𝜇 has a Hahn–Jordan decomposition 𝜇 = 𝜇⁺ − 𝜇⁻, where 𝜇⁺, 𝜇⁻ ⩾ 0 and 𝜇⁺ ⊥ 𝜇⁻. We write |𝜇| := 𝜇⁺ + 𝜇⁻. A function of finite variation is thus the difference of two bounded increasing functions, and conversely.
variation is thus the difference of two bounded increasing functions and conversely.
For 𝑓 ∈ 𝐿 1 (𝜇), we write
∫ ∫ ∫ ∫
𝑓 d𝑎 B 𝑓 d𝜇 and 𝑓 |d𝑎| B 𝑓 |d𝜇|.
𝐼 𝐼 𝐼 𝐼

We have ∫ ∫
𝑓 d𝑎 6 | 𝑓 | |d𝑎|.
𝐼 𝐼

Furthermore, the function 𝑡 ↦→ 𝑓 |d𝑎| on 𝐼 has finite variation, represented by the



𝐼∩(−∞,𝑡]
measure 𝑓 · 𝜇.
Proposition 4.2. If 𝑎 has finite variation on [𝑠, 𝑡], then

∫_𝑠^𝑡 |d𝑎| = the supremum in Eq. (∗)
 = lim_{𝜀↓0} {Σ_{𝑖=1}^{𝑝} |𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})| ; 𝑠 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 = 𝑡, max_𝑖 |𝑡𝑖 − 𝑡_{𝑖−1}| < 𝜀}. ∎
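A small numerical illustration of the limit (mine, not from the book): for 𝑎(𝑡) = sin 𝑡 on [0, 2𝜋], the total variation is 4, and uniform-partition sums approach it as the mesh shrinks.

```python
import math

def variation_sum(a, s, t, n):
    """Sum of |a(t_i) - a(t_{i-1})| over the uniform partition of [s, t]
    into n pieces; by Proposition 4.2 this tends to the total variation."""
    pts = [s + (t - s) * i / n for i in range(n + 1)]
    return sum(abs(a(pts[i]) - a(pts[i - 1])) for i in range(1, n + 1))

for n in (3, 10, 50, 1000):
    print(n, variation_sum(math.sin, 0.0, 2 * math.pi, n))
```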
We also have:

Lemma 4.3. If 𝑎 has finite variation on [𝑠, 𝑡] and 𝑓 : [𝑠, 𝑡] → R is continuous, then

∫_𝑠^𝑡 𝑓 d𝑎 = lim_{𝜀↓0} {Σ_{𝑖=1}^{𝑝} 𝑓(𝑡_{𝑖−1})(𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})) ; 𝑠 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 = 𝑡, max_𝑖 |𝑡𝑖 − 𝑡_{𝑖−1}| < 𝜀}. ∎

Corollary. If 𝑎 has finite variation on [𝑠, 𝑡] and 𝑓 : [𝑠, 𝑡] → R is continuous, then

∫_𝑠^𝑡 𝑓 |d𝑎| = lim_{𝜀↓0} {Σ_{𝑖=1}^{𝑝} 𝑓(𝑡_{𝑖−1}) |𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})| ; 𝑠 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 = 𝑡, max_𝑖 |𝑡𝑖 − 𝑡_{𝑖−1}| < 𝜀}.

We will not use this, so we skip the proof.
We call a function 𝑎 : R₊ → R a finite-variation function if the restriction of 𝑎 to 𝐼 has finite variation for all bounded intervals 𝐼 ⊆ R₊. In this case, there is a 𝜎-finite positive measure 𝜇 such that

∀𝑠 < 𝑡 ∈ R₊  ∫_{(𝑠,𝑡]} |d𝑎| = 𝜇((𝑠, 𝑡]).

For 𝑓 ∈ 𝐿¹(𝜇), we write

∫_0^∞ 𝑓 d𝑎 := lim_{𝑡→∞} ∫_0^𝑡 𝑓 d𝑎.
4.1.2. Finite-Variation Processes

Fix a filtered probability space.

Definition 4.4. A process 𝐴 is a finite-variation process if 𝐴 is adapted, all its sample paths are finite-variation functions on R₊, and, for us, 𝐴0 = 0 and all sample paths are continuous. If also the sample paths are increasing, then 𝐴 is an increasing process.

If 𝐴 is a finite-variation process, then the process

𝑉𝑡 := ∫_0^𝑡 |d𝐴𝑠|

is an increasing process: adaptedness follows from Proposition 4.2. As for the case of functions, 𝐴 is the difference of two increasing processes:

𝐴𝑡 = (𝑉𝑡 + 𝐴𝑡)/2 − (𝑉𝑡 − 𝐴𝑡)/2.

Integration with respect to a finite-variation process can be done pointwise, but we need something to guarantee the result will be adapted:
Proposition 4.5. Let 𝐴 be a finite-variation process and 𝐻 be a progressive process satisfying

∀𝑡 ⩾ 0 ∀𝜔 ∈ Ω  ∫_0^𝑡 |𝐻𝑠(𝜔)| |d𝐴𝑠(𝜔)| < ∞

(integration with respect to 𝑠, not 𝜔). Then the process 𝐻 · 𝐴 defined by

(𝐻 · 𝐴)𝑡 := ∫_0^𝑡 𝐻𝑠 d𝐴𝑠

is a finite-variation process.
Proof. We already saw that 𝐻 · 𝐴 has sample paths that are finite-variation functions, so it remains to check that 𝐻 · 𝐴 is adapted. Recall that for all 𝑡, 𝐻 : Ω × [0, 𝑡] → R is measurable with respect to ℱ𝑡 ⊗ ℬ([0, 𝑡]). Thus, it suffices to show that if

ℎ : (Ω × [0, 𝑡], ℱ𝑡 ⊗ ℬ([0, 𝑡])) → (R, ℬ(R))

and

∀𝜔 ∈ Ω  ∫_0^𝑡 |ℎ(𝜔, 𝑠)| |d𝐴𝑠(𝜔)| < ∞,

then

(𝜔 ↦ ∫_0^𝑡 ℎ(𝜔, 𝑠) d𝐴𝑠(𝜔)) ∈ ℱ𝑡.

This is like Fubini's theorem. We start with ℎ of the form ℎ(𝜔, 𝑠) = 1_Γ(𝜔) 1_{(𝑢,𝑣]}(𝑠) for Γ ∈ ℱ𝑡 and 0 ⩽ 𝑢 < 𝑣 ⩽ 𝑡. In this case,

∫_0^𝑡 ℎ(𝜔, 𝑠) d𝐴𝑠(𝜔) = 1_Γ(𝜔)(𝐴𝑣(𝜔) − 𝐴𝑢(𝜔)) ∈ ℱ𝑡,

since 1_Γ ∈ ℱ𝑡, 𝐴𝑣 ∈ ℱ𝑣 ⊆ ℱ𝑡, and 𝐴𝑢 ∈ ℱ𝑢 ⊆ ℱ𝑡. The class of such Γ × (𝑢, 𝑣] is closed under finite intersections, so forms a 𝜋-system. Also, the class of 𝐺 ∈ ℱ𝑡 ⊗ ℬ([0, 𝑡]) such that ℎ = 1_𝐺 satisfies the conclusion is closed under complements and countable disjoint unions, so forms a 𝜆-system. Therefore, the class is exactly ℱ𝑡 ⊗ ℬ([0, 𝑡]). Taking a limit of simple functions dominated by |ℎ| gives the desired result. ∎
Suppose we have only

∃ negligible 𝑁 ⊆ Ω ∀𝜔 ∉ 𝑁 ∀𝑡 ⩾ 0  ∫_0^𝑡 |𝐻𝑠(𝜔)| |d𝐴𝑠(𝜔)| < ∞.

If the filtration is complete, then we can define 𝐻′ · 𝐴, where

𝐻′𝑡(𝜔) := 𝐻𝑡(𝜔) if 𝜔 ∉ 𝑁, and 𝐻′𝑡(𝜔) := 0 if 𝜔 ∈ 𝑁.

Because of completeness, 𝐻′ is progressive. We then define 𝐻 · 𝐴 := 𝐻′ · 𝐴.
If 𝐻 and 𝐾 are progressive, then so is 𝐻𝐾, where (𝐻𝐾)𝑡 := 𝐻𝑡𝐾𝑡. If 𝐻 and 𝐻𝐾 satisfy the integrability condition of Proposition 4.5, then

𝐾 · (𝐻 · 𝐴) = (𝐾𝐻) · 𝐴

because for functions and measures, 𝑘(ℎ𝜇) = (𝑘ℎ)𝜇.

In the simple case 𝐴𝑡 ≡ 𝑡, if 𝐻 is progressive with

∀𝜔 ∈ Ω ∀𝑡 ⩾ 0  ∫_0^𝑡 |𝐻𝑠(𝜔)| d𝑠 < ∞,

we obtain a finite-variation process ∫_0^𝑡 𝐻𝑠 d𝑠.
4.2. Continuous Local Martingales

For a process 𝑋 = (𝑋𝑡)_{𝑡⩾0} and a stopping time 𝑇, we write 𝑋^𝑇 := (𝑋_{𝑡∧𝑇})_{𝑡⩾0} for the process 𝑋 stopped at 𝑇. Note that if 𝑆 is also a stopping time, then

(𝑋^𝑇)^𝑆 = 𝑋^{𝑇∧𝑆} = (𝑋^𝑆)^𝑇.

Recall from Corollary 3.24 that if 𝑋 is a martingale and 𝑇 is a bounded stopping time, then 𝑋^𝑇 is a uniformly integrable martingale. But there are other processes that have this property besides martingales.

The following definition is analogous to local integrability on R₊, except that instead of the deterministic intervals [0, 𝑡𝑛], it uses the random intervals [0, 𝑇𝑛].

Definition 4.6. An adapted process 𝑀 with continuous sample paths and 𝑀0 = 0 a.s. is called a continuous local martingale if there exist stopping times 𝑇1 ⩽ 𝑇2 ⩽ · · · → ∞ such that for all 𝑛, 𝑀^{𝑇𝑛} is a uniformly integrable martingale. If we do not assume 𝑀0 = 0 a.s. but (𝑀𝑡 − 𝑀0)_{𝑡⩾0} satisfies the preceding condition, then we still call 𝑀 a continuous local martingale. Stopping times 𝑇𝑛 that witness the definition are said to reduce 𝑀.
One need not assume sample paths are continuous in order to define local martingales, but we will. Note that it is not assumed that 𝑀𝑡 ∈ 𝐿¹; in particular, 𝑀0 need only be ℱ0-measurable. To distinguish martingales from local martingales, we may speak of true martingales. Some examples of the difference:

Example. Let 𝐵 be an (ℱ𝑡)-Brownian motion from 0 and 𝑍 ∈ ℱ0 with 𝑍 ∉ 𝐿¹. Then 𝑀𝑡 := 𝑍𝐵𝑡 is a continuous local martingale but not a true martingale, by Exercise 4.22.

Exercise (due 11/2). Exercise 4.22.

Example. Let 𝐵 be a real Brownian motion and 𝑇 := inf{𝑡 ⩾ 0 ; 𝐵𝑡 = −1}. Define

𝑋𝑡 := 𝐵_{(𝑡/(1−𝑡)) ∧ 𝑇} if 𝑡 < 1, and 𝑋𝑡 := −1 if 𝑡 ⩾ 1.

By Corollary 3.24(i), 𝐵^𝑇 is a martingale. We claim that 𝑋 is a continuous local martingale with respect to (ℱ𝑡^𝑋)_{𝑡⩾0} but not a martingale. To see this, let

𝑇𝑛 := inf{𝑡 ⩾ 0 ; 𝑋𝑡 = 𝑛}.

These are stopping times that increase to infinity. We want to show that 0 ⩽ 𝑠 < 𝑡 implies

𝑋_{𝑠}^{𝑇_𝑛} = E[𝑋_{𝑡}^{𝑇_𝑛} | ℱ𝑠^𝑋].

Write

𝜑(𝑠) := 𝑠/(1 − 𝑠) if 𝑠 < 1, and 𝜑(𝑠) := ∞ if 𝑠 ⩾ 1.

Since 𝑋𝑡 = 𝐵^𝑇_{𝜑(𝑡)}, we have

𝑋_{𝑠}^{𝑇_𝑛} = 𝐵^{𝜑(𝑇_𝑛)∧𝑇}_{𝜑(𝑠)} and ℱ𝑠^𝑋 = ℱ^𝐵_{𝜑(𝑠)∧𝑇},

so the desired equation is

𝐵^{𝜑(𝑇_𝑛)∧𝑇}_{𝜑(𝑠)} = E[𝐵^{𝜑(𝑇_𝑛)∧𝑇}_{𝜑(𝑡)} | ℱ^𝐵_{𝜑(𝑠)∧𝑇}].

Since 𝜑(𝑇𝑛) is an (ℱ•^𝐵)-stopping time and 𝐵^{𝜑(𝑇_𝑛)∧𝑇} is a bounded martingale, the equation follows from the optional stopping theorem. To see that 𝑋 is not a true martingale, we note that 𝑋0 = 0 ≠ −1 = E[𝑋1]. In effect, 𝐵^𝑇 is not closed, but is still a martingale.

Some properties of continuous local martingales:

(b) If 𝑀 is a continuous adapted process with 𝑀0 = 0 and 𝑇𝑛 are increasing stopping times going to infinity with each 𝑀^{𝑇𝑛} a martingale, then 𝑀 is a continuous local martingale: we may replace 𝑇𝑛 by 𝑇𝑛 ∧ 𝑛 to get stopping times that reduce 𝑀, because (𝑀^{𝑇𝑛})^𝑛 is a uniformly integrable martingale by Theorem 3.21.
(c) If 𝑀 is a continuous local martingale and 𝑇 is any stopping time, then 𝑀^𝑇 is a continuous local martingale, because if 𝑇𝑛 reduce 𝑀, then (𝑀^𝑇)^{𝑇𝑛} = (𝑀^{𝑇𝑛})^𝑇 and Corollary 3.24(ii) shows that 𝑇𝑛 reduce 𝑀^𝑇.
(d) Similarly, if 𝑇𝑛 reduce 𝑀 and 𝑆𝑛 → ∞ are stopping times, then 𝑇𝑛 ∧ 𝑆𝑛 reduce 𝑀.
(e) If 𝑇𝑛 reduce 𝑀 and 𝑇𝑛′ reduce 𝑀′, then 𝑇𝑛 ∧ 𝑇𝑛′ reduce both 𝑀 and 𝑀′ by (d), whence they also reduce 𝑀 + 𝑀′ (the sum of two uniformly integrable classes is uniformly integrable). Thus, the space of continuous local martingales is a vector space.
Exercise (due 11/9). Show that if 𝑋 is a discrete-time adapted process in 𝐿¹ and 𝑇𝑛 are stopping times going to infinity such that 𝑋^{𝑇𝑛} is a martingale for each 𝑛, then 𝑋 is a martingale.
If 𝑀 is a continuous local martingale reduced by (𝑇𝑛) and 𝑀0 ∈ 𝐿¹, then 𝑀^{𝑇𝑛} is a uniformly integrable martingale, since adding an 𝐿¹ function to a uniformly integrable class results in a uniformly integrable class.
Proposition 4.7. Let 𝑀 be a continuous local martingale with 𝑀0 ∈ 𝐿¹.
(i) If 𝑀 ⩾ 0, then 𝑀 is a supermartingale.
(ii) If 𝑀 is dominated (i.e., ∃𝑍 ∈ 𝐿¹ with |𝑀𝑡| ⩽ 𝑍 for all 𝑡 ⩾ 0), then 𝑀 is a uniformly integrable martingale.
(iii) 𝑀 is reduced by

𝑇𝑛 := inf{𝑡 ⩾ 0 ; |𝑀𝑡| ⩾ 𝑛 + |𝑀0|}.

Proof. (i) Let 𝑇𝑛 reduce 𝑀. Then E[𝑀0] = E[𝑀_{𝑡∧𝑇𝑛}] for 𝑛 ⩾ 0. By Fatou's lemma,

E[𝑀𝑡] ⩽ E[𝑀0] < ∞.

Furthermore,

𝑠 ⩽ 𝑡 ⟹ ∀𝑛 𝑀_{𝑠∧𝑇𝑛} = E[𝑀_{𝑡∧𝑇𝑛} | ℱ𝑠]. (∗)

By Fatou's lemma for conditional expectation, we get

𝑀𝑠 ⩾ E[𝑀𝑡 | ℱ𝑠].

(ii) Combined with Eq. (∗), the Lebesgue dominated convergence theorem implies that

𝑀𝑠 = E[𝑀𝑡 | ℱ𝑠].

(iii) Proposition 3.9 shows that the 𝑇𝑛 are stopping times. By property (c) of local martingales, 𝑀^{𝑇𝑛} is a continuous local martingale. Since it is dominated by 𝑛 + |𝑀0|, part (ii) shows that it is a uniformly integrable martingale, as required. ∎
It is not true that a uniformly integrable continuous local martingale is necessarily a martingale, even if it is bounded in 𝐿². A natural example appears in Exercise 5.33 (which was historically the first example, due to Johnson and Helms in 1963; two years later, Itô and Watanabe introduced local martingales).

Recall from Corollary 2.17 that Brownian motion has infinite variation on every non-trivial interval. This came from the fact that the quadratic variation is positive on every interval (indeed, it equals the length of the interval), which was a simple consequence of Brownian motion coming from Gaussian white noise. Every continuous local martingale 𝑀 also has infinite variation on every interval where it is "changing". To prove this, we could try to use Proposition 3.14: if 𝑀 is a square-integrable martingale, then for 0 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝, we have

E[𝑀_{𝑡_𝑝}² − 𝑀_{𝑡_0}²] = E[Σ_{𝑖=1}^{𝑝} (𝑀_{𝑡_𝑖} − 𝑀_{𝑡_{𝑖−1}})²].

However, 𝑀 need not even be a martingale, nor square-integrable. In addition, we don't have almost sure convergence of the quadratic variation to something greater than 0 (though we will prove convergence in probability in the next section). Instead, localization will allow us to get a proof, using a proper stopping time derived from the assumption, to the contrary, that the variation is finite.
Theorem 4.8. Let 𝑀 be a continuous local martingale that is also a finite-variation process. Then P[∀𝑡 ⩾ 0 𝑀𝑡 = 0] = 1.

Proof. Since 𝑡 ↦ ∫_0^𝑡 |d𝑀𝑠| is an increasing process, for 𝑛 ∈ N

𝑇𝑛 := inf{𝑡 ⩾ 0 ; ∫_0^𝑡 |d𝑀𝑠| ⩾ 𝑛}

is a stopping time by Proposition 3.9. It is enough to show that

∀𝑛  𝑀^{𝑇𝑛} = 0 a.s.,

since 𝑇𝑛 → ∞.

Fix 𝑛 and write 𝑁 := 𝑀^{𝑇𝑛}. Then

∀𝑡 ⩾ 0  |𝑁𝑡| = |𝑀_{𝑡∧𝑇𝑛}| = |∫_0^{𝑡∧𝑇𝑛} d𝑀𝑠| ⩽ ∫_0^{𝑡∧𝑇𝑛} |d𝑀𝑠| ⩽ 𝑛.

By property (c) and Proposition 4.7(ii), 𝑁 is a bounded martingale. For 𝑡 > 0, consider 0 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 = 𝑡. By Proposition 3.14,

E[𝑁𝑡²] = E[Σ_{𝑖=1}^{𝑝} (𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}})²] ⩽ E[sup_𝑖 |𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}}| · Σ_{𝑖=1}^{𝑝} |𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}}|],

where the sum is at most 𝑛 by Proposition 4.2 and the supremum is at most 2𝑛 and becomes small by continuity as the mesh goes to 0. Thus, the bounded convergence theorem yields E[𝑁𝑡²] = 0, whence 𝑁𝑡 = 0 a.s. Because 𝑁 is continuous, it follows that 𝑁 = 0 a.s., as desired. ∎
Exercise (due 11/9). Let 𝑝 > 1 and 𝑋 be a right-continuous martingale satisfying sup_𝑡 E[|𝑋𝑡|^𝑝] < ∞. Show that for all measurable 𝑇 : Ω → [0, ∞], 𝑋𝑇 ∈ 𝐿^𝑝. Show that this is not always true for 𝑝 = 1.
4.3. The Quadratic Variation of a Continuous Local Martingale


For the rest of the chapter, we assume (ℱ𝑡 ) is complete.
Again like Brownian motion, continuous local martingales have finite quadratic variation on
every bounded interval. This will be a crucial result and is the main result of Chapter 4.
Theorem4.9. Let 𝑀 be a continuous local martingale. There is an increasing process h𝑀, 𝑀i =
h𝑀, 𝑀i𝑡 𝑡>0 such that 𝑀𝑡2 − h𝑀, 𝑀i𝑡 𝑡>0 is a continuous local martingale. Such a process h𝑀, 𝑀i
is unique up to indistinguishability and has the following form: if 𝑡 > 0 and 0 = 𝑡0𝑛 < 𝑡1𝑛 < · · · <
𝑡 𝑛𝑝 𝑛 = 𝑡 is an increasing sequence of subdivisions of [0, 𝑡] with mesh going to zero, then
𝑝𝑛
2
X
h𝑀, 𝑀i𝑡 = lim (𝑀𝑡𝑖𝑛 − 𝑀𝑡𝑖−1
𝑛 ) in probability. (4.3)
𝑛→∞
𝑖=1
46 Chapter 4. Continuous Semimartingales

Remark. The sum in Eq. (4.3) is not monotone in 𝑛, unlike for total variation.
We call h𝑀, 𝑀i the quadratic variation of 𝑀. For example, if 𝐵 is a Brownian motion, then
h𝐵, 𝐵i𝑡 = 𝑡.
From Eq. (4.3), we see that h𝑀, 𝑀i does not depend on 𝑀0 , nor on (ℱ𝑡 )𝑡>0 . Also, Eq. (4.3)
holds even if the subdivisions are not increasing, but we will prove that only in Chapter 5.
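Eq. (4.3) is easy to see numerically for Brownian motion, where h𝐵, 𝐵i𝑡 = 𝑡: the sum of squared increments over a fine subdivision of [0, 1] concentrates near 1. A simulation sketch (illustrative only; the step count and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def brownian_quadratic_variation(t=1.0, n_steps=100_000):
    """Sum of squared increments of one simulated Brownian path over
    the uniform subdivision of [0, t] into n_steps pieces."""
    dt = t / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return float(np.sum(increments ** 2))

qv = brownian_quadratic_variation()
# By Eq. (4.3), qv approximates <B, B>_1 = 1.
```

Each squared increment has mean d𝑡 and variance 2 d𝑡², so the sum has mean 𝑡 and standard deviation of order √(2𝑡 d𝑡); refining the mesh tightens the concentration.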
The proof of Theorem 4.9 relies on some calculations.
Lemma A. Let (𝑋𝑘)_{𝑘∈N} be a martingale, (𝑘𝑖)_{𝑖∈N} be an increasing sequence with 𝑘₀ = 0, and 𝑖(𝑘) B min{𝑖 ; 𝑘 6 𝑘𝑖}. Define

    𝑌𝑚 B Σ_{𝑘=1}^{𝑚} 𝑋_{𝑘−1}(𝑋𝑘 − 𝑋_{𝑘−1})

and

    𝑍ℓ B Σ_{𝑖=1}^{ℓ} 𝑋_{𝑘_{𝑖−1}}(𝑋_{𝑘_𝑖} − 𝑋_{𝑘_{𝑖−1}}).

Then

    E[(𝑌_{𝑘_ℓ} − 𝑍ℓ)²] = E[Σ_{𝑘=1}^{𝑘_ℓ} (𝑋_{𝑘_{𝑖(𝑘)−1}} − 𝑋_{𝑘−1})²(𝑋𝑘 − 𝑋_{𝑘−1})²].

Proof. We have

    E[𝑌_{𝑘_ℓ} 𝑍ℓ] = Σ_{𝑖=1}^{ℓ} Σ_{𝑘=1}^{𝑘_ℓ} E[𝑋_{𝑘_{𝑖−1}}(𝑋_{𝑘_𝑖} − 𝑋_{𝑘_{𝑖−1}}) 𝑋_{𝑘−1}(𝑋𝑘 − 𝑋_{𝑘−1})].

If 𝑘_𝑖 6 𝑘 − 1, then by conditioning on ℱ_{𝑘−1}, we see that the (𝑖, 𝑘) summand is 0. Similarly, if 𝑘 6 𝑘_{𝑖−1}, then by conditioning on ℱ_{𝑘_{𝑖−1}}, we get that the (𝑖, 𝑘) summand is 0. Therefore,

    E[𝑌_{𝑘_ℓ} 𝑍ℓ] = Σ_{𝑘=1}^{𝑘_ℓ} E[𝑋_{𝑘_{𝑖(𝑘)−1}}(𝑋_{𝑘_{𝑖(𝑘)}} − 𝑋_{𝑘_{𝑖(𝑘)−1}}) 𝑋_{𝑘−1}(𝑋𝑘 − 𝑋_{𝑘−1})].

Writing

    𝑋_{𝑘_{𝑖(𝑘)}} − 𝑋_{𝑘_{𝑖(𝑘)−1}} = Σ_{𝑗=𝑘_{𝑖(𝑘)−1}+1}^{𝑘_{𝑖(𝑘)}} (𝑋𝑗 − 𝑋_{𝑗−1}),

we get that the (𝑘, 𝑗) summand is 0 unless 𝑗 = 𝑘: if 𝑗 < 𝑘, condition on ℱ_{𝑘−1}, whereas if 𝑗 > 𝑘, condition on ℱ_{𝑗−1}. Thus, we have

    E[𝑌_{𝑘_ℓ} 𝑍ℓ] = Σ_{𝑘=1}^{𝑘_ℓ} E[𝑋_{𝑘_{𝑖(𝑘)−1}} 𝑋_{𝑘−1}(𝑋𝑘 − 𝑋_{𝑘−1})²].    (1)

By choosing 𝑘_𝑖 ≡ 𝑖, we obtain

    E[𝑌²_{𝑘_ℓ}] = Σ_{𝑘=1}^{𝑘_ℓ} E[𝑋²_{𝑘−1}(𝑋𝑘 − 𝑋_{𝑘−1})²].    (2)

If we apply Eq. (2) to the martingale (𝑋_{𝑘_𝑖})_{𝑖∈N}, we obtain

    E[𝑍ℓ²] = Σ_{𝑖=1}^{ℓ} E[𝑋²_{𝑘_{𝑖−1}}(𝑋_{𝑘_𝑖} − 𝑋_{𝑘_{𝑖−1}})²].

Now condition on ℱ_{𝑘_{𝑖−1}} and use Proposition 3.14 to write

    E[(𝑋_{𝑘_𝑖} − 𝑋_{𝑘_{𝑖−1}})² | ℱ_{𝑘_{𝑖−1}}] = Σ_{𝑘=𝑘_{𝑖−1}+1}^{𝑘_𝑖} E[(𝑋𝑘 − 𝑋_{𝑘−1})² | ℱ_{𝑘_{𝑖−1}}],

whence

    E[𝑍ℓ²] = Σ_{𝑖=1}^{ℓ} E[𝑋²_{𝑘_{𝑖−1}} Σ_{𝑘=𝑘_{𝑖−1}+1}^{𝑘_𝑖} E[(𝑋𝑘 − 𝑋_{𝑘−1})² | ℱ_{𝑘_{𝑖−1}}]]
           = Σ_{𝑘=1}^{𝑘_ℓ} E[𝑋²_{𝑘_{𝑖(𝑘)−1}}(𝑋𝑘 − 𝑋_{𝑘−1})²].    (3)

Using Eqs. (1) to (3) and expanding E[(𝑌_{𝑘_ℓ} − 𝑍ℓ)²] = E[𝑌²_{𝑘_ℓ}] − 2 E[𝑌_{𝑘_ℓ}𝑍ℓ] + E[𝑍ℓ²], we get the desired result. J


Lemma B. If (𝑋𝑘)_{𝑘∈N} is a martingale, then

    ∀𝑚 ∈ N   E[(Σ_{𝑘=1}^{𝑚} (𝑋𝑘 − 𝑋_{𝑘−1})²)²] 6 6 · max_{06𝑘6𝑚} ‖𝑋𝑘‖∞⁴.

Proof. Write 𝐴 B max_{06𝑘6𝑚} ‖𝑋𝑘‖∞. Note that

    Σ_{16𝑘<𝑗6𝑚} E[(𝑋𝑘 − 𝑋_{𝑘−1})²(𝑋𝑗 − 𝑋_{𝑗−1})²]
        = Σ_{𝑘=1}^{𝑚−1} E[(𝑋𝑘 − 𝑋_{𝑘−1})² Σ_{𝑗=𝑘+1}^{𝑚} E[(𝑋𝑗 − 𝑋_{𝑗−1})² | ℱ𝑘]]
        = Σ_{𝑘=1}^{𝑚−1} E[(𝑋𝑘 − 𝑋_{𝑘−1})² E[𝑋𝑚² − 𝑋𝑘² | ℱ𝑘]]    [by Proposition 3.14]
        6 𝐴² · Σ_{𝑘=1}^{𝑚−1} E[(𝑋𝑘 − 𝑋_{𝑘−1})²].

In addition,

    E[(𝑋𝑘 − 𝑋_{𝑘−1})⁴] 6 4𝐴² E[(𝑋𝑘 − 𝑋_{𝑘−1})²].

Therefore,

    E[(Σ_{𝑘=1}^{𝑚} (𝑋𝑘 − 𝑋_{𝑘−1})²)²] 6 6𝐴² Σ_{𝑘=1}^{𝑚} E[(𝑋𝑘 − 𝑋_{𝑘−1})²]
        = 6𝐴² E[𝑋𝑚² − 𝑋₀²]    [by Proposition 3.14]
        6 6𝐴⁴. J

Lemma C. If for each 𝑛 ∈ N, 𝑋^𝑛 = (𝑋^𝑛_𝑡)_{𝑡∈𝐼} is a process with continuous sample paths and

    lim_{𝑛,𝑚→∞} E[sup_{𝑡∈𝐼} (𝑋^𝑛_𝑡 − 𝑋^𝑚_𝑡)²] = 0,

then there exist 𝑛𝑘 → ∞ and 𝑌 = (𝑌𝑡)_{𝑡∈𝐼} with continuous sample paths such that almost surely

    ∀𝑡 ∈ 𝐼   lim_{𝑘→∞} 𝑋^{𝑛𝑘}_𝑡 = 𝑌𝑡.

Proof. Choose 𝑛𝑘 → ∞ such that

    Σ_{𝑘=1}^{∞} E[sup_{𝑡∈𝐼} (𝑋^{𝑛𝑘}_𝑡 − 𝑋^{𝑛_{𝑘+1}}_𝑡)²]^{1/2} < ∞.

Then E[Σ_{𝑘=1}^{∞} sup_{𝑡∈𝐼} |𝑋^{𝑛𝑘}_𝑡 − 𝑋^{𝑛_{𝑘+1}}_𝑡|] = Σ_{𝑘=1}^{∞} E[sup_{𝑡∈𝐼} |𝑋^{𝑛𝑘}_𝑡 − 𝑋^{𝑛_{𝑘+1}}_𝑡|] < ∞, so

    Σ_{𝑘=1}^{∞} sup_{𝑡∈𝐼} |𝑋^{𝑛𝑘}_𝑡 − 𝑋^{𝑛_{𝑘+1}}_𝑡| < ∞ almost surely.

Off a negligible set 𝑁, we have uniform convergence of 𝑋^{𝑛𝑘}, so for 𝜔 ∉ 𝑁 one may define 𝑌(𝜔) B lim_{𝑘→∞} 𝑋^{𝑛𝑘}(𝜔), whereas for 𝜔 ∈ 𝑁, define 𝑌(𝜔) B 0. (Note that 𝑁 depends on 𝑋^{𝑛𝑘}_𝑡 for all 𝑘 and all 𝑡 ∈ 𝐼, or at least a dense subset of such 𝑡, so that we cannot conclude that 𝑌𝑡 ∈ 𝜎(𝑋^{𝑛𝑘}_𝑡 ; 𝑘 > 1).) J

Proof of Theorem 4.9. We first show uniqueness. Suppose that 𝐴 and 𝐴′ are increasing processes such that (𝑀𝑡² − 𝐴𝑡) and (𝑀𝑡² − 𝐴′𝑡) are both continuous local martingales. Then their difference, 𝐴′𝑡 − 𝐴𝑡, is a continuous local martingale and a finite-variation process, whence it is 0 by Theorem 4.8 (up to indistinguishability).
To prove existence, first assume 𝑀₀ = 0 and 𝑀 is bounded. By Proposition 4.7(ii), 𝑀 is a true martingale. Fix 𝐾 > 0 and an increasing sequence of subdivisions of [0, 𝐾] with mesh going to 0, 0 = 𝑡^𝑛_0 < 𝑡^𝑛_1 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝐾.
It is easy to see that if 0 6 𝑟 < 𝑠 and 𝑍 ∈ 𝐿^∞(ℱ𝑟), then 𝑡 ↦→ 𝑍(𝑀_{𝑠∧𝑡} − 𝑀_{𝑟∧𝑡}) is a martingale. Therefore, for all 𝑛, the process

    𝑋^𝑛_𝑡 B Σ_{𝑖=1}^{𝑝𝑛} 𝑀_{𝑡^𝑛_{𝑖−1}}(𝑀_{𝑡^𝑛_𝑖∧𝑡} − 𝑀_{𝑡^𝑛_{𝑖−1}∧𝑡})

is a bounded martingale. Now for 0 6 𝑗 6 𝑝𝑛,

    𝑀²_{𝑡^𝑛_𝑗} − 2𝑋^𝑛_{𝑡^𝑛_𝑗} = 𝑀²_{𝑡^𝑛_𝑗} − 2 Σ_{𝑖=1}^{𝑗} 𝑀_{𝑡^𝑛_{𝑖−1}}(𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}})
        = (𝑀²_{𝑡^𝑛_𝑗} − 2𝑀_{𝑡^𝑛_𝑗}𝑀_{𝑡^𝑛_{𝑗−1}} + 𝑀²_{𝑡^𝑛_{𝑗−1}})
        + (𝑀²_{𝑡^𝑛_{𝑗−1}} − 2𝑀_{𝑡^𝑛_{𝑗−1}}𝑀_{𝑡^𝑛_{𝑗−2}} + 𝑀²_{𝑡^𝑛_{𝑗−2}})
        + · · ·                                                      (4.4)
        + (𝑀²_{𝑡^𝑛_1} − 2𝑀_{𝑡^𝑛_1}𝑀_{𝑡^𝑛_0} + 𝑀²_{𝑡^𝑛_0})
        + 𝑀²_{𝑡^𝑛_0}
        = Σ_{𝑖=1}^{𝑗} (𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}})²

since 𝑀_{𝑡^𝑛_0} = 𝑀₀ = 0. (In Chapter 5, we will see this implies 𝑀𝑡² − h𝑀, 𝑀i𝑡 = 2∫₀ᵗ 𝑀𝑠 d𝑀𝑠. Note that if 𝑀 is a finite-variation process, then h𝑀, 𝑀i = 0 and this is ordinary calculus.)


By Lemma A, if 𝑛 6 𝑚,

    E[(𝑋^𝑛_𝐾 − 𝑋^𝑚_𝐾)²] = E[Σ_{𝑗=1}^{𝑝𝑚} (𝑀_{𝑡^𝑛_{𝑖𝑛(𝑗)−1}} − 𝑀_{𝑡^𝑚_{𝑗−1}})² · (𝑀_{𝑡^𝑚_𝑗} − 𝑀_{𝑡^𝑚_{𝑗−1}})²],

where 𝑖𝑛(𝑗) B min{𝑖 ; 𝑡^𝑚_𝑗 6 𝑡^𝑛_𝑖}. The right-hand side is

    6 E[max_{16𝑗6𝑝𝑚} (𝑀_{𝑡^𝑛_{𝑖𝑛(𝑗)−1}} − 𝑀_{𝑡^𝑚_{𝑗−1}})² · Σ_{𝑗=1}^{𝑝𝑚} (𝑀_{𝑡^𝑚_𝑗} − 𝑀_{𝑡^𝑚_{𝑗−1}})²]
    6 E[max_𝑗 (𝑀_{𝑡^𝑛_{𝑖𝑛(𝑗)−1}} − 𝑀_{𝑡^𝑚_{𝑗−1}})⁴]^{1/2} · E[(Σ_𝑗 (𝑀_{𝑡^𝑚_𝑗} − 𝑀_{𝑡^𝑚_{𝑗−1}})²)²]^{1/2}    [Cauchy–Schwarz inequality]

The first factor converges to 0 as 𝑚, 𝑛 → ∞ by continuity of sample paths and boundedness of 𝑀. The second factor is at most √6 · sup_{06𝑡6𝐾} ‖𝑀𝑡‖∞² by Lemma B. Therefore,

    lim_{𝑛,𝑚→∞} E[(𝑋^𝑛_𝐾 − 𝑋^𝑚_𝐾)²] = 0.

By Doob's 𝐿²-inequality (Proposition 3.15(ii)), we get

    lim_{𝑚,𝑛→∞} E[sup_{𝑡6𝐾} (𝑋^𝑛_𝑡 − 𝑋^𝑚_𝑡)²] = 0.

By Lemma C, there exist 𝑌 = (𝑌𝑡)_{06𝑡6𝐾} with continuous sample paths and 𝑛𝑘 → ∞ such that almost surely,

    ∀𝑡 ∈ [0, 𝐾]   lim_{𝑘→∞} 𝑋^{𝑛𝑘}_𝑡 = 𝑌𝑡.

Also,

    ∀𝑡 ∈ [0, 𝐾]   lim_{𝑛→∞} 𝑋^𝑛_𝑡 = 𝑌𝑡 in 𝐿².

Because the filtration is complete, 𝑌𝑡 ∈ ℱ𝑡 . Since 𝑠 6 𝑡 implies that E[𝑋𝑡𝑛 | ℱ𝑠 ] = 𝑋𝑠𝑛 , we obtain
that E[𝑌𝑡 | ℱ𝑠 ] = 𝑌𝑠 for 0 6 𝑠 6 𝑡 6 𝐾, i.e., (𝑌𝑡∧𝐾 )𝑡>0 is a continuous martingale.
By Eq. (4.4), the sample paths of 𝑀𝑡² − 2𝑋^𝑛_𝑡 are increasing along the sequence 𝑡^𝑛_0 < 𝑡^𝑛_1 < · · · < 𝑡^𝑛_{𝑝𝑛}. Therefore, the sample paths of 𝑀𝑡² − 2𝑌𝑡 are increasing on [0, 𝐾] off a negligible set 𝑁. Thus, define the increasing process 𝐴^{(𝐾)} on [0, 𝐾] by

    𝐴^{(𝐾)}_𝑡 B 𝑀𝑡² − 2𝑌𝑡 on Ω \ 𝑁,   and   𝐴^{(𝐾)}_𝑡 B 0 on 𝑁.

Then 𝐴^{(𝐾)} is an increasing process and (𝑀²_{𝑡∧𝐾} − 𝐴^{(𝐾)}_{𝑡∧𝐾})_{𝑡>0} is a continuous martingale.


In this manner, for all ℓ ∈ N, we obtain a process 𝐴^{(ℓ)} on [0, ℓ]. By the uniqueness argument at the beginning of this proof, (𝐴^{(ℓ+1)}_{𝑡∧ℓ})_{𝑡>0} and (𝐴^{(ℓ)}_{𝑡∧ℓ})_{𝑡>0} are indistinguishable. This allows us to define an increasing process h𝑀, 𝑀i such that (h𝑀, 𝑀i_{𝑡∧ℓ})_{𝑡>0} is indistinguishable from (𝐴^{(ℓ)}_{𝑡∧ℓ})_{𝑡>0} for each ℓ ∈ N. It satisfies that (𝑀𝑡² − h𝑀, 𝑀i𝑡)_{𝑡>0} is a martingale.
This is not quite Eq. (4.3) because there 𝑡 was arbitrary and the subdivisions were of [0, 𝑡]. However, call "𝑡" there now "𝐾". As before, (𝐴^{(𝐾)}_{𝑡∧𝐾})_{𝑡>0} is indistinguishable from (h𝑀, 𝑀i_{𝑡∧𝐾})_{𝑡>0}. In particular, h𝑀, 𝑀i_𝐾 = 𝐴^{(𝐾)}_𝐾 almost surely. As we saw, this gives 𝐿²-convergence in Eq. (4.3), which is stronger than convergence in probability. This completes the proof when 𝑀₀ = 0 and 𝑀 is bounded.
For the general case, write 𝑀𝑡 = 𝑀₀ + 𝑁𝑡. Then 𝑀𝑡² = 𝑀₀² + 2𝑀₀𝑁𝑡 + 𝑁𝑡². By Exercise 4.22, (𝑀₀𝑁𝑡)𝑡 is a continuous local martingale, so by uniqueness, h𝑀, 𝑀i = h𝑁, 𝑁i. Thus, we may take 𝑀₀ = 0 without loss of generality.
Now use the stopping times 𝑇𝑛 B inf{𝑡 > 0 ; |𝑀𝑡| > 𝑛}. The case we proved applies to 𝑀^{𝑇𝑛}: write 𝐴^{[𝑛]} B h𝑀^{𝑇𝑛}, 𝑀^{𝑇𝑛}i. By uniqueness, for all 𝑛, (𝐴^{[𝑛+1]}_{𝑡∧𝑇𝑛})𝑡 and (𝐴^{[𝑛]}_𝑡)𝑡 are indistinguishable, so there exists an increasing process 𝐴 such that for all 𝑛 ∈ N, 𝐴^{𝑇𝑛} and 𝐴^{[𝑛]} are indistinguishable. By construction, for all 𝑛, (𝑀²_{𝑡∧𝑇𝑛} − 𝐴_{𝑡∧𝑇𝑛})𝑡 is a martingale, whence (𝑀𝑡² − 𝐴𝑡)𝑡 is a continuous local martingale. Thus, we may define h𝑀, 𝑀i B 𝐴. Now, Eq. (4.3) in the bounded case says that

    ∀𝑛 ∀𝑡   lim_{𝑚→∞} 𝑍^{(𝑚)}_{𝑡∧𝑇𝑛} = h𝑀, 𝑀i_{𝑡∧𝑇𝑛} in probability,

where

    𝑍^{(𝑚)}_𝑡 B Σ_{𝑖=1}^{𝑝𝑚} (𝑀_{𝑡^𝑚_𝑖∧𝑡} − 𝑀_{𝑡^𝑚_{𝑖−1}∧𝑡})².

That is,

    ∀𝑛 ∀𝑡 ∀𝜀 > 0 ∃𝑚₀ ∀𝑚 > 𝑚₀   P[|𝑍^{(𝑚)}_{𝑡∧𝑇𝑛} − h𝑀, 𝑀i_{𝑡∧𝑇𝑛}| > 𝜀] < 𝜀.

In addition, there exists 𝑛₀ such that P[𝑇_{𝑛₀} < 𝑡] < 𝜀. Therefore,

    ∀𝑚 > 𝑚₀   P[|𝑍^{(𝑚)}_𝑡 − h𝑀, 𝑀i𝑡| > 𝜀] < 2𝜀,

so 𝑍^{(𝑚)}_𝑡 → h𝑀, 𝑀i𝑡 in probability, as desired. J
Exercise (due 11/30). Exercise 4.23.

Proposition 4.11. If 𝑀 is a continuous local martingale and 𝑇 is a stopping time, then almost surely,

    ∀𝑡 > 0   h𝑀^𝑇, 𝑀^𝑇i𝑡 = h𝑀, 𝑀i_{𝑡∧𝑇}.

Proof. By property (c) of continuous local martingales, (𝑀²_{𝑡∧𝑇} − h𝑀, 𝑀i_{𝑡∧𝑇})𝑡 is a continuous local martingale. By uniqueness, (h𝑀, 𝑀i_{𝑡∧𝑇})𝑡 is indistinguishable from (h𝑀^𝑇, 𝑀^𝑇i𝑡)𝑡. J

Exercise (due 11/30). Show that if 𝑀 is a continuous local martingale and 𝑇 is a stopping time,
then almost surely,

∀𝑡 > 0 h𝑀 − 𝑀 𝑇 , 𝑀 − 𝑀 𝑇 i𝑡 = h𝑀, 𝑀i𝑡 − h𝑀, 𝑀i𝑡𝑇 .

Exercise (due 11/30). Let 𝐵 be an (ℱ𝑡 )-Brownian motion and 𝑆, 𝑇 be stopping times. Calculate

h𝐵𝑇 − 𝐵 𝑆 , 𝐵𝑇 − 𝐵 𝑆 i.

Hint: Do first the case 𝑆 6 𝑇.


We next show how various properties of 𝑀 are reflected in h𝑀, 𝑀i. Our first result is that 𝑀
changes only where h𝑀, 𝑀i changes.
Proposition 4.12. Let 𝑀 be a continuous local martingale and 0 6 𝑡₁ < 𝑡₂ 6 ∞. Then a.s. ∀𝑡 ∈ [𝑡₁, 𝑡₂] 𝑀𝑡 = 𝑀_{𝑡₁} if and only if a.s. ∀𝑡 ∈ [𝑡₁, 𝑡₂] h𝑀, 𝑀i𝑡 = h𝑀, 𝑀i_{𝑡₁}.

Proof. ⇒: By Eq. (4.3), h𝑀, 𝑀i_{𝑡₂} = h𝑀, 𝑀i_{𝑡₁}. Since h𝑀, 𝑀i is increasing, we get the result.
⇐: 𝑀^{𝑡𝑖} is a continuous local martingale by property (c), whence so is 𝑀^{𝑡₂} − 𝑀^{𝑡₁} by property (e). By Eq. (4.3), h𝑀^{𝑡₂} − 𝑀^{𝑡₁}, 𝑀^{𝑡₂} − 𝑀^{𝑡₁}i = 0. Therefore, ((𝑀^{𝑡₂}_𝑡 − 𝑀^{𝑡₁}_𝑡)²)𝑡 is a continuous local martingale. By Proposition 4.7(i), it is a supermartingale. Thus, E[(𝑀𝑡 − 𝑀_{𝑡₁})²] 6 E[(𝑀_{𝑡₁} − 𝑀_{𝑡₁})²] = 0 for 𝑡 ∈ [𝑡₁, 𝑡₂]. J

For an increasing process 𝐴, we define 𝐴∞ B lim𝑡→∞ 𝐴𝑡 ∈ [0, ∞].


Theorem 4.13. Let 𝑀 be a continuous local martingale with 𝑀₀ ∈ 𝐿².
(i) The following are equivalent:
(a) 𝑀 is a true martingale bounded in 𝐿².
(b) E[h𝑀, 𝑀i∞] < ∞.
If these hold, then (𝑀𝑡² − h𝑀, 𝑀i𝑡)_{𝑡>0} is a uniformly integrable martingale and so

    E[𝑀∞²] = E[𝑀₀²] + E[h𝑀, 𝑀i∞].

(ii) The following are equivalent:
(a) 𝑀 is a true martingale and ∀𝑡 > 0, 𝑀𝑡 ∈ 𝐿².
(b) ∀𝑡 ∈ [0, ∞), E[h𝑀, 𝑀i𝑡] < ∞.
If these hold, then (𝑀𝑡² − h𝑀, 𝑀i𝑡)_{𝑡>0} is a true martingale.

Proof. (i) Without loss of generality, 𝑀₀ = 0.
(a) ⇒ (b): By Doob's 𝐿²-inequality,

    ∀𝑠 > 0   ‖sup_{06𝑡6𝑠} |𝑀𝑡|‖₂ 6 2‖𝑀𝑠‖₂,

whence sup_{𝑡>0} |𝑀𝑡| ∈ 𝐿² if (a) holds. Let 𝑆𝑛 B inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑛}. By property (c) of continuous local martingales, (𝑀²_{𝑡∧𝑆𝑛} − h𝑀, 𝑀i_{𝑡∧𝑆𝑛})_{𝑡>0} is a continuous local martingale. Also, it is dominated by (sup_{𝑡>0} 𝑀𝑡² + 𝑛) ∈ 𝐿¹, so by Proposition 4.7(ii), it is a uniformly integrable martingale. Therefore,

    ∀𝑡 > 0   E[h𝑀, 𝑀i_{𝑡∧𝑆𝑛}] = E[𝑀²_{𝑡∧𝑆𝑛}] 6 E[sup_{𝑠>0} 𝑀𝑠²] < ∞.

Take 𝑛 → ∞ and 𝑡 → ∞ to get (b).
(b) ⇒ (a): If (b) holds, then set

    𝑇𝑛 B inf{𝑡 > 0 ; |𝑀𝑡| > 𝑛}.

Now

    |𝑀²_{𝑡∧𝑇𝑛} − h𝑀, 𝑀i_{𝑡∧𝑇𝑛}| 6 𝑛² + h𝑀, 𝑀i∞ ∈ 𝐿¹,

so again (𝑀²_{𝑡∧𝑇𝑛} − h𝑀, 𝑀i_{𝑡∧𝑇𝑛})_{𝑡>0} is a uniformly integrable martingale and

    ∀𝑡 > 0   E[𝑀²_{𝑡∧𝑇𝑛}] = E[h𝑀, 𝑀i_{𝑡∧𝑇𝑛}] 6 E[h𝑀, 𝑀i∞] < ∞.    (∗)

Take 𝑛 → ∞ to get that (𝑀𝑡)_{𝑡>0} is bounded in 𝐿² by Fatou's lemma.
To see that 𝑀 is a martingale, note that Eq. (∗) implies (𝑀_{𝑡∧𝑇𝑛})_{𝑛>1} is uniformly integrable, so it converges in 𝐿¹ to 𝑀𝑡 as 𝑛 → ∞. By Proposition 4.7(iii), 𝑀^{𝑇𝑛} is a martingale, whence so is its 𝐿¹-limit, 𝑀.
Lastly, if (a) and (b) hold, then

    |𝑀𝑡² − h𝑀, 𝑀i𝑡| 6 sup_{𝑠>0} 𝑀𝑠² + h𝑀, 𝑀i∞ ∈ 𝐿¹,

so by Proposition 4.7(ii), 𝑀² − h𝑀, 𝑀i is a uniformly integrable martingale.
(ii) Apply (i) to 𝑀^𝑎 for each 𝑎 > 0. J
Exercise (due 11/30). Exercise 4.24.

4.4. The Bracket of Two Continuous Local Martingales


The reason for our notation h𝑀, 𝑀i is that it leads to:
Definition 4.14. If 𝑀 and 𝑁 are continuous local martingales, the bracket (or covariation) h𝑀, 𝑁i is the finite-variation process

    h𝑀, 𝑁i𝑡 B (1/2)(h𝑀 + 𝑁, 𝑀 + 𝑁i𝑡 − h𝑀, 𝑀i𝑡 − h𝑁, 𝑁i𝑡).

Proposition 4.15. Let 𝑀 and 𝑁 be continuous local martingales.


(i) h𝑀, 𝑁i is  the unique (up to indistinguishability) finite-variation process such that 𝑀𝑡 𝑁𝑡 −
h𝑀, 𝑁i𝑡 𝑡>0 is a continuous local martingale.
(ii) The map (𝑀, 𝑁) ↦→ h𝑀, 𝑁i is bilinear and symmetric.
(iii) If 0 = 𝑡^𝑛_0 < 𝑡^𝑛_1 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝑡 is an increasing sequence of subdivisions of [0, 𝑡] with mesh going to 0, then

    lim_{𝑛→∞} Σ_{𝑖=1}^{𝑝𝑛} (𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}})(𝑁_{𝑡^𝑛_𝑖} − 𝑁_{𝑡^𝑛_{𝑖−1}}) = h𝑀, 𝑁i𝑡   in probability.

(iv) If 𝑇 is a stopping time, then

h𝑀 𝑇 , 𝑁 𝑇 i = h𝑀, 𝑁i𝑇 = h𝑀 𝑇 , 𝑁i.

(v) If 𝑀 and 𝑁 are both true martingales bounded in 𝐿 2 , then 𝑀 𝑁 − h𝑀, 𝑁i is a uniformly
integrable martingale, whence h𝑀, 𝑁i∞ exists as the almost sure and 𝐿 1 limit of h𝑀, 𝑁i𝑡 as
𝑡 → ∞ and satisfies
 
E[𝑀∞ 𝑁∞ ] = E[𝑀0 𝑁0 ] + E h𝑀, 𝑁i∞ .

Proof. (i) This follows from Theorem 4.9, with uniqueness from Theorem 4.8.
(ii) This follows from uniqueness in (i).
(iii) This follows from Eq. (4.3) applied to 𝑀, 𝑁 and 𝑀 + 𝑁.
(iv) The first equality follows from (i) as in the proof of Proposition 4.11. By (iii), given
0 6 𝑠 6 𝑡, we may take the subdivisions of [0, 𝑡] to include 𝑠 in order to deduce that

h𝑀 𝑇 , 𝑁i𝑡 = h𝑀, 𝑁i𝑡 a.s. on [𝑇 > 𝑡]

and
h𝑀 𝑇 , 𝑁i𝑡 = h𝑀 𝑇 , 𝑁i𝑠 a.s. on [𝑇 6 𝑠],
whence h𝑀 𝑇 , 𝑁i𝑡 = h𝑀, 𝑁i𝑡𝑇 almost surely (consider 𝑠 ∈ Q+ ).
(v) Apply Theorem 4.13(i) to 𝑀, 𝑁 and 𝑀 + 𝑁 to get three uniformly integrable martingales.
Combining them gives the result. J

Exercise (due 11/30). Give another proof that

h𝑀 − 𝑀 𝑇 , 𝑀 − 𝑀 𝑇 i = h𝑀, 𝑀i − h𝑀, 𝑀i𝑇

by using Proposition 4.15.


Proposition 4.16. If 𝐵 and 𝐵′ are independent (ℱ𝑡)-Brownian motions, then h𝐵, 𝐵′i = 0.

We will skip the proof in favor of the following exercise:


Exercise (due 11/30). Let 𝑀 and 𝑁 be independent continuous local martingales (so 𝜎(𝑀) ⫫ 𝜎(𝑁)).
Give two proofs as follows that h𝑀, 𝑁i = 0:

(1) Assume first that 𝑀 and 𝑁 are bounded. For 0 = 𝑡₀ < 𝑡₁ < · · · < 𝑡𝑛 = 𝑡, show that

    E[(Σ_{𝑖=1}^{𝑛} (𝑀_{𝑡_𝑖} − 𝑀_{𝑡_{𝑖−1}})(𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}}))²] 6 max_{16𝑖6𝑛} (E[𝑀²_{𝑡_𝑖}] − E[𝑀²_{𝑡_{𝑖−1}}]) · (E[𝑁𝑡²] − E[𝑁₀²]).

Deduce that h𝑀, 𝑁i = 0.


In the general case, localize 𝑀 and 𝑁 and use Proposition 4.15.
(2) Assume first that 𝑀 and 𝑁 are martingales. Show that for 0 6 𝑠 6 𝑡, 𝐴 ∈ ℱ𝑠𝑀 , 𝐵 ∈ ℱ𝑠𝑁 , we
have
E[𝑀𝑡 𝑁𝑡 1 𝐴∩𝐵 ] = E[𝑀𝑠 𝑁 𝑠 1 𝐴∩𝐵 ].
Deduce that 𝑀 𝑁 is an ℱ𝑡𝑀 ∨ ℱ𝑡𝑁 𝑡>0 -martingale and that h𝑀, 𝑁i = 0.


In the general case, localize 𝑀 and 𝑁.
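A simulation sketch of Proposition 4.16 (illustrative only; grid size and seed are arbitrary choices): for independent Brownian motions, the cross-variation sums of Proposition 4.15(iii) concentrate near 0, while each quadratic variation stays near 𝑡.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
dt = 1.0 / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)   # increments of B on [0, 1]
dBp = rng.normal(0.0, np.sqrt(dt), size=n)  # increments of an independent B'

cross = float(np.sum(dB * dBp))  # approximates <B, B'>_1 = 0
quad = float(np.sum(dB ** 2))    # approximates <B, B>_1 = 1
```

The cross sum has mean 0 and variance 𝑛 d𝑡² = 1/𝑛, so it vanishes as the mesh is refined, in contrast to the quadratic variation.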


Definition 4.17. We say that two continuous local martingales are orthogonal if their bracket is 0;
this is equivalent to their product being a continuous local martingale.
Thus, 𝑀 ⫫ 𝑁 implies 𝑀 and 𝑁 are orthogonal. The converse is false:
Exercise (due 11/30). Show that if 𝐵 is an (ℱ𝑡 )-Brownian motion and 𝑇 is a stopping time, then
h𝐵𝑇 , 𝐵 − 𝐵𝑇 i = 0. Give an example where 𝐵𝑇 and 𝐵 − 𝐵𝑇 are not independent.
If h𝑀, 𝑁i = 0 and 𝑀, 𝑁 are true martingales bounded in 𝐿 2 , then
E[𝑀𝑇 𝑁𝑇 ] = E[𝑀0 𝑁0 ]
for all stopping times 𝑇 by Proposition 4.15(v) and Theorem 3.22 (the optional stopping theorem).
Exercise. Prove, conversely, that if 𝑀 and 𝑁 are true martingales bounded in 𝐿 2 and E[𝑀𝑇 𝑁𝑇 ] =
E[𝑀0 𝑁0 ] for all finite or infinite stopping times 𝑇, then h𝑀, 𝑁i = 0.
We are next going to prove a Cauchy–Schwarz type inequality that involves integrating
with respect to the bracket of two continuous local martingales. The proof involves the usual
Cauchy–Schwarz inequality a couple of times, including via the following:
Lemma. Let 𝑎 : R₊ → R be a finite-variation function and 𝑣₁, 𝑣₂ : R₊ → R be increasing functions. If

    ∀ 0 6 𝑠 < 𝑡 < ∞   |𝑎(𝑡) − 𝑎(𝑠)| 6 √(𝑣₁(𝑡) − 𝑣₁(𝑠)) · √(𝑣₂(𝑡) − 𝑣₂(𝑠)),

then for any Borel functions ℎ, 𝑘 : R₊ → R₊,

    ∫₀^∞ ℎ · 𝑘 |d𝑎| 6 (∫₀^∞ ℎ² d𝑣₁)^{1/2} (∫₀^∞ 𝑘² d𝑣₂)^{1/2}. (∗)

Proof. Suppose that (∗) holds for functions ℎ𝑖 and 𝑘𝑖 that are both 0 outside a Borel set 𝐴𝑖, with the 𝐴𝑖 disjoint for different 𝑖. Write ℎ B Σ ℎ𝑖 and 𝑘 B Σ 𝑘𝑖. Then

    ∫₀^∞ ℎ · 𝑘 |d𝑎| = ∫₀^∞ Σ ℎ𝑖𝑘𝑖 |d𝑎| 6 Σ (∫₀^∞ ℎ𝑖² d𝑣₁)^{1/2} (∫₀^∞ 𝑘𝑖² d𝑣₂)^{1/2}
        6 (Σ ∫₀^∞ ℎ𝑖² d𝑣₁)^{1/2} (Σ ∫₀^∞ 𝑘𝑖² d𝑣₂)^{1/2} = (∫₀^∞ ℎ² d𝑣₁)^{1/2} (∫₀^∞ 𝑘² d𝑣₂)^{1/2}

by the Cauchy–Schwarz inequality. That is, (∗) then holds also for the pair ℎ, 𝑘. A similar computation using the hypothesis shows that if 𝑠 = 𝑡₀ < 𝑡₁ < · · · < 𝑡𝑝 = 𝑡, then

    Σ_{𝑖=1}^{𝑝} |𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})| 6 √(𝑣₁(𝑡) − 𝑣₁(𝑠)) · √(𝑣₂(𝑡) − 𝑣₂(𝑠)).

(Actually, this is a special case of the initial computation with ℎ𝑖 = 1_{(𝑡_{𝑖−1},𝑡𝑖]} d𝑎/|d𝑎| and 𝑘𝑖 = 1_{(𝑡_{𝑖−1},𝑡𝑖]} sgn(𝑎(𝑡𝑖) − 𝑎(𝑡_{𝑖−1})).) Taking a limit of such subdivisions and using Proposition 4.2, we obtain

    ∫_{(𝑠,𝑡]} |d𝑎| 6 (∫_{(𝑠,𝑡]} d𝑣₁)^{1/2} (∫_{(𝑠,𝑡]} d𝑣₂)^{1/2},

in other words, (∗) holds for functions of the form 1_{(𝑠,𝑡]}. By our first computation, it follows that if 𝐵 is a finite disjoint union of intervals (𝑠𝑖, 𝑡𝑖], then

    ∫_𝐵 |d𝑎| 6 (∫_𝐵 d𝑣₁)^{1/2} (∫_𝐵 d𝑣₂)^{1/2}. (∗∗)

The class of 𝐵 such that (∗∗) holds is closed under countable increasing unions and decreasing intersections. Furthermore, the class of finite disjoint unions of intervals (𝑠, 𝑡] is an algebra. By Halmos' monotone class lemma, it follows that (∗∗) holds for all 𝐵 ∈ ℬ(R₊). Therefore, (∗) holds when ℎ and 𝑘 are multiples of the same indicator, and thus when they are simple functions. We may take monotone increasing limits of simple functions to get the full result. J
Proposition 4.18 (Kunita–Watanabe). If 𝑀 and 𝑁 are continuous local martingales and 𝐻 and 𝐾 are measurable processes, then almost surely,

    ∫₀^∞ |𝐻𝑠| · |𝐾𝑠| |dh𝑀, 𝑁i𝑠| 6 (∫₀^∞ 𝐻𝑠² dh𝑀, 𝑀i𝑠)^{1/2} (∫₀^∞ 𝐾𝑠² dh𝑁, 𝑁i𝑠)^{1/2}.

Proof. For 𝑠 = 𝑡₀ < 𝑡₁ < · · · < 𝑡𝑝 = 𝑡, we have

    |Σ_{𝑖=1}^{𝑝} (𝑀_{𝑡_𝑖} − 𝑀_{𝑡_{𝑖−1}})(𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}})| 6 (Σ_{𝑖=1}^{𝑝} (𝑀_{𝑡_𝑖} − 𝑀_{𝑡_{𝑖−1}})²)^{1/2} (Σ_{𝑖=1}^{𝑝} (𝑁_{𝑡_𝑖} − 𝑁_{𝑡_{𝑖−1}})²)^{1/2}.

Taking a limit and using Theorem 4.9 and Proposition 4.15, we get almost surely

    |h𝑀, 𝑁i𝑡 − h𝑀, 𝑁i𝑠| 6 (h𝑀, 𝑀i𝑡 − h𝑀, 𝑀i𝑠)^{1/2} (h𝑁, 𝑁i𝑡 − h𝑁, 𝑁i𝑠)^{1/2}.

By taking 𝑠, 𝑡 ∈ Q₊ and using continuity, we obtain that this holds almost surely simultaneously in 0 6 𝑠 < 𝑡 < ∞. The result now follows from the lemma. J
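The first display in the proof is the ordinary Cauchy–Schwarz inequality applied to the vectors of increments; a tiny numerical check on random data (illustrative only, with made-up increment values):

```python
import numpy as np

rng = np.random.default_rng(2)
dM = rng.normal(size=50)  # stand-ins for the increments M_{t_i} - M_{t_{i-1}}
dN = rng.normal(size=50)  # stand-ins for the increments N_{t_i} - N_{t_{i-1}}

lhs = abs(float(np.sum(dM * dN)))
rhs = float(np.sqrt(np.sum(dM ** 2)) * np.sqrt(np.sum(dN ** 2)))
# Cauchy-Schwarz: lhs <= rhs for any choice of increments.
```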

4.5. Continuous Semimartingales

Definition 4.19. A process 𝑋 is a continuous semimartingale if there is a continuous local


martingale 𝑀 and a finite-variation process 𝐴 such that

∀𝑡 > 0 𝑋𝑡 = 𝑀𝑡 + 𝐴𝑡 .

By Theorem 4.8, the decomposition 𝑋 = 𝑀 + 𝐴 is unique up to indistinguishability; it is called the canonical decomposition of 𝑋.

Definition 4.20. Let 𝑋 = 𝑀 + 𝐴 and 𝑋′ = 𝑀′ + 𝐴′ be the canonical decompositions of two continuous semimartingales, 𝑋 and 𝑋′. The bracket of 𝑋 and 𝑋′ is

    h𝑋, 𝑋′i B h𝑀, 𝑀′i.

Proposition 4.21. Let 𝑋 and 𝑋′ be continuous semimartingales. Given an increasing sequence of subdivisions 0 = 𝑡^𝑛_0 < 𝑡^𝑛_1 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝑡 of [0, 𝑡] whose mesh tends to 0, we have

    lim_{𝑛→∞} Σ_{𝑖=1}^{𝑝𝑛} (𝑋_{𝑡^𝑛_𝑖} − 𝑋_{𝑡^𝑛_{𝑖−1}})(𝑋′_{𝑡^𝑛_𝑖} − 𝑋′_{𝑡^𝑛_{𝑖−1}}) = h𝑋, 𝑋′i𝑡   in probability.

Proof. We have

    (𝑋_{𝑡^𝑛_𝑖} − 𝑋_{𝑡^𝑛_{𝑖−1}})(𝑋′_{𝑡^𝑛_𝑖} − 𝑋′_{𝑡^𝑛_{𝑖−1}}) = (𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}})(𝑀′_{𝑡^𝑛_𝑖} − 𝑀′_{𝑡^𝑛_{𝑖−1}}) + terms involving 𝐴 or 𝐴′.

The sums of the first terms converge in probability to h𝑀, 𝑀′i𝑡 = h𝑋, 𝑋′i𝑡 by Proposition 4.15(iii). The other terms have sums going to 0 almost surely by continuity; e.g.,

    |Σ_{𝑖=1}^{𝑝𝑛} (𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}})(𝐴′_{𝑡^𝑛_𝑖} − 𝐴′_{𝑡^𝑛_{𝑖−1}})| 6 max_{16𝑖6𝑝𝑛} |𝑀_{𝑡^𝑛_𝑖} − 𝑀_{𝑡^𝑛_{𝑖−1}}| · ∫₀ᵗ |d𝐴′𝑠| → 0. J
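A numerical sketch of Proposition 4.21 (not from the text; the drift 𝐴𝑡 = 𝑡, step count, and seed are arbitrary choices): for 𝑋 = 𝐵 + 𝐴 with 𝐴 of finite variation, the squared-increment sums are driven by the martingale part alone.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
dt = 1.0 / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)  # increments of the local martingale part B
dA = np.full(n, dt)                        # increments of the finite-variation part A_t = t
dX = dB + dA                               # increments of the semimartingale X = B + A

bracket = float(np.sum(dX ** 2))   # approximates <X, X>_1 = <B, B>_1 = 1
drift_qv = float(np.sum(dA ** 2))  # the pure finite-variation contribution, of order dt
```

The cross and drift terms contribute only 𝑂(d𝑡), which is why the bracket of a semimartingale sees only its martingale part.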

Chapter 5

Stochastic Integration

This chapter is the heart of the course.

5.1. The Construction of Stochastic Integrals


We fix a complete filtered probability space.
We construct stochastic integrals in stages. In Section 5.1.1, we begin with the analogue of step
functions and proceed to integrate with respect to 𝐿 2 -bounded martingales. We’ll find that all the
hard work was done earlier, especially Theorem 4.9. In Section 5.1.2, we extend to integrate with
respect to continuous local martingales and in Section 5.1.3, to continuous semimartingales; these
extensions will be easy. In Section 5.1.4, we prove some limit theorems about stochastic integrals.

5.1.1. Stochastic Integrals for Martingales Bounded in 𝐿 2


Every 𝐿²-bounded martingale is closed, so one can think of the space of 𝐿²-bounded martingales as a subspace of 𝐿²(Ω). However, such martingales need not be continuous. Thus, we define H² to be the space of continuous 𝐿²-bounded martingales 𝑀 with 𝑀₀ = 0 and identify it with a subspace of 𝐿²(Ω), so for 𝑀, 𝑁 ∈ H², we have

    (𝑀, 𝑁)_{H²} B (𝑀∞, 𝑁∞)_{𝐿²(Ω)} = E[𝑀∞𝑁∞] = E[h𝑀, 𝑁i∞]

by Proposition 4.15(v) and the fact that 𝑀₀ = 𝑁₀ = 0.


This subspace H2 is closed:
Proposition 5.1. The space H2 is a Hilbert space.

Proof. Suppose that (𝑀^𝑛)_{𝑛∈N} is a Cauchy sequence in H², i.e., (𝑀^𝑛_∞)_{𝑛∈N} is Cauchy in 𝐿²(Ω). By Doob's 𝐿²-inequality, E[sup_{𝑡>0} (𝑀^𝑛_𝑡 − 𝑀^𝑚_𝑡)²] 6 4 E[(𝑀^𝑛_∞ − 𝑀^𝑚_∞)²], so by Lemma C in Chapter 4 for the proof of Theorem 4.9, there exist a subsequence (𝑛𝑘)_{𝑘∈N} and a 𝑌 with continuous sample paths such that almost surely,

    ∀𝑡 > 0   𝑀^{𝑛𝑘}_𝑡 → 𝑌𝑡.

Also, there exists 𝑍 ∈ 𝐿²(Ω) such that 𝑀^𝑛_∞ → 𝑍 in 𝐿². This implies

    𝑀^𝑛_𝑡 = E[𝑀^𝑛_∞ | ℱ𝑡] → E[𝑍 | ℱ𝑡],

whence

    ∀𝑡   𝑌𝑡 = E[𝑍 | ℱ𝑡].

Therefore, 𝑌 is an 𝐿²-bounded continuous martingale. Since 𝑀^𝑛_∞ → 𝑍 in 𝐿², we get that 𝑌 is the limit of 𝑀^𝑛 in H². J
Recall that 𝒫 denotes the progressive 𝜎-field. For 𝑀 ∈ H², write h𝑀, 𝑀i P for the measure on 𝒫 given by

    𝐴 ↦→ E[∫₀^∞ 1_𝐴(𝜔, 𝑠) dh𝑀, 𝑀i𝑠];

the total mass of h𝑀, 𝑀i P is E[h𝑀, 𝑀i∞] = ‖𝑀‖²_{H²}. Then

    𝐿²(𝑀) B 𝐿²(Ω × R₊, 𝒫, h𝑀, 𝑀i P) = {𝐻 ∈ 𝒫 ; E[∫₀^∞ 𝐻𝑠² dh𝑀, 𝑀i𝑠] < ∞}.

This has the usual inner product

    (𝐻, 𝐾)_{𝐿²(𝑀)} = E[∫₀^∞ 𝐻𝑠 𝐾𝑠 dh𝑀, 𝑀i𝑠].

Note that

    ∫₀^∞ 𝐻𝑠 𝐾𝑠 dh𝑀, 𝑀i𝑠 ∈ 𝐿¹(Ω, P)

for 𝐻, 𝐾 ∈ 𝐿²(𝑀).
The analogue of step function is:
The analogue of step functions is:
Definition 5.2. An elementary process is a process 𝐻 of the form

    𝐻𝑠(𝜔) = Σ_{𝑖=0}^{𝑝−1} 𝐻^{(𝑖)}(𝜔) 1_{(𝑡_𝑖,𝑡_{𝑖+1}]}(𝑠)

for 0 = 𝑡₀ < 𝑡₁ < · · · < 𝑡𝑝 and 𝐻^{(𝑖)} ∈ 𝐿^∞(ℱ_{𝑡_𝑖}, P). We denote this class by ℰ.

It is straightforward to check that ℰ ⊆ 𝒫; for this, we could even use 1 [𝑡𝑖 ,𝑡𝑖+1 ) . The stricter
measurability requirement from using (𝑡𝑖 , 𝑡𝑖+1 ] makes ℰ a smaller class. We have ℰ ⊆ 𝐿 2 (𝑀) for
𝑀 ∈ H2 . In fact, ℰ is dense in 𝐿 2 (𝑀):
Proposition 5.3. ∀𝑀 ∈ H2 ℰ is dense in 𝐿 2 (𝑀).

Proof. This is equivalent to showing that ℰ^⊥ = {0}. Let 𝐾 ⊥ ℰ. Then for 0 6 𝑠 < 𝑡 and 𝐹 ∈ 𝐿^∞(ℱ𝑠), we have

    0 = (𝐾, 𝐹 ⊗ 1_{(𝑠,𝑡]})_{𝐿²(𝑀)} = E[𝐹 ∫ₛᵗ 𝐾𝑢 dh𝑀, 𝑀i𝑢] = E[𝐹(𝑋𝑡 − 𝑋𝑠)],

where 𝑋𝑡 B ∫₀ᵗ 𝐾𝑢 dh𝑀, 𝑀i𝑢 ∈ 𝐿¹(Ω, P). That is, by Proposition 4.5, 𝑋 = 𝐾 · h𝑀, 𝑀i is a finite-variation process that is also a martingale, whence by Theorem 4.8, 𝑋 = 0. This means almost surely, 𝐾 = 0 dh𝑀, 𝑀i-a.e., i.e., 𝐾 = 0 in 𝐿²(𝑀). J

Exercise (due 12/7). Prove that if 𝑀 is a bounded continuous martingale and 𝐴 is a bounded increasing process, then

    E[𝑀∞𝐴∞] = E[∫₀^∞ 𝑀𝑡 d𝐴𝑡].

If 𝑀 ∈ H² and 𝑇 is a stopping time, then

    h𝑀^𝑇, 𝑀^𝑇i∞ = h𝑀, 𝑀i^𝑇_∞ = h𝑀, 𝑀i_𝑇 6 h𝑀, 𝑀i∞,

so 𝑀^𝑇 ∈ H².

Exercise (due 12/7). Derive 𝑀 𝑇 ∈ H2 from the optional stopping theorem (Theorem 3.22) instead.

Let 1 [0,𝑇] denote the process (𝜔, 𝑡) ↦→ 1 [0,𝑇 (𝜔)] (𝑡). If 𝑇 is a stopping time, then 1 [0,𝑇] is
adapted and left-continuous, so progressive by Proposition 3.4. Therefore, if 𝐻 ∈ 𝐿 2 (𝑀), also
1 [0,𝑇] 𝐻 ∈ 𝐿 2 (𝑀).
Here is our first definition of stochastic integral.

Theorem 5.4. Let 𝑀 ∈ H². Given 𝐻 ∈ ℰ as in Definition 5.2, the formula

    (𝐻 · 𝑀)𝑡 B Σ_{𝑖=0}^{𝑝−1} 𝐻^{(𝑖)}(𝑀_{𝑡_{𝑖+1}∧𝑡} − 𝑀_{𝑡_𝑖∧𝑡})

defines a process 𝐻 · 𝑀 ∈ H². The map 𝐻 ↦→ 𝐻 · 𝑀 from ℰ → H² extends uniquely to a linear isometry 𝐿²(𝑀) → H², also denoted 𝐻 ↦→ 𝐻 · 𝑀. For all 𝐻 ∈ 𝐿²(𝑀), 𝐻 · 𝑀 is the unique element of H² such that

    ∀𝑁 ∈ H²   h𝐻 · 𝑀, 𝑁i = 𝐻 · h𝑀, 𝑁i. (5.2)

(Recall (𝐻 · h𝑀, 𝑁i)𝑡 = ∫₀ᵗ 𝐻𝑠 dh𝑀, 𝑁i𝑠.) If 𝑇 is a stopping time, then

    ∀𝐻 ∈ 𝐿²(𝑀)   (1_{[0,𝑇]}𝐻) · 𝑀 = (𝐻 · 𝑀)^𝑇 = 𝐻 · 𝑀^𝑇. (5.3)

We call 𝐻 · 𝑀 the stochastic integral of 𝐻 with respect to 𝑀 and write

    (𝐻 · 𝑀)𝑡 =: ∫₀ᵗ 𝐻𝑠 d𝑀𝑠.

Note that the two uses of · in Eq. (5.2) are unambiguous because every finite-variation martingale
is 0 by Theorem 4.8.
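The elementary-process formula in Theorem 5.4 can be transcribed directly for a single discrete sample path; a minimal sketch with made-up toy data (the path values, weights, and grid below are hypothetical, not from the text):

```python
import numpy as np

def elementary_integral(H_vals, t_idx, M, k):
    """(H . M) at grid index k for the elementary process
    H = sum_i H_vals[i] * 1_{(t_idx[i], t_idx[i+1]]}, where t_idx are
    indices into the path M; implements
    sum_i H^(i) * (M_{t_{i+1} ^ t} - M_{t_i ^ t})."""
    total = 0.0
    for i, h in enumerate(H_vals):
        a, b = t_idx[i], t_idx[i + 1]
        total += h * (M[min(b, k)] - M[min(a, k)])
    return total

M = np.array([0.0, 1.0, 0.5, 1.5, 1.0])  # a toy path at grid times 0..4
H_vals = [2.0, -1.0]                     # H^(0) on (0, 2], H^(1) on (2, 4]
t_idx = [0, 2, 4]

full = elementary_integral(H_vals, t_idx, M, k=4)     # 2*(0.5-0) + (-1)*(1.0-0.5) = 0.5
partial = elementary_integral(H_vals, t_idx, M, k=1)  # only H^(0) acts: 2*(1.0-0) = 2.0
```

Note how stopping the evaluation at 𝑘 = 1 freezes the increments after time 1, exactly as the minima 𝑡_{𝑖+1} ∧ 𝑡 and 𝑡_𝑖 ∧ 𝑡 do in the formula.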
More abstractly, one could alternatively use Eq. (5.2) to define 𝐻 · 𝑀 as follows: Given 𝑀 ∈ H² and 𝐻 ∈ 𝐿²(𝑀), the map

    H² ∋ 𝑁 ↦→ E[(𝐻 · h𝑀, 𝑁i)∞]

satisfies

    |E[(𝐻 · h𝑀, 𝑁i)∞]| 6 E[|∫₀^∞ 𝐻𝑠 dh𝑀, 𝑁i𝑠|] 6 E[∫₀^∞ |𝐻𝑠| |dh𝑀, 𝑁i𝑠|]
        6 E[(∫₀^∞ 𝐻𝑠² dh𝑀, 𝑀i𝑠)^{1/2} · (∫₀^∞ dh𝑁, 𝑁i𝑠)^{1/2}]    [Kunita–Watanabe]
        6 E[∫₀^∞ 𝐻𝑠² dh𝑀, 𝑀i𝑠]^{1/2} E[∫₀^∞ dh𝑁, 𝑁i𝑠]^{1/2}    [Cauchy–Schwarz inequality]
        = ‖𝐻‖_{𝐿²(𝑀)} · ‖𝑁‖_{H²},    (∗)

and thus is a continuous linear functional on H². Hence, there is a unique 𝐻 · 𝑀 ∈ H² such that

    E[(𝐻 · h𝑀, 𝑁i)∞] = (𝐻 · 𝑀, 𝑁)_{H²} = E[h𝐻 · 𝑀, 𝑁i∞].

One can then deduce Eq. (5.2) and everything else.
In fact, we should have verified that 𝐻 ∈ 𝐿¹(dh𝑀, 𝑁i) almost surely and (𝐻 · h𝑀, 𝑁i)∞ ∈ 𝐿¹(P), but this follows as in (∗), starting instead with

    E[∫₀^∞ |𝐻𝑠| · |dh𝑀, 𝑁i𝑠|].

Note that a special case of (∗), with 𝐻 = 1, is

    E[|h𝑀, 𝑁i∞|] 6 ‖𝑀‖_{H²} · ‖𝑁‖_{H²}. (∗∗)

Proof of Theorem 5.4. It is easy to see that the definition of 𝐻 · 𝑀 for 𝐻 ∈ ℰ does not depend on its representation as in Definition 5.2. It follows that 𝐻 ↦→ 𝐻 · 𝑀 is linear on ℰ. To see that the map is an isometry into H², write

    𝑀^{(𝑖)} B 𝐻^{(𝑖)}(𝑀^{𝑡_{𝑖+1}} − 𝑀^{𝑡_𝑖}),

so that 𝐻 · 𝑀 = Σ_{𝑖=0}^{𝑝−1} 𝑀^{(𝑖)}. We saw in the proof of Theorem 4.9 that each 𝑀^{(𝑖)} is a continuous martingale, so 𝐻 · 𝑀 ∈ H². By Proposition 4.15(iv), we have

    ∀𝑠, 𝑡 > 0   h𝑀^𝑠, 𝑀^𝑡i = h𝑀, 𝑀i^{𝑠∧𝑡}.

Thus, h𝑀^{(𝑖)}, 𝑀^{(𝑗)}i = 0 for 𝑖 ≠ 𝑗 and

    h𝑀^{(𝑖)}, 𝑀^{(𝑖)}i = (𝐻^{(𝑖)})²(h𝑀, 𝑀i^{𝑡_{𝑖+1}} − h𝑀, 𝑀i^{𝑡_𝑖}).

Therefore,

    h𝐻 · 𝑀, 𝐻 · 𝑀i = Σ_{𝑖=0}^{𝑝−1} (𝐻^{(𝑖)})²(h𝑀, 𝑀i^{𝑡_{𝑖+1}} − h𝑀, 𝑀i^{𝑡_𝑖}) = ∫₀^• 𝐻𝑠² dh𝑀, 𝑀i𝑠

(i.e., h𝐻 · 𝑀, 𝐻 · 𝑀i𝑡 = ∫₀ᵗ 𝐻𝑠² dh𝑀, 𝑀i𝑠). In particular,

    ‖𝐻 · 𝑀‖²_{H²} = E[h𝐻 · 𝑀, 𝐻 · 𝑀i∞] = E[∫₀^∞ 𝐻𝑠² dh𝑀, 𝑀i𝑠] = ‖𝐻‖²_{𝐿²(𝑀)},

as desired.
By Proposition 5.3, ℰ is dense in 𝐿²(𝑀), so 𝐻 ↦→ 𝐻 · 𝑀 has a unique continuous extension to a map from 𝐿²(𝑀) to H².
If 𝐻 ∈ ℰ, we have, similarly to above,

    h𝐻 · 𝑀, 𝑁i = Σ_{𝑖=0}^{𝑝−1} h𝑀^{(𝑖)}, 𝑁i = Σ_{𝑖=0}^{𝑝−1} 𝐻^{(𝑖)}(h𝑀, 𝑁i^{𝑡_{𝑖+1}} − h𝑀, 𝑁i^{𝑡_𝑖}) = ∫₀^• 𝐻𝑠 dh𝑀, 𝑁i𝑠 = 𝐻 · h𝑀, 𝑁i,

i.e., Eq. (5.2) holds for 𝐻 ∈ ℰ. Now, Eq. (∗) shows that 𝐻 ↦→ (𝐻 · h𝑀, 𝑁i)∞ is continuous as a map from 𝐿²(𝑀) to 𝐿¹(P) and Eq. (∗∗) that 𝑋 ↦→ h𝑋, 𝑁i∞ is continuous from H² to 𝐿¹(P). Since 𝐻 ↦→ 𝐻 · 𝑀 is an isometry from 𝐿²(𝑀) → H², it follows that 𝐻 ↦→ h𝐻 · 𝑀, 𝑁i∞ is continuous from 𝐿²(𝑀) → 𝐿¹(P). Therefore,

    h𝐻 · 𝑀, 𝑁i∞ = (𝐻 · h𝑀, 𝑁i)∞

for all 𝐻 ∈ 𝐿²(𝑀), 𝑁 ∈ H². Replace 𝑁 by 𝑁^𝑡 to obtain Eq. (5.2) in general.
We have already seen that something weaker than Eq. (5.2) characterizes 𝐻 · 𝑀 among elements of H².
To see Eq. (5.3), let 𝑁 ∈ H² and note that

    h(𝐻 · 𝑀)^𝑇, 𝑁i = h𝐻 · 𝑀, 𝑁i^𝑇 = (𝐻 · h𝑀, 𝑁i)^𝑇 = (1_{[0,𝑇]}𝐻) · h𝑀, 𝑁i = h(1_{[0,𝑇]}𝐻) · 𝑀, 𝑁i,

using Proposition 4.15(iv), Eq. (5.2), a deterministic identity, and Eq. (5.2) again, whence by the uniqueness in Eq. (5.2), (1_{[0,𝑇]}𝐻) · 𝑀 = (𝐻 · 𝑀)^𝑇. Similarly,

    h𝐻 · 𝑀^𝑇, 𝑁i = 𝐻 · h𝑀^𝑇, 𝑁i = 𝐻 · h𝑀, 𝑁i^𝑇 = (1_{[0,𝑇]}𝐻) · h𝑀, 𝑁i,

so (𝐻 · 𝑀)^𝑇 = 𝐻 · 𝑀^𝑇. J
We could rewrite Eq. (5.2) as

    h∫₀^• 𝐻𝑠 d𝑀𝑠, 𝑁i𝑡 = ∫₀ᵗ 𝐻𝑠 dh𝑀, 𝑁i𝑠.

If 𝑀 ∈ H² and 𝐻 ∈ 𝐿²(𝑀), then by Eq. (5.2),

    h𝐻 · 𝑀, 𝐻 · 𝑀i = 𝐻 · h𝑀, 𝐻 · 𝑀i = 𝐻 · h𝐻 · 𝑀, 𝑀i = 𝐻² · h𝑀, 𝑀i. (5.4)

If also 𝑁 ∈ H² and 𝐾 ∈ 𝐿²(𝑁), then similarly we obtain

    h𝐻 · 𝑀, 𝐾 · 𝑁i = 𝐻𝐾 · h𝑀, 𝑁i.

Proposition 5.5. Let 𝑀 ∈ H², 𝐻 ∈ 𝐿²(𝑀), and 𝐾 be progressive. Then

    𝐾𝐻 ∈ 𝐿²(𝑀) ⟺ 𝐾 ∈ 𝐿²(𝐻 · 𝑀),

in which case

    (𝐾𝐻) · 𝑀 = 𝐾 · (𝐻 · 𝑀).

Proof. By Eq. (5.4), we have

    E[(𝐾²𝐻² · h𝑀, 𝑀i)∞] = E[(𝐾² · h𝐻 · 𝑀, 𝐻 · 𝑀i)∞],

which gives ‖𝐾𝐻‖_{𝐿²(𝑀)} = ‖𝐾‖_{𝐿²(𝐻·𝑀)}. If this is finite, then for 𝑁 ∈ H², we have

    h(𝐾𝐻) · 𝑀, 𝑁i = 𝐾𝐻 · h𝑀, 𝑁i = 𝐾 · (𝐻 · h𝑀, 𝑁i) = 𝐾 · h𝐻 · 𝑀, 𝑁i = h𝐾 · (𝐻 · 𝑀), 𝑁i,

where the first, third, and fourth equalities use Eq. (5.2), and the second equality is justified as follows: by the Kunita–Watanabe inequality,

    ∫₀^∞ 𝐻² dh𝑀, 𝑀i < ∞ and ∫₀^∞ 𝐾²𝐻² dh𝑀, 𝑀i < ∞

imply

    ∀𝑡   ∫₀ᵗ |𝐻𝑠| |dh𝑀, 𝑁i𝑠| < ∞ and ∫₀ᵗ |𝐾𝑠𝐻𝑠| |dh𝑀, 𝑁i𝑠| < ∞.

By the uniqueness part of Eq. (5.2), we conclude that (𝐾𝐻) · 𝑀 = 𝐾 · (𝐻 · 𝑀). J
Recall that for 𝑀, 𝑁 ∈ H², (𝑀, 𝑁)_{H²} = E[𝑀∞𝑁∞] = E[h𝑀, 𝑁i∞]. By considering 𝑀^𝑡 and 𝑁^𝑡, this implies that E[𝑀𝑡𝑁𝑡] = E[h𝑀, 𝑁i𝑡] for 𝑡 ∈ [0, ∞].
Suppose that 𝑀, 𝑁 ∈ H², 𝐻 ∈ 𝐿²(𝑀), and 𝐾 ∈ 𝐿²(𝑁). Since 𝐻 · 𝑀, 𝐾 · 𝑁 ∈ H², we get

    ∀𝑡 ∈ [0, ∞]   E[∫₀ᵗ 𝐻𝑠 d𝑀𝑠] = E[(𝐻 · 𝑀)𝑡] = E[(𝐻 · 𝑀)₀] = 0    [martingale] (5.6)

and

    E[∫₀ᵗ 𝐻𝑠 d𝑀𝑠 · ∫₀ᵗ 𝐾𝑠 d𝑁𝑠] = E[(𝐻 · 𝑀)𝑡(𝐾 · 𝑁)𝑡] = E[h𝐻 · 𝑀, 𝐾 · 𝑁i𝑡]
        = E[(𝐻𝐾 · h𝑀, 𝑁i)𝑡] = E[∫₀ᵗ 𝐻𝑠𝐾𝑠 dh𝑀, 𝑁i𝑠].

In particular,

    E[(∫₀ᵗ 𝐻𝑠 d𝑀𝑠)²] = E[∫₀ᵗ 𝐻𝑠² dh𝑀, 𝑀i𝑠]; (5.8)

this equality of norms is referred to as the Itô isometry.
Note that we have defined ∫₀ᵗ 𝐻𝑠 d𝐵𝑠 for progressive 𝐻 with ∫₀ᵗ E[𝐻𝑠²] d𝑠 < ∞ by stopping 𝐵 at 𝑡. If 𝐻 is deterministic, this agrees with the Wiener integral almost surely: check first for step functions. Thus, when 𝐻 is deterministic, 𝐻 · 𝐵 is a Wiener integral process.
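A Monte Carlo illustration of Eqs. (5.6) and (5.8) (a sketch, not from the text): take 𝑀 = 𝐵 and 𝐻𝑠 = 𝐵𝑠, so dh𝑀, 𝑀i𝑠 = d𝑠 and both sides of the Itô isometry at 𝑡 = 1 equal E[∫₀¹ 𝐵𝑠² d𝑠] = 1/2. The left-endpoint sums below are the discrete analogue of the stochastic integral; the path counts and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, n_steps = 10_000, 400
dt = 1.0 / n_steps
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)], axis=1)

# Left-endpoint (Ito) sums approximating int_0^1 B_s dB_s on each path:
ito = np.sum(B[:, :-1] * dB, axis=1)

mean_integral = float(np.mean(ito))                        # Eq. (5.6): should be near 0
lhs = float(np.mean(ito ** 2))                             # E[(int H dM)^2]
rhs = float(np.mean(np.sum(B[:, :-1] ** 2, axis=1) * dt))  # E[int H^2 d<M,M>]
```

Using the left endpoint of each interval is essential: it makes the discrete sums martingales, matching the elementary processes of Definition 5.2, which use values at the left endpoint 𝑡_𝑖.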

Exercise (due 1/18). Exercise 5.25; assume sup𝑡,𝜔 𝐻𝑡 (𝜔) < ∞.


We may rewrite the martingale condition for 𝐻 · 𝑀 as follows:

    0 6 𝑠 < 𝑡 6 ∞ ⟹ E[∫₀ᵗ 𝐻𝑟 d𝑀𝑟 | ℱ𝑠] = ∫₀ˢ 𝐻𝑟 d𝑀𝑟, (5.9)

or, with ∫ₛᵗ 𝐻𝑟 d𝑀𝑟 B ∫₀ᵗ 𝐻𝑟 d𝑀𝑟 − ∫₀ˢ 𝐻𝑟 d𝑀𝑟,

    E[∫ₛᵗ 𝐻𝑟 d𝑀𝑟 | ℱ𝑠] = 0.

5.1.2. Stochastic Integrals for Local Martingales


The stopping time identities Eq. (5.3) will allow us to extend stochastic integrals to continuous local martingales. If 𝑀 is a continuous local martingale, we again write 𝐿²(𝑀) for the set of progressive processes 𝐻 in 𝐿²(h𝑀, 𝑀i P). We write 𝐿²_loc(𝑀) for the set of progressive 𝐻 such that

    a.s. ∀𝑡 > 0   ∫₀ᵗ 𝐻𝑠² dh𝑀, 𝑀i𝑠 < ∞.

Theorem 5.6. Let 𝑀 be a continuous local martingale. If 𝐻 ∈ 𝐿²_loc(𝑀), then there exists a unique continuous local martingale with initial value 0, denoted 𝐻 · 𝑀, such that for all continuous local martingales 𝑁,

    h𝐻 · 𝑀, 𝑁i = 𝐻 · h𝑀, 𝑁i. (5.10)

If 𝑇 is a stopping time, then for all 𝐻 ∈ 𝐿²_loc(𝑀),

    (1_{[0,𝑇]}𝐻) · 𝑀 = (𝐻 · 𝑀)^𝑇 = 𝐻 · 𝑀^𝑇. (5.11)

If 𝐻 ∈ 𝐿²_loc(𝑀) and 𝐾 is progressive, then 𝐾 ∈ 𝐿²_loc(𝐻 · 𝑀) if and only if 𝐻𝐾 ∈ 𝐿²_loc(𝑀), in which case

    𝐾 · (𝐻 · 𝑀) = (𝐾𝐻) · 𝑀. (5.12)

If 𝑀 ∈ H² and 𝐻 ∈ 𝐿²(𝑀), then this definition of 𝐻 · 𝑀 agrees with that of Theorem 5.4.

Proof. Since h𝑀 − 𝑀₀, 𝑁i = h𝑀, 𝑁i for every continuous local martingale 𝑁, we may set 𝐻 · 𝑀 B 𝐻 · (𝑀 − 𝑀₀) (to be defined) and assume that 𝑀₀ = 0. Also, we may take 𝐻 to be 0 on the negligible set where ∫₀ᵗ 𝐻𝑠² dh𝑀, 𝑀i𝑠 = ∞ for some 𝑡 > 0.
The idea is to localize and put together the resulting definitions. For 𝑛 > 1, let

    𝑇𝑛 B inf{𝑡 > 0 ; ∫₀ᵗ (1 + 𝐻𝑠²) dh𝑀, 𝑀i𝑠 > 𝑛}.

This gives a sequence of stopping times that increase to infinity. Since

    ∀𝑡 > 0   h𝑀^{𝑇𝑛}, 𝑀^{𝑇𝑛}i𝑡 = h𝑀, 𝑀i_{𝑡∧𝑇𝑛} 6 𝑛    [Proposition 4.11],

Theorem 4.13 tells us that 𝑀^{𝑇𝑛} ∈ H². Also,

    ∫₀^∞ 𝐻𝑠² dh𝑀^{𝑇𝑛}, 𝑀^{𝑇𝑛}i𝑠 = ∫₀^{𝑇𝑛} 𝐻𝑠² dh𝑀, 𝑀i𝑠 6 𝑛,

so 𝐻 ∈ 𝐿²(𝑀^{𝑇𝑛}). Therefore, Theorem 5.4 defines 𝐻 · 𝑀^{𝑇𝑛}. These are consistent: if 𝑚 > 𝑛, then

    (𝐻 · 𝑀^{𝑇𝑚})^{𝑇𝑛} = 𝐻 · (𝑀^{𝑇𝑚})^{𝑇𝑛} = 𝐻 · 𝑀^{𝑇𝑛}    [Eq. (5.3)].

Thus, there exists a unique process, 𝐻 · 𝑀, such that ∀𝑛 (𝐻 · 𝑀)^{𝑇𝑛} = 𝐻 · 𝑀^{𝑇𝑛}. Since 𝐻 · 𝑀^{𝑇𝑛} has continuous sample paths, so does 𝐻 · 𝑀. Since (𝐻 · 𝑀)𝑡 = lim_{𝑛→∞} (𝐻 · 𝑀^{𝑇𝑛})𝑡, we get that 𝐻 · 𝑀 is adapted. Since (𝐻 · 𝑀)^{𝑇𝑛} is a martingale (in H², even), we get that 𝐻 · 𝑀 is a continuous local martingale.
Now we verify the properties (5.10)–(5.12).
To prove Eq. (5.10), we may assume that 𝑁₀ = 0. For 𝑛 > 1, write

    𝑇′𝑛 B inf{𝑡 > 0 ; |𝑁𝑡| > 𝑛},   𝑆𝑛 B 𝑇𝑛 ∧ 𝑇′𝑛.

As before, 𝑁^{𝑇′𝑛} ∈ H², so

    h𝐻 · 𝑀, 𝑁i^{𝑆𝑛} = (h𝐻 · 𝑀, 𝑁i^{𝑇𝑛})^{𝑇′𝑛} = h(𝐻 · 𝑀)^{𝑇𝑛}, 𝑁^{𝑇′𝑛}i    [Proposition 4.15(iv)]
        = h𝐻 · 𝑀^{𝑇𝑛}, 𝑁^{𝑇′𝑛}i    [definition]
        = 𝐻 · h𝑀^{𝑇𝑛}, 𝑁^{𝑇′𝑛}i    [Eq. (5.2)]
        = 𝐻 · h𝑀, 𝑁i^{𝑆𝑛}    [Proposition 4.15(iv)]
        = (𝐻 · h𝑀, 𝑁i)^{𝑆𝑛}    [deterministic].

Since 𝑆𝑛 → ∞, this gives h𝐻 · 𝑀, 𝑁i = 𝐻 · h𝑀, 𝑁i, as desired. If 𝑋 is also a continuous local martingale with 𝑋₀ = 0 and h𝑋, 𝑁i = 𝐻 · h𝑀, 𝑁i for all continuous local martingales 𝑁, then h𝐻 · 𝑀 − 𝑋, 𝑁i = 0, so choosing 𝑁 B 𝐻 · 𝑀 − 𝑋, we get 𝑋 = 𝐻 · 𝑀 from Proposition 4.12.
The proof of Eq. (5.11) is like that of Eq. (5.3), and the proof of Eq. (5.12) is like that of Proposition 5.5.
If 𝑀 ∈ H² and 𝐻 ∈ 𝐿²(𝑀), then h𝐻 · 𝑀, 𝐻 · 𝑀i = 𝐻 · h𝑀, 𝐻 · 𝑀i = 𝐻² · h𝑀, 𝑀i by two uses of Eq. (5.10). This shows that 𝐻 · 𝑀 ∈ H², so the characteristic property Eq. (5.2) (which holds by Eq. (5.10)) shows that the definitions agree. J
We again write

    ∫₀ᵗ 𝐻𝑠 d𝑀𝑠 B (𝐻 · 𝑀)𝑡.

We can then rewrite Eq. (5.10) as

    h∫₀^• 𝐻𝑠 d𝑀𝑠, 𝑁i𝑡 = ∫₀ᵗ 𝐻𝑠 dh𝑀, 𝑁i𝑠.

If 𝐻 ∈ 𝐿 2loc (𝑀), 0 6 𝑡 6 ∞, and E[∫_0^𝑡 𝐻𝑠² dh𝑀, 𝑀i𝑠 ] < ∞, then (𝐻 · 𝑀) 𝑡 ∈ H2 by
Theorem 4.13, so we have the analogues of Eqs. (5.6), (5.8) and (5.9):

E[∫_0^𝑡 𝐻𝑠 d𝑀𝑠 ] = 0,

E[(∫_0^𝑡 𝐻𝑠 d𝑀𝑠 )²] = E[∫_0^𝑡 𝐻𝑠² dh𝑀, 𝑀i𝑠 ].

In particular, if 𝐻 ∈ 𝐿 2 (𝑀) (the case 𝑡 = ∞), then 𝐻 · 𝑀 ∈ H2 (even though 𝑀 need not be in H2 ).
Exercise (due 1/18). Give an example of a continuous local martingale 𝑀 and a process 𝐻 ∈ 𝐿 2loc (𝑀)
such that

E[∫_0^1 𝐻𝑠 d𝑀𝑠 ] ≠ 0   and   E[(∫_0^1 𝐻𝑠 d𝑀𝑠 )²] ≠ E[∫_0^1 𝐻𝑠² dh𝑀, 𝑀i𝑠 ].

Hint: use an 𝑀 that is not a true martingale.


Exercise (due 1/18). Exercise 5.25 (in general).

5.1.3. Stochastic Integrals for Semimartingales


We call a progressive process 𝐻 locally bounded if

a.s. ∀𝑡 > 0   sup_{06𝑠6𝑡} |𝐻𝑠 | < ∞.

This is equivalent to the existence of stopping times 𝑇𝑛 ↑ ∞ such that 1 [0,𝑇𝑛 ] 𝐻 is bounded, i.e., that
there exists a negligible set 𝒩 such that

sup_{𝜔∉𝒩, 𝑡>0} 1 [0,𝑇𝑛 (𝜔)] (𝑡) |𝐻𝑡 (𝜔)| < ∞.

Note that if 𝐻 is adapted and continuous, then 𝐻 is locally bounded.


The assumption that 𝐻 is locally bounded is convenient, because then for each finite-variation
process 𝑉,

a.s. ∀𝑡 > 0   ∫_0^𝑡 |𝐻𝑠 | |d𝑉𝑠 | < ∞,

i.e., 𝐻 ∈ 𝐿 1loc (|d𝑉 |), and for each continuous local martingale, 𝑀, we have 𝐻 ∈ 𝐿 2loc (𝑀).
Definition 5.7. Let 𝑋 = 𝑀 + 𝑉 be the canonical decomposition of a continuous semimartingale,
𝑋, and 𝐻 be locally bounded. We define the stochastic integral 𝐻 · 𝑋 to be the continuous
semimartingale with canonical decomposition

𝐻 · 𝑋 B 𝐻 · 𝑀 + 𝐻 · 𝑉.

We also write ∫_0^𝑡 𝐻𝑠 d𝑋𝑠 B (𝐻 · 𝑋)𝑡 .

Remark. We could have done the same as long as 𝐻 ∈ 𝐿 2loc (𝑀) ∩ 𝐿 1loc (|d𝑉 |).

The following properties are evident:


(i) (𝐻, 𝑋) ↦→ 𝐻 · 𝑋 is bilinear.

(ii) If 𝐻 and 𝐾 are locally bounded, then 𝐻 · (𝐾 · 𝑋) = (𝐻𝐾) · 𝑋. Rewritten: if 𝑌𝑡 = ∫_0^𝑡 𝐾𝑠 d𝑋𝑠 ,
then ∫_0^𝑡 𝐻𝑠 d𝑌𝑠 = ∫_0^𝑡 𝐻𝑠 𝐾𝑠 d𝑋𝑠 .

(iii) For all stopping times 𝑇, (𝐻 · 𝑋) 𝑇 = (1 [0,𝑇] 𝐻) · 𝑋 = 𝐻 · 𝑋 𝑇 .

(iv) If 𝑋 is a continuous local martingale, then so is 𝐻 · 𝑋; if 𝑋 is a finite-variation process, then
so is 𝐻 · 𝑋.
The next property is less evident:
(v) If 𝐻𝑠 (𝜔) = Σ_{𝑖=0}^{𝑝−1} 𝐻 (𝑖) (𝜔) 1 (𝑡𝑖 ,𝑡𝑖+1 ] (𝑠), where 0 = 𝑡0 < 𝑡1 < · · · < 𝑡𝑝 and 𝐻 (𝑖) ∈ ℱ𝑡𝑖 is
locally bounded, then

(𝐻 · 𝑋)𝑡 = Σ_{𝑖=0}^{𝑝−1} 𝐻 (𝑖) (𝑋𝑡𝑖+1 ∧𝑡 − 𝑋𝑡𝑖 ∧𝑡 ).

This is clear if 𝑀 = 0, so it suffices to prove it when 𝑉 = 0. If 𝐻 is bounded and 𝑀 ∈ H2 ,
then this is the definition of 𝐻 · 𝑀. In general, we may assume 𝑀0 = 0; let

𝑇𝑛 B inf{𝑡 > 0 ; |𝐻𝑡 | > 𝑛} = min{𝑡𝑖 ; |𝐻 (𝑖) | > 𝑛}

and

𝑆𝑛 B inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑛}.

Note that

1 [0,𝑇𝑛 ] (𝑠) 𝐻𝑠 = Σ_{𝑖=0}^{𝑝−1} 𝐻^{(𝑛)}_{(𝑖)} 1 (𝑡𝑖 ,𝑡𝑖+1 ] (𝑠),

where 𝐻^{(𝑛)}_{(𝑖)} B 1 [𝑇𝑛 >𝑡𝑖 ] 𝐻 (𝑖) ∈ ℱ𝑡𝑖 . Therefore,

(𝐻 · 𝑀)𝑡∧𝑇𝑛 ∧𝑆𝑛 = (1 [0,𝑇𝑛 ] 𝐻 · 𝑀 𝑆𝑛 )𝑡   [Eq. (5.11)]
= Σ_{𝑖=0}^{𝑝−1} 𝐻^{(𝑛)}_{(𝑖)} (𝑀^{𝑆𝑛}_{𝑡𝑖+1 ∧𝑡} − 𝑀^{𝑆𝑛}_{𝑡𝑖 ∧𝑡}).   [definition]

Now let 𝑛 → ∞.
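Property (v) is easy to check numerically. The following sketch (not part of the notes; the grid, knots, and coefficient values are arbitrary illustrative choices) integrates a step process against a sampled Brownian path and confirms that the telescoping-sum formula agrees with the grid Riemann sum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                                   # grid steps on [0, 1]
dt = 1.0 / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])    # sampled Brownian path as integrator X

knots = [0, 25_000, 50_000, 75_000, n]        # t_i = knots[i] * dt
H_vals = [1.0, -2.0, 0.5, 3.0]                # H_(i): constants, so F_{t_i}-measurable

def elementary_integral(k):
    """(H · B) at time k*dt via property (v): sum_i H_(i) (B_{t_{i+1} ∧ t} - B_{t_i ∧ t})."""
    return sum(h * (B[min(b, k)] - B[min(a, k)])
               for h, a, b in zip(H_vals, knots[:-1], knots[1:]))

# The same integral as a Riemann sum over the grid, sum_k H_{s_k} (B_{s_{k+1}} - B_{s_k}):
H_path = np.zeros(n)
for h, a, b in zip(H_vals, knots[:-1], knots[1:]):
    H_path[a:b] = h                           # H is constant on (t_i, t_{i+1}]
riemann = float(np.sum(H_path * dB))

print(abs(elementary_integral(n) - riemann))  # agreement up to floating-point rounding
```

The two computations are the same sum grouped differently, so they agree to rounding error.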
Exercise. Show that if 𝑍 ∈ ℱ0 and 𝑋 is a continuous semimartingale, then 𝐻 · 𝑋 = 𝑍 (𝑋 − 𝑋0 ),
where 𝐻𝑡 B 𝑍 for all 𝑡.

5.1.4. Convergence of Stochastic Integrals

Proposition 5.8 (Dominated Convergence Theorem). Let 𝑋 = 𝑀 +𝑉 be the canonical decomposition


of a continuous semimartingale and 𝑡 > 0. Suppose that 𝐻, 𝐻 (1) , 𝐻 (2) , . . . are locally bounded,
progressive processes and that 𝐾 is a nonnegative, progressive process such that almost surely,
(i) ∀𝑠 ∈ [0, 𝑡] lim𝑛→∞ 𝐻𝑠(𝑛) = 𝐻𝑠 ,
(ii) ∀𝑠 ∈ [0, 𝑡] ∀𝑛 > 1 |𝐻𝑠(𝑛) | 6 𝐾 𝑠 , and
(iii) ∫_0^𝑡 (𝐾𝑠 )² dh𝑀, 𝑀i𝑠 < ∞ and ∫_0^𝑡 𝐾𝑠 |d𝑉𝑠 | < ∞.

Then

lim_{𝑛→∞} ∫_0^𝑡 𝐻𝑠(𝑛) d𝑋𝑠 = ∫_0^𝑡 𝐻𝑠 d𝑋𝑠   in probability.

That 𝐻 and 𝐻 (𝑛) be locally bounded can be weakened. In (i) and (ii), we can weaken “∀𝑠 ∈ [0, 𝑡]”
to “∀𝑠 ∈ [0, 𝑡] outside a set of dh𝑀, 𝑀i-measure 0 and of |d𝑉 |-measure 0”; this will be clear from
the proof. Part (iii) is automatic if 𝐾 is locally bounded.

Proof. The Lebesgue dominated convergence theorem gives ∫_0^𝑡 𝐻𝑠(𝑛) d𝑉𝑠 → ∫_0^𝑡 𝐻𝑠 d𝑉𝑠 where
(i)–(iii) hold, hence almost surely. It remains to show that

∫_0^𝑡 𝐻𝑠(𝑛) d𝑀𝑠 → ∫_0^𝑡 𝐻𝑠 d𝑀𝑠   in probability.

For 𝑝 > 1, let

𝑇𝑝 B inf{𝑟 ∈ [0, 𝑡] ; ∫_0^𝑟 (𝐾𝑠 )² dh𝑀, 𝑀i𝑠 > 𝑝} ∧ 𝑡.

Then almost surely, by (iii), for all large 𝑝, 𝑇𝑝 = 𝑡. Now,

E[∫_0^{𝑇𝑝} (𝐻𝑠(𝑛) − 𝐻𝑠 )² dh𝑀, 𝑀i𝑠 ] 6 E[∫_0^{𝑇𝑝} (2𝐾𝑠 )² dh𝑀, 𝑀i𝑠 ] 6 4𝑝 < ∞,

whence

E[((𝐻 (𝑛) − 𝐻) · 𝑀)²_{𝑇𝑝}] = E[∫_0^{𝑇𝑝} (𝐻𝑠(𝑛) − 𝐻𝑠 )² dh𝑀, 𝑀i𝑠 ] → 0

as 𝑛 → ∞ by Lebesgue's dominated convergence theorem applied to dh𝑀, 𝑀i^{𝑇𝑝} ⊗ P. Because
P[𝑇𝑝 = 𝑡] → 1 as 𝑝 → ∞, we get the result. J

We can deduce the following Riemann-integral type of result:


Proposition 5.9. Let 𝑋 be a continuous semimartingale and 𝐻 be a continuous adapted process. If
𝑡 > 0 and 0 = 𝑡^𝑛_0 < 𝑡^𝑛_1 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝑡 is any sequence of subdivisions of [0, 𝑡] with mesh
going to 0, then

lim_{𝑛→∞} Σ_{𝑖=0}^{𝑝𝑛 −1} 𝐻_{𝑡^𝑛_𝑖} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) = ∫_0^𝑡 𝐻𝑠 d𝑋𝑠   in probability.

Proof. Note that the sum on the left-hand side equals ∫_0^𝑡 𝐻𝑠(𝑛) d𝑋𝑠 for the step-function progressive
process

𝐻𝑠(𝑛) B Σ_{𝑖=0}^{𝑝𝑛 −1} 𝐻_{𝑡^𝑛_𝑖} 1 (𝑡^𝑛_𝑖 ,𝑡^𝑛_{𝑖+1} ] (𝑠) + 𝐻0 1 {0} (𝑠)

by property (v) in Section 5.1.3. Since 𝐾𝑠 B max_{06𝑟6𝑠} |𝐻𝑟 | is locally bounded, the result follows. J

It is crucial that the Riemann sum used the left-hand endpoints. For example, let 𝐻 = 𝑋. Then

Σ_{𝑖=0}^{𝑝𝑛 −1} 𝑋_{𝑡^𝑛_{𝑖+1}} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) = Σ_{𝑖=0}^{𝑝𝑛 −1} 𝑋_{𝑡^𝑛_𝑖} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) + Σ_{𝑖=0}^{𝑝𝑛 −1} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖})²,

where the first sum on the right converges in probability to ∫_0^𝑡 𝑋𝑠 d𝑋𝑠 [Proposition 5.9] and the
second to h𝑋, 𝑋i𝑡 [Proposition 4.21, if the subdivisions are increasing].

Thus, we get a different limit when we use the right-hand endpoints unless the martingale part of 𝑋 is
constant on [0, 𝑡]. On the other hand, this calculation is useful: if we add to it that of Proposition 5.9,
we get
(𝑋𝑡 )² − (𝑋0 )² = 2 ∫_0^𝑡 𝑋𝑠 d𝑋𝑠 + h𝑋, 𝑋i𝑡 .
This can also be derived from Itô’s formula (in the next section).
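A numerical illustration (not part of the notes; the seed and grid size are arbitrary): for a sampled Brownian path, the right-endpoint and left-endpoint Riemann sums differ by exactly the sum of squared increments, which approximates h𝐵, 𝐵i𝑡 = 𝑡, and the displayed identity for (𝑋𝑡 )² − (𝑋0 )² holds exactly at the discrete level.

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 200_000, 1.0
dB = rng.normal(0.0, np.sqrt(t / n), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])

left  = float(np.sum(B[:-1] * dB))   # left endpoints  -> ∫_0^t B_s dB_s
right = float(np.sum(B[1:] * dB))    # right endpoints -> ∫_0^t B_s dB_s + <B,B>_t
quad  = float(np.sum(dB ** 2))       # -> <B,B>_t = t

print(right - left - quad)             # 0 up to rounding: discrete algebraic identity
print(quad)                            # ≈ t = 1
print(B[-1] ** 2 - 2.0 * left - quad)  # 0 up to rounding: discrete B_t² identity
```

The two exact cancellations are algebra; only `quad ≈ t` uses the probabilistic limit.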
Exercise (due 1/25). Show that if 𝑋 is a continuous semimartingale, 𝑡 > 0, and 0 = 𝑡^𝑛_0 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝑡
is any sequence of subdivisions of [0, 𝑡] with mesh going to 0, then

lim_{𝑛→∞} Σ_{𝑖=1}^{𝑝𝑛} (𝑋_{𝑡^𝑛_𝑖} − 𝑋_{𝑡^𝑛_{𝑖−1}})² = h𝑋, 𝑋i𝑡   in probability.

5.2. Itô’s Formula


This is analogous to the fundamental theorem of calculus: in order to calculate an integral, it
helps to know how to differentiate. However, there is no stochastic derivative. The formula also
shows that the class of continuous semimartingales is closed under compositions with 𝐶 2 functions.
Theorem 5.10 (Itô's Formula). Let 𝑋 be a continuous semimartingale and 𝐹 ∈ 𝐶 2 (R) (i.e., twice
continuously differentiable). Then

∀𝑡 > 0   𝐹 (𝑋𝑡 ) = 𝐹 (𝑋0 ) + ∫_0^𝑡 𝐹 ′ (𝑋𝑠 ) d𝑋𝑠 + (1/2) ∫_0^𝑡 𝐹 ′′ (𝑋𝑠 ) dh𝑋, 𝑋i𝑠 .
[antiderivative] [value at 0] [integrand] [derivative]



We may also write this as

𝐹 (𝑋) = 𝐹 (𝑋0 ) + 𝐹 ′ (𝑋) · 𝑋 + (1/2) 𝐹 ′′ (𝑋) · h𝑋, 𝑋i,
[CLM+FV] [FV]

showing that 𝐹 (𝑋) is a continuous semimartingale and giving its canonical decomposition. More
generally, if 𝑋 1 , . . . , 𝑋 𝑝 are continuous semimartingales and 𝐹 ∈ 𝐶 2 (R 𝑝 ), then

∀𝑡 > 0   𝐹 (𝑋𝑡1 , . . . , 𝑋𝑡𝑝 ) = 𝐹 (𝑋01 , . . . , 𝑋0𝑝 ) + Σ_{𝑖=1}^{𝑝} ∫_0^𝑡 𝐹𝑖 (𝑋𝑠1 , . . . , 𝑋𝑠𝑝 ) d𝑋𝑠𝑖
+ (1/2) Σ_{𝑖,𝑗=1}^{𝑝} ∫_0^𝑡 𝐹𝑖𝑗 (𝑋𝑠1 , . . . , 𝑋𝑠𝑝 ) dh𝑋 𝑖 , 𝑋 𝑗 i𝑠 ,

which we may write for 𝑋 = (𝑋 1 , . . . , 𝑋 𝑝 ) as

𝐹 (𝑋) = 𝐹 (𝑋0 ) + ∇𝐹 (𝑋) · 𝑋 + (1/2) h𝑋, ∇2 𝐹 (𝑋) · 𝑋i,

where

(𝐻 · 𝑋)𝑘 B Σ_{𝑗=1}^{𝑝} 𝐻𝑘𝑗 · 𝑋 𝑗 (1 6 𝑘 6 𝑞)   and   h𝑋, 𝑌 i B Σ_{𝑗=1}^{𝑝} h𝑋 𝑗 , 𝑌 𝑗 i

when 𝐻 is a (𝑞 × 𝑝)-matrix-valued process and 𝑌 is a 𝑝-dimensional process.
Lemma. If 𝑌𝑛 → 0 and 𝑍𝑛 → 𝑍 in probability, then 𝑌𝑛 𝑍𝑛 → 0 in probability.
This lemma is a special case of Slutsky’s theorem, Exercise 25.7 in Billingsley’s book,
Probability and Measure.
Proof. Let 𝜀 > 0. Choose 𝐾 such that P[|𝑍 | > 𝐾] < 𝜀. Choose 𝑁 such that

P[|𝑍𝑛 − 𝑍 | > 1] < 𝜀   and   P[|𝑌𝑛 | > 𝜀/(𝐾 + 1)] < 𝜀

for 𝑛 > 𝑁. Then for 𝑛 > 𝑁,

P[|𝑌𝑛 𝑍𝑛 | > 𝜀] 6 P[|𝑌𝑛 | > 𝜀/(𝐾 + 1)] + P[|𝑍𝑛 | > 𝐾 + 1]
< 𝜀 + P[|𝑍 | > 𝐾] + P[|𝑍𝑛 − 𝑍 | > 1] < 3𝜀. J
Lemma. If 𝑋 is a continuous semimartingale and 0 = 𝑡^𝑛_0 < · · · < 𝑡^𝑛_{𝑝𝑛} = 𝑡 form an increasing
sequence of subdivisions of [0, 𝑡] whose mesh goes to zero, then there exists a subsequence (𝑛𝑘 )𝑘>1
such that almost surely,

Σ_{𝑖=0}^{𝑝_{𝑛𝑘} −1} (𝑋_{𝑡^{𝑛𝑘}_{𝑖+1}} − 𝑋_{𝑡^{𝑛𝑘}_𝑖})² 𝛿_{𝑡^{𝑛𝑘}_𝑖} ⇒ 1 [0,𝑡] dh𝑋, 𝑋i

as 𝑘 → ∞.

Proof. Write 𝜇𝑛 B Σ_{𝑖=0}^{𝑝𝑛 −1} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖})² 𝛿_{𝑡^𝑛_𝑖}. Let 𝐷 B {𝑡^𝑛_𝑖 ; 𝑛 > 1, 0 6 𝑖 6 𝑝𝑛 }.
We have by Proposition 4.21 that

𝜇𝑛 ([0, 𝑟]) → h𝑋, 𝑋i𝑟   in probability

as 𝑛 → ∞ for each 𝑟 ∈ 𝐷. Choose (𝑛𝑘 ) such that this converges almost surely for all 𝑟 ∈ 𝐷. Then it
also converges for all 𝑟 ∈ [0, 𝑡]. J

Proof of Theorem 5.10. Suppose first 𝑝 = 1. Let (𝑡^𝑛_𝑖 )_{𝑖=0}^{𝑝𝑛} be an increasing sequence of subdivisions
of [0, 𝑡] with mesh going to 0. Then

𝐹 (𝑋𝑡 ) = 𝐹 (𝑋0 ) + Σ_{𝑖=0}^{𝑝𝑛 −1} [𝐹 (𝑋_{𝑡^𝑛_{𝑖+1}}) − 𝐹 (𝑋_{𝑡^𝑛_𝑖})]
= 𝐹 (𝑋0 ) + Σ_{𝑖=0}^{𝑝𝑛 −1} [𝐹 ′ (𝑋_{𝑡^𝑛_𝑖})(𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) + (1/2) 𝐹 ′′ (𝜉𝑛,𝑖 )(𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖})²]

for some 𝜉𝑛,𝑖 between 𝑋_{𝑡^𝑛_𝑖} and 𝑋_{𝑡^𝑛_{𝑖+1}}. By Proposition 5.9,

Σ_{𝑖=0}^{𝑝𝑛 −1} 𝐹 ′ (𝑋_{𝑡^𝑛_𝑖})(𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) → ∫_0^𝑡 𝐹 ′ (𝑋𝑠 ) d𝑋𝑠   in probability.

Since max𝑖 |𝐹 ′′ (𝜉𝑛,𝑖 ) − 𝐹 ′′ (𝑋_{𝑡^𝑛_𝑖})| → 0 as 𝑛 → ∞ (because 𝐹 ′′ ◦ 𝑋 is uniformly continuous on [0, 𝑡]
and the mesh goes to 0), the first lemma in combination with Proposition 4.21 shows that it suffices
to prove that

Σ_{𝑖=0}^{𝑝𝑛 −1} 𝐹 ′′ (𝑋_{𝑡^𝑛_𝑖})(𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖})² → ∫_0^𝑡 𝐹 ′′ (𝑋𝑠 ) dh𝑋, 𝑋i𝑠   in probability.

In fact, we prove this holds almost surely along a subsequence [which could be taken to be a
subsequence of any given sequence, so the claim of the display does hold]. Note that the left-hand
side equals

∫_0^𝑡 𝐹 ′′ (𝑋𝑠 ) d𝜇𝑛 (𝑠),

where 𝜇𝑛 is as in the proof of the second lemma. Since 𝐹 ′′ ◦ 𝑋 is continuous on [0, 𝑡], the result
follows from the lemma (with weak convergence applied to 𝐹 ′′ ◦ 𝑋).
For 𝑝 > 1, we consider 𝐹 on the broken line from 𝑋0 to 𝑋𝑡 that is linear between 𝑋_{𝑡^𝑛_𝑖} and 𝑋_{𝑡^𝑛_{𝑖+1}}.
We may again choose 𝜉𝑛,𝑖 on that broken line between 𝑋_{𝑡^𝑛_𝑖} and 𝑋_{𝑡^𝑛_{𝑖+1}} to write

𝐹 (𝑋𝑡 ) = 𝐹 (𝑋0 ) + Σ_{𝑖=0}^{𝑝𝑛 −1} ∇𝐹 (𝑋_{𝑡^𝑛_𝑖}) · (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖})   [dot product]
+ (1/2) Σ_{𝑖=0}^{𝑝𝑛 −1} (𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}) · ∇2 𝐹 (𝜉𝑛,𝑖 )(𝑋_{𝑡^𝑛_{𝑖+1}} − 𝑋_{𝑡^𝑛_𝑖}).   [dot product]

Proposition 5.9 shows that the first sum converges to ∫_0^𝑡 ∇𝐹 (𝑋𝑠 ) · d𝑋𝑠 in probability. We can apply
the second lemma to 𝑋 𝑗 , 𝑋 ℓ and 𝑋 𝑗 + 𝑋 ℓ to get a subsequence (𝑛𝑘 ) such that almost surely,

Σ_{𝑖=0}^{𝑝_{𝑛𝑘} −1} (𝑋^𝑗_{𝑡^{𝑛𝑘}_{𝑖+1}} − 𝑋^𝑗_{𝑡^{𝑛𝑘}_𝑖})(𝑋^ℓ_{𝑡^{𝑛𝑘}_{𝑖+1}} − 𝑋^ℓ_{𝑡^{𝑛𝑘}_𝑖}) 𝛿_{𝑡^{𝑛𝑘}_𝑖} ⇒ 1 [0,𝑡] dh𝑋 𝑗 , 𝑋 ℓ i,

and this completes the proof. J

Exercise (due 1/25). Exercise 5.26.


If we use 𝐹 (𝑥, 𝑦) B 𝑥𝑦, then we get a formula for integration by parts:

𝑋𝑡 𝑌𝑡 = 𝑋0 𝑌0 + ∫_0^𝑡 𝑋𝑠 d𝑌𝑠 + ∫_0^𝑡 𝑌𝑠 d𝑋𝑠 + h𝑋, 𝑌 i𝑡 .

If we use 𝑌 = 𝑋, then

𝑋𝑡² = 𝑋0² + 2 ∫_0^𝑡 𝑋𝑠 d𝑋𝑠 + h𝑋, 𝑋i𝑡 . (∗)

When 𝑋 is a continuous local martingale, we get the formula promised during the proof of Theorem 4.9
and seen at the end of Section 5.1.3:

𝑋² − h𝑋, 𝑋i = 𝑋0² + 2 ∫_0^• 𝑋𝑠 d𝑋𝑠 .

Also, Eq. (∗) implies the integration by parts formula by applying Eq. (∗) to 𝑋, 𝑌 , and 𝑋 + 𝑌 . In fact,
we can prove Itô’s formula from integration by parts (and therefore from Eq. (∗)):
Exercise (due 1/25). (1) Use integration by parts to show that if Itô’s formula holds for some
𝐹 ∈ 𝐶 2 (R 𝑝 ), then it also holds for all 𝐺 of the form

𝐺 (𝑥 1 , . . . , 𝑥 𝑝 ) = 𝑥𝑖 𝐹 (𝑥 1 , . . . , 𝑥 𝑝 ).

(2) Deduce that Itô’s formula holds for all polynomials, 𝐹.


(3) Show that if 𝐾 ⊆ R 𝑝 is compact, then for each 𝐹 ∈ 𝐶 2 (R 𝑝 ), there exist polynomials 𝑃𝑛 such
that

lim_{𝑛→∞} ( k𝐹 − 𝑃𝑛 k𝐶 (𝐾) + Σ_{𝑖=1}^{𝑝} k𝐹𝑖 − (𝑃𝑛 )𝑖 k𝐶 (𝐾) + Σ_{𝑖,𝑗=1}^{𝑝} k𝐹𝑖𝑗 − (𝑃𝑛 )𝑖𝑗 k𝐶 (𝐾) ) = 0.

(4) Deduce that if 𝑋 takes values only in 𝐾, then Itô’s formula holds for all 𝐹 ∈ 𝐶 2 (R 𝑝 ).
(5) By using stopping times, deduce the full Itô’s formula.
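Itô's formula can also be checked numerically along a single sampled path. The following sketch (not part of the notes; 𝐹 = cos and the grid size are arbitrary choices) discretizes the stochastic integral with left endpoints and uses dh𝐵, 𝐵i𝑠 = d𝑠 for Brownian motion.

```python
import numpy as np

rng = np.random.default_rng(2)
n, t = 500_000, 1.0
dt = t / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate([[0.0], np.cumsum(dB)])

F, dF, d2F = np.cos, lambda x: -np.sin(x), lambda x: -np.cos(x)

stoch = float(np.sum(dF(B[:-1]) * dB))          # ∫_0^t F'(B_s) dB_s, left endpoints
drift = float(np.sum(d2F(B[:-1])) * dt / 2.0)   # (1/2) ∫_0^t F''(B_s) ds
err = F(B[-1]) - F(B[0]) - stoch - drift
print(err)   # small discretization error
```

Without the drift term the discrepancy would be of order h𝐵, 𝐵i𝑡; with it, the error is only the discretization error of the Riemann sums.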
Exercise. Show that if 𝑋 and 𝑌 are continuous semimartingales, then h𝑋𝑌 , 𝑋𝑌 i = 𝑋 2 · h𝑌 , 𝑌 i +
2(𝑋𝑌 ) · h𝑌 , 𝑋i + 𝑌 2 · h𝑋, 𝑋i.

If 𝐵 is a 𝑑-dimensional Brownian motion, then the components of 𝐵 − 𝐵0 are independent,
whence h𝐵𝑖 , 𝐵 𝑗 i = 0 for 𝑖 ≠ 𝑗, and Itô's formula becomes

𝐹 (𝐵𝑡 ) = 𝐹 (𝐵0 ) + ∫_0^𝑡 ∇𝐹 (𝐵𝑠 ) · d𝐵𝑠 + (1/2) ∫_0^𝑡 Δ𝐹 (𝐵𝑠 ) d𝑠.

If 𝐹 ∈ 𝐶 2 (R+ × R𝑑 ) instead, then we get

𝐹 (𝑡, 𝐵𝑡 ) = 𝐹 (0, 𝐵0 ) + ∫_0^𝑡 ∇𝐹 (𝑠, 𝐵𝑠 ) · d𝐵𝑠 + ∫_0^𝑡 (𝜕𝐹/𝜕𝑡 + (1/2) Δ𝐹)(𝑠, 𝐵𝑠 ) d𝑠,

where ∇ is the gradient and Δ the Laplacian in the space variables.
We actually do not need 𝐹 to have a second derivative in 𝑡. Indeed, in general, if 𝑋 1 , . . . , 𝑋 𝑝
are continuous semimartingales and 𝑋 𝑖 (𝑖 ∈ 𝐼) are finite-variation, then we need only 𝐹𝑖 continuous
(1 6 𝑖 6 𝑝) and 𝐹𝑖, 𝑗 continuous (𝑖, 𝑗 ∉ 𝐼).
Suppose 𝑈 ⊆ R 𝑝 is open and 𝐹 ∈ 𝐶 2 (𝑈). If 𝑋𝑠 ∈ 𝑈 almost surely for 0 6 𝑠 < 𝑇 (𝑇 random)
and 𝑋0 ∈ 𝐾, where 𝐾 is compact, then we may still apply Itô's formula to 𝐹 (𝑋𝑇 ). To see this, let
𝐾 ⊆ 𝑉1 ⊆ 𝑉2 ⊆ · · · be open with 𝑉𝑛 ⊆ 𝑈 and ∪𝑛 𝑉𝑛 = 𝑈. Let

𝑇𝑛 B inf{𝑡 > 0 ; 𝑋𝑡 ∉ 𝑉𝑛 },

which is a stopping time by Proposition 3.9. By using a partition of unity, we may construct
𝐺𝑛 ∈ 𝐶 2 (R 𝑝 ) such that 𝐺𝑛 = 𝐹 on 𝑉𝑛 . Itô's formula applied to 𝐺𝑛 (𝑋 𝑇𝑛 ) involves only 𝐹 and its
derivatives. Then we may let 𝑛 → ∞, noting that 𝑇𝑛 ∧ 𝑇 → 𝑇.
Exercise (due 1/25). Exercise 5.28 (“to be determined” means you should give it).
A C-valued random process whose real and imaginary parts are continuous local martingales
is called a complex continuous local martingale.
Proposition 5.11. Let 𝑀 be a continuous local martingale and 𝜆 ∈ C. The (stochastic) exponential
process

ℰ(𝜆𝑀) B exp(𝜆𝑀 − h𝜆𝑀, 𝜆𝑀i/2)

is a complex continuous local martingale that satisfies

ℰ(𝜆𝑀) = e^{𝜆𝑀0} + 𝜆ℰ(𝜆𝑀) · 𝑀,

where

𝜆ℰ(𝜆𝑀) · 𝑀 B (Re 𝜆ℰ(𝜆𝑀)) · 𝑀 + i (Im 𝜆ℰ(𝜆𝑀)) · 𝑀.
We saw some examples in Section 3.3 that were true martingales.
Proof. The function 𝐹 (𝑟, 𝑥) B exp(𝜆𝑥 − 𝜆²𝑟/2) satisfies the time-reversed heat equation,
𝐹1 + 𝐹22 /2 = 0. Applying Itô's formula to Re 𝐹 and Im 𝐹 gives

𝐹 (h𝑀, 𝑀i, 𝑀) = 𝐹 (0, 𝑀0 ) + 𝐹2 (h𝑀, 𝑀i, 𝑀) · 𝑀 + (𝐹1 + 𝐹22 /2)(h𝑀, 𝑀i, 𝑀) · h𝑀, 𝑀i
= 𝐹 (0, 𝑀0 ) + 𝐹2 (h𝑀, 𝑀i, 𝑀) · 𝑀 = e^{𝜆𝑀0} + 𝜆ℰ(𝜆𝑀) · 𝑀. J
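A quick Monte Carlo sanity check (illustrative, not from the notes; the time 𝑡 and seed are arbitrary): for 𝑀 = 𝐵 a standard Brownian motion, ℰ(𝜆𝐵)𝑡 = exp(𝜆𝐵𝑡 − 𝜆²𝑡/2) has mean 1 both for 𝜆 = 1 and for 𝜆 = i, where ℰ(i𝐵)𝑡 = exp(i𝐵𝑡 + 𝑡/2).

```python
import numpy as np

rng = np.random.default_rng(3)
t = 2.0
B_t = rng.normal(0.0, np.sqrt(t), size=1_000_000)   # B_t ~ N(0, t); <B,B>_t = t

real_mean = float(np.mean(np.exp(B_t - t / 2)))            # E(B)_t,  λ = 1
complex_mean = complex(np.mean(np.exp(1j * B_t + t / 2)))  # E(iB)_t, λ = i

print(real_mean, complex_mean)    # both ≈ 1
```

In both cases only the marginal law of 𝐵𝑡 is needed, since here hℰ(𝜆𝐵)i involves a deterministic bracket h𝐵, 𝐵i𝑡 = 𝑡.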

Exercise (due 2/1). Let 𝑀 be a continuous local martingale with 𝑀0 = 0.

(1) Show that

∀𝑡, 𝑎, 𝑏 > 0   P[𝑀𝑡 > 𝑎, h𝑀, 𝑀i𝑡 6 𝑏] 6 exp(−𝑎²/(2𝑏)).

(2) Show that

∀𝑎, 𝑏 > 0   P[∃𝑡 > 0   𝑀𝑡 > 𝑎, h𝑀, 𝑀i𝑡 6 𝑏] 6 exp(−𝑎²/(2𝑏)).

5.3. A Few Consequences of Itô’s Formula

5.3.1. Lévy’s Characterization of Brownian Motion


We know that if 𝐵 is a real Brownian motion, then h𝐵, 𝐵i𝑡 = 𝑡. In fact, 𝐵 is the only continuous
local martingale with this property:
Theorem 5.12 (Lévy). Let 𝑋 = (𝑋 1 , . . . , 𝑋 𝑑 ) be an adapted continuous process. The following are
equivalent:
(i) 𝑋 is a 𝑑-dimensional (ℱ𝑡 )-Brownian motion.
(ii) Each 𝑋 𝑖 is a continuous local martingale and ∀𝑖, 𝑗, 𝑡 h𝑋 𝑖 , 𝑋 𝑗 i𝑡 = 𝛿𝑖 𝑗 𝑡.
Note this implies that if 𝑋 is a Brownian motion and each coordinate is an (ℱ𝑡 )-martingale, then
𝑋 is an (ℱ𝑡 )-Brownian motion. Also, if 𝐻 is progressive and ±1-valued and 𝐵 is a 1-dimensional
(ℱ𝑡 )-Brownian motion, then 𝐻 · 𝐵 is also an (ℱ𝑡 )-Brownian motion, an extension of the symmetry
used in the reflection principle. More generally, if 𝐵 = (𝐵1 , . . . , 𝐵𝑑 ) is a 𝑑-dimensional (ℱ𝑡 )-
Brownian motion and 𝐻 is a (𝑑 × 𝑑)-matrix-valued process whose entries are in 𝐿 2loc (𝐵1 ), then
𝐻 · 𝐵 is an (ℱ𝑡 )-Brownian motion iff 𝐻 is a.s. an orthogonal matrix.
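The ±1-valued case is easy to illustrate numerically (a sketch, not part of the notes; the choice 𝐻𝑠 = sign of 𝐵 at the previous grid point is one arbitrary progressive integrand): 𝑋 = 𝐻 · 𝐵 should again be a Brownian motion, so 𝑋𝑡 ∼ 𝒩(0, 𝑡).

```python
import numpy as np

rng = np.random.default_rng(4)
paths, n, t = 10_000, 1_000, 1.0
dB = rng.normal(0.0, np.sqrt(t / n), size=(paths, n))
B = np.cumsum(dB, axis=1)
# Progressive ±1-valued integrand: H on (s_k, s_{k+1}] is the sign of B_{s_k}.
H = np.sign(np.concatenate([np.zeros((paths, 1)), B[:, :-1]], axis=1))
H[H == 0] = 1.0
X_t = np.sum(H * dB, axis=1)         # (H · B)_t; by Lévy's theorem, X is again a BM

print(float(np.mean(X_t)), float(np.var(X_t)))   # ≈ 0 and ≈ t = 1
```

Since 𝐻² ≡ 1, the quadratic variation of 𝑋 is 𝑡, which is exactly what Lévy's characterization uses.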


Proof. We have seen (i) ⇒ (ii) in Chapter 4. Assume (ii). Then for all 𝜉 ∈ R𝑑 , the process

𝜉 · 𝑋𝑡 = Σ_{𝑗=1}^{𝑑} 𝜉𝑗 𝑋𝑡^𝑗

is a continuous local martingale with

h𝜉 · 𝑋, 𝜉 · 𝑋i𝑡 = Σ_{𝑗,𝑘} 𝜉𝑗 𝜉𝑘 h𝑋 𝑗 , 𝑋 𝑘 i𝑡 = |𝜉 |² 𝑡.

Use 𝜆 B i in Proposition 5.11 to conclude that (e^{i𝜉·𝑋𝑡 +|𝜉|²𝑡/2})𝑡 is a complex continuous local
martingale. Since it is bounded on every finite interval, it is a true complex martingale. That is, for
0 6 𝑠 < 𝑡 < ∞,

E[e^{i𝜉·𝑋𝑡 +|𝜉|²𝑡/2} | ℱ𝑠 ] = e^{i𝜉·𝑋𝑠 +|𝜉|²𝑠/2},

or

E[e^{i𝜉·(𝑋𝑡 −𝑋𝑠 )} | ℱ𝑠 ] = e^{−|𝜉|²(𝑡−𝑠)/2}.

This means that for all 𝐴 ∈ ℱ𝑠 , the P( · | 𝐴)-distribution of 𝑋𝑡 − 𝑋𝑠 is 𝒩(0, (𝑡 − 𝑠)𝐼), and thus
𝑋𝑡 − 𝑋𝑠 ⫫ ℱ𝑠 . Hence, all 𝑋 𝑗 have independent increments with respect to (ℱ𝑡 ). Furthermore,



𝑗 ≠ 𝑘 implies that 𝑋𝑡^𝑗 − 𝑋𝑠^𝑗 and 𝑋𝑡^𝑘 − 𝑋𝑠^𝑘 are independent given ℱ𝑠 . It follows that 𝑋 − 𝑋0 is
a 𝑑-dimensional (ℱ𝑡 )-Brownian motion started from 0; since 𝑋 − 𝑋0 ⫫ ℱ0 and 𝑋0 ∈ ℱ0 , also
𝑋 − 𝑋0 ⫫ 𝑋0 , whence 𝑋 is a 𝑑-dimensional (ℱ𝑡 )-Brownian motion. J

Exercise. Let 𝑋 𝑖 be continuous square-integrable martingales for 1 6 𝑖 6 𝑑. Show that 𝑋 :=
(𝑋 1 , . . . , 𝑋 𝑑 ) is a 𝑑-dimensional (ℱ𝑡 )-Brownian motion if and only if for all 𝑖, 𝑗, and 𝑠 < 𝑡,
E[(𝑋𝑡^𝑖 − 𝑋𝑠^𝑖 )(𝑋𝑡^𝑗 − 𝑋𝑠^𝑗 ) | ℱ𝑠 ] = 𝛿𝑖𝑗 (𝑡 − 𝑠).
Exercise (due 2/1). Show that a continuous local martingale 𝑀 is an (ℱ𝑡 )-Brownian motion if and
only if for all 𝑓 ∈ 𝐶 2 (R),

( 𝑓 (𝑀𝑡 ) − (1/2) ∫_0^𝑡 𝑓 ′′ (𝑀𝑠 ) d𝑠 )_{𝑡>0}

is a continuous local martingale.


Exercise. Let 𝐵 be a Brownian motion. Suppose that 𝐻 ∈ 𝐿 2loc (𝐵) is such that 𝐻 · 𝐵 is a Gaussian
process. Show that 𝐻 · 𝐵 is indistinguishable from a Wiener integral process.
Exercise. Let 𝐵 be a Brownian motion. Suppose that 𝐻 is a measurable process (not necessarily
adapted) such that E[|𝐻|] ∈ 𝐿 1loc (R+ ). Show that the Brownian motion with random drift defined by
𝑋𝑡 := 𝐵𝑡 + ∫_0^𝑡 𝐻𝑠 d𝑠 also satisfies 𝑋𝑡 = 𝛽𝑡 + ∫_0^𝑡 E[𝐻𝑠 | ℱ𝑠^𝑋 ] d𝑠 for some (ℱ𝑡^𝑋 )-Brownian
motion, 𝛽.

5.3.2. Continuous Martingales as Time-Changed Brownian Motions


We have seen several quantitative similarities between continuous local martingales and
Brownian motion. This is not a coincidence. In fact, a continuous local martingale 𝑀 is a Brownian
motion 𝛽 with time process h𝑀, 𝑀i:
𝑀𝑡 = 𝛽 h𝑀,𝑀i𝑡 .
This is similar in spirit to other ways of representing random walks or random variables via Brownian
motion. For example, for simple random walk on Z, we could let 𝐵 be a Brownian motion from 0,
𝜏1 B inf{𝑡 ; |𝐵𝑡 | = 1}, 𝜏2 B inf{𝑡 > 𝜏1 ; |𝐵𝑡 − 𝐵𝜏1 | = 1}, etc. Then (𝐵𝜏𝑛 )𝑛>0 has the law of simple
random walk with 𝜏0 B 0, and also has the nice property that E[𝜏𝑛 − 𝜏𝑛−1 ] = 1. If the steps have
mean 0 and finite variance more generally, this is a bit harder to achieve:
Exercise (due 2/8). (Skorokhod) Let 𝐵 be a Brownian motion and 𝑍 be a random variable with
E[𝑍] = 0 and E[𝑍²] < ∞. Let 𝑝 B E[𝑍 1 [𝑍 >0] ].

(1) Show that

((𝑥 − 𝑦)/𝑝) 1 (0,∞) (𝑥) 1 (−∞,0] (𝑦) d𝐹𝑍 (𝑥) d𝐹𝑍 (𝑦)

is a probability measure on R2 , where 𝐹𝑍 is the c.d.f. of 𝑍.

(2) Let (𝑋, 𝑌 ) have the law of (1), independent of 𝐵. Write 𝑇𝑎 B inf{𝑡 ; 𝐵𝑡 = 𝑎}. Show that
𝐵𝑇𝑋 ∧𝑇𝑌 ∼ 𝐹𝑍 and E[𝑇𝑋 ∧ 𝑇𝑌 ] = E[𝑍²].

(3) Show there exists a continuous closed martingale 𝑀 on some filtered probability space such
that 𝑀0 = 0 and 𝑀∞ ∼ 𝐹𝑍 .
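For 𝑍 = ±1 with probability 1/2, the pair (𝑋, 𝑌) of the exercise is (1, −1) almost surely, so 𝑇𝑋 ∧ 𝑇𝑌 is the first exit time of 𝐵 from (−1, 1). The following Monte Carlo sketch (an illustrative discretization, not part of the notes) checks E[𝑇𝑋 ∧ 𝑇𝑌 ] = E[𝑍²] = 1 and the symmetry of 𝐵 at the exit time.

```python
import numpy as np

rng = np.random.default_rng(5)
paths, dt, max_steps = 20_000, 1e-3, 20_000

b = np.zeros(paths)                  # Brownian paths, advanced until exit from (-1, 1)
T = np.zeros(paths)                  # exit times
val = np.zeros(paths)                # exit values, ≈ ±1 up to overshoot
alive = np.ones(paths, dtype=bool)
for step in range(1, max_steps + 1):
    idx = np.flatnonzero(alive)
    if idx.size == 0:
        break
    b[idx] += rng.normal(0.0, np.sqrt(dt), size=idx.size)
    hit = idx[np.abs(b[idx]) >= 1.0]
    T[hit] = step * dt
    val[hit] = np.sign(b[hit])
    alive[hit] = False

print(float(np.mean(T)), float(np.mean(val)))   # ≈ E[T] = E[Z²] = 1 and ≈ E[Z] = 0
```

The small upward bias in the mean exit time comes from monitoring the barrier only on the grid.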

Skorokhod’s embedding can be done even with an ℱ𝑡𝐵 -stopping time, but that is much harder;


see, e.g., Billingsley.


We need a strengthening of Proposition 4.12. For a function 𝑓 : R+ → R, write

𝐶𝑓 B ∪ { [𝑠, 𝑡] ; 𝑠 < 𝑡, 𝑓 ↾[𝑠,𝑡] is constant }

for its intervals of constancy.


Lemma 5.14. Let 𝑀 be a continuous local martingale. Then 𝐶 𝑀 = 𝐶h𝑀,𝑀i almost surely.

Proof. By continuity of 𝑀 and h𝑀, 𝑀i, this will follow from the statement that for 0 6 𝑎 < 𝑏,

P[ [[𝑎, 𝑏] ⊆ 𝐶𝑀 ] △ [[𝑎, 𝑏] ⊆ 𝐶h𝑀,𝑀i ] ] = 0.

(This will also show that [𝐶𝑀 = 𝐶h𝑀,𝑀i ] is measurable.)

Fix 𝑎 < 𝑏. By Eq. (4.3) of Theorem 4.9, it follows directly that

P[ [[𝑎, 𝑏] ⊆ 𝐶𝑀 ] \ [[𝑎, 𝑏] ⊆ 𝐶h𝑀,𝑀i ] ] = 0.

For the other direction, let 𝑁 B 𝑀 − 𝑀 𝑎 . By exercise, we have

h𝑁, 𝑁i = h𝑀, 𝑀i − h𝑀, 𝑀i 𝑎 .

Define 𝑇0 B inf{𝑡 > 0 ; h𝑁, 𝑁i𝑡 > 0}. Now, this may not be a stopping time. However, let us
change to the filtration (ℱ𝑡+ )𝑡 , with respect to which 𝑇0 is a stopping time by Proposition 3.9(i).
In addition, 𝑁 is still a continuous local martingale by Theorem 3.17. Since h𝑁, 𝑁i 𝑇0 = 0, it
follows from Proposition 4.12 that 𝑁 𝑇0 = 0 a.s. If [𝑎, 𝑏] ⊆ 𝐶h𝑀,𝑀i (𝜔), then 𝑇0 (𝜔) > 𝑏, whence
∀𝑡 6 𝑏   𝑁𝑡 (𝜔) = 0 for a.e. such 𝜔. This proves the other direction. J

Exercise (due 2/1). Show that if 𝑋 = 𝑀 + 𝑉 is the canonical decomposition of a continuous


semimartingale, then 𝐶 𝑋 = 𝐶 𝑀 ∩ 𝐶𝑉 almost surely.
Exercise. Let 𝑋 be a continuous semimartingale and 𝐻 be a locally bounded, progressive process.
Show that almost surely,

𝐶𝐻⁰ ∪ 𝐶𝑋 ⊆ 𝐶𝐻·𝑋 ,

where 𝐶𝐻⁰ B 𝐶𝐻 ∩ 𝐻^{−1}({0}).

Theorem 5.13 (Dambis–Dubins–Schwarz). If 𝑀 is a continuous local martingale with h𝑀, 𝑀i∞ = ∞
almost surely, then there exists a Brownian motion 𝛽 such that

a.s. ∀𝑡 > 0   𝑀𝑡 = 𝛽h𝑀,𝑀i𝑡 .

Remarks. (1) If h𝑀, 𝑀i∞ < ∞ with positive probability, one can do the same, but one may need
a larger probability space to define 𝛽 after time h𝑀, 𝑀i∞ . It follows that for every 𝑡 > 0, up
to a set of probability 0, we have sup𝑠<𝑡 𝑀 (𝑠) > 0 iff inf 𝑠<𝑡 𝑀 (𝑠) < 0 iff sup𝑠<𝑡 |𝑀 (𝑠)| > 0 iff
h𝑀, 𝑀i𝑡 > 0 by Theorem 2.13.
(2) 𝛽 is not adapted to (ℱ𝑡 ), but to a “time-changed” filtration.

(3) 𝛽 ⫫ h𝑀, 𝑀i if and only if M is an Ocone continuous local martingale if and only if the
conditional law of (𝑀𝑡 − 𝑀𝑠 )𝑡>𝑠 given (𝑀𝑡 )𝑡6𝑠 is symmetric for all 𝑠 > 0.
Proof. We will define 𝛽 by the conclusion, “inverting” h𝑀, 𝑀i. Assume first 𝑀0 = 0 almost surely.
For 𝑟 > 0, set
𝜏𝑟 B inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑟}.

By Proposition 3.9, 𝜏𝑟 is a stopping time. Except on the event 𝒩 B [h𝑀, 𝑀i∞ < ∞], we have
𝜏𝑟 < ∞ for all 𝑟. Since P[𝒩] = 0, we may redefine 𝜏𝑟 to be 0 on 𝒩. Recall that (ℱ𝑡 ) is complete
by assumption, so still 𝜏𝑟 is a stopping time.
Note that h𝑀, 𝑀i can be constant on intervals (where a.s., 𝑀 is constant by Lemma 5.14).
Still, 𝑟 ↦→ 𝜏𝑟 is increasing and left-continuous, so has right limits, namely,

lim_{𝑠↓𝑟} 𝜏𝑠 = 𝜏𝑟+ = inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑟},

except on 𝒩, where 𝜏𝑟+ = 0.
Define 𝛽𝑟 B 𝑀𝜏𝑟 for 𝑟 > 0. By Theorem 3.7, 𝛽𝑟 ∈ ℱ𝜏𝑟 , i.e., 𝛽 is adapted to (𝒢𝑟 ), where
𝒢𝑟 B ℱ𝜏𝑟 and 𝒢∞ B ℱ∞ . Because (ℱ𝑡 ) is complete, so is (𝒢𝑟 ).
Let 𝒩′ be the set of probability 0 where 𝑀 is non-constant on some interval where h𝑀, 𝑀i is
constant. Then off 𝒩′, we have 𝑀𝜏𝑟 = 𝑀𝜏𝑟+ , whence 𝛽 is continuous. Redefine 𝛽 B 0 on 𝒩′. We
have off 𝒩 ∪ 𝒩′,

𝛽h𝑀,𝑀i𝑡 = 𝑀_{𝜏h𝑀,𝑀i𝑡}   and   𝜏h𝑀,𝑀i𝑡 6 𝑡 6 𝜏h𝑀,𝑀i𝑡+ .

Because 𝑀 is constant on that interval, we get 𝛽h𝑀,𝑀i𝑡 = 𝑀𝑡 .
It remains to show that 𝛽 is a Brownian motion. We use Lévy's characterization, i.e., we prove
that 𝛽 and (𝛽𝑠² − 𝑠)𝑠>0 are continuous (𝒢𝑟 )-martingales. Consider 𝑛 ∈ N. Since h𝑀, 𝑀i^{𝜏𝑛}_∞ =
h𝑀, 𝑀i𝜏𝑛 = 𝑛 almost surely, Theorem 4.13(i) yields that 𝑀 𝜏𝑛 and (𝑀 𝜏𝑛 )² − h𝑀, 𝑀i 𝜏𝑛 are
uniformly integrable martingales. The optional stopping theorem thus gives

0 6 𝑟 6 𝑠 6 𝑛 =⇒ E[𝛽𝑠 | 𝒢𝑟 ] = E[𝑀^{𝜏𝑛}_{𝜏𝑠} | ℱ𝜏𝑟 ] = 𝑀^{𝜏𝑛}_{𝜏𝑟} = 𝛽𝑟 .

Similarly,

E[𝛽𝑠² − 𝑠 | 𝒢𝑟 ] = E[(𝑀^{𝜏𝑛}_{𝜏𝑠})² − h𝑀, 𝑀i^{𝜏𝑛}_{𝜏𝑠} | ℱ𝜏𝑟 ] = (𝑀^{𝜏𝑛}_{𝜏𝑟})² − h𝑀, 𝑀i^{𝜏𝑛}_{𝜏𝑟} = 𝛽𝑟² − 𝑟.

This finishes the proof when 𝑀0 = 0.


If 𝑀0 ≠ 0, write 𝑀𝑡 = 𝑀0 + 𝑀𝑡′. The previous argument gives a Brownian motion 𝛽′ such that

a.s. ∀𝑡 > 0   𝑀𝑡′ = 𝛽′_{h𝑀′,𝑀′i𝑡} .

We actually showed that 𝛽′ is a (𝒢𝑟 )-Brownian motion, so 𝛽′ ⫫ 𝒢0 = ℱ0 3 𝑀0 . Therefore,
𝛽𝑠 B 𝑀0 + 𝛽𝑠′ is a Brownian motion. J
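The time-change construction can be sketched numerically (illustrative, not part of the notes). For simplicity the integrand below is deterministic, so h𝑀, 𝑀i is deterministic and 𝜏𝑟 can be read off a single array; 𝛽𝑟 = 𝑀𝜏𝑟 should then have mean 0 and variance 𝑟.

```python
import numpy as np

rng = np.random.default_rng(6)
paths, n, t = 10_000, 1_000, 1.0
dt = t / n
s = np.arange(n) * dt
H = 1.0 + s                                  # deterministic integrand (for simplicity)
dB = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
M = np.cumsum(H * dB, axis=1)                # M = H · B on the grid
qv = np.cumsum(H ** 2) * dt                  # <M,M>_t = ∫_0^t (1+s)² ds, deterministic

r = 1.0
tau_idx = int(np.searchsorted(qv, r))        # grid version of τ_r = inf{t : <M,M>_t ≥ r}
beta_r = M[:, tau_idx]                       # β_r = M_{τ_r}
print(float(np.mean(beta_r)), float(np.var(beta_r)))   # ≈ 0 and ≈ r
```

With a random integrand the same recipe applies path by path, with 𝜏𝑟 computed per path.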
Exercise (due 2/8). Exercise 5.27.
The following additional result will be useful in Chapter 7 when we show conformal invariance
of complex Brownian motion.

Proposition 5.15. Let 𝑀, 𝑁 be continuous local martingales such that 𝑀0 = 𝑁0 = 0, h𝑀, 𝑀i =


h𝑁, 𝑁i, h𝑀, 𝑁i = 0, and h𝑀, 𝑀i∞ = h𝑁, 𝑁i∞ = ∞. Let 𝛽, 𝛾 be the real Brownian motions such
that 𝑀 = 𝛽 h𝑀,𝑀i and 𝑁 = 𝛾 h𝑁,𝑁i . Then 𝛽 ⫫ 𝛾. Thus, (𝑀, 𝑁) is a time change of a 2-dimensional
Brownian motion.
(Note: if h𝑀, 𝑀i is not deterministic, then 𝑀 is not independent of 𝑁.)
Proof. Again, let 𝜏𝑟 B inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑟}, so 𝛽𝑟 = 𝑀𝜏𝑟 , 𝛾𝑟 = 𝑁𝜏𝑟 and 𝛽, 𝛾 are (𝒢𝑟 )-
Brownian motions, where 𝒢𝑟 B ℱ𝜏𝑟 . Since h𝑀, 𝑁i = 0, we have that 𝑀𝑁 is a continuous local
martingale. As before, we get that (𝑀𝑁) 𝜏𝑛 is a uniformly integrable martingale for 𝑛 > 1 (now
using Proposition 4.15(v)), whence

0 6 𝑟 6 𝑠 6 𝑛 =⇒ E[𝛽𝑠 𝛾𝑠 | 𝒢𝑟 ] = E[𝑀^{𝜏𝑛}_{𝜏𝑠} 𝑁^{𝜏𝑛}_{𝜏𝑠} | ℱ𝜏𝑟 ] = 𝑀^{𝜏𝑛}_{𝜏𝑟} 𝑁^{𝜏𝑛}_{𝜏𝑟} = 𝛽𝑟 𝛾𝑟 .

Thus, 𝛽𝛾 is a (𝒢𝑟 )-martingale and so h𝛽, 𝛾i = 0. By Theorem 5.12, (𝛽, 𝛾) is a 2-dimensional
Brownian motion. Since 𝛽0 = 𝛾0 = 0, it follows that 𝛽 ⫫ 𝛾. J
The proposition also holds without the assumption that h𝑀, 𝑀i∞ = h𝑁, 𝑁i∞ = ∞; see the first
remark after Theorem 5.13. In addition, there is an extension due to Knight when h𝑀, 𝑀i ≠ h𝑁, 𝑁i,
but one loses the filtration (𝒢𝑟 ); it is more difficult.
Exercise (due 2/8). (1) Let 𝑀 be a continuous local martingale such that 𝑀0 = 0 and h𝑀, 𝑀i is
deterministic with h𝑀, 𝑀i∞ = ∞. Show that 𝑀 is a Gaussian process.
(2) Show that if 𝑀 and 𝑁 are continuous local martingales such that 𝑀0 = 𝑁0 = 0, h𝑀, 𝑀i and
h𝑁, 𝑁i are deterministic, h𝑀, 𝑀i∞ = h𝑁, 𝑁i∞ = ∞ and h𝑀, 𝑁i = 0, then 𝑀 ⫫ 𝑁. Do not
use Knight’s theorem. Hint: modify the proof of Lévy’s theorem.
Exercise (due 2/15). Exercise 5.33.
Exercise. Let 𝑀 be a continuous martingale with 𝑀𝑡 ∈ 𝐿 2 (P) for all 𝑡 ∈ R+ and h𝑀, 𝑀i∞ = ∞
almost surely. Write 𝑑 (𝑠, 𝑡) := k𝑀𝑠 − 𝑀𝑡 k2 for 𝑠, 𝑡 ∈ R+ . Show that the stochastic process 𝑀 is
almost surely locally Hölder continuous of order 𝛼 for all 𝛼 < 1 with respect to the metric 𝑑 on R+ .

5.3.3. The Burkholder–Davis–Gundy Inequalities


Here we give yet another relation between a continuous local martingale and its quadratic
variation. This will be useful in Chapter 8. For a process 𝑋, write
𝑋𝑡∗ B sup_{𝑠6𝑡} |𝑋𝑠 |.

Theorem 5.16 (Burkholder–Davis–Gundy). There exist 𝑐, 𝐶 : (0, ∞) → (0, ∞) such that for all
continuous local martingales 𝑀 with 𝑀0 = 0, for all stopping times 𝑇,

∀𝑝 ∈ R+   𝑐(𝑝) E[h𝑀, 𝑀i_𝑇^{𝑝/2}] 6 E[(𝑀𝑇∗ )^𝑝 ] 6 𝐶 (𝑝) E[h𝑀, 𝑀i_𝑇^{𝑝/2}].
Remark. Note that the case 𝑝 = 2 is immediate from Doob's 𝐿 2 -inequality and Theorem 4.13: if
𝑀𝑇∗ ∈ 𝐿 2 , then 𝑀 𝑇 ∈ H2 and

E[(𝑀𝑇∗ )²] 6 4 E[(𝑀𝑇 )²] = 4 E[h𝑀, 𝑀i𝑇 ] 6 4 E[(𝑀𝑇∗ )²],

whereas if h𝑀, 𝑀i𝑇 ∈ 𝐿 1 , then 𝑀𝑇∗ ∈ 𝐿 2 .
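The 𝑝 = 2 case is easy to see by simulation (an illustrative sketch, not part of the notes; the grid sizes are arbitrary): for 𝑀 = 𝐵 on [0, 1] we have h𝑀, 𝑀i1 = 1, so E[(𝑀1∗)²] must lie between 1 and 4.

```python
import numpy as np

rng = np.random.default_rng(7)
paths, n = 10_000, 1_000
dB = rng.normal(0.0, np.sqrt(1.0 / n), size=(paths, n))
M_star = np.max(np.abs(np.cumsum(dB, axis=1)), axis=1)  # M*_1 for M = B; <M,M>_1 = 1

est = float(np.mean(M_star ** 2))
print(est)   # should satisfy 1 = E<M,M>_1 <= E[(M*_1)²] <= 4 E<M,M>_1 = 4
```

The estimate sits comfortably inside the interval; neither constant in Doob's inequality is attained here.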



We need some preliminary results.


Proposition. Let 𝑀 be a continuous local martingale with 𝑀0 = 0. Write

𝑇𝑥 B inf{𝑡 > 0 ; 𝑀𝑡 = 𝑥}.

We have

∀𝑎, 𝑏 > 0   P[𝑇𝑎 < 𝑇−𝑏 ] 6 (𝑏/(𝑎 + 𝑏)) P[𝑀∞∗ > 0].

Proof. Because 𝑀^{𝑇𝑎 ∧𝑇−𝑏} is a bounded martingale, we have

0 = E[𝑀_{𝑇𝑎 ∧𝑇−𝑏} ] = 𝑎 P[𝑇𝑎 < 𝑇−𝑏 ] − 𝑏 P[𝑇−𝑏 < 𝑇𝑎 ] + E[𝑀∞ 1 [𝑇𝑎 =𝑇−𝑏 =∞, 𝑀∞∗ >0] ]
> 𝑎 P[𝑇𝑎 < 𝑇−𝑏 ] − 𝑏 P[𝑇−𝑏 6 𝑇𝑎 , 𝑀∞∗ > 0]
= −𝑏 P[𝑀∞∗ > 0] + (𝑎 + 𝑏) P[𝑇𝑎 < 𝑇−𝑏 ]. J

Corollary. Let 𝑋, 𝑌 > 0, 𝑋0 = 𝑌0 = 0, and 𝑋 − 𝑌 be a continuous local martingale. Then

∀ 0 < 𝑏 < 𝑎   P[𝑋∞∗ > 𝑎, 𝑌∞∗ < 𝑏] 6 (𝑏/𝑎) P[(𝑋 − 𝑌 )∞∗ > 0].

Proof. Since

[𝑋∞∗ > 𝑎, 𝑌∞∗ < 𝑏] ⊆ [sup(𝑋 − 𝑌 ) > 𝑎 − 𝑏] ∩ [inf (𝑋 − 𝑌 ) > −𝑏],

it follows that on this event, 𝑋 − 𝑌 hits 𝑎 − 𝑏 before −𝑏, so the proposition applies. J
Corollary. Let 𝑀 be a continuous local martingale with 𝑀0 = 0 and 𝑟 > 0. Then

∀𝑏 ∈ (0, 1)   P[(𝑀∞∗ )² > 4𝑟, h𝑀, 𝑀i∞ < 𝑏𝑟] 6 𝑏 P[(𝑀∞∗ )² > 𝑟]

and

∀𝑏 ∈ (0, 1/4)   P[h𝑀, 𝑀i∞ > 2𝑟, (𝑀∞∗ )² < 𝑏𝑟] 6 4𝑏 P[h𝑀, 𝑀i∞ > 𝑟].

Proof. Since 𝑀² − h𝑀, 𝑀i is a continuous local martingale, the previous corollary gives

∀𝑏 ∈ (0, 1)   P[(𝑀∞∗ )² > 𝑟, h𝑀, 𝑀i∞ < 𝑏𝑟] 6 𝑏 P[𝑀∞∗ > 0].

Here, we have used the fact that 𝑀∞∗ > 0 iff h𝑀, 𝑀i∞ > 0 by Proposition 4.12. Apply this to
𝑁 B 𝑀 − 𝑀 𝑇 , where 𝑇 B inf{𝑡 > 0 ; (𝑀𝑡∗)² > 𝑟}. Then

[𝑁∞∗ > 0] = [(𝑀∞∗ )² > 𝑟],
h𝑁, 𝑁i = h𝑀, 𝑀i − h𝑀, 𝑀i 𝑇 6 h𝑀, 𝑀i,

and

[𝑀∞∗ > 2√𝑟 ] ⊆ [𝑁∞∗ > √𝑟 ]

since 𝑁∞∗ > 𝑀∞∗ − √𝑟 in that case. This gives the first inequality.

Likewise, h𝑀, 𝑀i − 𝑀² is a continuous local martingale, so

∀𝑏 ∈ (0, 1/4)   P[h𝑀, 𝑀i∞ > 𝑟, (𝑀∞∗ )² < 4𝑏𝑟] 6 4𝑏 P[h𝑀, 𝑀i∞ > 0].


Apply this to 𝑁 B 𝑀 − 𝑀 𝑇 , where 𝑇 B inf{𝑡 > 0 ; h𝑀, 𝑀i𝑡 > 𝑟}. Then

[h𝑁, 𝑁i∞ > 0] = [h𝑀, 𝑀i∞ > 𝑟],
[h𝑀, 𝑀i∞ > 2𝑟] ⊆ [h𝑁, 𝑁i∞ > 𝑟],

and

[(𝑀∞∗ )² < 𝑏𝑟] ⊆ [(𝑁∞∗ )² < 4𝑏𝑟]

since ∀𝑡 > 0   𝑀𝑡 ∈ (−√(𝑏𝑟), √(𝑏𝑟)) in that case. This gives the second inequality. J
Proof of Theorem 5.16. Recall that for 𝑋 > 0 and 𝑝 > 0,

E[𝑋^𝑝 ] = 𝑝 ∫_0^∞ P[𝑋 > 𝑟] 𝑟^{𝑝−1} d𝑟 = 𝑏^𝑝 𝑝 ∫_0^∞ P[𝑋 > 𝑏𝑟] 𝑟^{𝑝−1} d𝑟

for 𝑏 > 0. By the corollary, for 𝑏 ∈ (0, 1),

P[(𝑀∞∗ )² > 4𝑟] 6 P[h𝑀, 𝑀i∞ > 𝑏𝑟] + P[(𝑀∞∗ )² > 4𝑟, h𝑀, 𝑀i∞ < 𝑏𝑟]
6 P[h𝑀, 𝑀i∞ > 𝑏𝑟] + 𝑏 P[(𝑀∞∗ )² > 𝑟].

Multiply by (𝑝/2) 𝑟^{𝑝/2−1} and integrate from 𝑟 = 0 to ∞:

2^{−𝑝} E[(𝑀∞∗ )^𝑝 ] 6 𝑏^{−𝑝/2} E[h𝑀, 𝑀i∞^{𝑝/2}] + 𝑏 E[(𝑀∞∗ )^𝑝 ].

Choose 𝑏 ∈ (0, 2^{−𝑝} ) to obtain 𝐶 (𝑝) for 𝑇 = ∞.

Similarly, for 𝑏 ∈ (0, 1/4), we have

P[h𝑀, 𝑀i∞ > 2𝑟] 6 P[(𝑀∞∗ )² > 𝑏𝑟] + 4𝑏 P[h𝑀, 𝑀i∞ > 𝑟],

so

2^{−𝑝/2} E[h𝑀, 𝑀i∞^{𝑝/2}] 6 𝑏^{−𝑝/2} E[(𝑀∞∗ )^𝑝 ] + 4𝑏 E[h𝑀, 𝑀i∞^{𝑝/2}].

Choose 𝑏 ∈ (0, 2^{−𝑝/2}/4) to obtain 𝑐(𝑝) for 𝑇 = ∞.

Finally, as usual, apply these inequalities to 𝑀 𝑇 to obtain them for any 𝑇, not just 𝑇 = ∞. J
Exercise. Let 𝑀 be a continuous local martingale with 𝑀0 = 0 and 𝑎, 𝑏 > 0. Applying the previous
exercise on page 73 to both 𝑀 and −𝑀, we see that

P[(𝑀∞∗ )² > 𝑎, h𝑀, 𝑀i∞ 6 𝑏] 6 2 e^{−𝑎/(2𝑏)} .

Show that

P[h𝑀, 𝑀i∞ > 𝑎, (𝑀∞∗ )² 6 𝑏] 6 P[(𝐵𝑎∗ )² 6 𝑏],

where 𝐵 is a real Brownian motion starting at 0. You may use the extension of Theorem 5.13 to the
case h𝑀, 𝑀i∞ < ∞ with positive probability.

An exponential bound on P[(𝐵𝑎∗ )² 6 𝑏] is shown in Exercise 6.29(6); it can also be bounded by
using an alternating infinite series expression for its exact value (e.g., page 342 of Feller, volume 2,
or (7.15) of Mörters and Peres). Taking its first term gives the bound (4/𝜋) exp(−𝜋²𝑎/(8𝑏)). As a
third alternative, one can get an exponential bound by a direct iterated martingale argument using a
sequence of stopping times and conditioning on the associated 𝜎-fields to bound the conditional
probabilities.
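For moderate 𝑎/𝑏 the first series term is already essentially the exact value of P[(𝐵𝑎∗)² 6 𝑏], as a Monte Carlo sketch shows (illustrative parameters and seed, not part of the notes; the small surplus in the estimate is discretization bias from monitoring the maximum only on a grid).

```python
import numpy as np

rng = np.random.default_rng(8)
paths, n, a, b = 10_000, 1_000, 1.0, 0.5
dB = rng.normal(0.0, np.sqrt(a / n), size=(paths, n))
B_star_sq = np.max(np.abs(np.cumsum(dB, axis=1)), axis=1) ** 2   # (B*_a)²

prob = float(np.mean(B_star_sq <= b))
approx = float((4.0 / np.pi) * np.exp(-np.pi ** 2 * a / (8.0 * b)))  # first series term
print(prob, approx)   # the remaining alternating terms are negligible here
```

For 𝑎/𝑏 = 2, the second series term is smaller by a factor of about e^{−8·𝜋²𝑎/(8𝑏)}, hence invisible at this precision.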

Exercise. Let 𝐻 be a bounded, continuous, adapted process with 𝐻0 ≡ 0 and 𝐵 be a Brownian motion.
Show that k(𝐻 · 𝐵)𝑡∗ /𝐵𝑡 k 𝑝 → 0 as 𝑡 ↓ 0 for all 𝑝 ∈ (0, 1). Find an 𝐻 so that k(𝐻 · 𝐵)𝑡 /𝐵𝑡 k1 ↛ 0
as 𝑡 ↓ 0.
Corollary 5.17. If 𝑀 is a continuous local martingale with 𝑀0 = 0 and

E[√(h𝑀, 𝑀i∞)] < ∞,

then 𝑀 is a uniformly integrable martingale.

Proof. By Theorem 5.16 with 𝑝 = 1, we have |𝑀 | 6 𝑀∞∗ ∈ 𝐿 1 , so Proposition 4.7(ii) applies. J

Note that this condition is weaker than E[h𝑀, 𝑀i∞ ] < ∞, which is the condition for 𝑀 ∈ H2 .

Exercise (due 2/15). Let 𝐵 be a Brownian motion with 𝐵0 = 0 and 𝑇 be a stopping time with
E[√𝑇 ] < ∞. Show that E[𝐵𝑇 ] = 0 and E[𝐵𝑇²] = E[𝑇] (this is trickier than it looks).


Appendix: The Cameron–Martin and Girsanov Theorems


While the proof of the Cameron–Martin theorem given in the appendix to Chapter 2 is short
and elementary, it is instructive to see how stochastic calculus can also be used for a short proof.
This will lead us to extensions of the theorem. It is convenient here to change the sign of the drift.
Let 𝐵 be an (ℱ𝑡 )𝑡 -Brownian motion. We begin with linear drift up to some finite time, 𝑟, after
which we add no further drift: 𝐵𝑡 − 𝜃 (𝑡 ∧ 𝑟). Note that the quadratic variation does not change
with a deterministic drift. Let 𝑀 := ℰ(𝜃𝐵) be the exponential martingale corresponding to 𝐵.
Recall from the exercise on page 28 that we may differentiate 𝑀 with respect to 𝜃 to get another
martingale, namely, (𝐵𝑡 − 𝜃𝑡) 𝑀𝑡 . We claim this means that with respect to the probability measure
Q := 𝑀𝑟 P, the process (𝐵𝑡 − 𝜃 (𝑡 ∧ 𝑟))𝑡 is an (ℱ𝑡 )𝑡 -martingale. First note the following general
principle: if 𝑋 is an adapted process and 𝑁 is a uniformly integrable nonnegative martingale such
that (𝑋𝑡 𝑁𝑡 )𝑡 is a martingale, then (𝑋𝑡 𝑁∞ )𝑡 is a martingale. Indeed, 𝑋𝑡 𝑁∞ is integrable because
E[|𝑋𝑡 |𝑁∞ ] = E[E[|𝑋𝑡 |𝑁∞ | ℱ𝑡 ]] = E[|𝑋𝑡 |𝑁𝑡 ]. Now we may calculate for 0 6 𝑠 6 𝑡 and 𝐴 ∈ ℱ𝑠
that

E[𝑋𝑡 𝑁∞ 1𝐴 ] = E[E[𝑋𝑡 𝑁∞ 1𝐴 | ℱ𝑡 ]] = E[𝑋𝑡 𝑁𝑡 1𝐴 ] = E[𝑋𝑠 𝑁𝑠 1𝐴 ] = E[𝑋𝑠 𝑁∞ 1𝐴 ].
Using this principle, we see that (𝐵𝑡 − 𝜃𝑡)06𝑡6𝑟 is a martingale with respect to Q, and it is obvious
that (𝐵𝑡 − 𝜃𝑟)𝑟6𝑡<∞ is a martingale with respect to Q. This proves our claim. Now the quadratic
variation of (𝐵𝑡 − 𝜃 (𝑡 ∧ 𝑟))𝑡 is the same with respect to Q as with respect to P because Q ≪ P.
Hence, it follows from Lévy's theorem that (𝐵𝑡 − 𝜃 (𝑡 ∧ 𝑟))𝑡 is an (ℱ𝑡 )𝑡 -Brownian motion with
respect to Q. Writing 𝑄 for the P-law of (𝐵𝑡 − 𝜃𝑡)𝑡 and 𝑊 for Wiener measure, we may conclude
that 𝑄↾𝒢𝑟 ≪ 𝑊↾𝒢𝑟 for all 𝑟 ∈ R+ , where 𝒢• is the natural filtration on 𝐶 (R+ , R); we say that 𝑄 is
locally absolutely continuous with respect to 𝑊, written 𝑄 ≪loc 𝑊. Of course, if 𝜃 ≠ 0, then 𝑄 ⊥ 𝑊.
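The change of measure Q = 𝑀𝑟 P can be illustrated by Monte Carlo (a sketch, not from the notes; 𝜃, 𝑟, and the seed are arbitrary): reweighting by 𝑀𝑟 = exp(𝜃𝐵𝑟 − 𝜃²𝑟/2) gives 𝐵𝑟 the Q-mean 𝜃𝑟, consistent with 𝐵𝑡 − 𝜃(𝑡 ∧ 𝑟) being a Q-Brownian motion.

```python
import numpy as np

rng = np.random.default_rng(9)
theta, r = 0.7, 2.0
B_r = rng.normal(0.0, np.sqrt(r), size=1_000_000)   # B_r under P
M_r = np.exp(theta * B_r - theta ** 2 * r / 2)      # density dQ/dP on F_r

print(float(np.mean(M_r)))        # ≈ 1: M is a mean-one martingale
print(float(np.mean(M_r * B_r)))  # ≈ θ r: under Q, B acquires drift θ up to time r
```

This is the same computation that underlies importance sampling of Brownian paths with drift.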
Exercise. Let 𝐵 be a 𝑑-dimensional (ℱ𝑡 )𝑡 -Brownian motion and 𝜇 ∈ R𝑑 . Define 𝑋𝑡 := 𝐵𝑡 + 𝜇𝑡 and
𝑇 := inf{𝑡 > 0 ; |𝑋𝑡 | = 1}. Write 𝑀𝑡 := exp{−𝜇 · 𝐵𝑡 − |𝜇|² 𝑡/2} and Q := 𝑀𝑇 P.

(1) Verify that 𝑀 𝑇 is a uniformly integrable P-martingale.

(2) Show that the Q-law of 𝑋 𝑇 is Brownian motion up to time 𝑇.

(3) Let 𝑃 be the Q-law of (𝑋 𝑇 , 𝑇) on 𝐶 (R+ , R𝑑 ) × R+ . Show that the P-law of (𝑋 𝑇 , 𝑇) is 𝑚𝑃,
where 𝑚(𝑤, 𝑡) = e^{𝜇·𝑤𝑡 −|𝜇|²𝑡/2} . Deduce that 𝑋𝑇 and 𝑇 are independent with respect to P; that
for some constant 𝑐𝜇 , the P-law of 𝑋𝑇 has density 𝑥 ↦→ 𝑐𝜇 e^{𝜇·𝑥} with respect to hypersurface
measure on the sphere of radius 1; and that the P-law of 𝑇 has density 𝑡 ↦→ 𝑐𝜇^{−1} e^{−|𝜇|²𝑡/2}
with respect to the Q-law of 𝑇, the hitting time for ordinary Brownian motion.
For more general drift functions, suppose that 𝑓 ∈ 𝐿²(R+ ) and 𝐹𝑡 := ∫_0^𝑡 𝑓 (𝑠) d𝑠. We will
consider 𝑋 := 𝐵 − 𝐹. To analyze this, let 𝐿𝑡 := ∫_0^𝑡 𝑓 (𝑠) d𝐵𝑠 , which is just a Wiener integral. By
Proposition 5.11, we have dℰ(𝐿) = 𝑓 ℰ(𝐿) d𝐵. Thus, integration by parts gives us

d(𝑋ℰ(𝐿)) = 𝑋 𝑓 ℰ(𝐿) d𝐵 + ℰ(𝐿)(d𝐵 − 𝑓 d𝑡) + 𝑓 ℰ(𝐿) d𝑡 = 𝑋 𝑓 ℰ(𝐿) d𝐵 + ℰ(𝐿) d𝐵,
whence 𝑋ℰ(𝐿) is a continuous local martingale. For 𝑡 ∈ [0, ∞], we have 𝐿𝑡 ∼ 𝒩(0, ‖ 𝑓 1[0,𝑡] ‖²),
whence E[e^{𝐿𝑡}] = e^{‖ 𝑓 1[0,𝑡] ‖²/2} = e^{⟨𝐿,𝐿⟩𝑡 /2}, so E[ℰ(𝐿)𝑡 ] = 1. This implies that ℰ(𝐿) is a uniformly
integrable martingale: by Proposition 4.7, it is a supermartingale, and so

1 = E[ℰ(𝐿)𝑡 ] ≥ E[E[ℰ(𝐿)∞ | ℱ𝑡 ]] = E[ℰ(𝐿)∞ ] = 1,

which implies ℰ(𝐿)𝑡 = E[ℰ(𝐿)∞ | ℱ𝑡 ] a.s., as desired. Furthermore, ⟨𝑋ℰ(𝐿), 𝑋ℰ(𝐿)⟩𝑡 =
∫_0^𝑡 (𝑋(𝑠) 𝑓 (𝑠) + 1)² ℰ(𝐿)𝑠² d𝑠 has finite expectation, whence 𝑋ℰ(𝐿) is a true martingale by Theo-
rem 4.13(ii). As above, it follows that 𝑋 is a martingale with respect to Q := ℰ(𝐿)∞ P. Again,
the quadratic variation of 𝑋 is the same as that of 𝐵, whence 𝑋 is an (ℱ𝑡 )𝑡 -Brownian motion with
respect to Q. The explicit form of ℰ(𝐿)∞ is exp{∫_0^∞ 𝑓 (𝑠) d𝐵𝑠 − ∫_0^∞ 𝑓 (𝑠)² d𝑠/2}.
In fact, we may add random drifts as well: Suppose that 𝐿 is a continuous local martingale such
that ℰ(𝐿) is a uniformly integrable martingale with mean 1. Then 𝐵 − h𝐵, 𝐿i is an (ℱ𝑡 )𝑡 -Brownian
motion with respect to ℰ(𝐿)∞ P, whence the P-law of 𝐵 − h𝐵, 𝐿i is mutually absolutely continuous
with Wiener measure. This follows just as above, with the following extension of the “general
principle” we used.
Proposition. If 𝑁 is a nonnegative uniformly integrable martingale, 𝑋 is adapted, and 𝑋 𝑁 is a
local martingale, then (𝑋𝑡 𝑁∞ )𝑡 is a local martingale.
Proof. We claim that a sequence of stopping times that reduces 𝑋 𝑁 also reduces 𝑋 𝑁∞ . Indeed, let
𝑇 be a stopping time such that 𝑋 𝑇 𝑁 𝑇 is a martingale; it suffices to show that (𝑋𝑡𝑇 𝑁∞ )𝑡 is a martingale.
Integrability follows as before: E[|𝑋𝑡𝑇 |𝑁∞ ] = E[E[|𝑋𝑡𝑇 |𝑁∞ | ℱ𝑇∧𝑡 ]] = E[|𝑋𝑡𝑇 |𝑁𝑡𝑇 ]. Let 0 ≤ 𝑠 ≤ 𝑡
and 𝐴 ∈ ℱ𝑠 . Similar to the calculation near the end of the proof of the proposition on page 35, we
have

E[𝑋𝑡𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁∞ ] = E[E[𝑋𝑡𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁∞ | ℱ𝑇∧𝑡 ]]
 = E[𝑋𝑡𝑇 1 𝐴∩[𝑇 >𝑠] E[𝑁∞ | ℱ𝑇∧𝑡 ]] = E[𝑋𝑡𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁𝑡𝑇 ].
Applying this to 𝑡 = 𝑠, we obtain E[𝑋𝑠𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁∞ ] = E[𝑋𝑠𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁 𝑠𝑇 ]. Because 𝑋 𝑇 𝑁 𝑇 is a
martingale, we conclude that E[𝑋𝑡𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁∞ ] = E[𝑋𝑠𝑇 1 𝐴∩[𝑇 >𝑠] 𝑁∞ ]. On the other hand, 𝑋𝑡𝑇 = 𝑋𝑠𝑇
on the event [𝑇 6 𝑠], whence E[𝑋𝑡𝑇 1 𝐴∩[𝑇 6𝑠] 𝑁∞ ] = E[𝑋𝑠𝑇 1 𝐴∩[𝑇 6𝑠] 𝑁∞ ]. Adding these equations
gives E[𝑋𝑡𝑇 1 𝐴 𝑁∞ ] = E[𝑋𝑠𝑇 1 𝐴 𝑁∞ ], as desired. J
This part of the proof has nothing to do with Brownian motion, so we may deduce this theorem
of Girsanov:

Theorem 5.22 (Girsanov). Let 𝑀 and 𝐿 be continuous local martingales such that ℰ(𝐿) is a
uniformly integrable martingale with mean 1. Then 𝑀 − ⟨𝑀, 𝐿⟩ is a continuous local martingale
with respect to ℰ(𝐿)∞ P.

Proof. Let 𝑋 := 𝑀 − ⟨𝑀, 𝐿⟩. By Proposition 5.11, we have dℰ(𝐿) = ℰ(𝐿) d𝐿, so that from
integration by parts,

d(𝑋ℰ(𝐿)) = 𝑋ℰ(𝐿) d𝐿 + ℰ(𝐿)(d𝑀 − d⟨𝑀, 𝐿⟩) + ℰ(𝐿) d⟨𝑀, 𝐿⟩ = 𝑋ℰ(𝐿) d𝐿 + ℰ(𝐿) d𝑀,

whence 𝑋ℰ(𝐿) is a continuous local martingale. J


Exercise. Find 𝑀 and 𝐿 as in Girsanov’s theorem such that the P-law of 𝑀 is not equal to the
ℰ(𝐿)∞ P-law of 𝑀 − h𝑀, 𝐿i.
Exercise. Show that if 𝐿 is a continuous local martingale such that h𝐿, 𝐿i∞ = ∞ a.s., then ℰ(𝐿) is
not uniformly integrable. Show that for each 𝜀 > 0, there exists a continuous martingale 𝐿 such that
P[h𝐿, 𝐿i∞ < ∞] < 𝜀 and ℰ(𝐿) + ℰ(−𝐿) is uniformly integrable.
Exercise. Show that if 𝐿 is a continuous local martingale such that ⟨𝐿, 𝐿⟩∞ ≤ 𝛼 < ∞ a.s. for some
constant 𝛼, then E[exp{𝑐(𝐿∗∞ )²}] < ∞ for all constants 𝑐 < 1/(2𝛼).
Returning to Brownian motion with random drift, suppose that 𝐿 is not only a continuous local
martingale such that ℰ(𝐿) is a uniformly integrable martingale with mean 1, but also is adapted to the
completed canonical filtration ℱ•𝐵 . In other words, there are Borel functions 𝑓𝑡 : 𝐶([0, 𝑡], R) → R
such that 𝐿𝑡 = 𝑓𝑡 ((𝐵𝑠 )0≤𝑠≤𝑡 ) a.s. Write this relation as 𝐿 = 𝑓 (𝐵). Since 𝛽 := 𝐵 − ⟨𝐵, 𝐿⟩ is a
Brownian motion with respect to Q := ℰ(𝐿)∞ P, we have that the process 𝑋 = 𝐵 satisfies the
equation 𝑋 = 𝛽 + ⟨𝑋, 𝑓 (𝑋)⟩, i.e., 𝑋 is a Brownian motion with drift that depends on 𝑋.
To give a concrete example satisfying all these assumptions, suppose that 𝑏 : R+ × R → R is
Borel with 𝑔 := sup𝑥 |𝑏(·, 𝑥)| ∈ 𝐿²loc (R+ ). Then (𝜔, 𝑠) ↦→ 𝑏(𝑠, 𝐵𝑠 (𝜔)) ∈ 𝐿²loc (𝐵), so we may define
𝐿𝑡 := ∫_0^𝑡 𝑏(𝑠, 𝐵𝑠 ) d𝐵𝑠 . Because ⟨𝐿, 𝐿⟩𝑡 ≤ ∫_0^𝑡 𝑔(𝑠)² d𝑠 < ∞, the preceding exercise implies that
ℰ(𝐿) is a martingale. We conclude that for each 𝑡0 < ∞, the pair (𝑋, 𝛽) on (Ω, (ℱ𝑡 )𝑡≤𝑡0 , ℰ(𝐿)𝑡0 P)
solves the stochastic differential equation d𝑋𝑡 = d𝛽𝑡 + 𝑏(𝑡, 𝑋𝑡 ) d𝑡 for 0 ≤ 𝑡 ≤ 𝑡0 , and 𝛽 is a Brownian
motion up to time 𝑡0 . Because ℰ(𝐿) is a martingale, the laws for pairs (𝑋, 𝛽) corresponding to two
ending times 𝑡0 and 𝑡0′ are consistent. Therefore, Kolmogorov's consistency theorem yields a global
solution for all 𝑡 ≥ 0. Write 𝑄 for the resulting law of 𝑋 on 𝐶(R+ , R). Because the law of 𝑋 𝑡0 is the
pushforward of ℰ(𝐿)𝑡0 P by 𝑋 𝑡0 = 𝐵 𝑡0 , we have that 𝑄 ≪loc 𝑊 (and 𝑊 ≪loc 𝑄). We have that 𝑄 ≪ 𝑊
iff ℰ(𝐿) is uniformly integrable. In Chapter 8, we will discuss solutions to SDEs, but only with
restrictive regularity assumptions on the function 𝑏. We will also allow a function 𝜎 in front of d𝛽𝑡 .
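As a rough illustration of the SDE just constructed, here is a naive Euler scheme (this anticipates Chapter 8; it is a sketch, not part of the text, and the bounded drift 𝑏(𝑡, 𝑥) = −sin 𝑥 is an arbitrary choice, so that 𝑔 ≡ 1 ∈ 𝐿²loc ):

```python
import math, random

random.seed(1)

def b(t, x):
    # arbitrary bounded Borel drift: sup_x |b(., x)| = 1 is in L^2_loc(R_+)
    return -math.sin(x)

def euler_path(t0=2.0, n=2000, x0=0.0):
    """Euler scheme for dX_t = d(beta_t) + b(t, X_t) dt on [0, t0]."""
    dt = t0 / n
    x, path = x0, [x0]
    for i in range(n):
        dbeta = math.sqrt(dt) * random.gauss(0, 1)  # Brownian increment
        x += dbeta + b(i * dt, x) * dt
        path.append(x)
    return path

path = euler_path()
print(len(path), path[-1])
```

This only produces one approximate sample path; nothing here certifies convergence, which is the subject of the later chapter.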

Chapter 6

General Theory of Markov Processes

The Markov property allows one to make many more calculations than one can for a general
stochastic process. Also, it is desirable for modeling, analogous to not having a time-lag in a
differential equation.

6.1. General Definitions and the Problem of Existence


Let (𝐸, ℰ) be a measurable space. A (Markovian) transition kernel from 𝐸 to 𝐸 is a map
𝑄 : 𝐸 × ℰ → [0, 1] such that
(i) ∀𝑥 ∈ 𝐸 𝐴 ↦→ 𝑄(𝑥, 𝐴) is a probability measure on (𝐸, ℰ), and
(ii) ∀𝐴 ∈ ℰ 𝑥 ↦→ 𝑄(𝑥, 𝐴) is ℰ-measurable.
This looks like a regular conditional probability, and indeed will be one. When 𝐸 is countable, 𝑄 is
determined by all 𝑄(𝑥, {𝑦}), the transition matrix.
Let 𝐵(𝐸) be the space of bounded (real) ℰ-measurable functions on 𝐸 with the supremum
norm. For 𝑓 ∈ 𝐵(𝐸), we write 𝑄 𝑓 for the function whose value at 𝑥 ∈ 𝐸 is the integral of 𝑓 with
respect to 𝑄(𝑥, ·); we write

(𝑄 𝑓 )(𝑥) = ∫ 𝑄(𝑥, d𝑦) 𝑓 (𝑦).

Obviously, 𝑓 > 0 implies 𝑄 𝑓 > 0 and ∀ 𝑓 ∈ 𝐵(𝐸) 𝑄 𝑓 ∈ 𝐵(𝐸) (note this is ℰ-measurable by
approximating by simple functions) with k𝑄 𝑓 k 6 k 𝑓 k (𝑄 is a contraction on 𝐵(𝐸)). Thus, 𝑄
defines a bounded, positive, linear operator on 𝐵(𝐸).
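When 𝐸 is finite, a transition kernel is simply a row-stochastic matrix and these properties are immediate matrix facts. A small sketch (the matrix entries and test functions are arbitrary choices) checking positivity and the contraction bound:

```python
# Transition kernel on E = {0, 1} as a row-stochastic matrix: Q[x][y] = Q(x, {y}).
Q = [[0.9, 0.1],
     [0.4, 0.6]]

def apply_kernel(Q, f):
    # (Qf)(x) = sum_y Q(x, {y}) f(y): integration of f against Q(x, .)
    return [sum(Q[x][y] * f[y] for y in range(len(f))) for x in range(len(Q))]

# each Q(x, .) is a probability measure
assert all(abs(sum(row) - 1) < 1e-12 for row in Q)

f = [2.0, -3.0]
Qf = apply_kernel(Q, f)          # Qf == [1.5, -1.0]
sup_norm = lambda g: max(abs(v) for v in g)
print(Qf, sup_norm(Qf) <= sup_norm(f))   # contraction: ||Qf|| <= ||f||

g = [1.0, 5.0]                   # nonnegative f gives nonnegative Qf
Qg = apply_kernel(Q, g)
```

The same matrix viewpoint underlies the transition-matrix remark above for countable 𝐸.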
Definition 6.1. A collection (𝑄 𝑡 )𝑡>0 of transition kernels on 𝐸 is called a transition semigroup if
(i) 𝑄 0 = Id, i.e., ∀𝑥 ∈ 𝐸 𝑄 0 (𝑥, ·) = 𝛿𝑥 ,
(ii) ∀𝑠, 𝑡 ≥ 0 𝑄𝑡 𝑄𝑠 = 𝑄𝑡+𝑠 , i.e., ∀𝑥 ∈ 𝐸 ∀𝐴 ∈ ℰ

∫_𝐸 𝑄𝑡 (𝑥, d𝑦) 𝑄𝑠 (𝑦, 𝐴) = 𝑄𝑡+𝑠 (𝑥, 𝐴),

called the Chapman–Kolmogorov identity, and
(iii) ∀𝐴 ∈ ℰ (𝑡, 𝑥) ↦→ 𝑄𝑡 (𝑥, 𝐴) is ℬ(R+ ) ⊗ ℰ-measurable.


Definition 6.2. Given a filtered probability space (Ω, ℱ, (ℱ𝑡 )0≤𝑡≤∞ , P) and a transition semigroup
(𝑄𝑡 )𝑡≥0 on 𝐸, an (ℱ𝑡 )-adapted 𝐸-valued process (𝑋𝑡 )𝑡≥0 is called a Markov process (with respect
to (ℱ𝑡 )) with transition semigroup (𝑄𝑡 )𝑡≥0 if

∀𝑠, 𝑡 ≥ 0 ∀𝐴 ∈ ℰ P[𝑋𝑠+𝑡 ∈ 𝐴 | ℱ𝑠 ] = 𝑄𝑡 (𝑋𝑠 , 𝐴). (∗)
If we do not specify (ℱ𝑡 ), then we mean the canonical filtration (ℱ𝑡𝑋 ).
Thus, 𝑄 𝑡 gives many regular conditional probabilities. Inherent in (∗) is the assumption of
time-homogeneity. Note that (∗) gives
∀𝑠, 𝑡 ≥ 0 ∀𝐴 ∈ ℰ P[𝑋𝑠+𝑡 ∈ 𝐴 | (𝑋𝑟 )0≤𝑟≤𝑠 ] = 𝑄𝑡 (𝑋𝑠 , 𝐴).
This can also be stated as saying that 𝑋 is Markov with respect to its canonical filtration (ℱ𝑡𝑋 )𝑡 .
Note also that (∗) gives

∀𝑠, 𝑡 ≥ 0 ∀ 𝑓 ∈ 𝐵(𝐸) E[ 𝑓 (𝑋𝑠+𝑡 ) | ℱ𝑠 ] = (𝑄𝑡 𝑓 )(𝑋𝑠 ) : (∗∗)
the definition gives this when 𝑓 is an indicator, from which it follows when 𝑓 is simple and then, by
taking a limit, for general 𝑓 .
One can calculate as follows. Let 𝑋0 ∼ 𝛾. We claim that if 0 < 𝑡 1 < 𝑡 2 < · · · < 𝑡 𝑝 and
𝐴0 , 𝐴1 , . . . , 𝐴 𝑝 ∈ ℰ, then
P[𝑋0 ∈ 𝐴0 , 𝑋𝑡1 ∈ 𝐴1 , . . . , 𝑋𝑡𝑝 ∈ 𝐴𝑝 ]
 = ∫_{𝐴0} 𝛾(d𝑥0 ) ∫_{𝐴1} 𝑄𝑡1 (𝑥0 , d𝑥1 ) · · · ∫_{𝐴𝑝} 𝑄𝑡𝑝−𝑡𝑝−1 (𝑥𝑝−1 , d𝑥𝑝 ). (∗∗∗)

To show this, we show the more general formula: for all 𝑓0 , 𝑓1 , . . . , 𝑓𝑝 ∈ 𝐵(𝐸),

E[ 𝑓0 (𝑋0 ) 𝑓1 (𝑋𝑡1 ) · · · 𝑓𝑝 (𝑋𝑡𝑝 )]
 = ∫ 𝛾(d𝑥0 ) 𝑓0 (𝑥0 ) ∫ 𝑄𝑡1 (𝑥0 , d𝑥1 ) 𝑓1 (𝑥1 ) ∫ 𝑄𝑡2−𝑡1 (𝑥1 , d𝑥2 ) 𝑓2 (𝑥2 ) · · · ∫ 𝑄𝑡𝑝−𝑡𝑝−1 (𝑥𝑝−1 , d𝑥𝑝 ) 𝑓𝑝 (𝑥𝑝 ).
For 𝑝 = 0, this is the definition of 𝛾. Suppose 𝑝 ≥ 1 and the formula holds for 𝑝 − 1. Then

E[ 𝑓0 (𝑋0 ) 𝑓1 (𝑋𝑡1 ) · · · 𝑓𝑝 (𝑋𝑡𝑝 )]
 = E[ 𝑓0 (𝑋0 ) 𝑓1 (𝑋𝑡1 ) · · · 𝑓𝑝−1 (𝑋𝑡𝑝−1 ) E[ 𝑓𝑝 (𝑋𝑡𝑝 ) | ℱ𝑡𝑝−1 ]]
 = E[ 𝑓0 (𝑋0 ) 𝑓1 (𝑋𝑡1 ) · · · 𝑓𝑝−1 (𝑋𝑡𝑝−1 )(𝑄𝑡𝑝−𝑡𝑝−1 𝑓𝑝 )(𝑋𝑡𝑝−1 )]

by Eq. (∗∗), so we may apply the induction hypothesis with functions 𝑓0 , . . . , 𝑓𝑝−2 , 𝑓𝑝−1 · (𝑄𝑡𝑝−𝑡𝑝−1 𝑓𝑝 )
to get

∫ 𝛾(d𝑥0 ) 𝑓0 (𝑥0 ) · · · ∫ 𝑄𝑡𝑝−1−𝑡𝑝−2 (𝑥𝑝−2 , d𝑥𝑝−1 ) 𝑓𝑝−1 (𝑥𝑝−1 ) (𝑄𝑡𝑝−𝑡𝑝−1 𝑓𝑝 )(𝑥𝑝−1 ),

where the last factor equals ∫ 𝑄𝑡𝑝−𝑡𝑝−1 (𝑥𝑝−1 , d𝑥𝑝 ) 𝑓𝑝 (𝑥𝑝 ),

with slightly different notation in case 𝑝 = 1.


Conversely, if (∗∗∗) holds and (𝑄 𝑡 ) is a transition semigroup, then (𝑋𝑡 )𝑡>0 is a Markov process
with semigroup (𝑄 𝑡 )𝑡>0 with respect to (ℱ𝑡𝑋 )𝑡>0 : use a 𝜋-𝜆 argument as on page 262 of the book,
number 3.
Note that (∗∗∗) shows that 𝛾 and (𝑄 𝑡 )𝑡>0 determine the finite-dimensional distributions of 𝑋.
If Definition 6.1(i) and (iii) hold and for every 𝑥 ∈ 𝐸 there is an (ℱ𝑡 )-adapted 𝑋 such that Eq. (∗)
holds and 𝑄𝑡 𝑓 (𝑥) = E[ 𝑓 (𝑋𝑡 )] for 𝑓 ∈ 𝐵(𝐸), then (𝑄𝑡 )𝑡 is a transition semigroup:

𝑄𝑡+𝑠 𝑓 (𝑥) = E[ 𝑓 (𝑋𝑡+𝑠 )] = E[E[ 𝑓 (𝑋𝑡+𝑠 ) | ℱ𝑠 ]] = E[𝑄𝑡 𝑓 (𝑋𝑠 )] = 𝑄𝑠 (𝑄𝑡 𝑓 )(𝑥).
Thus, if Eq. (∗∗∗) holds for all 𝛾 (or all 𝛿𝑥 ), the Chapman–Kolmogorov identity holds.

Example. Let 𝑋 be 𝑑-dimensional Brownian motion. Then 𝑋𝑡 has density

𝑝𝑡 (𝑥) := (2𝜋𝑡)^{−𝑑/2} e^{−|𝑥|²/(2𝑡)} (𝑡 > 0, 𝑥 ∈ R𝑑 ).

Let 𝑄𝑡 (𝑥, ·) have density 𝑦 ↦→ 𝑝𝑡 (𝑦 − 𝑥) for 𝑡 > 0. The Markov property of Brownian motion shows
that 𝑋 is a Markov process with transition semigroup (𝑄𝑡 )𝑡≥0 —in particular, (𝑄𝑡 ) is a semigroup.
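For this semigroup, the Chapman–Kolmogorov identity says that Gaussian densities convolve: 𝑝𝑡 ∗ 𝑝𝑠 = 𝑝𝑡+𝑠 . A quick numerical check in 𝑑 = 1 (the times, evaluation points, and quadrature grid are arbitrary choices in this sketch):

```python
import math

def p(t, x):
    # d = 1 Brownian transition density
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

t, s = 0.3, 0.5
x, y = 0.0, 0.7

# \int Q_t(x, dz) * (density of Q_s(z, .) at y) by a midpoint Riemann sum over z
h = 0.002
lhs = sum(p(t, z - x) * p(s, y - z) * h
          for z in [-8 + h * (i + 0.5) for i in range(int(16 / h))])
rhs = p(t + s, y - x)
print(lhs, rhs)
```

The two numbers agree to well within the quadrature error, which is the semigroup property 𝑄𝑡 𝑄𝑠 = 𝑄𝑡+𝑠 made concrete.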

Exercise (due 2/22). Exercise 6.24.


Given a transition semigroup, is there a Markov process with that semigroup? We show the
answer is yes under a topological condition on 𝐸.
First, we recall a version of Kolmogorov's extension theorem. Let Ω∗ := 𝐸 R+ with the 𝜎-field
ℱ ∗ generated by the coordinate maps 𝜔 ↦→ 𝜔(𝑡) (𝑡 ∈ R+ ). For 𝑈 ⊆ R+ , write 𝜋𝑈 : Ω∗ → 𝐸 𝑈 for
the map 𝜔 ↦→ 𝜔↾𝑈. For 𝑈 ⊆ 𝑉 ⊆ R+ , write 𝜋𝑈𝑉 : 𝐸 𝑉 → 𝐸 𝑈 for the map 𝜔 ↦→ 𝜔↾𝑈. Let 𝐹 (R+ )
be the collection of finite sets in R+ .
A topological space is Polish if it is separable (there exists a countable dense subset) and its
topology is generated by a complete metric.
Theorem 6.3 (Kolmogorov). Let (𝐸, ℰ) be a Polish space with its Borel 𝜎-field. Suppose that
for each 𝑈 ∈ 𝐹 (R+ ), 𝜇𝑈 is a probability measure on 𝐸 𝑈 . If (𝜇𝑈 )𝑈∈𝐹 (R+ ) is consistent in the sense
that ∀𝑈 ⊆ 𝑉 ∈ 𝐹 (R+ ) (𝜋𝑈𝑉 )∗ 𝜇𝑉 = 𝜇𝑈 , then there exists a unique probability measure 𝜇 on (Ω∗ , ℱ ∗ )
such that ∀𝑈 ∈ 𝐹 (R+ ) (𝜋𝑈 )∗ 𝜇 = 𝜇𝑈 . J

In words: consistent finite-dimensional distributions determine a probability measure.


Corollary 6.4. Let (𝐸, ℰ) be a Polish space with its Borel 𝜎-field. If (𝑄 𝑡 )𝑡>0 is a transition
semigroup on 𝐸 and 𝛾 is a probability measure on 𝐸, then there exists a unique probability measure
𝑃 on Ω∗ such that the canonical process

𝑋𝑡 (𝜔) B 𝜔(𝑡) (𝑡 ∈ R+ , 𝜔 ∈ Ω∗ )

is a Markov process with transition semigroup (𝑄 𝑡 )𝑡>0 and 𝑋0 ∼ 𝛾.



Proof. We make sure (∗∗∗) holds. Given 0 ≤ 𝑡1 < 𝑡2 < · · · < 𝑡𝑝 , we define 𝜇{𝑡1 ,...,𝑡𝑝 } on 𝐸 {𝑡1 ,...,𝑡𝑝 }
by

𝜇{𝑡1 ,...,𝑡𝑝 } (𝐴1 × · · · × 𝐴𝑝 ) := ∫_𝐸 𝛾(d𝑥0 ) ∫_{𝐴1} 𝑄𝑡1 (𝑥0 , d𝑥1 ) · · · ∫_{𝐴𝑝} 𝑄𝑡𝑝−𝑡𝑝−1 (𝑥𝑝−1 , d𝑥𝑝 )

for 𝐴𝑖 ∈ ℰ. The consistency condition amounts to putting some 𝐴𝑖 = 𝐸 and verifying that those
coordinates can be eliminated via Chapman–Kolmogorov (details are left to the reader). J

In particular, for 𝑥 ∈ 𝐸 and 𝛾 = 𝛿𝑥 , we write P𝑥 for the associated measure. Note that 𝑥 ↦→ P𝑥
is measurable in the sense that for all 𝐴 ∈ ℱ ∗ , 𝑥 ↦→ P𝑥 (𝐴) is measurable: when 𝐴 depends on only
finitely many coordinates, this follows from the measurability assumption in Definition 6.1, and then
the 𝜋-𝜆 theorem gives it for all 𝐴. We may express the general measure P(𝛾) associated to any 𝛾 by

P(𝛾) (𝐴) = ∫_𝐸 𝛾(d𝑥) P𝑥 (𝐴) :

the integral makes sense by measurability of 𝑥 ↦→ P𝑥 and the integral is a probability measure
by the monotone convergence theorem. By uniqueness, this is the measure from Corollary 6.4.
Under additional assumptions (of Section 6.2), we prove there is a càdlàg modification of 𝑋
in Section 6.3. There is a lot of operator theory one can develop related to semigroups, but we will
avoid most of it. However, to motivate the next definition, suppose that 𝑄 𝑡 = e𝑡 𝐿 (reasonable from the
Chapman–Kolmogorov identity). The resolvent of 𝐿 is the operator-valued function 𝜆 ↦→ (𝜆 − 𝐿) −1
for 𝜆 ∉ 𝜎(𝐿). Formally, for 𝜆 > 0 and thinking of 𝐿 ≤ 0, we have

∫_0^∞ e^{−𝜆𝑡} e^{𝑡𝐿} d𝑡 = (𝜆 − 𝐿)^{−1} .

Definition 6.5. For 𝜆 > 0, the 𝜆-resolvent of the semigroup (𝑄𝑡 )𝑡≥0 is the linear operator
𝑅𝜆 : 𝐵(𝐸) → 𝐵(𝐸) defined by

(𝑅𝜆 𝑓 )(𝑥) := ∫_0^∞ e^{−𝜆𝑡} (𝑄𝑡 𝑓 )(𝑥) d𝑡 ( 𝑓 ∈ 𝐵(𝐸), 𝑥 ∈ 𝐸).

Note that Definition 6.1(iii) shows that 𝑡 ↦→ (𝑄𝑡 𝑓 )(𝑥) is measurable and 𝑅𝜆 𝑓 is ℰ-measurable (if
𝑔 ∈ ℬ(R+ ) ⊗ ℰ is bounded, then 𝑥 ↦→ ∫_{R+} e^{−𝜆𝑡} 𝑔(𝑡, 𝑥) d𝑡 is ℰ-measurable by the usual progression
starting from 𝑔 being an indicator).
Clearly, 𝑓 ≥ 0 implies 𝑅𝜆 𝑓 ≥ 0 and ∀ 𝑓 ∈ 𝐵(𝐸) ‖𝑅𝜆 𝑓 ‖ ≤ ‖ 𝑓 ‖/𝜆. We also have the resolvent
equation

𝜆 ≠ 𝜇 =⇒ 𝑅𝜆 𝑅𝜇 = −(𝑅𝜆 − 𝑅𝜇 )/(𝜆 − 𝜇).
To see this, we first show
Lemma. ∀𝜇 > 0 ∀𝑡 > 0 𝑄 𝑡 𝑅 𝜇 = 𝑅 𝜇 𝑄 𝑡 .

Proof. We show this lemma by direct calculation:

(𝑄𝑡 𝑅𝜇 𝑓 )(𝑥) = ∫_𝐸 𝑄𝑡 (𝑥, d𝑦) 𝑅𝜇 𝑓 (𝑦)
 = ∫_𝐸 𝑄𝑡 (𝑥, d𝑦) ∫_0^∞ e^{−𝜇𝑠} 𝑄𝑠 𝑓 (𝑦) d𝑠
 = ∫_0^∞ e^{−𝜇𝑠} ∫_𝐸 𝑄𝑡 (𝑥, d𝑦) 𝑄𝑠 𝑓 (𝑦) d𝑠
 = ∫_0^∞ e^{−𝜇𝑠} 𝑄𝑡 (𝑄𝑠 𝑓 )(𝑥) d𝑠
 = ∫_0^∞ e^{−𝜇𝑠} 𝑄𝑠 (𝑄𝑡 𝑓 )(𝑥) d𝑠 = (𝑅𝜇 𝑄𝑡 𝑓 )(𝑥). J
Now, using the above lemma, we can verify the resolvent equation:

(𝑅𝜆 𝑅𝜇 𝑓 )(𝑥) = ∫_0^∞ e^{−𝜆𝑡} (𝑄𝑡 𝑅𝜇 𝑓 )(𝑥) d𝑡
 = ∫_0^∞ e^{−𝜆𝑡} (𝑅𝜇 𝑄𝑡 𝑓 )(𝑥) d𝑡
 = ∫_0^∞ e^{−𝜆𝑡} ∫_0^∞ e^{−𝜇𝑠} 𝑄𝑠 (𝑄𝑡 𝑓 )(𝑥) d𝑠 d𝑡
 = ∫_0^∞ e^{−𝜆𝑡} ∫_0^∞ e^{−𝜇𝑠} 𝑄𝑡+𝑠 𝑓 (𝑥) d𝑠 d𝑡
 = ∫_0^∞ e^{−𝜆𝑡} e^{𝜇𝑡} ∫_𝑡^∞ e^{−𝜇𝑟} 𝑄𝑟 𝑓 (𝑥) d𝑟 d𝑡 [substituting 𝑟 = 𝑡 + 𝑠]
 = ∫_0^∞ e^{−𝜇𝑟} 𝑄𝑟 𝑓 (𝑥) ∫_0^𝑟 e^{−(𝜆−𝜇)𝑡} d𝑡 d𝑟 [Fubini's theorem]
 = ∫_0^∞ e^{−𝜇𝑟} 𝑄𝑟 𝑓 (𝑥) · (1 − e^{−(𝜆−𝜇)𝑟})/(𝜆 − 𝜇) d𝑟
 = ∫_0^∞ ((e^{−𝜇𝑟} − e^{−𝜆𝑟})/(𝜆 − 𝜇)) 𝑄𝑟 𝑓 (𝑥) d𝑟
 = (𝑅𝜇 𝑓 (𝑥) − 𝑅𝜆 𝑓 (𝑥))/(𝜆 − 𝜇).

Example. For real Brownian motion, we have

𝑅𝜆 𝑓 (𝑥) = ∫_R 𝑟𝜆 (𝑦 − 𝑥) 𝑓 (𝑦) d𝑦,

where
𝑟𝜆 (𝑧) := (1/√(2𝜆)) exp{−|𝑧| √(2𝜆)}.

See page 157 of the book for a proof.
The resolvent provides useful supermartingales:

Lemma 6.6. Let 𝑋 be a Markov process with semigroup (𝑄 𝑡 )𝑡>0 and filtration (ℱ𝑡 ), 0 6 ℎ ∈ 𝐵(𝐸),
and 𝜆 > 0. Then
𝑡 ↦→ e−𝜆𝑡 𝑅𝜆 ℎ(𝑋𝑡 )
is an (ℱ𝑡 )-supermartingale.

Proof. Since 𝑅𝜆 ℎ ∈ 𝐵(𝐸), integrability of the random variables e^{−𝜆𝑡} 𝑅𝜆 ℎ(𝑋𝑡 ) is ensured. We want
to bound, for 𝑠, 𝑡 ≥ 0,

E[e^{−𝜆(𝑡+𝑠)} 𝑅𝜆 ℎ(𝑋𝑡+𝑠 ) | ℱ𝑡 ] = e^{−𝜆(𝑡+𝑠)} 𝑄𝑠 𝑅𝜆 ℎ(𝑋𝑡 ) [Definition 6.2].

It suffices to show that e^{−𝜆𝑠} 𝑄𝑠 𝑅𝜆 ℎ ≤ 𝑅𝜆 ℎ. Indeed, by the lemma for the resolvent equation and
ℎ ≥ 0, we have

e^{−𝜆𝑠} 𝑄𝑠 𝑅𝜆 ℎ = ∫_𝑠^∞ e^{−𝜆𝑡} 𝑄𝑡 ℎ d𝑡 ≤ 𝑅𝜆 ℎ. J

6.2. Feller Semigroups


A topological space is locally compact if every point has a neighborhood with compact closure.
A locally compact Polish space has the property that there exist compact 𝐾𝑛 such that for all 𝑛,
𝐾𝑛 ⊆ 𝐾𝑛+1 and every compact set is contained in some 𝐾𝑛 . See the appendix to this chapter.
In the rest of this section, let 𝐸 be locally compact and Polish. We say 𝑓 : 𝐸 → R vanishes at
infinity if for all 𝜀 > 0, there exists a compact set 𝐾 such that | 𝑓 (𝑥)| < 𝜀 for 𝑥 ∉ 𝐾. Write 𝐶0 (𝐸)
for the continuous functions that vanish at infinity, and give it the supremum norm.
Definition 6.7. A transition semigroup (𝑄 𝑡 )𝑡>0 on 𝐸 is called Feller if
(i) ∀𝑡 > 0 ∀ 𝑓 ∈ 𝐶0 (𝐸) 𝑄 𝑡 𝑓 ∈ 𝐶0 (𝐸), and
(ii) ∀ 𝑓 ∈ 𝐶0 (𝐸) lim𝑡→0 k𝑄 𝑡 𝑓 − 𝑓 k = 0.
A Markov process with a Feller semigroup is called Feller.

Part (i) says that 𝑄𝑡 (𝑥, ·) depends continuously on 𝑥 and for all compact 𝐾, 𝑄𝑡 (·, 𝐾) vanishes
at infinity. Part (ii) says that 𝑄𝑡 (𝑥, ·) has most of its mass near 𝑥 for small 𝑡. Be aware that different
authors use different definitions for "Feller".
We also see that a Feller semigroup has the property that for all 𝑓 ∈ 𝐶0 (𝐸), 𝑡 ↦→ 𝑄𝑡 𝑓 is uniformly
continuous:

∀𝑠, 𝑡 ≥ 0 ‖𝑄𝑠 𝑓 − 𝑄𝑡 𝑓 ‖ = ‖𝑄𝑠∧𝑡 (𝑄|𝑠−𝑡| 𝑓 − 𝑓 )‖ ≤ ‖𝑄|𝑠−𝑡| 𝑓 − 𝑓 ‖.

Lebesgue's dominated convergence theorem gives that ∀ 𝑓 ∈ 𝐶0 (𝐸) ∀𝜆 > 0 𝑅𝜆 𝑓 ∈ 𝐶0 (𝐸).


For the rest of the section, let (𝑄 𝑡 )𝑡>0 be a Feller semigroup on 𝐸.
We are now going to show how 𝑅𝜆 is an inverse. However, the operator of which it is the
inverse is not defined on all of 𝐶0 (𝐸), but only a dense subspace, which we can get as the range of
𝑅𝜆 and which does not depend on 𝜆.

Proposition 6.8. For 𝜆 > 0, let R := {𝑅𝜆 𝑓 ; 𝑓 ∈ 𝐶0 (𝐸)}. Then R does not depend on 𝜆 and is
dense in 𝐶0 (𝐸).

Proof. We can write the resolvent equation this way:

𝑅𝜆 𝑓 = 𝑅𝜇 ( 𝑓 + (𝜇 − 𝜆)𝑅𝜆 𝑓 ).

Therefore, every 𝑅𝜆 𝑓 has the form 𝑅𝜇 𝑔 for some 𝑔 ∈ 𝐶0 (𝐸), as desired.


To show R is dense, we write

𝜆𝑅𝜆 𝑓 = 𝜆 ∫_0^∞ e^{−𝜆𝑡} 𝑄𝑡 𝑓 d𝑡 = ∫_0^∞ e^{−𝑡} 𝑄𝑡/𝜆 𝑓 d𝑡,

so

‖𝜆𝑅𝜆 𝑓 − 𝑓 ‖ = ‖∫_0^∞ e^{−𝑡} (𝑄𝑡/𝜆 𝑓 − 𝑓 ) d𝑡‖ ≤ ∫_0^∞ e^{−𝑡} ‖𝑄𝑡/𝜆 𝑓 − 𝑓 ‖ d𝑡 → 0 as 𝜆 → ∞

by Lebesgue's dominated convergence theorem. J


If 𝑄𝑡 = e^{𝑡𝐿} , then 𝐿 = (d/d𝑡)|𝑡=0 𝑄𝑡 . This motivates
Definition 6.9. Write

𝐷 (𝐿) := { 𝑓 ∈ 𝐶0 (𝐸) ; (𝑄𝑡 𝑓 − 𝑓 )/𝑡 converges in 𝐶0 (𝐸) as 𝑡 ↓ 0}

and, for 𝑓 ∈ 𝐷 (𝐿),

𝐿 𝑓 := lim𝑡↓0 (𝑄𝑡 𝑓 − 𝑓 )/𝑡.

The set 𝐷 (𝐿) is the domain of the (infinitesimal) generator 𝐿 of (𝑄𝑡 )𝑡≥0 .

Of course, 𝐷 (𝐿) is a linear subspace and 𝐿 is a linear map from 𝐷 (𝐿) to 𝐶0 (𝐸).
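For a two-state chain with generator matrix 𝐴 = [[−𝑎, 𝑎], [𝑏, −𝑏]], the semigroup e^{𝑡𝐴} has a standard closed form, and the difference quotient (𝑄𝑡 𝑓 − 𝑓 )/𝑡 visibly approaches 𝐴𝑓 as 𝑡 ↓ 0. The rates and the function 𝑓 below are arbitrary choices for this sketch:

```python
import math

a, b = 2.0, 3.0   # arbitrary jump rates; generator A = [[-a, a], [b, -b]]

def Q(t):
    # closed-form transition matrix e^{tA} of the two-state chain
    e = math.exp(-(a + b) * t)
    return [[(b + a * e) / (a + b), (a - a * e) / (a + b)],
            [(b - b * e) / (a + b), (a + b * e) / (a + b)]]

def apply(M, f):
    return [sum(M[i][j] * f[j] for j in range(2)) for i in range(2)]

f = [1.0, -2.0]
A = [[-a, a], [b, -b]]
t = 1e-6
diff_quot = [(qf - fx) / t for qf, fx in zip(apply(Q(t), f), f)]
Af = apply(A, f)
err = max(abs(u - v) for u, v in zip(diff_quot, Af))
print(diff_quot, Af, err)
```

On a finite state space, 𝐷 (𝐿) is everything; the domain question becomes delicate only in the infinite-dimensional setting of this section.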
We can express differential and integral equations for (𝑄 𝑡 )𝑡>0 at times other than 𝑡 = 0:
Proposition 6.10. If 𝑓 ∈ 𝐷 (𝐿) and 𝑠 > 0, then 𝑄 𝑠 𝑓 ∈ 𝐷 (𝐿) with

𝐿(𝑄 𝑠 𝑓 ) = 𝑄 𝑠 (𝐿 𝑓 ).

Proof. We have, for 𝑡 > 0,

(𝑄𝑡 (𝑄𝑠 𝑓 ) − 𝑄𝑠 𝑓 )/𝑡 = 𝑄𝑠 ((𝑄𝑡 𝑓 − 𝑓 )/𝑡). (∗)

Since every Markovian kernel is a contraction, the right-hand side converges to 𝑄𝑠 (𝐿 𝑓 ) in 𝐶0 (𝐸). J
Exercise (due 3/8). Prove that

∀𝑠 > 0 ∀ 𝑓 ∈ 𝐷 (𝐿) lim𝑡↑0 (𝑄𝑡+𝑠 𝑓 − 𝑄𝑠 𝑓 )/𝑡 = 𝐿𝑄𝑠 𝑓 in 𝐶0 (𝐸).

Proposition 6.11. For 𝑓 ∈ 𝐷 (𝐿) and 𝑡 ≥ 0, we have

𝑄𝑡 𝑓 = 𝑓 + ∫_0^𝑡 𝑄𝑠 (𝐿 𝑓 ) d𝑠 = 𝑓 + ∫_0^𝑡 𝐿(𝑄𝑠 𝑓 ) d𝑠.

Proof. Another way of writing (∗), Proposition 6.10, and the exercise is that for all 𝑥 ∈ 𝐸,
𝑡 ↦→ 𝑄𝑡 𝑓 (𝑥) has a derivative 𝑄𝑡 𝐿 𝑓 (𝑥), which is a continuous function of 𝑡. [Indeed, repeating
shows that 𝑡 ↦→ 𝑄𝑡 𝑓 (𝑥) ∈ 𝐶 ∞ (R+ ).] Thus, the result follows from the fundamental theorem of
calculus. J
We are ready to justify the name “resolvent”.
Proposition 6.12. Let 𝜆 > 0. Then 𝐷 (𝐿) = R and

𝑅𝜆 : 𝐶0 (𝐸) → 𝐷 (𝐿), 𝜆 − 𝐿 : 𝐷 (𝐿) → 𝐶0 (𝐸)

are inverses. That is,


(i) ∀𝑔 ∈ 𝐶0 (𝐸) 𝑅𝜆 𝑔 ∈ 𝐷 (𝐿) and (𝜆 − 𝐿)𝑅𝜆 𝑔 = 𝑔, and
(ii) ∀ 𝑓 ∈ 𝐷 (𝐿) 𝑅𝜆 (𝜆 − 𝐿) 𝑓 = 𝑓 .

Proof. (i) We want to show that 𝑅𝜆 𝑔 ∈ 𝐷 (𝐿) with 𝐿𝑅𝜆 𝑔 = 𝜆𝑅𝜆 𝑔 − 𝑔. We calculate for all 𝜀 > 0,
using the lemma for the resolvent equation on the first term and then decomposing the integrals:

𝜀^{−1} (𝑄𝜀 𝑅𝜆 𝑔 − 𝑅𝜆 𝑔) = 𝜀^{−1} (∫_0^∞ e^{−𝜆𝑡} 𝑄𝜀+𝑡 𝑔 d𝑡 − ∫_0^∞ e^{−𝜆𝑡} 𝑄𝑡 𝑔 d𝑡)
 = ((1 − e^{−𝜆𝜀})/𝜀) ∫_0^∞ e^{−𝜆𝑡} 𝑄𝜀+𝑡 𝑔 d𝑡 − (1/𝜀) ∫_0^𝜀 e^{−𝜆𝑡} 𝑄𝑡 𝑔 d𝑡,

where, as 𝜀 ↓ 0, the first term converges to 𝜆𝑅𝜆 𝑔 and the second to 𝑔, both in 𝐶0 (𝐸).
This also shows that R ⊆ 𝐷 (𝐿).
(ii) Now we want 𝜆𝑅𝜆 𝑓 = 𝑓 + 𝑅𝜆 𝐿 𝑓 . Using Proposition 6.11 and Fubini's theorem, we get

𝜆𝑅𝜆 𝑓 = 𝜆 ∫_0^∞ e^{−𝜆𝑡} 𝑄𝑡 𝑓 d𝑡
 = 𝜆 ∫_0^∞ e^{−𝜆𝑡} ( 𝑓 + ∫_0^𝑡 𝑄𝑠 𝐿 𝑓 d𝑠) d𝑡
 = 𝑓 + ∫_0^∞ e^{−𝜆𝑠} 𝑄𝑠 𝐿 𝑓 d𝑠 = 𝑓 + 𝑅𝜆 𝐿 𝑓 .

This also shows that 𝐷 (𝐿) ⊆ R. J


Exercise (due 3/8). Show that if 𝑓𝑛 , 𝑓 , 𝑔 ∈ 𝐶0 (𝐸) and 𝑓𝑛 → 𝑓 and 𝐿 𝑓𝑛 → 𝑔 in 𝐶0 (𝐸), then 𝑔 = 𝐿 𝑓 .
Corollary 6.13. The map 𝐿 : 𝐷 (𝐿) → 𝐶0 (𝐸) determines (𝑄 𝑡 )𝑡>0 .

This justifies the name “generator”.



Proof. Given 𝐿, we know 𝑅𝜆 for each 𝜆 > 0, whence we know the Laplace transform of 𝑡 ↦→ 𝑄𝑡 𝑓 (𝑥)
for each 𝑓 ∈ 𝐶0 (𝐸) and 𝑥 ∈ 𝐸. The uniqueness of the Laplace transform shows that we then
know 𝑄𝑡 𝑓 (𝑥). Since 𝑄𝑡 ↾𝐶0 (𝐸) determines 𝑄𝑡 (Riesz representation theorem—regularity gives
uniqueness), this completes the proof. J

Exercise (due 3/8). Fix 𝑎 ∈ R \ {0}. Let 𝑄 𝑡 (𝑥, ·) B 𝛿𝑥+𝑎𝑡 .


(1) Show that (𝑄 𝑡 )𝑡>0 is a Feller semigroup on R.
(2) Given a probability measure 𝛾 on R, find a Markov process with semigroup (𝑄 𝑡 )𝑡>0 and initial
distribution 𝛾.
(3) Find the generator of (𝑄 𝑡 )𝑡>0 and its domain.
It is easy to see that the semigroup (𝑄𝑡 )𝑡 of Brownian motion is Feller. It is also intuitive that
its generator is 𝐿 𝑓 = (1/2) 𝑓 ″ in some sense:

𝑄𝑡 𝑓 (𝑥) − 𝑓 (𝑥) = E𝑥 [ 𝑓 (𝑋𝑡 )] − 𝑓 (𝑥)
 = E𝑥 [ 𝑓 ′(𝑥)(𝑋𝑡 − 𝑥) + (1/2) 𝑓 ″(𝜉𝑡 )(𝑋𝑡 − 𝑥)²]
 ≈ (𝑡/2) 𝑓 ″(𝑥)

for some 𝜉𝑡 between 𝑥 and 𝑋𝑡 , except that we would need 𝜉𝑡 to be measurable for this argument to
work.
Instead, we use the resolvent. We saw that

∀𝜆 > 0 ∀ 𝑓 ∈ 𝐶0 (R) 𝑅𝜆 𝑓 (𝑥) = (1/√(2𝜆)) ∫ exp{−√(2𝜆) |𝑦 − 𝑥|} 𝑓 (𝑦) d𝑦.

Take 𝜆 := 1/2. If ℎ ∈ 𝐷 (𝐿), then there is some 𝑓 ∈ 𝐶0 (R) such that ℎ = 𝑅1/2 𝑓 and 𝑓 = (1/2 − 𝐿)ℎ.
If we differentiate

ℎ(𝑥) = ∫ e^{−|𝑦−𝑥|} 𝑓 (𝑦) d𝑦
twice (see page 161 of the book for details), we get

ℎ″(𝑥) = ∫ (−2𝛿𝑥 + e^{−|𝑦−𝑥|} ) 𝑓 (𝑦) d𝑦 (informally)
 = −2 𝑓 (𝑥) + ℎ(𝑥) (∈ 𝐶0 (R))
 = 2𝐿ℎ(𝑥) (by the above).

In particular, 𝐷 (𝐿) ⊆ {ℎ ∈ 𝐶²(R) ; ℎ, ℎ″ ∈ 𝐶0 (R)}.
In fact, this equals 𝐷 (𝐿). If 𝑔 ∈ 𝐶²(R) with 𝑔, 𝑔″ ∈ 𝐶0 (R), then set

𝑓 := (𝑔 − 𝑔″)/2 ∈ 𝐶0 (R)

and ℎ := 𝑅1/2 𝑓 ∈ 𝐷 (𝐿). We saw that ℎ ∈ 𝐶²(R) and ℎ″ = −2 𝑓 + ℎ, i.e., (ℎ − ℎ″)/2 = 𝑓 . Therefore,
(ℎ − 𝑔)″ = ℎ − 𝑔. Since ℎ − 𝑔 ∈ 𝐶0 (R), it follows that ℎ − 𝑔 = 0. Thus, 𝑔 = ℎ ∈ 𝐷 (𝐿), as desired.
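The identification 𝐿ℎ = (1/2)ℎ″ can be sanity-checked on ℎ(𝑥) = e^{−𝑥²/2} ∈ 𝐶0 (R), for which 𝑄𝑡 ℎ is a Gaussian convolution with the closed form 𝑄𝑡 ℎ(𝑥) = e^{−𝑥²/(2(1+𝑡))}/√(1 + 𝑡). A numerical sketch (the sample points are arbitrary):

```python
import math

def h(x):
    return math.exp(-x * x / 2)

def Qh(t, x):
    # heat semigroup applied to h, in closed form (Gaussian * Gaussian)
    return math.exp(-x * x / (2 * (1 + t))) / math.sqrt(1 + t)

def half_h2(x):
    # (1/2) h''(x) = (x^2 - 1) e^{-x^2/2} / 2
    return (x * x - 1) * h(x) / 2

t = 1e-6
max_err = max(abs((Qh(t, x) - h(x)) / t - half_h2(x))
              for x in [-1.5, -0.3, 0.0, 0.8, 2.0])
print(max_err)
```

The difference quotient matches (1/2)ℎ″ to order 𝑡, as the resolvent computation above predicts.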

Exercise (due 3/8). Exercise 6.23.


Exercise. We call a probability measure 𝛾 stationary if for all 𝑓 ∈ 𝐶0 (𝐸) and all 𝑡 ≥ 0, we have
∫ 𝑄𝑡 𝑓 d𝛾 = ∫ 𝑓 d𝛾. Show that 𝛾 is stationary iff for all 𝑓 ∈ 𝐷 (𝐿), we have ∫ 𝐿 𝑓 d𝛾 = 0.
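For a finite chain, the exercise reduces to linear algebra: 𝛾 (a row vector 𝜋) is stationary iff 𝜋𝐴 = 0 for the generator matrix 𝐴, equivalently Σ𝑥 𝜋(𝑥)(𝐴𝑓 )(𝑥) = 0 for every 𝑓 . A sketch with a three-state generator built to satisfy detailed balance with a chosen 𝜋 (all numbers are arbitrary):

```python
# stationary law pi and a generator A satisfying detailed balance
pi = [0.2, 0.3, 0.5]
rate = [[0.0, 1.0, 2.0],
        [0.0, 0.0, 4.0],
        [0.0, 0.0, 0.0]]           # arbitrary rates for i < j
A = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    for j in range(3):
        if i < j:
            A[i][j] = rate[i][j]
            A[j][i] = pi[i] * rate[i][j] / pi[j]   # detailed balance
for i in range(3):
    A[i][i] = -sum(A[i][j] for j in range(3) if j != i)  # rows sum to 0

# pi A = 0 ...
piA = [sum(pi[i] * A[i][j] for i in range(3)) for j in range(3)]
# ... hence the integral of Af against pi vanishes for every f
f = [1.7, -0.4, 2.9]
Af = [sum(A[i][j] * f[j] for j in range(3)) for i in range(3)]
integral = sum(pi[i] * Af[i] for i in range(3))
print(piA, integral)
```

Detailed balance is a convenient way to manufacture a stationary 𝜋, but the criterion 𝜋𝐴 = 0 itself does not require reversibility.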

It is usually very difficult to determine 𝐷 (𝐿) exactly, but we can find a subset of 𝐷 (𝐿) by using
the following martingales.
Theorem 6.14. Suppose that for all 𝑥 ∈ 𝐸, there is a càdlàg process 𝑋 that is Markov with
semigroup (𝑄 𝑡 ) for the probability measure P𝑥 . Let ℎ, 𝑔 ∈ 𝐶0 (𝐸). The following are equivalent:
(i) ℎ ∈ 𝐷 (𝐿) and 𝐿ℎ = 𝑔.
(ii) ∀𝑥 ∈ 𝐸,
𝑡 ↦→ ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑠 ) d𝑠
is a P𝑥 -martingale.
(iii) ∀𝑥 ∈ 𝐸 ∀𝑡 ≥ 0 E𝑥 [ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑠 ) d𝑠] = ℎ(𝑥).

Proof. Assume (i). We have

E𝑥 [ℎ(𝑋𝑡+𝑠 ) − ∫_0^{𝑡+𝑠} 𝑔(𝑋𝑟 ) d𝑟 | ℱ𝑡 ]
 = E𝑥 [ℎ(𝑋𝑡+𝑠 ) | ℱ𝑡 ] − ∫_0^𝑡 𝑔(𝑋𝑟 ) d𝑟 − E𝑥 [∫_𝑡^{𝑡+𝑠} 𝑔(𝑋𝑟 ) d𝑟 | ℱ𝑡 ]
 = 𝑄𝑠 ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑟 ) d𝑟 − ∫_𝑡^{𝑡+𝑠} E𝑥 [𝑔(𝑋𝑟 ) | ℱ𝑡 ] d𝑟
  [for the third term, take E[1𝐴 · · · ] for 𝐴 ∈ ℱ𝑡 ]
 = 𝑄𝑠 ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑟 ) d𝑟 − ∫_𝑡^{𝑡+𝑠} 𝑄𝑟−𝑡 𝑔(𝑋𝑡 ) d𝑟
 = 𝑄𝑠 ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑟 ) d𝑟 − ∫_0^𝑠 𝑄𝑟 𝑔(𝑋𝑡 ) d𝑟
 = ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑟 ) d𝑟

because 𝑄𝑠 ℎ = ℎ + ∫_0^𝑠 𝑄𝑟 𝑔 d𝑟 by Proposition 6.11. This gives (ii).

Obviously, (ii) implies (iii).

Assume (iii). We also have

E𝑥 [ℎ(𝑋𝑡 ) − ∫_0^𝑡 𝑔(𝑋𝑠 ) d𝑠] = 𝑄𝑡 ℎ(𝑥) − ∫_0^𝑡 𝑄𝑠 𝑔(𝑥) d𝑠.

Therefore,
(𝑄𝑡 ℎ − ℎ)/𝑡 = (1/𝑡) ∫_0^𝑡 𝑄𝑟 𝑔 d𝑟,

which converges to 𝑔 in 𝐶0 (𝐸). J

For example, consider 𝑑-dimensional Brownian motion. By Itô's formula, for all ℎ ∈ 𝐶²(R𝑑 ),

𝑡 ↦→ ℎ(𝑋𝑡 ) − (1/2) ∫_0^𝑡 Δℎ(𝑋𝑠 ) d𝑠

is a continuous local martingale. If ℎ and Δℎ are bounded, then this is a true martingale. In particular,
this holds if ℎ, Δℎ ∈ 𝐶0 (R𝑑 ). Therefore, Theorem 6.14 tells us that

𝐷 (𝐿) ⊇ {ℎ ∈ 𝐶²(R𝑑 ) ; ℎ, Δℎ ∈ 𝐶0 (R𝑑 )}
and 𝐿ℎ = (1/2)Δℎ for such ℎ. Equality does not hold for 𝑑 ≥ 2, where, in fact,

𝐷 (𝐿) = {ℎ ∈ 𝐶0 (R𝑑 ) ; Δℎ ∈ 𝐶0 (R𝑑 ) in the sense of distributions}

(see page 288 of the book by Revuz and Yor).


Example. If 𝑑 = 2, let ℎ(𝑥) := (𝑥1² − 𝑥2²) log^{1/2}(1/|𝑥|) near 0. For 𝑑 > 2, multiply this by a
smooth function. See B. Epstein, Partial Differential Equations, pp. 162–163.

Exercise (due 3/8). Exercise 6.27 (note the hypotheses on page 180 of the book).

6.3. The Regularity of Sample Paths


Let 𝐸 be a locally compact Polish space and (𝑄 𝑡 )𝑡>0 be a Feller semigroup on 𝐸.
Theorem 6.15. Suppose (𝑋𝑡 ) is a process and (P𝑥 )𝑥∈𝐸 are probability measures such that ∀𝑥 ∈ 𝐸,
(𝑋𝑡 )𝑡≥0 is a P𝑥 -Markov process with semigroup (𝑄𝑡 )𝑡≥0 with respect to (ℱ𝑡 )𝑡 and P𝑥 [𝑋0 = 𝑥] = 1.
Set ℱ̃∞ := ℱ∞ and, for all 𝑡 ≥ 0, ℱ̃𝑡 := ℱ𝑡+ ∨ 𝜎(𝒩), where

𝒩 := {𝐴 ∈ ℱ∞ ; ∀𝑥 ∈ 𝐸 P𝑥 (𝐴) = 0}.

Then there exists a process ( 𝑋̃𝑡 ) that is càdlàg, adapted to ( ℱ̃𝑡 )𝑡 , and for all probability measures 𝛾
on 𝐸, ( 𝑋̃𝑡 )𝑡≥0 is a P(𝛾) -modification of (𝑋𝑡 )𝑡≥0 , P(𝛾) -Markov with semigroup (𝑄𝑡 )𝑡≥0 with respect to
( ℱ̃𝑡 )𝑡 , and ∀𝐴 ∈ ℰ P(𝛾) [ 𝑋̃0 ∈ 𝐴] = 𝛾(𝐴).

Remark. The hypothesis implies that 𝑋 itself is P(𝛾) -Markov, etc., for all 𝛾 on 𝐸.

Exercise (due 3/22). Show that ( ℱ̃𝑡 )𝑡 is right-continuous. Hint: check that for all 𝐵 ∈ ℱ̃𝑡 , there
exists 𝐶 ∈ ℱ𝑡+ and 𝑁 ∈ 𝒩 such that 𝐵 = 𝐶 △ 𝑁.
Lemma. Let 𝑌 be a right-continuous, nonnegative supermartingale with respect to a right-continuous
filtration. Define
𝑇 := inf{𝑡 ≥ 0 ; 𝑌𝑡 = 0} ∧ inf{𝑡 > 0 ; 𝑌𝑡− = 0}.
Then
P[∀𝑡 ∈ [𝑇, ∞) 𝑌𝑡 = 0] = 1.

Remark. The right-continuity of the filtration is not necessary.



Proof. By Proposition 3.9(i),

𝑇𝑛 := inf{𝑡 ≥ 0 ; 𝑌𝑡 < 1/𝑛}

is a stopping time. By property (g) of stopping times in Chapter 3, 𝑇 = lim𝑛→∞ 𝑇𝑛 is also a stopping
time. We are concerned only with what happens on [𝑇 < ∞]. For 0 < 𝑞 ∈ Q, apply Theorem 3.25
to the stopping times 𝑇𝑛 ≤ 𝑇 + 𝑞:

E[𝑌𝑇+𝑞 1[𝑇 <∞] ] ≤ E[𝑌𝑇𝑛 1[𝑇𝑛 <∞] ] ≤ 1/𝑛.

This gives E[𝑌𝑇+𝑞 1[𝑇 <∞] ] = 0, so 𝑌𝑇+𝑞 = 0 almost surely on [𝑇 < ∞]. By right-continuity, we get
the result. J

Proof of Theorem 6.15. If 𝐸 is not compact, then let

𝐸Δ := 𝐸 ∪ {Δ}

be its one-point compactification; otherwise, let 𝐸Δ := 𝐸. Every function in 𝐶0 (𝐸) extends to a
function in 𝐶(𝐸Δ ) by defining it to be 0 at Δ.
Step 1: (define 𝑋̃ on 𝐸Δ )
Let ( 𝑓𝑛 )𝑛≥0 be a sequence of nonnegative functions in 𝐶0 (𝐸) that separates points of 𝐸Δ , i.e.,
∀𝑥 ≠ 𝑦 ∈ 𝐸Δ ∃𝑛 𝑓𝑛 (𝑥) ≠ 𝑓𝑛 (𝑦). Let

H := {𝑅𝑝 𝑓𝑛 ; 𝑝 ∈ N+ , 𝑛 ∈ N}.

Then H also separates points of 𝐸Δ because lim𝑝→∞ ‖ 𝑝𝑅𝑝 𝑓𝑛 − 𝑓𝑛 ‖ = 0, as we saw in the proof
of Proposition 6.8.
For ℎ ∈ H with ℎ = 𝑅𝑝 𝑓𝑛 , the process (e^{−𝑝𝑡} ℎ(𝑋𝑡 ))𝑡≥0 is, for all 𝑥, a P𝑥 -supermartingale
by Lemma 6.6. Let 𝑁ℎ be the event that for some 𝑘 ∈ N and some 𝑎, 𝑏 ∈ Q with 𝑎 < 𝑏,
(e^{−𝑝𝑠} ℎ(𝑋𝑠 ))𝑠∈Q+∩[0,𝑘] makes an infinite number of upcrossings of [𝑎, 𝑏]. In the proof of The-
orem 3.17, we saw that P𝑥 (𝑁ℎ ) = 0. Put 𝑁 := ⋃_{ℎ∈H} 𝑁ℎ ∈ 𝒩. Then for all 𝛾, P(𝛾) (𝑁) = 0,
and for all 𝜔 ∉ 𝑁, all ℎ ∈ H , and all 𝑡 ≥ 0, both limits

lim_{Q+∋𝑠↓𝑡} ℎ(𝑋𝑠 (𝜔)) and lim_{Q+∋𝑠↑𝑡} ℎ(𝑋𝑠 (𝜔)) exist.

Because H separates points, it follows that for all 𝜔 ∉ 𝑁, (𝑋𝑠 (𝜔))𝑠∈Q+ has right and left limits
in 𝐸Δ (not necessarily in 𝐸).
Thus, we may define

∀𝜔 ∉ 𝑁 ∀𝑡 ≥ 0 𝑋̃𝑡 (𝜔) := lim_{Q+∋𝑠↓𝑡} 𝑋𝑠 (𝜔).

If 𝜔 ∈ 𝑁, put 𝑋̃𝑡 (𝜔) := 𝑥0 for some fixed 𝑥0 ∈ 𝐸 and all 𝑡 ≥ 0. Then 𝑋̃ is 𝐸Δ -valued and
( ℱ̃𝑡 )𝑡 -adapted. Lemma 3.16 shows that ℎ( 𝑋̃𝑡 ) is càdlàg for all ℎ ∈ H , whence so is 𝑋̃.

Step 2: (show ∀𝑡 ≥ 0 ∀𝛾 P(𝛾) [𝑋𝑡 = 𝑋̃𝑡 ] = 1)
Let 𝑡 ≥ 0. For all 𝑓 , 𝑔 ∈ 𝐶0 (𝐸), we have

E(𝛾) [ 𝑓 (𝑋𝑡 )𝑔( 𝑋̃𝑡 )] = lim_{Q∋𝑠↓𝑡} E(𝛾) [ 𝑓 (𝑋𝑡 )𝑔(𝑋𝑠 )] [bounded convergence theorem]
 = lim_{Q∋𝑠↓𝑡} E(𝛾) [ 𝑓 (𝑋𝑡 ) E(𝛾) [𝑔(𝑋𝑠 ) | ℱ𝑡 ]]
 = lim_{Q∋𝑠↓𝑡} E(𝛾) [ 𝑓 (𝑋𝑡 )𝑄𝑠−𝑡 𝑔(𝑋𝑡 )]
 = E(𝛾) [ 𝑓 (𝑋𝑡 )𝑔(𝑋𝑡 )] [Feller property and bounded convergence theorem].

As in Exercise 6.27, this means that (𝑋𝑡 , 𝑋̃𝑡 ) has the same law as (𝑋𝑡 , 𝑋𝑡 ) under P(𝛾) , whence we
have P(𝛾) [𝑋𝑡 = 𝑋̃𝑡 ] = 1.
Step 3: (show that for all 𝛾, 𝑋̃ is P(𝛾) -Markov with semigroup (𝑄𝑡 )𝑡≥0 with respect to ( ℱ̃𝑡 ))
We want to verify that

∀𝑠 ≥ 0 ∀𝑡 ≥ 0 ∀ 𝑓 ∈ 𝐵(𝐸) E(𝛾) [ 𝑓 ( 𝑋̃𝑠+𝑡 ) | ℱ̃𝑠 ] = 𝑄𝑡 𝑓 ( 𝑋̃𝑠 ),

i.e.,
∀𝐴 ∈ ℱ̃𝑠 E(𝛾) [1𝐴 𝑓 ( 𝑋̃𝑠+𝑡 )] = E(𝛾) [1𝐴 𝑄𝑡 𝑓 ( 𝑋̃𝑠 )].

By regarding each side as a linear functional on 𝐵(𝐸), we see that it suffices to establish the equality
for 𝑓 ∈ 𝐶0 (𝐸). In addition, since 𝑠 and 𝑡 are fixed, we may replace 𝑋̃𝑠+𝑡 by 𝑋𝑠+𝑡 . Furthermore, we
may assume 𝐴 ∈ ℱ𝑠+ . Then for 𝑟 ∈ Q ∩ (𝑠, 𝑠 + 𝑡),

E(𝛾) [1𝐴 𝑓 (𝑋𝑠+𝑡 )] = E(𝛾) [1𝐴 E(𝛾) [ 𝑓 (𝑋𝑠+𝑡 ) | ℱ𝑟 ]]
 = E(𝛾) [1𝐴 𝑄𝑠+𝑡−𝑟 𝑓 (𝑋𝑟 )].

Since ‖𝑄𝑠+𝑡−𝑟 𝑓 − 𝑄𝑡 𝑓 ‖ → 0 as 𝑟 ↓ 𝑠 and 𝑋𝑟 → 𝑋̃𝑠 P(𝛾) -a.s. as 𝑟 ↓ 𝑠, we get the result.
Step 4: ( 𝑋̃ is càdlàg as an 𝐸-valued process off another set in 𝒩; this step is not needed if 𝐸 is
compact)
Note that 𝑋̃ being càdlàg in 𝐸Δ and Step 2 do not ensure this (even if 𝑋̃ = 𝑋).
Choose 0 < 𝑔 ∈ 𝐶0 (𝐸), put ℎ := 𝑅1 𝑔 > 0 (so ℎ ∈ 𝐶0 (𝐸)), and

𝑌𝑡 := e^{−𝑡} ℎ( 𝑋̃𝑡 ).

Then 𝑌 is a nonnegative ( ℱ̃𝑡 )-supermartingale by Lemma 6.6. Also, 𝑌 is càdlàg (recall that
ℎ(Δ) := 0). Define 𝑇 as in the lemma. Let

𝑁1 := [∃𝑡 ∈ [𝑇, ∞) 𝑌𝑡 ≠ 0].

By the lemma, 𝑁1 ∈ 𝒩. Let

𝑁2 := [∃𝑘 ∈ N 𝑋𝑘 ≠ 𝑋̃𝑘 ];

by Step 2, 𝑁2 ∈ 𝒩. Off 𝑁1 ∪ 𝑁2 , we have 𝑇 = ∞ because 𝑌𝑡 = 0 iff 𝑋̃𝑡 = Δ, and 𝑋𝑘 ≠ Δ. That is,
off 𝑁1 ∪ 𝑁2 , 𝑋̃ is 𝐸-valued. J

6.4. The Strong Markov Property

Theorem 6.16 (Simple Markov Property). Let 𝐸 be a measurable space, (𝑋𝑡 )𝑡≥0 be an 𝐸-valued
process, and (P𝑥 )𝑥∈𝐸 be probability measures such that ∀𝑥 ∈ 𝐸, (𝑋𝑡 )𝑡≥0 is a P𝑥 -Markov process with
semigroup (𝑄𝑡 )𝑡≥0 with respect to (ℱ𝑡 )𝑡≥0 and P𝑥 [𝑋0 = 𝑥] = 1. Let 𝛾 be a probability measure on 𝐸.
Let
Φ : 𝐸 R+ −→ R+
be measurable. Then

∀𝑠 ≥ 0 E(𝛾) [Φ((𝑋𝑠+𝑡 )𝑡≥0 ) | ℱ𝑠 ] = E𝑋𝑠 [Φ],

where E𝑋𝑠 [Φ] denotes the composition of 𝜔 ↦→ 𝑋𝑠 (𝜔) and 𝑥 ↦→ E𝑥 [Φ(𝑋)].

Proof. We saw in Section 6.1 that 𝑥 ↦→ P𝑥 is measurable, whence so is 𝑥 ↦→ E𝑥 [Φ(𝑋)].
To prove the theorem, it suffices to prove the case when Φ is an indicator of an elementary
cylinder set, or, more generally,

Φ(𝑋) = ∏_{𝑖=1}^{𝑝} 𝜑𝑖 (𝑋𝑠+𝑡𝑖 ), 0 ≤ 𝑡1 < 𝑡2 < · · · < 𝑡𝑝 , 𝜑𝑖 ∈ 𝐵(𝐸).

The proof is just like that of (∗∗∗), but with an extra conditioning: We want

the left-hand side of the conclusion of the theorem
 = ∫ 𝑄𝑡1 (𝑋𝑠 , d𝑥1 )𝜑1 (𝑥1 ) ∫ 𝑄𝑡2−𝑡1 (𝑥1 , d𝑥2 )𝜑2 (𝑥2 ) · · · ∫ 𝑄𝑡𝑝−𝑡𝑝−1 (𝑥𝑝−1 , d𝑥𝑝 )𝜑𝑝 (𝑥𝑝 ).

For 𝑝 = 1, this is Definition 6.2 (of a Markov process with semigroup). The induction step is: the
left-hand side of the conclusion of the theorem equals

E(𝛾) [∏_{𝑖=1}^{𝑝−1} 𝜑𝑖 (𝑋𝑠+𝑡𝑖 ) · E(𝛾) [𝜑𝑝 (𝑋𝑠+𝑡𝑝 ) | ℱ𝑠+𝑡𝑝−1 ] | ℱ𝑠 ]
 = E(𝛾) [∏_{𝑖=1}^{𝑝−1} 𝜑𝑖 (𝑋𝑠+𝑡𝑖 ) · 𝑄𝑡𝑝−𝑡𝑝−1 𝜑𝑝 (𝑋𝑠+𝑡𝑝−1 ) | ℱ𝑠 ]. J

Exercise (due 3/22). Exercise 6.26 (note that the derivative in part 3 is in the norm sense).

If 𝐸 is a topological space, we write

D(𝐸) := { 𝑓 ∈ 𝐸 R+ ; 𝑓 is càdlàg},

and give D(𝐸) the 𝜎-field 𝒟 induced from 𝐸 R+ , even though D(𝐸) need not be measurable. We
call D(𝐸) the Skorokhod space.

Theorem 6.17 (Strong Markov Property). Let (𝑋𝑡 )𝑡>0 be an 𝐸-valued càdlàg process, and (P𝑥 )𝑥∈𝐸
be probability measures such that ∀𝑥 ∈ 𝐸 (𝑋𝑡 )𝑡>0 is a P𝑥 -Markov process with semigroup (𝑄 𝑡 )𝑡>0
with respect to (ℱ𝑡 )𝑡>0 and P𝑥 [𝑋0 = 𝑥] = 1. Assume that 𝐸 is locally compact Hausdorff, 𝑄 𝑡 maps
𝐶0 (𝐸) to 𝐶 (𝐸) [e.g., 𝐸 is also Polish and (𝑄 𝑡 )𝑡>0 is Feller],

Φ : (D(𝐸), 𝒟) → R+

is measurable, and 𝑇 is an (ℱ𝑡+ )𝑡 -stopping time [e.g., 𝑇 is a stopping time]. Then, for all probability
measures 𝛾 on 𝐸,

E(𝛾) [1[𝑇 <∞] Φ((𝑋𝑇+𝑡 )𝑡≥0 ) | ℱ𝑇 ] = 1[𝑇 <∞] E𝑋𝑇 [Φ].

Proof. Theorem 3.7 guarantees that 𝑋𝑇 is measurable on [𝑇 < ∞], so the right-hand side is
ℱ𝑇 -measurable. Thus, it suffices to show that
   
∀𝐴 ∈ ℱ𝑇 E(𝛾) [1𝐴∩[𝑇 <∞] Φ((𝑋𝑇+𝑡 )𝑡≥0 )] = E(𝛾) [1𝐴∩[𝑇 <∞] E𝑋𝑇 [Φ]].

As in the proof of Theorem 6.16, it suffices to do this for Φ of the form

Φ( 𝑓 ) = ∏_{𝑖=1}^{𝑝} 𝜑𝑖 ( 𝑓 (𝑡𝑖 )), 0 ≤ 𝑡1 < 𝑡2 < · · · < 𝑡𝑝 , 𝜑𝑖 ∈ 𝐵(𝐸).

We again use induction, but this time the case 𝑝 = 1 requires work; the induction step is like before:

𝑝
Y
 
E (𝛾) 1 𝐴∩[𝑇 <∞] 𝜑𝑖 (𝑋𝑇+𝑡𝑖 )
𝑖=1
h 𝑝−1
Y  i
= E (𝛾) 1 𝐴∩[𝑇 <∞] 𝜑𝑖 (𝑋𝑇+𝑡𝑖 ) · E (𝛾) 𝜑 𝑝 (𝑋𝑇+𝑡 𝑝 ) ℱ𝑇+𝑡 𝑝−1
𝑖=1
𝑝−1
Y
 
= E (𝛾) 1 𝐴∩[𝑇 <∞] 𝜑𝑖 (𝑋𝑇+𝑡𝑖 ) · 𝑄 𝑡 𝑝 −𝑡 𝑝−1 𝜑 𝑝 (𝑋𝑇+𝑡 𝑝−1 ) .
𝑖=1

So it remains to prove that

∀𝑡 > 0 ∀𝜑 ∈ 𝐵(𝐸) ∀𝐴 ∈ ℱ𝑇
   
E (𝛾) 1 𝐴∩[𝑇 <∞] 𝜑(𝑋𝑇+𝑡 ) = E (𝛾) 1 𝐴∩[𝑇 <∞] 𝑄 𝑡 𝜑(𝑋𝑇 ) .

Because 𝐸 is locally compact Hausdorff, it suffices to prove this for all 𝜑 ∈ 𝐶0 (𝐸) [regularity
ensures uniqueness again]. Write
𝑇_𝑛 ≔ ⌊𝑛𝑇 + 1⌋/𝑛.

Then

E^(𝛾)[ 1_{𝐴∩[𝑇<∞]} 𝜑(𝑋_{𝑇+𝑡}) ]
= lim_{𝑛→∞} E^(𝛾)[ 1_{𝐴∩[𝑇<∞]} 𝜑(𝑋_{𝑇_𝑛+𝑡}) ]   [right-continuity]
= lim_{𝑛→∞} ∑_{𝑖=1}^{∞} E^(𝛾)[ 1_{𝐴∩[(𝑖−1)/𝑛⩽𝑇<𝑖/𝑛]} 𝜑(𝑋_{𝑖/𝑛+𝑡}) ]
= lim_{𝑛→∞} ∑_{𝑖=1}^{∞} E^(𝛾)[ 1_{𝐴∩[(𝑖−1)/𝑛⩽𝑇<𝑖/𝑛]} 𝑄_𝑡𝜑(𝑋_{𝑖/𝑛}) ]   [conditioning on ℱ_{𝑖/𝑛}]
= lim_{𝑛→∞} E^(𝛾)[ 1_{𝐴∩[𝑇<∞]} 𝑄_𝑡𝜑(𝑋_{𝑇_𝑛}) ]
= E^(𝛾)[ 1_{𝐴∩[𝑇<∞]} 𝑄_𝑡𝜑(𝑋_𝑇) ],

because 𝑋 is right-continuous and 𝑄_𝑡𝜑 is continuous. ∎

The formulation of the strong Markov property for Brownian motion, Theorem 2.20, was
different, though equivalent, because it used that Brownian motion has independent, stationary
increments.
Exercise (due 3/29). Exercise 6.25, Exercise 6.29.
Exercise. Derive Eq. (3.7), the Laplace transform of the hitting time for Brownian motion, from
Dynkin’s formula, Exercise 6.29(3).

Appendix: Locally Compact Polish Spaces are 𝜎-compact


We prove the standard result that if 𝐸 is a locally compact Polish space, then there is an
increasing sequence (𝐾𝑛 )𝑛>1 of compact subsets of 𝐸 such that every compact subset of 𝐸 is
contained in some 𝐾𝑛 .
First note that every separable metric space is second countable, i.e., has a countable basis
for its topology. Indeed, if 𝐷 is a countable dense set, then the balls centered at points of 𝐷 with
rational radii form a countable basis.
Second, we claim that every second countable, locally compact space has a countable basis
each of whose members has compact closure. To see this, let U be a countable basis. Write U′ for
the collection of members of U whose closure is compact. Let 𝑂 be open. For each 𝑥 ∈ 𝑂, there is
a neighborhood 𝑉 ⊆ 𝑂 of 𝑥 with compact closure. Every element of U contained in 𝑉 lies in U′,
whence 𝑉 is a union of sets in U′. Since this holds for all 𝑥 ∈ 𝑂, also 𝑂 is a union of sets in U′.
Therefore U′ satisfies the requirements.
Putting together these two facts, we see that 𝐸 has a countable basis each of whose members
has compact closure. Order such a basis as (𝑉_𝑛)_{𝑛⩾1}. Define 𝐾_𝑛 ≔ ⋃_{𝑘⩽𝑛} 𝑉̄_𝑘. Then 𝐾_𝑛 is compact and
𝐾_𝑛 ⊆ 𝐾_{𝑛+1}. If 𝐾 is any compact subset of 𝐸, then since 𝐸 = ⋃_𝑛 𝑉_𝑛, the definition of compactness
provides a finite subcover of 𝐾, whence 𝐾 ⊆ 𝐾_𝑛 for some 𝑛.

Although we did not use completeness of 𝐸, every locally compact, separable metric space is
Polish. For a proof, see Theorem 5.3 of Classical Descriptive Set Theory by Alexander S. Kechris.

Chapter 8

Stochastic Differential Equations

We treat mainly the case of Lipschitz coefficients, where we prove existence, uniqueness, and
that the solution is a Feller Markov process whose generator is a second-order differential operator.

8.1. Motivation and General Definitions


A real-valued ordinary differential equation has the form

𝑦′(𝑡) = 𝑏(𝑡, 𝑦(𝑡)),

also written as
d𝑦_𝑡 = 𝑏(𝑡, 𝑦_𝑡) d𝑡.
Here, we are writing the subscript 𝑡 not to indicate derivative, but the time variable. We may wish to
model noise by adding a term on the right, 𝜎 d𝐵𝑡 or 𝜎(𝑡, 𝑦 𝑡 ) d𝐵𝑡 , where 𝐵 is Brownian motion. The
equation
d𝑦 𝑡 = 𝑏(𝑡, 𝑦 𝑡 ) d𝑡 + 𝜎(𝑡, 𝑦 𝑡 ) d𝐵𝑡
means, by definition, that
∫ 𝑡 ∫ 𝑡
𝑦𝑡 = 𝑦0 + 𝑏(𝑠, 𝑦 𝑠 ) d𝑠 + 𝜎(𝑠, 𝑦 𝑠 ) d𝐵 𝑠 .
0 0

We have a similar notion for vector-valued processes:


Definition 8.1. Let 𝑑, 𝑚 ∈ N+ . Denote the set of 𝑑 × 𝑚 real matrices by 𝑀𝑑×𝑚 (R) and give it
the product topology. Let 𝜎 : R+ × R𝑑 → 𝑀𝑑×𝑚 (R) and 𝑏 : R+ × R𝑑 → R𝑑 be locally bounded,
measurable functions. Write their coordinates as 𝜎 = (𝜎𝑖 𝑗 )16𝑖6𝑑, 16 𝑗 6𝑚 and 𝑏 = (𝑏𝑖 )16𝑖6𝑑 . By a
solution of the stochastic differential equation

𝐸 (𝜎, 𝑏) : d𝑋𝑡 = 𝜎(𝑡, 𝑋𝑡 ) d𝐵𝑡 + 𝑏(𝑡, 𝑋𝑡 ) d𝑡,

we mean

• a filtered probability space (Ω, ℱ, (ℱ_𝑡)_{0⩽𝑡⩽∞}, P) with a complete filtration,




• an 𝑚-dimensional (ℱ𝑡 )-Brownian motion 𝐵 = (𝐵1 , . . . , 𝐵𝑚 ) started from 0, and



• an (ℱ𝑡 )-adapted continuous R𝑑 -valued process 𝑋 = (𝑋 1 , . . . , 𝑋 𝑑 ) such that


∀𝑡 ⩾ 0  𝑋_𝑡 = 𝑋_0 + ∫_0^𝑡 𝜎(𝑠, 𝑋_𝑠) d𝐵_𝑠 + ∫_0^𝑡 𝑏(𝑠, 𝑋_𝑠) d𝑠,

which means

∀𝑖 ∈ [1, 𝑑]  𝑋^𝑖_𝑡 = 𝑋^𝑖_0 + ∑_{𝑗=1}^{𝑚} ∫_0^𝑡 𝜎_{𝑖𝑗}(𝑠, 𝑋_𝑠) d𝐵^𝑗_𝑠 + ∫_0^𝑡 𝑏_𝑖(𝑠, 𝑋_𝑠) d𝑠.

If also P[𝑋0 = 𝑥] = 1, then we say 𝑋 is a solution of 𝐸 𝑥 (𝜎, 𝑏).

Note that 𝑋0 ⫫ 𝐵 because 𝑋0 ∈ ℱ0 ⫫ 𝐵.


Existence and uniqueness can be defined in various ways probabilistically:
Definition 8.2. We say 𝐸(𝜎, 𝑏) has weak existence if for all 𝑥 ∈ R^𝑑, there exists a solution
to 𝐸_𝑥(𝜎, 𝑏). We say 𝐸(𝜎, 𝑏) has weak uniqueness if for each 𝑥 ∈ R^𝑑, over all solutions to
𝐸_𝑥(𝜎, 𝑏) (including varying the filtered probability space), the law of 𝑋 is the same. We say
𝐸(𝜎, 𝑏) has pathwise uniqueness if for each filtered probability space (Ω, ℱ, (ℱ_𝑡), P) and for each
(ℱ_𝑡)-Brownian motion 𝐵, any pair 𝑋 and 𝑋′ of solutions to 𝐸(𝜎, 𝑏) such that P[𝑋_0 = 𝑋′_0] = 1
are indistinguishable. We say a solution of 𝐸_𝑥(𝜎, 𝑏) is a strong solution if it is adapted to the
completed canonical filtration of 𝐵.

Exercise (due 3/29). Exercise 8.9.

Example (Section 8.4.1). To model the motion of a physical Brownian particle, we should not
consider the forces as changing the position of the particle, but its momentum or, equivalently,
velocity. Furthermore, there is a frictional drag (viscosity). This leads to the stochastic differential
equation
d𝑋_𝑡 = d𝐵_𝑡 − 𝜆𝑋_𝑡 d𝑡  (in one dimension)
for the velocity 𝑋, where 𝜆 > 0. This is the Langevin equation, up to constants, historically the
first stochastic differential equation. It is also exponential decay with noise. We can solve this by
applying integration by parts to e𝜆𝑡 𝑋𝑡 :
d(e^{𝜆𝑡}𝑋_𝑡) = 𝜆e^{𝜆𝑡}𝑋_𝑡 d𝑡 + e^{𝜆𝑡} d𝑋_𝑡 = e^{𝜆𝑡} d𝐵_𝑡  (the last equality is what we want),

suggesting that

𝑋_𝑡 ≔ 𝑋_0 e^{−𝜆𝑡} + ∫_0^𝑡 e^{−𝜆(𝑡−𝑠)} d𝐵_𝑠

is a solution. Indeed, Itô’s formula shows that it is, called the Ornstein–Uhlenbeck process. We
have proved weak existence, weak uniqueness, and pathwise uniqueness. It is also a strong solution.
This integral is a Wiener integral, whence it belongs to the Gaussian space generated by 𝐵. We
consider two special cases.

Figure 8.1: Simulation of the Ornstein–Uhlenbeck process (100 samples for time 10, compared to exponential decay at rate 2)
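Such a simulation is easy to reproduce, since the explicit solution makes the transition of the process over a step ℎ exactly Gaussian. A minimal sketch (assuming NumPy; the parameters 𝜆 = 2 and 𝑋_0 = 10 are illustrative):

```python
import numpy as np

# Exact OU transition over a step h, from the explicit solution:
# X_{t+h} = e^{-lam h} X_t + N(0, (1 - e^{-2 lam h}) / (2 lam)).
rng = np.random.default_rng(0)
lam, h, n_steps, n_paths, x0 = 2.0, 0.01, 1000, 2000, 10.0
decay = np.exp(-lam * h)
noise_sd = np.sqrt((1.0 - np.exp(-2.0 * lam * h)) / (2.0 * lam))
X = np.full(n_paths, x0)
for _ in range(n_steps):
    X = decay * X + noise_sd * rng.standard_normal(n_paths)
# At t = 10, the mean x0 * e^{-lam t} is ~0 and the variance is ~1/(2 lam) = 0.25
print(X.mean(), X.var())
```

Using the exact Gaussian transition rather than an Euler scheme means there is no discretization bias in the marginal law.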

(1) Suppose P[𝑋_0 = 𝑥] = 1. Then 𝑋 is a non-centered Gaussian process with mean function

𝑚(𝑡) ≔ E[𝑋_𝑡] = 𝑥e^{−𝜆𝑡}.

Thus, 𝑋 is gotten by adding 𝑚(𝑡) to a centered Gaussian process with covariance function, for
0 ⩽ 𝑠 ⩽ 𝑡,

𝐾(𝑠, 𝑡) ≔ Cov(𝑋_𝑠, 𝑋_𝑡) = E[ ∫_0^𝑡 e^{−𝜆(𝑡−𝑢)} d𝐵_𝑢 · ∫_0^𝑠 e^{−𝜆(𝑠−𝑢)} d𝐵_𝑢 ]
= ∫_0^𝑠 e^{−𝜆(𝑡−𝑢)} e^{−𝜆(𝑠−𝑢)} d𝑢   [isometry]
= e^{−𝜆(𝑡+𝑠)} ∫_0^𝑠 e^{2𝜆𝑢} d𝑢 = e^{−𝜆(𝑡+𝑠)} · (e^{2𝜆𝑠} − 1)/(2𝜆)
= (e^{−𝜆|𝑡−𝑠|} − e^{−𝜆(𝑡+𝑠)})/(2𝜆).
Thus, we see decay of the initial condition and convergence to a stationary process.
(2) Suppose 𝑋_0 ∼ 𝒩(0, 1/(2𝜆)). Then 𝑋 is a centered Gaussian process with covariance function

𝐾(𝑠, 𝑡) + E[𝑋_0 e^{−𝜆𝑡} · 𝑋_0 e^{−𝜆𝑠}] = 𝐾(𝑠, 𝑡) + e^{−𝜆(𝑡+𝑠)}/(2𝜆) = e^{−𝜆|𝑡−𝑠|}/(2𝜆).
We see that the Ornstein–Uhlenbeck process in this case is stationary. Our later theory will
show it is Markov, but this can be shown directly:
Exercise (due 4/5). Show that a centered stationary Gaussian process on R+ is Markov if and only if
there exist 𝜆 ∈ [0, ∞] and 𝑎 > 0 such that the covariance function is

(𝑠, 𝑡) ↦→ 𝑎e−𝜆|𝑠−𝑡| .

In this case, give the transition semigroup.



Exercise. Calculate the quadratic variation of an Ornstein–Uhlenbeck process in two ways, one
from the defining SDE and the other from the solution of the SDE as a stochastic integral.
A simple transformation of Brownian motion gives another description of the stationary
Ornstein–Uhlenbeck process: Suppose that (𝛽_𝑡)_𝑡 is an (ℱ_𝑡)_𝑡-Brownian motion and 𝜆 > 0. Then
((2𝜆)^{−1/2} e^{−𝜆𝑡} 𝛽_{e^{2𝜆𝑡}})_{𝑡⩾0} is a centered Gaussian process with the same initial distribution and covariance
function as in (2) above. Since it is continuous, this process has the same law as a stationary
Ornstein–Uhlenbeck process, but it is adapted to the filtration (ℱ_{e^{2𝜆𝑡}})_𝑡. A slightly different description
comes from applying Theorem 5.13 (the Dambis–Dubins–Schwarz theorem) to (e^{𝜆𝑡}𝑋_𝑡 − 𝑋_0)_𝑡,
yielding (e^{−𝜆𝑡}(𝑋_0 + 𝛽_{(e^{2𝜆𝑡}−1)/(2𝜆)}))_{𝑡⩾0}, which works for any 𝑋_0 ∈ ℱ_0.


We now give an example (due to Tanaka) of a stochastic differential equation where weak
existence and weak uniqueness hold, but pathwise uniqueness fails and there is no strong solution:

d𝑋𝑡 = 𝜎(𝑋𝑡 ) d𝐵𝑡 ,

where

𝜎(𝑥) ≔ 1 if 𝑥 > 0,  and  𝜎(𝑥) ≔ −1 if 𝑥 ⩽ 0.


Recall that Theorem 5.12 (of Lévy) implies that if 𝐵 is an (ℱ𝑡 )-Brownian motion, 𝐻 is progressive,
and |𝐻| = 1, then 𝐻 · 𝐵 is an (ℱ𝑡 )-Brownian motion. Therefore, weak uniqueness holds (since 𝑋
is progressive, so is 𝜎(𝑋)). This also suggests how to get weak existence: Let 𝑋 be a Brownian
motion starting from 𝑥 ∈ R and define

𝐵_𝑡 ≔ ∫_0^𝑡 𝜎(𝑋_𝑠) d𝑋_𝑠.

Then 𝐵 is a Brownian motion, and d𝑋𝑡 = 𝜎(𝑋𝑡 ) d𝐵𝑡 because 𝜎 2 = 1.


We claim that d(−𝑋_𝑡) = 𝜎(−𝑋_𝑡) d𝐵_𝑡, which means that pathwise uniqueness fails. It suffices
to see that

∫_0^𝑡 1_{{0}}(𝑋_𝑠) d𝐵_𝑠 = 0,

which follows from the fact that its expected quadratic variation is

E_𝑥[ ∫_0^𝑡 1_{{0}}(𝑋_𝑠) d𝑠 ] = ∫_0^𝑡 P_𝑥[𝑋_𝑠 = 0] d𝑠 = 0.

One can show that the completed filtrations satisfy ℱ̄^𝐵_• = ℱ̄^{|𝑋|}_• ⊊ ℱ̄^𝑋_• (the bar denotes completion), so 𝑋 is not a strong solution.
Similarly, one shows that no strong solution exists at all. This relies on Tanaka's formula for
local time (Chapter 9).
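A discretized version of Tanaka's construction can be checked numerically. The sketch below (assuming NumPy) builds 𝐵 as the discrete integral of 𝜎(𝑋) against a simulated Brownian path 𝑋 and verifies that its quadratic variation is ≈ 𝑡, consistent with Lévy's characterization:

```python
import numpy as np

rng = np.random.default_rng(4)
n, h = 100_000, 1e-4                      # time horizon t = n * h = 10
dX = np.sqrt(h) * rng.standard_normal(n)  # increments of the Brownian motion X
X = np.concatenate([[0.0], np.cumsum(dX)])
sigma = np.where(X[:-1] > 0, 1.0, -1.0)   # sigma(x) = 1 if x > 0, else -1
dB = sigma * dX                           # discrete version of dB = sigma(X) dX
# B should again look like Brownian motion: quadratic variation ~ t = 10
print(dB @ dB)
# Note sigma(-X) * dB = sigma(-X) * sigma(X) * dX = -dX off {X = 0},
# which is the discrete shadow of the failure of pathwise uniqueness.
```

Since 𝜎² ≡ 1, the discrete quadratic variation of 𝐵 is exactly that of 𝑋, matching the use of Lévy's theorem above.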
Barlow (1982) gave, for each 𝛽 ∈ (0, 1/2), a function 𝜎 : R → R that is Hölder-continuous of
order 𝛽 and bounded above and below by positive constants such that

d𝑋𝑡 = 𝜎(𝑋𝑡 ) d𝐵𝑡

has a weak solution but no strong solution and no pathwise uniqueness.



Exercise (due 4/5). Let 𝑀 be a continuous semimartingale with 𝑀_0 = 0. The proof of Proposition 5.11 shows that for all 𝜆 ∈ C,

ℰ(𝜆𝑀) ≔ exp(𝜆𝑀 − ½⟨𝜆𝑀, 𝜆𝑀⟩)

satisfies
d𝑋_𝑡 = 𝜆𝑋_𝑡 d𝑀_𝑡,  𝑋_0 = 1.
Show that there is no other solution. Hint: compute 𝑋ℰ(𝜆𝑀)^{−1} using Itô's formula.

Figure 8.2: Simulation of the exponential martingale (100 samples for Brownian motion to time 10). Note: ℰ(𝐵)_𝑡 → 0 almost surely as 𝑡 → ∞.

Example (Section 8.4.2). A combination of the SDEs of both the Ornstein–Uhlenbeck process and
the preceding exercise is
d𝑋𝑡 = 𝜎𝑋𝑡 d𝐵𝑡 + 𝑟 𝑋𝑡 d𝑡
for constants 𝜎 > 0 and 𝑟 ∈ R. To solve this, calculate (if 𝑋_0 > 0)

d log 𝑋_𝑡 = 𝑋_𝑡^{−1} d𝑋_𝑡 − ½𝑋_𝑡^{−2} d⟨𝑋, 𝑋⟩_𝑡 = 𝜎 d𝐵_𝑡 + 𝑟 d𝑡 − (𝜎²/2) d𝑡,

whence

𝑋_𝑡 = 𝑋_0 exp(𝜎𝐵_𝑡 + (𝑟 − 𝜎²/2)𝑡) = 𝑋_0 ℰ(𝜎𝐵)_𝑡 e^{𝑟𝑡};
one checks this is indeed a solution. One can also show uniqueness as in the exercise. This is
known as geometric Brownian motion with parameters 𝜎 and 𝑟. It is fundamental in financial
mathematics; 𝑟 represents interest rate.
In fact, this example is itself an example as in the exercise: take 𝜆 := 1 and 𝑀𝑡 := 𝜎𝐵𝑡 + 𝑟𝑡.

There is a very general relation between existence and uniqueness:



Figure 8.3: Simulation of geometric Brownian motion (100 samples for time 10, compared to exponential decay at rate 2)
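Since ℰ(𝜎𝐵) is a martingale with mean 1, the explicit solution gives E[𝑋_𝑡] = 𝑋_0 e^{𝑟𝑡}. A quick Monte Carlo sanity check of this (assuming NumPy; the parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, r, t, x0, n = 0.3, 0.05, 1.0, 1.0, 200_000
B_t = np.sqrt(t) * rng.standard_normal(n)                # B_t ~ N(0, t)
X_t = x0 * np.exp(sigma * B_t + (r - sigma**2 / 2) * t)  # explicit solution
# E[X_t] = x0 * e^{r t}, because E[exp(sigma B_t - sigma^2 t / 2)] = 1
print(X_t.mean(), x0 * np.exp(r * t))
```

The same sampler, with 𝑟 the interest rate, underlies elementary Black–Scholes-style Monte Carlo pricing.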

Theorem (Yamada–Watanabe). If 𝐸(𝜎, 𝑏) has pathwise uniqueness, then 𝐸(𝜎, 𝑏) has weak uniqueness. If 𝐸(𝜎, 𝑏) also has weak existence, then for every filtered probability space (Ω, ℱ, (ℱ_𝑡), P)
and (ℱ_𝑡)-Brownian motion, for all 𝑥 ∈ R^𝑑, 𝐸_𝑥(𝜎, 𝑏) has a strong solution.
Theorem (Gikhman–Skorokhod). If 𝐸 (𝜎, 𝑏) has weak uniqueness and a strong solution, then it
has pathwise uniqueness.
We will not prove these because we will establish these properties for the case of Lipschitz
coefficients.

8.2. The Lipschitz Case


Here we show that when 𝜎 and 𝑏 are Lipschitz in space uniformly in time, then all the existence
and uniqueness properties hold. (The hypothesis is the same as in the Picard–Lindelöf theorem for
ordinary differential equations, which gives existence and uniqueness there.)
Lemma 8.4 (Gronwall’s Lemma). Let 𝑇 > 0 be a constant and 𝑔 be a measurable function on [0, 𝑇].
If there exist 𝑎 ∈ R and a measurable, nonnegative function 𝑏 such that 𝑏 · 𝑔 is Lebesgue-integrable
on [0, 𝑇] such that ∫ 𝑡
∀𝑡 ∈ [0, 𝑇] 𝑔(𝑡) 6 𝑎 + 𝑏(𝑠)𝑔(𝑠) d𝑠,
0
then
𝑏(𝑠) d𝑠
∫𝑡
∀𝑡 ∈ [0, 𝑇] 𝑔(𝑡) 6 𝑎 · e 0 .
We will use only the case that 𝑏 is constant, in which case the upper bound is 𝑎 · e𝑏𝑡 .
Proof. Let 𝐺(𝑡) denote the right-hand side of the hypothesized inequality, so 𝑔 ⩽ 𝐺. Then 𝐺 is
an absolutely continuous function with 𝐺′ = 𝑏 · 𝑔 ⩽ 𝑏 · 𝐺 a.e. on [0, 𝑇]. It follows that, with
ℎ(𝑡) ≔ ∫_0^𝑡 𝑏(𝑠) d𝑠,

(e^{−ℎ}𝐺)′ = (𝐺′ − 𝑏𝐺)e^{−ℎ} ⩽ (𝑏𝐺 − 𝑏𝐺)e^{−ℎ} = 0  a.e.

Therefore, e^{−ℎ}𝐺 ⩽ e^{−ℎ(0)}𝐺(0) = 𝑎 on [0, 𝑇], whence 𝑔 ⩽ 𝐺 ⩽ 𝑎e^ℎ, as desired. ∎
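A small numeric illustration of the extremal case (assuming NumPy): when the hypothesis holds with equality and 𝑏 is constant, 𝑔 solves 𝑔′ = 𝑏𝑔, and an Euler discretization of it stays below the bound 𝑎 · e^{𝑏𝑡} because 1 + 𝑥 ⩽ e^𝑥:

```python
import numpy as np

# Euler discretization of g(t) = a + int_0^t b * g(s) ds, i.e. g' = b * g,
# the equality case of Gronwall's hypothesis, with constant b.
a, b, T, n = 2.0, 3.0, 1.0, 100_000
h = T / n
g = np.empty(n + 1)
g[0] = a
for k in range(n):
    g[k + 1] = g[k] * (1.0 + b * h)   # one Euler step
t = np.linspace(0.0, T, n + 1)
bound = a * np.exp(b * t)             # Gronwall bound a * e^{b t}
print(g[-1], bound[-1])               # close, with g <= bound throughout
```

So the exponential on the right-hand side of the lemma is sharp: no smaller growth rate can work.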

In the rest of this section, we assume 𝜎 : R+ × R^𝑑 → 𝑀_{𝑑×𝑚}(R) and 𝑏 : R+ × R^𝑑 → R^𝑑 are
continuous and that there exists 𝐾 such that for all 𝑡 ⩾ 0 and 𝑥, 𝑦 ∈ R^𝑑,

|𝜎(𝑡, 𝑥) − 𝜎(𝑡, 𝑦)| ⩽ 𝐾|𝑥 − 𝑦|  and  |𝑏(𝑡, 𝑥) − 𝑏(𝑡, 𝑦)| ⩽ 𝐾|𝑥 − 𝑦|.

Theorem 8.3. 𝐸 (𝜎, 𝑏) has pathwise uniqueness and for every filtered probability space and
associated Brownian motion, for all 𝑥 ∈ R𝑑 , 𝐸 𝑥 (𝜎, 𝑏) has a (unique) strong solution.

In particular, we have weak existence. Theorem 8.5 will imply weak uniqueness.
We will prove this using 𝑑 = 𝑚 = 1 to simplify the notation.
Lemma. Suppose that for all 𝑡 ⩾ 0,

𝑋̃_𝑡 = 𝑋̃_0 + ∫_0^𝑡 𝜎(𝑠, 𝑋_𝑠) d𝐵_𝑠 + ∫_0^𝑡 𝑏(𝑠, 𝑋_𝑠) d𝑠

and

𝑌̃_𝑡 = 𝑌̃_0 + ∫_0^𝑡 𝜎(𝑠, 𝑌_𝑠) d𝐵_𝑠 + ∫_0^𝑡 𝑏(𝑠, 𝑌_𝑠) d𝑠.

Then for all stopping times 𝜏, for all 𝑡 ⩾ 0,

E[ sup_{0⩽𝑠⩽𝑡} (𝑋̃_{𝑠∧𝜏} − 𝑌̃_{𝑠∧𝜏})² ] ⩽ 3 E[(𝑋̃_0 − 𝑌̃_0)²] + 3(4 + 𝑡)𝐾² ∫_0^𝑡 E[(𝑋_{𝑟∧𝜏} − 𝑌_{𝑟∧𝜏})²] d𝑟.

Proof. By the arithmetic mean-quadratic mean inequality, the left-hand side is

⩽ 3( E[(𝑋̃_0 − 𝑌̃_0)²]
+ E[ sup_{0⩽𝑠⩽𝑡} ( ∫_0^𝑠 (𝜎(𝑟, 𝑋_{𝑟∧𝜏}) − 𝜎(𝑟, 𝑌_{𝑟∧𝜏})) 1_{[𝑟⩽𝜏]} d𝐵_𝑟 )² ]
+ E[ sup_{0⩽𝑠⩽𝑡} ( ∫_0^𝑠 (𝑏(𝑟, 𝑋_{𝑟∧𝜏}) − 𝑏(𝑟, 𝑌_{𝑟∧𝜏})) 1_{[𝑟⩽𝜏]} d𝑟 )² ] ).

The second term can be bounded by Doob's 𝐿²-inequality: it is

⩽ 4 E[ ∫_0^𝑡 (𝜎(𝑟, 𝑋_{𝑟∧𝜏}) − 𝜎(𝑟, 𝑌_{𝑟∧𝜏}))² d𝑟 ] ⩽ 4𝐾² ∫_0^𝑡 E[(𝑋_{𝑟∧𝜏} − 𝑌_{𝑟∧𝜏})²] d𝑟.

The third term can be bounded by the arithmetic mean-quadratic mean inequality: it is

⩽ E[ 𝑡 · ∫_0^𝑡 (𝑏(𝑟, 𝑋_{𝑟∧𝜏}) − 𝑏(𝑟, 𝑌_{𝑟∧𝜏}))² d𝑟 ] ⩽ 𝑡𝐾² ∫_0^𝑡 E[(𝑋_{𝑟∧𝜏} − 𝑌_{𝑟∧𝜏})²] d𝑟.

Adding these gives the result. ∎



Proof of Theorem 8.3. We first show uniqueness. Fix a filtered probability space and a Brownian
motion. Suppose that 𝑋 and 𝑋′ are both solutions of 𝐸(𝜎, 𝑏) with 𝑋_0 = 𝑋′_0. Fix 𝑀 > 0 and define

𝜏 ≔ inf{𝑡 ⩾ 0 ; |𝑋_𝑡| ⩾ 𝑀 or |𝑋′_𝑡| ⩾ 𝑀}.

Then by the lemma, for all 𝑇 > 0 and 𝑡 ∈ [0, 𝑇],

E[ sup_{0⩽𝑠⩽𝑡} (𝑋_{𝑠∧𝜏} − 𝑋′_{𝑠∧𝜏})² ] ⩽ 3𝐾²(4 + 𝑇) ∫_0^𝑡 E[(𝑋_{𝑟∧𝜏} − 𝑋′_{𝑟∧𝜏})²] d𝑟.

Thus, Gronwall's lemma applies to

𝑔(𝑡) ≔ E[ sup_{0⩽𝑠⩽𝑡} (𝑋_{𝑠∧𝜏} − 𝑋′_{𝑠∧𝜏})² ],

yielding 𝑔 = 0. Now let 𝑀 → ∞ to get 𝑋 = 𝑋′ (i.e., indistinguishable).


To show existence, we use Picard's approximation method. Fix 𝑥 ∈ R and define recursively

𝑋^0_𝑡 ≔ 𝑥,
∀𝑛 ⩾ 0  𝑋^{𝑛+1}_𝑡 ≔ 𝑥 + ∫_0^𝑡 𝜎(𝑠, 𝑋^𝑛_𝑠) d𝐵_𝑠 + ∫_0^𝑡 𝑏(𝑠, 𝑋^𝑛_𝑠) d𝑠.

Note that by induction, 𝑋^𝑛 is adapted to ℱ^𝐵_• and continuous, so the stochastic integrals are defined.
Next, fix 𝑇 > 0. For 𝑛 ⩾ 1, set

𝑔_𝑛(𝑡) ≔ E[ sup_{0⩽𝑠⩽𝑡} (𝑋^𝑛_𝑠 − 𝑋^{𝑛−1}_𝑠)² ].

Because 𝜎(·, 𝑥) is continuous, it is bounded on [0, 𝑇], whence by Doob's 𝐿²-inequality,

E[ sup_{0⩽𝑠⩽𝑡} ( ∫_0^𝑠 𝜎(𝑟, 𝑥) d𝐵_𝑟 )² ]

is bounded on [0, 𝑇]. Therefore,

∃𝐶′_𝑇 ∀𝑡 ⩽ 𝑇  𝑔₁(𝑡) ⩽ 𝐶′_𝑇.

The lemma shows that

∃𝐶_𝑇 ∀𝑛 ⩾ 1 ∀𝑡 ∈ [0, 𝑇]  𝑔_{𝑛+1}(𝑡) ⩽ 𝐶_𝑇 ∫_0^𝑡 𝑔_𝑛(𝑠) d𝑠.

Induction shows that

∀𝑛 ⩾ 1  𝑔_𝑛(𝑡) ⩽ 𝐶′_𝑇 (𝐶_𝑇)^{𝑛−1} 𝑡^{𝑛−1}/(𝑛 − 1)!.

In particular, ∑_{𝑛=1}^∞ 𝑔_𝑛(𝑇)^{1/2} < ∞, whence the arithmetic mean-quadratic mean inequality yields

E[ ∑_{𝑛=1}^∞ sup_{0⩽𝑡⩽𝑇} |𝑋^{𝑛+1}_𝑡 − 𝑋^𝑛_𝑡| ] < ∞,

so the sum is finite almost surely. Therefore, (𝑋^𝑛) almost surely converges uniformly on [0, 𝑇]; let its
limit be 𝑋 on [0, 𝑇], which necessarily has continuous sample paths. As the almost sure limit of 𝑋^𝑛,
the process 𝑋 is also adapted to ℱ^𝐵_•.
Since 𝑋^𝑛 → 𝑋 in 𝐿²(Ω × [0, 𝑇]) and 𝑏 is Lipschitz, we have

∫_0^𝑡 𝑏(𝑠, 𝑋^𝑛_𝑠) d𝑠 → ∫_0^𝑡 𝑏(𝑠, 𝑋_𝑠) d𝑠  in probability.

Similarly,

E[ ( ∫_0^𝑡 𝜎(𝑠, 𝑋^𝑛_𝑠) d𝐵_𝑠 − ∫_0^𝑡 𝜎(𝑠, 𝑋_𝑠) d𝐵_𝑠 )² ] = E[ ∫_0^𝑡 (𝜎(𝑠, 𝑋^𝑛_𝑠) − 𝜎(𝑠, 𝑋_𝑠))² d𝑠 ] → 0,

so

∫_0^𝑡 𝜎(𝑠, 𝑋^𝑛_𝑠) d𝐵_𝑠 → ∫_0^𝑡 𝜎(𝑠, 𝑋_𝑠) d𝐵_𝑠  in probability.

Therefore, 𝑋 satisfies 𝐸_𝑥(𝜎, 𝑏) on [0, 𝑇]. Because 𝑇 is arbitrary, 𝑋^𝑛 has an almost sure limit 𝑋 on
R+, and 𝑋 is a strong solution to 𝐸_𝑥(𝜎, 𝑏). ∎
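The Picard scheme can also be watched converging numerically on one fixed (discretized) Brownian path. A sketch (assuming NumPy) for the Ornstein–Uhlenbeck equation, 𝜎 ≡ 1 and 𝑏(𝑥) = −𝜆𝑥; the successive sup-differences decay at the factorial rate predicted by the estimate for 𝑔_𝑛:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, T, n, x0 = 2.0, 1.0, 2000, 1.0
h = T / n
dB = np.sqrt(h) * rng.standard_normal(n)   # one fixed Brownian path

def picard_step(X):
    # Euler discretization of x0 + int_0^t dB_s + int_0^t (-lam * X_s) ds
    out = np.empty(n + 1)
    out[0] = x0
    out[1:] = x0 + np.cumsum(dB - lam * h * X[:-1])
    return out

X = np.full(n + 1, x0)                     # X^0 = x, the constant path
sup_diffs = []
for _ in range(20):
    X_new = picard_step(X)
    sup_diffs.append(np.max(np.abs(X_new - X)))
    X = X_new
print(sup_diffs[0], sup_diffs[-1])         # factorial-rate decay of the iterates
```

Because each iterate at time 𝑡_𝑖 depends only on the previous iterate before 𝑡_𝑖, the discrete map is eventually exact, mirroring the 𝑡^{𝑛−1}/(𝑛 − 1)! bound above.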

Exercise. Let 𝜎, 𝜎′, 𝑏, 𝑏′ all satisfy the Lipschitz conditions of this section. Suppose that on
some open set 𝑈 ⊆ R^𝑑, we have (𝜎, 𝑏) = (𝜎′, 𝑏′) on R+ × 𝑈. Fix 𝑥 ∈ 𝑈 and let 𝑋 and 𝑋′
be the corresponding solutions to 𝐸_𝑥(𝜎, 𝑏) and 𝐸_𝑥(𝜎′, 𝑏′). Let 𝑇 ≔ inf{𝑡 ⩾ 0 ; 𝑋_𝑡 ∉ 𝑈} and
𝑇′ ≔ inf{𝑡 ⩾ 0 ; 𝑋′_𝑡 ∉ 𝑈}. Show that 𝑇 = 𝑇′ a.s. and 𝑋^𝑇 is indistinguishable from (𝑋′)^𝑇.
Exercise (due 4/5). Exercise 8.10, Exercise 8.12.
Exercise. Suppose that 𝑋 is a solution of 𝐸_0(𝜎, 𝑏), where 𝜎, 𝑏 : R → R are Borel and such that
𝑏/𝜎² is continuous and locally integrable and 𝜎(𝑥) ≠ 0 for all 𝑥 ∈ R. Let 𝑇_𝑥 ≔ inf{𝑡 ⩾ 0 ; 𝑋_𝑡 = 𝑥}
and 𝑐 < 0 < 𝑑.
(1) Show that 𝑇𝑐 ∧ 𝑇𝑑 < ∞ a.s.
(2) Calculate P[𝑇𝑐 < 𝑇𝑑 ].
(3) Show that the answers do not change if 𝜎 is replaced by 𝑔 · 𝜎 and 𝑏 is replaced by 𝑔 2 · 𝑏,
where 𝑔 is a strictly positive, Borel function.
Exercise. Let 𝜎, 𝑏 : R → R be bounded, 𝜎 be continuous, 𝑏 be Borel, and inf 𝜎 > 0. Let 𝑋 solve
𝐸(𝜎, 0). Define 𝐿_𝑡 ≔ ∫_0^𝑡 𝑏(𝑋_𝑠)𝜎(𝑋_𝑠)^{−1} d𝐵_𝑠, 𝑋̃ ≔ 𝑋 − ⟨𝑋, 𝐿⟩, and 𝛽_𝑡 ≔ ∫_0^𝑡 𝜎(𝑋_𝑠)^{−1} d𝑋̃_𝑠.
(1) Show that d𝑋𝑡 = 𝜎(𝑋𝑡 ) d𝛽𝑡 + 𝑏(𝑋𝑡 ) d𝑡.
(2) Use Girsanov’s theorem to show that 𝐸 (𝜎, 𝑏) has a weak solution whose law is mutually
locally absolutely continuous with respect to the P-law of 𝑋.
We now show continuity of the solution to 𝐸_𝑥(𝜎, 𝑏) as a function of 𝑥. The space 𝐶(R+, R^𝑚) of
continuous functions from R+ to R^𝑚 has the topology of uniform convergence on compact subsets of
R+, whose Borel 𝜎-field 𝒞_𝑚 is the one generated by the coordinate maps. Note that 𝐶(R+, R^𝑚) is complete.
The law of 𝑚-dimensional Brownian motion started at 0 is Wiener measure 𝑊 on 𝐶(R+, R^𝑚). The
idea is to look at the solution of 𝐸_𝑥(𝜎, 𝑏) as a function 𝐹_𝑥 of the path 𝑤 of Brownian motion. Write
𝒞̄_𝑚 for the 𝑊-completion of 𝒞_𝑚.

Theorem 8.5. There exists a map R^𝑑 ∋ 𝑥 ↦ 𝐹_𝑥 : (𝐶(R+, R^𝑚), 𝒞̄_𝑚) → (𝐶(R+, R^𝑑), 𝒞_𝑑) such that

(i) for all 𝑡 ⩾ 0 and 𝑥 ∈ R^𝑑, there exists 𝜑^𝑥_𝑡 : (𝐶([0, 𝑡], R^𝑚), 𝒞_𝑚) → (R^𝑑, ℛ_𝑑) such that

𝐹_𝑥(𝜔)_𝑡 = 𝜑^𝑥_𝑡(𝜔↾[0, 𝑡])  𝑊-a.s.;

(ii) for all 𝜔 ∈ 𝐶(R+, R^𝑚), 𝑥 ↦ 𝐹_𝑥(𝜔) is continuous from R^𝑑 to 𝐶(R+, R^𝑑); and

(iii) for every complete filtered (Ω, ℱ, (ℱ_𝑡), P) and 𝑚-dimensional (ℱ_𝑡)-Brownian motion 𝐵 with
𝐵_0 = 0, for all R^𝑑-valued 𝑈 ∈ ℱ_0, the process

𝑡 ↦ 𝐹_𝑈(𝐵)_𝑡

is the pathwise unique solution to 𝐸(𝜎, 𝑏) with initial value 𝑈.


Remark. Note that (iii) implies weak uniqueness: each solution of 𝐸 𝑥 (𝜎, 𝑏) has the form 𝐹𝑥 (𝐵) for
some Brownian motion 𝐵, whence its law is (𝐹𝑥 )∗ (𝑊), the pushforward of 𝑊 under 𝐹𝑥 .
Proof. Again for simplicity of notation, we prove only the case 𝑚 = 𝑑 = 1. Let 𝒢_𝑡 be the 𝑊-completion of the 𝜎-field on 𝐶(R+, R) generated by the coordinate maps 𝑠 ↦ 𝑤(𝑠) for 𝑠 ∈ [0, 𝑡],
and 𝒢_∞ ≔ ⋁_{𝑡⩾0} 𝒢_𝑡 (which we denoted by 𝒞̄₁ above).
The topology on 𝐶(R+, R) can be defined by a metric of the form

𝜌(𝑤, 𝑤′) ≔ ∑_{𝑘=1}^∞ 𝛼_𝑘 ( sup_{𝑠∈[0,𝑘]} |𝑤(𝑠) − 𝑤′(𝑠)| ∧ 1 )

for any sequence 𝛼_𝑘 > 0 with ∑_{𝑘=1}^∞ 𝛼_𝑘 < ∞.
Let 𝑋^𝑥 be the solution to 𝐸_𝑥(𝜎, 𝑏) corresponding to (𝐶(R+, R), 𝒢_∞, (𝒢_𝑡)_𝑡, 𝑊) and the Brownian
motion 𝑡 ↦ 𝑤(𝑡): such a solution exists and is unique (up to indistinguishability) by Theorem 8.3.
Fix 𝑥, 𝑦 ∈ R. Let

𝑇_𝑛 ≔ inf{𝑡 ⩾ 0 ; |𝑋^𝑥_𝑡| ⩾ 𝑛 or |𝑋^𝑦_𝑡| ⩾ 𝑛}.

By the lemma, for all 𝑡 ⩾ 0,

E[ sup_{𝑠⩽𝑡} (𝑋^𝑥_{𝑠∧𝑇_𝑛} − 𝑋^𝑦_{𝑠∧𝑇_𝑛})² ] ⩽ 3(𝑥 − 𝑦)² + 3𝐾²(4 + 𝑡) ∫_0^𝑡 E[(𝑋^𝑥_{𝑠∧𝑇_𝑛} − 𝑋^𝑦_{𝑠∧𝑇_𝑛})²] d𝑠.

Note that (𝑋^𝑥_{𝑠∧𝑇_𝑛} − 𝑋^𝑦_{𝑠∧𝑇_𝑛})² ⩽ 4𝑛², so Gronwall's lemma implies that

∀𝑇 > 0  E[ sup_{𝑠⩽𝑇} (𝑋^𝑥_{𝑠∧𝑇_𝑛} − 𝑋^𝑦_{𝑠∧𝑇_𝑛})² ] ⩽ 3(𝑥 − 𝑦)² e^{3𝐾²(4+𝑇)𝑇}.

Choose 𝛼_𝑘 > 0 such that

∑_{𝑘=1}^∞ 𝛼_𝑘 e^{3𝐾²(4+𝑘)𝑘} ≕ 𝐶 < ∞  and  ∑_{𝑘=1}^∞ 𝛼_𝑘 = 1.

Then by the arithmetic mean-quadratic mean inequality,

E[𝜌(𝑋^𝑥, 𝑋^𝑦)²] ⩽ E[ ∑_{𝑘=1}^∞ 𝛼_𝑘 sup_{𝑠∈[0,𝑘]} (𝑋^𝑥_𝑠 − 𝑋^𝑦_𝑠)² ] ⩽ 3𝐶(𝑥 − 𝑦)².

By Kolmogorov's lemma (Theorem 2.9) applied to the process 𝑥 ↦ 𝑋^𝑥 with values in (𝐶(R+, R), 𝜌),
we get a modification 𝑋̃ of 𝑋 with continuous (in 𝑥 ∈ R) sample paths. (Note: for every 𝑥 ∈ R, 𝑋^𝑥 and 𝑋̃^𝑥
are indistinguishable.) Define

𝐹_𝑥(𝑤) ≔ 𝑋̃^𝑥(𝑤) = (𝑋̃^𝑥_𝑡(𝑤))_{𝑡⩾0}.

This makes (ii) satisfied.


Since 𝑋^𝑥_𝑡 ∈ 𝒢_𝑡, there exists

𝜑^𝑥_𝑡 : (𝐶([0, 𝑡], R), 𝜎(𝑤(𝑠), 𝑠 ∈ [0, 𝑡])) → (R, ℛ)

such that 𝑋^𝑥_𝑡(𝑤) = 𝜑^𝑥_𝑡(𝑤↾[0, 𝑡]) 𝑊-a.s. Because

𝐹_𝑥(𝑤)_𝑡 = 𝑋̃^𝑥_𝑡(𝑤) = 𝑋^𝑥_𝑡(𝑤)  𝑊-a.s.,

we obtain (i).
Because 𝑋^𝑥 solves 𝐸_𝑥(𝜎, 𝑏), we have, for all 𝑡 ⩾ 0,

𝑋^𝑥_𝑡(𝑤) = 𝑥 + ∫_0^𝑡 𝜎(𝑠, 𝑋^𝑥_𝑠(𝑤)) d𝑤(𝑠) + ∫_0^𝑡 𝑏(𝑠, 𝑋^𝑥_𝑠(𝑤)) d𝑠

for 𝑊-a.e. 𝑤. By definition of 𝐹_𝑥, we obtain

𝐹_𝑥(𝑤)_𝑡 = 𝑥 + ∫_0^𝑡 𝜎(𝑠, 𝐹_𝑥(𝑤)_𝑠) d𝑤(𝑠) + ∫_0^𝑡 𝑏(𝑠, 𝐹_𝑥(𝑤)_𝑠) d𝑠

for 𝑊-a.e. 𝑤. We want to substitute 𝑈 for 𝑥 and 𝐵 for 𝑤. The stochastic integral is not defined
pointwise, so we must be careful.
First, consider the map (𝑥, 𝜔) ↦ 𝐹_𝑥(𝐵(𝜔))_𝑡. For fixed 𝜔, this is continuous in 𝑥. Now

𝐹_𝑥(𝐵(𝜔))_𝑡 = 𝜑^𝑥_𝑡(𝐵(𝜔)↾[0, 𝑡])  P-a.s.

by (i) and the fact that 𝑊 = 𝐵_∗P. The right-hand side belongs to ℱ_𝑡, whence so does the left-hand
side by completeness. In other words, for fixed 𝑥, it is ℱ_𝑡-measurable in 𝜔. Therefore, the map is
the limit as 𝑛 → ∞ of the functions

(𝑥, 𝜔) ↦ ∑_{𝑘∈Z} 1_{[𝑘/𝑛, (𝑘+1)/𝑛)}(𝑥) 𝐹_{𝑘/𝑛}(𝐵(𝜔))_𝑡,

which shows that the map is measurable with respect to ℛ ⊗ ℱ_𝑡. Because 𝜔 ↦ (𝑈(𝜔), 𝜔) is
measurable from ℱ_𝑡 to ℛ ⊗ ℱ_𝑡, the composition 𝜔 ↦ 𝐹_{𝑈(𝜔)}(𝐵(𝜔))_𝑡 is ℱ_𝑡-measurable. Thus, the
process 𝐹_𝑈(𝐵) is adapted.
Write

𝐻(𝑥, 𝑤) ≔ ∫_0^𝑡 𝜎(𝑠, 𝐹_𝑥(𝑤)_𝑠) d𝑤(𝑠)

and

𝐻_𝑛(𝑥, 𝑤) ≔ ∑_{𝑖=0}^{𝑛−1} 𝜎(𝑖𝑡/𝑛, 𝐹_𝑥(𝑤)_{𝑖𝑡/𝑛}) ( 𝑤((𝑖+1)𝑡/𝑛) − 𝑤(𝑖𝑡/𝑛) ).
𝑖=0

By Proposition 5.9, for all 𝑥, 𝐻_𝑛(𝑥, 𝑤) → 𝐻(𝑥, 𝑤) in 𝑊-probability. Therefore,

∀𝑥  𝐻_𝑛(𝑥, 𝐵) → 𝐻(𝑥, 𝐵)  in probability.

Because 𝑈 ⫫ 𝐵, it follows (by conditioning on 𝑈) that

𝐻_𝑛(𝑈, 𝐵) → 𝐻(𝑈, 𝐵)  in probability.

By Proposition 5.9 again, 𝐻_𝑛(𝑈, 𝐵) → ∫_0^𝑡 𝜎(𝑠, 𝐹_𝑈(𝐵)_𝑠) d𝐵_𝑠 in probability, whence 𝐻(𝑈, 𝐵) is that stochastic
integral. Because

𝐻(𝑥, 𝑤) = 𝐹_𝑥(𝑤)_𝑡 − 𝑥 − ∫_0^𝑡 𝑏(𝑠, 𝐹_𝑥(𝑤)_𝑠) d𝑠,

it follows that 𝐹_𝑈(𝐵) solves 𝐸(𝜎, 𝑏). ∎

In the lemma, one may use powers 𝑝 > 1, not only 𝑝 = 2, and this allows us to show (from our
version of Kolmogorov's lemma) that

∀𝐴 > 0 ∀𝑇 > 0 ∀𝜀 ∈ (0, 1) ∀𝑝 > 0  E[ sup_{𝑡∈[0,𝑇], 𝑥≠𝑦, |𝑥|,|𝑦|⩽𝐴} ( |𝑋^𝑥_𝑡 − 𝑋^𝑦_𝑡| / |𝑥 − 𝑦|^{1−𝜀} )^𝑝 ] < ∞.

Exercise (due 4/12). Exercise 8.14 (the inequality 0 < |𝑍_𝑠| in (1) should be 0 ⩽ |𝑍_𝑠|; the conclusion
in (4) is that 𝑋 and 𝑋′ are indistinguishable).

8.3. Solutions of Stochastic Differential Equations as Markov Processes


We now suppose that 𝜎 and 𝑏 do not depend on time, but still are Lipschitz. Let 𝐹𝑥 be as
in Theorem 8.5.

Theorem 8.6. Let 𝑋 be a solution of 𝐸(𝜎, 𝑏) on a complete filtered probability space (Ω, ℱ, (ℱ_𝑡), P)
with (ℱ_𝑡)-Brownian motion 𝐵. Then 𝑋 is a Markov process with respect to (ℱ_𝑡) with semigroup

𝑄_𝑡 𝑓(𝑥) ≔ ∫ 𝑓(𝐹_𝑥(𝑤)_𝑡) 𝑊(d𝑤)   ( 𝑓 ∈ 𝐵(R^𝑑), 𝑡 ⩾ 0, 𝑥 ∈ R^𝑑).
Proof. We first show that

∀𝑓 ∈ 𝐵(R^𝑑) ∀𝑠, 𝑡 ⩾ 0  E[𝑓(𝑋_{𝑠+𝑡}) | ℱ_𝑠] = 𝑄_𝑡 𝑓(𝑋_𝑠).

Write

𝑋_{𝑠+𝑡} = 𝑋_𝑠 + ∫_𝑠^{𝑠+𝑡} 𝜎(𝑋_𝑟) d𝐵_𝑟 + ∫_𝑠^{𝑠+𝑡} 𝑏(𝑋_𝑟) d𝑟.

Recall that the stochastic integral is defined as (𝜎(𝑋)·𝐵)_{𝑠+𝑡} − (𝜎(𝑋)·𝐵)_𝑠. However, by Proposition 5.9,
we may also write it as

∫_0^𝑡 𝜎(𝑋′_𝑢) d𝐵′_𝑢,

where 𝑋′_𝑢 ≔ 𝑋_{𝑠+𝑢}, 𝐵′_𝑢 ≔ 𝐵_{𝑠+𝑢} − 𝐵_𝑠, and we use the complete filtration ℱ′_𝑢 ≔ ℱ_{𝑠+𝑢}: 𝑋′ is adapted
to ℱ′_• and 𝐵′ is an 𝑚-dimensional ℱ′_•-Brownian motion. Thus,

𝑋′_𝑡 = 𝑋_𝑠 + ∫_0^𝑡 𝜎(𝑋′_𝑢) d𝐵′_𝑢 + ∫_0^𝑡 𝑏(𝑋′_𝑢) d𝑢,

i.e., 𝑋′ solves 𝐸(𝜎, 𝑏) on (Ω, ℱ, ℱ′_•, P) with Brownian motion 𝐵′ and initial value 𝑋′_0 = 𝑋_𝑠 ∈ ℱ′_0.
By Theorem 8.5(iii), we have 𝑋′ = 𝐹_{𝑋_𝑠}(𝐵′). It follows that

E[𝑓(𝑋_{𝑠+𝑡}) | ℱ_𝑠] = E[𝑓(𝑋′_𝑡) | ℱ_𝑠] = E[𝑓(𝐹_{𝑋_𝑠}(𝐵′)_𝑡) | ℱ_𝑠] = ∫ 𝑓(𝐹_{𝑋_𝑠}(𝑤)_𝑡) 𝑊(d𝑤) = 𝑄_𝑡 𝑓(𝑋_𝑠)

[𝐵′ ∼ 𝑊 and 𝐵′ ⫫ ℱ_𝑠 ∋ 𝑋_𝑠; this is not a stochastic integral, so we may substitute 𝑋_𝑠 for 𝑥],
as desired.
It remains to show that (𝑄_𝑡) is a transition semigroup, so we check Definition 6.1 step by step:
(i) (𝑄_0 = Id) 𝐹_𝑥(𝑤)_0 = 𝑥.
(ii) (𝑄_𝑡𝑄_𝑠 = 𝑄_{𝑠+𝑡}) Let 𝑋^𝑥 solve 𝐸_𝑥(𝜎, 𝑏). Then

𝑄_{𝑠+𝑡} 𝑓(𝑥) = E[𝑓(𝑋^𝑥_{𝑠+𝑡})]   [since 𝑋^𝑥_{𝑠+𝑡} has the law of 𝐹_𝑥(𝑤)_{𝑠+𝑡}]
= E[𝑄_𝑡 𝑓(𝑋^𝑥_𝑠)]   [by the above]
= 𝑄_𝑠(𝑄_𝑡 𝑓)(𝑥)   [the first equality with 𝑓 replaced by 𝑄_𝑡 𝑓 and 𝑠 + 𝑡 by 𝑠].

(iii) ((𝑡, 𝑥) ↦ 𝑄_𝑡(𝑥, 𝐴) is measurable) By the topology of uniform convergence on compact
sets for 𝐶(R+, R^𝑑), Theorem 8.5 gives that for all 𝑓 ∈ 𝐶_b(R^𝑑),

(𝑡, 𝑥) ↦ ∫ 𝑓(𝐹_𝑥(𝑤)_𝑡) 𝑊(d𝑤)

is continuous (since (𝑡, 𝑥) ↦ 𝐹_𝑥(·)_𝑡 is continuous). Thus, for all 𝑓 ∈ 𝐶_b(R^𝑑), (𝑡, 𝑥) ↦ 𝑄_𝑡 𝑓(𝑥)
is continuous. By regularity, it follows that if 𝐴 is closed, (𝑡, 𝑥) ↦ 𝑄_𝑡(𝑥, 𝐴) is measurable. By the
𝜋-𝜆 theorem, the same holds for all 𝐴 ∈ ℬ_𝑑. ∎
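For the Ornstein–Uhlenbeck equation the kernels 𝑄_𝑡 are explicit Gaussians, so the semigroup identity of step (ii) can be verified directly; a small numeric check (assuming NumPy):

```python
import numpy as np

# OU transition: Q_t(x, .) = N(x e^{-lam t}, v(t)), v(t) = (1 - e^{-2 lam t})/(2 lam).
# Composing Q_s after Q_t: the mean factor multiplies and the variance becomes
# e^{-2 lam s} v(t) + v(s); Chapman-Kolmogorov says this equals Q_{s+t}.
lam, s, t = 1.3, 0.7, 0.4
v = lambda u: (1.0 - np.exp(-2.0 * lam * u)) / (2.0 * lam)
mean_factor = np.exp(-lam * s) * np.exp(-lam * t)
var_comp = np.exp(-2.0 * lam * s) * v(t) + v(s)
print(mean_factor - np.exp(-lam * (s + t)))  # ~ 0
print(var_comp - v(s + t))                   # ~ 0
```

This is just the algebraic identity behind (ii), spelled out for one concrete family of kernels.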

Exercise (due 4/19). Let 𝐸 be locally compact Polish and (𝑄 𝑡 )𝑡>0 be a transition semigroup on 𝐸
that satisfies
(i) ∀𝑡 > 0 ∀ 𝑓 ∈ 𝐶0 (𝐸) 𝑄 𝑡 𝑓 ∈ 𝐶0 (𝐸), and
(ii) ∀ 𝑓 ∈ 𝐶0 (𝐸) ∀𝑥 ∈ 𝐸 lim𝑡↓0 𝑄 𝑡 𝑓 (𝑥) = 𝑓 (𝑥).
This exercise will show that (𝑄 𝑡 ) is Feller.

(1) The proof of Proposition 6.8 shows that the range R of 𝑅𝜆 on 𝐶0 (𝐸) is the same for all 𝜆 > 0
and is contained in 𝐶0 (𝐸); also,

∀𝑥 ∈ 𝐸 ∀𝑓 ∈ 𝐶_0(𝐸)  lim_{𝜆→∞} 𝜆𝑅_𝜆 𝑓(𝑥) = 𝑓(𝑥).

Use the Hahn–Banach theorem to deduce that R is dense in 𝐶_0(𝐸).


(2) The proof of Lemma 6.6 showed (used) that

∀𝑠 ⩾ 0 ∀ℎ ∈ 𝐵(𝐸)  e^{−𝑠} 𝑄_𝑠𝑅_1ℎ = ∫_𝑠^∞ e^{−𝑡} 𝑄_𝑡ℎ d𝑡.

Deduce that
∀ℎ ∈ 𝐵(𝐸)  lim_{𝑠↓0} ‖𝑄_𝑠𝑅_1ℎ − 𝑅_1ℎ‖ = 0.
Infer that
∀𝑓 ∈ 𝐶_0(𝐸)  lim_{𝑠↓0} ‖𝑄_𝑠𝑓 − 𝑓‖ = 0,
whence (𝑄_𝑡)_𝑡 is Feller.
Let 𝐶c2 (R𝑑 ) denote the space of 𝑓 ∈ 𝐶 2 (R𝑑 ) with compact support.
Theorem 8.7. The semigroup (𝑄_𝑡)_𝑡 of Theorem 8.6 is Feller. Its generator 𝐿 satisfies
(i) 𝐶_c²(R^𝑑) ⊆ 𝐷(𝐿), and
(ii) for all 𝑓 ∈ 𝐶_c²(R^𝑑) and 𝑥 ∈ R^𝑑,

𝐿𝑓(𝑥) = ½ ∑_{𝑖,𝑗=1}^𝑑 (𝜎𝜎*)_{𝑖𝑗}(𝑥) 𝑓_{𝑖𝑗}(𝑥) + ∑_{𝑖=1}^𝑑 𝑏_𝑖(𝑥) 𝑓_𝑖(𝑥) = ½⟨𝜎𝜎*(𝑥), ∇²𝑓(𝑥)⟩ + 𝑏(𝑥) · ∇𝑓(𝑥),

where the angle brackets denote the natural inner product on matrices and 𝜎* is the transpose of 𝜎 ∈ 𝑀_{𝑑×𝑚}(R).

Proof. Let 𝑓 ∈ 𝐶0 (R𝑑 ). Theorem 8.5 (continuity of 𝑥 ↦→ 𝐹𝑥 (𝑤)), the definition of 𝑄 𝑡 , and the
bounded convergence theorem show that 𝑄 𝑡 𝑓 ∈ 𝐶 (R𝑑 ). Similarly, since 𝑡 ↦→ 𝐹𝑥 (𝑤)𝑡 is continuous,
we obtain
∀𝑥 lim 𝑄 𝑡 𝑓 (𝑥) = 𝑓 (𝑥).
𝑡↓0

Thus, by the exercise, it suffices to show that

𝑄 𝑡 𝑓 ∈ 𝐶0 (R𝑑 )

in order to conclude that (𝑄 𝑡 )𝑡 is Feller. We assume 𝜎 and 𝑏 are bounded and leave the general case
to another exercise.
Suppose
∀𝑖, 𝑗  |𝜎_{𝑖𝑗}| ⩽ 𝐶 and |𝑏_𝑖| ⩽ 𝐶.

For a solution 𝑋^𝑥 of 𝐸_𝑥(𝜎, 𝑏), we obtain, for all 𝑡 ⩾ 0,

E[|𝑋^𝑥_𝑡 − 𝑥|²] ⩽ (𝑚 + 1) ∑_{𝑖=1}^𝑑 ( ∑_{𝑗=1}^𝑚 E[( ∫_0^𝑡 𝜎_{𝑖𝑗}(𝑋^𝑥_𝑠) d𝐵^𝑗_𝑠 )²] + E[( ∫_0^𝑡 𝑏_𝑖(𝑋^𝑥_𝑠) d𝑠 )²] )
⩽ 𝑑(𝑚 + 1)𝐶²(𝑡 + 𝑡²).

Therefore, for all 𝐴 > 0, Chebyshev's inequality gives

|𝑄_𝑡 𝑓(𝑥)| = |E[𝑓(𝑋^𝑥_𝑡)]| ⩽ E[|𝑓(𝑋^𝑥_𝑡)| 1_{[|𝑋^𝑥_𝑡−𝑥|⩽𝐴]}] + ‖𝑓‖ · 𝑑(𝑚 + 1)𝐶²(𝑡 + 𝑡²)/𝐴².

Now let 𝑥 → ∞ and then 𝐴 → ∞ to get 𝑄_𝑡 𝑓 ∈ 𝐶_0(R^𝑑).


Finally, let 𝑓 ∈ 𝐶_c²(R^𝑑) and set 𝑔 to be the desired value of 𝐿𝑓. By Theorem 6.14, if we show
that

𝑀 ≔ 𝑓(𝑋^𝑥) − ∫_0^• 𝑔(𝑋^𝑥_𝑠) d𝑠

is a martingale, then it will follow that 𝑓 ∈ 𝐷(𝐿) and 𝑔 = 𝐿𝑓. Apply Itô's formula to 𝑓(𝑋^𝑥):

𝑓(𝑋^𝑥_𝑡) = 𝑓(𝑥) + stochastic integral
+ ∑_{𝑖=1}^𝑑 ∫_0^𝑡 𝑏_𝑖(𝑋^𝑥_𝑠) 𝑓_𝑖(𝑋^𝑥_𝑠) d𝑠 + ½ ∑_{𝑖,𝑗=1}^𝑑 ∫_0^𝑡 𝑓_{𝑖𝑗}(𝑋^𝑥_𝑠) d⟨𝑋^{𝑥,𝑖}, 𝑋^{𝑥,𝑗}⟩_𝑠
= 𝑓(𝑥) + (continuous local martingale) + ∫_0^𝑡 𝑔(𝑋^𝑥_𝑠) d𝑠.

Therefore, 𝑀 is a continuous local martingale. Because 𝑓 and 𝑔 are bounded, 𝑀 is a martingale
by Proposition 4.7(ii). ∎
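The generator formula can be sanity-checked on the Ornstein–Uhlenbeck example d𝑋 = d𝐵 − 𝜆𝑋 d𝑡, whose transition kernel is an explicit Gaussian: for 𝑓(𝑥) = 𝑥², the theorem gives 𝐿𝑓(𝑥) = 1 − 2𝜆𝑥², and (𝑄_𝑡𝑓(𝑥) − 𝑓(𝑥))/𝑡 should converge to this as 𝑡 ↓ 0 (a sketch assuming NumPy):

```python
import numpy as np

# OU: sigma = 1, b(x) = -lam x, so L f = (1/2) f'' - lam x f'; for f(x) = x^2,
# L f(x) = 1 - 2 lam x^2.  Exact transition: X_t ~ N(x e^{-lam t}, (1 - e^{-2 lam t})/(2 lam)).
lam, x = 1.0, 1.0

def Qtf(t):
    mean = x * np.exp(-lam * t)
    var = (1.0 - np.exp(-2.0 * lam * t)) / (2.0 * lam)
    return mean**2 + var                 # E[X_t^2] = Q_t f(x) for f(x) = x^2

Lf = 1.0 - 2.0 * lam * x**2              # = -1 here
for t in [1e-2, 1e-3, 1e-4]:
    print(t, (Qtf(t) - x**2) / t)        # -> Lf as t -> 0
```

Here 𝑓(𝑥) = 𝑥² is unbounded, so this is only a pointwise check of the formula, not of membership in 𝐷(𝐿).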

Exercise (due 4/19). Complete the proof of Theorem 8.7 in the general Lipschitz case as follows.
By what we have shown at the start of the proof of Theorem 8.7 and our version of Theorem 6.17,
𝑋^𝑥 has the strong Markov property.
(1) Show that there exists 𝐶 < ∞ such that

∀|𝑥| ⩾ 1  |𝜎(𝑥)| ⩽ 𝐶|𝑥| and |𝑏(𝑥)| ⩽ 𝐶|𝑥|.

For 𝐴 > 0, let
𝑇_𝐴 ≔ inf{𝑡 ⩾ 0 ; |𝑋_𝑡| = 𝐴}.
For 𝑘 ∈ N+, let 𝑆_𝑘 ≔ 𝑇_{𝐴·2^{𝑘−1}} ∧ 𝑇_{𝐴·2^{𝑘+1}}. Show that there exists 𝐶′ < ∞ such that
for all 𝑡 ⩾ 0, 𝐴 ⩾ 1, and |𝑥| = 𝐴·2^𝑘,

E_𝑥[|𝑋_{𝑡∧𝑆_𝑘} − 𝑥|²] ⩽ 𝐶′ · 𝐴² · 2^{2𝑘}(𝑡 + 𝑡²)  and  P_𝑥[𝑆_𝑘 ⩽ 𝑡] ⩽ 4𝐶′(𝑡 + 𝑡²).

(2) Show that there exists 𝑡_0 > 0 such that

∀𝑘 ∈ N+ ∀𝐴 ⩾ 1 ∀|𝑥| ⩾ 𝐴·2^𝑘  P_𝑥[𝑇_{𝐴·2^{𝑘−1}} > 𝑡_0] ⩾ ½.

Use the strong Markov property to deduce that for all 𝐴 ⩾ 1,

lim_{𝑘→∞} sup_{|𝑥|⩾𝐴·2^𝑘} P_𝑥[ #{ 𝑗 ∈ [1, 𝑘] ; 𝑇_{𝐴·2^𝑗} = ∞ or 𝑇_{𝐴·2^{𝑗−1}} − 𝑇_{𝐴·2^𝑗} > 𝑡_0 } < 𝑘/4 ] = 0.

(3) Show that

∀𝐴 ⩾ 1  lim_{𝑘→∞} sup_{|𝑥|⩾𝐴·2^𝑘} P_𝑥[𝑇_𝐴 < 𝑘𝑡_0/4] = 0.

Deduce that ∀𝑓 ∈ 𝐶_0(R^𝑑) ∀𝑡 ⩾ 0  𝑄_𝑡 𝑓 ∈ 𝐶_0(R^𝑑).
The words “diffusion process” are assigned different meanings by different authors, but one
common meaning is a process with continuous sample paths that is Markov and solves a stochastic
differential equation. The study of 𝐿 as a second-order differential operator can be done via
probability, as we will see (for Brownian motion) in Chapter 7.
Exercise (due 4/19). (1) What is the generator on 𝐶_c²(R) of the Ornstein–Uhlenbeck process? Of
geometric Brownian motion?
(2) Find Lipschitz 𝜎 and 𝑏 such that the solution of 𝐸(𝜎, 𝑏) has the generator

𝐿𝑓(𝑥₁, 𝑥₂) = 2𝑥₂ 𝑓₁(𝑥₁, 𝑥₂) + ln(1 + 𝑥₁² + 𝑥₂²) 𝑓₂(𝑥₁, 𝑥₂)
+ ½(1 + 𝑥₁²) 𝑓₁₁(𝑥₁, 𝑥₂) + 𝑥₁ 𝑓₁₂(𝑥₁, 𝑥₂) + ½ 𝑓₂₂(𝑥₁, 𝑥₂)

on 𝐶_c²(R²).
Exercise (due 4/19). Exercise 8.11: for (1), we consider only 𝜆 > 0; for (2), this means to show the
Laplace transform of 𝑄 𝑡 (𝑥, ·) is as on the bottom of page 178 of the book.
Exercise (due 4/19). Let 𝐵 be 1-dimensional Brownian motion. Define

𝑋^𝑥_𝑡 ≔ (𝑥^{1/3} + 𝐵_𝑡/3)³

for 𝑡 ⩾ 0. Let 𝜎(𝑥) ≔ 𝑥^{2/3} and 𝑏(𝑥) ≔ 𝑥^{1/3}/3.
(1) Show that 𝑋^𝑥 solves 𝐸_𝑥(𝜎, 𝑏).
(2) Let 𝑇 B inf{𝑡 > 0 ; 𝑋𝑡 = 0}. Show that (𝑋 𝑥 )𝑇 is also a strong solution to 𝐸 𝑥 (𝜎, 𝑏).
Exercise (due 4/26). Let 𝜎 and 𝑏 be Lipschitz. Let

𝐻 B {𝑦 ∈ R𝑑 ; 𝜎(𝑦) = 0 and 𝑏(𝑦) = 0}.

Show that if 𝑋 𝑥 solves 𝐸 𝑥 (𝜎, 𝑏), then 𝐻 is absorbing for 𝑋 𝑥 , i.e., if

𝑇 B inf{𝑡 > 0 ; 𝑋𝑡𝑥 ∈ 𝐻},

then 𝑋 𝑥 = (𝑋 𝑥 )𝑇 .
Extra credit: Show that if 𝑥 ∉ 𝐻, then 𝑇 = ∞ almost surely.

Exercise. Suppose that d𝑋_𝑡 = d𝐵_𝑡 + 𝑏(𝑋_𝑡) d𝑡 and 𝑋_0 = 0, where 𝑏(𝑥) ≔ 𝑐′(𝑥)/(2𝑐(𝑥)) for
some strictly positive function 𝑐 ∈ 𝐶²(R). Define 𝜏_0 ≔ 0 and recursively set 𝜏_{𝑘+1} ≔ inf{𝑡 >
𝜏_𝑘 ; |𝑋_𝑡 − 𝑋_{𝜏_𝑘}| = 1} for 𝑘 ⩾ 0. Show that the discrete-time process (𝑋_{𝜏_𝑘} ; 𝑘 ⩾ 0) is a
nearest-neighbor random walk on Z with transition probabilities 𝑝_{𝑛,𝑛+1} = 𝑟_𝑛/(𝑟_𝑛 + 𝑟_{𝑛+1}), where
𝑟_𝑛 ≔ ∫_{𝑛−1}^𝑛 d𝑥/𝑐(𝑥). (If we interpret 𝑐(𝑥) as the conductivity at 𝑥, then 𝑟_𝑛 is the resistance of the
edge between 𝑛 − 1 and 𝑛 in an electrical network on Z.)

Chapter 7

Brownian Motion and Partial Differential


Equations

We discuss the heat equation and especially Laplace’s equation and how they can be solved
using Brownian motion. This is a model for other partial differential equations and diffusions. We
then discuss some path properties of Brownian motion.

7.1. Brownian Motion and the Heat Equation


For the whole chapter, 𝐵 denotes 𝑑-dimensional Brownian motion with P_𝑥 the measure such
that P_𝑥[𝐵_0 = 𝑥] = 1, 𝑄_• is its transition semigroup, and 𝐿 is its generator. Recall that

𝐷(𝐿) ⊇ {𝜓 ∈ 𝐶²(R^𝑑) ; 𝜓, Δ𝜓 ∈ 𝐶_0(R^𝑑)}

and on that set,
𝐿𝜓 = ½Δ𝜓.
For all 𝜑 ∈ 𝐵(R^𝑑),

∀𝑡 > 0  𝑄_𝑡𝜑 = 𝑝_𝑡 ∗ 𝜑,

where 𝑝_𝑡 is the density of 𝒩(0, 𝑡𝐼), a 𝐶^∞ function. Thus, 𝑄_𝑡𝜑 ∈ 𝐶^∞. If 𝜑 ∈ 𝐶_0(R^𝑑), then all
derivatives of 𝑄_𝑡𝜑 lie in 𝐶_0(R^𝑑), as we may see by differentiating under the integral:

𝜕_𝑖𝑄_𝑡𝜑 = (𝜕_𝑖𝑝_𝑡) ∗ 𝜑.

Thus, for 𝜑 ∈ 𝐶_0(R^𝑑), we have 𝑄_𝑡𝜑 ∈ 𝐷(𝐿) and 𝐿(𝑄_𝑡𝜑) = ½Δ(𝑄_𝑡𝜑) for 𝑡 > 0.


Exercise. Exercise 7.28; add part (0): Let 𝑠 ↦ 𝜑_𝑠 be continuous from R+ to 𝐶_0(R^𝑑). Show that
∫_0^𝑡 𝑄_𝑠𝜑_𝑠 d𝑠 ∈ 𝐶_0(R^𝑑) ∩ 𝐶¹(R^𝑑) with gradient in 𝐶_0(R^𝑑) and continuous in 𝑡 ∈ R+, where (𝑄_𝑡)_𝑡 is
the transition semigroup of Brownian motion. Hint for (1): write ∫_0^𝑡 = ∫_0^{𝑡/2} + ∫_{𝑡/2}^𝑡 for the second
derivatives.
The following shows how Brownian motion solves the heat equation with initial value
𝜑 ∈ 𝐶0 (R𝑑 ). (Direct calculation shows more, but our proof extends to other equations involving the
generator of a Feller process.)

Theorem 7.1. Let 𝜑 ∈ 𝐶_0(R^𝑑). For 𝑡 ⩾ 0 and 𝑥 ∈ R^𝑑, set

𝑢_𝑡(𝑥) ≔ 𝑄_𝑡𝜑(𝑥) = E_𝑥[𝜑(𝐵_𝑡)].

Then (𝑡, 𝑥) ↦ 𝑢_𝑡(𝑥) on (0, ∞) × R^𝑑 satisfies

𝜕𝑢_𝑡/𝜕𝑡 = ½Δ𝑢_𝑡  and  lim_{𝑡↓0, 𝑦→𝑥} 𝑢_𝑡(𝑦) = 𝜑(𝑥).

Proof. Recall Proposition 6.11:

∀ 𝑓 ∈ 𝐷 (𝐿) ∀𝑡 > 0  𝑄𝑡 𝑓 = 𝑓 + ∫_0^𝑡 𝐿 (𝑄𝑠 𝑓 ) d𝑠.

We do not necessarily have 𝜑 ∈ 𝐷 (𝐿), but we do have 𝑄𝜀 𝜑 ∈ 𝐷 (𝐿) for 𝜀 > 0. Thus,

∀𝑡 > 𝜀 > 0  𝑢𝑡 = 𝑢𝜀 + ∫_0^{𝑡−𝜀} 𝐿 (𝑄𝑠 𝑢𝜀 ) d𝑠 = 𝑢𝜀 + ∫_𝜀^𝑡 𝐿𝑢𝑠 d𝑠.

Now, we can also write

𝐿𝑢𝑠 = 𝑄𝑠−𝜀 (𝐿𝑢𝜀 )

by Proposition 6.10 to see that 𝑠 ↦→ 𝐿𝑢𝑠 is continuous on [𝜀, ∞). Thus,

∀𝑡 > 𝜀 > 0  𝜕𝑢𝑡 /𝜕𝑡 = 𝐿𝑢𝑡 = ½ Δ𝑢𝑡 .

The initial condition follows from the Feller property 𝑄𝑡 𝜑 → 𝜑 as 𝑡 ↓ 0. J
Exercise. Let 𝑋 be a Markov process on a locally compact Polish space 𝐸 with Feller semigroup (𝑄𝑡 )𝑡 and generator 𝐿. Let 𝜑 ∈ 𝐶0 (𝐸). Show that if 𝑢𝑡 (𝑥) := 𝑄𝑡 𝜑(𝑥), then

𝜕𝑢𝑡 /𝜕𝑡 = 𝐿𝑢𝑡  and  lim_{𝑡↓0, 𝑦→𝑥} 𝑢𝑡 (𝑦) = 𝜑(𝑥)

for 𝑡 > 0 and 𝑥 ∈ 𝐸.

7.2. Brownian Motion and Harmonic Functions

Definition 7.2. A domain of R𝑑 is a non-empty, open, connected set. For a domain 𝐷 ⊆ R𝑑 , a function 𝑢 : 𝐷 → R is harmonic on 𝐷 if 𝑢 ∈ 𝐶 2 (𝐷) and Δ𝑢 = 0 on 𝐷.

Suppose 𝐷′ is a subdomain of 𝐷 whose closure is contained in 𝐷. Let

𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷′}.

By Itô's formula, if 𝑢 is harmonic on 𝐷 and 𝐵0 ∈ 𝐷′, then the stopped process 𝑢(𝐵)^𝑇 is a continuous local martingale with

𝑢(𝐵𝑡∧𝑇 ) = 𝑢(𝐵0 ) + ∫_0^{𝑡∧𝑇} ∇𝑢(𝐵𝑠 ) · d𝐵𝑠 ,

like a line integral. In fact, if 𝐷′ is bounded, then 𝑢 is bounded on 𝐷′, so 𝑢(𝐵)^𝑇 is a true martingale. Conversely, if 𝑢 ∈ 𝐶 2 (𝐷) and 𝑢(𝐵)^𝑇 is a continuous local martingale for every subdomain 𝐷′ whose closure is contained in 𝐷, then by Itô's formula, 𝑢 is harmonic in 𝐷. We will weaken the hypothesis "𝑢 ∈ 𝐶 2 (𝐷)" with Lemma 7.5.

Proposition 7.3. Let 𝑢 be harmonic on a domain 𝐷. For a bounded subdomain 𝐷′ of 𝐷 whose closure is contained in 𝐷, let

𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷′}.

Then

∀𝑥 ∈ 𝐷′  𝑢(𝑥) = E𝑥 [𝑢(𝐵𝑇 )].

Proof. Because 𝐷′ is bounded, 𝑢(𝐵)^𝑇 is a bounded P𝑥 -continuous local martingale, so it is a true martingale. Thus,

∀𝑡 ≥ 0  𝑢(𝑥) = E𝑥 [𝑢(𝐵𝑡∧𝑇 )].

Since 𝑇 < ∞ P𝑥 -a.s., we may let 𝑡 → ∞ and use the bounded convergence theorem to obtain the desired formula. J

Rotational symmetry of Brownian motion shows that if 𝐷′ is a ball centered at 𝑥, then 𝐵𝑇 has the uniform distribution on 𝜕𝐷′. Let 𝜎𝑥,𝑟 denote the uniform measure on the sphere of radius 𝑟 centered at 𝑥.

Proposition 7.4 (Mean-Value Property). If 𝑢 is harmonic on a neighborhood of the closed ball of radius 𝑟 centered at 𝑥, then

𝑢(𝑥) = ∫ 𝑢(𝑦) d𝜎𝑥,𝑟 (𝑦). J
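The mean-value property is easy to illustrate numerically. The following small check (ours, not from the text) averages the harmonic function u(x) = x_1^2 − x_2^2 over a circle by quadrature and compares the result with the value at the center.

```python
import numpy as np

def sphere_average(u, center, r, n=100000):
    # average of u over the circle of radius r about center (d = 2)
    theta = 2 * np.pi * np.arange(n) / n
    pts = center + r * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    return u(pts).mean()

u = lambda p: p[:, 0] ** 2 - p[:, 1] ** 2   # harmonic on R^2
center = np.array([1.5, -0.7])
avg = sphere_average(u, center, r=2.0)
print(avg, u(center[None])[0])              # both ≈ 1.76
```

Any other harmonic function, center, and radius would do equally well; the agreement here is to near machine precision because the quadrature is exact for trigonometric polynomials.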

Exercise (due 4/26). Let 𝑢 be harmonic on R𝑑 and 𝐵 be Brownian motion.

(1) Let 𝑡 > 0. Show that

∀𝑦 ∈ R𝑑  (𝑢(𝐵𝑠 ))_{0≤𝑠≤𝑡} is a true P𝑦 -martingale

if and only if

∀𝑦 ∈ R𝑑  ∫_{R𝑑} |𝑢(𝑥)| e^{−|𝑥−𝑦|²/2𝑡} d𝑥 < ∞.

(2) Find 𝑢 on R2 such that for all 𝑦 ∈ R2 , (𝑢(𝐵𝑠 ))_{0≤𝑠<1} is a true P𝑦 -martingale, but (𝑢(𝐵𝑠 ))_{0≤𝑠≤1} is not a true P𝑦 -martingale.
Exercise (due 4/26). (1) Show that every bounded harmonic function on R2 is constant by using Exercise 5.33(5).

(2) Show the same on R𝑑 for 𝑑 > 2. Hint: Let 𝑥 ≠ 𝑦 and let 𝐻 be the hyperplane

{𝑧 ∈ R𝑑 ; |𝑧 − 𝑥| = |𝑧 − 𝑦|}.

Let 𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∈ 𝐻}. Show that

E𝑥 [𝑢(𝐵𝑇 )] = E𝑦 [𝑢(𝐵𝑇 )].

(3) Let 𝑢 be a nonconstant harmonic function on R𝑑 (𝑑 ≥ 2). Show that

∀𝑝 > 1  sup_{𝑡>0} ∫_{R𝑑} |𝑢(𝑡𝑥)|^𝑝 e^{−|𝑥|²} d𝑥 = ∞.

(4) (Extra credit) Does (3) hold for 𝑝 = 1?

(5) (Extra credit) Show that every positive harmonic function on R𝑑 (𝑑 ≥ 1) is constant.
We say that a locally bounded, measurable function 𝑢 on 𝐷 satisfies the mean-value property
if the equation of Proposition 7.4 holds for all closed balls in 𝐷.
Lemma 7.5. If 𝑢 satisfies the mean-value property on a domain 𝐷, then 𝑢 is harmonic on 𝐷. Thus, if 𝑢 is locally bounded and measurable, and 𝑢(𝐵)^𝑇 is a continuous local martingale for every exit time 𝑇 of a closed ball centered at the starting point and contained in 𝐷, then 𝑢 is harmonic.

Proof. It suffices to show that for all 𝑟0 > 0, if 𝐷′ := {𝑥 ∈ 𝐷 ; |𝑥 − 𝐷 c | > 𝑟0 }, then 𝑢 is harmonic in 𝐷′.
We first show that 𝑢 ∈ 𝐶 ∞ (𝐷′). Choose any ℎ : R+ → R+ that is 𝐶 ∞ , has support in (0, 𝑟0 ), and is not identically zero. For 0 < 𝑟 < 𝑟0 , multiply both sides of

𝑢(𝑥) = ∫ 𝑢(𝑦) d𝜎𝑥,𝑟 (𝑦)  (𝑥 ∈ 𝐷′)

by 𝑟^{𝑑−1} ℎ(𝑟) and integrate from 𝑟 = 0 to 𝑟0 . We get, for some constant 𝑐 > 0, that

∀𝑥 ∈ 𝐷′  𝑐𝑢(𝑥) = ∫_{|𝑦|<𝑟0} 𝑢(𝑥 + 𝑦)ℎ(|𝑦|) d𝑦 = ∫_{R𝑑} 𝑢(𝑥 + 𝑦)ℎ(|𝑦|) d𝑦

if we set 𝑢 to be 0 outside 𝐷. We can rewrite this as a convolution:

𝑐𝑢(𝑥) = ∫_{R𝑑} 𝑢(𝑧)ℎ(|𝑧 − 𝑥|) d𝑧.

Since 𝑧 ↦→ ℎ(|𝑧|) ∈ 𝐶 ∞ (R𝑑 ), we get 𝑢 ∈ 𝐶 ∞ (𝐷′).
To show Δ𝑢 = 0 in 𝐷′, we may now apply Itô's formula to 𝑢(𝐵): Let

𝑇𝑥,𝑟 := inf{𝑡 ≥ 0 ; |𝑥 − 𝐵𝑡 | = 𝑟}.

Then

∀𝑥 ∈ 𝐷′ ∀𝑟 ∈ (0, 𝑟0 )  E𝑥 [𝑢(𝐵𝑡∧𝑇𝑥,𝑟 )] = 𝑢(𝑥) + E𝑥 [∫_0^{𝑡∧𝑇𝑥,𝑟} ½ Δ𝑢(𝐵𝑠 ) d𝑠].

Recall that E𝑥 [𝑇𝑥,𝑟 ] < ∞ (in fact, E𝑥 [𝑇𝑥,𝑟 ] = 𝑟 2 /𝑑). Thus, we may let 𝑡 → ∞ and apply Lebesgue's dominated convergence theorem to get

E𝑥 [𝑢(𝐵𝑇𝑥,𝑟 )] = 𝑢(𝑥) + E𝑥 [∫_0^{𝑇𝑥,𝑟} ½ Δ𝑢(𝐵𝑠 ) d𝑠].

The left-hand side equals 𝑢(𝑥), therefore the second term on the right-hand side equals 0. Now, let 𝑟 ↓ 0 to get Δ𝑢(𝑥) = 0. J
Definition 7.6. Let 𝐷 be a domain and 𝑔 ∈ 𝐶 (𝜕𝐷). We say that 𝑢 : 𝐷 → R solves the Dirichlet problem in 𝐷 with boundary condition 𝑔 if 𝑢 is harmonic in 𝐷 and

∀𝑦 ∈ 𝜕𝐷  lim_{𝐷∋𝑥→𝑦} 𝑢(𝑥) = 𝑔(𝑦).

Thus, if

ũ(𝑥) := 𝑢(𝑥) if 𝑥 ∈ 𝐷,  ũ(𝑥) := 𝑔(𝑥) if 𝑥 ∈ 𝜕𝐷,

then ũ is continuous on the closure of 𝐷. If 𝐷 is bounded, then 𝑢 is bounded.
Exercise (due 4/26). Let 𝐷 be a bounded domain, 𝑔 ∈ 𝐵(𝜕𝐷), and

𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷}.

Define

𝑣(𝑥) := E𝑥 [𝑔(𝐵𝑇 )]  for 𝑥 ∈ 𝐷.

Show that 𝑣 ∈ 𝐶 (𝐷).
Proposition 7.7. Keep the notation of the preceding exercise.

(i) If 𝑔 ∈ 𝐶 (𝜕𝐷) and 𝑢 solves the Dirichlet problem in 𝐷 with boundary condition 𝑔, then 𝑢 = 𝑣.

(ii) The function 𝑣 is harmonic in 𝐷 and

∀𝑥 ∈ 𝐷  lim_{𝑡↑𝑇} 𝑣(𝐵𝑡 ) = 𝑔(𝐵𝑇 )  P𝑥 -a.s.

Proof. (i) In Proposition 7.3, we saw this formula for subdomains 𝐷′ whose closures are contained in 𝐷. Take an increasing sequence of such subdomains 𝐷𝑛 with ⋃𝑛 𝐷𝑛 = 𝐷. Apply continuity of sample paths and of ũ.

(ii) The exercise shows 𝑣 ∈ 𝐶 (𝐷), and obviously 𝑣 is bounded. (Or: measurability follows from the start of Theorem 6.16 and measurability of 𝑔 and 𝐵𝑇 .) The mean-value property is a consequence of the strong Markov property: If |𝑥 − 𝐷 c | > 𝑟, then

𝑣(𝑥) = E𝑥 [E𝑥 [𝑔(𝐵𝑇 ) | ℱ𝑇𝑥,𝑟 ]] = E𝑥 [𝑣(𝐵𝑇𝑥,𝑟 )].

Here, we use the strong Markov property in the forms of both Theorem 6.17 and Theorem 2.20. Thus, 𝑣 is harmonic in 𝐷. The proof of the rest of (ii), which we won't use, is in an appendix. J

These results do not say when the Dirichlet problem has a solution. In fact, it need not have one:

Exercise (due 4/26). Exercise 7.24, Exercise 7.25.

However, convex domains have a solution for all 𝑔. In fact, it suffices that every point of 𝜕𝐷 satisfy the exterior cone condition: 𝑦 ∈ 𝜕𝐷 satisfies this condition if there exist a non-empty open cone C with apex 𝑦 and 𝑟 > 0 such that C ∩ {𝑧 ; |𝑧 − 𝑦| < 𝑟} ⊆ 𝐷 c . The idea is that if 𝑥 ∈ 𝐷 is close to 𝜕𝐷, then it is P𝑥 -likely that 𝐵𝑡 leaves 𝐷 close to 𝑥.
Lemma 7.9. Let 𝐷 be a domain that satisfies the exterior cone condition at some 𝑦 ∈ 𝜕𝐷. Let 𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷}. Then the P𝑥 -law of 𝑇 tends weakly to 𝛿0 as 𝐷 ∋ 𝑥 → 𝑦.

Proof. Let B𝑟 := {𝑧 ∈ R𝑑 ; |𝑧| < 𝑟}. Let C be an open circular cone whose apex is 0 and whose intersection with the unit sphere has normalized measure 𝛼 > 0 such that 𝑦 + (C ∩ B𝑟 ) ⊆ 𝐷 c for some 𝑟 > 0. Then lim_{𝑡→0} P0 [𝐵𝑡 ∈ C ∩ B𝑟 ] = 𝛼. Blumenthal's 0-1 law (Theorem 2.13) extends to higher-dimensional Brownian motion with the same proof, whence P0 [𝑇_{C∩B𝑟} = 0] = 1, where 𝑇𝐹 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∈ 𝐹}.

Let C′ ⊆ C be an open circular cone with apex 0, opening 𝛼/2, and the same axis of symmetry as C. Then P0 [𝑇_{C′∩B𝑟} = 0] = 1 as well. Given 𝜂 > 0, there exists 𝑎 > 0 such that P0 [𝑇_{C′𝑎∩B𝑟/2} ≤ 𝜂] > 1 − 𝜂, where C′𝑎 := {𝑧 ∈ C′ ; |𝑧| > 𝑎} (because C′𝑎 ↑ C′ as 𝑎 ↓ 0). Choose 𝜀 > 0 such that

|𝑧| < 𝜀 ⟹ C′𝑎 ∩ B𝑟/2 ⊆ 𝑧 + (C ∩ B𝑟 ).

Then for |𝑦 − 𝑥| < 𝜀,

P𝑥 [𝑇 ≤ 𝜂] ≥ P𝑥 [𝑇_{𝑦+C∩B𝑟} ≤ 𝜂] = P0 [𝑇_{𝑦−𝑥+C∩B𝑟} ≤ 𝜂] ≥ P0 [𝑇_{C′𝑎∩B𝑟/2} ≤ 𝜂] > 1 − 𝜂. J

Theorem 7.8. Let 𝐷 be a bounded domain in R𝑑 that satisfies the exterior cone condition at every
point of 𝜕𝐷. Then for all 𝑔 ∈ 𝐶 (𝜕𝐷), the Dirichlet problem in 𝐷 with boundary condition 𝑔 has a
solution.

Proof. Let 𝑣 be as in the exercise. By Proposition 7.7(ii), we need only show that

∀𝑦 ∈ 𝜕𝐷  lim_{𝐷∋𝑥→𝑦} 𝑣(𝑥) = 𝑔(𝑦).

In fact, we show this holds for each 𝑦 where the exterior cone condition holds and where 𝑔 is continuous, regardless of other points on 𝜕𝐷. The idea is that for 𝑥 close to 𝑦, it is P𝑥 -likely that 𝑇 is small (from Lemma 7.9) and thus that 𝐵𝑇 is close to 𝑦, whence that 𝑔(𝐵𝑇 ) is close to 𝑔(𝑦).

Let 𝜀 > 0. Choose 𝛿 > 0 such that |𝑔(𝑧) − 𝑔(𝑦)| < 𝜀/3 for |𝑧 − 𝑦| < 𝛿, 𝑧 ∈ 𝜕𝐷. Choose 𝜂 > 0 such that

2‖𝑔‖ P0 [sup_{𝑡≤𝜂} |𝐵𝑡 | > 𝛿/2] < 𝜀/3.

By Lemma 7.9, we may choose 𝛼 ∈ (0, 𝛿/2) such that

2‖𝑔‖ P𝑥 [𝑇 > 𝜂] < 𝜀/3  for |𝑥 − 𝑦| < 𝛼, 𝑥 ∈ 𝐷.

We obtain that for |𝑥 − 𝑦| < 𝛼, 𝑥 ∈ 𝐷,

|𝑣(𝑥) − 𝑔(𝑦)| ≤ E𝑥 [|𝑔(𝐵𝑇 ) − 𝑔(𝑦)| 1_{[𝑇≤𝜂]}] + E𝑥 [|𝑔(𝐵𝑇 ) − 𝑔(𝑦)| 1_{[𝑇>𝜂]}]
≤ E𝑥 [|𝑔(𝐵𝑇 ) − 𝑔(𝑦)| 1_{[𝑇≤𝜂]∩[sup_{𝑡≤𝜂} |𝐵𝑡−𝑥|≤𝛿/2]}] + 2‖𝑔‖ P𝑥 [sup_{𝑡≤𝜂} |𝐵𝑡 − 𝑥| > 𝛿/2] + 2‖𝑔‖ P𝑥 [𝑇 > 𝜂]
< 𝜀/3 + 𝜀/3 + 𝜀/3 = 𝜀. J
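Proposition 7.7(ii) together with this theorem suggests a simple (if crude) numerical method for the Dirichlet problem: simulate Brownian paths until they exit and average 𝑔 at the exit point. The sketch below is ours, not from the text; the step size and path count are arbitrary, and the Euler discretization introduces a small bias. We use the unit disk with 𝑔(𝑦) = 𝑦₁, whose harmonic extension is 𝑢(𝑥) = 𝑥₁, so the answer is known.

```python
import numpy as np

rng = np.random.default_rng(0)

def dirichlet_mc(x, g, n_paths=1000, dt=4e-4, max_steps=100000):
    # Euler simulation of Brownian paths from x until they leave the unit disk;
    # the estimate is the average of g at the (radially projected) exit points.
    pos = np.tile(np.asarray(x, float), (n_paths, 1))
    alive = np.ones(n_paths, bool)
    exit_pts = np.zeros_like(pos)
    for _ in range(max_steps):
        if not alive.any():
            break
        pos[alive] += rng.normal(scale=np.sqrt(dt), size=(alive.sum(), 2))
        r = np.linalg.norm(pos, axis=1)
        just_out = alive & (r >= 1.0)
        exit_pts[just_out] = pos[just_out] / r[just_out, None]
        alive &= r < 1.0
    return g(exit_pts).mean()

g = lambda y: y[:, 0]            # boundary data g(y) = y_1
est = dirichlet_mc((0.3, 0.4), g)
print(est)                       # ≈ 0.3, since u(x) = x_1 is harmonic
```

Exact sampling of the exit point (via the Poisson kernel of the next section) would remove the discretization bias entirely.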

For 𝑑 = 2, another sufficient condition for Lemma 7.9 is that 𝑦 belongs to a nonconstant curve contained in 𝜕𝐷. To see this, make Brownian motion behave like in the following figure.

[Figure: two points 𝑦1 and 𝑦2 on the boundary curve near 𝑦, with 𝑥 nearby in 𝐷.] Choose two points, 𝑦1 and 𝑦2 , on the curve near 𝑦 that are separated by 𝑦. A Brownian motion started at 𝑥 has a probability that is bounded below over all 𝑥 near 𝑦 that it will create a curve (up to some time) surrounding 𝑦 and hitting 𝜕𝐷 only near 𝑦. For example, consider the event that it stays within the union of the green rectangles, moving successively from one of the 6 rectangles to the next until it is guaranteed to cross itself.

7.3. Harmonic Functions in a Ball and the Poisson Kernel


The P𝑥 -law of 𝐵𝑇 ∈ 𝜕𝐷 is called the harmonic measure of 𝐷 relative to 𝑥. When 𝐷 is a ball, this has a very simple expression: It has a density 𝐾 (𝑥, ·) with respect to normalized surface measure, 𝜎. Note that the strong Markov property shows that 𝐾 (𝑥, 𝑦) is harmonic in 𝑥 for each 𝑦 ∈ 𝜕𝐷. Also, for 𝑧 ∈ 𝜕𝐷,

lim_{𝐷∋𝑥→𝑧} 𝐾 (𝑥, ·)𝜎 = 𝛿𝑧  weakly.

This suggests some properties to look for when finding 𝐾.

For the rest of this section, let 𝐷 = B1 , the open unit ball in R𝑑 , 𝑑 ≥ 2.

Definition 7.10. The Poisson kernel is the function 𝐾 : B1 × 𝜕B1 → R+ defined by

𝐾 (𝑥, 𝑦) := (1 − |𝑥| 2 )/|𝑦 − 𝑥| 𝑑 .
Lemma 7.11. For all 𝑦 ∈ 𝜕B1 , 𝐾 (·, 𝑦) is harmonic on B1 .

Proof. Clearly, 𝐾 (·, 𝑦) ∈ 𝐶 ∞ (B1 ). A direct calculation shows that

Δ𝐾 (·, 𝑦) = 0

off 𝜕B1 (see an appendix to this chapter). J


There is a beautiful geometric representation of 𝐾 (𝑥, ·)𝜎1 due to Malmheden in 1934, where 𝜎1 := 𝜎0,1 . Namely, given 𝑔 ∈ 𝐶 (𝜕B1 ), for each line 𝐿 through 𝑥 ∈ B1 , let 𝑓 (𝐿) denote the value at 𝑥 ∈ 𝐿 of the linear function on 𝐿 whose values at 𝐿 ∩ 𝜕B1 are 𝑔. Then the harmonic extension 𝑢 of 𝑔 satisfies

𝑢(𝑥) = ∫ 𝑓 (𝐿) d𝐿,

where d𝐿 denotes a uniform direction for lines 𝐿 that pass through 𝑥.



Equivalently, if 𝐴 ∈ 𝐵(𝜕B1 ) and 𝐴′ is its image on 𝜕B1 reflected in 𝑥, then the harmonic measure of 𝐴 equals 𝜎1 (𝐴′).

In R2 , this is due to Schwarz. It is easy to prove in another form. Recall Euclid's theorem that if 𝐴 is an arc, then 𝜎1 (𝐴) + 𝜎1 (𝐴′) = 2𝜃, where 𝜃 is the angle at 𝑥 of the chords giving 𝐴 and 𝐴′.

[Figure: two chords through 𝑥 meeting at angle 𝜃, cutting off the arcs 𝐴 and 𝐴′.]

Therefore, 𝜎1 (𝐴′) = 2𝜃 − 𝜎1 (𝐴). It is not hard to check that this is indeed the harmonic measure of 𝐴 from 𝑥 by checking boundary values and by representing 𝜃 using the imaginary part of a holomorphic function.
Lemma 7.13. We have

∀𝑥 ∈ B1  ∫_{𝜕B1} 𝐾 (𝑥, 𝑦) d𝜎1 (𝑦) = 1,

where 𝜎1 := 𝜎0,1 .

Proof. For 𝑥 ∈ B1 , write 𝐹 (𝑥) for the integral in the lemma. We claim that 𝐹 satisfies the mean-value property because 𝐾 (·, 𝑦) does. Clearly, 𝐹 is locally bounded and measurable. Now if 0 < 𝑟 < 1 − |𝑥|, then

∫ 𝐹 (𝑧) d𝜎𝑥,𝑟 (𝑧) = ∬ 𝐾 (𝑧, 𝑦) d𝜎1 (𝑦) d𝜎𝑥,𝑟 (𝑧) = ∬ 𝐾 (𝑧, 𝑦) d𝜎𝑥,𝑟 (𝑧) d𝜎1 (𝑦) = ∫ 𝐾 (𝑥, 𝑦) d𝜎1 (𝑦) = 𝐹 (𝑥),

as desired. Also, 𝐹 is rotationally symmetric because 𝐾 is diagonally invariant under rotations and 𝜎1 is invariant. Therefore,

1 = 𝐹 (0) = ∫ 𝐹 (𝑥) d𝜎0,𝑟 (𝑥)  for 0 < 𝑟 < 1

implies

𝐹 (𝑥) = 𝐹 (0)  for |𝑥| = 𝑟,

i.e., 𝐹 ≡ 1. J
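Lemma 7.13 is easy to verify numerically for 𝑑 = 2 by quadrature over the circle. This sketch (ours, not from the text) checks a few interior points, including one close to the boundary.

```python
import numpy as np

def poisson_kernel_2d(x, y):
    # K(x, y) = (1 - |x|^2) / |y - x|^d with d = 2
    return (1 - np.sum(x * x)) / np.sum((y - x) ** 2, axis=1)

n = 200000
theta = 2 * np.pi * np.arange(n) / n
y = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # grid on the unit circle

integrals = [poisson_kernel_2d(np.array(x), y).mean()  # mean = integral d(sigma_1)
             for x in [(0.0, 0.0), (0.5, -0.3), (0.0, 0.9)]]
print(integrals)   # each ≈ 1
```

The averaging against the uniform grid is exactly integration against normalized arc measure, so each value is 1 up to quadrature error.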
Theorem 7.14. If 𝑔 ∈ 𝐶 (𝜕B1 ), then the solution to the Dirichlet problem in B1 with boundary condition 𝑔 is

𝑢(𝑥) := ∫_{𝜕B1} 𝐾 (𝑥, 𝑦)𝑔(𝑦) d𝜎1 (𝑦)  (𝑥 ∈ B1 ).

Proof. A Fubini argument as in the proof of Lemma 7.13 shows that 𝑢 satisfies the mean-value property, so is harmonic. It is clear that the probability measures 𝐾 (𝑥, ·)𝜎1 ⇒ 𝛿𝑧 as B1 ∋ 𝑥 → 𝑧 ∈ 𝜕B1 . Therefore, 𝑢(𝑥) → 𝑔(𝑧) as B1 ∋ 𝑥 → 𝑧 ∈ 𝜕B1 . J

Corollary 7.15. The harmonic measure of B1 relative to 𝑥 ∈ B1 is 𝐾 (𝑥, ·)𝜎1 . J

Exercise. Exercise 7.26.

7.4. Transience and Recurrence of Brownian Motion

Theorem 7.17. The following hold.

(i) For 𝑑 = 2, Brownian motion is (neighborhood) recurrent, meaning that almost surely, for all open 𝑈 ⊆ R2 , {𝑡 ≥ 0 ; 𝐵𝑡 ∈ 𝑈} is unbounded.

(ii) For 𝑑 ≥ 3, Brownian motion is transient, meaning that almost surely, |𝐵𝑡 | → ∞.

Proof. We saw (ii) in Exercise 5.33(7). We also saw in Exercise 5.33(5) that for 𝑥 ≠ 0, P𝑥 [∀𝑡 ≥ 0  𝐵𝑡 ≠ 0] = 1, while from the same formula as there, for 𝑑 = 2,

∀𝜀 > 0  P𝑥 [∃𝑡  |𝐵𝑡 | < 𝜀] = 1.

Combining these two facts, we get that for every neighborhood 𝑈 of 0, {𝑡 ≥ 0 ; 𝐵𝑡 ∈ 𝑈} is unbounded P𝑥 -a.s. The same holds for every ball with rational center and rational radius simultaneously almost surely by a similar argument, whence (i) holds. J

7.5. Planar Brownian Motion and Holomorphic Functions

Let 𝑑 = 2, identify R2 with C, and write 𝐵𝑡 = 𝑋𝑡 + i𝑌𝑡 , where 𝑋 and 𝑌 are independent real Brownian motions. We call 𝐵 a complex Brownian motion. Let 𝐷 ⊆ C be a domain and Φ : 𝐷 → C be analytic. Since Re Φ and Im Φ are harmonic, Φ(𝐵)^𝑇 is a continuous local martingale, where 𝑇 is the exit time of 𝐷. Much more is true:

Theorem 7.18 (Lévy). Suppose C \ 𝐷 is polar. Write

𝐶𝑡 := ∫_0^𝑡 |Φ′(𝐵𝑠 )| 2 d𝑠  (𝑡 ≥ 0).

Then for each 𝑧 ∈ 𝐷, there exists a complex Brownian motion Γ started from Φ(𝑧) such that P𝑧 -a.s.,

∀𝑡 ≥ 0  Φ(𝐵𝑡 ) = Γ𝐶𝑡 .

That is, Φ(𝐵) is a time-changed complex Brownian motion with clock 𝐶; this is called the conformal invariance property. The case Φ(𝑧) = 𝑎𝑧 is the usual Brownian scaling for 𝑎 ∈ R and is rotation invariance for |𝑎| = 1. This shows why Theorem 7.18 is true on an infinitesimal level. If C \ 𝐷 is not polar, then a similar conclusion holds for the process Φ(𝐵)^𝑇 .

Proof. Let Φ = 𝑔 + iℎ, where 𝑔 and ℎ are real harmonic. Write 𝑀 := 𝑔(𝐵) and 𝑁 := ℎ(𝐵). By Itô's formula,

d𝑀 = 𝑔𝑥 (𝐵) d𝑋 + 𝑔𝑦 (𝐵) d𝑌 ,  d𝑁 = ℎ𝑥 (𝐵) d𝑋 + ℎ𝑦 (𝐵) d𝑌 ,

so 𝑀 and 𝑁 are continuous local martingales. The Cauchy–Riemann equations,

𝑔𝑥 = ℎ𝑦 ,  𝑔𝑦 = −ℎ𝑥 ,

imply that ⟨𝑀, 𝑁⟩ = 0 and ⟨𝑀, 𝑀⟩ = ⟨𝑁, 𝑁⟩ = 𝐶.

Recall Exercise 4.24, which shows that, up to null events,

[𝐶∞ < ∞] = [lim_{𝑡→∞} 𝑀𝑡 exists in R] = [lim_{𝑡→∞} 𝑁𝑡 exists in R].

By neighborhood recurrence of 𝐵, if Φ is not constant, then Φ(𝐵𝑡 ) does not have a finite limit as 𝑡 → ∞, whence 𝐶∞ = ∞ almost surely. Of course, if Φ is constant, then 𝐶 ≡ 0 and nothing more is needed. Thus, in the nonconstant case, we may apply Proposition 5.15 to 𝑀 − 𝑔(𝑧) and 𝑁 − ℎ(𝑧) under P𝑧 to obtain independent real Brownian motions 𝛽 and 𝛾 started from 0 such that

𝑀𝑡 − 𝑔(𝑧) = 𝛽𝐶𝑡  and  𝑁𝑡 − ℎ(𝑧) = 𝛾𝐶𝑡 .

Thus, the result holds with Γ := Φ(𝑧) + 𝛽 + i𝛾. J
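The computation at the heart of this proof is that the Cauchy–Riemann equations force the bracket densities to satisfy g_x^2 + g_y^2 = h_x^2 + h_y^2 = |Φ′|^2 and g_x h_x + g_y h_y = 0. A quick finite-difference check (ours, not from the text) for Φ(z) = z^3:

```python
import numpy as np

def partials(f, x, y, eps=1e-6):
    # central finite differences for f_x and f_y
    return ((f(x + eps, y) - f(x - eps, y)) / (2 * eps),
            (f(x, y + eps) - f(x, y - eps)) / (2 * eps))

Phi = lambda z: z ** 3                   # any nonconstant holomorphic map
g = lambda x, y: Phi(x + 1j * y).real    # g = Re Phi
h = lambda x, y: Phi(x + 1j * y).imag    # h = Im Phi

x0, y0 = 0.7, -1.2
gx, gy = partials(g, x0, y0)
hx, hy = partials(h, x0, y0)

cr1, cr2 = gx - hy, gy + hx              # Cauchy-Riemann: both ≈ 0
cross = gx * hx + gy * hy                # density of d<M,N>: ≈ 0
speed = gx ** 2 + gy ** 2                # density of d<M,M> = |Phi'|^2
print(cr1, cr2, cross, speed - abs(3 * (x0 + 1j * y0) ** 2) ** 2)
```

Any holomorphic Φ and base point would give the same agreement; the choices above are arbitrary.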

In this proof, one could also write in complex notation dΦ(𝐵) = Φ′(𝐵) d𝐵. More generally, if 𝑋 1 , . . . , 𝑋 𝑝 are continuous semimartingales taking values in C and 𝐹 : C 𝑝 → C is analytic in each variable, then Itô's formula takes exactly the same form as in Theorem 5.10, where the bracket is now complex valued and still bilinear, not sesquilinear. (Hartogs's theorem guarantees that such an 𝐹 has a multivariable power series expansion in a neighborhood of each point.) In applications of this, note that for complex Brownian motion, we have ⟨𝐵, 𝐵⟩ = 0. Bilinearity also guarantees that all parts of Proposition 4.15 hold for complex-valued, continuous local martingales. Complex-valued, continuous local martingales 𝑍 that satisfy ⟨𝑍, 𝑍⟩ = 0 are called conformal; they are time changes of complex Brownian motion, as we can see by the second half of the proof of Lévy's theorem.
Exercise. Determine all Φ such that in the preceding proof, 𝑀 and 𝑁 are independent.
Exercise. Exercise 7.27.
Exercise. Use the result of Exercise 7.27 to show that every nonconstant complex polynomial has a root. Hint: note that {𝑧 ; |𝑃(𝑧)| ≤ 𝜀} is compact if 𝑃 is a polynomial.
Exercise. Suppose that 𝐻 is complex-valued and progressive and that 𝐵 is complex Brownian motion. Show that if |𝐻| 2 is locally integrable with respect to Lebesgue measure and 𝑍 = 𝐻 · 𝐵, then there exists a complex Brownian motion Γ such that 𝑍𝑡 = Γ𝐶𝑡 for 𝑡 ≥ 0, where 𝐶𝑡 := ∫_0^𝑡 |𝐻𝑠 | 2 d𝑠 for 𝑡 ≥ 0. In particular, if |𝐻| = 1, then 𝑍 is a complex Brownian motion.


It is also interesting to look at Brownian motion in polar coordinates. This yields the skew-product representation (or decomposition):

Theorem 7.19. Fix 𝑧 ∈ C \ {0} and choose any 𝑤 ∈ C with 𝑧 = e𝑤 . There exists a complex Brownian motion 𝛽 starting at 𝑤 such that P𝑧 -a.s.,

∀𝑡 ≥ 0  𝐵𝑡 = exp{𝛽𝐻𝑡 },

where

𝐻𝑡 := ∫_0^𝑡 d𝑠/|𝐵𝑠 | 2 .

The point is that Re 𝛽 describes the radial motion of 𝐵 while Im 𝛽 describes the angular motion of 𝐵.

Proof. This is intuitive from Theorem 7.18 by using

Φ(𝜁) := log 𝜁 .

Note Φ′(𝜁) = 1/𝜁. However, this Φ is multiple valued and would require an extension to Riemann surfaces.

Instead, let us start with a complex Brownian motion Γ that starts from 𝑤 and use Φ(𝜁) := e^𝜁 . Then by Theorem 7.18, we have

e^{Γ𝑡} = 𝑍𝐶𝑡 ,

where 𝑍 is a complex Brownian motion from 𝑧 and

𝐶𝑡 = ∫_0^𝑡 |e^{Γ𝑠} | 2 d𝑠 = ∫_0^𝑡 e^{2 Re Γ𝑠} d𝑠.

Let 𝐾• be the inverse function of 𝐶• : by calculus,

𝐾𝑠 = ∫_0^𝑠 exp{−2 Re Γ𝐾𝑢 } d𝑢 = ∫_0^𝑠 d𝑢/|𝑍𝑢 | 2 .

Then 𝑍𝑠 = e^{Γ𝐾𝑠} . This is what we want, except it is for 𝑍 rather than for 𝐵. But the formula 𝐵𝑡 = exp{𝛽𝐻𝑡 } together with the formula for 𝐻 gives 𝛽 as a deterministic function of 𝐵. When applied to 𝑍, it gives Γ. Since 𝐵 and 𝑍 have the same law, it follows that 𝛽 and Γ do as well, as desired. J

Exercise. For which 𝑡 > 0 is E[𝐻𝑡^{1/2} ] < ∞? Hint: is log|𝐵| a true martingale?
Exercise. Exercise 7.29.
Exercise. Let 𝐵 := (𝐵𝑡 )𝑡>0 be Brownian motion in the complex plane. Suppose that 𝐵0 = 1.
(1) Let 𝑇1 be the first time that 𝐵 hits the imaginary axis, 𝑇2 be the first time after 𝑇1 that 𝐵 hits
the real axis, 𝑇3 be the first time after 𝑇2 that 𝐵 hits the imaginary axis, etc. Prove that for
each 𝑛 > 1, the probability that |𝐵𝑇𝑛 | 6 1 is 1/2.
(2) More generally, let ℓ𝑛 be lines through 0 for 𝑛 > 1 such that 1 ∉ ℓ1 . Let 𝑇1 := inf{𝑡 > 0 ; 𝐵𝑡 ∈
ℓ1 }, and recursively define 𝑇𝑛+1 := inf{𝑡 > 𝑇𝑛 ; 𝐵𝑡 ∈ ℓ𝑛+1 } for 𝑛 > 1. Prove that for each
𝑛 > 1, the probability that |𝐵𝑇𝑛 | 6 1 is 1/2.

(3) (Extra credit) In the context of part (2), let 𝛼𝑛 be the smaller of the two angles between ℓ𝑛 and ℓ𝑛+1 . Show that Σ_{𝑛=1}^∞ 𝛼𝑛 = ∞ iff for all 𝜀 > 0, the probability that 𝜀 ≤ |𝐵𝑇𝑛 | ≤ 1/𝜀 tends to 0 as 𝑛 → ∞.

(4) (Extra credit) In the context of part (1), show that

lim_{𝑛→∞} P[exp{−𝛿𝑛 √𝑛} ≤ |𝐵𝑇𝑛 | ≤ exp{𝛿𝑛 √𝑛}] = ∫_{−2𝛿/𝜋}^{2𝛿/𝜋} e^{−𝑢²/2}/√(2𝜋) d𝑢

if 𝛿𝑛 > 0 tend to 𝛿 ∈ [0, ∞].


Exercise. Let 𝐵 be a complex Brownian motion not starting from 0. Let 𝐴𝑡 be an argument of 𝐵𝑡 ,
so that 𝐵𝑡 = |𝐵𝑡 | ei𝐴𝑡 . Assume that 𝐴 has been chosen to be continuous. Show how to reconstruct 𝐵
from 𝐴; more formally, show that 𝐵 is adapted to ℱ•𝐴 .

7.6. Asymptotic Laws of Planar Brownian Motion


Let 𝐵 be complex Brownian motion. Our first result is not asymptotic. If 𝐵0 = 𝑎i, 𝑎 > 0, and 𝑇 := inf{𝑡 ≥ 0 ; Im 𝐵𝑡 = 0}, what is the distribution of 𝐵𝑇 ? By scaling, it is 𝑎 times the distribution when 𝑎 = 1: the P𝑎i -law of 𝐵𝑇 equals the Pi -law of 𝑎 · 𝐵𝑇 .

Proposition. If 𝐵0 = i and 𝑇 := inf{𝑡 ≥ 0 ; Im 𝐵𝑡 = 0}, then 𝐵𝑇 has the standard symmetric Cauchy distribution,

Pi [𝐵𝑇 ≤ 𝑥] = ∫_{−∞}^𝑥 d𝑦/(𝜋(1 + 𝑦²)) = 1/2 + (1/𝜋) arctan(𝑥).

Proof. The function

𝜑(𝑧) := i · (1 − 𝑧)/(1 + 𝑧)

maps the unit disk to the upper half plane with 𝜑(0) = i and 𝜑(e^{2𝜋i𝛼}) = tan(𝜋𝛼). Let

𝑆 := inf{𝑡 ≥ 0 ; |𝐵𝑡 | = 1}.

The Pi -law of 𝐵𝑇 equals the P0 -law of 𝐵𝑆 pushed forward by 𝜑, in view of Theorem 7.18. Because the P0 -law of 𝐵𝑆 is the uniform measure, it follows that

Pi [𝐵𝑇 ≤ 𝑥] = P0 [𝜑(𝐵𝑆 ) ≤ 𝑥] = P0 [arg 𝐵𝑆 ∈ (−𝜋, 2𝜋𝛼]],

where tan(𝜋𝛼) = 𝑥. This gives the result. J
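Since the proof reduces the law of 𝐵𝑇 to pushing the uniform law of 𝐵𝑆 on the circle through 𝜑, and 𝜑(e^{i𝜃}) = tan(𝜃/2), the proposition is easy to verify by sampling. A sketch of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.uniform(-np.pi, np.pi, size=500000)   # arg B_S is uniform under P_0
hit = np.tan(theta / 2)                           # phi(e^{i theta}) = tan(theta/2)

errs = [abs((hit <= x).mean() - (0.5 + np.arctan(x) / np.pi))
        for x in (-1.0, 0.0, 2.0)]
print(errs)   # all small: B_T is standard Cauchy
```

The empirical CDF agrees with 1/2 + arctan(x)/π up to Monte Carlo error.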

Exercise. (1) Let 𝐶1 ⫫ 𝐶2 be standard Cauchy and 𝑎1 , 𝑎2 > 0. Show that 𝑎1𝐶1 + 𝑎2𝐶2 has the law of 𝑎1 + 𝑎2 times standard Cauchy.

(2) Let 𝐵 be complex Brownian motion starting at 0. For 𝑠 ≥ 0, write 𝑇𝑠 := inf{𝑡 ≥ 0 ; Im 𝐵𝑡 = 𝑠} and 𝐶𝑠 := Re 𝐵𝑇𝑠 . Show that the process 𝐶 has independent, stationary increments and that 𝐶𝑠 has the law of 𝑠 times standard Cauchy.

Exercise. Let (𝐵1 , 𝐵2 , . . . , 𝐵𝑑+1 ) be Brownian motion in R𝑑+1 starting at (0, 0, . . . , 0, 1). Let 𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡^{𝑑+1} = 0}. Use Corollary 2.22 to show that (𝐵𝑇^1 , . . . , 𝐵𝑇^𝑑 ) has density

𝑥 ↦→ Γ((𝑑+1)/2) / [𝜋(1 + |𝑥| 2 )]^{(𝑑+1)/2}

on R𝑑 , where Γ(𝑎) := ∫_0^∞ 𝑠^{𝑎−1} e^{−𝑠} d𝑠 is the usual Gamma function. This is called the standard 𝑑-dimensional (multivariate) Cauchy distribution. Because of its connection to Brownian motion, it follows that when 𝑑′ < 𝑑, every 𝑑′-dimensional marginal of a standard 𝑑-dimensional Cauchy distribution is a standard 𝑑′-dimensional Cauchy distribution. Use the fact that the characteristic function of the standard 1-dimensional Cauchy distribution is 𝜉 ↦→ e^{−|𝜉|} (𝜉 ∈ R) to deduce that the characteristic function of the standard 𝑑-dimensional Cauchy distribution is 𝜉 ↦→ e^{−|𝜉|} (𝜉 ∈ R𝑑 ). Hint: For 𝜉1 , . . . , 𝜉𝑑 ∈ R, what is the law of Σ_{𝑗=1}^𝑑 𝜉𝑗 𝐵𝑇^𝑗 ?
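The representation in this exercise can be sampled exactly: the hitting time 𝑇 has the law of the first time 1-dimensional Brownian motion from 0 reaches level 1, which is 1/𝑍² by the reflection principle (Corollary 2.22), and given 𝑇 the first 𝑑 coordinates are i.i.d. 𝒩(0, 𝑇). A sketch of ours, checking a 1-dimensional marginal against known Cauchy quantiles:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 400000, 3
Z = rng.normal(size=n)
T = 1.0 / Z ** 2                     # hitting time of 0 from height 1: same law as 1/Z^2
X = np.sqrt(T)[:, None] * rng.normal(size=(n, d))   # samples of d-dim Cauchy

p = (np.abs(X[:, 0]) <= 1).mean()    # for standard 1-d Cauchy, P(|C| <= 1) = 1/2
q = np.quantile(X[:, 0], 0.75)       # third quartile of standard Cauchy is 1
print(p, q)
```

The sample size is an arbitrary choice; both statistics match the standard Cauchy values up to Monte Carlo error.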

Now we look at the winding of Brownian motion about 0 and its distance from 0 separately. Let 𝜃𝑡 be a continuous process such that

𝐵𝑡 /|𝐵𝑡 | = e^{i𝜃𝑡}  (𝐵𝑡 ≠ 0, 𝜃0 ∈ (−𝜋, 𝜋]).

If 𝐵𝑡 = e^{𝛽𝐻𝑡} as in Theorem 7.19, then 𝜃𝑡 = Im 𝛽𝐻𝑡 . Because 𝐻𝑡 → ∞ almost surely and Im 𝛽 is recurrent, it follows that 𝜃 is recurrent as well. Winding happens both when |𝐵| is small (when it is fast) and when |𝐵| is large (infinitely often—consider 1/𝐵). How large is 𝜃𝑡 typically?

Theorem 7.20 (Spitzer). For all 𝑧 ≠ 0,

the P𝑧 -law of 𝜃𝑡 /(½ log 𝑡) ⇒ standard symmetric Cauchy distribution

as 𝑡 → ∞.
Proof. (David Williams) Write 𝐵𝑡 = 𝑧 + 𝛽𝑡 , so that when 𝐵0 = 𝑧, we have 𝛽0 = 0. In particular, by Brownian scaling, the P𝑧 -law of 𝐵_{1/𝛿²} equals the law of 𝑧 + 𝛿^{−1} 𝛽1 when 𝛽0 = 0. Because the angle does not change when we multiply the location by 𝛿, it follows that the conclusion is equivalent to:

lim_{𝛿↓0} (P𝑧 -law of 𝜃_{1/𝛿²} / log(1/𝛿)) = lim_{𝛿↓0} (P𝛿𝑧 -law of 𝜃1 / log(1/𝛿)) is standard Cauchy.

Fix (small) 𝑎 > 0 such that P0 [|𝐵1 | > 𝑎] is close to 1. Let 𝑇 := inf{𝑡 ≥ 0 ; |𝐵𝑡 | ≥ 𝑎}. Let 𝛼 satisfy the property that for all 𝑧 with |𝑧| = 𝑎, P𝑧 [|𝜃1 | > 𝛼] is small; then also for all 𝛿 with 𝛿|𝑧| < 𝑎, P𝛿𝑧 [|𝜃1 − 𝜃𝑇 | > 𝛼] is small, so we need concern ourselves only with the winding between times 0 and 𝑇, rather than between 0 and 1, i.e., with 𝜃𝑇 .

Consider 𝐵𝑡 = e^{𝛽𝐻𝑡} ; the law of 𝜃𝑇 is the law of Im 𝛽𝐻𝑇 , where 𝐻𝑇 is the time that Re 𝛽 takes to go from log(𝛿|𝑧|) to log 𝑎. Thus, the law of 𝜃𝑇 is |log(𝛿|𝑧|) − log 𝑎| times standard Cauchy. Since we are dividing 𝜃1 by log(1/𝛿), this gives the result. J
For the radial part, we know min{|𝐵𝑠 | ; 0 ≤ 𝑠 ≤ 𝑡} → 0 as 𝑡 → ∞; how fast?


Proposition 7.22. For every 𝑧 ≠ 0, we have

∀𝑎 > 0  lim_{𝑡→∞} P𝑧 [min_{0≤𝑠≤𝑡} |𝐵𝑠 | ≤ 𝑡^{−𝑎/2}] = 1/(1 + 𝑎).

Proof. Choose 𝑏 > 0 so large that P0 [1/𝑏 < max_{0≤𝑠≤1} |𝐵𝑠 | < 𝑏] is close to 1. Write 𝑇𝑟 for the hitting time of the circle {|𝑧| = 𝑟}. Then for fixed 𝑧,

P𝑧 [√𝑡/𝑏 < max_{0≤𝑠≤𝑡} |𝐵𝑠 | < 𝑏√𝑡] = P𝑧 [𝑇_{√𝑡/𝑏} < 𝑡 < 𝑇_{𝑏√𝑡}]

is close to 1 uniformly in 𝑡 for 𝑡 sufficiently large. (Note that 𝑠 is the time variable, not 𝑡.) Now, min_{𝑠≤𝑡} |𝐵𝑠 | ≤ 𝑡^{−𝑎/2} if and only if 𝑇_{𝑡^{−𝑎/2}} ≤ 𝑡. For 𝑐 > 0, we have (if 𝑡^{−𝑎/2} ≤ |𝑧| ≤ 𝑐√𝑡)

P𝑧 [𝑇_{𝑡^{−𝑎/2}} < 𝑇_{𝑐√𝑡}] = (log(𝑐√𝑡) − log|𝑧|) / (log(𝑐√𝑡) − log 𝑡^{−𝑎/2})

by Exercise 5.33(5) [or use optional stopping, Section 3.4, on log|𝐵|]. As 𝑡 → ∞, this goes to 1/(1 + 𝑎). Use 𝑐 = 1/𝑏 and 𝑐 = 𝑏 to get the result:

P𝑧 [𝑇_{𝑡^{−𝑎/2}} ≤ 𝑡] ≥ P𝑧 [𝑇_{𝑡^{−𝑎/2}} < 𝑇_{√𝑡/𝑏}] − P𝑧 [𝑡 ≤ 𝑇_{√𝑡/𝑏}]

and

P𝑧 [𝑇_{𝑡^{−𝑎/2}} ≤ 𝑡] ≤ P𝑧 [𝑇_{𝑡^{−𝑎/2}} < 𝑇_{𝑏√𝑡}] + P𝑧 [𝑡 ≥ 𝑇_{𝑏√𝑡}]. J
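The ruin-probability formula from Exercise 5.33(5) used in this proof, P𝑧[𝑇𝑟₁ < 𝑇𝑟₂] = (log 𝑟₂ − log|𝑧|)/(log 𝑟₂ − log 𝑟₁) for 𝑟₁ < |𝑧| < 𝑟₂, can be checked by direct simulation. A sketch of ours (the Euler scheme has a small overshoot bias, so the tolerance is loose):

```python
import numpy as np

rng = np.random.default_rng(3)

def hit_inner_first(z0, r1, r2, n_paths=2000, dt=1e-4, max_steps=300000):
    # P_z[T_{r1} < T_{r2}] for planar Brownian motion, by direct simulation
    pos = np.tile(np.asarray(z0, float), (n_paths, 1))
    alive = np.ones(n_paths, bool)
    inner = np.zeros(n_paths, bool)
    for _ in range(max_steps):
        if not alive.any():
            break
        pos[alive] += rng.normal(scale=np.sqrt(dt), size=(alive.sum(), 2))
        r = np.linalg.norm(pos, axis=1)
        inner |= alive & (r <= r1)
        alive &= (r > r1) & (r < r2)
    return inner.mean()

r1, r2 = 0.5, 2.0
est = hit_inner_first((1.0, 0.0), r1, r2)
exact = (np.log(r2) - np.log(1.0)) / (np.log(r2) - np.log(r1))
print(est, exact)   # both ≈ 0.5
```

The radii are arbitrary; they are chosen here so that the exact answer is exactly 1/2.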
It is also interesting to know how quickly 𝐻𝑡 grows.

Lemma 7.21. For all 𝑧, the P𝑧 -law of 𝐻𝑡 /(log √𝑡)² converges to that of 1/𝑍², where 𝑍 is standard normal.

Le Gall uses this to prove the two preceding results; he also formulates it differently—in particular, see Corollary 2.22.

Proof idea. We have

𝐻𝑡 = inf{𝑠 ; ∫_0^𝑠 e^{2 Re 𝛽𝑢} d𝑢 ≥ 𝑡};

this is 𝐾• = 𝐶•^{−1} in the proof of Theorem 7.19. For large 𝑠,

log ∫_0^𝑠 e^{2 Re 𝛽𝑢} d𝑢 ≈ max_{0≤𝑢≤𝑠} 2 Re 𝛽𝑢 ,

so

𝐻𝑡 ≈ inf{𝑠 ; Re 𝛽𝑠 > log √𝑡}, which has the law of (log √𝑡)²/𝑍²

by Corollary 2.22. J
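The identity behind this proof idea, that the hitting time 𝑇𝑚 of level 𝑚 by real Brownian motion has the law 𝑚²/𝑍² (the reflection principle, Corollary 2.22), can be checked numerically. A sketch of ours, comparing simulated paths, samples of 𝑚²/𝑍², and the exact value 2 P[𝑍 ≥ 𝑚] of P[𝑇𝑚 ≤ 1] for 𝑚 = 1:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)
n_paths, n_steps, m = 5000, 1000, 1.0
dt = 1.0 / n_steps

# Discrete Brownian paths on [0, 1]: estimate P[T_m <= 1] via the running maximum.
paths = np.cumsum(rng.normal(scale=sqrt(dt), size=(n_paths, n_steps)), axis=1)
p_hit = (paths.max(axis=1) >= m).mean()

# T_m has the law of m^2 / Z^2, so P[T_m <= 1] = P[|Z| >= m] = 2 P[Z >= m]:
Z = rng.normal(size=n_paths)
p_law = (m ** 2 / Z ** 2 <= 1.0).mean()
exact = 2 * (1 - 0.5 * (1 + erf(m / sqrt(2))))
print(p_hit, p_law, exact)   # all ≈ 0.317 (p_hit slightly low: discrete paths miss the max)
```

The path count and step count are arbitrary; the discrete skeleton slightly underestimates the running maximum, hence the looser tolerance on `p_hit`.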

Appendix: The Poisson Kernel is Harmonic

We give the calculations for Lemma 7.11, which states that the Poisson kernel in R𝑑 ,

𝐾 (𝑥, 𝑦) := (1 − |𝑥| 2 )/|𝑦 − 𝑥| 𝑑 ,

is harmonic in 𝑥 for 𝑥 ≠ 𝑦 and |𝑦| = 1. I took this from https://math.stackexchange.com/q/569481.
Recall the following from calculus, where 𝑢 : R𝑑 → R, 𝜑 : R → R, and F : R𝑑 → R𝑑 :

∇𝜑(𝑢) = 𝜑′(𝑢)∇𝑢, (A1)
Δ𝑢 = div ∇𝑢, (A2)
div(𝑢 F) = ∇𝑢 · F + 𝑢 div F, (A3)
Δ(𝑢𝑣) = 𝑢 Δ𝑣 + 𝑣 Δ𝑢 + 2 ∇𝑢 · ∇𝑣. (A4)

For fixed 𝑦, we have 𝐾 (𝑥, 𝑦) = 𝑢(𝑥)𝑣(𝑥), where 𝑢(𝑥) := 1 − |𝑥| 2 and 𝑣(𝑥) := |𝑥 − 𝑦|^{−𝑑} . We calculate that

∇𝑢 = −2𝑥, and so Δ𝑢 = −2𝑑.

Using (A1), we get

∇𝑣 = −𝑑 |𝑥 − 𝑦|^{−𝑑−1} ∇|𝑥 − 𝑦| = −𝑑 |𝑥 − 𝑦|^{−𝑑−1} (𝑥 − 𝑦)/|𝑥 − 𝑦| = −𝑑 |𝑥 − 𝑦|^{−𝑑−2} (𝑥 − 𝑦).

Using (A2) and then (A3), we get

Δ𝑣 = −𝑑 div(|𝑥 − 𝑦|^{−𝑑−2} (𝑥 − 𝑦))
= −𝑑 (−𝑑 − 2)|𝑥 − 𝑦|^{−𝑑−3} ((𝑥 − 𝑦)/|𝑥 − 𝑦|) · (𝑥 − 𝑦) − 𝑑 |𝑥 − 𝑦|^{−𝑑−2} 𝑑
= 2𝑑 |𝑥 − 𝑦|^{−𝑑−2} .

Finally, combine the results using (A4) and the fact that |𝑦| = 1:

|𝑥 − 𝑦|^{𝑑+2} Δ(𝑢𝑣) = (1 − |𝑥| 2 ) · 2𝑑 − 2𝑑 |𝑥 − 𝑦| 2 + 4𝑑 𝑥 · (𝑥 − 𝑦) = 0.
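As an independent check of this computation, one can evaluate the Laplacian of 𝐾(·, 𝑦) by finite differences. A sketch of ours, here for 𝑑 = 3 at an arbitrary interior point:

```python
import numpy as np

def K(x, y, d=3):
    # Poisson kernel of the unit ball in R^d
    return (1 - np.dot(x, x)) / np.linalg.norm(y - x) ** d

def laplacian(f, x, h=1e-4):
    # second-order central-difference Laplacian
    x = np.asarray(x, float)
    total = 0.0
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        total += (f(x + e) - 2 * f(x) + f(x - e)) / h ** 2
    return total

y = np.array([0.0, 0.0, 1.0])              # a point of the unit sphere
x = np.array([0.2, -0.3, 0.4])             # an interior point
lap = laplacian(lambda z: K(z, y), x)
print(lap)                                 # ≈ 0: K(., y) is harmonic in x
```

The residual is of size O(h²) times the fourth derivatives of 𝐾, hence small at points away from 𝑦.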

Appendix: Convergence of Harmonic Functions to Boundary Values


The following standard result (see, e.g., Doob's 1984 book, Classical Potential Theory and Its Probabilistic Counterpart, Theorem 2.IX.13, p. 651) is not as obvious as it looks:

Theorem A.1. Let 𝐷 be a bounded domain in R𝑑 and 𝑔 be a bounded Borel function on 𝜕𝐷. Let 𝐵 be Brownian motion in R𝑑 and 𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷}. For 𝑥 ∈ 𝐷, define

𝑢(𝑥) := E𝑥 [𝑔(𝐵𝑇 )].

Then

∀𝑥 ∈ 𝐷  lim_{𝑡↑𝑇} 𝑢(𝐵𝑡 ) = 𝑔(𝐵𝑇 )  P𝑥 -a.s.

This is part of Proposition 7.7.


We give two proofs, the first being essentially the standard one; thanks to Michael Damron for
conversations on this. The result can easily be extended to unbounded domains by defining 𝑔(𝐵∞ )
to be a constant.
Our first proof uses the following standard result (a version of the "predictable stopping theorem"):

Theorem A.2. Let (ℱ𝑡 )𝑡 be a filtration. Let 𝑇 be a predictable stopping time, i.e., there are stopping times 𝑇𝑛 < 𝑇 that increase to 𝑇 (such 𝑇𝑛 announce 𝑇). Suppose that ℱ𝑇 = ⋁𝑛 ℱ𝑇𝑛 . Let 𝑀 be a uniformly integrable right-continuous martingale. Then 𝑀 is left-continuous at time 𝑇.

An example where the conclusion fails for a nonpredictable stopping time is given by continuous-time simple random walk on {0, −1, 1} started at 0 and stopped at the time 𝑇 of its first jump.

Proof. Because 𝑀 is right-continuous, the optional-stopping theorem gives 𝑀𝑇𝑛 = E[𝑀𝑇 | ℱ𝑇𝑛 ]. By the convergence of closed martingales in discrete time, we may deduce that lim_{𝑛→∞} 𝑀𝑇𝑛 = 𝑀𝑇 . For 𝜖 > 0, let 𝐴𝜖 be the event that lim sup_{𝑡↑𝑇} |𝑀𝑡 − 𝑀𝑇 | > 𝜖. For 𝑛 ≥ 1, define the stopping times

𝑆𝑛 := 𝑇𝑛+1 ∧ inf{𝑡 ≥ 𝑇𝑛 ; |𝑀𝑇𝑛 − 𝑀𝑡 | ≥ 𝜖}.

Since 𝑇𝑛 ≤ 𝑆𝑛 < 𝑇, we have ℱ𝑇 = ⋁𝑛 ℱ𝑆𝑛 . Since (𝑆𝑛 )𝑛 announce 𝑇, it follows that 𝑀𝑆𝑛 → 𝑀𝑇 a.s. Thus, 𝑀𝑇𝑛 − 𝑀𝑆𝑛 → 0 a.s., whence P( 𝐴𝜖 ) = 0. Because this holds for every 𝜖 > 0, we obtain the desired result. J

Proof of Theorem A.1. Let (ℱ𝑡 ) be the completed canonical filtration of 𝐵. Let 𝑀𝑡 := 𝑢(𝐵𝑡∧𝑇 ). Clearly 𝑀 is bounded and right-continuous. By the strong Markov property, 𝑀 is a martingale. The stopping times

𝑇𝑛 := inf{𝑡 ≥ 0 ; |𝐵𝑡 − 𝐷 c | ≤ 1/𝑛}

announce 𝑇. Since ℱ𝑇 = ⋁𝑛 ℱ𝑇𝑛 by continuity of 𝐵, we may apply Theorem A.2. J

Our second proof uses some auxiliary results.



Theorem A.3. Let 𝑑 ∈ N+ . Let 𝐵 be 𝑑-dimensional Brownian motion. Let 𝜎 ∈ 𝐶 2 (R𝑑 ) with bounded first and second derivatives. Suppose that 𝑋 solves 𝐸𝑥 (𝜎, 0), i.e., 𝑥 ∈ R𝑑 and 𝑋 is an adapted process with values in R𝑑 such that

𝑋𝑡 = 𝑥 + ∫_0^𝑡 𝜎(𝑋𝑠 ) d𝐵𝑠  (𝑡 ≥ 0).

If 𝜎(𝑥) ≠ 0, then

P[∀𝑡 ≥ 0  𝜎(𝑋𝑡 ) ≠ 0] = 1.

Proof. By Itô's formula,

d𝜎²(𝑋𝑡 ) = 2𝜎(𝑋𝑡 )𝜎′(𝑋𝑡 ) d𝑋𝑡 + (𝜎′(𝑋𝑡 )² + 𝜎(𝑋𝑡 )𝜎″(𝑋𝑡 )) d⟨𝑋𝑡 , 𝑋𝑡 ⟩
= 2𝜎²(𝑋𝑡 )𝜎′(𝑋𝑡 ) d𝐵𝑡 + (𝜎′(𝑋𝑡 )² + 𝜎(𝑋𝑡 )𝜎″(𝑋𝑡 )) 𝜎²(𝑋𝑡 )𝑑 d𝑡
= 𝜎²(𝑋𝑡 ) d𝑀𝑡

for a continuous semimartingale 𝑀 with 𝑀0 = 0. It follows (say, by the exercise on page 104 near the end of Section 8.1) that

𝜎²(𝑋) = 𝜎²(𝑥) ℰ(𝑀) = 𝜎²(𝑥) exp{𝑀 − ⟨𝑀, 𝑀⟩/2}

is never 0 if 𝜎(𝑥) ≠ 0. J
Theorem A.4. Let 𝐷 be a bounded domain in R𝑑 . Let 𝐵 be Brownian motion in R𝑑 and 𝑇 := inf{𝑡 ≥ 0 ; 𝐵𝑡 ∉ 𝐷}. Let 𝜎 ∈ 𝐶 2 (R𝑑 ) be such that 𝜎(𝑥) > 0 if 𝑥 ∈ 𝐷 and 𝜎(𝑥) = 0 if 𝑥 ∈ 𝜕𝐷. If 𝑥 ∈ 𝐷 and 𝑋 solves 𝐸𝑥 (𝜎, 0), then P[∀𝑡 ≥ 0  𝑋𝑡 ∈ 𝐷] = 1 and 𝑋 is a time change of (𝐵𝑡 )_{0≤𝑡<𝑇} (in law).

Proof. The first statement is immediate from Theorem A.3, while the second is proved just as the conformal invariance of Brownian motion (Theorem 7.18) is proved, but simpler (use Proposition 5.15). Note that 𝑋 is a continuous bounded martingale, so ∫_0^∞ 𝜎²(𝑋𝑡 ) d𝑡 = ⟨𝑋, 𝑋⟩∞ < ∞ a.s., whence 𝑋∞ ∈ 𝜕𝐷 a.s. J
Remark. A special case of Theorem A.4 is the following: Let 𝐷 be the unit disk when 𝑑 = 2 and 𝜎(𝑥) := (1 − |𝑥| 2 )/2 for |𝑥| ≤ 1. Then 𝑋 is Brownian motion in the Poincaré model of the hyperbolic plane. The law of 𝑋 is the same as the law of 𝜑(𝑋) for every Möbius transformation 𝜑 of the unit disk to itself. For Brownian motion 𝑋 in the Poincaré model of 𝑑-dimensional hyperbolic space, there is a drift term: 𝑋 solves 𝐸𝑥 (𝜎, 𝑏) with 𝜎(𝑥) := (1 − |𝑥| 2 )/2 and 𝑏(𝑥) := (𝑑/2 − 1)𝜎(𝑥)𝑥 for |𝑥| ≤ 1.
We are now ready to give a second proof of Theorem A.1.

Second proof of Theorem A.1. Let (ℱ𝑡 ) be the completed canonical filtration of 𝐵. Construct 𝜎 as in the statement of Theorem A.4 by, say, summing a countable collection of small bump functions. Fix 𝑥 ∈ 𝐷. Let 𝑋 solve 𝐸𝑥 (𝜎, 0). Then the path (𝑢(𝐵𝑡 ))_{0≤𝑡<𝑇} is the same in law as the path (𝑢(𝑋𝑡 ))_{0≤𝑡<∞} but with a different parametrization. Also, 𝑋 is a continuous Markov process. Let 𝑋∞ := lim_{𝑡→∞} 𝑋𝑡 . By the strong Markov property, 𝑢(𝑋𝑡 ) = E[𝑔(𝑋∞ ) | ℱ𝑡 ], whence the result follows from the convergence of closed martingales (in continuous time). J


A generalization of Theorem A.3 is in the extra credit exercise on page 115 at the end of
Chapter 8. Our proof of Theorem A.3 is modelled on one we heard from Étienne Pardoux for 𝑑 = 1.
