Digitized by the Internet Archive
in 2024

https://archive.org/details/150mostfrequentlO000unse
POCKET BOOK GUIDES FOR

QUANT INTERVIEWS

FE PRESS

New York
Pocket Book Guides for Quant Interviews

1. 150 Most Frequently Asked Questions on Quant Interviews, Second Edition, by Dan Stefanica, Radoš Radoičić, and Tai-Ho Wang. FE Press, 2019

2. Challenging Brainteasers for Quant Interviews, by Radoš Radoičić, Ivan Matić, and Dan Stefanica. FE Press, 2020

3. Stochastic Calculus and Probability Quant Interview Questions, by Ivan Matić, Radoš Radoičić, and Dan Stefanica. FE Press, 2020

Other Titles from FE Press

1. A Primer for the Mathematics of Financial Engineering, Second Edition, by Dan Stefanica. FE Press, 2011

2. Solutions Manual - A Primer for the Mathematics of Financial Engineering, Second Edition, by Dan Stefanica. FE Press, 2011

3. Numerical Linear Algebra Primer for Financial Engineering, by Dan Stefanica. FE Press, 2014

4. Solutions Manual - Numerical Linear Algebra Primer for Financial Engineering, by Dan Stefanica. FE Press, 2014

5. Elements of Stochastic Processes: A Computational Approach, by C. Douglas Howard. FE Press, 2017

6. A Probability Primer for Mathematical Finance, by Elena Kosygina.
150 MOST FREQUENTLY ASKED
QUESTIONS
ON QUANT INTERVIEWS
Second Edition

DAN STEFANICA
RADOS RADOICIC
TAI-HO WANG

Baruch College
City University of New York

FE Press

New York
FE PRESS
New York

www.fepress.org

© Dan Stefanica, Radoš Radoičić, Tai-Ho Wang 2019

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

This edition first published 2019

Printed in the United States of America

ISBN-13 978-0-9797576-9-3
ISBN-10 0-9797576-9-X
To our beloved families
Contents

Preface to the Second Edition

Preface to the First Edition

Acknowledgments

1 First Look: 15 Questions

2 Questions
2.1 Mathematics, calculus, differential equations.
2.2 Covariance and correlation matrices. Linear algebra.
2.3 Financial instruments: options, bonds, swaps, forwards, futures.
2.4 C++. Data structures.
2.5 Monte Carlo simulations. Numerical methods.
2.6 Probability. Stochastic calculus.
2.7 Brainteasers.

3 Solutions
3.1 Mathematics, calculus, differential equations.
3.2 Covariance and correlation matrices. Linear algebra.
3.3 Financial instruments: options, bonds, swaps, forwards, futures.
3.4 C++. Data structures.
3.5 Monte Carlo simulations. Numerical methods.
3.6 Probability. Stochastic calculus.
3.7 Brainteasers.

Bibliography
Preface to the First Edition

The use of quantitative methods and programming skills in all areas of finance, from trading to risk management, has grown tremendously in recent years, and accelerated through the financial crisis and with the advent of the big data era.

A core body of knowledge is required for successfully interviewing for a quant type position. The challenge lies in the fact that this knowledge encompasses finance, programming (in particular C++ programming), and several areas of mathematics (probability and stochastic calculus, numerical methods, linear algebra, and advanced calculus). Moreover, brainteasers are often asked to probe the ingenuity of candidates.

This book contains over 150 questions covering this core body of knowledge, without which it is not possible to advance to a final interview round. These questions are not only frequently, but also currently, asked on interviews for quantitative positions, and cover a vast spectrum, from C++ and data structures, to finance, brainteasers, and stochastic calculus.

The answers to all of these questions are included in the book. These answers are written in the same very practical vein that was used to select the questions: they are complete, but straight to the point, as they would be given in an interview.

Topics:
• Mathematics, calculus, differential equations.
• Covariance and correlation matrices. Linear algebra.
• Financial instruments: options, bonds, swaps, forwards, futures.
• C++, algorithms, data structures.
• Monte Carlo simulations. Numerical methods.
• Probability. Stochastic calculus.
• Brainteasers.

The authors are faculty members of the Baruch College Financial Engineering Masters Program, and have over 20 years of experience educating students who were very successful interviewing for quantitative positions. As such, the authors had the privilege to interact with generations of exceptional students, whose contributions as alumni to the continued success of our students have been tremendous. This book is a tribute to our special Baruch MFE community.

This is the first book in the Pocket Book Guides for Quant Interviews Series, to be followed by books on advanced probability and stochastic calculus questions and on challenging brainteasers asked in quant interviews.

New York, 2013


Acknowledgments

As professors in the Baruch MFE program, it has been a privilege to have the opportunity to contribute to the early career success of so many talented students and alumni. The strong community that developed around the Baruch MFE program is the reward educators truly dream of, and, for us, a reality that is an inspiration. This book would not have been possible without our Baruch MFE community; we are grateful to everyone who is part of it.

Working alongside and learning from our colleagues, both academics and finance professionals, created the perfect context for writing this book; we are thankful to all of them.

Alejandro Canete and David Zhang are owed a special thank you for their wonderful help on the programming questions, as is Max Rumyantsev who provided the art for the book cover of the first edition. Several students spearheaded the proofreading effort, and their help was greatly appreciated: Zhaofeng Brent Liao, Sheng Rick Cao, Zilong Cheng, Xiangtian Forest Deng, Xiang Lu, Wei Mao, Bo Pang, Zhou Robert Qi for the second edition, and Yu Gan, Jun Hua, Alireza Kashef, Yi Bill Lu, Svetlana Rafailova, Fubo Shi, Yujia Helen Sun, Jun Charlotte Wang, Peng Wu, Yanzhu Wendy Wu, Yongyi Ivan Ye, He Hillary Zhao, Wenyi Zhou for the first edition.

It is the tremendous support and the understanding of our families that made this book possible, and we are forever in debt to them. Thank you, thank you, and thank you!

This book is dedicated to our wives and to our children, from all our hearts.

Dan Stefanica
Radoš Radoičić
Tai-Ho Wang

New York, 2019


Chapter 1

First Look: Fifteen Questions.

1. Nine-month call options with strikes 20 and 25 on a non-dividend-paying underlying asset with spot price $22 are trading for $5.50 and $1, respectively. Can you find an arbitrage?

2. (i) What is the sum of the eigenvalues of the correlation matrix of n random variables?

(ii) Find a lower bound for the sum of the eigenvalues of the inverse of a nonsingular correlation matrix of n random variables.

3. Let \(W_t\) be a Wiener process, and let

\[ X_t = \int_0^t W_\tau \, d\tau. \]

What is the distribution of \(X_t\)? Is \(X_t\) a martingale?

4. An 8 × 8 matrix contains zeros and ones. You may repeatedly choose any 3 × 3 or 4 × 4 block and flip all bits in the block (that is, convert zeros to ones, and ones to zeros). Can you always modify the original matrix into an all-zero matrix using these block flips?

5. Find all the values of \(\rho\) such that

\[ \begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix} \]

is a correlation matrix.

6. How many independent random variables uniformly distributed on [0,1] should you generate to ensure that there is at least one between 0.70 and 0.72 with probability 95%?

7. Suppose you have in your possession an incredibly large bag of M&M's containing a uniform distribution of the six M&M colors. (M&M's come in blue, orange, green, yellow, red, brown.) You decide to play a game: you draw one M&M from the bag and place it on the table. You then continue to draw M&M's from the bag one at a time. If you draw an M&M that is the same color as one already on the table, you eat both of them. Otherwise, you place the M&M on the table along with the others of different color. The game ends when you have six M&M's (all of different colors) on the table. How many M&M's should you expect to eat playing this game?

8. Assume the Earth is perfectly spherical and you are standing somewhere on its surface. You travel exactly 1 mile south, then 1 mile east, then 1 mile north. Surprisingly, you find yourself back at the starting point. If you are not at the North Pole, where can you possibly be?

9. Solve the Ornstein-Uhlenbeck SDE

\[ dr_t = \lambda(\theta - r_t)\,dt + \sigma\,dW_t, \]

with \(\lambda > 0\), which is used, e.g., in the Vasicek model for interest rates.

10. Find the value of

\[ \int_0^a \frac{f(x)}{f(x) + f(a - x)} \, dx, \]

assuming that \(f(x)\) is a function such that the integral above exists.

11. Let X and Y be standard normal variables with joint normal distribution with correlation \(\rho\). Find the expectation

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)], \]

where \(\operatorname{sgn}(\cdot)\) is the sign function given by \(\operatorname{sgn}(x) = 1\), if \(x > 0\), \(\operatorname{sgn}(x) = -1\), if \(x < 0\), and \(\operatorname{sgn}(0) = 0\).

12. How do you create a long Gamma, short vega options trading strategy?

13. Let \(X_t\) and \(Y_t\) be geometric Brownian motions driven by

\[ \frac{dX_t}{X_t} = \mu_X\,dt + \sigma_X\,dW_t; \]
\[ \frac{dY_t}{Y_t} = \mu_Y\,dt + \sigma_Y\,dB_t, \]

where \(W_t\) and \(B_t\) are correlated Brownian motions with constant correlation \(\rho\). Show that

\[ Z_t = \frac{X_t}{Y_t} \]

is also a geometric Brownian motion and determine its drift and volatility coefficients.

14. Find the k-th largest element in an unsorted array. Assume that k is always valid, i.e., k ≥ 1 and k is less than or equal to the length of the array.

Note: You are looking for the k-th largest element in the sorted order, not the k-th distinct element of the array.

Example 1:

Input: [3,2,1,5,6,4] and k = 2
Output: 5

Example 2:

Input: [3,2,3,1,2,4,5,5,6] and k = 4
Output: 4

15. Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right of the array. You can only see the k numbers in the window. Each time the sliding window moves right by one position. Assume that k is always valid, i.e., k ≥ 1 and k is less than or equal to the size of the input array for non-empty arrays.

Write an algorithm that returns the maximum of the sliding window.

Example:

Input: nums = [1,3,-1,-3,5,3,6,7], and k = 3
Output: [3,3,5,5,6,7]

Explanation:

Window Position                  Max
[1  3  -1] -3  5  3  6  7         3
 1 [3  -1  -3] 5  3  6  7         3
 1  3 [-1  -3  5] 3  6  7         5
 1  3  -1 [-3  5  3] 6  7         5
 1  3  -1  -3 [5  3  6] 7         6
 1  3  -1  -3  5 [3  6  7]        7

Solutions

Question 1. Nine-month call options with strikes 20 and 25 on a non-dividend-paying underlying asset with spot price $22 are trading for $5.50 and $1, respectively. Can you find an arbitrage?

Answer: Note that a call option with strike 0 on a non-dividend-paying underlying asset is the same as one unit of the asset, since the call with strike 0 will always be exercised at maturity by paying $0, i.e., the strike of the option, to receive one unit of the asset. Thus, we are implicitly given a third call option with strike K = 0 and price $22 (i.e., the spot price of the asset), and we can proceed to identify whether there is convexity arbitrage for these three call options.

We have \(K_1 = 0\), \(K_2 = 20\), \(K_3 = 25\), and \(C_1 = 22\), \(C_2 = 5.50\), \(C_3 = 1\). Note that \(20 = \frac{1}{5} \cdot 0 + \frac{4}{5} \cdot 25\), i.e.,

\[ K_2 = \frac{1}{5} K_1 + \frac{4}{5} K_3. \]

Since

\[ \frac{1}{5} C_1 + \frac{4}{5} C_3 = 5.20 < 5.50 = C_2, \tag{1.1} \]

the convexity of option prices with respect to strike is violated.

The arbitrage strategy is to "buy low" \(\frac{1}{5} C_1 + \frac{4}{5} C_3\) and "sell high" \(C_2\). To normalize units, we multiply the positions by 500 to obtain the following arbitrage strategy: "buy low" \(100 C_1 + 400 C_3\) and "sell high" \(500 C_2\). Note that buying \(100 C_1\), i.e., 100 calls with strike \(K_1 = 0\), is equivalent to buying 100 units of the underlying asset since the asset does not pay dividends.

Arbitrage Strategy:
• buy 100 units of the underlying asset for $2,200;
• buy 400 calls with strike \(K_3 = 25\) for $400;
• sell 500 calls with strike \(K_2 = 20\) for $2,750;
• realize a positive cash flow of $150.

The positive cash flow of $150 represents risk-free profit since the arbitrage portfolio does not lose money at maturity. The value of the arbitrage portfolio at the maturity T of the options is

\[ V(T) = 100 S(T) - 500 C_2(T) + 400 C_3(T) = 100 S(T) - 500 \max(S(T) - 20, 0) + 400 \max(S(T) - 25, 0). \]

If \(S(T) \le 20\),

\[ V(T) = 100 S(T) \ge 0. \]

If \(20 < S(T) < 25\),

\[ V(T) = 100 S(T) - 500 (S(T) - 20) = 10000 - 400 S(T) > 0. \]

If \(25 \le S(T)\),

\[ V(T) = 100 S(T) - 500 (S(T) - 20) + 400 (S(T) - 25) = 0. \]

Note that 150 = 500 · (5.50 − 5.20), i.e., the risk-free profit $150 is equal to the size of the convexity disparity $5.50 − $5.20 times the amplifier factor 500.
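The arithmetic behind the arbitrage can be double-checked with a small sketch. The function names `arbitragePayoff` and `upfrontCashFlow` are hypothetical helpers introduced here for illustration only; the position sizes and prices are the ones stated in the answer.

```cpp
#include <algorithm>
#include <cassert>

// Value at maturity of the portfolio described in the answer:
// long 100 units of the asset, short 500 calls struck at 20,
// long 400 calls struck at 25.
double arbitragePayoff(double spotAtMaturity) {
    assert(spotAtMaturity >= 0.0);
    const double shortCalls = 500.0 * std::max(spotAtMaturity - 20.0, 0.0);
    const double longCalls = 400.0 * std::max(spotAtMaturity - 25.0, 0.0);
    return 100.0 * spotAtMaturity - shortCalls + longCalls;
}

// Net cash received today: premium from the 500 calls sold, minus the
// cost of 100 units of the asset and of the 400 calls bought.
double upfrontCashFlow() {
    return 500.0 * 5.50 - 100.0 * 22.0 - 400.0 * 1.0;
}
```

Evaluating `arbitragePayoff` on a grid of terminal prices should never give a negative value, which is what makes the upfront cash flow risk-free.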

Question 2. (i) What is the sum of the eigenvalues of the correlation matrix of n random variables?
(ii) Find a lower bound for the sum of the eigenvalues of the inverse of a nonsingular correlation matrix of n random variables.

Answer: (i) The sum of the eigenvalues of a matrix is equal to the trace of the matrix, i.e., to the sum of the main diagonal entries of the matrix.¹ Since the correlation matrix of n random variables is an n × n matrix with all main diagonal entries equal to 1, the trace of the correlation matrix is equal to n. We conclude that the sum of the eigenvalues of the correlation matrix of n random variables is n.

¹ This property follows from the fact that the eigenvalues of a matrix A are the roots of the characteristic polynomial \(P_A(t) = \det(tI - A)\) of the matrix A; see, e.g., Theorem 4.1 from Stefanica [4].

(ii) If \(\lambda_1, \lambda_2, \ldots, \lambda_n\) are the eigenvalues of the nonsingular n × n correlation matrix \(\Omega\), then \(\lambda_i > 0\) for all i = 1 : n, since a nonsingular correlation matrix is symmetric positive definite. The eigenvalues of the inverse matrix \(\Omega^{-1}\) are \(\frac{1}{\lambda_1}, \frac{1}{\lambda_2}, \ldots, \frac{1}{\lambda_n}\). Thus, the question asks us what could be said about the sum

\[ \sum_{i=1}^n \frac{1}{\lambda_i} \]

of the eigenvalues of \(\Omega^{-1}\).

Recall from (i) that the sum of the eigenvalues of the correlation matrix \(\Omega\) is n, i.e.,

\[ \sum_{i=1}^n \lambda_i = n. \tag{1.2} \]

Also, recall from the Cauchy-Schwarz inequality that

\[ \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right) \ge \left( \sum_{i=1}^n a_i b_i \right)^2. \tag{1.3} \]

Since \(\lambda_i > 0\) for all i = 1 : n, we can use the Cauchy-Schwarz inequality (1.3) for \(a_i = \sqrt{\lambda_i}\) and \(b_i = \frac{1}{\sqrt{\lambda_i}}\), i = 1 : n, to obtain that

\[ \left( \sum_{i=1}^n \lambda_i \right) \left( \sum_{i=1}^n \frac{1}{\lambda_i} \right) \ge n^2, \tag{1.4} \]

since \(a_i^2 = (\sqrt{\lambda_i})^2 = \lambda_i\), \(b_i^2 = \left( \frac{1}{\sqrt{\lambda_i}} \right)^2 = \frac{1}{\lambda_i}\), and \(a_i b_i = \sqrt{\lambda_i} \cdot \frac{1}{\sqrt{\lambda_i}} = 1\) for all i = 1 : n.

From (1.4) and using (1.2), we find that

\[ \sum_{i=1}^n \frac{1}{\lambda_i} \ge n. \]

In other words, the sum of the eigenvalues of the inverse of a nonsingular correlation matrix of n random variables is bounded from below by n. □
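As a quick sanity check of the bound in the simplest case n = 2, the sketch below relies on the standard fact that a 2 × 2 correlation matrix with off-diagonal entry ρ has eigenvalues 1 + ρ and 1 − ρ. The function name `sumInverseEigenvalues2x2` is a hypothetical helper, not something taken from the book.

```cpp
#include <cassert>

// Sum of the eigenvalues of the inverse of the 2 x 2 correlation matrix
// [[1, rho], [rho, 1]], whose eigenvalues are 1 + rho and 1 - rho.
// The matrix is nonsingular only for -1 < rho < 1.
double sumInverseEigenvalues2x2(double rho) {
    assert(rho > -1.0 && rho < 1.0);
    return 1.0 / (1.0 + rho) + 1.0 / (1.0 - rho);
}
```

The result equals 2/(1 − ρ²), which is at least n = 2 and attains the bound only at ρ = 0, i.e., for uncorrelated variables.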

Question 3. Let \(W_t\) be a Wiener process, and let

\[ X_t = \int_0^t W_\tau \, d\tau. \tag{1.5} \]

What is the distribution of \(X_t\)? Is \(X_t\) a martingale?

Answer: Note that we can rewrite (1.5) in differential form as

\[ dX_t = W_t\,dt = W_t\,dt + 0\,dW_t. \]

Then, \(X_t\) is a diffusion process with only drift part \(W_t\), and therefore \(X_t\) is not a martingale.

We use integration by parts to find the distribution of \(X_t\); a different solution can be found in Section 3.6.

By applying integration by parts, we obtain that

\[ X_t = \int_0^t W_\tau \, d\tau = t W_t - \int_0^t \tau \, dW_\tau = t \int_0^t dW_\tau - \int_0^t \tau \, dW_\tau = \int_0^t (t - \tau) \, dW_\tau. \]

Recall that, if \(f(t)\) is a deterministic square integrable function, then the stochastic integral \(\int_0^t f(\tau)\,dW_\tau\) is a normal random variable of mean 0 and variance \(\int_0^t |f(\tau)|^2 \, d\tau\), i.e.,

\[ \int_0^t f(\tau)\,dW_\tau \sim N\left( 0, \int_0^t |f(\tau)|^2 \, d\tau \right). \]

Thus,

\[ X_t = \int_0^t (t - \tau)\,dW_\tau \sim N\left( 0, \int_0^t (t - \tau)^2 \, d\tau \right) = N\left( 0, \frac{t^3}{3} \right). \]

We conclude that \(X_t\) is a normal random variable of mean 0 and variance \(\frac{t^3}{3}\). □
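The variance t³/3 can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the original solution: it approximates the time integral by a left-point Riemann sum on a uniform grid, and the function name, step count, path count, and seed are all illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Sample variance of X_t = integral of W over [0, t], estimated by
// simulating Brownian paths on a uniform grid and using a left-point
// Riemann sum for the time integral.
double sampleVarianceOfIntegratedBrownianMotion(double t, int steps,
                                                int paths, unsigned seed) {
    assert(t > 0.0 && steps > 0 && paths > 1);
    std::mt19937 gen(seed);
    const double dt = t / steps;
    std::normal_distribution<double> increment(0.0, std::sqrt(dt));

    double sum = 0.0;
    double sumSquares = 0.0;
    for (int p = 0; p < paths; ++p) {
        double w = 0.0;  // current value of the Brownian path
        double x = 0.0;  // running approximation of the integral
        for (int i = 0; i < steps; ++i) {
            x += w * dt;
            w += increment(gen);
        }
        sum += x;
        sumSquares += x * x;
    }
    const double mean = sum / paths;
    return sumSquares / paths - mean * mean;
}
```

With t = 1, the estimate should land close to 1/3, up to sampling noise and a small discretization bias from the Riemann sum.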

Question 4. An 8 × 8 matrix contains zeros and ones. You may repeatedly choose any 3 × 3 or 4 × 4 block and flip all bits in the block (that is, convert zeros to ones, and ones to zeros). Can you always modify the original matrix into an all-zero matrix using these block flips?

Answer: No! Note that all the block flips are reversible, so it will suffice to show that there exist 8 × 8 matrices containing zeros and ones that cannot be obtained using the block flips starting from an all-zero matrix.

Given a multiset of 3 × 3 and 4 × 4 blocks to be flipped in some order, the final matrix obtained is independent of the order in which the flips of the blocks in the multiset are applied. Moreover, we can remove all the block repetitions; in other words, we can reduce the multiset of the blocks flipped to a set with no repeated blocks by recognizing that flipping the same block twice does not affect the final matrix obtained at the end.

The total number of 3 × 3 blocks in an 8 × 8 matrix is 36: the upper left corner of the 3 × 3 block cannot be located in the 7-th or 8-th row or in the 7-th or 8-th column of the 8 × 8 matrix and therefore there are 6 × 6 = 36 possible positions for it. Similarly, the total number of 4 × 4 blocks in an 8 × 8 matrix is 25: the upper left corner of the 4 × 4 block cannot be located in the 6-th, 7-th or 8-th row or in the 6-th, 7-th or 8-th column of the 8 × 8 matrix and therefore there are 5 × 5 = 25 possible positions for it.

Thus, there are 36 + 25 = 61 blocks that can be flipped and the total number of different sets of blocks made with these 61 blocks (with no repeated blocks) is \(2^{61}\). Then, starting with an all-zero matrix, we can obtain at most \(2^{61}\) distinct matrices. Since the total number of 8 × 8 matrices containing zeros and ones is \(2^{64}\), it follows that there exist matrices that cannot be obtained starting from an all-zero matrix by using block flips.

Question 5. Find all the values of \(\rho\) such that

\[ \begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix} \]

is a correlation matrix.

Answer: A symmetric matrix with diagonal entries equal to 1 is a correlation matrix if and only if the matrix is symmetric positive semidefinite. Thus, we need to find all the values of \(\rho\) such that the matrix

\[ \Omega = \begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix} \tag{1.6} \]

is symmetric positive semidefinite.

We give a short solution using Sylvester's criterion. Two more solutions, one using the Cholesky decomposition, and another one based on the definition of symmetric positive semidefinite matrices, will be given in Section 3.2.

Recall from Sylvester's criterion that a matrix is symmetric positive semidefinite if and only if all its principal minors are greater than or equal to 0. Also, recall that the principal minors of a matrix are the determinants of all the square matrices obtained by eliminating the same rows and columns from the matrix. In particular, the matrix \(\Omega\) from (1.6) has the following principal minors:

\[ \det(1) = 1; \quad \det(1) = 1; \quad \det(1) = 1; \]

\[ \det \begin{pmatrix} 1 & 0.6 \\ 0.6 & 1 \end{pmatrix} = 0.64; \quad \det \begin{pmatrix} 1 & -0.3 \\ -0.3 & 1 \end{pmatrix} = 0.91; \quad \det \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} = 1 - \rho^2; \]

\[ \det(\Omega) = 1 - 0.36\rho - 0.09 - 0.36 - \rho^2 = 0.55 - 0.36\rho - \rho^2. \]

Thus, it follows from Sylvester's criterion that \(\Omega\) is a symmetric positive semidefinite matrix if and only if

\[ 1 - \rho^2 \ge 0; \]
\[ 0.55 - 0.36\rho - \rho^2 \ge 0, \]

which is equivalent to \(-1 \le \rho \le 1\) and

\[ \rho^2 + 0.36\rho - 0.55 \le 0. \tag{1.7} \]

Since the roots of the quadratic equation corresponding to (1.7) are −0.9432 and 0.5832, we conclude that the matrix \(\Omega\) is symmetric positive semidefinite, and therefore a correlation matrix, if and only if

\[ -0.9432 \le \rho \le 0.5832. \quad \square \tag{1.8} \]
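The principal-minor test above is easy to reproduce numerically. The sketch below hard-codes the two ρ-dependent minors derived in the answer (the other minors are positive constants); `isCorrelationMatrix`, `RhoRange`, and `admissibleRhoRange` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Sylvester-type check for the 3 x 3 matrix of Question 5: only the minors
// 1 - rho^2 and det = 0.55 - 0.36 rho - rho^2 depend on rho.
bool isCorrelationMatrix(double rho) {
    const double minor2x2 = 1.0 - rho * rho;
    const double determinant = 0.55 - 0.36 * rho - rho * rho;
    return minor2x2 >= 0.0 && determinant >= 0.0;
}

struct RhoRange {
    double lower;
    double upper;
};

// Roots of rho^2 + 0.36 rho - 0.55 = 0, which bound the admissible values.
RhoRange admissibleRhoRange() {
    const double discriminant = 0.36 * 0.36 + 4.0 * 0.55;
    const double root = std::sqrt(discriminant);
    return RhoRange{(-0.36 - root) / 2.0, (-0.36 + root) / 2.0};
}
```

Calling `admissibleRhoRange()` should reproduce the endpoints quoted in (1.8) to four decimal places.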

Question 6. How many independent random variables uniformly distributed on [0,1] should you generate to ensure that there is at least one between 0.70 and 0.72 with probability 95%?

Answer: Denote by N the smallest number of random variables you should generate such that

\[ P(\text{at least one r.v. in } [0.70, 0.72]) \ge 0.95. \tag{1.9} \]

The probability that a random variable uniformly distributed on [0,1] is not in the interval [0.70, 0.72] is 0.98. Thus, the probability that none of the N independent variables are in [0.70, 0.72] is \(0.98^N\), i.e.,

\[ P(\text{no r.v. in } [0.70, 0.72]) = 0.98^N. \]

Note that

\[ P(\text{at least one r.v. in } [0.70, 0.72]) = 1 - P(\text{no r.v. in } [0.70, 0.72]) = 1 - (0.98)^N. \tag{1.10} \]

From (1.9) and (1.10), we find that N is the smallest integer such that

\[ 1 - (0.98)^N \ge 0.95, \]

which is equivalent to

\[ (0.98)^N \le 0.05 \iff N \ln(0.98) \le \ln(0.05) \iff N \ge \frac{\ln(0.05)}{\ln(0.98)} \approx 148.28 \iff N = 149. \]

We conclude that at least 149 uniform random variables on [0,1] must be generated in order to have 95% confidence that at least one of the random variables is between 0.70 and 0.72.
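The same computation can be written as a short function. This is a sketch under the assumptions of the answer (independent draws, each landing in the target interval with a fixed probability); the name `minimumSampleCount` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Smallest N such that 1 - (1 - hitProbability)^N >= confidence, i.e.,
// N >= ln(1 - confidence) / ln(1 - hitProbability).
int minimumSampleCount(double hitProbability, double confidence) {
    assert(hitProbability > 0.0 && hitProbability < 1.0);
    assert(confidence > 0.0 && confidence < 1.0);
    const double bound =
        std::log(1.0 - confidence) / std::log(1.0 - hitProbability);
    return static_cast<int>(std::ceil(bound));
}
```

For the interval [0.70, 0.72] the hit probability is 0.02, so `minimumSampleCount(0.02, 0.95)` corresponds to the question as stated.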

Question 7. Suppose you have in your possession an incredibly large bag of M&M's containing a uniform distribution of the six M&M colors. (M&M's come in blue, orange, green, yellow, red, brown.) You decide to play a game: you draw one M&M from the bag and place it on the table. You then continue to draw M&M's from the bag one at a time. If you draw an M&M that is the same color as one already on the table, you eat both of them. Otherwise, you place the M&M on the table along with the others of different color. The game ends when you have six M&M's (all of different colors) on the table. How many M&M's should you expect to eat playing this game?

Answer: Denote by \(S_n\) the state with M&M's of n distinct colors on the table, for 0 ≤ n ≤ 6. Let \(A_n\) be the expected number of M&M's that have to be eaten in order to reach state \(S_{n+1}\) from state \(S_n\). The question asks us to find

\[ \sum_{n=0}^{5} A_n. \]

Note that \(A_0 = 0\): starting from state \(S_0\) with zero M&M's on the table, we draw a single M&M which is, by default, of distinct color, and we end up in state \(S_1\) with no M&M's eaten.

Suppose we are in state \(S_n\), 1 ≤ n ≤ 5. On the next draw from the bag, two different instances can occur:
• with probability \(1 - \frac{n}{6}\), we draw an M&M of a color that does not match any color of the M&M's already on the table, hence moving to state \(S_{n+1}\) with no M&M's eaten in the process;
• with probability \(\frac{n}{6}\), we draw an M&M of a color that matches the color of another M&M already present on the table. In this case, we eat both M&M's with matching color and we are left with n − 1 M&M's of different colors on the table, which corresponds to state \(S_{n-1}\). Then, in order to reach state \(S_{n+1}\) from state \(S_{n-1}\), we will have to eat an additional \(A_{n-1} + A_n\) M&M's, on average.

In other words,

\[ A_n = \left( 1 - \frac{n}{6} \right) \cdot 0 + \frac{n}{6} \left( 2 + A_{n-1} + A_n \right), \]

and therefore

\[ A_n = \frac{n}{6 - n} \left( 2 + A_{n-1} \right), \quad \forall\, 1 \le n \le 5. \tag{1.11} \]

Since \(A_0 = 0\), we obtain from the recursion (1.11) that

\[ A_1 = \frac{2}{5}; \quad A_2 = \frac{6}{5}; \quad A_3 = \frac{16}{5}; \quad A_4 = \frac{52}{5}; \quad A_5 = 62. \]

We conclude that the expected number of M&M's eaten in order to reach state \(S_6\) from state \(S_0\) is

\[ \sum_{n=0}^{5} A_n = 77.2. \quad \square \]
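The recursion (1.11) can be evaluated mechanically. The sketch below generalizes it to an arbitrary number of colors, which is an assumption made here for illustration (the book treats six colors only); `expectedCandiesEaten` is a hypothetical name.

```cpp
#include <cassert>

// Expected number of candies eaten before all `colors` distinct colors are
// on the table, using A_0 = 0 and A_n = n / (colors - n) * (2 + A_{n-1})
// for 1 <= n <= colors - 1, and summing the A_n.
double expectedCandiesEaten(int colors) {
    assert(colors >= 1);
    double previous = 0.0;  // A_{n-1}, starting from A_0 = 0
    double total = 0.0;
    for (int n = 1; n < colors; ++n) {
        const double current =
            static_cast<double>(n) / (colors - n) * (2.0 + previous);
        total += current;
        previous = current;
    }
    return total;
}
```

With six colors, the partial terms match the values 2/5, 6/5, 16/5, 52/5, and 62 listed in the answer.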

Question 8. Assume the Earth is perfectly spherical and you are standing somewhere on its surface. You travel exactly 1 mile south, then 1 mile east, then 1 mile north. Surprisingly, you find yourself back at the starting point. If you are not at the North Pole, where can you possibly be?

Answer: There are infinitely many locations, aside from the North Pole, that have this property.

Somewhere near the South Pole, there is a latitude that has a circumference of one mile. In other words, if you are at this latitude and start walking east (or west), in one mile you will be back exactly where you started from. If you instead start at some point one mile north of this latitude, your journey will take you one mile south to this special latitude, then one mile east "around the globe" and finally one mile north right back to wherever you started from. Moreover, there are infinitely many points on the Earth that are one mile north of this special latitude, where you could start your journey and eventually end up exactly where you started.

We are still not finished! There are infinitely many special latitudes as well; namely, you could start at any point one mile north of the latitude that has a circumference of 1/k miles, where k is a positive integer. Your journey will take you one mile south to this special latitude, then one mile east looping "around the globe" k times, and finally one mile north right back to where you started from. □

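Question 9. Solve the Ornstein-Uhlenbeck SDE

\[ dr_t = \lambda(\theta - r_t)\,dt + \sigma\,dW_t, \tag{1.12} \]

with \(\lambda > 0\), which is used, e.g., in the Vasicek model for interest rates.

Answer: We can rewrite (1.12) as

\[ dr_t + \lambda r_t\,dt = \lambda\theta\,dt + \sigma\,dW_t. \tag{1.13} \]

By multiplying (1.13) on both sides by the integrating factor \(e^{\lambda t}\), we obtain that

\[ e^{\lambda t}\,dr_t + \lambda e^{\lambda t} r_t\,dt = \lambda\theta e^{\lambda t}\,dt + \sigma e^{\lambda t}\,dW_t, \]

which is equivalent to

\[ d\left( e^{\lambda t} r_t \right) = \lambda\theta e^{\lambda t}\,dt + \sigma e^{\lambda t}\,dW_t. \tag{1.14} \]

By integrating (1.14) from 0 to t, it follows that

\[ e^{\lambda t} r_t - r_0 = \lambda\theta \int_0^t e^{\lambda s}\,ds + \sigma \int_0^t e^{\lambda s}\,dW_s = \theta \left( e^{\lambda t} - 1 \right) + \sigma \int_0^t e^{\lambda s}\,dW_s. \]

By solving for \(r_t\), we find that the solution to the Ornstein-Uhlenbeck SDE is

\[ r_t = e^{-\lambda t} r_0 + e^{-\lambda t}\theta \left( e^{\lambda t} - 1 \right) + \sigma e^{-\lambda t} \int_0^t e^{\lambda s}\,dW_s = e^{-\lambda t} r_0 + \theta \left( 1 - e^{-\lambda t} \right) + \sigma \int_0^t e^{-\lambda(t - s)}\,dW_s. \]

Note that the process \(r_t\) is mean reverting to \(\theta\), regardless of the starting point \(r_0\). To see this, recall that the expected value of the stochastic integral \(\int_0^t f(s)\,dW_s\) of a non-random function \(f(s)\) is 0. Then,

\[ E\left[ \int_0^t e^{-\lambda(t - s)}\,dW_s \right] = 0, \]

and therefore

\[ E[r_t] = e^{-\lambda t} r_0 + \theta \left( 1 - e^{-\lambda t} \right). \]

Thus,

\[ \lim_{t \to \infty} E[r_t] = \theta. \quad \square \]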
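The mean-reversion property can be illustrated by evaluating the expected value of the solution, which follows from the closed form because the stochastic integral has zero mean. The function name `expectedShortRate` and its parameter names are hypothetical labels for r_0, λ, θ, and t.

```cpp
#include <cassert>
#include <cmath>

// E[r_t] for the Ornstein-Uhlenbeck process started at r0:
// E[r_t] = exp(-lambda t) r0 + theta (1 - exp(-lambda t)).
double expectedShortRate(double r0, double lambda, double theta, double t) {
    assert(lambda > 0.0 && t >= 0.0);
    const double decay = std::exp(-lambda * t);
    return decay * r0 + theta * (1.0 - decay);
}
```

At t = 0 the function returns r0, and for large t it approaches θ at a speed controlled by λ, whatever the starting point.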

Question 10. Find the value of

\[ \int_0^a \frac{f(x)}{f(x) + f(a - x)} \, dx, \]

assuming that \(f(x)\) is a function such that the integral above exists.

Answer: Let

\[ I = \int_0^a \frac{f(x)}{f(x) + f(a - x)} \, dx. \tag{1.15} \]

We use the substitution y = a − x for (1.15). Note that x = 0 corresponds to y = a, x = a corresponds to y = 0, and, since x = a − y, it follows that dx = −dy. Then, from (1.15), we obtain that

\[ I = \int_a^0 \frac{f(a - y)}{f(a - y) + f(y)} \, (-dy) = \int_0^a \frac{f(a - y)}{f(y) + f(a - y)} \, dy. \tag{1.16} \]

By renaming the integrating variable y as x in (1.16), we find that

\[ I = \int_0^a \frac{f(a - x)}{f(x) + f(a - x)} \, dx. \tag{1.17} \]

By adding (1.15) and (1.17), we obtain that

\[ 2I = \int_0^a \frac{f(x) + f(a - x)}{f(x) + f(a - x)} \, dx = \int_0^a 1 \, dx = a, \]

and conclude that

\[ I = \frac{a}{2}. \]

Thus,

\[ \int_0^a \frac{f(x)}{f(x) + f(a - x)} \, dx = \frac{a}{2} \tag{1.18} \]

for any function \(f(x)\) such that the integral from (1.18) exists. □
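A numerical spot check of (1.18) is sketched below using a composite midpoint rule. The quadrature scheme, the function name `symmetricRatioIntegral`, and the default number of subintervals are all assumptions chosen for illustration; the caller is expected to pass a function for which the denominator never vanishes on [0, a].

```cpp
#include <cassert>
#include <functional>

// Midpoint-rule approximation of the integral over [0, a] of
// f(x) / (f(x) + f(a - x)).
double symmetricRatioIntegral(const std::function<double(double)>& f,
                              double a, int subintervals = 1000) {
    assert(a > 0.0 && subintervals > 0);
    const double h = a / subintervals;
    double sum = 0.0;
    for (int i = 0; i < subintervals; ++i) {
        const double x = (i + 0.5) * h;  // midpoint of the i-th subinterval
        sum += f(x) / (f(x) + f(a - x));
    }
    return sum * h;
}
```

For instance, passing the positive function 1 + x² with a = 2 should return a value very close to 1, independently of the particular choice of f.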

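Question 11. Let X and Y be standard normal variables with joint normal distribution with correlation \(\rho\). Find the expectation

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)], \]

where \(\operatorname{sgn}(\cdot)\) is the sign function given by \(\operatorname{sgn}(x) = 1\), if \(x > 0\), \(\operatorname{sgn}(x) = -1\), if \(x < 0\), and \(\operatorname{sgn}(0) = 0\).

Answer: If \(\rho = 1\), then

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = E\left[ \operatorname{sgn}(Z)^2 \right] = E[1] = 1, \tag{1.19} \]

where Z is the standard normal variable, and, if \(\rho = -1\),

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = E\left[ -\operatorname{sgn}(Z)^2 \right] = E[-1] = -1. \tag{1.20} \]

If \(\rho \in (-1, 1)\), we obtain that

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = P[X > 0, Y > 0] + P[X < 0, Y < 0] - P[X > 0, Y < 0] - P[X < 0, Y > 0]. \tag{1.21} \]

Note that

\[ P[X > 0, Y > 0] = P[X < 0, Y < 0]; \tag{1.22} \]
\[ P[X > 0, Y < 0] = P[X < 0, Y > 0], \tag{1.23} \]

due to symmetry, and therefore (1.21) can be written using (1.22-1.23) as

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = 2P[X > 0, Y > 0] - 2P[X > 0, Y < 0]. \tag{1.24} \]

Moreover,

\[ P[X > 0, Y > 0] + P[X < 0, Y < 0] + P[X > 0, Y < 0] + P[X < 0, Y > 0] = 1. \tag{1.25} \]

Using (1.22-1.23) in (1.25), we find that

\[ 2P[X > 0, Y > 0] + 2P[X > 0, Y < 0] = 1, \]

and therefore

\[ P[X > 0, Y < 0] = \frac{1}{2} - P[X > 0, Y > 0]. \tag{1.26} \]

By substituting (1.26) in (1.24), we obtain that

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = 4P[X > 0, Y > 0] - 1. \tag{1.27} \]

To compute \(P[X > 0, Y > 0]\), recall that, if X and Y are standard normal variables with joint normal distribution with correlation \(\rho\), then there exist two independent standard normal variables \(Z_1\) and \(Z_2\) such that

\[ X = Z_1; \quad Y = \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2. \tag{1.28} \]

Let \(\bar{\rho} = \sqrt{1 - \rho^2}\). From (1.28), we obtain that

\[ Y = \rho X + \bar{\rho} Z, \tag{1.29} \]

where we denoted \(Z_2\) by Z for simplicity. Note that \(X = Z_1\) and \(Z = Z_2\) are independent standard normals.

Then, from (1.29) and using the fact that X and Z are independent standard normal variables, it follows that

\[ P[X > 0, Y > 0] = P[X > 0, \rho X + \bar{\rho} Z > 0] = \frac{1}{2\pi} \int_0^\infty \int_{-\frac{\rho}{\bar{\rho}} x}^\infty e^{-\frac{x^2 + z^2}{2}} \, dz \, dx. \tag{1.30} \]

We use a polar coordinates change of variables to compute the integral (1.30). Let

\[ x = r\cos(\theta); \quad z = r\sin(\theta), \]

and recall that

\[ dz\,dx = r\,d\theta\,dr. \tag{1.31} \]

Note that, for \(x > 0\),

\[ -\frac{\rho}{\bar{\rho}} x < z < \infty \iff -\frac{\rho}{\bar{\rho}} < \tan(\theta) < \infty \iff \alpha < \theta < \frac{\pi}{2}, \tag{1.32} \]

where

\[ \alpha = \arctan\left( -\frac{\rho}{\bar{\rho}} \right). \]

Note that \(\alpha\) is the signed angle between the x-axis and the straight line \(\rho x + \bar{\rho} z = 0\) on the (x, z) plane.

From (1.30) and using (1.31) and (1.32), we obtain that

\[ P[X > 0, Y > 0] = \frac{1}{2\pi} \int_0^\infty \int_\alpha^{\frac{\pi}{2}} r e^{-\frac{r^2}{2}} \, d\theta \, dr = \frac{1}{2\pi} \left( \frac{\pi}{2} - \alpha \right) \int_0^\infty r e^{-\frac{r^2}{2}} \, dr = \frac{1}{4} - \frac{\alpha}{2\pi} = \frac{1}{4} + \frac{1}{2\pi} \arctan\left( \frac{\rho}{\sqrt{1 - \rho^2}} \right). \tag{1.33} \]

From (1.27) and (1.33), we conclude that

\[ E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = 4P[X > 0, Y > 0] - 1 = 4 \left( \frac{1}{4} + \frac{1}{2\pi} \arctan\left( \frac{\rho}{\sqrt{1 - \rho^2}} \right) \right) - 1 = \frac{2}{\pi} \arctan\left( \frac{\rho}{\sqrt{1 - \rho^2}} \right). \tag{1.34} \]

Formulas (1.19) and (1.20) for \(E[\operatorname{sgn}(X)\operatorname{sgn}(Y)]\) corresponding to \(\rho = 1\) and \(\rho = -1\), respectively, can be obtained from the general formula (1.34) as limiting cases when \(\rho\) goes to 1 and to −1. For example,

\[ \lim_{\rho \searrow -1} E[\operatorname{sgn}(X)\operatorname{sgn}(Y)] = \frac{2}{\pi} \lim_{\rho \searrow -1} \arctan\left( \frac{\rho}{\sqrt{1 - \rho^2}} \right) = \frac{2}{\pi} \cdot \left( -\frac{\pi}{2} \right) = -1, \]

which is the same as (1.20), since

\[ \lim_{\rho \searrow -1} \frac{\rho}{\sqrt{1 - \rho^2}} = -\infty, \]

and therefore

\[ \lim_{\rho \searrow -1} \arctan\left( \frac{\rho}{\sqrt{1 - \rho^2}} \right) = -\frac{\pi}{2}. \quad \square \]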
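Formula (1.34), together with the boundary cases (1.19) and (1.20), can be packaged as a small function. This is only an illustrative sketch; `expectedSignProduct` is a hypothetical name, and π is obtained from `std::acos(-1.0)` rather than from a library constant.

```cpp
#include <cassert>
#include <cmath>

// E[sgn(X) sgn(Y)] for standard normals X, Y with correlation rho:
// (2 / pi) * arctan(rho / sqrt(1 - rho^2)) for |rho| < 1, and rho itself
// in the boundary cases rho = 1 and rho = -1.
double expectedSignProduct(double rho) {
    assert(rho >= -1.0 && rho <= 1.0);
    if (rho == 1.0 || rho == -1.0) {
        return rho;
    }
    const double pi = std::acos(-1.0);
    return 2.0 / pi * std::atan(rho / std::sqrt(1.0 - rho * rho));
}
```

For uncorrelated variables the function returns 0, and for ρ = 0.5 it returns 1/3, since arctan(0.5 / √0.75) = π/6.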

Question 12. How do you create a long Gamma, short vega options trading strategy?

Answer: Both Gamma and vega are highest for options around at-the-money (ATM). However, the Gamma of ATM options is higher for options with shorter maturity (i.e., for short-dated options), while the vega of ATM options is higher for longer maturity options (i.e., for long-dated options); see Figure 1.1 and Figure 1.2, respectively.

[Figure 1.1: Dependence of Gamma on time to maturity. Gamma plotted against S/K for T = 0.25 and T = 0.5.]

A trader who buys short-dated ATM options and sells the same number of long-dated ATM options will be long Gamma and short vega.

[Figure 1.2: Dependence of vega on time to maturity. Vega plotted for T = 0.25 and T = 0.5.]

Note that calls and puts with the same strike have the same Gamma and vega, a consequence of the Put-Call parity, so you can take positions in either call or put options.

Also, the long Gamma, short vega portfolio can be made Delta-neutral by taking an appropriate position in the underlying asset. The delta of short-dated ATM options is smaller than the delta of long-dated ATM options. Then, the delta of the long Gamma, short vega portfolio is negative and therefore the trader will have to purchase units of the underlying asset in order to make the portfolio Delta-neutral. □
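The maturity dependence described above can be reproduced with the standard Black-Scholes Greeks for a non-dividend-paying asset. Note that the Black-Scholes setting and the formulas Gamma = φ(d1)/(Sσ√T) and vega = Sφ(d1)√T are assumptions brought in for this sketch; the answer itself only refers to the figures. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Standard normal density.
double normalPdf(double x) {
    const double pi = std::acos(-1.0);
    return std::exp(-0.5 * x * x) / std::sqrt(2.0 * pi);
}

// Black-Scholes d1 for spot S, strike K, rate r, volatility sigma, maturity T.
double blackScholesD1(double S, double K, double r, double sigma, double T) {
    assert(S > 0.0 && K > 0.0 && sigma > 0.0 && T > 0.0);
    return (std::log(S / K) + (r + 0.5 * sigma * sigma) * T) /
           (sigma * std::sqrt(T));
}

// Gamma of a European call or put (identical for both).
double blackScholesGamma(double S, double K, double r, double sigma, double T) {
    return normalPdf(blackScholesD1(S, K, r, sigma, T)) /
           (S * sigma * std::sqrt(T));
}

// Vega of a European call or put (identical for both).
double blackScholesVega(double S, double K, double r, double sigma, double T) {
    return S * normalPdf(blackScholesD1(S, K, r, sigma, T)) * std::sqrt(T);
}
```

For an ATM option, comparing T = 0.25 against T = 0.5 should show the shorter maturity with the larger Gamma and the smaller vega, so buying the former and selling the latter is long Gamma and short vega.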

Question 13. Let X; and Y; be geometric ‘Brownian ©


25

motions driven by

dX .
{ = pxdt+oxdW;; (1.35)
Xt
Y;
ety = pydt+oydBi, (1.36)

where W; and B; are correlated Brownian motions with


constant correlation p. Show that

Y;
is also a geometric Brownian motion and determine its
drift and volatility coefficients. .
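Answer: Let
$$f(x,y) = \frac{x}{y}.$$
By applying Itô's lemma to $Z_t = \frac{X_t}{Y_t} = f(X_t, Y_t)$, we obtain that

$$dZ_t = \frac{\partial f}{\partial x}(X_t,Y_t)\,dX_t + \frac{\partial f}{\partial y}(X_t,Y_t)\,dY_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(X_t,Y_t)\,d[X]_t + \frac{1}{2}\frac{\partial^2 f}{\partial y^2}(X_t,Y_t)\,d[Y]_t + \frac{\partial^2 f}{\partial x \partial y}(X_t,Y_t)\,d[X,Y]_t.$$

Note that
$$\frac{\partial f}{\partial x}(x,y) = \frac{1}{y}; \quad \frac{\partial f}{\partial y}(x,y) = -\frac{x}{y^2};$$
$$\frac{\partial^2 f}{\partial x^2}(x,y) = 0; \quad \frac{\partial^2 f}{\partial y^2}(x,y) = \frac{2x}{y^3}; \quad \frac{\partial^2 f}{\partial x \partial y}(x,y) = -\frac{1}{y^2},$$

and therefore

$$dZ_t = \frac{1}{Y_t}\,dX_t - \frac{X_t}{Y_t^2}\,dY_t + \frac{X_t}{Y_t^3}\,d[Y]_t - \frac{1}{Y_t^2}\,d[X,Y]_t. \qquad (1.37)$$

Since
$$d[Y]_t = \sigma_Y^2 Y_t^2\,dt; \quad d[X,Y]_t = \rho\sigma_X\sigma_Y X_t Y_t\,dt,$$
it follows from (1.37) that

$$dZ_t = \frac{1}{Y_t}\,dX_t - \frac{X_t}{Y_t^2}\,dY_t + \frac{X_t}{Y_t}\sigma_Y^2\,dt - \frac{X_t}{Y_t}\rho\sigma_X\sigma_Y\,dt$$
$$= \frac{X_t}{Y_t}\cdot\frac{dX_t}{X_t} - \frac{X_t}{Y_t}\cdot\frac{dY_t}{Y_t} + \frac{X_t}{Y_t}\sigma_Y^2\,dt - \frac{X_t}{Y_t}\rho\sigma_X\sigma_Y\,dt$$
$$= Z_t\frac{dX_t}{X_t} - Z_t\frac{dY_t}{Y_t} + Z_t\sigma_Y^2\,dt - Z_t\rho\sigma_X\sigma_Y\,dt,$$

which can be written as

$$\frac{dZ_t}{Z_t} = \frac{dX_t}{X_t} - \frac{dY_t}{Y_t} + (\sigma_Y^2 - \rho\sigma_X\sigma_Y)\,dt. \qquad (1.38)$$

By substituting (1.35) and (1.36) in (1.38), we obtain that

$$\frac{dZ_t}{Z_t} = (\mu_X\,dt + \sigma_X\,dW_t) - (\mu_Y\,dt + \sigma_Y\,dB_t) + (\sigma_Y^2 - \rho\sigma_X\sigma_Y)\,dt$$
$$= (\mu_X - \mu_Y + \sigma_Y^2 - \rho\sigma_X\sigma_Y)\,dt + (\sigma_X\,dW_t - \sigma_Y\,dB_t)$$
$$= \mu_Z\,dt + (\sigma_X\,dW_t - \sigma_Y\,dB_t), \qquad (1.39)$$

where
$$\mu_Z = \mu_X - \mu_Y + \sigma_Y^2 - \rho\sigma_X\sigma_Y.$$

Note that $\widetilde{W}_t$ given by

$$d\widetilde{W}_t = \frac{\sigma_X\,dW_t - \sigma_Y\,dB_t}{\sqrt{\sigma_X^2 - 2\rho\sigma_X\sigma_Y + \sigma_Y^2}}$$

is a Brownian motion, and let

$$\sigma_Z = \sqrt{\sigma_X^2 - 2\rho\sigma_X\sigma_Y + \sigma_Y^2}.$$

Then,
$$\sigma_X\,dW_t - \sigma_Y\,dB_t = \sigma_Z\,d\widetilde{W}_t, \qquad (1.40)$$

and we conclude from (1.39) and (1.40) that $Z_t$ satisfies the SDE

$$\frac{dZ_t}{Z_t} = \mu_Z\,dt + \sigma_Z\,d\widetilde{W}_t.$$

Thus, $Z_t$ is a geometric Brownian motion with drift $\mu_Z$ and volatility $\sigma_Z$, where

$$\mu_Z = \mu_X - \mu_Y + \sigma_Y^2 - \rho\sigma_X\sigma_Y;$$
$$\sigma_Z = \sqrt{\sigma_X^2 - 2\rho\sigma_X\sigma_Y + \sigma_Y^2}. \quad \Box$$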
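As a quick numerical illustration of the drift and volatility formulas for the ratio of two geometric Brownian motions, here is a minimal sketch. The struct `GbmParams` and the function `ratio_gbm` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cmath>

// Drift and volatility coefficients of a geometric Brownian motion.
struct GbmParams {
    double drift;
    double vol;
};

// Coefficients of Z = X / Y, where X and Y are geometric Brownian motions
// with drifts mu_x, mu_y, volatilities sigma_x, sigma_y, and driving
// Brownian motions with constant correlation rho.
GbmParams ratio_gbm(double mu_x, double sigma_x,
                    double mu_y, double sigma_y, double rho) {
    GbmParams z;
    z.drift = mu_x - mu_y + sigma_y * sigma_y - rho * sigma_x * sigma_y;
    z.vol = std::sqrt(sigma_x * sigma_x
                      - 2.0 * rho * sigma_x * sigma_y
                      + sigma_y * sigma_y);
    return z;
}
```

For instance, with the illustrative values mu_x = 0.10, sigma_x = 0.30, mu_y = 0.05, sigma_y = 0.20, and rho = 0.5, the drift is 0.10 - 0.05 + 0.04 - 0.03 = 0.06 and the volatility is the square root of 0.09 - 0.06 + 0.04 = 0.07.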

Question 14. Find the k-th largest element in an unsorted array. Assume that k is always valid, i.e., k ≥ 1 and k is less than or equal to the length of the array.
Note: You are looking for the k-th largest element in the sorted order, not the k-th distinct element of the array.
Example 1:

Input: [3,2,1,5,6,4] and k = 2
Output: 5

Example 2:

Input: [3,2,3,1,2,4,5,5,6] and k = 4
Output: 4

Answer:
Solution 1: Use a max heap data structure as follows (sample code in C++):
#include <queue>
#include <vector>
using std::vector;

class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        // Max heap: the largest remaining element is always on top.
        std::priority_queue<int> max_heap;
        for (int i = 0; i < nums.size(); ++i) {
            max_heap.push(nums[i]);
        }
        // Remove the k - 1 largest elements.
        int j = 0;
        while (j++ < k - 1) {
            max_heap.pop();
        }
        return max_heap.top();
    }
};

Solution 2: Use a quick selection algorithm as follows (sample code in C++):
#include <utility>
#include <vector>
using std::swap;
using std::vector;

class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        const int size_n = nums.size();
        int left = 0, right = size_n;
        while (left < right) {
            // Partition [left, right) in decreasing order around the pivot.
            int i = left, j = right - 1, pivot = nums[left];
            while (i <= j) {
                while (i <= j && nums[i] >= pivot) i++;
                while (i <= j && nums[j] < pivot) j--;
                if (i < j)
                    swap(nums[i++], nums[j--]);
            }
            // Move the pivot to its final position j.
            swap(nums[left], nums[j]);
            if (j == k - 1) return nums[j];
            if (j < k - 1) left = j + 1;
            else right = j;
        }
        return -1;  // not reached when k is valid
    }
};
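For comparison, the same selection can be sketched with the standard library algorithm `std::nth_element`, which rearranges the range so that the element at the chosen position is the one that would be there if the range were sorted. This is an alternative illustration rather than one of the two solutions above; the function name `kth_largest` is a hypothetical name chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Returns the k-th largest element (1-based) of nums.
// Assumes 1 <= k <= nums.size(). The vector is taken by value,
// so the caller's data is left unchanged.
int kth_largest(std::vector<int> nums, int k) {
    // With std::greater, the range is ordered as if sorted decreasingly,
    // so position k - 1 holds the k-th largest element.
    std::nth_element(nums.begin(), nums.begin() + (k - 1), nums.end(),
                     std::greater<int>());
    return nums[k - 1];
}
```

Calling `kth_largest({3,2,1,5,6,4}, 2)` gives 5 and `kth_largest({3,2,3,1,2,4,5,5,6}, 4)` gives 4, matching Example 1 and Example 2.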

Question 15. Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right of the array. You can only see the k numbers in the window. Each time the sliding window moves right by one position. Assume that k is always valid, i.e., k ≥ 1 and k is less than or equal to the size of the input array, for non-empty arrays.
Write an algorithm that returns the maximum of the sliding window.
Example:

Input: nums = [1,3,-1,-3,5,3,6,7], and k = 3
Output: [3,3,5,5,6,7]

Explanation:

Window Position                Max
[1  3  -1] -3  5  3  6  7       3
 1 [3  -1  -3] 5  3  6  7       3
 1  3 [-1  -3  5] 3  6  7       5
 1  3  -1 [-3  5  3] 6  7       5
 1  3  -1  -3 [5  3  6] 7       6
 1  3  -1  -3  5 [3  6  7]      7

Answer: Use a deque (double-ended queue) data structure as follows (sample code in C++):

#include <deque>
#include <vector>
using std::deque;
using std::vector;

class Solution {
public:
    vector<int> maxSlidingWindow(vector<int>& nums, int k) {
        int n = nums.size();
        vector<int> res;
        if (n == 0) return res;
        if (k == 1) return nums;
        // The deque stores indices whose values are in decreasing order,
        // so the front always indexes the maximum of the current window.
        deque<int> myDeque;
        for (int i = 0; i < n; ++i) {
            // Drop the front index once it falls out of the window.
            if (!myDeque.empty() && i - myDeque.front() == k) myDeque.pop_front();
            // Drop indices whose values can no longer be a window maximum.
            while (!myDeque.empty() && nums[i] >= nums[myDeque.back()]) {
                myDeque.pop_back();
            }
            myDeque.push_back(i);
            if (i >= k - 1) res.push_back(nums[myDeque.front()]);
        }
        return res;
    }
};
Chapter 2

Questions

1. Mathematics, calculus, differential equations.

2. Covariance and correlation matrices. Linear algebra.

3. Financial instruments: options, bonds, swaps, forwards, futures.

4. C++. Data structures.

5. Monte Carlo simulations. Numerical methods.

6. Probability. Stochastic calculus.

7. Brainteasers.

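2.1 Mathematics, calculus, differential equations.

1. What is the value of $i^i$, where $i = \sqrt{-1}$?

2. Which number is larger, $\pi^e$ or $e^\pi$?

3. Show that
$$\frac{e^x + e^y}{2} \geq e^{\frac{x+y}{2}}, \quad \forall\, x, y \in \mathbb{R}.$$

4. Solve $x^6 = 64$.

5. What is the derivative of $x^x$?

6. Calculate
$$\sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}}.$$

7. Find $x$ such that
$$x^{x^{x^{\cdots}}} = 2.$$

8. Which of the following series converge:
$$\sum_{k=1}^{\infty} \frac{1}{k^2}; \quad \sum_{k=1}^{\infty} \frac{1}{k}; \quad \sum_{k=2}^{\infty} \frac{1}{k\ln(k)}?$$

9. Compute
$$\int \frac{1}{1+x^2}\,dx.$$

10. Compute
$$\int x\ln(x)\,dx \quad \text{and} \quad \int x e^x\,dx.$$

11. Compute
$$\int x^n \ln(x)\,dx.$$

12. Compute
$$\int (\ln(x))^n\,dx.$$

13. Solve the ODE
$$y'' - 4y' + 4y = 1.$$

14. Find $f(x)$ such that
$$f'(x) = f(x)(1 - f(x)).$$

15. Derive the Black-Scholes PDE.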



2.2 Covariance and correlation matrices. Linear algebra.

1. Show that any covariance matrix is symmetric positive semidefinite. Show that the same is true for correlation matrices.

2. Find the correlation matrix of three random variables with covariance matrix
$$\Sigma = \begin{pmatrix} 1 & 0.36 & -1.44 \\ 0.36 & 4 & 0.80 \\ -1.44 & 0.80 & 9 \end{pmatrix}.$$

3. Assume that all the entries of an n × n correlation matrix which are not on the main diagonal are equal to ρ. Find upper and lower bounds on the possible values of ρ.

4. How many eigenvalues does an n × n matrix with real entries have? How many eigenvectors?

5. Let
2a
A ee & 5 i)

(i) Find a 2 × 2 matrix M such that $M^2 = A$;

(ii) Find a 2 × 2 matrix M such that $A = MM^t$.

6. The 2 × 2 matrix A has eigenvalues 2 and −3 with


1 —1
corresponding eigenvectors (9 )and ( 3 i If

v= (fs tindAu,

7. Let A and B be square matrices of the same size. Show that the traces of the matrices AB and BA are equal.

8. Can you find n × n matrices A and B such that
$$AB - BA = I_n,$$
where $I_n$ is the identity matrix of size n?

9. A probability matrix is a matrix with nonnegative entries such that the sum of the entries in each row of the matrix is equal to 1. Show that the product of two probability matrices is a probability matrix.

10. Find all the values of ρ such that
$$\begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix}$$
is a correlation matrix.¹

¹A solution to this question was given in Chapter 1 using Sylvester's criterion; two different solutions will be given herein.

2.3 Financial instruments: options, bonds, swaps, forwards, futures.

1. The prices of three put options with strikes 40, 50, and 70, but otherwise identical, are $10, $20, and $30, respectively. Is there an arbitrage opportunity present? If yes, how can you make a riskless profit?

2. The price of a stock is $50. In three months, it will either be $47 or $52, with 50% probability. How much would you pay for an at-the-money put? Assume for simplicity that the stock pays no dividends and that interest rates are zero.

3. A stock worth $50 today will be worth either $60 or $40 in three months, with equal probability. The value of a three months at-the-money put on this stock is $4. Does the value of the three months ATM put increase or decrease, and by how much, if the probability of the stock going up to $60 were 75% and the probability of the stock going down to $40 were 25%?

4. What is risk-neutral pricing?

5. Describe briefly how you arrive at the Black-Scholes formula.

6. How much should a three months at-the-money put on an asset with spot price $40 and volatility 30% be worth? Assume, for simplicity, that interest rates are zero and that the asset does not pay dividends.

7. If the price of a stock doubles in one day, by how much will the value of a call option on this stock change?

8. What are the smallest and largest values that Delta can take?

9. What is the Delta of an at-the-money call? What is the Delta of an at-the-money put?

10. What is the Put-Call parity? How do you prove it?

11. Show that the time value of a European call option is highest at-the-money.

12. What is implied volatility? What is a volatility smile? How about a volatility skew?

13. What is the Gamma of an option? Why is it preferable to have small Gamma? Why is the Gamma of plain vanilla options positive?

14. When are a European call and a European put worth the same? (The options are written on the same asset and have the same strike and maturity.) What is the intuition behind this result?

15. What is the two year volatility of an asset with 30% six months volatility?

16. How do you value an interest rate swap?

17. By how much will the price of a ten year zero coupon bond change if the yield increases by ten basis points?

18. A five year bond with 3.5 years duration is worth 102. What is the value of the bond if the yield decreases by fifty basis points?

19. What is a forward contract? What is the forward price?

20. What is the forward price for treasury futures contracts? What is the forward price for commodities futures contracts?

21. What is a Eurodollar futures contract?

22. What are the most important differences between forward contracts and futures contracts?

23. What is the ten-day 99% VaR of a portfolio with a five-day 98% VaR of $10 million?

24. Put options with strikes 30 and 20 on the same underlying asset and with the same maturity are trading for $6 and $4, respectively. Can you find an arbitrage?

25. I sell a one month put option with 28% implied volatility today and I hedge my position “continuously” until maturity. In one month, I calculate that the realized volatility of the underlying asset was 16%. Did I make money or did I lose money?

26. Consider the following option replication strategy for a call option with strike 30 on an underlying asset with spot price $25: If the price of the asset goes above $30, buy one unit of the asset for $30 and hold it while the price is above $30. If the price of the asset goes back below $30, sell the one unit of the asset for $30. Thus, at maturity, you will either hold no position, if the price of the asset is below the strike price 30, or you will have one unit of the asset which you bought for $30, corresponding to the payoff of a call option with strike $30. Seemingly, you replicated the call option at no cost. What is wrong with this argument?
2.4 C++. Data structures.

1. How do you declare an array?

2. How do you get the address of a variable?

3. How do you declare an array of pointers?

4. How do you declare a const pointer, a pointer to a const and a const pointer to a const?

5. How do you declare a dynamic array?

6. What is the general form for a function signature?

7. How do you pass-by-reference?

8. How do you pass a read only argument by reference?

9. What are the important differences between using a pointer and a reference?

10. How do you set a default value for a parameter?

11. How do you create a template function?

12. How do you declare a pointer to a function?

13. How do you prevent the compiler from doing an implicit conversion with your class?

14. Describe all the uses of the keyword static in C++.

15. Can a static member function be const?

16. C++ constructors support the initialization of member data via an initializer list. When is this preferable to initialization inside the body of the constructor?

17. What is a copy constructor, and how can the default copy constructor cause problems when you have pointer data members?

18. What is the output of the following code:

#include <iostream>
using namespace std;

class A
{
public:
    int * ptr;
    ~A()
    {
        delete(ptr);
    }
};

void foo(A object_input)
{
}

int main()
{
    A aa;
    aa.ptr = new int(2);
    foo(aa);
    cout<<(*aa.ptr)<<endl;
    return 0;
}

19. How do you overload an operator?

20. What are smart pointers?

21. What is encapsulation?

22. What is polymorphism?

23. What is inheritance?

24. What is a virtual function? What is a pure virtual function and when do you use it?

25. Why are virtual functions used for destructors? Can they be used for constructors?

26. Write a function that computes the factorial of a positive integer.

27. Write a function that takes an array and returns the subarray with the largest sum.

28. Write a function that returns the prime factors of a positive integer.

29. Write a function that takes a 64-bit integer and swaps the bits at indices i and j.

30. Write a function that reverses a single linked list.

31. Write a function that takes a string and returns true if its parentheses are balanced.

32. Write a function that returns the height of an arbitrary binary tree.

33. Write a C++ function that computes the n-th Fibonacci number.

34. Implement a basic calculator to evaluate a simple expression string. The expression string may contain open parentheses "(" and closing parentheses ")", the plus sign "+" or the minus sign "-", non-negative integers and empty spaces.
Note: You may assume that the given expression is always valid. Do not use the "eval" built-in library function.

Example 1:

Input: "1 + 1"
Output: 2

Example 2:

Input: " 2-1 + 2 "
Output: 3

Example 3:

Input: "(1+(4+5+2)-3)+(6+8)"
Output: 23

2.5 Monte Carlo simulations. Numerical methods.

1. How would you compute π using Monte Carlo simulations? What is the standard deviation of this method?

2. What methods do you know for generating independent samples of the standard normal distribution?

3. How do you generate a geometric Brownian motion stock path using random numbers from a normal distribution?

4. How do you generate a sample of the standard normal distribution from 12 independent samples of the uniform distribution on [0, 1]?

5. What is the rate of convergence for Monte Carlo methods?

6. What variance reduction techniques do you know?

7. How do you generate samples of normal random variables with correlation ρ?

8. What is the order of convergence of Newton's method?

9. Which finite difference method corresponds to trinomial trees?

10. What is the relationship between the LU and Cholesky decompositions?

11. (i) Which matrices have an LU decomposition without pivoting?

(ii) Does a symmetric positive definite matrix have an LU decomposition without pivoting?
2.6 Probability. Stochastic calculus.

1. What is the exponential distribution? What are the mean and the variance of the exponential distribution?

2. If X and Y are independent exponential random variables with mean 6 and 8, respectively, what is the probability that Y is greater than X?

3. What are the expected value and the variance of the Poisson distribution?

4. A point is chosen uniformly from the unit disk. What is the expected value of the distance between the point and the center of the disk?

. Consider two random variables X and Y with mean


0 and variance 1, and with joint normal distribu-
tion. If cov(X,Y) = wet what is the conditional
probability P(X > O0|Y < 0)?

6. If X and Y are lognormal random variables, is their product XY lognormally distributed?

7. Let X be a normal random variable with mean $\mu$ and variance $\sigma^2$, and let $\Phi$ be the cumulative distribution function of the standard normal distribution. Find the expected value of $Y = \Phi(X)$.

8. What is the law of large numbers?

9. What is the central limit theorem?

10. What is a martingale? How is it related to option pricing?

11. Explain the assumption $(dW_t)^2 = dt$ used in the informal derivation of Itô's Lemma.

12. If $W_t$ is a Wiener process, find $E[W_t W_s]$.

13. If $W_t$ is a Wiener process, what is $\mathrm{var}(W_t + W_s)$?

14. Let W; be a Wiener process. Find


t nt
‘ W, dW, and FE / Ws aw,|:
0 )

. Find the distribution of the random variable


1
eau W,dW,.
0

16. Let W; be a Wiener process. Find the mean and the


variance of
t

A Ww2dw,.
0

17. If W; is a Wiener process, find the variance of

l Ww?
X= [ vie™am,.
JO

18. If W; is a Wiener process, what is FP [e™ |?

19. If W; is a Wiener process, find the variance of


7; «

/ s dW.
Jo

20. Let W; be a Wiener process, and let

t
3G. = W,dr.
0

What is the distribution of $X_t$? Is $X_t$ a martingale?²

21. What is an Itô process?

22. What is Itô's lemma?

23. If W; is a Wiener process, is the process X; = W/


a martingale?

²A solution to this question was given in Chapter 1 using integration by parts; a different solution will be given herein.

24. If $W_t$ is a Wiener process, is the process
$$N_t = W_t^3 - 3tW_t$$
a martingale?

25. What is Girsanov's theorem?

26. What is the martingale representation theorem, and how is it related to option pricing and hedging?

27. Solve $dY_t = Y_t\,dW_t$, where $W_t$ is a Wiener process.

28. Solve the following SDEs:

(i) $dY_t = \mu Y_t\,dt + \sigma Y_t\,dW_t$;

(ii) $dX_t = \mu\,dt + (aX_t + b)\,dW_t$.

29. What is the Heston model?

30. Show that the probability density function of the


standard normal integrates to 1.

31. Let $W_t = (X_t, Y_t)$ be a two dimensional Brownian motion starting at $(x, y)$, i.e., $X_t$ and $Y_t$ are independent one dimensional Brownian motions with $X_0 = x$ and $Y_0 = y$.

(i) Find the probability that the Brownian motion $W_t$ reaches the y-axis before reaching the x-axis.

(ii) Let $0 < r_1 < r_2$ such that $r_1 < \sqrt{x^2 + y^2} < r_2$. Find the probability that $W_t$ enters the inner circle of center 0 and radius $r_1$ before leaving the outer circle of center 0 and radius $r_2$.

32. Let By, t > 0, be a standard Brownian motion in


the probability measure P. Determine the proba-
bility density function of B 3 under the probability
measure P defined by the Radon—Nikodym deriva-
tive
dP By-4
des wees a

33. Let B: = (B}, B?) be a two dimensional Brownian


motion in the (a,y) plane. Let a > 0 and denote by
7 the first time B; hits the line y = a. Determine
the probability distribution of B?.
2.7 Brainteasers.

1. A flea is going between two points which are 100 inches apart by jumping (always in the same direction) either one inch or two inches at a time. How many different paths can the flea travel by?

2. I have a bag containing three pancakes: one golden on both sides, one burnt on both sides, and one golden on one side and burnt on the other. You shake the bag, draw a pancake at random, look at one side, and notice that it is golden. What is the probability that the other side is golden?

3. Alice and Bob are playing heads and tails. Alice tosses n + 1 coins, Bob tosses n coins. The coins are fair. What is the probability that Alice will have strictly more heads than Bob?

4. Alice is in a restaurant trying to decide between three desserts. How can she choose one of three desserts with equal probability with the help of a fair coin? What if the coin is biased and the bias is unknown?

5. What is the expected number of times you must flip a fair coin until it lands on heads? What if the coin is biased and lands on heads with probability p?

6. What is the expected number of coin tosses of a fair coin in order to get two heads in a row? What if the coin is biased with 25% probability of getting heads?

7. A fair coin is tossed n times. What is the probability that no two consecutive heads appear?

8. You have two identical Fabergé eggs, either of which would break if dropped from the top of a building with 100 floors. Your task is to determine the highest floor from which an egg could be dropped without breaking. What is the minimum number of drops required to achieve this? You are allowed to break both eggs in the process.

9. An ant is in the corner of a 10 × 10 × 10 room and wants to go to the opposite corner. What is the length of the shortest path the ant can take?

10. A 10 × 10 × 10 cube is made of 1,000 unit cubes. How many unit cubes can you see on the outside?

11. Fox Mulder is imprisoned by aliens in a large circular field surrounded by a fence. Outside the fence is a vicious alien that can run four times as fast as Mulder, but is constrained to stay near the fence. If Mulder can contrive to get to an unguarded point on the fence, he can quickly scale the fence and escape. Can he get to a point on the fence ahead of the alien?

12. At your subway station, you notice that of the two trains running in opposite directions which are supposed to arrive with the same frequency, the train going in one direction comes first 80% of the time, while the train going in the opposite direction comes first only 20% of the time. What do you think could be happening?

13. You start off with one amoeba. Every minute, this amoeba can either die, do nothing, split into two amoebas, or split into three amoebas; all these scenarios being equally likely to happen. All further amoebas behave the same way. What is the probability that the amoebas eventually die off?

14. Given a set X with n elements, choose two subsets A and B at random. What is the probability of A being a subset of B?

15. Alice writes two distinct real numbers between 0 and 1 on two sheets of paper. Bob selects one of the sheets randomly to inspect it. He then has to declare whether the number he sees is the bigger or smaller of the two. Is there any way Bob can expect to be correct more than half the times Alice plays this game with him?

16. How many digits does the number $125^{100}$ have? You are not allowed to use values of $\log_{10} 2$ or $\log_{10} 5$.

17. For every subset of {1, 2, 3, ..., 2013}, arrange the numbers in the increasing order and take the sum with alternating signs. The resulting integer is called the weight³ of the subset. Find the sum of the weights of all the subsets of {1, 2, 3, ..., 2013}.

³For example, the weight of the subset {3} is 3. The weight of the subset {2, 5, 8} is 2 − 5 + 8 = 5.

18. Alice and Bob alternately choose one number from one of the following nine numbers: 1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8, 16, without replacement. Whoever gets three numbers that multiply to one wins the game. Alice starts first. What should her strategy be? Can she always win?

19. Mr. and Mrs. Jones invite four other couples over for a party. At the end of the party, Mr. Jones asks everyone else how many people they shook hands with, and finds that everyone gives a different answer. Of course, no one shook hands with his or her spouse and no one shook the same person's hand twice. How many people did Mrs. Jones shake hands with?

20. The New York Yankees and the San Francisco Giants are playing in the World Series (best of seven format). You would like to bet $100 on the Yankees winning the World Series, but you can only place bets on individual games, and every time at even odds. How much should you bet on the first game?

21. We have two red, two green and two yellow balls. For each color, one ball is heavy and the other is light. All heavy balls weigh the same. All light balls weigh the same. How many weighings on a scale are necessary to identify the three heavy balls?

22. There is a row of 10 rooms and a treasure in one of them. Each night, a ghost moves the treasure to an adjacent room. You are trying to find the treasure, but can only check one room per day. How do you find it?

23. How many comparisons do you need to find the maximum in a set of n distinct numbers? How many comparisons do you need to find both the maximum and minimum in a set of n distinct numbers?

24. Given a cube, you can jump from one vertex to a neighboring vertex with equal probability. Assume you start from a certain vertex (does not matter which one). What is the expected number of jumps to reach the opposite vertex?

25. Select numbers uniformly distributed between 0 and 1, one after the other, as long as they keep decreasing; i.e., stop selecting when you obtain a number that is greater than the previous one you selected.

(i) On average, how many numbers have you selected?

(ii) What is the average value of the smallest number you have selected?

26. To organize a charity event that costs $100K, an organization raises funds. Independent of each other, one donor after another donates some amount of money that is exponentially distributed with a mean of $20K. The process is stopped as soon as $100K or more has been collected. Find the distribution, mean, and variance of the number of donors needed until at least $100K has been collected.

27. Consider a random walk starting at 1 and with equal probability of moving to the left or to the right by one unit, and stopping either at 0 or at 3.

(i) What is the expected number of steps to do so?

(ii) What is the probability of the random walk ending at 3 rather than at 0?

28. A stick of length 1 drops and breaks at a random place uniformly distributed across the length. What is the expected length of the smaller part?

29. You are given a stick of unit length.

(i) The stick drops and breaks at two places. What is the probability that the three pieces could form a triangle?

(ii) The stick drops and breaks at one place. Then the larger piece is taken and dropped again, breaking at one place. What is the probability that the three pieces could form a triangle?

30. Why is a manhole cover round?

31. When is the first time after 12 o'clock that the hour and minute hands of a clock meet again?

32. Three light switches are in one room, and they control three light bulbs in another room. How do you figure out which switch turns on which bulb in one shot?

33. The number $2^{29}$ has 9 digits, all different. Without computing $2^{29}$, find the missing digit.

34. Alice and Bob stand at opposite ends of a straight line segment. Bob sends 50 ants towards Alice, one after another. Alice sends 20 ants towards Bob. All ants travel along the straight line segment. Whenever two ants collide, they simply bounce back and start traveling in the opposite direction. How many ants reach Bob and how many ants reach Alice? How many ant collisions take place?

35. There are 20 people at a party. Everyone writes down their name on a piece of paper and throws it in a bag. We shake up the bag and each person draws one name from the bag. You are in the same group as the person you have drawn.
For example, if people labeled 1 through 20 drew the following names from the bag:

Die Be eS eae ies) )Ong i eee O


Lee ol. Sat) abel ys aa ee
6d Sse) 1h AO On 3 to at2
Lie 1S 1A 1S S161 18 eG
Fe Sar a ee set Ns ee
taeales 2 e201 los fe tO” ot
the groups that form are

(1, 6, 10, 12, 18, 19), (2, 5, 11, 14, 20), (3), (4, 8, 13), (7, 9, 15, 17), (16).

What is the expected number of groups?

36. Let A be the sum of the digits of 2019!. Let B be the sum of the digits of A. Let C be the sum of the digits of B. Find C.

37. Find 2019 consecutive positive integers that are not prime.

38. Exactly 4 out of 100 coins are fake. All the genuine coins weigh the same; all the fake coins, too. A fake coin is lighter than a genuine coin. How can you find at least one genuine coin using a balance scale only twice?

39. Can you design a pair of 6-sided non-identical fair dice different from the standard dice with each face bearing a positive integer and having the same probability distribution for the sum as the pair of standard dice? (In other words, there must be two ways to roll a 3, six ways to roll a 7, one way to roll a 12, and so forth.)
Chapter 3

Solutions

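3.1 Mathematics, calculus, differential equations.

Question 1. What is the value of $i^i$, where $i = \sqrt{-1}$?

Answer: Recall that $e^{i\theta} = \cos\theta + i\sin\theta$. Then,

$$i = \cos\frac{\pi}{2} + i\sin\frac{\pi}{2} = e^{i\frac{\pi}{2}},$$

and therefore

$$i^i = \left(e^{i\frac{\pi}{2}}\right)^i = e^{i^2\frac{\pi}{2}} = e^{-\frac{\pi}{2}},$$

since $i^2 = -1$. □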
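The value $e^{-\pi/2} \approx 0.2079$ can be confirmed numerically. The sketch below relies on `std::pow` for complex arguments, which returns the principal value; the function name `i_to_the_i` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Real part of the principal value of i^i, computed with complex arithmetic.
// The imaginary part of the principal value is zero up to rounding.
double i_to_the_i() {
    const std::complex<double> i(0.0, 1.0);
    const std::complex<double> value = std::pow(i, i);
    return value.real();
}
```

Comparing `i_to_the_i()` with `std::exp(-std::acos(-1.0) / 2.0)` should show agreement to within floating point rounding.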

Question 2. Which number is larger, $\pi^e$ or $e^\pi$?

Answer: We will show that $\pi^e < e^\pi$. By taking the natural logarithm, we find that

$$\pi^e < e^\pi \iff \ln(\pi^e) < \ln(e^\pi) \iff e\ln(\pi) < \pi \iff \frac{\ln(\pi)}{\pi} < \frac{1}{e}, \qquad (3.1)$$

which can be written as

$$\frac{\ln(\pi)}{\pi} < \frac{\ln(e)}{e}.$$

Let $f : (0,\infty) \to \mathbb{R}$ given by $f(x) = \frac{\ln(x)}{x}$. Then,

$$f'(x) = \frac{1 - \ln(x)}{x^2}.$$

Note that $f'(x) = 0$ has one solution, $x = e$. Also, $f'(x) > 0$ for $0 < x < e$, and $f'(x) < 0$ for $x > e$, and therefore $f(x)$ is increasing on the interval $(0, e)$ and is decreasing on the interval $(e, \infty)$.
Thus, the function $f(x) = \frac{\ln(x)}{x}$ has a global maximum point at $x = e$, i.e., $f(x) < f(e) = \frac{1}{e}$ for all $x > 0$ with $x \neq e$, and therefore

$$f(\pi) = \frac{\ln(\pi)}{\pi} < \frac{1}{e},$$

which is equivalent to $\pi^e < e^\pi$; cf. (3.1). □

Question 3. Show that

$$\frac{e^x + e^y}{2} \geq e^{\frac{x+y}{2}}, \quad \forall\, x, y \in \mathbb{R}. \qquad (3.2)$$

Answer: Let $e^x = a$ and $e^y = b$. Note that $a, b > 0$, and that

$$e^{\frac{x+y}{2}} = \sqrt{e^{x+y}} = \sqrt{e^x e^y} = \sqrt{ab}.$$

Then, (3.2) can be written as

$$\frac{a+b}{2} \geq \sqrt{ab} \iff a + b - 2\sqrt{ab} \geq 0 \iff \left(\sqrt{a} - \sqrt{b}\right)^2 \geq 0,$$

which is what we wanted to show. □

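Question 4. Solve $x^6 = 64$.

Answer: Recall that the six unit roots of $z^6 = 1$ are

$$z_k = \exp\left(\frac{2k\pi i}{6}\right) = \exp\left(\frac{k\pi i}{3}\right) = \cos\left(\frac{k\pi}{3}\right) + i\sin\left(\frac{k\pi}{3}\right), \qquad (3.3)$$

for $k = 0 : 5$. Since $\sqrt[6]{64} = 2$, we obtain from (3.3) that the solutions of $x^6 = 64$ are

$$x_k = 2\cos\left(\frac{k\pi}{3}\right) + 2i\sin\left(\frac{k\pi}{3}\right), \quad \forall\, k = 0 : 5. \quad \Box$$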
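The six solutions can be generated and verified numerically with complex numbers in polar form. This is a minimal sketch; the function name `sixth_roots_of_64` is a hypothetical name introduced for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Returns the six complex solutions of x^6 = 64, i.e., the numbers
// with modulus 2 and argument k * pi / 3 for k = 0, ..., 5.
std::vector<std::complex<double>> sixth_roots_of_64() {
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> roots;
    for (int k = 0; k < 6; ++k) {
        roots.push_back(std::polar(2.0, k * pi / 3.0));
    }
    return roots;
}
```

Raising each returned root to the sixth power with `std::pow` should give 64 up to rounding error; the roots for k = 0 and k = 3 are the two real solutions, 2 and -2.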

Question 5. What is the derivative of $x^x$?

Answer: Note that

$$x^x = e^{\ln(x^x)} = e^{x\ln(x)}. \qquad (3.4)$$

Using Chain Rule and (3.4), we find that

$$(x^x)' = \left(e^{x\ln(x)}\right)' = e^{x\ln(x)}\,(x\ln(x))' = x^x(\ln(x) + 1). \quad \Box$$

Question 6. Calculate

$$\sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}}. \qquad (3.5)$$

Answer: Assume that the limit from (3.5) exists, and denote that limit by $l$. Then, $l = \sqrt{2 + l}$, which can be written as

$$l^2 = 2 + l \iff l^2 - l - 2 = 0 \iff (l - 2)(l + 1) = 0.$$

Since $l > 0$, we obtain that $l = 2$, i.e.,

$$\sqrt{2 + \sqrt{2 + \sqrt{2 + \cdots}}} = 2. \qquad (3.6)$$

Thus, proving (3.6) is equivalent to showing that the sequence $(x_n)_{n \geq 0}$ given by $x_0 = \sqrt{2}$ and

$$x_{n+1} = \sqrt{2 + x_n}, \quad \forall\, n \geq 0,$$

is convergent.
We can see by induction that the sequence $(x_n)_{n \geq 0}$ is bounded from above by 2, since $x_0 = \sqrt{2} < 2$, and, if we assume that $x_n < 2$, then

$$x_{n+1} = \sqrt{2 + x_n} < \sqrt{4} = 2.$$

Moreover, the sequence $(x_n)_{n \geq 0}$ is increasing, since

$$x_n < x_{n+1} \iff x_n < \sqrt{2 + x_n} \iff x_n^2 < 2 + x_n \iff (x_n - 2)(x_n + 1) < 0,$$

which holds true since $x_n > 0$, and since, as shown above, $x_n < 2$ for all $n \geq 0$.
Thus, the sequence $(x_n)_{n \geq 0}$ is convergent since it is increasing and bounded from above, which is what we needed to show in order to prove (3.6). □
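The convergence of the recursion $x_{n+1} = \sqrt{2 + x_n}$ can be observed numerically. The sketch below iterates the recursion a given number of times; `nested_sqrt` is a hypothetical function name.

```cpp
#include <cassert>
#include <cmath>

// Returns x_n, where x_0 = sqrt(2) and x_{k+1} = sqrt(2 + x_k).
double nested_sqrt(int n) {
    double x = std::sqrt(2.0);
    for (int k = 0; k < n; ++k) {
        x = std::sqrt(2.0 + x);
    }
    return x;
}
```

The iterates increase and stay at or below 2; after a few dozen iterations the value agrees with 2 to machine precision, since the distance to the limit shrinks roughly by a factor of four at each step.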

Question 7. Find $x$ such that

$$x^{x^{x^{\cdots}}} = 2. \qquad (3.7)$$

Answer: If $x$ exists such that (3.7) holds true, then

$$x^{x^{x^{\cdots}}} = x^{\left(x^{x^{\cdots}}\right)} = x^2 = 2,$$

and therefore the only possible solution to (3.7) is $x = \sqrt{2}$. We prove that $x = \sqrt{2}$ is, indeed, the solution to (3.7), by showing that

$$\sqrt{2}^{\sqrt{2}^{\sqrt{2}^{\cdots}}} = 2. \qquad (3.8)$$

Consider the sequence $(x_n)_{n \geq 0}$ with $x_0 = \sqrt{2}$ and satisfying the following recursion:

$$x_{n+1} = \left(\sqrt{2}\right)^{x_n} = 2^{x_n/2}, \quad \forall\, n \geq 0. \qquad (3.9)$$

We can see by induction that $(x_n)_{n \geq 0}$ is an increasing sequence, since $x_0 = \sqrt{2} < \sqrt{2}^{\sqrt{2}} = x_1$, and, if we assume that $x_{n-1} < x_n$, then

$$x_n = 2^{x_{n-1}/2} < 2^{x_n/2} = x_{n+1}.$$

Also by induction, we can prove that the sequence $(x_n)_{n \geq 0}$ is bounded from above by 2, since $x_0 = \sqrt{2} < 2$, and, if $x_n < 2$, then

$$x_{n+1} = 2^{x_n/2} < 2^1 = 2.$$

Thus, the sequence $(x_n)_{n \geq 0}$ is convergent since it is increasing and bounded from above.
Let $l = \lim_{n \to \infty} x_n$. From (3.9), we find that $l = 2^{l/2}$, which is equivalent to

$$l^{1/l} = 2^{1/2}. \qquad (3.10)$$

The function $f : (0,\infty) \to (0,\infty)$ given by

$$f(t) = t^{1/t} = \exp\left(\ln\left(t^{1/t}\right)\right) = \exp\left(\frac{\ln(t)}{t}\right)$$

is increasing for $t < e$ and decreasing for $t > e$, since

$$f'(t) = \frac{1 - \ln(t)}{t^2}\exp\left(\frac{\ln(t)}{t}\right)$$

and $f'(t) > 0$ for $t < e$ and $f'(t) < 0$ for $t > e$.
Thus, there are two values of $l$ such that (3.10) is satisfied, i.e., such that $l^{1/l} = 2^{1/2}$, one value being equal to 2, and the other one greater than $e$. Since we showed that $x_n < 2$ for all $n \geq 0$, we conclude that $l = \lim_{n \to \infty} x_n = 2$, which is what we needed to show to complete the proof of (3.8). □
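The recursion $x_{n+1} = (\sqrt{2})^{x_n}$ can also be iterated numerically to see the limit 2 emerge. This is a minimal sketch with a hypothetical function name `power_tower`.

```cpp
#include <cassert>
#include <cmath>

// Returns x_n, where x_0 = sqrt(2) and x_{k+1} = sqrt(2)^(x_k).
double power_tower(int n) {
    const double base = std::sqrt(2.0);
    double x = base;
    for (int k = 0; k < n; ++k) {
        x = std::pow(base, x);
    }
    return x;
}
```

Convergence here is slower than for the nested square root: near the limit the error shrinks by a factor of about ln(2) per step, so on the order of a hundred iterations are needed before the value is close to 2 at double precision.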

Question 8. Which of the following series converge:

en ee ee ee |
Lay
k=
apes
ko
sex aia,
k=2

Answer: We show that

ae
2.RB is convergent;
k=
co 1 co 1 }

Ee and ), Ent are divergent.

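Since all the terms of the series $\sum_{k=1}^{\infty} \frac{1}{k^2}$ are positive, it is enough to show that the partial sums $\sum_{k=1}^{n} \frac{1}{k^2}$ are uniformly bounded, in order to conclude that the series is convergent. This can be seen as follows:

$$\sum_{k=1}^{n} \frac{1}{k^2} = 1 + \sum_{k=2}^{n} \frac{1}{k^2} \leq 1 + \sum_{k=2}^{n} \frac{1}{k(k-1)} = 1 + \sum_{k=2}^{n} \left( \frac{1}{k-1} - \frac{1}{k} \right) = 1 + 1 - \frac{1}{n} < 2.$$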
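
To show that the series $\sum_{k=1}^{\infty} \frac{1}{k}$ is divergent, we will prove that

$$\sum_{k=1}^{n} \frac{1}{k} \geq \ln(n) + \frac{1}{n}, \quad \forall\, n \geq 1. \tag{3.11}$$

Since $\frac{1}{x}$ is a decreasing function, it follows that

$$\int_k^{k+1} \frac{1}{x}\, dx \leq \frac{1}{k}, \quad \forall\, k \geq 1.$$

Then,

$$\ln(n) = \int_1^n \frac{1}{x}\, dx = \sum_{k=1}^{n-1} \int_k^{k+1} \frac{1}{x}\, dx \leq \sum_{k=1}^{n-1} \frac{1}{k}. \tag{3.12}$$

From (3.12), we find that

$$\sum_{k=1}^{n} \frac{1}{k} = \frac{1}{n} + \sum_{k=1}^{n-1} \frac{1}{k} \geq \ln(n) + \frac{1}{n},$$

which is what we wanted to prove; see (3.11).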
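Similarly, note that

$$\frac{1}{k \ln(k)} > \int_k^{k+1} \frac{1}{x \ln(x)}\, dx, \quad \forall\, k \geq 2,$$

and therefore

$$\int_2^{n+1} \frac{1}{x \ln(x)}\, dx = \sum_{k=2}^{n} \int_k^{k+1} \frac{1}{x \ln(x)}\, dx < \sum_{k=2}^{n} \frac{1}{k \ln(k)}. \tag{3.13}$$

Since

$$\int_2^{n+1} \frac{1}{x \ln(x)}\, dx = \ln(\ln(n+1)) - \ln(\ln(2)),$$

we obtain from (3.13) that

$$\sum_{k=2}^{n} \frac{1}{k \ln(k)} > \ln(\ln(n+1)) - \ln(\ln(2)),$$

and we conclude that the series $\sum_{k=2}^{\infty} \frac{1}{k \ln(k)}$ is divergent.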


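Note: Although not needed to answer this question, it can be shown that

$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$$

and

$$\lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \ln(n) \right) = \gamma,$$

where $\gamma \approx 0.57721$ is Euler's constant. □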
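
The bounds used in this answer can be checked numerically. The sketch below is only an illustration under the assumption that plain floating-point summation is accurate enough for these values of $n$; the helper name `partial_sums` is hypothetical.

```python
import math

def partial_sums(n):
    """Return the n-th partial sums of sum 1/k^2 and sum 1/k (k >= 1), and of sum 1/(k ln k) (k >= 2)."""
    s_sq = sum(1.0 / (k * k) for k in range(1, n + 1))
    s_harm = sum(1.0 / k for k in range(1, n + 1))
    s_log = sum(1.0 / (k * math.log(k)) for k in range(2, n + 1))
    return s_sq, s_harm, s_log

for n in (10, 1000, 100000):
    s_sq, s_harm, s_log = partial_sums(n)
    # s_sq stays below 2; s_harm - ln(n) settles near Euler's constant;
    # s_log stays above ln(ln(n+1)) - ln(ln(2)) and keeps growing, very slowly.
    print(n, s_sq, s_harm - math.log(n), s_log)
```

The third series diverges so slowly that its partial sums look almost flat in such a table, which is why the integral comparison (3.13) is needed.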

Question 9. Compute

$$\int \frac{1}{1 + x^2}\, dx.$$

Answer: Use the substitution $x = \tan(z)$. Then, $dx = \frac{1}{\cos^2(z)}\, dz$ and

$$\int \frac{1}{1 + x^2}\, dx = \int \frac{1}{(1 + \tan^2(z)) \cos^2(z)}\, dz = \int \frac{1}{\cos^2(z) + \sin^2(z)}\, dz = \int 1\, dz = z + C,$$

where $C$ is a real constant, since $\cos^2(z) + \sin^2(z) = 1$ for any $z$. Solving $x = \tan(z)$ for $z$, we obtain that $z = \arctan(x)$, and therefore

$$\int \frac{1}{1 + x^2}\, dx = \arctan(x) + C. \quad \square$$

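Question 10. Compute

$$\int x \ln(x)\, dx \quad \text{and} \quad \int x e^x\, dx.$$

Answer: By integration by parts,

$$\int x \ln(x)\, dx = \frac{x^2}{2} \ln(x) - \int \frac{x^2}{2} \cdot \frac{1}{x}\, dx = \frac{x^2 \ln(x)}{2} - \frac{x^2}{4} + C;$$

$$\int x e^x\, dx = x e^x - \int e^x\, dx = x e^x - e^x + C. \quad \square$$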

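Question 11. Compute

$$\int x^n \ln(x)\, dx.$$

Answer: If $n \neq -1$, we use integration by parts and find that

$$\int x^n \ln(x)\, dx = \frac{x^{n+1}}{n+1} \ln(x) - \int \frac{x^{n+1}}{n+1} \cdot \frac{1}{x}\, dx = \frac{x^{n+1} \ln(x)}{n+1} - \frac{1}{n+1} \int x^n\, dx = \frac{x^{n+1} \ln(x)}{n+1} - \frac{x^{n+1}}{(n+1)^2} + C.$$

For $n = -1$, we obtain that

$$\int \frac{\ln(x)}{x}\, dx = \frac{(\ln(x))^2}{2} + C,$$

where $C$ is a real constant, since $\left( (\ln(x))^2 \right)' = \frac{2 \ln(x)}{x}$. □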

Question 12. Compute

$$\int (\ln(x))^n\, dx.$$

Answer: For every integer $n \geq 0$, let

$$f_n(x) = \int (\ln(x))^n\, dx.$$

By using integration by parts, we find that, for any $n \geq 1$,

$$\int (\ln(x))^n\, dx = x (\ln(x))^n - \int x \left( (\ln(x))^n \right)'\, dx = x (\ln(x))^n - \int x \cdot n (\ln(x))^{n-1} \cdot (\ln(x))'\, dx$$

$$= x (\ln(x))^n - \int x \cdot n (\ln(x))^{n-1} \cdot \frac{1}{x}\, dx = x (\ln(x))^n - n \int (\ln(x))^{n-1}\, dx,$$

and therefore

$$f_n(x) = x (\ln(x))^n - n f_{n-1}(x), \quad \forall\, n \geq 1. \tag{3.14}$$

Note that

$$f_0(x) = \int 1\, dx = x + C.$$

Thus, the recursion (3.14) can be used to find the values of $f_n(x)$ for all $n$. For example:

$$f_1(x) = x \ln(x) - f_0(x) = x (\ln(x) - 1) + C;$$

$$f_2(x) = x (\ln(x))^2 - 2 f_1(x) = x \left( (\ln(x))^2 - 2 \ln(x) + 2 \right) + C.$$

The following general formula can be obtained by induction:

$$f_n(x) = x \sum_{k=0}^{n} \frac{(-1)^{n-k}\, n!}{k!} (\ln(x))^k + C,$$

for all $n \geq 0$. □
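
As a sanity check of the recursion (3.14) and of the general formula, the sketch below compares the two and differentiates the closed form numerically. It is an illustration only: the constant $C$ is taken to be 0, and the function names and the finite-difference step are assumptions.

```python
import math

def f_closed(n, x):
    """General formula: x * sum_{k=0}^{n} (-1)^(n-k) * n!/k! * (ln x)^k, with C = 0."""
    return x * sum((-1) ** (n - k) * math.factorial(n) / math.factorial(k) * math.log(x) ** k
                   for k in range(n + 1))

def f_recursive(n, x):
    """Recursion (3.14): f_n(x) = x (ln x)^n - n f_{n-1}(x), with f_0(x) = x."""
    if n == 0:
        return x
    return x * math.log(x) ** n - n * f_recursive(n - 1, x)

def derivative(g, x, h=1e-5):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

x = 2.0
for n in range(4):
    # The derivative of f_n should reproduce the integrand (ln x)^n.
    print(n, f_closed(n, x), f_recursive(n, x),
          derivative(lambda t: f_closed(n, t), x), math.log(x) ** n)
```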

Question 13. Solve the ODE

$$y'' - 4y' + 4y = 1. \tag{3.15}$$

Answer: Note that (3.15) is a second order nonhomogeneous linear ODE with constant coefficients. The homogeneous ODE associated to (3.15) is

$$y'' - 4y' + 4y = 0, \tag{3.16}$$

whose characteristic equation, $z^2 - 4z + 4 = 0$, has a double root $z_1 = z_2 = 2$. Thus, the solution to the homogeneous ODE (3.16) is

$$y(x) = c_1 e^{2x} + c_2 x e^{2x}, \tag{3.17}$$

where $c_1$ and $c_2$ are constants.

Since the constant function $y_0(x) = \frac{1}{4}$ is a solution to the nonhomogeneous ODE (3.15), we conclude from (3.17) that the general form of the solution of (3.15) is

$$y(x) = c_1 e^{2x} + c_2 x e^{2x} + \frac{1}{4}. \quad \square$$

Question 14. Find $f(x)$ such that

$$f'(x) = f(x)(1 - f(x)). \tag{3.18}$$

Answer: Note that (3.18) is an ODE with separable variables and can be written as follows:

$$\frac{y'}{y(1 - y)} = 1, \tag{3.19}$$

where $y = f(x)$. By integrating (3.19) with respect to $x$ we obtain that

$$\int \frac{y'}{y(1 - y)}\, dx = \int 1\, dx = x + C_1, \tag{3.20}$$

where $C_1 \in \mathbb{R}$ is a real constant.


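Note that $dy = y'\, dx$, and therefore

$$\int \frac{y'}{y(1 - y)}\, dx = \int \frac{1}{y(1 - y)}\, dy = \int \left( \frac{1}{y} + \frac{1}{1 - y} \right) dy = \ln(|y|) - \ln(|1 - y|) = \ln\left( \left| \frac{y}{1 - y} \right| \right). \tag{3.21}$$

From (3.20) and (3.21), it follows that

$$\ln\left( \left| \frac{y}{1 - y} \right| \right) = x + C_1,$$

and therefore

$$\left| \frac{y}{1 - y} \right| = C_2 e^x,$$

where $C_2 = e^{C_1} > 0$ is a positive real constant.

Thus, either $\frac{y}{1 - y} = C_2 e^x$, or $\frac{y}{1 - y} = -C_2 e^x$, which can be written as

$$\frac{y}{1 - y} = C e^x, \tag{3.22}$$

where $C$ is a real constant.
From (3.22), we obtain that $y = \frac{C e^x}{1 + C e^x}$. We conclude that the ODE (3.18) has the following solution:

$$f(x) = \frac{C e^x}{1 + C e^x},$$

where $C \in \mathbb{R}$ is a fixed constant. □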
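
The solution can be checked against the ODE (3.18) numerically. The following sketch is an illustration only: it approximates $f'$ by a central difference, and the step size and the sample values of $C$ are arbitrary choices.

```python
import math

def f(x, c):
    """Candidate solution f(x) = C e^x / (1 + C e^x)."""
    return c * math.exp(x) / (1.0 + c * math.exp(x))

def ode_residual(x, c, h=1e-6):
    """Return f'(x) - f(x)(1 - f(x)), with f'(x) approximated by a central difference."""
    f_prime = (f(x + h, c) - f(x - h, c)) / (2 * h)
    return f_prime - f(x, c) * (1.0 - f(x, c))

for c in (0.5, 1.0, 3.0, -0.2):
    # The residuals should be zero up to finite-difference error.
    print(c, [ode_residual(x, c) for x in (-1.0, 0.0, 1.0)])
```

For $C = 1$ this is the standard logistic function. For negative $C$ the formula is valid only where $1 + C e^x \neq 0$.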

Question 15. Derive the Black-Scholes PDE.

Answer: Consider an asset with spot price $S$ following a lognormal distribution with drift $\mu$ and volatility $\sigma$ and paying dividends continuously at rate $q$. Then,

$$dS = (\mu - q) S\, dt + \sigma S\, dW_t,$$

where $W_t$, $t \geq 0$, is a Wiener process.
Let $V = V(S, t)$ be the value at time $t$ of a replicable non path dependent derivative security on this asset, when the underlying is priced at $S$. Set up a portfolio $\Pi$ made of a long position in the derivative security $V$ and a short position in

$$\Delta = \frac{\partial V}{\partial S} \tag{3.23}$$

units of the asset. Then,

$$\Pi = V - \Delta S.$$

Denote by $dS$, $dV$, and $d\Pi$ the changes in the values of $S$, $V$, and $\Pi$, respectively, over an infinitesimally small time period $dt$. Then,

$$d\Pi = dV - \Delta\, dS - \Delta q S\, dt, \tag{3.24}$$

where $\Delta q S\, dt$ is the dividend payment owed over the time $dt$ on the short position of $\Delta$ units of the asset. From (3.23) and (3.24), we find that

$$d\Pi = dV - \frac{\partial V}{\partial S}\, dS - q S \frac{\partial V}{\partial S}\, dt. \tag{3.25}$$

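From Itô's formula, it follows that

$$dV = \left( \frac{\partial V}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 V}{\partial S^2} \right) dt + \frac{\partial V}{\partial S}\, dS, \tag{3.26}$$

and, from (3.25) and (3.26), we obtain that

$$d\Pi = \left( \frac{\partial V}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 V}{\partial S^2} - q S \frac{\partial V}{\partial S} \right) dt, \tag{3.27}$$

which means that the value of the portfolio $\Pi$ is deterministic over the small time period $dt$. For no-arbitrage, the value of the portfolio $\Pi$ must grow at the risk-free rate over the time period $dt$, i.e., $d\Pi = r \Pi\, dt$, where $r$ denotes the risk-free rate. Thus,

$$d\Pi = r \Pi\, dt = r (V - \Delta S)\, dt = \left( r V - r S \frac{\partial V}{\partial S} \right) dt. \tag{3.28}$$

From (3.27) and (3.28), we find that

$$\frac{\partial V}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 V}{\partial S^2} - q S \frac{\partial V}{\partial S} = r V - r S \frac{\partial V}{\partial S},$$

and therefore

$$\frac{\partial V}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 V}{\partial S^2} + (r - q) S \frac{\partial V}{\partial S} - r V = 0,$$

which is the Black-Scholes PDE for $V(S, t)$. □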
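
One way to gain confidence in the PDE is to check that a known closed-form price satisfies it. The sketch below is not part of the derivation: it evaluates the standard Black-Scholes value of a European call on an asset paying continuous dividends and approximates the partial derivatives by finite differences. The strike, maturity, and other parameter values are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_call(S, t, K, T, sigma, r, q):
    """Black-Scholes value at time t < T of a European call with strike K and maturity T."""
    tau = T - t
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * math.exp(-q * tau) * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, t, K=100.0, T=1.0, sigma=0.3, r=0.05, q=0.02, h=1e-2, dt=1e-4):
    """Left-hand side of the Black-Scholes PDE, with derivatives from central differences."""
    V = lambda s, u: bs_call(s, u, K, T, sigma, r, q)
    V_t = (V(S, t + dt) - V(S, t - dt)) / (2 * dt)
    V_S = (V(S + h, t) - V(S - h, t)) / (2 * h)
    V_SS = (V(S + h, t) - 2 * V(S, t) + V(S - h, t)) / (h * h)
    return V_t + 0.5 * sigma ** 2 * S ** 2 * V_SS + (r - q) * S * V_S - r * V(S, t)

for S in (80.0, 100.0, 120.0):
    print(S, pde_residual(S, 0.25))   # close to zero, up to discretization error
```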



3.2 Covariance and correlation matrices.


Linear algebra.

Question 1. Show that any covariance matrix is symmetric positive semidefinite. Show that the same is true for correlation matrices.
Answer: Let $\Sigma_X$ and $\Omega_X$ be the covariance matrix and the correlation matrix of $n$ random variables $X_1, X_2, \ldots, X_n$. It is easy to see that $\Sigma_X$ and $\Omega_X$ are symmetric matrices:

$$\Sigma_X(j, k) = \mathrm{cov}(X_j, X_k) = \mathrm{cov}(X_k, X_j) = \Sigma_X(k, j), \quad \forall\, 1 \leq j, k \leq n;$$

$$\Omega_X(j, k) = \mathrm{corr}(X_j, X_k) = \mathrm{corr}(X_k, X_j) = \Omega_X(k, j), \quad \forall\, 1 \leq j, k \leq n.$$

Let $c_1, c_2, \ldots, c_n$ be real numbers, and let $C = (c_i)_{i=1:n}$ be a column vector of size $n$. Recall that¹

$$\mathrm{var}\left( \sum_{i=1}^{n} c_i X_i \right) = C^t \Sigma_X C. \tag{3.29}$$

Since the variance of any random variable is nonnegative, it follows that

$$C^t \Sigma_X C \geq 0, \quad \forall\, C \in \mathbb{R}^n, \tag{3.30}$$

and we conclude that $\Sigma_X$ is a symmetric positive semidefinite matrix.

¹Note that (3.29) is a special case of the following more general result: If $C^{(1)} = (c_i^{(1)})_{i=1:n}$ and $C^{(2)} = (c_i^{(2)})_{i=1:n}$ are two column vectors of size $n$, then

$$\mathrm{cov}\left( \sum_{i=1}^{n} c_i^{(1)} X_i,\ \sum_{i=1}^{n} c_i^{(2)} X_i \right) = (C^{(1)})^t\, \Sigma_X\, C^{(2)};$$

see Lemma 7.3 from Stefanica [4] for a proof of this result.
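
For completeness, we include a proof of (3.29) here.

Let $Y = \sum_{i=1}^{n} c_i X_i$. Then,

$$Y - E[Y] = \sum_{i=1}^{n} c_i (X_i - \mu_i),$$

where $\mu_i = E[X_i]$, for $i = 1 : n$, and therefore

$$\mathrm{var}(Y) = E\left[ (Y - E[Y])^2 \right] = E\left[ \left( \sum_{i=1}^{n} c_i (X_i - \mu_i) \right)^2 \right]$$

$$= E\left[ \sum_{1 \leq j, k \leq n} c_j c_k (X_j - \mu_j)(X_k - \mu_k) \right] \tag{3.31}$$

$$= \sum_{1 \leq j, k \leq n} c_j c_k\, E\left[ (X_j - \mu_j)(X_k - \mu_k) \right] = \sum_{1 \leq j, k \leq n} c_j c_k\, \mathrm{cov}(X_j, X_k) = \sum_{1 \leq j, k \leq n} c_j c_k\, \Sigma_X(j, k)$$

$$= C^t \Sigma_X C, \tag{3.32}$$

where, for (3.31) and for (3.32), we used the following facts, respectively:

$$\left( \sum_{i=1}^{n} z_i \right)^2 = \sum_{1 \leq j, k \leq n} z_j z_k; \qquad Z^t A Z = \sum_{1 \leq j, k \leq n} z_j z_k\, A(j, k),$$

for any $n \times 1$ vector $Z = (z_i)_{i=1:n} \in \mathbb{R}^n$, and for any $n \times n$ matrix $A$.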
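
To show that $\Omega_X$ is a symmetric positive semidefinite matrix, recall the following correspondence between covariance matrices and correlation matrices:

$$\Sigma_X = D_{\sigma_X}\, \Omega_X\, D_{\sigma_X}, \tag{3.33}$$

where $D_{\sigma_X} = \mathrm{diag}(\sigma_i)_{i=1:n}$ is the diagonal matrix with entries equal to the standard deviations of the $n$ random variables, i.e., $\sigma_i^2 = \mathrm{var}(X_i)$, for $i = 1 : n$.
Note that $(D_{\sigma_X})^{-1} = \mathrm{diag}\left( \frac{1}{\sigma_i} \right)_{i=1:n}$. Let $v \in \mathbb{R}^n$, and let

$$w = (D_{\sigma_X})^{-1} v.$$

Then,

$$w^t = \left( (D_{\sigma_X})^{-1} v \right)^t = v^t \left( (D_{\sigma_X})^{-1} \right)^t = v^t (D_{\sigma_X})^{-1}, \tag{3.34}$$

since $(D_{\sigma_X})^{-1}$ is a diagonal matrix and therefore symmetric, i.e., $\left( (D_{\sigma_X})^{-1} \right)^t = (D_{\sigma_X})^{-1}$.
From (3.33), (3.34), and (3.30), we find that

$$v^t \Omega_X v = v^t (D_{\sigma_X})^{-1} \left( D_{\sigma_X} \Omega_X D_{\sigma_X} \right) (D_{\sigma_X})^{-1} v = w^t \left( D_{\sigma_X} \Omega_X D_{\sigma_X} \right) w = w^t \Sigma_X w \geq 0.$$

Thus, $v^t \Omega_X v \geq 0$ for all $v \in \mathbb{R}^n$, and we conclude that $\Omega_X$ is a symmetric positive semidefinite matrix. □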

Question 2. Find the correlation matrix of three random variables with covariance matrix

$$\Sigma_X = \begin{pmatrix} 1 & 0.36 & -1.44 \\ 0.36 & 4 & 0.80 \\ -1.44 & 0.80 & 9 \end{pmatrix}. \tag{3.35}$$


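Answer: If $\Omega_X$ is the correlation matrix of the three random variables, then

$$\Sigma_X = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix} \Omega_X \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix},$$

where $\sigma_i$, $i = 1 : 3$, are the standard deviations of the three random variables, and therefore

$$\Omega_X = \begin{pmatrix} \frac{1}{\sigma_1} & 0 & 0 \\ 0 & \frac{1}{\sigma_2} & 0 \\ 0 & 0 & \frac{1}{\sigma_3} \end{pmatrix} \Sigma_X \begin{pmatrix} \frac{1}{\sigma_1} & 0 & 0 \\ 0 & \frac{1}{\sigma_2} & 0 \\ 0 & 0 & \frac{1}{\sigma_3} \end{pmatrix}.$$

Since the standard deviations of the random variables with covariance matrix $\Sigma_X$ given by (3.35) are

$$\sigma_1 = \sqrt{\Sigma_X(1, 1)} = 1; \quad \sigma_2 = \sqrt{\Sigma_X(2, 2)} = 2; \quad \sigma_3 = \sqrt{\Sigma_X(3, 3)} = 3,$$

we obtain that

$$\Omega_X = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{3} \end{pmatrix} \Sigma_X \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{2} & 0 \\ 0 & 0 & \frac{1}{3} \end{pmatrix} = \begin{pmatrix} 1 & 0.18 & -0.48 \\ 0.18 & 1 & 0.1333 \\ -0.48 & 0.1333 & 1 \end{pmatrix}. \quad \square$$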
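
The conversion from a covariance matrix to a correlation matrix can be sketched in a few lines. This is an illustration only; the numerical entries of the covariance matrix below are an assumed reading of (3.35), chosen to be consistent with the correlation values 0.18, -0.48, and 0.1333 in the answer, and the function name is hypothetical.

```python
import math

def correlation_from_covariance(cov):
    """Return D^{-1} * cov * D^{-1}, where D = diag(sqrt(cov[i][i]))."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Assumed reading of the covariance matrix (3.35).
cov = [[1.0, 0.36, -1.44],
       [0.36, 4.0, 0.80],
       [-1.44, 0.80, 9.0]]

corr = correlation_from_covariance(cov)
for row in corr:
    print([round(x, 4) for x in row])
```

Dividing entry $(i, j)$ by $\sigma_i \sigma_j$ is the same as multiplying by the inverse diagonal matrices on both sides, so no matrix products are needed.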

Question 3. Assume that all the entries of an $n \times n$ correlation matrix which are not on the main diagonal are equal to $\rho$. Find upper and lower bounds on the possible values of $\rho$.
Answer: Recall that a symmetric matrix with diagonal entries equal to 1 is a correlation matrix if and only if the matrix is symmetric positive semidefinite, i.e., if and only if all the eigenvalues of the matrix are nonnegative.
Let

$$\Omega = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}.$$

We include two ways to compute the eigenvalues of $\Omega$, which are then used to find the necessary and sufficient conditions for the matrix $\Omega$ to be a correlation matrix.
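Solution 1: Note that

$$\Omega = (1 - \rho) I + \begin{pmatrix} \rho & \rho & \cdots & \rho \\ \rho & \rho & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & \rho \end{pmatrix} = (1 - \rho) I + \rho M,$$

where $M$ is the $n \times n$ matrix with all entries equal to 1, and $I$ is the $n \times n$ identity matrix.
Let $\lambda$ and $v = (v_i)_{i=1:n}$ be an eigenvalue and a corresponding eigenvector of $M$, i.e., $Mv = \lambda v$, with $v \neq 0$. Then, $Mv = \lambda v$ can be written as

$$v_1 + v_2 + \cdots + v_n = \lambda v_1;$$
$$v_1 + v_2 + \cdots + v_n = \lambda v_2;$$
$$\vdots$$
$$v_1 + v_2 + \cdots + v_n = \lambda v_n,$$

and therefore

$$\lambda v_1 = \lambda v_2 = \cdots = \lambda v_n.$$

Thus, either $\lambda = 0$, or $v_1 = v_2 = \cdots = v_n$, in which case $n v_1 = \lambda v_1$, and therefore $\lambda = n$, since $v = (v_i)_{i=1:n} \neq 0$.

In other words, the eigenvalues of $M$ are $\lambda = 0$ and $\lambda = n$.
Note that, if $Mv = \lambda v$, then,

$$\Omega v = (1 - \rho) v + \rho M v = (1 - \rho) v + \rho \lambda v = (1 - \rho + \rho \lambda) v.$$

Thus, $\mu = 1 - \rho + \rho \lambda$ and $v$ are an eigenvalue and corresponding eigenvector of $\Omega$.
Since the eigenvalues of $M$ are $\lambda = 0$ and $\lambda = n$, it follows that the eigenvalues² of $\Omega$ are $\mu = 1 - \rho$, corresponding to $\lambda = 0$, and $\mu = (1 - \rho) + n \rho = 1 + (n - 1) \rho$, corresponding to $\lambda = n$.
Since $\Omega$ is a correlation matrix if and only if all its eigenvalues are nonnegative, we conclude that the matrix $\Omega$ is a correlation matrix if and only if

$$0 \leq 1 + (n - 1) \rho \quad \text{and} \quad 0 \leq 1 - \rho,$$

which is equivalent to

$$-\frac{1}{n - 1} \leq \rho \leq 1. \tag{3.36}$$

²The eigenvalue $1 - \rho$ has multiplicity $n - 1$, and the eigenvalue $1 + (n - 1) \rho$ has multiplicity 1; see Solution 2 of this question.

Solution 2: Note that

$$\Omega = (1 - \rho) I + \rho \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix} = (1 - \rho) I + \rho \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix} = (1 - \rho) I + \rho\, w w^t = (1 - \rho) I + \rho A,$$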

where $I$ is the $n \times n$ identity matrix and $A$ is the $n \times n$ matrix given by $A = w w^t$, where $w$ is the $n \times 1$ column vector with all entries equal to 1.
Recall that an $n \times n$ matrix of the form $u u^t$, where $u = (u_i)_{i=1:n}$ is an $n \times 1$ column vector, has an eigenvalue equal to $\sum_{i=1}^{n} u_i^2$ with multiplicity 1 and another eigenvalue equal to 0 with multiplicity $n - 1$.
Then, the eigenvalues of the matrix $A$ are:
• $\lambda = \sum_{i=1}^{n} 1 = n$ with multiplicity 1;
• $\lambda = 0$ with multiplicity $n - 1$.
Note that, if $\lambda$ and $v$ are an eigenvalue and a corresponding eigenvector of $A$, then $Av = \lambda v$, and therefore

$$\Omega v = (1 - \rho) v + \rho A v = (1 - \rho) v + \rho \lambda v = (1 - \rho + \rho \lambda) v.$$

Thus, $1 - \rho + \rho \lambda$ and $v$ are an eigenvalue and corresponding eigenvector of $\Omega$, and we obtain that the matrix $\Omega$ has the following eigenvalues:
• $(1 - \rho) + n \rho = 1 + (n - 1) \rho$ with multiplicity 1;
• $1 - \rho$ with multiplicity $n - 1$.
As before, since $\Omega$ is a correlation matrix if and only if all its eigenvalues are nonnegative, we conclude that the matrix $\Omega$ is a correlation matrix if and only if

$$0 \leq 1 + (n - 1) \rho \quad \text{and} \quad 0 \leq 1 - \rho,$$

which is equivalent to

$$-\frac{1}{n - 1} \leq \rho \leq 1,$$

which is the same as (3.36). □
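
The two eigenvalues can be checked directly by multiplying the matrix by suitable vectors. The sketch below is an illustration only, with hypothetical helper names and an arbitrary choice of $n$ and $\rho$: the vector of ones should be scaled by $1 + (n - 1) \rho$, and any vector with entries summing to zero should be scaled by $1 - \rho$.

```python
def equicorrelation(n, rho):
    """n x n matrix with 1 on the main diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_valid_equicorrelation(n, rho):
    """Condition (3.36): both eigenvalues 1 - rho and 1 + (n - 1) rho are nonnegative."""
    return 1.0 - rho >= 0.0 and 1.0 + (n - 1) * rho >= 0.0

n, rho = 5, 0.3
Omega = equicorrelation(n, rho)
ones = [1.0] * n
diff = [1.0, -1.0] + [0.0] * (n - 2)
print(matvec(Omega, ones))   # each entry equals 1 + (n - 1) * rho
print(matvec(Omega, diff))   # equals (1 - rho) * diff
print(is_valid_equicorrelation(n, -0.25), is_valid_equicorrelation(n, -0.26))
```

The lower bound $-1/(n - 1)$ tends to 0 as $n$ grows, so a large number of variables cannot all be strongly negatively correlated with each other.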

Question 4. How many eigenvalues does an $n \times n$ matrix with real entries have? How many eigenvectors?
Answer: Any $n \times n$ matrix with real entries has $n$ eigenvalues, counted with their multiplicities; some of the eigenvalues may be complex numbers. Any $n \times n$ matrix has at most $n$ linearly independent eigenvectors.
Let $A$ be an $n \times n$ matrix. Let $\lambda$ be an eigenvalue of $A$ with corresponding eigenvector $v \neq 0$, and let $P_A(x) = \det(x I_n - A)$ be the characteristic polynomial of $A$, where $I_n$ is the $n \times n$ identity matrix. Note that

$$Av = \lambda v,\ v \neq 0 \iff (\lambda I_n - A) v = 0,\ v \neq 0$$
$$\iff \lambda I_n - A \ \text{ singular matrix}$$
$$\iff \det(\lambda I_n - A) = 0$$
$$\iff P_A(\lambda) = 0.$$

In other words, $\lambda$ is an eigenvalue of $A$ if and only if $\lambda$ is a root of the corresponding characteristic polynomial $P_A(x)$. Since $P_A(x)$ is a polynomial of degree $n$, it follows from the Fundamental Theorem of Algebra that $P_A(x)$ has exactly $n$ (complex) roots when counted with their multiplicities. We conclude that any $n \times n$ matrix has $n$ eigenvalues, counted with their multiplicities.
An eigenvalue of multiplicity $m$ has at least one eigenvector and at most $m$ linearly independent corresponding eigenvectors, but it may have fewer than $m$ linearly independent eigenvectors.³ Thus, an $n \times n$ matrix has at most $n$ linearly independent eigenvectors, and at least as many linearly independent eigenvectors as the number of distinct eigenvalues of the matrix. □

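³For example, the matrix $\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$ has eigenvalue 2 with multiplicity 3 and only one linearly independent eigenvector, $\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$.

Question 5. Let

$$A = \begin{pmatrix} 2 & -2 \\ -2 & 5 \end{pmatrix}.$$

(i) Find a $2 \times 2$ matrix $M$ such that $M^2 = A$;
(ii) Find a $2 \times 2$ matrix $M$ such that $A = M M^t$.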
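Answer: (i) Recall that any symmetric matrix has the diagonal form

$$A = Q \Lambda Q^t, \tag{3.37}$$

where $\Lambda$ is the diagonal matrix whose entries on the main diagonal are the eigenvalues of $A$ and $Q$ is the orthogonal matrix whose columns are the corresponding eigenvectors of $A$ of norm 1, i.e.,

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}; \quad Q = (v_1\ v_2), \tag{3.38}$$

where $A v_1 = \lambda_1 v_1$ and $A v_2 = \lambda_2 v_2$, with $\|v_1\| = \|v_2\| = 1$.
If the matrix $A$ has nonnegative eigenvalues, i.e., if $\lambda_1 \geq 0$ and $\lambda_2 \geq 0$, then the matrix

$$M = Q \Lambda^{1/2} Q^t \tag{3.39}$$

with

$$\Lambda^{1/2} = \begin{pmatrix} \sqrt{\lambda_1} & 0 \\ 0 & \sqrt{\lambda_2} \end{pmatrix} \tag{3.40}$$

has the property that $M^2 = A$:

$$M^2 = \left( Q \Lambda^{1/2} Q^t \right) \left( Q \Lambda^{1/2} Q^t \right) = Q \Lambda^{1/2} (Q^t Q) \Lambda^{1/2} Q^t = Q \Lambda^{1/2} \Lambda^{1/2} Q^t = Q \Lambda Q^t = A,$$

since $Q$ is an orthogonal matrix and therefore $Q^t Q = I$, and since, from (3.40), it follows that $\Lambda^{1/2} \Lambda^{1/2} = \Lambda$.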
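
We now proceed to compute the eigenvalues and the eigenvectors of the matrix $A$. The eigenvalues of the matrix $A$ are the roots of the characteristic polynomial $P_A(x)$ of the matrix $A$ given by⁴

$$P_A(x) = \det(x I - A) = \det \begin{pmatrix} x - 2 & 2 \\ 2 & x - 5 \end{pmatrix} = (x - 2)(x - 5) - 4 = x^2 - 7x + 6 = (x - 1)(x - 6).$$

The roots of $P_A(x)$ are 1 and 6, and therefore the eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_2 = 6$. The corresponding eigenvectors of norm 1 are

$$v_1 = \begin{pmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \end{pmatrix} \quad \text{and} \quad v_2 = \begin{pmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{pmatrix}.$$

For example, if $\lambda_2 = 6$, any corresponding eigenvector $v_2 = \begin{pmatrix} a \\ b \end{pmatrix} \neq 0$ is a solution to $Av = 6v$, which can be written as

$$\begin{cases} 2a - 2b = 6a \\ -2a + 5b = 6b \end{cases} \iff \begin{cases} b = -2a \\ b = -2a. \end{cases}$$

Thus, any eigenvector corresponding to the eigenvalue $\lambda_2 = 6$ is of the form

$$v_2 = \begin{pmatrix} a \\ -2a \end{pmatrix} = a \begin{pmatrix} 1 \\ -2 \end{pmatrix}.$$

By choosing $a = \frac{1}{\sqrt{5}}$, we obtain that an eigenvector of norm 1 corresponding to the eigenvalue $\lambda_2 = 6$ is

$$v_2 = \begin{pmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \end{pmatrix}.$$

⁴The characteristic polynomial of the matrix $A$ can also be obtained as follows:
$$P_A(x) = x^2 - \mathrm{tr}(A)\, x + \det(A) = x^2 - 7x + 6,$$
where $\mathrm{tr}(A) = 2 + 5 = 7$ and $\det(A) = 2 \cdot 5 - (-2) \cdot (-2) = 6$.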
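
With the eigenvalues 1 and 6 and the unit eigenvectors found above, the construction (3.39) can be carried out numerically. The sketch below is an illustration only: it assumes the matrix $A$ has rows $(2, -2)$ and $(-2, 5)$, as implied by the characteristic polynomial computation, and the helper functions are hypothetical.

```python
import math

A = [[2.0, -2.0], [-2.0, 5.0]]
s = 1.0 / math.sqrt(5.0)
# Columns of Q are the unit eigenvectors v1 = (2, 1)/sqrt(5) and v2 = (1, -2)/sqrt(5).
Q = [[2 * s, 1 * s],
     [1 * s, -2 * s]]
# Square roots of the eigenvalues lambda_1 = 1 and lambda_2 = 6, as in (3.40).
D = [[math.sqrt(1.0), 0.0],
     [0.0, math.sqrt(6.0)]]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# M = Q * Lambda^{1/2} * Q^t, as in (3.39)
M = matmul(matmul(Q, D), transpose(Q))
M_squared = matmul(M, M)
print(M)
print(M_squared)   # should reproduce A up to rounding
```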

Since $A v_1 = \lambda_1 v_1 = 2 v_1$ and $A v_2 = \lambda_2 v_2 = -3 v_2$, we find from (3.42) that

$$Av = 2 A v_1 - A v_2 = 2 (2 v_1) - (-3 v_2) = 4 v_1 + 3 v_2$$

\| eon
Question 7. Let $A$ and $B$ be square matrices of the same size. Show that the traces of the matrices $AB$ and $BA$ are equal.
Answer: Recall that, for any two square matrices $A$ and $B$ of the same size, the matrices $AB$ and $BA$ have the same characteristic polynomial, i.e.,

$$P_{AB}(x) = \det(x I - AB) = \det(x I - BA) = P_{BA}(x), \quad \forall\, x \in \mathbb{C}, \tag{3.43}$$

where $I$ is the identity matrix of the same size as the matrices $A$ and $B$.
Also, recall the following connection between the characteristic polynomial $P_M(x)$ of an $n \times n$ matrix $M$ and the trace $\mathrm{tr}(M)$ of the matrix:

$$P_M(x) = \det(x I - M) = x^n - \mathrm{tr}(M)\, x^{n-1} + \cdots + (-1)^n \det(M). \tag{3.44}$$

Since $P_{AB}(x) = P_{BA}(x)$, see (3.43), we conclude from (3.44) that

$$\mathrm{tr}(AB) = \mathrm{tr}(BA). \tag{3.45}$$

For completeness, we include a proof of (3.43). If the matrix $B$ is nonsingular, then

$$x I - AB = B^{-1} (x I - BA) B,$$

and therefore

$$\det(x I - AB) = \det(B^{-1}) \det(x I - BA) \det(B) = \det(x I - BA), \tag{3.46}$$

since

$$\det(B^{-1}) \det(B) = \det(B^{-1} B) = \det(I) = 1.$$

If the matrix $B$ is singular, let $\epsilon$ be a real number, and note that the matrix $B - \epsilon I$ is singular if and only if $\epsilon$ is equal to an eigenvalue of $B$. Since the $n \times n$ matrix $B$ has at most $n$ eigenvalues, it follows that, except for a finite number of values of $\epsilon$, the matrix $B - \epsilon I$ is nonsingular, in which case we obtain from (3.46) that

$$\det(x I - A (B - \epsilon I)) = \det(x I - (B - \epsilon I) A). \tag{3.47}$$

Since both sides of (3.47) are polynomials in $\epsilon$ of degree at most $n$, and therefore continuous functions of $\epsilon$, we can let $\epsilon \to 0$ in (3.47) and obtain that

$$\lim_{\epsilon \to 0} \det(x I - A (B - \epsilon I)) = \lim_{\epsilon \to 0} \det(x I - (B - \epsilon I) A)$$
$$\iff \det(x I - AB) = \det(x I - BA). \tag{3.48}$$

From (3.46) and (3.48), we conclude that $\det(x I - AB) = \det(x I - BA)$ regardless of whether the matrix $B$ is nonsingular or singular, which concludes the proof of (3.43). □

Question 8. Can you find $n \times n$ matrices $A$ and $B$ such that

$$AB - BA = I_n,$$

where $I_n$ is the identity matrix of size $n$?

Answer: We give a proof by contradiction. If it were possible to find $n \times n$ matrices $A$ and $B$ such that $AB - BA = I_n$, then

$$\mathrm{tr}(AB - BA) = \mathrm{tr}(I_n) = n. \tag{3.49}$$

However,

$$\mathrm{tr}(AB - BA) = \mathrm{tr}(AB) - \mathrm{tr}(BA) = 0, \tag{3.50}$$

since, if $A$ and $B$ are square matrices, then $\mathrm{tr}(AB) = \mathrm{tr}(BA)$; cf. (3.45).
Since (3.49) and (3.50) contradict each other, we conclude that there are no matrices $A$ and $B$ such that $AB - BA = I_n$. □
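
The identity $\mathrm{tr}(AB) = \mathrm{tr}(BA)$, and the resulting zero trace of $AB - BA$, can be checked on random integer matrices, where the arithmetic is exact. This is only a sketch; the matrix size, the entry range, and the random seed are arbitrary choices.

```python
import random

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

rng = random.Random(0)   # fixed seed so that the example is reproducible
n = 4
A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
B = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

AB, BA = matmul(A, B), matmul(B, A)
commutator = [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]
print(trace(AB), trace(BA), trace(commutator))
```

Since the trace of $AB - BA$ is always 0 while the trace of $I_n$ is $n$, no choice of entries can make the two matrices equal.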

Question 9. A probability matrix is a matrix with nonnegative entries such that the sum of the entries in each row of the matrix is equal to 1. Show that the product of two probability matrices is a probability matrix.
Answer: We first establish the following equivalent definition for a probability matrix:
The $n \times n$ matrix $M$ is a probability matrix if and only if all the entries of $M$ are nonnegative and

$$M \mathbf{1} = \mathbf{1}, \tag{3.51}$$

where $\mathbf{1}$ is the $n \times 1$ column vector with all entries equal to 1.
To see this, let $M = \mathrm{row}(r_j)_{j=1:n}$ be the row form of the matrix $M$, where $r_j$ is a $1 \times n$ row vector, for $j = 1 : n$. The sum of all the entries in the $j$-th row $r_j$ of $M$ can be written as follows:⁵

$$\sum_{k=1}^{n} r_j(k) = r_j \mathbf{1}. \tag{3.52}$$

⁵Note that $r_j$ is a $1 \times n$ vector and $\mathbf{1}$ is an $n \times 1$ vector, and therefore the expression $r_j \mathbf{1}$ from (3.52) is consistent.

Thus, the definition of a probability matrix as a matrix with the sum of the entries in each row equal to 1 can be written as

$$\sum_{k=1}^{n} r_j(k) = 1, \ \forall\, j = 1 : n \iff r_j \mathbf{1} = 1, \ \forall\, j = 1 : n \iff (r_j \mathbf{1})_{j=1:n} = \mathbf{1} \iff M \mathbf{1} = \mathbf{1},$$

since $M \mathbf{1} = (r_j \mathbf{1})_{j=1:n}$ if $M = \mathrm{row}(r_j)_{j=1:n}$ is the row form of $M$.
In other words, we established that (3.51) is an equivalent condition for $M$ to be a probability matrix.
Let $A$ and $B$ be probability matrices. Then all the entries of $A$ and $B$ are nonnegative, and therefore all the entries of $AB$ are also nonnegative. From (3.51), it follows that $A \mathbf{1} = \mathbf{1}$ and $B \mathbf{1} = \mathbf{1}$, and therefore

$$(AB) \mathbf{1} = A (B \mathbf{1}) = A \mathbf{1} = \mathbf{1}.$$

Then, from (3.51), we conclude that $AB$ is a probability matrix. □
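
The statement can be illustrated on a small example. The sketch below is an illustration only: the two matrices are made up, and the tolerance in `is_probability_matrix` is an assumption needed because of floating-point rounding.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_probability_matrix(M, tol=1e-12):
    """Nonnegative entries and every row summing to 1, i.e., M * 1 = 1 as in (3.51)."""
    nonnegative = all(x >= 0.0 for row in M for x in row)
    rows_sum_to_one = all(abs(sum(row) - 1.0) <= tol for row in M)
    return nonnegative and rows_sum_to_one

# Two made-up probability matrices.
A = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.1, 0.9]]
B = [[0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2],
     [0.25, 0.25, 0.5]]

AB = matmul(A, B)
print(AB)
print(is_probability_matrix(AB))
```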

Question 10. Find all the values of $\rho$ such that

$$\begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix}$$

is a correlation matrix.

Answer: Recall that a solution to this question based on Sylvester's criterion was included in Chapter 1. We give two more solutions to this question here, one using the Cholesky decomposition, and another one based on the definition of symmetric positive semidefinite matrices.
A symmetric matrix with diagonal entries equal to 1 is a correlation matrix if and only if the matrix is symmetric positive semidefinite. Thus, we need to find all the values of $\rho$ such that the matrix

$$\Omega = \begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix} \tag{3.53}$$

is symmetric positive semidefinite.


Solution 1: To identify the values of $\rho$ such that the matrix $\Omega$ is symmetric positive semidefinite, we apply the first step of the Cholesky algorithm to $\Omega$, and obtain the following $2 \times 2$ matrix:

$$M = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} - \begin{pmatrix} 0.6 \\ -0.3 \end{pmatrix} \begin{pmatrix} 0.6 & -0.3 \end{pmatrix} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} - \begin{pmatrix} 0.36 & -0.18 \\ -0.18 & 0.09 \end{pmatrix} = \begin{pmatrix} 0.64 & \rho + 0.18 \\ \rho + 0.18 & 0.91 \end{pmatrix}.$$

Thus, the matrix $\Omega$ is symmetric positive semidefinite if and only if the matrix

$$M = \begin{pmatrix} 0.64 & \rho + 0.18 \\ \rho + 0.18 & 0.91 \end{pmatrix}$$

is symmetric positive semidefinite. Since $M(1, 1) = 0.64 > 0$, it follows that $M$ is symmetric positive semidefinite if and only if $\det(M) \geq 0$, i.e., if and only if

$$\det(M) = 0.5824 - (\rho + 0.18)^2 \geq 0, \tag{3.54}$$

which is equivalent to

$$|\rho + 0.18| \leq \sqrt{0.5824} = 0.7632.$$

We conclude that $\Omega$ is a symmetric positive semidefinite matrix, and therefore a correlation matrix, if and only if

$$-0.7632 \leq \rho + 0.18 \leq 0.7632,$$

which can be written as

$$-0.9432 \leq \rho \leq 0.5832. \tag{3.55}$$

Note that condition (3.55) is the same as condition (1.8) obtained when solving the same question using Sylvester's Criterion; see Chapter 1.

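Solution 2: By definition, the matrix $\Omega$ is symmetric positive semidefinite if and only if $x^t \Omega x \geq 0$ for all $x = (x_i)_{i=1:3} \in \mathbb{R}^3$. Note that

$$x^t \Omega x = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 1 & 0.6 & -0.3 \\ 0.6 & 1 & \rho \\ -0.3 & \rho & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x_1^2 + x_2^2 + x_3^2 + 1.2 x_1 x_2 - 0.6 x_1 x_3 + 2 \rho x_2 x_3.$$

By completing the square, we obtain that

$$x^t \Omega x = x_1^2 + 2 x_1 (0.6 x_2 - 0.3 x_3) + x_2^2 + x_3^2 + 2 \rho x_2 x_3$$
$$= (x_1 + 0.6 x_2 - 0.3 x_3)^2 - (0.6 x_2 - 0.3 x_3)^2 + x_2^2 + x_3^2 + 2 \rho x_2 x_3$$
$$= (x_1 + 0.6 x_2 - 0.3 x_3)^2 + 0.64 x_2^2 + 2 x_2 x_3 (\rho + 0.18) + 0.91 x_3^2.$$

By completing the square once again, we find that

$$0.64 x_2^2 + 2 x_2 x_3 (\rho + 0.18) + 0.91 x_3^2 = \left( 0.8 x_2 + x_3 \frac{\rho + 0.18}{0.8} \right)^2 - x_3^2 \frac{(\rho + 0.18)^2}{0.64} + 0.91 x_3^2$$
$$= \left( 0.8 x_2 + x_3 \frac{\rho + 0.18}{0.8} \right)^2 + \frac{x_3^2}{0.64} \left( 0.5824 - (\rho + 0.18)^2 \right),$$

and therefore

$$x^t \Omega x = (x_1 + 0.6 x_2 - 0.3 x_3)^2 + \left( 0.8 x_2 + x_3 \frac{\rho + 0.18}{0.8} \right)^2 + \frac{x_3^2}{0.64} \left( 0.5824 - (\rho + 0.18)^2 \right).$$

Thus, $x^t \Omega x \geq 0$ for all $x = (x_i)_{i=1:3} \in \mathbb{R}^3$ if and only if

$$(x_1 + 0.6 x_2 - 0.3 x_3)^2 + \left( 0.8 x_2 + x_3 \frac{\rho + 0.18}{0.8} \right)^2 + \frac{x_3^2}{0.64} \left( 0.5824 - (\rho + 0.18)^2 \right) \geq 0, \quad \forall\, x \in \mathbb{R}^3.$$

The last inequality holds if and only if

$$0.5824 - (\rho + 0.18)^2 \geq 0, \tag{3.56}$$

which is the same as (3.54).

We conclude that $\Omega$ is a correlation matrix if and only if

$$-0.9432 \leq \rho \leq 0.5832. \quad \square$$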
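
The bounds on $\rho$ can be recomputed from the reduced $2 \times 2$ matrix of Solution 1. The sketch below is an illustration only, with hypothetical function names; it hard-codes the entries 0.6 and -0.3 of this particular matrix.

```python
import math

def reduced_matrix(rho):
    """The 2x2 matrix M obtained after the first Cholesky step applied to Omega."""
    a, b = 0.6, -0.3   # off-diagonal entries of the first row of Omega
    return [[1.0 - a * a, rho - a * b],
            [rho - a * b, 1.0 - b * b]]

def is_correlation_matrix(rho):
    """Omega is positive semidefinite if and only if M(1,1) > 0 and det(M) >= 0; see (3.54)."""
    M = reduced_matrix(rho)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] > 0.0 and det >= 0.0

half_width = math.sqrt(0.64 * 0.91)          # sqrt(0.5824)
lower, upper = -0.18 - half_width, -0.18 + half_width
print(round(lower, 4), round(upper, 4))
print(is_correlation_matrix(0.5), is_correlation_matrix(0.6))
```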

3.3 Financial instruments: options,


bonds, swaps, forwards, futures.

Question 1. The prices of three put options with strikes 40, 50, and 70, but otherwise identical, are $10, $20, and $30, respectively. Is there an arbitrage opportunity present? If yes, how can you make a riskless profit?
Answer: If an arbitrage exists, it will be due to the fact that the convexity of put option values with respect to the strike price is violated.
In the plane (K, y), the line passing through the points (K = 40, P(40) = 10) and (K = 70, P(70) = 30) is given by

$$y = \frac{70 - K}{70 - 40} \cdot 10 + \frac{K - 40}{70 - 40} \cdot 30. \tag{3.57}$$

The point on this line corresponding to strike 50 is obtained by substituting K = 50 in (3.57), and has y-coordinate equal to

$$y = \frac{20}{30} \cdot 10 + \frac{10}{30} \cdot 30 = \frac{50}{3} \approx 16.67.$$

Since P(K) is a strictly convex function of K, a no-arbitrage value of the put option with strike 50 should be below the line passing through the price points of the options with strikes 40 and 70. However, P(50) = 20 > 50/3. Thus, the put option with strike 50 is overpriced, and an arbitrage exists.
Using a "buy low, sell high" strategy, we can take advantage of this arbitrage opportunity as follows: buy 2 put options with strike 40, buy 1 put option with strike 70, and sell 3 put options with strike 50. There is a $10 positive cash flow when setting up this portfolio, since

3 · $20 - 2 · $10 - $30 = $10.


The value V(T) of the portfolio at the maturity T of the options is

$$V(T) = 2 \max(40 - S(T), 0) + \max(70 - S(T), 0) - 3 \max(50 - S(T), 0).$$

Note that V(T) is nonnegative for any value S(T) of the underlying asset at T:
If 70 ≤ S(T), then all options expire out of the money and V(T) = 0.
If 50 ≤ S(T) < 70, then V(T) = 70 - S(T) > 0.
If 40 ≤ S(T) < 50, then V(T) = 70 - S(T) - 3(50 - S(T)) = 2S(T) - 80 ≥ 0.
If S(T) < 40, then V(T) = 2S(T) - 80 + 2(40 - S(T)) = 0.
In other words, we set up a portfolio with positive cash flow at inception which does not lose money regardless of the value of the underlying asset at time T. The risk-free profit is equal to the future value at time T of the $10 cash flow from setting up the portfolio. □
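
The case analysis above can be double-checked by evaluating the payoff of the portfolio on a grid of terminal prices. This is a minimal sketch; the grid of prices from 0 to 120 in steps of 0.5 is an arbitrary choice.

```python
def portfolio_payoff(s):
    """Payoff at maturity: long 2 puts (K=40), long 1 put (K=70), short 3 puts (K=50)."""
    put = lambda k: max(k - s, 0.0)
    return 2 * put(40) + put(70) - 3 * put(50)

# Cash received when setting up the portfolio: sell 3 puts at $20, buy 2 at $10 and 1 at $30.
initial_cash_flow = 3 * 20 - 2 * 10 - 30

# Value at K = 50 of the line through (40, 10) and (70, 30); see (3.57).
interpolated = (70 - 50) / (70 - 40) * 10 + (50 - 40) / (70 - 40) * 30

# Smallest payoff over a grid of terminal prices.
worst_case = min(portfolio_payoff(0.5 * i) for i in range(0, 241))

print(initial_cash_flow, interpolated, worst_case)
```

The weights 2, 1, and 3 come from writing the middle strike as a weighted average of the outer strikes: 50 = (2/3) · 40 + (1/3) · 70.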

Question 2. The price of a stock is $50. In three months, it will either be $47 or $52, with 50% probability. How much would you pay for an at the money put? Assume for simplicity that the stock pays no dividends and that interest rates are zero.

Answer: Recall first that real world probabilities do not play any role in valuing an option in a (one period) binomial tree model. Thus, the 50% probability stated in the question is only meant to throw you off-course.
Solution 1: The value of the option is the discounted expected value of the payoff of the option in the risk-neutral probability measure. Since interest rates are zero, this can be written as

$$P(0) = p_{RN,up} P_{up} + p_{RN,down} P_{down}. \tag{3.58}$$

The up and down factors are u = 52/50 = 1.04 and d = 47/50 = 0.94, respectively. The risk-neutral probabilities of going up and down are

$$p_{RN,up} = \frac{1 - d}{u - d} = 0.6; \qquad p_{RN,down} = \frac{u - 1}{u - d} = 0.4.$$

The ATM put pays $3 if the stock price goes down to $47, i.e., P_down = 3, and expires worthless if the stock price goes up to $52, i.e., P_up = 0. From (3.58), we find that the value of the ATM put is

$$P(0) = 0.6 \cdot 0 + 0.4 \cdot 3 = 1.2. \tag{3.59}$$

In other words, you should pay at most $1.20 for an at the money put.
Solution 2: An insightful solution can be given by setting up a hedged portfolio. The Delta of the put option is

$$\Delta_P = \frac{P_{up} - P_{down}}{S_{up} - S_{down}} = \frac{0 - 3}{52 - 47} = -0.6.$$

A portfolio which is long one ATM put and short Δ_P shares will be long the put and long 0.6 shares, and will have the same value at maturity regardless of whether the stock price goes down to $47 or up to $52:

$$\Pi(T) = P(T) + 0.6\, S(T) = \begin{cases} 0 + 0.6 \cdot 52 = 31.20, & \text{if } S(T) = 52; \\ 3 + 0.6 \cdot 47 = 31.20, & \text{if } S(T) = 47. \end{cases}$$

For no-arbitrage, the value of the portfolio at inception must be the discounted value of its payoff. Since the interest rates are zero, we obtain that Π(0) = P(0) + 0.6 S(0) = 31.20, and therefore

$$P(0) = 31.20 - 0.6 \cdot 50 = 1.20,$$

which is the same value of the put, $1.20, obtained above; see (3.59). □
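
Both solutions can be reproduced in a few lines. The sketch below is an illustration only and assumes, as in the question, zero interest rates and no dividends; the function name `binomial_put` is hypothetical.

```python
def binomial_put(s0, s_up, s_down, strike):
    """One-period binomial value of a European put, assuming zero interest rates.

    Returns the risk-neutral price, the price from the Delta-hedged portfolio,
    the risk-neutral probability of the up move, and the Delta of the put.
    """
    u, d = s_up / s0, s_down / s0
    p_up = (1.0 - d) / (u - d)
    payoff_up = max(strike - s_up, 0.0)
    payoff_down = max(strike - s_down, 0.0)

    # Solution 1: expected payoff under the risk-neutral probabilities, as in (3.58).
    price_rn = p_up * payoff_up + (1.0 - p_up) * payoff_down

    # Solution 2: the portfolio P - Delta * S has the same value in both states.
    delta = (payoff_up - payoff_down) / (s_up - s_down)
    hedged_value = payoff_up - delta * s_up
    price_hedge = hedged_value + delta * s0

    return price_rn, price_hedge, p_up, delta

print(binomial_put(50.0, 52.0, 47.0, 50.0))
```

Note that the real-world probability of 50% does not appear anywhere in the function, which is the point made at the start of the answer.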

Question 3. A stock worth $50 today will be worth either


$60 or $40 in three months, with equal probability. The
value of a three months at the money put on this stock is
$4. Does the value of the three months ATM put increase
or decrease, and by how much, if the probability of the
stock going up to $60 were 75% and the probability of the
stock going down to $40 were 25%?
Answer: In a one period binomial tree model, the actual
probabilities of the asset going up or down do not play
any role in the valuation of a plain vanilla option. Thus,
the value of the three months at the money put would be
the same, $4, even if the probability of the stock going up
to $60 were 75%. 0

Question 4. What is risk-neutral pricing?

Answer: Risk-neutral pricing, or valuation, refers to valuing derivative securities as discounted expected values of their payoffs at maturity, under the assumption that the underlying asset has lognormal distribution with a drift equal to the risk-free rate.
More precisely, if the price of the underlying asset has a lognormal distribution with volatility σ and the asset pays dividends continuously at the rate q, then the value of a derivative security on this asset with payoff V(T) at maturity T given by risk-neutral valuation is

$$V(0) = e^{-rT} E_{RN}[V(S(T))],$$

where r is the risk-free rate assumed to be constant, and the expected value is computed with respect to the lognormal random variable S(T) given by

$$S(T) = S(0) \exp\left( \left( r - q - \frac{\sigma^2}{2} \right) T + \sigma \sqrt{T}\, Z \right),$$

with Z a standard normal random variable.

Risk-neutral valuation can be used for derivative securities which can be perfectly hedged dynamically using cash and the underlying asset. Plain vanilla European options, as well as European options with other payoffs at maturity (such as asset-or-nothing and cash-or-nothing options) can be priced using risk-neutrality. Risk-neutral valuation cannot be used for path dependent options such as American options, barrier options, and Asian options. □

Question 5. Describe briefly how you arrive at the Black-Scholes formula.
Answer: The Black-Scholes formulas give the values of plain vanilla European put and call options on an underlying asset with lognormal distribution. Several methods for deriving the Black-Scholes formulas are:
• Risk neutral pricing: the expected value of the payoff of the option at maturity computed under the assumption that the price of the underlying asset has a lognormal distribution with drift equal to the risk free rate gives the Black-Scholes value of the option.
• Black-Scholes PDE solution: the Black-Scholes value of the option satisfies the Black-Scholes PDE with boundary conditions given by the payoff of the option at maturity. The Black-Scholes PDE is transformed into the heat PDE using a lognormal change of variables, and the closed form solution of the heat PDE is then used to derive the closed form solution of the Black-Scholes PDE, which is the Black-Scholes value of the option.
• Binomial tree model pricing: the evolution of the underlying asset is modeled using a binomial tree calibrated to converge in the limit to a lognormal distribution with drift equal to the risk free rate. For every tree, an approximate option value is obtained from the binomial tree model. The limit of these binomial tree option values as the number of time intervals in the tree goes to infinity is the Black-Scholes value of the option.
Note that twelve different ways to derive the Black-Scholes formula can be found in Wilmott [5].

Question 6. How much should a three months at the money put on an asset with spot price $40 and volatility 30% be worth? Assume, for simplicity, that interest rates are zero and that the asset does not pay dividends.
Answer: The following approximation for the value of an at the money put option on a non dividend paying underlying asset and assuming zero risk-free interest rates is easy to estimate and very accurate if the total variance is small (e.g., if σ²T < 0.25):

$$P_{ATM} \approx 0.4\, \sigma S \sqrt{T}; \tag{3.60}$$

see Stefanica [3] for a derivation of formula (3.60).

For S = 40, σ = 0.3, and T = 1/4, we obtain that the value of the at the money put is approximately 2.40.
For comparison purposes, note that the value of the put option computed using the Black-Scholes formula would be 2.3914; the approximate formula (3.60) is very accurate in this case. □
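
The comparison between the approximation (3.60) and the Black-Scholes value can be reproduced as follows. This is a minimal sketch: `bs_put` is a hypothetical helper implementing the standard Black-Scholes put formula, with the normal distribution function computed from `math.erfc`.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bs_put(S, K, T, sigma, r=0.0, q=0.0):
    """Black-Scholes value of a European put with strike K and maturity T."""
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S * math.exp(-q * T) * norm_cdf(-d1)

S, sigma, T = 40.0, 0.30, 0.25
approx = 0.4 * sigma * S * math.sqrt(T)   # formula (3.60)
exact = bs_put(S, S, T, sigma)            # at the money, zero rates, no dividends
print(approx, exact)
```

The constant 0.4 in (3.60) is a rounding of $1/\sqrt{2\pi} \approx 0.3989$, which is why the approximation slightly overstates the Black-Scholes value.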

Question 7. If the price of a stock doubles in one day, by how much will the value of a call option on this stock change?
Answer: The value of a deep in the money call on a non dividend paying asset can be approximated, e.g., by using the Put-Call parity, as

$$C \approx S - K e^{-rT},$$

where K and T are the strike and the maturity of the option, and r is the constant risk free rate. Thus, if the spot price S doubles, the call option will be even deeper in the money, and therefore its value will be approximately

$$2S - K e^{-rT}.$$

In other words, the value of deep in the money calls roughly doubles if the spot price doubles.
If the option is around at the money, the percentage change generated by the doubling of the stock price is about one order of magnitude larger since the option will become deep in the money.
If the option is deep out of the money, then it trades for a fraction of a cent. The doubling of the spot price would result in changing the price of the option by several orders of magnitude.
As a numerical example, consider a six months call option with strike 20 on a non dividend paying underlying asset with volatility 25%. Assume that the risk free rate is constant at 5%. The Black-Scholes values of the call option corresponding to several spot prices of the underlying asset can be found below:

Spot Price | Option Price
10         | 0.000045
20         | 1.65
40         | 20.49
80         | 60.49

If the call option is deep out of the money and the
spot price doubles from $10 to $20, the value of the call
increases from $0.000045 to $1.65, i.e., by more than four
orders of magnitude.
If the call option is at the money and the spot price
doubles from $20 to $40, the value of the call increases
from $1.65 to $20.49, i.e., more than tenfold.
If the call option is deep in the money, and the spot
price doubles from $40 to $80, the value of the call
increases from $20.49 to $60.49, i.e., by a factor of 2.95;
if the call option is even deeper in the money⁶ and the
spot price doubles from $400 to $800, the value of the call
increases from $380.49 to $780.49, i.e., the call
approximately doubles in value.
Moreover, if the call option is deep in the money, its
value is very close to \(S - Ke^{-rT}\), i.e., the spot
price of the underlying asset minus the present value of
the strike. For all the spot prices greater than $40, the
estimate \(C \approx S - Ke^{-rT}\) is very accurate. Thus, if the
call option is deep in the money and the spot price doubles,
the value of the call option increases by the same amount
as the increase in the spot price. For example, if the
spot price doubles from $40 to $80, the value of the call
increases by $40, from $20.49 to $60.49, which is exactly
the increase in the spot price.
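The option values quoted in this example can be reproduced with a few lines of code. This is a minimal sketch of the standard Black-Scholes call formula for a non dividend paying asset; the function names `norm_cdf`, `bs_call`, and `deep_itm_estimate` are illustrative choices, not taken from the text.

```cpp
#include <cmath>

// Standard normal cumulative distribution.
double norm_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Black-Scholes value of a European call on a non dividend paying asset.
double bs_call(double spot, double strike, double maturity,
               double sigma, double rate) {
    const double vol_sqrt_t = sigma * std::sqrt(maturity);
    const double d1 = (std::log(spot / strike)
                       + (rate + 0.5 * sigma * sigma) * maturity) / vol_sqrt_t;
    const double d2 = d1 - vol_sqrt_t;
    return spot * norm_cdf(d1)
           - strike * std::exp(-rate * maturity) * norm_cdf(d2);
}

// Estimate C ~ S - K exp(-rT), meaningful only deep in the money.
double deep_itm_estimate(double spot, double strike, double maturity,
                         double rate) {
    return spot - strike * std::exp(-rate * maturity);
}
```

With strike 20, maturity 0.5, volatility 0.25, and rate 0.05, calling `bs_call` for spot prices 10, 20, 40, and 80 gives the values in the table above to the quoted precision, and for spot prices 40 and 80 the result is within a fraction of a cent of `deep_itm_estimate`.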

Question 8. What are the smallest and largest values
that Delta can take?

Answer: Assume, for simplicity, that the underlying asset
does not pay dividends.
The Delta (\(\Delta\)) of a long position in a plain vanilla call
option is between 0 and 1 (and therefore the Delta of a
short plain vanilla call position is between -1 and 0). The
Delta of a long call position increases with the spot price
of the underlying asset, and goes from 0, when the asset
is worthless (i.e., when the call option is deep out of the
money), to 1, when the spot price of the asset is very large
(i.e., when the call option is deep in the money).

⁶ Of course, call options with strike ten times smaller than the
spot price of the underlying asset never occur in practice; this part
of the example is for illustration purposes.

The Delta of a long position in a plain vanilla put
option is between -1 and 0; the Delta of a short position in
a plain vanilla put option is between 0 and 1. The Delta
of a long put position also increases with the spot price of
the underlying asset, and goes from -1, when the asset is
worthless (i.e., when the put option is deep in the money),
to 0, when the spot price of the asset is very large (i.e.,
when the put option is deep out of the money).
Note that, from the Put-Call parity, i.e., \(C - P =
S - Ke^{-rT}\), it follows that \(\Delta(P) = \Delta(C) - 1\), which is
consistent with the bounds above. □

Question 9. What is the Delta of an at-the-money call?
What is the Delta of an at-the-money put?
Answer: The Delta of an at-the-money call is approximately
0.5; the Delta of an at-the-money put is approximately -0.5.
Assume, for simplicity, a Black-Scholes framework with
zero risk-free rates and an underlying asset paying no
dividends. Then, \(\Delta(C_{BS}) = N(d_1)\), where \(N(x)\) denotes the
cumulative distribution of the standard normal variable
and

\[ d_1 = \frac{\ln\left(\frac{S}{K}\right) + \left(r - q + \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}}. \qquad (3.61) \]

For an at-the-money option, i.e., for \(S = K\), and for
\(r = q = 0\), formula (3.61) gives \(d_1 = \frac{\sigma\sqrt{T}}{2}\).
The linear Taylor approximation of \(N(x)\) around 0 is

\[ N(x) \approx 0.5 + \frac{x}{\sqrt{2\pi}} \approx 0.5 + 0.4x, \qquad (3.62) \]

since \(\frac{1}{\sqrt{2\pi}} = 0.3989 \approx 0.4\). Thus,

\[ \Delta(C_{BS}) = N(d_1) = N\left(\frac{\sigma\sqrt{T}}{2}\right) \approx 0.5 + 0.2\sigma\sqrt{T}. \]


This is roughly estimated as \(\Delta(C_{BS}) \approx 0.5\), since \(0.2\sigma\sqrt{T}\)
is small for most options (e.g., \(0.2\sigma\sqrt{T} < 0.1\) for volatility
less than 50% and for maturity less than one year).
For put options, \(\Delta(P_{BS}) = -N(-d_1)\), where \(d_1 = \frac{\sigma\sqrt{T}}{2}\),
cf. (3.61). From (3.62), it follows that

\[ \Delta(P_{BS}) = -N\left(-\frac{\sigma\sqrt{T}}{2}\right) \approx -0.5 + 0.2\sigma\sqrt{T}, \]

which, using the same rationale outlined above, is often
stated as \(\Delta(P_{BS}) \approx -0.5\). □
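To see how close the linear approximation is, the exact at-the-money Deltas can be compared with \(\pm 0.5 + 0.2\sigma\sqrt{T}\). The sketch below assumes zero rates and no dividends, as in the answer; the function names are hypothetical.

```cpp
#include <cmath>

// Standard normal cumulative distribution.
double norm_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Exact Black-Scholes Delta of an at-the-money call for r = q = 0,
// where d1 = sigma * sqrt(T) / 2.
double atm_call_delta(double sigma, double maturity) {
    return norm_cdf(0.5 * sigma * std::sqrt(maturity));
}

// Exact Black-Scholes Delta of an at-the-money put for r = q = 0.
double atm_put_delta(double sigma, double maturity) {
    return -norm_cdf(-0.5 * sigma * std::sqrt(maturity));
}

// Linear approximations based on N(x) ~ 0.5 + 0.4 x.
double atm_call_delta_approx(double sigma, double maturity) {
    return 0.5 + 0.2 * sigma * std::sqrt(maturity);
}

double atm_put_delta_approx(double sigma, double maturity) {
    return -0.5 + 0.2 * sigma * std::sqrt(maturity);
}
```

For example, for 30% volatility and three months to maturity, the exact and approximate call Deltas differ by less than 0.001, and the put Delta equals the call Delta minus 1, as required by the Put-Call parity.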

Question 10. What is the Put-Call parity? How do you
prove it?

Answer: The Put-Call parity is a model independent
no-arbitrage relationship between the prices of European call
and put options with the same strike and maturity.
In a nutshell, the Put-Call parity states that being
long a call option and short a put option on the same
underlying asset, and with the same strike and maturity,
is the same as being long a forward contract on the asset,
with the same maturity as the maturity of the options,
and with delivery price equal to the strike of the options.
Equivalently, being long a call and short a put is the same
as being long one unit of the underlying asset (for non
dividend paying assets) and short the present value of the
strike of the options.
More precisely, if \(C(t)\) and \(P(t)\) are the values at time
\(t\) of a European call and put option with maturity \(T\) and
strike \(K\), on the same non dividend paying asset with spot
price \(S(t)\), then the Put-Call parity states that

\[ C(t) - P(t) = S(t) - Ke^{-r(T-t)}, \qquad (3.63) \]

where \(r\) denotes the risk-free rate, assumed to be constant.

If the underlying asset pays dividends continuously at
the rate \(q\), the Put-Call parity has the form

\[ C(t) - P(t) = S(t)e^{-q(T-t)} - Ke^{-r(T-t)}. \qquad (3.64) \]

For simplicity, we restrict our attention to non dividend
paying underlying assets and prove formula (3.63).
Consider a portfolio made of the following assets:
• long 1 put option;
• long 1 unit of the underlying asset;
• short 1 call option.
The value of the portfolio at time \(t\) is

\[ V_{portfolio}(t) = P(t) + S(t) - C(t). \qquad (3.65) \]

The values of the call and put option at maturity are

\[ C(T) = \max(S(T) - K, 0); \]
\[ P(T) = \max(K - S(T), 0). \]
Then, regardless of whether \(S(T) \leq K\) or \(S(T) > K\), the
value of the portfolio at time \(T\) will be equal to \(K\).
If \(S(T) \leq K\), then \(P(T) = K - S(T)\) and \(C(T) = 0\), and
therefore

\[ V_{portfolio}(T) = P(T) + S(T) - C(T) = (K - S(T)) + S(T) - 0 = K. \qquad (3.66) \]

If \(S(T) > K\), then \(P(T) = 0\) and \(C(T) = S(T) - K\), and
therefore

\[ V_{portfolio}(T) = P(T) + S(T) - C(T) = 0 + S(T) - (S(T) - K) = K. \qquad (3.67) \]

From (3.66) and (3.67), we find that

\[ V_{portfolio}(T) = P(T) + S(T) - C(T) = K, \qquad (3.68) \]

regardless of the value \(S(T)\) of the underlying asset at the
maturity of the option.
Then, for no-arbitrage, the value of the portfolio at
time \(t\) must be equal to the present value of \(K\) at time \(t\),
i.e.,

\[ V_{portfolio}(t) = Ke^{-r(T-t)}. \qquad (3.69) \]

From (3.65) and (3.69), we obtain that

\[ P(t) + S(t) - C(t) = Ke^{-r(T-t)}. \]

This can be written as

\[ C(t) - P(t) = S(t) - Ke^{-r(T-t)}, \]

which is the Put-Call parity formula (3.63).
The Put-Call parity formula (3.64) corresponding to
an underlying asset paying dividends continuously at the
rate \(q\) can be obtained similarly using a portfolio with a
long put position, a short call position, and a long position
in \(e^{-q(T-t)}\) units of the underlying asset. All dividends
paid by the long asset position between time \(t\) and time
\(T\) are used to purchase additional fractions of the asset.
Doing so continuously results in an asset position at time
\(T\) equal to long one unit of the asset. Thus, the value of
the portfolio at time \(T\) will be equal to \(K\) regardless of
the value of \(S(T)\), and the Put-Call parity formula (3.64)
for underlying assets paying dividends continuously can
be obtained as before.

Question 11. Show that the time value of a European
call option is highest at the money.
Answer: The time value of a call option is the difference
between the value \(C(S)\) of the option and the intrinsic
value \(\max(S - K, 0)\) of the call option. In other words,
the time value of the option is

\[ C(S) - \max(S - K, 0). \]

We want to establish that the time value of the option
is highest at the money, i.e., when \(S = K\). To do so, we
show that the function

\[ f(S) = C(S) - \max(S - K, 0) \]

attains its maximum for \(S = K\).
Note that⁷

\[ f(S) = \begin{cases} C(S), & \text{if } S \leq K; \\ C(S) - S + K, & \text{if } S > K. \end{cases} \]
For \(S < K\), the function \(f(S)\) is the value of a call with
strike \(K\), and therefore is increasing.
For \(S > K\), we find that

\[ f'(S) = \Delta(C) - 1 < 0, \]

since the Delta of a call option is always less than 1,⁸ and
therefore the function \(f(S)\) is decreasing.
We conclude that \(f(S)\) has a global maximum point at
\(S = K\), which is what we wanted to show.
Note that a similar reasoning shows that the time value
of a put option, given by

\[ P(S) - \max(K - S, 0), \]

is largest at the money, i.e., for \(S = K\). If \(g(S) = P(S) -
\max(K - S, 0)\), then

\[ g(S) = \begin{cases} P(S) - K + S, & \text{if } S \leq K; \\ P(S), & \text{if } S > K. \end{cases} \]

For \(S < K\),

\[ g'(S) = \Delta(P) + 1 > 0, \]

and therefore the function \(g(S)\) is increasing, since the
Delta of a put option is between -1 (when the put is
deep in the money) and 0 (when the put is deep out of
the money), and therefore is always greater than -1.
For \(S > K\), the function \(g(S)\) is the value of a put with
strike \(K\), and therefore is decreasing.
Thus, \(g(S)\) has a global maximum point at \(S = K\), and
therefore the time value of the put option is largest at the
money.

⁷ The function \(f(S)\) is a continuous function, but it is not
differentiable at \(S = K\).
⁸ The Delta of a long call position is an increasing function
going from 0 when the spot price of the underlying asset is 0 and
the call option is deep out of the money, to 1 when the spot price
of the underlying asset goes to \(\infty\) and the call option is deep in the
money. For example, in the Black-Scholes model,

\[ \Delta(C_{BS}) = e^{-qT} N(d_1) \leq N(d_1) < 1. \]

Question 12. What is implied volatility? What is a
volatility smile? How about a volatility skew?
Answer: By definition, implied volatility is the unique
value of the volatility parameter \(\sigma\) from the lognormal
model for the evolution of the price of an underlying that
makes the Black-Scholes value of an option equal to the
market price of the option. Implied volatility exists and is
unique⁹ for any arbitrage-free market value of the option.
On the same asset, prices of options with multiple
strikes and maturities are quoted, and implied volatilities
can be computed for each of these options. If the price
of the asset had a lognormal distribution as assumed in
the Black-Scholes model, then the resulting plots of
implied volatility vs strike for the same maturity should be
flat. In practice, they are not flat, and are often shaped
as "smiles" or "skews".
An implied volatility smile occurs when the implied
volatilities of deep in the money options and of deep out of
the money options are higher than the implied volatilities
of options close to at the money. Volatility smiles are
typical for currency options.
An implied volatility skew occurs when the implied
volatilities of options with large strikes are lower than
the implied volatilities of at the money options (reverse
skew), or when the implied volatilities of options with
small strikes are lower than the implied volatilities of at
the money options (forward skew). Reverse skews are
typical for long dated equity and index options. Forward
skews are typical for commodities options.

⁹ The uniqueness of the implied volatility comes from the fact
that the Black-Scholes value of the option is a strictly increasing
function of the volatility parameter (or, equivalently, the vega of
the option is strictly positive).
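Since the Black-Scholes value is strictly increasing in the volatility parameter, the implied volatility can be found by a simple root search. The sketch below uses bisection for a call on a non dividend paying asset; bisection is only one possible choice (Newton's method is common in practice), and the function names and the search bracket from 0.000001 to 5 are assumptions made for illustration.

```cpp
#include <cmath>

// Standard normal cumulative distribution.
double norm_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Black-Scholes value of a European call on a non dividend paying asset.
double bs_call(double spot, double strike, double maturity,
               double sigma, double rate) {
    const double vol_sqrt_t = sigma * std::sqrt(maturity);
    const double d1 = (std::log(spot / strike)
                       + (rate + 0.5 * sigma * sigma) * maturity) / vol_sqrt_t;
    return spot * norm_cdf(d1)
           - strike * std::exp(-rate * maturity) * norm_cdf(d1 - vol_sqrt_t);
}

// Implied volatility by bisection. The bracket [lo, hi] is an assumed
// search interval; the market price must correspond to a volatility in it.
double implied_vol_call(double market_price, double spot, double strike,
                        double maturity, double rate,
                        double lo = 1e-6, double hi = 5.0) {
    for (int i = 0; i < 100; ++i) {
        const double mid = 0.5 * (lo + hi);
        // The call value increases with volatility, so a model value
        // below the market price means the volatility guess is too low.
        if (bs_call(spot, strike, maturity, mid, rate) < market_price) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return 0.5 * (lo + hi);
}
```

A quick consistency check is a round trip: price a call with a known volatility using `bs_call`, then pass that price to `implied_vol_call` and recover the same volatility.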

Question 13. What is the Gamma of an option? Why is
it preferable to have small Gamma? Why is the Gamma
of plain vanilla options positive?
Answer: The Gamma (\(\Gamma\)) of an option measures the
sensitivity of the Delta of the option with respect to the price
of the underlying asset, i.e.,

\[ \Gamma = \frac{\partial \Delta}{\partial S} = \frac{\partial^2 V}{\partial S^2}, \]

where \(V\) denotes the value of the option.

It is often important to immunize a portfolio with
respect to changes in the price of the underlying asset, i.e.,
to make the portfolio Delta-neutral. A portfolio with
small Gamma would need to be rebalanced less often in
order to be kept Delta-neutral, since the change in the
Delta of the portfolio is proportional to Gamma for small
changes in the value of the underlying asset. Thus, a
Delta-neutral and Gamma-neutral portfolio is well hedged
against small changes in the price of the underlying asset
(although not against jumps in the price of the underlying
asset).
The Delta of plain vanilla options (calls or puts)
increases as the spot price of the underlying asset increases.
Thus, Gamma is positive, since Gamma is the rate of
change of Delta. Moreover, Gamma is asymptotically
going to 0 for deep out of the money options and for deep
in the money options, and the highest value of Gamma
corresponds to options with strike close to the spot price
(at the money options). □

Question 14. When are a European call and a European
put worth the same? (The options are written on the same
asset and have the same strike and maturity.) What is the
intuition behind this result?
Answer: Recall from the Put-Call parity that

\[ C - P = Se^{-qT} - Ke^{-rT}, \qquad (3.70) \]

where \(C\) and \(P\) are the values of a call and put option,
respectively, with strike \(K\) and maturity \(T\) on an
underlying asset with spot price \(S\) and paying dividends
continuously at rate \(q\). If \(C = P\), it follows from (3.70) that
\(K = Se^{(r-q)T}\). Since the forward price of the underlying
asset is \(F = Se^{(r-q)T}\), we conclude that a call and a put
are worth the same if their strike is equal to the forward
price of the asset; these options are called
at-the-money-forward options.
Note that this result is independent of any assumption
on the evolution of the price of the underlying asset, since
Put-Call parity is model independent.
The fact that an at-the-money-forward call and an
at-the-money-forward put are worth the same may seem
counterintuitive at first glance: call options have
unlimited upside since their payoff at maturity,

\[ C(T) = \max(S(T) - K, 0), \]

can be infinitely large, while put options have limited
upside since their payoff at maturity,

\[ P(T) = \max(K - S(T), 0), \]

is bounded by the strike price, i.e., \(P(T) \leq K\).
However, the value of an option is equal to the
risk-neutral expected value of the option payoff at maturity.
In every model for the evolution of the price of the
underlying asset (including in the geometric Brownian
motion model underlying the Black-Scholes framework), the
probability density of the underlying asset at maturity
decreases exponentially for large values of the spot price.
This renders the expected value of large payoffs negligible,
and makes it possible for at-the-money-forward put
options to be worth the same as at-the-money-forward
call options. □

Question 15. What is the two year volatility of an asset
with 30% six months volatility?
Answer: Asset volatility scales with the square root of
time: if \(\sigma\) denotes the annualized volatility of an asset,
the volatility \(\sigma(t)\) of the asset over a time horizon \(t\) is
given by \(\sigma(t) = \sigma\sqrt{t}\). Then,

\[ \frac{\sigma(t_2)}{\sigma(t_1)} = \sqrt{\frac{t_2}{t_1}}. \]

For \(t_2 = 2\), \(t_1 = 0.5\), and \(\sigma(t_1) = \sigma(0.5) = 0.3\), we find
that \(\sigma(t_2) = \sigma(2) = 0.6\), i.e., the two year volatility of the
asset is 60%. □
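The square root of time rule is a one-line computation. A minimal sketch, with a hypothetical function name and time horizons expressed in years:

```cpp
#include <cmath>

// Rescales a volatility quoted over horizon t1 to horizon t2 using
// sigma(t2) / sigma(t1) = sqrt(t2 / t1).
double scale_volatility(double vol_t1, double t1, double t2) {
    return vol_t1 * std::sqrt(t2 / t1);
}
```

For the data of this question, `scale_volatility(0.3, 0.5, 2.0)` returns 0.6.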

Question 16. How do you value an interest rate swap?

Answer: For valuation purposes only, add payments at
maturity equal to the notional of the swap both for the
fixed leg and for the floating leg of the swap. Then, the
value of the swap for the party receiving fixed payments
is \(V_{swap} = V_{fix} - V_{float}\), where \(V_{fix}\) is the value of a
coupon bond with coupon rate equal to the fixed rate of
the swap, and \(V_{float}\) is the value of an instrument making
the floating payments of the swap, plus a payment equal
to the notional at maturity.
The value \(V_{fix}\) of the fixed rate coupon bond is the sum
of the present values of its future payments discounted
using risk-free zero rates (most frequently, LIBOR rates).
To compute \(V_{float}\), note that, right after a payment is
made, the remaining payments of the floating leg (including
the notional at maturity) are equivalent to rolling over
the notional at the prevailing zero rates until maturity.
Thus, the value of all the floating payments on a payment
date for the swap is equal to the notional. We conclude
that \(V_{float}\) is the present value of the next floating
payment (which was determined at the prior swap payment
date) plus the notional.
To illustrate swap valuation with an example, consider
a 19-month semiannual swap on a $10 million notional
with 3% fixed rate and paying semiannually compounded
LIBOR. The next floating payment that will be made in
one month is $125,000 (and was determined five months
ago, at the previous cash flow date).
The cash flow dates of the swap are 1 month, 7 months,
13 months, and 19 months. The value of the swap for the
party receiving fixed payments is \(V_{swap} = V_{fix} - V_{float}\).
The value of the 3% semiannual coupon bond corresponding
to the fixed leg of the swap is

\[ V_{fix} = 150{,}000 \cdot disc\left(\frac{1}{12}\right) + 150{,}000 \cdot disc\left(\frac{7}{12}\right) + 150{,}000 \cdot disc\left(\frac{13}{12}\right) + 10{,}150{,}000 \cdot disc\left(\frac{19}{12}\right), \qquad (3.71) \]

where \(disc(t)\) denotes the discount factor corresponding
to time \(t\). For example, if the discount factors are given
in terms of the semiannually compounded LIBOR rate
\(LIBOR(t)\), then

\[ disc(t) = \left(1 + \frac{LIBOR(t)}{2}\right)^{-2t}. \]

On the next cash flow date, i.e., in one month, the
floating leg of the swap is equal to the value of the next
floating payment, $125,000, plus the $10 million notional,
i.e., $10,125,000. Thus, the value of the floating leg of
the swap today is the present value of $10,125,000 in one
month, i.e.,

\[ V_{float} = 10{,}125{,}000 \cdot disc\left(\frac{1}{12}\right). \qquad (3.72) \]

The value of the swap for the party receiving fixed
payments is \(V_{swap} = V_{fix} - V_{float}\), where \(V_{fix}\) and \(V_{float}\) are
given by (3.71) and (3.72), respectively.
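Formulas (3.71) and (3.72) translate directly into code once discount factors are available. The sketch below makes one simplifying assumption that is not in the example: a single flat semiannually compounded zero rate is used for all maturities, whereas in practice `disc` would be read from the LIBOR zero curve. The function names are hypothetical.

```cpp
#include <cmath>

// Discount factor from a semiannually compounded zero rate:
// disc(t) = (1 + rate / 2)^(-2t).
double disc(double t, double zero_rate) {
    return std::pow(1.0 + zero_rate / 2.0, -2.0 * t);
}

// Value of the 19-month swap of the example for the party receiving
// fixed payments, under the ASSUMPTION of a flat zero rate curve.
double swap_value_receive_fixed(double flat_zero_rate) {
    const double notional = 10000000.0;
    const double fixed_payment = 150000.0;           // 3% / 2 of notional
    const double next_floating_payment = 125000.0;   // already determined
    const double payment_times[] = {1.0 / 12.0, 7.0 / 12.0,
                                    13.0 / 12.0, 19.0 / 12.0};

    // Fixed leg (3.71): coupons plus the notional added at maturity.
    double v_fix = 0.0;
    for (double t : payment_times) {
        v_fix += fixed_payment * disc(t, flat_zero_rate);
    }
    v_fix += notional * disc(19.0 / 12.0, flat_zero_rate);

    // Floating leg (3.72): next payment plus notional, paid in one month.
    const double v_float = (notional + next_floating_payment)
                           * disc(1.0 / 12.0, flat_zero_rate);
    return v_fix - v_float;
}
```

As a sanity check, with a zero rate of 0% all discount factors are 1 and the swap is worth 10,600,000 - 10,125,000 = 475,000 to the fixed receiver; higher rates reduce this value, since the fixed leg has the longer dated cash flows.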

Question 17. By how much will the price of a ten year
zero coupon bond change if the yield increases by ten basis
points?
Answer: The first order approximation of the change \(\Delta B\)
in the bond price in terms of the change \(\Delta y\) in the bond
yield is

\[ \Delta B \approx \frac{\partial B}{\partial y} \Delta y = -DB\Delta y, \qquad (3.73) \]

where \(D = -\frac{1}{B}\frac{\partial B}{\partial y}\) is the duration of the bond. Thus,

\[ \frac{\Delta B}{B} \approx -D\Delta y. \qquad (3.74) \]

Note that \(D = 10\), since the duration of a zero coupon
bond is equal to the maturity of the bond. Moreover, the
change in the bond yield is \(\Delta y = 0.001\), since 1% = 100
basis points (bps), and therefore 10 bps = 0.1% = 0.001.
From (3.74), we find that the percentage change in the
value of the bond can be estimated as follows:

\[ \frac{\Delta B}{B} \approx -10 \cdot 0.001 = -0.01. \]

In other words, the price of the ten year zero-coupon
bond decreases by approximately 1% if the yield increases
by ten basis points. □

Question 18. A five year bond with 3.5 years duration
is worth 102. What is the value of the bond if the yield
decreases by fifty basis points?
Answer: The value of the bond will increase, since the
yield of the bond decreases. More precisely, recall from
(3.73) that

\[ \Delta B \approx -DB\Delta y. \qquad (3.75) \]

For \(B = 102\), \(D = 3.5\), and \(\Delta y = -0.005\) (1% = 100
bps, and therefore 50 bps = 0.5% = 0.005), we find from
(3.75) that \(\Delta B \approx 1.785\). Thus, the new value of the bond
is \(B + \Delta B \approx 103.785\). □

Question 19. What is a forward contract? What is the
forward price?
Answer: A forward contract is an agreement between two
parties in which one party (the long position) agrees to
buy a specified quantity of the underlying asset from the
other party (the short position) at a given time in the
future and for a price, called the forward price, that is
agreed upon at the inception of the forward contract. The
forward price is chosen such that the forward contract has
value 0 at inception.¹⁰
If the underlying asset has spot price \(S_0\) and pays
dividends continuously at rate \(q\), the forward price for a
forward contract maturing at \(T\) is

\[ F = S_0 e^{(r-q)T}, \]

where \(r\) is the risk-free rate between 0 and \(T\). □

¹⁰ Note that the forward price is not the price of the forward.

Question 20. What is the forward price for treasury
futures contracts? What is the forward price for
commodities futures contracts?

Answer: A short position in a forward contract (i.e.,
selling a forward contract) is perfectly hedged by buying one
unit of the underlying asset and holding that position until
the maturity of the forward contract. The forward price
is the future value at the maturity of the forward contract
of the cost of buying one unit of the underlying asset.
For a treasury futures contract, buying the underlying
treasury bond generates a positive cash flow from receiving
all the bond coupon payments until the maturity of
the forward contract. If \(S_0\) is the spot price of the
underlying treasury bond and if \(C\) is the present value of all the
coupon payments received during the life of the futures
contract, then the forward price \(F\) of a treasury futures
contract with maturity \(T\) is the future value at time \(T\) of
\(S_0 - C\), i.e., \(F = (S_0 - C)e^{rT}\).
For a commodities futures contract, buying the
underlying commodity incurs storage costs; for example, for a
gold futures contract, buying the underlying gold would
require storing the gold safely. If \(S_0\) is the spot price of
the underlying commodity and if \(C\) is the present value of
all the storage costs, then the forward price \(F\) of a
commodities futures contract with maturity \(T\) is the future
value at time \(T\) of \(S_0 + C\), i.e., \(F = (S_0 + C)e^{rT}\), where \(r\)
denotes the risk-free rate between 0 and \(T\).
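The forward price formulas above differ only in how the cost of carrying the asset enters. A minimal sketch, assuming a constant continuously compounded risk-free rate; the function names are hypothetical.

```cpp
#include <cmath>

// Asset paying dividends continuously at rate q: F = S0 * exp((r - q) T).
double forward_price_dividend_yield(double spot, double rate,
                                    double dividend_rate, double maturity) {
    return spot * std::exp((rate - dividend_rate) * maturity);
}

// Asset generating income (e.g., bond coupons) with present value
// pv_income during the life of the contract: F = (S0 - C) * exp(r T).
double forward_price_with_income(double spot, double pv_income,
                                 double rate, double maturity) {
    return (spot - pv_income) * std::exp(rate * maturity);
}

// Asset incurring storage costs with present value pv_storage:
// F = (S0 + C) * exp(r T).
double forward_price_with_storage(double spot, double pv_storage,
                                  double rate, double maturity) {
    return (spot + pv_storage) * std::exp(rate * maturity);
}
```

For the same spot price, rate, and maturity, income lowers the forward price and storage costs raise it, relative to the forward price of an asset with no carrying costs or income.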

Question 21. What is a Eurodollar futures contract?

Answer: A Eurodollar is a dollar deposit in a bank outside
the United States. The Eurodollar rate is the interest
rate earned on Eurodollars deposited by one bank with
another bank (and is very close to LIBOR for short term
maturities).
A Eurodollar futures contract is a futures contract on
a Eurodollar rate, and deliveries are for up to ten years in
the future.
For example, a three-month Eurodollar futures
contract is a futures contract on the three month (90-day)
Eurodollar rate. The start date is the third Wednesday of
the delivery month (March, June, September, December).

Question 22. What are the most important differences
between forward contracts and futures contracts?

Answer: The main differences between the ways forward
and futures contracts are structured, settled, and traded
are:
• Futures contracts trade on exchanges and have standard
features, while forward contracts are over-the-counter
instruments.
• Futures are marked to market and settled in a margin
account on a daily basis, while forward contracts are not
settled before maturity.
• Futures carry almost no credit and counterparty risk,
since they are settled daily, while entering into a forward
contract carries some credit risk.
• Futures have a range of delivery dates, while forward
contracts have a specified delivery date.
• Futures contracts require the delivery of the underlying
asset for the futures price, while forward contracts can
be settled in cash at maturity, without the delivery of a
physical asset. □

Question 23. What is the ten-day 99% VaR of a
portfolio with a five-day 98% VaR of $10 million?
Answer: If we assume normally distributed short term
portfolio returns, the VaR (Value at Risk) of a portfolio is
proportional to the square root of the time horizon. More
precisely, if the \(N\) day \(C\)% VaR of the portfolio is denoted
by \(VaR(N, C)\), where \(N\) is the number of days in the time
horizon and \(C\) is the confidence level, then

\[ VaR(N, C) \approx \sqrt{\frac{N}{252}}\, \sigma_V\, z_C\, V(0), \qquad (3.76) \]

where \(\sigma_V\) is the (annualized) standard deviation of the
rate of return of the portfolio, \(z_C\) is the z-score of the
standard normal distribution corresponding to \(C\), i.e., \(P(Z \leq
z_C) = C\), and \(V(0)\) is the current value of the portfolio.
From (3.76), it follows that

\[ VaR(10 \text{ days}, 99\%) \approx \frac{z_{99}\sqrt{10}}{z_{98}\sqrt{5}}\, VaR(5 \text{ days}, 98\%). \]

For VaR(5 days, 98%) = $10,000,000, and since \(z_{99} \approx
2.326348\) and \(z_{98} \approx 2.053749\), we obtain that

VaR(10 days, 99%) ≈ $16,019,255. □
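The rescaling above depends only on the ratio of the time horizons and of the z-scores. A minimal sketch, assuming normally distributed returns as in the answer; the function name is hypothetical and the z-scores are passed in rather than computed.

```cpp
#include <cmath>

// Converts a VaR quoted for (days_known, z_known) into a VaR for
// (days_target, z_target), using VaR proportional to z * sqrt(days).
double rescale_var(double var_known, double days_known, double z_known,
                   double days_target, double z_target) {
    return var_known * (z_target * std::sqrt(days_target))
                     / (z_known * std::sqrt(days_known));
}
```

For the data of this question, `rescale_var(10000000.0, 5.0, 2.053749, 10.0, 2.326348)` returns approximately 16,019,255.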

Question 24. Put options with strikes 30 and 20 on
the same underlying asset and with the same maturity
are trading for $6 and $4, respectively. Can you find an
arbitrage?
Answer: Since the value of a put option with strike 0 is
$0, we in fact know the prices of put options with three
different strikes, i.e.,

\[ P(30) = 6; \quad P(20) = 4; \quad P(0) = 0, \]

where \(P(K)\) denotes the value of a put option with strike
\(K\).
In the plane \((K, P(K))\), these option values correspond
to the points (30, 6), (20, 4), and (0, 0), which are on the
line \(P(K) = \frac{K}{5}\).
This contradicts the fact that put option values are strictly
convex functions of the strike price, and creates an arbitrage
opportunity.
The arbitrage comes from the fact that the put with
strike 20 is overpriced. Using a "buy low, sell high" strategy,
we could buy (i.e., go long) 2/3 of a put option with strike
30, and sell (i.e., go short) 1 put option with strike 20. To
avoid fractions, we set up the following portfolio:
• long 2 puts with strike 30;
• short 3 puts with strike 20.
This portfolio is set up at no initial cost, since the cash
flow generated by selling 3 puts with strike 20 and buying
2 puts with strike 30 is $0:

3 · $4 - 2 · $6 = $0.

At the maturity \(T\) of the options, the value of the
portfolio is

\[ V(T) = 2\max(30 - S(T), 0) - 3\max(20 - S(T), 0). \]

Note that \(V(T)\) is nonnegative for any value \(S(T)\) of
the underlying asset:
If \(S(T) \geq 30\), then both put options expire worthless,
and \(V(T) = 0\).
If \(20 \leq S(T) < 30\), then

\[ V(T) = 2(30 - S(T)) > 0. \]

If \(0 \leq S(T) < 20\), then

\[ V(T) = 2(30 - S(T)) - 3(20 - S(T)) = S(T) \geq 0. \]

In other words, we took advantage of the existing
arbitrage opportunity by setting up, at no initial cost, a
portfolio with nonnegative payoff at \(T\) regardless of the
price \(S(T)\) of the underlying asset, and with a strictly
positive payoff if \(0 < S(T) < 30\). □

Question 25. I sell a one month put option with 28%
implied volatility today and I hedge my position
"continuously" until maturity. In one month, I calculate that the
realized volatility of the underlying asset was 16%. Did I
make money or did I lose money?

Answer: Delta-hedging an option "continuously"¹¹ is
equivalent to taking an opposite position in an option with the
same strike and maturity but with a volatility parameter
equal to the realized volatility of the underlying asset.
Thus, the short put position which was created by
selling the put at a price corresponding to 28% implied
volatility is hedged by synthesizing a long put position
equivalent to buying the put at a price corresponding to
16% volatility. Since option prices are increasing functions
of volatility, you essentially sold a 28%-vol put and hedged
it by buying the less expensive 16%-vol put, meaning that
you made money on this trade. □

¹¹ In practice, the rebalancing of the hedge is done discretely
(i.e., not continuously) once the rebalancing threshold is triggered,
which creates additional risks related to jumps in the price of the
underlying asset. Also, this hedging setup does not account for
trading costs.

Question 26. Consider the following option replication
strategy for a call option with strike 30 on an underlying
asset with spot price $25: If the price of the asset goes
above $30, buy one unit of the asset for $30 and hold
it while the price is above $30. If the price of the asset
goes back below $30, sell the one unit of the asset for
$30. Thus, at maturity, you will either hold no position,
if the price of the asset is below the strike price of $30, or you
will have one unit of the asset which you bought for $30,
corresponding to the payoff of a call option with strike
$30. Seemingly, you replicated the call option at no cost.
What is wrong with this argument?
Answer: We first clarify how this can be viewed as a
replicating strategy for a call option with strike 30:
• if the price of the underlying asset never goes above $30,
you never buy the asset and end up with a $0 position,
which is the same as the payoff \(\max(S(T) - 30, 0)\) of a call
option with strike 30 when \(S(T) < 30\); here, \(T\) denotes
the maturity of the option and \(S(T)\) the spot price of the
underlying asset at maturity;
• every time the price of the underlying asset goes above
$30 and then back below $30, you end up with a $0
position, since you bought the asset for $30 when the price
went above $30 and then sold the asset for $30 when the
price went below $30;
• if after the last time before maturity when the asset price
crosses $30 the price ends up above $30, the last trade you
made before maturity was buying one unit of the asset for
$30; the value of your position at \(T\) is \(S(T) - 30\), which is
the same as the payoff \(\max(S(T) - 30, 0)\) of the call option
with strike 30 since \(S(T) > 30\);
• if after the last time before maturity when the asset
price crosses $30 the price ends up below $30, the last
trade you made before maturity was selling one unit of
the asset for $30, which cancels out the prior buying of
the asset for $30; you end up with a $0 position, which is
the same as the payoff \(\max(S(T) - 30, 0)\) of a call option
with strike 30 since \(S(T) < 30\).
Thus, the outcome of this strategy is \(\max(S(T) - 30, 0)\),
which is exactly the payoff of a call option with strike 30
on the asset, and this was apparently achieved at no cost.
However, this cannot be the case.
The argument contains several flaws as detailed below:
Firstly, assets do not trade at one price, but rather
with a bid-ask spread, where the bid price is the price at
which you can sell the asset and the ask price is the price
at which you can buy the asset, with the bid price always
smaller than the ask price.
When the asset price goes above $30, what you will be
looking for in order to buy the asset (at a time denoted
by \(t_1\)) is for $30 to be below the bid price, i.e., \(30 <
S_{bid}(t_1)\). You will then buy the asset at the ask price
\(S_{ask}(t_1)\), which will be even higher than $30 since \(30 <
S_{bid}(t_1) < S_{ask}(t_1)\). When the asset price goes below $30,
what you will be looking for in order to sell the asset (at
a time denoted by \(t_2\)) is for the ask price to go below $30,
i.e., \(S_{ask}(t_2) < 30\). You will then sell the asset at the
bid price \(S_{bid}(t_2)\), which will be even lower than $30 since
\(S_{bid}(t_2) < S_{ask}(t_2) < 30\). Thus, every time the price of
the underlying asset goes above $30 at time \(t_1\) and then
back below $30 at time \(t_2\) you do not end up with a $0
position from buying the asset at $30 and selling it at $30,
but will lose money since you buy the asset at the price
\(S_{ask}(t_1) > 30\) and you sell the asset at the price \(S_{bid}(t_2) <
30\), losing an amount equal to \(S_{ask}(t_1) - S_{bid}(t_2)\).
If the price moves smoothly the loss could be small,
but the loss could be significant if price jumps of the
underlying asset occur when the $30 threshold is breached.
Also, the more often the $30 price is crossed, the higher
the trading losses you will incur.
Moreover, every trade incurs trading costs which were
not included in the idealized trading version that seemed
to generate a zero-cost replication strategy for the call
option. □

3.4 C++. Data structures.

Question 1. How do you declare an array?

Answer: An array can be declared either on the stack or on
the heap.

//created on stack, uninitialized
T identifier[size];

//created on stack, initialized
T identifier[] = initializer_list;

//created on heap, uninitialized
T* identifier = new T[size];

Example:
int foo[3];
int bar[] = {1, 2, 3};
int* baz = new int[3];

Question 2. How do you get the address of a variable?

Answer: Use the ampersand before the name of the
variable, e.g.

T var;
T* ptr = &var;

Example:
int foo = 1;
int* foo_ptr = &foo;

Question 3. How do you declare an array of pointers?


Answer: The same way as declaring an array, but making
the type, T, a pointer:

T* identifier[size];
T* identifier[] = initializer_list;
T** identifier = new T*[size];

Example:

int a = 1, b = 2, c = 3;

int* foo[3];
int* bar[] = {&a, &b, &c};
int** baz = new int*[3];

Question 4. How do you declare a const pointer, a
pointer to a const and a const pointer to a const?

Answer:

//pointer to a read only variable
const T* identifier;
T const* identifier;

//read only pointer to a variable
T *const identifier = rvalue;

//read only pointer to a read only variable
const T *const identifier = rvalue;
T const *const identifier = rvalue;

Example:

//read only variables
const int a = 2; const int b = 2;
int c = 3;

//pointer to read only a, then re-pointed to read only b
int const* foo_two;
foo_two = &a; foo_two = &b;

//same declaration, written with const first
const int* foo;
foo = &a; foo = &b;

//read only pointer to c
//it needs to be initialized
int *const bar = &c;

//read only pointer to read only a
//it needs to be initialized
const int *const baz = &a;

Question 5. How do you declare a dynamic array?

Answer:

T* identifier = new T[size];

T* identifier = nullptr;
T* identifier;

delete[] identifier;

Example:

int *foo = new int[4];

int *bar = nullptr;
bar = new int[4];

Question 6. What is the general form for a function
signature?
Answer:

return_type function_name(parameter_list);

Example:

int my_sum(int a, int b);

Question 7. How do you pass-by-reference?


Answer:

return_type function_name(T & identifier);

The identifier is now an alias for the argument.

Question 8. How do you pass a read only argument by
reference?
Answer:

return_type function_name(const T & identifier);

Once you define a parameter as const, you will not be
able to modify it in the function.
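The two signatures above can be contrasted in a short sketch. The function names add_in_place and sum_all are hypothetical and chosen only for illustration; the first modifies the caller's variable through the reference, the second reads its argument without copying it.

```cpp
#include <vector>

// Pass-by-reference: total is an alias for the caller's variable,
// so the update is visible after the call returns.
void add_in_place(int& total, int amount) {
    total += amount;
}

// Pass-by-const-reference: no copy of the vector is made and the
// function is not allowed to modify it.
int sum_all(const std::vector<int>& values) {
    int sum = 0;
    for (int v : values) sum += v;
    // values.push_back(0);  // would not compile: values is const
    return sum;
}
```

Passing large objects by const reference is the usual choice when the function only needs to read them, since it avoids the cost of a copy.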

Question 9. What are the important differences between
using a pointer and a reference?
Answer: Several differences between using a pointer and
a reference are:
• A pointer can be re-assigned any number of times, while
a reference cannot be re-seated after initialization.
• A pointer can be null (NULL, or nullptr since C++11),
while a reference must always refer to an object.
• Taking the address of a reference gives the address of the
object it refers to; unlike a pointer, the reference itself has
no separate address that can be taken.
• There is no reference arithmetic.

Question 10. How do you set a default value for a pa-
rameter?
Answer:

return_type function_name(T identifier = rvalue);

The parameters with default values must be placed at the
end of the parameter list.

Question 11. How do you create a template function?


Answer:
template<class T>
return_type function_name(parameter_list);

template<typename T>
return_type function_name(parameter_list);

Note that the parameter type can be specified, when call-
ing the function, explicitly or implicitly. Also note that
there is no technical difference between using class or
typename here; the choice is a readability convention
(e.g., typename for primitive types and class for classes).

Example:
template<typename T>
T temp_sum(T a, T b) {return a+b;}

struct Processor{
  int a;
  int apply(int b) {return a+b;}
};

template<class T>
int temp_sum_2(int a, int b) {
  T processor;
  processor.a = a;
  return processor.apply(b);
}

int main(){
  //implicit, foo equals 3
  int foo = temp_sum(1,2);

  //explicit, bar equals 3
  int bar = temp_sum_2<Processor>(1,2);
}

Question 12. How do you declare a pointer to a func-
tion?
Answer:

return_type (*identifier)(list_parameter_types);

Example:
int my_sum(int a, int b) {return a+b;}
int main(){
  int (*p_func)(int, int);
  p_func = &my_sum;

  // foo equals 3
  int foo = p_func(1,2);
}

Question 13. How do you prevent the compiler from
doing an implicit conversion with your class?
Answer: Use the keyword explicit to define the con-
structor:

explicit Classname(parameter_list);
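As an illustration of the effect of explicit, consider the sketch below. The class Meters and the function twice are hypothetical names introduced for this example only.

```cpp
// Hypothetical wrapper class used only for illustration.
class Meters {
public:
    // explicit: a double is never silently converted into a Meters object.
    explicit Meters(double value) : value_(value) {}
    double value() const { return value_; }
private:
    double value_;
};

double twice(const Meters& m) { return 2.0 * m.value(); }

// twice(3.0);          // does not compile: no implicit double -> Meters
// twice(Meters(3.0));  // compiles: the conversion is spelled out
```

Without the explicit keyword, the call twice(3.0) would compile and build a temporary Meters object behind the scenes, which is the implicit conversion the question refers to.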

Question 14. Describe all the uses of the keyword static
in C++.
Answer: Inside a function, using the keyword static
means that the variable is initialized only once, the first
time its declaration is reached, and then keeps its value
between calls until the end of the program.
Inside a class definition, either for a variable or for a
member function, using the keyword static means that
there is only one copy per class, shared between all
instances.
For a global variable or a function in a file of code, using
the keyword static means that the name is private to that
file (internal linkage).

Question 15. Can a static member function be const?
Answer: No. When the const qualifier is applied to a non-
static member function, it means that the member function
cannot change the instance it is called on (i.e., it cannot
change any non-mutable members of *this). Since static
member functions are defined at class level, where there is
no notion of this, the const qualifier for member functions
does not apply.

Question 16. C++ constructors support the initializa-
tion of member data via an initializer list. When is this
preferable to initialization inside the body of the construc-
tor?
Answer: The initialization list has to be used for const
members, references, and members without default
constructors. For any other type of member, initialization
through the initialization list is still preferable, since it
is more efficient. Using the initialization list, the members
are initialized by calling their constructors directly. If the
initialization is done in the body of the constructor, each
member is first default-constructed and then an assignment
operator is called to give that member its value.
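The following sketch shows the three kinds of members discussed above. The class Account and its member names are hypothetical; the const member and the reference member could not be set in the constructor body at all.

```cpp
#include <string>

// Hypothetical class used only for illustration.
class Account {
public:
    Account(int id, const std::string& owner, double& rate)
        : id_(id), owner_(owner), rate_(rate) {}  // initializer list

    int id() const { return id_; }
    const std::string& owner() const { return owner_; }
    double rate() const { return rate_; }

private:
    const int id_;       // const member: must be initialized in the list
    std::string owner_;  // copy-constructed directly, no extra assignment
    double& rate_;       // reference member: must be initialized in the list
};
```

Members are initialized in the order in which they are declared in the class, not in the order in which they appear in the initializer list, so it is good practice to keep the two orders the same.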

Question 17. What is a copy constructor, and how can
the default copy constructor cause problems when you
have pointer data members?
Answer: A copy constructor allows you to create a new
object as a copy of an existing instance. The default copy
constructor creates the new object by copying the exist-
ing object, member by member, and thus when there are
pointer members you end up with two objects pointing to
the same memory.
It is important to note that the copy constructor is
called every time a function receives an object via the
pass-by-value mechanism. This means that the copy con-
structor needs to take its parameter by reference. Otherwise
you would be recursively calling the copy constructor. You
should always set the parameter for a copy constructor to
be const. The first signature below is the recommended
one; the other three are also valid copy constructors:

ClassName( const ClassName& other );

ClassName( ClassName& other );
ClassName( volatile const ClassName& other );
ClassName( volatile ClassName& other );

Question 18. What is the output of the following code:

#include <iostream>
using namespace std;

class A
{
public:
  int * ptr;
  ~A()
  {
    delete(ptr);
  }
};

void foo(A object_input)
{
}

int main()
{
  A aa;
  aa.ptr = new int(2);
  foo(aa);
  cout<<(*aa.ptr)<<endl;
  return 0;
}

Answer: The output of the code is unpredictable (the
behavior is undefined): depending on the compiler used,
the program may print a garbage number or crash. The
reason for this is that we do not define our own copy
constructor.
When we call the foo function, the compiler will gen-
erate a default copy constructor which will shallow copy
every data member defined in class A. As a result, two
pointers, one in the temporary object and the other in the
object aa, will point to the same area in memory. When we
leave the foo function, the compiler will automatically call
the destructor of the temporary object, which deletes the
pointer and frees the area it points to. The pointer in aa
still points to that area, which has been freed. When we
try to read the data through the pointer in aa, we may get
garbage information. Moreover, when main ends, the
destructor of aa deletes the same memory a second time.
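One possible fix, sketched below under the assumption that each object of class A should own its own integer, is to define a copy constructor and a copy assignment operator that perform a deep copy. The default constructor that sets ptr to nullptr is an addition made for this sketch and is not part of the code in the question.

```cpp
// Sketch of class A where every object owns its own int.
class A {
public:
    int* ptr;

    A() : ptr(nullptr) {}

    // Deep copy: allocate a new int holding the same value.
    A(const A& other) : ptr(other.ptr ? new int(*other.ptr) : nullptr) {}

    // Copy assignment: build the copy first, then release the old memory.
    A& operator=(const A& other) {
        if (this != &other) {
            int* copy = other.ptr ? new int(*other.ptr) : nullptr;
            delete ptr;
            ptr = copy;
        }
        return *this;
    }

    ~A() { delete ptr; }
};

// Receives a deep copy; destroying it leaves the caller's object intact.
void foo(A object_input) {}
```

With this version, the temporary object created for the call to foo deletes only its own copy, so dereferencing aa.ptr after the call is well defined. An alternative design is to store the integer in a smart pointer such as std::unique_ptr, which makes the class non-copyable unless copying is implemented explicitly.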

Question 19. How do you overload an operator?


Answer:

type operator symbol (parameter_list);

If you define the operator outside of the class, then it
will be a global operator function.
Example:

struct FooClass{int a;};

int operator + (FooClass lhs, FooClass rhs) {
  return lhs.a + rhs.a;
}

Question 20. What are smart pointers?

Answer: A smart pointer is a class built to mimic a pointer
(offering dereferencing and member access) that also
offers extra features to simplify the usage, sharing and
management of resources.
C++11 comes with three implementations of smart
pointers: shared_ptr, unique_ptr, and weak_ptr.
Example:

{
  //shared_ptr maintains
  //a reference count
  //when the count is zero the object
  //pointed to is destroyed
  std::shared_ptr<int> foo(new int(3));
  std::shared_ptr<int> bar = foo;

  //memory not released
  //bar is still in scope
  foo.reset();

  //releases the memory,
  //since no one is using it
  bar.reset();
}

Question 21. What is encapsulation?


Answer: Encapsulation is the ability to expose an inter-
face while hiding implementation. This is usually achieved
through access modifiers (public, private, protected, etc.).

Question 22. What is polymorphism?

Answer: Polymorphism is the ability for a set of classes
to all be referenced through a common interface.

Question 23. What is inheritance?

Answer: Inheritance is the ability for one class to extend
another through sub-classing. This is also referred to as
“white-box” (the opposite of “black-box”) re-use. A li-
brary can provide base classes that may be extended by
the application developer.

Question 24. What is a virtual function? What is a
pure virtual function and when do you use it?
Answer: Virtual functions are member functions whose
calls are resolved at run time to the most derived version
with the same signature. This means that if code written
against a base class Foo, with a virtual member function
f, is called using an instance of a subclass FooChild, the
call to f is dynamically bound to the implementation of the
subclass (even though the code only refers to the base
class).
A pure virtual function is a virtual function with no
implementation in the base class, making the base class
abstract (and thus it cannot be instantiated). Derived
classes are forced to override the pure virtual function if
they want to be instantiated. You use the same syntax as
for a virtual function but add =0 to its declaration within
the class.
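A small sketch of both kinds of functions follows. The class names Instrument and Call, and the helper functions describe and value_at, are hypothetical and serve only to illustrate dynamic binding.

```cpp
#include <string>

// Abstract base class: it has a pure virtual function.
struct Instrument {
    virtual ~Instrument() = default;
    virtual std::string name() const { return "instrument"; }  // virtual
    virtual double payoff(double spot) const = 0;              // pure virtual
};

// Concrete class: overrides both functions, so it can be instantiated.
struct Call : Instrument {
    explicit Call(double strike) : strike_(strike) {}
    std::string name() const override { return "call"; }
    double payoff(double spot) const override {
        return spot > strike_ ? spot - strike_ : 0.0;
    }
    double strike_;
};

// Written against the base class only; the calls are bound at run time.
std::string describe(const Instrument& inst) { return inst.name(); }
double value_at(const Instrument& inst, double spot) {
    return inst.payoff(spot);
}
```

A declaration such as `Instrument inst;` would not compile, since Instrument is abstract, while a Call object can be passed to any function expecting an Instrument reference.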

Question 25. Why are virtual functions used for de-
structors? Can they be used for constructors?
Answer: Destructors of base classes should be declared
virtual so that, when an object of a derived class is
destroyed through a pointer to the base class, the proper
destructor (in the class hierarchy) is called at run time.
When calling a constructor, the caller needs to know
the exact type of the object to be created, and thus
constructors cannot be virtual.
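The effect of a virtual destructor can be observed with the sketch below. The classes Base and Derived and the integer log are hypothetical devices for this example: each destructor adds a different amount to the log, so the final value shows which destructors ran.

```cpp
#include <memory>

// Hypothetical classes; log_ records which destructors have run.
struct Base {
    explicit Base(int& log) : log_(log) {}
    virtual ~Base() { log_ += 1; }
    int& log_;
};

struct Derived : Base {
    explicit Derived(int& log) : Base(log) {}
    ~Derived() override { log_ += 10; }
};

// Destroys a Derived object through a pointer to Base.
int destroy_through_base() {
    int log = 0;
    {
        std::unique_ptr<Base> p(new Derived(log));
    }  // ~Derived runs first, then ~Base
    return log;
}
```

If the destructor of Base were not virtual, deleting the Derived object through the Base pointer would be undefined behavior, and in practice the destructor of Derived would typically not be called.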

Question 26. Write a function that computes the facto-
rial of a positive integer.

Answer:

//for implementation
int factorial(int n){
  int output = 1;
  for (int i = 2; i <= n; ++i)
    output *= i;
  return output;
}

//recursive implementation
int factorial(int n){
  if (n == 0) return 1;
  return n*factorial(n-1);
}

//tail recursive implementation
int factorial(int n, int last = 1){
  if (n == 0) return last;
  return factorial(n-1, last * n);
}

Question 27. Write a function that takes an array and
returns the subarray with the largest sum.
Answer:

#include <vector>
#include <algorithm> // std::max

using namespace std;

//returns the largest sum over all contiguous subarrays
//(the empty subarray, with sum 0, is allowed)
template <typename T>
T max_sub_array(vector<T> const & numbers){
  T max_ending = 0, max_so_far = 0;
  for(auto & number: numbers) {
    //T(0) keeps both arguments of max of the same type
    max_ending = max(T(0), max_ending + number);
    max_so_far = max(max_so_far, max_ending);
  }
  return max_so_far;
}

Question 28. Write a function that returns the prime
factors of a positive integer.
Answer:

#include <vector>
using namespace std;

vector<int> prime_factors(int n){
  vector<int> factors;
  for (int i = 2; i <= n/i; ++i)
    while (n % i == 0) {
      factors.push_back(i);
      n /= i;
    }
  //whatever is left is a prime factor larger than sqrt(n)
  if (n > 1)
    factors.push_back(n);
  return factors;
}

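Question 29. Write a function that takes a 64-bit integer
and swaps the bits at indices i and j.
Answer:
//assumes that long is 64 bits wide on the target platform
long swap_bits(long x, const int &i, const int &j){
  //the swap changes x only if the two bits differ
  if (((x >> i) & 1L) != ((x >> j) & 1L))
    x ^= (1L << i) | (1L << j);
  return x;
}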

Question 30. Write a function that reverses a single
linked list.

Answer:

#include <memory> // shared_ptr

using namespace std;

template<typename T>
struct node_t {
  T data;
  shared_ptr<node_t<T>> next;
};

//recursive implementation
template<typename T> shared_ptr<node_t<T>>
reverse_linked_list(
    const shared_ptr<node_t<T>> &head){
  if (!head || !head->next) {
    return head;
  }
  shared_ptr<node_t<T>>
    new_head = reverse_linked_list(head->next);
  head->next->next = head;
  head->next = nullptr;
  return new_head;
}

//while implementation
template<typename T> shared_ptr<node_t<T>>
reverse_linked_list(
    const shared_ptr<node_t<T>> &head){
  shared_ptr<node_t<T>>
    prev = nullptr, curr = head;
  while(curr) {
    shared_ptr<node_t<T>> temp = curr->next;
    curr->next = prev;
    prev = curr;
    curr = temp;
  }
  return prev;
}

Question 31. Write a function that takes a string and
returns true if its parentheses are balanced.
Answer:

#include<string>
#include<stack>
using namespace std;

bool is_par_balanced(const string input)
{
  //"()) (()" => false
  //"(a(dd) () (()))" => true
  stack<char> par_stack;
  for(auto &c: input)
  {
    if (c==')')
    {
      if (par_stack.empty())
        return false;
      else if (par_stack.top()=='(')
        par_stack.pop();
    }
    else if (c=='(')
      par_stack.push(c);
  }
  return par_stack.empty();
}

Question 32. Write a function that returns the height
of an arbitrary binary tree.
Answer:

#include<memory> //std::shared_ptr
#include<algorithm> //std::max
using namespace std;

template <typename T>
struct BinaryTree {
  T data;
  shared_ptr<BinaryTree<T>> left, right;
};

template <typename T>
int height(
    const shared_ptr<BinaryTree<T>> &tree,
    int count = -1){
  if (!tree) return count;
  return max(
    height(tree->left, count + 1),
    height(tree->right, count + 1));
}

Question 33. Write a C++ function that computes the
n-th Fibonacci number.
Answer: The Fibonacci numbers $(F_n)_{n \geq 0}$ are given by the
following recurrence:

\[ F_{n+2} = F_{n+1} + F_n, \quad \forall\, n \geq 0, \]

with $F_0 = 0$ and $F_1 = 1$.

//recursive implementation
int fib(int n) {
  if (n == 0 || n == 1) return n;
  else {
    return fib(n-1) + fib(n-2);
  }
}

//iterative implementation
int fib(int n){
  if (n == 0 || n == 1) return n;
  int prev = 0, last = 1, temp;
  for (int i = 2; i <= n; ++i) {
    temp = last;
    last = prev + last;
    prev = temp;
  }
  return last;
}

//tail recursive implementation
int fib(int n, int last = 1, int prev = 0)
{
  if (n == 0) return prev;
  if (n == 1) return last;
  return fib(n-1, last+prev, last);
}

Question 34. Implement a basic calculator to evaluate a
simple expression string. The expression string may con-
tain open parentheses "(" and closing parentheses ")", the
plus sign "+" or the minus sign "-", non-negative integers
and empty spaces.
Note: You may assume that the given expression is always
valid. Do not use the "eval" built-in library function.
Example 1:

Input: "1 + 1"
Output: 2

Example 2:

Input: " 2-1 + 2 "
Output: 3

Example 3:

Input: "(1+(4+5+2)-3)+(6+8)"
Output: 23

Answer: Use a stack data structure as follows (sample
code in C++):

#include <cctype> // isdigit
#include <stack>
#include <string>
using namespace std;

class Solution {
public:
  int calculate(string s) {
    // two stacks, one on numbers, the other on operators
    stack<int> nums, ops;
    int res = 0, tmp = 0, sign = 1;
    for (int i = 0; i < s.size(); ++i) {
      if (isdigit(s[i])) {
        // parsing number (could be more than one digit)
        tmp = tmp * 10 + (s[i] - '0');
      }
      else {
        // now we are at a non-numeric char.
        // if it is addition / subtraction,
        // we are fine to keep accumulating
        // if it is opening bracket,
        // we save the current result
        // if it is closing bracket,
        // we clear out previously saved op and number
        res += sign * tmp;
        tmp = 0;
        if (s[i] == '+') sign = 1;
        if (s[i] == '-') sign = -1;
        if (s[i] == '(') {
          nums.push(res);
          ops.push(sign);
          res = 0;
          sign = 1;
        }
        if (s[i] == ')' && ops.size()) {
          res = ops.top() * res + nums.top();
          ops.pop();
          nums.pop();
        }
      }
    }
    res += sign * tmp;
    return res;
  }
};

3.5 Monte Carlo simulations. Numerical methods.

Question 1. How would you compute $\pi$ using Monte
Carlo simulations? What is the standard deviation of this
method?

Answer: An Acceptance-Rejection type method can be
used to approximate $\pi$ as follows: generate $N$ points uni-
formly in the $[-1,1] \times [-1,1]$ square and accept a point
$(x,y)$ if the point is in the unit disk $D(0,1)$, i.e., if
$x^2 + y^2 \leq 1$. If the number of accepted points is $A$, then the
ratio $\frac{A}{N}$ converges in the limit as $N \to \infty$ to the ratio of
the area of the unit disk to the area of the square, which
is equal to $\frac{\pi}{4}$. Therefore,

\[ \pi \approx \frac{4A}{N}. \]

The standard deviation of the method is $O\left(\frac{1}{\sqrt{N}}\right)$. We
make this more precise by computing below the coefficient
of $\frac{1}{\sqrt{N}}$ in $O\left(\frac{1}{\sqrt{N}}\right)$; see (3.79).
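Let $U_1, U_2, \ldots$ be a sequence of independent identi-
cally distributed bivariate random variables uniformly dis-
tributed in $[-1,1] \times [-1,1]$. Denote by $1_{D(0,1)}$ the indica-
tor function of the unit disk $D(0,1)$, i.e.,

\[ 1_{D(0,1)}(x,y) = \begin{cases} 1, & \text{if } (x,y) \in D(0,1); \\ 0, & \text{otherwise.} \end{cases} \]

Let

\[ X_i = 1_{D(0,1)}(U_i), \quad \forall\, i \geq 1. \]

The random variables $X_i$, $i \geq 1$, are independent and
identically distributed, with

\[ E[X_i] = E\left[1_{D(0,1)}(U_i)\right] = \frac{1}{4} \iint_{D(0,1)} dx\, dy = \frac{\pi}{4}. \tag{3.77} \]

Since $X_i$, $i \geq 1$, are integrable, it follows from the strong
law of large numbers that

\[ \lim_{n \to \infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\pi}{4} \quad \text{almost surely.} \]

Note that $X_1 + X_2 + \cdots + X_n$ counts how many points
out of the randomly selected $n$ points reside in the disk
$D(0,1)$. Thus, for $N$ large enough,

\[ \pi \approx \frac{4A}{N} = \frac{4}{N} \sum_{i=1}^{N} X_i. \tag{3.78} \]

To calculate the variance of the estimation in (3.78),
note that

\[ \left(1_{D(0,1)}(U_i)\right)^2 = 1_{D(0,1)}(U_i), \quad \forall\, 1 \leq i \leq N, \]

and therefore, using (3.77), we find that

\[ \mathrm{var}(X_i) = E[X_i^2] - (E[X_i])^2 = E\left[\left(1_{D(0,1)}(U_i)\right)^2\right] - \left(\frac{\pi}{4}\right)^2 = E\left[1_{D(0,1)}(U_i)\right] - \frac{\pi^2}{16} = \frac{\pi}{4} - \frac{\pi^2}{16} = \frac{\pi(4-\pi)}{16}. \]

Then, since the random variables $X_i$ are independent,

\[ \mathrm{var}\left(\frac{4}{N} \sum_{i=1}^{N} X_i\right) = \frac{16}{N^2} \sum_{i=1}^{N} \mathrm{var}(X_i) = \frac{\pi(4-\pi)}{N}. \]

By taking the square root of the variance, we conclude
that the standard deviation of this Monte Carlo method
for estimating $\pi$ is

\[ \sqrt{\pi(4-\pi)}\, \frac{1}{\sqrt{N}} = O\left(\frac{1}{\sqrt{N}}\right). \tag{3.79} \]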
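The method above can be sketched in C++ as follows. The function names estimate_pi and pi_estimator_std, the use of the Mersenne Twister generator, and the seed parameter are choices made for this illustration, not part of the original solution.

```cpp
#include <cmath>
#include <random>

// Estimate pi from n_points uniform points in [-1,1] x [-1,1]:
// count the points that fall in the unit disk and return 4A/N.
double estimate_pi(int n_points, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unif(-1.0, 1.0);
    int accepted = 0;
    for (int i = 0; i < n_points; ++i) {
        double x = unif(gen);
        double y = unif(gen);
        if (x * x + y * y <= 1.0) ++accepted;
    }
    return 4.0 * accepted / n_points;
}

// Standard deviation of the estimator, sqrt(pi (4 - pi) / N), see (3.79).
double pi_estimator_std(int n_points) {
    const double pi = std::acos(-1.0);
    return std::sqrt(pi * (4.0 - pi) / n_points);
}
```

Since $\sqrt{\pi(4-\pi)} \approx 1.64$, roughly 27,000 points are needed for a standard deviation of 0.01, and each extra correct digit requires about 100 times more points.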

Question 2. What methods do you know for generating
independent samples of the standard normal distribution?
Answer: The three most commonly used methods to gen-
erate independent standard normal samples are:
• Box–Muller (using the Marsaglia–Bray algorithm in or-
der to avoid estimating trigonometric functions);
• Acceptance–Rejection;
• Inverse Transform.
For details on these methods, see Glasserman [2].

Question 3. How do you generate a geometric Brownian
motion stock path using random numbers from a normal
distribution?
Answer: Consider a stock whose price follows the geomet-
ric Brownian motion

\[ dS_t = \mu S_t\, dt + \sigma S_t\, dW_t, \tag{3.80} \]

where $\mu$ and $\sigma$ are the drift and the volatility of the stock
price, and $W_t$ is a Wiener process. To generate a price
path for the stock between time 0 and time $T$, discretize
the time interval into $m$ equal time steps of size $\delta t = \frac{T}{m}$,
and let $t_j = j\, \delta t$, for $j = 0 : m$.
By integrating (3.80) between $t_j$ and $t_{j+1}$, it follows
that

\[ S_{t_{j+1}} - S_{t_j} = \mu \int_{t_j}^{t_{j+1}} S_t\, dt + \sigma \int_{t_j}^{t_{j+1}} S_t\, dW_t. \tag{3.81} \]

We use the following approximations:

\[ \int_{t_j}^{t_{j+1}} S_t\, dt \approx S_{t_j} (t_{j+1} - t_j) = S_{t_j}\, \delta t; \tag{3.82} \]

\[ \int_{t_j}^{t_{j+1}} S_t\, dW_t \approx S_{t_j} \left(W_{t_{j+1}} - W_{t_j}\right) = S_{t_j} \sqrt{\delta t}\, Z_{j+1}, \tag{3.83} \]

where $Z_{j+1}$ is a standard normal variable, since $W_t$ is a
Wiener process and therefore $W_{t_{j+1}} - W_{t_j}$ is a normal
variable with mean 0 and variance $t_{j+1} - t_j = \delta t$, i.e.,

\[ W_{t_{j+1}} - W_{t_j} = \sqrt{\delta t}\, Z_{j+1}. \tag{3.84} \]

Note that $Z_j$, for $j = 1 : m$, are independent standard
normals.
If $z_1, z_2, \ldots, z_m$ are independent samples of the stan-
dard normal distribution, we obtain from (3.81–3.83) that
(3.80) can be discretized as follows:

\[ S_{t_{j+1}} - S_{t_j} = \mu S_{t_j}\, \delta t + \sigma S_{t_j} \sqrt{\delta t}\, z_{j+1}, \]

for $j = 0 : (m-1)$, which can be written as

\[ S_{t_{j+1}} = S_{t_j} \left(1 + \mu\, \delta t + \sigma \sqrt{\delta t}\, z_{j+1}\right), \]

for $j = 0 : (m-1)$.
Note that there is a very small probability that the
price path above becomes negative, which is a drawback
of using this discretization.
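A price path which is always positive can be generated
using Ito's formula to express (3.80) as

\[ d(\ln(S_t)) = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t. \tag{3.85} \]

By integrating (3.85) between $t_j$ and $t_{j+1}$, it follows that

\[ \ln\left(\frac{S_{t_{j+1}}}{S_{t_j}}\right) = \ln(S_{t_{j+1}}) - \ln(S_{t_j}) = \left(\mu - \frac{\sigma^2}{2}\right)(t_{j+1} - t_j) + \sigma \left(W_{t_{j+1}} - W_{t_j}\right) = \left(\mu - \frac{\sigma^2}{2}\right) \delta t + \sigma \sqrt{\delta t}\, Z_{j+1}, \tag{3.86} \]

for $j = 0 : (m-1)$, where (3.84) was used for (3.86).
Then, (3.86) can be discretized as follows:

\[ \ln\left(\frac{S_{t_{j+1}}}{S_{t_j}}\right) = \left(\mu - \frac{\sigma^2}{2}\right) \delta t + \sigma \sqrt{\delta t}\, z_{j+1}, \]

for $j = 0 : (m-1)$, and therefore

\[ S_{t_{j+1}} = S_{t_j} \exp\left(\left(\mu - \frac{\sigma^2}{2}\right) \delta t + \sigma \sqrt{\delta t}\, z_{j+1}\right), \]

for all $j = 0 : (m-1)$.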
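A minimal C++ sketch of the positive path generation, based on the exponential recursion obtained from (3.86), is given below. The function name gbm_path, its parameter names, and the choice of std::mt19937 with std::normal_distribution as the source of standard normal samples are assumptions made for this example.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Generate one path S_{t_0}, ..., S_{t_m} of a geometric Brownian motion
// with drift mu and volatility sigma on [0, T], using m equal time steps.
std::vector<double> gbm_path(double s0, double mu, double sigma,
                             double T, int m, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> normal(0.0, 1.0);
    const double dt = T / m;
    const double drift = (mu - 0.5 * sigma * sigma) * dt;
    const double vol = sigma * std::sqrt(dt);

    std::vector<double> path(m + 1);
    path[0] = s0;
    for (int j = 0; j < m; ++j) {
        double z = normal(gen);  // sample z_{j+1}
        path[j + 1] = path[j] * std::exp(drift + vol * z);
    }
    return path;
}
```

Because every step multiplies the previous price by an exponential, all the values on the path are positive whenever the initial price is positive, unlike the additive discretization discussed first.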

Question 4. How do you generate a sample of the stan-
dard normal distribution from 12 independent samples of
the uniform distribution on [0, 1]?
Answer: If $u_1, u_2, \ldots, u_{12}$ are 12 independent samples of
the uniform distribution on [0, 1], then

\[ \sum_{i=1}^{12} u_i - 6 \tag{3.87} \]

can be used as a sample of the standard normal distribu-
tion.
To see this, recall from the Central Limit Theorem
that, if $X_i$, $i \geq 1$, are independent identically distributed
random variables with finite expected value $E[X]$ and
standard deviation $\sigma(X)$, then

\[ \lim_{n \to \infty} \frac{\frac{1}{n} \sum_{i=1}^{n} X_i - E[X]}{\sigma(X)/\sqrt{n}} = Z, \tag{3.88} \]

where $Z$ is the standard normal distribution and the con-
vergence in (3.88) is in distribution.
Let $U_1, U_2, \ldots, U_{12}$ be 12 independent uniform distri-
butions on [0, 1]. Using (3.88) for $n = 12$ and $X_i = U_i$,
$i = 1 : 12$, we infer that

\[ \frac{\frac{1}{12} \sum_{i=1}^{12} U_i - E[U]}{\sigma(U)/\sqrt{12}} \approx Z, \tag{3.89} \]

where

\[ E[U] = \int_0^1 u\, du = \frac{1}{2}; \tag{3.90} \]

\[ \sigma^2(U) = E[U^2] - (E[U])^2 = \int_0^1 u^2\, du - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}, \]

and thus

\[ \sigma(U) = \frac{1}{\sqrt{12}} = \frac{1}{2\sqrt{3}}. \tag{3.91} \]

From (3.89–3.91), we obtain that

\[ \sum_{i=1}^{12} U_i - 6 \approx Z, \]

and therefore (3.87) can be used as an approximate sample
of the standard normal distribution.
Note that this is an inefficient method, since it uses
12 samples from the uniform distribution to generate one
approximate sample of the standard normal distribution,
and all the samples that it generates are in the interval
[—6,6]. More efficient methods for generating indepen-
dent standard normal samples are Box—Muller, which uses
two uniform distribution samples to generate two sam-
ples of the standard normal distribution, and Acceptance-
Rejection.
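The construction (3.87) can be sketched in C++ as follows. The names approx_normal, Moments, and sample_moments are hypothetical; the second function is included only to check empirically that the samples have mean close to 0 and variance close to 1.

```cpp
#include <random>

// One approximate standard normal sample: sum of 12 uniforms minus 6.
double approx_normal(std::mt19937& gen) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < 12; ++i) sum += unif(gen);
    return sum - 6.0;
}

struct Moments {
    double mean;
    double variance;
};

// Sample mean and sample variance of n approximate normal samples.
Moments sample_moments(int n, unsigned seed) {
    std::mt19937 gen(seed);
    double sum = 0.0, sum_sq = 0.0;
    for (int i = 0; i < n; ++i) {
        double z = approx_normal(gen);
        sum += z;
        sum_sq += z * z;
    }
    double mean = sum / n;
    return {mean, sum_sq / n - mean * mean};
}
```

The first two moments match those of the standard normal distribution exactly, but the tails do not: no sample can fall outside $[-6, 6]$, which is one more reason to prefer the other methods in practice.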

Question 5. What is the rate of convergence for Monte
Carlo methods?
Answer: If $n$ is the number of paths in the Monte Carlo
simulation and $m$ is the number of time steps between 0
and $T$ used in the discretization of each path, then the
convergence rate of the Monte Carlo simulation is

\[ O\left(\max\left(\frac{1}{m}, \frac{1}{\sqrt{n}}\right)\right). \]

The estimate above holds for Monte Carlo simulations
on multi asset derivative securities, i.e., is independent of
the number of underlying assets of the derivative security,
unlike finite differences and numerical integration meth-
ods, where the convergence slows down as the number of
underlying assets increases.

Question 6. What variance reduction techniques do you
know?
Answer: The variance reduction techniques are used to
reduce the constant factor corresponding to the Monte
Carlo approximation error $O\left(\frac{1}{\sqrt{n}}\right)$. Some of the most
commonly used variance reduction techniques are:
• Control Variates;
• Antithetic Variables;
• Moment Matching.
For details on these methods and their implementation,
see Glasserman [2].

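Question 7. How do you generate samples of normal
random variables with correlation $\rho$?
Answer: Assume that you can generate two samples $z_1$
and $z_2$ from two independent standard normal variables
$Z_1$ and $Z_2$. Let $X_1 = Z_1$ and $X_2 = \rho Z_1 + \sqrt{1-\rho^2}\, Z_2$.

Note that $X_2$ is a linear combination of independent nor-
mal variables, and therefore $X_2$ is a normal variable as
well. Moreover, $\mathrm{var}(X_1) = 1$ and
$\mathrm{var}(X_2) = \rho^2 + (1 - \rho^2) = 1$, so that the correlation of $X_1$
and $X_2$ is equal to their covariance. Then,

\[ \mathrm{corr}(X_1, X_2) = \mathrm{cov}\left(Z_1,\, \rho Z_1 + \sqrt{1-\rho^2}\, Z_2\right) = \rho\, \mathrm{cov}(Z_1, Z_1) + \sqrt{1-\rho^2}\, \mathrm{cov}(Z_1, Z_2) = \rho\, \mathrm{var}(Z_1) = \rho, \]

since $\mathrm{var}(Z_1) = 1$ and $\mathrm{cov}(Z_1, Z_2) = 0$, since $Z_1$ and $Z_2$
are independent.
We conclude that $X_1$ and $X_2$ are normal random vari-
ables with correlation $\rho$. Thus, starting with two indepen-
dent standard normal samples $z_1$ and $z_2$,

\[ x_1 = z_1 \quad \text{and} \quad x_2 = \rho z_1 + \sqrt{1-\rho^2}\, z_2 \]

are samples of normal random variables with correlation
$\rho$.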
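The construction can be checked numerically with the sketch below, which generates pairs $(x_1, x_2)$ as described and returns their sample correlation. The function name sample_correlation and the use of std::normal_distribution for the independent standard normal samples are assumptions of this example.

```cpp
#include <cmath>
#include <random>

// Sample correlation of n pairs (x1, x2), where x1 = z1 and
// x2 = rho * z1 + sqrt(1 - rho^2) * z2 for independent normals z1, z2.
double sample_correlation(double rho, int n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> normal(0.0, 1.0);
    const double c = std::sqrt(1.0 - rho * rho);

    double s1 = 0.0, s2 = 0.0, s11 = 0.0, s22 = 0.0, s12 = 0.0;
    for (int i = 0; i < n; ++i) {
        double z1 = normal(gen);
        double z2 = normal(gen);
        double x1 = z1;
        double x2 = rho * z1 + c * z2;
        s1 += x1;
        s2 += x2;
        s11 += x1 * x1;
        s22 += x2 * x2;
        s12 += x1 * x2;
    }
    double m1 = s1 / n, m2 = s2 / n;
    double cov = s12 / n - m1 * m2;
    double v1 = s11 / n - m1 * m1;
    double v2 = s22 / n - m2 * m2;
    return cov / std::sqrt(v1 * v2);
}
```

For more than two correlated normal variables, the same idea generalizes by multiplying a vector of independent standard normal samples by a Cholesky factor of the correlation matrix; the two-variable formula above is the $2 \times 2$ case.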

Question 8. What is the order of convergence of
Newton's method?
Answer: If it is convergent, Newton's method is quadrat-
ically (second order) convergent.
Recall that, given an initial guess $x_0$, the Newton's
method recursion for solving $f(x) = 0$, where $f : \mathbb{R} \to \mathbb{R}$,
is

\[ x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}, \quad \forall\, k \geq 0. \tag{3.92} \]

The quadratic convergence of Newton's method can
be stated formally as follows: Let $x^*$ be a solution of
$f(x) = 0$. If $f(x)$ is a twice differentiable function with
$f''(x)$ continuous, if $f'(x^*) \neq 0$, and if $x_0$ is close enough
to $x^*$, then there exist $M > 0$ and $n_M$ a positive integer
such that

\[ \frac{|x_{k+1} - x^*|}{|x_k - x^*|^2} < M, \quad \forall\, k \geq n_M. \tag{3.93} \]

To provide the intuition behind (3.93), note that, since
$f(x^*) = 0$, the recursion (3.92) can be written as

\[ x_{k+1} - x^* = x_k - x^* - \frac{f(x_k) - f(x^*)}{f'(x_k)} = \frac{f(x^*) - f(x_k) - (x^* - x_k) f'(x_k)}{f'(x_k)}. \tag{3.94} \]

Recall from the linear Taylor expansion of the function
$f(x)$ around the point $x_k$ that, if $f''(x)$ is continuous, then
there exists a point $c_k$ between $x^*$ and $x_k$ such that

\[ f(x^*) = f(x_k) + (x^* - x_k) f'(x_k) + \frac{(x^* - x_k)^2}{2} f''(c_k). \tag{3.95} \]

From (3.94) and (3.95), we find that

\[ x_{k+1} - x^* = (x^* - x_k)^2\, \frac{f''(c_k)}{2 f'(x_k)}, \]

and conclude that

\[ \frac{|x_{k+1} - x^*|}{|x_k - x^*|^2} = \left| \frac{f''(c_k)}{2 f'(x_k)} \right|. \tag{3.96} \]

If $f'(x)$ and $f''(x)$ are continuous functions such that
$f'(x^*) \neq 0$, it follows that, if $x_k$ is close to $x^*$, then
$\frac{f''(c_k)}{2 f'(x_k)}$ is close to $\frac{f''(x^*)}{2 f'(x^*)} < \infty$. Therefore, the term
$\left| \frac{f''(c_k)}{2 f'(x_k)} \right|$ is uniformly bounded if $x_k$ is close enough to $x^*$,
and (3.96) can be written formally as (3.93).
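The recursion (3.92) can be sketched in C++ as follows. The function name newton, the stopping rule based on the size of the last step, and the default values of tol and max_iter are assumptions made for this example; sqrt2_by_newton applies the recursion to $f(x) = x^2 - 2$.

```cpp
#include <cmath>
#include <functional>

// Newton's method for f(x) = 0, starting from the initial guess x0.
// Stops when the last step is smaller than tol or after max_iter steps.
double newton(const std::function<double(double)>& f,
              const std::function<double(double)>& df,
              double x0, double tol = 1e-12, int max_iter = 50) {
    double x = x0;
    for (int k = 0; k < max_iter; ++k) {
        double step = f(x) / df(x);  // f(x_k) / f'(x_k)
        x -= step;                   // x_{k+1} = x_k - f(x_k) / f'(x_k)
        if (std::fabs(step) < tol) break;
    }
    return x;
}

// Example: the positive root of x^2 - 2, starting from x0 = 1.
double sqrt2_by_newton() {
    return newton([](double x) { return x * x - 2.0; },
                  [](double x) { return 2.0 * x; }, 1.0);
}
```

Starting from $x_0 = 1$, the iterates are 1.5, 1.41667, 1.414216, and so on: the number of correct digits roughly doubles at every step, which is the practical meaning of quadratic convergence.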

Question 9. Which finite difference method corresponds
to trinomial trees?
Answer: Forward Euler. As an explicit finite difference
method, the Forward Euler discretization of the Black–
Scholes PDE gives the finite difference value of the option
at a node as a linear combination of the option values at
three nodes on the prior time step, which is similar to the
risk neutral formula for trinomial trees.
If the calibration of the trinomial trees is done in the
log space, i.e., if the up and down factors are calibrated
to the normal distribution of ln(S), and if the Forward
Euler discretization is done for the constant coefficients
PDE obtained by the change of variables x = ln(S), the
classical trinomial tree recursion and the Forward Euler
recursion are almost identical.

Question 10. What is the relationship between the LU
and Cholesky decompositions?
Answer: Both the LU decomposition (without pivoting)
and the Cholesky decomposition provide a computation-
ally efficient way to solve linear systems by only using for-
ward substitution and backward substitution, which are
very fast solvers. The decompositions are similar in form,
i.e., the given matrix is written as the product of a lower
triangular matrix and an upper triangular matrix.
More precisely, the LU decomposition without pivoting
of a nonsingular square matrix $A$ consists of finding a
lower triangular matrix $L$ with all entries on the main
diagonal equal to 1 and a nonsingular upper triangular
matrix $U$ such that

\[ A = LU. \]

The matrices $L$ and $U$ are called the LU factors of $A$.
It is important to note that the LU decomposition
without pivoting does not exist for every nonsingular ma-
trix: a matrix has an LU decomposition without pivoting
if and only if all the leading principal minors of the matrix
are nonzero.¹² This drawback is addressed by introduc-
ing the LU decomposition with row (or column) pivoting,
which exists for any nonsingular matrix.
The Cholesky decomposition of a nonsingular symmet-
ric matrix $A$ consists of finding a nonsingular upper trian-
gular matrix $U$ with positive entries on the main diagonal
such that

\[ A = U^t U. \tag{3.97} \]

The matrix $U$ is called the Cholesky factor of $A$.
Note that not every symmetric matrix has a Cholesky
decomposition: The nonsingular symmetric matrix $A$ has
a Cholesky decomposition if and only if the matrix $A$ is
symmetric positive definite, i.e., if $x^t A x > 0$ for all $x \neq 0$,
or, equivalently, if all the eigenvalues of the matrix are
positive.
If they exist, both decompositions are unique. In other
words, if a nonsingular matrix has an LU decomposition
without pivoting, the L and U factors are uniquely deter-
mined. Similarly, any symmetric positive definite matrix
has a uniquely determined Cholesky factor.
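A minimal sketch of how the Cholesky factor $U$ from (3.97) can be computed row by row is given below. The type alias Matrix, the function name cholesky, and the decision to throw an exception when a nonpositive pivot is found are choices made for this example; the input matrix is assumed to be symmetric and only its upper triangular part is read.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the upper triangular matrix U with positive diagonal entries
// such that A = U^t U. The matrix a is assumed to be symmetric; an
// exception is thrown if a nonpositive pivot is met (a is not spd).
Matrix cholesky(const Matrix& a) {
    const std::size_t n = a.size();
    Matrix u(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i; j < n; ++j) {
            // a(i,j) minus the contribution of the rows already computed
            double sum = a[i][j];
            for (std::size_t k = 0; k < i; ++k) sum -= u[k][i] * u[k][j];
            if (i == j) {
                if (sum <= 0.0) throw std::runtime_error("matrix is not spd");
                u[i][i] = std::sqrt(sum);
            } else {
                u[i][j] = sum / u[i][i];
            }
        }
    }
    return u;
}
```

Once $U$ is available, the linear system $Ax = b$ is solved by a forward substitution for $U^t y = b$ followed by a backward substitution for $Ux = y$.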

¹² The leading principal minors of an $n \times n$ matrix $A$ are the
determinants of the $i \times i$ matrices $A_i = A(1:i, 1:i)$ made of the
$i^2$ upper left entries of $A$, for $1 \leq i \leq n$.

Question 11. (i) Which matrices have an LU decompo-
sition without pivoting?
(ii) Does a symmetric positive definite matrix have an LU
decomposition without pivoting?
Answer: (i) A matrix has an LU decomposition without
pivoting if and only if all the leading principal minors of
the matrix are nonzero. Recall that the leading principal
minors of an $n \times n$ matrix $A$ are the determinants of the
$i \times i$ matrices $A_i = A(1:i, 1:i)$ made of the $i^2$ upper left
entries of $A$, for $1 \leq i \leq n$.¹³
Note that even very well conditioned matrices may not
have an LU decomposition without pivoting. For example,
the matrix $A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$ does not have an LU decom-
position without row pivoting. If it did, then $A = LU$
could be written as

\[ \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ L(2,1) & 1 \end{pmatrix} \begin{pmatrix} U(1,1) & U(1,2) \\ 0 & U(2,2) \end{pmatrix}. \tag{3.98} \]

By multiplying the first row of $L$ by the matrix $U$, we find
that $U(1,1) = 0$ and $U(1,2) = 1$. Thus, (3.98) becomes

\[ \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ L(2,1) & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & U(2,2) \end{pmatrix}. \]

However, by multiplying the second row of $L$ by the first
column of $U$, it follows that

\[ L(2,1) \cdot 0 + 1 \cdot 0 = 1, \]

which is not possible, and therefore we conclude that the
matrix $A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$ does not have an LU decompo-
sition without row pivoting. The reason for this is that
$A(1,1) = 0$, and, since $A(1,1)$ is the first leading princi-
pal minor of $A$, the matrix $A$ does not have all the leading
principal minors nonzero.

¹³ For example, the leading principal minors of the matrix
$\begin{pmatrix} 2 & -3 & 0 \\ 1 & 1 & 1 \\ -1 & 5 & -3 \end{pmatrix}$ are: $\det(2) = 2$; $\det\begin{pmatrix} 2 & -3 \\ 1 & 1 \end{pmatrix} = 5$;
$\det\begin{pmatrix} 2 & -3 & 0 \\ 1 & 1 & 1 \\ -1 & 5 & -3 \end{pmatrix} = -22$.

(ii) By definition, the $n \times n$ matrix $A$ is symmetric posi-
tive definite (spd) if and only if $x^t A x > 0$ for all $x \neq 0$.
Several equivalent necessary and sufficient conditions for a
symmetric matrix to be symmetric positive definite exist,
e.g., a matrix is spd if and only if the matrix is symmetric
and all the eigenvalues of the matrix are positive, and a
matrix is spd if and only if the matrix is symmetric and
has a Cholesky decomposition.
For the purpose of answering this question, we will
use another equivalent property for spd matrices given
by Sylvester's Criterion, which states that a symmetric
matrix is symmetric positive definite if and only if all the
leading principal minors of the matrix are positive.
Since a matrix has an LU decomposition without piv-
oting if and only if all the leading principal minors of the
matrix are nonzero, and since the leading principal mi-
nors of a symmetric positive definite matrix are positive
and therefore nonzero, we conclude that any symmetric
positive definite matrix has an LU decomposition without
pivoting.
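The role of the leading principal minors can be seen in a short sketch of the LU decomposition without pivoting: the elimination breaks down exactly when a zero pivot is met. The type alias Matrix and the function name lu_no_pivoting are hypothetical, and the function reports failure through its return value rather than attempting any row interchange.

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// LU decomposition without pivoting of a square matrix a.
// On success, l is lower triangular with ones on the diagonal, u is
// upper triangular, and a = l * u. Returns false if a zero pivot is met.
bool lu_no_pivoting(const Matrix& a, Matrix& l, Matrix& u) {
    const std::size_t n = a.size();
    l.assign(n, std::vector<double>(n, 0.0));
    u = a;
    for (std::size_t i = 0; i < n; ++i) l[i][i] = 1.0;

    for (std::size_t k = 0; k < n; ++k) {
        if (u[k][k] == 0.0) return false;  // zero pivot: no LU without pivoting
        for (std::size_t i = k + 1; i < n; ++i) {
            l[i][k] = u[i][k] / u[k][k];
            for (std::size_t j = k; j < n; ++j) {
                u[i][j] -= l[i][k] * u[k][j];
            }
        }
    }
    return true;
}
```

For a matrix whose first entry is zero, the function returns false at the first step, while for a symmetric positive definite matrix every pivot is positive and the elimination runs to completion.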

3.6 Probability. Stochastic calculus.

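Question 1. What is the exponential distribution? What are the mean and the variance of the exponential distribution?
Answer: The density function of the exponential random variable X with parameter $\alpha > 0$ is
\[
f(x) = \begin{cases} \alpha e^{-\alpha x}, & \text{if } x \geq 0; \\ 0, & \text{if } x < 0. \end{cases}
\]
The expected value and the variance of the exponential random variable X are
\[
E[X] = \frac{1}{\alpha} \quad \text{and} \quad \mathrm{var}(X) = \frac{1}{\alpha^2}.
\]
To see this, use integration by parts to find that
\[
\begin{aligned}
\int_0^\infty x e^{-\alpha x}\,dx &= -\frac{x e^{-\alpha x}}{\alpha}\Big|_0^\infty + \frac{1}{\alpha}\int_0^\infty e^{-\alpha x}\,dx = -\frac{e^{-\alpha x}}{\alpha^2}\Big|_0^\infty = \frac{1}{\alpha^2}; \\
\int_0^\infty x^2 e^{-\alpha x}\,dx &= -\frac{x^2 e^{-\alpha x}}{\alpha}\Big|_0^\infty + \frac{2}{\alpha}\int_0^\infty x e^{-\alpha x}\,dx = 0 + \frac{2}{\alpha}\cdot\frac{1}{\alpha^2} = \frac{2}{\alpha^3}.
\end{aligned}
\]
Then,
\[
\begin{aligned}
E[X] &= \int_{-\infty}^{\infty} x f(x)\,dx = \alpha \int_0^\infty x e^{-\alpha x}\,dx = \frac{1}{\alpha}; \\
E[X^2] &= \int_{-\infty}^{\infty} x^2 f(x)\,dx = \alpha \int_0^\infty x^2 e^{-\alpha x}\,dx = \frac{2}{\alpha^2}.
\end{aligned}
\]
Therefore,
\[
\mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\alpha^2} - \left(\frac{1}{\alpha}\right)^2 = \frac{1}{\alpha^2}.
\]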
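As a sanity check on the formulas E[X] = 1/α and var(X) = 1/α², here is a minimal Monte Carlo sketch. The function name, the value α = 2, the sample size, and the seed are illustrative assumptions; `random.expovariate` takes the rate α, not the mean.

```python
import random


def exponential_moments(alpha, n_samples=200_000, seed=0):
    """Monte Carlo estimates of the mean and variance of Exp(alpha)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(alpha) for _ in range(n_samples)]
    mean = sum(samples) / n_samples
    var = sum((x - mean) ** 2 for x in samples) / (n_samples - 1)
    return mean, var


alpha = 2.0  # illustrative rate parameter
mean, var = exponential_moments(alpha)
print(mean, 1 / alpha)        # sample mean vs. 1/alpha
print(var, 1 / alpha ** 2)    # sample variance vs. 1/alpha^2
```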

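Question 2. If X and Y are independent exponential random variables with mean 6 and 8, respectively, what is the probability that Y is greater than X?

Answer: The probability density functions of X and Y are, respectively:
\[
f_X(x) = \begin{cases} \frac{1}{6} e^{-x/6}, & \text{if } x \geq 0; \\ 0, & \text{if } x < 0; \end{cases}
\qquad
f_Y(y) = \begin{cases} \frac{1}{8} e^{-y/8}, & \text{if } y \geq 0; \\ 0, & \text{if } y < 0. \end{cases}
\]
Since X and Y are independent, the joint probability density function $f_{X,Y}(x,y)$ of (X,Y) is the product of the marginal probability density functions, i.e.,
\[
f_{X,Y}(x,y) = f_X(x) f_Y(y) = \begin{cases} \frac{1}{48} e^{-\frac{x}{6}-\frac{y}{8}}, & \text{if } x \geq 0,\ y \geq 0; \\ 0, & \text{otherwise.} \end{cases}
\]
Let
\[
A = \{(x,y) \in \mathbb{R}^2 : y \geq x\}.
\]
The probability that Y is greater than X can be found by evaluating the double integral
\[
P(Y \geq X) = \iint_A f_{X,Y}(x,y)\,dx\,dy = \frac{1}{48} \iint_{A \cap \{x \geq 0,\, y \geq 0\}} e^{-\frac{x}{6}-\frac{y}{8}}\,dx\,dy
\]
as follows:
\[
\begin{aligned}
P(Y \geq X) &= \frac{1}{48} \int_0^\infty \left( \int_0^y e^{-\frac{x}{6}} e^{-\frac{y}{8}}\,dx \right) dy \\
&= \frac{1}{48} \int_0^\infty e^{-\frac{y}{8}} \left( -6 e^{-\frac{x}{6}} \right)\Big|_{x=0}^{x=y}\,dy \\
&= \frac{1}{8} \int_0^\infty e^{-\frac{y}{8}} \left( 1 - e^{-\frac{y}{6}} \right) dy \\
&= \frac{1}{8} \int_0^\infty \left( e^{-\frac{y}{8}} - e^{-\frac{7y}{24}} \right) dy \\
&= \frac{1}{8} \left( 8 - \frac{24}{7} \right) \\
&= \frac{4}{7}. \quad □
\end{aligned}
\]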
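The value 4/7 can be cross-checked by simulation. The sketch below is illustrative only (the function name, sample size, and seed are assumptions); it draws independent exponential samples with means 6 and 8 and counts how often Y exceeds X.

```python
import random


def prob_y_greater_than_x(mean_x=6.0, mean_y=8.0, n=200_000, seed=0):
    """Monte Carlo estimate of P(Y > X) for independent exponentials."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.expovariate(1.0 / mean_x)  # rate = 1 / mean
        y = rng.expovariate(1.0 / mean_y)
        if y > x:
            hits += 1
    return hits / n


estimate = prob_y_greater_than_x()
print(estimate, 4 / 7)  # simulated frequency vs. the closed-form value
```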

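Question 3. What are the expected value and the variance of the Poisson distribution?
Answer: A Poisson random variable X takes nonnegative integer values with probabilities
\[
P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}, \quad \forall\, k \geq 0,
\]
where $\lambda > 0$ is a fixed positive number.
We show that the expected value and the variance of the Poisson random variable X are
\[
E[X] = \lambda \quad \text{and} \quad \mathrm{var}(X) = \lambda.
\]
By definition,
\[
E[X] = \sum_{k=0}^\infty P(X = k) \cdot k = \sum_{k=0}^\infty \frac{e^{-\lambda} \lambda^k}{k!} \cdot k = e^{-\lambda} \lambda \sum_{k=1}^\infty \frac{\lambda^{k-1}}{(k-1)!}. \tag{3.99}
\]
Since the Taylor series expansion for $e^x$ is
\[
e^x = \sum_{k=0}^\infty \frac{x^k}{k!},
\]
it follows that
\[
\sum_{k=1}^\infty \frac{\lambda^{k-1}}{(k-1)!} = e^\lambda; \tag{3.100}
\]
\[
\sum_{k=2}^\infty \frac{\lambda^{k-2}}{(k-2)!} = e^\lambda. \tag{3.101}
\]
From (3.99) and (3.100), we find that $E[X] = \lambda$.
To calculate var(X), note that
\[
E[X^2] = \sum_{k=0}^\infty P(X = k) \cdot k^2 = \sum_{k=0}^\infty \frac{e^{-\lambda} \lambda^k}{k!} \cdot k^2,
\]
which can be written as
\[
\begin{aligned}
E[X^2] &= e^{-\lambda} \sum_{k=1}^\infty \frac{\left( k(k-1) + k \right) \lambda^k}{k!} \\
&= e^{-\lambda} \lambda^2 \sum_{k=2}^\infty \frac{\lambda^{k-2}}{(k-2)!} + e^{-\lambda} \lambda \sum_{k=1}^\infty \frac{\lambda^{k-1}}{(k-1)!} \\
&= \lambda^2 + \lambda,
\end{aligned}
\]
where (3.100) and (3.101) were used for the last equality.
We conclude that
\[
\mathrm{var}(X) = E[X^2] - (E[X])^2 = \lambda^2 + \lambda - \lambda^2 = \lambda.
\]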

Question 4. A point is chosen uniformly from the unit disk. What is the expected value of the distance between the point and the center of the disk?
Answer: The expected value of the distance between a uniformly chosen point in the unit disk D and the center of the disk can be computed as $E\left[\sqrt{X^2+Y^2}\right]$, where (X,Y) is uniformly distributed in the unit disk D. The probability density function of (X,Y) is
\[
f(x,y) = \begin{cases} \frac{1}{\pi}, & \text{if } (x,y) \in D; \\ 0, & \text{otherwise.} \end{cases}
\]
Then,
\[
E\left[\sqrt{X^2+Y^2}\right] = \frac{1}{\pi} \iint_D \sqrt{x^2+y^2}\,dx\,dy. \tag{3.102}
\]
Using the polar coordinates substitution $x = r\cos\theta$ and $y = r\sin\theta$, with $0 \leq r \leq 1$ and $0 \leq \theta < 2\pi$, and recalling that $dx\,dy = r\,dr\,d\theta$, we obtain from (3.102) that
\[
\begin{aligned}
E\left[\sqrt{X^2+Y^2}\right] &= \frac{1}{\pi} \int_0^1 \int_0^{2\pi} \sqrt{r^2(\cos^2\theta + \sin^2\theta)}\; r\,d\theta\,dr \\
&= \frac{1}{\pi} \int_0^1 \int_0^{2\pi} r^2\,d\theta\,dr \qquad (3.103) \\
&= \frac{1}{\pi} \int_0^1 2\pi r^2\,dr = 2 \int_0^1 r^2\,dr \\
&= \frac{2}{3},
\end{aligned}
\]
where for (3.103) we used the fact that $\cos^2\theta + \sin^2\theta = 1$ for any $\theta$. □
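The result 2/3 is easy to confirm by simulation. The following sketch (function name, sample size, and seed are illustrative assumptions) samples points uniformly in the unit disk by rejection from the enclosing square and averages their distance to the origin.

```python
import math
import random


def mean_distance_to_center(n=200_000, seed=0):
    """Average distance to the origin of n points uniform in the unit disk."""
    rng = random.Random(seed)
    total, accepted = 0.0, 0
    while accepted < n:
        # Propose a point uniformly in [-1, 1] x [-1, 1] ...
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        r2 = x * x + y * y
        # ... and keep it only if it falls inside the unit disk.
        if r2 <= 1.0:
            total += math.sqrt(r2)
            accepted += 1
    return total / n


estimate = mean_distance_to_center()
print(estimate, 2 / 3)  # simulated average vs. the closed-form value
```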

Question 5. Consider two random variables X and Y with mean 0 and variance 1, and with joint normal distribution. If $\mathrm{cov}(X,Y) = \frac{1}{\sqrt{2}}$, what is the conditional probability $P(X > 0 \mid Y < 0)$?
Answer: From the definition of conditional probability, it follows that
\[
P(X > 0 \mid Y < 0) = \frac{P(X > 0, Y < 0)}{P(Y < 0)}. \tag{3.104}
\]
Note that
\[
P(Y < 0) = \frac{1}{2}, \tag{3.105}
\]
since Y is a standard normal random variable.
In order to compute $P(X > 0, Y < 0)$, let
\[
W = \sqrt{2}\,X - Y. \tag{3.106}
\]
Since E[X] = E[Y] = 0, it follows that E[W] = 0. Moreover, since
\[
\mathrm{var}(X) = \mathrm{var}(Y) = 1 \quad \text{and} \quad \mathrm{cov}(X,Y) = \frac{1}{\sqrt{2}},
\]
we obtain that
\[
\begin{aligned}
\mathrm{var}(W) &= \mathrm{var}\left(\sqrt{2}\,X - Y\right) \\
&= \mathrm{var}\left(\sqrt{2}\,X\right) - 2\,\mathrm{cov}\left(\sqrt{2}\,X, Y\right) + \mathrm{var}(Y) \\
&= 2\,\mathrm{var}(X) - 2\sqrt{2}\,\mathrm{cov}(X,Y) + \mathrm{var}(Y) \\
&= 1,
\end{aligned}
\]
and
\[
\begin{aligned}
\mathrm{cov}(W,Y) &= \mathrm{cov}\left(\sqrt{2}\,X - Y, Y\right) \\
&= \sqrt{2}\,\mathrm{cov}(X,Y) - \mathrm{var}(Y) \\
&= 0.
\end{aligned}
\]
Note that $W = \sqrt{2}\,X - Y$ is a normal random variable since X and Y have joint normal distribution. Moreover, since W and Y are jointly normal with E[W] = 0, var(W) = 1, and cov(W,Y) = 0, it follows that W and Y are independent standard normal variables.
From (3.106), we find that
\[
X = \frac{1}{\sqrt{2}}(W + Y).
\]
Then, the probability of the event $\{X > 0, Y < 0\}$ can be written as
\[
P(X > 0, Y < 0) = P\left( \frac{1}{\sqrt{2}}(W + Y) > 0,\ Y < 0 \right) = P(W + Y > 0,\ Y < 0). \tag{3.107}
\]
The two straight lines $w + y = 0$ and $y = 0$ cut the (w,y) plane into four wedges:
\[
\begin{aligned}
R_1 &= \{w + y > 0,\ y < 0\}; \\
R_2 &= \{w + y > 0,\ y > 0\}; \\
R_3 &= \{w + y < 0,\ y < 0\}; \\
R_4 &= \{w + y < 0,\ y > 0\}.
\end{aligned}
\]
Note that
\[
P(W + Y > 0,\ Y < 0) = P((W,Y) \in R_1). \tag{3.108}
\]
Since W and Y are independent standard normal random variables, their joint probability density function is rotationally symmetric, so the probability of each wedge is proportional to its angle. The wedges $R_1$ and $R_4$ have angle $\frac{\pi}{4}$, while $R_2$ and $R_3$ have angle $\frac{3\pi}{4}$, and therefore
\[
P((W,Y) \in R_1) = P((W,Y) \in R_4); \tag{3.109}
\]
\[
P((W,Y) \in R_2) = P((W,Y) \in R_3); \tag{3.110}
\]
\[
P((W,Y) \in R_2) = 3\,P((W,Y) \in R_1);
\]
see Figure 3.1.

Figure 3.1: The regions $R_1$ to $R_4$ in the (w,y) plane.

Also, note that
\[
\sum_{i=1}^{4} P((W,Y) \in R_i) = 1, \tag{3.111}
\]
since $P(W + Y = 0 \text{ or } Y = 0) = 0$.
From (3.109)-(3.111), we find that
\[
\begin{aligned}
1 &= \sum_{i=1}^{4} P((W,Y) \in R_i) \\
&= 2\,P((W,Y) \in R_1) + 2\,P((W,Y) \in R_2) \\
&= 8\,P((W,Y) \in R_1),
\end{aligned}
\]
and therefore
\[
P((W,Y) \in R_1) = \frac{1}{8}.
\]
Thus,
\[
P(W + Y > 0,\ Y < 0) = \frac{1}{8}; \tag{3.112}
\]
see (3.108).
Then, from (3.107) and (3.112), it follows that
\[
P(X > 0, Y < 0) = \frac{1}{8}. \tag{3.113}
\]
From (3.104), (3.105), and (3.113), we conclude that
\[
P(X > 0 \mid Y < 0) = \frac{P(X > 0, Y < 0)}{P(Y < 0)} = \frac{1}{4}.
\]
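The conditional probability 1/4 can be verified numerically. In the sketch below (function name, sample size, and seed are illustrative assumptions), a correlated standard normal pair with covariance 1/√2 is built from two independent standard normals, and the frequency of {X > 0} among the samples with Y < 0 is reported.

```python
import math
import random


def conditional_probability(n=400_000, seed=0):
    """Estimate P(X > 0 | Y < 0) for standard normals with cov 1/sqrt(2)."""
    rng = random.Random(seed)
    rho = 1.0 / math.sqrt(2.0)
    c = math.sqrt(1.0 - rho * rho)
    count_y_neg = 0
    count_both = 0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        # X = rho * Y + sqrt(1 - rho^2) * Z has variance 1 and cov(X, Y) = rho.
        x = rho * y + c * rng.gauss(0.0, 1.0)
        if y < 0.0:
            count_y_neg += 1
            if x > 0.0:
                count_both += 1
    return count_both / count_y_neg


estimate = conditional_probability()
print(estimate)  # compare with the closed-form value 1/4
```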

Question 6. If X and Y are lognormal random variables, is their product XY lognormally distributed?
Answer: First, note that, if X and Y are independent lognormal random variables, then XY is lognormally distributed, since ln(XY) = ln(X) + ln(Y) is the sum of two independent normal random variables, and therefore it is normally distributed.
In a more general setting, if ln(X) and ln(Y) have joint normal distribution, then ln(X) + ln(Y) is normally distributed and therefore XY is a lognormal random variable.
Otherwise, ln(X) + ln(Y) may not be normally distributed even if ln(X) and ln(Y) are normally distributed, in which case XY is not lognormally distributed. □

Question 7. Let X be a normal random variable with mean $\mu$ and variance $\sigma^2$, and let $\Phi$ be the cumulative distribution function of the standard normal distribution. Find the expected value of $Y = \Phi(X)$.
Answer: Let Z be a standard normal random variable independent of X. Then,
\[
Y = \Phi(X) = P(Z \leq X \mid X) = E[\mathbf{1}_{Z \leq X} \mid X],
\]
and therefore
\[
E[Y] = E\left[ E[\mathbf{1}_{Z \leq X} \mid X] \right]. \tag{3.114}
\]
Recall from the Tower Property for conditional expectation¹⁴ that, for any two random variables T and W,
\[
E[T] = E\left[ E[T \mid W] \right]. \tag{3.115}
\]
Using (3.115) for $T = \mathbf{1}_{Z \leq X}$ and W = X, we obtain that
\[
E\left[ E[\mathbf{1}_{Z \leq X} \mid X] \right] = E[\mathbf{1}_{Z \leq X}] = P(Z \leq X). \tag{3.116}
\]
From (3.114) and (3.116), we obtain that
\[
E[Y] = P(Z \leq X). \tag{3.117}
\]
Recall that X is a normal random variable with mean $\mu$ and variance $\sigma^2$, and that X and Z are independent. Then, Z − X is a normal random variable with the following mean and variance:
\[
\begin{aligned}
E[Z - X] &= -\mu; \\
\mathrm{var}(Z - X) &= \mathrm{var}(Z) + \mathrm{var}(X) = 1 + \sigma^2.
\end{aligned}
\]
Thus, $\frac{Z - X + \mu}{\sqrt{1+\sigma^2}}$ is a standard normal random variable and therefore
\[
\begin{aligned}
P(Z \leq X) &= P(Z - X \leq 0) \\
&= P\left( \frac{Z - X + \mu}{\sqrt{1+\sigma^2}} \leq \frac{\mu}{\sqrt{1+\sigma^2}} \right) \\
&= \Phi\left( \frac{\mu}{\sqrt{1+\sigma^2}} \right). \qquad (3.118)
\end{aligned}
\]
¹⁴ In other words, to calculate the expected value of T, one can first calculate the conditional expected value of T knowing the extra information from W, then average out the resulting conditional expected value over W.

From (3.117) and (3.118) we conclude that
\[
E[Y] = \Phi\left( \frac{\mu}{\sqrt{1+\sigma^2}} \right). \tag{3.119}
\]
We note that, if X is the standard normal variable, then $Y = \Phi(X)$ is uniformly distributed in the interval [0,1], and therefore $E[Y] = \frac{1}{2}$. Note that, for $\mu = 0$ and $\sigma = 1$ in (3.119), i.e., if X is a standard normal variable, then $E[Y] = \Phi(0) = \frac{1}{2}$, which is consistent with the comment above.
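Formula (3.119) can be checked numerically. The sketch below is an illustration under assumed values (μ = 1, σ = 2, sample size, and seed are arbitrary choices); the standard normal cdf is written in terms of `math.erf`, and the Monte Carlo average of Φ(X) is compared with Φ(μ/√(1+σ²)).

```python
import math
import random


def std_normal_cdf(x):
    """Standard normal cumulative distribution function via erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mc_expected_cdf(mu, sigma, n=200_000, seed=0):
    """Monte Carlo estimate of E[Phi(X)] for X ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    return sum(std_normal_cdf(rng.gauss(mu, sigma)) for _ in range(n)) / n


mu, sigma = 1.0, 2.0  # illustrative parameters
estimate = mc_expected_cdf(mu, sigma)
closed_form = std_normal_cdf(mu / math.sqrt(1.0 + sigma ** 2))
print(estimate, closed_form)
```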

Question 8. What is the law of large numbers?
Answer: There is a strong law of large numbers and there is a weak law of large numbers. The strong law of large numbers states that the average of a large number of independent identically distributed integrable random variables converges almost surely to their common mean; in the case of the weak law of large numbers, the convergence is only in probability.
More precisely, let $X_1, X_2, \ldots$ be a sequence of independent identically distributed random variables with finite expected value $\mu = E[X_i]$, and let $S_n = X_1 + \cdots + X_n$. The strong law of large numbers states that $\frac{S_n}{n} \to \mu$ almost surely, i.e.,
\[
P\left( \lim_{n \to \infty} \frac{S_n}{n} = \mu \right) = 1. \tag{3.120}
\]
The weak law of large numbers states that $\frac{S_n}{n} \to \mu$ in probability, i.e.,
\[
\lim_{n \to \infty} P\left( \left| \frac{S_n}{n} - \mu \right| > \epsilon \right) = 0, \quad \forall\, \epsilon > 0. \tag{3.121}
\]
Note that, if a sequence of random variables converges almost surely, then it also converges in probability, and therefore, if (3.120) holds true, then (3.121) also holds true. This is the sense in which the strong law of large numbers is "stronger" than the weak law of large numbers. □

Question 9. What is the central limit theorem?
Answer: The central limit theorem states that the limiting distribution of the centered and scaled sum of an independent identically distributed sequence of random variables is a normal distribution if the common distribution of the random variables has finite variance.
More precisely, let $X_1, X_2, \ldots$ be a sequence of independent identically distributed random variables with finite expected value $\mu = E[X_i]$ and finite variance $\sigma^2 = \mathrm{var}(X_i)$. Let $S_n = X_1 + \cdots + X_n$. Then,
\[
\lim_{n \to \infty} \frac{S_n - n\mu}{\sigma\sqrt{n}} = Z,
\]
where Z is a standard normal random variable, and the convergence is in distribution, i.e.,
\[
\lim_{n \to \infty} P\left( \frac{S_n - n\mu}{\sigma\sqrt{n}} \leq t \right) = P(Z \leq t), \quad \forall\, t \in \mathbb{R}.
\]
Putting together the Law of Large Numbers and the Central Limit Theorem, the following approximation (in distribution) holds for large n:
\[
\frac{S_n}{n} \approx \mu + \frac{\sigma}{\sqrt{n}}\, Z.
\]

Question 10. What is a martingale? How is it related to option pricing?
Answer: Let $(\Omega, \mathcal{F}_t, P)$ be a filtered probability space, where $\Omega$ is the sample space, $\mathcal{F}_t$ is a filtration, and P is a probability measure on $\Omega$. A stochastic process $X_t$ is called a martingale with respect to the filtration $\{\mathcal{F}_t\}$ if and only if
(i) $X_t$ is adapted, i.e., $X_t$ is $\mathcal{F}_t$-measurable for all t;
(ii) $X_t$ is integrable for all t, i.e., $E[|X_t|] < \infty$ for all t;
(iii) $E[X_t \mid \mathcal{F}_s] = X_s$ almost surely for all $s \leq t$.
In other words, a martingale is a stochastic process in which, given the available information $\mathcal{F}_s$ up to the current time s, the optimal estimation (in the least square sense) of the process at the future time t, i.e., $E[X_t \mid \mathcal{F}_s]$, is the current value $X_s$ almost surely.
The martingale concept is one of the cornerstones of option pricing theory. The fundamental theorem of asset pricing states that, if a market model is arbitrage-free, then there exists a risk neutral probability so that each discounted asset price process under the risk neutral probability is a martingale. Thus, a way to price a derivative security is to figure out a partial differential equation, called the pricing equation, usually deduced by applying Itô's formula, so that the discounted price process of the derivative is a martingale under the risk neutral probability.

Question 11. Explain the assumption $(dW_t)^2 = dt$ used in the informal derivation of Itô's Lemma.
Answer: The notation in differential form $(dW_t)^2 = dt$ is a shorthand for the conventional integral notation used for Riemann integrals:
\[
\int_0^T (dW_t)^2 = \int_0^T dt. \tag{3.122}
\]
The intuition behind (3.122) is related to the quadratic variation of a Brownian path. By definition, the quadratic variation $QV_W[0,T]$ of the Brownian path W in the interval [0,T] is
\[
QV_W[0,T] = \lim_{n \to \infty} \sum_{i=1}^{n} \left| W_{t_i} - W_{t_{i-1}} \right|^2,
\]
where $t_i = \frac{iT}{n}$, for i = 0 : n. Therefore, in expectation, we have that
\[
\begin{aligned}
E\left[ \lim_{n \to \infty} \sum_{i=1}^{n} \left| W_{t_i} - W_{t_{i-1}} \right|^2 \right]
&= \lim_{n \to \infty} \sum_{i=1}^{n} E\left[ \left| W_{t_i} - W_{t_{i-1}} \right|^2 \right] \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} (t_i - t_{i-1}) \\
&= T,
\end{aligned}
\]
since $W_{t_i} - W_{t_{i-1}} \sim N(0, t_i - t_{i-1})$ for i = 1 : n. In fact, it can be shown that the convergence
\[
\sum_{i=1}^{n} \left| W_{t_i} - W_{t_{i-1}} \right|^2 \to T \quad \text{as } n \to \infty
\]
is in $L^2$ sense, i.e.,
\[
\lim_{n \to \infty} E\left[ \left| \sum_{i=1}^{n} \left| W_{t_i} - W_{t_{i-1}} \right|^2 - T \right|^2 \right] = 0.
\]
By mimicking the notations used in the conventional Riemann integral, we write that
\[
\int_0^T (dW_t)^2 = \lim_{n \to \infty} \sum_{i=1}^{n} \left| W_{t_i} - W_{t_{i-1}} \right|^2 = T = \int_0^T dt,
\]
whose shorthand notation in differential form is
\[
(dW_t)^2 = dt. \quad □
\]

Question 12. If $W_t$ is a Wiener process, find $E[W_t W_s]$.
Answer: Assume that $s \leq t$. Write $W_t$ as $W_t = (W_t - W_s) + W_s$, and note that
\[
W_t W_s = (W_t - W_s) W_s + W_s^2. \tag{3.123}
\]
Since $W_\tau$, $\tau \geq 0$, is a Wiener process, it follows that $W_t - W_s$ and $W_s$ are independent normal random variables of mean 0, and therefore
\[
E[(W_t - W_s) W_s] = E[W_t - W_s]\, E[W_s] = 0. \tag{3.124}
\]
Also, $W_s$ is normal of mean $E[W_s] = 0$ and variance $\mathrm{var}(W_s) = s$. Thus,
\[
\mathrm{var}(W_s) = E[W_s^2] - (E[W_s])^2
\]
and therefore
\[
E[W_s^2] = \mathrm{var}(W_s) = s. \tag{3.125}
\]
From (3.123)-(3.125), we obtain that
\[
E[W_t W_s] = E[(W_t - W_s) W_s] + E[W_s^2] = s. \tag{3.126}
\]
Since (3.126) was derived under the assumption $s \leq t$, we conclude that
\[
E[W_t W_s] = \min(s,t). \quad □ \tag{3.127}
\]

Question 13. If $W_t$ is a Wiener process, what is $\mathrm{var}(W_t + W_s)$?
Answer: Assume that $s \leq t$. Write $W_t$ as $W_t = (W_t - W_s) + W_s$, and note that $W_t + W_s = (W_t - W_s) + 2W_s$. Then,
\[
\mathrm{var}(W_t + W_s) = \mathrm{var}(W_t - W_s) + 4\,\mathrm{cov}(W_t - W_s, W_s) + 4\,\mathrm{var}(W_s). \tag{3.128}
\]
Since $W_\tau$, $\tau \geq 0$, is a Wiener process, it follows that
\[
\mathrm{cov}(W_t - W_s, W_s) = 0; \tag{3.129}
\]
\[
\mathrm{var}(W_t - W_s) = t - s; \tag{3.130}
\]
\[
\mathrm{var}(W_s) = s, \tag{3.131}
\]
since $W_t - W_s$ and $W_s$ are independent normal variables of variance t − s and s, respectively.
From (3.128)-(3.131), we obtain that
\[
\mathrm{var}(W_t + W_s) = (t - s) + 4s = t + 3s. \tag{3.132}
\]
Since (3.132) was derived under the assumption $s \leq t$, we conclude that
\[
\mathrm{var}(W_t + W_s) = \max(s,t) + 3\min(s,t). \quad □
\]
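Both identities, E[W_t W_s] = min(s,t) and var(W_t + W_s) = max(s,t) + 3 min(s,t), can be checked with a short simulation. The sketch below is illustrative (function name, the times s = 1 and t = 2, sample size, and seed are assumptions); it builds W_t from W_s by adding an independent increment, exactly as in the decomposition used in the answers.

```python
import math
import random


def simulate_ws_wt(s, t, n_paths, seed=0):
    """Sample pairs (W_s, W_t), s <= t, using independent increments."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_paths):
        w_s = math.sqrt(s) * rng.gauss(0.0, 1.0)
        w_t = w_s + math.sqrt(t - s) * rng.gauss(0.0, 1.0)
        pairs.append((w_s, w_t))
    return pairs


s, t = 1.0, 2.0  # illustrative times with s <= t
pairs = simulate_ws_wt(s, t, n_paths=200_000)
n = len(pairs)
mean_product = sum(ws * wt for ws, wt in pairs) / n
sums = [ws + wt for ws, wt in pairs]
mean_sum = sum(sums) / n
var_sum = sum((x - mean_sum) ** 2 for x in sums) / (n - 1)
print(mean_product, min(s, t))                  # E[W_t W_s] vs. min(s, t)
print(var_sum, max(s, t) + 3.0 * min(s, t))     # var(W_t + W_s) vs. t + 3s
```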

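Question 14. Let $W_t$ be a Wiener process. Find
\[
\int_0^t W_s\,dW_s \quad \text{and} \quad E\left[ \int_0^t W_s\,dW_s \right].
\]
Answer: Since $\int x\,dx = \frac{x^2}{2} + C$, we begin by computing $d\left( \frac{W_t^2}{2} \right)$. Recall from Itô's lemma that, if f(x,t) is twice continuously differentiable in x and continuously differentiable in t, then
\[
df(W_t, t) = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dW_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}\,dt.
\]
For $f(x,t) = \frac{x^2}{2}$ we find that
\[
d\left( \frac{W_t^2}{2} \right) = W_t\,dW_t + \frac{1}{2}\,dt,
\]
and therefore
\[
d\left( \frac{W_t^2 - t}{2} \right) = W_t\,dW_t. \tag{3.133}
\]
By integrating (3.133) between 0 and t, and since $W_0 = 0$, we obtain that
\[
\int_0^t W_s\,dW_s = \frac{W_t^2 - t}{2}, \quad \forall\, t \geq 0. \tag{3.134}
\]
From (3.134), it follows that
\[
E\left[ \int_0^t W_s\,dW_s \right] = \frac{E[W_t^2] - t}{2} = 0,
\]
since $W_t$ is normally distributed with mean 0 and variance t and therefore
\[
E[W_t^2] = \mathrm{var}(W_t) + (E[W_t])^2 = t. \quad □
\]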
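The closed form (W_t² − t)/2 can be compared with a discretized Itô sum along one simulated path. The sketch below is an illustration under assumed parameters (function name, step count, and seed are arbitrary); the integrand is evaluated at the left endpoint of each subinterval, which is what distinguishes the Itô integral, and the sum of squared increments is also returned since it should be close to t.

```python
import math
import random


def ito_sum_vs_closed_form(t=1.0, n_steps=100_000, seed=1):
    """Compare sum of W_{t_i} (W_{t_{i+1}} - W_{t_i}) with (W_t^2 - t) / 2."""
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0
    ito_sum = 0.0
    quad_var = 0.0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        ito_sum += w * dw     # left-endpoint evaluation of the integrand
        quad_var += dw * dw   # running sum of squared increments
        w += dw
    closed_form = 0.5 * (w * w - t)
    return ito_sum, closed_form, quad_var


ito_sum, closed_form, quad_var = ito_sum_vs_closed_form()
print(ito_sum, closed_form)  # the two values should nearly agree
print(quad_var)              # should be close to t = 1
```

Algebraically, the discrete sum equals (W_t² − Σ(ΔW)²)/2, so the gap between the two printed values is half the gap between the sum of squared increments and t.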

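Question 15. Find the distribution of the random variable
\[
X = \int_0^1 W_t\,dW_t.
\]
Answer: Recall from Itô's formula that, if $W_t$ is a Wiener process and f(x) is a function with continuous second order derivative, then
\[
df(W_t) = f'(W_t)\,dW_t + \frac{1}{2} f''(W_t)\,dt. \tag{3.135}
\]
For $f(x) = x^2$, we obtain from (3.135) that
\[
dW_t^2 = 2W_t\,dW_t + dt. \tag{3.136}
\]
By integrating (3.136) from 0 to 1, we obtain that
\[
W_1^2 = 2\int_0^1 W_t\,dW_t + 1 = 2X + 1. \tag{3.137}
\]
Note that $W_1$ is a standard normal random variable, and let $W_1 = Z$. By solving (3.137) for X, we find that
\[
X = \frac{W_1^2 - 1}{2} = \frac{Z^2 - 1}{2}.
\]
Let $f_X(x)$ and $F_X(x)$ be the probability density function of X and the cumulative distribution function of X, respectively.
Note that
\[
P\left( X < -\frac{1}{2} \right) = P(Z^2 < 0) = 0.
\]
Thus, if $x < -\frac{1}{2}$, then $F_X(x) = P(X \leq x) = 0$ and therefore
\[
f_X(x) = 0,
\]
since $f_X(x) = F_X'(x)$.
If $x \geq -\frac{1}{2}$, then
\[
\begin{aligned}
F_X(x) &= P(X \leq x) = P\left( \frac{Z^2 - 1}{2} \leq x \right) \\
&= P(Z^2 \leq 2x + 1) \\
&= P\left( -\sqrt{2x+1} \leq Z \leq \sqrt{2x+1} \right) \\
&= 2P\left( 0 \leq Z \leq \sqrt{2x+1} \right) \\
&= \frac{2}{\sqrt{2\pi}} \int_0^{\sqrt{2x+1}} e^{-\frac{y^2}{2}}\,dy.
\end{aligned}
\]
By differentiating $F_X(x)$, it follows that, for $x > -\frac{1}{2}$,
\[
\begin{aligned}
f_X(x) &= F_X'(x) \\
&= \frac{2}{\sqrt{2\pi}} \frac{d}{dx} \left( \int_0^{\sqrt{2x+1}} e^{-\frac{y^2}{2}}\,dy \right) \\
&= \frac{2}{\sqrt{2\pi}}\, e^{-\frac{2x+1}{2}} \left( \sqrt{2x+1} \right)' \\
&= \sqrt{\frac{2}{\pi}}\, \frac{e^{-\frac{2x+1}{2}}}{\sqrt{2x+1}}.
\end{aligned}
\]
We conclude that
\[
f_X(x) = \begin{cases} \sqrt{\dfrac{2}{\pi}}\, \dfrac{e^{-\frac{2x+1}{2}}}{\sqrt{2x+1}}, & \text{if } x > -\frac{1}{2}; \\ 0, & \text{if } x \leq -\frac{1}{2}. \end{cases} \quad □
\]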

Question 16. Let $W_t$ be a Wiener process. Find the mean and the variance of
\[
\int_0^t W_s^2\,dW_s.
\]
Answer: Let $\mathcal{B}_t$ be the Borel σ-algebra over the time interval [0,t] and let $\mathcal{F}_t$ be the filtration for the probability space in which the Wiener process $W_t$ resides. We will use the following results:¹⁵

Theorem 3.1. (Martingality)
Let $f_t$ be progressively measurable and square integrable in [0,T], i.e., $f_t$ is $\mathcal{B}_t \otimes \mathcal{F}_t$-measurable for every $t \in [0,T]$ and $E\left[ \int_0^T |f_t|^2\,dt \right] < \infty$. Then, the stochastic integral $\int_0^t f_s\,dW_s$ defines a zero mean, square integrable martingale for $t \in [0,T]$. In particular,
\[
E\left[ \int_0^t f_s\,dW_s \right] = 0, \quad \forall\, t \in [0,T]. \tag{3.138}
\]
Theorem 3.2. (Itô's isometry)
Let $f_t$ and $g_t$ be progressively measurable and square integrable processes. Then,
\[
E\left[ \int_0^t f_s\,dW_s \int_0^t g_s\,dW_s \right] = \int_0^t E[f_s g_s]\,ds.
\]
In particular, if $f_s = g_s$, it follows that
\[
E\left[ \left( \int_0^t f_s\,dW_s \right)^2 \right] = \int_0^t E[f_s^2]\,ds. \tag{3.139}
\]
¹⁵ For the proofs of Theorem 3.1 and Theorem 3.2, see Theorem 2.8 on p.65 and Theorem 3.1 on p.67, respectively, from Friedman [1].

For our problem, we need to compute E[X] and var(X), where
\[
X = \int_0^t W_s^2\,dW_s.
\]
We first check that the integrand $W_s^2$ is progressively measurable and square integrable in [0,t].
Note that $W_s^2$ is progressively measurable because it is adapted and continuous.
Furthermore, since $W_s \sim N(0,s)$, it follows that $W_s = \sqrt{s}\,Z$, where Z is a standard normal random variable, and therefore
\[
E[W_s^4] = E\left[ (\sqrt{s}\,Z)^4 \right] = s^2 E[Z^4] = 3s^2, \tag{3.140}
\]
where we used the fact that the fourth moment of the standard normal distribution is 3, i.e., $E[Z^4] = 3$. From (3.140), it follows that
\[
\int_0^t E[W_s^4]\,ds = \int_0^t 3s^2\,ds = t^3 < \infty. \tag{3.141}
\]
In other words, $W_s^2$ is square integrable in [0,t]. We can therefore apply both Theorems 3.1 and 3.2 with $f_s = W_s^2$ and $g_s = W_s^2$.
From (3.138) for $f_s = W_s^2$, we find that
\[
E[X] = E\left[ \int_0^t W_s^2\,dW_s \right] = 0. \tag{3.142}
\]
From (3.139) for $f_s = g_s = W_s^2$, and using (3.141), it follows that
\[
E[X^2] = E\left[ \left( \int_0^t W_s^2\,dW_s \right)^2 \right] = \int_0^t E[W_s^4]\,ds = t^3.
\]
Since E[X] = 0, see (3.142), we find that
\[
\mathrm{var}(X) = E[X^2] - (E[X])^2 = t^3. \quad □
\]

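Question 17. If $W_t$ is a Wiener process, find the variance of
\[
X = \int_0^1 \sqrt{t}\, e^{\frac{W_t^2}{8}}\,dW_t.
\]
Answer: We will use Theorems 3.1 and 3.2 to solve this problem. To do so, we first check that the integrand $\sqrt{t}\, e^{\frac{W_t^2}{8}}$ is progressively measurable and square integrable in [0,1].
The process $\sqrt{t}\, e^{\frac{W_t^2}{8}}$ is progressively measurable because it is adapted and continuous.
Furthermore, since $W_t \sim N(0,t)$, the probability density function of $W_t$ is $\frac{1}{\sqrt{2\pi t}} e^{-\frac{x^2}{2t}}$, and therefore
\[
\begin{aligned}
E\left[ e^{\frac{W_t^2}{4}} \right] &= \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{\frac{x^2}{4}} e^{-\frac{x^2}{2t}}\,dx \\
&= \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{-\left( \frac{1}{2t} - \frac{1}{4} \right) x^2}\,dx \\
&= \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{-\frac{2-t}{2t} \cdot \frac{x^2}{2}}\,dx \\
&= \frac{1}{\sqrt{2\pi t}} \sqrt{2\pi \cdot \frac{2t}{2-t}} \qquad (3.143) \\
&= \sqrt{\frac{2}{2-t}}, \qquad (3.144)
\end{aligned}
\]
where, for (3.143), we used the identity
\[
\int_{-\infty}^{\infty} e^{-\frac{A x^2}{2}}\,dx = \sqrt{\frac{2\pi}{A}}
\]
for any positive constant A > 0; here $A = \frac{2-t}{2t} > 0$ for $t \in (0,1]$. From (3.144), it follows that
\[
\begin{aligned}
\int_0^1 E\left[ \left( \sqrt{t}\, e^{\frac{W_t^2}{8}} \right)^2 \right] dt &= \int_0^1 t\, E\left[ e^{\frac{W_t^2}{4}} \right] dt \\
&= \int_0^1 t \sqrt{\frac{2}{2-t}}\,dt \\
&= \sqrt{2} \int_1^2 \frac{2-u}{\sqrt{u}}\,du \qquad (u = 2 - t) \\
&= \sqrt{2} \left( 4\sqrt{u} - \frac{2}{3} u^{3/2} \right)\Big|_1^2 \\
&= \frac{2}{3}\left( 8 - 5\sqrt{2} \right). \qquad (3.145)
\end{aligned}
\]
Thus, $\int_0^1 E\left[ \left( \sqrt{t}\, e^{\frac{W_t^2}{8}} \right)^2 \right] dt < \infty$, and we conclude that both Theorems 3.1 and 3.2 can be applied here.
From (3.138), it follows that
\[
E[X] = E\left[ \int_0^1 \sqrt{t}\, e^{\frac{W_t^2}{8}}\,dW_t \right] = 0. \tag{3.146}
\]
From (3.139), and using (3.145), we obtain that
\[
\begin{aligned}
E[X^2] &= E\left[ \left( \int_0^1 \sqrt{t}\, e^{\frac{W_t^2}{8}}\,dW_t \right)^2 \right] \\
&= \int_0^1 E\left[ \left( \sqrt{t}\, e^{\frac{W_t^2}{8}} \right)^2 \right] dt \\
&= \int_0^1 t\, E\left[ e^{\frac{W_t^2}{4}} \right] dt \\
&= \frac{2}{3}\left( 8 - 5\sqrt{2} \right). \qquad (3.147)
\end{aligned}
\]
From (3.146) and (3.147), we conclude that
\[
\mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{2}{3}\left( 8 - 5\sqrt{2} \right).
\]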

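Question 18. If $W_t$ is a Wiener process, what is $E\left[ e^{W_t} \right]$?
Answer:
Solution 1: If $W_t$ is a Wiener process and $Y = e^{W_t}$, then
\[
\ln(Y) = W_t = \sqrt{t}\,Z \quad \text{(in distribution)}, \tag{3.148}
\]
where Z is the standard normal random variable, and therefore Y is a lognormal random variable. Recall that the expected value of a lognormal random variable V given by $\ln(V) = \mu + \sigma Z$ is
\[
E[V] = e^{\mu + \frac{\sigma^2}{2}}. \tag{3.149}
\]
From (3.148), and using (3.149) with $\mu = 0$ and $\sigma = \sqrt{t}$, we conclude that
\[
E\left[ e^{W_t} \right] = e^{\frac{t}{2}}.
\]
Solution 2: Since $W_t = \sqrt{t}\,Z$ in distribution, it follows that
\[
\begin{aligned}
E\left[ e^{W_t} \right] &= E\left[ e^{\sqrt{t}\,Z} \right] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{\sqrt{t}\,x} e^{-\frac{x^2}{2}}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{\sqrt{t}\,x - \frac{x^2}{2}}\,dx. \qquad (3.150)
\end{aligned}
\]
By completing the square, we find that
\[
\sqrt{t}\,x - \frac{x^2}{2} = -\frac{1}{2}\left( x - \sqrt{t} \right)^2 + \frac{t}{2}
\]
and therefore
\[
e^{\sqrt{t}\,x - \frac{x^2}{2}} = e^{\frac{t}{2}}\, e^{-\frac{(x - \sqrt{t})^2}{2}}. \tag{3.151}
\]
From (3.150) and (3.151), it follows that
\[
\begin{aligned}
E\left[ e^{W_t} \right] &= e^{\frac{t}{2}} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(x - \sqrt{t})^2}{2}}\,dx \\
&= e^{\frac{t}{2}} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy \qquad (3.152) \\
&= e^{\frac{t}{2}}, \qquad (3.153)
\end{aligned}
\]
where we used the substitution $y = x - \sqrt{t}$ for (3.152), and, for (3.153), we used the fact that
\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy = 1,
\]
since $\frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}$ is the probability density function of the standard normal distribution. □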
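The value e^{t/2} can be cross-checked by sampling W_t = √t Z directly. The sketch below is illustrative only (function name, t = 1, sample size, and seed are assumptions).

```python
import math
import random


def mc_expectation_exp_wiener(t, n_paths=200_000, seed=2):
    """Monte Carlo estimate of E[exp(W_t)] using W_t = sqrt(t) * Z."""
    rng = random.Random(seed)
    sqrt_t = math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        total += math.exp(sqrt_t * rng.gauss(0.0, 1.0))
    return total / n_paths


t = 1.0  # illustrative time
estimate = mc_expectation_exp_wiener(t)
exact = math.exp(t / 2.0)
print(estimate, exact)  # simulated mean vs. exp(t / 2)
```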

Question 19. If $W_t$ is a Wiener process, find the variance of
\[
\int_0^t s\,dW_s.
\]
Answer: Recall that, for any deterministic square integrable function $f : [a,b] \to \mathbb{R}$, the stochastic integral $\int_a^b f(s)\,dW_s$ is normally distributed with mean 0 and variance equal to the square of the $L^2$ norm of f, i.e.,
\[
\int_a^b f(s)\,dW_s \sim N\left( 0, \int_a^b f^2(s)\,ds \right).
\]
Then,
\[
\int_0^t s\,dW_s \sim N\left( 0, \int_0^t s^2\,ds \right) = N\left( 0, \frac{t^3}{3} \right),
\]
and we conclude that
\[
\mathrm{var}\left( \int_0^t s\,dW_s \right) = \frac{t^3}{3}. \quad □
\]

Question 20. Let $W_t$ be a Wiener process, and let
\[
X_t = \int_0^t W_\tau\,d\tau. \tag{3.154}
\]
What is the distribution of $X_t$? Is $X_t$ a martingale?
Answer: A solution to this question was given in Chapter 1 using integration by parts; we include a different solution here.
Note that $X_t$ is not a martingale because, if we rewrite (3.154) in differential form as
\[
dX_t = W_t\,dt = W_t\,dt + 0\,dW_t,
\]
we can think of $X_t$ as being a diffusion process with only the drift part $W_t$.
Recall that the integral of a one-parameter family of Gaussian random variables remains Gaussian. Since $W_t$ is a Gaussian family, $X_t$ is normally distributed. Furthermore,
\[
E[X_t] = \int_0^t E[W_\tau]\,d\tau = 0.
\]
Therefore,
\[
\mathrm{var}(X_t) = E[X_t^2] - (E[X_t])^2 = E[X_t^2]. \tag{3.155}
\]
Note that
\[
X_t^2 = \left( \int_0^t W_s\,ds \right)^2 = \int_0^t \int_0^t W_s W_u\,ds\,du, \tag{3.156}
\]
and recall that
\[
E[W_s W_u] = \min\{s,u\}, \quad \forall\, s,u \geq 0; \tag{3.157}
\]
see (3.127).
From (3.155)-(3.157), we obtain that
\[
\begin{aligned}
\mathrm{var}(X_t) = E[X_t^2] &= \int_0^t \int_0^t E[W_s W_u]\,ds\,du \\
&= \int_0^t \int_0^t \min\{s,u\}\,ds\,du \\
&= \int_0^t \left( \int_0^u s\,ds + \int_u^t u\,ds \right) du \\
&= \int_0^t \left( \frac{u^2}{2} + u(t-u) \right) du \\
&= \int_0^t \left( ut - \frac{u^2}{2} \right) du \\
&= \frac{t^3}{2} - \frac{t^3}{6} \\
&= \frac{t^3}{3}.
\end{aligned}
\]
We conclude that $X_t$ is normally distributed with mean 0 and variance $\frac{t^3}{3}$, i.e.,
\[
X_t \sim N\left( 0, \frac{t^3}{3} \right).
\]
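The double integral of min{s,u} over the square [0,t] × [0,t] that gives var(X_t) = t³/3 can be evaluated numerically as a deterministic check. The sketch below uses a midpoint rule; the function name, grid size, and the choice t = 2 are illustrative assumptions.

```python
def variance_of_integrated_wiener(t, n=400):
    """Midpoint-rule approximation of the double integral of min(s, u)
    over [0, t] x [0, t], which equals var(X_t) for X_t = int_0^t W ds."""
    h = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            u = (j + 0.5) * h
            total += min(s, u)
    return total * h * h


t = 2.0  # illustrative time
approx = variance_of_integrated_wiener(t)
print(approx, t ** 3 / 3.0)  # quadrature value vs. t^3 / 3
```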

Question 21. What is an Itô process?

Answer: An Itô process is a generic term referring to a stochastic process $X_t$ determined by the solution of a stochastic differential equation (SDE) of the form
\[
dX_t = a(X_t,t)\,dt + b(X_t,t)\,dW_t, \tag{3.158}
\]
where $W_t$ is a Wiener process. The coefficient a(x,t) of the dt term is the drift of $X_t$; the coefficient b(x,t) of the $dW_t$ term is the diffusion of $X_t$. The SDE (3.158) is, by definition, the shorthand notation for the stochastic integral equation
\[
X_t = X_0 + \int_0^t a(X_s,s)\,ds + \int_0^t b(X_s,s)\,dW_s.
\]
We note that a sufficient condition for the existence and uniqueness of the (strong) solution to an SDE is for the drift and the diffusion coefficients a(x,t) and b(x,t) to be locally Lipschitz functions of at most linear growth in x. □

Question 22. What is Itô's lemma?
Answer: Itô's lemma, also known as Itô's formula, states that, if $X_t$ is an Itô process satisfying the SDE
\[
dX_t = a(X_t,t)\,dt + b(X_t,t)\,dW_t,
\]
then for any function f(x,t) with continuous second order partial derivative in x and continuous first order partial derivative in t, the process $f(X_t,t)$ is also an Itô process, driven by the same Wiener process $W_t$, and the drift and diffusion parts of $f(X_t,t)$ are determined according to the Taylor expansion of f(x,t) to first order for the t part and up to second order for the x part:
\[
\begin{aligned}
df(X_t,t) &= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX_t + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}\,(dX_t)^2 \\
&= \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x} \left[ a(X_t,t)\,dt + b(X_t,t)\,dW_t \right] + \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \left[ a(X_t,t)\,dt + b(X_t,t)\,dW_t \right]^2 \\
&= \left( \frac{\partial f}{\partial t} + a(X_t,t) \frac{\partial f}{\partial x} + \frac{b^2(X_t,t)}{2} \frac{\partial^2 f}{\partial x^2} \right) dt + b(X_t,t) \frac{\partial f}{\partial x}\,dW_t. \qquad (3.159)
\end{aligned}
\]
Note that for (3.159) we used the fact that
\[
(dX_t)^2 = \left[ a(X_t,t)\,dt + b(X_t,t)\,dW_t \right]^2 = b^2(X_t,t)\,dt,
\]
since $(dW_t)^2 = dt$ (see Problem 3.6 in this section), $(dt)^2 = 0$, and $dW_t\,dt = 0$. □

Question 23. If $W_t$ is a Wiener process, is the process $X_t = W_t^2$ a martingale?
Answer: Recall that a stochastic process $M_t$ defined in a filtered probability space $(\Omega, \mathcal{F}_t, P)$ is a martingale if and only if
(i) $M_t$ is adapted, i.e., $M_t$ is $\mathcal{F}_t$-measurable for all t;
(ii) $M_t$ is integrable for all t, i.e., $E[|M_t|] < \infty$ for all t;
(iii) $E[M_t \mid \mathcal{F}_s] = M_s$ almost surely for all $s \leq t$.
We check whether the process $X_t$ satisfies the conditions (i), (ii), and (iii) above.
Since any continuous function of the Wiener process $W_t$ is adapted, the process $X_t$ is adapted, and therefore condition (i) is satisfied.
Also, $E[|X_t|] = E[W_t^2] = t < \infty$, since $W_t$ is a normal random variable of mean 0 and variance t, and therefore $E[W_t^2] = \mathrm{var}(W_t) = t$. Thus, the random variable $X_t$ is integrable for every t, and $X_t$ satisfies condition (ii).
To check whether $X_t$ satisfies condition (iii), note that, for every $s \leq t$,
\[
\begin{aligned}
E[X_t \mid \mathcal{F}_s] &= E[W_t^2 \mid \mathcal{F}_s] \\
&= E\left[ (W_t - W_s + W_s)^2 \mid \mathcal{F}_s \right] \\
&= E\left[ (W_t - W_s)^2 \mid \mathcal{F}_s \right] + 2E\left[ (W_t - W_s) W_s \mid \mathcal{F}_s \right] + E\left[ W_s^2 \mid \mathcal{F}_s \right]. \qquad (3.160)
\end{aligned}
\]
Recall that the Wiener process $W_t$ has independent increments, i.e., $W_t - W_s$ is independent of $\mathcal{F}_s$, and stationary increments, i.e., $W_t - W_s \sim W_{t-s}$. Moreover, $W_t - W_s$ is a normal random variable of mean 0 and variance t − s, i.e., $E[W_{t-s}] = 0$ and $\mathrm{var}(W_{t-s}) = E[W_{t-s}^2] = t - s$. Then,
\[
E\left[ (W_t - W_s)^2 \mid \mathcal{F}_s \right] = E[W_{t-s}^2] = t - s, \tag{3.161}
\]
and
\[
E\left[ (W_t - W_s) W_s \mid \mathcal{F}_s \right] = W_s\, E[W_t - W_s \mid \mathcal{F}_s] = W_s\, E[W_{t-s}] = 0. \tag{3.162}
\]
Since $W_s$ is $\mathcal{F}_s$-measurable, it follows that
\[
E\left[ W_s^2 \mid \mathcal{F}_s \right] = W_s^2. \tag{3.163}
\]
From (3.160)-(3.163), we find that
\[
\begin{aligned}
E[X_t \mid \mathcal{F}_s] &= E\left[ (W_t - W_s)^2 \mid \mathcal{F}_s \right] + 2E\left[ (W_t - W_s) W_s \mid \mathcal{F}_s \right] + E\left[ W_s^2 \mid \mathcal{F}_s \right] \\
&= t - s + W_s^2.
\end{aligned}
\]
Thus, for s < t,
\[
E[X_t \mid \mathcal{F}_s] = t - s + W_s^2 \neq W_s^2 = X_s,
\]
and therefore the process $X_t$ does not satisfy condition (iii).
We conclude that $X_t$ is not a martingale. □

Question 24. If $W_t$ is a Wiener process, is the process
\[
N_t = W_t^3 - 3tW_t
\]
a martingale?
Answer:
Solution 1: A stochastic process $M_t$ defined in a filtered probability space $(\Omega, \mathcal{F}_t, P)$ is called a martingale if and only if
(i) $M_t$ is adapted, i.e., $M_t$ is $\mathcal{F}_t$-measurable for all t;
(ii) $M_t$ is integrable for all t, i.e., $E[|M_t|] < \infty$ for all t;
(iii) $E[M_t \mid \mathcal{F}_s] = M_s$ almost surely for all $s \leq t$.
We check whether the process $N_t$ satisfies the conditions (i), (ii), and (iii) above.
Since any continuous function of the Wiener process $W_t$ is adapted, the process $N_t$ is adapted, and therefore condition (i) is satisfied.
Also,
\[
E[|N_t|] \leq E\left[ |W_t|^3 \right] + 3t\, E[|W_t|] < \infty,
\]
since $W_t \sim N(0,t)$ has finite moments of all orders. Thus, the random variable $N_t$ is integrable for every t, and $N_t$ satisfies condition (ii).
To check whether $N_t$ satisfies condition (iii), note that, for every $s \leq t$,
\[
\begin{aligned}
E[N_t \mid \mathcal{F}_s] &= E\left[ W_t^3 - 3tW_t \mid \mathcal{F}_s \right] \\
&= E\left[ (W_t - W_s + W_s)^3 \mid \mathcal{F}_s \right] - 3t\, E[W_t \mid \mathcal{F}_s] \\
&= E\left[ (W_t - W_s)^3 \mid \mathcal{F}_s \right] + 3E\left[ (W_t - W_s)^2 W_s \mid \mathcal{F}_s \right] \\
&\quad + 3E\left[ (W_t - W_s) W_s^2 \mid \mathcal{F}_s \right] + E\left[ W_s^3 \mid \mathcal{F}_s \right] - 3t\, E[W_t \mid \mathcal{F}_s].
\end{aligned}
\]
Recall that the Wiener process $W_t$ has independent increments, i.e., $W_t - W_s$ is independent of $\mathcal{F}_s$, and stationary increments, i.e., $W_t - W_s \sim W_{t-s}$, and the fact that $W_s$ is $\mathcal{F}_s$-measurable. Then,
\[
E\left[ (W_t - W_s)^3 \mid \mathcal{F}_s \right] = E[W_{t-s}^3] = 0; \tag{3.164}
\]
\[
E\left[ (W_t - W_s)^2 W_s \mid \mathcal{F}_s \right] = W_s\, E[W_{t-s}^2] = (t - s) W_s; \tag{3.165}
\]
\[
E\left[ (W_t - W_s) W_s^2 \mid \mathcal{F}_s \right] = W_s^2\, E[W_{t-s}] = 0; \tag{3.166}
\]
\[
E\left[ W_s^3 \mid \mathcal{F}_s \right] = W_s^3, \tag{3.167}
\]
since $W_{t-s} \sim N(0, t-s)$, and therefore
\[
E[W_{t-s}] = E[W_{t-s}^3] = 0; \qquad E[W_{t-s}^2] = \mathrm{var}(W_{t-s}) = t - s.
\]
Moreover, since $W_t$ is a martingale, it follows that
\[
E[W_t \mid \mathcal{F}_s] = W_s. \tag{3.168}
\]
From (3.164)-(3.168), we find that
\[
\begin{aligned}
E[N_t \mid \mathcal{F}_s] &= 0 + 3(t - s) W_s + 0 + W_s^3 - 3tW_s \\
&= W_s^3 - 3sW_s \\
&= N_s.
\end{aligned}
\]
Thus, the process $N_t$ satisfies condition (iii), and we conclude that $N_t$ is a martingale.

Solution 2: By applying Itô's formula to $N_t$, we obtain that
\[
\begin{aligned}
dN_t &= d\left( W_t^3 - 3tW_t \right) \\
&= 3W_t^2\,dW_t + \frac{1}{2} \cdot 6W_t\,dt - 3\left( W_t\,dt + t\,dW_t \right) \\
&= 3\left( W_t^2 - t \right) dW_t.
\end{aligned}
\]
The process $N_t$ has zero drift and can be written as a stochastic integral as
\[
N_t = \int_0^t 3\left( W_s^2 - s \right) dW_s.
\]
Moreover,
\[
\begin{aligned}
\int_0^t E\left[ \left( 3(W_s^2 - s) \right)^2 \right] ds &= 9 \int_0^t \left( E[W_s^4] - 2s\,E[W_s^2] + s^2 \right) ds \\
&= 9 \int_0^t \left( 3s^2 - 2s^2 + s^2 \right) ds \qquad (3.169) \\
&= 6t^3 < \infty;
\end{aligned}
\]
for (3.169), we used the fact that $W_s = \sqrt{s}\,Z$, where Z is a standard normal random variable, and therefore
\[
E[W_s^2] = E\left[ (\sqrt{s}\,Z)^2 \right] = s\,E[Z^2] = s; \qquad E[W_s^4] = E\left[ (\sqrt{s}\,Z)^4 \right] = s^2 E[Z^4] = 3s^2.
\]
Therefore, the process $3(W_s^2 - s)$ is square integrable for $s \in [0,t]$, and, from Theorem 3.1, we conclude that $N_t$ is a martingale.
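Condition (iii) for N_t can be illustrated numerically by fixing a value of W_s and averaging N_t over simulated increments. The sketch below is only a spot check at one hypothetical conditioning value (w_s = 0.7, s = 1, t = 2, sample size, and seed are all assumptions), not a proof.

```python
import math
import random


def conditional_mean_of_n(w_s, s, t, n=200_000, seed=0):
    """Monte Carlo estimate of E[W_t^3 - 3 t W_t | W_s = w_s]."""
    rng = random.Random(seed)
    sd = math.sqrt(t - s)
    total = 0.0
    for _ in range(n):
        # W_t = W_s + independent N(0, t - s) increment.
        w_t = w_s + sd * rng.gauss(0.0, 1.0)
        total += w_t ** 3 - 3.0 * t * w_t
    return total / n


s, t, w_s = 1.0, 2.0, 0.7  # illustrative values
estimate = conditional_mean_of_n(w_s, s, t)
n_s = w_s ** 3 - 3.0 * s * w_s
print(estimate, n_s)  # conditional average of N_t vs. N_s
```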

Question 25. What is Girsanov's theorem?
Answer: Girsanov's theorem provides a way to change the drift of a Wiener process by defining a new probability measure via a Radon-Nikodym derivative. More precisely, let $(\Omega, \mathcal{F}_t, P)$, for $0 \leq t \leq T$, be a filtered probability space, and let $W_t$ be a Wiener process in the probability measure P. Let $h_t$ be a progressively measurable stochastic process such that the stochastic exponential
\[
\mathcal{E}_t(h) = \exp\left( \int_0^t h_s\,dW_s - \frac{1}{2} \int_0^t h_s^2\,ds \right)
\]
is a martingale in the probability measure P. Define a new probability measure $\tilde{P}$ over $\Omega$ given by the Radon-Nikodym derivative as follows:
\[
\frac{d\tilde{P}}{dP} = \exp\left( \int_0^T h_t\,dW_t - \frac{1}{2} \int_0^T h_t^2\,dt \right). \tag{3.170}
\]
Then, $W_t$ is a Wiener process with drift h under the new probability measure $\tilde{P}$. Equivalently, if we define $\tilde{W}_t$ by $d\tilde{W}_t = dW_t - h_t\,dt$, then $\tilde{W}_t$ is a Wiener process in the $\tilde{P}$ measure. More generally, if $X_t$ is the diffusion process satisfying the SDE
\[
dX_t = a(X_t,t)\,dW_t + b(X_t,t)\,dt
\]
under the probability measure P, then under the probability measure $\tilde{P}$ defined in (3.170), $X_t$ satisfies the following SDE:
\[
\begin{aligned}
dX_t &= a(X_t,t)\,dW_t + b(X_t,t)\,dt \\
&= a(X_t,t)\left( d\tilde{W}_t + h_t\,dt \right) + b(X_t,t)\,dt \\
&= a(X_t,t)\,d\tilde{W}_t + \left( b(X_t,t) + h_t\,a(X_t,t) \right) dt.
\end{aligned}
\]
In other words, in the P-measure, $X_t$ has drift part b and diffusion part a; whereas in the $\tilde{P}$-measure, the diffusion part stays the same while the drift part becomes b + ha.
In particular, if we choose h to be $-\frac{b}{a}$, then $X_t$ becomes driftless in the $\tilde{P}$-measure provided that the stochastic exponential $\mathcal{E}_t(h)$ is a martingale.
We note that the martingality¹⁶ of the stochastic exponential $\mathcal{E}_t(h)$ guarantees that the new measure $\tilde{P}$ defined in (3.170) is a probability measure, i.e., $\int_\Omega d\tilde{P} = 1$, since
\[
\int_\Omega d\tilde{P} = \int_\Omega \frac{d\tilde{P}}{dP}\,dP = E\left[ \mathcal{E}_T(h) \right] = \mathcal{E}_0(h) = 1. \quad □
\]

Question 26. What is the martingale representation theorem, and how is it related to option pricing and hedging?
Answer: Let $W_t$ be a Wiener process defined on the filtered probability space $(\Omega, \mathcal{F}_t, P)$, where the filtration $\{\mathcal{F}_t\}$ is generated by the Wiener process $W_t$. Let $M_t$ be a martingale with respect to $\{\mathcal{F}_t\}$ such that $M_t$ is square integrable for every t, i.e., $E[M_t^2] < \infty$, for all t. The martingale representation theorem asserts that, for any such martingale $M_t$, there exists an $\mathcal{F}_t$-adapted square integrable process $\theta_t$ such that $M_t$ has the stochastic integral representation
\[
M_t = E[M_0] + \int_0^t \theta_s\,dW_s
\]
almost surely. Note that such martingales have to be continuous.
¹⁶ A sufficient condition which ensures the martingality of $\mathcal{E}_t(h)$ is Novikov's condition $E\left[ e^{\frac{1}{2} \int_0^T h_t^2\,dt} \right] < \infty$.

The relationship between the martingale representation theorem and option pricing and hedging is as follows. For simplicity, assume that the risk free rate is zero. Assume that the price process $S_t$ of the underlying asset follows the diffusion process determined by the SDE
\[
dS_t = \sigma_t S_t\,dW_t
\]
under the risk neutral probability P, where the driving Wiener process $W_t$ is defined on the filtered probability space $(\Omega, \mathcal{F}_t, P)$ and the filtration $\{\mathcal{F}_t\}$ is generated by $W_t$. Consider a derivative whose payoff function (possibly path dependent) is $\varphi_T$ at maturity T. The Fundamental Theorem of Asset Pricing asserts that, because the risk free rate is assumed zero, the value process $V_t$ of the derivative security is a martingale under the risk neutral probability. In fact,
\[
V_t = E[\varphi_T \mid \mathcal{F}_t].
\]
Moreover, if the payoff function $\varphi_T$ is square integrable, i.e., $E[\varphi_T^2] < \infty$, then $V_t$ is a square integrable martingale. Therefore, by applying the martingale representation theorem, we find that there exists an $\mathcal{F}_t$-adapted process $\theta_t$ such that
\[
\varphi_T = V_T = E[V_0] + \int_0^T \theta_t\,dW_t = E[V_0] + \int_0^T \frac{\theta_t}{\sigma_t S_t}\,dS_t.
\]
Note that the quantity $\frac{\theta_t}{\sigma_t S_t}$ indicates the number of shares to hold in order to dynamically hedge the position of the derivative with payoff function $\varphi_T$ at maturity T. In particular, if $\varphi_T$ is the payoff of a call option, then $\frac{\theta_t}{\sigma_t S_t}$ corresponds to the delta of the call.

Question 27. Solve

Og = Gd es (3.171)
196 CHAPTER 3. SOLUTIONS

where W; is a Wiener process.


Answer:
Solution 1: Note that (3.171) is a particular case for $\mu = 0$ and $\sigma = 1$ of the stochastic differential equation

$$dS_t = \mu S_t \, dt + \sigma S_t \, dW_t, \tag{3.172}$$

which is the model for the evolution of an asset in the Black-Scholes framework. The solution of (3.172) has the distribution

$$S_t = S_0 \exp\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma \sqrt{t} \, Z \right), \tag{3.173}$$

with $t > 0$, where $Z$ is the standard normal variable. For $\mu = 0$ and $\sigma = 1$ in (3.173), we obtain that the solution to (3.171) is

$$Y_t = Y_0 \exp\left( -\frac{t}{2} + \sqrt{t} \, Z \right). \tag{3.174}$$

Solution 2: From Itô's lemma it follows that, if $Y_t$ is a stochastic process satisfying $dY_t = Y_t \, dW_t$ and $f(y)$ is a function with continuous second order derivative, then

$$d(f(Y_t)) = \frac{1}{2} f''(Y_t) Y_t^2 \, dt + f'(Y_t) Y_t \, dW_t. \tag{3.175}$$

For $f(y) = \ln(y)$, we obtain from (3.175) that

$$d(\ln(Y_t)) = -\frac{1}{2} \, dt + dW_t. \tag{3.176}$$

By integrating (3.176) between 0 and $t$, we find that

$$\ln(Y_t) - \ln(Y_0) = -\frac{t}{2} + W_t - W_0 = -\frac{t}{2} + W_t,$$

since $W_0 = 0$, and therefore

$$Y_t = Y_0 \exp\left( -\frac{t}{2} + W_t \right).$$

Since $W_t$ is a Wiener process, $W_t$ is a normal random variable of mean 0 and variance $t$, i.e., $W_t = \sqrt{t} \, Z$, where $Z$ is the standard normal variable. Thus, $Y_t$ has the distribution

$$Y_t = Y_0 \exp\left( -\frac{t}{2} + \sqrt{t} \, Z \right),$$

which is the same as (3.174). $\square$
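As a rough numerical sanity check of the closed form $Y_t = Y_0 \exp(-t/2 + W_t)$, one can discretize $dY_t = Y_t \, dW_t$ with the Euler-Maruyama scheme and compare the endpoint with the formula evaluated on the same Brownian path. This is only a sketch: the function name, step count, and seed are illustrative choices, not part of the original solution.

```python
import math
import random


def simulate_dy_equals_y_dw(y0=1.0, t_end=1.0, n_steps=20000, seed=0):
    """Euler-Maruyama path of dY = Y dW, plus the closed form on the same path."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    y, w = y0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        y += y * dw                         # Euler-Maruyama step
        w += dw                             # running value of W_t
    exact = y0 * math.exp(-0.5 * t_end + w)  # Y_0 * exp(-t/2 + W_t)
    return y, exact


y_euler, y_exact = simulate_dy_equals_y_dw()
print(y_euler, y_exact)
```

The two printed values should agree to within the discretization error, which shrinks as `n_steps` grows.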

Question 28. Solve the following SDEs:

(i) $dY_t = \mu Y_t \, dt + \sigma Y_t \, dW_t$;

(ii) $dX_t = \mu \, dt + (a X_t + b) \, dW_t$.

Answer:
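(i) Recall from Itô's formula that, if $Y_t$ satisfies the SDE

$$dY_t = \mu Y_t \, dt + \sigma Y_t \, dW_t,$$

and if $f(y)$ is a function with continuous second order derivative, then

$$df(Y_t) = \left( \mu Y_t f'(Y_t) + \frac{\sigma^2 Y_t^2}{2} f''(Y_t) \right) dt + \sigma Y_t f'(Y_t) \, dW_t. \tag{3.177}$$

For $f(y) = \ln(y)$, we obtain from (3.177) that

$$d\ln(Y_t) = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \, dW_t. \tag{3.178}$$

By integrating (3.178) from 0 to $t$, we obtain that

$$\ln(Y_t) - \ln(Y_0) = \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t, \tag{3.179}$$

and by solving (3.179) for $Y_t$, we conclude that

$$Y_t = Y_0 \exp\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right).$$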

(ii) We look for a solution of

$$dX_t = \mu \, dt + (a X_t + b) \, dW_t \tag{3.180}$$

of the form

$$X_t = U_t V_t, \tag{3.181}$$

where the process $U_t$ is defined by the solution to the SDE

$$dU_t = a U_t \, dW_t, \quad \text{with } U_0 = 1, \tag{3.182}$$

and the process $V_t$ is the solution to the SDE

$$dV_t = \alpha_t \, dt + \beta_t \, dW_t, \quad \text{with } V_0 = X_0, \tag{3.183}$$

where the coefficients $\alpha_t$ and $\beta_t$ are to be determined.


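Recall from (i) that the solution to

$$dY_t = \mu Y_t \, dt + \sigma Y_t \, dW_t$$

is

$$Y_t = Y_0 \exp\left( \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma W_t \right). \tag{3.184}$$

By letting $\mu = 0$ and $\sigma = a$ in (3.184), we find that the solution to (3.182) is

$$U_t = e^{a W_t - \frac{a^2}{2} t}. \tag{3.185}$$

Note that

$$dU_t \, dV_t = (a U_t \, dW_t)(\alpha_t \, dt + \beta_t \, dW_t) = a \beta_t U_t \, dt, \tag{3.186}$$

since $(dW_t)^2 = dt$ and $dW_t \, dt = 0$ (because this term has order $(dt)^{3/2}$). By applying Itô's product rule to $X_t = U_t V_t$ and using (3.186), we obtain that

$$\begin{aligned}
dX_t = d(U_t V_t) &= U_t \, dV_t + V_t \, dU_t + dU_t \, dV_t \\
&= U_t (\alpha_t \, dt + \beta_t \, dW_t) + V_t (a U_t \, dW_t) + a \beta_t U_t \, dt \\
&= (a \beta_t + \alpha_t) U_t \, dt + (a U_t V_t + \beta_t U_t) \, dW_t \\
&= (a \beta_t + \alpha_t) U_t \, dt + (a X_t + \beta_t U_t) \, dW_t.
\end{aligned} \tag{3.187}$$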

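Then, $X_t = U_t V_t$ is a solution to (3.180) if and only if the coefficients of the $dW_t$ terms and of the $dt$ terms in (3.180) and (3.187) are equal.
From the $dW_t$ terms, we obtain that

$$a X_t + b = a X_t + \beta_t U_t \iff \beta_t U_t = b \iff \beta_t = b U_t^{-1} = b \, e^{-a W_t + \frac{a^2}{2} t}. \tag{3.188}$$

From the $dt$ terms, we obtain that

$$\begin{aligned}
(a \beta_t + \alpha_t) U_t = \mu
&\iff a \beta_t + \alpha_t = \mu U_t^{-1} \\
&\iff \alpha_t = \mu \, e^{-a W_t + \frac{a^2}{2} t} - a b \, e^{-a W_t + \frac{a^2}{2} t} \\
&\iff \alpha_t = (\mu - ab) \, e^{-a W_t + \frac{a^2}{2} t}.
\end{aligned} \tag{3.189}$$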
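Note that the solution $V_t$ to (3.183) is given by

$$V_t = X_0 + \int_0^t \alpha_s \, ds + \int_0^t \beta_s \, dW_s. \tag{3.190}$$

From (3.181), (3.185), and (3.190), we find that the solution $X_t$ to (3.180) is

$$X_t = e^{a W_t - \frac{a^2}{2} t} \left( X_0 + \int_0^t \alpha_s \, ds + \int_0^t \beta_s \, dW_s \right)$$

and, using (3.188) and (3.189), we obtain that

$$\begin{aligned}
X_t &= e^{a W_t - \frac{a^2}{2} t} \left[ X_0 + \int_0^t (\mu - ab) \, e^{-a W_s + \frac{a^2}{2} s} \, ds + \int_0^t b \, e^{-a W_s + \frac{a^2}{2} s} \, dW_s \right] \\
&= X_0 \, e^{a W_t - \frac{a^2}{2} t} + (\mu - ab) \int_0^t e^{a (W_t - W_s) - \frac{a^2}{2} (t - s)} \, ds + b \int_0^t e^{a (W_t - W_s) - \frac{a^2}{2} (t - s)} \, dW_s. \quad \square
\end{aligned}$$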
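The product construction $X_t = U_t V_t$ can be checked numerically on a single simulated Brownian path: an Euler-Maruyama discretization of (3.180) should land close to $U_t V_t$, where $U_t$ comes from (3.185) and $V_t$ from left-endpoint sums approximating (3.190) with the coefficients (3.188) and (3.189). The sketch below assumes illustrative parameter values and a hypothetical function name; it is a consistency check, not a proof.

```python
import math
import random


def compare_euler_with_formula(mu=0.5, a=0.3, b=0.2, x0=1.0,
                               t_end=1.0, n_steps=20000, seed=1):
    """Return (Euler endpoint of (3.180), U_t * V_t built from (3.185)-(3.190))."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    x_euler = x0
    v = x0            # V_0 = X_0
    w, t = 0.0, 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        inv_u = math.exp(-a * w + 0.5 * a * a * t)        # 1 / U_t at the left endpoint
        v += (mu - a * b) * inv_u * dt + b * inv_u * dw   # alpha_t dt + beta_t dW_t
        x_euler += mu * dt + (a * x_euler + b) * dw       # Euler step for (3.180)
        w += dw
        t += dt
    u = math.exp(a * w - 0.5 * a * a * t_end)             # U_t from (3.185)
    return x_euler, u * v


x_euler, x_formula = compare_euler_with_formula()
print(x_euler, x_formula)
```

Both numbers are discretizations of the same strong solution, so they should differ only by an error that vanishes as the step size goes to zero.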

Question 29. What is the Heston model?


Answer: In Heston's stochastic volatility model, it is assumed that the price of the underlying asset satisfies the same SDE as in the lognormal model, i.e.,

$$dS_t = \mu S_t \, dt + \sqrt{v_t} \, S_t \, dW_t,$$

whereas the instantaneous variance $v_t$ itself follows a mean reverting CIR (Cox-Ingersoll-Ross) process

$$dv_t = -\lambda (v_t - m) \, dt + \eta \sqrt{v_t} \, dZ_t,$$

where $\lambda > 0$ and $m > 0$ are constants. The driving Wiener processes $W_t$ and $Z_t$ are correlated with constant correlation $\rho$, i.e.,

$$dW_t \, dZ_t = \rho \, dt.$$

The Heston model is a benchmark, and is commonly used in derivative pricing because it has the following features:

• it takes into account the leverage effect, namely, the driving Wiener processes $W_t$ and $Z_t$ are correlated; empirically, the correlation $\rho$ is negative, hence the word "leverage";

• it has a quasi closed form (up to an inverse Fourier transform) solution for the prices of European options, which makes the calibration more tractable;

• the variance process is mean reverting with rate of reversion $\lambda$ and long term mean $m$.

Note that, if the parameters of the variance process are in the regime $2 \lambda m < \eta^2$, then zero is an attainable boundary for the variance process. Practitioners usually assume that the boundary behavior at zero is either absorption, i.e., the process is stuck at zero once it hits zero, or reflection, i.e., the process bounces back right after it hits zero. On the other hand, the other boundary, infinity, is unattainable.
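For illustration, the two SDEs above can be discretized with a simple Euler scheme. The sketch below is an assumption-laden example rather than anything prescribed by the answer: it uses the "full truncation" fix (replacing $v_t$ by $\max(v_t, 0)$ inside the coefficients), which is a common numerical device and is neither the absorbing nor the reflecting boundary behavior mentioned above. All function and parameter names are hypothetical.

```python
import math
import random


def feller_condition_holds(lam, m, eta):
    """True when 2*lam*m >= eta^2, i.e. zero is NOT attainable for the variance."""
    return 2.0 * lam * m >= eta * eta


def simulate_heston(s0, v0, mu, lam, m, eta, rho, t_end, n_steps, seed=0):
    """One Euler path of (S_t, v_t) with full truncation of the variance."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    s, v = s0, v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dw = math.sqrt(dt) * z1
        # dZ has correlation rho with dW
        dz = math.sqrt(dt) * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        v_pos = max(v, 0.0)  # full truncation (an assumption of this sketch)
        s += mu * s * dt + math.sqrt(v_pos) * s * dw
        v += -lam * (v_pos - m) * dt + eta * math.sqrt(v_pos) * dz
    return s, v


print(feller_condition_holds(lam=2.0, m=0.04, eta=0.3))
print(simulate_heston(100.0, 0.04, 0.0, 2.0, 0.04, 0.3, -0.7, 1.0, 1000))
```

With $\eta = 0$ the variance path is deterministic and decays toward $m$ at rate $\lambda$, which gives a quick way to check the drift term of the scheme.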

Question 30. Show that the probability density function


of the standard normal integrates to 1.

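Answer: The probability density function of the standard normal variable is $\frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}$. We want to show that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{t^2}{2}} \, dt = 1,$$

which, using the substitution $t = \sqrt{2} \, x$, can be written as

$$I = \int_{-\infty}^{\infty} e^{-x^2} \, dx = \sqrt{\pi}. \tag{3.191}$$

We prove (3.191) by using polar coordinates. Since $x$ is just an integrating variable, we can also write the integral $I$ in terms of another integrating variable, denoted by $y$, as

$$I = \int_{-\infty}^{\infty} e^{-y^2} \, dy.$$

Then,$^{17}$

$$\begin{aligned}
I^2 &= \left( \int_{-\infty}^{\infty} e^{-x^2} \, dx \right) \cdot \left( \int_{-\infty}^{\infty} e^{-y^2} \, dy \right) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-x^2} e^{-y^2} \, dx \, dy \qquad (3.192) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)} \, dx \, dy.
\end{aligned}$$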

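We use the polar coordinates transformation $x = r \cos\theta$ and $y = r \sin\theta$, with $r \in [0, \infty)$ and $\theta \in [0, 2\pi)$, to evaluate the last integral. Since $dx \, dy = r \, d\theta \, dr$, we obtain that

$$\begin{aligned}
I^2 &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)} \, dx \, dy \\
&= \int_0^{\infty} \int_0^{2\pi} r \, e^{-(r^2 \cos^2\theta + r^2 \sin^2\theta)} \, d\theta \, dr \\
&= \int_0^{\infty} \int_0^{2\pi} r \, e^{-r^2} \, d\theta \, dr \qquad (3.193) \\
&= 2\pi \int_0^{\infty} r \, e^{-r^2} \, dr \\
&= 2\pi \lim_{t \to \infty} \int_0^t r \, e^{-r^2} \, dr \\
&= 2\pi \lim_{t \to \infty} \left( -\frac{1}{2} e^{-r^2} \right) \Big|_0^t \\
&= \pi;
\end{aligned}$$

note that (3.193) follows from the equality $\cos^2\theta + \sin^2\theta = 1$ for any real number $\theta$.
Since $I > 0$, $I = \sqrt{\pi}$, which is what we wanted to prove; see (3.191). $\square$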
$^{17}$ Note that Fubini's theorem is needed for a rigorous derivation of the equality (3.192); this technical step is rarely required by the interviewer.
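The identity (3.191) and the normalization of the standard normal density are easy to confirm numerically. The sketch below uses a plain trapezoid rule on a truncated interval; the helper name `trapezoid` and the truncation limits are illustrative assumptions (the tails beyond them are negligibly small).

```python
import math


def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule for the integral of f over [lo, hi] with n panels."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


# I = integral of exp(-x^2); should be close to sqrt(pi)
gauss_integral = trapezoid(lambda x: math.exp(-x * x), -10.0, 10.0, 20000)

# total mass of the standard normal density; should be close to 1
pdf_mass = trapezoid(
    lambda t: math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi), -12.0, 12.0, 20000
)

print(gauss_integral, math.sqrt(math.pi), pdf_mass)
```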

Question 31. Let $W_t = (X_t, Y_t)$ be a two dimensional Brownian motion starting at $(x, y)$, i.e., $X_t$ and $Y_t$ are independent one dimensional Brownian motions with $X_0 = x$ and $Y_0 = y$.
(i) Find the probability that the Brownian motion $W_t$ reaches the $y$-axis before reaching the $x$-axis.
(ii) Let $0 < r_1 < r_2$ such that $r_1 < \sqrt{x^2 + y^2} < r_2$. Find the probability that $W_t$ enters the inner circle of center 0 and radius $r_1$ before leaving the outer circle of center 0 and radius $r_2$.
Answer: (i) Denote by $\tau_x$ and $\tau_y$ the stopping times

$$\tau_x = \inf\{ t \ge 0 : Y_t = 0 \}, \qquad \tau_y = \inf\{ t \ge 0 : X_t = 0 \},$$

and let

$$\tau = \min\{ \tau_x, \tau_y \}.$$

In other words, $\tau_x$ is the first hitting time of the $x$-axis for the Brownian motion $W_t$, $\tau_y$ is the first hitting time of the $y$-axis for $W_t$, and $\tau$ is the first time $W_t$ hits either the $x$-axis or the $y$-axis.
We are asked to find the probability

$$P_{(x,y)}[\tau_y < \tau_x],$$

where $P_{(x,y)}[\cdot]$ denotes the (conditional) probability corresponding to the two dimensional Brownian motion $W_t$ starting at the point $(x, y)$ at time 0.
Note that

$$P_{(x,y)}[\tau_y < \tau_x] + P_{(x,y)}[\tau_y > \tau_x] = 1,$$

since a two dimensional Brownian motion is recurrent and therefore the two dimensional Brownian motion $W_t$ must hit either the $x$-axis or the $y$-axis in finite time almost surely.

Let $u(x, y)$ be a function with continuous and bounded second order derivatives. By applying Itô's formula to $u(X_t, Y_t)$ with stopping time $\tau$, we obtain that

$$\begin{aligned}
& u(X_\tau, Y_\tau) - u(x, y) && (3.194) \\
&= \int_0^\tau u_x \, dX_t + \int_0^\tau u_y \, dY_t && (3.195) \\
&\quad + \frac{1}{2} \int_0^\tau (u_{xx} + u_{yy}) \, dt. && (3.196)
\end{aligned}$$

Assume that the function $u(x, y)$ satisfies the PDE $u_{xx} + u_{yy} = 0$ in the first quadrant of the $(x, y)$ plane with boundary conditions $u(0, y) = 1$ and $u(x, 0) = 0$. By taking the conditional expectation $E_{(x,y)}[\cdot]$ on both sides of (3.194-3.196) and rearranging terms of the resulting equation, we obtain that

$$\begin{aligned}
u(x, y) &= E_{(x,y)}[u(X_\tau, Y_\tau)] \\
&= 1 \cdot P_{(x,y)}[\tau = \tau_y] + 0 \cdot P_{(x,y)}[\tau = \tau_x] \\
&= P_{(x,y)}[\tau_y < \tau_x].
\end{aligned}$$
To solve the boundary value problem for $u$, we transform the equation into polar coordinates $u = u(r, \theta)$ as

$$u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta} = 0,$$

and the boundary conditions become $u(r, \pi/2) = 1$ and $u(r, 0) = 0$, respectively. Notice that, since the boundary conditions do not depend on $r$, we may use the ansatz $u(r, \theta) = f(\theta)$, for some function $f$. Thus, the problem reduces to the following second order linear ODE

$$f''(\theta) = 0$$

with boundary conditions $f(\pi/2) = 1$ and $f(0) = 0$. The solution is given by $f(\theta) = \frac{2\theta}{\pi}$. Thus,

$$u(x, y) = \frac{2}{\pi} \arctan\left( \frac{y}{x} \right).$$

(ii) Denote by $\tau_1$ and $\tau_2$ the stopping times

$$\tau_1 = \inf\{ t \ge 0 : |W_t| \le r_1 \}, \qquad \tau_2 = \inf\{ t \ge 0 : |W_t| \ge r_2 \},$$

and let

$$\tau = \min\{ \tau_1, \tau_2 \}.$$

We are asked to find the probability

$$P_{(x,y)}[\tau_1 < \tau_2].$$

By the same token as in part (i) of the problem, the probability is determined by solving the PDE in polar coordinates

$$u_{rr} + \frac{1}{r} u_r + \frac{1}{r^2} u_{\theta\theta} = 0$$

with boundary conditions $u(r_1, \theta) = 1$ and $u(r_2, \theta) = 0$ for all $\theta$. Since the solution $u$ is rotationally symmetric, we use the ansatz $u(r, \theta) = g(r)$. The problem reduces to the following second order linear ODE

$$g''(r) + \frac{1}{r} g'(r) = 0, \quad \forall \, r_1 < r < r_2,$$

with boundary conditions

$$g(r_1) = 1; \quad g(r_2) = 0.$$

The solution is given by

$$g(r) = \frac{\ln r - \ln r_2}{\ln r_1 - \ln r_2}, \quad \forall \, r_1 < r < r_2.$$

Consequently,

$$u(x, y) = \frac{\ln \sqrt{x^2 + y^2} - \ln r_2}{\ln r_1 - \ln r_2}.$$
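Both closed forms can be checked without any simulation: each should have (numerically) zero Laplacian in the interior and should match the stated boundary values. The sketch below does this with a five-point finite-difference stencil; the function names and the step size `h` are illustrative assumptions.

```python
import math


def p_y_axis_first(x, y):
    """Part (i): u(x, y) = (2/pi) * arctan(y/x) for x, y > 0."""
    return 2.0 / math.pi * math.atan2(y, x)


def p_inner_circle_first(x, y, r1, r2):
    """Part (ii): u(x, y) = (ln r - ln r2) / (ln r1 - ln r2), with r = |(x, y)|."""
    r = math.hypot(x, y)
    return (math.log(r) - math.log(r2)) / (math.log(r1) - math.log(r2))


def laplacian(u, x, y, h=1e-3):
    """Five-point finite-difference approximation of u_xx + u_yy at (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4.0 * u(x, y)) / (h * h)


print(p_y_axis_first(1.0, 1.0))                   # on the diagonal, by symmetry
print(laplacian(p_y_axis_first, 1.0, 2.0))        # should be close to 0
print(laplacian(lambda x, y: p_inner_circle_first(x, y, 1.0, 4.0), 1.5, 1.0))
```

On the diagonal $x = y$ part (i) gives $\frac{1}{2}$, as symmetry suggests, and in part (ii) the starting radius $r = \sqrt{r_1 r_2}$ also gives $\frac{1}{2}$.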

Question 32. Let $B_t$, $t \ge 0$, be a standard Brownian motion in the probability measure $P$. Determine the probability density function of $B_{3/2}$ under the probability measure $\tilde{P}$ defined by the Radon-Nikodym derivative

$$\frac{d\tilde{P}}{dP} = e^{-B_1 - \frac{1}{2}}. \tag{3.197}$$

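Answer: Let

$$\theta_t = \begin{cases} 1, & \text{if } 0 \le t \le 1; \\ 0, & \text{if } t > 1. \end{cases}$$

Then,

$$-B_1 - \frac{1}{2} = -\int_0^{3/2} \theta_t \, dB_t - \frac{1}{2} \int_0^{3/2} \theta_t^2 \, dt,$$

and therefore the Radon-Nikodym derivative from (3.197) can be written as

$$\frac{d\tilde{P}}{dP} = \exp\left( -\int_0^{3/2} \theta_t \, dB_t - \frac{1}{2} \int_0^{3/2} \theta_t^2 \, dt \right),$$

which defines a probability measure.
From Girsanov's theorem, it follows that

$$\tilde{B}_t = B_t + \int_0^t \theta_s \, ds \tag{3.198}$$

is a standard Brownian motion under $\tilde{P}$.
From (3.198), we obtain that

$$B_{3/2} = \tilde{B}_{3/2} - 1,$$

and we conclude that $B_{3/2}$ is normally distributed with mean $-1$ and variance $\frac{3}{2}$ under $\tilde{P}$.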

Intuitively, one can think of the result as follows:

• Since the change of measure only changes the drift and does not affect the diffusion part, $B_{3/2}$ must be a Gaussian random variable of variance $\frac{3}{2}$.

• The change of measure is in effect only before time 1, when $B_t$ is a Brownian motion with drift $-1$ under $\tilde{P}$; after time 1 it evolves again as a driftless Brownian motion, so its mean stays at $-1$. $\square$

Question 33. Let $B_t = (B_t^{(1)}, B_t^{(2)})$ be a two dimensional Brownian motion in the $(x, y)$ plane. Let $a > 0$ and denote by $\tau$ the first time $B_t$ hits the line $y = a$. Determine the probability distribution of $B_\tau^{(1)}$.

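Answer: Since the two Brownian motions $B^{(1)}$ and $B^{(2)}$ are independent, it follows that the Brownian motion $B^{(1)}$ is independent of the random time $\tau$. Hence, we can calculate the probability distribution of $B_\tau^{(1)}$ by conditioning on $\tau$ as follows:

$$\begin{aligned}
P\left[ B_\tau^{(1)} \le x \right] &= \int_0^\infty P\left[ B_\tau^{(1)} \le x \mid \tau = t \right] f_\tau(t) \, dt \\
&= \int_0^\infty P\left[ B_t^{(1)} \le x \right] f_\tau(t) \, dt \\
&= \int_0^\infty \Phi\left( \frac{x}{\sqrt{t}} \right) f_\tau(t) \, dt,
\end{aligned} \tag{3.199}$$

where $f_\tau$ is the probability density function of $\tau$ and $\Phi$ is the cumulative distribution function of the standard normal distribution.

Denote by $f_{B_\tau^{(1)}}$ the probability density function of $B_\tau^{(1)}$. Then, from (3.199), it follows that

$$\begin{aligned}
f_{B_\tau^{(1)}}(x) &= \frac{d}{dx} P\left[ B_\tau^{(1)} \le x \right] \\
&= \frac{d}{dx} \left( \int_0^\infty \Phi\left( \frac{x}{\sqrt{t}} \right) f_\tau(t) \, dt \right) \\
&= \int_0^\infty \frac{1}{\sqrt{t}} \, \phi\left( \frac{x}{\sqrt{t}} \right) f_\tau(t) \, dt,
\end{aligned} \tag{3.200}$$

where $\phi$ denotes the probability density function of the standard normal distribution.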
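To determine the probability density function $f_\tau$ of $\tau$, note the equivalence between the following two events:

$$\{ \tau \le t \} = \left\{ \max_{0 \le s \le t} B_s^{(2)} \ge a \right\}. \tag{3.201}$$

From the reflection principle for the Brownian motion, it follows that

$$P\left[ \max_{0 \le s \le t} B_s^{(2)} \ge a \right] = 2 P\left[ B_t^{(2)} \ge a \right], \tag{3.202}$$

and therefore, from (3.201) and (3.202), we find that

$$\begin{aligned}
P[\tau \le t] &= P\left[ \max_{0 \le s \le t} B_s^{(2)} \ge a \right] \\
&= 2 P\left[ B_t^{(2)} \ge a \right] \\
&= 2 \left( 1 - P\left[ B_t^{(2)} < a \right] \right) \\
&= 2 \left( 1 - \Phi\left( \frac{a}{\sqrt{t}} \right) \right).
\end{aligned}$$

Thus, the probability density function of $\tau$ is given by

$$f_\tau(t) = \frac{d}{dt} P[\tau \le t] = \frac{a}{t^{3/2}} \, \phi\left( \frac{a}{\sqrt{t}} \right). \tag{3.203}$$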

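From (3.200) and (3.203), we obtain that

$$\begin{aligned}
f_{B_\tau^{(1)}}(x) &= \int_0^\infty \frac{1}{\sqrt{t}} \, \phi\left( \frac{x}{\sqrt{t}} \right) \frac{a}{t^{3/2}} \, \phi\left( \frac{a}{\sqrt{t}} \right) dt \\
&= \frac{a}{2\pi} \int_0^\infty \frac{1}{t^2} \, e^{-\frac{x^2 + a^2}{2t}} \, dt \\
&= \frac{a}{2\pi} \left( \frac{2}{x^2 + a^2} \, e^{-\frac{x^2 + a^2}{2t}} \right) \Big|_0^\infty \\
&= \frac{a}{\pi (x^2 + a^2)}.
\end{aligned}$$

We conclude that $B_\tau^{(1)}$ has a Cauchy distribution with location parameter 0 and scale parameter $a$. $\square$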
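The last integral can also be evaluated numerically as a cross-check of the Cauchy density. The sketch below integrates (3.200), with $f_\tau$ taken from (3.203), after the substitution $t = e^z$ and compares the result with $\frac{a}{\pi(x^2 + a^2)}$. The function names, the substitution, and the truncation limits are choices made for this illustration (they are adequate for moderate values of $a$, not for $a$ very close to 0).

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def hitting_time_density(t, a):
    """f_tau(t) from (3.203)."""
    return a / t ** 1.5 * phi(a / math.sqrt(t))


def density_at_hit(x, a, z_lo=-10.0, z_hi=40.0, n=20000):
    """Trapezoid rule for (3.200) after substituting t = exp(z), dt = t dz."""
    h = (z_hi - z_lo) / n
    total = 0.0
    for k in range(n + 1):
        t = math.exp(z_lo + k * h)
        g = phi(x / math.sqrt(t)) / math.sqrt(t) * hitting_time_density(t, a) * t
        total += g if 0 < k < n else 0.5 * g
    return total * h


def cauchy_density(x, a):
    return a / (math.pi * (x * x + a * a))


print(density_at_hit(0.7, 1.0), cauchy_density(0.7, 1.0))
```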


3.7 Brainteasers.

Question 1. A flea is going between two points which


are 100 inches apart by jumping (always in the same direc-
tion) either one inch or two inches at a time. How many
different paths can the flea travel by?
Answer: Let $a_n$ denote the number of different paths of the flea that cover the distance of $n$ inches, jumping either one inch or two inches at a time. We want to find $a_{100}$.
Since the flea can jump either one inch or two inches at a time, it could have made the last jump either from the end of the $(n-1)$st inch or from the end of the $(n-2)$nd inch. Hence, the total number of ways the flea can cover the distance of $n$ inches, jumping either one inch or two inches at a time, is the sum of the number of ways the flea can cover the distance of $n - 1$ inches and the number of ways the flea can cover the distance of $n - 2$ inches.
In other words, for $n > 2$, we have

$$a_n = a_{n-1} + a_{n-2}. \tag{3.204}$$

Note that $a_1 = 1$, since the flea can cover 1 inch in only one way, jumping one inch once; while $a_2 = 2$, since the flea can cover 2 inches in two ways, either jumping one inch twice, or jumping two inches once.
Note that (3.204) is the recurrence relation for the Fibonacci sequence. Let

$$\phi_1 = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad \phi_2 = \frac{1 - \sqrt{5}}{2}$$

be the roots of the characteristic equation $x^2 - x - 1 = 0$ corresponding to (3.204).
Then,

$$a_n = C_1 \phi_1^n + C_2 \phi_2^n, \quad \forall \, n \ge 1,$$

where the constants $C_1$ and $C_2$ are such that $a_1 = 1$ and $a_2 = 2$.

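By solving the linear system

$$\begin{aligned}
C_1 \phi_1 + C_2 \phi_2 &= 1 \\
C_1 \phi_1^2 + C_2 \phi_2^2 &= 2,
\end{aligned}$$

we obtain that

$$C_1 = \frac{\phi_1}{\sqrt{5}} = \frac{5 + \sqrt{5}}{10}, \qquad C_2 = -\frac{\phi_2}{\sqrt{5}} = \frac{5 - \sqrt{5}}{10},$$

and therefore

$$a_n = \frac{1}{\sqrt{5}} \left( \phi_1^{n+1} - \phi_2^{n+1} \right), \quad \forall \, n \ge 1.$$

Plugging $n = 100$ into the last expression gives the answer. $\square$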
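Since Python integers have arbitrary precision, the recurrence (3.204) can simply be iterated to get $a_{100}$ exactly. The sketch below does that and, for small $n$, compares against the closed form; the function names and the convention $a_0 = 1$ (the empty path) are assumptions introduced only to start the iteration.

```python
import math


def count_paths(n):
    """Number of ways to cover n inches with jumps of 1 or 2 inches."""
    a_prev, a_curr = 1, 1  # a_0 = 1 (convention), a_1 = 1
    for _ in range(n - 1):
        a_prev, a_curr = a_curr, a_prev + a_curr
    return a_curr


def closed_form(n):
    """(phi1^(n+1) - phi2^(n+1)) / sqrt(5), in floating point."""
    s5 = math.sqrt(5.0)
    phi1, phi2 = (1.0 + s5) / 2.0, (1.0 - s5) / 2.0
    return (phi1 ** (n + 1) - phi2 ** (n + 1)) / s5


print(count_paths(100))  # 573147844013817084101
```

The floating-point closed form is only reliable for moderate $n$; for $n = 100$ the integer recurrence is the safer route.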

Question 2. I have a bag containing three pancakes:


one golden on both sides, one burnt on both sides, and
one golden on one side and burnt on the other. You shake
the bag, draw a pancake at random, look at one side, and
notice that it is golden. What is the probability that the
other side is golden?
Answer:
Solution 1: Label the pancakes 0, 1, and 2, according to the number of burnt sides. Let $E_i$, $i = 0, 1, 2$, denote the event that the pancake with $i$ burnt sides is drawn from the bag. Let $A$ denote the event that the side (of the randomly drawn pancake) we look at is golden.
The probability of the pancake drawn from the bag having no burnt sides given that one side is golden is $P(E_0 \mid A)$. Then, from Bayes' formula, we obtain that

$$\begin{aligned}
P(E_0 \mid A) &= \frac{P(E_0 \cap A)}{P(A)} \\
&= \frac{P(E_0) P(A \mid E_0)}{P(A \mid E_0) P(E_0) + P(A \mid E_1) P(E_1) + P(A \mid E_2) P(E_2)} \\
&= \frac{\frac{1}{3} \cdot 1}{1 \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{1}{3} + 0 \cdot \frac{1}{3}} = \frac{2}{3}.
\end{aligned}$$

Solution 2: Out of the six possible sides that we could have seen, three are golden. Out of these, two belong to a pancake that is golden on both sides. Therefore, the probability of the other side being golden is $\frac{2}{3}$. $\square$
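The counting argument of Solution 2 can be spelled out by listing the six equally likely (visible side, hidden side) pairs. A minimal sketch, with variable names chosen for illustration:

```python
from fractions import Fraction

# Each pancake is a pair of sides: "G" = golden, "B" = burnt.
pancakes = [("G", "G"), ("G", "B"), ("B", "B")]

# Six equally likely outcomes: (visible side, hidden side).
faces = [(p[i], p[1 - i]) for p in pancakes for i in (0, 1)]

# Condition on the visible side being golden.
hidden_given_golden = [hidden for visible, hidden in faces if visible == "G"]
prob_other_side_golden = Fraction(
    sum(h == "G" for h in hidden_given_golden), len(hidden_given_golden)
)

print(prob_other_side_golden)  # 2/3
```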

Question 3. Alice and Bob are playing heads and tails. Alice tosses $n + 1$ coins, Bob tosses $n$ coins. The coins are fair. What is the probability that Alice will have strictly more heads than Bob?
Answer:
Solution 1: Alice flips the coin more often than Bob, so
either she must end up with more heads or with more tails
than Bob. She cannot, however, end up with more heads
and more tails, because she only flips one more coin than
Bob. We deduce that either Alice gets more heads or she
gets more tails.
Since these events are equally likely, they both have probability $\frac{1}{2}$, and therefore the probability that Alice will have strictly more heads than Bob is $\frac{1}{2}$.
Solution 2: Suppose that Alice and Bob begin by flipping $n$ coins each. Let $p$ be the probability that Alice gets more heads than Bob, and let $q$ be the probability that both Alice and Bob get an equal number of heads. Note that $2p + q = 1$ since the probability that Alice gets more heads than Bob is, by symmetry, equal to the probability that Bob gets more heads than Alice.
Alice then flips the $(n+1)$st coin. For Alice to have more heads, she either had more heads than Bob before flipping the last coin; or the same number of heads as Bob before flipping the last coin and her $(n+1)$st flip must come up heads. Hence, the probability of Alice winning is $p + \frac{1}{2} q$.
Since $2p + q = 1$, then $p + \frac{1}{2} q = \frac{1}{2}$. $\square$
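The answer $\frac{1}{2}$ holds for every $n$, which can be confirmed exactly for small $n$ by summing binomial probabilities. A short sketch (the function name is hypothetical; `math.comb` requires Python 3.8 or later):

```python
from fractions import Fraction
from math import comb


def prob_alice_more_heads(n):
    """Exact P(Alice's heads > Bob's heads) when Alice tosses n+1 and Bob n fair coins."""
    total = Fraction(0)
    for a in range(n + 2):          # Alice's number of heads
        for b in range(n + 1):      # Bob's number of heads
            if a > b:
                total += Fraction(comb(n + 1, a) * comb(n, b), 2 ** (2 * n + 1))
    return total


print([prob_alice_more_heads(n) for n in range(5)])
```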

Question 4. Alice is in a restaurant trying to decide


between three desserts. How can she choose one of three
desserts with equal probability with the help of a fair coin?
What if the coin is biased and the bias is unknown?
Answer:
Solution 1: Denote the desserts by $A$, $B$, and $C$. First, suppose the coin is fair. Denote heads by $H$ and tails by $T$. The procedure to choose one of three desserts with equal probability is as follows: toss the coin twice; let the outcomes $TH$, $HT$, and $TT$ correspond to choosing desserts $A$, $B$, and $C$, respectively; if the outcome is $HH$, repeat the procedure.
Note that the probability of our procedure not being repeated is $p = \frac{3}{4}$; hence, the number of times our procedure is performed is a geometric random variable with $p$ as its parameter. The expected number of times our procedure is performed is $\frac{1}{p} = \frac{4}{3}$. Since each procedure involves two coin tosses, the expected number of coin tosses before Alice chooses one of three desserts with equal probability is $\frac{8}{3}$.

Now, suppose the coin is biased. One procedure to choose one of three desserts with equal probability would be as follows: toss the coin four times; denote by $TTHH$, $HTTH$, and $THTH$ the outcomes corresponding to choosing desserts $A$, $B$, and $C$, respectively; all the other 4-toss outcomes result in repeating the procedure. Each of the three outcomes contains two heads and two tails, so they are equally likely whatever the bias.

Solution 2: An alternative procedure is as follows: toss the coin three times; denote by $HTT$, $THT$, and $TTH$ the outcomes corresponding to choosing desserts $A$, $B$, and $C$, respectively; all the other 3-toss outcomes result in repeating the procedure.
Using an argument similar to the case with a fair coin, one finds that the expected number of tosses of a coin with an unknown bias, before Alice chooses one of three desserts with equal probability, is $\frac{4}{3 p^2 (1-p)^2}$ and $\frac{1}{p (1-p)^2}$, respectively, for the two procedures described above, where $p$ here denotes the (unknown) probability of the coin landing heads.

Question 5. What is the expected number of times you


must flip a fair coin until it lands on head? What if the
coin is biased and lands on head with probability p?
Answer: Denote by $X$ the number of times you must flip a fair coin until it lands on head. If the first coin toss is a head (which happens with probability $\frac{1}{2}$), then $X = 1$. If the first coin toss is a tail (which also happens with probability $\frac{1}{2}$), then the coin tossing process resets and the number of tosses until the coin lands on head will be 1 plus the expected number of coin tosses until the coin lands heads. In other words, the expected number of coin tosses $E[X]$ satisfies the equation

$$E[X] = \frac{1}{2} \cdot 1 + \frac{1}{2} (1 + E[X]). \tag{3.205}$$

Solving (3.205) for $E[X]$, we conclude that $E[X] = 2$, i.e., the expected number of times you must flip a fair coin until it lands heads is 2.
On the other hand, if the coin is biased with the probability $p$ of landing on head, the same argument still applies. However, in this case, (3.205) reads

$$E[X] = p + (1 - p)(1 + E[X]).$$

Solving again for $E[X]$, we obtain $E[X] = \frac{1}{p}$. $\square$

Question 6. What is the expected number of coin tosses


of a fair coin in order to get two heads in a row? What if
the coin is biased with 25% probability of getting heads?
Answer: We solve the general case of a biased coin with probability $p$ of the coin toss resulting in heads. The outcomes of the first two tosses are as follows:
• If the first toss is tails, which happens with probability $1 - p$, then the process resets and the expected number of tosses increases by 1.
• If the first toss is heads, and if the second toss is also heads, which happens with probability $p^2$, then two consecutive heads are obtained after two tosses.
• If the first toss is heads, and if the second toss is tails, which happens with probability $p(1 - p)$, then the process resets and the expected number of tosses increases by 2.
If $E[X]$ denotes the expected number of tosses in order to get two heads in a row, we conclude that

$$E[X] = (1 - p)(1 + E[X]) + 2 p^2 + p (1 - p)(2 + E[X]).$$

By solving for $E[X]$, we obtain that

$$E[X] = \frac{1 + p}{p^2}. \tag{3.206}$$

For an unbiased coin, i.e., for $p = \frac{1}{2}$, we find from (3.206) that $E[X] = 6$, i.e., the expected number of coin tosses to obtain two heads in a row is 6.
For a biased coin with 25% probability of getting heads, i.e., for $p = \frac{1}{4}$, we find from (3.206) that $E[X] = 20$, i.e., the expected number of coin tosses to obtain two heads in a row in this case is 20.
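The first-step equations for one head and for two heads in a row are both linear in $E[X]$, of the form $E[X] = c + k \, E[X]$, so they can be solved exactly with rational arithmetic. A minimal sketch, with hypothetical function names:

```python
from fractions import Fraction


def expected_tosses_first_head(p):
    """Solve E = p*1 + (1-p)*(1+E) for E."""
    p = Fraction(p)
    const = p + (1 - p)   # terms not multiplying E
    coeff = 1 - p         # coefficient of E on the right-hand side
    return const / (1 - coeff)


def expected_tosses_two_heads(p):
    """Solve E = (1-p)(1+E) + 2p^2 + p(1-p)(2+E) for E."""
    p = Fraction(p)
    q = 1 - p
    const = q + 2 * p * p + 2 * p * q
    coeff = q + p * q
    return const / (1 - coeff)


print(expected_tosses_first_head(Fraction(1, 2)))  # 2
print(expected_tosses_two_heads(Fraction(1, 2)))   # 6
print(expected_tosses_two_heads(Fraction(1, 4)))   # 20
```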

Question 7. A fair coin is tossed n times. What is the


probability that no two consecutive heads appear?
Answer: The total number of sequences of heads and tails of length $n$ is $2^n$. Let $a_n$ be the number of sequences of heads and tails of length $n$, such that no two consecutive heads appear. Then, the probability that no two consecutive heads appear is $\frac{a_n}{2^n}$.
Note that $a_1 = 2$ ($H$ and $T$ do not contain two consecutive heads) and $a_2 = 3$ (out of $HH$, $HT$, $TH$, and $TT$, only $HH$ contains two consecutive heads). We find the closed formula for $a_n$ by deriving a recurrence relation as follows:
A sequence of $n \ge 3$ coin tosses does not contain two consecutive heads if and only if: (i) either it begins with a tail, followed by a sequence of $n - 1$ coin tosses with no two consecutive heads; (ii) or it begins with a head, followed by a tail, and followed by a sequence of $n - 2$ coin tosses with no two consecutive heads. Since these two scenarios are mutually exclusive, it follows that

$$a_n = a_{n-1} + a_{n-2}, \quad \forall \, n \ge 3. \tag{3.207}$$

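Note that (3.207) is the recurrence relation for the Fibonacci sequence. Let $\phi_1 = \frac{1 + \sqrt{5}}{2}$ and $\phi_2 = \frac{1 - \sqrt{5}}{2}$ be the roots of the characteristic equation $x^2 - x - 1 = 0$ corresponding to (3.207). Then,

$$a_n = C_1 \phi_1^n + C_2 \phi_2^n, \quad \forall \, n \ge 1,$$

where the constants $C_1$ and $C_2$ are such that $a_1 = 2$ and $a_2 = 3$.

By solving the linear system

$$\begin{aligned}
C_1 \phi_1 + C_2 \phi_2 &= 2 \\
C_1 \phi_1^2 + C_2 \phi_2^2 &= 3,
\end{aligned}$$

we obtain that

$$C_1 = \frac{3 + \sqrt{5}}{2 \sqrt{5}} = \frac{\phi_1^2}{\sqrt{5}}, \qquad C_2 = -\frac{3 - \sqrt{5}}{2 \sqrt{5}} = -\frac{\phi_2^2}{\sqrt{5}}.$$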

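We conclude that

$$a_n = \frac{1}{\sqrt{5}} \left( \phi_1^{n+2} - \phi_2^{n+2} \right),$$

and therefore the probability that no two consecutive heads appear in $n$ tosses of a fair coin, which is equal to $\frac{a_n}{2^n}$, is

$$\frac{1}{2^n \sqrt{5}} \left( \left( \frac{1 + \sqrt{5}}{2} \right)^{n+2} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+2} \right). \quad \square$$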
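For small $n$, the count $a_n$ can be verified by brute force over all $2^n$ sequences and compared with the recurrence (3.207). The sketch below does this; the function names and the starting convention $a_0 = 1$ (the empty sequence) are assumptions of the example.

```python
from fractions import Fraction
from itertools import product


def count_no_hh_brute(n):
    """Count length-n sequences over {H, T} with no two consecutive heads."""
    return sum("HH" not in "".join(seq) for seq in product("HT", repeat=n))


def count_no_hh_recurrence(n):
    """a_n from a_n = a_{n-1} + a_{n-2}, with a_1 = 2, a_2 = 3."""
    a_prev, a_curr = 1, 2  # a_0 = 1 (convention), a_1 = 2
    for _ in range(n - 1):
        a_prev, a_curr = a_curr, a_prev + a_curr
    return a_curr


def prob_no_hh(n):
    return Fraction(count_no_hh_recurrence(n), 2 ** n)


print([count_no_hh_recurrence(n) for n in range(1, 8)])
print(prob_no_hh(3))  # 5/8
```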

Question 8. You have two identical Fabergé eggs, either


of which would break if dropped from the top of a building
with 100 floors. Your task is to determine the highest floor
from which an egg could be dropped without breaking.
What is the minimum number of drops required to achieve
this? You are allowed to break both eggs in the process.
Answer: Consider the following more general problem: Find the largest number of floors $h_e(n)$ a building could have in order to be able to determine the highest floor from which an egg could be dropped without breaking using $e$ eggs and $n$ drops.
Since one drop can only determine one floor, it follows that

$$h_e(1) = 1. \tag{3.208}$$

If we have only one egg at our disposal, the only possible strategy is to try the floors one by one from bottom to top; hence,

$$h_1(n) = n.$$
When $e \ge 2$ and $n \ge 2$, the first drop cannot be from a floor higher than $h_{e-1}(n-1) + 1$, since if the egg breaks, there are only $e - 1$ eggs and $n - 1$ drops left, and the highest floor we can still handle is $h_{e-1}(n-1)$. If the first drop does not break an egg, we can treat floor $h_{e-1}(n-1) + 2$ as the new floor 1, and reduce it to a problem with $e$ eggs and $n - 1$ drops, and therefore

$$h_e(n) = 1 + h_{e-1}(n-1) + h_e(n-1).$$

Iterating this argument, we obtain that

$$\begin{aligned}
h_e(n) &= 1 + h_{e-1}(n-1) + h_e(n-1) \\
&= 2 + h_{e-1}(n-1) + h_{e-1}(n-2) + h_e(n-2) \\
&= \ldots \\
&= (n-1) + \sum_{j=1}^{n-1} h_{e-1}(j) + h_e(1) \\
&= n + \sum_{j=1}^{n-1} h_{e-1}(j),
\end{aligned}$$

since $h_e(1) = 1$; see (3.208).


For $e = 2$, the formula above becomes

$$h_2(n) = n + \sum_{j=1}^{n-1} h_1(j) = n + \sum_{j=1}^{n-1} j = n + \frac{(n-1) n}{2} = \frac{n (n+1)}{2}, \tag{3.209}$$

where the next-to-last step follows from the summation formula $\sum_{j=1}^{k} j = \frac{k (k+1)}{2}$.

Since $h_2(13) = 91 < 100 < 105 = h_2(14)$, the required number of drops is 14.
Note that our iterative argument also provides an algorithm: you drop the first egg from the floors

$$\begin{aligned}
14 &= 1 + h_1(13), \\
27 &= 2 + h_1(13) + h_1(12), \\
39 &= 3 + \sum_{j=11}^{13} h_1(j), \\
50 &= 4 + \sum_{j=10}^{13} h_1(j), \\
60 &= 5 + \sum_{j=9}^{13} h_1(j), \\
69 &= 6 + \sum_{j=8}^{13} h_1(j), \\
77 &= 7 + \sum_{j=7}^{13} h_1(j), \\
84 &= 8 + \sum_{j=6}^{13} h_1(j), \\
90 &= 9 + \sum_{j=5}^{13} h_1(j), \\
95 &= 10 + \sum_{j=4}^{13} h_1(j), \\
99 &= 11 + \sum_{j=3}^{13} h_1(j),
\end{aligned}$$

and 100; that is, move up by $14 = 1 + h_1(13)$ floors, then by $13 = 1 + h_1(12)$ floors, then by $12 = 1 + h_1(11)$ floors, and so on, until the first egg breaks (or does not) from the 100th floor. Calling the floor from which the first egg breaks $f$ and the previously tested floor $f'$, you drop the second egg from the intervening floors $f' + 1, f' + 2, \ldots, f - 1$ in that order. $\square$
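The recurrence $h_e(n) = 1 + h_{e-1}(n-1) + h_e(n-1)$ translates directly into a short memoized function. In the sketch below the base cases $h_0(n) = 0$ and $h_e(0) = 0$ are an assumption added for convenience; they reproduce $h_e(1) = 1$ and $h_1(n) = n$ from the text. The function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def max_floors(eggs, drops):
    """h_e(n): largest number of floors resolvable with the given eggs and drops."""
    if eggs == 0 or drops == 0:
        return 0
    return 1 + max_floors(eggs - 1, drops - 1) + max_floors(eggs, drops - 1)


def min_drops(eggs, floors):
    """Smallest n with h_e(n) >= floors."""
    n = 0
    while max_floors(eggs, n) < floors:
        n += 1
    return n


def first_egg_floors(floors, drops):
    """Floors for the first egg in the two-egg strategy: steps of n, n-1, n-2, ..."""
    plan, floor, step = [], 0, drops
    while floor < floors and step > 0:
        floor = min(floor + step, floors)
        plan.append(floor)
        step -= 1
    return plan


print(min_drops(2, 100))          # 14
print(first_egg_floors(100, 14))
```

The printed plan matches the list of floors above: 14, 27, 39, and so on up to 99 and 100.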

Question 9. An ant is in the corner of a 10 x 10 x 10


room and wants to go to the opposite corner. What is the
length of the shortest path the ant can take?
Answer: For clarity, assume that the ant is in a corner by the ceiling. Denote that corner by $A$, and denote by $B$ the opposite corner to $A$, which is by the floor. The shortest path from $A$ to $B$ would require the ant to go on a straight line across a wall of the room to a side of the floor and from there on a straight line along the floor to $B$. If you imagine laying down to the floor the vertical wall the ant went down from $A$ to the floor, you have a $10 \times 20$ rectangle with $A$ and $B$ opposite corners in the rectangle. The shortest path for the ant to go from $A$ to $B$ is by following the diagonal of the rectangle, which has length $10 \sqrt{5} \approx 22.36$. $\square$

Question 10. A 10 x 10 x 10 cube is made of 1,000 unit


cubes. How many unit cubes can you see on the outside?
Answer: If all the outside unit cubes are removed, what remains is an $8 \times 8 \times 8$ cube, which is made of $8^3 = 512$ unit cubes. Thus, there are $1000 - 512 = 488$ outside unit cubes.

Question 11. Fox Mulder is imprisoned by aliens in a


large circular field surrounded by a fence. Outside the
fence is a vicious alien that can run four times as fast
as Mulder, but is constrained to stay near the fence. If
Mulder can contrive to get to an unguarded point on the
fence, he can quickly scale the fence and escape. Can he
get to a point on the fence ahead of the alien?
Answer: Let $R$ denote the radius of the circular field, whose center we denote by $C$. Denote Mulder's speed by $v$. The alien's speed is then $4v$. Denote Mulder's and the alien's positions by $M$ and $A$, respectively.
Mulder cannot just run for the fence along the straight line connecting $C$ with the point on the fence diametrically opposite to $A$. Indeed, while it takes Mulder $\frac{R}{v}$ time to cover the distance $R$, the alien would cover the distance $\pi R$ in $\frac{\pi R}{4v}$ time and the alien would catch up with Mulder, since $\frac{\pi R}{4v} < \frac{R}{v}$.
To improve this strategy, Mulder needs to start running for the fence from a point that is closer to the fence than $C$. Assume that Mulder somehow managed to be at a point $M$ that is $xR$ away from $C$ (where $0 < x < 1$), with $M$, $C$, and $A$ collinear, and $C$ between $M$ and $A$. Denote by $P$ the point on the circle diametrically opposite to $A$; see Figure 3.2. Then, $MC = xR$ and $MP = (1 - x) R$. It takes Mulder $\frac{(1-x) R}{v}$ time to reach the fence running from $M$ to $P$, while the alien needs $\frac{\pi R}{4v}$ time to reach the point $P$ going from $A$ to $P$ on a semicircle. Note that

$$\frac{(1 - x) R}{v} < \frac{\pi R}{4v} \iff 1 - x < \frac{\pi}{4} \iff x > 1 - \frac{\pi}{4}.$$

Thus, if $x > 1 - \frac{\pi}{4}$, Mulder would be able to escape the alien.
We are now ready to describe Mulder's escape strategy. He chooses $x = 1 - \frac{\pi}{4} + 0.01$. Note that $x < \frac{1}{4}$, since $0.01 < \frac{\pi}{4} - \frac{3}{4}$. Regardless of the alien's movement, Mulder first runs from $C$ to any point on the circle of radius $xR$, centered at $C$. Then he runs around that circle, until his position $M$ is such that $M$, $C$, and $A$ are collinear, with $C$ between $M$ and $A$. He is able to do so, since $x < \frac{1}{4}$ and the alien's speed is only 4 times his speed.$^{18}$ Finally, Mulder runs from $M$ to $P$ and will reach $P$ before the alien does, as shown above, since $x > 1 - \frac{\pi}{4}$.

$^{18}$ Mulder is able to do so since, if $x < \frac{1}{4}$, his angular speed, $\frac{v}{xR}$, is larger than the angular speed $\frac{4v}{R}$ of the alien:

$$\frac{v}{xR} > \frac{4v}{R} \iff x < \frac{1}{4}.$$

Figure 3.2: Mulder can reach $P$ from $M$ before the alien can do so from $A$.

Question 12. At your subway station, you notice that


of the two trains running in opposite directions which are
supposed to arrive with the same frequency, the train go-
ing in one direction comes first 80% of the time, while the
train going in the opposite direction comes first only 20%
of the time. What do you think could be happening?
Answer: One thing that could be happening is that the
train that comes first 80% of the time comes in fact more
frequently than the other one. However, even if both
trains run with the same frequency, one train might come
first 80% of the time. For example, assuming that your ar-
rival in the station is uniformly distributed, if both trains
run every ten minutes, and train A comes into the station
at 1:00, 1:10, 1:20, ..., while train B comes at 1:12, 1:22,
1:32, ..., then train A will come first 80% of the time.

Question 13. You start off with one amoeba. Every


minute, this amoeba can either die, do nothing, split into
two amoebas, or split into three amoebas; all these scenar-
ios being equally likely to happen. All further amoebas
behave the same way. What is the probability that the
amoebas eventually die off?
Answer:
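Solution 1: (Due to Yu Gan, Baruch MFE'14.) Denote by $A_k$ the event that no amoebas are alive after $k$ minutes. Let $p_k = P(A_k)$. Note that $p_1 = \frac{1}{4}$ and $A_k \subseteq A_{k+1}$, for all $k \ge 1$.
The probability $p$ that the amoebas eventually die off is the probability that at some point in time, i.e., after $n$ minutes for some $n$, no amoebas are alive. In other words,

$$p = P\left( \bigcup_{k=1}^{\infty} A_k \right).$$

Note that $\bigcup_{k=1}^{n} A_k = A_n$, since $A_k \subseteq A_{k+1}$ for all $k \ge 1$. Then,

$$\begin{aligned}
p &= P\left( \bigcup_{k=1}^{\infty} A_k \right) = P\left( \lim_{n \to \infty} \bigcup_{k=1}^{n} A_k \right) \\
&= \lim_{n \to \infty} P\left( \bigcup_{k=1}^{n} A_k \right) = \lim_{n \to \infty} P(A_n) \\
&= \lim_{n \to \infty} p_n.
\end{aligned} \tag{3.210}$$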

Given the four equally probable outcomes after the first minute, i.e., the amoeba can die, remain one amoeba, split into two amoebas, or split into three amoebas, it follows that

$$p_n = \frac{1}{4} + \frac{1}{4} p_{n-1} + \frac{1}{4} p_{n-1}^2 + \frac{1}{4} p_{n-1}^3, \tag{3.211}$$

for all $n \ge 2$.

The sequence $(p_n)_{n \ge 1}$ is increasing. Recall that $A_n \subseteq A_{n+1}$, and therefore

$$p_n = P(A_n) \le P(A_{n+1}) = p_{n+1}, \quad \forall \, n \ge 1.$$

Also, we can see by induction that the sequence $(p_n)_{n \ge 1}$ is bounded from above by $\sqrt{2} - 1$, since $p_1 = \frac{1}{4} < \sqrt{2} - 1$ and, if we assume that $p_{n-1} < \sqrt{2} - 1$ for some $n \ge 2$, then we obtain from (3.211) that

$$p_n < \frac{1}{4} + \frac{1}{4} (\sqrt{2} - 1) + \frac{1}{4} (\sqrt{2} - 1)^2 + \frac{1}{4} (\sqrt{2} - 1)^3 = \sqrt{2} - 1.$$

Thus, the sequence $(p_n)_{n \ge 1}$ is bounded from above and increasing, and therefore convergent.
Recall from (3.210) that $\lim_{n \to \infty} p_n = p$, where $p$ is the probability that the amoebas eventually die off. Since $p_n < \sqrt{2} - 1$ for all $n \ge 1$, it follows that $p \le \sqrt{2} - 1$. Moreover, from (3.211), we find that

$$p = \frac{1}{4} + \frac{1}{4} p + \frac{1}{4} p^2 + \frac{1}{4} p^3,$$

which can be written as

$$0 = p^3 + p^2 - 3p + 1 = (p - 1)(p^2 + 2p - 1),$$

and has solutions $1$, $\sqrt{2} - 1$, and $-\sqrt{2} - 1$. Since $0 \le p \le \sqrt{2} - 1$, we obtain that $p = \sqrt{2} - 1$.
In other words, the probability that the amoebas eventually die off is $\sqrt{2} - 1$.
Solution 2: Let $p$ be the probability that the descendants of a single amoeba will die out eventually. Then, the probability that the descendants of $n$ amoebas will all die out eventually is $p^n$, since each amoeba is independent of all other amoebas.
Furthermore, the probability that the descendants of an amoeba will die out eventually is independent of time when averaged over all the possibilities.
At the beginning, the probability that the descendants of an amoeba will die out eventually is, by definition, $p$. After one minute, the initial amoeba turns into 0, 1, 2, or 3 amoebas, with probability of $\frac{1}{4}$ for each case. Thus, the probability that the descendants of the amoeba die out is now

$$\frac{1}{4} \cdot 1 + \frac{1}{4} p + \frac{1}{4} p^2 + \frac{1}{4} p^3 = \frac{1}{4} (1 + p + p^2 + p^3).$$

These two probabilities must be equal, and therefore

$$p = \frac{1}{4} (1 + p + p^2 + p^3), \tag{3.212}$$

which is the same as

$$p^3 + p^2 - 3p + 1 = (p - 1)(p^2 + 2p - 1) = 0. \tag{3.213}$$

The roots of (3.213) are $p = 1$, $p = -\sqrt{2} - 1$, and $p = \sqrt{2} - 1$. The only root in the interval $(0, 1)$ is $p = \sqrt{2} - 1$, and this is the probability that the amoebas eventually die off.
A more subtle question is why $p = 1$ is not the
answer to the problem at hand. The right-hand
side of (3.212) is the generating function $h(p)$ of the
amoeba's branching process. It is a well-known theorem
on branching processes that, if the mean number of offspring
produced by a single amoeba is greater than 1, then
the smallest positive root of the equation $p = h(p)$ is the
probability that the amoeba's descendants will die out eventually.
In our case, the mean number of offspring produced
by a single amoeba is $\frac{1}{4}(0 + 1 + 2 + 3) = \frac{3}{2} > 1$, so the
theorem applies. □
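The recurrence (3.211) can also be checked numerically. The following is a minimal sketch (the function name `extinction_probability` and the iteration count are our own choices, not part of the original solution) that iterates the recurrence starting from the value 1/4 and compares the result with the closed-form value of the square root of 2, minus 1.

```python
import math


def extinction_probability(iterations: int = 200) -> float:
    """Iterate p_n = (1 + p_{n-1} + p_{n-1}^2 + p_{n-1}^3) / 4, starting at p_1 = 1/4."""
    p = 0.25  # p_1: the single amoeba dies in the first minute with probability 1/4
    for _ in range(iterations):
        p = (1 + p + p**2 + p**3) / 4
    return p


if __name__ == "__main__":
    print("iterated limit:", extinction_probability())
    print("sqrt(2) - 1   :", math.sqrt(2) - 1)
```

The iteration converges to the smaller fixed point because the sequence starts below it and is increasing, which mirrors the monotonicity argument of Solution 1.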

Question 14. Given a set $X$ with $n$ elements, choose two
subsets $A$ and $B$ at random. What is the probability of
$A$ being a subset of $B$?

Answer: When two subsets $A$ and $B$ of $X$ are chosen at
random, each element of $X$ is equally likely, independently of
the other elements, to end up in any of the following four sets:

$$A \setminus B, \quad B \setminus A, \quad A \cap B, \quad \text{and} \quad X \setminus (A \cup B).$$

For $A$ to be a subset of $B$, $A \setminus B$ would have to be
empty; in other words, none of the $n$ elements of $X$ would
end up in $A \setminus B$. The probability of any element of $X$ not
ending up in $A \setminus B$ is $\frac{3}{4}$.
We conclude that the probability of $A$ being a subset
of $B$ is $\left(\frac{3}{4}\right)^n$.
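For small $n$, the result can be confirmed by enumerating all pairs of subsets. The sketch below is illustrative only: it encodes subsets as bitmasks (our own choice of representation) and counts the pairs in which no element falls in $A \setminus B$.

```python
from fractions import Fraction


def subset_probability(n: int) -> Fraction:
    """Exact P(A is a subset of B) over all ordered pairs of subsets of an n-element set."""
    total = 0
    favorable = 0
    for a in range(1 << n):          # every subset A as a bitmask
        for b in range(1 << n):      # every subset B as a bitmask
            total += 1
            if (a & ~b) == 0:        # no element lies in A \ B
                favorable += 1
    return Fraction(favorable, total)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, subset_probability(n), Fraction(3, 4) ** n)
```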

Question 15. Alice writes two distinct real numbers be-


tween 0 and 1 on two sheets of paper. Bob selects one of
the sheets randomly to inspect it. He then has to declare
whether the number he sees is the bigger or smaller of the
two.
Is there any way Bob can expect to be correct more
than half the times Alice plays this game with him?

Answer: Denote the numbers Alice writes on two sheets of
paper by $a_1$ and $a_2$, $0 < a_1 < a_2 < 1$. Denote the number
Bob selects by $A$. Bob's task is to guess (with probability
of being correct bigger than 1/2) whether $A = a_1$ or $A = a_2$.

Bob's strategy is as follows: after seeing $A$, Bob draws a
number $B$ uniformly at random from $(0, 1)$; if $B$ is smaller
than $A$, Bob declares that $A = a_2$; otherwise, he declares
that $A = a_1$.
Denote by $E$ the event that Bob is correct using this
strategy. Then,

$$\begin{aligned}
P(E) &= P(E \mid A = a_1) \cdot P(A = a_1) + P(E \mid A = a_2) \cdot P(A = a_2) \\
&= P(B > a_1) \cdot P(A = a_1) + P(B < a_2) \cdot P(A = a_2) \\
&= (1 - a_1) \cdot 0.5 + a_2 \cdot 0.5 \\
&= 0.5 + 0.5\,(a_2 - a_1) \\
&> 0.5.
\end{aligned}$$

The probability that Bob is correct using this strategy
is therefore greater than $\frac{1}{2}$. □
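Bob's randomized threshold strategy is easy to simulate. The sketch below is a hedged illustration: the function name `success_rate`, the sample values 0.3 and 0.7, the trial count, and the fixed seed are all assumptions made for the example. For these values the formula above predicts a success rate of 0.5 + 0.5(0.7 - 0.3) = 0.7.

```python
import random


def success_rate(a1: float, a2: float, trials: int = 200_000, seed: int = 0) -> float:
    """Estimate how often Bob guesses correctly with the random-threshold strategy."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        shown = rng.choice((a1, a2))          # Bob picks one sheet at random
        threshold = rng.random()              # Bob's auxiliary uniform number B
        guess_bigger = threshold < shown      # declare "bigger" when B < A
        is_bigger = shown == max(a1, a2)
        correct += guess_bigger == is_bigger
    return correct / trials


if __name__ == "__main__":
    print(success_rate(0.3, 0.7))
```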

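Question 16. How many digits does the number $125^{100}$
have? You are not allowed to use values of $\log_{10} 2$ or
$\log_{10} 5$.
Answer: Note that

$$125^{100} = \left(\frac{1000}{8}\right)^{100} = \frac{1000^{100}}{2^{300}}. \quad (3.214)$$

Since $2^{10} = 1024$, we obtain that

$$125^{100} = \frac{1000^{100}}{1024^{30}} = \frac{1000^{100}}{1000^{30} \cdot 1.024^{30}} = \frac{1000^{70}}{1.024^{30}} = \frac{10^{210}}{1.024^{30}}. \quad (3.215)$$

We first show that

$$1 < 1.024^{30} < 10. \quad (3.216)$$

From the binomial expansion, it follows that

$$(1 + 0.024)^{30} = \sum_{j=0}^{30} \binom{30}{j} 0.024^{j}. \quad (3.217)$$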
Note that the ratio of every two consecutive terms in
(3.217) is at most 0.72, since, for $0 \leq j \leq 29$,

$$\frac{\binom{30}{j+1} 0.024^{j+1}}{\binom{30}{j} 0.024^{j}} = \frac{\dfrac{30!}{(j+1)!\,(30-j-1)!}}{\dfrac{30!}{j!\,(30-j)!}} \cdot 0.024 = \frac{30 - j}{j + 1} \cdot 0.024 \leq 30 \cdot 0.024 = 0.72.$$
Then,

$$\binom{30}{j} 0.024^{j} \leq 0.72^{j}, \quad \forall\, 0 \leq j \leq 30, \quad (3.218)$$

and, from (3.217) and (3.218), we obtain that

$$(1 + 0.024)^{30} \leq \sum_{j=0}^{30} 0.72^{j} < \sum_{j=0}^{\infty} 0.72^{j} = \frac{1}{1 - 0.72} < 10;$$

note that, for the equality above, we
used the geometric series identity

$$\sum_{j=0}^{\infty} z^{j} = \frac{1}{1 - z}, \quad \forall\, |z| < 1.$$

Thus, the inequality (3.216) is proved.
From (3.215) and (3.216), we obtain that

$$10^{209} < 125^{100} = \frac{10^{210}}{1.024^{30}} < 10^{210}, \quad (3.219)$$

and conclude that $125^{100}$ has 210 digits. □
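Although the interview answer must avoid logarithm tables, the bounds are easy to confirm with exact integer arithmetic. The short sketch below (the helper name `digit_count` is ours) checks the digit count and the two inequalities (3.216) and (3.219) directly.

```python
def digit_count(n: int) -> int:
    """Number of decimal digits of a positive integer."""
    return len(str(n))


value = 125 ** 100  # exact, since Python integers have arbitrary precision

if __name__ == "__main__":
    print(digit_count(value))                 # digit count of 125^100
    print(10**209 < value < 10**210)          # inequality (3.219)
    print(1 < 1.024**30 < 10)                 # inequality (3.216)
```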

Question 17. For every subset of $\{1, 2, 3, \ldots, 2013\}$, arrange
the numbers in increasing order and take the
sum with alternating signs. The resulting integer is called
the weight of the subset.¹² Find the sum of the weights
of all the subsets of $\{1, 2, 3, \ldots, 2013\}$.
Answer: Let $w(S)$ denote the weight of subset $S$. Every
subset $S$ of $\{1, 2, 3, \ldots, 2013\}$ that does not contain element 1
can be uniquely paired with the subset $\{1\} \cup S$
that contains element 1. Since there are $2^{2013}$ subsets of
$\{1, 2, 3, \ldots, 2013\}$, there are $2^{2012}$ such pairs. Note that
$w(S) + w(\{1\} \cup S) = 1$, since placing 1 in front of the elements
of $S$ flips the sign of each of them; that is, the combined weight
of each pair is 1. For example,

$$w(\{2, 5, 8\}) + w(\{1, 2, 5, 8\}) = (2 - 5 + 8) + (1 - 2 + 5 - 8) = 1.$$

Hence, the sum of the weights of all the subsets of
$\{1, 2, 3, \ldots, 2013\}$ is $2^{2012}$. □
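The pairing argument suggests that, for a ground set {1, ..., n}, the total weight is 2 to the power n - 1. A brute-force check for small n is sketched below; the function names `weight` and `total_weight` are illustrative, and the enumeration is only feasible for small n.

```python
from itertools import combinations


def weight(subset) -> int:
    """Alternating sum of the elements in increasing order, starting with a plus sign."""
    return sum(x if k % 2 == 0 else -x for k, x in enumerate(sorted(subset)))


def total_weight(n: int) -> int:
    """Sum of the weights of all subsets of {1, ..., n} (brute force)."""
    elements = range(1, n + 1)
    return sum(weight(c) for r in range(n + 1) for c in combinations(elements, r))


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, total_weight(n), 2 ** (n - 1))
```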

¹² For example, the weight of the subset {3} is 3. The weight of
the subset {2,5,8} is 2 - 5 + 8 = 5.

Question 18. Alice and Bob alternately choose one number
from the following nine numbers: 1/16, 1/8,
1/4, 1/2, 1, 2, 4, 8, 16, without replacement. Whoever
gets three numbers that multiply to one wins the game.
Alice starts first. What should her strategy be? Can she
always win?
Answer: First, notice that the numbers Alice and Bob
play with are powers of two, namely, $2^{-4}$, $2^{-3}$, $2^{-2}$, $2^{-1}$,
$2^{0}$, $2^{1}$, $2^{2}$, $2^{3}$, and $2^{4}$. Next, imagine that Alice and Bob
are playing on the $3 \times 3$ square whose entries are as follows:
the first row (from left to right) is $2^{3}$, $2^{-4}$, $2^{1}$; the
second row is $2^{-2}$, $2^{0}$, $2^{2}$; and the third row is $2^{-1}$, $2^{4}$, $2^{-3}$.
The product of the entries in every row, every column,
and every diagonal is 1, and these possibilities cover all
the ways to choose three of the given numbers multiplying
to 1. Thus, Alice and Bob are essentially playing Tic-Tac-Toe!
It is well-known that best play by both players
in Tic-Tac-Toe leads to a draw. We conclude that Alice
does not have a winning strategy, although she cannot lose
either.
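The claim that the eight lines of the square cover all winning triples can be verified mechanically by working with exponents, where "product equals 1" becomes "sum equals 0". The sketch below assumes the square of exponents with rows (3, -4, 1), (-2, 0, 2), (-1, 4, -3); the names `SQUARE`, `lines`, and `zero_sum_triples` are ours.

```python
from itertools import combinations

# Exponents of 2 laid out on the 3 x 3 board described in the answer.
SQUARE = [[3, -4, 1],
          [-2, 0, 2],
          [-1, 4, -3]]


def lines(square):
    """All rows, columns, and diagonals of a 3 x 3 square, as sets of entries."""
    rows = [tuple(r) for r in square]
    cols = [tuple(square[r][c] for r in range(3)) for c in range(3)]
    diags = [tuple(square[i][i] for i in range(3)),
             tuple(square[i][2 - i] for i in range(3))]
    return {frozenset(line) for line in rows + cols + diags}


def zero_sum_triples():
    """All triples of distinct exponents in {-4, ..., 4} that sum to zero."""
    return {frozenset(t) for t in combinations(range(-4, 5), 3) if sum(t) == 0}


if __name__ == "__main__":
    print(lines(SQUARE) == zero_sum_triples(), len(zero_sum_triples()))
```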

Question 19. Mr. and Mrs. Jones invite four other cou-
ples over for a party. At the end of the party, Mr. Jones
asks everyone else how many people they shook hands
with, and finds that everyone gives a different answer. Of
course, no one shook hands with his or her spouse and
no one shook the same person’s hand twice. How many
people did Mrs. Jones shake hands with?
Answer: Since each person shook hands with at most eight
others, the nine different answers received by Mr. Jones
are exactly the numbers 0 through 8. Denote by $P_i$ the
person with $i$ handshakes, $i = 0, 1, \ldots, 8$. Mr. Jones is
not assigned any additional notation.
$P_8$ shook hands with 8 people of the total of 9 other
people. Thus, $P_8$ did not shake the hand of only one other
person, so that person must be his or her spouse. On the
other hand, $P_8$ did not shake the hand of $P_0$ since nobody
did that. Therefore, $P_8$ and $P_0$ are married, and $P_8$ shook
everyone's hand except for $P_0$. $P_7$ did not shake the hands
of two people, one of whom was his or her spouse. One of
these two people had to be $P_0$ as he or she did not shake
anyone's hand, and the other one had to be $P_1$ as he or she
had only one handshake, namely with $P_8$. Since spouses
do not shake hands, the spouse of $P_7$ is either $P_1$ or $P_0$.
However, $P_0$ is married to $P_8$, so $P_1$ must be married to
$P_7$.
Proceeding similarly, we find that $P_6$ and $P_2$ must be
married, and that $P_5$ and $P_3$ must be married. Then, $P_4$
must be Mrs. Jones, since this is the only person whose
spouse was not identified, and Mr. Jones was not any one
of $P_0, P_1, \ldots, P_8$.

We conclude that Mrs. Jones shook hands with four
people. □
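To see that the configuration derived above is actually achievable, one can write down an explicit handshake pattern and check the degrees. The sketch below uses one such pattern, chosen by us for illustration: person i and person j shake hands when i + j is at least 9, and Mr. Jones shakes hands with the persons numbered 5 through 8. The function name and the labels are hypothetical.

```python
def handshake_degrees():
    """Handshake counts in one explicit configuration consistent with the answer."""
    people = list(range(9)) + ["Mr. Jones"]
    # Couples found in the answer: (P8, P0), (P7, P1), (P6, P2), (P5, P3), (Mr. Jones, P4).
    spouse = {0: 8, 8: 0, 1: 7, 7: 1, 2: 6, 6: 2, 3: 5, 5: 3,
              4: "Mr. Jones", "Mr. Jones": 4}

    def shakes(a, b) -> bool:
        if a == b or spouse[a] == b:
            return False                      # nobody shakes own or spouse's hand
        if a == "Mr. Jones":
            return b >= 5
        if b == "Mr. Jones":
            return a >= 5
        return a + b >= 9                     # illustrative rule for the other guests

    return {p: sum(shakes(p, q) for q in people) for p in people}


if __name__ == "__main__":
    print(handshake_degrees())
```

In this configuration every guest numbered i reports exactly i handshakes, and Mrs. Jones (person 4) reports four.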

Question 20. The New York Yankees and the San Fran-
cisco Giants are playing in the World Series (best of seven
format). You would like to bet $100 on the Yankees win-
ning the World Series, but you can only place bets on in-
dividual games, and every time at even odds. How much
should you bet on the first game?
Answer: Let $P$ be a $5 \times 5$ matrix containing the net payoffs
for all the states in this dynamic programming problem.
More precisely, $P(i,j)$ denotes the net payoff in the state
$(i,j)$ when the Yankees have won $i$ and lost $j$ games ($0 \leq i, j \leq 4$).
Clearly, $P(4,j) = 100$ ($0 \leq j \leq 3$), and $P(i,4) = -100$
($0 \leq i \leq 3$). Moreover, $P(4,4)$ is left blank, as $(4,4)$ is not
in our state space (4 wins and 4 losses cannot be achieved
in a best of seven series).
Let $B$ be a $4 \times 4$ matrix containing the bets we need
to place at each state $(i,j)$, given that we would like to
bet \$100 on the Yankees winning the World Series, and
given that the Yankees have won $i$ and lost $j$ games so far
($0 \leq i, j \leq 3$). Clearly, $B(3,3) = 100$.
Given that the Yankees have won $i$ and lost $j$ games so
far, if we bet $B(i,j)$ on the Yankees for the next game,
our payoff will be $P(i,j) + B(i,j)$ if the Yankees win, or
$P(i,j) - B(i,j)$ if the Yankees lose. Therefore,

$$P(i+1,j) = P(i,j) + B(i,j); \quad (3.220)$$

$$P(i,j+1) = P(i,j) - B(i,j). \quad (3.221)$$

By adding and subtracting the equations (3.220) and
(3.221) we obtain that

$$P(i,j) = \frac{1}{2}\left(P(i+1,j) + P(i,j+1)\right); \quad (3.222)$$

$$B(i,j) = \frac{1}{2}\left(P(i+1,j) - P(i,j+1)\right). \quad (3.223)$$

Now, it is easy to compute all the entries of $P$, using
(3.222) and working backwards from $P(4,j)$ and $P(i,4)$.
For example,

$$P(3,3) = \frac{1}{2}\left(P(4,3) + P(3,4)\right) = \frac{1}{2}(100 - 100) = 0.$$

Once matrix $P$ is computed, we use (3.223) to compute
the matrix $B$. For example,

$$B(2,1) = \frac{1}{2}\left(P(3,1) - P(2,2)\right) = \frac{1}{2}(75 - 0) = 37.5.$$

In other words, given that the Yankees have won 2 games
and lost 1 game so far, and we would like to bet \$100
on the Yankees winning the World Series, we should bet
\$37.5 on the Yankees for the next game. We include both
matrices below, with rows indexed by $i = 0, 1, \ldots$ and columns by $j = 0, 1, \ldots$:

$$P = \begin{pmatrix}
0 & -31.25 & -62.5 & -87.5 & -100 \\
31.25 & 0 & -37.5 & -75 & -100 \\
62.5 & 37.5 & 0 & -50 & -100 \\
87.5 & 75 & 50 & 0 & -100 \\
100 & 100 & 100 & 100 &
\end{pmatrix}$$

$$B = \begin{pmatrix}
31.25 & 31.25 & 25 & 12.5 \\
31.25 & 37.5 & 37.5 & 25 \\
25 & 37.5 & 50 & 50 \\
12.5 & 25 & 50 & 100
\end{pmatrix}$$

Therefore, we should bet $B(0,0) = 31.25$ dollars (on
the Yankees) on the first game. Note that the probability
of the Yankees winning or losing a single game does not
affect your betting strategy. □
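The backward recursion (3.222) and (3.223) translates directly into a few lines of code. The sketch below is a minimal illustration (the function name `world_series_bets` and its parameters are our own); it fills the payoff matrix from the terminal states and derives the bets along the way.

```python
def world_series_bets(stake: float = 100.0, wins_needed: int = 4):
    """Return (P, B): net payoffs and required bets for each state (wins, losses)."""
    n = wins_needed
    # P[i][j]: net payoff when the Yankees have i wins and j losses; P[n][n] stays None.
    P = [[None] * (n + 1) for _ in range(n + 1)]
    for j in range(n):
        P[n][j] = stake        # Yankees have won the series
    for i in range(n):
        P[i][n] = -stake       # Yankees have lost the series
    B = [[0.0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            P[i][j] = (P[i + 1][j] + P[i][j + 1]) / 2   # equation (3.222)
            B[i][j] = (P[i + 1][j] - P[i][j + 1]) / 2   # equation (3.223)
    return P, B


if __name__ == "__main__":
    P, B = world_series_bets()
    for row in B:
        print(row)
```

With zero-based indexing, `B[0][0]` is the bet on the first game and `B[2][1]` is the bet after two wins and one loss.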

Question 21. We have two red, two green and two yellow
balls. For each color, one ball is heavy and the other is
light. All heavy balls weigh the same. All light balls weigh
the same. How many weighings on a scale are necessary
to identify the three heavy balls?

Answer: It is clear that one weighing does not suffice. We
show that the heavy balls can be identified in only two
weighings.
Label the red balls $R_1$ and $R_2$, the green balls $G_1$ and
$G_2$, and the yellow balls $Y_1$ and $Y_2$. Our first weighing is
$\{R_1, G_1\}$ vs. $\{R_2, Y_1\}$.
If the scale is in balance, then either $G_1$ or $Y_1$ is heavy,
but not both. Our second weighing is $\{G_1\}$ vs. $\{Y_1\}$. If
$G_1$ is heavier, then the set of heavy balls is $\{R_2, G_1, Y_2\}$.
If $Y_1$ is heavier, then the set of heavy balls is $\{R_1, G_2, Y_1\}$.
If $\{R_1, G_1\}$ is heavier, then $R_1$ is heavy, and either $G_1$ is heavy or $Y_1$
is light. Our second weighing is $\{G_1, Y_1\}$ vs. $\{G_2, Y_2\}$.
If the scale is in balance, then $G_1$ is heavy; hence, the
set of heavy balls is $\{R_1, G_1, Y_2\}$. If $\{G_1, Y_1\}$ is heavier,
then $G_1$ and $Y_1$ are both heavy. The set of heavy balls is
$\{R_1, G_1, Y_1\}$. If $\{G_2, Y_2\}$ is heavier, then $G_2$ and $Y_2$ are
both heavy. The set of heavy balls is $\{R_1, G_2, Y_2\}$.
If $\{R_2, Y_1\}$ is heavier, then $R_2$ is heavy, and either $Y_1$ is heavy or $G_1$
is light. Our second weighing is $\{G_1, Y_1\}$ vs. $\{G_2, Y_2\}$.
If the scale is in balance, then $Y_1$ is heavy; hence, the
set of heavy balls is $\{R_2, G_2, Y_1\}$. If $\{G_1, Y_1\}$ is heavier,
then $G_1$ and $Y_1$ are both heavy. The set of heavy balls is
$\{R_2, G_1, Y_1\}$. If $\{G_2, Y_2\}$ is heavier, then $G_2$ and $Y_2$ are
both heavy. The set of heavy balls is $\{R_2, G_2, Y_2\}$. □
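Since there are only eight possible assignments of heavy balls, the two-weighing strategy can be tested exhaustively. The sketch below encodes the decision tree described above; the names `weigh`, `identify_heavy`, and `strategy_is_correct`, as well as the numeric weights 2 and 1, are assumptions made for the illustration.

```python
from itertools import product


def weigh(weights, left, right) -> int:
    """Return 1 if the left pan is heavier, -1 if the right pan is heavier, 0 if balanced."""
    l = sum(weights[b] for b in left)
    r = sum(weights[b] for b in right)
    return (l > r) - (l < r)


def identify_heavy(weights):
    """Apply the two-weighing strategy and return the set of balls declared heavy."""
    first = weigh(weights, ("R1", "G1"), ("R2", "Y1"))
    if first == 0:
        second = weigh(weights, ("G1",), ("Y1",))
        return {"R2", "G1", "Y2"} if second > 0 else {"R1", "G2", "Y1"}
    red = "R1" if first > 0 else "R2"         # the red ball on the heavier side is heavy
    second = weigh(weights, ("G1", "Y1"), ("G2", "Y2"))
    if second > 0:
        return {red, "G1", "Y1"}
    if second < 0:
        return {red, "G2", "Y2"}
    # Balanced second weighing: exactly one of G1, Y1 is heavy.
    return {red, "G1", "Y2"} if first > 0 else {red, "G2", "Y1"}


def strategy_is_correct(heavy: int = 2, light: int = 1) -> bool:
    """Check the strategy against all 8 possible sets of heavy balls."""
    for r, g, y in product((1, 2), repeat=3):
        truth = {f"R{r}", f"G{g}", f"Y{y}"}
        weights = {f"{c}{k}": (heavy if f"{c}{k}" in truth else light)
                   for c in "RGY" for k in (1, 2)}
        if identify_heavy(weights) != truth:
            return False
    return True


if __name__ == "__main__":
    print(strategy_is_correct())
```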

Question 22. There is a row of 10 rooms and a treasure


in one of them. Each night, a ghost moves the treasure
to an adjacent room. You are trying to find the treasure,
but can only check one room per day. How do you find
it?

Answer: The treasure can be found in at most 16 days.
Label the rooms 1 through 10. Denote by $T_k$ the room
where the treasure is on day $k$, and let $R_k$ denote the room
you check on day $k$. Adopt the following strategy: for days
$k = 1, 2, \ldots, 8$, let $R_k = k + 1$; for days $k = 9, 10, \ldots, 16$,
let $R_k = 18 - k$.
If $T_1$ is even, then we will find the treasure in one of
the first eight days. In other words, there exists $k$ with
$1 \leq k \leq 8$ such that $T_k = R_k$. Note that for $1 \leq k \leq 8$,
$T_k$ and $R_k$ have the same parity, since $T_1$ is even, $R_1 = 2$,
and both $T_k$ and $R_k$ change by exactly 1 from day to
day, according to the ghost's moves and our strategy. Hence,
$T_k - R_k$ is even. Furthermore, $T_1 \neq 1$ and $T_8 \neq 10$, which
implies $T_1 - R_1 \geq 0$ and $T_8 - R_8 \leq 0$. Since $T_k - R_k$ can
change by at most 2 from day to day, there must exist
some $k$, $1 \leq k \leq 8$, such that $T_k - R_k = 0$.
If $T_1$ is odd, then we claim $T_k = R_k$ for some $k$, $9 \leq k \leq 16$.
Note that $T_9$ is odd since $T_1$ is odd and the treasure
is moved to an adjacent room each night. Furthermore,
for $9 \leq k \leq 16$, $T_k$ and $R_k$ have the same parity since $T_9$
is odd, $R_9 = 9$, and both $T_k$ and $R_k$ change by exactly
1 from day to day, according to the ghost's moves and our
strategy. Hence, $R_k - T_k$ is even. Furthermore, $T_9 \neq 10$
and $T_{16} \neq 1$, which implies $R_9 - T_9 \geq 0$ and $R_{16} - T_{16} \leq 0$.
Since $R_k - T_k$ can change by at most 2 from day to day,
there must exist some $k$, $9 \leq k \leq 16$, such that $R_k - T_k = 0$.
Finally, note that this strategy can be generalized to
any number $n > 2$ of rooms, when the treasure will be
found in at most $2n - 4$ days, by checking the rooms
$2, 3, \ldots, n - 1, n - 1, n - 2, \ldots, 3, 2$, in that order. □
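The strategy can be checked against every possible behaviour of the ghost by tracking the set of rooms where an as-yet-undetected treasure could be. The sketch below is illustrative (the function name `strategy_always_finds` is ours): each day it removes the checked room from the candidate set, then lets every remaining candidate move to an adjacent room.

```python
def strategy_always_finds(n: int) -> bool:
    """True if checking rooms 2..n-1 and then n-1..2 catches the treasure for sure."""
    checks = list(range(2, n)) + list(range(n - 1, 1, -1))   # 2n - 4 checks in total
    possible = set(range(1, n + 1))       # rooms where an undetected treasure may be
    for room in checks:
        possible.discard(room)            # treasure is found if it was in this room
        # The ghost moves the treasure to an adjacent room overnight.
        possible = {r + d for r in possible for d in (-1, 1) if 1 <= r + d <= n}
    return not possible                   # empty set: no path escapes detection


if __name__ == "__main__":
    for n in range(3, 13):
        print(n, 2 * n - 4, strategy_always_finds(n))
```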

Question 23. How many comparisons do you need to


find the maximum in a set of n distinct numbers? How
many comparisons do you need to find both the maximum
and minimum in a set of n distinct numbers?

Answer: We can find the maximum in $n - 1$ comparisons
as follows: let $\{x_1, x_2, \ldots, x_n\}$ denote the set of $n$ distinct
numbers. Scan the numbers from left to right, while
maintaining the current maximum $M$. More precisely, set
$M = x_1$, and, in the $i$th comparison, compare $M$ and $x_{i+1}$.
If $M > x_{i+1}$, leave $M$ as is; otherwise, set $M = x_{i+1}$. Do
this for $i = 1, 2, \ldots, n - 1$.
Similarly, one can find the minimum in a set of $n$ distinct
numbers in $n - 1$ comparisons.
Then, we can find the maximum $M$ and the minimum
$m$ in $\{x_1, x_2, \ldots, x_n\}$ with $2n - 3$ comparisons: find the
maximum $M$ in $\{x_1, x_2, \ldots, x_n\}$ with $n - 1$ comparisons,
and then find the minimum $m$ in $\{x_1, x_2, \ldots, x_n\} \setminus \{M\}$
with $n - 2$ comparisons.
However, one can find $m$ and $M$ significantly faster.
If $n$ is even, compare all $\frac{n}{2}$ consecutive pairs of numbers
$x_{2i-1}$ and $x_{2i}$, $i = 1, \ldots, \frac{n}{2}$, and put the smaller number into
a set $S$ and the larger number into a set $L$. This requires
$\frac{n}{2}$ comparisons. Note that $S$ and $L$ have $\frac{n}{2}$ elements each.
Then, find the minimum $m$ in $S$ using $\frac{n}{2} - 1$ comparisons,
and find the maximum $M$ in $L$ using $\frac{n}{2} - 1$ comparisons.
The total number of comparisons to find $m$ and $M$ is

$$\frac{n}{2} + \left(\frac{n}{2} - 1\right) + \left(\frac{n}{2} - 1\right) = \frac{3n}{2} - 2. \quad (3.224)$$

If $n$ is odd, compare $\frac{n-1}{2}$ consecutive pairs of numbers
$x_{2i-1}$ and $x_{2i}$, $i = 1, \ldots, \frac{n-1}{2}$, and put the smaller number
into a set $S$ and the larger number into a set $L$. This
requires $\frac{n-1}{2}$ comparisons. Place $x_n$ into both $S$ and $L$.
Note that $S$ and $L$ have $\frac{n-1}{2} + 1 = \frac{n+1}{2}$ elements each.
Then, find the minimum $m$ in $S$ using $\frac{n+1}{2} - 1 = \frac{n-1}{2}$
comparisons, and find the maximum $M$ in $L$ using $\frac{n+1}{2} - 1 = \frac{n-1}{2}$
comparisons. The total number of comparisons
to find $m$ and $M$ is

$$\frac{n-1}{2} + \frac{n-1}{2} + \frac{n-1}{2} = \frac{3n+1}{2} - 2. \quad (3.225)$$

Note that the results of (3.224) and (3.225) can be
written succinctly as

$$\left\lceil \frac{3n}{2} \right\rceil - 2 \text{ comparisons},$$

where $\lceil x \rceil$ denotes the ceiling of $x$, the smallest integer
greater than or equal to $x$.
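The pairing scheme can be written with an explicit comparison counter to confirm the count. This is a hedged sketch rather than a tuned implementation; the function name `min_max` and the counter are our own additions, and the input is assumed to be a non-empty list of distinct numbers.

```python
import math
import random


def min_max(values):
    """Return (minimum, maximum, comparisons) using the pairing scheme."""
    n = len(values)
    comparisons = 0
    smaller, larger = [], []
    # Compare consecutive pairs; the loser goes to `smaller`, the winner to `larger`.
    for i in range(0, n - 1, 2):
        a, b = values[i], values[i + 1]
        comparisons += 1
        if a < b:
            smaller.append(a)
            larger.append(b)
        else:
            smaller.append(b)
            larger.append(a)
    if n % 2 == 1:                 # odd n: the unpaired last element joins both sets
        smaller.append(values[-1])
        larger.append(values[-1])
    lo = smaller[0]
    for v in smaller[1:]:
        comparisons += 1
        if v < lo:
            lo = v
    hi = larger[0]
    for v in larger[1:]:
        comparisons += 1
        if v > hi:
            hi = v
    return lo, hi, comparisons


if __name__ == "__main__":
    data = random.sample(range(1000), 11)
    print(min_max(data), math.ceil(3 * len(data) / 2) - 2)
```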

Question 24. Given a cube, you can jump from one


vertex to a neighboring vertex with equal probability. As-
sume you start from a certain vertex (does not matter
which one). What is the expected number of jumps to
reach the opposite vertex?
Answer: Label the vertices of the cube with 0, 1, 2, and
3, according to your distance from the opposite vertex. In
other words, label the starting vertex with 3, the vertices
adjacent to the starting vertex with 2, the vertices adja-
cent to the opposite vertex with 1, and the opposite vertex
with 0. Call the opposite vertex your final destination.
Denote by $E_i$, $i = 0, 1, 2, 3$, the expected number of
jumps yet to be made to reach the final destination, given
that you are currently in one of the vertices labeled with
$i$. Note that $E_0 = 0$, and we have to find $E_3$.
After the first jump, you are in one of the vertices labeled
with 2, so

$$E_3 = 1 + E_2. \quad (3.226)$$

From a vertex labeled with 2, you can jump to three
vertices: two of them are labeled with 1, and one of them
is labeled with 3. Thus, you jump to a vertex labeled with
1 with probability $\frac{2}{3}$, or to a vertex labeled with 3 with
probability $\frac{1}{3}$. Hence,

$$E_2 = 1 + \frac{2}{3} E_1 + \frac{1}{3} E_3. \quad (3.227)$$

Similarly, from one of the vertices labeled with 1, you
jump to a vertex labeled with 2 with probability $\frac{2}{3}$ or to
a vertex labeled with 0 with probability $\frac{1}{3}$. Hence,

$$E_1 = 1 + \frac{2}{3} E_2 + \frac{1}{3} E_0 = 1 + \frac{2}{3} E_2, \quad (3.228)$$

since $E_0 = 0$.
Solving (3.226-3.228) yields $E_1 = 7$, $E_2 = 9$, and $E_3 = 10$.
We conclude that it will take 10 jumps, on average, to
reach the opposite vertex. □
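The three linear equations (3.226) to (3.228) can be solved exactly with rational arithmetic. The sketch below uses a small Gauss-Jordan elimination written for this example (the helper name `solve` is ours, and it assumes a non-singular system); the unknowns are ordered as E_1, E_2, E_3.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a small non-singular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]      # normalize the pivot row
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Rows: (3.228), (3.227), (3.226), rewritten with all unknowns on the left-hand side.
A = [[1, Fraction(-2, 3), 0],
     [Fraction(-2, 3), 1, Fraction(-1, 3)],
     [0, -1, 1]]
b = [1, 1, 1]

if __name__ == "__main__":
    print(solve(A, b))
```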

Question 25. Select numbers uniformly distributed be-


tween 0 and 1, one after the other, as long as they keep
decreasing; i.e. stop selecting when you obtain a number
that is greater than the previous one you selected.
(i) On average, how many numbers have you selected?
(ii) What is the average value of the smallest number you
have selected?
Answer: We give three solutions for part (i); the third
solution will be used to solve part (ii).
(i) Solution 1: Denote by $E(x)$ the expected number of
numbers you have yet to select, given that you have just
selected number $x$. For example, $E(0) = 1$, since the next
number you select is greater than $x = 0$, upon which the
game stops.
Assume that you have just selected number $x$. Denote
by $y$ the next number you select. We find $E(x)$ by conditioning
on $y$. With probability $1 - x$, $y$ is greater than
$x$; the game stops, and, thus, you have selected only one
number after selecting $x$. On the other hand, $y$ could be
smaller than $x$, in which case you expect to select $E(y)$
additional numbers after selecting $y$; in other words, you
expect to select $1 + E(y)$ additional numbers after selecting
$x$. Since the probability density function of $y$ is $f(y) = 1$,
the law of total probability gives

$$E(x) = 1 \cdot (1 - x) + \int_0^x (1 + E(y)) f(y)\, dy = (1 - x) + \int_0^x (1 + E(y))\, dy = 1 + \int_0^x E(y)\, dy.$$

Differentiating

$$E(x) = 1 + \int_0^x E(y)\, dy$$

with respect to $x$ yields

$$E'(x) = E(x).$$

Thus, $E(x) = Ce^x$, where $C$ is a constant. The condition
$E(0) = 1$ gives $C = 1$. Hence, $E(x) = e^x$.
Note that the first number selected is automatically
smaller than 1. Then, the number of numbers selected
after starting with the number 1, which was denoted by
$E(1)$, is equal to the number of numbers selected starting
with a random number between 0 and 1. Therefore, the
average number of numbers you have selected is $E(1) = e$.
(i) Solution 2: Denote by $N$ the average number of numbers
you have selected. Denote by $x_i$ the $i$th number you
selected, $i \geq 1$, and let $p_i$ denote the probability that
$x_i < x_{i-1} < \ldots < x_1$. Since there are $i!$ equally likely orderings of
$x_1, x_2, \ldots, x_i$, we find that $p_i = \frac{1}{i!}$.
You select at least two numbers before stopping. The
probability that you select exactly $i$ numbers, $i \geq 2$, before
stopping is equal to the probability that $x_{i-1} < \ldots < x_2 < x_1$
and $x_i > x_{i-1}$. The latter equals the probability
that $x_{i-1} < \ldots < x_2 < x_1$ minus the probability that
$x_i < x_{i-1} < \ldots < x_2 < x_1$, i.e., $p_{i-1} - p_i$.
We conclude that the expected number of numbers you
have selected is

$$N = \sum_{i=2}^{\infty} i\,(p_{i-1} - p_i) = \sum_{i=2}^{\infty} i \left(\frac{1}{(i-1)!} - \frac{1}{i!}\right) = \sum_{i=2}^{\infty} \frac{1}{(i-2)!} = \sum_{j=0}^{\infty} \frac{1}{j!} = e, \quad (3.229)$$

where (3.229) follows from the Taylor series expansion of
$e^x$ around 0, i.e.,

$$e^x = \sum_{j=0}^{\infty} \frac{x^j}{j!},$$

by letting $x = 1$.
(i) Solution 3: Denote by $p(x)\,dx$ the probability that a
number between $x$ and $x + dx$ is selected as part of the
decreasing sequence. Let $p_i(x)\,dx$ denote the probability
that a number between $x$ and $x + dx$ is selected as the $i$th
term of the decreasing sequence. Then,

$$p(x)\,dx = \left(\sum_{i=1}^{\infty} p_i(x)\right) dx. \quad (3.230)$$

The probability that a number between $x$ and $x + dx$ is
selected as the $i$th term of the decreasing sequence is

$$p_i(x)\,dx = \frac{(1 - x)^{i-1}}{(i-1)!}\,dx, \quad (3.231)$$

since $(1 - x)^{i-1}$ is the probability that the first $i - 1$ numbers
selected are greater than $x$, and $\frac{1}{(i-1)!}$ is the probability
that they are selected in decreasing order.
From (3.230) and (3.231), we obtain that

$$p(x)\,dx = \sum_{i=1}^{\infty} \frac{(1 - x)^{i-1}}{(i-1)!}\,dx = e^{1-x}\,dx, \quad (3.232)$$

where (3.232) follows from the Taylor series expansion of
$e^t$ around 0, i.e.,

$$e^t = \sum_{j=0}^{\infty} \frac{t^j}{j!},$$

by letting $t = 1 - x$.
Therefore, the expected number of numbers selected in
the decreasing sequence is

$$\int_0^1 p(x)\,dx = \int_0^1 e^{1-x}\,dx = e - 1.$$

Adding the last number selected (which is not in the decreasing
sequence) gives an average of $e$ numbers selected.
(ii) Denote by $s$ the smallest number selected. It is the
last number selected in the decreasing sequence. Since
the probability that a number between $x$ and $x + dx$ is
selected as part of the decreasing sequence equals $e^{1-x}\,dx$,
see (3.232), and since the probability that the next number
selected is larger is $(1 - x)$, then the probability that $s$ is
between $x$ and $x + dx$ is $e^{1-x}(1 - x)\,dx$.
Therefore, the expected value of the smallest number
you have selected is

$$E(s) = \int_0^1 x\, e^{1-x} (1 - x)\,dx = \int_0^1 e^{y} (y - y^2)\,dy, \quad (3.233)$$

where (3.233) follows from the substitution $y = 1 - x$.
Using integration by parts to compute (3.233), we find
that

$$\begin{aligned}
E(s) &= e^{y}(y - y^2)\Big|_0^1 - \int_0^1 e^{y}(1 - 2y)\,dy \\
&= e^{y}(y - y^2)\Big|_0^1 - e^{y}\Big|_0^1 + 2\int_0^1 y\,e^{y}\,dy \\
&= e^{y}(y - y^2 - 1)\Big|_0^1 + 2y\,e^{y}\Big|_0^1 - 2\int_0^1 e^{y}\,dy \\
&= e^{y}(3y - y^2 - 1)\Big|_0^1 - 2e^{y}\Big|_0^1 \\
&= e^{y}(3y - y^2 - 3)\Big|_0^1 \\
&= 3 - e. \quad □
\end{aligned}$$
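Both parts of this question can be checked with a short Monte Carlo experiment. The sketch below is illustrative only: the function name `simulate_decreasing_run`, the trial count, and the fixed seed are our own choices. It estimates the average number of selected numbers, expected to be close to $e$, and the average smallest selected number, expected to be close to $3 - e$.

```python
import math
import random


def simulate_decreasing_run(trials: int = 200_000, seed: int = 0):
    """Return (average count of numbers selected, average smallest number selected)."""
    rng = random.Random(seed)
    total_count = 0
    total_smallest = 0.0
    for _ in range(trials):
        previous = rng.random()        # first number selected
        count = 1
        while True:
            current = rng.random()
            count += 1
            if current > previous:     # sequence stopped decreasing
                break
            previous = current
        total_count += count
        total_smallest += previous     # last term of the decreasing sequence
    return total_count / trials, total_smallest / trials


if __name__ == "__main__":
    avg_count, avg_smallest = simulate_decreasing_run()
    print(avg_count, math.e)
    print(avg_smallest, 3 - math.e)
```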

Question 26. To organize a charity event that costs
$100K, an organization raises funds. Independent of each
other, one donor after another donates some amount of
money that is exponentially distributed with a mean of
$20K. The process is stopped as soon as $100K or more
has been collected. Find the distribution, mean, and variance
of the number of donors needed until at least $100K
has been collected.
Answer: Denote by $x_i$ the amount of money donated by
donor $i$, which is exponentially distributed with mean $1/\lambda$.
Let $s_n = \sum_{i=1}^{n} x_i$ be the total amount of money donated
by donors $1, \ldots, n$, and let

$$N = \min_{n \geq 1} \{ n \text{ such that } s_n \geq a \}$$

be the discrete random variable denoting the smallest index
$n$ such that $s_n$ is at least $a$.
Denote by $P(n|a)$, $n \geq 1$, the probability mass function
of $N$, that is, the probability that $N = n$ when a total of
$a$ needs to be raised. Note that

$$P(1|a) = P(x_1 \geq a) = e^{-\lambda a}. \quad (3.234)$$

We find $P(n|a)$, $n > 1$, by conditioning on $x_1$. Given
that the first donor donated $x_1 = x < a$, $N$ is equal to
$n$ if and only if the remaining amount $a - x$ is raised by
the next $n - 1$ donors (and not by fewer than the next
$n - 1$ donors), an event that by definition has probability
$P(n - 1 | a - x)$. Since the probability density function of
$x_1$ is $f_{x_1}(x) = \lambda e^{-\lambda x}$, then, for $n > 1$, the law of total
probability yields

$$P(n|a) = \int_0^a P(n - 1 | a - x)\, f_{x_1}(x)\,dx = \int_0^a \lambda e^{-\lambda x} P(n - 1 | a - x)\,dx. \quad (3.235)$$

We will prove that

$$P(n|a) = e^{-\lambda a}\, \frac{(\lambda a)^{n-1}}{(n-1)!}, \quad \forall\, a > 0, \quad (3.236)$$

by induction on $n$. The base case $n = 1$ was already
established; see (3.234). Assume that (3.236) holds for some
$n \geq 1$; we will show that it also holds for $n + 1$.
From the induction hypothesis, we obtain that

$$P(n|a - x) = e^{-\lambda(a - x)}\, \frac{(\lambda(a - x))^{n-1}}{(n-1)!} \quad (3.237)$$

for all $0 < x < a$. From (3.235), it follows that

$$P(n + 1|a) = \int_0^a \lambda e^{-\lambda x} P(n | a - x)\,dx. \quad (3.238)$$

From (3.237) and (3.238), we find that

$$\begin{aligned}
P(n + 1|a) &= \int_0^a \lambda e^{-\lambda x}\, e^{-\lambda(a - x)}\, \frac{(\lambda(a - x))^{n-1}}{(n-1)!}\,dx \\
&= \frac{\lambda^{n} e^{-\lambda a}}{(n-1)!} \int_0^a (a - x)^{n-1}\,dx \\
&= e^{-\lambda a}\, \frac{(\lambda a)^{n}}{n!}.
\end{aligned}$$

We conclude that (3.236) holds for $n + 1$, and therefore
(3.236) is proved by induction.
From (3.236), it follows that $N$ has the same distribution
as $1 + M$, where $M$ has a Poisson distribution with
mean $\lambda a$. Then,

$$E[N] = 1 + \lambda a; \quad \mathrm{Var}(N) = \lambda a.$$

For our problem, $1/\lambda = \$20\text{K}$ and $a = \$100\text{K}$. Thus,
$E[N] = 1 + \lambda a = 6$ and $\mathrm{Var}(N) = \lambda a = 5$.
We conclude that the number of donors needed until at
least \$100K is collected has mean 6 and variance 5. □
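The shifted-Poisson result can be sanity-checked by simulating the donation process. In the sketch below, amounts are measured in thousands of dollars; the function name `simulate_donors`, the trial count, and the fixed seed are illustrative assumptions. It relies on `random.expovariate`, whose argument is the rate, i.e., one over the mean.

```python
import random


def simulate_donors(target: float = 100.0, mean_gift: float = 20.0,
                    trials: int = 100_000, seed: int = 0):
    """Return (sample mean, sample variance) of the number of donors needed."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        raised, donors = 0.0, 0
        while raised < target:
            raised += rng.expovariate(1.0 / mean_gift)   # exponential gift with the given mean
            donors += 1
        counts.append(donors)
    mean = sum(counts) / trials
    variance = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, variance


if __name__ == "__main__":
    print(simulate_donors())   # expected to be close to (6, 5)
```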

Question 27. Consider a random walk starting at 1 and


with equal probability of moving to the left or to the right
by one unit, and stopping either at 0 or at 3.
(i) What is the expected number of steps to do so?
(ii) What is the probability of the random walk ending at
3 rather than at 0?
Answer: Denote by $X_n^{\ell}$ the position of the random walk
at time $n$, where the superscript refers to the starting
position of the walk; for example, $X_0^{\ell} = \ell$. In this question,
we are concerned with $X_n^{1}$. For a random walk starting
at $\ell \in \{0, 1, 2, 3\}$, denote by $\tau^{\ell}$ the number of steps taken
by the random walk in order to reach either 0 or 3 for
the first time, and by $t_{\ell} = E[\tau^{\ell}]$ the expected number of
steps.
(i) We are looking to find $t_1$. We derive a recurrence
relation among the $t_{\ell}$'s as follows: for $\ell = 1, 2$,

$$\begin{aligned}
t_{\ell} &= E[\tau^{\ell}] \\
&= E[\tau^{\ell} \mid X_1^{\ell} = \ell - 1] \cdot P(X_1^{\ell} = \ell - 1) + E[\tau^{\ell} \mid X_1^{\ell} = \ell + 1] \cdot P(X_1^{\ell} = \ell + 1) \\
&= E[1 + \tau^{\ell - 1}] \cdot P(X_1^{\ell} = \ell - 1) \quad (3.239) \\
&\quad + E[1 + \tau^{\ell + 1}] \cdot P(X_1^{\ell} = \ell + 1) \quad (3.240) \\
&= (1 + t_{\ell - 1}) \cdot \frac{1}{2} + (1 + t_{\ell + 1}) \cdot \frac{1}{2}, \quad (3.241)
\end{aligned}$$

where the following identities were used to derive (3.239)
and (3.240):

$$E[\tau^{\ell} \mid X_1^{\ell} = \ell - 1] = E[1 + \tau^{\ell - 1}]; \quad (3.242)$$

$$E[\tau^{\ell} \mid X_1^{\ell} = \ell + 1] = E[1 + \tau^{\ell + 1}]. \quad (3.243)$$

Note that (3.242) and (3.243) follow from the fact that,
starting at $\ell$, once the random walk took its first step to
$\ell - 1$ (or $\ell + 1$), it becomes equivalent to a random walk
starting afresh at $\ell - 1$ (or $\ell + 1$). The plus 1 term on
the right hand side of (3.242) and (3.243) accounts for the
first step.
Since $t_0 = t_3 = 0$, by letting $\ell = 1$ and then $\ell = 2$ in
(3.241), we obtain the following linear system for $t_1$ and
$t_2$:

$$t_1 = \frac{1 + t_0}{2} + \frac{1 + t_2}{2} = 1 + \frac{t_2}{2};$$

$$t_2 = \frac{1 + t_1}{2} + \frac{1 + t_3}{2} = 1 + \frac{t_1}{2}.$$

Thus, $t_1 = 2$ and $t_2 = 2$.
We conclude that the expected number of steps before
stopping is 2.
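(ii) Denote by $p_{\ell}$ the probability that the random walk
reaches 3 before it reaches 0 when its position is at $\ell$, for
$\ell \in \{0, 1, 2, 3\}$. We need to find $p_1$. Denote by $\tau_0^{\ell}$ the first
time the walk reaches 0 when starting at $\ell$, and by $\tau_3^{\ell}$ the
first time the walk reaches 3 when starting at $\ell$. We derive
a recurrence relation for the $p_{\ell}$'s as follows: for $\ell = 1, 2$,

$$\begin{aligned}
p_{\ell} &= P(\tau_3^{\ell} < \tau_0^{\ell}) \\
&= P(\tau_3^{\ell} < \tau_0^{\ell} \mid X_1^{\ell} = \ell + 1) \cdot P(X_1^{\ell} = \ell + 1) + P(\tau_3^{\ell} < \tau_0^{\ell} \mid X_1^{\ell} = \ell - 1) \cdot P(X_1^{\ell} = \ell - 1) \\
&= P(\tau_3^{\ell + 1} < \tau_0^{\ell + 1}) \cdot P(X_1^{\ell} = \ell + 1) + P(\tau_3^{\ell - 1} < \tau_0^{\ell - 1}) \cdot P(X_1^{\ell} = \ell - 1) \\
&= \frac{1}{2}\, p_{\ell + 1} + \frac{1}{2}\, p_{\ell - 1}, \quad (3.244)
\end{aligned}$$

because the random walk is equally likely to move to the
left or to the right and, once it moved, the random walk
starts afresh.
Since $p_0 = 0$ and $p_3 = 1$, by letting $\ell = 1$ and then
$\ell = 2$ in (3.244), we obtain the following linear system for
$p_1$ and $p_2$:

$$p_1 = \frac{1}{2}\, p_2 + \frac{1}{2}\, p_0 = \frac{1}{2}\, p_2;$$

$$p_2 = \frac{1}{2}\, p_3 + \frac{1}{2}\, p_1 = \frac{1}{2} + \frac{1}{2}\, p_1.$$

Thus, $p_1 = \frac{1}{3}$ and $p_2 = \frac{2}{3}$.
We conclude that the probability of the random walk
ending at 3 rather than at 0 is $\frac{1}{3}$. □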
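Both answers can be cross-checked by simulating the walk. The sketch below is a hedged illustration (the function name `simulate_walk`, the trial count, and the seed are ours); it returns the average number of steps until absorption and the fraction of walks absorbed at 3.

```python
import random


def simulate_walk(start: int = 1, low: int = 0, high: int = 3,
                  trials: int = 100_000, seed: int = 0):
    """Return (average steps to absorption, fraction of walks ending at `high`)."""
    rng = random.Random(seed)
    total_steps = 0
    hits_high = 0
    for _ in range(trials):
        position, steps = start, 0
        while low < position < high:
            position += rng.choice((-1, 1))   # symmetric step left or right
            steps += 1
        total_steps += steps
        hits_high += position == high
    return total_steps / trials, hits_high / trials


if __name__ == "__main__":
    print(simulate_walk())   # expected to be close to (2, 1/3)
```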

Question 28. A stick of length 1 drops and breaks at


a random place uniformly distributed across the length.
What is the expected length of the smaller part?

Answer:
Solution 1: Treating the stick as an interval $[0,1]$, the
breakpoint $X$ becomes a random variable uniformly distributed
on $(0,1)$. Its probability density function $f_X(x)$
is 1, for $0 < x < 1$, and 0 otherwise. Denote by $L$ the
length of the smaller part. Then, $L = \min(X, 1 - X)$. We
conclude that

$$\begin{aligned}
E[L] &= \int_0^1 \min(x, 1 - x)\, f_X(x)\,dx \\
&= \int_0^1 \min(x, 1 - x)\,dx \\
&= \int_0^{1/2} x\,dx + \int_{1/2}^1 (1 - x)\,dx \\
&= \frac{1}{8} + \frac{1}{8} = \frac{1}{4}.
\end{aligned}$$
Solution 2: Treating the stick as an interval $[0,1]$, the
breakpoint $X$ becomes a random variable uniformly distributed
on $(0,1)$. Denote by $L$ the length of the smaller
part.
Let $A$ be the event that the breakpoint $X$ is in $(0, \frac{1}{2})$.
Then, $\bar{A}$ is the event that the breakpoint $X$ is in $(\frac{1}{2}, 1)$.
Clearly, $P(A) = P(\bar{A}) = \frac{1}{2}$. Given $A$, the length of the
smaller part is $X$. Given $\bar{A}$, the length of the smaller part
is $1 - X$.
The law of total probability yields

$$\begin{aligned}
E[L] &= E[L \mid A] \cdot P(A) + E[L \mid \bar{A}] \cdot P(\bar{A}) \\
&= E[X \mid A] \cdot \frac{1}{2} + E[1 - X \mid \bar{A}] \cdot \frac{1}{2} \\
&= E[X \mid A] \cdot \frac{1}{2} + \left(1 - E[X \mid \bar{A}]\right) \cdot \frac{1}{2}. \quad (3.245)
\end{aligned}$$

Note that, given $A$, $X$ is uniformly distributed on $(0, \frac{1}{2})$,
and, given $\bar{A}$, $X$ is uniformly distributed on $(\frac{1}{2}, 1)$. Since
the expectation of a random variable uniformly distributed
on an interval $(a, b)$ is equal to $\frac{a + b}{2}$, we obtain that

$$E[X \mid A] = \frac{1}{4} \quad \text{and} \quad E[X \mid \bar{A}] = \frac{3}{4}.$$

Then, (3.245) yields

$$E[L] = \frac{1}{4} \cdot \frac{1}{2} + \left(1 - \frac{3}{4}\right) \cdot \frac{1}{2} = \frac{1}{4}.$$
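The integral of min(x, 1 - x) over the unit interval can also be evaluated numerically. The sketch below uses a midpoint rule (the function name `expected_smaller_piece` and the default cell count are our own choices); with an even number of cells the kink at 1/2 falls on a cell boundary, so the rule is exact for this piecewise linear integrand up to floating-point rounding.

```python
def expected_smaller_piece(cells: int = 1000) -> float:
    """Midpoint-rule approximation of the integral of min(x, 1 - x) over (0, 1)."""
    width = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * width          # midpoint of the k-th cell
        total += min(x, 1.0 - x)
    return total * width


if __name__ == "__main__":
    print(expected_smaller_piece())
```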

Question 29. You are given a stick of unit length.


(i) The stick drops and breaks at two places. What is the
probability that the three pieces could form a triangle?
(ii) The stick drops and breaks at one place. Then the
larger piece is taken and dropped again, breaking at one
place. What is the probability that the three pieces could
form a triangle?
Answer: We offer two solutions for part (i); the second
solution will be used to solve part (ii).
(i) Solution 1: Denote by $X$ and $Y$ the two break points,
and assume $X$ and $Y$ are independent random variables
uniformly distributed on $(0,1)$. To form a triangle, the
sum of the lengths of any two pieces must be greater than
the length of the third piece. Equivalently, each piece
must be of length less than 1/2.
Assume that $Y > X$. Then, the lengths of the three
pieces are $X$, $Y - X$, and $1 - Y$. Each of these pieces is of
length less than 1/2 if and only if the point $(X, Y)$ belongs
to the region $\{(x, y) : x < 1/2,\ y - x < 1/2,\ 1 - y < 1/2,\ x \in (0,1),\ y \in (0,1),\ x < y\}$
in the unit square. Since the area
of this region is 1/8, the probability that $Y > X$ and the three pieces
form a triangle is 1/8. Symmetrically,
the probability that $Y < X$ and the three pieces form a triangle
is 1/8. The events $\{Y > X\}$ and $\{Y < X\}$ are
disjoint; hence, the probability of the three pieces forming
a triangle is 1/4.
(i) Solution 2: Consider an equilateral triangle $ABC$ with
height of length 1. Given a point $P$ in its interior, let $h_a$,
$h_b$, and $h_c$ be the lengths of the perpendiculars dropped
from $P$ to the sides $BC$, $CA$, and $AB$, respectively. Since
the areas of triangles $BPC$, $CPA$, and $APB$ sum up to
the area of $ABC$, we conclude that $h_a + h_b + h_c = 1$, and,
thus, this sum is independent of the position of $P$. Breaking a stick
of length 1 into three pieces of lengths $h_a$, $h_b$, and $h_c$ is
clearly equivalent to (uniquely) specifying a point $P$ in
the interior of the triangle $ABC$.
Connect the midpoints A’, B’, C’, of the sides of trian-
gle ABC to split it into four congruent equilateral trian-
gles, with the medial triangle A’B’C’ in the middle (see
Figure 3.3). Each piece of the broken stick has length
less than 1/2 if and only if the corresponding point P be-
longs to the medial triangle. Since the area of the medial
triangle is 1/4 of the area of triangle ABC, the desired
probability is 1/4.
(ii) Assume that the pieces have lengths $h$ and $(1 - h)$
after the first break, with $h < (1 - h)$, i.e., $h < 1/2$. With
$h$ fixed, the (larger) piece of length $1 - h$ is taken and
dropped again, breaking at one place uniformly at random.
The probability that the three pieces thus obtained
form a triangle clearly depends on $h$. It is close to 0 for $h$
close to 0, and it is close to 1 for $h$ close to 1/2.

Figure 3.3: Point $P$ inside the medial triangle $A'B'C'$ of
the equilateral triangle $ABC$.

More precisely, using the representation from the second
solution of part (i) above, the probability that the
three pieces form a triangle, given a fixed $h < 1/2$, is
equal to the probability that the point $P$, that lies on the
segment $UZ$ parallel to side $AB$ and at distance $h$ from
it, belongs to the segment $VW$, that is the intersection
of $UZ$ with the medial triangle $A'B'C'$ (see Figure 3.3).
Since $P$ is chosen (by the second break of the stick) from
$UZ$ uniformly at random, the probability that $P$ belongs
to $VW$ is equal to the ratio of their lengths, namely $\frac{VW}{UZ}$.
Next, we express this ratio in terms of $h$. First, note
that the medial triangle $A'B'C'$ has side length $AB/2$
and height 1/2. Since the triangles $ABC$ and $UZC$ are
similar, we have $\frac{UZ}{AB} = \frac{1 - h}{1}$. Since the triangles $WVC'$
and $A'B'C'$ are similar, we have $\frac{VW}{A'B'} = \frac{h}{1/2}$. Dividing
the last two equations, and using $A'B' = AB/2$, we obtain that

$$\frac{VW}{UZ} = \frac{h}{1 - h}.$$
Therefore, the probability that the first break occurs at some h < 1/2 and the three pieces form a triangle is

$$\int_0^{1/2} \frac{VW}{UZ}\,dh = \int_0^{1/2} \frac{h}{1-h}\,dh.$$

By using the substitution y = 1 − h, it follows that

$$\int_0^{1/2} \frac{h}{1-h}\,dh = -\frac{1}{2} + \int_{1/2}^{1} \frac{1}{y}\,dy = \ln 2 - \frac{1}{2}.$$

Since the probability that h < 1/2, where h is chosen uniformly at random from (0,1) (by the first break of the stick), is 1/2, and since the case h > 1/2 is symmetric, the total probability that the three pieces form a triangle is

$$\frac{\ln 2 - \frac{1}{2}}{\frac{1}{2}} = 2\ln 2 - 1 \approx 0.386.$$
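
As a sanity check on part (ii), the experiment can be simulated directly. The sketch below is only an illustration under stated assumptions: the function name `triangle_probability`, the seed, and the number of trials are our own choices and do not come from the text. It breaks a unit stick once, breaks the longer piece again, and compares the observed frequency with 2 ln 2 − 1 ≈ 0.386, the value the derivation above leads to.

```python
import math
import random


def triangle_probability(trials, seed=0):
    """Estimate P(triangle) when the longer piece of a unit stick is re-broken."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = rng.random()                      # first break point on the unit stick
        short, long_ = sorted((first, 1.0 - first))
        second = rng.random() * long_             # break the longer piece uniformly
        pieces = (short, second, long_ - second)
        # Three lengths summing to 1 form a triangle iff each one is below 1/2.
        if max(pieces) < 0.5:
            hits += 1
    return hits / trials


estimate = triangle_probability(200_000)
exact = 2 * math.log(2) - 1
print(f"simulated: {estimate:.4f}, closed form: {exact:.4f}")
```

A Monte Carlo estimate only agrees with the closed form up to sampling noise, so the comparison should be read as a plausibility check rather than a proof.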

Question 30. Why is a manhole cover round?

Answer: A circle is the shape with minimal surface given a required minimal width in any direction. Moreover, the cover of a round manhole cannot fall through the hole. If the manhole were square, its cover turned on its edge could fall through the hole, since the diagonal of a square is $\sqrt{2}$ times as long as its edge. □

Question 31. When is the first time after 12 o’clock that the hour and minute hands of a clock meet again?

Answer: The minute hand moves at a speed of 360 degrees per hour, while the hour hand moves at a speed of 30 degrees per hour. They start together at 12 o’clock. The first time they meet, the minute hand has made one full rotation more than the hour hand, which is the same as 360 degrees more than the hour hand. If t denotes the time (measured in hours) until the two hands meet again, this can be written as

$$360 \cdot t = 30 \cdot t + 360.$$

Thus, $t = \frac{12}{11}$ hours, which is approximately 1 hour, 5 minutes, and 27 seconds. □
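
The arithmetic can be reproduced exactly with rational numbers. This is a minimal sketch; the constant names are ours and are used only for illustration. It solves 360 t = 30 t + 360 for t and converts the result to hours, minutes, and seconds.

```python
from fractions import Fraction

MINUTE_HAND_SPEED = 360  # degrees per hour
HOUR_HAND_SPEED = 30     # degrees per hour

# Solve 360 t = 30 t + 360 exactly for t (measured in hours).
t = Fraction(360, MINUTE_HAND_SPEED - HOUR_HAND_SPEED)

# Convert t to whole hours, whole minutes, and (fractional) seconds.
total_seconds = t * 3600
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(t, int(hours), int(minutes), float(seconds))
```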

Question 32. Three light switches are in one room, and they control three light bulbs in another room. How do you figure out which switch turns on which bulb in one shot?

Answer: Turn on two switches for a couple of minutes, and then turn one of the switches off and go into the other room. The bulb that is lit corresponds to the switch that is still on; the bulb that is not lit but is hot corresponds to the switch that was turned on and then turned off; the bulb that is not lit and is cold corresponds to the switch that was never turned on. □

Question 33. The number $2^{29}$ has 9 digits, all different. Without computing $2^{29}$, find the missing digit.

Answer: For any positive integer n, denote by D(n) the sum of the digits of n. Recall that the difference between a number and the sum of its digits is divisible by 9, i.e.,

$$9 \mid n - D(n).$$

Thus, for $n = 2^{29}$, it follows that

$$9 \mid 2^{29} - D(2^{29}). \tag{3.246}$$

We are given that $2^{29}$ has 9 digits, and that all 9 digits are different. Denote by x the missing digit. Then,

$$D(2^{29}) = \sum_{j=0}^{9} j - x = 45 - x. \tag{3.247}$$

From (3.246) and (3.247), it follows that

$$9 \mid 2^{29} - (45 - x). \tag{3.248}$$

Note that

$$2^{29} = 2^5 \cdot 2^{24} = 2^5 \cdot 64^4 = 2^5 \cdot (63+1)^4 = 2^5 \cdot (63 \cdot k + 1) = 2^5 \cdot 63 \cdot k + 2^5, \tag{3.249}$$

where k is a positive integer.²⁰
From (3.249), we find that

$$2^{29} - 2^5 = 2^5 \cdot 63 \cdot k,$$

and therefore

$$9 \mid 2^{29} - 2^5. \tag{3.250}$$

From (3.248) and (3.250), it follows that

$$9 \mid (2^{29} - 2^5) - (2^{29} - (45 - x)) = (45 - x) - 2^5 = 13 - x.$$

Since $9 \mid 13 - x$ and x is a digit, we conclude that x = 4. In other words, we identified that x, the missing digit from $2^{29}$, must be 4.
Indeed, $2^{29} = 536870912$, i.e., $2^{29}$ has 9 digits, all different, and 4 is not a digit of $2^{29}$.

²⁰ It is easy to see that

$$(63+1)^4 = 63^4 + 4 \cdot 63^3 + 6 \cdot 63^2 + 4 \cdot 63 + 1 = 63 \cdot (63^3 + 4 \cdot 63^2 + 6 \cdot 63 + 4) + 1 = 63 \cdot k + 1,$$

where $k = 63^3 + 4 \cdot 63^2 + 6 \cdot 63 + 4$.

For completeness, we include here a proof of the fact that the difference between a number and the sum of its digits is divisible by 9, i.e.,

$$9 \mid n - D(n).$$

If the digits of n are $a_k, a_{k-1}, \ldots, a_1, a_0$ (from left to right), then

$$n = a_k \cdot 10^k + a_{k-1} \cdot 10^{k-1} + \ldots + a_1 \cdot 10 + a_0;$$

$$D(n) = a_k + a_{k-1} + \ldots + a_1 + a_0.$$

Hence,

$$n - D(n) = \sum_{i=0}^{k} a_i \cdot (10^i - 1).$$

Since $10^i - 1$ is an i-digit number with all digits equal to 9, it follows that $9 \mid 10^i - 1$, for all $i = 1 : k$, and therefore $9 \mid n - D(n)$. □
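
The modular argument translates into a few lines of code. The sketch below is illustrative only; the helper name `missing_digit_candidates` is our own. It uses the residue of the power modulo 9 to find every digit x with 9 | (45 − x) − 2^29, and then checks the result against the actual digits of 2^29 (which the question asks us not to compute by hand, but which a program can inspect).

```python
def missing_digit_candidates(base, exponent):
    """Digits x with 9 | (45 - x) - base**exponent, assuming nine distinct digits."""
    residue = pow(base, exponent, 9)  # base**exponent mod 9, without the full power
    return [x for x in range(10) if (45 - x - residue) % 9 == 0]


candidates = missing_digit_candidates(2, 29)

# Direct inspection of the decimal digits, used only as a cross-check.
digits_present = set(str(2 ** 29))
missing_by_inspection = sorted(set("0123456789") - digits_present)

print(candidates, missing_by_inspection)
```

Note that the mod 9 test cannot distinguish 0 from 9, so for other powers the candidate list may contain both; here the residue singles out one digit.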

Question 34. Alice and Bob stand at opposite ends of a straight line segment. Bob sends 50 ants towards Alice, one after another. Alice sends 20 ants towards Bob. All ants travel along the straight line segment. Whenever two ants collide, they simply bounce back and start traveling in the opposite direction. How many ants reach Bob and how many ants reach Alice? How many ant collisions take place?

Answer: Imagine that when two ants meet, they switch identities. Hence, even after a collision, two ants are traveling in two opposite directions. It follows that 20 ants reach Bob, while 50 ants reach Alice.
To calculate the number of ant collisions, imagine that each ant carries a message. In other words, Bob sends 50 messages to Alice, one message per ant. Similarly, Alice sends 20 messages to Bob, one message per ant. Furthermore, imagine that the two ants swap messages when they collide. Then a message always makes forward progress. Each of Alice’s messages goes through 50 ant collisions. Each of Bob’s messages goes through 20 ant collisions. The total number of collisions is 50 times 20, which is 1000 collisions. □
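
The count can also be checked with a small discrete model. The sketch below is a simplification that we assume for illustration, not the book's method: ants are listed in their order along the segment (Alice on the left, Bob on the right), `>` marks an ant heading towards Bob and `<` an ant heading towards Alice, and a collision is any adjacent pair `> <`, after which both ants reverse. It also assumes that all 70 ants are on the segment before any ant arrives, as the message argument implicitly does.

```python
def count_collisions(n_from_alice, n_from_bob):
    """Count bounces in a 1-D line of ants; returns (collisions, reach_alice, reach_bob)."""
    # Alice's ants start on the left heading right; Bob's start on the right heading left.
    line = [">"] * n_from_alice + ["<"] * n_from_bob
    collisions = 0
    changed = True
    while changed:
        changed = False
        for i in range(len(line) - 1):
            if line[i] == ">" and line[i + 1] == "<":
                # The two ants bounce: each one reverses direction.
                line[i], line[i + 1] = "<", ">"
                collisions += 1
                changed = True
    # Ants heading left end up at Alice, ants heading right end up at Bob.
    return collisions, line.count("<"), line.count(">")


print(count_collisions(20, 50))
```

In this model every collision removes exactly one facing pair, so the total equals the number of such pairs at the start, which is the product of the two group sizes.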

Question 35. There are 20 people at a party. Everyone writes down their name on a piece of paper and throws it in a bag. We shake up the bag and each person draws one name from the bag. You are in the same group as the person you have drawn.
For example, if people labeled 1 through 20 drew the following names from the bag:

Person:  1   2   3   4   5   6   7   8   9  10
Draws:   6   5   3   8  11  10   9  13  15  12

Person: 11  12  13  14  15  16  17  18  19  20
Draws:  14  18   4  20  17  16   7  19   1   2

the groups that form are

(1, 6, 10, 12, 18, 19), (2, 5, 11, 14, 20), (3),
(4, 8, 13), (7, 9, 15, 17), (16).

What is the expected number of groups?
Answer:
Solution 1: (Due to Zhaofeng Brent Liao, Baruch MFE’19.) Consider the general case when there are n people at the party. Let the random variable $X_n$ denote the number of groups that are formed for a party with n people. Thus, the expected number of groups is $E[X_n]$.
Once the first person, denoted by A, draws a name from the bag, there are two possible scenarios:
• Scenario 1: A draws her own name from the bag; this happens with probability $\frac{1}{n}$. Then, A forms a group by herself which is closed, i.e., nobody else will be able to join this group. The remaining n − 1 people will form groups amongst themselves and the problem reduces to the case of a party with n − 1 people, with one additional group formed already.
• Scenario 2: A draws someone else’s name, denoted by B; this happens with probability $\frac{n-1}{n}$. In this case, A and B form a group that is still open, i.e., B could draw the name of another person different from A that would then join the group. Therefore, we can think of A and B as being one person for the purpose of counting the number of groups. The problem reduces to the case of a party with n − 1 people with no group formed yet.
We obtain the following recursion formula:

$$E[X_n] = \frac{1}{n}\left(1 + E[X_{n-1}]\right) + \frac{n-1}{n}\,E[X_{n-1}] = E[X_{n-1}] + \frac{1}{n}, \quad \forall\, n \geq 2. \tag{3.251}$$

Since $E[X_1] = 1$ (there is only one group formed at a party with one person), the solution to the recursion (3.251) is

$$E[X_n] = \sum_{k=1}^{n} \frac{1}{k}, \quad \forall\, n \geq 1.$$

For n = 20, we obtain that

$$E[X_{20}] = \sum_{k=1}^{20} \frac{1}{k} \approx 3.598.$$

We conclude that the expected number of groups at a party with 20 people is approximately 3.6.
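
The recursion (3.251) is easy to iterate exactly. The following is a minimal sketch with a function name of our own choosing (`expected_groups`); it starts from the value 1 for a party of one person and adds 1/m at each step, using exact rational arithmetic so that no rounding enters before the final conversion.

```python
from fractions import Fraction


def expected_groups(n):
    """Expected number of groups for n people, via E[X_m] = E[X_{m-1}] + 1/m."""
    expectation = Fraction(1)          # a party of one person forms exactly one group
    for m in range(2, n + 1):
        expectation += Fraction(1, m)  # one step of the recursion
    return expectation


e20 = expected_groups(20)
print(e20, round(float(e20), 3))
```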
Solution 2: By looking at the example, we note that the groups formed, i.e.,

(1, 6, 10, 12, 18, 19), (2, 5, 11, 14, 20), (3),
(4, 8, 13), (7, 9, 15, 17), (16),

correspond to the cycles of the permutation

(6, 5, 3, 8, 11, 10, 9, 13, 15, 12, 14, 18, 4, 20, 17, 16, 7, 19, 1, 2),

with each cycle written in the clockwise direction.
Thus, the expected number of groups is the expected number of cycles in a random permutation π, chosen uniformly at random from the set of all 20! permutations of {1, 2, …, 20}.
In other words, if the random variable X denotes the number of cycles in the permutation π, we need to find E[X].
For 1 ≤ i ≤ 20, let $X_i$ be the random variable given by $X_i = \frac{1}{k}$ if k is the length of the cycle of π containing person i. Note that $X = \sum_{i=1}^{20} X_i$, since the total contribution from all the people in a cycle of length k is 1, and therefore

$$E[X] = \sum_{i=1}^{20} E[X_i]. \tag{3.252}$$

Note that

$$E[X_i] = \sum_{k=1}^{20} \frac{1}{k} \cdot P(\text{person } i \text{ is in a cycle of length } k). \tag{3.253}$$

We compute the probability that person i is in a cycle of length k as follows: Given 1 ≤ k ≤ 20, there are $\binom{19}{k-1}$ ways to choose k − 1 people, other than person i, for the cycle of length k; (k − 1)! ways to permute these k − 1 chosen people, together with person i, to form a cycle of length k; and (20 − k)! ways to permute the remaining 20 − k people. Thus, the number of permutations of {1, 2, …, 20} in which person i belongs to a cycle of length k, 1 ≤ k ≤ 20, is equal to

$$\binom{19}{k-1} \cdot (k-1)! \cdot (20-k)! = 19!,$$

which is independent of k and i, and we obtain that

$$P(\text{person } i \text{ is in a cycle of length } k) = \frac{19!}{20!} = \frac{1}{20}. \tag{3.254}$$

From (3.253) and (3.254), we find that²¹

$$E[X_i] = \frac{1}{20} \sum_{k=1}^{20} \frac{1}{k} = \frac{H_{20}}{20}, \tag{3.255}$$

where

$$H_{20} = \sum_{k=1}^{20} \frac{1}{k}$$

denotes the 20-th harmonic number.
From (3.252) and (3.255), we obtain that the expected number of cycles in a permutation (and therefore the expected number of groups that are formed at the party) is

$$E[X] = \sum_{i=1}^{20} \frac{H_{20}}{20} = H_{20}.$$

To find a numerical value, recall that

$$H_n = \ln n + \gamma + \frac{1}{2n} + O\!\left(\frac{1}{n^2}\right)$$

as n → ∞, where γ ≈ 0.5772 is the Euler constant. Then, $H_{20} \approx 3.598$, and we conclude that the expected number of groups at this party is approximately 3.6. □

²¹ Note that, given the definition of $X_i$, $E[X_i]$ must be independent of i.
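
The correspondence between groups and permutation cycles can be checked by simulation. The sketch below is illustrative and rests on assumptions of ours: the names `count_groups` and `average_groups`, the seed, and the trial count are not from the text, and the `example` list is a permutation chosen to be consistent with the six groups listed in the example. It counts cycles by following each person to the name they drew, and averages the count over random shuffles.

```python
import random


def count_groups(draws):
    """Count cycles of a permutation given as a dict: person -> name drawn."""
    seen = set()
    groups = 0
    for person in draws:
        if person in seen:
            continue
        groups += 1                     # a new, not yet visited cycle starts here
        current = person
        while current not in seen:
            seen.add(current)
            current = draws[current]    # follow the chain of drawn names
    return groups


# Assumed example: person i draws example[i - 1]; consistent with the listed groups.
example = [6, 5, 3, 8, 11, 10, 9, 13, 15, 12, 14, 18, 4, 20, 17, 16, 7, 19, 1, 2]
example_draws = dict(enumerate(example, start=1))


def average_groups(n, trials, seed=0):
    """Average number of groups over random draws for a party of n people."""
    rng = random.Random(seed)
    people = list(range(1, n + 1))
    total = 0
    for _ in range(trials):
        names = people[:]
        rng.shuffle(names)              # uniformly random permutation of the names
        total += count_groups(dict(zip(people, names)))
    return total / trials


print(count_groups(example_draws), average_groups(20, 20_000))
```

The simulated average should be close to the 20-th harmonic number, up to sampling noise.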

Question 36. Let A be the sum of the digits of 2019!. Let B be the sum of the digits of A. Let C be the sum of the digits of B. Find C.

Answer: Note that

$$2019! < 10000^{2019} = (10^4)^{2019} = 10^{8076},$$

and therefore 2019! has at most 8076 digits. Then, the number A, which is the sum of the digits of 2019!, is at most 8076 × 9 = 72684. Thus, the number A has at most 5 digits. In turn, the number B, being the sum of the digits of A, is at most 5 × 9 = 45, and therefore B has at most 2 digits and is less than 50. We conclude that the number C, being the sum of the digits of B, is at most 4 + 9 = 13.
Recall that a number is divisible by 9 if and only if the sum of its digits is divisible by 9. Since 2019! is divisible by 9, it follows that the numbers A, B, and C are also divisible by 9. Since C is positive and C ≤ 13, we obtain that

C = 9. □
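
A direct computation confirms both the bounds and the final value. This is a sketch for verification only; the helper `digit_sum` is our own and extracts digits with repeated division rather than string conversion, since some Python versions cap the length of integer-to-string conversions by default.

```python
import math


def digit_sum(n):
    """Sum of the decimal digits of a non-negative integer, without using str()."""
    total = 0
    while n:
        n, digit = divmod(n, 10)
        total += digit
    return total


A = digit_sum(math.factorial(2019))
B = digit_sum(A)
C = digit_sum(B)
print(A, B, C)
```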

Question 37. Find 2019 consecutive positive integers that are not prime.

Answer: The numbers

2020! + 2, 2020! + 3, …, 2020! + 2020

form a sequence of 2019 consecutive positive integers that are divisible by 2, 3, …, 2020, respectively, since

$$i \mid (2020! + i), \quad \forall\, 2 \leq i \leq 2020,$$

and therefore are not prime (each number is larger than its divisor i).
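
The divisibility claim can be verified mechanically with exact integer arithmetic. The snippet is a small check written for illustration; the variable names are ours.

```python
import math

N = 2020
base = math.factorial(N)

# For each i in 2..2020, the number 2020! + i is divisible by i and larger than i,
# so it has a proper divisor and cannot be prime.
all_composite = all(
    (base + i) % i == 0 and base + i > i
    for i in range(2, N + 1)
)
count = len(range(2, N + 1))

print(all_composite, count)
```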

Question 38. Exactly 4 out of 100 coins are fake. All the genuine coins weigh the same; all the fake coins, too. A fake coin is lighter than a genuine coin. How can you find at least one genuine coin using a balance scale only twice?

Answer: Divide the 100 coins arbitrarily into 3 piles, labeled A, B, and C, of 33 coins, 33 coins, and 34 coins, respectively.
In the first weighing on the balance scale, weigh pile A against pile B. There are two different cases depending on the result of the first weighing:
• Case 1: If pile A is lighter than pile B (the case when pile B is the lighter pile is handled similarly by symmetry), then, since a fake coin is lighter than a genuine coin, pile A contains more fake coins than pile B. The number of fake coins in the piles A and B, out of the total of 4 fake coins, can only be as follows: (4,0), (3,0), (3,1), (2,0), (2,1), or (1,0). Thus, pile B has at most one fake coin. For the second weighing on the balance scale, we select arbitrarily two coins from pile B and weigh them against each other, and we find at least one genuine coin as follows:
Case 1.1: If the coins are in balance, both coins are genuine.
Case 1.2: If the coins are not in balance, the heavier coin is genuine.
• Case 2: If pile A weighs the same as pile B, the piles A and B contain the same number of fake coins, and therefore the number of fake coins in the piles A, B, and C can only be as follows: (0,0,4), (1,1,2), or (2,2,0). For the second weighing on the balance scale, we remove an arbitrary coin r from pile A and proceed to weigh pile B together with coin r (containing 34 coins) against pile C (also containing 34 coins). We find at least one genuine coin as follows:
Case 2.1: If pile B together with coin r is heavier than pile C, then pile B together with coin r has fewer fake coins (which are lighter) than pile C. This would happen if the initial distribution of the fake coins across the piles A, B, and C was (0,0,4), in which case pile A contained only genuine coins and therefore coin r was genuine, or if the initial distribution of the fake coins across the piles A, B, and C was (1,1,2) and coin r, which was added to pile B before the second weighing, was genuine. In either case, coin r is genuine.
Case 2.2: If pile B together with coin r is lighter than pile C, then pile B together with coin r has more fake coins (which are lighter) than pile C. This could only happen if the initial distribution of the fake coins across the piles A, B, and C was (2,2,0), which means that all the coins in pile C are genuine.
Case 2.3: If pile B together with coin r weighs the same as pile C, then pile B together with coin r has the same number of fake coins as pile C. This could only happen if the initial distribution of the fake coins across the piles A, B, and C was (1,1,2) and coin r was fake, resulting in exactly two fake coins in pile B together with coin r and in pile C. In other words, pile A had 1 fake coin, r, and this coin was removed and placed with pile B. All the remaining coins in pile A are genuine.
Thus, in all of the cases, we find at least one genuine coin with only two weighings on a balance scale.
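
The case analysis above can be turned into a procedure and tested on random placements of the fake coins. The sketch below is one possible encoding under assumptions of ours: the function name `find_genuine`, the particular coins picked from each pile (the first one or two of the pile), and the weights 10 and 9 for genuine and fake coins are illustrative choices only. The procedure looks at the weights solely through two balance comparisons.

```python
import random


def find_genuine(weights):
    """Return the index of a coin that must be genuine, using two weighings."""
    A = list(range(0, 33))
    B = list(range(33, 66))
    C = list(range(66, 100))
    weighings = []

    def weigh(left, right):
        # Balance scale: -1 if left is lighter, 0 if balanced, 1 if left is heavier.
        weighings.append((left, right))
        l = sum(weights[i] for i in left)
        r = sum(weights[i] for i in right)
        return (l > r) - (l < r)

    first = weigh(A, B)
    if first != 0:
        # Case 1: the heavier pile contains at most one fake coin.
        heavier = B if first < 0 else A
        x, y = heavier[0], heavier[1]
        second = weigh([x], [y])
        result = y if second < 0 else x   # balanced: both genuine; else the heavier
    else:
        # Case 2: move one coin r from pile A next to pile B, compare with pile C.
        r = A[0]
        second = weigh(B + [r], C)
        if second > 0:
            result = r        # Case 2.1: coin r is genuine
        elif second < 0:
            result = C[0]     # Case 2.2: every coin in pile C is genuine
        else:
            result = A[1]     # Case 2.3: the rest of pile A is genuine
    assert len(weighings) == 2
    return result


rng = random.Random(0)
fakes = set(rng.sample(range(100), 4))
coins = [9 if i in fakes else 10 for i in range(100)]
print(coins[find_genuine(coins)] == 10)
```

A random test of this kind does not replace the case analysis; it only checks that the encoding agrees with it on the sampled placements.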

Question 39. Can you design a pair of 6-sided non-identical fair dice different from the standard dice with each face bearing a positive integer and having the same probability distribution for the sum as the pair of standard dice? (In other words, there must be two ways to roll a 3, six ways to roll a 7, one way to roll a 12, and so forth.)

Answer: Yes! We will use the method of generating functions to find the two dice. Let the non-standard dice A and B have faces $(a_1, a_2, a_3, a_4, a_5, a_6)$ and $(b_1, b_2, b_3, b_4, b_5, b_6)$, respectively, where $a_i$ and $b_i$ are positive integers. Let

$$a(x) = x^{a_1} + x^{a_2} + x^{a_3} + x^{a_4} + x^{a_5} + x^{a_6} \tag{3.256}$$

$$b(x) = x^{b_1} + x^{b_2} + x^{b_3} + x^{b_4} + x^{b_5} + x^{b_6} \tag{3.257}$$

be the generating functions for the number rolled on dice A and B, respectively. When the product

$$a(x)b(x) = \left(x^{a_1} + x^{a_2} + x^{a_3} + x^{a_4} + x^{a_5} + x^{a_6}\right)\left(x^{b_1} + x^{b_2} + x^{b_3} + x^{b_4} + x^{b_5} + x^{b_6}\right)$$

is expanded, the coefficient of $x^n$ is exactly the number of ways the non-standard dice A and B can have a sum equal to n. If the sum of the non-standard dice has the same distribution as for two standard dice, the product a(x)b(x) must be equal to the product of the generating functions for two standard dice, i.e.,

$$\begin{aligned}
a(x)b(x) &= \left(x + x^2 + x^3 + x^4 + x^5 + x^6\right)^2 \\
&= \left[x\left(1 + x + x^2 + x^3 + x^4 + x^5\right)\right]^2 \\
&= \left[x\,(1 + x^3)(1 + x + x^2)\right]^2 \\
&= \left[x\,(1 + x)(1 - x + x^2)(1 + x + x^2)\right]^2.
\end{aligned}$$


We now identify how to distribute the factors from the product above between the polynomials a(x) and b(x).
Since each die has positive integers on its faces, x divides both a(x) and b(x). Thus, a(x) and b(x) must each get one of the two x factors.
Note that a(1) = b(1) = 6 since each die has 6 faces; see also (3.256) and (3.257). Since $(1 + x)|_{x=1} = 2$, $(1 - x + x^2)|_{x=1} = 1$, and $(1 + x + x^2)|_{x=1} = 3$, it follows that a(x) and b(x) must each get one of the two $(1 + x)$ factors and one of the two $(1 + x + x^2)$ factors.
This leaves only the two $(1 - x + x^2)$ factors to be distributed. If we distribute one of the factors $(1 - x + x^2)$ to each of a(x) and b(x), then a(x) = b(x) and the dice A and B would be identical and the same as the standard dice. We therefore give both $(1 - x + x^2)$ factors to the generating function of one of the dice, for example to die B. Then,

$$\begin{aligned}
a(x) &= x\,(1 + x)(1 + x + x^2) \\
&= x + 2x^2 + 2x^3 + x^4 \\
&= x^1 + x^2 + x^2 + x^3 + x^3 + x^4; \\
b(x) &= x\,(1 + x)(1 + x + x^2)(1 - x + x^2)^2 \\
&= x^1 + x^3 + x^4 + x^5 + x^6 + x^8.
\end{aligned}$$

We conclude that the sum of the non-standard dice A and B with faces (1, 2, 2, 3, 3, 4) and (1, 3, 4, 5, 6, 8), respectively, has the same probability distribution as the sum of a pair of standard dice. Note that the dice A and B are the only non-standard dice with this property. □
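
The factor assignment can be replayed with explicit polynomial arithmetic. The sketch below is illustrative; the helpers `poly_mul`, `faces`, and `sum_distribution` are our own names. Polynomials are stored as coefficient lists with the lowest degree first, the two generating functions are built from their factors, the faces are read off the coefficients, and the distribution of the sum is compared with that of two standard dice.

```python
from collections import Counter
from itertools import product


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def faces(poly):
    """Read die faces off a generating function: exponent n repeated coeff(n) times."""
    return [n for n, coeff in enumerate(poly) for _ in range(coeff)]


def sum_distribution(die_a, die_b):
    """Number of ways to obtain each sum when rolling the two dice."""
    return Counter(a + b for a, b in product(die_a, die_b))


X = [0, 1]                  # x
ONE_PLUS_X = [1, 1]         # 1 + x
PLUS = [1, 1, 1]            # 1 + x + x^2
MINUS = [1, -1, 1]          # 1 - x + x^2

a_poly = poly_mul(poly_mul(X, ONE_PLUS_X), PLUS)     # x (1 + x)(1 + x + x^2)
b_poly = poly_mul(poly_mul(a_poly, MINUS), MINUS)    # a(x) times (1 - x + x^2)^2

die_a, die_b = faces(a_poly), faces(b_poly)
standard = [1, 2, 3, 4, 5, 6]
same = sum_distribution(die_a, die_b) == sum_distribution(standard, standard)

print(die_a, die_b, same)
```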