Calculus of Finance
The Calculus of Finance
Amber Habib
Mathematical Sciences Foundation
New Delhi
THE CALCULUS OF FINANCE
Registered Office
3-6-747/1/A & 3-6-754/1, Himayatnagar, Hyderabad 500 029
(Telangana), INDIA
e-mail: [email protected]
Distributed by
Orient Blackswan Private Limited
Registered Office
3-6-752 Himayatnagar, Hyderabad 500 029 (Telangana), INDIA
e-mail: [email protected]
Other Offices
Bengaluru, Bhopal, Chennai, Guwahati, Hyderabad, Jaipur, Kolkata,
Lucknow, Mumbai, New Delhi, Noida, Patna, Visakhapatnam
eISBN 9789389211023
Preface
List of Notation
1 Basic Concepts
1.1 Arbitrage
1.2 Return and Interest
1.3 The Time Value of Money
1.4 Bonds, Shares and Indices
1.5 Models and Assumptions
2 Deterministic Cash Flows
2.1 Net Present Value
2.2 Internal Rate of Return
2.3 A Comparison of IRR and NPV
2.4 Bonds: Price and Yield
2.5 Clean and Dirty Price
2.6 Price–Yield Curves
2.7 Duration
2.8 Term Structure of Interest Rates
2.9 Immunisation
2.10 Convexity
2.11 Callable Bonds
3 Random Cash Flows
3.1 Random Returns
3.2 Portfolio Diagrams and Efficiency
3.3 Feasible Set
3.4 Markowitz Model
3.5 Capital Asset Pricing Model
3.6 Diversification
3.7 CAPM as a Pricing Formula
3.8 Numerical Techniques
4 Forwards and Futures
4.1 Forwards and Futures
4.2 Forward and Futures Price
4.3 Value of a Futures Contract
4.4 Method of Replicating Portfolios
4.5 Hedging with Futures
4.6 Currency Futures
4.7 Stock Index Futures
5 Stock Price Models
5.1 Lognormal Model
5.2 Geometric Brownian Motion
5.3 Suitability of GBM for Stock Prices
5.4 Binomial Tree Model
6 Options
6.1 Call Options
6.2 Put Options
6.3 Put–Call Parity
6.4 Binomial Options Pricing Model
6.5 Pricing American Options
6.6 Factors Influencing Option Premiums
6.7 Options on Assets with Dividends
6.8 Dynamic Hedging
6.9 Risk-Neutral Valuation
7 The Black–Scholes Model
7.1 Risk-Neutral Valuation
7.2 The Black–Scholes Formula
7.3 Options on Futures
7.4 Options on Assets with Dividends
7.5 Black–Scholes and BOPM
7.6 Implied Volatility
7.7 Dynamic Hedging
7.8 The Greeks
7.9 The Black–Scholes PDE
7.10 Speculating with Options
8 Value at Risk
8.1 Definition of VaR
8.2 Linear Model
8.3 Quadratic Model
8.4 Monte Carlo Simulation
8.5 The Martingale
Appendix A: Calculus
A.1 One Variable Calculus
A.2 Partial Derivatives
A.3 Lagrange Multipliers Method
A.4 Differentiating under the Integral Sign
A.5 Double Integrals
Appendix B: Probability and Statistics
B.1 Basic Probability
B.2 Random Variables
B.3 Cumulative Distribution Function
B.4 Binomial Random Variable
B.5 Normal Random Variable
B.6 Expectation and Variance
B.7 Lognormal Random Variable
B.8 Cauchy Random Variable
B.9 Bivariate Distributions
B.10 Conditional Probability
B.11 Independence
B.12 Multivariate Distributions
B.13 Covariance Matrix
B.14 Linear Regression and Least Squares
B.15 Random Sampling
B.16 Sample Mean, Variance and Covariance
B.17 Central Limit Theorem
B.18 Stable Distributions
B.19 Data Fitting
B.20 Monte Carlo Simulation
Appendix C: Solutions to Selected Exercises
Bibliography
Preface
Mathematics has always enjoyed a close relationship with financial
matters. Early developments in arithmetic owed much to the needs
of accounting, and even geometry was influenced by the need of the
State to measure area to fix taxes. While economics deals with the
general issues regarding money and its place in society, finance has
a narrower aim: how should we invest our money to make it grow the
most? This sharper focus makes it more tractable to mathematical
treatment. It is exciting that relatively elementary mathematics can
lead to quite deep results in finance, including work that has won
the Nobel Prize. At the same time, the problems of finance have
helped motivate new mathematics of the highest order. Mathematical
finance offers a new solution to the perennial problem
mathematicians face—convincing people that our work has some
significance for society.
This book will introduce you to the basic concepts and products of
modern finance. The emphasis is not so much on the details of the
financial world as on the basic principles by which we seek to
understand it. Thus the aim of the book is to teach you how to think
about finance. This seems particularly pertinent in the context of
market upheavals that appear to be caused, to a fair extent, by the
careless application of mathematical tools to the creation and pricing
of complicated contracts. The carelessness stems from a lack of
intuition or regard for the importance of the assumptions underlying
the models, leading to incorrect evaluation of risk. As this book will
show you, the understanding and quantification of risk is the central
problem of finance.
The book is based on material developed for the courses in
mathematical finance of the Mathematical Sciences Foundation,
Delhi. (MSF’s website is www.mathscifound.org) These courses are
mainly aimed at undergraduates, but also attract students of
professional courses as well as those in employment. They are
taught portfolio analysis and financial derivatives, with the highlights
being the Markowitz Model, the Capital Asset Pricing Model and the
Black–Scholes approach to options pricing. The students come from
a wide variety of backgrounds and so the required mathematics is
also taught in parallel to the material on finance. We emphasise
hands-on work, with extensive lab as well as student projects where
theory is applied to (and tested by) real-life data.
I have tried to retain the flavour of the MSF programs in that the
book should be accessible to undergraduates and others of varying
backgrounds. (Exposure to basic calculus and probability is all that is
required. No prior knowledge of economics or finance is needed.)
For those who have not taken any probability or calculus after high
school, the required mathematics is described in fair detail in the
Appendix. The book is peppered with examples that use real-life
data to ground the theory. Exercises are also scattered through the
text; their purpose varies from simple practice in applying formulas
to extending the ideas in the text to new situations.
The numbering of the exercises and examples needs some
explanation. Exercise 1.4.2 will be found in Chapter 1, Section 4.
Examples share the same numbering scheme with Exercises, so
that Example 1.4.3 is found in Chapter 1, Section 4, just after
Exercise 1.4.2.
As you read the book, you will notice that not many references
have been provided to the original sources. One reason is that it is
not always clear who first had a certain idea, or how much credit
should be given to the person who put it in its final or most popular
form. For instance, a technique may be used, perhaps implicitly, by
traders and investors long before it gets academic treatment and
acquires a provenance. A large and contentious book could be
written on the many claims to originality (one already has been:
Rubinstein [42]). The best thing would be for you to follow up this
book with one or more of the detailed texts on finance, for example,
Bodie, Kane and Marcus [7], Brealey and Myers [8], Cuthbertson and
Nitzsche [16], Hull [26], Luenberger [28], and Sharpe, Alexander and
Bailey [45]. For more on Mathematical Finance, the following books
are at about the same level as this one, but with varying choices of
coverage: Capinski and Zastawniak [9], Ross [40], and Stampfli and
Goodman [48]. To read more advanced texts, you would need to
become proficient in stochastic calculus. A gentle start in this
direction is provided by Mikosch [33], while the two volumes of
Shreve [46, 47] are more advanced and comprehensive.
I am grateful to my colleagues at MSF for support and inspiration
in countless ways. Perhaps the most striking is their commitment to
creating innovative teaching and research programs centered
around the interaction of mathematics with all aspects of the world
we live in. I particularly thank Professor Dinesh Singh, Director, MSF,
for inviting me to take part in MSF activities and for sharing his vision
of mathematics. Professor Sanjeev Agrawal has been a constant
source of ideas, advice, and energy. Other colleagues who have
helped me refine my thoughts on finance are Charu Sharma, Divya
Beri, Jatin Anand, Niteesh Sahni and Ziaur Rehman.
At Universities Press, I must thank its Director, Madhu Reddy, and
editors Shubashree Desikan and Sreelatha Menon for
encouragement, advice, and gentle prodding. Thanks are due to the
referee for pointing out various ways of improving the text.
Two old friends have played a special role in this story. Surajit
Basu added to the long list of kindnesses he has done me by
commenting on an early draft. Adnan Aziz rode in like the proverbial
white knight just before publication and saved me from an army of
ambiguities and omissions. I appeal to the reader to help root out
those that remain by writing to me at [email protected].
And, finally, my most heartfelt thanks to my wife Abha and son
Zafar for continually showing me new ways of looking at life.
Amber Habib
Mathematical Sciences Foundation
New Delhi
List of Notation
β
B(n,p)
Cov[X,Y]
δC
DFW
DM
e
≐
E[X]
fS,T
FX
fX
fX,Y
γC
μ
∇f
N(μ,σ)
Ω
Φ
ℙ
R
R^2
r
reff
ρ
ρC
S
σ
σX
σI
σX^2
σXY
S^2
sT
SXY
ΘC
Var[X]
VC
X
x^+
xq
1 Basic Concepts
Of course, there are many other possibilities, but let us start with just
these. The question which arises is––in which of these should we put
our money? This naturally depends on how the nature of these
investments matches with our requirements.
For example, the advantage of a fixed deposit as opposed to a
savings account is that the former pays a higher rate of interest. The
savings account, on the other hand, allows you constant access to
your money, while a fixed deposit requires the money to be with the
bank for a set time such as three months or a year.
A bond provides regular payments over a set time period in return
for an initial payment and is thus rather like a fixed deposit. Many
bonds can be traded during their lifetime, and this provides additional
flexibility to the investor. Bonds are also issued by companies, not
only banks, and typically offer higher gains than fixed deposits. But
there is a downside––if the company hits sufficiently bad times it may
not be able to meet its obligations and the investor may not receive
the promised payments or even get the initial payment back. In other
words, the investor faces default risk, though only in exceptional
circumstances. Risk also enters the picture if the investor wishes to
sell the bond before its expiry, since the price would be affected by
prevailing market conditions and the perceived financial stability of the
issuer.
Risk comes into even more prominence when we consider the
remaining possibilities for investing. Share prices, for instance, vary
greatly from day to day. Even if we invest in a company with an
excellent record, there is no guarantee that we will gain by owning its
shares over the next few months. On the other hand, if we hold on to
the shares for many years we have a good chance of making a
handsome profit. It used to be thought that bonds provide the optimal
way to do well in the long run––say over 20 or 30 years. The current
opinion, however, is in favour of shares, provided one invests in a
diverse collection of companies and thus reduces the possible loss
due to one or more of them doing badly. Mutual funds, which
distribute the investor’s money over such a collection, cater to the
investor who wants steady long-term growth. The investor who
wishes to make money quickly would invest in just a few shares that
he believes are going to do exceptionally well in the immediate future.
Such an investor would naturally be exposed to high levels of risk.
PROBABILITY
This discussion leads us to the role of probability in finance. The
relative worth of an investment depends on the probabilities of the
possible pay-offs. If higher pay-offs are perceived as more likely, its
value should increase. For example, if we can model the fluctuations
in prices of a stock, we can assign probabilities to the possible pay-
offs from buying that stock, and thus estimate its value to the
investor.1 Specifically, we treat the future profit as a random variable.
Its expectation then represents the expected profit, while its
standard deviation represents fluctuations and hence risk. (See
Figures 1.1 and 1.2.) 2
Figure 1.2: This diagram considers the 65 stocks making up the Dow Jones
Composite Index and their weekly profits over the one year period ending
November 6, 2006. The mean profit (per dollar invested) is plotted on the
vertical axis, and the standard deviation of the profits (representing risks) is
plotted on the horizontal axis. The curve has been drawn to emphasise the
absence of stocks with high mean profit but low risk – this indicates that higher
mean profit requires greater risk.
RISK-FREE ASSETS
Some assets can be viewed as free of risk. For instance, deposits in
banks and bonds bought from governments are typically treated as
risk-free. Of course, both banks and governments can collapse, but
such instances are rare. We shall soon see that there is good reason
to expect that all risk-free assets will gain in value at the same rate,
and we may therefore talk of the risk-free rate of growth. This rate is
not universally fixed, but varies with market and time.
PORTFOLIOS
So far, we have considered individual assets. To design a portfolio,
we need to consider not only the individual characteristics (regarding
profit and risk) of the assets, but also their relationships with each
other. Two assets could be linked together in certain ways––for
instance, they may show a tendency to rise or fall in value together.
Alternatively, one may tend to move in the opposite direction to the
other. In the latter case, a rise in one would be offset by a fall in the
other, and a portfolio consisting of both these assets would be less
risky than a portfolio consisting of only one of them! By combining
assets in various ways, we can tailor a portfolio to satisfy the risk
preferences of any investor.
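The risk reduction described above can be checked numerically. The following is a minimal Python sketch, assuming that risk is measured by standard deviation (as in the discussion of probability earlier) and using the standard formula for the standard deviation of a weighted sum of two random returns. The function name, the equal weights and the 20% risk figure are ours, chosen purely for illustration.

```python
import math


def two_asset_risk(w1, w2, sigma1, sigma2, rho):
    """Standard deviation of the return w1*X1 + w2*X2, where X1 and X2
    have standard deviations sigma1, sigma2 and correlation rho."""
    variance = ((w1 * sigma1) ** 2 + (w2 * sigma2) ** 2
                + 2 * w1 * w2 * rho * sigma1 * sigma2)
    # Guard against a tiny negative value caused by round-off.
    return math.sqrt(max(variance, 0.0))


# Hypothetical example: two assets, each with 20% risk, held in equal parts.
for rho in (1.0, 0.0, -1.0):
    print(rho, round(two_asset_risk(0.5, 0.5, 0.2, 0.2, rho), 4))
```

When the two assets move together perfectly (rho = 1) the portfolio is as risky as either asset alone; when they move in exactly opposite directions (rho = −1) the risk of this equally weighted portfolio vanishes.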
HEDGING
The process of reducing risk by combining assets appropriately is
called hedging. By hedging we reduce risk and, therefore, also lower
our expected profit. One of the goals we will pursue in this text is to
see how to hedge against specific risks to which a portfolio is
exposed, for instance fluctuations in the prices of stocks or in interest
rates. If the hedging is complete, no risk will remain, and the portfolio
will grow slowly at the risk-free rate. Therefore, we will also consider
how to hedge to the right extent, so that the remaining risk just falls
within acceptable levels and the portfolio is able to grow at a faster
rate.
In the latter part of this book we will consider the financial
instruments known as derivatives. Derivatives are contracts that fix
the terms for future trades. A simple example is a contract that binds
two parties to a sale of crude oil, six months from now, at a price of
$50 per barrel.3 A prime use of derivatives is to reduce uncertainty
about future expenses (or profits), and they have become a very
popular means of hedging. The creation and pricing of suitable
derivatives is a major focus of modern finance.
1.1 ARBITRAGE
Arbitrage is the making of profit without undertaking risk. It can be
earned, for instance, when a product is being sold at different prices
in different markets. Then risk-free profit can be made by selling it
where it is costlier and buying it where it is cheaper. A variation is
when the different prices are at different times, so that it is possible to
buy today at a low price and sell some days later at a higher price.
For this profit to qualify as arbitrage, however, it must be absolutely
certain beforehand that the price will go up.
1. A kind relative offers to sell you a share whose value has gone
up by at least 15% every year for 50 years.
2. A valid lottery ticket is lying on the road.
3. Horses A and B race against each other. If you bet a rupee on
horse A and it wins, you get back Rs 2. If you bet a rupee on
horse B and it wins, you get back Rs 4. If you bet on a horse that
loses, you lose your money.
INTEREST
We will call the income from an investment interest if it is earned
regularly and in a predetermined manner, without risk. (This is a
rather narrow use of the word and we employ it for clarity at this initial
stage.)
Interest can be calculated according to different conventions.
Consider a starting amount P (called the principal) on which interest
is earned over a time period T. The amount of interest earned is given
by the rate of interest, denoted r, in accordance with the adopted
convention. The rate r is given relative to some time interval, called its
period. The most commonly used period is one year, in which case
the rate is called annual.
Rates are commonly given as percentages, which have to be
converted to fractions for calculations.
SIMPLE INTEREST
In simple interest, the interest earned over one period is not added to
the principal (e.g., it may be returned to the investor), and further
interest is again earned on the principal alone. Thus, if P is invested
at a rate of interest r, the amount after one period is
A = P + Pr = P(1 + r).
During the second period, interest is again earned on P alone, so
that the amount after two periods is
A = P(1 + r) + Pr = P(1 + 2r).
A = P(1 + nr). □
A = P(1 + r),
A = P(1 + r)^n. □
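To compare the two conventions numerically, here is a minimal Python sketch of the formulas A = P(1 + nr) for simple interest and A = P(1 + r)^n for compound interest. The function names and the sample values (Rs 100 at 10%) are ours, chosen for illustration.

```python
def simple_amount(principal, rate, periods):
    """Amount under simple interest: A = P(1 + n r)."""
    return principal * (1 + periods * rate)


def compound_amount(principal, rate, periods):
    """Amount under compound interest: A = P(1 + r)^n."""
    return principal * (1 + rate) ** periods


# Hypothetical example: Rs 100 at 10% per period.
for n in (1, 2, 5, 10):
    print(n, round(simple_amount(100, 0.10, n), 2),
          round(compound_amount(100, 0.10, n), 2))
```

The two amounts agree after one period and then drift apart, since compound interest also earns interest on the interest already credited.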
Sometimes the period for which the rate is quoted is not the same as
the interval at which interest is compounded. For instance, the rate
may be given as an annual one, while the interest is calculated every
6 months. In this situation, the rate is adjusted linearly, as in the
following example.
Example 1.2.4 Suppose you invest Rs 10,000 for one and a half
years at an annual rate of 10% with semiannual compounding (that is,
the compounding is every six months). Then interest is calculated for
each six-month period at half the annual rate, i.e., at 5%.
Therefore, over one-and-a-half years, the invested amount
becomes
A = 10,000 (1 + 0.05)^3 = 11,576.25.
A = P(1 + r/m)^n. □
Exercise 1.2.7 Suppose you take a loan of Rs 1000, and have to pay
it back in two equal and equally spaced installments over a year. The
annual rate of interest applied to this loan is 15% and the interest is
compounded semi-annually.
which is slightly better than the P(1 + r) he would have had if he had
just let the money sit in the bank for the whole year. An investor who
can create this strategy will certainly think of pushing it further by
using smaller and smaller investment periods. In general, if he
withdraws and reinvests m times, he will end up with
A = P(1 + r/m)^m.
m = 2 ⟹ A = 110.250
m = 4 ⟹ A = 110.381
m = 12 ⟹ A = 110.471
m = 52 ⟹ A = 110.506
m = 365 ⟹ A = 110.516
The larger the value of m, the greater is his profit. This naturally leads
to considering the limit case m →∞. To evaluate this limit, we need to
recall the number e, which is called Euler’s number and is defined by
e = lim_{m→∞} (1 + 1/m)^m.
A = Pe^{nr}. □
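The values tabulated above, and the limit they approach, can be reproduced with a short Python sketch. It assumes that each of the m sub-periods earns interest at the linearly adjusted rate r/m, so that the amount after one year is P(1 + r/m)^m. The values P = 100 and r = 10% are our inference from the tabulated amounts, and the function names are ours.

```python
import math


def amount_with_reinvestment(principal, rate, m):
    """Amount after one year when the money is withdrawn and reinvested
    m times, each sub-period earning rate/m: P(1 + r/m)^m."""
    return principal * (1 + rate / m) ** m


def amount_continuous(principal, rate, periods=1):
    """Limiting case m -> infinity: A = P e^{n r}."""
    return principal * math.exp(rate * periods)


# Assumed values: P = 100 and r = 10%.
for m in (2, 4, 12, 52, 365):
    print(m, round(amount_with_reinvestment(100, 0.10, m), 3))
print("limit", round(amount_continuous(100, 0.10), 3))
```

The amounts increase with m but stay below the continuous-compounding value, which acts as a ceiling on what repeated reinvestment can achieve.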
Figure 1.3: This diagram shows the growth of Rs 100 according to the
different interest rates, each with r = 10%, over a period of 10 years.
Exercise 1.2.9 Consider a bank that offers to double your investment
in 10 years. What is the corresponding annual rate of interest if we
assume the interest is:
a. simple
b. compounded annually
c. compounded continuously.
e^{rT_1} e^{rT_2} = e^{r(T_1 + T_2)}.
reff = .
a. Which one earns more interest if the period of the investment is:
6 months, 9 months, 1 year?
b. Suppose the invested amount is Rs 1000, and the common
effective rate is 10%. What is the maximum difference in the
interests earned by A and B at any point during the first 6
months? (The answer is quite small, so use a good number of
decimal places in your calculations.)
Let us now look at the other kinds of variations in interest rates listed
on page 12. The second kind of variation can be expected to be small
due to competition. A bank offering lower interest on deposits than its
competitors would soon start losing customers and would have to
raise its rates.5
The third kind certainly exists. Thus, a bank will offer lower interest
on deposits than it will exact on loans. However, it should be noted
that only the first rate can be reasonably seen as risk-free. The
second rate involves a risk taken by the bank, which explains why it is
higher.
P= .
C= .
C = e^{−rT}.
INFLATION
Another way that the value of money changes through time is with
respect to its purchasing power. Typically, the same amount of money
can buy less and less as time progresses––this phenomenon is
known as inflation. Inflation shows up as a general increase in
prices. Occasionally, prices may fall, and then we have a deflation
situation. But inflation is the general trend.
By averaging the rise in prices over various commodities, one can
arrive at a single number––the rate of inflation f––which represents
the annual decrease in purchasing power of a unit of currency.
A,
1 + r′ = (1 + r)/(1 + f), or r′ = (r − f)/(1 + f).
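As a numerical illustration, the following Python sketch computes an inflation-adjusted rate r′ from an interest rate r and an inflation rate f. It assumes the usual relation 1 + r′ = (1 + r)/(1 + f), with both rates annual. The function name and the sample rates are ours.

```python
def real_rate(nominal_rate, inflation_rate):
    """Inflation-adjusted rate r' defined by 1 + r' = (1 + r)/(1 + f)."""
    return (1 + nominal_rate) / (1 + inflation_rate) - 1


# Hypothetical example: 8% interest with 5% inflation.
r, f = 0.08, 0.05
print(round(real_rate(r, f), 4))
print(round((r - f) / (1 + f), 4))  # the equivalent form (r - f)/(1 + f)
```

For small f the result is close to the simple difference r − f, which is why that difference is often quoted as a rough estimate.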
1. Rs 100 now
2. Rs 110 a year from now
Suppose the available interest rate is 8%. Then the first amount
grows to Rs 108 over a year – and this is a bit less than Rs 110. Most
people, however, still prefer the first choice. They argue that the Rs
110 will also be subject to inflation and so they would prefer to have
the Rs 100 now.
Yet the point to remember is that present value is not just a
theoretical notion––it has practical implications. If we are sure of
receiving Rs 110 in a year, we can borrow its present value against it
today. In this case that amounts to Rs 101.85. In other words, the
second offer is equivalent to an offer of Rs 101.85 today, and so it
wins no matter what the rate of inflation is.
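The figure of Rs 101.85 can be verified with a minimal Python sketch. It assumes annual compounding at the 8% rate quoted above, and the function name is ours.

```python
def present_value(amount, rate, years=1):
    """Present value of an amount received after the given number of
    years, under annual compounding at the given rate."""
    return amount / (1 + rate) ** years


pv = present_value(110, 0.08)
print(round(pv, 2))  # present value of Rs 110 due in one year
print(pv > 100)      # True: the second offer is worth more today
```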
Figure 1.4: The Dow Jones Industrial Average Index (DJIA) between
1928 and 2008
Figure 1.5: In this diagram we plot the logarithms of the DJIA values and see
that they have a strong linear trend. They also enable a clearer look at the
relative size of fluctuations over a long term.
2 You can ignore this discussion for now if you are unfamiliar with the terms
random variable, expectation, and standard deviation. However, you should
start your study of the basics of probability (Appendix B) and familiarise yourself
with these concepts as they will soon become essential.
3 A barrel of oil equals 159 litres. Between January and December 2008, the
price of a barrel of crude oil rose from $85 to $122 (in June) and then fell all the
way to $33!
4 The English translation is by Davis and Etheridge on page 24 of [17]. The
seminal contributions of Bachelier are described in more detail in the
introductory remarks of our fifth chapter.
5 The rates would have to be compared after taking into account any fees
charged by the banks. A bank offering better service might also get away with a
lower rate. Thus, in the real world, we only expect the differences to be small –
not absent.
6 For more information, consult the websites of the National Stock Exchange of
India (www.nse-india.com) and the Bombay Stock Exchange
(www.bseindia.com).
2 Deterministic Cash Flows
Figure 2.1 (time axis: t = 0, 1, 2)
Let the discount factors for the first and second years be 0.9 and 0.8
respectively. Then the net present values of A and B are given by
where x0 is earned at time zero, x1 after the first interval, and so on.
Example 2.1.2 An annuity consists of annual payments of the same
amount A. An annuity lasting for n years would be represented by
Figure 2.2.
Figure 2.2
The NPV of an n-year annuity, with the first payment after 1 year, can
be calculated as follows (assuming a constant risk-free rate r):
NPV = A/(1 + r) + A/(1 + r)^2 + ⋯ + A/(1 + r)^n = (A/r)[1 − 1/(1 + r)^n].
NPV × r = A. □
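These annuity and perpetuity values can be checked numerically. The Python sketch below sums the discounted payments directly and compares the result with the closed form (A/r)[1 − (1 + r)^{−n}] obtained by summing the geometric series. It assumes annual compounding at a constant rate r. The function names and sample values are ours.

```python
def annuity_npv(payment, rate, years):
    """NPV of payments at times 1, ..., years, summed term by term."""
    return sum(payment / (1 + rate) ** k for k in range(1, years + 1))


def annuity_npv_closed_form(payment, rate, years):
    """The same NPV from the geometric series: (A/r)(1 - (1 + r)^(-n))."""
    return payment / rate * (1 - (1 + rate) ** -years)


def perpetuity_npv(payment, rate):
    """NPV of a never-ending annuity, which satisfies NPV x r = A."""
    return payment / rate


# Hypothetical example: Rs 100 a year at a rate of 10%.
print(round(annuity_npv(100, 0.10, 10), 2))
print(round(annuity_npv_closed_form(100, 0.10, 10), 2))
print(round(perpetuity_npv(100, 0.10), 2))
```

As the number of years grows, the annuity value approaches the perpetuity value A/r.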
Until the First World War, perpetuities were a popular way for
European governments to raise money, particularly for war.
Speculation centered around trade in these perpetuities was a major
financial activity. In 1900, the center for such trade was the Paris
Stock Exchange, and the total capital loaned through perpetuities was
70 billion gold francs (by the governments of France, Germany,
Russia, etc.). In comparison, the annual budget of France was 4
billion francs! (Source: Taqqu [49])
The cash flow A+B collects these transactions into a single cash flow
shown in Figure 2.3(b).
Figure 2.3(b)
It is important to keep in mind that net present value does not use
only the intrinsic properties of the cash flows but is closely tied to
market conditions in the form of interest rates. Since interest rates
fluctuate with time, a project that starts off with higher NPV may not
stay that way. For instance, in Example 2.1.1, if the sequence of
discount factors is changed to 0.8, 0.9, then project A becomes better
than project B.
In the next section we will study a measure which uses only the
intrinsic properties of a cash flow to evaluate it.
2.2 INTERNAL RATE OF RETURN
Once again, consider a deterministic cash flow sold at time t = 0 for a
price P. Its internal rate of return or IRR is that rate of interest r
which would allow the flow to be generated from its price. Another
way of putting this is to say that IRR is the rate of interest which would
make the cash flow’s NPV equal to P.
Let us look at some simple examples.
+ .
5 = 2/(1 + r) + 4/(1 + r)^2.
r = 0.12, −1.72.
Now, since the price paid is less than the total earned, it is clear that r
should be positive. So we reject the value –1.72 and obtain 0.12 as
the IRR. □
IRR need not always be positive. For example, suppose the cash flow
in the last example was bought at a price P = 7. Then the two
solutions for r are
r = −0.09, −1.63.
Both solutions are negative, which was to be expected since the
amount paid is more than that received. Can we still reject one? Since
some of the amount paid does come back, it is clear that the loss
does not reach 100%. Hence we should have r > –1. This leads us to
reject the value –1.63 and take the IRR to be –0.09.
Figure 2.4: Internal rate of return for the cash flow in Example 2.2.1
f(r) = 2/(1 + r) + 4/(1 + r)^2.
The intersection of the graph with the horizontal line at height P gives
the possible values of IRR if this cash flow is sold for P at time 0.
Figure 2.4 shows that we always obtain one value below –1 and one
value above –1. Due to the reasons given above, the value which is
above –1 is taken to be the IRR.
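For a flow with only two payments the IRR equation is a quadratic in u = 1/(1 + r), so both candidate values can be written down directly. The Python sketch below does this. The payments of 2 after one period and 4 after two periods are our assumption, chosen because they reproduce the pairs of values quoted above for P = 5 and P = 7. The function names are ours as well.

```python
import math


def two_payment_irr_candidates(x1, x2, price):
    """Both solutions r of price = x1/(1 + r) + x2/(1 + r)^2, found by
    solving x2*u^2 + x1*u - price = 0 for u = 1/(1 + r).
    Assumes x2 > 0 and price > 0, so the discriminant is positive."""
    root = math.sqrt(x1 * x1 + 4 * x2 * price)
    u_values = [(-x1 + root) / (2 * x2), (-x1 - root) / (2 * x2)]
    return [1 / u - 1 for u in u_values]


def two_payment_irr(x1, x2, price):
    """The candidate above -1, which is the one taken to be the IRR."""
    return max(r for r in two_payment_irr_candidates(x1, x2, price) if r > -1)


for price in (5, 7):
    print(price, [round(r, 2) for r in two_payment_irr_candidates(2, 4, price)])
```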
Consider a regular cash flow of amounts x_1, …, x_n, occurring at
times 1, …, n and sold at time t = 0 for a price P. Its IRR is then a
solution of the equation
P = x_1/(1 + r) + x_2/(1 + r)^2 + ⋯ + x_n/(1 + r)^n.    (2.1)
If we include P as part of the cash flow, we obtain the cash flow
x_0 = −P, x_1, …, x_n, occurring at times 0, 1, …, n. In this notation the IRR
equation is
x_0 + x_1/(1 + r) + x_2/(1 + r)^2 + ⋯ + x_n/(1 + r)^n = 0.    (2.2)
The internal rate of return has the virtue of using only knowledge of
the cash flow, without any reference to external and transient factors
like interest rates. The characterisation of a cash flow by a rate of
growth is also intuitively appealing, and so, the most popular
technique for attracting investors is to promise a high IRR.
The IRR is a solution of a polynomial equation since equation (2.2)
can be rearranged into
x_0(1 + r)^n + x_1(1 + r)^{n−1} + ⋯ + x_n = 0.
Exercise 2.2.2 Consider the cash flow x_0 = −1, x_1 = 0, x_2 = −1. Show
that its IRR is not defined.
Exercise 2.2.3 Consider a cash flow in which x0 < 0 and the net
inflow is greater than the net outflow. Show its IRR equation has a
solution.
A more significant source of trouble is that a polynomial equation may
have many solutions, and then we may not be able to say which is the
‘right’ IRR.
or
– x0 – – – + + = 0.
BISECTION METHOD
Consider the function
f(r) = x_1/(1 + r) + x_2/(1 + r)^2 + ⋯ + x_n/(1 + r)^n.
Calculate values of f(r) for different values of r until you find values r_1
and r_2 such that f(r_1) > P > f(r_2). Then the IRR is between r_1 and r_2.
Let r_3 be the midpoint of r_1 and r_2. If f(r_3) > P, the IRR is between r_3
and r_2; otherwise it is between r_1 and r_3. Take the appropriate
midpoint and continue the process. The midpoints give a sequence of
gradually improving approximations to the IRR.
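The bisection steps just described can be sketched in Python as follows. We take f(r) to be the present value at rate r of payments x_1, …, x_n made at times 1, …, n. The function names, the stopping tolerance and the sample flow (payments of 2 and 4, price 5) are our own illustrative choices.

```python
def present_value_at(rate, payments):
    """f(r): present value of payments x_1..x_n made at times 1..n."""
    return sum(x / (1 + rate) ** k for k, x in enumerate(payments, start=1))


def irr_bisection(payments, price, low, high, tol=1e-8, max_iter=200):
    """Approximate the IRR by repeatedly halving [low, high].
    Requires f(low) > price > f(high)."""
    if not (present_value_at(low, payments) > price
            > present_value_at(high, payments)):
        raise ValueError("need f(low) > price > f(high)")
    for _ in range(max_iter):
        mid = (low + high) / 2
        if present_value_at(mid, payments) > price:
            low = mid    # the IRR lies between mid and high
        else:
            high = mid   # the IRR lies between low and mid
        if high - low < tol:
            break
    return (low + high) / 2


# Hypothetical example: payments of 2 and 4, bought for 5.
print(round(irr_bisection([2, 4], 5, 0.0, 1.0), 4))
```

Each step halves the interval, so about 27 steps are enough to shrink an interval of length 1 below the tolerance of 10^−8 used here.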
f(r) = + + + .
We calculate f(0.1) = 6.8 and f(0.5) = 2.9; hence the IRR is between
0.1 and 0.5. Their midpoint is 0.3 and f(0.3) = 4.3, so the IRR is
between 0.1 and 0.3. The next midpoint is 0.2, with f(0.2) = 5.3.
Proceeding in this way, we create the following sequence of
approximations to IRR:
NEWTON–RAPHSON METHOD
We first calculate the derivative of f(r):
We then define
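For comparison with bisection, here is a hedged Python sketch of the standard Newton–Raphson iteration applied to the IRR equation f(r) = P. It assumes f(r) is the present value of payments x_1, …, x_n at times 1, …, n, so that f′(r) = −[x_1/(1 + r)^2 + 2x_2/(1 + r)^3 + ⋯ + n x_n/(1 + r)^{n+1}], and it uses the update r ← r − (f(r) − P)/f′(r). The function names, starting guess and sample flow are ours and need not match the notation of the text.

```python
def present_value_at(rate, payments):
    """f(r): present value of payments x_1..x_n made at times 1..n."""
    return sum(x / (1 + rate) ** k for k, x in enumerate(payments, start=1))


def present_value_derivative(rate, payments):
    """f'(r) = -sum of k * x_k / (1 + r)^(k + 1)."""
    return -sum(k * x / (1 + rate) ** (k + 1)
                for k, x in enumerate(payments, start=1))


def irr_newton(payments, price, guess=0.1, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration for the equation f(r) = price."""
    r = guess
    for _ in range(max_iter):
        step = ((present_value_at(r, payments) - price)
                / present_value_derivative(r, payments))
        r -= step
        if abs(step) < tol:
            return r
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical example: payments of 2 and 4, bought for 5.
print(round(irr_newton([2, 4], 5), 4))
```

Newton–Raphson usually needs far fewer steps than bisection, but it depends on a reasonable starting guess and can fail to converge from a poor one.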
The IRR of A is 8.7%. The IRR equation for B has two solutions:
23.8% and –83.9%. Since the investment in B is less than the gain,
we reject the negative value and take 23.8% as the IRR. So, on the
basis of IRR, we would choose B.
□
The next example illustrates that some caution is needed while
applying this principle.
        t = 0    t = 1
A        100     −150
B       −100      150
In a cash flow with a single sign change, we can say which side of the
deal we are on, that is, whether we are giving or taking the loan. If the cash
flow has initial positive signs we are taking the loan and if the initial
signs are negative we are giving it. In more complicated flows we
cannot say, and then a high IRR is just as likely to be bad as good!
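The two flows A and B tabulated above make the point concrete. The Python sketch below solves the one-period IRR equation x_0 + x_1/(1 + r) = 0 for each and then compares their NPVs. The market rate of 10% used for the NPVs is our assumption, introduced only for illustration, as are the function names.

```python
def one_period_irr(x0, x1):
    """Solve x0 + x1/(1 + r) = 0 for r."""
    return -x1 / x0 - 1


def npv(cash_flow, rate):
    """NPV of amounts cash_flow[0], cash_flow[1], ... at times 0, 1, ..."""
    return sum(x / (1 + rate) ** k for k, x in enumerate(cash_flow))


flow_a = [100, -150]   # receive 100 now, repay 150 later (taking a loan)
flow_b = [-100, 150]   # pay 100 now, receive 150 later (giving a loan)

print(one_period_irr(*flow_a), one_period_irr(*flow_b))
print(round(npv(flow_a, 0.10), 2), round(npv(flow_b, 0.10), 2))
```

Both flows have the same IRR of 50%, yet at the assumed market rate one has a negative NPV and the other a positive one: a high IRR is bad news for the borrower and good news for the lender.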
Then the present value ratio (N⁄I) gives the value per unit invested.7
With IRR, the situation is just the reverse.
1. It is only well defined for very simple cash flows, and even then it
is not easy to calculate. On the other hand, it requires no extra
information (like interest rates).
2. The IRR of the whole cannot be obtained from the IRRs of the
parts.
3. It does not give a sense of the total profit over the life of the
project.
4. It gives a sense of the rate of growth of the project—both in the
sense of profit per unit investment and in the sense of growth per
unit time.
5. Finally, let us recall from the last section that it may not be clear
whether a high IRR indicates a high rate of profit or a high rate of
loss!
MODIFIED IRR
In recent years, another measure of the rate of growth has gained
popularity. It is called the modified internal rate of return or
modified IRR or MIRR. It is a sort of hybrid of NPV and IRR. In
MIRR, we evaluate a project’s cash flow within the context of the firm
undertaking it. All negative entries (outflows) are assumed to be
generated by investing at a certain rate called the finance rate. All
positive entries (inflows) are assumed to be reinvested at another rate
called the reinvestment rate. Typically, the finance rate is taken to be
the market risk-free rate. The reinvestment rate is usually the firm’s
cost of capital—the rate at which its overall growth is taking place.
We estimate the total investment by taking the NPV of all the
negative entries using the finance rate. Let us denote the magnitude
of this NPV by P. The gains are estimated by taking the net future
value of all the positive entries using the reinvestment rate at the
conclusion of the cash flow. Let us denote this net future value by A.
The MIRR μ is then defined to be the interest rate under which P
would grow into A over the life of the cash flow. If the life is n time
periods, then the MIRR per period is defined by
P(1 + μ)^n = A.
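The definition of MIRR translates directly into a short computation. The Python sketch below assumes a cash flow given at times 0, 1, …, n. It discounts the negative entries to time 0 at the finance rate to obtain P, carries the positive entries forward to time n at the reinvestment rate to obtain A, and then solves P(1 + μ)^n = A. The function name, the sample flow and the two rates are ours.

```python
def mirr(cash_flow, finance_rate, reinvestment_rate):
    """Modified IRR per period for amounts at times 0, 1, ..., n."""
    n = len(cash_flow) - 1
    # P: magnitude of the NPV of the outflows, at the finance rate.
    invested = -sum(x / (1 + finance_rate) ** k
                    for k, x in enumerate(cash_flow) if x < 0)
    # A: future value at time n of the inflows, at the reinvestment rate.
    gains = sum(x * (1 + reinvestment_rate) ** (n - k)
                for k, x in enumerate(cash_flow) if x > 0)
    # Solve P(1 + mu)^n = A for mu.
    return (gains / invested) ** (1 / n) - 1


# Hypothetical example: invest 1000, then receive 500 and 700.
print(round(mirr([-1000, 500, 700], 0.05, 0.10), 4))
```

In this example P = 1000 and A = 500 × 1.1 + 700 = 1250, so μ = √1.25 − 1, or about 11.8% per period.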
Then,
P = 1000 + = 1961.17,
and this gives MIRR= 0.36 or 36%. On the other hand, the IRR of this
flow is 100%! □
Figure 2.6
The face value is also called the maturity value or par value. The
date on which it is paid is called the maturity date. Bonds with this
simple structure are called straight or plain vanilla bonds. More
complicated bonds may offer one party the right to terminate the
contract early or allow for some fluctuation in the coupon payments.
Suppose the cash flow offered by a bond is purchased for some
price P. This price depends not only on the structure of the bond itself
but also on certain external factors. Two important factors are:
In this book we shall ignore the second factor and treat bonds as if
they are risk-free. We only note that the investor can use credit
ratings from various agencies to gauge the risk of default. In the US,
the popular rating agencies are Moody’s, Standard and Poor’s (S&P)
and Fitch IBCA. In India we have CRISIL (Credit Rating Information
Services of India, Ltd) and CARE (Credit Analysis and Research,
Ltd). The compensation or premium for the default risk is calculated
via a statistical study of the historical loss from default at each of the
rating levels.
(2.3)
(2.4)
This can be taken as the fair price for the bond. It should be noted,
however, that the assumption of a uniform r over the life of the bond is
a strong one (especially as the life of a bond can stretch up to 30
years!). This can be taken into account by using different values of r
for different time spans, and we will do this later in the section on the
term structure of interest rates (§2.8).
Exercise 2.4.1 Check that annuities and perpetuities are always sold
at a premium, while a zero-coupon bond is always sold at a discount.
For a bond with face value F, coupon rate C⁄F, m regular payments
per year and a total of N coupon payments, formula (2.3) is modified
to
(2.4)
(2.5)
P= .
2. Show that if its yield equals the coupon rate C⁄F, then the bond is
at par. If the yield is greater than the coupon rate, the bond is
sold at a discount. If the yield is lower, it is sold at a premium.
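The relationship between yield, coupon rate and price can be checked numerically. The sketch below assumes the usual convention that each of the N payments equals the annual coupon divided by m and that the yield is compounded m times a year; the function name `bond_price` and the sample bond are hypothetical.

```python
def bond_price(F, coupon_rate, y, m, N):
    """Price of a bond with face value F, annual coupon rate coupon_rate,
    m payments per year and N coupon payments in total, discounted at a
    yield y compounded m times a year."""
    c = F * coupon_rate / m          # size of each coupon payment
    d = 1 + y / m                    # per-period growth factor
    coupons = sum(c / d ** i for i in range(1, N + 1))
    return coupons + F / d ** N

# A 10% semi-annual bond with six payments left, at three different yields.
at_par = bond_price(100, 0.10, 0.10, 2, 6)    # yield equals coupon rate
discount = bond_price(100, 0.10, 0.12, 2, 6)  # yield above coupon rate
premium = bond_price(100, 0.10, 0.08, 2, 6)   # yield below coupon rate
print(at_par, discount, premium)
```

A numerical check of this kind does not replace the proof asked for in the exercise, but it is a quick way to catch errors in a derivation.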
P= .
Figure 2.8
We calculate the price by calculating the NPV of the remaining
payments at the required yield. We use continuous compounding as
that makes it easier to deal with arbitrary time intervals.
In this example, we see that the sawtooth effect tends to hide the long
term behaviour of the bond price. It is useful to identify and subtract it
to reveal the underlying trend. We view the rise and fall as due to the
approach and delivery of the next coupon payment. So we start by
defining the accrued interest, which is the fraction of the next
coupon payment which is already due. If the bond is sold at a time t
between the kth and (k + 1)th coupon payments, made at times t_k and
t_{k+1}, this is the fraction
I(t) = C (t – t_k)/(t_{k+1} – t_k).
The actual price P(t) is now called the dirty price. Figure 2.10 depicts
the accrued interest, clean price and dirty price plots for Example
2.5.1.
Figure 2.10: Clean price, dirty price, and accrued interest plots for Example
2.5.1
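The decomposition of the actual price into clean price and accrued interest can be sketched as follows. The code assumes an annual bond, a continuously compounded yield, and accrued interest that grows linearly between coupon dates; the function names and the sample bond are hypothetical and are not the data of Example 2.5.1.

```python
import math

def dirty_price(t, F, C, n, y):
    """NPV at time t (in years) of the payments still to come from an annual
    n-year bond, using a continuously compounded yield y."""
    remaining = [k for k in range(1, n + 1) if k > t]
    pv = sum(C * math.exp(-y * (k - t)) for k in remaining)
    return pv + F * math.exp(-y * (n - t))

def accrued_interest(t, C):
    """Linear share of the next annual coupon that has accrued by time t."""
    return C * (t - math.floor(t))

t = 0.5
dirty = dirty_price(t, F=100, C=10, n=3, y=0.08)
accrued = accrued_interest(t, C=10)
clean = dirty - accrued
print(dirty, accrued, clean)
```

Evaluating `dirty_price` just before and just after a coupon date shows the sawtooth drop of roughly one coupon payment.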
Exercise 2.5.2 A 10% 3-year bond with face value 100 is issued on 1
January, 2008. Suppose its clean price on 1 July, 2008 is 102. How
much would you have to pay to buy this bond?
Figure 2.11: (a) Variation of the price–yield curve with the coupon payments.
The higher curves correspond to higher coupon payments. (b) Variation of the
price–yield curve with the number of coupon payments. The arrows point in the
direction of increasing number of coupon payments.
2.7 DURATION
Bonds are risk-free if held to maturity since their cash flow is
deterministic, at least if default risk can be ignored. However, at
intermediate times they are not risk-free since their price fluctuates
with the prevailing interest rates (or required yield). We have just
noted that longer-term bonds are more exposed to this risk. We shall
now develop a way to quantify this risk.
=– .
DM = ,
so that
(1/P)(dP/dλ) = –DM.
δP/δλ ≈ dP/dλ,
and so, the proportional change in price δP⁄P is approximated as
follows:
δP/P ≈ –DM δλ.
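The quality of this approximation is easy to test numerically. The sketch below assumes continuous compounding and takes the Macaulay duration to be the present-value-weighted average of the payment times; the bond and the size of the yield change are hypothetical.

```python
import math

def price(times, flows, y):
    """Present value of cash flows under a continuously compounded yield y."""
    return sum(c * math.exp(-y * t) for t, c in zip(times, flows))

def macaulay_duration(times, flows, y):
    """Present-value-weighted average of the payment times."""
    weighted = sum(t * c * math.exp(-y * t) for t, c in zip(times, flows))
    return weighted / price(times, flows, y)

# Hypothetical 5-year annual bond: coupon 10, face value 100, yield 8%.
times = [1, 2, 3, 4, 5]
flows = [10, 10, 10, 10, 110]
y, dy = 0.08, 0.001

D = macaulay_duration(times, flows, y)
exact = price(times, flows, y + dy) / price(times, flows, y) - 1
approx = -D * dy
print(D, exact, approx)
```

For a zero-coupon bond the same function returns the maturity itself, since all the weight sits on the single payment.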
DM = .
By writing the times of payment as i⁄m, the units of Macaulay duration
are kept as years. In this context, Macaulay duration does not have
as precise an interpretation in terms of the sensitivity to yield changes
as in continuous compounding (see Exercise 2.7.1). Nevertheless,
since discrete compounding approximates continuous compounding,
it can still be used as an indicator of that sensitivity.
Exercise 2.7.1 Consider a bond with face value F, coupon rate C⁄F
and m coupon payments per year. If the yield is described by discrete
compounding, show that
=– ,
Figure 2.12: Duration plotted against maturity date for a 10% annual
bond. The three curves correspond to different required yields λ.
period n. While duration indeed increases with n initially, the curve
eventually flattens and even drops a little before settling at a constant
level.
Thus, while duration initially increases with n, in the very long run it
stabilises at a constant level.
DM = ,
= –DM.
Suppose the portfolio consists of k bonds, and let the price and
duration of the ith bond be Pi and DM,i respectively. Then P = P1 + ⋯
+ Pk and
Example 2.8.1 Suppose the one-year spot rate is 5% and the two-
year spot rate is 6% (with discrete compounding). Then, an
investment of 100 for a year will grow to 100(1 + s1) = 100 × 1.05 =
105. The same amount invested for two years will grow to 100(1 +
s2)^2 = 100(1.06)^2 = 112.36. □
One might think from this calculation that the two-year investment is
better than the one-year because it earns a higher rate of interest, yet
it is not necessarily so. For, the prevailing rates may increase after a
year and reinvesting the 105 at the new high rate may lead to a better
result.
This discussion brings us to forward rates. Suppose an investor
wants to take a loan, but after a year rather than right away. To avoid
the risk from interest rate fluctuations, she wants to finalise the loan
and the interest that will be charged. The rate that is decided for this
loan will be called a forward rate.
Thus, a forward rate is an interest rate that will be applied to a
transaction in the future but is to be decided in the present. The
forward rate for an investment starting at time S and lasting till time T
will be denoted by fS,T. Like spot rates, forward rates are expressed
annually.
Example 2.8.2 Suppose the one-year spot rate is s1 = 5%, while the
forward rate for the succeeding one-year period is f1,2 = 6%. Then, we
can invest 100 initially for one year at s1 and then for another year at
f1,2 (without risk since the forward rate is set now). The investment will
grow to 100(1.05)(1.06) = 111.3 over two years. □
Spot and forward rates are related to each other. Consider an amount
A which is to be invested for 2 years, risk-free. If it is invested using
the 2-year spot rate, it will grow to A(1 + s2)^2. An alternative is to
invest it first for one year and then reinvest for another, in which case
it will grow to A(1 + s1)(1 + f1,2). Since both routes are risk-free, by the
No Arbitrage Principle we must have
A(1 + s2)^2 = A(1 + s1)(1 + f1,2).
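Solving this relation for the forward rate gives a one-line computation. The sketch below assumes discrete annual compounding and extends the same no-arbitrage argument to general start and end times; the function name `forward_rate` is hypothetical.

```python
def forward_rate(s_short, t_short, s_long, t_long):
    """Annual forward rate between t_short and t_long implied by two spot
    rates with discrete annual compounding (no-arbitrage relation)."""
    growth_long = (1 + s_long) ** t_long
    growth_short = (1 + s_short) ** t_short
    return (growth_long / growth_short) ** (1 / (t_long - t_short)) - 1

# Spot rates of Example 2.8.1: s1 = 5% and s2 = 6%.
f_1_2 = forward_rate(0.05, 1, 0.06, 2)
print(f"f1,2 = {f_1_2:.4%}")
```

With these spot rates the implied forward rate is (1.06)^2/1.05 – 1, or about 7.01%, which is higher than both spot rates.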
Some spot rates can be observed directly from the market. If there is
a risk-free zero-coupon bond with maturity time n, we can take its
yield to be sn. Difficulties arise when a suitable zero-coupon bond is
not present. The yields from other bonds do not directly give spot
rates since they involve payments at different times.
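Consider an annual n-year bond with annual coupon payment C and face
value F. Using net present value and spot rates, its fair price is
P = C/(1 + s1) + C/(1 + s2)^2 + ⋯ + (C + F)/(1 + sn)^n.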
If P is observed from the market and the spot rates s1,…,sn–1 are
known, we can solve this equation for sn. This observation gives rise
to a technique known as bootstrapping. First, we look at all the
available zero-coupon bonds and use their yields to obtain a list of
spot rates. The gaps in this list are filled by looking at the market
prices of other bonds and solving for the corresponding spot rate.8
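One step of bootstrapping can be sketched as below. The code assumes an annual bond and discrete annual compounding; the function name and the market data are hypothetical, with the bond price constructed so that the answer is known in advance.

```python
def bootstrap_next_spot(price, coupon, face, known_spots):
    """Solve the fair-price equation of an annual n-year bond for s_n,
    given the spot rates s_1, ..., s_(n-1) in known_spots."""
    n = len(known_spots) + 1
    # Present value of the coupons paid before maturity.
    pv_early = sum(coupon / (1 + s) ** i
                   for i, s in enumerate(known_spots, start=1))
    # What remains of the price must equal (coupon + face) / (1 + s_n)^n.
    residual = price - pv_early
    return ((coupon + face) / residual) ** (1 / n) - 1

# Hypothetical data: s1 = 5% from a one-year zero-coupon bond, and a two-year
# 10% annual bond with face value 100 priced consistently with s2 = 6%.
s1 = 0.05
market_price = 10 / 1.05 + 110 / 1.06 ** 2
s2 = bootstrap_next_spot(market_price, coupon=10, face=100, known_spots=[s1])
print(f"s2 = {s2:.4%}")
```

Repeating the call with the growing list of known spot rates fills in s3, s4 and so on, one maturity at a time.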
+ + = + + .
Figure 2.14: The spot rates versus time plot for Example 2.8.6 □
In this example we note that, as expected, the spot rate s3 does not
equal the yield of the 3-year bond. Nevertheless, the difference is
quite small. On reflection, this is reasonable because most of the
payment happens at 3 years. Therefore, we can expect bond yields to
provide a starting approximation to the spot rates which can then be
refined by calculations as in the example.
A complication is that the maturity dates of available bonds may
not dovetail in the exact way required for bootstrapping. We may seek
zero-coupon bonds expiring in exactly 1 year but may only find ones
expiring in 11 and 13 months. In such a situation we would have to
use an average of their yields to represent the yield a 1-year bond
would have had.
Once the spot rates have been established, the forward rates can
be found by equations (2.6) to (2.8).
FISHER–WEIL DURATION
Figure 2.15 shows how quickly the spot rates can change by
significant amounts. It depicts the twenty-year spot rate curve for
India as calculated by the National Stock Exchange on three
successive days in September 2008.
Suppose, at this time, you had held some zero coupon bonds
expiring in 3 years. Then, in just two days their value would have first
increased by 1% and then decreased by 1.3%. In the longer term,
much wilder swings could be expected.
The illustration above points to the importance of quantifying the
risk to a bond portfolio from changes in the spot rates. We shall only
consider the risk from the simplest kind of change. Specifically, we
shall study the sensitivity of bond prices to parallel shifts—when all
the spot rates change by the same amount. If we plot the spot rate
curve for the new spot rates, we see a parallel shift in the curve.
Figure 2.15: Variation in the twenty-year spot rate curve for India over three
consecutive days in September 2008 (based on data released by NSE)
The concept of Macaulay duration can be adapted to this scenario.
Consider an annual n-year bond with face value F and coupon
payments of C. If we use continuous compounding, its price is given
by
P = ∑_{i=1}^{n} C e^{–i si} + F e^{–n sn},
where the si are the continuously compounded spot rates.
Consider the case when all the spot rates change by the same
amount λ. Then the new price is a function of λ:
P(λ) = ∑_{i=1}^{n} C e^{–i (si + λ)} + F e^{–n (sn + λ)}.
DP = D1 + D2.
DFW = .
λ=0 = –DQ,
DQ = .
Two obvious questions arise out of these observations. The first is:
Why are there different term structures? The second is: Why is the
normal structure the basic one while the others are rare or transitory?
The answers again involve risk. Longer term investments are
exposed to more risk over their life. They also reduce the investor’s
flexibility by tying up his cash. For these reasons investors demand
greater return on longer term investments, leading to an upward
sloping term structure. However, occasionally investors perceive
certain times in the future as particularly risky, e.g., the period in
which a general election takes place. Then the spot rates for these
times will rise, causing bumps in the term structure. An inverted term
structure will occur when the present is turbulent, so that long term
investments are seen as safer than short term ones.
2.9 IMMUNISATION
The simplest way to invest money for a time T in a risk-free way is to
buy a zero-coupon bond maturing at T. By concentrating the payment
at the end, and at a predetermined rate, it completely eliminates
interest-rate risk. Unfortunately, only rarely will we find a
zero-coupon bond that matures exactly at the required time. We can
try to use a zero-coupon bond that matures near T, but this will have
attached risks:
DP = D1 + D2.
Then we have the following calculations (all the rates are assumed to
be for continuous annual compounding):
Consider a portfolio where the proportions invested in the two bonds
are w1 and w2 . Then we have the following two equations for
immunisation:
2.10 CONVEXITY
Our development of immunisation in the previous section was based
on a linear approximation to the price–yield relationship. The
technique can be further improved by using a quadratic approximation
—this requires the use of the second derivative of price with respect
to yield.
Consider a bond whose price is denoted by P. Then P is a
function of the yield λ. The convexity of the bond is defined to be
𝒞 = (1/P)(d²P/dλ²). (2.9)
From the shape of the price–yield curve, it is evident that as the yield
λ increases, the first derivative of P also increases, moving from
highly negative values towards zero. Therefore, the second derivative
of P is positive, and so the convexity is positive.
Equation (2.9) also defines convexity for a portfolio of bonds
(assuming that all bonds have the same required yield).
= 1 + 2
= D2M – .
= .
IMMUNISATION
Under a small change δλ in the yield, the change δP in the price has
the quadratic approximation
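δP ≐ (dP/dλ) δλ + (1/2)(d²P/dλ²)(δλ)^2.
Dividing by P, we obtain
δP/P ≐ –DM(δλ) + (1/2)𝒞(δλ)^2,
where 𝒞 = (1/P)(d²P/dλ²) is the convexity.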
Our task is to create a portfolio of bonds whose value will match this
stream, even under significant changes in the interest rate. Suppose
the available bonds all have face value 10 and are as follows:
= .
Example 2.11.1 Consider a five year annual bond that has face
value 1000 and was issued at par when the required yield was 10%.
Suppose this bond can be called back at any time by paying the face
value as well as a fee of 50. Now, if after 2 years the required yield
has decreased to 7%, then the price of this bond will become
P = 100/1.07 + 100/(1.07)^2 + 1100/(1.07)^3 = 1079.
P= .
Example 2.11.3 Consider a five year 10% annual bond with face
value 100. Suppose it can be called back at any of the first four
coupon times, at call prices of C1 = 110, C2 = 107, C3 = 104, and C4 =
102 respectively (the call price is to be paid in addition to the
corresponding coupon payment). If it is called at the time of the jth
coupon payment, the corresponding YTC is the solution r of the
equation
P = ∑_{i=1}^{j} 10/(1 + r)^i + Cj/(1 + r)^j.
Suppose the market price at the time of issue is P = 110. Then the
YTC’s at the possible call dates are 9.09%, 7.78%, 7.40%, and
7.46%, while the YTM is 7.53%. Thus the yield to worst is 7.40%. □
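The yields quoted in Example 2.11.3 can be reproduced by solving the YTC equation numerically. The sketch below uses bisection, which is one of several suitable root-finding methods; the function names are hypothetical, and the yield to worst is taken here as the minimum over the YTC's and the YTM.

```python
def pv_if_called(r, j, coupon, final_payment):
    """Present value of j coupons plus a final payment made at time j."""
    coupons = sum(coupon / (1 + r) ** i for i in range(1, j + 1))
    return coupons + final_payment / (1 + r) ** j

def solve_yield(target_price, j, coupon, final_payment, lo=0.0, hi=1.0):
    """Bisection for r with pv_if_called(r, ...) = target_price.
    The present value is decreasing in r, so the bracket shrinks safely."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if pv_if_called(mid, j, coupon, final_payment) > target_price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

price, coupon = 110, 10
call_prices = {1: 110, 2: 107, 3: 104, 4: 102}
ytc = {j: solve_yield(price, j, coupon, c) for j, c in call_prices.items()}
ytm = solve_yield(price, 5, coupon, 100)   # held to maturity, face value 100
yield_to_worst = min(list(ytc.values()) + [ytm])
print(ytc, ytm, yield_to_worst)
```

The same routine gives the yield to put of a puttable bond if the call prices are replaced by put prices.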
PUTTABLE BONDS
Puttable bonds are the mirror image of callable ones. Now it is the
buyer who is protected against interest rate risk by having the choice
to sell the bond back to the issuer for a premium called the put price.
Puttable bonds can also be European or American. The yield to put
or YTP for a possible put date is the yield if the bond is put back on
that date, and the yield to worst is the lowest of the YTP’s.
Proper pricing of callable and puttable bonds requires the
probabilistic modelling of interest rate fluctuations. As we will not take
up that topic in this book, this is as far as we can go with such bonds.
7 The usage of this term is not uniform in the literature. It is also used for the
ratio of the NPV of the positive entries to I, or (N/I) – 1.
8 The term bootstrapping is derived from the phrase ‘lift yourself by your own
bootstraps.’
3 Random Cash Flows
3.1 RANDOM RETURNS
Consider a time interval [0,T] and an asset whose value at time 0 is
V0. Suppose its value VT at time T is not known initially (such assets
are called risky assets). Then VT can be treated as a random
variable. The rate of return r is then also a random variable. The
expectation of r is called the mean return of the asset and is denoted
by r̄. The variance of r is denoted by σ², and its standard deviation by
σ. We call σ² and σ the variance and standard deviation of the
asset. While r̄ measures the profit expected from the asset, σ
measures the associated risk. Our task, therefore, is to explore the
relationship between r̄ and σ for individual assets as well as their
collections in portfolios.
r = (VT – V0)/V0
Exercise 3.1.1 Let two assets have (random) rates of return r1 and
r2. Use the No Arbitrage Principle to show that it is not possible for
these random variables to have a constant and non-zero difference.
wi = .
SHORT SELLING
= ,
which is just the same as the rate calculated in the usual way.
However, this return is calculated on a negative initial investment.
Thus, the inclusion of short selling leads to negative weights.
Another version of short-selling is that you first borrow the asset
for a certain time, and then sell it. Finally, you repurchase it from
another source and return it to the lender. The numerical
consequences are just as above—initially you receive V0 and finally
you pay VT.
Example 3.1.2 Suppose you own Rs 200. You notice two shares, S
and T, whose prices are Rs 800 and Rs 1000, respectively. Your
analysis leads you to expect that in the next few days the price of T
will increase much more sharply than that of S. To benefit from this,
you implement a strategy of using S to pay for T as follows. You start
by short-selling S, with delivery set for 10 days in the future. This
earns you Rs 800 right away, and you pool in the Rs 200 you already
possess to buy T.
At this point, you have no cash, you own T and you owe S. Your
net worth is therefore 0 + 1000 – 800 = 200, the same as before.
After 10 days, suppose your analysis is borne out, and the values
of S and T are Rs 900 and Rs 1200 respectively. You sell T and then
use part of the gains to buy S and deliver it. You are left with Rs 300
cash. Thus, overall, you have earned a rate of return of
(300 – 200)/200 = 0.5 = 50%.
Now let us look at the individual parts of our strategy. After the first
stage we have Rs 1000 invested in T and Rs –800 invested in S
(since we owe S). Therefore the weights for S and T are:
wS = –800/200 = –4, wT = 1000/200 = 5.
Note that the weights do add to 1. The individual rates of return are
rS = (900 – 800)/800 = 100/800 = 0.125 = 12.5%
rT = (1200 – 1000)/1000 = 0.2 = 20%
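The arithmetic of Example 3.1.2 can be checked with a few lines of Python. The variable names below are our own; the amounts are those of the example, with the short position in S recorded as a negative holding.

```python
# Positions after the first stage of Example 3.1.2 (amounts in Rs).
initial = {"S": -800, "T": 1000}     # S is owed (short), T is owned
final = {"S": -900, "T": 1200}       # values after 10 days

net_worth = sum(initial.values())    # 200
weights = {k: v / net_worth for k, v in initial.items()}
returns = {k: (final[k] - initial[k]) / initial[k] for k in initial}

# Portfolio return two ways: directly, and as the weighted sum of asset returns.
direct = (sum(final.values()) - net_worth) / net_worth
weighted = sum(weights[k] * returns[k] for k in initial)
print(weights, returns, direct, weighted)
```

Both routes give a return of 50%: the negative weight on S multiplies its positive return, so the rise in S costs us 4 × 12.5%, while T contributes 5 × 20%.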
Exercise 3.1.3 What will happen in the above scenario if, after 10
days,
RANGE OF WEIGHTS
An individual weight can take on any value: it will be positive if we
own the corresponding asset and negative if we have short sold or
borrowed it. However, if short selling is not allowed, then each weight
will be non-negative and this will also force all of them to be at most
one: 0 ≤ wi ≤ 1.
r = ∑_i w_i r_i.
It follows that the mean and variance for the portfolio’s rate of return
are given by (see Appendix, §B.12)
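r̄ = ∑_i w_i r̄_i, (3.1)
σ² = ∑_i ∑_j w_i w_j σ_ij, (3.2)
where σ_ij is the covariance between the ith and jth rates of return.
These formulas can also be expressed neatly using matrix algebra:
r̄ = W^T R, (3.3)
σ² = W^T S W, (3.4)
where W is the column vector of weights, R is the column vector of
mean returns, and S is the matrix of covariances σ_ij.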
The matrix formulations are very useful when working with large
amounts of data.
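The agreement between the summation and matrix forms can be seen in a small NumPy sketch. The weights, mean returns and covariances below are hypothetical, and the names W, R and S are simply labels for the weight vector, mean return vector and covariance matrix.

```python
import numpy as np

# Hypothetical inputs for three assets.
W = np.array([0.5, 0.3, 0.2])            # weights, summing to 1
R = np.array([0.08, 0.12, 0.15])         # mean rates of return
S = np.array([[0.04, 0.01, 0.00],        # covariance matrix (sigma_ij)
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])

mean_matrix = W @ R                      # W^T R
var_matrix = W @ S @ W                   # W^T S W

# The same quantities from the explicit sums over i and j.
n = len(W)
mean_sum = sum(W[i] * R[i] for i in range(n))
var_sum = sum(W[i] * W[j] * S[i, j] for i in range(n) for j in range(n))
print(mean_matrix, var_matrix)
```

The double sum needs n^2 multiplications written out by hand, while the matrix form stays a single expression however many assets there are.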
Figure 3.2: Some investors may like risk because it opens the
possibility of higher return.
1. rA > rB and σA ≤ σB
2. rA ≥ rB and σA < σB
rP = ∑_i w_i r_i,
Exercise 3.3.1 What will be the feasible set for one asset?
ρ= .
a= > 0.
w= .
w= .
Exercise 3.3.7 Show that for ρ = –1, the feasible set for assets A and
B has the form given in Figure 3.4(b).
Figure 3.4: The feasible set for two assets with correlation (a) ρ = 1
(b) ρ = –1. The dashed and solid segments represent combinations
with and without short selling, respectively.
Exercise 3.3.8 Let the returns for each pair among the assets
A,B and C have the same correlation ρ. What is the feasible set if (a)
ρ = 1, (b) ρ = –1?
DIVERSIFICATION
An interesting feature that is already apparent is that a combination of
two assets can have a σ that is lower than for either individual asset.
For example, in Figure 3.3 there is a combination of A and B which
has the least σ. The process of reducing risk by combining investments is
known as diversification.
Exercise 3.3.9 Show that in the feasible set for two assets A and B,
risk is minimised by the portfolio in which the weight of A is given by
w = (σB^2 – ρσAσB)/(σA^2 + σB^2 – 2ρσAσB).
Figure 3.7: Feasible set for three assets with no short selling
(3.5)
(3.6)
(3.7)
(3.8)
(3.9)
Combining these with the two constraint equations (3.6) and (3.7)
gives a system of n + 2 linear equations in n + 2 variables, w1,
…,wn,λ,μ. This linear system can be expressed in matrix notation as
follows:
(3.10)
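A linear system of this kind can be assembled and solved with NumPy. The sketch below is not a reproduction of equation (3.10): the sign convention for the multipliers λ and μ, the function name and the data are all assumptions made for illustration.

```python
import numpy as np

def min_variance_weights(S, R, target):
    """Solve the (n+2) x (n+2) linear system for the weights and the two
    Lagrange multipliers. Sign convention (an assumption of this sketch):
    sum_j sigma_ij w_j - lam * r_i - mu = 0 for each i."""
    n = len(R)
    ones = np.ones(n)
    A = np.zeros((n + 2, n + 2))
    A[:n, :n] = S
    A[:n, n] = -R
    A[:n, n + 1] = -ones
    A[n, :n] = R          # constraint: sum_i w_i r_i = target
    A[n + 1, :n] = ones   # constraint: sum_i w_i = 1
    b = np.zeros(n + 2)
    b[n], b[n + 1] = target, 1.0
    solution = np.linalg.solve(A, b)
    return solution[:n], solution[n], solution[n + 1]

# Hypothetical mean returns and covariance matrix for three assets.
R = np.array([0.10, 0.15, 0.20])
S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
w, lam, mu = min_variance_weights(S, R, target=0.15)
variance = w @ S @ w
print(w, variance)
```

Since short selling is allowed in this formulation, some of the computed weights may be negative for other choices of target return.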
Exercise 3.4.1 Consider three assets A,B and C with the following
properties.
rA = 0.2, rB = 0.4, rC = 0.6,
σA = σB = σC = 1,
σAB = σAC = σBC = 0.
(3.11)
Exercise 3.4.2 Consider the assets of Exercise 3.4.1 Find their
combination with the minimum variance.
Figure 3.10: The feasible set and efficient frontier for Example 3.4.4
The first diagram in Figure 3.10 shows the feasible set for these
assets without short selling. It has been drawn by taking about 500
random combinations of the assets with positive weights. The assets
themselves are marked by the solid squares. This already shows that
the efficient frontier may not be completely smooth. There is at least
one point where it turns sharply.
In fact, Markowitz showed that the efficient frontier has the
following form. First, there are some key points on it, which we will
call turning points. The efficient frontier is obtained by connecting
adjoining turning points via their feasible curves. Thus, the frontier
consists of pieces of hyperbolas, glued together at the turning points.
The second diagram in Figure 3.10 shows the turning points (the
stars) and efficient frontier for this example. In Example 3.4.7, below,
we carry out the calculations whose results are shown in the diagram.
□
where
S= ,W= ,R=
The Two Fund Theorem is very useful for an investor who does not
have the resources or inclination to do a full analysis of the available
assets. Suppose such an investor identifies two portfolios A and B
that he has reason to think are efficient (For example, he may
consider two well managed mutual funds). Now, to create an efficient
portfolio with a desired mean return r, he only has to combine A,B
appropriately. Their weights wA and wB have to be chosen such that r
= wArA + wBrB. This is the only calculation he has to make!
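The single calculation mentioned here amounts to solving two linear equations, wA + wB = 1 and wArA + wBrB = r. A minimal sketch follows, with a hypothetical function name and made-up mean returns for the two funds.

```python
def two_fund_weights(r_target, r_A, r_B):
    """Weights (w_A, w_B) with w_A + w_B = 1 and w_A r_A + w_B r_B = r_target."""
    if r_A == r_B:
        raise ValueError("the two funds must have different mean returns")
    w_A = (r_target - r_B) / (r_A - r_B)
    return w_A, 1 - w_A

# Hypothetical funds with mean returns 8% and 14%; desired mean return 10%.
w_A, w_B = two_fund_weights(r_target=0.10, r_A=0.08, r_B=0.14)
print(w_A, w_B)
```

A target outside the interval between rA and rB produces a negative weight, that is, a short position in one of the two funds.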
Example 3.4.7 We will now use the Two Fund Theorem to derive the
results of Example 3.4.4 concerning the efficient frontier in the
absence of short-selling. Since the example has three assets, all
portfolios can be described using 2 weights. Let wA be the weight of
asset A and wB the weight of asset B. Then the weight of asset C is 1
– wA – wB. Each portfolio can therefore be represented as a point on
a plane, with coordinates wA and wB . The variance and mean return
of a portfolio are functions of wA and wB :
The portfolios with the same variance therefore lie on an ellipse, while
those with the same mean return lie on a line. This gives us a family
of concentric ellipses and another family of parallel lines, as depicted
in Figure 3.11. Figure 3.11 shows that σ² decreases as we move
towards the inner ellipses. Therefore, minimum variance portfolios (at
given levels of r) are located at the points where a constant r line
tangentially touches a constant σ² ellipse.
Figure 3.11: (Example 3.4.7) The parallel lines have constant r. The
concentric ellipses have constant σ², changing from 0.03 to 0.0178 to
0.015 as we move from the outermost to the innermost of the drawn
ellipses. The square box marks the overall minimum variance
portfolio. The thick slanting line is the minimum variance curve: each
member has minimum σ² for its level of r.
We solve equation (3.11) to obtain the overall minimum variance
portfolio. It has weights wA = 1.1023 and wB = –0.0697. (Note that it
requires short selling.) Its mean return and variance are r = 0.054 and
σ² = 0.0143.
Minimum variance portfolios at other levels of r can be obtained by
solving equation (3.10). The one at r = 0.01 has wA = 1.7236 and wB
= –0.2359. It also has σ² = 0.0178.
Exercise 3.4.9 Note that a line cuts an ellipse in at most two points.
This implies that for each feasible σ–r combination, in a universe of
three fundamental assets, there will be one or two different portfolios
with that pairing of mean return and variance. In fact, there will be one
portfolio if the combination is on the minimum-variance curve, and two
portfolios otherwise. What will happen in a universe of four or more
fundamental assets?
and
σP^2 = w^2 σA^2, so that σP = |w|σA = (σA/|rA – rf|) |rP – rf|.
The last case is the most important one. This is because, in line
with the general expectation that riskier assets offer higher returns,
we expect rV > rf to be the typical case. Our discussion is summed up
by:
It is a fact that the One Fund Theorem is true even in the absence of
short selling. We shall not formally prove this, but it is evident from the
picture.
The One Fund Theorem is also called the Separation Theorem and
was discovered by James Tobin [50] in 1958. Tobin won the Nobel
Prize in 1981.
The ray giving the efficient frontier is called the capital market line. If
the point M = (σM,rM) is known, then the equation of this line is
r = rf + ((rM – rf)/σM) σ.
m = (rP – rf)/σP = (∑_i w_i r_i – rf)/σP = ∑_i w_i (r_i – rf) / (∑_{i,j} σ_ij w_i w_j)^{1/2}.
Note that if we scale all the weights by the same positive constant, m
does not change. So the constraint ∑_i w_i = 1 can be ignored, and we
can let the weights wi vary freely. We now have a problem of
unconstrained optimisation and we solve it by setting
∂m/∂w_i = 0, i = 1,…,n.
We first solve the linear system rk – rf = ∑_j σkj vj for the vj's, and then
scale them to obtain the weights (wj) of M:
wj = vj / ∑_i vi.
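The two steps, solving the linear system and rescaling, can be sketched with NumPy as follows. The data and function names are hypothetical, and the sketch assumes that the vj have a positive sum so that the rescaling is meaningful.

```python
import numpy as np

def tangency_weights(S, R, r_f):
    """Solve S v = R - r_f for v, then rescale so the weights sum to 1."""
    v = np.linalg.solve(S, R - r_f)
    return v / v.sum()

def slope(w, S, R, r_f):
    """Slope of the line joining the risk-free asset to portfolio w."""
    return (w @ R - r_f) / np.sqrt(w @ S @ w)

# Hypothetical mean returns, covariance matrix and risk-free rate.
R = np.array([0.10, 0.15, 0.20])
S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.16]])
r_f = 0.05
w_M = tangency_weights(S, R, r_f)
m_M = slope(w_M, S, R, r_f)
print(w_M, m_M)
```

Comparing `m_M` with the slope of any other portfolio, for instance each asset held alone, confirms that the computed portfolio gives the steepest line through the risk-free asset.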
MARKET PORTFOLIO
CAPM is based on the following assumptions about market conditions
and the behaviour of investors:
The net result is that each investor will calculate the same Capital
Market Line, and will then choose a point on it, according to his level
of affinity for risk. (We are using the fact, noted at the end of the
previous section, that the One-Fund Theorem is true even in the
absence of short selling of risky assets.) Each point on the Capital
Market Line consists of a proportion of M and the risk-free asset, so
the total investment (obtained by summing over all investors) is also a
combination of the risk-free asset and M. Hence, the total investment
in risky assets is just a multiple of M and can be identified with it
(since scaling the size of an investment without changing its weights
leads to the same rate of return). Consequently, we call M the market
portfolio.
It is impossible to completely describe the market portfolio. It is
difficult to even list the risky assets fully, let alone calculate the
amounts invested in them. Instead, one usually settles on using a
comprehensive stock index as an approximation to the market
portfolio.
MARKET BETA
The coefficient β is called the market beta or just beta of the asset.
Table 3.1: Some betas calculated with respect to the NIFTY stock
index over the years 2004–2005 and 2006–2007. As this table
illustrates, betas are usually positive and mostly vary between 0.5 and
2. (Source: National Stock Exchange of India.)
CAPM FORMULA
Having identified the “One Fund” with the market portfolio, let us now
complete the profit–risk description. CAPM achieves this by
considering the relationship of any portfolio with the market portfolio.
For a given portfolio A, consider the set consisting of all combinations
of A and the market portfolio M. This set forms a curve in the feasible
set, as shown in Figure 3.16. The Capital Market Line meets this
curve at M and, as it cannot cross it, is tangential to it. Let us consider
the implications of this geometric insight.
Let P be a portfolio in which A has weight t and M has weight 1 – t.
Then its mean return and variance are given by
We do a final rearrangement:
3.6 DIVERSIFICATION
The Markowitz model has shown the benefits of diversification, i.e.,
investing in a variety of assets. By doing so, we can reduce risk while
keeping the expected return at a satisfactory level. Now we shall
apply CAPM to this idea. Inspired by the CAPM relationship for mean
returns, we consider the original returns themselves. Define a new
random variable ε by
rA = rf + β(rM – rf) + ε,
or
ε = rA – rf – β(rM – rf).
= rf + β(rM – rf).
V0 = .
NPV = –P + .
V0 = .
Exercise 3.7.2 Consider assets with current values V1 and V2. Show
the value of their combination is V1 + V2 by using:
1. No Arbitrage Principle
2. CAPM
MARKOWITZ MODEL
Here, the goal is to describe the feasible set and we need to know r
and σ for each relevant asset. One way to estimate them is to look at
data from the past. For example, we consider evenly spaced data
over a time period and use it to form a sequence of rates of return:
r1,r2,…,rN. This can be viewed as a random sample for the mean rate
of return r. Hence we can use sample mean and sample variance as
estimates for r and σ2 (see §B.15):
Generating the full feasible set requires more work. The first step
is to find the covariances for each pair of assets. For this, we use the
following estimator (see §B.15):
Figure 3.18: Efficient portfolios for the data of Figure 1.2. The square
box marks the Dow Jones Composite Index while the diamonds
represent its constituent stocks. The filled stars show the efficient
frontier when short selling is allowed, while the unfilled stars show the
efficient frontier when it is not. (See Example 3.8.1.)
Geometrically, the Sharpe index is the slope of the line joining the
risky asset to the risk-free asset in the portfolio diagram (the dashed
line in Figure 3.19). A higher value of the index indicates that the
asset is closer to the Capital Market Line. Hence the index measures
the efficiency of the asset.
CAPM
The flip side of the theoretical elegance of CAPM is a certain lack of
solidity when it comes to numerical work. The main difficulty in its use
is the elusive nature of the market portfolio M. As remarked earlier, it
is not possible to fix the composition of M—in fact it is not even
feasible to list all its constituents.
ESTIMATING BETA
In any case, let us start with the standard first step of choosing a
certain comprehensive stock index as an approximation to M. By
manipulating historical data as in the previous section, we can find
estimates of rM as well as r (for any stock). We can also estimate the
β of any stock relative to the stock index by substituting the relevant
estimates of covariance and variance into its formula.
It is worth noting that beta can also be estimated from its
interpretation as the slope of the Ordinary Least Squares line. Thus,
suppose we have data x1,…,xN for the rates of return from the market
portfolio (or the index representing it) over equal time intervals, and
also data y1,…,yN for the corresponding rates of return of a certain
stock. We try to fit a line y = a + bx to this data. Following OLS, we
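define the best line as the one that minimises the total squared error
E(a, b) = ∑_{i=1}^{N} (y_i – a – b x_i)^2.
We carry out the minimisation by applying the first derivative test. This
gives the equations
b = ∑_i (x_i – x̄)(y_i – ȳ) / ∑_i (x_i – x̄)^2
and
a = ȳ – b x̄,
where x̄ and ȳ denote the sample means of the x_i and y_i.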
We shall call the line determined by these values of a,b the OLS line.
Exercise 3.8.2 Verify the above results.
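The two routes to beta, the ratio of estimated covariance to variance and the slope of the OLS line, can be compared on data. The sketch below uses synthetic returns generated with NumPy purely for illustration; none of the numbers are market data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic (made-up) data: N = 60 periods of index returns x and stock returns y.
x = rng.normal(0.01, 0.05, size=60)
y = 0.002 + 1.3 * x + rng.normal(0.0, 0.02, size=60)

# Beta from the covariance/variance formula.
beta_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Slope and intercept of the OLS line y = a + b x.
b, a = np.polyfit(x, y, deg=1)
print(beta_cov, b, a)
```

The two estimates of beta agree up to rounding error, whichever normalisation (N or N – 1) is used, provided the same one is applied to both the covariance and the variance.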
Figure 3.21: A plot of r versus β for 35 stocks and the S&P CNX500
index. The box marks the index and the line through it is the
estimated Security Market Line. See Example 3.8.4.
1. The Security Market Line is often far from the OLS line for the
data, and so is not a good fit to it. Typically, the OLS line is flatter
than the Security Market Line—low β stocks have higher mean
returns than CAPM predicts.
2. Data suggests that market β does not completely describe the
risk–profit relationship and other variables, such as the Total
Market Capitalisation (TMC) of a stock, are also involved.
3. The r–β relationship does not always appear linear.
9 The frontier when short selling is not allowed has been plotted using the
software Efront, available from J R Varma’s website:
http://www.iimahd.ernet.in/~jrvarma. Another option is to use the Solver tool in
Microsoft Excel or OpenOffice Calc.
10 We will encounter Fischer Black again when we study financial derivatives.
He is best known for the Black−Scholes options pricing formula.
11 A good reference on these matters is Fama and French [20].
4 Forwards and Futures
FORWARD CONTRACTS
In a forward contract, the holder agrees to buy a fixed type and
amount of the underlying asset from the writer at a fixed future date
(expiration date) and at a price (exercise price) agreed upon now
(see Figure 4.1).
1. On the expiration date, the holder must pay the exercise price to
the writer.
2. In return, the writer must deliver the underlying asset to the
holder.
3. No money is exchanged at the time of signing the contract.
Figure 4.1: The structure of a forward or futures contract signed at time t = 0
with expiration time t = T and exercise price X
FUTURES CONTRACT
The main features of a futures contract are identical to a forward
contract. A futures contract may be referred to simply as a futures,
and its exercise price may again be called its strike price or futures
price.
However, futures are traded through an exchange, are
standardised, and can be traded further (the holder can sell to a new
holder). These features make futures an easily used and flexible tool
for investment. The mathematical treatment of forwards and futures is
almost identical, and so we shall mostly treat these terms as
interchangeable.
which exactly equals the holder’s profit from the contract. Similarly,
the change in the writer’s account is X – ST, which is his profit. At the
end, neither party can gain by defaulting since the value owed by the
other has already been transferred!
Proof. If $X < Se^{rT}$, the holder of the contract can earn arbitrage as
follows: He initially short sells the asset for $S$. By time $T$, this amount
grows to $Se^{rT}$. He uses $X$ of this to close the contract, get the asset,
and deliver to the buyer (through the short sale). With no further
obligations, he is left with a risk-free profit of $Se^{rT} - X$.
If $X > Se^{rT}$, it is the writer who can make an arbitrage profit: She
initially borrows $S$ and uses it to buy the asset. At time $T$ she delivers
the asset to the holder, earns $X$ and uses $Se^{rT}$ of that to pay off the
loan. She pockets a riskless profit of $X - Se^{rT}$. □
Exercise 4.2.4 Show that the exercise price is given by $X = S(1 + r)^n$
if the interest is compounded discretely and T equals n time periods.
One feature of this formula for X, which most people find surprising
when they first encounter it, is that expectations about future prices
play no role in it!
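The pricing argument above can be sketched numerically. The following is a minimal illustration, assuming continuous compounding and no transaction costs; the function names and the sample numbers are hypothetical and are not taken from the text.

```python
import math

def forward_price(spot, r, T):
    """No-arbitrage exercise price X = S e^{rT} (continuous compounding)."""
    return spot * math.exp(r * T)

def arbitrage_profit(spot, quoted_x, r, T):
    """Risk-free profit at time T when the quoted exercise price differs
    from S e^{rT}. If the quote is too low the holder profits (short the
    asset, invest the proceeds); if too high the writer profits (borrow,
    buy the asset). Either way the profit is |X - S e^{rT}|."""
    return abs(quoted_x - forward_price(spot, r, T))

# Hypothetical inputs: spot 100, 8% interest, six months to expiry.
S, r, T = 100.0, 0.08, 0.5
X_fair = forward_price(S, r, T)
print(round(X_fair, 2))                              # fair exercise price
print(round(arbitrage_profit(S, 106.0, r, T), 2))    # profit if quoted at 106
```

Note that no forecast of the future spot price appears among the inputs, which is the feature of the formula remarked on above.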
The discussion above illustrates that the No Arbitrage Principle is
a powerful mathematical tool for calculating the correct price of a
derivative. It will provide the base for every pricing formula that we
develop in this course.
Forward and futures prices are the same if interest rates are
constant. If they are variable, then differences arise due to marking to
market. For instance, suppose the spot price S of the underlying
asset is positively correlated to the interest rate r. If r increases, so
does S, and hence X, and the holder of a futures benefits because
marking to market leads to money being deposited into her margin
account. This gain can be withdrawn and reinvested. On the other
hand, a fall in r leads to withdrawals from the margin account and
may lead to her having to borrow money to maintain the account.
Overall, she comes out ahead because she borrows when interest
rates are low and invests when they are high.
However, the holder of a forward does not benefit in this way and
this creates slight differences between forward and futures prices. For
short periods (up to about 3 months) this difference is negligible but
has been observed to become significant beyond that. (For a detailed
exploration of the relation between forward and futures prices, see
Cox, Ingersoll and Ross [13].)
Exercise 4.2.5 Suppose the current stock price is $17, the current
futures price of a contract expiring in one year is $18, r = 8%, and
short-selling requires a 30% security deposit attracting interest at d =
4%. Is there an arbitrage opportunity?
1. The contract
2. An amount $Xe^{-rT}$ of cash
or,
$V_t = S_t - Xe^{-r(T-t)}$.
If we are sure that one portfolio eventually has the same value as
another, we call it a replicating portfolio of the other one. The
method of replicating portfolios, which we have just described and
illustrated, will be our basic technique for using the No Arbitrage
Principle to price derivatives.
Exercise 4.4.1 Consider two futures for the same underlying asset
(which may generate income), both expiring at T. One of them is
written at time 0 with exercise price X0 and the other is written at time
t > 0 with exercise price Xt. Use the Method of Replicating Portfolios
to show that the value of the first contract at time t is
$V_t = (X_t - X_0)e^{-r(T-t)}$,
(Note that X and It are known at time t.) At time T, both portfolios
become one unit of the asset. Hence by the Method of Replicating
Portfolios, they have the same value at all times t < T:
$V_t + Xe^{-r(T-t)} = S_t - I_t$. □
Since no money is transferred when buying a futures contract at
time t = 0, we have $V_0 = 0$. Substituting in (4.3), we find the right
exercise price as well:
$X = e^{rT}(S_0 - I_0)$. (4.4)
Let ST′ be the price of one unit of the asset at time T′. At time T′,
portfolio B earns ST′. We immediately reinvest this earning in the
asset, acquiring units of it. The total amount of asset owned now
becomes and so at time T our portfolio B is finally worth ST. This is
also the final worth of Portfolio A. On equating their initial values, we
find
+ = 1 unit,
Vt + Xe–r(T–t) = St. □
Exercise 4.4.6 Show that the exercise price of the futures contract in
the previous theorem is given by
X= S0. (4.6)
X= S0,
where S0 is the initial spot price of the underlying asset and r is the
continuously compounded interest rate.
To obtain the continuous yield we let n →∞ and find that one unit of
the asset grows into
Exercise 4.4.11 Show that the exercise price of the futures contract
in the previous theorem is given by
$X = S_0 e^{(r-q)T}$. (4.8)
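Formulas (4.4) and (4.8) can be compared side by side in a short sketch. The function names are hypothetical, and the interpretation of $I_0$ as the present value at time 0 of the income generated by the asset is an assumption made for this illustration.

```python
import math

def futures_price_known_income(s0, i0, r, T):
    """Equation (4.4): X = e^{rT} (S_0 - I_0).
    i0 is assumed to be the present value of the asset's income."""
    return math.exp(r * T) * (s0 - i0)

def futures_price_dividend_yield(s0, r, q, T):
    """Equation (4.8): X = S_0 e^{(r - q) T} for a continuous yield q."""
    return s0 * math.exp((r - q) * T)

# Hypothetical inputs: spot 50, r = 6%, one year to expiry.
print(round(futures_price_known_income(50.0, 2.0, 0.06, 1.0), 2))
print(round(futures_price_dividend_yield(50.0, 0.06, 0.03, 1.0), 2))
```

With no income ($I_0 = 0$ or $q = 0$) both functions reduce to the basic price $S_0 e^{rT}$.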
Consider the following sentence: “If you are long in a risky asset,
you can reduce risk by shorting a futures contract for it.” What it says
is that if you own a risky asset, you can reduce risk by writing a futures
contract for it.
We shall now consider certain tactics for using futures to hedge.
The tactics avoid actual transfer of the asset at any time, to eliminate
transaction or delivery costs.
SHORT HEDGE
Suppose at t = 0 we have an asset with spot price S0. We fear a fall in
its value by time T. So we carry out a short hedge:
At time T, we earn X0 from the first futures and pay XT to the writer of
the second futures. The asset acquired from the writer of the second
contract is delivered to the holder of the first, and the original asset
remains with us. Therefore the value of our portfolio at time T is
VT = ST + (X0 – XT).
The term $S_t - e^{-r(T-t)}X_t$ represents the risk, and is called the basis.
Without the hedge, the basis would have been $S_t$. Since S and X are
strongly correlated (move up or down together), the variation in
$S_t - e^{-r(T-t)}X_t$ can be expected to be much less than the variation in $S_t$ (in
fact the No Arbitrage Principle predicts that $S_t - e^{-r(T-t)}X_t = 0$). In this
way, the short hedge reduces risk.
The second step in the short hedge is optional, and can be
skipped if we actually wish to sell the asset through the original
futures. However, even when we wish to sell the asset at time T, we
may still find it beneficial to carry out the second step. The reason is
that, typically, it is the obligation of the writer to deliver the asset to
whoever is the holder at the time of expiry of the contract—in
particular, any delivery costs are borne by the writer. The second step
makes us a holder and so we can demand delivery where the holder
of the first contract resides. In this way, we avoid paying the delivery
costs ourselves. Further, we proceed to sell the asset itself in the local
market at the current price.
LONG HEDGE
Suppose we expect an inflow of funds by time T and intend to use it
to buy an asset. We are worried the spot price will rise in the interim.
We hedge against this scenario by carrying out a long hedge:
At t = 0: buy a futures with exercise price X0 and expiry date T.
At t = T: sell (write) a futures contract with exercise price XT and
expiry date T. Then close out both contracts.
CROSS HEDGE
We have seen that it is possible to use futures to completely eliminate
risk, provided they are available for the asset whose value we wish to
hedge. In reality, futures are only available for those shares and
commodities with a high enough volume of trade (for example, on the
NSE in February 2008, futures were available for only 225 out of
about 1000 listed stocks). In particular, they are not available for
individual assets such as a particular office building. Now, suppose
you are the owner of that building and wish to hedge against
fluctuations in its value. What can you do?
The answer is that you can’t do anything about the risks which are
specific to your building – such as the sudden discovery that it doesn’t
satisfy local safety rules. But it is possible to hedge against risks
arising out of general trends in the market or, more specifically, the
real estate market. This can be done by using futures which are
based on the stock of a prominent real estate company or on an index
which tracks the real estate sector.
This discussion leads us to the notion of a cross hedge, in which
the asset underlying the futures is not the same as the asset held (or
desired) by the investor. Let us denote by S the spot price of the asset
held (or desired) by the investor, and by S* that of the asset
underlying the futures.
Then the value (or cost) at time T is:
$X_0 + S_T - X_T = X_0 + (S_T^* - X_T) + (S_T - S_T^*) = X_0 + (S_T - S_T^*)$
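The effect of the hedges can be seen by evaluating the final value for a few outcomes. This is a small sketch with hypothetical function names and numbers; it relies on the fact, used in the identity above, that a futures written at its own expiry has exercise price equal to the spot price of its underlying asset.

```python
def short_hedge_value(s_T, x0, x_T):
    """Final value V_T = S_T + (X_0 - X_T) of the hedged position."""
    return s_T + (x0 - x_T)

def cross_hedge_value(s_T, s_star_T, x0):
    """Cross hedge: the futures is on another asset with spot S*.
    At expiry X_T = S*_T, so the value is X_0 + (S_T - S*_T)."""
    return short_hedge_value(s_T, x0, x_T=s_star_T)

# Perfect hedge (futures on the asset itself, so X_T = S_T):
for s_T in (80.0, 100.0, 120.0):
    print(short_hedge_value(s_T, 105.0, s_T))    # always X_0

# Cross hedge: only the gap S_T - S*_T remains as risk.
print(cross_hedge_value(98.0, 100.0, 105.0))
```

In the cross hedge the residual risk is the difference between the two spot prices, which is small when the two assets move together.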
ROLLING HEDGE
We have noticed that we may not be able to execute a perfect hedge
because futures with the right underlying asset may not be available.
Another difficulty is that we may not be able to exactly match the
expiry time of the futures with the time of purchase, either because
contracts with the right expiry date are not available, or because we
are uncertain as to the exact date when we will wish to trade the
asset. Then we can use contracts with a longer expiry date (as
illustrated in the discussion on short hedges).
If contracts with a long enough life are not available, we can
implement a sequence of hedges using futures with shorter lives.
Such a rolling hedge will reduce risk but not eliminate it. The risk
arises from fluctuations in interest rates, since the futures used in the
later stages of the rolling hedge will not lock in today’s interest rate.
Thus the current exchange rate S and the interest rates for the two
currencies determine the exchange rate X to be used in the futures.
Exercise 4.6.2 Suppose the spot rates are given with discrete
compounding, there are m compounding periods per year, and the life
of the currency futures is one compounding period. Show that
equation (4.10) will be modified to
X=S .
Suppose that after 3 months the index stands at 16,000. Then its
value is taken as Rs 15 × 16,000 = 240,000. At this point, the holder
owes Rs 227,830 to the writer, and the writer owes Rs 240,000 to the
holder. Naturally, matters can be settled via a single payment of Rs
240,000 – 227,830 = 12,170 from the writer to the holder. □
h= ρPI.
At this point, h starts to resemble the β in CAPM, with the stock index
standing in for the market portfolio, but based on return rather than
rate of return. If we adjust accordingly, we find:
h=β . (4.11)
This will act as immediate cash compensation for the hit the portfolio
must also have taken from the fall in the market, and can be
reinvested in more stable assets. □
r′ = = =r+ (X - IT).
Let us denote the beta of the original portfolio P by β. The beta of the
new portfolio is
(4.12)
So the managers actually short 59 futures, and this adjusts the beta
to
5 Stock Price Models
$S_T = Se^{rT}$.
$S_T = Se^{\mu T + c_T Z}$,
Exercise 5.1.1 If Z is a standard normal variable, then $E[e^{cZ}] = e^{c^2/2}$.
$= Se^{\mu(2T) + c_T(Z_1 + Z_2) - c_T^2}$.
$c_T^2 = \sigma^2 T$,
$U_i = \ln(S_{i+1}/S_i)$.
Figure 5.1: Each graph shows five simulations of the Lognormal
model with varying values of drift μ and volatility σ: (a) μ = 0.1 and σ
= 0.2, (b) μ = 0.1 and σ = 0.1, (c) μ = 0 and σ = 0.1.
The Ui are called log returns, since they can also be written as
ln(Si+1) – ln(Si), i.e., as the returns of the logs of the prices.
According to GBM,
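$U_i = (\mu - \sigma^2/2)\Delta t + \sigma W_{\Delta t}$,
$U_i \sim N((\mu - \sigma^2/2)\Delta t, \sigma\sqrt{\Delta t})$.
$(\mu - \sigma^2/2)\Delta t \approx u$.
$\sigma^2\Delta t \approx s^2$.
This gives:
$\mu \approx \frac{u + s^2/2}{\Delta t}$,
$\sigma \approx \frac{s}{\sqrt{\Delta t}}$.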
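The estimation procedure can be sketched in a few lines. This is only an illustration: it assumes that u and s denote the sample mean and sample standard deviation of the log returns, that prices are observed at equal intervals Δt (in years), and the function name is hypothetical.

```python
import math
import statistics

def estimate_gbm(prices, dt):
    """Estimate drift mu and volatility sigma from equally spaced prices.

    Log returns U_i = ln(S_{i+1}/S_i) have mean (mu - sigma^2/2) dt and
    variance sigma^2 dt under GBM, so we match these to the sample
    mean u and sample standard deviation s."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    u = statistics.mean(log_returns)
    s = statistics.stdev(log_returns)      # sample (n - 1) standard deviation
    sigma = s / math.sqrt(dt)
    mu = u / dt + 0.5 * sigma ** 2
    return mu, sigma

# Hypothetical price series sampled every 9 days.
prices = [10, 10.02, 9.53, 10.43, 10.30, 10.16, 10.30, 9.89, 10.17, 9.70, 10.05]
mu_hat, sigma_hat = estimate_gbm(prices, 9 / 365)
print(mu_hat, sigma_hat)
```

With so few observations the drift estimate is very unreliable, while the volatility estimate is usually far more stable.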
Figure 5.2: The first chart presents the daily closing prices of the
BSE Sensex index over 8 years from July 1997 to July 2005. On
fitting the Lognormal model to this data, we obtain μ = 0.06 and σ =
0.29. The second chart is a simulation of the lognormal model with
the same parameter values.
Table 5.1
Notice how the μ estimates jump all over the place from 0.028 to
0.305, but those for σ are stable and accurate. □
Data from stock exchanges has to be cleaned up before it can be
used in these calculations for the following reasons:
1. Since an exchange is closed on the weekends and on certain
holidays, the data is not entirely gathered at regular intervals. If
we wish to analyze daily data, we have to remove those Ui
which correspond to a gap of more than 1 day.
2. Some changes in the stock price are not to be taken literally. For
example, on July 1, 2004, Infosys shares fell by 76% on the
NSE. This was caused simply by Infosys issuing 3 bonus shares
for every share that already existed. The number of shares went
up by a factor of 4 and so the price fell to a quarter.
3. Another reason for a sudden drop in price can be the
announcement of a dividend payment. In the days before its
anticipated announcement prices rise because buyers expect an
imminent extra profit. As soon as the payment is made, prices
fall accordingly.
Figure 5.4: Twenty years of monthly log returns of the Dow Jones
Industrial Average, showing extended periods of low or high volatility
We shall, therefore, use GBM in our work with the understanding
that it doesn’t give the closest approximation to reality, but at least it
gives one that we can easily manipulate to get useful insights.
Overall, while GBM has its limitations, it remains a favourite model.
We shall soon apply it to obtain the Black–Scholes model for options
pricing.
5.4 BINOMIAL TREE MODEL
The Binomial tree model simulates stock price movements by
conceiving them as a sequence of small up or down jumps. The
basic building block of this model is the following branch:
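$E[\ln S_{\Delta t}] = \nu\Delta t$,
$\mathrm{Var}[\ln S_{\Delta t}] = \sigma^2\Delta t$,
$E[\ln S_{\Delta t}] = pu + (1 - p)d$,
$pu + (1 - p)d = \nu\Delta t$,
$u^2 = \sigma^2\Delta t + (\nu\Delta t)^2$.
$U = e^{\sqrt{\sigma^2\Delta t + (\nu\Delta t)^2}}$, $D = e^{-\sqrt{\sigma^2\Delta t + (\nu\Delta t)^2}}$, $p = \frac{1}{2} + \frac{1}{2}\,\frac{\nu\Delta t}{\sqrt{\sigma^2\Delta t + (\nu\Delta t)^2}}$.
$U \approx e^{\sigma\sqrt{\Delta t}}$. (5.8)
$D \approx e^{-\sigma\sqrt{\Delta t}}$. (5.9)
$p \approx \frac{1}{2} + \frac{1}{2}\,\frac{\nu}{\sigma}\sqrt{\Delta t}$. (5.10)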
An interesting aspect of these estimates of U and D is that they do
not involve the drift! This is a virtue as the jumps in the underlying
tree now depend only on σ, which is observable. Thus, the possible
paths for the price are independent of μ; its role is only in
determining the probability associated with a path.
At this stage we have two models for stock prices, one
continuous and the other discrete. We shall soon see that
sometimes one model is convenient, sometimes the other.
Therefore, we wish to be reassured that the two models are
consistent with each other. It turns out that if we let the time step go
to zero in the binomial tree model, it tends towards GBM. We shall
not prove this. Figure 5.6 illustrates this by an example. The first
diagram shows that even for n = 20 the probability distribution of the
final stock price, under the binomial tree approach, has a distinctly
lognormal look. We confirm this in the second diagram by comparing
the cumulative distribution functions of the final stock price under the
two models. (We have taken T = 1⁄4. The GBM model has μ = 0.2
and σ = 0.3. The parameters for the Binomial Tree have been
calculated from the first-order estimates (5.8) to (5.10), using n =
20.)
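The consistency of the two models can also be checked through the moments of the log return. The sketch below assumes the first-order estimates take the usual form $U \approx e^{\sigma\sqrt{\Delta t}}$, $D = 1/U$ and $p \approx \frac{1}{2} + \frac{1}{2}(\nu/\sigma)\sqrt{\Delta t}$ with $\nu = \mu - \sigma^2/2$; the function names are hypothetical.

```python
import math

def tree_parameters(mu, sigma, dt):
    """First-order binomial tree parameters (assumed form, see lead-in)."""
    nu = mu - 0.5 * sigma ** 2
    U = math.exp(sigma * math.sqrt(dt))
    D = 1.0 / U
    p = 0.5 + 0.5 * (nu / sigma) * math.sqrt(dt)
    return U, D, p

def tree_log_return_moments(mu, sigma, T, n):
    """Mean and variance of ln(S_T / S) under an n-step binomial tree."""
    dt = T / n
    U, D, p = tree_parameters(mu, sigma, dt)
    u = math.log(U)                              # each step is +u or -u
    mean = n * (2 * p - 1) * u
    var = n * 4 * p * (1 - p) * u * u
    return mean, var

# Parameters of the comparison above: T = 1/4, mu = 0.2, sigma = 0.3, n = 20.
mean, var = tree_log_return_moments(0.2, 0.3, 0.25, 20)
gbm_mean = (0.2 - 0.5 * 0.3 ** 2) * 0.25         # nu * T
gbm_var = 0.3 ** 2 * 0.25                        # sigma^2 * T
print(mean, gbm_mean)
print(var, gbm_var)
```

The means agree exactly and the variances differ only by a term of order Δt, which vanishes as the time step shrinks.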
(a) (b)
Figure 6.1: The structure of a European call option signed at time t = 0 with
expiration time t = T, exercise price X, and call premium C. The dashed arrows
at the t = T stage indicate that the trade is optional.
1. The holder pays the writer an initial fee (the call premium) to buy
the contract. The contract details an amount of the underlying
asset, an expiration date, and an exercise price (or strike
price).
2. On the expiration date, the holder may pay the writer the exercise
price.
3. If the holder pays up, the writer must deliver the specified amount
of the underlying asset.
Example 6.1.1 On June 3, 2005, the closing price for TISCO stock on
NSE was Rs 349.70. Call options on this stock were available with a
variety of exercise prices and expiry dates. Table 6.1 shows some of
the available exercise prices (X) for calls expiring on June 30, 2005,
as well as the premium (C) at which they could have been purchased.
The closing price for TISCO stock on June 30 was Rs 339.75.
Table 6.1: The table shows the closing call premiums (C) on June 3,
2005, for a range of exercise prices (X) that were available for call
options on TISCO stock. The options expired on June 30, 2005.
(Source: NSE)
Suppose that on June 3 you had bought a call with exercise price
Rs 320 at the closing price of Rs 29.75. On June 30 you would be
able to buy a stock worth Rs 339.75 for just Rs 320. Unfortunately,
you had already paid Rs 29.75 for this opportunity, so you end up with
a total loss of Rs 10.
You would come out ahead if the final stock price were above Rs
349.75 (ignoring the possibility of earning interest). Finally, had the
June 30 price dropped to below Rs 320, you would not exercise the
option, since exercising it would cause further loss. □
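The arithmetic of this example is easy to reproduce. The helper below is a hypothetical illustration which, like the example, ignores interest on the premium.

```python
def call_profit(final_price, strike, premium):
    """Net profit to the holder of a call: payoff max{0, S_T - X} minus
    the premium paid at the start (interest ignored)."""
    payoff = max(0.0, final_price - strike)
    return payoff - premium

print(call_profit(339.75, 320.0, 29.75))   # -10.0, the loss in the example
print(call_profit(349.75, 320.0, 29.75))   # 0.0, the break-even price
print(call_profit(310.00, 320.0, 29.75))   # -29.75, option not exercised
```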
Exercise 6.1.2 Suppose two call options are identical, except that
one has a higher exercise price. Which one will have a higher call
premium?
max{0, ST − X},
Figure 6.2: The payoff to the holder of a European call option on its expiry at T
as a function of the final spot price ST of the underlying asset
C ≥ 0. (6.1)
$C \ge S - Xe^{-rT}$. (6.2)
Combining the bounds (6.1) and (6.2) we get the following bound for
the premium of a European call:
$C \ge \max\{0, S - Xe^{-rT}\}$. (6.3)
Exercise 6.1.4 Show that the bound (6.3) is also valid for an
American call on an asset without income.
C ≤ S. (6.4)
Figure 6.3: This graph compares the premiums of calls on Maruti stock (stars)
with the lower bound given by (6.3) (diamonds). The horizontal axis represents
the time interval 23 May 2005, to 29 June 2005. The calls had an exercise or
strike price of Rs 460 and expired on June 30, 2005. During this period, the
price of a Maruti share ranged between Rs. 435 and Rs. 478.
Exercise 6.1.5 Show that the inequality (6.4) is also valid for an
American call on an asset without income.
Bounds are useful if they are not too far away from the actual values.
Figure 6.3 illustrates how the lower bound given by the inequality
(6.3) is usually reasonably close to the actual premium values. The
upper bound given by (6.4), though correct, is too large to be useful.
Here is our first surprise concerning call options:
$C_t \ge S_t - Xe^{-r(T-t)}$,
This conclusion fails when the asset generates income, because then
the early exercise of an American call would have the added benefit
of bringing in a share of this income.
1. The holder pays the writer an initial fee (the put premium) to buy
the contract.
2. On the expiration date the holder may deliver the underlying
asset to the writer.
3. If the holder delivers the asset, the writer must pay the exercise
price (or strike price).
Figure 6.5: The payoff to the holder of a European put option expiring at T as a
function of the final spot price ST of the underlying asset
Figure 6.5 shows the final payoff from a European put expiring at
T and with exercise price X. The formula for this final payoff is
max{0, X – ST}.
$P \ge Xe^{-rT} - S$. (6.5)
Combining (6.5) with P ≥ 0, we get the following lower bound for the
premium of a European put:
$P \ge \max\{0, Xe^{-rT} - S\}$. (6.6)
Exercise 6.2.3 Write down the form the inequality (6.6) will have if the
underlying asset generates either known income or a continuous
dividend yield.
An upper bound on P can be obtained if it is known that the asset
price cannot be negative. In this case, the maximum possible payoff
from a put is its exercise price X. Hence the premium of a European
put cannot exceed the present value of X:
$P \le Xe^{-rT}$. (6.7)
Exercise 6.2.4 How will you modify the bounds (6.6) and (6.7) for an
American put on an asset without income whose spot price cannot be
negative?
Figure 6.6: The payoffs to the holders of (from left to right) a European call, a
European put and a forward, each with the same underlying asset, expiry time
T and exercise price X
$C - P = S - Xe^{-rT}$.
Theorem 6.3.1 (Put−Call Parity) Suppose call and put options are
available on the same underlying asset with the same expiry date T
and exercise price X. Let the continuously compounded risk-free rate
be r, and let S be the spot price of the underlying asset. Then the call
premium C and the put premium P are related by
$P + S = C + Xe^{-rT}$. (6.8)
□
It is important to note that put–call parity is only valid for
European options. The form in equation (6.8) holds when the asset
generates no income. It is easily adapted to the cases when the asset
either generates known income or has a known dividend yield.
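Put–call parity lets us compute one premium from the other. The following is a minimal sketch for the no-income case of equation (6.8); the function names and the sample inputs are hypothetical.

```python
import math

def put_from_call(call, spot, strike, r, T):
    """European put premium implied by parity: P = C + X e^{-rT} - S."""
    return call + strike * math.exp(-r * T) - spot

def parity_gap(call, put, spot, strike, r, T):
    """(P + S) - (C + X e^{-rT}); a non-zero gap signals an arbitrage
    opportunity between the two sides of equation (6.8)."""
    return (put + spot) - (call + strike * math.exp(-r * T))

# Hypothetical inputs: C = 8, S = 100, X = 100, r = 5%, T = 6 months.
P = put_from_call(8.0, 100.0, 100.0, 0.05, 0.5)
print(round(P, 4))
print(abs(parity_gap(8.0, P, 100.0, 100.0, 0.05, 0.5)) < 1e-12)
```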
Exercise 6.3.3 What will be the put–call parity formula when (a) the
asset generates a known income, (b) the asset has a constant and
continuous dividend yield?
ONE-STEP BOPM
First, consider the n = 1 case. Imagine a European call option on this
asset which expires at T and has exercise price X. If its initial
premium is C, then the evolution of its value over the interval [0,T] is
represented by:
Here, Cu is the call payoff if the asset price moves up, and Cd is
the call payoff if the asset price moves down.
It is clear that Cu > Cd. So, the call value goes up if the asset price
goes up, and down if the asset price goes down. Now, if we become
the writer of the call, the position reverses. Profits from the asset will
be cancelled by losses from the call. If we have the right amounts of
long asset and short calls, the fluctuations can exactly cancel and we
shall have a risk-free portfolio. Thus, suppose we own h units of the
asset and write 1 call. The value of this portfolio evolves according to
the following branch:
$h = \frac{C_u - C_d}{SU - SD}$.
The ratio h is called the option’s delta, and this process of creating
a risk-free portfolio is called delta hedging.
Since the portfolio is risk-free, its initial value equals the present value
of its value at T. Thus,
$hS - C = e^{-rT}(hSU - C_u)$.
Exercise 6.4.2 Use the No Arbitrage Principle to show that $D < e^{rT} < U$,
and hence 0 < p* < 1.
Note that the probabilities of the up and down moves did not enter
into our calculations.
TWO-STEP BOPM
We now let the above process happen twice in succession, cutting
the time interval [0, T] into the equal parts [0,T/2] and [T/2,T]. Then
we have the following picture for the evolution of the spot price:
MANY-STEP BOPM
A certain pattern in the call premium formulas should now be evident.
First, we list all the possible final spot prices. Over n steps, these are
$S_T = SU^kD^{n-k}$, k = 0,1,…,n.
The payoff from the call corresponding to the final value $SU^kD^{n-k}$ is
$\max\{SU^kD^{n-k} - X, 0\}$. We multiply this payoff by the binomial
expression $\binom{n}{k}p^{*k}(1 - p^*)^{n-k}$, where $p^* = \frac{R - D}{U - D}$.
Finally, we sum all these terms and divide the sum by $R^n = e^{rT}$.
The general formula is thus obtained:
$C = e^{-rT}\sum_{k=0}^{n}\binom{n}{k}p^{*k}(1 - p^*)^{n-k}\max\{SU^kD^{n-k} - X, 0\}$, (6.10)
where $p^* = \frac{R - D}{U - D}$. □
The formula (6.10) for C has a very interesting form. First, the division
by erT represents discounting to a present value. This present value is
taken as a weighted average of the payoffs from the call at expiry.
The weights are the probabilities associated with a binomial random
variable with parameters n and p*. Thus, one is led to interpreting p*
as a probability, which is possible since we have already determined
that 0 < p* < 1 (Exercise 6.4.2). Since it arose out of making the
process risk-free, it is called the risk neutral probability.
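The n-step recipe translates directly into code. The sketch below assumes that U and D are the up and down factors for a single step and that $R = e^{rT/n}$ is the one-step growth factor; the function name is hypothetical.

```python
import math

def bopm_european_call(S, X, r, T, U, D, n):
    """European call premium from an n-step binomial tree: the discounted
    average of the final payoffs, weighted by binomial probabilities
    with risk neutral probability p* = (R - D) / (U - D)."""
    R = math.exp(r * T / n)                  # growth factor per step
    p = (R - D) / (U - D)                    # risk neutral probability
    total = 0.0
    for k in range(n + 1):                   # k = number of up moves
        payoff = max(S * U ** k * D ** (n - k) - X, 0.0)
        total += math.comb(n, k) * p ** k * (1 - p) ** (n - k) * payoff
    return total / R ** n                    # discount by R^n = e^{rT}

# One-step check with hypothetical numbers: S = X = 100, U = 1.1, D = 0.9, r = 0.
# Then p* = 0.5 and the payoffs are 10 and 0, so the premium is 5.
print(bopm_european_call(100.0, 100.0, 0.0, 1.0, 1.1, 0.9, 1))
```

Only U, D and the risk-free rate enter the calculation; the real-world probabilities of the up and down moves play no part.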
Figure 6.7: Premiums of a European call plotted against the initial spot price.
The dots represent the predictions of a 3-step BOPM. The smooth curve is
obtained from a 100-step BOPM. The dashed line is the graph of the function
$S - Xe^{-rT}$, which is asymptotic to the premium plots. (See Exercise 6.4.5)
(6.11)
where p* = . □
Exercise 6.4.7 Show that the BOPM formulas for European put and
call premiums satisfy put–call parity. (This reassures us that BOPM
correctly captures the properties of options.)
ESTIMATING BOPM PARAMETERS
To put BOPM to honest work, we need estimates of the parameters U
and D. We have already seen one way of making these estimates in
the previous chapter where we matched the binomial tree to the
lognormal model and obtained
$U \doteq e^{\sigma\sqrt{\Delta t}}$, $D \doteq e^{-\sigma\sqrt{\Delta t}}$,
□
The main issue here is the choice of σ. This example illustrates that
the model works well if we have the right value of σ, but how do we
obtain it? It is true that we have earlier given a way of estimating σ
from historical prices, but a little reflection throws up some obvious
problems. How much data should we use? What should be its
frequency? Unfortunately, our choices in these matters can have a
dramatic impact on the value that we get for σ.
One solution that has become popular is to work back from the
options themselves. Find the σ that makes BOPM give accurate
values for one set of options and use it in calculations for others! We
will return to this idea in the next chapter.
The corresponding diagram for the payoffs from the American put
is:
P* = max .
We work our way back from the right end of the tree. We first note
that:
Pu* = max ,
Pd* = max .
Finally, we apply the one-step BOPM to the first branch:
P* = max .
By now it should be clear how the process will work for an n-step tree.
This is an easy process to implement numerically. However, it does
not lead to a closed form solution (unlike the case of European put),
and so does not give any analytic insight.
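The backward process can be sketched as follows. This is an illustrative implementation under stated assumptions: the tree has n equal steps of length T/n, U and D are the one-step factors, and at every node the holder compares the payoff from immediate exercise with the discounted risk neutral average of the two successor values. The function name is hypothetical.

```python
import math

def bopm_put(S, X, r, T, U, D, n, american=True):
    """Put premium from an n-step binomial tree by backward induction."""
    dt = T / n
    disc = math.exp(-r * dt)
    p = (math.exp(r * dt) - D) / (U - D)     # risk neutral probability
    # Payoffs at expiry, indexed by the number k of up moves.
    values = [max(X - S * U ** k * D ** (n - k), 0.0) for k in range(n + 1)]
    for step in range(n - 1, -1, -1):        # move back one stage at a time
        new_values = []
        for k in range(step + 1):
            value = disc * (p * values[k + 1] + (1 - p) * values[k])
            if american:                     # early exercise is allowed
                exercise = max(X - S * U ** k * D ** (step - k), 0.0)
                value = max(value, exercise)
            new_values.append(value)
        values = new_values
    return values[0]

# Hypothetical deep in-the-money put: immediate exercise is optimal.
print(bopm_put(100.0, 150.0, 0.1, 1.0, 1.2, 0.8, 1, american=True))
print(bopm_put(100.0, 150.0, 0.1, 1.0, 1.2, 0.8, 1, american=False))
```

Setting `american=False` skips the early exercise comparison and gives the European premium, so the two can be compared on the same tree.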
p* = = 0.75.
P* = max
= 2.38.
In this case it is best not to exercise early as that has a zero payoff.
On the other hand, if we change X to 105, the payoff from exercising
early is superior. □
Figure 6.9: Variation of put premium with time to expiry plotted using a 10-step
BOPM. The higher curve is for an American put and the lower one for a
European put with the same underlying asset and exercise price.
Current Spot Price (S): The value of a call rises with S since this
suggests that ST will also be high, bringing in greater return from the
call. On the other hand, since a put brings more profit as ST
decreases, the value of a put goes down when S rises.
Exercise Price (X): A higher X decreases the profit from a call and so
lowers its value. The value of a put will rise with X.
Figure 6.10: Variation of option premiums with the risk-free rate, calculated
from a 10-step BOPM. All the cases have S = X = 10, σ = 30% and T = 1.
$h = e^{-q\Delta t}\,\frac{C_u - C_d}{SU - SD}$.
Suppose h is set to the value given in the above exercise. Then the
portfolio, being risk-free, must grow at the risk-free rate. Therefore, its
initial value is simply the present value of its final value:
$hS - C = e^{-r\Delta t}(he^{q\Delta t}SU - C_u)$.
where
$p^* = \frac{e^{(r-q)\Delta t} - D}{U - D}$,
$C_u = \max\{0, SU - X\}$,
$C_d = \max\{0, SD - X\}$.
Starting with these calculations, it is easy to carry out an n-step
BOPM over a time interval T.
(6.12)
where
p* =
where
p* = .
Exercise 6.7.5 Show that in the current context, put–call parity takes
the following form:
$P + Se^{-qT} = C + Xe^{-rT}$.
Note that the portfolio value has increased by exactly the risk-free
rate of 5%. To keep the portfolio risk-free over the next stage of the
tree, we have to adjust its hedge ratio to 0.9545. We can do this by
either buying 204.5 shares (so we have a total of 954.5) or by buying
back 214.3 calls (decreasing their number to 785.7). While the steps
are equivalent mathematically, the second one involves a much lower
investment and is therefore more practical. Thus, we decide to buy
back 214.3 calls at a price of 15 each, and we borrow 214.3 × 15 =
3214 to do so.
At t = 2, suppose the spot price again rises. Then the value of the
portfolio is
Again, the portfolio value has increased by exactly the risk-free rate. □
This is only a toy example in that the real prices would not actually go
up or down by the prescribed factors, and so the tree has to be
reconstructed after each step for the new price. This also means that
we cannot expect the hedge to be perfect. In the next example we
illustrate one way of handling an actual sequence of prices.
$U = e^{\sigma\sqrt{\Delta t}} = 1.048$
$D = e^{-\sigma\sqrt{\Delta t}} = 0.954$
$p^* = \frac{e^{r\Delta t} - D}{U - D} = 0.501$
The initial 10-step BOPM gives the following starting hedge ratio and
call premium:
h = 0.561,
C = 0.6388.
Our first step is to write 1000/h = 1783 calls. Our portfolio now
consists of 1000 shares, 1783 shorted calls and 0.6388 × 1783 =
1139 in cash. The cash cancels the value of the shorted calls, so the
initial value of this portfolio is 1000 × 10 = 10,000.
Now, suppose the stock prices take the following values at 9-day
intervals (they have been randomly generated to follow a GBM with
the given drift and volatility):
10, 10.02, 9.53, 10.43, 10.30, 10.16, 10.30, 9.89, 10.17, 9.70, 10.05.
After 9 days, the new stock price is 10.02. The 9-step BOPM gives
the hedge ratio and call premium values as:
h = 0.563,
C = 0.6436.
Figure 6.11 shows the result of repeating this process till the expiry of
the calls. The hedging is quite successful in removing the large
oscillations present in the stock price. □
Figure 6.11: Dynamic hedging with calls. The diamonds represent the unhedged
shares and the stars represent the hedged portfolio (Example 6.8.2).
= e–rT
Exercise 6.9.1 Consider a contract which will pay you the square of
the price of the underlying asset at a future time T. What is the correct
price for this contract?
Ui = ln , i = 1,…,n.
p* = .
We can now describe the variance of the overall log return under the
risk-neutral probability:
Var ln = Var ln
u=σ , d = –σ .
Therefore,
The first step is easily done. Suppose the asset price follows GBM
with drift μ and volatility σ:
$S_T = Se^{(\mu - \sigma^2/2)T + \sigma\sqrt{T}Z}$, Z ~ N(0,1).
$S_T^* = Se^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z}$, Z ~ N(0,1).
Therefore,
Example 7.1.3 Consider a contract which will pay you the square
of the final spot price of the underlying asset. What should you pay
to buy this contract? We assume the underlying asset follows GBM
with volatility σ and calculate the value as follows:
□
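Risk-neutral valuation of a payoff $f(S_T)$ can also be carried out numerically. The sketch below is an illustration under the assumption that the value is the discounted expectation of $f(S_T^*)$, where $S_T^*$ follows GBM with the drift replaced by r; the expectation is computed by a simple midpoint rule, and the function name and inputs are hypothetical. For the squared payoff, the closed form $S^2e^{(r+\sigma^2)T}$ used for comparison follows from Exercise 5.1.1.

```python
import math

def risk_neutral_value(f, S, r, sigma, T, steps=20000, zmax=10.0):
    """Discounted risk-neutral expectation e^{-rT} E[f(S_T*)], where
    S_T* = S exp((r - sigma^2/2) T + sigma sqrt(T) Z), Z ~ N(0, 1).
    The integral over z is approximated by the midpoint rule."""
    dz = 2.0 * zmax / steps
    total = 0.0
    for i in range(steps):
        z = -zmax + (i + 0.5) * dz
        s_T = S * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += f(s_T) * density * dz
    return math.exp(-r * T) * total

# Hypothetical inputs: S = 10, r = 5%, sigma = 30%, T = 1.
S, r, sigma, T = 10.0, 0.05, 0.3, 1.0
numeric = risk_neutral_value(lambda s: s * s, S, r, sigma, T)
closed_form = S ** 2 * math.exp((r + sigma ** 2) * T)
print(numeric, closed_form)
```

As a sanity check, the payoff $f(s) = s$ returns the current spot price S, since the discounted risk-neutral expectation of the asset price is its present price.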
7.2 THE BLACK–SCHOLES FORMULA
Theorem 7.2.1 (Black–Scholes Formula for European Call
Options) Consider a European call with expiry date T and exercise
price X on an asset following GBM13 with volatility σ. Then the call
premium at t = 0 is given by
$C = S\Phi(w) - Xe^{-rT}\Phi(w - \sigma\sqrt{T})$,
where
$w = \frac{\ln(S/X) + (r + \sigma^2/2)T}{\sigma\sqrt{T}}$. (7.2)
a= .
Therefore,
(7.4)
where
w = –a + σ = .
This is the same payoff as that for a European call on the asset
underlying the futures and with exercise price $Xe^{-r(T_F - T_O)}$. Therefore,
the value $C_F$ is given by the Black–Scholes formula for such a call:
where
w= = .
2.
$V = e^{-rT}E[f(S_T^*)]$,
Exercise 7.4.2 Show that the premiums of European puts and calls
on an asset with dividend yield q and volatility σ are given by
7.5 BLACK–SCHOLES AND BOPM
An alternate derivation of the Black–Scholes equation is to start with
a BOPM having time-step Δt, and then let Δt → 0 to get a
continuous model. In taking this limit, the binomial distribution in the
BOPM converts to the normal distribution in the Black–Scholes. In
particular, by taking a small enough Δt, we get a discrete
approximation of the Black–Scholes. This is illustrated in Figure 7.2,
which shows that even a 10-step BOPM gives a good approximation
to the continuous limit.
The Black–Scholes formula has a simple form which lends itself
to quick calculation as well as analytic treatment leading to general
conclusions. BOPM involves considerably more computation (if large
tree sizes are used). On the other hand, it is a much more flexible
tool than Black–Scholes. For instance, BOPM can be used to price
American puts whereas Black–Scholes cannot. BOPM can also
easily handle the so-called exotic options, which have more
complicated rules. In general, we could say that the scope of BOPM
is wider but the analysis from Black–Scholes is deeper.
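The convergence of BOPM to Black–Scholes can be observed numerically. The sketch below uses the parameters quoted for Figure 7.2 at t = 0 (T = 1, S = X = 100, r = 10%, σ = 0.3) and assumes the tree parameters $U = e^{\sigma\sqrt{\Delta t}}$, $D = 1/U$; the function names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, X, r, sigma, T):
    """C = S Phi(w) - X e^{-rT} Phi(w - sigma sqrt(T))."""
    w = (math.log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return S * norm_cdf(w) - X * math.exp(-r * T) * norm_cdf(w - sigma * math.sqrt(T))

def bopm_call(S, X, r, sigma, T, n):
    """n-step BOPM call premium with U = exp(sigma sqrt(dt)), D = 1/U."""
    dt = T / n
    U = math.exp(sigma * math.sqrt(dt))
    D = 1.0 / U
    p = (math.exp(r * dt) - D) / (U - D)     # risk neutral probability
    total = 0.0
    for k in range(n + 1):
        payoff = max(S * U ** k * D ** (n - k) - X, 0.0)
        total += math.comb(n, k) * p ** k * (1 - p) ** (n - k) * payoff
    return math.exp(-r * T) * total

bs = black_scholes_call(100.0, 100.0, 0.10, 0.3, 1.0)
for n in (10, 50, 200):
    print(n, bopm_call(100.0, 100.0, 0.10, 0.3, 1.0, n), bs)
```

The gap between the two premiums shrinks as n grows, at the cost of more computation for the tree.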
Figure 7.2: Comparison of Black–Scholes and BOPM. The call premium C
has been plotted against the elapsed time t. The continuous path tracks the
prices predicted by Black–Scholes, while each dot has been calculated using
a 10-step BOPM. (In this instance, we have taken T = 1, S = X = 100, r =
10% and σ = 0.3.)
DELTA
Consider a European call with exercise price X and expiry time T on
a stock with initial spot price S, drift μ and volatility σ. Let the risk-
free rate be r. Then, according to the Black–Scholes model, the call
premium C is a function of S, X, T, σ and r. We are interested in how
changes in any of these will affect C. To start, we consider the effect
of changes in S. To this end, we define the delta of the option by
$\delta_C = \frac{\partial C}{\partial S}$.
$C = S\Phi(w) - Xe^{-rT}\Phi(w - \sigma\sqrt{T})$.
$\delta_C = \Phi(w)$.
DELTA HEDGING
Consider a portfolio which is long one share and h calls. Then its
value is V = S + hC. To hedge against changes in the spot price, we
set
$\frac{\partial V}{\partial S} = 0$.
This gives:
$0 = 1 + h\delta_C = 1 + h\Phi(w)$, or $h = -\frac{1}{\Phi(w)}$.
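$w = \frac{\ln(S/X) + (r + \sigma^2/2)T}{\sigma\sqrt{T}} = 0.157$.
$C = S\Phi(w) - Xe^{-rT}\Phi(w - \sigma\sqrt{T}) = 0.653$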
After 9 days, the spot price is 10.02. Black–Scholes gives the new
call premium to be 0.628. The value of the portfolio is now
$10.02 \times 1000 - 1778 \times 0.628 + 1162\,e^{0.05 \times 9/365} = 10{,}067$.
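The first step of this hedge can be recomputed from the Black–Scholes formula. The inputs below (S = X = 10, r = 5%, σ = 30%, 90 days to expiry, 1000 shares) are an assumption: they are the values that appear consistent with the figures quoted in the example, and the function names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, X, r, sigma, T):
    """Return (premium, delta) of a European call under Black-Scholes."""
    w = (math.log(S / X) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    premium = S * norm_cdf(w) - X * math.exp(-r * T) * norm_cdf(w - sigma * math.sqrt(T))
    return premium, norm_cdf(w)

# Assumed inputs (see lead-in).
X, r, sigma, shares = 10.0, 0.05, 0.30, 1000
C0, delta0 = bs_call(10.0, X, r, sigma, 90 / 365)
calls_written = round(shares / delta0)       # delta-neutral number of calls
cash = calls_written * C0                    # premium received, held as cash

# Nine days later the spot price is 10.02 and 81 days remain.
C1, _ = bs_call(10.02, X, r, sigma, 81 / 365)
value = shares * 10.02 - calls_written * C1 + cash * math.exp(r * 9 / 365)
print(calls_written, round(C0, 3), round(C1, 3), round(value))
```

Small differences from the figures in the text arise because the text rounds the premium and the cash holding before combining them.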
GAMMA
The most important parameter affecting C is the spot price S. For a
closer look at their relationship, we involve the second derivative as
well. Therefore, we define the gamma of the call by
$\gamma_C = \dfrac{\partial^2 C}{\partial S^2} = \dfrac{e^{-w^2/2}}{S\sigma\sqrt{2\pi T}}$.
Exercise 7.7.2 Confirm the formula for the gamma of a call given
above.
DELTA–GAMMA HEDGING
We have already encountered delta hedging, where the delta is set
to zero by shorting an appropriate number of calls. We can further
reduce risk due to changes in the spot price by also setting the
gamma to zero.
To do this, we create a portfolio consisting of the stock and two
different calls on this stock (e.g. they could have different expiry
dates or exercise prices). Suppose it has one share, x1 units of the
first call and x2 units of the second call. We similarly use Ti, Xi, Ci to
represent the expiry time, exercise price and premium of the ith call.
Then the value V of the portfolio is given by:
V = S + x1C1 + x2C2.
$1 + x_1\,\Phi(w_1) + x_2\,\Phi(w_2) = 0,$
$x_1\,\dfrac{e^{-w_1^2/2}}{\sqrt{T_1}} + x_2\,\dfrac{e^{-w_2^2/2}}{\sqrt{T_2}} = 0.$
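The two conditions form a linear system in x1 and x2, which can be solved directly. The following is a minimal sketch with hypothetical inputs (S = 100, σ = 0.3, r = 10%, and two calls with strikes 100 and 110 expiring in 0.5 and 1 year). It uses the Black–Scholes delta Φ(w) and gamma e^(−w²/2)/(Sσ√(2πT)), with w assumed to be the usual Black–Scholes quantity.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def delta_gamma(S, X, T, r, sigma):
    """Black-Scholes delta and gamma of a European call."""
    w = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    delta = norm_cdf(w)
    gamma = math.exp(-0.5 * w * w) / (S * sigma * math.sqrt(2 * math.pi * T))
    return delta, gamma

# Hypothetical market data, chosen only for illustration
S, r, sigma = 100.0, 0.10, 0.3
d1, g1 = delta_gamma(S, 100.0, 0.5, r, sigma)   # first call
d2, g2 = delta_gamma(S, 110.0, 1.0, r, sigma)   # second call

# Solve  1 + x1*d1 + x2*d2 = 0  and  x1*g1 + x2*g2 = 0  by Cramer's rule
det = d1 * g2 - d2 * g1
x1 = -g2 / det
x2 = g1 / det
print(x1, x2)
```

The system has a unique solution only when the determinant is non-zero, which is why the two calls must genuinely differ in expiry or exercise price.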
THETA
First we consider the time factor. Let t be a time instant during the life
of the call. Then the time to expiry is T – t, and the value of the call at
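t is given by
$C(t) = S\,\Phi(w(t)) - X e^{-r(T-t)}\,\Phi\!\left(w(t) - \sigma\sqrt{T-t}\right)$,
where
$w(t) = \dfrac{\ln(S/X) + (r + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}$.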
Note that ΘC < 0 and so the value of the call decreases with time.
Alternatively, we can interpret this as: the value of the call increases
with the time to expiry T.
VEGA
Next, we consider the dependence on the volatility σ. We define the
vega of the call by
$\mathcal{V}_C = \dfrac{\partial C}{\partial \sigma}$.
RHO
The parameter rho measures the dependence on the risk-free rate r,
and is defined by
$\rho_C = \dfrac{\partial C}{\partial r}$.
$\rho_C = T X e^{-rT}\,\Phi(w - \sigma\sqrt{T})$. (7.10)
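Delta: $\delta_V = \partial V / \partial S$
Gamma: $\gamma_V = \partial^2 V / \partial S^2$
Vega: $\mathcal{V}_V = \partial V / \partial \sigma$
Theta: $\Theta_V = \partial V / \partial t$
Rho: $\rho_V = \partial V / \partial r$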
Exercise 7.8.1 Show that the Greeks for a European put are given
by
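$\delta_P = -\Phi(-w)$
$\gamma_P = \dfrac{e^{-w^2/2}}{S\sigma\sqrt{2\pi T}}$
$\mathcal{V}_P = S\sqrt{\dfrac{T}{2\pi}}\;e^{-w^2/2}$
$\Theta_P = -\dfrac{S\sigma}{2\sqrt{2\pi T}}\,e^{-w^2/2} + rXe^{-rT}\,\Phi(\sigma\sqrt{T} - w)$
$\rho_P = -TXe^{-rT}\,\Phi(\sigma\sqrt{T} - w)$.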
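$\Delta V \doteq \delta_V\,\Delta S + \tfrac{1}{2}\gamma_V\,(\Delta S)^2 + \mathcal{V}_V\,\Delta\sigma + \Theta_V\,\Delta t + \rho_V\,\Delta r$.
$\Theta_V = rV - rS\,\delta_V - \tfrac{1}{2}\sigma^2 S^2\,\gamma_V$.
$\Theta_V + rS\,\delta_V + \tfrac{1}{2}\sigma^2 S^2\,\gamma_V - rV = 0$.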
These calculations are valid for any t (not only t = 0) and so we get
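$\dfrac{\partial V}{\partial t} + rS\,\dfrac{\partial V}{\partial S} + \tfrac{1}{2}\sigma^2 S^2\,\dfrac{\partial^2 V}{\partial S^2} - rV = 0$. (7.12)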
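As a sanity check, the closed-form Greeks of a European call can be substituted into this equation and the left side evaluated numerically. The sketch below uses the parameters of Figure 7.2 and assumes the standard Black–Scholes expressions for the call's value, delta, gamma and theta; the residual should vanish up to rounding error.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_value_and_greeks(S, X, T, r, sigma):
    """Value, delta, gamma and theta of a call; T is the time to expiry."""
    root = sigma * math.sqrt(T)
    w = (math.log(S / X) + (r + 0.5 * sigma**2) * T) / root
    discounted_strike = X * math.exp(-r * T)
    value = S * Phi(w) - discounted_strike * Phi(w - root)
    delta = Phi(w)
    gamma = phi(w) / (S * root)
    theta = -S * sigma * phi(w) / (2 * math.sqrt(T)) - r * discounted_strike * Phi(w - root)
    return value, delta, gamma, theta

S, X, T, r, sigma = 100.0, 100.0, 1.0, 0.10, 0.3
V, delta, gamma, theta = call_value_and_greeks(S, X, T, r, sigma)

# Left side of the Black-Scholes equation: Theta + r S delta + (1/2) sigma^2 S^2 gamma - r V
residual = theta + r * S * delta + 0.5 * sigma**2 * S**2 * gamma - r * V
print(residual)
```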
s = T – t,
The benefit is that the heat equation has been extensively studied
and there are standard techniques for its solution, both theoretical
and numerical.
E = .
The risk-free probabilities for the up and down moves in this tree are
Table 7.1: Data for call options on TCS stock traded on the National
Stock Exchange on August 31, 2005. All these options had 29
September 2005 as the expiry date.
ℙ[ST ≥ 0] = ℙ Z ≥- = 65%.
It is this last factor, the possibility of being totally wiped out, which is
the downside of speculating with options. □
Profit =
Profit =
The maximum profit has been capped at 47, and the possible loss is
now about 75% of what it would be if the investor just went long in
the first call. □
BUTTERFLIES
Butterflies are combinations used by investors who expect a certain
amount of variation in the spot price but are not sure about the
direction in which it will occur. Butterflies come in two flavours
depending on whether the expected variations are small or large.
Consider an investor who expects the future spot price to lie in a
certain small range. He can speculate by creating a combination of
options whose profit profile is as follows:
Thus, he profits if the spot price stays in the expected range, and
has limited loss if it does not. This kind of profile can be created via
the following steps:
STRADDLES
The strategies considered so far used only calls or only puts.
Now we consider combinations of puts and calls. Straddles, like
butterflies, are used to speculate on the expected volatility of the
spot price. However they raise both the possible profit as well as
loss.
A bottom straddle or straddle purchase is used when the
investor wants to bet on a large move in the spot price. It consists of
buying a call and put with the same expiry date and exercise price.
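The profit profile of a bottom straddle follows directly from this description. The sketch below ignores the time value of the premiums, which is a simplifying assumption, and uses hypothetical premiums of 5 for the call and 4 for the put.

```python
def straddle_profit(s_T, X, call_premium, put_premium):
    """Profit at expiry from buying a call and a put with the same
    exercise price X, when the final spot price is s_T.
    Interest on the premiums is ignored (simplifying assumption)."""
    payoff = max(s_T - X, 0.0) + max(X - s_T, 0.0)   # equals |s_T - X|
    return payoff - (call_premium + put_premium)

# Hypothetical numbers: X = 100, call premium 5, put premium 4
for s_T in (80.0, 100.0, 120.0):
    print(s_T, straddle_profit(s_T, 100.0, 5.0, 4.0))
```

The loss is largest when the spot price ends exactly at the exercise price, and a profit arises only once the move in either direction exceeds the combined premium.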
STRANGLES
Strangles are like straddles, except that the put and call have
different exercise prices. Therefore the profit profile has a flat portion
in the centre:
A strangle would be used by an investor expecting a large jump
in the spot price.
COLLAR
A collar has a similar profile to a bull spread. It is used by investors
who want to protect the gains made by a stock that they own. For
example, if you own a stock that has done well recently and you wish
to sell it after a month, you can use a collar as protection against a
subsequent drop in its price. A collar consists of the following pieces:
Suppose the put has initial premium P and the call has initial
premium C. Then there is an initial expense of P – C in setting up the
collar (and it is possible that this is a gain rather than an expense).
The final value of this portfolio has the following structure as a
function of ST:
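A short sketch can make the final value concrete. It assumes the usual construction of a collar: the stock itself, a long put with the lower exercise price and a short call with the higher one. This matches the initial expense P – C mentioned above, but the strike values below are hypothetical.

```python
def collar_final_value(s_T, put_strike, call_strike):
    """Final value of stock + long put + short call, assuming
    put_strike < call_strike. Premiums are not included."""
    long_put = max(put_strike - s_T, 0.0)
    short_call = -max(s_T - call_strike, 0.0)
    return s_T + long_put + short_call

# Hypothetical strikes 90 and 110
for s_T in (80.0, 100.0, 130.0):
    print(s_T, collar_final_value(s_T, 90.0, 110.0))
```

Under these assumptions the final value is simply S_T clamped to the band between the two exercise prices, which is the bull-spread-like profile described above.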
Exercise 7.10.7 Explain how to create combinations of options with
the following profit profiles.
13 Recall that GBM is a good model for stock which does not pay dividends.
8 Value at Risk
or
This means that the loss over the next n days will be less than V with
probability P%. P is usually of the order of 95 to 99.¹⁴ Note that V is
the magnitude of the possible loss and is given as a positive number.
It is also assumed that the composition of the portfolio does not
change during this time.
Z=
0.95 = ℙ[R ≥ –V ] = ℙ .
Therefore,
If we can answer this question, we can also find the change in the
value of the entire portfolio:
$\Delta V_i \doteq \dfrac{\partial V_i}{\partial S_i}\,\Delta S_i$.
Note that ∂Vi ⁄ ∂Si can be obtained from either a model such as
Black–Scholes or estimated from historical data. We thus have our
linear model:
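$\delta_C = \Phi(w) = 0.76$
$\Delta C \approx \delta_C\,\Delta S = 0.76\,\Delta S$
$\gamma_C = \dfrac{e^{-w^2/2}}{S\sigma\sqrt{2\pi T}} = 0.0034$
$\Theta_C = -\dfrac{S\sigma}{2\sqrt{2\pi T}}\,e^{-w^2/2} - rXe^{-rT}\,\Phi(w - \sigma\sqrt{T}) = -218.8$
$\Delta C \approx \delta_C\,\Delta S + \tfrac{1}{2}\gamma_C\,(\Delta S)^2 + \Theta_C\,\Delta t$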
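The quadratic approximation is easy to evaluate once the three Greeks are known. The sketch below uses the quoted values δC = 0.76, γC = 0.0034 and ΘC = –218.8, and also rewrites the quadratic by completing the square, which is where the shift δC/γC ≈ 223.5 comes from. The time step Δt = 10/365 is an assumption made for illustration only.

```python
delta, gamma, theta = 0.76, 0.0034, -218.8   # Greeks quoted in the text

def delta_c_quadratic(dS, dt):
    """Delta-gamma-theta approximation to the change in the call premium."""
    return delta * dS + 0.5 * gamma * dS**2 + theta * dt

def delta_c_completed_square(dS, dt):
    """The same quadratic, rewritten by completing the square in dS."""
    shift = delta / gamma                    # about 223.5
    return 0.5 * gamma * (dS + shift)**2 - delta**2 / (2 * gamma) + theta * dt

dt = 10 / 365        # assumed: 10 calendar days, measured in years
for dS in (-100.0, 0.0, 50.0):
    print(dS, delta_c_quadratic(dS, dt), delta_c_completed_square(dS, dt))
```

Because the approximation is a function of (ΔS + 223.5)², its distribution can be expressed through the distribution of a shifted ΔS.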
F△C(x) ≈ F(△S+223.5)2
= F△S+223.5 - F△S+223.5 -
=Φ -Φ .
With this formula in hand, it is not hard to locate the 95% VaR: it is
53.17. □
Example 8.4.1
where △S1 ~ N(–10,0.5), △S2 ~ N(5,1) and △S3 ~ N(0,3). Let the
correlation matrix be
Figure 8.4: Histogram showing the frequencies for the simulated data of
Example 8.4.1. The dark region marks the 95% VaR.
□
If we have models for the values of all the assets in the portfolio,
we need not use the linear or quadratic approximations. Instead, we
can use the models to exactly calculate the change △VP arising from
any set of changes △Si. This is called full valuation and while it
does away with one set of approximations, it considerably increases
the numerical work. We illustrate it for the case of a single call.
Example 8.4.2 We pay a final visit to the situation of the call on TCS
stock (Examples 8.1.2 and 8.3.1). We have estimated the 10-day
return from one share to be normal with mean 11 and standard
deviation 52. Since the initial share price was 1405, the final share
price has a N(1416,52) distribution.
We can simulate the final price S10 and for each value s that we
obtain, we calculate the new call premium c using the Black-Scholes
Formula. In our case, we have
$E[S_{t+a}] = e^{\mu t} S_a$.
$M_t = e^{-\mu t} S_t$.
Now suppose you start by betting Rs 1 and the first head appears on
the (N + 1)th toss. Your stake on that throw is $2^N$ and your total gain
(after deducting your previous losses) is
$2^N - \sum_{i=0}^{N-1} 2^i = 2^N - (2^N - 1) = 1$.
So this strategy leads to a curious situation. Almost all the time you
will win a small amount. But if you do lose, you will lose very badly
indeed. All the risk has been concentrated into one tiny and therefore
extremely toxic zone.
Let us do a mean–variance analysis of this strategy over 10
tosses. There are two possibilities, win and lose, with the following
payoffs and probabilities:
Scenario    Payoff          Probability
Lose        $1 - 2^{10}$    $1/2^{10}$
Win         $1$             $1 - 1/2^{10}$
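The mean and variance of this two-outcome payoff can be computed exactly with rational arithmetic. The sketch below assumes a fair coin and a strategy that stops after 10 tosses, so that the losing scenario (ten tails in a row) costs 2^10 – 1.

```python
from fractions import Fraction

n = 10                                   # number of tosses
p_lose = Fraction(1, 2**n)               # ten tails in a row
outcomes = [
    (1 - 2**n, p_lose),                  # lose every stake: 1 + 2 + ... + 2^9
    (1, 1 - p_lose),                     # win Rs 1 net
]

mean = sum(payoff * prob for payoff, prob in outcomes)
variance = sum((payoff - mean)**2 * prob for payoff, prob in outcomes)
print(mean, variance)
```

The expected gain is zero, as it must be for a fair game, while the variance is large relative to the Rs 1 that is won almost every time.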
Calculus
This chapter contains the calculus that is used in this book. Some of
it should be familiar to you, while the latter parts may be new. You do
not need to read it all at one go. Refer to it as the need arises.
Our treatment of calculus is admittedly superficial and you may
want to supplement it with a more detailed text. There are
innumerable books that you can consult. Most of them will teach you
how to calculate—Apostol [1] will teach you how to think.
DIFFERENTIAL CALCULUS
Let us start with a quick review of calculus in one variable. If we have
a function f : I → ℝ, where I is an open interval in ℝ, then the
derivative or differential of f at a ∈ I is defined by
$f'(a) = \lim_{h \to 0} \dfrac{f(a+h) - f(a)}{h}$,
provided the limit exists. The quantity f ′(a) is visualised as the
instantaneous rate of change of the value of f at a. Practically, if we
know f ′(a) and f(a), then we can estimate the value of f at a nearby
point a + h by
f(a + h) ≈ f(a) + f ′(a)h.
If we draw the graph of f, then f ′(a) provides the slope of the line
which is tangent to the graph at (a, f(a)). (See Fig. A.1.)
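A small numerical illustration of the estimate f(a + h) ≈ f(a) + f ′(a)h: the sketch below approximates √101 from the known values at a = 100. The choice of function and point is arbitrary.

```python
import math

def linear_estimate(f_a, fprime_a, h):
    """First-order estimate of f(a + h) from f(a) and f'(a)."""
    return f_a + fprime_a * h

a, h = 100.0, 1.0
# For f(x) = sqrt(x) we have f'(x) = 1 / (2 sqrt(x))
estimate = linear_estimate(math.sqrt(a), 1.0 / (2.0 * math.sqrt(a)), h)
exact = math.sqrt(a + h)
print(estimate, exact, abs(estimate - exact))
```

The error is small because h is small relative to the scale on which f ′ changes; it grows roughly like h² as h increases.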
Recall that two functions f, g can be added, subtracted,
multiplied, divided or composed according to the following rules:
Figure A.1: The graph of a function f(x) showing the tangent line at a point (a,
f(a)). The tangent line has slope f ′ (a).
ALGEBRA OF DERIVATIVES
For the purposes of this book, one has to be aware of the following
basic rules involving derivatives.
$\left(\dfrac{f}{g}\right)'(x) = \dfrac{f'(x)\,g(x) - f(x)\,g'(x)}{g(x)^2}$.
Chain rule:
(f ∘ g) ′ (x) = f ′ (g(x))g ′ (x).
Figure A.2: The graph of a function f(x) showing local extrema at x = a,b.
There is a local maximum at x = a and a local minimum at x = b.
For example, the functions x³, x⁴ and –x⁴ all have zero first and
second derivatives at x = 0. For x³, x = 0 is not even a local
extremum, while for x⁴ it is a local minimum and for –x⁴ it is a local
maximum. (See Fig. A.3.)
INTEGRAL CALCULUS
If functions f,g are related by f ′(x) = g(x) at every x we call f the
anti-derivative or indefinite integral of g and denote the relationship by
$f(x) = \int g(x)\,dx$.
Hence there is a constant A such that $f(x)\,e^{-qx} = A$, or $f(x) = A\,e^{qx}$ for
every x. Substituting x = 0, we find that f(0) = A, and so f has to be of
the form
$f(x) = f(0)\,e^{qx}$. □
Now suppose $\int f(x)\,dx = F(x)$. Then we define the definite
integral of f over an interval [a,b] by
$\int_a^b f(x)\,dx = F(b) - F(a)$.
Figure A.4: The area of the shaded region under the graph of f is given by
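Linearity: If f,g are two functions and c,d two numbers, then
$\int_a^b \big(c\,f(x) + d\,g(x)\big)\,dx = c\int_a^b f(x)\,dx + d\int_a^b g(x)\,dx$.
Splitting: If a ≤ b ≤ c, then
$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$.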
IMPROPER INTEGRALS
We have defined definite integrals over an interval [a,b] and stated
that these represent the area under a curve. In some problems the
curve extends over the whole real line and we are interested in the
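entire area under it. Naturally, we represent this area by
$\int_{-\infty}^{\infty} f(x)\,dx$,
but how is this defined? Well, we first define the integral from some
point a to ∞ as a limit of ones we already know how to calculate:
$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx$,
and then
$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{a \to -\infty} \int_a^{\infty} f(x)\,dx$.
Two limits are involved in this definition and if either of them does not
exist, we say that $\int_{-\infty}^{\infty} f(x)\,dx$ diverges or does not exist.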
∇f(a) = .
In fact it does not always hold. The next theorem describes a broad
class of situations when it holds.
Figure A.5: Choice of order of integration in double integration. Over the same
to
Look at the shaded region in the two diagrams in Fig. A.5. Suppose
it is named U. In the first diagram, we have marked its boundary as
made of two curves: the lower part is the graph of y = g(x) and the
upper part is the graph of y = h(x). In both cases, x varies from a to
b. If we look at any of the vertical line segments filling out U, then on
each of these segments, x is fixed while y varies from g(x) to h(x). So
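we can carry out the integral
$I(x) = \int_{g(x)}^{h(x)} f(x,y)\,dy$,
and this represents the integral of f over the vertical line segment
located by x. The function I(x) is defined for x ∈ [a,b] and so we can
integrate it too:
$\int_a^b I(x)\,dx = \int_a^b \int_{g(x)}^{h(x)} f(x,y)\,dy\,dx$.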
It is possible that the two choices give different results and in that
case we would not have a well-defined double integral. However, if f
is continuous then it is guaranteed that they will give the same result,
and we can use the one which is more convenient. In this case, we
may also write for the double integral, indicating the region of
integration and the integrand but not the choice of order of
integration.
CHANGE OF VARIABLES
Consider a double integral
x = x(u, v),
y = y(u, v),
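$J(u,v) = \det\begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix}$.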
x = r cos θ
y = r sin θ
Now consider the double integral of a function f over the disk with
center at origin and radius R. In polar coordinates, this corresponds
to a rectangle with the r side going from 0 to R and the θ side going
from 0 to 2π. Therefore the double integral can be expressed as
$\int_0^{2\pi}\!\int_0^R f(r\cos\theta,\, r\sin\theta)\; r\,dr\,d\theta$.
□
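The polar form lends itself to a simple numerical check. The sketch below applies a midpoint rule on the (r, θ) rectangle, including the Jacobian factor r, and integrates two test functions over a disk of radius 2. The grid sizes and test functions are arbitrary choices for illustration.

```python
import math

def polar_double_integral(f, R, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral of f(x, y) over the
    disk of radius R, computed in polar coordinates."""
    dr = R / n_r
    dtheta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            # The factor r is the Jacobian of the polar change of variables
            total += f(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

area = polar_double_integral(lambda x, y: 1.0, R=2.0)            # exact: pi R^2
moment = polar_double_integral(lambda x, y: x * x + y * y, R=2.0)  # exact: pi R^4 / 2
print(area, moment)
```

Omitting the factor r would give the area of the rectangle [0, R] × [0, 2π] instead of the disk, which is a quick way to see why the Jacobian is needed.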
Appendix B
1. ℙ(S) = 1.
1. ℙ(∅) = 0.
Example B.2.1
□
Remark Thus the number fX(x) does not represent the probability
that X = x. Individual values of fX have no significance, only the
integrals of fX do! (Contrast this with the discrete case.)
If two random variables X,Y have the same pdf, we write $X \overset{d}{=} Y$
and say that they have the same distribution. Note that this
does not mean the two random variables are equal. They may not
even have the same sample space.
The graph of FX is
The cdf of X is
Figure B.1: The first diagram shows how to read off quartiles from a
cdf plot. The second shows the median and interquartile range
marked against the density function.
These two examples illustrate the basic properties of the cdf FX:
Exercise B.3.6 Identify the quartiles and interquartile range for the
random variable in Example B.3.4.
Exercise B.3.7 Identify the quartiles and interquartile range for the
random variable in Example B.3.5.
The median and interquartile range provide a summary of the
basic features of the random variable. Unfortunately, they are not so
well suited to the study of combinations of random variables. For
example, suppose a random variable is the sum of two random
variables whose medians are known. This knowledge does not
suffice to give the median of the sum. We shall, therefore, develop
more sophisticated measures of the centre and the spread. These
will be called expectation and variance, respectively.
Now we shall look at two kinds of random variables, one discrete
and one continuous, which are especially important.
1. 0 ≤ p, q ≤ 1,
2. p + q = 1
Therefore,
□
Exercise B.6.4 Let X be any random variable, and g,h two real
functions. Then
E[g(X) + h(X)] = E[g(X)] + E[h(X)].
VARIANCE
Given some data $x_1,\dots,x_n$, its average $\bar{x}$ is seen as a central value
about which the data is clustered. The significance of the average is
greater if the clustering is tight, less otherwise. To measure the
tightness of the clustering, we use the variance of the data:
$\dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$.
Variance is just the average of the squared distance from each data
point to the average (of the data).
Therefore, in the analogous situation where we have a random
variable X, if we wish to know how close to its expectation its values
are likely to be, we again define a quantity called the variance of X:
$\mathrm{Var}[X] = E[(X - E[X])^2]$.
□
Therefore
Var[X] = σ²
Now we can also illustrate our earlier statement about how the
normal distribution serves as a substitute for other distributions.
Figure B.3 compares the pdf of a binomial distribution (n = 100 and p
= 0.2) with that of a normal distribution with the same mean and
variance (μ = np = 20 and σ² = np(1 – p) = 16, so σ = 4).
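The comparison can be reproduced numerically. The sketch below computes the binomial probabilities for n = 100 and p = 0.2 and the normal density with matching mean and variance; the variance np(1 – p) works out to 16, that is, a standard deviation of 4.

```python
import math

n, p = 100, 0.2
mu = n * p                       # mean of the binomial distribution
var = n * p * (1 - p)            # variance of the binomial distribution
sigma = math.sqrt(var)

def binom_pmf(k):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x):
    """Density of the normal distribution with the same mean and variance."""
    return math.exp(-(x - mu)**2 / (2 * var)) / (sigma * math.sqrt(2 * math.pi))

# Largest pointwise gap between the two over k = 0, ..., n
max_gap = max(abs(binom_pmf(k) - normal_pdf(k)) for k in range(n + 1))
print(mu, var, sigma, max_gap)
```

The gap is small compared with the peak height of roughly 0.1, which is the sense in which the normal distribution serves as a substitute here.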
(B.1)
where δ and γ are called the location and scale parameters. The
location parameter δ can be any real number, while the scale
parameter γ has to be positive.
1. .
2.
3. .
1.
2.
We shall write the proof for the case when X,Y are both discrete.
Compare the last exercise with the identity connecting the dot
product and length of vectors:
$1 \ge |\rho_{X,Y}| = \dfrac{|\mathrm{Cov}[X,Y]|}{\sigma_X \sigma_Y}$
implies
$|\mathrm{Cov}[X,Y]| \le \sigma_X \sigma_Y$.
$f_{Y|X=x}(y) = \dfrac{f_{X,Y}(x,y)}{f_X(x)}$.
1. 0 ≤ fY |X=x(y) ≤ 1,
2.
E[Y |X = x] =
$f_{X,Y}(x,y) = \dfrac{1}{2\pi\sqrt{1-\rho^2}}\,\exp\!\left(-\dfrac{x^2 - 2\rho xy + y^2}{2(1-\rho^2)}\right)$,
where –1 < ρ < 1. X and Y are then called a bivariate normal pair.
We leave the following for you to verify.
Figure B.7: On the left is a plot of the joint pdf of a bivariate normal
pair of variables X,Y with ρ = 0.7. On the right is a contour plot of the
same pdf together with the line of regression y = ρx of Y on X. The
dashed vertical line shows that for a fixed value X = x, the line of
regression gives the mean of Y .
B.11 INDEPENDENCE
Let X,Y be jointly distributed random variables. We consider Y to be
independent of X, if knowledge of the value taken by X tells us
nothing about the value taken by Y . Mathematically, this means:
fY |X=x(y) = fY (y).
Of course, the mean and variance would add up like this for any
collection of independent random variables. The important feature
here is the preservation of normality. We shall use the cdf to
demonstrate this. First, let X,Y be any two jointly distributed random
variables. Then the cdf of X + Y can be expressed as follows:
The joint pdf f can be used to obtain the joint pdf
of any subset of X1,…,Xn. For instance, when the Xi are discrete,
summing over the values of Xn shows that the joint pdf of X1,…,Xn–1 is
$\sum_{x_n} f(x_1,\dots,x_n)$.
Similarly, if they are continuous, the joint pdf of X1,…,Xn–1 is
$\int_{-\infty}^{\infty} f(x_1,\dots,x_n)\,dx_n$.
EXPECTATION AND VARIANCE
Any function g : ℝn → ℝ can be composed with the Xi to create a
new random variable g(X1,…,Xn) : S → ℝ defined by
g(X1,…,Xn)(w) = g(X1(w),…,Xn(w)).
The expectation and variance of the sum are given by the following
generalisations of the two variable case:
INDEPENDENCE
Jointly distributed random variables X1,…,Xn are called independent
if their joint pdf f is given by f(x1,…,xn) = f1(x1) ⋯ fn(xn), where fi is the
pdf of Xi .
3.
C= .
where x is the column vector whose entries are the numbers x1,
…,xn. The proof of this identity is as follows:
Let us now mention that an n × n matrix P is called positive-definite
if it has the following two properties:
Symmetry: P = Pᵀ.
Positivity: For any non-zero column vector x with n entries, xᵀPx > 0.
A covariance matrix C is clearly symmetric. Moreover, we have
$x^T C x = \mathrm{Var}\!\left[\sum_i x_i X_i\right] \ge 0$, so it almost satisfies the positivity
condition. In applications, it is often a reasonable assumption that none
of the Xi is completely determined by the others. Under this assumption,
$\sum_i x_i X_i$ can only be a constant if each xi is zero. For, suppose we have
$\sum_i x_i X_i = c$, where c is a constant and one of the xi is non-zero. Let the
non-zero one be x1. Then X1 is completely determined by the others:
$X_1 = \dfrac{1}{x_1}\left(c - \sum_{i=2}^{n} x_i X_i\right)$.
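The identity xᵀCx = Var[∑ xi Xi] can be checked on data. The sketch below uses a small hypothetical data set of three variables, treats the five observations as equally likely outcomes, and compares the quadratic form with the variance of the combination computed directly.

```python
# Hypothetical observations of (X1, X2, X3), treated as equally likely
data = [
    (1.0, 2.0, 0.5),
    (2.0, 1.5, 1.0),
    (3.0, 3.5, 0.0),
    (4.0, 3.0, 2.0),
    (5.0, 5.0, 1.5),
]
n, k = len(data), 3
means = [sum(row[j] for row in data) / n for j in range(k)]

# Covariance matrix C with entries Cov[Xi, Xj]
C = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / n
      for j in range(k)] for i in range(k)]

x = [2.0, -1.0, 0.5]            # arbitrary coefficient vector

# Quadratic form x^T C x
quad_form = sum(x[i] * C[i][j] * x[j] for i in range(k) for j in range(k))

# Variance of the combination sum_i x_i X_i, computed directly
combo = [sum(x[j] * row[j] for j in range(k)) for row in data]
combo_mean = sum(combo) / n
var_combo = sum((c - combo_mean)**2 for c in combo) / n
print(quad_form, var_combo)
```

The two numbers agree up to rounding, and since a variance cannot be negative, neither can the quadratic form.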
aij =
=A .
Then
B.14 LINEAR REGRESSION AND LEAST SQUARES
In this section, we take up the problem of finding the best linear
approximation to the true relationship between two jointly distributed
random variables. Suppose the variables are called X and Y and we
are looking for an expression of the form α + βX which will give the
best approximation to Y. The meaning of the word ‘best’ is obviously
the main point of contention here. While different choices are
available, we shall only discuss the most popular one. This goes by
the name of Ordinary Least Squares (or OLS) and seeks to
minimise the expression
h(α,β) = E[(Y – α – βX)2].
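To find the minimum, we apply the first derivative test and set the two
partial derivatives of h to zero:
$\dfrac{\partial h}{\partial \alpha} = -2\,E[Y - \alpha - \beta X] = 0$,
$\dfrac{\partial h}{\partial \beta} = -2\,E[X\,(Y - \alpha - \beta X)] = 0$.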
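Carrying the first derivative test through leads to the familiar solution β = Cov[X,Y]/Var[X] and α = E[Y] – βE[X]. The sketch below applies these formulas to a small hypothetical data set, treating the observations as equally likely outcomes, and confirms that both first-order conditions hold at the computed values.

```python
# Hypothetical paired observations of (X, Y), treated as equally likely
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
var_x = sum((x - mean_x)**2 for x in xs) / n

beta = cov_xy / var_x               # slope of the OLS line
alpha = mean_y - beta * mean_x      # intercept of the OLS line

# First-order conditions: E[Y - alpha - beta X] = 0 and E[X (Y - alpha - beta X)] = 0
residuals = [y - alpha - beta * x for x, y in zip(xs, ys)]
foc_alpha = sum(residuals) / n
foc_beta = sum(x * e for x, e in zip(xs, residuals)) / n
print(alpha, beta, foc_alpha, foc_beta)
```

Both conditions evaluate to zero up to rounding error, which is what singles out this line among all candidates α + βX.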
One nice property of the OLS line is that it matches the regression
curve whenever the latter is a line. Thus, suppose we know that the
regression curve is a line:
E[Y |X = x] = a + bx, or E[Y |X] = a + bX (B.4)
The common probability density function fX for all the Xi is called the
population density. We also say that we are sampling from a
population of type X. The parameters associated to X are called
the population parameters.
ri = .
SAMPLE MEAN
Let X1,…,Xn be a random sample. Its sample mean is the random
variable $\bar{X}$ defined by
$\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$.
S2 =
Zn = .
□
which implies an = ± , bn = ∓ .
Zn ± .
Example B.19.1 Figure B.14 is based on the data for Infosys stock
depicted in Figure 5.3. The quantiles for this data are shown as
discs. The solid curve represents the normal distribution with the
same mean and variance as the data. We see that it is too low on
the left and too high on the right, as far as most of the data is
concerned. This has happened because the extreme events at the
ends have pulled it away from the centre. We can compensate by
ignoring the extreme events. Then we get a near perfect fit (the
dashed curve) to the middle 80% of the data.
We are not able, however, to get a good match through the full (0,1)
range. This suggests that the data does not truly represent a normal
distribution. □
This example illustrates how quantiles can be used to visualise the fit
of a distribution to data, and to adjust it accordingly. For example, if
we do not care about extreme events and just want a model that
does well under normal circumstances, the distribution depicted by
the dashed curve would do very well indeed.
With this basic building block, one can generate sequences that
simulate sampling from any distribution we wish to consider. In
particular, let us note that if we can simulate random sampling of a
particular distribution, we can also simulate sampling of a number of
independent random variables with this distribution. For example,
suppose we wish to generate 1000 pairs of values of independent
random variables X,Y which are uniformly distributed over [0,1].
Then we generate 2000 values of the uniform distribution over [0,1]
and allocate them alternately to X and Y .
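The alternating allocation described above takes only a few lines. This sketch uses Python's standard pseudo-random generator as the basic building block, with a fixed seed so that the run is repeatable; both choices are illustrative.

```python
import random

rng = random.Random(0)                    # seeded for repeatability
values = [rng.random() for _ in range(2000)]

# Allocate the values alternately: even positions to X, odd positions to Y
xs = values[0::2]
ys = values[1::2]
pairs = list(zip(xs, ys))

print(len(pairs), pairs[0])
```

Each pair uses two distinct draws from the generator, which is what makes the simulated X and Y behave as independent.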
=A .
15 The event algebra does not always include all the subsets of the sample
space S. But we will let this issue lie undisturbed.
16 These are not the only types. But these types contain all the ones we need.
a. 2 = 1 + 10r
b. $2 = (1 + r)^{10}$
c. $2 = e^{10r}$
The respective solutions are r = 0.1, 0.072 and 0.069, or 10%, 7.2%
and 6.9%.
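The three equations can be solved directly for r, as in this short check of the rates needed to double an investment in 10 years under simple, annually compounded and continuously compounded interest.

```python
import math

years = 10
r_simple = (2 - 1) / years            # from 2 = 1 + 10r
r_compound = 2 ** (1 / years) - 1     # from 2 = (1 + r)^10
r_continuous = math.log(2) / years    # from 2 = e^(10r)

for name, r in [("simple", r_simple), ("compound", r_compound), ("continuous", r_continuous)]:
    print(f"{name}: {100 * r:.1f}%")
```

The more frequently interest is compounded, the lower the rate needed to reach the same growth.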
We use 1.05X to pay off the loan and are left with a risk-free profit of
0.000625X.
Exercise 1.3.13 The annual growth factor is $1.02^{12} = 1.268$, so the
effective annual rate is 26.8%.
Exercise 1.3.14
a. Let the invested amount be Rs 1, the discrete rate be rd, and the
continuous rate be rc. Since the interest earned over 1 year is
the same for both A and B, we find that
Hence
which shows that the interest earned over 6 months is also the
same for both A and B. If we graph the interest earnings against
time, we get the diagram as shown in the next page.
This yields rd = 0.098 and rc = 0.095. Over the first 6 months, using
years as units, the difference between the interests earned by A and
B is
$f(t) = 1000\,(1 + 0.098t - e^{0.095t})$, 0 ≤ t ≤ 0.5.
This shows that when the yield equals the coupon rate, the bond is
sold at par. If the yield is greater, the price must fall below the face
value, and the bond is at a discount. If the yield is lower, the price
must rise and the bond is at a premium.
Hence,
Dividing through by P gives
=– .
This implies $n s_n = m s_m + (n - m) f_{m,n}$, which gives the formula for $f_{m,n}$.
Exercise 2.11.2 If the bond is called at the end of the ith year, for
the purposes of calculating YTC it becomes an i year bond with face
value Ci. Apply Exercise 2.4.3.
CHAPTER 3
In these calculations, note that r1,r2 are random, but their difference
c is not.
Exercise 3.1.3 In both cases, we are left with Rs 100 cash – a loss
of Rs 100.
wS = – = –4 and wT = = 5.
Exercise 3.2.2 C and B are more efficient than A, while C and D are
more efficient than E. Hence the efficient ones are B, C and D.
σ2P = .
The RHS is a quadratic in $r_P$, hence it can be put in the form
$a r_P^2 + b r_P + c$. To obtain a, we collect all the coefficients of $r_P^2$:
a= .
Exercise 3.3.4
w= .
Exercise 3.3.5 The portfolio mean return and standard deviation are
Exercise 3.3.6 Similar to the solution of part 2 of Exercise 3.3.4.
w= = = .
This system has the following block structure (indicated by the lines
in the matrices above):
=
This gives two vector equations:
IW + RL = 0
RᵀW = R′
= rf + (rM – rf).
CHAPTER 4
Exercise 4.2.6 Reinvest the income from the margin account at the
prevailing risk-free rate till the expiry date.
$X = 100 \times \left(1 + \dfrac{5}{100}\right) = 105$.
We note in passing that, had r been continuously compounded, the
numerical result would hardly have changed:
$X = 100\,e^{0.05} = 105.13$.
At time T both portfolios become the asset, hence have equal value.
Therefore they have equal value at time t:
$V_t + X e^{-r(T-t)} = 0 + X_t e^{-r(T-t)}$.
The net present value of the income from the bond during the life of
the futures is:
$I_0 = 1000\,e^{-0.25 \times 0.1}$ = ¥975.31.
Exercise 4.7.2 Let the rates of return from the portfolio and the
index be rP and rI respectively. Then
rP = rI = .
Therefore
β= = = h.
CHAPTER 5
Exercise 5.2.1 Combining the GBM formula with the formula for
futures price, we obtain which is the formula for a GBM with drift μ –
r and volatility σ.
CHAPTER 6
Exercise 6.1.3 We compare with the value of the corresponding
futures to get:
Exercise 6.2.3
Exercise 6.3.4 The same reasoning would suggest that put prices
would fall. But put–call parity says put and call prices move together
—the only resolution is that the prices must be independent of the
expected future asset price.
This limit takes some effort to calculate. A hint: substitute the power
series expansions and consider the first couple of terms. Using the
log function helps. It is also possible to attack the problem via
L’Hôpital’s Rule.
CHAPTER 7
CHAPTER 8
Exercise 8.1.3 The 2-day standard deviation of the return from each
asset is
$\sigma = 10^5 \times 0.01 \times \sqrt{2} = 1414.21$.
APPENDIX A
where b′∈ [b – z,b] and a′∈ [a – z,a]. Now b →∞ implies b′→∞ and a
→–∞ implies a′→–∞. Therefore,
APPENDIX B
Exercise B.3.2 We check that the given formula for $f_{aX+b}$ satisfies
FaX+b (x) =
The
case a < 0 is shown below (The a > 0 case is similar and easier):
Exercise B.3.6 From the graph of FX we see that the leftmost point
x at which F(x) ≥ 0.25 is x = 0. Hence $x_{0.25} = 0$. Similarly, we see that
$x_{0.5} = 1$ and $x_{0.75} = 2$. Therefore the interquartile range is 2 – 0 = 2.
Exercise B.7.1
And,
Exercise B.16.1 The fact that $\bar{X}$ will have mean μ and standard
deviation $\sigma/\sqrt{n}$ has been proven just before this exercise. Further,
we know from the previous units that the sum of independent normal
variables is normal.