3 Markov Chains
3.1 Introduction and transition matrices
A stochastic process is a random phenomenon which evolves in time. More formally:

Definition 3.1. A stochastic process X is a collection of random variables {Xt : t ∈ T}, taking values in some state-space, indexed by a set T which we think of as time.

In this module, T will always be the set of non-negative integers, and so our stochastic processes will all evolve in discrete time.
Example 3.2 (Gambler’s ruin). Suppose we play a game involving repeatedly tossing a
fair coin; every time the coin comes up Heads you pay me £1; every time there’s a Tail
I pay you £1. I start with £20 and you start with £50: we stop when either of us loses
all our money.
Here we could let Xn = x if you have £x after the nth coin toss. So X0 = 50, and
{X0 , X1 , X2 , . . . } is a stochastic process. Questions that we might be interested in asking
include:
• what is the expected number of coin tosses until the game ends? ⊛
Example 3.3 (Random walk on Z, see Figure 3.1). Here X0 = 0 and we set Xn =
Xn−1 ±1, where we add one with probability p and subtract one with probability q = 1−p.
Questions that we might like to ask include:
• will the walk return to zero with probability one, and if so, what is the expected time until it does so?
Definition 3.4. The state-space of a stochastic process X is the set of possible values
taken by the random variables {Xt }. We denote the state-space by S.
[Figure 3.1: a sample path of the simple random walk on Z.]
For the random walk of Example 3.3, since each step is taken independently of the past, it follows that
$$P(X_n = x_n \mid X_{n-1} = x_{n-1}, X_{n-2} = x_{n-2}, \ldots, X_1 = x_1, X_0 = 0) = \begin{cases} p & \text{if } x_n = x_{n-1} + 1 \\ q & \text{if } x_n = x_{n-1} - 1 \\ 0 & \text{otherwise.} \end{cases}$$
In general, the state of the process at time n could depend upon the entire past of
the process, and upon the time n. Notice however that, in this example,
(a) the conditional probabilities do not depend upon the time n; and
(b) the conditional probabilities of the future given the past depend only on the present:
$$P(X_n = x_n \mid X_{n-1} = x_{n-1}, \ldots, X_1 = x_1, X_0 = x_0) = P(X_n = x_n \mid X_{n-1} = x_{n-1}).$$
A stochastic process satisfying (b) above is said to have the Markov property. The intuition behind (b) is: given the present state of the process, its future behaviour does not depend on its past.
Definition 3.6. A stochastic process X satisfying the Markov property
$$P(X_n = x_n \mid X_{n-1} = x_{n-1}, \ldots, X_0 = x_0) = P(X_n = x_n \mid X_{n-1} = x_{n-1})$$
is called a Markov chain. If moreover the one-step transition probabilities
$$p_{ij} = P(X_n = j \mid X_{n-1} = i)$$
do not vary with time n, then X is called a time-homogeneous Markov chain.
Unless stated otherwise, all stochastic processes that we will consider in this module
will be discrete time, discrete state-space, time-homogeneous Markov chains.
Given a chain X we can arrange its one-step transition probabilities into a transition
matrix P . For example, if S = {0, 1, 2, . . . } then we can write
$$P = \begin{pmatrix} p_{00} & p_{01} & p_{02} & \cdots \\ p_{10} & p_{11} & p_{12} & \cdots \\ p_{20} & p_{21} & p_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
(rows indexed by the current state i, columns by the next state j).
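Although these notes contain no code, the role of P is easy to see in a short simulation. The following is a minimal sketch (in Python with numpy, both assumptions of this illustration, as are the example numbers in the matrix): row i of P is precisely the conditional distribution of the next state given that the current state is i.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(P, x0, n_steps):
    """Simulate a time-homogeneous Markov chain with transition
    matrix P (each row sums to 1), started from state x0."""
    path = [x0]
    for _ in range(n_steps):
        # Draw the next state from row P[current]: by the Markov
        # property, only the present state matters.
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

# A hypothetical two-state chain, just for illustration.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(simulate(P, 0, 10))
```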
Similarly, for each n we can define the n-step transition probabilities
$$p_{ij}^{(n)} = p^{(n)}(i, j) = P(X_{m+n} = j \mid X_m = i)$$
(note that these don't depend on m, by time-homogeneity), and arrange these into an n-step transition matrix:
$$P^{(n)} = \begin{pmatrix} p_{00}^{(n)} & p_{01}^{(n)} & p_{02}^{(n)} & \cdots \\ p_{10}^{(n)} & p_{11}^{(n)} & p_{12}^{(n)} & \cdots \\ p_{20}^{(n)} & p_{21}^{(n)} & p_{22}^{(n)} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
You should read $p_{ij}^{(n)}$ as "the probability that X, if started from i, is at j after n steps".
Remark 3.9. Note that $p_{ij}^{(0)} = P(X_m = j \mid X_m = i)$: this is equal to 1 if i = j, and zero otherwise. Thus $P^{(0)} = I$ (the identity matrix).
The matrix P is known as a stochastic matrix: all of its elements are non-negative
(they’re probabilities!), and each of the row sums equals one (for the same reason!). It
is often convenient (at least when S is small) to depict P by means of a state diagram:
each vertex in the diagram corresponds to a state in S, and each arrow to a non-zero
transition probability. For example, if S = {1, 2, 3, 4, 5} and
$$P = \begin{pmatrix} 0.8 & 0.2 & 0 & 0 & 0 \\ 0.4 & 0 & 0 & 0.5 & 0.1 \\ 1 & 0 & 0 & 0 & 0 \\ 0.3 & 0 & 0 & 0 & 0.7 \\ 0 & 0 & 0.5 & 0.5 & 0 \end{pmatrix},$$
[State diagram: vertices 1–5, with one arrow for each non-zero entry of P, labelled with its transition probability.]
Example 3.10. Consider the following (very!) simple model of the weather: Xn = w if
it's wet on day n, and Xn = d if dry. Suppose that if it is wet today then it will be wet again tomorrow with probability 3/4, while if it is dry today then it will be wet tomorrow with probability 2/3. Then we have
$$P = \begin{pmatrix} 3/4 & 1/4 \\ 2/3 & 1/3 \end{pmatrix},$$
with rows and columns ordered (w, d). ⊛
Note that, given the starting state x0 and transition matrix P , we can calculate the
probability of the process following any path {x0 , x1 , . . . , xn } over the period {0, 1, . . . , n}
as follows:
\begin{align*}
P(X \text{ follows } \{x_0, x_1, \ldots, x_n\}) &= P(X \text{ follows } \{x_0, \ldots, x_n\} \mid X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \times P(X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \\
&= P(X_n = x_n \mid X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \times P(X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \\
&= P(X_n = x_n \mid X_{n-1} = x_{n-1}) \times P(X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \\
&= p(x_{n-1}, x_n) \times P(X \text{ follows } \{x_0, \ldots, x_{n-1}\}) \\
&= p(x_{n-1}, x_n) \times p(x_{n-2}, x_{n-1}) \times \cdots \times p(x_0, x_1) \times P(X_0 = x_0),
\end{align*}
where in the last probability we allow for the possibility of the initial state also being
random.
In the above example, if X0 = w then the probability that we observe the sequence
{w, w, d, w} is given simply by
$$P(X_1 = w, X_2 = d, X_3 = w \mid X_0 = w) = p_{ww}\, p_{wd}\, p_{dw} = \frac{3}{4} \times \frac{1}{4} \times \frac{2}{3} = \frac{1}{8}.$$
Theorem 3.11 (Chapman–Kolmogorov equations). For all i, j ∈ S and all n, m ≥ 0,
$$p_{ij}^{(n+m)} = \sum_{k \in S} p_{ik}^{(n)}\, p_{kj}^{(m)}.$$
Proof. We begin with the probability on the LHS, and consider all possible states that the chain might occupy at time n:
\begin{align*}
p_{ij}^{(n+m)} &= P(X_{n+m} = j \mid X_0 = i) \\
&= \sum_{k \in S} P(X_{n+m} = j, X_n = k \mid X_0 = i) \\
&= \sum_{k \in S} P(X_{n+m} = j \mid X_n = k, X_0 = i)\, P(X_n = k \mid X_0 = i) \\
&= \sum_{k \in S} P(X_{n+m} = j \mid X_n = k)\, p_{ik}^{(n)} = \sum_{k \in S} p_{ik}^{(n)}\, p_{kj}^{(m)}.
\end{align*}
Exercise 3.12. Note that in this proof (fourth line) we have declared that
P (Xn+m = j | Xn = k, X0 = i) = P (Xn+m = j | Xn = k) .
The Markov property as stated in Definition 3.6 really only tells us this is true when
m = 1. Show (by induction) that it is true for all m ∈ N. ⊛
This theorem tells us that $P^{(2)} = P^{(1)} P^{(1)}$. But since $P^{(1)} = P$, we see that $P^{(2)} = P^2$ (i.e. the square of the one-step transition matrix P). Continuing this argument, we get
$$P^{(n)} = P^n, \quad n = 0, 1, 2, \ldots$$
So we can obtain the n-step transition probabilities by calculating higher powers of the
one-step transition matrix.
Example 3.13. For the weather chain of Example 3.10,
$$P^{(2)} = P^2 = \begin{pmatrix} 35/48 & 13/48 \\ 13/18 & 5/18 \end{pmatrix},$$
again with rows and columns ordered (w, d). ⊛
Proposition 3.14. Define the row vector of probabilities $\nu^{(n)}$ by $\nu^{(n)}_i = P(X_n = i)$. (So $\nu^{(n)}$ is just the mass function of $X_n$, expressed as a vector.) Then
$$\nu^{(n)} = \nu^{(r)} P^{n-r} \quad \text{for all } 0 \le r \le n,$$
and in particular, $\nu^{(n)} = \nu^{(0)} P^n$. In other words, we can obtain the distribution of the chain at time n by starting with its distribution at time r, and multiplying by the matrix $P^{n-r}$.

Proof. Conditioning on the state at time n − 1,
$$\nu^{(n)}_j = P(X_n = j) = \sum_{i \in S} P(X_n = j \mid X_{n-1} = i)\, P(X_{n-1} = i) = \sum_{i \in S} \nu^{(n-1)}_i p_{ij}.$$
In matrix notation, this says that $\nu^{(n)} = \nu^{(n-1)} P$. Repeating this argument a total of n − r times yields $\nu^{(n)} = \nu^{(r)} P^{n-r}$.
Example 3.15. In the simple weather example, suppose that on day 0 it is wet with
probability 1/5. Then ν (0) = (1/5, 4/5), and
$$\nu^{(1)} = \nu^{(0)} P = (1/5, 4/5) \begin{pmatrix} 3/4 & 1/4 \\ 2/3 & 1/3 \end{pmatrix} = (41/60, 19/60);$$
$$\nu^{(2)} = \nu^{(1)} P = \nu^{(0)} P^2 = (1/5, 4/5) \begin{pmatrix} 35/48 & 13/48 \\ 13/18 & 5/18 \end{pmatrix} \approx (0.724, 0.276).$$
Exercise 3.16. What is ν (1) if ν (0) = (8/11, 3/11)? What about ν (2) ? ⊛
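These calculations are easy to check numerically. A small sketch (again in Python with numpy, an assumption of this illustration): matrix powers give the n-step transition matrices, and left-multiplying a row vector by P advances the distribution one step.

```python
import numpy as np

P = np.array([[3/4, 1/4],
              [2/3, 1/3]])          # the weather chain, states (w, d)

P2 = np.linalg.matrix_power(P, 2)
print(P2)                           # -> [[35/48, 13/48], [13/18, 5/18]]

nu0 = np.array([1/5, 4/5])          # P(X0 = w) = 1/5
print(nu0 @ P)                      # nu^(1) -> [41/60, 19/60]
print(nu0 @ P2)                     # nu^(2) -> approx [0.724, 0.276]
```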
Example. Let Yn count the number of Heads in n tosses of a coin, where the probability of obtaining a Head on any one toss is p. Clearly Y is a Markov chain: its position at time n + 1 only depends on that at n.
The transition matrix is given by
$$P = \begin{pmatrix} 1-p & p & 0 & 0 & 0 & \cdots \\ 0 & 1-p & p & 0 & 0 & \cdots \\ 0 & 0 & 1-p & p & 0 & \cdots \\ \vdots & & \ddots & \ddots & & \end{pmatrix}$$
(with states 0, 1, 2, . . . ). Here the n-step transition probabilities can be written down directly: $p_{0k}^{(n)} = \binom{n}{k} p^k (1-p)^{n-k}$, since the number of Heads in n coin tosses is of course distributed as Bin(n, p). ⊛
3.2 State classification
Suppose that it is possible for the chain X to make its way from state i to another state j, and back again. Then we might expect these two states to share many common properties, and this indeed turns out to be the case. We write i → j if the chain can get from i to j, that is, if $p_{ij}^{(n)} > 0$ for some n ≥ 1; and we write i ↔ j (and say that i and j intercommunicate) if both i → j and j → i. Finally, we write i ∼ j if either
(i) i ↔ j, or
(ii) i = j.
Lemma 3.20. The relation ∼ is an equivalence relation (it is reflexive, symmetric, and transitive).

Proof.
• Reflexive: i ∼ i, since i = i;
• Symmetric: if i ∼ j then either i = j or i ↔ j, and in either case j ∼ i;
• Transitive: if i ∼ j and j ∼ k then i ∼ k, since a route from i to j may be followed by a route from j to k (and similarly in reverse).

The equivalence classes of ∼ are called the communicating classes of the chain X.
Example 3.21. Consider a chain on S = {1, 2, 3, 4, 5, 6} with the following transition diagram:
[Transition diagram: states 1, 2, 3, 4 intercommunicate (e.g. via the paths 1 → 2 → 1 and 1 → 3 → 2 → 1), states 5 and 6 intercommunicate, and there is an arrow leading from {5, 6} into state 4.]
Here there are two communicating classes: {1, 2, 3, 4} and {5, 6}. ⊛
Definition 3.22. We say that:
(a) the state i ∈ S is essential if for all j ∈ S with i → j, it's also the case that j → i. Otherwise i is inessential;
(b) the chain X (or equivalently its transition matrix P ) is irreducible if S is one
single communicating class (i.e. if all states intercommunicate).
Suppose that the Markov chain starts at state i. Then i is essential if, wherever X
goes, it is always possible to return to its starting state; i is inessential if it is possible
for X to leave i and reach a state from which it is impossible to return.
Example 3.23. In Example 3.21, states {1, 2, 3, 4} are essential; states {5, 6} are inessen-
tial. (E.g. from 5 and 6 it is possible to reach state 4, but then it is impossible to return.)
⊛
Example 3.24. Consider the following random walk on S = {0, 1, . . . , n}, which is
absorbed when it reaches state 0 or state n:
[Diagram: states 0, 1, 2, . . . , n in a row; from each interior state the walk may move to either neighbour, while 0 and n are absorbing.]
Here the states {1, 2, . . . , n − 1} form an inessential communicating class. (From any
of these states it is possible to reach state 0, from which it is impossible to return.)
States 0 and n are both essential, and each forms a communicating class by itself. ⊛
It is easy to check that if i is essential and i ∼ j, then j is also essential. Notice that this implies that the states in any given communicating class are either all essential or all inessential.
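Communicating classes can also be found mechanically: they are exactly the strongly connected components of the directed graph with an edge i → j whenever $p_{ij} > 0$. A sketch using scipy (an assumed dependency), applied to the absorbed random walk of Example 3.24 with n = 4 and, for concreteness, all jump probabilities equal to 1/2:

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

# Random walk on {0, 1, 2, 3, 4}, absorbed at 0 and at 4.
P = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])

# Strongly connected components of the adjacency graph = communicating classes.
n_classes, labels = connected_components(P > 0, directed=True,
                                         connection='strong')
print(n_classes, labels)   # 3 classes: {0}, {1, 2, 3} and {4}
```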
3.2.2 Periodicity
Definition 3.26. The period of a state is the greatest common divisor of the times at which the chain might return to the state. Thus:
(a) if i → i then i has period $\gcd\{n > 0 : p_{ii}^{(n)} > 0\}$;
(b) a state (or class) of period 1 is called aperiodic.
Lemma 3.27. If i ↔ j then i and j have the same period.

Proof. Recall that a|b means that the positive integer a divides the integer b exactly. Suppose that i ↔ j and that i has period d. Since i ↔ j we can find n and m such that $p_{ij}^{(n)}$ and $p_{ji}^{(m)}$ are both positive. Since period(i) = d we know that $p_{ii}^{(k)} > 0$ implies d|k (by Definition 3.26). Now, if $p_{jj}^{(r)} > 0$ then
$$p_{ii}^{(n+r+m)} \ge p_{ij}^{(n)}\, p_{jj}^{(r)}\, p_{ji}^{(m)} > 0,$$
and so d|(n + r + m). But we also know that $p_{ii}^{(n+m)} \ge p_{ij}^{(n)} p_{ji}^{(m)} > 0$, and so d|(n + m): it follows that d|r, and so state j has period d as required.
Exercise 3.28. This 'proof' really only shows that if $p_{jj}^{(r)} > 0$ then d|r: we haven't shown that d is the greatest common divisor of all such times though! Show that this must be the case, by assuming that there exists a d′ > d with d′|r for all times r at which it is possible for the chain to return to state j: show that this implies that d′|k for all k with $p_{ii}^{(k)} > 0$, contradicting the assumption that the period of i is d. ⊛
Example 3.29. In Example 3.21, the class {1, 2, 3, 4} has period 1: to see this, note that
it is possible for the chain to start at 1 and return at time 2 (via the path 1 → 2 → 1),
or to return at time 3 (1 → 3 → 2 → 1). Since gcd {2, 3} = 1, we see that this class is
aperiodic. The period of class {5, 6} is 2, since if the chain starts in state 5, it is only
possible for it to return at even times.
Similarly, in Example 3.24, the class {1, 2, . . . , n − 1} has period 2; classes {0} and
{n} are both aperiodic. ⊛
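Periods can likewise be computed directly from Definition 3.26, by recording the times at which a return is possible. A sketch (the finite horizon max_steps is an assumption of this illustration; 0/1 reachability matrices are used instead of powers of P to avoid floating-point noise):

```python
import numpy as np
from math import gcd
from functools import reduce

def period(P, i, max_steps=64):
    """gcd of the times n <= max_steps at which state i can return
    to itself; returns None if no return is possible by then."""
    A = (np.asarray(P) > 0).astype(int)   # adjacency matrix of the chain
    An = np.eye(len(A), dtype=int)
    times = []
    for n in range(1, max_steps + 1):
        An = np.minimum(An @ A, 1)        # reachability in exactly n steps
        if An[i, i]:
            times.append(n)
    return reduce(gcd, times) if times else None

# The absorbed walk on {0,...,4} used above: the interior class has
# period 2, while the absorbing state 0 is aperiodic.
P = np.array([
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
print(period(P, 1), period(P, 0))   # -> 2 1
```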
So far the classification of states has taken no account of the actual probabilities in the
transition matrix P : the ideas of communicating classes, essential states and periodicity
depend only on whether the transition probabilities are positive or not, nothing more.
It’s now time to start using this extra information to establish further properties of the
chain...
Definition 3.30. For all i, j ∈ S, the distribution of $T_{i,j}$, the first-passage time from i to j, is defined by
$$P(T_{i,j} = n \mid X_0 = i) = f_{ij}^{(n)},$$
where $f_{ij}^{(n)} = P(X_n = j,\ X_m \ne j \text{ for } m = 1, \ldots, n-1 \mid X_0 = i)$ for n ≥ 1. We also write
$$f_{ij}^* = \sum_{n=1}^{\infty} f_{ij}^{(n)}.$$

Definition 3.31. The state i is called recurrent if $f_{ii}^* = 1$, and transient if $f_{ii}^* < 1$.
So a state i is recurrent if, when X starts at i, with probability 1 it will return to its
starting state in finite time; i is transient if there is a positive chance of the chain never
returning.
We are often particularly interested in finding $f_{ij}^*$, since this tells us the probability that the chain ever visits j if started at i. In order to calculate these first-passage
probabilities, we often make use of a technique known as first-step decomposition.
Note that these probabilities satisfy $f_{ij}^{(1)} = p_{ij}$ and, for n ≥ 1,
\begin{align*}
f_{ij}^{(n+1)} &= P(X \text{ first hits } j \text{ at time } n+1 \mid X_0 = i) \\
&= \sum_{k: k \to j,\, k \ne j} P(X \text{ first hits } j \text{ at time } n+1 \mid X_1 = k, X_0 = i)\, P(X_1 = k \mid X_0 = i) \\
&= \sum_{k: k \to j,\, k \ne j} P(X \text{ first hits } j \text{ at time } n+1 \mid X_1 = k)\, p_{ik} \\
&= \sum_{k: k \to j,\, k \ne j} p_{ik} f_{kj}^{(n)}.
\end{align*}
If we now sum $f_{ij}^{(n)}$ over n, we obtain the first-step decomposition:
$$f_{ij}^* = p_{ij} + \sum_{k: k \to j,\, k \ne j} p_{ik} f_{kj}^* \tag{3.3}$$
Example 3.32. Two people play a game, whereby each chooses a pattern of three
symbols from {H, T }. They then repeatedly toss a fair coin, and the first person to
observe their pattern of symbols wins. For example, suppose that player A chooses the
sequence HHT , while B chooses HT H. We can represent the progress of the game as a
Markov chain, as follows. Let the state-space be S = {1, 2, 3, 4, 5, 6}, where (reading the states off the transition matrix below) state 1 means that no useful progress has been made towards either pattern, state 2 that the sequence so far ends in H, state 3 that it ends in HH, state 4 that it ends in HT, state 5 that A has won (HHT has appeared), and state 6 that B has won (HTH has appeared). Then our chain has the following transition diagram and matrix:
[Transition diagram omitted; every arrow has probability 1/2 except the loops at the absorbing states 5 and 6.]
$$P = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 0 & 1/2 & 0 \\ 1/2 & 0 & 0 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
Note that we have made the chain stop as soon as it hits either state 5 or state 6, since then one player has won and the game ends. We are now interested in $P(A \text{ wins}) = f_{15}^*$. Using the first-step decomposition (3.3), we obtain:
$$f_{15}^* = \tfrac{1}{2} f_{15}^* + \tfrac{1}{2} f_{25}^*$$
and so $f_{15}^* = f_{25}^*$. Furthermore,
\begin{align*}
f_{25}^* &= \tfrac{1}{2} f_{35}^* + \tfrac{1}{2} f_{45}^* \\
f_{35}^* &= \tfrac{1}{2} f_{35}^* + \tfrac{1}{2} \\
f_{45}^* &= \tfrac{1}{2} f_{15}^* + \tfrac{1}{2} f_{65}^* \\
f_{65}^* &= 0.
\end{align*}
Thus $f_{35}^* = 1$ and $f_{45}^* = \tfrac{1}{2} f_{15}^* = \tfrac{1}{2} f_{25}^*$. Substituting these values into the equation for $f_{25}^*$, we obtain $f_{25}^* = \tfrac{1}{2} + \tfrac{1}{4} f_{25}^*$, so that $f_{25}^* = 2/3$. So $f_{15}^* = 2/3$, and similarly we can check that $f_{16}^* = 1/3$; thus player A is twice as likely to win the game with his choice of symbols. ⊛
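The four equations above form a linear system, so they can also be solved mechanically. A numpy sketch: writing f for the vector $(f_{15}^*, \ldots, f_{45}^*)$, the first-step decomposition (3.3) reads f = b + Qf, where Q is the restriction of P to the transient states {1, 2, 3, 4} and b holds the one-step probabilities into state 5.

```python
import numpy as np

Q = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])
b = np.array([0.0, 0.0, 0.5, 0.0])     # p_{i5} for i = 1, ..., 4

# First-step decomposition: f = b + Q f, i.e. (I - Q) f = b.
f = np.linalg.solve(np.eye(4) - Q, b)
print(f)    # -> [2/3, 2/3, 1, 1/3]; f[0] = P(A wins) = 2/3
```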
Exercise 3.33 (Exercise sheet 4). Consider the following random walk: [diagram omitted]. What is the probability that the chain ever reaches state 0 if it starts at state 4? ⊛
In a similar way to which we decomposed $f_{ij}^*$ according to the first step of the chain X, we can also decompose the transition probability $p_{ij}^{(n)}$ according to the first time that the chain X visits j if started at i – this gives the first-passage decomposition:
$$p_{ij}^{(n)} = \sum_{m=1}^{n} f_{ij}^{(m)} p_{jj}^{(n-m)} = \sum_{u=0}^{n-1} f_{ij}^{(n-u)} p_{jj}^{(u)}, \quad n \ge 1 \tag{3.4}$$
(where for the second equality we have simply changed variables, using u = n − m).
Notice that we have now expressed our n-step transition probability as a convolution
of two sequences: this suggests that we should use generating functions! So let
$$P_{ij}(z) = \sum_{n=0}^{\infty} p_{ij}^{(n)} z^n \quad \text{and} \quad F_{ij}(z) = E\left[z^{T_{i,j}}\right] = \sum_{n=1}^{\infty} f_{ij}^{(n)} z^n.$$
(Be careful with the lower limits of the two sums here, and note that $P_{ij}(z)$ is only guaranteed to converge for |z| < 1, since the series $\sum_n p_{ij}^{(n)}$ is not necessarily finite.)
Using (3.4), we obtain
\begin{align*}
P_{ij}(z) = p_{ij}^{(0)} + \sum_{n=1}^{\infty} p_{ij}^{(n)} z^n &= p_{ij}^{(0)} + \sum_{n=1}^{\infty} \left( \sum_{u=0}^{n-1} f_{ij}^{(n-u)} p_{jj}^{(u)} \right) z^n \\
&= p_{ij}^{(0)} + \sum_{u=0}^{\infty} p_{jj}^{(u)} z^u \left( \sum_{n=u+1}^{\infty} f_{ij}^{(n-u)} z^{n-u} \right) \\
&= p_{ij}^{(0)} + P_{jj}(z) F_{ij}(z).
\end{align*}
That is,
$$P_{ij}(z) = p_{ij}^{(0)} + P_{jj}(z) F_{ij}(z), \quad \text{when } |z| < 1. \tag{3.5}$$
This useful formula allows us to derive a simple criterion for determining whether or
not a state is recurrent.
Theorem 3.34. The state i is recurrent if and only if $\sum_n p_{ii}^{(n)} = \infty$. Furthermore, if i is transient then
$$\sum_{n=0}^{\infty} p_{ii}^{(n)} = \frac{1}{1 - f_{ii}^*} < \infty.$$

Proof. State i is recurrent iff $f_{ii}^* = 1$ (by Definition 3.31). Since $f_{ii}^* = \sum_n f_{ii}^{(n)}$, we see that, using Abel's Theorem (Theorem 1.1), $f_{ii}^* = 1$ iff $\lim_{z \uparrow 1} F_{ii}(z) = 1$. Now, by equation (3.5) we can write
$$F_{ii}(z) = \frac{P_{ii}(z) - 1}{P_{ii}(z)}$$
(since $p_{ii}^{(0)} = 1$), and so $\lim_{z \uparrow 1} F_{ii}(z) = 1$ iff $\lim_{z \uparrow 1} P_{ii}(z) = \infty$. Again using Abel's Theorem, this occurs if and only if $\sum_n p_{ii}^{(n)} = \infty$.
Exercise 3.35. Show that if i is transient then $p_{ji}^{(n)} \to 0$ as n → ∞, for all j ∈ S. ⊛
Theorem 3.36. Recurrence is a class property: if i ∼ j and i is recurrent, then j is also recurrent.

Proof. Suppose i ↔ j, and choose n and m with $p_{ij}^{(n)} > 0$ and $p_{ji}^{(m)} > 0$. Then for every r ≥ 0 we have $p_{jj}^{(m+r+n)} \ge p_{ji}^{(m)} p_{ii}^{(r)} p_{ij}^{(n)}$, and so
$$p_{ji}^{(m)} p_{ij}^{(n)} \sum_{r} p_{ii}^{(r)} \le \sum_{r} p_{jj}^{(m+r+n)}.$$
If i is recurrent, then the sum on the LHS of this equation is infinite (by Theorem 3.34), and so the sum on the RHS must also be infinite, showing j to be recurrent.
This means that we only need to check recurrence/transience for one state in a
communicating class in order to determine whether all states in that class are recurrent.
How do recurrence/transience relate to the labels essential/inessential?
Theorem 3.37. Suppose that i is recurrent, and that i → j. Then $f_{ji}^* = 1$ and $f_{ij}^* = 1$; in particular, j is also recurrent.

Proof. Since i is recurrent and i → j, we know that $f_{ii}^* = 1$. Putting this into the first-step decomposition (3.3) we obtain, for any k ∈ S,
$$f_{ki}^* = p_{ki} + \sum_{l \ne i} p_{kl} f_{li}^* = \sum_{l} p_{kl} f_{li}^*.$$
Substituting this expression for $f_{li}^*$ into the RHS of this equation yields
$$f_{ki}^* = \sum_{l} p_{kl} \sum_{g} p_{lg} f_{gi}^* = \sum_{g} \left( \sum_{l} p_{kl} p_{lg} \right) f_{gi}^* = \sum_{g} p_{kg}^{(2)} f_{gi}^*,$$
and continuing in this way we see that, for any n ∈ N,
$$f_{ki}^* = \sum_{g} p_{kg}^{(n)} f_{gi}^*.$$
But since this holds for any state k ∈ S, it certainly holds when k = i. Thus
$$1 = f_{ii}^* = \sum_{g} p_{ig}^{(n)} f_{gi}^*, \quad \text{for any } n \in \mathbb{N}.$$
Nearly there now! Since $\sum_{g} p_{ig}^{(n)} = 1$ (it's the sum of row i of the transition matrix $P^{(n)}$), we get
$$0 = \sum_{g} p_{ig}^{(n)} \left(1 - f_{gi}^*\right), \quad \text{for any } n \in \mathbb{N}.$$
But here we have a set of non-negative terms which adds up to zero, implying that every term in the sum must be zero. However, we know (from one of the assumptions of the theorem) that i → j, and so there exists n ∈ N such that $p_{ij}^{(n)} > 0$. This means that $1 - f_{ji}^* = 0$, and so $f_{ji}^* = 1$, and so j → i.
Therefore j ∼ i, and it follows that j must also be recurrent (by Theorem 3.36). Finally, to show that $f_{ij}^* = 1$ as well, simply reverse the roles of i and j in the above argument.
This theorem says that if X starts at the recurrent state i (i.e. a state to which it
will return in finite time with probability 1), then it is impossible for X to reach a state
from which it is impossible to return to i (i.e. i is essential). So once a chain enters a
recurrent class, it must stay there forever. Hopefully this makes good intuitive sense!
In general, the converse result does not hold. However:

Theorem 3.38. If C is a finite essential communicating class, then every state in C is recurrent.
Example 3.39. Consider the chain on S = {1, 2, 3} with transition matrix
$$P = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$
We can see here that state 1 is inessential: it is possible for the chain to leave 1
and never return. Since it is inessential, by Theorem 3.37 it must also be transient (i.e.
$f_{11}^* < 1$). We can check this by calculating $f_{11}^*$ here easily:
$$f_{11}^* = P(X_1 = 1 \mid X_0 = 1) + P(X_2 = 1, X_1 \ne 1 \mid X_0 = 1) + \cdots = 1/3 + 0 + 0 + \cdots = 1/3.$$
Thus
$$\sum_{n=0}^{\infty} p_{11}^{(n)} = \sum_{n=0}^{\infty} (1/3)^n = \frac{3}{2}.$$
Finally, note that, since $p_{22}^{(2)} = 1$, $f_{22}^* = 1$, and we see that state 2 is recurrent. (We could have inferred this directly from Theorem 3.38 in fact, since C = {2, 3} is a finite essential class, hence recurrent.) Furthermore, $p_{22}^{(n)} = 1$ if n is even, and zero if n is odd. Thus $\sum_{n=0}^{\infty} p_{22}^{(n)} = \infty$, as expected, since state 2 is recurrent. ⊛
3.3 Interesting example: random walk on Z
Consider once more the simple random walk on Z of Example 3.3, which from any state moves up by one with probability p and down by one with probability q = 1 − p.
(We shall assume that p is not equal to zero or one, since otherwise this chain is pretty
boring!) This is symmetric if p = 1/2, otherwise asymmetric. Note that the whole state
space forms a single communicating class (so the random walk is irreducible), and that
every state has period 2.
Exercise 3.40 (Exercise sheet 4). Show that, for all n ≥ 0 and k ∈ Z,
$$P(X_n = k \mid X_0 = 0) = \begin{cases} \binom{n}{(n+k)/2}\, p^{(n+k)/2} q^{(n-k)/2} & \text{if } n + k \text{ is even} \\ 0 & \text{if } n + k \text{ is odd.} \end{cases}$$
Consider $F_{01}(z) = E\left[z^{T_{0,1}}\right]$, where (recall that) $T_{0,1}$ is the time that X first hits 1 if started from 0. Conditioning on the first step of the walk, which goes up with probability p and down with probability q,
$$F_{01}(z) = E\left[z^{T_{0,1}}\right] = zp + zq\, E\left[z^{T_{-1,1}}\right].$$
Now consider T−1,1 . The time taken to get from -1 to 1 must be equal (in distribution)
to the time taken to get from -1 to 0 for the first time plus the time taken to get from
0 to 1 for the first time. So T−1,1 has the same distribution as T−1,0 + T0,1 . But these
two times are independent random variables, and by symmetry (translation invariance)
have the same distribution as each other; thus
$$E\left[z^{T_{-1,1}}\right] = E\left[z^{T_{-1,0} + T_{0,1}}\right] = E\left[z^{T_{-1,0}}\right] E\left[z^{T_{0,1}}\right] = \left(E\left[z^{T_{0,1}}\right]\right)^2 = F_{01}(z)^2.$$
Putting this into the above we obtain a quadratic equation for $F_{01}(z)$:
$$qz\, F_{01}(z)^2 - F_{01}(z) + pz = 0,$$
with solutions
$$F_{01}(z) = \frac{1 \pm \sqrt{1 - 4pqz^2}}{2qz}.$$
We now have to decide which root to take. We know that F01 (z) is a continuous
function of z (it’s a power series), and that F01 (z) → 0 as z → 0; if we take the positive
root above then the RHS tends to infinity as z → 0, and so we must take the negative
root. Thus
$$F_{01}(z) = \frac{1 - \sqrt{1 - 4pqz^2}}{2qz}.$$
Arguing as above, it follows that $F_{0k}(z) = F_{01}(z)^k$, for any k ≥ 1. We can use this to determine whether or not the random walk X is recurrent or transient: since the walk is irreducible, it follows that X is recurrent if and only if $f_{0k}^* = 1$ for all k ∈ Z.
Now, when k ≥ 1, we see that
$$f_{0k}^* = \lim_{z \uparrow 1} F_{0k}(z) = \left( \frac{1 - \sqrt{1 - 4pq}}{2q} \right)^{k} = \left( \frac{1 - |1 - 2p|}{2q} \right)^{k} = \begin{cases} (p/q)^k & \text{if } p < 1/2 \\ 1 & \text{if } p \ge 1/2. \end{cases}$$
So if p < 1/2, there is a positive chance that X never reaches state k ≥ 1; by symmetry,
if p > 1/2, the random walk will tend to drift upwards, and there will be a positive
chance that it will never visit any given negative state. We have shown that
Theorem 3.41. The simple random walk on Z is recurrent if and only if it is symmetric.
It can similarly be shown (without much further effort) that the symmetric random
walk in 2 dimensions is recurrent, but that in 3 (or more) dimensions it is transient!
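The dichotomy of Theorem 3.41 shows up clearly in simulation. A sketch estimating $f_{01}^*$ (the finite horizon max_steps is an assumption; for p = 1/2 it biases the estimate of $f_{01}^* = 1$ slightly downwards, since some paths that would eventually hit 1 are cut off):

```python
import numpy as np

rng = np.random.default_rng(0)

def hits_one(p, max_steps=2000):
    """Does the walk started at 0 reach +1 within max_steps steps?"""
    x = 0
    for _ in range(max_steps):
        x += 1 if rng.random() < p else -1
        if x == 1:
            return True
    return False   # truncation: 'not yet' is treated as 'never'

for p in (0.4, 0.5, 0.6):
    est = np.mean([hits_one(p) for _ in range(1000)])
    print(f"p = {p}: simulated {est:.3f}, theory {min(p / (1 - p), 1.0):.3f}")
```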
3.4 The fundamental matrix
Definition 3.42. The square matrix Q is a substochastic matrix if all its entries are non-negative and all of its row-sums are no greater than one. (Recall that transition matrices are stochastic matrices – all row sums equal one.)
Definition 3.43. The fundamental matrix associated with Q is
$$G = I + Q + Q^2 + Q^3 + \cdots.$$
Note that the entries of G take values in the range [0, ∞].
Remark 3.44. If P is a transition matrix for the chain X and $G = \sum_{n=0}^{\infty} P^n = \sum_{n=0}^{\infty} P^{(n)}$ is its fundamental matrix, then the (i, j)th entry of G satisfies
$$G_{ij} = \sum_{n=0}^{\infty} p_{ij}^{(n)} = E\left[\text{number of times } X \text{ visits } j \mid X_0 = i\right].$$
Example 3.45. Consider the chain on S = {1, 2} with transition matrix
$$P = \begin{pmatrix} 2/3 & 1/3 \\ 0 & 1 \end{pmatrix}.$$
Then
$$P^{(n)} = \begin{pmatrix} (2/3)^n & 1 - (2/3)^n \\ 0 & 1 \end{pmatrix},$$
and so
$$G = \begin{pmatrix} 3 & \infty \\ 0 & \infty \end{pmatrix}.$$
Thus if X begins in the transient state 1 then it spends on average three units of time
there, and infinitely long in state 2 (once it gets there it never leaves). If it begins in
state 2 then it spends no time at all in state 1 (it can’t get there), and infinitely long in
state 2. ⊛
We have seen in Theorem 3.34 that we can relate the first-passage probabilities to the expected number of times that X visits each state. It follows that
$$G_{ii} = \begin{cases} \infty & \text{if } i \text{ is recurrent} \\ \dfrac{1}{1 - f_{ii}^*} & \text{if } i \text{ is transient.} \end{cases}$$
Furthermore, if i ≠ j then
$$G_{ij} = \begin{cases} 0 & \text{if } i \nrightarrow j \\ \infty & \text{if } i \to j \text{ and } j \text{ is recurrent} \\ \dfrac{f_{ij}^*}{1 - f_{jj}^*} & \text{if } i \to j \text{ and } j \text{ is transient,} \end{cases}$$
where we have implicitly used the Markov property in the second line to argue that the chain effectively 'starts again with X0 = j' once it has reached j for the first time.
Example 3.46. Looking back to Example 3.45, we know that state 1 is transient, with $G_{11} = 3$. Thus $f_{11}^* = 2/3$. ⊛
Remark 3.47. Note that if j is recurrent then Gij equals 0 or ∞ for all i ∈ S; if j is
transient then Gij < ∞ for all i.
When S is finite, we can calculate the entries of G corresponding to transient states by restricting P to the set of transient states T (forming the substochastic matrix Q = ($p_{ij}$ : i, j ∈ T)), and calculating the fundamental matrix for Q, $G_Q$. Then, arguing as for geometric series, we find that
$$G_Q = I + Q + Q^2 + \cdots = I + Q(I + Q + Q^2 + \cdots) = I + Q G_Q.$$
Furthermore, we know that all of the entries of $G_Q$ are finite (by the above discussion), and so we may subtract $Q G_Q$ from each side to obtain
$$I = (I - Q) G_Q.$$
Finally, since we are dealing with a finite state-space S here, all of these matrices are finite-dimensional and we can deduce that
$$G_Q = (I - Q)^{-1}. \tag{3.6}$$
Example 3.48. Consider again the coin-pattern game of Example 3.32. States T = {1, 2, 3, 4} are transient, and {5} and {6} are two recurrent communicating classes. We can immediately fill in the entries of G involving either of the recurrent states:
$$G = \begin{pmatrix} ? & ? & ? & ? & \infty & \infty \\ ? & ? & ? & ? & \infty & \infty \\ ? & ? & ? & ? & \infty & 0 \\ ? & ? & ? & ? & \infty & \infty \\ 0 & 0 & 0 & 0 & \infty & 0 \\ 0 & 0 & 0 & 0 & 0 & \infty \end{pmatrix}.$$
The restriction of P to the transient states is
$$Q = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/2 & 1/2 \\ 0 & 0 & 1/2 & 0 \\ 1/2 & 0 & 0 & 0 \end{pmatrix}.$$
We then calculate
$$G_Q = (I - Q)^{-1} = \begin{pmatrix} 1/2 & -1/2 & 0 & 0 \\ 0 & 1 & -1/2 & -1/2 \\ 0 & 0 & 1/2 & 0 \\ -1/2 & 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 8/3 & 4/3 & 4/3 & 2/3 \\ 2/3 & 4/3 & 4/3 & 2/3 \\ 0 & 0 & 2 & 0 \\ 4/3 & 2/3 & 2/3 & 4/3 \end{pmatrix}.$$
Filling these values into G gives
$$G = \begin{pmatrix} 8/3 & 4/3 & 4/3 & 2/3 & \infty & \infty \\ 2/3 & 4/3 & 4/3 & 2/3 & \infty & \infty \\ 0 & 0 & 2 & 0 & \infty & 0 \\ 4/3 & 2/3 & 2/3 & 4/3 & \infty & \infty \\ 0 & 0 & 0 & 0 & \infty & 0 \\ 0 & 0 & 0 & 0 & 0 & \infty \end{pmatrix}.$$
So, for example, if X0 = 1 then on average the chain spends 4/3 units of time in state
2. If we started the chain in state 3 then we would expect it to spend a total of 2 units
of time in state 3 before being absorbed in state 5. ⊛
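Equation (3.6) makes this a one-line computation; a quick numpy check (a sketch, not part of the original notes):

```python
import numpy as np

Q = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])
GQ = np.linalg.inv(np.eye(4) - Q)   # equation (3.6)
print(GQ * 3)   # -> [[8,4,4,2],[2,4,4,2],[0,0,6,0],[4,2,2,4]], i.e. GQ as above
```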
Notation: If C ⊆ S then we let $T_{i,C}$ be the first time that X hits C if started at i. We shall write
$$p_{iC} = \sum_{j \in C} p_{ij}$$
for the probability that X jumps from i into C in one step, and
$$f_{iC}^* = P(X_n \in C \text{ for some } n > 0 \mid X_0 = i) = P(T_{i,C} < \infty).$$
That is, $f_{iC}^*$ is the probability that X ever hits the set of states C when started from i.
Theorem 3.49. Suppose T is the union of all transient classes and C is any union of recurrent classes such that $f_{iC}^* = 1$ for all i ∈ T. Then
$$\left(E[T_{i,C}] : i \in T\right) = G_Q e.$$
Here e is a column vector full of 1s, with as many entries as there are states in T.

Proof. Because X begins in T and must end up in C (since $f_{iC}^* = 1$ for all i ∈ T), $T_{i,C}$
is equal to the amount of time that X spends in T . (Here we are using the fact that
recurrent states are essential – once the chain hits a recurrent state it is impossible for it
to move to a transient state.) But the (i, j)th entry of GQ is equal to the mean number
of times the chain visits the transient state j when started at i: if we add up all of these
entries (summing over j) then we obtain the mean amount of time spent in all transient
states. But this is exactly what we get when we multiply GQ by e.
Example 3.50. Consider Example 3.48 once again. Here T = {1, 2, 3, 4} is the set of all
transient states, and C = {5, 6} is the union of all recurrent classes. Using Theorem 3.49
we calculate:
$$\begin{pmatrix} E[T_{1,C}] \\ E[T_{2,C}] \\ E[T_{3,C}] \\ E[T_{4,C}] \end{pmatrix} = G_Q e = \begin{pmatrix} 8/3 & 4/3 & 4/3 & 2/3 \\ 2/3 & 4/3 & 4/3 & 2/3 \\ 0 & 0 & 2 & 0 \\ 4/3 & 2/3 & 2/3 & 4/3 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 6 \\ 4 \\ 2 \\ 4 \end{pmatrix}.$$
So we see that if X0 = 1, the average amount of time that the chain spends in T (i.e.
the amount of time until somebody wins the game) is 6. ⊛
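Numerically, Theorem 3.49 is a single extra matrix-vector product on top of the sketch above (the matrix Q is repeated so the snippet is self-contained):

```python
import numpy as np

Q = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])
GQ = np.linalg.inv(np.eye(4) - Q)
print(GQ @ np.ones(4))   # -> [6, 4, 2, 4]: mean absorption times from 1..4
```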
As well as using the matrix GQ to calculate the mean amount of time that X spends
in transient states, we can also use it to calculate the probability that the chain will ever
visit the recurrent class C. (If there is only one recurrent class then this probability is of
course 1; but if there is more than one such class, the chain can only ever enter one of
them, and so this problem becomes more interesting.)
Example 3.51. Consider Example 3.48 again: here there are two recurrent classes (C1 = {5} and C2 = {6}), and we are interested in the probability that the chain ever visits C1 (since this is the probability that Player A wins). Remember that we have already calculated this probability, $f_{15}^* = 2/3$, in Example 3.32 by solving a set of linear equations – we promised then that there is a more efficient way to do this using matrices ... ⊛
Theorem 3.52. Let C be a recurrent class, and T the union of all transient classes. Let $F_C^*$ be the vector of absorption probabilities
$$F_C^* = \left(f_{iC}^* : i \in T\right),$$
and let $R_C$ be the matrix of one-step transition probabilities from T into C,
$$R_C = \left(p_{ij} : i \in T, j \in C\right).$$
Then
$$F_C^* = G_Q R_C e \tag{3.7}$$
where e is again a column vector of 1s (with as many entries as the number of states in C).
Remark 3.53. If you have trouble remembering how many entries there should be in the
vector e (|T | in Theorem 3.49, and |C| in Theorem 3.52), then note that in both cases the
number of entries is exactly the right number to make the matrix multiplication work!
Proof of Theorem 3.52. Recall that $f_{iC}^*$ is the probability that X ever reaches C if started at i. In order to reach C, X has to spend some number n ≥ 0 of steps in the set of transient states T, and then jump (in one step) from some state k ∈ T to some state j ∈ C. Thus we can write
\begin{align*}
f_{iC}^* = \sum_{n \ge 0} \sum_{k \in T} \sum_{j \in C} p_{ik}^{(n)} p_{kj} &= \sum_{k \in T} \sum_{j \in C} \left( \sum_{n \ge 0} p_{ik}^{(n)} \right) p_{kj} \\
&= \sum_{j \in C} \sum_{k \in T} (G_Q)_{ik}\, p_{kj} = \sum_{j \in C} (G_Q R_C)_{ij} = \left(G_Q R_C e\right)_i.
\end{align*}
Example 3.54. Returning to Example 3.51, we know from Example 3.50 that
$$G_Q = \begin{pmatrix} 8/3 & 4/3 & 4/3 & 2/3 \\ 2/3 & 4/3 & 4/3 & 2/3 \\ 0 & 0 & 2 & 0 \\ 4/3 & 2/3 & 2/3 & 4/3 \end{pmatrix}.$$
Here $R_{C_1} = (p_{i5} : i \in T) = (0, 0, 1/2, 0)^{\top}$, and so (3.7) gives
$$F_{C_1}^* = G_Q R_{C_1} e = \begin{pmatrix} 2/3 \\ 2/3 \\ 1 \\ 1/3 \end{pmatrix}.$$
This tells us that if we start in state 1 or 2, the probability of ever reaching C1 (and Player A winning) is 2/3 (as we had already calculated!); if the chain starts in state 3 then Player A is certain to win (3 ↛ 6, and so this makes sense!); if the chain starts in state 4, then it is more likely that B wins – A only has a 1/3 chance of winning from this starting state. ⊛
Exercise 3.55. Since the game must end with either Player A or Player B winning, it follows that
$$F_{C_2}^* = \begin{pmatrix} 1/3 \\ 1/3 \\ 0 \\ 2/3 \end{pmatrix}.$$
Check that you can obtain this result by replacing $R_{C_1}$ with $R_{C_2}$ above. ⊛
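Equation (3.7) reduces both absorption probabilities to one line each; a numpy sketch (here each recurrent class consists of a single state, so R_C is a column vector and e is just the scalar 1):

```python
import numpy as np

Q = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0],
])
GQ = np.linalg.inv(np.eye(4) - Q)

R_C1 = np.array([0.0, 0.0, 0.5, 0.0])   # p_{i5} for i in T
R_C2 = np.array([0.0, 0.0, 0.0, 0.5])   # p_{i6} for i in T
print(GQ @ R_C1)   # F*_{C1} -> [2/3, 2/3, 1, 1/3]
print(GQ @ R_C2)   # F*_{C2} -> [1/3, 1/3, 0, 2/3]; the two sum to 1
```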
Example 3.56. Consider the random walk on S = {0, 1, 2, 3, 4} which at each step moves up with probability 1/4 and down with probability 3/4, and which is absorbed when it hits state 0 or state 4:
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 3/4 & 0 & 1/4 & 0 & 0 \\ 0 & 3/4 & 0 & 1/4 & 0 \\ 0 & 0 & 3/4 & 0 & 1/4 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
Here T = {1, 2, 3} is the set of transient states, and C1 = {0} and C2 = {4} are the two recurrent classes. Using Q, the restriction of P to T, we calculate
$$G_Q = (I - Q)^{-1} = \frac{1}{10} \begin{pmatrix} 13 & 4 & 1 \\ 12 & 16 & 4 \\ 9 & 12 & 13 \end{pmatrix}.$$
We can then calculate the mean time until the random walk hits one of the recurrent states belonging to C = C1 ∪ C2:
$$\begin{pmatrix} E[T_{1,C}] \\ E[T_{2,C}] \\ E[T_{3,C}] \end{pmatrix} = G_Q e = \frac{1}{10} \begin{pmatrix} 13 & 4 & 1 \\ 12 & 16 & 4 \\ 9 & 12 & 13 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 18/10 \\ 32/10 \\ 34/10 \end{pmatrix}.$$
So, for example, if the chain begins in state 2, it will spend on average 32/10 units of time in states {1, 2, 3} before ending up in one of the two absorbing states. Furthermore, with $R_{C_1} = (3/4, 0, 0)^{\top}$, equation (3.7) gives $F_{C_1}^* = G_Q R_{C_1} e = (39/40, 9/10, 27/40)^{\top}$: starting from state 2, the probability that it will ever visit state 0 (and remain there forever) is 9/10. ⊛
3.5 Stationarity
In this (final) section of the notes we look at the behaviour of Markov chains as n → ∞.
In general we would not expect Xn to converge to any given state, but we might hope
that as n gets large the distribution of Xn (that is, $\nu^{(n)} = \nu^{(0)} P^n$) will converge.
Definition 3.57. The row vector π = ($\pi_i$ : i ∈ S) is called a stationary (or invariant, or equilibrium) distribution of the Markov chain X if and only if π satisfies the following conditions:
(a) $\pi_i \ge 0$ for all i ∈ S;
(b) $\sum_{i \in S} \pi_i = 1$; and
(c) $\pi = \pi P$.
Exercise 3.58 (Exercise sheet 4). Show that part (c) of Definition 3.57 is equivalent to
π = πP n for all n ≥ 0.
In the previous section we saw that S can be decomposed into a set of transient
states T and a number of recurrent classes {Ci }. We have also seen (in Exercise 3.35)
(n)
that pij → 0 as n → ∞ for all j ∈ T . Since S is finite, there is at least one such
recurrent class and we know that the chain will eventually leave T , enter one of the sets
Ci , and remain there forever. If we expect a stationary distribution to tell us about the
long-term behaviour of X, we therefore expect the non-zero entries of π to relate to the
recurrent states in S.
Theorem 3.60. Suppose that S is finite, and that it contains exactly one recurrent class. Then X has a unique stationary distribution π, given by $\pi_i = 1/\mu_i$ for recurrent states i and $\pi_i = 0$ for transient states i, where
$$\mu_i = E[T_{i,i}] = \sum_{n=1}^{\infty} n f_{ii}^{(n)}$$
is the mean recurrence time of state i.

Definition 3.61. A recurrent state i is called
• positive-recurrent if $\mu_i < \infty$;
• null-recurrent if $\mu_i = \infty$.
Remark 3.62. If S is finite (as we are assuming throughout this section), then all
recurrent states are positive-recurrent.
So if there is only one recurrent class in S, we can find the mean recurrence time
of each state by finding the unique stationary distribution π.
Example. Consider a chain on S = {1, 2, 3, 4} in which {1, 2} is a class of recurrent states, states 3 and 4 are transient, and the transition probabilities among states 1 and 2 are given by the matrix below. Theorem 3.60 tells us that there is a unique stationary distribution π, with π3 = π4 = 0. In order to find π1 and π2, we can restrict attention to that part of P corresponding to these two states. We must therefore solve
$$(\pi_1, \pi_2) = (\pi_1, \pi_2) \begin{pmatrix} 1/3 & 2/3 \\ 1/2 & 1/2 \end{pmatrix},$$
which, together with π1 + π2 = 1, gives (π1, π2) = (3/7, 4/7). The mean recurrence times are then µ1 = 1/π1 = 7/3 and µ2 = 1/π2 = 7/4. ⊛

Example. Now consider the random walk on S = {0, 1, 2, 3, 4} which moves up with probability 1/4 and down with probability 3/4, holding in place whenever a step would take it outside S:
$$P = \begin{pmatrix} 3/4 & 1/4 & 0 & 0 & 0 \\ 3/4 & 0 & 1/4 & 0 & 0 \\ 0 & 3/4 & 0 & 1/4 & 0 \\ 0 & 0 & 3/4 & 0 & 1/4 \\ 0 & 0 & 0 & 3/4 & 1/4 \end{pmatrix}.$$
Solving π = πP (with $\sum_{i=0}^{4} \pi_i = 1$) gives $\pi = \frac{1}{121}(81, 27, 9, 3, 1)$. Thus, if X0 = 0, we see that µ0, the mean time until X returns to 0, is given by µ0 = 1/π0 = 121/81; whereas if X0 = 4, the mean time until it returns to its starting state is 121. ⊛
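Stationary distributions can be computed by solving the linear system π = πP together with the normalisation $\sum_i \pi_i = 1$. A numpy sketch, using the boundary-holding walk above (the lstsq call solves the overdetermined but consistent system exactly):

```python
import numpy as np

P = np.array([
    [3/4, 1/4, 0,   0,   0  ],
    [3/4, 0,   1/4, 0,   0  ],
    [0,   3/4, 0,   1/4, 0  ],
    [0,   0,   3/4, 0,   1/4],
    [0,   0,   0,   3/4, 1/4],
])

# pi = pi P  <=>  (I - P)^T pi^T = 0; append the normalisation row.
n = len(P)
A = np.vstack([(np.eye(n) - P).T, np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
pi = np.linalg.lstsq(A, b, rcond=None)[0]
print(pi * 121)   # -> approximately [81, 27, 9, 3, 1]
```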
We have seen that a stationary distribution (if one exists) satisfies $\pi P^n = \pi$ for all n. We now study the relationship between π and the limiting behaviour of $p_{ij}^{(n)}$ as n → ∞. We have already seen that if j is transient, then $p_{ij}^{(n)} \to 0$ for all i ∈ S. What can be said about the limit of these probabilities if j is recurrent? In considering limiting behaviour, we will restrict attention to the case where all states are aperiodic: in this case we call the chain itself aperiodic. (Similar, but messier, results can be derived for chains with period d > 1 by subsampling the chain at times kd, k ≥ 0.)
Theorem 3.66. Suppose that X is an aperiodic Markov chain with state-space S. Then for any recurrent state j,
$$p_{ij}^{(n)} \to \frac{f_{ij}^*}{\mu_j} \quad \text{as } n \to \infty, \text{ for all } i \in S.$$
In particular, if X is irreducible, so that $f_{ij}^* = 1$ for all i and j, then
$$p_{ij}^{(n)} \to \frac{1}{\mu_j} = \pi_j \quad \text{as } n \to \infty, \text{ for all } i \in S.$$
Example. Consider a chain on S = {1, 2, 3, 4, 5} in which state 1 is absorbing (so its row of P is (1, 0, 0, 0, 0)). Here there is one transient class {3, 5} and two positive-recurrent aperiodic classes, {1} and {2, 4}. We can therefore immediately work out many of the entries in the limit of $P^{(n)}$:
$$P^{(n)} \to \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & ? & 0 & ? & 0 \\ ? & ? & 0 & ? & 0 \\ 0 & ? & 0 & ? & 0 \\ ? & ? & 0 & ? & 0 \end{pmatrix}.$$
For the missing entries of this matrix, we see from Theorem 3.66 that we need to find the probabilities $f_{ij}^*$. By Theorem 3.37, $f_{24}^* = f_{42}^* = 1$, and so we just need to find $f_{ij}^*$ for i ∈ {3, 5} and j ∈ {2, 4}. Using the first-step decomposition (3.3), or our fundamental matrix equations (3.7), we obtain
$$f_{32}^* = \tfrac{1}{3} f_{32}^* + \tfrac{1}{3} f_{52}^* \;\Rightarrow\; f_{32}^* = \tfrac{1}{2} f_{52}^*;$$
$$f_{52}^* = \tfrac{1}{3} + \tfrac{1}{3} f_{32}^* = \tfrac{1}{3} + \tfrac{1}{6} f_{52}^* \;\Rightarrow\; f_{52}^* = \tfrac{2}{5} \text{ and } f_{32}^* = \tfrac{1}{5}.$$
This immediately tells us that $f_{34}^* = f_{32}^* = 1/5$ and $f_{54}^* = f_{52}^* = 2/5$; so $f_{31}^* = 4/5$ and $f_{51}^* = 3/5$.
Treating the class {2, 4} as a chain in its own right and solving for its stationary distribution gives (π2, π4) = (2/5, 3/5) (remembering that we always need π2 + π4 = 1). Thus µ2 = 1/π2 = 5/2 and µ4 = 1/π4 = 5/3.
We can now finish filling in the limiting matrix, using $p_{ij}^{(n)} \to f_{ij}^*/\mu_j$:
$$P^{(n)} \to \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 2/5 & 0 & 3/5 & 0 \\ 4/5 & 2/25 & 0 & 3/25 & 0 \\ 0 & 2/5 & 0 & 3/5 & 0 \\ 3/5 & 4/25 & 0 & 6/25 & 0 \end{pmatrix}.$$
THE END