18.615 Notes
Rachel Wu
Spring 2017
Contents
1 February 28, 2017
  1.1 Markov chains
  1.2 Classification of states
2 March 2, 2017
  2.1 Markov chain convergence
  2.2 Implications of convergence
3 March 7, 2017
  3.1 Markov chains (continued)
  3.2 Simple random walks in Z
  3.3 Counting paths examples
4 March 9, 2017
  4.1 More simple random walks in Z
  4.2 Applications to probability
7 April 4, 2017
  7.1 Infinite-state Markov chains
8 April 6, 2017
  8.1 Infinite-state Markov chains, ctd.
  8.2 Applications of Infinite-state MCs
13 May 1, 2018
1 February 28, 2017
The former implies that i is reachable from j, and the latter implies that j
is also reachable from i.
The state space S can be partitioned into S = S̃_1 ∪ S̃_2 ∪ · · · ∪ S̃_l, where the S̃_i are disjoint, nonempty communication classes. Each class has the property that every pair of states i, j in it satisfies i ↔ j.
[Figure: an example transition graph.]
One communication class, since all states are reachable from all others.
How about for gambler’s ruin? There are three classes: {0}, {N}, and {1, . . . , N − 1}.
Since a Markov chain can be viewed as a directed graph, communication classes are just its maximal strongly connected subgraphs.
Definition 1.4. A Markov chain is called irreducible if it has only one com-
munication class.
Definition 1.5. Let S̃ be a communication class, and let X_0 ∈ S̃. If Pr{X_n ∉ S̃} → 1, then S̃ is called transient, and all states s ∈ S̃ are transient states. Otherwise, S̃ is recurrent.
Example 1.7
If the set of possible return times to a state is {2, 4, 6, 8, . . .}, then its period is 2. For gambler's ruin, the absorbing states 0 and N have self-loops, so their classes are aperiodic, while the interior states have period 2.
In fact, if an irreducible Markov chain has any loop (an edge from a node to itself), then it is aperiodic.
Proposition 1.8
If there exists a ∈ Z+ , such that all entries of P a are strictly positive, then
the Markov chain is irreducible and aperiodic.
Proof. If P^a(i, j) > 0 for all i, j, then there is only one communication class, and the chain is irreducible by definition. Moreover, every entry of P^{a+1} = P^a P is also strictly positive (each state has an incoming edge from some state, so every column of P has a positive entry). Hence a ∈ T_1 and (a + 1) ∈ T_1 as well, and gcd(a, a + 1) = 1, so the chain is aperiodic.
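As a quick numerical check of this criterion (a minimal sketch of my own, not from the lecture; the transition matrix P below is an arbitrary illustrative choice), one can raise P to successive powers and look for one whose entries are all strictly positive:

```python
import numpy as np

def has_positive_power(P, max_power=None):
    """Return (True, a) if some power P^a has all entries strictly positive.

    By Proposition 1.8 this is sufficient for the chain to be irreducible
    and aperiodic.  Checking powers up to n^2 is a crude but safe bound
    for an n-state chain.
    """
    n = P.shape[0]
    max_power = max_power or n * n
    Q = np.eye(n)
    for a in range(1, max_power + 1):
        Q = Q @ P
        if np.all(Q > 0):
            return True, a
    return False, None

# A small chain with a self-loop at state 0 (illustrative choice).
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
print(has_positive_power(P))   # (True, 4) for this particular P
```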
2 March 2, 2017
Pset 2 is due next lecture. Office hours March 6, 2-3p.
Lemma 2.1
g_{n,j} − s_{n,j} → 0 as n → ∞, where g_{n,j} = max_i P^n(i, j) and s_{n,j} = min_i P^n(i, j).
Proof of lemma. Let d > 0 be the smallest entry of P^a. We write out the matrix entry form of P^{a(n+1)} = P^a P^{an}, and see that

g_{a(n+1),j} = Σ_k P^a(î, k) P^{an}(k, j) ≤ (1 − d) g_{an,j} + d s_{an,j},   (2.3)

where î is the row attaining the maximum. (The professor's exact indices were hard to catch in lecture; the idea is that during the matrix multiplication the largest entries get averaged against the smallest ones, so the largest numbers shrink and the smallest numbers grow.)

Now we show that the smallest element increases:

s_{a(n+1),j} = Σ_k P^a(î, k) P^{an}(k, j) ≥ (1 − d) s_{an,j} + d g_{an,j},   (2.4)

with î now the minimizing row. So

g_{a(n+1),j} − s_{a(n+1),j} ≤ (1 − 2d)(g_{an,j} − s_{an,j}).   (2.5)
Lemma 2.2
gn+1,j ≤ gn,j and sn+1,j ≥ sn,j .
Proof of lemma. P^{n+1} = P · P^n: we multiply rows of P, which sum to 1 and have nonnegative entries, against the columns of P^n. So g_{n+1,j} ≤ 1 · g_{n,j} + 0 = g_{n,j}, and also s_{n+1,j} ≥ s_{n,j} for the same reason. Hence g_{n+1,j} − s_{n+1,j} ≤ g_{n,j} − s_{n,j}. (This alone does not give convergence, since the two sides may be equal.)
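Numerically (a small illustration of my own; the chain P is an arbitrary irreducible, aperiodic choice), the gap between the largest and smallest entries of each column of P^n does shrink geometrically, as the two lemmas predict:

```python
import numpy as np

# An arbitrary irreducible, aperiodic 3-state chain (illustrative choice).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])

for n in (1, 5, 10, 20):
    Pn = np.linalg.matrix_power(P, n)
    gap = Pn.max(axis=0) - Pn.min(axis=0)   # g_{n,j} - s_{n,j} for each column j
    print(n, gap.max())                     # decreases geometrically toward 0
```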
Question 2.3. Couldn’t we just have used eigenvalues, with one eigenvalue λ = 1? Well yes, but this approach is more general, for arbitrary dimensions. In general, there may also be more than one invariant distribution.
• If a time-homogeneous Markov chain (THMC) is irreducible and aperiodic, then there exists a such that P^a has all entries strictly positive. (The relevant number-theoretic fact: if J ⊂ Z_+ is closed under addition and gcd(J) = 1, then J contains all sufficiently large integers.)
Corollary 2.4
Suppose we have a function f : S → R. Then

E[ (1/n) Σ_{i=1}^{n} f(X_i) ] → π(1)f(1) + π(2)f(2) + · · · .   (2.6)

In fact, by a law-of-large-numbers type theorem,

(1/n) Σ_{i=1}^{n} f(X_i) → \bar{f} = Σ_x π(x) f(x).   (2.7)
We can just use these facts on psets. However, we must show that the
Markov chain is irreducible and aperiodic, and find the invariant distribution
(and of course, show that it is a Markov chain).
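Here is a small simulation sketch of Corollary 2.4 (my own illustration, not from the lecture; the chain P and the function f are arbitrary choices): the time average of f along one long trajectory should approach Σ_x π(x)f(x).

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary irreducible, aperiodic 3-state chain and a function f : S -> R.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
f = np.array([1.0, 2.0, 5.0])

# Invariant distribution: left eigenvector of P for the eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()

# Simulate one long trajectory and take the time average of f.
n, x, total = 100_000, 0, 0.0
for _ in range(n):
    x = rng.choice(3, p=P[x])
    total += f[x]

print("time average:", total / n)
print("pi . f      :", pi @ f)   # the two numbers should be close
```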
3 March 7, 2017
3.1 Markov chains (continued)
Today we continue Markov chains.
(The example under discussion is a two-urn model: n balls are split between two urns, and at each step a uniformly random ball is moved to the other urn; the state k is the number of balls in the first urn.) The transition probabilities are

Pr{k, k + 1} = (n − k)/n,   (3.1)
Pr{k, k − 1} = k/n.   (3.2)

We observe that the two urns play symmetric roles, so π(k) = π(n − k). By definition of a stationary distribution,

π(k) = π(k − 1) Pr{k − 1, k} + π(k + 1) Pr{k + 1, k}   (3.3)

for k = 1, . . . , n − 1. We will show that π(k) = \frac{1}{2^n}\binom{n}{k} is a solution. Dropping the common factor 1/2^n,

π(k − 1) Pr{k − 1, k} + π(k + 1) Pr{k + 1, k} ∝ \frac{n!}{(k − 1)!(n − k + 1)!} · \frac{n − k + 1}{n} + \frac{n!}{(k + 1)!(n − k − 1)!} · \frac{k + 1}{n} = \frac{n!}{k!(n − k)!} ∝ π(k).   (3.4)
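A quick check of this computation (a sketch of my own; n = 10 is an arbitrary choice): build the urn chain's transition matrix and verify that π(k) = \binom{n}{k}/2^n satisfies πP = π.

```python
import numpy as np
from math import comb

n = 10                                    # number of balls (arbitrary)
P = np.zeros((n + 1, n + 1))
for k in range(n + 1):
    if k < n:
        P[k, k + 1] = (n - k) / n         # a ball from the other urn moves in
    if k > 0:
        P[k, k - 1] = k / n               # a ball from this urn moves out

pi = np.array([comb(n, k) / 2**n for k in range(n + 1)])
print(np.allclose(pi @ P, pi))            # True: pi is invariant
```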
3.2 Simple random walks in Z
[Figure: a sample path of a simple random walk, S_i plotted against i.]
How many paths are there from (0,0) to (n, x)? Well if we take p steps up
and q steps down,
n=p+q
x = p − q.
Then the number of paths is equal to

N_{(0,0)→(n,x)} = \binom{p+q}{p} = \binom{n}{(n+x)/2}   if (n + x)/2 ∈ Z and |x| ≤ n, and 0 otherwise.   (3.6)
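The path-count formula is easy to sanity-check by brute force (a small sketch of my own; n = 8 is an arbitrary choice): enumerate every ±1 step sequence and compare the counts.

```python
from itertools import product
from math import comb

def N(n, x):
    """Number of +-1 paths of length n from height 0 to height x, by (3.6)."""
    if (n + x) % 2 != 0 or abs(x) > n:
        return 0
    return comb(n, (n + x) // 2)

n = 8
for x in range(-n, n + 1):
    brute = sum(1 for steps in product((1, -1), repeat=n) if sum(steps) == x)
    assert brute == N(n, x)
print("formula (3.6) matches enumeration for n =", n)
```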
Using Stirling's formula, Pr{S_{2n} = 0} approaches 1/\sqrt{πn}. Then we find that

E[# of returns to 0] = Σ_{n=1}^{∞} Pr{S_{2n} = 0} = ∞.   (3.8)
(The warm-up problem behind the reflection trick: a firetruck must drive to the river to pick up water and then to a burning house; where should it meet the river so the total distance is smallest?)

Solution. Reflect the firetruck across the river, connect the house and the reflection, and determine where that segment intersects the river.
Example 3.4
We have two points (a, α) and (b, β), such that α, β > 0, b > a, and a, b, α, β ∈ Z. How many paths between a and b touch or cross the x axis?

Solution. There is a bijection between such paths from a to b and all paths from a′ to b, where a′ = (a, −α) is a reflected across the x axis: reflect the initial piece of the path across the x axis, up to the first point where it touches the x axis. There are therefore

N_{b−a, β+α} = \binom{b−a}{\frac{(b−a)+(β+α)}{2}}   (3.9)

such paths.
Corollary 3.5
The number of paths from a → b which do not touch the x axis is equal to the number of paths from a → b minus the number of paths from a′ → b.
Corollary 3.6
The number of paths from (1, 1) → (n, x) which are strictly above the x axis is N_{n−1,x−1} − N_{n−1,x+1}.
Example 3.7
How many paths from (1, 1) → (5, 1) do not touch the x axis?
Solution. Trivially we could draw them and see that there are 2. Using the formula,

N_{4,0} − N_{4,2} = \binom{4}{2} − \binom{4}{3} = 6 − 4 = 2.   (3.10)
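The same kind of brute-force check works for Corollary 3.6 (an illustrative sketch of my own, not from the notes): count the paths from (1, 1) that stay strictly positive and compare with N_{n−1,x−1} − N_{n−1,x+1}.

```python
from itertools import product
from math import comb

def N(n, x):
    if (n + x) % 2 or abs(x) > n:
        return 0
    return comb(n, (n + x) // 2)

def positive_paths(n, x):
    """Brute-force count of paths (1,1) -> (n,x) staying strictly above the x axis."""
    count = 0
    for steps in product((1, -1), repeat=n - 1):
        heights = [1]
        for s in steps:
            heights.append(heights[-1] + s)
        if heights[-1] == x and all(h > 0 for h in heights):
            count += 1
    return count

assert positive_paths(5, 1) == N(4, 0) - N(4, 2) == 2    # Example 3.7
print("Corollary 3.6 check passed")
```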
4 March 9, 2017
4.1 More simple random walks in Z
Recall from last lecture that the number of paths from (0, 0) → (n, x) is N_{n,x}, where

N_{n,x} = \binom{p+q}{p} = \binom{n}{(n+x)/2}   if (n + x)/2 ∈ Z and |x| ≤ n, and 0 otherwise.   (4.1)

The number of paths from (a, α) → (b, β) which do not touch the x axis is

N_{b−a,β−α} − N_{b−a,β+α}.   (4.2)
At the end of the course, we will model Brownian motion with such a simple
random walk.
We prove the ballot-type count: the number of paths from (0, 0) → (n, x) with S_i > 0 for all i ≥ 1 is (x/n) N_{n,x}.

Proof. The left side is equal to the number of strictly positive paths from (1, 1) → (n, x), since from (0, 0) we can only go up to (1, 1); by Corollary 3.6 this is N_{n−1,x−1} − N_{n−1,x+1}. For the right side, let us take p steps up and q steps down, such that x = p − q, n = p + q. Then

\binom{p+q−1}{p−1} − \binom{p+q−1}{p} = \frac{(p+q−1)!}{(p−1)!(q−1)!}\left(\frac{1}{q} − \frac{1}{p}\right) = \frac{p−q}{p+q}\binom{p+q}{p} = \frac{x}{n} N_{n,x}.   (4.3)
[Figure: a sketch of a sample path S_i against i.]
Proposition 4.3
The number of paths of length 2n such that s_i > 0 for all i ≠ 0 is

N_{2n−1,1} = (1/2) N_{2n,0}.
Corollary 4.4
The number of paths that do not return to 0 in the first 2n steps (s_0 = 0, s_1 ≠ 0, . . . , s_{2n−1} ≠ 0, s_{2n} = 2r for some r ≠ 0) is

2 · (1/2) N_{2n,0} = N_{2n,0},

since such paths are either all positive or all negative.
Proposition 4.5
The number of paths that start from 0 and return to 0 for the first time at time 2n is

4N_{2n−2,0} − N_{2n,0} = \frac{1}{2n − 1} N_{2n,0}.
We demonstrate with n = 3, 2n = 6.
[Figure: the two positive paths of length 6 that first return to 0 at time 6.]
And we also have the negative versions of these.
Corollary 4.6
From proposition 4.5 we can derive that f_{2n} = µ_{2n−2} − µ_{2n}, where f_{2n} is the probability that the first return to 0 happens at time 2n and µ_{2n} = Pr{S_{2n} = 0}.
Corollary 4.7
The probability that a simple random walk returns to 0 is

Σ_{i=0}^{∞} f_i = 1,

i.e. the walk returns to 0 with probability 1.
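Corollary 4.6 can be checked directly by enumeration (a small sketch of my own; the range n = 1, . . . , 5 is arbitrary): count first-return paths by brute force and compare f_{2n} with µ_{2n−2} − µ_{2n}.

```python
from itertools import product
from math import comb

def mu(two_n):
    """Pr{S_{2n} = 0} for the simple symmetric walk."""
    return comb(two_n, two_n // 2) / 2**two_n

def first_return_count(two_n):
    """Brute-force count of length-2n paths whose first return to 0 is at time 2n."""
    count = 0
    for steps in product((1, -1), repeat=two_n):
        s, ok = 0, True
        for i, step in enumerate(steps, 1):
            s += step
            if s == 0 and i < two_n:
                ok = False
                break
        if ok and s == 0:
            count += 1
    return count

for n in range(1, 6):
    f_2n = first_return_count(2 * n) / 2**(2 * n)
    assert abs(f_2n - (mu(2 * n - 2) - mu(2 * n))) < 1e-12
print("f_{2n} = mu_{2n-2} - mu_{2n} verified for n = 1..5")
```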
5 March 21, 2017
(Here τ_0 denotes the time of the last visit to 0 among the first 2n steps of the walk.)

Pr{τ_0 = 2k} = \frac{N_{2k,0} N_{2n−2k,0}}{2^{2n}} = µ_{2k} µ_{2n−2k}.

Bashing with Stirling's,

Pr{τ_0 = 2k} ∼ \frac{1}{\sqrt{πk}\,\sqrt{π(n − k)}}.   (5.1)
Proposition 5.1
N_{2k,0} · N_{2n−2k,0} is the number of length-2n paths whose last return to 0 is at time 2k.
Pr{τ_0 ≤ 2αn} = Σ_{k=0}^{αn} Pr{τ_0 = 2k}
  ≈ Σ_{k=1}^{αn} \frac{1}{π\sqrt{k(n − k)}}
  = Σ_{k=1}^{αn} \frac{1}{nπ} \frac{1}{\sqrt{\frac{k}{n}\left(1 − \frac{k}{n}\right)}},   (5.2)

which is a Riemann sum for \frac{1}{π}\int_0^{α} \frac{dt}{\sqrt{t(1 − t)}}.
Theorem 5.2
As n tends to ∞, Pr{τ_0 ≤ 2αn} → \frac{2}{π} \arcsin \sqrt{α}.
For example, take α = 0.1. Then, in the limit, approximately 20.4% of paths have their last return to 0 within the first 10% of the walk. Last returns are most likely to be close to the beginning or the end, and least likely to occur in the middle.
I'm tired so I'll fill in the plots over spring break. [To fill in: the U-shaped plot of the last-return-time density.]
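In place of the plot, here is a short simulation sketch of Theorem 5.2 (my own illustration; n, the number of trials, and α = 0.1 are arbitrary choices): estimate Pr{τ_0 ≤ 2αn} empirically and compare it with (2/π) arcsin √α.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, alpha = 500, 20_000, 0.1

count = 0
for _ in range(trials):
    steps = rng.choice((-1, 1), size=2 * n)
    s = np.concatenate(([0], np.cumsum(steps)))
    last_zero = np.flatnonzero(s == 0).max()    # time of the last visit to 0
    if last_zero <= 2 * alpha * n:
        count += 1

print("simulated  :", count / trials)
print("arcsine law:", 2 / np.pi * np.arcsin(np.sqrt(alpha)))   # about 0.205
```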
Proposition 5.3
The number of paths from (0, 0) → (n, x) which touch the line y = r is N_{n,2r−x}.
Proposition 5.4
The number of paths from (0, 0) → (n, x) such that max_k S_k = r is N_{n,2r−x} − N_{n,2r+2−x}.
This is equivalent to the number of paths that touch y = r minus the number of
paths that touch y = r + 1. All paths going beyond must also touch y = r + 1.
Proposition 5.5
The number of paths of length n such that max_k S_k = r is

Σ_x #{paths (0, 0) → (n, x) with maximum r} = max(N_{n,r}, N_{n,r+1}),

where only one of the two terms is nonzero, depending on parity.
Example 5.6
Let n = 4, r = 1. How many paths are there? (By the proposition, N_{4,1} = 0 by parity, so the count is N_{4,2} = \binom{4}{3} = 4.)
Theorem 5.7
As n tends to ∞,

Pr{S_n ≤ α\sqrt{n}} → \frac{1}{\sqrt{2π}} \int_{−∞}^{α} e^{−t²/2}\, dt.
6 March 23, 2017
Theorem 6.1
By the central limit theorem and the reflection principle, for α ≥ 0, as n → ∞,

Pr{max_{k≤n} S_k ≥ α\sqrt{n}} → 2 · \frac{1}{\sqrt{2π}} \int_{α}^{+∞} e^{−t²/2}\, dt.
Lemma 6.2
Let F2k be the number of paths which return to 0 for the first time at 2k.
Then
Each step of a path with length 2n is either on the positive or negative half plane. We study paths of length 2n, (S_1, S_2, . . . , S_{2n}). Let N_+ be the number of “positive steps,” and N_− the number of “negative” ones. Observe that both N_+ and N_− must be even, as we must return to 0 to change sign. Let B_{2k,2n} be the number of paths of length 2n with exactly 2k positive steps.
Example 6.3
Find B2,4 . We draw out the possibilities. There are 4.
[Figure: the four length-4 paths with exactly two positive steps.]
Proposition 6.4
We know that B_{2n,2n} = N_{2n,0} = # of non-negative paths of length 2n. Then (1/2) N_{2n,0} = # of positive paths, as we can just shift up-right and fill in the last step.
Proposition 6.5
B_{2k,2n} = N_{2k,0} · N_{2n−2k,0}.
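Proposition 6.5 can be verified by brute force (an illustrative sketch of my own, not from the lecture; the convention below for a “positive step” — a segment lying above the axis, i.e. S_{i−1} + S_i > 0 — is my assumption about the definition used in class):

```python
from itertools import product
from math import comb

def N(n, x):
    if (n + x) % 2 or abs(x) > n:
        return 0
    return comb(n, (n + x) // 2)

def B(two_k, two_n):
    """Brute-force count of length-2n paths with exactly 2k positive steps."""
    count = 0
    for steps in product((1, -1), repeat=two_n):
        s, positive = 0, 0
        for step in steps:
            prev, s = s, s + step
            if prev + s > 0:          # segment from S_{i-1} to S_i lies above the axis
                positive += 1
        if positive == two_k:
            count += 1
    return count

assert B(2, 4) == 4                   # Example 6.3
for n in range(1, 5):
    for k in range(n + 1):
        assert B(2 * k, 2 * n) == N(2 * k, 0) * N(2 * n - 2 * k, 0)
print("Proposition 6.5 verified for 2n up to 8")
```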
7 April 4, 2017
7.1 Infinite-state Markov chains
We are given an unbounded but countable state space S = {0, 1, 2, . . .} ⊂ Z, with a probability distribution {α(x)}, Σ_x α(x) = 1 (the initial distribution). A Markov chain X_0, X_1, . . . has transition probabilities p(x, y) = Pr{X_{n+1} = y | X_n = x}. So by total probability,

Pr{X_{n+1} = y} = Σ_{x∈S} Pr{X_n = x} p(x, y).
Example 7.1
We have an infinite Markov chain where
p(i, i + 1) = 1, i ≥ 0.
As a graph,
[Graph: 0 → 1 → 2 → 3 → 4 → · · · , each arrow taken with probability 1.]
Example 7.2
Consider a Markov chain on S = {0, 1, 2, . . .}, similar to a simple random walk: from each state i ≥ 1 it moves to i − 1 or i + 1 with probability 1/2 each. As a graph,
[Graph: 0, 1, 2, 3, 4, . . . with arrows between neighboring states.]
If there were an invariant distribution π, it would satisfy
π(n) = π(n − 1)/2 + π(n + 1)/2.
This is a bit harder, but we can say that π(0) = π(1), and thus by induction all the π(n) are equal. This isn't very useful since we have an infinite number of states: a constant cannot sum to 1 over infinitely many states, so we can say that there is no invariant distribution.
These examples show two types of new behavior: there could be no conver-
gence like 7.1, or just no distribution like 7.2.
Definition 7.3. Let X_0, X_1, . . . be a sequence of random variables. A random variable τ is a stopping time if the event {τ = n} depends on X_0, X_1, . . . , X_n only.
Definition 7.4. Given some state x ∈ S, the hitting time of x is the first time the Markov chain visits x, τ_x = min{n : X_n = x}.
for any x, y ∈ S.
Proposition 7.7
If an irreducible Markov chain is recurrent, then Pr{X_n = y for infinitely many n} = 1 for every y.
8 April 6, 2017
8.1 Infinite-state Markov chains, ctd.
We continue discussion of infinite-state Markov chains.
Proposition 8.1
A state x ∈ S is transient if and only if

Σ_n p_n(x, x) < ∞.
Proposition 8.3
If a recurrent infinite-state time-homogeneous Markov chain (ITHMC) has an invariant distribution, then it is positive recurrent.
That implies that after n steps, we are still at the same distribution:

π(y) = Σ_{x∈S} π(x) p_n(x, y).
type up lol
[Graph: a walk on 0, 1, 2, 3, 4, . . . that steps right with probability p and left with probability q.]
p(x, x + 1) = p,
p(x, x − 1) = q.
(The question: is the simple random walk on Z^d recurrent?)

Solution. We have about 2n/d steps of simple random walk in each dimension. The probability of being back at 0 in each dimension is about 1/\sqrt{πn/d}. If the simple random walks in the coordinates were independent, then the probability of returning would be (1/\sqrt{πn/d})^d. For large n, this can be made rigorous, though the coordinates are not actually independent. Hence

Σ_{n=1}^{∞} p_{2n}(0, 0) ∼ c Σ_{n=1}^{∞} \frac{1}{n^{d/2}},

which diverges for d ≤ 2 (recurrent) and converges for d ≥ 3 (transient).
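A crude Monte Carlo illustration of this dichotomy (a sketch of my own; the step horizon and trial counts are arbitrary): estimate the chance that a d-dimensional simple random walk revisits the origin within a fixed number of steps.

```python
import numpy as np

rng = np.random.default_rng(2)

def return_fraction(d, n_steps=10_000, trials=2_000):
    """Fraction of d-dimensional walks that revisit the origin within n_steps steps."""
    returned = 0
    for _ in range(trials):
        steps = np.zeros((n_steps, d), dtype=int)
        coords = rng.integers(0, d, size=n_steps)           # which coordinate moves
        steps[np.arange(n_steps), coords] = rng.choice((-1, 1), size=n_steps)
        path = steps.cumsum(axis=0)
        if np.any(np.all(path == 0, axis=1)):
            returned += 1
    return returned / trials

for d in (1, 2, 3):
    print(d, return_fraction(d))
# d = 1, 2: the fraction creeps toward 1 as n_steps grows (recurrent);
# d = 3: it stays well below 1 (transient; the true limit is about 0.34).
```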
9 April 11, 2017
Proposition 9.1
If X_1 and X_2 are independent, then f_{X_1}(s) f_{X_2}(s) = f_{X_1+X_2}(s), where f_X(s) = E[s^X] denotes the generating function of X.
We let n2 = n − n1 .
Proposition 9.2
The expected value can be found as E[X] = f_X′(s)|_{s=1}.
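Both generating-function facts are easy to check symbolically (a sketch of my own, assuming sympy is available; the Binomial(n, p) offspring law below is just an arbitrary example with known generating function f_X(s) = (1 − p + ps)^n):

```python
import sympy as sp

s, p = sp.symbols('s p')
n = 4

f = (1 - p + p * s)**n                 # PGF of a Binomial(n, p) random variable

# Proposition 9.2: E[X] = f'(s) evaluated at s = 1.
print(sp.simplify(sp.diff(f, s).subs(s, 1)))          # 4*p, i.e. n*p

# Proposition 9.1: the PGF of the sum of two independent Binomial(n, p)
# variables is the product of their PGFs, i.e. the PGF of Binomial(2n, p).
print(sp.simplify(f * f - (1 - p + p * s)**(2 * n)))  # 0
```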
[Figure: generation X_n of a branching process and its offspring forming generation X_{n+1}.]

P(k, j) = Pr{y_1 + · · · + y_k = j},

where y_1, . . . , y_k are i.i.d. copies of the offspring variable ξ.
Theorem 9.5
f_{X_{n+1}}(s) = f_{X_n}(f_ξ(s))
Example 9.6
Let p0 = 1/2, p2 = 1/2, X0 = 1.
[Figure: a sample branching tree for this offspring distribution.]
Theorem 9.7
The extinction probability A is the smallest positive root of fξ (s) = s.
Proof. Note that the limit A = lim_{n→∞} f_ξ^{(n)}(0) exists, since f_ξ^{(n)}(0) is increasing in n and bounded above by 1 (here f_ξ^{(n)} denotes the n-fold composition). In addition,

f_ξ(A) = f_ξ( lim_{n→∞} f_ξ^{(n)}(0) ) = lim_{n→∞} f_ξ^{(n+1)}(0) = A,

so A is a root of f_ξ(s) = s. Finally, if Ã is the smallest positive root of f_ξ(s) = s, then Ã > 0, and since f_ξ is increasing, by induction f_ξ^{(n)}(0) ≤ Ã for every n; letting n → ∞ gives A ≤ Ã, so A is the smallest positive root.
Example 9.8
Let p_0 = 1/2, p_2 = 1/2. Then f_ξ(s) = 1/2 + (1/2)s², so s = 1 is the root of f_ξ(s) = s. This has extinction probability 1.
Example 9.9
Let p_0 = 1/4, p_1 = 1/4, p_2 = 1/2. Then f_ξ(s) = 1/4 + (1/4)s + (1/2)s², so s = 1, 1/2 are the roots of f_ξ(s) = s. This has extinction probability 1/2, the smaller root.
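Iterating A_{m+1} = f_ξ(A_m) from A_0 = 0, as in the proof of Theorem 9.7, recovers this numerically (a tiny sketch of my own for Example 9.9):

```python
def f(s):
    """PGF of the offspring law in Example 9.9: p0 = 1/4, p1 = 1/4, p2 = 1/2."""
    return 0.25 + 0.25 * s + 0.5 * s * s

a = 0.0
for _ in range(100):          # A_{m+1} = f(A_m), starting from A_0 = 0
    a = f(a)
print(a)                      # approaches 0.5, the smallest positive root of f(s) = s
```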
10 April 13, 2017
Example 10.1
If Pr{ξ = 0} + Pr{ξ = 1} = 1 (with Pr{ξ = 0} > 0), then at each step there are either 1 or 0 offspring. This process dies out with probability 1.

In general, the extinction probability satisfies

A = f_ξ(A) = Pr{ξ = 0} + Pr{ξ = 1}A + · · · + Pr{ξ = k}A^k + · · · .

This is because each particle dies out independently, so if there are k particles, they all need to die out, which happens with probability A^k.
Theorem 10.2
If E[ξ] ≤ 1, then A = 1, and if E[ξ] > 1, then f_ξ(s) = s has a unique positive root A such that A < 1.
f_ξ″(s) = \frac{d²}{ds²}\left( \sum_{k=0}^{∞} Pr\{ξ = k\} s^k \right) = \sum_{k=2}^{∞} k(k − 1) Pr\{ξ = k\} s^{k−2} ≥ 0,

so f_ξ is convex.
We can also prove the theorem analytically. First a useful claim about convexity.
Claim 10.3. If f (x) is convex, then the collection of points {(x, y) : y ≥ f (x)}
is convex. That is, the intersection of this set with any line is a segment.
Case 1: E[ξ] = f_ξ′(1) ≤ 1 implies that f_ξ′(s) < f_ξ′(1) ≤ 1 for s < 1, so

1 − f_ξ(s) = \int_s^1 f_ξ′(t)\, dt < 1 − s,

i.e. f_ξ(s) > s for every s < 1, and the only root in [0, 1] is A = 1.

Case 2: E[ξ] = f_ξ′(1) > 1. Then for small ε > 0,

f_ξ(1 − ε) ≈ 1 − f_ξ′(1) · ε ≤ 1 − ε,

so f_ξ lies below the diagonal just to the left of 1; since f_ξ(0) ≥ 0, by continuity and convexity there is a unique root A < 1.
Note, this proof does not cover the edge cases Pr{ξ = 1} = 1 and A = 0 (when Pr{ξ = 0} = 0), but we are handwaving.
Moving on, recall conditional probability:

Pr{Y = y | X = x} = \frac{Pr\{X = x, Y = y\}}{Pr\{X = x\}},

if Pr{X = x} ≠ 0. In the continuous case, this is problematic since Pr{X = x} is always 0. The usual expectation is a real number,

E[f(X, Y)] = Σ_{x,y} f(x, y) Pr{X = x, Y = y}.
We prove the tower property, E[E[Y | X]] = E[Y].

Proof.
E[E[Y | X]] = Σ_x Pr{X = x} Σ_y y Pr{Y = y | X = x}
  = Σ_x Σ_y y Pr{Y = y, X = x}
  = Σ_y y Pr{Y = y} = E[Y].
Similarly, E[E[Y | X_1, X_2] | X_1] = E[Y | X_1].

Proof.
E[E[Y | X_1, X_2] | X_1 = x_1] = Σ_{x_2} Pr{X_2 = x_2 | X_1 = x_1} Σ_y y Pr{Y = y | X_1 = x_1, X_2 = x_2}
  = Σ_{x_2, y} Pr{X_2 = x_2 | X_1 = x_1} · y · \frac{Pr\{Y = y, X_2 = x_2 | X_1 = x_1\}}{Pr\{X_2 = x_2 | X_1 = x_1\}}
  = Σ_y y Pr{Y = y | X_1 = x_1} = E[Y | X_1 = x_1].
11 April 20, 2017
Proposition 11.2
E[M_{k+2} | M_0 . . . M_k] = M_k
Proof.
Proposition 11.3
E[M_{k+k′} | M_0 . . . M_k] = M_k
Proposition 11.4
E [Mk ] = E [M0 ]
Example 11.5
Let X_1, X_2, . . . , X_n be independent random variables such that E[X_i] = 0 for every i. Then M_n = Σ_i X_i is a martingale.
Example 11.6
We play a game in which we gain 3 or −1 with equal probability each round, but we must pay 1 to play a round. Let X_1, X_2, . . . , X_n be independent random variables such that E[X_i] = µ for every i. Then M_n = X_1 + · · · + X_n − nµ is a martingale.
Example 11.7
Let X_1, X_2, . . . , X_n be i.i.d. random variables, where E[X_i] = 0 and E[X_i²] = V, where V is a constant. Then M_n = (X_1 + X_2 + · · · + X_n)² − nV is a martingale.
Proof. Let S_n = Σ_{i=1}^{n} X_i. Then

E[M_{n+1} | X_1, . . . , X_n] = E[(S_n + X_{n+1})² − (n + 1)V | X_1, . . . , X_n] = S_n² + 2S_n E[X_{n+1}] + E[X_{n+1}²] − (n + 1)V = S_n² − nV = M_n.
Example 11.8
Let X_1, X_2, . . . , X_n be i.i.d. random variables, and S_n = Σ_{i=1}^{n} X_i. Then M_1 = S_n/n, M_2 = S_{n−1}/(n − 1), and in general M_k = S_{n−k+1}/(n − k + 1), down to S_1.

We can find that E[S_{k−1} | S_k] = \frac{k − 1}{k} S_k.
Example 11.9
Consider a branching process, X0 = 1, and ξ is the offspring of one particle,
where E [ξ] = µ. Xn is the number of particles at time n.
Then the martingale we are looking for is M_n = X_n/µ^n:

E[M_{n+1} | M_n] = E[ X_{n+1}/µ^{n+1} | X_n ] = \frac{µX_n}{µ^{n+1}} = \frac{X_n}{µ^n} = M_n.
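A simulation sketch of this martingale (my own illustration; the offspring law p_0 = 1/4, p_1 = 1/4, p_2 = 1/2 is borrowed from Example 9.9, and the trial counts are arbitrary): the empirical mean of M_n = X_n/µ^n should stay near M_0 = 1.

```python
import numpy as np

rng = np.random.default_rng(3)

offspring = np.array([0, 1, 2])
probs = np.array([0.25, 0.25, 0.5])
mu = offspring @ probs                     # E[xi] = 1.25

def run(n_gens):
    """Simulate n_gens generations of the branching process, starting from X_0 = 1."""
    x = 1
    for _ in range(n_gens):
        x = rng.choice(offspring, size=x, p=probs).sum() if x else 0
    return x

n_gens, trials = 10, 20_000
samples = np.array([run(n_gens) for _ in range(trials)])
print(samples.mean() / mu**n_gens)         # estimate of E[M_n]; should be close to 1
```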
13 May 1, 2018
After a year I am returning to finish my notes!