STA347
Final Preparation
Yuchen Wang
Contents
1 Experiments, Events and Sample Spaces
4 Inclusion-Exclusion Formula
5 Conditional Probability
6 Independence
7 Bayes Theorem
8 Random Variables
  8.1 Examples of Random Variables
  8.2 Cumulative Distribution Function
  8.3 Multivariate Distributions
    8.3.1 Bivariate Distributions
    8.3.2 Marginal Distributions
    8.3.3 Conditional Distributions
    8.3.4 Multivariate Distributions
  8.4 Functions of Random Variables
  8.5 Expectation
  8.6 Moments
9 Inequalities
10 Conditional Expectation
12 Stochastic process
  12.1 Random Walk
  12.2 Poisson Process
  12.3 Reflection principle (Wiener process)
13 Mode of Convergence
  13.1 L1 Convergence
  13.2 Almost Sure Convergence
  13.3 Convergence in distribution
1 Experiments, Events and Sample Spaces

• Experiment: Any process, real or hypothetical, in which the possible outcomes can be identified ahead of time.
Definition 1.2 (countably infinite). A set is countably infinite if its elements can be put in one-to-one
correspondence with the set of natural numbers.
Definition 1.3 (At most countable sets). A set that is either finite or countably infinite is called an at most
countable set.
Theorem 1.1. Suppose E, E_1, E_2, . . . are events. The following are also events:
1. E^c
2. E_1 ∪ E_2 ∪ · · · ∪ E_n
3. ∪_{i=1}^∞ E_i
Definition 2.1 (σ-field). A collection F of subsets of a sample space S is called a σ-field if
1. S ∈ F
2. E ∈ F implies E^c ∈ F
3. E_1, E_2, . . . ∈ F implies ∪_{i=1}^∞ E_i ∈ F
Remark 2.1. A σ-field refers to the collection of subsets of a sample space that we should use in order to
establish a mathematically formal definition of probability. The sets in the σ-field constitute the events from
our sample space.
Axiom 2.1 (Axioms of Probability). Let S be a sample space, and let F be a σ-field of S. A function P on F satisfies the axioms of probability if
• Axiom 1: P(E) ≥ 0 for every event E ∈ F
• Axiom 2: P(S) = 1
• Axiom 3: For any sequence of disjoint events E_1, E_2, . . ., P(∪_{i=1}^∞ E_i) = Σ_{i=1}^∞ P(E_i)
Definition 2.2 (probability). Any function P on a sample space S satisfying Axioms 1-3 is called a probability.
Any probability P satisfies the following properties:
1. P(∅) = 0
3. P (Ac ) = 1 − P (A)
5. 0 ≤ P (A) ≤ 1
6. P (A − B) = P (A) − P (A ∩ B)
7. P (A ∪ B) = P (A) + P (B) − P (A ∩ B)
3 Classical Equal Probability and Combinatorics

Theorem 3.2.
    C_{n,k} = \binom{n}{k} = n! / (k! (n − k)!) = P_{n,k} / k!
Theorem 3.3 (Binomial coefficients).
    (x + y)^n = Σ_{k=0}^{n} \binom{n}{k} x^k y^{n−k}
Theorem 3.4 (Newton Expansion). For |z| < 1, the term (1 + z)^r can be expanded as
    (1 + z)^r = Σ_{k=0}^{∞} \binom{r}{k} z^k
Theorem 3.5.
    \binom{r}{k} = r(r − 1) · · · (r − k + 1) / k! = Γ(r + 1) / (Γ(r − k + 1) Γ(k + 1))
with Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx.
Theorem 3.6. For any numbers x_1, . . . , x_k and non-negative integer n,
    (x_1 + · · · + x_k)^n = Σ_{n_1 + · · · + n_k = n} \binom{n}{n_1, . . . , n_k} x_1^{n_1} · · · x_k^{n_k}
It is easy to see that
    \binom{n}{n_1, . . . , n_k} = \binom{n}{n_1} \binom{n_2 + · · · + n_k}{n_2} \binom{n_3 + · · · + n_k}{n_3} · · · \binom{n_k}{n_k} = n! / (n_1! · · · n_k!)    (1)
Theorem 3.7 (Stirling’s formula).
    lim_{n→∞} { log(n!) − [ (1/2) log(2π) + (n + 1/2) log(n) − n ] } = 0
4 Inclusion-Exclusion Formula
For any n events A_1, . . . , A_n,
    P(∪_{i=1}^n A_i) = Σ_{i=1}^n P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) + · · · + (−1)^{n−1} P(A_1 ∩ · · · ∩ A_n)    (2)
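As a quick numerical check, here is a minimal Python sketch (standard library only; the three events are arbitrary illustrations) verifying that both sides of (2) agree on a small equally likely sample space:

    from itertools import combinations
    from fractions import Fraction

    # Equally likely sample space {1, ..., 12} and three arbitrary illustrative events.
    S = set(range(1, 13))
    A = [{n for n in S if n % 2 == 0},    # multiples of 2
         {n for n in S if n % 3 == 0},    # multiples of 3
         {n for n in S if n <= 4}]        # {1, 2, 3, 4}

    P = lambda E: Fraction(len(E), len(S))    # classical equal-probability measure

    # Right-hand side of the inclusion-exclusion formula (2).
    rhs = sum((-1) ** (r - 1) *
              sum(P(set.intersection(*c)) for c in combinations(A, r))
              for r in range(1, len(A) + 1))

    assert rhs == P(set.union(*A))    # both sides equal P(A1 ∪ A2 ∪ A3) = 3/4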
5 Conditional Probability
Definition 5.1 (conditional probability). When P (B) > 0, the conditional probability of an event A given
B is defined by
P (A|B) = P (A ∩ B)/P (B)
Theorem 5.1. If P (B) > 0, then P (A ∩ B) = P (A|B)P (B).
Theorem 5.2. Let A1 , . . . , An be events with P (A1 ∩ . . . ∩ An ) > 0. Then
P (A1 ∩ · · · ∩ An ) = P (A1 ) P (A2 |A1 ) P (A3 |A1 , A2 ) · · · P (An |A1 , . . . , An−1 ) (3)
6 Independence
Definition 6.1 (independence). Two events A and B are independent if and only if
    P(A ∩ B) = P(A)P(B).
A collection of events {A_i}_{i∈I} are said to be (mutually) independent if
    P(∩_{i∈J} A_i) = Π_{i∈J} P(A_i)
for any ∅ ≠ J ⊂ I.
A collection of events {Ai }i∈I are said to be pair-wise independent if
P (Ai ∩ Aj ) = P (Ai )P (Aj )
for i ≠ j ∈ I.
Theorem 6.1. Two events A and B are independent if and only if A and B c are independent.
Definition 6.2 (conditional independence). Two events A and B are conditionally independent given C
if
P (A ∩ B|C) = P (A|C)P (B|C)
7 Bayes Theorem
Definition 7.1. A collection of sets B1 , . . . , Bk is called a partition of A if and only if B1 , . . . , Bk are disjoint
and A = ∪ki=1 Bi .
Theorem 7.1 (Law of total probability). Let events B1 , . . . , Bk be a partition of S with P (Bj ) > 0 for all
j = 1, . . . , k. For any event A,
    P(A) = Σ_{j=1}^{k} P(B_j) P(A|B_j)
Combining this with the definition of conditional probability gives Bayes’ theorem: for events A and B with P(A) > 0 and 0 < P(B) < 1,
    P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]
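For concreteness, a minimal Python sketch applying the formula above to a diagnostic-test setting; the prevalence, sensitivity and false-positive rate are made-up illustrative numbers:

    from fractions import Fraction

    # Hypothetical numbers, chosen only to illustrate the formula above.
    P_B    = Fraction(1, 100)    # prior P(B): prevalence of the condition
    P_A_B  = Fraction(95, 100)   # P(A|B): probability of a positive test given the condition
    P_A_Bc = Fraction(5, 100)    # P(A|B^c): false-positive rate

    # Bayes' theorem: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]
    posterior = (P_A_B * P_B) / (P_A_B * P_B + P_A_Bc * (1 - P_B))
    print(posterior)    # 19/118, roughly 0.161: a positive test still leaves P(B|A) fairly small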
8 Random Variables
Definition 8.1. A real-valued function X on the sample space S is called a random variable if the probability
of X is well-defined, that is, {s ∈ S : X(s) ≤ r} is an event for each r ∈ R.
Definition 8.2 (Borel sets in R). The collection of all Borel sets B in R is the smallest collection that contains all intervals and is closed under complements and countable unions.
Definition 8.3 (Probability of a random variable). For any Borel set B in R, an event X ∈ B is defined as {s ∈ S : X(s) ∈ B} and often denoted by {X ∈ B} or (X ∈ B). The corresponding probability is
    P(X ∈ B) = P({s ∈ S : X(s) ∈ B})
Lemma 8.1. If |X(S)| < ∞ and (X = r) is an event for any r ∈ X(S), then X is a random variable.
Definition 8.4 (distribution). The distribution of X is the collection of all probabilities of all events induced
by X, that is, (B, P (X ∈ B)). Two random variables X and Y are said to be identically distributed if they
have the same distribution.
Remark 8.1. To show that X and Y have the same distribution, we need to check that P(X ∈ B) = P(Y ∈ B) for every event B on R. Since all Borel sets on R are induced by intervals, it is enough to prove
    P(a < X ≤ b) = P(a < Y ≤ b)
for any a < b ∈ R. In fact, P(X ≤ a) = P(Y ≤ a) for every a ∈ R already guarantees that X and Y are identically distributed.
Theorem 8.3. Let X(S) = {x_1, x_2, . . .} be the set of possible values of a discrete random variable X. Then for any subset A of R,
    P(X ∈ A) = Σ_{x∈A} P(X = x) = Σ_{x∈A} pmf_X(x)
Definition 8.7 (absolute continuity and probability density function). A random variable X is said to be absolutely continuous if the probability of each interval (a, b] is of the form
    P(a < X ≤ b) = ∫_a^b f(x) dx
where a < b ∈ R and f is a non-negative function on R. Such a function f is called a probability density function (pdf) of X.
Theorem 8.4. Let X be a continuous random variable. Then
    pdf_X(x) = d/dx P(X ≤ x)
Definition 8.10 (binomial). A random variable X is called a binomial random variable if it has the same distribution as Z, the number of successes in n independent trials with success probability p, and is denoted by X ∼ binomial(n, p).
The probability mass function of X ∼ binomial(n, p) is
    pmf_X(x) = \binom{n}{x} p^x (1 − p)^{n−x}  if x = 0, 1, . . . , n,  and 0 otherwise.
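A minimal Python sketch (standard library only, illustrative parameters n = 10 and p = 0.3) evaluating this pmf with math.comb and checking that it sums to 1:

    from math import comb

    def binomial_pmf(x, n, p):
        # pmf of binomial(n, p) as above: C(n, x) p^x (1 - p)^(n - x) for x = 0, ..., n
        return comb(n, x) * p**x * (1 - p)**(n - x) if 0 <= x <= n else 0.0

    n, p = 10, 0.3
    pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
    print(sum(pmf))               # approximately 1.0
    print(binomial_pmf(3, n, p))  # P(X = 3) is about 0.2668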
Definition 8.11 (continuous uniform). A random variable X defined on (a, b) for finite real numbers a < b satisfying P(c < X ≤ d) = (d − c)/(b − a) for any c, d such that a ≤ c ≤ d ≤ b is called a uniform random variable on (a, b), denoted by X ∼ uniform(a, b). The probability density function of X ∼ uniform(a, b) is
    pdf_X(x) = 1/(b − a)  if a < x < b,  and 0 otherwise.
Definition 8.12 (geometric). Consider a sequence of independent Bernoulli trials with success probability p. The number of trials until the first success is said to follow a geometric distribution with parameter p, denoted by geometric(p). The geometric random variable X ∼ geometric(p) has probability mass function
    pmf_X(n) = (1 − p)^{n−1} p
for n ∈ N.
Definition 8.13 (negative binomial). Consider a sequence of independent Bernoulli trials with success probability p. The number of trials until the k-th success is said to follow a negative binomial distribution with parameters k and p, denoted by neg-bin(k, p). The negative binomial random variable X ∼ neg-bin(k, p) has probability mass function
    pmf_X(n) = \binom{n − 1}{k − 1} (1 − p)^{n−k} p^k
for n ∈ N such that n ≥ k.
Definition 8.14 (hypergeometric). Consider a jar containing n balls, of which r are black and the remaining n − r are white. The random variable X is the number of black balls when m balls are drawn without replacement. The probability that k black balls are drawn is
    pmf_X(k) = \binom{r}{k} \binom{n − r}{m − k} / \binom{n}{m}  if k = 0, . . . , min(r, m),  and 0 otherwise.
Definition 8.15 (zeta/zipf). A positive integer valued random variable X follows a Zeta or Zipf distribution
if
    pmf_X(n) = n^{−s} / ζ(s)
for n = 1, 2, . . . and s > 1, where ζ(s) = Σ_{n=1}^{∞} n^{−s}.
Definition 8.16 (Poisson). A Poisson distribution with parameter µ > 0 has the probability mass function
    pmf_X(n) = e^{−µ} µ^n / n!
for non-negative integer n.
Definition 8.17 (Exponential). A continuous random variable W having the probability density
    pdf_W(w) = λ e^{−λw}  for w > 0,  and 0 otherwise,
is distributed from an exponential distribution with parameter λ > 0, which is denoted by W ∼ exponential(λ).
Theorem 8.8. If a real function F satisfies (a)-(c) in the above properties, then it is a distribution function of
a random variable.
Definition 8.18 (p-quantile). The p-quantile of a random variable X is x such that P (X ≤ x) ≥ p and
P (X ≥ x) ≥ 1 − p.
Definition 8.19. The median, lower quartile and upper quartile are the 0.5-, 0.25- and 0.75-quantiles, respectively. The interquartile range (IQR) is the difference between the upper and lower quartiles.
Definition 8.21. Two random variables X and Y are jointly continuously distributed if and only if there exists
a non-negative function f such that for any Borel set B in R2
    P((X, Y) ∈ B) = ∬_B f(x, y) dx dy
Theorem 8.9 (Properties of joint density functions). A joint density function satisfies
1. pdf_{X,Y}(x, y) ≥ 0
2. ∬ pdf_{X,Y}(x, y) dx dy = 1
Definition 8.22 (joint cumulative distribution function). The joint cumulative distribution function of X and Y is
    cdf_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)
Definition 8.23. When X and Y are discrete, then the joint probability mass function of X and Y is
defined by
pmfX,Y (x, y) = P (X = x, Y = y)
The joint probability mass function satisfies
1. pmf_{X,Y}(x, y) ≥ 0
2. Σ_{x,y} pmf_{X,Y}(x, y) = 1
The marginal probability density function of X is obtained by integrating out y:
    pdf_X(x) = ∫ pdf_{X,Y}(x, y) dy
Definition 8.24. Two random variables X and Y are independent if and only if
    P(X ∈ A, Y ∈ B) = P(X ∈ A)P(Y ∈ B)
for all Borel sets A and B in R.
Theorem 8.13. If two random variables X and Y are independent, then the following hold if the functions
exist.
1. cdfX,Y (x, y) = cdfX (x) × cdfY (y) for all x, y
Definition 8.27. Let X1 , . . . , Xn be random variables. Marginal cumulative distribution, probability mass,
probability density functions of X1 , . . . , Xi−1 , Xi+1 , . . . , Xn are
    cdf_{X_1,...,X_{i−1},X_{i+1},...,X_n}(x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n) = lim_{x_i→∞} cdf_{X_1,...,X_n}(x_1, . . . , x_n)    (4)
    pmf_{X_1,...,X_{i−1},X_{i+1},...,X_n}(x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n) = Σ_{x_i} pmf_{X_1,...,X_n}(x_1, . . . , x_n)    (5)
    pdf_{X_1,...,X_{i−1},X_{i+1},...,X_n}(x_1, . . . , x_{i−1}, x_{i+1}, . . . , x_n) = ∫ pdf_{X_1,...,X_n}(x_1, . . . , x_n) dx_i    (6)
Theorem 8.19. Let X be a continuous random variable and Y = g(X) be a transformed random variable, where g is an appropriate transformation (e.g., continuous and increasing). The cdf of Y is
    cdf_Y(y) = ∫_{x : g(x) ≤ y} pdf_X(x) dx
Theorem 8.21 (change of variable). Let X be a continuous random variable and g be a one-to-one and
differentiable function. Then the density of random variable Y = g(X) is
    pdf_Y(y) = pdf_X(g^{−1}(y)) |d/dy g^{−1}(y)|
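A minimal Monte Carlo sketch in Python (standard library only) checking Theorem 8.21 in one assumed example: X ∼ uniform(0, 1) and Y = e^X, for which the theorem gives pdf_Y(y) = 1/y on (1, e) and hence P(Y ≤ c) = log c:

    import math, random

    random.seed(0)
    n = 200_000
    y = [math.exp(random.random()) for _ in range(n)]   # Y = g(X) = e^X with X ~ uniform(0, 1)

    # Theorem 8.21 with g(x) = e^x: pdf_Y(y) = pdf_X(log y) * |d/dy log y| = 1/y on (1, e),
    # so P(Y <= c) = log(c) for 1 <= c <= e.
    for c in (1.5, 2.0, 2.5):
        empirical = sum(v <= c for v in y) / n
        print(c, round(empirical, 3), round(math.log(c), 3))   # the two columns should roughly agree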
Theorem 8.22. Consider discrete random variables X_1, . . . , X_n and let g_1, . . . , g_m be functions with Y_i = g_i(X_1, . . . , X_n). The joint probability mass function of Y = (Y_1, . . . , Y_m) is
    pmf_Y(y) = Σ_{x : g_i(x) = y_i, i=1,...,m} pmf_X(x)
Definition 8.29. Random variables X1 , . . . , Xn are said to be independent and identically distributed
(i.i.d) if all random variables have the same distribution and are independent.
Theorem 8.23. Let X and Y be jointly continuous random variables. The density of Z = X + Y is
    pdf_Z(z) = ∫ pdf_{X,Y}(x, z − x) dx
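A minimal simulation sketch (standard library Python; the independent uniform(0, 1) summands are an arbitrary illustration) comparing a histogram of Z = X + Y with the density obtained from this formula, which here is the triangular density z on (0, 1] and 2 − z on (1, 2):

    import random

    random.seed(1)
    n = 200_000
    z = [random.random() + random.random() for _ in range(n)]   # Z = X + Y, X, Y ~ uniform(0,1)

    def triangular_pdf(t):
        # pdf_Z(t) = ∫ pdf_X(x) pdf_Y(t - x) dx = t on (0, 1] and 2 - t on (1, 2)
        return t if 0 < t <= 1 else (2 - t if t < 2 else 0.0)

    width = 0.25
    for left in (0.0, 0.5, 1.0, 1.5):
        empirical = sum(left < v <= left + width for v in z) / (n * width)   # histogram height
        mid = left + width / 2
        print(mid, round(empirical, 3), triangular_pdf(mid))   # should roughly agree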
Theorem 8.24 (change of variable). Suppose X_1, . . . , X_n have a joint density function f(x_1, . . . , x_n) and Y_i = g_i(X_1, . . . , X_n) for a one-to-one and differentiable mapping y = g(x). The joint density of Y_1, . . . , Y_n is
    pdf_Y(y) = pdf_X(x) |det ∂(x_1, . . . , x_n)/∂(y_1, . . . , y_n)|
where x = (x_1, . . . , x_n) = g^{−1}(y).
8.5 Expectation
Definition 8.30 (expectation). The expectation (or expected value or mean value) of a discrete random variable is
    E[X] = Σ_x x · P(X = x) = Σ_x x · pmf_X(x)
Lemma 8.2. Let F be the cumulative distribution function of a random variable X. For an interval,
P (a < X ≤ b) = E[1(a < X ≤ b)]
In general, for each event A of X,
P (X ∈ A) = E[1(X ∈ A)]
Theorem 8.27. Let X be a random variable and g be a function on R. If the expectation of Y = g(X) is defined, then
    E[Y] = ∫ g(x) d cdf_X(x) = ∫_{−∞}^{∞} g(x) · pdf_X(x) dx
if X is continuous, or
    E[Y] = ∫ g(x) d cdf_X(x) = Σ_x g(x) · pmf_X(x)
if X is discrete.
8.6 Moments
Definition 8.32. For positive integer k, the k-th moment of X is E[X k ] and the k-th central moment is
E[(X − E[X])k ].
Theorem 8.30. If E[|X|t ] < ∞ for some t > 0, then E[|X|s ] < ∞ for any 0 ≤ s ≤ t.
Definition 8.33 (variance). The variance of a random variable X is
VAR X = E[(X − E[X])2 ]
The covariance and correlation between two random variables X and Y are
Cov(X, Y ) = E[(X − E[X])(Y − E[Y ])]
and
    Cor(X, Y) = Cov(X, Y) / √(VAR X · VAR Y)
The variance satisfies the following properties:
1. VAR X ≥ 0
3. VAR(aX + b) = a^2 VAR X
Definition 8.34 (skewness and kurtosis). The standardized third and fourth moments are said to be skewness
and kurtosis, that is,
skewness = E[(X − µ)3 ]/σ 3 , kurtosis = E[(X − µ)4 ]/σ 4
where µ = E[X] and σ 2 = VAR X.
9 Inequalities
Theorem 9.1 (Chebychev’s inequality). Let X be a random variable with mean µ and variance σ^2. Then, for any α > 0,
    P(|X − µ| ≥ ασ) ≤ 1/α^2
Equivalently, for α > 0,
    P(|X − µ| > α) ≤ VAR X / α^2
Theorem 9.2 (Markov’s inequality). If X ≥ 0 with µ = E[X] < ∞, then for any α > 0,
P (X ≥ α) ≤ µ/α
Remark 9.1. Chebychev’s inequality is a special case of Markov’s inequality, obtained by applying it to Y = (X − µ)^2.
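A small numeric sketch (Python standard library; the exponential(1) example is an arbitrary illustration) comparing both bounds with the exact tails, using P(X ≥ a) = e^{−a} and µ = σ = 1:

    import math

    mu, sigma = 1.0, 1.0   # X ~ exponential(1)

    for alpha in (2.0, 3.0, 4.0):
        markov_lhs = math.exp(-alpha)        # exact P(X >= alpha)
        cheb_lhs = math.exp(-(1 + alpha))    # exact P(|X - mu| >= alpha*sigma) = P(X >= 1 + alpha) for alpha >= 1
        print(alpha,
              round(markov_lhs, 4), "<=", round(mu / alpha, 4), "|",     # Markov's bound
              round(cheb_lhs, 4), "<=", round(1 / alpha**2, 4))          # Chebychev's bound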
Theorem 9.3 (Cauchy-Schwarz inequality). Let X and Y be two random variables having finite second
moment. Then
[E[XY ]]2 ≤ E[X 2 ]E[Y 2 ]
where the equality holds if and only if P (aX = bY ) = 1 for some a, b ∈ R.
Theorem 9.4. Let X and Y be two random variables with finite second moment. Then Y = aX + b for some
a, b if and only if |Corr(X, Y )| = 1.
Lemma 9.1 (Young’s inequality). For p, q > 1 with 1/p + 1/q = 1 and two nonnegative real numbers x, y ≥ 0,
xy ≤ xp /p + y q /q
Theorem 9.5 (Hölder’s inequality). For p, q > 1 with 1/p + 1/q = 1,
    E[|XY|] ≤ ||X||_p ||Y||_q
when the expectations exist and are finite, where ||X||_r = (E[|X|^r])^{1/r} for r > 0.
Remark 9.2. The Cauchy-Schwarz inequality is a special case of Hölder’s inequality (p = q = 2).
Theorem 9.6 (Jensen’s inequality). For a convex function ϕ,
ϕ(E[X]) ≤ E[ϕ(X)]
Theorem 9.7 (Minkowski’s inequality). For p ≥ 1,
||X + Y ||p ≤ ||X||p + ||Y ||p
10 Conditional Expectation
Definition 10.1 (conditional expectation). The conditional expectation of Y given X = x is defined by
    E[Y|X = x] = ∫ y d cdf_{Y|X}(y|x)
Remark 10.1. The conditional expectation E[Y |X = x] is always a function of x, say h(x). Then denote
h(X) = E[Y |X] as a random variable.
Theorem 10.1. Assume E[|Y|] < ∞. Then
    E[Y|X = x] = ∫_0^∞ P(Y > z|X = x) dz − ∫_{−∞}^0 P(Y < z|X = x) dz
If Y is discrete, then
    E[Y|X = x] = Σ_y y × pmf_{Y|X}(y|x)
If Y is continuous, then
    E[Y|X = x] = ∫ y × pdf_{Y|X}(y|x) dy
Theorem 10.4.
    VAR Y = E[VAR(Y|X)] + VAR(E[Y|X])
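A minimal simulation sketch (standard library Python) of Theorem 10.4 under one assumed model, X ∼ uniform(0, 1) and Y | X = x ∼ N(x, x^2), for which E[VAR(Y|X)] = E[X^2] = 1/3 and VAR(E[Y|X]) = VAR X = 1/12:

    import random, statistics

    random.seed(2)
    n = 200_000

    # Illustrative model: X ~ uniform(0, 1); given X = x, Y is normal with mean x and sd x.
    xs = [random.random() for _ in range(n)]
    ys = [random.gauss(x, x) for x in xs]

    var_y = statistics.pvariance(ys)
    print(round(var_y, 3), "vs", round(1/3 + 1/12, 3))   # both close to 5/12, about 0.417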
11 Generating Functions

The moment generating, cumulant generating, probability generating and characteristic functions of a random variable X are
    mgf_X(t) = E[e^{tX}],  cgf_X(t) = log mgf_X(t),  pgf_X(z) = E[z^X],  chf_X(t) = E[e^{itX}]
where t ∈ R, z > 0 and i = √−1 is the unit imaginary number.
1. mgfX (0) = 1
2. E[X^k] = d^k/dt^k mgf_X(0) if it exists
    mgf_X(t) = 1 + µ_1 t + µ_2 t^2/2! + · · · + µ_k t^k/k! + o(|t|^k)
1. cgfX (0) = 0
1. pgfX (1) = 1
2. E[X(X − 1) · · · (X − k + 1)] = d^k/dz^k pgf_X(1) if it exists.
1. chfX (0) = 1
2. E[X^k] = i^{−k} d^k/dt^k chf_X(0) if it exists
    chf_X(t) = 1 + iµ_1 t − µ_2 t^2/2! + · · · + i^k µ_k t^k/k! + o(|t|^k)
Theorem 11.5. If two random variables X and Y have the same moment generating functions in an open
neighbourhood of 0, that is, (−a, b) for a, b > 0, then X and Y are identically distributed.
Theorem 11.6. If a function ϕ : R → C satisfies 5 - 8 in Theorem 11.4, then there exists a random variable
having ϕ as its characteristic function.
Definition 11.1. The joint probability/moment/cumulant generating and characteristic functions of X and Y are defined analogously; for example, the joint moment generating function is mgf_{X,Y}(s, t) = E[e^{sX + tY}].
Theorem 11.7 (Inversion Formula). Let ϕ be a characteristic function of a random variable X. Then for any a < b,
    P(a < X < b) + {P(X = a) + P(X = b)}/2 = lim_{T→∞} (1/2π) ∫_{−T}^{T} [(e^{−iat} − e^{−ibt})/(it)] ϕ(t) dt
Theorem 11.8 (Chernoff Bound). Let X be a random variable having moment generating function. For any
constant x,
    P(X ≥ x) ≤ inf_{t>0} e^{−xt} mgf_X(t)
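A small numeric sketch (Python standard library; X ∼ N(0, 1) is an arbitrary illustrative choice) comparing the Chernoff bound with the exact tail. Here mgf_X(t) = e^{t^2/2}, so the infimum over t > 0 is e^{−x^2/2}, attained at t = x:

    import math

    for x in (1.0, 2.0, 3.0):
        chernoff = math.exp(-x**2 / 2)                  # inf_{t>0} e^{-xt} mgf_X(t) for N(0, 1)
        true_tail = 0.5 * math.erfc(x / math.sqrt(2))   # exact P(X >= x) for the standard normal
        print(x, round(true_tail, 5), "<=", round(chernoff, 5))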
The survival function of a lifetime X is S_X(t) = P(X > t) for t > 0.
The residual (or future) lifetime given X > t is defined by
RX (t) = X − t
The mean residual lifetime is the conditional expectation of residual lifetime given X > t, that is,
    E[R_X(t)|X > t] = ∫_0^∞ P(R_X(t) > z|X > t) dz = ∫_t^∞ S_X(z)/S_X(t) dz    (7)
12 Stochastic process
Definition 12.1. A stochastic process is a collection of time indexed random variables
{Xt : t ∈ T }
A stochastic process {X_n} is a martingale if E[X_{n+1}|X_0, . . . , X_n] = X_n for all n, a supermartingale if
    E[X_{n+1}|X_0, . . . , X_n] ≤ X_n
for all n, and a submartingale if
    E[X_{n+1}|X_0, . . . , X_n] ≥ X_n
for all n.
Note: the conditioning on X_0, . . . , X_n is often replaced by a filtration F_n, that is,
    E[X_{n+1}|F_n] = X_n
Remark 12.3. If the path of a Wiener process f (t) reaches a value f (s) = a at time t = s, then the subsequent
path after time s has the same distribution as the reflection of the subsequent path about the value a.
13 Mode of Convergence
Definition 13.1 (Modes of convergence).
• A sequence of random variables X_n converges to X in distribution (X_n →d X) if
    P(X_n ≤ x) → P(X ≤ x)
as n → ∞ for every continuity point x of cdf_X.
• A sequence of random variables X_n converges to X in probability (X_n →p X) if, for every ε > 0,
    P(|X_n − X| > ε) → 0
as n → ∞.
• A sequence of random variables X_n converges to X almost surely (X_n →a.s. X) if
    P(lim_{n→∞} X_n = X) = 1.
• A sequence of random variables X_n converges to X in L^p (X_n →L^p X) for p > 0 if
    E[|X_n − X|^p] → 0
as n → ∞.
Theorem 13.1. Let Xn and X be discrete random variables with probability mass functions fn (x) and f (x)
satisfying fn (x) → f (x) for any x with f (x) > 0. Then
Xn −→ X
in distribution.
Theorem 13.2 (Relations between modes of convergence). The following implications hold:
(a) X_n →a.s. X =⇒ X_n →p X
(b) X_n →L^p X =⇒ X_n →p X
(c) X_n →p X =⇒ X_n →d X
13.1 L1 Convergence
Lemma 13.1 (L1 Convergence). If Y ≥ 0 and E[Y] < ∞, then for any ε > 0 there exists M > 0 such that E[Y 1{Y > M}] < ε.
Lemma 13.2. Suppose a random variable Y has a finite absolute expectation, that is, E[|Y|] < ∞. For any ε > 0, there exists δ > 0 such that |E[Y 1{A}]| < ε for any event A with P(A) < δ, where 1{A} is an indicator function of the event A.
Lemma 13.3. Suppose a random variable Y has a finite absolute expectation, that is, E[|Y|] < ∞, and a sequence A_n of events satisfies P(A_n) → 0. Then
E[Y 1{An }] → 0
Theorem 13.3 (Dominated Convergence Theorem). Suppose that Xn → X in probability, |Xn | ≤ Y and
E[Y ] < ∞. Then
E[Xn ] → E[X]
Theorem 13.4 (Generalized Dominated Convergence Theorem). If all X, Y, Xn , Yn have finite absolute expec-
tation, |Xn | ≤ Yn for all n, Xn → X in probability, Yn → Y , and E[Yn ] → E[Y ], then
E[Xn ] → E[X]
Theorem 13.5 (Monotone Convergence Theorem). Let Xn be non-negative non-decreasing random variables.
Suppose lim_{n→∞} X_n = X is finite a.s. Then
    lim_{n→∞} E[X_n] = E[X]
Theorem 13.6 (Fatou’s lemma). Let X_1, X_2, . . . be a sequence of non-negative random variables. Then
    E[lim inf_{n→∞} X_n] ≤ lim inf_{n→∞} E[X_n]
Theorem 13.9. If a sequence of random variables Xn converges to X in probability, then there exists a
subsequence nk such that Xnk converges to X almost surely.
Theorem 13.10. A sequence x_n of real numbers converges to x if and only if for any subsequence n_k there exists a further subsequence n_{k_l} such that x_{n_{k_l}} converges to x.
Theorem 13.11. A sequence of random variables X_n converges to X in probability if and only if for any subsequence n_k there exists a further subsequence n_{k_l} such that X_{n_{k_l}} converges to X a.s.
Theorem 13.13. Let X be a random variable with P(X = x) = 0 for all x and let F be the distribution function of X. Then F(X) ∼ uniform(0, 1), and F^{−1}(U) has the same distribution as X for any U ∼ uniform(0, 1).
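Theorem 13.13 is the basis of inverse transform sampling. A minimal Python sketch (standard library; exponential(2) is an arbitrary illustrative target) drawing samples as F^{−1}(U) and checking the mean and one cdf value:

    import math, random, statistics

    random.seed(3)
    lam, n = 2.0, 200_000

    # F(x) = 1 - exp(-lam*x) for exponential(lam), so F^{-1}(u) = -log(1 - u)/lam.
    samples = [-math.log(1 - random.random()) / lam for _ in range(n)]

    print(round(statistics.mean(samples), 3))          # close to 1/lam = 0.5
    print(round(sum(v <= 1.0 for v in samples) / n, 3),
          round(1 - math.exp(-lam * 1.0), 3))          # empirical vs exact P(X <= 1)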
Theorem 13.14 (Skorokhod’s representation theorem). If X_n →d X, then there exist random variables Y, Y_1, Y_2, . . . in a probability space such that
(a) X_n and Y_n have the same distribution for each n, and X and Y have the same distribution
(b) Y_n →a.s. Y
Theorem 13.17. X_n →d X if and only if
14 Law of Large Numbers

Theorem 14.2 (Strong Law of Large Numbers). Let X_1, . . . , X_n be i.i.d. r.v.s with E[|X_n|] < ∞. Then
    X̄_n →a.s. E[X_1]
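A minimal simulation sketch (standard library Python, Bernoulli(0.3) trials as an arbitrary illustration) showing the sample mean along one realized path settling near E[X_1] = 0.3:

    import random

    random.seed(4)
    p = 0.3
    running_sum, checkpoints = 0, {10, 100, 1_000, 10_000, 100_000}

    # One path of i.i.d. Bernoulli(p) trials; the running mean stabilizes near p = E[X_1].
    for n in range(1, 100_001):
        running_sum += (random.random() < p)
        if n in checkpoints:
            print(n, round(running_sum / n, 4))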
15 Central Limit Theorem

Theorem 15.1 (Levy’s Central Limit Theorem). Let X_1, . . . , X_n be i.i.d. r.v.s with µ = E[X_i] and σ^2 = VAR X_i. Then
    √n (X̄_n − µ)/σ →d N(0, 1)
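A minimal simulation sketch (standard library Python, uniform(0, 1) summands with n = 50 as an arbitrary illustration) of the standardized sample means √n(X̄_n − µ)/σ, whose empirical mean, standard deviation and central coverage should be close to those of N(0, 1):

    import random, statistics

    random.seed(5)
    n, reps = 50, 20_000
    mu, sigma = 0.5, (1 / 12) ** 0.5   # mean and sd of uniform(0, 1)

    z = [(statistics.mean(random.random() for _ in range(n)) - mu) * n**0.5 / sigma
         for _ in range(reps)]

    print(round(statistics.mean(z), 3), round(statistics.pstdev(z), 3))   # about 0 and 1
    print(round(sum(abs(v) <= 1.96 for v in z) / reps, 3))                # about 0.95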
Theorem 15.2 (Lindeberg-Feller Central Limit Theorem). Let X_1, . . . , X_n be independent r.v.s with E[X_i] = 0 and σ_i^2 = VAR X_i < ∞. Let s_n^2 = E[X_1^2] + . . . + E[X_n^2]. The Lindeberg condition
    (1/s_n^2) Σ_{k=1}^{n} E[X_k^2 1{|X_k| > ε s_n}] → 0
for any ε > 0 holds if and only if
    (X_1 + . . . + X_n)/s_n →d N(0, 1)
and
    max(σ_1^2, . . . , σ_n^2)/s_n^2 → 0
Theorem 15.3 (Lyapounov’s condition). Let X_1, . . . , X_n be independent r.v.s with E[X_i] = 0 and σ_i^2 = VAR X_i < ∞ satisfying Lyapounov’s condition
    lim_{n→∞} (1/s_n^{2+δ}) Σ_{k=1}^{n} E[|X_k|^{2+δ}] = 0
for some δ > 0. Then (X_1 + . . . + X_n)/s_n →d N(0, 1).
Theorem 15.4 (δ-method). Let X_n be a sequence of r.v.s and a_n be a sequence of positive real numbers diverging to infinity. If a_n(X_n − µ) →d Z for some r.v. Z and a constant µ, then for any continuously differentiable function g,
    a_n(g(X_n) − g(µ)) →d g′(µ) Z