Measure Theoretic Probability Theory Notes
June 3, 2024
Probability Spaces
Definition. A probability space is a triple (Ω, F, P) where Ω is a set (the sample space), F is a sigma algebra on Ω
(the event space) and P is a measure (the probability measure) on the measurable space (Ω, F) with the additional
property that P(Ω) = 1.
• Remark. Note that by the monotonicity of a measure, we have P(F) ≤ P(Ω) = 1 for all F ∈ F, so every event has probability at most 1.
• Example. Consider two independent coin tosses. Here, we have Ω = {HH, HT, TH, TT}, F = P(Ω) (the power set of Ω) and we define P(ω) = 1/4 for all ω ∈ Ω. Thus, by sigma additivity, we have that P(A) = Σ_{ω∈A} P(ω) = |A|/4 for all A ∈ F.
• Example. Consider rolling a fair die. Here, we have Ω = {1, 2, 3, 4, 5, 6}, F = P(Ω) and we define P(ω) = 1/6 for all ω ∈ Ω. Thus, by sigma additivity, we have that P(A) = Σ_{ω∈A} P(ω) = |A|/6 for all A ∈ F (see the sketch after these examples).
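Both finite examples can be checked directly with a few lines of code. The following Python sketch is illustrative only (the helper prob and the variable names are not part of the notes); it computes P(A) = |A|/|Ω| for events in a finite sample space carrying the uniform measure.

    from itertools import product

    def prob(event, omega):
        """P(A) = sum over w in A of P(w) = |A| / |Omega| for a uniform finite space."""
        assert set(event) <= set(omega)      # A must be a subset of the sample space
        return len(event) / len(omega)

    # Two independent coin tosses: Omega = {HH, HT, TH, TT}, P(w) = 1/4.
    omega_coins = ["".join(t) for t in product("HT", repeat=2)]
    print(prob({"HH", "HT", "TH"}, omega_coins))   # P(at least one head) = 0.75

    # A fair die: Omega = {1, ..., 6}, P(w) = 1/6.
    omega_die = range(1, 7)
    print(prob({2, 4, 6}, omega_die))              # P(even) = 0.5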
Random Variables
• Probability distribution law. The probability distribution law of a random variable X : (Ω, F) → (E, E), denoted by PX, is the measure on (E, E) defined by PX (A) = P(X⁻¹[A]) for all A ∈ E (a short numerical sketch follows this list).
• Absolutely Continuous. A real-valued random variable X is said to be absolutely continuous if the probability
distribution law, PX on (R, B(R)) is absolutely continuous with respect to the Lebesgue measure.
• Discrete. A random variable X : (Ω, F) → (E, E) is said to be discrete if X(Ω) ⊆ E is a countable set.
• Cumulative distribution function. The cumulative distribution function (CDF) of a real-valued random variable X, denoted by FX , is defined by FX (x) = PX ((−∞, x]) for all x ∈ R.
– Non-decreasing: For x ≤ y, we have (−∞, x] ⊆ (−∞, y] =⇒ PX ((−∞, x]) ≤ PX ((−∞, y]) by the monotonicity of a measure. Thus, x ≤ y =⇒ FX (x) ≤ FX (y).
• Let (X, A) be a measurable space, and let ν and µ be sigma-finite measures on (X, A) with ν absolutely continuous with respect to µ. If g : X → [0, ∞) is a measurable function, then we have

∫_X g dν = ∫_X g f dµ    (1)

where f = dν/dµ is the Radon-Nikodym derivative of ν with respect to µ (a numerical check of this identity follows this list).
• Let (X1 , A1 ) and (X2 , A2 ) be measurable spaces, with µ a measure on (X1 , A1 ) and T : X1 → X2 a measurable
map. Also, let f : X2 → R be a non-negative measurable map. Then, we have
∫_{X1} f ◦ T dµ = ∫_{X2} f d(µ ◦ T⁻¹)    (2)
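As a concrete illustration of the probability law, the CDF, and the change-of-variables identity (2), here is a small Python sketch for the discrete random variable X = number of heads in two fair coin tosses (the names omega, P, PX, FX and the choice of g are illustrative, not from the notes).

    from itertools import product
    from collections import Counter

    omega = ["".join(t) for t in product("HT", repeat=2)]   # {HH, HT, TH, TT}
    P = {w: 1 / len(omega) for w in omega}                   # P(w) = 1/4
    X = lambda w: w.count("H")                               # X : Omega -> {0, 1, 2}

    # Probability law: P_X(A) = P(X^{-1}[A]); on singletons, sum over the fibre.
    PX = Counter()
    for w in omega:
        PX[X(w)] += P[w]
    print(dict(PX))                                          # {2: 0.25, 1: 0.5, 0: 0.25}

    # CDF: F_X(x) = P_X((-inf, x]).
    FX = lambda x: sum(p for k, p in PX.items() if k <= x)
    print([FX(x) for x in (0, 1, 2)])                        # [0.25, 0.75, 1.0]

    # Identity (2) with T = X: the integral of g(X) dP over Omega equals the
    # integral of g with respect to the pushforward P_X.
    g = lambda x: x ** 2
    lhs = sum(g(X(w)) * P[w] for w in omega)
    rhs = sum(g(x) * p for x, p in PX.items())
    assert abs(lhs - rhs) < 1e-12                            # both equal 1.5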
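The change-of-measure identity (1) can also be checked numerically for a continuous law. The sketch below is an illustration under stated assumptions, not part of the notes: ν is taken to be the standard normal law on R, µ the Lebesgue measure, and f = dν/dµ the standard normal density; scipy is used for the integration.

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    g = lambda x: x ** 2                      # a non-negative measurable test function
    f = stats.norm.pdf                        # Radon-Nikodym derivative dnu/dmu

    lhs = stats.norm.expect(g)                # integral of g with respect to nu
    rhs, _ = quad(lambda x: g(x) * f(x), -np.inf, np.inf)
    print(lhs, rhs)                           # both approximately 1.0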
Definition. Let (Ω, F, P) be a probability space and X : Ω → S ⊆ R a real-valued random variable. We define
the expectation or expected value of X, denoted by E[X] as follows:
E[X] = ∫_Ω X dP
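When Ω is finite, the defining integral reduces to a finite sum, so the expectation can be computed directly from the definition. A minimal sketch (with illustrative names, not from the notes) for X = number of heads in two fair coin tosses:

    from itertools import product

    omega = ["".join(t) for t in product("HT", repeat=2)]   # {HH, HT, TH, TT}
    P = {w: 1 / 4 for w in omega}
    X = lambda w: w.count("H")

    # E[X] = integral of X dP over Omega = sum over w of X(w) P(w) for finite Omega.
    expectation = sum(X(w) * P[w] for w in omega)
    print(expectation)                                       # 1.0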
Using the change-of-variables theorems, we can derive an alternative, easier-to-work-with expression for the expectation. Let g : S → R be a measurable function (the theorems above are stated for non-negative g; in general, apply them to the positive and negative parts of g separately). Now,
Z Z
g(X) dµ = g dPX
Ω S
by theorem (2) where PX is the probability law of the random variable X. Next, we use (1) to get
Z Z Z
dPX
g dPX = g dµ = gfX dµ
S S dµ S
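The reference measure µ need not be the Lebesgue measure. For a discrete random variable, one can take µ to be counting measure on S, in which case fX is the probability mass function. The following short Python sketch (illustrative names and an arbitrary choice of g, not from the notes) checks the formula for X = number of heads in two fair coin tosses.

    from itertools import product

    omega = ["".join(t) for t in product("HT", repeat=2)]
    X = lambda w: w.count("H")
    g = lambda x: 2 * x + 1                                  # any measurable g

    # f_X(x) = P_X({x}): the density of P_X w.r.t. counting measure on S = {0, 1, 2}.
    f_X = {x: sum(1 / 4 for w in omega if X(w) == x) for x in (0, 1, 2)}

    lhs = sum(g(X(w)) * (1 / 4) for w in omega)              # integral of g(X) dP over Omega
    rhs = sum(g(x) * f_X[x] for x in (0, 1, 2))              # integral of g f_X dmu over S
    print(lhs, rhs)                                          # both 3.0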
For a continuous random variable X, the reference measure µ is taken to be the Lebesgue measure λ, and fX is the familiar probability density function of X. Taking g to be the identity, the expectation is given by ∫_{X(Ω)} x fX dλ. Since X(Ω) ⊆ R, the map x ↦ x fX (x) is a real-valued function on a subset of R, and by assumption it is Lebesgue integrable over X(Ω). If it is moreover Riemann integrable there (for instance, when fX is piecewise continuous), the Lebesgue and Riemann integrals agree, and we have

E[X] = ∫_{X(Ω)} x fX (x) dx
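As a closing illustration (under the assumption that X is Exponential(1), so fX (x) = e^{−x} on X(Ω) = [0, ∞) and E[X] = 1), the integral form above can be compared with a Monte Carlo estimate of the defining integral ∫_Ω X dP; the names below are illustrative.

    import numpy as np
    from scipy.integrate import quad

    f_X = lambda x: np.exp(-x)                       # density of X w.r.t. Lebesgue measure

    # E[X] = integral over X(Omega) = [0, inf) of x f_X(x) dx.
    riemann, _ = quad(lambda x: x * f_X(x), 0, np.inf)

    # Monte Carlo view of E[X] = integral of X dP over Omega: average X over draws from P.
    rng = np.random.default_rng(0)
    monte_carlo = rng.exponential(scale=1.0, size=10**6).mean()

    print(riemann, monte_carlo)                      # both close to 1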