
2P7: Probability & Statistics

Continuous Random Variables

Thierry Savin

Lent 2024

The royal flush, the best possible hand in poker, has a probability of 0.000154%.
Introduction
Course’s contents

1. Probability Fundamentals

2. Discrete Probability Distributions

3. Continuous Random Variables

4. Manipulating and Combining Distributions

5. Decision, Estimation and Hypothesis Testing

1/26
Introduction
This lecture’s contents

Introduction

Fundamentals of Continuous Random Variables

The Probability Density Function

The Exponential Density

The Gaussian Density

The Beta Density

2/26
Introduction
Continuous Random Variables

In the last lectures:


▶ We have seen how discrete random variables are defined and
described by their probability mass function
▶ We have given important examples of probability mass
functions:
• Bernoulli
• Geometric
• Binomial
• Poisson
▶ We have shown how to characterise probability mass functions
via expectation, variance and other moments.
In this lecture, we will consider random variables with a continuous
support, which are described by their probability density function,
and give a few important examples.

3/26
Fundamentals
Definition of a continuous random variable

▶ We have seen random variables assign a number to each outcome of the sample space.
▶ Discrete random variables have a discrete set of possible
values.
▶ Continuous random variables will have a continuous set of
values.
▶ The support can be finite (for example: [0, 1], [a, b]) or
infinite (for example: [0, +∞) , (−∞, +∞)) in extent.
Example: spinner wheel
▶ The sample space is a continuous set of outcomes (orientations of the arrow).
▶ The angle with the horizontal is a continuous random variable X on a finite set 𝕏 = [0, 2π).
▶ P[2.68 < X ≤ 2.69] = 0.01/(2π) ≈ 0.0016
▶ P[X = 2.68983285921430891716 . . . ] = 0
4/26
Fundamentals
The CDF of a continuous random variable

▶ In general, P[X = a] = 0 for continuous random variables.


▶ We can still consider events corresponding to intervals,
P[a < X ≤ b], and we have seen
P[a < X ≤ b] = FX (b) − FX (a)
where FX (x) = P[X ≤ x] is the cumulative distribution
function (CDF) of X.
▶ FX (x) is an “informative” probability, even for a continuous
random variable.
Example: spinner wheel

      FX(x) = ⎧ 0         if x ≤ 0,
              ⎨ x/(2π)    if 0 ≤ x < 2π,
              ⎩ 1         if 2π ≤ x.

[Plot: FX(x) rises linearly from 0 at x = 0 to 1 at x = 2π]
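As a quick numerical aside (a minimal Python sketch, not part of the original slides), interval probabilities follow directly from differences of this CDF:

```python
import numpy as np

def F_X(x):
    """CDF of the spinner angle, uniform on [0, 2*pi)."""
    return np.clip(x / (2 * np.pi), 0.0, 1.0)

# P[a < X <= b] = F_X(b) - F_X(a)
print(F_X(2.69) - F_X(2.68))  # 0.01/(2*pi) ~ 0.00159
```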
5/26
The Probability Density Function
Definition

▶ Formally, we define the probability density function (PDF) as

      fX(x) = dFX(x)/dx

▶ Interpretation:

      fX(x) = lim_{dx→0} (FX(x + dx) − FX(x))/dx
            = lim_{dx→0} P[x < X ≤ x + dx]/dx   ⇔   fX(x)dx ≈ P[x < X ≤ x + dx]

So fX(x)dx is the probability of X falling within the infinitesimal interval (x, x + dx].

Example: spinner wheel

      fX(x) = ⎧ 1/(2π)   if x ∈ [0, 2π),
              ⎩ 0        otherwise

gives a “good picture” of the uniform distribution of X.

[Plot: fX(x) is flat at 1/(2π) on [0, 2π) and zero elsewhere]

Note: we can extend the support to ℝ by setting fX(x) = 0 for x ∉ 𝕏.
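To see the interpretation fX(x)dx ≈ P[x < X ≤ x + dx] at work, here is a small Monte Carlo sketch (illustrative Python, not from the slides): sample the spinner uniformly and compare the empirical frequency of a short interval with fX(x)dx.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2 * np.pi, size=1_000_000)  # spinner angles on [0, 2*pi)

# f_X(x) * dx with dx = 0.01 should approximate P[2.68 < X <= 2.69]
empirical = np.mean((x > 2.68) & (x <= 2.69))
print(empirical, 0.01 / (2 * np.pi))
```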
6/26
The Probability Density Function
Properties

▶ Reminder on the properties of FX:
  (1) FX is non-decreasing: FX(a) ≤ FX(b) if a ≤ b
  (2) lim_{x→−∞} FX(x) = 0 and lim_{x→+∞} FX(x) = 1
▶ From (1), the probability density function is non-negative:

      fX(x) ≥ 0 for all x ∈ ℝ

▶ From fX(x) = FX′(x):

      ∫_a^b fX(x) dx = FX(b) − FX(a) = P[a < X ≤ b]

▶ From (2), the probability density function is normalised:

      ∫_{−∞}^{+∞} fX(x) dx = 1

▶ In general, the sums Σ seen with discrete mass distributions become integrals ∫ with density functions.
▶ Note that fX is not a probability. It has the dimension of X⁻¹.
7/26
Joint Probability Density Function
Definitions

▶ For two continuous random variables X and Y, we define the joint probability density function fXY(x, y) from the joint CDF FXY(x, y) = P[X ≤ x ∩ Y ≤ y]:

      fXY(x, y) = ∂²FXY(x, y)/∂x∂y

▶ The sum rule becomes an integral rule, and marginalisation is stated as

      ∫_{−∞}^{+∞} fXY(x, y) dy = fX(x)

▶ Conditional probability density function¹ and product rule:

      fX|Y(x|y) = fXY(x, y)/fY(y)   ⇒   fXY(x, y) = fX|Y(x|y) fY(y)

¹ We define the conditional PDF fX|Y(x|y) = ∂/∂x FX|Y=y(x|Y = y) with:
      FX|Y=y(x|Y = y) = lim_{dy→0} P[(X ≤ x) ∩ (y < Y ≤ y + dy)] / P[y < Y ≤ y + dy]
                      = lim_{dy→0} (FXY(x, y + dy) − FXY(x, y)) / (FY(y + dy) − FY(y))
                      = (1/fY(y)) ∂FXY(x, y)/∂y
  ⇒ fX|Y(x|y) = fXY(x, y)/fY(y). This conditions on “Y = y” exactly.
8/26
Joint Probability Density Function
Definitions & Properties

▶ Law of total probability

      fX(x) = ∫_{−∞}^{+∞} fX|Y(x|y) fY(y) dy

▶ Bayes’ rule

      fY|X(y|x) = fX|Y(x|y) fY(y) / ∫_{−∞}^{+∞} fX|Y(x|y) fY(y) dy

▶ Independence

      X and Y independent ⇔ fXY(x, y) = fX(x) fY(y)
                          ⇔ fX|Y(x|y) = fX(x)   for all (x, y) ∈ ℝ×ℝ
                          ⇔ fY|X(y|x) = fY(y)
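These integral rules are easy to sanity-check numerically. Below is a minimal Python sketch (my own illustrative example, assuming a hypothetical pair with Y ∼ N(0, 1) and X|Y = y ∼ N(y, 1), whose marginal is known to be X ∼ N(0, 2)):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

f_Y = lambda y: norm.pdf(y)                    # density of Y
f_X_given_Y = lambda x, y: norm.pdf(x, loc=y)  # conditional density of X given Y = y

x = 0.7
# Law of total probability: f_X(x) = integral of f_{X|Y}(x|y) f_Y(y) dy
f_X, _ = quad(lambda y: f_X_given_Y(x, y) * f_Y(y), -np.inf, np.inf)
print(f_X, norm.pdf(x, scale=np.sqrt(2)))      # the two values should agree

# Bayes' rule: f_{Y|X}(y|x) = f_{X|Y}(x|y) f_Y(y) / f_X(x)
y = 0.2
print(f_X_given_Y(x, y) * f_Y(y) / f_X)
```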
9/26
Probability Density Function
Expectation and moments of a PDF

▶ The probability density function can be used to compute expectations:

      E[g(X)] = ∫_{−∞}^{+∞} g(x) fX(x) dx      E[g(X, Y)] = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) fXY(x, y) dx dy

▶ In particular, we call
      E[Xⁿ] the nth moment
      E[(X − E[X])ⁿ] the nth central moment
  The following moments are important:
  • The mean (or first moment)

      E[X] = ∫_{−∞}^{+∞} x fX(x) dx

  • The variance (or second central moment)

      Var[X] = E[(X − E[X])²] = E[X²] − E[X]²
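As a concrete illustration (not from the slides), these integrals can be evaluated numerically, here in Python for the spinner’s uniform density, where the exact answers are E[X] = π and Var[X] = π²/3:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: 1 / (2 * np.pi)  # uniform density on [0, 2*pi)

mean, _ = quad(lambda x: x * f(x), 0, 2 * np.pi)       # E[X]
second, _ = quad(lambda x: x**2 * f(x), 0, 2 * np.pi)  # E[X^2]
var = second - mean**2                                 # Var[X] = E[X^2] - E[X]^2
print(mean, var)  # ~3.1416 and ~3.2899
```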
10/26
Probability Density Function
Other characteristics of a PDF

There are many ways to characterise the distribution of a random variable X. For example:

▶ The standard deviation is σ = √Var[X]
▶ The mode is the value of x at which fX(x) is maximum
▶ The median is the value Q1/2 of x at which FX(x) = 1/2 (it splits the area under the PDF into two equal parts):

      ∫_{−∞}^{median} fX(x) dx = 1/2 = ∫_{median}^{+∞} fX(x) dx

▶ The 1st and 3rd quartiles are the values Q1/4 and Q3/4 of x at which FX(x) = 1/4 and 3/4, respectively
▶ The interquartile range is Q3/4 − Q1/4
▶ The skewness is E[(X − E[X])³]/σ³. If the skewness is positive, the distribution is skewed to the right (the “tail” of the distribution is longer to the right)
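All of these summary statistics are available off the shelf; a small Python sketch (illustrative, using an Exp(1) sample as a preview of the next section) might look like:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.exponential(scale=1.0, size=100_000)  # Exp(lambda = 1)

q1, median, q3 = np.quantile(sample, [0.25, 0.50, 0.75])
print(median)              # ~ln 2, the median of Exp(1)
print(q3 - q1)             # interquartile range
print(sample.std())        # standard deviation, ~1 for Exp(1)
print(stats.skew(sample))  # ~2, positive: right-tailed
```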
11/26
Probability Density Function
Characteristics of a PDF

[Figure: a right-skewed density fX(x) with skewness E[(X − E[X])³]/σ³ > 0, annotated with the mode, mean, median (2nd quartile), 1st and 3rd quartiles, the interquartile range and the standard deviation]

12/26
The Exponential Density
Definition

What is the time/distance between two successive successes?


▶ Consider Xt ∼ Pois(λt) the number of successes (or arrivals)
over a time interval t with an average rate of arrivals λ.
▶ We wish to derive the density fT (t) of the time intervals T
between arrivals.
▶ The probability fT (t)dt = P[t < T ≤ t + dt].
▶ The event {t < T ≤ t + dt} means both:
• No arrivals happen between [0, t]: {Xt = 0}
• Exactly one arrival happens between [t, t + dt]: {Xdt = 1}
▶ So fT(t)dt = P[Xt = 0 ∩ Xdt = 1] = PXt(0) × PXdt(1)

             = (λt)⁰e^{−λt}/0! × (λdt)¹e^{−λdt}/1! = λe^{−λt} e^{−λdt} dt

  After simplification and taking dt → 0:  fT(t) = λe^{−λt}
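This derivation can be mimicked in simulation (an illustrative Python sketch, with rate λ = 3 chosen arbitrarily): slice time into steps of width dt, draw one Bernoulli “arrival” per slice, and check the moments of the gaps between arrivals.

```python
import numpy as np

rng = np.random.default_rng(2)
lam, T, dt = 3.0, 1000.0, 1e-3

arrivals = rng.random(int(T / dt)) < lam * dt  # one Bernoulli trial per slice of width dt
times = np.flatnonzero(arrivals) * dt          # arrival times
gaps = np.diff(times)                          # inter-arrival times T

print(gaps.mean(), 1 / lam)     # E[T] = 1/lambda
print(gaps.var(), 1 / lam**2)   # Var[T] = 1/lambda^2
```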

13/26
The Exponential Density
Definition [DB p.28]

A random variable X is said to have an Exponential distribution with parameter λ > 0 if:

      X ∼ Exp(λ) ⇔ fX(x) = ⎧ λe^{−λx}   if x ≥ 0,
                            ⎩ 0          otherwise.

The support of X, 𝕏 = [0, ∞), is continuous infinite.

[Plots: fX(x) and FX(x) for λ = 2, λ = 1 and λ = 0.5]

Verify that ∫_{−∞}^{+∞} fX(x) dx = 1.
14/26
The Exponential Density
Properties of X ∼ Exp(λ)

▶ Expectation E[X] = 1/λ [DB p.28]

      E[X] = ∫_0^∞ xλe^{−λx} dx = [−xe^{−λx}]_0^∞ + ∫_0^∞ e^{−λx} dx = [−e^{−λx}/λ]_0^∞ = 1/λ

▶ Variance Var[X] = 1/λ² [DB p.28]

      E[X²] = ∫_0^∞ x²λe^{−λx} dx = [−x²e^{−λx}]_0^∞ + ∫_0^∞ 2xe^{−λx} dx = 2/λ², so Var[X] = 2/λ² − (1/λ)² = 1/λ²

▶ Mode xmax = 0
      Obvious from the curve. . .
▶ Median Q1/2 = ln 2/λ
      See next
▶ Quartiles Qp = −ln(1−p)/λ, Q1/4 = ln(4/3)/λ, Q3/4 = ln 4/λ

      FX(x) = ∫_0^x fX(ξ) dξ = 1 − e^{−λx}, so FX(Qp) = p ⇔ 1 − e^{−λQp} = p

▶ Skewness 2 > 0 (strongly right-tailed)
      Tedious but not difficult
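These closed forms can be checked against a library implementation; a brief Python sketch (illustrative; note that scipy parameterises the exponential by scale = 1/λ):

```python
import numpy as np
from scipy import stats

lam = 2.0
X = stats.expon(scale=1 / lam)  # Exp(lambda), scipy uses scale = 1/lambda

print(X.mean(), 1 / lam)                 # E[X] = 1/lambda
print(X.var(), 1 / lam**2)               # Var[X] = 1/lambda^2
print(X.ppf(0.5), np.log(2) / lam)       # median = ln 2 / lambda
print(X.ppf(0.25), np.log(4 / 3) / lam)  # Q_{1/4} = ln(4/3) / lambda
print(X.ppf(0.75), np.log(4) / lam)      # Q_{3/4} = ln 4 / lambda
print(X.stats(moments='s'))              # skewness = 2
```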
15/26
The Exponential Density
Examples

Here are a few instances where the exponential density occurs:


▶ Business engineering: amount of money spent in one trip to
the supermarket; amount of time a clerk spends with their
customer;
▶ Reliability engineering: amount of time a product lasts;
▶ Earth sciences: time between earthquakes, geyser eruptions,
. . . ; climatology;
▶ Physics: time for a radioactive particle to decay; barometric
formula (how the density of air changes with altitude);
The exponential distribution describes the time for a continuous
process to change state; the geometric distribution describes the
number of trials necessary for a discrete process to change state.

16/26
The Gaussian Density
Definition [DB p.28]

A random variable X is said to have a Gaussian² (or Normal) distribution with mean µ and variance σ² if:

      X ∼ N(µ, σ²) ⇔ fX(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)} for all x ∈ ℝ

The support of X, 𝕏 = ℝ, is continuous infinite.
[Plots: fX(x) and FX(x) for (µ, σ²) = (−1, 1/2), (0, 1) and (2, 2)]

Verify that ∫_{−∞}^{+∞} fX(x) dx = 1; hint: calculate (∫_{−∞}^{+∞} e^{−x²} dx)² = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} e^{−(x²+y²)} dx dy in polar coordinates.
² Named after the German mathematician Carl Friedrich Gauss (1777–1855).
17/26
The Gaussian Density
Cumulative distribution

▶ N(0, 1) is called the standard Gaussian distribution. We will show in the next lecture that:

      Y ∼ N(0, 1) ⇔ X = µ + σY ∼ N(µ, σ²) [DB p.29]

▶ The cumulative distribution function of Y ∼ N(0, 1) is:

      FY(y) = Φ(y) = (1/√(2π)) ∫_{−∞}^{y} e^{−ξ²/2} dξ [DB p.29]

  Φ(y) is tabulated on p.29 of the Maths Data Book. By symmetry, you can verify Φ(−y) = 1 − Φ(y) and Φ(0) = 1/2.
▶ The CDF of X ∼ N(µ, σ²) is Φ((x − µ)/σ).
▶ Most computing environments (Python, MATLAB. . . ) have an “error function” called erf. Be cautious that Φ(y) = (1/2)[1 + erf(y/√2)], as sketched below.
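A minimal Python sketch of this relation (illustrative):

```python
import math

def Phi(y):
    """Standard Gaussian CDF, built from the error function."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

print(Phi(0))            # 0.5
print(Phi(1) + Phi(-1))  # 1.0, since Phi(-y) = 1 - Phi(y)
print(Phi(0.6745))       # ~0.75: the 3rd quartile of N(0, 1)
```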

18/26
The Gaussian Density
Properties of X ∼ N (µ, σ 2 )

In the following, we write X = µ + σY with Y ∼ N(0, 1)

▶ Expectation E[X] = µ [DB p.28]

      E[Y] = (1/√(2π)) ∫_{−∞}^{+∞} y e^{−y²/2} dy = 0 (the integrand is odd), and E[X] = σE[Y] + µ

▶ Variance Var[X] = σ² [DB p.28]

      E[Y²] = (1/√(2π)) ∫_{−∞}^{+∞} y² e^{−y²/2} dy = (1/√(2π)) ([−y e^{−y²/2}]_{−∞}^{+∞} + ∫_{−∞}^{+∞} e^{−y²/2} dy) = 1,
      and E[X²] = σ²E[Y²] + 2σµE[Y] + µ²

▶ Mode xmax = µ
      Obvious from the curve. . .
▶ Median Q1/2 = µ
      See next
▶ Quartiles Qp = µ + σΦ⁻¹(p)
      The quartile of Y is Φ⁻¹(p); Φ⁻¹(1/2) = 0 and Φ⁻¹(3/4) = −Φ⁻¹(1/4) ≈ 0.6745
▶ Skewness 0
      By symmetry
19/26
The Gaussian Density
Properties of X ∼ N (µ, σ 2 )

▶ Confidence interval: P[|X − µ| ≤ mσ] = 2Φ(m) − 1

      P[|X − µ| ≤ mσ] = P[|Y| ≤ m]
                      = P[−m ≤ Y ≤ m]
                      = Φ(m) − Φ(−m)
                      = 2Φ(m) − 1

  How likely is X to be within m standard deviations of the mean?
  • We have 2Φ(1) − 1 ≈ 68% confidence that |X − µ| ≤ σ
  • We have 2Φ(2) − 1 ≈ 95% confidence that |X − µ| ≤ 2σ
▶ Familiarise yourself with the lookup table for Φ in the Data
Book p. 29.
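The same numbers drop out of any Gaussian CDF routine; for instance, a short Python check (illustrative):

```python
from scipy.stats import norm

# P[|X - mu| <= m*sigma] = 2*Phi(m) - 1
for m in (1, 2, 3):
    print(m, 2 * norm.cdf(m) - 1)  # ~0.683, ~0.954, ~0.997: the 68-95-99.7 rule
```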

20/26
The Gaussian Density
Examples

The Gaussian density is very common. It occurs in:


▶ Sciences: measurement errors in experiments are often
modelled by a normal distribution (we will see why!);
▶ Physics: position of a diffusing particle; ground state in a
quantum harmonic oscillator; thermal radiation;
▶ Education: standardised tests, exam results.
The Log-normal distribution (where the logarithm of a variable has
a Gaussian density) is commonly seen in:
▶ Biology: physiological measurement (blood pressure, size and
weight, length of hair. . . );
▶ Chemistry: particle size distribution, concentration of rare
elements in minerals. . .
▶ Computing: file sizes, length of emails. . .
▶ Finance: exchange rates, price indices, stock market indices. . .
▶ Social Sciences: demographics, scientometrics. . .
21/26
The Beta Density
Definition

▶ Suppose we observe that k out of n Bernoulli trials are successes. What is the probability density of the Bernoulli parameter p given this observation?
▶ From the Binomial distribution, Pk|p(k|p) = nCk p^k (1 − p)^{n−k}
▶ Using Bayes’ rule:

      fp|k(p|k) = Pk|p(k|p) fp(p) / ∫_0^1 Pk|p(k|p) fp(p) dp

▶ We assume that, prior to any observation, all values of p ∈ [0, 1] are believed to be equally likely: fp(p) = 1
▶ After some calculations we find

      fp|k(p|k) = ((n + 1)!/(k!(n − k)!)) p^k (1 − p)^{n−k}
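This posterior is precisely a Beta density (as named on the next slide); a quick Python check of the formula (illustrative, with n = 10 and k = 7 chosen arbitrarily):

```python
from math import factorial
from scipy.stats import beta

n, k, p = 10, 7, 0.6

# Posterior density (n+1)!/(k!(n-k)!) * p^k * (1-p)^(n-k) ...
direct = factorial(n + 1) / (factorial(k) * factorial(n - k)) * p**k * (1 - p)**(n - k)

# ... coincides with the Beta(alpha = k+1, beta = n-k+1) density
print(direct, beta.pdf(p, k + 1, n - k + 1))  # the two values agree
```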
22/26
The Beta Density
Definition [DB p.28]

A random variable X is said to have a Beta distribution with shape parameters α > 0 and β > 0 if:

      X ∼ Beta(α, β) ⇔ fX(x) = ⎧ (Γ(α+β)/(Γ(α)Γ(β))) x^{α−1} (1 − x)^{β−1}   if x ∈ [0, 1],
                                ⎩ 0                                           otherwise.

where the Gamma function is defined as Γ(a) = ∫_0^∞ ξ^{a−1} e^{−ξ} dξ (with a > 0).
The support of X, 𝕏 = [0, 1], is continuous finite.
▶ The Gamma function is a generalisation of the factorial to non-integers
▶ It has the property Γ(a) = (a − 1)! when a is an integer.³

³ From the previous slide, verify that fp|k = Beta(α, β) is the probability density of p after the observation of k = α − 1 successes and n − k = β − 1 failures.
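A one-liner confirms the factorial property (illustrative Python, using scipy’s gamma function):

```python
from math import factorial
from scipy.special import gamma

for a in range(1, 6):
    print(gamma(a), factorial(a - 1))  # Gamma(a) = (a - 1)! for integer a
```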
23/26
The Beta Density
Properties of X ∼ Beta(α, β)

[Plots: fX(x) and FX(x) for (α, β) = (1, 1), (2, 5), (2, 2) and (3, 2)]

▶ Expectation E[X] = α/(α + β) [DB p.28]
▶ Variance Var[X] = αβ/((α + β)²(α + β + 1)) [DB p.28]

No need to know the following (but here for completeness):

▶ Mode xmax = (α − 1)/(α + β − 2) for α, β > 1
▶ Median: no closed-form expression. . .
▶ Quartiles: no closed-form expression. . .
▶ Skewness 2(β − α)√(α + β + 1) / ((α + β + 2)√(αβ)) (the tail’s side depends on the sign of β − α)
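Again, these can be checked numerically (an illustrative Python sketch, with (α, β) = (2, 5) as in the plots):

```python
from scipy.stats import beta

a, b = 2.0, 5.0
X = beta(a, b)

print(X.mean(), a / (a + b))                        # E[X] = alpha/(alpha+beta)
print(X.var(), a * b / ((a + b)**2 * (a + b + 1)))  # Var[X]
print(X.ppf(0.5))                                   # median: no closed form, but computable
print(X.stats(moments='s'))                         # skewness > 0 here, since beta > alpha
```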
24/26
The Beta Density
Examples

Here are some real-life examples of where the Beta Distribution


can be observed:
▶ Quality Control: proportion of defective items in a production
process;
▶ Sports Analytics: winning probability of sports teams;
▶ Environmental Modelling: probability distribution of
precipitation;
▶ Medicine: prevalence of a disease in a population, success rate
of a treatment;
▶ Finance: range of returns on investment portfolios or assets.

25/26
Additional Remarks

Two additional remarks:


▶ It is possible to define a probability density function for a discrete random variable using the delta function!
  Consider a discrete random variable X with probability mass function PX and support 𝕏; then:

      fX(x) = Σ_{k∈𝕏} PX(k) δ(x − k)

  is the probability density function of X.
▶ It is possible to define conditional expectations:

      E[X|Y = y] = ∫_{−∞}^{+∞} x fX|Y(x|y) dx   (it is a function of y)
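As a final illustrative Python sketch (my own example, reusing the hypothetical pair Y ∼ N(0, 1), X|Y = y ∼ N(y, 1) from the joint-density slide, for which E[X|Y = y] = y):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

y = 0.8
# E[X | Y = y] = integral of x * f_{X|Y}(x|y) dx
E_X_given_y, _ = quad(lambda x: x * norm.pdf(x, loc=y), -np.inf, np.inf)
print(E_X_given_y)  # ~0.8 = y, a function of y as claimed
```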

26/26
You can attempt all problems in Examples Paper 5
