§6 Random variables and distributions

§6.1 Random variable


6.1.1 Consider an experiment with sample space Ω.
(Informal) Definition. A random variable is a function X : Ω → (−∞, ∞).

6.1.2 A random variable is useful for quantifying experimental outcomes or descriptive events. Sometimes it may be convenient to simplify the sample space so that events of interest can be described by X.

6.1.3 Examples.

(i) Ω = {win, lose} can be transformed by defining X(win) = 1, X(lose) = 0.


(ii) Toss 2 coins. Ω = {HH, HT, TH, TT}. Interested in no. of heads only. Define

X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = 0.

(iii) Toss 2 coins. Ω = {HH, HT, TH, TT}. Interested in how many more vertical strokes the
first letter has than the second letter. Define

X(HH) = X(TT) = 0, X(HT) = 1, X(TH) = −1.

(iv) n Bernoulli trials. Interested in no. of successes. Define X = no. of successes ∈ {0, 1, . . . , n}.
(v) Annual income Y has sample space [0, ∞). Income is taxable when it exceeds c, say.
Only interested in the part of the income which is taxable. May define X = max{0, Y − c}.
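To make the function view concrete, here is a minimal Python sketch of Examples (ii) and (iii) above (the names omega, X and D are ours, purely illustrative):

    # Random variables as functions on the sample space of two coin tosses.
    omega = ["HH", "HT", "TH", "TT"]

    def X(w):
        # Example (ii): number of heads in the outcome w
        return w.count("H")

    def D(w):
        # Example (iii): vertical strokes of the 1st letter minus the 2nd
        strokes = {"H": 2, "T": 1}
        return strokes[w[0]] - strokes[w[1]]

    assert [X(w) for w in omega] == [2, 1, 1, 0]
    assert [D(w) for w in omega] == [0, 1, -1, 0]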

6.1.4 Conventional notation:


Use capital letters X, Y, . . . to denote random variables and small letters x, y, . . . the possible
numerical values (or realisations) of these variables, so that e.g.

X(ω) = x, Y (ω) = y, for a particular outcome ω ∈ Ω.

§6.2 Distribution function


6.2.1 Definition. The distribution function of a random variable X is the function F : (−∞, ∞) →
[0, 1] given by
F (x) = P(X ≤ x).

Note: Alternative name → cumulative distribution function (cdf).

6.2.2 Example. Toss a coin twice, with Ω = {HH, HT, TH, TT}. Define

• X = no. of heads;
• Y = 1 if both tosses return the same side, and = −1 otherwise.

Distribution functions of X and Y are, respectively,



F_X(t) = 0 for t < 0;  1/4 for 0 ≤ t < 1;  3/4 for 1 ≤ t < 2;  1 for t ≥ 2;

F_Y(t) = 0 for t < −1;  1/2 for −1 ≤ t < 1;  1 for t ≥ 1.
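A quick numerical check of F_X (a sketch; the enumeration below is ours and assumes nothing beyond the standard library):

    # F_X for Example 6.2.2, built from the four equally likely outcomes.
    from fractions import Fraction

    omega = ["HH", "HT", "TH", "TT"]
    X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}   # X = no. of heads

    def F_X(t):
        # F(t) = P(X <= t); each outcome has probability 1/4
        return Fraction(sum(1 for w in omega if X[w] <= t), len(omega))

    assert F_X(-0.5) == 0                      # t < 0
    assert F_X(0.5) == Fraction(1, 4)          # 0 <= t < 1
    assert F_X(1) == Fraction(3, 4)            # 1 <= t < 2 (right-continuous)
    assert F_X(2) == 1                         # t >= 2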

6.2.3 The following properties characterise a cdf:

(i) lim_{x→−∞} F(x) = 0;
(ii) lim_{x→∞} F(x) = 1;
(iii) F is increasing (i.e. non-decreasing);
(iv) F is right-continuous, i.e. lim_{h↓0} F(x + h) = F(x).

Note: F is not necessarily left-continuous. For instance, in Example 6.2.2, lim_{h↓0} F_X(1 − h) = 1/4 ≠ 3/4 = F_X(1).

6.2.4 Random variables and distribution functions are useful for describing probability models. The probabilities of events concerning a random variable X can be calculated from the distribution function of X: the cdf of X completely specifies the random behaviour of X.
Example. If X denotes an integer-valued random variable and its distribution function F is
given, then we can calculate

P(X = r) = F (r) − F (r − 1), r = 0, ±1, ±2, . . . ,

and deduce from these the probability of any event involving X.
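As a sketch (assuming scipy is available; Poisson(3.5) is an arbitrary illustrative choice), the identity above can be checked numerically:

    # Recover P(X = r) from the cdf of an integer-valued random variable.
    from scipy.stats import poisson

    F = poisson(3.5).cdf
    for r in range(6):
        assert abs((F(r) - F(r - 1)) - poisson.pmf(r, 3.5)) < 1e-12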

§6.3 Discrete random variables


6.3.1 Definition. Let X be a random variable defined on the sample space Ω. Then X is a discrete
random variable if X(Ω) ≡ {X(ω) : ω ∈ Ω} is countable.
Note: A set A is countable if its elements can be enumerated (or listed), such that A = {a1, a2, . . .}.

6.3.2 Examples.

(i) Binomial (n, p): X(Ω) = {0, 1, 2, . . . , n}.


(ii) Bernoulli trial: X(Ω) = {0, 1}.
(iii) Poisson (λ): X(Ω) = {0, 1, 2, . . .}.

6.3.3 Definition. The mass function of a discrete random variable X is the function f : (−∞, ∞) →
[0, 1] such that
f (x) = P(X = x), x ∈ X(Ω).
Note: Alternative names → probability mass function or probability function.
Definition. The set {x ∈ X(Ω) : f (x) > 0} is known as the support of X.
Note: The support of X usually, but not necessarily, coincides with X(Ω).

6.3.4 Examples.

(i) Binomial (n, p):

f(x) = (n choose x) p^x (1 − p)^{n−x} for x = 0, 1, 2, . . . , n, and f(x) = 0 otherwise.

(ii) Bernoulli trial:

f(x) = p^x (1 − p)^{1−x} for x = 0, 1, and f(x) = 0 otherwise.

(iii) Poisson (λ):

f(x) = e^{−λ} λ^x / x! for x = 0, 1, 2, . . . , and f(x) = 0 otherwise.

(iv) Let X be no. of failures before first success in a sequence of independent Bernoulli trials
with success probability p. Then X(Ω) = {0, 1, 2, . . .}, and X has mass function
f(x) = (1 − p)^x p for x = 0, 1, 2, . . . , and f(x) = 0 otherwise.

This is called a geometric distribution.

27
(v) Let X be no. of failures before kth success in a sequence of independent Bernoulli trials
with success probability p. Then X(Ω) = {0, 1, 2, . . .}, and X has mass function
  
f(x) = (k − 1 + x choose x) (1 − p)^x p^k for x = 0, 1, 2, . . . , and f(x) = 0 otherwise.
This is called a negative binomial distribution.
(vi) Suppose a random sample of size m is drawn without replacement from a collection of
k objects of one kind and N − k of another kind. Let X be no. of objects of the first kind
found in the sample. Then X has mass function
    
f(x) = (k choose x) (N − k choose m − x) / (N choose m) for x = max{0, m + k − N}, . . . , min{k, m}, and f(x) = 0 otherwise.
This is called a hypergeometric distribution.
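These mass functions are all available in scipy.stats; the sketch below (assuming scipy is installed; parameter values are illustrative) cross-checks a few of them against the formulas above. Note that scipy's parameterisations differ from the notes': its nbinom counts failures before the n-th success, and hypergeom takes (total N, type-one count k, sample size m) in that order.

    # Cross-check the mass functions of 6.3.4 against scipy.stats.
    from math import comb, exp, isclose
    from scipy import stats

    n, p, lam = 8, 0.2, 3.5
    assert isclose(stats.binom.pmf(3, n, p),
                   comb(n, 3) * p**3 * (1 - p)**(n - 3))
    assert isclose(stats.poisson.pmf(2, lam), exp(-lam) * lam**2 / 2)
    # geometric = negative binomial with k = 1 success
    assert isclose(stats.nbinom.pmf(4, 1, p), (1 - p)**4 * p)
    # negative binomial, k = 4 successes
    assert isclose(stats.nbinom.pmf(4, 4, p),
                   comb(4 - 1 + 4, 4) * (1 - p)**4 * p**4)
    # hypergeometric: N = 18 objects, k = 6 of the first kind, sample m = 4
    N, k, m = 18, 6, 4
    assert isclose(stats.hypergeom.pmf(2, N, k, m),
                   comb(k, 2) * comb(N - k, m - 2) / comb(N, m))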

The following figures display the mass functions of examples of the above discrete random
variables.
[Figures: mass functions of Binomial (8, 0.2), Bernoulli (p = 0.2), Poisson (3.5), Geometric (p = 0.2), Negative binomial (p = 0.2, k = 4), and Hypergeometric (m = 4, N = 18, k = 6).]

6.3.5 The cdf of a discrete random variable X is a step function with jumps at values in the support
of X.
The following figures display the distribution functions of some discrete random variables.
[Figures: distribution functions of Binomial (8, 0.2), Bernoulli (p = 0.2), Poisson (3.5), Geometric (p = 0.2), Negative binomial (p = 0.2, k = 4), and Hypergeometric (m = 4, N = 18, k = 6).]

§6.4 Continuous random variables


6.4.1 Definition. A random variable X is continuous if its distribution function F(x) ≡ P(X ≤ x) has the form

F(x) = ∫_{−∞}^{x} f(y) dy,  −∞ < x < ∞,

for some function f : (−∞, ∞) → [0, ∞).


Definition. The function f is called the density function of X.
Note: Alternative names → probability density function (pdf) or probability function.
Definition. The set {x ∈ X(Ω) : f (x) > 0} is known as the support of X.

6.4.2 The cdf F of a continuous random variable X is continuous.

6.4.3 If the cdf F is differentiable, we can obtain the pdf f by f(x) = F′(x) (≥ 0 since F is increasing).

6.4.4 The pdf f plays a similar role to the mass function P(X = x) for discrete X. Results for discrete and continuous random variables can often be interchanged, with P(X = x) and the summation sign Σ replaced by f(x) and the integration sign ∫, respectively.
Example. For any subset A of real numbers,

P(X ∈ A) = Σ_{x∈A} P(X = x)  (discrete),
P(X ∈ A) = ∫_{A} f(x) dx  (continuous).

6.4.5 If X is continuous, P(X = x) = 0 for all x.

6.4.6 If X is continuous with pdf f, then

(i) ∫_{−∞}^{∞} f(x) dx = 1;
(ii) P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b) = ∫_{a}^{b} f(x) dx;
(iii) P(X ∈ A) = ∫_{A} f(x) dx for any subset A of real numbers.
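Properties (i) and (ii) can be verified numerically; a sketch using scipy.integrate.quad on the exp(1) density (the choices of λ, a, b are illustrative):

    # Numerical checks of 6.4.6 (i) and (ii) for the exponential density.
    import math
    from scipy.integrate import quad

    lam = 1.0
    f = lambda x: lam * math.exp(-lam * x)     # pdf on (0, inf); 0 elsewhere

    total, _ = quad(f, 0, math.inf)            # (i): f vanishes on (-inf, 0]
    assert abs(total - 1.0) < 1e-8

    a, b = 0.5, 2.0                            # (ii): P(a <= X <= b)
    prob, _ = quad(f, a, b)
    assert abs(prob - (math.exp(-lam * a) - math.exp(-lam * b))) < 1e-10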

6.4.7 Analogues in physics:


Continuous X            |  Physics
pdf at x                |  density at x (a point in space)
probability of a set A  |  mass of A (a region in space)
P(X = x)                |  mass of a single point x (= 0, since a point has no volume, hence no mass)

6.4.8 Examples. (f ↔ pdf, F ↔ cdf)

(i) Uniform distribution, U [a, b] (a < b):



f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise;

F(x) = 0 for x < a;  (x − a)/(b − a) for a ≤ x ≤ b;  1 for x > b.
e.g. Straight rod drops freely onto a horizontal plane. Let X be angle between rod and North
direction: 0 ≤ X < 2π. Then X ∼ U [0, 2π].

(ii) Exponential distribution, exp(λ) (λ > 0):
f(x) = λ e^{−λx} for x > 0, and f(x) = 0 for x ≤ 0;
F(x) = 0 for x ≤ 0, and F(x) = 1 − e^{−λx} for x > 0.
Remarks:
– An exponential random variable describes the interarrival time, i.e. the random time
elapsing between unpredictable events (e.g. telephone calls, earthquakes, arrivals of buses
or customers etc.)
– The exponential distribution is memoryless, i.e. if X ∼ exp(λ),

P(X > s + t | X > s) = P(X > t).

Knowing that the event hasn't occurred in the past s units of time doesn't alter the distribution of the arrival time in the future, i.e. we may assume the process starts afresh at any point of observation.
– The parameter λ is called the rate (its reciprocal 1/λ is the scale). The greater λ is, the shorter the interarrival times (i.e. the more frequent the arrivals).
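The memoryless property follows directly from the survival function P(X > x) = e^{−λx}; a small numerical sketch (λ, s, t are arbitrary illustrative values):

    # Memorylessness of exp(lambda): P(X > s+t | X > s) = P(X > t).
    import math

    lam, s, t = 0.7, 1.2, 2.0
    surv = lambda x: math.exp(-lam * x)        # P(X > x) for x > 0
    cond = surv(s + t) / surv(s)               # P(X > s+t | X > s)
    assert abs(cond - surv(t)) < 1e-12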
(iii) Gamma distribution, Gamma (α, β) (α, β > 0):
f(x) = β^α x^{α−1} e^{−βx} / Γ(α) for x > 0, and f(x) = 0 for x ≤ 0,

where Γ(·) denotes the gamma function, Γ(α) = ∫_{0}^{∞} u^{α−1} e^{−u} du.
Remarks:
– α: shape parameter; β: rate parameter (its reciprocal 1/β is the scale).
– Gamma (1, β) ≡ exp(β).
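The identity Gamma (1, β) ≡ exp(β) is easy to check numerically; note that scipy parameterises gamma and expon by scale = 1/rate, so the rate β of the notes enters as scale=1/β (a sketch with an arbitrary β):

    # Gamma(1, beta) has the same pdf as exp(beta).
    from scipy import stats

    beta = 2.0
    for x in (0.1, 0.5, 1.0, 3.0):
        g = stats.gamma.pdf(x, a=1, scale=1 / beta)
        e = stats.expon.pdf(x, scale=1 / beta)
        assert abs(g - e) < 1e-12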
(iv) Beta distribution, Beta (α, β) (α, β > 0):
f(x) = [Γ(α + β) / (Γ(α) Γ(β))] x^{α−1} (1 − x)^{β−1} for 0 < x < 1, and f(x) = 0 otherwise.

Note: Beta (1, 1) ≡ U [0, 1].


(v) Cauchy distribution:
 
f(x) = 1 / (π [1 + (x − θ)²]),  −∞ < x < ∞,
for any fixed real parameter θ.

(vi) Normal (or Gaussian) distribution, N(µ, σ²):

f(x) = (1/√(2πσ²)) exp(−(x − µ)² / (2σ²)),  −∞ < x < ∞.

Remarks:
– µ is the mean, and σ² is the variance (to be discussed later).
– The pdf f has a bell shape, with centre µ. The bigger σ² is, the more widely spread f is.
– The Central Limit Theorem (CLT) states that in many cases, the average or sum of a
large number of independent¹ random variables is approximately normally distributed.
– The Binomial (n, p) random variable is the sum of n independent Bernoulli random
variables. Thus we should expect, by the CLT, that Binomial (n, p) is approximately normal
for large n. In fact,

Binomial (n, p) ∼ N(np, np(1 − p)) approximately, for large n.

– N(0, 1) is known as the standard normal distribution, i.e. the special case of N(µ, σ²) when
µ = 0 and σ = 1.
The pdf and cdf of N(0, 1) are usually denoted by φ and Φ, respectively:

pdf φ(x) = (1/√(2π)) e^{−x²/2},  cdf Φ(x) = ∫_{−∞}^{x} φ(y) dy,  −∞ < x < ∞.

– If X ∼ N(µ, σ²) and a, b are fixed constants, then Y = aX + b ∼ N(aµ + b, a²σ²), i.e.
any linear transformation of a normal random variable is also normal.
Special case: taking a = 1/σ and b = −µ/σ amounts to standardisation of X, resulting
in a standard normal random variable Y = (X − µ)/σ ∼ N(0, 1).
– Many real-life random phenomena obey a normal distribution approximately (due to CLT).
Examples include measurement error, height of a man, fluctuation from nominal quality
in production line, etc.
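A sketch of the normal approximation to the binomial (n, p, and the evaluation points are illustrative; the ±0.5 continuity correction is a standard refinement, not from the notes):

    # Binomial(n, p) vs its CLT approximation N(np, np(1-p)).
    import math
    from scipy import stats

    n, p = 400, 0.3
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))
    for k in (100, 120, 140):
        exact = stats.binom.cdf(k, n, p)
        approx = stats.norm.cdf((k + 0.5 - mu) / sigma)
        assert abs(exact - approx) < 0.01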
(vii) Chi-squared distribution with m degrees of freedom, χ²_m:
if m is a positive integer, χ²_m is the distribution of Z₁² + · · · + Z_m², for independent
standard normal Z₁, . . . , Z_m.
Note: χ²_m ≡ Gamma (m/2, 1/2), and Gamma (α, β) ≡ (1/(2β)) χ²_{2α}.

¹ We shall discuss independent random variables in the next chapter.

(viii) Student's t-distribution with m degrees of freedom, t_m:
t_m is the distribution of Z / √(X/m), for independent Z ∼ N(0, 1) and X ∼ χ²_m.
Remarks:
– t_m is a heavy-tailed version of N(0, 1): t_m approaches N(0, 1) as m → ∞.
– t_1 ≡ Cauchy distribution with centre θ = 0.
(ix) F distribution with parameters (m, n), F_{m,n}:
F_{m,n} is the distribution of (X/m) / (Y/n), for independent X ∼ χ²_m and Y ∼ χ²_n.
Note: F_{1,n} ≡ (t_n)², and (F_{m,n})^{−1} ≡ F_{n,m}.
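The relation F_{1,n} ≡ (t_n)² can be checked through the cdfs, since P(t_n² ≤ x) = P(−√x ≤ t_n ≤ √x); a sketch with arbitrary n and x:

    # F(1, n) agrees with the square of a t(n) variable.
    import math
    from scipy import stats

    n, x = 10, 2.5
    lhs = stats.f.cdf(x, 1, n)
    rhs = stats.t.cdf(math.sqrt(x), n) - stats.t.cdf(-math.sqrt(x), n)
    assert abs(lhs - rhs) < 1e-12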

The following diagrams display the density and distribution functions of examples of the above
continuous random variables.
[Figures: density functions of U[0, 1], exp(1), Gamma (3, 1), Beta (0.8, 1.2), Cauchy (t_1), N(0, 1), χ² (4 d.f.), and F_{5,10}.]
[Figures: distribution functions of U[0, 1], exp(1), Gamma (3, 1), Beta (0.8, 1.2), Cauchy (t_1), N(0, 1), χ² (4 d.f.), and F_{5,10}.]

6.4.9 The normal distribution and the normal-related distributions — χ2m , tm , Fm,n — are useful
for statistical inference such as hypothesis testing, confidence interval construction, regression,
analysis of variance (ANOVA), etc.

§6.5 *** More challenges ***


6.5.1 Let X be a random variable with distribution function F . Let x be a fixed real number. For
n = 1, 2, . . . , define events

A_n = {X ≤ x − 1/n}.
(a) Show that
A1 ⊂ A2 ⊂ · · · and A1 ∪ A2 ∪ · · · = {X < x}.

(b) Show that
P(X < x) = lim_{n→∞} F(x − 1/n).

(c) Deduce from (b) that if F is continuous, then P(X = x) = 0 for all x ∈ R.
(d) Give an example of X for which P(X < x) ≠ F(x) for some x.

6.5.2 Define, for a constant λ > 0, a function


F(x) = 1 − e^{−λx} for x > 0, and F(x) = 0 for x ≤ 0.

(a) Verify that F is a distribution function.


(b) Find a density function for F .
(c) Let X be a random variable with distribution function F. Define a new random variable Y as follows.
A fair coin is tossed once, independently of X. Put Y = X if a head turns up and Y = −X
otherwise.
Write down expressions for the distribution and density functions of Y, respectively.

6.5.3 Let f be a function defined by

f(x) = max{0, c(4 − x²)} for x ∈ (−∞, ∞),

where c is some unknown real constant.

(a) Find the value(s) of c such that f is a proper density function.


(b) Suppose now c takes the positive value determined in (a), and X is a random variable distributed with density f. Let F denote the distribution function corresponding to f.
(i) Find F .
(ii) Show that
F (x) + F (−x) = 1 for all real values of x.
(iii) Determine a positive constant a that satisfies

P(|X| ≤ a) = 11a/16.

(iv) Does there exist a positive constant a that satisfies P(|X| ≤ a) = 3a/16? If so, what is it?
