Mat 352 Official Note Construction
Contents

1 Probability Theory
  1.1 Significance of Probability Theory
  1.2 What is a Random Experiment?
    1.2.1 Approaches to Probability
  1.3 Problems
    1.3.1 Problem 1
    1.3.2 Problem 2
    1.3.3 Problem 3
    1.3.4 Problem 4
    1.3.5 Problem 5
  1.4 Probability Measure
  1.5 Addition Theorem of Probability

2 Conditional Probability
  2.0.1 Bayes' Theorem

3 Random Variables
  3.1 Distribution Function of a Random Variable
    3.1.1 Discrete Random Variables
    3.1.2 Continuous Random Variables
    3.1.3 Moment Generating Function
  3.2 Types of Discrete Random Variables
    3.2.1 Bernoulli Distribution
    3.2.2 Binomial Distribution
    3.2.3 Poisson Distribution
  3.3 Types of Continuous Random Variables
    3.3.1 Uniform Distribution
    3.3.2 Exponential Random Variable
    3.3.3 Normal Distribution

4.2 The Distribution Function
4.3 Functions of Random Variables

6 Gamma Function
Chapter 1
Probability Theory
1.1 Significance of Probability Theory
Probability theory is a very important area of the mathematical sciences, with applications in diverse fields such as:
(i). Medicine (in diagnosis, in measuring the effectiveness of drugs, and in the study of epidemics).
(ii). Electrical engineering (in signals and robotics).
(iii). Industries for quality control and production.
(iv). Meteorology.
(v). Linguistics.
It is also applied in other areas such as bio-mathematics, finance, physics, insurance, commerce, sociology, population studies, genetics, demography, chemistry (polymers), econometrics, failure of technical devices and machines, errors of mechanisms, morbidity and mortality of populations, meteorology and geostatistics, etc.
Probabilists are those who study probability. The theory of probability has its origin in games of chance related to gambling, e.g. throwing dice or coins, drawing cards from a pack of cards, etc.
1.2 What is a Random Experiment?
Some experiments are deterministic: their outcome is completely determined by the conditions under which they are performed. For example;
(i). For a body moving with initial velocity u and uniform acceleration a, the distance covered in time t obeys the rule
S = ut + (1/2)at²
(ii). For a perfect gas we have Boyle's law, which has the guiding rule
(Pressure × Volume = Constant)  ⟹  V ∝ 1/P
Classical Approach
This follows from the classical definition of the probability of occurrence of a certain event. Here one counts the total number of equally likely outcomes of the random experiment and the number of those outcomes that are favourable to the event of interest; the probability is the ratio of the two. For example, if there are n equally likely outcomes of a random experiment, out of which k are favourable to the result of interest, then the result has a probability of k/n. Thus,
the probability of any event A lies between 0 and 1, i.e. 0 ≤ P(A) ≤ 1. If P(A) = 0 then A is an impossible event, while if P(A) = 1 then A is a certain event, where P(A) is the probability of event A occurring.
If P(A) = p is known as the probability of success, then P(Ā) = 1 − P(A) = q is known as the probability of failure. Therefore, p + q = 1.
The classical definition does not require actual experimentation, and it is also known as the "a priori" or theoretical or mathematical probability of an event.
The classical probability has its shortcomings, and it fails in the following situations:
1. If N, the exhaustive number of outcomes (the sample space) of the random experiment, is infinite.
2. If the various outcomes of the random experiment are not equally likely.
3. If the actual value of N is not known.
Empirical Approach
Suppose we have an irregular, asymmetric die. If the centre of gravity of the die can be changed by adding a tiny amount of lead to a particular face, then the symmetry and fairness (unbiasedness) of the die is lost. The experiment of tossing this die no longer falls under the classical model of equally likely outcomes, and so the classical definition of probability fails for such a random experiment.
Definition: If an experiment is performed repeatedly under essentially homogeneous and identical conditions, then the limiting value of the ratio of the number of times an event occurs to the number of trials, as the number of trials becomes indefinitely large, is called the probability of that event, it being assumed that the limit is finite and unique. Suppose that an event A occurs m times in N repetitions of a random experiment. Then the ratio m/N gives the relative frequency of the event A, and it will not vary appreciably from one trial to another.
In the limiting case, when N becomes sufficiently large, the ratio more or less settles to a number, which is called the probability of A. Symbolically,
P(A) = lim_{N→∞} m/N
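As an added illustration (a small Python sketch, assuming a standard Python 3 interpreter; the event and sample sizes are chosen arbitrarily), the relative frequency m/N of the event "a fair die shows a six" settles near 1/6 as N grows:

    import random

    # Simulate N tosses of a fair die and record the relative frequency m/N of
    # the event A = "the die shows a six"; m/N should settle near 1/6 as N grows.
    random.seed(0)
    for N in (100, 10_000, 1_000_000):
        m = sum(1 for _ in range(N) if random.randint(1, 6) == 6)
        print(f"N = {N:>9}:  m/N = {m / N:.4f}")
    print("Theoretical value: 1/6 =", round(1 / 6, 4))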
Axiomatic Approach
For any finite collection A_1, A_2, . . . , A_n of mutually exclusive events, this approach requires that
P(⋃_{i=1}^{n} A_i) = Σ_{i=1}^{n} P(A_i)
Or we write, for a countable collection of mutually exclusive events,
P(⋃_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i)
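More fully, the axiomatic approach (due to Kolmogorov) takes a probability measure P, defined on the events of a sample space S, to satisfy:
(A1) Non-negativity: P(A) ≥ 0 for every event A.
(A2) Normalisation: P(S) = 1.
(A3) Countable additivity: P(⋃_{i=1}^{∞} A_i) = Σ_{i=1}^{∞} P(A_i) for every sequence A_1, A_2, . . . of mutually exclusive events.
These are the properties appealed to in the proofs of Section 1.5 below.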
1.3 Problems
1.3.1 Problem 1
Four cards are drawn at random from a pack of 52 cards. Find the probability that:
(1). They are a king, a queen, a jack and an ace.
(2). Two are kings and two are aces.
(3). All are diamonds.
(4). Two are red and two are black.
1.3.2 Problem 2
If six dice are rolled, what is the probability that they all show different faces?
1.3.3 Problem 3
Find the probability that in 5 tosses of a perfect coin, the coin turns up heads at least 3 times in succession.
1.3.4 Problem 4
The letters of the word "article" are arranged at random. Find the probability that the vowels occupy the even places.
1.3.5 Problem 5
A family is expecting two special guests. Both guests specified the week of their arrival but not the day, and each day of the week is equally likely for each guest.
(a). Calculate the probability that they will both arrive on the same day.
(b). Calculate the probabilities that they both arrive on Sunday, on Monday, on Tuesday, on Wednesday, on Thursday, on Friday and on Saturday respectively, then add up all these probabilities.
1.4 Probability Measure
A probability measure on a sample space S is a function
P : P(S) −→ [0, 1]
satisfying, in particular,
1. P(S) = 1
Note
1. Two events are said to be mutually exclusive if they can never occur simultaneously. Therefore the sample space is the disjoint union of an event and its complement. The power set P(S) is the set of all subsets of S. That is, if S is the sample space of an experiment, then P(S) is the set of all possible events.
Proof. Let us suppose that a random experiment results in a sample space S with N sample points (the exhaustive number of cases). Then, by definition;
In general, A = A ∪ ∅ with A ∩ ∅ = ∅, so that
P(A) = P(A ∪ ∅) = P(A) + P(∅)  ⟹  P(∅) = P(A) − P(A) = 0
Theorem 1.5.3. If A′ is the complement of an event A, then P(A′ ) = 1 − P(A)
Proof. The sample space S, can be decomposed into the mutually exclusive events A and
A′ . That is S = A ∪ A′ which implies that n(S) = n(A ∪ A′ ). Thus,
n(S)/n(S) = n(A ∪ A′)/n(S)
which implies,
1 = P(S) = P(A ∪ A′ ) = P(A) + P(A′ )
P(A′ ) = 1 − P(A)
Theorem. For any event A, 0 ≤ P(A) ≤ 1.
Proof. From the non-negativity axiom, P(A) ≥ 0 for every event A ⊆ S = Ω. Moreover, for all A ⊆ Ω we have that
P(A) ≤ P(Ω) = 1
Hence, 0 ≤ P(A) ≤ 1.
Proof. DIY
In general,
P(⋃_{i=1}^{n} A_i) = Σ_{i=1}^{n} P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) − · · · + (−1)^{n−1} P(⋂_{i=1}^{n} A_i)
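For instance, with n = 2 the formula reduces to P(A ∪ B) = P(A) + P(B) − P(A ∩ B). As a hypothetical illustration (not one of the listed problems): if one card is drawn from a well-shuffled pack of 52, A = "the card is a heart" and B = "the card is a king", then
P(A ∪ B) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13.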
Chapter 2
Conditional Probability
Consider, for instance, a sample space Ω and an event E ⊆ Ω with P(E) ≠ 0, i.e. P(E) > 0. Suppose it is known that E has occurred. What is the conditional probability of another event A ⊆ Ω, given the fact that E has occurred?
[Venn diagram: a rectangle B representing Ω, containing the events A and E, whose overlap is labelled D = A ∩ E.]
Notice B = Ω; E is the new sample space (with n(E) points), while D = A ∩ E is the relevant part (with n(A ∩ E) points).
The above is a simple illustration of conditional probability. This means that the conditional probability of event A, given that E has occurred, is written as;
P(A | E) = |A ∩ E| / |E| = (|A ∩ E| / |Ω|) ÷ (|E| / |Ω|)
which gives
P(A | E) = P(A ∩ E) / P(E),  P(E) > 0
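For example (an added illustration): suppose a fair die is rolled once, E = {2, 4, 6} is the event that the outcome is even, and A = {4, 5, 6} is the event that the outcome exceeds 3. Given that E has occurred, the new sample space is E, A ∩ E = {4, 6}, and
P(A | E) = |A ∩ E| / |E| = 2/3,
whereas the unconditional probability is P(A) = 3/6 = 1/2.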
Theorem 2.0.1. Given events A, B, C defined on the sample space Ω such that P(B) > 0.
Then;
1. 0 ≤ P(A | B) ≤ 1
2. P(B | B) = 1
3. If A_1, A_2, . . . , A_n are mutually exclusive events, then
P(⋃_{k=1}^{n} A_k | B) = Σ_{k=1}^{n} P(A_k | B)
5. If A ⊆ B, then;
P(A | B) = P(A)/P(B)
Now, since (A ∩ B) ⊆ B, this implies P(A ∩ B) ≤ P(B); then, dividing through by P(B), we have
P(A ∩ B)/P(B) ≤ P(B)/P(B) = 1    (2.2)
Combining [2.1] and [2.2], we see that
0 ≤ P(A | B) ≤ 1
P(⋃_{k=1}^{n} A_k | B) = P((⋃_{k=1}^{n} A_k) ∩ B) / P(B) = P((A_1 ∪ A_2 ∪ A_3 ∪ · · · ∪ A_n) ∩ B) / P(B)
Since the events A_k ∩ B are themselves mutually exclusive, this gives
P(⋃_{k=1}^{n} A_k | B) = [P(A_1 ∩ B) + P(A_2 ∩ B) + P(A_3 ∩ B) + · · · + P(A_n ∩ B)] / P(B)
Thus we conclude,
P(⋃_{k=1}^{n} A_k | B) = Σ_{k=1}^{n} P(A_k | B)
= P(A ∩ C | B)
Proof (5). Since A ⊆ B, it is clear that A ∩ B = A; then observe the following:
P(A | B) = P(A ∩ B)/P(B) = P(A)/P(B)
2.0.1 Bayes' Theorem
If A_1, A_2, . . . , A_n are mutually exclusive events whose union is the sample space, and B is any event with P(B) > 0, then
P(A_i | B) = P(A_i) P(B | A_i) / P(B)
Proof. Let A_i and B be events; then, by the definition of conditional probability,
P(B | A_i) = P(A_i ∩ B) / P(A_i)
which implies,
P(A_i ∩ B) = P(A_i) · P(B | A_i)    (2.3)
Also,
P(A_i | B) = P(A_i ∩ B) / P(B)    (2.4)
But the total probability is given by
P(B) = Σ_{i=1}^{n} P(A_i) · P(B | A_i)
Therefore, substituting (2.3) and this expression for P(B) into (2.4) gives the stated result.
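As a hypothetical worked illustration (not from the lecture examples): suppose box A_1 contains 2 red and 3 blue balls, box A_2 contains 3 red and 1 blue ball, a box is chosen at random and one ball is drawn, and the ball turns out to be red (event B). Then
P(B) = P(A_1)P(B | A_1) + P(A_2)P(B | A_2) = (1/2)(2/5) + (1/2)(3/4) = 23/40,
so by the theorem,
P(A_1 | B) = P(A_1)P(B | A_1) / P(B) = (1/5) / (23/40) = 8/23.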
Chapter 3
Random Variables
A random variable X on a sample space S is a function from the set S into the set ℝ of real numbers such that the pre-image of every interval of ℝ is an event of S. Note that the possible values of a random variable need not be equally likely.
The distribution f of X, where X takes the values x_1, x_2, . . . , x_n, satisfies:
1. f(x_i) ≥ 0
2. Σ_{i=1}^{n} f(x_i) = 1
Random variables with finite or countably infinite image sets are called discrete random variables. For a discrete random variable X, we define the probability mass function p(a) of X by p(a) = P{X = a}. As in the finite case, we make X(S) into a probability space by defining the probability of x_i to be f(x_i) = P(X = x_i) and calling f the distribution of X, i.e.,
as given in [3.1] above.
The expectation (mean) of X is then defined by
E[X] = Σ_i x_i f(x_i) := µ,
when the relevant series converges absolutely. It can be shown (DIY) that Var[X] exists if and only if µ = E[X] and E[X²] both exist, and that in this case the formula Var[X] = E[X²] − µ² holds.
[Figure: graph of a density curve f(x); the shaded area between a and b represents P(a ≤ X ≤ b).]
In this case X is said to be a continuous random variable. The function f is called the distribution, or the continuous probability function (or density function), of X. It satisfies the conditions:
1. f(x) ≥ 0
2. ∫_ℝ f(x) dx = 1
That is, f is non-negative and the total area under its graph is 1. The expectation E[x] is defined by
E[x] = ∫_ℝ x · f(x) dx := µ    (3.5)
when it exists.
The variance Var[x] is defined by Var[x] = E[(x − µ)²], also written as
Var[x] = ∫_ℝ (x − µ)² · f(x) dx    (3.6)
when it exists.
Just as in the discrete case, it can be shown that Var[x] exists if and only if µ = E[x] and E[x²] both exist, and then Var[x] = E[x²] − µ², also written as
Var[x] = ∫_ℝ x² · f(x) dx − µ²    (3.7)
We call Φ(t) the moment generating function because all the moments of x can be obtained
successively by differentiating Φ(t).
For example,
Φ′(t) = d/dt E[e^{tx}] = E[x e^{tx}]  ⟹  Φ′(0) = E[x]
Similarly we have;
Φ″(t) = d/dt Φ′(t) = d/dt E[x e^{tx}] = E[x² e^{tx}]  ⟹  Φ″(0) = E[x²]
In general, the nth derivative of Φ(t) evaluated at t = 0 equals E[x^n]; that is, Φ^{(n)}(0) = E[x^n].
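As an added computational sketch (assuming the sympy library is available; the moment generating function λ/(λ − t) used here is that of the exponential distribution, derived later in these notes), the differentiation-at-zero rule can be checked symbolically:

    import sympy as sp

    # Moment generating function of an exponential random variable, used here
    # only to illustrate the rule Phi^(n)(0) = E[X^n].
    t, lam = sp.symbols('t lam', positive=True)
    Phi = lam / (lam - t)

    first_moment = sp.diff(Phi, t, 1).subs(t, 0)   # Phi'(0)  = E[X]   = 1/lam
    second_moment = sp.diff(Phi, t, 2).subs(t, 0)  # Phi''(0) = E[X^2] = 2/lam^2
    print(sp.simplify(first_moment), sp.simplify(second_moment))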
3.2 Types of Discrete Random Variables
3.2.1 Bernoulli Distribution
A discrete random variable X has a Bernoulli distribution with parameter p (the probability of success), where q = 1 − p, if its probability mass function is P(x) = P(X = x) = p^x q^{1−x} for x = 0, 1.
To show that E[x] = p and σ(x) = √(pq), we proceed by using its moment generating function. Recall from [3.10],
M_X(t) = Σ_x e^{tx} P(x)
Therefore,
M_X(t) = Σ_{x=0}^{1} e^{tx} p^x q^{1−x} = q Σ_{x=0}^{1} e^{tx} p^x q^{−x} = q Σ_{x=0}^{1} (pe^t / q)^x = q (1 + pe^t/q) = pe^t + q
So we conclude that the moment generating function is given by
M_X(t) = pe^t + q
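Differentiating, M′_X(t) = pe^t, so E[x] = M′_X(0) = p; similarly M″_X(t) = pe^t, so E[x²] = M″_X(0) = p. Hence
Var[x] = E[x²] − (E[x])² = p − p² = p(1 − p) = pq  and  σ(x) = √(Var[x]) = √(pq),
as claimed.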
3.2.2 Binomial Distribution
A discrete random variable X has a binomial distribution if its probability mass function is given by
P(x) = P(X = x) := C(n, x) p^x q^{n−x};  x = 0, 1, 2, 3, . . . , n    (3.14)
where C(n, x) = n!/(x!(n − x)!) denotes the binomial coefficient.
The binomial distribution arises when the following conditions are satisfied:
1. The experiment consists of a fixed number n of repeated trials.
2. Only two possible outcomes, "success" and "failure", are possible in each trial.
3. The trials are independent, i.e. the outcome of one trial does not influence the outcome of another.
We shall show that M_X(t) = (pe^t + q)^n, E[x] = np, Var[x] = npq. Note that the binomial distribution is a generalised form of the Bernoulli distribution.
The moment generating function of X ∼ b(n, p), i.e. of X with a binomial distribution in which the probability of success on any of the n trials is p, is given by
M_X(t) = E[e^{tX}] = Σ_{x=0}^{n} e^{tx} C(n, x) p^x (1 − p)^{n−x} = Σ_{x=0}^{n} C(n, x) (pe^t)^x (1 − p)^{n−x}    (3.15)
Since all the individual trials are independent Bernoulli random variables, we then have
M_X(t) = (pe^t + q)(pe^t + q) · · · (pe^t + q)  (n times)  = Π_{i=1}^{n} (pe^t + q) = (pe^t + q)^n    (3.19)
Hence, to determine the mean (expectation), recall E[x] = M′_X(0). Since M_X(t) = (pe^t + q)^n, then M′_X(t) = n(pe^t + q)^{n−1} · pe^t, which implies E[x] = M′_X(0) = np.
To determine the variance / standard deviation, consider M″_X(0) = n(n − 1)p(p + q)^{n−2} p + (p + q)^{n−1} np = n(n − 1)p² + np. So we have that
Var[x] = E[x²] − (E[x])² = n(n − 1)p² + np − (np)² = np − np² = np(1 − p) = npq
So,
σ(x) = √(Var[x]) = √(npq)
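A quick numerical check of these formulas (an added sketch using only the Python standard library; the values n = 10 and p = 0.3 are chosen purely for illustration):

    from math import comb

    # Compute E[X] and Var[X] for X ~ b(n, p) directly from the pmf and compare
    # with the closed forms np and npq derived above.
    n, p = 10, 0.3
    q = 1 - p
    pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

    mean = sum(x * pmf[x] for x in range(n + 1))
    var = sum(x**2 * pmf[x] for x in range(n + 1)) - mean**2
    print(mean, n * p)        # both 3.0 (up to rounding)
    print(var, n * p * q)     # both 2.1 (up to rounding)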
3.2.3 Poisson Distribution
A discrete random variable X has a Poisson distribution with parameter λ > 0 if its probability mass function is given by
P(x) = e^{−λ} λ^x / x!,  x = 0, 1, 2, 3, . . .    (3.21)
The Poisson distribution is an approximation to the binomial distribution when the number of trials n is very large and the probability of success p is very small, in such a way that np = λ, a constant. We shall show that M_X(t) = e^{λ(e^t − 1)} = exp(λ(e^t − 1)), E[x] = λ, Var[x] = λ and σ(x) = √λ.
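A sketch of the computation (added here for completeness):
M_X(t) = Σ_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x! = e^{−λ} Σ_{x=0}^{∞} (λe^t)^x / x! = e^{−λ} e^{λe^t} = e^{λ(e^t − 1)}.
Then M′_X(t) = λe^t e^{λ(e^t − 1)}, so E[x] = M′_X(0) = λ, and M″_X(0) = λ + λ², so
Var[x] = E[x²] − (E[x])² = λ + λ² − λ² = λ,  giving  σ(x) = √λ.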
3.3 Types of Continuous Random Variables
3.3.1 Uniform Distribution
[Figure: graph of the uniform density f_X(x), equal to 1/(b − a) for a ≤ x ≤ b and 0 elsewhere.]
The probability density function for a continuous uniform distribution on the interval [a, b] is given as
P(x) := 1/(b − a) for a ≤ x ≤ b, and P(x) := 0 otherwise.    (3.22)
Its moment generating function is
M_X(t) = (e^{tb} − e^{ta}) / (t(b − a))    (3.23)
Recall that
e^x = 1 + x + x²/2! + x³/3! + · · · = Σ_{n=0}^{∞} x^n / n!    (3.24)
Therefore,
e^{tb} = 1 + tb + (tb)²/2! + (tb)³/3! + · · · = Σ_{n=0}^{∞} (bt)^n / n!    (3.25)
e^{ta} = 1 + ta + (ta)²/2! + (ta)³/3! + · · · = Σ_{n=0}^{∞} (at)^n / n!    (3.26)
On substituting (3.25) and (3.26) into (3.23) and carrying out the simplification of [3.23], we have that
M_X(t) = 1 + ((b + a)/2) t + ((b² + ab + a²)/6) t² + · · · = Σ_{n=1}^{∞} (b^n − a^n) t^{n−1} / ((b − a) n!)    (3.28)
Therefore,
M′_X(t) = (b + a)/2 + (2(b² + ab + a²)/6) t + · · · = Σ_{n=2}^{∞} (n − 1)(b^n − a^n) t^{n−2} / ((b − a) n!)
so that
µ = E[x] = M′_X(0) = (b + a)/2    (3.29)
M″_X(0) = E[x²] = (a² + ab + b²)/3    (3.30)
Therefore,
Var[x] = E[x²] − (E[x])² = (a² + ab + b²)/3 − ((b + a)/2)² = (a² + b² − 2ab)/12 = (b − a)²/12
3.3.2 Exponential Random Variable
A continuous random variable X has an exponential distribution with parameter λ > 0 if its probability density function is f(x) = λe^{−λx} for x ≥ 0, and 0 otherwise. Hence,
M_X(t) = E[e^{tX}] = ∫_0^∞ e^{tx} f(x) dx = ∫_0^∞ e^{tx} · λe^{−λx} dx = λ ∫_0^∞ e^{−x(λ−t)} dx = λ/(λ − t)
So,
M_X(t) = λ/(λ − t)    (3.34)
M′_X(t) = λ/(λ − t)²    (3.35)
M″_X(t) = (2/λ²)(1 − t/λ)^{−3}    (3.36)
Then from [3.35],
µ = E[x] = M′_X(0) = 1/λ
Also from [3.36],
E[x²] = M″_X(0) = 2/λ²
Therefore Var[x] = E[x²] − (E[x])² = 2/λ² − 1/λ² = 1/λ² and σ(x) = 1/λ; then we conclude
Var[x] = 1/λ²    (3.37)
and
σ(x) = 1/λ    (3.38)
Assignment
Determine the moment generating function and use it to compute the mean and variance.
Chapter 4
Definition
The indicator function of the event E is defined as
I_E = I_E(w) := { 1, if w ∈ E;  0, if w ∉ E }    (4.1)
We establish the following properties of indicator functions:
1. I_{B^c} = 1 − I_B
2. I_{A∩B} = I_A · I_B
3. I_{A∪B} = I_A + I_B − I_{A∩B}
Proof. Since
I_B(w) := { 1, on B;  0, on B^c }
it follows that 1 − I_B(w) equals 0 on B and 1 on B^c, which is exactly I_{B^c}(w). Hence,
I_{B^c} = 1 − I_B    (4.3)
On to the next,
I_{A∩B}(w) := { 1, if w ∈ A ∩ B;  0, if w ∉ A ∩ B }    (4.4)
If w ∉ A ∩ B, then exactly one of the following cases holds:
1. w ∈ A and w ∉ B
2. w ∈ B and w ∉ A
3. w ∉ A and w ∉ B
Case 1: I_A(w) = 1 and I_B(w) = 0, so I_A(w) · I_B(w) = 0 = I_{A∩B}(w).
Case 2: I_A(w) = 0 and I_B(w) = 1, so I_A(w) · I_B(w) = 0 = I_{A∩B}(w).
Case 3: I_A(w) = I_B(w) = 0, so I_A(w) · I_B(w) = 0 = I_{A∩B}(w).
On the other hand, if w ∈ A ∩ B, then I_A(w) = I_B(w) = 1 = I_{A∩B}(w). Hence,
I_{A∩B} = I_A · I_B    (4.5)
Now we want to show I_{A∪B} = I_A + I_B − I_{A∩B}. By property (1), I_{A∪B} = 1 − I_{(A∪B)^c}; then we have, from De Morgan's law,
(A ∪ B)^c = A^c ∩ B^c    (4.6)
Therefore I_{A∪B} = 1 − I_{A^c ∩ B^c}. It then follows from property (2) that I_{A^c ∩ B^c} = I_{A^c} · I_{B^c}, which implies that I_{A∪B} = 1 − I_{A^c} · I_{B^c}, with I_{A^c} = 1 − I_A and I_{B^c} = 1 − I_B.
Observe,
IA∪B = 1 − (1 − IA )(1 − IB ) = IA + IB − IA · IB
Note, these results can be extended in an obvious manner to a finite collection of events.
Hence (Prove that),
IA∪B∪C = IA + IB + IC − IA∩B − IA∩C − IB∩C + IA∩B∩C (4.8)
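These identities can also be checked numerically; a small added Python sketch over a hypothetical finite sample space (the sets chosen are arbitrary examples):

    # Treat indicator functions as 0/1 values and verify the identities above
    # on every outcome of a small (hypothetical) sample space.
    omega = set(range(10))
    A = {w for w in omega if w % 2 == 0}   # example event: even outcomes
    B = {w for w in omega if w < 5}        # example event: outcomes below 5

    def ind(E, w):
        """Indicator I_E(w): 1 if w is in E, 0 otherwise."""
        return 1 if w in E else 0

    for w in omega:
        assert ind(omega - A, w) == 1 - ind(A, w)                       # I_{A^c} = 1 - I_A
        assert ind(A & B, w) == ind(A, w) * ind(B, w)                   # I_{A n B} = I_A * I_B
        assert ind(A | B, w) == ind(A, w) + ind(B, w) - ind(A & B, w)   # I_{A u B} = I_A + I_B - I_{A n B}
    print("All three indicator identities hold on this sample space.")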
4.2 The Distribution Function
The cumulative distribution function F(x) = P{X ≤ x} of a random variable X has the following properties: (i) F is a non-decreasing function of x; (ii) lim_{x→∞} F(x) = 1; (iii) lim_{x→−∞} F(x) = 0.
Property (i) follows since, for y < x, the event {X ≤ y} is contained in the event {X ≤ x}, and so it cannot have a larger probability. Properties (ii) and (iii) follow since X must take on some finite value.
All probability questions about X can be answered in terms of the cumulative distribution function F(·). For example, if a, b ∈ ℝ are such that a < b, it follows that
P{a < X ≤ b} = F(b) − F(a).
If we desire the probability that X is strictly smaller than b, we may calculate this probability by
P{X < b} = lim_{h→0⁺} P{X ≤ b − h} = lim_{h→0⁺} F(b − h)
where lim_{h→0⁺} means that we are taking the limit as h decreases to 0. Note that P(X < b) does not necessarily equal F(b), since F(b) also includes the probability that X equals b.
Theorem 4.2.1. Let X be a continuous real-valued random variable with density function f(x). Then the cumulative distribution function of X is given by
F(x) = ∫_{−∞}^{x} f(t) dt    (4.14)
Remark
Example 1
A real number is chosen at random from [0, 1] with uniform probability, and then this number is squared. Let X represent the result. What is the PDF of X?
Solution
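One way to proceed: write U for the number chosen from [0, 1] and X = U². For 0 ≤ x ≤ 1,
F_X(x) = P(X ≤ x) = P(U² ≤ x) = P(U ≤ √x) = √x,
so differentiating gives the PDF
f_X(x) = 1/(2√x) for 0 < x < 1, and f_X(x) = 0 otherwise.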
Example 2
f_X(x) = { (1/9)x², 0 < x < 3;  0, otherwise }
Example 3
Given that
F(x) = { 0, x < 0;  (x + 1)/2, 0 ≤ x ≤ 1;  1, x > 1 }    (4.17)
(a) Sketch its graph. Hence, find (i) P(X = 0), (ii) P(−3 < X < 1/2), (iii) P(X > 1).
Solution
P(X = 0) = 1/2,  P(−3 < X < 1/2) = F(1/2⁻) − F(−3) = (1/2 + 1)/2 − 0 = 3/4,  P(X > 1) = 1 − F(1) = 0
Exercise
(i) Sketch F (x) and hence determine; (a) P(X = 1/2), (b) P(X = 1), (c) P(X = 2).
(ii) The conditional probability that x > 2 given that x > 1
1. The probability mass function (pmf) is a function that gives the probability that a discrete random variable is exactly equal to some value. Suppose that X : S −→ A ⊂ ℝ is a discrete random variable defined on a sample space S. Then the probability mass function f_X : A −→ [0, 1] for X is defined as
f_X(x) = P(X = x)
Also,
Σ_{x∈A} f_X(x) = 1
1. Distribution technique
Examples. Let X have the probability mass function given in the following table, and let Y = X². Find the probability mass function of Y.
x:          −1     0     1
P(X = x):   0.25   0.50  0.25
Solution
x:           −1     0     1
y = x²:       1     0     1
P(X = x):    0.25   0.50  0.25
Hence P(Y = 0) = 0.50 and P(Y = 1) = 0.25 + 0.25 = 0.50.
Now let X have the Poisson probability mass function
F_X(x) = λ^x e^{−λ} / x!,  for x = 0, 1, 2, 3, . . .
a. Define a new random variable Y = 4X; then the space A = {0, 1, 2, . . . } of X maps onto the space of Y, which will be denoted by B = {0, 4, 8, . . . }.
Solution
Now
F_Y(y) = P(Y = y) = P(4X = y) = P(X = y/4)
so that
F_Y(y) = λ^{y/4} e^{−λ} / (y/4)!,  for y = 0, 4, 8, . . .
Exercise
Theorem 4.3.1. Let X be a discrete random variable with probability mass function f_X(x) defined on the support A. Let y = g(x) be a one-to-one transformation that maps A onto B, with inverse transformation x = g^{−1}(y) for all y ∈ B. Then the probability mass function of Y = g(X) is given by
f_Y(y) = f_X(g^{−1}(y)),  y ∈ B
Theorem 4.3.2. Let X be a continuous random variable with pdf f_X(x) defined on the support A. Let Y = g(X) be a one-to-one transformation that maps A onto B, with inverse x = g^{−1}(y) for all y ∈ B, and let dx/dy be continuous and non-zero. Then the pdf of Y = g(X) is
f_Y(y) = f_X(g^{−1}(y)) · |dx/dy|,  y ∈ B
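As an added illustration of Theorem 4.3.2 (using the exponential pdf from Section 3.3.2): let X have pdf f_X(x) = λe^{−λx}, x > 0, and let Y = g(X) = 2X. The inverse transformation is x = g^{−1}(y) = y/2 with dx/dy = 1/2, so
f_Y(y) = f_X(y/2) · |1/2| = (λ/2) e^{−λy/2},  y > 0,
that is, Y is again exponential, with parameter λ/2.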
Chapter 5
Thus far, we have discussed the probability distributions of a single random variable. However, we are often interested in probability statements concerning two or more random variables. To deal with such probabilities, we define, for any two random variables X and Y, the joint cumulative probability distribution function of X and Y by
F(a, b) = P{X ≤ a, Y ≤ b},  −∞ < a, b < ∞.
The distribution of X can be obtained from the joint distribution of X and Y as follows:
F_X(a) = P{X ≤ a} = P{X ≤ a, Y < ∞} = F(a, ∞).
Similarly, the cdf of Y is given by FY (b) = P(Y ≤ b) = F (∞, b). In the case where X
and Y are both discrete random variables, it is convenient to define the joint probability
mass function of X and Y by
p(x, y) = P{X = x, Y = y}
The probability mass function of X can then be obtained from p(x, y) by
p_X(x) = Σ_{y: p(x,y)>0} p(x, y)
Similarly,
p_Y(y) = Σ_{x: p(x,y)>0} p(x, y)
We say X and Y are jointly continuous if there exists a function f(x, y), defined for all real x and y, having the property that for all sets A and B of real numbers
P{X ∈ A, Y ∈ B} = ∫_B ∫_A f(x, y) dx dy
The function f(x, y) is called the joint probability density function of X and Y. The probability density of X can be obtained from a knowledge of f(x, y) by the following reasoning:
P{X ∈ A} = P{X ∈ A, Y ∈ (−∞, ∞)} = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx = ∫_A f_X(x) dx
where f_X(x) = ∫_{−∞}^{∞} f(x, y) dy is therefore the (marginal) probability density function of X.
Example. Let X_1 and X_2 have the joint probability mass function
f(x_1, x_2) = { (x_1 + x_2)/21, x_1 = 1, 2, 3 and x_2 = 1, 2;  0, otherwise }
The marginal pdf of X_1 is
f(x_1) = Σ_{x_2=1,2} (x_1 + x_2)/21
so that
f(x_1) = (2x_1 + 3)/21
Therefore,
f(x_1) = { (2x_1 + 3)/21, x_1 = 1, 2, 3;  0, otherwise }
The marginal pdf of X_2 is
f(x_2) = Σ_{x_1=1,2,3} (x_1 + x_2)/21 = (3x_2 + 6)/21
Therefore,
f(x_2) = { (3x_2 + 6)/21, x_2 = 1, 2;  0, otherwise }
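The two marginals can also be checked with a few lines of Python (an added verification using exact fractions; only the standard library is assumed):

    from fractions import Fraction

    # Joint pmf f(x1, x2) = (x1 + x2)/21 on x1 = 1, 2, 3 and x2 = 1, 2.
    f = {(x1, x2): Fraction(x1 + x2, 21) for x1 in (1, 2, 3) for x2 in (1, 2)}
    assert sum(f.values()) == 1                     # the joint pmf sums to 1

    f1 = {x1: sum(f[x1, x2] for x2 in (1, 2)) for x1 in (1, 2, 3)}      # marginal of X1
    f2 = {x2: sum(f[x1, x2] for x1 in (1, 2, 3)) for x2 in (1, 2)}      # marginal of X2
    assert all(f1[x1] == Fraction(2 * x1 + 3, 21) for x1 in (1, 2, 3))  # (2*x1 + 3)/21
    assert all(f2[x2] == Fraction(3 * x2 + 6, 21) for x2 in (1, 2))     # (3*x2 + 6)/21
    print(f1, f2)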
5.1 Stochastically Independent
5.1.1 Definition
Let the random variables X and Y have joint pdf f(x, y) and marginal pdfs f_1(x) and g(y), respectively. The random variables X and Y are said to be stochastically independent if and only if
f(x, y) = f_1(x) g(y)
where
f_1(x) = ∫_{−∞}^{∞} f(x, y) dy  and  g(y) = ∫_{−∞}^{∞} f(x, y) dx
Note that random variables which are not stochastically independent are said to be dependent.
Example.
Show that f (x1 , x2 ) = 4x1 x2 , 0 < x1 < 1 and 0 < x2 < 1 is a genuine pdf.
SOLUTION
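A sketch of the verification: f(x_1, x_2) = 4x_1x_2 ≥ 0 on the unit square, and
∫_0^1 ∫_0^1 4x_1x_2 dx_1 dx_2 = 4 (∫_0^1 x_1 dx_1)(∫_0^1 x_2 dx_2) = 4 · (1/2) · (1/2) = 1,
so f is a genuine pdf. Moreover, f(x_1, x_2) = (2x_1)(2x_2) = f_1(x_1) g(x_2), so X_1 and X_2 are stochastically independent.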
Assignment
Given that
f(x_1) = (1/(√(2π) σ)) e^{−(1/2)((x_1 − µ)/σ)²}  and  f(x_2) = (1/√(2π)) e^{−(1/2) x_2²}
and that X_1 and X_2 are stochastically independent, show that when X_1 and X_2 are both N(0, 1) the marginal pdf of Y_1 = X_1/X_2 is the Cauchy pdf given by
g(y_1) = 1/(π(1 + y_1²))  for −∞ < y_1 < ∞
The following table gives a simple summary of some probability distributions.

Discrete distributions:
1. Binomial with parameters n, p (0 ≤ p ≤ 1): pmf C(n, x) p^x (1 − p)^{n−x}, x = 0, 1, 2, . . . , n; MGF (pe^t + (1 − p))^n; mean np; variance np(1 − p).
2. Poisson with parameter λ > 0: pmf λ^x e^{−λ}/x!, x = 0, 1, 2, . . .; MGF e^{λ(e^t − 1)}; mean λ; variance λ.
3. Geometric with parameter 0 ≤ p ≤ 1: pmf p(1 − p)^{x−1}, x = 1, 2, . . .; MGF pe^t/(1 − (1 − p)e^t); mean 1/p; variance (1 − p)/p².

Continuous distributions:
4. Uniform over (a, b): pdf f(x) = 1/(b − a) for a < x < b, 0 otherwise; MGF (e^{tb} − e^{ta})/(t(b − a)); mean (a + b)/2; variance (b − a)²/12.
5. Exponential with parameter λ > 0: pdf f(x) = λe^{−λx} for x ≥ 0, 0 for x < 0; MGF λ/(λ − t); mean 1/λ; variance 1/λ².
6. Gamma with parameters (n, λ), λ > 0: pdf f(x) = λe^{−λx}(λx)^{n−1}/(n − 1)! for x ≥ 0, 0 for x < 0; MGF (λ/(λ − t))^n; mean n/λ; variance n/λ².
7. Normal with parameters (µ, σ²): pdf f(x) = (1/(σ√(2π))) e^{−(1/2)((x − µ)/σ)²}; MGF e^{µt + σ²t²/2}; mean µ; variance σ².
Chapter 6
Gamma Function.
For any positive real number p, the gamma function is defined by
Γ(p) = ∫_0^∞ e^{−x} x^{p−1} dx
Then,
Γ(1) = ∫_0^∞ e^{−x} dx = 1
Γ(2) = ∫_0^∞ x e^{−x} dx = 1
Γ(3) = ∫_0^∞ x² e^{−x} dx = 2
In general, for a positive integer n, Γ(n) ≡ (n − 1)!, since
Γ(n) = (n − 1)Γ(n − 1) = (n − 1)(n − 2)(n − 3) · · · (3)(2)(1) = (n − 1)!
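A quick numerical check of these values (an added sketch using Python's standard library only):

    from math import gamma, factorial

    # Gamma(n) should equal (n - 1)! for positive integers n, and the recurrence
    # Gamma(n) = (n - 1) * Gamma(n - 1) should hold.
    for n in range(1, 7):
        assert abs(gamma(n) - factorial(n - 1)) < 1e-9
        if n > 1:
            assert abs(gamma(n) - (n - 1) * gamma(n - 1)) < 1e-9
    print([gamma(n) for n in range(1, 7)])   # [1.0, 1.0, 2.0, 6.0, 24.0, 120.0]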
Assignment
1. Prove that
Γ(1/2) = √π
2. Show that
f(x) = (1/(Γ(α) β^α)) x^{α−1} e^{−x/β},  0 < x < ∞, α > 0, β > 0
where
Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx
is indeed a pdf.
3. Let X and Y be stochastically independent random variables with probability density functions
f(x) = (1/Γ(α)) x^{α−1} e^{−x},  0 < x < ∞, α > 0
g(y) = (1/Γ(β)) y^{β−1} e^{−y},  0 < y < ∞, β > 0
With
U = X + Y,  V = X/(X + Y)
a. Calculate the joint pdf of U and V.
b. Calculate the marginal pdf of V.
c. By integrating the marginal pdf of V over the space of V, show that the beta function
β(α, β) = ∫_0^1 v^{α−1} (1 − v)^{β−1} dv
satisfies
β(α, β) = Γ(α)Γ(β) / Γ(α + β)