
MAT 352

Probability Distributions And Elementary


Limit Theorems

Dr. D.A. DIKKO


Mathematics Department,
University of Ibadan,
Ibadan Nigeria.
Contents

1 Probability Theory
  1.1 Significance of Probability Theory
  1.2 What is a Random Experiment?
    1.2.1 Approaches To Probability
  1.3 Problems
    1.3.1 Problem 1
    1.3.2 Problem 2
    1.3.3 Problem 3
    1.3.4 Problem 4
    1.3.5 Problem 5
  1.4 Probability Measure
  1.5 Addition Theorem Of Probability

2 Conditional Probability
  2.0.1 Bayes' Theorem

3 Random Variables
  3.1 Distribution Function of a Random Variable
    3.1.1 Discrete Random Variables
    3.1.2 Continuous Random Variable
    3.1.3 Moment Generating Function
  3.2 Types of Discrete Random Variables
    3.2.1 Bernoulli Distribution
    3.2.2 Binomial Distribution
    3.2.3 Poisson Distribution
  3.3 Types of Continuous Random Variable
    3.3.1 Uniform Distribution
    3.3.2 Exponential Random Variable
    3.3.3 Normal Distribution

4 Functions of Random Variable
  4.1 Indicator Function
  4.2 The Distribution Function
  4.3 Functions of Random Variables

5 Joint Probability Density Function
  5.1 Stochastically Independent
    5.1.1 Definition

6 Gamma Function
Chapter 1

Probability Theory
1.1 Significance of Probability Theory
Probability theory is a very important area of the mathematical sciences, with applications in diverse fields such as:
(i). Medicine (in diagnosis and in measuring the effectiveness of drugs as well as study of
epidemics).
(ii). Electrical engineering (in signals and robotics).
(iii). Industries for quality control and production.
(iv). Meteorology.
(v). Linguistics.

It is also applied in other areas such as Bio-mathematics, Finance, Physics, Insurance, Commerce, Sociology, Population Studies, Genetics, Demography, Chemistry (Polymers), Econometrics, failure of technical devices and machines, errors of mechanisms, morbidity and mortality of populations, meteorology and geostatistics, etc.
Probabilists are those who study probability. The theory of probability has its origin in games of chance related to gambling, e.g. throwing dice or coins, drawing cards from a pack, etc.

A systematic and scientific foundation for the mathematical theory of probability was laid in the mid-seventeenth century by the French mathematicians Blaise Pascal (1623-1662) and Pierre de Fermat (1601-1665). Today the subject has been developed to such an extent that there is hardly a single discipline where probability theory is not used. It is an essential tool for decision making in the face of uncertainty, with calculated risk.

1.2 What is a Random Experiment?

Experiments may be broadly classified into two groups:

(1). Deterministic or predictable experiments.
(2). Random or unpredictable (probabilistic) experiments.

1. Deterministic Experiments: These are experiments which confirm what is already known or what is actually expected. The result can be predicted with certainty, and the outcome is uniquely determined by the conditions under which the experiment is performed. e.g.:
(i). The distance covered by a moving particle after a time t is given by

S = ut + (1/2)at²

(ii). For a perfect gas we have Boyle's law, with the guiding rule

Pressure × Volume = Constant  =⇒  V ∝ 1/P

provided temperature is constant.

(iii). If zinc is added to dilute sulphuric acid, hydrogen is produced.

2. Random Experiments: This is an experiment which is conducted repeatedly under essentially homogeneous conditions, where the result is not unique but may be any of various possible outcomes. A random experiment has two features:
(a). Its outcome cannot be predicted with certainty;
(b). All possible outcomes can be listed before it is performed.
Such phenomena are frequently observed in economics, business, the social sciences and other day-to-day activities. e.g.:
(i). A single toss of an unbiased (fair) coin.
(ii). The sex of a baby (not in multiple births) cannot be predicted with certainty.
(iii). A producer cannot ascertain the future demand for his/her product with certainty.
(iv). You cannot predict with certainty which country will win AFCON.

1.2.1 Approaches To Probability


There are three basic approaches to probability:
(1). Classical approach
(2). Empirical approach
(3). Axiomatic approach

Classical Approach

This follows from the classical definition of the probability of occurrence of an event. If there are n equally likely outcomes of a random experiment, out of which k are favourable to the event of interest, then the event has probability k/n. Thus,

P(A) = (number of outcomes favourable to A) / (total number of equally likely outcomes)

The probability of any event A lies between 0 and 1, i.e. 0 ≤ P(A) ≤ 1. If P(A) = 0 then A is an impossible event, while if P(A) = 1 then A is a certain event, where P(A) is the probability of event A occurring.
If P(A) is known as the probability of success, p, then 1 − P(A) = P(Ā) = q is known as the probability of failure. Therefore p + q = 1.
The classical definition does not require actual experimentation, and it is also known as the "a priori", theoretical or mathematical probability of an event.

Limitations Of The Classical Approach

The classical definition of probability has its shortcomings, and it fails in the following situations:
1. If N, the exhaustive number of outcomes (the size of the sample space) of the random experiment, is infinite.
2. If the various outcomes of the random experiment are not equally likely.
3. If the actual value of N is not known.

Empirical Approach

Suppose we have an irregular, asymmetric die. If the centre of gravity of the die is changed by adding a tiny amount of lead to a particular face, then the symmetry and fairness (unbiasedness) of the die is lost. The experiment of tossing this die no longer falls under the classical model of equally likely outcomes, and so the classical definition of probability fails for such a random experiment.
Definition: Suppose an experiment is performed repeatedly under essentially homogeneous conditions, and an event A occurs m times in N repetitions of the experiment. The ratio m/N gives the relative frequency of the event A, and it will not vary appreciably from one trial to another. In the limiting case, as N becomes indefinitely large, it more or less settles to a number, assumed finite and unique, which is called the probability of A. Symbolically,

P(A) = lim_{N→∞} m/N

Axiomatic Approach

The modern theory of probability is based on axioms, as introduced by A. N. Kolmogorov in the 1930s. Here concepts are laid down based on certain properties or postulates (axioms), and the theory is developed entirely by logical deduction.
The axiomatic definition of probability includes both the classical and empirical definitions of probability, and at the same time it is free from their drawbacks.

Definition of Some Terms

1. An experiment: This is a process whose outcome can be observed.
2. An event: This is defined as a subset of a sample space.
3. A sample space: This is the collection of all possible outcomes of a random experiment. The individual outcomes of an experiment are known as sample points.
4. Given the sample space S of a random experiment, the probability of occurrence of an event E is defined as a set function P(E) satisfying the following axioms:
(i). Axiom of non-negativity: for every E, P(E) ≥ 0.
(ii). Axiom of certainty: for the sure event, P(S) = 1.
(iii). Axiom of additivity: for mutually exclusive events E and F,

P(E or F) = P(E) + P(F)

Hence if A1, A2, . . . , An is any finite sequence of disjoint events in S, then

P(A1 ∪ A2 ∪ · · · ∪ An) = Σ_{i=1}^{n} P(Ai)

or, for an infinite sequence of disjoint events, we write

P(∪_{i=1}^{∞} Ai) = Σ_{i=1}^{∞} P(Ai)

1.3 Problems
1.3.1 Problem 1
Four cards are drawn at random from a pack of 52 cards. Find the probability that;
(1). They are, a king, a queen, a jack and an ace.
(2). Two are kings and two are aces.
(3). All are diamonds.
(4). Two are red and two are black.

1.3.2 Problem 2
If six dice are rolled, what is the probability that they all show different faces?

1.3.3 Problem 3
Find the probability that in 5 tosses of a perfect coin, the coin turns up heads at least 3 times in succession.

1.3.4 Problem 4
The letters of the word "article" are arranged at random. Find the probability that the vowels occupy the even places.

1.3.5 Problem 5
A family is expecting two special guests. Each guest will arrive on some day of the week, each day being equally likely.
(a). Calculate the probability that they will both arrive on the same day.
(b). Calculate the probability that they both arrive on Sunday, on Monday, on Tuesday, on Wednesday, on Thursday, on Friday, and on Saturday respectively, then add up all these probabilities.
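Several of the problems above can be checked by direct counting. The following sketch (not part of the original notes) computes exact answers for Problems 1, 2 and 4 from the classical formula, and brute-forces Problem 3 over all 2⁵ coin sequences:

```python
from fractions import Fraction
from itertools import product, permutations
from math import comb, factorial

total = comb(52, 4)  # equally likely ways to draw 4 cards from 52

# Problem 1(1): a king, a queen, a jack and an ace (4 choices of each rank)
p1 = Fraction(4 * 4 * 4 * 4, total)
# Problem 1(2): two kings and two aces
p2 = Fraction(comb(4, 2) * comb(4, 2), total)
# Problem 1(3): all four cards are diamonds
p3 = Fraction(comb(13, 4), total)
# Problem 1(4): two red and two black cards
p4 = Fraction(comb(26, 2) * comb(26, 2), total)

# Problem 2: six dice all show different faces
p_dice = Fraction(factorial(6), 6**6)

# Problem 3: at least 3 heads in succession in 5 tosses (enumerate all 32 outcomes)
runs = sum(1 for seq in product("HT", repeat=5) if "HHH" in "".join(seq))
p_runs = Fraction(runs, 2**5)

# Problem 4: vowels of "article" (a, i, e) occupy the even places (2nd, 4th, 6th)
vowels = set("aie")
favourable = sum(
    1 for perm in permutations("article")
    if all(perm[i] in vowels for i in (1, 3, 5))  # 0-indexed even places
)
p_vowels = Fraction(favourable, factorial(7))

print(p1, p2, p3, p4, p_dice, p_runs, p_vowels)
```

For instance, Problem 2 reduces to 6!/6⁶ = 5/324 and Problem 4 to (3! · 4!)/7! = 1/35.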

1.4 Probability Measure


A probability measure on a sample space S is a function

P : P(S) −→ [0, 1]

with the properties:

1. P(S) = 1

2. P(A1 ∪ A2 ∪ A3 ∪ · · · ) = P(A1) + P(A2) + P(A3) + · · ·

whenever A1, A2, A3, . . . are pairwise mutually exclusive.

Note

1. Two events are said to be mutually exclusive if they can never occur simultaneously.

2. For any two mutually exclusive events A and B, A ∩ B = ∅; in particular A ∩ A′ = ∅ and A ∪ A′ = S.

Therefore the sample space is the disjoint union of an event and its complement. The power set P(S) is the set of all subsets of S. That is, if S is the sample space of an experiment, then P(S) is the set of all possible events.

1.5 Addition Theorem Of Probability

Theorem 1.5.1. For two events A and B, the probability that at least one of them occurs is given by

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

Proof. Suppose a random experiment results in a sample space S with N sample points (the exhaustive number of cases). Then by definition,

P(A ∪ B) = n(A ∪ B)/n(S) = { [n(A) − n(A ∩ B)] + n(A ∩ B) + [n(B) − n(A ∩ B)] } / N

= n(A)/N + n(B)/N − n(A ∩ B)/N = P(A) + P(B) − P(A ∩ B)

Thus, since P(A ∩ B) ≥ 0, it follows directly from Theorem 1.5.1 that

P(A ∪ B) ≤ P(A) + P(B)

In general,

P(A1 ∪ A2 ∪ A3 ∪ · · · ∪ An) ≤ P(A1) + P(A2) + P(A3) + · · · + P(An)

Theorem 1.5.2. If ∅ is an empty set, then P(∅) = 0

Proof. Since A and ∅ are disjoint sets, then;

P(A ∪ ∅) = P(A) + P(∅)

which implies,
P(A) = P(A) + P(∅) =⇒ P(∅) = P(A) − P(A) = 0

Which ends the proof.

Theorem 1.5.3. If A′ is the complement of an event A, then P(A′ ) = 1 − P(A)

Proof. The sample space S can be decomposed into the mutually exclusive events A and A′. That is, S = A ∪ A′, which implies n(S) = n(A ∪ A′). Thus,

n(S)/n(S) = n(A ∪ A′)/n(S)

which implies

1 = P(S) = P(A ∪ A′) = P(A) + P(A′)

since A and A′ are disjoint. Thus we can conclude

P(A′) = 1 − P(A)

Theorem 1.5.4. For any event A,

0 ≤ P(A) ≤ 1

Proof. From the non-negativity axiom, P(A) ≥ 0 for every event A ⊆ S = Ω. Moreover, for all A ⊆ Ω we have

P(A) ≤ P(Ω) = 1

by monotonicity and the axiom of certainty. We conclude that

0 ≤ P(A) ≤ 1

Theorem 1.5.5. Show that

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(C ∩ B) − P(B ∩ A) − P(A ∩ C) + P(A ∩ B ∩ C)

Proof. DIY

In general,

P(∪_{i=1}^{n} Ai) = Σ_{i} P(Ai) − Σ_{i<j} P(Ai ∩ Aj) + Σ_{i<j<k} P(Ai ∩ Aj ∩ Ak) − · · · + (−1)^{n−1} P(A1 ∩ A2 ∩ · · · ∩ An)

The above result can be proved using mathematical induction.
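The inclusion-exclusion formula can also be verified numerically. This sketch (an illustration, not part of the original notes) compares the direct probability of a union with the alternating sum, for arbitrary events on a small uniform sample space:

```python
from itertools import combinations
from fractions import Fraction

def prob(event, omega):
    # uniform (classical) probability on a finite sample space
    return Fraction(len(event), len(omega))

def union_prob_via_ie(events, omega):
    # inclusion-exclusion: alternating sum over all non-empty sub-families
    total = Fraction(0)
    for r in range(1, len(events) + 1):
        sign = (-1) ** (r - 1)
        for group in combinations(events, r):
            inter = set.intersection(*group)
            total += sign * prob(inter, omega)
    return total

omega = set(range(12))
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {0, 4, 6, 7, 8}

direct = prob(A | B | C, omega)
via_ie = union_prob_via_ie([A, B, C], omega)
print(direct, via_ie)  # the two values must agree
```

Here both computations give 9/12 = 3/4, as expected.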

Chapter 2

Conditional Probability

Consider, for instance, a sample space Ω and an event E ⊆ Ω with P(E) ̸= 0, i.e. P(E) > 0. Suppose it is known that E has occurred. What is the conditional probability of another event A ⊆ Ω, given the fact that E has occurred?

(Venn diagram: Ω is the whole sample space, E is the new sample space, with n(E) sample points, and D = A ∩ E is the relevant part, with n(A ∩ E) sample points.)

The above is a simple illustration of conditional probability. The conditional probability of event A, given that E has occurred, is written as

P(A | E) = |A ∩ E| / |E| = (|A ∩ E| / |Ω|) ÷ (|E| / |Ω|)

where | · | denotes the cardinality of a set. Therefore,

P(A | E) = P(A ∩ E) / P(E),   P(E) > 0

Theorem 2.0.1. Given events A, B, C defined on the sample space Ω such that P(B) > 0. Then:

1. 0 ≤ P(A | B) ≤ 1

2. P(B | B) = 1

3. For mutually exclusive events A1, A2, A3, . . . , An,

P(∪_{k=1}^{n} Ak | B) = Σ_{k=1}^{n} P(Ak | B)

4. P(C | A ∩ B) · P(A | B) = P(A ∩ C | B)

5. If A ⊆ B, then

P(A | B) = P(A) / P(B)

6. If the sample space Ω is such that Ω = ∪_{k=1}^{n} Ak, where Ak ∩ Aj = ∅ for k ̸= j and P(Ak) > 0 for all k, then

P(C) = Σ_{k=1}^{n} P(C | Ak) · P(Ak)

Note that such A1, A2, A3, . . . , An are said to form a partition of Ω.

Proof (1). To show 0 ≤ P(A | B) ≤ 1, it follows from the definition of conditional probability that

P(A | B) = P(A ∩ B) / P(B),   P(B) > 0

From the non-negativity axiom of probability, P(A ∩ B) ≥ 0; dividing through by P(B) > 0 gives

P(A ∩ B)/P(B) ≥ 0/P(B)

Thus it follows immediately that

P(A | B) ≥ 0   (2.1)

Now (A ∩ B) ⊆ B, which implies P(A ∩ B) ≤ P(B); dividing through by P(B), we have

P(A ∩ B)/P(B) ≤ P(B)/P(B)

which implies that

P(A | B) ≤ 1   (2.2)

Combining (2.1) and (2.2), we see that

0 ≤ P(A | B) ≤ 1

Proof (2). Now observe,

P(B | B) = P(B ∩ B)/P(B) = P(B)/P(B) = 1

which concludes the proof.

Proof (3). From the definition of conditional probability, we know that

P(∪_{k=1}^{n} Ak | B) = P((A1 ∪ A2 ∪ · · · ∪ An) ∩ B) / P(B)

= P((A1 ∩ B) ∪ (A2 ∩ B) ∪ · · · ∪ (An ∩ B)) / P(B)

If {Ak} are mutually exclusive, i.e. Ai ∩ Aj = ∅ for all i ̸= j, then (Ai ∩ B) ∩ (Aj ∩ B) = Ai ∩ Aj ∩ B = ∅ for all i ̸= j. Therefore {Ak ∩ B} are also mutually exclusive events, and the probability of their union is the sum of their probabilities. That is,

P(∪_{k=1}^{n} Ak | B) = [P(A1 ∩ B) + P(A2 ∩ B) + · · · + P(An ∩ B)] / P(B)

= P(A1 ∩ B)/P(B) + P(A2 ∩ B)/P(B) + · · · + P(An ∩ B)/P(B)

= P(A1 | B) + P(A2 | B) + · · · + P(An | B)

Thus we conclude,

P(∪_{k=1}^{n} Ak | B) = Σ_{k=1}^{n} P(Ak | B)

Proof (4). Now

P(C | A ∩ B) · P(A | B) = [P(C ∩ (A ∩ B)) / P(A ∩ B)] · [P(A ∩ B) / P(B)] = P((C ∩ A) ∩ B) / P(B) = P(A ∩ C | B)

which shows that

P(C | A ∩ B) · P(A | B) = P(A ∩ C | B)

Proof (5). Since A ⊆ B, it is clear that A ∩ B = A; then observe that

P(A | B) = P(A ∩ B)/P(B) = P(A)/P(B)

which concludes the proof.

Proof (6). DIY

2.0.1 Bayes' Theorem

Bayes' theorem was proposed by Reverend Thomas Bayes (1702-1761), an English mathematician and philosopher. It states the following:

Theorem 2.0.2. Let a sample space S be partitioned into n mutually exclusive and exhaustive events A1, A2, A3, . . . , An with P(Ai) ̸= 0 for i = 1, 2, 3, . . . , n. Let B be an arbitrary event that occurred when the experiment was performed.
Then P(B) = Σ_{i=1}^{n} P(Ai) P(B | Ai) and

P(Ai | B) = P(Ai) P(B | Ai) / P(B)

Proof. By the definition of conditional probability,

P(B | Ai) = P(Ai ∩ B) / P(Ai)

which implies

P(Ai ∩ B) = P(Ai) · P(B | Ai)   (2.3)

Also,

P(Ai | B) = P(Ai ∩ B) / P(B)   (2.4)

But the total probability is given by

P(B) = P(A1 ∩ B) + P(A2 ∩ B) + · · · + P(An ∩ B)   (2.5)

Hence using equations (2.3) and (2.5) we have

P(B) = P(A1) · P(B | A1) + P(A2) · P(B | A2) + · · · + P(An) · P(B | An)

Therefore,

P(B) = Σ_{i=1}^{n} P(Ai) · P(B | Ai)

We can rewrite (2.4) in the form

P(Ai | B) = P(Ai) · P(B | Ai) / Σ_{i=1}^{n} P(Ai) · P(B | Ai)
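Bayes' theorem can be put to work directly. The following sketch (with hypothetical numbers, not from the original notes) uses a classic set-up: three machines form the partition A1, A2, A3, and B is the event that an item is defective:

```python
from fractions import Fraction

def posterior(priors, likelihoods, i):
    # Bayes' theorem: P(Ai | B) = P(Ai) P(B | Ai) / sum_j P(Aj) P(B | Aj)
    total = sum(p * l for p, l in zip(priors, likelihoods))  # total probability P(B)
    return priors[i] * likelihoods[i] / total

# Hypothetical example: machines A1, A2, A3 produce 50%, 30%, 20% of all items,
# with defect rates P(B | Ai) of 2%, 3% and 5% respectively.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 50), Fraction(3, 100), Fraction(1, 20)]

# Given that an item is defective (event B), probability it came from machine A1:
posterior1 = posterior(priors, likelihoods, 0)
print(posterior1)  # 10/29
```

Here P(B) = 29/1000 by the total probability formula, so P(A1 | B) = (1/100)/(29/1000) = 10/29; the posteriors over all three machines necessarily sum to 1.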

Chapter 3

Random Variables

A random variable X on a sample space S is a function from the set S into the set R of real numbers such that the pre-image of every interval of R is an event of S. Note that the possible values of a random variable are not necessarily equally likely.

3.1 Distribution Function of a Random Variable


Let X be a random variable on a sample space with a finite image set, say

X(S) := {x1, x2, x3, . . . , xn}

We make X(S) into a probability space by defining the probability of xi to be P(X = xi), which we write as f(xi). This function f on X(S), i.e. defined by f(xi) = P(X = xi), is called the distribution or probability function of X, and it is usually given in the form of a table:

xi     | x1     | x2     | · · · | xn
f(xi)  | f(x1)  | f(x2)  | · · · | f(xn)        (3.1)

The distribution function satisfies the following conditions:

1. f(xi) ≥ 0

2. Σ_{i=1}^{n} f(xi) = 1

3.1.1 Discrete Random Variables


Suppose X is a random variable on S with a countably infinite image set, say

X(S) := {x1, x2, x3, . . . }

Such random variables, together with those having finite image sets, are called discrete random variables. For a discrete random variable X, we define the probability mass function p(a) of X by p(a) = P{X = a}. As in the finite case, we make X(S) into a probability space by defining the probability of xi to be f(xi) = P(X = xi) and call f the distribution of X, i.e. as given in (3.1) above.

The expectation E[X] and variance Var[X] are defined by

E[X] = x1 f(x1) + x2 f(x2) + · · · + xn f(xn) + · · · = Σ_{i=1}^{∞} xi f(xi) = µ   (3.2)

Var[X] = (x1 − µ)² f(x1) + (x2 − µ)² f(x2) + · · · = Σ_{i=1}^{∞} (xi − µ)² f(xi)   (3.3)

whenever the relevant series converges absolutely. It can be shown (DIY) that Var[X] exists if and only if µ = E[X] and E[X²] both exist, and that in this case

Var[X] = E[X²] − (E[X])² = E[X²] − µ²

Note that the standard deviation is given by

σ_X = √(Var[X])
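The formulas above translate directly into a computation. This sketch (an illustration, not part of the original notes) evaluates E[X], E[X²] and Var[X] for a small discrete distribution, using a fair die as the example:

```python
from fractions import Fraction

def mean_and_variance(dist):
    # dist: list of (value, probability) pairs for a discrete random variable
    assert sum(p for _, p in dist) == 1   # probabilities must sum to 1
    mu = sum(x * p for x, p in dist)                  # E[X] = sum x f(x)
    ex2 = sum(x * x * p for x, p in dist)             # E[X^2]
    return mu, ex2 - mu * mu                          # Var[X] = E[X^2] - mu^2

# Example: a fair six-sided die
die = [(k, Fraction(1, 6)) for k in range(1, 7)]
mu, var = mean_and_variance(die)
print(mu, var)  # 7/2 and 35/12
```

The exact fractions (rather than floats) make it easy to check the shortcut formula Var[X] = E[X²] − µ² against the definition.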

3.1.2 Continuous Random Variable


Random variables which are not discrete are said to be continuous. The values of these random variables continuously fill certain intervals on the real axis.
Recall from the definition of a random variable that the set {a ≤ X ≤ b} is an event in S, and therefore the probability P(a ≤ X ≤ b) is well defined. We assume that there is a piecewise continuous function f : R −→ R such that P(a ≤ X ≤ b) is equal to the area under the graph of f between x = a and x = b.

(Figure: the graph of a density f(x), with the shaded region between x = a and x = b representing P(a ≤ X ≤ b).)

P(a ≤ X ≤ b) = area of the shaded region, which shows that

P(a ≤ X ≤ b) := ∫_a^b f(x) dx   (3.4)

In this case X is said to be a continuous random variable. The function f is called the continuous probability function or density function of X. It satisfies the conditions:

1. f(x) ≥ 0

2. ∫_R f(x) dx = 1

That is, f is non-negative and the total area under its graph is 1. The expectation E[X] is defined by

E[X] = ∫_R x · f(x) dx := µ   (3.5)

when it exists.
The variance Var[X] is defined by Var[X] = E[(X − µ)²], that is,

Var[X] = ∫_R (x − µ)² · f(x) dx   (3.6)

when it exists.
Just as in the discrete case, it can be shown that Var[X] exists if and only if µ = E[X] and E[X²] both exist, and then Var[X] = E[X²] − µ², that is,

Var[X] = (∫_R x² · f(x) dx) − µ²   (3.7)

Thus we have the standard deviation given as

σ(X) = √(Var[X])   (3.8)

when Var[X] exists.

3.1.3 Moment Generating Function


A powerful mathematical device called the generating function was invented by the great, prolific mathematician Euler (1707-1783), originally to study a problem in number theory.
The moment generating function MX(t) of a random variable X is defined for all values of t by

MX(t) = E[e^{tX}]   (3.9)

Notice from (3.9) we have

MX(t) = Φ(t) := { Σ_x e^{tx} p(x),            if X is discrete
                { ∫_{−∞}^{∞} e^{tx} f(x) dx,  if X is continuous     (3.10)

We call Φ(t) the moment generating function because all the moments of X can be obtained successively by differentiating Φ(t).
For example,

Φ′(t) = (d/dt) E[e^{tX}] = E[X e^{tX}]  =⇒  Φ′(0) = E[X]

Similarly we have

Φ″(t) = (d/dt) Φ′(t) = (d/dt) E[X e^{tX}] = E[X² e^{tX}]  =⇒  Φ″(0) = E[X²]

In general, the nth derivative of Φ(t) evaluated at t = 0 equals E[X^n], that is,

Φ^(n)(0) = E[X^n],   n ≥ 1   (3.11)

3.2 Types of Discrete Random Variables


3.2.1 Bernoulli Distribution
If a discrete random variable X takes only two possible values, 0 and 1, with probabilities q and p respectively, then the probability distribution of X is given by

P(x) = P(X = x) := p^x q^(1−x);   x = 0, 1   (3.12)

To show that E[X] = p and σ(X) = √(pq), we proceed by using its moment generating function. Recall from (3.10),

MX(t) = Σ_x e^{tx} P(x)

Then using (3.12), we have

MX(t) = Σ_{x=0}^{1} e^{tx} p^x q^{1−x} = e^0 p^0 q + e^t p q^0 = q + pe^t

So we conclude that the moment generating function is given by

MX(t) = pe^t + q   (3.13)

Since E[X] = M′X(0) = p, and similarly E[X²] = M″X(0) = p, we have

Var[X] = E[X²] − (E[X])² = p − p² = p(1 − p) = pq,   and   σ(X) = √(Var[X]) = √(pq)

3.2.2 Binomial Distribution
A discrete random variable X has a binomial distribution if its probability mass function is given by

P(x) = P(X = x) := C(n, x) p^x q^(n−x);   x = 0, 1, 2, 3, . . . , n   (3.14)

where C(n, x) = n!/(x!(n − x)!), provided the following conditions hold:

1. There is a fixed number (n) of trials.

2. Only two outcomes, "success" and "failure", are possible in each trial.

3. The trials are independent, i.e. the outcome of one trial does not influence the outcome of another.

4. The probability of success is constant from trial to trial.

We shall show that MX(t) = (pe^t + q)^n, E[X] = np and Var[X] = npq. Note that the binomial distribution is a generalised form of the Bernoulli distribution.

Moment Generating Function of a Binomial Distribution

Let X ∼ b(n, p), i.e. X has a binomial distribution where the probability of success on any of the n trials is p. Then the moment generating function is given by

MX(t) = E[e^{tX}] = Σ_{x=0}^{n} e^{tx} C(n, x) p^x (1 − p)^{n−x} = Σ_{x=0}^{n} C(n, x) (pe^t)^x (1 − p)^{n−x}   (3.15)

Comparing this with the general form of the binomial expansion

(a + b)^n = Σ_{x=0}^{n} C(n, x) a^x b^{n−x}   (3.16)

and setting a = pe^t and b = 1 − p = q in (3.16), we have

MX(t) = (pe^t + q)^n   (3.17)

Alternatively, since MX(t) = E[e^{tX}] = E[e^{t Σx}], where X is the sum of the Bernoulli random variables x1, x2, x3, . . . , xn, it follows directly that

MX(t) = E[e^{t(x1+x2+x3+···+xn)}] = E[e^{tx1} · e^{tx2} · e^{tx3} · · · e^{txn}]   (3.18)

Since all the individual trials are independent Bernoulli random variables, we have

MX(t) = (pe^t + q)(pe^t + q) · · · (pe^t + q)   (n times)  = (pe^t + q)^n   (3.19)

Hence, to determine the mean (expectation), recall E[X] = M′X(0). Since MX(t) = (pe^t + q)^n, we have M′X(t) = n(pe^t + q)^{n−1} · pe^t, which implies E[X] = M′X(0) = np (using p + q = 1).
To determine the variance and standard deviation, consider

M″X(0) = n(n − 1)p²(p + q)^{n−2} + np(p + q)^{n−1} = n(n − 1)p² + np

So we have

E[X²] := M″X(0) = n²p² − np² + np   (3.20)

Then

Var[X] = E[X²] − (E[X])² = n²p² − np² + np − n²p² = np − np² = np(1 − p) = npq

So

σ(X) = √(Var[X]) = √(npq)
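The closed forms E[X] = np and Var[X] = npq can be checked against the pmf directly. This sketch (an illustration, not part of the original notes) sums the moments term by term for one choice of n and p:

```python
from math import comb

def binomial_moments(n, p):
    # mean and variance computed directly from the pmf P(x) = C(n, x) p^x q^(n-x)
    q = 1 - p
    pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(pmf))
    ex2 = sum(x * x * px for x, px in enumerate(pmf))
    return mean, ex2 - mean**2

n, p = 10, 0.3
mean, var = binomial_moments(n, p)
print(mean, var)  # should match np = 3.0 and npq = 2.1
```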

3.2.3 Poisson Distribution


A random variable X is said to have a Poisson distribution if

P(x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, 3, . . .   (3.21)

The Poisson distribution is an approximation to the binomial distribution when the number of trials n is very large and the probability of success is very small, such that np = λ, a constant. We shall show that MX(t) = e^{λ(e^t − 1)}, i.e. exp(λ(e^t − 1)), E[X] = λ, Var[X] = λ and σ(X) = √λ.
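The approximation claim (binomial → Poisson as n grows with np = λ fixed) can be seen numerically. This sketch (an illustration, not part of the original notes) measures the largest pointwise gap between the two pmfs for λ = 2:

```python
from math import comb, exp, factorial

def binom_pmf(n, p, x):
    # binomial pmf: C(n, x) p^x (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(lam, x):
    # Poisson pmf: e^(-lam) lam^x / x!
    return exp(-lam) * lam**x / factorial(x)

# Keep np = lambda = 2 fixed while n grows; the binomial pmf approaches the Poisson pmf
lam = 2.0
gaps = {}
for n in (10, 100, 1000):
    p = lam / n
    gaps[n] = max(abs(binom_pmf(n, p, x) - poisson_pmf(lam, x)) for x in range(8))
print(gaps)  # the largest pointwise gap shrinks as n increases
```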

3.3 Types of Continuous Random Variable


In the following we state some examples of continuous random variables. They follow similar principles as in the discrete case.

3.3.1 Uniform Distribution

A uniform distribution (sometimes known as a rectangular distribution) is a distribution that has constant probability density.

(Figure: the density fX(x) is the constant 1/(b − a) on the interval [a, b] and 0 elsewhere.)

The probability density function for a continuous uniform distribution on the interval [a, b] is given as

P(x) := { 1/(b − a),  for a ≤ x ≤ b
        { 0,          otherwise          (3.22)

Moment Generating Function

Recall MX(t) = E[e^{tX}]; then

MX(t) = ∫_a^b e^{tx} · 1/(b − a) dx = 1/(b − a) ∫_a^b e^{tx} dx = [e^{tb} − e^{ta}] / (t(b − a))

So we have the moment generating function below, given as

MX(t) = (e^{tb} − e^{ta}) / (t(b − a))   (3.23)

Recall that

e^x = 1 + x + x²/2! + x³/3! + · · · = Σ_{n=0}^{∞} x^n / n!   (3.24)

Therefore,

e^{tb} = 1 + tb + (tb)²/2! + (tb)³/3! + · · · = Σ_{n=0}^{∞} (bt)^n / n!   (3.25)

e^{ta} = 1 + ta + (ta)²/2! + (ta)³/3! + · · · = Σ_{n=0}^{∞} (at)^n / n!   (3.26)

Subtracting (3.26) from (3.25), we have

e^{tb} − e^{ta} = t(b − a) + (t²/2!)(b² − a²) + (t³/3!)(b³ − a³) + · · · = Σ_{n=1}^{∞} t^n (b^n − a^n) / n!   (3.27)

Also recall that b² − a² = (b − a)(b + a) and b³ − a³ = (b − a)(b² + ab + a²). Then on simplification of (3.23), we have

MX(t) = 1 + ((b + a)/2) t + ((b² + ab + a²)/6) t² + · · · = Σ_{n=1}^{∞} t^{n−1} (b^n − a^n) / ((b − a) n!)   (3.28)

Therefore,

M′X(t) = (b + a)/2 + (2(b² + ab + a²)/6) t + · · ·

Now when t = 0 we have

µ = E[X] = M′X(0) = (b + a)/2   (3.29)

Then, differentiating once more and setting t = 0,

M″X(0) = E[X²] = (a² + ab + b²)/3   (3.30)

Therefore,

Var[X] = E[X²] − (E[X])² = (a² + ab + b²)/3 − ((b + a)/2)² = (a² + b² − 2ab)/12 = (b − a)²/12

Then we conclude that

Var[X] = (b − a)²/12   (3.31)

Thus,

σ(X) = √(Var[X]) = √((b − a)²/12) = (b − a)/(2√3)

Finally,

σ(X) = (b − a)/(2√3)   (3.32)
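The moment integrals E[X] = ∫ x f(x) dx and E[X²] = ∫ x² f(x) dx for the uniform density can be approximated numerically and compared with (3.29) and (3.31). A minimal sketch (not part of the original notes), using a midpoint rule:

```python
def integrate(f, a, b, n=100_000):
    # simple midpoint-rule numerical integration of f over [a, b]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 2.0, 5.0
density = lambda x: 1.0 / (b - a)   # the uniform pdf on [a, b]

mean = integrate(lambda x: x * density(x), a, b)
ex2 = integrate(lambda x: x * x * density(x), a, b)
var = ex2 - mean**2
print(mean, var)  # expect (a+b)/2 = 3.5 and (b-a)^2/12 = 0.75
```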

3.3.2 Exponential Random Variable


The exponential distribution (also known as the negative exponential distribution) is the probability distribution that describes the time between events in a Poisson process, that is, a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless.
The pdf is given as

f(x) := { λe^{−λx},  x ≥ 0, i.e. x ∈ [0, ∞)
        { 0,          otherwise                (3.33)

Hence, for t < λ,

MX(t) = E[e^{tX}] = ∫_0^∞ e^{tx} f(x) dx = ∫_0^∞ e^{tx} · λe^{−λx} dx = λ ∫_0^∞ e^{−x(λ−t)} dx = λ/(λ − t)

So

MX(t) = λ/(λ − t)   (3.34)

M′X(t) = λ/(λ − t)²   (3.35)

M″X(t) = 2λ/(λ − t)³ = (2/λ²)(1 − t/λ)^{−3}   (3.36)

Then from (3.35),

µ = E[X] = M′X(0) = 1/λ

Also from (3.36),

E[X²] = M″X(0) = 2/λ²

Therefore Var[X] = E[X²] − (E[X])² = 2/λ² − 1/λ² = 1/λ² and σ(X) = 1/λ, so we conclude

Var[X] = 1/λ²   (3.37)

and

σ(X) = 1/λ   (3.38)
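The memoryless property mentioned above, P(X > s + t | X > s) = P(X > t), can be checked empirically by simulation. A minimal sketch (not part of the original notes), with arbitrarily chosen λ, s and t:

```python
import random

# Empirical check of the memoryless property: P(X > s + t | X > s) = P(X > t)
random.seed(1)
lam, s, t = 1.5, 0.4, 0.8
samples = [random.expovariate(lam) for _ in range(200_000)]

p_gt_t = sum(x > t for x in samples) / len(samples)
beyond_s = [x for x in samples if x > s]
p_cond = sum(x > s + t for x in beyond_s) / len(beyond_s)
print(p_gt_t, p_cond)  # both should be close to e^(-lam*t)
```

Both estimates settle near e^{−λt}, in line with the survival function P(X > t) = e^{−λt} of the exponential distribution.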

3.3.3 Normal Distribution


The Normal or Gaussian distribution is often used in the natural and social sciences to represent real-valued random variables whose distributions are not known.
The pdf is given as

f(x; µ, σ²) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}   (3.39)

where µ is the mean (expectation), σ is the standard deviation and σ² is the variance.

Assignment

Determine the moment generating function and use it to compute the mean and variance.

Chapter 4

Functions of Random Variable

4.1 Indicator Function


The indicator function is a random variable on the space Ω: it indicates whether or not a particular sample point belongs to a particular event.
Simple as it looks, the indicator function is extremely useful in the proofs of powerful and complicated theorems, such as the celebrated Kolmogorov inequality.

Definition
The indicator function of an event E is defined as

I_E = I_E(w) := { 1,  if w ∈ E
                { 0,  if w ∉ E       (4.1)

Lemma 4.1.1. For any event B, the expectation of its indicator is given by

E[I_B(·)] = P(B)   (4.2)

Proof. Since

I_B(w) := { 0,  on B^c
          { 1,  on B

I_B counts exactly the sample points belonging to B and no others, so

E(I_B) = 1 · P(B) + 0 · P(B^c) = P(B)

Lemma 4.1.2. The following results hold:

1. I_{B^c} = 1 − I_B

2. I_{A∩B} = I_A · I_B

3. I_{A∪B} = I_A + I_B − I_{A∩B}

Proof. By the definition of the indicator function, I_{B^c}(w) = 0 iff I_B(w) = 1, and I_{B^c}(w) = 1 iff I_B(w) = 0, for all w ∈ Ω. Since these are the only possibilities, it follows that

I_{B^c} = 1 − I_B   (4.3)

On to the next,

I_{A∩B}(w) := { 1,  if w ∈ A ∩ B
              { 0,  if w ∉ A ∩ B       (4.4)

Let I_{A∩B}(w) = 0; then w ∉ A ∩ B. This results in 3 cases:

1. w ∈ A and w ∉ B

2. w ∈ B and w ∉ A

3. w ∉ A and w ∉ B

Case 1

w ∈ A and w ∉ B, which implies I_A(w) = 1 and I_B(w) = 0. Therefore I_A(w) · I_B(w) = 1 × 0 = 0.

Case 2

w ∉ A and w ∈ B, which implies I_A(w) = 0 and I_B(w) = 1. Therefore I_A(w) · I_B(w) = 0 × 1 = 0.

Case 3

w ∉ A and w ∉ B, which implies I_A(w) = 0 and I_B(w) = 0. Therefore I_A(w) · I_B(w) = 0 × 0 = 0.

In all cases I_A(w) · I_B(w) = 0 = I_{A∩B}(w).

Similarly, let I_{A∩B}(w) = 1, which implies w ∈ A ∩ B. Thus w ∈ A and w ∈ B, and it follows directly that I_A(w) = I_B(w) = 1, which implies I_{A∩B}(w) = I_A(w) · I_B(w) = 1 × 1 = 1.
Since these are the only two possibilities, we are done. That is,

I_{A∩B} = I_A · I_B   (4.5)

Now we want to show IA∪B = IA + IB − IA∩B , by [1], IA∪B = 1 − I(A∪B)c then we have
from De-morgan’s law,
(A ∪ B)c = Ac ∩ B c (4.6)

Therefore, IA∪B = 1 − IAc ∩B c , then it follows from [2] that IAc ∩B c = IAc · IB c , which implies
that, IA∪B = 1 − IAc · IB c , and IAc = 1 − IA , IB c = 1 − IB
Observe,
IA∪B = 1 − (1 − IA )(1 − IB ) = IA + IB − IA · IB

But IA∩B = IA · IB . We conclude that;

IA∪B = IA + IB − IA∩B (4.7)

Note that these results can be extended in an obvious manner to a finite collection of events. Hence (prove this),
IA∪B∪C = IA + IB + IC − IA∩B − IA∩C − IB∩C + IA∩B∩C    (4.8)
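These identities are easy to check numerically. The sketch below uses an arbitrary finite sample space with illustrative events A, B, C (all names here are our own choices, not from the notes) and verifies (4.5), (4.7) and the three-event extension (4.8) pointwise:

```python
# Minimal sketch: verify the indicator identities of Lemma 4.1.2 on a
# small finite sample space. Omega, A, B, C are illustrative choices.

Omega = range(10)
A = {w for w in Omega if w % 2 == 0}   # even outcomes
B = {w for w in Omega if w < 5}        # outcomes below 5
C = {w for w in Omega if w % 3 == 0}   # multiples of 3

def I(E, w):
    """Indicator function of the event E at the sample point w."""
    return 1 if w in E else 0

for w in Omega:
    assert I(A & B, w) == I(A, w) * I(B, w)                      # (4.5)
    assert I(A | B, w) == I(A, w) + I(B, w) - I(A & B, w)        # (4.7)
    assert I(A | B | C, w) == (I(A, w) + I(B, w) + I(C, w)
                               - I(A & B, w) - I(A & C, w) - I(B & C, w)
                               + I(A & B & C, w))                # (4.8)
```

Running the loop without an assertion error confirms each identity holds at every sample point.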

4.2 The Distribution Function


The distribution function for a random variable X is defined by F (x) = P(X ≤ x), where x
is any real number (−∞ < x < +∞). The distribution function is also called the cumulative
distribution function (CDF).
Some properties of the cumulative distribution function include the following:

1. F (x) is a non-decreasing function of x

2. lim_{x→+∞} F (x) = F (+∞) = 1

3. lim_{x→−∞} F (x) = F (−∞) = 0

Property (1) follows since, for y < x, the event {X ≤ y} is contained in the event {X ≤ x} and so must have a smaller (or equal) probability. Properties (2) and (3) follow since X must take on some finite value.
All probability questions about X can be answered in terms of the cumulative distribution function F (·). For example, if a, b ∈ R are such that a < b, it follows that:

P(X = a) = F (a) − F (a− )    (4.9)

P(a < X ≤ b) = F (b) − F (a)    (4.10)
P(a < X < b) = F (b− ) − F (a)    (4.11)
P(a ≤ X ≤ b) = F (b) − F (a− )    (4.12)
P(a ≤ X < b) = F (b− ) − F (a− )    (4.13)
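These relations can be exercised numerically. In the sketch below, F is an illustrative CDF with an atom of mass 1/2 at x = 0, and the left limit F(a−) is approximated by evaluating F just below a:

```python
# Sketch: computing interval probabilities from a CDF, including the
# left limits F(a-). The CDF F here is an illustrative example with a
# jump of size 1/2 at x = 0.

def F(x):
    if x < 0:
        return 0.0
    if x <= 1:
        return (x + 1) / 2
    return 1.0

def F_left(x, h=1e-9):
    """Numerical approximation of the left limit F(x-)."""
    return F(x - h)

p_atom = F(0) - F_left(0)            # P(X = 0), equation (4.9)
p_half_open = F(0.5) - F(-3)         # P(-3 < X <= 0.5), equation (4.10)
p_open = F_left(0.5) - F(0)          # P(0 < X < 0.5), equation (4.11)

assert abs(p_atom - 0.5) < 1e-6
assert abs(p_half_open - 0.75) < 1e-6
assert abs(p_open - 0.25) < 1e-6
```

The atom at 0 shows why the left limit matters: F(0) and F(0−) differ by exactly P(X = 0).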

If we desire the probability that X is strictly smaller than b, we may calculate this probability by

P{X < b} = lim_{h→0+} P{X ≤ b − h} = lim_{h→0+} F (b − h)

where lim_{h→0+} means that we are taking the limit as h decreases to 0. Note that P(X < b) does not necessarily equal F (b), since F (b) also includes the probability that X equals b.

Theorem 4.2.1. Let X be a continuous real-valued random variable with density function f (x). Then the cumulative distribution function of X is given by

F (x) = ∫_{−∞}^{x} f (t) dt    (4.14)

Remark

1. F (x) is the cumulative distribution function (CDF)

2. f (x) is the probability density function (PDF)

3. dF (x)/dx = f (x)

Example 1

A real number is chosen at random from [0, 1] with uniform probability, and this number is then squared. Let X represent the result. What is the PDF of X?

Solution

Let u represent the chosen number, so that X = u². If 0 ≤ x ≤ 1, then we have

FX (x) = P(X ≤ x) = P(u² ≤ x) = P(u ≤ √x) = √x.

It is clear that X always takes a value between 0 and 1, so the cumulative distribution function of X is given by

FX (x) = { 0, x ≤ 0; √x, 0 ≤ x ≤ 1; 1, x ≥ 1 }    (4.15)

To get the PDF, we differentiate the CDF and get

fX (x) = { 1/(2√x), 0 < x < 1; 0, otherwise }    (4.16)

Note here that FX (x) is continuous, but fX (x) is not.
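A quick Monte Carlo check of this result: squaring uniform samples and comparing the empirical CDF against √x (the seed and sample size below are arbitrary choices):

```python
# Sketch: Monte Carlo check that X = U^2, with U uniform on [0, 1],
# has CDF F_X(x) = sqrt(x) for 0 <= x <= 1.

import math
import random

random.seed(0)
samples = [random.random() ** 2 for _ in range(200_000)]

for x in (0.1, 0.25, 0.5, 0.9):
    empirical = sum(s <= x for s in samples) / len(samples)
    assert abs(empirical - math.sqrt(x)) < 0.01   # within sampling error
```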

Example 2

Suppose that X has the PDF

fX (x) = { ax², 0 < x < 3; 0, otherwise }

Calculate (a) P(1 < X < 2) (b) P(1 < X)

Solution

A PDF must satisfy (i) fX (x) ≥ 0 and (ii) ∫ fX (x) dx = 1. Since ax² ≥ 0 for every x in (0, 3) (with a > 0), and fX (x) = 0 otherwise, condition (i) holds. For condition (ii),

∫_0^3 ax² dx = 1

which implies 9a − 0 = 1. Thus,

a = 1/9

Hence we can re-write our PDF as follows:

fX (x) = { (1/9)x², 0 < x < 3; 0, otherwise }

So from the above,

P(1 < X < 2) = ∫_1^2 fX (x) dx = ∫_1^2 (x²/9) dx = 7/27

Likewise,

P(1 < X) = ∫_1^3 fX (x) dx = ∫_1^3 (x²/9) dx = 26/27
1 1 9 27
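The constant a = 1/9 and both probabilities can be double-checked with a plain midpoint Riemann sum (no external libraries assumed):

```python
# Sketch: numerical check of Example 2 via a midpoint Riemann sum.

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x ** 2 / 9   # the normalised pdf on (0, 3)

assert abs(integrate(f, 0, 3) - 1) < 1e-6          # total probability
assert abs(integrate(f, 1, 2) - 7 / 27) < 1e-6     # P(1 < X < 2)
assert abs(integrate(f, 1, 3) - 26 / 27) < 1e-6    # P(1 < X)
```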

Example 3

Given that

F (x) = { 0, x < 0; (x + 1)/2, 0 ≤ x ≤ 1; 1, x > 1 }    (4.17)

(a) Sketch its graph. Hence, find (i) P(X = 0) (ii) P(−3 < X < 1/2) (iii) P(X > 1)

Solution

Figure 4.1: Sketch of the distribution function

From the graph we have:

P(X = 0) = F (0) − F (0− ) = 1/2 − 0 = 1/2,

P(−3 < X < 1/2) = F (1/2− ) − F (−3) = (1/2 + 1)/2 − 0 = 3/4,

P(X > 1) = 1 − F (1) = 1 − 1 = 0

Exercise

Given that the random variable X has the distribution function

F (x) = { 0, x < 0; (1/4)x, 0 ≤ x < 1; 1/4, 1 ≤ x < 2; (1/2)x, 2 ≤ x < 3 }

(i) Sketch F (x) and hence determine (a) P(X = 1/2), (b) P(X = 1), (c) P(X = 2).
(ii) Find the conditional probability that X > 2 given that X > 1.

1. The probability mass function (pmf) is a function that gives the probability that a discrete random variable is exactly equal to some value. Suppose that X : S −→ A ⊂ R is a discrete random variable defined on a sample space S. Then the probability mass function fX : A −→ [0, 1] for X is defined as

fX (x) = P(X = x)

Also,

Σ_{x∈A} fX (x) = 1

2. The probability density function (pdf), or density, of a continuous random variable is a function that describes the relative likelihood of the random variable taking a given value.

3. In probability theory and statistics, the cumulative distribution function (cdf) of a real-valued random variable X, or just the distribution function of X, evaluated at x is the probability that X will take a value less than or equal to x.
In the case of a continuous distribution it gives the area under the pdf from −∞ to x. The cdf of a real-valued random variable X is the function given by

FX (x) = P(X ≤ x)

4.3 Functions of Random Variables.

Suppose we have a random variable X whose distribution is known, and we are interested in another random variable Y that is a transformation of X. For example, for Y = X² we want to find the distribution of Y .
There are several techniques:

1. Distribution technique

2. Change of variable technique

3. Moment generating functions.

Examples.

1. Suppose we have the distribution of X as

X=x -1 0 1
P(X = x) 0.25 0.5 0.25

Define Y = X²; how do we find the distribution of Y ?

Solution

From the table we have,

X=x -1 0 1
Y = X² = y 1 0 1
P(X = x) 0.25 0.5 0.25

Collecting equal values of y gives P(Y = 0) = 0.5 and P(Y = 1) = 0.25 + 0.25 = 0.5.
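The distribution technique amounts to pooling the probability of all x-values that map to the same y, which can be written directly:

```python
# Sketch: pmf of Y = X^2 obtained by collecting probability mass over
# equal values of y.

from collections import defaultdict

pmf_X = {-1: 0.25, 0: 0.5, 1: 0.25}

pmf_Y = defaultdict(float)
for x, p in pmf_X.items():
    pmf_Y[x ** 2] += p    # x = -1 and x = 1 both contribute to y = 1

assert pmf_Y[0] == 0.5
assert pmf_Y[1] == 0.5
```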

2. Let X have the Poisson distribution

fX (x) = λ^x e^(−λ)/x!,  for x = 0, 1, 2, 3, . . .

Then X has the support A = {0, 1, 2, 3, 4, . . . }.

a. Define a new random variable Y = 4X; then the space A maps onto the space B = {0, 4, 8, . . . }.

b. What is the distribution of Y ?

Solution

Now

fY (y) = P(Y = y) = P(4X = y) = P(X = y/4)

Hence, substituting into fX (x), we have

fY (y) = λ^(y/4) e^(−λ)/(y/4)!,  for y = 0, 4, 8, . . .
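As a sanity check, the pmf of Y should still sum to 1 over B = {0, 4, 8, . . . }. The sketch below truncates the infinite sum; the rate λ = 2.5 is an arbitrary illustrative choice:

```python
# Sketch: verify that f_Y(y) = lambda^(y/4) e^(-lambda) / (y/4)! is a
# genuine pmf over B = {0, 4, 8, ...}.

import math

lam = 2.5   # illustrative rate

def f_Y(y):
    x = y // 4                  # recover x = y/4
    return lam ** x * math.exp(-lam) / math.factorial(x)

total = sum(f_Y(y) for y in range(0, 400, 4))   # truncated infinite sum
assert abs(total - 1) < 1e-9
```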

Exercise

1. For a continuous random variable, let the pdf of X be given by

f (x) := { 6x(1 − x), 0 < x < 1; 0, otherwise }

Find the pdf of Y = X³.

2. Let the pdf of Y be

fY (y) := { 3y², 0 < y < 1; 0, otherwise }

and U = 2Y + 3. Find the cdf of U and the pdf of U .

Theorem 4.3.1. Let X be a discrete random variable with probability mass function fX (x) defined on the support A. Let y = g(x) be a one-to-one transformation that maps A onto B, with inverse transformation x = g −1 (y) for all y ∈ B. Then the probability mass function of Y = g(X) is given by

fY (y) = fX (g −1 (y)), y ∈ B

Proof. From the definition,

fY (y) = P(Y = y) = P(g(X) = y) = P(X = g −1 (y))

Therefore,
fY (y) = fX (g −1 (y))

Theorem 4.3.2. Let X be a continuous random variable with pdf fX (x) defined on the support A. Let Y = g(X) be a one-to-one transformation that maps A onto B, with inverse x = g −1 (y) for all y ∈ B, and let dx/dy be continuous and non-zero. Then the pdf of Y = g(X) is

fY (y) = fX (g −1 (y)) |dx/dy| ,  y ∈ B

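As a sketch of the change-of-variable theorem in action, take Example 1 of Section 4.2: X uniform on (0, 1) and Y = g(X) = X², so g⁻¹(y) = √y and |dx/dy| = 1/(2√y), which recovers fY(y) = 1/(2√y). The code below checks numerically that this fY integrates to 1 despite the singularity at 0:

```python
# Sketch: change-of-variable formula for Y = X^2 with X uniform on (0, 1).
# f_Y(y) = f_X(g^{-1}(y)) * |dx/dy| = 1 * 1/(2*sqrt(y)) on (0, 1).

import math

f_X = lambda x: 1.0 if 0 < x < 1 else 0.0
f_Y = lambda y: f_X(math.sqrt(y)) / (2 * math.sqrt(y))

# midpoint Riemann sum: f_Y should integrate to 1 over (0, 1)
n = 200_000
h = 1 / n
total = sum(f_Y((i + 0.5) * h) for i in range(n)) * h
assert abs(total - 1) < 2e-3   # tolerance loosened for the singularity
```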
Chapter 5

Joint Probability Density Function.

Thus far, we have discussed probability distributions of a single random variable. However, we are often interested in probability statements concerning two or more random variables. To deal with such probabilities we define, for any two random variables X and Y , the joint cumulative probability distribution function of X and Y by

F (a, b) = P(X ≤ a, Y ≤ b), −∞ < a, b < +∞

The distribution of X can be obtained from the joint distribution of X and Y as follows:

FX (a) = P{X ≤ a} = P{X ≤ a, Y < ∞} = F (a, ∞)

Similarly, the cdf of Y is given by FY (b) = P(Y ≤ b) = F (∞, b). In the case where X
and Y are both discrete random variables, it is convenient to define the joint probability
mass function of X and Y by

p(x, y) = P{X = x, Y = y}

The probability mass function of X may be obtained from p(x, y) by

PX (x) = Σ_{y: p(x,y)>0} p(x, y)

Similarly,

PY (y) = Σ_{x: p(x,y)>0} p(x, y)

We say X and Y are jointly continuous if there exists a function f (x, y), defined for all real x and y, having the property that for all sets A and B of real numbers

P{X ∈ A, Y ∈ B} = ∫_B ∫_A f (x, y) dx dy

The function f (x, y) is called the joint probability density function of X and Y . The probability density of X can be obtained from a knowledge of f (x, y) by the following reasoning:

P{X ∈ A} = P{X ∈ A, Y ∈ (−∞, ∞)} = ∫_A ∫_{−∞}^{∞} f (x, y) dy dx = ∫_A fX (x) dx

Where the pdf of X is Z ∞


fX (x) = f (x, y) dy

The pdf of Y is such that Z ∞


fY (y) = f (x, y) dx
−∞

Example.

1. Let X1 and X2 have the joint pdf given by

f (x1 , x2 ) = { (x1 + x2 )/21, x1 = 1, 2, 3; x2 = 1, 2; 0, otherwise }

Calculate (a) directly (b) from the marginals:

(i) P(X1 = 2) (ii) P(X2 = 1)

The marginal pdf of X1 is

f1 (x1 ) = Σ_{x2 =1,2} (x1 + x2 )/21 = (x1 + 1)/21 + (x1 + 2)/21

so that

f1 (x1 ) = (2x1 + 3)/21

Therefore,

f1 (x1 ) = { (2x1 + 3)/21, x1 = 1, 2, 3; 0, otherwise }

The marginal pdf of X2 is

f2 (x2 ) = Σ_{x1 =1,2,3} (x1 + x2 )/21 = (3x2 + 6)/21

Therefore,

f2 (x2 ) = { (3x2 + 6)/21, x2 = 1, 2; 0, otherwise }

By direct calculation, P(X1 = 2) = f (2, 1) + f (2, 2) = 7/21 = 1/3 and P(X2 = 1) = f (1, 1) + f (2, 1) + f (3, 1) = 9/21 = 3/7.
From the marginals, P(X1 = 2) = f1 (2) = 1/3 and P(X2 = 1) = f2 (1) = 3/7.
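The marginal computation above can be mirrored exactly with rational arithmetic:

```python
# Sketch: marginals of the joint pmf f(x1, x2) = (x1 + x2)/21 computed
# exactly with fractions.

from fractions import Fraction

f = {(x1, x2): Fraction(x1 + x2, 21)
     for x1 in (1, 2, 3) for x2 in (1, 2)}

# sum out the other coordinate to get each marginal
f1 = {x1: sum(p for (a, _), p in f.items() if a == x1) for x1 in (1, 2, 3)}
f2 = {x2: sum(p for (_, b), p in f.items() if b == x2) for x2 in (1, 2)}

assert sum(f.values()) == 1          # genuine joint pmf
assert f1[2] == Fraction(1, 3)       # P(X1 = 2)
assert f2[1] == Fraction(3, 7)       # P(X2 = 1)
```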

5.1 Stochastic Independence

5.1.1 Definition

Let the random variables X and Y have joint pdf f (x, y) and marginal pdfs f1 (x) and g(y) respectively. The random variables X and Y are said to be stochastically independent if and only if

f (x, y) = f1 (x)g(y)

where

f1 (x) = ∫_{−∞}^{∞} f (x, y) dy  and  g(y) = ∫_{−∞}^{∞} f (x, y) dx

Note that random variables which are not stochastically independent are said to be dependent.

Example.

Show that f (x1 , x2 ) = 4x1 x2 , 0 < x1 < 1, 0 < x2 < 1, is a genuine pdf.

Solution

f (x1 , x2 ) = 4x1 x2 ≥ 0 because x1 and x2 both run over (0, 1). Hence,

∫_0^1 ∫_0^1 4x1 x2 dx2 dx1 = ∫_0^1 2x1 dx1 = 1
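Since f(x1, x2) = 4x1x2 factors as (2x1)(2x2), this example also illustrates the independence condition defined above. A numeric sketch confirming both the normalisation and the factorisation:

```python
# Sketch: f(x1, x2) = 4*x1*x2 on the unit square integrates to 1 and
# factors as f1(x1) * g(x2) with f1(x) = g(x) = 2x, so X1 and X2 are
# stochastically independent.

f = lambda x1, x2: 4 * x1 * x2
f1 = lambda x: 2 * x    # marginal pdf of X1
g = lambda y: 2 * y     # marginal pdf of X2

# midpoint Riemann sum over the unit square
n = 400
h = 1 / n
total = sum(f((i + 0.5) * h, (j + 0.5) * h)
            for i in range(n) for j in range(n)) * h * h
assert abs(total - 1) < 1e-9

# pointwise factorisation f = f1 * g
for i in range(1, 10):
    x1, x2 = i / 10, (10 - i) / 10
    assert abs(f(x1, x2) - f1(x1) * g(x2)) < 1e-12
```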

Assignment

Let X1 and X2 be random variables, each distributed N (0, 1):

f (x1 ) = (1/√(2π)) e^(−x1²/2)  and  f (x2 ) = (1/√(2π)) e^(−x2²/2)

If X1 and X2 are stochastically independent, show that the pdf of Y1 = X1 /X2 is the Cauchy pdf given by

g(y1 ) = 1/(π(1 + y1²)),  for −∞ < y1 < ∞

The following table gives a simple summary of some probability distributions.

S-N | Type (Discrete) | Probability mass function | Moment generating function | Mean | Variance
1 | Binomial with parameters n, p; 0 ≤ p ≤ 1 | C(n, x) p^x (1 − p)^(n−x); x = 0, 1, 2, . . . , n | (pe^t + (1 − p))^n | np | np(1 − p)
2 | Poisson with parameter λ > 0 | λ^x e^(−λ)/x!; x = 0, 1, 2, . . . | e^(λ(e^t − 1)) | λ | λ
3 | Geometric with parameter 0 ≤ p ≤ 1 | p(1 − p)^(x−1); x = 1, 2, . . . | pe^t/(1 − (1 − p)e^t) | 1/p | (1 − p)/p²

S-N | Type (Continuous) | Probability density function | Moment generating function | Mean | Variance
4 | Uniform over (a, b) | f (x) = 1/(b − a), a < x < b; 0 otherwise | (e^(tb) − e^(ta))/(t(b − a)) | (a + b)/2 | (b − a)²/12
5 | Exponential with parameter λ > 0 | f (x) = λe^(−λx), x ≥ 0; 0, x < 0 | λ/(λ − t) | 1/λ | 1/λ²
6 | Gamma with parameters (n, λ), λ > 0 | f (x) = λe^(−λx)(λx)^(n−1)/(n − 1)!, x ≥ 0; 0, x < 0 | (λ/(λ − t))^n | n/λ | n/λ²
7 | Normal with parameters (µ, σ²) | f (x) = (1/(σ√(2π))) e^(−(1/2)((x−µ)/σ)²) | e^(µt + σ²t²/2) | µ | σ²
Chapter 6

Gamma Function.

For any positive real number p, define

Γ(p) = ∫_0^∞ e^(−x) x^(p−1) dx

Then,

Γ(1) = ∫_0^∞ e^(−x) dx = 1

Γ(2) = ∫_0^∞ x e^(−x) dx = 1

Γ(3) = ∫_0^∞ x² e^(−x) dx = 2

Hence we can show that for any natural number n,

Γ(n) ≡ (n − 1)!

Proof. Given that

Γ(n) = ∫_0^∞ x^(n−1) e^(−x) dx    (6.1)

integration by parts gives

Γ(n) = ∫_0^∞ x^(n−1) e^(−x) dx = (n − 1) ∫_0^∞ x^(n−2) e^(−x) dx

Then we can see that

Γ(n) = (n − 1)Γ(n − 1)
     = (n − 1)(n − 2)(n − 3) · · · (3)(2)(1)
     = (n − 1)!
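The recursion can be checked against Python's built-in math.gamma, and the defining integral approximated directly (the truncation point and step below are illustrative choices):

```python
# Sketch: check Gamma(n) = (n-1)! and approximate Gamma(3) = 2 by
# truncating the defining integral at x = 50, where the tail is negligible.

import math

for n in range(1, 10):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

# midpoint-rule approximation of Gamma(3) = integral of x^2 e^{-x}
N = 500_000
h = 50 / N
approx = sum(((i + 0.5) * h) ** 2 * math.exp(-(i + 0.5) * h)
             for i in range(N)) * h
assert abs(approx - 2) < 1e-4
```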

ASSIGNMENT

1. Prove that

Γ(1/2) = √π

2. Prove that the gamma pdf,

f (x) = (1/(Γ(α)β^α)) x^(α−1) e^(−x/β), 0 < x < ∞, α > 0, β > 0,

where

Γ(α) = ∫_0^∞ x^(α−1) e^(−x) dx,

is indeed a pdf.

3. Let the random variables X and Y have

f (x) = (1/Γ(α)) x^(α−1) e^(−x), 0 < x < ∞, α > 0

g(y) = (1/Γ(β)) y^(β−1) e^(−y), 0 < y < ∞, β > 0

with

U = X + Y,  V = X/(X + Y )

a. Calculate the joint pdf of U and V .

b. Calculate the marginal pdf of V .

c. By integrating the marginal pdf of V over the space of V , show that the beta function

B(α, β) = ∫_0^1 v^(α−1) (1 − v)^(β−1) dv

satisfies

B(α, β) = Γ(α)Γ(β)/Γ(α + β)
