
SET THEORY

A set is a collection of well-defined objects. The objects comprising the set are called elements.
If 𝑥 is an element of a set 𝐴, then we write 𝑥 ∈ 𝐴. If 𝑥 is not an element of 𝐴, then we write 𝑥 ∉ 𝐴.
Subset: Let 𝐴 and 𝐵 be two sets. 𝐴 is said to be a subset of 𝐵 if 𝑥 ∈ 𝐴 ⟹ 𝑥 ∈ 𝐵, and this is denoted
by 𝐴 ⊂ 𝐵.
Equality of sets: Two sets 𝐴 and 𝐵 are said to be equal if 𝐴 ⊂ 𝐵 and 𝐵 ⊂ 𝐴. We then write 𝐴 =
𝐵.
Universal set (𝑼): The universal set is the set of all objects under consideration.
Null set (𝝓): A set containing no elements is called a null set.
Singleton set: A set containing a single element is called a singleton set.

SET OPERATIONS:
Let 𝐴 and 𝐵 be two sets.
Union: The set of all elements which belong to set 𝐴 or to the set 𝐵 is called union of the sets
𝐴 and 𝐵. It is denoted by 𝐴 ∪ 𝐵. 𝐴 ∪ 𝐵 = {𝑥|𝑥 ∈ 𝐴 𝑜𝑟 𝑥 ∈ 𝐵}.
Intersection: The set of all elements which belong to set 𝐴 and to the set 𝐵 is called intersection
of the sets 𝐴 and 𝐵, i.e., the set of elements common to both 𝐴 and 𝐵. It is denoted by 𝐴 ∩ 𝐵.
𝐴 ∩ 𝐵 = {𝑥|𝑥 ∈ 𝐴 𝑎𝑛𝑑 𝑥 ∈ 𝐵}. If 𝐴 ∩ 𝐵 = 𝜙, then 𝐴 and 𝐵 are disjoint sets.
Complement: Let 𝐷 be a subset of 𝐴. Then the complement of 𝐷 in 𝐴 is the set of all elements
of 𝐴 which are not in 𝐷. It is denoted by 𝐷𝐶 or 𝐷̅ or 𝐷′. 𝐷′ = {𝑥 | 𝑥 ∈ 𝐴 but 𝑥 ∉ 𝐷}.
Difference: The difference 𝐴 − 𝐵 is defined as 𝐴 − 𝐵 = {𝑥 | 𝑥 ∈ 𝐴 and 𝑥 ∉ 𝐵} = 𝐴 ∩ 𝐵′
Set Identities:
1. 𝐴 ∩ 𝐵 = 𝐵 ∩ 𝐴, 𝐴 ∪ 𝐵 = 𝐵 ∪ 𝐴 (Commutative laws)
2. 𝐴 ∪ (𝐵 ∪ 𝐶) = (𝐴 ∪ 𝐵) ∪ 𝐶, 𝐴 ∩ (𝐵 ∩ 𝐶) = (𝐴 ∩ 𝐵) ∩ 𝐶 (Associative laws)
3. 𝐴 ∩ (𝐵 ∪ 𝐶) = (𝐴 ∩ 𝐵) ∪ (𝐴 ∩ 𝐶), 𝐴 ∪ (𝐵 ∩ 𝐶) = (𝐴 ∪ 𝐵) ∩ (𝐴 ∪ 𝐶)
(Distributive laws)
4. 𝐴 ∪ 𝐴 = 𝐴, 𝐴 ∩ 𝐴 = 𝐴 (Idempotent laws)
5. 𝐴 ∪ 𝑈 = 𝑈, 𝐴 ∩ 𝜙 = 𝜙 (Dominance laws)
6. 𝐴 ∪ 𝜙 = 𝐴, 𝐴 ∩ 𝑈 = 𝐴 (Identity laws)
7. 𝐴 ∩ 𝐴′ = 𝜙, 𝐴 ∪ 𝐴′ = 𝑈 (Complement laws)
8. (𝐴 ∪ 𝐵)′ = 𝐴′ ∩ 𝐵′ , (𝐴 ∩ 𝐵)′ = 𝐴′ ∪ 𝐵′ (De Morgan’s laws)
Cardinality: The number of elements in a set 𝐴 is called cardinality of 𝐴 and is denoted by 𝑛(𝐴).
1. 𝑛(𝐴 ∪ 𝐵) = 𝑛(𝐴) + 𝑛(𝐵) − 𝑛(𝐴 ∩ 𝐵)
2. 𝑛(𝐴 − 𝐵) = 𝑛(𝐴) − 𝑛(𝐴 ∩ 𝐵)
3. 𝑛(𝐴 ∪ 𝐵 ∪ 𝐶) = 𝑛(𝐴) + 𝑛(𝐵) + 𝑛(𝐶) − 𝑛(𝐴 ∩ 𝐵) − 𝑛(𝐵 ∩ 𝐶) − 𝑛(𝐴 ∩ 𝐶) + 𝑛(𝐴 ∩ 𝐵 ∩ 𝐶)
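A quick way to sanity-check these counting identities is to evaluate both sides on concrete finite sets. Below is a minimal Python sketch (the example sets are made up for illustration) verifying the two-set and three-set inclusion–exclusion formulas.

```python
# Inclusion-exclusion check on concrete (illustrative) sets
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}
C = {1, 5, 7, 9}

# n(A ∪ B) = n(A) + n(B) − n(A ∩ B)
assert len(A | B) == len(A) + len(B) - len(A & B)

# n(A ∪ B ∪ C) = n(A)+n(B)+n(C) − n(A∩B) − n(B∩C) − n(A∩C) + n(A∩B∩C)
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(B & C) - len(A & C)
       + len(A & B & C))
assert lhs == rhs
print(lhs, rhs)  # 8 8
```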

Methods of Enumeration
1. Multiplication Principle: Suppose that a procedure, say procedure 𝐴 can be done in 𝑛
different ways and another procedure, say 𝐵 can be done in 𝑚 different ways. Also
suppose that any way of doing 𝐴 can be followed by any way of doing 𝐵. Then, the
procedure consisting of ‘𝐴 𝑓𝑜𝑙𝑙𝑜𝑤𝑒𝑑 𝑏𝑦 𝐵’ can be performed in 𝑚𝑛 ways.
2. Addition Principle: The number of ways in which either 𝐴 or 𝐵, but not both, can be
performed is 𝑚 + 𝑛.
3. Permutation (Arrangement of given objects; Order is important):
The number of permutations of 𝑛 distinct objects taken 𝑟 at a time is
➢ 𝑛𝑃𝑟 = 𝑛!/(𝑛 − 𝑟)!, if repetitions are not allowed
➢ 𝑛^𝑟, if repetitions are allowed
➢ 𝑛!/(𝑘1! 𝑘2! … 𝑘𝑚!), if out of 𝑛 objects, 𝑘1 are of one kind, 𝑘2 are of a second kind, …, 𝑘𝑚 are
of an 𝑚th kind (𝑘1 + 𝑘2 + ⋯ + 𝑘𝑚 = 𝑛, i.e., all objects are taken)
➢ (𝑛 − 1)!, when arranged along a circle
➢ (𝑛 − 1)!/2, when clockwise and anti-clockwise arrangements are indistinguishable.
4. Combination (Selection of objects; Order is not important):
➢ 𝑛𝐶𝑟 = 𝑛𝑃𝑟/𝑟! = 𝑛!/(𝑟! (𝑛 − 𝑟)!), if repetitions are not allowed.
➢ (𝑛 + 𝑟 − 1)𝐶𝑟, if repetitions are allowed.
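Python's standard library computes these counts directly. The sketch below checks a few of the formulas above with math.perm, math.comb and math.factorial; the word "MISSISSIPPI" is just a convenient multiset example.

```python
import math

n, r = 7, 3
# nPr = n!/(n−r)! and nCr = nPr/r!
assert math.perm(n, r) == math.factorial(n) // math.factorial(n - r)
assert math.comb(n, r) == math.perm(n, r) // math.factorial(r)

# Arrangements of a multiset: n!/(k1! k2! ... km!)
word = "MISSISSIPPI"
arrangements = math.factorial(len(word))
for ch in set(word):
    arrangements //= math.factorial(word.count(ch))
print(arrangements)  # 34650

# Combinations with repetition: (n+r−1)Cr
print(math.comb(n + r - 1, r))  # 84
```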

PROBABILITY THEORY
There are two types of experiments:
I. Deterministic experiments
An experiment that has a single possible outcome, which is known before the
experiment is conducted. Eg: Cooling water below 0°C will freeze it; a ball thrown up
in the sky will fall down.
II. Random/probabilistic experiments
Random experiment is an experiment which may not result in the same outcome when
repeated under the same conditions. It is an experiment which does not have a unique
outcome. Eg: Tossing of a coin, rolling of a die.

Sample Space: The set of all possible outcomes of a random experiment.


Examples:
(i) In tossing of a coin: 𝑆 = {𝐻, 𝑇}
(ii) In tossing of two coins: 𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}
(iii) In tossing of two identical coins: 𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝑇}
(iv) In rolling of a die: 𝑆 = {1,2,3,4,5,6}
If a sample space has finite number of elements, then it is called a finite sample space.
Otherwise, the sample space is said to be an infinite sample space.
Example of an infinite sample space:
Consider rolling a die till a 5 appears: 𝑆 = {5, 15, 25, 35, 45, 65, …, 115, 125, …, 215, 225, …}, where each outcome lists the faces rolled, ending with the first 5.

Event: An event is a subset of the sample space.


Null Event: An event, which does not contain any element, is called a null event or an impossible
event, denoted by 𝜙
Certain or sure Event: If the event contains all the elements of the sample space, then it is called
a certain event.
Elementary/simple event: An event which has only one outcome
Compound event: An event which has more than one outcome

Types of events
1. Equally likely outcomes: If all outcomes of a random experiment have equal chances of
occurrence, then the outcomes are said to be equally likely, i.e., none of them has a
greater chance of occurrence than the others.
Eg: (i) In tossing of an unbiased coin, head and tail are equally likely.
(ii) In rolling of an honest die, all six faces are equally likely.

2. Mutually Exclusive Events: Two events 𝐴 and 𝐵 are said to be mutually exclusive if both
of them cannot occur simultaneously. i.e., if occurrence of one event prevents the
occurrence of the other, then the events are said to be mutually exclusive. 𝐴 and 𝐵 are
mutually exclusive if 𝐴 ∩ 𝐵 = 𝜙
Eg: (i) In tossing of a coin head and tail are mutually exclusive
(ii) In rolling of a die all six faces are mutually exclusive

3. Exhaustive events: A set of events is exhaustive if one or the other of the events in the
set occurs whenever the experiment is conducted, i.e., the set of events exhaust all the
outcomes of the experiment. The union of exhaustive events is equal to the sample
space.
Eg: (i) In tossing of a coin, exhaustive cases = 2, exhaustive events = {𝐻}, {𝑇}
(ii) In tossing of 2 coins, exhaustive cases = 4, exhaustive events={𝐻𝐻}, {𝑇𝑇}, {𝐻𝑇} {𝑇𝐻}
(iii) In tossing 𝑛 coins, exhaustive cases = 2𝑛 .
(iv) In rolling of two dice, exhaustive cases = 36

4. Independent events: Two or more events are said to be independent if the happening or
non-happening of one event does not affect the happening or non-happening of the
others, i.e., 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) × 𝑃(𝐵).
Eg: In tossing a coin twice, getting a head on the first toss and getting a head on the second toss are independent events.
Favourable cases: An outcome 𝑥 is said to be favourable to an event 𝐴, if 𝑥 belongs to 𝐴. The
total number of outcomes favourable to 𝐴 is called favourable cases to 𝐴.
Eg: (i) In tossing of two coins, favourable cases for getting 2 heads is 1, for getting exactly one
head is 2 and for getting at least 2 heads is 1.
(ii) In drawing a card from a pack, there are 4 cases favouring a king, 2 cases favouring a red
queen and 26 cases favouring a black card.

Probability
Probability is a quantitative measure of chances of occurrence. There are 3 approaches to the
study of probability.
1. Classical approach
2. Statistical or empirical approach
3. Axiomatic approach

1. Classical Definition of Probability:


If an event 𝐴 can occur in 𝑚 different ways out of a total of 𝑛 ways all of which are equally
likely and mutually exclusive, then the probability of the event 𝐴 is given by
𝑃(𝐴) = 𝑚/𝑛 = (Favourable number of cases)/(Total number of cases (exhaustive cases))

NOTE:
(i) For a null set, 𝑚 = 0. Hence 𝑃(𝜙) = 0
(ii) For a sample space, 𝑚 = 𝑛. Hence 𝑃(𝑆) = 1
(iii) 0 ≤ 𝑚 ≤ 𝑛. Hence, 0 ≤ 𝑚/𝑛 ≤ 1, i.e., 0 ≤ 𝑃(𝐴) ≤ 1
(iv) If 𝑚 outcomes are favourable to 𝐴, the remaining 𝑛 − 𝑚 are favourable to 𝐴′. Hence
𝑃(𝐴′) = (𝑛 − 𝑚)/𝑛 = 1 − 𝑚/𝑛 = 1 − 𝑃(𝐴), i.e., 𝑃(𝐴) + 𝑃(𝐴′) = 1
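Because the classical definition is just a ratio of counts over equally likely outcomes, it can be checked by brute-force enumeration. A small sketch, listing the sample space for two coin tosses and counting the favourable cases for "exactly one head":

```python
from itertools import product
from fractions import Fraction

# Sample space for tossing two coins: 4 equally likely outcomes
S = list(product("HT", repeat=2))        # [('H','H'), ('H','T'), ('T','H'), ('T','T')]
A = [s for s in S if s.count("H") == 1]  # favourable cases: exactly one head

P_A = Fraction(len(A), len(S))           # m/n
print(P_A)  # 1/2
```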

2. Statistical Definition of Probability:


If an experiment is repeated several times under essentially homogeneous and identical
conditions, then the limiting value of the ratio of the number of times the event occurs
to the number of trials, as the number of trials becomes indefinitely large, is called the
probability of that event, i.e., if an event 𝐴 occurs 𝑚 times in 𝑛 trials, then
𝑃(𝐴) = lim_{𝑛→∞} 𝑚/𝑛

3. Axiomatic approach:
Consider a random experiment with sample space 𝑆. Associated with this random
experiment, many events may be defined. With every event 𝐴, we associate a real number
𝑃(𝐴), called the probability of event 𝐴, satisfying the following axioms:
(i) 0 ≤ 𝑃(𝐴) ≤ 1
(ii) 𝑃(𝑆) = 1, 𝑆 being the sure event
(iii) For two mutually exclusive events 𝐴 and 𝐵, 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)
(iv) If 𝐴1, 𝐴2, … are pairwise mutually exclusive events, then
𝑃(⋃_{𝑖=1}^{∞} 𝐴𝑖) = ∑_{𝑖=1}^{∞} 𝑃(𝐴𝑖)
These are called the axioms of probability

Theorems
1. If 𝜙 is the null event, then 𝑃(𝜙) = 0
2. If 𝐴̅ is the complementary event of 𝐴, then 𝑃(𝐴) = 1 − 𝑃 (𝐴̅).
3. If 𝐴 and 𝐵 are any two events, then 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
4. If 𝐴,𝐵 and 𝐶are any three events, then 𝑃(𝐴 ∪ 𝐵 ∪ 𝐶) = 𝑃(𝐴) + 𝑃(𝐵) + 𝑃(𝐶) −
𝑃(𝐴 ∩ 𝐵) − 𝑃(𝐵 ∩ 𝐶) − 𝑃(𝐴 ∩ 𝐶) + 𝑃(𝐴 ∩ 𝐵 ∩ 𝐶)
5. If 𝐴 ⊂ 𝐵 then 𝑃(𝐴) ≤ 𝑃(𝐵)
6. The probability that exactly one of the events 𝐴 or 𝐵 occurs is 𝑃((𝐴 ∩ 𝐵′) ∪ (𝐴′ ∩ 𝐵)) =
𝑃(𝐴) + 𝑃(𝐵) − 2𝑃(𝐴 ∩ 𝐵)

Exercise
1. Let 𝐴, 𝐵 and 𝐶 be events such that 𝑃(𝐴) = 𝑃(𝐵) = 𝑃(𝐶) = 1/4, 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐶 ∩ 𝐵) = 0
and 𝑃(𝐴 ∩ 𝐶) = 1/8. Evaluate the probability that at least one of the events 𝐴, 𝐵 and
𝐶 occurs.
2. If 𝑃(𝐴) = 3/4, 𝑃(𝐵) = 3/8, then show that
(i) 𝑃(𝐴 ∪ 𝐵) ≥ 3/4
(ii) 1/8 ≤ 𝑃(𝐴 ∩ 𝐵) ≤ 3/8
(iii) 3/8 ≤ 𝑃(𝐴 ∩ 𝐵′) ≤ 5/8
3. 10 persons in a room are wearing badges marked 1 to 10. Three persons are chosen at
random and asked to leave the room simultaneously. Their badge number is noted.
(i) What is the probability that the smallest badge number is 5?
(ii) What is the probability that the largest badge number is 5?
4. A pair of dice is rolled. What is the probability of getting
(i) a sum greater than 6?
(ii) a sum neither 5 nor 10?
5. There are 8 positive numbers and 6 negative numbers. 4 numbers are chosen at random
and multiplied. What is the probability that the product is a positive number?
6. A number is chosen between 1 and 50. What is the probability that it is divisible by 8 or
6?
7. An urn contains 5 red and 10 black balls. 8 of them are placed in another urn. What is the
chance that the latter urn then contains 2 red and 6 black balls?
8. A bag contains 8 white and 6 red balls. What is the probability of drawing two balls of the
same color?
9. The coefficients 𝑎, 𝑏, 𝑐 of the quadratic equation 𝑎𝑥² + 𝑏𝑥 + 𝑐 = 0 are determined by
throwing a die 3 times. Find the probability that
(i) Roots are real
(ii) Roots are complex.

10. Three groups of children contain respectively 3 girls and 1 boy; 2 girls and 2 boys; 1 girl
and 3 boys. One child is selected at random from each group. Show that the chance that
the 3 selected consist of 1 girl and 2 boys is 13/32.
11. A committee of 4 persons is to be appointed from 3 officers of the production department, 4
officers from the purchase department, 2 officers from the sales department and 1 chartered
accountant. Find the probability that
(i) There must be one from each category.
(ii) It should have at least one from the purchase department.
(iii) The CA must be in the committee.
12.What is the probability that a randomly selected year contains 53 Sundays?
13.Each of 2 persons A and B tosses 3 fair coins. Find the probability that they get the same
number of heads.
14. A and B throw a die alternately till one of them gets a '6' and wins the game. Find their
respective probabilities of winning if A starts first.
Solution: Let S denote the success (getting a ‘6’) and F denote the failure (not getting a ‘6’).
Thus, P(S) =1/6, P(F) =5/6
P(A wins in the first throw) = P(S) = 1/6
A gets the third throw, when the first throw by A and second throw by B result into failures.
Therefore, P(A wins in the 3rd throw) = P(FFS) = P(F)P(F)P(S)= (5/6)(5/6)(1/6)
P(A wins in the 5th throw) = P (FFFFS)= (5/6)(5/6) (5/6)(5/6)(1/6)
Hence, P(A wins) =1/6 + (5/6)(5/6)(1/6) + (5/6)(5/6) (5/6)(5/6)(1/6) + … = 6/11 (G.P infinite sum)
P(B wins) = 1 – P (A wins) = 1- (6/11)= 5/11.
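The 6/11 answer can also be cross-checked by simulation. Below is a minimal Monte Carlo sketch of the game (A throws first; the first player to roll a 6 wins); the empirical frequency should approach 6/11 ≈ 0.545.

```python
import random

def a_wins_game(rng):
    """Play one game; return True if A (who throws first) rolls the first 6."""
    a_turn = True
    while True:
        if rng.randint(1, 6) == 6:
            return a_turn
        a_turn = not a_turn

rng = random.Random(0)
trials = 200_000
wins = sum(a_wins_game(rng) for _ in range(trials))
print(wins / trials)  # close to 6/11 ≈ 0.5455
```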

15. A and B alternately throw a pair of dice. A wins if he throws a sum of 6 before B throws
a sum of 7, and B wins otherwise. If A begins, find his chances of winning the game.
16.Out of the digits 0,1,2,3,4 (without repetition) a five-digit number is formed. Find the
probability that the number formed is divisible by 4
17.Six people toss a fair coin one by one. The game is won by the player who throws head.
Find the probability of success of the 4th player.

Answers:
1. 5/8
2. Prove
3. (i) 1/12 (ii) 1/20
4. (i) 7/12 (ii) 29/36
5. 505/1001
6. 6/25
7. 140/429
8. 43/91
9. 43/216, 173/216
10. 13/32
11. 4/35, 195/210, 0.4
12. 1/7 and 2/7 (leap year)
13. 5/16
14. Solution
15. For A: 30/61, For B: 31/61
16. 5/16
17. 4/63
Conditional Probability
Let 𝐴 and 𝐵 be two events. The conditional probability of 𝐵 given 𝐴 is the probability of the
occurrence of 𝐵 when it is known that the event 𝐴 has already occurred. It is denoted by
𝑃(𝐵|𝐴).
𝑃(𝐵|𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴)
NOTE:
(i) The probability of the happening of 𝐵 when nothing is known about the happening of
𝐴 is called unconditional probability of B and is denoted by 𝑃(𝐵)
(ii) If 𝐴 and 𝐵 are two independent events, then 𝑃(𝐵|𝐴) = 𝑃(𝐵), i.e.,
𝑃(𝐵|𝐴) = 𝑃(𝐵) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴)
⟹ 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) × 𝑃(𝐵)
(iii) If 𝐴 and 𝐵 are two independent events of 𝑆, then 𝐴 and 𝐵̅, 𝐵 and 𝐴̅, and 𝐴̅ and 𝐵̅ are
also independent.

Multiplicative theorem/Theorem of Compound probability


Let 𝐴 and 𝐵 be two events with respective probabilities 𝑃(𝐴) and 𝑃(𝐵). Let 𝑃(𝐵|𝐴) be the
conditional probability of the occurrence of event 𝐵 given that event 𝐴 has already occurred.
Then, the probability of simultaneous occurrence of 𝐴 and 𝐵 is
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) × 𝑃(𝐵|𝐴)

If they are independent, then 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) × 𝑃(𝐵)

NOTE :
(i) If 𝐴1 , 𝐴2 , 𝐴3 , … , 𝐴𝑛 are 𝑛 independent events, then
𝑃(𝐴1 ∩ 𝐴2 ∩ 𝐴3 ∩ … ∩ 𝐴𝑛 ) = 𝑃(𝐴1 ) ∙ 𝑃(𝐴2 ) ∙ 𝑃(𝐴3 ) ∙ … ∙ 𝑃(𝐴𝑛 )
If they are not independent,
𝑃(𝐴1 ∩ 𝐴2 ∩ 𝐴3 ∩ … ∩ 𝐴𝑛 )
= 𝑃(𝐴1 ) ∙ 𝑃(𝐴2 |𝐴1 ) ∙ 𝑃(𝐴3 |𝐴1 𝐴2 ) ∙ … ∙ 𝑃(𝐴𝑛 |𝐴1 𝐴2 … 𝐴𝑛−1 )
(ii) If 𝑃1 and 𝑃2 are the probability of the happening of two independent events, then
the probability that the first event happens and the second fails is 𝑃1 (1 − 𝑃2 ).
(iii) We have
(a) 0 ≤ 𝑃(𝐵|𝐴) ≤ 1
(b) 𝑃(𝑆|𝐴) = 1
(c) 𝑃(𝐵1 ∪ 𝐵2 |𝐴) = 𝑃(𝐵1 |𝐴) + 𝑃(𝐵2 |𝐴) if 𝐵1 ∩ 𝐵2 = 𝜙
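The defining ratio 𝑃(𝐵|𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴) is easy to confirm by counting on a finite sample space. The sketch below conditions on "the die shows an odd number" and computes the chance that the face is also greater than 2 (an event chosen purely for illustration).

```python
from fractions import Fraction

S = range(1, 7)                   # rolling a die
A = {s for s in S if s % 2 == 1}  # the die shows an odd number
B = {s for s in S if s > 2}       # illustrative event: face greater than 2

P_A = Fraction(len(A), 6)
P_AB = Fraction(len(A & B), 6)
print(P_AB / P_A)  # P(B|A) = 2/3
```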
Exercise
1. If 𝐴 and 𝐵 are two independent events of 𝑆 such that 𝑃(𝐴̅ ∩ 𝐵) = 2/15, 𝑃(𝐴 ∩ 𝐵̅) = 1/6, then
find 𝑃(𝐵).
2. If 𝐴 and 𝐵 are two independent events of 𝑆 such that 𝑃(𝐴) = 1/3, 𝑃(𝐵) = 1/4, 𝑃(𝐴 ∪ 𝐵) = 1/2,
then find
(i) 𝑃(𝐴|𝐵)
(ii) 𝑃(𝐵|𝐴)
(iii) 𝑃(𝐴 ∩ 𝐵̅)
(iv) 𝑃(𝐴|𝐵̅)
3. In a certain town 40% have brown hair, 25% have brown eyes, 15% have both brown hair
and brown eyes. A person is selected at random.
(i) If he has brown hair, then what is the probability that he has brown eyes also.
(ii) If he has brown eyes, then what is the probability that he does not have brown
hair.
(iii) Determine the probability that he neither has brown hair nor brown eyes.
4. A bag contains 10 gold coins and 8 silver coins. Two successive drawings of 4 coins are
made such that.
(i) The coins are replaced before the second trial.
(ii) The coins are not replaced before the second trial.
Find the probability that the first drawing will give 4 gold coins and second drawing will
give 4 silver coins.
Solution: (i) When coins are replaced before the second trial,
P(4 gold coins in first drawing) = 10𝐶4/18𝐶4
P(4 silver coins in second drawing) = 8𝐶4/18𝐶4
Required probability = (10𝐶4 × 8𝐶4)/(18𝐶4 × 18𝐶4)
(ii) When coins are not replaced before the second trial,
P(4 gold coins in first drawing) = 10𝐶4/18𝐶4
P(4 silver coins in second drawing) = 8𝐶4/14𝐶4
Required probability = (10𝐶4 × 8𝐶4)/(18𝐶4 × 14𝐶4)
5. Two defective tubes get mixed up with 4 good ones. The tubes are tested one by one,
until both defectives are found. What is the probability that the last defective tube is
obtained on
(i) Second test
(ii) Third test
(iii) Sixth test
6. A die is tossed. If the number is odd on the face, what is the probability that it is prime?
Solution: A: Number is odd = {1,3,5}
B: Number is prime = {3,5}
𝑃(𝐴) = 3/6, 𝑃(𝐴 ∩ 𝐵) = 2/6
Required probability = 𝑃(𝐵|𝐴) = (2/6)/(3/6) = 2/3
Answers:
1. 4/5 or 1/6
2. (i) 1/3 (ii) 1/4 (iii) 1/4 (iv) 1/3
3. (i) 3/8 (ii) 0.4 (iii) 0.5
4. Solution
5. (i) 1/15 (ii) 2/15 (iii) 1/3
6. Solution

Partition of a sample space


We say that the events 𝐵1 , 𝐵2 , … , 𝐵𝑘 represent a partition of the sample space 𝑆 if
1. 𝐵𝑖 ∩ 𝐵𝑗 = 𝜙 for all 𝑖 ≠ 𝑗 (mutually exclusive)
2. ⋃𝑘𝑖=1 𝐵𝑖 = 𝑆 (exhaustive)
3. 𝑃(𝐵𝑖 ) > 0 for all 𝑖

Total Probability Theorem

Let the events 𝐶1, 𝐶2, …, 𝐶𝑘 form a partition of the sample space 𝑆, where all the events have a
non-zero probability of occurrence. For any event 𝐴 associated with 𝑆,
𝑃(𝐴) = 𝑃(𝐴|𝐶1)𝑃(𝐶1) + 𝑃(𝐴|𝐶2)𝑃(𝐶2) + ⋯ + 𝑃(𝐴|𝐶𝑘)𝑃(𝐶𝑘)

Proof

Let 𝐶1 , 𝐶2 , … , 𝐶𝑘 represent a partition of the sample space 𝑆. Let 𝐴 be


some event with respect to 𝑆. Then
𝐴 = (𝐴 ∩ 𝐶1 ) ∪ (𝐴 ∩ 𝐶2 ) ∪ … ∪ (𝐴 ∩ 𝐶𝑘 )

We note that all the events (𝐴 ∩ 𝐶1 ), (𝐴 ∩ 𝐶2 ), … , (𝐴 ∩ 𝐶𝑘 ) are pairwise mutually exclusive


Thus,
𝑃(𝐴) = 𝑃(𝐴 ∩ 𝐶1 ) + 𝑃(𝐴 ∩ 𝐶2 ) + … + 𝑃(𝐴 ∩ 𝐶𝑘 )
But, 𝑃(𝐴 ∩ 𝐶𝑖 ) = 𝑃(𝐴|𝐶𝑖 ) ∙ 𝑃(𝐶𝑖 )

Hence, the total probability theorem is


𝑃(𝐴) = 𝑃(𝐴|𝐶1 )𝑃(𝐶1 ) + 𝑃(𝐴|𝐶2 )𝑃(𝐶2 ) + ⋯ + 𝑃(𝐴|𝐶𝑘 )𝑃(𝐶𝑘 )
Bayes’ Theorem
Let an event 𝐴 correspond to a number of exhaustive events 𝐶1 , 𝐶2 , … , 𝐶𝑘 , which form a
partition of the sample space 𝑆. If 𝑃(𝐶𝑖 ) and 𝑃(𝐴|𝐶𝑖 ) are given, then

𝑃(𝐶𝑖|𝐴) = 𝑃(𝐶𝑖) 𝑃(𝐴|𝐶𝑖) / ∑_{𝑖=1}^{𝑘} 𝑃(𝐶𝑖)𝑃(𝐴|𝐶𝑖)

Proof
From conditional probability, we have
𝑃(𝐶𝑖|𝐴) = 𝑃(𝐴 ∩ 𝐶𝑖)/𝑃(𝐴) = 𝑃(𝐴|𝐶𝑖)𝑃(𝐶𝑖)/𝑃(𝐴) … (𝑖)
(by using the multiplication theorem of probability)

Since the event 𝐴 corresponds to 𝐶1 , 𝐶2 , … 𝐶𝑘 , we have by total probability theorem,

𝑃(𝐴) = 𝑃(𝐴 ∩ 𝐶1 ) + 𝑃(𝐴 ∩ 𝐶2 ) + ⋯ + 𝑃(𝐴 ∩ 𝐶𝑘 )

= 𝑃(𝐴|𝐶1 )𝑃(𝐶1 ) + 𝑃(𝐴|𝐶2 )𝑃(𝐶2 ) + ⋯ + 𝑃(𝐴|𝐶𝑘 )𝑃(𝐶𝑘 )

= ∑_{𝑖=1}^{𝑘} 𝑃(𝐶𝑖)𝑃(𝐴|𝐶𝑖)

Substituting the above in (𝑖), we get

𝑃(𝐶𝑖|𝐴) = 𝑃(𝐶𝑖) 𝑃(𝐴|𝐶𝑖) / ∑_{𝑖=1}^{𝑘} 𝑃(𝐶𝑖)𝑃(𝐴|𝐶𝑖)

The above theorem is also called as the theorem of inverse probability.
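Both the total probability theorem and Bayes' theorem reduce to a few multiplications and one normalisation, so one small helper covers every exercise below. This is a generic sketch; the prior/likelihood numbers plugged in are taken from the TV-manufacturer example in Exercise 2 below (companies 𝑥, 𝑦, 𝑧 with defect rates 2%, 2%, 4%).

```python
def posterior(priors, likelihoods):
    """Bayes' theorem: P(C_i|A) = P(C_i)P(A|C_i) / sum_j P(C_j)P(A|C_j)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)                       # total probability theorem: P(A)
    return total, [j / total for j in joint]

# Partition: TV made by company x, y or z; A: the chosen TV is defective
p_A, post = posterior([0.5, 0.25, 0.25], [0.02, 0.02, 0.04])
print(p_A)      # 0.025
print(post[0])  # P(x | defective) = 0.4
```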

Exercise
1. A person has undertaken a construction job. The probabilities are 0.65 that there will be
strike, 0.80 that the construction job will be completed on time if there is no strike, and
0.32 that the construction job will be completed on time if there is a strike. Determine
the probability that the construction job will be completed on time.
Solution: A: The construction job will be completed on time
B: There will be a strike.
Given that 𝑃(𝐵) = 0.65,
𝑃(𝑛𝑜 𝑠𝑡𝑟𝑖𝑘𝑒) = 𝑃(𝐵′) = 1 − 0.65 = 0.35
Given that 𝑃(𝐴|𝐵) = 0.32, 𝑃(𝐴|𝐵′) = 0.80
Since events 𝐵 and 𝐵′ form a partition of the sample space 𝑆, by total probability theorem,
𝑃(𝐴) = 𝑃(𝐵)𝑃(𝐴|𝐵) + 𝑃(𝐵 ′ )𝑃(𝐴|𝐵 ′) = 0.65 × 0.32 + 0.35 × 0.8 = 0.208 + 0.28 = 0.488
2. Suppose 3 companies 𝑥, 𝑦, 𝑧 produce TVs. 𝑥 produces twice as many as 𝑦 while 𝑦 and 𝑧
produce same number. It is known that 2% of 𝑥, 2% of 𝑦, 4% of 𝑧 are defective. All the
TVs produced are put into one shop and then 1 TV is chosen at random. What is the
probability that the TV is defective? Suppose a TV chosen is defective, what is the
probability that this TV is produced by company 𝑥?
Solution: X: TV produced by company 𝑥
Y: TV produced by company 𝑦
Z: TV produced by company 𝑧
D: TV is defective
Given : 𝑃(𝑋) = 0.5, 𝑃(𝑌) = 0.25, 𝑃(𝑍) = 0.25, 𝑃(𝐷|𝑋) = 0.02, 𝑃(𝐷|𝑌) = 0.02, 𝑃(𝐷|𝑍) = 0.04
(i) By total probability theorem, 𝑃(𝐷) = 𝑃(𝑋)𝑃(𝐷|𝑋) + 𝑃(𝑌)𝑃(𝐷|𝑌) + 𝑃(𝑍)𝑃(𝐷|𝑍) =
0.025
(ii) By Bayes' theorem, 𝑃(𝑋|𝐷) = 𝑃(𝑋)𝑃(𝐷|𝑋) / [𝑃(𝑋)𝑃(𝐷|𝑋) + 𝑃(𝑌)𝑃(𝐷|𝑌) + 𝑃(𝑍)𝑃(𝐷|𝑍)] = 0.4

3. There are three boxes, the first one containing 1 white, 2 red and 3 black balls; the second one
containing 2 white, 3 red and 1 black ball and the third one containing 3 white, 1 red and 2
black balls. A box is chosen at random and from it two balls are drawn at random. One ball is
red and the other, white. What is the probability that they come from the second box?
4. A randomly selected year has 53 Sundays. Find the probability that it is a leap year.
5. Two factories produce identical clocks. The production of the first factory consists of
10,000 clocks of which 100 are defective. The second factory produces 20,000 clocks of
which 300 are defective. What is the probability that a particular defective clock was
produced in the first factory?
6. One percent of the population suffers from a certain disease. There is a blood test for
this disease, and it is 99% accurate, in other words, the probability that it gives the
correct answer is 0.99, regardless of whether the person is sick or healthy. A person
takes the blood test, and the result says that he has the disease. Find the probability
that he actually has the disease.
Solution: A: Blood test is positive
𝑩𝟏 : He has the disease
𝑩𝟐 : He doesn’t have the disease
Given: 𝑃(𝐵1) = 1/100, 𝑃(𝐵2) = 99/100, 𝑃(𝐴|𝐵1) = 99/100, 𝑃(𝐴|𝐵2) = 1/100
By Bayes' theorem, required probability = 𝑃(𝐵1|𝐴) = 𝑃(𝐵1)𝑃(𝐴|𝐵1) / [𝑃(𝐵1)𝑃(𝐴|𝐵1) + 𝑃(𝐵2)𝑃(𝐴|𝐵2)] = 0.5

7. If a machine is correctly set up, it produces 90% acceptable items. If it is incorrectly set
up, it produces only 40% acceptable items. Past experience shows that 80% of the set
ups are correctly done. If after a certain set up, the machine produces 2 acceptable
items, find the probability that the machine is correctly setup.
Solution: A: The machine produces 2 acceptable items.
𝐵1: Machine is correctly set up
𝐵2: Machine is not correctly set up
Given: 𝑃(𝐵1 ) = 0.8, 𝑃(𝐵2 ) = 0.2
𝑃(𝐴|𝐵1) = 0.9 × 0.9, 𝑃(𝐴|𝐵2 ) = 0.4 × 0.4
Therefore, by Bayes’ theorem, required probability = 𝑃(𝐵1|𝐴) = 0.95
8. It is suspected that a patient has one of the diseases 𝐴1 , 𝐴2 , 𝐴3 . Suppose that the
population suffering from this illness are in the ratio 2: 1: 1. The patient is given a test
which turns out to be positive in 25% of the cases of 𝐴1 , 50% of the cases of 𝐴2 and 90%
of the cases of 𝐴3 . Given that out of 3 tests taken by the patient two are positive, then
find the probability for each of the diseases.
Solution: 𝐴1 : The patient has disease 𝐴1
𝐴2 : The patient has disease 𝐴2
𝐴3 : The patient has disease 𝐴3
𝐵: Two test results are positive
Given: 𝑃(𝐴1) = 2/4, 𝑃(𝐴2) = 1/4, 𝑃(𝐴3) = 1/4.
𝑃(𝐵|𝐴1) = 𝑃𝑃𝑁 + 𝑃𝑁𝑃 + 𝑁𝑃𝑃 = 3𝐶2 (1/4)²(3/4), 𝑃(𝐵|𝐴2) = 3𝐶2 (1/2)²(1/2),
𝑃(𝐵|𝐴3) = 3𝐶2 (9/10)²(1/10)
Required probabilities: 𝑃(𝐴1|𝐵) = 0.3128, 𝑃(𝐴2|𝐵) = 0.4170, 𝑃(𝐴3|𝐵) = 0.2703
9. An archer with an accuracy of 75% fires 3 arrows at one target. The probability of the
target falling is 0.6 if he hits once, 0.7 if he hits twice, 0.8 if he hits thrice. Given that, the
target has fallen find the probability that it was hit twice.
Solution: 𝐵1: The target is hit once
𝐵2: The target is hit twice
𝐵3: The target is hit thrice
𝐵: The target falls
Given: 𝑃(archer hits) = 0.75, 𝑃(𝐵|𝐵1) = 0.6, 𝑃(𝐵|𝐵2) = 0.7, 𝑃(𝐵|𝐵3) = 0.8
𝑃(𝐵1) = 3𝐶1 (3/4)(1/4)², 𝑃(𝐵2) = 3𝐶2 (3/4)²(1/4), 𝑃(𝐵3) = 3𝐶3 (3/4)³
Required probability = 𝑃(𝐵2|𝐵) = 0.411
10. A bag contains three coins, one of which is two-headed and the other two coins are
normal and unbiased. One coin is chosen at random and is tossed four times in
succession. If each time head comes up, what is the probability that this is a two-headed
coin?
Solution: 𝐴: fair coin is tossed
𝐵: Fake coin is tossed
𝐻: Getting head 4 times
Given: 𝑃(𝐴) = 2/3, 𝑃(𝐵) = 1/3
𝑃(𝐻|𝐵) = 1, 𝑃(𝐻|𝐴) = (1/2)⁴
Required probability = 𝑃(𝐵|𝐻) = 8/9
11. Two absent-minded roommates forget their umbrellas in some way or the other. 𝐴
always takes his umbrella when he goes out. 𝐵 forgets to take his umbrella with
probability 1/2. Each of them forgets his umbrella at a shop with probability 1/4. After
visiting three shops, they return home. Find the probability that
(i) Only one has an umbrella.
(ii) 𝐵 lost his umbrella given that there is only one umbrella after their return.
Solution: 𝐴: 𝐴 has his umbrella after returning home
𝐵: 𝐵 has his umbrella after returning home
𝑃(𝐴) = (3/4)³, 𝑃(𝐵) = 1/2 + (1/2)(3/4)³
(i) Required probability = 𝑃(𝐴′𝐵) + 𝑃(𝐴𝐵′) = 2183/4096
(ii) Required probability = 𝑃(𝐴𝐵′)/𝑃(only one has umbrella) = 27/118

12.Suppose we have a screening test that tests whether a patient has a particular disease.
We denote positive and negative results as positive and negative respectively, and D
denotes the person having disease in the population. Suppose that the test is not
absolutely accurate, and 𝑃(𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 | 𝐷) = 95%, 𝑃(𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 | 𝐷𝑐 ) = 8%, 𝑃(𝐷) =
0.9% . What is the probability that a person has the disease given that he
received a positive result?
Solution: 𝑃(𝐷 | positive) = 𝑃(positive | 𝐷) ⋅ 𝑃(𝐷) / [𝑃(positive | 𝐷) ⋅ 𝑃(𝐷) + 𝑃(positive | 𝐷ᶜ) ⋅ 𝑃(𝐷ᶜ)]
= 0.95 ⋅ (0.009) / [0.95 ⋅ (0.009) + 0.08 ⋅ (1 − 0.009)]
= 0.097347
Note that the probability that a person has the disease given that he received a positive result is just 9.7347%.

13. A MNC has its branches at Ahmadabad, Bengaluru and Chennai wherein they have
1729, 4104 and 7999 employees respectively. For the purpose of dealing with transfer
requests from employees, MNC has divided the employees broadly in to two categories
Type I and Type II. At Ahmadabad, Bengaluru and Chennai 17%, 2%, 9% of the
employees are of Type II (respectively). The company transfers an employee of Type II.
What is the probability that the employee worked at Ahmadabad?
Solution: Let II be the event that a Type II employee is transferred (from any location).
Let A, B and C denote the event that a randomly selected employee works at Ahmadabad, Bengaluru
and Chennai respectively. Then, 𝑃(𝐴) = 1729/13832, 𝑃(𝐵) = 4104/13832, 𝑃(𝐶) = 7999/13832
We have
𝑃(𝐼𝐼 | 𝐴) = 0.17, 𝑃(𝐼𝐼 | 𝐵) = 0.02, 𝑃(𝐼𝐼 | 𝐶) = 0.09
Then 𝑃(𝐴 | 𝐼𝐼) = 𝑃(𝐴 ∩ 𝐼𝐼)/𝑃(𝐼𝐼) = 𝑃(𝐼𝐼 | 𝐴)𝑃(𝐴) / [𝑃(𝐼𝐼 | 𝐴)𝑃(𝐴) + 𝑃(𝐼𝐼 | 𝐵)𝑃(𝐵) + 𝑃(𝐼𝐼 | 𝐶)𝑃(𝐶)]
= (0.17)(1729/13832) / [(0.17)(1729/13832) + (0.02)(4104/13832) + (0.09)(7999/13832)]
= 0.2682

14. A bag contains 90 fair coins (𝑃(𝐻) = 1/2 = 𝑃(𝑇)) and 10 unfair coins with 𝑃(𝐻) = 3/4,
𝑃(𝑇) = 1/4. A coin is picked at random and tossed 𝑛 times, and each one of the 𝑛 tosses
were heads.
were heads.
(i) What is the probability that the picked coin is the unfair coin?
(ii) Find the least value of 𝑛 for which the probability that the picked coin is unfair is
at least 90%.
Solution: Let 𝐴 denote the event that 𝑛 tosses of the coin gave 𝑛 heads.
Let 𝐵1 denote the event that the coin is unfair and 𝐵2 denote the event that the coin is fair.
(i) 𝑃(𝐵1 | 𝐴) = 𝑃(𝐴 | 𝐵1)𝑃(𝐵1) / [𝑃(𝐴 | 𝐵1)𝑃(𝐵1) + 𝑃(𝐴 | 𝐵2)𝑃(𝐵2)] = (3/4)ⁿ(0.1) / [(3/4)ⁿ(0.1) + (1/2)ⁿ(0.9)]
(ii) When 𝑛 = 10, we get 𝑃(𝐵1 | 𝐴) = 0.864997
When 𝑛 = 11, we get 𝑃(𝐵1 | 𝐴) = 0.905757
Hence the least value of 𝑛 is 11.
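The least-𝑛 search in part (ii) is a one-line loop once the posterior is written out; a sketch that reproduces the 0.865/0.906 values:

```python
def p_unfair_given_heads(n):
    """Posterior probability the coin is unfair after n heads in n tosses."""
    num = (3 / 4) ** n * 0.1
    return num / (num + (1 / 2) ** n * 0.9)

n = 1
while p_unfair_given_heads(n) < 0.90:
    n += 1
print(n, p_unfair_given_heads(n))  # 11 0.9057...
```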

15.(Birthday Problem)
How many people are needed in a room for there to be a probability that two people
have the same birthday to be at least 0.5? Ignore leap years and assume that all
birthdays are equally likely.
Solution: Let us assume that there are 𝑘 people in the room.
We solve this by finding the probability of no birthday match.
From 365 possible birthdays, the total number of possibilities of birthday combinations is 365𝑘 .
For nobody to have the same birthday, the first person can have any birthday, the second has 364 to
choose from, and so on. Hence
𝑃(no shared birthday) = [365 ⋅ 364 ⋅ 363 ⋯ (365 − (𝑘 − 1))] / 365ᵏ = 𝑃(𝑘), say.
Then 𝑃(𝑠ℎ𝑎𝑟𝑒𝑑 𝑏𝑖𝑟𝑡ℎ𝑑𝑎𝑦) = 1 − 𝑃(𝑘)
We have,
1 − 𝑃(22) = 0.4757
1 − 𝑃(23) = 0.5073
The lowest 𝑘 for which the probability exceeds 0.5 is, 𝑘 = 23.
Answer: 23
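The same computation in code: multiply the no-match probability factor by factor and stop when the complement crosses 0.5.

```python
def p_shared_birthday(k):
    """1 - P(no shared birthday among k people), ignoring leap years."""
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (365 - i) / 365
    return 1 - p_distinct

k = 1
while p_shared_birthday(k) < 0.5:
    k += 1
print(k, round(p_shared_birthday(k), 4))  # 23 0.5073
```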

16. (Hat Problem)


A group of 𝑛 people enter a restaurant and give their hats to the hat-keeper. On return,
the hat-keeper redistributes the hats back at random.
(i) What is the probability 𝑃𝑛 that no person gets his/her correct hat?
(ii) What happens to 𝑃𝑛 as 𝑛 tends to infinity?

Answers:
1. Solution
2. Solution
3. 6/11
4. 0.4
5. 0.25
6. Solution
7. Solution
8. Solution
9. Solution
10. Solution
11. Solution
12. Solution
13. Solution
14. Solution
15. Solution
16. (i) 𝑃𝑛 = 1 − 1/1! + 1/2! − 1/3! + 1/4! − ⋯ + (−1)ⁿ/𝑛!
(ii) 1/𝑒
Random Variables
A random variable is a real valued function whose domain is the sample space of a random
experiment.
The set of all possible real values of the random variable 𝑋 is called the range space 𝑅𝑋 of 𝑋.

Example:
Consider the experiment of tossing a coin two times in succession.
The sample space of the experiment is 𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}

If 𝑋 denotes the number of heads obtained, then 𝑋 is a random variable and for each outcome,
its value is as given below:
𝑋(𝐻𝐻) = 2, 𝑋 (𝐻𝑇) = 1, 𝑋 (𝑇𝐻) = 1, 𝑋 (𝑇𝑇) = 0.
Here, 𝑅𝑋 = {0,1,2} is the range space of 𝑋.

More than one random variable can be defined on the same sample space. For example, for the
above experiment, let 𝑌 denote the number of heads minus the number of tails for each
outcome. Then,
𝑌(𝐻𝐻) = 2, 𝑌 (𝐻𝑇) = 0, 𝑌 (𝑇𝐻) = 0, 𝑌 (𝑇𝑇) = – 2.
Here, 𝑅𝑌 = {−2,0,2} is the range space of 𝑌.

Thus, 𝑋 and 𝑌 are two different random variables defined on the same sample space S.

Example:
A person plays a game of tossing a coin thrice. For each head, he is given Rs. 2 by the organizer
of the game and for each tail, he has to give Rs 1.50 to the organizer. Let 𝑋 denote the amount
gained or lost by the person. Show that 𝑋 is a random variable and exhibit it as a function on
the sample space of the experiment.

Solution:
𝑋 is a number whose values are defined on the outcomes of a random experiment. Therefore,
𝑋 is a random variable.
Sample space of the experiment is
𝑆 = {𝐻𝐻𝐻, 𝐻𝐻𝑇, 𝐻𝑇𝐻, 𝑇𝐻𝐻, 𝐻𝑇𝑇, 𝑇𝐻𝑇, 𝑇𝑇𝐻, 𝑇𝑇𝑇}

𝑋(𝐻𝐻𝐻) = 𝑅𝑠. (2 + 2 + 2) = 𝑅𝑠. 6


𝑋(𝐻𝐻𝑇) = 𝑋(𝐻𝑇𝐻) = 𝑋(𝑇𝐻𝐻) = 𝑅𝑠. (2 + 2 − 1.50) = 𝑅𝑠. 2.50
𝑋(𝐻𝑇𝑇) = 𝑋(𝑇𝐻𝑇) = 𝑋(𝑇𝑇𝐻) = 𝑅𝑠. (2 − 1.5 − 1.5) = − 𝑅𝑠. 1
𝑋(𝑇𝑇𝑇) = 𝑅𝑠. (−1.5 − 1.5 − 1.5) = − 𝑅𝑠. 4.50
where, a minus sign shows the loss to the player.
Thus, for each element of the sample space, 𝑋 takes a unique value. Hence, 𝑋 is a function on
the sample space whose range is {– 1, 2.50, – 4.50, 6}

TYPES OF RANDOM VARIABLES

1. Discrete Random Variable (DRV):


A discrete random variable is a random variable that takes only a finite (or countably infinite)
number of values.
Eg: When a fair die is rolled, it can take only a finite number of outcomes from the set
{1, 2, 3, 4, 5, 6}.

2. Continuous Random Variable (CRV):


Unlike discrete random variables, continuous random variables can take on an infinite
number of possible values.
Eg: The height of a certain group of people in centimeters can take an infinite number of
values .

BASIC TERMINOLOGIES

Probability Mass Function:


Let 𝑋 be a discrete random variable with range {𝑥1 , 𝑥2 , 𝑥3 , … , 𝑥𝑛 }. The function 𝑃𝑋 (𝑥𝑘 ) =
𝑃(𝑋 = 𝑥𝑘 ) for 𝑘 = 1,2,3, … , 𝑛 is called the probability mass function (pmf) of 𝑋 if it satisfies
the following conditions:
(i) 𝑃(𝑥𝑖 ) ≥ 0, ∀ 𝑖
(ii) ∑𝑃(𝑥𝑖 ) = 1
Example: Let 𝑋 denote the number of heads obtained when two fair coins are tossed. The
sample space of the experiment is 𝑆 = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇}. Their respective probabilities can be
tabulated as below
𝑋 = 𝑥        0    1    2
𝑃(𝑋 = 𝑥)   1/4  1/2  1/4

In the above, each 𝑃(𝑋 = 𝑥𝑖 ) ≥ 0 and ∑3𝑖=1 𝑃(𝑋 = 𝑥𝑖 ) = 1. Hence, it is a valid pmf.

Probability Density Function:


Let 𝑋 be a continuous random variable. A function 𝑓(𝑥) is called the probability density function
(pdf) of 𝑋 if it satisfies the following conditions:
(i) 𝑓(𝑥) ≥ 0
(ii) ∫_{−∞}^{∞} 𝑓(𝑥)𝑑𝑥 = 1

Note:
1. 𝑃(𝑎 ≤ 𝑋 ≤ 𝑏) = ∫_𝑎^𝑏 𝑓(𝑥)𝑑𝑥
2. 𝑃(𝑋 ≥ 𝑎) = ∫_𝑎^∞ 𝑓(𝑥)𝑑𝑥
3. 𝑃(𝑋 ≤ 𝑎) = ∫_{−∞}^𝑎 𝑓(𝑥)𝑑𝑥
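These integrals can be approximated numerically to sanity-check a candidate pdf. A minimal sketch using a simple midpoint rule and the pdf 𝑓(𝑥) = 2𝑥 on (0, 1) (the density that appears in Exercise 2 below):

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

f = lambda x: 2 * x              # pdf on (0, 1)
print(integrate(f, 0.0, 1.0))    # ≈ 1, so f is a valid pdf
print(integrate(f, 0.25, 0.75))  # P(0.25 ≤ X ≤ 0.75) = 0.5
```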

Cumulative Distribution Function:


The cumulative distribution function (cdf) of a random variable 𝑋 is defined as 𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥).
(i) If 𝑋 is a discrete random variable,
𝐹(𝑥) = ∑_{𝑥𝑗 ≤ 𝑥} 𝑃(𝑋 = 𝑥𝑗)
(ii) If 𝑋 is a continuous random variable with pdf 𝑓(𝑥),
𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥) = ∫_{−∞}^{𝑥} 𝑓(𝑡) 𝑑𝑡

Note:
1. 𝐹(−∞) = 0 and 𝐹(∞) = 1.
2. If 𝑋 is a continuous random variable with pdf 𝑓(𝑥) and cdf 𝐹(𝑥), then 𝑓(𝑥) = (𝑑/𝑑𝑥)𝐹(𝑥).
3. Let 𝑋 be a discrete random variable and 𝐹(𝑥) be the cdf of 𝑋. Then 𝑃(𝑋 = 𝑥𝑗) = 𝐹(𝑥𝑗) − 𝐹(𝑥𝑗−1).
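For a discrete variable, the cdf is just a running sum of the pmf. The sketch below builds 𝐹 for the pmf in Exercise 1 below (values 0, 1, 2 with probabilities 1/3, 1/6, 1/2).

```python
from fractions import Fraction

pmf = {0: Fraction(1, 3), 1: Fraction(1, 6), 2: Fraction(1, 2)}

def cdf(x):
    """F(x) = P(X <= x): sum the pmf over all support points not exceeding x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

for x in (-1, 0, 0.5, 1, 2, 3):
    print(x, cdf(x))  # 0, 1/3, 1/3, 1/2, 1, 1
```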
Exercise
1. Suppose a random variable 𝑋 assumes the values 0, 1, 2 with the probabilities 1/3, 1/6 and 1/2.
Find the cumulative distribution function.
Solution:

𝑋       0    1    2
𝑃(𝑋)  1/3  1/6  1/2

𝐹(𝑥) = 𝑃(𝑋 ≤ 𝑥).
For 𝑥 < 0, 𝐹(𝑥) = 0
For 0 ≤ 𝑥 < 1, 𝐹(𝑥) = 0 + 1/3 = 1/3
For 1 ≤ 𝑥 < 2, 𝐹(𝑥) = 0 + 1/3 + 1/6 = 1/2
For 𝑥 ≥ 2, 𝐹(𝑥) = 0 + 1/3 + 1/6 + 1/2 = 1
Therefore,
𝐹(𝑥) = 0 for 𝑥 < 0; 1/3 for 0 ≤ 𝑥 < 1; 1/2 for 1 ≤ 𝑥 < 2; and 1 for 𝑥 ≥ 2.
2. Suppose 𝑋 is a continuous random variable with pdf
2𝑥 0<𝑥<1
𝑓(𝑥) = {
0 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒
Find the cdf.
Ans: 𝐹(𝑥) = 0 for 𝑥 ≤ 0; 𝑥² for 0 < 𝑥 < 1; and 1 for 𝑥 ≥ 1.
3. Let 𝑋 be a continuous random variable with probability density function 𝑓(𝑥) defined as
𝑎𝑥, 0≤𝑥≤1
𝑎, 1≤𝑥≤2
𝑓(𝑥) = {
−𝑎𝑥 + 3𝑎, 2≤𝑥≤3
0, 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒
Find
(i) Value of 𝑎
(ii) Cdf of 𝑋
(iii) 𝑃(1/3 ≤ 𝑋 ≤ 2/3)
(iv) 𝑃(𝑋 ≥ 1/2)
(v) 𝑃(𝑋 ≤ 1/4)
4. A coin is known to come up 3 times head as often as tail. The coin is tossed 3 times. Let
𝑋 denote the number of heads that appear. Write the probability distribution of 𝑋 and
find the cdf of 𝑋.
5. A random variable 𝑋 has the probability distribution function as shown below:
𝑋 0 1 2 3 4 5 6 7
𝑃(𝑋 = 𝑥) 0 𝑘 2𝑘 2𝑘 3𝑘 𝑘 2 2𝑘 2 7𝑘 2 + 𝑘
(i) Find the value of 𝑘
(ii) Find the smallest value of 𝜆 for which 𝑃(𝑋 ≤ 𝜆) > 1/2.
6. A random variable 𝑋 has the probability function 𝑃(𝑋 = 𝑘) = 𝑐/2ᵏ, 𝑘 = 0, 1, 2, …, 𝑛
Find
(i) The value of 𝑐
(ii) 𝑃(𝑋 ≥ 5)
(iii) 𝑃(𝑋 ≤ 1/2)
7. Let 𝑋 be a continuous random variable with pdf
𝑘𝑥 4 , 0<𝑥<1
𝑓(𝑥) = {
0 , 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒
Find
(i) The value of 𝑘
(ii) 𝑃(1/4 < 𝑋 < 3/4)
(iii) 𝑃(𝑋 > 1/2)
(iv) 𝑃(𝑋 < 1/8)
8. The diameter of an electric cable 𝑋 is assumed to be a continuous random variable with
6𝑥(1 − 𝑥), 0≤𝑥≤1
pdf 𝑓(𝑥) = {
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
(i) Check if 𝑓(𝑥) is a valid pdf.
(ii) Obtain an expression for the cdf of 𝑋
(iii) Compute 𝑃(𝑋 ≤ 1/2 | 1/3 < 𝑋 < 2/3)
(iv) Determine 𝑏 so that 𝑃(𝑋 < 𝑏) = 2𝑃(𝑋 ≥ 𝑏)

9. If 𝐹(𝑥) = 0 for 𝑥 ≤ 0 and 𝐹(𝑥) = 1 − 𝑒^{−𝑥} for 𝑥 > 0 is the cdf of 𝑋, find the pdf of 𝑋.
EXPECTATION AND VARIANCE
Given a random variable, we often compute the expectation and variance, two important
summary statistics. The expectation describes the average value and the variance describes the
spread (amount of variability) around the expectation.

EXPECTATION

Let 𝑋 be a random variable whose possible values 𝑥1 , 𝑥2 , 𝑥3 , … , 𝑥𝑛 occur with probabilities


𝑝1 , 𝑝2 , 𝑝3 , … , 𝑝𝑛 respectively. The mean of 𝑋, denoted by 𝜇, is the number ∑𝑥𝑖 𝑝𝑖 i.e. the mean
of a random variable 𝑿 is the weighted average of the possible values of 𝑿, each value being
weighted by the probability with which it occurs. The mean of a random variable 𝑋 is also
called the expectation of 𝑋.

Thus, if 𝐸(𝑋) is the expectation of a random variable 𝑋, then


(i) For a discrete random variable 𝑋 with pmf 𝑃(𝑥𝑖),
𝜇 = 𝐸(𝑋) = ∑_𝑖 𝑥𝑖 𝑃(𝑥𝑖)
(ii) For a continuous random variable 𝑋 with pdf 𝑓(𝑥),
𝜇 = 𝐸(𝑋) = ∫_{−∞}^{∞} 𝑥 𝑓(𝑥) 𝑑𝑥

Properties of Expectation

1. 𝐸(𝑐) = 𝑐 where 𝑐 is a constant


Proof:
𝐸(𝑐) = ∑𝑐𝑃(𝑋 = 𝑐) = 𝑐∑𝑃(𝑋 = 𝑐) = 𝑐 ∙ 1 = 𝑐

2. 𝐸(𝑎𝑋 + 𝑏) = 𝑎𝐸(𝑋) + 𝑏 where 𝑎 and 𝑏 are constants


Proof:
𝐸(𝑎𝑋 + 𝑏) = ∑(𝑎𝑋 + 𝑏)𝑃(𝑋 = 𝑥) = ∑𝑎𝑥𝑃(𝑋 = 𝑥) + ∑𝑏𝑃(𝑋 = 𝑥)
= 𝑎∑𝑥𝑃(𝑋 = 𝑥) + 𝑏∑𝑃(𝑋 = 𝑥) = 𝑎𝐸(𝑋) + 𝑏 ∙ 1 = 𝑎𝐸(𝑋) + 𝑏

3. 𝐸[𝐸(𝑋)] = 𝐸(𝑋)
Example:
Let a pair of dice be thrown and the random variable 𝑋 be the sum of the numbers that appear
on the two dice. Find the mean or expectation of 𝑋.

Solution:
The sample space of the experiment consists of 36 elementary events in the form of ordered
pairs (𝑥𝑖 , 𝑦𝑗 ) where 𝑥𝑖 = 1,2,3,4,5,6 and 𝑦𝑗 = 1,2,3,4,5,6. The random variable 𝑋 is defined as
𝑋: Sum of the numbers on the two dice
Then, 𝑅𝑋 = {2,3,4,5,6,7,8,9,10,11,12}

𝑋 = 𝑥𝑖        2     3     4     5     6     7     8     9     10    11    12
𝑃(𝑋 = 𝑥𝑖)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

𝐸(𝑋) = 𝜇 = ∑_𝑖 𝑥𝑖 𝑃(𝑋 = 𝑥𝑖) = 2 × 1/36 + 3 × 2/36 + ⋯ + 12 × 1/36 = 7
Thus, the mean of the sum of the numbers that appear on throwing two fair dice is 7.

VARIANCE
The mean of a random variable does not give us information about the variability in its values.
Variance measures this spread: if the variance is small, then the values of the random variable
are close to the mean. Note also that random variables with different probability distributions
can have equal means.

𝜎𝑥² = 𝑉𝑎𝑟(𝑋) = ∑_{𝑖=1}^{𝑛} (𝑥𝑖 − 𝜇)² 𝑝(𝑥𝑖) = ∑_{𝑖=1}^{𝑛} (𝑥𝑖² − 2𝑥𝑖𝜇 + 𝜇²) 𝑝(𝑥𝑖)
= ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖) − ∑_{𝑖=1}^{𝑛} 2𝑥𝑖𝜇 𝑝(𝑥𝑖) + ∑_{𝑖=1}^{𝑛} 𝜇² 𝑝(𝑥𝑖)
= ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖) − 2𝜇 ∑_{𝑖=1}^{𝑛} 𝑥𝑖 𝑝(𝑥𝑖) + 𝜇² ∑_{𝑖=1}^{𝑛} 𝑝(𝑥𝑖)
= ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖) − 2𝜇 × 𝜇 + 𝜇² × 1   [since ∑_𝑖 𝑝(𝑥𝑖) = 1 and ∑_𝑖 𝑥𝑖 𝑝(𝑥𝑖) = 𝜇]
= ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖) − 𝜇²
= ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖) − (∑_{𝑖=1}^{𝑛} 𝑥𝑖 𝑝(𝑥𝑖))²
i.e., 𝑉(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]², where 𝐸(𝑋²) = ∑_{𝑖=1}^{𝑛} 𝑥𝑖² 𝑝(𝑥𝑖)

Example
Find the variance of the number obtained on the throw of an unbiased dice.

Solution:
The sample space of the experiment is 𝑆 = {1,2,3,4,5,6}.
Let 𝑋 denote the number obtained on the throw. Then 𝑋 is a random variable with the
following pmf
𝑋 = 𝑥𝑖        1    2    3    4    5    6
𝑃(𝑋 = 𝑥𝑖)   1/6  1/6  1/6  1/6  1/6  1/6

𝐸(𝑋²) = 1² × 1/6 + 2² × 1/6 + 3² × 1/6 + 4² × 1/6 + 5² × 1/6 + 6² × 1/6 = 91/6
𝐸(𝑋) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 21/6
𝑉(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]² = 91/6 − (21/6)² = 35/12
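The same computation can be done with exact arithmetic; a reusable sketch:

```python
from fractions import Fraction

def mean_var(pmf):
    """Return (E(X), V(X)) for a pmf given as {value: probability}."""
    ex = sum(x * p for x, p in pmf.items())
    ex2 = sum(x * x * p for x, p in pmf.items())
    return ex, ex2 - ex ** 2

die = {x: Fraction(1, 6) for x in range(1, 7)}
print(mean_var(die))  # (Fraction(7, 2), Fraction(35, 12))
```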

Properties of variance
1. 𝑉(𝑐) = 0 where 𝑐 is a constant
Proof:
𝑉(𝑐) = 𝐸(𝑐 2 ) − [𝐸(𝑐)]2 = 𝑐 2 − 𝑐 2 = 0

2. 𝑉(𝑎𝑋 + 𝑏) = 𝑎2 𝑉(𝑋) where 𝑎 and 𝑏 are constants


Proof:
We know that 𝑉(𝑋) = 𝐸(𝑋 2 ) − [𝐸(𝑋)]2
𝑉(𝑎𝑋 + 𝑏) = 𝐸[(𝑎𝑋 + 𝑏)2 ] − [𝐸(𝑎𝑋 + 𝑏)]2
= 𝐸[𝑎2 𝑋 2 + 2𝑎𝑏𝑋 + 𝑏 2 ] − [𝑎𝐸(𝑋) + 𝑏]2
= 𝑎2 𝐸(𝑋 2 ) + 𝑏 2 + 2𝑎𝑏𝐸(𝑋) − {𝑎2 [𝐸(𝑋)]2 + 2𝑎𝑏𝐸(𝑋) + 𝑏 2 }
= 𝑎2 𝐸(𝑋 2 ) − 𝑎2 [𝐸(𝑋)]2
= 𝑎2 {𝐸(𝑋 2 ) − [𝐸(𝑋)]2 } = 𝑎2 𝑉(𝑋)
Example
Let 𝑋 be a continuous random variable with pdf
𝑓(𝑥) = 2𝑥⁻² for 1 < 𝑥 < 2, and 0 otherwise
Find 𝐸(𝑋) and 𝑉(𝑋).
Solution:
𝐸(𝑋) = ∫_{−∞}^{∞} 𝑥 𝑓(𝑥) 𝑑𝑥 = ∫_1^2 𝑥 × 2𝑥⁻² 𝑑𝑥 = [2 log 𝑥]_1^2 = 2 log 2
𝐸(𝑋²) = ∫_{−∞}^{∞} 𝑥² 𝑓(𝑥) 𝑑𝑥 = ∫_1^2 𝑥² × 2𝑥⁻² 𝑑𝑥 = [2𝑥]_1^2 = 2
𝑉(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]² = 2 − (2 log 2)² = 0.0782

MEAN, MEDIAN AND MODE


If 𝑋 is a CRV, then the median 𝑀 satisfies
∫_{−∞}^{𝑀} 𝑓(𝑥)𝑑𝑥 = ∫_{𝑀}^{∞} 𝑓(𝑥)𝑑𝑥 = 1/2
If 𝑋 is a CRV, then the mode 𝑍 is the value of 𝑥 for which 𝑓(𝑥) is maximum, i.e., 𝑓′(𝑥) = 0
and 𝑓′′(𝑥) < 0

NOTE:𝑀𝑜𝑑𝑒 = 3 × 𝑀𝑒𝑑𝑖𝑎𝑛 − 2 × 𝑀𝑒𝑎𝑛

Exercise
1. Find mean, median, mode and variance of a random variable 𝑋 having the pdf
6𝑥(1 − 𝑥) , 0 ≤ 𝑥 ≤ 1
𝑓(𝑥) = {
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
(Ans: 𝐸(𝑋) = 1/2, 𝐸(𝑋²) = 3/10, 𝑉(𝑋) = 1/20, 𝑀𝑒𝑑𝑖𝑎𝑛 = 𝑀𝑜𝑑𝑒 = 1/2)

2. Find pdf, mean, median, mode and variance of a random variable 𝑋 having cdf
1 − 𝑒 −𝑥 − 𝑥𝑒 −𝑥 , 𝑥≥0
𝐹(𝑥) = {
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
(Ans: 𝐸(𝑋) = 2, 𝐸(𝑋²) = 6, 𝑉(𝑋) = 2, 𝑀𝑒𝑑𝑖𝑎𝑛 = 5/3, 𝑀𝑜𝑑𝑒 = 1)
3. If 𝐹(𝑥) = 1 − 𝑒^{−𝑥²/2} for 𝑥 > 0 and 0 for 𝑥 ≤ 0, then find 𝑉(𝑋).
Solution:
The pdf of 𝑋 is 𝑓(𝑥) = (𝑑/𝑑𝑥)𝐹(𝑥) = 𝑥𝑒^{−𝑥²/2} for 𝑥 > 0, and 0 for 𝑥 ≤ 0.
𝐸(𝑋) = ∫_0^∞ 𝑥 𝑓(𝑥) 𝑑𝑥 = ∫_0^∞ 𝑥² 𝑒^{−𝑥²/2} 𝑑𝑥 = √2 ∫_0^∞ √𝑡 𝑒^{−𝑡} 𝑑𝑡
(by using 𝑥²/2 = 𝑡, 𝑥 𝑑𝑥 = 𝑑𝑡)
We know that
𝛤(𝑧) = ∫_0^∞ 𝑡^{𝑧−1} 𝑒^{−𝑡} 𝑑𝑡, 𝛤(𝑛 + 1) = 𝑛 𝛤(𝑛), 𝛤(1/2) = √𝜋
Therefore, 𝐸(𝑋) = √2 𝛤(3/2) = √2 ⋅ (1/2) ⋅ √𝜋 = √𝜋/√2
𝐸(𝑋²) = ∫_0^∞ 𝑥³ 𝑒^{−𝑥²/2} 𝑑𝑥 = 2 ∫_0^∞ 𝑡 𝑒^{−𝑡} 𝑑𝑡 (by using 𝑥²/2 = 𝑡)
= 2𝛤(2) = 2
Therefore, 𝑉(𝑋) = 2 − (√𝜋/√2)² = 2 − 𝜋/2 = (4 − 𝜋)/2

4. A school class of 120 students are driven in 3 buses to a symphonic performance. There
are 36 students in bus 1, 40 in bus 2, and 44 in bus 3. One student is chosen at random.
Let 𝑋 denote the number of students on the bus of that randomly chosen student. What
is 𝐸(𝑋)?
Solution:
Given 𝑃(𝑋 = 36) = 36/120, 𝑃(𝑋 = 40) = 40/120, 𝑃(𝑋 = 44) = 44/120
𝐸(𝑋) = 36 × 36/120 + 40 × 40/120 + 44 × 44/120 = 40.2667

5. A student takes a multiple choice test consisting of two problems. The first one has 3
possible answers (out of which one is correct) and the second one has 5 (out of which one
is correct). The student chosen at random chooses one answer as the right answer for
each of the two problems. Let 𝑋 denote the number of right answers of the student. Find
𝑉(𝑋).
Solution:
𝑋: Number of right answers
𝑅𝑋 = {0,1,2}
Total number of outcomes = 3𝐶1 × 5𝐶1 = 15
𝑃(𝑋 = 0) = (2𝐶1 × 4𝐶1)/15 = 8/15
𝑃(𝑋 = 1) = (2𝐶1 × 1𝐶1 + 1𝐶1 × 4𝐶1)/15 = 6/15
𝑃(𝑋 = 2) = 1/15

𝑋 = 𝑥𝑖        0     1     2
𝑃(𝑋 = 𝑥𝑖)   8/15  6/15  1/15

𝐸(𝑋) = 0 + 1 × 6/15 + 2 × 1/15 = 8/15
𝐸(𝑋²) = 0 + 1² × 6/15 + 2² × 1/15 = 2/3
𝑉(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]² = 86/225

6. Three balls are randomly selected from an urn containing 3 white, 3 red, 5 black balls. The
person who selects the ball wins 1 rupee for each white ball selected and loses 1 rupee
for each red ball selected. Let 𝑋 be the total winnings from the experiment. Find the
probability distribution of 𝑋 and 𝑉(𝑋)
Solution:
𝑋: Total winnings

Selection 1W,2R 2W,1R 1R,2B 2R,1B 1W,2B 2W,1B 1W,1R,1B 3W 3R 3B


𝑋 -1 1 -1 -2 1 2 0 3 -3 0

𝑅𝑋 = {−3, −2, −1,0,1,2,3}

𝑃(𝑋 = −3) = 𝑃(3𝑅) = 3𝐶3/11𝐶3 = 1/165
𝑃(𝑋 = −2) = 𝑃(2𝑅, 1𝐵) = (3𝐶2 × 5𝐶1)/11𝐶3 = 15/165
𝑃(𝑋 = −1) = 𝑃(1𝑊, 2𝑅) + 𝑃(1𝑅, 2𝐵) = (3𝐶1 × 3𝐶2)/11𝐶3 + (3𝐶1 × 5𝐶2)/11𝐶3 = 39/165
𝑃(𝑋 = 0) = 𝑃(1𝑊, 1𝑅, 1𝐵) + 𝑃(3𝐵) = (3𝐶1 × 3𝐶1 × 5𝐶1)/11𝐶3 + 5𝐶3/11𝐶3 = 55/165
Similarly, 𝑃(𝑋 = 1) = 39/165, 𝑃(𝑋 = 2) = 15/165, 𝑃(𝑋 = 3) = 1/165

𝑋            −3      −2      −1       0       1       2       3
𝑃(𝑋 = 𝑥𝑖)  1/165  15/165  39/165  55/165  39/165  15/165  1/165

𝐸(𝑋) = 0, 𝐸(𝑋²) = 216/165, 𝑉(𝑋) = 216/165

7. A coin is tossed till head appears then find the probability distribution on number of
tosses. Let 𝑋 denote the number of tosses. Find 𝐸(𝑋).
Solution:
𝑋: Number of tosses
Let 𝑃(𝐻) = 𝑝, 𝑃(𝑇) = 𝑞 = 1 − 𝑝
𝑅𝑋 = {1,2,3, … }
𝑃(𝑋 = 1) = 𝑝, 𝑃(𝑋 = 2) = 𝑞𝑝, 𝑃(𝑋 = 3) = 𝑞²𝑝, …
Hence, 𝑃(𝑋 = 𝑘) = 𝑝𝑞^{𝑘−1}
𝐸(𝑋) = ∑_{𝑘=1}^{∞} 𝑘 𝑝𝑞^{𝑘−1} = 𝑝/(1 − 𝑞)² = 1/𝑝
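A quick simulation confirms 𝐸(𝑋) = 1/𝑝; a sketch with 𝑝 = 1/2 (the fair-coin case):

```python
import random

def tosses_until_head(p, rng):
    """Count tosses of a p-coin until the first head appears."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

rng = random.Random(0)
p = 0.5
trials = 100_000
avg = sum(tosses_until_head(p, rng) for _ in range(trials)) / trials
print(avg)  # close to 1/p = 2
```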

8. Suppose that an electronic device has a life length 𝑋 (in units of 1000 hours) which is
considered as a continuous random variable with the pdf 𝑓(𝑥) = 𝑒 −𝑥 , 𝑥 > 0. Suppose
that the cost of manufacturing one such item is 2 rupees. The manufacturer sells the item
for 5 rupees but guarantees a total refund if 𝑥 ≤ 0.9 . What is the manufacturer's
expected profit per item?
Solution:
𝑌: profit per item
Profit = 3 if 𝑥 > 0.9, and −2 if 𝑥 ≤ 0.9
𝐸(𝑌) = 3 × 𝑃(𝑋 > 0.9) + (−2) × 𝑃(𝑋 ≤ 0.9)
= 3 × ∫_{0.9}^{∞} 𝑒^{−𝑥} 𝑑𝑥 + (−2) × ∫_0^{0.9} 𝑒^{−𝑥} 𝑑𝑥
= 3 × 𝑒^{−0.9} − 2 × (−𝑒^{−0.9} + 1) = 0.0325

9. Let 𝑋 be a random variable with probability function
𝑃(𝑋 = 𝑘) = 𝑝(1 − 𝑝)^{𝑘−1}, 𝑘 = 1, 2, 3, …
Find 𝑉(𝑋).
Solution:
Given 𝑃(𝑋 = 𝑘) = 𝑝(1 − 𝑝)^{𝑘−1}, 𝑘 = 1, 2, 3, …

𝑋 = 𝑘        1    2          3            …
𝑃(𝑋 = 𝑘)   𝑝   𝑝(1 − 𝑝)   𝑝(1 − 𝑝)²   …

𝐸(𝑋) = ∑ 𝑥 𝑝(𝑥) = ∑_𝑘 𝑘 𝑝(1 − 𝑝)^{𝑘−1}
= 𝑝 + 2𝑝(1 − 𝑝) + 3𝑝(1 − 𝑝)² + ⋯
= 𝑝[1 + 2(1 − 𝑝) + 3(1 − 𝑝)² + ⋯]
= 𝑝 × 1/(1 − (1 − 𝑝))² = 1/𝑝
(NOTE: 1 + 𝑥 + 𝑥² + 𝑥³ + ⋯ = 1/(1 − 𝑥); 1 + 2𝑥 + 3𝑥² + ⋯ = 1/(1 − 𝑥)²)
𝐸(𝑋²) = ∑ 𝑥² 𝑝(𝑥) = ∑_𝑘 𝑘² 𝑝(1 − 𝑝)^{𝑘−1}
= 𝑝 + 4𝑝(1 − 𝑝) + 9𝑝(1 − 𝑝)² + ⋯
= 𝑝(1 + 4(1 − 𝑝) + 9(1 − 𝑝)² + ⋯) = 𝑝𝑆,
where 𝑆 = 1 + 4(1 − 𝑝) + 9(1 − 𝑝)² + ⋯   … (𝑖)
(𝑖) × (1 − 𝑝) gives
𝑆(1 − 𝑝) = (1 − 𝑝) + 4(1 − 𝑝)² + 9(1 − 𝑝)³ + ⋯   … (𝑖𝑖)
(𝑖) − (𝑖𝑖) simplifies to
𝑝𝑆 = 1 + 3(1 − 𝑝) + 5(1 − 𝑝)² + ⋯   … (𝑖𝑖𝑖)
(𝑖𝑖𝑖) × (1 − 𝑝) gives
𝑝(1 − 𝑝)𝑆 = (1 − 𝑝) + 3(1 − 𝑝)² + 5(1 − 𝑝)³ + ⋯   … (𝑖𝑣)
(𝑖𝑖𝑖) − (𝑖𝑣) simplifies to
𝑝²𝑆 = 1 + 2(1 − 𝑝) + 2(1 − 𝑝)² + ⋯
= 1 + 2(1 − 𝑝)[1 + (1 − 𝑝) + (1 − 𝑝)² + ⋯]
𝑆 = (1/𝑝²){1 + 2(1 − 𝑝)[1 + (1 − 𝑝) + (1 − 𝑝)² + ⋯]}
= (1/𝑝²)[1 + 2(1 − 𝑝) × 1/(1 − (1 − 𝑝))]
= (1/𝑝²)[2/𝑝 − 1] = 2/𝑝³ − 1/𝑝²
∴ 𝐸(𝑋²) = 𝑝𝑆 = 2/𝑝² − 1/𝑝
Hence 𝑉(𝑋) = 𝐸(𝑋²) − [𝐸(𝑋)]² = (1 − 𝑝)/𝑝²

CHEBYSHEV’S INEQUALITY
Let 𝑋 be a random variable with 𝐸(𝑋) = 𝜇 and 𝑉(𝑋) = 𝜎². Then for any positive real number 𝑘,

𝑃(|𝑋 − 𝜇| ≥ 𝑘) ≤ 𝜎²/𝑘² and 𝑃(|𝑋 − 𝜇| < 𝑘) ≥ 1 − 𝜎²/𝑘²
OR
𝑃(𝜇 − 𝑘 < 𝑋 < 𝜇 + 𝑘) ≥ 1 − 𝜎²/𝑘²

Proof:

(i) Discrete case:
𝜎² = 𝑉(𝑋) = ∑_{𝑖=−∞}^{∞} (𝑥𝑖 − 𝜇)² 𝑃(𝑥𝑖)
𝜎² ≥ ∑_{|𝑥𝑖−𝜇|≥𝑘} (𝑥𝑖 − 𝜇)² 𝑃(𝑥𝑖)
(from the full sum, keep only the terms that satisfy |𝑥𝑖 − 𝜇| ≥ 𝑘)
𝜎² ≥ ∑_{|𝑥𝑖−𝜇|≥𝑘} 𝑘² 𝑃(𝑥𝑖)   (since |𝑥𝑖 − 𝜇| ≥ 𝑘)
𝜎²/𝑘² ≥ ∑_{|𝑥𝑖−𝜇|≥𝑘} 𝑃(𝑥𝑖) = 𝑃(|𝑋 − 𝜇| ≥ 𝑘)
Therefore, 𝑃(|𝑋 − 𝜇| ≥ 𝑘) ≤ 𝜎²/𝑘² and 𝑃(|𝑋 − 𝜇| < 𝑘) ≥ 1 − 𝜎²/𝑘²
(since 𝑃(|𝑋 − 𝜇| ≥ 𝑘) = 1 − 𝑃(|𝑋 − 𝜇| < 𝑘))

(ii) Continuous case:
Consider 𝑉(𝑋) = 𝐸[(𝑋 − 𝜇)²]:
𝜎² = ∫_{−∞}^{∞} (𝑥 − 𝜇)² 𝑓(𝑥) 𝑑𝑥
≥ ∫_{−∞}^{𝜇−𝑘} (𝑥 − 𝜇)² 𝑓(𝑥) 𝑑𝑥 + ∫_{𝜇+𝑘}^{∞} (𝑥 − 𝜇)² 𝑓(𝑥) 𝑑𝑥
On these ranges |𝑥 − 𝜇| ≥ 𝑘, so (𝑥 − 𝜇)² ≥ 𝑘². Hence
𝜎² ≥ 𝑘² [∫_{−∞}^{𝜇−𝑘} 𝑓(𝑥) 𝑑𝑥 + ∫_{𝜇+𝑘}^{∞} 𝑓(𝑥) 𝑑𝑥]
= 𝑘² 𝑃[𝑋 ≤ 𝜇 − 𝑘 or 𝑋 ≥ 𝜇 + 𝑘]
= 𝑘² 𝑃[𝑋 − 𝜇 ≤ −𝑘 or 𝑋 − 𝜇 ≥ 𝑘]
= 𝑘² 𝑃(|𝑋 − 𝜇| ≥ 𝑘)
i.e., 𝜎² ≥ 𝑘² 𝑃(|𝑋 − 𝜇| ≥ 𝑘)
Hence, 𝑃(|𝑋 − 𝜇| ≥ 𝑘) ≤ 𝜎²/𝑘²

NOTE:
Replacing 𝑘 by 𝑘𝜎, we get
𝑃(|𝑋 − 𝜇| ≥ 𝑘𝜎) ≤ 1/𝑘²
𝑃(|𝑋 − 𝜇| < 𝑘𝜎) ≥ 1 − 1/𝑘²
𝑃(𝜇 − 𝑘𝜎 < 𝑋 < 𝜇 + 𝑘𝜎) ≥ 1 − 1/𝑘²
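The bound is easy to see empirically: for any distribution, the simulated frequency of |𝑋 − 𝜇| ≥ 𝑘𝜎 stays below 1/𝑘². A sketch using the exponential distribution (mean 1, standard deviation 1) as an arbitrary test case:

```python
import random

rng = random.Random(0)
samples = [rng.expovariate(1.0) for _ in range(200_000)]  # mean 1, std dev 1
mu, sigma = 1.0, 1.0

for k in (1.5, 2.0, 3.0):
    freq = sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)
    print(k, freq, "<=", 1 / k ** 2)  # observed tail mass vs Chebyshev bound
```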

Exercise
1. Apply Chebyshev's inequality to find the following probabilities with 𝜇 = 10 and 𝜎² = 4.
(i) 𝑃(5 < 𝑋 < 15) (Ans: ≥ 21/25, where 𝑘 = 5/2)
(ii) 𝑃(|𝑋 − 10| < 3) (Ans: ≥ 5/9, where 𝑘 = 3/2)
(iii) 𝑃(|𝑋 − 10| ≥ 3) (Ans: ≤ 4/9)

2. A random variable has mean 3 and variance 2. Find an upper bound for
(i) 𝑃(|𝑋 − 3| ≥ 2) (Ans: ≤ 1/2)
(ii) 𝑃(|𝑋 − 3| ≥ 1) (Ans: ≤ 2)


3. Find the smallest value of 𝑘 using Chebyshev's inequality for which the probability is at
least 0.95.
Solution:
𝑃(|𝑋 − 𝜇| ≤ 𝑘𝜎) ≥ 1 − 1/𝑘²
0.95 = 1 − 1/𝑘², i.e., 𝑘 = √20
4. A random variable has a pdf given by
2𝑒 −2𝑥 , 𝑥≥0
𝑓(𝑥) = {
0, 𝑥<0
(i) Find 𝑃(|𝑋 − 𝜇| ≥ 1)
(ii) Use Chebyshev’s inequality and obtain the upper bound for 𝑃(|𝑋 − 𝜇| ≥ 1) and
verify
Solution:
𝜇 = ∫_0^∞ 𝑥 ⋅ 2𝑒^{−2𝑥} 𝑑𝑥 = 1/2
(i) 𝑃(|𝑋 − 𝜇| ≥ 1) = 𝑃(|𝑋 − 1/2| ≥ 1) = 1 − 𝑃(|𝑋 − 1/2| < 1)
= 1 − 𝑃(−1 < 𝑋 − 1/2 < 1)
= 1 − 𝑃(−1/2 < 𝑋 < 3/2)
= 1 − ∫_0^{3/2} 2𝑒^{−2𝑥} 𝑑𝑥 = 0.04978
(ii) By Chebyshev's inequality,
𝑃(|𝑋 − 𝜇| ≥ 𝑘) ≤ 𝜎²/𝑘² ⟹ 𝑃(|𝑋 − 𝜇| ≥ 1) ≤ 𝜎²
𝐸(𝑋²) = ∫_0^∞ 𝑥² ⋅ 2𝑒^{−2𝑥} 𝑑𝑥 = 1/2
𝑉(𝑋) = 𝜎² = 𝐸(𝑋²) − [𝐸(𝑋)]² = 1/4
∴ 𝑃(|𝑋 − 𝜇| ≥ 1) ≤ 1/4, which is > 0.04978. Hence verified.

5. A random variable 𝑋 has the pdf 𝑓(𝑥) = (1/2)𝑒^{−|𝑥|}, −∞ < 𝑥 < ∞.
(i) Find 𝑃(|𝑋 − 𝜇| ≥ 2) (Ans: 0.1353)
(ii) Use Chebyshev’s inequality and verify

6. If 𝐸(𝑋) = 3 and 𝐸(𝑋 2 ) = 13, find the bounds for


(i) 𝑃(|𝑋 − 3| < 4) (Ans: 0.75)
(ii) 𝑃(|𝑋 − 3| ≥ 4) (Ans: 0.25)
(iii) 𝑃(0 < 𝑋 < 6) (Ans: 5/9)

7. What is the minimum value of 𝑃(−3 ≤ 𝑋 ≤ 3), given that 𝜇 = 0, 𝜎 = 1. (Ans: 8/9)

8. The number of patients requiring ICU in a hospital is a random variable with mean 18
and standard deviation 2.5. Determine the minimum probability that the number of
patients is between 8 and 28.
Solution:
𝑃(8 < 𝑋 < 28) ≥ 1 − 1/𝑘² (by Chebyshev's inequality)
𝜇 − 𝑘𝜎 = 8 ⟹ 18 − 𝑘 × 2.5 = 8 ⟹ 𝑘 = 4
Therefore, 𝑃(8 < 𝑋 < 28) ≥ 1 − 1/4²
The minimum probability is 15/16.
Two-dimensional Random Variables
Let 𝑆 be the sample space associated with a random experiment 𝐸. Let 𝑋 = 𝑋(𝑆) and
𝑌 = 𝑌(𝑆) be two functions each assigning a real number to each 𝑠 ∈ 𝑆. Then (𝑋, 𝑌) is called
a two-dimensional random variable.

BASIC TERMINOLOGIES

Joint Probability distribution function

Let (𝑋, 𝑌) be a two-dimensional discrete random variable. With each possible outcome
(𝑥𝑖 , 𝑦𝑖 ) we associate a number 𝑝(𝑥𝑖 , 𝑦𝑗 ) representing 𝑃(𝑋 = 𝑥𝑖 , 𝑌 = 𝑦𝑗 ) and satisfying the
following conditions:

(i) 𝑝(𝑥𝑖, 𝑦𝑗) ≥ 0 for all 𝑖, 𝑗
(ii) ∑_{𝑗=1}^{∞} ∑_{𝑖=1}^{∞} 𝑝(𝑥𝑖, 𝑦𝑗) = 1

Joint Probability density function


Let (𝑋, 𝑌) be a continuous random variable assuming all values in some region 𝑅 of the
Euclidean plane. The joint probability density function 𝑓 is a function satisfying the following
conditions:
(i) 𝑓(𝑥, 𝑦) ≥ 0 for all (𝑥, 𝑦) ∈ 𝑅,
(ii) ∬𝑅 𝑓(𝑥, 𝑦)𝑑𝑥𝑑𝑦 = 1

Joint Cumulative distribution function


For two-dimensional random variable (𝑋, 𝑌), the cdf 𝐹(𝑥, 𝑦) is defined as
𝐹(𝑥, 𝑦) = 𝑃(𝑋 ≤ 𝑥, 𝑌 ≤ 𝑦)
(i) For a discrete random variable:
𝐹(𝑥, 𝑦) = ∑_{𝑥𝑖 ≤ 𝑥} ∑_{𝑦𝑗 ≤ 𝑦} 𝑃(𝑥𝑖, 𝑦𝑗)
(ii) For a continuous random variable:
𝐹(𝑥, 𝑦) = ∫_{−∞}^{𝑥} ∫_{−∞}^{𝑦} 𝑓(𝑢, 𝑣) 𝑑𝑣 𝑑𝑢
Note:
If 𝐹(𝑥, 𝑦) is the cdf of a two-dimensional random variable with joint pdf 𝑓(𝑥, 𝑦), then
∂²𝐹(𝑥, 𝑦)/∂𝑥∂𝑦 = 𝑓(𝑥, 𝑦)

Marginal Probability mass function

If (𝑋, 𝑌) is a two-dimensional discrete random variable with joint pmf 𝑝(𝑥𝑖, 𝑦𝑗), then since 𝑋 = 𝑥𝑖
must occur with 𝑌 = 𝑦𝑗 for some 𝑗, and can occur with 𝑌 = 𝑦𝑗 for only one 𝑗, we have

𝑝(𝑥𝑖) = 𝑃(𝑋 = 𝑥𝑖, 𝑌 = 𝑦1 or 𝑋 = 𝑥𝑖, 𝑌 = 𝑦2 or ⋯) = ∑_{𝑗=1}^{∞} 𝑝(𝑥𝑖, 𝑦𝑗).

The function 𝑝 defined above is called the marginal pmf of 𝑋.

Similarly, the marginal pmf of 𝑌 is

𝑞(𝑦𝑗) = 𝑃(𝑌 = 𝑦𝑗) = ∑_{𝑖=1}^{∞} 𝑝(𝑥𝑖, 𝑦𝑗)

Marginal Probability density function


Let 𝑓(𝑥, 𝑦) be the joint pdf of the continuous two-dimensional random variable (𝑋, 𝑌). We
define 𝑔(𝑥) and ℎ(𝑦), the marginal probability density functions of 𝑋 and 𝑌, respectively, as
follows:

𝑔(𝑥) = ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑦
ℎ(𝑦) = ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑥

These marginal pdf’s correspond to the basic pdf’s of the one-dimensional random variables
𝑋 and 𝑌, respectively.

Example:
Two production lines manufacture a certain type of item. Suppose that the capacity (on any
given day) is 5 items for line I and 3 items for line II. Assume that the number of items actually
produced by either production line is a random variable. Let (𝑋, 𝑌) represent the two-
dimensional random variable yielding the number of items produced by line I and line II,
respectively. The following table gives the joint probability distribution of (𝑋, 𝑌). Each entry
represents 𝑝(𝑥𝑖 , 𝑦𝑗 ) = 𝑃(𝑋 = 𝑥𝑖 , 𝑌 = 𝑦𝑗 ).

Y \ X     0      1      2      3      4      5     Sum
0        0.00   0.01   0.03   0.05   0.07   0.09   0.25
1        0.01   0.02   0.04   0.05   0.06   0.08   0.26
2        0.01   0.03   0.05   0.05   0.05   0.06   0.25
3        0.01   0.02   0.04   0.06   0.06   0.05   0.24
Sum      0.03   0.08   0.16   0.21   0.24   0.28   1.00
Thus
𝑃(2,3) = 𝑃(𝑋 = 2, 𝑌 = 3) = 0.04, etc. Hence if 𝐵 is defined as
𝐵 = {More items are produced by line I than by line II}
We find that
𝑃(𝐵) = 0.01 + 0.03 + 0.05 + 0.07 + 0.09 + 0.04 + 0.05 + 0.06 + 0.08 + 0.05 + 0.05
+ 0.06 + 0.06 + 0.05
= 0.75
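Working from the table is mechanical enough to script. The sketch below stores the joint pmf as a dictionary, recomputes the marginal sums, and re-derives 𝑃(𝐵) by summing the entries with 𝑥 > 𝑦.

```python
# Joint pmf p(x, y) from the production-lines table above
p = {
    (0,0):0.00,(1,0):0.01,(2,0):0.03,(3,0):0.05,(4,0):0.07,(5,0):0.09,
    (0,1):0.01,(1,1):0.02,(2,1):0.04,(3,1):0.05,(4,1):0.06,(5,1):0.08,
    (0,2):0.01,(1,2):0.03,(2,2):0.05,(3,2):0.05,(4,2):0.05,(5,2):0.06,
    (0,3):0.01,(1,3):0.02,(2,3):0.04,(3,3):0.06,(4,3):0.06,(5,3):0.05,
}

# Marginal pmf of X: sum each column over y
marginal_x = {x: sum(v for (xi, _), v in p.items() if xi == x) for x in range(6)}
# B: line I produces more items than line II
p_B = sum(v for (x, y), v in p.items() if x > y)
print(marginal_x)     # {0: 0.03, 1: 0.08, ...}
print(round(p_B, 2))  # 0.75
```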

Conditional probability mass function


Let (𝑋, 𝑌) be a discrete two-dimensional random variable with joint pmf 𝑝(𝑥𝑖 , 𝑦𝑗 ). Let 𝑝(𝑥𝑖 )
and 𝑞(𝑦𝑗 ) be the marginal pmf’s of 𝑋 and 𝑌, respectively.
The conditional pmf of 𝑋 given 𝑌 is defined as
𝑝(𝑥𝑖 | 𝑦𝑗) = 𝑃(𝑋 = 𝑥𝑖 | 𝑌 = 𝑦𝑗) = 𝑝(𝑥𝑖, 𝑦𝑗)/𝑞(𝑦𝑗), if 𝑞(𝑦𝑗) > 0
The conditional pmf of 𝑌 given 𝑋 is defined as
𝑝(𝑦𝑗 | 𝑥𝑖) = 𝑃(𝑌 = 𝑦𝑗 | 𝑋 = 𝑥𝑖) = 𝑝(𝑥𝑖, 𝑦𝑗)/𝑝(𝑥𝑖), if 𝑝(𝑥𝑖) > 0

Conditional probability density function


Let (𝑋, 𝑌) be a continuous two-dimensional random variable with joint pdf 𝑓(𝑥, 𝑦). Let 𝑔(𝑥)
and ℎ(𝑦) be the marginal pdf’s of 𝑋 and 𝑌, respectively.
The conditional pdf of 𝑋 for given 𝑌 = 𝑦 is defined by
𝑔(𝑥 | 𝑦) = 𝑓(𝑥, 𝑦)/ℎ(𝑦), ℎ(𝑦) > 0
The conditional pdf of 𝑌 for given 𝑋 = 𝑥 is defined by
ℎ(𝑦 | 𝑥) = 𝑓(𝑥, 𝑦)/𝑔(𝑥), 𝑔(𝑥) > 0
Independent Random Variables
Just as we defined the concept of independence between two events 𝐴 and 𝐵, we shall now
define independent random variables. Intuitively, we intend to say that 𝑋 and 𝑌 are
independent random variables if the outcome of 𝑋, say, in no way influences the outcome of
𝑌. This is an extremely important notion and there are many situations in which such an
assumption is justified.
Let (𝑋, 𝑌) be a two-dimensional discrete random variable. We say that 𝑋 and 𝑌 are
independent random variables if and only if
(i) For a discrete random variable:
𝑝(𝑥𝑖 , 𝑦𝑗 ) = 𝑝(𝑥𝑖 )𝑞(𝑦𝑗 ) ∀ 𝑖, 𝑗
i.e., 𝑃(𝑋 = 𝑥𝑖 , 𝑌 = 𝑦𝑗 ) = 𝑃(𝑋 = 𝑥𝑖 )𝑃(𝑌 = 𝑦𝑗 ) ∀ 𝑖, 𝑗

(ii) For a continuous random variable:


𝑓(𝑥, 𝑦) = 𝑔(𝑥)ℎ(𝑦) ∀ (𝑥, 𝑦)
where 𝑓(𝑥, 𝑦) is the joint pdf, and 𝑔(𝑥) and ℎ(𝑦) are the marginal pdf’s of 𝑋 and
𝑌, respectively.

Exercise
1. A fair coin is tossed 3 times. Let 𝑋 be 0 or 1 according as 𝐻 or 𝑇 appears on the first
toss, and let 𝑌 be the number of heads. Find the joint probability distribution of 𝑋 and 𝑌.

2. Suppose that 3 balls are randomly selected from urn containing 3 red, 4 white, 5 black
balls. If 𝑋 and 𝑌 denotes the number of red balls and the number of white balls chosen,
then find the probability distribution of 𝑋 and 𝑌.

3. Evaluate the conditional probability 𝑃(𝑋 = 2|𝑌 = 2) from the following table

Y \ X     0      1      2      3      4      5     Sum
0        0.00   0.01   0.03   0.05   0.07   0.09   0.25
1        0.01   0.02   0.04   0.05   0.06   0.08   0.26
2        0.01   0.03   0.05   0.05   0.05   0.06   0.25
3        0.01   0.02   0.04   0.06   0.06   0.05   0.24
Sum      0.03   0.08   0.16   0.21   0.24   0.28   1.00
Solution:
𝑃(𝑋 = 2 | 𝑌 = 2) = 𝑃(𝑋 = 2, 𝑌 = 2)/𝑃(𝑌 = 2) = 0.05/0.25 = 0.20.

4. A two-dimensional continuous random variable (𝑋, 𝑌) has joint pdf given by


𝑓(𝑥, 𝑦) = 𝑥² + 𝑥𝑦/3 for 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 2, and 0 elsewhere

Find 𝑔(𝑥 | 𝑦) and ℎ(𝑦 | 𝑥).

Solution:
𝑔(𝑥) = ∫_0^2 (𝑥² + 𝑥𝑦/3) 𝑑𝑦 = 2𝑥² + (2/3)𝑥
ℎ(𝑦) = ∫_0^1 (𝑥² + 𝑥𝑦/3) 𝑑𝑥 = 𝑦/6 + 1/3
Hence 𝑔(𝑥 | 𝑦) = (𝑥² + 𝑥𝑦/3)/(1/3 + 𝑦/6) = (6𝑥² + 2𝑥𝑦)/(2 + 𝑦), 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 2;
ℎ(𝑦 | 𝑥) = (𝑥² + 𝑥𝑦/3)/(2𝑥² + 2𝑥/3) = (3𝑥 + 𝑦)/(6𝑥 + 2), 0 ≤ 𝑦 ≤ 2, 0 ≤ 𝑥 ≤ 1.
3

5. Suppose that a machine is used for a particular task in the morning and for a different
task in the afternoon. Let 𝑋 and 𝑌 represent the number of times the machine breaks
down in the morning and in the afternoon, respectively. Following table gives the joint
probability distribution of (𝑋, 𝑌). Check whether 𝑋 and 𝑌 are independent random
variables.

Y \ X      0      1      2     𝑞(𝑦𝑗)
0         0.1    0.2    0.2    0.5
1         0.04   0.08   0.08   0.2
2         0.06   0.12   0.12   0.3
𝑝(𝑥𝑖)    0.2    0.4    0.4    1.00
Solution:
𝑃(𝑋 = 0, 𝑌 = 0) = 0.1; 𝑝(𝑋 = 0)𝑞(𝑌 = 0) = 0.2 × 0.5 = 0.1
⟹ 𝑃(𝑋 = 0, 𝑌 = 0) = 𝑝(𝑋 = 0)𝑞(𝑌 = 0)

Similarly, 𝑝(𝑥𝑖 , 𝑦𝑗 ) = 𝑝(𝑥𝑖 )𝑞(𝑦𝑗 ) ∀ 𝑥, 𝑦

Thus 𝑋 and 𝑌 are independent random variables.

6. Let 𝑋 and 𝑌 be the life lengths of two electronic devices. Suppose that their joint pdf
is given by
𝑓(𝑥, 𝑦) = 𝑒 −(𝑥+𝑦) , 𝑥 ≥ 0, 𝑦 ≥ 0
Check whether 𝑋 and 𝑌 are independent random variables.
Solution:
𝑔(𝑥) = ∫_0^∞ 𝑓(𝑥, 𝑦) 𝑑𝑦 = 𝑒^{−𝑥} ∫_0^∞ 𝑒^{−𝑦} 𝑑𝑦 = 𝑒^{−𝑥}, 𝑥 ≥ 0
ℎ(𝑦) = ∫_0^∞ 𝑓(𝑥, 𝑦) 𝑑𝑥 = 𝑒^{−𝑦} ∫_0^∞ 𝑒^{−𝑥} 𝑑𝑥 = 𝑒^{−𝑦}, 𝑦 ≥ 0
Hence, 𝑓(𝑥, 𝑦) = 𝑔(𝑥) × ℎ(𝑦), i.e., 𝑋 and 𝑌 are independent random variables

7. Suppose that the following table represents the joint probability distribution of the
discrete random variable (𝑋, 𝑌). Evaluate all the marginal and conditional
distributions.

Y \ X     1      2      3
1        1/12   1/6    0
2         0     1/9    1/5
3        1/18   1/4    2/15

8. Suppose that the two-dimensional random variable (𝑋, 𝑌) has joint pdf
𝑘𝑥(𝑥 − 𝑦), 0 < 𝑥 < 2, −𝑥 < 𝑦 < 𝑥
𝑓(𝑥, 𝑦) = {
0, 𝑒𝑙𝑠𝑒𝑤ℎ𝑒𝑟𝑒

(i) Evaluate the constant 𝑘. (Ans: 1/8)
(ii) Find the marginal pdf of 𝑋.
Ans: 𝑔(𝑥) = 𝑥³/4 for 0 < 𝑥 < 2, and 0 otherwise
(iii) Find the marginal pdf of 𝑌.
Ans: ℎ(𝑦) = 1/3 − 𝑦/4 + 5𝑦³/48 for −2 ≤ 𝑦 ≤ 0; 1/3 − 𝑦/4 + 𝑦³/48 for 0 ≤ 𝑦 ≤ 2; and 0 otherwise

9. Suppose that the joint pdf of the two-dimensional random variable (𝑋, 𝑌) is given by
𝑓(𝑥, 𝑦) = 𝑥² + 𝑥𝑦/3 for 0 < 𝑥 < 1, 0 < 𝑦 < 2, and 0 elsewhere
(i) Check if 𝑓(𝑥, 𝑦) is a valid pdf
(ii) Evaluate 𝑃(𝐵) where 𝐵 = {𝑋 + 𝑌 ≥ 1}
(iii) Find the marginal pdf of 𝑋 and 𝑌
(iv) Find 𝑔(𝑥|𝑦)
(v) Find ℎ(𝑦|𝑥)
(vi) Evaluate 𝑃(𝑋 > 1/2)
(vii) Evaluate 𝑃(𝑌 < 𝑋)
(viii) Evaluate 𝑃(𝑌 < 1/2 | 𝑋 < 1/2)
Solution:
(i) To check that ∫_{−∞}^{∞} ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑥 𝑑𝑦 = 1:
∫_0^2 ∫_0^1 (𝑥² + 𝑥𝑦/3) 𝑑𝑥 𝑑𝑦 = ∫_0^2 [𝑥³/3 + 𝑥²𝑦/6]_{𝑥=0}^{𝑥=1} 𝑑𝑦
= ∫_0^2 (1/3 + 𝑦/6) 𝑑𝑦 = [𝑦/3 + 𝑦²/12]_0^2 = 2/3 + 4/12 = 1.
(ii) Let 𝐵 = {𝑋 + 𝑌 ≥ 1}. Then 𝑃(𝐵) = 1 − 𝑃(𝐵̅), where 𝐵̅ = {𝑋 + 𝑌 < 1}.
Hence 𝑃(𝐵) = 1 − ∫_0^1 ∫_0^{1−𝑥} (𝑥² + 𝑥𝑦/3) 𝑑𝑦 𝑑𝑥
= 1 − ∫_0^1 [𝑥²(1 − 𝑥) + 𝑥(1 − 𝑥)²/6] 𝑑𝑥
= 1 − 7/72 = 65/72.
(iii) Marginal pdf of 𝑋:
𝑔(𝑥) = ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑦 = ∫_0^2 (𝑥² + 𝑥𝑦/3) 𝑑𝑦 = 2𝑥² + 2𝑥/3, 0 < 𝑥 < 1
Marginal pdf of 𝑌:
ℎ(𝑦) = ∫_{−∞}^{∞} 𝑓(𝑥, 𝑦) 𝑑𝑥 = ∫_0^1 (𝑥² + 𝑥𝑦/3) 𝑑𝑥 = 𝑦/6 + 1/3, 0 < 𝑦 < 2
(iv) 𝑔(𝑥|𝑦) = 𝑓(𝑥, 𝑦)/ℎ(𝑦) = (6𝑥² + 2𝑥𝑦)/(2 + 𝑦), 0 < 𝑥 < 1, 0 < 𝑦 < 2
(v) ℎ(𝑦|𝑥) = 𝑓(𝑥, 𝑦)/𝑔(𝑥) = (3𝑥 + 𝑦)/(6𝑥 + 2), 0 < 𝑥 < 1, 0 < 𝑦 < 2
(vi) 𝑃(𝑋 > 1/2) = ∫_0^2 ∫_{1/2}^1 (𝑥² + 𝑥𝑦/3) 𝑑𝑥 𝑑𝑦 = ∫_0^2 (7/24 + 3𝑦/24) 𝑑𝑦 = 5/6
(vii) 𝑃(𝑌 < 𝑋) = ∫_0^1 ∫_0^𝑥 (𝑥² + 𝑥𝑦/3) 𝑑𝑦 𝑑𝑥 = ∫_0^1 (7𝑥³/6) 𝑑𝑥 = 7/24
(viii) Ans: 5/32

10.For what value of 𝑘 is 𝑓(𝑥, 𝑦) = 𝑘𝑒 −(𝑥+𝑦) a joint pdf of (𝑋, 𝑌) over the region 0 < 𝑥 <
1, 0 < 𝑦 < 1? (Ans: 2.5027)

11.Suppose that the continuous two-dimensional random variable (𝑋, 𝑌) is uniformly


distributed over the square whose vertices are (1,0), (0,1), (−1,0), and (0, −1). Find
the marginal pdf’s of 𝑋 and 𝑌.
(Ans: 𝑔(𝑥) = 1 − |𝑥|, −1 < 𝑥 < 1, ℎ(𝑦) = 1 − |𝑦| , − 1 < 𝑦 < 1)

12.Suppose that the joint pdf of (𝑋, 𝑌) is given by


𝑓(𝑥, 𝑦) = 𝑒^{−𝑦} for 𝑥 > 0, 𝑦 > 𝑥, and 0 otherwise
(i) Find the marginal pdf of 𝑋. (Ans: 𝑔(𝑥) = 𝑒^{−𝑥}, 𝑥 > 0)
(ii) Find the marginal pdf of 𝑌. (Ans: ℎ(𝑦) = 𝑦𝑒^{−𝑦}, 𝑦 > 0)
(iii) Evaluate 𝑃(𝑋 > 2|𝑌 < 4). (Ans: 0.0885)

13. Find 𝑐 for which 𝑓(𝑥, 𝑦) = 𝑐𝑥 + 𝑐𝑦², 𝑥 = 1, 2, 𝑦 = 1, 2, 3 is a joint pmf. (Ans: 1/37)
37

14. If 𝑓(𝑥, 𝑦) = 2/𝑎² for 0 ≤ 𝑥 ≤ 𝑦 ≤ 𝑎, and 0 elsewhere, find 𝑓(𝑦|𝑥) and 𝑓(𝑥|𝑦).

(Ans: 𝑓(𝑦|𝑥) = 1/(𝑎 − 𝑥), 𝑥 ≤ 𝑦 ≤ 𝑎; 𝑓(𝑥|𝑦) = 1/𝑦, 0 ≤ 𝑥 ≤ 𝑦)

15. If f(x, y) = { 8xy,  0 < x < y < 1
                 { 0,    elsewhere
    find the marginal pdfs of 𝑋 and 𝑌. Check whether they are independent.
    (Ans: g(x) = 4x(1 − x²), 0 < x < 1 ;  h(y) = 4y³, 0 < y < 1 ;  not independent)

16.Suppose that a manufacturer of light bulbs is concerned about the number of bulbs
ordered from him during the months of January and February. Let 𝑋 and 𝑌 denote the
number of bulbs ordered during these two months, respectively. We shall assume that
(𝑋, 𝑌) is a two-dimensional continuous random variable with the following joint pdf.
Find 𝑃(𝑋 ≥ 𝑌).
f(x, y) = { c,  if 5000 ≤ x ≤ 10,000 and 4000 ≤ y ≤ 9000
          { 0,  elsewhere
Solution:
To determine 𝑐 we use the fact that ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1:

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = ∫_{4000}^{9000} ∫_{5000}^{10,000} c dx dy = c(5000)²

Thus c = 1/(5000)².

If B = {X ≥ Y}, we shall compute P(B) by evaluating 1 − P(B̄), where B̄ = {X < Y}. Hence

P(B) = 1 − (1/(5000)²) ∫_{5000}^{9000} ∫_{5000}^{y} dx dy

     = 1 − (1/(5000)²) ∫_{5000}^{9000} (y − 5000) dy

     = 17/25
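A quick Monte Carlo check of this answer (a minimal sketch assuming NumPy; the seed is arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    x = rng.uniform(5000, 10000, n)   # January orders
    y = rng.uniform(4000, 9000, n)    # February orders

    print((x >= y).mean())   # ~0.68 = 17/25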

EXPECTATION OF A TWO-DIMENSIONAL RANDOM VARIABLE


For DRV:

E(X) = Σᵢ Σⱼ xᵢ p(xᵢ, yⱼ) = Σᵢ xᵢ {Σⱼ p(xᵢ, yⱼ)} = Σᵢ xᵢ p(xᵢ), where p(xᵢ) is the marginal pmf of 𝑋.

Therefore,
E(X) = Σᵢ xᵢ p(xᵢ) and E(Y) = Σⱼ yⱼ q(yⱼ)

For CRV:

E(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dx dy = ∫_{−∞}^{∞} x g(x) dx

E(Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f(x, y) dx dy = ∫_{−∞}^{∞} y h(y) dy

E(X²) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x² f(x, y) dx dy

E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dx dy

Properties:
1. 𝐸(𝑐) = 𝑐
2. E(cX) = cE(X)
3. 𝐸(𝑋 + 𝑌) = 𝐸(𝑋) + 𝐸(𝑌)
Theorem:
Let (𝑋, 𝑌) be a two-dimensional random variable and suppose that 𝑋 and 𝑌 are independent.
Then 𝐸(𝑋𝑌) = 𝐸(𝑋)𝐸(𝑌).
Proof: Since 𝑋 and 𝑌 are independent, f(x, y) = g(x)h(y). Hence

E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy g(x)h(y) dx dy

      = ∫_{−∞}^{∞} x g(x) dx ∫_{−∞}^{∞} y h(y) dy = E(X)E(Y)
Theorem:
Let (𝑋, 𝑌) be a two-dimensional random variable, and if 𝑋 and 𝑌 are independent, then
𝑉(𝑋 + 𝑌) = 𝑉(𝑋) + 𝑉(𝑌).
Proof: V(X + Y) = E(X + Y)² − (E(X + Y))²

= E(X² + 2XY + Y²) − (E(X) + E(Y))²

= E(X²) + 2E(XY) + E(Y²) − (E(X))² − 2E(X)E(Y) − (E(Y))²

= E(X²) − (E(X))² + E(Y²) − (E(Y))²   (since E(XY) = E(X)E(Y) by independence)

= V(X) + V(Y)
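Both theorems are easy to check by simulation. A minimal sketch assuming NumPy (the particular distributions and seed are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.exponential(scale=2.0, size=500_000)       # E(X) = 2, V(X) = 4
    y = rng.normal(loc=3.0, scale=1.5, size=500_000)   # E(Y) = 3, V(Y) = 2.25; independent of x

    print(np.mean(x * y), np.mean(x) * np.mean(y))     # both ~6.0:  E(XY) = E(X)E(Y)
    print(np.var(x + y), np.var(x) + np.var(y))        # both ~6.25: V(X+Y) = V(X) + V(Y)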

Exercise
1. If f(x, y) = { x² + xy/3,  0 ≤ x ≤ 1, 0 ≤ y ≤ 2
               { 0,          elsewhere
   find E(X), E(Y) and V(Y).

Solution:

E(X) = ∫_{−∞}^{∞} x g(x) dx = ∫₀¹ x (2x² + 2x/3) dx = 13/18

E(Y) = ∫_{−∞}^{∞} y h(y) dy = ∫₀² y (1/3 + y/6) dy = 10/9

E(Y²) = ∫_{−∞}^{∞} y² h(y) dy = ∫₀² y² (1/3 + y/6) dy = 14/9

V(Y) = E(Y²) − {E(Y)}² = 14/9 − (10/9)² = 26/81 = 0.3209

CONDITIONAL EXPECTATION
Definitions:
(a) If (𝑋, 𝑌) is a two-dimensional discrete random variable, we define the conditional
expectation of 𝑋 for given 𝑌 = 𝑦𝑗 as

E(X|yⱼ) = Σᵢ xᵢ p(xᵢ|yⱼ)

(b) If (𝑋, 𝑌) is a two-dimensional continuous random variable, we define the conditional


expectation of 𝑋 for given 𝑌 = 𝑦 as

E(X|y) = ∫_{−∞}^{∞} x g(x|y) dx
Exercise

1. Suppose that (𝑋, 𝑌) is uniformly distributed over the semicircle indicated in the
following figure. Find 𝐸(𝑌|𝑥) and 𝐸(𝑋|𝑦).

Solution:
The region considered is the semicircle {(x, y) : x² + y² ≤ 1, y ≥ 0}.
The joint pdf of (𝑋, 𝑌) is given by

f(x, y) = { 2/π,  (x, y) ∈ semicircle
          { 0,    elsewhere

g(x) = ∫₀^{√(1−x²)} (2/π) dy = (2/π)√(1 − x²), −1 ≤ x ≤ 1

h(y) = ∫_{−√(1−y²)}^{√(1−y²)} (2/π) dx = (4/π)√(1 − y²), 0 ≤ y ≤ 1

Hence,

g(x|y) = 1/(2√(1 − y²)), −√(1 − y²) ≤ x ≤ √(1 − y²)

h(y|x) = 1/√(1 − x²), 0 ≤ y ≤ √(1 − x²)

Therefore,

E(Y|x) = ∫₀^{√(1−x²)} y h(y|x) dy = (1/2)√(1 − x²)

Similarly,

E(X|y) = ∫_{−√(1−y²)}^{√(1−y²)} x g(x|y) dx = 0
CORRELATION COEFFICIENT
Correlation coefficient is a parameter which measures the degree of association between
two random variables 𝑋 and 𝑌
Let (𝑋, 𝑌) be a two-dimensional random variable. The correlation coefficient 𝜌𝑥𝑦 between 𝑋
and 𝑌 is defined as

ρ_xy = E[(X − E(X))(Y − E(Y))] / √(V(X)V(Y))

where E[(X − E(X))(Y − E(Y))] = σ_xy = cov(X, Y) is called the covariance between 𝑋 and 𝑌.

E[(X − E(X))(Y − E(Y))] = E[XY − X E(Y) − Y E(X) + E(X)E(Y)]

= E(XY) − E(X)E(Y) − E(Y)E(X) + E(X)E(Y)

= E(XY) − E(X)E(Y)
NOTE:
1. If 𝑋 and 𝑌 are independent, then 𝐸(𝑋𝑌) = 𝐸(𝑋)𝐸(𝑌) ⟹ 𝜌𝑥𝑦 = 0 , i.e., 𝑋 and 𝑌 are
uncorrelated.
2. Converse of the above is not true, i.e., if 𝜌𝑥𝑦 = 0, then 𝑋 and 𝑌 need not be
independent
Hence, being uncorrelated and independent in general, are not equivalent.
For example, consider the random variable Y = X², where the pdf of 𝑋 is f(x) = 1/2, −1 ≤ x ≤ 1.

E(XY) − E(X)E(Y) = E(X³) − E(X)E(X²)

E(X) = ∫_{−1}^{1} x · (1/2) dx = 0 ;  E(X³) = ∫_{−1}^{1} x³ · (1/2) dx = 0

∴ E(XY) − E(X)E(Y) = 0 and hence ρ_xy = 0, but clearly 𝑋 and 𝑌 are not independent.
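This example is easy to see numerically. A minimal sketch assuming NumPy: Y is a deterministic function of X, yet the sample correlation is essentially zero.

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.uniform(-1, 1, 500_000)   # f(x) = 1/2 on (-1, 1)
    y = x**2                          # completely determined by X, hence dependent

    print(np.corrcoef(x, y)[0, 1])    # ~0: uncorrelated despite the dependence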

Theorem
If (𝑋, 𝑌) is a two-dimensional random variable, then −1 ≤ 𝜌𝑥𝑦 ≤ 1
Proof:
For any random variable 𝑊 we have E(W²) ≥ 0; in particular V(X) = E(X²) − [E(X)]² ≥ 0.

Consider E{(X − E(X))/√V(X) ± (Y − E(Y))/√V(Y)}² ≥ 0:

E[(X − E(X))²/V(X) + (Y − E(Y))²/V(Y) ± 2 (X − E(X))(Y − E(Y))/(√V(X)√V(Y))] ≥ 0

E(X − E(X))²/V(X) + E(Y − E(Y))²/V(Y) ± 2 E[(X − E(X))(Y − E(Y))]/(√V(X)√V(Y)) ≥ 0

V(X)/V(X) + V(Y)/V(Y) ± 2ρ_xy ≥ 0   (since E(X − E(X))² = V(X))

2 ± 2ρ_xy ≥ 0, that is 1 ± ρ_xy ≥ 0

1 + ρ_xy ≥ 0 and 1 − ρ_xy ≥ 0, i.e., ρ_xy ≥ −1 and ρ_xy ≤ 1.

Hence, −1 ≤ ρ_xy ≤ 1.

Theorem
If 𝑋 and 𝑌 are linearly related, then 𝜌𝑥𝑦 = ±1
Proof:
Let 𝑋 and 𝑌 be linearly related, say Y = aX + b.

Cov(X, Y) = E(XY) − E(X)E(Y)
          = E[X(aX + b)] − E(X)E(aX + b)
          = E[aX² + bX] − E(X)[aE(X) + b]
          = aE(X²) + bE(X) − a[E(X)]² − bE(X)
          = a[E(X²) − (E(X))²]
          = a V(X)

V(Y) = V(aX + b) = a² V(X)

ρ_xy = Cov(X, Y)/√(V(X)V(Y)) = aV(X)/√(V(X) · a²V(X)) = aV(X)/(|a|V(X)) = ±1, according to the sign of 𝑎.

Exercise
1. If 𝑈 = 𝑎 + 𝑏𝑋 and 𝑉 = 𝑐 + 𝑑𝑌, then show that 𝜌𝑢𝑣 = ±𝜌𝑥𝑦

2. The random variable (𝑋, 𝑌) has a joint pdf given by


𝑓(𝑥, 𝑦) = 𝑥 + 𝑦, 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 1. Compute the correlation between 𝑋 and 𝑌.
Solution:
ρ = (E(XY) − E(X)E(Y)) / √(V(X)V(Y))

E(X) = ∫₀¹ ∫₀¹ x(x + y) dx dy = 7/12

E(Y) = ∫₀¹ ∫₀¹ y(x + y) dx dy = 7/12

E(XY) = ∫₀¹ ∫₀¹ xy(x + y) dx dy = 1/3

V(X) = 11/144, V(Y) = 11/144

ρ_xy = (1/3 − 7/12 × 7/12) / √(11/144 × 11/144) = −1/11
3. If (𝑋, 𝑌) has the joint density function 𝑓(𝑥, 𝑦) = 2 − 𝑥 − 𝑦, 0 < 𝑥 < 1, 0 < 𝑦 < 1, compute ρ_xy. (Ans: −1/11)

4. Show that 𝑉(𝑎𝑋 + 𝑏𝑌) = 𝑎2 𝑉(𝑋) + 𝑏 2 𝑉(𝑌) + 2𝑎𝑏 𝑐𝑜𝑣(𝑋, 𝑌). Also prove that when
𝑋 and 𝑌 are independent 𝑉(𝑎𝑋 + 𝑏𝑌) = 𝑉(𝑎𝑋 − 𝑏𝑌) = 𝑎2 𝑉(𝑋) + 𝑏 2 𝑉(𝑌)

5. Two independent random variables 𝑋1 and 𝑋2 have mean 5 and 10 and variances 4
and 9 respectively. Find the covariance between 𝑈 = 3𝑋1 + 4𝑋2 and 𝑉 = 3𝑋1 − 𝑋2 .
(Ans: 0)

6. Let 𝑋₁, 𝑋₂, 𝑋₃ be uncorrelated random variables having the same standard deviation. Find the correlation coefficient between 𝑋₁ + 𝑋₂ and 𝑋₂ + 𝑋₃. (Ans: 1/2)

7. Given 𝐸(𝑋𝑌) = 43, 𝑃(𝑋 = 𝑥ᵢ) = 1/5 and 𝑃(𝑌 = 𝑦ⱼ) = 1/5, find ρ_xy.
   𝑋:  1   3   4   6   8
   𝑌:  1   2   24  12  5
   (Ans: E(X) = 22/5, E(Y) = 44/5, E(X²) = 126/5, E(Y²) = 750/5, ρ_xy = 0.2079)

8. If 𝑋, 𝑌 and 𝑍 are uncorrelated random variables with standard deviation 5,12,9


respectively, then evaluate 𝜌𝑢𝑣 where 𝑈 = 𝑋 + 𝑌 and 𝑉 = 𝑌 + 𝑍
(Ans: 𝑐𝑜𝑣(𝑋, 𝑌) = 𝑐𝑜𝑣(𝑌, 𝑍) = 𝑐𝑜𝑣(𝑋, 𝑍) = 0, 𝑉(𝑈) = 169, 𝑉(𝑉) = 225, 𝜌𝑢𝑣 =
0.7385)
9. Suppose that a two-dimensional random variable is uniformly distributed over the
triangular region 𝑅 = {(𝑥, 𝑦)|0 < 𝑥 < 𝑦 < 1}
(i) Find the pdf of (𝑋, 𝑌)
(ii) Find the marginal pdf of 𝑋 and 𝑌
(iii) Find 𝜌𝑥𝑦
Solution:

f(x, y) = { 2,  (x, y) ∈ R
          { 0,  elsewhere

Marginal pdf of 𝑋: g(x) = ∫ₓ¹ 2 dy = 2(1 − x), 0 ≤ x ≤ 1

Marginal pdf of 𝑌: h(y) = ∫₀^y 2 dx = 2y, 0 ≤ y ≤ 1

E(X) = ∫₀¹ x g(x) dx = 1/3 ;  E(Y) = ∫₀¹ y h(y) dy = 2/3

E(X²) = ∫₀¹ x² g(x) dx = 1/6 ;  E(Y²) = ∫₀¹ y² h(y) dy = 1/2 ;  V(X) = 1/18 ;  V(Y) = 1/18

E(XY) = ∫₀¹ ∫₀^y xy f(x, y) dx dy = 1/4

ρ_xy = (E(XY) − E(X)E(Y)) / √(V(X) × V(Y)) = 1/2
Probability Distributions

BINOMIAL DISTRIBUTION (Biparametric, Discrete distribution)


The probability mass function of a binomial random variable with parameters 𝑛 and 𝑝 is given by

P(X = k) = nCk p^k (1 − p)^(n−k), k = 0, 1, 2, ⋯, n

where 𝑛 is the number of trials and 𝑝 is the probability of success (𝑞 = 1 − 𝑝 is the probability of failure).
We denote 𝑋~𝐵(𝑛, 𝑝).
The properties of a binomial distribution are:
(i) Each trial is independent.
(ii) There are only two possible outcomes in a trial, i.e., success or failure.
(iii) A total of 𝑛 identical trials are conducted.
(iv) The probability of success and failure is the same for all trials (i.e., trials are identical).
Mean and Variance:
E(X) = Σ_{x=1}^{n} x P(x)

     = Σ_{x=1}^{n} x nCx p^x q^(n−x)

     = Σ_{x=1}^{n} x [n!/(x!(n − x)!)] p^x q^(n−x)

     = np Σ_{x=1}^{n} [(n − 1)!/((x − 1)!(n − x)!)] p^(x−1) q^(n−x)

Substitute x − 1 = s:

E(X) = np Σ_{s=0}^{n−1} [(n − 1)!/(s!(n − 1 − s)!)] p^s q^(n−1−s)

     = np (p + q)^(n−1) = np

∴ E(X) = np
E(X²) = Σ_{x=1}^{n} (x² − x + x) nCx p^x q^(n−x)

      = Σ_{x=1}^{n} (x² − x) nCx p^x q^(n−x) + Σ_{x=1}^{n} x nCx p^x q^(n−x)

      = Σ_{x=2}^{n} x(x − 1) [n!/(x!(n − x)!)] p^x q^(n−x) + np

      = n(n − 1)p² Σ_{x=2}^{n} [(n − 2)!/((x − 2)!(n − x)!)] p^(x−2) q^(n−x) + np

Substitute x − 2 = s:

E(X²) = n(n − 1)p² Σ_{s=0}^{n−2} [(n − 2)!/(s!(n − 2 − s)!)] p^s q^(n−2−s) + np

      = n(n − 1)p² (p + q)^(n−2) + np

      = n(n − 1)p² + np

Now, to find the variance, we know that
V(X) = E(X²) − [E(X)]² = n(n − 1)p² + np − (np)²
∴ V(X) = npq
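The formulas E(X) = np and V(X) = npq can be confirmed by summing the pmf directly. A minimal sketch assuming NumPy and SciPy (the values n = 12, p = 0.3 are arbitrary illustrations):

    import numpy as np
    from scipy.stats import binom

    n, p = 12, 0.3
    k = np.arange(n + 1)
    pmf = binom.pmf(k, n, p)

    mean = np.sum(k * pmf)
    var = np.sum(k**2 * pmf) - mean**2
    print(mean, n * p)              # both 3.6   : E(X) = np
    print(var, n * p * (1 - p))     # both 2.52  : V(X) = npq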

Exercise
1. Six coins are tossed. Find the probability of getting:
(i) Exactly 3 heads
(ii) At least 3 heads
(iii) At most 3 heads
(iv) At least 1 head
Solution:
𝑋: Number of heads
n = 6, p = P(Head) = 1/2, q = P(Tail) = 1/2

X~B(6, 1/2) ⟹ P(X = x) = 6Cx (1/2)^x (1/2)^(6−x)

(i) P(exactly 3 heads) = P(X = 3) = 5/16
(ii) P(at least 3 heads) = P(X ≥ 3) = 21/32
(iii) P(at most 3 heads) = P(X ≤ 3) = 21/32
(iv) P(at least 1 head) = P(X ≥ 1) = 1 − P(X = 0) = 63/64

2. What is the probability of getting a 6 at least once in 2 throws of a fair die?


Solution:
𝑋: number of times 6 obtained
n = 2, P(getting a six) = p = 1/6, P(not getting a six) = q = 5/6

P(X ≥ 1) = 1 − q² = 11/36

3. A fair die is thrown 180 times. What is the expected number of sixes.
Solution:
𝑋: number of times 6 is obtained
n = 180, P(getting a six) = p = 1/6
𝐸(𝑋) = 𝑛𝑝 = 30

4. A die is thrown 8 times. Find the probability that 3 appears


(i) Two times (Ans: 0.2604)
(ii) At least 7 times (Ans: 0.0000244)
(iii) Exactly one time (Ans: 0.372)

5. Two percent fuses manufactured by a company are defective. Find the probability that
a box containing 200 fuses contains
(a) No defective (Ans: 0.01758)
(b) 3 or more defective (Ans: 0.7649)

6. Find the probability that in a family of 4 children, there will be at least one boy, by
assuming that the probability of a male birth is 1/2. (Hint: P(at least 1 boy) = 1 − P(no boys))

7. A family has 6 children. Find the probability that there are fewer boys than girls.

8. The sum and product of mean and variance of binomial distribution are 24 and 128.
Find the distribution.
Solution:
𝑛𝑝 + 𝑛𝑝𝑞 = 24
𝑛𝑝. 𝑛𝑝𝑞 = 128
On solving the above equations, we get 𝑞 = 2 and 0.5
𝑞 = 2 is not possible. Therefore 𝑞 = 0.5 hence 𝑝 = 0.5
On substituting in 𝑛𝑝. 𝑛𝑝𝑞 = 128, we get 𝑛 = 32
Hence, the binomial distribution is given by P(X = x) = 32Cx (0.5)^x (0.5)^(32−x)
9. Numbers are selected at random one at a time from the two digit numbers 00,01,…, 99
with replacement. An event occurs if and only if the product of the two digits of
selected number is 18. If 4 numbers are selected find the probability that the event
occurs at least 3 times.
Solution:
𝑋: number of selected numbers whose digit product is 18
The favourable numbers are 29, 92, 36, 63, so n = 4, p = 4/100 = 1/25 and q = 24/25. X~B(4, 0.04)

P(X ≥ 3) = 97/25⁴

10.A perfect die is tossed 100 times in sets of 8. The occurrence of 5 and 6 is called a
success. How many times do you expect to get 3 successes.
Solution:
𝑋: number of successes in a set of 8 tosses
p = 1/3, q = 2/3 and n = 8. X~B(8, 1/3)

P(3 successes) = P(X = 3) = 8C3 (1/3)³ (2/3)⁵ = 0.2731

Expected number of sets (out of 100) with 3 successes = 100 × 0.2731 = 27.31

11.Suppose that the probability for 𝐴 to win a game of tennis against 𝐵 is 0.4. 𝐴 has an
option of playing either a best of 3 games or a best of 5 games. Which option 𝐴 should
choose so that his probability of winning is greater?
Solution:
𝑋: number of games won
𝑝 = 0.4 and 𝑞 = 0.6, 𝑋~𝐵(𝑛, 0.4)
When 𝑛 = 3, 𝑃(𝑋 ≥ 2) = 0.352
When 𝑛 = 5, 𝑃(𝑋 ≥ 3) = 0.31744
Hence, 𝐴 should play best of 3 games so that his probability of winning is greater.

12. An airline knows that 5% of the people making reservations on a certain flight will not
show up. Consequently, their policy is to sell 52 tickets for the flight that can only hold
50 passengers. What is the probability that there will be a seat available for every
passenger who turns up?
Solution:
𝑋: number of passengers who won’t turn up
𝑝 = P(passenger will not turn up) = 0.05
𝑞 = 0.95, 𝑛 = 52
P(there will be a seat for every passenger) = 𝑃(𝑋 ≥ 2) = 1 − [𝑃(𝑋 = 0) + 𝑃(𝑋 =
1)] = 0.7405

13. In playing with an opponent of equal ability, which is more probable in each of the
following?
(i) Winning 2 games out of 4 OR 5 games out of 8?
(ii) Winning at least 2 games out of 4 OR at least 5 games out of 8?
Solution:
𝑋: number of games won
P(winning) = p = 1/2 and q = 1/2

The probability of winning exactly 𝑘 games out of 𝑛 is nCk (1/2)ⁿ.

(i) P(winning 2 games out of 4) = P(X = 2) = 4C2 (1/2)⁴ = 0.37
    P(winning 5 games out of 8) = P(X = 5) = 8C5 (1/2)⁸ = 0.21
    Hence, winning 2 games out of 4 is more likely than winning 5 games out of 8.

(ii) P(winning at least 2 games out of 4) = P(X ≥ 2) = P(X = 2) + P(X = 3) + P(X = 4)
     = [C(4,2) + C(4,3) + C(4,4)] (1/2)⁴ = 0.68
     P(winning at least 5 games out of 8) = P(X ≥ 5) = [C(8,5) + C(8,6) + C(8,7) + C(8,8)] (1/2)⁸ = 0.36
     Hence, winning at least 2 games out of 4 is more likely than winning at least 5 games out of 8.

POISSON’S DISTRIBUTION (Uniparametric, Discrete distribution)


A discrete random variable 𝑋 is said to have a Poisson distribution if it has pmf of the form

P(X = k) = e^(−α) α^k / k!, k = 0, 1, 2, ⋯ and α > 0

We write 𝑋~𝑃(𝛼).

Note: e^x = 1 + x + x²/2! + x³/3! + ⋯ + xⁿ/n! + ⋯
Mean and Variance:

E(X) = Σ_{x=0}^{∞} x P(x)

     = Σ_{x=1}^{∞} x e^(−α) α^x / x!

     = Σ_{x=1}^{∞} e^(−α) α^x / (x − 1)!

     = α e^(−α) Σ_{x=1}^{∞} α^(x−1) / (x − 1)!

Substitute x − 1 = s:

E(X) = α e^(−α) Σ_{s=0}^{∞} α^s/s! = α e^(−α) e^α = α

∴ E(X) = α

𝐸(𝑋 2 ) = ∑ 𝑥 2 𝑃(𝑥)
𝑥=0


2
𝑒 −𝛼 𝛼 𝑥
= ∑(𝑥 − 𝑥 + 𝑥)
𝑥!
𝑥=1

∞ ∞
𝑒 −𝛼 𝛼 𝑥 𝑒 −𝛼 𝛼 𝑥
= ∑ 𝑥(𝑥 − 1) +∑𝑥
𝑥! 𝑥!
𝑥=2 𝑥=1


2 −𝛼
𝛼 𝑥−2
= 𝛼 𝑒 ∑ +𝛼
(𝑥 − 2)!
𝑥=2

Substitute 𝑥 − 2 = 𝑠

2) 2 −𝛼
𝛼𝑠
𝐸(𝑋 = 𝛼 𝑒 ∑ +𝛼
𝑠!
𝑠=0
= 𝛼 2 𝑒 −𝛼 𝑒 𝛼 + 𝛼

= 𝛼2 + 𝛼

We know that 𝑉(𝑋) = 𝐸(𝑋 2 ) − [𝐸(𝑋)]2

∴ 𝑉(𝑋) = 𝛼

Theorem
Let 𝑋 be a binomial random variable with parameters 𝑛, 𝑝 and pmf P(X = x) = nCx p^x q^(n−x). Suppose n → ∞ with np = α held fixed. Then lim_{n→∞} P(X = x) = e^(−α) α^x / x!, which is a Poisson distribution with parameter α.
(i.e., the Poisson distribution is a limiting case of the binomial distribution as n → ∞.)
Proof:
The general expression for the binomial distribution is

P(X = x) = nCx p^x q^(n−x) = nCx p^x (1 − p)^(n−x)

         = [n(n − 1)(n − 2) ⋯ (n − (x − 1)) / x!] p^x (1 − p)^(n−x)

Let np = α:

P(X = x) = [n(n − 1)(n − 2) ⋯ (n − (x − 1)) / x!] (α/n)^x (1 − α/n)^(n−x)

         = (α^x/x!) [1 · (1 − 1/n)(1 − 2/n) ⋯ (1 − (x − 1)/n)] (1 − α/n)ⁿ / (1 − α/n)^x

Let n → ∞ with α = np fixed. Each factor in the bracket tends to 1, (1 − α/n)ⁿ → e^(−α), and (1 − α/n)^x → 1, so

lim_{n→∞} P(X = x) = α^x e^(−α) / x!
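The quality of this limiting approximation is easy to inspect numerically. A minimal sketch assuming NumPy and SciPy, using n = 200 and p = 0.02 (so α = np = 4, as in Exercise 1 below):

    import numpy as np
    from scipy.stats import binom, poisson

    n, p = 200, 0.02
    k = np.arange(11)

    # largest pointwise gap between the binomial pmf and its Poisson limit
    print(np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, n * p))))  # small, ~1e-3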
Exercise
1. 2% of fuses manufactured by a company are defective. Find the probability that a box
having 200 fuses contains
(i) No defective fuse.
(ii) 3 or more defective fuses.
Solution:
𝑋: Number of defective fuses
𝑛 = 200, 𝑝 = 0.02, 𝑞 = 0.98 ⟹ 𝛼 = 𝑛𝑝 = 4
𝑋~𝑃(4)
(i) 𝑃(𝑋 = 0) = 0.0183
(ii) P(X ≥ 3) = 1 − e^(−4)(1 + 4 + 8) = 0.7619

2. A lot has 10% defective items. What should be the number of items sampled so that the probability of finding at least 1 defective item is at least 0.95?
Solution:
𝑋: Number of defective items
n = ?, p = 0.1 and q = 0.9
𝑋~𝑃(𝛼) with α = np = 0.1n
Given that P(X ≥ 1) ≥ 0.95:
1 − P(X < 1) ≥ 0.95
P(X < 1) ≤ 0.05
P(X = 0) ≤ 0.05
e^(−α) α⁰ ≤ 0.05
e^(−0.1n) ≤ 0.05
n ≥ 29.96, i.e., at least 30 items

3. Suppose that a container contains 10,000 particles. The probability that such a particle
escapes from the container equals 0.0004. What is the probability that more than 5
such escapes occur.
Solution:
𝑋: Number of particles that escape
𝑛 = 10,000, 𝑝 = 0.0004 ⟹ 𝛼 = 𝑛𝑝 = 4 and 𝑋~𝑃(4)
P(X > 5) = 1 − P(X ≤ 5) = 1 − Σ_{x=0}^{5} e^(−4) 4^x / x!

4. 𝑋 is a Poisson’s variate and it is found that the probability that 𝑋 = 2 is two third of the
probability that 𝑋 = 1. Find the probability that 𝑋 = 0 and 𝑋 = 3. What is the
probability that 𝑋 exceeds 3?
Solution:
P(X = 2) = (2/3) P(X = 1)

e^(−α) α²/2! = (2/3) × e^(−α) α¹/1!

On solving the above, we get α = 4/3

𝑃(𝑋 = 0) = 0.2635
𝑃(𝑋 = 3) = 0.1041
𝑃(𝑋 > 3) = 0.0465

5. An insurance company has discovered that only about 0.1% of the population is
involved in a certain type of accidents each year. If its 10,000 policy holders were
randomly selected from the population, what is the probability that not more than 5
of the clients are involved in such accidents each year?
Solution:
𝑋: Number of clients involved in accidents
𝑝 = 0.001, 𝑛 = 10,000 ⟹ 𝛼 = 𝑛𝑝 = 10 and 𝑋~𝑃(10)
𝑃(𝑋 ≤ 5) = 0.0671

6. Suppose that a book of 585 pages contains 43 typographical errors. If these errors are
randomly distributed throughout the book, what is the probability that 10 pages
selected at random will be error free?
Solution:
𝑋: Number of errors
n = 10, p = 43/585 ⟹ α = np = 0.735 and 𝑋~𝑃(0.735)

𝑃(𝑋 = 0) = 0.4795

7. Probability that an individual suffers a bad reaction from an injection is 0.001. Find the
probability that out of 2000 individuals,
(i) exactly 3 suffer from bad reaction.
(ii) not more than 2 suffer from bad reaction.

UNIFORM DISTRIBUTION (Continuous distribution)


Let 𝑋 be a continuous random variable assuming all values in the interval [𝑎, 𝑏] where 𝑎 and 𝑏 are finite. If the pdf of 𝑋 is given by

f(x) = { 1/(b − a),  a ≤ x ≤ b
       { 0,          elsewhere

then we say that 𝑋 has uniform distribution defined over [𝑎, 𝑏].

Note that, for any subinterval [𝑐, 𝑑],

P(c < X < d) = ∫_c^d f(x) dx = ∫_c^d 1/(b − a) dx = (d − c)/(b − a)

Cdf: F(x) = { 0,                 x ≤ a
            { (x − a)/(b − a),   a < x < b
            { 1,                 x ≥ b

Mean and Variance:

E(X) = ∫_a^b x f(x) dx = [1/(b − a)] {x²/2} (from a to b) = (b − a)(a + b)/(2(b − a)) = (a + b)/2

∴ E(X) = (a + b)/2

E(X²) = ∫_a^b x² f(x) dx = [1/(b − a)] {x³/3} (from a to b) = (b³ − a³)/(3(b − a)) = (a² + b² + ab)/3

V(X) = E(X²) − [E(X)]² = (a² + b² + ab)/3 − [(a + b)/2]² = (b − a)²/12

∴ V(X) = (b − a)²/12

Exercise
1. If 𝑋 is uniformly distributed over (−2, 2), then find
(i) P(X < 1)
(ii) P(|X − 1| ≥ 1/2)

Solution:
Given that X~U(−2, 2). Therefore,

f(x) = { 1/4,  −2 ≤ x ≤ 2
       { 0,    elsewhere

(i) P(X < 1) = ∫_{−∞}^{1} f(x) dx = ∫_{−2}^{1} (1/4) dx = 3/4

(ii) P(|X − 1| ≥ 1/2) = 1 − P(|X − 1| < 1/2)
     = 1 − P(−1/2 < X − 1 < 1/2)
     = 1 − P(1/2 < X < 3/2)
     = 1 − ∫_{1/2}^{3/2} (1/4) dx = 1 − 1/4 = 3/4
2

2. If 𝐾 is uniformly distributed over (0,5), then what is the probability that the roots of
the equation 4𝑥 2 + 4𝑥𝐾 + 𝐾 + 2 = 0 are real?
Solution:
Given that K~U(0, 5). Therefore,

f(k) = { 1/5,  0 ≤ k ≤ 5
       { 0,    elsewhere

P{roots are real} = P{(4K)² − 4 × 4(K + 2) ≥ 0}
= P{K² − K − 2 ≥ 0} = P{(K + 1)(K − 2) ≥ 0}
= P{(K + 1) ≥ 0, (K − 2) ≥ 0 or (K + 1) ≤ 0, (K − 2) ≤ 0}
= P{K ≥ −1, K ≥ 2 or K ≤ −1, K ≤ 2}
= P{K ≥ 2 or K ≤ −1}
= P{K ≥ 2} + P{K ≤ −1}
= ∫₂⁵ (1/5) dk + ∫_{−∞}^{−1} 0 dk = 3/5

UNIFORM DISTRIBUTION (Two-dimensional) (Continuous distribution)


We say that the two-dimensional continuous random variable (𝑋, 𝑌) is uniformly distributed over a region 𝑅 in the Euclidean plane if

f(x, y) = { constant,  (x, y) ∈ R
          { 0,         elsewhere

Because of the requirement ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1, the above implies that the constant equals 1/area(R). We are assuming that 𝑅 is a region with finite, non-zero area.

Note:
The definition represents the two-dimensional analog to the one-dimensional uniformly
distributed random variable.

Exercise
1. Two characteristics of a rocket engine’s performance are thrust 𝑋 and mixture ratio
𝑌. Suppose that (𝑋, 𝑌) is a two-dimensional continuous random variable with joint pdf
f(x, y) = { 2(x + y − 2xy),  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
          { 0,               elsewhere
(The units have been adjusted in order to use values between 0 and 1.)
Find the marginal pdf of 𝑋 and 𝑌.
Solution:
The marginal pdf of 𝑋 is given by

g(x) = ∫₀¹ 2(x + y − 2xy) dy = 2(xy + y²/2 − xy²) (from y = 0 to 1) = 1, 0 ≤ x ≤ 1

That is, 𝑋 is uniformly distributed over [0,1].

The marginal pdf of 𝑌 is given by

h(y) = ∫₀¹ 2(x + y − 2xy) dx = 2(x²/2 + xy − x²y) (from x = 0 to 1) = 1, 0 ≤ y ≤ 1

That is, 𝑌 is uniformly distributed over [0,1].

2. Suppose that the two-dimensional random variable (𝑋, 𝑌) is uniformly distributed


over the shaded region 𝑅 indicated in Figure. Find the marginal pdf of 𝑋 and 𝑌.

Solution:
Given that the two-dimensional random variable (𝑋, 𝑌) is uniformly distributed over the shaded region 𝑅, which (from the integration limits) is the region between the curves y = x² and y = x for 0 ≤ x ≤ 1. Hence

f(x, y) = 1/area(R), (x, y) ∈ R.

area(R) = ∫∫_R dx dy = ∫₀¹ ∫_{x²}^{x} dy dx = ∫₀¹ (x − x²) dx = 1/6

Therefore, the pdf is given by

f(x, y) = { 6,  (x, y) ∈ R
          { 0,  (x, y) ∉ R

Then the marginal pdfs of 𝑋 and 𝑌 are

g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_{x²}^{x} 6 dy = 6(x − x²), 0 ≤ x ≤ 1

h(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_{y}^{√y} 6 dx = 6(√y − y), 0 ≤ y ≤ 1
NORMAL DISTRIBUTION/GAUSSIAN DISTRIBUTION (Continuous distribution)
A continuous random variable is said to be normally distributed if its pdf is given by

f(x) = [1/(σ√(2π))] e^(−(x−μ)²/(2σ²)), −∞ < x, μ < ∞, σ > 0

We denote 𝑋~𝑁(𝜇, 𝜎²).
Note:
(i) The mean, median, and mode of the distribution coincide.
(ii) The curve of the distribution is bell-shaped and symmetrical about the line 𝑋 = 𝜇.
(iii) The total area under the curve is 1.
(iv) Exactly half of the values are to the left of the centre, and the other half to the right.

To check the pdf defined is valid:

f(x) ≥ 0, and

∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{∞} [1/(σ√(2π))] e^(−(x−μ)²/(2σ²)) dx

Substitute (x − μ)/σ = z:

∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{∞} [1/√(2π)] e^(−z²/2) dz = (2/√(2π)) ∫₀^∞ e^(−z²/2) dz = (√2/√π) ∫₀^∞ e^(−z²/2) dz

Substitute z²/2 = t (recall Γ(z) = ∫₀^∞ t^(z−1) e^(−t) dt):

∫_{−∞}^{∞} f(x) dx = (√2/√π) ∫₀^∞ e^(−t)/√(2t) dt = (1/√π) Γ(1/2) = 1
Mean and Variance:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{−∞}^{∞} x [1/(σ√(2π))] e^(−(x−μ)²/(2σ²)) dx

Substitute (x − μ)/σ = z:

E(X) = ∫_{−∞}^{∞} (zσ + μ) [1/(σ√(2π))] e^(−z²/2) σ dz

     = ∫_{−∞}^{∞} zσ [1/√(2π)] e^(−z²/2) dz   (an odd function, hence 0)
     + ∫_{−∞}^{∞} μ [1/√(2π)] e^(−z²/2) dz    (an even function)

E(X) = (2μ/√(2π)) ∫₀^∞ e^(−z²/2) dz

Substitute z²/2 = t:

E(X) = μ (√2/√π) ∫₀^∞ e^(−t)/√(2t) dt = μ (1/√π) Γ(1/2) = μ

∴ E(X) = μ

V(X) = E(X²) − [E(X)]² = E(X − μ)²

V(X) = ∫_{−∞}^{∞} (x − μ)² f(x) dx = ∫_{−∞}^{∞} (x − μ)² [1/(σ√(2π))] e^(−(x−μ)²/(2σ²)) dx

Substitute (x − μ)/σ = z:

V(X) = ∫_{−∞}^{∞} (zσ)² [1/(σ√(2π))] e^(−z²/2) σ dz   (an even function)

     = (2σ²/√(2π)) ∫₀^∞ z² e^(−z²/2) dz

Substitute z²/2 = t:

V(X) = σ² (√2/√π) ∫₀^∞ 2t e^(−t)/√(2t) dt = σ² (2/√π) Γ(3/2) = σ²

∴ V(X) = σ²

STANDARD NORMAL DISTRIBUTION (Continuous distribution)


The standard normal distribution, also called as the 𝑧-distribution, is a special case of normal
distribution where the mean 𝜇 is 0 and the standard deviation 𝜎 is 1. The curve is symmetric
about 𝑥 = 0.
We write 𝑍 ∼ 𝑁(0, 1).
Its PDF is,
φ(z) = e^(−z²/2)/√(2π), −∞ < z < ∞

where z = (x − μ)/σ with μ = 0 and σ = 1.

Properties:
1. Area under the curve is 1.
2. 𝑃(𝑎 < 𝑥 < 𝑏) is area under the curve from 𝑎 to 𝑏.
3. The cdf is 𝜙(𝑎) = 𝑃(𝑍 ≤ 𝑎)
4. 𝑃(𝑎 ≤ Z ≤ b) = 𝜙(𝑏) − 𝜙(𝑎)
5. 𝜙(−𝑎) = 𝑃(𝑍 ≤ −𝑎) = 𝑃(𝑍 ≥ 𝑎) = 1 − 𝑃(𝑍 < 𝑎) = 1 − 𝜙(𝑎)
6. P(a ≤ X ≤ b) = P((a − μ)/σ ≤ Z ≤ (b − μ)/σ)
Exercise
1. Suppose 𝑋~𝑁(75, 100). Find
(i) 𝑃(𝑋 < 60)
(ii) 𝑃 (70 < 𝑋 < 100)
(iii) 𝑃(𝑋 < 65)
Solution:
𝜇 = 75 and 𝜎 2 = 100.
Z = (X − 75)/10 and Z ∼ N(0, 1)

(i) P(X < 60) = P(Z < (60 − 75)/10)
    = P(Z < −1.5)
    = φ(−1.5)
    = 1 − φ(1.5) = 1 − 0.9332 = 0.0668
(ii) 𝑃(70 < 𝑋 < 100) = 𝑃(−0.5 < 𝑧 < 2.5)
= 𝜙(2.5) − 𝜙(−0.5) = 𝜙(2.5) − [1 − 𝜙(0.5)]
= 0.9938 − 1 + 0.6915
= 0.6853
(iii) 𝑃(𝑋 < 65) = 0.1587
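These normal probabilities can be computed directly instead of via tables. A minimal sketch assuming SciPy (norm.cdf takes loc = mean and scale = standard deviation):

    from scipy.stats import norm

    # X ~ N(75, 100), so sigma = 10
    print(norm.cdf(60, loc=75, scale=10))                                     # 0.0668
    print(norm.cdf(100, loc=75, scale=10) - norm.cdf(70, loc=75, scale=10))   # 0.6853
    print(norm.cdf(65, loc=75, scale=10))                                     # 0.1587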

2. Suppose 𝑋~𝑁(2, 0.16), find


(i) 𝑃(𝑋 ≥ 2.3) (Ans: 0.2266)
(ii) 𝑃(1.8 ≤ 𝑋 ≤ 2.1) (Ans: 0.2902)

3. Diameter of an electric cable is normally distributed with mean 0.8 and variance
0.0004. What is the probability that the diameter exceeds 0.81 inches?
Solution:
𝜇 = 0.8 𝑎𝑛𝑑 𝜎 = 0.02
P(X > 0.81) = P(Z > (0.81 − μ)/σ) = P(Z > 0.5) = 0.3085
𝜎

4. If 𝑋~𝑁(1, 4), find 𝑃(|𝑋| > 4) (Ans: 0.073)

5. If 𝑋~𝑁(75, 25), find 𝑃(𝑋 > 80|𝑋 > 77) (Ans: 0.4605)

6. The height of 500 soldiers is found to have normal distribution. Of them, 258 are found
to be within 2 cm of the mean height of 170 cm. Find the standard deviation of 𝑋.
Solution:
𝜇 = 170
𝑋: height of soldiers
𝑋~𝑁(170, 𝜎 2 )
P(168 < X < 172) = 258/500 = 0.516

P((168 − μ)/σ < Z < (172 − μ)/σ) = 0.516

2φ(2/σ) − 1 = 0.516, i.e., 2φ(2/σ) = 1.516

2/σ = φ⁻¹(0.758) = 0.7

σ = 2.857

7. In a normal distribution, 31% of the items are under 45 and 8% are over 64. Find the
mean and standard deviation.
Solution:
𝑃(𝑋 < 45) = 0.31
𝑃(𝑋 > 64) = 0.08
Simplifying, 45 − 𝜇 = −0.5𝜎
64 − 𝜇 = 1.41𝜎
Solving, 𝜇 = 49.97 𝑎𝑛𝑑 𝜎 = 9.94

8. Suppose 𝑋 has 𝑁(3, 4). Find 𝑐 such that 𝑃(𝑋 > 𝑐) = 2𝑃(𝑋 ≤ 𝑐).
Ans: 𝑐 = 2.14

9. Suppose that the life span of two electronic device A and B have distribution 𝑁(40,36)
and 𝑁(45,9). If the electronic device is to be used for 45 hours period which device is
to be preferred. If it is used for 48 hours which device is to be preferred?
Solution:
Device A: N(40, 36), mean = 40, SD = 6.   Device B: N(45, 9), mean = 45, SD = 3.

For a 45-hour period and above:
Device A: P(X ≥ 45) = 0.2025 ;  Device B: P(X ≥ 45) = 0.5
Since 0.5 > 0.2025, Device B is better for 45 hours.

For a 48-hour period and above:
Device A: P(X ≥ 48) = 0.0915 ;  Device B: P(X ≥ 48) = 0.1587
Since 0.1587 > 0.0915, Device B is better for 48 hours.

Thus, in both cases, Device B is to be preferred.

10.In a normal distribution, 7% of the items are under 35 and 89% of the items are under
63. Find the mean and variance of the distribution.
Solution:
𝑃(𝑋 < 35) = 0.07 and 𝑃(𝑋 < 63) = 0.89 gives 𝑋~𝑁(50, 100)

11.An examination is often regarded as good (i.e., has a valid grade spread) if the test
scores of those taking it can be approximated by a normal distribution. The instructor
uses the test scores to estimate parameters 𝜇 and 𝜎 2 . Then she assigns grades
according to the following chart:

Grade Score Range


A Students who score greater than 𝜇 + 𝜎
B Students who score between 𝜇 and 𝜇 + 𝜎
C Students who score between 𝜇 − 𝜎 and 𝜇
D Students who score between 𝜇 − 2𝜎 and 𝜇 − 𝜎
E Students who score below 𝜇 − 2𝜎

Obtain the percentage of students who are graded A, B, C, D and E.


Solution:
𝑃(𝑋 > 𝜇 + 𝜎) = 0.1587
𝑃(𝜇 < 𝑋 < 𝜇 + 𝜎) = 0.3413
𝑃(𝜇 − 𝜎 < 𝑋 < 𝜇 ) = 0.3413
𝑃(𝜇 − 2𝜎 < 𝑋 < 𝜇 − 𝜎 ) = 0.1359
𝑃(𝑋 < 𝜇 − 2𝜎) = 0.0228

12.The monthly income of a group of 10,000 persons were found to be normally


distributed with mean 750 rupees and SD rupees 50. Show that of this group about
95% had income exceeding rupees 668 and only 5% had income exceeding rupees 832.
What was the lowest income among the richest 100?
Solution:
𝑋: Monthly income of a group
To Show 𝑃(𝑋 > 668) = 0.95 and 𝑃(𝑋 > 832) = 0.05
Consider 𝑃(𝑋 > 668) and 𝑃(𝑋 > 832) and solve.
To find 𝐶 such that P(X > μ + C) = 100/10000 = 0.01:
C = 116.5
Therefore, the lowest income among the richest 100 is μ + C = 866.5

13.The annual rainfall at a certain locality is known to be a normally distributed random


variable with mean 29.5 inches and SD 2.5 inches. What annual rainfall is exceeded about
5% of the time?
Solution:
𝑋: Annual rainfall at certain locality
𝜇 = 29.5 𝑎𝑛𝑑 𝜎 = 2.5
P(X > μ + c) = 0.05
1 − P(Z ≤ c/σ) = 0.05
c = 4.125
The rainfall exceeded about 5% of the time is μ + c = 33.625 inches.

14. Suppose that the breaking strength of cotton fabric (in pounds), say 𝑋, is normally
distributed with 𝐸(𝑋) = 165 and 𝑉(𝑋) = 9. Assume furthermore that a sample of this
fabric is considered to be defective if 𝑋 < 162. What is the probability that a fabric
chosen at random will be defective?
Solution:
P(X < 162) = P((X − 165)/3 < (162 − 165)/3) = Φ(−1) = 1 − Φ(1) = 0.159

15. The errors in a certain length-measuring device are known to be normally distributed
with expected value zero and standard deviation 1 inch. What is the probability that
the error in measurement will be greater than 1 inch? 2 inches? 3 inches?

16.The outside diameter of a shaft, say 𝐷, is specified to be 4 inches. Consider 𝐷 to be a


normally distributed random variable with mean 4 inches and variance 0.01 inch2. If
the actual diameter differs from the specified value by more than 0.05 inch but less
than 0.08 inch, the loss to the manufacturer is $0.50. If the actual diameter differs from
the specified diameter by more than 0.08 inch, the loss is $1.00. The loss, 𝐿, may be
considered as a random variable. Find the probability distribution of 𝐿 and evaluate
𝐸(𝐿).
Solution:
D~N(4, 0.01), so σ = 0.1.
(i) P(0.05 < |D − 4| < 0.08) covers P(0.05 < D − 4 < 0.08) and P(0.05 < −(D − 4) < 0.08):
    P(0.05 < D − 4 < 0.08) = P(4.05 < D < 4.08) = 0.0966
    P(0.05 < −(D − 4) < 0.08) = P(−0.08 < D − 4 < −0.05) = P(3.92 < D < 3.95) = 0.0966
(ii) P(|D − 4| > 0.08) gives P(D > 4.08) = 0.2119 and P(D < 3.92) = 0.2119

L = { 0.50 for case (i)
    { 1.00 for case (ii)

E(L) = 0.5 × 2 × 0.0966 + 1 × 2 × 0.2119 = $0.52

EXPONENTIAL DISTRIBUTION (Uniparametric, Continuous distribution)


Exponential distribution is characterized by a single parameter 𝜆, which is called the rate.
Intuitively, 𝜆 can be thought of as the instantaneous “failure rate” of a device at any time 𝑡,
given that the device has survived up to 𝑡. The exponential distribution is typically used to
model time intervals between random events.

Examples:
(i) The length of time between telephone calls
(ii) The length of time between arrivals of vehicles at a service station
(iii) The lifetime of electronic components, i.e., an inter failure time.

A CRV 𝑋 is said to be exponentially distributed with parameter λ > 0 if its pdf is given by

f(x) = { λe^(−λx),  x > 0
       { 0,         otherwise

Mean and Variance:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫₀^∞ x λe^(−λx) dx = λ {x e^(−λx)/(−λ) − e^(−λx)/λ²} (from 0 to ∞) = 1/λ

Similarly, E(X²) = 2/λ². Hence,

V(X) = E(X²) − [E(X)]² = 1/λ²

GAMMA DISTRIBUTION (Continuous distribution)


A continuous random variable 𝑋 is said to have a gamma distribution with parameters α > 0 and r > 0, if its pdf is given by

f(x) = { x^(r−1) e^(−αx) α^r / Γ(r),  x > 0, α, r > 0
       { 0,                           elsewhere

We write 𝑋~𝐺(𝛼, 𝑟).

Note:
When we substitute 𝑟 = 1 in gamma distribution, we get exponential distribution.
Mean and Variance:

E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫₀^∞ x [x^(r−1) e^(−αx) α^r / Γ(r)] dx

Multiplying and dividing by r/α (and using Γ(r + 1) = rΓ(r)), we get

E(X) = (r/α) ∫₀^∞ [x^r e^(−αx) α^(r+1) / Γ(r + 1)] dx

where ∫₀^∞ [x^r e^(−αx) α^(r+1) / Γ(r + 1)] dx = 1, as it is the pdf of the gamma distribution G(α, r + 1).

∴ E(X) = r/α

To find the variance, V(X) = E(X²) − [E(X)]²:

E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫₀^∞ x² [x^(r−1) e^(−αx) α^r / Γ(r)] dx

Multiplying and dividing by r(r + 1)/α², we get

E(X²) = [r(r + 1)/α²] ∫₀^∞ [x^(r+1) e^(−αx) α^(r+2) / Γ(r + 2)] dx

where ∫₀^∞ [x^(r+1) e^(−αx) α^(r+2) / Γ(r + 2)] dx = 1, as it is the pdf of the gamma distribution G(α, r + 2).

Hence, we get E(X²) = r(r + 1)/α²

∴ V(X) = r/α²
To check the pdf defined is valid:

f(x) ≥ 0, and

∫_{−∞}^{∞} f(x) dx = ∫₀^∞ [x^(r−1) e^(−αx) α^r / Γ(r)] dx

Substitute αx = v ⟹ dx = dv/α:

= [α^r/Γ(r)] ∫₀^∞ (v/α)^(r−1) e^(−v) (1/α) dv = [1/Γ(r)] ∫₀^∞ v^(r−1) e^(−v) dv = 1

(since Γ(z) = ∫₀^∞ t^(z−1) e^(−t) dt)

CHI-SQUARE DISTRIBUTION (Continuous distribution)


A continuous random variable 𝑋 is said to have a chi-square distribution if its pdf is given by

f(x) = { x^(n/2 − 1) e^(−x/2) / (2^(n/2) Γ(n/2)),  x > 0
       { 0,                                        elsewhere

Note:
On substituting r = n/2 and α = 1/2 in the gamma distribution, we get the pdf of the chi-square distribution. Hence, the chi-square distribution is a special case of the gamma distribution.
We write 𝑋~𝜒²(𝑛), where 𝑛 is the degrees of freedom.

Mean and Variance:

E(X) = n and V(X) = 2n

(Obtained by substituting r = n/2 and α = 1/2 in the mean and variance of the gamma distribution. The derivation can also be done similarly to that of the gamma distribution.)

Exercise
1. The number of road accidents per day in a city is following a gamma distribution with
an average of 6 and variance of 18. Find the probability that in a day, there will be
(i) more than 8 accidents
(ii) between 5 to 8 accidents
Solution:
r/α = 6 and r/α² = 18 ⟹ r = 2, α = 1/3

So f(x) = (1/9) x e^(−x/3), x > 0, and

(i) P(X > 8) = 1 − ∫₀⁸ (1/9) x e^(−x/3) dx

(ii) P(5 < X < 8) = ∫₅⁸ (1/9) x e^(−x/3) dx

2. The daily consumption of electric power (in million 𝐾𝑤) in a certain city is a random
variable 𝑋 having the pdf f(x) = { (1/9) x e^(−x/3),  x > 0
                                  { 0,                 otherwise
Find the probability that the power supply is inadequate on any given date if the daily
capacity of the power plant is 12 million Kw.
Solution:
P(X > 12) = ∫₁₂^∞ (1/9) x e^(−x/3) dx = 5e^(−4) = 0.0916

Note: V(X) = r/α² = 2/(1/3)² = 18
3. If 𝑋 has the pdf f(x) = { (1/4) x e^(−x/2),  0 < x < ∞
                           { 0,                 otherwise
Find the mean and variance.
Solution:
E(X) = r/α and V(X) = r/α²
We have r − 1 = 1 ⟹ r = 2; −αx = −x/2 ⟹ α = 1/2
∴ E(X) = r/α = 4 and V(X) = r/α² = 8

4. The amount of time required to repair a TV is exponentially distributed with mean 2.


Find
(i) the probability that the required time exceeds 2 hours
(ii) the conditional probability that the required time taken is at least 10 hours
given that already 9 hours have been spent on repairing the TV.
Solution:
E(X) = 1/λ = 2 ⟹ λ = 0.5

Hence, the pdf is given by f(x) = { (1/2) e^(−x/2),  x > 0
                                  { 0,               otherwise

(i) P(X > 2) = ∫₂^∞ (1/2) e^(−x/2) dx = 1/e

(ii) P(X ≥ 10 | X > 9) = P(X ≥ 10 ∩ X > 9)/P(X > 9) = P(X ≥ 10)/P(X > 9) = e^(−1/2)
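Part (ii) is the memoryless property of the exponential distribution: P(X ≥ 10 | X > 9) = P(X ≥ 1). A minimal simulation sketch assuming NumPy (note NumPy's scale parameter is 1/λ):

    import numpy as np

    rng = np.random.default_rng(3)
    t = rng.exponential(scale=2.0, size=2_000_000)   # repair times, mean 2 => lambda = 0.5

    print((t > 2).mean())           # P(X > 2)            ~ 1/e      = 0.3679
    print((t[t > 9] >= 10).mean())  # P(X >= 10 | X > 9)  ~ e^(-1/2) = 0.6065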

5. If 𝑋 ~ 𝐸(𝜆) with 𝑃(𝑋 ≤ 1) = 𝑃(𝑋 > 1), then find 𝑉(𝑋).


Solution:
𝑃(𝑋 ≤ 1) = 𝑃(𝑋 > 1)
1 − 𝑃(𝑋 > 1) = 𝑃(𝑋 > 1)
1
𝑃(𝑋 > 1) =
2

∫₁^∞ λe^(−λx) dx = 1/2

On solving, we get λ = ln 2

V(X) = 1/λ² = 1/(ln 2)²
Functions of One-dimensional random variables
If 𝑋 is a discrete random variable and 𝑌 = 𝐻(𝑋) is a continuous function of 𝑋, then 𝑌 is
also a discrete random variable.
Example:
Let 𝑋 have the probability distribution as follows:

X:      −1     0     1
P(x):   1/3    1/2   1/6

Suppose Y = 3X + 1; then the pmf of 𝑌 is given by

Y:      −2     1     4
P(y):   1/3    1/2   1/6

Suppose Y = X²; then the pmf of 𝑌 is given by

Y:      1      0
P(y):   1/2    1/2

Example:
Suppose 𝑋 takes the values 1, 2, 3, … with P(X = n) = 1/2ⁿ. Let Y = { 1 if X is even; −1 if X is odd }.

Then, P(Y = 1) = Σ_{n = 2,4,6,…} 1/2ⁿ = 1/2² + 1/2⁴ + 1/2⁶ + ⋯ = (1/2²)/(1 − 1/4) = 1/3

P(Y = −1) = Σ_{n = 1,3,5,…} 1/2ⁿ = 1/2 + 1/2³ + 1/2⁵ + ⋯ = (1/2)/(1 − 1/4) = 2/3

Steps to obtain the pdf of 𝒀


Suppose 𝑋 is a continuous random variable with pdf 𝑓(𝑥) and 𝑌 = 𝐻(𝑋) is a continuous
function of 𝑋. Then 𝑌 is a continuous random variable. To obtain the pdf of 𝑌, we follow the
following steps:
1. Obtain the cdf of 𝑌, i.e., 𝐺(𝑦) = 𝑃(𝑌 ≤ 𝑦).
2. Differentiate 𝐺(𝑦) with respect to 𝑦 to get pdf of 𝑦 i.e., 𝑔(𝑦).
3. Determine the range space of 𝑌 such that 𝑔(𝑦) > 0.
Result
Let 𝑋 be a continuous random variable with pdf f(x) and let Y = X². Then the pdf of 𝑌 is given by

g(y) = [1/(2√y)] [f(√y) + f(−√y)]

Exercise
1. If f(x) = { 2x, 0 < x < 1; 0, otherwise } and Y = 3X + 1, then find the pdf of 𝑌.
Solution:
G(y) = P(Y ≤ y) = P(3X + 1 ≤ y) = P(X ≤ (y − 1)/3) = ∫₀^{(y−1)/3} 2x dx = ((y − 1)/3)²

g(y) = G′(y) = 2(y − 1)/9

To find the range for y: 0 < x < 1 ⟹ 0 < (y − 1)/3 < 1 ⟹ 1 < y < 4

Therefore, g(y) = { 2(y − 1)/9,  1 < y < 4
                  { 0,           otherwise
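The derived pdf can be checked against a simulation. A minimal sketch assuming NumPy, sampling X by the inverse-cdf method (F(x) = x² for f(x) = 2x, so x = √u):

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.uniform(0, 1, 1_000_000) ** 0.5   # samples with pdf f(x) = 2x on (0, 1)
    y = 3 * x + 1                             # transformed variable

    hist, edges = np.histogram(y, bins=30, range=(1, 4), density=True)
    mids = (edges[:-1] + edges[1:]) / 2
    print(np.max(np.abs(hist - 2 * (mids - 1) / 9)))   # small: matches g(y) = 2(y-1)/9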

2. If f(x) = { 2x, 0 < x < 1; 0, otherwise } and Y = e^(−X), then find the pdf of 𝑌.
Solution:
G(y) = P(Y ≤ y) = P(e^(−X) ≤ y) = P(X ≥ ln(1/y)) = ∫_{ln(1/y)}^{1} 2x dx = 1 − (ln(1/y))²

g(y) = G′(y) = (2/y) ln(1/y)

To find the range for y: 0 < x < 1 ⟹ 0 < ln(1/y) < 1 ⟹ 1/e < y < 1

Therefore, g(y) = { (2/y) ln(1/y),  1/e < y < 1
                  { 0,              otherwise

3. Suppose f(x) = { 2x e^(−x²), 0 < x < ∞; 0, elsewhere }. Find the pdf of Y = X².
Solution:
g(y) = [1/(2√y)] [f(√y) + f(−√y)] = [1/(2√y)] (2e^(−y)√y + 0) = e^(−y), 0 < y < ∞
√𝑦
4. Suppose f(x) = { (2/9)(x + 1), −1 < x < 1; 0, elsewhere }. Find the pdf of Y = X².
Solution:
g(y) = [1/(2√y)] [f(√y) + f(−√y)] = [1/(2√y)] [(2/9)(√y + 1) + (2/9)(−√y + 1)] = 2/(9√y), 0 < y < 1

5. Suppose that f(x) = { 1/2, −1 < x < 1; 0, otherwise }. Let Y = X². Find the pdf of 𝑌.
Solution: g(y) = { 1/(2√y),  0 < y < 1
                 { 0,         otherwise

Theorem
Let 𝑋 be a continuous random variable with pdf 𝑓(𝑥) where 𝑓(𝑥) > 0 for 𝑎 < 𝑥 < 𝑏.
Suppose 𝑌 = 𝐻(𝑋) is strictly monotone (increasing or decreasing) function of 𝑋, the pdf of
𝑌 is given by
g(y) = f(x) |dx/dy|, where x = H⁻¹(y)

Exercise
1. If f(x) = { 2x, 0 < x < 1; 0, otherwise } and Y = 3X + 1, then find the pdf of 𝑌.
Solution:
Y = 3X + 1 is a strictly monotone function of 𝑋, with x = (y − 1)/3. Therefore,
g(y) = f(x) |dx/dy| = 2 × (y − 1)/3 × |1/3| = 2(y − 1)/9, 1 < y < 4.

Therefore, g(y) = { 2(y − 1)/9,  1 < y < 4
                  { 0,           otherwise

2. If f(x) = { 2x, 0 < x < 1; 0, otherwise } and Y = e^(−X), then find the pdf of 𝑌.

3. Suppose 𝑋 is uniformly distributed over (0, 1); find the pdf of Y = 1/(X + 1).

Solution:
f(x) = { 1,  0 < x < 1
       { 0,  otherwise

We know that Y = 1/(X + 1) is strictly monotonic, with X = 1/Y − 1, so f(x) = f(1/y − 1) = 1 and

|dx/dy| = 1/y², giving g(y) = 1/y², 1/2 < y < 1

4. If 𝑋 is uniformly distributed over (−π/2, π/2), find the pdf of Y = tan X.
OR
If 𝑋 is uniformly distributed over (−π/2, π/2), show that Y = tan X follows Cauchy's distribution.
Solution:
f(x) = { 1/π,  −π/2 < x < π/2
       { 0,    otherwise

We know that Y = tan X is strictly monotonic on this interval.
Then X = tan⁻¹ Y ⟹ f(tan⁻¹ Y) = 1/π and |dx/dy| = 1/(1 + y²)

Therefore, g(y) = 1/(π(1 + y²)), −∞ < y < ∞

5. If 𝑋~𝑁(𝜇, 𝜎²), then show that
   (i) Z = (X − μ)/σ ~ N(0, 1)
   (ii) Y = Z² ~ χ²(1)
Solution:
f(x) = [1/(σ√(2π))] e^(−(1/2)((x−μ)/σ)²)

(i) Z = (X − μ)/σ is a strictly monotonic function.
    Then X = σZ + μ ⟹ f(σZ + μ) = [1/(σ√(2π))] e^(−z²/2) and |dx/dz| = σ.
    Therefore, g(z) = f(x) |dx/dz| = [1/(σ√(2π))] e^(−z²/2) × σ = [1/√(2π)] e^(−z²/2),
    which is the pdf of a standard normal distribution.
    i.e., Z~N(0, 1), −∞ < z < ∞

(ii) Y = Z² is in the standard form Y = X², with f(z) = e^(−z²/2)/√(2π), −∞ < z < ∞.
     Therefore,
     g(y) = [1/(2√y)] [f(√y) + f(−√y)] = [1/(2√y)] [e^(−y/2)/√(2π) + e^(−y/2)/√(2π)]
          = e^(−y/2) y^(1/2 − 1) / (2^(1/2) Γ(1/2)), 0 < y < ∞,
     which is the pdf of a chi-square distribution with 1 degree of freedom.
     i.e., Y = Z² ~ χ²(1)

6. If the continuous random variable 𝑋 has the pdf f(x) = { (2/9)(1 + x), −1 < x < 2; 0, otherwise }, find the pdf of Y = X².
Solution:
Y = X² is not monotonic on (−1, 2). Hence, we split the interval into (−1, 1) and (1, 2).

In (−1, 1):
g(y) = [1/(2√y)] [f(√y) + f(−√y)] = [1/(2√y)] [(2/9)(1 + √y) + (2/9)(1 − √y)] = 2/(9√y), 0 < y < 1

In (1, 2):
Y = X² is strictly monotonic on (1, 2).
X = √y ⟹ f(√y) = (2/9)(1 + √y) and |dx/dy| = 1/(2√y)

g(y) = (2/9)(1 + √y) × 1/(2√y) = (1/9)(1 + 1/√y), 1 < y < 4
Functions of Two-Dimensional Random Variables
Let (𝑋, 𝑌) be a continuous two-dimensional random variable with pdf 𝑓(𝑥, 𝑦). If 𝑍 =
𝐻1 (𝑋, 𝑌) is a continuous function of (𝑋, 𝑌), then 𝑍 will be a continuous (one-dimensional)
random variable.

Steps to find the pdf of 𝒁


In order to find the pdf of 𝑍, the following procedure is used:
1. To find the pdf of 𝑍 = 𝐻1 (𝑋, 𝑌), we first introduce a second random variable, say 𝑊 =
𝐻2 (𝑋, 𝑌), and obtain the joint pdf of 𝑍 and 𝑊, say 𝑘(𝑧, 𝑤).
2. From the knowledge of 𝑘(𝑧, 𝑤), we then obtain the desired pdf of 𝑍, say 𝑔(𝑧), by

simply integrating 𝑘(𝑧, 𝑤) with respect to 𝑤, i.e., g(z) = ∫_{−∞}^{∞} k(z, w) dw.

Two problems which arise here are


(i) How to find the joint pdf 𝑘(𝑧, 𝑤) of 𝑍 and 𝑊
(ii) How to choose the appropriate random variable 𝑊 = 𝐻2 (𝑋, 𝑌)
To resolve these problems, let us simply state that we usually make the simplest possible
choice for 𝑊 as it plays only an intermediate role. In order to find the joint pdf 𝑘(𝑧, 𝑤), we
need the following theorem.

Theorem:
Suppose that (𝑋, 𝑌) is a two-dimensional continuous random variable with joint pdf 𝑓(𝑥, 𝑦).
Let 𝑍 = 𝐻1 (𝑋, 𝑌) and 𝑊 = 𝐻2 (𝑋, 𝑌). Assume that the functions 𝐻1 and 𝐻2 satisfy the
following conditions:
(i) The equations 𝑧 = 𝐻1 (𝑥, 𝑦) and 𝑤 = 𝐻2 (𝑥, 𝑦) may be uniquely solved for 𝑥 and 𝑦 in
terms of 𝑧 and 𝑤, say 𝑥 = 𝐺1 (𝑧, 𝑤) and 𝑦 = 𝐺2 (𝑧, 𝑤).
𝜕𝑥 𝜕𝑥 𝜕𝑦 𝜕𝑦
(ii) The partial derivatives , , , exist and are continuous.
𝜕𝑧 𝜕𝑤 𝜕𝑧 𝜕𝑤
Then the joint pdf (𝑍, 𝑊), say 𝑘(𝑧, 𝑤), is given by
𝑘(𝑧, 𝑤) = 𝑓[𝐺1 (𝑧, 𝑤), 𝐺2 (𝑧, 𝑤)]|𝐽(𝑧, 𝑤)|,
where 𝐽(𝑧, 𝑤) is the following 2 × 2 determinant:

J(z, w) = | ∂x/∂z  ∂x/∂w |
          | ∂y/∂z  ∂y/∂w |

This determinant is called the 'Jacobian' of the transformation (x, y) → (z, w) and is sometimes denoted by ∂(x, y)/∂(z, w). We note that k(z, w) will be nonzero for those values of (z, w)
corresponding to values of (𝑥, 𝑦) for which 𝑓(𝑥, 𝑦) is nonzero.

Exercise
1. Suppose that 𝑋 and 𝑌 are two independent random variables having pdf 𝑓(𝑥) =
𝑒 −𝑥 , 0 ≤ 𝑥 ≤ ∞ and 𝑔(𝑦) = 2𝑒 −2𝑦 , 0 ≤ 𝑦 ≤ ∞. Find the pdf of 𝑋 + 𝑌.
Solution:
Since 𝑋 and 𝑌 are independent, the joint pdf of (𝑋, 𝑌) is given by,
𝑓(𝑥, 𝑦) = 𝑓(𝑥)𝑔(𝑦) = 2𝑒 −(𝑥+2𝑦) , 0 ≤ 𝑥, 𝑦 ≤ ∞
Let Z = X + Y and W = Y. So X = Z − W.

J(z, w) = | ∂x/∂z  ∂x/∂w | = | 1  −1 | = 1
          | ∂y/∂z  ∂y/∂w |   | 0   1 |

Thus the joint pdf of (Z, W) is
k(z, w) = f(x, y)|J| = 2e^(−(x+2y)) = 2e^(−(z+w))
0 ≤ y ≤ ∞ ⇒ 0 ≤ w ≤ ∞
0 ≤ x ≤ ∞ ⇒ 0 ≤ z − w ≤ ∞ ⇒ w ≤ z ≤ ∞

Thus k(z, w) = 2e^(−(z+w)), 0 ≤ w ≤ z ≤ ∞

The required pdf of 𝑍 is h(z) = ∫₀^z 2e^(−(z+w)) dw = 2(e^(−z) − e^(−2z)), 0 ≤ z ≤ ∞
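The convolution result can be checked by simulation. A minimal sketch assuming NumPy (NumPy's exponential scale is 1/λ, so g(y) = 2e^(−2y) corresponds to scale = 0.5):

    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.exponential(scale=1.0, size=1_000_000)    # f(x) = e^(-x)
    y = rng.exponential(scale=0.5, size=1_000_000)    # g(y) = 2e^(-2y)

    z = x + y
    hist, edges = np.histogram(z, bins=40, range=(0, 5), density=True)
    mids = (edges[:-1] + edges[1:]) / 2
    print(np.max(np.abs(hist - 2 * (np.exp(-mids) - np.exp(-2 * mids)))))  # small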

2. 𝑋~𝑁(0, 𝜎 2 ), 𝑌~ 𝑁(0, 𝜎 2 ) and 𝑋, 𝑌 are independent. Find the pdf of 𝑅 = √𝑋 2 + 𝑌 2


Solution:
Pdf of 𝑋: f(x) = [1/(√(2π)σ)] e^(−x²/(2σ²)), −∞ ≤ x ≤ ∞
Pdf of 𝑌: g(y) = [1/(√(2π)σ)] e^(−y²/(2σ²)), −∞ ≤ y ≤ ∞

Since 𝑋 and 𝑌 are independent, the joint pdf of (𝑋, 𝑌) is given by
f(x, y) = f(x)g(y) = [1/(2πσ²)] e^(−(x²+y²)/(2σ²)), −∞ ≤ x, y ≤ ∞

Let R = √(X² + Y²) and θ = tan⁻¹(Y/X), that is, X = R cos θ and Y = R sin θ, and the Jacobian J = R.

Thus the joint pdf of (R, θ) is
k(R, θ) = f(x, y)|J| = [R/(2πσ²)] e^(−R²/(2σ²)), R ≥ 0, 0 ≤ θ ≤ 2π

The required pdf of 𝑅 is h(R) = ∫₀^{2π} [R/(2πσ²)] e^(−R²/(2σ²)) dθ

= (R/σ²) e^(−R²/(2σ²)), R ≥ 0

3. If 𝑋1 , 𝑋2 are independent and have standard normal distribution, i.e. 𝑋1 , 𝑋2 ~𝑁(0,1),


find the pdf of X₁/X₂.
Solution:
Pdf of 𝑋₁: f(x₁) = [1/√(2π)] e^(−x₁²/2), −∞ ≤ x₁ ≤ ∞
Pdf of 𝑋₂: g(x₂) = [1/√(2π)] e^(−x₂²/2), −∞ ≤ x₂ ≤ ∞

Since 𝑋₁, 𝑋₂ are independent, the joint pdf of (𝑋₁, 𝑋₂) is given by
f(x₁, x₂) = [1/(2π)] e^(−(x₁²+x₂²)/2), −∞ ≤ x₁, x₂ ≤ ∞

Let Z = X₁/X₂ and W = X₂, i.e., X₂ = W and X₁ = ZW.

The Jacobian J = | w  z | = w
                 | 0  1 |

Thus the joint pdf of (Z, W) is
k(z, w) = [|w|/(2π)] e^(−w²(1+z²)/2), −∞ ≤ w, z ≤ ∞

The required pdf of 𝑍 is h(z) = ∫_{−∞}^{∞} [|w|/(2π)] e^(−w²(1+z²)/2) dw = (2/(2π)) ∫₀^∞ w e^(−w²(1+z²)/2) dw

On substituting w²(1 + z²)/2 = t, so that w(1 + z²) dw = dt,

we get h(z) = (1/π) ∫₀^∞ e^(−t)/(1 + z²) dt = 1/(π(1 + z²)), −∞ ≤ z ≤ ∞
4. The joint pdf of the random variable (𝑋, 𝑌) is given by
f(x, y) = (x/2) e^(−y), 0 < x < 2, y > 0
Find the pdf of 𝑋 + 𝑌
Solution:
Let Z = X + Y and W = Y, that is, Y = W and X = Z − W.
The Jacobian J = 1.
Thus the joint pdf of (Z, W) is k(z, w) = f(x, y)|J| = [(z − w)/2] e^(−w),
0 ≤ y ≤ ∞ ⇒ 0 ≤ w ≤ ∞
0 ≤ x ≤ 2 ⇒ 0 ≤ z − w ≤ 2 ⇒ w ≤ z ≤ 2 + w

k(z, w) = [(z − w)/2] e^(−w), 0 ≤ w ≤ z ≤ 2 + w

The required pdf of 𝑍 is

h(z) = { ∫₀^z [(z − w)/2] e^(−w) dw,         when 0 < z < 2
       { ∫_{z−2}^{z} [(z − w)/2] e^(−w) dw,  when 2 < z < ∞

h(z) = { (1/2)(z + e^(−z) − 1),    when 0 < z < 2
       { (1/2)(e^(−z) + e^(2−z)),  when 2 < z < ∞
2

5. If 𝑋 and 𝑌 each follow an exponential distribution with parameter 1 and are


independent, find the pdf of 𝑈 = 𝑋 − 𝑌. Given that 𝑓(𝑥) = 𝑒 −𝑥 , 𝑥 > 0, and 𝑓(𝑦) =
𝑒 −𝑦 , 𝑦 > 0
Solution:
Since 𝑋 and 𝑌 are independent,
f(x, y) = f(x)f(y) = e^(−(x+y)), x, y > 0
Let Z = X − Y and W = Y, that is, Y = W and X = Z + W.
Therefore J = 1.
The joint pdf of (Z, W) is given by
k(z, w) = |J| f(x, y) = e^(−(x+y)) = e^(−(z+2w))

The range space of (Z, W) is given by w > −z when z < 0, and w > 0 when z > 0.
Now the pdf of 𝑍 is given by

h(z) = ∫_{−z}^{∞} e^(−(z+2w)) dw, when z < 0
     = ∫₀^{∞} e^(−(z+2w)) dw, when z > 0

h(z) = { (1/2) e^z,     when z < 0
       { (1/2) e^(−z),  when z > 0
2
Moment generating functions (mgf)
Reference: 1. Meyer PL. Introductory probability and statistical applications. Oxford and IBH Publishing; 1965.
2. Johnson, Richard A., Irwin Miller, and John E. Freund. "Probability and statistics for engineers." (2000).

Definition:

The moment generating function of a r.v. 𝑋 is defined by

M_X(t) = E(e^(tX)) = { Σᵢ e^(t xᵢ) p_X(xᵢ)           (discrete case)
                     { ∫_{−∞}^{∞} e^(tx) f_X(x) dx   (continuous case)   ----------(1)

where 𝑡 is a real variable.

Note:

M_X(t) may not exist for all r.v.'s 𝑋. In general, M_X(t) will exist only for those values of 𝑡 for which the sum or integral of Eq. (1) converges absolutely.

Suppose that M_X(t) exists. If we expand e^(tX) formally and take expectations, then

M_X(t) = E(e^(tX)) = E[1 + tX + (tX)²/2! + ⋯ + (tX)^k/k! + ⋯]

       = 1 + tE(X) + (t²/2!)E(X²) + ⋯ + (t^k/k!)E(X^k) + ⋯   ----------(2)

Differentiating (2) w.r.t. 𝑡 and evaluating at t = 0, we get

M′_X(0) = [E(X) + (2t/2!)E(X²) + ⋯ + (kt^(k−1)/k!)E(X^k) + ⋯]|_{t=0} = E(X)

Similarly, M″_X(0) = (d²/dt²) M_X(t)|_{t=0} = E(X²)

and the k-th moment of 𝑋 is given by

m_k = E(X^k) = M_X^(k)(0), k = 1, 2, ⋯   ----------(3)

where M_X^(k)(0) = (d^k/dt^k) M_X(t)|_{t=0}   ----------(4)

Note:

E(X) is the coefficient of t in M_X(t).

E(X²) is the coefficient of t²/2! in M_X(t).

E(Xⁿ) is the coefficient of tⁿ/n! in M_X(t).
Exercise
1. Suppose that 𝑋 has pdf f(x) = e^(−|x|)/2, −∞ < x < ∞. Find 𝐸(𝑋) and 𝑉(𝑋) using the mgf.

Solution:

M_X(t) = ∫_{−∞}^{∞} e^(tx) f(x) dx

       = ∫_{−∞}^{∞} e^(tx) e^(−|x|)/2 dx

       = ∫_{−∞}^{0} e^(tx) e^x/2 dx + ∫₀^{∞} e^(tx) e^(−x)/2 dx

       = ∫_{−∞}^{0} e^((1+t)x)/2 dx + ∫₀^{∞} e^(−(1−t)x)/2 dx

M_X(t) = 1/(1 − t²), −1 < t < 1

M′_X(0) = 0 and M″_X(0) = 2

E(X) = 0 and V(X) = 2.

Or: M_X(t) = 1/(1 − t²) = 1 + t² + t⁴ + t⁶ + ⋯ = 1 + 2(t²/2!) + t⁴ + t⁶ + ⋯

The coefficient of t is E(X) = 0. The coefficient of t²/2! is E(X²) = 2, so V(X) = 2 − 0 = 2.
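The moment-extraction step can be mechanized with a computer algebra system. A minimal SymPy sketch, starting from the closed form M(t) = 1/(1 − t²) found above:

    import sympy as sp

    t = sp.symbols('t')
    M = 1 / (1 - t**2)            # mgf found above, valid for |t| < 1

    print(sp.diff(M, t).subs(t, 0))      # E(X)   = M'(0)  = 0
    print(sp.diff(M, t, 2).subs(t, 0))   # E(X^2) = M''(0) = 2
    print(sp.series(M, t, 0, 5))         # 1 + t**2 + t**4 + O(t**5): only even moments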

2. Let 𝑋 be a random variable taking the values 0, 1, 2, ⋯ with f(x) = ab^x, a, b > 0 and a + b = 1. Find the mgf of 𝑋. If E(X) = m₁ and E(X²) = m₂, show that m₂ = m₁(2m₁ + 1).

Solution:

M_X(t) = Σ_{x=0}^{∞} ab^x e^(tx) = a Σ_{x=0}^{∞} (be^t)^x = a/(1 − be^t)

E(X) = M′_X(0) = ab/(1 − b)²

E(X²) = M″_X(0) = (1 + b)ab/(1 − b)³

To prove m₂ = m₁(2m₁ + 1), consider

m₁(2m₁ + 1) = [ab/(1 − b)²][2ab/(1 − b)² + 1] = [ab/(1 − b)²] × [2ab + (1 − b)²]/(1 − b)²

Using a = 1 − b, we get 2ab + (1 − b)² = (1 − b)(2b + 1 − b) = (1 − b)(1 + b), so

m₁(2m₁ + 1) = ab(1 − b)(1 + b)/(1 − b)⁴ = ab(1 + b)/(1 − b)³ = m₂.

Moment generating function for binomial distribution

Let 𝑋 have the binomial distribution with probability distribution

𝑏(𝑥|𝑛, 𝑝) = (𝑛𝑥)𝑝 𝑥 (1 − 𝑝)𝑛−𝑥 for 𝑥 = 0,1, ⋯ , 𝑛

Show that

𝑀(𝑡) = (1 − 𝑝 + 𝑝𝑒 𝑡 )𝑛 for all 𝑡

𝐸(𝑋) = 𝑛𝑝 and 𝑉𝑎𝑟(𝑋) = 𝑛𝑝(1 − 𝑝)

Solution:

By definition of moment generating function


M(t) = Σ_{x=0}^{n} e^(tx) nCx p^x (1 − p)^(n−x)

     = Σ_{x=0}^{n} nCx (e^t p)^x (1 − p)^(n−x)

     = (pe^t + 1 − p)ⁿ for all 𝑡

where we have used the binomial formula

(a + b)ⁿ = Σ_{x=0}^{n} nCx a^x b^(n−x)

Differentiating 𝑀(𝑡), we find

𝑀′ (𝑡) = 𝑛𝑝𝑒 𝑡 (𝑝𝑒 𝑡 + 1 − 𝑝)𝑛−1

𝑀′′ (𝑡) = (𝑛 − 1)𝑛𝑝2 𝑒 2𝑡 (𝑝𝑒 𝑡 + 1 − 𝑝)𝑛−2 + 𝑛𝑝𝑒 𝑡 (𝑝𝑒 𝑡 + 1 − 𝑝)𝑛−1

Evaluating these derivatives at 𝑡 = 0, we obtain the moments

E(X) = np

E(X²) = (n − 1)np² + np
Also, the variance is

𝑉𝑎𝑟(𝑋) = 𝐸(𝑋 2 ) − [𝐸(𝑋)]2 = 𝑛𝑝(1 − 𝑝)

Moment generating function for Poisson distribution

Let 𝑋 have the Poisson distribution with probability distribution


f(x) = e^(−λ) λ^x/x! for x = 0, 1, ⋯, ∞

Show that
M(t) = e^(λ(e^t − 1)) for all 𝑡

𝐸(𝑋) = 𝜆 and 𝑉𝑎𝑟(𝑋) = 𝜆

The mean and variance of the Poisson distribution are equal.

Solution:

By definition of the moment generating function



M(t) = Σ_{x=0}^{∞} e^(tx) e^(−λ) λ^x/x!

     = Σ_{x=0}^{∞} (λe^t)^x e^(−λ)/x!

     = e^(−λ) e^(λe^t)

     = e^(λ(e^t − 1)), for −∞ < t < ∞

Differentiating 𝑀(𝑡), we find


M′(t) = λe^t e^(λ(e^t − 1))

M″(t) = λe^t e^(λ(e^t − 1)) + λ²e^(2t) e^(λ(e^t − 1))

Evaluating these derivatives at 𝑡 = 0, we obtain the moments

𝐸(𝑋) = 𝜆

𝐸(𝑋 2 ) = 𝜆 + 𝜆2

Variance is
𝑉𝑎𝑟 (𝑋) = 𝐸(𝑋 2 ) − [𝐸(𝑋)]2 = 𝜆

Moment generating function for Gamma Distribution

f(x) = { x^(r−1) e^(−αx) α^r / Γ(r),  x > 0, α, r > 0
       { 0,                           elsewhere

M_X(t) = ∫_{−∞}^{∞} e^(tx) f(x) dx = [α^r/Γ(r)] ∫₀^∞ e^(−(α−t)x) x^(r−1) dx

Substitute x(α − t) = v, so that dx = dv/(α − t):

M_X(t) = [α^r/Γ(r)] ∫₀^∞ e^(−v) [v/(α − t)]^(r−1) dv/(α − t)

       = [α^r/(Γ(r)(α − t)^r)] ∫₀^∞ e^(−v) v^(r−1) dv

       = [α^r/(Γ(r)(α − t)^r)] Γ(r) = (α/(α − t))^r, t < α

E(X) = r/α

E(X²) = r(r + 1)/α²

V(X) = r/α²

Moment generating function for Exponential Distribution

Note: When we sub r=1 in gamma distribution we get exponential distribution.


f(x) = { λe^(−λx),  x > 0
       { 0,         otherwise

M_X(t) = ∫₀^∞ e^(tx) f(x) dx = λ ∫₀^∞ e^(−(λ−t)x) dx = λ [e^(−(λ−t)x)/(−(λ − t))] (from 0 to ∞) = λ/(λ − t)

M_X(t) = λ/(λ − t), t < λ

E(X) = 1/λ

V(X) = 1/λ²
Moment generating function for chi square Distribution
Special case of the gamma distribution: substituting r = n/2 and α = 1/2 in the gamma pdf gives the χ² distribution.

A continuous random variable X is said to have a chi-square distribution if its PDF is given
by
f(x) = { x^(n/2 − 1) e^(−x/2) / (2^(n/2) Γ(n/2)),  x > 0
       { 0,                                        elsewhere

M_X(t) = ∫_{−∞}^{∞} e^(tx) f(x) dx = [1/(2^(n/2) Γ(n/2))] ∫₀^∞ e^((t − 1/2)x) x^(n/2 − 1) dx

M_X(t) = (1 − 2t)^(−n/2), t < 1/2

E(X) = n and V(X) = 2n

Moment generating function for Uniform Distribution


f(x) = { 1/(b − a),  a ≤ x ≤ b
       { 0,          elsewhere

M_X(t) = ∫_{−∞}^{∞} e^(tx) f(x) dx = [1/(b − a)] ∫_a^b e^(tx) dx = (e^(bt) − e^(at))/(t(b − a))

Expand and pick out the suitable coefficients in the expansion to find 𝐸(𝑋) and 𝑉(𝑋):

Mean E(X) = (a + b)/2

E(X²) = (a² + b² + ab)/3

Variance V(X) = E(X²) − [E(X)]² = (b − a)²/12
Moment generating function for normal distribution

Show that the normal distribution, whose probability density function is

f(x) = [1/(√(2π)σ)] e^(−(x−μ)²/(2σ²)), has M(t) = e^(tμ + t²σ²/2),

which exists for all 𝑡. Also, verify the first two moments.

Solution:
M_X(t) = [1/(√(2π)σ)] ∫_{−∞}^{+∞} e^(tx) e^(−(1/2)((x−μ)/σ)²) dx

Let (x − μ)/σ = s; thus x = σs + μ and dx = σ ds. Therefore,

M_X(t) = [1/√(2π)] ∫_{−∞}^{+∞} e^(t(σs+μ)) e^(−s²/2) ds

       = e^(tμ) [1/√(2π)] ∫_{−∞}^{+∞} e^(σts) e^(−s²/2) ds

       = e^(tμ) [1/√(2π)] ∫_{−∞}^{+∞} e^(−(1/2)(s² − 2σts)) ds

       = e^(tμ) [1/√(2π)] ∫_{−∞}^{+∞} e^(−(1/2)[(s − σt)² − σ²t²]) ds

       = e^(tμ + σ²t²/2) [1/√(2π)] ∫_{−∞}^{+∞} e^(−(1/2)(s − σt)²) ds

Let s − σt = v; then ds = dv and we obtain

M_X(t) = e^(tμ + σ²t²/2) [1/√(2π)] ∫_{−∞}^{+∞} e^(−v²/2) dv = e^(tμ + σ²t²/2)

[as ∫_{−∞}^{∞} e^(−x²) dx = Γ(1/2) = √π and ∫_{−∞}^{∞} e^(−v²/2) dv = √2 Γ(1/2) = √(2π)]
2 2

To obtain the moments of the normal, we differentiate once to obtain

M′(t) = e^(tμ + σ²t²/2) (μ + tσ²)

and a second time to get M″(t) = e^(tμ + σ²t²/2) [(μ + tσ²)² + σ²]
Setting 𝑡 = 0, 𝐸(𝑋) = 𝑀′ (0) = 𝜇 and 𝐸(𝑋 2 ) = 𝑀′′ (0) = 𝜎 2 + 𝜇2 .

So 𝑉𝑎𝑟(𝑋) = 𝜎 2 .

Summarizing the MGFs of the distributions:

1. Binomial distribution: M_X(t) = (pe^t + q)ⁿ
2. Poisson distribution: M_X(t) = e^(λ(e^t − 1))
3. Normal distribution: M_X(t) = e^(tμ + σ²t²/2)
4. Exponential distribution: M_X(t) = λ/(λ − t), t < λ
5. Gamma distribution: M_X(t) = (α/(α − t))^r
6. Chi-square distribution: M_X(t) = (1 − 2t)^(−n/2)
7. Uniform distribution: M_X(t) = (e^(bt) − e^(at))/(t(b − a))

Properties of MGF
1. Suppose that the random variable 𝑋 has mgf 𝑀𝑋 . Let 𝑌 = 𝛼𝑋 + 𝛽. Then 𝑀𝑌 , the mgf
of the random variable 𝑌, is given by

𝑀𝑌 (𝑡) = 𝑒 𝛽𝑡 𝑀𝑋 (𝛼𝑡)

Proof: M_Y(t) = E(e^(Yt)) = E[e^((αX+β)t)] = e^(βt) E(e^(αtX)) = e^(βt) M_X(αt)

2. Let 𝑋 and 𝑌 be two random variables with mgf’s, 𝑀𝑋 (𝑡) and 𝑀𝑌 (𝑡), respectively. If
𝑀𝑋 (𝑡) = 𝑀𝑌 (𝑡) for all values of 𝑡, then 𝑋 and 𝑌 have the same probability distribution.

3. Suppose that 𝑋 and 𝑌 are independent random variables. Let 𝑍 = 𝑋 + 𝑌. Let 𝑀𝑋 (𝑡),
𝑀𝑌 (𝑡) and 𝑀𝑍 (𝑡) be the mgf’s of the random variables 𝑋, 𝑌 and 𝑍, respectively.

Then 𝑀𝑍 (𝑡) = 𝑀𝑋 (𝑡)𝑀𝑌 (𝑡).

4. Suppose that 𝑋 has distribution 𝑁(𝜇, 𝜎 2 ). Let 𝑌 = 𝛼𝑋 + 𝛽. Then 𝑌 is again normally


distributed. Then mgf of 𝑌 is 𝑀𝑌 (𝑡) = 𝑒 𝛽𝑡 𝑀𝑋 (𝛼𝑡).
Proof: The mgf of the normal distribution is M_X(t) = e^(μt + σ²t²/2).

Then the mgf of 𝑌 is M_Y(t) = e^(βt) M_X(αt) = e^(βt) e^(αμt + α²σ²t²/2) = e^((β+αμ)t) e^(α²σ²t²/2)

But this is the mgf of a normally distributed random variable with expectation αμ + β and variance α²σ². Thus the distribution of 𝑌 is normal.
Reproductive Properties of mgf
1. Suppose that 𝑋 and 𝑌 are independent random variables with distributions N(μ₁, σ₁²) and N(μ₂, σ₂²), respectively. Let Z = X + Y. Hence

M_Z(t) = M_X(t)M_Y(t) = e^(μ₁t + σ₁²t²/2) e^(μ₂t + σ₂²t²/2) = e^((μ₁+μ₂)t + (σ₁²+σ₂²)t²/2)
Theorem: (The reproductive property of the normal distribution)
Let 𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 be 𝑛 independent random variables with distribution 𝑁(𝜇𝑖 , 𝜎𝑖2 ), 𝑖 =
1,2, ⋯ , 𝑛.

Let 𝑍 = 𝑋1 + ⋯ + 𝑋𝑛 . Then 𝑍 has distribution 𝑁(∑𝑛𝑖=1 𝜇𝑖 , ∑𝑛𝑖=1 𝜎𝑖2 ).

Note: The Poisson distribution also possesses a reproductive property.


Theorem:
Let 𝑋1 , ⋯ , 𝑋𝑛 be independent random variables. Suppose that 𝑋𝑖 has a Poisson distribution
with parameter 𝛼𝑖 , 𝑖 = 1,2, ⋯ , 𝑛. Let 𝑍 = 𝑋1 + ⋯ + 𝑋𝑛 . Then 𝑍 has a Poisson distribution
with parameter 𝛼 = 𝛼1 + ⋯ + 𝛼𝑛 .
Theorem:
Suppose that the distribution of 𝑋𝑖 is 𝜒𝑛2𝑖 , 𝑖 = 1,2, ⋯ , 𝑘, where the 𝑋𝑖′ 𝑠 are independent
random variables. Let 𝑍 = 𝑋1 + ⋯ + 𝑋𝑘 . Then 𝑍 has distribution 𝜒𝑛2 , where 𝑛 = 𝑛1 + ⋯ +
𝑛𝑘 .
Proof: M_{Xᵢ}(t) = (1 − 2t)^(−nᵢ/2), i = 1, 2, ⋯, k

Hence M_Z(t) = M_{X₁}(t) ⋯ M_{Xₖ}(t) = (1 − 2t)^(−(n₁+⋯+nₖ)/2).


Exercise
1. Find the mgf of a random variable 𝑋 which is uniformly distributed over the interval (−a, a), and hence find E(X^(2n)).

Solution: X~U(−a, a). We know M_X(t) = (e^(bt) − e^(at))/(t(b − a)), so

M_X(t) = (e^(at) − e^(−at))/(2at), which we expand as a power series.

E(X^(2n)) = coefficient of t^(2n)/(2n)! in M_X(t) = a^(2n)/(2n + 1)

2. If 𝑋 is normally distributed with mean μ and variance σ², show that E(X − μ)^(2n) = 1·3·5 ⋯ (2n − 1) σ^(2n).

Solution: Given X~N(μ, σ²), M_X(t) = e^(μt + σ²t²/2)

Let Y = X − μ. To get E(Y^(2n)), we need the coefficient of t^(2n)/(2n)! in M_Y(t).

M_Y(t) = E(e^(ty)) = E(e^(t(x−μ))) = e^(−μt) E(e^(tx)) = e^(−μt) M_X(t)

M_Y(t) = e^(−μt) e^(μt + σ²t²/2) = e^(σ²t²/2) = 1 + (σ²t²/2) + (1/2!)(σ²t²/2)² + (1/3!)(σ²t²/2)³ + ⋯

E(Y^(2n)) = coefficient of t^(2n)/(2n)! in M_Y(t) = [σ^(2n)/(2ⁿ n!)] (2n)! = σ^(2n) (2n)!/(2ⁿ n!)

= [(2n)(2n − 1)(2n − 2) ⋯ (3)(2)(1) / ((2n)(2n − 2)(2n − 4) ⋯ (6)(4)(2))] σ^(2n)

= 1·3·5 ⋯ (2n − 1) σ^(2n).

3. Let X₁~χ²(3), X₂~χ²(5) and Z = X₁ + X₂, where 𝑋₁ and 𝑋₂ are independent random variables. Find M_Z(t), V(Z) and the pdf of 𝑍.

Solution:
X₁~χ²(3) gives M_{X₁}(t) = (1 − 2t)^(−3/2)
X₂~χ²(5) gives M_{X₂}(t) = (1 − 2t)^(−5/2)

M_Z(t) = M_{X₁}(t)M_{X₂}(t) = (1 − 2t)^(−8/2), i.e., Z~χ²(8)

Therefore, V(Z) = 2n = 2(8) = 16, where n = 8.

f(z) = { z^(n/2 − 1) e^(−z/2) / (2^(n/2) Γ(n/2)) = z³ e^(−z/2) / (2⁴ Γ(4)),  z > 0
       { 0,                                                                  otherwise

4. Let 𝑋₁, 𝑋₂ and 𝑋₃ be 3 independent random variables having normal distributions with parameters (4, 1), (5, 2), (7, 3) respectively. Let Y = 2X₁ + 2X₂ + X₃. Find the pdf of V = ((Y − μ)/σ)², where μ and σ are the mean and standard deviation of 𝑌.

Solution:
M_X(t) = e^(μt + σ²t²/2)

Y~N(2×4 + 2×5 + 1×7, 2²×1 + 2²×2 + 1²×3)

Y~N(25, 15)

M_Y(t) = e^(25t + 15t²/2)

We have: if X~N(μ, σ²), then Z = (X − μ)/σ ~ N(0, 1) and Z² ~ χ²(1).

𝑌 is normal, so (Y − 25)/√15 is standard normal, and its square is chi-square with n = 1 degree of freedom.

Therefore, V = ((Y − 25)/√15)² = Z² ~ χ²(1)

f(v) = { e^(−v/2) v^(−1/2) / (2^(1/2) Γ(1/2)) = e^(−v/2) v^(−1/2) / √(2π),  v > 0
       { 0,                                                                 otherwise

5. If a random variable 𝑋 has mgf M_X(t) = 3/(3 − t), find the standard deviation of the random variable 𝑋.

Solution:

M_X(t) = 3/(3 − t) = 1/(1 − t/3) = (1 − t/3)^(−1) = 1 + t/3 + (t/3)² + ⋯

E(X) = coefficient of t in M_X(t) = 1/3

E(X²) = coefficient of t²/2! in M_X(t) = 2! (1/9) = 2/9

V(X) = 1/9. Hence, σ = 1/3.
6. Let 𝑋 be a random variable having probability mass function P(X = k) = p(1 − p)^(k−1), k = 1, 2, 3, …. Find M_X(t) and V(X).

Solution:

M_X(t) = Σ_{k=1}^{∞} e^(tk) P(X = k) = Σ_{k=1}^{∞} e^(tk) p(1 − p)^(k−1)

       = pe^t Σ_{k=1}^{∞} [e^t (1 − p)]^(k−1)

       = pe^t {1 + (e^t (1 − p)) + (e^t (1 − p))² + ⋯}

M_X(t) = pe^t / (1 − e^t (1 − p))

M′_X(t) = pe^t / [1 − e^t (1 − p)]²; at t = 0, E(X) = 1/p

and

E(X²) = 1/p + 2(1 − p)/p²

V(X) = (1 − p)/p²

7. If 𝑋 has pdf f(x) = λe^(−λ(x−a)), x ≥ a, find the mgf of 𝑋 and hence find V(X).

Solution: This is a shifted exponential distribution.

M_X(t) = ∫_{−∞}^{∞} e^(tx) f(x) dx = ∫_a^∞ e^(tx) λe^(−λ(x−a)) dx = λe^(λa) ∫_a^∞ e^(−(λ−t)x) dx

       = λe^(λa) [e^(−(λ−t)x)/(−(λ − t))] (from a to ∞) = λe^(λa) e^(−(λ−t)a)/(λ − t) = λe^(at)/(λ − t), λ > t

E(X) = a + 1/λ and V(X) = 1/λ²
𝑡 −1)
8. If the mgf of discrete random variable is 𝑒 4(𝑒 then find P(𝑋 = 𝜇 + 𝜎) where 𝜇 and 𝜎
are the mean and S.D of 𝑋.
𝑡 −1)
Solution: 𝑀𝑋 (𝑡) = 𝑒 4(𝑒

𝑋~𝑝(𝛼) where 𝛼 = 4

Therefore 𝐸(𝑋) = 𝑉(𝑋) = 4,

Which gives 𝜇 = 4 𝑎𝑛𝑑 𝜎 = 2.


𝑒 −𝛼 𝛼 𝑘
𝑃(𝑋 = 𝑘)= , 𝑘 = 0,1,2, ⋯ , 𝑛
𝑘!

For, 𝛼 = 4,
𝑒 −4 4𝑘
𝑃(𝑋 = 𝑘) =
𝑘!

𝑒 −4 46
To find P(𝑋 = 𝜇 + 𝜎)= P(𝑋 = 6)= = 0.1042.
6!
Sampling

In statistical investigation the interest usually lies in the assessment of variation of one or
more characteristics of objects belonging to a group. This group of objects under study is
called a population. The population may be finite or infinite. Examining the entire
population for sake of assessment of variation may be difficult or even impossible to do. In
such situations we consider a sample. A finite subset of statistical individuals in a
population is called a sample and the number of individuals in a sample is called the sample
size. The process of obtaining a sample is called a sampling. Sampling in which each
member may be chosen more than once is called sampling with replacement. While if each
member cannot be chosen more than once is called sampling without replacement.

Random sampling: If the sample units are selected at random then it is called random
sampling. In this case each unit of population has an equal chance of being included in the
sample.

Let 𝑋 be a random variable with a probability distribution having mean 𝜇 and variance 𝜎².
Let 𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 be 𝑛 independent random variables each having the same distribution of
𝑋. Then (𝑋1 , 𝑋2 , ⋯ , 𝑋𝑛 ) is called a random sample of size 𝑛 from 𝑋. 𝜇 is called population
mean and 𝜎 2 is called population variance.

• Let (𝑋1, 𝑋2, ⋯, 𝑋𝑛) be a random sample of size 𝑛 from 𝑋. The statistic 𝑋̅ = (∑ 𝑋𝑖)/𝑛 is called the sample mean and 𝑆² = (∑ (𝑋𝑖 − 𝑋̅)²)/𝑛 is called the sample variance.

• Let 𝑋 be a random variable with expectation 𝐸(𝑋) = 𝜇 and variance 𝑉(𝑋) = 𝜎²; then 𝐸(𝑋̅) = 𝜇 and 𝑉(𝑋̅) = 𝜎²/𝑛, where 𝑛 is the sample size.

• Let (𝑋1, 𝑋2, ⋯, 𝑋𝑛) be a random sample of size 𝑛 ≥ 2 from a distribution 𝑁(𝜇, 𝜎²); then the sample mean 𝑋̅ ~ 𝑁(𝜇, 𝜎²/𝑛) and 𝑛𝑆²/𝜎² ~ 𝜒²(𝑛 − 1).
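These three facts are easy to see empirically. A minimal simulation sketch, assuming numpy is available (the parameter values are illustrative):

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
print(xbar.mean())                  # ~ mu, since E(Xbar) = mu
print(xbar.var())                   # ~ sigma^2/n = 0.4

S2 = samples.var(axis=1)            # divide-by-n sample variance
print((n * S2 / sigma**2).mean())   # ~ n - 1 = 9, the mean of chi-square(n-1)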
Exercise
1. Let 𝑋̅ be the mean of a sample of size 5 from a normal distribution with mean 𝜇 = 0,
variance 125. Find 𝐶 so that 𝑃(𝑋̅ < 𝑐) = 0.9.

Solution: 𝑛 = 5, 𝜇 = 0 and 𝜎² = 125

P(𝑋̅ < C) = P((𝑋̅ − μ)/(σ/√n) < (C − 0)/√(125/5)) = P(𝑍 < C/5) = 0.9

𝜙(C/5) = 0.9
⇒ C/5 = 1.28
⇒ 𝐶 = 6.4
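The same 𝐶 can be found with the normal quantile function; a sketch assuming scipy is available:

from scipy.stats import norm

n, mu, var = 5, 0.0, 125.0
se = (var / n) ** 0.5               # sigma/sqrt(n) = 5
C = mu + se * norm.ppf(0.9)         # norm.ppf(0.9) ~ 1.2816
print(C)                            # ~6.41 (6.4 with the table value 1.28)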

2. If 𝑋̅ is the mean of a random sample of size 𝑛 from a normal distribution with mean 𝜇 and variance 100, find 𝑛 so that Pr{𝜇 − 5 < 𝑋̅ < 𝜇 + 5} = 0.954.

Solution: 𝜎² = 100

P(𝜇 − 5 < 𝑋̅ < μ + 5) = P(−5/(σ/√n) < (𝑋̅ − μ)/(σ/√n) < 5/(σ/√n)) = 0.954

⇒ 2𝜙(5√𝑛/10) − 1 = 0.954 ⇒ 𝜙(√𝑛/2) = 0.977 ⇒ √𝑛/2 = 2

Ans: 𝑛 = 16

3. Let 𝑆 2 be the variance of a random sample of size 6 from 𝑁(𝜇, 12), then find

Pr {2.3 < 𝑆 2 < 22.2}.


Solution: 𝑋̅ ~ 𝑁(𝜇, 12/6) and 𝑛𝑆²/𝜎² ~ 𝜒²(𝑛 − 1)

⇒ 6𝑆²/12 = 𝑆²/2 ~ 𝜒²(5)

Pr{2.3 < 𝑆² < 22.2} = Pr{1.15 < 𝑆²/2 < 11.1} = 0.950 − 0.050 = 0.90,

using the chi-square table with 5 degrees of freedom.


Central Limit Theorem

Let 𝑋1, ⋯, 𝑋𝑛 denote a random sample of size 𝑛 from a distribution that has mean 𝜇 and variance 𝜎². Then the random variable 𝑌𝑛 = (∑ 𝑋𝑖 − 𝑛𝜇)/(𝜎√𝑛) = (𝑋̅ − 𝜇)/(𝜎/√𝑛) has a limiting distribution 𝑁(0,1).

Exercise

1. Compute the approximate probability that the mean of a sample of size 15 from a distribution having pdf 𝑓(𝑥) = 3𝑥², 0 < 𝑥 < 1 (and 0 elsewhere) is between 3/5 and 4/5.

Solution:

We know how to find the mean and variance from the given pdf of a one-dimensional random variable.

Here 𝜇 = ∫₀¹ 𝑥 ⋅ 3𝑥² 𝑑𝑥 = 3/4, 𝜎² = 3/80, 𝑛 = 15

Then by the Central Limit Theorem, (𝑋̅ − μ)/(σ/√n) ~ 𝑁(0,1), with σ/√n = √(3/(80 × 15)) = 0.05.

P(3/5 < 𝑋̅ < 4/5) = P((3/5 − μ)/(σ/√n) < (𝑋̅ − μ)/(σ/√n) < (4/5 − μ)/(σ/√n)) = P(−3 < Z < 1) = 0.84

2. Let 𝑋̅ be the mean of a random sample of size 100 from a distribution which is 𝜒²(50). Compute an approximate value of Pr{49 < 𝑋̅ < 51}.

Solution: 𝑋 ∼ 𝜒²(50), 𝜇 = 𝑟 = 50, 𝜎² = 2𝑟 = 100, n = 100, σ/√𝑛 = 1

P(49 < 𝑋̅ < 51) = P((49 − μ)/(σ/√n) < (𝑋̅ − μ)/(σ/√n) < (51 − μ)/(σ/√n)) = P(−1 < Z < 1) = 0.6826

3. Let 𝑋̅ be the mean of a random sample of size 128 from a Gamma distribution with 𝑟 = 2, 𝛼 = 1/4. Approximate Pr{7 < 𝑋̅ < 9}.

Solution: Here 𝑛 = 128, 𝑟 = 2, 𝛼 = 1/4, 𝜇 = 𝑟/𝛼 = 8, 𝜎² = 𝑟/𝛼² = 32

Then σ/√𝑛 = √(32/128) = 1/2

By the Central Limit Theorem, (𝑋̅ − μ)/(σ/√n) ~ 𝑁(0,1)

P(7 < 𝑋̅ < 9) = P((7 − μ)/(σ/√n) < (𝑋̅ − μ)/(σ/√n) < (9 − μ)/(σ/√n)) = P(−2 < Z < 2) = 0.9544

4. Suppose that 𝑋𝑗, 𝑗 = 1, 2, ⋯, 50 are independent random variables, each having a Poisson distribution with 𝛼 = 0.03. Let 𝑆 = 𝑋1 + 𝑋2 + ⋯ + 𝑋50. Using the central limit theorem, evaluate Pr(𝑆 ≥ 3).

Solution: 𝑆 = 𝑋1 + 𝑋2 + ⋯ + 𝑋50

𝜇 = 𝜎² = 𝛼 = 0.03 and n = 50

(𝑆 − 𝑛𝜇)/(𝜎√𝑛) = (𝑆 − 50 × 0.03)/√(0.03 × 50) = (𝑆 − 1.5)/√1.5 ~ N(0, 1)

𝑃(𝑆 ≥ 3) = 1 − 𝑃(𝑆 < 3) = 1 − 𝑃((𝑆 − 1.5)/√1.5 < (3 − 1.5)/√1.5) = 0.1093

5. A distribution with unknown mean 𝜇 has a variance 1.5. Find how large a sample
should be taken from the distribution in order that the probability will be at least 0.95
that the sample mean will be within 0.5 of the population mean.

Solution: Pr{μ − 0.5 < 𝑋̅ < 𝜇 + 0.5} = 0.95

⇒ P((μ − 0.5 − μ)/(σ/√n) < (𝑋̅ − μ)/(σ/√n) < (μ + 0.5 − μ)/(σ/√n)) = 0.95

⇒ 𝑃(−0.5√𝑛/√1.5 < 𝑍 < 0.5√𝑛/√1.5) = 0.95

⇒ 2𝜙(0.5√𝑛/√1.5) − 1 = 0.95

⇒ 0.5√𝑛/√1.5 = 1.96

⇒ 𝑛 = 23.05, so a sample of size 𝑛 = 24 ensures the probability is at least 0.95.
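The same sample-size calculation in code, assuming scipy is available:

import math
from scipy.stats import norm

z = norm.ppf(0.975)                    # 1.96
n = (z * math.sqrt(1.5) / 0.5) ** 2
print(n, math.ceil(n))                 # 23.05..., so take n = 24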
Estimation of Parameters
Unbiased estimator
Let 𝜃̂ be an estimate for the unknown parameter 𝜃 associated with the distribution of random
variable 𝑋. Then 𝜃̂ is an unbiased estimator for 𝜃 if 𝐸(𝜃̂) = 𝜃 ∀ 𝜃.

Consistent estimate
Let 𝜃̂ be an estimate of the parameter 𝜃. We say that 𝜃̂ is a consistent estimate of 𝜃 if lim(𝑛→∞) 𝑃{|𝜃̂ − θ| > 𝜖} = 0 for all 𝜖 > 0, or equivalently lim(𝑛→∞) 𝑃{|𝜃̂ − θ| ≤ 𝜖} = 1.

Note: The above definition indicates that as the sample size increases, the estimate becomes
better.

Unbiasedness and consistency of estimate can be found using the following theorem:

Theorem
Let 𝜃̂ be an estimate of the parameter 𝜃 based on a sample of size 𝑛. If 𝐸(𝜃̂) = 𝜃 and lim(𝑛→∞) 𝑉(𝜃̂) = 0, then 𝜃̂ is a consistent estimate of 𝜃.
Proof:
We shall prove this by using Chebyshev's inequality.

𝑃{|𝜃̂ − θ| > 𝜖} ≤ (1/𝜖²) 𝐸(𝜃̂ − θ)² = (1/𝜖²) 𝐸{(𝜃̂ − E(𝜃̂)) + (E(𝜃̂) − θ)}²
(add and subtract E(𝜃̂))

= (1/𝜖²) {𝐸[𝜃̂ − E(𝜃̂)]² + 2(E(𝜃̂) − θ) 𝐸[𝜃̂ − E(𝜃̂)] + (E(𝜃̂) − θ)²}

= (1/𝜖²) {𝑉(𝜃̂) + (E(𝜃̂) − θ)²}   (since 𝐸[𝜃̂ − E(𝜃̂)] = 0)

→ 0 as 𝑛 → ∞, using the given conditions 𝐸(𝜃̂) = 𝜃 and lim(𝑛→∞) 𝑉(𝜃̂) = 0.
∴ 𝜃̂ is a consistent estimate of 𝜃.
Exercise
1. Show that sample mean is an unbiased and consistent estimate of population mean.
Solution:
Let 𝑋1, 𝑋2, … 𝑋𝑛 be samples taken from the distribution of 𝑋 having mean 𝜇 and variance 𝜎².

∴ 𝑋̅ = (∑ 𝑋𝑖)/𝑛 is the sample mean.

To prove: 𝐸(𝑋̅) = 𝜇 and lim(𝑛→∞) 𝑉(𝑋̅) = 0

𝐸(𝑋̅) = 𝐸((1/𝑛) ∑ 𝑋𝑖) = (1/𝑛) ∑ 𝐸(𝑋𝑖) = 𝑛𝜇/𝑛 = 𝜇

That is, the sample mean is an unbiased estimate of the population mean.

𝑉(𝑋̅) = 𝑉((∑ 𝑋𝑖)/𝑛) = (1/𝑛²) 𝑉(∑ 𝑋𝑖) = (1/𝑛²) ∑ 𝑉(𝑋𝑖) = 𝑛𝑉(𝑋)/𝑛² = 𝜎²/𝑛

∴ lim(𝑛→∞) 𝑉(𝑋̅) = lim(𝑛→∞) 𝜎²/𝑛 = 0

That is, the sample mean is a consistent estimate of the population mean.

2. Show that sample variance 𝑆 2 is not an unbiased estimate of population variance.


Solution:
Let 𝜎² be the variance of the distribution of 𝑋, i.e., 𝜎² is the population variance, and let 𝑆² be the sample variance.
To prove: 𝐸(𝑆²) ≠ 𝜎²

By definition,
𝑆² = ∑(𝑋𝑖 − 𝑋̅)²/𝑛 = ∑(𝑋𝑖² + 𝑋̅² − 2𝑋̅𝑋𝑖)/𝑛 = {∑𝑋𝑖² + 𝑛𝑋̅² − 2𝑛𝑋̅²}/𝑛 = ∑𝑋𝑖²/𝑛 − 𝑋̅²
(since ∑𝑋̅² = 𝑛𝑋̅² and ∑𝑋𝑖 = 𝑛𝑋̅)

𝐸(𝑆²) = 𝐸{∑𝑋𝑖²/𝑛 − 𝑋̅²} = (1/𝑛) 𝐸(∑𝑋𝑖²) − 𝐸(𝑋̅²) = 𝐸(𝑋²) − 𝐸(𝑋̅²)
(since ∑𝐸(𝑋𝑖²) = 𝑛𝐸(𝑋²))

Using 𝐸(𝑋²) = 𝑉(𝑋) + 𝜇² and 𝐸(𝑋̅²) = 𝑉(𝑋̅) + 𝜇², with 𝑉(𝑋̅) = 𝜎²/𝑛:

𝐸(𝑆²) = (𝜎² + 𝜇²) − (𝜎²/𝑛 + 𝜇²) = 𝜎² − 𝜎²/𝑛 = 𝜎²(𝑛 − 1)/𝑛

∴ 𝐸(𝑆²) ≠ 𝜎², i.e., the sample variance 𝑆² is not an unbiased estimate of the population variance. (Note that 𝑛𝑆²/(𝑛 − 1) is unbiased.)

3. Show that if 𝑋̅ is the mean of a random sample of size 𝑛 from a distribution with pdf

𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^(−𝑥/𝜃), 0 < 𝑥 < ∞, 0 < 𝜃 < ∞ (and 0 elsewhere),

then 𝑋̅ is an unbiased estimate of 𝜃 and has variance 𝜃²/𝑛.

Solution:

𝐸(𝑋) = 𝜇 = ∫_{−∞}^∞ 𝑥 𝑓(𝑥, 𝜃) 𝑑𝑥 = ∫₀^∞ 𝑥 (1/𝜃)𝑒^(−𝑥/𝜃) 𝑑𝑥 = 𝜃
(integrating by parts)

𝐸(𝑋²) = ∫₀^∞ 𝑥² (1/𝜃)𝑒^(−𝑥/𝜃) 𝑑𝑥 = 2𝜃²

∴ 𝑉(𝑋) = 𝜎² = 2𝜃² − 𝜃² = 𝜃²

The random variable 𝑋 has mean 𝜃 and variance 𝜃².

Hence, 𝐸(𝑋̅) = 𝜃, 𝑉(𝑋̅) = 𝜃²/𝑛 and lim(𝑛→∞) 𝑉(𝑋̅) = lim(𝑛→∞) 𝜃²/𝑛 = 0.

4. Let 𝑋1, 𝑋2, … 𝑋𝑛 be samples taken from a normal distribution with 𝜇 = 0 and variance 𝜎² = 𝜃, 0 < 𝜃 < ∞. Show that 𝑌 = (∑𝑋𝑖²)/𝑛 is an unbiased and consistent estimate of 𝜃.

Solution:
𝑋~𝑁(0, 𝜃) ⟹ 𝑋̅ ~ 𝑁(0, 𝜃/𝑛), i.e., 𝐸(𝑋̅) = 0 and 𝑉(𝑋̅) = 𝜃/𝑛

Let 𝑌 = (∑𝑋𝑖²)/𝑛

To prove: 𝐸(𝑌) = 𝜃 and 𝑉(𝑌) → 0 as 𝑛 → ∞

𝐸(𝑌) = 𝐸((∑𝑋𝑖²)/𝑛) = (1/𝑛) 𝐸(∑𝑋𝑖²) = 𝑛𝐸(𝑋²)/𝑛 = 𝐸(𝑋²)

𝐸(𝑌) = 𝐸(𝑋²) = 𝑉(𝑋) + [𝐸(𝑋)]² = 𝜃, which implies 𝑌 is an unbiased estimate of 𝜃.

We know that if 𝑋 ∼ 𝑁(𝜇, 𝜎²), then 𝑍 = (𝑋 − 𝜇)/𝜎 ∼ 𝑁(0, 1) and 𝑍² = ((𝑋 − 0)/√𝜃)² = 𝑋²/𝜃 ∼ 𝜒²(1)

∴ 𝐸(𝑋²/𝜃) = 1 and 𝑉(𝑋²/𝜃) = 2

This implies 𝐸(𝑋²) = 𝜃 and 𝑉(𝑋²) = 2𝜃²

Consider 𝑉(𝑌) = 𝑉((∑𝑋𝑖²)/𝑛) = (1/𝑛²) 𝑉(∑𝑋𝑖²) = 𝑛𝑉(𝑋²)/𝑛² = 𝑉(𝑋²)/𝑛 = 2𝜃²/𝑛

∴ lim(𝑛→∞) 𝑉(𝑌) = lim(𝑛→∞) 2𝜃²/𝑛 = 0, which implies 𝑌 is a consistent estimate of 𝜃.

5. Let 𝑌1 and 𝑌2 be two independent unbiased statistics for 𝜃. The variance of 𝑌1 is twice the
variance of 𝑌2 . Find the constants 𝑘1 and 𝑘2 such that 𝑌 = 𝑘1 𝑌1 + 𝑘2 𝑌2 is an unbiased
statistic for 𝜃 with smallest possible variance for such a linear combination.
Solution:
Given that 𝐸(𝑌1 ) = 𝐸(𝑌2 ) = 𝜃, 𝑉(𝑌1 ) = 2𝑉( 𝑌2 ) = 2𝜎 2
To find: 𝑘1 𝑎𝑛𝑑 𝑘2 such that 𝐸(𝑌) = 𝐸(𝑘1 𝑌1 + 𝑘2 𝑌2 ) = 𝜃
𝑘1 𝐸(𝑌1 ) + 𝑘2 𝐸(𝑌2 ) = 𝜃
𝑘1 𝜃 + 𝑘2 𝜃 = 𝜃
𝑘1 + 𝑘2 = 1 ⇒ 𝑘2 = 1 − 𝑘1
𝑉(𝑌) = 𝑉(𝑘1𝑌1 + 𝑘2𝑌2) = 𝑘1²𝑉(𝑌1) + 𝑘2²𝑉(𝑌2) = 𝜎²(2𝑘1² + 𝑘2²) = 𝜎²(2𝑘1² + (1 − 𝑘1)²)

𝑉(𝑌) has a minimum if 𝑑𝑉(𝑌)/𝑑𝑘1 = 0

𝑑𝑉(𝑌)/𝑑𝑘1 = 𝑑[𝜎²(2𝑘1² + (1 − 𝑘1)²)]/𝑑𝑘1 = 0

⇒ 4𝑘1 − 2(1 − 𝑘1) = 0 ⇒ 𝑘1 = 1/3 and 𝑘2 = 2/3
6. Let 𝑋1, 𝑋2, …, 𝑋25; 𝑌1, 𝑌2, …, 𝑌25 be two independent random samples from the distributions 𝑁(3, 16), 𝑁(4, 9) respectively. Evaluate P(𝑋̅/𝑌̅ > 1).

Solution:
Given that 𝑋 ~ 𝑁(3, 16), 𝑌 ~ 𝑁(4, 9).

Then, 𝑋̅ ~ 𝑁(3, 16/25), 𝑌̅ ~ 𝑁(4, 9/25)

Now 𝑋̅/𝑌̅ > 1 ⇒ 𝑋̅ > 𝑌̅ ⇒ 𝑋̅ − 𝑌̅ > 0.

Since 𝑋̅ − 𝑌̅ ~ 𝑁[3 × 1 + 4 × (−1), 1² × 16/25 + (−1)² × 9/25] = 𝑁(−1, 1),

we have 𝑍 = 𝑋̅ − 𝑌̅ + 1 ~ 𝑁(0, 1)

Consider 𝑃(𝑋̅/𝑌̅ > 1) = 𝑃(𝑋̅ − 𝑌̅ > 0) = 𝑃(𝑋̅ − 𝑌̅ + 1 > 1) = 𝑃(𝑍 > 1) = 1 − 𝜙(1) = 1 − 0.841 = 0.159

Interval estimation
Let 𝑋 be a random variable with some probability distribution, depending on an unknown
parameter 𝜃. An estimate of 𝜃 given by two magnitudes within which 𝜃 can lie is called an
interval estimate of the parameter 𝜃. The process of obtaining an interval estimate for 𝜃 is called
interval estimation.
Confidence interval
Let 𝜃 be an unknown parameter to be determined by a random sample 𝑋1, 𝑋2, 𝑋3, … 𝑋𝑛 of size 𝑛. A confidence interval for the parameter 𝜃 is a random interval containing the parameter with high probability, say 1 − 𝛼, where 1 − 𝛼 is called the confidence coefficient.
NOTE:
Let 𝑋1, 𝑋2, …, 𝑋𝑛 be a random sample of size 𝑛 from a normal distribution 𝑁(𝜇, 𝜎²).
1. 𝑋̅ = (∑𝑋𝑖)/𝑛 ~ 𝑁(𝜇, 𝜎²/𝑛) (for 𝜇, when 𝜎² is known)
2. 𝑇 = (𝑋̅ − 𝜇)/(𝑆/√(𝑛 − 1)) ~ 𝑇(𝑛 − 1) (for 𝜇, when 𝜎² is unknown)
3. 𝑌 = ∑(𝑋𝑖 − 𝜇)²/𝜎² ~ 𝜒²(𝑛) (for 𝜎, when 𝜇 is known)
4. 𝑌 = 𝑛𝑆²/𝜎² ~ 𝜒²(𝑛 − 1) (for 𝜎, when 𝜇 is unknown)

Confidence Interval for Mean

𝝈² is known:
Consider 𝑋̅ = (∑𝑋𝑖)/𝑛 ~ 𝑁(𝜇, 𝜎²/𝑛)
∴ 𝑍 = (𝑋̅ − 𝜇)/(𝜎/√𝑛) ~ 𝑁(0,1)
To find 𝑎 such that 𝑃(−𝑎 < 𝑍 < 𝑎) = 1 − 𝛼:
𝑃(−𝑎 < (𝑋̅ − 𝜇)/(𝜎/√𝑛) < 𝑎) = 1 − 𝛼
𝑃(𝑋̅ − 𝑎𝜎/√𝑛 < 𝜇 < 𝑋̅ + 𝑎𝜎/√𝑛) = 1 − 𝛼
𝜇 ∈ (𝑋̅ − 𝑎𝜎/√𝑛, 𝑋̅ + 𝑎𝜎/√𝑛)

𝝈² is unknown:
Consider 𝑇 = (𝑋̅ − 𝜇)/(𝑆/√(𝑛 − 1)) ~ 𝑇(𝑛 − 1), i.e., the t-distribution with 𝑛 − 1 degrees of freedom.
To find 𝑎 such that 𝑃(−𝑎 < 𝑇 < 𝑎) = 1 − 𝛼:
𝑃(−𝑎 < (𝑋̅ − 𝜇)/(𝑆/√(𝑛 − 1)) < 𝑎) = 1 − 𝛼
𝑃(𝑋̅ − 𝑎𝑆/√(𝑛 − 1) < 𝜇 < 𝑋̅ + 𝑎𝑆/√(𝑛 − 1)) = 1 − 𝛼
𝜇 ∈ (𝑋̅ − 𝑎𝑆/√(𝑛 − 1), 𝑋̅ + 𝑎𝑆/√(𝑛 − 1))
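Both intervals are easy to compute numerically; a sketch assuming scipy is available, using the data of the two exercises below as a check:

import math
from scipy.stats import norm, t

# sigma^2 known: n = 20, sigma^2 = 80, xbar = 81.2, 95% interval
n, var, xbar = 20, 80.0, 81.2
half = norm.ppf(0.975) * math.sqrt(var / n)
print(xbar - half, xbar + half)              # ~ (77.28, 85.12)

# sigma^2 unknown: n = 17, S^2 = 5.76, xbar = 4.7, 90% interval
n, S2, xbar = 17, 5.76, 4.7
a = t.ppf(0.95, df=n - 1)                    # ~1.746 (table value 1.75)
half = a * math.sqrt(S2) / math.sqrt(n - 1)
print(xbar - half, xbar + half)              # ~ (3.65, 5.75)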
Exercise
1. Let the observed value of 𝑋̅, the mean of a random sample of size 20 from a normal distribution with mean 𝜇 and 𝜎² = 80, be 81.2. Obtain a 95% confidence interval for the mean 𝜇.

Solution:
Let 𝑋 ~ 𝑁(𝜇, 80) ⟹ 𝑋̅ ~ 𝑁(𝜇, 80/20) = 𝑁(𝜇, 4)

∴ 𝑍 = (𝑋̅ − 𝜇)/(𝜎/√𝑛) = (𝑋̅ − 𝜇)/2 ~ 𝑁(0,1)

𝑃(−𝑎 < 𝑍 < 𝑎) = 0.95

2𝜙(𝑎) − 1 = 0.95 ⇒ 𝜙(𝑎) = 1.95/2 = 0.975 ⇒ 𝑎 = 1.96

⇒ 𝜇 ∈ (𝑋̅ − 𝑎𝜎/√𝑛, 𝑋̅ + 𝑎𝜎/√𝑛) = (81.2 − 1.96 × 2, 81.2 + 1.96 × 2)

⇒ 𝜇 ∈ (77.28, 85.12)

2. Let a random sample of size 17 from 𝑁(𝜇, 𝜎 2 ) yield 𝑋̅ = 4.7 and 𝑆 2 = 5.76. Determine
90% confidence interval for 𝜇.
Solution:
Given: 𝑛 = 17, 𝑋̅ = 4.7 and 𝑆² = 5.76

Let 𝑇 = (𝑋̅ − 𝜇)/(𝑆/√(𝑛 − 1)) = 4(𝑋̅ − 𝜇)/√5.76 ~ 𝑇(17 − 1) = 𝑇(16)

To find 𝑎 such that 𝑃(−𝑎 < 𝑇 < 𝑎) = 0.90:

from the t-table with 16 degrees of freedom, 𝑎 = 1.75

⟹ 𝜇 ∈ (4.7 − 1.75 × √5.76/√16, 4.7 + 1.75 × √5.76/√16) ⇒ 𝜇 ∈ (3.65, 5.75)
Confidence interval for variance

𝝁 is known:
Let 𝑌 = ∑(𝑋𝑖 − 𝜇)²/𝜎² ~ 𝜒²(𝑛)
To find 𝑎 and 𝑏 such that 𝑃(𝑎 < 𝑌 < 𝑏) = 1 − 𝛼,
i.e., 𝑃(𝑌 < 𝑎) = 𝛼/2, 𝑃(𝑌 > 𝑏) = 𝛼/2
∴ 𝑃(𝑎 < ∑(𝑋𝑖 − 𝜇)²/𝜎² < 𝑏) = 1 − 𝛼
𝑃(1/𝑏 < 𝜎²/∑(𝑋𝑖 − 𝜇)² < 1/𝑎) = 1 − 𝛼
𝑃(∑(𝑋𝑖 − 𝜇)²/𝑏 < 𝜎² < ∑(𝑋𝑖 − 𝜇)²/𝑎) = 1 − 𝛼
⇒ 𝜎² ∈ (∑(𝑋𝑖 − 𝜇)²/𝑏, ∑(𝑋𝑖 − 𝜇)²/𝑎)

𝝁 is unknown:
Let 𝑌 = 𝑛𝑆²/𝜎² ~ 𝜒²(𝑛 − 1), where 𝑆² = ∑(𝑋𝑖 − 𝑋̅)²/𝑛
To find 𝑎 and 𝑏 such that 𝑃(𝑎 < 𝑌 < 𝑏) = 1 − 𝛼,
i.e., 𝑃(𝑌 < 𝑎) = 𝛼/2, 𝑃(𝑌 > 𝑏) = 𝛼/2
∴ 𝑃(𝑎 < 𝑛𝑆²/𝜎² < 𝑏) = 1 − 𝛼
𝑃(1/𝑏 < 𝜎²/(𝑛𝑆²) < 1/𝑎) = 1 − 𝛼
𝑃(𝑛𝑆²/𝑏 < 𝜎² < 𝑛𝑆²/𝑎) = 1 − 𝛼
⇒ 𝜎² ∈ (𝑛𝑆²/𝑏, 𝑛𝑆²/𝑎)

Exercise
1. If 8.6, 7.9, 8.3, 6.4, 8.4, 9.8, 7.2, 7.8, 7.5 are the observed values of a random sample of
size 9 from a distribution 𝑁(8, 𝜎 2 ), construct 90% confidence interval for 𝜎 2 .
Solution:
Given: 𝜇 = 8, 𝑛 = 9
𝑌 = ∑(𝑋𝑖 − 𝜇)²/𝜎²
= (1/𝜎²){0.6² + 0.1² + 0.3² + 1.6² + 0.4² + 1.8² + 0.8² + 0.2² + 0.5²}
= 7.35/𝜎² ~ 𝜒²(9)

To find 𝑎 and 𝑏 such that 𝑃(𝑎 < 𝑌 < 𝑏) = 1 − 𝛼 = 0.90 ⇒ 𝛼 = 0.10

𝑃(𝑌 < 𝑎) = 𝛼/2 = 0.05 ⇒ 𝑎 = 3.33,
𝑃(𝑌 > 𝑏) = 𝛼/2 = 0.05 ⇒ 𝑃(𝑌 < 𝑏) = 0.95 ⇒ 𝑏 = 16.9,
using the chi-square table for 9 degrees of freedom.

∴ 𝜎² ∈ (∑(𝑋𝑖 − 𝜇)²/𝑏, ∑(𝑋𝑖 − 𝜇)²/𝑎) = (7.35/16.9, 7.35/3.33) = (0.43, 2.21)
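The chi-square quantiles and the interval can be computed directly; a sketch assuming scipy is available:

from scipy.stats import chi2

ss, df = 7.35, 9                                  # sum of (x_i - mu)^2 and degrees of freedom
a, b = chi2.ppf(0.05, df), chi2.ppf(0.95, df)     # ~3.33 and ~16.9
print(ss / b, ss / a)                             # ~ (0.43, 2.21)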

2. A random sample of size 15 from a normal distribution 𝑁(𝜇, 𝜎 2 ) yields 𝑋̅ = 3.2 , 𝑆 2 =


4.24 . Determine a 90% confidence interval for 𝜎 2 .
Solution:
Given: 1 − 𝛼 = 0.9 ⇒ 𝛼 = 0.1

𝑌 = 𝑛𝑆²/𝜎² ~ 𝜒²(15 − 1) = 𝜒²(14)

∴ 𝑃(𝑌 < 𝑎) = 𝛼/2 = 0.05 ⇒ 𝑎 = 6.57
𝑃(𝑌 > 𝑏) = 𝛼/2 = 0.05 ⇒ 𝑃(𝑌 < 𝑏) = 0.95 ⇒ 𝑏 = 23.7
using the chi-square table for 14 degrees of freedom.

∴ 𝜎² ∈ (15 × 4.24/23.7, 15 × 4.24/6.57) = (2.68, 9.68)
23.7 6.57

3. A random sample of size 9 from a normal distribution 𝑁(𝜇, 𝜎 2 ) yields 𝑆 2 = 7.63.


Determine a 95% confidence interval for 𝜎 2 .
(Ans: 𝜎 2 ∈ (3.924, 31.5))
4. A random sample of size 15 from a normal distribution 𝑁(𝜇, 𝜎 2 ) yields 𝑋̅ = 3.2 , 𝑆 2 =
4.24 . Determine a 95% confidence interval for 𝜇.
(Ans: 𝜇 ∈ (2.02, 4.38))
5. A random sample of size 25 from a normal distribution 𝑁(𝜇, 4) yields 𝑋̅ = 78.3 , 𝑆 2 =
4.24 . Determine a 99% confidence interval for 𝜇.
(Ans: 𝜇 ∈ (77.268, 79.332))
Maximum Likelihood Estimate for 𝜽 (MLE)
Let 𝑋1 , 𝑋2 , … , 𝑋𝑛 be a random sample from the random variable 𝑋 and let 𝑥1 , 𝑥2 , … , 𝑥𝑛 be
sample values. We denote likelihood function 𝐿 as the following function
𝐿(𝑋1 , 𝑋2 , … , 𝑋𝑛 ; 𝜃) = 𝑓(𝑥1 , 𝜃)𝑓(𝑥2 , 𝜃) … 𝑓(𝑥𝑛 , 𝜃)

Maximum Likelihood Estimate (MLE) of 𝜃, say 𝜃̂ based on a random sample 𝑋1 , 𝑋2 , … 𝑋𝑛 is that


value of 𝜃 that maximizes 𝐿(𝑋1 , 𝑋2 , … , 𝑋𝑛 ; 𝜃).

Exercise
1. Let 𝑋1 , 𝑋2 , … 𝑋𝑛 denote a random sample of size 𝑛 from a distribution having pdf
𝑓(𝑥, 𝜃) = 𝜃^𝑥 (1 − 𝜃)^(1−𝑥) for 𝑥 = 0, 1 (with 0 ≤ 𝜃 ≤ 1), and 0 elsewhere.
Find the MLE for 𝜃.
Solution:
𝐿(𝑋1, 𝑋2, …, 𝑋𝑛; 𝜃) = 𝑓(𝑥1, 𝜃)𝑓(𝑥2, 𝜃) … 𝑓(𝑥𝑛, 𝜃)
= 𝜃^(𝑥1)(1 − 𝜃)^(1−𝑥1) ⋅ 𝜃^(𝑥2)(1 − 𝜃)^(1−𝑥2) ⋯ 𝜃^(𝑥𝑛)(1 − 𝜃)^(1−𝑥𝑛)
= 𝜃^(∑𝑥𝑖)(1 − 𝜃)^(𝑛 − ∑𝑥𝑖)

Taking log on both sides and then partially differentiating with respect to 𝜃,

∂(log 𝐿)/∂𝜃 = (∑𝑥𝑖)(1/𝜃) + (𝑛 − ∑𝑥𝑖)(−1/(1 − 𝜃))

For maximum, ∂(log 𝐿)/∂𝜃 = 0

On simplifying, we get the MLE of 𝜃: 𝜃̂ = 𝑋̅

2. Let 𝑋1, 𝑋2, … 𝑋𝑛 denote a random sample of size 𝑛 from a distribution having pmf

𝑓(𝑥, 𝜃) = 𝜃^𝑥 𝑒^(−𝜃)/𝑥! for 𝑥 = 0, 1, 2, … (with 𝜃 > 0), and 0 elsewhere.

Find the MLE for 𝜃.

Solution:
𝐿(𝑋1, 𝑋2, …, 𝑋𝑛; 𝜃) = 𝑓(𝑥1, 𝜃)𝑓(𝑥2, 𝜃) … 𝑓(𝑥𝑛, 𝜃)
= (𝜃^(𝑥1)𝑒^(−𝜃)/𝑥1!) × (𝜃^(𝑥2)𝑒^(−𝜃)/𝑥2!) × ⋯ × (𝜃^(𝑥𝑛)𝑒^(−𝜃)/𝑥𝑛!)
= 𝜃^(∑𝑥𝑖) 𝑒^(−𝑛𝜃)/(𝑥1! 𝑥2! ⋯ 𝑥𝑛!)

Taking log on both sides and setting ∂(log 𝐿)/∂𝜃 = 0 gives the MLE 𝜃̂ = 𝑋̅ (the factorials do not involve 𝜃, so they do not affect the maximization).

3. Find the MLE of the normal distribution 𝑁(𝜃1, 𝜃2) where −∞ < 𝜃1 < ∞, 0 < 𝜃2 < ∞.

Solution:
If 𝑋~𝑁(𝜇, 𝜎²), then 𝑓(𝑥, 𝜇, 𝜎²) = (1/(√(2𝜋)𝜎)) 𝑒^(−(𝑥−𝜇)²/(2𝜎²))

Therefore, 𝑓(𝑥, 𝜃1, 𝜃2) = (1/√(2𝜋𝜃2)) 𝑒^(−(𝑥−𝜃1)²/(2𝜃2))

𝐿(𝑋1, 𝑋2, …, 𝑋𝑛; 𝜃1, 𝜃2) = 𝑓(𝑥1, 𝜃1, 𝜃2)𝑓(𝑥2, 𝜃1, 𝜃2) … 𝑓(𝑥𝑛, 𝜃1, 𝜃2)
= (1/(2𝜋𝜃2)^(𝑛/2)) 𝑒^(−∑(𝑥𝑖 − 𝜃1)²/(2𝜃2))

Taking logarithm on both sides,

log 𝐿 = (𝑛/2) log(1/(2𝜋𝜃2)) − ∑(𝑥𝑖 − 𝜃1)²/(2𝜃2)

Differentiating partially with respect to 𝜃1, we get

∂(log 𝐿)/∂𝜃1 = −(1/(2𝜃2)) ∑(2𝜃1 − 2𝑥𝑖) = (1/𝜃2) ∑(𝑥𝑖 − 𝜃1)

For maximum, ∂(log 𝐿)/∂𝜃1 = 0. On simplifying, we get the MLE of 𝜃1: 𝜃̂1 = 𝑋̅

Differentiating partially with respect to 𝜃2, we get

∂(log 𝐿)/∂𝜃2 = −𝑛/(2𝜃2) + (1/(2𝜃2²)) ∑(𝑥𝑖 − 𝜃1)²

For maximum, ∂(log 𝐿)/∂𝜃2 = 0. On simplifying, we get the MLE of 𝜃2: 𝜃̂2 = ∑(𝑥𝑖 − 𝑋̅)²/𝑛 = 𝑠²
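The closed-form MLEs 𝜃̂1 = 𝑋̅ and 𝜃̂2 = 𝑠² can be sanity-checked on simulated data; a minimal sketch assuming numpy is available (the true parameter values are illustrative):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=3.0, size=10_000)   # theta1 = 2, theta2 = 9

theta1_hat = x.mean()                             # MLE of the mean
theta2_hat = ((x - theta1_hat) ** 2).mean()       # MLE of the variance (divide by n)
print(theta1_hat, theta2_hat)                     # ~2, ~9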
Testing of Hypothesis
Given a coin, we suspect that it is biased. To express this mathematically, we define a random variable 𝑋 for a single toss, with 𝑃(𝑋 = 1) = 𝜃 and 𝑃(𝑋 = 0) = 1 − 𝜃, i.e., the probability of getting a head is 𝜃 and of a tail is 1 − 𝜃; the coin is tossed 𝑛 times. We wish to test the hypothesis 𝐻0: 𝜃 = 0.5 against the alternative hypothesis 𝐻1: 𝜃 > 0.5. The test we have decided to use is: if 𝑥̅ > 0.6, reject 𝐻0 and accept 𝐻1; if 𝑥̅ ≤ 0.6, reject 𝐻1 and accept 𝐻0 (i.e., if heads come up at most 60% of the time, we treat the coin as unbiased, and if they come up more than 60% of the time, as biased). Here, 𝑥̅ is the sample mean, the observed proportion of heads, obtained after conducting the experiment.
Null Hypothesis (𝑯𝟎 )
It is a definite statement about the population parameter, usually a hypothesis of no
difference
Alternative hypothesis (𝑯𝟏 )
Any hypothesis that is complementary to the null hypothesis is called an alternative hypothesis.
Errors in hypothesis testing:

𝐻0 is true 𝐻0 is false

Accept 𝐻0 No error Type II error

Reject 𝐻0 Type I error No error

Statistical test
A statistical test of a hypothesis is a rule, stated in terms of a sample, that prescribes
whether a hypothesis is to be accepted or rejected, based on the value of the sample point
obtained
Critical region
The critical region of a test is the region of the sample space such that if the sample point
lies in it, the null hypothesis 𝐻0 is rejected
Power function (𝑲)
The power function 𝐾 of a test is the probability that the sample point falls in the critical
region of the test
𝐾 = 𝑃(Rejecting 𝐻0 )
The value of the function 𝐾 at a particular point is called the power of the test at that point.
Level of significance (𝜶)
The level of significance of a test is the size of the critical region of the test
Level of significance = 𝑃(Type I error) = 𝛼

Exercise
1. 𝑋 has a pdf of the form 𝑓(𝑥, 𝜃) = 𝜃𝑥^(𝜃−1), 0 < 𝑥 < 1, 𝜃 > 0. To test 𝐻0: 𝜃 = 1 against 𝐻1: 𝜃 = 2, it is decided to use a random sample (𝑋1, 𝑋2) of size 2 with the critical region 𝐶 = {(𝑥1, 𝑥2) | 𝑥1𝑥2 ≥ 3/4}. Compute the power function 𝐾(𝜃) and the significance level 𝛼 of the test.
Solution:
𝐾(𝜃) = 𝑃(critical region 𝐶) = 𝑃(𝑋1𝑋2 ≥ 3/4)

Since (𝑋1, 𝑋2) is a sample, 𝑋1 and 𝑋2 are independent, so

𝑔(𝑥1, 𝑥2, 𝜃) = 𝑓(𝑥1, 𝜃) × 𝑓(𝑥2, 𝜃) = 𝜃²(𝑥1𝑥2)^(𝜃−1), 0 < 𝑥1, 𝑥2 < 1, 𝜃 > 0

𝐾(𝜃) = 𝑃(𝑋1𝑋2 ≥ 3/4) = ∫_{𝑥1=3/4}^{1} ∫_{𝑥2=3/(4𝑥1)}^{1} 𝜃²(𝑥1𝑥2)^(𝜃−1) 𝑑𝑥2 𝑑𝑥1

= 𝜃 ∫_{𝑥1=3/4}^{1} (𝑥1^(𝜃−1) − (1/𝑥1)(3/4)^𝜃) 𝑑𝑥1

= 1 − (3/4)^𝜃 [1 − 𝜃 log(3/4)]

Level of significance, 𝛼 = 𝐾(1) (since 𝐻0: 𝜃 = 1) = 0.0342
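The power function can also be estimated by Monte Carlo, using the inverse-transform fact that if 𝑈~𝑈(0,1), then 𝑈^(1/𝜃) has pdf 𝜃𝑥^(𝜃−1) on (0, 1). A sketch assuming numpy is available:

import numpy as np

rng = np.random.default_rng(2)

def power(theta, reps=1_000_000):
    # U**(1/theta) has pdf theta * x**(theta - 1) on (0, 1)
    x1 = rng.random(reps) ** (1 / theta)
    x2 = rng.random(reps) ** (1 / theta)
    return np.mean(x1 * x2 >= 0.75)

print(power(1))   # ~0.0342, the significance level alpha
print(power(2))   # the power at theta = 2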
2. Let (𝑋1, 𝑋2) be a random sample of size 2 from the distribution having pdf 𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^(−𝑥/𝜃), 𝑥 > 0, 𝜃 > 0. We reject 𝐻0: 𝜃 = 2 and accept 𝐻1: 𝜃 = 1 if the observed value (𝑥1, 𝑥2) of the sample is such that

𝑓(𝑥1, 2)𝑓(𝑥2, 2)/(𝑓(𝑥1, 1)𝑓(𝑥2, 1)) ≤ 1/2

Find the significance level of the test and the power of the test when 𝐻1 is true.
Solution:
The critical region is given by

𝑓(𝑥1, 2)𝑓(𝑥2, 2)/(𝑓(𝑥1, 1)𝑓(𝑥2, 1)) ≤ 1/2

((1/2)𝑒^(−𝑥1/2) × (1/2)𝑒^(−𝑥2/2))/(𝑒^(−𝑥1) × 𝑒^(−𝑥2)) ≤ 1/2

(1/4)𝑒^((𝑥1+𝑥2)/2) ≤ 1/2

𝑒^((𝑥1+𝑥2)/2) ≤ 2

(𝑥1 + 𝑥2)/2 ≤ log 2

𝑥1 + 𝑥2 ≤ log 4

Since (𝑋1, 𝑋2) is a sample, 𝑋1 and 𝑋2 are independent, so

𝑔(𝑥1, 𝑥2, 𝜃) = 𝑓(𝑥1, 𝜃) × 𝑓(𝑥2, 𝜃) = (1/𝜃²)𝑒^(−(𝑥1+𝑥2)/𝜃), 𝑥1, 𝑥2 > 0, 𝜃 > 0

𝐾(𝜃) = 𝑃(critical region) = 𝑃(𝑋1 + 𝑋2 ≤ log 4)

= ∫_{𝑥1=0}^{log 4} ∫_{𝑥2=0}^{log 4 − 𝑥1} (1/𝜃²)𝑒^(−(𝑥1+𝑥2)/𝜃) 𝑑𝑥2 𝑑𝑥1

= −(1/𝜃) ∫_{𝑥1=0}^{log 4} (𝑒^(−(log 4)/𝜃) − 𝑒^(−𝑥1/𝜃)) 𝑑𝑥1

= 1 − 𝑒^(−(log 4)/𝜃) − ((log 4)/𝜃)𝑒^(−(log 4)/𝜃)

Significance level = 𝛼 = 𝐾(2) (since 𝐻0: 𝜃 = 2) = 0.1534

Power of the test when 𝐻1 is true (i.e., 𝜃 = 1) is 𝐾(1) = 0.4034
3. The life length of a tyre in miles, 𝑋 is normally distributed with mean 𝜃 and standard
deviation 5000. The past experience indicates that 𝜃 = 30,000. The manufacturer
claims that the tyres made by a new procedure have 𝜃 > 30,000 and it is very likely
that 𝜃 = 35,000. Check the claim by testing 𝐻0 : 𝜃 ≤ 30,000 against 𝐻1 : 𝜃 > 30,000.
Observe 𝑛 independent values of 𝑋, say 𝑥1 , 𝑥2 , … , 𝑥𝑛 and reject 𝐻0 if and only if 𝑋̅ >
𝑐. Determine 𝑛 and 𝑐 so that the power function 𝐾(𝜃) has values 𝐾(30,000) = 0.01
and 𝐾(35,000) = 0.98.
Solution:
𝑲(𝟑𝟎,𝟎𝟎𝟎) = 𝟎.𝟎𝟏:
𝑃(𝑋̅ > 𝑐 | 𝜃 = 30000) = 0.01
𝑃[𝑍 > (𝑐 − 30000)/(5000/√𝑛)] = 0.01
𝑃[𝑍 ≤ (𝑐 − 30000)/(5000/√𝑛)] = 0.99
(𝑐 − 30000)/(5000/√𝑛) = 2.33

𝑲(𝟑𝟓,𝟎𝟎𝟎) = 𝟎.𝟗𝟖:
𝑃(𝑋̅ > 𝑐 | 𝜃 = 35000) = 0.98
𝑃[𝑍 > (𝑐 − 35000)/(5000/√𝑛)] = 0.98
𝑃[𝑍 ≤ (𝑐 − 35000)/(5000/√𝑛)] = 0.02
(𝑐 − 35000)/(5000/√𝑛) = −2.06

By solving, we get 𝑐 = 32,654 and 𝑛 = 19.

4. Let 𝑋 be the random variable having the pdf

𝑓(𝑥, 𝜃) = (1/𝜃)𝑒^(−𝑥/𝜃), 0 < 𝑥 < ∞, and 0 elsewhere.
Let 𝐻0 : 𝜃 = 2, 𝐻1 : 𝜃 = 4. Use random samples (𝑋1 , 𝑋2 ) of size 2 and define critical
region to be 𝐶 = {(𝑥1 , 𝑥2 )|9.5 ≤ 𝑋1 + 𝑋2 ≤ ∞}. Find the significance level of the test.
Solution:
𝐾(𝜃) = 𝑃((𝑥1, 𝑥2) ∈ 𝐶) = 𝑃(9.5 ≤ 𝑋1 + 𝑋2 < ∞) = 1 − 𝑃(0 ≤ 𝑋1 + 𝑋2 ≤ 9.5)

𝐾(𝜃) = 1 − (1/𝜃²) ∫₀^{9.5} ∫₀^{9.5−𝑥2} 𝑒^(−(𝑥1+𝑥2)/𝜃) 𝑑𝑥1 𝑑𝑥2 = ((9.5 + 𝜃)/𝜃) 𝑒^(−9.5/𝜃)

𝐾(2) = 𝛼 = 0.05
5. Let 𝑋 have a binomial distribution with parameters 𝑛 = 10 and 𝑝 ∈ {1/4, 1/2}. 𝐻0: 𝑝 = 1/2 is rejected and 𝐻1: 𝑝 = 1/4 is accepted if the observed value 𝑥1 of a random sample of size 1 is less than or equal to 3. Find the power function of the test.

Solution:
𝐶 = {𝑥1 | 𝑥1 ≤ 3}
𝑃(𝑋1 = 0) = 10𝐶0 𝑝⁰(1 − 𝑝)¹⁰, 𝑃(𝑋1 = 1) = 10𝐶1 𝑝¹(1 − 𝑝)⁹, and so on.

𝐾(𝑝) = 𝑃(𝐶) = 𝑃{𝑥1 | 𝑥1 ≤ 3}
= 𝑃(𝑋1 = 0) + 𝑃(𝑋1 = 1) + 𝑃(𝑋1 = 2) + 𝑃(𝑋1 = 3)
= 10𝐶0 𝑝⁰(1 − 𝑝)¹⁰ + 10𝐶1 𝑝¹(1 − 𝑝)⁹ + 10𝐶2 𝑝²(1 − 𝑝)⁸ + 10𝐶3 𝑝³(1 − 𝑝)⁷

𝐾(1/2) = 𝛼 = (1 + 10 + 45 + 120)/2¹⁰ = 176/1024 = 11/64

6. Let 𝑋 have a Poisson distribution with mean 𝜃. Test the simple hypothesis 𝐻0: 𝜃 = 0.5 against the composite hypothesis 𝐻1: 𝜃 < 0.5 by using a sample (𝑋1, 𝑋2, …, 𝑋12) of size 12. Reject 𝐻0 if and only if the observed value of 𝑌 = 𝑋1 + 𝑋2 + ⋯ + 𝑋12 is ≤ 2. Find the powers 𝐾(1/2), 𝐾(1/3), 𝐾(1/4), 𝐾(1/6) and 𝐾(1/12). What is the significance level of the test?
Solution:
Since 𝑋~𝑃(𝜃), 𝑌 = 𝑋1 + 𝑋2 + ⋯ + 𝑋12 ~ 𝑃(12𝜃)

i.e. 𝑃(𝑌 = 𝑦) = 𝑒^(−12𝜃)(12𝜃)^𝑦/𝑦!, 𝑦 = 0, 1, 2, …

𝐾(𝜃) = 𝑃(𝑌 ≤ 2) = ∑_{𝑦=0}^{2} 𝑒^(−12𝜃)(12𝜃)^𝑦/𝑦! = 𝑒^(−12𝜃)(1 + 12𝜃 + 72𝜃²)

𝜃:     1/2      1/3      1/4      1/6      1/12
𝐾(𝜃):  0.0619   0.2382   0.4232   0.6765   0.9198

Significance level is 𝛼 = 𝐾(0.5) (∵ 𝐻0: 𝜃 = 0.5) = 0.0619
7. Let 𝑌 have a binomial distribution with parameters 𝑛 and 𝑝. We reject 𝐻0: 𝑝 = 1/2 and accept 𝐻1: 𝑝 > 1/2 if 𝑦 ≥ 𝑐. Find 𝑛 and 𝑐 to give a power function 𝐾(𝑝) such that 𝐾(1/2) = 0.1 and 𝐾(2/3) = 0.95 approximately.

Solution:
𝑌 ~ 𝐵(𝑛, 𝑝) ⇒ 𝜇 = 𝑛𝑝, 𝜎² = 𝑛𝑝𝑞

𝐾(𝑝) = 𝑃(𝑌 ≥ 𝑐) = 𝑃((𝑌 − 𝑛𝑝)/√(𝑛𝑝𝑞) ≥ (𝑐 − 𝑛𝑝)/√(𝑛𝑝𝑞)) = 𝑃(𝑧 ≥ (𝑐 − 𝑛𝑝)/√(𝑛𝑝𝑞))

𝐾(1/2) = 0.1 ⟹ 𝑃(𝑧 ≥ (2𝑐 − 𝑛)/√𝑛) = 0.1

i.e., 1 − 𝑃(𝑧 < (2𝑐 − 𝑛)/√𝑛) = 0.1

𝜙((2𝑐 − 𝑛)/√𝑛) = 0.9

(2𝑐 − 𝑛)/√𝑛 = 𝜙⁻¹(0.9) = 1.28

2𝑐 − 𝑛 = 1.28√𝑛   ... (𝑖)

𝐾(2/3) = 0.95 ⟹ 𝑃(𝑧 ≥ (3𝑐 − 2𝑛)/√(2𝑛)) = 0.95

i.e., 1 − 𝑃(𝑧 < (3𝑐 − 2𝑛)/√(2𝑛)) = 0.95

𝜙((3𝑐 − 2𝑛)/√(2𝑛)) = 0.05

(3𝑐 − 2𝑛)/√(2𝑛) = 𝜙⁻¹(0.05) = −𝜙⁻¹(0.95) = −1.64

3𝑐 − 2𝑛 = −1.64√(2𝑛) = −2.3193√𝑛   ... (𝑖𝑖)

(𝑖) × 2 − (𝑖𝑖) gives 𝑐 = 4.8793√𝑛

Substituting in (𝑖), we get 𝑛 = 71.8667 ≈ 72, 𝑐 = 41.3696 ≈ 41

8. Let 𝑋~𝑈(0, 𝜃). Test 𝐻0: 𝜃 = 1 against 𝐻1: 𝜃 = 2 using a sample (𝑋1, 𝑋2) of size 2, by rejecting 𝐻0 if either 𝑋̅ > 0.75 or at least one of 𝑋1 and 𝑋2 is greater than 1. Compute 𝐾(1) and 𝐾(2).
Solution:
Since 𝑋~𝑈(0, 𝜃), 𝑓(𝑥, 𝜃) = 1/𝜃, 0 < 𝑥 < 𝜃

Since (𝑋1, 𝑋2) is a sample, 𝑋1 and 𝑋2 are independent, so

𝑔(𝑥1, 𝑥2, 𝜃) = 𝑓(𝑥1, 𝜃) × 𝑓(𝑥2, 𝜃) = 1/𝜃², 0 < 𝑥1, 𝑥2 < 𝜃

𝐾(1) = 𝑃[(𝑋̅ > 0.75) ∪ (𝑋1 > 1) ∪ (𝑋2 > 1) | 𝜃 = 1]
= 𝑃[((𝑋1 + 𝑋2) > 1.5) ∪ (𝑋1 > 1) ∪ (𝑋2 > 1) | 𝜃 = 1]   (since 𝑋̅ = (𝑋1 + 𝑋2)/2)
= 𝑃[(𝑋1 + 𝑋2) > 1.5 | 𝜃 = 1]   (under 𝜃 = 1, neither 𝑋1 nor 𝑋2 can exceed 1)

= ∫_{𝑥1=0.5}^{1} ∫_{𝑥2=1.5−𝑥1}^{1} (1/1²) 𝑑𝑥2 𝑑𝑥1 = 0.125

𝐾(2) = 𝑃[((𝑋1 + 𝑋2) > 1.5) ∪ (𝑋1 > 1) ∪ (𝑋2 > 1) | 𝜃 = 2]
= 1 − 𝑃[(𝑋1 ≤ 1) ∩ (𝑋2 ≤ 1) ∩ (𝑋1 + 𝑋2 ≤ 1.5) | 𝜃 = 2]

The acceptance region is the unit square minus the corner triangle above 𝑥1 + 𝑥2 = 1.5, with area 1 − 0.125 = 0.875, so

𝐾(2) = 1 − 0.875/2² = 1 − 0.21875 = 0.78125

(Under 𝜃 = 2 the events 𝑋1 > 1 and 𝑋2 > 1 have positive probability and belong to the critical region even when 𝑥1 + 𝑥2 ≤ 1.5, so they must be excluded from the acceptance region.)
Chi-Square Test
Now imagine that the same question is asked about a given six-sided die. Again, we can
test the hypothesis that the die is fair, by throwing it 𝑛 times and observing the
distribution of the resulting outcomes. Now, however, we have not two but six different
classes into which the outcomes are divided – namely the number of appearances of 1,
the number of appearances of 2,⋯, the number of appearances of 6. Let these numbers
be denoted by 𝑥1 , 𝑥2 , ⋯ , 𝑥6 . We also have the probabilities 𝑝1 , 𝑝2 , ⋯ , 𝑝6 for the
occurrence of the respective faces 1,2, ⋯ ,6 of the die in each throw. In the case of a fair
die, each of these would be 𝑝𝑖 = 1/6. Since 𝑛 is the total number of throws, we expect that
each observed number 𝑥𝑖 should be more or less equal to 𝑛𝑝𝑖 .
(For instance, if we throw the die 60 times, then if the die is fair, each face should
appear roughly 10 times.)
Therefore, the difference 𝑥𝑖 − 𝑛𝑝𝑖 between the observed and expected number of
appearances is a measure of how false our hypothesis is likely to be.
Therefore, we need to design a test that uses these deviations to estimate the
probability of the truth or falsehood of the hypothesis.
Firstly, observe that 𝑥6 = 𝑛 − (𝑥1 + ⋯ + 𝑥5 ), so that we only need to consider
𝑥1 , 𝑥2 , ⋯ , 𝑥5 . The sample (𝑋1 , ⋯ , 𝑋5 ) has what is known as the multinomial distribution
with parameters 𝑛, 𝑝1 , ⋯ , 𝑝5 . Then if we define
𝒬5 = (𝑋1 − 𝑛𝑝1)²/(𝑛𝑝1) + (𝑋2 − 𝑛𝑝2)²/(𝑛𝑝2) + (𝑋3 − 𝑛𝑝3)²/(𝑛𝑝3) + (𝑋4 − 𝑛𝑝4)²/(𝑛𝑝4) + (𝑋5 − 𝑛𝑝5)²/(𝑛𝑝5) + (𝑋6 − 𝑛𝑝6)²/(𝑛𝑝6),

then 𝒬5 ~ 𝜒5² approximately for large 𝑛 (i.e., in the limiting case where 𝑛 → ∞).
Now the test is as follows. Select a value 𝑐, and reject the hypothesis that the die is fair if
𝒬5 ≥ 𝑐. What should be the ideal value of 𝑐? That depends on the significance level of
the test. What we can do therefore, is to set a desired significance level 𝛼 and select 𝑐
such that 𝑃(𝒬5 ≥ 𝑐) = 𝛼. This value of 𝑐 can be obtained from the chi-square table.
The chi-square test in general can therefore be described as follows. Given 𝑘 classes 𝐶1, 𝐶2, ⋯ 𝐶𝑘, to test the hypothesis 𝐻0 that the probability distribution of these classes is 𝑝1, 𝑝2, ⋯, 𝑝𝑘 respectively (where 𝑝1 + 𝑝2 + ⋯ + 𝑝𝑘 = 1), we conduct 𝑛 trials and observe the frequencies 𝑋1, 𝑋2, ⋯, 𝑋𝑘 of the respective classes (so that 𝑋1 + 𝑋2 + ⋯ + 𝑋𝑘 = 𝑛).

Now define 𝒬𝑘−1 = ∑_{𝑖=1}^{𝑘} (𝑋𝑖 − 𝑛𝑝𝑖)²/(𝑛𝑝𝑖).
Then the chi-square test with significance level 𝛼 is to reject 𝐻0 if 𝒬𝑘−1 ≥ 𝑐, where 𝑐 is
the value such that 𝑃[𝒬𝑘−1 ≥ 𝑐] = 𝛼. Common values of 𝛼 are 0.05 and 0.01, written as
5% and 1% respectively.

Exercise
1. A six-sided die is thrown 60 times and observed frequencies of the faces 1,2, ⋯ ,6 are
13,19,11,8,5,4 respectively. Test whether the die is fair at 5% level of significance?
Solution:
Here we have number of classes 𝑘 = 6, and theoretical probabilities 𝑝1 = 𝑝2 = ⋯ = 𝑝6 = 1/6 (since a fair die must have equally likely outcomes), so that 𝑛𝑝𝑖 = 10, 𝑖 = 1, 2, ⋯, 6. Then

𝒬5 = ∑_{𝑖=1}^{6} (𝑥𝑖 − 𝑛𝑝𝑖)²/(𝑛𝑝𝑖)
= (13 − 10)²/10 + (19 − 10)²/10 + (11 − 10)²/10 + (8 − 10)²/10 + (5 − 10)²/10 + (4 − 10)²/10
= 15.6
But 𝜒5² = 11.1 at the 5% level of significance. Thus, 𝒬5 > 𝜒5², which means that we reject the hypothesis that the die is fair.
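The same test in code, assuming scipy is available (chisquare defaults to equal expected counts):

from scipy.stats import chisquare

obs = [13, 19, 11, 8, 5, 4]           # observed face counts
stat, pvalue = chisquare(obs)         # expected = 10 for each face
print(stat, pvalue)                   # 15.6, p ~ 0.008 < 0.05, so reject H0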

2. The Mendelian theory of genetics of crossing two types of peas states that the probabilities of classification of the four resulting types are 9/16, 3/16, 3/16 and 1/16 respectively. If from 160 independent observations, the observed frequencies of these classifications are 86, 35, 26, 13 respectively, test whether the data is consistent with the theory with 𝛼 = 0.01.

Solution:
Here 𝑘 = 4, 𝑛 = 160 and 𝑛𝑝1 = 90, 𝑛𝑝2 = 30, 𝑛𝑝3 = 30, 𝑛𝑝4 = 10.
Therefore,

𝒬3 = ∑_{𝑖=1}^{4} (𝑥𝑖 − 𝑛𝑝𝑖)²/(𝑛𝑝𝑖)
= (86 − 90)²/90 + (35 − 30)²/30 + (26 − 30)²/30 + (13 − 10)²/10
= 2.44 < 11.345 = 𝜒3²

at 1% level of significance (since 𝛼 = 0.01).
Thus, we accept the hypothesis that the data is consistent with the theory.
3. The table below lists the observed results of 𝑛 = 120 independent throws of a die.
Face 1 2 3 4 5 6
Frequency 𝑎 20 20 20 20 40 − 𝑎

For what values of 𝑎 would the hypothesis that the die is unbiased be rejected at 0.025
level of significance in a chi-square test?
Solution:
Only faces 1 and 6 deviate from the expected count of 20, so

𝒬5 = ((𝑎 − 20)² + (40 − 𝑎 − 20)²)/20 = 2(𝑎 − 20)²/20 = (𝑎 − 20)²/10

Since our test (at 0.025 level of significance) is to reject the hypothesis if 𝒬5 > 12.833 (this number is obtained by comparing to a 𝜒²(5) distribution), we solve

(𝑎 − 20)²/10 > 12.833

⇒ 𝑎 > 20 + √128.33 or 𝑎 < 20 − √128.33, i.e., 𝑎 > 31.33 or 𝑎 < 8.67.

There are two sides coming from the positive and negative square roots of (𝑎 − 20)²; both need to be accounted for.

4. A manufacturer of lightbulbs claims that the lightbulbs produced fall into five
categories A, B, C, D, and E by quality, from highest to lowest, and that the percentages
of lightbulbs in these five categories are 15, 25, 35, 20, and 5 respectively. A contractor
who purchases a large number of the lightbulbs tests the claim by taking a random
sample of 30 lightbulbs and observes that the numbers of lightbulbs that fall in the
categories A, B, C, D, and E are 3, 6, 9, 7, and 5 respectively. Test whether the
manufacturer is speaking the truth, using a chi-square test at
(i) 5% significance level.
(ii) 1% significance level.
Solution:
Category:  A     B     C      D    E
Expected:  30 × 15/100 = 4.5,  30 × 25/100 = 7.5,  30 × 35/100 = 10.5,  30 × 20/100 = 6,  30 × 5/100 = 1.5
Observed:  3     6     9      7    5

𝒬4 = 2.25/4.5 + 2.25/7.5 + 2.25/10.5 + 1/6 + 12.25/1.5 = 9.347

𝜒4² = 9.488 at the 5% significance level and 13.277 at the 1% significance level.

Since 𝒬4 = 9.347 < 9.488, we accept the null hypothesis at the 5% significance level; since 9.347 < 13.277, we also accept it at the 1% significance level.

5. A survey of 320 families with 5 children each reveals the following distribution:

Number of boys:      5    4    3    2    1    0
Number of families:  14   56   110  88   40   12

Are the results consistent with the hypothesis that male and female births are equally probable, at the 0.05 significance level?

Solution:
Let 𝐻0: male and female births are equally probable.
If 𝐻0 is true, then 𝑝 = 1/2 and 1 − 𝑝 = 1/2

𝑋: number of boys in the family; 𝑋 ~ 𝐵(𝑛 = 5, 𝑝 = 1/2)

𝑃(𝑋 = 0) = 1/2⁵, 𝑃(𝑋 = 1) = 5/2⁵, 𝑃(𝑋 = 2) = 10/2⁵, 𝑃(𝑋 = 3) = 10/2⁵, 𝑃(𝑋 = 4) = 5/2⁵, 𝑃(𝑋 = 5) = 1/2⁵

Out of 320 families,
expected number of families with no boy = 320 × 1/2⁵ = 10,
expected number of families with one boy = 320 × 5/2⁵ = 50, and so on.

Number of boys:                 5    4    3    2    1    0
Number of families (expected):  10   50   100  100  50   10
Number of families (observed):  14   56   110  88   40   12

𝒬5 = (14 − 10)²/10 + (56 − 50)²/50 + (110 − 100)²/100 + (88 − 100)²/100 + (40 − 50)²/50 + (12 − 10)²/10 = 7.16 < 11.1 = 𝜒5². Thus, we accept the null hypothesis.
