Math 561 Probability - Chapter 1

Uploaded by

Pablys Zúñiga
Probability and Mathematical Statistics 1

Chapter 1
PROBABILITY OF EVENTS

1.1. Introduction

During his lecture in 1929, Bertrand Russell said, “Probability is the most
important concept in modern science, especially as nobody has the slightest
notion what it means.” Most people have some vague idea of what the
probability of an event means. The interpretation of the word probability
involves synonyms such as chance, odds, uncertainty, prevalence, risk, and
expectancy. “We use probability when we want to make an affirmation, but are
not quite sure,” writes J.R. Lucas.

There are many distinct interpretations of the word probability. A complete
discussion of these interpretations would take us into areas such as
philosophy, the theory of algorithms and randomness, religion, etc. Thus, we
will focus only on two extreme interpretations. One interpretation is due to
the so-called objective school and the other is due to the subjective school.

The subjective school defines probabilities as subjective assignments based
on rational thought with available information. Some subjective probabilists
interpret probabilities as degrees of belief. Thus, it is difficult to
interpret the probability of an event.

The objective school defines probabilities to be “long run” relative
frequencies. This means that one should compute a probability by taking the
number of trials of an experiment in which the event occurs and dividing it
by the total number of trials, and then taking the limit as the number of
trials becomes large. Some statisticians object to the phrase “long run”.
The economist John Maynard Keynes said “in the long run we are all dead”.
The objective school uses the theory developed by Von Mises (1928) and
Kolmogorov (1965). The Russian mathematician Kolmogorov gave probability
theory a solid foundation using measure theory. The advantage of
Kolmogorov’s theory is that one can construct probabilities according to the
rules, compute other probabilities using axioms, and then interpret these
probabilities.

In this book, we will study mathematically one interpretation of probability
out of many. In fact, we will study probability theory based on the theory
developed by the late Kolmogorov. There are many applications of probability
theory. We are studying probability theory because we would like to study
mathematical statistics. Statistics is concerned with the development of
methods and their applications for collecting, analyzing and interpreting
quantitative data in such a way that the reliability of a conclusion based on
data may be evaluated objectively by means of probability statements.
Probability theory is used to evaluate the reliability of conclusions and
inferences based on data. Thus, probability theory is fundamental to
mathematical statistics.

For an event A of a finite sample space S whose outcomes are equally likely,
the probability of A can be computed by using the formula

\[ P(A) = \frac{N(A)}{N(S)} \]

where N(A) denotes the number of elements of A and N(S) denotes the number of
elements in the sample space S. That is, in this case the probability of an
event A can be computed by counting the number of elements in A and dividing
it by the number of elements in the sample space S.

In the next section, we develop various counting techniques. The branch of
mathematics that deals with the various counting techniques is called
combinatorics.

1.2. Counting Techniques

There are three basic counting techniques: the multiplication rule,
permutation, and combination.

1.2.1 Multiplication Rule. If $E_1$ is an experiment with $n_1$ possible
outcomes and $E_2$ is an experiment with $n_2$ possible outcomes, then the
experiment which consists of performing $E_1$ first and then $E_2$ has
$n_1 n_2$ possible outcomes.

Example 1.1. Find the number of possible outcomes in a sequence of two
tosses of a fair coin.
Answer: The number of possible outcomes is 2 · 2 = 4. This is evident from
the following tree diagram.

    First toss    Second toss    Outcome
        H              H            HH
        H              T            HT
        T              H            TH
        T              T            TT
                  Tree diagram

Example 1.2. Find the number of possible outcomes of rolling a die and then
tossing a coin.
Answer: Here $n_1 = 6$ and $n_2 = 2$. Thus, by the multiplication rule, the
number of possible outcomes is 12.

    Die    Coin    Outcome
     1      H        1H
     1      T        1T
     2      H        2H
     2      T        2T
     3      H        3H
     3      T        3T
     4      H        4H
     4      T        4T
     5      H        5H
     5      T        5T
     6      H        6H
     6      T        6T
           Tree diagram

Example 1.3. How many different license plates are possible if Kentucky
uses three letters followed by three digits?
Answer:
\[ (26)^3 (10)^3 = (17576)(1000) = 17,576,000. \]
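
The three examples above can be checked by brute force. The following is a minimal Python sketch, not part of the original text: it lists the outcomes of Examples 1.1 and 1.2 with `itertools.product` and only counts (rather than lists) the license plates of Example 1.3. The variable names are chosen here for illustration.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Example 1.1: every ordered pair of coin faces.
two_tosses = ["".join(p) for p in product(coin, repeat=2)]

# Example 1.2: a die roll followed by a coin toss.
die_then_coin = [f"{d}{c}" for d, c in product(die, coin)]

# Example 1.3: three letters then three digits, counted by the rule n1 * n2 * ...
plate_count = 26**3 * 10**3

print(two_tosses)          # ['HH', 'HT', 'TH', 'TT']
print(len(die_then_coin))  # 12
print(plate_count)         # 17576000
```

Listing the outcomes is only practical for small experiments; for the license plates the multiplication rule gives the count without enumerating 17 million strings.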

1.2.2. Permutation
Consider a set of 4 objects, say a, b, c and d. Suppose we want to fill 3
positions with objects selected from these 4. Then the number of possible
ordered arrangements is 24 and they are

    abc    bac    cab    dab
    abd    bad    cad    dac
    acb    bca    cba    dbc
    acd    bcd    cbd    dba
    adc    bda    cdb    dca
    adb    bdc    cda    dcb

The number of possible ordered arrangements can be computed as follows:
Since there are 3 positions and 4 objects, the first position can be filled in
4 different ways. Once the first position is filled, the remaining 2 positions
can be filled from the remaining 3 objects. Thus, the second position can be
filled in 3 ways. The third position can be filled in 2 ways. Then the total
number of ways 3 positions can be filled out of 4 objects is given by

\[ (4)(3)(2) = 24. \]

In general, if r positions are to be filled from n objects, then the total
number of possible ways they can be filled is given by

\[ n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!} = {}_nP_r. \]

Thus, ${}_nP_r$ represents the number of ways r positions can be filled from
n objects.

Definition 1.1. Each of the ${}_nP_r$ arrangements is called a permutation of
n objects taken r at a time.

Example 1.4. How many permutations are there of all three of the letters a,
b, and c?

Answer:

\[ {}_3P_3 = \frac{n!}{(n-r)!} = \frac{3!}{0!} = 6. \]

Example 1.5. Find the number of permutations of n distinct objects.

Answer:

\[ {}_nP_n = \frac{n!}{(n-n)!} = \frac{n!}{0!} = n!. \]

Example 1.6. Four names are drawn from the 24 members of a club for the
offices of President, Vice-President, Treasurer, and Secretary. In how many
different ways can this be done?

Answer:
\[ {}_{24}P_4 = \frac{24!}{20!} = (24)(23)(22)(21) = 255,024. \]
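
As a quick check of the permutation formula, the sketch below (an illustration added here, not the author's code) lists the 24 arrangements of three letters from a, b, c, d and compares the count with n!/(n − r)!. The helper name `n_p_r` is hypothetical; `math.perm` is the standard-library equivalent in Python 3.8 and later.

```python
from itertools import permutations
from math import factorial, perm

def n_p_r(n: int, r: int) -> int:
    """Number of ordered arrangements of r objects chosen from n: n!/(n-r)!."""
    return factorial(n) // factorial(n - r)

# All ordered arrangements of 3 letters taken from the 4 objects a, b, c, d.
arrangements = ["".join(p) for p in permutations("abcd", 3)]

print(len(arrangements), n_p_r(4, 3))  # 24 24
print(n_p_r(3, 3))                     # 6       (Example 1.4)
print(n_p_r(24, 4), perm(24, 4))       # 255024 255024   (Example 1.6)
```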

1.2.3. Combination

In permutation, order is important. But in many problems the order of
selection is not important and interest centers only on the set of r objects.

Let c denote the number of subsets of size r that can be selected from
n different objects. The r objects in each set can be ordered in ${}_rP_r$
ways. Thus we have
\[ {}_nP_r = c \, ({}_rP_r). \]

From this, we get
\[ c = \frac{{}_nP_r}{{}_rP_r} = \frac{n!}{(n-r)!\, r!}. \]
The number c is denoted by $\binom{n}{r}$. Thus, the above can be written as
\[ \binom{n}{r} = \frac{n!}{(n-r)!\, r!}. \]

Definition 1.2. Each of the $\binom{n}{r}$ unordered subsets is called a
combination of n objects taken r at a time.
Example 1.7. How many committees of two chemists and one physicist can
be formed from 4 chemists and 3 physicists?

Answer:
\[ \binom{4}{2} \binom{3}{1} = (6)(3) = 18. \]
Thus 18 different committees can be formed.
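
Example 1.7 can be confirmed by listing the committees. This is a minimal sketch with made-up member labels (`C1`, ..., `P3` are placeholders, not from the text); it combines `itertools.combinations` for the unordered pair of chemists with the multiplication rule for the physicist.

```python
from itertools import combinations
from math import comb

chemists = ["C1", "C2", "C3", "C4"]   # hypothetical labels
physicists = ["P1", "P2", "P3"]       # hypothetical labels

# A committee is an unordered pair of chemists together with one physicist.
committees = [
    (pair, physicist)
    for pair in combinations(chemists, 2)
    for physicist in physicists
]

print(len(committees))          # 18
print(comb(4, 2) * comb(3, 1))  # 18
```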

1.2.4. Binomial Theorem

We know from lower level mathematics courses that

\[ (x+y)^2 = x^2 + 2xy + y^2
           = \binom{2}{0} x^2 + \binom{2}{1} xy + \binom{2}{2} y^2
           = \sum_{k=0}^{2} \binom{2}{k} x^{2-k} y^k. \]

Similarly

\[ (x+y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3
           = \binom{3}{0} x^3 + \binom{3}{1} x^2 y + \binom{3}{2} xy^2 + \binom{3}{3} y^3
           = \sum_{k=0}^{3} \binom{3}{k} x^{3-k} y^k. \]

In general, using induction arguments, we can show that

\[ (x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k. \]

This result is called the Binomial Theorem. The coefficient $\binom{n}{k}$ is
called the binomial coefficient. A combinatorial proof of the Binomial Theorem
follows. If we write $(x+y)^n$ as the product of n copies of the factor
(x + y), that is

\[ (x+y)^n = (x+y)(x+y)(x+y)\cdots(x+y), \]

then the coefficient of $x^{n-k} y^k$ is $\binom{n}{k}$, that is, the number
of ways in which we can choose the k factors providing the y’s.
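
The combinatorial argument can be acted out directly. The sketch below is an added illustration (the function name `expansion_coefficients` is hypothetical): it expands (x + y)^n by picking x or y from each of the n factors and counts how many picks contain exactly k copies of y.

```python
from collections import Counter
from itertools import product
from math import comb

def expansion_coefficients(n: int) -> list[int]:
    """Coefficient of x^(n-k) y^k in (x + y)^n for k = 0..n, by brute force."""
    # Each term of the expanded product chooses "x" or "y" from every factor.
    counts = Counter(choice.count("y") for choice in product("xy", repeat=n))
    return [counts[k] for k in range(n + 1)]

print(expansion_coefficients(2))  # [1, 2, 1]
print(expansion_coefficients(3))  # [1, 3, 3, 1]
print(expansion_coefficients(5) == [comb(5, k) for k in range(6)])  # True
```

The enumeration visits 2^n terms, so it is useful only as a check for small n; the formula for the binomial coefficient is what one would use in practice.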
Remark 1.1. In 1665, Newton discovered the Binomial Series. The Binomial
Series is given by

\[ (1+y)^{\alpha} = 1 + \binom{\alpha}{1} y + \binom{\alpha}{2} y^2 + \cdots + \binom{\alpha}{n} y^n + \cdots
                  = 1 + \sum_{k=1}^{\infty} \binom{\alpha}{k} y^k, \]

where α is a real number and

\[ \binom{\alpha}{k} = \frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-k+1)}{k!}. \]

This $\binom{\alpha}{k}$ is called the generalized binomial coefficient.
Now, we investigate some properties of the binomial coefficients.
Theorem 1.1. Let $n \in \mathbb{N}$ and r = 0, 1, 2, ..., n. Then

\[ \binom{n}{r} = \binom{n}{n-r}. \]

Proof: By direct verification, we get

\[ \binom{n}{n-r} = \frac{n!}{(n-n+r)!\,(n-r)!} = \frac{n!}{r!\,(n-r)!} = \binom{n}{r}. \]

This theorem says that the binomial coefficients are symmetrical.


  
Example 1.8. Evaluate $\binom{3}{1} + \binom{3}{2} + \binom{3}{0}$.
Answer: Since the combinations of 3 things taken 1 at a time are 3, we get
$\binom{3}{1} = 3$. Similarly, $\binom{3}{0}$ is 1. By Theorem 1.1,

\[ \binom{3}{1} = \binom{3}{2} = 3. \]

Hence
\[ \binom{3}{1} + \binom{3}{2} + \binom{3}{0} = 3 + 3 + 1 = 7. \]

Theorem 1.2. For any positive integer n and r = 1, 2, 3, ..., n, we have

\[ \binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1}. \]

Proof:
\[ (1+y)^n = (1+y)(1+y)^{n-1} = (1+y)^{n-1} + y(1+y)^{n-1} \]
\[ \sum_{r=0}^{n} \binom{n}{r} y^r
   = \sum_{r=0}^{n-1} \binom{n-1}{r} y^r + y \sum_{r=0}^{n-1} \binom{n-1}{r} y^r
   = \sum_{r=0}^{n-1} \binom{n-1}{r} y^r + \sum_{r=0}^{n-1} \binom{n-1}{r} y^{r+1}. \]

Equating the coefficients of $y^r$ from both sides of the above expression, we
obtain
\[ \binom{n}{r} = \binom{n-1}{r} + \binom{n-1}{r-1} \]
and the proof is now complete.
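Example 1.9. Evaluate $\binom{23}{10} + \binom{23}{9} + \binom{24}{11}$.

Answer:
\[ \binom{23}{10} + \binom{23}{9} + \binom{24}{11}
   = \binom{24}{10} + \binom{24}{11}
   = \binom{25}{11}
   = \frac{25!}{(14)!\,(11)!}
   = 4,457,400. \]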
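
Theorem 1.2 is the rule that generates Pascal's triangle, and it gives a way to compute binomial coefficients with additions only. The following sketch is an illustration added here (the name `pascal_row` is hypothetical); it builds row n from row n − 1 and reproduces the value in Example 1.9.

```python
from math import comb

def pascal_row(n: int) -> list[int]:
    """Row n of Pascal's triangle, built using only the rule of Theorem 1.2."""
    row = [1]
    for _ in range(n):
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row

row25 = pascal_row(25)
print(row25[11])                                   # 4457400
print(comb(23, 10) + comb(23, 9) + comb(24, 11))   # 4457400
print(row25 == row25[::-1])                        # True (symmetry, Theorem 1.1)
```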


Example 1.10. Use the Binomial Theorem to show that
$\sum_{r=0}^{n} (-1)^r \binom{n}{r} = 0$.

Answer: Using the Binomial Theorem, we get

\[ (1+x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r \]

for all real numbers x. Letting x = −1 in the above, we get

\[ 0 = \sum_{r=0}^{n} \binom{n}{r} (-1)^r. \]

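Theorem 1.3. Let m and n be positive integers. Then

\[ \sum_{r=0}^{k} \binom{m}{r} \binom{n}{k-r} = \binom{m+n}{k}. \]

Proof:
\[ (1+y)^{m+n} = (1+y)^m (1+y)^n \]
\[ \sum_{r=0}^{m+n} \binom{m+n}{r} y^r
   = \left( \sum_{r=0}^{m} \binom{m}{r} y^r \right) \left( \sum_{r=0}^{n} \binom{n}{r} y^r \right). \]

Equating the coefficients of $y^k$ from both sides of the above expression,
we obtain

\[ \binom{m+n}{k} = \binom{m}{0}\binom{n}{k} + \binom{m}{1}\binom{n}{k-1} + \cdots + \binom{m}{k}\binom{n}{k-k} \]

and the conclusion of the theorem follows.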

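Example 1.11. Show that

\[ \sum_{r=0}^{n} \binom{n}{r}^2 = \binom{2n}{n}. \]

Answer: Let k = n and m = n. Then from Theorem 1.3, we get

\[ \sum_{r=0}^{k} \binom{m}{r} \binom{n}{k-r} = \binom{m+n}{k} \]
\[ \sum_{r=0}^{n} \binom{n}{r} \binom{n}{n-r} = \binom{2n}{n} \]
\[ \sum_{r=0}^{n} \binom{n}{r} \binom{n}{r} = \binom{2n}{n} \]
\[ \sum_{r=0}^{n} \binom{n}{r}^2 = \binom{2n}{n}. \]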

Theorem 1.4. Let n be a positive integer and k = 1, 2, 3, ..., n. Then

\[ \binom{n}{k} = \sum_{m=k-1}^{n-1} \binom{m}{k-1}. \]

Proof: In order to establish the above identity, we use the Binomial Theorem
together with the following result of elementary algebra

\[ x^n - y^n = (x - y) \sum_{k=0}^{n-1} x^k y^{n-1-k}. \]

Note that
\[ \sum_{k=1}^{n} \binom{n}{k} x^k
   = \sum_{k=0}^{n} \binom{n}{k} x^k - 1 \]
\[ = (x+1)^n - 1^n \qquad \text{by the Binomial Theorem} \]
\[ = (x + 1 - 1) \sum_{m=0}^{n-1} (x+1)^m \qquad \text{by the above identity} \]
\[ = x \sum_{m=0}^{n-1} \sum_{j=0}^{m} \binom{m}{j} x^j \]
\[ = \sum_{m=0}^{n-1} \sum_{j=0}^{m} \binom{m}{j} x^{j+1} \]
\[ = \sum_{k=1}^{n} \sum_{m=k-1}^{n-1} \binom{m}{k-1} x^k. \]

Hence equating the coefficients of $x^k$, we obtain

\[ \binom{n}{k} = \sum_{m=k-1}^{n-1} \binom{m}{k-1}. \]

This completes the proof of the theorem.
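
The identities of Example 1.10, Theorem 1.3, Example 1.11 and Theorem 1.4 are easy to spot-check numerically. The sketch below is an added illustration, not a proof; the function names and the particular values of m, n and k are chosen here arbitrarily.

```python
from math import comb

def vandermonde_sum(m: int, n: int, k: int) -> int:
    """Left side of Theorem 1.3: sum over r of C(m, r) * C(n, k - r)."""
    return sum(comb(m, r) * comb(n, k - r) for r in range(k + 1))

def column_sum(n: int, k: int) -> int:
    """Right side of Theorem 1.4: sum of C(m, k - 1) for m = k-1, ..., n-1."""
    return sum(comb(m, k - 1) for m in range(k - 1, n))

print(sum((-1) ** r * comb(8, r) for r in range(9)))        # 0    (Example 1.10)
print(vandermonde_sum(5, 7, 4), comb(12, 4))                # 495 495
print(sum(comb(6, r) ** 2 for r in range(7)), comb(12, 6))  # 924 924
print(column_sum(10, 4), comb(10, 4))                       # 210 210
```

Note that `math.comb(n, k)` returns 0 when k exceeds n, which matches the convention needed for the terms of Theorem 1.3 in which k − r is larger than n.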


The following result

\[ (x_1 + x_2 + \cdots + x_m)^n
   = \sum_{n_1 + n_2 + \cdots + n_m = n} \binom{n}{n_1, n_2, ..., n_m} x_1^{n_1} x_2^{n_2} \cdots x_m^{n_m} \]

is known as the multinomial theorem and it generalizes the binomial theorem.
The sum is taken over all nonnegative integers $n_1, n_2, ..., n_m$ such that
$n_1 + n_2 + \cdots + n_m = n$, and

\[ \binom{n}{n_1, n_2, ..., n_m} = \frac{n!}{n_1!\, n_2! \cdots n_m!}. \]

This coefficient is known as the multinomial coefficient.
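
A small sketch of the multinomial coefficient follows; it is an added illustration with hypothetical helper names. Setting every $x_i = 1$ in the multinomial theorem says the coefficients must add up to $m^n$, which gives a simple consistency check (and shows why the sum has to include terms with some $n_i = 0$).

```python
from itertools import product
from math import factorial

def multinomial(*parts: int) -> int:
    """n! / (n_1! n_2! ... n_m!) where n = n_1 + n_2 + ... + n_m."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result

def compositions(n: int, m: int) -> list[tuple[int, ...]]:
    """All tuples of m nonnegative integers that sum to n."""
    return [c for c in product(range(n + 1), repeat=m) if sum(c) == n]

total = sum(multinomial(*c) for c in compositions(4, 3))
print(multinomial(2, 1, 1))  # 12
print(total, 3**4)           # 81 81
```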


1.3. Probability Measure
A random experiment is an experiment whose outcomes cannot be pre-
dicted with certainty. However, in most cases the collection of every possible
outcome of a random experiment can be listed.
Definition 1.3. A sample space of a random experiment is the collection of
all possible outcomes.
Example 1.12. What is the sample space for an experiment in which we
select a rat at random from a cage and determine its sex?
Answer: The sample space of this experiment is

S = {M, F }

where M denotes the male rat and F denotes the female rat.
Example 1.13. What is the sample space for an experiment in which the
state of Kentucky picks a three digit integer at random for its daily lottery?
Answer: The sample space of this experiment is

S = {000, 001, 002, ..., 998, 999}.

Example 1.14. What is the sample space for an experiment in which we
roll a pair of dice, one red and one green?
Answer: The sample space S for this experiment is given by

    S = { (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
          (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
          (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
          (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
          (5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
          (6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6) }

This set S can be written as

S = {(x, y) | x and y are integers, 1 ≤ x ≤ 6, 1 ≤ y ≤ 6}

where x represents the number rolled on the red die and y denotes the number
rolled on the green die.

Definition 1.4. Each element of the sample space is called a sample point.

Definition 1.5. If the sample space consists of a countable number of sample
points, then the sample space is said to be a countable sample space.

Definition 1.6. If a sample space contains an uncountable number of sample
points, then it is called a continuous sample space.

An event A is a subset of the sample space S. It seems obvious that if A
and B are events in sample space S, then $A \cup B$, $A^c$, $A \cap B$ are
also entitled to be events. Thus, precisely, we define an event as follows:

Definition 1.7. A subset A of the sample space S is said to be an event if it
belongs to a collection F of subsets of S satisfying the following three rules:
(a) $S \in F$; (b) if $A \in F$ then $A^c \in F$; and (c) if $A_j \in F$ for
$j \ge 1$, then $\bigcup_{j=1}^{\infty} A_j \in F$. The collection F is called
an event space or a σ-field. If the outcome of an experiment belongs to A,
then we say that the event A has occurred.

Example 1.15. Describe the sample space of rolling a die and interpret the
event {1, 2}.

Answer: The sample space of this experiment is

S = {1, 2, 3, 4, 5, 6}.

The event {1, 2} means getting either a 1 or a 2.

Example 1.16. First describe the sample space of rolling a pair of dice,
then describe the event A that the sum of numbers rolled is 7.

Answer: The sample space of this experiment is

S = {(x, y) | x, y = 1, 2, 3, 4, 5, 6}

and
A = {(1, 6), (6, 1), (2, 5), (5, 2), (4, 3), (3, 4)}.
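
Sample spaces and events of this size are convenient to represent as Python sets, which makes the set operations in Definition 1.7 concrete. This is an added sketch; the variable names are chosen here for illustration.

```python
from itertools import product

# Sample space of Example 1.16: ordered pairs (x, y) with x, y in 1..6.
S = set(product(range(1, 7), repeat=2))

# Event A: the sum of the numbers rolled is 7.
A = {(x, y) for (x, y) in S if x + y == 7}

print(len(S))         # 36
print(sorted(A))      # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(A <= S)         # True: an event is a subset of the sample space
print(len(S - A))     # 30: the complement of A with respect to S
```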

Definition 1.8. Let S be the sample space of a random experiment. A
probability measure $P : F \to [0, 1]$ is a set function which assigns real
numbers to the various events of S satisfying

(P1) $P(A) \ge 0$ for every event $A \in F$,

(P2) $P(S) = 1$,

(P3) $P\left( \bigcup_{k=1}^{\infty} A_k \right) = \sum_{k=1}^{\infty} P(A_k)$
if $A_1, A_2, A_3, ..., A_k, ...$ are mutually disjoint events of S.
Any set function with the above three properties is a probability measure
for S. For a given sample space S, there may be more than one probability
measure. The probability of an event A is the value of the probability measure
at A, that is
Prob(A) = P(A).

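Theorem 1.5. If ∅ is an empty set (that is, an impossible event), then

\[ P(\emptyset) = 0. \]

Proof: Let $A_1 = S$ and $A_i = \emptyset$ for i = 2, 3, .... Then

\[ S = \bigcup_{i=1}^{\infty} A_i \]

where $A_i \cap A_j = \emptyset$ for $i \ne j$. By axioms (P2) and (P3), we get

\[ 1 = P(S) \qquad \text{(by axiom (P2))} \]
\[ = P\left( \bigcup_{i=1}^{\infty} A_i \right) \]
\[ = \sum_{i=1}^{\infty} P(A_i) \qquad \text{(by axiom (P3))} \]
\[ = P(A_1) + \sum_{i=2}^{\infty} P(A_i) \]
\[ = P(S) + \sum_{i=2}^{\infty} P(\emptyset) \]
\[ = 1 + \sum_{i=2}^{\infty} P(\emptyset). \]

Therefore

\[ \sum_{i=2}^{\infty} P(\emptyset) = 0. \]

Since $P(\emptyset) \ge 0$ by axiom (P1), we have

\[ P(\emptyset) = 0 \]

and the proof of the theorem is complete.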

This theorem says that the probability of an impossible event is zero.
Note that if the probability of an event is zero, that does not mean the event
is empty (or impossible). There are random experiments in which there are
infinitely many events each with probability 0. Similarly, if A is an event
with probability 1, then it does not mean A is the sample space S. In fact
there are random experiments in which one can find infinitely many events
each with probability 1.

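Theorem 1.6. Let $\{A_1, A_2, ..., A_n\}$ be a finite collection of n events
such that $A_i \cap A_j = \emptyset$ for $i \ne j$. Then

\[ P\left( \bigcup_{i=1}^{n} A_i \right) = \sum_{i=1}^{n} P(A_i). \]

Proof: Consider the collection $\{A'_i\}_{i=1}^{\infty}$ of the subsets of the
sample space S such that

\[ A'_1 = A_1, \; A'_2 = A_2, \; ..., \; A'_n = A_n \]

and

\[ A'_{n+1} = A'_{n+2} = A'_{n+3} = \cdots = \emptyset. \]

Hence
\[ P\left( \bigcup_{i=1}^{n} A_i \right) = P\left( \bigcup_{i=1}^{\infty} A'_i \right) \]
\[ = \sum_{i=1}^{\infty} P(A'_i) \]
\[ = \sum_{i=1}^{n} P(A'_i) + \sum_{i=n+1}^{\infty} P(A'_i) \]
\[ = \sum_{i=1}^{n} P(A_i) + \sum_{i=n+1}^{\infty} P(\emptyset) \]
\[ = \sum_{i=1}^{n} P(A_i) + 0 \]
\[ = \sum_{i=1}^{n} P(A_i) \]

and the proof of the theorem is now complete.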

When n = 2, the above theorem yields $P(A_1 \cup A_2) = P(A_1) + P(A_2)$
where $A_1$ and $A_2$ are disjoint (or mutually exclusive) events.
In the following theorem, we give a method for computing probability
of an event A by knowing the probabilities of the elementary events of the
sample space S.
Theorem 1.7. If A is an event of a discrete sample space S, then the
probability of A is equal to the sum of the probabilities of its elementary
events.
Proof: Any set A in S can be written as the union of its singleton sets. Let
$\{O_i\}_{i=1}^{\infty}$ be the collection of all the singleton sets (or the
elementary events) of A. Then

\[ A = \bigcup_{i=1}^{\infty} O_i. \]

By axiom (P3), we get

\[ P(A) = P\left( \bigcup_{i=1}^{\infty} O_i \right) = \sum_{i=1}^{\infty} P(O_i). \]

Example 1.17. If a fair coin is tossed twice, what is the probability of
getting at least one head?
Answer: The sample space of this experiment is

S = {HH, HT, TH, TT}.

The event A is given by

A = { at least one head } = {HH, HT, TH}.

By Theorem 1.7, the probability of A is the sum of the probabilities of its
elementary events. Thus, we get

\[ P(A) = P(HH) + P(HT) + P(TH) = \frac{1}{4} + \frac{1}{4} + \frac{1}{4} = \frac{3}{4}. \]

Remark 1.2. Notice that here we are not computing the probability of the
elementary events by taking the number of points in the elementary event
and dividing by the total number of points in the sample space. We are
using the randomness to obtain the probability of the elementary events.
That is, we are assuming that each outcome is equally likely. This is why the
randomness is an integral part of probability theory.
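
The computation in Example 1.17 can be mirrored in code. The sketch below is an added illustration under the stated assumption that the coin is fair, so each of the four outcomes gets probability 1/4; the names `p` and `prob` are hypothetical. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# Assumption: a fair coin, so the four outcomes are equally likely.
S = ["".join(t) for t in product("HT", repeat=2)]
p = {outcome: Fraction(1, len(S)) for outcome in S}

def prob(event):
    """Theorem 1.7: add the probabilities of the elementary events in the event."""
    return sum(p[outcome] for outcome in event)

A = [outcome for outcome in S if "H" in outcome]  # at least one head

print(A)        # ['HH', 'HT', 'TH']
print(prob(A))  # 3/4
print(prob(S))  # 1
```

Replacing the dictionary `p` with unequal values (that still sum to 1) would model a biased coin without changing `prob`, which is the point of Theorem 1.7.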
Corollary 1.1. If S is a finite sample space with n sample elements, each
equally likely, and A is an event in S with m elements, then the probability
of A is given by
\[ P(A) = \frac{m}{n}. \]
Proof: By the previous theorem, we get
\[ P(A) = P\left( \bigcup_{i=1}^{m} O_i \right)
        = \sum_{i=1}^{m} P(O_i)
        = \sum_{i=1}^{m} \frac{1}{n}
        = \frac{m}{n}. \]
The proof is now complete.
Example 1.18. A die is loaded in such a way that the probability of the
face with j dots turning up is proportional to j for j = 1, 2, ..., 6. What is
the probability, in one roll of the die, that an odd number of dots will turn
up?
Answer:
\[ P(\{j\}) \propto j, \quad \text{that is,} \quad P(\{j\}) = kj \]
where k is a constant of proportionality. Next, we determine this constant k
by using the axiom (P2). Using Theorem 1.7, we get
\[ P(S) = P(\{1\}) + P(\{2\}) + P(\{3\}) + P(\{4\}) + P(\{5\}) + P(\{6\}) \]
\[ = k + 2k + 3k + 4k + 5k + 6k \]
\[ = (1 + 2 + 3 + 4 + 5 + 6)\, k \]
\[ = \frac{(6)(6+1)}{2}\, k \]
\[ = 21k. \]

Using (P2), we get

\[ 21k = 1. \]
Thus $k = \frac{1}{21}$. Hence, we have

\[ P(\{j\}) = \frac{j}{21}. \]
Now, we want to find the probability of an odd number of dots turning up.

\[ P(\text{odd number of dots will turn up}) = P(\{1\}) + P(\{3\}) + P(\{5\})
   = \frac{1}{21} + \frac{3}{21} + \frac{5}{21}
   = \frac{9}{21}. \]
Remark 1.3. Recall that the sum of the first n positive integers is equal to
$\frac{n}{2}(n+1)$. That is,
\[ 1 + 2 + 3 + \cdots + (n-2) + (n-1) + n = \frac{n(n+1)}{2}. \]
According to a well-known anecdote, Gauss (1777-1855) found this formula when
he was a young school boy.
Remark 1.4. Gauss is said to have shown that the sum of the first n positive
integers is $\frac{n(n+1)}{2}$ when he was a school boy. Kolmogorov, the
father of modern probability theory, proved that the sum of the first n odd
positive integers is $n^2$, when he was five years old.

1.4. Some Properties of the Probability Measure


Next, we present some theorems that will illustrate the various intuitive
properties of a probability measure.
Theorem 1.8. If A is any event of the sample space S, then

\[ P(A^c) = 1 - P(A) \]

where $A^c$ denotes the complement of A with respect to S.

Proof: Let A be any event of S. Then $S = A \cup A^c$. Further, A and $A^c$
are mutually disjoint. Thus, using (P3), we get

\[ 1 = P(S) = P(A \cup A^c) = P(A) + P(A^c). \]

[Figure: Venn diagram of the sample space split into A and its complement $A^c$.]

Hence, we see that

\[ P(A^c) = 1 - P(A). \]
This completes the proof.
Theorem 1.9. If $A \subseteq B \subseteq S$, then

\[ P(A) \le P(B). \]

Proof: Note that $B = A \cup (B \setminus A)$ where $B \setminus A$ denotes
all the elements x that are in B but not in A. Further,
$A \cap (B \setminus A) = \emptyset$. Hence by (P3), we get

\[ P(B) = P(A \cup (B \setminus A)) = P(A) + P(B \setminus A). \]

By axiom (P1), we know that $P(B \setminus A) \ge 0$. Thus, from the above,
we get

\[ P(B) \ge P(A) \]

and the proof is complete.


Theorem 1.10. If A is any event in S, then

\[ 0 \le P(A) \le 1. \]

Proof: Follows from axioms (P1) and (P2) and Theorem 1.8.
Theorem 1.10. If A and B are any two events, then

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B). \]

Proof: It is easy to see that

\[ A \cup B = A \cup (A^c \cap B) \]

and
\[ A \cap (A^c \cap B) = \emptyset. \]

[Figure: Venn diagram of A and B showing $A \cup B$ as A together with $A^c \cap B$.]

Hence by (P3), we get

\[ P(A \cup B) = P(A) + P(A^c \cap B). \qquad (1.1) \]

But the set B can also be written as

\[ B = (A \cap B) \cup (A^c \cap B). \]

[Figure: Venn diagram of A and B showing B as $A \cap B$ together with $A^c \cap B$.]

Therefore, by (P3), we get

\[ P(B) = P(A \cap B) + P(A^c \cap B). \qquad (1.2) \]

Eliminating $P(A^c \cap B)$ from (1.1) and (1.2), we get

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

and the proof of the theorem is now complete.

The above theorem tells us how to calculate the probability that at least
one of A and B occurs.
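
The complement rule and the formula for $P(A \cup B)$ can be checked on a small example. The sketch below is an added illustration: it assumes two fair dice (so that P is counting divided by 36), and the events A and B are chosen here only for the demonstration.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))

def P(event):
    """Equally likely outcomes are assumed, so P(event) = N(event) / N(S)."""
    return Fraction(len(event), len(S))

A = {(x, y) for (x, y) in S if x + y == 7}  # hypothetical event: the sum is 7
B = {(x, y) for (x, y) in S if x == 1}      # hypothetical event: first die shows 1

print(P(S - A), 1 - P(A))                   # 5/6 5/6
print(P(A | B), P(A) + P(B) - P(A & B))     # 11/36 11/36
```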

Example 1.19. If P(A) = 0.25 and P(B) = 0.8, then show that
$0.05 \le P(A \cap B) \le 0.25$.

Answer: Since $A \cap B \subseteq A$ and $A \cap B \subseteq B$, by Theorem
1.9, we get

\[ P(A \cap B) \le P(A) \quad \text{and also} \quad P(A \cap B) \le P(B). \]

Hence
\[ P(A \cap B) \le \min\{P(A), P(B)\}. \]

This shows that

\[ P(A \cap B) \le 0.25. \qquad (1.3) \]

Since $A \cup B \subseteq S$, by Theorem 1.9, we get

\[ P(A \cup B) \le P(S). \]

That is, by Theorem 1.10,

\[ P(A) + P(B) - P(A \cap B) \le P(S). \]

Hence, we obtain
\[ 0.8 + 0.25 - P(A \cap B) \le 1 \]

and this yields

\[ 0.8 + 0.25 - 1 \le P(A \cap B). \]

From this, we get

\[ 0.05 \le P(A \cap B). \qquad (1.4) \]

From (1.3) and (1.4), we get

\[ 0.05 \le P(A \cap B) \le 0.25. \]
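
The argument in Example 1.19 works for any two events, so it can be packaged as a small function. This is an added sketch with a hypothetical name; the `max` with 0 is an extra step, justified by (P1), that only matters when P(A) + P(B) is less than 1.

```python
from fractions import Fraction

def intersection_bounds(p_a: Fraction, p_b: Fraction) -> tuple[Fraction, Fraction]:
    """Bounds on P(A ∩ B) that follow from P(A ∪ B) <= 1 and A ∩ B ⊆ A, B."""
    lower = max(Fraction(0), p_a + p_b - 1)
    upper = min(p_a, p_b)
    return lower, upper

lo, hi = intersection_bounds(Fraction(1, 4), Fraction(4, 5))
print(lo, hi)                  # 1/20 1/4
print(float(lo), float(hi))    # 0.05 0.25
```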

Example 1.20. Let A and B be events in a sample space S such that
$P(A) = \frac{1}{2} = P(B)$ and $P(A^c \cap B^c) = \frac{1}{3}$. Find
$P(A \cup B^c)$.

Answer: Notice that

\[ A \cup B^c = A \cup (A^c \cap B^c). \]

Hence,
\[ P(A \cup B^c) = P(A) + P(A^c \cap B^c) = \frac{1}{2} + \frac{1}{3} = \frac{5}{6}. \]

Theorem 1.11. If $A_1$ and $A_2$ are two events such that
$A_1 \subseteq A_2$, then

\[ P(A_2 \setminus A_1) = P(A_2) - P(A_1). \]

Proof: The event $A_2$ can be written as

\[ A_2 = A_1 \cup (A_2 \setminus A_1) \]

where the sets $A_1$ and $A_2 \setminus A_1$ are disjoint. Hence

\[ P(A_2) = P(A_1) + P(A_2 \setminus A_1) \]

which is
\[ P(A_2 \setminus A_1) = P(A_2) - P(A_1) \]

and the proof of the theorem is now complete.

From calculus we know that a real function $f : \mathbb{R} \to \mathbb{R}$
(where $\mathbb{R}$ is the set of real numbers) is continuous on $\mathbb{R}$
if and only if, for every convergent sequence $\{x_n\}_{n=1}^{\infty}$ in
$\mathbb{R}$,
\[ \lim_{n \to \infty} f(x_n) = f\left( \lim_{n \to \infty} x_n \right). \]

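Theorem 1.12. If $A_1, A_2, ..., A_n, ...$ is a sequence of events in sample
space S such that $A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq \cdots$,
then

\[ P\left( \bigcup_{n=1}^{\infty} A_n \right) = \lim_{n \to \infty} P(A_n). \]

Similarly, if $B_1, B_2, ..., B_n, ...$ is a sequence of events in sample
space S such that $B_1 \supseteq B_2 \supseteq \cdots \supseteq B_n \supseteq \cdots$,
then

\[ P\left( \bigcap_{n=1}^{\infty} B_n \right) = \lim_{n \to \infty} P(B_n). \]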

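Proof: Given an increasing sequence of events

\[ A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq \cdots \]

we define a disjoint collection of events as follows:

\[ E_1 = A_1, \qquad E_n = A_n \setminus A_{n-1} \quad \forall n \ge 2. \]

Then $\{E_n\}_{n=1}^{\infty}$ is a disjoint collection of events such that

\[ \bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} E_n. \]

Further
\[ P\left( \bigcup_{n=1}^{\infty} A_n \right) = P\left( \bigcup_{n=1}^{\infty} E_n \right) \]
\[ = \sum_{n=1}^{\infty} P(E_n) \]
\[ = \lim_{m \to \infty} \sum_{n=1}^{m} P(E_n) \]
\[ = \lim_{m \to \infty} \left[ P(A_1) + \sum_{n=2}^{m} \left[ P(A_n) - P(A_{n-1}) \right] \right] \]
\[ = \lim_{m \to \infty} P(A_m) \]
\[ = \lim_{n \to \infty} P(A_n). \]

Here $P(E_n) = P(A_n) - P(A_{n-1})$ for $n \ge 2$ by Theorem 1.11. The second
part of the theorem can be proved similarly.

Note that

\[ \lim_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} A_n \]

and

\[ \lim_{n \to \infty} B_n = \bigcap_{n=1}^{\infty} B_n. \]

Hence the results of the above theorem can be written as

\[ P\left( \lim_{n \to \infty} A_n \right) = \lim_{n \to \infty} P(A_n) \]

and
\[ P\left( \lim_{n \to \infty} B_n \right) = \lim_{n \to \infty} P(B_n). \]

Theorem 1.12 is called the continuity theorem for the probability measure.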
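
To see the telescoping step of the proof in numbers, consider an example that is not in the text: toss a fair coin repeatedly and let $A_n$ be the event that at least one head appears among the first n tosses. These events are increasing, and the sketch below (with hypothetical names) computes $P(A_n)$ by enumeration and checks that the increments $P(E_n)$ add back up to $P(A_m)$.

```python
from fractions import Fraction
from itertools import product

def p_head_by(n: int) -> Fraction:
    """P(A_n): at least one head among the first n tosses of a fair coin."""
    outcomes = list(product("HT", repeat=n))
    favorable = [o for o in outcomes if "H" in o]
    return Fraction(len(favorable), len(outcomes))

probs = [p_head_by(n) for n in range(1, 11)]

# E_1 = A_1 and E_n = A_n \ A_(n-1), so P(E_n) = P(A_n) - P(A_(n-1)).
increments = [probs[0]] + [probs[i] - probs[i - 1] for i in range(1, len(probs))]

print([str(q) for q in probs[:4]])    # ['1/2', '3/4', '7/8', '15/16']
print(sum(increments) == probs[-1])   # True
print(float(probs[-1]))               # 0.9990234375
```

The values $P(A_n) = 1 - 2^{-n}$ increase toward 1, which by Theorem 1.12 is the probability of the union of the $A_n$, that is, of the event that a head eventually appears.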
1.5. Review Exercises
1. If we randomly pick two television sets in succession from a shipment of
240 television sets of which 15 are defective, what is the probability that they
will both be defective?
2. A poll of 500 people determines that 382 like ice cream and 362 like cake.
How many people like both if each of them likes at least one of the two?
(Hint: Use P (A ∪ B) = P (A) + P (B) − P (A ∩ B) ).
3. The Mathematics Department of the University of Louisville consists of
8 professors, 6 associate professors, 13 assistant professors. In how many of
all possible samples of size 4, chosen without replacement, will every type of
professor be represented?
4. A pair of dice consisting of a six-sided die and a four-sided die is rolled
and the sum is determined. Let A be the event that a sum of 5 is rolled and
let B be the event that a sum of 5 or a sum of 9 is rolled. Find (a) P (A), (b)
P (B), and (c) P (A ∩ B).
5. A faculty leader was meeting two students in Paris, one arriving by
train from Amsterdam and the other arriving from Brussels at approximately
the same time. Let A and B be the events that the trains are on time,
respectively. If P (A) = 0.93, P (B) = 0.89 and P (A ∩ B) = 0.87, then find
the probability that at least one train is on time.

6. Bill, George, and Ross, in order, roll a die. The first one to roll an even
number wins and the game is ended. What is the probability that Bill will
win the game?

7. Let A and B be events such that $P(A) = \frac{1}{2} = P(B)$ and
$P(A^c \cap B^c) = \frac{1}{3}$. Find the probability of the event
$A^c \cup B^c$.

8. Suppose a box contains 4 blue, 5 white, 6 red and 7 green balls. In how
many of all possible samples of size 5, chosen without replacement, will every
color be represented?
9. Using the Binomial Theorem, show that
$\sum_{k=0}^{n} k \binom{n}{k} = n\, 2^{n-1}$.

10. A function consists of a domain A, a co-domain B and a rule f. The
rule f assigns to each number in the domain A one and only one letter in the
co-domain B. If A = {1, 2, 3} and B = {x, y, z, w}, then find all the distinct
functions that can be formed from the set A into the set B.

11. Let S be a countable sample space. Let $\{O_i\}_{i=1}^{\infty}$ be the
collection of all the elementary events in S. What should be the value of the
constant c such that $P(O_i) = c \left( \frac{1}{3} \right)^i$ will be a
probability measure in S?

12. A box contains five green balls, three black balls, and seven red balls.
Two balls are selected at random without replacement from the box. What
is the probability that both balls are the same color?

13. Find the sample space of the random experiment which consists of tossing
a coin until the first head is obtained. Is this sample space discrete?

14. Find the sample space of the random experiment which consists of tossing
a coin infinitely many times. Is this sample space discrete?

15. Five fair dice are thrown. What is the probability that a full house is
thrown (that is, where two dice show one number and other three dice show
a second number)?

16. If a fair coin is tossed repeatedly, what is the probability that the third
head occurs on the nth toss?

17. In a particular softball league each team consists of 5 women and 5
men. In determining a batting order for 10 players, a woman must bat first,
and successive batters must be of opposite sex. How many different batting
orders are possible for a team?

18. An urn contains 3 red balls, 2 green balls and 1 yellow ball. Three balls
are selected at random and without replacement from the urn. What is the
probability that at least 1 color is not drawn?

19. A box contains four $10 bills, six $5 bills and two $1 bills. Two bills are
taken at random from the box without replacement. What is the probability
that both bills will be of the same denomination?

20. An urn contains n white counters numbered 1 through n, n black counters
numbered 1 through n, and n red counters numbered 1 through n. If
two counters are to be drawn at random without replacement, what is the
probability that both counters will be of the same color or bear the same
number?

21. Two people take turns rolling a fair die. Person X rolls first, then
person Y , then X, and so on. The winner is the first to roll a 6. What is the
probability that person X wins?
22. Mr. Flowers plants 10 rose bushes in a row. Eight of the bushes are
white and two are red, and he plants them in random order. What is the
probability that he will consecutively plant seven or more white bushes?
23. Using mathematical induction, show that
\[ \frac{d^n}{dx^n} \left[ f(x) \cdot g(x) \right]
   = \sum_{k=0}^{n} \binom{n}{k} \frac{d^k}{dx^k} \left[ f(x) \right] \cdot \frac{d^{n-k}}{dx^{n-k}} \left[ g(x) \right]. \]