Basic Probability - 1 PDF
Chapter 1
(a) the set of all possible outcomes of the experiment can be described;
(b) the outcome of the experiment cannot be predicted with certainty prior to the performance
of the experiment.
The set of all possible outcomes (or sample points) of the experiment is called the sample space
and is denoted by S. For a given experiment it may be possible to define several sample spaces.
Example For the experiment of tossing a coin three times, we could define
(a) S = {HHH,HHT,HTH,THH,HTT,THT,TTH,TTT},
each outcome being an ordered sequence of results; or
(ii) Tossing a coin until the first head appears: S = {H,TH,TTH,TTTH, . . .}.
Events
Combination of events
Since events are sets, they may be combined using the notation of set theory: Venn diagrams
are useful for exhibiting definitions and results, and you should draw such a diagram for each
operation and identity introduced below.
[In the following, A, B, C, A1 , ..., An are events in the event space F (discussed below), and are
therefore subsets of the sample space S.]
The union of A and B, denoted by A ∪ B, is the event ‘either A or B, or both’.
The difference of A and B, denoted by A\B, is the event ‘A but not B’.
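$$\overline{A_1 \cup A_2 \cup \dots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \dots \cap \overline{A_n} \quad\text{or}\quad \overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i} \tag{1.8}$$
$$\overline{A_1 \cap A_2 \cap \dots \cap A_n} = \overline{A_1} \cup \overline{A_2} \cup \dots \cup \overline{A_n} \quad\text{or}\quad \overline{\bigcap_{i=1}^{n} A_i} = \bigcup_{i=1}^{n} \overline{A_i} \tag{1.9}$$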
(The last two results are known as de Morgan’s Laws - see e.g. Ross for proofs.)
The events A1, A2, ..., An are termed mutually exclusive if Ai ∩ Aj = ∅ for all i ≠ j.
If the events A1 , A2 , ..., An are both mutually exclusive and exhaustive, they are called a
partition of S.
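The set identities above can be checked mechanically on a small example. The following is a minimal sketch, assuming a hypothetical ten-point sample space and three hypothetical events (none of these values come from the notes); it verifies de Morgan's Laws and tests whether a collection of events is a partition of S.

```python
# Hypothetical sample space and events, chosen only for illustration.
S = set(range(1, 11))
events = [{1, 2, 3, 4}, {3, 4, 5, 6}, {6, 7, 8}]

def complement(A, S=S):
    """Complement of the event A relative to the sample space S."""
    return S - A

# Complement of the union versus intersection of the complements.
lhs_union = complement(set().union(*events))
rhs_union = set(S).intersection(*(complement(A) for A in events))

# Complement of the intersection versus union of the complements.
lhs_inter = complement(set(S).intersection(*events))
rhs_inter = set().union(*(complement(A) for A in events))

def is_partition(parts, S=S):
    """True if the events are mutually exclusive and exhaustive."""
    exclusive = all(parts[i].isdisjoint(parts[j])
                    for i in range(len(parts))
                    for j in range(i + 1, len(parts)))
    exhaustive = set().union(*parts) == S
    return exclusive and exhaustive

print(lhs_union == rhs_union, lhs_inter == rhs_inter)
print(is_partition(events), is_partition([{1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10}]))
```

A check of this kind is not a proof, but it is a quick way to catch a misremembered identity.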
Event space
The collection of all subsets of S may be too large for probabilities to be assigned reasonably
to all its members. This suggests the concept of event space.
A collection F of subsets of the sample space is called an event space (or σ-field) if
(a) the certain event S and the impossible event ∅ belong to F;
(b) if A ∈ F, then $\overline{A}$ ∈ F;
It is readily shown that, if A1, A2, ... ∈ F, then $\bigcap_{i=1}^{\infty} A_i \in F$. For $\overline{\bigcap_{i=1}^{\infty} A_i} = \bigcup_{i=1}^{\infty} \overline{A_i} \in F$ (invoking
properties (b) and (c) of F), and the result follows from property (b).
For a finite sample space, we normally use the collection of all subsets of S (the power set of
S) as the event space. For S = (−∞, ∞) (or a subset of the real line), the collection of sets
containing all one-point sets and all well-defined intervals is an event space.
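For a finite sample space the event-space conditions can be tested by brute force. The sketch below assumes that property (c), which the text invokes, is closure under unions; for a finite collection it is enough to check pairwise unions. The function names `power_set` and `is_event_space` are hypothetical.

```python
from itertools import chain, combinations

def power_set(S):
    """All subsets of the finite set S, as a set of frozensets."""
    items = list(S)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))
    return {frozenset(c) for c in subsets}

def is_event_space(F, S):
    """Check the event-space conditions for a finite collection F of subsets of S."""
    S = frozenset(S)
    # (a) the certain and impossible events belong to F
    if S not in F or frozenset() not in F:
        return False
    # (b) closed under complementation
    if any(S - A not in F for A in F):
        return False
    # closure under unions (assumed form of property (c); pairwise suffices when F is finite)
    return all(A | B in F for A in F for B in F)

S = {1, 2, 3}
print(is_event_space(power_set(S), S))
print(is_event_space({frozenset(), frozenset({1}), frozenset(S)}, S))
```

The second collection fails because it contains {1} but not its complement {2, 3}.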
(i) There is redundancy in both axioms (a) and (b) - see Corollaries to the Complementarity
Rule (overleaf).
(ii) Axiom (c) states that P is countably additive. This is a more extensive property than
being finitely additive, i.e. the property that, if A1, A2, ..., An is a collection of n mutually
exclusive events, then
$$P(A_1 \cup A_2 \cup \dots \cup A_n) = \sum_{i=1}^{n} P(A_i). \tag{1.11}$$
This is derived from axiom (c) by defining An+1, An+2, ... to be ∅: only (1.11) is required in
the case of finite S and F. It can be shown that the property of being countably additive
is equivalent to the assertion that P is a continuous set function, i.e. given an increasing
sequence of events A1 ⊆ A2 ⊆ ... and writing $A = \bigcup_{i=1}^{\infty} A_i = \lim_{i \to \infty} A_i$, then $P(A) = \lim_{i \to \infty} P(A_i)$
(with a similar result for a decreasing sequence); see e.g. Grimmett & Welsh §1.9.
From the probability axioms (a)-(c), many results follow. For example:
P(S) = 1 ⇒ P(∅) = 0
Proof P(B) = P(A ∪ (B ∩ $\overline{A}$)) = P(A) + P(B ∩ $\overline{A}$) using axiom (c). The result follows
from the fact that P(B ∩ $\overline{A}$) ≥ 0 (axiom (a)). ♦
(iii) Addition Law (for two events) If A, B ∈ F, then
(b)
$$P(A_1 \cap A_2 \cap \dots \cap A_n) \ge \sum_{i=1}^{n} P(A_i) - (n-1) = 1 - \sum_{i=1}^{n} P(\overline{A_i}). \tag{1.17b}$$
Again, proofs are by induction on n (see Examples Sheet 1). [(1.17a) can be generalized.]
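Inequality (1.17b) can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical sample space of 12 equally likely outcomes and three hypothetical events; exact fractions are used so that the equality between the two forms of the bound can be checked without rounding.

```python
from fractions import Fraction

# Hypothetical finite sample space with equally likely outcomes.
S = set(range(12))
events = [set(range(0, 9)), set(range(2, 12)), set(range(1, 10))]

def prob(A):
    """P(A) = N(A)/N under the equally-likely assumption."""
    return Fraction(len(A), len(S))

n = len(events)
p_intersection = prob(set(S).intersection(*events))
bound_a = sum(prob(A) for A in events) - (n - 1)   # sum of P(A_i) minus (n - 1)
bound_b = 1 - sum(prob(S - A) for A in events)     # 1 minus sum of P(complement of A_i)

print(p_intersection, bound_a, bound_b)
print(p_intersection >= bound_a)
```

Here the two expressions for the lower bound agree exactly, and the probability of the intersection lies above them.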
The axioms and derived properties of P provide a probability ‘calculus’, but they do not help in
setting up a probability model for an experiment or in assigning actual numerical values to the
probabilities of specific events. However, the properties of P may prove very useful in solving
a problem once these two steps have been completed.
An important special case (the ‘classical’ situation) is where S is finite and consists of N equally
likely outcomes E1 , E2 , . . . , EN : thus, E1 , . . . , EN are mutually exclusive and exhaustive, and
P(E1) = ... = P(EN). Then, since these N probabilities sum to P(S) = 1 by finite additivity (1.11),
we have
$$P(E_i) = \frac{1}{N}, \quad i = 1, \dots, N. \tag{1.18}$$
Often there is more than one way of tackling a probability problem: on the other hand, an
approach which solves one problem may be inapplicable to another problem (or only solve it
with considerable difficulty). There are five general methods which are widely used:
(i) Listing the elements of the sample space S = {E1, ..., EN}, where N is finite (and
fairly small), and identifying the ‘favourable’ ones. Thus, if the event of interest is
$A = \{E_{i_1}, \dots, E_{i_{N(A)}}\}$, then
$$P(A) = P(E_{i_1}) + \dots + P(E_{i_{N(A)}}).$$
If the outcomes are equally likely, (1.18) then gives
$$P(A) = \frac{N(A)}{N}; \tag{1.19}$$
i.e.
$$P(A) = \frac{\text{number of outcomes in } A}{\text{total number of outcomes}}.$$
In such calculations, symmetry arguments may play an important role.
(ii) Enumeration, when S is a finite sample space of equally likely outcomes. To avoid
listing, we calculate the numbers N (A) and N by combinatorial arguments, then again
use P(A) = N (A)/N .
(iii) Sequential (or Historical) method. Here we follow the history of the system (actual or
conceptual) and use the multiplication rules for probabilities (see later).
page 7 110SOR201(2002)
(iv) Recursion. Here we express the problem in some recursive form, and solve it by recursion
or induction (see later).
(v) Define suitable random variables e.g. indicator random variables (see later).
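Method (i) is easy to carry out by machine when N is small. The sketch below uses a hypothetical experiment that is not taken from the notes: two fair dice are thrown and A is the event that the scores sum to 7. It lists S, picks out the favourable outcomes and applies (1.19).

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of scores, assumed equally likely.
S = list(product(range(1, 7), repeat=2))

# Favourable outcomes for the hypothetical event A: the scores sum to 7.
A = [outcome for outcome in S if sum(outcome) == 7]

# P(A) = N(A) / N
p_A = Fraction(len(A), len(S))
print(len(A), len(S), p_A)   # 6 36 1/6
```

Method (ii) would reach the same answer by counting N(A) = 6 and N = 36 directly, without building the lists.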
In serious applications, we should also investigate the robustness (or sensitivity of the results)
with respect to small changes in the initially assumed probabilities and assumptions in the
model (e.g. assumed independence of successive trials).
As already observed, in many problems involving finite sample spaces with equally likely outcomes,
the calculation of a probability reduces to problems of combinatorial counting. Also, it
is common for problems which at first appear to be quite different to actually have identical
solutions: e.g. (see §1.6.2 below)
(i) calculate the probability that all students in a class of 30 have different birthdays;
(ii) calculate the probability that in a random sample of size 30, sampling with replacement,
from a set of 365 distinct objects, all objects are different.
In this course, we will not be doing many such examples, but from time to time combinatorial
arguments will be used, and you will be expected to be familiar with the more elementary
results in combinatorial analysis. A summary of the different contexts in which these can arise
is provided below.
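Problems (i) and (ii) above reduce to the same count. The following is a minimal sketch, assuming 365 equally likely birthdays (no leap years) in problem (i); the helper name `p_all_distinct` is hypothetical. It computes the probability that r draws with replacement from n distinct objects are all different, as the number of ordered samples without repetition divided by the number of ordered samples with repetition.

```python
from fractions import Fraction
from math import perm

def p_all_distinct(n, r):
    """P(all r draws differ) when sampling with replacement from n objects."""
    # perm(n, r) = n!/(n - r)! ordered samples with no repeats, out of n**r in total
    return Fraction(perm(n, r), n ** r)

p_birthdays = p_all_distinct(365, 30)   # problem (i)
p_sampling = p_all_distinct(365, 30)    # problem (ii): the identical calculation
print(float(p_birthdays))
```

The value is a little under 0.3, so a shared birthday in a class of 30 is more likely than not.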
All the following results follow from simple counting principles, and it is worth checking that
you can derive them.
Sampling
Suppose that a sample of size r is taken from a population of n distinct elements. Then
(ii) If sampling is without replacement, there are n!/(n − r)! different ordered samples and
$\binom{n}{r} = n!/\{r!(n-r)!\}$ different non-ordered samples.
(A random sample of size r from a finite population is one which has been drawn in such a
way that all possible samples of size r have the same probability of being chosen. If a sample
is drawn by successively selecting elements from the population (with or without replacement)
in such a way that each element remaining has the same chance of being chosen at the next
selection, then the sample is a random sample.)
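The counts for sampling without replacement can be confirmed by enumeration. This is a small sketch with hypothetical values n = 5 and r = 3, comparing the number of samples generated by `itertools` with the formulae n!/(n − r)! and n!/{r!(n − r)!}.

```python
from itertools import combinations, permutations
from math import comb, factorial

n, r = 5, 3                      # hypothetical population and sample sizes
population = range(n)

# Ordered samples without replacement
ordered = sum(1 for _ in permutations(population, r))
# Non-ordered samples without replacement
unordered = sum(1 for _ in combinations(population, r))

print(ordered, factorial(n) // factorial(n - r))   # 60 60
print(unordered, comb(n, r))                       # 10 10
```

Each non-ordered sample corresponds to r! ordered ones, which is where the extra factor r! in the denominator comes from.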
Arrangements
(i) The number of different arrangements of r objects chosen from n distinct objects is
n!/(n − r)!. In particular, the number of arrangements of n distinct objects is n!.
(ii) The number of different arrangements of n objects of which n1 are of one kind, n2 of a
second kind, ..., nk of a kth kind is n!/{n1! n2! ... nk!}, where $\sum_{i=1}^{k} n_i = n$ (known as the
multinomial coefficient).
Sub-populations
(i) The number of ways in which a population of n distinct elements can be partitioned into
k sub-populations with n1 elements in the first, n2 elements in the second, ..., nk elements
in the kth, so that $\sum_{i=1}^{k} n_i = n$, is the multinomial coefficient n!/{n1! n2! ... nk!}.
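The multinomial coefficient can be checked against a direct enumeration of arrangements. The sketch below uses two hypothetical words as the objects; the helper name `multinomial` is an assumption of this example.

```python
from itertools import permutations
from math import factorial, prod

def multinomial(*counts):
    """n!/(n1! n2! ... nk!) where n is the sum of the counts."""
    n = sum(counts)
    return factorial(n) // prod(factorial(c) for c in counts)

# "AABBC": n = 5 with two A's, two B's and one C.
distinct_arrangements = len(set(permutations("AABBC")))
print(distinct_arrangements, multinomial(2, 2, 1))   # 30 30

# "MISSISSIPPI": one M, four I's, four S's, two P's (too many to enumerate comfortably).
print(multinomial(1, 4, 4, 2))                       # 34650
```

With k = 2 the function reduces to the binomial coefficient, e.g. `multinomial(3, 2)` equals 10.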
Example 1.1
Suppose n cards numbered 1, 2, ..., n are laid out at random in a row. Define the event
Ai : card i appears in the ith position of the row.
This is termed a match (or coincidence or rencontre) in the ith position. What is the probability
of obtaining at least one match?
Solution First, some preliminary results:
The easiest argument is as follows. In the case of the first summation, take a sample
of size 2, {r, s} say, from the population {1, 2, ..., n}, sampling without replacement: let
i = min(r, s), j = max(r, s). The number of such samples is $\binom{n}{2}$, so this is the number
of terms in the first summation.
Similarly, the number of terms in the second summation is $\binom{n}{3}$, and so on.
respectively. Hence
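$$\begin{aligned}
P(A_1 \cup \dots \cup A_n) &= n \cdot \frac{(n-1)!}{n!} - \binom{n}{2} \frac{(n-2)!}{n!} + \binom{n}{3} \frac{(n-3)!}{n!} - \dots + (-1)^{n+1} \binom{n}{n} \frac{0!}{n!} \\
&= 1 - \frac{1}{2!} + \frac{1}{3!} - \dots + (-1)^{n+1} \frac{1}{n!}.
\end{aligned} \tag{1.20}$$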
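Result (1.20) can be compared with a direct enumeration for small n. The following is a minimal sketch; the function names are hypothetical, and the brute-force version simply lists all n! equally likely layouts and counts those with at least one match.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def p_match_formula(n):
    """Right-hand side of (1.20): 1 - 1/2! + 1/3! - ... + (-1)^(n+1)/n!."""
    return sum(Fraction((-1) ** (r + 1), factorial(r)) for r in range(1, n + 1))

def p_match_brute(n):
    """Fraction of the n! layouts in which some card i lies in position i."""
    hits = sum(any(card == pos for pos, card in enumerate(layout))
               for layout in permutations(range(n)))
    return Fraction(hits, factorial(n))

for n in range(1, 7):
    print(n, p_match_formula(n), p_match_brute(n), float(p_match_formula(n)))
```

The printed decimals settle down very quickly as n grows, which is the behaviour the alternating series suggests.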
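Now
$$e^x = \sum_{r=0}^{\infty} \frac{x^r}{r!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \quad \text{for all real } x.$$
So
$$e^{-1} = \sum_{r=0}^{\infty} \frac{(-1)^r}{r!} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots,$$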
and therefore
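Now
$$P(A_1) = P(\text{no card 1 in 10 packets}) = \left(\frac{3}{4}\right)^{10}$$
(by combinatorial argument, $\frac{3 \times 3 \times \dots \times 3}{4 \times 4 \times \dots \times 4}$, or multiplication law or binomial distribution (see
later)). By symmetry
$$P(A_i) = \left(\frac{3}{4}\right)^{10}, \quad i = 1, \dots, 4.$$
Similarly, we have that
$$P(A_i \cap A_j) = \left(\frac{2}{4}\right)^{10}, \quad i \ne j,$$
$$P(A_i \cap A_j \cap A_k) = \left(\frac{1}{4}\right)^{10}, \quad i, j, k \text{ distinct},$$
$$P(A_1 \cap A_2 \cap A_3 \cap A_4) = 0.$$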
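The probabilities just listed can be combined by inclusion-exclusion. The sketch below assumes the quantity of interest is P(A1 ∪ A2 ∪ A3 ∪ A4), i.e. the probability that at least one of the 4 card types is missing from the 10 packets, with each packet equally likely to contain any type. The function names are hypothetical, and the brute-force check is run on a smaller hypothetical case (3 types, 5 packets) to keep it fast.

```python
from fractions import Fraction
from itertools import product
from math import comb

def p_some_card_missing(types=4, packets=10):
    """Inclusion-exclusion: P(at least one card type absent from the packets)."""
    return sum((-1) ** (r + 1) * comb(types, r) * Fraction(types - r, types) ** packets
               for r in range(1, types + 1))

def p_some_card_missing_brute(types, packets):
    """Same probability by listing all types**packets equally likely outcomes."""
    outcomes = product(range(types), repeat=packets)
    missing = sum(len(set(outcome)) < types for outcome in outcomes)
    return Fraction(missing, types ** packets)

p_union = p_some_card_missing()
print(p_union, float(p_union), float(1 - p_union))
print(p_some_card_missing(3, 5) == p_some_card_missing_brute(3, 5))
```

Under these assumptions the chance of a complete set after 10 packets is 1 minus the printed union probability.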
(b) If the median is 8, two of the chosen numbers come from {1, 2, 3, 4, 5, 6, 7} and 2 from
{9, 10, 11, 12}; so the required probability is
$$\frac{\binom{7}{2} \cdot \binom{4}{2}}{\binom{12}{5}}. \qquad \Box$$
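The expression in (b) can be evaluated and checked by enumeration. This sketch assumes, from the form of the answer, that 5 numbers are chosen at random without replacement from {1, 2, ..., 12}; that reading of the problem is an inference, since the statement itself is not shown here.

```python
from fractions import Fraction
from itertools import combinations
from math import comb
from statistics import median

# Closed form from part (b)
formula = Fraction(comb(7, 2) * comb(4, 2), comb(12, 5))

# Enumeration over all equally likely samples of size 5 (assumed setup)
samples = list(combinations(range(1, 13), 5))
brute = Fraction(sum(median(s) == 8 for s in samples), len(samples))

print(formula, brute)   # 7/44 7/44
```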
[Some people may find this result surprising, thinking that the critical value of n should be
much larger; but this is perhaps because they are confusing this problem with another one:
what is the probability that at least one person in a group of n people has a specific birthday,
e.g. 1st January? The answer to this is clearly $1 - (364/365)^n$: this rises more slowly with n and
does not exceed 1/2 until n = 253.]
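The critical value quoted for the specific-birthday problem is easy to confirm. A minimal sketch, assuming 365 equally likely birthdays; the function name `p_specific` is hypothetical.

```python
def p_specific(n, days=365):
    """P(at least one of n people has a given birthday) = 1 - ((days - 1)/days)**n."""
    return 1 - ((days - 1) / days) ** n

# Smallest group size for which the probability exceeds 1/2
n = 1
while p_specific(n) <= 0.5:
    n += 1

print(n)                                   # 253
print(p_specific(n - 1), p_specific(n))    # just below and just above 0.5
```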
Example 1.5 A Seating Problem
If n married couples are seated at random at a round table, find the probability that no wife
sits next to her husband.
Solution Let Ai : couple i sit next to each other. Then the required probability is
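$$\begin{aligned}
1 - P(A_1 \cup A_2 \cup \dots \cup A_n) = 1 &- \sum_{i=1}^{n} P(A_i) - \dots - (-1)^{r+1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_r}) - \cdots \\
&- (-1)^{n+1} P(A_1 \cap \dots \cap A_n).
\end{aligned}$$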
$$P(A_{i_1} \cap A_{i_2} \cap \dots \cap A_{i_r}) = \frac{2^r (2n - r - 1)!}{(2n - 1)!}.$$
(Note: there is another version of this problem in which one imposes the constraint that men
and women must alternate around the table.)
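The seating probability can be assembled from the two displayed formulae and checked by enumeration for small n. The sketch below assumes n ≥ 2, that the r-fold sum has $\binom{n}{r}$ equal terms (by the counting argument used in Example 1.1), and that seatings differing only by a rotation are regarded as the same, so one person is fixed. The function names and the labelling of people are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def p_no_couple_adjacent(n):
    """Inclusion-exclusion with P(r given couples adjacent) = 2^r (2n-r-1)!/(2n-1)!."""
    return sum((-1) ** r * comb(n, r)
               * Fraction(2 ** r * factorial(2 * n - r - 1), factorial(2 * n - 1))
               for r in range(n + 1))

def p_no_couple_adjacent_brute(n):
    """Enumerate seatings with person 0 fixed; persons 2k and 2k+1 form couple k."""
    seats = 2 * n
    good = total = 0
    for rest in permutations(range(1, seats)):
        seating = (0,) + rest
        total += 1
        # No couple adjacent: neighbours around the circle belong to different couples.
        if all(seating[i] // 2 != seating[(i + 1) % seats] // 2 for i in range(seats)):
            good += 1
    return Fraction(good, total)

for n in (2, 3, 4):
    print(n, p_no_couple_adjacent(n), p_no_couple_adjacent_brute(n))
```

For n = 1 the formula does not apply, since the two seats at the table are adjacent on both sides.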