Lectures Chapter 2A
PROBABILITY (Chapter 2)
The notion of probability
What is probability?
In one sense it is a measure of belief regarding the occurrence of events.
Probability scale:
0 : impossible
0.25 (1/4, 25%) : improbable
0.5 (1/2, 50%) : as likely as not
0.75 (3/4, 75%) : probable
1 (100%) : certain
Eg: A fair coin is tossed once. Then the probability of a head coming up is 1/2.
The last statement is actually an expression of the belief that if the coin were to be
tossed many times (eg n = 1000) then Hs would come up about half the time
(eg y = 503). (We believe that as n increases indefinitely, the ratio y/n tends to 0.5
exactly. This is an example of the law of large numbers, which we'll look at in Ch 9.)
Thus pr is also a concept which involves the notion of (long-run) relative frequency.
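This long-run behaviour is easy to see in simulation. The following sketch (Python, standard library only; the toss counts and seed are arbitrary choices, not from the notes) tosses a fair coin n times and reports the proportion of heads:

```python
import random

def head_ratio(n, seed=0):
    """Toss a fair coin n times and return the proportion of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

# As n grows, the ratio y/n settles near 0.5.
for n in (10, 1000, 100000):
    print(n, head_ratio(n))
```

For small n the ratio can be far from 0.5, but for n = 100000 it is reliably very close, which is exactly the long-run relative-frequency idea above.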
We wish to be able to find prs such as that of getting 2 Hs on 5 tosses of a coin. In
order to do this, we need a theory of probability.
The theory of probability
In the most commonly accepted theory of probability:
1. events are expressed as sets
2. probabilities of events are expressed as functions of the corresponding sets.
STAT2001_CH02A Page 2 of 9
Sets may be finite, countably infinite (eg the positive integers {1, 2, 3, ...}), or
uncountably infinite (eg the interval C = (0,1)).
NB: We will use the symbol ⊆ rather than ⊂ to indicate subsetting. The latter is
potentially confusing because it looks like C. Also, it indicates proper subsetting in
some books, wherein A ⊂ B if A ⊆ B and B ⊈ A.
Two sets are disjoint (or mutually exclusive) if they have no elements in common.
(Thus A and B are disjoint if AB = ∅.)
Several sets are disjoint if no two of them have any elements in common.
(Thus, A, B and C are disjoint if AB = AC = BC = ∅.)
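These definitions translate directly into Python's built-in set type; the particular sets below are arbitrary illustrations, not examples from the notes:

```python
A = {1, 2}
B = {3, 4}
C = {5, 6}

# A and B are disjoint: their intersection is the empty set.
assert A & B == set()

# A, B and C are disjoint: no two of them share an element.
pairs = [(A, B), (A, C), (B, C)]
assert all(X & Y == set() for X, Y in pairs)

# Python also provides the pairwise test directly.
assert A.isdisjoint(B)
```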
Venn diagrams are a useful tool when dealing with sets. For example:
Example 1
For the sets A, B and C in the Venn diagram above, we can form combinations such
as A ∪ B, AC and A ∪ C. For instance, A ∪ C = {1, 4}.
STAT2001_CH02A Page 4 of 9
Algebra of sets
1. A ∪ B = B ∪ A, AB = BA (commutative laws)
2. A ∪ (B ∪ C) = (A ∪ B) ∪ C = A ∪ B ∪ C, A(BC) = (AB)C = ABC (associative laws)
3. A ∪ (BC) = (A ∪ B)(A ∪ C), A(B ∪ C) = (AB) ∪ (AC) (distributive laws)
4. $\overline{A \cup B} = \bar{A}\bar{B}$, $\overline{AB} = \bar{A} \cup \bar{B}$ (De Morgan's laws)
5. A ∪ ∅ = A, A ∪ A = A, AA = A, $\bar{S} = \emptyset$, etc (basic identities)
(Venn diagrams illustrating De Morgan's 1st law: the white region outside A ∪ B is
the same as the region $\bar{A}\bar{B}$.)
Note that the above is not a proper proof of De Morgan's 1st law.
The following is a proper proof (for interest only):
e ∈ $\overline{A \cup B}$ ⟹ e ∉ A ∪ B ⟹ e ∉ A and e ∉ B
⟹ e ∈ $\bar{A}$ and e ∈ $\bar{B}$
⟹ e ∈ $\bar{A}\bar{B}$.
Therefore $\overline{A \cup B}$ ⊆ $\bar{A}\bar{B}$. (1)
Each implication above also holds in reverse, so likewise $\bar{A}\bar{B}$ ⊆ $\overline{A \cup B}$. (2)
By (1) and (2), $\overline{A \cup B} = \bar{A}\bar{B}$.
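For a concrete (non-proof) check, De Morgan's laws can be verified on small finite sets; the universal set S and the subsets A and B below are arbitrary choices:

```python
S = set(range(1, 11))   # universal set, chosen arbitrarily
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def comp(X):
    """Complement of X relative to the universal set S."""
    return S - X

# De Morgan's 1st law: the complement of (A union B)
# equals the intersection of the complements.
assert comp(A | B) == comp(A) & comp(B)

# De Morgan's 2nd law: the complement of (A intersect B)
# equals the union of the complements.
assert comp(A & B) == comp(A) | comp(B)
```

Checking one example proves nothing, of course, which is why the element-chasing proof above is needed; the code merely makes the identities concrete.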
Example 2
Axiom 1
P(A) ≥ 0 for every event A
Axiom 2
P(S) = 1
Axiom 3
If $A_1, A_2, \ldots$ are disjoint events, then $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$
These three conditions are known as the three axioms of probability. They do not
completely specify P, but merely ensure that P is sensible. It remains for P to be
precisely defined in any given situation. Typically, P is defined by assigning
reasonable probabilities to each of the sample points (or simple events) in S.
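This assignment-and-summation recipe can be pictured as a table of sample-point probabilities, with the probability of any event obtained by summing. The sketch below uses a fair die as an assumed illustrative example (not one from the notes):

```python
# Probabilities assigned to each sample point of a fair die
# (an assumed example).
p = {1: 1/6, 2: 1/6, 3: 1/6, 4: 1/6, 5: 1/6, 6: 1/6}

def P(event):
    """Probability of an event: the sum of the probabilities
    of the sample points it contains."""
    return sum(p[s] for s in event)

S = set(p)                                # the sample space
assert abs(P(S) - 1) < 1e-12              # Axiom 2 holds
assert abs(P({2, 4, 6}) - 0.5) < 1e-12    # pr of an even number
```

Note that summing over no sample points gives P(∅) = 0 automatically, and P is nonnegative because every assigned probability is.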
Example 3
Theorem 1
P(∅) = 0.
Theorem 2
If $A_1, \ldots, A_n$ are disjoint events, then $P(A_1 \cup \cdots \cup A_n) = P(A_1) + \cdots + P(A_n)$.
Pf: Apply Axiom 3 and Thm 1, with $A_i = \emptyset$ for all $i = n+1, n+2, \ldots$
Theorem 3
$P(\bar{A}) = 1 - P(A)$.
Pf: 1 = P(S) by Axiom 2
= $P(A \cup \bar{A})$ by the definition of complementation
= $P(A) + P(\bar{A})$ by Thm 2 with n = 2, since A and $\bar{A}$ are disjoint.
Theorem 4
P(A) ≤ 1.
If A ⊆ B, then P(A) ≤ P(B).
There are many other such results we could write down and prove. However, we now
have enough machinery to apply our theory of probability to practical problems.
The following is one of two basic strategies for computing probabilities.
1. Define the experiment.
2. List the sample points and thereby specify the sample space S.
3. Assign reasonable probabilities to the sample points.
4. Define the event of interest, A, as a particular collection of sample points.
5. Find P(A) by summing the probabilities of the sample points in A.
Example 4
1.
The experiment consists of tossing a coin twice and each time noting whether
a head or tail comes up.
2.
Let HT denote heads on the 1st toss and tails on the 2nd,
let HH denote 2 heads, etc.
Then the sample points are HH, HT, TH, TT. So the simple events are
E1 = {HH}, E2 = {HT}, E3 = {TH}, E4 = {TT},
and the sample space is S = E1 ∪ E2 ∪ E3 ∪ E4 = {HH, HT, TH, TT}.
3.
Since the coin is fair, it is reasonable to assign probability 1/4 to each sample
point, so that P(E1) = P(E2) = P(E3) = P(E4) = 1/4.
4.
Let A be the event that exactly one head comes up. Then A = E2 ∪ E3 = {HT, TH}.
5.
Therefore P(A) = P(E2) + P(E3) = 1/4 + 1/4 = 1/2.
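The five steps of Example 4 can be mirrored in a short sketch (Python; the variable names are illustrative, and the event A is taken as "exactly one head", which matches the sample points HT and TH):

```python
from itertools import product

# Steps 1-2: the experiment (two tosses) and its sample points.
sample_points = ["".join(t) for t in product("HT", repeat=2)]
# sample_points is ["HH", "HT", "TH", "TT"]

# Step 3: a fair coin makes all four sample points equally likely.
p = {s: 1/4 for s in sample_points}

# Step 4: A = exactly one head on the two tosses.
A = {s for s in sample_points if s.count("H") == 1}   # {"HT", "TH"}

# Step 5: sum the sample-point probabilities.
P_A = sum(p[s] for s in A)
print(P_A)   # 0.5
```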
In practice we often leave out details, such as brackets (mentioned earlier) and even
whole steps. Thus the above solution may be simplified as follows:
2. S = {HH, HT, TH, TT}.
3. Each sample point has probability 1/4.
5. P(A) = P(HT) + P(TH) = 1/4 + 1/4 = 1/2.
Note that since all the sample points are equally likely, we may also think of P(A) as
nA / nS , where nA = 2 is the number of sample points in A and nS = 4 is the number of
sample points in S. Thus P(A) = 2/4 = 1/2, as before.
Example 5
Example 6
We see that listing all the sample points in this case is tedious and impractical.
What we need here are some tools for counting sample points easily, and that leads us
to the next topic, combinatorics.
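As a preview, the counting that combinatorics streamlines can still be done by brute force when the sample space is small, eg for the earlier question of getting 2 heads on 5 tosses of a fair coin (an exhaustive enumeration, not the combinatorial method of the next topic):

```python
from itertools import product

# Enumerate all 2^5 = 32 equally likely outcomes of 5 tosses.
outcomes = list(product("HT", repeat=5))
nS = len(outcomes)                               # 32
nA = sum(o.count("H") == 2 for o in outcomes)    # exactly 2 heads

print(nA, nS, nA / nS)   # 10 32 0.3125
```

Combinatorics gives nA = 10 directly (as the number of ways to choose which 2 of the 5 tosses are heads) without listing anything, which is the point of the next topic.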