Unit 7: Probability Theory
7.1. Introduction.
Suppose now that you are asked to quote the probability of R, and your answer is
P(R) = 0.7.
There are two main interpretations of this number. In the first, 0.7 represents the odds
in favor of R: this is the subjective probability, which measures your personal belief in
R. In the second, objective probability, P(R) = 0.7 is interpreted as a relative frequency.
Suppose, for instance, that it rained on 16 January in 7 of the last ten years.
Then 0.7 = 7/10 is the relative frequency of occurrences of R, also given
by the ratio between the favorable cases (7) and all possible cases (10).
There are other interpretations of P(R) = 0.7 arising, for instance, from logic or
psychology.
Example
With respect to S1, describe the event B of rolling a total of 7 with the two dice.
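As a rough sketch of this example, the event B can be listed by enumeration. The definition of S1 is not shown in this section, so the code assumes that S1 is the set of the 36 ordered pairs (first die, second die) of two six-sided dice; the variable names are chosen only for illustration.

```python
from itertools import product

# Assumption: S1 is the set of ordered pairs (first die, second die).
S1 = set(product(range(1, 7), repeat=2))

# Event B: the two dice show a total of 7.
B = {(d1, d2) for (d1, d2) in S1 if d1 + d2 == 7}

print(sorted(B))
print(len(B), "favorable cases out of", len(S1))
```

Under this assumption B is a subset of S1, which is what "describing an event" amounts to.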
- Complement of A (written Ā or A^C): all elements of Ω that are not in A;
Ā ≡ A^C = {ω∈Ω / ω∉A}
A ∪ B = {ω∈Ω / ω∈A or ω∈B}
* (A ∪ B)^C = A^C ∩ B^C
* (A ∩ B)^C = A^C ∪ B^C
- Difference between two events A and B (A − B): A occurs and B does not;
A − B = {ω∈Ω / ω∈A and ω∉B} = A ∩ B^C
[Figure: Venn diagrams of the events A and B illustrating the difference A − B.]
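The operations on events can be tried out with Python sets. This is only an illustrative sketch: the sample space (one roll of a die) and the events A and B are hypothetical choices, not taken from the unit.

```python
# Hypothetical sample space and events, chosen only for illustration.
omega = set(range(1, 7))   # outcomes of one roll of a die
A = {2, 4, 6}              # "the result is even"
B = {4, 5, 6}              # "the result is greater than 3"

def complement(event, space=omega):
    """All elements of the sample space that are not in the event."""
    return space - event

union = A | B
intersection = A & B
difference = A - B         # A occurs and B does not

# De Morgan's laws and the identity A - B = A ∩ B^C, checked on this example.
de_morgan_1 = complement(A | B) == complement(A) & complement(B)
de_morgan_2 = complement(A & B) == complement(A) | complement(B)
difference_identity = (A - B) == (A & complement(B))

print(union, intersection, difference)
print(de_morgan_1, de_morgan_2, difference_identity)
```

A check on one example is not a proof, but it helps to see what each identity says.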
i) P(A) ≥ 0 ∀ A ∈ S
ii) P(Ω) = 1
iii) If {Ai}, i = 1, 2, ..., are disjoint sets in S (Ai ∈ S ∀i; Ai ∩ Aj = ∅ ∀i ≠ j), then:
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
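On a finite sample space the three conditions can be checked by brute force. The sketch below assumes a fair six-sided die with equally likely outcomes and takes S to be the family of all subsets of Ω; both are illustrative assumptions. It only tests additivity for pairs of disjoint events, which is weaker than the countable additivity stated in iii).

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumption: a fair die, so every outcome has the same probability.
omega = frozenset(range(1, 7))

def P(event):
    """Favorable cases over possible cases."""
    return Fraction(len(event), len(omega))

def all_events(space):
    """Every subset of the sample space (here S is the power set)."""
    items = sorted(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]

S = all_events(omega)

axiom_i = all(P(A) >= 0 for A in S)
axiom_ii = P(omega) == 1
# Additivity, checked only for pairs of disjoint events.
axiom_iii = all(
    P(A | B) == P(A) + P(B) for A in S for B in S if not (A & B)
)

print(axiom_i, axiom_ii, axiom_iii)
```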
7.5. Probability space.
Theorem 1.- An event that never occurs (the impossible event) has
probability 0: P( ∅ ) = 0.
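Proof.
Consider the sequence of events {Ai}, i = 1, 2, ..., with Ai ∈ S and Ai = ∅ ∀i. These events are disjoint, so by iii),
$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{\infty} P(\emptyset)$$
But the union of these events is ∅, so the left-hand side equals P(∅). Consequently:
$$P(\emptyset) = \sum_{i=1}^{\infty} P(\emptyset)$$
and by i) the unique solution is P(∅) = 0.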
Proof.
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i)$$
Proof.
As A ⊂ B, we can write B = A ∪ (B − A). Then A and B − A are disjoint, and applying
theorem 2 we obtain P(B) = P(A) + P(B − A). Since P(B − A) ≥ 0 by i):
P(A) ≤ P(B)
Proof.
Theorem 6.- Let A ∈ S. Then P(A) = 1 − P(Ā).
Proof.: A and Ā are disjoint and A ∪ Ā = Ω, so applying theorem 2:
P(A ∪ Ā) = P(A) + P(Ā) = P(Ω) = (by ii) = 1. So P(A) = 1 − P(Ā).
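The properties derived in this section can be illustrated numerically. The following is a small sketch on a hypothetical example (a fair die, with events A ⊂ B picked arbitrarily); it checks the statements on one case and does not replace the proofs.

```python
from fractions import Fraction

# Assumption: a fair die with equally likely outcomes.
omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

A = frozenset({1, 2})
B = frozenset({1, 2, 3, 5})   # chosen so that A is a subset of B

p_empty = P(frozenset())                     # impossible event
decomposition = P(B) == P(A) + P(B - A)      # B = A ∪ (B − A), disjoint pieces
monotone = P(A) <= P(B)                      # A ⊂ B implies P(A) ≤ P(B)
complement_rule = P(A) == 1 - P(omega - A)   # P(A) = 1 − P(Ā)

print(p_empty, decomposition, monotone, complement_rule)
```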
Remarks: The basic difficulty with the classical and frequency definitions of
probability is that their approach is to try somehow to prove mathematically that,
for example, the probability of picking a heart from a perfectly shuffled deck is ¼,
or that the probability of an unbiased coin coming up heads is ½. This cannot be
done. All we can say is that if a card is picked at random and then replaced, and
the process is repeated over and over again, the result that the ratio of hearts to the total
number of drawings will be close to ¼ is in accord with our intuition and our
physical experience. For this reason we should assign a probability of ¼ to the
event of obtaining a heart, and similarly we should assign a probability of 1/52 to
each possible outcome of the experiment. The only reason for doing this is that the
consequences agree with our experience. If you decided that some mysterious
factor caused the ace of spades to be more likely than any other card, you could
incorporate this factor by assigning a higher probability to the ace of spades. The
mathematical development of the theory would not be affected; however, the
conclusions you might draw from this assumption would be at variance with
experimental results.
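The repeated experiment described in these remarks (pick a card at random, replace it, repeat) can be imitated with a pseudo-random simulation. This is only a sketch: the function name `heart_frequency` and its `seed` parameter are illustrative choices, and a pseudo-random generator stands in for a perfectly shuffled deck.

```python
import random

def heart_frequency(n_draws, seed=0):
    """Relative frequency of hearts in n_draws drawings with replacement."""
    rng = random.Random(seed)
    suits = ["hearts", "diamonds", "clubs", "spades"]
    # A 52-card deck represented by its suits: 13 cards of each suit.
    deck = [suit for suit in suits for _ in range(13)]
    hearts = sum(1 for _ in range(n_draws) if rng.choice(deck) == "hearts")
    return hearts / n_draws

for n in (100, 10_000, 100_000):
    print(n, heart_frequency(n))
```

As the number of drawings grows, the printed ratio should settle near ¼, which is the experience the assignment of probabilities is meant to match.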
In probability theory we are faced with situations in which our intuition or some
physical experiments we have carried out suggest certain results. Intuition and
experience lead us to an assignment of probabilities to events. As far as the
mathematics is concerned, any assignment of probabilities will do, subject to rules
of mathematical consistency. However, our hope is to develop mathematical results
that, when interpreted and related to physical experience, will help to make precise
such notions as “the ratio of the number of heads to the total number of
observations in a very large number of independent tosses of an unbiased coin is
very likely to be very close to ½”.
JOSÉ A. GARCÍA CÓRDOBA DEPARTMENT OF QUANTITATIVE METHODS AND COMPUTER
FACULTAD DE CIENCIAS DE LA EMPRESA
UNIVERSIDAD POLITÉCNICA DE CARTAGENA
7.6. Independence.
Consider the following experiment. A person is selected at random and his height is
recorded. After this the last digit of the license number of the next car to pass is
noted. If A is the event that the height is over 6 feet, and B is the event that the
digit is ≥ 7, then, intuitively, A and B are “independent” in the sense that
knowledge about the occurrence or nonoccurrence of one of the events should not
influence the odds about the other. For example, say that P(A) = 0.2 and P(B) = 0.3.
In a long sequence of trials we would expect the following situation.
(Roughly) 20% of the time A occurs; of those cases in which A occurs:
    30% B occurs
    70% B does not occur
80% of the time A does not occur; of these cases:
    30% B occurs
    70% B does not occur
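The long-run picture above can be turned into numbers. The sketch below takes the values P(A) = 0.2 and P(B) = 0.3 from the example and assumes that A and B are independent, so each joint frequency is a product; the dictionary layout is just one convenient way to hold the four cases.

```python
from fractions import Fraction

p_A = Fraction(2, 10)   # P(A) from the example
p_B = Fraction(3, 10)   # P(B) from the example

# Expected long-run fractions of trials, assuming A and B are independent.
joint = {
    ("A", "B"): p_A * p_B,
    ("A", "not B"): p_A * (1 - p_B),
    ("not A", "B"): (1 - p_A) * p_B,
    ("not A", "not B"): (1 - p_A) * (1 - p_B),
}

# Fraction of the trials in which B occurs, among those where A does / does not.
b_within_A = joint[("A", "B")] / p_A
b_within_not_A = joint[("not A", "B")] / (1 - p_A)

for case, p in joint.items():
    print(case, p)
print(b_within_A, b_within_not_A)
```

B occurs in the same fraction of trials whether or not A occurs, which is the intuitive content of independence.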
Generalization: A set of events {Ai} is independent if and only if, for every finite subcollection of the events:
$$P\left(\bigcap_{i} A_i\right) = \prod_{i} P(A_i)$$
occurrence of H. Comparing P(A/H) with P(A) will indicate the difference between
the odds about A when H is known to have occurred, and the odds about A before
any information about H is revealed.
The above discussion suggests that we define the Conditional Probability of A
given H as
$$P(A / H) = \frac{P(A \cap H)}{P(H)}$$
This makes sense if P(H) > 0.
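A short sketch of the definition on a hypothetical example: two fair dice, with A = "the total is 8" and H = "the first die is even". The sample space, the events and the helper name `conditional` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Assumption: two fair dice, all 36 ordered pairs equally likely.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event), len(omega))

def conditional(A, H):
    """P(A / H) = P(A ∩ H) / P(H), defined only when P(H) > 0."""
    if P(H) == 0:
        raise ValueError("P(A / H) is only defined when P(H) > 0")
    return P(A & H) / P(H)

A = {w for w in omega if sum(w) == 8}      # the total is 8
H = {w for w in omega if w[0] % 2 == 0}    # the first die is even

print(P(A), conditional(A, H))
```

Comparing the two printed values shows how knowing that H occurred changes the odds about A in this example.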
$$P\left(\bigcap_{i=1}^{n} A_i\right) = P(A_1)\, P(A_2 / A_1)\, P(A_3 / A_1 \cap A_2) \cdots P\left(A_n \Big/ \bigcap_{i=1}^{n-1} A_i\right)$$
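The chain of conditional probabilities in the multiplication rule can be compared with a direct count. The example is hypothetical: three cards are drawn without replacement from a 52-card deck, and Ai is the event "the i-th card is an ace".

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(A1) P(A2 / A1) P(A3 / A1 ∩ A2).
chain_rule = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

# Direct count over all ordered draws of three distinct cards.
deck = ["ace"] * 4 + ["other"] * 48
total = 0
favorable = 0
for draw in permutations(range(52), 3):
    total += 1
    if all(deck[i] == "ace" for i in draw):
        favorable += 1
direct = Fraction(favorable, total)

print(chain_rule, direct)
```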
i) P(Ai) > 0, i = 1,...,n
ii) A1 ∪ A2 ∪ ... ∪ An = Ω
iii) Ai ∩ Aj = ∅ ∀i ≠ j
$$P(A) = \sum_{i=1}^{n} P(A_i)\, P(A / A_i)$$
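$$A = A \cap \Omega = \text{(by ii)} = A \cap \left(\bigcup_{i=1}^{n} A_i\right) = \bigcup_{i=1}^{n} (A \cap A_i)$$
Then:
$$P(A) = P\left(\bigcup_{i=1}^{n} (A \cap A_i)\right) = \text{(by iii)} = \sum_{i=1}^{n} P(A \cap A_i) = \text{(Theorem 8)} = \sum_{i=1}^{n} P(A_i)\, P(A / A_i)$$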
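The total probability formula can be checked on a small case. In this sketch the sample space (a fair die), the partition A1, A2, A3 and the event A are hypothetical choices made so that conditions i), ii) and iii) hold.

```python
from fractions import Fraction

# Assumption: a fair die with equally likely outcomes.
omega = frozenset(range(1, 7))

def P(event):
    return Fraction(len(event), len(omega))

def conditional(A, H):
    return P(A & H) / P(H)

# A partition of omega: positive probabilities, disjoint, union equal to omega.
partition = [frozenset({1, 2}), frozenset({3, 4, 5}), frozenset({6})]
A = frozenset({2, 3, 6})

total = sum(P(Ai) * conditional(A, Ai) for Ai in partition)

print(total, P(A))
```

Both printed values coincide: weighting each P(A / Ai) by P(Ai) and adding recovers P(A).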
Bayes’ Theorem.
Let A1, A2, ..., An ∈ S with:
i) P(Ai) > 0, i = 1,...,n
ii) A1 ∪ A2 ∪ ... ∪ An = Ω
iii) Ai ∩ Aj = ∅ ∀i ≠ j
$$P(A_j / B) = \frac{P(A_j)\, P(B / A_j)}{\sum_{i=1}^{n} P(A_i)\, P(B / A_i)}$$
Proof.:
$$P(A_j / B) = \frac{P(A_j \cap B)}{P(B)} = \text{(multiplication rule)} = \frac{P(A_j)\, P(B / A_j)}{P(B)} = \text{(Th. Prob. Total)} = \frac{P(A_j)\, P(B / A_j)}{\sum_{i=1}^{n} P(A_i)\, P(B / A_i)}$$
This formula is sometimes called an “a posteriori probability”. The reason for
this terminology may be seen in the example below:
Example: Two coins are available, one unbiased and the other two-
headed. Choose a coin at random and toss it once; assume that the unbiased
coin is chosen with probability ¾. Given that the result is heads, find the
probability that the two-headed coin was chosen.
We may take Ω to consist of the four possible paths through the tree, with
each path assigned a probability equal to the product of the probabilities
assigned to each branch. Notice that we are given the probabilities of the
events B1=(unbiased coin chosen) and B2=(two-headed coin chosen), as well
as the conditional probabilities P(A/Bi), where A=(coin comes up heads). This
is sufficient to determine the probabilities of all events.
Now we can compute P(B2/A) using Bayes’ Theorem; this is facilitated if,
instead of trying to identify the individual terms in the formula, we simply
look at the tree and write:
$$P(B_2 / A) = \frac{(1/4)(1)}{(3/4)(1/2) + (1/4)(1)} = \frac{2}{5}$$
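The same computation can be written as a small function. This is a minimal sketch: the name `bayes_posterior` and the list-based layout are illustrative choices, while the numbers are those of the two-coin example (B1 = unbiased coin, B2 = two-headed coin, A = heads).

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods, j):
    """P(B_j / A), given the priors P(B_i) and the likelihoods P(A / B_i)."""
    # Denominator: P(A) by the total probability formula.
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[j] * likelihoods[j] / evidence

priors = [Fraction(3, 4), Fraction(1, 4)]       # P(B1), P(B2)
likelihoods = [Fraction(1, 2), Fraction(1)]     # P(A / B1), P(A / B2)

posterior_two_headed = bayes_posterior(priors, likelihoods, 1)
print(posterior_two_headed)  # 2/5
```

Before the toss the two-headed coin has probability ¼; after observing heads it rises to 2/5, which is why the result is called an a posteriori probability.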
Bibliography