Section 7 (Discrete Probability Distributions)
Section 7: Outline
General Probability Distributions
Binomial Distribution
Negative Binomial Distribution
Poisson Distribution
GENERAL DISCRETE PROBABILITY DISTRIBUTIONS
Random Variables
Def: A random variable is a variable whose value is determined
by chance.
Random variables may either be
◦ discrete (i.e. can only take on certain values, typically whole number
values) or
◦ continuous (can take on any value, possibly within a certain range).
Example
◦ If you roll a die, the value on the die is a discrete random variable.
◦ If you flip a coin 10 times, the number of times you get “heads” is a
discrete random variable.
◦ If you measure the length of 30 bolts, the length of each bolt is a
continuous random variable.
Probability Distributions
Def: A probability distribution gives us the probability
associated with each value (or range of values) a random
variable can take on.
So, a discrete probability distribution would give you:
◦ All the different possible outcomes for a given discrete random
variable
◦ The probabilities associated with each of these possible
outcomes.
IMPORTANT: It is always okay to use the standard
probability rules. The distribution formulas are used to
simplify the calculations.
Notation
For a random variable, 𝕏, the probability that it takes on
any one specific value, 𝑥𝑖 is written as: 𝑃(𝕏 = 𝑥𝑖 )
𝕏 P(𝕏=xi)
1 1/6
2 1/6
3 1/6
4 1/6
5 1/6
6 1/6
Example
Six lots of components are ready to be shipped by a certain supplier.
The number of defective components in each lot is:
Lot 1 2 3 4 5 6
Number of defective components 0 2 3 1 2 4
Mean (expected value): $\mu = \sum_{\text{all } i} x_i \cdot P(\mathbb{X} = x_i)$
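As a minimal sketch, the mean formula can be applied to the lot example. The slides do not state how the random variable is defined, so the code assumes one of the six lots is picked at random (each with probability 1/6) and that the random variable is the number of defective components in that lot. The names `defects_per_lot` and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Defective counts for lots 1 to 6, copied from the table in the example.
defects_per_lot = [0, 2, 3, 1, 2, 4]

# Assumption: one lot is chosen at random, each with probability 1/6.
counts = Counter(defects_per_lot)
pmf = {x: Fraction(c, len(defects_per_lot)) for x, c in sorted(counts.items())}

# Mean of a discrete distribution: sum of x_i * P(X = x_i) over all i.
mean = sum(x * p for x, p in pmf.items())

for x, p in pmf.items():
    print(f"P(X = {x}) = {p}")
print("mean =", mean)
```

Exact fractions are used so the probabilities can be checked by hand against the table.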
Variance of a Probability Distribution
Def: The variation in the outcomes of a discrete probability
distribution (how much the outcomes differ from one another)
is measured by the variance or standard deviation of the
distribution, given by the following formulas:
Variance: $\sigma^2 = \sum_{\text{all } i} \left[ x_i^2 \cdot P(\mathbb{X} = x_i) \right] - \mu^2$
Standard Deviation: $\sigma = \sqrt{\sigma^2}$
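The variance and standard deviation formulas can be sketched in a few lines. The example below applies them to the fair die table from the Notation slide; the helper name `mean_and_variance` is hypothetical.

```python
from fractions import Fraction
from math import sqrt

def mean_and_variance(pmf):
    """Return (mu, sigma^2) for a dict mapping each value x_i to P(X = x_i)."""
    mu = sum(x * p for x, p in pmf.items())
    # Shortcut form: sum of x_i^2 * P(X = x_i), minus mu^2.
    var = sum(x**2 * p for x, p in pmf.items()) - mu**2
    return mu, var

# Fair six-sided die: every face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mu, var = mean_and_variance(die)
sigma = sqrt(var)  # standard deviation is the square root of the variance
print(mu, var, sigma)
```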
Example
If you bet $1 in Kentucky’s Pick 4 lottery game, you either
lose $1 or gain $4999.
The game is played by selecting a four-digit number
between 0000 and 9999.
If you bet $1 on 1234, what is your expected value of gain or
loss?
Also, calculate the standard deviation for this distribution.
Let 𝕏 be the amount of money we win.
𝕏 can take 2 values:
◦ $ -1
◦ $ 4999
Example (Continued)
𝕏       P(𝕏 = x_i)     x_i² · P(𝕏 = x_i)
−1      9999/10000     (−1)² · 9999/10000 = 0.9999
4999    1/10000        4999² · 1/10000 = 2499.0001
Sum     1              2500
$\mu = -1 \cdot \frac{9999}{10000} + 4999 \cdot \frac{1}{10000} = -0.50$ (an expected loss of $0.50)
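The Pick 4 calculation, including the standard deviation the example asks for, can be reproduced with a short sketch. It uses only the two outcomes and probabilities from the table; the variable names are illustrative.

```python
from math import sqrt

# Gain in dollars mapped to its probability (from the Pick 4 table).
pmf = {-1: 9999 / 10000, 4999: 1 / 10000}

mu = sum(x * p for x, p in pmf.items())                 # expected gain
second_moment = sum(x**2 * p for x, p in pmf.items())   # sum of x^2 * P(x)
variance = second_moment - mu**2
sigma = sqrt(variance)

print(f"mu = {mu:.2f}, variance = {variance:.2f}, sigma = {sigma:.4f}")
```

The standard deviation is very large compared with the mean because a rare, large win is mixed with a near-certain small loss.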
$\mathbb{X}_B = x_i$    $P(\mathbb{X}_B = x_i)$
0                       0.3 × 0.3 = 0.09
1                       0.42
2                       0.7 × 0.7 = 0.49
$P(\mathbb{X}_B = 1) = P(HT \text{ or } TH) = P(HT) + P(TH) = 0.7 \times 0.3 + 0.3 \times 0.7 = 0.42$
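The table above can be rebuilt by listing every outcome of the two flips. This sketch assumes, as the arithmetic in the table suggests, that the random variable counts heads in two independent flips with P(H) = 0.7; the setup itself is not stated in the surviving text.

```python
from itertools import product

p_heads = 0.7  # assumption inferred from the table (P(HT) = 0.7 * 0.3)
prob = {"H": p_heads, "T": 1 - p_heads}

# Accumulate the probability of each outcome under its number of heads.
pmf = {0: 0.0, 1: 0.0, 2: 0.0}
for outcome in product("HT", repeat=2):  # HH, HT, TH, TT
    p_outcome = prob[outcome[0]] * prob[outcome[1]]  # independent flips
    pmf[outcome.count("H")] += p_outcome

for heads, p in pmf.items():
    print(f"P(X_B = {heads}) = {p:.2f}")
```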
Conditions for Binomial Distributions
The following conditions need to be satisfied to use the binomial distribution:
1. An “experiment” consists of n identical trials (n is a fixed number).
2. The outcome of each trial can be classified as a “success” or a “failure”.
3. Trials are independent (recall: this means that the outcome of any individual trial
does not affect the outcome of any other trials).
4. The probability of “success” remains constant.
5. The variable of interest is the number of successes (𝑥) in n trials. These successes can
happen in any order.
Formula
Things we need to know to use the binomial distribution formula:
n = the number of trials
p = the probability of “success” on any one given trial
𝑥 = the number of successes whose probability we are interested in
finding
Then, the probability of getting 𝑥 successes in n trials:
$P(\mathbb{X}_B = x) = {}_nC_x \cdot p^x \cdot (1-p)^{n-x}$
where
◦ ${}_nC_x$ is given by the following formula: ${}_nC_x = \dfrac{n!}{(n-x)!\,x!}$
◦ In Excel, $P(\mathbb{X}_B = x)$ = binom.dist(x, n, p, 0)
◦ $P(\mathbb{X}_B \le x)$ = binom.dist(x, n, p, 1)
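For readers working outside Excel, the binomial formula can be sketched in Python with the standard library. The function names `binom_pmf` and `binom_cdf` are hypothetical; they are meant to play the roles of binom.dist with the last argument set to 0 and 1 respectively.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X_B = x): probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """P(X_B <= x): sum of the pmf from 0 up to and including x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Weighted coin: 3 flips, 60% chance of heads, exactly 2 heads.
print(binom_pmf(2, 3, 0.6))
```

`math.comb(n, x)` computes the same quantity as the factorial formula for nCx without forming large factorials explicitly.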
Example
Use the binomial distribution formula to find the probability of getting 2 heads when
flipping a weighted coin 3 times. Assume that each time you flip the coin, you have a 60%
probability of getting heads.
Solution:
◦ $\mathbb{X}_B$: the number of heads
◦ $P(\mathbb{X}_B = 2) = {}_3C_2 \cdot 0.6^2 \cdot (1-0.6)^{3-2} = 3 \cdot 0.36 \cdot 0.4 = 0.432$
Solution:
◦ a) P(jam 4 days in a row) = P(jam the 1st day AND jam the 2nd day AND jam the 3rd day AND jam
the 4th day)
$= 0.15^4 \approx 0.0005$
◦ b) $\mathbb{X}_B$: number of times the equipment jams, $p = 0.15$, $n = 4$
◦ $P(\mathbb{X}_B = 1)$ = binom.dist(1, 4, 0.15, 0) = 0.3685
Example Continued
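c) $P(\mathbb{X}_B \ge 1) = P(\mathbb{X}_B = 1 \text{ or } \mathbb{X}_B = 2 \text{ or } \mathbb{X}_B = 3 \text{ or } \mathbb{X}_B = 4)$
$= P(\mathbb{X}_B = 1) + P(\mathbb{X}_B = 2) + P(\mathbb{X}_B = 3) + P(\mathbb{X}_B = 4)$
$= 1 - P(\mathbb{X}_B < 1) = 1 - P(\mathbb{X}_B = 0)$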
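The three parts of the jam example can be checked numerically. The problem statement is not in the surviving text, so this sketch takes its inputs from the solution itself: a 15% chance of a jam on any given day and 4 independent days. `binom_pmf` is a hypothetical helper implementing the binomial formula.

```python
from math import comb

p_jam = 0.15   # probability of a jam on any one day (from the solution)
n_days = 4     # number of days considered

def binom_pmf(x, n, p):
    """P(X_B = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_all_four = p_jam ** n_days                      # a) jams on all 4 days
p_exactly_one = binom_pmf(1, n_days, p_jam)       # b) exactly 1 jam
p_at_least_one = 1 - binom_pmf(0, n_days, p_jam)  # c) complement of 0 jams

print(round(p_all_four, 4), round(p_exactly_one, 4), round(p_at_least_one, 4))
```

Part c) uses the complement rule, which needs one pmf evaluation instead of four.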
Mean and standard deviation of a binomial distribution: $\mu = n \cdot p$, $\sigma = \sqrt{np(1-p)}$
Note: These shortcut formulas follow from the general formulas for discrete
probability distributions.
For instance:
$\mu = \sum_i x_i \cdot P(\mathbb{X}_B = x_i) = \sum_i x_i \cdot {}_nC_{x_i} \cdot p^{x_i} \cdot (1-p)^{n-x_i} = n \cdot p$
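The claim that the general summation collapses to the shortcut can be checked numerically. The sketch below is a spot check for one choice of n and p, not a proof; the values n = 10 and p = 0.3 are illustrative and do not come from the slides.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X_B = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3  # illustrative values only

pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# General formulas for any discrete distribution.
mu_general = sum(x * px for x, px in enumerate(pmf))
var_general = sum(x**2 * px for x, px in enumerate(pmf)) - mu_general**2

# Binomial shortcut formulas.
mu_shortcut = n * p
sigma_shortcut = sqrt(n * p * (1 - p))

print(mu_general, mu_shortcut)
print(sqrt(var_general), sigma_shortcut)
```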
Example
Suppose that a mechanical component from a particular supplier fails a strength test 15% of the
time.
Assume that we randomly select 25 components from this supplier and answer the following
questions:
How many components would you expect to fail the strength test?
What is the probability that at least 2 components fail the strength test?
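One way to work this example is sketched below, treating the 25 components as independent trials with p = 0.15 so that the binomial conditions hold. The helper `binom_pmf` is hypothetical.

```python
from math import comb

n = 25     # components selected
p = 0.15   # probability a component fails the strength test

def binom_pmf(x, n, p):
    """P(X_B = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Expected number of failures uses the shortcut mu = n * p.
expected_failures = n * p

# "At least 2" is the complement of "0 or 1" failures.
p_at_least_two = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)

print(f"expected failures = {expected_failures}")
print(f"P(X_B >= 2) = {p_at_least_two:.4f}")
```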
NEGATIVE BINOMIAL DISTRIBUTION
Example
A doctor wishes to recruit 5 persons to participate in a study.
Let p=P(a randomly selected person agrees to participate)
If p = 0.2, what is the probability that we need to ask 15 persons before we find 5 who agree to
participate in the study?
Negative Binomial Distribution
Conditions:
◦ The experiment consists of a sequence of independent trials.
◦ Each trial can result in either a success (S) or a failure (F).
◦ The probability of success, p, is constant from trial to trial.
◦ The experiment continues until a total of r successes have been observed, where r is
a specified positive integer.
◦ The random variable of interest 𝕏𝑁𝐵 is the number of failures that precede the rth
success.
◦ 𝕏𝑁𝐵 is called a negative binomial random variable because in contrast to the
binomial random variable, the number of successes is fixed and the number of trials
is random.
◦ The probability distribution of 𝕏𝑁𝐵 depends on the parameters r and p:
◦ $P(\mathbb{X}_{NB} = x) = \text{combin}(x + r - 1,\ r - 1) \cdot p^r \cdot (1-p)^x$
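The negative binomial formula can be sketched as follows and applied to the doctor's recruitment example. The sketch assumes "ask 15 persons" means the 5th agreement comes on the 15th person asked, so there are x = 15 − 5 = 10 failures before the 5th success. `negbinom_pmf` is a hypothetical helper name.

```python
from math import comb

def negbinom_pmf(x, r, p):
    """P(X_NB = x): probability of x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

r = 5          # successes (people who agree) needed
p = 0.2        # probability a randomly selected person agrees
asked = 15     # total people asked, with the last one being the 5th to agree
failures = asked - r  # assumption: 10 refusals precede the 5th agreement

prob = negbinom_pmf(failures, r, p)
print(f"P(X_NB = {failures}) = {prob:.4f}")
```

The combination term counts the ways to arrange the first r − 1 successes among the first x + r − 1 trials; the final trial is always the r-th success.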
$P(\mathbb{X}_{NB} \ge 290) = 1 - P(\mathbb{X}_{NB} \le 289)$
= 1 − negbinom.dist(289, 10, 0.02, 1)
≈ 0.9195
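The cumulative calculation above can be reproduced without Excel by summing the pmf. The parameters r = 10 and p = 0.02 are read off the negbinom.dist call; the function names are hypothetical.

```python
from math import comb

def negbinom_pmf(x, r, p):
    """P(X_NB = x): probability of x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

def negbinom_cdf(x, r, p):
    """P(X_NB <= x): sum of the pmf from 0 up to and including x."""
    return sum(negbinom_pmf(k, r, p) for k in range(x + 1))

r, p = 10, 0.02  # taken from negbinom.dist(289, 10, 0.02, 1)

# Complement rule: P(X_NB >= 290) = 1 - P(X_NB <= 289).
p_at_least_290 = 1 - negbinom_cdf(289, r, p)
print(f"P(X_NB >= 290) = {p_at_least_290:.4f}")
```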
POISSON DISTRIBUTION
Poisson Distribution
The Poisson distribution is a discrete probability
distribution that applies to the number of occurrences
of some event in a specific interval.
The interval can be time, distance, area, volume, etc.
Examples:
◦ number of units being rejected at a quality control station
per hour
◦ number of lightning strikes per day
◦ number of rust spots per square meter of a sheet of metal
Conditions for Poisson
The random variable 𝕏P is the number of occurrences of an event over some
interval.
◦ The occurrences must be random.
◦ The occurrences must be independent of each other.
◦ The occurrences must be uniformly distributed over the interval being used.
𝑥 = number of occurrences in an interval
𝜇 = mean number of occurrences in an interval
The probability distribution of 𝕏𝑃 depends only on the parameter 𝜇:
$P(\mathbb{X}_P = x) = \dfrac{e^{-\mu} \cdot \mu^x}{x!}$
In Excel, the function is $P(\mathbb{X}_P = x)$ = poisson.dist(x, μ, 0)
$P(\mathbb{X}_P \le x)$ = poisson.dist(x, μ, 1)
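The Poisson formula translates directly into Python. The function names `poisson_pmf` and `poisson_cdf` are hypothetical; they are meant to correspond to poisson.dist with the last argument set to 0 and 1. The value μ = 2.0 in the usage line is illustrative only.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X_P = x): probability of exactly x occurrences in the interval."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x, mu):
    """P(X_P <= x): sum of the pmf from 0 up to and including x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Illustrative usage with mu = 2.0 occurrences per interval.
print(poisson_pmf(0, 2.0), poisson_cdf(3, 2.0))
```

This direct form works for small to moderate μ and x; for very large values, `mu**x` can overflow, and working with logarithms is safer.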
Example
Suppose that the average number of accidents at a mechanics shop is 4 accidents per year.
Suppose that the accidents are uniformly distributed from year to year.
◦ What is the probability that there will be exactly 3 accidents in any given year?
◦ What is the probability that there will be exactly 2 accidents in any given year?
Example Continued
What is the probability that there will be fewer than 2 accidents in any
given year?
What is the probability that there will be more than 3 accidents in any
given year?
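The four accident questions can be worked with μ = 4 accidents per year, as sketched below. `poisson_pmf` is a hypothetical helper implementing the Poisson formula.

```python
from math import exp, factorial

mu = 4  # mean number of accidents per year

def poisson_pmf(x, mu):
    """P(X_P = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

p_exactly_3 = poisson_pmf(3, mu)
p_exactly_2 = poisson_pmf(2, mu)

# "Fewer than 2" means 0 or 1 accidents.
p_fewer_than_2 = poisson_pmf(0, mu) + poisson_pmf(1, mu)

# "More than 3" is the complement of 0, 1, 2, or 3 accidents.
p_more_than_3 = 1 - sum(poisson_pmf(k, mu) for k in range(4))

print(f"P(X = 3) = {p_exactly_3:.4f}")
print(f"P(X = 2) = {p_exactly_2:.4f}")
print(f"P(X < 2) = {p_fewer_than_2:.4f}")
print(f"P(X > 3) = {p_more_than_3:.4f}")
```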
Example
A previous experiment found that when submerged in a corrosive solution for 30 days, steel metal
plates developed an average of 540.43 holes per square meter.
If we randomly select a plate that is 1 square meter, what is the probability that it has more than
580 holes?
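With μ = 540.43, evaluating μ^x and x! directly overflows ordinary floating point, so this sketch computes each Poisson term through logarithms instead. That is an implementation choice made here, not something the slides prescribe; `poisson_pmf_log` is a hypothetical helper.

```python
from math import exp, lgamma, log

def poisson_pmf_log(x, mu):
    """P(X_P = x), computed as exp(-mu + x*ln(mu) - ln(x!)) to avoid overflow."""
    return exp(-mu + x * log(mu) - lgamma(x + 1))

mu = 540.43  # mean number of holes per square meter

# "More than 580" is the complement of "580 or fewer".
p_at_most_580 = sum(poisson_pmf_log(k, mu) for k in range(581))
p_more_than_580 = 1 - p_at_most_580

print(f"P(X_P > 580) = {p_more_than_580:.4f}")
```

`lgamma(x + 1)` returns ln(x!), so the three terms in the exponent are the logs of the three factors in the Poisson formula.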
Example
We want to analyze the effectiveness of a new coating in protecting steel from corrosion.
We coat a single (10 cm by 10 cm) plate of steel with the new coating and submerge it in a similar
corrosive solution for 30 days.
At the end of the 30 days, we find only 2 holes on our plate.
If we assume that the coating does not help in protecting against corrosion, what is the
probability that we would have found only 2 holes on the plate?
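A sketch of one way to set this up: a 10 cm by 10 cm plate has an area of 0.01 square meters, so if the coating has no effect, the rate of 540.43 holes per square meter from the previous example scales to μ = 5.4043 holes per plate. The phrase "only 2 holes" can be read as exactly 2 or as 2 or fewer, so the code computes both; which reading is intended is an assumption left to the reader.

```python
from math import exp, factorial

holes_per_m2 = 540.43        # uncoated rate from the previous example
plate_area_m2 = 0.10 * 0.10  # 10 cm by 10 cm, converted to square meters

# Scale the mean to the size of the interval (area) actually observed.
mu = holes_per_m2 * plate_area_m2

def poisson_pmf(x, mu):
    """P(X_P = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

p_exactly_2 = poisson_pmf(2, mu)
p_at_most_2 = sum(poisson_pmf(k, mu) for k in range(3))

print(f"mu = {mu:.4f}")
print(f"P(X_P = 2)  = {p_exactly_2:.4f}")
print(f"P(X_P <= 2) = {p_at_most_2:.4f}")
```

A small probability under the "no effect" assumption would suggest that the coating does reduce corrosion.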