04 Normal Approximation For Data and Binomial Distribution
Many data have histograms that look bell-shaped, e.g. heights, weights, IQ scores:
[Figure: bell-shaped histogram of heights; horizontal axis from 64 to 72]
The normal curve is determined by the mean x̄ and the standard deviation s: if the data
follow the normal curve, then knowing x̄ and s means knowing the whole histogram.
To compute areas under the normal curve, we first standardize the data by subtracting
off x̄ and then dividing by s:
z = (height − x̄) / s
[Figure: standard normal curve, y = (1/√(2π)) e^(−z²/2), for z from −4 to 4]
Normal approximation
Finding areas under the normal curve is called normal approximation.
What percentage of fathers have heights between 67.4 in and 71.9 in?
1. Standardize:
   z = (67.4 in − 68.3 in) / 1.8 in = −0.5
   z = (71.9 in − 68.3 in) / 1.8 in = 2
[Figure: histogram of heights (64 to 72) and the standard normal curve (z from −4 to 4)]
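The question about fathers' heights can be checked numerically. The sketch below uses x̄ = 68.3 in and s = 1.8 in from the standardization step; the helper names `standardize` and `normal_cdf` are illustrative, and the area is computed with the error function from Python's standard library rather than a normal table.

```python
import math

def standardize(value, mean, sd):
    """Convert a measurement to standard units: subtract the mean, divide by the SD."""
    return (value - mean) / sd

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_height = 68.3  # x̄ in inches, taken from the example
sd_height = 1.8     # s in inches, taken from the example

z_low = standardize(67.4, mean_height, sd_height)
z_high = standardize(71.9, mean_height, sd_height)

# Area between the two standardized values
area = normal_cdf(z_high) - normal_cdf(z_low)
print(f"z from {z_low:.2f} to {z_high:.2f}: about {100 * area:.1f}% of fathers")
```

Under the normal approximation, roughly two thirds of the fathers fall in this range.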
Computing percentiles for normal data
We saw that for a newborn baby, there is a 49% chance that it is a girl.
What are the chances that 2 out of 3 newborns are girls?
We can compute this by listing all the possibilities (total enumeration):
P(2 out of 3 are girls) = P(GGB or GBG or BGG)
= P(GGB) + P(GBG) + P(BGG)   (addition rule)
= P(G) P(G) P(B) + P(G) P(B) P(G) + ...   (multiplication rule)
= 3 × (0.49)(0.49)(0.51) ≈ 0.367
The factor 3 counts the number of ways one can arrange two G's and one B.
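Total enumeration is easy to reproduce in code. The following sketch lists every sequence of three births, keeps those with exactly two girls, and adds up their probabilities; the function name `prob_k_girls` is illustrative, and births are assumed independent, as the multiplication rule above requires.

```python
from itertools import product

P_GIRL = 0.49  # chance that a newborn is a girl, from the text

def prob_k_girls(k, n, p_girl):
    """Sum the probabilities of all birth sequences of length n with exactly k girls."""
    total = 0.0
    for outcome in product("GB", repeat=n):  # e.g. ('G', 'G', 'B')
        if outcome.count("G") != k:
            continue
        prob = 1.0
        for child in outcome:                # multiplication rule
            prob *= p_girl if child == "G" else 1.0 - p_girl
        total += prob                        # addition rule
    return total

two_of_three = prob_k_girls(2, 3, P_GIRL)
print(f"P(2 out of 3 are girls) = {two_of_three:.4f}")
```

Enumeration grows as 2^n sequences, so it is only practical for small n.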
The binomial setting
n! / (k! (n − k)!), where n! = 1 × 2 × 3 × ... × n and 0! = 1.
We had n = 3 and k = 2, so 3! / (2! 1!) = (1 × 2 × 3) / (1 × 2 × 1) = 3.
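The factorial expression can be evaluated directly. This sketch defines an illustrative helper `binomial_coefficient` from the factorial definition and compares it with `math.comb`, which is available in Python 3.8 and later.

```python
import math

def binomial_coefficient(n, k):
    """Number of ways to arrange k successes among n experiments: n! / (k! (n - k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

ways = binomial_coefficient(3, 2)
print(ways)             # arrangements of two G and one B
print(math.comb(3, 2))  # same value from the standard library
```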
The binomial formula
The probability histogram
We can visualize the probabilities of the various outcomes of X with a probability
histogram:
[Figure: Probability histogram for the binomial with n=10, p=0.2]
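The bar heights of such a histogram can be computed and drawn as text. The sketch below assumes the standard binomial formula P(X = k) = n! / (k! (n − k)!) × p^k × (1 − p)^(n − k), which is not spelled out in this excerpt; the name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent experiments (assumed formula)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.2
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Text version of the probability histogram: one '#' per 0.01 of probability
for k, prob in enumerate(probabilities):
    print(f"{k:2d} {prob:.3f} {'#' * round(prob * 100)}")
```

The tallest bar sits at k = 2, which equals n × p for these values.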
As the number of experiments n gets larger, the probability histogram of the binomial
distribution looks more and more similar to the normal curve:
[Figure, two panels: Probability histogram for the binomial with n=10, p=0.2; Probability histogram for the binomial with n=50, p=0.2]
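One rough way to see this convergence numerically is to compare each bar of the binomial histogram with the height of a normal curve at the same point. The sketch below assumes the normal curve has mean n × p and standard deviation √(n × p × (1 − p)), which this excerpt does not state; `total_gap` is an illustrative measure, not a standard one.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent experiments."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_density(x, mean, sd):
    """Height of the normal curve with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def total_gap(n, p):
    """Sum over k of |binomial bar height - normal curve height at k|."""
    mean = n * p                     # assumed center of the matching normal curve
    sd = math.sqrt(n * p * (1 - p))  # assumed spread of the matching normal curve
    return sum(
        abs(binomial_pmf(k, n, p) - normal_density(k, mean, sd))
        for k in range(n + 1)
    )

for n in (10, 50):
    print(f"n = {n:2d}: total gap = {total_gap(n, 0.2):.4f}")
```

A smaller total gap for n = 50 than for n = 10 is consistent with the histograms above.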