39 1 Normal Dist
Time allocation
You are expected to spend approximately five hours of independent study on the
material presented in this workbook. However, depending upon your ability to concentrate
and on your previous experience with certain mathematical topics this time may vary
considerably.
The Normal Distribution 39.1
Introduction
Mass-produced items should conform to a specification. Usually, a mean is aimed for but due
to random errors in the production process a tolerance is set on deviations from the mean. For
example if we produce piston rings which have a target mean internal diameter of 45 mm then
realistically we expect the diameter to deviate only slightly from this value. The deviations from
the mean value are often modelled very well by the normal distribution. Suppose we decide
that diameters in the range 44.95 mm to 45.05 mm are acceptable, then what proportion of the
output is satisfactory? In this Section we shall see how to use the normal distribution to answer
questions like this.
Prerequisites
Before starting this Section you should . . .
① be familiar with the basic properties of probability
② be familiar with continuous random variables
• mistakes.
If you use a meter with a digital readout, you will avoid some of the above errors but others,
often present in the design of the electronics controlling the meter, will be present. Errors are
unavoidable and are usually the sum of several factors. The behaviour of variables which are
the sum of several other variables is described by a very important and powerful result called
the Central Limit Theorem which we will study later in this workbook. For now we will quote
the result so that the importance of the normal distribution will be appreciated.
$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
This curve is always bell-shaped with the centre of the bell located at the value of µ. The
height of the bell is controlled by the value of σ. As with all normal distribution curves it is
symmetrical about the centre and decays exponentially as x → ±∞. As with any probability
density function the area under the curve is equal to 1. See Figure 1.
Figure 1: the normal curve $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ plotted against x, centred at x = µ
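The properties of the curve described above can be checked numerically. The following is a minimal Python sketch, not part of the workbook: the function names `normal_pdf` and `trapezoid_area` are our own, and the values µ = 20, σ = 5 are chosen purely for illustration. It evaluates the density, confirms the symmetry about µ, and approximates the total area with the trapezium rule.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def trapezoid_area(f, a, b, n=10000):
    """Approximate the integral of f over [a, b] using n trapezia."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

mu, sigma = 20.0, 5.0  # illustrative values only

# Area over mu +/- 8 sigma; the tails beyond this range are negligible.
area = trapezoid_area(lambda x: normal_pdf(x, mu, sigma),
                      mu - 8 * sigma, mu + 8 * sigma)

# The height of the bell at its centre is 1 / (sigma * sqrt(2 * pi)).
peak = normal_pdf(mu, mu, sigma)

print(round(area, 6), round(peak, 4))
```

Doubling σ in this sketch halves the peak height while the area stays equal to 1, so the bell becomes lower and wider.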
A normal distribution is completely defined by specifying its mean (the value of µ) and its
variance (the value of σ²). The normal distribution with mean µ and variance σ² is written
N(µ, σ²). Hence the distribution N(20, 25) has a mean of 20 and a standard deviation of 5;
remember that the second “parameter” is the variance which is the square of the standard
deviation.
Key Point
A normal distribution has mean µ and variance σ². The distribution is denoted by N(µ, σ²)
and, for a random variable X following this distribution, we often write
X ∼ N(µ, σ²)
Clearly, since µ and σ² can both vary, there are infinitely many normal distributions and it is
impossible to give tabulated information concerning them all.
For example, if we produce piston rings which have a target mean internal diameter of 45 mm
then we may realistically expect the actual diameter to deviate from this value.
Such deviations are well-modelled by the normal distribution. Suppose we decide that diameters
in the range 44.95 mm to 45.05 mm are acceptable, we may then ask the question ‘What
proportion of our manufactured output is satisfactory?’
Without tabulated data concerning the appropriate normal distribution we cannot easily answer
this question (because the integral used to calculate areas under the normal curve is intractable).
Key Point
The standard normal distribution has a mean of zero and a variance of one.
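In Figure 2 we show the graph of the standard normal distribution which has probability density
function $y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$.
Figure 2: the standard normal curve $y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ plotted against x, centred at x = 0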
The result which makes the standard normal distribution so important is as follows:
Key Point
If the behaviour of a continuous random variable X is described by the distribution N(µ, σ²)
then the behaviour of the random variable Z = (X − µ)/σ is described by the standard normal
distribution N(0, 1).
We call Z the standardised normal variable and we write
Z ∼ N (0, 1)
Solution
Here, µ = 45 and σ² = 0.000625 so that σ = 0.025. Hence Z = (X − 45)/0.025 is the required
transformation.
Example When the random variable X takes values between 44.95 and 45.05, between
which values does the random variable Z lie?
Solution
When X = 45.05, Z = (45.05 − 45)/0.025 = 2
When X = 44.95, Z = (44.95 − 45)/0.025 = −2
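The standardising transformation is easy to check numerically. The short Python sketch below is an illustration only (the function name `standardise` is our own choice); it applies Z = (X − µ)/σ with µ = 45 and σ = 0.025 to the two limits used above.

```python
def standardise(x, mu, sigma):
    """Return the standardised value z = (x - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma = 45.0, 0.025

z_upper = standardise(45.05, mu, sigma)  # upper limit of X
z_lower = standardise(44.95, mu, sigma)  # lower limit of X

# Rounding hides the tiny floating-point error in 45.05 - 45.
print(round(z_lower, 6), round(z_upper, 6))
```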
The random variable X follows a normal distribution with mean 1000 and
variance 100. When X takes values between 1005 and 1010, between which
values does the standardised normal variable Z lie?
Your solution
Figure 3: the standard normal curve with the area between 0 and z1 shaded
Since the total area under the curve is equal to 1 it follows from the symmetry in the curve that
the area under the curve in the region x > 0 is equal to 0.5. In Figure 3 the shaded area is the
probability that Z takes values between 0 and z1 .
When we ‘look-up’ a value in the table we obtain the value of the shaded area.
Example What is the probability that Z takes values between 0 and 1.9? (Please refer
to the table of normal probabilities on page 15).
Solution
The row beginning ‘1.9’ and the column headed ‘0’ is the appropriate choice and its entry is 4713.
This is to be read as 0.4713 (we omitted the ‘0.’ in each entry for clarity). The interpretation is
that the probability that Z takes values between 0 and 1.9 is 0.4713.
Example What is the probability that Z takes values between 0 and 1.96?
Solution
This time we want the row beginning 1.9 and the column headed ‘6’.
The entry is 4750 so that the required probability is 0.4750.
Example What is the probability that Z takes values between 0 and 1.965?
Solution
There is no entry corresponding to 1.965 so we take the average of the values for 1.96 and 1.97.
(This linear interpolation is not strictly correct but is acceptable).
The two values are 4750 and 4756 with an average of 4753. Hence the required probability is
0.4753.
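The tabulated values can be reproduced without a table. The sketch below is a minimal illustration, assuming Python's standard `math.erf` function and using the identity that the area under the standard normal curve between 0 and z is 0.5·erf(z/√2); the name `area_0_to_z` is our own.

```python
import math

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

# The three look-ups worked through above.
for z in (1.9, 1.96, 1.965):
    print(z, round(area_0_to_z(z), 4))
```

For z = 1.965 the directly computed area agrees with the interpolated table value to four decimal places, which is why averaging neighbouring entries is acceptable here.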
Your solution
(i) The entry is 4772; the probability is 0.4772.
(ii) The entry is 4893; the probability is 0.4893.
(iii) The entry is 4901; the probability is 0.4901.
(iv) The entry for 2.33 is 4901, that for 2.34 is 4904.
Linear interpolation gives a value of 4901 + 0.3(4904 − 4901) i.e. about 4902; the probability is
0.4902.
Note from the table that as Z increases from 0 the entries increase, rapidly at first and then
more slowly, toward 5000 i.e. a probability of 0.5. This is consistent with the shape of the curve.
After Z = 3 the increase is quite slow so that we tabulate entries for values of Z rising by 0.1
instead of 0.01 as in the rest of the table.
Case 1
Figure 4 illustrates what we do if both Z values are positive. By using the properties of the
standard normal distribution we can organise matters so that any required area is always of
‘standard form’.
Figure 4: the area between z1 and z2 equals the area between 0 and z2 minus the area between 0 and z1
Solution
Using the table
P(0 < Z < z2) i.e. P(0 < Z < 2) is 0.4772
P(0 < Z < z1) i.e. P(0 < Z < 1) is 0.3413.
Hence P(1 < Z < 2) = 0.4772 − 0.3413 = 0.1359
(Remember that with a continuous distribution the probability of any single value, such as
P(Z = 1), is zero so that P(1 ≤ Z ≤ 2) is the same as P(1 < Z < 2).)
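The subtraction used in Case 1 can be sketched in a few lines of Python. This is an illustration under our own naming (`area_0_to_z`, `prob_between_positive`), with `math.erf` standing in for the printed table.

```python
import math

def area_0_to_z(z):
    """Stand-in for the table: area under the standard normal curve from 0 to z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def prob_between_positive(z1, z2):
    """P(z1 < Z < z2) for 0 <= z1 < z2, as a difference of two 'standard form' areas."""
    return area_0_to_z(z2) - area_0_to_z(z1)

p = prob_between_positive(1.0, 2.0)
print(round(p, 4))
```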
Case 2
The following diagram illustrates the procedure to be followed when finding probabilities of the
form P (Z > z1 ).
Figure 5: the area to the right of z1 equals the area 0.5 to the right of 0 minus the area between 0 and z1
Solution
P(0 < Z < 2) = 0.4772 (from the table). Hence the probability is 0.5 − 0.4772 = 0.0228.
Case 3
Figure 6: the area to the left of z1 equals the area 0.5 to the left of 0 plus the area between 0 and z1
Solution
P(Z < 2) = 0.5 + 0.4772 = 0.9772.
Case 4
Here we consider what needs to be done when calculating probabilities of the form
P (−z1 < Z < 0) where z1 is positive. This time we make use of the symmetry in the standard
normal distribution curve.
Figure 7: by symmetry, the area between −z1 and 0 equals the area between 0 and z1
Solution
The area is equal to that corresponding to P (0 < Z < 2) = 0.4772.
Case 5
Finally we consider probabilities of the form P(−z1 < Z < z2) where z1 and z2 are positive.
Here we use the sum property and the symmetry property.
Figure 8: the area between −z1 and z2 equals the area between 0 and z1 plus the area between 0 and z2
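The remaining cases all reduce to the single tabulated quantity, the area between 0 and a positive z. The Python sketch below collects them as small helper functions. It is a hedged illustration only: the function names are our own, `math.erf` replaces the printed table, and the final call uses z1 = 1, z2 = 2 as an example of our own choosing.

```python
import math

def area_0_to_z(z):
    """Stand-in for the table: area under the standard normal curve from 0 to z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

def prob_greater(z1):
    """P(Z > z1) for z1 > 0: half the total area minus the tabulated area."""
    return 0.5 - area_0_to_z(z1)

def prob_less(z1):
    """P(Z < z1) for z1 > 0: half the total area plus the tabulated area."""
    return 0.5 + area_0_to_z(z1)

def prob_neg_to_zero(z1):
    """P(-z1 < Z < 0): by symmetry, equal to P(0 < Z < z1)."""
    return area_0_to_z(z1)

def prob_neg_to_pos(z1, z2):
    """P(-z1 < Z < z2): sum of the two tabulated areas."""
    return area_0_to_z(z1) + area_0_to_z(z2)

print(round(prob_greater(2.0), 4))      # the 0.5 - 0.4772 calculation
print(round(prob_less(2.0), 4))         # the 0.5 + 0.4772 calculation
print(round(prob_neg_to_zero(2.0), 4))  # symmetry about 0
print(round(prob_neg_to_pos(1.0, 2.0), 4))
```

Because the sketch works with unrounded areas, a sum or difference may differ in the fourth decimal place from the same calculation carried out with four-figure table entries.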
Solution
Your solution
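$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
and so the cumulative distribution function F(x) is given by the formula
$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{(u-\mu)^2}{2\sigma^2}}\, du$$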
In the case of the cumulative distribution for the standard normal curve, we use the special
notation Φ(z) and, substituting 0 and 1 for µ and σ², we obtain
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\, du$$
The shape of the curve is essentially ‘S’ -shaped as shown in the diagram below. Note that the
[Diagram: graph of Φ(z) against z, an ‘S’-shaped curve rising towards 1, with the z-axis marked from −2 to 2]
shows that
$$\nu = \frac{u-\mu}{\sigma} \quad \text{and so} \quad d\nu = \frac{du}{\sigma}$$
and F (x) may be written as
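$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} e^{-\nu^2/2}\, \sigma\, d\nu = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x-\mu)/\sigma} e^{-\nu^2/2}\, d\nu = \Phi\!\left(\frac{x-\mu}{\sigma}\right)$$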
We already know, from the basic definition of a cumulative distribution function, that
P (a < X < b) = F (b) − F (a)
so that we may write the probability statement above in terms of Φ(z) as
$$P(a < X < b) = F(b) - F(a) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$
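The relation between F and Φ translates directly into code. The following Python sketch is illustrative only: the names `phi` and `prob_between` are our own, and Φ is evaluated through the standard identity Φ(z) = ½(1 + erf(z/√2)) rather than from a table. The example call uses the distribution N(20, 25) mentioned earlier, with limits one standard deviation either side of the mean chosen by us for illustration.

```python
import math

def phi(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), computed as Phi((b-mu)/sigma) - Phi((a-mu)/sigma)."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# X ~ N(20, 25), so sigma = 5; limits at mu - sigma and mu + sigma.
p = prob_between(15.0, 25.0, mu=20.0, sigma=5.0)
print(round(p, 4))
```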
Some values of Φ(z) are given in the table below. You should compare the values given here
with the values given for the normal probability integral on page 15. For z ≥ 0, simply adding
0.5 to the values in the latter table gives the values of Φ(z). You should also note that the
diagrams shown at the top of each set of tabulated values tell you whether you are looking at
the values of Φ(z) or the values of the normal probability integral.
The value of Φ(z) is measured from z = −∞ to any ordinate z = z1 and represents the
probability P (Z < z1 ).
The values of Φ(z) start as shown below:
z    0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1  .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2  .5793  .5832  .5871  .5909  .5948  .5987  .6026  .6064  .6103  .6141
Z = (X − µ)/σ    0 1 2 3 4 5 6 7 8 9
0 0000 0040 0080 0120 0160 0199 0239 0279 0319 0359
.1 0398 0438 0478 0517 0557 0596 0636 0675 0714 0753
.2 0793 0832 0871 0909 0948 0987 1026 1064 1103 1141
.3 1179 1217 1255 1293 1331 1368 1406 1443 1480 1517
.4 1555 1591 1628 1664 1700 1736 1772 1808 1844 1879
.5 1915 1950 1985 2019 2054 2088 2123 2157 2190 2224
.6 2257 2291 2324 2357 2389 2422 2454 2486 2517 2549
.7 2580 2611 2642 2673 2703 2734 2764 2794 2822 2852
.8 2881 2910 2939 2967 2995 3023 3051 3078 3106 3133
.9 3159 3186 3212 3238 3264 3289 3315 3340 3365 3389
1.0 3413 3438 3461 3485 3508 3531 3554 3577 3599 3621
1.1 3643 3665 3686 3708 3729 3749 3770 3790 3810 3830
1.2 3849 3869 3888 3907 3925 3944 3962 3980 3997 4015
1.3 4032 4049 4066 4082 4099 4115 4131 4147 4162 4177
1.4 4192 4207 4222 4236 4251 4265 4279 4292 4306 4319
1.5 4332 4345 4357 4370 4382 4394 4406 4418 4429 4441
1.6 4452 4463 4474 4484 4495 4505 4515 4525 4535 4545
1.7 4554 4564 4573 4582 4591 4599 4608 4616 4625 4633
1.8 4641 4649 4656 4664 4671 4678 4686 4693 4699 4706
1.9 4713 4719 4726 4732 4738 4744 4750 4756 4761 4767
2.0 4772 4778 4783 4788 4793 4798 4803 4808 4812 4817
2.1 4821 4826 4830 4834 4838 4842 4846 4850 4854 4857
2.2 4861 4865 4868 4871 4875 4878 4881 4884 4887 4890
2.3 4893 4896 4898 4901 4904 4906 4909 4911 4913 4916
2.4 4918 4920 4922 4925 4927 4929 4931 4932 4934 4936
2.5 4938 4940 4941 4943 4946 4947 4948 4949 4951 4952
2.6 4953 4955 4956 4957 4959 4960 4961 4962 4963 4964
2.7 4965 4966 4967 4968 4969 4970 4971 4972 4973 4974
2.8 4974 4975 4976 4977 4977 4978 4979 4979 4980 4981
2.9 4981 4982 4982 4983 4984 4984 4985 4985 4986 4986
3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9
4987 4990 4993 4995 4997 4998 4998 4999 4999 4999
We now show, by example, how probabilities relating to a general normal distribution X are
determined. We will see that being able to calculate the probabilities of a standard normal
distribution Z is crucial in this respect.
Example Given that the variate X follows the normal distribution X ∼ N(151, 15²),
calculate:
Solution (contd.)
(b)
P(X ≥ 185) = P(Z ≥ (185 − 151)/15)
= P(Z ≥ 2.27)
= 0.5 − 0.4884
= 0.0116
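The tail probability above can be cross-checked in Python. The sketch below is an illustration, assuming the standard-library class `statistics.NormalDist` (available from Python 3.8); the variable names are our own. It computes the probability twice: once directly, and once after rounding z to two decimal places as a table look-up requires.

```python
from statistics import NormalDist

X = NormalDist(mu=151, sigma=15)  # X ~ N(151, 15^2)
Z = NormalDist()                  # standard normal, N(0, 1)

# Direct calculation: P(X >= 185) = 1 - F(185).
p_direct = 1.0 - X.cdf(185)

# Table-style calculation: standardise, round z to 2 d.p., then use the standard normal.
z_rounded = round((185 - 151) / 15, 2)
p_table_style = 1.0 - Z.cdf(z_rounded)

print(z_rounded, round(p_table_style, 4), round(p_direct, 4))
```

The two results differ slightly in the fourth decimal place because the table method rounds z before the look-up.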
We note that, as for any continuous random variable, we can only calculate the probability
that
• X lies between two given values;
• X is greater than a given value;
• X is less than a given value,
rather than for individual values.
The transformation used is Z = (X − 20)/1.6 giving
P(X ≥ 24) = P(Z ≥ (24 − 20)/1.6) = P(Z ≥ 2.5) = 0.5 − 0.4938 = 0.0062
and
P(19 < X < 21) = P((19 − 20)/1.6 < Z < (21 − 20)/1.6) = P(−0.625 < Z < 0.625) = 0.4681
Solution
Let X be the diameter of a piston ring. Then we write X ∼ N(45, (0.05)²). The transformation
is Z = (X − µ)/σ = (X − 45)/0.05. The upper limit of acceptability is x2 = 45.05 so that
z2 = (45.05 − 45)/0.05 = 1.
The lower limit of acceptability is x1 = 44.95 so that z1 = (44.95 − 45)/0.05 = −1.
The range of ‘acceptable’ Z values is therefore −1 to 1. See Figure 2.
Figure 2: the standard normal curve $y = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ plotted against x, centred at x = 0
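With the acceptable range of Z established as −1 to 1, the proportion of satisfactory output can be estimated numerically. The following Python sketch is an illustration only, assuming the standard-library class `statistics.NormalDist` and the model X ∼ N(45, (0.05)²) stated in the solution; the variable names are our own.

```python
from statistics import NormalDist

# Assumed model from the solution: mean 45 mm, standard deviation 0.05 mm.
diameter = NormalDist(mu=45.0, sigma=0.05)

# Proportion of rings with diameter between 44.95 mm and 45.05 mm.
p_general = diameter.cdf(45.05) - diameter.cdf(44.95)

# The same proportion via the standard normal variable, P(-1 < Z < 1).
Z = NormalDist()
p_standard = Z.cdf(1.0) - Z.cdf(-1.0)

print(round(p_general, 4), round(p_standard, 4))
```

Both routes give the same value to four decimal places, which illustrates why a single table for Z suffices for every normal distribution.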
Using the symmetry of the curves
Figure 3: the standard normal curve with the area between 0 and z1 shaded
Your solution
This time the shaded area is the difference between the right-hand half of the total
area and an area which can be read off from the table.
Now use the Table to find z2 , and hence write down the value of z1 .
Your solution
Rewrite the formula Z = (X − µ)/σ to make σ the subject. Put in values for z2, x2 and µ and hence
evaluate σ.
Your solution
σ = (X − µ)/Z = (100.5 − 100)/3.1 = 0.16 (2 d.p.)