Probability Distribution II - Normal Distribution & Small Sampling Distribution (Students Notes) MAR 23
Introduction
Suppose we are planning to measure the variability of an automatic bottling process that
fills 1/2-liter (500 cm³) bottles with cola. The variable, say X, indicating the deviation of the
actual volume from the normal (average) volume, can take any real value.
This type of random variable, which can take an infinite number of values in a given range,
is called a continuous random variable, and the probability distribution of such a variable
is called a continuous probability distribution.
The concepts and assumptions inherent in the treatment of such distributions are quite
different from those used in the context of a discrete distribution.
Consider our plan for measuring the variability of the automatic bottling process that fills
1/2-liter (500 cm³) bottles with cola. The random variable X indicates 'the deviation of the
actual volume from the normal (average) volume.' Let us, for some time, measure our
random variable X to the nearest one cm³.
Suppose the figure above represents the histogram of the probability distribution of X. The
probability of each value of X is the area of the rectangle over that value.
Since the rectangles all have the same base, the height of each rectangle is proportional to
its probability. The probabilities also add to 1.00, as required for a probability
distribution.
Volume is a continuous random variable; it can take on any value measured on an interval
of numbers.
Now let us imagine the process of refining the measurement scale of X to the nearest 1/2 cm³,
the nearest 1/10 cm³, and so on.
Obviously, as the process of refining the measurement scale continues, the number of
rectangles in the histogram increases and the width of each rectangle decreases. The
probability of each value is still measured by the area of the rectangle above it, and the total
area of all rectangles remains 1.00. As we keep refining our measurement scale, the discrete
distribution of X tends to a continuous probability distribution.
The step-like surface formed by the tops of the rectangles in the histogram tends to a smooth
function. This function is denoted by f(x) and is called the probability density function of the
continuous random variable X.
The density function is the limit of the histograms as the number of rectangles approaches
infinity and the width of each rectangle approaches zero.
A continuous random variable is a random variable that can take on any value in an
interval of numbers.
The probabilities associated with a continuous random variable X are determined by the
probability density function of the random variable. The function, denoted f(x), has the
following properties:
1. f(x) ≥ 0 for all x.
2. The probability that X will be between two numbers a and b is the area under f(x)
   between a and b:
                  b
   P(a < X < b) = ∫ f(x) dx
                  a
3. The total area under the entire curve of f(x) is equal to 1.00.
                 ∞
P(−∞ ≤ X ≤ ∞) = ∫ f(x) dx = 1.00
                −∞
When the sample space is continuous, the probability of any single given value is zero.
For a continuous random variable, non-zero probabilities are associated only with
intervals of numbers.
We define the cumulative distribution function F(x) for a continuous random variable
similarly to the way we defined it for a discrete random variable:
F(x) = P(X ≤ x) = area under f(x) between the smallest possible value of X (often −∞)
and point x
         x
       = ∫ f(x) dx
        −∞
The cumulative distribution function F(x) is a smooth, non-decreasing function that increases
from 0 to 1.00.
The expected value of a continuous random variable X, denoted by E(X), and its variance,
denoted by V(X), require the use of calculus for their computation. Thus
        ∞
E(X) = ∫ x f(x) dx
       −∞

        ∞
V(X) = ∫ [x − E(X)]² f(x) dx
       −∞
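The three formulas above can be checked numerically. The sketch below uses a hypothetical density f(x) = 3x² on [0, 1] (an assumption chosen purely for illustration; it is not from these notes) and a midpoint Riemann sum in place of the integral:

```python
# Numerical check of the pdf and moment formulas, using the assumed
# density f(x) = 3x^2 on [0, 1].

def riemann(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with a midpoint sum."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 3 * x ** 2

total = riemann(f, 0, 1)                            # total area: 1.00
ex = riemann(lambda x: x * f(x), 0, 1)              # E(X) = 3/4
vx = riemann(lambda x: (x - ex) ** 2 * f(x), 0, 1)  # V(X) = 3/80

print(round(total, 4), round(ex, 4), round(vx, 4))
```

The same pattern works for any density: only the function `f` and the interval change.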
The Normal Distribution
A random variable that is affected by many independent causes, where the effect of each
cause is not overwhelmingly large compared to the other effects, closely follows a normal
distribution.
Such variables are affected by several independent causes, where the effect of each cause is
small.
Irrespective of how the full body of data is distributed, it has been found that the Normal
Distribution can be used to characterize the sampling distribution of many of the sample
statistics.
X ~ N (μ, σ2)
For example, a normal distribution with mean 100 and standard deviation 5 will have the
density function

f(x) = (1 / (5√(2π))) e^(−(x − 100)² / (2·5²))
This function when plotted (see Figure below) will give the famous bell-shaped mesokurtic
normal curve.
1. The normal curve is not a single curve representing only one continuous distribution.
The figure below shows three different normal distributions, with different shapes and positions.
2. The normal curve is bell-shaped and perfectly symmetric about its mean. As a result,
50% of the area lies to the right of the mean and the remaining 50% to the left of the mean.
The normal curve gradually tapers off in height as it moves in either direction away
from the mean, and gets closer to the X-axis;
3. The normal curve has a (relative) kurtosis of 0, which means it has average
peakedness and is mesokurtic;
4. Theoretically, the normal curve never touches the horizontal axis and extends to
infinity on both sides. That is, the curve is asymptotic to the X-axis;
5. If several independent random variables are normally distributed, then their sum
will also be normally distributed.
If X1, X2, ..., Xn are independent normal variables, then their sum S will also be
a normal variable with

E(S) = μ1 + μ2 + ... + μn and V(S) = σ1² + σ2² + ... + σn²

where μi and σi² are the mean and variance of Xi.
If Y = aX + b, where a and b are constants, the resultant variable Y will also be normally
distributed with mean = aE(X) + b and variance = a²V(X).
If X1, X2, ..., Xn are independent random variables that are normally distributed, then the
random variable Q defined as

Q = a1X1 + a2X2 + ... + anXn + b

is also normally distributed, with

E(Q) = a1E(X1) + a2E(X2) + ... + anE(Xn) + b
V(Q) = a1²V(X1) + a2²V(X2) + ... + an²V(Xn)
Example 1
A cost accountant needs to forecast the unit cost of a product for the next year. He notes that
each unit of the product requires 10 labor hours and 5 kg of raw material. In addition, each
unit of the product is assigned an overhead cost of KSHs. 200. He estimates that the cost of a
labor hour next year will be normally distributed with an expected value of KSHs. 45 and a
standard deviation of KSHs. 2; the cost of raw material will be normally distributed with an
expected value of KSHs. 60 and a standard deviation of KSHs.3. Find the distribution of the
unit cost of the product. Find its expected value and variance.
Solution: Since the cost of labor L may not influence the cost of raw material M, we can
assume that the two are independent. This makes the unit cost of the product Q a random
variable. So if
L ~ N(45, 2²) and M ~ N(60, 3²)
then Q = 10L + 5M + 200, and

E(Q) = 10E(L) + 5E(M) + 200 = 10(45) + 5(60) + 200 = 950
V(Q) = 10²V(L) + 5²V(M) = 100(4) + 25(9) = 625

So Q ~ N(950, 625): the unit cost has expected value KSHs. 950 and variance 625
(standard deviation KSHs. 25).
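Since Q is built from independent normals, the result can be sanity-checked by simulation. A minimal sketch, assuming the model Q = 10L + 5M + 200 with the parameters stated in the example:

```python
# Monte Carlo check of Example 1: Q = 10L + 5M + 200 with
# L ~ N(45, 2^2) and M ~ N(60, 3^2), both independent.
import random
from statistics import mean, variance

random.seed(42)                      # reproducible draws
N = 100_000
q = [10 * random.gauss(45, 2) + 5 * random.gauss(60, 3) + 200
     for _ in range(N)]

print(round(mean(q), 1))     # theory: E(Q) = 10(45) + 5(60) + 200 = 950
print(round(variance(q)))    # theory: V(Q) = 100(4) + 25(9) = 625
```

The simulated mean and variance settle close to the theoretical 950 and 625 as N grows.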
There are infinitely many possible normal random variables and resulting normal curves
for different values of μ and σ². So the range probability P(a < X < b) will be different for
different normal curves. We can make use of integral calculus to compute the required range
probability:
               b
P(a < X < b) = ∫ f(x) dx
               a
Since it is not practicable and indeed impossible to have separate probability tables for each
of the infinitely many possible normal curves, we select one normal curve to serve as a
standard.
The standard normal random variable is denoted by a special name, Z (rather than the general
name X we use for other random variables).
We define the standard normal random variable Z as the normal random variable with
mean = 0 and standard deviation = 1.
We say
Z ~ N(0, 1²)
The probabilities associated with the standard normal distribution are tabulated in two ways, say
Type I and Type II tables, as shown in the figure below.
Type I Tables give the area between μ = 0 and any other z value, as shown by the vertical
hatched area in Figure a below. The hatched area shown in the figure is P(0 < Z < z).
Figure a: P(0 < Z < z)        Figure b: P(Z > z)
Type II Tables give the area towards the tail–end of the standard normal curve beyond the
ordinate at any particular z value. The hatched area shown in Figure b above is P (Z > z).
As the normal curve is perfectly symmetric, the areas given by Type I Tables, when
subtracted from 0.5, will provide the same areas as given by Type II Tables, and vice-versa.
Example 2
Find the probability that the value of the standard normal random variable will be
(a) between 0 and 1.74 (b) less than −1.47 (c) between 1.3 and 2 (d) between −1 and 2
Solution:
(a) We want P(0 < Z < 1.74). In Figure a above, substitute 1.74 for the point z on the
graph. From the table, P(0 < Z < 1.74) = 0.4591.
(b) We want P(Z < −1.47). By the symmetry of the normal curve, the area to the left of
−1.47 is exactly equal to the area to the right of 1.47. We find
P(Z < −1.47) = P(Z > 1.47) = 0.5 − P(0 < Z < 1.47) = 0.5 − 0.4292 = 0.0708
(c) P(1.30 < Z < 2) = TA(for 2.00) − TA(for 1.30) = P(0 < Z < 2) − P(0 < Z < 1.3)
                    = 0.4772 − 0.4032 = 0.0740
(d) P(−1 < Z < 2) = P(−1 < Z < 0) + P(0 < Z < 2) = P(0 < Z < 1) + 0.4772
                  = 0.3413 + 0.4772 = 0.8185
In cases, where we need probabilities based on values with greater than second-decimal
accuracy, we may use a linear interpolation between two probabilities obtained from the
table.
Example 3
Find P(0 ≤ Z ≤ 1.645).
Solution: Since 1.645 lies halfway between the tabulated values 1.64 and 1.65,
P(0 ≤ Z ≤ 1.645) is found as the midpoint between the two probabilities 0.4495 and 0.4505
= 0.45
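These table lookups can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8+). Here `ta` plays the role of a Type I table area:

```python
# Standard normal probabilities for Examples 2 and 3 without a printed table.
from statistics import NormalDist

Z = NormalDist()                    # standard normal: mean 0, sd 1
ta = lambda z: Z.cdf(z) - 0.5       # Type I table area P(0 < Z < z)

print(round(ta(1.74), 4))           # (a) P(0 < Z < 1.74)
print(round(Z.cdf(-1.47), 4))       # (b) P(Z < -1.47)
print(round(ta(2.0) - ta(1.3), 4))  # (c) P(1.3 < Z < 2)
print(round(ta(1.0) + ta(2.0), 4))  # (d) P(-1 < Z < 2)
print(round(ta(1.645), 4))          # Example 3, no interpolation needed
```

Since `cdf` is computed analytically, values such as P(0 < Z < 1.645) need no interpolation.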
In many situations, instead of finding the probability that a standard normal random variable
will be within a given interval; we may be interested in the reverse: finding an interval with a
given probability. Consider the following examples.
Example 4
Find a value z of the standard normal random variable such that the probability that the
random variable will have a value between 0 and z is 0.40.
Solution: We look inside the table for the value closest to 0.40. The closest value we find to
0.40 is the table area 0.3997, which corresponds to z = 1.28.
Example 5
Find the value of the standard normal random variable that cuts off an area of 0.90 to its left.
Solution: Since the area to the left of the given point z is greater than 0.50, z must be on the
right side of 0. Furthermore, the area to the left of 0 all the way to −∞ is equal to 0.50.
Hence TA = 0.90 − 0.50 = 0.40, which from the table corresponds to z = 1.28.
Thus z = 1.28 cuts off an area of 0.90 to its left under the standard normal curve.
Example 6
Find a 0.99 probability interval, symmetric about 0, for the standard normal random variable.
Solution: The required area between the two z values that are equidistant from 0 on either
side is 0.99. Therefore, the area under the curve between 0 and the positive z value is TA =
0.99/2 = 0.495.
The area 0.495 lies exactly between the two areas 0.4949 and 0.4951, corresponding to z =
2.57 and z = 2.58,
Therefore, a simple linear interpolation between the two values gives us z = 2.575.
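The reverse lookups of Examples 4 to 6 correspond to the inverse CDF, available in the standard library as `NormalDist.inv_cdf`. A sketch; the table-based answers above differ only in the last decimal:

```python
# Inverse lookups for Examples 4-6 via the standard normal inverse CDF.
from statistics import NormalDist

Z = NormalDist()

# Example 4: z with P(0 < Z < z) = 0.40, i.e. P(Z < z) = 0.90.
# Example 5 asks for the same point (area 0.90 to its left).
print(round(Z.inv_cdf(0.90), 2))    # table gave z = 1.28

# Example 6: symmetric 0.99 interval, so P(Z < z) = 0.995.
print(round(Z.inv_cdf(0.995), 3))   # interpolation gave z = 2.575
```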
The importance of the standard normal distribution derives from the fact that any normal
random variable may be transformed to the standard normal random variable.
If we want to transform X, where X ~ N(μ, σ²), into the standard normal random variable
Z ~ N(0, 1²), we can do this as follows:

Z = (X − μ) / σ
We move the distribution from its center of μ to a center of 0. This is done by subtracting μ
from all the values of X.
To make the standard deviation of the distribution equal to 1, we divide the random variable
by its standard deviation σ.
Example 7
If X ~ N(50, 10²), find the probability that the value of the random variable X will be greater
than 60.
Solution:
P(X > 60) = P((X − μ)/σ > (60 − 50)/10)
          = P(Z > 1)
          = P(Z > 0) − P(0 < Z < 1)
          = 0.5 − 0.3413 = 0.1587
Example 8
The weekly wage of 2000 workmen is normally distributed with mean wage of KSHs. 70
and wage standard deviation of KSHs. 5. Estimate the number of workers whose weekly
wages are
(a) between KSHs. 70 and KSHs. 71 (b) between KSHs. 69 and KSHs. 73
(c) more than KSHs. 72 (d) less than KSHs. 65
Solution: Here X ~ N(70, 5²).
(a) P(70 < X < 71) = P((70 − μ)/σ < (X − μ)/σ < (71 − μ)/σ)
                   = P((70 − 70)/5 < Z < (71 − 70)/5)
                   = P(0 < Z < 0.2) = 0.0793
So the number of workers whose weekly wages are between KSHs. 70 and KSHs. 71
= 2000 × 0.0793 = 158.6, i.e. about 159 workers.
(b) P(69 < X < 73) = P((69 − 70)/5 < Z < (73 − 70)/5)
                   = P(−0.2 < Z < 0.6)
                   = P(−0.2 < Z < 0) + P(0 < Z < 0.6)
                   = 0.0793 + 0.2257 = 0.3050
So the number of workers whose weekly wages are between KSHs. 69 and KSHs. 73
= 2000 × 0.3050 = 610 workers.
(c) The required probability is P(X > 72).
P(X > 72) = P((X − μ)/σ > (72 − μ)/σ)
          = P(Z > (72 − 70)/5)
          = P(Z > 0.4) = 0.5 − 0.1554 = 0.3446
So the number of workers whose weekly wages are more than KSHs. 72
= 2000 × 0.3446 = 689.2, i.e. about 689 workers.
(d) The required probability to be calculated is P(X < 65).
P(X < 65) = P((X − μ)/σ < (65 − μ)/σ)
          = P(Z < (65 − 70)/5)
          = P(Z < −1.0) = P(Z > 1.0)
          = P(Z > 0) − P(0 < Z < 1.0)
          = 0.5 − 0.3413 = 0.1587
So the number of workers whose weekly wages are less than KSHs. 65
= 2000 × 0.1587 = 317.4, i.e. about 317 workers.
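Example 8 can also be recomputed by giving `NormalDist` the wage distribution's own μ and σ, skipping the manual Z transformation (a sketch with the stated parameters):

```python
# Example 8 with X ~ N(70, 5^2), scaled to the 2000 workmen.
from statistics import NormalDist

X = NormalDist(mu=70, sigma=5)
workers = 2000

p_a = X.cdf(71) - X.cdf(70)   # P(70 < X < 71)
p_b = X.cdf(73) - X.cdf(69)   # P(69 < X < 73)
p_c = 1 - X.cdf(72)           # P(X > 72)
p_d = X.cdf(65)               # P(X < 65)

for p in (p_a, p_b, p_c, p_d):
    print(round(p, 4), round(workers * p))   # probability, head count
```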
The transformation Z = (X – μ)/ σ takes us from a random variable X with mean μ, and
standard deviation σ to the standard normal random variable.
We also have an opposite, or inverse, transformation, which takes us from the standard
normal random variable Z to the random variable X with mean μ and standard deviation
σ.
X = μ + Zσ
Example 9
The amount of fuel consumed by the engines of a jetliner on a flight between two cities is a
normally distributed random variable X with mean μ = 5.7 tons and standard deviation σ =
0.5 tons. Carrying too much fuel is inefficient as it slows the plane. If, however, too little fuel
is loaded on the plane, an emergency landing may be necessary. What should be the amount
of fuel to load so that there is 0.99 probability that the plane will arrive at its destination
without emergency landing?
Solution: Given that X ~ N(5.7, 0.5²), we need the value x such that

P(X < x) = 0.99, i.e. P((X − μ)/σ < z) = 0.99

The area to the left of z is 0.5 + 0.49 = 0.99, so TA = 0.49, which from the table gives
z = 2.33.
So x = μ + zσ
   x = 5.7 + 2.33 × 0.5
   x = 6.865
Therefore, the plane should be loaded with 6.865 tons of fuel to give 0.99 probability that the
fuel will last throughout the flight.
We can summarize the procedure of obtaining values of a normal random variable, given a
probability, as follows:
a) Draw a picture of the normal distribution in question and the standard normal
distribution
b) In the picture, shade in the area corresponding to the probability
c) Use the table to find the z value (or values) that gives the required probability
d) Use the transformation from Z to X to get the appropriate value (or values) of the
original normal random variable
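Applied to Example 9, the four steps collapse into a single inverse-CDF call on X itself; the tiny difference from 6.865 comes from the table's rounded z = 2.33:

```python
# Example 9: fuel load x with P(X < x) = 0.99, X ~ N(5.7, 0.5^2).
from statistics import NormalDist

X = NormalDist(mu=5.7, sigma=0.5)
fuel = X.inv_cdf(0.99)        # the 99th percentile of fuel consumption
print(round(fuel, 3))         # about 6.863 tons (table gave 6.865)
```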
The Z-statistic is used in statistical inference when the sample size is large. It may, however,
be appreciated that the sample size may be prohibited from being large either due to physical
limitations or due to practical difficulties of sampling costs being too high. Consequently, for
our statistical inferences, we may often have to content ourselves with a small sample, n < 30.
The consequences of the sample being small, n < 30, are that the:
a) Central limit theorem ceases to operate; and
b) Sample variance S² fails to serve as an unbiased estimator of σ².
Thus, the basic difference which the sample size makes is that while the sampling
distributions based on large samples are approximately normal and the sample variance S²
is an unbiased estimator of σ², the same does not hold when the sample is small.
It may be appreciated that the small sampling distributions are also known as exact
sampling distributions, as the statistical inferences based on them are not subject to
approximation. However, the assumption of population being normal is the basic
qualification underlying the application of small sampling distributions.
The important small sampling distributions are the:
a) chi-square (χ²) distribution;
b) F distribution; and
c) Student's t-distribution.
The small sampling distributions are defined in terms of the concept of degrees of
freedom.
The concept of degrees of freedom (df) is important for many statistical calculations and
probability distributions.
It refers to the number of independent variables which vary freely without being
influenced by the restrictions imposed by the sample statistic(s) to be computed.
Obviously, we are free to assign any value to n − 1 observations out of n observations. Once
the values are freely assigned to n − 1 observations, the freedom to do the same for the nth
observation is lost, and its value is automatically determined as
nth observation = n·x̄ − sum of the n − 1 observations
We say that one degree of freedom (df) is lost, and the sum n·x̄ of the n observations has
n − 1 df associated with it.
For example, if the sum of four observations is 10, we are free to assign any value to three
observations only, say, x1 = 2, x2 = 1 and x3 = 4. Given these values, the value of the fourth
observation is automatically determined as
x4 = 10 − (2 + 1 + 4) = 3
Then E(S²) = σ².
However, it can be shown that while calculating S², if we divide the sum of squared
deviations from the mean (SSD), i.e. ∑(x − x̄)², by n, it will not be an unbiased estimator of
σ²; it underestimates σ² by the factor σ²/n. To compensate for this downward bias we divide
∑(x − x̄)² by n − 1, so that

S² = ∑(x − x̄)² / (n − 1)

is an unbiased estimator of the population variance σ². In other words, to get the unbiased
estimator of the population variance σ², we divide the sum ∑(x − x̄)² by the degrees of
freedom n − 1.
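The downward bias, and the n − 1 correction, are easy to see in a simulation. A sketch assuming an arbitrary population N(0, 2²) and samples of size n = 5:

```python
# Dividing the SSD by n underestimates sigma^2 (here 4.0) by about
# sigma^2/n; dividing by n - 1 removes the bias.
import random

random.seed(7)
n, trials = 5, 50_000
biased_sum = unbiased_sum = 0.0

for _ in range(trials):
    xs = [random.gauss(0, 2) for _ in range(n)]
    xbar = sum(xs) / n
    ssd = sum((x - xbar) ** 2 for x in xs)
    biased_sum += ssd / n
    unbiased_sum += ssd / (n - 1)

print(round(biased_sum / trials, 2))    # near sigma^2 (n-1)/n = 3.2
print(round(unbiased_sum / trials, 2))  # near sigma^2 = 4.0
```

With n = 5 the bias is large (about 20%); it shrinks as 1/n, which is why it matters mainly for small samples.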
Consider a normal population comprising N values, X = {X1, X2, ..., XN}, with mean μ and
variance σ². We may draw a random sample of size n comprising x1, x2, ..., xn values from this
population. Each of the n sample values x1, x2, ..., xn can be treated as an independent normal
random variable with mean μ and variance σ². In other words,
xi ~ N(μ, σ²) where i = 1, 2, ..., n
Thus each of these n normally distributed random variables may be standardized, so that
Zi = (xi − μ)/σ ~ N(0, 1²) where i = 1, 2, ..., n
Now consider the sum of squares U = Z1² + Z2² + ... + Zn², which will take different values
in repeated random sampling. Obviously, U is a random variable. It is called the chi-square
variable, denoted by χ². Thus the chi-square random variable is the sum of several
independent, squared standard normal random variables. So

χ² = ∑ Zi² = ∑ ((xi − μ)/σ)²

where the number of degrees of freedom is df = n.
Properties of the χ² Distribution
3. As a sum of squares the χ2 random variable cannot be negative and is, therefore,
bounded on the left by zero.
4. The mean of a χ² distribution is equal to the degrees of freedom df. The variance of
the distribution is equal to twice the number of degrees of freedom df:
E(χ²) = df and V(χ²) = 2df
5. As df increases, the χ² distribution approaches a normal distribution; for large n,
χ² ~ N(n, (√2n)²) approximately.
6. If χ1², χ2², χ3², ..., χk² are k independent χ² random variables, with degrees of
freedom n1, n2, n3, ..., nk, then their sum χ1² + χ2² + χ3² + ... + χk² also possesses a χ²
distribution with df = n1 + n2 + n3 + ... + nk.
The χ² Distribution in terms of Sample Variance S²
In practice, μ is unknown and the deviations must be measured from the sample mean x̄. Then

χ² = ∑ ((xi − x̄)/σ)² = (n − 1)S²/σ²

follows a χ² distribution with df = n − 1. One degree of freedom is lost because all the
deviations are measured from x̄ and not from μ.
We therefore work with the distribution of (n − 1)S²/σ² rather than with the distribution
of S² directly.
Example 10
In an automated process, a machine fills cans of coffee. The variance of the filling process is
known to be 30. In order to keep the process in control, from time to time regular checks of
the variance of the filling process are made. This is done by randomly sampling filled cans,
measuring their amounts and computing the sample variance. A random sample of 101 cans
is selected for the purpose. What is the probability that the sample variance is between 21.28
and 38.72?
Solution: We have
Population variance σ² = 30, n = 101, so df = n − 1 = 100

χ² = (n − 1)S²/σ²

So P(21.28 < S² < 38.72) = P{(101 − 1)(21.28)/30 < χ² < (101 − 1)(38.72)/30}
                         = P(70.93 < χ² < 129.07)
Using the normal approximation χ² ~ N(100, (√200)²) for df = 100, the corresponding z
values are (70.93 − 100)/√200 = −2.06 and (129.07 − 100)/√200 = 2.06, so the required
probability ≈ 2 × P(0 < Z < 2.06) = 2 × 0.4803 = 0.9606.
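The answer can be checked by simulating the filling process itself, repeatedly drawing 101 cans and computing the sample variance (a sketch; the population mean is arbitrary here, since S² does not depend on it):

```python
# Monte Carlo check: fraction of samples of 101 cans whose sample
# variance falls in (21.28, 38.72) when the true variance is 30.
import math
import random
from statistics import variance

random.seed(1)
sigma = math.sqrt(30)
n, trials = 101, 5_000
hits = sum(
    1 for _ in range(trials)
    if 21.28 < variance([random.gauss(0, sigma) for _ in range(n)]) < 38.72
)
print(round(hits / trials, 3))   # near the approximate answer 0.96
```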
The F-Distribution
Let us assume two normal populations with variances σ1² and σ2² respectively. For a random
sample of size n1 drawn from the first population, we have the chi-square variable
χ1² = (n1 − 1)S1²/σ1² with v1 = n1 − 1 df. Similarly, for a random sample of size n2 drawn
from the second population, we have the chi-square variable χ2² = (n2 − 1)S2²/σ2² with
v2 = n2 − 1 df.
The F distribution is the distribution of the ratio of two chi-square random variables that
are independent of each other, each of which is divided by its own degrees of freedom:

F(v1, v2) = (χ1²/v1) / (χ2²/v2)
Properties of the F-Distribution
Figure: F- Distribution with different v1 and v2
4. The F(v1, v2) distribution has no mean for v2 ≤ 2 and no variance for v2 ≤ 4. However,
for v2 > 2 the mean, and for v2 > 4 the variance, are given as
Mean = v2/(v2 − 2), for v2 > 2
Variance = 2v2²(v1 + v2 − 2) / [v1(v2 − 2)²(v2 − 4)], for v2 > 4
5. The F distributions defined as F(v1, v2) and as F(v2, v1) are reciprocals of each other:
if F follows F(v1, v2), then 1/F follows F(v2, v1).
The t-Distribution
Let us assume a normal population with mean μ and variance σ². If xi represents the n values
of a sample drawn from this population, then
Zi = (xi − μ)/σ ~ N(0, 1²) where i = 1, 2, ..., n
A new sample statistic T may then be defined as

T = Z / √(χ²/v)

This statistic, the ratio of the standard normal variable Z to the square root of the χ²
variable divided by its degrees of freedom, is known as the 't' statistic or Student's 't'
statistic.
The random variable (x̄ − μ)/(S/√n) follows the t-distribution with n − 1 degrees of freedom.
The t-distribution in terms of Sampling Distribution of Sample Mean
When defined as above, T again follows t-distribution with n-1 degrees of freedom.
Properties of the t-Distribution
1. The t-distribution, like the Z distribution, is unimodal, symmetric about mean 0, and the
t variable varies from −∞ to ∞.
2. The t-distribution is defined by the degrees of freedom v = n − 1; the df associated with
the distribution are the df associated with the sample standard deviation.
3. The t-distribution has no mean for n = 2, i.e. for v = 1, and no variance for n ≤ 3, i.e. for
v ≤ 2. However, for v > 1 the mean, and for v > 2 the variance, are given as
E(T) = 0,  Var(T) = v/(v − 2)
5. The variance of the t-distribution approaches 1 as the sample size n increases. In other
words, the t-distribution is approximately normal for n ≥ 30.
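The last property can be seen from the other side: for small n, the statistic (x̄ − μ)/(S/√n) has noticeably heavier tails than Z. A simulation sketch with samples of size n = 5 from an assumed N(0, 1) population:

```python
# For n = 5, |t| exceeds the normal cutoff 1.96 far more often than
# the 5% a Z table would suggest (roughly 12% for v = 4).
import math
import random
from statistics import mean, stdev

random.seed(3)
mu, n, trials = 0.0, 5, 50_000
exceed = 0

for _ in range(trials):
    xs = [random.gauss(mu, 1.0) for _ in range(n)]
    t = (mean(xs) - mu) / (stdev(xs) / math.sqrt(n))
    if abs(t) > 1.96:
        exceed += 1

print(round(exceed / trials, 3))   # well above 0.05
```

Repeating the experiment with n = 30 or more brings the exceedance fraction back close to 0.05, matching property 5.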