Lesson 4 - Continuous Probability Distributions (With Exercises)
At the end of the lesson, the student should be able to:
A continuous random variable has a probability of zero of assuming exactly any of its
values. Consequently, its probability distribution cannot be given in tabular form. At first this may
seem startling, but it becomes more plausible when we consider a particular example. Let us
discuss a random variable whose values are the heights of all people over 21 years of age.
Between any two values, say 163.5 and 164.5 centimeters, or even 163.99 and 164.01
centimeters, there are an infinite number of heights, one of which is 164 centimeters. The
probability of selecting a person at random who is exactly 164 centimeters tall and not one of
the infinitely large set of heights so close to 164 centimeters that you cannot humanly measure
the difference is remote, and thus we assign a probability of zero to the event. This is not the
case, however, if we talk about the probability of selecting a person who is at least 163
centimeters but not more than 165 centimeters tall. Now we are dealing with an interval rather
than a point value of our random variable.
Note that when X is continuous,
P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).
That is, it does not matter whether we include an endpoint of the interval or not. This is
not true, though, when X is discrete. Although the probability distribution of a continuous random
variable cannot be presented in tabular form, it can be stated as a formula. Such a formula
would necessarily be a function of the numerical values of the continuous random variable X
and as such will be represented by the functional notation f(x). In dealing with continuous
variables, f(x) is usually called the probability density function, or
simply the density function, of X. Since X is defined over a continuous sample space, it is
possible for f(x) to have a finite number of discontinuities. However, most density functions that
have practical applications in the analysis of statistical data are continuous and their graphs
may take any of several forms, some of which are shown in Figure 4.1. Because areas will be
used to represent probabilities and probabilities are positive numerical values, the density
function must lie entirely above the x axis. A probability density function is constructed so that
the area under its curve bounded by the x axis is equal to 1 when computed over the range of X
for which f(x) is defined. Should this range of X be a finite interval, it is always possible to extend
the interval to include the entire set of real numbers by defining f(x) to be zero at all points in the
extended portions of the interval. In Figure 4.2, the probability that X assumes a value between
a and b is equal to the shaded area under the density function between the ordinates at x = a
and x = b, and from integral calculus is given by:
\[ P(a < X < b) = \int_a^b f(x)\,dx \]
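Example:
For −1 ≤ x < 2,
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} \frac{t^2}{3}\,dt = \left. \frac{t^3}{9} \right|_{-1}^{x} = \frac{x^3 + 1}{9}. \]
Therefore,
\[ F(x) = \begin{cases} 0, & x < -1, \\ \dfrac{x^3 + 1}{9}, & -1 \le x < 2, \\ 1, & x \ge 2. \end{cases} \]
\[ P(0 < X \le 1) = F(1) - F(0) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9}. \]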
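As a numerical sanity check on the example above, the sketch below integrates the density t^2/3 on (−1, 2), which is the integrand appearing in the example, using a simple midpoint rule. The helper names density, integrate and cdf are illustrative choices, not part of the lesson.

```python
def density(t):
    """Density from the example: t^2/3 on (-1, 2), zero elsewhere."""
    return t * t / 3.0 if -1.0 < t < 2.0 else 0.0


def integrate(func, lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


def cdf(x):
    """Closed-form F(x) = (x^3 + 1)/9 on [-1, 2), as derived in the example."""
    if x < -1.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (x ** 3 + 1.0) / 9.0


total_area = integrate(density, -1.0, 2.0)    # total area; should be close to 1
prob_numeric = integrate(density, 0.0, 1.0)   # P(0 < X <= 1) as an area
prob_from_cdf = cdf(1.0) - cdf(0.0)           # F(1) - F(0)

print(round(total_area, 6), round(prob_numeric, 6), round(prob_from_cdf, 6))
```

Both routes to P(0 < X ≤ 1) should agree with the value 1/9 obtained above, and the total area should come out at 1, as required of any density function.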
Let X be a continuous random variable with range [a, b] and probability density function f(x). The
expected value of X is defined by
\[ E(X) = \int_a^b x f(x)\,dx \]
Let’s see how this compares with the formula for a discrete random variable:
\[ E(X) = \sum_{i=1}^{n} x_i\, p(x_i) \]
The discrete formula says to take a weighted sum of the values x_i of X, where the weights are
the probabilities p(x_i). Recall that f(x) is a probability density. Its units are prob/(unit of X).
So f(x) dx represents the probability that X is in an infinitesimal range of width dx around x. Thus
we can interpret the formula for E(X) as a weighted integral of the values x of X, where the
weights are the probabilities f(x) dx.
Example:
SOLUTION:
\[ E(X) = \int_0^1 x\,dx = \left. \frac{x^2}{2} \right|_0^1 = \frac{1}{2} \]
Example:
Let X have range [0, 2] and density \( \frac{3}{8}x^2 \). Find E(X).
\[ E(X) = \int_0^2 x f(x)\,dx = \int_0^2 \frac{3}{8}x^3\,dx = \left. \frac{3x^4}{32} \right|_0^2 = \frac{3}{2} \]
Does it make sense that this X has its mean in the right half of its range?
Yes. Since the probability density increases as x increases over the range, the average value of
x should be in the right half of the range.
µ is “pulled” to the right of the midpoint 1 because there is more mass to the right.
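The "weighted integral" reading of E(X) can be made concrete with a small numerical sketch: chop the range [0, 2] into thin slices, and add up each value x weighted by its approximate probability f(x) dx. The function names here are illustrative, and the midpoint rule is just one convenient way to approximate the integral.

```python
def density(x):
    """Density from the example: (3/8) x^2 on [0, 2], zero elsewhere."""
    return 0.375 * x * x if 0.0 <= x <= 2.0 else 0.0


def expected_value(pdf, lower, upper, steps=100_000):
    """Approximate E(X) as a sum of x * f(x) * dx over thin slices (midpoint rule)."""
    dx = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * dx
        total += x * pdf(x) * dx   # value x weighted by the probability f(x) dx
    return total


mean = expected_value(density, 0.0, 2.0)
print(round(mean, 4))
```

The result should match the exact answer 3/2, which lies to the right of the midpoint 1 of the range.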
Properties of E(X)
The properties of E(X) for continuous random variables are the same as for discrete ones:
E(aX + b) = aE(X) + b
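A short derivation shows why this property holds in the continuous case. It is a sketch that applies the rule for the expectation of a function of X (stated in the next subsection) with h(x) = ax + b, and uses the fact that the total area under f(x) is 1. Here a and b are arbitrary constants, not the endpoints of the range of X.

```latex
\begin{aligned}
E(aX + b) &= \int_{-\infty}^{\infty} (ax + b)\, f(x)\, dx \\
          &= a \int_{-\infty}^{\infty} x f(x)\, dx + b \int_{-\infty}^{\infty} f(x)\, dx \\
          &= a\,E(X) + b \cdot 1 = a\,E(X) + b .
\end{aligned}
```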
Expectation of Functions of X
This works exactly the same as the discrete case. If h(x) is a function, then Y = h(X) is a random
variable and
\[ E(Y) = E(h(X)) = \int_{-\infty}^{\infty} h(x)\, f_X(x)\,dx \]
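Example:
\[ E(X^2) = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \left[ -x^2 e^{-\lambda x} - \frac{2x}{\lambda} e^{-\lambda x} - \frac{2}{\lambda^2} e^{-\lambda x} \right]_0^{\infty} = \frac{2}{\lambda^2} \]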
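The integration by parts above can be checked numerically. The sketch below assumes the density λe^(−λx) for x ≥ 0 that appears in the integrand, picks a hypothetical value λ = 2, and truncates the infinite range at a point where the remaining tail is negligible. It also combines the first two moments into a variance, using the standard identity V(X) = E(X^2) − [E(X)]^2.

```python
import math


def exp_density(x, rate):
    """Exponential density rate * exp(-rate * x) for x >= 0, zero otherwise."""
    return rate * math.exp(-rate * x) if x >= 0.0 else 0.0


def moment(order, rate, upper=None, steps=200_000):
    """Midpoint-rule approximation of E(X^order), truncating the integral at `upper`."""
    if upper is None:
        upper = 50.0 / rate          # the tail beyond this point is negligible
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += (x ** order) * exp_density(x, rate) * dx
    return total


rate = 2.0                           # hypothetical value of lambda
first = moment(1, rate)              # compare with 1 / lambda
second = moment(2, rate)             # compare with 2 / lambda^2
variance = second - first ** 2       # compare with 1 / lambda^2

print(round(first, 4), round(second, 4), round(variance, 4))
```

For λ = 2 the three values should be close to 1/2, 1/2 and 1/4 respectively.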
The Normal Distribution is the most important and most widely used continuous probability
distribution. It is the cornerstone of the application of statistical inference in analysis of data
because the distributions of several important sample statistics tend towards a Normal
distribution as the sample size increases.
Empirical studies have indicated that the Normal distribution provides an adequate
approximation to the distributions of many physical variables. Specific examples include
meteorological data, such as temperature and rainfall, measurements on living organisms,
scores on aptitude tests, physical measurements of manufactured parts, weights of contents of
food packages, volumes of liquids in bottles/cans, instrumentation errors and other deviations
from established norms, and so on.
The graphical appearance of the Normal distribution is a symmetrical bell-shaped curve that
extends without bound in both positive and negative directions.
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ \frac{-(x-\mu)^2}{2\sigma^2} \right], \quad -\infty < x < \infty; \quad -\infty < \mu < \infty,\ \sigma > 0 \]
where μ and σ are parameters. These turn out to be the mean and standard deviation,
respectively, of the distribution. As a shorthand notation, we write X ~ N(μ, σ²).
The curve never actually reaches the horizontal axis but gets close to it beyond about 3
standard deviations on each side of the mean.
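To see how little area lies beyond 3 standard deviations, the following sketch computes the probability that a Normal variable falls within k standard deviations of its mean. It relies on the standard relation between the Normal cumulative distribution function and the error function (math.erf in Python's standard library). The function names normal_cdf and within are illustrative.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def within(k, mu=0.0, sigma=1.0):
    """Probability that X lies within k standard deviations of the mean."""
    return normal_cdf(mu + k * sigma, mu, sigma) - normal_cdf(mu - k * sigma, mu, sigma)


for k in (1, 2, 3):
    print(k, round(within(k), 4))
```

The probabilities are about 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3, and they do not depend on the particular values of μ and σ.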
The graphs below illustrate the effect of changing the values of μ and σ on the shape of the
probability density function. Low variability (σ = 0.71) with respect to the mean gives a pointed
bell-shaped curve with little spread. Variability of σ = 1.41 produces a flatter bell-shaped curve
with a greater spread.
Example:
The volume of water in commercially supplied fresh drinking water containers is approximately
Normally distributed with mean 70 litres and standard deviation 0.75 litres. Estimate the
proportion of containers likely to contain
(i) in excess of 70.9 litres, (ii) at most 68.2 litres, (iii) less than 70.5 litres.
SOLUTION
Let X denote the volume of water in a container, in litres. Then X ~ N(70, 0.75²), i.e. μ = 70, σ =
0.75 and Z = (X − 70)/0.75.
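One way to finish the calculation from the standardization above is sketched below. Instead of looking up Z in a table, it evaluates the Normal cumulative distribution function through the error function. The variable names are illustrative, and a table lookup would give the same values up to rounding.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed from the error function."""
    z = (x - mu) / sigma                      # standardize: Z = (X - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 70.0, 0.75

over_70_9 = 1.0 - normal_cdf(70.9, mu, sigma)    # (i)   P(X > 70.9),  Z > 1.2
at_most_68_2 = normal_cdf(68.2, mu, sigma)       # (ii)  P(X <= 68.2), Z <= -2.4
under_70_5 = normal_cdf(70.5, mu, sigma)         # (iii) P(X < 70.5),  Z < 0.667

print(round(over_70_9, 4), round(at_most_68_2, 4), round(under_70_5, 4))
```

The three proportions come out at roughly 0.115, 0.008 and 0.748.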
The normal distribution can be used as an approximation to the binomial distribution, under
certain circumstances, namely: if X ~ B(n, p) and n is large and/or p is close to ½, then X is
approximately N(np, npq)
(where q = 1 - p).
In some cases, working out a problem using the Normal distribution may be easier than using a
Binomial.
Poisson Approximation
The normal distribution can also be used to approximate the Poisson distribution for large
values of λ (the mean of the Poisson distribution).
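A rough numerical comparison illustrates the approximation. The sketch assumes the approximating normal has mean λ and variance λ (the mean and variance of a Poisson distribution are both λ), uses a hypothetical value λ = 50, and shifts the cut-off by half a unit, which is the continuity correction described in the next subsection.

```python
import math


def poisson_cdf(k, lam):
    """Exact P(X <= k) for a Poisson variable with mean lam."""
    term = math.exp(-lam)            # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i              # P(X = i) obtained from P(X = i - 1)
        total += term
    return total


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


lam = 50.0                           # hypothetical Poisson mean
k = 55
exact = poisson_cdf(k, lam)                          # exact P(X <= 55)
approx = normal_cdf(k + 0.5, lam, math.sqrt(lam))    # normal value with the half-unit shift

print(round(exact, 3), round(approx, 3))
```

For a mean this large the two values should differ only in the third decimal place; with a small λ the agreement is noticeably worse.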
Continuity Correction
The binomial and Poisson distributions are discrete, whereas the normal
distribution is continuous. When we use the normal distribution to approximate a binomial or
Poisson distribution, we take this into account by applying a continuity correction.
In the discrete distribution, each probability is represented by a rectangle (right hand diagram):
When working out probabilities, we want to include whole rectangles, which is what continuity
correction is all about.
Example
Suppose we toss a fair coin 20 times. What is the probability of getting between 9 and 11
heads?
SOLUTION
Since p is close to ½ (it equals ½!), we can use the normal approximation to the binomial. X ~
N(20 × ½, 20 × ½ × ½) so X ~ N(10, 5) .
In this diagram, the rectangles represent the binomial distribution and the curve is the normal
distribution:
We want P(9 ≤ X ≤ 11), which is the red shaded area. Notice that the first rectangle starts at 8.5
and the last rectangle ends at 11.5 . Using a continuity correction, therefore, our probability
becomes P(8.5 < X < 11.5) in the normal distribution.
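To see how well the corrected interval works, the sketch below evaluates P(8.5 < X < 11.5) under N(10, 5) and compares it with the exact binomial probability of 9, 10 or 11 heads. The Normal cumulative distribution function is computed from the error function rather than a table, and the variable names are illustrative.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n, p = 20, 0.5
mu = n * p                              # 10
sigma = math.sqrt(n * p * (1 - p))      # square root of the variance 5

# Normal approximation with continuity correction: P(8.5 < X < 11.5)
approx = normal_cdf(11.5, mu, sigma) - normal_cdf(8.5, mu, sigma)

# Exact binomial probability: P(X = 9) + P(X = 10) + P(X = 11)
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(9, 12))

print(round(approx, 4), round(exact, 4))
```

The approximation gives about 0.498 against an exact value of about 0.497, so the two agree to within roughly 0.001.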
The exponential distribution obtains its name from the exponential function in the probability
density function. Plots of the exponential distribution for selected values of λ are shown in Fig.
4.4. For any value of λ, the exponential distribution is quite skewed.
Figure: Probability density function of exponential random variables for selected values of λ
\[ \mu = E(X) = \frac{1}{\lambda} \quad \text{and} \quad \sigma^2 = V(X) = \frac{1}{\lambda^2} \]
It is important to use consistent units in the calculation of probabilities, means, and variances
involving exponential random variables. The following example illustrates unit conversions.
In a large corporate computer network, user log-ons to the system can be modeled as a
Poisson process with a mean of 25 log-ons per hour. What is the probability that there are no
log-ons in an interval of 6 minutes?
SOLUTION
Let X denote the time in hours from the start of the interval until the first log-on. Then, X has an
exponential distribution with λ = 25 log-ons per hour. We are interested in the probability that X
exceeds 6 minutes. Because λ is given in log-ons per hour, we express all time units in hours.
That is, 6 minutes = 0.1 hour. The probability requested is shown as the shaded area under the
probability density function in Fig. 4.4. Therefore,
\[ P(X > 0.1) = \int_{0.1}^{\infty} 25 e^{-25x}\,dx = e^{-25(0.1)} = 0.082 \]
In the previous example, the probability that there are no log-ons in a 6-minute interval is 0.082
regardless of the starting time of the interval. A Poisson process assumes that events occur
uniformly throughout the interval of observation; that is, there is no clustering of events. If the
log-ons are well modeled by a Poisson process, the probability that the first log-on after noon
occurs after 12:06 P.M. is the same as the probability that the first log-on after 3:00 P.M. occurs
after 3:06 P.M. And if someone logs on at 2:22 P.M., the probability the next log-on occurs after
2:28 P.M. is still 0.082.
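Both points made here, consistent units and independence from the starting time, can be checked with a few lines of arithmetic. The sketch below uses P(X > t) = e^(−λt) for the exponential distribution. The elapsed time s is a hypothetical value chosen only for illustration.

```python
import math


def survival(t, rate):
    """P(X > t) for an exponential variable with the given rate: exp(-rate * t)."""
    return math.exp(-rate * t)


# The same probability in two consistent unit systems.
p_hours = survival(0.1, 25.0)            # 0.1 hour at 25 log-ons per hour
p_minutes = survival(6.0, 25.0 / 60.0)   # 6 minutes at 25/60 log-ons per minute

# Starting point does not matter: P(X > s + t | X > s) equals P(X > t).
s, t = 0.37, 0.1                         # s is a hypothetical elapsed time, in hours
conditional = survival(s + t, 25.0) / survival(s, 25.0)

print(round(p_hours, 3), round(p_minutes, 3), round(conditional, 3))
```

All three values should equal 0.082. Mixing units, for example using t = 6 with λ = 25 per hour, would instead give a value that is practically zero.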
Our starting point for observing the system does not matter. However, if there are high-use
periods during the day, such as right after 8:00 A.M., followed by a period of low use, a Poisson
process is not an appropriate model for log-ons and the distribution is not appropriate for
computing probabilities. It might be reasonable to model each of the high- and low-use periods by
a separate Poisson process, employing a larger value for λ during the high-use periods and a
smaller value otherwise. Then, an exponential distribution with the corresponding value of λ can
be used to calculate log-on probabilities for the high- and low-use periods.
ASSESSMENT TASK NO. 5
Random Variables and Probability Distributions
I. Discrete or continuous?
a. The time it takes a student selected at random to register for the fall semester
b. The number of bad checks drawn on Upright Bank on a day selected at random
c. The amount of gasoline needed to drive your car 200 miles
d. The number of traffic fatalities per year in the state of Florida
e. The distance a golf ball travels after being hit with a driver
f. The number of ships in Pearl Harbor on any given day
g. Your weight before breakfast each morning
II. Consider each Distribution. Determine if it is a valid distribution or not, and explain your
answer.
a. X 0 1 2
b. X 0 1 2
III. USA Today reported that approximately 25% of all state prison inmates released on
parole become repeat offenders while on parole. Suppose the parole board is examining
five prisoners up for parole. Let X = the number of prisoners out of five on parole who become
repeat offenders, with the values of X and their corresponding probabilities given below.
X 0 1 2 3 4 5
a. What is the probability that one or more of the five parolees will be repeat
offenders? How does this number relate to the probability that none of the parolees
will be repeat offenders?
b. Find the probability that two or more of the five parolees will be repeat offenders.
c. Find the probability that two or less of the five parolees will be repeat offenders.
d. Compute the mean number of repeat offenders out of five.
e. Compute the standard deviation of the number of repeat offenders out of five.
Problems:
(a) Linda is interested in toilet paper pulling preferences. She takes a simple random
sample of 5 people and asks each whether they always pull from the top or not. The
probability that a person pulls from the top is 0.53, and X= the number of people who
pull from the top.
(b) I roll a fair, 6-sided die until I get a two. X is the number of rolls it takes before I
obtain a roll of two.
(c) You have a bag containing 4 red chips and 6 white chips and you draw 4 chips. Let
random variable Y be the number of red chips drawn from the bag out of 4 draws
without replacement.
i. Five: P( X = 5 ) =
ii. More than three: P( X > 3 ) =
iii. Between 1 and 3, not including 3: P( 1 ≤ X < 3) =
iv. Less than 2: P ( X < 2) =
3. Nine percent of all men cannot distinguish between the colors red and green. This
is the type of color blindness that causes problems with traffic signals. If six men
are randomly selected for a study of traffic signal perceptions:
(a) Determine if X = the number of men that cannot distinguish between red and green
is a binomial random variable (check the conditions for the binomial setting).
(b) What is the distribution of X?
(c) Find the probability that exactly two of the six men cannot distinguish between red
and green.
(d) What are the mean and standard deviation of the random variable X?
a. What is the probability that it is violating the 70 mile per hour speed limit? Assume
that the speeds are normally distributed.
b. What is the probability that a randomly selected vehicle is not violating the speed
limit?
c. What is the probability that a randomly selected vehicle is traveling under 50 miles
per hour?
d. What is the probability that a randomly selected vehicle is traveling between 50 and
70 miles per hour?
3. A machine is used to put bolts into boxes. It does so such that the actual number
of bolts in a box is normally distributed with a mean of 106 and a standard
deviation of 2.
Understanding the Problem | Complete misunderstanding of the problem | Part of the problem misunderstood or misinterpreted | Complete understanding of the problem | /5
Planning a Solution | No attempt, or totally inappropriate plan | Partially correct plan based on part of the problem being interpreted correctly | Plan could have led to a correct solution if implemented properly | /5
Getting an Answer | No answer, or wrong answer based on an inappropriate plan | Copying error; computational error; partial answer for a problem with multiple answers | Correct answer and correct label for the answer | /5
TOTAL | /15