Lesson 4 - Continuous Probability Distributions (With Exercises)

The document discusses continuous probability distributions and key concepts such as probability density functions. It covers the normal distribution and how to calculate the expected value of continuous random variables. Examples are provided to demonstrate how to find the probability of an event and expected value using the density function.

Uploaded by

crisostomo.nenia
Copyright © All Rights Reserved

LESSON 4
Continuous Probability Distributions

At the end of the lesson, the student should be able to:

1. Estimate probabilities and identify unusual events.


CONTINUOUS PROBABILITY DISTRIBUTIONS

4.1 Continuous Random Variables and their Probability Distribution

A continuous random variable has a probability of zero of assuming exactly any of its
values. Consequently, its probability distribution cannot be given in tabular form. At first this may
seem startling, but it becomes more plausible when we consider a particular example. Let us
discuss a random variable whose values are the heights of all people over 21 years of age.
Between any two values, say 163.5 and 164.5 centimeters, or even 163.99 and 164.01
centimeters, there are an infinite number of heights, one of which is 164 centimeters. The
probability of selecting a person at random who is exactly 164 centimeters tall and not one of
the infinitely large set of heights so close to 164 centimeters that you cannot humanly measure
the difference is remote, and thus we assign a probability of zero to the event. This is not the
case, however, if we talk about the probability of selecting a person who is at least 163
centimeters but not more than 165 centimeters tall. Now we are dealing with an interval rather
than a point value of our random variable.

We shall concern ourselves with computing probabilities for various intervals of
continuous random variables such as P(a < X < b), P(W > c), and so forth. Note that when X is
continuous,

P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).

That is, it does not matter whether we include an endpoint of the interval or not. This is
not true, though, when X is discrete. Although the probability distribution of a continuous random
variable cannot be presented in tabular form, it can be stated as a formula. Such a formula
would necessarily be a function of the numerical values of the continuous random variable X
and as such will be represented by the functional notation f(x). In dealing with continuous
variables, f(x) is usually called the probability density function, or simply the density function
of X. Since X is defined over a continuous sample space, it is possible for f(x) to have a finite
number of discontinuities. However, most density functions that have practical applications in
the analysis of statistical data are continuous and their graphs may take any of several forms,
some of which are shown in Figure 4.1. Because areas will be used to represent probabilities and
probabilities are positive numerical values, the density function must lie entirely above the x axis.

Figure 4.1 Typical Density Functions

A probability density function is constructed so that the area under its curve bounded by the
x axis is equal to 1 when computed over the range of X for which f(x) is defined. Should this
range of X be a finite interval, it is always possible to extend the interval to include the entire
set of real numbers by defining f(x) to be zero at all points in the extended portions of the
interval. In Figure 4.2, the probability that X assumes a value between a and b is equal to the
shaded area under the density function between the ordinates at x = a and x = b, and from
integral calculus is given by:

$$P(a < X < b) = \int_a^b f(x)\,dx$$

Figure 4.2 P(a < X < b)

Example:

For the density function

$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere}, \end{cases}$$

find F(x), and use it to evaluate P(0 < X ≤ 1).

SOLUTION: For −1 < x < 2,

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} \frac{t^2}{3}\,dt = \left.\frac{t^3}{9}\right|_{-1}^{x} = \frac{x^3 + 1}{9}.$$

Therefore,

$$F(x) = \begin{cases} 0, & x < -1, \\ \dfrac{x^3 + 1}{9}, & -1 \le x < 2, \\ 1, & x \ge 2. \end{cases}$$

The cumulative distribution function F(x) is expressed graphically in Figure 4.3.


Now,

$$P(0 < X \le 1) = F(1) - F(0) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9}.$$

Figure 4.3 Continuous cumulative distribution function
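
The example above can be checked numerically. The sketch below is only an illustration, not part of the original lesson: it approximates the area under f(x) with a midpoint rule and compares it with the closed-form F(x). The helper name `integrate` and the number of subintervals are assumptions chosen for this check.

```python
def f(x):
    """Density from the example: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3.0 if -1.0 < x < 2.0 else 0.0


def F(x):
    """Closed-form cumulative distribution function derived in the example."""
    if x < -1.0:
        return 0.0
    if x < 2.0:
        return (x ** 3 + 1.0) / 9.0
    return 1.0


def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b] (hypothetical helper)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


total_area = integrate(f, -1.0, 2.0)      # should be close to 1
prob_numeric = integrate(f, 0.0, 1.0)     # P(0 < X <= 1) as an area
prob_from_cdf = F(1.0) - F(0.0)           # same probability from F(x)

print(f"total area   = {total_area:.6f}")
print(f"area on (0,1) = {prob_numeric:.6f}")
print(f"F(1) - F(0)  = {prob_from_cdf:.6f}")
```

All three printed values should agree with the hand computation: a total area of 1 and a probability of 1/9.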

4.2 Expected Values of Continuous Random Variables

Let X be a continuous random variable with range [a, b] and probability density function f(x). The
expected value of X is defined by

$$E(X) = \int_a^b x f(x)\,dx$$

Let’s see how this compares with the formula for a discrete random variable:
$$E(X) = \sum_{i=1}^{n} x_i\, p(x_i)$$

The discrete formula says to take a weighted sum of the values x_i of X, where the weights are
the probabilities p(x_i). Recall that f(x) is a probability density. Its units are prob/(unit of X).

So f(x) dx represents the probability that X is in an infinitesimal range of width dx around x. Thus
we can interpret the formula for E(X) as a weighted integral of the values x of X, where the
weights are the probabilities f(x) dx.

As before, the expected value is also called the mean or average.

Example:

Let X ∼ uniform(0, 1). Find E(X).

SOLUTION:

Since X has a range of [0, 1] and a density of f(x) = 1:

$$E(X) = \int_0^1 x\,dx = \left.\frac{x^2}{2}\right|_0^1 = \frac{1}{2}$$

Not surprisingly, the mean is at the midpoint of the range.

Example:

Let X have range [0, 2] and density f(x) = (3/8)x². Find E(X).

$$E(X) = \int_0^2 x f(x)\,dx = \int_0^2 \frac{3}{8}x^3\,dx = \left.\frac{3x^4}{32}\right|_0^2 = \frac{3}{2}$$

Does it make sense that the mean of this X is in the right half of its range?

Yes. Since the probability density increases as x increases over the range, the average value of
x should be in the right half of the range. The mean µ is “pulled” to the right of the midpoint 1
because there is more probability mass to the right.
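
As a rough numerical check of the two expected values above, the sketch below approximates the integral of x f(x) with a midpoint rule. The function name `expected_value` and the step count are assumptions made for this illustration only.

```python
def expected_value(density, a, b, n=100_000):
    """Approximate E(X) = integral of x * f(x) over [a, b] with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += x * density(x) * h    # value x weighted by the probability f(x) dx
    return total


# X ~ uniform(0, 1): density 1 on [0, 1]
mean_uniform = expected_value(lambda x: 1.0, 0.0, 1.0)

# Density (3/8) x^2 on [0, 2]
mean_increasing = expected_value(lambda x: 3.0 * x * x / 8.0, 0.0, 2.0)

print(f"uniform(0, 1) mean   = {mean_uniform:.4f}")
print(f"(3/8)x^2 on [0, 2] mean = {mean_increasing:.4f}")
```

The second mean lands to the right of the midpoint 1, matching the remark that the increasing density pulls the mean to the right.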

Properties of E(X)

The properties of E(X) for continuous random variables are the same as for discrete ones:

1. If X and Y are random variables on a sample space Ω then

E(X + Y) = E(X) + E(Y)

2. If a and b are constants then

E(aX + b) = aE(X) + b
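
The second property can be seen directly from the integral definition. The short derivation below is a sketch that assumes X has density f(x) on the whole real line and uses the rule for the expectation of a function of X stated in the next subsection, together with the fact that the total area under f(x) is 1.

```latex
\begin{aligned}
E(aX + b) &= \int_{-\infty}^{\infty} (ax + b)\, f(x)\,dx \\
          &= a \int_{-\infty}^{\infty} x f(x)\,dx + b \int_{-\infty}^{\infty} f(x)\,dx \\
          &= a\,E(X) + b \cdot 1 = a\,E(X) + b .
\end{aligned}
```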

Expectation of Functions of X

This works exactly the same as in the discrete case. If h(x) is a function, then Y = h(X) is a
random variable and

$$E(Y) = E(h(X)) = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx$$

Example:

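Let X ∼ exp(λ). Find E(X²).

$$E(X^2) = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx = \left[-x^2 e^{-\lambda x} - \frac{2x}{\lambda} e^{-\lambda x} - \frac{2}{\lambda^2} e^{-\lambda x}\right]_0^\infty = \frac{2}{\lambda^2}$$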
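
The result 2/λ² can be checked numerically for a particular λ. The sketch below is illustrative only: the value λ = 2, the truncation of the infinite upper limit at 40/λ, and the helper name `expectation_of_h` are all assumptions made for this check.

```python
import math


def expectation_of_h(h, density, a, b, n=200_000):
    """Midpoint-rule approximation of E(h(X)) = integral of h(x) f(x) over [a, b]."""
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * step
        total += h(x) * density(x) * step
    return total


lam = 2.0                      # assumed rate parameter, chosen for illustration
upper = 40.0 / lam             # beyond this point the density is about e^-40, negligible


def exp_density(x):
    """Exponential density with rate lam on x >= 0."""
    return lam * math.exp(-lam * x)


second_moment = expectation_of_h(lambda x: x * x, exp_density, 0.0, upper)
print(f"numeric E(X^2) = {second_moment:.6f}")
print(f"2 / lambda^2   = {2.0 / lam ** 2:.6f}")
```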

4.3 Normal Distribution

The Normal Distribution is the most important and most widely used continuous probability
distribution. It is the cornerstone of the application of statistical inference in analysis of data
because the distributions of several important sample statistics tend towards a Normal
distribution as the sample size increases.

Empirical studies have indicated that the Normal distribution provides an adequate
approximation to the distributions of many physical variables. Specific examples include
meteorological data, such as temperature and rainfall, measurements on living organisms,
scores on aptitude tests, physical measurements of manufactured parts, weights of contents of
food packages, volumes of liquids in bottles/cans, instrumentation errors and other deviations
from established norms, and so on.
The graphical appearance of the Normal distribution is a symmetrical bell-shaped curve that
extends without bound in both positive and negative directions.

The probability density function is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[\frac{-(x-\mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty; \quad -\infty < \mu < \infty,\ \sigma > 0$$

where μ and σ are parameters. These turn out to be the mean and standard deviation,
respectively, of the distribution. As a shorthand notation, we write X ~ N(μ, σ²).

The curve never actually reaches the horizontal axis but gets close to it beyond about 3
standard deviations on each side of the mean.

For any Normally distributed variable:

68.3% of all values will lie between μ −σ and μ + σ (i.e. μ ± σ )


95.45% of all values will lie within μ ± 2 σ
99.73% of all values will lie within μ ± 3 σ
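
These three percentages can be reproduced with the standard library. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later); the helper name `within` is an assumption for this illustration. Because the percentages do not depend on μ or σ, the standard normal distribution is used.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def within(k):
    """Probability that a Normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within mu +/- {k} sigma: {100 * within(k):.2f}%")
```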

The graphs below illustrate the effect of changing the values of μ and σ on the shape of the
probability density function. Low variability (σ = 0.71) with respect to the mean gives a pointed
bell-shaped curve with little spread. Variability of σ = 1.41 produces a flatter bell-shaped curve
with a greater spread.

Example:

The volume of water in commercially supplied fresh drinking water containers is approximately
Normally distributed with mean 70 litres and standard deviation 0.75 litres. Estimate the
proportion of containers likely to contain

(i) in excess of 70.9 litres, (ii) at most 68.2 litres, (iii) less than 70.5 litres.

SOLUTION

Let X denote the volume of water in a container, in litres. Then X ~ N(70, 0.75²), i.e. μ = 70,
σ = 0.75 and Z = (X − 70)/0.75.

(i) X = 70.9 ; Z = (70.9 − 70)/0.75 = 1.20


P(X > 70.9) = P(Z > 1.20) = 0.1151 or 11.51%

(ii) X = 68.2 ; Z = −2.40


P(X < 68.2) = P(Z < −2.40) = 0.0082 or 0.82%

(iii) X = 70.5 ; Z = 0.67


P(X > 70.5) = 0.2514 ; P(X < 70.5) = 0.7486 or 74.86%
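
The three answers can be cross-checked without tables. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are assumptions for this illustration. Note that part (iii) comes out near 0.7475 rather than 0.7486, because the software uses the unrounded Z = 0.6667 while the table lookup above rounds Z to 0.67.

```python
from statistics import NormalDist

# Volume of water in a container, in litres: X ~ N(70, 0.75^2)
volume = NormalDist(mu=70.0, sigma=0.75)

p_over_70_9 = 1.0 - volume.cdf(70.9)   # (i)   P(X > 70.9)
p_at_most_68_2 = volume.cdf(68.2)      # (ii)  P(X <= 68.2)
p_under_70_5 = volume.cdf(70.5)        # (iii) P(X < 70.5)

print(f"(i)   P(X > 70.9)  = {p_over_70_9:.4f}")
print(f"(ii)  P(X <= 68.2) = {p_at_most_68_2:.4f}")
print(f"(iii) P(X < 70.5)  = {p_under_70_5:.4f}")
```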

4.4 Normal Approximation to Binomial and Poisson Distribution


Binomial Approximation

The normal distribution can be used as an approximation to the binomial distribution, under
certain circumstances, namely:

If X ~ B(n, p) and if n is large and/or p is close to ½, then X is approximately N(np, npq)

(where q = 1 - p)

In some cases, working out a problem using the Normal distribution may be easier than using a
Binomial.

Poisson Approximation

The normal distribution can also be used to approximate the Poisson distribution for large
values of λ (the mean of the Poisson distribution).

If X ~ Po(λ) then for large values of λ, X ~ N(λ, λ) approximately.

Continuity Correction

The binomial and Poisson distributions are discrete, whereas the normal distribution is
continuous. We need to take this into account when we use the normal distribution to
approximate a binomial or Poisson distribution, by applying a continuity correction.

In the discrete distribution, each probability is represented by a rectangle (right hand diagram):

When working out probabilities, we want to include whole rectangles, which is what continuity
correction is all about.

Example
Suppose we toss a fair coin 20 times. What is the probability of getting between 9 and 11
heads (inclusive)?

SOLUTION

Let X be the random variable representing the number of heads thrown.


X ~ Bin(20, ½)

Since p is close to ½ (it equals ½!), we can use the normal approximation to the binomial. X ~
N(20 × ½, 20 × ½ × ½) so X ~ N(10, 5) .

In this diagram, the rectangles represent the binomial distribution and the curve is the normal
distribution:

We want P(9 ≤ X ≤ 11), which is the red shaded area. Notice that the first rectangle starts at 8.5
and the last rectangle ends at 11.5 . Using a continuity correction, therefore, our probability
becomes P(8.5 < X < 11.5) in the normal distribution.
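
To see how well the approximation works in this example, the sketch below compares the exact binomial probability with the normal approximation, with and without the continuity correction. It uses only `math.comb` and `statistics.NormalDist` from the Python standard library; the variable names are assumptions for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5

# Exact binomial probability P(9 <= X <= 11)
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(9, 12))

# Normal approximation N(np, npq) = N(10, 5)
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# With continuity correction: P(8.5 < X < 11.5)
approx_corrected = approx_dist.cdf(11.5) - approx_dist.cdf(8.5)

# Without continuity correction: P(9 < X < 11), which cuts off half of each end rectangle
approx_uncorrected = approx_dist.cdf(11.0) - approx_dist.cdf(9.0)

print(f"exact binomial            = {exact:.4f}")
print(f"normal, with correction   = {approx_corrected:.4f}")
print(f"normal, no correction     = {approx_uncorrected:.4f}")
```

The corrected value is close to the exact probability of about 0.497, while the uncorrected value is noticeably too small.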

4.5 Exponential Distribution

The exponential distribution obtains its name from the exponential function in the probability
density function. Plots of the exponential distribution for selected values of λ are shown in Fig.
4.4. For any value of λ, the exponential distribution is quite skewed.

Figure: Probability density function of exponential random variables for selected values of λ

If the random variable X has an exponential distribution with parameter λ, then

$$\mu = E(X) = \frac{1}{\lambda} \quad \text{and} \quad \sigma^2 = V(X) = \frac{1}{\lambda^2}$$

It is important to use consistent units in the calculation of probabilities, means, and variances
involving exponential random variables. The following example illustrates unit conversions.

In a large corporate computer network, user log-ons to the system can be modeled as a
Poisson process with a mean of 25 log-ons per hour. What is the probability that there are no
log-ons in an interval of 6 minutes?

SOLUTION

Let X denote the time in hours from the start of the interval until the first log-on. Then, X has an
exponential distribution with λ = 25 log-ons per hour. We are interested in the probability that X
exceeds 6 minutes. Because λ is given in log-ons per hour, we express all time units in hours.
That is, 6 minutes = 0.1 hour. The probability requested is shown as the shaded area under the
probability density function in Fig. 4.4. Therefore,

$$P(X > 0.1) = \int_{0.1}^{\infty} 25 e^{-25x}\,dx = e^{-25(0.1)} = 0.082$$

Figure 4.4 Probability for the exponential distribution
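
The calculation above, and the point about consistent units, can be illustrated with a few lines of Python. This is only a sketch with assumed variable names: it evaluates the probability once with everything in hours and once with everything in minutes, and the two must agree.

```python
import math

# Everything in hours: rate 25 log-ons per hour, interval 6 minutes = 0.1 hour
rate_per_hour = 25.0
interval_hours = 6.0 / 60.0
p_hours = math.exp(-rate_per_hour * interval_hours)

# Everything in minutes: rate 25/60 log-ons per minute, interval 6 minutes
rate_per_minute = 25.0 / 60.0
interval_minutes = 6.0
p_minutes = math.exp(-rate_per_minute * interval_minutes)

# Mixing units (rate per hour with time in minutes) gives a meaningless answer
p_mixed_units = math.exp(-rate_per_hour * interval_minutes)

print(f"hours:   P(X > 0.1 h)  = {p_hours:.4f}")
print(f"minutes: P(X > 6 min)  = {p_minutes:.4f}")
print(f"mixed units (wrong)    = {p_mixed_units:.3e}")
```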

In the previous example, the probability that there are no log-ons in a 6-minute interval is 0.082
regardless of the starting time of the interval. A Poisson process assumes that events occur
uniformly throughout the interval of observation; that is, there is no clustering of events. If the
log-ons are well modeled by a Poisson process, the probability that the first log-on after noon
occurs after 12:06 P.M. is the same as the probability that the first log-on after 3:00 P.M. occurs
after 3:06 P.M. And if someone logs on at 2:22 P.M., the probability the next log-on occurs after
2:28 P.M. is still 0.082.
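
This "starting point does not matter" behaviour is often called the lack-of-memory property of the exponential distribution. The sketch below checks it numerically for the log-on example: having already waited s hours, the conditional probability of waiting at least 0.1 hour more is P(X > s + 0.1) / P(X > s). The function name `survival` and the chosen values of s are assumptions for this illustration.

```python
import math

lam = 25.0      # log-ons per hour, from the example
t = 0.1         # 6 minutes, expressed in hours


def survival(x):
    """P(X > x) for an exponential random variable with rate lam."""
    return math.exp(-lam * x)


unconditional = survival(t)

# Conditional probability of waiting t more hours, given s hours have already passed
conditionals = [survival(s + t) / survival(s) for s in (0.0, 0.5, 2.37)]

print(f"P(X > 0.1)                    = {unconditional:.4f}")
for s, c in zip((0.0, 0.5, 2.37), conditionals):
    print(f"P(X > {s} + 0.1 | X > {s}) = {c:.4f}")
```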

Our starting point for observing the system does not matter. However, if there are high-use
periods during the day, such as right after 8:00 A.M., followed by a period of low use, a Poisson
process is not an appropriate model for log-ons and the exponential distribution is not
appropriate for computing probabilities. It might be reasonable to model each of the high- and
low-use periods by a separate Poisson process, employing a larger value for λ during the
high-use periods and a smaller value otherwise. Then, an exponential distribution with the
corresponding value of λ can be used to calculate log-on probabilities for the high- and low-use
periods.

ASSESSMENT TASK NO. 5
Random Variables and Probability Distributions

I. Discrete or continuous?
a. The time it takes a student selected at random to register for the fall semester
b. The number of bad checks drawn on Upright Bank on a day selected at random
c. The amount of gasoline needed to drive your car 200 miles
d. The number of traffic fatalities per year in the state of Florida
e. The distance a golf ball travels after being hit with a driver
f. The number of ships in Pearl Harbor on any given day
g. Your weight before breakfast each morning

II. Consider each distribution. Determine whether it is a valid probability distribution, and
explain your answer.

a.
X      0     1     2
P(x)   0.25  0.60  0.15

b.
X      0     1     2
P(x)   0.25  0.60  0.20

III. USA Today reported that approximately 25% of all state prison inmates released on
parole become repeat offenders while on parole. Suppose the parole board is examining
five prisoners up for parole. Let x = the number of prisoners out of the five parolees who
become repeat offenders. The values of x and their corresponding probabilities are:

X      0      1      2      3      4      5
P(x)   0.237  0.396  0.264  0.088  0.015  0.001

a. What is the probability that one or more of the five parolees will be repeat
offenders? How does this number relate to the probability that none of the parolees
will be repeat offenders?
b. Find the probability that two or more of the five parolees will be repeat offenders.
c. Find the probability that two or fewer of the five parolees will be repeat offenders.
d. Compute the mean number of repeat offenders out of five.
e. Compute the standard deviation of the number of repeat offenders out of five.

ASSESSMENT TASK NO. 6


Binomial Distribution

Problems:

1. Determine in which of the following situations a binomial distribution can be
applied. If so, state and graph the distribution of X, and find the mean and
standard deviation of X. If not, state which of the four conditions to satisfy the
binomial distribution requirements has been violated.

(a) Linda is interested in toilet paper pulling preferences. She takes a simple random
sample of 5 people and asks each whether they always pull from the top or not. The
probability that a person pulls from the top is 0.53, and X= the number of people who
pull from the top.
(b) I roll a fair, 6-sided die until I get a two. X is the number of rolls it takes before I
obtain a roll of two.
(c) You have a bag containing 4 red chips and 6 white chips and you draw 4 chips. Let
random variable Y be the number of red chips drawn from the bag out of 4 draws
without replacement.

2. From the information in 1 (a), answer the following:


(a) Find the probability that the number of people who pull from the top is:

i. Five: P( X = 5 ) =
ii. More than three: P( X > 3 ) =
iii. Between 1 and 3, not including 3: P( 1 ≤ X < 3) =
iv. Less than 2: P ( X < 2) =

3. Nine percent of all men cannot distinguish between the colors red and green. This
is the type of color blindness that causes problems with traffic signals. If six men
are randomly selected for a study of traffic signal perceptions:

(a) Determine if X = the number of men that cannot distinguish between red and green
is a binomial random variable (check the conditions for the binomial setting).
(b) What is the distribution of X?
(c) Find the probability that exactly two of the six men cannot distinguish between red
and green.
(d) What are the mean and standard deviation of the random variable X?

ASSESSMENT TASK NO. 7


NORMAL DISTRIBUTION
Problems:

1. The average speed of vehicles traveling on a stretch of highway is 67 miles per
hour with a standard deviation of 3.5 miles per hour. A vehicle is selected at
random.

a. What is the probability that it is violating the 70 mile per hour speed limit? Assume
that the speeds are normally distributed.
b. What is the probability that a randomly selected vehicle is not violating the speed
limit?
c. What is the probability that a randomly selected vehicle is traveling under 50 miles
per hour?
d. What is the probability that a randomly selected vehicle is traveling between 50 and
70 miles per hour?

2. A customer calling a call center spends an average of 45 minutes on hold during
the peak season, with a standard deviation of 12 minutes. Suppose these times
are normally distributed. Find the probability that the customer will be on hold for
each interval of time:

a. More than 54 minutes.
b. Less than 24 minutes.
c. Between 24 and 54 minutes.
d. More than 39 minutes.

3. A machine is used to put bolts into boxes. It does so such that the actual number
of bolts in a box is normally distributed with a mean of 106 and a standard
deviation of 2.

a. Draw and label the Normal curve from the information.
b. What percentage of boxes contains more than 104 bolts?
c. What percentage of boxes contains more than 110 bolts?
d. What percentage of boxes contains fewer than 108 bolts?
e. What percentage of boxes contains fewer than 100 bolts?

4. In the accompanying diagram, the shaded area represents approximately 95% of
the scores on a standardized test. If these scores ranged from 78 to 92,
a. What is the mean?
b. What is the standard deviation?

PROBLEM SOLVING RUBRIC

| SCALE | Emerging (0 points) | Developing (3 points) | Proficient (5 points) | SCORE |
| Understanding the Problem | Complete misunderstanding of the problem | Part of the problem misunderstood or misinterpreted | Complete understanding of the problem | /5 |
| Planning a Solution | No attempt, or totally inappropriate plan | Partially correct plan based on part of the problem being interpreted correctly | Plan could have led to a correct solution if implemented properly | /5 |
| Getting an Answer | No answer, or wrong answer based on an inappropriate plan | Copying error; computational error; partial answer for a problem with multiple answers | Correct answer and correct label for the answer | /5 |
| TOTAL | | | | /15 |
