
Engg Data Analysis Lesson 4 Continuous Probability Distribution Part 2 v2


DATAENG

(Engineering Data Analysis)

Lesson 4: Continuous Probability Distribution


The Normal Distribution
Approximation of Binomial and Poisson using Normal Distribution
The Exponential Distribution


The Normal Distribution
• Perhaps the most widely used of all
continuous probability distributions is the
normal distribution.
• The normal probability density function is bell-shaped and centered at the mean (μ).
• Its spread is measured by the standard deviation (σ).
• These two parameters, μ and σ, completely determine the location and spread of the normal density function.
The Normal Distribution
• Many naturally occurring measurements tend
to have relative frequency distributions closely
resembling the normal curve.
• For example, heights of adult males tend to
have a distribution that shows many
measurements clumped around the mean
height, with relatively few very short or very
tall males in the population.
The Normal Distribution
• Any time responses tend to be averages of
independent quantities, the normal distribution
tends to provide a reasonably good model for
their relative frequency behavior.
• In contrast, lifetimes of biological organisms or
electronic components tend to have relative
frequency distributions that are neither normal
nor close to normal.
Some Useful Properties of the
Normal Distribution
• The mean μ determines the location of the distribution.
• The standard deviation σ determines the spread of the distribution. A normal distribution with a larger σ is shorter and more spread out.
• μ is the mean, the median, and the mode of the distribution.
Some Useful Properties of the
Normal Distribution
• Any linear function of a normally distributed
random variable is also normally distributed.
• That is, if X has a normal distribution with mean μ and variance σ², and Y = aX + b for constants a and b, then Y is also normally distributed with mean aμ + b and variance a²σ².
Standardizing the Normal Curve
• The difficulty encountered in solving
integrals of normal density functions
necessitates the tabulation of normal
curve areas for quick reference.
• However, it would be a hopeless task to
attempt to set up separate tables for every
conceivable value of μ and σ.
Standardizing the Normal Curve
• Fortunately, we are able to transform all the
observations of any normal random
variable X into a new set of observations of
a normal random variable Z with mean 0
and variance 1.
• This can be done by means of the
transformation
Z = (X − μ)/σ
Standard Normal Distribution
Standardizing the Normal Curve
• Whenever X assumes a value x, the
corresponding value of Z is given by
z = (x − μ)/σ.
• Therefore, if X falls between the values x =
x1 and x = x2, the random variable Z will
fall between the corresponding values
z1 = (x1 −μ)/σ and z2 = (x2 − μ)/σ.
Standardizing the Normal Curve
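\[ P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\,dx \]
\[ Z = \frac{X-\mu}{\sigma} \;\Rightarrow\; X = \sigma Z + \mu \;\Rightarrow\; dX = \sigma\,dZ \]
\[ P(z_1 < Z < z_2) = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{1}{2}z^2}\,dz \]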
Standard normal distribution
probabilities
The c.d.f. of the standard normal
distribution cannot be expressed
analytically in terms of elementary
functions. Therefore, probabilities for the
standard normal distribution or any other
normal distribution can be found only by
numerical approximations or by using a
table of values such as the ones given at
the end of textbooks in statistics.
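As a numerical alternative to the printed table, the standard normal c.d.f. can be evaluated through the error function, using the identity Φ(z) = ½[1 + erf(z/√2)]. The following is a minimal sketch using only Python's standard library; the helper name `phi` is illustrative and does not come from the slides.

```python
import math


def phi(z: float) -> float:
    """Standard normal c.d.f.: area under the curve to the LEFT of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Print left-tail areas in the same six-decimal style as a z table.
    for z in (-3.57, -1.47, 0.0, 0.52):
        print(f"P(Z <= {z:+.2f}) = {phi(z):.6f}")
```

The printed values can be compared directly with the table entries used in the lookups that follow.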
Area under the
standard normal curve

P(x1 < X < x2) = area of the shaded region


Standard normal cumulative areas tabulated
in statistical tables refer to the shaded
area TO THE LEFT of the z score.
Find P(Z ≤ −3.57)
Read the table row for z to its 1st decimal (−3.5) and the column for the 2nd decimal (0.07). The table entry is the area, or probability P.
P(Z ≤ −3.57) is 0.000179
P(Z ≤ −1.47) is 0.070781 (z is closer to zero from the left)
P(Z ≤ +0.52) is 0.698468
Illustration on how to get the area between the two z scores z = 1.25 and z = –0.38
Example
Find the indicated probability.
a) P(0 ≤ Z ≤ 1.2) = 0.884930 – 0.5 = 0.384930
b) P(-0.9 ≤ Z ≤ 0) = 0.5 – 0.184060 = 0.315940
c) P(0.3 ≤ Z ≤ 1.56) = 0.940620 – 0.617911 = 0.322709
d) P(-0.2 ≤ Z ≤ 0.2) = 0.579260 – 0.420740 = 0.158520
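Each probability above is a difference of two left-tail areas. A short sketch of that subtraction, assuming Python 3.8+ and its `statistics.NormalDist` class in place of the printed table; the helper name `between` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def between(lo: float, hi: float) -> float:
    """P(lo <= Z <= hi) as the left-tail area at hi minus the left-tail area at lo."""
    return Z.cdf(hi) - Z.cdf(lo)


if __name__ == "__main__":
    # The four intervals from examples a) to d).
    for lo, hi in [(0, 1.2), (-0.9, 0), (0.3, 1.56), (-0.2, 0.2)]:
        print(f"P({lo} <= Z <= {hi}) = {between(lo, hi):.6f}")
```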
Example
Find z0 as indicated.
a) P(Z ≤ z0) = 0.5 (or half)
“≤ z0” means “LEFT side of z0”
ANSWER: z0 = 0
Example
Find z0 as indicated.
b) P(Z ≤ z0) = 0.8749 (more than half)
“≤ z0” means “LEFT side of z0”, so z0 > 0
Here the probability is known and z is unknown. Search the body of the table for 0.8749, then read z from the row (1st decimal: 1.1) and the column (2nd decimal: 0.05). MS Excel gives the same result.
ANSWER: P(Z ≤ 1.15) = 0.8749, so z0 = 1.15
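Example
Find z0 as indicated.
c) P(Z ≥ z0) = 0.117 (less than half)
Why? The table gives areas to the LEFT, so look up P(Z ≤ z0) = 1 − 0.117 = 0.883.
z0 = 1.19
Example
Find z0 as indicated.
d) P(Z ≥ z0) = 0.617
Why? P(Z ≤ z0) = 1 − 0.617 = 0.383, which is less than half, so z0 < 0.
z0 = −0.30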
Example
Find z0 as indicated.
e) P(-z0 ≤ Z ≤ z0) = 0.90
Find -z0 and z0
Area: 0.05 (left tail), 0.90 (middle), 0.05 (right tail)
Example
Find z0 as indicated.
e) P(-z0 ≤ Z ≤ z0) = 0.95
Area: 0.025 (left tail), 0.95 (middle), 0.025 (right tail)
Find -z0: P(Z < -z0) = 0.025
The probability is known, so search the body of the table for 0.025 and read z from the row (1st decimal) and the column (2nd decimal). MS Excel gives the same result.
ANSWER: P(Z ≤ -1.96) = 0.025, so -z0 = -1.96
Find z0: P(Z < z0) = 0.975, so z0 = 1.96
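The reverse lookups above (probability known, z unknown) correspond to the inverse c.d.f. A minimal sketch, assuming Python's `statistics.NormalDist.inv_cdf` stands in for searching the table body; the three helper names are illustrative, not from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal


def z_left(p: float) -> float:
    """z0 such that P(Z <= z0) = p."""
    return Z.inv_cdf(p)


def z_right(p: float) -> float:
    """z0 such that P(Z >= z0) = p, converted to a left-tail area first."""
    return Z.inv_cdf(1.0 - p)


def z_central(p: float) -> float:
    """Positive z0 such that P(-z0 <= Z <= z0) = p; each tail holds (1 - p) / 2."""
    return Z.inv_cdf(0.5 + p / 2.0)


if __name__ == "__main__":
    print(round(z_left(0.8749), 2))
    print(round(z_right(0.117), 2))
    print(round(z_right(0.617), 2))
    print(round(z_central(0.95), 2))
```

Converting every request to a left-tail area first mirrors how the printed table is organized.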
Example
Find z0 as indicated.
e) P(Z ≤ z0) = 0.75 (more than half). Why? z0 = ???
f) P(Z ≥ z0) = 0.105 (less than half). Why? z0 = ???
g) P(Z ≥ z0) = 0.745. Why? z0 = ???
h) P(-z0 ≤ Z ≤ z0) = 0.80. Find -z0 and z0. Area: ???, ???, ???

Example
The IQs of 600 applicants to a certain
college are approximately normally
distributed with a mean of 115 and a
standard deviation of 12. If the college
requires an IQ of at least 95, how many of
these students will be rejected on this
basis of IQ, regardless of their other
qualifications?
Figure: the normal curve of X for the IQs of the 600 applicants (mean = 115) shown above the standard normal curve of Z (mean = 0), linked by Z = (X − μ)/σ.
Applet: https://homepage.divms.uiowa.edu/~mbognar/applets/normal.html
Solution
Let X = IQ of an applicant.
Then Z = (X − μ)/σ.
The cut-off is Z = (95 − 115)/12 = −1.67.
A Z score of −1.67 leaves an area to its left of 0.0475.
In short, P(X < 95) = P(Z < −1.67) = 0.0475.
Hence, number of students rejected = (600)(0.0475) ≈ 29.
Figure: area = 0.0475 to the left of Z = −1.67
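The IQ example can be checked without standardizing by hand. This is a sketch assuming Python's `statistics.NormalDist`; the variable names are illustrative. Because it does not round z to −1.67, the probability differs slightly from the table value of 0.0475, but the rounded head count is the same.

```python
from statistics import NormalDist

# X = IQ of an applicant, approximately normal with mean 115 and SD 12.
iq = NormalDist(mu=115, sigma=12)

p_reject = iq.cdf(95)               # P(X < 95), the area left of the cut-off
expected_rejected = 600 * p_reject  # expected number out of 600 applicants

if __name__ == "__main__":
    print(f"P(X < 95) = {p_reject:.4f}")
    print(f"Expected rejections = {expected_rejected:.1f}")
```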
Example
6.61) Wires manufactured for use in a certain
computer system are specified to have
resistances between 0.12 and 0.14 ohm. The
actual measured resistances of the wires produced
by Company A have a normal probability
distribution with a mean of 0.13 ohm and a
standard deviation of 0.005 ohm.
Figure: Justification of the Empirical Rule (areas under the normal distribution curve)
Example
mean or μ = 0.13
SD or σ = 0.005

Random variable (X) is resistance


Example
(a) What is the probability that a randomly
selected wire from Company A’s production will
meet the specifications?

Solution to #6.61
Given: μ = 0.13 Ω
σ = 0.005 Ω
(a) Z lower limit = (0.12 − 0.13)/0.005 = −2
Z upper limit = (0.14 − 0.13)/0.005 = 2
P(−2 < Z < 2) = P(Z < 2) − P(Z < −2)
= 0.9772 − 0.0228
= 0.9544
Example
(b) If four such wires are used in the system and
all are selected from Company A, what is the
probability that all four will meet the
specifications?

Use the probability multiplication rule, assuming independence of the four outcomes.
Solution to #6.61
(b) From (a), we have
P(−2 < Z < 2) = 0.9544
Assuming the outcome of each event is independent of the outcome of the others,
P = (0.9544)^4 = 0.8297
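Both parts of the wire problem can be reproduced in a few lines. A sketch assuming Python's `statistics.NormalDist`; the variable names are hypothetical. Unrounded areas give values that differ from the four-decimal table results only in the last digit or so.

```python
from statistics import NormalDist

# X = resistance of a wire from Company A, in ohms.
resistance = NormalDist(mu=0.13, sigma=0.005)

# (a) P(0.12 <= X <= 0.14): one wire meets the specification.
p_one = resistance.cdf(0.14) - resistance.cdf(0.12)

# (b) Multiplication rule for four independent wires.
p_all_four = p_one ** 4

if __name__ == "__main__":
    print(f"(a) {p_one:.4f}")
    print(f"(b) {p_all_four:.4f}")
```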
Example
6.65) Sick-leave time used by employees of a
firm in one month has approximately a
normal distribution with a mean of 200
hours and a variance of 400.

(a) Find the probability that total sick leave for next month will be less than 150 hours.
(b) In planning schedules for next month, how much time should be budgeted for sick leave if that amount is to be exceeded with a probability of only 0.10?
Example
mean or μ = 200 hours
var or σ2 = 400
Random variable (X) is sick-leave time used in one month
Solution to #6.65
Given: μ = 200 h
σ = √400 h = 20 h
(a) Z = (150 − 200)/20 = −2.5
P(Z < −2.5) = 0.0062
Solution to #6.65
Let X = the random variable denoting sick-leave time in a month.
(b) P(Z < Zmax) = 0.90
From the statistical tables and interpolating, Zmax = 1.28.
Thus,
Z = 1.28 = (Xmax − 200)/20
Xmax = 225.6 h
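The sick-leave problem uses the c.d.f. in part (a) and its inverse in part (b). A minimal sketch, assuming Python's `statistics.NormalDist`; the names `sick_leave`, `p_under_150`, and `budget` are illustrative.

```python
import math
from statistics import NormalDist

# X = monthly sick-leave hours; variance 400 means SD = sqrt(400) = 20.
sick_leave = NormalDist(mu=200, sigma=math.sqrt(400))

# (a) P(X < 150)
p_under_150 = sick_leave.cdf(150)

# (b) Budget exceeded with probability 0.10, i.e. the 90th percentile of X.
budget = sick_leave.inv_cdf(0.90)

if __name__ == "__main__":
    print(f"(a) {p_under_150:.4f}")
    print(f"(b) {budget:.1f} hours")
```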
Example
6.110) A machine operation produces bearings with diameters that are normally distributed with mean and standard deviation equal to 3.0005 and 0.001, respectively. Customer specifications require the bearing diameters to lie in the interval 3.000 ± 0.0020. Those outside the interval are considered “scrapped” and must be re-machined or used as stock for smaller bearings. With the existing machine setting, what fraction of total production will be scrapped?
Example
mean or μ = 3.0005
SD or σ = 0.001
Random variable (X) is bearing diameter
Solution to #6.110
Given: μ = 3.0005
σ = 0.001
Z upper limit = [(3.000 + 0.0020) − 3.0005]/0.001 = 1.5
Z lower limit = [(3.000 − 0.0020) − 3.0005]/0.001 = −2.5
P(−2.5 < Z < 1.5) = P(Z < 1.5) − P(Z < −2.5)
= 0.9332 − 0.0062
= 0.927
Thus, (1 − 0.927) or 7.3% will be scrap.
Example
6.111) Refer to problem 6.110. Suppose five bearings are drawn from production. What is the probability that at least one will be defective?
Solution to #6.111
The probability that a randomly selected bearing will be scrapped is 0.073 (from the previous problem).
Let X = random variable denoting the number of scrapped bearings. This becomes a binomial distribution problem.
Thus,
\[ P(X \ge 1) = 1 - P(X = 0) = 1 - \binom{5}{0}(0.073)^0(1-0.073)^5 = 1 - 0.6845 = 0.3155 \]
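Problems 6.110 and 6.111 chain a normal probability into a binomial one. The sketch below assumes Python's `statistics.NormalDist` and treats the five bearings as independent, as the binomial model requires; all names are illustrative.

```python
from statistics import NormalDist

# X = bearing diameter under the existing machine setting.
diameter = NormalDist(mu=3.0005, sigma=0.001)

# 6.110: fraction inside the specification 3.000 +/- 0.0020, then the scrap fraction.
p_in_spec = diameter.cdf(3.000 + 0.0020) - diameter.cdf(3.000 - 0.0020)
p_scrap = 1.0 - p_in_spec

# 6.111: at least one scrapped bearing among five = 1 - P(none scrapped).
p_at_least_one = 1.0 - (1.0 - p_scrap) ** 5

if __name__ == "__main__":
    print(f"scrap fraction = {p_scrap:.3f}")
    print(f"P(at least one of five scrapped) = {p_at_least_one:.4f}")
```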
Normal Distribution Approximation to the Binomial and Poisson

Normal Approximations
• The binomial and Poisson distributions become more bell-shaped and symmetric as their mean values increase.
• For manual calculations, the normal approximation is practical; exact probabilities of the binomial and Poisson, with large means, require technology (Minitab, Excel).
• The normal distribution is a good approximation for:
– Binomial if np > 5 and n(1-p) > 5.
– Poisson if λ > 5.

Sec 4-7 Normal Approximation to the Binomial & Poisson
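Recall: Binomial Distribution
\[ f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n \]
where \binom{n}{x} is the number of combinations of n items taken x at a time.
Sec 3-6 Binomial Distribution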
Exercise 3-18: Organic Pollution-1
Find the probability that, in the next
18 samples, exactly 2 contain the
pollutant (success).

n = 18



Exercise 3-18: Organic Pollution-2
Determine the probability that at least 4 samples contain the pollutant.

Answer:
success → with pollutant
at least 4 → 4, 5, 6, …, 18
P = P(X = 4) + P(X = 5) + … + P(X = 18) (15 terms)
\[ f(x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \ldots, n \tag{3-7} \]
Exercise 3-18: Organic Pollution-2
P = 0.09819 (sum of the 15 terms)
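The 15-term sum is tedious by hand but short in code. This sketch uses only Python's standard library. The slides as extracted give n = 18 but not p; the value p = 0.1 below is an assumption, chosen because it reproduces the stated answer P = 0.09819. The helper name `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability f(x) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


n = 18
p = 0.1  # ASSUMPTION: not stated in the extracted slides

# Exactly 2 samples contain the pollutant.
p_exactly_2 = binom_pmf(2, n, p)

# At least 4: add the 15 terms for x = 4, 5, ..., 18.
p_at_least_4 = sum(binom_pmf(x, n, p) for x in range(4, n + 1))

if __name__ == "__main__":
    print(f"P(X = 2)  = {p_exactly_2:.4f}")
    print(f"P(X >= 4) = {p_at_least_4:.5f}")
```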


Normal Approximation to the Binomial Distribution
Figure: The effect of n and p on the shape of the binomial probability function.
Example 4-18: Applying the Approximation
In a digital communication channel, assume that the number of bits received in error can be modeled by a binomial random variable. The probability that a bit is received in error is 10^-5. If 16 million bits are transmitted, what is the probability that 150 or fewer errors occur?
Recall: Binomial Distribution
What are the values of n and x?
How many times will f(x) be computed?
Example 4-18: Applying the Approximation

Solution:
Here np = (16 × 10^6)(10^-5) = 160 and np(1 − p) = 160(1 − 10^-5).
\[ P(X \le 150) = P(X \le 150.5) = P\left(\frac{X-160}{\sqrt{160(1-10^{-5})}} \le \frac{150.5-160}{\sqrt{160(1-10^{-5})}}\right) \]
\[ \approx P\left(Z \le \frac{-9.5}{12.6491}\right) = P(Z \le -0.75104) = 0.2263 \]
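The approximation in Example 4-18 can be sketched as a small function: standardize x + 0.5 (the continuity correction) with mean np and standard deviation √(np(1 − p)). This assumes Python's `statistics.NormalDist`; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def binom_cdf_normal_approx(x: int, n: int, p: float) -> float:
    """Approximate P(X <= x) for X ~ Binomial(n, p) with a continuity-corrected normal."""
    mean = n * p
    sd = sqrt(n * p * (1.0 - p))
    z = (x + 0.5 - mean) / sd  # continuity correction: use x + 0.5
    return NormalDist().cdf(z)


# 16 million bits, each in error with probability 1e-5; 150 or fewer errors.
p_150_or_fewer = binom_cdf_normal_approx(150, 16_000_000, 1e-5)

if __name__ == "__main__":
    print(f"P(X <= 150) is approximately {p_150_or_fewer:.4f}")
```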




Normal Approximation to the Poisson

If X is a Poisson random variable with E(X) = λ and V(X) = λ, then
\[ Z = \frac{X - \lambda}{\sqrt{\lambda}} \tag{4-13} \]
is approximately a standard normal random variable. The same continuity correction used for the binomial distribution can also be applied. The approximation is good for λ > 5.


Example 4-20: Normal Approximation to Poisson
Assume that the number of
asbestos particles in a square
meter of dust on a surface follows
a Poisson distribution with a
mean of 1000.
If a square meter of dust is
analyzed, what is the probability
that 950 or fewer particles are
found?
Example 4-20: Normal Approximation to Poisson

Solution:
\[ P(X \le 950) = \sum_{x=0}^{950} \frac{e^{-1000}\,1000^x}{x!} \quad \text{... too hard manually!} \]
The probability can be approximated as
\[ P(X \le 950) = P(X \le 950.5) \approx P\left(Z \le \frac{950.5 - 1000}{\sqrt{1000}}\right) = P(Z \le -1.57) = 0.058 \]


Example 4-20: Normal Approximation to
Poisson

Using Excel
0.0578 = POISSON(950,1000,TRUE)
0.0588 = NORMDIST(950.5, 1000, SQRT(1000), TRUE)
1.6% = (0.0588 - 0.0578) / 0.0578 = percent error
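The same comparison can be made outside Excel. The sketch below uses only Python's standard library; it sums the Poisson terms in log space (via `math.lgamma`) because e^-1000 and 1000^x cannot be represented directly in floating point. The function name `poisson_cdf` is illustrative.

```python
from math import exp, fsum, lgamma, log, sqrt
from statistics import NormalDist


def poisson_cdf(k: int, lam: float) -> float:
    """Exact P(X <= k) for X ~ Poisson(lam), with each term computed in log space."""
    return fsum(exp(-lam + x * log(lam) - lgamma(x + 1)) for x in range(k + 1))


lam = 1000
exact = poisson_cdf(950, lam)
# Normal approximation with the continuity correction (950 -> 950.5).
approx = NormalDist(mu=lam, sigma=sqrt(lam)).cdf(950.5)
relative_error = (approx - exact) / exact

if __name__ == "__main__":
    print(f"exact  = {exact:.4f}")
    print(f"approx = {approx:.4f}")
    print(f"relative error = {relative_error:.1%}")
```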



Exponential Distribution

The Exponential Distribution
The exponential distribution is a continuous
distribution that is sometimes used to model
the time that elapses before an event occurs.
– Such a time is often called a waiting time. For
example, the exponential distribution is
sometimes used to model the lifetime of a
component.
– The waiting time is represented by X. In some applications, X may represent a quantity that does not measure time.
The Exponential Distribution
The parameter λ of the exponential density function is a constant that determines the rate at which the curve decreases. Note that μ = 1/λ.
Example
The number of traffic accidents at a certain intersection is thought to be well-modeled by a Poisson process with a mean of 3 accidents per year.
(a) Find the mean waiting time between accidents.
(b) Find the standard deviation of the waiting times between accidents.
(c) Find the probability that more than one year elapses between accidents.
(d) Find the probability that less than one month elapses between accidents.
(e) If no accidents have occurred within the last six months, what is the probability that an accident will occur within the next year?
Solution
(a) μ = 1/λ = 1/(3 accidents/year) = 1/3 year = 4 months
(b) V(X) = μ² = 4² = 16
So, σ = √16 = 4 months
Solution
(c) Let X = time between accidents (in months)
\[ P(X > 12) = 1 - P(0 \le X \le 12) = 1 - \int_0^{12} \frac{1}{4} e^{-x/4}\,dx = 1 - \left[-e^{-x/4}\right]_0^{12} = 1 + \left[e^{-12/4} - 1\right] = e^{-3} \]
Solution
(d) Let X = time between accidents (in months)
\[ P(X < 1) = \int_0^{1} \frac{1}{4} e^{-x/4}\,dx = \left[-e^{-x/4}\right]_0^{1} = -\left[e^{-1/4} - 1\right] = 1 - e^{-1/4} \]
Solution
(e) Let X = time between accidents (in months)
\[ P(X < 18 \mid X > 6) = \frac{P(X < 18 \cap X > 6)}{P(X > 6)} = \frac{\int_6^{18} \frac{1}{4} e^{-x/4}\,dx}{1 - \int_0^{6} \frac{1}{4} e^{-x/4}\,dx} = \frac{\left[-e^{-x/4}\right]_6^{18}}{1 - \left[-e^{-x/4}\right]_0^{6}} \]
\[ = \frac{-e^{-18/4} + e^{-6/4}}{1 + \left[e^{-6/4} - 1\right]} = \frac{0.2120}{0.2231} = 0.9502 \]
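Parts (c) to (e) all reduce to the exponential c.d.f. F(x) = 1 − e^(−x/μ) with μ = 4 months. The sketch below uses only Python's standard library; the helper name `expo_cdf` is illustrative. It also checks that the conditional probability in (e) equals the unconditional P(X < 12), which is the memoryless property of the exponential distribution.

```python
from math import exp

MEAN_MONTHS = 4.0  # mean waiting time between accidents, from part (a)


def expo_cdf(x: float, mean: float) -> float:
    """Exponential c.d.f.: P(X <= x) = 1 - exp(-x / mean) for x >= 0, else 0."""
    return 1.0 - exp(-x / mean) if x > 0 else 0.0


# (c) more than one year (12 months) between accidents
p_c = 1.0 - expo_cdf(12, MEAN_MONTHS)

# (d) less than one month between accidents
p_d = expo_cdf(1, MEAN_MONTHS)

# (e) P(X < 18 | X > 6) = P(6 < X < 18) / P(X > 6)
p_e = (expo_cdf(18, MEAN_MONTHS) - expo_cdf(6, MEAN_MONTHS)) / (
    1.0 - expo_cdf(6, MEAN_MONTHS)
)

if __name__ == "__main__":
    print(f"(c) {p_c:.4f}")
    print(f"(d) {p_d:.4f}")
    print(f"(e) {p_e:.4f}")
    print(f"P(X < 12) = {expo_cdf(12, MEAN_MONTHS):.4f}")
```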
Relationship between the Poisson
and the Exponential Distributions
• The exponential distribution is frequently
used as a model for the distribution of times
between the occurrence of successive
events, such as customers arriving at a
service facility or calls coming in to a
switchboard.
• The reason for this is that the exponential
distribution is closely related to the Poisson
distribution.
Relationship between the Poisson
and the Exponential Distributions
• Suppose events are occurring in time according to a Poisson distribution with a rate of λ events per hour. In t hours, the number of events Y will have a Poisson distribution with mean value λt.
• Suppose we start at time zero and ask,
“How long do I have to wait to see the first
event happen?”
Relationship between the Poisson
and the Exponential Distributions
Let X represent the length of time until this first event. Then,
\[ P(X > t) = P[Y = 0 \text{ on } (0, t)] = \frac{(\lambda t)^0 e^{-\lambda t}}{0!} = e^{-\lambda t} \]
That is, the first event does not occur from time zero to time t.
Relationship between the Poisson
and the Exponential Distributions
Furthermore,
\[ P(X \le t) = 1 - P(X > t) = 1 - e^{-\lambda t} \]
We see that the cumulative distribution function of X has the form of an exponential distribution with μ = 1/λ.
Relationship between the Poisson
and the Exponential Distributions
Upon differentiating, the probability density function of X is given by
\[ f(t) = \frac{dF(t)}{dt} = \frac{d(1 - e^{-\lambda t})}{dt} = \lambda e^{-\lambda t} \]
\[ \Rightarrow f(t) = \frac{1}{\mu} e^{-t/\mu}, \quad t > 0 \]
This shows that X has an exponential distribution.
The Exponential Distribution
• The exponential distribution also finds
application in situations involving variables
other than time.
• Some of the succeeding examples
illustrate this.
Example
6.31) The magnitudes of earthquakes recorded
in a region of North America can be
modeled by an exponential distribution with
mean 2.4 as measured on the Richter
scale. Find the probability that the next
earthquake to strike this region will
(a) exceed 3.0 on the Richter scale
(b) fall between 2.0 and 3.0 on the Richter
scale
Solution to #6.31
Given: μ = 2.4
(a)
\[ P(X > 3.0) = 1 - P(X \le 3.0) = 1 - \int_0^{3} \frac{1}{2.4} e^{-x/2.4}\,dx = 1 + \left[e^{-x/2.4}\right]_0^{3} = 1 + e^{-3/2.4} - 1 = 0.2865 \]
Solution to #6.31
Given: μ = 2.4
(b)
\[ P(2.0 \le X \le 3.0) = \int_2^{3} \frac{1}{2.4} e^{-x/2.4}\,dx = \left[-e^{-x/2.4}\right]_2^{3} = e^{-2/2.4} - e^{-3/2.4} = 0.1481 \]
Example
6.33) A pumping station operator observes that
the demand for water at a certain hour of the
day can be modeled as an exponential
random variable with a mean of 100 cfs (cubic
feet per second).
(a) Find the probability that the demand will
exceed 200 cfs on a randomly selected day.
(b) What is the maximum water-producing
capacity that the station should keep on line
for this hour so that the demand will exceed
this production capacity with a probability of
only 0.01?
Solution to #6.33
Given: μ = 100 cfs
(a)
\[ P(X > 200) = 1 - P(X \le 200) = 1 - \int_0^{200} \frac{1}{100} e^{-x/100}\,dx = 1 + \left[e^{-x/100}\right]_0^{200} = 1 + e^{-200/100} - 1 = 0.1353 \]
Solution to #6.33
Given: μ = 100 cfs
(b) P(X > xmax) = 0.01
P(0 ≤ X ≤ xmax) = 0.99
\[ \int_0^{x_{max}} \frac{1}{100} e^{-x/100}\,dx = 0.99 \]
\[ \left[-e^{-x/100}\right]_0^{x_{max}} = 0.99 \]
\[ 1 - e^{-x_{max}/100} = 0.99 \]
Solving the last equation, we get xmax = 460.52 cfs.
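Solving 1 − e^(−x/μ) = p for x gives the exponential quantile x = −μ ln(1 − p). A short sketch of that step in standard-library Python; the function name `expo_quantile` is illustrative.

```python
from math import exp, log

MEAN_DEMAND = 100.0  # mean demand in cfs


def expo_quantile(p: float, mean: float) -> float:
    """Value x with P(X <= x) = p for an exponential variable: x = -mean * ln(1 - p)."""
    return -mean * log(1.0 - p)


# (a) demand exceeds 200 cfs
p_exceed_200 = exp(-200.0 / MEAN_DEMAND)

# (b) capacity exceeded with probability 0.01, i.e. the 0.99 quantile
x_max = expo_quantile(0.99, MEAN_DEMAND)

if __name__ == "__main__":
    print(f"(a) {p_exceed_200:.4f}")
    print(f"(b) {x_max:.2f} cfs")
```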
Example
6.37) The life lengths of automobile tires of a
certain brand, under average driving
conditions, are found to follow an
exponential distribution with mean 30 (in
thousands of miles). Find the probability that
one of these tires bought today will last
(a) over 30,000 miles
(b) over 30,000 miles given that it already has gone 15,000 miles
Solution to #6.37
Given: μ = 30,000
(a)
\[ P(X > 30{,}000) = 1 - P(X \le 30{,}000) = 1 - \int_0^{30{,}000} \frac{1}{30{,}000} e^{-x/30{,}000}\,dx = 1 + \left[e^{-x/30{,}000}\right]_0^{30{,}000} = 1 + e^{-1} - 1 = 0.3679 \]
Solution to #6.37
Given: μ = 30,000
(b)
\[ P(X > 30{,}000 \mid X > 15{,}000) = \frac{P(X > 30{,}000 \cap X > 15{,}000)}{P(X > 15{,}000)} = \frac{P(X > 30{,}000)}{P(X > 15{,}000)} = \frac{0.3679}{1 - P(X \le 15{,}000)} \]
\[ = \frac{0.3679}{1 - \int_0^{15{,}000} \frac{1}{30{,}000} e^{-x/30{,}000}\,dx} = \frac{0.3679}{1 + \left[e^{-x/30{,}000}\right]_0^{15{,}000}} = \frac{0.3679}{1 + e^{-1/2} - 1} = 0.6066 \]
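Part (b) is another instance of the memoryless property: the chance of lasting beyond 30,000 miles given 15,000 already driven equals the unconditional chance of lasting beyond 15,000. A standard-library Python sketch with illustrative names; computed without intermediate rounding, the ratio is e^(−1/2), which matches the hand result of 0.6066 to three decimals.

```python
from math import exp

MEAN_LIFE = 30_000.0  # mean tire life in miles


def survives(x: float, mean: float) -> float:
    """P(X > x) = exp(-x / mean) for an exponential lifetime."""
    return exp(-x / mean)


# (a) lasts over 30,000 miles
p_a = survives(30_000, MEAN_LIFE)

# (b) P(X > 30,000 | X > 15,000) = P(X > 30,000) / P(X > 15,000)
p_b = survives(30_000, MEAN_LIFE) / survives(15_000, MEAN_LIFE)

if __name__ == "__main__":
    print(f"(a) {p_a:.4f}")
    print(f"(b) {p_b:.4f}")
    print(f"P(X > 15,000) = {survives(15_000, MEAN_LIFE):.4f}")
```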
Example
6.43) In deciding how many customer service
representatives to hire and in planning their
schedules, it is important for a firm marketing
electronic typewriters to study repair times for
the machines. Such a study revealed that
repair times have approximately an
exponential distribution with a mean of 22
minutes.
(a) Find the probability that a repair time will last
less than 10 minutes.
(b) The charge for typewriter repairs is $50 for
each half hour or part thereof. What is the
probability that a repair job will result in a
Example

6.43) Continuation
(c) In planning schedules, how much time
should be allowed for each repair so that
the chance of any one repair time
exceeding this allowed time is only 0.10?
Solution to #6.43
Given: μ = 22 minutes
(a)
\[ P(X < 10) = \int_0^{10} \frac{1}{22} e^{-x/22}\,dx = \left[-e^{-x/22}\right]_0^{10} = 1 - e^{-10/22} = 0.3653 \]
Solution to #6.43
Given: μ = 22 minutes
(b) A repair charge of 1 hour means the repair time took over 30 minutes but not more than 60 minutes.
\[ P(30 < X \le 60) = \int_{30}^{60} \frac{1}{22} e^{-x/22}\,dx = \left[-e^{-x/22}\right]_{30}^{60} = e^{-30/22} - e^{-60/22} = 0.1903 \]
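Solution to #6.43
Given: μ = 22 minutes
(c) The allowed time xmax is exceeded with probability 0.10, so P(0 ≤ X ≤ xmax) = 0.9.
\[ P(0 \le X \le x_{max}) = \int_0^{x_{max}} \frac{1}{22} e^{-x/22}\,dx \]
\[ 0.9 = \left[-e^{-x/22}\right]_0^{x_{max}} \]
\[ 0.9 = 1 - e^{-x_{max}/22} \]
\[ x_{max} = -22 \ln(0.1) = 50.66 \text{ minutes} \]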


References:
Some of the slides are from the PowerPoint presentation created by Engr. Dennis N. Yu of De La Salle University.

Montgomery, D. C., & Runger, G. C. (2007). Applied statistics and probability for engineers. Hoboken, NJ: Wiley.
