PPT_Lesson_3.4_Statistical Distributions_Measure_Phase
Certification Course
Statistical Distributions
Learning Objectives
What is the likelihood that there are no more than 10 defects per day?
What is the probability that he achieves a 30-minute turn-around-time?
What is the chance that a customer will browse through the merchandise in the store before making a purchase?
Statistical Distributions
Classes of Distribution
Inferential
Probability Statistics
Statistics
Key Terms
Statistical Distributions
Discrete Probability Distribution
Continuous Probability Distribution
These distributions use the behavior observed in a population to predict the behavior of a sample.
Population
Sample
Illustration of the Probability Mass Function (PMF)
Binomial Distribution
$P(R) = {}^{n}C_{r} \cdot p^{r} \cdot (1-p)^{n-r}$
Where,
P(R) = probability of exactly (r) successes out of a sample size of (n)
p = probability of success
r = number of successes desired
n = sample size
Key Calculations of Binomial Distribution
Mean: $\mu = np$
n = Sample size
p = Probability of success
Using the binomial distribution formula, find the probability of getting 6 heads in 10 coin tosses.
Outcomes are statistically independent.
Therefore,
$P(R) = {}^{10}C_{6} \cdot 0.5^{6} \cdot (1 - 0.5)^{10-6} = 0.205078 \approx 20.5\%$
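The coin-toss calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice, not something from the course material.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Coin-toss example from the slide: 6 heads in 10 fair tosses.
prob_six_heads = binomial_pmf(6, 10, 0.5)
print(f"P(R = 6) = {prob_six_heads:.6f}")  # 0.205078

# Mean of the binomial distribution: n * p.
mean_heads = 10 * 0.5
print(f"Mean number of heads = {mean_heads}")
```

The same function works for any sample size n and success probability p, as long as the trials are independent.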
Poisson distribution is an application of the population knowledge to predict the sample behavior.
$P(x) = \frac{\lambda^{x} \cdot e^{-\lambda}}{x!}$
Where,
P(x) = Probability of exactly (x) occurrences in a Poisson distribution
λ = Mean number of occurrences during interval
x = Number of occurrences desired
e = Base of the natural logarithm (equals 2.71828)
The past records of a traffic intersection show that the mean number of accidents every
month is three at this junction. Assume that the number of accidents follows a Poisson
distribution and calculate the probability of a given number of accidents happening in a month.
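The slide does not say which accident count is being asked about, so the sketch below simply tabulates the Poisson probabilities for 0 to 5 accidents with λ = 3. The function name `poisson_pmf` and the range of counts are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the mean count is lam."""
    return lam**x * exp(-lam) / factorial(x)


LAMBDA = 3  # mean number of accidents per month, from the example

# Assumed range of counts (0 to 5); the slide leaves x unspecified.
accident_probs = {x: poisson_pmf(x, LAMBDA) for x in range(6)}
for x, prob in accident_probs.items():
    print(f"P({x} accidents in a month) = {prob:.4f}")
```

Summing the entries for 0 through k gives the probability of at most k accidents in a month.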
$Z = \frac{Y - \mu}{\sigma}$
Where,
Z = number of standard deviations between Y and µ
Y = Value of the data point of interest
µ = Mean of the population
σ = Standard deviation of the population
Normal Distribution: Z-Table
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
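A left-tailed Z-table like the one above can be regenerated from the standard normal cumulative distribution function. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_table_row` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_table_row(z_tenths: float) -> list[float]:
    """Left-tail probabilities for z_tenths + 0.00, 0.01, ..., 0.09."""
    return [round(standard_normal.cdf(z_tenths + h / 100), 4) for h in range(10)]


# Print rows 0.0 through 1.2, matching the range shown on the slide.
for i in range(13):
    z = i / 10
    cells = " ".join(f"{value:.4f}" for value in z_table_row(z))
    print(f"{z:.1f} {cells}")
```

Each cell is P(Z less than z), where z is the row label plus the column offset.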
Normal Distribution: Example 1
There is no need for the table to find the answer once you know that the variable Z takes a value of less than (or equal to) zero. Because the standard normal curve is symmetric about zero, P(Z less than or equal to 0) = 0.5.
Normal Distribution: Example 2
The opposite or complement of an event A occurring is the event A not occurring.
P(not A) = 1 – P(A)
P(Z greater than 1.12) = 1 – P(Z less than 1.12)
Using the table:
P(Z greater than 1.12) = 1 – 0.8686 = 0.1314
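The complement rule P(not A) = 1 – P(A) for Z = 1.12 can also be checked numerically. This is a small sketch with the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Left-tail probability, the value a Z-table gives for row 1.1, column 0.02.
p_less = NormalDist().cdf(1.12)

# Complement rule: P(Z > 1.12) = 1 - P(Z < 1.12).
p_greater = 1 - p_less

print(f"P(Z < 1.12) = {p_less:.4f}")
print(f"P(Z > 1.12) = {p_greater:.4f}")
```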
Normal Distribution: Example 3
Normal Distribution: Example 4
Suppose the time taken to resolve a customer complaint follows a normal distribution with a
mean of 250 hours and a standard deviation of 23 hours. What is the probability that
a problem resolution will take more than 300 hours?
METHOD 1:
1 – 98.5% = 1.5%
METHOD 2:
Using the standard Z formula: $Z = \frac{Y - \mu}{\sigma} = \frac{300 - 250}{23} = 2.17$
● The probability that a problem is resolved in less than 300 hours is 98.5%
● The chance of a problem resolution taking more than 300 hours is 1.5% (1 - 0.985)
Given: Y = 300; µ = 250; σ = 23
METHOD 3:
1 - 98.5% = 1.5%
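The figures in Example 4 can be reproduced without a table. The sketch below assumes only the stated mean (250 hours) and standard deviation (23 hours); the variable names are illustrative.

```python
from statistics import NormalDist

# Resolution time modeled as normal with the values given in the example.
resolution_time = NormalDist(mu=250, sigma=23)

# Z score for a 300-hour resolution.
z_score = (300 - 250) / 23

# Probability of resolving within 300 hours, and its complement.
p_within = resolution_time.cdf(300)
p_beyond = 1 - p_within

print(f"Z = {z_score:.2f}")
print(f"P(time < 300 h) = {p_within:.1%}")
print(f"P(time > 300 h) = {p_beyond:.1%}")
```

The results agree with the slide's 98.5% and 1.5% after rounding.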
Chi-Square Distribution
$\chi^2_{calculated} = \sum \frac{(f_O - f_e)^2}{f_e}$
Where,
χ²calculated = chi-square index
fO = observed frequency
fe = expected frequency
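As a rough illustration of the chi-square index, the sketch below sums (fO − fe)² / fe over a set of categories. The observed and expected counts are hypothetical numbers made up for this example, and `chi_square_statistic` is an illustrative name.

```python
def chi_square_statistic(observed: list[float], expected: list[float]) -> float:
    """Sum of (fO - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical data: defect counts on three shifts, expected to be equal.
observed_counts = [18, 22, 20]
expected_counts = [20, 20, 20]

chi_sq = chi_square_statistic(observed_counts, expected_counts)
print(f"Chi-square calculated = {chi_sq:.2f}")
```

A value of zero would mean the observed frequencies match the expected ones exactly; larger values indicate a bigger gap.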
T-Distribution
F-Distribution
$F_{calculated} = \frac{S_1^2}{S_2^2}$
Where,
S1 and S2 = standard deviations of the two samples
● If Fcalculated is 1, there is no difference in the variances
● The larger variance should be placed in the numerator and the smaller value in the
denominator
● df1 = n1 – 1 and df2 = n2 – 1
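The F calculation described above can be sketched as follows: compute both sample variances, put the larger one in the numerator, and report the matching degrees of freedom. The two samples are hypothetical, and `f_ratio` is an illustrative helper name.

```python
from statistics import variance


def f_ratio(sample_a: list[float], sample_b: list[float]) -> tuple[float, int, int]:
    """Return (F, df1, df2) with the larger sample variance in the numerator."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements from two processes.
sample_1 = [10.0, 12.0, 14.0, 16.0, 18.0]        # sample variance 10.0
sample_2 = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]  # sample variance 3.5

f_value, df1, df2 = f_ratio(sample_1, sample_2)
print(f"F calculated = {f_value:.3f}, df1 = {df1}, df2 = {df2}")
```

Because the larger variance is always on top, the ratio is never below 1, and a value close to 1 suggests similar variances.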
Fun Facts
DID YOU KNOW…?
The normal distribution seems to be everywhere, from temperature fluctuations and student test scores to the time taken to complete a task. It is the average result of other factors.
Data that are influenced by many small and unrelated random effects are approximately normally distributed.
Central Limit Theorem
Meaning of Theorem
Example
Pythagoras’ Theorem
With enough samples, the CLT lets you apply normal distribution principles to the sample means of any data.
Central Limit Theorem (CLT)
The Central Limit Theorem (CLT) states that the means of random samples drawn
from any distribution with mean µ and variance σ² will have an approximately
normal distribution with a mean equal to µ and a variance equal to σ²/n, as the
sample size n increases beyond 30.
Importance of CLT
CLT implies that the distribution of the sample means will approach a normal distribution regardless of what the population distribution looks like.
As a Six Sigma practitioner, remember that the CLT forms the basis of inferential statistics.
How CLT Works: Illustration
The probability of a die landing on any one side is equal to the probability of it landing
on any of the other five sides.
How CLT Works: Experiment 1
500 times
Source: https://www.minitab.com/uploadedFiles/Content/Academic/CentralLimitTheorem.pdf
How CLT Works: Experiment 2
2 rolls
500 trials
How CLT Works: Experiment 3
The central limit theorem states that for a large enough n, X-bar can be approximated by
a normal distribution with mean µ and standard deviation σ/√n.
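The dice experiments can be imitated with a short simulation. The sketch below averages 1, 2, and 30 rolls of a fair die over 500 trials each and compares the spread of those averages with σ/√n. The choice of 30 rolls for the third run, the fixed seed, and all names are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(42)  # fixed seed so the sketch is repeatable

DIE_MEAN = 3.5            # mean of a fair six-sided die
DIE_SD = sqrt(35 / 12)    # standard deviation of a fair six-sided die


def sample_means(rolls_per_trial: int, trials: int = 500) -> list[float]:
    """Average of rolls_per_trial die rolls, repeated for a number of trials."""
    return [
        mean(random.randint(1, 6) for _ in range(rolls_per_trial))
        for _ in range(trials)
    ]


results = {}
for n in (1, 2, 30):
    means = sample_means(n)
    results[n] = (mean(means), pstdev(means))
    print(
        f"n={n:>2}: mean of sample means={results[n][0]:.3f}, "
        f"spread={results[n][1]:.3f}, sigma/sqrt(n)={DIE_SD / sqrt(n):.3f}"
    )
```

The mean of the sample means stays near 3.5 in every run, while the spread shrinks roughly in line with σ/√n as n grows.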
How CLT Works: Conclusion
The normal model for the sample mean is good when the sample has at least 30 independent observations.
As the sample size increases, the distribution of the sample mean becomes closer to a normal distribution.
Caution
If more outliers are present, it is likely that more than 30 observations will be needed to use
the normal distribution.
Key Takeaways
With enough samples, the CLT lets you apply normal distribution principles to the sample means of any data.
Knowledge Check
Knowledge Check 1
What is the similarity between the Binomial and Poisson Distributions?
The binomial is a discrete distribution that focuses on defective items, has a small number of trials, and has an
expected value of n*p; whereas the Poisson is a discrete distribution that focuses on defects, has a number of
trials that tends towards infinity, and has an expected value of λ.
Knowledge Check 2
What is the probability P(Z < 2.4)?
A. 99.18%
B. 0.81%
C. 95%
D. 5%
The correct answer is A. Looking up the Z value of 2.4 in a left-tailed Z-table gives a probability of 99.18%.
Knowledge Check 3
Which distribution is based on the Bernoulli process to predict sample behavior?
A. Poisson
B. Binomial
C. F-distribution
D. Normal
The correct answer is B. The binomial distribution is based on the scenario where each outcome has only two options
and the probability of success remains constant over time. This scenario is called the Bernoulli process.
Knowledge Check 4
If the output value is 45, with a process average of 40 and a standard deviation of 2, what is the Z score value?
A. 5
B. 2
C. 2.5
D. 3
$Z = \frac{Y - \mu}{\sigma} = \frac{45 - 40}{2} = 2.5$
Knowledge Check 5
Given a normal distribution, what is the probability of having a Z score value smaller than 2.5?
A. 0.6%
B. 99.4%
C. 90.2%
D. 2.5%
The correct answer is B. Using a left-tailed Z-table or the Excel function =NORM.S.DIST(2.5, TRUE) gives a result of 99.4%.
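The Z score and left-tail probabilities in these knowledge checks can be verified with the standard library as well. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Z score for an output of 45 with process average 40 and standard deviation 2.
z_score = (45 - 40) / 2

# Left-tail probabilities for Z = 2.5 and Z = 2.4.
p_below_2_5 = standard_normal.cdf(z_score)
p_below_2_4 = standard_normal.cdf(2.4)

print(f"Z = {z_score}")
print(f"P(Z < 2.5) = {p_below_2_5:.1%}")
print(f"P(Z < 2.4) = {p_below_2_4:.2%}")
```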
Knowledge Check 6
The central limit theorem states that the means of random samples drawn from any
distribution with mean µ and variance σ² will have an approximately ______ distribution
with a mean equal to µ and a variance equal to σ²/n, as n increases beyond ___.
A. Normal, 10
B. T, 30
C. Normal, 25
D. Normal, 30
The correct answer is D. The central limit theorem states that the means of random samples drawn from any distribution
with mean µ and variance σ² will have an approximately normal distribution with a mean equal to µ and a variance equal
to σ²/n, as n increases beyond 30.