Day 4 Normal Distribution and Sampling

The document discusses various continuous probability distributions including the uniform, normal, and exponential distributions. It provides definitions and examples for each distribution. Key aspects covered include probability density functions, expected values, variances, and how to calculate probabilities within intervals of the distributions.

Uploaded by

Hóng Hớt
Copyright
© © All Rights Reserved

Continuous Probability Distribution and Sampling


BDS 2021
Continuous Probability Distributions
• Uniform Probability Distribution
• Normal Probability Distribution
• Exponential Probability Distribution

[Figure: example density curves f(x) against x for the uniform, normal, and exponential distributions]
Continuous Probability Distributions
• A continuous random variable can assume any value in an interval on the real
line or in a collection of intervals.
• It is not possible to talk about the probability of the random variable assuming
a particular value.
• Instead, we talk about the probability of the random variable assuming a value
within a given interval.

3
Uniform Probability Distribution
• A random variable is uniformly distributed whenever the probability is
proportional to the interval’s length.
• The uniform probability density function is:

f(x) = 1/(b – a) for a < x < b
     = 0 elsewhere
where: a = smallest value the variable can assume
b = largest value the variable can assume

4
Uniform Probability Distribution
• Expected Value of x
E(x) = (a + b)/2
• Variance of x
Var(x) = (b - a)^2/12

5
Uniform Probability Distribution
• Example: Slater's Buffet
Slater’s customers are charged for the amount of salad they take.
Sampling suggests that the amount of salad taken is uniformly distributed
between 5 ounces and 15 ounces.

6
Uniform Probability Distribution
• Uniform Probability Density Function
f(x) = 1/10 for 5 < x < 15
=0 elsewhere

where:
x = salad plate filling weight

7
Uniform Probability Distribution
• Expected Value of x
E(x) = (a + b)/2
= (5 + 15)/2
= 10

• Variance of x
Var(x) = (b - a)^2/12
= (15 – 5)^2/12
= 8.33

8
Uniform Probability Distribution
• Salad Plate Filling Weight

[Figure: f(x) is flat at height 1/10 between x = 5 and x = 15; horizontal axis: Salad Weight (oz.)]

9
Uniform Probability Distribution
What is the probability that a customer
will take between 12 and 15 ounces of salad?

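P(12 < x < 15) = (1/10)(3) = .3

[Figure: uniform density of height 1/10 between x = 5 and x = 15, with the area between 12 and 15 shaded; horizontal axis: Salad Weight (oz.)]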
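The Slater's Buffet numbers can be checked with a few lines of Python. This is only a sketch: the function names are illustrative and are not part of the original slides.

```python
def uniform_mean(a, b):
    """Expected value of a uniform random variable on (a, b)."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of a uniform random variable on (a, b)."""
    return (b - a) ** 2 / 12


def uniform_interval_prob(a, b, lo, hi):
    """P(lo < x < hi): width of the overlap with (a, b) times the height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)


print(uniform_mean(5, 15))                   # 10.0
print(round(uniform_variance(5, 15), 2))     # 8.33
print(uniform_interval_prob(5, 15, 12, 15))  # 0.3
```

The probability is simply the area of a rectangle, which is why no integration routine is needed.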

10
Area as a Measure of Probability
• The area under the graph of f(x) and probability are identical.
• This is valid for all continuous random variables.
• The probability that x takes on a value between some lower value x1 and some
higher value x2 can be found by computing the area under the graph of f(x)
over the interval from x1 to x2.

11
Normal Probability Distribution
• The normal probability distribution is the most important distribution
for describing a continuous random variable.
• It is widely used in statistical inference.
• It has been used in a wide variety of applications including:
•Heights of people • Test scores
• Amounts of rainfall • Scientific measurements
• Abraham de Moivre, a French mathematician and author of The Doctrine of Chances,
derived the normal distribution in 1733.

12
Normal Probability Distribution
• Normal Probability Density Function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where: μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828

13
Normal Probability Distribution
• Characteristics
The distribution is symmetric; its skewness measure is zero.

14
Normal Probability Distribution
• Characteristics
The entire family of normal probability distributions is defined by its mean
μ and its standard deviation σ.

[Figure: normal curve over x, centered at the mean μ, with its spread set by the standard deviation σ]

15
Normal Probability Distribution
• Characteristics
The highest point on the normal curve is at the mean, which is also the
median and mode.

16
Normal Probability Distribution
• Characteristics
The mean can be any numerical value: negative, zero, or positive.

[Figure: normal curves over x centered at means of -10, 0, and 25]

17
Normal Probability Distribution
• Characteristics
The standard deviation determines the width of the
curve: larger values result in wider, flatter curves.

[Figure: two normal curves; the curve with σ = 25 is wider and flatter than the curve with σ = 15]

18
Normal Probability Distribution
• Characteristics
Probabilities for the normal random variable are given by areas under the
curve. The total area under the curve is 1 (.5 to the left of the mean and
.5 to the right).

[Figure: normal curve over x with an area of .5 on each side of the mean]

19
Normal Probability Distribution
• Characteristics (basis for the empirical rule)

68.26% of values of a normal random variable are within +/- 1 standard deviation of its mean.

95.44% of values of a normal random variable are within +/- 2 standard deviations of its mean.

99.72% of values of a normal random variable are within +/- 3 standard deviations of its mean.
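These percentages can be reproduced without a table, because for any normal random variable the probability of falling within k standard deviations of the mean equals erf(k/√2). The sketch below uses only Python's standard library; the function name is illustrative.

```python
import math


def within_k_sd(k):
    """P(mu - k*sigma < x < mu + k*sigma) for a normal random variable x."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The slide values (68.26%, 95.44%, 99.72%) differ in the last digit because they come from subtracting four-decimal table entries, for example .8413 - .1587 = .6826.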

20
Normal Probability Distribution
• Characteristics (basis for the empirical rule)
[Figure: normal curve over x centered at μ; 68.26% of the area lies within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ]

21
Standard Normal Probability Distribution
• Characteristics
A random variable having a normal distribution with a mean of 0 and a
standard deviation of 1 is said to have a standard normal probability
distribution.

22
Standard Normal Probability Distribution
• Characteristics
The letter z is used to designate the standard normal random variable.

[Figure: standard normal curve over z, centered at 0 with σ = 1]

23
Standard Normal Probability Distribution
• Converting to the Standard Normal Distribution

z = (x − μ)/σ

We can think of z as a measure of the number of standard deviations x is from μ.

24
Standard Normal Probability Distribution
• Example: Pep Zone
Pep Zone sells auto parts and supplies including a popular multi-grade
motor oil. When the stock of this oil drops to 20 gallons, a replenishment
order is placed.
The store manager is concerned that sales are being lost due to
stockouts while waiting for a replenishment order.

25
Standard Normal Probability Distribution
• Example: Pep Zone
It has been determined that demand during replenishment lead-time is
normally distributed with a mean of 15 gallons and a standard deviation
of 6 gallons.
The manager would like to know the probability of a stockout during
replenishment lead-time. In other words, what is the probability that
demand during lead-time will exceed 20 gallons?
P(x > 20) = ?

26
Standard Normal Probability Distribution
• Solving for the Stockout Probability
Step 1: Convert x to the standard normal distribution.

z = (x − μ)/σ
= (20 - 15)/6
= .83
Step 2: Find the area under the standard normal curve to the left of z = .83.

27
What is a Z Table ?
A standard normal table (also called the unit normal table or z-score table) is a mathematical
table of values of Φ, the cumulative distribution function of the standard normal
distribution. A z-score, also known as the standard score, indicates how many standard deviations a
value is from the mean.

Since probability tables cannot be printed for every normal distribution, as there is an infinite variety of
normal distribution, it is common practice to convert a normal to a standard normal and then use the
z-score table to find probabilities.

Z-Score Formula

It is a way to compare the results from a test to a “normal” population.


If X is a random variable from a normal distribution with mean μ and standard deviation σ, its
z-score is calculated by subtracting the mean from X and dividing the result by the standard deviation.

28
What is a Z Table ?
How to Interpret z-Score
Here is how to interpret z-scores:
•A z-score of less than 0 represents an element less than the mean.
•A z-score greater than 0 represents an element greater than the mean.
•A z-score equal to 0 represents an element equal to the mean.
•A z-score equal to 1 represents an element, which is 1 standard deviation greater than the mean; a z-
score equal to 2 signifies 2 standard deviations greater than the mean; etc.
•A z-score equal to -1 represents an element, which is 1 standard deviation less than the mean; a z-
score equal to -2 signifies 2 standard deviations less than the mean; etc.
•If the data are approximately normally distributed, about 68% of the elements have a z-score between -1 and
1, about 95% have a z-score between -2 and 2, and about 99.7% have a z-score between -3 and 3.

29
What is a Z Table ?

Example of Z Score
Let us understand the concept with the help of a solved example:
Example: The scores on a class test are normally distributed with a mean of 70 and a standard
deviation of 12. What percentage of students scored more than 85?
Solution: The z-score for a test score of 85 is
z = (85 - 70)/12 = 1.25
From the z-score table, the fraction of the data below this z-score is 0.8944.
This means 89.44% of the students scored below 85, and hence the percentage of
students who scored above 85 is (100 - 89.44)% = 10.56%.
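The table lookup in this example can be cross-checked in Python. The sketch assumes, as the use of a z table implies, that the test scores are normally distributed; the variable names are illustrative.

```python
from statistics import NormalDist

mean, sd, score = 70, 12, 85

# Standardize the score, then read the cumulative probability
# from the standard normal distribution instead of a printed table.
z = (score - mean) / sd
share_below = NormalDist().cdf(z)
share_above = 1 - share_below

print(z)                      # 1.25
print(round(share_below, 4))  # 0.8944
print(round(share_above, 4))  # 0.1056
```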

32
What is a Z Table ?
Frequently Asked Questions
What does the Z-Score Table Imply?
The z score table helps to know the percentage of values below (to the left) a z-score in a standard
normal distribution.
What are the Types of Z Score Table?
There are two z-score tables which are:
•Positive Z Score Table: It means that the observed value is above the mean of total values.
•Negative Z Score Table: It means that the observed value is below the mean of total values.
What is Z Score and How is it calculated?
A z-score is the number of standard deviations a value lies from the mean. It is
calculated by subtracting the mean from the test value and dividing the result by the standard deviation.
So, z = (x − μ)/σ
where x is the test value, μ is the mean, and σ is the standard deviation.

33
Standard Normal Probability Distribution
• Cumulative Probability Table for the Standard Normal Distribution

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .

P(z < .83) = .7967

34
Standard Normal Probability Distribution
• Solving for the Stockout Probability
Step 3: Compute the area under the standard normal
curve to the right of z = .83.

P(z > .83) = 1 – P(z < .83)
= 1 – .7967
= .2033
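The three steps for the stockout probability can be sketched in Python with the standard library's NormalDist class, which plays the role of the z table here. The variable names are illustrative.

```python
from statistics import NormalDist

# Demand during replenishment lead-time: mean 15 gallons, standard deviation 6 gallons.
demand = NormalDist(mu=15, sigma=6)

# A stockout occurs when demand exceeds the reorder point of 20 gallons.
p_stockout = 1 - demand.cdf(20)

print(round(p_stockout, 3))  # about 0.202
```

The result is slightly below the table answer of .2033 because the table method rounds z to .83 before the lookup, while the code works with z = 0.8333...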

35
Standard Normal Probability Distribution
• Solving for the Stockout Probability

[Figure: standard normal curve over z; area = .7967 to the left of z = .83 and area = 1 - .7967 = .2033 to the right]

36
Standard Normal Probability Distribution
If the manager of Pep Zone wants the probability of a stockout during
replenishment lead-time to be no more than .05, what should the reorder
point be?
(Hint: Given a probability, we can use the standard normal table in an
inverse fashion to find the corresponding z value.)

37
Standard Normal Probability Distribution
• Solving for the Reorder Point

[Figure: standard normal curve over z; area = .9500 to the left of z.05 and area = .0500 to the right]

38
Standard Normal Probability Distribution
• Solving for the Reorder Point
Step 1: Find the z-value that cuts off an area of .05 in the right tail of the
standard normal distribution.
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
. . . . . . . . . . .

We look up the complement of the tail area (1 - .05 = .95). It falls midway between .9495 (z = 1.64) and .9505 (z = 1.65), so z.05 = 1.645.

39
Standard Normal Probability Distribution
• Solving for the Reorder Point
Step 2: Convert z.05 to the corresponding value of x.

x = μ + z.05σ
= 15 + 1.645(6)
= 24.87 or 25

A reorder point of 25 gallons will place the probability of a stockout during lead time at (slightly less than) .05.
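Using the table "in an inverse fashion" corresponds to evaluating the inverse cumulative distribution function. A minimal sketch with the standard library follows; the variable names are illustrative.

```python
from statistics import NormalDist

demand = NormalDist(mu=15, sigma=6)

# z value that leaves an area of .05 in the right tail (cumulative area .95).
z_05 = NormalDist().inv_cdf(0.95)

# Reorder point: the demand level exceeded with probability .05.
reorder_point = demand.inv_cdf(0.95)

print(round(z_05, 3))           # about 1.645
print(round(reorder_point, 2))  # about 24.87
```

Rounding the reorder point up to 25 gallons keeps the stockout probability just under .05.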

40
Normal Probability Distribution
• Solving for the Reorder Point

[Figure: normal curve over x centered at 15; probability of no stockout during replenishment lead-time = .95 to the left of x = 24.87, probability of a stockout during replenishment lead-time = .05 to the right]

41
Standard Normal Probability Distribution
• Solving for the Reorder Point
By raising the reorder point from 20 gallons to 25 gallons on hand, the
probability of a stockout decreases from about .20 to .05.
This is a significant decrease in the chance that Pep Zone will be out of
stock and unable to meet a customer’s desire to make a purchase.

42
Using Excel to Compute Normal Probabilities
• Excel has two functions for computing cumulative probabilities and x values for
any normal distribution:
• NORM.DIST is used to compute the cumulative probability given an x
value.
• NORM.INV is used to compute the x value given a cumulative probability.

43
Exponential Probability Distribution
• The exponential probability distribution is useful in describing the time it takes
to complete a task.
• The exponential random variables can be used to describe:
• Time between vehicle arrivals at a toll booth
• Time required to complete a questionnaire
• Distance between major defects in a highway
• In waiting line applications, the exponential distribution is often used for
service time.

44
Exponential Probability Distribution
• A property of the exponential distribution is that the mean and standard
deviation are equal.
• The exponential distribution is skewed to the right. Its skewness measure is 2.

45
Exponential Probability Distribution
• Density Function
$$f(x) = \frac{1}{\mu}\, e^{-x/\mu} \quad \text{for } x > 0$$
where: μ = expected value or mean
e = 2.71828

46
Exponential Probability Distribution
• Cumulative Probabilities
$$P(x \le x_0) = 1 - e^{-x_0/\mu}$$
where:
x_0 = some specific value of x

47
Exponential Probability Distribution
• Example: Al’s Full-Service Pump
The time between arrivals of cars at Al’s full-service gas pump follows an
exponential probability distribution with a mean time between arrivals of 3
minutes. Al would like to know the probability that the time between two
successive arrivals will be 2 minutes or less.

48
Exponential Probability Distribution
• Example: Al’s Full-Service Pump
P(x < 2) = 1 - 2.71828^(-2/3) = 1 - .5134 = .4866

[Figure: exponential density f(x) with the area between 0 and 2 shaded; horizontal axis: Time Between Successive Arrivals (mins.)]
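The cumulative probability for Al's pump can be reproduced directly from the exponential formula. The function name in this sketch is illustrative.

```python
import math


def exponential_cdf(x0, mean):
    """P(x <= x0) for an exponential random variable with the given mean."""
    return 1 - math.exp(-x0 / mean)


# Mean time between arrivals is 3 minutes; we want P(x <= 2 minutes).
p_two_minutes_or_less = exponential_cdf(2, 3)
print(round(p_two_minutes_or_less, 4))  # 0.4866
```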

49
Relationship between the Poisson and Exponential Distributions
The Poisson distribution provides an appropriate description of the number of occurrences per interval.

The exponential distribution provides an appropriate description of the length of the interval between occurrences.

50
Sampling and Sampling Distributions
Sampling and Sampling Distributions
• Selecting a Sample
• Point Estimation
• Introduction to Sampling Distributions
• Sampling Distribution of x̄
• Sampling Distribution of p̄
• Other Sampling Methods

52
Introduction
• An element is the entity on which data are collected.
• A population is a collection of all the elements of interest.
• A sample is a subset of the population.
• The sampled population is the population from which the sample is drawn.
• A frame is a list of the elements that the sample will be selected from.

53
Introduction
• The reason we select a sample is to collect data to answer a research question
about a population.
• The sample results provide only estimates of the values of the population
characteristics.

• The reason is simply that the sample contains only a portion of the population.

• With proper sampling methods, the sample results can provide “good”
estimates of the population characteristics.

54
Selecting a Sample
• Sampling from a Finite Population
• Sampling from an Infinite Population

55
Sampling from a Finite Population
• Finite populations are often defined by lists such as:
• Organization membership roster
• Credit card account numbers
• Inventory product numbers
• A simple random sample of size n from a finite population of size N is a sample
selected such that each possible sample of size n has the same probability of
being selected.

56
Sampling from a Finite Population
• Replacing each sampled element before selecting subsequent elements is
called sampling with replacement. An element can appear in the sample
more than once.
• Sampling without replacement is the procedure used most often.

• In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.

57
Sampling from a Finite Population
• Example: St. Andrew’s College
St. Andrew’s College received 900 applications for admission in the
upcoming year from prospective students. The applicants were numbered,
from 1 to 900, as their applications arrived. The Director of Admissions
would like to select a simple random sample of 30 applicants.

58
Sampling from a Finite Population
• Example: St. Andrew’s College

Step 1: Assign a random number to each of the 900 applicants.

Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers.
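The two steps can be sketched in Python. The function name and the seed are illustrative assumptions, not part of the original example.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Select a simple random sample by the 'smallest random numbers' procedure."""
    rng = random.Random(seed)
    # Step 1: assign a random number to each applicant (numbered 1..population_size).
    keyed = [(rng.random(), applicant) for applicant in range(1, population_size + 1)]
    # Step 2: keep the applicants that hold the smallest random numbers.
    keyed.sort()
    return [applicant for _, applicant in keyed[:sample_size]]


sample = simple_random_sample(900, 30, seed=2021)
print(sorted(sample))
```

In practice, random.sample(range(1, 901), 30) draws a sample without replacement in a single call; the longer version above only mirrors the steps on the slide.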

59
Sampling from an Infinite Population
• Sometimes we want to select a sample, but find that it is not possible to obtain
a list of all elements in the population.
• As a result, we cannot construct a frame for the population.

• Hence we cannot use the random number selection procedure.


• Most often this situation occurs in the case of infinite population.

60
Sampling from an Infinite Population
• Populations are often generated by an ongoing process where there is no upper
limit on the number of units that can be generated.
• Some examples of on-going processes with infinite populations are:
• parts being manufactured on a production line
• transactions occurring at a bank
• telephone calls arriving at a technical help desk
• customers entering a store

61
Sampling from an Infinite Population
• In the case of an infinite population, we must select a random sample in order
to make valid statistical inferences about the population from which the
sample is taken.
• A random sample from an infinite population is a sample selected such that
the following conditions are satisfied.
• Each element selected comes from the population of interest.
• Each element is selected independently.

62
Point Estimation
• Point estimation is a form of statistical inference.
• In point estimation we use the data from the sample to compute a value of a
sample statistic that serves as an estimate of a population parameter.
• We refer to x̄ as the point estimator of the population mean μ.
• s is the point estimator of the population standard deviation σ.
• p̄ is the point estimator of the population proportion p.

63
Point Estimation
• Example: St. Andrew’s College
Recall that St. Andrew’s College received 900 applications from prospective
students. The application form contains a variety of information including the
individual’s Scholastic Aptitude Test (SAT) score and whether or not the
individual desires on-campus housing.
At a meeting in a few hours, the Director of Admissions would like to
announce the average SAT score and the proportion of applicants that want to
live on campus, for the population of 900 applicants.

64
Point Estimation
• Example: St. Andrew’s College
However, the necessary data on the applicants have not yet been
entered in the college’s computerized database. So, the Director decides to
estimate the values of the population parameters of interest based on sample
statistics. The sample of 30 applicants is selected using computer-generated
random numbers.

65
Point Estimation
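• x̄ as Point Estimator of μ
$$\bar{x} = \frac{\sum x_i}{n} = 1684$$
• s as Point Estimator of σ
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = 85.2$$
• p̄ as Point Estimator of p
$$\bar{p} = \frac{x}{n} = .67$$
where n = 30 is the sample size and x is the number of sampled applicants who want on-campus housing.

Note: Different random numbers would have identified a different sample which would have resulted in different point estimates.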
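The three point estimators can be computed as in the sketch below. The six records are hypothetical values invented for illustration; they are not the St. Andrew's sample, so the outputs will not match the slide's estimates.

```python
from statistics import mean, stdev

# Hypothetical (SAT score, wants on-campus housing) records, for illustration only.
records = [(1714, True), (1664, True), (1760, False),
           (1685, True), (1648, False), (1702, True)]

sat_scores = [score for score, _ in records]

x_bar = mean(sat_scores)  # point estimator of the population mean
s = stdev(sat_scores)     # sample standard deviation (n - 1 in the denominator)
p_bar = sum(1 for _, wants in records if wants) / len(records)  # sample proportion

print(x_bar, round(s, 1), round(p_bar, 2))
```

Note that statistics.stdev divides by n - 1, matching the sample formula, while statistics.pstdev divides by the population size and matches the population formula.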

66
Point Estimation
• Once all the data for the 900 applicants were entered in the college's database, the values of the population parameters of interest were calculated.
• Population Mean SAT Score
$$\mu = \frac{\sum x_i}{900} = 1697$$
• Population Standard Deviation for SAT Score
$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{900}} = 87.4$$
• Population Proportion Wanting On-Campus Housing
p = .72

67
Summary of Point Estimates
Obtained from a Simple Random Sample
Population Parameter                              Parameter Value   Point Estimator                               Point Estimate
μ = Population mean SAT score                     1697              x̄ = Sample mean SAT score                     1684
σ = Population std. deviation for SAT score       87.4              s = Sample std. deviation for SAT score       85.2
p = Population proportion wanting campus housing  .72               p̄ = Sample proportion wanting campus housing  .67

68
Practical Advice
• The target population is the population we want to make inferences about.

• The sampled population is the population from which the sample is actually
taken.
• Whenever a sample is used to make inferences about a population, we
should make sure that the targeted population and the sampled population
are in close agreement.

69
Sampling Distribution of x̄
• Process of Statistical Inference

1. The population mean μ is to be determined.
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample mean x̄.
4. The value of x̄ is used to make inferences about the value of μ.

70
Sampling Distribution of x̄
• The sampling distribution of x̄ is the probability distribution of all possible values of the sample mean x̄.
• Expected Value of x̄
E(x̄) = μ
where: μ = the population mean
• When the expected value of the point estimator equals the population
parameter, we say the point estimator is unbiased.

71
Sampling Distribution of x̄
• We will use the following notation to define the standard deviation of the sampling distribution of x̄.

σ_x̄ = the standard deviation of x̄
σ = the standard deviation of the population
n = the sample size
N = the population size

72
Sampling Distribution of x̄
• Standard Deviation of x̄
Finite Population:
$$\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \left( \frac{\sigma}{\sqrt{n}} \right)$$
Infinite Population:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
• A finite population is treated as being infinite if n/N < .05.
• √((N − n)/(N − 1)) is the finite population correction factor.
• σ_x̄ is referred to as the standard error of the mean.
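The choice between the two formulas can be written as one small function. This is a sketch under the slide's own rule of thumb (skip the correction when n/N < .05); the function name and signature are illustrative.

```python
import math


def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size, or None for an infinite population
    """
    se = sigma / math.sqrt(n)
    # Apply the finite population correction only when the sample is
    # at least 5% of the population.
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se


print(round(standard_error_mean(87.4, 30, N=900), 2))   # 15.96 (n/N < .05, no correction)
print(round(standard_error_mean(87.4, 100, N=900), 2))  # 8.24 (correction applied)
```

Both calls use the St. Andrew's College figures that appear elsewhere in this deck (σ = 87.4, N = 900).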

73
Sampling Distribution of x̄
• When the population has a normal distribution, the sampling distribution of x̄ is normally distributed for any sample size.
• In most applications, the sampling distribution of x̄ can be approximated by a normal distribution whenever the sample size is 30 or more.
• In cases where the population is highly skewed or outliers are present,
samples of size 50 may be needed.

74
Sampling Distribution of x̄
• The sampling distribution of x̄ can be used to provide probability information about how close the sample mean x̄ is to the population mean μ.

75
Central Limit Theorem
• When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of x̄.

CENTRAL LIMIT THEOREM
In selecting random samples of size n from a population, the sampling distribution of the sample mean x̄ can be approximated by a normal distribution as the sample size becomes large.
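A small simulation can make the theorem concrete. The sketch below draws repeated samples of size 30 from a right-skewed exponential population with mean 3 (an arbitrary choice for illustration) and checks that the sample means center on the population mean with a spread close to σ/√n.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

pop_mean = 3.0  # exponential population: mean and standard deviation are both 3
n = 30          # sample size
trials = 5000   # number of samples drawn

# Compute the mean of each simulated sample.
sample_means = [
    mean([rng.expovariate(1 / pop_mean) for _ in range(n)])
    for _ in range(trials)
]

print(round(mean(sample_means), 2))        # close to 3
print(round(stdev(sample_means), 2))       # close to 3 / sqrt(30)
print(round(pop_mean / math.sqrt(n), 2))   # theoretical standard error
```

A histogram of sample_means would look roughly bell-shaped even though the population itself is strongly skewed.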

76
Sampling Distribution of x̄
• Example: St. Andrew’s College

σ_x̄ = σ/√n = 87.4/√30 = 15.96

[Figure: sampling distribution of x̄ for SAT scores, centered at E(x̄) = 1697]

77
Sampling Distribution of x̄
• Example: St. Andrew’s College
• What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean μ?
• In other words, what is the probability that x̄ will be between 1687 and 1707?

78
Sampling Distribution of x̄
• Example: St. Andrew’s College
Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (1707 - 1697)/15.96 = .63
Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .63) = .7357

79
Sampling Distribution of x̄
• Example: St. Andrew’s College

Cumulative Probabilities for the Standard Normal Distribution
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .

80
Sampling Distribution of x̄
• Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores with σ_x̄ = 15.96; area = .7357 to the left of x̄ = 1707]

81
Sampling Distribution of x̄
• Example: St. Andrew’s College
Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (1687 - 1697)/15.96 = - .63

Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.63) = .2643

82
Sampling Distribution of x̄ for SAT Scores
• Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores with σ_x̄ = 15.96; area = .2643 to the left of x̄ = 1687]

83
Sampling Distribution of x̄ for SAT Scores
• Example: St. Andrew’s College
Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.63 < z < .63) = P(z < .63) - P(z < -.63)
= .7357 - .2643
= .4714
The probability that the estimate of population mean SAT score will be between 1687 and 1707 is:

P(1687 < x̄ < 1707) = .4714
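The five table steps collapse into one subtraction of cumulative probabilities. This sketch uses the standard library and treats the sampling distribution of x̄ as normal with mean 1697 and standard error 15.96, as in the example; the variable names are illustrative.

```python
from statistics import NormalDist

# Sampling distribution of the sample mean for n = 30.
x_bar_dist = NormalDist(mu=1697, sigma=15.96)

# Area between the lower and upper endpoints of the interval.
p_within_10 = x_bar_dist.cdf(1707) - x_bar_dist.cdf(1687)

print(round(p_within_10, 3))  # about 0.469
```

The small gap from the table answer of .4714 comes from rounding z to .63 before the lookup.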

84
Sampling Distribution of x̄ for SAT Scores
• Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores with σ_x̄ = 15.96; area = .4714 between x̄ = 1687 and x̄ = 1707, centered at 1697]

85
Relationship Between the Sample Size and the Sampling Distribution of x̄
• Example: St. Andrew’s College
• Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.
• E(x̄) = μ regardless of the sample size. In our example, E(x̄) remains at 1697.

• Whenever the sample size is increased, the standard error of the mean is decreased. With the increase in the sample size to n = 100, n/N = 100/900 exceeds .05, so the finite population correction factor applies and the standard error of the mean is decreased from 15.96 to:
$$\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \left( \frac{\sigma}{\sqrt{n}} \right) = \sqrt{\frac{900-100}{900-1}} \left( \frac{87.4}{\sqrt{100}} \right) = .9433(8.74) = 8.2$$

86
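Relationship Between the Sample Size and the Sampling Distribution of x̄
• Example: St. Andrew’s College

[Figure: two sampling distributions of x̄, both centered at E(x̄) = 1697; the curve with n = 100 (σ_x̄ = 8.2) is narrower than the curve with n = 30 (σ_x̄ = 15.96)]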

87
Relationship Between the Sample Size and the Sampling Distribution of x̄
• Example: St. Andrew’s College
• Recall that when n = 30, P(1687 < x̄ < 1707) = .4714.
• We follow the same steps to solve for P(1687 < x̄ < 1707) when n = 100 as we showed earlier when n = 30.
• Now, with n = 100, P(1687 < x̄ < 1707) = .7776.
• Because the sampling distribution with n = 100 has a smaller standard error, the values of x̄ have less variability and tend to be closer to the population mean than the values of x̄ with n = 30.

88
Relationship Between the Sample Size and the Sampling Distribution of x̄
• Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores with σ_x̄ = 8.2; area = .7776 between x̄ = 1687 and x̄ = 1707, centered at 1697]
89
Sampling Distribution of p̄
• Making Inferences about a Population Proportion

1. Population with proportion p = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample proportion p̄.
4. The value of p̄ is used to make inferences about the value of p.

90
Sampling Distribution of p̄
• The sampling distribution of p̄ is the probability distribution of all possible values of the sample proportion p̄.
• Expected Value of p̄
E(p̄) = p
where: p = the population proportion

91
Sampling Distribution of p̄
• Standard Deviation of p̄
Finite Population:
$$\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \sqrt{\frac{p(1-p)}{n}}$$
Infinite Population:
$$\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$$
• σ_p̄ is referred to as the standard error of the proportion.
• √((N − n)/(N − 1)) is the finite population correction factor.

92
Sampling Distribution of p̄
• The sampling distribution of p̄ can be approximated by a normal distribution whenever the sample size is large enough to satisfy the two conditions:

np > 5 and n(1 – p) > 5

• When these conditions are satisfied, the probability distribution of x in the sample proportion, p̄ = x/n, can be approximated by a normal distribution (and because n is a constant, so can the sampling distribution of p̄).

93
Sampling Distribution of p̄
• Example: St. Andrew’s College
Recall that 72% of the prospective students applying to St. Andrew’s College desire on-campus housing.
What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion?

94
Sampling Distribution of p̄
• Example: St. Andrew’s College
For our example, with n = 30 and p = .72, the normal distribution
is an acceptable approximation because:
np = 30(.72) = 21.6 > 5
and
n(1 - p) = 30(.28) = 8.4 > 5

95
Sampling Distribution of p̄
• Example: St. Andrew’s College

σ_p̄ = √(p(1 − p)/n) = √(.72(1 − .72)/30) = .082

[Figure: sampling distribution of p̄, centered at E(p̄) = .72]

96
Sampling Distribution of p̄
• Example: St. Andrew’s College
Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (.77 - .72)/.082 = .61
Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .61) = .7291

97
Sampling Distribution of p̄
• Example: St. Andrew’s College
Cumulative Probabilities for
the Standard Normal Distribution
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .

98
Sampling Distribution of p̄
• Example: St. Andrew’s College

[Figure: sampling distribution of p̄ with σ_p̄ = .082; area = .7291 to the left of p̄ = .77]

99
Sampling Distribution of p̄
• Example: St. Andrew’s College
Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (.67 - .72)/.082 = - .61
Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.61) = .2709

100
Sampling Distribution of p̄
• Example: St. Andrew’s College

[Figure: sampling distribution of p̄ with σ_p̄ = .082; area = .2709 to the left of p̄ = .67]

101
Sampling Distribution of p̄
• Example: St. Andrew’s College
Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.61 < z < .61) = P(z < .61) - P(z < -.61)
= .7291 - .2709
= .4582
The probability that the estimate of the population proportion of applicants desiring on-campus housing is within plus or minus .05 of the actual population proportion is:
P(.67 < p̄ < .77) = .4582
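The whole proportion example, from the normal-approximation check to the final probability, can be sketched as follows. The variable names are illustrative; the infinite-population formula is used because n/N = 30/900 is below .05.

```python
import math
from statistics import NormalDist

p, n = 0.72, 30

# Conditions for the normal approximation, as stated on the slides.
assert n * p > 5 and n * (1 - p) > 5

# Standard error of the proportion (no finite population correction).
se_p = math.sqrt(p * (1 - p) / n)

# Probability that the sample proportion lands within .05 of p.
p_bar_dist = NormalDist(mu=p, sigma=se_p)
prob_within_05 = p_bar_dist.cdf(p + 0.05) - p_bar_dist.cdf(p - 0.05)

print(round(se_p, 3))            # 0.082
print(round(prob_within_05, 3))  # about 0.458
```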

102
Sampling Distribution of p̄
• Example: St. Andrew’s College

[Figure: sampling distribution of p̄ with σ_p̄ = .082; area = .4582 between p̄ = .67 and p̄ = .77, centered at .72]

103
Other Sampling Methods
• Stratified Random Sampling
• Cluster Sampling
• Systematic Sampling
• Convenience Sampling
• Judgment Sampling

104
Stratified Random Sampling
• The population is first divided into groups of elements called strata.
• Each element in the population belongs to one and only one stratum.
• Best results are obtained when the elements within each stratum are as
much alike as possible (i.e. a homogeneous group).

105
Stratified Random Sampling
• A simple random sample is taken from each stratum.
• Formulas are available for combining the stratum sample results into one
population parameter estimate.
• Advantage: If strata are homogeneous, this method provides results that are
as “precise” as simple random sampling but with a smaller total sample size.
• Example: The basis for forming the strata might be department, location,
age, industry type, and so on.

106
Cluster Sampling
• The population is first divided into separate groups of elements called
clusters.
• Ideally, each cluster is a representative small-scale version of the population
(i.e. heterogeneous group).
• A simple random sample of the clusters is then taken.
• All elements within each sampled (chosen) cluster form the sample.

107
Cluster Sampling
• Example: A primary application is area sampling, where clusters are city
blocks or other well-defined areas.
• Advantage: The close proximity of elements can be cost effective (i.e. many
sample observations can be obtained in a short time).
• Disadvantage: This method generally requires a larger total sample size than
simple or stratified random sampling.

108
Systematic Sampling
• If a sample size of n is desired from a population containing N elements, we
might sample one element for every N/n elements in the population.
• We randomly select one of the first N/n elements from the population list.
• We then select every N/nth element that follows in the population list.
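The three bullets above can be sketched as a short function. It assumes the population list is held in memory and that N is a multiple of n; the function name and seed are illustrative.

```python
import random


def systematic_sample(elements, n, seed=None):
    """Pick a random start among the first N/n elements, then every (N/n)th element."""
    step = len(elements) // n                    # sampling interval N/n
    start = random.Random(seed).randrange(step)  # random position within the first interval
    return elements[start::step][:n]


population = list(range(1, 901))  # e.g. 900 numbered elements
sample = systematic_sample(population, 30, seed=7)
print(sample[:5])
```

Only one random number is needed, which is why the sample is usually easier to identify than with simple random sampling.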

109
Systematic Sampling
• This method has the properties of a simple random sample, especially if the
list of the population elements is a random ordering.
• Advantage: The sample usually will be easier to identify than it would be if
simple random sampling were used.
• Example: Selecting every 100th listing in a telephone book after the first
randomly selected listing.

110
Convenience Sampling
• It is a nonprobability sampling technique. Items are included in the sample
without known probabilities of being selected.
• The sample is identified primarily by convenience.
• Example: A professor conducting research might use student volunteers to
constitute a sample.

111
Convenience Sampling
• Advantage: Sample selection and data collection are relatively easy.
• Disadvantage: It is impossible to determine how representative of the
population the sample is.

112
Judgment Sampling
• The person most knowledgeable on the subject of the study selects elements
of the population that he or she feels are most representative of the
population.
• It is a nonprobability sampling technique.
• Example: A reporter might sample three or four senators, judging them as
reflecting the general opinion of the senate.

113
Judgment Sampling
• Advantage: It is a relatively easy way of selecting a sample.
• Disadvantage: The quality of the sample results depends on the judgment
of the person selecting the sample.

114
Recommendation
• It is recommended that probability sampling methods (simple random,
stratified, cluster, or systematic) be used.
• For these methods, formulas are available for evaluating the “goodness” of
the sample results in terms of the closeness of the results to the population
parameters being estimated.
• An evaluation of the goodness cannot be made with non-probability
(convenience or judgment) sampling methods.

115
