MC Math 13 Module 10

1. The document provides an overview of the central limit theorem, confidence intervals, and sample size. 2. It explains that as sample sizes increase, the sampling distribution of the mean becomes normally distributed, even if the original data is not normal. 3. Examples are provided to demonstrate how to calculate probabilities and confidence intervals related to sample means based on what is known about the population parameters.

Uploaded by

Raffy Barotilla
Copyright
© All Rights Reserved

Module 10

The Central Limit Theorem,


Confidence Interval and
Sample Size
Prepared by:
EDWARD B. PESCUELA
Instructor

“Most people use statistics like a drunk man uses a lamppost; more for support than
illumination”
― Andrew Lang

MC MATH 13: ELEMENTARY STATISTICS & PROBABILITY |PESCUELA s. 2021


MODULE 10
The Central Limit Theorem, Confidence Interval
and Sample Size
Introduction
The Central Limit Theorem tells us that as sample sizes get larger, the sampling distribution of the
mean approaches a normal distribution, even if the population from which the samples are drawn is not
normally distributed. The Central Limit Theorem is important for statistics because it allows us to treat
the sampling distribution of the mean as approximately normal when the sample size is large. This means
that we can take advantage of statistical techniques that assume a normal distribution, as we will see in
the next topics.
Before each activity, fast facts and discussions are given to help you understand the concepts and
processes involved as well as to solve problems in each activity. The activities will be done individually.
Answers in every assessment must be written or encoded on a short bond paper following the given
format. Please do not forget to write your significant learning experience at the last part of your output. The
submission of Module 10 outputs will be on May 18, 2021. If you have queries, you may reach me through
FB Group Chat during our scheduled date. Thank you and have fun!

Format

Name: ______________________________________ Year & Section: ________________


Date: _______________________________________ Instructor: Mr. Edward B. Pescuela

MC Math 13: Elementary Statistics & Probability


Module 10: The Central Limit Theorem, Confidence Interval
and Sample Size
Pretest/Exercise1/Activity 1
1.)
2.)
3.)

My Significant Learning Experience


In this module, I have learned that …
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
.

_____________________________
Signature over Printed Name



Objectives
At the end of the week, the pre-service teacher (PST) should be able to:
1. apply the Central Limit Theorem to solve problems involving sample means for large
samples
2. compute for the sample size needed
3. construct and interpret 95% and 99% confidence intervals for means
4. use digital technology in finding the probability of an event and the number of ways r
objects can be selected from n objects with or without regard to order

The Central Limit Theorem


The Central Limit Theorem is important because it allows us to develop a process to
estimate and test the mean of a population using a sample.
In addition to knowing how individual data values vary about the mean for a population,
statisticians are interested in knowing how the means of samples of the same size taken from the
same population vary about the population mean.

Distribution of Sample Means


Suppose a researcher selects a sample of 30 adult males and finds the mean of the measure
of the triglyceride levels for the sample subjects to be 187 milligrams/deciliter. Then suppose a
second sample is selected, and the mean of that sample is found to be 192 milligrams/deciliter.
Continue the process for 100 samples. What happens then is that the mean becomes a random
variable, and the sample means 187, 192, 184, . . . , 196 constitute a sampling distribution of sample
means.

When all possible samples of a specific size are selected with replacement from a population, the
distribution of the sample means for a variable has two important properties, which are explained next.



The Central Limit Theorem:
• specifies a theoretical distribution
• the distribution is formulated by the selection of all possible random samples of a fixed size n
• a sample mean is calculated for each sample, producing a sampling distribution

SAMPLING DISTRIBUTION OF THE MEAN


The sampling distribution of the mean is formed by taking the means of samples of the same size drawn
from a given population.

• The mean of the sample means is equal to the mean of the population from which the samples were
drawn.
• The standard deviation of the sample means is equal to the population standard deviation divided by
the square root of n (it is called the standard error).

STANDARD ERROR
The standard deviation of the sample means is called the standard error of the mean.
Hence,

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
A third property of the sampling distribution of sample means pertains to the shape of the
distribution and is explained by the central limit theorem.

CENTRAL LIMIT THEOREM


1. Consider a population with mean 𝜇 and standard deviation 𝜎.
2. Draw random samples of n observations from this population, where n is a large number
(n > 30).
3. Find the mean $\bar{x}$ of each sample.
4. The distribution of the sample means $\bar{x}$ will be approximately normal. This distribution is
called the Sampling Distribution of the Means or the Distribution of Sample Means.
5. The mean and standard deviation (called the standard error) of the Distribution of Sample
Means are:

$$\mu_{\bar{x}} = \mu$$

The mean of the Sampling Distribution equals the mean of the Population.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

The standard error equals the standard deviation of the population divided by the square root of the
sample size.
6. The approximation becomes more accurate as n becomes large.
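
The statements above can be checked with a small simulation. The sketch below is only an illustration under assumed settings that are not part of the module: the population is exponential with mean 1 and standard deviation 1 (clearly not normal), the sample size is n = 64, and 5,000 samples are drawn. The function name `simulate_sample_means` and the seed are also arbitrary choices.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, seed=2021):
    """Return the means of num_samples random samples of size n drawn
    from an exponential population with mean 1 and standard deviation 1."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

n = 64
means = simulate_sample_means(n, num_samples=5000)

mean_of_means = statistics.fmean(means)   # expected to be close to mu = 1
observed_se = statistics.stdev(means)     # expected to be close to sigma / sqrt(n)
theoretical_se = 1 / n ** 0.5             # 1 / 8 = 0.125

# Share of sample means within 1.96 standard errors of mu
# (roughly 95% if the sampling distribution is close to normal).
within = sum(abs(m - 1) <= 1.96 * theoretical_se for m in means) / len(means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"observed standard error: {observed_se:.3f} (theory: {theoretical_se:.3f})")
print(f"share within 1.96 standard errors: {within:.3f}")
```

Even though individual exponential values are strongly skewed, the sample means should cluster around 1 with a spread near 0.125, which is what the three properties predict.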

EXAMPLE 1. Brand of Tires


A certain brand of tires has a mean life of 25,000 miles with a standard deviation of 1600
miles. What is the probability that the mean life of 64 tires is less than 24,600 miles?

Solution
The sampling distribution of the means has a mean of 25,000 miles (the population mean),

$$\mu_{\bar{x}} = \mu = 25{,}000 \text{ mi}$$

and a standard deviation (i.e., standard error) of

$$\sigma_{\bar{x}} = \frac{1600}{\sqrt{64}} = 200 \text{ mi}$$

Convert 24,600 mi to a z-score and use the normal table (or Excel) to determine the required
probability.

$$z = \frac{24{,}600 - 25{,}000}{200} = -2$$

$$P(z < -2) = 0.0228$$

That is, 2.28% of the sample means will be less than 24,600 mi.
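
For readers who prefer software to the normal table, the same calculation can be sketched in Python. This is one possible approach, not the module's prescribed method; it uses `NormalDist` from the standard library, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 25_000      # population mean life (miles)
sigma = 1_600    # population standard deviation (miles)
n = 64           # number of tires in the sample

standard_error = sigma / sqrt(n)        # 1600 / 8 = 200
z = (24_600 - mu) / standard_error      # (24600 - 25000) / 200 = -2
probability = NormalDist().cdf(z)       # area to the left of z = -2

print(f"standard error = {standard_error}, z = {z}, P = {probability:.4f}")
# prints: standard error = 200.0, z = -2.0, P = 0.0228
```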

EXAMPLE 2. Meat Consumption


The average number of pounds of meat that a person consumes per year is 218.4 pounds.
Assume that the standard deviation is 25 pounds and the distribution is approximately normal.
Source: Michael D. Shook and Robert L. Shook, The Book of Odds.

a. Find the probability that a person selected at random consumes less than 224 pounds per year.

b. If a sample of 40 individuals is selected, find the probability that the mean of the sample is
less than 224 pounds per year.

Solution
a. For one individual, $z = \frac{224 - 218.4}{25} \approx 0.22$ and $P(z < 0.22) = 0.5871$. The probability
that a person selected at random consumes less than 224 pounds per year is 0.5871, or 58.71%.

b. For the mean of a sample of 40, $z = \frac{224 - 218.4}{25/\sqrt{40}} \approx 1.42$ and
$P(z < 1.42) = 0.9222$.

Hence, the probability that the mean of a sample of 40 individuals is less than 224 pounds
per year is 0.9222, or 92.22%. That is, $P(\bar{X} < 224) = 0.9222$. Comparing the two probabilities,
you can see that the probability of selecting an individual who consumes less than 224 pounds of
meat per year is 58.71%, but the probability of selecting a sample of 40 people with a mean
consumption of meat that is less than 224 pounds per year is 92.22%. This rather large difference
is due to the fact that the distribution of sample means is much less variable than the distribution of
individual data values. (Note: An individual person is the equivalent of saying n = 1.)
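
The two probabilities in this example can be reproduced with a short Python sketch. It assumes z is rounded to two decimal places, as is done when reading a normal table; without that rounding the results differ slightly in the third decimal place. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 218.4, 25.0   # population mean and standard deviation (pounds)
x = 224.0                 # cutoff value (pounds per year)
n = 40                    # sample size for part b

standard_normal = NormalDist()

# Part a: a single individual, so the population standard deviation is used.
z_individual = round((x - mu) / sigma, 2)
p_individual = standard_normal.cdf(z_individual)

# Part b: the mean of n = 40 people, so the standard error sigma / sqrt(n) is used.
z_sample_mean = round((x - mu) / (sigma / sqrt(n)), 2)
p_sample_mean = standard_normal.cdf(z_sample_mean)

print(f"a. z = {z_individual}, P = {p_individual:.4f}")
print(f"b. z = {z_sample_mean}, P = {p_sample_mean:.4f}")
```

The only change between the two parts is the denominator of the z-score, which is why the sample mean is so much more likely to fall below 224 pounds.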

Try This!
Teachers’ Salaries in the Philippines
The average teacher’s salary in the Philippines is PHP37,764. Assume a normal
distribution with 𝜎 = 𝑃𝐻𝑃 5,100.
a. What is the probability that a randomly selected teacher’s salary is greater than PHP 45,000?
b. For a sample of 75 teachers, what is the probability that the sample mean is greater than PHP
38,000?

Confidence Intervals and Sample Size


One aspect of inferential statistics is estimation, which is the process of estimating the value
of a parameter from information obtained from a sample. For example, The Book of Odds, by
Michael D. Shook and Robert L. Shook (New York: Penguin Putnam, Inc.), contains the following
statements:
“One out of 4 Americans is currently dieting.” (Calorie Control Council)
“Seventy-two percent of Americans have flown on commercial airlines.” (“The Bristol
Meyers Report: Medicine in the Next Century”)
“The average kindergarten student has seen more than 5000 hours of television.” (U.S.
Department of Education)
“The average school nurse makes $32,786 a year.” (National Association of School
Nurses)
“The average amount of life insurance is $108,000 per household with life insurance.”
(American Council of Life Insurance)



Since the populations from which these values were obtained are large, these values are only
estimates of the true parameters and are derived from data collected from samples. The statistical
procedures for estimating the population mean, proportion, variance, and standard deviation will
be explained in this module.

Confidence Intervals for the Mean When 𝜎 Is Known


Suppose a college president wishes to estimate the average age of students attending classes
this semester. The president could select a random sample of 100 students and find the average age
of these students, say, 22.3 years. From the sample mean, the president could infer that the average
age of all the students is 22.3 years. This type of estimate is called a point estimate.

You might ask why other measures of central tendency, such as the median and mode, are
not used to estimate the population mean. The reason is that the means of samples vary less than
other statistics (such as medians and modes) when many samples are selected from the same
population. Therefore, the sample mean is the best estimate of the population mean.
Sample measures (i.e., statistics) are used to estimate population measures (i.e., parameters).
These statistics are called estimators. As previously stated, the sample mean is a better estimator
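of the population mean than the sample median or sample mode. A good estimator should satisfy
the following three properties:
1. It should be unbiased: the mean of the estimates obtained from samples of a given size is
equal to the parameter being estimated.
2. It should be consistent: as the sample size increases, the value of the estimator approaches
the value of the parameter being estimated.
3. It should be relatively efficient: of all the statistics that can be used to estimate a
parameter, it has the smallest variance.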

Confidence Intervals
As stated in the previous module, the sample mean will be, for the most part, somewhat different
from the population mean due to sampling error. Therefore, you might ask a second question: How
good is a point estimate? The answer is that there is no way of knowing how close a particular point
estimate is to the population mean. This answer places some doubt on the accuracy of point
estimates. For this reason, statisticians prefer another type of estimate, called an interval estimate.



In an interval estimate, the parameter is specified as being between two values. For example,
an interval estimate for the average age of all students might be 21.9 < 𝜇 < 22.7, or 22.3 ± 0.4
years.
Either the interval contains the parameter or it does not. A degree of confidence (usually a
percent) can be assigned before an interval estimate is made. For instance, you may wish to be 95%
confident that the interval contains the true population mean. Another question then arises. Why
95%? Why not 99 or 99.5%?
If you desire to be more confident, such as 99 or 99.5% confident, then you must make the
interval larger. For example, a 99% confidence interval for the mean age of college students might
be 21.7 < 𝜇 < 22.9, or 22.3 ± 0.6. Hence, a tradeoff occurs. To be more confident that the
interval contains the true population mean, you must make the interval wider.

Intervals constructed in this way are called confidence intervals. Three common confidence
intervals are used: the 90, the 95, and the 99% confidence intervals.

The term $z_{\alpha/2}(\sigma/\sqrt{n})$ is called the margin of error (also called the maximum error of the
estimate). For a specific value, say, $\alpha = 0.05$, 95% of the sample means will fall within this error
value on either side of the population mean, as previously explained. See Figure 7–1.
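
To see how the margin of error depends on the confidence level, the sketch below computes $z_{\alpha/2}$ and $E = z_{\alpha/2}(\sigma/\sqrt{n})$ for the three common levels; the interval is then the sample mean plus or minus E. The helper names `critical_z` and `margin_of_error`, and the values σ = 2 and n = 100, are assumptions made only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def critical_z(confidence):
    """Return z_{alpha/2}, the z value that leaves alpha/2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def margin_of_error(confidence, sigma, n):
    """Return E = z_{alpha/2} * sigma / sqrt(n)."""
    return critical_z(confidence) * sigma / sqrt(n)

# Hypothetical population standard deviation and sample size.
sigma, n = 2.0, 100

for confidence in (0.90, 0.95, 0.99):
    z = critical_z(confidence)
    e = margin_of_error(confidence, sigma, n)
    print(f"{confidence:.0%} confidence: z = {z:.3f}, E = {e:.3f}")
```

The computed z values are about 1.645, 1.960, and 2.576; normal tables usually round them to 1.65, 1.96, and 2.58. Because z grows with the confidence level, so does the margin of error, which is the tradeoff between confidence and interval width.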



When 𝑛 ≥ 30, s can be substituted for 𝜎 but a different distribution is used.

A more detailed explanation of the margin of error follows Examples 1 and 2, which illustrate
the computation of confidence intervals.

Example 1. Days It Takes to Sell an Aveo


A researcher wishes to estimate the number of days it takes an automobile dealer to sell a
Chevrolet Aveo. A sample of 50 cars had a mean time on the dealer’s lot of 54 days. Assume the
population standard deviation to be 6.0 days. Find the best point estimate of the population mean
and the 95% confidence interval of the population mean.
Source: Based on information obtained from Power Information Network.



Solution
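The best point estimate of the mean is 54 days. For the 95% confidence interval use $z = 1.96$.

$$54 - 1.96\left(\frac{6.0}{\sqrt{50}}\right) < \mu < 54 + 1.96\left(\frac{6.0}{\sqrt{50}}\right)$$

$$54 - 1.7 < \mu < 54 + 1.7$$

$$52.3 < \mu < 55.7$$

Hence, one can say with 95% confidence that the interval between 52.3 and 55.7 days does
contain the population mean, based on a sample of 50 automobiles.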
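
The interval in this example can be checked with a few lines of Python. This is a minimal sketch; the function name `confidence_interval` is an illustrative choice, and the z value is passed in directly so that it matches the table value 1.96 used in the solution.

```python
from math import sqrt

def confidence_interval(sample_mean, z, sigma, n):
    """Return (lower, upper) limits for the mean when sigma is known."""
    margin = z * sigma / sqrt(n)   # margin of error
    return sample_mean - margin, sample_mean + margin

lower, upper = confidence_interval(sample_mean=54, z=1.96, sigma=6.0, n=50)
print(f"{lower:.1f} < mu < {upper:.1f}")
# prints: 52.3 < mu < 55.7
```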

Example 2. Credit Union Assets


The following data represent a sample of the assets (in millions of dollars) of 30 credit unions
in southwestern Pennsylvania. Find the 90% confidence interval of the mean.
12.23 16.56 4.39
2.89 1.24 2.17
13.19 9.16 1.42
73.25 1.91 14.64
11.59 6.69 1.06
8.74 3.17 18.13
7.92 4.78 16.85
40.22 2.42 21.58
5.01 1.47 12.24
2.27 12.77 2.76
Source: Pittsburgh Post Gazette.

Solution
Step 1 Find the mean and standard deviation for the data. Use the formulas shown in
Chapter 3 or your calculator. The mean 𝑋̅ = 11.091. Assume the standard deviation of the
population is 14.405.

Step 2 Find $\alpha/2$. Since the 90% confidence interval is to be used, $\alpha = 1 - 0.90 = 0.10$,
and

$$\frac{\alpha}{2} = \frac{0.10}{2} = 0.05$$

Step 3 Find $z_{\alpha/2}$. Subtract 0.05 from 1.000 to get 0.9500. The corresponding z value
obtained from Table E is 1.65. (Note: This value is found by using the z value for an area between
0.9495 and 0.9505. A more precise z value obtained mathematically is 1.645 and is sometimes
used; however, 1.65 will be used in this textbook.)

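Step 4 Substitute in the formula

$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

$$11.091 - 1.65\left(\frac{14.405}{\sqrt{30}}\right) < \mu < 11.091 + 1.65\left(\frac{14.405}{\sqrt{30}}\right)$$

$$11.091 - 4.339 < \mu < 11.091 + 4.339$$

$$6.752 < \mu < 15.430$$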

Hence, one can be 90% confident that the population mean of the assets of all credit unions
is between $6.752 million and $15.430 million, based on a sample of 30 credit unions.
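
The four steps can also be carried out in Python, starting from the raw data. The sketch below follows the example's own choices: the population standard deviation is taken as 14.405, the table value z = 1.65 is used, and the mean is rounded to three decimals before substitution. The variable names are illustrative.

```python
from math import sqrt
from statistics import fmean

# Assets (in millions of dollars) of the 30 credit unions in the sample.
assets = [
    12.23, 16.56, 4.39, 2.89, 1.24, 2.17, 13.19, 9.16, 1.42, 73.25,
    1.91, 14.64, 11.59, 6.69, 1.06, 8.74, 3.17, 18.13, 7.92, 4.78,
    16.85, 40.22, 2.42, 21.58, 5.01, 1.47, 12.24, 2.27, 12.77, 2.76,
]

sample_mean = round(fmean(assets), 3)   # Step 1: 11.091
sigma = 14.405                          # population standard deviation assumed in the example
z = 1.65                                # Steps 2 and 3: table value for 90% confidence

margin = z * sigma / sqrt(len(assets))  # Step 4: margin of error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"mean = {sample_mean}, {lower:.3f} < mu < {upper:.3f}")
# prints: mean = 11.091, 6.752 < mu < 15.430
```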

Sample Size
Sample size determination is closely related to statistical estimation. Quite often you ask,
How large a sample is necessary to make an accurate estimate? The answer is not simple, since it
depends on three things: the margin of error, the population standard deviation, and the degree of
confidence. For example, how close to the true mean do you want to be (2 units, 5 units, etc.), and
how confident do you wish to be (90, 95, 99%, etc.)? For the purpose of this chapter, it will be
assumed that the population standard deviation of the variable is known or has been estimated from
a previous study.



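The formula for sample size is derived from the margin of error formula

$$E = z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$

and this formula is solved for n as follows:

$$E\sqrt{n} = z_{\alpha/2}\,\sigma$$

$$\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{E}$$

Hence,

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2}$$

where E is the margin of error. If the result is not a whole number, round it up to the next whole
number.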
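
Solving the margin of error formula for n gives $n = (z_{\alpha/2}\,\sigma / E)^2$, rounded up to the next whole number so that the margin of error is not exceeded. A minimal Python sketch of this calculation follows; the function name `sample_size` and the numbers in the usage line (σ = 10, E = 2, 95% confidence) are hypothetical and are not taken from the module's problems.

```python
from math import ceil

def sample_size(z, sigma, margin_of_error):
    """Return the minimum sample size n = (z * sigma / E)^2, rounded up."""
    return ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical case: sigma = 10, desired margin of error E = 2, 95% confidence.
n_needed = sample_size(z=1.96, sigma=10, margin_of_error=2)
print(n_needed)   # (1.96 * 10 / 2)^2 = 96.04, rounded up to 97
```

Rounding up rather than to the nearest whole number is deliberate: a sample of 96 in this case would give a margin of error slightly larger than 2.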



Evaluation
In each problem show all steps of the confidence interval. If some of the assumptions are
not met, note that the results of the interval may not be correct and then continue the process of
the confidence interval.
1.) The Kyoto Protocol was signed in 1997, and required countries to start reducing their carbon
emissions. The protocol became enforceable in February 2005. Table 1 contains CO2
emissions from a random sample of 30 countries in 2010 ("CO2 emissions," 2013).
Compute a 99% confidence interval to estimate the mean CO2 emission for all countries in
2010. Assume 𝜎 = 2.55

Table 1: CO2 Emissions (metric tons per capita) in 2010


1.36 1.42 5.93 5.36 0.06 9.11 7.32 2.86
7.93 6.72 0.78 1.80 0.20 2.27 0.28 2.44
5.86 3.46 1.46 0.14 2.62 0.79 7.48
0.86 7.84 2.87 2.45 5.85 3.45 1.45

2.) Many people feel that cereal is a healthier alternative for children than glazed donuts. Table
2 contains the amount of sugar in a random sample of cereal that is geared towards children
("Healthy breakfast story," 2013). Estimate the mean amount of sugar in all children’s
cereals using a 95% confidence level. Assume the standard deviation of the population was
2.8.
Table 2: Sugar Amounts (g) in Children’s Cereal
10 14 12 9 13 13 13
11 12 15 9 10 11 3
6 12 15 12 12

3.) In Florida, bass fish were collected in 53 randomly selected lakes to measure the amount of
mercury in the fish. The data for the average amount of mercury in each lake is in Table 3
("Multi-disciplinary niser activity," 2013). Compute a 90% confidence interval for the mean
amount of mercury in fish in all Florida lakes. Assume the standard deviation of the
population was 1.35.
Table 3: Average Mercury Levels (mg/kg) in Fish
1.23 1.33 0.04 0.44 1.20 0.27 0.94
0.48 0.19 0.83 0.81 0.71 0.5 0.40
0.49 1.16 0.05 0.15 0.19 0.77 0.43
1.08 0.98 0.63 0.56 0.41 0.73 0.25
0.34 0.59 0.34 0.84 0.50 0.34 0.27
0.28 0.34 0.87 0.56 0.17 0.18
0.19 0.04 0.49 1.10 0.16 0.10
0.48 0.21 0.86 0.52 0.65 0.27



4.) National Accounting Examination. If the variance of a national accounting examination is
900, how large a sample is needed to estimate the true mean score within 5 points with 99%
confidence?

5.) A machine fills containers with corn meal. The machine is set to put 680 grams in each
container on the average. The standard deviation is equal to 0.5 gram. The average fill is
known to shift from time to time. However, the variability remains constant. That is, σ
remains constant at 0.5 gram. In order to estimate μ, how many containers should be selected
from a large production run so that the maximum error of estimate equals 0.2 gram with
probability 0.95?

Reference

Bluman, Allan G. Elementary Statistics: A Step by Step Approach. 8th ed.
