Business Statistics CH 2
WOLKITE UNIVERSITY
DEPARTMENT OF ACCOUNTING AND FINANCE
CHAPTER TWO: Statistical Estimations
DISCUSSION POINTS:
❑ Basic concepts
❑ Point estimators of the mean and proportion
❑ Interval estimators of the mean and proportion
❑ Student's t-distribution
❑ Determining the sample size
2.1. Basic concepts
• Statistical Inference is the process of making judgment about a
population based on sampling properties.
• An important aspect of statistical inference is using estimates to
approximate the value of an unknown population parameter.
• This chapter will study different kinds of estimator and lay the
foundations for making statistical inference about the population
mean and proportion.
Some Basic Terms
• Parameter: A numerical characteristic of a population, such as a
population mean μ , a population standard deviation 𝜎 , a
population proportion p, and so on.
• Sample statistic: A sample characteristic, such as a sample mean
X̄, a sample standard deviation s, a sample proportion p̂,
and so on. The value of the sample statistic is used to estimate
the value of the corresponding population parameter.
• Standard error: The standard deviation of a point estimator.
• Statistical estimation: the procedure of using a sample
statistic (measure) to estimate a population parameter.
• That is, it is any procedure where sample information is used to
estimate or predict the numerical value of some population measure
(called a parameter).
Cont’d
• Estimator: any sample statistic that is used to estimate a
population parameter.
• An estimator of a population parameter is a random variable
that depends on the sample information; its value provides an
approximation of the unknown parameter. E.g. X̄ for μ and p̂ for p.
• Estimate: a specific numerical value of an estimator. E.g. X̄ =
9, 2, 5
Cont’d
3. Consistency: a statistic is a consistent estimator of a
population parameter if, as the sample size increases, it becomes
almost certain that the value of the statistic comes very close to
the value of the population parameter.
• If an estimator is consistent, it becomes more reliable with
large samples. The standard error of a consistent estimator
becomes smaller as the sample size gets larger.
• The sample mean and the sample proportion are consistent
estimators: from their formulas, as n gets large, the standard
error becomes small, that is,
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \to 0 \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \to 0 \quad \text{as } n \to \infty \]
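To make the consistency property concrete, the following minimal sketch evaluates the two standard error formulas for increasing sample sizes. The function names and the values σ = 10 and p = 0.5 are hypothetical, chosen only for illustration.

```python
import math


def se_mean(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def se_proportion(p: float, n: int) -> float:
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical population values, used only to show the trend.
sigma, p = 10.0, 0.5
for n in (25, 100, 400, 1600):
    # Each fourfold increase in n halves both standard errors.
    print(n, se_mean(sigma, n), se_proportion(p, n))
```

Both columns shrink toward zero as n grows, which is the behaviour the consistency definition describes.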
2.3. Types of Statistical Estimates
• As its name suggests, the objective of estimation is to determine
the approximate value of a population parameter on the basis of
a sample statistic.
• An estimator of a population parameter is a random variable
that is a function of the sample data.
• An estimate is the calculation of a specific value of this random
variable.
• We can use sample data to estimate a population parameter in
two ways:
1. Point estimator
2. Interval estimator
2.2. Point estimators of the mean and proportion
2.3. Interval estimators of the mean and proportion
• An interval estimate is a range of values used to estimate the
population parameter.
• It places the unknown population parameter between two
limits.
• An interval estimate provides more information about a
population characteristic than a point estimate does. Such
interval estimates are called confidence interval estimates.
• Because it gives a range rather than a single value, it accounts
for the error associated with the sampling procedure.
• An interval estimator draws inferences about a population by
estimating the value of an unknown parameter using an
interval that is likely to include the value of the population
parameter.
Interval estimators of the mean and proportion
Confidence intervals are constructed for:
• Population mean
➢ σ² known
➢ σ² unknown
• Population proportion
Cont’d
\[ ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
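Finding the Reliability Factor, \( z_{\alpha/2} \)
• Consider a 95% confidence interval: 1 − α = .95, so α = .05 and
each tail of the standard normal distribution has area α/2 = .025.
The reliability factor is \( z_{.025} = 1.96 \).

Confidence Level    Confidence Coefficient, 1 − α    z_{α/2} value
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.58
99.8%               .998                             3.09
99.9%               .999                             3.29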
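The reliability factors above can also be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_alpha_over_2` is hypothetical.

```python
from statistics import NormalDist


def z_alpha_over_2(confidence_level: float) -> float:
    """Return the z value that leaves area alpha/2 in the upper tail."""
    alpha = 1.0 - confidence_level
    # inv_cdf takes a cumulative (left-tail) probability, so use 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {z_alpha_over_2(level):.3f}")
```

Rounded to two or three decimals, the computed values agree with the first five rows of the table.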
Example 1
• Suppose that the standard deviation of the tube life for a particular
brand of TV picture tube is known to be σ = 500, but that the mean
operating life is not known. Overall, the operating life of the tubes is
assumed to be approximately normally distributed. For a sample of n =
15, the mean operating life is X̄ = 8,900 hr. Determine (a) the 95 percent
and (b) the 90 percent confidence intervals for estimating the
population mean.
• The normal probability distribution can be used in this case because the
population is normally distributed and σ is known.
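As a sketch of how the two intervals can be computed, the snippet below applies X̄ ± z σ/√n with the values given in the example and the reliability factors 1.96 and 1.645. The function name `z_interval` is hypothetical.

```python
import math


def z_interval(x_bar: float, sigma: float, n: int, z: float) -> tuple:
    """Confidence interval for the mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)  # margin of error, ME
    return x_bar - margin, x_bar + margin


# Values from Example 1: sigma = 500, n = 15, sample mean = 8,900 hr.
low95, high95 = z_interval(8900, 500, 15, 1.96)
low90, high90 = z_interval(8900, 500, 15, 1.645)
print(f"95%: {low95:.0f} to {high95:.0f} hr")
print(f"90%: {low90:.0f} to {high90:.0f} hr")
```

The 95 percent interval is roughly 8,647 to 9,153 hr and the 90 percent interval roughly 8,688 to 9,112 hr; the lower confidence level gives the narrower interval.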
Example 2
• With respect to Example 1 above, suppose that the population of
tube life cannot be assumed to be normally distributed. However, the
sample mean of X̄ = 8,900 hr is based on a sample of n = 35.
• Determine a 95% confidence interval for the true mean operating life
of the population.
• The normal probability distribution can be used in this case by invoking
the central limit theorem, which indicates that for n > 30 the sampling
distribution can be assumed to be normally distributed even though the
population is not normally distributed.
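A worked sketch of the calculation, assuming σ = 500 carries over from Example 1 (the example does not restate it) and using z = 1.96 for 95% confidence:

```latex
\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
  = 8900 \pm 1.96 \cdot \frac{500}{\sqrt{35}}
  \approx 8900 \pm 165.65
```

Under that assumption, the interval runs from about 8,734 to about 9,066 hr.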
Example 3
• Suppose that shopping times for customers at a local mall are normally
distributed with known population standard deviation of 20 minutes. A
random sample of 64 shoppers in the local grocery store had a mean
time of 75 minutes. Find the standard error, margin of error, and the
upper and lower confidence limits of a 95% confidence interval for the
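population mean, μ.
Solution
The standard error is σ/√n = 20/√64 = 2.5 minutes, and the margin of error is
ME = 1.96 × 2.5 = 4.9 minutes.
The upper and lower confidence limits for a 95% confidence interval are
as follows: UCL = 75 + 4.9 = 79.9 and LCL = 75 − 4.9 = 70.1.
Based on a sample of 64 observations, we are 95% confident that the true unknown
population mean lies between approximately 70 minutes and approximately 80 minutes.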
B) Confidence Interval Estimation for the Mean of a Normal
Population: Population Variance Unknown
• In the case of a small sample size (n < 30) and an unknown population
standard deviation, the t distribution is applied.
• Suppose there is a random sample of n observations from a
normal distribution with mean μ and unknown variance. If the
sample mean and standard deviation are, respectively, X̄ and s,
then the degrees of freedom are v = n − 1, and a 100(1 − α)%
confidence interval for the population mean with unknown
variance is given by
\[ \bar{x} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} \]
or
\[ \bar{x} \pm ME, \qquad ME = t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} \]
Student’s t Distribution
• Consider a random sample of n observations
➢ with mean X̄ and standard deviation s
➢ from a normally distributed population with mean μ
Then the variable
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
follows the Student's t distribution with (n − 1) degrees of
freedom.
Cont’d
• The t distribution is commonly called Student's t distribution, or
simply Student's distribution.
• In using the t distribution, we assume that the population is
normal or approximately normal.
• Degrees of Freedom: The degrees of freedom are the number
of values that are free to vary after a sample statistic has been
computed, and they tell the researcher which specific curve to
use when a distribution consists of a family of curves.
• Degrees of freedom = n − 1
Cont’d
Characteristics of the t Distribution
• The t distribution is similar to the standard normal distribution
in these ways:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal to 0 and are located
at the center of the distribution.
4. The curve never touches the x axis.
Cont’d
Student's t Distribution
Note: t → Z as n increases.
[Figure: density curves of the standard normal (t with df = ∞), t (df = 13), and t (df = 5), all centered at 0 on the t axis]
t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the
normal.
Cont’d
Student's t Table

        Upper Tail Area
df      .10      .05      .025
1       3.078    6.314    12.706
2       1.886    2.920    4.303
3       1.638    2.353    3.182

Let n = 3, so df = n − 1 = 2. With α = .10, α/2 = .05, and the table
gives t = 2.920.
2.3.2. Confidence Intervals for the Population Proportion, p
• If p̂ is the proportion of observations in a random sample of size n that
belong to a class of interest, then an approximate 100(1 − α) percent
confidence interval on the proportion p of the population that belongs to
this class is:
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
where \( z_{\alpha/2} \) is the upper α/2 percentage point of the standard
normal distribution.
Cont’d
• We know that a sample proportion, p̂, is an unbiased estimator of
a population proportion p, and if the sample size is large, then the
sampling distribution of p̂ is approximately normal with mean p and
standard error \( \sqrt{p(1-p)/n} \).
Example 1
Solution: The sample proportion, p̂, and the reliability factor for a 90% confidence
interval estimate (1 − α = 0.90, α = 0.10) of the true population proportion, P, are found to be
Example 2
Example 3
Example 4
A random sample of 400 voters showed that 32 preferred
Candidate A. Set up a 95% confidence interval estimate for p.
\[ p_s - Z_{\alpha/2}\sqrt{\frac{p_s(1-p_s)}{n}} \le p \le p_s + Z_{\alpha/2}\sqrt{\frac{p_s(1-p_s)}{n}} \]
We are 95% confident that the proportion of voters who prefer Candidate A is
somewhere between 0.053 and 0.107.
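The arithmetic behind this interval can be sketched as follows, using the sample proportion 32/400 = 0.08 and Z = 1.96 for 95% confidence. The function name `proportion_interval` is hypothetical.

```python
import math


def proportion_interval(successes: int, n: int, z: float) -> tuple:
    """Large-sample confidence interval for a population proportion."""
    p_s = successes / n  # sample proportion
    margin = z * math.sqrt(p_s * (1 - p_s) / n)  # margin of error
    return p_s - margin, p_s + margin


# Example 4: 32 of 400 sampled voters prefer Candidate A.
low, high = proportion_interval(32, 400, 1.96)
print(f"{low:.3f} to {high:.3f}")
```

Rounded to three decimals, the limits match the 0.053 and 0.107 stated in the example.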
2.5. Determining the sample size
• The reason for taking a sample from a population is that it would be
too costly to gather data for the whole population. But collecting
sample data also costs money; and the larger the sample, the higher
the cost.
• To hold cost down, we want to use as small a sample as possible. On
the other hand, we want a sample to be large enough to provide good
approximation/estimates of population parameters.
• One of the most common questions asked of statisticians is: how large
should the sample taken in a survey be?
• The answer to this question depends on three factors:
i. the parameter to be estimated
ii. the desired confidence level of the interval estimator
iii. the maximum error of estimation
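I) The Sample Size To Estimate A Population Mean
• The confidence interval is
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
From the above expression, we extract the error
of estimation or margin of error (e, written E in the examples):
\[ e = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
Squaring both sides and solving for n results in
\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{e} \right)^2 \quad \text{for } n \ge 30 \]
• For a small sample with σ unknown, the t distribution and the sample
standard deviation s are used instead:
\[ n = \left( \frac{t_{\alpha/2}\,s}{e} \right)^2 \quad \text{for } n < 30 \]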
Example 1
• The average price of gasoline is expected to be $1.45 per gallon, and the
standard deviation for a specific state is $0.10 per gallon. It is
believed that the mean price per gallon has changed. How many
gas stations should be sampled to estimate the state's new mean price
with a maximum error of estimate of $0.01 and a 90% level of confidence?
Solution: σ = 0.10. From the reference table, α/2 = 0.05, so
\( z_{0.05} = 1.65 \), and E = 0.01. So,
\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 = \left( \frac{1.65 \times 0.10}{0.01} \right)^2 = 272.25 \]
which rounds up to n = 273.
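The same calculation can be sketched in code. The function name `sample_size_for_mean` is hypothetical; the rounding always goes up, because a fractional observation cannot be sampled and rounding down would exceed the allowed error.

```python
import math


def sample_size_for_mean(z: float, sigma: float, e: float) -> int:
    """Smallest n with margin of error at most e: ceil((z * sigma / e)^2)."""
    return math.ceil((z * sigma / e) ** 2)


# Example 1: z = 1.65 (90% confidence), sigma = 0.10, maximum error = 0.01.
n = sample_size_for_mean(z=1.65, sigma=0.10, e=0.01)
print(n)
```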
Example 2
• We wish to know the average thickness of washers in a shipment. We
are willing to take a risk of 5 in 100 that the error in our
estimate will be 0.002 inch (E) or more. From a sample of another lot,
we estimate the standard deviation to be 0.00359 with 9 degrees of
freedom.
Solution:
• s = 0.00359, α = 5/100 = 0.05, so 1 − α = 0.95, and E = 0.002.
• From the reference table, with α/2 = 0.025 and degrees of freedom df = 9,
t = 2.262. Then, use the formula
\[ n = \left( \frac{t_{\alpha/2}\,s}{E} \right)^2 = \left( \frac{2.262 \times 0.00359}{0.002} \right)^2 \approx 16.5 \]
which rounds up to n = 17.
Example 3
• A gasoline service station shows a standard deviation of Birr 6.25 for the charges made
by its credit card customers. Assume that the station's management would like to
estimate the population mean gasoline bill for its credit card customers to within
± Birr 1.00. For a 95% confidence level, how large a sample would be necessary?
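A sketch of the solution, assuming the large-sample formula from this section applies, with σ = 6.25, E = 1.00, and z = 1.96 for 95% confidence:

```latex
n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2
  = \left( \frac{1.96 \times 6.25}{1.00} \right)^2
  = (12.25)^2
  = 150.0625
```

Rounding up, a sample of about 151 credit card customers would be needed under these assumptions.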
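II) The Sample Size To Estimate A Population Proportion
Suppose that a random sample is selected from a population. Then, a 100(1 − α)%
confidence interval for the population proportion, extending a distance of at most E on each
side of the sample proportion, can be guaranteed if the sample size is as follows:
\[ n = \frac{0.25\, z_{\alpha/2}^2}{E^2} \]
When a prior estimate p̂ of the proportion is available (with q̂ = 1 − p̂), solving
\( E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} \) for n gives
\[ n = \frac{z_{\alpha/2}^2\, \hat{p}\, \hat{q}}{E^2} \]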
Example 1
• Suppose the Prime Minister of a country wants an estimate of the
proportion of ‘Kebele’ administrators who support the country’s current
economic policy. The Prime Minister wants the estimate to be within 0.04 of the
true proportion at a 95% level of confidence. The secretary of the office of
the Prime Minister estimated the proportion supporting the current policy to
be 0.60. What sample size is required?
• Solution: 1 − α = 0.95, α = 0.05, α/2 = 0.025, \( z_{\alpha/2} = 1.96 \), E = 0.04,
p̂ = 0.60, and q̂ = 0.40. So,
\[ n = \frac{z_{\alpha/2}^2\, \hat{p}\, \hat{q}}{E^2} = \frac{(1.96)^2 (0.60)(0.40)}{(0.04)^2} = 576.24 \]
Therefore, rounding up, a sample of 577 ‘Kebele’ administrators is required.
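The sample size calculation for a proportion can be sketched as below. The function name `sample_size_for_proportion` and its default `p_hat=0.5` are assumptions of this sketch; 0.5 is the conservative choice when no prior estimate exists, since it maximizes p̂(1 − p̂).

```python
import math


def sample_size_for_proportion(z: float, e: float, p_hat: float = 0.5) -> int:
    """Smallest n with margin of error at most e for a proportion."""
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / e ** 2)


# Example 1: 95% confidence (z = 1.96), E = 0.04, prior estimate 0.60.
n = sample_size_for_proportion(z=1.96, e=0.04, p_hat=0.60)
print(n)
```

Supplying a prior estimate away from 0.5 reduces the required sample size relative to the conservative default.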
Cont’d
Example 2
• Suppose that an opinion survey following a presidential election reported the
views of a sample of U.S. citizens of voting age concerning changing the
Electoral College process. The poll was said to have a 3% margin of error. The
implication is that a 95% confidence interval for the population proportion
holding a particular opinion is the sample proportion plus or minus at most
3%. How many citizens of voting age need to be sampled to obtain this 3%
margin of error?
With no prior estimate of the proportion, the conservative value p̂ = 0.5 is used:
\[ n = \frac{0.25\, z_{\alpha/2}^2}{E^2} = \frac{0.25\,(1.96)^2}{(0.03)^2} \approx 1067.11 \]
Therefore, 1,068 U.S. citizens of voting age need to be sampled to achieve the
desired result.
Cont’d
Example 3
• Suppose that a production facility purchases a particular component part in
large lots from a supplier. The production manager wants to estimate the
proportion of defective parts received from this supplier. She believes that the
proportion of defects is no more than 0.2 and wants to be within 0.02 of the
true proportion of defects with a 90% level of confidence. How large a sample
should she take?
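A sketch of the solution, assuming the planning value p̂ = 0.2 (the largest proportion the manager considers plausible, which gives the largest p̂(1 − p̂) in that range), E = 0.02, and z = 1.645 for 90% confidence:

```latex
n = \frac{z_{\alpha/2}^2\,\hat{p}\,(1-\hat{p})}{E^2}
  = \frac{(1.645)^2 (0.2)(0.8)}{(0.02)^2}
  = \frac{0.432964}{0.0004}
  \approx 1082.41
```

Rounding up, a sample of about 1,083 parts would be needed under these assumptions.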
Example 3
n = … / (Error)² = … / (0.05)² = 227.3, rounded up to 228