Confidence intervals

Dr. Debarati Bhaumik

1 Introduction
Confidence intervals (CIs) are mostly used to convey the uncertainty associated
with sample statistics (usually the sample mean, but also the median, mode,
etc.) that are used to estimate population parameters. For example, we estimate
the average height of Dutch nationals (the population parameter) using the
sample mean of a random sample of size n, say 1000, from the population. We
have seen in the previous lecture that the sample mean is a random variable
whose value varies from sample to sample. Hence, we need to attach a measure of
variability to the point estimate (from the sample data) of the population
parameter. This measure of variability is called the margin of error. The
sample statistic plus or minus the margin of error gives the confidence
interval. A CI tells us how confident we can be about the point estimates or
sample statistics that we use to infer the population parameter.
Formally, a CI for a population parameter θ is a random interval, calculated
from the sample data, that contains θ with some specified probability or
confidence.

1.1 Confidence level

The confidence level is the percentage of random confidence intervals,
constructed from random samples, that contain the population parameter θ.
Let us take the example of the population mean µ. A 95% confidence interval for
µ is a random interval that contains µ with probability 0.95. In other words,
if we constructed a confidence interval around the sample mean X̄ of each of
many different random samples, then about 95% of these confidence
intervals would contain the population mean µ.
In more statistical notation, let α be a value (0 ≤ α ≤ 1) such that 100 × α% is
the percentage of confidence intervals constructed from random sample data
that do not contain the population parameter. Then 100 × (1 − α)% is the
confidence level, and the corresponding interval is called a 100 × (1 − α)%
confidence interval.

1.2 Confidence interval for a population mean
We will derive the confidence interval for a population mean [1]. For the other
cases we will just state the results. Generally one uses the standard normal
Z-distribution to compute the confidence interval; however, in real-life cases
where the population variance is not known, the t-distribution is used. We
will discuss both cases.
For 0 ≤ α ≤ 1, let z(α) be the value such that the area under the standard normal
density curve to the right of z(α) is α; in mathematical terms,
P(Z ≥ z(α)) = α. See Figure 1. Due to the symmetry of
the curve, the area under the curve to the left of −z(α) is also
α, i.e., P(Z ≤ −z(α)) = α.

Figure 1: Standard normal density curve showing the value z(α) and the area
under the curve to the right of it as α

Now, if a random variable Z follows the standard normal distribution, then by
the definition of z(α):

P(−z(α/2) ≤ Z ≤ z(α/2)) = 1 − α.
To understand the equation above visually, see Figure 2. Note that the area to
the right of z(α/2) is α/2 and, similarly, the area to the left of −z(α/2) is α/2.
Since the total area under the curve is 1, the area between the two tails is
1 − α.
Recall the central limit theorem, which states that the sampling distribution
of the sample mean approaches a normal distribution as the sample size gets
larger, no matter what the shape of the population distribution is. Let µ be the
population mean, X̄ the mean of a sample of size n, and σX̄ the standard
deviation (or standard error) of the sample mean. Then by the central
limit theorem (X̄ − µ)/σX̄ follows approximately the standard
normal distribution, and hence we have:

Figure 2: Area under the standard normal curve in between −z(α/2) and
z(α/2).

P(−z(α/2) ≤ (X̄ − µ)/σX̄ ≤ z(α/2)) ≈ 1 − α,
and, multiplying through by σX̄ and rearranging the inequalities to isolate µ, we have

P(X̄ − z(α/2)σX̄ ≤ µ ≤ X̄ + z(α/2)σX̄ ) ≈ 1 − α.

Recall that σX̄ = σ/√n, where σ is the population standard deviation. Then the
above result becomes

P(X̄ − z(α/2) σ/√n ≤ µ ≤ X̄ + z(α/2) σ/√n) ≈ 1 − α. (1)
Equation (1) says that the probability that µ lies in the interval
X̄ ± z(α/2) σ/√n is approximately 1 − α. The interval is thus called a 100 × (1 − α)%
confidence interval. Note that the interval is random, and the probability
that this random interval contains the population mean µ is approximately 1 − α.
In practice we generally choose a small value of α such as 0.1, 0.05 or 0.01, which
leads to confidence levels of 90%, 95% and 99% respectively.

α       confidence level    z(α/2)
0.1     90%                 1.64
0.05    95%                 1.96
0.01    99%                 2.58

Table 1: z(α/2) values for selected confidence levels from the standard normal
Z-distribution.
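
The values in Table 1 can be reproduced numerically. The following is a minimal sketch, assuming Python's standard-library statistics.NormalDist; the helper name z_upper is hypothetical and not part of this reader.

```python
from statistics import NormalDist


def z_upper(tail_area):
    """Return z such that P(Z >= z) = tail_area for a standard normal Z."""
    return NormalDist().inv_cdf(1 - tail_area)


# Print alpha, the confidence level and z(alpha/2) for the rows of Table 1.
for alpha in (0.10, 0.05, 0.01):
    level = 100 * (1 - alpha)
    print(f"alpha={alpha:<5} level={level:.0f}%  z(alpha/2)={z_upper(alpha / 2):.2f}")
```

Because z(α) is defined through the right-tail area, the sketch passes 1 − tail_area to the inverse cumulative distribution function, which works with left-tail areas.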

To recap, the confidence interval for a population mean µ is given by

X̄ ± z(α/2) σ/√n, (2)

where X̄ is the sample mean, σ is the population standard deviation, n is the
sample size, and z(α/2) is the appropriate value from the Z-distribution for the
desired confidence level (see Table 1 for the values). A few points to note:
• Unknown σ: Most of the time the population standard deviation σ will
not be known. Hence we substitute it with the sample standard deviation
s. For large samples this substitution has negligible effect. There is no
strict rule, but as a rule of thumb the substitution is adequate for n ≥ 30.
• Small sample size n: When n < 30, instead of using the Z-distribution
table we use the t-distribution table values corresponding to n − 1 degrees
of freedom. In Appendix A you can find the t-distribution table values
for different confidence levels. Note how for 30 degrees of freedom the
t and z values are close to each other.
Example: Let us look at an example from [1]. A particular area contains 8000
condominium units. In a survey of the occupants, a simple random sample
of size 100 yields the information that the average number of motor vehicles
per unit is 1.6, with a sample standard deviation of 0.8. Here, X̄ = 1.6, n =
100, s = 0.8. Hence, the 95% confidence interval for the population average is
X̄ ± z(0.025) s/√n. We know that z(0.025) = 1.96 and s/√n = 0.8/10 = 0.08, hence
the 95% CI is 1.6 ± 1.96 × 0.08 ≈ 1.6 ± 0.16. We can use the Z-distribution
because n ≥ 30.
Now suppose we had a simple random sample of size 25 that gave the
same sample mean and standard deviation as above. Then we would use the
t-table with 24 degrees of freedom, and the 95% confidence interval for the
population mean would be X̄ ± t(0.025) s/√n = 1.6 ± 2.06 × 0.16 ≈ 1.6 ± 0.33. Note
how this confidence interval is wider than in the previous case. This is because
the sample size is smaller (n = 25 versus n = 100), which makes us less
confident about the estimate.
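
The arithmetic of this example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name mean_ci is hypothetical, and the t value 2.06 is simply the table value quoted in the text, since the standard library has no t-distribution quantile function.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(xbar, s, n, critical):
    """Return (lower, upper) for xbar +/- critical * s / sqrt(n)."""
    margin = critical * s / sqrt(n)
    return xbar - margin, xbar + margin


# Large sample (n = 100): critical value from the Z-distribution.
z = NormalDist().inv_cdf(0.975)  # z(0.025), about 1.96
print(mean_ci(1.6, 0.8, 100, z))

# Small sample (n = 25): critical value from the t-table with 24 degrees
# of freedom, taken from the text rather than computed.
t_24 = 2.06
print(mean_ci(1.6, 0.8, 25, t_24))
```

Passing the critical value as an argument keeps the same formula usable for both the Z and the t case.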

1.3 Confidence interval for a population proportion

This is the dichotomous case, in which we are interested in knowing what
proportion (or percentage) of people or population elements fall into a
particular category. For example, the percentage of people who prefer to be
contacted through email, or the percentage of people in favor of a four-day work
week. In these cases we estimate the population proportion p with the sample
proportion p̂ plus or minus a margin of error. The sample proportion is the
proportion of individuals in the sample who have the characteristic of interest.
The formula for the confidence interval for the population proportion p is
given by

p̂ ± z(α/2) √( p̂(1 − p̂)/(n − 1) ), (3)

where z(α/2) is the appropriate value from the standard normal Z-distribution
for the desired level of confidence.
Note: The following conditions need to be satisfied to build confidence intervals
for population proportions using sample proportions:
• Random condition: The data should come from a random sample. This
ensures we have unbiased data from the population.
• Normal condition: The sampling distribution of p̂ should be approximately
normal. For that to happen, both of these conditions need to be met:
np̂ ≥ 10 and n(1 − p̂) ≥ 10.
• Independence condition: Individual observations need to be independent.
If sampling without replacement, our sample size shouldn't be more
than 10% of the population.

Before doing the actual computations of the interval, it’s important to check
whether or not the above conditions have been met, otherwise the calculations
and conclusions that follow aren’t valid.
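
Formula (3) and the two conditions that can be checked numerically can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the function name proportion_ci, its parameters and the sample figures (60 of 100 respondents, population of 5000) are all hypothetical. The random condition depends on how the data were collected and cannot be verified in code.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95, population_size=None):
    """CI for a population proportion using formula (3), with condition checks."""
    p_hat = successes / n
    # Normal condition: n*p_hat >= 10 and n*(1 - p_hat) >= 10.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("normal condition not met")
    # Independence condition (10% rule); checked only if the population
    # size is supplied.
    if population_size is not None and n > 0.10 * population_size:
        raise ValueError("sample is more than 10% of the population")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2)
    margin = z * sqrt(p_hat * (1 - p_hat) / (n - 1))
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 60 of 100 respondents prefer contact by email.
print(proportion_ci(60, 100, population_size=5000))
```

Raising an error when a condition fails mirrors the advice above: the interval should not be computed, let alone reported, if the conditions are not met.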

1.4 Confidence interval for the difference of two means

The goal of many surveys and studies is to compare two groups, such as men
versus women or liberals versus conservatives. When
the properties being compared are numerical (for example height, weight, age,
income, etc.), one is generally interested in the difference between the means
(averages) of the two populations. For example, we may want to compare the
average incomes of men and women. The confidence interval for the difference
of two population means µ1 − µ2 is given by

(X̄1 − X̄2) ± z(α/2) √( σ1²/n1 + σ2²/n2 ), (4)

where X̄1 and X̄2 are the sample means, n1 and n2 are the sample sizes, σ1
and σ2 are the population standard deviations, and z(α/2) is the
appropriate value from the Z-distribution for the desired confidence level.
We use the t-distribution with n1 + n2 − 2 degrees of freedom instead in the
following two cases:

1. If one or both of the sample sizes are small (less than 30).

2. When the population standard deviations are unknown, we use the sample
standard deviations along with the t-distribution. Then the formula for
the confidence interval becomes:

(X̄1 − X̄2) ± t(α/2, n1 + n2 − 2) √( s1²/n1 + s2²/n2 ), (5)

where s1 and s2 are the sample standard deviations, and
t(α/2, n1 + n2 − 2) is the appropriate value from the t-distribution with
n1 + n2 − 2 degrees of freedom, 1 − α being the desired confidence level.
Read the values from the t-table in Figure 3.
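
Formulas (4) and (5) share the same structure and differ only in the critical value and in which standard deviations are plugged in. A minimal sketch under that reading follows; the function name diff_means_ci and all sample figures are hypothetical, and the t value 2.06 for 24 degrees of freedom is the table value quoted in Section 1.2.

```python
from math import sqrt


def diff_means_ci(xbar1, xbar2, sd1, sd2, n1, n2, critical):
    """Return (lower, upper) for (xbar1 - xbar2) +/- critical * sqrt(sd1^2/n1 + sd2^2/n2)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = xbar1 - xbar2
    return diff - critical * standard_error, diff + critical * standard_error


# Formula (4): hypothetical large samples with known sigma, z(0.025) = 1.96.
print(diff_means_ci(10.0, 8.0, 3.0, 4.0, 36, 64, 1.96))

# Formula (5): hypothetical small samples (n1 = n2 = 13, so 24 degrees of
# freedom) with sample standard deviations and the t-table value 2.06.
print(diff_means_ci(10.0, 8.0, 3.0, 4.0, 13, 13, 2.06))
```

In the second call the interval contains 0, so with these hypothetical numbers the small samples would not let us rule out equal population means.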

1.5 Confidence interval for the difference of two proportions
Just as with the difference between population means, there are situations where
we are interested in the difference between population proportions, such as
comparing the opinions of males and females on a four-day work week. In these
cases we estimate the difference between the population proportions
by taking the difference between the sample proportions plus or minus the
margin of error. The confidence interval for the difference of two population
proportions p1 − p2 is given by

(p̂1 − p̂2) ± z(α/2) √( p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2 ), (6)

where p̂1 and p̂2 are the sample proportions, n1 and n2 are the sample sizes,
and z(α/2) is the appropriate value from the Z-distribution for the
desired confidence level. Please note that to create valid confidence intervals
for the difference of population proportions from sample proportions, the
random, normal and independence conditions described in Section 1.3 need to be
satisfied for each of the individual samples.
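
Formula (6) can be sketched in the same way. The function name diff_proportions_ci and the survey counts below are hypothetical; the sketch checks only the normal condition for each sample and assumes the random and independence conditions have been verified separately.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(successes1, n1, successes2, n2, confidence=0.95):
    """CI for p1 - p2 using formula (6)."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    # Normal condition, checked for each sample individually.
    for p, n in ((p1, n1), (p2, n2)):
        if n * p < 10 or n * (1 - p) < 10:
            raise ValueError("normal condition not met for one of the samples")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin


# Hypothetical survey: 50 of 100 men and 40 of 100 women favor a
# four-day work week.
print(diff_proportions_ci(50, 100, 40, 100))
```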

2 Interpreting confidence intervals

It is very important to interpret and report confidence intervals correctly.
It is wrong to say "Based on the inference we are 95% confident that the
population mean is between xxx and yyy". The confidence interval goes back to
the idea of the confidence level. Remember that a confidence interval is a
random interval constructed from random sample data. Hence, a 95% confidence
level is the percentage of all possible random samples of size n whose
confidence intervals contain the population parameter.

Hence one may report "Based on the inference, a range of likely values for
the population parameter is xxx to yyy, with a confidence level of 95%." Note
that the population parameter is fixed, whereas the sample estimate varies
with the sample chosen.
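
The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original reader: the population is taken to be normal with a hypothetical mean of 170 and known standard deviation of 10, the interval is formula (2), and the function name coverage is made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu=170.0, sigma=10.0, n=50, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated intervals xbar +/- z(alpha/2)*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / n ** 0.5  # margin of error, sigma assumed known
    hits = 0
    for _ in range(trials):
        # Draw a fresh random sample and compute its sample mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The parameter mu is fixed; the interval moves with the sample.
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


print(coverage())
```

The printed fraction should be close to 0.95: the parameter stays fixed while the interval changes from sample to sample, and roughly 95% of those intervals cover it.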

3 Resource list
The following resources and videos give a better
understanding of the concept of confidence intervals:

1. Confidence interval estimation around the population mean for known σ. This
video explains very well how to interpret a confidence interval: follow this
video.

2. Confidence interval estimation around the population mean for unknown σ:
follow this video.

A t-distribution table for confidence interval

Figure 3: t-table. Figure used from this website. Please go through the website
to understand how to read t-tables.

References
[1] John A Rice. Mathematical statistics and data analysis. Cengage Learning,
2006.
