Module3 Part3 Inference About Population Mean
q Generate an SRS of size 25 from N(13,2).
q X̄ = 12.7
q Generate 1000 SRSs of size 25 from N(13,2).
q What is the sampling distribution of X̄?
Point Estimation: Sample Mean
q X̄ = 12.7
q Generate 1000 SRSs of size 25 from N(13,2).
q What is the sampling distribution of X̄? N(13,0.4).
[Figure: density curve of N(13,0.4); x-axis from 11.5 to 14.5; y-axis: Density]
q The sample mean of an SRS of size 25 from N(µ,2) is 12.7.
q How close to µ is 12.7 likely to be?
[Figure: density curve centered at µ; x-axis: values of x̄; y-axis: Density]
Confidence Interval for a Population
Mean
q In 95% of all samples of size 25, the sample mean will be within
0.8 (two standard deviations) of µ.
q In 95% of all samples of size 25, the distance between the sample
mean and µ, i.e. |X̄ − µ|, will be within 0.8.
[Figure: density curve of N(µ, 0.4) with the central 95% area between µ − 0.8 and µ + 0.8; x-axis: values of x̄; y-axis: Density]
Confidence Interval for a Population
Mean
q In 95% of all samples of size 25, the distance between the sample
mean and µ, i.e. |X̄ − µ|, will be within 0.8.
q In 95% of all samples of size 25, the interval (X̄ − 0.8, X̄ + 0.8)
will capture µ.
[Figure: density curve of N(µ, 0.4) with the central 95% area between µ − 0.8 and µ + 0.8; x-axis: values of x̄; y-axis: Density]
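The capture claim above can be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes the true mean is µ = 13 (borrowed from the earlier sampling example, since µ is unknown in practice), and the function name `coverage` and the seed are hypothetical choices.

```python
import random
import statistics


def coverage(mu=13.0, sigma=2.0, n=25, margin=0.8, reps=1000, seed=1):
    """Fraction of simulated SRSs whose interval x_bar ± margin captures mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # One SRS of size n from N(mu, sigma)
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # Does (x_bar - margin, x_bar + margin) contain the true mean?
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / reps


print(coverage())
```

The printed fraction should land close to 0.95. It will not be exactly 0.95, both because of simulation noise and because two standard deviations is a rounded version of the 95% cutoff.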
General Form of Confidence Intervals
q An interval calculated from the data, which has the form:
estimate ± margin of error
q A confidence level C, where C is the probability that the
interval will capture the true parameter value in repeated
samples. In other words, the confidence level is the
success rate for the method.
§ We usually choose a confidence level of 90% or higher.
§ The most common confidence level is 95%.
Confidence Level
To say that we are 95%
confident is shorthand for
“95% of all possible samples
of a given size from this
population will yield intervals
that capture the unknown
parameter.”
Confidence Interval for a Population
Mean
m = z*·σ/√n  ⇔  n = (z*·σ/m)²
Example
q Measure density of bacteria in solution
q Required sample size: n = (z*·σ/m)², where m = 0.5 × 10⁶
q Using only 15 measurements will not be enough to ensure that
m is no more than 0.5 × 10⁶. Therefore, we need at least 16
measurements.
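The sample-size formula n = (z*·σ/m)² can be sketched in code. The values of σ and the confidence level for this example did not survive in the slide text, so the sketch below assumes σ = 1 and a 95% confidence level (both in the same units of 10⁶ as the margin 0.5). These are illustrative assumptions only, and `required_n` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def required_n(sigma, margin, confidence):
    """Smallest n for which z* · sigma / sqrt(n) is at most `margin`."""
    # z* is the critical value with area `confidence` between -z* and z*
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would give a margin larger than requested
    return math.ceil((z_star * sigma / margin) ** 2)


# ASSUMED inputs: sigma = 1 and 95% confidence are not given in the source
print(required_n(sigma=1.0, margin=0.5, confidence=0.95))  # prints 16
```

Under these assumed inputs the formula gives about 15.4, which is why 15 measurements fall short and the count is rounded up to 16.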
Some Cautions
q The data should be an SRS from the population.
q Confidence intervals are not resistant to outliers.
q If n is small (<15) and the population is not Normal, the true
confidence level of this method will be different from C.
q The standard deviation σ of the population must be known.
q The margin of error in a confidence interval covers only
uncertainty from random sampling.
Tests of Significance
(Significance Tests)
(Hypothesis Tests)
For a Population Mean
Test of Significance
q Goal: to assess evidence in the data about some claim
concerning a population.
q Formal procedure for comparing observed data with a claim
(also called a hypothesis) whose truth we want to assess.
q The claim is a statement about a parameter, like the population
mean µ.
q We express the results of a significance test in terms of a
probability, called the P-value, that measures how well the data
and the claim agree.
Steps of Significance Tests
1. Hypotheses
2. Test statistic
3. P-value
4. Conclusion
Example
q Population distribution is N(µ,5).
q Test a claim: µ=5
q An SRS of size 25 has sample mean X̄ = 2.5.
q What can we conclude about the claim based on the data?
Example
q Population distribution is N(µ,5).
q Test a claim: µ=5
q An SRS of size 25 has sample mean X̄ = 2.5.
q What is the parameter? What is the hypothesis? What is
the information from data?
§ Parameter: population mean µ
§ Hypothesis: µ=5
§ Data: X̄ = 2.5
Example
We can use software to simulate 1000 SRSs of size 25 from N(5,5)
(assuming µ=5).
X̄ ~ N(µ, σ/√n) = N(5, 5/√25) = N(5,1)
[Figure: histogram of 1000 simulated sample means, generated as round(rnorm(1000, 5, 1), 1); x-axis: values of x̄, from 2 to 8; y-axis: Count]
Example
We can use software to simulate 1000 SRSs of size 25 from N(5,5)
(assuming µ=5).
X̄ ~ N(5,1)
In the 1000 SRSs, there were only 10 times when the sample mean was
as small as the observed X̄ = 2.5.
[Figure: histogram of the 1000 simulated sample means with x̄ = 2.5 marked in the left tail; x-axis from 2 to 8; y-axis: Count]
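The slides draw the sample means directly with R's `rnorm(1000, 5, 1)`. As a rough Python sketch of the same idea, the code below instead draws each SRS of size 25 from N(5,5) and averages it, which should give sample means with the same N(5,1) distribution. The function name and the seed are hypothetical, and the count of small sample means will differ from run to run and from the 10 reported on the slide.

```python
import random
import statistics


def simulate_sample_means(mu=5.0, sigma=5.0, n=25, reps=1000, seed=2024):
    """Sample means of `reps` SRSs of size n drawn from N(mu, sigma)."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


means = simulate_sample_means()
# How many simulated sample means are as small as the observed 2.5?
count = sum(1 for m in means if m <= 2.5)
print(count, "of", len(means), "sample means are <= 2.5")
```

A count near 1% of the samples or below is what makes the observed value 2.5 look surprising when µ = 5.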
Example
We can say, assuming that the actual parameter value µ=5, there is a small
chance that the sample mean can be as small as 2.5, i.e., the observed
statistic is so unlikely that it gives convincing evidence that the claim is not
true.
[Figure: histogram of the 1000 simulated sample means with the observed x̄ = 2.5 marked in the left tail; x-axis from 2 to 8; y-axis: Count]
Steps of Significance Tests
1. Hypotheses
2. Test statistic
3. P-value
4. Conclusion
Hypotheses
q Null hypothesis (H0): The claim to be tested.
§ Often, the null hypothesis is a statement of “no effect” or “no
difference”.
q Alternative hypothesis (Ha): The claim about the population
for which we’re trying to find evidence.
§ The alternative is one-sided if it states either that a parameter is (1)
larger than the null hypothesis value, or (2) smaller than the null
hypothesis value.
§ It is two-sided if it states that the parameter is different from the null
value.
Example
q H0: µ=5 versus Ha: µ≠5 (two-sided alternative)
q H0: µ=5 versus Ha: µ<5 (one-sided alternative)
q H0: µ=5 versus Ha: µ>5 (one-sided alternative)
Logic of Significance Tests
[Figure: courtroom analogy with the labels “Prosecutor” and “null hypothesis” = “innocence”]
Test Statistic is Evidence.
q Calculated from the sample data
q Measures how far the data diverge from what we would expect
if the null hypothesis H0 were true.
[Figure: histogram of the 1000 simulated sample means with x̄ = 2.5 marked; x-axis from 2 to 8; y-axis: Count. Annotation: P-value = 0.01; only 10 times when the sample means are as small as the observed X̄ = 2.5.]
How to Calculate the P-value
without doing repeated sampling
The probability, computed assuming H0 is true, that the statistic would
take a value as or more extreme than the one actually observed.
Previous Example:
q H0: µ=5 versus Ha: µ<5
q x̄ = 2.5
q Observed test statistic value from the data is z = −2.5
q P-value = Pr(X̄ ≤ 2.5) if H0 is true
= Pr(Z ≤ −2.5) if H0 is true
How to Calculate the P-value
without actual repeated sampling
P-value = Pr(Z≤-2.5) if H0 is true
q What is the distribution of Z if H0 is true?
X̄ ~ N(5,1) if H0 is true
Z = (X̄ − 5)/1 ~ N(0,1)
q P-value is the area to the left of -2.5
q P-value = 0.0062.
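The area to the left of −2.5 can be computed directly rather than read from a table. This is a small sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not taken from the slides.

```python
from statistics import NormalDist

# Values from the example: H0: mu = 5, sigma = 5, n = 25, observed x_bar = 2.5
mu0, sigma, n, x_bar = 5.0, 5.0, 25, 2.5

# Standardize the sample mean under H0
z = (x_bar - mu0) / (sigma / n ** 0.5)

# One-sided alternative Ha: mu < 5, so the P-value is the area to the left of z
p_value = NormalDist().cdf(z)

print(z, round(p_value, 4))  # prints -2.5 0.0062
```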
P-value
q Measures the strength of the evidence from the data
against a null hypothesis
q Small P-values
§ Observed result is unlikely to occur when H0 is true
§ Strong evidence against H0
q Large P-values
§ Observed result is likely to occur by chance when H0 is true
§ Fail to give convincing evidence against H0
P-value
H0: µ = 5 vs Ha: µ < 5
[Figure: density curve of the sampling distribution of x̄ under H0, with x̄ = 2.5 and x̄ = 4 marked; x-axis from 2 to 8]
4. State a conclusion.
Example
q Question: Does the job satisfaction of assembly-line workers differ
when their work is machine-paced rather than self-paced?
q Study: chose 18 subjects at random from a company with over 200
workers who assembled electronic devices
q Half of the workers were assigned at random to each of two groups.
Both groups did similar assembly work, but one group was allowed
to pace themselves while the other group used an assembly line that
moved at a fixed pace.
q After two weeks, all the workers took a test of job satisfaction. Then
they switched work set-ups and took the test again after two more
weeks.
q Is this design a matched-pairs design?
A. Yes.
B. No.
q This is a matched-pairs design.
q The response variable is the difference in satisfaction scores, self-paced minus
machine-paced. The sample mean score is 17.
q Suppose job satisfaction scores follow a Normal distribution with standard
deviation σ=60.
Example
q Hypotheses:
H0: µ = 0 versus Ha: µ ≠ 0
§ This is a two-sided alternative, i.e., we are interested in
determining if a difference, positive or negative, exists.
q Test statistic is
z = (x̄ − µ0)/(σ/√n) = (17 − 0)/(60/√18) ≈ 1.20
P-value
q Two-sided P-value
P-value = P(Z < –1.2 or Z > 1.2)
= 2 P(Z < –1.2)
= 2 P(Z > 1.2)
= (2)(0.1151) = 0.2302
P-value
q Two-sided P-value = 0.2302
q If H0 is true, there is a 23% chance that we would see
results at least as extreme as those in the sample.
q At significance level α=0.05, the P-value is larger than α.
q This means there is not strong evidence against H0 or in
favor of Ha.
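The two-sided calculation for the job-satisfaction example can be sketched as follows. The helper name `two_sided_z_test` is hypothetical. Because the code keeps z unrounded (about 1.202 rather than 1.20), the P-value it prints is close to, but not exactly, the table-based 0.2302.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(x_bar, mu0, sigma, n):
    """Return (z, two-sided P-value) for H0: mu = mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Two-sided: double the tail area beyond |z|
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z, p = two_sided_z_test(x_bar=17, mu0=0, sigma=60, n=18)
print(round(z, 2), round(p, 2))
print("reject H0 at alpha = 0.05?", p < 0.05)
```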
P-values and Confidence Intervals
For a two-sided alternative:
q If we reject the null hypothesis at significance level α
(e.g. α=0.05), the level C=1−α (C=95%) confidence
interval will NOT cover the hypothesized value of the
parameter.
q If the level C (e.g. C=95%) confidence interval does not
cover the hypothesized value of the parameter, we will
reject the null hypothesis at the significance level α=1−C
(α=0.05).
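This correspondence can be illustrated with the job-satisfaction numbers (x̄ = 17, σ = 60, n = 18). The sketch below is only an illustration under those values: the function names are hypothetical, and the list of hypothesized means is an arbitrary set chosen to fall on both sides of the interval.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Level-C confidence interval x_bar ± z* · sigma / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    m = z_star * sigma / sqrt(n)
    return x_bar - m, x_bar + m


def rejects_two_sided(x_bar, mu0, sigma, n, alpha=0.05):
    """True if the two-sided z test rejects H0: mu = mu0 at level alpha."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p < alpha


low, high = z_interval(17, 60, 18)
print("95% CI:", round(low, 1), "to", round(high, 1))

# Arbitrary hypothesized values: the test rejects exactly those outside the CI
for mu0 in [-20, -10, 0, 20, 44, 45, 60]:
    covered = low <= mu0 <= high
    print(mu0, "covered:", covered, "rejected:", rejects_two_sided(17, mu0, 60, 18))
```

For each hypothesized value the two columns should disagree: a value inside the interval is not rejected, and a value outside it is rejected.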
More Comments about P-Values
q Reporting the P-value is a better way to summarize a test than
simply stating whether or not H0 is rejected, because P-values
quantify how strong the evidence is against H0. The smaller the
P-value, the stronger the evidence.
§ There is no practical difference between 4.9% and 5.1%.
§ It is the order of magnitude of the P-value that matters: “somewhat significant,”
“significant,” or “very significant.”
q P-values do not provide specific information about the true
population mean µ.
q If you desire a likely range of values for the parameter, use a
confidence interval.
Approximating a P-value
q In the absence of suitable technology, you can get
approximate P-values by comparing your test statistic with
critical values from a table.
q For example, H0: µ = 0 versus Ha: µ > 0
q Test statistic z = –0.276.
q P-value is the area under N(0,1) to the right of –0.276.
q Note that because N(0,1) is symmetric about 0, the area
under N(0,1) to the right of –0.276 is equal to the area to the
left of 0.276.
Approximating a P-value
q Find the area to the left of 0.276 under N(0,1)
q 0.276 is between 0.27 and 0.28
q The area to the left of 0.27 is 0.6064 and the area to the left
of 0.28 is 0.6103.
q The area to the left of 0.276 is between 0.6064 and 0.6103.
q 0.6064 < P-value < 0.6103
q There is no reason whatsoever to reject H0 in this case as the
P-value is so large.
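When software is available, the table bounds above can be compared with a direct computation. A minimal sketch using the standard-library `statistics.NormalDist`:

```python
from statistics import NormalDist

z = -0.276  # test statistic from the example, Ha: mu > 0

# Right-tail area beyond z, which by symmetry equals the area left of +0.276
p_value = 1 - NormalDist().cdf(z)
same_by_symmetry = NormalDist().cdf(-z)

print(round(p_value, 4), round(same_by_symmetry, 4))
```

Both printed values should fall between the table bounds 0.6064 and 0.6103.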
How to Choose α
q What are the consequences of rejecting the null
hypothesis when it is actually true (an innocent person was
convicted of a crime)?
§ A smaller α indicates less chance to commit such errors.
q Are you conducting a preliminary study?
§ With a larger α, you will be less likely to miss an
interesting result.
Statistical Significance Might Not Be
Practically Important
q Statistical significance doesn’t tell you about the
magnitude of the effect, only that there is one.
q An effect could be too small to be relevant.
§ A drug is found to lower patient temperature by an average of
0.4° Celsius (P-value < 0.01), but the clinical benefits of temperature
reduction only appear for a decrease of 1° or larger.
q Test power is not a fixed number.
q Previous example: Test H0: µ=5 versus Ha: µ<5 with an SRS of size 25
q Test power is a function of the values of µ in Ha
[Figure: power curve; x-axis: µ in Ha, from 0 to 5; y-axis: Prob. of rejecting H0, from 0.05 to 1.00]
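The power curve for this example can be sketched numerically. The code assumes σ = 5 as in the earlier example and a significance level of α = 0.05; the α value is an assumption inferred from the 0.05 tick on the figure's vertical axis, not something the slide states. The function name `power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power(mu_alt, mu0=5.0, sigma=5.0, n=25, alpha=0.05):
    """Probability of rejecting H0: mu = mu0 in favor of Ha: mu < mu0
    when the true mean is mu_alt (one-sided z test, known sigma)."""
    se = sigma / sqrt(n)
    # Reject H0 when x_bar falls at or below this cutoff (ASSUMED alpha = 0.05)
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    # Under the alternative, x_bar ~ N(mu_alt, se)
    return NormalDist(mu_alt, se).cdf(cutoff)


for mu in range(0, 6):
    print(mu, round(power(mu), 3))
```

The power equals α when µ = 5 and rises toward 1 as µ moves further below 5, which is the sense in which power is a function of µ rather than a single number.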
When Do We Need a Larger Sample Size?
q If you insist on a smaller significance level (such as 1%
rather than 5%)
q If you insist on higher power (such as 99% rather than 90%)
q At any significance level and desired power, detecting a
small difference between H0 and Ha requires a larger sample
than detecting a large difference.
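These three points can be seen in a standard sample-size formula for a one-sided z test with known σ, n = ((z_{1−α} + z_{power})·σ/δ)², where δ is the difference between the H0 value and the alternative of interest. This formula does not appear in the slides; it is introduced here as an assumption for illustration, along with the hypothetical helper `n_for_power` and the example inputs.

```python
from math import ceil
from statistics import NormalDist


def n_for_power(delta, sigma, alpha=0.05, power=0.90):
    """Sample size for a one-sided z test (known sigma) to detect a
    difference `delta` with the given significance level and power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # critical value for the test
    z_power = z.inv_cdf(power)      # quantile matching the desired power
    return ceil(((z_alpha + z_power) * sigma / delta) ** 2)


base = n_for_power(delta=1.0, sigma=5.0)  # illustrative inputs
print("baseline:", base)
print("smaller alpha (1%):", n_for_power(1.0, 5.0, alpha=0.01))
print("higher power (99%):", n_for_power(1.0, 5.0, power=0.99))
print("smaller difference:", n_for_power(0.5, 5.0))
```

Each of the three changes should print a sample size larger than the baseline.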
Common Practices
q State H0 and Ha as in a test of significance.
q Think of the problem as a decision problem, so the
probabilities of Type I and Type II errors are relevant.
q Consider only tests in which the probability of a Type I
error is no greater than a specified α.
q Among these tests, select a test that makes the
probability of a Type II error as small as possible.