Estimation & Hypothesis Testing.pptx (Final)
Principles of estimation/inference
Sampling Distribution
Point and interval estimates
Hypothesis testing
Concept of P-value
2
Session Objectives
After completing this session, students will be able to:
Explain the principles of sampling distributions of
means and proportions and calculate their standard
errors.
State the principles of estimation and differentiate
between point and interval estimations.
Compute appropriate confidence intervals for
population means and proportions and interpret the
findings.
Be familiar with estimation and hypothesis formulation.
Test hypotheses and significance using sample
means and proportions.
3
Inferential statistics
Inferential statistics: the process of generalizing or
drawing conclusions about the target population on
the basis of results obtained from a sample.
• 16 possible samples
(with replacement)
8
Sample mean (x̄)   Frequency   P(x̄)
18                 1           0.0625
19                 2           0.1250
20                 3           0.1875
21                 4           0.2500
22                 3           0.1875
23                 2           0.1250
24                 1           0.0625
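The table can be reproduced by listing every sample. A minimal sketch, assuming the population consists of the four values 18, 20, 22 and 24 (the values on the population plot's axis) and that samples of size 2 are drawn with replacement; the slide text itself does not state either.

```python
from collections import Counter
from itertools import product

# Assumed population: the four values on the population plot's axis.
population = [18, 20, 22, 24]

# All ordered samples of size 2 drawn with replacement: 4 x 4 = 16 samples.
samples = list(product(population, repeat=2))
means = [(a + b) / 2 for a, b in samples]

# Tally each distinct sample mean and convert counts to probabilities.
freq = Counter(means)
for m in sorted(freq):
    print(f"{m:g}  {freq[m]}  {freq[m] / len(samples):.4f}")
```

Each printed row gives a sample mean, its frequency among the 16 samples, and its probability.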
Comparing the population with its sampling distribution
[Figure: population distribution (values 18, 20, 22, 24) shown beside the sampling distribution of the sample mean x̄ (values 18 to 24)]
10
Properties of sampling distribution of the mean
If a population is normal with mean μ and standard
deviation σ, the sampling distribution of the sample mean x̄ is also
normally distributed, with the following properties:
The mean, μ, of the distribution of sample mean is
equal to the mean of the population from which the
samples were drawn.
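The standard deviation of the distribution of sample means (the
standard error) is equal to the standard deviation of the population
divided by the square root of the sample size. Equivalently, its
variance is the population variance divided by the sample size.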
μx̄ = μ and σx̄ = σ/√n
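A quick numerical check of both properties. This is a sketch under the same assumption as the 16-sample illustration, namely a population of 18, 20, 22, 24 sampled with replacement with n = 2; the variable names are illustrative.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [18, 20, 22, 24]   # assumed population
n = 2                           # assumed sample size

mu = mean(population)
sigma = pstdev(population)      # population standard deviation

# Mean of every possible sample of size n (with replacement).
sample_means = [mean(s) for s in product(population, repeat=n)]
mu_xbar = mean(sample_means)
sigma_xbar = pstdev(sample_means)

print(mu, mu_xbar)                    # the two means agree
print(sigma / sqrt(n), sigma_xbar)    # sigma / sqrt(n) matches the SD of the sample means
```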
11
Sampling Distribution of the proportion
The sample proportion is derived from counts or
frequency data.
Easier and more reliable, does not depend on
variance.
Suppose we choose a random sample of size n;
the sampling distribution of the sample
proportion p possesses the following properties.
The sample proportion p is an estimate of
the population proportion P.
12
Sampling Distribution of the proportion……….
13
• The mean of the distribution, μp, will be equal to the
true population proportion, P, and the variance of the
distribution, σp², will be equal to P(1 − P)/n.
μp = P and σp = √(P(1 − P)/n)
The normal approximation is reasonable when
nP ≥ 5 and n(1 − P) ≥ 5.
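The standard error of a proportion and the sample-size check can be computed directly. A minimal sketch: the function name and the input values are hypothetical, and the rule of thumb is read here as "at least 5", which is an assumption about the lost inequality sign.

```python
from math import sqrt

def proportion_sampling_summary(p, n):
    """Return (mean, standard error, normal_ok) for the sample proportion.

    p is the population proportion and n the sample size.
    normal_ok applies the rule of thumb n*p >= 5 and n*(1-p) >= 5 (assumed reading).
    """
    se = sqrt(p * (1 - p) / n)
    normal_ok = n * p >= 5 and n * (1 - p) >= 5
    return p, se, normal_ok

# Hypothetical values, chosen only for illustration.
mean_p, se_p, ok = proportion_sampling_summary(p=0.3, n=100)
print(mean_p, round(se_p, 4), ok)
```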
14
Central Limit Theorem (CLT)
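Regardless of the shape of the frequency distribution
of a characteristic in the parent population, the sampling
distribution of the sample mean approaches a normal
distribution as the sample size increases.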
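The theorem can be illustrated by simulation. A rough sketch, assuming a strongly right-skewed parent population (exponential with mean 1 and SD 1); the sample size, number of repetitions and seed are arbitrary choices, not values from the slides.

```python
import random
from statistics import mean, stdev

random.seed(0)   # fixed seed so the sketch is repeatable

n = 50           # size of each sample (arbitrary)
reps = 5000      # number of repeated samples (arbitrary)

# Draw many samples from the skewed population and keep each sample mean.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

# If the sample means are roughly normal, about 95% of them should fall
# within 1.96 standard errors of the population mean.
se = 1 / n ** 0.5
within = sum(abs(m - 1.0) <= 1.96 * se for m in sample_means) / reps
print(round(mean(sample_means), 3), round(stdev(sample_means), 3), round(within, 3))
```

The three printed values should be close to 1, to 1/√50 and to 0.95 respectively, even though the parent population is far from normal.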
18
Parameter Estimations…….
Estimation is a procedure in which we use the
information contained in a sample to make inferences
about the true parameter of interest.
An estimator is a statistic that is used to estimate an
unknown population parameter.
Estimator
19
Estimations……
Properties of good estimator:
Unbiased: the expected value of the sample statistic is equal
to the population parameter. An individual estimate is unlikely
to equal the parameter exactly unless the sample includes the
entire source population.
20
Estimation
There are two types of estimation:
Point estimation:
a single numerical value used as the estimate of a parameter.
21
Point Estimation
Estimator is a rule (formula) that tells us how to
calculate the value of an estimate based on the
measurements contained in a sample.
23
Confidence Intervals
The endpoints are also called confidence limits (lower and upper).
25
Confidence Intervals Estimation...........
A confidence interval takes into account the sample-to-sample
variation of the statistic and gives a measure of precision.
The general formula used to calculate a confidence interval is:
CI = Estimate ± k × Standard Error, where k is called the reliability
coefficient.
Most commonly the 95% confidence intervals are calculated,
however 90% and 99% confidence intervals are sometimes
used.
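The reliability coefficient for each common confidence level can be obtained from the standard normal distribution. A minimal sketch, assuming a normal-based (z) coefficient; the function names are illustrative and not taken from the slides.

```python
from statistics import NormalDist

def reliability_coefficient(confidence):
    """Two-sided z value k such that P(-k < Z < k) = confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(estimate, standard_error, confidence=0.95):
    """CI = estimate ± k × standard error."""
    k = reliability_coefficient(confidence)
    return estimate - k * standard_error, estimate + k * standard_error

for level in (0.90, 0.95, 0.99):
    print(level, round(reliability_coefficient(level), 3))
```

For small samples with an unknown population variance, the coefficient would come from the t-distribution instead.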
27
Confidence Intervals Estimation……
A 90% CI is narrower than a 95% CI computed from the same
data, because we accept being only 90% confident that the
interval includes the population parameter.
29
Confidence Intervals Estimation:
To increase precision (of an SRS), use a larger
sample.
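The effect of sample size on precision follows from the standard error σ/√n. A small sketch, borrowing σ = 43.3 from the deck's survival-time example; the sample sizes other than 100 are hypothetical.

```python
from math import sqrt

def half_width(sigma, n, z=1.96):
    """Half-width (margin of error) of a z-based 95% CI for a mean."""
    return z * sigma / sqrt(n)

sigma = 43.3   # population SD from the survival-time example
for n in (25, 100, 400):   # 25 and 400 are hypothetical sample sizes
    print(n, round(half_width(sigma, n), 2))
```

Quadrupling the sample size halves the margin of error.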
31
Two-Sided Confidence Intervals
Two-Sided Confidence Intervals for mean...
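The statement P(−1.96 ≤ Z ≤ 1.96) = 0.95 is merely a shorthand
algebraic way of saying that 95% of the standard normal distribution
(SND) curve lies between −1.96 and +1.96. If
one chooses the sampling distribution of means (a
normal curve with mean μ and standard deviation
σ/√n), then substituting Z = (x̄ − μ)/(σ/√n), it follows that
P(−1.96 ≤ (x̄ − μ)/(σ/√n) ≤ 1.96) = 0.95.
With a little manipulation,
P(x̄ − 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n) = 0.95.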
32
Two-Sided Confidence Intervals
Estimating standard error
33
Two-Sided Confidence Intervals
34
Two-Sided Confidence Intervals
35
Confidence interval for a single mean (for a continuous variable)
36
Interval estimation for a single mean
37
Interval estimation for difference of mean
38
Interval estimation for difference of mean
39
Interval estimation for single proportion
40
Confidence interval for the difference of two
population proportions
41
Interval for Difference of Means
Remark
You can construct a 100(1-α)% confidence
interval for a paired experiment using:
Two Dependent Means
The test to be used in this section is the paired
t test
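The formula for the paired interval did not survive in the slide text. For reference, a commonly used form is given here as an assumption about what was intended, where d_i are the within-pair differences, d̄ is their mean, s_d is their standard deviation and n is the number of pairs:

```latex
\bar{d} \;\pm\; t_{\alpha/2,\,n-1}\,\frac{s_d}{\sqrt{n}},
\qquad
s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n-1}}
```

The critical value comes from the t-distribution with n − 1 degrees of freedom, which matches the paired t test named above.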
42
Confidence Intervals
Example: A random sample of 100 drug-treated
patients has a mean survival time of 46.9 months. If
the SD of the population is 43.3 months, find a 95%
confidence interval for the population mean.
Solution: x̄ ± Z(α/2) × σ/√n
46.9 ± (1.96)(43.3/√100) = 46.9 ± 8.5
= (38.4 to 55.4 months). Hence, we are 95%
confident that the interval (38.4, 55.4) contains the mean
survival time in the population from which the
sample was drawn.
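The arithmetic of this example can be checked in a few lines. A minimal sketch using only the numbers given in the example; the variable names are illustrative.

```python
from math import sqrt

x_bar = 46.9    # sample mean survival time (months)
sigma = 43.3    # population standard deviation (months)
n = 100         # sample size
z = 1.96        # reliability coefficient for 95% confidence

margin = z * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"{x_bar} ± {margin:.1f} = ({lower:.1f}, {upper:.1f})")
```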
43
Example 1
44
45
Example 2
46
Example 4
47
Exercise
48
Hypothesis testing
49
Hypothesis testing………
The formal process of hypothesis testing provides us
with a means of answering research questions.
50
Hypothesis testing………
This statement (assumption) may or may not
be true.
Statistical tests can provide evidence (with a certain
degree of confidence) that a relationship
exists.
The best way to determine whether a statistical
hypothesis is true would be to examine the entire
population.
Since that is often impractical, researchers
typically examine a random sample from the
population.
51
Hypothesis testing
In hypothesis testing,
the researcher must define the population under
study,
state the particular hypothesis that will be
investigated,
give the significance level,
select a sample from the population,
collect the data,
perform the calculations required for the statistical
test, and reach a conclusion.
52
Example of hypothesis
1) The mean height of Aksum College of Health
Sciences students is 1.63 m.
53
Hypothesis testing……
54
Hypothesis testing…..
55
Hypothesis testing…..
56
Hypothesis testing…..
57
Hypothesis testing…..
Level of significance
A method for decision making must be agreed upon.
If H0 is rejected, then HA is accepted.
How is a “significant” difference defined?
A null hypothesis is either true or false, and it is
either rejected or not rejected.
No error is made if it is true and we fail to reject
it, or if it is false and rejected.
An error is made, however, if it is true but
rejected, or if it is false and we fail to reject it.
58
Hypothesis testing cont’d …..
60
Hypothesis testing cont’d …..
Type I Error: A type I error occurs when one rejects
the null hypothesis while it is true.
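The meaning of a type I error can be illustrated by simulation: when H0 is true, a test at level α should reject in roughly a fraction α of repeated samples. A rough sketch with hypothetical settings (standard normal population, n = 10, α = 0.05, fixed seed).

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)   # fixed seed so the sketch is repeatable

mu0, sigma, n = 0.0, 1.0, 10   # hypothetical population and sample size
alpha = 0.05                   # hypothetical significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
trials = 4000

rejections = 0
for _ in range(trials):
    # H0 is true here: the data really come from a population with mean mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1   # rejecting a true H0 is a type I error

type_i_rate = rejections / trials
print(round(type_i_rate, 3))
```

The printed rate should be close to α.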
63
What Do We Test
Effect or Difference we are interested in
Correlation Coefficient
Estimated Parameters
64
Hypothesis testing for population mean
Test procedure for two tailed test
1. State the null hypothesis: H0: μ = μ0
For a small sample (n < 30) where the true variance (σ²) is unknown,
the test statistic follows a Student's t-distribution with n − 1
degrees of freedom.
If the interest is to check the presence or absence of an association,
then the chi-square distribution will be used.
66
Example
1. The average age of Aksum University staff is assumed to be 30
years, with a variance of 20 years². To check this
assumption, a graduating MPH student wants to
test whether the assumption made about the average
age is true or not. He took a random sample of 10
staff members and found an average age (mean) of 27 years.
Test whether the average age of the staff is 30 years.
Interpretation:
Thus, there is evidence that the average age of Aksum University staff is not 30
years.
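The calculation behind this interpretation is not in the slide text. A sketch of one way to carry it out, assuming the stated variance of 20 is the known population variance (so a z test applies) and assuming α = 0.05, which the example does not state.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 30.0        # hypothesised mean age (years)
variance = 20.0   # treated as the known population variance (assumption)
n = 10            # sample size
x_bar = 27.0      # observed sample mean (years)
alpha = 0.05      # assumed significance level; not given in the example

z = (x_bar - mu0) / sqrt(variance / n)
p_two_sided = 2 * NormalDist().cdf(-abs(z))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

print(round(z, 2), round(p_two_sided, 3), reject)
```

Under these assumptions |z| exceeds the critical value, which agrees with the interpretation given. If 20 were a sample variance instead, the t-distribution with 9 degrees of freedom would apply and the critical value would be larger.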
69
Test Procedure for One Tailed Test
The procedure is the same as for the two-tailed test, except
that the alternative hypothesis (H1) is
H1: μ < μ0 or H1: μ > μ0, when H0: μ = μ0.
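The practical difference is the rejection region: a one-tailed test puts all of α in one tail. A minimal sketch for z tests; the function name and the `tail` labels are illustrative.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical z value(s) for a 'two', 'left' or 'right' tailed test."""
    std = NormalDist()
    if tail == "two":
        c = std.inv_cdf(1 - alpha / 2)
        return -c, c                    # reject when z falls outside (-c, c)
    if tail == "left":                  # H1: mu < mu0, reject when z < value
        return std.inv_cdf(alpha)
    if tail == "right":                 # H1: mu > mu0, reject when z > value
        return std.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")

print(z_critical(0.05, "two"), z_critical(0.05, "left"), z_critical(0.05, "right"))
```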
72
Hypothesis Testing for Single Population Proportion (two tail)
73
Example
• The National Institute of Mental Health published an
article stating that in any one-year period,
approximately 9.5 percent of American adults suffer
from depression or a depressive illness.
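The sample data for this example are not in the slide text. The sketch below shows the general shape of a two-tailed z test for a single proportion against P0 = 0.095; the survey size and the count are hypothetical numbers chosen only to make the code run, not the example's data.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """Two-tailed z test of H0: P = p0, given x successes in n trials."""
    p_hat = x / n
    se0 = sqrt(p0 * (1 - p0) / n)   # standard error computed under H0
    z = (p_hat - p0) / se0
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Hypothetical survey: 110 of 1000 adults report depression.
z, p_value = one_proportion_z_test(x=110, n=1000, p0=0.095)
print(round(z, 2), round(p_value, 3))
```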
75
76
Procedure for one tailed test of proportion
77
Procedure for one tailed test of proportion
79
Hypothesis testing for two proportions……..
80
Hypothesis testing for two proportions………
81
Example
82
83
Decision: reject H0
• Because Z calc > Z tab; in other words, the p-value is
less than the level of significance (i.e., α = 0.01)
84
Tests of Hypothesis using the t - distribution
Tests of hypotheses about the mean are carried out with the
t-distribution just as for the normal distribution, except that
we must consider the number of degrees of freedom and use
a different table (the table of t-distribution).
Types of t-test:
85
The P-Value
86
Two-Sided P-Value
One-sided Ha: the P-value is the area under the curve (AUC)
in the tail beyond zcal.
Two-sided Ha: consider potential deviations in both
directions, so double the one-sided P-value.
Examples: If one-sided P = 0.0010, then two-sided P = 2
× 0.0010 = 0.0020.
If one-sided P = 0.2743, then
two-sided P = 2 × 0.2743 = 0.5486.
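The doubling rule can be checked from the standard normal distribution. A minimal sketch; the function name is illustrative, and z = 0.60 is used because its upper-tail area is about 0.2743, matching the second example.

```python
from statistics import NormalDist

def p_values(z_cal):
    """Return (one-sided, two-sided) P-values for a z statistic."""
    one_sided = 1 - NormalDist().cdf(abs(z_cal))   # area in the tail beyond |z|
    return one_sided, 2 * one_sided

one, two = p_values(0.60)
print(round(one, 4), round(two, 4))
```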
Interpretation
The P-value answers the question: What is the probability
of the observed test statistic … when H0 is true?
Thus, smaller and smaller P-values provide stronger
and stronger evidence against H0.
Small P-value ⇒ strong evidence
88
Interpretation…..
Conventions*
0.01 < P ≤ 0.05: significant evidence against H0
P ≤ 0.01: highly significant evidence against H0
Examples
P = .27: non-significant evidence against H0
P = .01: highly significant evidence against H0
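These conventions can be written as a small lookup. A sketch only: the function name is hypothetical, and treating every P above 0.05 as non-significant is an assumption, since any intermediate categories on the original slide did not survive.

```python
def interpret_p_value(p):
    """Map a P-value to the wording used in the conventions above."""
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p <= 0.01:
        return "highly significant evidence against H0"
    if p <= 0.05:
        return "significant evidence against H0"
    # Assumption: everything above 0.05 is labelled non-significant.
    return "non-significant evidence against H0"

for p in (0.27, 0.03, 0.01):
    print(p, interpret_p_value(p))
```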
89
Interpretation…..
90
Confidence Interval (CI) VS p-value
91
92