
Z Test

The Z-test is a statistical test where the test statistic is approximately normally distributed under the null hypothesis. It can be used for tests where the sample size is large enough for the normal approximation. The Z-test has a single critical value for each significance level, making it more convenient than the Student's t-test. Examples of tests that can be performed as Z-tests include the one-sample location test, two-sample location test, and paired difference test. For a Z-test to be valid, nuisance parameters must be known or estimated accurately, and the test statistic must follow a normal distribution.

Uploaded by

melprvn
Copyright
© © All Rights Reserved

Z-test


A Z-test is any statistical test for which
the distribution of the test statistic under
the null hypothesis can be approximated
by a normal distribution. Because of the
central limit theorem, many test statistics
are approximately normally distributed for
large samples. For each significance level,
the Z-test has a single critical value (for
example, 1.96 for a 5% two-tailed test), which
makes it more convenient than the
Student's t-test, which has separate critical
values for each sample size. Therefore,
many statistical tests can be conveniently
performed as approximate Z-tests if the
sample size is large or the population
variance is known. If the population
variance is unknown (and therefore has to
be estimated from the sample itself) and
the sample size is not large (n < 30), the
Student's t-test may be more appropriate.

If T is a statistic that is approximately
normally distributed under the null
hypothesis, the next step in performing a
Z-test is to estimate the expected value θ
of T under the null hypothesis, and then
obtain an estimate s of the standard
deviation of T. After that the standard
score Z = (T − θ) / s is calculated, from
which one-tailed and two-tailed p-values
can be calculated as Φ(−Z) (for upper-
tailed tests), Φ(Z) (for lower-tailed tests)
and 2Φ(−|Z|) (for two-tailed tests) where Φ
is the standard normal cumulative
distribution function.
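
The three p-value formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `phi` and `z_test_p_values` are hypothetical, and Φ is computed from the standard library's complementary error function.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def z_test_p_values(t: float, theta: float, s: float) -> dict:
    """Return the standard score Z = (T - theta) / s and its three p-values.

    t     -- observed value of the statistic T
    theta -- expected value of T under the null hypothesis
    s     -- (estimated) standard deviation of T
    """
    z = (t - theta) / s
    return {
        "z": z,
        "upper": phi(-z),                 # upper-tailed test
        "lower": phi(z),                  # lower-tailed test
        "two_sided": 2.0 * phi(-abs(z)),  # two-tailed test
    }


if __name__ == "__main__":
    # At Z = 1.96 the two-tailed p-value is close to 0.05.
    print(z_test_p_values(1.96, 0.0, 1.0))
```

The upper-tailed and lower-tailed p-values always sum to 1, which is a quick sanity check on any implementation.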

Use in location testing


The term "Z-test" is often used to refer
specifically to the one-sample location test
comparing the mean of a set of
measurements to a given constant when
the population variance is known. If the
observed data X₁, ..., Xₙ are (i)
independent, (ii) have a common mean μ,
and (iii) have a common variance σ², then
the sample average X̄ has mean μ and
variance σ² / n.

The null hypothesis is that the mean value
of X is a given number μ₀. We can use X̄
as a test statistic, rejecting the null
hypothesis if X̄ − μ₀ is large.

To calculate the standardized statistic
Z = (X̄ − μ₀) / s, we need to either know or
have an approximate value for σ², from
which we can calculate s² = σ² / n. In
some applications, σ² is known, but this is
uncommon.

If the sample size is moderate or large, we
can substitute the sample variance for σ²,
giving a plug-in test. The resulting test will
not be an exact Z-test since the
uncertainty in the sample variance is not
accounted for—however, it will be a good
approximation unless the sample size is
small.
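
As a rough sketch of the plug-in test just described, the following Python function substitutes the sample standard deviation for the unknown population value. The function name `plug_in_z_test` is hypothetical, and the two-sided p-value uses the identity 2Φ(−|Z|) = erfc(|Z| / √2).

```python
import math
import statistics


def plug_in_z_test(data, mu0):
    """One-sample plug-in Z-test of the null hypothesis mean == mu0.

    The sample standard deviation stands in for the population value,
    so this is only an approximate Z-test (see the text above).
    """
    n = len(data)
    mean = statistics.fmean(data)
    # Standard error of the mean with the sample sd plugged in for sigma.
    s = statistics.stdev(data) / math.sqrt(n)
    z = (mean - mu0) / s
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * Phi(-|z|)
    return z, p_two_sided


if __name__ == "__main__":
    sample = list(range(1, 101))  # illustrative data, not from the article
    print(plug_in_z_test(sample, 40.0))
```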

A t-test can be used to account for the
uncertainty in the sample variance when
the data are exactly normal.

There is no universal constant at which the
sample size is generally considered large
enough to justify use of the plug-in test.
A typical rule of thumb is that the sample
size should be 50 observations or more.

For large sample sizes, the t-test
procedure gives almost identical p-values
to the Z-test procedure.

Other location tests that can be performed
as Z-tests are the two-sample location test
and the paired difference test.

Conditions
For the Z-test to be applicable, certain
conditions must be met.

Nuisance parameters should be known,
or estimated with high accuracy (an
example of a nuisance parameter would
be the standard deviation in a one-
sample location test). Z-tests focus on a
single parameter, and treat all other
unknown parameters as being fixed at
their true values. In practice, due to
Slutsky's theorem, "plugging in"
consistent estimates of nuisance
parameters can be justified. However, if
the sample size is not large enough for
these estimates to be reasonably
accurate, the Z-test may not perform
well.

The test statistic should follow a normal
distribution. Generally, one appeals to
the central limit theorem to justify
assuming that a test statistic varies
normally. There is a great deal of
statistical research on the question of
when a test statistic varies
approximately normally. If the variation
of the test statistic is strongly non-
normal, a Z-test should not be used.

If estimates of nuisance parameters are
plugged in as discussed above, it is
important to use estimates appropriate for
the way the data were sampled. In the
special case of Z-tests for the one- or
two-sample location problem, the usual
sample standard deviation is only
appropriate if the data were collected as
an independent sample.

In some situations, it is possible to devise
a test that properly accounts for the
variation in plug-in estimates of nuisance
parameters. In the case of one- and
two-sample location problems, a t-test does
this.

Example
Suppose that in a particular geographic
region, the mean and standard deviation of
scores on a reading test are 100 points,
and 12 points, respectively. Our interest is
in the scores of 55 students in a particular
school who received a mean score of 96.
We can ask whether this mean score is
significantly lower than the regional mean
—that is, are the students in this school
comparable to a simple random sample of
55 students from the region as a whole, or
are their scores surprisingly low?

First calculate the standard error of the
mean:

SE = σ / √n = 12 / √55 ≈ 12 / 7.42 ≈ 1.62

where σ is the population standard
deviation.

Next calculate the z-score, which is the
distance from the sample mean to the
population mean in units of the standard
error:

z = (X̄ − μ) / SE = (96 − 100) / 1.62 ≈ −2.47

In this example, we treat the population
mean and variance as known, which would
be appropriate if all students in the region
were tested. When population parameters
are unknown, a t-test should be conducted
instead.

The classroom mean score is 96, which is
−2.47 standard error units from the
population mean of 100. Looking up the z-
score in a table of the standard normal
distribution, we find that the probability of
observing a standard normal value below
−2.47 is approximately 0.5 − 0.4932 =
0.0068. This is the one-sided p-value for
the null hypothesis that the 55 students
are comparable to a simple random
sample from the population of all test-
takers. The two-sided p-value is
approximately 0.014 (twice the one-sided
p-value).
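
The arithmetic of this example can be reproduced with a short Python script. It is only a sketch: the variable names are illustrative, and because the script does not round the z-score to −2.47 before evaluating Φ, the p-values it prints may differ slightly in the last digit from the table-based figures quoted above.

```python
import math

# Values taken from the example in the text.
mu = 100.0          # regional (population) mean
sigma = 12.0        # regional (population) standard deviation
n = 55              # number of students in the school
sample_mean = 96.0  # mean score of those students

# Standard error of the mean, then the z-score.
se = sigma / math.sqrt(n)
z = (sample_mean - mu) / se

# One-sided (lower-tail) p-value Phi(z), and the two-sided p-value.
p_lower = 0.5 * math.erfc(-z / math.sqrt(2.0))
p_two_sided = 2.0 * p_lower

print(f"SE = {se:.2f}, z = {z:.2f}")
print(f"one-sided p = {p_lower:.4f}, two-sided p = {p_two_sided:.3f}")
```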

Another way of stating things is that with
probability 1 − 0.014 = 0.986, a simple
random sample of 55 students would have
a mean test score within 4 units of the
population mean. We could also say that
with 98.6% confidence we reject the null
hypothesis that the 55 test takers are
comparable to a simple random sample
from the population of test-takers.

The Z-test tells us that the 55 students of
interest have an unusually low mean test
score compared to most simple random
samples of similar size from the
population of test-takers. A deficiency of
this analysis is that it does not consider
whether the effect size of 4 points is
meaningful. If instead of a classroom, we
considered a subregion containing 900
students whose mean score was 99,
nearly the same z-score and p-value would
be observed. This shows that if the
sample size is large enough, very small
differences from the null value can be
highly statistically significant. See
statistical hypothesis testing for further
discussion of this issue.
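
The comparison between the classroom and the 900-student subregion can be checked numerically. The sketch below assumes the same population mean (100) and standard deviation (12) as the example; the helper name `z_score` is hypothetical.

```python
import math


def z_score(sample_mean, mu, sigma, n):
    """Distance from the sample mean to mu in units of the standard error."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


# Classroom: 55 students, 4 points below the regional mean.
z_class = z_score(96.0, 100.0, 12.0, 55)

# Subregion: 900 students, only 1 point below the regional mean.
# The standard error here is 12 / 30 = 0.4.
z_subregion = z_score(99.0, 100.0, 12.0, 900)

print(f"classroom z = {z_class:.2f}, subregion z = {z_subregion:.2f}")
```

A difference one quarter the size produces essentially the same z-score because the standard error shrinks with the square root of the sample size.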

Z-tests other than location tests
Location tests are the most familiar Z-
tests. Another class of Z-tests arises in
maximum likelihood estimation of the
parameters in a parametric statistical
model. Maximum likelihood estimates are
approximately normal under certain
conditions, and their asymptotic variance
can be calculated in terms of the Fisher
information. The maximum likelihood
estimate divided by its standard error can
be used as a test statistic for the null
hypothesis that the population value of the
parameter equals zero. More generally, if
θ̂ is the maximum likelihood estimate of a
parameter θ, and θ₀ is the value of θ under
the null hypothesis, then

(θ̂ − θ₀) / SE(θ̂)

can be used as a Z-test statistic.
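
As one concrete illustration of a Z-test built from a maximum likelihood estimate, consider a Bernoulli model, which is an assumption made here and not part of the article. The estimate of the success probability is the sample proportion, and the Fisher information gives a standard error of √(p̂(1 − p̂) / n) when evaluated at the estimate. The function name `wald_z_bernoulli` is hypothetical.

```python
import math


def wald_z_bernoulli(successes: int, n: int, p0: float):
    """Z statistic (estimate - null value) / SE for a Bernoulli success probability.

    The standard error comes from the Fisher information evaluated at the
    maximum likelihood estimate; this is a large-sample approximation.
    """
    p_hat = successes / n  # maximum likelihood estimate
    if p_hat in (0.0, 1.0):
        raise ValueError("standard error is zero when all outcomes are identical")
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    z = (p_hat - p0) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * Phi(-|z|)
    return z, p_two_sided


if __name__ == "__main__":
    # Illustrative data: 60 successes in 100 trials, null value 0.5.
    print(wald_z_bernoulli(60, 100, 0.5))
```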

When using a Z-test for maximum
likelihood estimates, it is important to be
aware that the normal approximation may
be poor if the sample size is not
sufficiently large. Although there is no
simple, universal rule stating how large the
sample size must be to use a Z-test,
simulation can give a good idea as to
whether a Z-test is appropriate in a given
situation.

Z-tests are employed whenever it can be
argued that a test statistic follows a
normal distribution under the null
hypothesis of interest. Many non-
parametric test statistics, such as U
statistics, are approximately normal for
large enough sample sizes, and hence are
often performed as Z-tests.
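
The suggestion in this section that simulation can indicate whether a Z-test is appropriate might be sketched as follows. The setup is an assumption chosen for illustration: data are drawn from an exponential distribution with true mean 1, the null hypothesis is true, and a plug-in Z-test is applied at the two-tailed 5% level. If the normal approximation is adequate, the fraction of rejections should be close to 0.05. The function name `rejection_rate` and the seed are hypothetical choices.

```python
import math
import random
import statistics


def rejection_rate(n: int, trials: int = 2000, seed: int = 0) -> float:
    """Estimate how often a plug-in Z-test rejects a true null hypothesis.

    Each trial draws n exponential observations with mean 1 and tests
    the (true) null hypothesis mean == 1 at the two-tailed 5% level.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        data = [rng.expovariate(1.0) for _ in range(n)]
        se = statistics.stdev(data) / math.sqrt(n)
        z = (statistics.fmean(data) - 1.0) / se
        if abs(z) > 1.96:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    for n in (5, 30, 200):
        print(n, rejection_rate(n))
```

A rejection rate well above 0.05 at a given sample size would suggest that the Z-test is not reliable there for data of this shape.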

See also
Normal distribution
Standard normal table
Standard score
Student's t-test
References
Sprinthall, R. C. (2011). Basic Statistical
Analysis (9th ed.). Pearson Education.
ISBN 978-0-205-05217-2.

Retrieved from
"https://en.wikipedia.org/w/index.php?title=Z-test&oldid=872721702"

Last edited 1 month ago by Stesmo

Content is available under CC BY-SA 3.0 unless
otherwise noted.
