Chap 10

Uploaded by

sastf
Introduction

In Chapter 8 we began our study of statistical inference by describing how we could
select a random sample and then use the sample values to estimate the value of a
population parameter. Recall that a sample is a part or subset of the population, while a
parameter is a value calculated from the entire population. In Chapter 9 we estimated a
population parameter from a sample statistic. In addition, we developed a range of values,
called a confidence interval, within which we expected the population value to be
located.

In this chapter, rather than developing a range of values within which we expect the
population parameter to occur, we will conduct a test of hypothesis regarding the validity
of a statement about a population parameter.

Two statements called hypotheses are made regarding the possible values of population
parameters.

What is a Hypothesis?

A hypothesis is a statement about a population.

Hypothesis: A statement about a population parameter developed for the
purpose of testing.

In statistical analysis we make a claim, that is, state a hypothesis, and then follow up with
tests to verify the assertion or to determine that it is untrue.

What is Hypothesis Testing?

The terms hypothesis testing and testing a hypothesis are used interchangeably.
Hypothesis testing starts with a statement about a population parameter such as the mean.

Hypothesis testing: A procedure based on sample evidence and probability
theory to determine whether the hypothesis is a reasonable statement.

For example, one statement about the performance of a new model car is that the mean
miles per gallon is 30. The other statement is that the mean miles per gallon is not 30.
Only one of these statements is correct.

Five-Step Procedure for Testing a Hypothesis

Statistical hypothesis testing is a five-step procedure. These steps are:

Step 1: State null and alternate hypotheses.
Step 2: Select a level of significance.
Step 3: Identify the test statistic.
Step 4: Formulate a decision rule.
Step 5: Take a sample, arrive at decision: do not reject H0, or reject H0 and accept H1.

When conducting hypothesis tests, we actually employ a strategy of "proof by
contradiction." That is, we hope to accept a statement to be true by rejecting or ruling out
another statement. The steps involved in hypothesis testing will now be described in more
detail.

First we will concentrate on testing a hypothesis about a population mean, or means.
Then we will consider one or two population proportions. For a mean or means:

Step 1. State the null hypothesis (H0) and the alternate hypothesis (H1).

The first step is to state the hypothesis being tested. It is called the null hypothesis,
designated H0 , and read H sub zero. The capital letter H stands for hypothesis, and the
subscript zero implies "no difference."

Null hypothesis: A statement about the value of a population parameter.

For example, a recent newspaper report made the claim that the mean length of a hospital
stay was 3.3 days. You think that the true length of stay is some other length than 3.3
days.

The null hypothesis is written H0: μ = 3.3, where H0 is an abbreviation of the null
hypothesis. The null hypothesis will always contain the equal sign. It is the statement
about the value of the population parameter, in this case the population mean. The null
hypothesis is established for the purpose of testing. On the basis of the sample evidence,
it is either rejected or not rejected.

If the null hypothesis is rejected then we accept the alternate hypothesis.

Alternate hypothesis: A statement that is accepted if the sample data provide
enough evidence that the null hypothesis is false.

The alternate hypothesis is written H1. From the above example the alternate hypothesis is
that the mean length of stay is not 3.3 days. It is written H1: μ ≠ 3.3 (≠ is read "not equal
to"). H1 is accepted only if H0 is rejected. When the "≠" sign appears in the alternate
hypothesis, the test is called a two-tailed test.

There are two other formats for writing the null and alternate hypotheses. Suppose you
think that the mean length of stay is greater than 3.3 days. The null and alternate
hypotheses would be written as follows (≤ is read "equal to or less than"):

H0: μ ≤ 3.3

H1: μ > 3.3

Notice that in this case the null hypothesis indicates "no change or that μ is less than 3.3."
The alternate hypothesis states that the mean length of stay is greater than 3.3 days.
Acceptance of the alternate hypothesis would allow us to conclude that the mean length
of stay is greater than 3.3 days.

What if you think that the mean length of stay is less than 3.3 days? The null and
alternate hypotheses would be written as:

H0: μ ≥ 3.3

H1: μ < 3.3

Acceptance of the alternate hypothesis in this instance would allow you to conclude the
mean length of stay is less than 3.3 days. When a direction is expressed in the alternate
hypothesis, such as > or < , the test is referred to as being one-tailed.

Step 2. Select the Level of Significance.

After setting up the null hypothesis and alternate hypothesis, the next step is to state the
level of significance.

Level of significance: The probability of rejecting the null hypothesis when
it is true.

The level of significance is designated α, the Greek letter alpha. The level of
significance is sometimes called the level of risk. It will indicate when the sample mean
is too far away from the hypothesized mean for the null hypothesis to be true. Usually the
significance level is set at either 0.01 or 0.05, although other values may be chosen.

Testing a null hypothesis at the 0.05 significance level, for example, indicates that the
probability of rejecting the null hypothesis, even though it is true, is 0.05. The 0.05 level
is also stated as the 5% level. When a true null hypothesis is rejected, it is referred to as a
Type I error.

Type I error: Rejecting the null hypothesis, H0, when it is true.


The decision whether to use the 0.01 or the 0.05 significance level, or some other value,
depends on the consequences of making a Type I error. The significance level is chosen
before the sample is selected.

If the null hypothesis is not true, but our sample results indicate that it is, we have a Type
II error.

Type II error: Accepting the null hypothesis when it actually is false.

For example, H0 is that the mean hospital stay is 3.3 days. Our sample evidence fails to
refute this hypothesis, but actually the population mean length of stay is 4.0 days. In this
situation we have committed a Type II error by accepting a false H0.

We refer to the probability of these two possible errors as alpha (α) and beta (β).
Alpha (α) is the probability of making a Type I error and beta (β) is the
probability of making a Type II error. The following table summarizes the decisions the
researcher could make and the possible consequences.

                     Researcher
Null Hypothesis      Accepts H0          Rejects H0
H0 is true           Correct decision    Type I error
H0 is false          Type II error       Correct decision
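The meaning of α as a long-run error rate can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it draws repeated samples from a normal population in which H0 (μ = 3.3) really is true, applies a two-tailed z test of the kind described in Steps 3 and 4, and counts how often H0 is wrongly rejected. The population standard deviation (1.0), the sample size (36), the number of trials, and the function name are hypothetical choices, not values from the text.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Estimate how often a true H0 (mean = mu0) is rejected by a two-tailed z test."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value of z
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # The population mean really is mu0, so every rejection is a Type I error.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

# Hypothetical settings: sigma, n, and trials are assumptions for illustration.
rate = type_i_error_rate(mu0=3.3, sigma=1.0, n=36, alpha=0.05, trials=2000)
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen α of 0.05, which is what "the probability of rejecting the null hypothesis when it is true" means in practice.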

Step 3. Select the Test Statistic.

A test statistic is a quantity calculated from the sample information and is used as the
basis for deciding whether or not to reject the null hypothesis.

Test statistic: A value, determined from sample information, used to


determine whether to reject the null hypothesis.

Exactly which test statistic to employ is determined by factors such as whether the
population standard deviation is known and the size of the sample.

In hypothesis testing for the mean μ, when σ is known or the sample size is large, the
standard normal distribution, the z value, is the test statistic used. Formula [10-1] is used:

z distribution as a Test Statistic [10-1]

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Where:

z is the value of the test statistic.
X̄ is the sample mean.
μ is the population mean.
σ is the population standard deviation.
n is the sample size.
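Formula [10-1] translates directly into a few lines of code. The following is a minimal sketch; the function name and the numbers passed to it (a sample mean of 3.6 days, σ = 1.2, n = 36) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """Formula [10-1]: how many standard errors the sample mean lies from the hypothesized mean."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical inputs: H0 says mu = 3.3; the sample values are assumptions.
z = z_statistic(sample_mean=3.6, hypothesized_mean=3.3, sigma=1.2, n=36)
print(f"z = {z:.2f}")
```

Here the standard error is 1.2 / 6 = 0.2, so a sample mean 0.3 days above the hypothesized mean gives z = 1.5.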

Step 4. Formulate the Decision Rule.

A decision rule is based on H0 and H1, the level of significance, and the test statistic.

Decision rule: A statement of the conditions under which the null hypothesis
is rejected and conditions under which it is not rejected.

The region or area of rejection indicates the location of the values that are so large or so
small that the probability of their occurrence for a true null hypothesis is rather remote.

If we are applying a one-tailed test, there is one critical value. If we are applying a two-
tailed test, there are two critical values.

Critical value: The dividing point between the region where the null
hypothesis is rejected and the region where it is not rejected.

Chart 10-1 shows the conditions under which the null hypothesis is rejected, using the
0.05 significance level, a one-tailed test, and the standard normal distribution.

Chart 10-1 Sampling Distribution of the Statistic z, Right-Tailed Test, 0.05 Level of
Significance

The above diagram portrays the rejection region for a right-tailed test.

1. The area where the null hypothesis is not rejected is to the left of 1.65.
2. The area of rejection is to the right of 1.65.
3. A one-tailed test is being applied.
4. The 0.05 level of significance was chosen.
5. The sampling distribution of the test statistic z is normally distributed.
6. The value 1.65 separates the regions where the null hypothesis is rejected and
where it is not rejected.
7. The value 1.65 is called the critical value.

When is the standard normal distribution used? It is appropriate when the population is
normal and the population standard deviation is known. When the population standard
deviation is not known, the sample standard deviation is used instead. If the sample size is
at least 30, the test statistic follows the normal distribution.

If the computed value of z is greater than 1.65, the null hypothesis is rejected. If the
computed value of z is less than or equal to 1.65, the null hypothesis is not rejected.
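The decision rule for a right-tailed test can be sketched in code as follows. The critical value is taken from the standard normal distribution rather than from a table, so it comes out as about 1.645, which the text rounds to 1.65. The function name and the computed z value of 2.10 are hypothetical.

```python
from statistics import NormalDist

def right_tailed_decision(z, alpha=0.05):
    """Return the critical value and the decision for a right-tailed z test."""
    critical_value = NormalDist().inv_cdf(1 - alpha)  # area alpha lies to the right
    decision = "reject H0" if z > critical_value else "do not reject H0"
    return critical_value, decision

# Hypothetical computed z value.
critical_value, decision = right_tailed_decision(z=2.10)
print(f"critical value = {critical_value:.3f}, decision: {decision}")
```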

Step 5. Compute the value of the test statistic, make a decision, and interpret the
results.

The final step in hypothesis testing after selecting the sample is to compute the value of
the test statistic. This value is compared to the critical value, or values, and a decision is
made whether to reject or not to reject the null hypothesis. Interpret the results.

A summary of the steps in hypothesis testing:

1. Establish the null hypothesis (H0) and the alternate hypothesis (H1).
2. Select the level of significance, that is α.
3. Select an appropriate test statistic.
4. Formulate the decision rule, based on steps 1, 2, and 3 above.
5. Make a decision regarding the null hypothesis based on the sample information.
Interpret the results of the test.

One-Tailed and Two-Tailed Tests of Significance

We need to differentiate between a one-tailed test of significance and a two-tailed test of
significance.

Chart 10-1 above depicts a one-tailed test. The region of rejection is only in the right
(upper) tail of the curve.

Chart 10-2 depicts a situation where the rejection region is in the left (lower) tail of the
normal distribution.

Chart 10-2 Sampling Distribution of the Statistic z, Left-Tailed Test, 0.05 Level of
Significance

Chart 10-3 depicts a situation for a two-tailed test where the rejection region is divided
equally into the two tails of the normal distribution.

Chart 10-3 Regions of Nonrejection and Rejection for a Two-Tailed Test, 0.05 Level of
Significance
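Since the charts themselves are not reproduced here, the critical values they mark can be recovered numerically. The sketch below, with a hypothetical helper name, returns the critical value or values of z for each kind of test at a given significance level. For a two-tailed test, α is split equally between the two tails.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical z value(s) for a 'right', 'left', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")

for tail in ("right", "left", "two"):
    values = ", ".join(f"{v:.3f}" for v in critical_values(0.05, tail))
    print(f"{tail}-tailed, alpha = 0.05: {values}")
```

At the 0.05 level this gives roughly 1.645 for the right-tailed test, -1.645 for the left-tailed test, and -1.960 and 1.960 for the two-tailed test.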

Testing for a Population Mean with a Known Population Standard Deviation

Suppose we are concerned with a single population mean. We want to test if our sample
mean could have been obtained from a population with a particular hypothesized mean.
For example, we may be interested in testing whether the mean starting salary of recent
marketing graduates is equal to $32,000 per year. It is assumed that:

1. The population is normally distributed.
2. The population standard deviation is known.

If σ is not known, the sample standard deviation is substituted for the population standard
deviation provided the sample size is 30 or more.

Under these conditions the test statistic follows the standard normal distribution, with the
sample standard deviation s substituted for σ when σ is not known. Thus we use text formula [10-1]:

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Where:

z is the value of the test statistic.
X̄ is the sample mean.
μ is the population mean.
σ is the standard deviation of the population.
n is the number in the sample.

The sample standard deviation s can be substituted for σ provided that the sample size is
30 or more.
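To see the five steps working together, consider the starting-salary question as a two-tailed test of H0: μ = $32,000 against H1: μ ≠ $32,000. The sketch below is illustrative only: the population standard deviation ($3,000), the sample size (100), and the sample mean ($32,600) are hypothetical values that do not come from the text.

```python
import math
from statistics import NormalDist

# Step 1: H0: mu = 32,000 and H1: mu != 32,000 (two-tailed test).
hypothesized_mean = 32_000

# Step 2: select the level of significance.
alpha = 0.05

# Hypothetical inputs (assumptions for illustration, not from the text).
sigma = 3_000         # assumed known population standard deviation
n = 100               # assumed sample size
sample_mean = 32_600  # assumed sample mean

# Step 3: test statistic z, formula [10-1].
z = (sample_mean - hypothesized_mean) / (sigma / math.sqrt(n))

# Step 4: decision rule. Reject H0 if z falls beyond either critical value.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: make the decision.
decision = "reject H0" if abs(z) > critical_value else "do not reject H0"
print(f"z = {z:.2f}, critical values = +/-{critical_value:.2f}, decision: {decision}")
```

With these assumed numbers the standard error is 300, so z = 2.00, which lies beyond 1.96, and H0 would be rejected at the 0.05 level.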

p-value in Hypothesis Testing

In the process of testing a hypothesis, we compared the test statistic to a critical value.
We made a decision to either reject the null hypothesis or not to reject it. The question is
often asked as to how confident we were in rejecting the null hypothesis.

A p-value is frequently compared to the significance level to evaluate the decision
regarding the null hypothesis. It is a means of reporting how strongly the sample evidence
contradicts H0.

p-value: The probability of observing a sample value as extreme as, or more
extreme than, the value observed, given that the null hypothesis is true.

• If the p-value is greater than the significance level, then H0 is not rejected.
• If the p-value is less than the significance level, then H0 is rejected.
• The p-value for a given test depends on three factors:
  1. whether the alternate hypothesis is one-tailed or two-tailed
  2. the particular test statistic that is used
  3. the computed value of the test statistic

For example, if α = 0.05 and the p-value is 0.0025, H0 is rejected. We report that, if H0
were true, a sample result at least this extreme would occur with probability of only 0.0025.

Interpreting the weight of evidence against H0. If the p-value is less than:

(a) 0.10, we have some evidence that H0 is not true.
(b) 0.05, we have strong evidence that H0 is not true.
(c) 0.01, we have very strong evidence that H0 is not true.
(d) 0.001, we have extremely strong evidence that H0 is not true.
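The p-value for a z test can be computed from the standard normal distribution once the computed z and the type of test are known. The following sketch uses hypothetical function names and a hypothetical z of 2.00; the evidence labels follow the scale given above.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of a computed z for a 'right', 'left', or 'two' tailed test."""
    std_normal = NormalDist()
    if tail == "right":
        return 1 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))  # both tails count as extreme
    raise ValueError("tail must be 'right', 'left', or 'two'")

def evidence_label(p):
    """Weight of evidence against H0, using the thresholds listed in the text."""
    if p < 0.001:
        return "extremely strong"
    if p < 0.01:
        return "very strong"
    if p < 0.05:
        return "strong"
    if p < 0.10:
        return "some"
    return "little or none"

# Hypothetical computed z for a two-tailed test.
p = p_value(2.00, "two")
print(f"p-value = {p:.4f}, evidence against H0: {evidence_label(p)}")
```

The label "little or none" for p-values of 0.10 or more is an added convenience, since the scale in the text starts at 0.10.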

Testing for a Population Mean: Large Sample, Population Standard Deviation Unknown

In most cases the population standard deviation is unknown. Thus, σ must be based on
prior studies or estimated by the sample standard deviation, s. As long as the sample size,
n, is at least 30, s can be substituted for σ, as illustrated in the following formula.

z Statistic, σ Unknown

$$z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

Tests Concerning Proportions

In the previous chapter we discussed confidence intervals for proportions. We continue
our study of hypothesis testing but expand the idea to a proportion. What is a proportion?

Proportion: The fraction, ratio, or percent indicating the part of the
population or sample having a particular trait of interest.

If we let p stand for the sample proportion then text formula [10-3] is:

Test of Hypothesis, One Proportion [10-3]

$$z = \frac{p - \pi}{\sigma_p}$$

Where:

z is the value of the test statistic.
π is the population proportion.
p is the sample proportion.
σ_p is the standard error of the population proportion. It is computed by
$\sigma_p = \sqrt{\pi(1 - \pi)/n}$, so the formula for z becomes text Formula [10-4]:

Test of Hypothesis, One Proportion [10-4]

$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}}$$

Where:

z is the value of the test statistic.
π is the population proportion.
p is the sample proportion.
n is the sample size.

For example, we want to estimate the proportion of all home sales made to first time
buyers. A random sample of 200 recent transactions showed that 40 were first time
buyers. Therefore, we estimate that 0.20, or 20 percent, of all sales are made to first time
buyers, found by:

$$p = \frac{40}{200} = 0.20$$

To conduct a test of hypothesis for proportions, the same assumptions required for the
binomial distribution must be met. Recall from Chapter 6 that those assumptions are:

1. Each outcome is classified into one of two categories such as, buyers were either
first time home buyers or they were not.
2. The number of trials is fixed. In this case it is 200.
3. Each trial is independent, meaning that the outcome of one trial has no bearing on
the outcome of any other. Whether the 20th sampled person was a first time buyer
does not affect the outcome of any other trial.
4. The probability of a success is fixed. The probability is 0.20 for all 200 buyers in
the sample.

Recall from Chapter 6 that the normal distribution is a good approximation of the
binomial distribution when nπ and n(1 − π) are both greater than 5. In this instance n
refers to the sample size and π to the probability of a success. The test statistic that is
employed for testing hypotheses about proportions is the standard normal distribution.
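The condition for using the normal approximation is easy to check before running a test. A minimal sketch, with a hypothetical function name, applied to the home-sales figures above (n = 200, probability of success 0.20):

```python
def normal_approximation_ok(n, pi):
    """True when both n*pi and n*(1 - pi) are greater than 5."""
    return n * pi > 5 and n * (1 - pi) > 5

# Home-sales example from the text: n = 200, probability of success 0.20.
print(normal_approximation_ok(200, 0.20))
```

Here nπ = 40 and n(1 − π) = 160, so the condition is comfortably met. A much smaller sample, such as n = 20 with the same probability, would fail it because nπ would be only 4.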

Testing for a Population Mean: Small Sample, Population Standard Deviation Unknown

Recall that we can use the standard normal distribution, that is z, when:

1. The population is known to follow a normal distribution and the population
standard deviation is known, or
2. The shape of the population is not known, but the sample size is at least 30.

When the population standard deviation is not known and the sample size is less than 30,
the correct statistical procedure is to replace the standard normal distribution with the t
distribution. In Chapter 9, we noted the following major characteristics of the t
distribution:

1. It is a continuous distribution.
2. It is bell shaped and symmetrical.
3. There is a "family" of t distributions. Each time the size of the sample changes,
and thus the degrees of freedom change, a new t distribution is created.
4. As the number of degrees of freedom increases, the shape of the t distribution
approaches that of the standard normal distribution.
5. The t distribution is more spread out (that is, "flatter") than the standard normal
distribution.

To conduct a test of hypothesis using the t distribution, we use text Formula [10-5]:

Test Statistic for Mean of a Small Sample [10-5]

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

with n − 1 degrees of freedom.

Where:

t is the value of the test statistic.
X̄ is the mean of the sample.
μ is the hypothesized population mean.
s is the standard deviation of the sample.
n is the number of observations in the sample.
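The t statistic in Formula [10-5] can be computed from raw sample data as in the sketch below. The five observations and the hypothesized mean of 10 are hypothetical. The critical value of t is not computed here, because the Python standard library has no t distribution; it would be looked up in a t table using the degrees of freedom shown.

```python
import math
from statistics import mean, stdev

def t_statistic(sample, hypothesized_mean):
    """Formula [10-5]: t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (mean(sample) - hypothesized_mean) / (s / math.sqrt(n))
    degrees_of_freedom = n - 1
    return t, degrees_of_freedom

# Hypothetical small sample and hypothesized mean.
sample = [10, 12, 9, 11, 13]
t, df = t_statistic(sample, hypothesized_mean=10)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

For this sample the mean is 11 and the sample variance is 2.5, so the standard error is the square root of 0.5 and t is about 1.414 with 4 degrees of freedom.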

Types of Tests of Hypothesis for a Proportion

There are three formats for testing a hypothesis about a proportion. For a one-tailed test
there are two possibilities, depending on the intent of the researcher. For example, if we
wanted to determine whether more than 25 percent of the sales of homes were sold to
first time buyers, the hypotheses would be given as follows:

H0: π ≤ 0.25

H1: π > 0.25

If we wanted to find out whether fewer than 25 percent of the homes were sold to first
time buyers, the hypotheses would be given as:

H0: π ≥ 0.25

H1: π < 0.25

For a two-tailed test the null and alternate hypotheses are:

H0: π = 0.25

H1: π ≠ 0.25

Where ≠ means "not equal to." Rejection of H0 and acceptance of H1 allows us to
conclude only that the population proportion is "different from" or "not equal to" the
hypothesized value. It does not allow us to make any statement about the direction of the
difference.
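As an illustration, the home-sales sample described earlier (40 first time buyers among 200 transactions) can be used to test whether fewer than 25 percent of homes are sold to first time buyers, that is H0: π ≥ 0.25 against H1: π < 0.25. The sketch below applies Formula [10-4]; the choice of the 0.05 significance level and the function name are assumptions made for the example.

```python
import math
from statistics import NormalDist

def proportion_z(successes, n, pi0):
    """Formula [10-4]: z for a sample proportion against a hypothesized proportion pi0."""
    p = successes / n
    standard_error = math.sqrt(pi0 * (1 - pi0) / n)
    return (p - pi0) / standard_error

alpha = 0.05  # assumed level of significance
z = proportion_z(successes=40, n=200, pi0=0.25)

# Left-tailed test: reject H0 only if z falls below the lower critical value.
critical_value = NormalDist().inv_cdf(alpha)
decision = "reject H0" if z < critical_value else "do not reject H0"
print(f"z = {z:.3f}, critical value = {critical_value:.3f}, decision: {decision}")
```

With these figures z is about -1.633, which does not fall below the critical value of about -1.645, so H0 would not be rejected at the 0.05 level.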

Type II Error

Recall that the level of significance, identified by the Greek letter alpha (α), is the
probability that the null hypothesis is rejected when it is true. This is called a Type I
error.

In a hypothesis testing situation there is also the possibility that a null hypothesis is not
rejected when it is actually false. In other words we accept a false null hypothesis. This is
called a Type II error. The probability of a Type II error is identified by the Greek letter
beta (β).

The likelihood of a Type II error is found by text Formula [10–6]:


Type II Error
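One common way to compute β for a right-tailed z test is sketched below. It is not necessarily the form of the text's Formula [10-6], which is not reproduced here. The idea is to find the critical sample mean implied by α and then ask how likely a sample mean at or below that limit would be if the true population mean took some specific value. The hospital-stay figures (H0 mean 3.3, true mean 4.0) come from the earlier example; σ = 1.2, n = 36, and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def type_ii_error_right_tailed(mu0, true_mean, sigma, n, alpha):
    """Beta for a right-tailed z test of H0: mu <= mu0 when the true mean is true_mean."""
    standard_error = sigma / math.sqrt(n)
    # Largest sample mean for which H0 is still not rejected.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Probability the sample mean stays at or below that limit when H0 is false.
    return NormalDist(mu=true_mean, sigma=standard_error).cdf(critical_mean)

# sigma and n are hypothetical; 3.3 and 4.0 are from the hospital-stay example.
beta = type_ii_error_right_tailed(mu0=3.3, true_mean=4.0, sigma=1.2, n=36, alpha=0.05)
print(f"beta = {beta:.4f}")
```

Under these assumptions β is a little over 0.03, so a false H0 of 3.3 days would go undetected in roughly 3 of every 100 samples. A true mean closer to 3.3 would give a larger β.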
