Hypothesis Testing: Before Any Sample Readings Are Considered
HYPOTHESIS TESTING
Type II Error ~ occurs if one does not reject the null hypothesis when it is false.

Level of Significance (α) ~ the maximum probability of committing a type I error. That is,
P(type I error) = α.
α is decided by the researcher. It means that there is a 100α% chance of rejecting a true null hypothesis (for example, a 5% chance when α = 0.05).

The probability of a type II error is denoted as:
P(type II error) = β

One-Tailed Right Test ~ indicates that the null hypothesis should be rejected when the test value is in the critical region on the right side of the mean.
Eg. 3
Consider situation B in Eg. 1.

One-Tailed Left Test ~ indicates that the null hypothesis should be rejected when the test value is in the critical region on the left side of the mean.
Eg. 4
Consider situation C in Eg. 1.

Two-Tailed Test ~ the null hypothesis should be rejected when the test value is in either of the two critical regions.
Eg. 5
Consider situation A in Eg. 1.
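
The three kinds of test above differ only in where the critical region lies. A minimal Python sketch of that idea follows. It assumes a test statistic that is symmetric about zero (such as z or t) and a critical value supplied as a positive table magnitude. The function name `in_critical_region` and the sample numbers are illustrative, not part of the notes.

```python
def in_critical_region(test_value, critical_value, tail):
    """Return True when test_value falls in the critical region.

    critical_value is the positive magnitude read from a table;
    tail is 'right', 'left' or 'two'. Assumes a statistic that is
    symmetric about zero (z or t).
    """
    c = abs(critical_value)
    if tail == "right":
        return test_value > c          # region on the right side
    if tail == "left":
        return test_value < -c         # region on the left side
    if tail == "two":
        return abs(test_value) > c     # either of the two regions
    raise ValueError("tail must be 'right', 'left' or 'two'")


# Illustrative values only: z = 1.80 against common 5% z critical values.
print(in_critical_region(1.80, 1.645, "right"))  # one-tailed right test
print(in_critical_region(1.80, 1.960, "two"))    # two-tailed test
```

The same test value can be significant in a one-tailed test and not in a two-tailed test, because the two-tailed test splits α between both tails.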
1. Draw the figure and indicate the appropriate area.
2. Find the critical value corresponding to the area.
3. Determine the sign of the critical value.

STEP 4: Decide on The Decision Criteria
Decision Rule ~ a summary of whether to reject or not reject the null hypothesis, based on the comparison between the critical value and the test statistic.

STEP 5: Calculate the Test Statistic Value
This step can also be done in Step 2 when the statistical test is decided.
If the value of the test statistic lies in the critical region, reject H₀.
If the value of the test statistic does not lie in the critical region, do not reject H₀.

If H₀ is rejected at the 5% level, the test value is said to be ‘significant’.
If H₀ is rejected at the 1% level, the test value is said to be ‘highly significant’.

Eg. 7
Write the decision rule for each situation in Eg. 3 to Eg. 5.

Test on Sample Mean: σ Unknown
When the population is normally distributed, the distribution of sample means will also be normally distributed irrespective of sample size. However, if the population standard deviation is unknown, the appropriate test statistic is as follows:

$$T = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad T \sim t(n - 1)$$

Required condition: Population is normally distributed, that is X ~ N(µ, σ²).

Now consider the sample values when σ is unknown.
1506.8 1507.2 1506.6 1506.7 1506.9 1506.8
1506.6 1507.0 1507.5 1506.3 1506.4
Filled bags are supposed to have a mass of 1506.5 g. Assuming that the mass of a bag has a normal distribution, test whether the sample provides significant evidence at the 5% level that the machine produces overweight bags.
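
As a rough check of the σ-unknown test, the bag-mass sample quoted above can be run through the formula T = (x̄ − µ)/(s/√n). This is a sketch using only the Python standard library. The helper name `one_sample_t` is illustrative, and the critical value 1.812 is the usual t-table entry for 10 degrees of freedom with 5% in the upper tail, entered by hand as an assumption.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (T, degrees of freedom) for H0: mu = mu0, sigma unknown."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample sd, divisor n - 1
    t_value = (x_bar - mu0) / (s / math.sqrt(n))
    return t_value, n - 1


masses = [1506.8, 1507.2, 1506.6, 1506.7, 1506.9, 1506.8,
          1506.6, 1507.0, 1507.5, 1506.3, 1506.4]

t_value, df = one_sample_t(masses, 1506.5)

# Assumed table value: t(10) with 5% in the right tail (H1: mu > 1506.5).
T_CRITICAL = 1.812
print(f"T = {t_value:.2f}, df = {df}")
print("reject H0" if t_value > T_CRITICAL else "do not reject H0")
```

Because the alternative is "overweight", this is a one-tailed right test, so only the upper critical value is compared.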
Eg. 16
The same test was given to a group of 100 scouts and to a group of 144 guides. The mean score for the scouts was 27.53 and the mean score for the guides was 26.81. Assuming a common population standard deviation of 3.48, test, using a 5% level of significance, whether the scouts’ performance in the test was better than that of the guides. Assume that the scores are normally distributed.

With Unknown Common Variance
When there is an unknown common population variance σ², an estimate Sp² can be used.
The formula for Sp² is:

$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

where s₁² and s₂² are the sample variances of sample 1 and sample 2 respectively, and n₁ and n₂ are the sample sizes.

After obtaining the pooled variance Sp², the test statistic can be calculated as below:

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad T \sim t(n_1 + n_2 - 2)$$

Eg 17
Two different advertisements for mobile telephones were aired on two TV channels. The two advertisements were aired to two groups of 31 individuals each in the studio and they were asked to evaluate the advertisement they watched. The following results were obtained:

New Inspectors       Experienced Inspectors
x̄₁ = 32.77          Mean = 21.11
s₁² = 68.53          Variance = 121.11

It is found that the number of mistakes made by the inspectors above is distributed normally with equal variances. Do experienced inspectors make fewer mistakes than new inspectors? Test at the 5% significance level.

TESTS CONCERNING DIFFERENCE OF TWO MEANS (DEPENDENT SAMPLES)
If the samples are not independent, i.e. they are matched or paired, the test should be conducted as if there were one sample, where the difference between each pair is considered.
Hence, the statistical test would be:

$$T = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n_d}}, \qquad T \sim t(n_d - 1)$$

where,
µd = population mean difference in the observations
d̄ = sample mean difference in the observations
sd = standard deviation of the differences
nd = number of differences
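
The two-mean statistics in this part of the notes can be sketched in Python as follows. The function names are illustrative. `two_sample_z` assumes a known common σ (as in Eg. 16), `pooled_t` assumes equal but unknown variances, and `paired_t` treats the pairwise differences as one sample. The numbers are taken from Eg. 16 and from the typing-speed data of Eg. 19; critical values still have to be read from tables.

```python
import math
import statistics


def two_sample_z(mean1, mean2, sigma, n1, n2):
    """Z for H0: mu1 = mu2 with a known common standard deviation."""
    return (mean1 - mean2) / (sigma * math.sqrt(1 / n1 + 1 / n2))


def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """(T, df) for H0: mu1 = mu2 using the pooled variance Sp^2."""
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t_value = (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t_value, n1 + n2 - 2


def paired_t(before, after, mu_d=0.0):
    """(T, df) for dependent samples, using differences after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n_d = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)         # sample sd of the differences
    return (d_bar - mu_d) / (s_d / math.sqrt(n_d)), n_d - 1


# Eg. 16: scouts vs guides, common sigma = 3.48.
z_value = two_sample_z(27.53, 26.81, 3.48, 100, 144)

# Eg. 19: typing speeds before and after the course.
before = [25, 35, 35, 30, 45, 40]
after = [35, 40, 35, 40, 45, 45]
t_paired, df_paired = paired_t(before, after)

print(f"Eg. 16: Z = {z_value:.3f}")
print(f"Eg. 19: T = {t_paired:.3f}, df = {df_paired}")
```

Taking the differences as after − before makes an increase in speed show up as a positive T, which matches a one-tailed right test.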
Eg. 19
The data below show the typing speed, in words per minute, before and after a typing course.

Typist   Speed Before Course   Speed After Course
         (words per min)       (words per min)
A        25                    35
B        35                    40
C        35                    35
D        30                    40
E        45                    45
F        40                    45

Using a 1% level of significance, is there a significant increase in typing speed after attending the typing course?

Testing a Population Variance
The same steps of hypothesis testing are also used when making inferences about a population variance.

The hypotheses will be:
H₀: σ² = σ₀²
H₁: σ² ≠ σ₀²  or  H₁: σ² < σ₀²  or  H₁: σ² > σ₀²

The formula for the test statistic is as follows:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$

where χ² = chi-square variable
s² = sample variance
σ² = population variance as stated in the hypothesis
n = sample size

Required Condition: Population should be normally distributed.

Eg. 20
A pharmaceutical company uses a machine to pour cough mixtures into bottles in such a way that the standard deviation of the amount of mixture in each bottle is 0.12 ounces. A sample of 41 bottles of cough mixture was selected. The standard deviation of the amount of mixture in the bottles was 0.09 ounces. At the 5% level of significance can we conclude that there is a difference in the variability of the mixtures in each bottle? Assume that the amounts of mixture in the bottles are normally distributed.

Eg. 21
Putra Line acts as a feeder bus to transport passengers from various bus stops to the nearest LRT station. Drivers of Putra Line are required to maintain consistent schedules to enable passengers to catch the LRT on time. The management of the bus company has set a standard which specifies an arrival time with a variance of 5 minutes² or less. Arrival time is measured in minutes. A sample of 12 bus arrivals at a particular bus stop was collected. The variance of the arrival times was 5.9 minutes². At the 2.5% level of significance is there evidence to conclude that the variability of the arrival times is larger than that specified by the management of the company?

Testing the Difference Between Two Population Variances
The hypotheses when testing for the difference between two population variances are:

$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$$

$$H_1: \frac{\sigma_1^2}{\sigma_2^2} \neq 1 \quad \text{or} \quad H_1: \frac{\sigma_1^2}{\sigma_2^2} < 1 \quad \text{or} \quad H_1: \frac{\sigma_1^2}{\sigma_2^2} > 1$$

The test statistic is:

$$F = \frac{s_1^2}{s_2^2}$$

Required Condition: The populations are normally distributed.

Eg. 22
Ganesan works as a technician at Lingam’s Flour Mills. His job is to make sure that the packing machines are functioning properly. Ganesan knows that the variances of the weight of the packets of flour packed by each machine are an important measure in determining whether the machines are functioning well. A large variance would mean that the machine needs to be adjusted or repaired. Ganesan selected a sample of 9 packets of flour packed by machine A and 13 packets of flour packed by machine B. The mean and standard deviation of the weights of the flour are shown in the following table:

Machine A            Machine B
x̄₁ = 520 grams      x̄₂ = 520.5 grams
s₁ = 15 grams        s₂ = 30 grams
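
The two variance statistics can be sketched in a few lines of Python. The function names are illustrative. The chi-square critical values for 40 degrees of freedom (24.433 and 59.342, the usual table entries for 2.5% in each tail) are entered by hand as an assumption rather than computed. The data are those of Eg. 20 and Eg. 22.

```python
def chi_square_stat(n, sample_var, hypothesised_var):
    """Chi-square statistic (n - 1) s^2 / sigma^2, with n - 1 df."""
    return (n - 1) * sample_var / hypothesised_var


def f_stat(sample_var1, sample_var2):
    """F statistic s1^2 / s2^2 for comparing two population variances."""
    return sample_var1 / sample_var2


# Eg. 20: n = 41, s = 0.09, hypothesised sigma = 0.12 (two-tailed, 5%).
chi2_value = chi_square_stat(41, 0.09 ** 2, 0.12 ** 2)

# Assumed table values for chi-square with 40 df, 2.5% in each tail.
CHI2_LOWER, CHI2_UPPER = 24.433, 59.342
reject_h0 = chi2_value < CHI2_LOWER or chi2_value > CHI2_UPPER

# Eg. 22: the table gives standard deviations, so square them first.
f_value = f_stat(15 ** 2, 30 ** 2)

print(f"Eg. 20: chi-square = {chi2_value:.2f}, reject H0: {reject_h0}")
print(f"Eg. 22: F = {f_value:.2f} with df (8, 12)")
```

Both examples quote standard deviations, while the formulas use variances, so the values are squared before being passed in.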
However, if we’re testing
H₀: µ = 50 vs H₁: µ ≠ 50
the P-value obtained must be multiplied by 2. Hence the
P-value = 0.0356 × 2 = 0.0712

Eg. 26
From the output below, test the claim that the average salary for substitute teachers is less than RM60 per day. Use α = 0.05.

One-Sample T: Salary
Test of mu = 60 vs < 60

                                           90% Upper
Variable  N     Mean   StDev  SE Mean          Bound      T      P
Salary    8  58.8750  5.0832   1.7972        61.4179  -0.63  0.276
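
The P-value rule used here can be sketched as below. The helper names are illustrative. Doubling a one-tailed P-value assumes a test statistic with a symmetric distribution (z or t), and the convention "reject H₀ when P ≤ α" is assumed.

```python
def two_tailed_p(one_tailed_p):
    """Convert a one-tailed P-value to a two-tailed one (symmetric case)."""
    return min(1.0, 2 * one_tailed_p)


def reject_null(p_value, alpha):
    """Decision rule: reject H0 when the P-value does not exceed alpha."""
    return p_value <= alpha


# One-tailed P-value 0.0356 from the notes, doubled for H1: mu != 50.
p_two = two_tailed_p(0.0356)
print(f"two-tailed P = {p_two:.4f}, reject at 5%: {reject_null(p_two, 0.05)}")

# Eg. 26: one-tailed output, P = 0.276 against alpha = 0.05.
print(f"Eg. 26 reject at 5%: {reject_null(0.276, 0.05)}")
```

Note that the same data can lead to rejection in the one-tailed test (0.0356 ≤ 0.05) and not in the two-tailed test (0.0712 > 0.05).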
Jan 2018
a) Researchers speculate that drivers who do not wear a seatbelt are more likely to speed than drivers who do wear one. A random sample of 20 drivers had their speed measured at a certain point. Assume that the speed is approximately normally distributed. The summary outputs from the data analysis software are shown below.
Test and CI for Two Variances: no_seatbelt, with_seatbelt

Method
Null hypothesis         Sigma(no_seatbelt) / Sigma(with_seatbelt) = 1
Alternative hypothesis  Sigma(no_seatbelt) / Sigma(with_seatbelt) ≠ 1

Statistics
Variable        N  StDev  Variance
no_seatbelt     8  6.077    36.928
with_seatbelt  12  8.663    75.050

Tests
Method  DF1  DF2  Test Statistic  P-value
F test    7   11            0.49    0.356
i) Based on the output, investigate whether there is a difference in the variability of speed between the two groups of drivers. Test at the 5% significance level.
ii) Is there enough evidence to support the claim that drivers who do not wear seatbelts travel faster
on the average? Use 𝛼 = 0.05.
b) The average weekly loss of labour hours due to accidents in the factories is studied for 10 labourers before and after a new industrial safety awareness program. The Minitab output is shown below.
i) Construct the 95% confidence interval for the difference in the mean weekly loss of labour hours before and after the industrial safety awareness program is conducted. Interpret the confidence interval.
ii) Do the data provide evidence that the industrial safety awareness program is effective? Test at
1% level of significance.
T-Test of mean difference = 0 (vs > 0): T-Value = 4.03 P-Value = 0.001