Chapter 5

Chapter 5 discusses the estimation and testing of variance and standard deviation, including formulas for population and sample variances. It introduces the chi-square distribution for confidence intervals and hypothesis testing regarding variances, with examples illustrating the application of these concepts. Additionally, the chapter covers comparing two population variances using the F distribution and extends the discussion to more than two populations.

Uploaded by

Tigist G
Copyright
© © All Rights Reserved

Chapter 5: Inferences concerning Variance (Standard Deviation)

• Estimation of standard deviation
• Tests concerning standard deviation
• Tests concerning the ratio of two variances from normal populations
The variance is defined as the average of the squared deviations from the mean. That is,

Population variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}, \quad i = 1, 2, \ldots, N$$

Sample variance: The sample variance is denoted by $S^2$ and is given by

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}, \quad i = 1, 2, \ldots, n$$

The sample variance ($S^2$) is a point estimator of $\sigma^2$ with $n - 1$ degrees of freedom.
Standard deviation: The positive square root of the variance is called the standard deviation.
Therefore

Population standard deviation:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Sample standard deviation:
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

If sampling is from a normally distributed population, then the random variable $\frac{(n-1)S^2}{\sigma^2}$ has a
chi-square distribution with $n - 1$ degrees of freedom. That is,
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{(n-1)}$$
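The two variance formulas differ only in the divisor. A minimal sketch in Python, using a small hypothetical data set (the values are illustrative and do not come from this chapter), computes both by hand and checks them against the standard library:

```python
import math
import statistics

# Hypothetical measurements, used only to illustrate the formulas.
data = [4.0, 7.0, 5.0, 9.0, 10.0]

n = len(data)
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)

population_variance = sum_sq_dev / n      # divisor N: data treated as a whole population
sample_variance = sum_sq_dev / (n - 1)    # divisor n - 1: data treated as a sample
sample_sd = math.sqrt(sample_variance)    # positive square root of the sample variance

# The statistics module implements the same definitions.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
assert math.isclose(sample_sd, statistics.stdev(data))

print(population_variance, sample_variance, round(sample_sd, 4))
```

For the same data the sample variance is always the larger of the two, because the sum of squared deviations is divided by n - 1 rather than n.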

Confidence intervals for the population variance
To find confidence intervals for variances and standard deviations, you must assume that the
variable is normally distributed. To calculate these confidence intervals, a new statistical
distribution is needed. It is called the chi-square distribution. The chi-square variable is
similar to the t variable in that its distribution is a family of curves based on the number of
degrees of freedom. A chi-square variable cannot be negative, and the distributions are
skewed to the right. At about 100 degrees of freedom, the chi-square distribution becomes
somewhat symmetric. The area under each chi-square distribution is equal to 1.00, or
100%. Two different values are used in the formula because the distribution is not
symmetric.

Here $\chi^2_{a,\,(n-1)}$ denotes the chi-square value with $n - 1$ degrees of freedom that has area $a$ to its right.

$$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$$

Since $\frac{(n-1)S^2}{\sigma^2}$ follows a chi-square distribution,

$$\Rightarrow P\left(\chi^2_{1-\alpha/2} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2}\right) = 1 - \alpha$$

$$\Rightarrow P\left(\frac{1}{\chi^2_{\alpha/2,\,(n-1)}} < \frac{\sigma^2}{(n-1)S^2} < \frac{1}{\chi^2_{1-\alpha/2,\,(n-1)}}\right) = 1 - \alpha$$

$$\Rightarrow P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,(n-1)}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,(n-1)}}\right) = 1 - \alpha$$

Therefore the (1-α)100% confidence interval for $\sigma^2$ is given by:

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,(n-1)}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,(n-1)}}$$

The (1-α)100% confidence interval for $\sigma$ is given by:

$$\sqrt{\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,(n-1)}}} < \sigma < \sqrt{\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,(n-1)}}}$$

Example: Find the 95% confidence interval for the variance and standard deviation of the
nicotine content of cigarettes manufactured if a sample of 20 cigarettes has a standard deviation
of 1.6 milligrams.
Hence, you can be 95% confident that the true standard deviation for the nicotine content
of all cigarettes manufactured is between 1.2 and 2.3 milligrams based on a sample of 20
cigarettes.
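The arithmetic behind the interval quoted above can be sketched as follows. The function name `variance_confidence_interval` is hypothetical, and the two critical values for 19 degrees of freedom (32.852 and 8.907) are assumed to be read from a standard chi-square table; they are not stated in the text of the example.

```python
import math

def variance_confidence_interval(n, sample_variance, chi2_upper, chi2_lower):
    """Return (lower, upper) limits of the confidence interval for sigma^2.

    chi2_upper is the table value with right-tail area alpha/2 and
    chi2_lower the table value with right-tail area 1 - alpha/2,
    both with n - 1 degrees of freedom.
    """
    numerator = (n - 1) * sample_variance
    return numerator / chi2_upper, numerator / chi2_lower

n = 20          # sample size from the example
s = 1.6         # sample standard deviation in milligrams

# Assumed table values for 19 degrees of freedom at the 95% level.
var_low, var_high = variance_confidence_interval(n, s ** 2, 32.852, 8.907)

# The interval for sigma is obtained by taking square roots of both limits.
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(round(var_low, 1), round(var_high, 1))   # limits for the variance
print(round(sd_low, 1), round(sd_high, 1))     # limits for the standard deviation
```

Under these assumed table values the variance limits come out near 1.5 and 5.5, and their square roots reproduce the 1.2 and 2.3 milligrams stated in the example.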
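Hypothesis testing about $\sigma^2$

$$H_0: \sigma^2 = \sigma_0^2 \qquad H_1: \sigma^2 \ne \sigma_0^2 \qquad \text{(two-tailed)}$$
$$H_0: \sigma^2 = \sigma_0^2 \qquad H_1: \sigma^2 > \sigma_0^2 \qquad \text{(one-tailed)}$$
$$H_0: \sigma^2 = \sigma_0^2 \qquad H_1: \sigma^2 < \sigma_0^2 \qquad \text{(one-tailed)}$$

Test statistic:
$$\chi^2_{cal} = \frac{(n-1)S^2}{\sigma_0^2}$$

Decision rule:
If $H_1: \sigma^2 \ne \sigma_0^2$, reject $H_0$ if $\chi^2_{cal} > \chi^2_{\alpha/2,\,n-1}$ or $\chi^2_{cal} < \chi^2_{1-\alpha/2,\,n-1}$
If $H_1: \sigma^2 > \sigma_0^2$, reject $H_0$ if $\chi^2_{cal} > \chi^2_{\alpha,\,n-1}$
If $H_1: \sigma^2 < \sigma_0^2$, reject $H_0$ if $\chi^2_{cal} < \chi^2_{1-\alpha,\,n-1}$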
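The decision rule can be written as a short function. This is a sketch under stated assumptions: the name `chi_square_variance_test` and its parameters are hypothetical, and the critical values are passed in by the caller after being looked up in a chi-square table. The usage line takes its numbers from the cigarette example later in this section (n = 20, S = 1.00, claimed variance 0.644), with assumed table values 32.852 and 8.907 for 19 degrees of freedom.

```python
def chi_square_variance_test(n, sample_variance, sigma0_sq, alternative,
                             crit_upper=None, crit_lower=None):
    """Return (test statistic, reject H0?) for a test about one variance.

    alternative is "two-sided", "greater" or "less".
    crit_upper and crit_lower are table values supplied by the caller:
      two-sided: crit_upper = chi2(alpha/2), crit_lower = chi2(1 - alpha/2)
      greater:   crit_upper = chi2(alpha)
      less:      crit_lower = chi2(1 - alpha)
    """
    statistic = (n - 1) * sample_variance / sigma0_sq
    if alternative == "two-sided":
        reject = statistic > crit_upper or statistic < crit_lower
    elif alternative == "greater":
        reject = statistic > crit_upper
    elif alternative == "less":
        reject = statistic < crit_lower
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return statistic, reject

# Cigarette example: H0: sigma^2 = 0.644 against H1: sigma^2 != 0.644.
stat, reject = chi_square_variance_test(
    n=20, sample_variance=1.00 ** 2, sigma0_sq=0.644,
    alternative="two-sided", crit_upper=32.852, crit_lower=8.907,
)
print(round(stat, 3), reject)
```

Passing the critical values in, rather than computing them, keeps the sketch dependent only on a printed table, which matches how the chapter works its examples.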

Examples:
1. An instructor wishes to see whether the variation in scores of the 23 students in her class is
less than the variance of the population. The variance of the class is 198. Is there enough
evidence to support the claim that the variation of the students is less than the population
variance ($\sigma^2 = 225$) at $\alpha = 0.05$? Assume that the scores are normally distributed.
Example 2. A cigarette manufacturer wishes to test the claim that the variance of the nicotine
content of its cigarettes is 0.644. Nicotine content is measured in milligrams; assume that it
is normally distributed. A sample of 20 cigarettes has a standard deviation of 1.00 milligram. At
$\alpha = 0.05$, is there enough evidence to reject the manufacturer’s claim?
Summarize the results. There is not enough evidence to reject the manufacturer’s claim
that the variance of the nicotine content of the cigarettes is equal to 0.644.
2. A social worker believes that the aid given to refugees in a camp depends on the variation of
the age of migrants. The sample variance of the age of 101 randomly selected refugees in a
camp is found to be 31.
a) At $\alpha = 0.05$, test $H_0: \sigma^2 = 25$ against $H_1: \sigma^2 \ne 25$.
b) Construct a 95% confidence interval for the variance.
c) Construct a 95% confidence interval for the standard deviation.
3. In drug manufacturing it is important not only that the amount of drug in the capsules be a
particular value on average, but also that the variation around that value be very small. The
drug company will consider its machine accurate enough if the capsules are filled within a
variation of $\sigma^2 = 0.25$ mg². Data are collected for 20 capsules and the sample standard
deviation is found to be 0.787. Is this variability significantly greater than what the company
will tolerate? Use $\alpha = 0.05$.
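Solution:
a)
Step 1: $H_0: \sigma^2 = 25$, $H_1: \sigma^2 \ne 25$
Step 2: $\alpha = 0.05$
Step 3: Chi-square distribution
Step 4: Rejection region: reject $H_0$ if $\chi^2_{cal} > \chi^2_{0.025,\,100} = 129.561$ or $\chi^2_{cal} < \chi^2_{0.975,\,100} = 74.222$
Step 5: Test statistic:
$$\chi^2_{cal} = \frac{(n-1)S^2}{\sigma_0^2} = \frac{100 \times 31}{25} = 124$$
Step 6: Since 124 is between 74.222 and 129.561, do not reject $H_0$.
Step 7: At $\alpha = 0.05$, there is not enough evidence to conclude that the variance of the age of refugees differs from 25.
b) 95% CI for $\sigma^2$:
$$\frac{(n-1)S^2}{\chi^2_{0.025,\,100}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{0.975,\,100}}$$
$$\frac{100 \times 31}{129.561} < \sigma^2 < \frac{100 \times 31}{74.222}$$
$$23.927 < \sigma^2 < 41.767$$
Conclusion: We are 95% confident that the population variance is between 23.927 and 41.767.
c) 95% CI for $\sigma$:
$$\sqrt{\frac{(n-1)S^2}{\chi^2_{0.025,\,100}}} < \sigma < \sqrt{\frac{(n-1)S^2}{\chi^2_{0.975,\,100}}}$$
$$\sqrt{23.927} < \sigma < \sqrt{41.767}$$
$$4.89 < \sigma < 6.46$$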
Solution to problem 3 (drug capsules):
Step 1: $H_0: \sigma^2 = 0.25$, $H_1: \sigma^2 > 0.25$
Step 2: $\alpha = 0.05$
Step 3: Chi-square distribution
Step 4: Rejection region: reject $H_0$ if $\chi^2_{cal} > \chi^2_{0.05,\,19} = 30.144$
Step 5: Test statistic:
$$\chi^2_{cal} = \frac{(n-1)S^2}{\sigma_0^2} = \frac{19 \times (0.787)^2}{0.25} = 47.072$$
Step 6: Since $\chi^2_{cal} = 47.072 > \chi^2_{0.05,\,19} = 30.144$, reject $H_0$.
Step 7: At $\alpha = 0.05$, the variability is greater than the company can tolerate.
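The figures in the two worked solutions can be re-checked with a few lines of arithmetic. This is a sketch only: the variable names are hypothetical, and the critical values are the table values quoted in the solutions (129.561, 74.222 and 30.144).

```python
import math

# Refugee problem: n = 101, S^2 = 31, hypothesised variance 25.
n, s2, sigma0_sq = 101, 31.0, 25.0
chi2_upper, chi2_lower = 129.561, 74.222   # values quoted in the solution

chi2_cal = (n - 1) * s2 / sigma0_sq
reject_two_tailed = chi2_cal > chi2_upper or chi2_cal < chi2_lower

# 95% confidence limits for the variance and the standard deviation.
ci_var = ((n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower)
ci_sd = (math.sqrt(ci_var[0]), math.sqrt(ci_var[1]))

# Drug capsule problem: n = 20, S = 0.787, tolerated variance 0.25.
n_caps, s_caps, sigma0_sq_caps = 20, 0.787, 0.25
chi2_crit = 30.144                          # value quoted in the solution

chi2_caps = (n_caps - 1) * s_caps ** 2 / sigma0_sq_caps
reject_right_tailed = chi2_caps > chi2_crit

print(chi2_cal, reject_two_tailed)
print([round(v, 3) for v in ci_var], [round(v, 2) for v in ci_sd])
print(round(chi2_caps, 3), reject_right_tailed)
```

Note that the capsule statistic of 47.072 is obtained only when the sample standard deviation 0.787 is squared before multiplying by 19.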
Exercises:
1. In the production of synthetic fibers the tensile strength of the fibers should not vary too
much. A random sample of 8 pieces is tested. It is found that $S^2 = 37.75$ kg². Under the
assumption of normality, construct the 95% confidence interval for $\sigma^2$.
2. The operating life for a random sample of n = 10 light bulbs has a sample standard
deviation of S = 200 hr. The operating life of bulbs in general is assumed to be normally
distributed. Suppose that before the sample was collected, it was claimed that the
population standard deviation is no larger than $\sigma = 150$. Based on the sample results, test
this claim at the 1 percent level of significance.
Estimation and Tests for Comparing Two Population Variances
In addition to comparing two means, statisticians are interested in comparing two variances or standard
deviations. For example, is the variation in the temperatures for a certain month for two cities different?
In another situation, a researcher may be interested in comparing the variances of two groups. For the
comparison of two variances or standard deviations, an F test is used.
If two independent samples are selected from two normally distributed populations in which the variances
are equal ($\sigma_1^2 = \sigma_2^2$) and if the sample variances $S_1^2$ and $S_2^2$ are compared as the
ratio $S_1^2 / S_2^2$, the sampling distribution of this ratio is called the F distribution.

Characteristics of the F Distribution


1. The values of F cannot be negative, because variances are always positive or zero.
2. The distribution is positively skewed.
3. The mean value of F is approximately equal to 1.
4. The F distribution is a family of curves based on the degrees of freedom of the variance of the
numerator and the degrees of freedom of the variance of the denominator.
Formula for the F Test

$F = \frac{S_1^2}{S_2^2}$, where the larger of the two variances is placed in the numerator regardless of the
subscripts. The F test has two terms for the degrees of freedom: that of the numerator, $n_1 - 1$, and
that of the denominator, $n_2 - 1$, where $n_1$ is the sample size from which the larger variance was obtained.
NOTE: When you are finding the F test value, the larger of the variances is placed in the numerator of the
F formula; this is not necessarily the variance of the larger of the two sample sizes.

Assumptions for Testing the Difference between Two Variances


1. The populations from which the samples were obtained must be normally distributed.
(Note: The test should not be used when the distributions depart from normality.)
2. The samples must be independent of each other.
Example: The standard deviation of the average waiting time to see a doctor for non-life threatening
problems in the emergency room at an urban hospital is 32 minutes. At a second hospital, the standard
deviation is 28 minutes. If a sample of 16 patients was used in the first case and 18 in the second case, is
there enough evidence to conclude that the standard deviation of the waiting times in the first hospital is
greater than the standard deviation of the waiting times in the second hospital?
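The test value for this example can be sketched as below. The function name `f_statistic` is hypothetical. Because the text of the example does not state a significance level, the sketch stops at the test value and degrees of freedom and does not look up a critical value.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df).

    The larger sample variance is placed in the numerator, whichever
    sample it comes from, so the returned F is never less than 1.
    """
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# First hospital: s = 32 minutes, n = 16. Second hospital: s = 28 minutes, n = 18.
# Standard deviations are squared to obtain the variances.
f_value, df_num, df_den = f_statistic(32 ** 2, 16, 28 ** 2, 18)

print(round(f_value, 3), df_num, df_den)
```

The test value would then be compared with the F table entry for 15 and 17 degrees of freedom at whatever significance level is chosen.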

Example. A medical researcher wishes to see whether the variance of the heart rates (in beats per minute)
of smokers is different from the variance of heart rates of people who do not smoke. Two samples are
selected, and the data are as shown. Using $\alpha = 0.05$, is there enough evidence to support the claim?
Exercise 1: An experiment was conducted to determine whether there was sufficient evidence to indicate
that data variation within one population, say population A, exceeded the variation within a second
population, population B. Random samples of $n_1 = n_2 = 8$ measurements were selected from the two
populations and the sample variances were calculated to be

Tests for Comparing more than two Population Variances


In the previous section, we discussed a method for comparing variances from two normally distributed
populations based on taking independent random samples from the populations. In many situations, we
will need to compare more than two populations. For example, we may want to compare the variability in
the level of nutrients of five different suppliers of a feed supplement or the variability in scores of the
students using SAT preparatory materials from the three major publishers of those materials. Thus, we
need to develop a statistical test that will allow us to compare more than two population
variances. The first procedure, Hartley’s test, is very simple to apply.
Hartley F max test for evaluating the hypotheses
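As a rough sketch of the statistic commonly used in Hartley's test (the rest of this passage is missing from the source, so this is an assumption about what follows): for $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$, the test value is the ratio of the largest sample variance to the smallest, and it is compared with an F-max table value that depends on the number of groups and on n - 1. The usual form assumes normal populations and equal sample sizes. The function name and the three variances below are hypothetical.

```python
def hartley_f_max(sample_variances):
    """Return the F-max statistic: largest sample variance / smallest sample variance."""
    if len(sample_variances) < 2:
        raise ValueError("at least two sample variances are required")
    if min(sample_variances) <= 0:
        raise ValueError("sample variances must be positive")
    return max(sample_variances) / min(sample_variances)

# Hypothetical sample variances from three groups of equal size.
variances = [12.0, 20.0, 30.0]
f_max = hartley_f_max(variances)

print(f_max)
```

A value close to 1 is consistent with equal variances; larger values count as evidence against $H_0$ once they exceed the tabulated critical value.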
