Elementary Statistics: Describing, Exploring, and Comparing Data
Thirteenth Edition
Chapter 3
Describing, Exploring, and Comparing Data
Copyright © 2018, 2014, 2012 Pearson Education, Inc. All Rights Reserved
Describing, Exploring, and
Comparing Data
3-1 Measures of Center
3-2 Measures of Variation
3-3 Measures of Relative Standing and Boxplots
Key Concept
Variation is the single most important topic in statistics,
so this is the single most important section in this book.
This section presents three important measures of
variation: range, standard deviation, and variance.
These statistics are numbers, but our focus is not only on
computing them but also on developing the ability to
interpret and understand them.
Round-off Rule for Measures of
Variation
• Round-off Rule for Measures of Variation
– When rounding the value of a measure of
variation, carry one more decimal place than is
present in the original set of data.
Range
• Range
– The range of a set of data values is the difference
between the maximum data value and the
minimum data value.
Range = (maximum data value) − (minimum data value)
Important Property of Range
• The range uses only the maximum and the minimum
data values, so it is very sensitive to extreme values.
The range is not resistant.
• Because the range uses only the maximum and
minimum values, it does not take every value into
account and therefore does not truly reflect the
variation among all of the data values.
Example: Range
Find the range of these Verizon data speeds (Mbps):
38.5, 55.6, 22.4, 14.1, 23.1.
Solution
Range = (maximum value) − (minimum value)
= 55.6 − 14.1 = 41.50 Mbps
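The same calculation can be sketched in a few lines of Python. This is only an illustration; the variable names `speeds` and `data_range` are not from the text.

```python
# Verizon data speeds (Mbps) from the example
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

# Range = (maximum data value) - (minimum data value)
data_range = max(speeds) - min(speeds)

# Round-off rule: carry one more decimal place than the original data
print(f"{data_range:.2f} Mbps")  # 41.50 Mbps
```

Only two of the five values enter the calculation, which is why a single extreme value can change the range so much.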
Standard Deviation of a Sample (1 of 2)
• Standard Deviation
– The standard deviation of a set of sample values,
denoted by s, is a measure of how much data
values deviate away from the mean.
Notation
s = sample standard deviation
σ = population standard deviation
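Standard Deviation of a Sample (2 of 2)
• Standard Deviation
– Sample standard deviation formula:
s = √[ Σ(x − x̄)² / (n − 1) ]
– Shortcut formula (algebraically equivalent):
s = √[ (n·Σx² − (Σx)²) / (n(n − 1)) ]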
Important Properties of Standard
Deviation (1 of 2)
• The standard deviation is a measure of how
much data values deviate away from the mean.
• The value of the standard deviation s is never
negative. It is zero only when all of the data
values are exactly the same.
• Larger values of s indicate greater amounts of
variation.
Important Properties of Standard
Deviation (2 of 2)
• The standard deviation s can increase dramatically
with one or more outliers.
• The units of the standard deviation s (such as
minutes, feet, pounds) are the same as the units of
the original data values.
• The sample standard deviation s is a biased
estimator of the population standard deviation σ,
which means that values of the sample standard
deviation s do not center around the value of σ.
Example: Calculating Standard
Deviation
Use the sample standard deviation formula to find the standard
deviation of these Verizon data speeds (in Mbps): 38.5,
55.6, 22.4, 14.1, 23.1.
Solution
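A minimal sketch of the computation in Python, assuming the defining formula s = √[ Σ(x − x̄)² / (n − 1) ]. The variable names are illustrative and do not come from the text.

```python
import math

# Verizon data speeds (Mbps) from the example
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

n = len(speeds)
mean = sum(speeds) / n                            # x-bar = 30.74

# Squared deviations from the mean; their sum is about 1083.052
squared_devs = [(x - mean) ** 2 for x in speeds]

# Divide by n - 1 because the data are a sample, then take the square root
s = math.sqrt(sum(squared_devs) / (n - 1))

print(round(mean, 2), round(s, 2))                # 30.74 16.45
```

Carrying one more decimal place than the original data gives s ≈ 16.45 Mbps.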
Example: Calculating Standard
Deviation Using Shortcut Formula
Use the shortcut formula to find the standard deviation of these
Verizon data speeds (Mbps): 38.5, 55.6, 22.4, 14.1, 23.1.
Solution
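A sketch of the shortcut computation in Python, assuming the shortcut formula s = √[ (n·Σx² − (Σx)²) / (n(n − 1)) ]. The names `sum_x`, `sum_x2`, and `s_shortcut` are illustrative; the standard library's `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics

# Verizon data speeds (Mbps) from the example
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

n = len(speeds)
sum_x = sum(speeds)                     # sum of x, about 153.7
sum_x2 = sum(x * x for x in speeds)     # sum of x squared, about 5807.79

# Shortcut formula: no need to compute the mean or the deviations first
s_shortcut = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(round(s_shortcut, 2))             # 16.45

# Cross-check against the library's sample standard deviation
assert math.isclose(s_shortcut, statistics.stdev(speeds))
```

Both formulas give the same result, s ≈ 16.45 Mbps.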
Range Rule of Thumb for
Understanding Standard Deviation
• Range Rule of Thumb
– The range rule of thumb is a crude but simple
tool for understanding and interpreting standard
deviation. The vast majority (such as 95%) of
sample values lie within 2 standard deviations of
the mean.
Range Rule of Thumb for Identifying
Significant Values
• Significantly low values are µ − 2σ or lower.
• Significantly high values are µ + 2σ or higher.
• Values not significant are between (µ − 2σ ) and
(µ + 2σ).
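The three cases above can be sketched as a small Python function. The function name `classify` and the sample inputs (a mean of 100 and a standard deviation of 15) are chosen here for illustration only.

```python
def classify(value, mu, sigma):
    """Apply the range rule of thumb for identifying significant values."""
    low = mu - 2 * sigma    # values at or below this are significantly low
    high = mu + 2 * sigma   # values at or above this are significantly high
    if value <= low:
        return "significantly low"
    if value >= high:
        return "significantly high"
    return "not significant"


# Illustrative inputs: mean 100, standard deviation 15
print(classify(135, 100, 15))  # significantly high
print(classify(100, 100, 15))  # not significant
print(classify(70, 100, 15))   # significantly low
```

The boundary values µ − 2σ and µ + 2σ themselves count as significant, matching the "or lower" and "or higher" wording.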
Range Rule of Thumb for Estimating
a Value of the Standard Deviation s
• Range Rule of Thumb for Estimating a Value of
the Standard Deviation
– To roughly estimate the standard deviation from a
collection of known sample data, use
s ≈ range / 4
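As a rough illustration, the estimate s ≈ range / 4 can be applied to the five Verizon data speeds used in the earlier examples. This is a sketch only; the variable name `s_estimate` is not from the text.

```python
import statistics

# Verizon data speeds (Mbps) from the earlier examples
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

# Range rule of thumb: s is roughly the range divided by 4
s_estimate = (max(speeds) - min(speeds)) / 4

# Exact sample standard deviation, for comparison
s_exact = statistics.stdev(speeds)

print(f"estimate: {s_estimate:.1f}, exact: {s_exact:.1f}")  # estimate: 10.4, exact: 16.5
```

With only five values the estimate is far from the exact result, a reminder that the rule is a crude tool rather than a substitute for the formula.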
Standard Deviation of a Population
• Standard Deviation of a Population
– A different formula is used to calculate the
standard deviation σ of a population: Instead of
dividing by n − 1 for a sample, we divide by the
population size N.
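The effect of the divisor can be seen with Python's standard `statistics` module, where `stdev` divides by n − 1 and `pstdev` divides by N. Treating the five Verizon speeds as an entire population is a hypothetical made here purely for comparison.

```python
import statistics

# Verizon data speeds (Mbps) from the earlier examples
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

s = statistics.stdev(speeds)        # sample formula: divide by n - 1
sigma = statistics.pstdev(speeds)   # population formula: divide by N

print(f"{s:.2f} {sigma:.2f}")       # 16.45 14.72
```

Dividing by the larger number N always gives the smaller of the two results for the same data.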
Variance of a Sample and a
Population
• Variance
– The variance of a set of values is a measure of
variation equal to the square of the standard
deviation.
Sample variance: s² = square of the standard
deviation s.
Population variance: σ² = square of the population
standard deviation σ.
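A short Python sketch of both variances for the Verizon data speeds from the earlier examples. Using the same five values for the population variance is a hypothetical, included only to show the two functions side by side.

```python
import statistics

# Verizon data speeds (Mbps) from the earlier examples
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

s2 = statistics.variance(speeds)       # sample variance, divides by n - 1
sigma2 = statistics.pvariance(speeds)  # population variance, divides by N

# Units are squared: Mbps^2
print(f"{s2:.2f} {sigma2:.2f}")        # 270.76 216.61

# The variance is the square of the corresponding standard deviation
assert abs(s2 - statistics.stdev(speeds) ** 2) < 1e-9
```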
Notation Summary
s = sample standard deviation
s² = sample variance
σ = population standard deviation
σ² = population variance
Important Properties of Variance
• The units of the variance are the squares of the units
of the original data values.
• The value of the variance can increase dramatically
with the inclusion of outliers. (The variance is not
resistant.)
• The value of the variance is never negative. It is zero
only when all of the data values are the same number.
• The sample variance s² is an unbiased estimator of
the population variance σ².
Why Divide by (n – 1)?
• There are only n − 1 values that can be assigned without
constraint. With a given mean, we can use any
numbers for the first n − 1 values, but the last value
will then be automatically determined.
• With division by n − 1, sample variances s² tend to
center around the value of the population variance σ²;
with division by n, sample variances s² tend to
underestimate the value of the population variance σ².
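The second point can be checked by brute force on a tiny example. The sketch below assumes a hypothetical population of three values (4, 5, 9) and lists every equally likely sample of size 2 drawn with replacement; the helper name `sample_variance` is also illustrative.

```python
from itertools import product
from statistics import mean, pvariance

population = [4, 5, 9]   # hypothetical population, chosen for illustration
n = 2                    # sample size


def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample) / divisor


# All 9 equally likely samples of size 2, drawn with replacement
samples = list(product(population, repeat=n))

avg_with_n_minus_1 = mean(sample_variance(s, n - 1) for s in samples)
avg_with_n = mean(sample_variance(s, n) for s in samples)
sigma2 = pvariance(population)

print(f"{sigma2:.3f} {avg_with_n_minus_1:.3f} {avg_with_n:.3f}")  # 4.667 4.667 2.333
```

Averaged over all possible samples, division by n − 1 reproduces the population variance exactly in this example, while division by n gives only half of it.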
Empirical Rule for Data with a Bell-Shaped Distribution
The empirical rule states that for data sets having a
distribution that is approximately bell-shaped, the
following properties apply.
• About 68% of all values fall within 1 standard deviation of
the mean.
• About 95% of all values fall within 2 standard deviations of
the mean.
• About 99.7% of all values fall within 3 standard deviations
of the mean.
The Empirical Rule
Example: The Empirical Rule (1 of 2)
IQ scores have a bell-shaped distribution with a mean
of 100 and a standard deviation of 15. What percentage
of IQ scores are between 70 and 130?
Example: The Empirical Rule (2 of 2)
Solution
The key is to recognize that 70 and 130 are each exactly
2 standard deviations away from the mean of 100.
2 standard deviations = 2s = 2(15) = 30
2 standard deviations from the mean is
100 − 30 = 70
or 100 + 30 = 130
About 95% of all IQ scores are between 70 and 130.
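The "about 95%" figure can be compared with an idealized model. The sketch below assumes that IQ scores follow an exactly normal distribution, which is an assumption added here and stronger than "bell-shaped"; it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Assumption: IQ scores are exactly normal with mean 100 and standard deviation 15
iq = NormalDist(mu=100, sigma=15)

# Proportion of scores between 70 and 130 (within 2 standard deviations)
p = iq.cdf(130) - iq.cdf(70)

print(f"{p:.4f}")  # 0.9545
```

Under that assumption the proportion is about 95.45%, consistent with the empirical rule's rounded value of 95%.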
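Chebyshev’s Theorem
The proportion of any set of data lying within K standard
deviations of the mean is always at least 1 − 1/K², where K is
any positive number greater than 1. For K = 2 and K = 3:
• At least 3/4 (or 75%) of all values lie within 2 standard
deviations of the mean.
• At least 8/9 (or 89%) of all values lie within 3 standard
deviations of the mean.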
Example: Chebyshev’s Theorem (1 of 2)
IQ scores have a mean of 100 and a standard deviation
of 15. What can we conclude from Chebyshev’s
theorem?
Example: Chebyshev’s Theorem (2 of 2)
Solution
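Applying Chebyshev’s theorem with a mean of 100 and
a standard deviation of 15, we can reach the following
conclusions:
• At least 3/4 (or 75%) of IQ scores are within 2 standard
deviations of the mean (between 70 and 130).
• At least 8/9 (or 89%) of IQ scores are within 3 standard
deviations of the mean (between 55 and 145).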
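The bound from Chebyshev's theorem, at least 1 − 1/K² of the values within K standard deviations of the mean, can be evaluated directly. A short Python sketch, with a hypothetical helper named `chebyshev_bound`:

```python
def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


mean, sd = 100, 15   # IQ scores from the example

for k in (2, 3):
    low, high = mean - k * sd, mean + k * sd
    print(f"at least {chebyshev_bound(k):.0%} of scores lie between {low} and {high}")
# at least 75% of scores lie between 70 and 130
# at least 89% of scores lie between 55 and 145
```

Unlike the empirical rule, these bounds hold for any distribution, which is why they are lower ("at least 75%" rather than "about 95%").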
Comparing Variation in Different
Samples or Populations
• Coefficient of Variation
– The coefficient of variation (or CV) for a set of
nonnegative sample or population data, expressed
as a percent, describes the standard deviation
relative to the mean, and is given by the following:
Sample: CV = (s / x̄) · 100%
Population: CV = (σ / µ) · 100%
Round-off Rule for the Coefficient of
Variation
Round the coefficient of variation to one decimal place
(such as 25.3%).
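As an illustration, the sample coefficient of variation, CV = (s / x̄) · 100%, can be computed for the Verizon data speeds from the earlier examples. This is a sketch; the variable name `cv` is illustrative.

```python
import statistics

# Verizon data speeds (Mbps) from the earlier examples
speeds = [38.5, 55.6, 22.4, 14.1, 23.1]

s = statistics.stdev(speeds)    # sample standard deviation
xbar = statistics.mean(speeds)  # sample mean

# Standard deviation relative to the mean, expressed as a percent
cv = s / xbar * 100

# Round-off rule: one decimal place
print(f"CV = {cv:.1f}%")        # CV = 53.5%
```

Because the units cancel, the CV has no units, which is what allows variation to be compared across data sets measured on different scales.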
Biased and Unbiased Estimators
• The sample standard deviation s is a biased
estimator of the population standard deviation σ,
which means that values of the sample standard
deviation s do not tend to center around the value
of the population standard deviation σ.
• The sample variance s² is an unbiased estimator
of the population variance σ², which means that
values of s² tend to center around the value of σ²
instead of systematically tending to overestimate or
underestimate σ².
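The bias of s can be seen by listing every possible sample from a tiny population. The sketch below assumes a hypothetical population of three values (4, 5, 9) and all equally likely samples of size 2 drawn with replacement; neither the population nor the variable names come from the text.

```python
from itertools import product
from statistics import mean, pstdev, stdev

population = [4, 5, 9]   # hypothetical population, chosen for illustration

# All 9 equally likely samples of size 2, drawn with replacement
samples = list(product(population, repeat=2))

avg_s = mean(stdev(sample) for sample in samples)  # average of the sample standard deviations
sigma = pstdev(population)                         # population standard deviation

print(f"{avg_s:.3f} {sigma:.3f}")                  # 1.571 2.160
```

In this example the sample standard deviations average about 1.571, well below σ ≈ 2.160, so the values of s do not center around σ.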