Chapter 2
Descriptive Statistics
Copyright 2019, Pearson Education, 1
Chapter Outline
• 2.1 Frequency Distributions and Their Graphs
• 2.2 More Graphs and Displays
• 2.3 Measures of Central Tendency
• 2.4 Measures of Variation
• 2.5 Measures of Position
Section 2.1
Frequency Distributions
and Their Graphs
Frequency Distribution
• A table that shows classes or intervals of data with a count of the number of entries in each class.
• The frequency, f, of a class is the number of data entries in the class.
Class      Frequency, f
1 – 5      5
6 – 10     8
11 – 15    6
16 – 20    8
21 – 25    5
26 – 30    4
• Each class has a lower class limit, which is the least number that can belong to the class, and an upper class limit, which is the greatest number that can belong to the class.
• In the table above, the lower class limits are 1, 6, 11, 16, 21, and 26, and the upper class limits are 5, 10, 15, 20, 25, and 30.
• The class width is the distance between lower (or upper) limits of consecutive classes. In the table above, the class width is 6 – 1 = 5.
• The difference between the maximum (30) and minimum (1) data entries is called the range.
Constructing a Frequency
Distribution
1. Decide on the number of classes.
Usually between 5 and 20; otherwise, it may be
difficult to detect any patterns.
2. Find the class width.
Determine the range of the data.
Divide the range by the number of classes.
Round up to the next convenient number.
3. Find the class limits.
You can use the minimum data entry as the
lower limit of the first class.
Find the remaining lower limits (add the class
width to the lower limit of the preceding class).
Find the upper limit of the first class. Remember
that classes cannot overlap.
Find the remaining upper class limits.
Example: Constructing a
Frequency Distribution
The data set lists the out-of-pocket prescription
medicine expenses (in dollars) for 30 U.S. adults in a
recent year. Construct a frequency distribution that has
seven classes. (Adapted from: Health, United States,
2015)
200 239 155 252 384 165 296 405 303 400
307 241 256 315 330 317 352 266 276 345
238 306 290 271 345 312 293 195 168 342
Solution: Constructing a Frequency Distribution
1. Number of classes: 7 (given).
2. Class width: the range is the maximum data entry (405) minus the minimum data entry (155), or 250. Dividing by 7 classes gives 250 / 7 ≈ 35.71, which rounds up to a class width of 36.
3. Use 155 (the minimum data entry) as the first lower limit. Add the class width of 36 to get the lower limit of the next class: 155 + 36 = 191. Find the remaining lower limits in the same way (227, 263, 299, 335, and 371).
The upper limit of the first class is 190 (one less than the lower limit of the second class). Add the class width of 36 to get the upper limit of the next class: 190 + 36 = 226. Find the remaining upper limits in the same way (262, 298, 334, 370, and 406).
4. Make a tally mark for each data entry in the row of
the appropriate class.
5. Count the tally marks to find the total frequency f
for each class.
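The five steps can be sketched in Python for the prescription expense data. This is a minimal illustration, not the textbook's procedure verbatim: the variable names are made up for this sketch, and `math.ceil` stands in for "round up to the next convenient number" (it would not round up when the quotient is already a whole number).

```python
import math

# Out-of-pocket prescription medicine expenses (in dollars) from the example.
expenses = [
    200, 239, 155, 252, 384, 165, 296, 405, 303, 400,
    307, 241, 256, 315, 330, 317, 352, 266, 276, 345,
    238, 306, 290, 271, 345, 312, 293, 195, 168, 342,
]
num_classes = 7

# Step 2: class width = range / number of classes, rounded up.
data_range = max(expenses) - min(expenses)
class_width = math.ceil(data_range / num_classes)

# Step 3: lower limits start at the minimum entry; each upper limit is one
# less than the next lower limit (the data are whole dollars).
lower_limits = [min(expenses) + i * class_width for i in range(num_classes)]
upper_limits = [low + class_width - 1 for low in lower_limits]

# Steps 4 and 5: count the entries that fall in each class.
frequencies = [
    sum(low <= x <= high for x in expenses)
    for low, high in zip(lower_limits, upper_limits)
]

for low, high, f in zip(lower_limits, upper_limits, frequencies):
    print(f"{low}-{high}: {f}")
```

The frequencies should add up to the sample size of 30, which is a quick check that no entry was missed or counted twice.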
Breakout Session:
Frequency Distribution
Use the data set, which represents the overall average
class size for 20 national universities in the USA.
Determining the Midpoint
Midpoint of a class
• \( \text{Midpoint} = \frac{(\text{Lower class limit}) + (\text{Upper class limit})}{2} \)
Determining the Relative Frequency
Relative frequency of a class
• Portion or percentage of the data that falls in a particular class.
• \( \text{relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n} \)
Determining the Cumulative
Frequency
Cumulative frequency of a class
• The sum of the frequency for that class and all
previous classes.
• The cumulative frequency of the last class is equal to
the sample size n.
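As a small illustration of the three definitions, the sketch below computes midpoints, relative frequencies, and cumulative frequencies for the introductory table (classes 1–5 through 26–30). The variable names are chosen for this sketch only.

```python
from itertools import accumulate

# Class limits and frequencies from the introductory frequency distribution.
classes = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]
frequencies = [5, 8, 6, 8, 5, 4]

n = sum(frequencies)                                    # sample size
midpoints = [(low + high) / 2 for low, high in classes]  # (lower + upper) / 2
relative = [f / n for f in frequencies]                  # f / n
cumulative = list(accumulate(frequencies))               # running total of f

for (low, high), m, f, r, c in zip(classes, midpoints, frequencies, relative, cumulative):
    print(f"{low}-{high}: midpoint={m}, f={f}, relative={r:.3f}, cumulative={c}")
```

The last cumulative frequency equals n, and the relative frequencies sum to 1, as the definitions require.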
Solution: Finding Midpoints, Relative
and Cumulative Frequencies
• The midpoints, relative frequencies, and cumulative
frequencies of the first five classes are calculated
as follows:
• The remaining midpoints, relative frequencies, and
cumulative frequencies are shown in the expanded
frequency distribution below.
• There are several patterns in the data set:
• For instance, the most common range for the
expenses is $299 to $334.
• Also, about half of the expenses are less
than $299.
Graphs of Frequency
Distributions
Frequency Histogram
• A bar graph that represents the frequency distribution.
• The horizontal scale is quantitative and measures the
data values.
• The vertical scale measures the frequencies of the
classes.
• Consecutive bars must touch.
[Figure: frequency histogram; horizontal axis: data values, vertical axis: frequency]
Class Boundaries
Class boundaries
• Because consecutive bars of a histogram must touch,
bars must begin and end at class boundaries instead
of class limits.
• The numbers that separate classes without forming
gaps between them.
Example: Constructing a
Frequency Histogram
Draw a frequency histogram for the frequency
distribution in the previous example. Describe any
patterns.
Solution: Constructing a
Frequency Histogram
• First, find the
class boundaries
• The distance from the upper
limit of the first class to the
lower limit of the second
class is 191– 190 = 1.
• Half this distance is 0.5.
• First class lower boundary = 155 – 0.5 = 154.5
• First class upper boundary = 190 + 0.5 = 190.5
Solution: Constructing a
Frequency Histogram
You can mark the horizontal scale either at the midpoints or
at the class boundaries. Both histograms are shown below.
You can see that two-thirds of the adults are paying more than $262.50 for out-of-pocket prescription medicine expenses.
Graphs of Frequency
Distributions
Frequency Polygon
• A line graph that emphasizes the continuous change
in frequencies.
[Figure: frequency polygon; horizontal axis: data values, vertical axis: frequency]
Example: Constructing a
Frequency Polygon
Draw a frequency polygon for the frequency distribution in the previous example (class width = 36).
Solution: Constructing a
Frequency Polygon
To construct the frequency polygon, use the same horizontal and vertical scales that were used in the histogram labeled with class midpoints.
The graph should begin and end on the horizontal axis, so extend the left side to one class width before the first class midpoint and extend the right side to one class width after the last class midpoint: 172.5 – 36 = 136.5 and 388.5 + 36 = 424.5.
Breakout Session 2: Polygon
Construct a Polygon for the frequency distribution
from Breakout Session 1
Graphs of Frequency
Distributions
Relative Frequency Histogram
• Has the same shape and the same horizontal scale as
the corresponding frequency histogram.
• The vertical scale measures the relative frequencies,
not frequencies.
[Figure: relative frequency histogram; horizontal axis: data values, vertical axis: relative frequency]
Example: Constructing a
Relative Frequency Histogram
Construct a relative frequency histogram for the second
example.
Solution: Constructing a
Relative Frequency Histogram
From this graph, you can quickly see that 0.2, or 20%, of
the adults have expenses between $262.50 and $298.50.
Graphs of Frequency
Distributions
Cumulative Frequency Graph or Ogive
• A line graph that displays the cumulative frequency
of each class at its upper class boundary.
• The upper boundaries are marked on the horizontal
axis.
• The cumulative frequencies are marked on the
vertical axis.
[Figure: ogive; horizontal axis: data values, vertical axis: cumulative frequency]
Example: Constructing an Ogive
Construct an ogive for the second example frequency
distribution.
Solution: Constructing an Ogive
From the ogive, you can see that 10 adults had expenses of
$262.50 or less. Also, the greatest increase in cumulative
frequency occurs between $298.50 and $334.50.
Breakout Session 3: Ogive
Include cumulative frequencies in your frequency
distribution and construct the ogive for the frequency
distribution
Section 2.2
More Graphs and Displays
Graphing Quantitative Data Sets
Stem-and-leaf plot
• Each number is separated into a stem and a leaf.
• Similar to a histogram.
• Still contains original data values.
• Provides an easy way to sort data.
Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45
2 | 1 5 5 6 7 8
3 | 0 6 6
4 | 5
Key: 2 | 6 = 26
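A stem-and-leaf plot for the ten data values on this slide can be sketched as follows. The sketch assumes whole-number data and uses the rightmost digit as the leaf; the names `plot` and `lines` are illustrative.

```python
from collections import defaultdict

data = [21, 25, 25, 26, 27, 28, 30, 36, 36, 45]

# Split each entry into a stem (all but the last digit) and a leaf (last digit).
plot = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    plot[stem].append(leaf)

# List every stem from lowest to highest, including stems with no leaves.
lines = []
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    lines.append(f"{stem} | {leaves}".rstrip())

print("\n".join(lines))
```

Because the entries are sorted before they are split, the leaves in each row come out in increasing order, which is what makes the display useful for sorting data.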
Example: Constructing a
Stem-and-Leaf Plot
The data set lists the numbers of text messages sent in
one day by 50 cell phone users. Display the data in a
stem-and-leaf plot. Describe any patterns. (Adapted
from Pew Research)
Solution: Constructing a
Stem-and-Leaf Plot
• The data entries go from a low of 16 to a high of 149.
• Use the rightmost digit as the leaf. For instance, 76 = 7 | 6 and 149 = 14 | 9.
• List the stems, 1 to 14, to the left of a vertical line.
• For each data entry, list a leaf to the right of its stem.
From the display, you can see that more than 50%
of the cell phone users sent between 20 and 50 text
messages.
Graphing Quantitative Data Sets
Dot plot
• Each data entry is plotted, using a point, above a horizontal axis.
Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45
[Figure: dot plot of the data above a horizontal axis running from 20 to 45]
Example: Constructing a Dot Plot
Use a dot plot to organize the data set in Example 1.
Describe any patterns.
Solution: Constructing a Dot Plot
From the dot plot, you can see that most entries occur between 20
and 80 and only 4 people sent more than 100 text messages. You
can also see that 149 is an unusual data entry.
Technology can be used to construct dot plots. For
instance, Minitab and StatCrunch dot plots for the text
messaging data are shown below.
Graphing Qualitative Data
Sets
Pie Chart
• Pie charts provide a convenient way to present
qualitative data graphically as percents of a whole.
• A circle is divided into sectors that represent categories.
• The area of each sector is proportional to the frequency
of each category.
Example: Constructing a Pie Chart
The numbers of earned degrees conferred (in thousands)
in 2014 are shown in the table. Use a pie chart to
organize the data. (Source: U.S. National Center for
Educational Statistics)
Solution: Constructing a Pie Chart
• Find the relative frequency (percent) of each category.
• Find the central angle of each sector by multiplying 360° by the category's relative frequency.
Check using technology
From the pie chart, you can see that almost one-half of
the degrees conferred in 2014 were bachelor’s degrees.
Graphing Paired Data Sets
Paired Data Sets
• Each entry in one data set corresponds to one entry in a second data set.
• Graph using a scatter plot.
   The ordered pairs are graphed as points in a coordinate plane.
   Used to show the relationship between two quantitative variables.
[Figure: scatter plot; horizontal axis: x, vertical axis: y]
Example: Interpreting a
Scatter Plot
The British statistician Ronald Fisher introduced a
famous data set called Fisher's Iris data set. This data
set describes various physical characteristics, such as
petal length and petal width (in millimeters), for three
species of iris. The petal lengths form the first data set
and the petal widths form the second data set. (Source:
Fisher, R. A., 1936)
As the petal length increases, what tends to happen to
the petal width?
Each point in the scatter plot represents the petal length and petal width of one flower.
Solution: Interpreting a
Scatter Plot
From the scatter plot, you can see that as
the petal length increases, the petal width
also tends to increase.
Section 2.3
Measures of Central Tendency
Section 2.3 Objectives
• How to find the mean, median, and mode of a
population and of a sample
• How to find the weighted mean of a data set, and how
to estimate the sample mean of grouped data
• How to describe the shape of a distribution as
symmetric, uniform, or skewed and how to compare
the mean and median for each
Measures of Central
Tendency
Measure of central tendency
• A value that represents a typical, or central, entry of a
data set.
• Most common measures of central tendency:
Mean
Median
Mode
Measure of Central Tendency:
Mean
Mean (average)
• The sum of all the data entries divided by the number
of entries.
• Sigma notation: Σx = add all of the data entries (x) in the data set.
• Population mean: \( \mu = \frac{\sum x}{N} \)
• Sample mean: \( \bar{x} = \frac{\sum x}{n} \)
Example: Finding a Sample
Mean
The weights (in pounds) for a sample of adults before
starting a weight-loss study are listed. What is the mean
weight of the adults?
274 235 223 268 290 285 235
Note: 100 pounds ≈ 45.36 kg
Solution: Finding a Sample
Mean
274 235 223 268 290 285 235
• The sum of the weights is
Σx = 274 + 235 + 223 + 268 + 290 + 285 + 235 = 1810
• To find the mean weight, divide the sum of the weights by the number of adults in the sample (7):
\( \bar{x} = \frac{\sum x}{n} = \frac{1810}{7} \approx 258.6 \)
The mean weight of the adults is about 258.6 pounds.
Measure of Central Tendency:
Median
Median
• The value that lies in the middle of the data when the
data set is ordered.
• Measures the center of an ordered data set by dividing
it into two equal parts.
• If the data set has an
   odd number of entries: the median is the middle data entry.
   even number of entries: the median is the mean of the two middle data entries.
Example: Finding the Median
Find the median of the weights listed in the first example.
274 235 223 268 290 285 235
Solution: Finding the Median
• First, order the data.
223 235 235 268 274 285 290
• Because there are seven entries (an odd number), the median is the middle, or fourth, data entry.
The median weight of the adults is 268 pounds.
Example: Finding the Median
In the previous example, the adult weighing 285 pounds
decides to not participate in the study. What is the
median weight of the remaining adults?
223 235 235 268 274 290
Solution: Finding the Median
• First, order the data.
223 235 235 268 274 290
• Because there are six entries (an even number), the median is the mean of the two middle entries: (235 + 268) / 2 = 251.5.
The median weight of the remaining adults is 251.5 pounds.
Measure of Central Tendency:
Mode
Mode
• The data entry that occurs with the greatest frequency.
• If no entry is repeated, the data set has no mode.
• If two entries occur with the same greatest frequency, each entry is a mode and the data set is called bimodal.
Example: Finding the Mode
Find the mode of the weights listed in Example 1.
223 235 235 268 274 285 290
Solution: Finding the Mode
• Ordering the data helps to find the mode.
223 235 235 268 274 285 290
• The entry of 235 occurs twice, whereas the other
data entries occur only once.
The mode of the weights is 235 pounds.
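The three worked examples for the weight data can be checked with Python's standard `statistics` module. This is a sketch for verification only. Note one difference in convention: `statistics.multimode` returns every entry when nothing repeats, whereas the slides say such a data set has no mode.

```python
import statistics

weights = [274, 235, 223, 268, 290, 285, 235]

mean_weight = statistics.mean(weights)      # sum of entries / number of entries
median_weight = statistics.median(weights)  # middle entry of the ordered data
modes = statistics.multimode(weights)       # every entry with the greatest frequency

# Median after the adult weighing 285 pounds leaves the study (six entries).
remaining = [w for w in weights if w != 285]
median_remaining = statistics.median(remaining)

print(round(mean_weight, 1), median_weight, modes, median_remaining)
```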
Example: Finding the Mode
At a political debate a sample of audience members was
asked to name the political party to which they belong.
Their responses are shown in the table. What is the
mode of the responses?
Political Party Frequency, f
Democrat 46
Republican 34
Independent 39
Other/don’t know 5
Solution: Finding the Mode
The response occurring with the greatest frequency is
Democrat. So, the mode is Democrat. In this sample, there were
more Democrats than people of any other single affiliation.
Exercise 4:
Mean, Median and Mode
Calculate the mean, median, and mode for the data set:
2 10 12 13 15
16 16 18 22 23
Comparing the Mean, Median,
and Mode
• All three measures describe a typical entry of a data
set.
• Advantage of using the mean:
The mean is a reliable measure because it takes
into account every entry of a data set.
• Disadvantage of using the mean:
Greatly affected by outliers (a data entry that is far
removed from the other entries in the data set).
Example: Comparing the Mean, Median, and Mode
The table shows the sample ages of students in a class.
Find the mean, median, and mode of the ages. Are there
any outliers? Which measure of central tendency best
describes a typical entry of this data set?
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65
Solution: Comparing the Mean, Median, and Mode
Mean: \( \bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \dots + 24 + 65}{20} = \frac{475}{20} \approx 23.8 \) years
Median: \( \frac{21 + 22}{2} = 21.5 \) years
Mode: 20 years (the entry occurring with the greatest frequency)
Mean ≈ 23.8 years, Median = 21.5 years, Mode = 20 years
• The mean takes every entry into account, but is
influenced by the outlier of 65.
• The median also takes every entry into account, and
it is not affected by the outlier.
• In this case the mode exists, but it doesn't appear to
represent a typical entry.
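One way to see how strongly the outlier pulls each measure is to recompute the mean and median with the entry of 65 left out. The sketch below does this with the standard `statistics` module; dropping the outlier is done here purely for comparison and is not a step in the slide's solution.

```python
import statistics

ages = [20, 20, 20, 20, 20, 20, 21, 21, 21, 21,
        22, 22, 22, 23, 23, 23, 23, 24, 24, 65]
without_outlier = [age for age in ages if age != 65]

mean_all = statistics.mean(ages)
mean_trimmed = statistics.mean(without_outlier)
median_all = statistics.median(ages)
median_trimmed = statistics.median(without_outlier)

print(f"mean:   {mean_all:.2f} with outlier, {mean_trimmed:.2f} without")
print(f"median: {median_all} with outlier, {median_trimmed} without")
```

For these ages the mean falls by roughly two years when 65 is removed, while the median moves by only half a year.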
Sometimes a graphical comparison can help you decide
which measure of central tendency best represents a
data set.
In this case, it appears that the median best describes the data
set.
Weighted Mean
Weighted Mean
• The mean of a data set whose entries have varying
weights.
• The weighted mean is given by \( \bar{x} = \frac{\sum xw}{\sum w} \), where w is the weight of each entry x.
Example: Finding a Weighted
Mean
Your grades from last semester are in the table. The
grading system assigns points as follows: A = 4, B = 3,
C = 2, D = 1, F = 0. Determine your grade point average
(weighted mean).
Solution: Finding a
Weighted Mean
Last semester, your grade point average was 2.5.
Mean of Grouped Data
Mean of a Frequency Distribution
• Approximated by \( \bar{x} = \frac{\sum xf}{n} \), with \( n = \sum f \), where x and f are the midpoint and frequency of each class, respectively.
Finding the Mean of a
Frequency Distribution
In Words                                                                   In Symbols
1. Find the midpoint of each class.                                        \( x = \frac{(\text{Lower limit}) + (\text{Upper limit})}{2} \)
2. Find the sum of the products of the midpoints and the frequencies.      \( \sum xf \)
3. Find the sum of the frequencies.                                        \( n = \sum f \)
4. Find the mean of the frequency distribution.                            \( \bar{x} = \frac{\sum xf}{n} \)
Example: Find the Mean of a
Frequency Distribution
The frequency distribution shows the out-of-pocket prescription medicine expenses (in dollars) for 30 U.S. adults in a recent year. Use the frequency distribution to estimate the mean expense. Using the sample mean formula, the mean expense is $285.50. Compare this with the estimated mean.
Solution: Find the Mean of a
Frequency Distribution
The mean expense is $287.70. This value is an estimate because it is
based on class midpoints instead of the original data set.
The Shape of Distributions
Symmetric Distribution
• A vertical line can be drawn through the middle
of a graph of the distribution and the resulting
halves are approximately mirror images.
Uniform Distribution (rectangular)
• All entries or classes in the distribution have equal
or approximately equal frequencies.
• Symmetric.
Skewed Left Distribution (negatively skewed)
• The “tail” of the graph elongates more to the left.
• The mean is to the left of the median.
Skewed Right Distribution (positively skewed)
• The “tail” of the graph elongates more to the right.
• The mean is to the right of the median.
Section 2.4
Measures of Variation
Range
Range
• The difference between the maximum and minimum
data entries in the set.
• The data must be quantitative.
• Range = (Max. data entry) – (Min. data entry)
Example: Finding the Range
Two corporations each hired 10 graduates. The starting
salaries for each graduate are shown. Find the range
of the starting salaries for Corporation A.
Solution: Finding the Range
• Ordering the data helps to find the least and greatest
salaries.
37 38 39 41 41 41 42 44 45 47 (minimum = 37, maximum = 47)
Range = (Max. salary) – (Min. salary)
= 47 – 37 = 10
The range of starting salaries for Corporation A is
10, or $10,000.
Variation
• Both data sets in the last example have a mean of
41.5, or $41,500, a median of 41, or $41,000, and a
mode of 41, or $41,000. And yet the two sets
differ significantly.
• The difference is that the entries in the second set
have greater variation. As you can see in the figures
on the next slide, the starting salaries for Corporation
B are more spread out than those for Corporation A.
[Figures: starting salaries for Corporation A and for Corporation B]
Deviation, Variance, and
Standard Deviation
Deviation
• The difference between the data entry, x, and the
mean of the data set.
• Population data set:
Deviation of x = x – μ
• Sample data set:
Deviation of x = x – x̄
Deviation, Variance, and Standard Deviation
Population Variance
• \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
Population Standard Deviation
• \( \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)
Example: Finding Population Variance and Standard Deviation
• Find the population variance and standard deviation of the starting salaries for Corporation A.
Solution: Finding Population Standard Deviation
• Determine the deviation for each data entry, using μ = 41.5.
Salary ($1000s), x    Deviation: x – μ
41                    41 – 41.5 = –0.5
38                    38 – 41.5 = –3.5
39                    39 – 41.5 = –2.5
45                    45 – 41.5 = 3.5
47                    47 – 41.5 = 5.5
41                    41 – 41.5 = –0.5
44                    44 – 41.5 = 2.5
41                    41 – 41.5 = –0.5
37                    37 – 41.5 = –4.5
42                    42 – 41.5 = 0.5
Σx = 415              Σ(x – μ) = 0
• Determine SSx, the sum of the squared deviations.
Salary, x    Deviation: x – μ       Squares: (x – μ)²
41           41 – 41.5 = –0.5       (–0.5)² = 0.25
38           38 – 41.5 = –3.5       (–3.5)² = 12.25
39           39 – 41.5 = –2.5       (–2.5)² = 6.25
45           45 – 41.5 = 3.5        (3.5)² = 12.25
47           47 – 41.5 = 5.5        (5.5)² = 30.25
41           41 – 41.5 = –0.5       (–0.5)² = 0.25
44           44 – 41.5 = 2.5        (2.5)² = 6.25
41           41 – 41.5 = –0.5       (–0.5)² = 0.25
37           37 – 41.5 = –4.5       (–4.5)² = 20.25
42           42 – 41.5 = 0.5        (0.5)² = 0.25
             Σ(x – μ) = 0           SSx = 88.5
Population Variance
• \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{88.5}{10} \approx 8.9 \) (in squared thousands of dollars)
Population Standard Deviation
• \( \sigma = \sqrt{\sigma^2} = \sqrt{\frac{88.5}{10}} \approx 3.0 \) (the square root brings the result back to thousands of dollars)
The population variance is about 8.9, and the population standard deviation is about 3.0, or $3,000.
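The same calculation can be written out step by step in Python, mirroring the columns of the solution table. This is a sketch with illustrative variable names; the standard library's `statistics.pvariance` and `statistics.pstdev` would give the same results directly.

```python
import math

# Starting salaries for Corporation A, in thousands of dollars.
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

N = len(salaries)
mu = sum(salaries) / N                    # population mean
deviations = [x - mu for x in salaries]   # x - mu for each entry
ss_x = sum(d ** 2 for d in deviations)    # sum of squared deviations
variance = ss_x / N                       # population variance
std_dev = math.sqrt(variance)             # population standard deviation

print(f"mu={mu}, SSx={ss_x}, variance={variance:.2f}, sigma={std_dev:.2f}")
```

The deviations always sum to zero, which is why they are squared before being added.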
Sample Variance and Standard Deviation
Sample Variance
• \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)
   Dividing by n – 1 instead of n is a correction, as samples tend to underestimate variance.
Sample Standard Deviation
• \( s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Finding the Sample Variance & Standard Deviation
In Words                                                           In Symbols
1. Find the mean of the sample data set.                           \( \bar{x} = \frac{\sum x}{n} \)
2. Find the deviation of each entry.                               \( x - \bar{x} \)
3. Square each deviation.                                          \( (x - \bar{x})^2 \)
4. Add to get the sum of squares.                                  \( SS_x = \sum (x - \bar{x})^2 \)
5. Divide by n – 1 to get the sample variance.                     \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)
6. Find the square root to get the sample standard deviation.      \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Example: Finding Sample
Variance & Standard Deviation
In a study of high school football players that suffered
concussions, researchers placed the players in two
groups. Players that recovered from their concussions in
14 days or less were placed in Group 1. Those that took
more than 14 days were placed in Group 2. The
recovery times (in days) for Group 1 are listed below.
Find the sample variance and standard deviation of the
recovery times.
4 7 6 7 9 5 8 10 9 8 7 10
Solution: Finding Sample
Variance & Standard Deviation
• Determine the
deviation for each
data entry.
• x̄ = 7.5
The sample variance is about 3.5, and the sample standard
deviation is about 1.9 days.
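The six steps for the sample variance and standard deviation can be followed in code for the Group 1 recovery times. This is a sketch with illustrative names; the last line cross-checks the hand-built result against `statistics.variance`, which also divides by n – 1.

```python
import statistics

recovery_days = [4, 7, 6, 7, 9, 5, 8, 10, 9, 8, 7, 10]

n = len(recovery_days)
x_bar = sum(recovery_days) / n                       # step 1: sample mean
ss_x = sum((x - x_bar) ** 2 for x in recovery_days)  # steps 2 to 4: sum of squares
s2 = ss_x / (n - 1)                                  # step 5: sample variance
s = s2 ** 0.5                                        # step 6: sample standard deviation

# Cross-check with the standard library.
library_s2 = statistics.variance(recovery_days)

print(f"x_bar={x_bar}, SSx={ss_x}, s2={s2:.2f}, s={s:.2f}")
```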
Exercise 5:
Standard Deviation
The ages of a random sample of people surveyed in Deggendorf are:
14 23 24 16 16 22 18 19 21 25
Use the table provided to calculate sum of squares, variance and standard
deviation of the data set:
Coefficient of Variation
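The coefficient of variation (CV) is commonly defined as the standard deviation expressed as a percent of the mean, which makes it possible to compare the spread of data sets that have different units or different means. Written in the population and sample notation used earlier in this section:

```latex
\[
CV_{\text{population}} = \frac{\sigma}{\mu} \cdot 100\%
\qquad
CV_{\text{sample}} = \frac{s}{\bar{x}} \cdot 100\%
\]
```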
Example: Comparing Variation in
Different Data Sets
The table shows the population heights (in inches) and
weights (in pounds) of the members of a basketball team.
Find the coefficient of variation for the heights and the
weights. Then compare the results.
Solution: Comparing Variation in
Different Data Sets
The weights (9.4%) are more variable than
the heights (4.5%).
Interpreting Standard
Deviation
• Standard deviation is a measure of the typical amount
an entry deviates from the mean.
• The more the entries are spread out, the greater the
standard deviation.
Example: Estimating Standard
Deviation
Without calculating, estimate the population standard
deviation of each data set.
Solution: Estimating Standard
Deviation
Interpreting Standard Deviation:
Empirical Rule (68 – 95 – 99.7 Rule)
For data with a (symmetric) bell-shaped distribution, the
standard deviation has the following characteristics:
• About 68% of the data lie within one standard
deviation of the mean.
• About 95% of the data lie within two standard
deviations of the mean.
• About 99.7% of the data lie within three standard
deviations of the mean.
[Figure: bell-shaped curve centered at x̄. Areas by interval: x̄ – 3s to x̄ – 2s: 2.35%; x̄ – 2s to x̄ – s: 13.5%; x̄ – s to x̄: 34%; x̄ to x̄ + s: 34%; x̄ + s to x̄ + 2s: 13.5%; x̄ + 2s to x̄ + 3s: 2.35%. In total, 68% lies within 1 standard deviation, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.]
There are Different Forms of
Bell-Shaped Distributions
Example: Using the Empirical
Rule
In a survey conducted by the National Center for Health
Statistics, the sample mean height of women in the
United States (ages 20-29) was 64.2 inches, with a
sample standard deviation of 2.9 inches. Estimate the
percent of the women whose heights are between 58.4
inches and 64.2 inches. Assume the sample is normally
distributed.
Solution: Using the Empirical
Rule
• Because the distribution is bell-shaped, you can use the Empirical Rule.
• The mean height is 64.2 inches.
• 64.2 – (2 × 2.9) = 58.4, so 58.4 inches is two standard deviations below the mean.
• By the Empirical Rule, about 13.5% + 34% = 47.5% of the women have heights between 58.4 inches and 64.2 inches.
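The Empirical Rule estimate can be compared with an exact normal model. The sketch below uses `statistics.NormalDist` from the Python standard library; treating the heights as exactly normal is an assumption of this sketch, and the 13.5% and 34% figures are the Empirical Rule areas quoted earlier.

```python
from statistics import NormalDist

mean, sd = 64.2, 2.9
low, high = 58.4, 64.2

# How many standard deviations each bound lies from the mean.
z_low = (low - mean) / sd    # about -2
z_high = (high - mean) / sd  # 0

# Empirical Rule: 13.5% lies between -2 and -1 standard deviations,
# and 34% lies between -1 standard deviation and the mean.
empirical_estimate = 13.5 + 34.0

# Exact area under a normal curve with the same mean and standard deviation.
model = NormalDist(mu=mean, sigma=sd)
exact_percent = 100 * (model.cdf(high) - model.cdf(low))

print(f"z-scores: {z_low:.1f} to {z_high:.1f}")
print(f"Empirical Rule: {empirical_estimate}%, normal model: {exact_percent:.2f}%")
```

The two figures agree to within a quarter of a percentage point, which is the level of accuracy the rounded 68–95–99.7 values are meant to provide.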
Chebychev’s Theorem
• The portion of any data set lying within k standard deviations (k > 1) of the mean is at least \( 1 - \frac{1}{k^2} \).
Example: Using Chebychev’s
Theorem
The age distributions for Georgia and Iowa are shown in the
histograms. Apply Chebychev’s Theorem to the data for Georgia
using k = 2. What can you conclude? Is an age of 100 unusual
for a Georgia resident? Explain.
(Source: Based on U.S. Census Bureau)
Solution: Using Chebychev’s Theorem
You can say that at least 75% of the population of Georgia is
between 0 and 81.9 years old. Also, an age of 100 lies more than
two standard deviations from the mean. So, this age is unusual.
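The bound from Chebychev's Theorem is simple enough to evaluate for any k. The helper below is a hypothetical function written for this sketch; it returns the minimum fraction of the data within k standard deviations of the mean.

```python
def chebychev_lower_bound(k: float) -> float:
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's Theorem requires k > 1")
    return 1 - 1 / k ** 2


bound_2 = chebychev_lower_bound(2)  # the k = 2 case used for the Georgia data
bound_3 = chebychev_lower_bound(3)

print(f"k=2: at least {bound_2:.1%}, k=3: at least {bound_3:.1%}")
```

Unlike the Empirical Rule, this bound makes no assumption about the shape of the distribution, which is why it is weaker (at least 75% for k = 2 rather than about 95%).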
Breakout Session 6
30. The mean wage for a sample of employees in a company was $16.50 per day, with a standard deviation of $1.50 per day. Estimate the percent of wages between $12.00 and $21.00 per day. (Assume the data set is normally distributed, that is, bell-shaped.)
32. The mean duration of the 135 space shuttle flights was about 9.9 days, and the standard deviation was about 3.8 days. Determine how many of the flights lasted between 2.3 days and 17.5 days. (Assume that the data set is not normally distributed.)
Section 2.5
Measures of Position
Quartiles
• Fractiles are numbers that partition (divide) an
ordered data set into equal parts.
• Quartiles approximately divide an ordered data set
into four equal parts.
First quartile, Q1: About one quarter of the data
fall on or below Q1.
Second quartile, Q2: About one half of the data
fall on or below Q2 (median).
Third quartile, Q3: About three quarters of the
data fall on or below Q3.
Quartiles
Video
https://www.youtube.com/watch?v=oPw2OpIZ4DY
Example: Finding Quartiles
Each year in the U.S., automobile commuters waste fuel due to
traffic congestion. The amounts (in gallons per year) of fuel wasted
by commuters in the 15 largest U.S. urban areas are listed. Find
the first, second, and third quartiles of the data set. What do you
observe? (Source: Based on 2015 Urban Mobility Scorecard)
20 30 29 22 25 29 25 24 35 23 25 11 33 28 35
Solution:
• Order the data. Q2 (the median) divides the data set into two halves.
Ordered data: 11 20 22 23 24 25 25 25 28 29 29 30 33 35 35
Data entries to the left of Q2: 11 20 22 23 24 25 25
Data entries to the right of Q2: 28 29 29 30 33 35 35
• Q1 = 23, Q2 = 25, Q3 = 30
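The quartiles for this example can be reproduced with a short Python sketch. It assumes the median-of-halves method shown on the slide, where the median itself is left out of both halves when the number of entries is odd. The function name `quartiles` is illustrative; statistical software often uses other quartile conventions and may give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: when the number of entries is odd, the median is
    excluded from both the lower and the upper half.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Fuel wasted (gallons per year) in the 15 urban areas from the example
fuel = [20, 30, 29, 22, 25, 29, 25, 24, 35, 23, 25, 11, 33, 28, 35]
q1, q2, q3 = quartiles(fuel)
print(q1, q2, q3)
```

About one quarter of the urban areas waste 23 gallons or less, about one half waste 25 gallons or less, and about three quarters waste 30 gallons or less.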
Interquartile Range
Interquartile Range (IQR)
• A measure of variation that gives the range of the
middle portion (about half) of the data.
• The difference between the third and first
quartiles.
• IQR = Q3 – Q1
• Use the IQR to identify outliers: multiply the IQR by 1.5.
• Any data entry less than Q1 – 1.5 × IQR or greater than
Q3 + 1.5 × IQR is an outlier.
Interquartile Range
Video
https://www.youtube.com/watch?v=VABsJBw1JqA
Example: Finding the
Interquartile Range
Find the interquartile range of the data set from the
first example. Are there any outliers?
Solution: Finding the
Interquartile Range
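One way to work through this solution is the following Python sketch. It takes Q1 = 23 and Q3 = 30 from the first example and applies the 1.5 × IQR rule. The names `lower_fence` and `upper_fence` are illustrative labels for the two cutoffs, not terms used on the slides.

```python
# Fuel wasted (gallons per year) in the 15 urban areas from the first example
fuel = [20, 30, 29, 22, 25, 29, 25, 24, 35, 23, 25, 11, 33, 28, 35]
q1, q3 = 23, 30  # quartiles found in the first example

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # entries below this are outliers
upper_fence = q3 + 1.5 * iqr    # entries above this are outliers

outliers = [x for x in fuel if x < lower_fence or x > upper_fence]
print(iqr, lower_fence, upper_fence, outliers)
```

Under these assumptions, IQR = 7 and the cutoffs are 12.5 and 40.5, so the entry 11 is an outlier.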
Exercise IQR
Box-and-Whisker Plot
Box-and-whisker plot
• Exploratory data analysis tool.
• Highlights important features of a data set.
• Requires (five-number summary):
1. Minimum entry
2. First quartile Q1
3. Median Q2
4. Third quartile Q3
5. Maximum entry
Drawing a Box-and-Whisker
Plot
1. Find the five-number summary of the data set.
2. Construct a horizontal scale that spans the range of
the data.
3. Plot the five numbers above the horizontal scale.
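4. Draw a box above the horizontal scale from Q1 to Q3
and draw a vertical line in the box at Q2.
5. Draw whiskers from the box to the minimum and
maximum entries.
[Figure: box-and-whisker plot labeled, from left to right, with minimum entry, Q1, median (Q2), Q3, and maximum entry; the box spans Q1 to Q3 and a whisker extends from each side.]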
Example: Drawing a Box-and-
Whisker Plot
Draw a box-and-whisker plot that represents the data set
in the first example.
Solution:
Min = 11, Q1 = 23, Q2 = 25, Q3 = 30, Max = 35
The box represents about half of the data, which are
between 23 and 30.
Example: Drawing a Box-and-
Whisker Plot
Solution:
The left whisker represents about one-quarter of the data,
so about 25% of the data entries are less than 23. The
right whisker represents about one-quarter of the data, so
about 25% of the data entries are greater than 30. Also,
the left whisker is much longer than the
right one. This indicates that the data set has a possible
outlier to the left.
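The whisker comparison can be made concrete with a small Python sketch that measures each part of the plot from the five-number summary. The variable names are illustrative.

```python
# Five-number summary from the example
minimum, q1, q2, q3, maximum = 11, 23, 25, 30, 35

left_whisker = q1 - minimum    # spread of the lowest quarter of the data
box_width = q3 - q1            # spread of the middle half (the IQR)
right_whisker = maximum - q3   # spread of the highest quarter of the data

print(left_whisker, box_width, right_whisker)
```

The left whisker spans 12 gallons while the right whisker spans only 5, which is what suggests a possible outlier on the left.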
Exercise:
Drawing a Box-and-Whisker Plot
The Standard Score
Standard Score (z-score)
• Represents the number of standard deviations a
given value x falls from the mean μ.
• $z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$
Example: Finding z-Scores
The mean speed of vehicles along a stretch of highway is
56 miles per hour with a standard deviation of 4 miles
per hour.
You measure the speeds of three cars traveling along this
stretch of highway as:
1st car: 62 miles per hour
2nd car: 47 miles per hour
3rd car: 56 miles per hour
Find the z-score that corresponds to each speed. Assume
the distribution of the speeds is approximately bell-shaped.
Solution: Finding z-Scores
Solution
The z-score that corresponds to each speed is calculated
below.
x = 62 mph: $z = \frac{62 - 56}{4} = 1.5$
x = 47 mph: $z = \frac{47 - 56}{4} = -2.25$
x = 56 mph: $z = \frac{56 - 56}{4} = 0$
62 miles per hour is 1.5 standard deviations above the
mean; 47 miles per hour is 2.25 standard deviations
below the mean; and 56 miles per hour is equal to the
mean. 47 miles per hour is unusually slow, because its
z-score of −2.25 is more than 2 standard deviations below
the mean.
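The three z-scores can be reproduced with a minimal Python sketch. The helper `z_score` is a hypothetical name, and the rule of flagging |z| > 2 as unusual is an assumption based on the Empirical Rule for bell-shaped data.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

mean_speed, sd_speed = 56, 4  # miles per hour, from the example

for speed in (62, 47, 56):
    z = z_score(speed, mean_speed, sd_speed)
    # Assumed rule of thumb: values more than 2 standard deviations
    # from the mean are flagged as unusual.
    label = "unusual" if abs(z) > 2 else "usual"
    print(f"{speed} mph: z = {z:+.2f} ({label})")
```

Only the 47 mph car is flagged, since it is the only speed more than 2 standard deviations from the mean.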
Exercise: Find the Z-Score at 60 mph
The mean speed of vehicles along a stretch of highway is
56 miles per hour with a standard deviation of 4 miles
per hour.
You measure the speed of a fourth car traveling along this
stretch of highway as:
4th car: 60 miles per hour
Find the z-score that corresponds to the speed of 60 mph.
Assume the distribution of the speeds is approximately
bell-shaped.
Example: Comparing z-Scores
from Different Data Sets
The table shows the mean heights and standard
deviations for a population of men and a population of
women. Compare the z-scores for a 6-foot-tall (72 in. /
1.83m) man and a 6-foot-tall (72 in. / 1.83m) woman.
Assume the distributions of the heights are
approximately bell-shaped.
Solution: Comparing z-Scores
from Different Data Sets
Solution
Note that 6 feet = 72 inches. Find the z-score for each
height.
• z-score for 6-foot-tall man
$z = \frac{x - \mu}{\sigma} = \frac{72 - 69.9}{3.0} = 0.7$
• z-score for 6-foot-tall woman
$z = \frac{x - \mu}{\sigma} = \frac{72 - 64.3}{2.6} \approx 3.0$
Solution: Comparing z-Scores
from Different Data Sets
Solution
The z-score for the 6-foot-tall man is within 1 standard
deviation of the mean (69.9 inches). This is among
the typical heights for a man.
The z-score for the 6-foot-tall woman is about 3
standard deviations from the mean (64.3 inches). This is
an unusual height for a woman.
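The comparison can be sketched in Python using the means and standard deviations that appear in the solution (69.9 and 3.0 inches for men, 64.3 and 2.6 inches for women). The dictionary layout is an illustrative choice.

```python
# (mean, standard deviation) in inches, as used in the solution
populations = {"man": (69.9, 3.0), "woman": (64.3, 2.6)}
height = 72  # 6 feet expressed in inches

# Same height, different populations: each z-score is measured against
# that population's own mean and standard deviation.
z_scores = {who: (height - mu) / sigma for who, (mu, sigma) in populations.items()}

for who, z in z_scores.items():
    print(f"6-foot-tall {who}: z = {z:.1f}")
```

The same 72-inch height gives a z-score of about 0.7 in one population and about 3.0 in the other, which is why z-scores are useful for comparing values from different data sets.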