Chapter 3
$\frac{10{,}000 + 1}{2} = 5000.5$. The median is between the 5000th and the 5001st ordered values.
7. The mode is used with qualitative data because the computations involved with the mean
and median make no sense for qualitative data.
8. parameter; statistic
9. False. A data set may have multiple modes, or it may have no mode at all.
10. False. The formula $\frac{n + 1}{2}$ gives the position of the median, not the value of the median.
11. $\bar{x} = \frac{20 + 13 + 4 + 8 + 10}{5} = \frac{55}{5} = 11$
12. $\bar{x} = \frac{83 + 65 + 91 + 87 + 84}{5} = \frac{410}{5} = 82$
13. $\mu = \frac{3 + 6 + 10 + 12 + 14}{5} = \frac{45}{5} = 9$
15. $\mu = \frac{1 + 19 + 25 + 15 + 12 + 16 + 28 + 13 + 6}{9} = \frac{135}{9} = 15$
$\frac{142}{59} \approx 2.4$. The mean price per ad slot is approximately $2.4 million.
16. Let x represent the missing value. Since there are 6 data values in the list, the median 26.5
is between the 3rd and 4th ordered values which are 21 and x, respectively. Thus,
$\frac{21 + x}{2} = 26.5$
21 + x = 53
x = 32
The missing value is 32.
17. Mean =
18. Mean = $\frac{3960 + 4090 + 3200 + 3100 + 2940 + 3830 + 4090 + 4040 + 3780}{9} = \frac{33{,}030}{9} = 3670$ psi
Data in order: 2940, 3100, 3200, 3780, 3830, 3960, 4040, 4090, 4090
Median = the 5th ordered data value = 3830 psi
Mode = 4090 psi (because it is the only data value to occur twice)
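The three measures of center in Problem 18 can be cross-checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original solution.

```python
from statistics import mean, median, multimode

# Data values (psi) copied from the solution above.
strengths_psi = [3960, 4090, 3200, 3100, 2940, 3830, 4090, 4040, 3780]

strength_mean = mean(strengths_psi)        # sum of the values divided by n
strength_median = median(strengths_psi)    # middle value of the sorted list (n is odd)
strength_modes = multimode(strengths_psi)  # every value tied for the highest count

print(strength_mean, strength_median, strength_modes)
```

`multimode` returns a list, so it also covers data sets with several modes; for data with no repeated value it returns every value, which corresponds to "no mode" in the convention used here.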
19. Mean =
20. Mean =
21. (a) The histogram is skewed to the right, suggesting that the mean is greater than the
median. That is, $\bar{x} > M$.
(b) The histogram is symmetric, suggesting that the mean is approximately equal to the
median. That is, $\bar{x} = M$.
(c) The histogram is skewed to the left, suggesting that the mean is less than the median.
That is, $\bar{x} < M$.
IV because the distribution is symmetric (so mean ≈ median) and centered near 30.
III because the distribution is skewed to the right, so mean > median.
II because the distribution is skewed to the left, so mean < median.
I because the distribution is symmetric (so mean ≈ median) and centered near 40.
25. (a) $\mu = \frac{39 + 21 + 9 + 32 + 30 + 45 + 11 + 12 + 39}{9} = \frac{238}{9} \approx 26.4$ minutes
(b) Samples and sample means will vary.
(c) Answers will vary.
26. (a) =
27. (a) $\mu = \frac{0 + 0 + 0 + 4 + 10 + 1 + 10 + 10 + 19 + 9 + 18 + 20 + 13 + 13 + 2 + 7 + 8 + 13}{18} = \frac{157}{18}$
Winning speed: $\frac{\text{distance}}{\text{time}} = \frac{3499}{86.572} \approx 40.417$ km/hr
The three results agree approximately. The differences are due to rounding.
29. The distribution is relatively symmetric as is evidenced by both the histogram and the fact
that the mean and median are approximately equal. Therefore, the mean is the better
measure of central tendency.
33. [Histogram; x-axis: Weight (grams)]
$\bar{x} \approx 0.874$ grams; $M = 0.88$ grams. The mean is approximately equal to the median,
suggesting that the distribution is symmetric. This is confirmed by the histogram (though
it does appear to be slightly skewed left). The mean is the better measure of central
tendency.
34. [Histogram: Length of Eruptions; x-axis: Length (seconds); y-axis: Frequency]
$\bar{x} \approx 104.1$ seconds; $M = 104$ seconds. The mean is approximately equal to the median,
suggesting that the distribution is symmetric. This is confirmed by the histogram. The
mean is the better measure of central tendency.
35. [Histogram; x-axis: Hours]
$\bar{x} = 22$ hours; $M = 25$ hours. The mean is smaller than the median, suggesting that the
distribution is skewed left. This is confirmed by the histogram. The median is the better
measure of central tendency.
36. [Histogram: Car Dealer's Profit; x-axis: Dollars; y-axis: Number of Sales]
The frequency for Purdue is 3. The frequencies of all other colleges are lower than 3,
so the mode college attended is Purdue.
(e) Samples and sample means will vary.
(f) Offensive guards: mean = 306.4 lb; M = 305 lb; Mode = 305 lb
Running backs: mean = 217.8 lb; M = 220 lb; Mode = 225 lb
Yes, there appear to be differences in the weights of offensive guards and running
backs. All three measures of center indicate that offensive guards are significantly
heavier than running backs. This is due to the nature of the positions. Offensive
guards must be able to protect the quarterback while the running back must be able to
run quickly.
(g) It does not make sense to compute the mean player number. The variable player
number is qualitative, so the quantitative calculations will be meaningless.
$\bar{x} = \frac{65 + 70 + 71 + 75 + 95}{5} = \frac{376}{5} = 75.2$
The five data values are in order, so the median is the middle value: M = 71 .
The distribution is skewed right, so the median is the better measure of central
tendency.
Adding 4 to each score gives the following new data set: 69, 74, 75, 79, 99.
$\bar{x} = \frac{69 + 74 + 75 + 79 + 99}{5} = \frac{396}{5} = 79.2$
The curved test score mean is 4 greater than the unadjusted test score mean. Adding 4
to each score increased the mean by 4.
50. (a) x =
(b)
(c)
(d)
(e)
52. Midrange =
9. True
10. True
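Data, $x_i$    Sample Mean, $\bar{x}$
20             11
13             11
4              11
8              11
10             11

$\sum (x_i - \bar{x}) = 0$; $\sum (x_i - \bar{x})^2 = 144$

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{144}{5 - 1} = 36$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{144}{5 - 1}} = \sqrt{36} = 6$.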
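Data, $x_i$    Sample Mean, $\bar{x}$
83             82
65             82
91             82
87             82
84             82

$\sum (x_i - \bar{x}) = 0$; $\sum (x_i - \bar{x})^2 = 400$

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{400}{5 - 1} = 100$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{400}{5 - 1}} = \sqrt{100} = 10$.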
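Data, $x_i$    Population Mean, $\mu$
3              9
6              9
10             9
12             9
14             9

$\sum (x_i - \mu) = 0$; $\sum (x_i - \mu)^2 = 80$

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{80}{5} = 16$; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{\frac{80}{5}} = \sqrt{16} = 4$.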
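Data, $x_i$    Population Mean, $\mu$
1              15
19             15
25             15
15             15
12             15
16             15
28             15
13             15
6              15

$\sum (x_i - \mu) = 0$; $\sum (x_i - \mu)^2 = 576$

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{576}{9} = 64$; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{\frac{576}{9}} = \sqrt{64} = 8$.

15. $\bar{x} = \frac{6 + 52 + 13 + 49 + 35 + 25 + 31 + 29 + 31 + 29}{10} = \frac{300}{10} = 30$.

Data, $x_i$   Sample Mean, $\bar{x}$   Deviations, $x_i - \bar{x}$   Squared Deviations, $(x_i - \bar{x})^2$
6             30                       6 - 30 = -24                  (-24)^2 = 576
52            30                       52 - 30 = 22                  22^2 = 484
13            30                       13 - 30 = -17                 (-17)^2 = 289
49            30                       49 - 30 = 19                  19^2 = 361
35            30                       35 - 30 = 5                   5^2 = 25
25            30                       25 - 30 = -5                  (-5)^2 = 25
31            30                       31 - 30 = 1                   1^2 = 1
29            30                       29 - 30 = -1                  (-1)^2 = 1
31            30                       31 - 30 = 1                   1^2 = 1
29            30                       29 - 30 = -1                  (-1)^2 = 1

$\sum (x_i - \bar{x}) = 0$; $\sum (x_i - \bar{x})^2 = 1764$

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{1764}{10 - 1} = 196$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{1764}{10 - 1}} = \sqrt{196} = 14$.

16. $\mu = \frac{4 + 10 + 12 + 12 + 13 + 21}{6} = \frac{72}{6} = 12$.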
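Data, $x_i$    Population Mean, $\mu$
4              12
10             12
12             12
12             12
13             12
21             12

$\sum (x_i - \mu) = 0$; $\sum (x_i - \mu)^2 = 150$

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{150}{6} = 25$; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} = \sqrt{\frac{150}{6}} = \sqrt{25} = 5$.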
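Sample Mean, $\bar{x}$   Absolute Deviations, $|x_i - \bar{x}|$   Squared Deviations, $(x_i - \bar{x})^2$
381.75                   38.25                                    1463.0625
381.75                   80.25                                    6440.0625
381.75                   27.25                                    742.5625
381.75                   145.75                                   21,243.0625

$\sum (x_i - \bar{x}) = 0$; $\sum (x_i - \bar{x})^2 = 29{,}888.75$

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{29{,}888.75}{4 - 1} \approx 9{,}962.9\ \$^2$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{29{,}888.75}{4 - 1}} \approx \$99.81$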
18. Range = Largest Data Value - Smallest Data Value = 49.26 - 35.34 = $13.92.
To calculate the sample variance and the sample standard deviation, we use the
computational formula:

Data value, $x_i$    Data value squared, $x_i^2$
35.34                1248.9156
42.09                1771.5681
39.43                1554.7249
38.93                1515.5449
43.39                1882.6921
49.26                2426.5476

$\sum x_i = 248.44$; $\sum x_i^2 = 10{,}399.9932$

$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{10{,}399.9932 - \frac{(248.44)^2}{6}}{6 - 1} \approx 22.584\ \$^2$; $s = \sqrt{\frac{10{,}399.9932 - \frac{(248.44)^2}{6}}{6 - 1}} \approx \$4.75$
19. Range = Largest Data Value - Smallest Data Value = 4090 - 2940 = 1150 psi
To calculate the sample variance and the sample standard deviation, we use the
computational formula:

Data value, $x_i$    Data value squared, $x_i^2$
3960                 15,681,600
4090                 16,728,100
3200                 10,240,000
3100                 9,610,000
2940                 8,643,600
3830                 14,668,900
4090                 16,728,100
4040                 16,321,600
3780                 14,288,400

$\sum x_i = 33{,}030$; $\sum x_i^2 = 122{,}910{,}300$

$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{122{,}910{,}300 - \frac{(33{,}030)^2}{9}}{9 - 1} = 211{,}275$ psi$^2$; $s = \sqrt{\frac{122{,}910{,}300 - \frac{(33{,}030)^2}{9}}{9 - 1}} \approx 459.6$ psi
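The computational formula used in Problems 18 and 19 can be checked numerically. The sketch below is illustrative only; the function name `sample_variance_computational` is an assumption made for this example, and the data are the nine psi values from Problem 19.

```python
import math

def sample_variance_computational(values):
    """Sample variance via (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(v * v for v in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

strengths_psi = [3960, 4090, 3200, 3100, 2940, 3830, 4090, 4040, 3780]

total = sum(strengths_psi)                         # sum of the data values
total_sq = sum(v * v for v in strengths_psi)       # sum of the squared data values
s2 = sample_variance_computational(strengths_psi)  # sample variance, psi^2
s = math.sqrt(s2)                                  # sample standard deviation, psi

print(total, total_sq, s2, round(s, 1))
```

The computational form needs only the two column totals, which is why the solution tabulates $x_i$ and $x_i^2$ rather than the individual deviations.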
Sample Mean, x
266
266
266
266
266
266
267
266
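$\sum (x_i - \bar{x}) = 0$; $\sum (x_i - \bar{x})^2 = 426$

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{426}{7 - 1} = 71$ min$^2$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{426}{7 - 1}} \approx 8.4$ min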
21. Histogram (b) depicts a higher standard deviation because the data are more dispersed, with
data values ranging from 30 to 75. Histogram (a)'s data values only range from 40 to 60.
22. (a) III, because it is centered between 52 and 57 and has the greatest amount of dispersion
of the three histograms with mean = 53.
(b) I, because it is centered near 53 and its dispersion is consistent with s = 1.3 but not
with s = 0.12 or s = 9 .
(c) IV, because it is centered near 53 and it has the least dispersion of the three histograms
with mean = 53.
(d) II, because it has a center near 60.
23. Los Angeles ATM fees:
Range = Largest Data Value - Smallest Data Value = 2.00 - 0.00 = $2.00.

Data value, $x_i$    Data value squared, $x_i^2$
2.00                 4
1.50                 2.25
1.50                 2.25
1.00                 1
1.50                 2.25
2.00                 4
0.00                 0
2.00                 4

$\sum x_i = 11.5$; $\sum x_i^2 = 19.75$

$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{19.75 - \frac{(11.5)^2}{8}}{8 - 1}} \approx \$0.68$
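New York ATM fees:

Data value, $x_i$    Data value squared, $x_i^2$
1.50                 2.25
1.00                 1
1.00                 1
1.25                 1.5625
1.25                 1.5625
1.50                 2.25
1.00                 1
0.00                 0

$\sum x_i = 8.5$; $\sum x_i^2 = 10.625$

$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{10.625 - \frac{(8.5)^2}{8}}{8 - 1}} \approx \$0.48$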
Based on both the range and the standard deviation, ATM fees in Los Angeles have more
dispersion than ATM fees in New York. Both the range and the standard deviation for Los
Angeles are larger.
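24. Reaction Time to Blue:
Range = Largest Data Value - Smallest Data Value = 0.841 - 0.267 = 0.574 sec.

Data value, $x_i$    Data value squared, $x_i^2$
0.582                0.338724
0.481                0.231361
0.841                0.707281
0.267                0.071289
0.685                0.469225
0.45                 0.2025

$\sum x_i = 3.306$; $\sum x_i^2 = 2.02038$

$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{2.02038 - \frac{(3.306)^2}{6}}{6 - 1}} \approx 0.1994$ sec.

Reaction Time to Red:

Data value, $x_i$    Data value squared, $x_i^2$
0.408                0.166464
0.407                0.165649
0.542                0.293764
0.402                0.161604
0.456                0.207936
0.533                0.284089

$\sum x_i = 2.748$; $\sum x_i^2 = 1.279506$

$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{1.279506 - \frac{(2.748)^2}{6}}{6 - 1}} \approx 0.0647$ sec.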
Based on both the range and the standard deviation, the reaction times for blue have more
variability than those for red. Both the range and the standard deviation for blue are larger.
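$\sum x_i = 650$; $\sum x_i^2 = 47{,}474$; $N = 9$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{47{,}474 - \frac{(650)^2}{9}}{9} \approx 58.8$ (beats/min.)$^2$; $\sigma = \sqrt{\frac{47{,}474 - \frac{(650)^2}{9}}{9}} \approx 7.7$ beats/min.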
(b) Samples, sample variances, and sample standard deviations will vary.
(c) Answers will vary.
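$\sum x_i = 238$; $\sum x_i^2 = 7778$; $N = 9$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{7778 - \frac{(238)^2}{9}}{9} \approx 164.9$ min.$^2$; $\sigma = \sqrt{\frac{7778 - \frac{(238)^2}{9}}{9}} \approx 12.8$ min.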
(b) Samples, sample variances, and sample standard deviations will vary.
(c) Answers will vary.
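$\sum x_i = 157$; $\sum x_i^2 = 2107$; $N = 18$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{2107 - \frac{(157)^2}{18}}{18} \approx 41.0$ goals$^2$; $\sigma = \sqrt{\frac{2107 - \frac{(157)^2}{18}}{18}} \approx 6.4$ goals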
(b) Samples, sample variances, and sample standard deviations will vary.
(c) Answers will vary.
28. (a) Range = Largest Data Value - Smallest Data Value = 92.552 - 82.087 = 10.465 hours.
For the population variance and standard deviation, we use the computational formula:
$\sum x_i = 606.007$; $\sum x_i^2 = 52{,}561.3666$; $N = 7$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{52{,}561.3666 - \frac{(606.007)^2}{7}}{7} \approx 13.981$ hours$^2$; $\sigma = \sqrt{\frac{52{,}561.3666 - \frac{(606.007)^2}{7}}{7}} \approx 3.739$ hours
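$\sum x_i = 24{,}491$; $\sum x_i^2 = 85{,}825{,}565$; $N = 7$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{85{,}825{,}565 - \frac{(24{,}491)^2}{7}}{7} \approx 19{,}793.3$ km$^2$; $\sigma = \sqrt{\frac{85{,}825{,}565 - \frac{(24{,}491)^2}{7}}{7}} \approx 140.7$ km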
(c) Range = Largest Data Value - Smallest Data Value = 7.617 - 1.017 = 6.600 min.
For the population variance and standard deviation, we use the computational formula:
$\sum x_i = 39.667$; $\sum x_i^2 = 255.510823$; $N = 7$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{255.510823 - \frac{(39.667)^2}{7}}{7} \approx 4.390$ min$^2$; $\sigma = \sqrt{\frac{255.510823 - \frac{(39.667)^2}{7}}{7}} \approx 2.095$ min
(d) Range = Largest Data Value - Smallest Data Value = 41.65 - 39.56 = 2.09 km/h.
For the population variance and standard deviation, we use the computational formula:
$\sum x_i = 282.94$; $\sum x_i^2 = 11{,}439.397$; $N = 7$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{11{,}439.397 - \frac{(282.94)^2}{7}}{7} \approx 0.423$ (km/h)$^2$; $\sigma = \sqrt{\frac{11{,}439.397 - \frac{(282.94)^2}{7}}{7}} \approx 0.651$ km/h
Ethan: $\mu = \frac{\sum x_i}{N} = \frac{9 + 24 + 8 + 9 + 5 + 8 + 9 + 10 + 8 + 10}{10} = \frac{100}{10} = 10$ fish;
Range = Largest Data Value - Smallest Data Value = 24 - 5 = 19 fish
Drew: $\mu = \frac{\sum x_i}{N} = \frac{15 + 2 + 3 + 18 + 20 + 1 + 17 + 2 + 19 + 3}{10} = \frac{100}{10} = 10$ fish;
Range = Largest Data Value - Smallest Data Value = 20 - 1 = 19 fish
Both fishermen have the same mean and range, so these values do not indicate any
differences between their catches per day.
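Ethan: $\sum x_i = 100$; $\sum x_i^2 = 1236$; $N = 10$;
$\sigma = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\frac{1236 - \frac{(100)^2}{10}}{10}} \approx 4.9$ fish
Drew: $\sum x_i = 100$; $\sum x_i^2 = 1626$; $N = 10$;
$\sigma = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\frac{1626 - \frac{(100)^2}{10}}{10}} \approx 7.9$ fish
Yes, now there appears to be a difference in the two fishermen's records. Ethan had a
more consistent fishing record, which is indicated by the smaller standard deviation.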
(c) Answers will vary. One possibility follows: The range is limited as a measure of
dispersion because it does not take all of the data values into account. It is obtained by
using only the two most extreme data values. Since the standard deviation utilizes all
of the data values, it provides a better overall representation of dispersion.
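30. (a) Range = Largest Data Value - Smallest Data Value = 349 - 180 = 169 lb
$\sum x_i = 8591$; $\sum x_i^2 = 2{,}332{,}051$; $N = 33$; $\mu = \frac{\sum x_i}{N} = \frac{8591}{33} \approx 260.3$ lb
$\sigma = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\frac{2{,}332{,}051 - \frac{(8591)^2}{33}}{33}} \approx 53.8$ lb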
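(b) Range = Largest Data Value - Smallest Data Value = 306 - 177 = 129 lb
$\sum x_i = 5889$; $\sum x_i^2 = 1{,}481{,}833$; $N = 24$; $\mu = \frac{\sum x_i}{N} = \frac{5889}{24} \approx 245.4$ lb
$\sigma = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\frac{1{,}481{,}833 - \frac{(5889)^2}{24}}{24}} \approx 39.2$ lb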
(c) The weights of the offense have the greater dispersion. The offense has both the larger
range and the larger standard deviation.
31. Range = Largest Data Value - Smallest Data Value = 73 - 28 = 45.
For the sample variance and sample standard deviation, we use the computational formula:
$\sum x_i = 2045$; $\sum x_i^2 = 109{,}151$; $n = 40$;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{109{,}151 - \frac{(2045)^2}{40}}{40 - 1} \approx 118.0$; $s = \sqrt{\frac{109{,}151 - \frac{(2045)^2}{40}}{40 - 1}} \approx 10.9$
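$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{1355.6208 - \frac{(205.92)^2}{35}}{35 - 1}}$

$\sum x_i = 43.71$; $\sum x_i^2 = 38.2887$; $n = 50$;
$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{38.2887 - \frac{(43.71)^2}{50}}{50 - 1}} \approx 0.04$ g

$\sum x_i = 4582$; $\sum x_i^2 = 478{,}832$; $n = 44$;
$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{478{,}832 - \frac{(4582)^2}{44}}{44 - 1}} \approx 6$ sec.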
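Car 1: $\sum x_i = 3352$; $\sum x_i^2 = 755{,}712$; $n = 15$
Measures of Center:
$\bar{x} = \frac{\sum x_i}{n} = \frac{3352}{15} \approx 223.5$ miles; Mode: none;
M = 223 miles (the 8th value in the ordered data)
Measures of Dispersion:
Range = Largest Data Value - Smallest Data Value = 271 - 178 = 93 miles;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{755{,}712 - \frac{(3352)^2}{15}}{15 - 1} \approx 475.1$ miles$^2$;
$s = \sqrt{\frac{755{,}712 - \frac{(3352)^2}{15}}{15 - 1}} \approx 21.8$ miles

Car 2: $\sum x_i = 3558$; $\sum x_i^2 = 877{,}654$; $n = 15$
Measures of Center:
$\bar{x} = \frac{\sum x_i}{n} = \frac{3558}{15} = 237.2$ miles; Mode: none;
M = 230 miles (the 8th value in the ordered data)
Measures of Dispersion:
Range = Largest Data Value - Smallest Data Value = 326 - 160 = 166 miles;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{877{,}654 - \frac{(3558)^2}{15}}{15 - 1} \approx 2406.9$ miles$^2$;
$s = \sqrt{\frac{877{,}654 - \frac{(3558)^2}{15}}{15 - 1}} \approx 49.1$ miles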
The distribution for Car 1 is symmetric since the mean and median are approximately
equal. The distribution for Car 2 is skewed right slightly since the mean is larger than the
median. Both distributions have similar measures of center, but Car 2 has more dispersion
which can be seen by its larger range, variance, and standard deviation. This means that
the distance Car 1 can be driven on 10 gallons of gas is more consistent. Thus, Car 1 is
probably the better car to buy.
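36. Fund A: $\sum x_i = 61$; $\sum x_i^2 = 356.12$; $n = 20$
Measures of Center:
$\bar{x} = \frac{\sum x_i}{n} = \frac{61}{20} = 3.05$; Mode: none; $M = \frac{3.0 + 3.1}{2} = 3.05$;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{356.12 - \frac{(61)^2}{20}}{20 - 1} \approx 8.95$; $s = \sqrt{\frac{356.12 - \frac{(61)^2}{20}}{20 - 1}} \approx 2.99$

Fund B: $\sum x_i = 68.1$; $\sum x_i^2 = 825.27$; $n = 20$
Measures of Center:
$\bar{x} = \frac{\sum x_i}{n} = \frac{68.1}{20} \approx 3.41$; Mode = 4.3; $M = \frac{3.5 + 3.8}{2} = 3.65$
Measures of Dispersion:
Range = Largest Data Value - Smallest Data Value = 12.9 - (-6.7) = 19.6;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{825.27 - \frac{(68.1)^2}{20}}{20 - 1} \approx 31.23$; $s = \sqrt{\frac{825.27 - \frac{(68.1)^2}{20}}{20 - 1}} \approx 5.59$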
The distribution for Mutual Fund A is symmetric since the mean and median are equal.
Likewise, the distribution for Mutual Fund B is approximately symmetric (but skewed left
slightly since the mean is smaller than the median). Mutual Fund B has a larger measure
of center and greater dispersion, which can be seen by its larger range, variance, and
standard deviation. This means that the rate of return on Mutual Fund A is generally
lower, but more consistent. The rate of return on Mutual Fund B is generally higher, but
more dispersed.
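$\sum x_i = 502.9$; $\sum x_i^2 = 9591.0556$; $n = 32$
$\bar{x} = \frac{\sum x_i}{n} = \frac{502.9}{32} \approx 15.716$; $M = \frac{15.92 + 16.26}{2} = 16.09$
Energy Stocks: $\sum x_i = 719.4$; $\sum x_i^2 = 21{,}213.3104$; $n = 32$
$\bar{x} = \frac{\sum x_i}{n} = \frac{719.4}{32} \approx 22.481$; $M = \frac{19.50 + 19.67}{2} = 19.585$
Energy Stocks have higher mean and median rates of return.
$s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{9591.0556 - \frac{(502.9)^2}{32}}{32 - 1}} \approx 7.378$
Energy Stocks: $s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{21{,}213.3104 - \frac{(719.4)^2}{32}}{32 - 1}} \approx 12.751$
Energy Stocks are riskier since they have a larger standard deviation.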
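American League: $\sum x_i = 166.26$; $\sum x_i^2 = 715.1876$; $n = 40$
$\bar{x} = \frac{\sum x_i}{n} = \frac{166.26}{40} \approx 4.157$; $M = \frac{4.18 + 4.21}{2} = 4.195$
National League: $\sum x_i = 149.93$; $\sum x_i^2 = 576.4971$; $n = 40$
$\bar{x} = \frac{\sum x_i}{n} = \frac{149.93}{40} \approx 3.748$; $M = \frac{3.84 + 3.87}{2} = 3.855$
The American League has both the higher mean and median earned-run average.
American League: $s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{715.1876 - \frac{(166.26)^2}{40}}{40 - 1}} \approx 0.787$
National League: $s = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{576.4971 - \frac{(149.93)^2}{40}}{40 - 1}} \approx 0.610$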
(b) By Chebyshev's inequality, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\% = \left(1 - \frac{1}{2.5^2}\right) \cdot 100\% = 84\%$ of
gasoline prices are within k = 2.5 standard deviations of the mean. Now,
1.37 - 2.5(0.05) = 1.245 and 1.37 + 2.5(0.05) = 1.495. Thus, the gasoline prices that
are within 2.5 standard deviations of the mean are from $1.245 to $1.495.
(c) Since 1.27 is exactly k = 2 standard deviations below the mean [1.27 = 1.37 - 2(0.05)]
and 1.47 is exactly k = 2 standard deviations above the mean [1.47 = 1.37 + 2(0.05)],
1
1
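$\sum x_i = 500$; $\sum x_i^2 = 26{,}600$; $N = 10$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{26{,}600 - \frac{(500)^2}{10}}{10} = 160$ thousand $\$^2$; $\sigma = \sqrt{\frac{26{,}600 - \frac{(500)^2}{10}}{10}} \approx \$12.6$ thousand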
(b) Add $2500 ($2.5 thousand) to each salary to form the new data set.
New data set: 32.5, 32.5, 47.5, 52.5, 52.5, 52.5, 57.5, 57.5, 62.5, 77.5
Range = Largest Data Value Smallest Data Value = 77.5 32.5 = $45 thousand.
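$\sum x_i = 525$; $\sum x_i^2 = 29{,}162.5$; $N = 10$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{29{,}162.5 - \frac{(525)^2}{10}}{10} = 160$ thousand $\$^2$; $\sigma = \sqrt{\frac{29{,}162.5 - \frac{(525)^2}{10}}{10}} \approx \$12.6$ thousand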
All three measures of variability remain the same.
(c) Multiply each original data value by 1.05 to generate the new data set.
New data set: 31.5, 31.5, 47.25, 52.5, 52.5, 52.5, 57.75, 57.75, 63, 78.75
Range = Largest Data Value Smallest Data Value = 78.75 31.5 = $47.25 thousand.
$\sum x_i = 525$; $\sum x_i^2 = 29{,}326.5$; $N = 10$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{29{,}326.5 - \frac{(525)^2}{10}}{10} = 176.4$ thousand $\$^2$; $\sigma = \sqrt{\frac{29{,}326.5 - \frac{(525)^2}{10}}{10}} \approx \$13.3$ thousand
All three measures of variability are larger than original, showing greater dispersion of
salaries. (Note that $R$ and $\sigma$ are each 5% larger than original, and $\sigma^2$ is 1.1025 times
the original, which is $(1.05)^2$.)
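Parts (b) and (c) illustrate a general pattern: adding a constant leaves the standard deviation unchanged, while multiplying by a constant scales it by that constant. A minimal sketch follows; the original salary list is an assumption, reconstructed by subtracting 2.5 from the "new data set" given in part (b).

```python
import math
from statistics import pstdev

# Salaries in thousands of dollars (assumed: part (b) data minus 2.5).
salaries = [30, 30, 45, 50, 50, 50, 55, 55, 60, 75]

shifted = [x + 2.5 for x in salaries]   # part (b): add $2.5 thousand to each salary
scaled = [x * 1.05 for x in salaries]   # part (c): 5% raise for each salary

sigma = pstdev(salaries)                # population standard deviation
sigma_shifted = pstdev(shifted)
sigma_scaled = pstdev(scaled)

# A shift does not change the spread; a scale factor multiplies it.
print(math.isclose(sigma_shifted, sigma))
print(math.isclose(sigma_scaled, 1.05 * sigma))
```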
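$\sum x_i = 525$; $\sum x_i^2 = 30{,}975$; $N = 10$;
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N} = \frac{30{,}975 - \frac{(525)^2}{10}}{10} = 341.25$ thousand $\$^2$; $\sigma = \sqrt{\frac{30{,}975 - \frac{(525)^2}{10}}{10}} \approx \$18.5$ thousand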
All three measures of variability are significantly larger than original.
47. Sample size of 5:
All data recorded correctly: $s \approx 5.3$.
106 recorded incorrectly as 160: $s \approx 27.9$.
Sample size of 12:
All data recorded correctly: $s \approx 14.7$.
106 recorded incorrectly as 160: $s \approx 22.7$.
Sample size of 30:
All data recorded correctly: $s \approx 15.9$.
106 recorded incorrectly as 160: $s \approx 19.2$.
As the sample size increases, the impact of the misrecorded data value on the standard
deviation decreases.
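48. We use the computational formula:
$\sum x_i = 312$; $\sum x_i^2 = 24{,}336$; $n = 4$;
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1} = \frac{24{,}336 - \frac{(312)^2}{4}}{4 - 1} = \frac{0}{3} = 0$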
If all values in a data set are identical, then there is zero variance.
49. (a) The coefficient of variation for blood pressure before exercise is
$\frac{14.1}{121} \cdot 100\% = 11.65\%$, while the coefficient of variation for blood pressure after exercise is
$\frac{18.1}{135.9} \cdot 100\% = 13.32\%$. There is more variability in systolic blood pressure after exercise.
(b) The coefficient of variation for free calcium concentration in the group of people with
normal blood pressure is $\frac{16.1}{107.9} \cdot 100\% = 14.92\%$, while the coefficient of variation for
free calcium concentration in the group of people with high blood pressure is
$\frac{31.7}{168.2} \cdot 100\% = 18.85\%$. There is more variability in free calcium concentration in the
high blood pressure group.
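The coefficient of variation in Problem 49 is the standard deviation expressed as a percentage of the mean. A small sketch, with a hypothetical helper name `coefficient_of_variation`, reproduces the part (a) values:

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

# Part (a): systolic blood pressure before and after exercise.
cv_before = coefficient_of_variation(14.1, 121)
cv_after = coefficient_of_variation(18.1, 135.9)

print(round(cv_before, 2), round(cv_after, 2))
```

Because the result is unit-free, it allows the spread of two variables with different means to be compared directly.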
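MAD = $\frac{\sum |x_i - \bar{x}|}{n}$

Sample Mean, $\bar{x}$   Absolute Deviations, $|x_i - \bar{x}|$
381.75                   38.25
381.75                   80.25
381.75                   27.25
381.75                   145.75

$\sum (x_i - \bar{x}) = 0$; $\sum |x_i - \bar{x}| = 291.50$

MAD = $\frac{\sum |x_i - \bar{x}|}{n} = \frac{291.50}{4}$
deviation of $s \approx \$99.81$.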
51. (a) Skewness = $\frac{3(50 - 40)}{10} = 3$. The distribution is skewed to the right.
(b) Skewness = $\frac{3(100 - 100)}{15} = 0$. The distribution is perfectly symmetric.
(c) Skewness = $\frac{3(400 - 500)}{120} = -2.5$. The distribution is skewed to the left.
(d) Skewness = $\frac{3(0.8742 - 0.88)}{0.0397} \approx -0.44$. The distribution is slightly skewed to the left.
(e) Skewness = $\frac{3(104.136 - 104)}{6.249} \approx 0.07$. The distribution is symmetric.
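The five skewness values in Problem 51 all come from the same expression, 3(mean - median)/(standard deviation). The following sketch (function name assumed for illustration) evaluates it for each part:

```python
def pearson_skewness(mean, median, std_dev):
    """Skewness index used in Problem 51: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev

skew_a = pearson_skewness(50, 40, 10)
skew_b = pearson_skewness(100, 100, 15)
skew_c = pearson_skewness(400, 500, 120)
skew_d = pearson_skewness(0.8742, 0.88, 0.0397)
skew_e = pearson_skewness(104.136, 104, 6.249)

# Positive values indicate right skew, negative values left skew.
print(skew_a, skew_b, skew_c, round(skew_d, 2), round(skew_e, 2))
```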
52. (a) Reading from the graph, the average annual return for a portfolio that is 10% foreign is
14.9%. The level of risk is 14.7%.
(b) To best minimize risk, 30% should be invested in foreign stocks. According to the
graph, a 30% investment in foreign stocks has the smallest standard deviation (level of
risk) at about 14.3%.
(c) Answers will vary. One possibility follows: The risk decreases because a portfolio
including foreign stocks is more diversified.
(d) According to Chebyshev's theorem, at least 75% of returns are within k = 2 standard
deviations of the mean. Thus, at least 75% of returns are between $\bar{x} - ks =
15.8 - 2(14.3) = -12.8\%$ and $\bar{x} + ks = 15.8 + 2(14.3) = 44.4\%$. By Chebyshev's
theorem, at least 88.9% of returns are within k = 3 standard deviations of the mean.
Thus, at least 88.9% of returns are between $\bar{x} - ks = 15.8 - 3(14.3) = -27.1\%$ and
$\bar{x} + ks = 15.8 + 3(14.3) = 58.7\%$. An investor should not be surprised if she has a
negative rate of return. Chebyshev's theorem indicates that a negative return is fairly
common.
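The Chebyshev bounds and intervals in part (d) can be reproduced with a few lines. This is a sketch only; the helper names `chebyshev_bound` and `k_sigma_interval` are assumptions made for this example.

```python
def chebyshev_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2

def k_sigma_interval(mean, std_dev, k):
    """Interval from mean - k*s to mean + k*s."""
    return mean - k * std_dev, mean + k * std_dev

# Mean return 15.8% and standard deviation 14.3%, as read from the graph in Problem 52.
low2, high2 = k_sigma_interval(15.8, 14.3, 2)
low3, high3 = k_sigma_interval(15.8, 14.3, 3)

print(chebyshev_bound(2), round(low2, 1), round(high2, 1))
print(round(chebyshev_bound(3) * 100, 1), round(low3, 1), round(high3, 1))
```

Both lower endpoints are negative, which is the basis for the remark that a negative return should not be surprising.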
(a) $\bar{x}_A = \frac{\sum x_i}{n} = \frac{546.2}{6} \approx 91.03$ g; $M_A = \frac{90.9 + 91.2}{2} = \frac{182.1}{2} = 91.05$ g;
There are 2 modes: 90.8 g and 91.2 g (each value occurs twice).
(b) $\sum x_i = 546.2$; $\sum x_i^2 = 49{,}722.66$; $n = 6$;
$s_A = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{49{,}722.66 - \frac{(546.2)^2}{6}}{6 - 1}} \approx 0.23$ g
(c) $\bar{x}_B = \frac{\sum x_i}{n} = \frac{522.3}{6} = 87.05$ g; $M_B = \frac{87.0 + 87.1}{2} = \frac{174.1}{2} = 87.05$ g
There are 2 modes: 87.0 g and 87.2 g (each value occurs twice).
(d) $\sum x_i = 522.3$; $\sum x_i^2 = 45{,}466.33$; $n = 6$;
$s_B = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\frac{45{,}466.33 - \frac{(522.3)^2}{6}}{6 - 1}} \approx 0.15$ g
(e)
Product A | Stem | Product B
          |  86  | 8
          |  87  | 0 0 1 2 2
          |  88  |
          |  89  |
    9 8 8 |  90  |
    3 2 2 |  91  |
Yes, there appears to be a difference in these two products' ability to mitigate water seepage. All
6 of the measurements for product B are less than the measurements for product A. Although it is
not clear whether there is any practical difference in these two products' ability to mitigate water
seepage, product B appears to do a better job.
Section 3.3 Measures of Central Tendency and Dispersion from Grouped Data
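3.
Class    Midpoint, $x_i$   Frequency, $f_i$   $x_i f_i$   $\bar{x}$   $x_i - \bar{x}$   $(x_i - \bar{x})^2 f_i$
10-19    15                8                  120         32.8333     -17.8333          2544.2127
20-29    25                16                 400         32.8333     -7.8333           981.7694
30-39    35                21                 735         32.8333     2.1667            98.5864
40-49    45                11                 495         32.8333     12.1667           1628.3145
50-59    55                4                  220         32.8333     22.1667           1965.4504

Midpoints: (10 + 20)/2 = 15, (20 + 30)/2 = 25, and so on.
$\sum f_i = 60$; $\sum x_i f_i = 1970$; $\sum (x_i - \bar{x})^2 f_i = 7218.3334$

$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{1970}{60} \approx 32.8333 \approx \$32.83$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}} = \sqrt{\frac{7218.3334}{60 - 1}} \approx \$11.06$

4.
Class    Midpoint, $x_i$   Frequency, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
1-5      3.5               11                 38.5        14.5714    -11.0714      1348.3349
6-10     8.5               0                  0           14.5714    -6.0714       0
11-15    13.5              5                  67.5        14.5714    -1.0714       5.7395
16-20    18.5              6                  111         14.5714    3.9286        92.6034
21-25    23.5              1                  23.5        14.5714    8.9286        79.7199
26-30    28.5              2                  57          14.5714    13.9286       388.0118
31-35    33.5              1                  33.5        14.5714    18.9286       358.2919
36-40    38.5              2                  77          14.5714    23.9286       1145.1558

Midpoints: (1 + 6)/2 = 3.5, (6 + 11)/2 = 8.5, and so on.
$\sum f_i = 28$; $\sum x_i f_i = 408$; $\sum (x_i - \mu)^2 f_i = 3417.8572$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{408}{28} \approx 14.5714 \approx 14.6$ points; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{3417.8572}{28}} \approx 11.0$ points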
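5.
Class    Midpoint, $x_i$   Frequency, $f_i$   $x_i f_i$   $\mu$    $x_i - \mu$   $(x_i - \mu)^2 f_i$
0-9      5                 31                 155         17.3     -12.3         4689.99
10-19    15                39                 585         17.3     -2.3          206.31
20-29    25                17                 425         17.3     7.7           1007.93
30-39    35                6                  210         17.3     17.7          1879.74
40-49    45                4                  180         17.3     27.7          3069.16
50-59    55                2                  110         17.3     37.7          2842.58
60-69    65                1                  65          17.3     47.7          2275.29

Midpoints: (0 + 10)/2 = 5, (10 + 20)/2 = 15, and so on.
$\sum f_i = 100$; $\sum x_i f_i = 1730$; $\sum (x_i - \mu)^2 f_i = 15{,}971$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{1730}{100} = 17.3$ days; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{15{,}971}{100}} \approx 12.6$ days

6.
Class    Midpoint, $x_i$   Frequency, $f_i$   $x_i f_i$   $\bar{x}$   $x_i - \bar{x}$   $(x_i - \bar{x})^2 f_i$
0-9      5                 24                 120         21.6        -16.6             6613.44
10-19    15                14                 210         21.6        -6.6              609.84
20-29    25                39                 975         21.6        3.4               450.84
30-39    35                18                 630         21.6        13.4              3232.08
40-49    45                5                  225         21.6        23.4              2737.8

Midpoints: (0 + 10)/2 = 5, (10 + 20)/2 = 15, and so on.
$\sum f_i = 100$; $\sum x_i f_i = 2160$; $\sum (x_i - \bar{x})^2 f_i = 13{,}644$

$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{2160}{100} = 21.6$ hr/wk; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}} = \sqrt{\frac{13{,}644}{100 - 1}} \approx 11.7$ hr/wk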
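The grouped-data calculations above all follow the same recipe: weight each class midpoint by its frequency. A minimal sketch, using the midpoints and frequencies from the table for Problem 5 and treating the data as a population (as that solution does), is:

```python
import math

# Class midpoints and frequencies from the Problem 5 table (days).
midpoints = [5, 15, 25, 35, 45, 55, 65]
frequencies = [31, 39, 17, 6, 4, 2, 1]

total_f = sum(frequencies)                                     # sum of f_i
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))    # sum of x_i * f_i
grouped_mean = sum_xf / total_f

# Population form: divide by sum of f_i (a sample would divide by sum of f_i minus 1).
sum_sq = sum(f * (x - grouped_mean) ** 2 for x, f in zip(midpoints, frequencies))
grouped_sd = math.sqrt(sum_sq / total_f)

print(total_f, sum_xf, grouped_mean, round(grouped_sd, 1))
```

These are approximations to the mean and standard deviation of the raw data, since every observation in a class is replaced by the class midpoint.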
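7.
Class    Midpoint, $x_i$   Frequency, $f_i$ (in millions)   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
25-34    30                28.9                             867         44.4695    -14.4695      6050.6898
35-44    40                35.7                             1428        44.4695    -4.4695       713.1586
45-54    50                35.1                             1755        44.4695    5.5305        1073.5837
55-64    60                24.7                             1482        44.4695    15.5305       5957.5518

Midpoints: (25 + 35)/2 = 30, (35 + 45)/2 = 40, and so on.
$\sum f_i = 124.4$; $\sum x_i f_i = 5532$; $\sum (x_i - \mu)^2 f_i = 13{,}794.9839$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{5532}{124.4} \approx 44.4695 \approx 44.5$ yrs; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{13{,}794.9839}{124.4}} \approx 10.5$ yrs

8.
Class      Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$     $x_i - \mu$   $(x_i - \mu)^2 f_i$
0-0.9      0.5            539           269.5       2.7627    -2.2627       2759.5783
1.0-1.9    1.5            1             1.5         2.7627    -1.2627       1.5944
2.0-2.9    2.5            1336          3340        2.7627    -0.2627       92.1991
3.0-3.9    3.5            1363          4770.5      2.7627    0.7373        740.9422
4.0-4.9    4.5            289           1300.5      2.7627    1.7373        872.2631
5.0-5.9    5.5            21            115.5       2.7627    2.7373        157.3490
6.0-6.9    6.5            2             13          2.7627    3.7373        27.9348

Midpoints: (0 + 1)/2 = 0.5, (1 + 2)/2 = 1.5, and so on.
$\sum f_i = 3551$; $\sum x_i f_i = 9810.5$; $\sum (x_i - \mu)^2 f_i = 4651.8609$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{9810.5}{3551} \approx 2.7627 \approx 2.8$; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{4651.8609}{3551}} \approx 1.1$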
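9. (a)
Class      Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
50-59      55             1             55          80.9350    -25.9350      672.6242
60-69      65             308           20,020      80.9350    -15.9350      78,208.6613
70-79      75             1519          113,925     80.9350    -5.9350       53,505.5977
80-89      85             1626          138,210     80.9350    4.0650        26,868.3900
90-99      95             503           47,785      80.9350    14.0650       99,505.5851
100-109    105            11            1155        80.9350    24.0650       6370.3665

Midpoints: (50 + 60)/2 = 55, (60 + 70)/2 = 65, and so on.
$\sum f_i = 3968$; $\sum x_i f_i = 321{,}150$; $\sum (x_i - \mu)^2 f_i = 265{,}131.2248$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{321{,}150}{3968} \approx 80.9350 \approx 80.9$°F; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{265{,}131.2248}{3968}} \approx 8.2$°F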
(b) [Histogram; x-axis: Temperature; y-axis: Frequency]
(c) By the Empirical Rule, approximately 95% of the observations will be within 2 standard deviations of
the mean. Now, $\mu - 2\sigma = 80.9 - 2(8.2) = 64.5$ and $\mu + 2\sigma = 80.9 + 2(8.2) = 97.3$, so
95% of days in August will have temperatures between 64.5°F and 97.3°F.
10. (a)
Class    Midpoint, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
20-24    22.5              4             90          37.8333    -15.3333      940.4404
25-29    27.5              15            412.5       37.8333    -10.3333      1601.6563
30-34    32.5              27            877.5       37.8333    -5.3333       767.9904
35-39    37.5              40            1500        37.8333    -0.3333       4.4436
40-44    42.5              28            1190        37.8333    4.6667        609.7865
45-49    47.5              15            712.5       37.8333    9.6667        1401.6763
50-54    52.5              4             210         37.8333    14.6667       860.4484
55-59    57.5              2             115         37.8333    19.6667       773.5582

Midpoints: (20 + 25)/2 = 22.5, (25 + 30)/2 = 27.5, and so on.
$\sum f_i = 135$; $\sum x_i f_i = 5107.5$; $\sum (x_i - \mu)^2 f_i = 6960.0001$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{5107.5}{135} \approx 37.8333 \approx 37.8$ in; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{6960.0001}{135}} \approx 7.2$ in
(b) [Histogram; x-axis: Rainfall (inches); y-axis: Frequency]
(c) By the Empirical Rule, approximately 95% of the observations will be within 2 standard deviations of the
mean. Now, $\mu - 2\sigma = 37.8 - 2(7.2) = 23.4$ and $\mu + 2\sigma = 37.8 + 2(7.2) = 52.2$, so 95% of
annual rainfalls in St. Louis will be between 23.4 and 52.2 inches.
Class    Midpoint, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
15-19    17.5              93            1627.5      32.2721    -14.7721      20,293.99
20-24    22.5              511           11,497.5    32.2721    -9.7721       48,797.40
25-29    27.5              1628          44,770      32.2721    -4.7721       37,074.34
30-34    32.5              2832          92,040      32.2721    0.2279        147.09
35-39    37.5              1843          69,112.5    32.2721    5.2279        50,370.92
40-44    42.5              377           16,022.5    32.2721    10.2279       39,437.95

Midpoints: (15 + 20)/2 = 17.5, (20 + 25)/2 = 22.5, and so on.
$\sum f_i = 7284$; $\sum x_i f_i = 235{,}070$; $\sum (x_i - \mu)^2 f_i = 196{,}121.69$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{235{,}070}{7284} \approx 32.2721 \approx 32.3$ yr; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{196{,}121.69}{7284}} \approx 5.2$ yr
(b) [Histogram; x-axis: Mother's Age]
(c) By the Empirical Rule, approximately 95% of the observations will be within 2 standard deviations of
the mean. Now, $\mu - 2\sigma = 32.3 - 2(5.2) = 21.9$ and $\mu + 2\sigma = 32.3 + 2(5.2) = 42.7$, so
95% of mothers of multiple births will be between 21.9 and 42.7 years of age.
12. (a)
Class      Midpoint, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$       $x_i - \mu$   $(x_i - \mu)^2 f_i$
400-449    425               281           119,425     603.1482    -178.1482     8,918,035.5
450-499    475               577           274,075     603.1482    -128.1482     9,475,471.6
500-549    525               840           441,000     603.1482    -78.1482      5,129,998.6
550-599    575               1120          644,000     603.1482    -28.1482      887,399.7
600-649    625               1166          728,750     603.1482    21.8518       556,766.4
650-699    675               900           607,500     603.1482    71.8518       4,646,413.0
700-749    725               518           375,550     603.1482    121.8518      7,691,192.1
750-800    775.5             394           305,547     603.1482    172.3518      11,703,826.3

Midpoints: (400 + 450)/2 = 425, (450 + 500)/2 = 475, and so on.
$\sum f_i = 5796$; $\sum x_i f_i = 3{,}495{,}847$; $\sum (x_i - \mu)^2 f_i = 49{,}009{,}103.2$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{3{,}495{,}847}{5796} \approx 603.1482 \approx 603.1$; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{49{,}009{,}103.2}{5796}} \approx 92.0$
(b) [Histogram; x-axis: Score]
(c) By the Empirical Rule, approximately 95% of the observations will be within 2 standard deviations of
the mean. Now, $\mu - 2\sigma = 603.1 - 2(92.0) = 419.1$ and $\mu + 2\sigma = 603.1 + 2(92.0) = 787.1$, so
95% of ISACS college-bound seniors will have SAT Verbal scores between 419 and 787.
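Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\bar{x}$   $x_i - \bar{x}$   $(x_i - \bar{x})^2 f_i$
20-29    25             1             25          51.75       -26.75            715.5625
30-39    35             6             210         51.75       -16.75            1683.375
40-49    45             10            450         51.75       -6.75             455.625
50-59    55             14            770         51.75       3.25              147.875
60-69    65             6             390         51.75       13.25             1053.375
70-79    75             3             225         51.75       23.25             1621.6875

Midpoints: (20 + 30)/2 = 25, (30 + 40)/2 = 35, and so on.
$\sum f_i = 40$; $\sum x_i f_i = 2070$; $\sum (x_i - \bar{x})^2 f_i = 5677.5$

$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{2070}{40} = 51.75$; $s = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}} = \sqrt{\frac{5677.5}{40 - 1}}$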
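14.
Class     Midpoint, $x_i$   Frequency, $f_i$   $x_i f_i$   $(x_i - \bar{x})^2 f_i$
3-4.99    4                 12                 48          48
5-6.99    6                 14                 84          0
7-8.99    8                 6                  48          24
9-10.99   10                3                  30          48

Midpoints: (3 + 5)/2 = 4, (5 + 7)/2 = 6, and so on.
$\sum f_i = 35$; $\sum x_i f_i = 210$; $\sum (x_i - \bar{x})^2 f_i = 120$

$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{210}{35} = 6$ million shares (compared to 5.88 million shares using the raw data);
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\left(\sum f_i\right) - 1}} = \sqrt{\frac{120}{35 - 1}} \approx 1.879$ million shares (compared to 2.059 million shares using
the raw data).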
15. GPA = $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
19. (a)
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
0-9      5              20,225        101,125     35.6058    -30.6058      18,945,060.7
10-19    15             21,375        320,625     35.6058    -20.6058      9,075,803.5
20-29    25             20,437        510,925     35.6058    -10.6058      2,298,814.9
30-39    35             21,176        741,160     35.6058    -0.6058       7,771.5
40-49    45             22,138        996,210     35.6058    9.3942        1,953,700.5
50-59    55             16,974        933,570     35.6058    19.3942       6,384,515.4
60-69    65             10,289        668,785     35.6058    29.3942       8,889,891.4
70-79    75             6,923         519,225     35.6058    39.3942       10,743,824.4
80-89    85             3,053         259,505     35.6058    49.3942       7,448,669.7
90-99    95             436           41,420      35.6058    59.3942       1,538,064.6

$\sum f_i = 143{,}026$; $\sum x_i f_i = 5{,}092{,}550$; $\sum (x_i - \mu)^2 f_i = 67{,}286{,}116.6$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{5{,}092{,}550}{143{,}026} \approx 35.6058 \approx 35.6$ yr; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{67{,}286{,}116.6}{143{,}026}} \approx 21.7$ yr
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
0-9      5              19,319        96,595      38.0872    -33.0872      21,149,722.6
10-19    15             20,295        304,425     38.0872    -23.0872      10,817,616.6
20-29    25             19,459        486,475     38.0872    -13.0872      3,332,836.4
30-39    35             20,936        732,760     38.0872    -3.0872       199,536.9
40-49    45             22,586        1,016,370   38.0872    6.9128        1,079,312.8
50-59    55             17,864        982,520     38.0872    16.9128       5,109,868.6
60-69    65             11,563        751,595     38.0872    26.9128       8,375,067.1
70-79    75             9,121         684,075     38.0872    36.9128       12,427,862.3
80-89    85             5,367         456,195     38.0872    46.9128       11,811,751.5
90-99    95             1,215         115,425     38.0872    56.9128       3,935,466.2

$\sum f_i = 147{,}725$; $\sum x_i f_i = 5{,}626{,}435$; $\sum (x_i - \mu)^2 f_i = 78{,}239{,}041.0$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{5{,}626{,}435}{147{,}725} \approx 38.0872 \approx 38.1$ yr; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{78{,}239{,}041}{147{,}725}} \approx 23.0$ yr
(c) & (d) Females have both a higher mean age and more dispersion in age.
20. (a)
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
10-14    12.5           1.1           13.75       25.9996    -13.4996      200.4631
15-19    17.5           53.0          927.5       25.9996    -8.4996       3828.8896
20-24    22.5           115.1         2589.75     25.9996    -3.4996       1409.6527
25-29    27.5           112.9         3104.75     25.9996    1.5004        254.1605
30-34    32.5           61.9          2011.75     25.9996    6.5004        2615.5969
35-39    37.5           19.8          742.5       25.9996    11.5004       2618.7322
40-44    42.5           3.9           165.75      25.9996    16.5004       1061.8265
45-49    47.5           0.2           9.5         25.9996    21.5004       92.4534

$\sum f_i = 367.9$; $\sum x_i f_i = 9565.25$; $\sum (x_i - \mu)^2 f_i = 12{,}081.7749$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{9565.25}{367.9} \approx 25.9996 \approx 26.0$ yr; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{12{,}081.7749}{367.9}} \approx 5.7$ yr
(b)
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
10-14    12.5           0.7           8.75        27.6180    -15.118       159.9877
15-19    17.5           43.0          752.5       27.6180    -10.118       4402.0787
20-24    22.5           103.6         2331        27.6180    -5.118        2713.6905
25-29    27.5           113.6         3124        27.6180    -0.118        1.5818
30-34    32.5           91.5          2973.75     27.6180    4.882         2180.8040
35-39    37.5           41.4          1552.5      27.6180    9.882         4042.8725
40-44    42.5           8.3           352.75      27.6180    14.882        1838.2336
45-49    47.5           0.5           23.75       27.6180    19.882        197.6470

$\sum f_i = 402.6$; $\sum x_i f_i = 11{,}119$; $\sum (x_i - \mu)^2 f_i = 15{,}536.8952$

$\mu = \frac{\sum x_i f_i}{\sum f_i} = \frac{11{,}119}{402.6} \approx 27.6180 \approx 27.6$ yr; $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\frac{15{,}536.8952}{402.6}} \approx 6.2$ yr
The year 2002 has both the higher mean age of mothers and more dispersion in
the age of mothers.

21.
Class    Frequency, $f$   Cumulative Frequency, $CF$
0-9      31               31
10-19    39               70
20-29    17               87
30-39    6                93
40-49    4                97
50-59    2                99
60-69    1                100

$\frac{n}{2} = \frac{100}{2} = 50$, which is in the second class, 10-19. Then $M = L + \frac{\frac{n}{2} - CF}{f}(i) = 10 + \frac{50 - 31}{39}(20 - 10) \approx 14.9$ days.
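The grouped-median formula $M = L + \frac{n/2 - CF}{f}(i)$ lends itself to a short routine. The sketch below is an illustration only; the function name `grouped_median` and its argument layout are assumptions. It is applied to the Problem 21 table, where $CF$ is the cumulative frequency before the median class.

```python
def grouped_median(lower_limits, frequencies, class_width):
    """Approximate the median from a frequency table by linear interpolation."""
    half = sum(frequencies) / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and cumulative + f >= half:
            # The median class: interpolate within it.
            return lower + (half - cumulative) / f * class_width
        cumulative += f
    raise ValueError("frequency table is empty")

lower_limits = [0, 10, 20, 30, 40, 50, 60]
frequencies = [31, 39, 17, 6, 4, 2, 1]

median_days = grouped_median(lower_limits, frequencies, class_width=10)
print(round(median_days, 1))
```

The interpolation assumes the observations are spread evenly across the median class, so the result is an estimate rather than the exact median of the raw data.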
157
Class
09
10 19
20 29
30 39
40 49
Frequency, f
24
14
39
18
5
Cumulative Frequency, CF
24
38
77
95
100
n 100
=
= 50 , which is in the
2
2
n
CF
50 38
i = 20 +
third class, 20 29. Then M = L + 2
( 30 20 ) 23.1 hr/wk .
f
39
23.
Class
25 34
35 44
45 54
55 64
Frequency, f (millions)
28.9
35.7
35.1
24.7
n 124.4
=
= 62.2 ,
2
2
24.
Class      Frequency, f   Cumulative Frequency, CF
0–0.9      539            539
1.0–1.9    1              540
2.0–2.9    1336           1876
3.0–3.9    1363           3239
4.0–4.9    289            3528
5.0–5.9    21             3549
6.0–6.9    2              3551
$\dfrac{n}{2} = \dfrac{3551}{2} = 1775.5$, which is in the third class, 2.0–2.9. Then $M = L + \dfrac{\frac{n}{2} - CF}{f}(i) = 2.0 + \dfrac{1775.5 - 540}{1336}(3.0 - 2.0) \approx 2.9$.
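The grouped-data median formula used in the frequency tables above can be checked with a short Python sketch. The function name `grouped_median` and its argument layout are illustrative assumptions, not part of the text; it simply applies $M = L + \frac{n/2 - CF}{f}(i)$ to the class that contains position $n/2$.

```python
def grouped_median(lower_limits, freqs, width):
    """Estimate the median of grouped data: M = L + ((n/2 - CF) / f) * i.

    lower_limits: lower class limit L of each class, in order
    freqs:        frequency f of each class
    width:        class width i (assumed equal for all classes)
    """
    half = sum(freqs) / 2
    cumulative = 0  # CF: total frequency of the classes before the current one
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("no class contains the median position")


# The three complete frequency tables shown above.
m_days = grouped_median([0, 10, 20, 30, 40, 50, 60], [31, 39, 17, 6, 4, 2, 1], 10)
m_hours = grouped_median([0, 10, 20, 30, 40], [24, 14, 39, 18, 5], 10)
m_third = grouped_median([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                         [539, 1, 1336, 1363, 289, 21, 2], 1.0)

print(round(m_days, 1), round(m_hours, 1), round(m_third, 1))
```

Rounded to one decimal place, the three results agree with the values 14.9, 23.1, and 2.9 worked out by hand.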
25. From the table in Problem 5, the modal class (highest frequency class) is 10–19 days.
26. From the table in Problem 6, the modal class (highest frequency class) is 20–29 hr/wk.
27. From the table in Problem 7, the modal class (highest frequency class) is 35–44 years.
28. From the table in Problem 8, the modal class (highest frequency class) is 3.0–3.9.
29. (a) Answers will vary. One possibility follows: Many colleges do not permit students
under age 16 to enroll in courses, so a reasonable midpoint to use would be 17.
(b) Answers will vary. One possibility follows: Since it is not likely that many students
would be over 70 years old, a reasonable midpoint would be 60.
(c) Answers will vary depending on the choices of midpoints in parts (a) and (b). Using the
midpoints chosen above:
Class          Midpoint, $x_i$          Freq, $f_i$   $x_i f_i$
Less than 18   17                       139           2363
18–19          (18 + 20)/2 = 19         4089          77,691
20–21          (20 + 22)/2 = 21         3357          70,497
22–24          (22 + 25)/2 = 23.5       1661          39,033.5
25–29          (25 + 30)/2 = 27.5       470           12,925
30–34          (30 + 35)/2 = 32.5       145           4712.5
35–39          (35 + 40)/2 = 37.5       95            3562.5
40–49          (40 + 50)/2 = 45         117           5265
50 and above   60                       21            1260
$\sum f_i = 10{,}094$;  $\sum x_i f_i = 217{,}309.5$
The estimated mean age is $\dfrac{\sum x_i f_i}{\sum f_i} = \dfrac{217{,}309.5}{10{,}094} \approx 21.5$ years. This estimate is a little higher than the actual
mean age of 20.9 years.
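z-score for the 34-week gestation baby: $z = \dfrac{x - \mu}{\sigma} = \dfrac{2400 - 2600}{670} \approx -0.30$
z-score for the 40-week gestation baby: $z = \dfrac{x - \mu}{\sigma} = \dfrac{3300 - 3500}{475} \approx -0.42$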
The weight of a 34-week gestation baby is 0.30 standard deviations below the mean, while
the weight of a 40-week gestation baby is 0.42 standard deviations below the mean. Thus,
the 40-week gestation baby weighs less relative to the gestation period.
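z-score for the 34-week gestation baby: $z = \dfrac{x - \mu}{\sigma} = \dfrac{3000 - 2600}{670} \approx 0.60$
z-score for the 40-week gestation baby: $z = \dfrac{x - \mu}{\sigma} = \dfrac{3900 - 3500}{475} \approx 0.84$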
The weight of a 34-week gestation baby is 0.60 standard deviations above the mean, while
the weight of a 40-week gestation baby is 0.84 standard deviations above the mean. Thus,
the 34-week gestation baby weighs less relative to the gestation period.
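z-score for the 75-inch man: $z = \dfrac{x - \mu}{\sigma} = \dfrac{75 - 69.6}{2.7} = 2$
z-score for the 70-inch woman: $z = \dfrac{x - \mu}{\sigma} = \dfrac{70 - 64.1}{2.6} \approx 2.27$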
The height of the 75-inch man is 2 standard deviations above the mean, while the height of
a 70-inch woman is 2.27 standard deviations above the mean. Thus, the 70-inch woman is
relatively taller than the 75-inch man.
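z-score for the 68-inch man: $z = \dfrac{x - \mu}{\sigma} = \dfrac{68 - 69.6}{2.7} \approx -0.59$
z-score for the 62-inch woman: $z = \dfrac{x - \mu}{\sigma} = \dfrac{62 - 64.1}{2.6} \approx -0.81$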
The height of the 68-inch man is 0.59 standard deviations below the mean, while the height
of a 62-inch woman is 0.81 standard deviations below the mean. Thus, the 68-inch man is
relatively taller than the 62-inch woman.
z-score for Jake Peavy: $z = \dfrac{x - \mu}{\sigma} = \dfrac{2.27 - 4.198}{0.772} \approx -2.50$
z-score for Johann Santana: $z = \dfrac{x - \mu}{\sigma} = \dfrac{2.61 - 4.338}{0.785} \approx -2.20$
Jake Peavy's 2004 ERA was 2.50 standard deviations below the mean, while Johann
Santana's 2004 ERA was 2.20 standard deviations below the mean. Thus, Peavy had the
better year relative to his peers.
z-score for Ted Williams: $z = \dfrac{x - \mu}{\sigma} = \dfrac{0.406 - 0.28062}{0.03281} \approx 3.82$
z-score for Ichiro Suzuki: $z = \dfrac{x - \mu}{\sigma} = \dfrac{0.372 - 0.26992}{0.02154} \approx 4.74$
Ted Williams' 1941 batting average was 3.82 standard deviations above the mean, while
Ichiro Suzuki's 2004 batting average was 4.74 standard deviations above the mean. Thus,
Suzuki had the better year relative to his peers.
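As a quick check of the z-score comparisons above, the following Python sketch recomputes the four baseball z-scores from the means and standard deviations quoted in the solutions. The helper name `z_score` is an illustrative choice.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


# Values taken from the worked solutions above.
peavy = z_score(2.27, 4.198, 0.772)
santana = z_score(2.61, 4.338, 0.785)
williams = z_score(0.406, 0.28062, 0.03281)
suzuki = z_score(0.372, 0.26992, 0.02154)

# For ERA a lower (more negative) z-score is better; for batting average a higher one is.
better_pitcher = "Peavy" if peavy < santana else "Santana"
better_hitter = "Suzuki" if suzuki > williams else "Williams"
print(round(peavy, 2), round(santana, 2), better_pitcher)
print(round(williams, 2), round(suzuki, 2), better_hitter)
```

Because each value is measured against its own league-year distribution, the comparison is between relative positions rather than raw statistics.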
13. The data provided in Table 17 are already listed in ascending order.
(a) $i = \dfrac{k}{100}(n + 1) = \dfrac{40}{100}(51 + 1) = 20.8$. Since $i = 20.8$ is not an integer, we average
the 20th and 21st data values: $P_{40} = \dfrac{325.5 + 333.2}{2} = 329.35$. This means that
approximately 40% of the states have violent crime rates less than 329.35 crimes per
100,000 population, and approximately 60% of the states have violent crime rates more
than this.
Using the mean and standard deviation, the z-score for x = 0.97 inches is $z = \dfrac{0.97 - \text{mean}}{1.7790} \approx -1.70$. The
rainfall in 1971 (0.97 inches) is 1.70 standard deviations below the mean.
(b) The data provided are already listed in ascending order. There are n = 20 data points.
The index for the first quartile is $i = \dfrac{25}{100}(20 + 1) = 5.25$. Since $i = 5.25$ is not an
integer, we average the 5th and 6th data values: $Q_1 = \dfrac{2.47 + 2.78}{2} = 2.625$ inches. The
index for the second quartile is $i = \dfrac{50}{100}(20 + 1) = 10.5$. Since $i = 10.5$ is not an
integer, we average the 10th and 11th data values: $Q_2 = \dfrac{3.97 + 4.0}{2} = 3.985$ inches. The
index for the third quartile is $i = \dfrac{75}{100}(20 + 1) = 15.75$. Since $i = 15.75$ is not an
integer, we average the 15th and 16th data values: $Q_3 = \dfrac{5.22 + 5.50}{2} = 5.36$ inches.
(d) Lower fence = $Q_1 - 1.5(\text{IQR}) = 2.625 - 1.5(2.735) \approx -1.478$ inches.
$z = \dfrac{7.8 - \text{mean}}{1.8858} \approx -1.21$. Blackie's
hemoglobin level (7.8 g/dL) is 1.21 standard deviations below the mean.
(b) The data provided are already listed in ascending order. There are n = 20 data points.
The index for the first quartile is $i = \dfrac{25}{100}(20 + 1) = 5.25$. Since $i = 5.25$ is not an
integer, we average the 5th and 6th data values: $Q_1 = \dfrac{8.9 + 9.4}{2} = 9.15$ g/dL. The index
for the second quartile is $i = \dfrac{50}{100}(20 + 1) = 10.5$. Since $i = 10.5$ is not an integer, we
average the 10th and 11th data values: $Q_2 = \dfrac{9.9 + 10.0}{2} = 9.95$ g/dL. The index for the
third quartile is $i = \dfrac{75}{100}(20 + 1) = 15.75$. Since $i = 15.75$ is not an integer, we
average the 15th and 16th data values: $Q_3 = \dfrac{11.0 + 11.2}{2} = 11.1$ g/dL.
(d) Lower fence = $Q_1 - 1.5(\text{IQR}) = 9.15 - 1.5(1.95) = 6.225$ g/dL.
$z = \dfrac{20.46 - \text{mean}}{7.3837} \approx 0.61$. The
organic concentration of 20.46 mg/L is 0.61 standard deviations above the mean.
(b) There are n = 33 data points, and we must put them in ascending order:
5.2, 5.29, 5.3, 6.51, 7.4, 8.09, 8.81, 9.72, 10.3, 11.4, 11.9, 14, 14.86, 14.86,
14.9, 15.35, 15.42, 15.72, 15.91, 16.51, 16.87, 17.5, 17.9, 18.3, 19.8, 20.46,
20.46, 22.49, 22.74, 27.1, 29.8, 30.91, 33.67
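The index for the first quartile is $i = \dfrac{25}{100}(33 + 1) = 8.5$. Since $i = 8.5$ is not an
integer, we average the 8th and 9th data values: $Q_1 = \dfrac{9.72 + 10.3}{2} = 10.01$ mg/L. The
index for the second quartile is $i = \dfrac{50}{100}(33 + 1) = 17$. Since $i = 17$ is an integer, the
17th data value is the second quartile: $Q_2 = 15.42$ mg/L. The index for the third
quartile is $i = \dfrac{75}{100}(33 + 1) = 25.5$. Since $i = 25.5$ is not an integer, we average the
25th and 26th data values: $Q_3 = \dfrac{19.8 + 20.46}{2} = 20.13$ mg/L.
(d) Lower fence = $Q_1 - 1.5(\text{IQR}) = 10.01 - 1.5(10.12) = -5.17$ mg/L.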
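The quartile procedure used above (compute the index $i = \frac{k}{100}(n+1)$, take the $i$th value if $i$ is an integer, otherwise average the two neighboring values) can be sketched in Python. The function name `percentile` is an illustrative assumption, and statistical software generally uses other interpolation rules, so results from a calculator or library may differ slightly.

```python
import math


def percentile(sorted_data, k):
    """k-th percentile using the index i = (k/100)(n + 1) on ascending data."""
    n = len(sorted_data)
    i = k * (n + 1) / 100
    if i < 1 or i > n:
        raise ValueError("index falls outside the data")
    if i == int(i):
        return sorted_data[int(i) - 1]       # i is an integer: take the i-th value
    lower = math.floor(i)                    # otherwise average the two neighbors
    return (sorted_data[lower - 1] + sorted_data[lower]) / 2


# The n = 33 organic concentrations (mg/L) listed above.
conc = sorted([
    5.2, 5.29, 5.3, 6.51, 7.4, 8.09, 8.81, 9.72, 10.3, 11.4, 11.9, 14, 14.86, 14.86,
    14.9, 15.35, 15.42, 15.72, 15.91, 16.51, 16.87, 17.5, 17.9, 18.3, 19.8, 20.46,
    20.46, 22.49, 22.74, 27.1, 29.8, 30.91, 33.67,
])

q1, q2, q3 = (percentile(conc, k) for k in (25, 50, 75))
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
print(round(q1, 2), q2, round(q3, 2), round(iqr, 2), round(lower_fence, 2))
```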
$z = \dfrac{17.99 - \text{mean}}{4.9789} \approx 1.60$. The
organic concentration of 17.99 mg/L is 1.60 standard deviations above the mean.
(b) There are n = 47 data points, and we must put them in ascending order:
3.02, 3.79, 3.91, 3.99, 4.6, 4.71, 4.8, 4.85, 4.9, 5.5, 7, 7.11, 7.31, 7.45, 7.66,
7.85, 7.9, 7.92, 8.05, 8.37, 8.5, 8.5, 8.79, 9.1, 9.11, 9.29, 9.6, 9.81, 10.3, 10.47,
10.72, 10.89, 11.33, 11.56, 11.72, 11.72, 11.8, 11.97, 12.57, 12.89, 16.92, 17.9,
17.99, 21, 21.4, 21.82, 22.62
The index for the first quartile is $i = \dfrac{25}{100}(47 + 1) = 12$. Since $i = 12$ is an integer, the
12th data value is the first quartile: $Q_1 = 7.11$ mg/L. The index for the second quartile
is $i = \dfrac{50}{100}(47 + 1) = 24$. Since $i = 24$ is an integer, the 24th data value is the second
quartile: $Q_2 = 9.1$ mg/L. The index for the third quartile is $i = \dfrac{75}{100}(47 + 1) = 36$.
Since $i = 36$ is an integer, the 36th data value is the third quartile: $Q_3 = 11.72$ mg/L.
(c) IQR = $Q_3 - Q_1 = 11.72 - 7.11 = 4.61$ mg/L
(d) Lower fence = $Q_1 - 1.5(\text{IQR}) = 7.11 - 1.5(4.61) = 0.195$ mg/L.
Upper fence = $Q_3 + 1.5(\text{IQR}) = 489.5 + 1.5(489.5 - 433) = 574.25$ minutes.
The cutoff point is 574 minutes. If more minutes are used, the customer is contacted.
20. The first and third quartiles are Q1 = $84 and Q3 = $138 .
(c) Answers will vary. One possibility is that a student may have provided his or her
annual income instead of his or her weekly income.
22. (a) The first and third quartiles are Q1 = $21 and Q3 = $54 .
(b)
(c) Answers will vary. One possibility follows: It is possible that $115 is correct but
simply an unusual situation. For the data value $1000, perhaps a student provided his
or her annual expenditures for entertainment instead of his or her weekly expenditures.
24.
Pulse     z-score        Travel Time   z-score
76        0.49           39            0.98
60        −1.59          21            −0.42
60        −1.59          9             −1.36
81        1.14           32            0.43
72        −0.03          30            0.28
80        1.01           45            1.44
80        1.01           11            −1.20
68        −0.55          12            −1.12
73        0.10           39            0.98
Mean      72.2    0.0    Mean      26.4     0.0
St. dev.  7.671   1.00   St. dev.  12.842   1.000
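The pulse column above can be reproduced with Python's standard `statistics` module. This sketch assumes the population standard deviation (`pstdev`), since that is the version that matches the tabulated value 7.671; it also confirms that the z-scores themselves have mean 0 and standard deviation 1.

```python
import statistics

pulse = [76, 60, 60, 81, 72, 80, 80, 68, 73]

mean = statistics.mean(pulse)
sd = statistics.pstdev(pulse)          # population standard deviation (divide by N)
z_scores = [(x - mean) / sd for x in pulse]

z_mean = statistics.mean(z_scores)     # should be 0 up to rounding error
z_sd = statistics.pstdev(z_scores)     # should be 1 up to rounding error

print(round(mean, 1), round(sd, 3))
print([round(z, 2) for z in z_scores])
```

Standardizing any data set this way always yields mean 0 and standard deviation 1, which is what the last two rows of the table show.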
The median is near the center of the box and the horizontal lines are approximately the
same in length, so the distribution is symmetric.
6. The data in ascending order are as follows:
1, 2, 8, 8, 11, 11, 12, 15, 16, 16, 17, 23, 23, 23, 23, 28, 28, 31, 33, 33, 35, 40
The smallest number in the data set is 1. The largest number in the data set is 40. The first
quartile is $Q_1 = \dfrac{11 + 11}{2} = 11$ (the mean of the 5th and 6th data points). The median is
$M = \dfrac{17 + 23}{2} = 20$ (the mean of the 11th and 12th data points). The third quartile is
$Q_3 = \dfrac{28 + 28}{2} = 28$ (the mean of the 16th and 17th data points). The five-number summary is
1, 11, 20, 28, 40.
The median is near the center of the box and the horizontal lines are approximately the
same in length, so the distribution is symmetric.
The median is to the left of the center of the box and the right line is substantially longer
than the left line, so the distribution is skewed right.
8. The data in ascending order are as follows:
18, 19, 19, 19, 20, 21, 22, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 26, 26, 27, 27, 28, 28, 29,
29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 32, 32, 32, 32,
32, 32, 34, 34, 34, 34, 34, 34, 34, 34, 35, 35, 38, 39, 46
The smallest number in the data set is 18. The largest is 46. The first quartile is Q1 = 26
(the 16th data point). The median is M = 30 (the 32nd data point). The third quartile is
Q3 = 32 (the 48th data point). The five-number summary is 18, 26, 30, 32, 46.
The upper and lower fences are: Lower fence = $Q_1 - 1.5(\text{IQR}) = 26 - 1.5(32 - 26) = 17$;
Upper fence = $Q_3 + 1.5(\text{IQR}) = 32 + 1.5(32 - 26) = 41$. Thus, 46 is an outlier.
The median is to the right of the center of the box, so the distribution is skewed left.
$Q_1 = \dfrac{0.603 + 0.605}{2} = 0.604$ (the mean of the 6th and 7th data points). The median is
$M = 0.608$. The third quartile is $Q_3 = \dfrac{0.610 + 0.610}{2} = 0.610$ (the mean
of the 19th and 20th data points). The five-number summary is 0.598, 0.604, 0.608, 0.610,
0.612. The upper and lower fences are:
Lower fence = $Q_1 - 1.5(\text{IQR}) = 0.604 - 1.5(0.610 - 0.604) = 0.595$;
Upper fence = $Q_3 + 1.5(\text{IQR}) = 0.610 + 1.5(0.610 - 0.604) = 0.619$.
The median is to the right of the center of the box, so the distribution is skewed left.
Answers will vary concerning the source of variability in weight.
10. The data in ascending order are as follows:
421, 480, 581, 583, 598, 611, 616, 618, 643, 645, 646, 649, 653, 654, 660, 664, 666, 667,
669, 672, 675, 678, 679, 682, 683, 684, 688, 688, 692, 692, 698, 698, 704, 706, 707, 707,
711, 711, 713, 715, 726, 737, 740, 741, 787, 791, 802, 816, 821, 830, 971
The smallest number in the data set is 421. The largest number in the data set is 971. The
first quartile is Q1 = 653 (the 13th data point). The median is M = 684 (the 26th data
point). The third quartile is Q3 = 713 (the 39th data point). The five-number summary is
421, 653, 684, 713, 971. The upper and lower fences are:
Lower fence = $Q_1 - 1.5(\text{IQR}) = 653 - 1.5(713 - 653) = 563$;
Upper fence = $Q_3 + 1.5(\text{IQR}) = 713 + 1.5(713 - 653) = 803$.
Thus, the data points 421, 480, 816, 821, 830, and 971 are outliers.
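The fence test for outliers can be sketched in Python as follows. The quartiles are taken directly from the solution above rather than recomputed, and the helper name `fences` is an illustrative assumption.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# The 51 ordered data values listed above.
data = [
    421, 480, 581, 583, 598, 611, 616, 618, 643, 645, 646, 649, 653, 654, 660, 664, 666, 667,
    669, 672, 675, 678, 679, 682, 683, 684, 688, 688, 692, 692, 698, 698, 704, 706, 707, 707,
    711, 711, 713, 715, 726, 737, 740, 741, 787, 791, 802, 816, 821, 830, 971,
]

lower, upper = fences(653, 713)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```

Note that 802 lies just inside the upper fence of 803, so it is not flagged.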
The median is near the center of the box. Though the left line is longer than the right line,
when we consider the positions of the outliers, the distribution is relatively symmetric.
Answers will vary. Wyoming is very rural resulting in the need to drive further distances.
New York is more urban with many mass transit systems resulting in many individual
gasoline expenditures.
The smallest number in the data set is 28. The largest number in the data set is 73. The
first quartile is $Q_1 = \dfrac{45 + 45}{2} = 45$. The
median is $M = \dfrac{51 + 51}{2} = 51$ (the mean of the 20th and 21st data points). The third
quartile is $Q_3 = \dfrac{58 + 59}{2} = 58.5$ (the mean of the 30th and 31st data points). The five-number
summary is 28, 45, 51, 58.5, 73.
(c) The median is near the center of the box and the horizontal lines are approximately
equal in length, so the distribution is symmetric. This is confirmed by the histogram.
(d) Since the distribution is symmetric and contains no outliers, the mean and standard
deviation should be reported as the measures of central tendency and dispersion.
12. (a) The data in ascending order are as follows:
3.01, 3.04, 3.25, 3.38, 3.38, 3.56, 3.78, 4.35, 4.43, 4.50, 4.74, 4.88, 5.00, 5.02, 5.32,
5.34, 5.53, 5.58, 5.64, 5.75, 6.06, 6.07, 6.23, 6.52, 6.57, 6.92, 7.16, 7.25, 7.57, 7.97,
8.40, 8.74, 9.70, 10.32, 10.96
The smallest number in the data set is 3.01. The largest number in the data set is 10.96.
The first quartile is Q1 = 4.43 (the 9th data point). The median is M = 5.58 (the 18th
data point). The third quartile is Q3 = 7.16 (the 27th data point). The five-number
summary is 3.01, 4.43, 5.58, 7.16, 10.96.
(b) Lower fence = $Q_1 - 1.5(\text{IQR}) = 4.43 - 1.5(7.16 - 4.43) = 0.335$;
Upper fence = $Q_3 + 1.5(\text{IQR}) = 7.16 + 1.5(7.16 - 4.43) = 11.255$. There are no outliers.
The smallest number in the data set is 0. The largest number in the data set is 2.83.
The first quartile is $Q_1 = \dfrac{0 + 0.41}{2} = 0.205$ (the mean of the 7th and 8th data points). The
median is $M = \dfrac{1.05 + 1.06}{2} = 1.055$ (the mean of the 14th and 15th data points). The third
quartile is $Q_3 = \dfrac{1.7 + 2.04}{2} = 1.87$ (the mean of the 21st and 22nd data points). The five-number
summary is 0, 0.205, 1.055, 1.87, 2.83.
(c) The right line is substantially longer than the left line, so the distribution is skewed
right. This is confirmed by the histogram.
(d) Since the distribution is skewed, the median and interquartile range should be reported
as the measures of central tendency and dispersion.
14. (a) The data in ascending order are as follows:
78, 107, 108, 161, 177, 225, 234, 237, 255, 262, 268, 274, 279, 285, 286, 291, 292,
311, 314, 343, 345, 351, 352, 352, 357, 375, 377, 402, 424, 444, 459, 470, 484, 496,
503, 539, 540, 553, 563, 579, 593, 599, 621, 638, 662, 717, 740, 770, 770, 822, 1633
The smallest number in the data set is 78. The largest is 1633. The first quartile is
Q1 = 279 (the 13th data point). The median is M = 375 (the 26th data point). The third
quartile is Q3 = 563 (the 39th data point). The five-number summary is 78, 279, 375,
563, 1633.
(c) The median is to the left of the center of the box, so the distribution is skewed right.
This is confirmed by the histogram.
(d) Since the distribution is skewed, the median and interquartile range should be reported
as the measures of central tendency and dispersion.
15. The data in ascending order are:
Keebler:
20, 20, 21, 21, 21, 22, 23, 24, 24, 24, 25, 25, 26, 28, 28, 28, 28, 29, 31, 32, 33
Store Brand: 16, 17, 18, 21, 21, 21, 23, 23, 24, 24, 24, 25, 26, 26, 27, 27, 28, 29, 30, 31, 33
Since both sets of data contain n = 21 data points, the quartiles are in the same positions for
both sets. Namely, the first quartile is the mean of the 5th and 6th data points, the median is
the 11th data point, and the third quartile is the mean of the 16th and 17th data points.
The five-number summaries are:
Keebler: 20, 21.5, 25, 28, 33
Store Brand: 16, 21, 24, 27.5, 33
The fences for Keebler Chips Deluxe Chocolate Chip Cookies are:
Lower fence = $21.5 - 1.5(28 - 21.5) = 11.75$; Upper fence = $28 + 1.5(28 - 21.5) = 37.75$
The fences for the store brand chocolate chip cookies are:
Lower fence = $21 - 1.5(27.5 - 21) = 11.25$; Upper fence = $27.5 + 1.5(27.5 - 21) = 37.25$
So, neither data set has any outliers.
Keebler appears to have both a higher number of chocolate chips per cookie and the more
consistent number of chips per cookie.
Kansas:
42, 59, 62, 64, 68, 71, 73, 88, 91, 92, 95, 101, 113, 116, 122
Nebraska:
26, 28, 30, 55, 60, 61, 62, 63, 65, 69, 74, 81, 88, 102, 110
Since all three sets of data contain n = 15 data points, the quartiles are in the same positions
for all three sets. Namely, the first quartile is the 4th data point, the median is the 8th data
point, and the third quartile is the 12th data point.
The five-number summaries are:
Oklahoma: 18, 44, 62, 78, 145
Kansas: 42, 64, 88, 101, 122
Nebraska: 26, 55, 63, 81, 110
Oklahoma: Lower fence = $44 - 1.5(78 - 44) = -7$; Upper fence = $78 + 1.5(78 - 44) = 129$,
so 145 is an outlier.
Kansas: Lower fence = $64 - 1.5(101 - 64) = 8.5$; Upper fence = $101 + 1.5(101 - 64) = 156.5$,
McGwire: 340, 341, 350, 350, 360, 360, 360, 369, 370, 370, 370, 370, 377, 380, 380, 380,
380, 380, 385, 385, 388, 390, 390, 390, 390, 398, 400, 400, 409, 410, 410, 410,
410, 410, 420, 420, 420, 420, 420, 423, 425, 430, 430, 430, 430, 430, 430, 430,
440, 440, 440, 450, 450, 450, 450, 452, 458, 460, 460, 461, 470, 470, 470, 478,
480, 500, 510, 510, 527, 550
The smallest number in the data set is 340. The largest number is 550. The first quartile is
Q1 = 380 (the mean of the 17th and 18th data points). The median is M = 420 (the mean of
the 35th and 36th data points). The third quartile is Q3 = 450 (the mean of the 53rd and 54th
data points). The five-number summary for Mark McGwire is 340, 380, 420, 450, 550.
Lower fence = $380 - 1.5(450 - 380) = 275$; Upper fence = $450 + 1.5(450 - 380) = 555$.
Thus, there are no outliers.
(Note: The TI-84 gives Q1 = 371 because the calculator uses a different, but acceptable,
procedure for determining the quartiles. In most cases, the different procedures produce
the same results, but in this case, they differ slightly.)
Bonds:
320, 320, 347, 350, 360, 360, 360, 361, 365, 370, 370, 375, 375, 375, 375, 380,
380, 380, 380, 380, 385, 390, 390, 391, 394, 396, 400, 400, 400, 400, 404, 405,
410, 410, 410, 410, 410, 410, 410, 410, 410, 410, 411, 415, 415, 416, 417, 417,
420, 420, 420, 420, 420, 420, 420, 420, 429, 430, 430, 430, 430, 430, 435, 435,
436, 440, 440, 440, 440, 442, 450, 454, 488
The smallest number in the data set is 320. The largest number is 488. The first quartile is
Q1 = 380 (the mean of the 18th and 19th data points). The median is M = 410 (the 37th
data point). The third quartile is Q3 = 420 (the mean of the 55th and 56th data points). The
five-number summary for Barry Bonds is 320, 380, 410, 420, 488.
Lower fence = $380 - 1.5(420 - 380) = 320$; Upper fence = $420 + 1.5(420 - 380) = 480$.
Thus, 488 is an outlier.
Mark McGwire appears to have longer distances. Barry Bonds appears to have the more
consistent distances.
$M = \dfrac{792.4 + 792.4}{2} = 792.4$ m/s
Data in order: 789.6, 791.4, 791.7, 792.3, 792.4, 792.4, 793.1, 793.8, 794.0, 794.4
(b) Range = Largest Data Value − Smallest Data Value = $794.4 - 789.6 = 4.8$ m/s.
Data, $x_i$   Sample Mean, $\bar{x}$   Deviations, $x_i - \bar{x}$   Squared Deviations, $(x_i - \bar{x})^2$
793.8         792.51                   793.8 − 792.51 = 1.29         1.29² = 1.6641
793.1         792.51                   793.1 − 792.51 = 0.59         0.59² = 0.3481
792.4         792.51                   792.4 − 792.51 = −0.11        (−0.11)² = 0.0121
794.0         792.51                   794.0 − 792.51 = 1.49         1.49² = 2.2201
791.4         792.51                   791.4 − 792.51 = −1.11        (−1.11)² = 1.2321
792.4         792.51                   792.4 − 792.51 = −0.11        (−0.11)² = 0.0121
791.7         792.51                   791.7 − 792.51 = −0.81        (−0.81)² = 0.6561
792.3         792.51                   792.3 − 792.51 = −0.21        (−0.21)² = 0.0441
789.6         792.51                   789.6 − 792.51 = −2.91        (−2.91)² = 8.4681
794.4         792.51                   794.4 − 792.51 = 1.89         1.89² = 3.5721
$\sum x_i = 7925.1$;  $\sum (x_i - \bar{x}) = 0$;  $\sum (x_i - \bar{x})^2 = 18.2290$
$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{18.2290}{10 - 1} \approx 2.03$ (m/s)²; $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{18.2290}{10 - 1}} \approx 1.42$ m/s.
2. (a) $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{1268}{10} = 126.8$ beats/min; $M = \dfrac{128 + 129}{2} = 128.5$ beats/min.
Data in order: 86, 96, 115, 120, 128, 129, 136, 143, 146, 169
(b) Range = Largest Data Value − Smallest Data Value = $169 - 86 = 83$ beats/min.
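Data, $x_i$   Sample Mean, $\bar{x}$   Deviations, $x_i - \bar{x}$   Squared Deviations, $(x_i - \bar{x})^2$
136           126.8                    9.2                           84.64
169           126.8                    42.2                          1780.84
120           126.8                    −6.8                          46.24
128           126.8                    1.2                           1.44
129           126.8                    2.2                           4.84
143           126.8                    16.2                          262.44
115           126.8                    −11.8                         139.24
146           126.8                    19.2                          368.64
96            126.8                    −30.8                         948.64
86            126.8                    −40.8                         1664.64
$\sum x_i = 1268$;  $\sum (x_i - \bar{x}) = 0$;  $\sum (x_i - \bar{x})^2 = 5301.60$
$s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{5301.60}{10 - 1} \approx 589.1$ (beats/min)²; $s = \sqrt{\dfrac{5301.60}{10 - 1}} \approx 24.3$ beats/min.
3. (a) $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{91{,}610}{9} \approx \$10{,}178.89$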
M = $9,980
Data in order: 5500, 7200, 7889, 8998, 9980, 10995, 12999, 13999, 14050
(b) Range = Largest Data Value − Smallest Data Value = $14{,}050 - 5{,}500 = \$8{,}550$.
Data, $x_i$   Sample Mean, $\bar{x}$   Deviations, $x_i - \bar{x}$   Squared Deviations, $(x_i - \bar{x})^2$
14,050        10,178.8889              3871.1111                     14,985,501.1
13,999        10,178.8889              3820.1111                     14,593,248.8
12,999        10,178.8889              2820.1111                     7,953,026.6
10,995        10,178.8889              816.1111                      666,037.3
9,980         10,178.8889              −198.8889                     39,556.8
8,998         10,178.8889              −1180.8889                    1,394,498.6
7,889         10,178.8889              −2289.8889                    5,243,591.2
7,200         10,178.8889              −2978.8889                    8,873,779.1
5,500         10,178.8889              −4678.8889                    21,892,001.3
$\sum x_i = 91{,}610$;  $\sum (x_i - \bar{x}) = 0$;  $\sum (x_i - \bar{x})^2 = 75{,}641{,}240.9$
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{75{,}641{,}240.9}{9 - 1}} \approx \$3{,}074.92$.
(c) $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{118{,}610}{9} \approx \$13{,}178.89$
Data in order: 5500, 7200, 7889, 8998, 9980, 10995, 12999, 13999, 41050
M = $9,980; Range = $41{,}050 - 5{,}500 = \$35{,}550$.
Data, $x_i$   Sample Mean, $\bar{x}$
41,050        13,178.8889
13,999        13,178.8889
12,999        13,178.8889
10,995        13,178.8889
9,980         13,178.8889
8,998         13,178.8889
7,889         13,178.8889
7,200         13,178.8889
5,500         13,178.8889
$\sum x_i = 118{,}610$;  $\sum (x_i - \bar{x}) = 0$;  $\sum (x_i - \bar{x})^2 = 932{,}681{,}240.9$
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{932{,}681{,}240.9}{9 - 1}} \approx \$10{,}797.46$.
The mean, range, and standard deviation are all changed considerably by the incorrectly
entered data value. The median does not change. The median is resistant.
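The resistance of the median can be illustrated with a short Python sketch using the standard `statistics` module and the nine prices from this problem, once as given and once with the largest value mistyped as 41,050. Variable names are illustrative.

```python
import statistics

prices = [14050, 13999, 12999, 10995, 9980, 8998, 7889, 7200, 5500]
mistyped = [41050] + prices[1:]   # 14,050 entered incorrectly as 41,050


def summary(data):
    """Mean, median, range, and sample standard deviation of a data set."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "s": statistics.stdev(data),   # sample standard deviation (divide by n - 1)
    }


correct = summary(prices)
wrong = summary(mistyped)
for name in correct:
    print(name, round(correct[name], 2), round(wrong[name], 2))
```

Only the median is unchanged between the two runs; the mean, range, and standard deviation all move substantially.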
4. (a) $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{2{,}071{,}024}{15} \approx \$138{,}068.27$
Data in order: 99000, 115000, 124757, 128429, 135512, 136529, 136833, 136924,
138820, 140794, 149143, 149380, 153146, 157216, 169541
M = $136,924
(b) Range = Largest Data Value − Smallest Data Value = $169{,}541 - 99{,}000 = \$70{,}541$.
Data, $x_i$   Sample Mean, $\bar{x}$
138,820       138,068.2667
169,541       138,068.2667
135,512       138,068.2667
149,143       138,068.2667
140,794       138,068.2667
153,146       138,068.2667
99,000        138,068.2667
136,924       138,068.2667
136,833       138,068.2667
115,000       138,068.2667
124,757       138,068.2667
128,429       138,068.2667
157,216       138,068.2667
149,380       138,068.2667
136,529       138,068.2667
$\sum x_i = 2{,}071{,}024$;  $\sum (x_i - \bar{x}) = 0$
$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{4{,}183{,}425{,}169}{15 - 1}} \approx \$17{,}286.30$.
127,955,310
2,369,342
$\sum (x_i - \bar{x})^2 = 4{,}183{,}425{,}169$
5. (a) =
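Data values, $x_i$: 44, 56, 51, 46, 59, 56, 58, 55, 65, 64, 68, 69, 56, 62, 62, 62
$\sum x_i = 933$;  $\sum x_i^2 = 55{,}169$
$\sigma = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\dfrac{55{,}169 - \frac{(933)^2}{16}}{16}} \approx 6.9$ years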
(c) Answers will vary depending on samples
selected.
$\sum x_i = 2846$ and $n = 16$, so $\mu = \dfrac{2846}{16} \approx 177.9$ home runs.
To find the median, we put the data in order and find the mean of the 8th and 9th data
values: $M = \dfrac{183 + 185}{2} = 184$ home runs. The mode is the most frequent data value,
which is 135 home runs.
(b) Range = $235 - 135 = 100$ home runs. To find the standard deviation, we determine
$\sum x_i^2 = 521{,}902$. So, $\sigma = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n}} = \sqrt{\dfrac{521{,}902 - \frac{(2846)^2}{16}}{16}}$
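$\sum x_i = 78$ and $n = 36$, so $\bar{x} = \dfrac{78}{36} \approx 2.2$ children.
To find the median, we put the data in order and find the mean of the 18th and 19th data
values: $M = \dfrac{2 + 3}{2} = 2.5$ children.
$\sum x_i^2 = 224$.
$s = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\dfrac{224 - \frac{(78)^2}{36}}{36 - 1}} \approx 1.3$ children.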
$\sum x_i = 134$ and $n = 30$, so $\bar{x} = \dfrac{134}{30} \approx 4.5$ cars.
To find the median, we put the data in order and find the mean of the 15th and 16th data
values: $M = \dfrac{4 + 5}{2} = 4.5$ cars.
(b) Range = $9 - 1 = 8$ cars. To find the standard deviation, we determine $\sum x_i^2 = 754$.
$s = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\dfrac{754 - \frac{(134)^2}{30}}{30 - 1}} \approx 2.3$ cars.
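The computational (shortcut) formula for the sample standard deviation used above needs only $n$, $\sum x_i$, and $\sum x_i^2$. A minimal Python sketch, with an illustrative function name, applied to the sums quoted in the two preceding solutions:

```python
import math


def sample_sd_from_sums(sum_x, sum_x_sq, n):
    """s = sqrt((sum(x^2) - (sum x)^2 / n) / (n - 1))."""
    return math.sqrt((sum_x_sq - sum_x ** 2 / n) / (n - 1))


children_s = sample_sd_from_sums(78, 224, 36)    # sums from the children data
cars_s = sample_sd_from_sums(134, 754, 30)       # sums from the cars data
print(round(children_s, 1), round(cars_s, 1))
```

This form avoids computing each deviation from the mean, which is convenient by hand, although it can lose precision in floating-point arithmetic when the data values are large relative to their spread.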
9. (a) By the Empirical Rule, approximately 99.7% of the data will be within 3 standard
deviations of the mean. Now, 600 − 3(53) = 441 and 600 + 3(53) = 759. Thus, about
99.7% of light bulbs have lifetimes between 441 and 759 hours.
(b) Since 494 is exactly 2 standard deviations below the mean [494 = 600 − 2(53)] and 706
is exactly 2 standard deviations above the mean [706 = 600 + 2(53)], the Empirical
Rule predicts that approximately 95% of the light bulbs will have lifetimes between
494 and 706 hours.
(c) Since 547 is exactly 1 standard deviation below the mean [547 = 600 − 1(53)] and 706
is exactly 2 standard deviations above the mean [706 = 600 + 2(53)], the Empirical
Rule predicts that approximately 34 + 47.5 = 81.5% of the light bulbs will have
lifetimes between 547 and 706 hours.
(d) Since 441 hours is 3 standard deviations below the mean [441 = 600 − 3(53)], the
Empirical Rule predicts that 0.15% of light bulbs will last less than 441 hours. Thus,
the company should expect to replace about 0.15% of the light bulbs.
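The Empirical Rule arithmetic above can be sketched in Python. The dictionary `EMPIRICAL_PCT` and the helper `interval` are illustrative names; the percentages are the usual 68-95-99.7 approximations for bell-shaped distributions, not exact normal probabilities.

```python
# Approximate percentage of data within k standard deviations of the mean.
EMPIRICAL_PCT = {1: 68.0, 2: 95.0, 3: 99.7}


def interval(mean, sd, k):
    """Endpoints of the interval mean +/- k standard deviations."""
    return mean - k * sd, mean + k * sd


mean, sd = 600, 53
three_sd = interval(mean, sd, 3)     # part (a)
two_sd = interval(mean, sd, 2)       # part (b)

# Part (c): from 1 sd below the mean to 2 sd above it.
between_minus1_and_plus2 = EMPIRICAL_PCT[1] / 2 + EMPIRICAL_PCT[2] / 2

# Part (d): the tail more than 3 sd below the mean.
below_minus3 = (100 - EMPIRICAL_PCT[3]) / 2

print(three_sd, two_sd, between_minus1_and_plus2, round(below_minus3, 2))
```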
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$    $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
20–24    22.5           6035          135,787.5    42.2826    −19.7826      2,361,804.87
25–29    27.5           4352          119,680      42.2826    −14.7826      951,021.94
30–34    32.5           4083          132,697.5    42.2826    −9.7826       390,740.09
35–39    37.5           3933          147,487.5    42.2826    −4.7826       89,960.54
40–44    42.5           4194          178,245      42.2826    0.2174        198.22
45–49    47.5           3716          176,510      42.2826    5.2174        101,154.21
50–54    52.5           3005          157,762.5    42.2826    10.2174       313,707.76
55–59    57.5           2355          135,412.5    42.2826    15.2174       545,345.61
60–64    62.5           1664          104,000      42.2826    20.2174       680,148.79
65–69    67.5           1173          79,177.5     42.2826    25.2174       745,930.95
70–74    72.5           1025          74,312.5     42.2826    30.2174       935,918.54
75–79    77.5           895           69,362.5     42.2826    35.2174       1,110,037.41
80–84    82.5           744           61,380       42.2826    40.2174       1,203,374.81
$\sum f_i = 37{,}174$;  $\sum x_i f_i = 1{,}571{,}815$;  $\sum (x_i - \mu)^2 f_i = 9{,}429{,}343.76$
(a) $\mu = \dfrac{\sum x_i f_i}{\sum f_i} = \dfrac{1{,}571{,}815}{37{,}174} \approx 42.2826 \approx 42.28$ years
(b) $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\dfrac{9{,}429{,}343.76}{37{,}174}} \approx 15.93$ years
12.
Class    Midpt, $x_i$   Freq, $f_i$   $x_i f_i$   $\mu$      $x_i - \mu$   $(x_i - \mu)^2 f_i$
20–24    22.5           1903          42,817.5    43.7136    −21.2136      856,382.02
25–29    27.5           1415          38,912.5    43.7136    −16.2136      371,976.37
30–34    32.5           1364          44,330      43.7136    −11.2136      171,515.94
35–39    37.5           1430          53,625      43.7136    −6.2136       55,210.62
40–44    42.5           1409          59,882.5    43.7136    −1.2136       2,075.21
45–49    47.5           1242          58,995      43.7136    3.7864        17,806.34
50–54    52.5           1008          52,920      43.7136    8.7864        77,818.43
55–59    57.5           784           45,080      43.7136    13.7864       149,010.82
60–64    62.5           599           37,437.5    43.7136    18.7864       211,404.37
65–69    67.5           415           28,012.5    43.7136    23.7864       234,804.02
70–74    72.5           482           34,945      43.7136    28.7864       399,412.59
75–79    77.5           456           35,340      43.7136    33.7864       520,533.50
80–84    82.5           372           30,690      43.7136    38.7864       559,631.15
$\sum f_i = 12{,}879$;  $\sum x_i f_i = 562{,}987.5$;  $\sum (x_i - \mu)^2 f_i = 3{,}627{,}581.38$
(a) $\mu = \dfrac{\sum x_i f_i}{\sum f_i} = \dfrac{562{,}987.5}{12{,}879} \approx 43.7136 \approx 43.71$ years
(b) $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2 f_i}{\sum f_i}} = \sqrt{\dfrac{3{,}627{,}581.38}{12{,}879}} \approx 16.78$ years
(c) The mean age of a female involved in a traffic fatality is greater than the mean age of
a male involved in a traffic fatality. Also, the ages of females involved in a
traffic fatality are more dispersed. Answers will vary. One possibility is that an
insurance company might use this information in order to help establish the rates it
would charge for insuring drivers.
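The grouped-data mean and standard deviation above can be reproduced with a short Python sketch. It uses the class midpoints and frequencies from the table for females and treats the data as a population (dividing by $\sum f_i$), as the worked solution does; the function name is an illustrative assumption.

```python
import math


def grouped_mean_sd(midpoints, freqs):
    """Approximate population mean and standard deviation from a frequency table."""
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / total
    return mean, math.sqrt(variance)


# Midpoints 22.5, 27.5, ..., 82.5 for the classes 20-24 through 80-84.
midpoints = [22.5 + 5 * k for k in range(13)]
female_freqs = [1903, 1415, 1364, 1430, 1409, 1242, 1008, 784, 599, 415, 482, 456, 372]

mu, sigma = grouped_mean_sd(midpoints, female_freqs)
print(round(mu, 2), round(sigma, 2))
```

Because every observation in a class is replaced by the class midpoint, these values are approximations of the mean and standard deviation of the raw ages.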
13. GPA = xw =
w x
w x
i i
i i
w x
w x
i i
i i
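Yankees: $\mu = \dfrac{184{,}193{,}950}{29} \approx \$6{,}351{,}516$.
Mets: $\mu = \dfrac{96{,}660{,}970}{28} \approx \$3{,}452{,}177$.
Yankees: $\sum x_i^2 \approx 2.30011 \times 10^{15}$, so
$\sigma_{\text{Yankees}} = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\dfrac{2.30011 \times 10^{15} - \frac{(184{,}193{,}950)^2}{29}}{29}}$
Mets: $\sum x_i^2 \approx 9.37457 \times 10^{14}$, so
$\sigma_{\text{Mets}} = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}} = \sqrt{\dfrac{9.37457 \times 10^{14} - \frac{(96{,}660{,}970)^2}{28}}{28}}$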
Annotations will vary. One possibility is that the Mets' salaries are clearly lower and
less dispersed than the Yankees' salaries.
(g) In both boxplots, the median is to the left of the center of the box and the right line is
substantially longer than the left line, so both distributions are skewed right.
(h) For both distributions, the median is the better measure of central tendency since the
distributions are skewed.
$\sum x_i = 64.04$ and $n = 10$, so $\bar{x}_A = \dfrac{64.04}{10}$
(b) Material A: $M_A = \dfrac{5.69 + 5.88}{2}$
Material B: $M_B = \dfrac{8.20 + 9.65}{2}$
(c) In both cases, the mean is substantially larger than the median, so both distributions are
skewed right.
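(d) Material A: $\sum x_i^2 = 472.177$, so
$s_A = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\dfrac{472.177 - \frac{(64.04)^2}{10}}{10 - 1}}$
Material B: $\sum x_i^2 = 1597.4002$, so
$s_B = \sqrt{\dfrac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}} = \sqrt{\dfrac{1597.4002 - \frac{(113.32)^2}{10}}{10 - 1}}$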
(b) $i = \dfrac{k}{100}(n + 1) = \dfrac{95}{100}(88 + 1) = 84.55$. Since $i = 84.55$ is not an integer, we average
the 84th and 85th data values: $P_{95} = \dfrac{5{,}692{,}620 + 6{,}221{,}710}{2} = \$5{,}957{,}165$. This means
that approximately 95% of drivers in the 2004 Nextel Cup Series earned less than
$5,957,165, and approximately 5% of drivers in the 2004 Nextel Cup Series earned
more than $5,957,165.
(c) $i = \dfrac{k}{100}(n + 1) = \dfrac{10}{100}(88 + 1) = 8.9$. Since $i = 8.9$ is not an integer, we average the
8th and 9th data values: $P_{10} = \dfrac{65{,}175 + 70{,}550}{2} = \$67{,}862.50$. This means that
approximately 10% of drivers in the 2004 Nextel Cup Series earned less than
$67,862.50, and approximately 90% of drivers in the 2004 Nextel Cup Series earned
more than $67,862.50.
(d) Of the 88 drivers in the 2004 Nextel Cup Series, 73 earned less than $4,117,750.
Percentile rank of $4,117,750 = $\dfrac{73}{88} \cdot 100 \approx 83$. Thus, $4,117,750 was at the 83rd
percentile. This means that approximately 83% of drivers in the 2004 Nextel Cup
Series earned less than $4,117,750, and approximately 17% of drivers in the 2004
Nextel Cup Series earned more than $4,117,750.
(e) Of the 88 drivers in the 2004 Nextel Cup Series, 13 earned less than $116,359.
Percentile rank of $116,359 = $\dfrac{13}{88} \cdot 100 \approx 15$. Thus, $116,359 was at the 15th
percentile. This means that approximately 15% of drivers in the 2004 Nextel Cup
Series earned less than $116,359, and approximately 85% of drivers in the 2004 Nextel
Cup Series earned more than $116,359.
(b) $i = \dfrac{k}{100}(n + 1) = \dfrac{90}{100}(88 + 1) = 80.1$. Since $i = 80.1$ is not an integer, we average
the 80th and 81st data values: $P_{90} = \dfrac{4{,}759{,}020 + 5{,}152{,}670}{2} = \$4{,}955{,}845$. This means
that approximately 90% of drivers in the 2004 Nextel Cup Series earned less than
$4,955,845, and approximately 10% of drivers in the 2004 Nextel Cup Series earned
more than $4,955,845.
(c) $i = \dfrac{k}{100}(n + 1) = \dfrac{5}{100}(88 + 1) = 4.45$. Since $i = 4.45$ is not an integer, we average
the 4th and 5th data values: $P_5 = \dfrac{57{,}450 + 57{,}590}{2} = \$57{,}520$. This means that
approximately 5% of drivers in the 2004 Nextel Cup Series earned less than $57,520,
and approximately 95% of drivers in the 2004 Nextel Cup Series earned more than
$57,520.
(d) Of the 88 drivers in the 2004 Nextel Cup Series, 49 earned less than $1,333,520.
Percentile rank of $1,333,520 = $\dfrac{49}{88} \cdot 100 \approx 56$. Thus, $1,333,520 was at the 56th
percentile. This means that approximately 56% of drivers in the 2004 Nextel Cup
Series earned less than $1,333,520, and approximately 44% of drivers in the 2004
Nextel Cup Series earned more than $1,333,520.
(e) Of the 88 drivers in the 2004 Nextel Cup Series, 16 earned less than $139,614.
Percentile rank of $139,614 = $\dfrac{16}{88} \cdot 100 \approx 18$. Thus, $139,614 was at the 18th
percentile. This means that approximately 18% of drivers in the 2004 Nextel Cup
Series earned less than $139,614, and approximately 82% of drivers in the 2004 Nextel
Cup Series earned more than $139,614.
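The two calculations used in these percentile problems, locating the index of the kth percentile and finding the percentile rank of a value, can be sketched in Python. Both function names are illustrative; the counts and the sample size n = 88 are the ones quoted in the solutions above.

```python
def percentile_index(k, n):
    """Index i = (k/100)(n + 1) of the k-th percentile in n ordered values."""
    return k * (n + 1) / 100


def percentile_rank(count_below, n):
    """Percentile rank = (number of values below x) / n * 100, rounded to a whole number."""
    return round(count_below / n * 100)


n = 88
indexes = [round(percentile_index(k, n), 2) for k in (95, 10, 90, 5)]
ranks = [percentile_rank(c, n) for c in (73, 13, 49, 16)]
print(indexes)
print(ranks)
```

A non-integer index means the percentile is the average of the two data values on either side of that position, as in the worked solutions.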
z-score for the female: $z = \dfrac{x - \mu}{\sigma} = \dfrac{160 - 156.5}{51.2} \approx 0.07$
z-score for the male: $z = \dfrac{x - \mu}{\sigma} = \dfrac{185 - 183.4}{40} = 0.04$
The weight of the 160-pound female is 0.07 standard deviations above the mean, while the
weight of the 185-pound male is 0.04 standard deviations above the mean. Thus, the 160-pound
female is relatively heavier.
Case Study
20. (a) Reading the boxplot, the median crime rate is approximately 4050 per 100,000
population.
(b) Reading the boxplot, the 25th percentile crime rate is approximately 3100 per 100,000
population.
(c) Reading the boxplot, there is one outlier. It is approximately 8000.
(d) Reading the boxplot, the lowest crime rate is approximately 2200 per 100,000
population.
Answers will vary. None of the provided authors match both the measures of central
tendency and the measures of dispersion well. In other words, there is no clear cut choice
for the author based on the information provided. Based on measures of central tendency,
James Otis or Samuel Adams would appear to be the more likely candidates for A
MOURNER. Based on measures of dispersion, Tom Sturdy seems the more likely choice.
Still, the unknown author's mean word length differs considerably from that of Sturdy, and
the unknown author's standard deviation differs considerably from those of Otis and
Adams.