Descriptive Statistics Solutions
4. Population mean: $\mu = \frac{20 + 15 + 25 + 20 + 10 + 15 + 25 + 20 + 15}{9} = \frac{165}{9} = 18.33$.
Median: With the nine observations sorted, the median is in the 5th position; Median = 20.
The distribution is bimodal: 15 and 20 are the two modes.
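The three measures for Exercise 4 can be checked with a few lines of R. This is a minimal sketch that types the nine observations in by hand; the vector name `x` and the mode calculation are illustrative choices, not part of the original solution.

```r
# Observations from Exercise 4, entered by hand
x <- c(20, 15, 25, 20, 10, 15, 25, 20, 15)

pop_mean <- mean(x)     # 165 / 9
med <- median(x)        # 5th value of the sorted data

# Base R has no built-in function for the statistical mode,
# so tabulate the values and keep those with the highest count
counts <- table(x)
modes <- as.numeric(names(counts)[counts == max(counts)])

round(pop_mean, 2)
med
modes
```

Because `modes` keeps every value tied for the highest count, it returns both modes when the data are bimodal.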
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Price)
median(myData$Price)
© 2022 by McGraw-Hill Education. This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any
manner. This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.
Chapter 03 - Numerical Descriptive Measures
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Price)
median(myData$Price)
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Life_Expectancy)
median(myData$Life_Expectancy)
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Expenditures)
median(myData$Expenditures)
9.
a. Mean = 81,727.41; Median = 80,705.
Excel: =AVERAGE(D2:D415)
=MEDIAN(D2:D415)
=AVERAGEIF(C2:C418, "Yes", D2:D415)
=AVERAGEIF(C2:C418, "No", D2:D415)
=AVERAGEIF(B2:B418, "Always", D2:D415)
=AVERAGEIF(B2:B418, "Never", D2:D415)
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Income)
median(myData$Income)
Excel: =AVERAGE(D2:D500)
=AVERAGE(E2:E500)
=AVERAGEIF(B2:B500, "Yes", D2:D500)
=AVERAGEIF(B2:B500, "No", D2:D500)
=AVERAGEIF(B2:B500, "Yes", E2:E500)
=AVERAGEIF(B2:B500, "No", E2:E500)
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Food)
mean(myData$Travel)
tapply(myData$Food, myData$OwnHome, mean)
tapply(myData$Travel, myData$OwnHome, mean)
13.
a. Average price per share = $\frac{94.81 \times 100 + 102.67 \times 60 + 115.32 \times 40}{100 + 60 + 40} = 101.27$
b. Average price per share = $\frac{94.81 \times 40 + 102.67 \times 60 + 115.32 \times 100}{40 + 60 + 100} = 107.42$
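The two averages in Exercise 13 are weighted means, with the number of shares as weights. A short sketch using base R's `weighted.mean()`; the variable names are illustrative.

```r
# Purchase prices per share and the number of shares bought at each price
price <- c(94.81, 102.67, 115.32)
shares_a <- c(100, 60, 40)   # part a
shares_b <- c(40, 60, 100)   # part b

# weighted.mean() computes sum(w * x) / sum(w)
avg_a <- weighted.mean(price, w = shares_a)
avg_b <- weighted.mean(price, w = shares_b)

round(c(a = avg_a, b = avg_b), 2)
```

Shifting the larger purchases toward the higher prices raises the average price per share, which is why part b exceeds part a.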
16. 25th percentile: 64; 50th percentile: 112; 75th percentile: 149.25
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
17.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
quantile(myData$x3, 0.20)
quantile(myData$x3, 0.80)
18.
b. Since the median is closer to Q3 than Q1 (right of center) and the left whisker is
longer than the right whisker, the distribution is negatively skewed.
19.
a. The box plot does not indicate any possible outliers in the data.
b. Since the median is equidistant from Q1 and Q3 and the right and left whiskers have
approximately the same length, the distribution is approximately symmetric.
20.
Limit: 1.5 × IQR = 1.5 × 350 = 525. Compare each whisker length with this limit:
Q1 − Min = 75 < 525 and Max − Q3 = 750 > 525; thus, there is at least one outlier
on the right side of the distribution.
c. The distribution is not symmetric. It appears positively skewed because: (1) the
median falls left of center in the interquartile range (Median − Q1 = 100 < 250 =
Q3 − Median), and (2) the right whisker is longer than the left whisker (Q1 − Min =
75 < 750 = Max − Q3).
21.
b. IQR = Q3 − Q1 = 78 − 54 = 24.
Find limit: 1.5 × IQR = 1.5 × 24 = 36. Find distances from Q1 − Min = 54 − 34 =
20 < 36 and Max − Q3 = 98 − 78 = 20 < 36; since both distances are less than 36,
there are no outliers in the distribution.
c. The distribution is symmetric because: (1) the median falls in the center of the
interquartile range, $\frac{78 + 54}{2} = 66$, and (2) the lengths of the right and left whiskers are
equal (Q1 − Min = Max − Q3 = 20).
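The outlier and symmetry checks in Exercise 21 can be written out step by step. The sketch below works only from the five-number summary quoted in the solution (Min = 34, Q1 = 54, Median = 66, Q3 = 78, Max = 98); the object names are illustrative.

```r
# Five-number summary from Exercise 21
five <- c(min = 34, q1 = 54, median = 66, q3 = 78, max = 98)

iqr <- five[["q3"]] - five[["q1"]]
limit <- 1.5 * iqr

# A whisker longer than the limit signals at least one outlier on that side
left_outlier <- (five[["q1"]] - five[["min"]]) > limit
right_outlier <- (five[["max"]] - five[["q3"]]) > limit

# Symmetry guidelines: median centered in the box, whiskers of equal length
median_centered <- (five[["median"]] - five[["q1"]]) == (five[["q3"]] - five[["median"]])
equal_whiskers <- (five[["q1"]] - five[["min"]]) == (five[["max"]] - five[["q3"]])

limit
c(left_outlier = left_outlier, right_outlier = right_outlier)
c(median_centered = median_centered, equal_whiskers = equal_whiskers)
```

With raw data available, `boxplot()` applies the same 1.5 × IQR rule when it draws points beyond the whiskers.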
22.
a. 25th percentile: 40; approximately 25 percent of the observations are less than 40.
50th percentile: 46; approximately 50 percent of the observations are less than 46.
75th percentile: 51; approximately 75 percent of the observations are less than 51.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
23.
a. 25th percentile: 67.5; approximately 25 percent of the scores are less than 67.5.
50th percentile: 75; approximately 50 percent of the scores are less than 75.
75th percentile: 85.25; approximately 75 percent of the scores are less than 85.25.
c. The distribution is not symmetric. However, the distribution’s skewness is not clear
from the two guidelines because (1) the median falls left of center in the interquartile
range (Median − Q1 = 7.5 < 10.25 = Q3 − Median), and (2) the left whisker is
longer than the right whisker (Q1 − Min = 42.5 > 13.75 = Max − Q3). If we
calculate the skewness coefficient, we find that the distribution is negatively
skewed.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
24.
b. In the accompanying “Boxplot of House Value”, there are two outliers in the
house value data: the median house values in California and Hawaii.
The median is left of center and the right whisker is
longer than the left whisker; thus, the distribution seems to be positively
skewed.
R Code:
#Import the data file into a data frame (table) and label it myData
boxplot(myData$Income, horizontal=TRUE)
boxplot(myData$Value, horizontal=TRUE)
25.
a. 25th percentile: 12; approximately 25 percent of the P/E ratios are less than 12.
50th percentile: 14.5; approximately 50 percent of the P/E ratios are less than 14.5.
75th percentile: 18.75; approximately 75 percent of the P/E ratios are less than
18.75.
b. There are two outliers on the right side of the distribution. The distribution is not
symmetric. It appears positively skewed because (1) the median falls left of center and
(2) the right whisker is longer than the left whisker.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
boxplot(myData$PE_Ratio, horizontal=TRUE)
30. $G_R = \sqrt[1.25]{(1 + 0.05)(1 + 0.03)} - 1 = 1.0647 - 1 = 0.0647$, or 6.47%.
31.
a.
32.
a.
33.
$G_g = \sqrt[5]{(1 + 0.025)(1 + 0.036)(1 + 0.018)(1 + 0.022)(1 + 0.052)} - 1 = 0.0305$, or 3.05%.
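The geometric mean return in Exercise 33 is the fifth root of the product of the five growth factors, minus one. A minimal sketch, assuming all five rates are positive (which reproduces the 3.05% answer); the helper name `geo_mean_return` is hypothetical.

```r
# Rates from Exercise 33, as decimals (all assumed positive)
returns <- c(0.025, 0.036, 0.018, 0.022, 0.052)

# Geometric mean return: n-th root of the product of (1 + R_i), minus 1
geo_mean_return <- function(r) prod(1 + r)^(1 / length(r)) - 1

g <- geo_mean_return(returns)
round(g, 4)
```

The same helper applies to any vector of period returns, including negative ones, as long as every growth factor 1 + R stays positive.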
34.
a. The arithmetic mean is given by
$\bar{x} = \frac{17.3 + 19.6 + 6.8 + 8.2}{4} = 12.98$
35.
Retailer 1 Retailer 2
Year 1-Year 2 −0.0784 −0.001
Year 2-Year 3 −0.0717 −0.0209
b. The average growth rates for each retailer are 𝐺1 = −0.075; 𝐺2 = −0.011.
36.
a. The arithmetic mean is given by
37.
38.
a.
b. $G_g = \sqrt[4]{\frac{30601}{20117}} - 1 = 0.1106$, or 11.1%. As expected, both ways yield the same result.
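The two routes to the average growth rate agree because the intermediate values cancel when the period growth factors are multiplied together. The sketch below keeps the first and last values from Exercise 38 and fills in three interior values that are purely hypothetical, chosen only to show that they do not affect the result.

```r
# First and last values are from Exercise 38; the three interior values
# are hypothetical placeholders
values <- c(20117, 23000, 21500, 27000, 30601)

n <- length(values) - 1                             # number of growth periods
growth <- values[-1] / values[-length(values)] - 1  # period-by-period growth rates

# Route 1: geometric mean of the period growth rates
g_from_rates <- prod(1 + growth)^(1 / n) - 1

# Route 2: n-th root of the ratio of the last value to the first
g_from_ends <- (values[length(values)] / values[1])^(1 / n) - 1

round(c(g_from_rates, g_from_ends), 4)
```

Changing the interior values changes the individual growth rates but leaves both results unchanged.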
39.
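b. $\mu = \frac{\sum x_i}{N} = \frac{120}{5} = 24$; $MAD = \frac{\sum |x_i - \mu|}{N} = \frac{56}{5} = 11.2$
c. $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{768}{5} = 153.6$
d. $\sigma = \sqrt{\sigma^2} = 12.39$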
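The population measures in Exercise 39 can be reproduced in R. The exercise data are not shown in this solution, so the sketch below uses a hypothetical set of five values chosen only so that its sums match the ones quoted above (sum 120, sum of absolute deviations 56, sum of squared deviations 768).

```r
# Hypothetical population of N = 5 values whose sums match the solution
x <- c(4, 16, 28, 36, 36)

mu <- mean(x)
mad_pop <- mean(abs(x - mu))   # mean absolute deviation
var_pop <- mean((x - mu)^2)    # population variance (divide by N)
sd_pop <- sqrt(var_pop)

# var() divides by n - 1, so rescale it to recover the population variance
var_check <- var(x) * (length(x) - 1) / length(x)

round(c(mu = mu, MAD = mad_pop, variance = var_pop, sd = sd_pop), 2)
```

Note that R's built-in `mad()` returns the median absolute deviation, not the mean absolute deviation used here, and that `var()` and `sd()` are sample versions.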
40.
b. $MAD = \frac{\sum |x_i - \mu|}{N} = \frac{24}{5} = 4.8$
c. $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{184.00}{5} = 36.80$
d. $\sigma = \sqrt{\sigma^2} = 6.07$
41.
b. $MAD = \frac{\sum |x_i - \bar{x}|}{n} = \frac{32}{6} = 5.33$
c. $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{256}{5} = 51.2$
d. $s = \sqrt{s^2} = 7.16$
42.
b. $MAD = \frac{\sum |x_i - \bar{x}|}{n} = \frac{44}{6} = 7.33$
c. $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{406}{5} = 81.2$, $s = \sqrt{s^2} = 9.01$
43.
c. $s^2 = 112,987.05$; $s = 336.14$
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
var(myData$Expenditures)
sd(myData$Expenditures)
44.
b. Firm B’s stock price has greater variability as indicated by a higher standard
deviation.
c. Firm A, $CV = \frac{s}{\bar{x}} = \frac{12.75}{58.55} = 0.22$
Firm B, $CV = \frac{s}{\bar{x}} = \frac{13.17}{55.91} = 0.24$
Firm B’s stock price also has greater relative dispersion indicated by a higher
coefficient of variation.
R Code:
#Import the data file into a data frame (table) and label it myData
var(myData$'Firm A')
sd(myData$'Firm A')
var(myData$'Firm B')
sd(myData$'Firm B')
sd(myData$'Firm A')/mean(myData$'Firm A')
sd(myData$'Firm B')/mean(myData$'Firm B')
45.
c. Monthly rent: $CV = \frac{s}{\bar{x}} = \frac{424.80}{1,222.93} = 0.35$
Square footage: $CV = \frac{s}{\bar{x}} = \frac{645.81}{1,286.03} = 0.50$
Therefore, there is greater relative dispersion in square footage than in monthly
rent.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Rent)/mean(myData$Rent)
sd(myData$Footage)/mean(myData$Footage)
46.
a. $CV_A = \frac{s}{\bar{x}} = \frac{10,304.13}{79,231.23} = 0.13$
b. $CV_B = \frac{s}{\bar{x}} = \frac{7,510.96}{52,753.62} = 0.14$
R Code:
#Import the data file into a data frame (table) and label it myData
sd(myData$'Corporation A')/mean(myData$'Corporation A')
sd(myData$'Corporation B')/mean(myData$'Corporation B')
47.
c. Household income and house value have different sample means, so we
cannot compare their MADs or standard deviations directly to determine
which data are more variable. The coefficient of variation is the recommended
measure for comparing dispersion between the two data sets.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
#Calculating MAD and variance
mean(abs(myData$Income-mean(myData$Income)))
var(myData$Income)
mean(abs(myData$Value-mean(myData$Value)))
var(myData$Value)
#Calculating coefficient of variation
sd(myData$Income)/mean(myData$Income)
sd(myData$Value)/mean(myData$Value)
48.
a. Investment B provides a higher return. Investment A provides the least risk since
it has a smaller standard deviation.
b. $Sharpe_A = \frac{\bar{x} - R_f}{s_I} = \frac{10 - 1.4}{5} = 1.72$; $Sharpe_B = \frac{\bar{x} - R_f}{s_I} = \frac{15 - 1.4}{10} = 1.36$. The Sharpe
ratio is higher for investment A; hence it provides a higher reward per unit of
risk.
49.
b. $Sharpe_A = \frac{\bar{x} - R_f}{s_I} = \frac{8 - 2}{5} = 1.2$; $Sharpe_B = \frac{\bar{x} - R_f}{s_I} = \frac{10 - 2}{7} = 1.14$. Investment A has
a slightly higher Sharpe ratio. Hence it provides a higher reward per unit of risk.
50.
b. $s_1 = 5.29$, $s_2 = 9.02$. Investment 1 provides the least risk because it has a lower
standard deviation.
c. $Sharpe_1 = \frac{\bar{x} - R_f}{s_I} = \frac{3 - 1.2}{5.29} = 0.34$; $Sharpe_2 = \frac{\bar{x} - R_f}{s_I} = \frac{5 - 1.2}{9.02} = 0.42$. Investment 2
performs better because it offers more reward per unit of risk.
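The Sharpe ratio comparisons in Exercises 48 to 50 all follow the same pattern: subtract the risk-free rate from the mean return and divide by the standard deviation. A small sketch using the summary figures from Exercise 48; the function name `sharpe` and its argument names are hypothetical.

```r
# Sharpe ratio: mean return in excess of the risk-free rate, per unit of risk
sharpe <- function(mean_return, risk_free, sd_return) {
  (mean_return - risk_free) / sd_return
}

# Summary figures from Exercise 48 (returns in percent, risk-free rate 1.4)
sharpe_a <- sharpe(10, 1.4, 5)
sharpe_b <- sharpe(15, 1.4, 10)

c(A = sharpe_a, B = sharpe_b)
```

Investment B has the higher mean return, but its standard deviation is twice as large, so its reward per unit of risk is lower.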
51.
a. Stock 2
b. Stock 1 had a higher standard deviation, hence it was riskier. This result is
surprising because, in general, investments with higher returns often have
higher risk.
c. $Sharpe_{Stock1} = \frac{9.62 - 3}{23.58} = 0.28$, $Sharpe_{Stock2} = \frac{12.38 - 3}{15.45} = 0.61$.
Stock 2 has a higher Sharpe ratio; hence it has a higher reward per unit of risk
compared to Stock 1.
52.
a. Mutual Fund 1
b. Mutual Fund 1
c. $Sharpe_1 = \frac{13.01 - 3}{38.61} = 0.26$, $Sharpe_2 = \frac{10.83 - 3}{22.84} = 0.34$. Mutual Fund 2 had a higher
Sharpe ratio; hence it has a higher reward per unit of risk.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 3)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 3)/sd(myData$MF2); SR2
53.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2
54.
a. The values 70 and 90 are two standard deviations below the mean and above
the mean, respectively. Using Chebyshev’s Theorem and $k = 2$, we have
$1 - 1/2^2 = 0.75$. In other words, Chebyshev’s Theorem asserts that at least 75%
of the scores fall within 70 and 90.
b. The values 65 and 95 are three standard deviations below the mean and above
the mean, respectively. Using Chebyshev’s Theorem and $k = 3$, we have
$1 - 1/3^2 = 0.89$. In other words, Chebyshev’s Theorem asserts that at least 89%
of the scores fall within 65 and 95.
55.
a. The values 1,300 and 1,700 are two standard deviations below the mean and
above the mean, respectively. Using Chebyshev’s Theorem and $k = 2$, we have
$1 - 1/2^2 = 0.75$. In other words, Chebyshev’s Theorem asserts that at least 75%
of the scores fall within 1,300 and 1,700.
b. The values 1,100 and 1,900 are four standard deviations below the mean and
above the mean, respectively. Using Chebyshev’s Theorem and $k = 4$, we have
$1 - 1/4^2 = 0.94$. In other words, Chebyshev’s Theorem asserts that at least 94%
of the scores fall within 1,100 and 1,900.
56.
a. We know that at least 75% of the observations fall within two standard
deviations of the mean. We are given the mean and standard deviation of 500
and 25, respectively. Therefore, at least 75% of the observations fall within 450
and 550.
b. We know that at least 89% of the observations fall within three standard
deviations of the mean. We are given the mean and standard deviation of 500
and 25, respectively. Therefore, at least 89% of the observations fall within 425
and 575.
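The Chebyshev calculations in Exercises 54 to 56 reduce to two steps: compute the bound 1 − 1/k² and build the interval of k standard deviations around the mean. A minimal sketch using the figures from Exercise 56 (mean 500, standard deviation 25); both function names are hypothetical.

```r
# Chebyshev's theorem: for any k > 1, at least 1 - 1/k^2 of the observations
# lie within k standard deviations of the mean
chebyshev_bound <- function(k) {
  stopifnot(k > 1)
  1 - 1 / k^2
}

# Interval of k standard deviations on either side of the mean
chebyshev_interval <- function(avg, s, k) {
  c(lower = avg - k * s, upper = avg + k * s)
}

round(sapply(c(2, 3, 4), chebyshev_bound), 2)
chebyshev_interval(500, 25, 2)
chebyshev_interval(500, 25, 3)
```

The bound holds for any distribution shape, which is why it is weaker than the empirical rule's figures for bell-shaped data.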
57.
b. The interval [16, 24] is the interval $[\bar{x} - 2s, \bar{x} + 2s] = [20 - 4, 20 + 4]$. 95% of the
observations fall in this interval.
c. 16 is two standard deviations below the mean. Since 95% of the observations fall
within 2 standard deviations of the mean, we conclude that 5% fall outside this
interval. Half of these observations fall below 16, hence 2.5% of the observations
fall below 16.
58.
a. According to the empirical rule about 68% of the observations are between 700
and 800. Hence, half of the remaining 32%, that is 16%, of the observations are
less than 700.
59.
a. According to the empirical rule, about 95% of the observations are between 17
and 33. Hence, 95% + 2.5% = 97.5% of the observations are less than 33.
60.
a. $[\bar{x} - 2s, \bar{x} + 2s] = [5 - 2 \times 2.5, 5 + 2 \times 2.5] = [0, 10]$. Given that 95% of the data falls
within two standard deviations of the mean, 95% + 2.5% = 97.5% of the
observations are positive.
61.
74 is 2 standard deviations above the mean. By the empirical rule, about 95% of the
observations fall in the interval $[\bar{x} - 2s, \bar{x} + 2s] = [50 - 24, 50 + 24] = [26, 74]$. Therefore,
about 2.5% are more than 74; 2.5% of 250 is 6.25 observations or roughly 6
observations are greater than 74.
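The tail count in Exercise 61 follows from splitting the 5% outside two standard deviations evenly between the two tails. The sketch below redoes that arithmetic and, as an aside that goes beyond the solution, compares the rule-of-thumb 2.5% with the exact normal tail probability from `pnorm()`.

```r
# Empirical rule: about 95% of observations lie within 2 standard deviations
within_2sd <- 0.95
upper_tail <- (1 - within_2sd) / 2    # share above mean + 2s, by symmetry

n <- 250
expected_above <- n * upper_tail      # 2.5% of 250

# For comparison: exact normal probability of exceeding z = 2
exact_tail <- pnorm(2, lower.tail = FALSE)

c(rule = upper_tail, exact = round(exact_tail, 4), count = expected_above)
```

The empirical rule rounds the exact normal figure, so counts derived from it are approximate.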
62.
$\bar{x} = 9$; $s = 2$. The z-score for the smallest observation, 6, is $z = \frac{6 - 9}{2} = -1.5$. The z-scores
for the remaining values 9, 12, 10, 9, and 8 are: 0, 1.5, 0.5, 0, ‒0.5, respectively.
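The z-scores in Exercise 62 can be computed for all six observations at once. A short sketch using the values listed in the solution; the object names are illustrative, and the |z| > 3 cutoff is the same rule of thumb applied in the later outlier exercises.

```r
# Observations from Exercise 62
x <- c(6, 9, 12, 10, 9, 8)

x_bar <- mean(x)
s <- sd(x)                  # sample standard deviation (divides by n - 1)
z <- (x - x_bar) / s        # one z-score per observation

# Rule of thumb: an observation with |z| > 3 is flagged as an outlier
outliers <- x[abs(z) > 3]

data.frame(x = x, z = z)
```

The built-in `scale(x)` returns the same z-scores as a one-column matrix.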
63.
$\bar{x} = 2.9$; $s = 6.03$. The smallest and the largest observations in the data set are ‒4 and 15,
respectively.
The z-score for the smallest observation is $z = \frac{-4 - 2.9}{6.03} = -1.14$.
The z-score for the largest observation is $z = \frac{15 - 2.9}{6.03} = 2.01$.
Since the absolute value of both z-scores is less than 3, we conclude that there are
no outliers in the data.
64.
a. The salaries $66,000 and $78,000 are two standard deviations below the mean
and above the mean, respectively. Using Chebyshev’s theorem and $k = 2$, we
have $1 - 1/2^2 = 0.75$. In other words, Chebyshev’s theorem asserts that at least
75% of the faculty earns at least $66,000 but no more than $78,000.
b. The salaries $63,000 and $81,000 are three standard deviations below the mean
and above the mean, respectively. Using Chebyshev’s theorem and $k = 3$, we
have $1 - 1/3^2 = 0.89$. In other words, Chebyshev’s theorem asserts that at least
89% of the faculty earns at least $63,000 but no more than $81,000.
65.
a. About 68% of the observations fall in the interval [‒4, 20]. Thus, about
16% (32%/2) of the observations are greater than 20 percent.
b. About 95% of the observations fall in the interval [‒16, 32]. Thus
about 2.5% (5%/2) of the observations are less than ‒16 percent.
66.
b. About 95% of the scores are in the interval [68, 132]. About 2.5% of the scores
are less than 68.
c. Using part a, about 16% of the scores are more than 116.
67.
a. About 68% of returns are expected to be in the interval [2, 14]. The probability is
about 0.68.
b. About 32% of future returns fall out of the [2, 14] interval. About 16% (half of
32%) of returns are greater than 14 percent. The probability is about 0.16.
c. About 95% of future returns will fall in the interval [‒4, 20]. About 5% of them
will fall out of this interval. Hence about 2.5% of returns will fall below ‒4
percent.
68.
a. Game lengths of 2.2 and 3.8 hours are two standard deviations below the
mean and above the mean, respectively. Using Chebyshev’s theorem and $k = 2$,
we have $1 - 1/2^2 = 0.75$. Therefore, at least 75% of the games will last between
2.2 and 3.8 hours.
b. Using the empirical rule, we know that approximately 95% of the games will last
between 2.2 and 3.8 hours.
69.
a. $\bar{x}_{Income} = 51,641.4$; $s_{Income} = 8,385.91$. The smallest and the largest observations
are 37,881 and 70,647, respectively.
The z-score for the smallest observation is $z = \frac{37,881 - 51,641.4}{8,385.91} = -1.64$.
The z-score for the largest observation is $z = \frac{70,647 - 51,641.4}{8,385.91} = 2.27$.
Since the absolute value of both z-scores is less than 3, we conclude that there
are no outliers in the household income data.
b. $\bar{x}_{Value} = 199,324$; $s_{Value} = 93,589.10$. The smallest and the largest observations
are 94,500 and 537,400, respectively.
The z-score for the smallest observation is $z = \frac{94,500 - 199,324}{93,589.10} = -1.12$.
The z-score for the largest observation is $z = \frac{537,400 - 199,324}{93,589.10} = 3.61$.
Since the z-score for the largest observation is greater than 3, we conclude that
there are outliers in the house value data.
R Code:
#Import the data file into a data frame (table) and label it myData
#Calculations for Income
zMinIncome <- (min(myData$Income)-mean(myData$Income))/sd(myData$Income);
zMinIncome
zMaxIncome <- (max(myData$Income) -mean(myData$Income))/sd(myData$Income);
zMaxIncome
#Calculations for Value
zMinValue <- (min(myData$Value)-mean(myData$Value))/sd(myData$Value);
zMinValue
zMaxValue <- (max(myData$Value)-mean(myData$Value))/sd(myData$Value);
zMaxValue
70.
a. 𝑥̅𝑀𝑢𝑡𝑢𝑎𝑙𝐹𝑢𝑛𝑑1 = 7.38; 𝑠𝑀𝑢𝑡𝑢𝑎𝑙𝐹𝑢𝑛𝑑1 = 35.12. The smallest and the largest
observations for Mutual Fund 1 are −51.09 and 90.29, respectively.
The z-score for the smallest observation is $z = \frac{-51.09 - 7.38}{35.12} = -1.66$.
The z-score for the largest observation is $z = \frac{90.29 - 7.38}{35.12} = 2.36$.
Since the absolute value of both z-scores is less than 3, we conclude that there
are no outliers in the distribution for Mutual Fund 1.
71.
a. $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{84.40}{4} = 21.1$.
b. $r_{xy} = \frac{s_{xy}}{s_x s_y} = 0.93$. There is a strong positive relationship between x and y.
72.
a. $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-49.21}{4} = -12.3$.
b. $r_{xy} = \frac{s_{xy}}{s_x s_y} = -0.96$. There is a strong negative relationship between x and y.
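The covariance and correlation formulas used in Exercises 71 and 72 can be checked against R's built-in `cov()` and `cor()`. The exercise data are not reproduced in this solution, so the sketch below uses a small hypothetical paired sample purely for illustration.

```r
# Hypothetical paired sample (n = 5), for illustration only
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 7, 9, 15)

n <- length(x)

# Sample covariance: sum of cross-products of deviations, divided by n - 1
s_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Correlation coefficient: covariance scaled by both standard deviations
r_xy <- s_xy / (sd(x) * sd(y))

# The built-in functions should agree with the hand calculation
c(s_xy = s_xy, cov = cov(x, y))
c(r_xy = r_xy, cor = cor(x, y))
```

Unlike the covariance, the correlation coefficient is unit-free and always lies between −1 and 1, which is what allows the "strong" or "moderate" readings in these solutions.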
73.
R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)
74.
R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)
75.
a. 𝑠𝑥𝑦 = 5.74. There is a positive linear relationship between GRE and subsequent
GPA in graduate school.
b. 𝑟𝑥𝑦 = 0.52. The positive linear relationship between GRE and GPA is moderate.
GRE scores seem to be a good indicator for later performance in graduate school.
R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)
76.
a. 𝑠𝑥𝑦 = 66.79. There is a positive linear relationship between education
and salary.
R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)
77.
a. $r_{Age,Happiness} = 0.57$. The correlation between age and happiness is positive and
moderate.
b.
[Figure: scatterplot of Happiness (vertical axis, 0 to 60) against Age (horizontal axis, 0 to 100)]
The correlation analysis suggests there is a positive correlation for the whole
data set. However, the scatterplot for age and happiness in the above figure
suggests that age and happiness are not universally positively related: they are
negatively correlated for ages below 40 and positively correlated for ages
above 40.
R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData)
plot(Happiness ~ Age, data=myData, col="chocolate", pch=16)
78.
R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData)
79.
b. Highway 1: s = 6.9828
Highway 2: s = 3.0043
c. Variability is higher on Highway 1, which has the lower speed limit, supporting the
researcher’s belief.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Highway_1)
sd(myData$Highway_2)
80.
a. Firm A: 𝑥̅ = 75.39; s² = 52.02; s = 7.21
Firm B: 𝑥̅ = 106.07; s² = 319.21; s = 17.87
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$'Firm_A')
sd(myData$'Firm_B')
81.
a. Mutual Fund 1 had the higher reward over this period since its mean was greater
than the mean of Mutual Fund 2 (10.58% > 9.43%).
b. Mutual Fund 1 was riskier over this period since its standard deviation was greater
than the standard deviation of Mutual Fund 2 (37.09% > 22.02%).
c. The Sharpe ratio for Mutual Fund 1 is $\frac{10.58 - 2}{37.09} = 0.2312$. The Sharpe ratio for
Mutual Fund 2 is $\frac{9.43 - 2}{22.02} = 0.3375$. Because the Sharpe ratio for Mutual Fund 2 is
greater than the Sharpe ratio for Mutual Fund 1, Mutual Fund 2 has a higher reward
per unit of risk.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2
82.
𝑥̄ = ∑ 𝑤𝑖 𝑚𝑖 = 1,820
83.
a. Firm 1:
$G_g = \sqrt[5]{(1 + 0.03)(1 + 0.021)(1 + 0.218)(1 + 0.048)(1 + 0.012)} - 1 = 0.063$, or 6.3%
Firm 2:
c. Firm 1 has the higher average growth rate at 6.3%, but also the higher
variability at 8.61% against 6.56% for Firm 2.
84.
a. $G_g(\text{Firm 1}) = \sqrt{\frac{14.20}{15.73}} - 1 = -0.050$, or −5%;
$G_g(\text{Firm 2}) = \sqrt{\frac{2.99}{3.06}} - 1 = -0.012$, or −1.2%
R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData[, 2:3])
86. 𝑟𝐹𝑖𝑛𝑎𝑙,𝑚𝑖𝑑𝑡𝑒𝑟𝑚 = 0.5518; the correlation coefficient suggests that final and midterm
scores have a positive and moderate linear relationship.
R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData)
87.
a. Mutual Fund 1 had the higher reward over this period since its mean was greater
than the mean of Mutual Fund 2 (17.70% > 10.03%).
b. Mutual Fund 1 was riskier over this period since its standard deviation was greater
than the standard deviation of Mutual Fund 2 (36.56% > 22.95%).
c. The Sharpe ratio for Mutual Fund 1 is $\frac{17.70 - 2}{36.5637} = 0.43$. The Sharpe ratio for Mutual
Fund 2 is $\frac{10.03 - 2}{22.9492} = 0.35$. Because the Sharpe ratio for Mutual Fund 1 is greater
than the Sharpe ratio for Mutual Fund 2, Mutual Fund 1 has a higher reward per unit
of risk.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2
88.
a. Boxplot for Mutual Fund 1:
The boxplot suggests that there are outliers in the upper part of the distribution.
b. 𝑥̅1 = 17.6992; 𝑠1 = 36.5637. The smallest and the largest observations are -51.09 and
131.75, respectively.
The z-score for the smallest observation is $z = \frac{-51.09 - 17.6992}{36.5637} = -1.8814$.
The z-score for the largest observation is $z = \frac{131.75 - 17.6992}{36.5637} = 3.11924$.
Since the z-score for the largest observation is greater than 3, we conclude that there
are outliers for Mutual Fund 1. This is consistent with the boxplot, which showed
outliers in the upper part of the distribution.
The boxplot suggests that there is an outlier in the lower part of the distribution.
d. 𝑥̅2 = 10.0314; 𝑠2 = 22.9492. The smallest and the largest observations are -54.00 and
52.02, respectively.
The z-score for the smallest observation is $z = \frac{-54.00 - 10.0314}{22.9492} = -2.4402$.
The z-score for the largest observation is $z = \frac{52.02 - 10.0314}{22.9492} = 2.1796$.
Since the absolute value of both z-scores is less than 3, we conclude that there are no
outliers for Mutual Fund 2. This is not consistent with the boxplot, which showed an
outlier in the lower part of the distribution. Recall that z-scores are reliable indicators
of outliers when the distribution is relatively bell-shaped and symmetric. Since the
boxplot indicates that Mutual Fund 2 is not symmetric, we are better served
identifying outliers in this case with a boxplot.
89.
a.
The boxplot suggests that there are no outliers for the Expenditures variable.
b. 𝑥̅ = 1,306.94; 𝑠 = 336.1355. The smallest and the largest observations for the
Expenditures variable are 467 and 2,116, respectively.
The z-score for the smallest observation is z = (467 − 1,306.94) / 336.1355 = −2.4988.
The z-score for the largest observation is z = (2,116 − 1,306.94) / 336.1355 = 2.4069.
Since the absolute value of both z-scores is less than 3, we conclude that there are no
outliers for the Expenditures variable. This is consistent with the boxplot, which
showed no outliers.
R Code:
#Import the data file into a data frame (table) and label it myData
#Construct boxplot
boxplot(myData$Expenditures, horizontal=TRUE)
#Calculations for Expenditures
zMinExpenditures <- (min(myData$Expenditures)-
mean(myData$Expenditures))/sd(myData$Expenditures); zMinExpenditures
zMaxExpenditures <- (max(myData$Expenditures) -
mean(myData$Expenditures))/sd(myData$Expenditures); zMaxExpenditures
90.
b. Q1 = 2.01; 25% of the states had gas prices below $2.01, and 75% of the states had
gas prices greater than $2.01.
Q3 = 2.55; 75% of the states had gas prices less than $2.55, and 25% of the states
had gas prices greater than $2.55.
The boxplot suggests that there is an outlier in the upper part of the distribution.
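e. 𝑥̅ = 2.31, s = 0.3849. The minimum and maximum observations in the data set are
1.84 and 3.37, respectively.
The z-score for the minimum observation is z = (1.84 − 2.31) / 0.3849 = −1.2266.
The z-score for the maximum observation is z = (3.37 − 2.31) / 0.3849 = 2.7481.
(The reported z-scores appear to be computed from the unrounded mean, so they differ
slightly from what the rounded value 2.31 gives.)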
Since all z-scores are between 3 and −3, we conclude there are no outliers in the data.
This result is not consistent with the result in part a. Recall that z-scores are reliable
indicators of outliers when the distribution is relatively bell-shaped and symmetric.
Since the boxplot indicates that Price is positively skewed, we are better served
identifying outliers in this case with a boxplot.
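The boxplot result can be reproduced by hand from the quartiles in part b. This is a sketch, assuming the boxplot uses the common 1.5 × IQR rule (the default in R's boxplot function) and the rounded quartiles Q1 = 2.01 and Q3 = 2.55; the variable names are illustrative.

```r
# Quartiles reported in part b (rounded)
q1 <- 2.01
q3 <- 2.55
iqr <- q3 - q1

# Fences under the 1.5 x IQR rule (an assumption about how the boxplot was drawn)
upper_fence <- q3 + 1.5 * iqr
lower_fence <- q1 - 1.5 * iqr

# Minimum and maximum observations reported in part e
min_price <- 1.84
max_price <- 3.37

max_above_fence <- max_price > upper_fence
min_below_fence <- min_price < lower_fence
c(upper_fence = upper_fence, lower_fence = lower_fence)
c(max_above_fence = max_above_fence, min_below_fence = min_below_fence)
```

Under these assumptions the maximum lies just above the upper fence while the minimum stays inside the lower fence, which agrees with a single outlier in the upper part of the distribution. Because the quartiles are rounded, the margin is small.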
91.
a. 𝑥̅𝑃𝑟𝑖𝑐𝑒 = 15,086.30; 𝑥̅𝐴𝑔𝑒 = 5.05; 𝑥̅𝑀𝑖𝑙𝑒𝑠 = 45,810.05
b. 𝑠𝑃𝑟𝑖𝑐𝑒 = 3,560.87; 𝑠𝐴𝑔𝑒 = 2.68; 𝑠𝑀𝑖𝑙𝑒𝑠 = 22,642.97
c. 𝑟𝑃𝑟𝑖𝑐𝑒,𝐴𝑔𝑒 = −0.91; strong negative linear relationship between the price of a car and
its age.
d. 𝑟𝑃𝑟𝑖𝑐𝑒,𝑀𝑖𝑙𝑒𝑠 = −0.76; strong negative linear relationship between the price of a car and
its mileage.
R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Price)
sd(myData$Age)
sd(myData$Miles)
cor(myData)
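The call cor(myData) returns the full correlation matrix, so the two coefficients in parts c and d are individual entries of it. The sketch below shows how to pull them out; the data frame holds made-up values standing in for the real file, with only the column names Price, Age, and Miles taken from the code above.

```r
# Made-up values standing in for the used-car data (illustration only)
myData <- data.frame(
  Price = c(18000, 15500, 14000, 12000, 9500),
  Age   = c(2, 4, 5, 7, 9),
  Miles = c(21000, 40000, 38000, 61000, 80000)
)

# Full correlation matrix, then the two entries of interest
r <- cor(myData)
r["Price", "Age"]
r["Price", "Miles"]

# Equivalent pairwise call for a single coefficient
cor(myData$Price, myData$Age)
```

Note that cor(myData) requires every column to be numeric; a data file with text columns would need those dropped first.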
92.
a. 𝑥̅𝐶𝑜𝑙𝑙𝑒𝑔𝑒 𝐺𝑃𝐴 = 3.15; 𝑥̅𝑆𝐴𝑇 = 1282.72
b. Mean College GPA for white students = 3.26
Mean College GPA for nonwhite students = 2.99
The mean College GPA is higher for white students.
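The group means in part b can be computed with tapply, the same function used elsewhere in these solutions. This is a sketch on made-up data: the column names College_GPA and White, and the 1/0 coding, are assumptions, since the layout of the actual data file is not shown here.

```r
# Made-up data frame standing in for myData; column names and coding are assumptions
myData <- data.frame(
  College_GPA = c(3.4, 2.8, 3.1, 3.6, 2.9, 3.0),
  White       = c(1, 0, 1, 1, 0, 0)
)

# Overall mean, then the mean within each group
overall_mean <- mean(myData$College_GPA)
group_means <- tapply(myData$College_GPA, myData$White, mean)
overall_mean
group_means
```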
93.
a. 𝑥̅𝐼𝑛𝑐𝑜𝑚𝑒 = 276.86; 𝑥̅𝐻𝑜𝑢𝑟𝑠 = 6.77
Excel: =AVERAGE(A2:A36)
=AVERAGE(B2:B36)
=AVERAGEIF(C2:C36,1,A2:A36)
=AVERAGEIF(C2:C36,0,A2:A36)
=AVERAGEIF(D2:D36,1,A2:A36)
=AVERAGEIF(D2:D36,0,A2:A36)
R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Income)
mean(myData$Hours)
tapply(myData$Income, myData$Hot, mean)
tapply(myData$Income, myData$Holiday, mean)