Descriptive Statistics Solutions

JK4eSM_Chapter_3

Uploaded by

Balu Jagadish

Chapter 03 - Numerical Descriptive Measures

Chapter 3. Numerical Descriptive Measures

Solutions
1. Sample mean: $\bar{x} = \frac{8 + 10 + 9 + 12 + 12}{5} = 10.20$.
Median: The median is at the 3rd position in the arranged data; Median = 10.
Mode: The mode is 12 since 12 appears twice in the dataset.

(4)  0  (6)  1  (3)  (4)


2. Sample mean: x   2.67 .
6
Median: The median is the average of the values at the 3rd and 4th positions;
(3)  (4)
Median =  3.5 .
2
Mode = −4 as it has the greatest frequency.

3. Population mean: $\mu = \frac{150 + 257 + 55 + 110 + 110 + 43 + 201 + 125 + 55}{9} = 122.89$.
Median: The median is located in the 5th position; Median = 110.
The distribution is bimodal; 55 and 110 are the two modes.

4. Population mean: $\mu = \frac{20 + 15 + 25 + 20 + 10 + 15 + 25 + 20 + 15}{9} = \frac{165}{9} = 18.33$.
Median: The median is located in the 5th position; Median = 20.
The distribution is bimodal: 15 and 20 are the two modes.
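
The hand calculations in Problems 1 through 4 can be checked with a short R sketch. Base R has no built-in function for the statistical mode, so the helper `modes()` below is a hypothetical name introduced only for illustration; it returns every value tied for the highest frequency.

```r
# Hypothetical helper: return all values tied for the highest frequency.
modes <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

x1 <- c(8, 10, 9, 12, 12)                          # Problem 1 data
x3 <- c(150, 257, 55, 110, 110, 43, 201, 125, 55)  # Problem 3 data

mean_x1   <- mean(x1)     # 10.2
median_x1 <- median(x1)   # 10
modes_x1  <- modes(x1)    # 12

mean_x3   <- mean(x3)     # about 122.89
median_x3 <- median(x3)   # 110
modes_x3  <- modes(x3)    # 55 and 110 (bimodal)
```

Note that when every value occurs exactly once, this helper returns all of the values, which is one way of signaling that the data have no meaningful mode.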

5. Mean = 516.03; Median = 523

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Price)
median(myData$Price)

6. Mean = 2.31; Median = 2.18

© 2022 by McGraw-Hill Education. This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any
manner. This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Price)
median(myData$Price)

7. Mean = 78.42; Median = 78.45

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Life_Expectancy)
median(myData$Life_Expectancy)

8. Mean = 1,306.94; Median = 1,287.50

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Expenditures)
median(myData$Expenditures)

9.
a. Mean = 81,727.41; Median = 80,705.

b. Mean income for married individuals = 81,829.97
Mean income for nonmarried individuals = 81,310.06
On average, married individuals earn more than nonmarried individuals.

c. Mean income for individuals who always exercise = 82,777.20
Mean income for individuals who never exercise = 79,489.59
On average, individuals who always exercise earn more than individuals who never exercise.

Excel: =AVERAGE(D2:D415)
=MEDIAN(D2:D415)
=AVERAGEIF(C2:C418, "Yes", D2:D415)
=AVERAGEIF(C2:C418, "No", D2:D415)
=AVERAGEIF(B2:B418, "Always", D2:D415)
=AVERAGEIF(B2:B418, "Never", D2:D415)

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Income)
median(myData$Income)

tapply(myData$Income, myData$Married, mean)
tapply(myData$Income, myData$Exercise, mean)

10.
a. Mean Food = 4,416.44; Mean Travel = 2,405.27.

b. Mean food spending for homeowners = 4,048.90
Mean food spending for non-homeowners = 4,566.70
Non-homeowners spend more on food than homeowners.

c. Mean travel spending for homeowners = 1,960.35
Mean travel spending for non-homeowners = 2,569.05
Non-homeowners spend more on travel than homeowners.

Excel: =AVERAGE(D2:D500)
=AVERAGE(E2:E500)
=AVERAGEIF(B2:B500, "Yes", D2:D500)
=AVERAGEIF(B2:B500, "No", D2:D500)
=AVERAGEIF(B2:B500, "Yes", E2:E500)
=AVERAGEIF(B2:B500, "No", E2:E500)

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Food)
mean(myData$Travel)
tapply(myData$Food, myData$OwnHome, mean)
tapply(myData$Travel, myData$OwnHome, mean)

11. Average score $\bar{x} = \sum w_i x_i = 0.3(90) + 0.5(60) + 0.2(80) = 73$

12. Average price per share $\bar{x} = \sum w_i x_i = \frac{70}{200}(19.58) + \frac{80}{200}(24.06) + \frac{50}{200}(29.54) = \$23.86$

13.
a. Average price per share = $\frac{94.81 \times 100 + 102.67 \times 60 + 115.32 \times 40}{100 + 60 + 40} = 101.27$
b. Average price per share = $\frac{94.81 \times 40 + 102.67 \times 60 + 115.32 \times 100}{40 + 60 + 100} = 107.42$
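
The weighted means in Problems 11 through 13 can be reproduced with base R's `weighted.mean()`, which divides by the sum of the weights, so raw share counts work as well as proportions. The variable names below are illustrative only.

```r
# Problem 11: weights already sum to 1.
avg_score <- weighted.mean(c(90, 60, 80), c(0.3, 0.5, 0.2))   # 73

# Problem 13: the same three prices weighted by two different share counts.
prices   <- c(94.81, 102.67, 115.32)
shares_a <- c(100, 60, 40)
shares_b <- c(40, 60, 100)

avg_a <- weighted.mean(prices, shares_a)   # about 101.27
avg_b <- weighted.mean(prices, shares_b)   # about 107.42
```

Shifting the larger share counts toward the higher prices raises the average in part b, which is exactly what the hand calculation shows.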

14. Average weight $\bar{x} = \sum w_i m_i = 0.04(3) + 0.11(5) + 0.36(7) + 0.43(9) + 0.06(11) = 7.72$

15. Average vacancy rate $\bar{x} = \sum w_i m_i = 0.10(1.5) + 0.10(4.5) + 0.20(7.5) + 0.40(10.5) + 0.20(13.5) = 9.00$

16. 25th percentile: 64; 50th percentile: 112; 75th percentile: 149.25

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)

17.

a. 25th percentile: 73.25; 50th percentile: 112; 75th percentile: 142


b. 20th percentile: 237,275.8; 80th percentile: 825,227

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
quantile(myData$x3, 0.20)
quantile(myData$x3, 0.80)

18.

a. There is an outlier in the left tail of the distribution.

b. Since the median is closer to Q3 than Q1 (right of center) and the left whisker is
longer than the right whisker, the distribution is negatively skewed.

19.

a. The box plot does not indicate any possible outliers in the data.

b. Because the median is equidistant from Q1 and Q3 and the right and left whiskers have
approximately the same length, the distribution is approximately symmetric.

20.

a. Q1: Approximately 25 percent of the observations are less than 200.

Q3: Approximately 75 percent of the observations are less than 550.

b. IQR= Q3 − Q1 = 550 − 200 = 350.

Limit: 1.5 × IQR = 1.5 × 350 = 525. The whisker distances are Q1 − Min = 75 < 525
and Max − Q3 = 750 > 525; thus, there is at least one outlier on the right side
of the distribution.

c. The distribution is not symmetric. It appears positively skewed because: (1) the
median falls left of center in the interquartile range (Median − Q1 = 100 < 250 =
Q3 − Median), and (2) the right whisker is longer than the left whisker (Q1 − Min =
75 < 750 = Max − Q3).
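
The outlier check used in Problems 20 through 23 compares each whisker distance with 1.5 × IQR. A minimal R sketch of that rule follows; the function name `iqr_check()` is hypothetical. For Problem 20 the minimum (125) and maximum (1300) are back-solved from the stated distances Q1 − Min = 75 and Max − Q3 = 750, so treat them as inferred values rather than given data. The Problem 21 values (34, 54, 78, 98) appear directly in that solution.

```r
# Hypothetical helper: flag each side of a five-number summary where the
# whisker distance exceeds 1.5 x IQR.
iqr_check <- function(min_val, q1, q3, max_val) {
  limit <- 1.5 * (q3 - q1)
  c(left = (q1 - min_val) > limit, right = (max_val - q3) > limit)
}

p20 <- iqr_check(125, 200, 550, 1300)  # Min and Max inferred from the distances
p21 <- iqr_check(34, 54, 78, 98)

p20  # left FALSE, right TRUE: at least one outlier on the right
p21  # both FALSE: no outliers
```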

21.

a. Q1: Approximately 25 percent of the observations are less than 54;

Q3: Approximately 75 percent of the observations are less than 78.

b. IQR= Q3 − Q1 = 78 − 54 = 24.

Limit: 1.5 × IQR = 1.5 × 24 = 36. The whisker distances are Q1 − Min = 54 − 34 =
20 < 36 and Max − Q3 = 98 − 78 = 20 < 36; since both distances are less than 36,
there are no outliers in the distribution.

c. The distribution is symmetric because: (1) the median falls in the center of the
interquartile range, $\frac{78 + 54}{2} = 66$, and (2) the lengths of the right and left whiskers are
equal (Q1 − Min = Max − Q3 = 20).

22.

a. 25th percentile: 40; approximately 25 percent of the observations are less than 40.
50th percentile: 46; approximately 50 percent of the observations are less than 46.
75th percentile: 51; approximately 75 percent of the observations are less than 51.

b. IQR = 51 − 40 = 11; 1.5 × IQR = 16.5.

Q1 − Min = 40 − 28 = 12.
Max − Q3 = 67 − 51 = 16.

There are no outliers because neither distance exceeds 16.5.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)

23.
a. 25th percentile: 67.5; approximately 25 percent of the scores are less than 67.5.
50th percentile: 75; approximately 50 percent of the scores are less than 75.
75th percentile: 85.25; approximately 75 percent of the scores are less than 85.25.

b. IQR = 85.25 − 67.5 = 17.75; 1.5 × IQR = 26.625.
Q1 − Min = 67.5 − 25 = 42.5.
Max − Q3 = 99 − 85.25 = 13.75.
There is at least one outlier on the left side of the distribution because the
distance between Q1 and Min exceeds 26.625.

c. The distribution is not symmetric. However, the distribution’s skewness is not clear
from the two guidelines because (1) the median falls left of center in the interquartile
range (Median − Q1 = 7.5 < 10.25 = Q3 − Median), and (2) the left whisker is
longer than the right whisker (Q1 − Min = 42.5 > 13.75 = Max − Q3). If we
calculate the skewness coefficient, we find that the distribution is negatively
skewed.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)

24.

a. In the accompanying “Boxplot of Income”, there are no outliers in the household
income data. The median is left of center and the right whisker is longer than the
left whisker, thus the distribution seems to be positively skewed.

b. In the accompanying “Boxplot of House Value”, there are two outliers in the
house value data: the median house values in California and Hawaii. The median
is left of center and the right whisker is longer than the left whisker, thus
the distribution seems to be positively skewed.

R Code:
#Import the data file into a data frame (table) and label it myData
boxplot(myData$Income, horizontal=TRUE)
boxplot(myData$Value, horizontal=TRUE)

25.

a. 25th percentile: 12; approximately 25 percent of the P/E ratios are less than 12.
50th percentile: 14.5; approximately 50 percent of the P/E ratios are less than 14.5.
75th percentile: 18.75; approximately 75 percent of the P/E ratios are less than
18.75.

b. There are two outliers on the right side of the distribution. The distribution is not
symmetric. It appears positively skewed because (1) the median falls left of center and
(2) the right whisker is longer than the left whisker.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)

boxplot(myData$PE_Ratio, horizontal=TRUE)

26. $G_g = \sqrt[4]{(1 + 0.04)(1 + 0.08)(1 - 0.05)(1 + 0.06)} - 1 = 0.0313$, or 3.13%.

27. $G_R = \sqrt[5]{(1 - 0.03)(1 + 0.02)(1 - 0.05)(1 + 0.027)(1 + 0.031)} - 1 = -0.001$, or −0.1%.

28. $G_R = \sqrt[3]{(1 + 0.10)(1 + 0.05)(1 - 0.15)} - 1 = -0.006$, or −0.6%.

29. $G_g = \sqrt[2.5]{(1 + 0.02)(1 + 0.05)(1 + 0.018)} - 1 = 1.0352 - 1 = 0.0352$, or 3.52%.

30. $G_R = \sqrt[1.25]{(1 + 0.05)(1 + 0.03)} - 1 = 1.0647 - 1 = 0.0647$, or 6.47%.
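
The geometric mean calculations in Problems 26 through 30 all follow one pattern: multiply the growth factors, take a root, and subtract 1. A minimal R sketch is shown below. The function name `geo_mean_return()` and its `periods` argument are hypothetical; `periods` defaults to the number of returns but can be overridden for the fractional roots used in Problems 29 and 30.

```r
# Hypothetical helper: geometric mean of a vector of returns or growth rates.
# `periods` is the root to take; by default one period per return.
geo_mean_return <- function(returns, periods = length(returns)) {
  prod(1 + returns)^(1 / periods) - 1
}

g26 <- geo_mean_return(c(0.04, 0.08, -0.05, 0.06))             # about 0.0313
g28 <- geo_mean_return(c(0.10, 0.05, -0.15))                   # about -0.006
g29 <- geo_mean_return(c(0.02, 0.05, 0.018), periods = 2.5)    # about 0.0352
```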

31.
a.

               Year 1 – Year 2   Year 2 – Year 3   Year 3 – Year 4
Growth Rates   0.2222            0.3636            0.0667

b. $G_g = \sqrt[3]{(1 + 0.2222)(1 + 0.3636)(1 + 0.0667)} - 1 = 0.2114$, or 21.14%.

32.

a.

               Year 1 – Year 2   Year 2 – Year 3   Year 3 – Year 4
Growth Rates   0.0667            0.0781            0.1014

b. $G_g = \sqrt[3]{(1 + 0.0667)(1 + 0.0781)(1 + 0.1014)} - 1 = 0.082$, or 8.2%.

33.
$G_g = \sqrt[5]{(1 + 0.025)(1 + 0.036)(1 + 0.018)(1 + 0.022)(1 + 0.052)} - 1 = 0.0305$, or 3.05%.

34.
a. The arithmetic mean is given by
$\bar{x} = \frac{17.3 + 19.6 + 6.8 + 8.2}{4} = 12.98$

The average annual return from Year 1 through Year 4 is 12.98%.

b. The geometric mean is given by

$G_R = \sqrt[4]{(1 + 0.173)(1 + 0.196)(1 + 0.068)(1 + 0.082)} - 1 = 0.1284$

The geometric mean return from Year 1 through Year 4 is 12.84%.

c. By the end of Year 4, the total accumulation would be approximately (subject to
rounding)

$1{,}000(1 + 0.1284)^4 = \$1{,}621.26$

35.

a. The growth rates for each retailer are:

                Retailer 1   Retailer 2
Year 1-Year 2   −0.0784      −0.001
Year 2-Year 3   −0.0717      −0.0209

b. The average growth rates for each retailer are $G_1 = -0.075$; $G_2 = -0.011$.

36.
a. The arithmetic mean is given by

$\bar{x} = \frac{19.5 + 8.9 - 6.0 - 10.5 + 5.9}{5} = 3.56$

The average annual return from Year 1 through Year 5 is 3.56%.

b. The geometric mean return is given by

$G_R = \sqrt[5]{(1 + 0.195)(1 + 0.089)(1 - 0.06)(1 - 0.105)(1 + 0.059)} - 1 = 0.03$

The geometric mean return from Year 1 through Year 5 is 3.00%.

c. By the end of Year 5, the total accumulation would be approximately (subject to
rounding)

$20{,}000(1 + 0.03)^5 = \$23{,}185.48$
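
The "subject to rounding" caveat in Problems 34 and 36 can be made concrete with a short R sketch using the Problem 36 returns. Compounding the five actual returns gives the exact ending balance; compounding the geometric mean rounded to 3% gives the figure reported in the solution. The variable names are illustrative.

```r
returns <- c(0.195, 0.089, -0.06, -0.105, 0.059)   # Problem 36 annual returns

arith_mean <- mean(returns)                                 # 0.0356
geo_mean   <- prod(1 + returns)^(1 / length(returns)) - 1   # about 0.0300

# Ending value of $20,000 two ways:
exact_value   <- 20000 * prod(1 + returns)               # about 23,188.50
rounded_value <- 20000 * (1 + round(geo_mean, 2))^5      # about 23,185.48
```

The roughly three-dollar gap comes entirely from rounding the geometric mean to two decimals before compounding. The arithmetic mean (3.56%) overstates the compound growth rate, which is why the geometric mean is used for the accumulation.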

37.

a. $G_1 = \sqrt[3-1]{\frac{408.2}{379.8}} - 1 = 0.0367$, or 3.67%; $G_2 = \sqrt[3-1]{\frac{65.3}{63.4}} - 1 = 0.0149$, or 1.49%.

b. Retailer 1 has a higher growth rate over the three-year period.

38.

a.

                Annual Growth Rate
Year 1-Year 2   (23331 − 20117)/20117 = 0.1598
Year 2-Year 3   0.0850
Year 3-Year 4   0.0982
Year 4-Year 5   0.1008

$G_g = \sqrt[4]{(1 + 0.1598)(1 + 0.0850)(1 + 0.0982)(1 + 0.1008)} - 1 = 0.1106$, or 11.1%.

b. $G_g = \sqrt[4]{\frac{30601}{20117}} - 1 = 0.1106$, or 11.1%. As expected, both ways yield the same result.
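
The agreement between the two routes in Problem 38 follows because the product of the yearly growth factors telescopes to the ratio of the last value to the first. A quick R check, using only the figures printed in the solution (the four rounded growth rates and the two endpoint values), is sketched below; the variable names are illustrative.

```r
growth <- c(0.1598, 0.0850, 0.0982, 0.1008)   # rounded rates from part a

# Route 1: geometric mean of the yearly growth rates.
g_from_rates <- prod(1 + growth)^(1 / length(growth)) - 1

# Route 2: (last value / first value)^(1 / (n - 1)) - 1 with n = 5 years.
g_from_endpoints <- (30601 / 20117)^(1 / (5 - 1)) - 1

round(g_from_rates, 4)       # 0.1106
round(g_from_endpoints, 4)   # 0.1106
```

The two results differ only in the fifth decimal place, because the growth rates in part a were rounded to four decimals.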

39.

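a. Range = Max − Min = 42 − 10 = 32

b. $\mu = \frac{\sum x_i}{N} = \frac{120}{5} = 24$; $MAD = \frac{\sum |x_i - \mu|}{N} = \frac{56}{5} = 11.2$

c. $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{768}{5} = 153.6$

d. $\sigma = \sqrt{\sigma^2} = 12.39$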

40.

a. Range = Max − Min = 10 − (−8) = 18

b. $MAD = \frac{\sum |x_i - \mu|}{N} = \frac{24}{5} = 4.8$

c. $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{184.00}{5} = 36.80$

d. $\sigma = \sqrt{\sigma^2} = 6.07$

41.

a. Range = Max − Min = 52 − 32 = 20

b. $MAD = \frac{\sum |x_i - \bar{x}|}{n} = \frac{32}{6} = 5.33$

c. $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{256}{5} = 51.2$

d. $s = \sqrt{s^2} = 7.16$

42.

a. Range = Max − Min = 12 − (−10) = 22

b. $MAD = \frac{\sum |x_i - \bar{x}|}{n} = \frac{44}{6} = 7.33$

c. $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{406}{5} = 81.2$, $s = \sqrt{s^2} = 9.01$

43.

a. Minimum=467; Maximum=2,116; Range= 1,649

b. $\bar{x} = 1{,}306.94$; Median = 1,287.5

c. $s^2 = 112{,}987.05$; $s = 336.14$

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
var(myData$Expenditures)
sd(myData$Expenditures)
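
Problems 39 through 43 mix population formulas (divide by N) with sample formulas (divide by n − 1). The sketch below shows how each measure can be computed in R. The data vector is hypothetical, chosen only so the arithmetic is easy to follow. Two points are worth keeping in mind: R's `var()` and `sd()` use the sample (n − 1) divisor, and R's built-in `mad()` is the median absolute deviation, not the mean absolute deviation used in these solutions.

```r
x <- c(10, 12, 15, 18, 20)   # hypothetical data, not taken from the exercises
n <- length(x)

range_x <- max(x) - min(x)           # 10
mad_x   <- mean(abs(x - mean(x)))    # mean absolute deviation: 3.2

var_sample <- var(x)                 # divides by n - 1: 17
var_pop    <- var(x) * (n - 1) / n   # divides by N: 13.6

sd_sample <- sd(x)                   # sqrt(17)
sd_pop    <- sqrt(var_pop)           # sqrt(13.6)

cv <- sd_sample / mean(x)            # coefficient of variation
```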

44.

a. Firm A: $s^2 = 162.65$, $s = 12.75$
Firm B: $s^2 = 173.51$, $s = 13.17$.

b. Firm B’s stock price has greater variability as indicated by a higher standard
deviation.

c. Firm A, $CV = \frac{s}{\bar{x}} = \frac{12.75}{58.55} = 0.22$
Firm B, $CV = \frac{s}{\bar{x}} = \frac{13.17}{55.91} = 0.24$
Firm B’s stock price also has greater relative dispersion, as indicated by a higher
coefficient of variation.

R Code:
#Import the data file into a data frame (table) and label it myData
#Column names that contain a space must be wrapped in backticks
var(myData$`Firm A`)
sd(myData$`Firm A`)
var(myData$`Firm B`)
sd(myData$`Firm B`)
sd(myData$`Firm A`)/mean(myData$`Firm A`)
sd(myData$`Firm B`)/mean(myData$`Firm B`)

45.

a. Monthly rent: $\bar{x} = 1{,}222.93$, $s = 424.80$

b. Square footage: $\bar{x} = 1{,}286.03$, $s = 645.81$

c. Monthly rent: $CV = \frac{s}{\bar{x}} = \frac{424.80}{1{,}222.93} = 0.35$
Square footage: $CV = \frac{s}{\bar{x}} = \frac{645.81}{1{,}286.03} = 0.50$
Therefore, there is greater relative dispersion in square footage than in monthly
rent.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Rent)/mean(myData$Rent)

sd(myData$Footage)/mean(myData$Footage)

46.
a. $CV_A = \frac{s}{\bar{x}} = \frac{10{,}304.13}{79{,}231.23} = 0.13$

b. $CV_B = \frac{s}{\bar{x}} = \frac{7{,}510.96}{52{,}753.62} = 0.14$

c. Corporation B has a slightly higher coefficient of variation, which translates into
greater relative dispersion.

R Code:
#Import the data file into a data frame (table) and label it myData
sd(myData$`Corporation A`)/mean(myData$`Corporation A`)
sd(myData$`Corporation B`)/mean(myData$`Corporation B`)

47.

a. Range of household income = 70,647 − 37,881 = 32,766;
Range of house value = 537,400 − 94,500 = 442,900.

b. For household income, $\bar{x}_{Income} = 51{,}641.4$; $MAD_{Income} = 6{,}834.68$;
$s^2_{Income} = 70{,}323{,}534$; $s_{Income} = 8{,}385.91$.
For house value, $\bar{x}_{Value} = 199{,}324$; $MAD_{Value} = 71{,}738.88$; $s^2_{Value} = 8{,}758{,}919{,}820$;
$s_{Value} = 93{,}589.10$.

c. The household income and house value have different sample means, so we
cannot compare the MAD and standard deviation directly in order to determine
which data are more variable. The coefficient of variation is the recommended
measure to compare the dispersion between the data sets.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
#Calculating MAD and variance
mean(abs(myData$Income-mean(myData$Income)))
var(myData$Income)

mean(abs(myData$Value-mean(myData$Value)))
var(myData$Value)
#Calculating coefficient of variation
sd(myData$Income)/mean(myData$Income)
sd(myData$Value)/mean(myData$Value)

48.

a. Investment B provides a higher return. Investment A has less risk since
it has a smaller standard deviation.

b. $Sharpe_A = \frac{\bar{x} - R_f}{s_I} = \frac{10 - 1.4}{5} = 1.72$; $Sharpe_B = \frac{\bar{x} - R_f}{s_I} = \frac{15 - 1.4}{10} = 1.36$. The Sharpe
ratio is higher for investment A; hence it provides a higher reward per unit of
risk.

49.

a. Investment B has a higher mean return compared to investment A. The higher
standard deviation indicates that investment B has a higher risk as well.
Investment A has less risk.

b. $Sharpe_A = \frac{\bar{x} - R_f}{s_I} = \frac{8 - 2}{5} = 1.2$; $Sharpe_B = \frac{\bar{x} - R_f}{s_I} = \frac{10 - 2}{7} = 1.14$. Investment A has
a slightly higher Sharpe ratio. Hence it provides a higher reward per unit of risk.

50.

a. $\bar{x}_1 = 3$, $\bar{x}_2 = 5$. Investment 2 provides a higher return since it has a higher mean.

b. $s_1 = 5.29$, $s_2 = 9.02$. Investment 1 has less risk because it has a lower
standard deviation.

c. $Sharpe_1 = \frac{\bar{x} - R_f}{s_I} = \frac{3 - 1.2}{5.29} = 0.34$; $Sharpe_2 = \frac{\bar{x} - R_f}{s_I} = \frac{5 - 1.2}{9.02} = 0.42$. Investment 2
performs better because it offers more reward per unit of risk.
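
The Sharpe ratio comparisons in Problems 48 through 50 all use the same formula, (mean return − risk-free rate) / standard deviation. A minimal R sketch is shown below; the function name `sharpe_ratio()` is hypothetical and the inputs are the summary figures from Problems 48 and 49.

```r
# Hypothetical helper: reward per unit of risk.
sharpe_ratio <- function(mean_return, risk_free, sd_return) {
  (mean_return - risk_free) / sd_return
}

# Problem 48: risk-free rate 1.4
s48_a <- sharpe_ratio(10, 1.4, 5)    # 1.72
s48_b <- sharpe_ratio(15, 1.4, 10)   # 1.36

# Problem 49: risk-free rate 2
s49_a <- sharpe_ratio(8, 2, 5)       # 1.2
s49_b <- sharpe_ratio(10, 2, 7)      # about 1.14
```

In both problems the investment with the higher mean return has the lower Sharpe ratio, which shows why the ratio, rather than the mean alone, is used to compare reward against risk.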

51.

$\bar{x}_1 = 9.62$; $s_1 = 23.58$; $\bar{x}_2 = 12.38$; $s_2 = 15.45$

a. Stock 2

b. Stock 1 had a higher standard deviation, hence it was riskier. This result is
surprising because, in general, investments with higher returns often have
higher risk.

c. $Sharpe_{Stock1} = \frac{9.62 - 3}{23.58} = 0.28$, $Sharpe_{Stock2} = \frac{12.38 - 3}{15.45} = 0.61$.

Stock 2 has a higher Sharpe ratio; hence it has a higher reward per unit of risk
compared to Stock 1.

52.

$\bar{x}_1 = 13.01$; $s_1 = 38.61$; $\bar{x}_2 = 10.83$; $s_2 = 22.84$

a. Mutual Fund 1
b. Mutual Fund 1
c. $Sharpe_1 = \frac{13.01 - 3}{38.61} = 0.26$, $Sharpe_2 = \frac{10.83 - 3}{22.84} = 0.34$. Mutual Fund 2 had a higher
Sharpe ratio; hence it has a higher reward per unit of risk.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 3)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 3)/sd(myData$MF2); SR2

53.

$\bar{x}_1 = 7.38$; $s_1 = 35.12$; $\bar{x}_2 = 12.44$; $s_2 = 28.52$

a. Mutual Fund 2
b. Mutual Fund 2
c. $Sharpe_1 = \frac{7.38 - 2}{35.12} = 0.15$; $Sharpe_2 = \frac{12.44 - 2}{28.52} = 0.37$. Mutual Fund 2 had a much
higher Sharpe ratio than Mutual Fund 1. This suggests that Mutual Fund 2 had
more reward per unit of risk.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2

54.

a. The values 70 and 90 are two standard deviations below the mean and above
the mean, respectively. Using Chebyshev’s Theorem and $k = 2$, we have
$1 - 1/2^2 = 0.75$. In other words, Chebyshev’s Theorem asserts that at least 75%
of the scores fall within 70 and 90.

b. The values 65 and 95 are three standard deviations below the mean and above
the mean, respectively. Using Chebyshev’s Theorem and $k = 3$, we have
$1 - 1/3^2 = 0.89$. In other words, Chebyshev’s Theorem asserts that at least 89%
of the scores fall within 65 and 95.

55.

a. The values 1,300 and 1,700 are two standard deviations below the mean and
above the mean, respectively. Using Chebyshev’s Theorem and $k = 2$, we have
$1 - 1/2^2 = 0.75$. In other words, Chebyshev’s Theorem asserts that at least 75%
of the scores fall within 1,300 and 1,700.

b. The values 1,100 and 1,900 are four standard deviations below the mean and
above the mean, respectively. Using Chebyshev’s Theorem and $k = 4$, we have
$1 - 1/4^2 = 0.94$. In other words, Chebyshev’s Theorem asserts that at least 94%
of the scores fall within 1,100 and 1,900.
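
The Chebyshev bounds used in Problems 54 through 56 all come from 1 − 1/k². A small R sketch follows; the function name `chebyshev_bound()` is hypothetical. The mean of 1,500 and standard deviation of 100 used for Problem 55 are inferred from the solution text (1,300 and 1,700 being two standard deviations from the mean), not quoted from the exercise.

```r
# Hypothetical helper: minimum proportion of observations within k standard
# deviations of the mean. The bound is only informative for k > 1.
chebyshev_bound <- function(k) {
  stopifnot(all(k > 1))
  1 - 1 / k^2
}

b2 <- chebyshev_bound(2)   # 0.75
b3 <- chebyshev_bound(3)   # about 0.89
b4 <- chebyshev_bound(4)   # 0.9375, reported as 0.94

# Problem 55b: convert an interval endpoint to k, assuming mean 1500, sd 100.
k_55b <- (1900 - 1500) / 100   # 4
```

Unlike the empirical rule, this bound makes no assumption about the shape of the distribution, which is why it is stated as "at least" rather than "about".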

56.

a. We know that at least 75% of the observations fall within two standard
deviations of the mean. We are given the mean and standard deviation of 500
and 25, respectively. Therefore, at least 75% of the observations fall within 450
and 550.

b. We know that at least 89% of the observations fall within three standard
deviations of the mean. We are given the mean and standard deviation of 500
and 25, respectively. Therefore, at least 89% of the observations fall within 425
and 575.

57.

a. The interval [18, 22] is the interval $[\bar{x} - s, \bar{x} + s] = [20 - 2, 20 + 2]$. According to
the empirical rule, 68% of the observations fall within this interval.

b. The interval [16, 24] is the interval $[\bar{x} - 2s, \bar{x} + 2s] = [20 - 4, 20 + 4]$. 95% of the
observations fall in this interval.

c. 16 is two standard deviations below the mean. Since 95% of the observations fall
within 2 standard deviations of the mean, we conclude that 5% fall outside this
interval. Half of these observations fall below 16, hence 2.5% of the observations
fall below 16.
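
The empirical rule percentages in Problem 57 are rounded values for a bell-shaped distribution. As a rough check, the sketch below evaluates the same intervals with R's `pnorm()`, under the added assumption that the data are exactly normal with mean 20 and standard deviation 2; the solution itself only relies on the empirical rule.

```r
mu    <- 20
sigma <- 2

within_1sd <- pnorm(22, mu, sigma) - pnorm(18, mu, sigma)   # about 0.683
within_2sd <- pnorm(24, mu, sigma) - pnorm(16, mu, sigma)   # about 0.954
below_16   <- pnorm(16, mu, sigma)                          # about 0.023
```

The exact normal values (68.3%, 95.4%, 2.3%) are close to the 68%, 95%, and 2.5% figures quoted from the empirical rule.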

58.

a. According to the empirical rule about 68% of the observations are between 700
and 800. Hence, half of the remaining 32%, that is 16%, of the observations are
less than 700.

b. 16% of 500 is 80 observations.

59.

a. According to the empirical rule, about 95% of the observations are between 17
and 33. Hence, 95% + 2.5% = 97.5% of the observations are less than 33.

b. 97.5% of 1,000 is 975 observations.

60.

a. $[\bar{x} - 2s, \bar{x} + 2s] = [5 - 2 \times 2.5, 5 + 2 \times 2.5] = [0, 10]$. Given that 95% of the data falls
within two standard deviations of the mean, 95% + 2.5% = 97.5% of the
observations are positive.

b. 100% − 97.5% = 2.5% of the observations are not positive.

61.
74 is 2 standard deviations above the mean. By the empirical rule, about 95% of the
observations fall in the interval $[\bar{x} - 2s, \bar{x} + 2s] = [50 - 24, 50 + 24] = [26, 74]$. Therefore,
about 2.5% are more than 74; 2.5% of 250 is 6.25 observations, so roughly 6
observations are greater than 74.

62.
$\bar{x} = 9$; $s = 2$. The z-score for the smallest observation, 6, is $z = \frac{6 - 9}{2} = -1.5$. The z-scores
for the remaining values 9, 12, 10, 9, and 8 are 0, 1.5, 0.5, 0, and −0.5, respectively.

63.
$\bar{x} = 2.9$; $s = 6.03$. The smallest and the largest observations in the data set are −4 and 15,
respectively.
The z-score for the smallest observation is $z = \frac{-4 - 2.9}{6.03} = -1.14$.
The z-score for the largest observation is $z = \frac{15 - 2.9}{6.03} = 2.01$.
Since the absolute value of both z-scores is less than 3, we conclude that there are
no outliers in the data.
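
The z-score check in Problems 62 and 63 can be carried out for a whole data set at once. The sketch below uses the six values listed in the Problem 62 solution; the order in which they appear in the original exercise is an assumption and does not affect the result.

```r
x <- c(6, 9, 12, 10, 9, 8)        # values from the Problem 62 solution

z <- (x - mean(x)) / sd(x)        # sd() uses the sample (n - 1) formula
z                                 # -1.5  0.0  1.5  0.5  0.0 -0.5

has_outlier <- any(abs(z) > 3)    # FALSE: no |z| exceeds 3
```

Base R's `scale(x)` returns the same z-scores as a one-column matrix, which can be convenient when standardizing several variables together.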

64.

a. The salaries $66,000 and $78,000 are two standard deviations below the mean
and above the mean, respectively. Using Chebyshev’s theorem and $k = 2$, we
have $1 - 1/2^2 = 0.75$. In other words, Chebyshev’s theorem asserts that at least
75% of the faculty earns at least $66,000 but no more than $78,000.

b. The salaries $63,000 and $81,000 are three standard deviations below the mean
and above the mean, respectively. Using Chebyshev’s theorem and $k = 3$, we
have $1 - 1/3^2 = 0.89$. In other words, Chebyshev’s theorem asserts that at least
89% of the faculty earns at least $63,000 but no more than $81,000.

65.

a. About 68% of the observations fall in the interval [‒4, 20]. Thus, about
16% (32%/2) of the observations are greater than 20 percent.

b. About 95% of the observations fall in the interval [‒16, 32]. Thus
about 2.5% (5%/2) of the observations are less than ‒16 percent.

66.

a. About 68% of the scores are in the interval [84, 116].

b. About 95% of the scores are in the interval [68, 132]. About 2.5% of the scores
are less than 68.

c. Using part a, about 16% of the scores are more than 116.

67.

a. About 68% of returns are expected to be in the interval [2, 14]. The probability is
about 0.68.

b. About 32% of future returns fall out of the [2, 14] interval. About 16% (half of
32%) of returns are greater than 14 percent. The probability is about 0.16.

c. About 95% of future returns will fall in the interval [‒4, 20]. About 5% of them
will fall out of this interval. Hence about 2.5% of returns will fall below ‒4
percent.

68.
a. Game lengths of 2.2 and 3.8 hours are two standard deviations below the
mean and above the mean, respectively. Using Chebyshev’s theorem and $k = 2$,
we have $1 - 1/2^2 = 0.75$. Therefore, at least 75% of the games will last between
2.2 and 3.8 hours.

b. Using the empirical rule, we know that approximately 95% of the games will last
between 2.2 and 3.8 hours.

69.
a. $\bar{x}_{Income} = 51{,}641.4$; $s_{Income} = 8{,}385.91$. The smallest and the largest observations
are 37,881 and 70,647, respectively.
The z-score for the smallest observation is $z = \frac{37{,}881 - 51{,}641.4}{8{,}385.91} = -1.64$.
The z-score for the largest observation is $z = \frac{70{,}647 - 51{,}641.4}{8{,}385.91} = 2.27$.
Since the absolute value of both z-scores is less than 3, we conclude that there
are no outliers in the household income data.

b. $\bar{x}_{Value} = 199{,}324$; $s_{Value} = 93{,}589.10$. The smallest and the largest observations
are 94,500 and 537,400, respectively.
The z-score for the smallest observation is $z = \frac{94{,}500 - 199{,}324}{93{,}589.10} = -1.12$.
The z-score for the largest observation is $z = \frac{537{,}400 - 199{,}324}{93{,}589.10} = 3.61$.
Since the z-score for the largest observation is greater than 3, we conclude that
there are outliers in the house value data.

R Code:
#Import the data file into a data frame (table) and label it myData
#Calculations for Income
zMinIncome <- (min(myData$Income)-mean(myData$Income))/sd(myData$Income);
zMinIncome
zMaxIncome <- (max(myData$Income) -mean(myData$Income))/sd(myData$Income);
zMaxIncome
#Calculations for Value
zMinValue <- (min(myData$Value)-mean(myData$Value))/sd(myData$Value);
zMinValue
zMaxValue <- (max(myData$Value)-mean(myData$Value))/sd(myData$Value);
zMaxValue

70.
a. $\bar{x}_{MutualFund1} = 7.38$; $s_{MutualFund1} = 35.12$. The smallest and the largest
observations for Mutual Fund 1 are −51.09 and 90.29, respectively.
The z-score for the smallest observation is $z = \frac{-51.09 - 7.38}{35.12} = -1.66$.
The z-score for the largest observation is $z = \frac{90.29 - 7.38}{35.12} = 2.36$.
Since the absolute value of both z-scores is less than 3, we conclude that there
are no outliers in the distribution for Mutual Fund 1.

b. $\bar{x}_{MutualFund2} = 12.44$; $s_{MutualFund2} = 28.52$. The smallest and the largest
observations for Mutual Fund 2 are −54.00 and 52.02, respectively.
The z-score for the smallest observation is $z = \frac{-54.00 - 12.44}{28.52} = -2.33$.
The z-score for the largest observation is $z = \frac{52.02 - 12.44}{28.52} = 1.39$.
Since the absolute value of both z-scores is less than 3, we conclude that there
are no outliers in the distribution for Mutual Fund 2.

71.

3-21
© 2022 by McGraw-Hill Education. This is proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any
manner. This document may not be copied, scanned, duplicated, forwarded, distributed, or posted on a website, in whole or part.
Chapter 03 - Numerical Descriptive Measures

a. $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{84.40}{4} = 21.1$.

b. $r_{xy} = \frac{s_{xy}}{s_x s_y} = 0.93$. There is a strong positive relationship between x and y.
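
The two formulas above can be checked numerically. The following is a minimal sketch, assuming two small illustrative vectors x and y with n = 5 (hypothetical values, not the exercise data); it computes the sample covariance and the correlation coefficient from their definitions, which can then be compared with R's built-in cov() and cor().

```r
# Illustrative data only (hypothetical; not the data from the exercise)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)
n <- length(x)

# Sample covariance: sum of cross-products of deviations divided by n - 1
sxy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Correlation coefficient: covariance scaled by both standard deviations
rxy <- sxy / (sd(x) * sd(y))

sxy; rxy
# Built-in equivalents for comparison
cov(x, y); cor(x, y)
```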

72.

a. $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-49.21}{4} = -12.3$.

b. $r_{xy} = \frac{s_{xy}}{s_x s_y} = -0.96$. There is a strong negative relationship between x and y.

73.

a. $s_{xy} = 781.266$. The covariance suggests that there is a positive linear
relationship between the two variables.

b. $r_{xy} = 0.89$. The correlation coefficient indicates that there is a strong,
positive linear relationship between the two mutual funds.

R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)

74.

a. $s_{xy} = 631.39$. The covariance suggests that there is a positive linear
relationship between the two variables.

b. $r_{xy} = 0.45$. The correlation coefficient indicates that there is a
moderate, positive linear relationship between the price of a home and
the number of days it takes to sell it.

R Code:
#Import the data file into a data frame (table) and label it myData

cov(myData)
cor(myData)

75.

a. $s_{xy} = 5.74$. There is a positive linear relationship between GRE and subsequent
GPA in graduate school.

b. $r_{xy} = 0.52$. The positive linear relationship between GRE and GPA is moderate.
GRE scores seem to be a good indicator for later performance in graduate school.

R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)

76.
a. $s_{xy} = 66.79$. There is a positive linear relationship between education
and salary.

b. $r_{xy} = 0.90$. The correlation coefficient indicates that the relationship
between Education and Salary is positive and strong. Education seems
to be a good indicator of Salary.

R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)

77.

a. $r_{Age,Happiness} = 0.57$. The correlation between age and happiness is positive and
moderate.

b. [Figure: Scatter Plot for Age and Happiness. Horizontal axis: Age (0 to 100); vertical axis: Happiness (0 to 100).]

The correlation analysis suggests there is a positive correlation for the whole
data set. However, the scatterplot for age and happiness in the above figure
suggests that age and happiness are not universally positively related: they
are negatively correlated for ages below 40 and positively correlated for
ages above 40.

R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData)
plot(Happiness ~ Age, data=myData, col="chocolate", pch=16)
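
The observation that the sign of the relationship changes around age 40 can be checked by computing the correlation separately within each age group. The following is a minimal sketch, assuming myData has the columns Age and Happiness used above; the small data frame is a hypothetical stand-in so the snippet runs on its own, and the cutoff of 40 is the one read off the scatterplot.

```r
# Hypothetical stand-in for the imported data (replace with the real myData)
myData <- data.frame(
  Age       = c(20, 25, 30, 35, 45, 50, 55, 60),
  Happiness = c(70, 65, 60, 55, 60, 65, 70, 75)
)

# Split at age 40 and compute the correlation within each group
younger <- subset(myData, Age < 40)
older   <- subset(myData, Age > 40)
rYounger <- cor(younger$Age, younger$Happiness)
rOlder   <- cor(older$Age, older$Happiness)
rYounger; rOlder
```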

78.

a. $s_{Income,Value} = 642{,}995{,}533.1$; $r_{Income,Value} = 0.82$. The correlation between household
income and house values is positive and rather strong.

b. $s_{Income,Foreign} = 30{,}867.06$; $r_{Income,Foreign} = 0.61$. The correlation between household
income and the percentage of the residents who are foreign born is positive and
moderate.

c. $s_{Income,NoHS} = -13{,}232.11$; $r_{Income,NoHS} = -0.46$. The correlation between household
income and the percentage of the residents who are without a high school
diploma is negative and moderate.

R Code:
#Import the data file into a data frame (table) and label it myData
cov(myData)
cor(myData)

79.

a. Highway 1: $\bar{x} = 56.6$; median = 56
Highway 2: $\bar{x} = 66$; median = 66

b. Highway 1: s = 6.9828
Highway 2: s = 3.0043

c. Variability is higher on Highway 1, which has the lower speed limit, supporting the
researcher’s belief.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Highway_1)
sd(myData$Highway_2)

80.
a. Firm A: $\bar{x} = 75.39$; $s^2 = 52.02$; $s = 7.21$
Firm B: $\bar{x} = 106.07$; $s^2 = 319.21$; $s = 17.87$

b. Firm B had the higher average stock price.

c. Firm B had the higher standard deviation and therefore the greater dispersion in stock price.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Firm_A)
sd(myData$Firm_B)

81.
a. Mutual Fund 1 had the higher reward over this period since its mean was greater
than the mean of Mutual Fund 2 (10.58% > 9.43%).

b. Mutual Fund 1 was riskier over this period since its standard deviation was greater
than the standard deviation of Mutual Fund 2 (37.09% > 22.02%).

c. The Sharpe ratio for Mutual Fund 1 is $\frac{10.58 - 2}{37.09} = 0.2312$. The Sharpe ratio for
Mutual Fund 2 is $\frac{9.43 - 2}{22.02} = 0.3375$. Because the Sharpe ratio for Mutual Fund 2 is
greater than the Sharpe ratio for Mutual Fund 1, Mutual Fund 2 has a higher reward
per unit of risk.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)
sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2

82.

$\bar{x} = \sum w_i m_i = 1{,}820$
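
The weighted mean is the sum of each value multiplied by its weight. The following is a minimal sketch, assuming hypothetical values m and weights w that sum to 1 (the exercise data are not reproduced here, so the result differs from the 1,820 reported above).

```r
# Hypothetical values and weights for illustration only
m <- c(10, 20, 30)
w <- c(0.2, 0.5, 0.3)   # weights sum to 1

# Weighted mean from the definition
xbar <- sum(w * m)

# Base R equivalent; weighted.mean() also rescales weights that do not sum to 1
xbarCheck <- weighted.mean(m, w)
xbar; xbarCheck
```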

83.

a. Firm 1:
$G_g = \sqrt[5]{(1 + 0.03)(1 + 0.021)(1 + 0.218)(1 + 0.048)(1 + 0.012)} - 1 \approx 0.063$, or 6.3%

Firm 2:
$G_g = \sqrt[5]{(1 + 0.015)(1 + 0.091)(1 + 0.057)(1 - 0.001)(1 - 0.082)} - 1 \approx 0.014$, or 1.4%

b. $s_1 = 8.61\%$ and $s_2 = 6.56\%$

c. Firm 1 has the higher average growth rate at 6.3%, but also the higher
variability at 8.61% against 6.56% for Firm 2.
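
No R code accompanies this solution, so the following is a minimal sketch of how parts a and b could be computed. The growth rates are taken from the expressions in part a; geoMean is a hypothetical helper name.

```r
# Annual growth rates from part a, in decimal form
firm1 <- c(0.03, 0.021, 0.218, 0.048, 0.012)
firm2 <- c(0.015, 0.091, 0.057, -0.001, -0.082)

# Geometric mean return: nth root of the product of (1 + R_i), minus 1
# (geoMean is a hypothetical helper name)
geoMean <- function(r) prod(1 + r)^(1 / length(r)) - 1

gg1 <- geoMean(firm1)
gg2 <- geoMean(firm2)

# Sample standard deviations, expressed in percent
s1 <- sd(firm1) * 100
s2 <- sd(firm2) * 100

round(c(gg1, gg2), 3)
round(c(s1, s2), 2)
```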

84.
a. $G_g(\text{Firm 1}) = \sqrt{\frac{14.20}{15.73}} - 1 = -0.050$, or −5%;
$G_g(\text{Firm 2}) = \sqrt{\frac{2.99}{3.06}} - 1 = -0.012$, or −1.2%

b. Firm 2’s average growth rate is less negative than Firm 1’s over this
period.
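
When only a starting and an ending value are available, the average growth rate follows from their ratio. The following is a minimal sketch; growthRate is a hypothetical helper name, and n = 2 periods is inferred from the square root used in part a.

```r
# Geometric mean growth rate from a starting and an ending value over n periods
# (growthRate is a hypothetical helper; n = 2 matches the square root in part a)
growthRate <- function(start, end, n) (end / start)^(1 / n) - 1

g1 <- growthRate(start = 15.73, end = 14.20, n = 2)
g2 <- growthRate(start = 3.06, end = 2.99, n = 2)
round(c(g1, g2), 3)
```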

85. $r_{Life\_Expectancy,Obesity} = -0.6750$; the correlation coefficient suggests that life
expectancy and obesity have a negative and moderate linear relationship.

R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData[, 2:3])

86. $r_{Final,Midterm} = 0.5518$; the correlation coefficient suggests that final and midterm
scores have a positive and moderate linear relationship.

R Code:
#Import the data file into a data frame (table) and label it myData
cor(myData)

87.
a. Mutual Fund 1 had the higher reward over this period since its mean was greater
than the mean of Mutual Fund 2 (17.70% > 10.03%).

b. Mutual Fund 1 was riskier over this period since its standard deviation was greater
than the standard deviation of Mutual Fund 2 (36.56% > 22.95%).

c. The Sharpe ratio for Mutual Fund 1 is $\frac{17.70 - 2}{36.5637} = 0.43$. The Sharpe ratio for Mutual
Fund 2 is $\frac{10.03 - 2}{22.9492} = 0.35$. Because the Sharpe ratio for Mutual Fund 1 is greater
than the Sharpe ratio for Mutual Fund 2, Mutual Fund 1 has a higher reward per unit
of risk.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$MF1)

sd(myData$MF2)
#Calculating Sharpe ratios
SR1 <- (mean(myData$MF1) - 2)/sd(myData$MF1); SR1
SR2 <- (mean(myData$MF2) - 2)/sd(myData$MF2); SR2
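
Without the data file, the Sharpe ratios in part c can still be checked from the reported means and standard deviations. The following is a minimal sketch; sharpe is a hypothetical helper name, and the default risk-free rate of 2 is the value used in the calculation above.

```r
# Sharpe ratio from summary statistics: (mean return - risk-free rate) / sd
# (sharpe is a hypothetical helper; riskFree = 2 is the rate used above)
sharpe <- function(meanReturn, sdReturn, riskFree = 2) {
  (meanReturn - riskFree) / sdReturn
}

sr1 <- sharpe(17.70, 36.5637)
sr2 <- sharpe(10.03, 22.9492)
round(c(sr1, sr2), 2)
```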

88.
a. Boxplot for Mutual Fund 1:

The boxplot suggests that there are outliers in the upper part of the distribution.

b. $\bar{x}_1 = 17.6992$; $s_1 = 36.5637$. The smallest and the largest observations are −51.09 and
131.75, respectively.
The z-score for the smallest observation is $z = \frac{-51.09 - 17.6992}{36.5637} = -1.8814$.
The z-score for the largest observation is $z = \frac{131.75 - 17.6992}{36.5637} = 3.1192$.

Since the z-score for the largest observation is greater than 3, we conclude that there
are outliers for Mutual Fund 1. This is consistent with the boxplot, which showed
outliers in the upper part of the distribution.

c. Boxplot for Mutual Fund 2:

The boxplot suggests that there is an outlier in the lower part of the distribution.

d. $\bar{x}_2 = 10.0314$; $s_2 = 22.9492$. The smallest and the largest observations are −54.00 and
52.02, respectively.
The z-score for the smallest observation is $z = \frac{-54.00 - 10.0314}{22.9492} = -2.7901$.
The z-score for the largest observation is $z = \frac{52.02 - 10.0314}{22.9492} = 1.8296$.

Since the absolute value of both z-scores is less than 3, we conclude that there are no
outliers for Mutual Fund 2. This is not consistent with the boxplot, which showed an
outlier in the lower part of the distribution. Recall that z-scores are reliable indicators
of outliers when the distribution is relatively bell-shaped and symmetric. Since the
boxplot indicates that Mutual Fund 2 is not symmetric, we are better served
identifying outliers in this case with a boxplot.
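
The disagreement between the two rules is easy to reproduce on a small skewed sample. The following is a minimal sketch using a hypothetical right-skewed vector (not the mutual fund data): the z-score rule flags nothing, while the boxplot rule, taken here from boxplot.stats(), flags the extreme value.

```r
# Hypothetical right-skewed sample (not the mutual fund data)
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 20)

# z-score rule: flag observations more than 3 standard deviations from the mean
z <- (x - mean(x)) / sd(x)
zOutliers <- x[abs(z) > 3]

# Boxplot rule: points beyond 1.5 * IQR from the hinges,
# as returned by boxplot.stats(), which boxplot() uses internally
boxOutliers <- boxplot.stats(x)$out

zOutliers     # no values flagged
boxOutliers   # the value 20 is flagged
```

The single large value inflates the standard deviation, which pulls its own z-score below 3; the hinges used by the boxplot are not affected in the same way.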

89.

a.

The boxplot suggests that there are no outliers for the Expenditures variable.
b. $\bar{x} = 1{,}306.94$; $s = 336.1355$. The smallest and the largest observations for the
Expenditures variable are 467 and 2,116, respectively.
The z-score for the smallest observation is $z = \frac{467 - 1{,}306.94}{336.1355} = -2.4988$.
The z-score for the largest observation is $z = \frac{2{,}116 - 1{,}306.94}{336.1355} = 2.4069$.

Since the absolute value of both z-scores is less than 3, we conclude that there are no
outliers for the Expenditures variable. This is consistent with the boxplot, which
showed no outliers.

R Code:
#Import the data file into a data frame (table) and label it myData
#Construct boxplot
boxplot(myData$Expenditures, horizontal=TRUE)
#Calculations for Expenditures
zMinExpenditures <- (min(myData$Expenditures)-
mean(myData$Expenditures))/sd(myData$Expenditures); zMinExpenditures
zMaxExpenditures <- (max(myData$Expenditures) -
mean(myData$Expenditures))/sd(myData$Expenditures); zMaxExpenditures

90.

a. mean = 2.3122; median = 2.18

b. Q1 = 2.01; 25% of the states had gas prices less than $2.01, and 75% of the states had
gas prices greater than $2.01.
Q3 = 2.55; 75% of the states had gas prices less than $2.55, and 25% of the states
had gas prices greater than $2.55.

c. $s^2 = 0.1482$; $s = 0.3849$


d.

The boxplot suggests that there is an outlier in the upper part of the distribution.

e. $\bar{x} = 2.3122$; $s = 0.3849$. The minimum and maximum observations in the data set are
1.84 and 3.37, respectively.

The z-score for the minimum observation is $z = \frac{1.84 - 2.3122}{0.3849} \approx -1.2266$.
The z-score for the maximum observation is $z = \frac{3.37 - 2.3122}{0.3849} \approx 2.7481$.
Since both z-scores are between −3 and 3, we conclude there are no outliers in the data.
This result is not consistent with the boxplot in part d. Recall that z-scores are reliable
indicators of outliers when the distribution is relatively bell-shaped and symmetric.
Since the boxplot indicates that Price is positively skewed, we are better served
identifying outliers in this case with a boxplot.

91.
a. $\bar{x}_{Price} = 15{,}086.30$; $\bar{x}_{Age} = 5.05$; $\bar{x}_{Miles} = 45{,}810.05$
b. $s_{Price} = 3{,}560.87$; $s_{Age} = 2.68$; $s_{Miles} = 22{,}642.97$
c. $r_{Price,Age} = -0.91$; strong negative linear relationship between the price of a car and
its age.
d. $r_{Price,Miles} = -0.76$; strong negative linear relationship between the price of a car and
its mileage.

R Code:
#Import the data file into a data frame (table) and label it myData
summary(myData)
sd(myData$Price)
sd(myData$Age)
sd(myData$Miles)
cor(myData)

92.
a. $\bar{x}_{College\,GPA} = 3.15$; $\bar{x}_{SAT} = 1282.72$

b. Mean College GPA for white students = 3.26
Mean College GPA for nonwhite students = 2.99
The mean College GPA is higher for white students.

c. Mean SAT for white students = 1280.09
Mean SAT for nonwhite students = 1286.67
The mean SAT is higher for nonwhite students.
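
No R code accompanies this solution. The following is a minimal sketch of how the overall and group means could be obtained with mean() and tapply(). The column names College_GPA, SAT, and White are assumptions about the data file, and the small data frame is a hypothetical stand-in so the snippet runs on its own.

```r
# Hypothetical stand-in for the imported data; column names are assumptions
myData <- data.frame(
  College_GPA = c(3.0, 3.4, 2.8, 3.2),
  SAT         = c(1200, 1300, 1250, 1350),
  White       = c(1, 1, 0, 0)   # 1 = white, 0 = nonwhite
)

# Overall means
mean(myData$College_GPA)
mean(myData$SAT)

# Group means by the White indicator
gpaByGroup <- tapply(myData$College_GPA, myData$White, mean)
satByGroup <- tapply(myData$SAT, myData$White, mean)
gpaByGroup; satByGroup
```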

93.
a. $\bar{x}_{Income} = 276.86$; $\bar{x}_{Hours} = 6.77$

b. Mean Income for Hot days = 324.88
Mean Income for non-Hot days = 236.42
The mean Income is higher on Hot days.

c. Mean Income for Holidays = 325.54
Mean Income for non-Holidays = 248.09
The mean Income is higher on Holidays.

Excel: =AVERAGE(A2:A36)
=AVERAGE(B2:B36)
=AVERAGEIF(C2:C36,1,A2:A36)
=AVERAGEIF(C2:C36,0,A2:A36)

=AVERAGEIF(D2:D36,1,A2:A36)
=AVERAGEIF(D2:D36,0,A2:A36)

R Code:
#Import the data file into a data frame (table) and label it myData
mean(myData$Income)
mean(myData$Hours)
tapply(myData$Income, myData$Hot, mean)
tapply(myData$Income, myData$Holiday, mean)
