Tutorial 5 - Bivariate Analysis
Correlation Coefficient
1. The data below show the ages (in years) of the husbands and wives of six couples. Find the
product-moment correlation coefficient of the data and interpret the value.
Ages (in years) of husbands and wives of six couples
Husband’s age 43 57 28 19 35 39
Wife’s age 37 51 32 20 33 38
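As a cross-check for Question 1, the product-moment correlation coefficient can be computed directly from the six pairs. The following is a minimal Python sketch using only the standard library; the function name `pearson_r` is an illustrative choice and is not part of the tutorial.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation: r = SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sums of squares and cross-products.
    ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(a * a for a in x) - sum_x ** 2 / n
    ss_yy = sum(b * b for b in y) - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Data from the table in Question 1.
husband = [43, 57, 28, 19, 35, 39]
wife = [37, 51, 32, 20, 33, 38]

r = pearson_r(husband, wife)
print(f"r = {r:.3f}")
```

A value close to +1 would indicate a strong positive linear relationship between the two sets of ages.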
2. An auto manufacturing company wanted to investigate how the price of one of its car models
depreciates with age. The research department at the company took a sample of eight cars of this
model and collected the following information on the ages (in years) and prices (in hundreds of
dollars) of these cars. The following are some summarized statistics.
∑x = 42, ∑y = 1133, ∑x² = 264, ∑y² = 213,565, ∑xy = 4450
(a) Show that the Pearson correlation coefficient is −0.986. Explain the meaning of the value.
(b) What will happen to the prices of the cars when the ages increase?
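When only summary statistics are given, as in Question 2, the correlation coefficient follows from the corrected sums SSxy, SSxx and SSyy. The sketch below, with illustrative variable names, works through that arithmetic for the n = 8 cars. With these sums SSxy comes out negative, so r carries a negative sign with magnitude 0.986.

```python
import math

# Summary statistics given in the question (n = 8 cars).
n = 8
sum_x, sum_y = 42, 1133
sum_x2, sum_y2, sum_xy = 264, 213565, 4450

# Corrected sums of squares and cross-products.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(f"SSxy = {ss_xy}, SSxx = {ss_xx}, SSyy = {ss_yy:.3f}")
print(f"r = {r:.3f}")
```

The negative sign is what answers part (b): price tends to fall as age increases.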
3. The following table gives information on the number of megapixels and the prices of nine randomly
selected point-and-shoot digital cameras that were available on bestbuy.com. Calculate the value
of the Pearson correlation coefficient and interpret the value. (0.933)
Megapixels 10.3 10.2 7.0 9.1 10.0 12.1 8.0 5.0 14.7
Price (RM) 130 150 62 160 200 280 125 60 400
[Scatter diagram: Blood Sugar (mg/dL) versus Distance (miles)]
STA408 Tutorial 5 Chapter 5: Bivariate Analysis
Analysis of Variance
Model Summary
Coefficients
(a) Based on the scatter diagram, state the relationship between distance run and blood sugar
level.
(b) Identify the independent and dependent variables.
(c) Based on the output, state the regression equation. What method has been used to estimate
the coefficients?
(d) State the regression coefficients and explain what they mean.
(e) State the value of the coefficient of determination and interpret the value.
(f) Determine the value of the correlation coefficient and interpret the value.
(g) Estimate the blood sugar level if he runs a distance of 3.5 miles. (102.825 mg/dL)
(h) Can we conclude that the linear regression model is significant at the 1% significance level?
5. An auto manufacturing company wanted to investigate how the price of one of its car models
depreciates with age. The research department at the company took a sample of eight cars of this
model and collected the following information on the ages (in years) and prices (in hundreds of
dollars) of these cars. The following are some summarized statistics.
∑x = 42, ∑y = 1133, ∑x² = 264, ∑y² = 213,565, ∑xy = 4450
(a) Identify the independent and dependent variables in this analysis.
(b) Show that the Pearson correlation coefficient is −0.986. (Use r = SSxy / √(SSxx · SSyy).)
(c) What percentage of the variation in prices is explained by the age of the cars? (R² = 0.9722)
(d) Name the statistic used to determine the answer in (c). (Coefficient of determination)
(e) Find the slope and 𝑦-intercept of the regression equation. (𝑏 = −34.4425, 𝑎 = 322.4481)
(f) Based on the calculation in (e), write the complete estimated regression equation and
interpret the slope in the context of the problem.
(g) Predict the price of a car that is 7 years old. (81.35 hundreds of dollars)
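The answers quoted for parts (e) and (g) can be checked from the summary statistics using the least-squares formulas b = SSxy / SSxx and a = ȳ − b·x̄. This is a small sketch under those formulas; the helper name `predict_price` is hypothetical.

```python
# Summary statistics given in the question (n = 8 cars).
n = 8
sum_x, sum_y = 42, 1133
sum_x2, sum_xy = 264, 4450

ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b = ss_xy / ss_xx                 # slope: change in price per extra year of age
a = sum_y / n - b * sum_x / n     # y-intercept: fitted price at age 0

def predict_price(age_years):
    """Fitted price (in hundreds of dollars) for a car of the given age."""
    return a + b * age_years

print(f"b = {b:.4f}, a = {a:.4f}")
print(f"Predicted price at 7 years: {predict_price(7):.2f}")
```

Small differences in the last decimal place of the intercept can arise depending on whether the slope is rounded before it is substituted.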
6. The following table gives information on the number of megapixels and the prices of nine randomly
selected point-and-shoot digital cameras that were available on bestbuy.com. The scatter diagram
and Minitab output are shown below.
Megapixels 10.3 10.2 7.0 9.1 10.0 12.1 8.0 5.0 14.7
Price (RM) 130 150 62 160 200 280 125 60 400
[Scatter diagram: Price (RM) versus Megapixels]
Analysis of Variance
Model Summary
S R-sq R-sq(adj) R-sq(pred)
41.6463 87.03% 85.18% 71.76%
Coefficients
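The Model Summary figures above (S, R-sq and R-sq(adj)) can be reproduced from the nine data pairs. The sketch below assumes the usual simple-regression definitions: R² = SSR / SST, adjusted R² = 1 − (1 − R²)(n − 1)/(n − 2), and S = √(SSE / (n − 2)). Variable names are illustrative.

```python
import math

# Data from the table in Question 6.
megapixels = [10.3, 10.2, 7.0, 9.1, 10.0, 12.1, 8.0, 5.0, 14.7]
price = [130, 150, 62, 160, 200, 280, 125, 60, 400]

n = len(megapixels)
mean_x = sum(megapixels) / n
mean_y = sum(price) / n

ss_xx = sum((x - mean_x) ** 2 for x in megapixels)
ss_yy = sum((y - mean_y) ** 2 for y in price)          # total sum of squares
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(megapixels, price))

ss_reg = ss_xy ** 2 / ss_xx      # regression sum of squares
ss_err = ss_yy - ss_reg          # error sum of squares

r_sq = ss_reg / ss_yy
r_sq_adj = 1 - (1 - r_sq) * (n - 1) / (n - 2)
s = math.sqrt(ss_err / (n - 2))  # standard error of the estimate

print(f"S = {s:.4f}, R-sq = {r_sq:.2%}, R-sq(adj) = {r_sq_adj:.2%}")
```

Taking the positive square root of R-sq (the slope is positive here) gives the correlation coefficient quoted in Question 3.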
7. The data of annual energy consumption in billions of BTU for both natural gas and coal for the
randomly selected states are given as follows.
∑x = 2256, ∑y = 3088, ∑xy = 1283269, ∑x² = 1079380, ∑y² = 1690322
The Minitab output for the data is shown below.
Regression Analysis: Coal versus Gas
Analysis of Variance
Source      DF  Adj SS  Adj MS  F-Value  P-Value
Regression   1   64590   64590    7.089    0.056
  Gas        1   64590   64590
Error        4   36442    9111
Total        5  101031
Model Summary
Coefficients
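The Analysis of Variance table for Question 7 can be rebuilt from the summary statistics. The sketch below assumes x is gas and y is coal (the output is titled "Coal versus Gas") and takes n = 6 from the Total DF of 5. The P-value is not computed here, since that would need an F-distribution routine outside the standard library.

```python
# Summary statistics from the question; n = 6 is inferred from Total DF = 5.
n = 6
sum_x, sum_y = 2256, 3088                    # x = gas, y = coal (assumed from the output title)
sum_xy, sum_x2, sum_y2 = 1283269, 1079380, 1690322

ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_total = sum_y2 - sum_y ** 2 / n           # Total SS

ss_reg = ss_xy ** 2 / ss_xx                  # Regression SS (1 degree of freedom)
ss_err = ss_total - ss_reg                   # Error SS (n - 2 degrees of freedom)

ms_reg = ss_reg / 1
ms_err = ss_err / (n - 2)
f_value = ms_reg / ms_err

print(f"Regression SS = {ss_reg:.0f}, Error SS = {ss_err:.0f}, Total SS = {ss_total:.0f}")
print(f"F = {f_value:.3f}")
```

The reported P-value of 0.056 would then be compared with whatever significance level the question specifies.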
8. The experience (in years) and monthly salaries (in hundreds of RM) of nine randomly selected
secretaries are tabulated in the table below.
Experience 14 3 5 6 4 9 18 5 16
Monthly salary 62 29 37 43 35 60 67 32 60
Analysis of Variance
Model Summary
Coefficients
9. One end A of an elastic string was attached to a horizontal bar, and a mass of m grams was attached
to the other end B. The mass was suspended freely and allowed to settle vertically below A. The
length AB, l mm, was recorded for various masses as follows.
Analysis of Variance
Model Summary
Coefficients
10. A company manufactures an electronic device to be used in a very wide temperature range. The
company knows that increased temperature shortens the lifetime of the device. The following data
were obtained.
Temperature (°C) 10 20 30 40 50 60 70 80 90
Lifetime (hours) 420 365 285 220 176 117 69 34 5
Analysis of Variance
Model Summary
Coefficients
(a) Calculate the correlation coefficient using the data given in the table above and interpret the
value.
(b) Write down the regression equation and interpret the value of the slope.
(c) State the coefficient of determination and explain its meaning.
(d) Based on the regression equation, estimate the lifetime of the device when the temperature
used is 65 °C.
(e) Perform a test to determine whether the linear regression model is significant. Use a 1% level
of significance.
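Parts (a) to (e) of Question 10 can be worked from the nine data pairs, since the Minitab tables are not reproduced here. The following is a sketch only. The critical value F(0.01; 1, 7) = 12.25 is taken from standard F tables rather than from the tutorial, and the variable names are illustrative.

```python
import math

# Data from the table in Question 10.
temperature = [10, 20, 30, 40, 50, 60, 70, 80, 90]
lifetime = [420, 365, 285, 220, 176, 117, 69, 34, 5]

n = len(temperature)
mean_x = sum(temperature) / n
mean_y = sum(lifetime) / n

ss_xx = sum((x - mean_x) ** 2 for x in temperature)
ss_yy = sum((y - mean_y) ** 2 for y in lifetime)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temperature, lifetime))

r = ss_xy / math.sqrt(ss_xx * ss_yy)   # (a) correlation coefficient
b = ss_xy / ss_xx                      # (b) slope
a = mean_y - b * mean_x                # (b) intercept
r_sq = r ** 2                          # (c) coefficient of determination
estimate_65 = a + b * 65               # (d) fitted lifetime at 65 degrees C

# (e) F test for the regression: F = MSR / MSE with 1 and n - 2 degrees of freedom.
ss_reg = ss_xy ** 2 / ss_xx
ss_err = ss_yy - ss_reg
f_value = ss_reg / (ss_err / (n - 2))
F_CRIT = 12.25                         # assumed table value for F(0.01; 1, 7)
significant = f_value > F_CRIT

print(f"r = {r:.3f}, R-sq = {r_sq:.2%}")
print(f"lifetime = {a:.3f} + ({b:.4f}) * temperature")
print(f"Estimate at 65 C: {estimate_65:.2f} hours")
print(f"F = {f_value:.1f}, significant at 1%: {significant}")
```

Comparing F with a tabulated critical value is one way to carry out the test. Comparing the P-value in the Minitab output with 0.01 leads to the same decision.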
11. A study on the amount of rainfall and the quantity of air pollution removed produced the following
data.
Daily rainfall (in 0.01 cm) 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1 7.5
Particulate removed (g/m³) 116 118 126 121 132 118 114 108 141
Analysis of Variance
Model Summary
Coefficients
(f) The constant term in a regression equation can be interpreted as the value of the predictor
variable when the value of the explanatory variable is zero.
(Answer: (a) TRUE, (b) FALSE, (c) FALSE, (d) FALSE, (e) FALSE, (f) FALSE)