Correlation and Regression

This document contains 10 questions about correlation and regression analysis based on bivariate data presented in tables. The questions require calculating the product moment correlation coefficient r using a statistical calculator, interpreting the value of r, using linear regression to find the equation of the regression line and estimate values, and testing for evidence of correlation at various significance levels.


Created by T. Madas

CORRELATION & REGRESSION
Part 1


Question 1 (**)
The annual car sales of a small car manufacturer, c , and the annual advertising
expenditure, £ a , have product moment correlation coefficient $r_{ac}$ .

The data is coded as

$x = c - 7000$ and $y = \dfrac{a}{1000}$ ,

and the summary is shown in the table below.

Year 2010 2011 2012 2013 2014 2015 2016 2017
x 52 340 511 621 444 700 805 921
y 120 126 134 138 132 146 153 160

a) Find, using a statistical calculator, the value of the product moment correlation
coefficient between x and y , denoted by $r_{xy}$ .

b) State with full justification the value of $r_{ac}$ .

c) Interpret the value of $r_{ac}$ .

MMS-M , $r_{xy} \approx 0.969$ , $r_{ac} \approx 0.969$


Question 2 (**)
The percentage mock exam marks, of a random sample of 8 G.C.S.E. students, in
Geography and History are recorded in the table below.

Student A B C D E F G H
Geography 80 29 56 56 58 45 67 72
History 78 49 65 50 75 50 60 47

Test, at the 10% level of significance, whether there is evidence of positive correlation
between the percentage mock exam marks in Geography and History.

MMS-O , not significant evidence as 0.4897 < 0.5067


Question 3 (**)
The table below shows the number of Maths teachers x , working in 8 different towns
and the number of burglaries y , committed in a given month in the same 8 towns.

Town A B C D E F G H
x 35 42 21 55 33 29 39 40
y 30 28 21 38 35 27 30 k

a) Use a statistical calculator to find the product moment correlation coefficient
between the number of maths teachers and the number of burglaries, for the
towns A to G.

b) Interpret the value of the product moment correlation coefficient in the context
of this question.

c) Test, at the 5% level of significance, whether there is evidence of positive
correlation between the number of maths teachers and the number of burglaries,
for the towns A to G.

d) Comment on the statement

“… the Maths teachers are likely to be responsible for the burglaries …”

e) Use linear regression to estimate the value of k , for town H.

MMS-P , r ≈ 0.792 , significant evidence as 0.792 > 0.6694 , k ≈ 31


Question 4 (**)
The table below shows the marks obtained by a group of students, in two separate tests.

Student A B C D E F G H
Test 1 28 39 18 30 42 43 33 10
Test 2 12 23 16 16 28 18 24 7

The first test is out of 50 marks while the second test is out of 30 marks.

Let x and y represent the marks obtained in Test 1 and Test 2 , respectively.

a) Use a statistical calculator to find the value of the product moment correlation
coefficient between x and y .

b) Explain how the value of the product moment correlation coefficient between
x and y will be affected if the individual test marks were converted into
percentage marks.

c) Test, at the 1% level of significance, whether there is evidence of positive
correlation between x and y .

A student was absent from the second test but he obtained 30 marks in the first test.

d) Use linear regression to estimate this student’s mark in the second test.

MMS-Q , r ≈ 0.789 , unchanged , inconclusive test as 0.789 ≈ 0.7889 , ≈ 18


Question 5 (**)
The table below shows the maximum daytime temperature, in °C , at a certain city
centre, and the amount of a certain pollutant in mg per litre.

Maximum Temperature 10 12 14 16 18 20 22 24
Amount of Pollutant 513 475 525 530 516 520 507 521

a) Find, using a statistical calculator, the value of the product moment correlation
coefficient for the above data.

b) State, with justification, the value of the product moment correlation coefficient,
if the maximum daily temperatures were to be measured in degrees Fahrenheit.

c) Test, at the 10% level of significance, whether there is evidence of positive
correlation in these bivariate data.

MMS-J , r = 0.320 , unchanged , no significant evidence as 0.320 < 0.5067


Question 6 (**)
The table below shows the daily number of shoplifting incidents in a shopping mall,
for a given seven day week and the number of the security guards employed in each of
these seven days.

Number of Shoplifting Incidents 17 20 23 11 35 32 21
Number of Security Guards Employed 6 6 5 7 4 3 5

a) Find, using a statistical calculator, the value of the product moment correlation
coefficient for these data.

b) Test, at the 1% level of significance, whether there is evidence of correlation in
these bivariate data.

c) Briefly comment on the statement:

“… Increasing the number of security guards will result in a decrease in
the shoplifting incidents …”

MMS-K , r = −0.932 , significant evidence as − 0.932 < − 0.8745


Question 7 (**)
An electrical appliances supplier wishes to investigate the impact of advertising on the
sales of his washing machines.

He records the number of monthly advertisements placed on the local radio station and
the number of washing machines sold.

This is a table of his results.

Number of Advertisements (x) 52 37 66 45 77 27 80 19 47 40
Number of Washing Machines Sold (y) 180 115 171 166 177 99 174 100 143 164

Test, at the 10% level of significance, whether there is evidence of correlation
between x and y , and explain what conclusions the electrical appliances supplier
should make from this value.

MMS-G , significant evidence as 0.817 > 0.5494


Question 8 (**)
The table below shows the number of Maths teachers x , working in 8 different
schools and the number of students y , in each of these 8 schools.

School A B C D E F G H
x 5 9 11 17 12 10 9 8
y 225 247 334 811 382 340 285 k

a) Use a statistical calculator to find the product moment correlation coefficient
between the number of maths teachers and the number of students, for the
schools A to G.

b) Use linear regression to estimate the value of k , for school H.

Justify the reliability of the estimate.

MMS-H , r ≈ 0.913 , k ≈ 252 − 253


Question 9 (***)
The table below shows the amount spent per month by a car dealership on marketing
and advertising m , in £1000 , and the number of cars c sold that month.

m 6 7 8 9 10
c 8 13 11 12 14

a) Use a statistical calculator to find …

i. … the value of the product moment correlation coefficient between m
and c .

ii. … the equation of the regression line between m and c , giving the
answer in the form

c = a + bm ,

where a and b are constants.

b) Use the equation of the regression line to estimate the number of cars that are
expected to be sold in a month where the amount spent on marketing and
advertising is …

i. … £8,800 .

ii. … £20,000 .

Comment further on the reliability of each of these two estimates.

c) Interpret in the context of this question the physical meaning of a and b .

MMS-R , r = 0.755 , c = 2.8 + 1.1m , c8.8 ≈ 12 , c20 ≈ 25
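
A minimal sketch of the working behind these answers, assuming Python is used in place of a statistical calculator. It fits the least squares line to the five (m, c) pairs from the table and flags whether each requested estimate lies inside the observed range of m. The helper name `fit_line` is illustrative only.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (a, b, r) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = s_xy / s_xx              # gradient
    a = mean_y - b * mean_x      # intercept
    r = s_xy / sqrt(s_xx * s_yy) # product moment correlation coefficient
    return a, b, r

m = [6, 7, 8, 9, 10]     # monthly spend, in £1000
c = [8, 13, 11, 12, 14]  # cars sold

a, b, r = fit_line(m, c)
print(f"r = {r:.3f}, c = {a:.1f} + {b:.1f}m")

for spend in (8.8, 20):
    inside = min(m) <= spend <= max(m)
    kind = "interpolation" if inside else "extrapolation"
    print(f"m = {spend}: c = {a + b * spend:.1f} ({kind})")
```

The estimate at m = 8.8 is an interpolation, while m = 20 lies well outside the data, which is the usual basis for the reliability comments the question asks for.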


Question 10 (***)
The table below shows the maximum temperature T °C on five different days and the
corresponding ice cream sales, N , of a certain shop on those days.

T 15 20 25 30 35
N 69 165 172 200 232

a) State, with a reason, which is the explanatory variable in the above described
scenario and state the statistical name of the other variable.

b) Use a statistical calculator to determine …

i. … the value of the product moment correlation coefficient between T
and N .

ii. … the equation of the regression line between N and T , giving the
answer in the form

N = a + bT ,

where a and b are constants.

c) Interpret in the context of this question the physical meaning of a and b .

d) Use the equation of the regression line to estimate the value of N when …

i. … T = 18°C .

ii. … T = 37°C .

iii. … T = 45°C

Comment further on the reliability of each of these estimates.

MMS-L , r = 0.934 , N = 7.22T − 12.9 , N18 ≈ 117 , N37 ≈ 254 , N45 ≈ 312


Question 11 (***)
It is an actual fact that “sleeping with your clothes and shoes on is strongly correlated
with waking up with a headache”.

Evidently the conclusion is that “sleeping with your clothes and shoes on causes a
headache”.

Discuss the validity of the above conclusion indicating how a strong correlation is
possible in the above scenario.

MMS-V , explanation as appropriate


CORRELATION & REGRESSION
Part 2


Question 1 (**)
The table below shows the marks obtained by a group of students, in two separate tests.

Student A B C D E F G H
Test 1 27 38 17 29 41 42 32 9
Test 2 13 24 17 17 29 19 25 8

The first test is out of 50 marks while the second test is out of 30 marks.

Let x and y represent the marks obtained in Test 1 and Test 2 , respectively.

The following summary statistics are given.

$\sum x = 235$ , $\sum x^2 = 7853$ , $\sum y = 152$ , $\sum y^2 = 3214$ , $\sum xy = 4904$ .


a) Find the value of the product moment correlation coefficient between x and y .

b) Explain how the value of the product moment correlation coefficient between
x and y will be affected if the individual test marks were converted into
percentage marks.

FS2-N , r ≈ 0.789 , unchanged
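
A short sketch of how the stated value of r follows from the summary statistics, written in Python purely for illustration. It also checks part b) numerically: converting to percentages multiplies x by 100/50 and y by 100/30, and these scale factors cancel in r. The variable names are illustrative only.

```python
from math import sqrt

n = 8
sum_x, sum_x2 = 235, 7853
sum_y, sum_y2 = 152, 3214
sum_xy = 4904

# Standard summary-statistic formulas
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
r = s_xy / sqrt(s_xx * s_yy)
print(f"Sxx = {s_xx}, Syy = {s_yy}, Sxy = {s_xy}, r = {r:.3f}")

# Percentage marks: x is scaled by 100/50, y by 100/30
kx, ky = 100 / 50, 100 / 30
r_percent = (kx * ky * s_xy) / sqrt(kx ** 2 * s_xx * ky ** 2 * s_yy)
print(f"r for percentage marks = {r_percent:.3f}")
```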


Question 2 (**)
The table below shows the number of Maths teachers x , working in 8 different towns
and the number of burglaries y , committed in a given month in the same 8 towns.

Town A B C D E F G H
x 37 40 21 50 32 27 39 40
y 30 28 20 35 34 27 31 26

a) Calculate the product moment correlation coefficient between the number of
maths teachers and the number of burglaries.

b) Interpret the value of the product moment correlation coefficient in the context
of this question.

c) Comment on the statement

“… the Maths teachers are likely to be responsible for the burglaries …”

FS2-P , r ≈ 0.692


Question 3 (**)
An electrical appliances supplier wishes to investigate the impact of advertising on the
sales of his washing machines.

He records the number of monthly advertisements placed on the local radio station and
the number of washing machines sold.

This is a table of his results.

Number of Advertisements (x) 52 37 66 45 77 27 80 19 47 40
Number of Washing Machines Sold (y) 80 75 81 76 77 49 84 50 63 64

Find, by detailed calculations, the value of the product moment correlation coefficient
between x and y , and explain what conclusions the electrical appliances supplier
should make from this value.

FS2-M , r = 0.820


Question 4 (**+)
An electrical tester wishes to test the accuracy of a voltmeter used in a lab.

He uses a carefully calibrated voltage source and takes readings with the voltmeter he
wishes to be tested.

This is a table of his results.

Actual Voltage (x) 10 20 30 40 50 60 70 80 90 100
Voltmeter Reading (y) 9 19 34 39 54 61 68 80 92 99

a) Show, by detailed calculations, that the product moment correlation coefficient
between x and y is approximately 1.

b) Determine the equation of the regression line between x and y , giving the
answer in the form

y = a + bx ,

where a and b are constants.


Full workings must be shown for this part of the question.

c) Calculate the residual for x = 50 .

FS2-N , y ≈ 0.667 + 0.997 x
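
The answer line gives only the regression equation, so the following Python sketch illustrates one way to check it and to obtain the residual asked for in part c). The residual is taken here as the observed reading minus the value predicted by the line, which is the usual convention; the variable names are illustrative only.

```python
from math import sqrt

x = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # actual voltage
y = [9, 19, 34, 39, 54, 61, 68, 80, 92, 99]    # voltmeter reading
n = len(x)

s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
s_yy = sum(v * v for v in y) - sum(y) ** 2 / n
s_xy = sum(p * q for p, q in zip(x, y)) - sum(x) * sum(y) / n

r = s_xy / sqrt(s_xx * s_yy)
b = s_xy / s_xx
a = sum(y) / n - b * sum(x) / n

# Residual at x = 50: observed reading minus fitted value
residual_50 = y[x.index(50)] - (a + b * 50)
print(f"r = {r:.4f}, y = {a:.3f} + {b:.3f}x, residual at 50 = {residual_50:.2f}")
```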


Question 5 (**+)
The table below shows the marks obtained by a group of students, in two separate tests.

Student A B C D E F G H I J
Test 1 17 11 16 9 12 12 11 4 7 15
Test 2 24 21 24 20 22 18 18 9 15 21

Let x and y represent the marks obtained in Test 1 and Test 2 , respectively.

a) Find the value of S xx , S yy and S xy .

b) Show that the product moment correlation coefficient between x and y is
approximately 0.9 .

c) Determine the equation of the regression line between x and y , giving the
answer in the form

y = a + bx ,

where a and b are constants.

FS2-Q , S xx = 146.4 , S yy = 185.6 , S xy = 148.2 , y = 7.660 + 1.012 x


Question 6 (**+)
The table below shows 10 pairs of bivariate data.

x 10 30 50 60 70 80 90 100 110 140
y 15 8 3 6 11 8 6 2 3 1

a) Determine the value of S xx , S yy and S xy , and hence calculate the value of the
product moment correlation coefficient between x and y .

b) Find the equation of the least squares regression line between x and y , giving
the answer in the form

y = a + bx ,

where a and b are constants.

FS2-A , S xx = 13440 , S yy = 172.1 , S xy = −1142 , r = −0.751 ,

y = 12.6 − 0.0850 x


Question 7 (**+)
The table below shows the heights and weights of a random sample of 10 pupils,
where the heights are given to the nearest cm and the weights to the nearest 5 kg.

Pupil A B C D E F G H I J
Height (cm) 148 164 156 172 147 184 162 155 182 165
Weight (kg) 40 60 55 75 40 80 65 50 80 70

Let x and y represent the respective heights and weights of these pupils and r the
product moment correlation coefficient between x and y .

a) Determine the value of S xx , S yy and S xy , and hence calculate the value of r ,
correct to three decimal places.

b) Interpret in context the value of r .

c) State the value of r if the heights were measured in metres instead of cm.

d) Determine the equation of the regression line between x and y , giving the
answer in the form

y = bx + a ,
where a and b are constants.

FS2-G , S xx = 1480.5 , S yy = 2052.5 , S xy = 1677.5 , r ≈ 0.962

r ≈ 0.962 regardless of units , y = 1.13 x − 124


Question 8 (**+)
The table below shows the midday daily temperature x , in °C , and the number of cups
of tea y , sold in a small café.

x 20 25 26 27 29 29 32 36
y 100 80 72 74 65 69 63 60

a) Find the value of S xx , S yy and S xy , and hence calculate the product moment
correlation coefficient between x and y .

b) Determine the equation of the regression line between x and y , giving the
answer in the form

y = a + bx ,

where a and b are constants.

c) Use the equation of the regression line to estimate the value of y when …

i. … x = 40 .

ii. … x = 50 .

Comment further on the reliability of these two estimates.

FS2-D , S xx = 160 , S yy = 1128.875 , S xy = −392 , r = −0.922 ,

y = 141.475 − 2.45 x , y40 ≈ 43 , y50 ≈ 19


Question 9 (***)
The table below shows the average midday temperature x of a seaside town, in °C ,
and the number of people y , that used a certain restaurant in that town.

x 17 20 25 29 27 21 20 24
y 40 42 42 43 44 39 41 45

a) Find the value of S xx , S yy and S xy , and hence calculate the product moment
correlation coefficient between x and y .

b) State the value of the product moment correlation coefficient between x and y
if the temperature was measured in degrees Fahrenheit instead of Centigrade.

c) Determine the equation of the regression line between x and y , giving the
answer in the form

y = a + bx ,

where a and b are constants.

d) State, with a reason, which is the explanatory variable in the above described
scenario and state the statistical name of the other variable.

e) Interpret in the context of this question the physical meaning of b .

f) Use the equation of the regression line to estimate the value of y when …

i. … x = 16 .

ii. … x = 35 .

Comment further on the reliability of each of these two estimates.

FS2-L , r ≈ 0.670 regardless of units , y = 34.4 + 0.331x , y16 ≈ 40 , y35 ≈ 46


Question 10 (***)
The table below shows the maximum temperature T °C on five different days and the
corresponding ice cream sales, N , of a certain shop on those days.

T 15 20 25 30 35
N 79 145 182 255 302

a) Find the value of STT , S NN and STN , and hence, determine the value of the
product moment correlation coefficient between T and N .

b) State, with a reason, which is the explanatory variable in the above described
scenario and state the statistical name of the other variable.

c) Determine the equation of the regression line between N and T , giving the
answer in the form

N = a + bT ,

where a and b are constants.

d) Interpret in the context of this question the physical meaning of b .

e) Use the equation of the regression line to estimate the value of N when …

i. … T = 18°C .

ii. … T = 37°C .

iii. … T = 45°C

Comment further on the reliability of each of these estimates.

FS2-H , STT = 250 , S NN = 31145.2 , STN = 2780 , r = 0.996 ,
N = 11.12T − 85.4 , N18 ≈ 115 , N37 ≈ 326 , N45 ≈ 415


Question 11 (***+)
The table below shows the marks obtained by a group of students, in two separate tests.

Student A B C D E F G H
Test 1 35 42 21 55 33 29 39 40
Test 2 30 28 21 38 35 27 30 k

Use linear regression for the test marks of the students A – G , to estimate the value of
k , for student H.

Detailed workings are expected.

FS2-N , k ≈ 31


Question 12 (***+)
The table below shows the amount spent per month by a car dealership on marketing
and advertising m , in £1000 , and the number of cars c sold that month.

m 7 8 9 10 11
c 7 12 10 11 13

a) Find the value of the product moment correlation coefficient between m and c .

b) Determine the equation of the regression line between m and c , giving the
answer in the form

c = a + bm ,

where a and b are constants.

c) Use the equation of the regression line to estimate the number of cars that are
expected to be sold in a month where the amount spent on marketing and
advertising is …

i. … £8,800 .

ii. … £20,000 .

Comment further on the reliability of each of these two estimates.

d) Interpret in the context of this question the physical meaning of a and b .

FS2-U , r = 0.755 , c = 0.7 + 1.1m , c8.8 ≈ 10 , c20 ≈ 23


Question 13 (***+)
The table below shows the tomato yield obtained by a group of ten plants that were
given different amounts of fertilizer and allowed to grow in otherwise identical
conditions.

Plant A B C D E F G H I J
Amount of Fertilizer (grams) 0 10 20 30 40 50 60 70 80 90
Tomato Yield (kilograms) 1.2 1.9 2.1 2.4 2.5 2.7 3.0 k 3.2 3.1

a) Find an equation of the line of least squares using the plants A to G, I and J ,
and hence estimate the value of k , for the plant H.
Detailed workings are expected in this part.

b) Interpret in context the gradient of the line of least squares.

c) Calculate the residual of plant J.

d) The residual of the plant A is −0.42 . Find the …

i. … sum of the residuals for the plants B to I.
ii. … mean of the residuals for the plants B to I.

Another plant N, not included in the table, was given 200 grams of fertilizer.

e) Discuss briefly, mathematically and in context, whether it is appropriate to use
the line of least squares to predict its yield.

FS2-Y , Y = 0.0198 F + 1.618 , k = 3.0 , R J = −0.3 , 0.72 , 0.09
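
A possible sketch of the working, in Python, fitting the line to the nine plants with known yield. It relies on the standard fact that the residuals of a least squares line sum to zero over the fitted points, so the residuals of the fitted plants other than A and J must sum to minus the sum of those two. The list and variable names are illustrative only.

```python
# Plants A to G, I and J (plant H is excluded because its yield k is unknown)
fert = [0, 10, 20, 30, 40, 50, 60, 80, 90]           # grams of fertilizer
crop = [1.2, 1.9, 2.1, 2.4, 2.5, 2.7, 3.0, 3.2, 3.1]  # yield in kilograms
n = len(fert)

mean_f = sum(fert) / n
mean_y = sum(crop) / n
s_ff = sum((f - mean_f) ** 2 for f in fert)
s_fy = sum((f - mean_f) * (y - mean_y) for f, y in zip(fert, crop))

b = s_fy / s_ff          # extra kilograms of yield per extra gram
a = mean_y - b * mean_f
k = a + b * 70           # estimate for plant H

residuals = [y - (a + b * f) for f, y in zip(fert, crop)]
res_A, res_J = residuals[0], residuals[-1]
sum_others = sum(residuals[1:-1])  # fitted plants between A and J

print(f"Y = {b:.4f} F + {a:.3f}, k = {k:.1f}")
print(f"residual A = {res_A:.2f}, residual J = {res_J:.2f}")
print(f"sum of remaining residuals = {sum_others:.2f}")
```

Plant N at 200 grams lies far beyond the largest amount in the table, so any prediction for it from this line would be an extrapolation.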


Question 14 (***+)
The table below shows a set of bivariate data involving two variables x and y .

x 1003 1006 1012 1015 1021
y 0.0017 0.0027 0.0045 0.0056 0.0077

a) Use the coding equations

$X = \dfrac{x - 1012}{3}$ and $Y = 10000y - 27$

to find the value of S XX , S YY and S XY .

b) Show that the product moment correlation coefficient between X and Y is
approximately 0.9993 .

c) State with justification the value of the product moment correlation coefficient
between x and y .

d) Determine the equation of the regression line between x and y , giving the
answer in the form

y = a + bx ,

where a and b are constants.

FS2-W , S XX = 22.8 , SYY = 2251.2 , S XY = 226.4 , r = 0.9993 ,


y = 0.00033 x − 0.33
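
An illustrative Python sketch of parts b) to d): it computes r for both the coded and the raw data to show that linear coding with positive scale factors leaves r unchanged, then converts the coded regression line back to the original variables. The helper `s_values` and the variable names are illustrative only.

```python
from math import sqrt

def s_values(us, vs):
    """Return (S_uu, S_vv, S_uv) about the means."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    s_uu = sum((u - mu) ** 2 for u in us)
    s_vv = sum((v - mv) ** 2 for v in vs)
    s_uv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    return s_uu, s_vv, s_uv

x = [1003, 1006, 1012, 1015, 1021]
y = [0.0017, 0.0027, 0.0045, 0.0056, 0.0077]

# Coding from the question
X = [(v - 1012) / 3 for v in x]
Y = [10000 * v - 27 for v in y]

S_XX, S_YY, S_XY = s_values(X, Y)
r_coded = S_XY / sqrt(S_XX * S_YY)

s_xx, s_yy, s_xy = s_values(x, y)
r_raw = s_xy / sqrt(s_xx * s_yy)

# Coded line Y = A + B*X, then substitute the coding to recover y = a + b*x
B = S_XY / S_XX
A = sum(Y) / len(Y) - B * sum(X) / len(X)
b = B / (3 * 10000)
a = (A + 27 - B * 1012 / 3) / 10000

print(f"S_XX = {S_XX:.1f}, S_YY = {S_YY:.1f}, S_XY = {S_XY:.1f}")
print(f"r (coded) = {r_coded:.4f}, r (raw) = {r_raw:.4f}")
print(f"y = {b:.5f} x {a:+.2f}")
```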


Question 15 (***+)
The table below shows a set of bivariate data involving two variables t and v .

t 151 154 157 163 169
v 8800 7800 7400 6500 3100

a) Use the coding equations

$x = \dfrac{t - 157}{3}$ and $y = \dfrac{v}{100}$

to find the value of S xx , S yy and S xy .

b) Show that the product moment correlation coefficient between x and y is
approximately −0.958 .

c) State with justification the value of the product moment correlation coefficient
between t and v .

d) Determine the equation of the regression line between t and v , giving the
answer in the form

v = A + Bt ,

where A and B are constants.

FS2-R , S xx = 23.2 , S yy = 1910.8 , S xy = −201.6 , r = −0.958 , v = 52717 − 290t


Question 16 (***+)
Clinical trials are carried out to determine the effect of a stimulant.

Ten volunteers were given different amounts of the stimulant, X milligrams, and the
amount of their nightly sleep, Y hours, was recorded on the following night.

The following summary statistics were obtained.

$\sum X = 900$ , $\sum Y = 78.4$ , $\sum X^2 = 114\,000$ , $\sum Y^2 = 616.18$ , $\sum XY = 6834$


The following claims are made.

• Claim 1
For every additional 60 milligrams of the stimulant, the nightly sleep typically
reduces by 40 minutes.

• Claim 2
The expected nightly sleep would have been 8 hours if no stimulant was taken.

Comment briefly on these two claims, fully supported by appropriate calculations.

FS2-P , claims not justified supported by the regression line equation
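
One way the supporting calculation might look, sketched in Python. It finds the regression line of Y on X from the summary statistics, then expresses the gradient as minutes of sleep per 60 milligrams and reads off the intercept for comparison with the two claims. The variable names are illustrative only.

```python
n = 10
sum_x, sum_y = 900, 78.4
sum_x2, sum_xy = 114_000, 6834

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                  # change in hours of sleep per milligram
a = sum_y / n - b * sum_x / n    # fitted sleep (hours) at X = 0

minutes_per_60mg = b * 60 * 60   # 60 mg, then hours converted to minutes
print(f"Y = {a:.3f} {b:+.5f} X")
print(f"change per 60 mg = {minutes_per_60mg:.1f} minutes")
print(f"fitted sleep with no stimulant = {a:.2f} hours")
```

Under these figures the fitted reduction per 60 milligrams is noticeably smaller than the 40 minutes in Claim 1, and the intercept differs from the 8 hours in Claim 2. The intercept is also only meaningful if X = 0 is within, or close to, the range of doses actually given, which the summary statistics alone do not show.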


Question 17 (***+)
Dolphins are thought to communicate with each other by high pitch noises they
produce. The frequency, v kHz , of the noise made by a dolphin is recorded at 15
different sea depths, d m . These data are summarized below.

$\sum d = 385.5$ , $\sum d^2 = 11543.25$ , $\sum v = 22.5$ , $\sum v^2 = 38.25$ , $\sum dv = 650.25$


a) State, with a reason, which is the explanatory variable in the above described
scenario and state the statistical name of the other variable.

b) Find the value of S dd , Svv and Sdv for this data.

c) Calculate the product moment correlation coefficient between d and v .

d) Interpret the value of the product moment correlation coefficient in the context
of this question.

e) Give a reason to support the fitting of a regression line of the form

v = a + bd ,

where a and b are constants.

f) Determine the value of a and b , correct to three significant figures.

g) Interpret in the context of this question the physical meaning of a and b .

FS2-O , Sdd = 1635.9 , Svv = 4.5 , Sdv = 72 , r ≈ 0.839 , a ≈ 0.369 , b ≈ 0.0440


Question 18 (****)
The mean and variance of 10 independent observations of a random variable x , are
66.5 and 85.8 , respectively.

Based on a random sample of 10 independent observations of another variable y ,
the regression line of y on x is

y = 0.0949 x − 0.0130 .

Determine the product moment correlation coefficient between x and y , assuming
further that S yy = 8.1 .

FS2-X , r = 0.977
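
A brief Python sketch of one route to this answer. It assumes the quoted variance is S xx divided by n (rather than n − 1), which is the reading that reproduces the stated value of r, and uses the fact that the gradient of the regression line of y on x equals S xy / S xx. The variable names are illustrative only.

```python
from math import sqrt

n = 10
variance_x = 85.8
b = 0.0949      # gradient of the regression line of y on x
s_yy = 8.1

s_xx = n * variance_x   # assumption: variance = S_xx / n
s_xy = b * s_xx         # since b = S_xy / S_xx
r = s_xy / sqrt(s_xx * s_yy)
print(f"Sxx = {s_xx:.0f}, Sxy = {s_xy:.4f}, r = {r:.3f}")
```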


Question 19 (****)
A gym opened on the first day of January of a given year.

The months of that year were numbered as 1, 2, 3, … , 12.

The number of new members, N , at the end of each month m , was recorded for those
12 months.

The regression line of N on m was found to be

N = 34 + 35m .

Use the regression line to find the total number of members which joined that gym
during that year.

No credit will be given for adding 69, 104, 139, … , 419, 454.

FS2-Z , 3138
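
A small Python sketch of the idea that appears to be intended here: a least squares regression line always passes through the point of means, so the mean of N equals 34 + 35 × (mean of m), and the yearly total is 12 times that mean. The variable names are illustrative only.

```python
a, b = 34, 35
months = range(1, 13)

mean_m = sum(months) / len(months)   # mean month number
mean_N = a + b * mean_m              # the line passes through (mean m, mean N)
total = len(months) * mean_N         # sum of N = n * mean of N
print(f"mean m = {mean_m}, mean N = {mean_N}, total = {total:.0f}")
```

The total obtained this way is the true total of the recorded data, not just the total of the fitted values, because it depends only on the mean of N.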


Question 20 (****+)
Some summary statistics for a set of bivariate data, based on two variables x and y , are
given below.

$n = 10$ , $\bar{x} = 15$ , $\bar{y} = 48$ , $\sigma_x^2 = 186$ , $\sigma_y^2 = 172$ , $\sum xy = 8850$ .

a) Find the value of each of the following sums.

$\sum x$ , $\sum y$ , $\sum x^2$ , $\sum y^2$ .

b) Calculate the product moment correlation coefficient between x and y .

c) Describe briefly the effect on the product moment correlation coefficient if
another piece of data, x = 10 with y = 70 , is added to the other 10 bivariate
observations.

FS2-V , $\sum x = 150$ , $\sum y = 480$ , $\sum x^2 = 4110$ , $\sum y^2 = 24760$ , r ≈ 0.922


Question 21 (****+)
On a certain mountain climb, a scientist recorded the temperature, T °C , at ten
different heights, H m above sea level, and some of his results are summarized below.

$\sum T = 124$ , $\sum T^2 = 2078$ , $\sum H = 27\,500$ , $\sum HT = 235\,500$


If the product moment correlation coefficient for this data is −0.98 , determine an
estimate for the temperature at sea level on the day of the climb.

FS2-T , ≈ 25.9 °C
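
A possible line of working, sketched in Python. Since the sum of the squared heights is not given, S HH is recovered from the stated correlation coefficient, and the regression line of T on H is then evaluated at H = 0. The variable names are illustrative only.

```python
n = 10
sum_T, sum_T2 = 124, 2078
sum_H, sum_HT = 27_500, 235_500
r = -0.98

s_TT = sum_T2 - sum_T ** 2 / n
s_HT = sum_HT - sum_H * sum_T / n

# r = S_HT / sqrt(S_HH * S_TT), rearranged for the missing S_HH
s_HH = s_HT ** 2 / (r ** 2 * s_TT)

b = s_HT / s_HH                  # change in temperature per metre climbed
a = sum_T / n - b * sum_H / n    # fitted temperature at H = 0 (sea level)
print(f"T = {a:.1f} {b:+.5f} H")
```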


Question 22 (****+)
The number of letters x in people’s first names and number of letters y in people’s
surnames is researched.

The summary data of the number of letters in the first names and the surnames of a
random sample of 20 individuals is shown below.

$\sum x = 125$ , $\sum x^2 = 796$ , $\sum y = 140$ , $\sum y^2 = 1032$ , $\sum xy = 882$


a) Calculate the product moment correlation coefficient between x and y .

The name “Richard Edwards” is added to the sample, making the total number of
people in the sample, 21 .

b) Without a direct recalculation , ...

i. ... show that S xx of the 21 first names is likely to have a different value
to the original value of S xx of the original 20 first names.

ii. ... determine the effect of adding “Richard Edwards” to S yy and S xy .

c) Given further that adding “Richard Edwards” increases the value of S xx ,
explain with justification whether the product moment correlation coefficient
between x and y , increases or decreases.

FS2-S , r ≈ 0.253


Question 23 (****+)
Two variables, x and y , have the following regression equations, based on 5
observations.

y on x : y = 18.5 + 0.1x

x on y : x = 16.6 + 0.4 y

The following summary statistics are also given.

$\sum x^2 = 3215$ , $\sum y^2 = 2227.5$ , $\sum xy = 2634$


Show that the product moment correlation coefficient between x and y is 0.2 .

proof


SPEARMAN'S RANK


Question 1 (**)
Nine gymnasts performed in a gymnastics competition.

Their names were Arnold (A), Brian (B), Christian (C), Damon (D), Eli (E), Fabian (F),
Gordon (G), Harry (H) and Ian (I).

Rank 1 2 3 4 5 6 7 8 9
Judge 1 D C E B F A I H G
Judge 2 D E F C I B A G H

a) Calculate Spearman's rank correlation coefficient for this data.

b) Test whether or not the judges are generally in agreement, at the 1% level of
significance, stating your hypotheses clearly.

FS2-M , $r_s = \frac{5}{6} \approx 0.833$ , evidence of agreement, 0.8333 > 0.7833


Question 2 (**)
The data in the table below shows the time, in seconds, for the fastest qualifying lap for
8 different Formula One racing drivers, and their finishing order in the actual race.

Fastest Qualifying Lap 49.12 49.34 49.07 48.55 49.40 49.27 49.77 48.87
Finishing Position 5 6 1 3 7 4 8 2

a) Calculate Spearman's rank correlation coefficient for this data.

b) Test whether or not there is any association between the fastest qualifying lap
time and the finishing position for Formula One racing drivers, at the 5% level
of significance, stating your hypotheses clearly.

FS2-O , $r_s = \frac{37}{42} \approx 0.8810$ , evidence of association, 0.8810 > 0.7381
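
A minimal Python sketch of the calculation for this question. It ranks the lap times from fastest to slowest, compares them with the finishing positions, and applies the formula $r_s = 1 - \frac{6\sum d^2}{n(n^2-1)}$, which is valid here because there are no tied values. The helper `ranks` is illustrative only, and the critical value is simply the one quoted in the answer line.

```python
def ranks(values):
    """Rank from smallest (1) to largest; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

lap = [49.12, 49.34, 49.07, 48.55, 49.40, 49.27, 49.77, 48.87]
finish = [5, 6, 1, 3, 7, 4, 8, 2]   # already ranks
n = len(lap)

lap_ranks = ranks(lap)
d2 = sum((p - q) ** 2 for p, q in zip(lap_ranks, finish))
r_s = 1 - 6 * d2 / (n * (n ** 2 - 1))

critical = 0.7381   # quoted in the answer line above
print(f"lap ranks = {lap_ranks}, sum d^2 = {d2}, r_s = {r_s:.4f}")
print("significant" if abs(r_s) > critical else "not significant")
```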


Question 3 (**)
The table below shows the mileages travelled by eleven salesmen and the commission
they got paid during a given month.

Name Monthly mileage Monthly commission


Alan 734 £800
Brian 650 £660
Christian 668 £620
Dominic 709 £610
Ethan 437 £450
Finlay 551 £560
Graham 580 £510
Hamish 387 £520
Ian 450 £460
James 298 £430
Kevin 325 £390

a) Calculate Spearman's rank correlation coefficient for this data.

b) Test whether or not there is evidence of positive correlation between the
mileages travelled and the amount of commission received, at the 1% level of
significance, stating your hypotheses clearly.

FS2-J , $r_s = \frac{97}{110} \approx 0.8818$ , positive correlation, 0.8818 > 0.7091


Question 4 (**)
The actual ages, in complete years, of seven cats are shown below.

Cat Name Riri Loulou Ginge Puss Ollie Rex Mog
Age in years 3 4 18 21 5 11 9

These seven cats were seen by a vet, during a day’s surgery, and the vet was asked to
order them according to their age by examination only.

He ordered the cats’ ages, older first, as follows.

Ginge, Puss, Mog, Rex, Loulou, Riri, Ollie.

a) Calculate Spearman's rank correlation coefficient between the actual age of the
cats and the vet’s order.

b) Test whether or not the vet has the ability to identify the age of cats, at the 1%
level of significance, stating your hypotheses clearly.

FS2-Q , $r_s = \frac{23}{28} \approx 0.8214$ , no evidence of association, 0.8214 < 0.8929


Question 5 (**+)
Six ordered pairs ( x, y ) , of bivariate data, are shown in the following set of axes.

[Figure: scatter diagram of the six data points on x and y axes with origin O.]

Determine the Spearman's rank correlation coefficient for this data.

FS2-R , $r_s = \frac{31}{35} \approx 0.886$


Question 6 (**+)
The table below shows, for a group of students in a recent mock exam, the number of
marks lost, y , and the corresponding number of papers, x , they practiced leading up
to that exam.

Student A B C D E F G H I J
Number of Papers (x) 17 39 24 26 11 22 25 10 8 6
Number of Marks Lost (y) 12 5 11 14 10 9 8 15 19 17

a) Find the value of S xx , S yy and S xy , and hence determine the value of the
product moment correlation coefficient between x and y .

b) Comment briefly on the result of part (a).

c) Obtain the Spearman's rank correlation coefficient between x and y .

d) Test, at the 1% level of significance, whether there is evidence of negative
association between the ranks of x and y .

FS2-B , S xx = 957.6 , S yy = 166 , S xy = −317 , r = −0.795 , rs = −0.745
