Notes On Regression For ITM
Example: Y = 1 + 2 X
For given values of X and Y, many straight lines can be drawn through the plotted points, but
only one line lies closest to all of them. This is called the best fit line, Y = a + b X.
The values of a and b are found by the method of least squares. In this method, we
minimise the value of ∑e², where e is the difference between the Y coordinate of a
plotted point and the Y coordinate of the point on the straight line at the same X.
These values of a and b are substituted in the equation Y = a + b X, and this equation can be
used to forecast the value of Y for a given value of X.
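A minimal Python sketch of the method of least squares, assuming the deviation formulas b = ∑xy / ∑x² and a = Y bar – b X bar that the worked solutions in these notes use. The function name fit_line and the five X values are illustrative and not from the notes; the points are generated from the example line Y = 1 + 2 X, so every e is zero and the fit should recover a = 1 and b = 2.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line Y = a + b X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Deviations from the means: x = X - X bar, y = Y - Y bar
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    b = sum_xy / sum_x2      # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


# Hypothetical X values; Y is taken from the example line Y = 1 + 2 X
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x for x in xs]
a, b = fit_line(xs, ys)
print(a, b)  # 1.0 2.0
```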
Q1. For the following data, find the simple linear regression equation of Y on X and forecast
Y when X = 20.
X              :   10   12   15   23   20
Y              :   14   17   23   25   21
x = X – X bar  :   -6   -4   -1    7    4
y = Y – Y bar  :   -6   -3    3    5    1
xy             :   36   12   -3   35    4
x²             :   36   16    1   49   16
y²             :   36    9    9   25    1
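The table above stops at the deviation columns. As a sketch of how the remaining steps could be carried out, the Python below applies b = ∑xy / ∑x² and a = Y bar – b X bar to the Q1 data and then forecasts Y at X = 20. Variable names are illustrative.

```python
X = [10, 12, 15, 23, 20]
Y = [14, 17, 23, 25, 21]

n = len(X)
x_bar = sum(X) / n                   # 16
y_bar = sum(Y) / n                   # 20

x = [v - x_bar for v in X]           # x = X - X bar
y = [v - y_bar for v in Y]           # y = Y - Y bar

sum_xy = sum(p * q for p, q in zip(x, y))   # 84
sum_x2 = sum(p * p for p in x)              # 118

b = sum_xy / sum_x2
a = y_bar - b * x_bar
forecast = a + b * 20                # Y when X = 20

print(round(b, 3), round(a, 3), round(forecast, 2))  # 0.712 8.61 22.85
```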
Q5. The following are the average prices of a particular stock and the values of the Stock Exchange
index for 6 years:
Solution:
  X     Y   x = X – X bar   y = Y – Y bar      xy      x²      y²    Y cap        e       e²
245   307            -118             -24    2832   13924     576   322.74   -15.74   247.75
255   322            -108              -9     972   11664      81   323.44    -1.44     2.07
240   337            -123               6    -738   15129      36   322.39    14.61   213.45
390   310              27             -21    -567     729     441   332.89   -22.89   523.95
655   350             292              19    5548   85264     361   351.44    -1.44     2.07
393   360              30              29     870     900     841    333.1     26.9   723.61
X bar = 2178/6 = 363, Y bar = 1986/6 = 331
b = ∑xy / ∑x² = 8917 / 127610 = 0.07, a = Y bar – b X bar = 331 – (0.07)(363) = 305.59
The regression equation is therefore Y = 305.59 + 0.07 X.
r = ∑xy / √(∑x² × ∑y²) = 8917 / √(127610 × 2336) = 8917 / (357.22 × 48.33) = 0.516
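The Y cap and e columns can be reproduced with a short Python sketch. It assumes, as the hand calculation does, that b is rounded to two decimals before a is computed; the variable names are illustrative.

```python
from math import sqrt

X = [245, 255, 240, 390, 655, 393]
Y = [307, 322, 337, 310, 350, 360]

n = len(X)
x_bar = sum(X) / n                   # 363
y_bar = sum(Y) / n                   # 331

x = [v - x_bar for v in X]
y = [v - y_bar for v in Y]
sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)
sum_y2 = sum(q * q for q in y)

b = round(sum_xy / sum_x2, 2)        # rounded to two decimals, as in the notes
a = y_bar - b * x_bar
r = sum_xy / sqrt(sum_x2 * sum_y2)   # Karl Pearson's coefficient

y_cap = [a + b * v for v in X]                   # fitted values
e = [obs - fit for obs, fit in zip(Y, y_cap)]    # e = Y - Y cap

for row in zip(X, Y, y_cap, e):
    print(row[0], row[1], round(row[2], 2), round(row[3], 2))
print("r =", round(r, 3))
```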
Q6. Find Karl Pearson’s coefficient of correlation and the equation of the simple linear
regression line for the following data.
Age X             :   56   42   36   47   49   42   60   68
Blood pressure Y  :  147  125  118  128  145  140  155  162
Solution:
  X     Y   x = X – X bar   y = Y – Y bar      xy      x²      y²
 56   147               6               7      42      36      49
 42   125              -8             -15     120      64     225
 36   118             -14             -22     308     196     484
 47   128              -3             -12      36       9     144
 49   145              -1               5      -5       1      25
 42   140              -8               0       0      64       0
 60   155              10              15     150     100     225
 68   162              18              22     396     324     484
X bar = 400/8 = 50, Y bar = 1120/8 = 140
b = ∑xy / ∑x² = 1047 / 794 = 1.319, a = Y bar – b X bar = 140 – (1.319)(50) = 74.05
r = ∑xy / √(∑x² × ∑y²) = 1047 / √(794 × 1636) = 1047 / (28.18 × 40.45) = 0.919
Q7. A research project was undertaken to determine if there is a relationship between the
years of experience on the job (X) and efficiency rating of employees (Y). The objective of
the study was to predict the efficiency rating of the employee. The sample results are as
follows:
Solution:
  X     Y   x = X – X bar   y = Y – Y bar      xy      x²      y²
  1     6              -6               2     -12      36       4
 20     5              13               1      13     169       1
  6     3              -1              -1       1       1       1
  8     5               1               1       1       1       1
  2     2              -5              -2      10      25       4
  1     2              -6              -2      12      36       4
 14     4               7               0       0      49       0
  8     5               1               1       1       1       1
  4     4              -3               0       0       9       0
  6     4              -1               0       0       1       0
If the intensity was observed to be 14.85, what is the concentration of quinine (Y) likely to be in
the solution?
Solution:
  X     Y   x = X – X bar   y = Y – Y bar      xy      x²      y²
  0     0              -9            -0.2     1.8      81    0.04
r = ∑xy / √(∑x² × ∑y²) = 4.25 / √(182.66 × 0.1) = 4.25 / √18.266 = 4.25 / 4.27 = 0.995
This means that there is a very high positive correlation between X and Y.
Q9. The manufacturers of a particular brand of chocolate were interested in examining the
relationship between the sales of chocolates and shelf space allocated to that brand of
chocolate by various stores. Data from 10 stores are as follows:
Sales (Rs in thousands) Y  :   25   15   28   30   17   16   12   21   19   27
Shelf space (sq ft) X      :    5  3.2  5.4  6.1  4.3  3.1  2.6  6.4  4.9    6
Determine the regression equation to predict sales, using shelf space as the independent variable. Also
find Karl Pearson’s correlation coefficient between X and Y.
Solution:
  X     Y   x = X – X bar   y = Y – Y bar      xy      x²      y²
  5    25             0.3               4     1.2    0.09      16
3.2    15            -1.5              -6       9    2.25      36
5.4    28             0.7               7     4.9    0.49      49
6.1    30             1.4               9    12.6    1.96      81
4.3    17            -0.4              -4     1.6    0.16      16
3.1    16            -1.6              -5       8    2.56      25
2.6    12            -2.1              -9    18.9    4.41      81
6.4    21             1.7               0       0    2.89       0
4.9    19             0.2              -2    -0.4    0.04       4
  6    27             1.3               6     7.8    1.69      36
Y bar = 210/10 = 21, X bar = 47/10 = 4.7
r = ∑xy / √(∑x² × ∑y²) = 63.6 / √(16.54 × 344) = 63.6 / √5689.76 = 63.6 / 75.43 = 0.843
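The question also asks for the regression equation, which follows from the same totals. A hedged sketch in Python, taking ∑xy = 63.6, ∑x² = 16.54, ∑y² = 344 and the two means from the working above; predict_sales is a hypothetical helper name, and the 5 sq ft input in the last line is only an illustration.

```python
from math import sqrt

# Column totals and means taken from the worked table above
sum_xy, sum_x2, sum_y2 = 63.6, 16.54, 344
x_bar, y_bar = 4.7, 21

b = sum_xy / sum_x2                  # slope of Y on X
a = y_bar - b * x_bar                # intercept
r = sum_xy / sqrt(sum_x2 * sum_y2)   # Karl Pearson's coefficient


def predict_sales(shelf_space_sq_ft):
    """Predicted sales (Rs in thousands) for a given shelf space in sq ft."""
    return a + b * shelf_space_sq_ft


print(f"Y = {a:.2f} + {b:.3f} X, r = {r:.3f}")  # Y = 2.93 + 3.845 X, r = 0.843
print(round(predict_sales(5), 2))               # 22.15
```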
The data below show the profit (in Rs. ’000), sales (in Rs. lakhs) and advertising
expenditure (in Rs. ’00). Find the multiple regression equation of profit on sales and
advertising expenditure.
Sales (X1)   Advertising expenditure (X2)   Profit (Y)     X1²    X1X2     X2²     X1Y     X2Y
        24                             16           10     576     384     256     240     160
        35                             17           11    1225     595     289     385     187
        38                             18           12    1444     684     324     456     216
        41                             19           13    1681     779     361     533     247
        42                             20           14    1764     840     400     588     280
∑X1 = 180, ∑X2 = 90, ∑Y = 60, ∑X1² = 6690, ∑X1X2 = 3282, ∑X2² = 1630, ∑X1Y = 2202,
∑X2Y = 1090, n = 5
The normal equations are:
60 = 5a + 180 b1 + 90 b2
2202 = 180a + 6690 b1 + 3282 b2
1090 = 90a + 3282 b1 + 1630 b2
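Solving the above equations gives the values of a, b1 and b2, which we substitute in the
equation
Y = a + b1X1 + b2X2
For the totals above, the solution is a = -6, b1 = 0 and b2 = 1, so
Y = -6 + 0 X1 + 1 X2, that is, Y = X2 - 6
(each profit value in the table is exactly 6 less than the corresponding advertising figure).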
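A sketch, in Python, of solving the least-squares normal equations for a, b1 and b2 from the column totals above. It assumes the standard normal equations for two predictors (∑Y = na + b1∑X1 + b2∑X2, and the same relation multiplied through by X1 and by X2) and uses Cramer's rule with exact fractions; the helper names det3 and replace_col are illustrative.

```python
from fractions import Fraction

# Totals from the table above
n = 5
s1, s2, sy = 180, 90, 60             # sums of X1, X2, Y
s11, s12, s22 = 6690, 3282, 1630     # sums of X1^2, X1*X2, X2^2
s1y, s2y = 2202, 1090                # sums of X1*Y, X2*Y

# Normal equations in matrix form: A * [a, b1, b2] = rhs
A = [[n, s1, s2],
     [s1, s11, s12],
     [s2, s12, s22]]
rhs = [sy, s1y, s2y]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def replace_col(m, j, col):
    """Copy of m with column j replaced by col (the Cramer's rule step)."""
    return [[col[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]


d = det3(A)  # must be non-zero for a unique solution
a, b1, b2 = (Fraction(det3(replace_col(A, j, rhs)), d) for j in range(3))
print(a, b1, b2)
```

Exact fractions avoid the rounding drift that floating-point elimination can introduce when the two predictors are strongly correlated, as X1 and X2 are here.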