
Correlation and Regression...

7 Correlation and Regression
(Introduction to Engineering Mathematics)

7.1 INTRODUCTION
A relationship may exist between two series. For example, suppose two series give the heights and weights of a group of persons. It may be observed that weight increases with height, so that tall people are generally heavier than short people. We also know that the area $A$ of a circle of radius $r$ is given by $A = \pi r^2$, so a circle of larger radius always has a larger area than a circle of smaller radius.

7.2 BIVARIATE DISTRIBUTION
There are two types of distributions.
(1) Univariate distribution. A distribution in which there is only one variable, such as the heights of the students of a class.
(2) Bivariate distribution. A distribution involving two variables, such as the heights and weights of the students of a class.

7.3 COVARIANCE
Let the corresponding values of two variables X and Y be given by the ordered pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Then the covariance between X and Y is denoted by cov (X, Y) and is defined as
$$\operatorname{cov}(X, Y) = \frac{1}{n}\left[(x_1 - \bar{x})(y_1 - \bar{y}) + \cdots + (x_n - \bar{x})(y_n - \bar{y})\right] = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$
or, equivalently,
$$\operatorname{cov}(X, Y) = E(XY) - E(X)\,E(Y),$$
where E(X), E(Y) are the corresponding means.

Working Rule
Step I. Calculate the sums $\sum x_i$ and $\sum y_i$.
Step II. Calculate the sum $\sum x_i y_i$ of the products of $x_i$ and $y_i$.
Step III. Divide the values obtained in Steps I and II by n to get $\frac{\sum x_i}{n}$, $\frac{\sum y_i}{n}$ and $\frac{\sum x_i y_i}{n}$.
Step IV. Obtain the difference $\frac{\sum x_i y_i}{n} - \frac{\sum x_i}{n}\cdot\frac{\sum y_i}{n}$ to get cov (X, Y).

Example 1. Calculate the covariance of the following pairs of observations of two variates:
(1, 4), (2, 2), (3, 4), (4, 8), (5, 9), (6, 12)
Solution.
$\sum x_i = 1 + 2 + 3 + 4 + 5 + 6 = 21$
$\sum y_i = 4 + 2 + 4 + 8 + 9 + 12 = 39$
$\sum x_i y_i = (1 \times 4) + (2 \times 2) + (3 \times 4) + (4 \times 8) + (5 \times 9) + (6 \times 12) = 4 + 4 + 12 + 32 + 45 + 72 = 169$
$$\operatorname{cov}(X, Y) = \frac{169}{6} - \frac{21}{6}\cdot\frac{39}{6} = \frac{169}{6} - \frac{91}{4} = \frac{65}{12} \quad \text{Ans.}$$

Example 2. Find the covariance of the following pairs of observations of two variates:
(10, 35), (15, 20), (20, 30), (25, 30), (30, 35), (35, 38), (40, 42), (45, 30), (50, 40), (55, 70)
Solution.
$\sum x_i = 10 + 15 + 20 + 25 + 30 + 35 + 40 + 45 + 50 + 55 = 325$
$\sum y_i = 35 + 20 + 30 + 30 + 35 + 38 + 42 + 30 + 40 + 70 = 370$
$\sum x_i y_i = (10 \times 35) + (15 \times 20) + (20 \times 30) + (25 \times 30) + (30 \times 35) + (35 \times 38) + (40 \times 42) + (45 \times 30) + (50 \times 40) + (55 \times 70)$
$= 350 + 300 + 600 + 750 + 1050 + 1330 + 1680 + 1350 + 2000 + 3850 = 13260$
$$\operatorname{cov}(X, Y) = \frac{\sum x_i y_i}{n} - \frac{\sum x_i}{n}\cdot\frac{\sum y_i}{n} = \frac{13260}{10} - \frac{325}{10}\cdot\frac{370}{10} = 1326 - 1202.5 = 123.5 \quad \text{Ans.}$$

7.4 CORRELATION
Whenever two variables x and y are so related that an increase in the one is accompanied by an increase or decrease in the other, the variables are said to be correlated.
For example, the yield of a crop varies with the amount of rainfall.

7.5 TYPES OF CORRELATIONS
(1) Positive correlation
If an increase in the value of one variable X results in a corresponding increase in the value of the other variable Y on an average, or if a decrease in the value of X results in a corresponding decrease in the value of Y on an average, the correlation is said to be positive.
(2) Negative correlation
If an increase in the values of one variable X results in a corresponding decrease in the values of the other variable Y, or if a decrease in the values of X results in a corresponding increase in the values of Y, the correlation between X and Y is said to be negative.
(3) Linear correlation
When all the plotted points lie approximately on a straight line, the correlation is said to be linear.
(4) Perfect correlation
If the deviation of one variable X is proportional to the deviation in the other variable Y, the correlation is said to be perfect. In this case the plotted points on a graph lie exactly on a straight line.
(a) Positive perfect correlation
If the increase in one variable X is proportional to the increase in the other variable Y, the graph will be exactly a straight line.
(b) Negative perfect correlation
If the increase in one variable is proportional to the decrease in the other variable, the graph will be exactly a straight line.
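The four-step working rule for covariance given earlier can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `covariance` is a hypothetical helper, not something defined in the text, and it computes the population form (division by n) used in this chapter. The two calls reuse the data of Examples 1 and 2.

```python
def covariance(xs, ys):
    """Population covariance via the working rule: sum(xy)/n - (sum(x)/n)(sum(y)/n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    sum_x = sum(xs)                                # Step I
    sum_y = sum(ys)                                # Step I
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # Step II
    # Step III divides each sum by n, Step IV takes the difference.
    return sum_xy / n - (sum_x / n) * (sum_y / n)


# Data of Example 1 (hand result: 65/12)
cov_1 = covariance([1, 2, 3, 4, 5, 6], [4, 2, 4, 8, 9, 12])

# Data of Example 2 (hand result: 123.5)
cov_2 = covariance(list(range(10, 60, 5)),
                   [35, 20, 30, 30, 35, 38, 42, 30, 40, 70])

print(round(cov_1, 4), round(cov_2, 4))
```

The shortcut form is used here because it mirrors the hand calculation; for data with very large values, the deviation form $\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})$ is numerically safer.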
Perfect correlation: If two variables vary in such a way that their ratio is always constant, the correlation is said to be perfect.

7.6 SCATTER OR DOT-DIAGRAM
When we plot the corresponding values of two variables, taking one along the x-axis and the other along the y-axis, we obtain a collection of dots. This collection of dots is called a dot diagram or a scatter diagram.

[Figure: scatter diagrams showing perfect positive correlation (coefficient of correlation = +1), perfect negative correlation (coefficient of correlation = -1), positive correlation, negative correlation and no correlation]

Methods of Determining Simple Correlation
Graphical: Scatter Diagram; Correlation Graph.
Algebraic: Karl Pearson's Coefficient of Correlation; Spearman's Rank Differences Method; Concurrent Deviations Method; Two-way Frequency Table Method.

7.7 KARL PEARSON'S COEFFICIENT OF CORRELATION
The coefficient of correlation r between two variables x and y is defined by the relation
$$r = \frac{\operatorname{cov}(x, y)}{\sqrt{\operatorname{var} x}\,\sqrt{\operatorname{var} y}} = \frac{\sum XY}{n\,\sigma_x \sigma_y} = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} \qquad \ldots(1)$$
where $X = x - \bar{x}$, $Y = y - \bar{y}$, i.e. X, Y are the deviations measured from the respective means, $\frac{\sum XY}{n}$ is the covariance, and $\sigma_x$, $\sigma_y$ are the standard deviations of the two series.

Example 3. Calculate the coefficient of correlation between the x and y series from the following data:
$\sum (x - \bar{x})^2 = 136$, $\sum (y - \bar{y})^2 = 138$, $\sum (x - \bar{x})(y - \bar{y}) = 122$
Solution. Here, we have
$\sum X^2 = \sum (x - \bar{x})^2 = 136$
$\sum Y^2 = \sum (y - \bar{y})^2 = 138$
$\sum XY = \sum (x - \bar{x})(y - \bar{y}) = 122$
Putting the values of $\sum XY$, $\sum X^2$ and $\sum Y^2$ in (1), we get
$$r = \frac{122}{\sqrt{136}\,\sqrt{138}} = \frac{122}{11.66 \times 11.75} = \frac{122}{137.0} = 0.89 \quad \text{Ans.}$$

Example 4. Ten students got the following percentage of marks in Economics and Statistics.
Roll No.           | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10
Marks in Economics | 78 | 36 | 98 | 25 | 75 | 82 | 90 | 62 | 65 | 39
Marks in Statistics| 84 | 51 | 91 | 60 | 68 | 62 | 86 | 58 | 53 | 47
Calculate the coefficient of correlation.
Solution. Let the marks of the two subjects be denoted by x and y respectively.
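Then the mean of the x marks $= \frac{650}{10} = 65$ and the mean of the y marks $= \frac{660}{10} = 66$.
If X and Y are the deviations of the x's and y's from their respective means, the data may be arranged in the following form:
x   | y   | X = x - 65 | Y = y - 66 | X^2  | Y^2  | XY
78  | 84  | 13         | 18         | 169  | 324  | 234
36  | 51  | -29        | -15        | 841  | 225  | 435
98  | 91  | 33         | 25         | 1089 | 625  | 825
25  | 60  | -40        | -6         | 1600 | 36   | 240
75  | 68  | 10         | 2          | 100  | 4    | 20
82  | 62  | 17         | -4         | 289  | 16   | -68
90  | 86  | 25         | 20         | 625  | 400  | 500
62  | 58  | -3         | -8         | 9    | 64   | 24
65  | 53  | 0          | -13        | 0    | 169  | 0
39  | 47  | -26        | -19        | 676  | 361  | 494
650 | 660 | 0          | 0          | 5398 | 2224 | 2704
Here $\sum X^2 = 5398$, $\sum Y^2 = 2224$, $\sum XY = 2704$.
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{2704}{\sqrt{5398 \times 2224}} = \frac{2704}{73.47 \times 47.16} \approx 0.78 \quad \text{Ans.}$$

Example 5. Calculate the correlation coefficient between the following data:
x | 5  | 9  | 13 | 17 | 21
y | 12 | 20 | 25 | 33 | 35
Solution.
$\bar{x} = \frac{5 + 9 + 13 + 17 + 21}{5} = 13$, $\quad \bar{y} = \frac{12 + 20 + 25 + 33 + 35}{5} = 25$
Let $X = x - \bar{x}$ and $Y = y - \bar{y}$.
x  | X = x - 13 | X^2 | y  | Y = y - 25 | Y^2 | XY
5  | -8         | 64  | 12 | -13        | 169 | 104
9  | -4         | 16  | 20 | -5         | 25  | 20
13 | 0          | 0   | 25 | 0          | 0   | 0
17 | 4          | 16  | 33 | 8          | 64  | 32
21 | 8          | 64  | 35 | 10         | 100 | 80
Totals: $\sum X = 0$, $\sum X^2 = 160$, $\sum Y = 0$, $\sum Y^2 = 358$, $\sum XY = 236$.
$$r = \frac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \frac{236}{\sqrt{160 \times 358}} = \frac{236}{239.33} = 0.986 \quad \text{Ans.}$$

Example 6. Calculate Karl Pearson's coefficient of correlation for the data given below:
Independent variable x | 3 | 7  | 5 | 4 | 6  | 8  | 2 | 7
Dependent variable y   | 7 | 12 | 8 | 8 | 10 | 13 | 5 | 10
Solution. Let the assumed means for the x and y series be 5 and 9 respectively.
x     | y  | X = x - 5 | Y = y - 9 | X^2 | Y^2 | XY
3     | 7  | -2        | -2        | 4   | 4   | 4
7     | 12 | 2         | 3         | 4   | 9   | 6
5     | 8  | 0         | -1        | 0   | 1   | 0
4     | 8  | -1        | -1        | 1   | 1   | 1
6     | 10 | 1         | 1         | 1   | 1   | 1
8     | 13 | 3         | 4         | 9   | 16  | 12
2     | 5  | -3        | -4        | 9   | 16  | 12
7     | 10 | 2         | 1         | 4   | 1   | 2
Total |    | 2         | 1         | 32  | 49  | 38
Now,
$$r = \frac{\dfrac{\sum XY}{N} - \dfrac{\sum X}{N}\cdot\dfrac{\sum Y}{N}}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$
With N = 8, multiplying numerator and denominator by 64, we get
$$r = \frac{38 \times 8 - 2 \times 1}{\sqrt{32 \times 8 - 2^2}\;\sqrt{49 \times 8 - 1^2}} = \frac{302}{\sqrt{252}\,\sqrt{391}} = \frac{302}{313.9} \approx 0.96 \quad \text{Ans.}$$

Example 7. From the following data, examine whether the input of oil and the output of electricity can be said to be correlated.
Input of oil          | 6.9 | 8.2 | 7.8 | 4.8 | 9.6 | 8.0 | 7.7
Output of electricity | 1.9 | 3.5 | 6.5 | 1.3 | 5.5 | 3.5 | 2.2
Solution. Assumed mean of x = 4.8, assumed mean of y = 1.3.
x   | y   | X = x - 4.8 | Y = y - 1.3 | X^2   | Y^2   | XY
6.9 | 1.9 | 2.1         | 0.6         | 4.41  | 0.36  | 1.26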
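8.2 | 3.5 | 3.4         | 2.2         | 11.56 | 4.84  | 7.48
7.8 | 6.5 | 3.0         | 5.2         | 9.00  | 27.04 | 15.60
4.8 | 1.3 | 0           | 0           | 0     | 0     | 0
9.6 | 5.5 | 4.8         | 4.2         | 23.04 | 17.64 | 20.16
8.0 | 3.5 | 3.2         | 2.2         | 10.24 | 4.84  | 7.04
7.7 | 2.2 | 2.9         | 0.9         | 8.41  | 0.81  | 2.61
Totals: $\sum X = 19.4$, $\sum Y = 15.3$, $\sum X^2 = 66.66$, $\sum Y^2 = 55.53$, $\sum XY = 54.15$.
Now, with N = 7,
$$r = \frac{\dfrac{54.15}{7} - \dfrac{19.4}{7}\cdot\dfrac{15.3}{7}}{\sqrt{\dfrac{66.66}{7} - \left(\dfrac{19.4}{7}\right)^2}\;\sqrt{\dfrac{55.53}{7} - \left(\dfrac{15.3}{7}\right)^2}}$$
Multiplying numerator and denominator by 49, we get
$$r = \frac{54.15 \times 7 - 19.4 \times 15.3}{\sqrt{66.66 \times 7 - (19.4)^2}\;\sqrt{55.53 \times 7 - (15.3)^2}} = \frac{379.05 - 296.82}{\sqrt{466.62 - 376.36}\;\sqrt{388.71 - 234.09}} = \frac{82.23}{\sqrt{90.26}\,\sqrt{154.62}} = \frac{82.23}{9.50 \times 12.43} \approx 0.70 \quad \text{Ans.}$$

Example 8. Find the correlation between sales and advertising expenditure from the following data:
Sales (Rs. lakh) | 65 | 66 | 67 | 67 | 68 | 69 | 70 | 72
Adv. expenditure | 67 | 68 | 65 | 68 | 72 | 72 | 69 | 71
Solution. Assumed mean of x = 67, assumed mean of y = 68.
x     | y  | X = x - 67 | Y = y - 68 | X^2 | Y^2 | XY
65    | 67 | -2         | -1         | 4   | 1   | 2
66    | 68 | -1         | 0          | 1   | 0   | 0
67    | 65 | 0          | -3         | 0   | 9   | 0
67    | 68 | 0          | 0          | 0   | 0   | 0
68    | 72 | 1          | 4          | 1   | 16  | 4
69    | 72 | 2          | 4          | 4   | 16  | 8
70    | 69 | 3          | 1          | 9   | 1   | 3
72    | 71 | 5          | 3          | 25  | 9   | 15
Total |    | 8          | 8          | 44  | 52  | 32
Now, with N = 8,
$$r = \frac{\dfrac{32}{8} - \dfrac{8}{8}\cdot\dfrac{8}{8}}{\sqrt{\dfrac{44}{8} - \left(\dfrac{8}{8}\right)^2}\;\sqrt{\dfrac{52}{8} - \left(\dfrac{8}{8}\right)^2}}$$
Multiplying numerator and denominator by 64, we get
$$r = \frac{32 \times 8 - 64}{\sqrt{44 \times 8 - 64}\;\sqrt{52 \times 8 - 64}} = \frac{256 - 64}{\sqrt{352 - 64}\;\sqrt{416 - 64}} = \frac{192}{\sqrt{288}\,\sqrt{352}} = 0.603 \quad \text{Ans.}$$

7.8 COEFFICIENT OF CORRELATION OF GROUPED DATA
$$r = \frac{\dfrac{\sum fXY}{N} - \dfrac{\sum fX}{N}\cdot\dfrac{\sum fY}{N}}{\sqrt{\dfrac{\sum fX^2}{N} - \left(\dfrac{\sum fX}{N}\right)^2}\;\sqrt{\dfrac{\sum fY^2}{N} - \left(\dfrac{\sum fY}{N}\right)^2}} \qquad \ldots(1)$$
where r = coefficient of correlation,
X = deviation from the assumed mean of x = x - a,
Y = deviation from the assumed mean of y = y - b,
N = total number of items.

Example 9. Find the coefficient of correlation between the age and the sum assured from the following table:
[Two-way table of number of persons by age group and sum assured (Rs. 10,000, 20,000, 30,000, 40,000, 50,000), total 100 persons; the cell entries are not legible in the source]
Solution. Let the sum assured be denoted by x and the age group by y, with step deviations $X = \frac{x - 30{,}000}{10{,}000}$ for the sum assured and Y for the age group.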
[Correlation table with the columns fX, fX^2, fY, fY^2 and fXY; not legible in the source]
From the table, N = 100, $\sum fXY = -7$, $\sum fX = -33$, $\sum fY = -61$, $\sum fX^2 = 131$, $\sum fY^2 = 131$.
Putting the values in (1) and multiplying numerator and denominator by 10,000, we get
$$r = \frac{100(-7) - (-33)(-61)}{\sqrt{100(131) - (-33)^2}\;\sqrt{100(131) - (-61)^2}} = \frac{-2713}{\sqrt{13100 - 1089}\;\sqrt{13100 - 3721}} = \frac{-2713}{\sqrt{12011}\,\sqrt{9379}} = \frac{-2713}{109.59 \times 96.85} = \frac{-2713}{10613.79} = -0.2556$$
Hence, the age and the sum assured are negatively correlated, i.e., as age goes up the sum assured comes down. Ans.

Example 10. Calculate the coefficient of correlation for the following table:
y \ x | 0-4 | 4-8 | 8-12 | 12-16 | Total
0-5   | 7   | -   | -    | -     | 7
5-10  | 6   | 8   | -    | -     | 14
10-15 | -   | 5   | 3    | -     | 8
15-20 | -   | 7   | 2    | -     | 9
20-25 | -   | -   | -    | 9     | 9
Total | 13  | 20  | 5    | 9     | 47
Solution. Replace the class intervals for x and y by their mid-points (2, 6, 10, 14 and 2.5, 7.5, 12.5, 17.5, 22.5) and then let
$X' = \frac{x - 10}{4}$, $\quad Y' = \frac{y - 12.5}{5}$
so that X' takes the values -2, -1, 0, 1 and Y' takes the values -2, -1, 0, 1, 2.
y mid-point | Y' | f  | fY' | fY'^2 | fX'Y'
2.5         | -2 | 7  | -14 | 28    | 28
7.5         | -1 | 14 | -14 | 14    | 20
12.5        | 0  | 8  | 0   | 0     | 0
17.5        | 1  | 9  | 9   | 9     | -7
22.5        | 2  | 9  | 18  | 36    | 18
Total       |    | 47 | -1  | 87    | 59
x mid-point | X' | f  | fX' | fX'^2
2           | -2 | 13 | -26 | 52
6           | -1 | 20 | -20 | 20
10          | 0  | 5  | 0   | 0
14          | 1  | 9  | 9   | 9
Total       |    | 47 | -37 | 81
Here $\sum fX' = -37$, $\sum fX'^2 = 81$, $\sum fY' = -1$, $\sum fY'^2 = 87$, $\sum fX'Y' = 59$, N = 47.
$$r = \frac{\dfrac{\sum fX'Y'}{N} - \dfrac{\sum fX'}{N}\cdot\dfrac{\sum fY'}{N}}{\sqrt{\dfrac{\sum fX'^2}{N} - \left(\dfrac{\sum fX'}{N}\right)^2}\;\sqrt{\dfrac{\sum fY'^2}{N} - \left(\dfrac{\sum fY'}{N}\right)^2}} = \frac{\dfrac{59}{47} - \dfrac{(-37)(-1)}{47^2}}{\sqrt{\dfrac{81}{47} - \left(\dfrac{37}{47}\right)^2}\;\sqrt{\dfrac{87}{47} - \left(\dfrac{1}{47}\right)^2}}$$
$$= \frac{1.255 - 0.017}{\sqrt{1.723 - 0.620}\;\sqrt{1.851 - 0.0005}} = \frac{1.238}{\sqrt{1.103}\,\sqrt{1.8505}} = \frac{1.238}{1.05 \times 1.36} \approx 0.867 \quad \text{Ans.}$$

Example 11. A computer operator, while calculating the coefficient of correlation between two variates x and y for 25 pairs of observations, obtained the following constants:
n = 25, $\sum x = 125$, $\sum x^2 = 650$, $\sum y = 100$, $\sum y^2 = 460$, $\sum xy = 508$
It was however later discovered at the time of checking that he had copied down two pairs as (6, 14) and (8, 6) while the correct pairs were (8, 12) and (6, 8).
Obtain the correct value of the correlation coefficient.
Solution. Here,
Corrected $\sum x$ = Incorrect $\sum x - (6 + 8) + (8 + 6) = 125$
Corrected $\sum y$ = Incorrect $\sum y - (14 + 6) + (12 + 8) = 100$
Corrected $\sum x^2 = 650 - (6^2 + 8^2) + (8^2 + 6^2) = 650$
Corrected $\sum y^2 = 460 - (14^2 + 6^2) + (12^2 + 8^2) = 436$
Corrected $\sum xy = 508 - (84 + 48) + (96 + 48) = 520$
The corrected value of the correlation coefficient is
$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} = \frac{520 - \dfrac{125 \times 100}{25}}{\sqrt{\left(650 - \dfrac{(125)^2}{25}\right)\left(436 - \dfrac{(100)^2}{25}\right)}} = \frac{520 - 500}{\sqrt{(650 - 625)(436 - 400)}} = \frac{20}{\sqrt{25 \times 36}} = \frac{2}{3} \quad \text{Ans.}$$

7.9 RANK CORRELATION
The coefficient of rank correlation is applied to problems in which the data cannot be measured quantitatively but a qualitative assessment is possible, such as beauty, honesty, etc. In this case the best individual is given rank no. 1, the next rank no. 2, and so on.

7.10 SPEARMAN'S RANK CORRELATION COEFFICIENT
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
Proof. Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the ranks of n individuals corresponding to two characteristics. Assuming that no two individuals are equal in either classification, each variable takes the values 1, 2, 3, ..., n, and hence the arithmetic means are each
$$\bar{x} = \bar{y} = \frac{1 + 2 + \cdots + n}{n} = \frac{n + 1}{2}.$$
Let $X = x - \bar{x}$ and $Y = y - \bar{y}$ be the deviations from the means. Then
$$\sum X^2 = \sum x^2 - n\bar{x}^2 = \frac{n(n + 1)(2n + 1)}{6} - \frac{n(n + 1)^2}{4} = \frac{n(n^2 - 1)}{12},$$
and clearly $\sum Y^2 = \sum X^2$. Let $d = x - y = X - Y$. Then
$$\sum d^2 = \sum X^2 + \sum Y^2 - 2\sum XY \quad\Rightarrow\quad \sum XY = \frac{n(n^2 - 1)}{12} - \frac{1}{2}\sum d^2.$$
Putting these values in $r = \dfrac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$, we get
$$r = \frac{\dfrac{n(n^2 - 1)}{12} - \dfrac{1}{2}\sum d^2}{\dfrac{n(n^2 - 1)}{12}} = 1 - \frac{6\sum d^2}{n(n^2 - 1)}.$$

Working Rule
Step I. Assign ranks to each item of both series, if they are not given.
Step II. Calculate the difference D between the rank of X and the rank of Y and write it in a separate column.
Step III. Square the difference D and write $D^2$ in a separate column.
Step IV. Apply the formula
$$r = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$
to get the rank correlation, where n is the total number of pairs of observations.

Example 12. Compute Spearman's rank correlation coefficient from the following data:
[Table of the ranks of ten items A to J in the two series, with the columns d and d^2; only partly legible in the source]
Solution. From the table, $\sum d^2 = 280$.
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 280}{10(100 - 1)} = 1 - \frac{1680}{990} = -0.697 \quad \text{Ans.}$$

7.11 EQUAL RANKS
If there is more than one item with the same rank, each of the equal items is assigned the average of the ranks they would otherwise occupy.
For example, suppose an item is repeated at rank 5, i.e. the 5th and 6th items have the same value. Then the common rank assigned to the 5th and 6th items is $\frac{5 + 6}{2} = 5.5$, the average of 5 and 6. The next rank assigned will be 7.
If an item is repeated thrice at rank 2, then the ...

[...]

3. [...] (AMIE, Winter 2002) Ans. 0.472
4. Calculate the coefficient of correlation for the following data:
x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
y | [values not legible in the source]
Ans. 0.95
5. Calculate Karl Pearson's coefficient of correlation from the following data, using 20 as working mean for price and 70 as working mean for demand.
[Table of price and demand; only partly legible in the source]
6. Calculate the correlation coefficient between the following pairs of values:
[Table of x and y; only partly legible in the source]
Ans. -0.915
7. The following marks have been obtained by a class of students in statistics (out of 100):
Paper I  | 80 | 45 | 55 | 56 | 58 | 60 | 65 | 68 | 70 | 75 | 85
Paper II | 81 | 56 | 50 | 48 | 60 | 62 | 64 | 65 | 70 | 74 | 90
Compute the coefficient of correlation for the above data. Ans. 0.918
8. Calculate the coefficient of correlation between the values of x and y from the following data:
[Table of x and y; not legible in the source]
9. [...]
[Table of income and expenditure by month; not legible in the source]
10. The ranks of the same 16 students in two subjects A and B were as follows, the two numbers within brackets denoting the ranks of the same student in A and B respectively:
[List of rank pairs; not legible in the source]
Calculate the rank correlation coefficient for the proficiencies of this group in subjects A and B.
11. [...]
12. If x and y are two random variables with the same standard deviation and correlation coefficient r, show that the coefficient of correlation between x and x + y is $\sqrt{\dfrac{1 + r}{2}}$.
13. [...]
14.
Calculate the coefficient of correlation between x (marks in Mathematics) and y (marks in Physics) given in the following data:
      | 10-40 | 40-70 | 70-100 | Total
0-30  | 5     | 20    | -      | 25
30-60 | -     | 28    | 2      | 30
60-90 | -     | 32    | 13     | 45
Total | 5     | 80    | 15     | 100
Ans. 0.4517
15. Calculate, from the data reproduced below pertaining to 66 selected villages in Meerut district, the value of r between "total cultivated area" and "the area under wheat".
[Two-way table in Bighas: classes 0-200, 200-400, 400-600, 600-800, 800-1000 against 0-500, 500-1000, 1000-1500, 1500-2000, 2000-2500, total 66 villages; the cell entries are only partly legible in the source]
Ans. 0.749
16. Calculate the coefficient of correlation from the following table giving the ages of 100 husbands and their wives in years:
Age of wives \ Age of husbands | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | Total
15-25                          | 5     | 9     | 3     | -     | -     | 17
25-35                          | -     | 10    | 25    | 2     | -     | 37
35-45                          | -     | 1     | 12    | 2     | -     | 15
45-55                          | -     | -     | 4     | 16    | 5     | 25
55-65                          | -     | -     | -     | 4     | 2     | 6
Total                          | 5     | 20    | 44    | 24    | 7     | 100
Ans. r = 0.795
17. Find the coefficient of correlation for the following data:
y \ x | 16-18 | 18-20 | 20-22 | 22-24 | Total
10-20 | 2     | 1     | 1     | -     | 4
20-30 | 3     | 2     | 3     | 2     | 10
30-40 | 3     | 4     | 5     | 6     | 18
40-50 | 2     | 2     | 3     | 4     | 11
50-60 | -     | 1     | 2     | 2     | 5
60-70 | -     | 1     | 2     | 1     | 4
Total | 10    | 11    | 16    | 15    | 52
Ans. 0.28
18. Two girls, Asgari and Mumtaz, were asked to rank 7 different types of lipsticks. The ranks given by them are given below:
[Table of ranks; not legible in the source]
Calculate Spearman's rank correlation coefficient.
19. Two judges in a beauty contest rank the ten competitors in the following order:
[Table of ranks; not legible in the source]
Do the two judges appear to agree in their standard? Ans. 0.224
20. Fill up the blanks:
The value of the coefficient of correlation lies between ......... Ans. -1 and +1

7.13 REGRESSION
If the scatter diagram indicates some relationship between two variables x and y, then the dots of the scatter diagram will be concentrated round a curve. This curve is called the curve of regression.
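To make the idea of a curve of regression concrete, the sketch below fits a straight line $y = a + bx$ through a set of points by least squares, using the normal equations $\sum y = na + b\sum x$ and $\sum xy = a\sum x + b\sum x^2$ that the chapter derives in the section on the equations to the lines of regression. The function name `fit_line` is a hypothetical helper introduced only for this illustration, and the data are those that Example 29 works by hand.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x, obtained by solving the two normal equations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denom   # slope
    a = (sum_y - b * sum_x) / n                # intercept, from sum(y) = n*a + b*sum(x)
    return a, b


# Data of Example 29; the hand solution gives a = 6/11 and b = 7/11.
xs = [1, 3, 4, 6, 8, 9, 11, 14]
ys = [1, 2, 4, 4, 5, 7, 8, 9]
a, b = fit_line(xs, ys)
estimate_at_10 = a + b * 10
print(round(a, 4), round(b, 4), round(estimate_at_10, 3))
```

Solving the two normal equations in closed form keeps the sketch dependency-free; a library routine would normally be preferred for larger data sets.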
Regression analysis is the method used for estimating the unknown values of one variable corresponding to the known values of another variable.

7.14 LINE OF REGRESSION
Regression is called non-linear if there exists a relationship other than a straight line (a parabola, etc.) between the variables under consideration.

7.15 EQUATIONS TO THE LINES OF REGRESSION
Let
$$y = a + bx \qquad \ldots(1)$$
be the equation of the line of regression of y on x.
Let $(x_i, y_i)$ be any point of the dot diagram. From the figure,
$PR = y_i$, $\quad QR = a + bx_i$, $\quad PQ = PR - QR = y_i - a - bx_i$.
Let S be the sum of the squares of such distances; then
$$S = \sum (y - a - bx)^2.$$
According to the principle of least squares, we have to choose a and b so that S is minimum. The conditions for a minimum value of S are
$$\frac{\partial S}{\partial a} = -2\sum (y - a - bx) = 0, \qquad \frac{\partial S}{\partial b} = -2\sum (y - a - bx)\,x = 0,$$
that is,
$$\sum y = na + b\sum x \qquad \ldots(2)$$
$$\sum xy = a\sum x + b\sum x^2 \qquad \ldots(3)$$
Dividing (2) by n, we get
$$\bar{y} = a + b\bar{x},$$
where $\bar{x}$ and $\bar{y}$ are the means of the x's and y's. This shows that $(\bar{x}, \bar{y})$ lies on the line of regression (1).
Shifting the origin to $(\bar{x}, \bar{y})$, equation (3) becomes
$$\sum (x - \bar{x})(y - \bar{y}) = a\sum (x - \bar{x}) + b\sum (x - \bar{x})^2.$$
But $\sum (x - \bar{x}) = 0$, so
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum XY}{\sum X^2} \qquad \ldots(4)$$
We know that
$$r = \frac{\sum XY}{n\,\sigma_x \sigma_y} \quad\text{or}\quad \sum XY = n r \sigma_x \sigma_y, \qquad \sigma_x^2 = \frac{\sum X^2}{n}.$$
Putting the value of $\sum XY$ in (4), we get
$$b = \frac{n r \sigma_x \sigma_y}{n \sigma_x^2} = r\,\frac{\sigma_y}{\sigma_x},$$
i.e. the slope of the line of regression is $r\,\dfrac{\sigma_y}{\sigma_x}$.
The line of regression passes through $(\bar{x}, \bar{y})$. Hence the equation of the line of regression of y on x is
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}).$$
Similarly, the regression line of x on y is
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}).$$
$r\,\dfrac{\sigma_y}{\sigma_x}$ and $r\,\dfrac{\sigma_x}{\sigma_y}$ are known as the coefficients of regression.

Example 19. If $\theta$ is the acute angle between the two regression lines in the case of two variables x and y, show that
$$\tan\theta = \frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2},$$
where r, $\sigma_x$, $\sigma_y$ have their usual meanings. Explain the significance when r = 0 and when r = ±1.
(Nagpur University, Winter 2004; A.M.I.E., Winter 2001)
Solution.
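The lines of regression are
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}) \qquad \ldots(1)$$
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}) \qquad \ldots(2)$$
Their slopes are $m_1 = r\,\dfrac{\sigma_y}{\sigma_x}$ and $m_2 = \dfrac{\sigma_y}{r\,\sigma_x}$, so
$$\tan\theta = \frac{m_2 - m_1}{1 + m_1 m_2} = \frac{\dfrac{\sigma_y}{r\sigma_x} - \dfrac{r\sigma_y}{\sigma_x}}{1 + \dfrac{\sigma_y^2}{\sigma_x^2}} = \frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \qquad \ldots(3) \quad \text{Proved.}$$
(a) If r = 0, there is no linear relationship between the two variables. On putting r = 0 in (3) we get $\tan\theta = \infty$, or $\theta = \dfrac{\pi}{2}$, so the lines (1) and (2) are perpendicular.
(b) If r = 1 or -1, on putting these values of r in (3) we get $\tan\theta = 0$, or $\theta = 0$, i.e. the lines (1) and (2) coincide. The correlation between the variables is perfect. Ans.

Example 20. If the coefficient of correlation between two variables x and y is 0.5 and the acute angle between their lines of regression is $\tan^{-1}\left(\dfrac{3}{5}\right)$, show that $\sigma_x = \dfrac{1}{2}\sigma_y$.
(U.P. III Semester, June 2009)
Solution. Here, we have r = 0.5 and $\tan\theta = \dfrac{3}{5}$. From Example 19,
$$\tan\theta = \frac{1 - r^2}{r}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \qquad \ldots(1)$$
Putting the values of r and $\tan\theta$ in (1), we get
$$\frac{3}{5} = \frac{1 - 0.25}{0.5}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} = \frac{3}{2}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$
$\Rightarrow\ 2\sigma_x^2 + 2\sigma_y^2 = 5\sigma_x \sigma_y$
$\Rightarrow\ 2\sigma_x^2 - 4\sigma_x\sigma_y - \sigma_x\sigma_y + 2\sigma_y^2 = 0$
$\Rightarrow\ 2\sigma_x(\sigma_x - 2\sigma_y) - \sigma_y(\sigma_x - 2\sigma_y) = 0$
$\Rightarrow\ (2\sigma_x - \sigma_y)(\sigma_x - 2\sigma_y) = 0$
$\Rightarrow$ either $\sigma_x = \dfrac{1}{2}\sigma_y$ or $\sigma_x = 2\sigma_y$ (not desired). Proved.

Example 21. Find the correlation coefficient between x and y, when the lines of regression are:
2x - 9y + 6 = 0 and x - 2y + 1 = 0
Solution. Let the line of regression of x on y be 2x - 9y + 6 = 0. Then the line of regression of y on x is x - 2y + 1 = 0.
$2x - 9y + 6 = 0 \Rightarrow x = \dfrac{9}{2}y - 3 \Rightarrow b_{xy} = \dfrac{9}{2}$
and $x - 2y + 1 = 0 \Rightarrow y = \dfrac{1}{2}x + \dfrac{1}{2} \Rightarrow b_{yx} = \dfrac{1}{2}$
$$r = \sqrt{b_{xy}\,b_{yx}} = \sqrt{\frac{9}{2}\times\frac{1}{2}} = \frac{3}{2} > 1,$$
which is not possible. So our choice of regression lines is incorrect.
The regression line of x on y is x - 2y + 1 = 0, and the regression line of y on x is 2x - 9y + 6 = 0.
$x - 2y + 1 = 0 \Rightarrow x = 2y - 1 \Rightarrow b_{xy} = 2$
and $2x - 9y + 6 = 0 \Rightarrow y = \dfrac{2}{9}x + \dfrac{2}{3} \Rightarrow b_{yx} = \dfrac{2}{9}$
$$r = \sqrt{b_{xy}\,b_{yx}} = \sqrt{2 \times \frac{2}{9}} = \frac{2}{3}$$
Hence, the correlation coefficient between x and y is $\dfrac{2}{3}$. Ans.

Example 22. Two lines of regression are given by 5y - 8x + 17 = 0 and 2y - 5x + 14 = 0.
If $\sigma_y^2 = 16$, find (i) the mean values of x and y, (ii) $\sigma_x^2$, (iii) the coefficient of correlation between x and y.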
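Solution. We have
$$5y - 8x + 17 = 0 \qquad \ldots(1)$$
$$2y - 5x + 14 = 0$$
Since $(\bar{x}, \bar{y})$ is a common point of the two lines of regression, we have
$$5\bar{y} - 8\bar{x} + 17 = 0 \qquad \ldots(2)$$
$$2\bar{y} - 5\bar{x} + 14 = 0 \qquad \ldots(3)$$
Solving (2) and (3) for $\bar{x}$ and $\bar{y}$, we get $\bar{x} = 4$, $\bar{y} = 3$.
The equations of the lines of regression can be written as
$$y = \frac{8}{5}x - \frac{17}{5} \quad\text{and}\quad x = \frac{2}{5}y + \frac{14}{5},$$
so that
$$r\,\frac{\sigma_y}{\sigma_x} = \frac{8}{5} \qquad \ldots(4) \qquad\text{and}\qquad r\,\frac{\sigma_x}{\sigma_y} = \frac{2}{5}.$$
On multiplying these two equations, we get
$$r^2 = \frac{8}{5}\times\frac{2}{5} = \frac{16}{25} \quad\Rightarrow\quad r = \pm\frac{4}{5}.$$
Now we have to determine the sign of r. Since $\sigma_x$, $\sigma_y$ are always positive and the regression coefficients are positive, r is also positive from (4), so $r = \dfrac{4}{5}$.
We are given $\sigma_y^2 = 16$, so $\sigma_y = 4$. On putting the values of r and $\sigma_y$ in (4), we get
$$\frac{4}{5}\cdot\frac{4}{\sigma_x} = \frac{8}{5} \quad\Rightarrow\quad \sigma_x = 2 \quad\Rightarrow\quad \sigma_x^2 = 4.$$
Hence (i) $\bar{x} = 4$, $\bar{y} = 3$, (ii) $\sigma_x^2 = 4$, (iii) $r = \dfrac{4}{5}$. Ans.

Example 23. In a partially destroyed laboratory record of an analysis of correlation data, the following results only are legible:
$\sigma_x^2 = 9$
Regression equations: 8x - 10y + 66 = 0, 40x - 18y = 214
What were (a) the mean values of x and y, (b) the standard deviation of y, (c) the coefficient of correlation between x and y?
(Nagpur University, Summer 2003)
Solution. Since both the lines of regression pass through the point $(\bar{x}, \bar{y})$, we have
$$8\bar{x} - 10\bar{y} + 66 = 0 \qquad \ldots(1)$$
$$40\bar{x} - 18\bar{y} - 214 = 0 \qquad \ldots(2)$$
On solving (1) and (2) by the cross-multiplication method, we have
$$\frac{\bar{x}}{(-10)(-214) - (66)(-18)} = \frac{\bar{y}}{(66)(40) - (8)(-214)} = \frac{1}{(8)(-18) - (40)(-10)}$$
$$\frac{\bar{x}}{2140 + 1188} = \frac{\bar{y}}{2640 + 1712} = \frac{1}{-144 + 400}$$
$$\bar{x} = \frac{3328}{256} = 13, \qquad \bar{y} = \frac{4352}{256} = 17.$$
Also, the given lines of regression can be written as
y = 0.8x + 6.6, x = 0.45y + 5.35
We get
$$r\,\frac{\sigma_y}{\sigma_x} = 0.8 \qquad \ldots(3) \qquad\text{and}\qquad r\,\frac{\sigma_x}{\sigma_y} = 0.45,$$
so $r^2 = (0.8)(0.45) = 0.36 \Rightarrow r = 0.6$.
On putting the values of r and $\sigma_x = 3$ in (3), we get
$$(0.6)\,\frac{\sigma_y}{3} = 0.8 \quad\Rightarrow\quad \sigma_y = 4.$$
Hence (a) $\bar{x} = 13$, $\bar{y} = 17$, (b) $\sigma_y = 4$, (c) r = 0.6. Ans.

Example 24. The following regression equations were obtained from a correlation table:
y = 0.516x + 33.73, x = 0.512y + 32.52
Find the value of (a) the correlation coefficient, (b) the mean of the x's and (c) the mean of the y's.
Solution.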
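$$y = 0.516x + 33.73 \qquad \ldots(1)$$
$$x = 0.512y + 32.52 \qquad \ldots(2)$$
(a) From (1), $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = 0.516 \qquad \ldots(3)$
From (2), $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} = 0.512 \qquad \ldots(4)$
Multiplying (3) and (4), we get
$$r^2 = 0.516 \times 0.512 \quad\Rightarrow\quad r = \sqrt{0.516 \times 0.512} = 0.514.$$
Coefficient of correlation = 0.514. Ans.
(b) Since (1) and (2) pass through the point $(\bar{x}, \bar{y})$,
$$\bar{y} = 0.516\bar{x} + 33.73 \qquad \ldots(5)$$
$$\bar{x} = 0.512\bar{y} + 32.52 \qquad \ldots(6)$$
On solving (5) and (6), we get $\bar{x} \approx 67.67$, $\bar{y} \approx 68.65$. Ans.

Example 25. The regression equations of the variables x and y are
x = 19.13 - 0.87y and y = 11.64 - 0.50x.
Find (i) the mean of the x's; (ii) the mean of the y's; (iii) the correlation coefficient between x and y.
Solution.
$$x = 19.13 - 0.87y \qquad \ldots(1)$$
$$y = 11.64 - 0.50x \qquad \ldots(2)$$
Since (1) and (2) pass through $(\bar{x}, \bar{y})$,
$$\bar{x} = 19.13 - 0.87\bar{y} \qquad \ldots(3)$$
$$\bar{y} = 11.64 - 0.50\bar{x} \qquad \ldots(4)$$
Solving (3) and (4), we get $\bar{x} \approx 15.93$, $\bar{y} \approx 3.67$.
From (1), $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} = -0.87 \qquad \ldots(5)$
From (2), $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = -0.50 \qquad \ldots(6)$
Since $\sigma_x$ and $\sigma_y$ are always positive, r is negative. Multiplying (5) and (6), we get
$$r^2 = (-0.87)\times(-0.50) = 0.435 \quad\Rightarrow\quad r = -0.66 \quad \text{Ans.}$$

Example 26. The regression equations calculated from a given set of observations for two random variables are
x = -0.4y + 6.4 and y = -0.6x + 4.6.
Calculate $\bar{x}$, $\bar{y}$ and r.
Solution. The regression equations are
$$x = -0.4y + 6.4 \qquad \ldots(1)$$
$$y = -0.6x + 4.6 \qquad \ldots(2)$$
From (1), the coefficient of regression of x on y is $r\,\dfrac{\sigma_x}{\sigma_y} = -0.4 \qquad \ldots(3)$
From (2), the coefficient of regression of y on x is $r\,\dfrac{\sigma_y}{\sigma_x} = -0.6 \qquad \ldots(4)$
From (3) and (4), we have
$$r^2 = (-0.4)(-0.6) = 0.24 \quad\Rightarrow\quad r = \pm 0.49.$$
Since $\sigma_x$ and $\sigma_y$ are always positive, r is negative: r = -0.49.
To find $\bar{x}$ and $\bar{y}$ we solve equations (1) and (2) simultaneously. Their point of intersection is $(\bar{x}, \bar{y})$:
$\bar{x} = 6$, $\bar{y} = 1$. Ans.

Example 27. Show that the geometric mean of the coefficients of regression is the coefficient of correlation.
(A.M.I.E., Summer 2001)
Solution. The coefficients of regression are $r\,\dfrac{\sigma_x}{\sigma_y}$ and $r\,\dfrac{\sigma_y}{\sigma_x}$.
$$\text{G.M.} = \sqrt{r\,\frac{\sigma_x}{\sigma_y}\times r\,\frac{\sigma_y}{\sigma_x}} = \sqrt{r^2} = r = \text{coefficient of correlation.} \quad \text{Proved.}$$

Example 28. Prove that the arithmetic mean of the coefficients of regression is greater than the coefficient of correlation.
(A.M.I.E., Summer 2000)
Solution. The coefficients of regression are $r\,\dfrac{\sigma_x}{\sigma_y}$ and $r\,\dfrac{\sigma_y}{\sigma_x}$. We have to prove that A.M. > r, i.e.
$$\frac{1}{2}\left[r\,\frac{\sigma_x}{\sigma_y} + r\,\frac{\sigma_y}{\sigma_x}\right] > r \quad\Leftarrow\quad \frac{\sigma_x}{\sigma_y} + \frac{\sigma_y}{\sigma_x} > 2 \quad\Leftarrow\quad \sigma_x^2 + \sigma_y^2 - 2\sigma_x\sigma_y > 0 \quad\Leftarrow\quad (\sigma_x - \sigma_y)^2 > 0,$$
which is true when $\sigma_x \neq \sigma_y$. Proved.

Example 29.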
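Find the regression line of y on x for the following data:
x | 1 | 3 | 4 | 6 | 8 | 9 | 11 | 14
y | 1 | 2 | 4 | 4 | 5 | 7 | 8  | 9
Estimate the value of y when x = 10.
Solution.
S. No. | x  | y  | xy  | x^2
1      | 1  | 1  | 1   | 1
2      | 3  | 2  | 6   | 9
3      | 4  | 4  | 16  | 16
4      | 6  | 4  | 24  | 36
5      | 8  | 5  | 40  | 64
6      | 9  | 7  | 63  | 81
7      | 11 | 8  | 88  | 121
8      | 14 | 9  | 126 | 196
Total  | 56 | 40 | 364 | 524
Let y = a + bx be the line of regression of y on x, where a and b are given by the following equations:
$$\sum y = na + b\sum x \quad\Rightarrow\quad 40 = 8a + 56b \qquad \ldots(1)$$
$$\sum xy = a\sum x + b\sum x^2 \quad\Rightarrow\quad 364 = 56a + 524b \qquad \ldots(2)$$
On solving (1) and (2), we get $a = \dfrac{6}{11}$, $b = \dfrac{7}{11}$.
The equation of the required line is
$$y = \frac{6}{11} + \frac{7}{11}x \quad\text{or}\quad 7x - 11y + 6 = 0. \quad \text{Ans.}$$
When x = 10, $y = \dfrac{6 + 70}{11} = \dfrac{76}{11} \approx 6.91$.

Example 30. The following data relate the amount of rainfall (x) and the quantity of air pollution collected (y):
x | 4.3  | 4.5  | 5.9  | 5.6  | 6.1  | 5.2  | 3.8  | 2.1
y | 12.6 | 12.1 | 11.6 | 11.8 | 11.4 | 11.8 | 13.2 | 14.1
Find the regression line of y on x. (A.M.I.E., Summer 2000)
Solution.
S. No. | x    | y    | xy     | x^2
1      | 4.3  | 12.6 | 54.18  | 18.49
2      | 4.5  | 12.1 | 54.45  | 20.25
3      | 5.9  | 11.6 | 68.44  | 34.81
4      | 5.6  | 11.8 | 66.08  | 31.36
5      | 6.1  | 11.4 | 69.54  | 37.21
6      | 5.2  | 11.8 | 61.36  | 27.04
7      | 3.8  | 13.2 | 50.16  | 14.44
8      | 2.1  | 14.1 | 29.61  | 4.41
Total  | 37.5 | 98.6 | 453.82 | 188.01
Let y = a + bx be the equation of the line of regression of y on x, where a and b are given by the following equations:
$$\sum y = na + b\sum x \quad\Rightarrow\quad 98.6 = 8a + 37.5b \qquad \ldots(1)$$
$$\sum xy = a\sum x + b\sum x^2 \quad\Rightarrow\quad 453.82 = 37.5a + 188.01b \qquad \ldots(2)$$
On solving (1) and (2), we get $a \approx 15.53$ and $b \approx -0.684$.
The equation of the line of regression is y = 15.53 - 0.684x. Ans.

[Example 31: the statement is not preserved in the source. The solution table below corresponds to the data x = 1, 2, 3, 4, 5 and y = 2, 5, 3, 8, 7, with $\bar{x} = 3$ and $\bar{y} = 5$.]
x     | y | X = x - 3 | Y = y - 5 | X^2 | Y^2 | XY
1     | 2 | -2        | -3        | 4   | 9   | 6
2     | 5 | -1        | 0         | 1   | 0   | 0
3     | 3 | 0         | -2        | 0   | 4   | 0
4     | 8 | 1         | 3         | 1   | 9   | 3
5     | 7 | 2         | 2         | 4   | 4   | 4
Total |   |           |           | 10  | 26  | 13
Correlation coefficient $r = \dfrac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}} = \dfrac{13}{\sqrt{10 \times 26}} \approx 0.806$
Slope of the regression line of y on x $= \dfrac{\sum XY}{\sum X^2} = \dfrac{13}{10} = 1.3$
Slope of the regression line of x on y $= \dfrac{\sum XY}{\sum Y^2} = \dfrac{13}{26} = 0.5$
The equation of the regression line of y on x is
$y - 5 = 1.3(x - 3) \Rightarrow y = 1.3x + 1.1$
The equation of the regression line of x on y is
$x - 3 = 0.5(y - 5) \Rightarrow x = 0.5y + 0.5$ Ans.

Example 32. Find the coefficient of correlation and the regression lines for the following data:
x | 5  | 7  | 8  | 10 | 11 | 13 | 16
y | 33 | 30 | 28 | 20 | 18 | 16 | 9
(Nagpur University, Winter 2003)
Solution. Here $\bar{x} = \dfrac{70}{7} = 10$ and $\bar{y} = \dfrac{154}{7} = 22$. The various calculations are shown in the following table:
x  | y  | X = x - 10 | Y = y - 22 | X^2 | Y^2 | XY
5  | 33 | -5         | 11         | 25  | 121 | -55
7  | 30 | -3         | 8          | 9   | 64  | -24
8  | 28 | -2         | 6          | 4   | 36  | -12
10 | 20 | 0          | -2         | 0   | 4   | 0
11 | 18 | 1          | -4         | 1   | 16  | -4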
B | 16 3 = 7 : : is_| 9 6 3 ne = vo =- 191 | exe=84 | 2 = 406 Coefficient of correlation Now, 2.2738 atid Eq lation of line of regression y on x is Soe oF aE" Y= 22 == 2.2738 (x10) Y= 22 =-2.2738r + 22.738 y =- 2.27384 + 44,738 and equation of line of regression x on y is = Ab a x-¥ =r2£(y-j) se > x-10 =~ 0.4283 (y— 22) > x— 10 =-0.4283 y + 9.4226 > ao 0.4283 y + 19.4226 2+ 0.4283 y = 19.4226 Hence the equation of the line of regression of y on x is y+ 22738x = 44.738 ‘The equation of the line of regression of x on y is x-+ 0.4283 y = 19.4226 5 284 Introduction to Engineering Mathematics — yj sion and obtain the lines of the regression | Example 33. Calculate the coefficient of correla “for the following data: TIT? TL Pe 1m pH | [4 [46 | is y lo] 8 (Nagpur University, Summer 2008) Sotutton. Fa- TO S TES 7 9 8 10 jz I 13 14 16 14 | Ly = 108) x y x r x p 9 16 9 2 2 8 9 16 2 3 10 4 4 5 4 12 1 0 0 Se a 7 ; 6 | B 1 i ; 7 14 4 a z 8 16, 9 16 2 gee} ras fa 3 FE Ex-60 | 2¥=60 | 2xY=51] EXxY Coefficient of correlation ‘r= ‘Toy? sy? 37 ‘The regréssion coefficient of y on x is 1 295 [8 - 6, 0 The equation of line of regression of y on xis yy = yo 12 =0.95 (x-5) = y =0.95x+7.25 The regression coefficient of x on y is +{)-095 [3 -09s og, 60 ¥ ‘The equation of line of regressions of x on y is ap ce xe¥ TT OD y x-5 =0.95(y~12) r jon and Regression ji 285 + =095y-64 2 jee, the lines of regression for the given data are ad y =095x+7.25 f x =095y-64 ns. 7 le 34. Find the coefficient of correlation and obtain the equation to the lines of Be regression for the dance x 6 2 10 4 8 y 9 i 8 7 (Nagpur University, Winter 2000) solution. Here, we have x 6 2 | or 8 [Ex=30 y lo gpa fang 8 7 | zy=40 = -2x_30 yet. 20 x | y| x=x-6 x ¥ XY 6 oleae 0 il A 2 | cagal esaq 16 5 0 a a 16 9 -12 72 4 ° 0 7| 2 4 i - j pe=40 [SP =20 lnxr=—24 ExY 26 ~26 onde 7 deem = 70.919 VEX? Ey? 
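The regression coefficient of y on x is
$$r\,\frac{\sigma_y}{\sigma_x} = \frac{\sum XY}{\sum X^2} = \frac{-26}{40} = -0.65.$$
The equation of the line of regression of y on x is
$y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}(x - \bar{x}) \Rightarrow y - 8 = -0.65(x - 6) \Rightarrow y = -0.65x + 11.9$
The regression coefficient of x on y is
$$r\,\frac{\sigma_x}{\sigma_y} = \frac{\sum XY}{\sum Y^2} = \frac{-26}{20} = -1.3.$$
The equation of the line of regression of x on y is
$x - \bar{x} = r\,\dfrac{\sigma_x}{\sigma_y}(y - \bar{y}) \Rightarrow x - 6 = -1.3(y - 8) \Rightarrow x = -1.3y + 16.4$
Hence, the equation of the line of regression of y on x is y = -0.65x + 11.9, and the equation of the line of regression of x on y is x = -1.3y + 16.4. Ans.

Example 35. The following data regarding the heights (y) and the weights (x) of 100 college students are given:
$\sum x = 15000$, $\sum x^2 = 2272500$, $\sum y = 6800$, $\sum y^2 = 463025$, $\sum xy = 1022250$
Find the correlation coefficient between height and weight and state the equation of regression of height on weight.
Solution.
$\bar{x} = \dfrac{15000}{100} = 150$, $\quad \bar{y} = \dfrac{6800}{100} = 68$
$$\sigma_x = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{2272500}{100} - \left(\frac{15000}{100}\right)^2} = \sqrt{22725 - 22500} = \sqrt{225} = 15$$
$$\sigma_y = \sqrt{\frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2} = \sqrt{\frac{463025}{100} - \left(\frac{6800}{100}\right)^2} = \sqrt{4630.25 - 4624} = \sqrt{6.25} = 2.5$$
$$r = \frac{\dfrac{\sum xy}{n} - \bar{x}\,\bar{y}}{\sigma_x \sigma_y} = \frac{\dfrac{1022250}{100} - (150)(68)}{15 \times 2.5} = \frac{10222.5 - 10200}{37.5} = \frac{22.5}{37.5} = 0.6$$
For the regression equation of y on x, we have
$y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}(x - \bar{x})$
$\Rightarrow y - 68 = 0.6\left(\dfrac{2.5}{15}\right)(x - 150)$
$\Rightarrow y - 68 = \dfrac{1}{10}(x - 150)$
$\Rightarrow 10y - 680 = x - 150$
$\Rightarrow 10y = x + 530$ Ans.

7.16 MULTIPLE REGRESSION
We know that the production of wheat depends not only on the amount of rainfall $x_1$, but also on the fertilizer $x_2$, pesticides $x_3$, quality of seeds $x_4$, quality of soil $x_5$, etc.
In a multiple regression the dependent variable is a function of more than one independent variable.
Multiple linear regression is a linear relationship between y and $x_1, x_2, x_3, \ldots$:
$$y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_3 + \cdots$$
In multiple non-linear regression the equation is not linear; for example, it may contain powers of the independent variables other than the first.

NON-LINEAR REGRESSION
Example 36. Fit a non-linear relationship (second degree) to the following data:
x | 1   | 2   | 3   | 4
y | 1.7 | 1.8 | 2.3 | 3.2
(U.P. III Semester, June 2009)
Solution. Here, we have
x | y   | x^2 | xy  | x^3 | x^4 | x^2 y
1 | 1.7 | 1   | 1.7 | 1   | 1   | 1.7
2 | 1.8 | 4   | 3.6 | 8   | 16  | 7.2
3 | 2.3 | 9   | 6.9 | 27  | 81  | 20.7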
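4 | 3.2 | 16  | 12.8 | 64  | 256 | 51.2
Total: $\sum x = 10$, $\sum y = 9.0$, $\sum x^2 = 30$, $\sum xy = 25.0$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum x^2 y = 80.8$
Let the non-linear relationship be
$$y = a_0 + a_1 x + a_2 x^2.$$
The normal equations are
$$\sum y = n a_0 + a_1\sum x + a_2\sum x^2$$
$$\sum xy = a_0\sum x + a_1\sum x^2 + a_2\sum x^3$$
$$\sum x^2 y = a_0\sum x^2 + a_1\sum x^3 + a_2\sum x^4$$
On substituting the values of $\sum y$, n, $\sum x$, $\sum x^2$, etc. in these equations, we get
$9 = 4a_0 + 10a_1 + 30a_2$
$25 = 10a_0 + 30a_1 + 100a_2$
$80.8 = 30a_0 + 100a_1 + 354a_2$
Solving these equations, we get $a_0 = 2$, $a_1 = -0.5$ and $a_2 = 0.2$.
Then the non-linear relationship is
$$y = 2 - 0.5x + 0.2x^2 \quad \text{Ans.}$$

7.17 ERROR OF PREDICTION
The deviation of the predicted value from the observed value is known as the standard error of prediction. It is given by
$$E_{yx} = \sqrt{\frac{\sum (y - y_e)^2}{n}},$$
where y is the actual value and $y_e$ the predicted value.

Example 37. Prove that
(i) $E_{yx} = \sigma_y\sqrt{1 - r^2}$ (ii) $E_{xy} = \sigma_x\sqrt{1 - r^2}$
Solution. (i) The predicted value $y_e$ lies on the line of regression of y on x, so
$$y_e - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}(x - \bar{x}).$$
Hence
$$E_{yx}^2 = \frac{1}{n}\sum (y - y_e)^2 = \frac{1}{n}\sum\left[(y - \bar{y}) - r\,\frac{\sigma_y}{\sigma_x}(x - \bar{x})\right]^2$$
$$= \frac{1}{n}\sum (y - \bar{y})^2 - 2r\,\frac{\sigma_y}{\sigma_x}\cdot\frac{1}{n}\sum (x - \bar{x})(y - \bar{y}) + r^2\,\frac{\sigma_y^2}{\sigma_x^2}\cdot\frac{1}{n}\sum (x - \bar{x})^2$$
$$= \sigma_y^2 - 2r\,\frac{\sigma_y}{\sigma_x}\,(r\sigma_x\sigma_y) + r^2\,\frac{\sigma_y^2}{\sigma_x^2}\,\sigma_x^2 = \sigma_y^2(1 - r^2),$$
so $E_{yx} = \sigma_y\sqrt{1 - r^2}$. Proved.
(ii) Similarly, (ii) may be proved.

Example 38. Find the standard error of estimate of y on x for the data given below:
x | 1 | 3 | 4 | 6 | 8 | 9 | 11 | 14
y | 1 | 2 | 4 | 4 | 5 | 7 | 8  | 9
Solution. The equation of the line of regression of y on x is
$$y_e = \frac{6}{11} + \frac{7}{11}x \qquad \text{(see Example 29)}$$
S. No. | x  | y | y_e    | y - y_e | (y - y_e)^2
1      | 1  | 1 | 13/11  | -2/11   | 4/121
2      | 3  | 2 | 27/11  | -5/11   | 25/121
3      | 4  | 4 | 34/11  | 10/11   | 100/121
4      | 6  | 4 | 48/11  | -4/11   | 16/121
5      | 8  | 5 | 62/11  | -7/11   | 49/121
6      | 9  | 7 | 69/11  | 8/11    | 64/121
7      | 11 | 8 | 83/11  | 5/11    | 25/121
8      | 14 | 9 | 104/11 | -5/11   | 25/121
$\sum (y - y_e)^2 = \dfrac{308}{121}$
$$E_{yx} = \sqrt{\frac{\sum (y - y_e)^2}{n}} = \sqrt{\frac{308}{121 \times 8}} \approx 0.564 \quad \text{Ans.}$$

7.18 RELATION BETWEEN REGRESSION ANALYSIS AND CORRELATION ANALYSIS
Correlation Analysis
1. The relationship between two variables is given by the coefficient of correlation. It is a measure of the direction and degree of the relationship between x and y.
2. Here $r_{xy} = r_{yx}$.
3. It does not reflect upon the nature of the variables (dependent or independent).
4. It does not imply a cause and effect relationship between the variables.
5. It is a relative measure and has no units.
6. It indicates the degree of association.
7. It is confined to the study of linear relationships.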
Regression Analysis

1. In this case some points are stepped up and some are stepped down to arrive at an average value.
2. $b_{yx}$ and $b_{xy}$ are mathematical measures of the average relationship between the two variables. In general $b_{yx} \neq b_{xy}$.
3. It indicates which is the dependent variable and which is the independent variable.
4. It indicates the cause and effect relationship between the variables.
5. It is an absolute measure.
6. It is used to forecast the value of the dependent variable when the value of the independent variable is given.
7. It applies not only to linear relationships but also to non-linear ones.

EXERCISE 7.2

1. Find the regression line of y on x for the data:
   [data table illegible]
   Ans. $y = 2.7 + 0.1x$

2. Compute the regression line of y on x for the following data:
   [data table illegible]
   Ans. [illegible]

3. Find the equations of the regression lines for the following values of x and y:
   [data table illegible]
   Ans. Line of regression of y on x: [illegible]; regression equation of x on y: [illegible]

4. Compute the regression line of y on x for the following data:
   [data table illegible]
   Ans. $y = 6 - x$

5. Find the regression lines from the given data:

   | x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
   | y | 10 | 12 | 16 | 28 | 25 | 36 | 41 | 49 | 40 | 50 |

   Ans. $x = 0.2y - 0.64$, $y = 4.69x + 4.9$

6. Find the equations to the lines of regression and the coefficient of correlation for the following data:
   [data table illegible]
   Ans. Line of y on x: [illegible]; $x - 6 = -0.632\,(y - 10)$

7. Obtain the lines of regression for the following data:
   [data table illegible]
   Obtain an estimate of y which should correspond on the average to $x = 6.2$.
   Ans. $y - 12 = 0.95\,(x - 5)$ and $x - 5 = 0.95\,(y - 12)$. Required estimate is 13.14.

8. For the following data, find the equation of the line of regression:

   | x | 10 | 12 | 13 | 16 | 17 | 20 | 25 |
   | y | 10 | 22 | 24 | 27 | 29 | 33 | 37 |

   Ans. $y = 0.82 + 1.56x$
9. Find the regression line of y on x if
   [data table illegible]
   Ans. $y = 0.55 + 0.0583x$

10. The following marks have been obtained by a class of students in statistics:

    | Paper I | 80 | 45 | 55 | 56 | 58 | 60 | 65 | 68 | 70 | 75 | 85 |
    | Paper II | 81 | 56 | 50 | 48 | 60 | 62 | 64 | 65 | 70 | 74 | 90 |

    Compute the coefficient of correlation for the above data. Find the lines of regression.
    Ans. $r = 0.918$, $y - 65.45 = 0.981\,(x - 65.18)$, $x - 65.18 = 0.859\,(y - 65.45)$

11. The following results were obtained from records of age (x) and systolic blood pressure (y) of a group of 10 men:

    | | x | y |
    | Mean | 53 | 142 |
    | Variance | 130 | 165 |

    and $\Sigma (x - \bar{x})(y - \bar{y}) = 1220$.
    Find the appropriate regression equation and use it to estimate the blood pressure of a man whose age is 45.
    Ans. $y = 0.94x + 92.26$, blood pressure = 134.56

12. The following results were obtained from marks in Applied Mechanics and Engineering Mathematics in an examination:

    | | Applied Mechanics (x) | Engg. Maths (y) |
    | Mean | 47.5 | 39.5 |
    | Standard deviation | 16.8 | 10.8 |

    $r = 0.95$
    Find both the regression equations. Also estimate the value of y for $x = 30$.
    Ans. $y = 0.611x + 10.5$, $x = 1.478y - 10.88$, $y = 28.83$

13. If two regression coefficients are 0.8 and 0.2, what would be the value of the coefficient of correlation?
    Ans. $r = 0.4$

14. The regression equations are $7x - 16y + 9 = 0$, $5y - 4x - 3 = 0$; find $\bar{x}$, $\bar{y}$ and $r$. (A.M.I.E., Winter 2003)
    Ans. $r = 0.7395$, $\bar{x} = -0.1034$, $\bar{y} = 0.5172$

15. The following regression equations and variances are obtained from a correlation table:
    $20x - 9y - 107 = 0$, $4x - 5y + 33 = 0$, variance of $x = 9$.
    Find (i) the mean values of x and y, (ii) the standard deviation of y. (A.M.I.E., Winter 2000)
    Ans. $\bar{x} = 13$, $\bar{y} = 17$, $\sigma_y = 4$

16. Two random variables have the least square regression lines with equations $3x + 2y = 26$ and $6x + y = 31$. Find the mean values and the correlation coefficient between x and y.
    Ans. $\bar{x} = 4$, $\bar{y} = 7$, $r = -0.5$

17. The regression equations of two variables x and y are $x = 0.7y + 5.2$, $y = 0.3x + 2.8$. Find the means of the variables and the coefficient of correlation between them.

18. Two lines of regression are given by $x + 2y = 5$ and $2x + 3y = 8$. Calculate: (i) the mean values of x and y, (ii) the coefficient of correlation, (iii) the ratio of the regression coefficients.
    Ans. $\bar{x} = 1$, $\bar{y} = 2$, $r = -0.866$

19. Obtain normal equations for fitting a curve of the form $Y = ax + \dfrac{b}{x}$ for n points $(x_i, y_i)$, $i = 1, 2, \dots, n$.
    Ans. $\Sigma xy = nb + a\,\Sigma x^2$, $\Sigma \dfrac{y}{x} = na + b\,\Sigma \dfrac{1}{x^2}$

20. Fill in the blanks:
    (a) The correlation coefficient is the ...... mean between the regression coefficients.
    (b) The lines of regression always pass through a point ......
    (c) The arithmetic mean of the coefficients of regression is ...... than the coefficient of correlation.
    (d) If the two regression lines are perpendicular to each other, then the coefficient of correlation is equal to ......
    (e) If two regression coefficients are $-0.1$ and $-0.9$, the value of r is ......
    (f) The normal equations for fitting a curve of the form $y = a + bx + cx^2$ are ......
    (g) If $r_1$ and $r_2$ are two regression coefficients, then the signs of $r_1$, $r_2$ depend on ......
    (h) If the coefficient of correlation $r = 0$, the two lines of regression are ......
    (i) If two regression lines coincide then the coefficient of correlation is ...... (A.M.I.E., Winter 2000)

    Ans. (a) geometric, (b) $(\bar{x}, \bar{y})$, (c) greater, (d) 0, (e) $-0.3$,
    (f) $\Sigma y = na + b\,\Sigma x + c\,\Sigma x^2$, $\Sigma xy = a\,\Sigma x + b\,\Sigma x^2 + c\,\Sigma x^3$ and $\Sigma x^2 y = a\,\Sigma x^2 + b\,\Sigma x^3 + c\,\Sigma x^4$,
    (g) the sign of the coefficient of correlation, (h) perpendicular, (i) $\pm 1$
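
The hand computations of this chapter can be cross-checked numerically. The following is a minimal Python sketch and not part of the original text: the function names (`fit_line`, `standard_error`, `solve3`, `fit_quadratic`) are illustrative assumptions, and exact fractions are used only so that results such as 6/11 and 7/11 appear without rounding. It computes the line of regression of y on x and its standard error for the data of Example 38, and solves the three normal equations of the quadratic fit for the data of Example 36.

```python
from fractions import Fraction
from math import sqrt


def fit_line(xs, ys):
    """Least-squares line of y on x; returns (a, b) with y = a + b*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # b_yx = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - (Sum(x))^2)
    b = Fraction(n * sxy - sx * sy) / Fraction(n * sxx - sx * sx)
    # The line passes through the point of means (x-bar, y-bar).
    a = Fraction(sy) / n - b * Fraction(sx) / n
    return a, b


def standard_error(xs, ys, a, b):
    """sqrt(Sum((y - y_p)^2) / n), where y_p = a + b*x is the predicted value."""
    n = len(xs)
    total = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(total / n)


def solve3(m, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with exact fractions."""
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(m, rhs)]
    for col in range(3):
        # Pick a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, 3) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(3):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[3] for row in aug]


def fit_quadratic(xs, ys):
    """Coefficients [a0, a1, a2] of y = a0 + a1*x + a2*x^2 from the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # s[k] = Sum(x^k), s[0] = n
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # Sum(x^k * y)
    m = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    return solve3(m, t)


# Data of Example 38: regression line of y on x and its standard error.
xs38 = [1, 3, 4, 6, 8, 9, 11, 14]
ys38 = [1, 2, 4, 4, 5, 7, 8, 9]
a, b = fit_line(xs38, ys38)
print("line:", a, b)                                                   # line: 6/11 7/11
print("standard error:", round(standard_error(xs38, ys38, a, b), 3))   # standard error: 0.564

# Data of Example 36: quadratic fit y = a0 + a1*x + a2*x^2.
print("quadratic:", [str(c) for c in fit_quadratic([1, 2, 3, 4], [17, 18, 23, 32])])
# quadratic: ['20', '-5', '2']
```

The same `fit_line` call can be used to check the answers of Exercise 7.2 wherever the data table is available; for instance, the data of Exercise 8 give a slope close to 1.56.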
