Topic 3: Curve Fitting Method: Linear Regression
Linear Regression
Linear regression is the most popular regression model. In this model, we wish to predict the response to $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ by a regression model given by

y = a_0 + a_1 x    (1)

where $a_0$ and $a_1$ are the constants of the regression model.
A measure of the goodness of fit, that is, how well $a_0 + a_1 x$ predicts the response variable $y$, is the magnitude of the residual $E_i$ at each of the $n$ data points,

E_i = y_i - (a_0 + a_1 x_i)    (2)

Ideally, if all the residuals $E_i$ are zero, one may have found an equation in which all the points lie on the model. Thus, minimization of the residuals is an objective of obtaining the regression coefficients.
The most popular method to minimize the residuals is the least squares method, where the estimates of the constants of the model are chosen such that the sum of the squared residuals is minimized, that is, minimize $\sum_{i=1}^{n} E_i^2$.
Why minimize the sum of the squares of the residuals? Why not, for instance, minimize the sum of the residuals or the sum of the absolute values of the residuals? Alternatively, the constants of the model could be chosen such that the average residual is zero without making the individual residuals small. Will any of these criteria yield unbiased parameters with the smallest variance? All of these questions will be answered below. Look at the data in Table 1.
Assume the best fit of this data is given by the straight-line model

y = 4x - 4    (4)
Figure 1 Plot of the regression model y = 4x - 4 and the data points (y versus x).
For this model, the sum of the residuals is $\sum_{i=1}^{4} E_i = 0$, as shown in Table 2.
So does this give us the smallest error? It does, as $\sum_{i=1}^{4} E_i = 0$. But it does not give unique values for the parameters of the model. The straight-line model

y = 6    (5)

also makes $\sum_{i=1}^{4} E_i = 0$, as shown in Table 3.
Figure 2 Plot of the regression model y = 6 and the data points (y versus x).
Since this criterion does not give a unique regression model, it cannot be used for finding the regression coefficients. Let us see why we cannot use this criterion for any general data. We want to minimize

\sum_{i=1}^{n} E_i = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)    (6)

Differentiating Equation (6) with respect to $a_0$ and $a_1$, we get

\frac{\partial}{\partial a_0} \sum_{i=1}^{n} E_i = -n    (7)

\frac{\partial}{\partial a_1} \sum_{i=1}^{n} E_i = -\sum_{i=1}^{n} x_i    (8)

Setting these derivatives equal to zero gives $n = 0$, but that is not possible. Therefore, unique values of $a_0$ and $a_1$ do not exist.
You may think that the reason the minimization criterion $\sum_{i=1}^{n} E_i$ does not work is that negative residuals cancel positive residuals. So is minimizing $\sum_{i=1}^{n} |E_i|$ better? Let us look at the data given in Table 2 for the model $y = 4x - 4$. It makes $\sum_{i=1}^{4} |E_i| = 4$, as shown in Table 4. The value of $\sum_{i=1}^{4} |E_i| = 4$ also exists for the straight-line model $y = 6$. No other straight-line model for this data has $\sum_{i=1}^{4} |E_i| < 4$. Again, we find the regression coefficients are not unique, and hence this criterion also cannot be used for finding the regression model.
Let us use the least squares criterion where we minimize

S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2    (9)

To find $a_0$ and $a_1$ that minimize $S_r$, we differentiate with respect to each constant and set the result equal to zero:

\frac{\partial S_r}{\partial a_0} = 2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-1) = 0    (10)

\frac{\partial S_r}{\partial a_1} = 2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)(-x_i) = 0    (11)
giving

-\sum_{i=1}^{n} y_i + \sum_{i=1}^{n} a_0 + \sum_{i=1}^{n} a_1 x_i = 0    (12)

-\sum_{i=1}^{n} y_i x_i + \sum_{i=1}^{n} a_0 x_i + \sum_{i=1}^{n} a_1 x_i^2 = 0    (13)
Noting that $\sum_{i=1}^{n} a_0 = a_0 + a_0 + \ldots + a_0 = n a_0$,

n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i    (14)

a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i    (15)
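Equations (14) and (15) are a 2-by-2 linear system in $a_0$ and $a_1$, so they can also be solved numerically. Below is a minimal sketch in Python; the function name and the use of NumPy are illustrative choices, not part of the text.

```python
import numpy as np

def fit_line_normal_equations(x, y):
    """Solve the normal equations (14)-(15) for the intercept a0 and slope a1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Left-hand-side matrix and right-hand-side vector of Equations (14) and (15)
    A = np.array([[n,       x.sum()],
                  [x.sum(), (x**2).sum()]])
    b = np.array([y.sum(), (x * y).sum()])
    a0, a1 = np.linalg.solve(A, b)
    return a0, a1
```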
Figure 3 Linear regression of y versus x data showing the residual $E_i = y_i - a_0 - a_1 x_i$ of a typical point $(x_i, y_i)$ about the regression line $y = a_0 + a_1 x$.
Solving Equations (14) and (15) simultaneously gives

a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}    (16)

a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}    (17)
Redefining

S_{xy} = \sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}    (18)

S_{xx} = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2    (19)

\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}    (20)

\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}    (21)
we can rewrite

a_1 = \frac{S_{xy}}{S_{xx}}    (22)

a_0 = \bar{y} - a_1 \bar{x}    (23)
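Equations (18) through (23) give $a_0$ and $a_1$ in closed form, which translates directly into a few lines of code. Here is a short Python sketch of that route (the function name is illustrative):

```python
def fit_line_least_squares(x, y):
    """Compute a1 = Sxy/Sxx and a0 = ybar - a1*xbar, per Equations (18)-(23)."""
    n = len(x)
    xbar = sum(x) / n                                             # Equation (20)
    ybar = sum(y) / n                                             # Equation (21)
    Sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar  # Equation (18)
    Sxx = sum(xi * xi for xi in x) - n * xbar * xbar              # Equation (19)
    a1 = Sxy / Sxx                                                # Equation (22)
    a0 = ybar - a1 * xbar                                         # Equation (23)
    return a0, a1
```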
Example 1
The torque $T$ needed to turn the torsional spring of a mousetrap through an angle $\theta$ is given in Table 5. Find the constants $k_1$ and $k_2$ of the regression model $T = k_1 + k_2 \theta$.

Table 5 Torque versus angle for a torsion spring.

Angle, θ (radians)    Torque, T (N·m)
0.698132              0.188224
0.959931              0.209138
1.134464              0.230052
1.570796              0.250965
1.919862              0.313707
Solution
Table 6 shows the summations needed for the calculation of the constants of the regression model.

Table 6 Summations for the calculation of the constants of the model (n = 5).

\sum_{i=1}^{5} \theta_i = 6.2831 rad
\sum_{i=1}^{5} T_i = 1.1921 N·m
\sum_{i=1}^{5} \theta_i^2 = 8.8491 rad²
\sum_{i=1}^{5} \theta_i T_i = 1.5896 N·m·rad
k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2}
    = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2}
    = 9.6091 \times 10^{-2} N·m/rad
\bar{T} = \frac{\sum_{i=1}^{5} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1} N·m

\bar{\theta} = \frac{\sum_{i=1}^{5} \theta_i}{n} = \frac{6.2831}{5} = 1.2566 radians
k_1 = \bar{T} - k_2 \bar{\theta} = 2.3842 \times 10^{-1} - (9.6091 \times 10^{-2})(1.2566) = 1.1767 \times 10^{-1} N·m

Hence the regression model is

T = 1.1767 \times 10^{-1} + 9.6091 \times 10^{-2} \theta
Figure 4 Linear regression of torque vs. angle data
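As a check on the hand calculation, the data of Table 5 can be run through the fit_line_least_squares sketch given after Equation (23); the printed values should match $k_1$ and $k_2$ above.

```python
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]   # angle, radians
torque = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]  # torque, N·m

k1, k2 = fit_line_least_squares(theta, torque)
print(k1, k2)  # approximately 1.1767e-1 N·m and 9.6091e-2 N·m/rad
```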
Example 2
To find the longitudinal modulus of a composite material, the data given in Table 7 is collected. Find the longitudinal modulus $E$ using the regression model

\sigma = E \varepsilon    (24)

Solution
Rewriting the data from Table 7 as stress versus strain data in Table 8 and applying the least squares method, the residual $\gamma_i$ at each data point is

\gamma_i = \sigma_i - E \varepsilon_i
The sum of the squares of the residuals is

S_r = \sum_{i=1}^{n} \gamma_i^2 = \sum_{i=1}^{n} (\sigma_i - E \varepsilon_i)^2

Differentiating with respect to the only unknown, $E$, and setting the derivative equal to zero,

\frac{dS_r}{dE} = 2 \sum_{i=1}^{n} (\sigma_i - E \varepsilon_i)(-\varepsilon_i) = 0

gives

E = \frac{\sum_{i=1}^{n} \sigma_i \varepsilon_i}{\sum_{i=1}^{n} \varepsilon_i^2}    (25)
Note that so far Equation (25) has only been shown to correspond to a local minimum or maximum of $S_r$. Can you show that it corresponds to an absolute minimum?
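One way to answer this, sketched here since the text leaves it as an exercise, is to take the second derivative of $S_r$ with respect to $E$:

```latex
% Second derivative of S_r with respect to the single unknown E:
\frac{d^2 S_r}{dE^2}
  = \frac{d}{dE}\left[ -2 \sum_{i=1}^{n} \varepsilon_i \left( \sigma_i - E \varepsilon_i \right) \right]
  = 2 \sum_{i=1}^{n} \varepsilon_i^2 > 0
```

Since $S_r$ is a quadratic in the single variable $E$ with positive leading coefficient $\sum_{i=1}^{n} \varepsilon_i^2$, its lone critical point, Equation (25), is the absolute minimum.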
The summations used in Equation (25) are given in Table 9.

Table 9 Tabulation for Example 2 for needed summations.

i     ε             σ (Pa)         ε²             εσ (Pa)
1     0.0000        0.0000         0.0000         0.0000
2     1.8300×10⁻³   3.0600×10⁸     3.3489×10⁻⁶    5.5998×10⁵
3     3.6000×10⁻³   6.1200×10⁸     1.2960×10⁻⁵    2.2032×10⁶
4     5.3240×10⁻³   9.1700×10⁸     2.8345×10⁻⁵    4.8821×10⁶
5     7.0200×10⁻³   1.2230×10⁹     4.9280×10⁻⁵    8.5855×10⁶
6     8.6700×10⁻³   1.5290×10⁹     7.5169×10⁻⁵    1.3256×10⁷
7     1.0244×10⁻²   1.8350×10⁹     1.0494×10⁻⁴    1.8798×10⁷
8     1.1774×10⁻²   2.1400×10⁹     1.3863×10⁻⁴    2.5196×10⁷
9     1.3290×10⁻²   2.4460×10⁹     1.7662×10⁻⁴    3.2507×10⁷
10    1.4790×10⁻²   2.7520×10⁹     2.1874×10⁻⁴    4.0702×10⁷
11    1.5000×10⁻²   2.7670×10⁹     2.2500×10⁻⁴    4.1505×10⁷
12    1.5600×10⁻²   2.8960×10⁹     2.4336×10⁻⁴    4.5178×10⁷
∑                                  1.2764×10⁻³    2.3337×10⁸
n = 12

\sum_{i=1}^{12} \varepsilon_i^2 = 1.2764 \times 10^{-3}

\sum_{i=1}^{12} \sigma_i \varepsilon_i = 2.3337 \times 10^{8}

E = \frac{\sum_{i=1}^{12} \sigma_i \varepsilon_i}{\sum_{i=1}^{12} \varepsilon_i^2} = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 182.84 GPa
Figure 5 Linear regression model of stress vs. strain for a composite material.
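The value of $E$ can also be reproduced directly from the Table 9 data; the snippet below is a sketch of that check (the data is from the text).

```python
strain = [0.0, 1.83e-3, 3.6e-3, 5.324e-3, 7.02e-3, 8.67e-3,
          1.0244e-2, 1.1774e-2, 1.329e-2, 1.479e-2, 1.5e-2, 1.56e-2]   # dimensionless
stress = [0.0, 3.06e8, 6.12e8, 9.17e8, 1.223e9, 1.529e9,
          1.835e9, 2.14e9, 2.446e9, 2.752e9, 2.767e9, 2.896e9]         # Pa

# Equation (25): E = sum(sigma*eps) / sum(eps^2)
E = sum(s * e for s, e in zip(stress, strain)) / sum(e * e for e in strain)
print(E / 1e9)  # approximately 182.84 (GPa)
```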
QUESTION:
Given $n$ data pairs $(x_1, y_1), \ldots, (x_n, y_n)$, do the values of the two constants $a_0$ and $a_1$ in the least squares straight-line regression model $y = a_0 + a_1 x$ correspond to the absolute minimum of the sum of the squares of the residuals? Are these constants of regression unique?
ANSWER:
Given $n$ data pairs $(x_1, y_1), \ldots, (x_n, y_n)$, the best fit for the straight-line regression model

y = a_0 + a_1 x    (A.1)

is found by the method of least squares. Starting with the sum of the squares of the residuals $S_r$,

S_r = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2    (A.2)
and using

\frac{\partial S_r}{\partial a_0} = 0    (A.3)

\frac{\partial S_r}{\partial a_1} = 0    (A.4)
gives two simultaneous linear equations whose solution is

a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}    (A.5a)

a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}    (A.5b)
But do these values of $a_0$ and $a_1$ give the absolute minimum of $S_r$ (Equation (A.2))? The first derivative analysis only tells us that these values give a local minimum or maximum of $S_r$, not whether they give an absolute minimum or maximum. So we still need to figure out whether they correspond to an absolute minimum.
We first need to conduct a second derivative test to find out whether the point $(a_0, a_1)$ from Equation (A.5) gives a local minimum or local maximum of $S_r$. Only then can we proceed to show whether this local minimum (or maximum) also corresponds to the absolute minimum (or maximum).
What is the second derivative test for a local minimum of a function of two variables? If we have a function $f(x, y)$ and have found a critical point $(a, b)$ from the first derivative test, then $(a, b)$ is a local minimum if

\frac{\partial^2 f}{\partial x^2} \frac{\partial^2 f}{\partial y^2} - \left( \frac{\partial^2 f}{\partial x \, \partial y} \right)^2 > 0, and    (A.6)

\frac{\partial^2 f}{\partial x^2} > 0 \text{ OR } \frac{\partial^2 f}{\partial y^2} > 0    (A.7)
From Equation (A.2),

\frac{\partial S_r}{\partial a_0} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-1) = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)    (A.8)

\frac{\partial S_r}{\partial a_1} = \sum_{i=1}^{n} 2 (y_i - a_0 - a_1 x_i)(-x_i) = -2 \sum_{i=1}^{n} (x_i y_i - a_0 x_i - a_1 x_i^2)    (A.9)
then

\frac{\partial^2 S_r}{\partial a_0^2} = -2 \sum_{i=1}^{n} (-1) = 2n    (A.10)

\frac{\partial^2 S_r}{\partial a_1^2} = 2 \sum_{i=1}^{n} x_i^2    (A.11)

\frac{\partial^2 S_r}{\partial a_0 \partial a_1} = 2 \sum_{i=1}^{n} x_i    (A.12)
So condition (A.7) is satisfied, because Equation (A.10) shows that $2n$ is a positive number. Although not required, Equation (A.11) shows that $2 \sum_{i=1}^{n} x_i^2$ is also positive, since it is reasonable to assume that the $x$ data points are not all zero.
Is the other condition, Equation (A.6), for $S_r$ being a minimum met? Yes, it is (the last step below, rewriting the difference as a sum of squared pairwise differences, is stated without proof):

\frac{\partial^2 S_r}{\partial a_0^2} \frac{\partial^2 S_r}{\partial a_1^2} - \left( \frac{\partial^2 S_r}{\partial a_0 \partial a_1} \right)^2 = (2n) \left( 2 \sum_{i=1}^{n} x_i^2 \right) - \left( 2 \sum_{i=1}^{n} x_i \right)^2
= 4 \left[ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right]
= 4 \sum_{\substack{i, j = 1 \\ i < j}}^{n} (x_i - x_j)^2 > 0    (A.13)
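The identity used in the last step of Equation (A.13) is easy to spot-check numerically; the sketch below uses made-up sample data purely for the check.

```python
x = [2.0, 3.0, 5.0, 7.0]   # arbitrary sample data, not from the text
n = len(x)

# n*sum(x^2) - (sum(x))^2 ...
lhs = n * sum(xi * xi for xi in x) - sum(x) ** 2
# ... equals the sum of squared pairwise differences over i < j
rhs = sum((x[i] - x[j]) ** 2 for i in range(n) for j in range(i + 1, n))
print(lhs, rhs)  # both print 59.0
```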
So the values of $a_0$ and $a_1$ in Equation (A.5) do correspond to a local minimum of $S_r$. But is this local minimum also an absolute minimum? Yes: as given by Equation (A.5), the first derivatives of $S_r$ are zero at only one point, and since $S_r$ is a quadratic function of $a_0$ and $a_1$, this single critical point is its absolute minimum. This observation also makes the straight-line regression model based on least squares unique.
As a side note, the denominator in Equations (A.5) is nonzero, as shown by Equation (A.13). This shows that the values of $a_0$ and $a_1$ are finite.