M10 Derive Least Squares

Linear regression equations are derived by minimizing the sum of squared residuals (SSE) between the observed and predicted y-values. Taking the derivatives of SSE with respect to the intercept (a) and slope (b) parameters and setting them equal to 0 yields two equations with two unknowns that can be solved for the intercept and slope. This process results in the standard linear regression equations that relate the intercept and slope to the means, sums, and cross-products of the x and y variables in the data. Calculators and computers use these standard equations to quickly calculate the linear regression coefficients from input data.


Where do the linear regression equations come from?

Time out for a calculus break

We want to minimize the sum of the squared residuals:

    SSE = \sum_{\text{all data}} (y - \hat{y})^2

But \hat{y} = a + bx, so we can substitute into SSE to get:

    SSE = \sum_{\text{all data}} (y - a - bx)^2
Since we want to find the values of a and b that make SSE a minimum, a and b are the variables.
Take the derivative of SSE with respect to a and the derivative of SSE with respect to b, then set both
derivatives equal to 0 to obtain equations that we will later solve for a and b.

    \frac{\partial \text{SSE}}{\partial a} = \sum_{\text{all data}} 2(y - a - bx)(-1) = -2 \sum_{\text{all data}} (y - a - bx) = 0

    \frac{\partial \text{SSE}}{\partial b} = \sum_{\text{all data}} 2(y - a - bx)(-x) = -2 \sum_{\text{all data}} x(y - a - bx) = -2 \sum_{\text{all data}} (xy - ax - bx^2) = 0

By breaking up the sums, we can "simplify" this into two equations in the two unknowns a and b:

    -\sum_{\text{all data}} y + na + b \sum_{\text{all data}} x = 0

    -\sum_{\text{all data}} xy + a \sum_{\text{all data}} x + b \sum_{\text{all data}} x^2 = 0
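As a concrete illustration (not part of the original handout), these two equations are an ordinary 2-by-2 linear system in a and b, so they can be solved numerically. This sketch uses Cramer's rule on sums computed from a small made-up data set; the function name and data are my own choices:

```python
# Solve the two "normal equations"
#     n*a      + (sum x)*b   = sum y
#     (sum x)*a + (sum x^2)*b = sum xy
# for the intercept a and slope b using Cramer's rule.
def normal_equations_fit(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of the coefficient matrix
    a = (sy * sxx - sx * sxy) / det  # intercept
    b = (n * sxy - sx * sy) / det    # slope
    return a, b

# Points that lie exactly on y = 1 + 2x, so the fit recovers a = 1, b = 2.
a, b = normal_equations_fit([0, 1, 2, 3], [1, 3, 5, 7])
```

For data with no exact linear relationship, the same code returns the a and b that minimize SSE.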

These equations are linear in a and b, so they are not “difficult” to solve, although the algebra requires a
lot of care and patience because the coefficients of the variables a and b are sums. Some cleverness in
substituting means for sums helps to further “simplify” the equations to make them easier to work with.
Solving these equations to obtain the values of a and b that will minimize the SSE gives us:
    a = \frac{\sum_{\text{all data}} y - b \sum_{\text{all data}} x}{n} = \bar{y} - b\bar{x}
Substituting a = \bar{y} - b\bar{x} into the second equation:

    -\sum_{\text{all data}} xy + (\bar{y} - b\bar{x}) \sum_{\text{all data}} x + b \sum_{\text{all data}} x^2 = 0

    -\sum_{\text{all data}} xy + \bar{y} \sum_{\text{all data}} x - b\bar{x} \sum_{\text{all data}} x + b \sum_{\text{all data}} x^2 = 0

Then, using \sum x = n\bar{x}:

    -\sum_{\text{all data}} xy + n\bar{x}\bar{y} - bn\bar{x}^2 + b \sum_{\text{all data}} x^2 = 0
Finally,

    b = \frac{\sum_{\text{all data}} xy - n\bar{x}\bar{y}}{\sum_{\text{all data}} x^2 - n\bar{x}^2}

After finding b, substitute its value to find a using a = \bar{y} - b\bar{x}.
Your calculator is very good at doing this type of tedious, repetitive calculation quickly. It has these
formulas programmed into it and uses them with the data you input to quickly calculate the values of a and b.

If you want more information about the theory and derivation of the equations for simple linear
regression, correlation, and the coefficient of determination, visit the MathWorld website:
http://mathworld.wolfram.com/LeastSquaresFitting.html
