L2 SLR model 2

The document discusses the Ordinary Least Squares (OLS) estimator for simple linear regression, detailing how to derive the estimators for the slope and intercept by minimizing the sum of squared residuals. It includes an example using data on samosa consumption in relation to price, demonstrating the regression analysis and interpretation of results. Additionally, it covers properties of the estimators, including unbiasedness and efficiency, and introduces the concepts of total, explained, and residual sums of squares.

GC code: trkn2ml

Simple Linear Regression Model cont.


The OLS estimator
Recap: Given the three assumptions outlined earlier, we showed that
we could choose β̂0 and β̂1 by OLS, i.e. by minimizing the sum of
squared residuals (SSR):

\[
\min_{\hat\beta_0,\hat\beta_1} \; S = \sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right]^2
\]

First order conditions:


\[
\frac{\partial S}{\partial \hat\beta_0} = -2\sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right] = 0
\;\Rightarrow\; \sum_{i=1}^{n} y_i = n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_i \qquad (1)
\]

\[
\frac{\partial S}{\partial \hat\beta_1} = -2\sum_{i=1}^{n} \left[ y_i - \hat\beta_0 - \hat\beta_1 x_i \right] x_i = 0
\;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = \hat\beta_0 \sum_{i=1}^{n} x_i + \hat\beta_1 \sum_{i=1}^{n} x_i^2 \qquad (2)
\]

(1) and (2) are referred to as the normal equations.


Note that dividing (1) through by n gives
\[
\bar y = \hat\beta_0 + \hat\beta_1 \bar x \;\Rightarrow\; \hat\beta_0 = \bar y - \hat\beta_1 \bar x .
\]
Plugging this into (2) and recognizing that

• \(\sum_i x_i = n\bar x\), where \(\bar x\) is the sample mean of x
• \(\sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i x_i y_i - n\bar x \bar y\)
• \(\sum_i (x_i - \bar x)^2 = \sum_i x_i^2 - n\bar x^2\)

we get \(\sum_i x_i y_i - n\bar x \bar y = \hat\beta_1 \left( \sum_i x_i^2 - n\bar x^2 \right)\), so that

\[
\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^{n}(x_i - \bar x)^2} .
\]
Together with \(\hat\beta_0 = \bar y - \hat\beta_1 \bar x\), these are the OLS estimators of β1 and β0.
There is a close relationship between the estimated slope
coefficient β̂1 and the correlation coefficient. Recall that the
sample correlation coefficient ρ̂ is given by
\[
\hat\rho = \frac{\sum_{i=1}^{n}(x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum_{i=1}^{n}(x_i - \bar x)^2 \, \sum_{i=1}^{n}(y_i - \bar y)^2}}
\]
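The relationship is not spelled out on the slide, but it follows directly from the two formulas above, since β̂1 and ρ̂ share the same numerator:
\[
\hat\beta_1 = \hat\rho \cdot \sqrt{\frac{\sum_{i=1}^{n}(y_i - \bar y)^2}{\sum_{i=1}^{n}(x_i - \bar x)^2}} = \hat\rho \cdot \frac{s_y}{s_x},
\]
where s_y and s_x are the sample standard deviations of y and x. In other words, β̂1 is the correlation coefficient rescaled into units of y per unit of x.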
Example

Eco 221 students of 2024 were asked how many samosas they would buy
per month at various prices. Here are some summary statistics:

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
    quantity |        336    7.639881    7.288393          1         22
       price |        336    14.85714    6.137028          5         25

We now want to ask: by how much would the average consumption of
samosas decrease if the price were increased by one rupee (or by ten
rupees)? To answer, we can regress quantity (y) on price (x).
. reg quantity price

------------------------------------------------------------------------------
quantity | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
price | -.2929574 .062975 -4.65 0.000 -.416835 -.1690799
_cons | 11.99239 1.012088 11.85 0.000 10.00152 13.98326
------------------------------------------------------------------------------

For now we focus only on the "Coefficient" column: β̂1 is
approximately -0.293 and β̂0 (also called the constant) is
approximately 12.
Interpretation: every ₹10 increase in the price of a samosa is
associated with a decline in average consumption of approximately
3 samosas (2.93, rounded up).
Deriving the OLS slope coefficient manually

. gen q1 = quantity - 7.639881    // This generates y_i - ybar

. gen p1 = price - 14.85714       // This generates x_i - xbar

. gen q1p1 = q1*p1                // This generates the product of the two

. gen p1sq = p1^2                 // This generates the square of x_i - xbar

. summ q1p1 p1sq

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
        q1p1 |        336   -11.00085    41.75874  -141.5497   115.2241
        p1sq |        336    37.55102    40.58872    .020409   102.8776

. disp -11.00085*336    // This is the sum of q1p1 (the numerator of the OLS estimator)
-3696.2856

. disp 37.5512*336      // This is the sum of p1sq (the denominator of the OLS estimator)
12617.203

. disp -3696.2856/12617.203
-.29295602    // This is the estimated slope coefficient (with some rounding error)
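The same calculation can be done without typing the sample means in by hand. A minimal sketch (hypothetical variable and scalar names, assuming the samosa data are still in memory):

egen qbar = mean(quantity)               // sample mean of y
egen pbar = mean(price)                  // sample mean of x
gen num_i = (quantity - qbar)*(price - pbar)
gen den_i = (price - pbar)^2
quietly summ num_i
scalar numer = r(sum)                    // sum of (x_i - xbar)(y_i - ybar)
quietly summ den_i
scalar denom = r(sum)                    // sum of (x_i - xbar)^2
disp numer/denom                         // should reproduce -.2929574 up to rounding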
Algebraic properties of the SLR model

The normal equations from the first order conditions imply:

1. The estimated residuals sum to zero: \(\sum_i \hat u_i = 0\). This follows
   from the first order condition (1).
2. The regression line passes through the point of sample means, i.e.
   \(\bar y = \hat\beta_0 + \hat\beta_1 \bar x\), so the residual evaluated at the sample means is
   zero (and \(\bar{\hat y} = \bar y\)). This also follows from the first order condition (1).
3. \(\sum_i x_i \hat u_i = 0\), which follows from the first order condition (2).
   This in turn means that the sample covariance between x and û is zero.
   Why? \(\widehat{\mathrm{Cov}}(x, \hat u) = \overline{x\hat u} - \bar x\,\bar{\hat u} = \overline{x\hat u} = 0\), using property 1 (\(\bar{\hat u} = 0\)).
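These properties are easy to verify numerically. A minimal Stata sketch (hypothetical variable name uhat, assuming the samosa data are in memory):

quietly reg quantity price
predict uhat, resid        // estimated residuals
summ uhat                  // the mean (and hence the sum) should be zero up to rounding
correlate price uhat       // the sample correlation with x should be zero up to rounding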
Partitioning the sums of squares

One consequence of properties 1 and 3 is that the sample covariance
between ŷ and û is zero: ŷ is a linear function of x (\(\hat y_i = \hat\beta_0 + \hat\beta_1 x_i\)),
x is uncorrelated with û by property 3, and adding the constant \(\hat\beta_0\) does
not affect the covariance. Thus \(y_i = \hat y_i + \hat u_i\) is in effect a decomposition
of y into two pieces that are uncorrelated in the sample.
Next we want to ask: how good a job did we do in getting ŷ close to y?
To answer this, define:

• SST (total sum of squares) \(= \sum_{i=1}^{n} (y_i - \bar y)^2\)
• SSE (explained sum of squares) \(= \sum_{i=1}^{n} (\hat y_i - \bar y)^2\)
• SSR (residual sum of squares) \(= \sum_{i=1}^{n} \hat u_i^2\)

Then the above decomposition implies that SST = SSE + SSR.


Proof

\[
SST = \sum_{i=1}^{n} (y_i - \bar y)^2
    = \sum_{i=1}^{n} \left[ (y_i - \hat y_i) + (\hat y_i - \bar y) \right]^2
    = \sum_{i=1}^{n} \left[ \hat u_i + (\hat y_i - \bar y) \right]^2
\]
\[
    = \sum_{i=1}^{n} \hat u_i^2
      + 2\sum_{i=1}^{n} \hat u_i (\hat y_i - \bar y)
      + \sum_{i=1}^{n} (\hat y_i - \bar y)^2
    = SSR + 0 + SSE
\]
The cross term is zero because \(\sum_i \hat u_i \hat y_i = 0\) (by algebraic properties 1
and 3, as noted above) and \(\sum_i \hat u_i \bar y = \bar y \sum_i \hat u_i = 0\) (by property 1).
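This decomposition is easy to check on the samosa data. A sketch in Stata (hypothetical variable and scalar names; self-contained, so it re-runs the regression):

quietly reg quantity price
predict yhat, xb                  // fitted values
predict ures, resid               // residuals
egen ybar = mean(quantity)
gen sst_i = (quantity - ybar)^2
gen sse_i = (yhat - ybar)^2
gen ssr_i = ures^2
quietly summ sst_i
scalar SST = r(sum)
quietly summ sse_i
scalar SSE = r(sum)
quietly summ ssr_i
scalar SSR = r(sum)
disp "SST = " SST "   SSE + SSR = " (SSE + SSR)    // the two should agree up to rounding

The ANOVA block of the regression output shown below reports the same three quantities: Model = SSE, Residual = SSR, Total = SST.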
Goodness of fit

Goodness of fit, R², is then defined as the ratio of SSE to SST:

\[
R^2 = \frac{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}
    = 1 - \frac{\sum_{i=1}^{n} \hat u_i^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}
\]

Clearly, 0 ≤ R² ≤ 1. It is the proportion of the variation in y that is
explained by x. It summarizes how good a job we did in minimizing the
squared-residual distance \(\sum_i \hat u_i^2\), scaled by the total variation in y.
The higher the R², the better the fit.

SST, SSR, SSE and R² in our model

      Source |       SS           df       MS      Number of obs   =       336
-------------+----------------------------------   F(1, 334)       =     21.64
       Model |  1082.85435         1  1082.85435   Prob > F        =    0.0000
    Residual |  16712.5712       334  50.0376384   R-squared       =    0.0609
-------------+----------------------------------   Adj R-squared   =    0.0580
       Total |  17795.4256       335  53.1206734   Root MSE        =    7.0737

------------------------------------------------------------------------------
quantity | Coefficient Std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
price | -.2929574 .062975 -4.65 0.000 -.416835 -.1690799
_cons | 11.99239 1.012088 11.85 0.000 10.00152 13.98326
------------------------------------------------------------------------------
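As a check on the definition above, the R-squared reported in the header can be recovered from the ANOVA block: R² = SSE/SST = 1082.85435/17795.4256 ≈ 0.0609, or equivalently 1 − SSR/SST = 1 − 16712.5712/17795.4256 ≈ 0.0609. Price explains only about 6% of the sample variation in quantity.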
The OLS estimators of β0 and β1 again

Note that while the β0 and β1 are population parameters and are
therefore given scalars, the OLS estimators β̂0 and β̂1 are random
variables. The actual magnitudes of the estimated slopes and
constants depend on the given sample that happens to be drawn.

To see this, let’s randomly split our data set of 336 observations into 15
subsamples and run the same OLS regression on each of them.
. splitsample, generate(samp) nsplit(15)

. tab samp

       samp |      Freq.     Percent        Cum.
------------+-----------------------------------
1 | 22 6.55 6.55
2 | 23 6.85 13.39
3 | 22 6.55 19.94
4 | 23 6.85 26.79
5 | 22 6.55 33.33
6 | 22 6.55 39.88
7 | 23 6.85 46.73
8 | 22 6.55 53.27
9 | 23 6.85 60.12
10 | 22 6.55 66.67
11 | 22 6.55 73.21
12 | 23 6.85 80.06
13 | 22 6.55 86.61
14 | 23 6.85 93.45
15 | 22 6.55 100.00
------------+-----------------------------------
Total | 336 100.00
. reg quantity price if samp==1
------------------------------------------------------------------------------
quantity | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
price | -.3990968 .2349877 -1.70 0.104 -.8877804 .0895869
_cons | 14.91626 4.003795 3.73 0.001 6.58991 23.24261
------------------------------------------------------------------------------

. reg quantity price if samp==2

------------------------------------------------------------------------------
quantity | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
price | -.4074795 .2300374 -1.77 0.092 -.8873291 .0723702
_cons | 13.16623 4.239979 3.11 0.006 4.321793 22.01068
------------------------------------------------------------------------------

Rather than listing these out 15 times, it is easier to visualize the
estimated slope coefficient from each of the 15 subsamples with a
histogram.
Plotting the 15 estimated slopes
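One way to produce such a histogram is to collect the 15 slope estimates with statsby and then plot them. A minimal sketch (assuming the samp variable created above; variable names are illustrative):

preserve
statsby slope=_b[price], by(samp) clear: reg quantity price
histogram slope, bin(8) xtitle("Estimated slope coefficient on price")
restore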

Thus, β̂1 is a random variable which has its own sampling distribution
with a mean and variance. The same will be true of β̂0.
“Desirable” properties of estimators

• Unbiasedness
• Efficiency (minimum variance) [we will cover this later]
These are finite sample properties

An estimator θ̂ of θ is said to be unbiased if E(θ̂) = θ

In the SLR case, we want β̂0 and β̂1 to be unbiased, i.e. E[β̂1] = β1 and
E[β̂0] = β0.
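A sketch of why unbiasedness holds for β̂1 (not worked through on this slide; it relies on the zero-conditional-mean assumption E(u_i | x) = 0 from the assumptions referred to earlier): substituting y_i = β0 + β1 x_i + u_i into the formula for β̂1 gives

\[
\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, u_i}{\sum_{i=1}^{n} (x_i - \bar x)^2},
\qquad\text{so}\qquad
E\!\left[\hat\beta_1 \mid x_1,\dots,x_n\right]
  = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar x)\, E[u_i \mid x]}{\sum_{i=1}^{n} (x_i - \bar x)^2}
  = \beta_1 .
\]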
