OLS Estimates
ECON 30331/Evans
Suppose there is a sample with n observations and two variables (xi and yi). Then
$\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \cdots + x_n$

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
Throughout the semester when I write at the board, I will shorten the notation some and write
$\sum_{i=1}^{n} x_i \quad \text{as simply} \quad \sum_i x_i$
Result (1): $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$. The sum of deviations from means equals zero.

Proof: $\sum_{i=1}^{n}(x_i - \bar{x}) = \sum_{i=1}^{n}x_i - \sum_{i=1}^{n}\bar{x} = \sum_{i=1}^{n}x_i - n\bar{x} = \sum_{i=1}^{n}x_i - n\left(\frac{1}{n}\sum_{i=1}^{n}x_i\right) = \sum_{i=1}^{n}x_i - \sum_{i=1}^{n}x_i = 0$
Result (2): $\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n}x_i(y_i - \bar{y}) = \sum_{i=1}^{n}(x_i - \bar{x})y_i$

Proof: $\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n}x_i(y_i - \bar{y}) - \sum_{i=1}^{n}\bar{x}(y_i - \bar{y})$

Given the result from above (the summation of deviations from means equals zero), $\sum_{i=1}^{n}\bar{x}(y_i - \bar{y}) = \bar{x}\sum_{i=1}^{n}(y_i - \bar{y}) = 0$, so

$\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n}x_i(y_i - \bar{y})$

By the same argument, expanding on $(y_i - \bar{y})$ instead gives

$\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n}(x_i - \bar{x})y_i$
Result (3): $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}x_i(x_i - \bar{x})$

Proof: This is the same proof as above. Expand the terms on the right-hand side of the equality:

$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x}) = \sum_{i=1}^{n}x_i(x_i - \bar{x}) - \sum_{i=1}^{n}\bar{x}(x_i - \bar{x})$

In the final term on the right, note that because $\bar{x}$ is a constant, you can take it outside the summation, and

$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}x_i(x_i - \bar{x}) - \bar{x}\sum_{i=1}^{n}(x_i - \bar{x})$

And given Result (1) above, $\bar{x}\sum_{i=1}^{n}(x_i - \bar{x}) = 0$, so

$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}x_i(x_i - \bar{x})$
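These three identities are easy to check numerically. Below is a minimal sketch, assuming Python with NumPy and arbitrary simulated data (the seed and sample size are not part of the notes); each printed pair or triple of values agrees up to floating-point rounding.

```python
# Numerical check of Results (1)-(3) on made-up data; any sample works.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 3.0 * x + rng.normal(size=50)
xbar, ybar = x.mean(), y.mean()

# Result (1): sum of deviations from the mean is ~0
print(np.sum(x - xbar))

# Result (2): the three expressions are all equal
print(np.sum((x - xbar) * (y - ybar)),
      np.sum(x * (y - ybar)),
      np.sum((x - xbar) * y))

# Result (3): the two expressions are equal
print(np.sum((x - xbar) ** 2), np.sum(x * (x - xbar)))
```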
Deriving the OLS estimates for the Bivariate Regression Model
Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
The residuals (εi) are unobserved, but for candidate values of β0 and β1, we can obtain an
estimate of the residual.
$SSR = \sum_{i=1}^{n}\hat{\varepsilon}_i^2 = \sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)^2$
The OLS estimates are the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize SSR. Take the derivative of SSR with respect to each parameter and set it equal to zero:

(1) $\partial SSR/\partial\hat{\beta}_0 = -2\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$

(2) $\partial SSR/\partial\hat{\beta}_1 = -2\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)x_i = 0$
(1a) $\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$

Divide (1a) through by n and split the sum into three terms:

(1b) $\frac{1}{n}\sum_{i=1}^{n}y_i - \frac{1}{n}\sum_{i=1}^{n}\hat{\beta}_0 - \frac{1}{n}\sum_{i=1}^{n}\hat{\beta}_1 x_i = 0$

The first term is $\bar{y}$, the second is $\frac{1}{n}\sum_{i=1}^{n}\hat{\beta}_0 = \hat{\beta}_0$, and the third is $\frac{1}{n}\sum_{i=1}^{n}\hat{\beta}_1 x_i = \hat{\beta}_1\bar{x}$, and therefore we can re-write (1b) as

(1c) $\bar{y} - \hat{\beta}_0 - \hat{\beta}_1\bar{x} = 0$, or equivalently, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}$
(2a) $\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right)x_i = 0$
Substitute (1c) into (2a), expand the terms in the summation, and, because $\hat{\beta}_1$ is a constant, bring it outside the summation:

(2d) $\sum_{i=1}^{n}(y_i - \bar{y})x_i - \hat{\beta}_1\sum_{i=1}^{n}(x_i - \bar{x})x_i = 0$
From Results (2) and (3) at the start of these notes,

$\sum_{i=1}^{n}(y_i - \bar{y})x_i = \sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})$

$\sum_{i=1}^{n}(x_i - \bar{x})x_i = \sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x}) = \sum_{i=1}^{n}(x_i - \bar{x})^2$
Substitute these into (2d) and solve for $\hat{\beta}_1$:

(2g) $\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
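To make (2g) and (1c) concrete, here is a small sketch, assuming Python with NumPy and simulated data (the parameter values and seed are arbitrary choices), that computes the estimates from the formulas and checks them against NumPy's built-in degree-1 least-squares fit.

```python
# Compute the OLS estimates directly from (2g) and (1c), then compare with np.polyfit.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.5 + 0.8 * x + rng.normal(size=100)

b1_hat = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)  # (2g)
b0_hat = y.mean() - b1_hat * x.mean()                                           # (1c)

b1_np, b0_np = np.polyfit(x, y, deg=1)   # degree-1 fit returns (slope, intercept)
print(b0_hat, b1_hat)
print(b0_np, b1_np)                      # should match the formula-based estimates
```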
Some useful properties of OLS estimates:
1. From (1c) above, note that $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x}$. The OLS regression line passes through the means of x and y. OLS is sometimes referred to as a mean regression.
2. From (1a) above, note that $\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0$ and note further that $\hat{\varepsilon}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$. Therefore $\sum_{i=1}^{n}\hat{\varepsilon}_i = 0$, which indicates that the sample mean of $\hat{\varepsilon}$ is equal to zero, or $\bar{\hat{\varepsilon}} = \frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_i = 0$.
3. From (2a) above, recall that $\hat{\varepsilon}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$, so (2a) can be written as $\sum_{i=1}^{n}\hat{\varepsilon}_i x_i = 0$.
4. Looking at the OLS estimate in (2g), divide the numerator and denominator by (n-1):

(2h) $\hat{\beta}_1 = \frac{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
$\hat{\beta}_1 = \frac{\hat{\sigma}_{xy}}{\hat{\sigma}_x^2} = \frac{\hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y}{\hat{\sigma}_x\hat{\sigma}_x} = \hat{\rho}_{xy}\frac{\hat{\sigma}_y}{\hat{\sigma}_x}$

If one knows the variances and the correlation coefficient, one can easily compute the OLS estimate of $\hat{\beta}_1$. (The numerical check below illustrates all four of these properties.)
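A minimal sketch of that check, assuming Python with NumPy and simulated data (names, seed, and parameter values are illustrative, not part of the notes):

```python
# Numerical check of properties 1-4 of the OLS fit.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = -1.0 + 2.5 * x + rng.normal(size=200)

b1 = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
ehat = y - b0 - b1 * x                           # OLS residuals

print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # property 1: line passes through the means
print(ehat.mean())                               # property 2: mean residual ~ 0
print(np.sum(ehat * x))                          # property 3: sum of ehat_i * x_i ~ 0
rho = np.corrcoef(x, y)[0, 1]
print(rho * y.std(ddof=1) / x.std(ddof=1), b1)   # property 4: rho * s_y / s_x equals b1
```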
Deriving the R2
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n}y_i = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i + \hat{\varepsilon}_i) = \frac{1}{n}\sum_{i=1}^{n}\hat{y}_i + \frac{1}{n}\sum_{i=1}^{n}\hat{\varepsilon}_i = \bar{\hat{y}} + \bar{\hat{\varepsilon}}$
Remember that the sample average of $\hat{\varepsilon}$ is zero, so $\bar{y} = \bar{\hat{y}}$ (the sample mean of y equals the sample mean of predicted y).
(2) $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2$
This is nothing more than a statement about how much movement there is in y in your sample.
Noting that $y_i = \hat{y}_i + \hat{\varepsilon}_i$ and $\bar{y} = \bar{\hat{y}}$, substitute these values into SST and expand the square:
(4) $SST = \sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(\hat{y}_i + \hat{\varepsilon}_i - \bar{\hat{y}})^2 = \sum_{i=1}^{n}\left[(\hat{y}_i - \bar{\hat{y}})^2 + \hat{\varepsilon}_i^2 + 2\hat{\varepsilon}_i(\hat{y}_i - \bar{\hat{y}})\right]$

$= \sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2 + \sum_{i=1}^{n}\hat{\varepsilon}_i^2 + 2\sum_{i=1}^{n}\hat{\varepsilon}_i(\hat{y}_i - \bar{\hat{y}})$
Focus on the third term in the equality. Note a few things. First, since $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, $\bar{\hat{y}} = \bar{y}$, and $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x}$, it is easy to show that $(\hat{y}_i - \bar{\hat{y}}) = \hat{\beta}_1(x_i - \bar{x})$. Substitute this value into the third term:
(5) $2\sum_{i=1}^{n}\hat{\varepsilon}_i(\hat{y}_i - \bar{\hat{y}}) = 2\sum_{i=1}^{n}\hat{\varepsilon}_i\hat{\beta}_1(x_i - \bar{x}) = 2\hat{\beta}_1\sum_{i=1}^{n}\hat{\varepsilon}_i(x_i - \bar{x})$
In equation (5), we can take $\hat{\beta}_1$ outside the summation because it is the same value over
all i. Look at the notes for “Deriving the OLS Estimates for the Bivariate Regression
Model”. On the final page, we note some useful properties of the OLS estimates.
Condition 3 states that by construction, $\sum_{i=1}^{n}\hat{\varepsilon}_i x_i = 0$, and condition 2 states that $\sum_{i=1}^{n}\hat{\varepsilon}_i = 0$, which together mean that equation (5) above is zero. Therefore,

(6) $SST = \sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2 + \sum_{i=1}^{n}\hat{\varepsilon}_i^2$
The SST or the total variation in y has two separate parts. The first is
(7) $SSM = \sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2$
Where SSM is defined as the sum of squared model. This is a measure of the variation in the predicted value of y.
The final term in equation (6) should look very familiar; it is none other than the objective function, or the sum of squared residuals (SSR).
(8) $SSR = \sum_{i=1}^{n}\hat{\varepsilon}_i^2$
(9) $SST = SSM + SSR$

…or the actual variation in y (SST) is a function of two components. The first is the variation predicted by the model (SSM), while the second is the variation that we cannot predict (SSR).

Dividing (9) through by SST,

$1 = \frac{SSM}{SST} + \frac{SSR}{SST}$

Or alternatively,

(10) $R^2 = \frac{SSM}{SST} = 1 - \frac{SSR}{SST}$
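The decomposition in (9) and the two expressions for R² in (10) can be checked directly. A short sketch, assuming Python with NumPy and simulated data (the particular values are arbitrary):

```python
# Decompose SST into SSM + SSR and compute R^2 both ways, as in equation (10).
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=150)
y = 0.5 + 1.2 * x + rng.normal(size=150)

b1 = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
ehat = y - yhat

SST = np.sum((y - y.mean()) ** 2)
SSM = np.sum((yhat - yhat.mean()) ** 2)
SSR = np.sum(ehat ** 2)

print(SST, SSM + SSR)             # equation (9): SST = SSM + SSR
print(SSM / SST, 1 - SSR / SST)   # equation (10): two equivalent ways to compute R^2
```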
n
Just a note about the textbook. The author calls the term ( yˆi yˆ ) 2 the SSE or sum of squared
i 1
SSE SSR
explained. PLEASE NOTE: The textbook definition of R2 is R 2 1 where the
SST SST
n
author defines SSE as sum of squared estimated. I do not like this abbreviation for ( yˆ yˆ )
i 1
i
2
.
Our definition SSM matches much better with the STATA prints out – SST is sum of squared
total, SSM is sum of squared model and SSR is sum of squared residuals – so we will use these
abbreviations.
Proof that β̂1 is an Unbiased Estimate
(1) $\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
(2) $\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n}y_i(x_i - \bar{x})$
Note further that the true relationship between y and x is given by the population regression line

(3) $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
Using (2) and substituting the true value for yi, as given by the model in (3), into the estimate (1),
(4) $\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n}y_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n}(\beta_0 + \beta_1 x_i + \varepsilon_i)(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
(5) $\hat{\beta}_1 = \frac{\beta_0\sum_{i=1}^{n}(x_i - \bar{x}) + \beta_1\sum_{i=1}^{n}x_i(x_i - \bar{x}) + \sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
In the first term in the numerator, note that β0 is a constant and can be pulled
outside the summation. As a result, we have the summation of a deviation from a
mean, which equals zero
n n
0 ( xi x ) 0 ( xi x ) 0 (0) 0
i 1 i 1
In the second term in the numerator, β1 is a constant and it can be pulled outside the summation. Recall also that $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n}x_i(x_i - \bar{x})$, so
$\sum_{i=1}^{n}\beta_1 x_i(x_i - \bar{x}) = \beta_1\sum_{i=1}^{n}x_i(x_i - \bar{x}) = \beta_1\sum_{i=1}^{n}(x_i - \bar{x})^2$
The first term in the numerator drops out, the second term (once divided by the denominator) reduces to β1, and therefore we can write the OLS estimate for β̂1 as
(6) $\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Equation (6) points out two important things. First, the estimate β̂1 is a function of the 'truth', that is, the true value of β1. Likewise, the estimated value of β̂1 is a function of the n people who were selected for this sample. The true source of randomness in the model is therefore the unknown residual εi. As a result, the properties of β̂1 will be a function of the properties we assume about εi. We typically make the following assumptions about εi:
1) $E(\varepsilon_i) = E(\varepsilon_i|x_i) = 0$
2) $V(\varepsilon_i) = V(\varepsilon_i|x_i) = \sigma_\varepsilon^2$
3) $Cov(\varepsilon_i, \varepsilon_j) = 0$ for all i≠j
The first assumption says that on average, the expected error is zero and that this
expectation does not depend on the value of x. The second assumption says that the
errors are “homoskedastic” or they have the same variance. Assumption (3) states that
errors are not correlated across observations. The second and third assumptions will be
relaxed throughout the semester.
Assumption (1) is the killer. If (1) is true, the model has very nice properties; if it is false, the model is useless.
Assumption (1) states that ε and x are independent. This says that the realization of x
conveys no information about the likely value of ε and therefore, the conditional
expectation E(εi|xi) provides the same information as the unconditional expectation E(εi).
Recall that $cov(x_i, \varepsilon_i) = E(x_i\varepsilon_i) - E(x_i)E(\varepsilon_i)$. Because $E(\varepsilon_i) = E(\varepsilon_i|x_i) = 0$, the second term drops out and $cov(x_i, \varepsilon_i) = E(x_i\varepsilon_i)$. Let's work with the right-hand side of this term: $E(x_i\varepsilon_i) = E(\varepsilon_i|x_i)x_i = E(\varepsilon_i)x_i = 0$, and hence $cov(x_i, \varepsilon_i) = 0$. In essence, by conditioning on x, we "fix" this value and $E(x_i\varepsilon_i)$ becomes $E(\varepsilon_i)x_i$, which equals zero by assumption (1).
A key result we will use time and time again throughout the semester is that if we maintain assumption (1) and we see $E(\varepsilon_i x_i)$, this reduces to $E(\varepsilon_i)x_i$, which will equal zero.
As we will see, if the value of x conveys information about ε then the model is sunk. We
will go over this in detail about two dozen times throughout the semester.
Let's also work with condition (2) a little. This states that the variance of εi is the same whether we know x or not. Recall the definition of variance: $Var(\varepsilon_i) = E[(\varepsilon_i - E(\varepsilon_i))^2]$. Because $E(\varepsilon_i) = 0$, the definition of the variance reduces to $Var(\varepsilon_i) = E[\varepsilon_i^2] = \sigma_\varepsilon^2$. Therefore, any time we see an $E[\varepsilon_i^2]$, this means $\sigma_\varepsilon^2$.
In the derivations below, we will also see a lot of terms of the form $E[\varepsilon_i^2 x_i^2]$. Given assumption (2), $E[\varepsilon_i^2 x_i^2] = E[\varepsilon_i^2|x_i]x_i^2 = E[\varepsilon_i^2]x_i^2 = \sigma_\varepsilon^2 x_i^2$.

Therefore, a key result we will use time and time again throughout the semester -- if we maintain assumption (2) and we see $E[\varepsilon_i^2 x_i^2]$, this reduces to $E[\varepsilon_i^2]x_i^2$, which equals $\sigma_\varepsilon^2 x_i^2$.
For now, let's concentrate on the case where (1) is true and see what that buys us.
We have established that β̂1 is a random variable. Any time you have a random variable,
the first two questions you need to ask are a) what is the expected value and b) what is
the variance. In this section, we will produce E[ β̂1]
First, start with the definition of β̂1 in equation (6) and take the expectation:
(7) $E[\hat{\beta}_1] = E\left[\beta_1 + \frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right] = E[\beta_1] + E\left[\frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right] = \beta_1 + \frac{E\left[\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})\right]}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
There is a lot going on in equation (7). First, note that E[a+b] = E[a] + E[b], so we can break apart the two big terms in the expectation. Second, note that the true value β1 is a fixed constant, so E[β1] = β1. Note also that because we assume x is "fixed", then $\sum_{i=1}^{n}(x_i - \bar{x})^2$ is not random and it too can be brought outside the expectation.
Therefore, the properties of E[β̂1] will be driven by the expectation $E\left[\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})\right]$.
Let’s work with this term. First, write out the terms in the summation under the
expectation
(8) $E\left[\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})\right] = E[\varepsilon_1(x_1 - \bar{x})] + E[\varepsilon_2(x_2 - \bar{x})] + E[\varepsilon_3(x_3 - \bar{x})] + \cdots + E[\varepsilon_n(x_n - \bar{x})]$
Consider one of these expectations, $E[\varepsilon_i(x_i - \bar{x})]$, for any i. Break this term apart to read $E[\varepsilon_i x_i] - E[\varepsilon_i\bar{x}]$. Note assumption (1) above states that $E(\varepsilon_i|x_i) = 0$. Looking at the first term of $E[\varepsilon_i x_i] - E[\varepsilon_i\bar{x}]$, we can easily write it as

$E[\varepsilon_i x_i] = E[\varepsilon_i|x_i]x_i = 0$

Therefore, if assumption (1) is correct, this term should be zero. The second term in $E[\varepsilon_i x_i] - E[\varepsilon_i\bar{x}]$ requires the definition of $\bar{x}$, which is
$\bar{x} = \frac{1}{n}\left(x_1 + x_2 + \cdots + x_n\right)$
$E[\varepsilon_i\bar{x}] = E\left[\varepsilon_i\frac{x_1}{n} + \varepsilon_i\frac{x_2}{n} + \cdots + \varepsilon_i\frac{x_i}{n} + \cdots + \varepsilon_i\frac{x_n}{n}\right] = E\left[\varepsilon_i\frac{x_1}{n}\right] + E\left[\varepsilon_i\frac{x_2}{n}\right] + \cdots + E\left[\varepsilon_i\frac{x_n}{n}\right]$

Note that each term $E\left[\varepsilon_i\frac{x_j}{n}\right]$ for j≠i is 0, and the term $E\left[\varepsilon_i\frac{x_i}{n}\right]$ must be equal to zero by the same arguments as above. As a result, the far right-hand term in equation (7) is zero and therefore,
(9) $E[\hat{\beta}_1] = \beta_1$
The estimate β̂1 is an unbiased estimate of β1; that is, if one were to draw a large number of samples at random and estimate β̂1 each time, the average of all these estimates would be the true value β1.
Please note --- an unbiased estimate does not mean you have the correct estimate – it
simply means that you used a procedure that on average will give you the correct answer.
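One way to see what "unbiased" means in practice is a small simulation: repeat the sampling many times and average the estimates. A sketch, assuming Python with NumPy (the population parameters, sample size, and number of replications are arbitrary choices for the demonstration):

```python
# Monte Carlo illustration of unbiasedness: draw many samples from the same
# population model, estimate beta1_hat in each, and average the estimates.
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, n, reps = 2.0, 0.7, 100, 5000

estimates = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    eps = rng.normal(size=n)                 # assumption (1): E(eps | x) = 0
    y = beta0 + beta1 * x + eps
    estimates[r] = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)

print(estimates.mean())   # close to the true beta1 = 0.7
print(estimates.std())    # individual estimates still vary around the truth
```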
Here is another way to think about how the correlation between x and ε would get you
into trouble
From equation (6), divide the numerator and denominator of the right hand term by (n-1)
(10) $\hat{\beta}_1 = \beta_1 + \frac{\frac{1}{n-1}\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \beta_1 + \frac{\frac{1}{n-1}\sum_{i=1}^{n}(\varepsilon_i - \bar{\varepsilon})(x_i - \bar{x})}{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Notice also in the final term in (10), we use the fact that the numerator can be
written as
$\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x}) = \sum_{i=1}^{n}(\varepsilon_i - \bar{\varepsilon})(x_i - \bar{x})$
The numerator is nothing more than the sample covariance between xi and the ACTUAL error term εi. The denominator is the sample variance of x.
$\hat{\sigma}_{x\varepsilon} = \frac{1}{n-1}\sum_{i=1}^{n}(\varepsilon_i - \bar{\varepsilon})(x_i - \bar{x})$

$\hat{\sigma}_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

(11) $\hat{\beta}_1 = \beta_1 + \frac{\hat{\sigma}_{x\varepsilon}}{\hat{\sigma}_x^2}$
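The trouble that equation (11) warns about is easy to see in a simulation: if the errors are correlated with x, the sample covariance term in (11) no longer averages to zero and β̂1 drifts away from the truth. A sketch, assuming Python with NumPy (the parameter values are illustrative, not from the notes):

```python
# What happens when assumption (1) fails: build an error term correlated with x
# and watch beta1_hat move away from the truth by roughly sigma_xe / sigma_x^2,
# as equation (11) predicts.
import numpy as np

rng = np.random.default_rng(5)
beta0, beta1, n, reps = 1.0, 0.7, 200, 5000

estimates = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    eps = 0.5 * x + rng.normal(size=n)       # violates E(eps | x) = 0: cov(x, eps) > 0
    y = beta0 + beta1 * x + eps
    estimates[r] = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)

print(estimates.mean())   # roughly beta1 + 0.5 = 1.2, not the true 0.7
```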
The Variance of β̂1
Demonstrating $Var(\hat{\beta}_1)$ is the most detailed and complicated derivation we will do all semester. In the end, it is a lot of algebra, but it simply exploits the properties of definitions and expectations we have already used.
(a) Recall the OLS estimate for the slope:

(1) $\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
(b) Recall also that the true underlying relationship between x and y is given by the
equation
(2) $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
(c) To analyze some of the properties of β̂1, we substituted the true value for yi, as
defined by equation (2) into the estimate (1). This substitution leads to the following
result:
(3) $\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
By definition,
(6) $\hat{\beta}_1 - \beta_1 = \frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{SST_x}, \quad \text{where } SST_x = \sum_{i=1}^{n}(x_i - \bar{x})^2$
The variable $SST_x$ is the sum of squared total for x, similar to the SST for y used in the construction of the R².
(7) $Var(\hat{\beta}_1) = E[(\hat{\beta}_1 - \beta_1)^2] = E\left[\left(\frac{\sum_{i=1}^{n}\varepsilon_i(x_i - \bar{x})}{SST_x}\right)^2\right] = E\left[\left(\frac{\sum_{i=1}^{n}\varepsilon_i x_i}{SST_x}\right)^2\right]$

where, in the far right term and in the rest of this derivation, $x_i$ is written as shorthand for the deviation from the mean, $(x_i - \bar{x})$.
(8) $Var(\hat{\beta}_1) = \frac{1}{SST_x^2}E\left[\left(\sum_{i=1}^{n}\varepsilon_i x_i\right)^2\right]$
Let's work with the expectation in the far right-hand term in equation (8) and expand the square:
(9) $\left(\sum_{i=1}^{n}\varepsilon_i x_i\right)^2 = (\varepsilon_1 x_1 + \varepsilon_2 x_2 + \cdots + \varepsilon_n x_n)^2 = \left[\varepsilon_1^2 x_1^2 + \varepsilon_2^2 x_2^2 + \cdots + \varepsilon_n^2 x_n^2 + 2\varepsilon_1 x_1\varepsilon_2 x_2 + 2\varepsilon_1 x_1\varepsilon_3 x_3 + \cdots + 2\varepsilon_{n-1}x_{n-1}\varepsilon_n x_n\right]$

(10) $E\left[\left(\sum_{i=1}^{n}\varepsilon_i x_i\right)^2\right] = E[\varepsilon_1^2 x_1^2] + E[\varepsilon_2^2 x_2^2] + \cdots + E[\varepsilon_n^2 x_n^2] + E[2\varepsilon_1 x_1\varepsilon_2 x_2] + E[2\varepsilon_1 x_1\varepsilon_3 x_3] + \cdots + E[2\varepsilon_{n-1}x_{n-1}\varepsilon_n x_n]$
Let's look at the terms in equation (10). Consider $E[\varepsilon_j^2 x_j^2]$ for any j = 1, 2, …, n. Recall from assumption (2) above that anytime we see $E[\varepsilon_j^2 x_j^2]$, this reduces to $E[\varepsilon_j^2]x_j^2 = \sigma_\varepsilon^2 x_j^2$ because $E[\varepsilon_i^2|x_i] = E[\varepsilon_i^2]$. Note also that we established above that any time we see $E[\varepsilon_i^2]$, this equals $\sigma_\varepsilon^2$. Therefore, the first n terms in the second line of equation (10), the $E[\varepsilon_j^2 x_j^2]$, equal $\sigma_\varepsilon^2 x_j^2$ for j = 1, 2, …, n.
Next, consider the expectation of the cross terms, $E[2\varepsilon_i\varepsilon_j x_i x_j]$. The 2 is a constant, so it can be brought outside the expectation. By assumption, $x_i$ and $x_j$ are also constants, so they can be brought outside the expectation as well. Therefore, $E[2\varepsilon_i\varepsilon_j x_i x_j] = 2x_i x_j E[\varepsilon_i\varepsilon_j]$. Recall above that we assumed $cov(\varepsilon_i, \varepsilon_j) = 0$, and the definition of $cov(\varepsilon_i, \varepsilon_j)$ is $cov(\varepsilon_i, \varepsilon_j) = E[\varepsilon_i\varepsilon_j] - E[\varepsilon_i]E[\varepsilon_j]$; since $E[\varepsilon_i] = E[\varepsilon_j] = 0$, $cov(\varepsilon_i, \varepsilon_j) = E[\varepsilon_i\varepsilon_j] = 0$. Therefore, all the expectations of the cross terms in (10) are zero. Combining these results,
(11) $Var(\hat{\beta}_1) = \frac{1}{SST_x^2}E\left[\left(\sum_{i=1}^{n}\varepsilon_i x_i\right)^2\right] = \frac{1}{SST_x^2}\left[\sigma_\varepsilon^2 x_1^2 + \sigma_\varepsilon^2 x_2^2 + \sigma_\varepsilon^2 x_3^2 + \cdots + \sigma_\varepsilon^2 x_n^2\right]$
(12) $\left[\sigma_\varepsilon^2 x_1^2 + \sigma_\varepsilon^2 x_2^2 + \sigma_\varepsilon^2 x_3^2 + \cdots + \sigma_\varepsilon^2 x_n^2\right] = \sigma_\varepsilon^2\sum_{i=1}^{n}x_i^2 = \sigma_\varepsilon^2\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sigma_\varepsilon^2 SST_x$
And therefore:
(13) $Var(\hat{\beta}_1) = \frac{1}{SST_x^2}\sigma_\varepsilon^2 SST_x = \frac{\sigma_\varepsilon^2}{SST_x} = \frac{\sigma_\varepsilon^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Notice that the definition of (13) includes $\sigma_\varepsilon^2$, which is the Var(εi). Unfortunately, we do not know $\sigma_\varepsilon^2$, so it must be estimated:
(14) $\hat{\sigma}_\varepsilon^2 = \frac{\sum_{i=1}^{n}\hat{\varepsilon}_i^2}{n-k-1} = \frac{SSR}{n-k-1}$
Where k is the number of x’s included in the model. Thus in the simple bivariate model, k=1
and the degrees of freedom in the denominator is n-2.
(15) $Est.Var(\hat{\beta}_1) = \frac{\hat{\sigma}_\varepsilon^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$
As with all variances, the units of measure on (15) are in β̂1 squared units so we need to take the
square root. The square root of this variance is typically called the “Standard error”
(16) $se(\hat{\beta}_1) = \frac{\hat{\sigma}_\varepsilon}{\left[\sum_{i=1}^{n}(x_i - \bar{x})^2\right]^{1/2}}$
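To close the loop, the sketch below, assuming Python with NumPy (the population values, sample size, and replication count are arbitrary choices for the demonstration), computes σ̂² and the standard error from (14)-(16) for one sample and then compares the result with the spread of β̂1 across repeated samples; the two numbers should be in the same ballpark.

```python
# Estimated standard error from (14)-(16) versus the Monte Carlo spread of beta1_hat.
import numpy as np

rng = np.random.default_rng(6)
beta0, beta1, sigma_eps, n = 1.0, 0.7, 2.0, 200

def beta1_hat(x, y):
    return np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)

# One sample: sigma2_hat = SSR/(n - k - 1) with k = 1, then se(beta1_hat)
x = rng.normal(size=n)
y = beta0 + beta1 * x + rng.normal(scale=sigma_eps, size=n)
b1 = beta1_hat(x, y)
b0 = y.mean() - b1 * x.mean()
ehat = y - b0 - b1 * x
sigma2_hat = np.sum(ehat ** 2) / (n - 2)                   # equation (14) with k = 1
se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))  # equations (15)-(16)
print(se_b1)

# Many samples: the standard deviation of beta1_hat should be close to se_b1
reps = 5000
draws = np.empty(reps)
for r in range(reps):
    xr = rng.normal(size=n)
    yr = beta0 + beta1 * xr + rng.normal(scale=sigma_eps, size=n)
    draws[r] = beta1_hat(xr, yr)
print(draws.std())
```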