
Lecture Note 2: Simple Linear Regression Model

Changsu Ko

UCLA

Econ 103

Changsu Ko (UCLA) Summer 2016 Econ 103


The Simple Linear Regression Model

Part A

An Economic Model

As economists, we are generally interested in studying relationships between variables

For example, economic theory tells us that expenditure on economic goods depends on income

We call y (expenditure) the "dependent variable" and x (income) the "independent" or "explanatory" variable

In econometrics, we recognize that real-world expenditures are random variables, and we want to use data to learn about the relationship between x and y

An Economic Model

The relevant pdf we need to consider is the conditional probability density function, since it is the density of y, "conditional" upon an x

The conditional mean, or expected value, of y, conditional on x, is E[y | x]

The expected value of a random variable is called its "mean" value, which is really the population mean, that is, the center of the probability distribution of the random variable

This is not the same as the sample mean, which is simply the arithmetic average of numerical values

An Economic Model

In order to investigate the relationship between expenditure and income, we must build an economic model

Then we build a corresponding econometric model that forms the basis for a quantitative or empirical economic analysis

This econometric model is also called a regression model

An Economic Model

The simple regression function is written as

E[y | x] = µ_{y|x} = β1 + β2 x

where β1 is the intercept and β2 is the slope

This model is called a simple regression because there is only one explanatory variable on the right-hand side of the equation

An Economic Model

The slope of the regression line can be written as:

β2 = ∆E[y | x] / ∆x = dE[y | x] / dx

where ∆ denotes "change in" and dE[y | x]/dx denotes the derivative of the expected value of y with respect to x

An Econometric Model

The random error term is defined as

e = y − E[y | x] = y − β1 − β2 x

Rearranging gives

y = β1 + β2 x + e

where y is the dependent variable and x is the independent variable

An Econometric Model

The expected value of the error term, given x, is

E[e | x] = E[y | x] − (β1 + β2 x) = β1 + β2 x − β1 − β2 x = 0

That is, the mean value of the error term, given x, is zero

An Econometric Model
Assumptions

From now on, let’s further assume that x is non-random

Assumption SR1
The value of y, for each value of x, is:

y = β1 + β2 x + e

An Econometric Model
Assumptions

Assumption SR2
The expected value of the random error e is:

E(e) = 0

This is equivalent to assuming that E(y) = β1 + β2 x

An Econometric Model
Assumptions

Assumption SR3
The variance of the random error e is:

Var(e) = σ² = Var(y)

The random variables y and e have the same variance because they differ only by a constant

An Econometric Model
Assumptions

Assumption SR4
The covariance between any pair of random errors, ei and ej, is:

Cov(ei, ej) = Cov(yi, yj) = 0

The stronger version of this assumption is that the random errors e’s
are statistically independent, in which case the values of the
dependent variable y are also statistically independent

An Econometric Model
Assumptions

Assumption SR5
The variable x is not random, and must take at least two different
values

Assumption SR6 (optional)
The values of e are normally distributed about their mean if the values of y are normally distributed, and vice versa

Estimating the Regression Parameters

Stata Command (Summary Statistics) : sum x y

Estimating the Regression Parameters

Stata Command (Plot) : graph twoway scatter y x

Estimating the Regression Parameters

After estimation, we have the following fitted equation in hand

The fitted regression line is:

ŷi = b1 + b2 xi

The residual is therefore:

êi = yi − ŷi = yi − (b1 + b2 xi)

Stata Command (Regression) : reg y x

Estimating the Regression Parameters

b1 = 83.416 and b2 = 10.20964

Estimating the Regression Parameters
Minimizing the Sum of Squared Residuals

Suppose we have another fitted line:

ŷi* = b1* + b2* xi

The residual is therefore:

êi* = yi − ŷi* = yi − (b1* + b2* xi)

Estimating the Regression Parameters
Minimizing the Sum of Squared Residuals

The least squares line has the smaller sum of squared residuals:

If SSE = Σ_{i=1}^{N} êi²  and  SSE* = Σ_{i=1}^{N} êi*²

it must be that SSE ≤ SSE*

Least squares estimates for the unknown parameters β1 and β2 are obtained by minimizing the sum of squares function:

S(β1, β2) = Σ_{i=1}^{N} (yi − β1 − β2 xi)²

Estimating the Regression Parameters
Minimizing the Sum of Squared Residuals

If we solve the previous minimization problem

min_{β1, β2} S(β1, β2) = Σ_{i=1}^{N} (yi − β1 − β2 xi)²

we get the following two first-order conditions (try to derive them yourself!):

Σ_{i=1}^{N} (yi − b1 − b2 xi) = Σ_{i=1}^{N} êi = 0

Σ_{i=1}^{N} (yi − b1 − b2 xi) xi = Σ_{i=1}^{N} êi xi = 0

Estimating the Regression Parameters
Parameter Estimates

Parameter estimates (based on the two previous first-order conditions) are:

b2 = Σ_{i=1}^{N} (xi − x̄)(yi − ȳ) / Σ_{i=1}^{N} (xi − x̄)²

b1 = ȳ − b2 x̄

where x̄ = (1/N) Σ_{i=1}^{N} xi and ȳ = (1/N) Σ_{i=1}^{N} yi
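The formulas above, and the two first-order conditions, can be checked with a short numeric sketch. This is not from the lecture (which uses Stata); the data are made up for illustration:

```python
# Minimal sketch: least squares estimates from the closed-form formulas,
# using a small made-up data set (not the food expenditure data).
def ols(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    b1 = ybar - b2 * xbar
    return b1, b2

x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]
b1, b2 = ols(x, y)

# The two first-order conditions: residuals sum to zero,
# and residuals are uncorrelated with x.
resid = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(b1, b2)                                   # ≈ 1.09, ≈ 1.97
print(sum(resid))                               # ≈ 0
print(sum(r * xi for r, xi in zip(resid, x)))   # ≈ 0
```

Both residual sums come out (numerically) zero, exactly as the first-order conditions require.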

Estimating the Regression Parameters
Parameter Estimates

Stata Command (Predicted Value ŷ ) : predict yhat (after reg)


Stata Command (Drawing fitted line) : graph twoway line yhat x
Estimating the Regression Parameters
Interpreting the Estimates

Let's interpret the results obtained from the data above

The value b2 = 10.2096 is an estimate of β2, the amount by which weekly expenditure on food per household increases when household weekly income increases by $100

Thus, we estimate that if income goes up by $100, expected weekly expenditure on food will increase by approximately $10.21

The intercept estimate b1 = 83.42 is an estimate of weekly expenditure on food for a household with zero income

Estimating the Regression Parameters
Elasticities

Income elasticity is a useful way to characterize the responsiveness of consumer expenditure to changes in income

The elasticity of a variable y with respect to another variable x is:

η_{yx} = (percent change in y) / (percent change in x) = (∆y/y) / (∆x/x) = (∆y/∆x) × (x/y)

In the linear economic model, we have shown that

β2 = ∆E(y) / ∆x

Estimating the Regression Parameters
Elasticities

The elasticity of mean expenditure with respect to income is:

η_{yx} = (∆E(y)/E(y)) / (∆x/x) = (∆E(y)/∆x) × (x/E(y)) = β2 × x/E(y)

A frequently used alternative is to calculate the elasticity at the "point of the means," because it is a representative point on the regression line:

η̂_{yx} = b2 × x̄/ȳ = 10.21 × 19.6/283.57 = 0.71

Estimating the Regression Parameters
Predictions

Suppose that we want to predict weekly food expenditure for a household with a weekly income of $2,000

This prediction is carried out by substituting x = 20 into our estimated equation to obtain:

ŷ = 83.416 + 10.2096 × 20 = 287.61

That is, we predict that a household with a weekly income of $2,000 will spend $287.61 per week on food
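The prediction step is plain arithmetic; a one-line sketch (not from the lecture, which uses Stata) with the estimates reported earlier:

```python
# Sketch: prediction from the fitted line, using the estimates reported
# earlier (b1 = 83.416, b2 = 10.2096; income is measured in units of $100).
b1, b2 = 83.416, 10.2096
x = 20                       # weekly income of $2,000
y_hat = b1 + b2 * x
print(round(y_hat, 2))       # ≈ 287.61
```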

Part B

Assessing the Least Squares Fit

We call b1 and b2 the least squares estimators for β1 and β2

We can investigate the properties of the estimators b1 and b2

The least squares estimators are random variables. So, what are their
expected values, variances, covariances, and probability distributions?

Also, how do the least squares estimators compare with other procedures that might be used, and how can we compare alternative estimators?

Assessing the Least Squares Fit
We show below that the estimator b2 can be rewritten as:

b2 = Σ_{i=1}^{N} wi yi

where

wi = (xi − x̄) / Σ_{j=1}^{N} (xj − x̄)²

or, alternatively, as

b2 = β2 + Σ_{i=1}^{N} wi ei

We show below that if our model assumptions hold, then E(b2) = β2, which means that the estimator is unbiased
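Unbiasedness can be illustrated with a small Monte Carlo sketch (not from the lecture): fix x, draw many samples satisfying SR1–SR5 with made-up true values β1 = 1 and β2 = 2, and average the b2 estimates. The sketch also compares the sampling variance of b2 with σ²/Σ(xi − x̄)², the formula derived later in the notes:

```python
# Monte Carlo sketch of unbiasedness, with hypothetical true parameters.
import random

random.seed(0)
beta1, beta2, sigma = 1.0, 2.0, 1.0
x = list(range(50))                       # x fixed across samples (SR5)
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)

estimates = []
for _ in range(2000):
    # Each sample: y = beta1 + beta2*x + e, with mean-zero normal errors.
    y = [beta1 + beta2 * xi + random.gauss(0, sigma) for xi in x]
    ybar = sum(y) / len(y)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    estimates.append(b2)

mean_b2 = sum(estimates) / len(estimates)
var_b2 = sum((b - mean_b2) ** 2 for b in estimates) / len(estimates)
print(mean_b2)                    # close to beta2 = 2 (unbiasedness)
print(var_b2, sigma**2 / sxx)     # sampling variance close to sigma^2 / Sxx
```

The average of b2 across samples lands very close to the true β2, while any single estimate fluctuates around it.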
Assessing the Least Squares Fit

We can find the expected value of b2 using the fact that the expected value of a sum is the sum of the expected values:

E(b2) = E[β2 + Σ_{i=1}^{N} wi ei]
      = E[β2 + w1 e1 + · · · + wN eN]
      = E[β2] + E[w1 e1] + · · · + E[wN eN]
      = E[β2] + Σ_{i=1}^{N} E[wi ei]
      = β2

using the fact that E[ei] = 0 and E[wi ei] = wi E[ei] = wi × 0 = 0

Assessing the Least Squares Fit

The property of unbiasedness is about the average values of b1 and b2 if many samples, of the same size, were to be drawn from the same population

If we took the average of estimates from many samples, these averages would approach the true parameter values β1 and β2

Unbiasedness does not say that an estimate from any one sample is close to the true parameter value

But, we can say that the least squares estimation procedure (or the least squares estimator) is unbiased

Assessing the Least Squares Fit

Recall, the variance of b2 is

Var(b2) = E[(b2 − E[b2])²]

Assessing the Least Squares Fit

If the regression model assumptions SR1–SR5 are correct (assumption SR6 is not required), then the variances and covariance of b1 and b2 are:

Var(b1) = σ² × Σ_{i=1}^{N} xi² / (N Σ_{i=1}^{N} (xi − x̄)²)

Var(b2) = σ² / Σ_{i=1}^{N} (xi − x̄)²

Cov(b1, b2) = σ² × (−x̄) / Σ_{i=1}^{N} (xi − x̄)²

Assessing the Least Squares Fit
Some key properties of the variances and covariance:

The larger the variance term σ², the greater the uncertainty in the statistical model, and the larger the variances and covariance of the least squares estimators

The larger the sum of squares of the independent variable, Σ_{i=1}^{N} (xi − x̄)², the smaller the variances of the least squares estimators and the more precisely we can estimate the unknown parameters

The larger the sample size N, the smaller the variances and covariance of the least squares estimators

The larger the term Σ_{i=1}^{N} xi², the larger the variance of the least squares estimator b1

The absolute magnitude of the covariance increases with the magnitude of the sample mean x̄, and the covariance has the sign opposite to that of x̄
Assessing the Least Squares Fit
The Gauss Markov Theorem

Theorem
Under assumptions SR1–SR5 of the linear regression model, the estimators b1 and b2 have the smallest variance of all linear and unbiased estimators of β1 and β2. That is, b1 and b2 are the Best Linear Unbiased Estimators (BLUE) of β1 and β2, respectively.

Assessing the Least Squares Fit
The Gauss Markov Theorem

The estimators b1 and b2 are the "best" when compared to similar estimators, that is, estimators that are linear and unbiased

The theorem does not say that b1 and b2 are the best of all possible estimators

The estimators b1 and b2 are best within their class because they have the minimum (i.e., smallest possible) variance

Assessing the Least Squares Fit
The Gauss Markov Theorem

When comparing two linear and unbiased estimators, we always want to use the one with the smaller variance, since that estimation rule gives us the higher probability of obtaining an estimate that is close to the true parameter value

In order for the Gauss-Markov Theorem to hold, assumptions SR1–SR5 must be true. If any of these assumptions are not true, then b1 and b2 are not the best linear unbiased estimators of β1 and β2

Note that the Gauss-Markov Theorem does not depend on the assumption of normality (assumption SR6)

This explains why we are studying these estimators and why they are so widely used in research, not only in economics but in all social and physical sciences as well
The Probability Distribution of the Least Squares Estimators

If we make the normality assumption (assumption SR6) about the error term, then the least squares estimators are normally distributed:

b1 ∼ N(β1, σ² Σ_{i=1}^{N} xi² / (N Σ_{i=1}^{N} (xi − x̄)²))

b2 ∼ N(β2, σ² / Σ_{i=1}^{N} (xi − x̄)²)

The Probability Distribution of the Least Squares Estimators

Theorem
If assumptions SR1–SR5 hold, and if the sample size N is sufficiently large, then the least squares estimators have a distribution that approximates the normal distributions given above

Estimating the Variances
The variance of the random error ei is:

Var(ei) = σ² = E[(ei − E[ei])²] = E[ei²] − (E[ei])² = E[ei²]

Now, the least squares residuals are obtained by replacing the unknown parameters by their least squares estimates:

êi = yi − ŷi = yi − b1 − b2 xi

so that

σ̃² = (1/N) Σ_{i=1}^{N} êi²

Estimating the Variances

There is a simple modification that produces an unbiased estimator, and that is:

σ̂² = (1/(N − 2)) Σ_{i=1}^{N} êi²

so that:

E[σ̂²] = σ²

Why do we divide by N − 2? This is because we estimate two parameters, β1 and β2

Estimating the Variances

To obtain estimates of Var(b1), Var(b2), and Cov(b1, b2), just replace the unknown error variance σ² by σ̂² to obtain:

V̂ar(b1) = σ̂² × Σ_{i=1}^{N} xi² / (N Σ_{i=1}^{N} (xi − x̄)²)

V̂ar(b2) = σ̂² / Σ_{i=1}^{N} (xi − x̄)²

Ĉov(b1, b2) = σ̂² × (−x̄) / Σ_{i=1}^{N} (xi − x̄)²

Estimating the Variances

The square roots of the estimated variances are the "standard errors" of b1 and b2, respectively:

se(b1) = √V̂ar(b1)

se(b2) = √V̂ar(b2)
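The whole chain, from residuals to σ̂² to standard errors, can be traced with a tiny deterministic sketch (made-up data, not the lecture's food expenditure sample):

```python
# Sketch: error-variance estimate and standard errors on a toy data set.
import math

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b1 = ybar - b2 * xbar

resid = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
sigma2_hat = sum(r ** 2 for r in resid) / (n - 2)   # divide by N - 2

var_b2 = sigma2_hat / sxx
var_b1 = sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx)
cov_b1_b2 = sigma2_hat * (-xbar) / sxx
se_b1, se_b2 = math.sqrt(var_b1), math.sqrt(var_b2)
print(sigma2_hat, se_b1, se_b2)   # ≈ 0.35, ≈ 0.72, ≈ 0.26
```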

Estimating the Variances

Stata Command (Get Value ê) : predict ehat, residual

Estimating the Variances
Variance-Covariance Matrix

The estimated variances and covariances for a regression are arrayed in a rectangular array, or matrix, with the variances on the diagonal and the covariances in the off-diagonal positions:

[ Var(b1)       Cov(b1, b2) ]
[ Cov(b1, b2)   Var(b2)     ]

For the food expenditure data, the estimated covariance matrix is:

[ 1884.44    −85.9032 ]
[ −85.9032    4.38175 ]

Stata Command (Get Var-Cov Matrix) : estat vce (after reg)

Estimating the Variances
The standard errors of b1 and b2 are measures of the sampling variability of the least squares estimates b1 and b2 in repeated samples

The estimators are random variables. As such, they have probability distributions, means, and variances

In particular, if assumption SR6 holds, and the random error terms ei are normally distributed, then:

b2 ∼ N(β2, Var(b2))

where

Var(b2) = σ² / Σ_{i=1}^{N} (xi − x̄)²

Estimating the Variances

The estimator variance, σ²_{b2} = Var(b2), or its square root σ_{b2} = √Var(b2), which we might call the true standard deviation of b2, measures the sampling variation of the estimates b2

The larger σ_{b2} is, the more variation in the least squares estimates b2 we see from sample to sample. If σ_{b2} is large, then the estimates might change a great deal from sample to sample

If σ_{b2} is small relative to the parameter β2, we know that the least squares estimates will fall near β2 with high probability

Estimating the Variances

The question we address with the standard error is "How much variation about their means do the estimates exhibit from sample to sample?"

We estimate σ², and then estimate σ_{b2} using:

σ̂_{b2} = √V̂ar(b2) = √(σ̂² / Σ_{i=1}^{N} (xi − x̄)²)

Estimation of Nonlinear Relationship
Economic variables are not always related by straight-line relationships; in fact, many economic relationships are represented by curved lines

Fortunately, the simple linear regression model y = β1 + β2 x + e is much more flexible than it looks at first glance

This is because the variables y and x can be transformations, involving logarithms, squares, cubes, or reciprocals, of the basic economic variables

They can also be indicator variables that take only the values zero and one

This means that the simple linear regression model can be used to account for nonlinear relationships between variables
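The transformation idea can be demonstrated in a few lines (a sketch with hypothetical numbers, not from the lecture): a "nonlinear" model is fitted by transforming a variable first and then reusing the same simple-regression formulas:

```python
# Sketch: a quadratic-in-SQFT model fitted with ordinary simple-regression
# formulas, by regressing PRICE on the transformed variable z = SQFT^2.
sqft = [10.0, 20.0, 30.0]
price = [50.0 + 0.01 * s ** 2 for s in sqft]   # exact quadratic, no noise

z = [s ** 2 for s in sqft]                     # transformed regressor
n = len(z)
zbar, pbar = sum(z) / n, sum(price) / n
a2 = sum((zi - zbar) * (pi - pbar) for zi, pi in zip(z, price)) / \
     sum((zi - zbar) ** 2 for zi in z)
a1 = pbar - a2 * zbar
print(a1, a2)   # recovers the coefficients 50 and 0.01
```

Nothing about the estimation changes; only the regressor does.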

Estimation of Nonlinear Relationship

Consider the linear model of house prices:

PRICE = β1 + β2 SQFT + e

where SQFT is the square footage

It may be reasonable to assume that larger and more expensive homes have a higher value for an additional square foot of living area than smaller, less expensive homes

Estimation of Nonlinear Relationship

We can build this into our model in two ways:
1. a quadratic equation, in which the explanatory variable is SQFT²
2. a log-linear equation, in which the dependent variable is ln(PRICE)

In each case we will find that the slope of the relationship between PRICE and SQFT is not constant, but changes from point to point

Estimation of Nonlinear Relationship

The quadratic function y = β1 + β2 x² is a parabola

The elasticity, or the percentage change in y given a 1% change in x, is:

η = (dy/dx) × (x/y) = (2β2 x) × (x/y) = 2β2 x²/y

Estimation of Nonlinear Relationship
Using a Quadratic Model

A quadratic model for house prices includes the squared value of SQFT, giving:

PRICE = α1 + α2 SQFT² + e

The estimated slope of PRICE with respect to SQFT is:

d(P̂RICE)/d(SQFT) = 2α̂2 SQFT

If α̂2 > 0, then larger houses will have a larger slope, and a larger estimated price per additional square foot

Estimation of Nonlinear Relationship
Using a Quadratic Model

For 1,080 houses sold in Baton Rouge, LA, during mid-2005, the estimated quadratic equation is:

P̂RICE = 55776.56 + 0.0154 × SQFT²

The estimated slope is:

d(P̂RICE)/d(SQFT) = 2 × 0.0154 × SQFT

The estimated elasticity is:

η̂ = slope × (SQFT/P̂RICE) = 2α̂2 SQFT × (SQFT/P̂RICE) = 2α̂2 SQFT²/P̂RICE

Estimation of Nonlinear Relationship
Using a Quadratic Model

To compute an estimate we must select values for SQFT and PRICE

A common approach is to choose a point on the fitted relationship. That is, we choose a value for SQFT and choose for PRICE the corresponding fitted value

For houses of 2000, 4000, and 6000 square feet, the estimated elasticities are:

1.05, using P̂RICE = $117,461.77
1.63, using P̂RICE = $302,517.39
1.82, using P̂RICE = $610,943.42

In words, for a 2000-square-foot house, we estimate that a 1% increase in house size will increase price by 1.05%
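These elasticities can be reproduced numerically. A sketch using the rounded coefficients from the fitted equation above (55776.56 and 0.0154); because of rounding, the fitted prices come out slightly below the quoted dollar figures, but the elasticities agree to two decimals:

```python
# Sketch: elasticities at chosen SQFT values for the quadratic model,
# with the rounded coefficient estimates quoted above.
a1_hat, a2_hat = 55776.56, 0.0154

def elasticity(sqft):
    price_hat = a1_hat + a2_hat * sqft ** 2      # fitted price at this SQFT
    slope = 2 * a2_hat * sqft                    # d PRICE / d SQFT
    return slope * sqft / price_hat              # eta = slope * SQFT / PRICE

for s in (2000, 4000, 6000):
    print(s, round(elasticity(s), 2))   # ≈ 1.05, 1.63, 1.82
```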

Estimation of Nonlinear Relationship
Log-linear Model

The log-linear equation ln(y) = a + bx has a logarithmic term on the left-hand side of the equation and a linear variable on the right-hand side

Both its slope and elasticity change at each point and have the same sign as b

The slope is:

dy/dx = by

The elasticity, or the percentage change in y given a 1% change in x, at a point on this curve, is:

η = slope × (x/y) = bx

Estimation of Nonlinear Relationship
Log-linear Model

Using the slope expression, we can solve for a semi-elasticity, which tells us the percentage change in y given a 1-unit increase in x:

η* = 100 × (dy/y)/dx = 100b

Consider again the model for the price of a house as a function of the square footage, but now written in semi-log form:

ln(PRICE) = γ1 + γ2 SQFT + e

This logarithmic transformation can regularize data that is skewed with a long tail to the right

Estimation of Nonlinear Relationship
Log-linear Model

Using the Baton Rouge data, the fitted log-linear model is:

ln(P̂RICE) = 10.839 + 0.00041 × SQFT

To obtain the predicted price, we take the anti-logarithm, which is the exponential function:

P̂RICE = exp(10.839 + 0.00041 × SQFT)

Estimation of Nonlinear Relationship
Log-linear Model

The slope of the log-linear model is:

d(P̂RICE)/d(SQFT) = γ̂2 P̂RICE = 0.00041 × P̂RICE

For a house with a predicted PRICE of $100,000, the estimated increase in PRICE for an additional square foot of house area is $41.13

For a house with a predicted PRICE of $500,000, the estimated increase in PRICE for an additional square foot of house area is $205.63

Estimation of Nonlinear Relationship
Log-linear Model

The estimated elasticity is:

η̂ = γ̂2 SQFT = 0.00041 × SQFT

For a house with 2,000 square feet, the estimated elasticity is 0.823. That is, a 1% increase in house size is estimated to increase selling price by 0.823%

For a house with 4,000 square feet, the estimated elasticity is 1.645

Using the "semi-elasticity," we can say that, for a one-square-foot increase in size, we estimate a price increase of 0.04%

Or, perhaps more usefully, we estimate that a 100-square-foot increase will increase price by approximately 4%
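The log-linear slope, elasticity, and semi-elasticity calculations can be sketched in a few lines. Using the rounded coefficient γ̂2 = 0.00041 quoted above, the numbers come out slightly below the dollar figures on the slides, which used more decimal places:

```python
# Sketch: slope, elasticity, and semi-elasticity for the fitted
# log-linear model ln(PRICE) = 10.839 + 0.00041 * SQFT.
import math

g1, g2 = 10.839, 0.00041

def price_hat(sqft):
    return math.exp(g1 + g2 * sqft)    # anti-log gives predicted price

def slope(sqft):
    return g2 * price_hat(sqft)        # d PRICE / d SQFT = g2 * PRICE

def elasticity(sqft):
    return g2 * sqft                   # eta = g2 * SQFT

semi = 100 * g2                        # percent change per extra square foot
print(elasticity(2000))                # ≈ 0.82 for a 2,000 sq ft house
print(semi)                            # ≈ 0.041% per sq ft, ≈ 4.1% per 100 sq ft
```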

Estimation of Nonlinear Relationship
The Functional Form

We should always do our best to choose a functional form that:

is consistent with economic theory
fits the data well
is such that the assumptions of the regression model are satisfied

Estimation of Nonlinear Relationship
Regression with Indicator Variables

An indicator variable is a binary variable that takes the values zero or one; it is used to represent a non-quantitative characteristic, such as gender, race, or location

For example:

UTOWN = 1 if the house is in University Town, 0 if the house is in Golden Oaks

PRICE = β1 + β2 UTOWN + e

How do we model this?

Estimation of Nonlinear Relationship
Regression with Indicator Variables

When an indicator variable is used in a regression, it is important to write out the regression function for the different values of the indicator variable:

E[PRICE] = β1 + β2 if UTOWN = 1, and β1 if UTOWN = 0

The estimated regression is:

P̂RICE = b1 + b2 UTOWN = 215.733 + 61.51 × UTOWN

which equals 277.242 if UTOWN = 1 and 215.733 if UTOWN = 0


Estimation of Nonlinear Relationship
Regression with Indicator Variables

The least squares estimators b1 and b2 in this indicator variable regression can be shown to be:

b1 = mean PRICE in Golden Oaks
b2 = mean PRICE in University Town − mean PRICE in Golden Oaks

In the simple regression model, an indicator variable on the right-hand side gives us a way to estimate the difference between population means
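The group-mean result can be checked numerically. A sketch with made-up prices (not the University Town / Golden Oaks data), pushing a 0/1 regressor through the ordinary least squares formulas:

```python
# Sketch: with a 0/1 regressor, the least squares estimates reduce to
# group means, as stated above. Prices here are hypothetical.
utown = [0, 0, 0, 1, 1, 1]
price = [210.0, 215.0, 220.0, 270.0, 280.0, 281.0]

n = len(utown)
xbar = sum(utown) / n
ybar = sum(price) / n
b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(utown, price)) / \
     sum((xi - xbar) ** 2 for xi in utown)
b1 = ybar - b2 * xbar

mean0 = sum(p for u, p in zip(utown, price) if u == 0) / 3
mean1 = sum(p for u, p in zip(utown, price) if u == 1) / 3
print(b1, mean0)       # b1 equals the group-0 mean
print(b1 + b2, mean1)  # b1 + b2 equals the group-1 mean; b2 is the difference
```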

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. If we calculate correctly, E[b2] = β2 always
2. E[b2] = β2 is true only if e follows a Normal distribution
3. E[b2] = β2 is true under SR1–SR5, regardless of the distribution e follows
4. E[b2] = β2 is generally not true under SR1–SR5 (although it can be)
Answer: 3

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. Var(b2) is zero because b2 is just a number, such as 10.4
2. Var(b1) is zero because b1 is just a number, such as 83
3. From the data, we have a number for e for each observation
4. After getting b1 and b2, we can get êi = yi − ŷi. ê is a random variable
Answer: 4

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. SSE = Σ_{i=1}^{N} (yi − b1 − b2 xi)² ≤ SSE* = Σ_{i=1}^{N} (yi − b1* − b2* xi)² for any b1* and b2*
2. SSE = Σ_{i=1}^{N} (yi − b1 − b2 xi)² = 0, because we get b1 and b2 in order to minimize it
3. b1 and b2 are the best estimators across all possible estimators because they have minimum variance
4. With the data, it is possible to get Var(b2)
Answer: 1

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. If we have an additional observation in our sample, Var(b2) will be smaller
2. Even if we have additional observations in our sample, Var(b2) can be larger if the added observations are bad observations
3. If all the observations we have share the same value of x, our b1 and b2 are very reliable because our observations are very consistent
4. The sample variation of x is not very important for b1 and b2 if x is collected correctly
Answer: 1

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. E[b2] = β2 is true even if we don't assume that E[e] = 0
2. Cov(b1, b2) is always a positive number
3. V̂ar(b2) can be a negative number, depending on the sample, because it is not the real value of Var(b2)
4. If x is non-random, Var(b2) is non-random
Answer: 4

Exercise Questions for Chapter 2

Let's say that we want to run the regression y = β1 + β2 x² + e. Now, we have the least squares estimators b1 and b2.
Which is the right statement?
1. The (estimated) marginal effect (slope) is dy/dx = b2
2. The value of the slope changes along x
3. The (estimated) elasticity is a constant number across x
4. If b2 > 0, then there is a negative relationship between x and the value of the slope
Answer: 2
