Econometrics_moduleI
ECONOMETRICS
A TEACHING MATERIAL FOR DISTANCE
STUDENTS MAJORING IN ECONOMICS
Module I
Prepared By:
Bedru Babulo
Seid Hassen
Department of Economics
Faculty of Business and Economics
Mekelle University
June, 2005
Mekelle
Econometrics, Module I 1
Prepared by: Bedru B. and Seid H. ( June, 2005)
Econometrics
Module I
Module I of the course covers the first three chapters. The first chapter introduces students
to the definition and some fundamental concepts of econometrics. Chapter two gives a
fairly detailed treatment of the simple classical linear regression model. In this
chapter students will be introduced to the basic logic, concepts, assumptions, estimation
methods, and interpretation of the simple classical linear regression model and its
applications in economics. Chapter three, which deals with multiple regression models,
extends the simple regression model by incorporating more than one explanatory variable
(regressor). In both chapters (chapter two and chapter three), due attention will be
given to the basics of the ordinary least squares (OLS) method of estimation and to the
statistical properties of the parameter estimates, which are summarized by the Gauss-Markov
BLUE (Best Linear Unbiased Estimator) properties.
Chapter One
Introduction
Distance students! Having given this background, we may now formally define
what econometrics is.
WHAT IS ECONOMETRICS?
Literally interpreted, econometrics means “economic measurement”, but the scope
of econometrics is much broader, as described by leading econometricians. Different
econometricians have worded their definitions differently. But if
we distill the fundamental features of all the definitions, we may obtain
the following definition.
provide explanations of the development of the various variables and it does not
provide measurements of the coefficients of economic relationships.
1. A set of variables
2. A list of fundamental relationships and
3. A number of strategic coefficients
Example: Economic theory postulates that the demand for a commodity depends
on its price, on the prices of other related commodities, on consumers’ income and
on tastes. This is an exact relationship which can be written mathematically as:
$Q = b_0 + b_1 P + b_2 P_0 + b_3 Y + b_4 t$
where $Q$ is the quantity demanded, $P$ the price of the commodity, $P_0$ the prices of
related commodities, $Y$ consumers’ income and $t$ tastes.
The above demand equation is exact. However, many more factors may affect
demand. In econometrics the influence of these ‘other’ factors is taken into
account by introducing a random variable into the economic relationship. In
our example, the demand function studied with the tools of econometrics would be
of the stochastic form:
$Q = b_0 + b_1 P + b_2 P_0 + b_3 Y + b_4 t + u$
where $u$ stands for the random factors which affect the quantity demanded.
Specification of the model is the most important and the most difficult stage of any
econometric research. It is often the weakest point of most econometric
applications. At this stage there is a high likelihood of
committing errors, that is, of specifying the model incorrectly. Some of the common
reasons for incorrect specification of econometric models are:
This stage consists of deciding whether the estimates of the parameters are
theoretically meaningful and statistically satisfactory. This stage enables the
econometrician to evaluate the results of calculations and determine the reliability
of the results. For this purpose we use various criteria which may be classified
into three groups:
i. Economic a priori criteria: These criteria are determined by economic
theory and refer to the size and sign of the parameters of economic
relationships.
ii. Statistical criteria (first-order tests): These are determined by statistical
theory and aim at the evaluation of the statistical reliability of the
estimates of the parameters of the model. The correlation coefficient test,
standard error test, t-test, F-test, and $R^2$-test are some of the most
commonly used statistical tests.
iii. Econometric criteria (second-order tests): These are set by the theory
of econometrics and aim at the investigation of whether the assumptions
of the econometric method employed are satisfied or not in any
particular case. The econometric criteria serve as a second order test (as
test of the statistical tests) i.e. they determine the reliability of the
statistical criteria; they help us establish whether the estimates have the
desirable properties of unbiasedness, consistency etc. Econometric
criteria aim at the detection of the violation or validity of the
assumptions of the various econometric techniques.
forecasting due to various factors (reasons). Therefore, this stage involves the
investigation of the stability of the estimates and their sensitivity to changes in
the size of the sample. Consequently, we must establish whether the estimated
function performs adequately outside the sample of data, i.e. we must test the
extra-sample performance of the model.
Review questions
Chapter Two
THE CLASSICAL REGRESSION ANALYSIS
[The Simple Linear Regression Model]
Economic theories are mainly concerned with the relationships among various
economic variables. These relationships, when phrased in mathematical terms, can
predict the effect of one variable on another. The functional relationships of these
variables define the dependence of one variable upon the other variable(s) in a
specific form. The specific functional form may be linear, quadratic, logarithmic,
exponential, hyperbolic, or any other form.
Assuming that the supply of a certain commodity depends on its price (other
determinants taken to be constant) and that the function is linear, the relationship
can be put as:
$Q = \alpha + \beta P$
The above relationship between P and Q is such that for a particular value of P,
there is only one corresponding value of Q. This is, therefore, a deterministic
(non-stochastic) relationship, since for each price there is always only one
corresponding quantity supplied. This implies that all the variation in the dependent
variable (Y, here Q) is due solely to changes in the explanatory variable (X, here P),
and that there are no other factors affecting the dependent variable.
If this were true all the points of price-quantity pairs, if plotted on a two-
dimensional plane, would fall on a straight line. However, if we gather
observations on the quantity actually supplied in the market at various prices and
we plot them on a diagram we see that they do not fall on a straight line.
The deviation of the observations from the line may be attributed to several
factors.
a. Omission of variables from the function
b. Random behavior of human beings
c. Imperfect specification of the mathematical form of the model
d. Error of aggregation
e. Error of measurement
In order to take into account the above sources of error, we introduce into
econometric functions a random variable, which is usually denoted by the letter ‘u’
or ‘ε’ and is called the error term, random disturbance or stochastic term of the
function, so called because u is supposed to ‘disturb’ the exact linear relationship
assumed to exist between X and Y.
The true relationship which connects the variables involved is split into two parts:
a part represented by a line and a part represented by the random term ‘u’.
The scatter of observations represents the true relationship between Y and X. The
line represents the exact part of the relationship and the deviation of the
observation from the line represents the random component of the relationship.
- Were it not for the errors in the model, we would observe all the points on the
line corresponding to . However because of the
- The first component in the bracket is the part of Y explained by the changes
in X, and the second is the part of Y not explained by X, that is to say the
change in Y due to the random influence of $u_i$.
The classicals made important assumptions in their analysis of regression. The most
important of these assumptions are discussed below.
Dear distance students! Check yourself whether the following models satisfy the
above assumption and give your answer to your tutor.
a.
b.
This means that the value which u may assume in any one period depends on
chance; it may be positive, negative or zero. Every value has a certain probability
of being assumed by u in any particular instance.
Mathematically, ………………………………..….(2.3)
For all values of X, the u’s will show the same dispersion around their mean.
In Fig. 2.c this assumption is denoted by the fact that the values that u can
assume lie within the same limits, irrespective of the value of X. For $X_1$, u
can assume any value within the range AB; for $X_2$, u can assume any value
within the range CD, which is equal to AB, and so on.
Graphically;
Mathematically;
…………………………..….(2.5)
Dear students! We can now use the above assumptions to derive the following
basic concepts.
(since )
……………………………………….(2.8)
Proof:
(Since and )
= ,Since
(from equation (2.5))
Therefore, .
2.
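Note at this point that the term in the parentheses in equations (2.8) and (2.11) is the
residual, $e_i = Y_i - \hat\alpha - \hat\beta X_i$. Hence it is possible to rewrite (2.8) and (2.11) as
$-2\sum e_i = 0$ and $-2\sum X_i e_i = 0$. It follows that
$\sum e_i = 0$ and $\sum X_i e_i = 0$.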
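If we rearrange equation (2.11) we obtain:
$\sum Y_i X_i = \hat\alpha \sum X_i + \hat\beta \sum X_i^2$ ……………………………………….(2.13)
Equations (2.9) and (2.13) are called the Normal Equations. Substituting the
value of $\hat\alpha = \bar Y - \hat\beta \bar X$ from (2.10) into (2.13), we get:
$\sum Y_i X_i = \sum X_i (\bar Y - \hat\beta \bar X) + \hat\beta \sum X_i^2$
$\hat\beta = \dfrac{\sum X_i Y_i - n \bar X \bar Y}{\sum X_i^2 - n \bar X^2}$ ………………….(2.14)
$\hat\beta = \dfrac{\sum x_i y_i}{\sum x_i^2}$, where $x_i = X_i - \bar X$ and $y_i = Y_i - \bar Y$ ……………………………………… (2.17)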
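The normal equations above can be solved numerically. The following is a minimal Python sketch, assuming the standard closed-form OLS solution in deviation form (slope = sum of cross-deviations divided by the sum of squared X-deviations, intercept = mean of Y minus slope times mean of X). The function name `ols_simple` and the data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for Y = alpha + beta*X + u, estimated by OLS."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the sample means
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sum_xx = sum((xi - x_bar) ** 2 for xi in x)
    if sum_xx == 0:
        raise ValueError("X must vary in the sample")
    beta_hat = sum_xy / sum_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical illustrative data (not taken from the module)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

alpha_hat, beta_hat = ols_simple(X, Y)
residuals = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(X, Y)]

print("alpha_hat, beta_hat:", alpha_hat, beta_hat)  # approximately 2.2 and 0.6
# Both sums below should be zero up to rounding error
print("sum of residuals:", sum(residuals))
print("sum of X * residual:", sum(xi * ei for xi, ei in zip(X, residuals)))
```

The last two printed sums check the two properties of the residuals implied by the normal equations: they sum to zero and are uncorrelated with X in the sample.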
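We minimize: $\sum e_i^2 = \sum (Y_i - \hat\alpha - \hat\beta X_i)^2$
Subject to: $\hat\alpha = 0$
The composite function then becomes
$Z = \sum (Y_i - \hat\alpha - \hat\beta X_i)^2 - \lambda \hat\alpha$, where $\lambda$ is a Lagrange multiplier.
We minimize the function with respect to $\hat\alpha$, $\hat\beta$ and $\lambda$, which gives
$\hat\beta = \dfrac{\sum X_i Y_i}{\sum X_i^2}$ ……………………………………..(2.18)
This formula involves the actual values (observations) of the variables and not
their deviation forms, as in the case of the unrestricted value of $\hat\beta$.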
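As a small illustration of the remark above, the sketch below contrasts a slope computed from the actual values of the variables with a slope computed from deviations. It assumes the restricted case is a regression line forced through the origin (zero intercept), which is an assumption about the lost formula; the function names and data are hypothetical.

```python
def slope_through_origin(x, y):
    """Restricted slope (zero intercept assumed): uses actual values of X and Y."""
    sum_xx = sum(xi * xi for xi in x)
    if sum_xx == 0:
        raise ValueError("X must not be identically zero")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum_xx


def slope_unrestricted(x, y):
    """Unrestricted OLS slope: uses deviations of X and Y from their means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xx = sum((xi - x_bar) ** 2 for xi in x)
    if sum_xx == 0:
        raise ValueError("X must vary in the sample")
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum_xx


# Hypothetical illustrative data
X = [1, 2, 3]
Y = [2, 4, 7]

print("restricted slope:  ", slope_through_origin(X, Y))  # 31/14
print("unrestricted slope:", slope_unrestricted(X, Y))    # 5/2
```

The two slopes differ because imposing a zero intercept changes the line being fitted; they coincide only in special cases, for example when both sample means are zero.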
econometric methods, the one that gives ‘good’ estimates? We need some criteria
for judging the ‘goodness’ of an estimate.
According to the Gauss-Markov theorem, under the basic assumptions of the classical linear
regression model, the least squares estimators are linear, unbiased and have
minimum variance (i.e. they are the best of all linear unbiased estimators). Sometimes the
theorem is referred to as the BLUE theorem, i.e. Best Linear Unbiased Estimator. An
estimator is called BLUE if it is:
a. Linear: a linear function of a random variable, such as the
dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population
parameter.
c. Best: it has the minimum variance in the class of linear unbiased estimators.
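$\hat\beta = \dfrac{\sum x_i y_i}{\sum x_i^2} = \dfrac{\sum x_i (Y_i - \bar Y)}{\sum x_i^2} = \dfrac{\sum x_i Y_i}{\sum x_i^2}$ (but $\sum x_i = 0$)
$\hat\beta = \dfrac{\sum x_i Y_i}{\sum x_i^2}$; now, let $k_i = \dfrac{x_i}{\sum x_i^2}$, so that $\hat\beta = \sum k_i Y_i$
$\hat\beta$ is linear in Y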
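b. Unbiasedness:
Proposition: $\hat\alpha$ and $\hat\beta$ are the unbiased estimators of the true parameters $\alpha$ and $\beta$.
From your statistics course, you may recall that if $\hat\theta$ is an estimator of $\theta$ then
$E(\hat\theta) - \theta$ is the amount of bias, and if $\hat\theta$ is the unbiased estimator of $\theta$ then bias = 0,
i.e. $E(\hat\theta) - \theta = 0 \Rightarrow E(\hat\theta) = \theta$.
In our case, $\hat\alpha$ and $\hat\beta$ are estimators of the true parameters $\alpha$ and $\beta$. To show that they
are the unbiased estimators of their respective parameters means to prove that:
$E(\hat\beta) = \beta$ and $E(\hat\alpha) = \alpha$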
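We know that $\hat\beta = \sum k_i Y_i = \sum k_i (\alpha + \beta X_i + u_i)$
$= \alpha \sum k_i + \beta \sum k_i X_i + \sum k_i u_i$,
but $\sum k_i = \dfrac{\sum x_i}{\sum x_i^2} = 0$ …………………………………………………………………(2.20)
and $\sum k_i X_i = \dfrac{\sum x_i X_i}{\sum x_i^2} = 1$ ……………………………………………(2.21)
so that $\hat\beta = \beta + \sum k_i u_i$ and
$E(\hat\beta) = \beta + \sum k_i E(u_i)$, since the $k_i$ are fixed
$E(\hat\beta) = \beta$, since $E(u_i) = 0$
Therefore, $\hat\beta$ is an unbiased estimator of $\beta$.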
Proof (2): prove that $\hat\alpha$ is unbiased, i.e. $E(\hat\alpha) = \alpha$
From the proof of linearity property under 2.2.2.3 (a), we know that:
, Since
……………………(2.23)
$\hat\alpha$ is an unbiased estimator of $\alpha$.
c. Minimum variance of $\hat\alpha$ and $\hat\beta$
Now, we have to establish that out of the class of linear and unbiased estimators of
$\alpha$ and $\beta$, $\hat\alpha$ and $\hat\beta$ possess the smallest sampling variances. For this, we shall first
obtain the variances of $\hat\alpha$ and $\hat\beta$ and then establish that each has the minimum variance
in comparison with the variances of other linear and unbiased estimators obtained by
any econometric method other than OLS.
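a. Variance of $\hat\beta$
$\operatorname{var}(\hat\beta) = E[\hat\beta - E(\hat\beta)]^2 = E(\hat\beta - \beta)^2$ ……………………………………(2.25)
Substitute (2.22) in (2.25) and we get
$\operatorname{var}(\hat\beta) = E\left(\sum k_i u_i\right)^2 = \sum k_i^2 E(u_i^2) + 2\sum_{i<j} k_i k_j E(u_i u_j) = \sigma^2 \sum k_i^2$ (since $E(u_i u_j) = 0$ for $i \neq j$)
$\sum k_i^2 = \dfrac{\sum x_i^2}{(\sum x_i^2)^2} = \dfrac{1}{\sum x_i^2}$, and therefore,
$\operatorname{var}(\hat\beta) = \dfrac{\sigma^2}{\sum x_i^2}$ ……………………………………………..(2.26)
b. Variance of $\hat\alpha$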
, Since
, Since
Again:
…………………………………………(2.28)
Dear student! We have computed the variances of the OLS estimators. Now, it is time to
check whether the OLS estimators possess the minimum variance
property compared to the variances of other estimators of the true $\alpha$ and $\beta$, other
than $\hat\alpha$ and $\hat\beta$.
To establish that $\hat\alpha$ and $\hat\beta$ possess the minimum variance property, we compare their
variances with the variances of some other alternative linear and unbiased
estimators of $\alpha$ and $\beta$, say $\alpha^*$ and $\beta^*$. Now, we want to prove that any
other linear and unbiased estimator of the true population parameter obtained from
any other econometric method has a larger variance than the OLS estimators.
Let’s first show the minimum variance of $\hat\beta$ and then that of $\hat\alpha$.
1. Minimum variance of $\hat\beta$
Suppose $\beta^*$ is an alternative linear and unbiased estimator of $\beta$, and;
Let $\beta^* = \sum w_i Y_i$ ………………………………(2.29)
where $w_i \neq k_i$; but: $w_i = k_i + c_i$
Since
,since
Since $\beta^*$ is assumed to be an unbiased estimator of $\beta$, it must be true that
$\sum w_i = 0$ and $\sum w_i X_i = 1$ in the above
equation.
But,
Therefore, since
Again
Since .
From these values we can derive
Since
Thus, from the above calculations we can summarize the following results.
since
But,
Since
Therefore,
2. Minimum Variance of $\hat\alpha$
We take a new estimator $\alpha^*$, which we assume to be a linear and unbiased
estimator of $\alpha$. The least squares estimator $\hat\alpha$ is given by:
By analogy with the proof of the minimum variance property of $\hat\beta$, let’s use
the weights $w_i = c_i + k_i$. Consequently;
,Since
but
, Since
Therefore, we have proved that the least squares estimators of the linear regression
model are best, linear and unbiased (BLU) estimators.
…………………………………..2.30
,
Proof:
……………………………………………………………(2.31)
……………………………………………………………(2.32)
Summing (2.31) over the sample gives the following expression:
………………………………………………(2.34)
From (2.34):
………………………………………………..(2.35)
Where the y’s are in deviation form.
Now, we have to express $y_i$ and $\hat y_i$ in other forms, as derived below.
From:
We get, by subtraction
…………………………………………………….(2.36)
Note that we assumed earlier that $E(u_i) = 0$, i.e. in taking a very large number of
samples we expect u to have a mean value of zero, but in any particular single
sample $\bar u$ is not necessarily zero.
Similarly: From;
We get, by subtraction
…………………………………………………………….(2.37)
Substituting (2.36) and (2.37) in (2.35) we get
Summing the squares of the residuals over the n sample values yields:
since
……………………………………………..(2.39)
b.
Given that the X’s are fixed in all samples and we know that
Hence
……………………………………………(2.40)
c. -2
= -2
But from (2.22) , and substitute it in the above expression, we
will get:
-2
= -2 ,since
…………………………………………………….(2.41)
Consequently, equation (2.38) can be written in terms of (2.39), (2.40) and (2.41)
as follows: ………………………….(2.42)
From which we get
………………………………………………..(2.43)
Since
Dear student! The conclusion that we can derive from the above proof is that we
= ……………………………………(2.44)
……………………………(2.45)
Figure ‘d’. Actual and estimated values of the dependent variable Y.
As can be seen from Fig. (d) above, $Y - \bar Y$ measures the variation of the
sample observation value of the dependent variable around the mean. However,
the variation in Y that can be attributed to the influence of X (i.e. the regression line)
is given by the vertical distance $\hat Y - \bar Y$. The part of the total variation in Y about
$\bar Y$ that can’t be attributed to X is equal to $e = Y - \hat Y$, which is referred to as the residual
variation.
In summary:
$e_i = Y_i - \hat Y$ = deviation of the observation $Y_i$ from the regression line.
$y_i = Y - \bar Y$ = deviation of Y from its mean.
$\hat y = \hat Y - \bar Y$ = deviation of the regressed (predicted) value ($\hat Y$) from the mean.
Now, we may write the observed Y as the sum of the predicted value ($\hat Y$) and the
residual term ($e_i$):
$Y_i = \hat Y_i + e_i$
From equation (2.34) we can have the above equation but in deviation form,
$y = \hat y + e$. By squaring and summing both sides, we obtain the following
expression:
But =
(but )
………………………………………………(2.46)
Therefore;
………………………………...(2.47)
OR,
i.e
……………………………………….(2.48)
Mathematically, the explained variation as a percentage of the total variation is
expressed as:
……………………………………….(2.49)
From equation (2.37) we have $\hat y = \hat\beta x$. Squaring and summing both sides gives us
…………………………………(2.51)
, Since
………………………………………(2.52)
ESS/TSS = $r^2$
………………………….…………(2.55)
The limit of $R^2$: The value of $R^2$ falls between zero and one, i.e. $0 \le R^2 \le 1$.
Interpretation of $R^2$
Suppose $R^2 = 0.9$; this means that the regression line gives a good fit to the
observed data, since this line explains 90% of the total variation of the Y values
around their mean. The remaining 10% of the total variation in Y is unaccounted
for by the regression line and is attributed to the factors included in the disturbance
variable $u_i$.
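The decomposition of the total variation and the ratio ESS/TSS can be computed as in the following minimal Python sketch. It assumes the usual definitions (TSS is the sum of squared deviations of Y from its mean, ESS is the squared slope times the sum of squared X-deviations, RSS is the remainder); the function name and data are hypothetical.

```python
def variation_decomposition(x, y):
    """Return (TSS, ESS, RSS) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sum_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sum_xx = sum((xi - x_bar) ** 2 for xi in x)
    if sum_xx == 0:
        raise ValueError("X must vary in the sample")
    beta_hat = sum_xy / sum_xx
    tss = sum((yi - y_bar) ** 2 for yi in y)  # total variation in Y
    ess = beta_hat ** 2 * sum_xx              # variation explained by X
    rss = tss - ess                           # unexplained (residual) variation
    return tss, ess, rss


# Hypothetical illustrative data
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]

tss, ess, rss = variation_decomposition(X, Y)
r_squared = ess / tss
print("TSS, ESS, RSS:", tss, ess, rss)
print("R^2:", r_squared)  # approximately 0.6: about 60% of the variation is explained
```

With these data roughly 60% of the variation of Y around its mean is explained by the regression line and the remaining 40% is left to the disturbance term.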
Check yourself question:
a. Show that $0 \le R^2 \le 1$.
b. Show that the square of the coefficient of correlation is equal to ESS/TSS.
Exercise:
Suppose $r_{yx}$ is the correlation coefficient between Y and X and is given by:
And let $r^2_{y\hat y}$ be the square of the correlation coefficient between Y and $\hat Y$,
given by:
For the purpose of estimating the parameters the assumption of normality is not
used, but we use this assumption to test the significance of the parameter
estimators, because the testing procedures are based on the
normality assumption of the disturbance term. Hence, before we discuss
the various testing methods, it is important to see whether the parameter estimators are
normally distributed or not.
We have already assumed that the error term is normally distributed with mean
zero and variance $\sigma^2$, i.e. $u_i \sim N(0, \sigma^2)$. Similarly, we also proved that
$Y_i \sim N(\alpha + \beta X_i, \sigma^2)$. Now, we want to show the following:
1.
2.
All of these testing procedures reach the same conclusion. Let us now see these
testing methods one by one.
i) Standard error test
This test helps us decide whether the estimates $\hat\alpha$ and $\hat\beta$ are significantly different
from zero, i.e. whether the sample from which they have been estimated might
have come from a population whose true parameters are zero.
Formally we test the null hypothesis $H_0: \beta = 0$
against the alternative hypothesis $H_1: \beta \neq 0$
The standard error test may be outlined as follows.
First: Compute the standard errors of the parameters.
influence the dependent variable Y and should not be included in the function,
since the conducted test provided evidence that changes in X leave Y unaffected.
In other words, acceptance of $H_0$ implies that the relationship between Y and X is
in fact $Y = \alpha + (0)X = \alpha$, i.e. there is no relationship between X and Y.
Numerical example: Suppose that from a sample of size n=30, we estimate the
following supply function.
Test the significance of the slope parameter at 5% level of significance using the
standard error test.
sample size
Where:
SE = standard error
k = number of parameters in the model.
Since we have two parameters in simple linear regression with intercept different
from zero, our degrees of freedom are n-2. As with the standard error test, we formally
test the hypothesis $H_0: \beta = 0$ against the alternative $H_1: \beta \neq 0$ for the slope
parameter, and $H_0: \alpha = 0$ against the alternative $H_1: \alpha \neq 0$ for the intercept.
Example:
If we have $H_0: \beta = 0$
against: $H_1: \beta \neq 0$
Then this is a two-tail test. If the level of significance is 5%, divide it by two to
obtain the critical value of t from the t-table.
Step 4: Obtain the critical value of t, called $t_c$, at half the level of significance and n-2
degrees of freedom for a two-tail test.
Step 5: Compare t* (the computed value of t) and $t_c$ (the critical value of t).
If |t*| > $t_c$, reject $H_0$ and accept $H_1$. The conclusion is that $\hat\beta$ is statistically
significant.
If |t*| < $t_c$, accept $H_0$ and reject $H_1$. The conclusion is that $\hat\beta$ is statistically
insignificant.
Numerical Example:
Suppose that from a sample of size n=20 we estimate the following consumption
function:
The values in the brackets are standard errors. We want to test the null hypothesis
$H_0: \beta = 0$ against the alternative $H_1: \beta \neq 0$ using the t-test at the 5% level of
significance.
a. The t-value for the test statistic is: $t^* = \hat\beta / SE(\hat\beta) \approx 3.3$
b. The critical value of ‘t’ is taken at 0.025 (half of 5%) and 18 degrees of freedom (df),
i.e. (n-2 = 20-2). From the t-table, ‘$t_c$’ at the 0.025 level of significance and 18 df is 2.10.
c. Since t*=3.3 and $t_c$=2.1, t* > $t_c$. This implies that $\hat\beta$ is statistically significant.
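The comparison of t* with the critical value can be written as a short Python sketch. The critical value 2.10 (18 degrees of freedom, two-tail 5% test) is taken from the example above; the coefficient 0.70 and standard error 0.21 are hypothetical values chosen only because they give t* of about 3.3, and the use of the absolute value of t* for the two-tail test is an assumption of this sketch.

```python
def t_test(estimate, std_error, t_critical, hypothesized=0.0):
    """Two-tail t-test of H0: parameter == hypothesized.

    Returns (t_star, reject_h0). The critical value must be looked up
    in a t-table by the user; it is not computed here.
    """
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    t_star = (estimate - hypothesized) / std_error
    reject_h0 = abs(t_star) > t_critical
    return t_star, reject_h0


# Hypothetical slope estimate and standard error; t_c = 2.10 from the t-table
t_star, reject_h0 = t_test(estimate=0.70, std_error=0.21, t_critical=2.10)
print("t* =", round(t_star, 2))
print("reject H0 (estimate is statistically significant):", reject_h0)
```

Keeping the table lookup outside the function avoids relying on any statistical library; the same function can be reused for the intercept by passing its estimate and standard error.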
In order to determine how close the estimate is to the true parameter, we must construct
a confidence interval for the true parameter; in other words, we must establish
limiting values around the estimate within which the true parameter is expected to
lie with a certain “degree of confidence”. In this respect we say that with a
given probability the population parameter will be within the defined confidence
interval (confidence limits).
i.e. …………………………………………(2.57)
but …………………………………………………….(2.58)
………………………………………..(2.59)
The limit within which the true $\beta$ lies at the chosen degree of confidence is:
$\hat\beta \pm SE(\hat\beta)\, t_c$; where $t_c$ is the critical value of t for the chosen confidence
level and n-2 degrees of freedom.
The test procedure is outlined as follows.
Decision rule: If the hypothesized value of $\beta$ in the null hypothesis is within the
confidence interval, accept $H_0$ and reject $H_1$. The implication is that $\hat\beta$ is
statistically insignificant; while if the hypothesized value of $\beta$ in the null
hypothesis is outside the limit, reject $H_0$ and accept $H_1$. This indicates that $\hat\beta$ is
statistically significant.
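The decision rule above can be sketched in Python as follows. The sketch assumes the usual interval, estimate plus or minus the critical t-value times the standard error; the numbers (estimate 0.70, standard error 0.21, critical value 2.10) and function names are hypothetical and serve only as an illustration.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) limits: estimate -/+ t_critical * std_error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


def is_significant(estimate, std_error, t_critical, hypothesized=0.0):
    """Reject H0 when the hypothesized value lies outside the interval."""
    lower, upper = confidence_interval(estimate, std_error, t_critical)
    return not (lower <= hypothesized <= upper)


# Hypothetical values: slope 0.70, standard error 0.21, t_c = 2.10 (18 df, 95%)
lower, upper = confidence_interval(0.70, 0.21, 2.10)
print("95% confidence interval:", round(lower, 3), "to", round(upper, 3))
print("significant (zero lies outside the interval):", is_significant(0.70, 0.21, 2.10))
```

Because the interval and the two-tail t-test use the same critical value, they lead to the same accept/reject decision.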
Numerical Example:
Suppose we have estimated the following regression line from a sample of 20
observations.
below the parameter estimates are the standard errors. Some econometricians
report the t-values of the estimated coefficients in place of the standard errors.
Review Questions
1. Econometrics deals with the measurement of economic relationships which are stochastic
or random. The simplest form of economic relationships between two variables X and Y
can be represented by:
i) Estimate the regression line of sale on price and interpret the results
ii) What is the part of the variation in sales which is not explained by the
regression line?
iii) Estimate the price elasticity of sales.
5. The following table includes the GNP (X) and the demand for food (Y) for a
country over a ten-year period.
Year  1980  1981  1982  1983  1984  1985  1986  1987  1988  1989
Y        6     7     8    10     8     9    10     9    11    10
X       50    52    55    59    57    58    62    65    68    70
a. Estimate the food function
b. Compute the coefficient of determination and find the explained and unexplained
variation in the food expenditure.
c. Compute the standard error of the regression coefficients and conduct test of
significance at the 5% level of significance.
6. A sample of 20 observations corresponding to the regression model gave
the following data.
a. Estimate
b. Calculate the variance of our estimates
c. Estimate the conditional mean of Y corresponding to a value of X fixed at X=10.
7. Suppose that a researcher estimates a consumption function and obtains the following
results:
where C = consumption, Yd = disposable income, and the numbers in parentheses are the ‘t-ratios’
a. Test the significance of Yd statistically using the t-ratios
b. Determine the estimated standard deviations of the parameter estimates
Chapter Three
THE CLASSICAL REGRESSION ANALYSIS
[The Multiple Linear Regression Model]
3.1 Introduction
In simple regression we study the relationship between a dependent variable and a
single explanatory (independent) variable. But it is rarely the case that economic
relationships involve just two variables. Rather, a dependent variable Y can
depend on a whole series of explanatory variables or regressors. For instance, in
demand studies we study the relationship between the quantity demanded of a good
and the price of the good, the price of substitute goods and the consumer’s income. The
model we assume is:
$Y_i = \beta_0 + \beta_1 P_1 + \beta_2 P_2 + \beta_3 X_i + u_i$ -------------------- (3.1)
Where $Y_i$ is the quantity demanded, $P_1$ is the price of the good, $P_2$ is the price of substitute
goods, $X_i$ is consumer’s income, the $\beta$’s are unknown parameters and $u_i$ is the
disturbance.
Equation (3.1) is a multiple regression with three explanatory variables. In general,
for K explanatory variables we can write the model as follows:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$ ------- (3.2)
Where $X_{ki}$ $(k = 1, 2, \dots, K)$ are explanatory variables, $Y_i$ is the dependent
We can’t exhaustively list all the assumptions, but the above assumptions are some
of the basic assumptions that enable us to proceed with our analysis.
……………………………………….(3.6)
is sample relation between .
…………………………………..(3.7)
To obtain expressions for the least squares estimators, we partially differentiate
$\sum e_i^2$ with respect to $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$ and set the partial derivatives equal to zero.
………………………. (3.8)
……………………. (3.9)
………… ………..(3.10)
------------------------------------------------- (3.14)
Substituting (3.14) in (3.12) , we get:
------- (3.15)
We know that
Substituting the above equations in equation (3.14), the normal equation (3.12) can
be written in deviation form as follows:
…………………………………………(3.16)
Using the above procedure if we substitute (3.14) in (3.13), we get
………………………………………..(3.17)
Let’s bring (3.16) and (3.17) together
……………………………………….(3.18)
……………………………………….(3.19)
…………………………..…………….. (3.21)
………………….……………………… (3.22)
------------------------------------- (3.25)
since
----------------- (3.26)
----------------------------------(3.27)
--------------------------------(3.28)
This measure does not always go up when a variable is added, because of the
degrees of freedom term n-k in the numerator. As the number of variables k
increases, RSS goes down, but so does n-k. The effect on $\bar R^2$ depends on the
amount by which RSS falls. While solving one problem, this corrected measure of
goodness of fit unfortunately introduces another one. It loses its interpretation;
$\bar R^2$ is no longer the percent of variation explained. This modified $\bar R^2$ is sometimes
used and misused as a device for selecting the appropriate set of explanatory
variables.
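The behaviour described above can be checked numerically. The sketch below assumes the common textbook form of the adjusted measure, 1 - (1 - R²)(n - 1)/(n - k), with k counting all estimated parameters including the intercept; the function name and the R² values are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k estimated parameters
    (k includes the intercept; this convention is an assumption)."""
    if n <= k:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - r2) * (n - 1) / (n - k)


# Hypothetical case: adding a fourth parameter raises R^2 only slightly
before = adjusted_r2(0.900, n=20, k=3)
after = adjusted_r2(0.901, n=20, k=4)
print("adjusted R^2 with 3 parameters:", round(before, 4))
print("adjusted R^2 with 4 parameters:", round(after, 4))
print("adjusted R^2 fell after adding a variable:", after < before)
```

Here the plain R² rises from 0.900 to 0.901, but the loss of one degree of freedom outweighs the small fall in RSS, so the adjusted measure goes down.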
3.4.General Linear Regression Model and Matrix Approach
So far we have discussed the regression models containing one or two explanatory
variables. Let us now generalize the model assuming that it contains k variables.
It will be of the form:
With respect to
The partial derivatives are equated to zero to obtain the normal equations.
……………………………………………………..
The general form of the above equations (except first ) may be written as:
; where
: : : : :
: : : : :
Solving the above normal equations directly is algebraically cumbersome, but we
can solve them easily using matrix algebra. Hence in the next section we will discuss the
matrix approach to the linear regression model.
…………………………………………………...
In short, $Y = X\beta + U$ ……………………………………………………(3.29)
The orders of the matrices and vectors involved are:
………………….…(3.30)
Since $\hat\beta' X' Y$ is a scalar (1x1), it is equal to its transpose;
-------------------------------------(3.31)
…….……………………………... (3.34)
since
since
Thus, least square estimators are unbiased.
3. Minimum variance
Before showing that all the OLS estimators are best (possess the minimum variance
property), it is important to derive their variances.
We know that,
The above matrix is a symmetric matrix containing variances along its main
diagonal and covariances of the estimators everywhere else. This matrix is,
therefore, called the variance-covariance matrix of the least squares estimators of the
regression coefficients. Thus,
……………………………………………(3.35)
From (3.15)
………………………………………………(3.36)
Substituting (3.17) in (3.16)
var($\hat\beta$) $= \sigma^2 (X'X)^{-1}$ ………………………………………….……..(3.37)
Note: ($\sigma^2$, being a scalar, can be moved in front of or behind a matrix, while the identity
matrix can be suppressed).
Thus we obtain,
Where,
We can, therefore, obtain the variance of any estimator, say $\hat\beta_i$, by taking the ith term
from the principal diagonal of $(X'X)^{-1}$ and then multiplying it by $\sigma^2$.
Where the X’s are in their absolute form. When the x’s are in deviation form we
can write the multiple regression in matrix form as ;
where = and
The above column matrix doesn’t include the constant term $\hat\beta_0$. Under such
conditions the variances of the slope parameters in deviation form can be written as:
var($\hat\beta$) $= \sigma^2 (x'x)^{-1}$ …………………………………………………….(3.38)
(the proof is the same as (3.37) above). In general we can illustrate the variance of
the parameters by taking two explanatory variables.
The multiple regression with two explanatory variables, written in deviation form,
is
In this model;
and
Or
i.e., ……………………………………(3.39)
…………………………………….
(3.41)
The only unknown part in the variances and covariances of the estimators is $\sigma^2$.
In the above model we have three parameters including the constant term and
………………………
(3.42) this is for k explanatory variables. For two explanatory variables
………………………………………...(3.43)
This is all about the variance covariance of the parameters. Now it is time to see
the minimum variance property.
Minimum variance of $\hat\beta$
To show that all the $\hat\beta_i$ in the vector $\hat\beta$ are best estimators, we also have to prove
that the variances obtained in (3.37) are the smallest among all possible
linear unbiased estimators. We follow the same procedure as in the case of the
single explanatory variable model, where we first assumed an alternative linear
unbiased estimator and then established that its variance is greater than that of the
OLS estimator of the regression model.
Assume that $\beta^*$ is an alternative unbiased and linear estimator of $\beta$.
……………………………………….(3.45)
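A sketch of the standard argument behind this step, in the notation of (3.37). The symbols $\beta^*$ (the alternative estimator) and $B$ (a matrix of constants conformable with $(X'X)^{-1}X'$) are assumed names, and the steps may differ in detail from the authors' own derivation.

```latex
\begin{aligned}
\beta^* &= \left[(X'X)^{-1}X' + B\right]Y
         = \left[(X'X)^{-1}X' + B\right](X\beta + U) \\
        &= \beta + BX\beta + \left[(X'X)^{-1}X' + B\right]U \\
E(\beta^*) &= \beta + BX\beta
  \quad\Longrightarrow\quad \text{unbiasedness requires } BX = 0 \\
\operatorname{var}(\beta^*) &= E\left[(\beta^* - \beta)(\beta^* - \beta)'\right] \\
  &= \left[(X'X)^{-1}X' + B\right] E(UU') \left[X(X'X)^{-1} + B'\right] \\
  &= \sigma_u^2\left[(X'X)^{-1} + (X'X)^{-1}X'B' + BX(X'X)^{-1} + BB'\right] \\
  &= \sigma_u^2 (X'X)^{-1} + \sigma_u^2 BB'
   = \operatorname{var}(\hat\beta) + \sigma_u^2 BB'
\end{aligned}
```

Because $BB'$ is positive semi-definite, its diagonal elements are non-negative, so the variance of each element of $\beta^*$ is at least as large as the variance of the corresponding OLS estimator.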
……………………………………...……..(3.46)
……………………………………………….(3.47)
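We know,
$$y_i = Y_i - \bar Y, \qquad \sum y_i^2 = \sum Y_i^2 - n\bar Y^2$$
In matrix notation
$$\sum y_i^2 = Y'Y - n\bar Y^2 \qquad (3.48)$$
Equation (3.48) gives the total sum of squares (the total variation) in the model.
Explained sum of squares
$$ESS = \sum y_i^2 - \sum e_i^2 = Y'Y - n\bar Y^2 - e'e = \hat\beta'X'Y - n\bar Y^2 \qquad (3.49)$$
Since $R^2 = \dfrac{ESS}{TSS}$,
$$R^2 = \frac{\hat\beta'X'Y - n\bar Y^2}{Y'Y - n\bar Y^2} \qquad (3.50)$$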
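The matrix expressions for the total and explained sums of squares can be evaluated directly. The sketch below is a hedged illustration using the ten observations of the numerical example in this chapter, with the X's in absolute form and a column of ones for the intercept; the helper `solve` is a hypothetical name for a small exact Gauss-Jordan routine, not part of the module.

```python
from fractions import Fraction

Y = [49, 40, 41, 46, 52, 59, 53, 61, 55, 64]
X1 = [35, 35, 38, 40, 40, 42, 44, 46, 50, 50]
X2 = [53, 53, 50, 64, 70, 68, 59, 73, 59, 71]
X3 = [200, 212, 211, 212, 203, 194, 194, 188, 196, 190]


def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination (exact)."""
    size = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(size):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[size] for row in m]


n = len(Y)
X = [[1, a, b, c] for a, b, c in zip(X1, X2, X3)]  # first column: intercept
k = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(k)]
beta = solve(XtX, XtY)  # normal equations: (X'X) beta = X'Y

y_bar = Fraction(sum(Y), n)
tss = sum(y * y for y in Y) - n * y_bar ** 2                  # Y'Y - n*Ybar^2
ess = sum(b * v for b, v in zip(beta, XtY)) - n * y_bar ** 2  # beta'X'Y - n*Ybar^2
r_squared = ess / tss
print(float(tss), float(ess), float(r_squared))
```

Working in absolute form needs the correction term $n\bar Y^2$ in both sums; in deviation form the same quantities are $\sum y_i^2$ and $\hat\beta'x'y$.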
Dear Students! We hope that from the discussion made so far on the multiple
regression model, in general, you may make the following summary of results.
(i) Model: $Y = X\beta + U$
(ii) Estimators: $\hat\beta = (X'X)^{-1}X'Y$
(iii) Statistical properties: BLUE
(iv) Variance-covariance: $\operatorname{var}(\hat\beta) = \sigma_u^2 (X'X)^{-1}$
(v) Estimation of (e'e): $e'e = Y'Y - \hat\beta'X'Y$
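Consider the model with two explanatory variables, $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$, and the following hypotheses:
A. $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$
B. $H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$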
The null hypothesis (A) states that, holding X2 constant, X1 has no (linear)
influence on Y. Similarly, hypothesis (B) states that, holding X1 constant, X2 has no
influence on the dependent variable Yi. To test these null hypotheses we will use
the following tests:
i- Standard error test: under this and the following testing methods we
test only for $\hat\beta_1$. The test for $\hat\beta_2$ will be done in the same way.
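$$SE(\hat\beta_1) = \sqrt{\operatorname{var}(\hat\beta_1)} = \sqrt{\frac{\hat\sigma_u^2 \sum x_{2i}^2}{\sum x_{1i}^2 \sum x_{2i}^2 - (\sum x_{1i}x_{2i})^2}}\,; \quad \text{where } \hat\sigma_u^2 = \frac{\sum e_i^2}{n-3}$$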
In this section we extend this idea to a joint test of the relevance of all the included
explanatory variables. Now consider the following:
The test procedure for any set of hypotheses can be based on a comparison of the
sum of squared errors from the original, unrestricted multiple regression
model with the sum of squared errors from a regression model in which the null
hypothesis is assumed to be true. When a null hypothesis is assumed to be true,
we in effect place conditions, or constraints, on the values that the parameters can
take, and the sum of squared errors cannot decrease (in general it increases). The idea of the test is that if these
sums of squared errors are substantially different, then the assumption that the joint
null hypothesis is true has significantly reduced the ability of the model to fit the
data, and the data do not support the null hypothesis.
If the null hypothesis is true, we expect the data to be compatible with the
conditions placed on the parameters. Thus, there would be little change in the sum
of squared errors when the null hypothesis is assumed to be true.
Let the restricted residual sum of squares (RRSS) be the sum of squared errors
in the model obtained by assuming that the null hypothesis is true, and let the
unrestricted residual sum of squares (URSS) be the sum of squared errors of the
original unrestricted model. It is always true that RRSS - URSS ≥ 0.
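Consider the multiple regression model that contains an intercept and all of the explanatory variables.
This model is called unrestricted. The joint hypothesis to be tested is:
$H_0$: all the slope coefficients are simultaneously zero
$H_1$: at least one slope coefficient is different from zero
We know that:
$$e_i = Y_i - \hat Y_i, \qquad \sum e_i^2 = \sum (Y_i - \hat Y_i)^2$$
This sum of squared errors is called the unrestricted residual sum of squares (URSS).
This is the case when the null hypothesis is not true. If the null hypothesis is
assumed to be true, i.e. when all the slope coefficients are zero, the model reduces to
$$Y_i = \beta_0 + U_i$$
$$\hat\beta_0 = \frac{\sum Y_i}{n} = \bar Y \quad \text{(applying OLS)} \qquad (3.52)$$
$$e_i = Y_i - \hat\beta_0 \quad \text{but} \quad \hat\beta_0 = \bar Y, \quad \text{so} \quad e_i = Y_i - \bar Y$$
$$\sum e_i^2 = \sum (Y_i - \bar Y)^2 = \sum y_i^2 = TSS$$
The sum of squared errors when the null hypothesis is assumed to be true is called
the restricted residual sum of squares (RRSS), and this is equal to the total sum of
squares (TSS).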
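That the restricted model reproduces the total sum of squares can be verified with a few lines. This is a minimal sketch using the ten Y values of the numerical example in this chapter; the variable names are illustrative.

```python
Y = [49, 40, 41, 46, 52, 59, 53, 61, 55, 64]
n = len(Y)

# Restricted model: Y_i = beta0 + U_i. OLS on a constant alone gives the mean.
beta0_restricted = sum(Y) / n
rrss = sum((y - beta0_restricted) ** 2 for y in Y)

# Total sum of squares computed independently as sum(Y^2) - n * Ybar^2.
tss = sum(y * y for y in Y) - n * (sum(Y) / n) ** 2

print(beta0_restricted, rrss, tss)
```

Both routes give the same number, which is why the F statistic for this particular null hypothesis can be written with TSS in place of RRSS.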
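The ratio
$$F = \frac{(RRSS - URSS)/(k-1)}{URSS/(n-k)}$$
has an F-distribution with k-1 and n-k degrees of freedom for the numerator and denominator respectively, where k is the number of estimated parameters including the intercept. Since RRSS = TSS and URSS = RSS,
$$F = \frac{(TSS - RSS)/(k-1)}{RSS/(n-k)} = \frac{ESS/(k-1)}{RSS/(n-k)} \qquad (3.54)$$
Dividing the numerator and the denominator by TSS,
$$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} \qquad (3.55)$$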
This implies that the computed value of F can be calculated either from ESS and
RSS or from R² and 1-R². If the null hypothesis is not true, then the difference between
RRSS and URSS (TSS and RSS) becomes large, implying that the constraints
placed on the model by the null hypothesis have a large effect on the ability of the
model to fit the data, and the value of F tends to be large. Thus, we reject the null
hypothesis if the F test statistic becomes too large. This value is compared with the
critical value of F which leaves a probability of $\alpha$ in the upper tail of the
F-distribution with k-1 and n-k degrees of freedom.
If the computed value of F is greater than the critical value of F(k-1, n-k), then the
parameters of the model are jointly significant, or the dependent variable Y is
linearly related to the independent variables included in the model.
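The two equivalent ways of computing F, and the decision rule, can be sketched as follows. The function names and the values of ESS and RSS are hypothetical; n = 25 and k = 3 are chosen so that the degrees of freedom are (2, 22), for which the 5% critical value quoted in this chapter is 3.44.

```python
def f_from_sums(ess, rss, n, k):
    """F = [ESS/(k-1)] / [RSS/(n-k)], k = number of parameters incl. intercept."""
    return (ess / (k - 1)) / (rss / (n - k))


def f_from_r_squared(r2, n, k):
    """F = [R^2/(k-1)] / [(1-R^2)/(n-k)], the same statistic written with R^2."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


def reject_null(f_computed, f_critical):
    """Reject H0 (all slopes zero) when the computed F exceeds the critical F."""
    return f_computed > f_critical


# Hypothetical sums of squares, for illustration only.
ess, rss, n, k = 80.0, 20.0, 25, 3
f_sums = f_from_sums(ess, rss, n, k)
f_r2 = f_from_r_squared(ess / (ess + rss), n, k)
print(f_sums, f_r2, reject_null(f_sums, 3.44))
```

The critical value is passed in as an argument because it comes from an F table; the sketch does not compute it.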
Table 2.1. Numerical example for the computation of the OLS estimators.
  n    Y   X1   X2    X3    y   x1   x2   x3   y²  x1x2  x2x3  x1x3  x1²  x2²  x3²  x1y  x2y   x3y
  1   49   35   53   200   -3   -7   -9    0    9    63     0     0   49   81    0   21   27     0
  2   40   35   53   212  -12   -7   -9   12  144    63  -108   -84   49   81  144   84  108  -144
  3   41   38   50   211  -11   -4  -12   11  121    48  -132   -44   16  144  121   44  132  -121
  4   46   40   64   212   -6   -2    2   12   36    -4    24   -24    4    4  144   12  -12   -72
  5   52   40   70   203    0   -2    8    3    0   -16    24    -6    4   64    9    0    0     0
  6   59   42   68   194    7    0    6   -6   49     0   -36     0    0   36   36    0   42   -42
  7   53   44   59   194    1    2   -3   -6    1    -6    18   -12    4    9   36    2   -3    -6
  8   61   46   73   188    9    4   11  -12   81    44  -132   -48   16  121  144   36   99  -108
  9   55   50   59   196    3    8   -3   -4    9   -24    12   -32   64    9   16   24   -9   -12
 10   64   50   71   190   12    8    9  -10  144    72   -90   -80   64   81  100   96  108  -120
  Σ  520  420  620  2000    0    0    0    0  594   240  -420  -330  270  630  750  319  492  -625
Here y, x1, x2 and x3 denote deviations from the respective means, and the last row gives the column sums (for example, Σx1y = 319, Σx2y = 492 and Σx3y = -625).
From the table, the means of the variables are computed and given below:
$$\bar Y = \frac{520}{10} = 52, \quad \bar X_1 = \frac{420}{10} = 42, \quad \bar X_2 = \frac{620}{10} = 62, \quad \bar X_3 = \frac{2000}{10} = 200$$
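The deviation columns and the column sums of the table can be reproduced from the raw observations. The sketch below is an illustration only; the helper names `deviations` and `cross` and the dictionary keys are hypothetical.

```python
Y = [49, 40, 41, 46, 52, 59, 53, 61, 55, 64]
X1 = [35, 35, 38, 40, 40, 42, 44, 46, 50, 50]
X2 = [53, 53, 50, 64, 70, 68, 59, 73, 59, 71]
X3 = [200, 212, 211, 212, 203, 194, 194, 188, 196, 190]


def deviations(values):
    """Return each observation minus the sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cross(a, b):
    """Sum of products of two equally long columns."""
    return sum(p * q for p, q in zip(a, b))


y, x1, x2, x3 = (deviations(v) for v in (Y, X1, X2, X3))

sums = {
    "y_y": cross(y, y),
    "x1_x1": cross(x1, x1), "x2_x2": cross(x2, x2), "x3_x3": cross(x3, x3),
    "x1_x2": cross(x1, x2), "x1_x3": cross(x1, x3), "x2_x3": cross(x2, x3),
    "x1_y": cross(x1, y), "x2_y": cross(x2, y), "x3_y": cross(x3, y),
}
for name, value in sums.items():
    print(name, value)
```

These ten sums are all that the deviation-form estimator needs; the individual observations are not used again.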
Based on the above table and model, answer the following questions.
i. Estimate the parameters using the matrix approach.
ii. Compute the variances of the parameter estimates.
iii. Compute the coefficient of determination (R²).
iv. Report the regression result.
Solution:
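In the matrix notation: $\hat\beta = (x'x)^{-1}x'y$ (when we use the data in deviation form),
where $\hat\beta = (\hat\beta_1, \hat\beta_2, \hat\beta_3)'$, $x$ is the $10 \times 3$ matrix of deviations of the explanatory variables and $y$ is the $10 \times 1$ vector of deviations of the dependent variable; so that
$$x'x = \begin{bmatrix}\sum x_1^2 & \sum x_1x_2 & \sum x_1x_3 \\ \sum x_1x_2 & \sum x_2^2 & \sum x_2x_3 \\ \sum x_1x_3 & \sum x_2x_3 & \sum x_3^2\end{bmatrix}
= \begin{bmatrix}270 & 240 & -330 \\ 240 & 630 & -420 \\ -330 & -420 & 750\end{bmatrix},
\qquad
x'y = \begin{bmatrix}\sum x_1y \\ \sum x_2y \\ \sum x_3y\end{bmatrix} = \begin{bmatrix}319 \\ 492 \\ -625\end{bmatrix}$$
Note: the calculations may be made easier by taking 30 as a common factor from
all the elements of the matrix (x'x). This will not affect the final results.
And
$$(x'x)^{-1} = \frac{1}{30 \times 1284}\begin{bmatrix}329 & -46 & 119 \\ -46 & 104 & 38 \\ 119 & 38 & 125\end{bmatrix},
\qquad
\hat\beta = (x'x)^{-1}x'y \approx \begin{bmatrix}0.2062 \\ 0.3308 \\ -0.5573\end{bmatrix}$$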
(iii)
The first row and the first column of the above matrix show $\operatorname{var}(\hat\beta_1)$, the first
row and the second column show $\operatorname{cov}(\hat\beta_1, \hat\beta_2)$, and so on.
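The whole solution can be retraced from the column sums of the table. The sketch below is a hedged illustration, not the authors' worked answer: `inverse` is a hypothetical helper, the intercept is recovered from the means, and the error variance is estimated with n - k degrees of freedom, where k = 4 counts the intercept and the three slopes.

```python
from fractions import Fraction

# Sums of squares and cross products in deviation form, taken from the table.
xtx = [[270, 240, -330],
       [240, 630, -420],
       [-330, -420, 750]]
xty = [319, 492, -625]
sum_y_sq = 594
n, k = 10, 4                       # k counts the intercept plus three slopes
y_bar, x_bars = 52, [42, 62, 200]  # sample means from the table


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    size = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(size)]
         for i, row in enumerate(a)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(size):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[size:] for row in m]


inv = inverse(xtx)
beta = [sum(inv[i][j] * xty[j] for j in range(3)) for i in range(3)]  # (x'x)^-1 x'y
alpha = y_bar - sum(b * xb for b, xb in zip(beta, x_bars))            # intercept
ess = sum(b * v for b, v in zip(beta, xty))                           # beta' x'y
rss = sum_y_sq - ess
sigma2_hat = rss / (n - k)
var_cov = [[sigma2_hat * v for v in row] for row in inv]              # sigma^2 (x'x)^-1
r_squared = ess / sum_y_sq

print([round(float(b), 4) for b in beta], round(float(alpha), 2))
print(round(float(r_squared), 4), [round(float(var_cov[i][i]), 4) for i in range(3)])
```

The diagonal of `var_cov` holds the variances of the three slope estimators and the off-diagonal entries their covariances, in the same row and column order as the matrix x'x.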
Consider the following model
(a)
(c).
The (constant) food price elasticity is negative but the income elasticity is positive.
Also, the income elasticity is highly significant. About 78 percent of the variation in
the consumption of food is explained by its price and the income of the consumer.
Example 3:
Consider the model:
On the basis of the information given below, answer the following questions.
Since the x’s and y’s in the above formula are in deviation form we have to find
the corresponding deviation forms of the above given values.
We know that:
b.
Hence;
1-
We know that
and and
For two explanatory variable model:
1-
Adjusted
From (d)
this is the computed value of F. Let's compare this
with the critical value of F at the 5% level of significance with (2, 22) degrees of freedom for the numerator and
denominator respectively. F(2, 22) at the 5% level of significance = 3.44.
F*(2, 22) = 3.47
Fc(2, 22) = 3.44
Since F* > Fc, the decision rule is to reject H0 and accept H1. We can say that the
model is significant, i.e. the dependent variable is linearly related to at least
one of the explanatory variables.
Instructions:
Read the following instructions carefully.
Make sure that your exam paper contains 4 pages.
The exam has four parts. Attempt:
    all questions of part one,
    only two questions from part two,
    one question from part three, and
    the question in part four.
The maximum weight of the exam is 40%.
Part One: Attempt all of the following questions (15pts).
1. Discuss briefly the goals of econometrics.
2. A researcher is using data for a sample of 10 observations to estimate the relation
between consumption expenditure and income. Preliminary analysis of the sample data
produces the following data.
, ,
Where and
a. Use the above information to compute OLS estimates of the intercept and slope
coefficients and interpret the result.
b. Calculate the variance of the slope parameter.
c. Compute the value of R² (coefficient of determination) and interpret the result.
d. Compute a 95% confidence interval for the slope parameter.
e. Test the significance of the slope parameter at the 5% level of significance using the t-test.
3. If the model $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + U_i$ is to be estimated from a sample of 20 observations using
the semi-processed data given in matrix form (in deviation form).
, , and
Where and
2. In a study of 100 firms, the total cost (C) was assumed to be dependent on the rate of output
(X1) and the rate of absenteeism (X2). The means were: , and . The matrix
showing sums of squares and cross products adjusted for means is
100   50   40
 50   50  -70
 40  -70  900
where, and
Estimate the linear relationship between C and the other two variables. (10 points)