Chapter 1-Introduction To Econometrics
INTRODUCTION TO
ECONOMETRICS
PREPARED BY:
DR. SITI MULIANA SAMSI
• What is econometrics?
The relationship between the civilian labor force participation rate (CLFPR) and the civilian
unemployment rate (CUNR) can be written as the following simple mathematical model:
CLFPR = B1 + B2 CUNR
The equation above states that CLFPR is linearly related to CUNR. B1 and B2 are known as
the parameters of the linear function. B1 is also known as the intercept; it gives the value
of CLFPR when CUNR is zero. B2 is known as the slope. The slope measures the rate of
change in CLFPR for a unit change in CUNR, or more generally, the rate of change in the
value of the variable on the left-hand side of the equation for a unit change in the value of
the variable on the right-hand side. The slope coefficient B2 can be positive (if the added-
worker effect dominates the discouraged-worker effect) or negative (if the discouraged-
worker effect dominates the added-worker effect). Figure 1-1 suggests that in the present
case it is negative.
• Methodology of econometrics
4. Specifying the statistical, or econometric, model of theory.
That the relationship is inexact is seen clearly from the scattergram given in Figure 1-1. Although the two
variables are inversely related, the relationship between them is not perfectly or exactly
linear, for if we draw a straight line through the 28 data points, not all the data points will
lie exactly on that straight line.
The data on labor force participation and unemployment are collected nonexperimentally.
Therefore, there may be other forces affecting labor force participation decisions. As a result,
the observed relationship between CLFPR and CUNR is likely to be imprecise.
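One common way to express this imprecision, sketched here under the assumption that all omitted influences are lumped into a single random error term (written u, a symbol that does not appear in the text above), is to add that term to the linear model:

```latex
\[
\text{CLFPR} = B_1 + B_2\,\text{CUNR} + u
\]
```

Under this assumption, B1 + B2 CUNR is the systematic part of the relationship, and u accounts for the gap between each data point and the straight line.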
5. Estimating the parameters of the chosen econometric model.
How do we estimate the parameters of the model, namely, B1 and B2? That is, how
do we find the numerical values (i.e., estimates) of these parameters? This will be the
focus of our attention in Part II, where we develop the appropriate methods of computation,
especially the method of ordinary least squares (OLS). Using OLS, we obtained the
following results:
CLFPR ≈ 69.5 − 0.58 CUNR (1.3)
As Eq. (1.3) shows, the estimated value of B1 is ≈ 69.5 and that of B2 is ≈ −0.58, where the
symbol ≈ means approximately. Thus, if the unemployment rate goes up by one unit (i.e., one
percentage point), ceteris paribus, CLFPR is expected to decrease on the average by
about 0.58 percentage points; that is, as economic conditions worsen, on average, there is
a net decrease in the labor force participation rate of about 0.58 percentage points,
perhaps suggesting that the discouraged-worker effect dominates.
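To make the estimation step concrete, the following is a minimal sketch of OLS for a model with one explanatory variable, using the standard closed-form formulas. The data are synthetic: they are generated around the rounded values 69.5 and −0.58 quoted above, with an assumed noise level, and are not the 28 observations behind Figure 1-1. The function name `ols_simple` is hypothetical.

```python
import random


def ols_simple(x, y):
    """Return (b1, b2) that minimize the sum of squared residuals of y = b1 + b2*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    # Intercept: the fitted line passes through the point of sample means.
    b1 = y_bar - b2 * x_bar
    return b1, b2


# Synthetic data (an assumption for illustration, NOT the data from the text).
rng = random.Random(0)
cunr = [rng.uniform(4.0, 10.0) for _ in range(28)]
clfpr = [69.5 - 0.58 * x + rng.gauss(0.0, 0.3) for x in cunr]

b1_hat, b2_hat = ols_simple(cunr, clfpr)
print(f"estimated B1 = {b1_hat:.2f}, estimated B2 = {b2_hat:.2f}")
```

Because the synthetic data contain random noise, the estimates land near, but not exactly on, the values used to generate them.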
6. Checking for model adequacy: Model specification testing.
How adequate is our model, Eq. (1.3)?
There are other factors that also enter into labor force participation decisions. For
example, hourly wages, or earnings, prevailing in the labor market also will be an important
decision variable. In the short run at least, a higher wage may attract more workers to the
labor market, other things remaining the same (ceteris paribus).
We now consider the following model, which adds a new variable, real average hourly
earnings (AHE82):
For our illustrative example, the empirical counterpart of Eq. (1.4) is as follows (these
results are based on OLS):
Both slope coefficients are negative. The negative coefficient of CUNR suggests that,
ceteris paribus (i.e., holding the influence of AHE82 constant), a one-percentage-point
increase in the unemployment rate leads, on average, to about a 0.64-percentage-point
decrease in CLFPR, perhaps once again supporting the discouraged-worker hypothesis. On
the other hand, holding the influence of CUNR constant, an increase in real average hourly
earnings of one dollar leads, on average, to about a 1.44-percentage-point decline in
CLFPR.
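The idea of holding one variable constant while estimating the effect of another can be sketched with a small OLS routine for several explanatory variables, which solves the normal equations (X'X)b = X'y. The data below are synthetic: the slopes −0.64 and −1.44 are the rounded values quoted above, while the intercept of 80.0, the variable ranges, and the noise level are assumptions made only for illustration. The helper names `solve_linear` and `ols_multiple` are hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols_multiple(rows, y):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...] for y on the regressors."""
    X = [[1.0] + list(r) for r in rows]  # prepend a constant for the intercept
    k = len(X[0])
    xtx = [[sum(xr[i] * xr[j] for xr in X) for j in range(k)] for i in range(k)]
    xty = [sum(xr[i] * yi for xr, yi in zip(X, y)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic data (an assumption for illustration, NOT the data from the text).
rng = random.Random(1)
data = [(rng.uniform(4.0, 10.0), rng.uniform(7.5, 8.5)) for _ in range(28)]
clfpr = [80.0 - 0.64 * u - 1.44 * w + rng.gauss(0.0, 0.2) for u, w in data]

coefs = ols_multiple(data, clfpr)
print("intercept, CUNR slope, AHE82 slope:", [round(c, 2) for c in coefs])
```

Each slope returned by `ols_multiple` is a partial effect: it describes the change in the dependent variable for a one-unit change in that regressor with the other regressor held fixed.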
7. Testing the hypothesis derived from the model.
Having finally settled on a model, we may want to perform hypothesis testing. That
is, we may want to find out whether the estimated model makes economic sense and
whether the results obtained conform with the underlying economic theory. For example, the
discouraged-worker hypothesis postulates a negative relationship between labor force
participation and the unemployment rate. Is this hypothesis borne out by our results? Our
statistical results seem to be in conformity with this hypothesis because the estimated
coefficient of CUNR is negative.
8. Using the model for prediction or forecasting.
Having gone through this multistage procedure, you can legitimately ask the question:
What do we do with the estimated model, such as Eq. (1.5)? Quite naturally, we would like
to use it for prediction, or forecasting.
When data on CLFPR for 2008 actually become available, we can compare the
predicted value with the actual value. The discrepancy between the two will represent the
prediction error. Naturally, we would like to keep the prediction error as small as possible.
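The comparison of predicted and actual values can be sketched as follows. For simplicity the sketch uses the rounded two-variable estimates quoted earlier (69.5 and −0.58) rather than Eq. (1.5); the unemployment rate and the "actual" participation rate below are hypothetical numbers chosen for illustration, not real 2008 data.

```python
def predict_clfpr(cunr, b1=69.5, b2=-0.58):
    """Predicted CLFPR from the rounded estimates quoted in the text."""
    return b1 + b2 * cunr


# Hypothetical inputs (assumptions for illustration only).
hypothetical_cunr = 6.0      # assumed unemployment rate, in percent
hypothetical_actual = 66.3   # assumed observed CLFPR, in percent

predicted = predict_clfpr(hypothetical_cunr)
# Prediction error: actual value minus predicted value.
prediction_error = hypothetical_actual - predicted

print(f"predicted = {predicted:.2f}, prediction error = {prediction_error:.2f}")
```

A positive prediction error here means the model underpredicted the participation rate; a negative one means it overpredicted.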
9. Using the model for control or policy purposes.
For the policy analyst, the purpose of building and using models is to estimate things that cannot be
observed or measured directly. This includes:
1. Estimating the outcomes of a policy that a decision maker may consider adopting.
2. Estimating which factors have the greatest leverage to change a specified outcome, or what the
primary source of a given outcome is.
3. Estimating how a variable is likely to evolve in the future, usually assuming that present trends continue.
• Scales of measurement
The four scales of measurement are:
1. Nominal: Categorical data and numbers that are simply used as identifiers or names
represent a nominal scale of measurement. Example: coding Female as 1 and Male as 2.
2. Ordinal: An ordinal scale of measurement represents an ordered series of relationships
or rank order. Example: Likert-type scales (such as a rating "on a scale of 1 to 10, with 1 being
dissatisfied and 10 being very satisfied").
3. Interval: An interval scale is one where there is order and the difference between two
values is meaningful, but there is no true zero. Example: temperature in degrees Celsius or IQ test scores.
4. Ratio: A ratio scale has all the properties of an interval scale (the values are ordered and
the differences between them are meaningful) and, in addition, a true zero, so that ratios of
values are also meaningful. Example: the number of hours spent on the internet daily, where
4 hours is twice as much as 2 hours.