Chap 1 Nature of Regression Analysis

The document discusses the origins and modern interpretation of regression analysis. Regression analysis studies the dependence of a variable called the dependent variable on one or more other explanatory variables. It aims to estimate and predict the dependent variable based on the known values of the explanatory variables. The analysis deals with statistical relationships rather than deterministic causal relationships. Regression differs from correlation in that it distinguishes between dependent and explanatory variables.

Uploaded by

Samina Ahmeddin
Copyright
© All Rights Reserved

CHAPTER 1: THE NATURE OF REGRESSION ANALYSIS

(GP) Gujarati, Damodar N. and Porter, Dawn C. (2008) Basic Econometrics, 5th edition, The McGraw-Hill Companies, Chapter 1
(DG) Gujarati, Damodar (2015) “Econometrics by Example”, Second Edition, Macmillan Education, Chapter 1

Basic Econometrics, Haleema Sadia, 2020-10-07
HISTORICAL ORIGIN OF THE TERM REGRESSION
• The term regression was introduced by Francis
Galton.
• He found that, although there was a tendency for
tall parents to have tall children and for short
parents to have short children, the average height
of children born of parents of a given height
tended to move or “regress” toward the average
height in the population as a whole. This tendency
is called Galton’s law of universal regression.

THE MODERN INTERPRETATION OF REGRESSION
• Regression analysis is concerned with the study of the
dependence of one variable, the dependent variable,
on one or more other variables, the explanatory
variables, with a view to estimating and/or predicting
the (population) mean or average value of the former
in terms of the known or fixed (in repeated sampling)
values of the latter.
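
As a rough illustration of "the mean value of the dependent variable for given values of the explanatory variable", the Python sketch below groups sons' heights by father's height and averages each group. The height pairs and the helper name `conditional_means` are invented for illustration and are not data from the text. Sample averages like these only estimate the population means that regression analysis targets.

```python
from collections import defaultdict

# Hypothetical (father_height, son_height) pairs in inches.
# These numbers are made up for illustration only.
pairs = [
    (65, 66), (65, 67), (65, 68),
    (70, 69), (70, 70), (70, 71),
    (75, 72), (75, 73), (75, 74),
]


def conditional_means(pairs):
    """Average the dependent variable (son's height) separately
    for each fixed value of the explanatory variable (father's height)."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(groups.items())}


means = conditional_means(pairs)
print(means)  # {65: 67.0, 70: 70.0, 75: 73.0}
```

In this made-up sample the sons of 65-inch fathers average 67 inches and the sons of 75-inch fathers average 73 inches, so both group means sit closer to the overall mean of 70 than the fathers' heights do. That is the kind of pattern Galton described.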

Examples of Regression Analysis
1. Reconsider Galton’s law of universal
regression.
We want to find out how the average height
of sons changes, given the father’s height.

Look at the scatter diagram or scattergram
on the next slide.

Figure 1.1 Hypothetical distribution of sons’ heights
corresponding to given heights of fathers.

Examples of Regression Analysis
2. Consider the heights of boys measured at
fixed ages.

Notice that corresponding to any given age
we have a range of heights. Therefore,
knowing the age, we may be able to predict
the average height corresponding to that age.

Figure 1.2 Hypothetical distribution of heights
corresponding to selected ages.

Examples of Regression Analysis
5. A labor economist may want to study the rate
of change of money wages in relation to the
unemployment rate.

Figure 1.3

Examples of Regression Analysis
6. From monetary economics it is known that, other things
remaining the same, the higher the rate of inflation π, the lower
the proportion k of their income that people would want to hold in
the form of money, as depicted in Figure 1.4 (next slide).

A quantitative analysis of this relationship will enable the
monetary economist to predict the amount of money, as a
proportion of their income, that people would want to hold at
various rates of inflation.

Figure 1.4 Money holding in relation to
the inflation rate π

STATISTICAL AND DETERMINISTIC RELATIONSHIPS
• In regression analysis we are concerned with
what is known as statistical dependence among
variables, not the functional or deterministic
dependence found, for example, in classical
physics.
• In statistical relationships among variables we
essentially deal with random or stochastic
variables. These variables have probability
distributions.
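
The contrast can be sketched in a few lines of Python. The relationship `y = 2x`, the normally distributed disturbance, and the function names below are all assumptions chosen for illustration. The slides do not specify any particular functional form.

```python
import random


def deterministic(x):
    """Functional dependence: the same x always gives the same y."""
    return 2.0 * x


def statistical(x, rng):
    """Statistical dependence: y varies around 2x because of a
    random disturbance, so y has a probability distribution for each x."""
    return 2.0 * x + rng.gauss(0.0, 1.0)


rng = random.Random(0)  # seeded so the run is reproducible
det_draws = [deterministic(3.0) for _ in range(5)]
stat_draws = [statistical(3.0, rng) for _ in range(5)]

print(det_draws)   # five identical values
print(stat_draws)  # five different values scattered around 6.0
```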

REGRESSION VERSUS CAUSATION
• Although regression analysis deals with the
dependence of one variable on other
variables, it does not necessarily imply
causation.
• A statistical relationship per se cannot logically
imply causation.

REGRESSION VERSUS CORRELATION
• In correlation analysis we try to measure
the strength or degree of linear association
between two variables. The correlation
coefficient measures this strength of (linear)
association.
• In regression analysis we try to estimate the
average value of one variable on the basis of
the fixed values of other variables.

• In correlation analysis we treat any two
variables symmetrically. There is no distinction
between variables. Both variables are
considered random.

• Most of the regression theory is based on the
assumption that the dependent variable is
stochastic but the explanatory variables are
fixed or nonstochastic.
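
The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below uses the GPA and study-hours figures tabulated on the cross-sectional data slide of this chapter. The helper names and the least-squares slope formula are assumptions brought in for illustration, since the slides do not introduce an estimation method.

```python
from math import sqrt

# Figures from the cross-sectional data slide in this chapter.
gpa = [3.5, 2.7, 1.9, 2.3, 2.0, 2.2, 2.5]
hours = [10, 8, 9, 5, 8, 6, 3]


def cross_dev(a, b):
    """Sum of cross-products of deviations from the means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def correlation(a, b):
    """Pearson correlation coefficient; swapping a and b changes nothing."""
    return cross_dev(a, b) / sqrt(cross_dev(a, a) * cross_dev(b, b))


def slope(y, x):
    """Least-squares slope from regressing y on x; order matters."""
    return cross_dev(x, y) / cross_dev(x, x)


r_gh = correlation(gpa, hours)
r_hg = correlation(hours, gpa)
b_gpa_on_hours = slope(gpa, hours)
b_hours_on_gpa = slope(hours, gpa)

print(r_gh == r_hg)                    # correlation treats both variables alike
print(b_gpa_on_hours, b_hours_on_gpa)  # the two regression slopes differ
```

The two slopes are not reciprocals of each other. Their product equals the squared correlation coefficient, which is one way to see that the choice of dependent variable matters in regression.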

TERMINOLOGY
Dependent variable     Explanatory variable
Explained variable     Independent variable
Predictand             Predictor
Regressand             Regressor
Response               Stimulus
Endogenous             Exogenous
Outcome                Covariate
Controlled variable    Control variable
• In a simple (two-variable) regression analysis
we study the dependence of a variable on only
a single explanatory variable, such as that of
consumption expenditure on real income.
• In a multiple regression analysis we study the
dependence of one variable on more than one
explanatory variable, such as that of money
demand on interest rates, income, and
inflation.
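
In symbols, the two cases might be written as follows, using the chapter's notation (Y for the dependent variable, X1 to Xk for the explanatory variables, subscript i for the observation). The linear form, the coefficients β and the disturbance term u are assumptions added for illustration. The slides do not commit to a functional form at this stage.

```latex
\begin{align*}
% Simple (two-variable) regression: one explanatory variable
Y_i &= \beta_0 + \beta_1 X_{1i} + u_i \\
% Multiple regression: k explanatory variables
Y_i &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i
\end{align*}
```

In the money-demand example, Y would be money demand and X1, X2, X3 would be interest rates, income, and inflation.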

• The term random is a synonym for the term
stochastic. A random (stochastic) variable is a
variable that can take on any set of values,
positive or negative, with a given probability.

NOTATION
• Y: dependent variable
• X1, X2, … , Xk: explanatory variables
• Xk: kth explanatory variable
• Xki: ith observation on variable Xk (subscript i is used for
cross-sectional data)
• Xkt: tth observation on variable Xk (subscript t is used for
time series data)
• N (or T, for time series data): the total number of observations
or values in the population
• n (or t, for time series data): the total number of observations
in the sample

TYPES OF DATA
• There are mainly three types of data for
empirical analysis:
1. Time series data
2. Cross-sectional data
3. Pooled data

Time series data
• A time series is a set of observations on the
values that a variable takes at different times.

Cross-sectional data
• Cross-sectional data are data on one or more
variables collected at the same point in time.
GPA    Study hours/week
3.5    10
2.7    8
1.9    9
2.3    5
2.0    8
2.2    6
2.5    3
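
To connect this table with the idea of predicting the average value of the dependent variable, the sketch below fits a straight line to the seven observations above by ordinary least squares. Least squares is a standard estimator that these slides have not introduced, and the names `fit_line` and `predicted_mean_gpa` are hypothetical. With only seven observations the result is purely illustrative.

```python
# Observations copied from the cross-sectional table above.
hours = [10, 8, 9, 5, 8, 6, 3]
gpa = [3.5, 2.7, 1.9, 2.3, 2.0, 2.2, 2.5]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(hours, gpa)


def predicted_mean_gpa(study_hours):
    """Estimated average GPA for a given number of study hours per week."""
    return intercept + slope * study_hours


print(round(slope, 4))                  # estimated change in mean GPA per extra hour
print(round(predicted_mean_gpa(7), 4))  # estimated mean GPA at 7 hours/week
```

The fitted line passes through the point of sample means, so the prediction at the average of 7 study hours equals the average GPA in the table.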

Pooled data
• Pooled data contain elements of both
time series and cross-sectional data.
Time    GPA    Study hours/week
2000    2.5    9
2000    2.7    8
2000    2.3    6
2005    1.9    5
2005    3.1    12
2010    2.4    7
2010    2.0    5
2010    3.9    11
2010    1.2    2

• Panel data is a special type of pooled data in
which the same cross-sectional unit is
surveyed over time.
Person    Time    GPA    Study hours/week
1         2010    2.5    9
1         2011    2.7    7
1         2012    2.3    6
2         2010    1.9    8
2         2011    3.1    12
2         2012    2.4    6
3         2010    2.0    5
3         2011    3.9    11
3         2012    1.2    2
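
What distinguishes a panel from other pooled data is the unit identifier, which lets each person be followed through time. A minimal Python sketch using the rows of the panel table above is given below. The variable names and the "balanced" check (every person observed in the same years) are illustrative choices, not something the slides define.

```python
from collections import defaultdict

# (person, year, GPA, study hours/week) rows copied from the panel table above.
panel = [
    (1, 2010, 2.5, 9), (1, 2011, 2.7, 7), (1, 2012, 2.3, 6),
    (2, 2010, 1.9, 8), (2, 2011, 3.1, 12), (2, 2012, 2.4, 6),
    (3, 2010, 2.0, 5), (3, 2011, 3.9, 11), (3, 2012, 1.2, 2),
]

# Group the observations by cross-sectional unit (person).
by_person = defaultdict(list)
for person, year, gpa, study_hours in panel:
    by_person[person].append((year, gpa, study_hours))

# Years in which each person was surveyed.
years_per_person = {
    person: sorted(year for year, _, _ in rows)
    for person, rows in by_person.items()
}

# The panel is balanced if every person is observed in the same years.
is_balanced = len({tuple(years) for years in years_per_person.values()}) == 1

# Average GPA of each person over time, possible only because units are tracked.
mean_gpa = {
    person: sum(gpa for _, gpa, _ in rows) / len(rows)
    for person, rows in by_person.items()
}

print(is_balanced)
print(mean_gpa)
```

The pooled table earlier in the chapter has no person column, so a per-person average of this kind could not be computed from it.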

Sources of Data
• Government agencies (Department of
Commerce...)
• International agencies (World Bank...)
• Surveys

In the social sciences the data that one generally
obtains are nonexperimental in nature, that is, not
subject to the control of the researcher.

The quality of the data used in
economics is often not that good:
1. Possibility of observational errors.
2. Approximations and roundoffs.
3. Nonresponse to surveys may cause selectivity
bias.
4. The sampling methods used in obtaining the
data may vary so widely that it might be very
difficult to compare results across samples.

5. Economic data are generally available at a
highly aggregate level (GNP...). Such highly aggregated
data may not tell us much about the individual
or micro-level units.
6. Because of confidentiality, certain data can be
published only in highly aggregate form
(health data...).

The researcher should always keep in mind that
the results of research are only as good as the
quality of data.
