JIMMA UNIVERSITY
College of Business and Economics
Continuing and Distance Education Office
Department of Economics
Assignment on Econometrics
Name:
ID.NO:
Department:
Year:
Term:
Center:
Part I: True/False
1. True. If the slope parameter in a simple linear regression model is zero, the fitted
regression line is horizontal. Since the OLS intercept is β̂₀ = Ȳ − β̂₁X̄, setting β̂₁ = 0 gives
β̂₀ = Ȳ, the sample mean of the dependent variable.
2. False. The variance of OLS estimators generally decreases with an increase in sample size, not
increases. Larger sample sizes lead to more precise estimates.
3. True. In multiple regression, adding an explanatory variable can never decrease the
R-squared value, because the enlarged model can always explain at least as much of the
variance in the dependent variable as before (at worst, the new coefficient is estimated at zero).
4. True. Under the Gauss-Markov assumptions, the OLS estimator is the Best Linear Unbiased
Estimator (BLUE) whether or not the errors are normal, so the statement holds; note that
normality is not what the Gauss-Markov theorem requires. If the errors are in addition normally
distributed, OLS is not merely BLUE but the minimum-variance unbiased estimator.
5. True, provided the model includes an intercept. The first-order condition of the OLS
minimization with respect to the intercept is −2Σêᵢ = 0, which forces the residuals to sum
to exactly zero.
6. False. Adjusted R-squared penalizes extra predictors and is therefore useful for comparing
models with different numbers of regressors, but it cannot compare models whose dependent
variables differ in functional form, such as log-log versus level-log, because the total
variation being explained (that of log Y versus that of Y) is not the same.
7. False. The residuals do not need to be normally distributed for the OLS estimators to be
consistent and asymptotically normal. By the central limit theorem, with a large enough sample
the OLS estimators are approximately normally distributed regardless of the distribution of
the errors.
8. False. The F statistic tests the hypothesis that all coefficients *excluding* the constant are
equal to zero. It assesses the overall significance of the explanatory variables in the model.
9. False. In a finite sample, the OLS estimators are exactly normally distributed only if the
errors themselves are normal; without that assumption, exact finite-sample normality fails.
As the sample grows large, however, the distribution of the OLS estimators approaches
normality by the central limit theorem.
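The point can be illustrated by simulation. In the Python sketch below, the errors are drawn from a strongly skewed (exponential) distribution, yet the sampling distribution of the OLS slope across many replications is close to normal; the sample size, number of replications, and seed are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(42)
    n, reps = 200, 5000
    slopes = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0, 10, n)
        u = rng.exponential(scale=2.0, size=n) - 2.0   # skewed errors, mean zero
        y = 1.0 + 0.5 * x + u
        # OLS slope via deviation sums
        slopes[r] = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

    # Despite the skewed errors, the slope estimates look bell-shaped:
    print("mean:", slopes.mean())   # close to the true slope 0.5
    print("skewness:", ((slopes - slopes.mean()) ** 3).mean() / slopes.std() ** 3)  # near 0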
10. False. The consequences differ. Including irrelevant variables leaves OLS unbiased but
inflates the variance of the estimators (overfitting), whereas omitting relevant variables
that are correlated with the included regressors produces biased and inconsistent estimates.
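To make the contrast concrete, this illustrative Python sketch (all coefficients and the correlation structure are made up for the example) simulates a model with two correlated regressors: dropping the relevant x2 biases the coefficient on x1, while adding an irrelevant x3 leaves it essentially unbiased.

    import numpy as np

    rng = np.random.default_rng(7)
    n = 10_000
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + rng.normal(size=n)        # relevant and correlated with x1
    x3 = rng.normal(size=n)                    # irrelevant to y
    y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

    def ols(X, y):
        return np.linalg.lstsq(X, y, rcond=None)[0]

    ones = np.ones(n)
    b_omit = ols(np.column_stack([ones, x1]), y)            # x2 omitted
    b_full = ols(np.column_stack([ones, x1, x2]), y)        # correct model
    b_extra = ols(np.column_stack([ones, x1, x2, x3]), y)   # irrelevant x3 added

    print("coef on x1, x2 omitted: ", b_omit[1])   # far from 2 (about 2 + 3*0.8 = 4.4)
    print("coef on x1, correct:    ", b_full[1])   # close to 2
    print("coef on x1, extra x3:   ", b_extra[1])  # still close to 2, slightly noisier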
Part II: Choice
1. B
2. B
3. E
4. B
Part III: Workout
1. List the reasons why the introduction of the error term is necessary.
Captures Unobserved Factors: The error term accounts for the influence of factors that
are not included in the model but still affect the dependent variable, reflecting the
impact of these unobserved variables on the outcome.
Absorbs Measurement Error: Errors made in measuring the dependent variable end up in the
error term rather than distorting the systematic part of the model.
Allows for Model Flexibility: Including the error term makes the model more realistic by
acknowledging that the relationship between the dependent and independent variables is
not perfectly deterministic.
Represents Inherent Randomness: The error term captures the variability in the dependent
variable that cannot be explained by the independent variables alone. This reflects the
true nature of empirical data.
2. Write down the definition of heteroscedasticity. What problem does its presence cause
for the OLS estimator?
Definition: Heteroscedasticity refers to a situation in regression analysis where the variance of
the error term (residuals) is not constant across all levels of the independent variable(s).
Instead, it varies systematically with the independent variables or the predicted values.
Problems Caused:
Inefficiency: OLS remains unbiased, but it is no longer the best (minimum-variance) linear
unbiased estimator; a weighted least squares estimator that accounts for the changing
variance would be more efficient.
Invalid Hypothesis Tests: The usual formula for the OLS standard errors is biased under
heteroscedasticity, so t and F statistics and confidence intervals built from it are
unreliable.
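As an illustration, the Python sketch below (simulated data; the variance pattern and sample size are arbitrary choices) generates errors whose spread grows with x, fits OLS, and compares the usual standard errors with White's heteroscedasticity-robust (HC0) standard errors; the two can differ noticeably.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    x = rng.uniform(1, 10, n)
    u = rng.normal(0, 0.5 * x)                # error spread grows with x
    y = 1.0 + 2.0 * x + u

    X = np.column_stack([np.ones(n), x])
    beta = np.linalg.solve(X.T @ X, X.T @ y)  # OLS estimates
    e = y - X @ beta                          # residuals

    XtX_inv = np.linalg.inv(X.T @ X)
    # Usual OLS standard errors (assume constant error variance)
    s2 = e @ e / (n - 2)
    se_ols = np.sqrt(np.diag(s2 * XtX_inv))
    # White (HC0) robust standard errors (allow Var(u_i) to differ across i)
    meat = X.T @ (X * e[:, None] ** 2)
    se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

    print("OLS SE:   ", se_ols)
    print("Robust SE:", se_hc0)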
3. List and discuss the classical linear regression assumptions.
Linearity: The relationship between the dependent variable and the independent
variables is linear in parameters. This assumption ensures that the model correctly
specifies the functional form of the relationship.
Independence: The error terms are independent of each other. This assumption implies
that the residuals do not exhibit autocorrelation, ensuring that each observation provides
unique information.
Homoscedasticity: The variance of the error term is constant across all levels of the
independent variable(s). This assumption ensures that the model's predictions are equally
reliable across the range of the explanatory variables.
Zero Conditional Mean: The expected value of the error term, given any value of the
independent variables, is zero. This implies that the error term does not systematically
vary with the independent variables, ensuring that the OLS estimators are unbiased.
4. On the basis of the information given below, answer the following questions.
Given the following data:
ΣX₁ = 500, ΣX₁² = 6400, ΣX₂ = 800, ΣX₂² = 7300, ΣX₁X₂ = 8600
ΣX₁Y = 8400, ΣX₂Y = 27000, ΣY = 1600, ΣY² = 56000, n = 50
Where:
- X₁, X₂ are predictor variables.
- Y is the dependent variable.
- The Σ terms are sample sums over the n = 50 observations.
Questions
a. Find the OLS estimate of the slope coefficient β₂
b. Compute variance of β₂
c. Test the significance of β₂ slope parameter at the 5% level of significance
d. Compute R² and adjusted R² and interpret the result
e. Test the overall significance of the model
a. OLS Estimate of the Slope Coefficient (β₂)
The OLS slope coefficient β₂ can be found using the following formula:
β₂ = (n ΣX₂Y - ΣX₂ ΣY) / (n ΣX₂² - (ΣX₂)²)
b. Variance of β₂
The variance of the slope coefficient β₂ is calculated by:
Var(β₂) = σ² / Σ(X₂ - X̄₂)², where Σ(X₂ - X̄₂)² = ΣX₂² - (ΣX₂)²/n
and σ² is the error variance, estimated in practice by σ̂² = RSS/(n - 2).
c. Significance of β₂
To test the significance of β₂ at the 5% significance level, the t-statistic is:
t = β₂ / SE(β₂)
The result is compared with the critical value from the t-distribution with (n - 2) degrees of
freedom; for n = 50, the two-sided 5% critical value is t(48) ≈ 2.011. If |t| exceeds this
value, β₂ is statistically significant.
d. R² and Adjusted R²
The coefficient of determination R² is:
R² = 1 - RSS / TSS
Where RSS is the residual sum of squares and TSS is the total sum of squares. R² measures the
proportion of the total variation in Y explained by the model.
The adjusted R² is given by:
R²_adj = 1 - (1 - R²)(n - 1) / (n - p - 1)
Where p is the number of predictors. Adjusted R² penalizes extra predictors, so it rises only
when a new variable improves the fit by more than chance.
e. Overall Significance of the Model
To test the overall significance of the model, the F-statistic is used:
F = (R² / p) / ((1 - R²) / (n - p - 1))
The F-statistic is compared with the critical F-value from the F-distribution table at the 5%
significance level; with p = 1 and n = 50, F(1, 48) ≈ 4.04. If the computed F exceeds the
critical value, the model is jointly significant.
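The formulas in (a) through (e) can be chained together in a few lines. The Python sketch below implements them from summary sums for a simple regression of Y on X₂; note that the input values are hypothetical, internally consistent placeholders (they are not the figures given in the question), so the printed numbers only illustrate the mechanics.

    import math

    # Hypothetical summary sums (placeholder values, not the assignment's data):
    n = 50
    Sx = 800.0        # sum of X2
    Sxx = 13500.0     # sum of X2 squared
    Sy = 1600.0       # sum of Y
    Syy = 56000.0     # sum of Y squared
    Sxy = 26300.0     # sum of X2 * Y

    # Deviation (mean-corrected) sums
    sxx = Sxx - Sx ** 2 / n        # sum of (X2 - mean)^2
    syy = Syy - Sy ** 2 / n        # sum of (Y - mean)^2
    sxy = Sxy - Sx * Sy / n        # sum of cross-deviations

    # (a) OLS slope
    b2 = sxy / sxx

    # (d) R^2 and adjusted R^2 (p = 1 predictor)
    p = 1
    R2 = sxy ** 2 / (sxx * syy)
    R2_adj = 1 - (1 - R2) * (n - 1) / (n - p - 1)

    # (b) error variance and Var(b2)
    RSS = syy - b2 * sxy
    sigma2_hat = RSS / (n - 2)
    var_b2 = sigma2_hat / sxx

    # (c) t test of H0: beta2 = 0; compare with t(48) ~ 2.011 at the 5% level
    t_stat = b2 / math.sqrt(var_b2)

    # (e) overall F test; compare with F(1, 48) ~ 4.04 at the 5% level
    F_stat = (R2 / p) / ((1 - R2) / (n - p - 1))

    print(f"b2 = {b2:.3f}, Var(b2) = {var_b2:.4f}, t = {t_stat:.2f}")
    print(f"R2 = {R2:.3f}, adj R2 = {R2_adj:.3f}, F = {F_stat:.2f}")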