Group Assignment Final PDF

1. The document provides data on the price (Py) and quantity supplied (Sy) of a good Y. It estimates the linear regression of Sy on Py and finds the regression equation Sy = -10.86 + 4.57 Py.
2. It calculates the standard errors of the intercept and slope coefficients as approximately 3.347 and 0.335 respectively.
3. It tests the hypothesis that price influences supply and finds that the slope coefficient is statistically significant: price has a positive effect on supply.
4. It derives the expression for the error variance in a linear regression model. The error variance is estimated as the sum of squared errors divided by n - 2.

Uploaded by

dejene edossa

1. The following data refer to the price of a good Y (Py) and the quantity of the good Y supplied (Sy).

Py    7   12   10    6    9   13    7   13
Sy   20   46   37   14   33   48   22   45

a) Estimate the linear regression line $E(S_y) = \alpha + \beta P_y$

Py     Sy     Py·Sy    Py²     Sy²
 7     20      140      49     400
12     46      552     144    2116
10     37      370     100    1369
 6     14       84      36     196
 9     33      297      81    1089
13     48      624     169    2304
 7     22      154      49     484
13     45      585     169    2025
77    265     2806     797    9983    (column totals)

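Based on the above table, the following quantities are calculated, writing X for Py and Y for Sy (n = 8):

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{77}{8} = 9.625$$

$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{265}{8} = 33.125$$

$$SS_{XX} = \sum_{i=1}^{n}(X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 = 797 - 8(9.625)^2 = 797 - 741.125 = 55.875$$

$$SS_{YY} = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2 = 9983 - 8(33.125)^2 = 9983 - 8778.125 = 1204.875$$

$$SS_{XY} = \sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y} = 2806 - 8(9.625)(33.125) = 2806 - 2550.625 = 255.375$$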
Therefore, based on the above calculations, the regression coefficients (the slope $\hat{\beta}$ and the intercept $\hat{\alpha}$) are obtained as follows:

$$\hat{\beta} = \frac{SS_{XY}}{SS_{XX}} = \frac{255.375}{55.875} = 4.57$$

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 33.125 - (9.625)(4.57) = -10.86$$

Therefore, the estimated regression equation is:

$$\hat{S}_y = -10.86 + 4.57\,P_y$$
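The arithmetic above can be checked with a short script. This is a minimal sketch using only the eight observations from the table; the function name `ols_simple` is illustrative and not part of the assignment.

```python
# Price (Py) and quantity supplied (Sy) from the data table.
py = [7, 12, 10, 6, 9, 13, 7, 13]
sy = [20, 46, 37, 14, 33, 48, 22, 45]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sums of squares and cross products, as in the hand calculation.
    ss_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


alpha_hat, beta_hat = ols_simple(py, sy)
print(f"slope = {beta_hat:.4f}, intercept = {alpha_hat:.4f}")
```

With the unrounded slope the intercept comes out near -10.866; the hand calculation rounds the slope to 4.57 first, which gives -10.86.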

b) Estimate the standard errors of $\hat{\alpha}$ and $\hat{\beta}$

In the table below Y = Sy, Ȳ = 33.125 and Ŷ is the fitted value from the estimated regression line.

Py    Sy    Py·Sy   Py²    Sy²     (Y−Ȳ)²        Ŷ          Y−Ŷ      (Y−Ŷ)²
 7    20     140     49     400    172.265625    21.1242    -1.12     1.26382564
12    46     552    144    2116    165.765625    43.9742     2.03     4.10386564
10    37     370    100    1369     15.015625    34.8342     2.17     4.69068964
 6    14      84     36     196    365.765625    16.5542    -2.55     6.52393764
 9    33     297     81    1089      0.015625    30.2642     2.74     7.48460164
13    48     624    169    2304    221.265625    48.5442    -0.54     0.29615364
 7    22     154     49     484    123.765625    21.1242     0.88     0.76702564
13    45     585    169    2025    141.015625    48.5442    -3.54    12.56135364
77   265    2806    797    9983   1204.875      264.96       0.04    37.69145312   (column totals)

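The standard errors are the square roots of the estimated variances, $S.E.(\hat{\alpha}) = \sqrt{\operatorname{Var}(\hat{\alpha})}$ and $S.E.(\hat{\beta}) = \sqrt{\operatorname{Var}(\hat{\beta})}$, where $X_i$ are the observed prices and $x_i = X_i - \bar{X}$ are their deviations from the mean:

$$\operatorname{Var}(\hat{\alpha}) = \frac{\hat{\sigma}^2 \sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n} x_i^2}, \qquad \operatorname{Var}(\hat{\beta}) = \frac{\hat{\sigma}^2}{\sum_{i=1}^{n} x_i^2}$$

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n - k} = \frac{37.69145}{8 - 2} = 6.282$$

Here $\hat{Y}_i = -10.86 + 4.57 X_i$ is evaluated at each value of X, and k = 2 is the number of estimated parameters (intercept and slope).

$$\operatorname{Var}(\hat{\alpha}) = \frac{797 \times 6.282}{8 \times 55.875} = 11.20, \qquad \operatorname{Var}(\hat{\beta}) = \frac{6.282}{55.875} = 0.1124$$

$$S.E.(\hat{\alpha}) = \sqrt{11.20} = 3.347, \qquad S.E.(\hat{\beta}) = \sqrt{0.1124} = 0.335$$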
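As a cross-check, the standard errors can be recomputed from the raw data. This sketch assumes the textbook variance formulas for simple regression, Var(β̂) = σ̂²/Σxᵢ² and Var(α̂) = σ̂² ΣXᵢ² / (n Σxᵢ²), with σ̂² = SSE/(n − 2); the variable names are illustrative only.

```python
import math

py = [7, 12, 10, 6, 9, 13, 7, 13]
sy = [20, 46, 37, 14, 33, 48, 22, 45]

n = len(py)
x_bar = sum(py) / n
y_bar = sum(sy) / n
ss_xx = sum((x - x_bar) ** 2 for x in py)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(py, sy))

beta_hat = ss_xy / ss_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residuals from the fitted line and the unbiased error-variance estimate.
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(py, sy)]
sse = sum(e ** 2 for e in residuals)
sigma2_hat = sse / (n - 2)

# Note the n in the denominator of the intercept variance.
var_beta = sigma2_hat / ss_xx
var_alpha = sigma2_hat * sum(x ** 2 for x in py) / (n * ss_xx)
se_beta = math.sqrt(var_beta)
se_alpha = math.sqrt(var_alpha)

print(f"sigma^2 = {sigma2_hat:.4f}")
print(f"SE(alpha) = {se_alpha:.4f}, SE(beta) = {se_beta:.4f}")
```

Small differences from the hand calculation in the third decimal place come from rounding the coefficients before computing the residuals.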
c) Test the hypothesis that the price of the product influences its supply in the market
The standard error test proceeds in the following steps.
Step 1: State the null and alternative hypotheses
- H0: β = 0 (null hypothesis: price has no effect on supply)
- H1: β ≠ 0 (alternative hypothesis: price affects supply)
Step 2: Find S.E.(β̂)
- From part (b), S.E.(β̂) ≈ 0.335
Step 3: Compute half of the estimate β̂
- ½ β̂ = 0.5 × 4.57 = 2.285
Step 4: Compare the standard error with half of the estimate
- S.E.(β̂) ≈ 0.335 and ½ β̂ = 2.285
- S.E.(β̂) < ½ β̂
- Since the standard error is smaller than half of the estimate, we reject the null hypothesis in favour of the alternative: β̂ is statistically significant.

The result shows that price has a significant positive effect on supply. A one-unit increase in price increases the quantity supplied by about 4.57 units.
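The standard error test used here is equivalent to checking whether the t-ratio exceeds 2 in absolute value. The sketch below makes that equivalence explicit; the helper name `standard_error_test` is hypothetical, and the inputs are the rounded values from parts (a) and (b).

```python
def standard_error_test(estimate, std_error):
    """Rule of thumb: reject H0: beta = 0 when SE < |estimate| / 2."""
    return std_error < abs(estimate) / 2


beta_hat = 4.57   # slope estimate from part (a)
se_beta = 0.335   # standard error of the slope, to three decimals

reject_null = standard_error_test(beta_hat, se_beta)
t_ratio = beta_hat / se_beta  # SE < |beta|/2 is the same as |t| > 2

print(f"reject H0: {reject_null}, t-ratio = {t_ratio:.2f}")
```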

2. Given the model Yi = β0 + β1Xi + Ui with the usual OLS assumptions:


a) Derive the expression for the error variance

Estimation of the Error Variance

Note that the variance of a random variable is the expected value of its squared deviation from its mean. That is, for the error term $U_i$ with mean $E(U_i)$, the variance is:

- $\operatorname{Var}(U_i) = E[(U_i - E(U_i))^2]$
- By the zero-mean assumption of OLS, $E(U_i) = 0$
- Therefore $\operatorname{Var}(U_i) = E(U_i^2) = \sigma^2$, that is, $U_i \sim (0, \sigma^2)$

NB: For the simple linear regression model, the errors have mean 0 and variance $\sigma^2$.
This means that the observed values $Y_i$ have the following mean and variance:
- $Y_i = \beta_0 + \beta_1 X_i + U_i$
- $E(Y_i) = E(\beta_0) + E(\beta_1 X_i) + E(U_i)$
- $E(Y_i) = \beta_0 + \beta_1 X_i$
- $\operatorname{Var}(Y_i) = E[(Y_i - E(Y_i))^2]$
- $\operatorname{Var}(Y_i) = E[(\beta_0 + \beta_1 X_i + U_i - (\beta_0 + \beta_1 X_i))^2]$
- $\operatorname{Var}(Y_i) = E(U_i^2) = \sigma^2$

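First, we replace the unknown mean $\beta_0 + \beta_1 X_i$ with its fitted value $\hat{Y}_i = b_0 + b_1 X_i$, then we take the “average” squared distance from the observed values to their fitted values. Because two parameters ($b_0$ and $b_1$) are estimated, two degrees of freedom are lost, so we divide the sum of squared errors by $n - 2$ to obtain an unbiased estimate of $\sigma^2$:

$$\hat{\sigma}^2 = s^2 = \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n - 2} = \frac{\sum_{i=1}^{n} e_i^2}{n - 2}$$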

Common notation is to label the numerator as the error sum of squares (SSE):

$$SSE = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$$

Also, the estimated variance is referred to as the error (or residual) mean square (MSE):

$$MSE = s^2 = \frac{SSE}{n - 2}$$
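A small simulation can illustrate why the divisor is n − 2 rather than n. The sketch below is not part of the assignment: the true parameters (`beta0`, `beta1`, `sigma`), the design points and the number of replications are all hypothetical choices. Averaged over many samples, SSE/(n − 2) should land close to the true σ², while SSE/n should fall short of it.

```python
import random


def sse_of_fit(x, y):
    """Fit y = b0 + b1*x by least squares and return the error sum of squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


random.seed(0)                        # fixed seed so the run is repeatable
beta0, beta1, sigma = 1.0, 2.0, 3.0   # hypothetical true values, sigma^2 = 9
x = list(range(10))                   # hypothetical fixed design points
n = len(x)
reps = 5000

mse_values = []
naive_values = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    sse = sse_of_fit(x, y)
    mse_values.append(sse / (n - 2))  # unbiased estimator (MSE)
    naive_values.append(sse / n)      # biased downward

mean_mse = sum(mse_values) / reps
mean_naive = sum(naive_values) / reps
print(f"true sigma^2 = {sigma ** 2}, mean MSE = {mean_mse:.3f}, mean SSE/n = {mean_naive:.3f}")
```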
3. A researcher is using data for a sample of 100 households to estimate the relation between daily consumption expenditure and income in Dire Dawa city, Sabina kebele. Preliminary analysis of the sample data produces the following results.
$\sum xy = 7000$
$\sum x^2 = 10000$
$\sum y^2 = 20000$
$\sum X = 1000$
$\bar{Y} = 20$
$n = 100$
$\bar{X} = 10$

a) Use the above information to compute the OLS estimates of the intercept and slope coefficients and interpret the results.
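For the regression line $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$, where lower-case x and y denote deviations from the sample means:

$$\hat{\beta}_1 = \frac{\sum xy}{\sum x^2} = \frac{7000}{10000} = 0.70$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}, \qquad \text{where } \bar{X} = \frac{\sum X}{n} = \frac{1000}{100} = 10$$

$$\hat{\beta}_0 = 20 - (0.7)(10) = 13.0$$

$$\hat{Y} = 13.0 + 0.70\,X$$

Meaning: when household income is zero, the predicted consumption level is 13 birr. The slope $\hat{\beta}_1$ shows the proportion of additional income spent on consumption: increasing household income by one birr increases consumption by 0.70 birr.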

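b) Calculate the variance of the slope

The variance of the slope requires the estimated error variance, which is obtained from $R^2$ and the residual sum of squares:

$$R^2 = \frac{\hat{\beta}_1 \sum xy}{\sum y^2} = \frac{0.7 \times 7000}{20000} = 0.245$$

$$\sum_{i=1}^{n} e_i^2 = (1 - R^2)\sum y^2 = (1 - 0.245)(20000) = 15100$$

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2} = \frac{15100}{100 - 2} = 154.08$$

$$\operatorname{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{\sum_{i=1}^{n} x_i^2} = \frac{154.08}{10000} = 0.0154$$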
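The whole of question 3 depends only on the summary statistics, so it can be reproduced in a few lines. This is a sketch under the assumption that the lower-case sums are in deviation form; the variable names are illustrative.

```python
# Summary statistics given in the question (x, y are deviations from the means).
sum_xy = 7000.0
sum_x2 = 10000.0
sum_y2 = 20000.0
x_bar = 10.0
y_bar = 20.0
n = 100

# Part (a): OLS slope and intercept.
beta1_hat = sum_xy / sum_x2
beta0_hat = y_bar - beta1_hat * x_bar

# Part (b): R^2, residual sum of squares, error variance, variance of the slope.
r_squared = beta1_hat * sum_xy / sum_y2
sse = (1 - r_squared) * sum_y2
sigma2_hat = sse / (n - 2)
var_beta1 = sigma2_hat / sum_x2

print(f"Y_hat = {beta0_hat:.1f} + {beta1_hat:.2f} X")
print(f"R^2 = {r_squared:.3f}, sigma^2 = {sigma2_hat:.2f}, Var(beta1) = {var_beta1:.5f}")
```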

4. Suppose that we want to study the factors affecting the academic performance of third-year Economics students at Dire Dawa University. To this end, data on cumulative GPA (CGPA), SEX, personal computer (PC) ownership, hours of study per week (HRS) and entrance exam results (EER) were collected from a sample of 10 third-year Economics students. Assume further that the categorical variables are defined as follows.

Personal computer ownership: 1 for students with PC and 0 for students without PC

SEX: 1 for male and 0 for female

Student   CGPA   SEX   PC   HRS   EER
   1      2.50    1     1     6   250
   2      1.93    0     0     5   280
   3      2.30    0     0     4   319
   4      2.77    1     0     9   326
   5      2.32    0     0     7   327
   6      3.33    1     1    10   350
   7      2.42    0     0    12   329
   8      2.33    0     0     8   314
   9      2.50    0     0     6   316
  10      2.75    1     0     7   350

Import these data into Stata by typing them directly into the Data Editor.


a. Find measures of central tendency, measures of variation and measures of shape for the continuous variables

Statistic             CGPA        HRS         EER
Mean                  2.515       7.4         316.1
Standard Error        0.11807     0.763035    9.641865
Median                2.46        7           322.5
Mode                  2.5         6           350
Standard Deviation    0.373371    2.412928    30.49025
Sample Variance       0.139406    5.822222    929.6556
Kurtosis              2.066931    0.004589    1.590496
Skewness              0.923664    0.597924    -1.24215
Range                 1.4         8           100
Minimum               1.93        4           250
Maximum               3.33        12          350
Sum                   25.15       74          3161
Count                 10          10          10
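The descriptive statistics can be reproduced with Python's standard `statistics` module. The skewness and kurtosis helpers below are a sketch that assumes the bias-adjusted sample formulas used by spreadsheet SKEW and KURT functions, which is what the table above appears to report; the function names are illustrative.

```python
import statistics

cgpa = [2.50, 1.93, 2.30, 2.77, 2.32, 3.33, 2.42, 2.33, 2.50, 2.75]
hrs = [6, 5, 4, 9, 7, 10, 12, 8, 6, 7]
eer = [250, 280, 319, 326, 327, 350, 329, 314, 316, 350]


def sample_skewness(data):
    """Bias-adjusted sample skewness."""
    n = len(data)
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


def sample_excess_kurtosis(data):
    """Bias-adjusted sample excess kurtosis."""
    n = len(data)
    m = statistics.mean(data)
    s = statistics.stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return scale * fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def describe(data):
    """Return the main rows of the descriptive-statistics table."""
    n = len(data)
    sd = statistics.stdev(data)
    return {
        "mean": statistics.mean(data),
        "standard_error": sd / n ** 0.5,
        "median": statistics.median(data),
        "stdev": sd,
        "variance": statistics.variance(data),
        "kurtosis": sample_excess_kurtosis(data),
        "skewness": sample_skewness(data),
        "range": max(data) - min(data),
        "sum": sum(data),
        "count": n,
    }


for name, values in (("CGPA", cgpa), ("HRS", hrs), ("EER", eer)):
    print(name, {k: round(v, 6) for k, v in describe(values).items()})
```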

b) Tabulate categorical variables

tabulate sex

        SEX |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          6       60.00       60.00
          1 |          4       40.00      100.00
------------+-----------------------------------
      Total |         10      100.00

tabulate pc

         PC |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |          8       80.00       80.00
          1 |          2       20.00      100.00
------------+-----------------------------------
      Total |         10      100.00
c) Test whether there is a statistically significant difference between the mean CGPA of male and female students using a mean-comparison (t) test

ttest cgpa, by(sex)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |       6         2.3    .0801249    .1962652    2.094032    2.505968
       1 |       4      2.8375    .1752795     .350559    2.279682    3.395318
---------+--------------------------------------------------------------------
combined |      10       2.515    .1180701    .3733705    2.247907    2.782093
---------+--------------------------------------------------------------------
    diff |              -.5375    .1709768               -.9317732   -.1432268
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -3.1437
Ho: diff = 0                                     degrees of freedom =        8

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0069         Pr(|T| > |t|) = 0.0137          Pr(T > t) = 0.9931

Since the two-sided p-value Pr(|T| > |t|) = 0.0137 is less than 0.05, the null hypothesis of equal means is rejected: the mean CGPA of male students (2.84) is significantly higher than that of female students (2.30).
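The t statistic reported by Stata can be reproduced by hand with the pooled-variance formula. The sketch below splits the CGPA values by SEX as coded in the data table (0 = female, 1 = male); the helper name `pooled_t_test` is illustrative, and the critical value 2.306 is the two-sided 5% point of the t distribution with 8 degrees of freedom.

```python
import math
import statistics

female = [1.93, 2.30, 2.32, 2.42, 2.33, 2.50]   # SEX = 0
male = [2.50, 2.77, 3.33, 2.75]                 # SEX = 1


def pooled_t_test(a, b):
    """Two-sample t test with equal variances: returns (t, df, std_error)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t_stat = (statistics.mean(a) - statistics.mean(b)) / std_error
    return t_stat, df, std_error


t_stat, df, std_error = pooled_t_test(female, male)
critical_value = 2.306  # t(0.025, 8)
significant = abs(t_stat) > critical_value

print(f"t = {t_stat:.4f}, df = {df}, SE = {std_error:.6f}, significant at 5%: {significant}")
```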
d) Test whether there is a statistically significant difference between the mean CGPA of students with a PC and students without a PC using a mean-comparison (t) test

ttest cgpa, by(pc)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |       8       2.415    .0954501    .2699736    2.189296    2.640704
       1 |       2       2.915        .415    .5868986   -2.358075    8.188074
---------+--------------------------------------------------------------------
combined |      10       2.515    .1180701    .3733705    2.247907    2.782093
---------+--------------------------------------------------------------------
    diff |                 -.5     .258398               -1.095867    .0958669
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -1.9350
Ho: diff = 0                                     degrees of freedom =        8

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0445         Pr(|T| > |t|) = 0.0890          Pr(T > t) = 0.9555

Since the two-sided p-value Pr(|T| > |t|) = 0.0890 is greater than 0.05, the null hypothesis of equal means is not rejected at the 5% level; the difference is significant only at the 10% level.

e) Summarize the values of the continuous variables (CGPA, HRS and EER) for male and female students separately

Female students (SEX = 0):

Statistic             CGPA        HRS         EER
Mean                  2.3         7           314.1667
Standard Error        0.080125    1.154701    7.254501
Median                2.325       6.5         317.5
Mode                  #N/A        #N/A        #N/A
Standard Deviation    0.196265    2.828427    17.76982
Sample Variance       0.03852     8           315.7667
Kurtosis              3.444737    1.66875     3.851215
Skewness              -1.62259    1.193243    -1.83859
Range                 0.57        8           49
Minimum               1.93        4           280
Maximum               2.5         12          329
Sum                   13.8        42          1885
Count                 6           6           6
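Group-wise summaries for both sexes can be computed directly from the data table. This is a minimal sketch with illustrative names (`students`, `summarize_by_sex`); it reports only the mean, median and standard deviation per group, using the coding SEX = 0 for female and SEX = 1 for male.

```python
import statistics

# (CGPA, SEX, HRS, EER) for the ten students in the data table.
students = [
    (2.50, 1, 6, 250),
    (1.93, 0, 5, 280),
    (2.30, 0, 4, 319),
    (2.77, 1, 9, 326),
    (2.32, 0, 7, 327),
    (3.33, 1, 10, 350),
    (2.42, 0, 12, 329),
    (2.33, 0, 8, 314),
    (2.50, 0, 6, 316),
    (2.75, 1, 7, 350),
]
columns = {"CGPA": 0, "HRS": 2, "EER": 3}


def summarize_by_sex(rows):
    """Return {group: {variable: {statistic: value}}} for female and male students."""
    summary = {}
    for code, label in ((0, "female"), (1, "male")):
        group = [row for row in rows if row[1] == code]
        summary[label] = {}
        for name, index in columns.items():
            values = [row[index] for row in group]
            summary[label][name] = {
                "count": len(values),
                "mean": statistics.mean(values),
                "median": statistics.median(values),
                "stdev": statistics.stdev(values),
            }
    return summary


by_sex = summarize_by_sex(students)
for label, stats in by_sex.items():
    for name, row in stats.items():
        print(label, name, {k: round(v, 4) for k, v in row.items()})
```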
f) Show the mean CGPA of male and female students using a bar graph
g) Regress CGPA on SEX, PC, HRS and EER
regress cgpa sex pc hrs eer

Source | SS df MS Number of obs = 10


-------------+---------------------------------- F(4, 5) = 20.58
Model | 1.18280261 4 .295700654 Prob > F = 0.0026
Residual | .071847373 5 .014369475 R-squared = 0.9427
-------------+---------------------------------- Adj R-squared = 0.8969
Total | 1.25464999 9 .139405554 Root MSE = .11987

------------------------------------------------------------------------------
cgpa | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
sex | .2650852 .1037748 2.55 0.051 -.0016764 .5318468
pc | .4495351 .1359566 3.31 0.021 .1000475 .7990226
hrs | .0101844 .0197824 0.51 0.629 -.0406679 .0610368
eer | .0077509 .0016881 4.59 0.006 .0034116 .0120902
_cons | -.2063729 .4765721 -0.43 0.683 -1.43144 1.018695
------------------------------------------------------------------------------

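h) Interpret the regression results

- The coefficient for sex (.2650852) is not statistically significantly different from 0 at the 5% level because its p-value of 0.051 is slightly greater than 0.05; it is significant at the 10% level. Holding the other variables constant, male students have a CGPA about 0.27 points higher than female students.
- The coefficient for pc (.4495351) is statistically significant because its p-value of 0.021 is less than 0.05. Holding the other variables constant, students with a PC have a CGPA about 0.45 points higher than students without one.
- The coefficient for hrs (.0101844) is not statistically significantly different from 0 because its p-value of 0.629 is much greater than 0.05.
- The coefficient for eer (.0077509) is statistically significant because its p-value of 0.006 is less than 0.01. Each additional entrance exam point is associated with a CGPA about 0.008 points higher.
- The model as a whole is significant (F(4, 5) = 20.58, Prob > F = 0.0026) and explains about 94% of the variation in CGPA (R-squared = 0.9427).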
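For readers without Stata, the regression can be approximated by solving the normal equations (X'X)b = X'y directly. The sketch below is a plain-Python illustration, not the method Stata uses internally; `solve` and `ols` are hypothetical helper names, and the coefficients should match the Stata table only if the data are exactly those in the table for question 4.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    return solve(xtx, xty)


# (CGPA, SEX, PC, HRS, EER) from the data table.
data = [
    (2.50, 1, 1, 6, 250), (1.93, 0, 0, 5, 280), (2.30, 0, 0, 4, 319),
    (2.77, 1, 0, 9, 326), (2.32, 0, 0, 7, 327), (3.33, 1, 1, 10, 350),
    (2.42, 0, 0, 12, 329), (2.33, 0, 0, 8, 314), (2.50, 0, 0, 6, 316),
    (2.75, 1, 0, 7, 350),
]
y = [row[0] for row in data]
x_rows = [[1.0, *row[1:]] for row in data]   # leading 1.0 is the constant term

coef = ols(x_rows, y)                        # order: _cons, sex, pc, hrs, eer
fitted = [sum(c * v for c, v in zip(coef, row)) for row in x_rows]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

y_bar = sum(y) / len(y)
total_ss = sum((yi - y_bar) ** 2 for yi in y)
residual_ss = sum(e ** 2 for e in residuals)
r_squared = 1 - residual_ss / total_ss

for name, value in zip(["_cons", "sex", "pc", "hrs", "eer"], coef):
    print(f"{name:>6}: {value: .7f}")
print(f"R-squared = {r_squared:.4f}")
```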
i) Test for heteroskedasticity, multicollinearity and autocorrelation

Heteroskedasticity:

hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
Ho: Constant variance
Variables: fitted values of cgpa
chi2(1) = 0.57
Prob > chi2 = 0.4486

Since Prob > chi2 = 0.4486 is greater than 0.05, the null hypothesis of constant variance is not rejected: there is no evidence of heteroskedasticity.

Multicollinearity:

vif

    Variable |       VIF       1/VIF
-------------+----------------------
          pc |      2.06    0.485870
         sex |      1.80    0.555963
         eer |      1.66    0.602703
         hrs |      1.43    0.700729
-------------+----------------------
    Mean VIF |      1.74

All VIF values are far below the common rule-of-thumb threshold of 10, so there is no sign of serious multicollinearity.

Autocorrelation:

tsset student

time variable: student, 1 to 10
delta: 1 unit

estat bgodfrey

Breusch-Godfrey LM test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          1.722               1                   0.1894
---------------------------------------------------------------------------
H0: no serial correlation

Since Prob > chi2 = 0.1894 is greater than 0.05, the null hypothesis of no serial correlation is not rejected.
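The variance inflation factors can be understood as VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing regressor j on a constant and the remaining regressors. The sketch below implements that definition in plain Python; the helper names are hypothetical, and the values should agree with the Stata table only if the data match the table for question 4.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(x_rows, y):
    """R-squared of an OLS regression of y on the columns of x_rows."""
    k = len(x_rows[0])
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in x_rows]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


regressors = {
    "sex": [1, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    "pc": [1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    "hrs": [6, 5, 4, 9, 7, 10, 12, 8, 6, 7],
    "eer": [250, 280, 319, 326, 327, 350, 329, 314, 316, 350],
}


def vif(name):
    """Variance inflation factor of one regressor given all the others."""
    others = [values for key, values in regressors.items() if key != name]
    n = len(regressors[name])
    x_rows = [[1.0] + [column[i] for column in others] for i in range(n)]
    return 1 / (1 - r_squared(x_rows, regressors[name]))


vifs = {name: vif(name) for name in regressors}
mean_vif = sum(vifs.values()) / len(vifs)
for name, value in sorted(vifs.items(), key=lambda item: -item[1]):
    print(f"{name:>4}: VIF = {value:.2f}, 1/VIF = {1 / value:.6f}")
print(f"Mean VIF = {mean_vif:.2f}")
```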


5.
