
Name: Hasnat Mubasher

Roll No: 0961-BH-BAF-20

Chapter - 03

------------------------

3.7

A) Summary statistics

. summarize salary years_senior years_comp

Variable | Obs Mean Std. dev. Min Max

-------------+---------------------------------------------------------

salary | 135 90.61852 62.5722 10 529.9

years_senior | 135 7.8 6.40988 0 37

years_comp | 135 22.99259 11.67437 2 45

B) Simple regressions

1) Regression of salary on years as senior officer

. regress salary years_senior


Source | SS df MS Number of obs = 135

-------------+---------------------------------- F(1, 133) = 4.13

Model | 15816.0586 1 15816.0586 Prob > F = 0.0440

Residual | 508831.465 133 3825.80049 R-squared = 0.0301

-------------+---------------------------------- Adj R-squared = 0.0229

Total | 524647.524 134 3915.28003 Root MSE = 61.853

------------------------------------------------------------------------------

salary | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

years_senior | 1.694911 .8336022 2.03 0.044 .0460779 3.343743

_cons | 77.39822 8.403364 9.21 0.000 60.77669 94.01974

------------------------------------------------------------------------------

Coefficient for years_senior: 1.694911

Each additional year as a senior officer is associated with approximately £1,695 more salary (salary is measured in £000s, so the coefficient 1.6949 corresponds to £1,694.91).

Standard Error: 0.8336022

t-value: 2.03

p-value: 0.044

Statistically significant at the 5% level, indicating a meaningful relationship between years as a senior officer and salary.

R-squared: 0.0301

The model explains 3.01% of the variance in salary.

Constant (_cons): 77.39822


Estimated salary when years_senior is zero is approximately £77,398.22.
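The reported t-statistic, p-value, confidence interval, and R-squared can all be reproduced from the numbers in the table above. A quick sketch (assuming scipy is available; degrees of freedom are 135 − 2 = 133):

```python
from scipy import stats

# Coefficient and standard error for years_senior, from the output above
coef, se, df = 1.694911, 0.8336022, 133

t_stat = coef / se                          # reported t = 2.03
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value, ~0.044
t_crit = stats.t.ppf(0.975, df)             # critical value for a 95% CI
ci_low, ci_high = coef - t_crit * se, coef + t_crit * se

# R-squared is Model SS divided by Total SS
r_squared = 15816.0586 / 524647.524         # ~0.0301
```

The confidence-interval endpoints match the reported [.0460779, 3.343743] to rounding.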

2) Regression of salary on years with the company

. reg salary years_comp

Source | SS df MS Number of obs = 135

-------------+---------------------------------- F(1, 133) = 0.30

Model | 1173.8491 1 1173.8491 Prob > F = 0.5859

Residual | 523473.675 133 3935.89229 R-squared = 0.0022

-------------+---------------------------------- Adj R-squared = -0.0053

Total | 524647.524 134 3915.28003 Root MSE = 62.737

------------------------------------------------------------------------------

salary | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

years_comp | .2535246 .4642326 0.55 0.586 -.6647095 1.171759

_cons | 84.78933 11.9619 7.09 0.000 61.12915 108.4495

------------------------------------------------------------------------------

Coefficient for years_comp: 0.2535246

The point estimate implies approximately £253.52 more salary per additional year with the company.

Standard Error: 0.4642326

t-value: 0.55

p-value: 0.586
Not statistically significant (p-value > 0.05), indicating no meaningful relationship between
years with the company and salary.

R-squared: 0.0022

The model explains only 0.22% of the variance in salary.

Constant (_cons): 84.78933

Estimated salary when years_comp is zero is approximately £84,789.33.

____________________________________________________________________________________

Chapter: 04

-------------

4.1:

(A)

. reg bwght faminc cigs

Source | SS df MS Number of obs = 1,388

-------------+---------------------------------- F(2, 1385) = 21.27

Model | 17126.2088 2 8563.10442 Prob > F = 0.0000

Residual | 557485.511 1,385 402.516614 R-squared = 0.0298

-------------+---------------------------------- Adj R-squared = 0.0284

Total | 574611.72 1,387 414.283864 Root MSE = 20.063

------------------------------------------------------------------------------

bwght | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

faminc | .0927647 .0291879 3.18 0.002 .0355075 .1500219


cigs | -.4634075 .0915768 -5.06 0.000 -.6430518 -.2837633

_cons | 116.9741 1.048984 111.51 0.000 114.9164 119.0319

------------------------------------------------------------------------------

(B)

. reg bwght faminc

Source | SS df MS Number of obs = 1,388

-------------+---------------------------------- F(1, 1386) = 16.65

Model | 6819.0527 1 6819.0527 Prob > F = 0.0000

Residual | 567792.667 1,386 409.662819 R-squared = 0.0119

-------------+---------------------------------- Adj R-squared = 0.0112

Total | 574611.72 1,387 414.283864 Root MSE = 20.24

------------------------------------------------------------------------------

bwght | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

faminc | .1183234 .0290016 4.08 0.000 .0614317 .1752152

_cons | 115.265 1.001901 115.05 0.000 113.2996 117.2304

------------------------------------------------------------------------------

(C)

. reg bwght cigs

Source | SS df MS Number of obs = 1,388


-------------+---------------------------------- F(1, 1386) = 32.24

Model | 13060.4194 1 13060.4194 Prob > F = 0.0000

Residual | 561551.3 1,386 405.159668 R-squared = 0.0227

-------------+---------------------------------- Adj R-squared = 0.0220

Total | 574611.72 1,387 414.283864 Root MSE = 20.129

------------------------------------------------------------------------------

bwght | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

cigs | -.5137721 .0904909 -5.68 0.000 -.6912861 -.3362581

_cons | 119.7719 .5723407 209.27 0.000 118.6492 120.8946

------------------------------------------------------------------------------

(E)

Wald test

. test cigs = 2*faminc

( 1) - 2*faminc + cigs = 0

F( 1, 1385) = 42.35

Prob > F = 0.0000

Conclusion: Both family income and cigarette consumption significantly affect birth weight: family income has a positive effect, whereas cigarette consumption has a negative one.

The Wald test rejects the restriction that the cigs coefficient equals twice the faminc coefficient (F(1, 1385) = 42.35, p < 0.001), so the two effects are not in that fixed proportion.
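The Wald statistic behind a command like `test cigs = 2*faminc` can be illustrated in its general form: for a restriction Rb = r, F = (Rb − r)′[RVR′]⁻¹(Rb − r)/q, where V is the estimated covariance of the OLS coefficients and q the number of restrictions. A minimal sketch on synthetic data (the names and numbers here are illustrative, not the textbook dataset; the restriction is true by construction):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)                  # stands in for faminc
x2 = rng.normal(size=n)                  # stands in for cigs
y = 1.0 + 0.5 * x1 + 1.0 * x2 + rng.normal(size=n)  # b_x2 = 2*b_x1 holds

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
sigma2 = resid @ resid / (n - X.shape[1])
V = sigma2 * np.linalg.inv(X.T @ X)      # classical covariance of b

R = np.array([[0.0, -2.0, 1.0]])         # restriction: -2*b_x1 + b_x2 = 0
r = np.array([0.0])
diff = R @ b - r
F = float(diff @ np.linalg.solve(R @ V @ R.T, diff)) / R.shape[0]
```

Because the restriction holds in the simulated data, F should be small; in the birth-weight output above it is large (42.35), which is why the restriction is rejected.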

4.2

(A)

g lnwage=log(wage)

reg lnwage educ exper tenure

Source | SS df MS Number of obs = 900

-------------+---------------------------------- F(3, 896) = 52.15

Model | 23.6080086 3 7.86933619 Prob > F = 0.0000

Residual | 135.21098 896 .150905112 R-squared = 0.1486

-------------+---------------------------------- Adj R-squared = 0.1458

Total | 158.818989 899 .176661834 Root MSE = .38847

------------------------------------------------------------------------------

lnwage | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

educ | .0731166 .0066357 11.02 0.000 .0600933 .0861399

exper | .0153578 .0034253 4.48 0.000 .0086353 .0220804

tenure | .0129641 .0026307 4.93 0.000 .007801 .0181272

_cons | 5.528329 .1127946 49.01 0.000 5.306957 5.749702

------------------------------------------------------------------------------

(B)
Wald test

. test exper= educ

( 1) - educ + exper = 0

F( 1, 896) = 95.74

Prob > F = 0.0000

(C)

Redundancy (exclusion) test

. test exper = 0

( 1) exper = 0

F( 1, 896) = 20.10

Prob > F = 0.0000
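For a single restriction, the Wald F statistic is simply the squared t-ratio, so this output can be checked directly against the coefficient table; a quick check using the exper estimates above:

```python
# exper coefficient and standard error from the regression output above
coef, se = 0.0153578, 0.0034253

F = (coef / se) ** 2    # single-restriction Wald F equals t squared, ~20.10
```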

(D)

. reg lnwage educ exper

Source | SS df MS Number of obs = 900

-------------+---------------------------------- F(2, 897) = 64.41

Model | 19.9433397 2 9.97166984 Prob > F = 0.0000


Residual | 138.875649 897 .154822351 R-squared = 0.1256

-------------+---------------------------------- Adj R-squared = 0.1236

Total | 158.818989 899 .176661834 Root MSE = .39347

------------------------------------------------------------------------------

lnwage | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

educ | .075865 .0066975 11.33 0.000 .0627204 .0890095

exper | .0194704 .0033649 5.79 0.000 .0128664 .0260745

_cons | 5.537798 .1142326 48.48 0.000 5.313604 5.761993

------------------------------------------------------------------------------

Conclusion: Education, experience, and tenure all have positive, statistically significant effects on wages.

The education and experience coefficients are statistically distinct (the Wald test in (B) rejects their equality), and each contributes significantly to explaining wage variation.
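Because the dependent variable is log wage, the coefficients are semi-elasticities: a coefficient b implies roughly 100·b percent higher wages per unit of the regressor, or exactly 100·(e^b − 1) percent. A quick check for the education coefficient above:

```python
import math

b_educ = 0.0731166                        # educ coefficient from the full model above
approx_pct = 100 * b_educ                 # ~7.31% per extra year of education
exact_pct = 100 * (math.exp(b_educ) - 1)  # ~7.59%, the exact figure
```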

4.3

(A)

reg lnDI lnY lnR

Source | SS df MS Number of obs = 26

-------------+---------------------------------- F(2, 23) = 525.29

Model | 1.06594003 2 .532970017 Prob > F = 0.0000

Residual | .023336112 23 .001014614 R-squared = 0.9786

-------------+---------------------------------- Adj R-squared = 0.9767

Total | 1.08927615 25 .043571046 Root MSE = .03185


------------------------------------------------------------------------------

lnDI | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

lnY | 1.227364 .0439292 27.94 0.000 1.13649 1.318239

lnR | -.0299645 .0213055 -1.41 0.173 -.0740382 .0141092

_cons | 7.511585 .2194947 34.22 0.000 7.057526 7.965645

------------------------------------------------------------------------------

(B)

. test lnY=0

( 1) lnY = 0

F( 1, 23) = 780.62

Prob > F = 0.0000

(C)

. test lnY=1

( 1) lnY = 1

F( 1, 23) = 26.79

Prob > F = 0.0000


Conclusion: Income is a significant determinant of disposable income, with an estimated elasticity significantly greater than one (1.227; the restriction lnY = 1 is rejected with F(1, 23) = 26.79).

The interest rate does not have a statistically significant effect on disposable income in this model (p = 0.173).
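The F statistics in (B) and (C) follow directly from the coefficient table, since a single-restriction Wald F is the squared ratio (b − hypothesized value)/se:

```python
b, se = 1.227364, 0.0439292     # lnY estimate and standard error from above

F_zero = (b / se) ** 2          # test lnY = 0  ->  ~780.6
F_one = ((b - 1.0) / se) ** 2   # test lnY = 1  ->  ~26.79
```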

____________________________________________________________________________________

Chapter: 05

------------

5.1

. gen log_Imports = Imports   // note: no log() is applied, so despite the names these are the level variables

. gen log_GDP = GDP

. gen log_CPI = CPI

. reg log_Imports log_GDP log_CPI

Source | SS df MS Number of obs = 75

-------------+---------------------------------- F(2, 72) = 1622.72

Model | 9.7283e+09 2 4.8642e+09 Prob > F = 0.0000

Residual | 215821417 72 2997519.68 R-squared = 0.9783

-------------+---------------------------------- Adj R-squared = 0.9777

Total | 9.9441e+09 74 134380020 Root MSE = 1731.3

------------------------------------------------------------------------------
log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_GDP | 816.443 50.00863 16.33 0.000 716.7526 916.1334

log_CPI | 82.20536 25.07781 3.28 0.002 32.21366 132.1971

_cons | -40850.65 1849.574 -22.09 0.000 -44537.71 -37163.59

------------------------------------------------------------------------------

. correlate log_Imports log_GDP log_CPI

(obs=75)

| log_Im~s log_GDP log_CPI

-------------+---------------------------

log_Imports | 1.0000

log_GDP | 0.9875 1.0000

log_CPI | 0.9476 0.9400 1.0000

. reg log_Imports log_GDP

Source | SS df MS Number of obs = 75

-------------+---------------------------------- F(1, 73) = 2853.74

Model | 9.6961e+09 1 9.6961e+09 Prob > F = 0.0000

Residual | 248030850 73 3397682.88 R-squared = 0.9751

-------------+---------------------------------- Adj R-squared = 0.9747

Total | 9.9441e+09 74 134380020 Root MSE = 1843.3


------------------------------------------------------------------------------

log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_GDP | 970.5328 18.16784 53.42 0.000 934.3244 1006.741

_cons | -44353.42 1607.281 -27.60 0.000 -47556.72 -41150.11

------------------------------------------------------------------------------

. reg log_Imports log_CPI

Source | SS df MS Number of obs = 75

-------------+---------------------------------- F(1, 73) = 642.35

Model | 8.9293e+09 1 8.9293e+09 Prob > F = 0.0000

Residual | 1.0148e+09 73 13901085.7 R-squared = 0.8980

-------------+---------------------------------- Adj R-squared = 0.8966

Total | 9.9441e+09 74 134380020 Root MSE = 3728.4

------------------------------------------------------------------------------

log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_CPI | 467.0531 18.42811 25.34 0.000 430.3259 503.7803

_cons | -16116.12 2284.724 -7.05 0.000 -20669.56 -11562.67

------------------------------------------------------------------------------

. reg log_GDP log_CPI

Source | SS df MS Number of obs = 75


-------------+---------------------------------- F(1, 73) = 553.94

Model | 9095.21732 1 9095.21732 Prob > F = 0.0000

Residual | 1198.594 73 16.4190959 R-squared = 0.8836

-------------+---------------------------------- Adj R-squared = 0.8820

Total | 10293.8113 74 139.105558 Root MSE = 4.052

------------------------------------------------------------------------------

log_GDP | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_CPI | .4713712 .0200277 23.54 0.000 .4314561 .5112864

_cons | 30.29548 2.483042 12.20 0.000 25.34679 35.24418

------------------------------------------------------------------------------

. vif

Variable | VIF 1/VIF

-------------+----------------------

log_CPI | 1.00 1.000000

-------------+----------------------

Mean VIF | 1.00

Conclusion: Both GDP and CPI are significant predictors of imports, with GDP having the larger impact, and the high R-squared values show that these models explain most of the variance in imports.

The very strong pairwise correlations (0.94 to 0.99), however, indicate that the regressors are closely interrelated. Note that the reported VIF of 1.00 comes from running `vif` after the single-regressor auxiliary regression `reg log_GDP log_CPI`, where it is trivially 1; the auxiliary R-squared of 0.8836 implies a VIF of about 1/(1 - 0.8836) ≈ 8.6 for the two-regressor model, so multicollinearity cannot be ruled out here.
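The variance inflation factor can be computed by hand from an auxiliary regression's R-squared, VIF_j = 1/(1 − R_j²); using the `reg log_GDP log_CPI` output above:

```python
def vif(r_squared_aux: float) -> float:
    """Variance inflation factor from the auxiliary regression's R-squared."""
    return 1.0 / (1.0 - r_squared_aux)

vif_gdp = vif(0.8836)   # auxiliary R^2 from regressing log_GDP on log_CPI above
```

This gives roughly 8.6, a common rule-of-thumb threshold zone for flagging collinearity.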

5.2

. gen log_Imports = Imports

. gen log_GDP = GDP

. gen log_CPI = CPI

. reg log_Imports log_GDP log_CPI

Source | SS df MS Number of obs = 23

-------------+---------------------------------- F(2, 20) = 735.20

Model | 3.9193e+10 2 1.9596e+10 Prob > F = 0.0000

Residual | 533089408 20 26654470.4 R-squared = 0.9866

-------------+---------------------------------- Adj R-squared = 0.9852

Total | 3.9726e+10 22 1.8057e+09 Root MSE = 5162.8

------------------------------------------------------------------------------

log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_GDP | 3701.469 305.4213 12.12 0.000 3064.371 4338.566

log_CPI | -120.1663 146.7966 -0.82 0.423 -426.3788 186.0461

_cons | -153804.6 15994.58 -9.62 0.000 -187168.7 -120440.5


------------------------------------------------------------------------------

. correl log_Imports log_GDP log_CPI

(obs=23)

| log_Im~s log_GDP log_CPI

-------------+---------------------------

log_Imports | 1.0000

log_GDP | 0.9930 1.0000

log_CPI | 0.9424 0.9553 1.0000

. reg log_Imports log_GDP

Source | SS df MS Number of obs = 23

-------------+---------------------------------- F(1, 21) = 1493.19

Model | 3.9175e+10 1 3.9175e+10 Prob > F = 0.0000

Residual | 550950315 21 26235729.3 R-squared = 0.9861

-------------+---------------------------------- Adj R-squared = 0.9855

Total | 3.9726e+10 22 1.8057e+09 Root MSE = 5122.1

------------------------------------------------------------------------------

log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_GDP | 3462.636 89.60858 38.64 0.000 3276.285 3648.987

_cons | -142196.9 7341.077 -19.37 0.000 -157463.5 -126930.3


------------------------------------------------------------------------------

. reg log_Imports log_CPI

Source | SS df MS Number of obs = 23

-------------+---------------------------------- F(1, 21) = 166.56

Model | 3.5278e+10 1 3.5278e+10 Prob > F = 0.0000

Residual | 4.4480e+09 21 211808596 R-squared = 0.8880

-------------+---------------------------------- Adj R-squared = 0.8827

Total | 3.9726e+10 22 1.8057e+09 Root MSE = 14554

------------------------------------------------------------------------------

log_Imports | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_CPI | 1579.323 122.3747 12.91 0.000 1324.831 1833.815

_cons | 36597.85 8455.932 4.33 0.000 19012.77 54182.92

------------------------------------------------------------------------------

. reg log_GDP log_CPI

Source | SS df MS Number of obs = 23

-------------+---------------------------------- F(1, 21) = 219.13

Model | 2981.59716 1 2981.59716 Prob > F = 0.0000

Residual | 285.740339 21 13.6066828 R-squared = 0.9125

-------------+---------------------------------- Adj R-squared = 0.9084

Total | 3267.3375 22 148.515341 Root MSE = 3.6887


------------------------------------------------------------------------------

log_GDP | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_CPI | .4591392 .0310167 14.80 0.000 .3946364 .523642

_cons | 51.43969 2.143215 24.00 0.000 46.98263 55.89675

------------------------------------------------------------------------------

. vif

Variable | VIF 1/VIF

-------------+----------------------

log_CPI | 1.00 1.000000

-------------+----------------------

Mean VIF | 1.00

The regression analysis shows that `log_GDP` is a significant positive predictor of `log_Imports`, while `log_CPI` is not significant when both variables are included; the model explains 98.66% of the variance in `log_Imports`, indicating a strong fit. Simple regressions confirm that `log_GDP` and `log_CPI` each have strong individual relationships with `log_Imports`. The reported VIF of 1.00, however, comes from running `vif` after the single-regressor auxiliary regression, where it is trivially 1; the auxiliary R-squared of 0.9125 implies a VIF of about 1/(1 - 0.9125) ≈ 11.4 for the two-regressor model, and the sign flip and insignificance of `log_CPI` alongside pairwise correlations above 0.94 are classic symptoms of multicollinearity.

5.3
. gen log_M4 = M4

. gen log_Y = Y

. gen log_R1 = R1

. gen log_R2 = R2

. reg log_M4 log_Y log_R1

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(2, 35) = 542.87

Model | 1.7859e+12 2 8.9296e+11 Prob > F = 0.0000

Residual | 5.7571e+10 35 1.6449e+09 R-squared = 0.9688

-------------+---------------------------------- Adj R-squared = 0.9670

Total | 1.8435e+12 37 4.9824e+10 Root MSE = 40557

------------------------------------------------------------------------------

log_M4 | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_Y | 2.301075 .0698359 32.95 0.000 2.1593 2.442849

log_R1 | -11359.09 1928.848 -5.89 0.000 -15274.86 -7443.32

_cons | -450405.6 26855.5 -16.77 0.000 -504925.2 -395886.1

------------------------------------------------------------------------------
. reg log_M4 log_Y log_R1 log_R2

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(3, 34) = 445.59

Model | 1.7978e+12 3 5.9926e+11 Prob > F = 0.0000

Residual | 4.5725e+10 34 1.3449e+09 R-squared = 0.9752

-------------+---------------------------------- Adj R-squared = 0.9730

Total | 1.8435e+12 37 4.9824e+10 Root MSE = 36672

------------------------------------------------------------------------------

log_M4 | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_Y | 2.267297 .0641643 35.34 0.000 2.1369 2.397695

log_R1 | -6553.293 2379.922 -2.75 0.009 -11389.88 -1716.709

log_R2 | -7556.556 2546.174 -2.97 0.005 -12731 -2382.108

_cons | -427080.8 25523.33 -16.73 0.000 -478950.5 -375211.2

------------------------------------------------------------------------------

. correl log_M4 log_Y log_R1 log_R2

(obs=38)

| log_M4 log_Y log_R1 log_R2

-------------+------------------------------------

log_M4 | 1.0000

log_Y | 0.9684 1.0000

log_R1 | 0.0075 0.1862 1.0000


log_R2 | -0.1838 -0.0055 0.6675 1.0000

. reg log_Y log_R1 log_R2

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(2, 35) = 1.22

Model | 2.2724e+10 2 1.1362e+10 Prob > F = 0.3082

Residual | 3.2666e+11 35 9.3331e+09 R-squared = 0.0650

-------------+---------------------------------- Adj R-squared = 0.0116

Total | 3.4938e+11 37 9.4427e+09 Root MSE = 96608

------------------------------------------------------------------------------

log_Y | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_R1 | 9457.304 6062.315 1.56 0.128 -2849.849 21764.46

log_R2 | -7038.804 6601.138 -1.07 0.294 -20439.83 6362.219

_cons | 295270 45054.07 6.55 0.000 203805.3 386734.6

------------------------------------------------------------------------------

. reg log_R1 log_Y log_R2

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(2, 35) = 16.26

Model | 220.555828 2 110.277914 Prob > F = 0.0000

Residual | 237.439335 35 6.78398099 R-squared = 0.4816


-------------+---------------------------------- Adj R-squared = 0.4519

Total | 457.995163 37 12.3782476 Root MSE = 2.6046

------------------------------------------------------------------------------

log_R1 | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_Y | 6.87e-06 4.41e-06 1.56 0.128 -2.07e-06 .0000158

log_R2 | .7279343 .1325253 5.49 0.000 .4588937 .996975

_cons | 1.380364 1.797682 0.77 0.448 -2.269125 5.029853

------------------------------------------------------------------------------

. reg log_R2 log_Y log_R1

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(2, 35) = 15.09

Model | 178.83366 2 89.4168301 Prob > F = 0.0000

Residual | 207.444574 35 5.92698784 R-squared = 0.4630

-------------+---------------------------------- Adj R-squared = 0.4323

Total | 386.278234 37 10.4399523 Root MSE = 2.4345

------------------------------------------------------------------------------

log_R2 | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

log_Y | -4.47e-06 4.19e-06 -1.07 0.294 -.000013 4.04e-06

log_R1 | .6359773 .1157839 5.49 0.000 .4009235 .8710311

_cons | 3.086696 1.612068 1.91 0.064 -.1859746 6.359368


------------------------------------------------------------------------------

. vif

Variable | VIF 1/VIF

-------------+----------------------

log_R1 | 1.04 0.965332

log_Y | 1.04 0.965332

-------------+----------------------

Mean VIF | 1.04

The regression analysis shows that `log_Y` is a significant positive predictor of `log_M4`, while `log_R1` has a significant negative relationship. Adding `log_R2` slightly improves the fit, and both `log_R1` and `log_R2` are significant negative predictors of `log_M4`. The model explains 97.52% of the variance in `log_M4`, indicating a strong fit. The correlation matrix shows a moderate correlation (0.67) between `log_R1` and `log_R2`, but the auxiliary regressions imply VIFs below about 2 (e.g. 1/(1 - 0.4816) ≈ 1.93 for `log_R1`), so multicollinearity is not a serious concern here.

If multicollinearity were to occur, we could address it by:

1. Removing highly correlated predictors.

2. Combining correlated variables into a single predictor.

3. Using techniques like Ridge Regression or Principal Component Analysis (PCA) to


mitigate multicollinearity.
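As a sketch of remedy 3, ridge regression replaces the OLS solution with b = (X′X + kP)⁻¹X′y for a penalty k > 0, shrinking the coefficients of nearly collinear regressors. A minimal illustration on synthetic data (illustrative only, not the textbook dataset):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)      # nearly collinear with x1
y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = 1.0
P = np.eye(X.shape[1])
P[0, 0] = 0.0                            # leave the intercept unpenalized

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_ridge = np.linalg.solve(X.T @ X + k * P, X.T @ y)
```

The penalized slope coefficients are never larger in norm than the OLS ones, which is the stabilizing effect ridge buys at the cost of some bias.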

____________________________________________________________________________________

Chapter 06

-----------

6.1
Step 1: Run Regression

Command: regress price sqrft

Source | SS df MS Number of obs = 88

-------------+---------------------------------- F(1, 86) = 140.79

Model | 5.6980e+11 1 5.6980e+11 Prob > F = 0.0000

Residual | 3.4805e+11 86 4.0471e+09 R-squared = 0.6208

-------------+---------------------------------- Adj R-squared = 0.6164

Total | 9.1785e+11 87 1.0550e+10 Root MSE = 63617

------------------------------------------------------------------------------

price | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

sqrft | 140.211 11.81664 11.87 0.000 116.7203 163.7017

_cons | 11204.14 24742.61 0.45 0.652 -37982.53 60390.82

------------------------------------------------------------------------------

Step 2: Check for Heteroskedasticity

Using White Test:

Command: estat imtest, white


White's test

H0: Homoskedasticity

Ha: Unrestricted heteroskedasticity

chi2(2) = 16.14

Prob > chi2 = 0.0003

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 16.14 2 0.0003

Skewness | 12.28 1 0.0005

Kurtosis | -503685.18 1 1.0000

---------------------+----------------------------

Total | -503656.76 4 1.0000

--------------------------------------------------

As the p-value is significant at the 5% level, the null hypothesis is rejected, which means heteroskedasticity is present.
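The idea behind these tests can be sketched by hand: regress the squared OLS residuals on the regressors (White's version adds squares and cross-products); under homoskedasticity, n·R² from that auxiliary regression is asymptotically chi-squared. A minimal Breusch-Pagan-style version on synthetic heteroskedastic data (illustrative, not the housing dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=x)        # error s.d. grows with x

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
u2 = (y - X @ b) ** 2                          # squared OLS residuals

g = np.linalg.lstsq(X, u2, rcond=None)[0]      # auxiliary regression of u^2 on x
fit = X @ g
r2 = 1.0 - ((u2 - fit) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
lm = n * r2                                    # LM statistic
p = stats.chi2.sf(lm, df=1)                    # small p => heteroskedasticity
```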
Step 3: Perform GLS Estimation and Check for Heteroskedasticity

Case (a): Var(ui) = σ²sqrft_i

Command: gen weight_a = 1/sqrft

Command: reg price weight_a

Command: estat imtest, white

(Note: as written, the second command regresses price on the constructed weight variable itself; a weighted least-squares fit would instead be something like `reg price sqrft [aweight = weight_a]`.)

Results: White's test

H0: Homoskedasticity

Ha: Unrestricted heteroskedasticity

Source | SS df MS Number of obs = 88

-------------+---------------------------------- F(1, 86) = 78.28

Model | 4.3736e+11 1 4.3736e+11 Prob > F = 0.0000

Residual | 4.8049e+11 86 5.5871e+09 R-squared = 0.4765

-------------+---------------------------------- Adj R-squared = 0.4704

Total | 9.1785e+11 87 1.0550e+10 Root MSE = 74747

------------------------------------------------------------------------------

price | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

weight_a | -5.58e+08 6.30e+07 -8.85 0.000 -6.83e+08 -4.32e+08

_cons | 589419.3 34377.05 17.15 0.000 521080 657758.7

------------------------------------------------------------------------------

-------------------------------------------------

Source | chi2 df p
---------------------+----------------------------

Heteroskedasticity | 17.34 2 0.0002

Skewness | 12.07 1 0.0005

Kurtosis |-6837088.39 1 1.0000

---------------------+----------------------------

Total |-6837058.98 4 1.0000

--------------------------------------------------

chi2(2) = 17.34

Prob > chi2 = 0.0002

As the p-value is significant at the 5% level, the null hypothesis is rejected, which means heteroskedasticity is present.

Case (b): Var(ui) = σ²sqrft²_i

Command: gen weight_b = 1/(sqrft^2)

Command: reg price weight_b

Command: estat imtest, white

Result: White's test

H0: Homoskedasticity

Ha: Unrestricted heteroskedasticity


Source | SS df MS Number of obs = 88

-------------+---------------------------------- F(1, 86) = 52.67

Model | 3.4860e+11 1 3.4860e+11 Prob > F = 0.0000

Residual | 5.6925e+11 86 6.6192e+09 R-squared = 0.3798

-------------+---------------------------------- Adj R-squared = 0.3726

Total | 9.1785e+11 87 1.0550e+10 Root MSE = 81358

------------------------------------------------------------------------------

price | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

weight_b | -4.64e+11 6.40e+10 -7.26 0.000 -5.91e+11 -3.37e+11

_cons | 431651.4 20913.49 20.64 0.000 390076.7 473226

------------------------------------------------------------------------------

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 2.79 1 0.0946

Skewness | 9.90 1 0.0017

Kurtosis | -1.97e+07 1 1.0000

---------------------+----------------------------

Total | -1.97e+07 3 1.0000

--------------------------------------------------

chi2(1) = 2.79

Prob > chi2 = 0.0946

As the p-value is insignificant at the 5% level, the null hypothesis is not rejected, which means there is no evidence of heteroskedasticity.

Conclusion:

The initial model shows significant heteroskedasticity (White's test p = 0.0003).

GLS Case (a): Heteroskedasticity remains (White's test p = 0.0002), suggesting that the chosen weighting scheme did not effectively address it.

GLS Case (b): Using weights proportional to 1/sqrft² eliminated the evidence of heteroskedasticity (White's test p-value = 0.0946).

6.2

Step 1: Run Regression

Command: regress netprofitsales Noempl

Result:

Source | SS df MS Number of obs = 143

-------------+---------------------------------- F(1, 141) = 1.98


Model | .01466615 1 .01466615 Prob > F = 0.1621

Residual | 1.04691193 141 .007424907 R-squared = 0.0138

-------------+---------------------------------- Adj R-squared = 0.0068

Total | 1.06157808 142 .007475902 Root MSE = .08617

------------------------------------------------------------------------------

netprofits~s | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Noempl | .000498 .0003544 1.41 0.162 -.0002025 .0011986

_cons | .0206526 .0139538 1.48 0.141 -.0069332 .0482383

------------------------------------------------------------------------------

[Stata text scatter plot: netprofit/sales (vertical axis, -0.211 to 0.279) against number of employees (horizontal axis, 3 to 140).]

Step 2: White test

Command: estat imtest, white

Result:

White's test

H0: Homoskedasticity

Ha: Unrestricted heteroskedasticity

chi2(2) = 0.05

Prob > chi2 = 0.9753

As the p-value is insignificant at the 5% level, the null hypothesis is not rejected, which means there is no evidence of heteroskedasticity.
6.3

Step 1: Run Regression

reg Y X

Result:

Source | SS df MS Number of obs = 38

-------------+---------------------------------- F(1, 36) = 25408.97

Model | 2320612.02 1 2320612.02 Prob > F = 0.0000

Residual | 3287.8952 36 91.3304222 R-squared = 0.9986

-------------+---------------------------------- Adj R-squared = 0.9985

Total | 2323899.91 37 62808.1057 Root MSE = 9.5567

------------------------------------------------------------------------------

Y | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

X | 1.059399 .0066461 159.40 0.000 1.04592 1.072878

_cons | -8.672959 1.845795 -4.70 0.000 -12.4164 -4.929513

Breusch–Pagan Test

Command: hettest

Result:

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of Y

H0: Constant variance


chi2(1) = 0.72

Prob > chi2 = 0.3970

As the p-value (0.3970) is insignificant at the 5% level, we fail to reject the null hypothesis: there is no evidence of heteroskedasticity.
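The Breusch–Pagan idea can be sketched directly: regress the squared OLS residuals on the fitted values and form an LM statistic (a minimal numpy version of Koenker's studentized variant, LM = n·R², compared with chi2(1); the data below are synthetic, not the exercise's):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 38
x = np.linspace(50, 1200, n)
y = -8.7 + 1.06 * x + rng.normal(0, 9.5, n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ beta
e2 = (y - yhat) ** 2

Z = np.column_stack([np.ones(n), yhat])    # auxiliary regressor: fitted values
g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
r2 = 1 - np.sum((e2 - Z @ g) ** 2) / np.sum((e2 - e2.mean()) ** 2)
bp_stat = n * r2                            # compare with chi2(1)
print(round(bp_stat, 2))
```

Stata's hettest uses a score-test version of the same idea; both reject when the squared residuals move systematically with the fitted values.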

White Test

Command: estat imtest, white

Result: White's test

H0: Homoskedasticity

Ha: Unrestricted heteroskedasticity

chi2(2) = 3.81

Prob > chi2 = 0.1487

As the p-value (0.1487) is insignificant at the 5% level, we fail to reject the null hypothesis: there is no evidence of heteroskedasticity.

Exercise 6.4

Step 1: Run Regression

Command: reg sleep totwrk educ age yngkid male

Result:
Source | SS df MS Number of obs = 706

-------------+---------------------------------- F(5, 700) = 19.38

Model | 16933101.4 5 3386620.28 Prob > F = 0.0000

Residual | 122306734 700 174723.906 R-squared = 0.1216

-------------+---------------------------------- Adj R-squared = 0.1153

Total | 139239836 705 197503.313 Root MSE = 418

------------------------------------------------------------------------------

sleep | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

totwrk | -.1656902 .0180061 -9.20 0.000 -.2010425 -.1303378

educ | -11.76532 5.87132 -2.00 0.045 -23.29283 -.2378133

age | 2.009938 1.520833 1.32 0.187 -.9760034 4.995879

yngkid | 4.784242 50.01991 0.10 0.924 -93.42278 102.9913

male | 87.54557 34.66501 2.53 0.012 19.48572 155.6054

_cons | 3640.234 114.332 31.84 0.000 3415.759 3864.709

------------------------------------------------------------------------------

Step 2: Breusch–Pagan test for heteroskedasticity

Command: hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of sleep

H0: Constant variance


chi2(1) = 2.42

Prob > chi2 = 0.1199

As the p-value (0.1199) is insignificant at the 5% level, we fail to reject the null hypothesis: there is no evidence of heteroskedasticity.

Step 3: Check whether the estimated variance of u is higher for men than for women

Command:

predict uhat, residuals

gen uhat_sq = uhat^2

reg uhat_sq male

Result:

Source | SS df MS Number of obs = 706

-------------+---------------------------------- F(1, 704) = 1.25

Model | 1.5848e+11 1 1.5848e+11 Prob > F = 0.2648

Residual | 8.9597e+13 704 1.2727e+11 R-squared = 0.0018

-------------+---------------------------------- Adj R-squared = 0.0003

Total | 8.9756e+13 705 1.2731e+11 Root MSE = 3.6e+05

------------------------------------------------------------------------------

uhat_sq | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------
male | -30234.55 27093.98 -1.12 0.265 -83429.22 22960.13

_cons | 190369.1 20393.91 9.33 0.000 150328.9 230409.2

------------------------------------------------------------------------------

As the p-value (0.265) on the male dummy in the regression of uhat_sq on male is insignificant, the variance of the residuals does not differ significantly between men and women.
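This check works because, in a regression of uhat² on a single dummy, the slope equals the difference in the mean squared residual between the two groups. A small numpy sketch verifying that identity on synthetic data (equal variances by construction):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 706
male = rng.integers(0, 2, n)
uhat = rng.normal(0, 400, n)            # residuals with equal variance by design
u2 = uhat ** 2

# OLS of squared residuals on the dummy.
X = np.column_stack([np.ones(n), male])
b, *_ = np.linalg.lstsq(X, u2, rcond=None)

# Slope = mean(u2 | male) - mean(u2 | female); intercept = mean(u2 | female).
diff = u2[male == 1].mean() - u2[male == 0].mean()
assert np.isclose(b[1], diff)
print(round(b[1], 1), round(diff, 1))
```

So an insignificant slope is exactly an insignificant difference in residual variance across the groups.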

Exercise 6.5

Step 1: Run Regression

Command: regress price lotsize sqrft bdrms

Result:

Source | SS df MS Number of obs = 88

-------------+---------------------------------- F(3, 84) = 57.46

Model | 6.1713e+11 3 2.0571e+11 Prob > F = 0.0000

Residual | 3.0072e+11 84 3.5800e+09 R-squared = 0.6724

-------------+---------------------------------- Adj R-squared = 0.6607

Total | 9.1785e+11 87 1.0550e+10 Root MSE = 59833

------------------------------------------------------------------------------

price | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

lotsize | 2.067707 .6421258 3.22 0.002 .790769 3.344644

sqrft | 122.7782 13.23741 9.28 0.000 96.45415 149.1022

bdrms | 13852.52 9010.145 1.54 0.128 -4065.14 31770.18


_cons | -21770.31 29475.04 -0.74 0.462 -80384.66 36844.04

Step 2: Test for heteroskedasticity

Command: hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of price

H0: Constant variance

chi2(1) = 20.55

Prob > chi2 = 0.0000

As the p-value (0.0000) is significant at the 5% level, the null hypothesis of constant variance is rejected: heteroskedasticity is present.
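When heteroskedasticity is detected, a standard remedy is to keep the OLS point estimates but report White (heteroskedasticity-robust) standard errors, as Stata's vce(robust) option does. A minimal numpy sketch of the HC1 sandwich estimator on synthetic data (dimensions mimic the 88-house sample; nothing here uses the actual dataset):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 88, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
e = rng.normal(size=n) * (1 + np.abs(X[:, 1]))   # heteroskedastic errors
y = X @ np.array([1.0, 2.0, 0.5, -1.0]) + e

# OLS estimates and residuals.
beta = np.linalg.solve(X.T @ X, X.T @ y)
u = y - X @ beta

# HC1 sandwich: (X'X)^-1 [sum u_i^2 x_i x_i'] (X'X)^-1, with df correction.
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (u**2)[:, None])
dof = n / (n - X.shape[1])
V = dof * XtX_inv @ meat @ XtX_inv
robust_se = np.sqrt(np.diag(V))
print(np.round(robust_se, 3))
```

The robust standard errors are valid under heteroskedasticity of unknown form, unlike the classical ones printed in the table above.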

____________________________________________________________________________________

Chapter : 07

Exercise 7.1

Solution:

Step 1: Run Regression


reg I R Y

Source | SS df MS Number of obs = 30

-------------+---------------------------------- F(2, 27) = 59.98

Model | 1329.98704 2 664.993518 Prob > F = 0.0000

Residual | 299.335844 27 11.0865127 R-squared = 0.8163

-------------+---------------------------------- Adj R-squared = 0.8027

Total | 1629.32288 29 56.1835476 Root MSE = 3.3296

------------------------------------------------------------------------------

I | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

R | -.1841962 .1264157 -1.46 0.157 -.4435798 .0751874

Y | .7699114 .0717905 10.72 0.000 .6226094 .9172134

_cons | 6.224938 2.510894 2.48 0.020 1.073009 11.37687

Step 2: Generate time

gen time=_n

tsset time

Step 3: Run Durbin–Watson test

dwstat
Durbin–Watson d-statistic( 3, 30) = .852153

The Durbin–Watson statistic is 0.85, well below 2, which indicates positive autocorrelation.
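The statistic itself is d = Σ(e_t - e_{t-1})² / Σe_t², bounded between 0 and 4, with d ≈ 2(1 - ρ); values well below 2 point to positive autocorrelation. A quick numpy computation on a synthetic AR(1) residual series (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 30
e = np.empty(T)
e[0] = rng.normal()
for t in range(1, T):
    e[t] = 0.6 * e[t - 1] + rng.normal()   # positively autocorrelated residuals

# Durbin-Watson: sum of squared first differences over sum of squares.
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(round(d, 3))
```

With positively autocorrelated residuals the differences e_t - e_{t-1} are small, pulling d below 2, which is the pattern seen in the output above.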

Step 4: Resolve autocorrelation with the Cochrane–Orcutt procedure

Command: prais I R Y, corc

Iteration 0: rho = 0.0000

Iteration 1: rho = 0.5677

Iteration 2: rho = 0.6138

Iteration 3: rho = 0.6146

Iteration 4: rho = 0.6146

Iteration 5: rho = 0.6146

Cochrane–Orcutt AR(1) regression with iterated estimates

Source | SS df MS Number of obs = 29

-------------+---------------------------------- F(2, 26) = 19.83

Model | 283.65568 2 141.82784 Prob > F = 0.0000

Residual | 185.963077 26 7.15242602 R-squared = 0.6040

-------------+---------------------------------- Adj R-squared = 0.5736

Total | 469.618757 28 16.7720985 Root MSE = 2.6744

------------------------------------------------------------------------------

I | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------
R | -.2957754 .078671 -3.76 0.001 -.4574859 -.1340649

Y | .7848538 .144227 5.44 0.000 .488391 1.081317

_cons | 7.329872 3.658536 2.00 0.056 -.1903569 14.8501

-------------+----------------------------------------------------------------

rho | .6146382

------------------------------------------------------------------------------

Durbin–Watson statistic (original) = 0.852153

Durbin–Watson statistic (transformed) = 1.608128
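One Cochrane–Orcutt pass, which prais ..., corc iterates to convergence, estimates ρ from the OLS residuals and then re-runs OLS on the quasi-differenced data, losing the first observation. A hedged numpy sketch on synthetic data (true coefficients 1.0 and 2.0 are assumptions for the illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 30
x = rng.normal(size=T)
u = np.empty(T)
u[0] = rng.normal()
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + rng.normal()   # AR(1) errors
y = 1.0 + 2.0 * x + u

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# First-stage OLS and rho estimate from the residuals.
X = np.column_stack([np.ones(T), x])
b = ols(X, y)
e = y - X @ b
rho = (e[:-1] @ e[1:]) / (e[:-1] @ e[:-1])

# Quasi-difference: y*_t = y_t - rho*y_{t-1}; the constant becomes (1 - rho).
ys = y[1:] - rho * y[:-1]
Xs = np.column_stack([np.ones(T - 1) * (1 - rho), x[1:] - rho * x[:-1]])
b_co = ols(Xs, ys)
print(round(rho, 3), np.round(b_co, 3))
```

Iterating these two steps until ρ stabilizes reproduces the rho = 0.6146 sequence shown in the Stata iterations above.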

Exercise 7.2

Step 1: Run Regression

reg Q P F R

Source | SS df MS Number of obs = 30

-------------+---------------------------------- F(3, 26) = 20.77

Model | 272.177333 3 90.7257777 Prob > F = 0.0000

Residual | 113.590667 26 4.36887181 R-squared = 0.7055

-------------+---------------------------------- Adj R-squared = 0.6716

Total | 385.768 29 13.3023448 Root MSE = 2.0902

------------------------------------------------------------------------------

Q | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------
P | .3162285 .0876154 3.61 0.001 .1361326 .4963245

F | .0615832 .0809901 0.76 0.454 -.1048944 .2280608

R | .0581384 .0080687 7.21 0.000 .041553 .0747239

_cons | -11.59809 14.96435 -0.78 0.445 -42.35776 19.16157

------------------------------------------------------------------------------

Step 2: Generate time variable

gen time=_n

tsset time

Step 3: Run Durbin–Watson test

dwstat

Durbin–Watson d-statistic( 4, 30) = 1.805563

As the Durbin–Watson statistic (1.81) is close to 2, autocorrelation appears weak; we confirm this with the Breusch–Godfrey test.

Step 4: Run Breusch–Godfrey test

bgodfrey, lags(1)

Breusch–Godfrey LM test for autocorrelation


---------------------------------------------------------------------------

lags(p) | chi2 df Prob > chi2

-------------+-------------------------------------------------------------

1 | 0.044 1 0.8331

---------------------------------------------------------------------------

H0: no serial correlation

The p-value (0.8331) is insignificant, so the null hypothesis of no serial correlation is not rejected: autocorrelation is not present.
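The Breusch–Godfrey LM test with one lag regresses the OLS residuals on the original regressors plus the lagged residual and compares T·R² of that auxiliary regression with chi2(1). A minimal numpy sketch on synthetic data (three regressors mimic P, F, R; coefficients are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 30
X = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])
y = X @ np.array([1.0, 0.3, 0.06, 0.06]) + rng.normal(size=T)

# Original OLS and residuals.
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b

# Auxiliary regression: residual on X and its own first lag (0 for t=1).
e_lag = np.concatenate([[0.0], e[:-1]])
Z = np.column_stack([X, e_lag])
g = np.linalg.lstsq(Z, e, rcond=None)[0]
r2 = 1 - np.sum((e - Z @ g) ** 2) / np.sum((e - e.mean()) ** 2)
lm = T * r2                              # compare with chi2(1)
print(round(lm, 3))
```

With serially uncorrelated errors the lagged residual explains almost nothing, giving a small LM statistic like the 0.044 reported above.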

____________________________________________________________________________________

Chapter: 08

------------

Exercise 8.1

Step 1

. summarize

Variable | Obs Mean Std. dev. Min Max

-------------+---------------------------------------------------------

iq | 901 101.0866 15.06789 50 145

wage | 901 1191.266 81.10248 1023 1615.6

Step 2

. reg wage iq

Source | SS df MS Number of obs = 901


-------------+---------------------------------- F(1, 899) = 93.38

Model | 557036.385 1 557036.385 Prob > F = 0.0000

Residual | 5362814.78 899 5965.31121 R-squared = 0.0941

-------------+---------------------------------- Adj R-squared = 0.0931

Total | 5919851.16 900 6577.6124 Root MSE = 77.235

------------------------------------------------------------------------------

wage | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

iq | 1.65108 .1708609 9.66 0.000 1.315747 1.986412

_cons | 1024.364 17.46236 58.66 0.000 990.0924 1058.636

------------------------------------------------------------------------------

Step 3

. display iq*10

740

(Note: display evaluates iq at the first observation, so this is 10 × 74 = 740; the estimated wage effect of a 10-point increase in IQ is _b[iq]*10 ≈ 16.51.)

Step 4

. g lniq= log( iq)

Step 5

. reg wage lniq

Source | SS df MS Number of obs = 901

-------------+---------------------------------- F(1, 899) = 89.15

Model | 534104.298 1 534104.298 Prob > F = 0.0000


Residual | 5385746.87 899 5990.81965 R-squared = 0.0902

-------------+---------------------------------- Adj R-squared = 0.0892

Total | 5919851.16 900 6577.6124 Root MSE = 77.4

------------------------------------------------------------------------------

wage | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

lniq | 154.5317 16.3662 9.44 0.000 122.4113 186.6521

_cons | 479.7907 75.39528 6.36 0.000 331.8195 627.762

------------------------------------------------------------------------------

. display lniq*10

43.040652

(Again, this is 10 × ln(iq) evaluated at the first observation, 10 × ln(74) ≈ 43.04, not a marginal effect.)

Comments

- Summary statistics: the table shows the mean, standard deviation, minimum, and maximum of iq and wage.

- Simple linear regression: the regression of wage on iq is significant (p < 0.05); each 1-point increase in IQ is associated with an increase of approximately 1.65 in wage.

- Transformation: iq is transformed into its natural logarithm, lniq.

- Regression with transformed variable: the regression of wage on lniq is also significant (p < 0.05). In this level–log model, a 10% increase in IQ is associated with a wage increase of approximately b1·ln(1.1) ≈ 154.53 × 0.0953 ≈ 14.73 (roughly b1/100 ≈ 1.55 per 1% increase), not 154.53.
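The level–log arithmetic can be checked directly: with wage = b0 + b1·ln(iq), a 10% rise in iq changes ln(iq) by ln(1.1), so the predicted wage change is b1·ln(1.1), and the common shortcut is b1/100 per 1% increase.

```python
import math

b1 = 154.5317                      # lniq coefficient from the output above
exact = b1 * math.log(1.1)         # wage change for an exact 10% rise in iq
approx = b1 / 100 * 10             # shortcut: b1/100 per 1% increase, times 10
print(round(exact, 2), round(approx, 2))   # 14.73 15.45
```

Both numbers are in wage units per 10% change in IQ; the shortcut overstates slightly because ln(1.1) < 0.10.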
____________________________________________________________________________________

Chapter: 21

Panel Data

Self-constructed question

. xtset id time

Panel variable: id (strongly balanced)

Time variable: time, 1960 to 1999

Delta: 1 unit

. xtreg Y X E, fe

Fixed-effects (within) regression Number of obs = 320

Group variable: id Number of groups = 8

R-squared: Obs per group:

Within = 0.6479 min = 40

Between = 0.9878 avg = 40.0

Overall = 0.7397 max = 40


F(2,310) = 285.27

corr(u_i, Xb) = 0.4311 Prob > F = 0.0000

------------------------------------------------------------------------------

Y | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

X | .4737093 .0218886 21.64 0.000 .4306403 .5167784

E | 1.845824 .157163 11.74 0.000 1.536583 2.155065

_cons | 52.81111 2.434349 21.69 0.000 48.02117 57.60104

-------------+----------------------------------------------------------------

sigma_u | .52193716

sigma_e | 2.6826443

rho | .03647321 (fraction of variance due to u_i)

------------------------------------------------------------------------------

F test that all u_i=0: F(7, 310) = 1.23 Prob > F = 0.2843
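The within (fixed-effects) estimator that xtreg, fe computes is just OLS after demeaning every variable by its group mean. A minimal numpy sketch on a synthetic panel with the same shape as above (8 groups × 40 periods; the true slope 0.5 is an assumption for the illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
G, T = 8, 40
gid = np.repeat(np.arange(G), T)
x = rng.normal(size=G * T)
alpha = rng.normal(size=G)[gid]          # time-invariant group effects
y = 50 + 0.5 * x + alpha + rng.normal(size=G * T)

def demean(v):
    # Subtract each group's mean (the "within" transformation).
    means = np.array([v[gid == g].mean() for g in range(G)])
    return v - means[gid]

# OLS on demeaned data; the group effects alpha drop out.
xd, yd = demean(x), demean(y)
b_fe = (xd @ yd) / (xd @ xd)
print(round(b_fe, 3))
```

Demeaning eliminates the fixed effects entirely, which is why the within estimator is consistent even when the effects are correlated with the regressors.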

. xtreg Y X E, re

Random-effects GLS regression Number of obs = 320

Group variable: id Number of groups = 8

R-squared: Obs per group:

Within = 0.6479 min = 40

Between = 0.9879 avg = 40.0

Overall = 0.7397 max = 40


Wald chi2(2) = 900.79

corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000

------------------------------------------------------------------------------

Y | Coefficient Std. err. z P>|z| [95% conf. interval]

-------------+----------------------------------------------------------------

X | .4966464 .0183199 27.11 0.000 .4607401 .5325528

E | 1.940393 .1538859 12.61 0.000 1.638783 2.242004

_cons | 50.27199 2.040134 24.64 0.000 46.2734 54.27058

-------------+----------------------------------------------------------------

sigma_u | 0

sigma_e | 2.6826443

rho | 0 (fraction of variance due to u_i)

------------------------------------------------------------------------------

. est store fe

. est store re

. hausman fe re

Note: The rank of the differenced variance matrix (0) does not equal the number of coefficients being tested (2); be sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

---- Coefficients ----

| (b) (B) (b-B) sqrt(diag(V_b-V_B))

| fe re Difference Std. err.

-------------+----------------------------------------------------------------

X | .4966464 .4966464 0 0

E | 1.940393 1.940393 0 0

------------------------------------------------------------------------------

b = Consistent under H0 and Ha; obtained from xtreg.

B = Inconsistent under Ha, efficient under H0; obtained from xtreg.

Test of H0: Difference in coefficients not systematic

chi2(0) = (b-B)'[(V_b-V_B)^(-1)](b-B)

= 0.00

Prob > chi2 = .

(V_b-V_B is not positive definite)

Both the FE and RE regressions indicate significant positive effects of X and E on Y. The Hausman test is degenerate here (chi2(0), identical coefficients in both columns): both est store commands were issued after the RE regression, so the test compared the RE estimates with themselves; est store fe must be run immediately after xtreg, fe. Even so, sigma_u = 0 in the RE output and the F test that all u_i = 0 (p = 0.2843) suggest the individual effects are negligible, so random effects (or pooled OLS) is adequate.
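For reference, the statistic Stata attempts is H = (b - B)'(V_b - V_B)^(-1)(b - B). With identical stored coefficients the difference vector is zero, reproducing chi2 = 0.00. A small numpy sketch using the (identical) coefficients and the standard errors from the two outputs above; the pseudo-inverse guards against the rank-deficient variance difference:

```python
import numpy as np

b_fe = np.array([0.4966464, 1.940393])   # stored "fe" column (actually RE)
b_re = np.array([0.4966464, 1.940393])   # stored "re" column
V_fe = np.diag([0.0218886, 0.157163]) ** 2
V_re = np.diag([0.0183199, 0.1538859]) ** 2

diff = b_fe - b_re
# Hausman statistic; pinv handles a singular variance difference.
H = diff @ np.linalg.pinv(V_fe - V_re) @ diff
print(round(H, 4))                        # 0.0, matching "chi2 = 0.00"
```

With correctly stored FE and RE estimates, diff would be nonzero and H would be compared against chi2 with as many degrees of freedom as tested coefficients.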
