Example LPM Handout 2016
T. Monsod
1. What fraction of the women in the sample participated in the labor force in 1975?
2. Estimate a linear probability model (LPM) explaining labor force participation in 1975 in
terms of husband's earnings (non-wife income in thousands, nwifeinc), years of
education, years of past labor market experience, age, number of children less than 6
years of age (kidslt6), and number of kids between 6 and 18 (kidsge6).
3. Obtain the fitted values from the LPM estimated in (2). Are any fitted values negative
or greater than 1?
4. Using the fitted values, define a new variable inlf_p = 1 if the fitted value is >= .5 and
inlf_p = 0 if the fitted value is < .5. Out of the total of 753 women in the sample, how many
are predicted to have participated in the labor force in 1975?
5. For the 325 women who did not participate, what percentage is predicted not to have
participated using the predictor inlf_p? For the 428 women who participated, what
percentage is predicted to have participated?
6. What is the overall percent correctly predicted? Do you think this is a complete
description of how well the model does?
#1
. sum inlf
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        inlf |       753    .5683931    .4956295          0          1
#2
. reg inlf nwifeinc age educ exper exper2 kidslt6 kidsge6
      Source |       SS       df       MS              Number of obs =     753
-------------+------------------------------           F(  7,   745) =   38.22
       Model |  48.8080578     7  6.97257969           Prob > F      =  0.0000
    Residual |  135.919698   745  .182442547           R-squared     =  0.2642
-------------+------------------------------           Adj R-squared =  0.2573
       Total |  184.727756   752  .245648611           Root MSE      =  .42713

------------------------------------------------------------------------------
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0034052   .0014485    -2.35   0.019    -.0062488   -.0005616
         age |  -.0160908   .0024847    -6.48   0.000    -.0209686    -.011213
        educ |   .0379953    .007376     5.15   0.000      .023515    .0524756
       exper |   .0394924   .0056727     6.96   0.000     .0283561    .0506287
      exper2 |  -.0005963   .0001848    -3.23   0.001    -.0009591   -.0002335
     kidslt6 |  -.2618105   .0335058    -7.81   0.000    -.3275875   -.1960335
     kidsge6 |   .0130122    .013196     0.99   0.324    -.0128935    .0389179
       _cons |   .5855192    .154178     3.80   0.000     .2828442    .8881943
------------------------------------------------------------------------------
#3
. predict y_hat
(option xb assumed; fitted values)
. sum y_hat
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       y_hat |       753    .5683931    .2547633  -.3451103   1.127151
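
The minimum and maximum above show that LPM fitted values are not confined to the unit interval. As a rough illustration only, the Python sketch below plugs two hypothetical women (their characteristics are invented for this example and are not taken from the data) into the coefficients reported in part 2. The helper names `profile` and `fitted_value` are likewise assumptions of this sketch, and exper2 is assumed to be the square of exper.

```python
# Coefficients copied from the regression output in part 2.
coef = {
    "_cons": 0.5855192,
    "nwifeinc": -0.0034052,
    "age": -0.0160908,
    "educ": 0.0379953,
    "exper": 0.0394924,
    "exper2": -0.0005963,
    "kidslt6": -0.2618105,
    "kidsge6": 0.0130122,
}


def profile(nwifeinc, age, educ, exper, kidslt6, kidsge6):
    """Build a regressor dictionary; exper2 is assumed to be exper squared."""
    return {
        "nwifeinc": nwifeinc,
        "age": age,
        "educ": educ,
        "exper": exper,
        "exper2": exper ** 2,
        "kidslt6": kidslt6,
        "kidsge6": kidsge6,
    }


def fitted_value(x):
    """LPM fitted value: constant plus the sum of coefficient * regressor."""
    return coef["_cons"] + sum(coef[name] * value for name, value in x.items())


# Two hypothetical women (illustrative values, not observations from the sample).
woman_a = profile(nwifeinc=20, age=30, educ=12, exper=10, kidslt6=0, kidsge6=0)
woman_b = profile(nwifeinc=30, age=55, educ=8, exper=0, kidslt6=1, kidsge6=0)

p_a = fitted_value(woman_a)
p_b = fitted_value(woman_b)

print("fitted value, woman A:", round(p_a, 4))
print("fitted value, woman B:", round(p_b, 4))
print("woman B outside [0, 1]:", not (0 <= p_b <= 1))
```

Nothing in the linear formula keeps the result between 0 and 1, which is why a fitted "probability" can come out negative for an older woman with a young child and no work experience.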
#4
. gen inlf_p=1 if y_hat>=.5
(281 missing values generated)
. replace inlf_p=0 if y_hat<.5
(281 real changes made)
. table inlf_p
----------------------
   inlf_p |      Freq.
----------+-----------
        0 |        281
        1 |        472
----------------------
#5
. tabulate inlf_p if inlf==0
     inlf_p |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        203       62.46       62.46
          1 |        122       37.54      100.00
------------+-----------------------------------
      Total |        325      100.00
. tabulate inlf_p if inlf==1
     inlf_p |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         78       18.22       18.22
          1 |        350       81.78      100.00
------------+-----------------------------------
      Total |        428      100.00
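#6
The overall percent correctly predicted is a weighted average of the two percentages
obtained in part 5, with each group's share of the sample as its weight:
= (1 - .568) (62.46) + (.568) (81.8)
= 73.44
Equivalently, (203 + 350)/753 = 73.44.
The model does a good job of predicting that a woman did participate (81.8% correct). It does
less well at predicting that a woman did not participate, getting this right less than 62.5% of
the time. The overall figure alone is therefore not a complete description of how well the
model does, because it hides the difference between the two groups.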
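
The arithmetic in part 6 can be cross-checked from the four cell counts in the part 5 tabulations. The Python sketch below is only a check of that arithmetic, not part of the original Stata session, and its variable names are chosen for this example.

```python
# Counts copied from the two tabulate outputs in part 5.
correct_nonparticipants = 203   # inlf = 0 and inlf_p = 0
wrong_nonparticipants = 122     # inlf = 0 but inlf_p = 1
wrong_participants = 78         # inlf = 1 but inlf_p = 0
correct_participants = 350      # inlf = 1 and inlf_p = 1

n0 = correct_nonparticipants + wrong_nonparticipants   # women with inlf = 0
n1 = wrong_participants + correct_participants         # women with inlf = 1
n = n0 + n1                                            # full sample

# Percent correctly predicted within each group.
pct_correct_0 = 100 * correct_nonparticipants / n0
pct_correct_1 = 100 * correct_participants / n1

# Weighted average of the two group percentages (weights = sample shares).
overall_weighted = (n0 / n) * pct_correct_0 + (n1 / n) * pct_correct_1

# Direct count of correct predictions over the whole sample.
overall_direct = 100 * (correct_nonparticipants + correct_participants) / n

print("percent correct, non-participants:", round(pct_correct_0, 2))
print("percent correct, participants:", round(pct_correct_1, 2))
print("overall (weighted average):", round(overall_weighted, 2))
print("overall (direct count):", round(overall_direct, 2))
```

The weighted average and the direct count agree by construction, since the weights are the group shares of the sample.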