Presentation: Model 4


Professors:

Econ. Gonzalo Villa-Cox, Ph. D.


Econ. Pedro Vargas, M.Sc.
MORAN ASTUDILLO STEFANO JOSUE
GARCIA BRIONES DANNA FERNANDA
LUCIO MOYANO JOEL MESIAS
MEJIA TORRES KARLA MILENA
SALAZAR PALMA DARIO ALEJANDRO
Objectives
• State a research hypothesis that is compatible with a limited dependent variable model based on data from Ecuador.

• Estimate a regression model compatible with a limited dependent variable model using data from Ecuador.

• Contrast and criticize the potential limitations of the model, and gauge the scope of the conclusions by understanding these limitations.

• Propose potential solutions, and estimate and interpret models that can address these problems.

• Identify the determinants of social security affiliation. Estimate the probability that a person is affiliated, considering important factors.
1. Discuss the assumptions necessary for the
preliminary estimate presented in class to be
consistent. Is this plausible for the proposed
application to Social Security affiliation?
Justify your answer based on academic
literature (use references in APA format).
Assumptions necessary for the preliminary estimate
presented in class to be consistent (Giles, 2015)
1. Linearity of the latent variable y*: y* = x'β + σε (1)

2. The variance of the error term is constant (homoscedastic), normalized to σ = 1

3. Independence of irrelevant alternatives

4. The error term follows a standard normal distribution

5. Exogeneity: the regressors are not correlated with the error term

6. No perfect multicollinearity
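Assumptions 1, 2 and 4 can be written compactly as the latent-variable form of the probit model. The block below is the standard textbook statement; the observation rule with a threshold at zero is an assumption added here, since the slide only shows equation (1).

```latex
y_i^{*} = x_i'\beta + \sigma\,\varepsilon_i, \qquad \varepsilon_i \mid x_i \sim N(0, 1)

y_i = \mathbf{1}\left[\, y_i^{*} > 0 \,\right]

P(y_i = 1 \mid x_i) = \Phi\!\left(\frac{x_i'\beta}{\sigma}\right)
```

Only the ratio β/σ is identified from the binary outcome, which is why σ is normalized to 1 and the response probability reduces to Φ(x'β).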
Dataset

VARIABLE CODE     STORAGE TYPE   VARIABLE LABEL
Area              Byte           Area
Ciudad            Long           City (code)
Conglomerado      Text           Conglomerate
Panelm            Text           Panelm
Vivienda          Text           Housing
P02               Byte           Gender
P03               Byte           Age
P06               Byte           Marital status
P11               Byte           Knows how to read and write
P12a              Byte           Did you receive a diploma for studies
P12b              Long           Level of education
P15               Byte           Ethnicity
P20 – P43         Byte           Employment variables
P44a – P44k       Byte, Int      Social and labor benefits
P44f              Int            Receives social security
P63 – P78         Byte, Int      Income variables
Sd01 – ced01a     Byte, Int      Unemployment variables
Ced01a            Byte           Has an identification card
nnivins           Byte           Level of education
Ingrl             Long           Labor income
Ingpc             Double         Per capita income
Condact           Byte           Labor activity condition

Labor Market Variables of Model

VARIABLE CODE     STORAGE TYPE   VARIABLE LABEL
Gender            Byte           Gender
Age               Float          Age
Education         Float          Level of education
Occupation        Float          Labor market activity
agejob            Float          Years working
Is this plausible for the proposed application to Social Security affiliation? (INEC, 2023)
• According to the National Employment, Unemployment and Underemployment Survey (INEC, 2023), the dataset and the model proposed in class fit the context of social security affiliation: the survey variables, such as employment, social and labor benefits, labor activity condition and others, interact in a way that makes it possible to develop the preliminary estimate presented in class.

• Based on government literature, it is possible to suggest that the characteristics of the labor market considered in the model do not influence the error term.

• The independent variables of the preliminary estimate are not correlated with one another.


2. Clearly discuss what the
presented model implies. Interpret
the marginal effects identified
by the model in point 1 (if correct).
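Clearly discuss what the presented model implies (Werth, 2022).

$P(y_i = 1 \mid X_i) = \Phi(\beta_0 + \beta_1 \cdot \text{gender} + \beta_2 \cdot \text{age} + \beta_3 \cdot \text{instruction} + \beta_4 \cdot \text{occupation} + \beta_5 \cdot \text{jobyears})$

The Pseudo R2 values of 0.5238 and 0.4731 for the two models may suggest weak explanatory power.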
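As a rough illustration of what the index model implies, the sketch below maps a linear index into a probability of affiliation using the probit link (standard normal CDF) and the logit link (logistic CDF). The coefficient values and the example individual are hypothetical placeholders, not the estimates obtained in class.

```python
import math

def probit_prob(index):
    """Standard normal CDF evaluated at the linear index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Logistic CDF evaluated at the linear index."""
    return 1.0 / (1.0 + math.exp(-index))

def linear_index(beta, x):
    """beta[0] is the intercept; beta[1:] pair with the regressors in x."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

# Hypothetical coefficients: intercept, gender, age, instruction, occupation, jobyears
beta = [-1.0, 0.1, 0.02, 0.05, -0.3, 0.01]
# Hypothetical individual: gender, age, instruction, occupation, jobyears
x = [1, 35, 10, 2, 5]

index = linear_index(beta, x)
p_probit = probit_prob(index)
p_logit = logit_prob(index)
print(f"index={index:.2f}  probit={p_probit:.3f}  logit={p_logit:.3f}")
```

Both links turn the same index into a number between 0 and 1; they differ in scale, so probit and logit coefficients are not directly comparable, while the implied probabilities and marginal effects usually are.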
[Output tables: Logit model, Probit model]

Logit model: at the 5% significance level (SL), we conclude that age, occupation and instruction are significant.

Probit model: at the 5% significance level, we conclude that all labor market variables are significant.
Interpret the marginal effects identified by the model in point 1 (if correct).

[Output tables: Logit model marginal effects, Probit model marginal effects]

For the probit model, the marginal effects of the model in point 2 suggest that, at the 5% significance level, all labor market variables are significant in the model estimation.
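For reference, the marginal effect of a continuous regressor is the density of the link evaluated at the index, multiplied by that regressor's coefficient. The sketch below computes it for both links and checks the probit formula against a finite difference; the index value and the coefficient are hypothetical, not taken from the estimated models.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def probit_marginal_effect(index, beta_j):
    """dP/dx_j = phi(index) * beta_j for a continuous regressor x_j."""
    return normal_pdf(index) * beta_j

def logit_marginal_effect(index, beta_j):
    """dP/dx_j = p * (1 - p) * beta_j, with p the logistic CDF at the index."""
    p = logistic_cdf(index)
    return p * (1.0 - p) * beta_j

# Hypothetical values: linear index of one individual and the coefficient on age
index = -0.25
beta_age = 0.02

me_probit = probit_marginal_effect(index, beta_age)
me_logit = logit_marginal_effect(index, beta_age)

# Finite-difference check: raise age by a small step and see how P changes
step = 1e-6
fd_probit = (normal_cdf(index + beta_age * step) - normal_cdf(index)) / step
print(me_probit, fd_probit, me_logit)
```

For a dummy regressor such as gender, the usual counterpart is the discrete change in the predicted probability when the dummy switches from 0 to 1, rather than the derivative.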
3. Explore the database and propose additional “relevant” variables that could have been omitted in the model developed in class (Frank, 2012).
VARIABLE CODE   VARIABLE NAME   VARIABLE TYPE   DESCRIPTION
P43             Jobtype         Quantitative    Type of job held
P12a            Diploma         Quantitative    Whether he/she received a diploma for studies
Ingrl           Job income      Quantitative    Income received in any labor market activity
Jobincome, Jobtype and diploma were identified as relevant variables that could
have been omitted in the model developed in class.
3A. Justify the additional variables of the
model and implement their computation in
Stata.

Job income, job type in dependent employment and the level of education verified by a diploma are all relevant variables, according to INEC methodologies, for predicting the probability that an individual decides to affiliate to social security and, more importantly, for determining the extent to which they contribute based on their income.
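The slides implement this step in Stata, which is not reproduced in the extracted text. As a language-neutral sketch of the same idea, the Python function below derives log_jobincome, diploma and jobtype from one survey record. The record layout, the rule that non-positive income is treated as missing, and the coding "P12a equal to 1 means a diploma was received" are assumptions made for illustration and should be checked against the INEC codebook.

```python
import math

def build_extra_variables(record):
    """Derive the three additional regressors from one raw survey record.

    `record` is a hypothetical dict with keys 'ingrl', 'p12a' and 'p43'.
    """
    income = record.get("ingrl")
    # Assumption: the log is only defined for strictly positive income;
    # anything else is treated as missing.
    if income is not None and income > 0:
        log_jobincome = math.log(income)
    else:
        log_jobincome = None

    raw_diploma = record.get("p12a")
    # Assumption: code 1 means "yes"; any other recorded code means "no".
    diploma = None if raw_diploma is None else int(raw_diploma == 1)

    # The job type category is passed through unchanged.
    jobtype = record.get("p43")

    return {"log_jobincome": log_jobincome, "diploma": diploma, "jobtype": jobtype}

# Hypothetical record
example = build_extra_variables({"ingrl": 450, "p12a": 1, "p43": 2})
print(example)
```

Taking logs of income compresses the long right tail of the distribution, which is a common reason to enter income in logs rather than in levels.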
3B. Perform descriptive/exploratory
statistics for the variables.
[Figures: 1. Diploma (frequency bar chart, categories 0 and 1); 2. log_jobincome (density plot); 3. Jobtype (frequency bar chart, categories: Appointment, Permanent Contract, Temporary Contract, Per job, Per hour, Per day)]
3C. Reestimate the model seen in class with the variables in question and interpret the resulting marginal effects

The Pseudo R2 of both models increased to 0.5621 and 0.5509. The difference is not significant.

[Output tables: Logit model, Probit model]

Logit model: at the 5% significance level (SL), we conclude that the new variables are significant for the model.

Probit model: at the 5% significance level, we conclude that all labor market variables except gender are significant.
[Output tables: Logit model marginal effects, Probit model marginal effects]

For the logit model, the marginal effects suggest that, at the 5% significance level, age, instruction and occupation are significant in the model estimation.

For the probit model, the marginal effects suggest that, at the 5% significance level, all labor market variables are significant in the model estimation.
3D. Contrast the results against those originally presented in point 1. Do they change noticeably (or not)?

Logit Model

Variable        Marginal effect (1)   Marginal effect (3)   Significant? (1)   Significant? (3)
Gender          0.003                 -0.016                No                 No
Age             0.0404                0.005                 Yes                Yes
Instruction     0.029                 0.003                 Yes                Yes
Occupation      -0.1024               -0.0020               Yes                Yes
Job years       -0.0050               -0.0085               No                 No
Log_jobincome   -                     0.0077                -                  Yes
Jobtype         -                     0.00474               -                  Yes
Diploma         -                     0.01293               -                  Yes

Probit Model

Variable        Marginal effect (1)   Marginal effect (3)   Significant? (1)   Significant? (3)
Gender          -0.2626               -0.0162               Yes                No
Age             0.04                  0.020                 Yes                Yes
Instruction     0.0312                0.0098                Yes                Yes
Occupation      -0.09107              -0.0134               Yes                Yes
Job years       -0.01515              0.0044                Yes                No
Log_jobincome   -                     0.2730                -                  Yes
Jobtype         -                     -0.11409              -                  Yes
Diploma         -                     0.05207               -                  Yes

• The sign of the marginal effect changed for gender in the logit model and for job years in the probit model.
• Gender and job years are no longer significant in the probit model.
• The magnitudes decreased considerably for both significant and non-significant variables.
4. Assume that there is (at least) one factor
omitted in the regression developed in
point 3. Document with scientific literature
what types of factors can motivate the
omission (use references in APA format).
Document with scientific literature what types of factors can motivate the omission (Hanck & Arnold, 2023) (Buck, 2015) (Granados, 2016)

An omitted factor in the regression is a variable that is not included in the model, but that has a significant effect on the dependent variable and is correlated with one of the independent variables.

The omission of these factors can cause bias in the estimators of the regression coefficients and affect the validity of the inferences.

Some types of factors that may motivate the omission:
1. Factors that are unknown or ignored due to lack of information or prior knowledge.
2. Factors that are deliberately excluded for practical reasons, such as model simplicity, data availability, or computational cost.
3. Unobservable or difficult-to-measure factors, such as motivation, ability, or individual preferences.
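For intuition, the textbook linear case makes the direction and size of the bias explicit. The block below is a sketch under the assumption of a linear model with a single omitted regressor x_2 and an error that is mean-independent of both regressors; in nonlinear models such as probit and logit the expression is not this simple.

```latex
\text{True model: } \quad y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon,
\qquad E[\epsilon \mid x_1, x_2] = 0

\text{Estimated model (} x_2 \text{ omitted): } \quad y = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + u

\operatorname{plim} \hat{\tilde{\beta}}_1
  = \beta_1 + \beta_2 \, \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}
```

The bias term vanishes only if the omitted factor has no effect on the outcome (β_2 = 0) or is uncorrelated with the included regressor, which matches the two conditions in the definition above.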
5. Pose (formally) a model that is capable
of dealing with the omission indicated in
point 4, as well as the assumptions that
justify it (use references in APA format).
In order to deal with omitted variables, one could choose to employ a multiple
regression approach that incorporates all relevant variables in the analysis.
This model should take into account the correlation between the independent
variables to provide a more accurate representation of the underlying
relationship (Middela & Ramadurai, 2024). The model would be as follows:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon$

Where:
• $y$ is the dependent variable
• $x_1, \dots, x_k$ are the explanatory variables
• $\beta_0, \dots, \beta_k$ are the coefficients of the regression
• $\epsilon$ is the random error
1. Once the model was proposed with the new variables, a particular case was identified with the gender variable, which was not significant for the model.
2. Then, to confirm the situation, a correlation test was performed to check for multicollinearity between the explanatory variables (ÇAĞLAYAN, 2012).
3. The age variable presented a drawback: it showed a high correlation of 0.5842 with the job_years variable.
4. Therefore, the age variable was eliminated from the final probit and logit models.
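The correlation check in these steps can be sketched as a plain Pearson correlation between two regressors. The function below is a minimal stdlib implementation; the age and job_years values are made-up illustrative numbers, not the survey data, so they do not reproduce the 0.5842 reported in the slides.

```python
import math

def pearson_correlation(x, y):
    """Sample Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("both sequences need the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five individuals
age = [22, 30, 41, 55, 63]
job_years = [1, 8, 5, 20, 12]

r = pearson_correlation(age, job_years)
print(f"corr(age, job_years) = {r:.4f}")
```

A pairwise correlation only screens two variables at a time; a variance inflation factor, which regresses each regressor on all the others, is a common complementary check.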
6. Estimate the model proposed in point
5, and interpret the results correctly and
justified.
Logit model and marginal effects

In this logit model, the Pseudo R2 is 0.5607 and the log pseudo-likelihood is -2090.13, which suggests a good model fit.

The marginal effects of gender, occupation, log_jobincome, jobtype and diploma are significant.

The marginal effects of gender, occupation and jobtype are negative.
Probit model and marginal effects

In this probit model, the Pseudo R2 is 0.5493 and the log pseudo-likelihood is -2144.25, which suggests a good model fit.

The marginal effects of all labor market variables are significant.

The marginal effects of gender, occupation and jobtype are negative.
7. Contrast the results obtained in point 6
and explain whether there are (or not)
important differences with what you
developed in point 3.
Logit model

• In terms of significant variables, jobyears is the only variable that is not significant in both point 3 and point 6; in point 6, instruction is not significant either.
• The Pseudo R2 value did not differ by a meaningful amount (0.5621 to 0.5607), and neither did the log pseudo-likelihood (-2083 to -2090).
• The log pseudo-likelihood points the same way as the Pseudo R2: the point 3 model has the higher value (closer to zero), so it fits slightly better.

Probit Model

• In terms of significant variables, gender and jobyears became significant in point 6, while in point 3 they were not.
• The Pseudo R2 value did not differ by a meaningful amount (0.5509 to 0.5493), and neither did the log pseudo-likelihood (-2136 to -2144).
• The log pseudo-likelihood points the same way as the Pseudo R2: the point 3 model has the higher value (closer to zero), so it fits slightly better.

General Conclusion

• Both comparisons indicate that the regression models used in point 3 and point 6 are similar in their ability to explain affiliation to social security. If the models are evaluated by their Pseudo R2, the first model seems to fit the data better.
• It is emphasized that the differences are minimal and that these values are only a rough measure of the quality of the model and do not provide a complete indication of its accuracy or validity.
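To see how the two fit measures relate, the sketch below assumes the reported Pseudo R2 is McFadden's measure, 1 - lnL(model) / lnL(null), and backs out the null log-likelihood implied by the point 6 figures. The function names are illustrative; the input numbers are the ones reported in the slides.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R2: 1 - lnL(model) / lnL(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null

def implied_null_loglik(loglik_model, pseudo_r2):
    """Invert the McFadden formula to recover the null log-likelihood."""
    return loglik_model / (1.0 - pseudo_r2)

# Point 6 values reported in the slides
null_logit = implied_null_loglik(-2090.13, 0.5607)
null_probit = implied_null_loglik(-2144.25, 0.5493)
print(f"implied null lnL: logit {null_logit:.1f}, probit {null_probit:.1f}")

# Round trip: the reported pseudo R2 is recovered from the two log-likelihoods
r2_logit = mcfadden_pseudo_r2(-2090.13, null_logit)
```

When two models are estimated on the same sample they share the same null log-likelihood, so under this assumption the pseudo R2 is a monotone function of the model log-likelihood and the two measures rank the models identically.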
