Qualitative Methods
Paper 279-25
Abstract
Introduction
It is not uncommon to encounter econometric models where dependent variables are limited or qualitative. These discrete choice and limited dependent variable models need to be analyzed using more complicated methods than usual continuous models. The SAS/ETS QLIM procedure is developed to analyze mainly cross-sectional data, though you can use the QLIM procedure for panel or time-series data. PROC QLIM can analyze the following models:
ordinal probit
Introductory Example
For multinomial data analysis, it is important to organize data series in an appropriate way. Here is a simple binary data set that illustrates how you can estimate the multinomial logit model using PROC QLIM. The first five observations of a simulated data set (Ben-Akiva and Lerman 1985, p. 88) are shown as follows:

Example of Binary Choice Data

id    auto   transit    ttdif   cchoice   alt
 1    52.9      4.4      48.5   Transit     0
 2     4.1     28.5     -24.4   Transit     0
 3     4.1     86.9     -82.8   Auto        1
 4    56.2     31.6      24.6   Transit     0
 5    51.8     20.2      31.6   Transit     0

Rearranged Binary Data

id   autodum   ttime   cchoice   mode   choice
 1      1       52.9   Transit     1       0
 1      0        4.4   Transit     2       1
 2      1        4.1   Transit     1       0
 2      0       28.5   Transit     2       1
 3      1        4.1   Auto        1       1
 3      0       86.9   Auto        2       0
 4      1       56.2   Transit     1       0
 4      0       31.6   Transit     2       1
 5      1       51.8   Transit     1       0
 5      0       20.2   Transit     2       1

The new variable (TTIME) takes the value of automobile travel time (AUTO) for the first record of each individual while it contains transit travel time (TRANSIT) for the second record. We use an alternative-specific constant (AUTODUM) for conditional logit estimation.
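The rearrangement described above can be sketched with a DATA step. This is only an illustration, not the author's code: it assumes the one-record-per-individual data are stored in a data set named travel1 with the variables ID, AUTO, TRANSIT, CCHOICE, and ALT, and the output name travel2 is hypothetical.

```sas
/* Sketch only: expand each individual into one record per alternative. */
/* TRAVEL1 is assumed to hold ID, AUTO, TRANSIT, CCHOICE, and ALT;      */
/* TRAVEL2 is a hypothetical name for the rearranged data set.          */
data travel2;
   set travel1;

   /* first record: automobile alternative */
   mode    = 1;
   autodum = 1;
   ttime   = auto;
   choice  = (alt = 1);      /* 1 if the automobile was chosen */
   output;

   /* second record: transit alternative */
   mode    = 2;
   autodum = 0;
   ttime   = transit;
   choice  = (alt = 0);      /* 1 if transit was chosen */
   output;

   keep id autodum ttime cchoice mode choice;
run;

proc print data=travel2(obs=10) noobs;
run;
```

Each input observation produces two output records, so the five observations listed above become the ten records of the rearranged table.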
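For the binary choice between automobile and transit, the logit model implies that the log odds are linear in the difference between the explanatory variables of the two alternatives:

\[ \log \frac{P(y_i = 1)}{1 - P(y_i = 1)} = (x_{iA} - x_{iT})'\beta = x_{iD}'\beta \]

The binary logit and binary probit models are estimated with the following statements:

proc qlim data=travel1;
   /* binary logit */
   model alt = ttdif /
         type=blogit covest=hess optmethod=nr;
   endogenous discrete=(alt 0 1);
   output out=blg p=p_lo;
   /* binary probit */
   model alt = ttdif /
         type=bprobit covest=hess optmethod=nr;
   endogenous discrete=(alt 0 1);
   output out=bpr p=p_pr;
run;

Summary Table

                    Log L     # Observations   # Records     AIC
Rearranged data    -6.166          21              42       16.3321
Binary logit       -6.166          21              21       16.3321
Binary probit      -6.1652         21              21       16.3303

Probit Estimates

Parameter    Estimate   Std. Error   t Value
Intercept    -0.0644      0.3992      -0.16
TTDIF        -0.0300      0.0103      -2.92

The estimated probit model is

\[ P(y_i = 1) = \Phi(\hat\beta_1 + \hat\beta_2\,\mathrm{TTDIF}) \qquad \text{(probit)} \]

where $\Phi(\cdot)$ is the distribution function of a standard normal variable and $\hat\beta_1$ and $\hat\beta_2$ are maximum likelihood estimates.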
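As a rough check on the probit results, the fitted probability can be evaluated by hand for the first observation (TTDIF = 48.5). The sketch below assumes that the reported probit estimates -0.0644 and -0.0300 are the intercept and the TTDIF coefficient, respectively; that assignment is inferred from the model statement rather than stated in the output.

```latex
\begin{aligned}
\hat\beta_1 + \hat\beta_2\,\mathrm{TTDIF} &= -0.0644 + (-0.0300)(48.5) = -1.5194 \\
P(y_i = 1) &= \Phi(-1.5194) \approx 0.064
\end{aligned}
```

Under these assumptions, an individual whose automobile travel time exceeds the transit travel time by 48.5 is predicted to choose the automobile with a probability of about 6 percent, which agrees with the observed choice of transit for that individual.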
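[Figure: predicted probabilities p_lo (logit) and p_pr (probit) plotted against ttdif; horizontal axis from -100 to 100, vertical axis from 0.0 to 1.0]

With J + 1 alternatives, the multinomial logit choice probability is

\[ P(y_i = j) = \frac{\exp(x_i'\beta_j)}{1 + \sum_{k=1}^{J} \exp(x_i'\beta_k)} \]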
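The binary choice model is written in terms of a latent variable,

\[ y_i^* = x_i'\beta + \epsilon_i \]

where only the sign of the dependent variable is observed as follows:

\[ y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \]

The disturbance $\epsilon_i$ is either standard normal with CDF

\[ \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp(-t^2/2)\, dt \]

or logistic with CDF

\[ \Lambda(x) = \frac{\exp(x)}{1 + \exp(x)} \]

The corresponding choice probabilities are

\[ P(y_i = 1) = \Phi(x_i'\beta) \qquad \text{(probit)} \]

\[ P(y_i = 1) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)} \qquad \text{(logit)} \]

For a choice among several alternatives, let the utility that individual i derives from alternative j be

\[ U_{ij} = V_{ij} + \epsilon_{ij} \]

where $V_{ij}$ is a nonstochastic utility function. If we assume a linear utility function, then $V_{ij} = x_{ij}'\beta$. The error disturbances are assumed to be iid with a Gumbel (log Weibull or type I extreme value) distribution, whose distribution function is $\exp(-\exp(-\epsilon_{ij}))$. Then the event $\{y_i = j\}$ can be expressed using a random utility function as follows:

\[ U_{ij} = \max_{k \in C_i} U_{ik} \]

where $C_i$ is the choice set of individual i. Using properties of the Gumbel distribution, the probability of choosing an alternative j from the $n_i$ choices of individual i can be derived from utility maximization:

\[ P_i(j) = \frac{\exp(x_{ij}'\beta)}{\sum_{k \in C_i} \exp(x_{ik}'\beta)} \]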
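To connect the conditional logit probability with the binary example, the two-alternative case can be worked out explicitly. The sketch below assumes the standard conditional logit form $P_i(j) = \exp(x_{ij}'\beta) / \sum_k \exp(x_{ik}'\beta)$ with only two alternatives, labeled A (automobile) and T (transit) to match the $x_{iA}$, $x_{iT}$, and $x_{iD}$ notation of the introductory example.

```latex
\begin{aligned}
P_i(A) &= \frac{\exp(x_{iA}'\beta)}{\exp(x_{iA}'\beta) + \exp(x_{iT}'\beta)}
        = \frac{\exp\{(x_{iA} - x_{iT})'\beta\}}{1 + \exp\{(x_{iA} - x_{iT})'\beta\}}
        = \frac{\exp(x_{iD}'\beta)}{1 + \exp(x_{iD}'\beta)}, \\
\log \frac{P_i(A)}{1 - P_i(A)} &= x_{iD}'\beta, \qquad x_{iD} = x_{iA} - x_{iT}.
\end{aligned}
```

Dividing the numerator and denominator by $\exp(x_{iT}'\beta)$ gives the second expression, so only differences between the alternatives matter. With an alternative-specific constant and travel time as the regressors, the two-alternative conditional logit model is therefore a binary logit model in the travel time difference.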
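For the conditional logit model, the log odds of alternative j relative to alternative k depend only on the attributes of those two alternatives:

\[ \log \frac{P(y_i = j)}{P(y_i = k)} = (x_{ij} - x_{ik})'\beta \]

In the multinomial logit model for J + 1 alternatives, the regressors are characteristics of the individual and the coefficients are specific to each alternative:

\[ P(y_i = j) = \frac{\exp(x_i'\beta_j)}{1 + \sum_{k=1}^{J} \exp(x_i'\beta_k)} \]

while the reference alternative, whose coefficients are normalized to zero, is chosen with probability

\[ \frac{1}{1 + \sum_{k=1}^{J} \exp(x_i'\beta_k)} \]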
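A small numeric illustration of the multinomial logit probabilities $\exp(x_i'\beta_j) / (1 + \sum_{k=1}^{J} \exp(x_i'\beta_k))$ may help. The index values below are hypothetical and chosen only to make the arithmetic easy: $J = 2$, with $x_i'\beta_1 = \log 2$ and $x_i'\beta_2 = \log 3$.

```latex
\begin{aligned}
1 + \sum_{k=1}^{2} \exp(x_i'\beta_k) &= 1 + 2 + 3 = 6, \\
P(y_i = 1) &= \tfrac{2}{6} \approx 0.333, \qquad
P(y_i = 2) = \tfrac{3}{6} = 0.5, \\
\text{reference alternative:}\quad & \tfrac{1}{6} \approx 0.167 .
\end{aligned}
```

The three probabilities sum to one, and the log odds of alternative 2 relative to alternative 1 equal $\log(3/2) = x_i'(\beta_2 - \beta_1)$, which does not involve the reference alternative.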
Pi (jh jh ) =
= ln
j 2Ch
k2Cjh h
exp[(xhi;jh )0 h +
Ik;jh k;jh ]
0 k;1 k;L 1 1
When the decision level is at 1, there are no inclusive values. Therefore, the conditional probability is
defined as
Pi (j1 j1 ) =
Uij
Pi (j ) =
exp[(xhi;j1 1 )0 1 ]
1 0 1
j 2C1 exp[(xi;j1 ) ]
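\[ P_i(j) = \int Q_i(j \mid \gamma)\, f(\gamma \mid \theta)\, d\gamma \]

where

\[ Q_i(j \mid \gamma) = \frac{\exp(x_{ij}'\beta + z_{ij}'\gamma)}{\sum_{k \in C_i} \exp(x_{ik}'\beta + z_{ik}'\gamma)} \]

The simulated probability is

\[ \tilde{P}_i(j) = \frac{1}{S} \sum_{s=1}^{S} Q_i(j \mid \gamma_s) \]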
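The simulated probability can be illustrated with a short DATA step. This is a sketch under stated assumptions, not the procedure's internal algorithm: it uses one individual with two alternatives, a scalar random coefficient drawn from a normal distribution, and plain pseudo-random draws. All numeric values and variable names are hypothetical.

```sas
/* Sketch only: simulated mixed logit probability for one individual   */
/* with two alternatives. All numeric values are hypothetical.         */
data sim_prob;
   call streaminit(20250);               /* reproducible draws          */
   array x{2} _temporary_ (52.9 4.4);    /* attribute with fixed coef.  */
   array z{2} _temporary_ (1 0);         /* attribute with random coef. */
   beta   = -0.05;                       /* assumed fixed coefficient   */
   sigma  = 1;                           /* assumed std. dev. of gamma  */
   ndraws = 1000;                        /* number of draws, S          */
   j      = 1;                           /* alternative of interest     */

   p_sum = 0;
   do s = 1 to ndraws;
      gamma = sigma * rand('NORMAL');    /* gamma_s ~ N(0, sigma**2)    */
      denom = 0;
      do k = 1 to 2;
         denom = denom + exp(x{k}*beta + z{k}*gamma);
      end;
      /* Q_i(j | gamma_s): logit probability conditional on the draw */
      p_sum = p_sum + exp(x{j}*beta + z{j}*gamma) / denom;
   end;

   p_tilde = p_sum / ndraws;             /* average of Q over the draws */
   keep j ndraws p_tilde;
   output;
run;

proc print data=sim_prob noobs;
run;
```

Averaging the conditional logit probabilities over the draws is the sample analog of the integral over the density of the random coefficient. Increasing ndraws reduces the simulation noise in p_tilde.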
y1i
y2i
y1
y2
= 1 d + X 1 + e1
= X 2 + e2
2
0; 1 112
12
1 = J1
1 + 2 1
2 = J2
2 + 1 2
where J1 and J2 are selection matrices for X1 and X2 ,
respectively.
Copley et al. (1994) studied the relationship between the quality of audit services (QUALINDX) and audit fees (LNFEE) using Heckman's two-equation simultaneous equations model with 1 = 0. They found that the audit fee is positively related to the quality of service in the audit supply equation, while there is a negative relationship between the demand for audit quality and audit service fees. However, single-equation modeling does not reveal this relationship because of simultaneous equations bias. Their specification of the simultaneous equations system is the following:
QUALINDX = f1(LNSIZE, FINOFFCL, GOVT, LNFEE)
TENURE; QUALINDX)
Conclusion
This paper gives three examples of how the QLIM procedure can be used to solve real problems. The procedure has many other useful features. For example, you can use PROC QLIM for count data and limited dependent variable modeling. Predicted values and marginal effects are calculated with the OUTPUT statement in the QLIM procedure. You can also use PROC QLIM to fit switching regression models.
References
Amemiya, T. (1978), "The Estimation of a Simultaneous Equation Generalized Probit Model," Econometrica, 46, 1193-1205.
Amemiya, T. (1985), Advanced Econometrics, Cambridge: Harvard University Press.
Ben-Akiva, M. and Lerman, S.R. (1985), Discrete Choice Analysis, Cambridge: MIT Press.
Brownstone, D. and Train, K. (1999), "Forecasting New Product Penetration with Flexible Substitution Patterns," Journal of Econometrics, 89, 109-129.
Copley, P.A., Doucet, M.S., and Gaver, K.M. (1994), "A Simultaneous Equations Analysis of Quality Control Review Outcomes and Engagement Fees for Audits of Recipients of Federal Financial Assistance," The Accounting Review, 69, 244-256.
Deis, D.R. and Hill, R.C. (1998), "An Application of the Bootstrap Method to the Simultaneous Equations Model of the Demand and Supply of Audit Services," Contemporary Accounting Research, 15, 83-99.
Greene, W.H. (2000), Econometric Analysis, 4th ed., Upper Saddle River, N.J.: Prentice Hall.
Heckman, J.J. (1978), "Dummy Endogenous Variables in a Simultaneous Equation System," Econometrica, 46, 931-959.
Lee, L.-F. (1981), "Simultaneous Equations Models with Discrete and Censored Dependent Variables," in Structural Analysis of Discrete Data with Econometric Applications, ed. C.F. Manski and D. McFadden, Cambridge: MIT Press.
Train, K. (1999), "Halton Sequences for Mixed Logit," working paper, University of California, Berkeley.
Contact Information
Minbo Kim, SAS Institute Inc., SAS Campus Drive,
Cary, NC 27513. Email [email protected]
SAS and SAS/ETS are registered trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.