Exercise - 03 - Machine Learning

Exercises for the lecture
Linear Models
Summer Semester 2024

Institute for Statistics
Daniel Ochieng
MZH, Room 7240
E-Mail: [email protected]

Exercise 03
Submission by Thursday, 23rd May 2024, 8:30 a.m., followed by a tutorial at the same time.
Attempt the tasks in groups of at most three individuals.

7. Programming task I.
The Arrests data set from the R package effects (Arrests for Marijuana Possession) concerns
police treatment of individuals arrested in Toronto for simple possession of small quantities
of marijuana. The data are part of a larger data set featured in a series of articles in the
Toronto Star newspaper. Analyze these data using logistic regression to identify whether
released (whether the arrestee was released with a summons; a factor with levels: No; Yes)
is statistically related to the following variables:

• colour: the arrestee's race; a factor with levels: Black; White.
• year: 1997 through 2002; a numeric vector.
• age: in years; a numeric vector.
• sex: a factor with levels: Female; Male.
• employed: a factor with levels: No; Yes.
• citizen: a factor with levels: No; Yes.
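
One possible starting point for this task is sketched below. It assumes the data are loaded from the carData package, where recent releases of effects keep their data sets; with an older effects installation, package = "effects" as stated in the task should work instead. The object name fit_arrests and the choice of Wald and likelihood-ratio tests are illustrative, not prescribed by the exercise.

```r
# Assumption: Arrests is loaded from carData (recent effects releases keep
# their data sets there); use package = "effects" with older installations.
data("Arrests", package = "carData")

# released has levels No/Yes, so glm() models the probability of "Yes".
fit_arrests <- glm(released ~ colour + year + age + sex + employed + citizen,
                   data = Arrests, family = binomial(link = "logit"))

summary(fit_arrests)                # Wald z-tests for each coefficient
drop1(fit_arrests, test = "Chisq")  # likelihood-ratio test for each term

# Odds ratios with Wald 95% confidence intervals
exp(cbind(OR = coef(fit_arrests), confint.default(fit_arrests)))
```

Terms with small p-values in both tables are the candidates for being statistically related to released; the odds ratios give the direction and size of each association.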
8. Programming task II.
Analyze the Arthritis treatment data from the R package vcd with logistic regression to identify
whether Treatment, Sex, and Age are statistically related to the treatment outcome.

• Note: dichotomize the outcome Improved into 1 = Better (i.e., combine Some and
Marked into 1) and 0 = None.
Hint: load the data using the R code: data("Arthritis", package = "vcd").
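
A minimal sketch of the dichotomization and the model fit follows. The column name Better and the object name fit_arthritis are hypothetical names chosen for illustration; only the data set and the variables Improved, Treatment, Sex, and Age come from the task.

```r
data("Arthritis", package = "vcd")

# Dichotomize the outcome: 1 = Better (Some or Marked), 0 = None.
Arthritis$Better <- as.integer(Arthritis$Improved != "None")

# Logistic regression of the binary outcome on the three predictors.
fit_arthritis <- glm(Better ~ Treatment + Sex + Age,
                     data = Arthritis, family = binomial(link = "logit"))

summary(fit_arthritis)                # Wald z-tests for each coefficient
drop1(fit_arthritis, test = "Chisq")  # likelihood-ratio test for each term
```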
9. Poisson regression

(a) Suppose $Y$ takes values $0, 1, 2, \ldots$ with probability density $f(y)$ and mean $\theta$. Calculate
$E(Y \mid Y > 0)$. Hint: Calculate the mean of a truncated Poisson distribution.
(b) Assume now that $Y$ takes the values $0, 1, 2, \ldots$ with hurdle density

$$P[Y = 0] = f_1(0)$$

and

$$P[Y = k] = \frac{1 - f_1(0)}{1 - f_2(0)}\, f_2(k), \quad k = 1, 2, \ldots,$$

where the density $f_2(y)$ has untruncated mean $\theta_2$, that is, $\sum_{k=0}^{\infty} k f_2(k) = \theta_2$. Find $E(Y)$.

(c) Introducing regressors, assume the zeros are given by a logit model and the positives
by a Poisson model, that is,

$$f_1(0) = \frac{1}{1 + \exp(X'\beta_1)},$$

$$f_2(k) = \frac{\exp[-\exp(X'\beta_2)]\,[\exp(X'\beta_2)]^{k}}{k!}, \quad k = 1, 2, \ldots;$$

give an expression for $E[Y \mid X]$.
(d) Hence obtain an expression for $dE[Y \mid X]/dX$ for the hurdle model.
