STA360/601 Midterm Solutions
1. (15 points)
(a) (5 points) Bayes' theorem:
\[
p(\theta|x) = \frac{p(x|\theta)\,p(\theta)}{p(x)}
\qquad\text{OR}\qquad
p(\theta|x) \propto p(x|\theta)\,p(\theta)
\qquad\text{OR}\qquad
P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}
\]
(b) (5 points) You receive xi telephone calls on day i, for i = 1, . . . , n. You wish to
model this as X1 , . . . , Xn i.i.d. from some distribution. Which of the following
distributions would make sense to use? (Circle one.)
ii. Poisson (discrete, on {0, 1, 2, 3, . . .}) (also see “law of small numbers”)
(c) (5 points) Suppose X, X1 , . . . , XN are i.i.d. and assume E|X| < ∞ and V(X) < ∞.
What is the standard deviation of
\[
\frac{1}{N}\sum_{i=1}^{N} X_i \;?
\]
Hint: It is the same as the RMSE of the Monte Carlo approximation. (Circle one.)
i. V(X)/N
ii. V(X)/√N
iii. σ(X)/N
iv. σ(X)/√N   (answer, derived below)
\[
V\Big(\frac{1}{N}\sum_{i=1}^{N} X_i\Big)
= \frac{1}{N^2}\,V\Big(\sum_{i=1}^{N} X_i\Big)
= \frac{1}{N^2}\sum_{i=1}^{N} V(X_i)
= \frac{1}{N}\,V(X),
\]
so
\[
\sigma\Big(\frac{1}{N}\sum_{i=1}^{N} X_i\Big)
= V\Big(\frac{1}{N}\sum_{i=1}^{N} X_i\Big)^{1/2}
= \Big(\frac{1}{N}\,V(X)\Big)^{1/2}
= \frac{1}{\sqrt{N}}\,\sigma(X).
\]
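For intuition, here is a small simulation sketch (not part of the original solutions) that compares the empirical standard deviation of many sample means to σ(X)/√N; the choice of an Exponential(1) distribution for X, and the values of N and the number of replications, are arbitrary assumptions for illustration.

```python
import numpy as np

# Check that sd( (1/N) * sum(X_i) ) is close to sigma(X) / sqrt(N).
# Exponential(1) is an arbitrary choice of distribution, so sigma(X) = 1.
rng = np.random.default_rng(0)
N = 100          # sample size per mean
reps = 20000     # number of simulated sample means

samples = rng.exponential(scale=1.0, size=(reps, N))
sample_means = samples.mean(axis=1)

empirical_sd = sample_means.std()
theoretical_sd = 1.0 / np.sqrt(N)   # sigma(X)/sqrt(N) with sigma(X) = 1

print(f"empirical sd of sample mean: {empirical_sd:.4f}")
print(f"sigma(X)/sqrt(N):            {theoretical_sd:.4f}")
```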
2. (17 points) (Marginal likelihood)
Suppose X1, . . . , Xn ∼ Geometric(θ), i.i.d. given θ. Consider a Beta(a, b) prior on θ. What is the marginal likelihood p(x1:n)?
Writing the Geometric pmf as $p(x|\theta) = \theta(1-\theta)^x$ for $x \in \{0, 1, 2, \ldots\}$, the likelihood is $p(x_{1:n}|\theta) = \theta^n (1-\theta)^{\sum_i x_i}$, so
\[
p(x_{1:n}) = \int_0^1 p(x_{1:n}|\theta)\,p(\theta)\,d\theta
= \int_0^1 \theta^n (1-\theta)^{\sum_i x_i}\,\frac{\theta^{a-1}(1-\theta)^{b-1}}{B(a,b)}\,d\theta
= \frac{B\big(a + n,\ b + \sum_{i=1}^n x_i\big)}{B(a, b)}
\]
if $x_1, \ldots, x_n \in \{0, 1, 2, \ldots\}$, and 0 otherwise.
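As an illustrative check (not part of the original solutions), the following sketch compares the closed-form marginal likelihood with a Monte Carlo average of the likelihood over draws from the Beta prior; the hyperparameters and data below are arbitrary.

```python
import numpy as np
from scipy.special import betaln

# Check p(x_{1:n}) = B(a + n, b + sum(x)) / B(a, b) for the Geometric(theta)
# pmf p(x|theta) = theta * (1 - theta)^x on {0, 1, 2, ...}.
rng = np.random.default_rng(1)
a, b = 2.0, 3.0                 # arbitrary Beta prior hyperparameters
x = np.array([0, 4, 1, 2])      # arbitrary data
n = len(x)

# Closed form (computed on the log scale for stability).
closed_form = np.exp(betaln(a + n, b + x.sum()) - betaln(a, b))

# Monte Carlo approximation of E_prior[ p(x_{1:n} | theta) ].
thetas = rng.beta(a, b, size=200_000)
likelihoods = np.prod(thetas[:, None] * (1 - thetas[:, None]) ** x, axis=1)
monte_carlo = likelihoods.mean()

print(f"closed form:  {closed_form:.6f}")
print(f"Monte Carlo:  {monte_carlo:.6f}")
```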
3. (17 points) (Exponential families, Normal distribution)
Show that the collection of N (µ, σ 2 ) distributions, with µ ∈ R and σ 2 > 0, is a
two-parameter exponential family, and identify the sufficient statistics function t(x) =
(t1 (x), t2 (x))T for your parametrization.
\[
N(x\,|\,\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\Big(-\frac{1}{2\sigma^2}(x - \mu)^2\Big).
\]
Expanding the exponent,
\[
-\frac{1}{2\sigma^2}(x - \mu)^2
= -\frac{1}{2\sigma^2}\big(x^2 - 2x\mu + \mu^2\big)
= -\frac{1}{2\sigma^2}\,x^2 + \frac{\mu}{\sigma^2}\,x - \frac{\mu^2}{2\sigma^2}.
\]
Therefore,
\begin{align*}
N(x\,|\,\mu, \sigma^2)
&= \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\Big(-\frac{1}{2\sigma^2}\,x^2 + \frac{\mu}{\sigma^2}\,x - \frac{\mu^2}{2\sigma^2}\Big) \\
&= \exp\!\Big(-\frac{1}{2\sigma^2}\,x^2 + \frac{\mu}{\sigma^2}\,x - \frac{\mu^2}{2\sigma^2} - \tfrac{1}{2}\log(2\pi\sigma^2)\Big) \\
&= \exp\!\big(\varphi(\theta)^T t(x) - \kappa(\theta)\big)\,h(x),
\end{align*}
where
\[
\theta = \begin{pmatrix} \mu \\ \sigma^2 \end{pmatrix}, \qquad
\varphi(\theta) = \begin{pmatrix} -1/(2\sigma^2) \\ \mu/\sigma^2 \end{pmatrix}, \qquad
t(x) = \begin{pmatrix} x^2 \\ x \end{pmatrix}, \qquad
\kappa(\theta) = \frac{\mu^2}{2\sigma^2} + \tfrac{1}{2}\log(2\pi\sigma^2),
\]
and h(x) = 1. Thus, the sufficient statistics function is $t(x) = (x^2, x)^T$ for this choice of parametrization.
(There is more than one correct answer to this problem, since constants can be moved
between t(x) and ϕ(θ), as well as between h(x) and κ(θ).)
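As a quick sanity check (not part of the original solutions), the sketch below evaluates $\exp(\varphi(\theta)^T t(x) - \kappa(\theta))\,h(x)$ on the log scale and compares it with the Normal log-density; the values of x, μ, σ² are arbitrary.

```python
import numpy as np
from scipy.stats import norm

# Compare the exponential-family form derived above with the Normal log-density.
def expfam_logpdf(x, mu, sigma2):
    phi = np.array([-1.0 / (2.0 * sigma2), mu / sigma2])   # natural parameters
    t = np.array([x**2, x])                                  # sufficient statistics
    kappa = mu**2 / (2.0 * sigma2) + 0.5 * np.log(2.0 * np.pi * sigma2)
    return phi @ t - kappa                                   # log h(x) = 0 since h(x) = 1

x, mu, sigma2 = 1.7, -0.3, 2.5   # arbitrary test values
print(expfam_logpdf(x, mu, sigma2))                    # exponential-family form
print(norm.logpdf(x, loc=mu, scale=np.sqrt(sigma2)))   # direct Normal log-density
```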
4. (17 points) (Conjugate priors)
Suppose X1, . . . , Xn ∼ Uniform(0, θ), i.i.d. given θ, that is,
\[
p(x_i|\theta) = \frac{1}{\theta}\,\mathbb{1}(0 < x_i < \theta).
\]
You would like to find a conjugate prior for θ. Show that the family of Pareto(α, c)
distributions, with α > 0 and c > 0, is a conjugate prior family.
The posterior is
\[
p(\theta|x_{1:n}) \propto p(x_{1:n}|\theta)\,p(\theta)
= \Big(\prod_{i=1}^n \frac{1}{\theta}\,\mathbb{1}(0 < x_i < \theta)\Big)\,
\frac{\alpha c^{\alpha}}{\theta^{\alpha+1}}\,\mathbb{1}(\theta > c)
\propto \frac{1}{\theta^{\alpha + n + 1}}\,\mathbb{1}\big(\theta > \max\{c,\ x_{(n)}\}\big),
\]
where $x_{(n)} = \max_i x_i$ and the Pareto(α, c) density is $p(\theta) = \alpha c^{\alpha}\theta^{-(\alpha+1)}\mathbb{1}(\theta > c)$. This is proportional to a Pareto$\big(\alpha + n,\ \max\{c,\ x_{(n)}\}\big)$ density, so the posterior is again in the Pareto family. Hence the Pareto(α, c) family is conjugate.
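For illustration (not part of the original solutions), the sketch below checks on a grid that the unnormalized posterior, likelihood times Pareto(α, c) prior, is proportional to the Pareto(α + n, max{c, x₍ₙ₎}) density; the data and hyperparameters are arbitrary.

```python
import numpy as np

# Verify conjugacy numerically: the ratio of the unnormalized posterior to the
# Pareto(alpha + n, max(c, max(x))) density should be constant over theta.
alpha, c = 2.0, 1.0                  # arbitrary prior hyperparameters
x = np.array([0.8, 2.3, 1.1, 3.0])   # arbitrary data
n = len(x)
c_post = max(c, x.max())

def pareto_pdf(theta, a, m):
    """Pareto(a, m) density: a * m^a / theta^(a+1) for theta > m."""
    return np.where(theta > m, a * m**a / theta**(a + 1), 0.0)

theta_grid = np.linspace(c_post + 0.01, 10.0, 5)
unnormalized_posterior = theta_grid**(-n) * pareto_pdf(theta_grid, alpha, c)
conjugate_density = pareto_pdf(theta_grid, alpha + n, c_post)

print(unnormalized_posterior / conjugate_density)   # constant across the grid
```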
5. (17 points) (Sampling methods)
Suppose c > 1 and
\[
p(x) \propto \frac{1}{x}\,\mathbb{1}(1 < x < c).
\]
(Note that p(x) is proportional to this, not equal to this.) Assume you can generate
U ∼ Uniform(0, 1). Give an explicit formula, in terms of c and U , for generating a
sample from p(x). You must show your work to receive full credit.
Use the inverse c.d.f. method. First, we need to find the normalization constant. For
any b ∈ [1, c],
\[
\int_1^b \frac{1}{x}\,dx = \log x \,\Big|_1^b = \log b - \log 1 = \log b.
\]
Thus,
\[
p(x) = \frac{1}{x \log c}\,\mathbb{1}(1 < x < c).
\]
For any b ∈ [1, c], the c.d.f. is therefore
\[
F(b) = \int_1^b p(x)\,dx = \int_1^b \frac{1}{x \log c}\,dx = \frac{\log b}{\log c}.
\]
Setting $u = F(x) = \frac{\log x}{\log c}$ and solving for x:
\[
u \log c = \log x
\quad\Longrightarrow\quad
x = \exp(u \log c) = c^u.
\]
Thus, $X = c^U$ with $U \sim \mathrm{Uniform}(0,1)$ is a sample from p(x).
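As an illustrative check (not part of the original solutions), the sketch below draws X = c^U and compares the empirical c.d.f. at a few points with log(b)/log(c); the value of c and the evaluation points are arbitrary.

```python
import numpy as np

# Check that X = c**U with U ~ Uniform(0,1) has c.d.f. F(b) = log(b)/log(c) on [1, c].
rng = np.random.default_rng(2)
c = 5.0                          # arbitrary choice of c > 1

u = rng.uniform(size=100_000)
x = c ** u                       # inverse-c.d.f. samples

for b in [1.5, 2.0, 3.5, 4.9]:
    empirical = np.mean(x <= b)
    exact = np.log(b) / np.log(c)
    print(f"b = {b}:  empirical F(b) = {empirical:.4f},  log(b)/log(c) = {exact:.4f}")
```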
6. (17 points) (Decision theory)
Consider a decision problem in which the state is θ ∈ R, the observation is x, you must
choose an action θ̂ ∈ R, and the loss function is
\[
\ell(\theta, \hat\theta) = a\theta^2 + b\,\theta\hat\theta + c\,\hat\theta^2
\]
for some known a, b, c ∈ R with c > 0. Suppose you have computed the posterior
distribution and it is p(θ|x) = N (θ|M, L−1 ) for some M and L. What is the Bayes
procedure (minimizing posterior expected loss)?
The Bayes procedure is the decision procedure (method of choosing θ̂ based on x) that
minimizes the posterior expected loss,
\[
\rho(\hat\theta, x) = \mathbb{E}\big(\ell(\theta, \hat\theta)\,\big|\,x\big)
= \mathbb{E}\big(a\theta^2 + b\,\theta\hat\theta + c\,\hat\theta^2\,\big|\,x\big)
= a\,\mathbb{E}(\theta^2|x) + b\,\mathbb{E}(\theta|x)\,\hat\theta + c\,\hat\theta^2.
\]
Since c > 0, this is a strictly convex quadratic function of θ̂, so we can set the derivative
equal to zero and solve to find the minimum:
\[
0 = \frac{d}{d\hat\theta}\,\rho(\hat\theta, x) = b\,\mathbb{E}(\theta|x) + 2c\,\hat\theta
\quad\Longrightarrow\quad
\hat\theta = -\,\frac{b\,\mathbb{E}(\theta|x)}{2c}.
\]
Since the posterior is Normal with mean M, we have E(θ|x) = M, hence the Bayes procedure is
\[
\hat\theta = \frac{-bM}{2c}.
\]
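For illustration (not part of the original solutions), the sketch below minimizes the posterior expected loss numerically and compares the minimizer with −bM/(2c); the values of a, b, c, M, and L are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Minimize rho(theta_hat) = a*E(theta^2|x) + b*E(theta|x)*theta_hat + c*theta_hat^2
# numerically and compare with the closed-form Bayes procedure -b*M/(2c).
a, b, c = 1.0, 3.0, 2.0        # arbitrary loss coefficients, c > 0
M, L = 0.7, 4.0                # arbitrary posterior mean and precision
E_theta = M
E_theta2 = M**2 + 1.0 / L      # E(theta^2 | x) = M^2 + posterior variance

def rho(theta_hat):
    return a * E_theta2 + b * E_theta * theta_hat + c * theta_hat**2

numerical = minimize_scalar(rho).x
closed_form = -b * M / (2.0 * c)
print(f"numerical minimizer: {numerical:.6f}")
print(f"-bM/(2c):            {closed_form:.6f}")
```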