HW 2 Sol
Make sure your reasoning and work are clear to receive full credit for each problem.
1. 3 points. A city has two taxi companies distinguished by the color of their taxis: 85% of the
taxis are Yellow and the rest are Blue. A taxi was involved in a hit-and-run accident that
was witnessed by Mr. Green. Unfortunately, Mr. Green is mildly color blind and can only
correctly identify the color 80% of the time. In the trial, Mr. Green testified that the color
of the taxi was blue. Should you trust him?
Solution: This problem can be interpreted as: does Mr. Green's decision minimize the
Bayes risk? Let's assume a uniform cost assignment (UCA) and use x0 and H0 to represent
“Yellow” and x1 and H1 to represent “Blue”. Also let π0 = 0.85 denote the prior probability
of a yellow taxi, π1 = 0.15 denote the prior probability of a blue taxi, and Y = y1 denote
Mr. Green’s observation of a blue taxi. Then from slide 26 in Lecture 2b, a deterministic
Bayes decision rule can be written in terms of the posterior probabilities as
$$\delta_B^\pi(y) = \arg\max_{i \in \{0,1\}} \sum_{x_j \in H_i} \pi_j(y),$$

where the posterior probabilities are

$$\pi_1(y_1) = \frac{p_1(y_1)\,\pi_1}{p(y_1)}, \qquad \pi_0(y_1) = \frac{p_0(y_1)\,\pi_0}{p(y_1)}.$$
Thus we only need to compare κ0 = p0(y1)π0 and κ1 = p1(y1)π1. Since Mr. Green identifies the color correctly 80% of the time, p0(y1) = 0.2 and p1(y1) = 0.8, so κ0 = 0.2 × 0.85 = 0.17 and κ1 = 0.8 × 0.15 = 0.12. Since κ0 > κ1, the Bayes decision is H0 (Yellow): you should not trust Mr. Green's testimony.
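A quick numerical check of this comparison (a minimal sketch using only the numbers given in the problem statement):

pi0 = 0.85;  pi1 = 0.15;      % priors on Yellow (x0) and Blue (x1)
p0_y1 = 0.20;                 % P(Green reports "blue" | taxi is Yellow)
p1_y1 = 0.80;                 % P(Green reports "blue" | taxi is Blue)
kappa0 = p0_y1*pi0            % = 0.17
kappa1 = p1_y1*pi1            % = 0.12 < kappa0, so the Bayes decision is H0 (Yellow)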
2. 8 points total. Consider the coin flipping problem where you have an unknown coin, either
fair (HT) or double headed (HH), and you observe the outcome of n flips of this coin. Assume
a uniform cost assignment. For notational consistency, let the state and hypothesis x0 and
H0 be the case when the coin is HT and x1 and H1 be the case when the coin is HH.
(a) 3 points. Plot the conditional risk vectors (CRVs) of the deterministic decision rules for
the cases of n = 1, 2, 3, 4, . . . coin flips. You might want to write a Matlab script to do
this for n > 2.
Solution: See the Matlab script and plots below.
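For reference, with k ∈ {0, 1, . . . , n} denoting the number of observed heads, the conditional pmfs that the script builds are

$$p_0(k) = \binom{n}{k}\left(\frac{1}{2}\right)^{n}, \qquad p_1(k) = \begin{cases} 1 & \text{if } k = n \\ 0 & \text{if } k < n. \end{cases}$$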
%--------------------------------------------------------------
% ECE531 Spring 2009
% DRB 05-Feb-2009
% Solution to Homework 2 Problem 2 part a
%--------------------------------------------------------------
% USER PARAMETERS BELOW
%--------------------------------------------------------------
ntest = 1:4;            % values of n to test
%--------------------------------------------------------------

N = 2;                  % number of hypotheses
M = 2;                  % number of states
p0_H = 0.5;             % conditional probability of H given x0
p1_H = 1;               % conditional probability of H given x1
C = [0 1 ; 1 0];        % UCA

for n = ntest

    L = n+1;            % number of possible observations
    totD = M^L;         % total number of decision matrices
    B = makebinary(L,1);

    % form conditional probability matrix
    % columns are indexed by state
    % rows are indexed by observation
    P0 = zeros(L,1);
    P1 = zeros(L,1);
    for i = 0:(L-1)
        P0(i+1) = nchoosek(n,i) * p0_H^i * (1-p0_H)^(n-i);
        P1(i+1) = nchoosek(n,i) * p1_H^i * (1-p1_H)^(n-i);
    end
    P = [P0 P1];

    % compute CRVs for all possible deterministic decision matrices
    for i = 0:(totD-1)
        D = [B(:,i+1)' ; 1-B(:,i+1)'];   % decision matrix
        % compute risk vectors
        for j = 0:1
            R(j+1,i+1) = C(:,j+1)'*D*P(:,j+1);
        end
    end

    figure
    plot(R(1,:),R(2,:),'p');
    xlabel('R0')
    ylabel('R1')
    title(['n=' num2str(n)]);
    axis square
    grid on

end
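The script calls a helper makebinary(L,1) that is not reproduced here (presumably provided with the course code). A minimal compatible sketch, saved as makebinary.m, is below; the second argument of the original helper is unknown and is simply ignored in this sketch.

function B = makebinary(L, unused)
% Sketch of the assumed helper: returns an L x 2^L matrix whose columns
% enumerate all binary vectors of length L (column i+1 holds the bits of
% i, least-significant bit first). The second input is ignored here.
B = zeros(L, 2^L);
for i = 0:(2^L - 1)
    B(:, i+1) = bitget(i, 1:L)';
end
end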
[Figures: scatter plots of the CRVs (R1 versus R0) of all deterministic decision rules for n = 1, 2, 3, and 4.]
(b) 2 points. What can you say about the convex hull of the deterministic CRVs as n
increases?
Solution: The plots show that the CRVs of the deterministic decision rules move closer to
the bottom-left corner of the risk plane as n increases. The convex hull of achievable CRVs
therefore fills more of the risk plane as n grows; hence, any arbitrarily small combination of
conditional risks can be achieved for sufficiently large n.
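For example, the deterministic rule that declares HH only when all n flips come up heads has conditional risk vector

$$(R_0, R_1) = \left(2^{-n},\, 0\right),$$

so the lower-left corner of the convex hull approaches the origin as n increases.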
(c) 3 points. When n = 2, find the deterministic decision rule(s) that minimize the Bayes
risk for the prior π0 = 0.6 and π1 = 0.4. Repeat this for the case when n = 3. Does the
additional observation reduce the Bayes risk for this prior?
Solution: Notation: HT = x0 ↔ H0 and HH = x1 ↔ H1. When n = 2, the conditional
probability matrix P (rows indexed by the number of observed heads, columns by state) is

$$P = \begin{bmatrix} 0.25 & 0 \\ 0.5 & 0 \\ 0.25 & 1 \end{bmatrix}.$$

Hence the optimal deterministic decision matrix that minimizes the Bayes risk is

$$D = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

and the resulting Bayes risk is r = π0 (0.25) = 0.15. When n = 3, the conditional probability matrix is

$$P = \begin{bmatrix} 0.125 & 0 \\ 0.375 & 0 \\ 0.375 & 0 \\ 0.125 & 1 \end{bmatrix}.$$

Hence the optimal deterministic decision matrix that minimizes the Bayes risk is

$$D = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and the resulting Bayes risk is r = π0 (0.125) = 0.075. The risk is reduced by flipping the coin one
more time, as you would expect.
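As a check, the minimum Bayes risk can also be found numerically by reusing the matrix R of CRVs produced by the part (a) script (a sketch, assuming that script has just been run for the desired n; the variable names prior, rmin, imin, and Dopt are introduced here for illustration):

prior = [0.6 0.4];                    % [pi0 pi1]
r = prior * R;                        % Bayes risk of every deterministic rule
[rmin, imin] = min(r);                % minimum Bayes risk and index of the best rule
Dopt = [B(:,imin)' ; 1-B(:,imin)'];   % the corresponding decision matrix

This should reproduce r = 0.15 for n = 2 and r = 0.075 for n = 3.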
$$L(y) = \frac{p_1(y)}{p_0(y)} = \frac{3}{2(y+1)}, \qquad 0 \le y \le 1.$$
With uniform costs and equal priors, we can compute the optimum threshold on the likelihood
ratio as τ = 1. From our expression for L(y), an equivalent condition to L(y) ≥ 1 is
y ∈ [0, 0.5]. Thus, for priors π0 = π1 = 0.5, the deterministic Bayes decision rule is given by
$$\delta_B^\pi(y) = \begin{cases} 1 & \text{if } 0 \le y < 0.5 \\ 0/1 & \text{if } y = 0.5 \\ 0 & \text{if } 0.5 < y \le 1. \end{cases}$$
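The threshold on y follows directly from the form of the likelihood ratio:

$$\frac{3}{2(y+1)} \ge 1 \iff 3 \ge 2(y+1) \iff y \le \frac{1}{2}.$$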
Let

$$\tau = \frac{\pi_0}{1-\pi_0}$$

and note that L(y) = τ is equivalent to (y − 1)² = τ′, where

$$\tau' = -2 \ln\!\left( \sqrt{\frac{\pi}{2e}}\; \frac{\pi_0}{1-\pi_0} \right).$$

Hence, we can express the “critical region” of observations where we decide H1 as

$$\Gamma_1 = \left\{\, y \ge 0 : (y-1)^2 \le \tau' \,\right\}.$$
Note that these integrals can be expressed as Q-functions (or erf/erfc functions) but cannot
be evaluated in closed form for arbitrary priors.
5. 3 points. Poor textbook Chapter II, Problem 6 (a).
Solution: Here we have p0(y) = pN(y + s) and p1(y) = pN(y − s), where pN(x) = 1/(π[1 + x^2]) is
the (Cauchy) pdf of the noise N. This gives the likelihood ratio
$$L(y) = \frac{1+(y+s)^2}{1+(y-s)^2}.$$
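Written out with the Cauchy density assumed above, this ratio is

$$L(y) = \frac{p_N(y-s)}{p_N(y+s)} = \frac{1/\bigl(\pi[1+(y-s)^2]\bigr)}{1/\bigl(\pi[1+(y+s)^2]\bigr)} = \frac{1+(y+s)^2}{1+(y-s)^2}.$$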
With equal priors and uniform costs, the “critical region” where we decide H1 is Γ1 = {y : L(y) ≥
1} = {y : 1 + (y + s)^2 ≥ 1 + (y − s)^2} = {y : 2sy ≥ −2sy} = [0, ∞) (for s > 0). Thus, the Bayes decision rule
reduces to
$$\delta_B^\pi(y) = \begin{cases} 1 & \text{if } y \ge 0 \\ 0 & \text{if } y < 0. \end{cases}$$
The minimum Bayes risk is then
$$r(\delta_B^\pi) = \frac{1}{2}\int_0^\infty \frac{1}{\pi\left[1+(y+s)^2\right]}\,dy + \frac{1}{2}\int_{-\infty}^0 \frac{1}{\pi\left[1+(y-s)^2\right]}\,dy = \frac{1}{2} - \frac{\tan^{-1}(s)}{\pi}.$$