EM algorithm for the investigation

Ekaterina Lobacheva
Research Fellow at Samsung-HSE Laboratory
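EM algorithm in one slide

$X$: observed, $Z$: latent, $\theta$: parameters

$$\log p(X \mid \theta) \to \max_{\theta}$$

$$\mathcal{L}(q, \theta) = \mathbb{E}_{q(Z)} \log p(X, Z \mid \theta) - \mathbb{E}_{q(Z)} \log q(Z) \to \max_{q, \theta}$$

Iterations:

E-step: $q(Z) = \arg\max_{q} \mathcal{L}(q, \theta) = \arg\min_{q} \mathrm{KL}\big(q(Z) \,\|\, p(Z \mid X, \theta)\big) = p(Z \mid X, \theta)$

M-step: $\theta = \arg\max_{\theta} \mathcal{L}(q, \theta) = \arg\max_{\theta} \mathbb{E}_{q(Z)} \log p(X, Z \mid \theta)$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}) < \mathrm{tol}$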
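
The E-step identity (maximising the bound over q equals minimising a KL divergence) follows from a standard decomposition of the log-likelihood. The intermediate steps below are filled in as a sketch in the slide's notation; they are not taken from the original slide.

```latex
\begin{align*}
\log p(X \mid \theta)
  &= \mathbb{E}_{q(Z)} \log \frac{p(X, Z \mid \theta)}{p(Z \mid X, \theta)} \\
  &= \mathbb{E}_{q(Z)} \log \frac{p(X, Z \mid \theta)}{q(Z)}
   + \mathbb{E}_{q(Z)} \log \frac{q(Z)}{p(Z \mid X, \theta)} \\
  &= \mathcal{L}(q, \theta) + \mathrm{KL}\big(q(Z) \,\|\, p(Z \mid X, \theta)\big)
\end{align*}
```

The left-hand side does not depend on $q$, so increasing $\mathcal{L}$ over $q$ is the same as decreasing the KL term, which is zero exactly when $q(Z) = p(Z \mid X, \theta)$.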
Story time

One of the school organisers decided to prank us and hid all the games for our Thursday Game Night somewhere. Let's investigate!

We've obtained a set of K photographs of the suspect, but all of them are corrupted by directed electromagnetic noise.

Maybe Bayesian algorithms can help us expose the prankster?
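Data and Notation

$B \in \mathbb{R}^{H \times W}$: clean background image
$F \in \mathbb{R}^{H \times w}$: clean face image
$X_k$: $k$-th image from the dataset
$d_k$: coordinate of the upper-left corner of the face in the $k$-th image

[Figure: image $X_k$ of height $H$ and width $W$; the face $F$ of width $w$ is placed at offset $d_k$ on top of the background $B$, plus noise from $\mathcal{N}(0, s^2)$.]

All images contain the whole face!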
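
To make the notation concrete, here is a minimal sketch of how such a dataset could be simulated. The array layout `(K, H, W)`, zero-based column offsets, the toy sizes and the helper name `generate_images` are all assumptions made for illustration; the slides do not prescribe them.

```python
import numpy as np

def generate_images(F, B, d, s, rng):
    """Paste the face F onto the background B at each column offset in d,
    then add independent N(0, s^2) noise to every pixel."""
    H, W = B.shape
    w = F.shape[1]
    X = np.repeat(B[None, :, :], len(d), axis=0).astype(float)  # (K, H, W)
    for k, dk in enumerate(d):
        X[k, :, dk:dk + w] = F          # face occupies columns dk .. dk + w - 1
    return X + rng.normal(0.0, s, size=X.shape)

rng = np.random.default_rng(0)
H, W, w, K = 4, 10, 3, 5                # toy sizes, chosen only for illustration
B = rng.uniform(0.0, 1.0, size=(H, W))
F = rng.uniform(0.0, 1.0, size=(H, w))
d = rng.integers(0, W - w + 1, size=K)  # valid offsets keep the whole face inside
X = generate_images(F, B, d, s=0.1, rng=rng)
X_clean = generate_images(F, B, d, s=0.0, rng=rng)  # same images without noise
```

Drawing `d` from `0 .. W - w` is what enforces the "whole face" condition.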


Probabilistic Model

Observed: $X = \{X_1, \dots, X_K\}$
Latent: $d = \{d_1, \dots, d_K\}$
Parameters: $\theta = \{B, F, s^2\}$

[Figure: image $X_k$ with the face $F$ at offset $d_k$ over the background $B$, plus noise from $\mathcal{N}(0, s^2)$.]


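Probabilistic Model

Generation of one image:

$$p(X_k \mid d_k, \theta) = \prod_{i,j} \begin{cases} \mathcal{N}\big(X_k[i,j] \mid F[i, j - d_k], s^2\big), & \text{if } [i,j] \in \mathrm{faceArea}(d_k) \\ \mathcal{N}\big(X_k[i,j] \mid B[i,j], s^2\big), & \text{otherwise} \end{cases}$$

Prior on face positions:

$$p(d_k \mid a) = a[d_k], \qquad \sum_j a[j] = 1, \qquad a \in \mathbb{R}^{W - w + 1}$$

Joint probabilistic model:

$$p(X, d \mid \theta, a) = \prod_k p(X_k \mid d_k, \theta)\, p(d_k \mid a)$$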
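
The per-image likelihood can be evaluated for every candidate offset at once. The sketch below is one possible way to do it, assuming images are stored as a `(K, H, W)` array and offsets are zero-based; the function name `log_likelihood` is hypothetical. It reuses the background residual and only swaps in the face residual inside the window.

```python
import numpy as np

def log_likelihood(X, F, B, s):
    """Return ll[k, d] = log p(X_k | d_k = d, theta) for every image and offset."""
    K, H, W = X.shape
    w = F.shape[1]
    D = W - w + 1
    const = -H * W * np.log(np.sqrt(2 * np.pi) * s)  # one log-normaliser per pixel
    sq_bg = (X - B) ** 2                             # residual if a pixel is background
    total_bg = sq_bg.sum(axis=(1, 2))                # shape (K,)
    ll = np.empty((K, D))
    for d in range(D):
        sq_face = ((X[:, :, d:d + w] - F) ** 2).sum(axis=(1, 2))
        sq_bg_in_window = sq_bg[:, :, d:d + w].sum(axis=(1, 2))
        # background error outside the window plus face error inside it
        ll[:, d] = const - (total_bg - sq_bg_in_window + sq_face) / (2 * s ** 2)
    return ll

rng = np.random.default_rng(1)
X = rng.normal(size=(2, 3, 6))
F = rng.normal(size=(3, 2))
B = rng.normal(size=(3, 6))
ll = log_likelihood(X, F, B, s=0.7)

# Direct evaluation for image 1 and offset 3, as a cross-check
mean = B.copy()
mean[:, 3:5] = F
direct = np.sum(-np.log(np.sqrt(2 * np.pi) * 0.7) - (X[1] - mean) ** 2 / (2 * 0.7 ** 2))
```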
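Task overview

$X$: observed, $d$: latent, $\theta, a$: parameters

$$\log p(X \mid \theta, a) \to \max_{\theta, a}$$

$$\mathcal{L}(q, \theta, a) = \mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) - \mathbb{E}_{q(d)} \log q(d) \to \max_{q, \theta, a}$$

Iterations:

E-step: $q(d) = p(d \mid X, \theta, a)$

M-step: $\mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) \to \max_{\theta, a}$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \mathrm{tol}$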
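E-step

$$q(d) = p(d \mid X, \theta, a) = \prod_k p(d_k \mid X_k, \theta, a)$$

$$= \prod_k \frac{p(X_k, d_k \mid \theta, a)}{\sum_{d'_k} p(X_k, d'_k \mid \theta, a)}$$

$$= \prod_k \frac{p(X_k \mid d_k, \theta)\, p(d_k \mid a)}{\sum_{d'_k} p(X_k \mid d'_k, \theta)\, p(d'_k \mid a)}$$

$p(X_k \mid d_k, \theta)$ and $p(d_k \mid a)$ are known from the probabilistic model.

In practice, for each object $k$:

$$q(d_k) \propto p(X_k \mid d_k, \theta)\, p(d_k \mid a), \qquad q(d_k) \in \mathbb{R}^{W - w + 1}, \qquad \sum_j q(d_k = j) = 1$$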
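
Because the likelihood is a product over all pixels, it is usually far too small to exponentiate directly. A minimal sketch of the normalisation in the log domain is given below. It assumes a precomputed matrix `ll[k, d] = log p(X_k | d_k = d, theta)`; the function name `e_step` and the toy inputs are illustrative only.

```python
import numpy as np

def e_step(ll, a):
    """q[k, d] proportional to exp(ll[k, d]) * a[d], normalised over d for each k."""
    with np.errstate(divide="ignore"):        # log(0) = -inf is acceptable here
        log_post = ll + np.log(a)[None, :]
    # Subtracting the row maximum leaves q unchanged but avoids underflow.
    log_post = log_post - log_post.max(axis=1, keepdims=True)
    q = np.exp(log_post)
    return q / q.sum(axis=1, keepdims=True)

ll = np.array([[0.0, np.log(3.0)],
               [-1000.0, -1001.0]])           # second row would underflow naively
a = np.array([0.5, 0.5])
q = e_step(ll, a)
```

The sketch assumes every row has at least one offset with non-zero prior mass; otherwise the row maximum is `-inf` and the result is undefined.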
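M-step: function

$$Q(\theta, a) = \mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) \to \max_{\theta, a}$$

Let's first simplify $Q$ and rewrite it as a function of individual parameters:

$$\mathbb{E}_{q(d)} \to \mathbb{E}_{q(d_k)}, \qquad \theta, a \to F, B, s^2, a$$

$$Q(\theta, a) = \mathbb{E}_{q(d)} \log \prod_k p(X_k \mid d_k, \theta)\, p(d_k \mid a) = \mathbb{E}_{q(d)} \sum_k \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big)$$

$$= \sum_k \mathbb{E}_{q(d)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big) = \sum_k \mathbb{E}_{q(d_k)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big)$$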
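M-step: function

$$Q(\theta, a) = \mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) \to \max_{\theta, a}$$

Let's first simplify $Q$ and rewrite it as a function of individual parameters:

$$\mathbb{E}_{q(d)} \to \mathbb{E}_{q(d_k)}, \qquad \theta, a \to F, B, s^2, a$$

$$Q(\theta, a) = \sum_k \mathbb{E}_{q(d_k)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big)$$

$$\log p(X_k \mid d_k, \theta) = \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big)$$

$$I^F_{ijd_k} = I\big[[i,j] \in \mathrm{faceArea}(d_k)\big], \qquad I^B_{ijd_k} = I\big[[i,j] \notin \mathrm{faceArea}(d_k)\big] = 1 - I^F_{ijd_k}$$

$$\log p(d_k \mid a) = \log a[d_k]$$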


M-step: maximisation w.r.t. a

$$\begin{cases} Q(F, B, s^2, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log a[d_k] + \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \Big] \to \max_{a} \\ \sum_j a[j] = 1 \end{cases}$$
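M-step: maximisation w.r.t. a

$$\begin{cases} Q(a) = \sum_k \mathbb{E}_{q(d_k)} \log a[d_k] \to \max_{a} \\ \sum_j a[j] = 1 \end{cases}$$

The Lagrangian has the form:

$$L(a, \lambda) = Q(a) - \lambda \Big( \sum_j a[j] - 1 \Big)$$

$$0 = \frac{\partial L(a, \lambda)}{\partial a[j]} = \sum_k \frac{q(d_k = j)}{a[j]} - \lambda \;\Rightarrow\; a[j] = \frac{\sum_k q(d_k = j)}{\lambda}$$

$$0 = \frac{\partial L(a, \lambda)}{\partial \lambda} = 1 - \sum_j a[j] \;\Rightarrow\; \lambda = \sum_{j,k} q(d_k = j)$$

In practice: $a[j] \propto \sum_k q(d_k = j)$, $\sum_j a[j] = 1$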
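
In code the update for a is a column sum followed by normalisation. A minimal sketch, assuming the posteriors are stored as a `(K, W - w + 1)` array `q` whose rows sum to one (the name `update_a` is hypothetical):

```python
import numpy as np

def update_a(q):
    """a[j] proportional to sum_k q(d_k = j), normalised to sum to one."""
    a = q.sum(axis=0)
    return a / a.sum()

q = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
a = update_a(q)
```

Since each row of `q` sums to one, the normaliser equals the number of images K, which matches the value of the Lagrange multiplier above.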
M-step: maximisation w.r.t. F

$$Q(F, B, s^2, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log a[d_k] + \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \Big] \to \max_{F}$$
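M-step: maximisation w.r.t. F

$$Q(F) = \sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \Big( -\frac{1}{2s^2} (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big)$$

$$= \sum_k \mathbb{E}_{q(d_k)} \sum_{i=0}^{H-1} \sum_{m=0}^{w-1} \Big( -\frac{1}{2s^2} (X_k[i, m + d_k] - F[i, m])^2 \Big) \to \max_{F}$$

$$0 = \frac{\partial Q(F)}{\partial F[i, m]} = \sum_{k, d_k} \frac{q(d_k)}{s^2} \Big( X_k[i, m + d_k] - F[i, m] \Big)$$

$$F[i, m] = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{\sum_{k, d_k} q(d_k)} = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{K}$$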
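
The face update is a posterior-weighted average of image windows. The sketch below assumes a `(K, H, W)` image array, a `(K, W - w + 1)` posterior array and zero-based offsets; `update_F` is a hypothetical name. The toy check uses noise-free images and one-hot posteriors, so the true face should be recovered.

```python
import numpy as np

def update_F(X, q, w):
    """F[i, m] = (1 / K) * sum_{k, d} q[k, d] * X[k, i, m + d]."""
    K, H, W = X.shape
    F = np.zeros((H, w))
    for d in range(q.shape[1]):
        # weight the window of every image at offset d by q(d_k = d)
        F += np.einsum("k,khw->hw", q[:, d], X[:, :, d:d + w])
    return F / K

B = np.zeros((3, 7))
F_true = np.arange(6, dtype=float).reshape(3, 2) + 1.0
d = [0, 5, 2]
X = np.repeat(B[None], 3, axis=0)
for k, dk in enumerate(d):
    X[k, :, dk:dk + 2] = F_true
q = np.zeros((3, 6))
q[np.arange(3), d] = 1.0          # posteriors concentrated on the true offsets
F_hat = update_F(X, q, w=2)
```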
M-step: maximisation w.r.t. B

$$Q(F, B, s^2, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log a[d_k] + \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \Big] \to \max_{B}$$
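M-step: maximisation w.r.t. B

$$Q(B) = \sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \Big( -\frac{1}{2s^2} (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} \Big) \to \max_{B}$$

$$0 = \frac{\partial Q(B)}{\partial B[i,j]} = \sum_{k, d_k} \frac{q(d_k)\, I^B_{ijd_k}}{s^2} \Big( X_k[i,j] - B[i,j] \Big)$$

$$B[i,j] = \frac{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}\, X_k[i,j]}{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}}$$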
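
For the background, each pixel is averaged over the images in proportion to the posterior probability that the face does not cover it. A minimal sketch under the same layout assumptions (`(K, H, W)` images, zero-based offsets, hypothetical name `update_B`) follows. The small floor on the denominator is an added safeguard for columns the face covers in every image; it is not part of the slides.

```python
import numpy as np

def update_B(X, q, w):
    """B[i, j] = sum_{k,d} q[k,d] I^B X[k,i,j] / sum_{k,d} q[k,d] I^B."""
    K, H, W = X.shape
    # bg[k, j] = sum_d q[k, d] * I[column j lies outside the face at offset d]
    #          = 1 - sum_d q[k, d] * I[column j lies inside the face at offset d]
    bg = np.ones((K, W))
    for d in range(q.shape[1]):
        bg[:, d:d + w] -= q[:, [d]]
    bg = np.clip(bg, 0.0, 1.0)                 # remove rounding noise
    num = np.einsum("kj,kij->ij", bg, X)       # (H, W)
    den = np.maximum(bg.sum(axis=0), 1e-12)    # (W,), assumed floor against 0 / 0
    return num / den

rng = np.random.default_rng(0)
B_true = rng.normal(size=(3, 7))
F = rng.normal(size=(3, 2))
d = [0, 5, 2]
X = np.repeat(B_true[None], 3, axis=0)
for k, dk in enumerate(d):
    X[k, :, dk:dk + 2] = F
q = np.zeros((3, 6))
q[np.arange(3), d] = 1.0
B_hat = update_B(X, q, w=2)
```

The weight depends only on the column because the face spans the full image height in this model.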
M-step: maximisation w.r.t. $s^2$

$$Q(F, B, s^2, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log a[d_k] + \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \Big] \to \max_{s^2}$$
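M-step: maximisation w.r.t. $s^2$

$$Q(s^2) = \sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \Big( -\frac{1}{2} \log(s^2) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \to \max_{s^2}$$

$$0 = \frac{\partial Q(s^2)}{\partial s^2} = \sum_{k, d_k, i, j} q(d_k) \Big( -\frac{1}{2s^2} + \frac{1}{2s^4} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big)$$

$$s^2 = \frac{1}{\sum_{k, d_k, i, j} q(d_k)} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big]$$

$$= \frac{1}{KWH} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big]$$

Here $K$ is the number of images, so $\sum_{k, d_k, i, j} q(d_k) = KWH$.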
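
The variance update is the posterior-weighted mean squared residual per pixel. A minimal sketch, again assuming a `(K, H, W)` image array and zero-based offsets (the name `update_s2` is hypothetical); in a full M-step it would be called with the freshly updated F and B:

```python
import numpy as np

def update_s2(X, q, F, B):
    """s^2 = (1 / (K H W)) * sum_{k, d} q[k, d] * squared residual of image k at offset d."""
    K, H, W = X.shape
    w = F.shape[1]
    sq_bg = (X - B) ** 2
    total_bg = sq_bg.sum(axis=(1, 2))
    total = 0.0
    for d in range(q.shape[1]):
        sq_face = ((X[:, :, d:d + w] - F) ** 2).sum(axis=(1, 2))
        sq_bg_in_window = sq_bg[:, :, d:d + w].sum(axis=(1, 2))
        total += np.sum(q[:, d] * (total_bg - sq_bg_in_window + sq_face))
    return total / (K * H * W)

# Toy check: every pixel has squared residual 1 under either F or B
X = np.ones((1, 1, 3))
B = np.zeros((1, 3))
F = np.zeros((1, 1))
q = np.full((1, 3), 1.0 / 3.0)
s2 = update_s2(X, q, F, B)
```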
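Stopping criterion

$$\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \mathrm{tol}$$

$$\mathcal{L}(q, \theta, a) = \mathbb{E}_{q(d)} \Big[ \log p(X, d \mid \theta, a) - \log q(d) \Big]$$

$$= \mathbb{E}_{q(d)} \Big[ \log \prod_k p(X_k \mid d_k, \theta)\, p(d_k \mid a) - \log \prod_k q(d_k) \Big]$$

$$= \mathbb{E}_{q(d)} \sum_k \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]$$

$$= \sum_k \mathbb{E}_{q(d)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]$$

$$= \sum_k \mathbb{E}_{q(d_k)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]$$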
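
The final expression is a sum over images and offsets that can be evaluated directly. The sketch below assumes a precomputed matrix `ll[k, d] = log p(X_k | d_k = d, theta)`; the name `lower_bound` is hypothetical. Terms with `q = 0` are treated as zero, using the convention 0 log 0 = 0.

```python
import numpy as np

def lower_bound(ll, a, q):
    """L(q, theta, a) = sum_k sum_d q[k, d] * (ll[k, d] + log a[d] - log q[k, d])."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = q * (ll + np.log(a)[None, :] - np.log(q))
    return np.sum(np.where(q > 0, terms, 0.0))

# Toy check: when q is the exact posterior, the bound equals log p(X | theta, a)
ll = np.log(np.array([[0.2, 0.6]]))
a = np.array([0.5, 0.5])
q = np.array([[0.25, 0.75]])          # proportional to [0.2 * 0.5, 0.6 * 0.5]
bound = lower_bound(ll, a, q)
log_evidence = np.log(0.2 * 0.5 + 0.6 * 0.5)
```

The toy check reflects the general fact that the bound is tight right after an E-step.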
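Task overview

E-step: $q(d_k) \propto p(X_k \mid d_k, \theta)\, p(d_k \mid a)$, $\sum_j q(d_k = j) = 1$

M-step: $a[j] \propto \sum_k q(d_k = j)$, $\sum_j a[j] = 1$

$$F[i, m] = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{K}, \qquad B[i,j] = \frac{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}\, X_k[i,j]}{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}}$$

$$s^2 = \frac{1}{KWH} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big]$$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \mathrm{tol}$

$$\mathcal{L}(q, \theta, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]$$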
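
Putting the updates together, one possible end-to-end loop looks as follows. This is a sketch, not the seminar's reference implementation: the `(K, H, W)` layout, the initialisation (mean image for B, random F, data variance for s², uniform a), the numerical floors and the names `residuals` and `run_em` are all assumptions. The bound is evaluated right after the E-step, where it equals the marginal log-likelihood.

```python
import numpy as np

def residuals(X, F, B):
    """r[k, d] = summed squared error of image k when the face sits at offset d."""
    K, H, W = X.shape
    w = F.shape[1]
    sq_bg = (X - B) ** 2
    total_bg = sq_bg.sum(axis=(1, 2))
    r = np.empty((K, W - w + 1))
    for d in range(W - w + 1):
        face = ((X[:, :, d:d + w] - F) ** 2).sum(axis=(1, 2))
        r[:, d] = total_bg - sq_bg[:, :, d:d + w].sum(axis=(1, 2)) + face
    return r

def run_em(X, w, n_iter=100, tol=1e-6, seed=0):
    K, H, W = X.shape
    D = W - w + 1
    rng = np.random.default_rng(seed)
    # Initialisation: an assumption of this sketch, not specified in the slides.
    B = X.mean(axis=0)
    F = rng.uniform(X.min(), X.max(), size=(H, w))
    s2 = X.var()
    a = np.full(D, 1.0 / D)
    bounds = []
    for _ in range(n_iter):
        # E-step: q(d_k) proportional to p(X_k | d_k, theta) p(d_k | a), in logs
        ll = -0.5 * H * W * np.log(2 * np.pi * s2) - residuals(X, F, B) / (2 * s2)
        log_joint = ll + np.log(a + 1e-300)          # floor avoids log(0)
        m = log_joint.max(axis=1, keepdims=True)
        log_evidence = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
        q = np.exp(log_joint - log_evidence[:, None])
        # With q equal to the posterior, the bound is sum_k log p(X_k | theta, a)
        bounds.append(log_evidence.sum())
        if len(bounds) > 1 and bounds[-1] - bounds[-2] < tol:
            break
        # M-step: closed-form updates for a, F, B, then s^2 with the new F and B
        a = q.sum(axis=0) / K
        F = sum(np.einsum("k,khw->hw", q[:, d], X[:, :, d:d + w]) for d in range(D)) / K
        bg = np.ones((K, W))
        for d in range(D):
            bg[:, d:d + w] -= q[:, [d]]
        bg = np.clip(bg, 0.0, 1.0)
        B = np.einsum("kj,kij->ij", bg, X) / np.maximum(bg.sum(axis=0), 1e-12)
        s2 = np.sum(q * residuals(X, F, B)) / (K * H * W)
    return F, B, s2, a, q, bounds

# Toy data, generated only to exercise the loop
rng = np.random.default_rng(0)
H, W, w, K = 4, 12, 3, 30
B_true, F_true = rng.uniform(size=(H, W)), rng.uniform(size=(H, w))
d_true = rng.integers(0, W - w + 1, size=K)
X = np.repeat(B_true[None], K, axis=0)
for k, dk in enumerate(d_true):
    X[k, :, dk:dk + w] = F_true
X = X + rng.normal(0.0, 0.1, size=X.shape)
F, B, s2, a, q, bounds = run_em(X, w)
```

EM only guarantees that the bound does not decrease; it can still stop in a local optimum, so several random restarts (different `seed` values) may be worth trying.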
