Seminar em
Ekaterina Lobacheva
Research Fellow at Samsung-HSE Laboratory
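EM algorithm in one slide
X: observed, Z: latent, $\theta$: parameters

$$\log p(X \mid \theta) \to \max_{\theta}$$

$$\mathcal{L}(q, \theta) = \mathbb{E}_{q(Z)} \log p(X, Z \mid \theta) - \mathbb{E}_{q(Z)} \log q(Z) \to \max_{q, \theta}$$

Iterations:
E-step: $q^{(t+1)}(Z) = p(Z \mid X, \theta^{(t)})$
M-step: $\theta^{(t+1)} = \arg\max_{\theta} \mathbb{E}_{q^{(t+1)}(Z)} \log p(X, Z \mid \theta)$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}) < \text{tol}$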
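The iteration scheme and stopping criterion above can be sketched as a generic loop. This is only a minimal illustration: `run_em`, `e_step`, `m_step` and `lower_bound` are hypothetical names, not something defined in the slides, and the three callables are assumed to be supplied by the caller for a concrete model.

```python
def run_em(e_step, m_step, lower_bound, theta, tol=1e-6, max_iter=100):
    """Alternate E- and M-steps until the lower bound L stops improving.

    e_step, m_step and lower_bound are hypothetical callables supplied by
    the caller; they are not defined in the slides.
    """
    history = []  # values of L(q, theta) after each iteration
    for _ in range(max_iter):
        q = e_step(theta)      # E-step: q(Z) = p(Z | X, theta)
        theta = m_step(q)      # M-step: argmax_theta E_q log p(X, Z | theta)
        history.append(lower_bound(q, theta))
        # Stopping criterion: L(t+1) - L(t) < tol
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
    return theta, history
```

The `max_iter` cap is an extra safeguard added here; the slides only state the tolerance-based criterion.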
Story time
One of the school organisers decided to prank us and hid all games for our Thursday Game Night somewhere. Let's investigate!
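Data and Notation
$X_k$: the $k$-th image from the dataset
$B \in \mathbb{R}^{H \times W}$: clean background image
$F \in \mathbb{R}^{H \times w}$: clean face image
$d_k$: coordinate of the upper-left corner of the face on the $k$-th image
[Figure: an image $X_k$ of height $H$ and width $W$; the face $F$ of width $w$ is placed over the background $B$ at horizontal offset $d_k$.]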
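Probabilistic Model
Generation of one image:

$$p(X_k \mid d_k, \theta) = \prod_{ij} \begin{cases} \mathcal{N}(X_k[i,j] \mid F[i, j - d_k], s^2), & \text{if } [i,j] \in \text{faceArea}(d_k) \\ \mathcal{N}(X_k[i,j] \mid B[i,j], s^2), & \text{otherwise} \end{cases}$$

X: observed, d: latent, $\theta = \{F, B, s^2\}$ and $a$: parameters, with prior $p(d_k \mid a) = a[d_k]$

$$\mathcal{L}(q, \theta, a) = \mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) - \mathbb{E}_{q(d)} \log q(d) \to \max_{q, \theta, a}$$

Iterations:
E-step: $q(d) = p(d \mid X, \theta, a)$
M-step: $\mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) \to \max_{\theta, a}$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \text{tol}$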
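To make the generative story concrete, here is a small sketch that samples one image from the model above. It assumes NumPy, a face that spans the full image height, and a prior vector `a` over the `W - w + 1` possible offsets; the function name `sample_image` is hypothetical.

```python
import numpy as np

def sample_image(F, B, s, a, rng):
    """Draw one image X_k and its offset d_k from the model (sketch)."""
    w = F.shape[1]
    d = rng.choice(len(a), p=a)            # d_k ~ p(d_k | a) = a[d_k]
    mean = np.array(B, dtype=float)        # background everywhere ...
    mean[:, d:d + w] = F                   # ... except where the face sits
    X = mean + s * rng.standard_normal(mean.shape)  # pixelwise N(mean, s^2)
    return X, d
```

With `rng = np.random.default_rng(0)` and `s = 0` the sample is just the noiseless composite of `F` over `B`, which is a convenient sanity check.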
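E-step

$$\begin{aligned}
q(d) = p(d \mid X, \theta, a) &= \prod_k p(d_k \mid X_k, \theta, a) \\
&= \prod_k \frac{p(X_k, d_k \mid \theta, a)}{\sum_{d'_k} p(X_k, d'_k \mid \theta, a)} \\
&= \prod_k \frac{p(X_k \mid d_k, \theta)\, p(d_k \mid a)}{\sum_{d'_k} p(X_k \mid d'_k, \theta)\, p(d'_k \mid a)}
\end{aligned}$$

$$\begin{aligned}
Q(\theta, a) &= \mathbb{E}_{q(d)} \log \prod_k p(X_k \mid d_k, \theta)\, p(d_k \mid a) \\
&= \mathbb{E}_{q(d)} \sum_k \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big) \\
&= \sum_k \mathbb{E}_{q(d)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big) \\
&= \sum_k \mathbb{E}_{q(d_k)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big)
\end{aligned}$$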
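The posterior in the E-step is a per-image normalisation over offsets, which might look as follows in NumPy. The function takes a precomputed matrix `ll[k, d] = log p(X_k | d_k = d, theta)`; the name `e_step` and the max-subtraction trick for numerical stability are assumptions of this sketch, not part of the slides, and `a` is assumed strictly positive.

```python
import numpy as np

def e_step(ll, a):
    """q[k, d] = p(d_k = d | X_k, theta, a), computed in the log domain.

    ll: array (K, D) with log p(X_k | d_k = d, theta)
    a:  array (D,) with the prior p(d_k = d | a), assumed > 0
    """
    log_post = ll + np.log(a)                        # log p(X_k | d_k) + log p(d_k | a)
    log_post -= log_post.max(axis=1, keepdims=True)  # stabilise before exp
    q = np.exp(log_post)
    q /= q.sum(axis=1, keepdims=True)                # normalise over d'_k
    return q
```

Working with logarithms matters here because the log-likelihood of a whole image is a sum over all pixels and can easily be in the thousands in magnitude.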
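M-step: function

$$Q(\theta, a) = \mathbb{E}_{q(d)} \log p(X, d \mid \theta, a) \to \max_{\theta, a}$$

$$Q(\theta, a) = \sum_k \mathbb{E}_{q(d_k)} \Big( \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) \Big)$$

$$\log p(X_k \mid d_k, \theta) = \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big)$$

$$I^F_{ijd_k} = I\big\{[i,j] \in \text{faceArea}(d_k)\big\}, \qquad I^B_{ijd_k} = I\big\{[i,j] \notin \text{faceArea}(d_k)\big\} = 1 - I^F_{ijd_k}$$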
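The expression for $\log p(X_k \mid d_k, \theta)$ can be evaluated for every image and every offset at once. The sketch below assumes NumPy, images stacked as an array `X` of shape `(K, H, W)`, and offsets `d` ranging over `0 .. W - w`; `log_likelihood` is a hypothetical helper name.

```python
import numpy as np

def log_likelihood(X, F, B, s):
    """Return ll[k, d] = log p(X_k | d_k = d, theta) for all k and d."""
    K, H, W = X.shape
    w = F.shape[1]
    D = W - w + 1                                  # number of possible offsets
    const = -H * W * np.log(np.sqrt(2 * np.pi) * s)
    ll = np.empty((K, D))
    for d in range(D):
        mean = np.array(B, dtype=float)            # I^B pixels use B[i, j]
        mean[:, d:d + w] = F                       # I^F pixels use F[i, j - d]
        sq = ((X - mean) ** 2).sum(axis=(1, 2))    # squared error per image
        ll[:, d] = const - sq / (2 * s ** 2)
    return ll
```

Building the composite mean image for each offset is equivalent to splitting the sum with the indicators $I^B$ and $I^F$, since every pixel belongs to exactly one of the two areas.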
M-step: maximisation w.r.t. F

$$Q(F, B, s^2, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log a[d_k] + \sum_{i,j} \Big( -\log(\sqrt{2\pi}\, s) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \Big] \to \max_F$$
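Only the face term depends on $F$:

$$\begin{aligned}
Q(F) &= -\sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \frac{1}{2s^2} (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \\
&= -\sum_k \mathbb{E}_{q(d_k)} \sum_{i=0}^{H-1} \sum_{m=0}^{w-1} \frac{1}{2s^2} (X_k[i, m + d_k] - F[i,m])^2 \to \max_F
\end{aligned}$$

Setting the derivative to zero:

$$0 = \frac{\partial Q(F)}{\partial F[i,m]} = \sum_{k, d_k} \frac{q(d_k)}{s^2} \Big( X_k[i, m + d_k] - F[i,m] \Big)$$

$$F[i,m] = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{\sum_{k, d_k} q(d_k)} = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{K}$$

Here $K$ is the number of images; the denominator equals $K$ because $\sum_{d_k} q(d_k) = 1$ for every $k$.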
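The closed-form update for $F$ is a posterior-weighted average of image windows. A possible NumPy sketch, assuming `X` has shape `(K, H, W)` and `q` has shape `(K, D)` with rows summing to one (the name `update_F` is hypothetical):

```python
import numpy as np

def update_F(X, q, w):
    """F[i, m] = sum_{k,d} q[k, d] * X[k, i, m + d] / K."""
    K, H, W = X.shape
    D = q.shape[1]
    F = np.zeros((H, w))
    for d in range(D):
        # Weighted sum over images of the window starting at column d
        F += np.tensordot(q[:, d], X[:, :, d:d + w], axes=1)
    return F / K
```

For a hard assignment (each row of `q` one-hot) this reduces to averaging the crops taken at the assigned offsets.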
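M-step: maximisation w.r.t. B

$$Q(B) = -\sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \frac{1}{2s^2} (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} \to \max_B$$

$$0 = \frac{\partial Q(B)}{\partial B[i,j]} = \sum_{k, d_k} \frac{q(d_k)\, I^B_{ijd_k}}{s^2} \Big( X_k[i,j] - B[i,j] \Big)$$

$$B[i,j] = \frac{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}\, X_k[i,j]}{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}}$$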
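The update for $B$ weights each pixel by the posterior probability that it is not covered by the face. The sketch below assumes NumPy and a face spanning the full image height, so the weight depends only on the image index and the column. The name `update_B` and the `eps` guard against columns that are always covered are assumptions added here.

```python
import numpy as np

def update_B(X, q, w, eps=1e-12):
    """B[i, j] = sum_{k,d} q I^B X / sum_{k,d} q I^B (sketch)."""
    K, H, W = X.shape
    D = q.shape[1]
    # bg[k, j] = sum_d q[k, d] * I^B_{ijd} = 1 - P(face covers column j)
    bg = np.ones((K, W))
    for d in range(D):
        bg[:, d:d + w] -= q[:, d:d + 1]
    np.clip(bg, 0.0, 1.0, out=bg)            # remove tiny round-off excursions
    num = np.einsum('kj,kij->ij', bg, X)     # weighted sum of pixels
    den = bg.sum(axis=0)                     # total background weight per column
    return num / np.maximum(den, eps)
```

If some column is covered by the face in every image under every plausible offset, the data carry no information about the background there; the guard then simply returns a value close to zero for that column.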
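M-step: maximisation w.r.t. $s^2$

$$Q(s^2) = \sum_k \mathbb{E}_{q(d_k)} \sum_{i,j} \Big( -\frac{1}{2} \log(s^2) - \frac{1}{2s^2} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big) \to \max_{s^2}$$

$$0 = \frac{\partial Q(s^2)}{\partial s^2} = \sum_{k, d_k, i, j} q(d_k) \Big( -\frac{1}{2s^2} + \frac{1}{2s^4} \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \Big)$$

$$\begin{aligned}
s^2 &= \frac{1}{\sum_{k, d_k, i, j} q(d_k)} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big] \\
&= \frac{1}{KWH} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big]
\end{aligned}$$

Here $K$ is the number of images, so $\sum_{k, d_k, i, j} q(d_k) = KWH$.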
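The variance update is the posterior-weighted mean squared residual over all pixels of all images. A sketch in NumPy, under the same shape assumptions as before (`X` is `(K, H, W)`, `q` is `(K, D)`); `update_s2` is a hypothetical name, and `F` and `B` are assumed to be the already updated values.

```python
import numpy as np

def update_s2(X, q, F, B):
    """s^2 = (1 / (K * W * H)) * sum_{k,d} q[k, d] * squared error at offset d."""
    K, H, W = X.shape
    w = F.shape[1]
    total = 0.0
    for d in range(q.shape[1]):
        mean = np.array(B, dtype=float)           # background pixels
        mean[:, d:d + w] = F                      # face pixels at offset d
        sq = ((X - mean) ** 2).sum(axis=(1, 2))   # squared error per image
        total += q[:, d] @ sq                     # weight by q(d_k = d)
    return total / (K * H * W)
```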
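Stopping criterion

$$\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \text{tol}$$

$$\begin{aligned}
\mathcal{L}(q, \theta, a) &= \mathbb{E}_{q(d)} \Big[ \log p(X, d \mid \theta, a) - \log q(d) \Big] \\
&= \mathbb{E}_{q(d)} \Big[ \log \prod_k p(X_k \mid d_k, \theta)\, p(d_k \mid a) - \log \prod_k q(d_k) \Big] \\
&= \mathbb{E}_{q(d)} \sum_k \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big] \\
&= \sum_k \mathbb{E}_{q(d)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big] \\
&= \sum_k \mathbb{E}_{q(d_k)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]
\end{aligned}$$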
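The final expression for the lower bound is a plain double sum over images and offsets, so it can be computed directly from the log-likelihood matrix. The sketch assumes NumPy, `ll[k, d] = log p(X_k | d_k = d, theta)`, and the convention $0 \log 0 = 0$ for offsets with zero posterior mass; `lower_bound` is a hypothetical name.

```python
import numpy as np

def lower_bound(ll, a, q):
    """L = sum_k sum_d q[k, d] * (ll[k, d] + log a[d] - log q[k, d])."""
    safe_q = np.where(q > 0, q, 1.0)                 # avoid log(0); term is masked below
    terms = ll + np.log(a) - np.log(safe_q)
    return float(np.sum(np.where(q > 0, q * terms, 0.0)))
```

A useful check: when `q` is the exact posterior from the E-step, the bound equals the marginal log-likelihood $\sum_k \log \sum_{d} p(X_k \mid d, \theta)\, a[d]$, and any other `q` gives a smaller value.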
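Task overview

E-step:
$$q(d_k) \propto p(X_k \mid d_k, \theta)\, p(d_k \mid a), \qquad \sum_j q(d_k = j) = 1$$

M-step:
$$a[j] \propto \sum_k q(d_k = j), \qquad \sum_j a[j] = 1$$

$$F[i,m] = \frac{\sum_{k, d_k} q(d_k)\, X_k[i, m + d_k]}{K}, \qquad B[i,j] = \frac{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}\, X_k[i,j]}{\sum_{k, d_k} q(d_k)\, I^B_{ijd_k}}$$

$$s^2 = \frac{1}{KWH} \sum_{k, d_k, i, j} q(d_k) \Big[ (X_k[i,j] - B[i,j])^2 I^B_{ijd_k} + (X_k[i,j] - F[i, j - d_k])^2 I^F_{ijd_k} \Big]$$

Stopping criterion: $\mathcal{L}(q^{(t+1)}, \theta^{(t+1)}, a^{(t+1)}) - \mathcal{L}(q^{(t)}, \theta^{(t)}, a^{(t)}) < \text{tol}$

$$\mathcal{L}(q, \theta, a) = \sum_k \mathbb{E}_{q(d_k)} \Big[ \log p(X_k \mid d_k, \theta) + \log p(d_k \mid a) - \log q(d_k) \Big]$$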
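The update for the prior $a$ in the overview amounts to averaging the posteriors over images. A short NumPy sketch, assuming `q` has shape `(K, D)`; the name `update_a` is hypothetical.

```python
import numpy as np

def update_a(q):
    """a[j] proportional to sum_k q[k, j], normalised to sum to one."""
    a = q.sum(axis=0)      # accumulate posterior mass per offset
    return a / a.sum()     # enforce sum_j a[j] = 1
```

Since every row of `q` sums to one, the normaliser equals the number of images, so this is the same as `q.mean(axis=0)`.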