EM GaussianMixture Example
This is a numerical mini-example of a single EM iteration, applied to the problem of estimating the means
of two Gaussians. The full derivation and explanation of the EM algorithm for this case can be found in
many books (e.g. Mitchell’s book, section 6.12.1).
Let $(x_1, x_2, x_3) = (2, 4, 7)$ be our three data points, each presumed to have been generated from one of two
Gaussians.
The standard deviations of both Gaussians are given: $\sigma_1 = \sigma_2 = 1/\sqrt{2}$.
The prior over the two Gaussians is also given: $\lambda_1 = \lambda_2 = 0.5$.
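This choice of standard deviation is what keeps the arithmetic simple. As a short worked step (using the standard Gaussian density, which the text does not write out explicitly), $2\sigma^2 = 1$ and $\sigma\sqrt{2\pi} = \sqrt{\pi}$, so the density of a Gaussian with mean $\mu_j$ evaluated at $x_i$ reduces to:

$$
\frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x_i - \mu_j)^2}{2\sigma^2}}
= \frac{1}{\sqrt{\pi}} \, e^{-(x_i - \mu_j)^2}
\qquad \text{for } \sigma = 1/\sqrt{2}
$$

The exponent is then just the negative squared distance between the data point and the mean.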
Let j be an index over the Gaussians, i an index over the data points, and k an index over the E-M iterations.
We are trying to derive values for the two means $\mu = (\mu_1, \mu_2)$ that maximize the likelihood of the given
data, i.e.:

$$
\mu_{ML} = \arg\max_{\mu_1, \mu_2} \prod_i \sum_j \lambda_j \, e^{-\frac{(x_i - \mu_j)^2}{2\sigma^2}}
$$
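The objective above can be evaluated directly for any candidate pair of means. The following is a minimal sketch, not part of the original handout. The names `X`, `SIGMA`, `LAM` and `objective` are illustrative, and like the formula it omits the Gaussian normalization constant, which does not affect the arg max because both standard deviations are equal.

```python
import math

X = (2.0, 4.0, 7.0)         # the three data points from the text
SIGMA = 1 / math.sqrt(2)    # shared standard deviation, so 2 * SIGMA**2 is 1
LAM = (0.5, 0.5)            # priors lambda_1 and lambda_2


def objective(mu1, mu2):
    """Product over data points of the prior-weighted sum of unnormalized Gaussians."""
    total = 1.0
    for x in X:
        total *= sum(
            lam * math.exp(-((x - mu) ** 2) / (2 * SIGMA ** 2))
            for lam, mu in zip(LAM, (mu1, mu2))
        )
    return total


# Compare two candidate pairs of means; the larger value is the better fit.
print(objective(3.0, 6.0))
print(objective(0.0, 10.0))
```

EM never evaluates this product over a grid of candidates; it improves the means iteratively. The function is only a way to check that a proposed update did not decrease the likelihood.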
Let us initialize the Gaussian means to some reasonable values (inside the data range, and integer valued,
to make calculation easy): $\mu_1^{[0]} = 3$, $\mu_2^{[0]} = 6$.
Let $z_{i,j} = 1$ if $x_i$ was generated by Gaussian $j$, and 0 otherwise. The $z_{i,j}$’s are our latent variables.
The likelihood function can be written as:

$$
L(x_i \mid \mu^{[k]}) = \sum_j \lambda_j \, L(x_i \mid \mu = \mu_j^{[k]})
$$
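Numerical example of one EM iteration over a Mixture of Gaussians

$$
\begin{array}{l|ccc}
i & 1 & 2 & 3 \\
\hline
x_i & 2.0 & 4.0 & 7.0 \\
L(x_i \mid \mu = \mu_1^{[0]}) & \frac{1}{\sqrt{\pi}} e^{-1} & \frac{1}{\sqrt{\pi}} e^{-1} & \frac{1}{\sqrt{\pi}} e^{-16} \\
L(x_i \mid \mu = \mu_2^{[0]}) & \frac{1}{\sqrt{\pi}} e^{-16} & \frac{1}{\sqrt{\pi}} e^{-4} & \frac{1}{\sqrt{\pi}} e^{-1} \\
L(x_i \mid \mu^{[0]}) & \frac{1}{2\sqrt{\pi}} (e^{-1} + e^{-16}) & \frac{1}{2\sqrt{\pi}} (e^{-1} + e^{-4}) & \frac{1}{2\sqrt{\pi}} (e^{-16} + e^{-1}) \\
E[z_{i,1} \mid \mu^{[0]}] & \frac{e^{-1}}{e^{-1} + e^{-16}} \approx 1 & \frac{e^{-1}}{e^{-1} + e^{-4}} \approx 0.953 & \frac{e^{-16}}{e^{-1} + e^{-16}} \approx 0 \\
E[z_{i,2} \mid \mu^{[0]}] & \frac{e^{-16}}{e^{-1} + e^{-16}} \approx 0 & \frac{e^{-4}}{e^{-1} + e^{-4}} \approx 0.047 & \frac{e^{-1}}{e^{-1} + e^{-16}} \approx 1
\end{array}
\tag{1}
$$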
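The E-step values in the table can be reproduced numerically. The sketch below is illustrative and not part of the original handout; the function names `gaussian_pdf` and `e_step` are assumptions. It takes each expectation $E[z_{i,j} \mid \mu^{[0]}]$ to be the prior-weighted component likelihood divided by the mixture likelihood of $x_i$. Since both priors are 0.5, this matches the ratios in the table.

```python
import math

x = [2.0, 4.0, 7.0]         # data points
mu0 = [3.0, 6.0]            # initial means mu_1^[0], mu_2^[0]
lam = [0.5, 0.5]            # priors lambda_1, lambda_2
sigma = 1 / math.sqrt(2)    # shared standard deviation


def gaussian_pdf(xi, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma at xi."""
    norm = sigma * math.sqrt(2 * math.pi)
    return math.exp(-((xi - mu) ** 2) / (2 * sigma ** 2)) / norm


def e_step(xs, mus, lams, sigma):
    """Return, for each data point, the list of E[z_ij] over the Gaussians j."""
    expectations = []
    for xi in xs:
        weighted = [l * gaussian_pdf(xi, m, sigma) for l, m in zip(lams, mus)]
        mixture = sum(weighted)  # L(x_i | mu^[k])
        expectations.append([w / mixture for w in weighted])
    return expectations


responsibilities = e_step(x, mu0, lam, sigma)
for xi, r in zip(x, responsibilities):
    print(f"x_i = {xi}: E[z_i1] = {r[0]:.3f}, E[z_i2] = {r[1]:.3f}")
```

The normalization constant $1/\sqrt{\pi}$ cancels in each ratio, which is why the table entries can be written using the exponentials alone.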
And therefore: