【ML】EM Algorithm

2022-08-28  盐果儿

EM Algorithm is short for Expectation-Maximization Algorithm. It's an iterative method for finding maximum likelihood estimates of the parameters of a statistical model when the model depends on unobserved latent variables.


It is used when the dataset is incomplete, for example when some values or labels are missing.

It's an unsupervised method: it does not need labeled data.


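Example: we have a transcript of student scores, but we don't know which of two classes, c_{1} or c_{2}, each student belongs to. We assume the scores in each class follow a Gaussian distribution.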

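1. Initial guess: P(c_{1}) = P(c_{2}) = 0.5, together with starting values for the Gaussian parameters \mu_{1}, \sigma_{1}, \mu_{2}, \sigma_{2}.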

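2. Expectation Step: Using the current guess for the prior probabilities and the Gaussian parameters, compute for each score x_{i} the posterior probability that it belongs to each class. The denominator of the posterior is the marginal likelihood of x_{i}.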

The probability density function of each class:

P(x | c_{1}) = \frac {1}{\sqrt {2 \pi \sigma _{1} ^2}} \exp(- \frac {(x - \mu_{1})^2}{2 \sigma _{1} ^2})

P(x | c_{2}) = \frac {1}{\sqrt {2 \pi \sigma _{2} ^2}} \exp(- \frac {(x - \mu_{2})^2}{2 \sigma _{2} ^2})

P(c_{1} | x_{i}) = \frac {P(x_{i} | c_{1})P(c_{1})}{P(x_{i} | c_{1})P(c_{1}) + P(x_{i} | c_{2})P(c_{2})}

P(c_{2} | x_{i}) = 1 - P(c_{1} | x_{i})
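The Expectation Step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `gaussian_pdf` and `e_step`, the sample scores, and the starting parameters are all made up for the example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def e_step(xs, priors, mus, sigmas):
    """Return P(c_1 | x_i) for every x_i using Bayes' rule with two classes."""
    posteriors = []
    for x in xs:
        w1 = gaussian_pdf(x, mus[0], sigmas[0]) * priors[0]  # P(x | c_1) P(c_1)
        w2 = gaussian_pdf(x, mus[1], sigmas[1]) * priors[1]  # P(x | c_2) P(c_2)
        posteriors.append(w1 / (w1 + w2))
    return posteriors

# Hypothetical scores and starting parameters (assumptions for illustration).
scores = [55.0, 60.0, 85.0, 90.0]
p_c1 = e_step(scores, priors=(0.5, 0.5), mus=(60.0, 85.0), sigmas=(10.0, 10.0))
p_c2 = [1.0 - p for p in p_c1]  # P(c_2 | x_i) = 1 - P(c_1 | x_i)
```

With these starting values, the low scores get a posterior close to 1 for c_{1} and the high scores a posterior close to 0.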

3. Maximization Step: Use the posterior probabilities from step 2 as weights to update the parameters of each Gaussian distribution.

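Here c_{1i} = P(c_{1} | x_{i}). For class c_{1} (class c_{2} is updated the same way with c_{2i} = P(c_{2} | x_{i})):

\mu_{1} = \frac {c_{11}x_{1} + c_{12}x_{2} + ... + c_{1n}x_{n}}{c_{11} + c_{12} + ... + c_{1n}}

\sigma_{1} ^2 = \frac {c_{11}(x_{1} - \mu_{1})^2 + c_{12}(x_{2} - \mu_{1})^2 + ... + c_{1n}(x_{n} - \mu_{1})^2}{c_{11} + c_{12} + ... + c_{1n}}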
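The update is a weighted mean and a weighted variance, where the weight of each score x_{i} is c_{1i} = P(c_{1} | x_{i}). A minimal sketch, assuming the weights have already been computed; the name `m_step` and the numbers are hypothetical:

```python
def m_step(xs, weights):
    """Weighted mean and variance for one class; weights[i] plays the role of c_{1i}."""
    total = sum(weights)
    mu = sum(c * x for c, x in zip(weights, xs)) / total
    var = sum(c * (x - mu) ** 2 for c, x in zip(weights, xs)) / total
    return mu, var

# Made-up scores with "hard" weights: the first two belong fully to c_1.
scores = [55.0, 60.0, 85.0, 90.0]
weights_c1 = [1.0, 1.0, 0.0, 0.0]
mu1, var1 = m_step(scores, weights_c1)
```

With weights of exactly 0 or 1 the result reduces to the ordinary mean and variance of the scores assigned to the class, which is a quick way to sanity-check the formulas.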

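4. Repeat step 2 and step 3 until the likelihood stops increasing. EM converges to a local maximum of the likelihood, which is not necessarily the global maximum.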
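Putting the steps together, the whole procedure for two Gaussian classes might look like the sketch below. Several things here are assumptions rather than part of the description above: the function name `em_two_gaussians`, the sample scores, the stopping tolerance, the small variance floor, and the update of the priors P(c_{k}) in each iteration (a common addition for Gaussian mixtures).

```python
import math

def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x (var is the variance sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(xs, mus, variances, priors=(0.5, 0.5), max_iter=200, tol=1e-8):
    mus, variances, priors = list(mus), list(variances), list(priors)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # Expectation step: posterior P(c_k | x_i) for every score.
        post, ll = [], 0.0
        for x in xs:
            w = [priors[k] * gaussian_pdf(x, mus[k], variances[k]) for k in range(2)]
            total = w[0] + w[1]          # marginal likelihood of x
            ll += math.log(total)
            post.append([w[0] / total, w[1] / total])
        # Stop when the log-likelihood no longer improves noticeably.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # Maximization step: weighted mean, weighted variance, and prior.
        for k in range(2):
            nk = sum(p[k] for p in post)
            mus[k] = sum(p[k] * x for p, x in zip(post, xs)) / nk
            var = sum(p[k] * (x - mus[k]) ** 2 for p, x in zip(post, xs)) / nk
            variances[k] = max(var, 1e-6)  # assumed floor to avoid a zero variance
            priors[k] = nk / len(xs)       # assumed extra update, see lead-in
    return mus, variances, priors, ll      # ll is from the last E-step

# Made-up scores forming two groups; starting values are rough guesses.
scores = [52.0, 55.0, 58.0, 60.0, 63.0, 82.0, 85.0, 88.0, 90.0, 93.0]
mus, variances, priors, ll = em_two_gaussians(scores, mus=(50.0, 100.0), variances=(100.0, 100.0))
```

Because the result depends on the starting values, it is common to run EM from several initial guesses and keep the run with the highest log-likelihood.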


Gaussian Distribution (Normal Distribution): A random variable X that follows a normal distribution with mathematical expectation μ and variance σ^2 is denoted X ~ N(μ, σ^2). The expected value μ determines the position of the peak of the probability density function, and the standard deviation σ determines how spread out the distribution is. The normal distribution with μ = 0 and σ = 1 is the standard normal distribution.

Probability Density Function:

f(x) = \frac {1}{\sqrt {2 \pi \sigma ^2}} \exp(- \frac {(x - \mu)^2}{2 \sigma ^2})
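The roles of μ and σ can be checked numerically with `NormalDist` from Python's standard `statistics` module. The three distributions below are arbitrary examples chosen for illustration:

```python
from statistics import NormalDist

standard = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution
wide = NormalDist(mu=0.0, sigma=2.0)      # same center, larger spread
shifted = NormalDist(mu=3.0, sigma=1.0)   # same spread, center moved to 3

peak_standard = standard.pdf(0.0)  # 1 / sqrt(2 * pi), about 0.3989
peak_wide = wide.pdf(0.0)          # doubling sigma halves the peak height
peak_shifted = shifted.pdf(3.0)    # changing mu moves the peak, not its height
```

Changing μ only slides the curve along the x-axis, while a larger σ makes it lower and wider.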


