Implementing a Gaussian Mixture Model from Scratch (EM-GMM)

2018-12-15, Yanring_

Problem:

Please build a Gaussian mixture model (GMM) to model the data in the file TrainingData_GMM.csv. Note that the data is composed of 4 clusters, and the model should be trained with the expectation-maximization (EM) algorithm.
Based on the GMM learned above, assign each training data point to one of the 4 clusters.
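
The post does not show its EM code, so the following is only a minimal NumPy sketch of how such a model could be trained. The function names (`log_gaussian`, `e_step`, `m_step`, `fit_gmm`), the initialization from random data points, and the small ridge term `reg` are all assumptions made for illustration, not the author's implementation.

```python
import numpy as np

def log_gaussian(X, mu, cov):
    """Log density of N(mu, cov) evaluated at each row of X, shape (n, d)."""
    d = X.shape[1]
    diff = X - mu
    L = np.linalg.cholesky(cov)
    # Solve L z = diff^T; the column-wise squared norm of z is the Mahalanobis distance
    z = np.linalg.solve(L, diff.T)
    maha = np.sum(z ** 2, axis=0)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)

def e_step(X, alpha, mu, cov):
    """Return (total log-likelihood, responsibilities gamma of shape (n, K))."""
    K = len(alpha)
    log_p = np.stack(
        [np.log(alpha[k]) + log_gaussian(X, mu[k], cov[k]) for k in range(K)],
        axis=1,
    )
    # Log-sum-exp over components for numerical stability
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    gamma = np.exp(log_p - log_norm)
    return log_norm.sum(), gamma

def m_step(X, gamma, reg=1e-6):
    """Re-estimate weights, means and covariances from the responsibilities."""
    n, d = X.shape
    Nk = gamma.sum(axis=0)                      # effective points per component
    alpha = Nk / n
    mu = (gamma.T @ X) / Nk[:, None]
    cov = np.empty((len(Nk), d, d))
    for k in range(len(Nk)):
        diff = X - mu[k]
        # Weighted scatter matrix; reg * I is an assumed safeguard against singularity
        cov[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    return alpha, mu, cov

def fit_gmm(X, K=4, n_iter=100, tol=1e-6, seed=0):
    """Run EM and return the parameters plus the log-likelihood per iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, size=K, replace=False)]        # assumed initialization
    cov = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * K)
    history = []
    for _ in range(n_iter):
        ll, gamma = e_step(X, alpha, mu, cov)
        history.append(ll)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
        alpha, mu, cov = m_step(X, gamma)
    return alpha, mu, cov, history
```

With the data loaded as an `(n, 2)` array, `fit_gmm(X, K=4)` returns a `history` list that can be plotted against the iteration index, and `np.argmax(gamma, axis=1)` on the responsibilities from `e_step` gives a hard cluster assignment for each point.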

Questions:

1） Show how the log-likelihood evolves as the training proceeds

image

The x-axis is the iteration number and the y-axis is the log-likelihood value.

2） The learned mathematical expression for the GMM model after training on the given dataset

$\alpha=\begin{bmatrix}0.23048224536932024\\0.22999999854996792\\0.272826418924052\\0.2666913371566595\end{bmatrix}$

$\mu = \begin{bmatrix}-0.40658&0.32248\\ 1.20354&-1.19686\\ 0.14435&0.14614\\ -0.44149&-0.45088\\\end{bmatrix}$

$\Sigma_1 = \begin{bmatrix}0.03446&-0.01299\\ -0.01299&0.03458\end{bmatrix},\quad \Sigma_2 = \begin{bmatrix}0.02259&-0.00761\\ -0.00761&0.02361\end{bmatrix},\quad \Sigma_3 = \begin{bmatrix}0.00886&0.00187\\ 0.00187&0.00881\end{bmatrix},\quad \Sigma_4 = \begin{bmatrix}0.07024&0.03731\\ 0.03731&0.06498\end{bmatrix}$
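
Taken together, these parameters define the mixture density $p(x)=\sum_{k=1}^{4}\alpha_k\,\mathcal{N}(x\mid\mu_k,\Sigma_k)$, where $\Sigma_k$ denotes the k-th covariance matrix listed above. As a rough check, the sketch below copies the reported values into NumPy arrays and evaluates that density. The helper names `weighted_densities`, `gmm_pdf`, and `most_likely_component` are hypothetical and do not come from the post.

```python
import numpy as np

# Parameters copied from the values reported above
alpha = np.array([0.23048224536932024, 0.22999999854996792,
                  0.272826418924052, 0.2666913371566595])
mu = np.array([[-0.40658, 0.32248],
               [1.20354, -1.19686],
               [0.14435, 0.14614],
               [-0.44149, -0.45088]])
cov = np.array([[[0.03446, -0.01299], [-0.01299, 0.03458]],
                [[0.02259, -0.00761], [-0.00761, 0.02361]],
                [[0.00886, 0.00187], [0.00187, 0.00881]],
                [[0.07024, 0.03731], [0.03731, 0.06498]]])

def weighted_densities(x, alpha, mu, cov):
    """Return alpha_k * N(x | mu_k, cov_k) for every component k at one point x."""
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    out = np.empty(len(alpha))
    for k, (a, m, S) in enumerate(zip(alpha, mu, cov)):
        diff = x - m
        maha = diff @ np.linalg.solve(S, diff)          # Mahalanobis distance
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(S))
        out[k] = a * np.exp(-0.5 * maha) / norm
    return out

def gmm_pdf(x, alpha, mu, cov):
    """Mixture density p(x): the sum of the weighted component densities."""
    return weighted_densities(x, alpha, mu, cov).sum()

def most_likely_component(x, alpha, mu, cov):
    """Index of the component with the largest posterior responsibility for x."""
    return int(np.argmax(weighted_densities(x, alpha, mu, cov)))
```

The weights sum to 1 up to floating-point error and all four covariance matrices are positive definite, so the reported values form a valid mixture.
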
3） Randomly select 500 data points from the given dataset and plot them on a 2-dimensional coordinate system. Mark the data points coming from the same cluster (using the results of Problem 2) with the same color.

image
4） Some analyses on the impacts of initialization on the converged values of EM algorithm
Different initial parameters have a very large effect on the solution that the EM-GMM algorithm finally converges to. My …

image
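
The post does not say which initialization it used. To illustrate why the starting point matters, the sketch below contrasts two common ways of choosing the initial means: picking data points uniformly at random, and a k-means++-style seeding that spreads the means apart. Both functions and their names are assumptions for illustration. The k-means++ idea is borrowed from k-means and is not something the original post describes.

```python
import numpy as np

def init_random_points(X, K, rng):
    """Pick K distinct data points uniformly at random as the initial means."""
    return X[rng.choice(len(X), size=K, replace=False)]

def init_kmeanspp(X, K, rng):
    """k-means++-style seeding: favor points far from the means chosen so far."""
    n = len(X)
    means = [X[rng.integers(n)]]
    for _ in range(1, K):
        chosen = np.array(means)
        # Squared distance from every point to its nearest already-chosen mean
        d2 = np.min(((X[:, None, :] - chosen[None, :, :]) ** 2).sum(axis=2), axis=1)
        # Draw the next mean with probability proportional to that distance
        means.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(means)
```

Random picking can place two initial means inside the same cluster, and EM may then converge to a poorer local optimum. Running EM from several seeds and keeping the run with the highest final log-likelihood is a simple way to see how much the result varies.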

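import numpy as np
import pylab

node_num = 500
# E() is the E-step defined earlier in the notebook; gamma is assumed to be
# the (n, 4) matrix of responsibilities
_, gamma = E()
# Hard-assign each point to the component with the largest responsibility
label = np.argmax(gamma, axis=1)
# Draw 500 distinct indices (replace=False avoids picking the same point twice)
selected_node_index = np.random.choice(n, size=node_num, replace=False)
node_pos = data[selected_node_index]
label = label[selected_node_index]
pylab.scatter(node_pos[:, 0], node_pos[:, 1], marker='o', c=label, cmap=pylab.cm.Accent)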


image
