Implementing a Gaussian Mixture Model from Scratch (EM-GMM)
2018-12-15
Yanring_
Problem:
- Please build a Gaussian mixture model (GMM) to model the data in the file TrainingData_GMM.csv. Note that the data is composed of 4 clusters, and the model should be trained with the expectation-maximization (EM) algorithm.
- Based on the GMM learned above, assign each training data point to one of the 4 clusters.
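The two tasks above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the code that produced the figures in this post: the function names (`log_gaussian`, `e_step`, `m_step`, `fit_gmm`), the random-data-point initialization, the `reg` term added to each covariance, and the stopping tolerance are all choices made for this example.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    # Solve cov @ z = diff^T rather than forming the inverse explicitly.
    sol = np.linalg.solve(cov, diff.T).T
    maha = np.sum(diff * sol, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def e_step(X, weights, means, covs):
    """Return the data log-likelihood and the responsibilities gamma (n x K)."""
    K = len(weights)
    log_p = np.stack(
        [np.log(weights[k]) + log_gaussian(X, means[k], covs[k]) for k in range(K)],
        axis=1,
    )
    # Log-sum-exp over components for numerical stability.
    m = log_p.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
    gamma = np.exp(log_p - log_norm)
    return log_norm.sum(), gamma

def m_step(X, gamma, reg=1e-6):
    """Re-estimate weights, means and covariances from the responsibilities."""
    n, d = X.shape
    # Guard against an empty component (assumption made for this sketch).
    Nk = np.maximum(gamma.sum(axis=0), 1e-12)
    weights = Nk / n
    means = (gamma.T @ X) / Nk[:, None]
    covs = []
    for k in range(gamma.shape[1]):
        diff = X - means[k]
        cov = (gamma[:, k, None] * diff).T @ diff / Nk[k]
        covs.append(cov + reg * np.eye(d))  # small ridge keeps cov invertible
    return weights, means, np.array(covs)

def fit_gmm(X, K=4, n_iter=100, tol=1e-6, seed=0):
    """Run EM and return the parameters plus the log-likelihood per iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(K, 1.0 / K)
    means = X[rng.choice(n, size=K, replace=False)]  # K distinct data points
    covs = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(d)] * K)
    history = []
    for _ in range(n_iter):
        ll, gamma = e_step(X, weights, means, covs)
        history.append(ll)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
        weights, means, covs = m_step(X, gamma)
    return weights, means, covs, history
```

With the data loaded as an `(n, d)` array `X`, `fit_gmm(X)` returns the fitted parameters together with the log-likelihood history, and `np.argmax(gamma, axis=1)` on the responsibilities from `e_step` gives the hard cluster assignment for each point.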
Questions:
1) Show how the log-likelihood evolves as the training proceeds
![](https://img.haomeiwen.com/i2471371/81635657d77946f4.png)
The x-axis is the iteration number and the y-axis is the log-likelihood.
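For reference, the quantity on the y-axis is presumably the standard GMM log-likelihood of the training set. In the usual notation (symbols assumed here: $n$ data points $x_i$, $K = 4$ components with mixing weights $\pi_k$, means $\mu_k$ and covariances $\Sigma_k$):

$$
\ln p(X \mid \pi, \mu, \Sigma) = \sum_{i=1}^{n} \ln \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)
$$

Each EM iteration leaves this value unchanged or increases it, which is why the curve rises and then flattens as training converges.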
2) The learned mathematical expression for the GMM model after training on the given dataset
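The fitted parameter values are not listed here, so the following is only the general form that the learned model takes, assuming $K = 4$ components and $d$-dimensional data ($d = 2$ if the points are plotted directly):

$$
p(x) = \sum_{k=1}^{4} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{4} \pi_k = 1, \quad \pi_k \ge 0
$$

$$
\mathcal{N}(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{d/2} \, |\Sigma_k|^{1/2}} \exp\!\left( -\tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k) \right)
$$

Substituting the trained $\pi_k$, $\mu_k$ and $\Sigma_k$ into these expressions gives the learned model.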
3) Randomly select 500 data points from the given dataset and plot them on a 2-dimensional coordinate system. Mark the data points coming from the same cluster (using the results of Problem 2) with the same color.
![](https://img.haomeiwen.com/i2471371/719e8b78ae782d6b.png)
4) Some analyses on the impacts of initialization on the converged values of EM algorithm
Different initial parameters have a very large effect on what the EM-GMM algorithm finally converges to. My
![](https://img.haomeiwen.com/i2471371/149960f7d1b40a2c.png)
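One way to probe the sensitivity to initialization is to run EM from several random starts and compare the final log-likelihoods. The sketch below is an illustration only: `em_gmm` and `compare_initializations` are hypothetical names, the initial means are random data points, and the `reg` ridge and fixed iteration count are assumptions of this example rather than settings used for the figure above.

```python
import numpy as np

def em_gmm(X, K, seed, n_iter=200, reg=1e-6):
    """Compact EM for a full-covariance GMM.

    Returns the log-likelihood computed in the last E-step and the final means.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, size=K, replace=False)]  # random data points as means
    sigma = np.array([np.cov(X, rowvar=False) + reg * np.eye(d)] * K)
    for _ in range(n_iter):
        # E-step: log of pi_k * N(x_i | mu_k, sigma_k) for every point and component.
        log_p = np.empty((n, K))
        for k in range(K):
            diff = X - mu[k]
            maha = np.sum(diff * np.linalg.solve(sigma[k], diff.T).T, axis=1)
            logdet = np.linalg.slogdet(sigma[k])[1]
            log_p[:, k] = np.log(pi[k]) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        gamma = np.exp(log_p - log_norm)
        # M-step: weighted re-estimation; the floor guards against empty components.
        Nk = np.maximum(gamma.sum(axis=0), 1e-12)
        pi = Nk / n
        mu = gamma.T @ X / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            sigma[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + reg * np.eye(d)
    return log_norm.sum(), mu

def compare_initializations(X, K=4, seeds=range(5)):
    """Run EM once per seed and collect the final log-likelihood of each run."""
    return {seed: em_gmm(X, K, seed)[0] for seed in seeds}
```

Calling `compare_initializations(X)` and taking `max(results, key=results.get)` selects the start with the highest final log-likelihood. Runs that end at clearly lower values have converged to a poorer local optimum.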
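```python
import numpy as np
import pylab

node_num = 500
# E() is the E-step routine from the training code; its second return value
# gamma holds the responsibility of each cluster for each data point.
_, gamma = E()
label = np.argmax(gamma, axis=1)  # hard assignment: most responsible cluster
# Draw 500 distinct indices; replace=False avoids picking the same point twice.
selected_node_index = np.random.choice(n, size=node_num, replace=False)
node_pos = data[selected_node_index]
label = label[selected_node_index]
pylab.scatter(node_pos[:, 0], node_pos[:, 1], marker='o', c=label, cmap=pylab.cm.Accent)
```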
![](https://img.haomeiwen.com/i2471371/eb3f9fc91a601bdb.png)