2019-01-10[Stay Sharp]k-means cl

2019-01-10  本文已影响4人  三千雨点

what is k-means clustering?

K-means clustering is a method of prototype based clustering, it can group the data into k clusters in which eash data belongs to the cluster with the nearest mean.

Given the input data set D = \left\{ \boldsymbol { x } _ { 1 } , \boldsymbol { x } _ { 2 } , \ldots , \boldsymbol { x } _ { m } \right\}, k-means clustering will partition the n observations into k ( \leqslant m ) set \mathrm { S } = \left\{ S _ { 1 } , S _ { 2 } , \ldots , S _ { k } \right\} so as to minimize the within-cluster sum of squares.

\underset { \mathbf { S } } { \arg \min } \sum _ { i = 1 } ^ { k } \sum _ { \mathbf { x } \in S _ { i } } \left\| \mathbf { x } - \boldsymbol { \mu } _ { i } \right\| ^ { 2 }
where u_{i} is the mean of points in S _ { i }.

clustering steps

References

https://en.wikipedia.org/wiki/K-means_clustering
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68

上一篇 下一篇

猜你喜欢

热点阅读