CSIAM2018BDAI:P000081

第一届全国大数据与人工智能科学大会

2018年 7月5日 ~ 7日

P000081

The Seeding Algorithms for Spherical K-Means Clustering

*大川徐 (北京工业大学)


K-means problem is a classic NP-hard problem in machine learning and computational geometry,  and its seeding algorithms based on the Lloyd's algorithm are hotly studied. In order to  cluster the textual data with high dimension in modern data analysis, the spherical K-means  clustering is presented. It aims to partition the given points with unit length into K sets  so as to minimize the within-cluster sum of cosine dissimilarity. In this talk, we mainly  study the seeding algorithm for the spherical K-means clustering, and its special case (with  separable sets) and generalized problem ($\alpha$-spherical K-means clustering). For the  spherical K-means clustering with separable sets, the approximate algorithm with a constant  factor is presented. Moreover, it can be generalized to the $\alpha$-spherical separable K- means clustering. By slickly constructing a useful function, we also show that the famous  seeding algorithms such as K-means++ and K-means$||$ of K-means problem can be applied  directly to solve the $\alpha$-spherical K-means clustering.

K-means problem is a classic NP-hard problem in machine learning and computational geometry, and its seeding algorithms based on the Lloyd's algorithm are hotly studied. In order to cluster the textual data with high dimension in modern data analysis, the spherical K-means clustering is presented. It aims to partition the given points with unit length into K sets so as to minimize the within-cluster sum of cosine dissimilarity. In this talk, we mainly study the seeding algorithm for the spherical K-means clustering, and its special case (with separable sets) and generalized problem ($\alpha$-spherical K-means clustering). For the spherical K-means clustering with separable sets, the approximate algorithm with a constant factor is presented. Moreover, it can be generalized to the $\alpha$-spherical separable K- means clustering. By slickly constructing a useful function, we also show that the famous seeding algorithms such as K-means++ and K-means$||$ of K-means problem can be applied directly to solve the $\alpha$-spherical K-means clustering.

Supported by SmartChair

Math formula preview: