These are chat archives for derrickburns/generalized-kmeans-clustering
This project generalizes the Spark MLlib K-Means clusterer to support clustering of dense or sparse, low- or high-dimensional data using distance functions defined by Bregman divergences (e.g. squared Euclidean distance, Kullback-Leibler divergence). Several variants of standard K-Means are easily implemented atop this package, including bisecting K-Means and Anytime K-Means.
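To make the idea of a distance function "defined by a Bregman divergence" concrete, here is a minimal plain-Python sketch. It is not this project's API: the names `bregman`, `nearest_center`, `sq_norm`, and `neg_entropy` are hypothetical and chosen only for illustration. It relies on the standard definition D_F(p, q) = F(p) - F(q) - <grad F(q), p - q> for a convex function F, and shows that F(x) = sum of x_i^2 yields squared Euclidean distance while F(x) = sum of x_i log x_i yields Kullback-Leibler divergence on probability vectors.

```python
import math

def bregman(F, grad_F, p, q):
    """Bregman divergence D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner

# F(x) = sum x_i^2 generates squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)

def sq_norm_grad(x):
    return [2.0 * v for v in x]

# F(x) = sum x_i log x_i (negative entropy) generates KL divergence
# when p and q are strictly positive and each sums to 1.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

def nearest_center(F, grad_F, point, centers):
    """Index of the center with the smallest divergence from the point
    (the assignment step of a K-Means-style iteration)."""
    return min(range(len(centers)),
               key=lambda i: bregman(F, grad_F, point, centers[i]))

p = [0.5, 0.5]
q = [0.25, 0.75]
sq_dist = bregman(sq_norm, sq_norm_grad, p, q)        # squared Euclidean
kl_div = bregman(neg_entropy, neg_entropy_grad, p, q)  # KL(p || q)
```

Swapping the pair (F, grad F) changes the geometry of the clustering without changing the assignment logic, which is the sense in which one clusterer can serve several divergences. Note that Bregman divergences are generally not symmetric, so the argument order (point first, center second) matters.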