On Coreset Constructions for the Fuzzy $K$-Means Problem
Johannes Blömer, Sascha Brauer, Kathrin Bujna

TL;DR
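This paper introduces the first coresets for fuzzy $K$-means clustering whose size is linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points. The coresets can be used to compute $(1+\epsilon)$-approximate solutions and can be maintained in an insertion-only stream.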
Contribution
It presents novel coresets for fuzzy $K$-means with provable size bounds and shows how to use them in a $(1+\epsilon)$-approximation algorithm and in an insertion-only streaming setting.
Findings
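The coresets have size linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points.
They can be used to compute a $(1+\epsilon)$-approximation for fuzzy $K$-means, improving on previously presented results.
They can be maintained in an insertion-only streaming setting, where data points arrive one by one.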
Abstract
The fuzzy $K$-means problem is a popular generalization of the well-known $K$-means problem to soft clusterings. We present the first coresets for fuzzy $K$-means with size linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points. We show that these coresets can be employed in the computation of a $(1+\epsilon)$-approximation for fuzzy $K$-means, improving previously presented results. We further show that our coresets can be maintained in an insertion-only streaming setting, where data points arrive one-by-one.
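For orientation, here is a minimal sketch of the kind of objective a coreset is meant to preserve. It assumes the standard fuzzy $K$-means cost with a fuzzifier `m > 1` (each point's membership-weighted squared distances to the centers, with memberships summing to one) and evaluates it on a weighted point set. The function name `fuzzy_cost`, the default `m = 2`, and the toy data are illustrative assumptions, not taken from the paper. The toy "compression" below merely merges duplicate points, so it is exact; the paper's actual coreset construction is not reproduced here.

```python
def fuzzy_cost(points, weights, centers, m=2.0):
    """Weighted fuzzy K-means cost for fixed centers (assumed standard form).

    Each point uses the memberships that are optimal for the given centers:
    r_k proportional to d_k^(-1/(m-1)), where d_k is the squared distance
    to center k. The point then contributes weight * sum_k r_k^m * d_k.
    """
    total = 0.0
    for x, w in zip(points, weights):
        # Squared Euclidean distance from x to every center.
        d = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        if min(d) == 0.0:
            continue  # x coincides with a center, so its optimal cost is 0
        inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
        s = sum(inv)
        memberships = [v / s for v in inv]  # non-negative, sum to 1
        total += w * sum(r ** m * dk for r, dk in zip(memberships, d))
    return total


# Toy data (hypothetical): four points, two of each kind.
full_points = [(0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)]
full_weights = [1.0, 1.0, 1.0, 1.0]

# Trivial weighted summary: one representative per duplicate, weight 2.
small_points = [(0.0, 0.0), (2.0, 0.0)]
small_weights = [2.0, 2.0]

centers = [(0.5, 0.0), (3.0, 0.0)]
cost_full = fuzzy_cost(full_points, full_weights, centers)
cost_small = fuzzy_cost(small_points, small_weights, centers)
```

In this toy case `cost_full` and `cost_small` agree for every choice of centers. A coreset relaxes this: a much smaller weighted set whose cost is within a factor of $(1 \pm \epsilon)$ of the full cost for all candidate solutions.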
