Geometric-k-means: A Bound Free Approach to Fast and Eco-Friendly k-means
Parichit Sharma, Marcin Stanislaw, Hasan Kurban, Oguzhan Kulekci, Mehmet Dalkilic

TL;DR
Geometric-k-means (Gk-means) is a new algorithm that uses geometric principles to accelerate k-means clustering, reducing computation and energy use while maintaining high solution quality across various datasets.
Contribution
Gk-means introduces a geometric approach using scalar projection to selectively focus on influential data points, significantly improving efficiency and sustainability over existing methods.
Findings
Gk-means outperforms traditional k-means in runtime and distance computations.
Gk-means reduces energy consumption compared to state-of-the-art variants.
Gk-means maintains clustering quality across diverse datasets.
Abstract
This paper introduces Geometric-k-means (or Gk-means for short), a novel approach that significantly enhances the efficiency and energy economy of the widely utilized k-means algorithm, which, despite its inception over five decades ago, remains a cornerstone in machine learning applications. The essence of Gk-means lies in its active utilization of geometric principles, specifically scalar projection, to accelerate the algorithm without sacrificing solution quality. This geometric strategy enables a more discerning focus on the data points that are most likely to influence cluster updates, which we call high expressive data (HE). In contrast, low expressive data (LE), which does not impact the clustering outcome, is effectively bypassed, leading to considerable reductions in computational overhead. Experiments spanning synthetic, real-world, and high-dimensional datasets demonstrate…
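To make the scalar projection mentioned in the abstract concrete, here is a minimal sketch of the underlying geometric fact: a point x is closer to centroid c_j than to c_i exactly when the scalar projection of (x - c_i) onto the direction (c_j - c_i) exceeds half the distance between the two centroids. This is an illustration only, not the paper's HE/LE criterion, which the summary above does not spell out. The function names `scalar_projection` and `crosses_midpoint` are hypothetical and chosen for this example.

```python
import numpy as np

def scalar_projection(x, c_i, c_j):
    """Signed length of (x - c_i) along the direction from c_i to c_j."""
    d = c_j - c_i
    return float(np.dot(x - c_i, d) / np.linalg.norm(d))

def crosses_midpoint(x, c_i, c_j):
    """True if x lies past the perpendicular bisector of c_i and c_j,
    which is equivalent to x being strictly closer to c_j than to c_i."""
    return scalar_projection(x, c_i, c_j) > 0.5 * np.linalg.norm(c_j - c_i)

# Illustrative data (assumed, not from the paper): two centroids on the x-axis.
c_i = np.array([0.0, 0.0])
c_j = np.array([4.0, 0.0])

near = np.array([1.0, 3.0])   # projection 1.0, below the midpoint at 2.0
far = np.array([3.0, 5.0])    # projection 3.0, beyond the midpoint at 2.0

print(crosses_midpoint(near, c_i, c_j))  # False: stays with c_i
print(crosses_midpoint(far, c_i, c_j))   # True: closer to c_j
```

The check needs one dot product per point and centroid pair rather than two full distance computations, which hints at why a projection-based test can reduce work; how Gk-means actually uses it to separate HE from LE data is described in the full paper.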
Taxonomy
Topics: Data Management and Algorithms · Advanced Statistical Methods and Models
