Clustering for Different Scales of Measurement - the Gap-Ratio Weighted K-means Algorithm
Joris Guérin, Olivier Gibaru, Stéphane Thiery, Eric Nyiri

TL;DR
This paper introduces the gap-ratio weighted K-means algorithm, designed to cluster data whose features are on different scales and have large spreads. It is validated on a robotics application and on toy dataset experiments.
Contribution
The paper presents a novel weighted K-means algorithm that adjusts feature importance based on data gap ratios, improving clustering in heterogeneous measurement scales.
Findings
Outperforms standard K-means on Lego brick dataset
Effective in handling features with different measurement scales
Validated on multiple datasets with improved clustering accuracy
Abstract
This paper describes a method for clustering data that are spread out over large regions and whose dimensions are on different scales of measurement. The algorithm was developed to implement a robotics application that sorts and stores objects in an unsupervised way. The toy dataset used to validate this application consists of Lego bricks of different shapes and colors. The uncontrolled lighting conditions and the use of RGB color features involve, respectively, data with a large spread and different levels of measurement between data dimensions. To overcome the combination of these two characteristics in the data, we have developed a new weighted K-means algorithm, called gap-ratio K-means, which weights each dimension of the feature space before running the K-means algorithm. The weight associated with a feature is proportional to the ratio of…
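The general idea of weighting each feature dimension before running K-means can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract is cut off before the gap-ratio weight formula is stated, so the weights here are a caller-supplied placeholder, and the function names (`scale_features`, `kmeans`, `weighted_kmeans`) are hypothetical.

```python
import random


def scale_features(points, weights):
    """Multiply every dimension of every point by that dimension's weight.

    The weights are an input here (assumption): how gap-ratio K-means
    derives them is not given in the text above.
    """
    return [[w * x for w, x in zip(weights, p)] for p in points]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means with squared Euclidean distance."""
    rng = random.Random(seed)
    # Initialise the centers with k distinct data points.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Keep the old center if a cluster ends up empty.
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def weighted_kmeans(points, weights, k, **kwargs):
    """Weight the feature space first, then run standard K-means on it."""
    return kmeans(scale_features(points, weights), k, **kwargs)
```

With a weight vector such as `[0.0, 1.0]`, the first dimension is ignored entirely and the clustering is driven by the second one. Any scheme that produces one non-negative weight per feature could be plugged in the same way.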
Taxonomy
Topics: Advanced Clustering Algorithms Research · Data Management and Algorithms · Data Stream Mining Techniques
