KNN and K-means in Gini Prametric Spaces
Cassandra Mussard, Arthur Charpentier, Stéphane Mussard

TL;DR
This paper develops K-means and KNN algorithms based on Gini prametrics, which combine rank and value information for robustness to noise and outliers, and reports superior performance on 16 UCI datasets.
Contribution
It introduces Gini prametrics for clustering and classification, providing provably convergent K-means and competitive KNN methods that outperform traditional approaches in noisy settings.
Findings
Gini K-means is provably convergent and noise-resistant.
Gini KNN performs competitively with Hassanat's distance in noisy environments.
Experiments on 16 UCI datasets show superior performance and efficiency in clustering and classification tasks.
Abstract
This paper introduces enhancements to the K-means and K-nearest neighbors (KNN) algorithms based on the concept of Gini prametric spaces, instead of traditional metric spaces. Unlike standard distance metrics, Gini prametrics incorporate both value-based and rank-based measures, offering robustness to noise and outliers. The main contributions include: (1) a Gini prametric that captures rank information alongside value distances; (2) a Gini K-means algorithm that is provably convergent and resilient to noisy data; and (3) a Gini KNN method that performs competitively with state-of-the-art approaches like Hassanat's distance in noisy environments. Experimental evaluations on 16 UCI datasets demonstrate the superior performance and efficiency of the Gini-based algorithms in clustering and classification tasks. This work opens new directions for rank-based prametrics in machine learning…
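The abstract does not give the Gini prametric's formula, so the following is only an illustrative sketch of how a dissimilarity can mix value and rank information and then drive a KNN vote. It assumes a feature-wise form, the sum over features of (x_k - y_k) times the difference of their empirical-CDF ranks in the training set. That form is an assumption for illustration, not the paper's definition, and the names `ecdf_ranks`, `rank_value_dissimilarity`, and `knn_predict` are hypothetical.

```python
import numpy as np

def ecdf_ranks(train, points):
    """Feature-wise empirical CDF of `points` relative to `train` (values in [0, 1])."""
    train_sorted = np.sort(train, axis=0)
    n = train.shape[0]
    ranks = np.empty_like(points, dtype=float)
    for k in range(train.shape[1]):
        # Fraction of training values in feature k that are <= the point's value.
        ranks[:, k] = np.searchsorted(train_sorted[:, k], points[:, k], side="right") / n
    return ranks

def rank_value_dissimilarity(x, y, rx, ry):
    """ASSUMED form: sum_k (x_k - y_k) * (rank(x_k) - rank(y_k)).

    Each term is non-negative because ranks are non-decreasing in the value,
    and the result is 0 when x == y. It can also be 0 for distinct points
    that share the same ranks, so it need not be a metric.
    """
    return float(np.sum((x - y) * (rx - ry)))

def knn_predict(train, labels, query, k=3):
    """Majority vote among the k training points closest under the assumed dissimilarity."""
    r_train = ecdf_ranks(train, train)
    r_query = ecdf_ranks(train, query[None, :])[0]
    d = np.sum((train - query) * (r_train - r_query), axis=1)
    nearest = np.argsort(d, kind="stable")[:k]
    values, counts = np.unique(labels[nearest], return_counts=True)
    return values[np.argmax(counts)]

# Toy data: two well-separated groups (made up for illustration).
train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(train, labels, np.array([0.05, 0.05])))
print(knn_predict(train, labels, np.array([5.05, 5.05])))
```

Because the rank factor is bounded by 1, a single extreme coordinate contributes at most linearly to this dissimilarity, whereas it contributes quadratically to squared Euclidean distance. This is one intuition for why rank-based measures can be less sensitive to outliers.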
Taxonomy
Topics: Advanced Data Compression Techniques · Face and Expression Recognition
