Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions
Luca Insolia, Domenico Perrotta

TL;DR
This paper introduces Tk-merge, a computationally efficient robust clustering algorithm that handles general-shaped clusters under very weak parametric assumptions and outperforms state-of-the-art robust, model-based methods in simulations and real-world applications.
Contribution
The paper presents a novel hybrid clustering method combining trimmed k-means and hierarchical agglomeration, with adaptive contamination estimation, outperforming existing robust clustering techniques.
Findings
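Outperforms state-of-the-art robust, model-based clustering methods in numerical simulations
Effective in real-world applications: color quantization for image analysis, human mobility patterns from GPS data, biomedical images of diabetic retinopathy, and functional data across weather stations
Identifies clusters even when the data are contaminated, with the amount of contamination estimated adaptively from the data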
Abstract
We address general-shaped clustering problems under very weak parametric assumptions with a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration. The algorithm has low computational complexity and effectively identifies the clusters even in the presence of data contamination. We also present natural generalizations of the approach, as well as an adaptive procedure to estimate the amount of contamination in a data-driven fashion. Our proposal outperforms state-of-the-art robust, model-based methods in our numerical simulations and in real-world applications related to color quantization for image analysis, human mobility patterns based on GPS data, biomedical images of diabetic retinopathy, and functional data across weather stations.
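The two-step idea in the abstract (trimmed k-means followed by hierarchical agglomeration) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`trimmed_kmeans`, `merge_centroids`, `trim_then_merge`), the choice of single linkage for the merging step, the fixed trimming fraction `alpha`, and the optional `init` argument are all assumptions made for illustration. The sketch over-segments the data into `k` small groups while discarding the `alpha` fraction of points farthest from any centroid, then merges the `k` centroids down to `n_clusters` groups so that elongated or irregular clusters can be recovered as unions of small compact pieces.

```python
import math
import random


def trimmed_kmeans(points, k, alpha, init=None, n_iter=100, seed=0):
    """Step 1 (sketch): k-means that, at every iteration, ignores the alpha
    fraction of points farthest from their nearest centroid."""
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, k)
    n_keep = len(points) - int(alpha * len(points))
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        nearest = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                   for p in points]
        dists = [math.dist(p, centroids[c]) for p, c in zip(points, nearest)]
        # Keep the n_keep closest points; the rest are trimmed (label -1).
        order = sorted(range(len(points)), key=dists.__getitem__)
        keep = set(order[:n_keep])
        new_labels = [c if i in keep else -1 for i, c in enumerate(nearest)]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid from its untrimmed members only.
        for j in range(k):
            members = [points[i] for i in keep if nearest[i] == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def merge_centroids(centroids, n_clusters):
    """Step 2 (sketch): single-linkage agglomeration of the centroids.
    Returns a dict mapping each centroid index to a merged cluster id."""
    groups = [[j] for j in range(len(centroids))]
    while len(groups) > n_clusters:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = min(math.dist(centroids[i], centroids[j])
                        for i in groups[a] for j in groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = groups.pop(b)  # b > a, so index a stays valid
        groups[a].extend(merged)
    return {j: g for g, members in enumerate(groups) for j in members}


def trim_then_merge(points, k, n_clusters, alpha, init=None, seed=0):
    """Hypothetical wrapper: trimmed k-means, then merge the k centroids."""
    centroids, labels = trimmed_kmeans(points, k, alpha, init=init, seed=seed)
    mapping = merge_centroids(centroids, n_clusters)
    return [mapping[c] if c != -1 else -1 for c in labels]
```

Points are passed as a list of equal-length tuples, and trimmed points receive the label `-1`. With random initialization a single run can start a centroid on an outlier, so in practice one would use several restarts; the data-driven estimation of the contamination level mentioned in the abstract is not covered by this sketch.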
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
Methods: Greedy Policy Search
