Robust Clustering Using Tau-Scales
Juan D. Gonzalez, Victor J. Yohai, Ruben H. Zamar

TL;DR
This paper introduces K Tau Centers, a robust clustering method based on Tau scales. It targets data with outliers, where K means performs poorly, and aims to be robust and efficient adaptively, which trimmed K means cannot do. Its centers are shown to be consistent.
Contribution
The paper proposes a novel robust clustering algorithm called K Tau Centers that combines robustness and efficiency adaptively, with theoretical consistency guarantees.
Findings
Performs well in simulation studies
Effective on real data examples
Centers are consistent estimators of true centers
Abstract
K means is a popular non-parametric clustering procedure introduced by Steinhaus (1956) and further developed by MacQueen (1967). It is known, however, that K means does not perform well in the presence of outliers. Cuesta-Albertos et al. (1997) introduced a robust alternative, trimmed K means, which can be tuned to be robust or efficient, but cannot achieve these two properties simultaneously in an adaptive way. To overcome this limitation we propose a new robust clustering procedure called K Tau Centers, which is based on the concept of Tau scale introduced by Yohai and Zamar (1988). We show that K Tau Centers performs well in extensive simulation studies and real data examples. We also show that the centers found by the proposed method are consistent estimators of the "true" centers, defined as the minimizers of the objective function at the population level.
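The abstract does not spell out the objective function, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the usual form of a Tau scale (an M-scale s multiplied by the square root of the mean of a second bounded rho function evaluated at d/s) and assumes that the clustering objective is the Tau scale of each point's distance to its nearest center. The function names, the bisquare rho, and the tuning constants c1, c2 and b are illustrative placeholders, not the values used in the paper.

```python
import math
from statistics import median


def rho_bisquare(u, c):
    """Tukey bisquare rho: rises from 0 to 1 on [0, c], then stays at 1."""
    v = min(abs(u) / c, 1.0)
    return 1.0 - (1.0 - v * v) ** 3


def m_scale(d, c=1.0, b=0.5, n_iter=200, tol=1e-10):
    """Solve mean(rho(d_i / s)) = b for s by fixed-point iteration."""
    s = median(abs(x) for x in d)  # robust starting value
    if s == 0:
        return 0.0
    for _ in range(n_iter):
        mean_rho = sum(rho_bisquare(x / s, c) for x in d) / len(d)
        s_new = s * math.sqrt(mean_rho / b)
        converged = abs(s_new - s) <= tol * s
        s = s_new
        if converged:
            break
    return s


def tau_scale(d, c1=1.0, c2=3.0, b=0.5):
    """Tau scale: M-scale times sqrt of the mean of a second rho (assumed form)."""
    s = m_scale(d, c1, b)
    if s == 0:
        return 0.0
    return s * math.sqrt(sum(rho_bisquare(x / s, c2) for x in d) / len(d))


def k_tau_objective(points, centers):
    """Assumed objective: Tau scale of distances to the nearest center."""
    d = [min(math.dist(p, m) for m in centers) for p in points]
    return tau_scale(d)
```

Because both rho functions are bounded, a single gross outlier contributes at most a fixed amount to this objective no matter how far away it lies, whereas in the K means sum of squared distances its contribution grows without bound. A full procedure would still need a search over candidate centers, which this sketch does not attempt.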
Taxonomy
Topics: Advanced Statistical Methods and Models · Bayesian Methods and Mixture Models · Statistical Methods and Inference
