Quantile-based clustering
Christian Hennig, Cinzia Viroli, Laura Anderlucci

TL;DR
The paper introduces $K$-quantiles clustering, a flexible, scalable nonparametric method that handles skewness and high-dimensional data, with proven consistency and competitive performance in simulations and real datasets.
Contribution
It presents a novel $K$-quantiles clustering algorithm that is simple, efficient, and adaptable to skewed and high-dimensional data, with theoretical guarantees.
Findings
Proven consistency of $K$-quantiles clustering.
Performance comparable or superior to popular clustering methods in simulations.
Effective application to high-dimensional microarray data.
Abstract
A new cluster analysis method, $K$-quantiles clustering, is introduced. $K$-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for $K$-means. It can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and internal variable scaling based on within-cluster variation. Different versions allow for different levels of parsimony and computational efficiency. Although $K$-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace distributions. The consistency of $K$-quantiles clustering is proved, and it is shown that $K$-quantiles clusters correspond to well separated mixture components in a nonparametric mixture. In a simulation, $K$-quantiles clustering is compared with a number of popular clustering methods with good…
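To make the Lloyd-style iteration mentioned in the abstract concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation. It assumes a single quantile level `theta` that is fixed in advance and shared by all clusters and variables, no variable scaling, and a plain asymmetric absolute (check) loss as the point-to-center discrepancy. The names `check_loss`, `theta_quantile`, `k_quantiles`, and the `init` parameter are hypothetical and chosen for illustration only. The sketch alternates between assigning each point to the center with the smallest loss and resetting each center to the variable-wise `theta`-quantile of its members.

```python
import numpy as np


def check_loss(X, center, theta):
    """Asymmetric absolute loss of each row of X to one center, summed over variables.

    Coordinates below the center are weighted by (1 - theta),
    coordinates at or above it by theta (assumed form of the discrepancy).
    """
    diff = X - center
    weights = np.where(diff < 0, 1.0 - theta, theta)
    return (weights * np.abs(diff)).sum(axis=1)


def theta_quantile(X, theta):
    """Column-wise empirical theta-quantile, taken as an order statistic."""
    n = X.shape[0]
    idx = min(max(int(np.ceil(n * theta)) - 1, 0), n - 1)
    return np.sort(X, axis=0)[idx]


def k_quantiles(X, K, theta=0.5, init=None, max_iter=100, seed=0):
    """Lloyd-style alternation with a fixed, common theta (simplifying assumption)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    if init is None:
        # Start from K distinct data points chosen at random.
        centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Assignment step: each point goes to the center with the smallest loss.
        losses = np.stack([check_loss(X, c, theta) for c in centers], axis=1)
        new_labels = losses.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition unchanged, so the iteration has converged
        labels = new_labels
        # Update step: each center becomes the variable-wise theta-quantile
        # of its members; an empty cluster keeps its previous center.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:
                centers[k] = theta_quantile(members, theta)
    return labels, centers
```

With `theta=0.5` the centers are variable-wise medians, while values of `theta` away from 0.5 let the centers sit off-center within skewed clusters. A call such as `labels, centers = k_quantiles(X, K=3, theta=0.3)` returns one label per row of `X` and a `K`-by-`p` array of centers. Selecting `theta` from the data and scaling variables by within-cluster variation, both of which the abstract alludes to, are left out of this sketch.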
