Multi-level conformal clustering: A distribution-free technique for clustering and anomaly detection
Ilia Nouretdinov, James Gammerman, Matteo Fontana, Daljit Rehal

TL;DR
This paper introduces Multi-level Conformal Clustering (MLCC), a distribution-free, hierarchical clustering method that automatically determines the number of clusters and enables simultaneous clustering and anomaly detection with statistical guarantees.
Contribution
The paper presents MLCC, a novel hierarchical clustering technique based on conformal prediction that is distribution-free, adaptable, and capable of joint clustering and anomaly detection.
Findings
Once a significance level is set, MLCC automatically selects the number of clusters; repeating this at several levels gives a hierarchy.
MLCC provides statistically valid clustering without distributional assumptions.
MLCC can be integrated with various machine learning algorithms.
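The findings above can be illustrated with a toy sketch. It is not the authors' implementation: this summary does not specify the nonconformity measure, the grid, or how clusters are linked, so the sketch assumes one-dimensional data, a k-nearest-neighbour distance as the nonconformity score, and an evenly spaced grid whose adjacent points count as connected. All function names are hypothetical. Each grid point receives a conformal p-value, the points with p-value above the significance level form the prediction region, and the maximal connected runs of that region are the clusters. A point whose p-value is at or below the level would be flagged as an anomaly at that level.

```python
def knn_score(points, idx, k):
    """Assumed nonconformity score: distance to the k-th nearest other point."""
    dists = sorted(abs(points[idx] - p) for j, p in enumerate(points) if j != idx)
    return dists[k - 1]


def conformal_p_value(data, z, k=1):
    """Full conformal p-value of candidate z given the observed data."""
    augmented = list(data) + [z]
    scores = [knn_score(augmented, i, k) for i in range(len(augmented))]
    # Fraction of points (including z itself) at least as nonconforming as z.
    return sum(s >= scores[-1] for s in scores) / len(augmented)


def clusters_at_level(grid, p_values, eps):
    """Maximal runs of adjacent grid points whose p-value exceeds eps."""
    clusters, current = [], []
    for g, p in zip(grid, p_values):
        if p > eps:
            current.append(g)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


def multi_level_clustering(data, grid, levels, k=1):
    """Cluster the grid at each significance level; p-values are computed once."""
    p_values = [conformal_p_value(data, g, k) for g in grid]
    return {eps: clusters_at_level(grid, p_values, eps) for eps in levels}


# Toy data: two well-separated groups on the real line.
data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
grid = [float(g) for g in range(13)]
hierarchy = multi_level_clustering(data, grid, levels=[0.1, 0.2])

for eps, clusters in hierarchy.items():
    print(f"eps={eps}: {len(clusters)} cluster(s)")
# eps=0.1: 1 cluster(s)
# eps=0.2: 2 cluster(s)
```

In this toy run the number of clusters is never passed in; it follows from the significance level alone. The gap between the two groups has p-value 1/7, so it stays inside the region at level 0.1 and drops out at level 0.2, splitting one cluster into two. The same p-value would mark a new observation at 6.0 as an anomaly at level 0.2.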
Abstract
In this work we present a clustering technique called multi-level conformal clustering (MLCC). The technique is hierarchical in nature because it can be performed at multiple significance levels, which yields greater insight into the data than performing it at just one level. We describe the theoretical underpinnings of MLCC, compare and contrast it with the hierarchical clustering algorithm, and then apply it to real-world datasets to assess its performance. There are several advantages to using MLCC over more classical clustering techniques: once a significance level has been set, MLCC is able to automatically select the number of clusters. Furthermore, thanks to the conformal prediction framework, the resulting clustering model has a clear statistical meaning without any assumptions about the distribution of the data. This statistical robustness also allows us to perform…
