False membership rate control in mixture models
Ariane Marandon, Tabea Rebafka, Etienne Roquain, Nataliya Sokolovska

TL;DR
This paper develops a method for controlling the false membership rate (FMR) in mixture models: only part of the sample is classified, so that the misclassification rate among classified items stays below a pre-set level. It provides a theoretical analysis and bootstrap-based versions of the procedure.
Contribution
It introduces a novel plug-in method with theoretical analysis for FMR control in unsupervised mixture models, including bootstrap enhancements.
Findings
The proposed method guarantees FMR does not exceed the nominal level.
Bootstrap procedures improve the accuracy and robustness of FMR control.
Theoretical bounds quantify the deviation of FMR from the target level.
Abstract
The clustering task consists in partitioning elements of a sample into homogeneous groups. Most datasets contain individuals that are ambiguous and intrinsically difficult to attribute to one or another cluster. However, in practical applications, misclassifying individuals is potentially disastrous and should be avoided. To keep the misclassification rate small, one can decide to classify only a part of the sample. In the supervised setting, this approach is well known and referred to as classification with an abstention option. In this paper, the approach is revisited in an unsupervised mixture model framework and the purpose is to develop a method that comes with the guarantee that the false membership rate (FMR) does not exceed a pre-defined nominal level α. A plug-in procedure is proposed, for which a theoretical analysis is provided, by quantifying the FMR deviation with…
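To make the abstention idea concrete, here is a minimal sketch of a generic plug-in selection rule. It is not taken from the paper, whose exact procedure is not given in this summary. It assumes that estimated posterior membership probabilities are already available (for instance from a fitted mixture model), assigns each item to its most probable cluster, and classifies the largest set of most confident items whose average estimated error stays at or below the nominal level. The function name `select_with_abstention`, the `-1` abstention code, and the toy posteriors are all hypothetical.

```python
import numpy as np

def select_with_abstention(posteriors, alpha):
    """Generic plug-in rule (an assumption, not the paper's exact method).

    posteriors: (n, K) array of estimated membership probabilities.
    alpha: nominal level for the estimated false membership rate.
    Returns an (n,) integer array of labels, with -1 meaning "abstain".
    """
    posteriors = np.asarray(posteriors, dtype=float)
    labels = posteriors.argmax(axis=1)        # most probable cluster per item
    err = 1.0 - posteriors.max(axis=1)        # estimated chance that label is wrong

    # Rank items from most to least confident and track the running
    # average of the estimated errors among the items selected so far.
    order = np.argsort(err, kind="stable")
    running_mean = np.cumsum(err[order]) / np.arange(1, len(err) + 1)

    # The running mean is non-decreasing, so the admissible sizes form a prefix.
    admissible = np.nonzero(running_mean <= alpha)[0]
    k = admissible[-1] + 1 if admissible.size else 0

    out = np.full(len(err), -1)
    out[order[:k]] = labels[order[:k]]
    return out

# Toy posteriors for four items and two clusters (made up for illustration).
demo_posteriors = [[0.99, 0.01], [0.60, 0.40], [0.05, 0.95], [0.50, 0.50]]
demo_labels = select_with_abstention(demo_posteriors, alpha=0.1)
print(demo_labels)
```

In this toy case the two ambiguous items (rows two and four) are left unclassified, because including either would push the average estimated error above 0.1. With true posteriors such a rule targets the nominal level directly; with estimated ones the achieved FMR can deviate from it, which is the kind of gap the abstract says the theoretical analysis quantifies.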
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Advanced Statistical Methods and Models
