Diversifying Conformal Selections
Yash Nair, Ying Jin, James Yang, Emmanuel Candes

TL;DR
This paper introduces DACS, a model-free method that uses optimal stopping theory and conformal e-values to select diverse candidate sets while controlling the false discovery rate (FDR), with applications such as drug discovery and job hiring.
Contribution
The paper proposes a novel diversity-aware conformal selection (DACS) method that adaptively constructs diverse candidate sets with FDR control, leveraging optimal stopping theory and conformal e-values.
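To make the role of e-values in FDR control concrete, here is a minimal sketch of the generic e-BH procedure, one standard way of turning a list of e-values into a selection set at a nominal FDR level. It is not the DACS algorithm: the diversity-aware optimal stopping step described above is not shown. The function name `ebh_select` and the example e-values are hypothetical and chosen only for illustration.

```python
import numpy as np

def ebh_select(e_values, alpha):
    """Generic e-BH procedure (illustrative sketch, not the paper's DACS code).

    Selects the k* candidates with the largest e-values, where k* is the
    largest k such that the k-th largest e-value is at least m / (alpha * k).
    """
    e = np.asarray(e_values, dtype=float)
    m = e.size
    order = np.argsort(-e)              # candidate indices, largest e-value first
    sorted_e = e[order]
    ks = np.arange(1, m + 1)
    # Positions k (0-based) whose e-value clears the e-BH threshold m / (alpha * k).
    passing = np.nonzero(sorted_e >= m / (alpha * ks))[0]
    if passing.size == 0:
        return np.array([], dtype=int)  # nothing can be selected at this level
    k_star = passing[-1] + 1
    return np.sort(order[:k_star])

# Hypothetical e-values for five candidates (made-up numbers).
e_vals = [50.0, 0.5, 30.0, 1.2, 20.0]
selected = ebh_select(e_vals, alpha=0.2)
print(selected)  # [0 2 4]
```

With five candidates and alpha = 0.2, the thresholds are 25, 12.5, 8.33, 6.25, and 5 for k = 1 through 5. The three largest e-values (50, 30, 20) clear their thresholds and the fourth (1.2) does not, so candidates 0, 2, and 4 are selected.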
Findings
DACS effectively balances diversity and FDR control in simulations.
The method performs well on real-world datasets in drug discovery and hiring.
Computational heuristics significantly improve runtime efficiency.
Abstract
When selecting from a list of potential candidates, it is important to ensure not only that the selected items are of high quality, but also that they are sufficiently dissimilar, both to avoid redundancy and to capture a broader range of desirable properties. In drug discovery, scientists aim to select potent drugs from a library of unsynthesized candidates, but recognize that it is wasteful to repeatedly synthesize highly similar compounds. In job hiring, recruiters may wish to hire candidates who will perform well on the job, while also considering factors such as socioeconomic background, prior work experience, gender, or race. We study the problem of using any prediction model to construct a maximally diverse selection set of candidates while controlling the false discovery rate (FDR) in a model-free fashion. Our method, diversity-aware conformal selection (DACS), achieves…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Computational Drug Discovery Methods · Machine Learning and Data Classification
