Conformal Drug Property Prediction with Density Estimation under Covariate Shift
Siddhartha Laghuvarapu, Zhen Lin, Jimeng Sun

TL;DR
This paper introduces CoDrug, a method that combines an energy-based model with kernel density estimation to keep conformal prediction sets reliable in drug discovery, especially under distribution shift.
Contribution
It presents CoDrug, a novel approach that adjusts conformal prediction sets for covariate shift using density estimation, enhancing prediction validity in drug discovery.
Findings
CoDrug reduces the coverage gap by over 35% under distribution shift, compared with conformal prediction sets that are not adjusted for covariate shift.
It provides valid prediction sets in realistic drug discovery scenarios.
The method effectively addresses covariate shift in molecular property prediction.
Abstract
In drug discovery, it is vital to confirm the predictions of pharmaceutical properties from computational models using costly wet-lab experiments. Hence, obtaining reliable uncertainty estimates is crucial for prioritizing drug molecules for subsequent experimental validation. Conformal Prediction (CP) is a promising tool for creating such prediction sets for molecular properties with a coverage guarantee. However, the exchangeability assumption of CP is often challenged with covariate shift in drug discovery tasks: Most datasets contain limited labeled data, which may not be representative of the vast chemical space from which molecules are drawn. To address this limitation, we propose a method called CoDrug that employs an energy-based model leveraging both training data and unlabelled data, and Kernel Density Estimation (KDE) to assess the densities of a molecule set. The estimated…
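The general idea of correcting conformal prediction for covariate shift with estimated densities can be sketched as below. This is a minimal illustration of standard weighted conformal prediction with likelihood-ratio weights, not the authors' implementation. It assumes each molecule is summarized by a single hypothetical scalar feature, uses a hand-written Gaussian KDE, and takes 1 minus the predicted class probability as the nonconformity score; the energy-based model is not reproduced, and all function names are illustrative.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of 1-D `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def density_ratio(x, test_density, calib_density, floor=1e-12):
    """Likelihood ratio p_test(x) / p_calib(x), floored to avoid division by zero."""
    return test_density(x) / max(calib_density(x), floor)


def weighted_quantile(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of the calibration scores.

    The test point's own weight is treated as mass at +infinity, so the
    result is infinite when the calibration mass cannot reach 1 - alpha.
    """
    total = sum(weights) + test_weight
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight / total
        if cumulative >= 1.0 - alpha:
            return score
    return math.inf


def prediction_set(class_probs, threshold):
    """Keep every label whose nonconformity score (1 - probability) is within the threshold."""
    return [label for label, p in enumerate(class_probs) if 1.0 - p <= threshold]
```

In use, one would fit `gaussian_kde` once on calibration features and once on features of unlabeled target molecules, compute `density_ratio` for every calibration point and for the test molecule, and pass those weights to `weighted_quantile`. With equal weights this reduces to ordinary split conformal prediction; larger weights on calibration points that resemble the target set pull the threshold toward their scores.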
Taxonomy
Topics: Computational Drug Discovery Methods · Analytical Chemistry and Chromatography · Machine Learning in Materials Science
