Sample Efficiency Matters: A Benchmark for Practical Molecular Optimization
Wenhao Gao, Tianfan Fu, Jimeng Sun, Connor W. Coley

TL;DR
This paper introduces PMO, a benchmark for practical molecular optimization emphasizing sample efficiency, revealing that many current methods underperform under realistic query limits and highlighting areas for future improvement.
Contribution
The paper presents an open-source benchmark for molecular optimization, focusing on sample efficiency and providing a standardized setup for fair comparison of algorithms.
Findings
Most state-of-the-art methods do not outperform older methods when the number of oracle queries is limited.
No existing algorithm efficiently solves certain molecular optimization tasks within 10K oracle queries.
Algorithm choice and molecular assembly strategies significantly impact optimization performance.
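To make the query limit in these findings concrete, here is a minimal sketch of an oracle wrapper that enforces an evaluation budget. It is an illustration, not the PMO implementation: the names `BudgetedOracle`, `BudgetExceeded`, and `best_so_far` are hypothetical, the scoring function is a toy stand-in, and counting only unique molecules toward the budget is an assumption.

```python
from typing import Callable, Dict, List


class BudgetExceeded(RuntimeError):
    """Raised when an optimizer requests more oracle calls than the budget allows."""


class BudgetedOracle:
    """Wrap a scoring function and count the unique molecules it evaluates.

    Hypothetical helper for illustration only; not part of any real library.
    Assumption: repeated queries for the same molecule are served from a
    cache and do not consume budget.
    """

    def __init__(self, score_fn: Callable[[str], float], max_calls: int = 10_000):
        self.score_fn = score_fn
        self.max_calls = max_calls
        self.cache: Dict[str, float] = {}  # molecule string -> score
        self.history: List[float] = []     # scores in order of first evaluation

    @property
    def calls_used(self) -> int:
        # Number of unique molecules evaluated so far.
        return len(self.cache)

    def __call__(self, molecule: str) -> float:
        if molecule in self.cache:
            return self.cache[molecule]
        if self.calls_used >= self.max_calls:
            raise BudgetExceeded(f"budget of {self.max_calls} oracle calls exhausted")
        score = self.score_fn(molecule)
        self.cache[molecule] = score
        self.history.append(score)
        return score

    def best_so_far(self) -> List[float]:
        # Best score seen after each oracle call: a simple sample-efficiency curve.
        curve: List[float] = []
        best = float("-inf")
        for score in self.history:
            best = max(best, score)
            curve.append(best)
        return curve


if __name__ == "__main__":
    # Toy objective (assumption): count carbon characters in a SMILES-like string.
    oracle = BudgetedOracle(lambda s: float(s.count("C")), max_calls=3)
    for candidate in ["CCO", "C", "CCO", "CCCC"]:
        oracle(candidate)
    print(oracle.calls_used, oracle.best_so_far())
```

Plotting `best_so_far()` against the call index shows how quickly a method improves per query, which is the quantity a sample-efficiency comparison cares about.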
Abstract
Molecular optimization is a fundamental goal in the chemical sciences and is of central interest to drug and material design. In recent years, significant progress has been made in solving challenging problems across various aspects of computational molecular optimizations, emphasizing high validity, diversity, and, most recently, synthesizability. Despite this progress, many papers report results on trivial or self-designed tasks, bringing additional challenges to directly assessing the performance of new methods. Moreover, the sample efficiency of the optimization--the number of molecules evaluated by the oracle--is rarely discussed, despite being an essential consideration for realistic discovery applications. To fill this gap, we have created an open-source benchmark for practical molecular optimization, PMO, to facilitate the transparent and reproducible evaluation of algorithmic…
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Chemical Synthesis and Analysis
