TL;DR
This paper demonstrates that Bayesian optimization with surrogate models can significantly reduce the computational resources needed for large-scale virtual screening in drug discovery, identifying most top candidates with minimal evaluations.
Contribution
It evaluates various surrogate models and acquisition strategies, showing that model-guided screening can drastically cut down the number of required evaluations in large virtual libraries.
Findings
87.9% of top ligands found after testing only 2.4% of the library
Significant reduction in computational costs achieved
Model-guided search accelerates virtual screening campaigns
Abstract
Structure-based virtual screening is an important tool in early-stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10^8 molecules), so too do the resources necessary to conduct exhaustive virtual screening campaigns on these libraries. However, Bayesian optimization techniques can aid in their exploration: a surrogate structure-property relationship model trained on the predicted affinities of a subset of the library can be applied to the remaining library members, allowing the least promising compounds to be excluded from evaluation. In this study, we assess various surrogate model architectures, acquisition functions, and acquisition batch sizes as applied to several protein-ligand docking datasets and observe significant reductions in computational costs, even when using a greedy…
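The loop the abstract describes (train a surrogate on the scores of an evaluated subset, predict over the rest of the library, then evaluate only the most promising batch) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `fit_ridge`, `greedy_screen`, and `score_fn` are hypothetical, ridge regression stands in for the surrogate architectures assessed in the paper, the "library" is synthetic random feature vectors, and higher scores are treated as better.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression, used here as a stand-in surrogate model."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def greedy_screen(features, score_fn, init_size, batch_size, n_rounds, seed=0):
    """Model-guided screening with a greedy acquisition function.

    features : (n, d) array, one row per library member (hypothetical featurization)
    score_fn : callable index -> score; stands in for an expensive docking run
    Returns a dict mapping evaluated library indices to their scores.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]

    # Round 0: evaluate a random subset of the library.
    initial = rng.choice(n, size=init_size, replace=False)
    evaluated = {int(i): score_fn(int(i)) for i in initial}

    for _ in range(n_rounds):
        idx = np.fromiter(evaluated.keys(), dtype=int)
        y = np.array([evaluated[i] for i in idx])

        # Retrain the surrogate on everything evaluated so far.
        w = fit_ridge(features[idx], y)

        # Predict over the whole library; mask members already evaluated.
        preds = features @ w
        preds[idx] = -np.inf

        # Greedy acquisition: take the batch with the best predicted scores.
        batch = np.argsort(preds)[-batch_size:]
        for i in batch:
            evaluated[int(i)] = score_fn(int(i))

    return evaluated


# Synthetic demo: 2000 "molecules" with a noisy linear score (an assumption
# made purely so the example runs without docking software).
rng = np.random.default_rng(42)
features = rng.normal(size=(2000, 16))
true_w = rng.normal(size=16)
scores = features @ true_w + 0.1 * rng.normal(size=2000)

evaluated = greedy_screen(
    features, lambda i: scores[i], init_size=40, batch_size=20, n_rounds=4
)

top_k = set(np.argsort(scores)[-50:].tolist())
found = len(top_k & set(evaluated))
fraction_explored = len(evaluated) / len(scores)
print(f"explored {fraction_explored:.1%} of the library, found {found}/50 top scorers")
```

The greedy strategy uses only the surrogate's predicted mean, so it needs no uncertainty estimate. Other acquisition functions, which the abstract mentions but this sketch does not implement, would replace the `np.argsort(preds)` step.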
