Understanding active learning of molecular docking and its applications
Jeonghyeon Kim, Juno Nam, and Seongok Ryu

TL;DR
This paper investigates how active learning models predict molecular docking scores using only 2D structures, assessing their effectiveness and limitations across multiple receptor targets in virtual screening.
Contribution
It provides a critical analysis of surrogate models' ability to predict docking scores without 3D structural features, highlighting their practical utility and memorization tendencies.
Findings
Surrogate models tend to memorize prevalent structural patterns.
Active learning models are useful in virtual screening for identifying actives.
Models perform well in large-scale screening scenarios.
Abstract
With the advancing capabilities of computational methodologies and resources, ultra-large-scale virtual screening via molecular docking has emerged as a prominent strategy for in silico hit discovery. Given the exhaustive nature of ultra-large-scale virtual screening, active learning methodologies have garnered attention as a means to mitigate computational cost through iterative small-scale docking and machine learning model training. While the efficacy of active learning methodologies has been empirically validated in the literature, it has not yet been critically investigated how surrogate models can predict docking scores without considering three-dimensional structural features, such as receptor conformation and binding poses. In this paper, we thus investigate how active learning methodologies effectively predict docking scores using only 2D structures and under what circumstances they…
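The loop the abstract describes (dock a small batch, train a surrogate on the results, use the surrogate to choose the next batch) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the "docking" oracle is a synthetic linear function, the random bit vectors only stand in for 2D fingerprints, the surrogate is closed-form ridge regression, and the acquisition rule is greedy (dock the molecules with the best predicted score). The names `mock_docking_score`, `fit_ridge`, and `active_learning` are hypothetical.

```python
import numpy as np

def mock_docking_score(fps, weights):
    # Hypothetical stand-in for a real docking run (assumption):
    # a linear function of fingerprint bits, where lower is better.
    return fps @ weights

def fit_ridge(X, y, alpha=1.0):
    # Closed-form ridge regression: solve (X^T X + alpha I) w = X^T y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def active_learning(fps, oracle, n_init=50, batch=50, rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    n = fps.shape[0]
    # Append a constant column so the surrogate can fit an intercept.
    X_all = np.hstack([fps, np.ones((n, 1))])
    # Round 0: dock a random initial subset of the library.
    init = rng.choice(n, size=n_init, replace=False)
    scores = {int(i): float(s) for i, s in zip(init, oracle(fps[init]))}
    for _ in range(rounds):
        idx = np.fromiter(scores.keys(), dtype=int)
        y = np.fromiter(scores.values(), dtype=float)
        # Train the surrogate on everything docked so far.
        w = fit_ridge(X_all[idx], y)
        pred = X_all @ w
        # Exclude molecules that were already docked.
        pred[idx] = np.inf
        # Greedy acquisition: dock the best predicted (lowest) scores.
        picked = np.argsort(pred)[:batch]
        for i, s in zip(picked, oracle(fps[picked])):
            scores[int(i)] = float(s)
    return scores

# Usage on a synthetic library of 2000 "molecules" with 64-bit fingerprints.
rng = np.random.default_rng(42)
library = rng.integers(0, 2, size=(2000, 64)).astype(float)
true_w = rng.normal(size=64)
oracle = lambda x: mock_docking_score(x, true_w)

docked = active_learning(library, oracle)

# Compare against exhaustive "docking" of the whole library.
top100 = set(np.argsort(oracle(library))[:100].tolist())
recovered = len(top100 & set(docked))
print(f"docked {len(docked)} of {len(library)} molecules, "
      f"recovered {recovered} of the true top 100")
```

Because the synthetic oracle is exactly linear in the fingerprint bits, the surrogate has an easy task here; real docking scores depend on receptor conformation and binding pose, which is precisely the gap the paper sets out to examine.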
Taxonomy
Topics: Click Chemistry and Applications · Computational Drug Discovery Methods
Methods: Sparse Evolutionary Training
