Gradual Optimization Learning for Conformational Energy Minimization
Artem Tsypin, Leonid Ugadiarov, Kuzma Khrabrov, Alexander Telepov, Egor Rumiantsev, Alexey Skrynnik, Aleksandr I. Panov, Dmitry Vetrov, Elena Tutubalina, Artur Kadurin

TL;DR
This paper introduces GOLF, a framework for neural-network-based molecular energy minimization that reaches accuracy comparable to a physical simulator while using 50 times less additional training data.
Contribution
The paper proposes GOLF, a novel framework combining efficient data collection and external optimization to significantly improve neural network energy minimization for molecules.
Findings
Neural networks trained with GOLF match physical simulator accuracy.
GOLF reduces the amount of additional training data needed by a factor of 50.
The framework is effective on diverse drug-like molecules.
Abstract
Molecular conformation optimization is crucial to computer-aided drug discovery and materials design. Traditional energy minimization techniques rely on iterative optimization methods that use molecular forces calculated by a physical simulator (oracle) as anti-gradients. However, this is a computationally expensive approach that requires many interactions with a physical simulator. One way to accelerate this procedure is to replace the physical simulator with a neural network. Despite recent progress in neural networks for molecular conformation energy prediction, such models are prone to distribution shift, leading to inaccurate energy minimization. We find that the quality of energy minimization with neural networks can be improved by providing optimization trajectories as additional training data. Still, it takes around additional conformations to match the physical…
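The iterative procedure the abstract describes, where forces returned by an oracle serve as anti-gradients of the energy, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_oracle` is a hypothetical harmonic pair potential standing in for a physical simulator, `minimize` is plain gradient descent, and the step size and iteration count are arbitrary. It is not the paper's optimizer, model, or data pipeline.

```python
import numpy as np


def toy_oracle(positions, r0=1.0, k=1.0):
    """Hypothetical stand-in for a physical simulator (assumption, not from the paper).

    Uses a harmonic potential on every atom pair and returns (energy, forces),
    where forces are the negative gradient of the energy w.r.t. positions.
    """
    n_atoms = len(positions)
    energy = 0.0
    forces = np.zeros_like(positions)
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            delta = positions[i] - positions[j]
            dist = np.linalg.norm(delta)
            energy += 0.5 * k * (dist - r0) ** 2
            grad_i = k * (dist - r0) * delta / dist  # dE/d positions[i]
            forces[i] -= grad_i
            forces[j] += grad_i  # equal and opposite force on atom j
    return energy, forces


def minimize(positions, oracle, step=0.1, n_steps=200):
    """Plain gradient descent: move each atom along its force (the anti-gradient)."""
    x = positions.copy()
    energies = []
    for _ in range(n_steps):
        energy, forces = oracle(x)  # one oracle interaction per step
        energies.append(energy)
        x = x + step * forces
    return x, energies


# Three atoms in a stretched right triangle; each step costs one oracle call.
start = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]])
final, energies = minimize(start, toy_oracle)
print(f"energy: {energies[0]:.4f} -> {energies[-1]:.2e}")
```

Each iteration calls the oracle once, which is why the approach becomes expensive when the oracle is a real physical simulator, and why replacing it with a neural network that predicts energies and forces is attractive.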
Peer Reviews
Decision: ICLR 2024 poster
1. The development of the proposed GOLF is clear. 2. Demonstrated outstanding performance in conformation optimization tasks. 3. Generalization to larger molecules.
1. Dataset Limitation: The paper may be limited by the availability and diversity of datasets used for testing, potentially impacting the generalizability of the results. 2. Complexity: It seems that the complexity of GOLF is not clearly discussed in the paper. 3. Practical Implementation: Though the algorithm is not very complicated, this paper does not release code, which leaves me cautious about the practical implementation and complexity of the algorithm. Overall, while the paper presents v
The proposed method sounds reasonable, and the experiments show its effectiveness. The method also looks easy to implement and can improve conformation energy minimization performance at a small cost.
- The writing is not very clear, especially in the introduction section. It took me a while to understand that simply enriching the training dataset is actually a preliminary baseline method the authors want to compare with. Also, many experimental details are mixed in with the method, which makes the paper hard to read. - The calculation of COV and MAT looks problematic. It seems the authors optimize **one conformation per molecule** and take them of the entire test set as the generation se
Overall, the paper is well written and informative. It seems relevant to improving the computational tractability of conformational energy minimization. The insight to use active learning to address the distribution shift and improve accuracy is valuable. Also, the approach to active learning by using low-cost simulation to augment the training data set without impacting the quality of the subsequent model is somewhat innovative.
The various weaknesses are already detailed above. Here I summarize the most important ones. Using GO to generate the baseline dataset may limit the scalability of the approach. It is unclear why the authors do not use the molecular databases they mention in the “Introduction” section to extract the baseline dataset. Some of the discussion is somewhat cryptic and can benefit from some additional discussion or graphics. The results seem anecdotal, tied to the selected dataset and molecules, and t
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Various Chemistry Research Topics
