Enhancing Accuracy and Feature Insights in Hydration Free Energy Predictions for Small Molecules with Machine Learning
Mingjun Han, Yukai Zhang, Taotao Yu, Guodong Du, ChiYung Yam, and Ho-Kin Tang

TL;DR
This study employs machine learning models with feature analysis to improve the prediction of hydration free energy for small molecules, revealing key molecular features and potential for force field refinement.
Contribution
The paper combines machine learning models with feature analysis to improve hydration free energy predictions and to interpret the molecular factors that influence their accuracy.
Findings
Improved predictive accuracy with MUE of 0.64 kcal/mol using offset-based training.
Molecular geometry and topology are critical features for prediction.
Charge distribution analysis reveals insights into force field inaccuracies.
Abstract
The accurate prediction of solvation free energy is of significant importance as it governs the behavior of solutes in solution. In this work, we apply a variety of machine learning techniques to predict and analyze the alchemical free energy of small molecules. Our methodology incorporates an ensemble of machine learning models with feature processing using the K-nearest neighbors algorithm. Two training strategies are explored: one based on experimental data, and the other based on the offset between molecular dynamics (MD) simulations and experimental measurements. The latter approach yields a substantial improvement in predictive accuracy, achieving a mean unsigned error (MUE) of 0.64 kcal/mol. Feature analysis identifies molecular geometry and topology as the most critical factors in predicting alchemical free energy, supporting the established theory that surface tension is a key…
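The two training strategies described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: a hand-written k-nearest-neighbors regressor stands in for their model ensemble, the two-component descriptors and all free energy values are synthetic, and the names `knn_predict`, `make_toy_molecules`, and `run_demo` are hypothetical. The sketch assumes the offset is defined as the experimental value minus the MD value, so the corrected prediction is the MD value plus the predicted offset.

```python
import math
import random


def mean_unsigned_error(predicted, reference):
    """Mean unsigned error (MUE): the average of |predicted - reference|."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def knn_predict(train_x, train_y, query, k=5):
    """Average the targets of the k training points closest to `query`."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def make_toy_molecules(n, rng):
    """Build synthetic (descriptor, experiment, MD) triples. Not data from the paper."""
    molecules = []
    for _ in range(n):
        x = (rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
        dg_exp = -8.0 * x[0] + 3.0 * x[1]                  # made-up "experimental" value
        dg_md = dg_exp + 1.5 * x[1] + rng.gauss(0.0, 0.1)  # made-up MD value with a systematic error
        molecules.append((x, dg_exp, dg_md))
    return molecules


def run_demo(seed=0):
    rng = random.Random(seed)
    train = make_toy_molecules(200, rng)
    test = make_toy_molecules(50, rng)

    train_x = [x for x, _, _ in train]
    train_exp = [exp for _, exp, _ in train]
    # Assumed offset definition: experiment minus MD.
    train_offset = [exp - md for _, exp, md in train]

    test_exp = [exp for _, exp, _ in test]
    test_md = [md for _, _, md in test]

    # Strategy 1: learn the experimental value directly from the descriptors.
    direct = [knn_predict(train_x, train_exp, x) for x, _, _ in test]
    # Strategy 2: learn the offset, then add it to the MD result.
    corrected = [md + knn_predict(train_x, train_offset, x) for x, _, md in test]

    return {
        "md_only": mean_unsigned_error(test_md, test_exp),
        "direct": mean_unsigned_error(direct, test_exp),
        "offset": mean_unsigned_error(corrected, test_exp),
    }


results = run_demo()
for name, mue in results.items():
    print(f"{name}: MUE = {mue:.3f} (toy units)")
```

Because the toy MD error is constructed to depend smoothly on the descriptors, the offset model is expected to beat the uncorrected MD values here by design. The numbers it prints say nothing about the 0.64 kcal/mol result reported in the paper.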
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods
