Machine Learning of Free Energies in Chemical Compound Space Using Ensemble Representations: Reaching Experimental Uncertainty for Solvation
Jan Weinreich, Nicholas J. Browning, O. Anatole von Lilienfeld

TL;DR
This paper introduces a machine learning model for predicting solvation free energies across chemical space, achieving experimental accuracy with minimal computational effort by using ensemble representations and molecular dynamics sampling.
Contribution
The authors develop a novel Free energy Machine Learning (FML) model that employs Boltzmann-averaged ensemble representations and short MD simulations, reaching experimental uncertainty levels in solvation free energy predictions.
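The idea of an ensemble representation can be illustrated with a minimal sketch. It assumes each sampled configuration of a molecule has already been converted to a fixed-length feature vector, and it uses a plain Gaussian-kernel ridge regression as a stand-in learner. The function names, the feature vectors, and the choice of kernel are assumptions made for illustration, not details taken from the paper. When configurations come from an MD trajectory at temperature T they are already Boltzmann-distributed, so a plain mean over snapshots approximates the Boltzmann average; explicit weights are only needed when configurations were sampled some other way.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_average(features, energies, temperature=298.15):
    """Boltzmann-weighted mean of per-configuration feature vectors.

    features: (n_configs, n_features); energies: (n_configs,) in kcal/mol.
    """
    features = np.asarray(features, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Shift by the minimum energy so the exponentials cannot overflow.
    weights = np.exp(-(energies - energies.min()) / (K_B * temperature))
    weights /= weights.sum()
    return weights @ features

def md_average(features):
    """Plain mean over MD snapshots, which are already Boltzmann-distributed."""
    return np.asarray(features, dtype=float).mean(axis=0)

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma**2))

def krr_fit(X, y, sigma=1.0, lam=1e-8):
    """Solve (K + lam * I) alpha = y for the regression weights."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma=1.0):
    """Predict targets for new ensemble representations."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha
```

In this sketch each molecule contributes one row of `X` (its averaged representation) and one entry of `y` (its free energy), so the learner never sees individual snapshots.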
Findings
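FML's out-of-sample prediction errors for experimental hydration free energies decrease systematically with training set size, reaching experimental uncertainty (0.6 kcal/mol) after training on 490 molecules (80% of FreeSolv).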
FML's accuracy is comparable to state-of-the-art physics-based methods.
The model effectively analyzes solvation across 116k molecules, identifying key structural features.
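A systematic decay of error with training set size is commonly checked on log-log axes, where a power law appears as a straight line. The sketch below fits such a line to (training set size, error) pairs and extrapolates the size needed for a target error. The helper names are hypothetical, and this summary does not say whether the authors fitted their learning curves this way.

```python
import numpy as np

def fit_power_law(train_sizes, errors):
    """Fit error ~ a * N**(-b) by linear regression in log-log space."""
    log_n = np.log(np.asarray(train_sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    slope, intercept = np.polyfit(log_n, log_e, 1)
    return np.exp(intercept), -slope  # prefactor a, decay exponent b

def size_for_target(a, b, target_error):
    """Training set size at which the fitted curve reaches target_error."""
    return (a / target_error) ** (1.0 / b)
```

Any learning-curve data passed to `fit_power_law` should come from out-of-sample errors averaged over several random train/test splits, since single splits can be noisy at small training set sizes.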
Abstract
Free energies govern the behavior of soft and liquid matter, and improving their predictions could have a large impact on the development of drugs, electrolytes or homogeneous catalysts. Unfortunately, it is challenging to devise an accurate description of effects governing solvation such as hydrogen-bonding, van der Waals interactions, or conformational sampling. We present a Free energy Machine Learning (FML) model applicable throughout chemical compound space and based on a representation that employs Boltzmann averages to account for an approximated sampling of configurational space. Using the FreeSolv database, FML's out-of-sample prediction errors of experimental hydration free energies decay systematically with training set size, and experimental uncertainty (0.6 kcal/mol) is reached after training on 490 molecules (80% of FreeSolv). Corresponding FML model errors are also on…
