# Using Deep Graph Neural Networks Improves Physics-Based Hydration Free Energy Predictions Even for Molecules Outside of the Training Set Distribution

**Authors:** Luke H. Elder, Alexey V. Onufriev

PMC · DOI: 10.1021/acs.jpcb.5c02263 · The Journal of Physical Chemistry B · 2025-07-11

## TL;DR

Combining physics-based solvation models with deep neural network corrections improves hydration free energy (HFE) predictions, including for molecules that differ significantly from those seen during training.

## Contribution

A decoupled framework in which deep neural networks (DNNs) correct the residual error of classical physics-based models improves hydration free energy (HFE) predictions, including for out-of-distribution molecules.
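
The decoupled idea can be illustrated with a minimal sketch: the physics model is left untouched, a separate corrector is fitted to its residuals (experiment minus physics), and the correction is added afterwards. Everything below is an assumption made for illustration: the function names are hypothetical, the one-variable least-squares line is only a stand-in for the graph neural network, and the numbers are invented toy values, not data from the paper.

```python
from statistics import mean


def fit_residual_corrector(features, physics_hfe, experimental_hfe):
    """Fit a stand-in corrector on residuals (experiment minus physics).

    Hypothetical placeholder for the DNN: a one-variable least-squares line.
    The physics predictions are inputs only and are never modified.
    """
    residuals = [e - p for e, p in zip(experimental_hfe, physics_hfe)]
    x_bar, r_bar = mean(features), mean(residuals)
    sxx = sum((x - x_bar) ** 2 for x in features)
    sxy = sum((x - x_bar) * (r - r_bar) for x, r in zip(features, residuals))
    slope = sxy / sxx if sxx else 0.0
    intercept = r_bar - slope * x_bar
    return lambda x: slope * x + intercept


def corrected_hfe(physics_value, feature, corrector):
    """Postprocessing step: physics prediction plus the learned correction."""
    return physics_value + corrector(feature)


# Invented toy values in kcal/mol (not taken from the paper).
features = [1.0, 2.0, 3.0, 4.0]           # hypothetical molecular descriptor
physics_hfe = [-2.0, -4.5, -6.0, -9.0]    # physics-based model output
experimental_hfe = [-1.5, -3.5, -4.5, -7.0]

corrector = fit_residual_corrector(features, physics_hfe, experimental_hfe)
corrected = [
    corrected_hfe(p, x, corrector) for p, x in zip(physics_hfe, features)
]
```

Because the corrector only sees residuals, it could in principle be swapped or retrained without touching the physics model, which is the sense in which the two parts are decoupled here.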

## Key findings

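- For out-of-distribution data, where DNNs alone often struggle, physics + DNN models consistently improve on the physics model predictions.
- For in-distribution data, DNN corrections bring the final RMSE below 1 kcal/mol, a relative improvement between 40% and 65% in most cases.
- Removing the 6% of molecules with the highest experimental uncertainty tends to improve the accuracy of physics + DNN models.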
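
The quantities named in these findings can be computed as in the sketch below. It assumes the usual definitions of RMSE and of relative improvement as a percentage reduction in RMSE; the summary does not state the exact formulas used in the paper. The record layout (a dict with an `uncertainty` key) and the rounding rule for how many molecules make up 6% are likewise assumptions.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(rmse_before, rmse_after):
    """Percentage reduction in RMSE (assumed definition)."""
    return 100.0 * (rmse_before - rmse_after) / rmse_before


def drop_most_uncertain(records, fraction=0.06):
    """Return records without the `fraction` with the highest uncertainty.

    `records` is a list of dicts with an 'uncertainty' key (hypothetical
    layout). The number dropped is rounded to the nearest integer, which
    is an assumption rather than the paper's stated rule.
    """
    n_drop = round(fraction * len(records))
    ranked = sorted(records, key=lambda r: r["uncertainty"])
    return ranked[: len(ranked) - n_drop]
```

For example, under this definition an RMSE that falls from 2.0 to 1.0 kcal/mol is a 50% relative improvement.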

## Abstract

The accuracy of computational water models is crucial to atomistic
simulations of biomolecules. Here we explore a decoupled framework
that combines classical physics-based models with deep neural networks
(DNNs) to correct residual error in hydration free energy (HFE) prediction.
Our main goal is to evaluate this framework on out-of-distribution
data (molecules that differ significantly from those used in training),
where DNNs are known to struggle. Several common physics-based solvation
models are used in the evaluation. Graph neural network architectures
are tested for their ability to generalize using multiple data set
splits, including out-of-distribution HFEs and unseen molecular scaffolds.
Our most important finding is that for out-of-distribution data, where
DNNs alone often struggle, the physics + DNN models consistently improve
physics model predictions. For in-distribution data, the DNN corrections
significantly improve the accuracy of physics-based models, with a
final RMSE below 1 kcal/mol and a relative improvement between 40%
and 65% in most cases. The accuracy of physics + DNN models tends
to improve when the 6% of molecules with the highest experimental
uncertainty are removed. This study provides insights into the potential
and limitations of combining physics and machine learning for molecular
modeling, offering a practical and generalizable strategy of using
a DNN as an independent postprocessing correction.
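
The two kinds of data set split mentioned in the abstract can be sketched as follows. This is only one plausible construction: the summary does not say how the paper builds its splits, so the test fraction, the choice to hold out the most negative HFEs, and the precomputed `scaffold` label on each record are all assumptions.

```python
from collections import defaultdict


def split_by_hfe(records, test_fraction=0.2):
    """Hold out the molecules with the most negative HFEs as the test set.

    The test HFEs then lie outside the range seen in training.
    Returns (train, test).
    """
    ranked = sorted(records, key=lambda r: r["hfe"])
    n_test = max(1, round(test_fraction * len(ranked)))
    return ranked[n_test:], ranked[:n_test]


def split_by_scaffold(records, test_fraction=0.2):
    """Assign whole scaffold groups to train or test, never both.

    Each record carries a precomputed 'scaffold' label (hypothetical).
    Smaller groups fill the test set first, so the test set may slightly
    exceed the target size. Returns (train, test).
    """
    groups = defaultdict(list)
    for record in records:
        groups[record["scaffold"]].append(record)

    train, test = [], []
    n_target = test_fraction * len(records)
    for scaffold in sorted(groups, key=lambda s: (len(groups[s]), s)):
        bucket = test if len(test) < n_target else train
        bucket.extend(groups[scaffold])
    return train, test
```

Splitting by whole scaffold groups guarantees that no test scaffold appears in training, which is the property an unseen-scaffold evaluation depends on.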

## Full-text entities

- **Chemicals:** water (MESH:D014867)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12302205/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12302205/full.md

---
Source: https://tomesphere.com/paper/PMC12302205