Thermodynamic Descriptors from Molecular Dynamics as Machine Learning Features for Extrapolable Property Prediction
Nuria H. Espejo, Pablo Llombart, Andrés González de Castilla, Jorge Ramirez, Jorge R. Espinosa, Adiran Garaizar

TL;DR
This paper introduces a physics-augmented machine learning framework using thermodynamic properties from molecular dynamics to improve extrapolation in molecular property prediction, especially for novel and diverse chemical spaces.
Contribution
It replaces traditional structural descriptors with thermodynamic properties derived from MD simulations, enabling better extrapolation to unseen chemical classes and elements.
Findings
Maintains low error when extrapolating to dissimilar chemical spaces
Accurately predicts boiling points for unseen chemical classes
Performs comparably to structure-based models on known compounds
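The findings above concern prediction on chemical classes absent from training. One way to probe that kind of claim is to hold out one entire class per fold. The sketch below assumes such a leave-one-class-out protocol, which this summary does not confirm as the authors' evaluation procedure. The `Molecule` record, its field names, the units, and the class labels are all hypothetical, and the numbers are synthetic placeholders rather than simulation or experimental values.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class Molecule:
    # Hypothetical record; field names and units are assumptions for illustration.
    name: str
    chem_class: str         # label of the chemical class the molecule belongs to
    cohesive_energy: float  # kJ/mol, assumed to come from an MD simulation
    heat_of_vap: float      # kJ/mol, assumed to come from an MD simulation
    density: float          # g/cm^3, assumed to come from an MD simulation
    boiling_point: float    # K, the prediction target


def features(m: Molecule) -> List[float]:
    """Thermodynamic descriptor vector used in place of structural inputs."""
    return [m.cohesive_energy, m.heat_of_vap, m.density]


def leave_one_class_out(
    mols: List[Molecule],
) -> Iterator[Tuple[str, List[Molecule], List[Molecule]]]:
    """Yield (held_out_class, train, test), holding out one whole class per fold."""
    for held_out in sorted({m.chem_class for m in mols}):
        train = [m for m in mols if m.chem_class != held_out]
        test = [m for m in mols if m.chem_class == held_out]
        yield held_out, train, test


# Synthetic placeholder values only; not simulation or experimental data.
MOLS = [
    Molecule("mol_1", "class_A", 30.0, 32.5, 0.80, 340.0),
    Molecule("mol_2", "class_A", 35.0, 37.5, 0.82, 365.0),
    Molecule("mol_3", "class_B", 40.0, 42.5, 0.95, 390.0),
    Molecule("mol_4", "class_B", 45.0, 47.5, 0.98, 410.0),
    Molecule("mol_5", "class_C", 25.0, 27.5, 1.30, 320.0),
]

FOLDS = list(leave_one_class_out(MOLS))
for held_out, train, test in FOLDS:
    X_train = [features(m) for m in train]       # model inputs
    y_train = [m.boiling_point for m in train]   # model targets
    print(held_out, len(X_train), len(test))
```

Any regressor can be fitted on `X_train` and `y_train` inside the loop and scored on the held-out class. Because the test class never appears in training, the per-fold error measures extrapolation rather than interpolation.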
Abstract
The limited extrapolative power of structure-based machine learning (ML) models is a critical bottleneck in chemical discovery, particularly for industrial R&D, where navigating uncharted chemical space to find next-generation materials or drugs is paramount. These models, reliant on structural descriptors or graph neural networks (GNNs), often fail when predicting properties for molecules with novel chemotypes. Here, we introduce a physics-augmented ML framework that overcomes this limitation. Our approach replaces conventional structural inputs with thermodynamic properties such as cohesive energy, heat of vaporization, and density, derived directly from molecular dynamics (MD) simulations. While performing comparably to structure-based models on known organic compounds, our method uniquely maintains low error when extrapolating to dissimilar chemical spaces. Crucially, it accurately…
Taxonomy
Topics: Machine Learning in Materials Science · Crystallography and molecular interactions · Computational Drug Discovery Methods
