Density diversity in training data governs thermodynamic transferability of machine learning interatomic potentials
Minwoo Kim, Seungtae Kim, Je-Yeon Jung, Min Young Ha, Won Bo Lee

TL;DR
This paper shows that diversifying training data density, rather than temperature, improves the transferability of machine learning interatomic potentials across different thermodynamic states, especially for fluids.
Contribution
It establishes density diversity as a key principle for designing thermodynamically transferable MLIPs and validates it through tests of foundation MLIPs, controlled from-scratch training, and distillation experiments.
Findings
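Density-diverse datasets resolve both transferability failure modes: foundation MLIPs trained on solid-state databases failing at gas-like conditions, and molecular-database-trained models failing at liquid-like densities.
Temperature diversity alone cannot cover missing density regimes.
Local coordination topology is more affected by density than by temperature.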
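To make the idea of spending a fixed budget on density diversity concrete, here is a minimal Python sketch. It is not the authors' protocol: the function names, the log-spaced density grid, the even split of configurations, and the water-like numbers in the usage note are all illustrative assumptions. It only shows how a fixed number of training configurations might be spread across densities at a single temperature, and how a target mass density maps to a cubic simulation box size.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def box_length_angstrom(n_molecules, molar_mass_g_mol, density_g_cm3):
    """Edge length (angstrom) of a cubic box holding n_molecules at a mass density."""
    mass_g = n_molecules * molar_mass_g_mol / AVOGADRO
    volume_cm3 = mass_g / density_g_cm3
    return volume_cm3 ** (1.0 / 3.0) * 1e8  # 1 cm = 1e8 angstrom


def log_spaced(lo, hi, n):
    """Return n values from lo to hi, evenly spaced on a log scale."""
    if n == 1:
        return [lo]
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    return [math.exp(math.log(lo) + i * step) for i in range(n)]


def density_diverse_plan(budget, n_densities, rho_lo, rho_hi, temperature_k):
    """Split a fixed budget of configurations over a density grid at one temperature.

    Hypothetical helper: the log spacing and the even split are assumptions
    made for illustration, not choices taken from the paper.
    """
    per_state, remainder = divmod(budget, n_densities)
    plan = []
    for i, rho in enumerate(log_spaced(rho_lo, rho_hi, n_densities)):
        # Hand out any leftover configurations to the first few state points.
        n_cfg = per_state + (1 if i < remainder else 0)
        plan.append(
            {"density_g_cm3": rho, "temperature_k": temperature_k, "n_configs": n_cfg}
        )
    return plan
```

As a usage example, `density_diverse_plan(1000, 6, 0.01, 1.0, 300.0)` would spread 1000 configurations over six densities from gas-like to liquid-like at one temperature, and `box_length_angstrom(64, 18.015, rho)` would give the box edge for 64 water-like molecules at each density. A temperature-diverse plan under the same budget would instead hold the density fixed and vary `temperature_k`, which, according to the findings above, leaves the missing density regimes uncovered.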
Abstract
Machine learning interatomic potentials (MLIPs) offer first-principles accuracy with reduced computational cost, but their transferability across different thermodynamic states remains questionable, particularly for fluid systems where molecules experience local environments far from crystalline equilibrium. Here, we demonstrate that diversifying the density of training configurations, rather than temperature, is the most effective strategy for building thermodynamically transferable MLIPs within a fixed computational budget. We first show that foundation MLIPs trained on solid-state databases accurately describe liquid-like densities but fail at gas-like conditions, while molecular-database-trained models exhibit the opposite behavior. Controlled from-scratch training and distillation experiments confirm that density-diverse datasets resolve both failure modes, whereas…
