Comparing the latent features of universal machine-learning interatomic potentials

Sofiia Chorna; Davide Tisi; Cesare Malosso; Wei Bin How; Michele Ceriotti; and Sanggyu Chong

arXiv:2512.05717·physics.chem-ph·April 20, 2026

TL;DR

This paper systematically analyzes the learned latent features of universal machine-learning interatomic potentials, revealing how they encode chemical space differently and how training choices influence these features.

Contribution

It provides a quantitative assessment of the information content in uMLIP latent features and discusses methods to compress atom-level features into global descriptors.

Findings

1. uMLIPs encode chemical space in significantly distinct ways.
2. Training set and protocol affect feature trends and reconstruction errors.
3. Fine-tuning retains strong pre-training bias in latent features.

Abstract

The past few years have seen the development of "universal" machine-learning interatomic potentials (uMLIPs) capable of approximating the ground-state potential energy surface across a wide range of chemical structures and compositions with reasonable accuracy. While these models differ in the architecture and the dataset used, they share the ability to compress a staggering amount of chemical information into descriptive latent features. Herein, we systematically analyze what the different uMLIPs have learned by quantitatively assessing the relative information content of their latent features with feature reconstruction errors, and observing how the trends are affected by the choice of training set and training protocol. We find that uMLIPs encode the chemical space in significantly distinct ways, with substantial cross-model feature reconstruction errors. When variants of the same…
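
To make the idea of a feature reconstruction error concrete, the following is a minimal sketch, not the authors' exact procedure. It assumes the error is measured by fitting a ridge-regularized linear map from one model's features to another's on a training split and reporting the normalized residual on a held-out split. The function name `reconstruction_error`, its parameters, and the random arrays standing in for latent features are all hypothetical.

```python
import numpy as np


def reconstruction_error(X_src, X_tgt, train_frac=0.5, ridge=1e-8, seed=0):
    """Error of linearly reconstructing X_tgt from X_src on held-out samples.

    Both inputs have shape (n_samples, n_features) and describe the same
    samples in the same order. Values near 0 mean X_src carries (linearly)
    all the information in X_tgt; values near 1 mean it carries almost none.
    Assumption: a linear map with a small ridge term, not the paper's
    exact definition.
    """
    rng = np.random.default_rng(seed)
    n = X_src.shape[0]
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    tr, te = perm[:n_train], perm[n_train:]

    def standardize(X):
        # Center with the training mean and rescale so that the total
        # training variance (summed over features) equals 1.
        Xc = X - X[tr].mean(axis=0)
        scale = np.sqrt((Xc[tr] ** 2).sum() / len(tr))
        return Xc / scale

    A, B = standardize(X_src), standardize(X_tgt)

    # Ridge-regularized least squares fitted on the training split only.
    gram = A[tr].T @ A[tr] + ridge * np.eye(A.shape[1])
    W = np.linalg.solve(gram, A[tr].T @ B[tr])

    # Root-mean-square residual on the held-out split.
    resid = B[te] - A[te] @ W
    return float(np.sqrt((resid ** 2).sum() / len(te)))


# Toy stand-ins for latent features: "poor" is a strict subset of "rich".
rng = np.random.default_rng(0)
rich = rng.normal(size=(200, 6))
poor = rich[:, :2]

err_rich_to_poor = reconstruction_error(rich, poor)  # near zero
err_poor_to_rich = reconstruction_error(poor, rich)  # clearly larger
print(err_rich_to_poor, err_poor_to_rich)
```

The measure is asymmetric by design: a low error from A to B together with a high error from B to A suggests that A contains information that B lacks, which is the kind of relative comparison the abstract describes between models.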
