Benchmarking Compositional Generalisation for Machine Learning Interatomic Potentials

Amir Masoud Nourollah; Irtaza Khalid; Stefano Leoni; Steven Schockaert

arXiv:2605.08988·cs.LG·May 12, 2026

TL;DR

This paper introduces a benchmark to evaluate how well machine learning interatomic potentials generalize to unseen molecules, revealing current models' limitations in compositional generalization.

Contribution

The paper proposes four tasks designed to test compositional generalization in ML interatomic potentials, together with an empirical analysis showing that existing models find these tasks very difficult.

Findings

01. State-of-the-art models perform poorly on out-of-distribution molecules.

02. Errors on unseen molecules are often ten times higher than on training molecules.

03. Pre-trained foundation models still struggle with compositional generalization.
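
To make the second finding concrete, the sketch below shows one plausible way to compute the ratio between force errors on unseen and training molecules. It is an illustration only: the page does not state the paper's exact metric, so the use of mean absolute error over force components, the function names `force_mae` and `generalisation_gap`, and the toy numbers are all assumptions rather than the authors' protocol.

```python
from statistics import fmean

# Hypothetical helper: the error metric (MAE over force components) is an
# assumption; the paper's actual metric is not given on this page.
def force_mae(pred, ref):
    """Mean absolute error over all force components of one molecule.

    pred, ref: lists of per-atom (fx, fy, fz) tuples of equal length.
    """
    diffs = [abs(p - r) for pa, ra in zip(pred, ref) for p, r in zip(pa, ra)]
    return fmean(diffs)

# Hypothetical helper: ratio of average per-molecule error, unseen vs. seen.
def generalisation_gap(seen, unseen):
    """Return mean MAE on unseen molecules divided by mean MAE on seen ones.

    seen, unseen: lists of (predicted_forces, reference_forces) pairs.
    """
    mae_seen = fmean(force_mae(p, r) for p, r in seen)
    mae_unseen = fmean(force_mae(p, r) for p, r in unseen)
    return mae_unseen / mae_seen

# Toy, made-up values (not results from the paper): one single-atom
# "molecule" per split, with zero reference forces.
toy_seen = [([(0.1, 0.0, 0.0)], [(0.0, 0.0, 0.0)])]
toy_unseen = [([(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)])]
ratio = generalisation_gap(toy_seen, toy_unseen)
print(f"unseen/seen force-error ratio: {ratio:.1f}")
```

A ratio near 1 would indicate that errors on held-out molecules match those on training molecules; the finding above corresponds to ratios of roughly 10.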

Abstract

Machine Learning Interatomic Potentials play a fundamental role in computational chemistry and materials science, enabling applications from molecular dynamics simulations to drug design and materials discovery. While recent approaches can estimate inter-atomic forces with high precision, it remains unclear to what extent they can generalise to previously unseen molecules. Do they learn the compositional structure of chemistry, capturing how molecular fragments and their combinations determine properties, or do they primarily learn to interpolate patterns that are specific to the training examples? To address this question, we propose a benchmark consisting of four tasks that require some form of compositional generalisation. In each task, models are tested on molecules that were unseen during training, but the training data is chosen such that generalisation to the test examples should…
