Transformers Discover Molecular Structure Without Graph Priors
Tobias Kreiman, Yutong Bai, Fadi Atieh, Elizabeth Weaver, Eric Qu, Aditi S. Krishnapriyan

TL;DR
This paper shows that Transformers trained directly on Cartesian coordinates can effectively predict molecular energies and forces, learning physically consistent patterns without relying on predefined graph structures.
Contribution
The paper demonstrates that pure Transformers can match GNN performance in molecular modeling, learning meaningful physical patterns without explicit graph priors.
Findings
Transformers achieve competitive energy and force prediction errors.
Attention weights decay inversely with interatomic distance.
Scaling Transformers improves performance predictably.
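To make the attention finding concrete, the following is a minimal sketch of how one might check whether attention weights fall off with interatomic distance. It is not the paper's analysis code. The helper names (`attention_vs_distance`, `pearson`) are hypothetical, and the attention matrix is synthetic (each row is set proportional to 1/distance and normalized), so it only illustrates the measurement, not any result from the paper.

```python
import math

def attention_vs_distance(attn, coords):
    """Collect (distance, attention weight) pairs for all atom pairs i != j."""
    n = len(coords)
    return [(math.dist(coords[i], coords[j]), attn[i][j])
            for i in range(n) for j in range(n) if i != j]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Synthetic example (assumption): four atoms on a line, arbitrary units.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
n = len(coords)

# Synthetic attention (assumption): row i is proportional to 1/distance(i, j),
# with zero self-attention, normalized so each row sums to 1.
attn = []
for i in range(n):
    raw = [0.0 if i == j else 1.0 / math.dist(coords[i], coords[j]) for j in range(n)]
    total = sum(raw)
    attn.append([w / total for w in raw])

pairs = attention_vs_distance(attn, coords)
r = pearson([d for d, _ in pairs], [w for _, w in pairs])
print(f"correlation between distance and attention weight: {r:.3f}")
```

With a trained model, `attn` would instead come from a chosen layer and head, and a negative correlation (or a binned plot of weight against distance) would indicate the kind of distance-dependent decay described above.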
Abstract
Graph Neural Networks (GNNs) are the dominant architecture for molecular machine learning, particularly for molecular property prediction and machine learning interatomic potentials (MLIPs). GNNs perform message passing on predefined graphs often induced by a fixed radius cutoff or k-nearest neighbor scheme. While this design aligns with the locality present in many molecular tasks, a hard-coded graph can limit expressivity due to the fixed receptive field and slow down inference with sparse graph operations. In this work, we investigate whether pure, unmodified Transformers trained directly on Cartesian coordinates, without predefined graphs or physical priors, can approximate molecular energies and forces. As a starting point for our analysis, we demonstrate how to train a Transformer to competitive energy and force mean absolute errors under a matched…
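As an illustration of the predefined graphs the abstract refers to, here is a minimal sketch of the two constructions it names: a fixed radius cutoff and a k-nearest neighbor scheme. The function names, the cutoff value, and the water-like toy geometry are assumptions chosen for illustration and do not come from the paper.

```python
import math

def pairwise_distances(coords):
    """Return an n x n matrix of Euclidean distances between atoms."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def radius_graph(coords, cutoff):
    """Directed edges (i, j) for all distinct atom pairs closer than `cutoff`."""
    d = pairwise_distances(coords)
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(n) if i != j and d[i][j] < cutoff]

def knn_graph(coords, k):
    """Directed edges (i, j) from each atom i to its k nearest neighbors j."""
    d = pairwise_distances(coords)
    n = len(coords)
    edges = []
    for i in range(n):
        # Sort the other atoms by distance to atom i; ties keep index order.
        neighbors = sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])
        edges.extend((i, j) for j in neighbors[:k])
    return edges

# Toy geometry (assumption): a water-like molecule, coordinates in angstroms.
water = [(0.000, 0.000, 0.000),    # O
         (0.757, 0.586, 0.000),    # H
         (-0.757, 0.586, 0.000)]   # H

radius_edges = radius_graph(water, cutoff=1.2)  # keeps O-H pairs, drops H-H
knn_edges = knn_graph(water, k=1)               # one outgoing edge per atom
print(radius_edges)
print(knn_edges)
```

Either rule fixes which atoms can exchange messages before training starts. A Transformer operating on raw coordinates has no such edge list: every atom can attend to every other atom, and any locality has to be learned.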
Taxonomy
Topics: History and advancements in chemistry
