Information Routing in Atomistic Foundation Models: How Task Alignment and Equivariance Shape Linear Disentanglement
Joshua Steier

TL;DR
This paper introduces Compositional Probe Decomposition (CPD) to analyze how molecular models organize geometric and compositional information, revealing task alignment and symmetry as key factors in representation disentanglement.
Contribution
It presents CPD as a new method for measuring geometric accessibility in molecular models and demonstrates the impact of task alignment, data diversity, and symmetry on representation organization.
Findings
Models trained on the HOMO-LUMO gap show higher geometric accessibility after composition removal.
Task alignment significantly improves geometric information retention.
Symmetry types influence how information is routed in representations.
Abstract
What determines whether a molecular property prediction model organizes its representations so that geometric and compositional information can be cleanly separated? We introduce Compositional Probe Decomposition (CPD), which linearly projects out the composition signal and measures how much geometric information remains accessible to a Ridge probe. We validate CPD with four independent checks, including a structural isomer benchmark where compositional projections score at chance while geometric residuals reach 94.6% pairwise classification accuracy. Across ten models from five architectural families on QM9, we find a linear accessibility gradient: models differ by a factor of about 6.6 in geometric information accessible after composition removal (from 0.081 to 0.533 for HOMO-LUMO gap). Three factors explain this gradient. Task alignment dominates: models trained on…
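The core CPD idea described in the abstract (project out composition linearly, then probe the residual with Ridge regression) can be illustrated with a minimal sketch on synthetic data. This is not the authors' implementation. The function names (`residualize`, `ridge_fit`, `cpd_score`), the synthetic "embeddings", the choice to residualize the target as well, and the use of held-out R^2 as the score are all assumptions made for illustration.

```python
import numpy as np

def residualize(X_tr, C_tr, X_te, C_te):
    """Remove the part of X that is linearly predictable from composition C.
    The projection is fit on the training split only (assumed protocol)."""
    A_tr = np.hstack([C_tr, np.ones((len(C_tr), 1))])  # intercept column
    A_te = np.hstack([C_te, np.ones((len(C_te), 1))])
    B, *_ = np.linalg.lstsq(A_tr, X_tr, rcond=None)
    return X_tr - A_tr @ B, X_te - A_te @ B

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge weights; no intercept, since inputs are residuals."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def cpd_score(H, C, y, n_train, alpha=1.0):
    """Held-out R^2 of a ridge probe on composition-residualized embeddings.
    Residualizing the target too is an assumption of this sketch."""
    R_tr, R_te = residualize(H[:n_train], C[:n_train], H[n_train:], C[n_train:])
    y_tr, y_te = residualize(y[:n_train, None], C[:n_train],
                             y[n_train:, None], C[n_train:])
    w = ridge_fit(R_tr, y_tr[:, 0], alpha)
    return r2(y_te[:, 0], R_te @ w)

# Synthetic stand-ins (hypothetical): element counts, a geometric latent,
# and two embedding matrices, one that encodes geometry and one that does not.
rng = np.random.default_rng(0)
n, n_train, d = 600, 400, 8
C = rng.integers(0, 5, size=(n, 3)).astype(float)   # composition features
g = rng.normal(size=(n, 2))                         # geometric latent
y = C @ np.array([1.0, -0.5, 0.3]) + g @ np.array([1.0, 1.0]) \
    + 0.05 * rng.normal(size=n)                     # target property

W_c, W_g = rng.normal(size=(3, d)), rng.normal(size=(2, d))
H_geo = C @ W_c + g @ W_g + 0.05 * rng.normal(size=(n, d))
H_comp = C @ W_c + 0.05 * rng.normal(size=(n, d))   # composition only

score_geo = cpd_score(H_geo, C, y, n_train)
score_comp = cpd_score(H_comp, C, y, n_train)
print(f"geometry-aware embedding:  {score_geo:.3f}")
print(f"composition-only embedding: {score_comp:.3f}")
```

In this toy setup the geometry-aware embedding should retain a high residual score while the composition-only control should fall near zero, mirroring in spirit the contrast the abstract draws between geometric residuals and compositional projections.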
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Electron Microscopy Techniques and Applications · Quantum many-body systems
