UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems
Lin Huang, Arthur Jiang, XiaoLi Liu, Zion Wang, Jason Zhao, Chu Wang, HaoCheng Lu, ChengXiang Huang, JiaJun Cheng, YiYue Du, Jia Zhang

TL;DR
UBio-MolFM is a universal molecular foundation model that combines a large bio-specific dataset, an efficient equivariant transformer, and a curriculum learning protocol to achieve quantum-level accuracy in large biomolecular simulations.
Contribution
The paper introduces UBio-MolFM, a framework that integrates a new bio-specific dataset (UBio-Mol26), a linear-scaling equivariant transformer (E2Former-V2), and a multi-stage training protocol for scalable, accurate molecular simulations.
Findings
Achieves ab initio-level fidelity on systems up to ~1,500 atoms.
Delivers up to ~4x higher inference throughput in large-system benchmarks.
Accurately reproduces microscopic forces and macroscopic observables.
Abstract
All-atom molecular simulation serves as a quintessential "computational microscope" for understanding the machinery of life, yet it remains fundamentally limited by the trade-off between quantum-mechanical (QM) accuracy and biological scale. We present UBio-MolFM, a universal foundation model framework specifically engineered to bridge this gap. UBio-MolFM introduces three synergistic innovations: (1) UBio-Mol26, a large bio-specific dataset constructed via a multi-fidelity "Two-Pronged Strategy" that combines systematic bottom-up enumeration with top-down sampling of native protein environments (up to 1,200 atoms); (2) E2Former-V2, a linear-scaling equivariant transformer that integrates Equivariant Axis-Aligned Sparsification (EAAS) and Long-Short Range (LSR) modeling to capture non-local physics with up to ~4x higher inference throughput in our large-system benchmarks; and (3) a…
