Suiren-1.0 Technical Report: A Family of Molecular Foundation Models
Junyi An, Xinyu Lu, Yun-Fei Shi, Li-Cheng Xu, Nannan Zhang, Chao Qu, Yuan Qi, Fenglei Cao

TL;DR
Suiren-1.0 is a family of molecular foundation models that predict properties of organic molecules, combining 3D conformational data with 2D representations through large-scale pre-training and diffusion-based distillation.
Contribution
The paper introduces Suiren-1.0, which contributes a novel pre-training scheme and a diffusion-based distillation method, and reports state-of-the-art performance on molecular modeling tasks.
Findings
Suiren-Base achieves robust quantum property prediction.
Suiren-Dimer extends capabilities with intermolecular interaction data.
Suiren-ConfAvg provides high-fidelity 2D representations from 3D structures.
Abstract
We introduce Suiren-1.0, a family of molecular foundation models for the accurate modeling of diverse organic systems. Suiren-1.0 comprises three specialized variants (Suiren-Base, Suiren-Dimer, and Suiren-ConfAvg), integrated within an algorithmic framework that bridges the gap between 3D conformational geometry and 2D statistical ensemble spaces. We first pre-train Suiren-Base (1.8B parameters) on a 70M-sample Density Functional Theory dataset using spatial self-supervision and SE(3)-equivariant architectures, achieving robust performance in quantum property prediction. Suiren-Dimer extends this capability through continued pre-training on 13.5M intermolecular interaction samples. To enable efficient downstream application, we propose Conformation Compression Distillation (CCD), a diffusion-based framework that distills complex 3D structural representations into 2D…
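The abstract refers to SE(3)-equivariant architectures. As a minimal illustration of the underlying symmetry only, and not of the Suiren architecture itself, the sketch below checks that interatomic distances are unchanged when a molecule is rotated and translated. The three-atom geometry, the helper names, and the restriction to rotation about the z-axis are all assumptions made for this example.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the same 3D vector."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in points]

def pairwise_distances(points):
    """Distances between all unordered pairs of points, in a fixed order."""
    return [math.dist(p, q)
            for i, p in enumerate(points)
            for q in points[i + 1:]]

# Hypothetical three-atom geometry (coordinates are illustrative only).
coords = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# Apply a rigid motion: a rotation followed by a translation.
moved = translate(rotate_z(coords, 0.7), (1.5, -2.0, 0.3))

before = pairwise_distances(coords)
after = pairwise_distances(moved)

# Distances agree up to floating-point error, so any scalar property
# computed from them is invariant under this rigid motion.
invariant = all(math.isclose(a, b, abs_tol=1e-9)
                for a, b in zip(before, after))
```

A model whose scalar outputs depend on geometry only through such invariant quantities predicts the same value for every rigid placement of the same conformation. Equivariant architectures generalize this idea so that vector-valued outputs, such as forces, rotate along with the input.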
