A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining
Shengchao Liu, Weitao Du, Zhiming Ma, Hongyu Guo, Jian Tang

TL;DR
This paper introduces MoleculeSDE, a novel group symmetric stochastic differential equation model that directly generates 3D geometries from 2D topologies and vice versa, improving molecule representation for drug discovery.
Contribution
MoleculeSDE uses group symmetric SDEs to generate each molecular modality directly in input space rather than in representation space, which tightens the mutual information bound and improves downstream task performance.
Findings
Outperforms 17 baselines on 26 of 32 tasks
Provides tighter mutual information bounds
Enables effective multi-modal molecule pretraining
Abstract
Molecule pretraining has quickly become the go-to schema to boost the performance of AI-based drug discovery. Naturally, molecules can be represented as 2D topological graphs or 3D geometric point clouds. Although most existing pretraining methods focus on only a single modality, recent research has shown that maximizing the mutual information (MI) between these two modalities enhances the molecule representation ability. Meanwhile, existing molecule multi-modal pretraining approaches approximate MI based on the representation space encoded from the topology and geometry, thus resulting in the loss of critical structural information of molecules. To address this issue, we propose MoleculeSDE. MoleculeSDE leverages group symmetric (e.g., SE(3)-equivariant and reflection-antisymmetric) stochastic differential equation models to generate the 3D geometries from 2D topologies, and vice…
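To make the symmetry terms in the abstract concrete, the following is a minimal sketch of what SE(3) symmetry and reflection antisymmetry mean for a set of 3D atom coordinates. It is not the MoleculeSDE implementation. The four-atom geometry and the helper names (`rotate_z`, `translate`, `reflect_x`, `pairwise_distances`, `signed_volume`) are illustrative assumptions. The sketch checks that pairwise distances are unchanged by a rotation plus translation, and that a chirality-like quantity (the signed volume of a tetrahedron) keeps its value under that rigid motion but flips sign under a mirror reflection.

```python
import math

# Hypothetical helpers for illustration only; none of these come from the paper.

def rotate_z(p, theta):
    """Rotate a 3D point about the z-axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def translate(p, t):
    """Shift a 3D point by the vector t."""
    return tuple(a + b for a, b in zip(p, t))

def reflect_x(p):
    """Mirror a 3D point through the yz-plane."""
    return (-p[0], p[1], p[2])

def pairwise_distances(points):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for i, a in enumerate(points) for b in points[i + 1:]]

def signed_volume(a, b, c, d):
    """Signed tetrahedron volume: (b-a) . ((c-a) x (d-a)) / 6."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    return sum(u[i] * cross[i] for i in range(3)) / 6.0

# Assumed toy geometry: four "atoms" at the corners of a right tetrahedron.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A rigid motion (element of SE(3)): rotation followed by translation.
moved = [translate(rotate_z(p, 0.7), (1.5, -2.0, 0.3)) for p in coords]

# A reflection, which is not in SE(3).
mirrored = [reflect_x(p) for p in coords]

same_distances = all(
    math.isclose(x, y, abs_tol=1e-9)
    for x, y in zip(pairwise_distances(coords), pairwise_distances(moved))
)
print("distances preserved by rigid motion:", same_distances)
print("signed volume (original):", signed_volume(*coords))
print("signed volume (rigid motion):", signed_volume(*moved))
print("signed volume (mirrored):", signed_volume(*mirrored))
```

Distances alone cannot tell a molecule from its mirror image, while the signed volume can. This is one way to see why a model that generates 3D geometries may need to treat reflections differently from rotations and translations.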
Taxonomy
Topics: Computational Drug Discovery Methods · Chemical Synthesis and Analysis · Machine Learning in Materials Science
