NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Zhiyuan Liu, Yanchen Luo, Han Huang, Enzhi Zhang, Sihang Li, Junfeng Fang, Yaorui Shi, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua

TL;DR
NExT-Mol combines 1D language models and 3D diffusion models to improve the generation of valid, diverse, and accurate 3D molecules for drug discovery and material design.
Contribution
The paper introduces NExT-Mol, a novel framework that integrates pretrained 1D molecule language models with 3D diffusion models for enhanced 3D molecule generation.
Findings
1D LM outperforms baselines in distributional similarity and validity.
3D diffusion model achieves top performance in conformer prediction.
NExT-Mol improves 3D generation metrics by up to 26% on GEOM-DRUGS.
Abstract
3D molecule generation is crucial for drug discovery and material design. While prior efforts focus on 3D diffusion models for their benefits in modeling continuous 3D conformers, they overlook the advantages of 1D SELFIES-based Language Models (LMs), which can generate 100% valid molecules and leverage the billion-scale 1D molecule datasets. To combine these advantages for 3D molecule generation, we propose a foundation model -- NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation. NExT-Mol uses an extensively pretrained molecule LM for 1D molecule generation, and subsequently predicts the generated molecule's 3D conformers with a 3D diffusion model. We enhance NExT-Mol's performance by scaling up the LM's model size, refining the diffusion neural architecture, and applying 1D to 3D transfer learning. Notably, our 1D molecule LM significantly outperforms…
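The two-stage flow described in the abstract (a 1D LM proposes a molecule, then a 3D model predicts its conformer) can be sketched structurally as below. This is only an illustrative toy, not the authors' implementation: `toy_lm_sample`, `toy_conformer_sampler`, `generate_3d_molecule`, and `Molecule3D` are hypothetical names, the "LM" just draws random SELFIES-style atom tokens, and the "diffusion" loop simply pulls Gaussian noise toward a made-up target geometry rather than using a learned denoiser.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Molecule3D:
    selfies: str                 # 1D string produced by stage 1
    coords: List[List[float]]    # one [x, y, z] per heavy atom, from stage 2


def toy_lm_sample(rng: random.Random) -> List[str]:
    """Stand-in for the pretrained 1D molecule LM (assumption: random tokens)."""
    vocab = ["[C]", "[N]", "[O]"]  # atom-only SELFIES-style tokens
    length = rng.randint(3, 6)
    return [rng.choice(vocab) for _ in range(length)]


def toy_conformer_sampler(num_atoms: int, rng: random.Random,
                          steps: int = 10) -> List[List[float]]:
    """Stand-in for the 3D diffusion model.

    Starts from Gaussian noise and iteratively moves toward a placeholder
    target (atoms on a line, 1.5 units apart). A real model would instead
    predict each denoising step from the molecular graph.
    """
    target = [[1.5 * i, 0.0, 0.0] for i in range(num_atoms)]
    coords = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(num_atoms)]
    for _ in range(steps):
        for atom, goal in zip(coords, target):
            for axis in range(3):
                # Halve the remaining gap to the placeholder target.
                atom[axis] += 0.5 * (goal[axis] - atom[axis])
    return coords


def generate_3d_molecule(seed: int = 0) -> Molecule3D:
    """Stage 1: sample a 1D molecule. Stage 2: predict its 3D coordinates."""
    rng = random.Random(seed)
    tokens = toy_lm_sample(rng)
    coords = toy_conformer_sampler(len(tokens), rng)
    return Molecule3D(selfies="".join(tokens), coords=coords)


if __name__ == "__main__":
    mol = generate_3d_molecule(seed=0)
    print(mol.selfies, len(mol.coords))
```

The point of the sketch is the interface between stages: the 3D step is conditioned on whatever the 1D step produced, so the 1D string fixes the atom count and the 3D step only has to place those atoms. Hydrogens and bonds are ignored here for brevity.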
Taxonomy
Topics: Machine Learning in Materials Science
Methods: Diffusion · Focus
