Chemistry-Enhanced Diffusion-Based Framework for Small-to-Large Molecular Conformation Generation
Yifei Zhu, Jiahui Zhang, Jiawei Peng, Mengge Li, Chao Xu, and Zhenggang Lan

TL;DR
This paper introduces StoL, a diffusion-based framework that rapidly generates large molecular conformations from small-molecule data by assembling chemically valid fragments, eliminating the need for large-molecule training data.
Contribution
StoL is a novel, knowledge-free, fragment-based diffusion model that efficiently generates large molecular structures without large-molecule training data.
Findings
Achieves rapid conformer generation with high chemical validity
Maintains high scalability and transferability to large molecules
Yields structures whose accuracy is confirmed against DFT calculations
Abstract
Obtaining 3D conformations of realistic polyatomic molecules at the quantum chemistry level remains challenging, and although recent machine learning advances offer promise, predicting large-molecule structures still requires substantial computational effort. Here, we introduce StoL, a diffusion model-based framework that enables rapid and knowledge-free generation of large molecular structures from small-molecule data. Remarkably, StoL assembles molecules in a LEGO-style fashion from scratch, without seeing the target molecules or any structures of comparable size during training. Given a SMILES input, it decomposes the molecule into chemically valid fragments, generates their 3D structures with a diffusion model trained on small molecules, and assembles them into diverse conformations. This fragment-based strategy eliminates the need for large-molecule training data while maintaining…
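The decomposition step described in the abstract can be illustrated with a minimal sketch. This is not StoL's actual implementation: the function name `split_into_fragments`, the graph representation (atom indices plus bond pairs), and the hand-picked cut bond are all assumptions made for illustration. The paper's real fragmentation rules, the diffusion model that generates fragment 3D structures, and the assembly step are not shown here. The sketch only demonstrates the general idea of cutting selected bonds in a molecular graph and collecting the resulting connected fragments.

```python
from collections import defaultdict


def split_into_fragments(num_atoms, bonds, cut_bonds):
    """Return connected components after removing the bonds in cut_bonds.

    Hypothetical helper, not part of StoL: bonds and cut_bonds are
    iterables of (i, j) atom-index pairs on a molecular graph.
    """
    cut = {frozenset(b) for b in cut_bonds}
    neighbors = defaultdict(list)
    for i, j in bonds:
        if frozenset((i, j)) not in cut:
            neighbors[i].append(j)
            neighbors[j].append(i)

    seen = set()
    fragments = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first search collects the atoms of one fragment.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for nxt in neighbors[atom]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        fragments.append(sorted(component))
    return fragments


# Toy input (assumption): heavy atoms of biphenyl, two six-membered
# rings joined by the single bond between atoms 5 and 6.
ring_a = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
ring_b = [(6, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 6)]
bonds = ring_a + ring_b + [(5, 6)]

# Cutting the inter-ring bond leaves one fragment per ring.
fragments = split_into_fragments(12, bonds, cut_bonds=[(5, 6)])
print(fragments)
```

In a full pipeline of the kind the abstract describes, each fragment would then be given 3D coordinates by a model trained on small molecules, and the fragments would be rejoined at the cut bonds to form conformations of the whole molecule.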
Taxonomy
Topics: Machine Learning in Materials Science · Computational Drug Discovery Methods · Protein Structure and Dynamics
