Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation
Tuan Le, Julian Cremer, Frank Noé, Djork-Arné Clevert, Kristof Schütt

TL;DR
This paper introduces EQGAT-diff, a novel E(3)-equivariant diffusion model for 3D molecule generation that outperforms existing models in accuracy, efficiency, and transferability, advancing de novo molecular design.
Contribution
The paper presents EQGAT-diff, a new diffusion model that incorporates continuous atom positions and categorical chemical features, with improved training convergence and sample quality.
Findings
EQGAT-diff outperforms established models on QM9 and GEOM-Drugs datasets.
Including chemically motivated features enhances molecule validity.
Fine-tuning on limited data improves performance and transferability.
Abstract
Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery. However, their utility is still limited by suboptimal performance on large molecular structures and limited training data. To address this gap, we explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas. Our extensive comparative analysis evaluates the interplay between continuous and discrete state spaces. From this investigation, we present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets. Notably, EQGAT-diff treats atom positions as continuous and chemical elements and bond types as categorical, and it uses time-dependent loss weighting, which substantially improves training convergence, the quality of generated samples, and inference time. We also…
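The mixed state space described in the abstract (continuous atom positions, categorical chemical features, time-dependent loss weighting) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cosine-style schedule `alpha_bar`, the uniform resampling used for the categorical noise, the clipping bounds in `loss_weight`, and all function names are hypothetical. Bond types would be noised the same way as atom types and are omitted for brevity.

```python
import math
import random


def alpha_bar(t, T):
    """Assumed cosine-style signal level: 1.0 at t=0, close to 0.0 at t=T."""
    return math.cos(0.5 * math.pi * t / T) ** 2


def noise_positions(coords, t, T, rng):
    """Gaussian forward step for continuous coordinates:
    x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, followed by centring."""
    a = alpha_bar(t, T)
    noisy = [
        [math.sqrt(a) * c + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for c in atom]
        for atom in coords
    ]
    # Subtract the centre of mass so the noised sample carries no global translation.
    n = len(noisy)
    mean = [sum(atom[k] for atom in noisy) / n for k in range(3)]
    return [[atom[k] - mean[k] for k in range(3)] for atom in noisy]


def noise_categories(labels, num_classes, t, T, rng):
    """Categorical forward step (assumed uniform kernel): keep each label
    with probability a, otherwise resample it uniformly over all classes."""
    a = alpha_bar(t, T)
    return [lab if rng.random() < a else rng.randrange(num_classes) for lab in labels]


def loss_weight(t, T, lo=0.05, hi=1.5):
    """Time-dependent loss weight, assumed here to be a clipped
    signal-to-noise ratio; the bounds lo and hi are illustrative."""
    a = alpha_bar(t, T)
    snr = a / max(1.0 - a, 1e-12)
    return min(hi, max(lo, snr))


if __name__ == "__main__":
    rng = random.Random(0)
    coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]  # toy 3-atom geometry
    atom_types = [0, 1, 2]  # hypothetical element indices
    t, T = 5, 10
    print(noise_positions(coords, t, T, rng))
    print(noise_categories(atom_types, 5, t, T, rng))
    print(loss_weight(t, T))
```

In a training loop, a weight of this kind would multiply the per-sample loss so that heavily noised time steps contribute less than lightly noised ones; which weighting the authors actually use is not stated in the text above.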
Taxonomy
Topics: Machine Learning in Materials Science · Protein Structure and Dynamics · Computational Drug Discovery Methods
Methods: Diffusion
