A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules
Manuel Ruiz-Botella, Marta Sales-Pardo, Roger Guimerà

TL;DR
CoCoGraph is a collaborative, constrained graph diffusion model that generates chemically valid molecules whose property distributions closely match those of real molecules.
Contribution
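The paper introduces a constrained graph diffusion approach with a collaborative mechanism. It guarantees chemically valid outputs, outperforms state-of-the-art models on standard benchmarks with up to an order of magnitude fewer parameters, and matches real property distributions more closely than current models.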
Findings
Outperforms state-of-the-art models on benchmarks
Used to create a database of 8.2 million synthetically generated molecules
Molecules closely match real chemical property distributions
Abstract
Developing new molecular compounds is crucial to address pressing challenges, from health to environmental sustainability. However, exploring the molecular space to discover new molecules is difficult due to the vastness of the space. Here we introduce CoCoGraph, a collaborative and constrained graph diffusion model capable of generating molecules that are guaranteed to be chemically valid. Thanks to the constraints built into the model and to the collaborative mechanism, CoCoGraph outperforms state-of-the-art approaches on standard benchmarks while requiring up to an order of magnitude fewer parameters. Analysis of 36 chemical properties also demonstrates that CoCoGraph generates molecules with distributions more closely matching real molecules than current models. Leveraging the model's efficiency, we created a database of 8.2 million synthetically generated molecules and conducted a…
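To make "chemically valid" concrete, the following is a minimal sketch of a valence check on a molecular graph. It is not the authors' method: the abstract does not describe how CoCoGraph enforces its constraints. The valence table, the `valence_violations` helper, and the graph encoding (a list of element symbols plus `(i, j, order)` bond tuples) are illustrative assumptions that ignore charges and aromaticity.

```python
# Hypothetical, simplified maximum valences for neutral atoms (assumption,
# not taken from the paper; charges and aromaticity are ignored).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1}


def valence_violations(atoms, bonds):
    """Return indices of atoms whose total bond order exceeds MAX_VALENCE.

    atoms: list of element symbols, e.g. ["C", "O"]
    bonds: list of (i, j, order) tuples with integer bond orders
    """
    used = [0] * len(atoms)
    for i, j, order in bonds:
        # Each bond consumes `order` units of valence on both endpoints.
        used[i] += order
        used[j] += order
    return [k for k, sym in enumerate(atoms) if used[k] > MAX_VALENCE[sym]]


# Formaldehyde: C=O with two hydrogens attached to the carbon.
atoms = ["C", "O", "H", "H"]
valid_bonds = [(0, 1, 2), (0, 2, 1), (0, 3, 1)]

# Same atoms, but with a C-O triple bond: both C and O are over-bonded.
invalid_bonds = [(0, 1, 3), (0, 2, 1), (0, 3, 1)]

print(valence_violations(atoms, valid_bonds))
print(valence_violations(atoms, invalid_bonds))
```

A generator that only ever proposes graph edits leaving this list empty would produce valid molecules by construction, which is one way to read the "guaranteed" claim; whether CoCoGraph works this way is not stated in the text above.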
Taxonomy
Topics: Computational Drug Discovery Methods · Advanced Graph Neural Networks · Machine Learning in Materials Science
Methods: Diffusion
