Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation
Cai Zhou, Zijie Chen, Zian Li, Jike Wang, Kaiyi Jiang, Pan Li, Rose Yu, Muhan Zhang, Stephen Bates, Tommi Jaakkola

TL;DR
This paper introduces a canonicalization approach for diffusion models on data with group symmetries: each sample is mapped to a canonical form before modeling, which improves efficiency and performance in molecular graph generation.
Contribution
The paper provides a formal theory of canonical diffusion, proves its universality and expressivity, and applies it to molecular graph generation with state-of-the-art results.
Findings
Canonical diffusion outperforms equivariant baselines in 3D molecule generation.
Canonicalization accelerates training and reduces complexity.
Canonical diffusion achieves state-of-the-art performance on the GEOM-DRUG dataset.
Abstract
Many generative tasks in chemistry and science involve distributions invariant to group symmetries (e.g., permutation and rotation). A common strategy enforces invariance and equivariance through architectural constraints such as equivariant denoisers and invariant priors. In this paper, we challenge this tradition through the alternative canonicalization perspective: first map each sample to an orbit representative with a canonical pose or order, train an unconstrained (non-equivariant) diffusion or flow model on the canonical slice, and finally recover the invariant distribution by sampling a random symmetry transform at generation time. Building on a formal quotient-space perspective, our work provides a comprehensive theory of canonical diffusion by proving: (i) the correctness, universality and superior expressivity of canonical generative models over invariant targets; (ii)…
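The three-step recipe in the abstract (canonicalize, train an unconstrained model on the canonical slice, apply a random symmetry transform at generation time) can be illustrated with a minimal sketch for 3D point clouds. Everything here is an assumption made for illustration: the PCA-based pose with third-moment sign fixing, the lexicographic ordering, and the names `canonicalize`, `random_rotation`, and `sample_invariant` are hypothetical and are not taken from the paper, whose actual canonicalization may differ. Atom types are omitted, and the trained model is represented by a placeholder callable.

```python
import numpy as np

def canonicalize(x):
    """Map an (n, 3) point cloud to a canonical pose and order (illustrative choice)."""
    x = x - x.mean(axis=0)                       # remove translation
    # Principal axes: eigenvectors of the 3x3 scatter matrix, largest variance first.
    _, vecs = np.linalg.eigh(x.T @ x)
    rot = vecs[:, ::-1].copy()
    # Fix the sign of the first two axes using the third moment along each axis.
    signs = np.sign(((x @ rot[:, :2]) ** 3).sum(axis=0))
    signs[signs == 0] = 1.0
    rot[:, :2] *= signs
    # Third axis from a cross product so that rot is a proper rotation (det = +1).
    rot[:, 2] = np.cross(rot[:, 0], rot[:, 1])
    y = x @ rot
    # Canonical order: sort points by first coordinate, then second, then third.
    order = np.lexsort((y[:, 2], y[:, 1], y[:, 0]))
    return y[order]

def random_rotation(rng):
    """Draw a uniformly random 3D rotation via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))                  # make the factorization unique
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]                       # force det = +1
    return q

def sample_invariant(sample_canonical, rng):
    """Draw from the canonical model, then apply a random rotation and permutation."""
    x = sample_canonical(rng)                    # placeholder for the trained model
    x = x @ random_rotation(rng).T
    return x[rng.permutation(len(x))]
```

In this sketch, training data would be passed through `canonicalize` before fitting any non-equivariant diffusion or flow model, and `sample_invariant` would wrap that model's sampler. The pose is well defined only for generic inputs (distinct principal variances and nonzero third moments); degenerate or highly symmetric point clouds would need additional tie-breaking.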
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Materials Science · Computational Drug Discovery Methods
