TL;DR
This paper unifies discrete, Gaussian, and simplicial diffusion models under a single framework based on the Wright-Fisher process, enabling stable, multi-domain diffusion modeling.
Contribution
The paper introduces a unified theory connecting discrete, Gaussian, and simplicial diffusion, allowing flexible, stable models that perform well across multiple data domains.
Findings
Wright-Fisher simplicial diffusion is more numerically stable than previous simplicial diffusion models.
Unified models trained on multiple domains are competitive with single-domain models.
The theory enables switching between diffusion types without retraining.
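For readers unfamiliar with the Wright-Fisher process named in the TL;DR, here is a minimal sketch of the textbook discrete-generation Wright-Fisher model with uniform mutation. It is not the paper's algorithm. The function names, the mutation rate, and the population sizes are illustrative assumptions. The sketch only shows that the same resampling rule yields a one-hot (purely discrete) state when the population is 1 and a frequency vector on the simplex when the population is larger.

```python
import random

def wright_fisher_step(counts, mutation_rate, rng):
    """One generation of a neutral Wright-Fisher model with uniform mutation.

    counts[i] is the number of individuals of type i (hypothetical setup).
    """
    n = sum(counts)
    k = len(counts)
    # Parent frequencies mixed with a uniform mutation distribution.
    probs = [(1 - mutation_rate) * c / n + mutation_rate / k for c in counts]
    # Each of the n offspring picks its type independently (multinomial resampling).
    draws = rng.choices(range(k), weights=probs, k=n)
    new_counts = [0] * k
    for d in draws:
        new_counts[d] += 1
    return new_counts

def simulate(start_type, num_types, population, steps, mutation_rate=0.05, seed=0):
    """Start with every individual at start_type and return final type frequencies."""
    rng = random.Random(seed)
    counts = [0] * num_types
    counts[start_type] = population
    for _ in range(steps):
        counts = wright_fisher_step(counts, mutation_rate, rng)
    return [c / population for c in counts]

# Population of 1: the state is always a vertex of the simplex (a single type).
freqs_small = simulate(0, 4, population=1, steps=20)
# Larger population: the state is a frequency vector on the simplex.
freqs_large = simulate(0, 4, population=1000, steps=20)
print(freqs_small)
print(freqs_large)
```

In this toy setup the population size is the only knob that moves the state between a discrete token and a point on the simplex, which gives some intuition for why a single process could plausibly relate the different diffusion types.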
Abstract
To model discrete sequences such as DNA, proteins, and language using diffusion, practitioners must choose between three major methods: diffusion in discrete space, Gaussian diffusion in Euclidean space, or diffusion on the simplex. Despite their shared goal, these models have disparate algorithms, theoretical structures, and tradeoffs: discrete diffusion has the most natural domain, Gaussian diffusion has more mature algorithms, and diffusion on the simplex in principle combines the strengths of the other two but in practice suffers from numerically unstable stochastic processes. Ideally, we could see each of these models as instances of the same underlying framework and enable practitioners to switch between models for downstream applications. However, previous theories have only considered connections in special cases. Here we build a theory unifying all three methods of discrete…