Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
Samuel Lavoie, Michael Noukhovitch, Aaron Courville

TL;DR
This paper introduces Discrete Latent Codes (DLCs), a novel image representation for diffusion models that enhances sample fidelity, enables out-of-distribution generation, and facilitates text-to-image synthesis by leveraging large-scale language models.
Contribution
The paper proposes DLCs, a discrete, compositional image representation that improves diffusion model performance and enables new capabilities like out-of-distribution and text-to-image generation.
Findings
Achieved state-of-the-art results on ImageNet for unconditional image generation.
Enabled out-of-distribution image synthesis with coherent semantic composition.
Demonstrated effective text-to-image generation using DLCs and language models.
Abstract
We argue that diffusion models' success in modeling complex distributions is, for the most part, coming from their input conditioning. This paper investigates the representation used to condition diffusion models from the perspective that ideal representations should improve sample fidelity, be easy to generate, and be compositional to allow out-of-training samples generation. We introduce Discrete Latent Code (DLC), an image representation derived from Simplicial Embeddings trained with a self-supervised learning objective. DLCs are sequences of discrete tokens, as opposed to the standard continuous image embeddings. They are easy to generate and their compositionality enables sampling of novel images beyond the training distribution. Diffusion models trained with DLCs have improved generation fidelity, establishing a new state-of-the-art for unconditional image generation on ImageNet.…
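To make the phrase "sequences of discrete tokens" concrete, here is a minimal sketch of how a code could be read off a Simplicial Embedding. It rests on assumptions the abstract does not spell out: that the encoder output is split into groups of logits, that each group is passed through a softmax, and that the token for each group is its argmax. The names `simplicial_embedding`, `discrete_latent_code`, `num_tokens`, `vocab_size`, and `temperature` are hypothetical and chosen for illustration; they are not the authors' API.

```python
import math
from typing import List


def simplicial_embedding(
    z: List[float], num_tokens: int, vocab_size: int, temperature: float = 1.0
) -> List[List[float]]:
    """Split z into num_tokens groups of vocab_size logits and softmax each group.

    Assumption: this mirrors the usual description of Simplicial Embeddings,
    where each group lies on a probability simplex.
    """
    if len(z) != num_tokens * vocab_size:
        raise ValueError("len(z) must equal num_tokens * vocab_size")
    groups = []
    for i in range(num_tokens):
        logits = [v / temperature for v in z[i * vocab_size:(i + 1) * vocab_size]]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        groups.append([e / total for e in exps])
    return groups


def discrete_latent_code(groups: List[List[float]]) -> List[int]:
    """Take the argmax of each simplex to obtain one discrete token per group."""
    return [max(range(len(p)), key=p.__getitem__) for p in groups]


# Toy encoder output: 2 tokens, each drawn from a vocabulary of 4.
z = [0.1, 2.0, -1.0, 0.5, 0.3, 0.2, -0.7, 1.5]
sem = simplicial_embedding(z, num_tokens=2, vocab_size=4)
dlc = discrete_latent_code(sem)
print(dlc)  # [1, 3]
```

Under these assumptions, an image is summarized by a short integer sequence such as `[1, 3]`. A sequence of this kind can be produced by any token-level generative model, and tokens from different codes can be recombined.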
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies
