Nested Diffusion Models Using Hierarchical Latent Priors

Xiao Zhang; Ruoxi Jiang; Rebecca Willett; Michael Maire

arXiv:2412.05984 · cs.CV · December 10, 2024

Open Access

TL;DR

This paper presents nested diffusion models that use hierarchical latent priors to improve image generation quality, especially for complex scenes, by progressively generating and conditioning on semantic-level latent variables.
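The progressive generate-then-condition loop described above can be sketched as a toy chain. This is only an illustration under stated assumptions, not the authors' implementation: `sample_level` and `nested_generate` are hypothetical names, the "denoiser" is a hand-written contraction rather than a trained network, and the level sizes are arbitrary.

```python
import random


def sample_level(dim, cond, steps, rng):
    """Toy reverse process for one level.

    Hypothetical stand-in for a trained diffusion model: start from
    Gaussian noise and repeatedly move halfway toward a target derived
    from the conditioning vector, mimicking iterative denoising.
    """
    # The top level has no conditioning and denoises toward 0;
    # lower levels denoise toward the mean of their conditioning.
    target = sum(cond) / len(cond) if cond else 0.0
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(steps):
        x = [xi + 0.5 * (target - xi) for xi in x]
    return x


def nested_generate(level_dims, steps=10, seed=0):
    """Generate one sample per level, from the highest semantic level down.

    The last entry of level_dims plays the role of the image; earlier
    entries play the role of semantic latents (sizes are assumptions).
    """
    rng = random.Random(seed)
    outputs = []
    cond = []
    for dim in level_dims:
        z = sample_level(dim, cond, steps, rng)
        outputs.append(z)
        # Each later level is conditioned on all preceding outputs.
        cond = cond + z
    return outputs


levels = nested_generate([4, 8, 16])
print([len(z) for z in levels])
```

The point of the sketch is the control flow: sampling runs once per level, and each level's conditioning grows with the outputs of the levels above it.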

Contribution

The paper introduces a hierarchical diffusion framework with semantic latent variables, leveraging a pre-trained encoder, to enhance image quality with minimal additional computational cost.

Findings

1. Significant improvement in image quality across multiple datasets.

2. Outperforms baseline conditional systems in unconditional generation.

3. Efficient hierarchical approach with low overhead.

Abstract

We introduce nested diffusion models, an efficient and powerful hierarchical generative framework that substantially enhances the generation quality of diffusion models, particularly for images of complex scenes. Our approach employs a series of diffusion models to progressively generate latent variables at different semantic levels. Each model in this series is conditioned on the output of the preceding higher-level models, culminating in image generation. Hierarchical latent variables guide the generation process along predefined semantic pathways, allowing our approach to capture intricate structural details while significantly improving image quality. To construct these latent variables, we leverage a pre-trained visual encoder, which learns strong semantic visual representations, and modulate its capacity via dimensionality reduction and noise injection. Across multiple datasets,…
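As a rough illustration of building latents by dimensionality reduction and noise injection, the sketch below shrinks a feature vector and then perturbs it. Everything here is an assumption made for illustration: the abstract does not say which reduction method is used, so a Gaussian random projection is swapped in; the feature vector is synthetic rather than the output of a real pre-trained encoder; and `make_latent`, `build_hierarchy`, and the per-level `(out_dim, noise_std)` settings are hypothetical.

```python
import random


def make_latent(features, out_dim, noise_std, rng):
    """Reduce a feature vector to out_dim values, then add Gaussian noise.

    The random projection is a stand-in for whatever reduction the
    paper uses; a real setup would fix the projection once rather than
    redraw it on every call.
    """
    in_dim = len(features)
    # Gaussian random projection matrix, scaled by 1/sqrt(out_dim).
    proj = [
        [rng.gauss(0.0, 1.0) / out_dim ** 0.5 for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    reduced = [sum(w * f for w, f in zip(row, features)) for row in proj]
    # Noise injection: larger noise_std leaves less usable information.
    return [r + rng.gauss(0.0, noise_std) for r in reduced]


def build_hierarchy(features, level_specs, seed=0):
    """Build one latent per (out_dim, noise_std) pair, coarsest first."""
    rng = random.Random(seed)
    return [make_latent(features, d, s, rng) for d, s in level_specs]


# Synthetic stand-in for encoder features (not real encoder output).
features = [0.1 * i for i in range(32)]
hierarchy = build_hierarchy(features, [(2, 0.5), (8, 0.1)])
print([len(z) for z in hierarchy])
```

In this toy setup, a smaller `out_dim` and a larger `noise_std` both limit how much of the original features a latent can carry, which is one way to read "modulating capacity".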

Taxonomy

Topics: Neural Networks and Applications

Methods: Diffusion