Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models
Qifan Li, Xingyu Zhou, Jinhua Zhang, Weiyi You, Shuhang Gu

TL;DR
This paper introduces a Variance Expansion loss to improve the robustness of latent diffusion models against sampling perturbations, leading to more stable and higher-quality image generation.
Contribution
It proposes a novel Variance Expansion loss that enhances latent space robustness without sacrificing reconstruction quality in diffusion models.
Findings
Improved image quality and stability across architectures
Enhanced robustness to sampling perturbations
Maintained high reconstruction fidelity
Abstract
Latent diffusion models have emerged as the dominant framework for high-fidelity and efficient image generation, owing to their ability to learn diffusion processes in compact latent spaces. However, while previous research has focused primarily on reconstruction accuracy and semantic alignment of the latent space, we observe that another critical factor, robustness to sampling perturbations, also plays a crucial role in determining generation quality. Through empirical and theoretical analyses, we show that the commonly used β-VAE-based tokenizers in latent diffusion models tend to produce overly compact latent manifolds that are highly sensitive to stochastic perturbations during diffusion sampling, leading to visual degradation. To address this issue, we propose a simple yet effective solution that constructs a latent space robust to sampling perturbations while maintaining…
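The sensitivity claim in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method or its loss; it only assumes Gaussian latents and Gaussian sampling noise, and the function name `relative_perturbation` and all numeric values are hypothetical choices made for illustration. It shows that a perturbation of fixed size is relatively larger when the latent distribution is tightly concentrated than when its variance is wider.

```python
import random
import statistics


def relative_perturbation(latent_std, noise_std, n=10_000, seed=0):
    """Ratio of perturbation RMS to latent RMS for toy Gaussian latents.

    Hypothetical helper for illustration only; it does not reproduce the
    paper's tokenizer, sampler, or Variance Expansion loss.
    """
    rng = random.Random(seed)
    # Toy one-dimensional "latents" and the perturbation added during sampling.
    latents = [rng.gauss(0.0, latent_std) for _ in range(n)]
    noise = [rng.gauss(0.0, noise_std) for _ in range(n)]
    latent_rms = statistics.fmean(z * z for z in latents) ** 0.5
    noise_rms = statistics.fmean(e * e for e in noise) ** 0.5
    return noise_rms / latent_rms


# Same perturbation size, two assumed latent spreads.
compact = relative_perturbation(latent_std=0.2, noise_std=0.1)
expanded = relative_perturbation(latent_std=1.0, noise_std=0.1)
print(f"compact latents:  relative perturbation = {compact:.3f}")
print(f"expanded latents: relative perturbation = {expanded:.3f}")
```

Under these assumptions the compact setting sees a relative perturbation roughly five times larger than the expanded one, which matches the intuition that widening latent variance makes a fixed sampling error matter less.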
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Domain Adaptation and Few-Shot Learning
