Spectral Regularization for Diffusion Models
Satish Chandran, Nicolas Roque dos Santos, Yunshu Wu, Greg Ver Steeg, Evangelos Papalexakis

TL;DR
This paper introduces a spectral regularization framework for diffusion models that enhances sample quality by incorporating Fourier and wavelet domain losses, improving multi-scale structure modeling without altering the core diffusion process.
Contribution
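The paper proposes a spectral regularization method that operates at the loss level and acts as a soft inductive bias. It is compatible with existing diffusion formulations (DDPM, DDIM, and EDM) and improves generation quality, most notably on higher-resolution, unconditional datasets.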
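One way to read a loss-level regularizer is as a weighted sum of the usual denoising objective and two spectral penalties. The form below is an assumed generic shape, not the paper's stated objective: the weights \lambda_F and \lambda_W and the names of the individual terms are illustrative.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{diff}}
  + \lambda_F \, \mathcal{L}_{\text{Fourier}}
  + \lambda_W \, \mathcal{L}_{\text{wavelet}},
\qquad \lambda_F, \lambda_W \ge 0
```

Under this reading, setting both weights to zero recovers standard diffusion training, which is what makes the penalties a soft bias rather than a hard constraint.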
Findings
Consistent improvements in sample quality on image and audio generation.
Largest gains on higher-resolution, unconditional datasets.
Negligible additional computational cost.
Abstract
Diffusion models are typically trained using pointwise reconstruction objectives that are agnostic to the spectral and multi-scale structure of natural signals. We propose a loss-level spectral regularization framework that augments standard diffusion training with differentiable Fourier- and wavelet-domain losses, without modifying the diffusion process, model architecture, or sampling procedure. The proposed regularizers act as soft inductive biases that encourage appropriate frequency balance and coherent multi-scale structure in generated samples. Our approach is compatible with DDPM, DDIM, and EDM formulations and introduces negligible computational overhead. Experiments on image and audio generation demonstrate consistent improvements in sample quality, with the largest gains observed on higher-resolution, unconditional datasets where fine-scale structure is most challenging to…
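To make the idea of augmenting a pointwise loss with Fourier- and wavelet-domain terms concrete, here is a minimal sketch on 1D signals using only the Python standard library. It is not the authors' implementation. The function names, the log-magnitude Fourier distance, the Haar wavelet, and the weights `lam_fourier` and `lam_wavelet` are all assumptions chosen for illustration. The code computes loss values only; real training would express the same operations in an autodiff framework so that gradients can flow through them.

```python
import cmath
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def dft_magnitudes(x):
    """Naive O(n^2) DFT; returns the magnitude of every frequency bin."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def haar_level(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def fourier_loss(pred, target):
    """MSE between log-magnitude spectra (assumed distance, for illustration)."""
    log_pred = [math.log1p(m) for m in dft_magnitudes(pred)]
    log_target = [math.log1p(m) for m in dft_magnitudes(target)]
    return mse(log_pred, log_target)


def wavelet_loss(pred, target, levels=2):
    """Average MSE between Haar detail coefficients over `levels` scales.

    Assumes len(pred) == len(target) and the length is divisible by 2**levels.
    """
    total = 0.0
    for _ in range(levels):
        pred, detail_pred = haar_level(pred)
        target, detail_target = haar_level(target)
        total += mse(detail_pred, detail_target)
    return total / levels


def regularized_loss(pred, target, lam_fourier=0.1, lam_wavelet=0.1):
    """Pointwise MSE plus weighted spectral penalties (hypothetical weights)."""
    return (
        mse(pred, target)
        + lam_fourier * fourier_loss(pred, target)
        + lam_wavelet * wavelet_loss(pred, target)
    )


if __name__ == "__main__":
    target = [math.sin(2 * math.pi * t / 8) for t in range(8)]
    # Prediction corrupted by a high-frequency (alternating-sign) error.
    pred = [v + 0.1 * (-1) ** t for t, v in enumerate(target)]
    print("pointwise:", mse(pred, target))
    print("regularized:", regularized_loss(pred, target))
```

In this sketch, `pred` and `target` stand in for whatever quantity the diffusion objective compares (for example, predicted and true noise, or predicted and clean signal). Which quantity the paper regularizes is not stated in the text above.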
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Hearing Loss and Rehabilitation
