Spectrum Matching: a Unified Perspective for Superior Diffusability in Latent Diffusion

Mang Ning; Mingxiao Li; Le Zhang; Lanmiao Liu; Matthew B. Blaschko; Albert Ali Salah; Itir Onal Ertugrul

arXiv:2603.14645·cs.CV·March 17, 2026

Open Access

TL;DR

This paper introduces Spectrum Matching, a unified spectral perspective for improving latent diffusion models by aligning the frequency spectra of images and latents, leading to enhanced generative performance.

Contribution

It proposes the Spectrum Matching hypothesis, combining Encoding and Decoding Spectrum Matching, to improve latent diffusion by spectral alignment, and extends this view to representation alignment with a new DoG-based method.

Findings

01. Spectrum Matching improves diffusion quality on CelebA and ImageNet.

02. Matching spectral properties leads to better latent diffusion performance.

03. The spectral view clarifies prior methods and guides new improvements.

Abstract

In this paper, we study the diffusability (learnability) of variational autoencoders (VAEs) in latent diffusion. First, we show that pixel-space diffusion trained with an MSE objective is inherently biased toward learning low and mid spatial frequencies, and that the power-law power spectral density (PSD) of natural images makes this bias perceptually beneficial. Motivated by this result, we propose the Spectrum Matching Hypothesis: latents with superior diffusability should (i) follow a flattened power-law PSD (Encoding Spectrum Matching, ESM) and (ii) preserve frequency-to-frequency semantic correspondence through the decoder (Decoding Spectrum Matching, DSM). In practice, we apply ESM by matching the PSD between images and latents, and DSM via shared spectral masking with frequency-aligned reconstruction. Importantly, Spectrum Matching provides a unified view that…
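
To make the notion of a power-law PSD concrete, the following is a minimal NumPy sketch of how one might measure a radially averaged PSD and its log-log slope for an image or a latent channel. It is an illustration only, not the authors' implementation: the function names (radial_psd, spectral_slope, power_law_image), the integer radial binning, and the synthetic test signal are all assumptions introduced here. A flatter PSD corresponds to a slope closer to zero.

```python
import numpy as np


def radial_psd(x):
    """Radially averaged power spectral density of a 2D array (hypothetical helper).

    Returns one value per integer radial frequency 1..min(h, w)//2 (DC is dropped).
    """
    h, w = x.shape
    power = np.abs(np.fft.fft2(x)) ** 2 / (h * w)
    # Integer frequency coordinates in the same (unshifted) layout as fft2.
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.hypot(fy[:, None], fx[None, :])
    bins = np.rint(radius).astype(int)
    sums = np.bincount(bins.ravel(), weights=power.ravel())
    counts = np.bincount(bins.ravel())
    max_r = min(h, w) // 2
    psd = sums[: max_r + 1] / counts[: max_r + 1]
    return psd[1:]


def spectral_slope(psd):
    """Least-squares slope of log(PSD) against log(frequency)."""
    freqs = np.arange(1, len(psd) + 1)
    slope, _intercept = np.polyfit(np.log(freqs), np.log(psd), 1)
    return slope


def power_law_image(size, alpha, seed=0):
    """Synthetic field whose PSD falls off roughly as f**(-alpha) (assumption:
    a stand-in for a natural image; alpha=0 gives white noise)."""
    rng = np.random.default_rng(seed)
    f = np.fft.fftfreq(size) * size
    radius = np.hypot(f[:, None], f[None, :])
    radius[0, 0] = 1.0  # avoid division by zero at DC
    amplitude = radius ** (-alpha / 2.0)
    noise = np.fft.fft2(rng.standard_normal((size, size)))
    return np.real(np.fft.ifft2(noise * amplitude))


if __name__ == "__main__":
    for alpha in (2.0, 1.0, 0.0):
        field = power_law_image(128, alpha)
        print(alpha, spectral_slope(radial_psd(field)))
```

Under these assumptions, comparing the slope measured on an image with the slope measured on its latent gives a rough diagnostic of how closely the two spectra agree; how the paper actually turns PSD matching into a training objective is not specified in the text above.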

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Domain Adaptation and Few-Shot Learning