DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Zehong Ma, Longhui Wei, Shuai Wang, Shiliang Zhang, and Qi Tian

arXiv:2511.19365 · cs.CV · April 9, 2026

TL;DR

DeCo is a frequency-decoupled pixel diffusion framework: a diffusion transformer (DiT) models low-frequency semantics while a lightweight pixel decoder generates high-frequency details. The authors report more efficient training and better image quality for pixel-space image generation.

Contribution

The paper proposes a frequency-decoupled pixel diffusion method that pairs a DiT with a lightweight pixel decoder and a frequency-aware flow-matching loss, improving efficiency and performance over existing pixel diffusion models.

Findings

01

Achieves state-of-the-art FID scores on ImageNet: 1.62 at 256×256 and 2.22 at 512×512 resolution.

02

The pretrained text-to-image model scores 0.86 on GenEval, outperforming previous models.

03

Demonstrates that decoupling frequency components enhances pixel diffusion efficiency and quality.

Abstract

Pixel diffusion aims to generate images directly in pixel space in an end-to-end fashion. This approach avoids the limitations of the VAE in two-stage latent diffusion, offering higher model capacity. Existing pixel diffusion models suffer from slow training and inference, as they usually model both high-frequency signals and low-frequency semantics within a single diffusion transformer (DiT). To pursue a more efficient pixel diffusion paradigm, we propose the frequency-DeCoupled pixel diffusion framework. With the intuition to decouple the generation of high- and low-frequency components, we leverage a lightweight pixel decoder to generate high-frequency details conditioned on semantic guidance from the DiT. This frees the DiT to specialize in modeling low-frequency semantics. In addition, we introduce a frequency-aware flow-matching loss that emphasizes visually salient…
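
The abstract above is truncated before it specifies the loss, so the following is only a rough NumPy sketch of what weighting a flow-matching objective by spatial frequency could look like. It is not the authors' formulation. The function names, the linear interpolation path, and the radial weighting that favors low frequencies are all assumptions chosen for illustration.

```python
import numpy as np


def flow_matching_pair(x0, noise, t):
    """Build a noisy sample and its velocity target.

    Assumes the linear path x_t = (1 - t) * x0 + t * noise, whose
    velocity is (noise - x0). Other conventions exist.
    """
    x_t = (1.0 - t) * x0 + t * noise
    v_target = noise - x0
    return x_t, v_target


def radial_frequency_weights(h, w, low_weight=1.0, high_weight=0.25):
    """Hypothetical per-frequency weights, laid out like np.fft.fft2 output.

    The weight falls linearly from low_weight at the DC term to
    high_weight at the highest spatial frequency. This choice is an
    illustrative assumption, not taken from the paper.
    """
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fy ** 2 + fx ** 2)
    radius = radius / radius.max()
    return low_weight + (high_weight - low_weight) * radius


def frequency_weighted_loss(v_pred, v_target, weights):
    """Mean of the weighted squared error measured in the Fourier domain.

    With norm="ortho" the transform preserves energy, so uniform
    weights of 1 reduce this to the ordinary mean squared error.
    """
    err = np.fft.fft2(v_pred - v_target, norm="ortho")
    return float(np.mean(weights * np.abs(err) ** 2))


rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))       # stand-in for a clean image
noise = rng.standard_normal((8, 8))
x_t, v_target = flow_matching_pair(x0, noise, t=0.3)
v_pred = np.zeros_like(v_target)       # placeholder model output
weights = radial_frequency_weights(8, 8)
loss = frequency_weighted_loss(v_pred, v_target, weights)
```

Because the transform is orthonormal, setting every weight to 1 recovers the plain mean squared error, which makes the weighted variant easy to compare against an unweighted baseline.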

Code & Models

Repositories

Zehong-Ma/DeCo (GitHub)
