Generalized Denoising Diffusion Codebook Models (gDDCM): Tokenizing images using a pre-trained diffusion model

Fei Kong

arXiv:2511.13387·cs.CV·December 15, 2025

TL;DR

This paper introduces gDDCM, a unified framework for tokenizing images with pre-trained diffusion models. It works with a wider range of diffusion architectures than DDCM and improves reconstruction quality.

Contribution

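It proposes a generalized theoretical framework and a generic "De-noise and Back-trace" sampling strategy for diffusion-based image tokenization, addressing DDCM's reliance on the discrete-time DDPM architecture and its inefficient sampling in high-noise regions.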

Findings

01

Outperforms DDCM in reconstruction quality and perceptual fidelity

02

Achieves compatibility with multiple diffusion variants

03

Demonstrates effectiveness on CIFAR10 and LSUN datasets

Abstract

Denoising diffusion models have emerged as a dominant paradigm in image generation. Discretizing image data into tokens is a critical step for effectively integrating images with Transformer and other architectures. Although Denoising Diffusion Codebook Models (DDCM) pioneered the use of pre-trained diffusion models for image tokenization, the method relies strictly on the traditional discrete-time DDPM architecture. Consequently, it fails to adapt to modern continuous-time variants, such as Flow Matching and Consistency Models, and suffers from inefficient sampling in high-noise regions. To address these limitations, this paper proposes the Generalized Denoising Diffusion Codebook Models (gDDCM). We establish a unified theoretical framework and introduce a generic "De-noise and Back-trace" sampling strategy. By integrating a deterministic ODE denoising step with a residual-aligned noise…
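The abstract is cut off before it describes the noise-selection step, so the following is only a minimal sketch of one common reading of "residual-aligned" selection in DDCM-style tokenizers: from a fixed codebook of Gaussian noise vectors, pick the one whose inner product with the residual (target minus current estimate) is largest, and record its index as the token. The helper names `make_codebook` and `select_code`, the codebook size, and the flat-list stand-ins for images are all assumptions made for illustration, not the paper's implementation.

```python
import random


def make_codebook(num_codes, dim, seed=0):
    """Hypothetical helper: a fixed codebook of i.i.d. standard Gaussian vectors."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_codes)]


def select_code(codebook, residual):
    """Return the index of the codebook vector best aligned with the residual.

    Alignment is measured by the plain inner product (an assumption here).
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return max(range(len(codebook)), key=lambda k: dot(codebook[k], residual))


# Toy usage: flat lists stand in for the target image and the current estimate.
codebook = make_codebook(num_codes=64, dim=8)
target = [0.5] * 8
estimate = [0.0] * 8
residual = [t - e for t, e in zip(target, estimate)]
token = select_code(codebook, residual)  # integer index, i.e. one discrete token
```

Repeating such a selection once per sampling step would yield a sequence of indices that serves as the image's token sequence; how gDDCM combines this with its ODE denoising step is not recoverable from the truncated abstract.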

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Digital Media Forensic Detection