Accelerating Diffusion Decoders via Multi-Scale Sampling and One-Step Distillation

Chuhan Wang; Hao Chen

arXiv:2603.19570·cs.CV·March 23, 2026

Open Access

TL;DR

This paper introduces a two-stage acceleration framework for diffusion decoders in image tokenization, combining multi-scale sampling and one-step distillation to significantly reduce decoding time while maintaining high image quality.

Contribution

It proposes a novel multi-scale sampling and distillation method that accelerates diffusion-based image decoders by an order of magnitude with minimal quality loss.

Findings

01

Achieves a $10\times$ speedup in decoding time.

02

Maintains high perceptual fidelity in reconstructed images.

03

Provides a scalable framework for efficient image tokenization.

Abstract

Image tokenization plays a central role in modern generative modeling by mapping visual inputs into compact representations that serve as an intermediate signal between pixels and generative models. Diffusion-based decoders have recently been adopted in image tokenization to reconstruct images from latent representations with high perceptual fidelity. In contrast to diffusion models used for downstream generation, these decoders are dedicated to faithful reconstruction rather than content generation. However, their iterative sampling process introduces significant latency, making them impractical for real-time or large-scale applications. In this work, we introduce a two-stage acceleration framework to address this inefficiency. First, we propose a multi-scale sampling strategy, where decoding begins at a coarse resolution and progressively refines the output by doubling the resolution…
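The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It assumes a square image, a fixed number of scales, and nearest-neighbour upsampling between scales. The names `resolution_schedule`, `upsample_2x`, `refine`, and `multiscale_decode` are hypothetical and do not come from the paper. `refine` is an identity placeholder standing in for whatever denoising the decoder runs at each scale. The paper's actual schedule, upsampling operator, and step counts are not given in the abstract.

```python
def resolution_schedule(target, num_scales):
    """Return side lengths from coarse to fine, doubling at each stage."""
    if num_scales < 1:
        raise ValueError("num_scales must be at least 1")
    factor = 2 ** (num_scales - 1)
    if target % factor != 0:
        raise ValueError("target must be divisible by 2 ** (num_scales - 1)")
    coarse = target // factor
    return [coarse * 2 ** i for i in range(num_scales)]


def upsample_2x(image):
    """Nearest-neighbour 2x upsampling of a 2D list of floats."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # duplicate the row to double the height
    return out


def refine(image, steps):
    """Hypothetical stand-in for the denoising steps run at one scale."""
    # Assumption: a real decoder would call its denoiser `steps` times here.
    return image


def multiscale_decode(target, num_scales, steps_per_scale):
    """Decode at the coarsest scale, then alternate upsampling and refinement."""
    schedule = resolution_schedule(target, num_scales)
    # Assumption: start from a blank image; a real sampler would start from noise.
    image = [[0.0] * schedule[0] for _ in range(schedule[0])]
    image = refine(image, steps_per_scale)
    for _ in schedule[1:]:
        image = upsample_2x(image)
        image = refine(image, steps_per_scale)
    return image, schedule


image, schedule = multiscale_decode(target=16, num_scales=3, steps_per_scale=2)
print(schedule, len(image), len(image[0]))
```

Because each stage at half the side length covers a quarter of the pixels, steps spent at coarse scales touch far fewer pixels than steps at full resolution. How much of that translates into wall-clock savings depends on the decoder architecture.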

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Digital Media Forensic Detection