Q-DiT4SR: Exploration of Detail-Preserving Diffusion Transformer Quantization for Real-World Image Super-Resolution

Xun Zhang; Kaicheng Yang; Hongliang Lu; Haotong Qin; Yong Guo; Yulun Zhang

arXiv:2602.01273·cs.CV·May 21, 2026

TL;DR

This paper introduces Q-DiT4SR, a novel post-training quantization framework specifically designed for diffusion transformer-based real-world image super-resolution, achieving state-of-the-art results with significant model size and computation reduction.

Contribution

The paper presents the first PTQ method tailored for DiT-based super-resolution. It combines H-SVD for low-rank approximation with VaSMP and VaTMP for mixed-precision allocation and scheduling.

Findings

01. Achieves state-of-the-art performance on multiple datasets.

02. Reduces model size by 5.8× and computation by 6.14× under the W4A4 setting.

03. Demonstrates effective local texture preservation in quantized models.
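
W4A4 denotes 4-bit weights and 4-bit activations. As a rough illustration of what a 4-bit representation implies, the sketch below applies plain symmetric per-tensor uniform quantization to an array. This is a generic baseline written for this summary, not the Q-DiT4SR quantizer; the function name `fake_quantize` and the per-tensor scaling choice are assumptions.

```python
import numpy as np

def fake_quantize(x, bits=4):
    """Symmetric per-tensor uniform quantization (generic sketch, not the paper's method).

    Returns the dequantized array and the step size used.
    """
    qmax = 2 ** (bits - 1) - 1              # 7 for 4 bits; integer grid is [-8, 7]
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)  # integer codes
    return q * scale, scale

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64))     # hypothetical weight matrix
weights_q, step = fake_quantize(weights, bits=4)
print("distinct values:", len(np.unique(weights_q)))
print("max abs error:", float(np.max(np.abs(weights - weights_q))))
```

With 4 bits, at most 16 distinct values survive, which is why naive quantization tends to erase fine detail and why the paper adds dedicated mechanisms on top.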

Abstract

Recently, Diffusion Transformers (DiTs) have emerged in Real-World Image Super-Resolution (Real-ISR) to generate high-quality textures, yet their heavy inference burden hinders real-world deployment. While Post-Training Quantization (PTQ) is a promising solution for acceleration, existing methods in super-resolution mostly focus on U-Net architectures, whereas generic DiT quantization is typically designed for text-to-image tasks. Directly applying these methods to DiT-based super-resolution models leads to severe degradation of local textures. Therefore, we propose Q-DiT4SR, the first PTQ framework specifically tailored for DiT-based Real-ISR. We propose H-SVD, a hierarchical SVD that integrates a global low-rank branch with a local block-wise rank-1 branch under a matched parameter budget. We further propose Variance-aware Spatio-Temporal Mixed Precision: VaSMP allocates cross-layer…
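
The H-SVD idea in the abstract, a global low-rank branch combined with a local block-wise rank-1 branch, can be illustrated with a minimal NumPy sketch. The abstract does not say how the two branches are fitted, so the choices here are assumptions: the local branch is fitted to the residual left by the global branch, tiles are square with a hypothetical `block` size, and the matched parameter budget is not modeled. All function names are illustrative and do not come from the authors' repository.

```python
import numpy as np

def global_low_rank(W, rank):
    """Best rank-`rank` approximation of W via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vt[:rank]

def blockwise_rank1(R, block):
    """Approximate each (block x block) tile of R by its best rank-1 matrix."""
    approx = np.zeros_like(R)
    rows, cols = R.shape
    for i in range(0, rows, block):
        for j in range(0, cols, block):
            tile = R[i:i + block, j:j + block]
            U, S, Vt = np.linalg.svd(tile, full_matrices=False)
            approx[i:i + block, j:j + block] = S[0] * np.outer(U[:, 0], Vt[0])
    return approx

def hierarchical_svd_sketch(W, rank, block):
    """Global low-rank part plus block-wise rank-1 fit of the residual (assumed design)."""
    global_part = global_low_rank(W, rank)
    local_part = blockwise_rank1(W - global_part, block)
    return global_part, local_part

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 24))           # hypothetical weight matrix
g, l = hierarchical_svd_sketch(W, rank=2, block=8)
print("global-only error:", float(np.linalg.norm(W - g)))
print("global+local error:", float(np.linalg.norm(W - g - l)))
```

Under these assumptions the local branch can only shrink the Frobenius error of the global branch, since each tile of the residual is replaced by its best rank-1 fit. Whether that matches the paper's actual formulation should be checked against the full text.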

Code & Models

Repositories

xunzhang1128/Q-DiT4SR
github