One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution
Yushun Fang, Yuxiang Chen, Shibo Yin, Qiang Hu, Jiangchao Yao, Ya Zhang, Xiaoyun Zhang, Yanfeng Wang

TL;DR
ODTSR is a one-step diffusion transformer for real-world image super-resolution (Real-ISR) that balances fidelity with controllability: prompts can steer the output without sacrificing quality, and the authors report state-of-the-art results.
Contribution
The paper proposes ODTSR, a novel one-step diffusion transformer with a Noise-hybrid Visual Stream and Fidelity-aware Adversarial Training for controllable and high-quality real-world image super-resolution.
Findings
Achieves state-of-the-art performance on Real-ISR tasks.
Enables prompt controllability in challenging scenarios.
Does not require training on specific datasets for control.
Abstract
Recent advances in diffusion-based real-world image super-resolution (Real-ISR) have demonstrated remarkable perceptual quality, yet the balance between fidelity and controllability remains a problem: multi-step diffusion-based methods suffer from generative diversity and randomness, resulting in low fidelity, while one-step methods lose control flexibility due to fidelity-specific finetuning. In this paper, we present ODTSR, a one-step diffusion transformer based on Qwen-Image that performs Real-ISR considering fidelity and controllability simultaneously: a newly introduced visual stream receives low-quality images (LQ) with adjustable noise (Control Noise), and the original visual stream receives LQs with consistent noise (Prior Noise), forming the Noise-hybrid Visual Stream (NVS) design. ODTSR further employs Fidelity-aware Adversarial Training (FAA) to enhance controllability and…
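The two-stream input described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear blend between latent and Gaussian noise, the fixed prior level of 0.5, and the names add_noise, noise_hybrid_inputs, control_strength, and prior_strength are all assumptions made for illustration, and a flat list of floats stands in for an image latent.

```python
import random


def add_noise(latent, strength, rng):
    """Blend a latent with Gaussian noise.

    Assumption: a linear interpolation (1 - s) * x + s * noise.
    The source does not state the actual noising formula.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    return [(1.0 - strength) * v + strength * rng.gauss(0.0, 1.0) for v in latent]


def noise_hybrid_inputs(lq_latent, control_strength, prior_strength=0.5, seed=0):
    """Build the two inputs of a hypothetical noise-hybrid visual stream.

    prior_stream:   LQ latent with a fixed noise level (Prior Noise).
    control_stream: LQ latent with a user-adjustable level (Control Noise).
    The default prior_strength of 0.5 is an arbitrary placeholder.
    """
    rng = random.Random(seed)
    # Draw the prior noise first so it does not depend on control_strength.
    prior_stream = add_noise(lq_latent, prior_strength, rng)
    control_stream = add_noise(lq_latent, control_strength, rng)
    return prior_stream, control_stream


if __name__ == "__main__":
    lq = [0.5, -1.0, 2.0]  # stand-in for an encoded low-quality image
    prior, control = noise_hybrid_inputs(lq, control_strength=0.3)
    print("prior stream:  ", prior)
    print("control stream:", control)
```

Under these assumptions, a control strength of 0 passes the LQ latent through unchanged, while larger values replace more of it with noise. The prior stream is identical for every control setting.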
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Video Quality Assessment · Generative Adversarial Networks and Image Synthesis
