Improving Consistency in Diffusion Models for Image Super-Resolution
Junhao Gu, Peng-Tao Jiang, Hao Zhang, Mi Zhou, Jinwei Chen, Wenming Yang, Bo Li

TL;DR
This paper introduces ConsisSR, a diffusion-based super-resolution method that improves both semantic consistency and training-inference consistency by using a hybrid prompt adapter and time-aware latent augmentation, achieving state-of-the-art results among diffusion-based methods.
Contribution
The paper proposes a new framework, ConsisSR, with a hybrid prompt adapter and time-aware latent augmentation to address semantic and training-inference inconsistencies in diffusion-based image super-resolution.
Findings
Achieves state-of-the-art performance among diffusion models.
Reduces semantic inconsistency by drawing on CLIP image embeddings for guidance.
Bridges the training-inference gap with time-aware latent augmentation (TALA), improving reconstruction quality.
Abstract
Recent methods exploit the powerful text-to-image (T2I) diffusion models for real-world image super-resolution (Real-ISR) and achieve impressive results compared to previous models. However, we observe two kinds of inconsistencies in diffusion-based methods which hinder existing models from fully exploiting diffusion priors. The first is the semantic inconsistency arising from diffusion guidance. T2I generation focuses on semantic-level consistency with text prompts, while Real-ISR emphasizes pixel-level reconstruction from low-quality (LQ) images, necessitating more detailed semantic guidance from LQ inputs. The second is the training-inference inconsistency stemming from the DDPM, which improperly assumes high-quality (HQ) latent corrupted by Gaussian noise as denoising inputs for each timestep. To address these issues, we introduce ConsisSR to handle both semantic and…
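The training-time assumption the abstract refers to can be illustrated with the standard DDPM forward process, in which the denoiser's input at timestep t is the HQ latent scaled and mixed with Gaussian noise. The sketch below is only that standard process, not ConsisSR or its time-aware latent augmentation; the linear beta schedule, the function names, and the toy four-element "latent" are assumptions made for illustration.

```python
import math
import random

def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products alpha_bar_t for a linear beta schedule.

    The schedule and its endpoints are assumed here, not taken from the paper.
    Requires num_steps >= 2.
    """
    alpha_bars = []
    prod = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars

def ddpm_forward(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
alpha_bars = linear_alpha_bar(1000)

# Toy stand-in for an HQ latent (hypothetical values).
hq_latent = [0.5, -1.0, 2.0, 0.0]

# During training, the denoiser would see this noised HQ latent at step t = 500.
x_t = ddpm_forward(hq_latent, alpha_bars[499], rng)
```

At inference no HQ latent is available, so the inputs the denoiser actually receives need not follow this distribution, which is the gap the abstract describes.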
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Photoacoustic and Ultrasonic Imaging
Methods: Diffusion · Contrastive Language-Image Pre-training · Adapter
