Revisiting the Perception-Distortion Trade-off with Spatial-Semantic Guided Super-Resolution
Dan Wang, Haiyan Sun, Shan Du, Z. Jane Wang, Zhaochong An, Serge Belongie, Xinrui Cui

TL;DR
This paper introduces SpaSemSR, a diffusion-based super-resolution framework that uses spatial-semantic guidance to balance perceptual quality and fidelity, effectively reducing distortion and hallucination in high-resolution image reconstruction.
Contribution
We propose a novel spatial-semantic guided diffusion approach with dual guidance mechanisms to improve the perception-distortion trade-off in super-resolution tasks.
Findings
SpaSemSR outperforms existing methods on multiple benchmarks.
It achieves a better balance between perceptual quality and fidelity.
The framework effectively suppresses hallucinations while enhancing realism.
Abstract
Image super-resolution (SR) aims to reconstruct high-resolution images with both high perceptual quality and low distortion, but is fundamentally limited by the perception-distortion trade-off. GAN-based SR methods reduce distortion but still struggle with realistic fine-grained textures, whereas diffusion-based approaches synthesize rich details but often deviate from the input, hallucinating structures and degrading fidelity. This tension raises a key challenge: how to exploit the powerful generative priors of diffusion models without sacrificing fidelity. To address this, we propose SpaSemSR, a spatial-semantic guided diffusion framework with two complementary guidances. First, spatial-grounded textual guidance integrates object-level spatial cues with semantic prompts, aligning textual and visual structures to reduce distortion. Second, semantic-enhanced visual guidance with a…
Taxonomy
Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment
