Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution

Zihang Liu; Zhenyu Zhang; Hao Tang

arXiv:2505.07071 · cs.CV · May 13, 2025

TL;DR

The paper introduces SAMSR, a semantic-guided diffusion framework for single-step image super-resolution that leverages semantic masks and a pixel-wise sampling strategy to enhance detail recovery in complex images.

Contribution

It proposes a novel semantic-guided diffusion model with a semantic noise module, pixel-wise sampling, and a semantic consistency loss for improved super-resolution.

Findings

01. Significantly improves perceptual quality in super-resolution tasks.

02. Enhances detail recovery in semantically complex regions.

03. Outperforms existing diffusion-based super-resolution methods.

Abstract

Diffusion-based image super-resolution (SR) methods have demonstrated remarkable performance. Recent advancements have introduced deterministic sampling processes that reduce inference from 15 iterative steps to a single step, thereby significantly improving the inference speed of existing diffusion models. However, their efficiency remains limited when handling complex semantic regions due to the single-step inference. To address this limitation, we propose SAMSR, a semantic-guided diffusion framework that incorporates semantic segmentation masks into the sampling process. Specifically, we introduce the SAM-Noise Module, which refines Gaussian noise using segmentation masks to preserve spatial and semantic features. Furthermore, we develop a pixel-wise sampling strategy that dynamically adjusts the residual transfer rate and noise strength based on pixel-level semantic weights,…
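
The pixel-wise sampling idea described above can be illustrated with a small sketch. The abstract is truncated here and gives no formulas, so everything below is an assumption rather than the authors' implementation: the function name `pixelwise_state`, the linear mapping from semantic weights to a per-pixel rate `eta`, and the residual-shifting form `x0 + eta * (y - x0) + kappa * sqrt(eta) * eps` are hypothetical stand-ins chosen only to show how a per-pixel "residual transfer rate" and "noise strength" could replace scalar ones.

```python
import numpy as np


def pixelwise_state(x0, y, weights, eps, eta_min=0.0, eta_max=1.0, kappa=2.0):
    """Blend x0 toward y with a per-pixel rate (hypothetical formulation).

    x0, y   : arrays of the same shape (clean image and upsampled LR image)
    weights : per-pixel semantic weights, clipped to [0, 1]
    eps     : noise of the same shape (e.g. mask-refined Gaussian noise)
    """
    weights = np.clip(weights, 0.0, 1.0)
    # Assumed linear map from semantic weight to residual transfer rate.
    eta = eta_min + (eta_max - eta_min) * weights
    # Residual shift plus noise whose strength grows with sqrt(eta) per pixel.
    return x0 + eta * (y - x0) + kappa * np.sqrt(eta) * eps


# Toy usage: a 4x4 image with one "semantic" region in the centre.
rng = np.random.default_rng(0)
x0 = rng.random((4, 4))
y = rng.random((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
eps = rng.standard_normal((4, 4))
x_t = pixelwise_state(x0, y, mask, eps)
```

In this toy setup, pixels outside the masked region receive a rate of zero and stay equal to `x0`, while pixels inside it receive the full residual and the full noise strength. A real system would derive the weights from segmentation masks rather than a hand-drawn square.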

Code & Models

Repositories

liu-zihang/samsr (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Video Quality Assessment · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings