Rethinking Super-Resolution as Text-Guided Details Generation
Chenxi Ma, Bo Yan, Qing Lin, Weimin Tan, Siming Chen

TL;DR
This paper introduces a novel text-guided super-resolution framework that leverages multi-modal fusion to generate high-resolution images with details aligned to textual descriptions, improving realism and semantic accuracy.
Contribution
It proposes a new text-guided super-resolution approach that incorporates text-image fusion to enhance detail generation beyond traditional image-only methods.
Findings
Effective generation of semantically aligned high-resolution images
Improved visual quality with text-guided detail enhancement
Demonstrated superiority over existing super-resolution techniques
Abstract
Deep neural networks have greatly improved the performance of single image super-resolution (SISR). Conventional methods still restore a single high-resolution (HR) solution based only on the image modality. However, image-level information alone is insufficient to predict adequate details and photo-realistic visual quality at large upscaling factors (x8, x16). In this paper, we propose a new perspective that regards SISR as a semantic image detail enhancement problem, with the goal of generating semantically reasonable HR images that are faithful to the ground truth. To enhance the semantic accuracy and visual quality of the reconstructed image, we explore multi-modal fusion learning in SISR by proposing a Text-Guided Super-Resolution (TGSR) framework, which can effectively utilize information from the text and image modalities. Different from existing methods,…
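The abstract describes fusing text and image information but the excerpt is cut off before any architectural detail. As a rough illustration of what text-image fusion can look like in general, the sketch below lets each spatial location of an image feature map attend over a set of word embeddings and adds the resulting text context back to the image feature. This is a generic cross-attention pattern, not the TGSR module itself; the function names (`fuse_pixel`, `fuse_feature_map`), the absence of learned projections, and the toy inputs are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_pixel(pixel_feat, word_feats):
    """Fuse one image feature vector with text features (hypothetical helper).

    The pixel feature acts as the query and the word embeddings act as both
    keys and values. Real models would apply learned projections first; they
    are omitted here to keep the sketch short.
    """
    scale = math.sqrt(len(pixel_feat))
    weights = softmax([dot(pixel_feat, w) / scale for w in word_feats])
    # Attention-weighted sum of the word embeddings.
    context = [
        sum(wt * w[i] for wt, w in zip(weights, word_feats))
        for i in range(len(pixel_feat))
    ]
    # Residual connection: keep the image feature, add the text context.
    fused = [p + c for p, c in zip(pixel_feat, context)]
    return fused, weights


def fuse_feature_map(image_feats, word_feats):
    """Apply fuse_pixel at every location of an H x W x C feature map."""
    return [[fuse_pixel(px, word_feats)[0] for px in row] for row in image_feats]


# Toy inputs (made up): a 2 x 2 feature map with 3 channels, and 2 word embeddings.
image_feats = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
word_feats = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]

fused_map = fuse_feature_map(image_feats, word_feats)
_, weights = fuse_pixel(image_feats[0][0], word_feats)
```

In this toy setup, the top-left location is most similar to the first word embedding, so it receives more of that word's context, while a location that matches neither word splits its attention evenly.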
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Image Fusion Techniques · Image Processing Techniques and Applications
