Sharing Key Semantics in Transformer Makes Efficient Image Restoration

Bin Ren; Yawei Li; Jingyun Liang; Rakesh Ranjan; Mengyuan Liu; Rita Cucchiara; Luc Van Gool; Ming-Hsuan Yang; Nicu Sebe

arXiv:2405.20008 · cs.CV · December 19, 2024 · 1 citation

TL;DR

This paper introduces SemanIR, a novel Transformer-based approach for image restoration that shares key semantics to improve efficiency and accuracy by focusing attention on semantically related regions, achieving state-of-the-art results.

Contribution

The paper proposes sharing a semantic key-dictionary within each Transformer stage to optimize attention computation for image restoration tasks.
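The idea of building a key dictionary once and reusing it across the layers of a stage can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`build_key_dictionary`, `sparse_attention`, `run_stage`), the dot-product similarity, the top-k selection rule, and the parameter `k` are all assumptions made for illustration.

```python
import math

def build_key_dictionary(tokens, k):
    """For each token, list the indices of its k most similar tokens.

    Assumption: similarity is a plain dot product; ties go to the lower index.
    """
    dictionary = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, t)) for t in tokens]
        ranked = sorted(range(len(tokens)), key=lambda j: (-scores[j], j))
        dictionary.append(ranked[:k])
    return dictionary

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_attention(queries, keys, values, dictionary):
    """Attend from each query only to the keys listed in its dictionary entry."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for i, q in enumerate(queries):
        idx = dictionary[i]
        logits = [scale * sum(a * b for a, b in zip(q, keys[j])) for j in idx]
        weights = softmax(logits)
        out.append([
            sum(w * values[j][d] for w, j in zip(weights, idx))
            for d in range(len(values[0]))
        ])
    return out

def run_stage(tokens, k, num_layers):
    """Hypothetical stage: the dictionary is built once and shared by every layer."""
    dictionary = build_key_dictionary(tokens, k)
    x = tokens
    for _ in range(num_layers):
        attended = sparse_attention(x, x, x, dictionary)
        # Residual connection around the attention output.
        x = [[a + b for a, b in zip(row, att)] for row, att in zip(x, attended)]
    return x, dictionary
```

In this sketch the dictionary construction still compares every pair of tokens, but it runs once per stage, while each subsequent attention layer scores only `k` keys per query.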

Findings

01. Achieves linear complexity in attention calculation within each window.

02. Outperforms existing methods across 6 image restoration tasks.

03. Provides qualitative and quantitative improvements over prior approaches.
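To make the linear-complexity finding concrete, the following sketch counts query-key score computations for dense attention versus attention restricted to a fixed number of keys per query. The window size of 16×16 and the value `k = 64` are hypothetical numbers chosen for illustration, not figures from the paper, and the function names are assumptions.

```python
def dense_score_count(num_tokens):
    """Dense self-attention scores every query against every key."""
    return num_tokens * num_tokens

def topk_score_count(num_tokens, k):
    """Each query scores at most k keys, so the count grows linearly in num_tokens."""
    return num_tokens * min(k, num_tokens)

# Hypothetical 16x16 window, i.e. 256 tokens, with 64 keys kept per query.
window_tokens = 16 * 16
dense = dense_score_count(window_tokens)
sparse = topk_score_count(window_tokens, 64)
```

With `k` held fixed, doubling the number of tokens doubles the sparse count but quadruples the dense one.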

Abstract

Image Restoration (IR), a classic low-level vision task, has witnessed significant advancements through deep models that effectively model global information. Notably, the emergence of Vision Transformers (ViTs) has further propelled these advancements. When computing, the self-attention mechanism, a cornerstone of ViTs, tends to encompass all global cues, even those from semantically unrelated objects or regions. This inclusivity introduces computational inefficiencies, particularly noticeable with high input resolution, as it requires processing irrelevant information, thereby impeding efficiency. Additionally, for IR, it is commonly noted that small segments of a degraded image, particularly those closely aligned semantically, provide particularly relevant information to aid in the restoration process, as they contribute essential contextual cues crucial for accurate reconstruction.…

Code & Models

Repositories

amazingren/semanir (PyTorch, official)

Videos

Sharing Key Semantics in Transformer Makes Efficient Image Restoration · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Rough Sets and Fuzzy Logic · Advanced Computational Techniques and Applications

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections