Instruct-IPT: All-in-One Image Processing Transformer via Weight Modulation

Yuchuan Tian; Jianhong Han; Hanting Chen; Yuanyuan Xi; Ning Ding; Jie Hu; Chao Xu; Yunhe Wang

arXiv:2407.00676·cs.CV·December 17, 2024·1 cites

TL;DR

Instruct-IPT is an All-in-One image processing transformer that uses weight modulation with task-specific biases to handle low-level vision tasks with large inter-task gaps, including denoising, deblurring, deraining, dehazing, and desnowing.

Contribution

The paper introduces a novel weight modulation approach with task-specific biases and low-rank decomposition, enabling a single model to excel across multiple distinct image restoration tasks.
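
The idea of a shared weight plus a low-rank, task-specific bias can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the class name `ModulatedLinear`, the factor names `up` and `down`, the initialization, and the task labels are all assumptions made for the example, and the paper's search for task-sensitive weights is omitted.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class ModulatedLinear:
    """Hypothetical linear layer with W_task = W_shared + up[task] @ down[task]."""

    def __init__(self, in_dim, out_dim, rank, tasks, seed=0):
        rng = random.Random(seed)
        # Weight shared by every task, shape (out_dim, in_dim).
        self.weight = [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        # Per-task low-rank factors: down is (rank, in_dim), up is (out_dim, rank).
        self.down = {t: [[rng.gauss(0, 0.1) for _ in range(in_dim)]
                         for _ in range(rank)] for t in tasks}
        # up starts at zero (an assumption), so each task begins at the shared weight.
        self.up = {t: [[0.0] * rank for _ in range(out_dim)] for t in tasks}

    def task_weight(self, task):
        """Return the shared weight plus the low-rank bias for `task`."""
        return add(self.weight, matmul(self.up[task], self.down[task]))

    def forward(self, x, task):
        """Apply the task-modulated weight to an input vector x."""
        return [sum(w * v for w, v in zip(row, x)) for row in self.task_weight(task)]


layer = ModulatedLinear(in_dim=4, out_dim=3, rank=1, tasks=["denoise", "derain"])
x = [1.0, 2.0, 3.0, 4.0]
shared_out = layer.forward(x, "denoise")
layer.up["derain"][0][0] = 1.0  # stand-in for a training update on one task
derain_out = layer.forward(x, "derain")
```

In this sketch each task adds `rank * (in_dim + out_dim)` parameters instead of a full `in_dim * out_dim` copy of the weight, and changing one task's factors leaves the other tasks' outputs untouched.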

Findings

01. Effective multi-task performance on various image restoration tasks.

02. Enhanced cooperation between tasks with distinct characteristics.

03. Extension of the method to diffusion denoisers.

Abstract

Due to the unaffordable size and intensive computation costs of low-level vision models, All-in-One models that are designed to address a handful of low-level vision tasks simultaneously have become popular. However, existing All-in-One models are limited in both the range of tasks they cover and their performance. To overcome these limitations, we propose Instruct-IPT, an All-in-One Image Processing Transformer (IPT) that can effectively address manifold image restoration tasks with large inter-task gaps, such as denoising, deblurring, deraining, dehazing, and desnowing. While most existing work proposes feature adaptation methods, we reveal their failure in addressing highly distinct tasks, and suggest weight modulation, which adapts weights to specific tasks. Firstly, we search for task-sensitive weights and introduce task-specific biases on top of them. Secondly, we conduct rank analysis for a good…

Code & Models

Repositories

huawei-noah/Pretrained-IPT (PyTorch, official)

Taxonomy

Topics: CCD and CMOS Imaging Sensors

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Diffusion · Position-Wise Feed-Forward Layer · Adam