Pre-Trained Image Processing Transformer
Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

TL;DR
This paper introduces the Image Processing Transformer (IPT), a pre-trained model that combines a transformer architecture with contrastive learning and is trained on large-scale corrupted images to handle low-level vision tasks such as denoising and super-resolution.
Contribution
The paper presents a novel pre-trained transformer model for low-level vision tasks, utilizing large-scale corrupted image data and contrastive learning for effective task adaptation.
Findings
IPT outperforms state-of-the-art methods on multiple benchmarks.
A single pre-trained model effectively adapts to various low-level tasks.
Contrastive learning enhances the model's versatility across different image processing tasks.
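As a rough illustration of the contrastive idea mentioned in the findings, the sketch below computes a generic InfoNCE-style loss over feature vectors. It is not taken from the paper: the function names `cosine` and `info_nce`, the temperature value, and the toy vectors are all assumptions made for this example, and the paper's exact loss may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, not the paper's code).

    The loss is small when the anchor is closer to the positive than to
    every negative, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy features: the anchor matches its positive and is orthogonal to the negative.
low = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
# Roles swapped: the "positive" is orthogonal and the negative matches the anchor.
high = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(low < high)
```

In a setting like the one described here, the positive would typically be a feature from a related patch and the negatives features from unrelated images, but that pairing is an assumption of this sketch.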
Abstract
As the computing power of modern hardware increases rapidly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures. In this paper, we study low-level computer vision tasks (e.g., denoising, super-resolution and deraining) and develop a new pre-trained model, namely, the image processing transformer (IPT). To fully exploit the capability of the transformer, we propose using the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multi-heads and multi-tails. In addition, contrastive learning is introduced to adapt well to different image processing tasks. The pre-trained model can therefore…
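The abstract describes generating corrupted image pairs from clean images. A minimal sketch of that idea follows, using only the Python standard library and a tiny synthetic grayscale image. The helper names, the noise level, the scale factor, and the average-pooling downsampler are illustrative assumptions, not the paper's pipeline.

```python
import random

def add_gaussian_noise(img, sigma, rng):
    """Return a noisy copy of a grayscale image (list of rows), clipped to 0..255."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]

def downsample(img, scale):
    """Average-pool by an integer factor (an assumed stand-in for other resamplers)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % scale, scale):
        row = []
        for x in range(0, w - w % scale, scale):
            block = [img[y + dy][x + dx]
                     for dy in range(scale) for dx in range(scale)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def make_pairs(clean, rng):
    """Build (task, corrupted, clean) triples from one clean image.

    Task names and parameters here are hypothetical examples.
    """
    return [
        ("denoise_sigma30", add_gaussian_noise(clean, 30.0, rng), clean),
        ("sr_x2", downsample(clean, 2), clean),
    ]

rng = random.Random(0)
# Synthetic 8x8 "clean" image standing in for a real dataset sample.
clean = [[float((x * 16 + y * 8) % 256) for x in range(8)] for y in range(8)]
pairs = make_pairs(clean, rng)
for task, corrupted, target in pairs:
    print(task, len(corrupted), len(corrupted[0]), len(target), len(target[0]))
```

Each triple pairs a degraded input with its clean target, so one clean image yields supervised examples for several tasks at once.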
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Advanced Neural Network Applications
Methods: Linear Layer · Contrastive Learning · WordPiece · Residual Connection · Attention Dropout · Weight Decay · Multi-Head Attention · Dense Connections · Attention Is All You Need · Adam
