Masked Autoencoders as Image Processors

Huiyu Duan; Wei Shen; Xiongkuo Min; Danyang Tu; Long Teng; Jia Wang; Guangtao Zhai

arXiv:2303.17316·cs.CV·March 31, 2023·5 cites

TL;DR

This paper demonstrates that masked autoencoders, when combined with a new Transformer architecture called CSformer, can effectively pre-train models for a range of low-level image processing tasks, achieving state-of-the-art results.

Contribution

The paper introduces MAEIP, a masked autoencoder architecture tailored for image processing, and a new Transformer model CSformer that enhances low-level vision tasks.

Findings

01 MAEIP pre-training improves performance across various image processing tasks.

02 CSformer achieves state-of-the-art results on denoising, deblurring, and deraining.

03 Masked autoencoders are effective for low-level vision tasks.

Abstract

Transformers have shown significant effectiveness for various vision tasks including both high-level vision and low-level vision. Recently, masked autoencoders (MAE) for feature pre-training have further unleashed the potential of Transformers, leading to state-of-the-art performances on various high-level vision tasks. However, the significance of MAE pre-training on low-level vision tasks has not been sufficiently explored. In this paper, we show that masked autoencoders are also scalable self-supervised learners for image processing tasks. We first present an efficient Transformer model considering both channel attention and shifted-window-based self-attention termed CSformer. Then we develop an effective MAE architecture for image processing (MAEIP) tasks. Extensive experimental results show that with the help of MAEIP pre-training, our proposed CSformer achieves state-of-the-art…
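
The masking step that MAE-style pre-training relies on can be illustrated with a minimal sketch. This is not the authors' MAEIP implementation: the function names, the patch size, and the 0.75 mask ratio are assumptions chosen for illustration (0.75 is the ratio commonly used in the original MAE work on high-level vision; the value used by MAEIP is not stated in the abstract). Zero-filling the masked patches is also only a way to make the mask visible here; an MAE encoder normally sees just the visible patches.

```python
import random


def random_patch_mask(height, width, patch, mask_ratio, seed=0):
    """Split an image grid into patches and pick a random subset to mask.

    Returns (visible, masked): two sorted lists of patch indices.
    All parameter names are hypothetical, chosen for this sketch.
    """
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    rows, cols = height // patch, width // patch
    num_patches = rows * cols
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)  # seeded for reproducibility
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def apply_mask(image, patch, masked, fill=0.0):
    """Return a copy of a 2-D image (list of rows) with masked patches set to `fill`."""
    cols = len(image[0]) // patch
    out = [row[:] for row in image]
    for idx in masked:
        r0, c0 = (idx // cols) * patch, (idx % cols) * patch
        for r in range(r0, r0 + patch):
            for c in range(c0, c0 + patch):
                out[r][c] = fill
    return out


# Toy 8x8 single-channel "image" with strictly positive pixel values.
image = [[float(r * 8 + c + 1) for c in range(8)] for r in range(8)]
visible, masked = random_patch_mask(8, 8, patch=2, mask_ratio=0.75)
corrupted = apply_mask(image, 2, masked)
```

In a pre-training loop, a model would be trained to reconstruct the original pixels of the masked patches from the visible ones; the reconstruction target and loss used by MAEIP are not described in the abstract, so they are left out of this sketch.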

Code & Models

Repositories

duanhuiyu/maeip_csformer
none · Official

Taxonomy

Topics: Image Processing Techniques and Applications · Image Enhancement Techniques · Image and Signal Denoising Methods

Methods: Attention Is All You Need · Masked autoencoder · Dropout · Dense Connections · Linear Layer · Adam · Layer Normalization · Softmax · Residual Connection · Multi-Head Attention