Vision Transformers for Single Image Dehazing

Yuda Song; Zhuqing He; Hui Qian; Xin Du

arXiv:2204.03883 · cs.CV · April 12, 2023 · 35 cites

TL;DR

This paper introduces DehazeFormer, a vision Transformer-based model for single image dehazing. Starting from Swin Transformer, it modifies the normalization layer, activation function, and spatial information aggregation scheme, and it outperforms previous CNN-based methods such as FFA-Net on the SOTS indoor set.

Contribution

The paper finds that several key designs of Swin Transformer are unsuitable for image dehazing and proposes DehazeFormer, a Transformer-based architecture with a modified normalization layer, activation function, and spatial information aggregation scheme.

Findings

01. On the SOTS indoor set, the small DehazeFormer model outperforms FFA-Net with only 25% of the parameters and 5% of the computational cost.

02. The large DehazeFormer model achieves over 40 dB PSNR on the SOTS indoor set.

03. The method effectively handles highly non-homogeneous haze in remote sensing images.
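
For readers unfamiliar with the PSNR figure quoted in the findings, the following is a minimal sketch of how peak signal-to-noise ratio is commonly computed between a haze-free reference and a dehazed estimate. It assumes pixel values flattened into a sequence and scaled to [0, 1]; the function name `psnr` and the `max_value` parameter are illustrative, and the paper's exact evaluation protocol (color space, cropping, value range) is not stated on this page.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Return the peak signal-to-noise ratio in dB.

    `reference` and `estimate` are equal-length sequences of pixel
    values. `max_value` is the largest possible pixel value
    (assumed here to be 1.0; it would be 255 for 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 0.01 on a [0, 1] scale gives an MSE of 1e-4.
reference = [0.5, 0.5, 0.5, 0.5]
estimate = [0.51, 0.51, 0.51, 0.51]
print(round(psnr(reference, estimate), 2))
```

Under these assumptions, a score above 40 dB corresponds to a mean squared error below 1e-4 on a [0, 1] scale, which is roughly a root-mean-square error of less than 1% of the full intensity range.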

Abstract

Image dehazing is a representative low-level vision task that estimates latent haze-free images from hazy images. In recent years, convolutional neural network-based methods have dominated image dehazing. However, vision Transformers, which have recently made a breakthrough in high-level vision tasks, have not yet brought new dimensions to image dehazing. We start with the popular Swin Transformer and find that several of its key designs are unsuitable for image dehazing. To this end, we propose DehazeFormer, which consists of various improvements, such as a modified normalization layer, activation function, and spatial information aggregation scheme. We train multiple variants of DehazeFormer on various datasets to demonstrate its effectiveness. Specifically, on the most frequently used SOTS indoor set, our small model outperforms FFA-Net with only 25% of the parameters (#Param) and 5% of the computational cost. To…

Code & Models

Repositories

IDKiro/DehazeFormer (PyTorch)

Taxonomy

Topics: Image Enhancement Techniques · Video Surveillance and Tracking Methods · Advanced Image Fusion Techniques

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Multi-Head Attention · Stochastic Depth · Dropout · Layer Normalization · Softmax