UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache

Pu Wang; Pengwen Dai; Chen Wu; Yeying Jin; Dianjie Lu; Guijuan Zhang; Youshan Zhang; Zhuoran Zheng

arXiv:2505.14010·cs.CV·May 21, 2025



TL;DR

This paper introduces an efficient transformer-based framework for UHD image dehazing that speeds up training convergence by 5× and reduces memory use while maintaining high dehazing quality, enabling real-time processing of 50 high-resolution images per second on an RTX4090 GPU.

Contribution

The paper proposes two mechanisms, an adaptive normalization scheme inspired by nGPT and an atmospheric scattering-aware KV cache, which together speed up training and reduce memory overhead for UHD image dehazing while achieving state-of-the-art results.

Findings

01  5× faster training convergence

02  Real-time processing of 50 high-resolution images per second on an RTX4090 GPU

03  Maintains state-of-the-art dehazing quality

Abstract

In this paper, we propose an efficient visual transformer framework for ultra-high-definition (UHD) image dehazing that addresses two key challenges of existing methods: slow training and high memory consumption. Our approach introduces two key innovations: 1) an adaptive normalization mechanism (the "an" in anDehazeFormer), inspired by the nGPT architecture, that enables ultra-fast and stable training in a network whose parameter expressions are restricted to a limited range; and 2) an atmospheric scattering-aware KV caching mechanism that dynamically optimizes feature preservation based on the physical haze formation model. The proposed architecture improves training convergence speed by 5× while reducing memory overhead, enabling real-time processing of 50 high-resolution images per second on an RTX4090 GPU. Experimental results show that our approach maintains…
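The "physical haze formation model" mentioned in the abstract is usually written as the atmospheric scattering model, I(x) = J(x) t(x) + A (1 - t(x)), with transmission t(x) = exp(-beta d(x)). The sketch below illustrates only that standard model on single pixel intensities; it is not the paper's KV caching mechanism, and the function names, the default airlight `A = 0.9`, the scattering coefficient `beta`, and the floor `t_min` are illustrative assumptions, not values from the paper.

```python
import math


def transmission(depth, beta=1.0):
    """Transmission t = exp(-beta * depth); beta is an assumed scattering coefficient."""
    return math.exp(-beta * depth)


def add_haze(clear, depth, airlight=0.9, beta=1.0):
    """Forward model: I = J * t + A * (1 - t) for one pixel intensity in [0, 1]."""
    t = transmission(depth, beta)
    return clear * t + airlight * (1.0 - t)


def remove_haze(hazy, depth, airlight=0.9, beta=1.0, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A.

    The floor t_min (an assumed value) avoids dividing by a near-zero
    transmission for very distant pixels, where the inversion is only approximate.
    """
    t = max(transmission(depth, beta), t_min)
    return (hazy - airlight) / t + airlight


if __name__ == "__main__":
    # A tiny hypothetical row of pixels: (clear intensity, scene depth).
    row = [(0.2, 0.5), (0.5, 1.0), (0.8, 1.5)]
    for clear, depth in row:
        hazy = add_haze(clear, depth)
        restored = remove_haze(hazy, depth)
        print(f"clear={clear:.2f} hazy={hazy:.3f} restored={restored:.3f}")
```

In a real dehazing network neither the depth nor the airlight is known in advance; they must be estimated or learned. A practical pipeline would also vectorize these operations over whole image tensors rather than looping over pixels.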


Taxonomy

Topics: Image Enhancement Techniques · Advanced Image Processing Techniques · Digital Media Forensic Detection

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings