Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement
Afrah Shaahid, Muzammil Behzad

TL;DR
This paper introduces UDAN-CLIP, a diffusion-based underwater image enhancement method that combines a vision-language classifier, spatial attention, and a new CLIP-Diffusion loss to produce more realistic and detailed underwater images.
Contribution
The paper proposes a new diffusion framework with a customized classifier, spatial attention, and CLIP-Diffusion loss to improve underwater image enhancement and semantic consistency.
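This summary does not say how the spatial attention module is built. As a rough illustration only, the sketch below assumes a common CBAM-style design: pool across channels with average and max, convolve the two pooled maps into a single map, apply a sigmoid, and rescale the features. The function name `spatial_attention`, the `kernel` and `bias` parameters, and the zero-padding choice are hypothetical and not taken from the paper. In practice the kernel weights would be learned.

```python
import math

def spatial_attention(feat, kernel, bias=0.0):
    """CBAM-style spatial attention on a C x H x W feature map (nested lists).

    kernel is a 2 x k x k weight array (k odd): kernel[0] acts on the
    channel-average map, kernel[1] on the channel-max map. These weights
    are placeholders for illustration; a real model would learn them.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])

    # Pool across the channel axis to get two H x W descriptors.
    avg_map = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)]
               for i in range(H)]
    max_map = [[max(feat[c][i][j] for c in range(C)) for j in range(W)]
               for i in range(H)]
    pooled = [avg_map, max_map]

    k = len(kernel[0])
    pad = k // 2
    attn = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = bias
            # 2-channel convolution with zero padding at the borders.
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            acc += kernel[m][di][dj] * pooled[m][y][x]
            # Sigmoid squashes the response into a (0, 1) attention weight.
            attn[i][j] = 1.0 / (1.0 + math.exp(-acc))

    # Every channel is rescaled by the same per-pixel attention weight.
    out = [[[feat[c][i][j] * attn[i][j] for j in range(W)] for i in range(H)]
           for c in range(C)]
    return out, attn
```

Because the attention map is shared across channels, it can only emphasize or suppress locations, not individual feature channels; that is the property that distinguishes spatial attention from channel attention.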
Findings
Enhanced visual quality and realism in underwater images.
Better preservation of semantic content during enhancement.
Consistent improvements across quantitative metrics and visual assessments.
Abstract
Underwater images are often affected by complex degradations such as light absorption, scattering, color casts, and artifacts, making enhancement critical for effective object detection, recognition, and scene understanding in aquatic environments. Existing methods, especially diffusion-based approaches, typically rely on synthetic paired datasets due to the scarcity of real underwater references, introducing bias and limiting generalization. Furthermore, fine-tuning these models can degrade learned priors, resulting in unrealistic enhancements due to domain shifts. To address these challenges, we propose UDAN-CLIP, an image-to-image diffusion framework pre-trained on synthetic underwater datasets and enhanced with a customized classifier based on a vision-language model, a spatial attention module, and a novel CLIP-Diffusion loss. The classifier preserves natural in-air priors and…
Taxonomy
Topics: Image Enhancement Techniques · Underwater Acoustics Research · Image and Signal Denoising Methods
Methods: Softmax · Attention Is All You Need · Convolution · Max Pooling · Average Pooling · Sigmoid Activation · Diffusion
