Empowering Semantic-Sensitive Underwater Image Enhancement with VLM

Guodong Fan; Shengning Zhou; Genji Yuan; Huiyu Li; Jingchun Zhou; Jinjiang Li

arXiv:2603.12773·cs.CV·March 16, 2026

TL;DR

This paper introduces a semantic-sensitive underwater image enhancement (UIE) method that uses Vision-Language Models (VLMs) to focus restoration on key objects, improving both perceptual quality and downstream task performance.

Contribution

The paper proposes a learning mechanism in which VLMs generate semantic guidance for UIE models, improving their ability to restore key object features accurately.

Findings

01 Significantly improves perceptual quality metrics.

02 Enhances detection and segmentation task performance.

03 Demonstrates adaptability across different UIE baselines.

Abstract

In recent years, learning-based underwater image enhancement (UIE) techniques have rapidly evolved. However, distribution shifts between high-quality enhanced outputs and natural images can hinder semantic cue extraction for downstream vision tasks, thereby limiting the adaptability of existing enhancement models. To address this challenge, this work proposes a new learning mechanism that leverages Vision-Language Models (VLMs) to empower UIE models with semantic-sensitive capabilities. To be concrete, our strategy first generates textual descriptions of key objects from a degraded image via VLMs. Subsequently, a text-image alignment model remaps these relevant descriptions back onto the image to produce a spatial semantic guidance map. This map then steers the UIE network through a dual-guidance mechanism, which combines cross-attention and an explicit alignment loss. This forces the…
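The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration, since the abstract gives no implementation details: the function names (`guidance_map`, `guided_cross_attention`, `alignment_loss`), the cosine-similarity form of the text-image alignment, the choice to build keys and values from guidance-weighted features, and the cosine-based alignment loss are all hypothetical. Random arrays stand in for real patch, text, and UIE features.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def guidance_map(patch_emb, text_emb, eps=1e-8):
    """Assumed alignment step: cosine similarity between each patch embedding
    (N, D) and one text embedding (D,), min-max scaled to [0, 1]."""
    p = patch_emb / (np.linalg.norm(patch_emb, axis=-1, keepdims=True) + eps)
    t = text_emb / (np.linalg.norm(text_emb) + eps)
    sim = p @ t
    return (sim - sim.min()) / (sim.max() - sim.min() + eps)


def guided_cross_attention(feats, guide, w_q, w_k, w_v):
    """Assumed guidance path 1: image features (N, C) attend to
    guidance-weighted features, with a residual connection."""
    q = feats @ w_q
    sem = feats * guide[:, None]  # emphasise key-object locations
    k, v = sem @ w_k, sem @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, N)
    return feats + attn @ v


def alignment_loss(feats, guide, eps=1e-8):
    """Assumed guidance path 2: one minus the cosine similarity between the
    per-location feature magnitude and the guidance map."""
    energy = np.linalg.norm(feats, axis=-1)
    denom = np.linalg.norm(energy) * np.linalg.norm(guide) + eps
    return 1.0 - float(energy @ guide) / denom


# Toy usage on a 4x4 grid of locations with random stand-in embeddings.
rng = np.random.default_rng(0)
N, C, D = 16, 8, 32
patch_emb = rng.standard_normal((N, D))  # stand-in for image patch embeddings
text_emb = rng.standard_normal(D)        # stand-in for the VLM description embedding
feats = rng.standard_normal((N, C))      # stand-in for UIE network features
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

guide = guidance_map(patch_emb, text_emb)
out = guided_cross_attention(feats, guide, w_q, w_k, w_v)
loss = alignment_loss(out, guide)
print(out.shape, round(loss, 4))
```

In this sketch the loss reaches zero only when the feature magnitude is proportional to the guidance map, which is one simple way to express "features should respond most strongly where the key objects are"; the paper's actual loss may differ.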

Taxonomy

Topics: Image Enhancement Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications