# A Deep Feature Fusion Underwater Image Enhancement Model Based on Perceptual Vision Swin Transformer

**Authors:** Shasha Tian, Adisorn Sirikham, Jessada Konpang, Chuyang Wang

PMC · DOI: 10.3390/jimaging12010044 · Journal of Imaging · 2026-01-14

## TL;DR

This paper introduces a U-shaped, Swin-Transformer-based underwater image enhancement model that reduces color cast, restores contrast, and preserves structural detail.

## Contribution

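A U-shaped enhancement framework for underwater image restoration that integrates Swin-Transformer blocks with a Dual-Window Multi-Head Self-Attention (DWMSA) bottleneck, a Global-Aware Attention Map (GAMP), and a Feature-Augmentation Residual Network (FARN).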

## Key findings

- On the UFO-120 and EUVP datasets, the model reports state-of-the-art PSNR and SSIM, averaging 29.5 dB and 0.94, respectively.
- It also reports the lowest LPIPS (0.17) and improvements in UIQM and UCIQE, with average values of 3.62 and 0.59, respectively.
- Qualitative results demonstrate reduced color cast, restored contrast, and sharper structural details in underwater images.
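
For readers unfamiliar with the PSNR figure quoted above, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), written in plain Python. It is not the authors' evaluation script: the function name `psnr`, the flat-list pixel representation, and the `max_value=1.0` intensity range are assumptions made for illustration.

```python
import math


def psnr(reference, enhanced, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    `reference` and `enhanced` are flat sequences of intensities in
    [0, max_value]; this layout is an illustrative assumption.
    """
    if len(reference) != len(enhanced):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [0.20, 0.50, 0.80, 0.40]
    restored = [0.22, 0.48, 0.79, 0.43]
    print(f"PSNR: {psnr(ground_truth, restored):.2f} dB")
```

Higher values indicate a smaller pixel-wise error relative to the reference image, which is why PSNR is usually reported alongside structural and perceptual metrics such as SSIM and LPIPS.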

## Abstract

Underwater optical images are the primary carriers of underwater scene information, playing a crucial role in marine resource exploration, underwater environmental monitoring, and engineering inspection. However, wavelength-dependent absorption and scattering severely deteriorate underwater images, leading to reduced contrast, chromatic distortions, and loss of structural details. To address these issues, we propose a U-shaped underwater image enhancement framework that integrates Swin-Transformer blocks with lightweight attention and residual modules. A Dual-Window Multi-Head Self-Attention (DWMSA) in the bottleneck models long-range context while preserving fine local structure. A Global-Aware Attention Map (GAMP) adaptively re-weights channels and spatial locations to focus on severely degraded regions. A Feature-Augmentation Residual Network (FARN) stabilizes deep training and emphasizes texture and color fidelity. Trained with a combination of Charbonnier, perceptual, and edge losses, our method achieves state-of-the-art results in PSNR and SSIM, the lowest LPIPS, and improvements in UIQM and UCIQE on the UFO-120 and EUVP datasets, with average metrics of PSNR 29.5 dB, SSIM 0.94, LPIPS 0.17, UIQM 3.62, and UCIQE 0.59. Qualitative results show reduced color cast, restored contrast, and sharper details. Code, weights, and evaluation scripts will be released to support reproducibility.
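
The abstract names a Charbonnier term as one component of the training objective. As a minimal sketch, assuming the commonly used form mean(sqrt((x − y)² + ε²)), the loss can be written in plain Python as below. The value `eps=1e-3`, the helper `combined_loss`, and its weights `w_perceptual` and `w_edge` are hypothetical: this summary does not state the paper's loss weights, and the perceptual and edge terms are passed in as precomputed numbers rather than implemented.

```python
import math


def charbonnier_loss(prediction, target, eps=1e-3):
    """Mean Charbonnier penalty between two equal-length pixel lists.

    Behaves like a smooth L1 loss: close to |x - y| for large errors and
    differentiable near zero. `eps` is an assumed smoothing constant.
    """
    if len(prediction) != len(target):
        raise ValueError("inputs must have the same number of pixels")
    if not prediction:
        raise ValueError("inputs must not be empty")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2)
                for p, t in zip(prediction, target))
    return total / len(prediction)


def combined_loss(charbonnier, perceptual, edge, w_perceptual=0.1, w_edge=0.05):
    """Weighted sum of the three loss terms named in the abstract.

    The weights are placeholders for illustration, not the paper's values.
    """
    return charbonnier + w_perceptual * perceptual + w_edge * edge


if __name__ == "__main__":
    pred = [0.22, 0.48, 0.79, 0.43]
    gt = [0.20, 0.50, 0.80, 0.40]
    l_char = charbonnier_loss(pred, gt)
    # Perceptual and edge values here are made-up stand-ins.
    print(combined_loss(l_char, perceptual=0.5, edge=0.2))
```

In a real training setup the perceptual term would typically come from features of a pretrained network and the edge term from a gradient or Laplacian filter; both are left abstract here because the summary does not describe them.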

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12842990/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12842990/full.md

## References

49 references — full list in the complete paper: https://tomesphere.com/paper/PMC12842990/full.md

---
Source: https://tomesphere.com/paper/PMC12842990