EndoCaver: Handling Fog, Blur and Glare in Endoscopic Images via Joint Deblurring-Segmentation

Zhuoyu Wu; Wenhui Ou; Pei-Sze Tan; Jiayan Yang; Wenqi Fang; Zheng Wang; Raphaël C.-W. Phan

arXiv:2601.22537 · eess.IV · May 19, 2026

TL;DR

EndoCaver is a lightweight transformer model that jointly performs deblurring and segmentation of endoscopic images, improving accuracy under challenging conditions while being suitable for on-device clinical use.

Contribution

It introduces a novel joint deblurring-segmentation transformer architecture with reduced complexity and state-of-the-art performance on endoscopic image datasets.

Findings

01. Achieves 0.922 Dice on clean data and 0.889 under severe degradation
02. Reduces model parameters by 90% compared to previous methods
03. Outperforms state-of-the-art methods in robustness and efficiency
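
For readers unfamiliar with the Dice score quoted in these findings, the following is a minimal sketch of how the coefficient is commonly computed for binary segmentation masks. It is not taken from the EndoCaver repository; the function name `dice_coefficient`, the nested-list mask format, and the smoothing term `eps` are assumptions made for illustration.

```python
from itertools import chain


def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks given as nested lists.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when both masks are empty (the score is then 1.0).
    """
    pred = list(chain.from_iterable(pred_mask))
    true = list(chain.from_iterable(true_mask))
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as polyp in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    # Total foreground pixels in each mask, counted separately.
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)
    return (2.0 * intersection + eps) / (total + eps)


predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 1],
                [0, 0, 0]]
score = dice_coefficient(predicted, ground_truth)
```

Here the prediction marks three pixels, the ground truth marks two, and two overlap, so the score is 2·2 / (3 + 2) = 0.8. A reported Dice of 0.922 therefore indicates a much closer overlap between predicted and annotated polyp regions.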

Abstract

Endoscopic image analysis is vital for colorectal cancer screening, yet real-world conditions often suffer from lens fogging, motion blur, and specular highlights, which severely compromise automated polyp detection. We propose EndoCaver, a lightweight transformer with a unidirectional-guided dual-decoder architecture, enabling joint multi-task capability for image deblurring and segmentation while significantly reducing computational complexity and model parameters. Specifically, it integrates a Global Attention Module (GAM) for cross-scale aggregation, a Deblurring-Segmentation Aligner (DSA) to transfer restoration cues, and a cosine-based scheduler (LoCoS) for stable multi-task optimisation. Experiments on the Kvasir-SEG dataset show that EndoCaver achieves 0.922 Dice on clean data and 0.889 under severe image degradation, surpassing state-of-the-art methods while reducing model…
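
The abstract mentions a cosine-based scheduler for multi-task optimisation but this summary does not give its formulation. As a rough illustration of the general idea only, the sketch below anneals the weight of a deblurring loss along a half-cosine curve while the segmentation loss keeps a fixed weight. This is a generic cosine weighting, not the paper's LoCoS; the names `cosine_weight` and `combined_loss`, the start and end weights, and the choice to decay the deblurring term are all assumptions.

```python
import math


def cosine_weight(step, total_steps, w_start=1.0, w_end=0.1):
    """Anneal a loss weight from w_start to w_end along a half-cosine curve.

    Hypothetical schedule for illustration; the values 1.0 and 0.1 are
    arbitrary assumptions, not settings reported by the paper.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the end keep the final weight.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_end + 0.5 * (w_start - w_end) * (1.0 + math.cos(math.pi * progress))


def combined_loss(seg_loss, deblur_loss, step, total_steps):
    """Weighted sum of two task losses (assumed form, not the authors' objective)."""
    return seg_loss + cosine_weight(step, total_steps) * deblur_loss


# The deblurring weight starts at 1.0 and decays smoothly towards 0.1.
weights = [cosine_weight(s, 100) for s in (0, 50, 100)]
```

A smooth schedule of this kind avoids abrupt changes in the relative scale of the two objectives, which is one common motivation for cosine-shaped weighting in multi-task training.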

Code & Models

Repositories

ReaganWu/EndoCaver (GitHub)

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Colorectal Cancer Screening and Detection