TL;DR
This paper introduces a novel FCN-Transformer architecture for full-size polyp segmentation in colonoscopy images, combining transformer and convolutional features to improve accuracy and generalization over existing methods.
Contribution
The proposed architecture effectively fuses transformer and convolutional features for full-size segmentation, achieving state-of-the-art results and better generalization in polyp segmentation tasks.
Findings
State-of-the-art performance on Kvasir-SEG and CVC-ClinicDB datasets.
Superior generalization demonstrated by cross-dataset evaluation.
Improvements are reported on metrics including mDice, mIoU, mPrecision, and mRecall.
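For readers unfamiliar with the metrics named above, the following is a minimal sketch of how they are commonly computed for binary segmentation masks. It assumes the "m" prefix means the per-image score averaged over a test set; this summary does not state the paper's exact convention. The function names `binary_metrics` and `mean_metrics` and the smoothing term `eps` are illustrative, not taken from the authors' code.

```python
from typing import Dict, Sequence

Mask = Sequence[Sequence[int]]  # binary mask: rows of 0/1 pixel labels


def binary_metrics(pred: Mask, target: Mask, eps: float = 1e-8) -> Dict[str, float]:
    """Dice, IoU, precision, and recall for one predicted/ground-truth mask pair."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # polyp pixel correctly predicted
            elif p:
                fp += 1  # predicted polyp, actually background
            elif t:
                fn += 1  # missed polyp pixel
    # eps (an assumption of this sketch) avoids division by zero on empty masks.
    return {
        "Dice": (2 * tp + eps) / (2 * tp + fp + fn + eps),
        "IoU": (tp + eps) / (tp + fp + fn + eps),
        "Precision": (tp + eps) / (tp + fp + eps),
        "Recall": (tp + eps) / (tp + fn + eps),
    }


def mean_metrics(preds: Sequence[Mask], targets: Sequence[Mask]) -> Dict[str, float]:
    """Average each per-image metric over the dataset (assumed meaning of the 'm' prefix)."""
    per_image = [binary_metrics(p, t) for p, t in zip(preds, targets)]
    return {
        "m" + name: sum(scores[name] for scores in per_image) / len(per_image)
        for name in ("Dice", "IoU", "Precision", "Recall")
    }


if __name__ == "__main__":
    # Two toy 2x2 masks: one with a false positive, one predicted perfectly.
    preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
    targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
    print(mean_metrics(preds, targets))
```

Averaging per image, rather than pooling pixel counts across the dataset, gives every image equal weight regardless of polyp size. Other evaluation protocols may pool instead, so scores from different sources are not always directly comparable.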
Abstract
Colonoscopy is widely recognised as the gold standard procedure for the early detection of colorectal cancer (CRC). Segmentation is valuable for two significant clinical applications, namely lesion detection and classification, providing means to improve accuracy and robustness. The manual segmentation of polyps in colonoscopy images is time-consuming. As a result, the use of deep learning (DL) for automation of polyp segmentation has become important. However, DL-based solutions can be vulnerable to overfitting and the resulting inability to generalise to images captured by different colonoscopes. Recent transformer-based architectures for semantic segmentation both achieve higher performance and generalise better than alternatives, but typically predict a segmentation map of h/4 × w/4 spatial dimensions for an h × w input image. To this end, we propose a…
