Stepwise Feature Fusion: Local Guides Global
Jinfeng Wang, Qiming Huang, Feilong Tang, Jia Meng, Jionglong Su, and Sifan Song

TL;DR
This paper introduces SSFormer, a novel pyramid Transformer-based model with a progressive locality decoder that enhances generalization and segmentation accuracy in colonoscopy images, addressing overfitting and boundary ambiguity issues.
Contribution
The paper presents SSFormer, a new state-of-the-art medical image segmentation model that combines a pyramid Transformer encoder with a decoder focused on local features.
Findings
Achieves state-of-the-art segmentation performance.
Demonstrates improved generalization on unseen data.
Effectively emphasizes local features to reduce overfitting.
Abstract
Colonoscopy, currently the most efficient and recognized colon polyp detection technology, is necessary for early screening and prevention of colorectal cancer. However, due to the varying size and complex morphological features of colonic polyps as well as the indistinct boundary between polyps and mucosa, accurate segmentation of polyps is still challenging. Deep learning has become popular for accurate polyp segmentation tasks with excellent results. However, due to the structure of polyp images and the varying shapes of polyps, it is easy for existing deep learning models to overfit the current dataset. As a result, the model may not process unseen colonoscopy data. To address this, we propose a new state-of-the-art model for medical image segmentation, the SSFormer, which uses a pyramid Transformer encoder to improve the generalization ability of models. Specifically, our proposed…
Taxonomy
Topics: Colorectal Cancer Screening and Detection · Image Retrieval and Classification Techniques · Radiomics and Machine Learning in Medical Imaging
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Dropout · Softmax
