MS-Twins: Multi-Scale Deep Self-Attention Networks for Medical Image Segmentation
Jing Xu

TL;DR
MS-Twins introduces a multi-scale self-attention and convolution-based segmentation model that effectively captures semantic and fine-grained features, outperforming existing methods on medical imaging benchmarks.
Contribution
The paper presents MS-Twins, a novel multi-scale self-attention network that integrates transformer and convolution for improved medical image segmentation.
Findings
MS-Twins achieves 8% higher accuracy than SwinUNet on the Synapse dataset.
MS-Twins outperforms nnUNet on the Synapse and ACDC datasets.
The model better captures semantic and fine-grained information through multi-scale features.
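The multi-scale idea in the findings above can be illustrated with a toy sketch. This is not the authors' architecture: it works on a 1D token sequence instead of 2D images, uses identity query/key/value projections, a fixed smoothing kernel, and simple addition to merge the attention and convolution branches. All function names (`self_attention`, `conv1d_same`, `avg_pool`, `upsample`, `multi_scale_block`) are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention (assumption: identity Q/K/V projections)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def conv1d_same(tokens, kernel):
    """Per-channel sliding-window filter along the token axis, zero padded."""
    pad = len(kernel) // 2
    n, d = len(tokens), len(tokens[0])
    out = []
    for t in range(n):
        row = []
        for c in range(d):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = t + j - pad
                if 0 <= idx < n:
                    acc += w * tokens[idx][c]
            row.append(acc)
        out.append(row)
    return out


def avg_pool(tokens, scale):
    """Average non-overlapping groups of `scale` tokens to get a coarser scale."""
    d = len(tokens[0])
    pooled = []
    for i in range(0, len(tokens), scale):
        group = tokens[i:i + scale]
        pooled.append([sum(tok[c] for tok in group) / len(group) for c in range(d)])
    return pooled


def upsample(tokens, scale, length):
    """Nearest-neighbour upsampling back to `length` tokens."""
    return [tokens[min(i // scale, len(tokens) - 1)] for i in range(length)]


def multi_scale_block(tokens, scales=(1, 2, 4), kernel=(0.25, 0.5, 0.25)):
    """At each scale, add a global (attention) and a local (convolution) branch,
    then concatenate the upsampled results across scales."""
    n = len(tokens)
    fused = [[] for _ in range(n)]
    for s in scales:
        coarse = avg_pool(tokens, s)
        attended = self_attention(coarse)    # global context
        local = conv1d_same(coarse, kernel)  # local detail
        combined = [[a + b for a, b in zip(ra, rl)] for ra, rl in zip(attended, local)]
        for i, row in enumerate(upsample(combined, s, n)):
            fused[i].extend(row)             # concatenate features across scales
    return fused


tokens = [[float(i), float(i % 2)] for i in range(8)]
features = multi_scale_block(tokens)
print(len(features), len(features[0]))  # 8 6
```

Each of the 8 input tokens ends up with 6 features: 2 channels from each of the 3 scales. In a real segmentation model the projections and kernels would be learned, and the way the two branches are merged is a design choice this sketch does not try to reproduce.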
Abstract
Although the transformer is the preferred architecture in natural language processing, it has only been applied to medical imaging in recent years. Because it can model long-range dependencies, the transformer is expected to help convolutional neural networks overcome their inherent spatial inductive bias. Recently proposed transformer-based segmentation methods use the transformer only as an auxiliary module that helps encode global context into a convolutional representation. How to optimally integrate self-attention with convolution has not been investigated in depth. To address this problem, this paper proposes MS-Twins (Multi-Scale Twins), a powerful segmentation model built on the combination of self-attention and convolution. MS-Twins can better capture semantic and fine-grained information by combining different scales and cascading features. Compared with the existing…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Advanced Neural Network Applications · Brain Tumor Detection and Classification
Methods: Focus · Convolution
