U-Netmer: U-Net meets Transformer for medical image segmentation
Sheng He, Rina Bao, P. Ellen Grant, Yangming Ou

TL;DR
U-Netmer combines U-Net and Transformer in a global-local framework for medical image segmentation, addressing the token-flatten and scale-sensitivity problems and achieving state-of-the-art results on 7 public datasets.
Contribution
It introduces a novel U-Netmer model that integrates U-Net and Transformer without flattening tokens, allowing multi-scale training and improving segmentation accuracy.
Findings
Achieves state-of-the-art results on 7 public datasets
Effectively handles multiple organs and imaging modalities
Provides a confidence score correlating with segmentation accuracy
Abstract
The combination of U-Net based deep learning models and Transformer is a new trend for medical image segmentation. U-Net can extract detailed local semantic and texture information, and Transformer can learn the long-range dependencies among pixels in the input image. However, directly adapting the Transformer for segmentation has a "token-flatten" problem (it flattens the local patches into 1D tokens, which loses the interaction among pixels within local patches) and a "scale-sensitivity" problem (it uses a fixed scale to split the input image into local patches). Instead of directly combining U-Net and Transformer, we propose a new global-local combination of U-Net and Transformer, named U-Netmer, to solve the two problems. The proposed U-Netmer splits an input image into local patches. The global-context information among local patches is learnt by the self-attention mechanism…
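The two problems named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It only shows (a) splitting an image into patches that keep their 2D layout, compared with flattening each patch into a 1D token, (b) repeating the split at more than one patch size, and (c) plain scaled dot-product self-attention across patches. The function names, the 8x8 toy image, the patch sizes 2 and 4, and the mean/std patch descriptor are all assumptions made for illustration. The attention uses no learned projections.

```python
import numpy as np


def split_into_patches(image, patch):
    """Split an (H, W) image into a grid of 2D patches.

    Returns an array of shape (H // patch, W // patch, patch, patch),
    so every patch keeps its 2D pixel layout.
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    grid = image.reshape(h // patch, patch, w // patch, patch)
    return grid.transpose(0, 2, 1, 3)


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    tokens: (N, D) array, one row per patch.
    Returns the mixed tokens (N, D) and the attention weights (N, N).
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens, weights


# Toy 8x8 "image" with values in [0, 1] (assumption for illustration).
image = np.arange(64, dtype=np.float64).reshape(8, 8) / 63.0

# Hypothetical patch sizes; the source does not state which scales are used.
for patch in (2, 4):
    patches = split_into_patches(image, patch)
    n = patches.shape[0] * patches.shape[1]

    # Token-flatten: each patch becomes one 1D vector, discarding its 2D layout.
    flat_tokens = patches.reshape(n, patch * patch)

    # Stand-in patch descriptor (mean and std per patch). In a U-Netmer-style
    # model a U-Net would process each 2D patch; that part is not shown here.
    descriptors = np.stack(
        [patches.mean(axis=(2, 3)).ravel(), patches.std(axis=(2, 3)).ravel()],
        axis=1,
    )

    # Global context: every patch attends to every other patch.
    mixed, weights = self_attention(descriptors)
    print(patch, patches.shape, flat_tokens.shape, weights.shape)
```

Changing the patch size changes the number of patches and therefore the size of the attention matrix, which is one way to see why a model tied to a single fixed split is sensitive to scale.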
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · AI in cancer detection · Advanced Neural Network Applications
Methods: Multi-Head Attention · Attention Is All You Need · Max Pooling · Concatenated Skip Connection · Convolution · Dropout · Dense Connections · Adam
