Mixed Transformer U-Net For Medical Image Segmentation
Hongyi Wang, Shiao Xie, Lanfen Lin, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

TL;DR
This paper introduces a novel U-Net variant called MT-UNet that integrates a new Mixed Transformer Module to effectively model both intra- and inter-sample affinities, improving medical image segmentation accuracy.
Contribution
The paper proposes the Mixed Transformer Module (MTM) combining Local-Global Gaussian-Weighted Self-Attention and External Attention, enabling efficient long-range and dataset-level correlation modeling within a U-Net framework.
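To make the idea of Gaussian-weighted self-attention concrete, here is a minimal NumPy sketch. It is a generic illustration, not the paper's LGG-SA: the function name, the 1D token positions, the `sigma` parameter, and the choice to subtract a squared-distance penalty from the attention logits are all assumptions made for this example, and the local-global split is not modeled at all.

```python
import numpy as np

def gaussian_weighted_self_attention(x, w_q, w_k, w_v, sigma=2.0):
    """Single-head self-attention with a Gaussian distance prior (hypothetical sketch).

    x: (n_tokens, d_in) token features; w_q, w_k, w_v: (d_in, d_head) projections.
    Tokens are assumed to lie on a 1D line at positions 0..n_tokens-1.
    """
    n_tokens = x.shape[0]
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    # Standard scaled dot-product logits.
    logits = q @ k.T / np.sqrt(q.shape[-1])

    # Assumed Gaussian weighting: multiplying affinities by exp(-d^2 / (2 sigma^2))
    # is the same as subtracting d^2 / (2 sigma^2) from the logits.
    pos = np.arange(n_tokens)
    dist2 = (pos[:, None] - pos[None, :]) ** 2
    logits = logits - dist2 / (2.0 * sigma ** 2)

    # Numerically stable softmax over the key axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(10, 16))
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
out, weights = gaussian_weighted_self_attention(tokens, w_q, w_k, w_v)
```

In this sketch a smaller `sigma` concentrates each token's attention on its neighbours, while a larger one approaches ordinary self-attention.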
Findings
Achieves superior segmentation performance on public datasets.
Models long-range dependencies at lower computational cost than standard Self-Attention.
Outperforms existing state-of-the-art methods.
Abstract
Though U-Net has achieved tremendous success in medical image segmentation tasks, it lacks the ability to explicitly model long-range dependencies. Therefore, Vision Transformers have emerged as alternative segmentation structures recently, for their innate ability of capturing long-range correlations through Self-Attention (SA). However, Transformers usually rely on large-scale pre-training and have high computational complexity. Furthermore, SA can only model self-affinities within a single sample, ignoring the potential correlations of the overall dataset. To address these problems, we propose a novel Transformer module named Mixed Transformer Module (MTM) for simultaneous inter- and intra- affinities learning. MTM first calculates self-affinities efficiently through our well-designed Local-Global Gaussian-Weighted Self-Attention (LGG-SA). Then, it mines inter-connections between…
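The contrast between intra-sample and inter-sample affinities can be illustrated with a small NumPy sketch of external attention in its commonly described form: each token attends to a small memory that is shared by every sample, rather than to other tokens of the same sample. This is a sketch under assumptions, not the authors' implementation. The names `external_attention`, `mem_k`, and `mem_v`, the memory size, and the double normalization (softmax over tokens, then L1 over memory slots) are chosen for illustration, and the memories are random here where a real model would learn them.

```python
import numpy as np

def external_attention(x, mem_k, mem_v, eps=1e-9):
    """External attention over a shared memory (hypothetical sketch).

    x: (n_tokens, d) features of ONE sample.
    mem_k, mem_v: (n_slots, d) memories shared across ALL samples,
    which is what lets them carry dataset-level information.
    """
    logits = x @ mem_k.T                              # (n_tokens, n_slots)

    # Softmax over the token axis (numerically stable).
    logits = logits - logits.max(axis=0, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)

    # L1 normalization over the memory-slot axis.
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)

    return attn @ mem_v                               # (n_tokens, d)

rng = np.random.default_rng(0)
mem_k = rng.normal(size=(8, 16))   # stand-ins for learned parameters
mem_v = rng.normal(size=(8, 16))

# Two different samples are processed with the same memories.
sample_a = rng.normal(size=(32, 16))
sample_b = rng.normal(size=(32, 16))
out_a = external_attention(sample_a, mem_k, mem_v)
out_b = external_attention(sample_b, mem_k, mem_v)
```

Because the memory has a fixed number of slots, the cost of this operation grows linearly with the number of tokens, unlike the quadratic token-to-token affinity matrix of self-attention.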
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · AI in cancer detection
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dropout · Layer Normalization · Residual Connection · Dense Connections · Softmax · Absolute Position Encodings
