Mixed Transformer U-Net For Medical Image Segmentation

Hongyi Wang; Shiao Xie; Lanfen Lin; Yutaro Iwamoto; Xian-Hua Han; Yen-Wei Chen; Ruofeng Tong

arXiv:2111.04734 · eess.IV · November 12, 2021 · 25 cites

TL;DR

This paper introduces a novel U-Net variant called MT-UNet that integrates a new Mixed Transformer Module to effectively model both intra- and inter-sample affinities, improving medical image segmentation accuracy.

Contribution

The paper proposes the Mixed Transformer Module (MTM) combining Local-Global Gaussian-Weighted Self-Attention and External Attention, enabling efficient long-range and dataset-level correlation modeling within a U-Net framework.
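As a rough illustration of the External Attention half of the MTM, the sketch below lets every sample attend to two small memory matrices that are shared across the whole dataset, which is how such a layer can capture dataset-level correlations. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `external_attention`, the memory names `m_k` and `m_v`, the sizes, and the double normalization (softmax over tokens, then L1 over memory slots) follow the general external-attention formulation and are not taken from this page. In a trained model the memories would be learned; here they are random.

```python
import numpy as np

def external_attention(x, m_k, m_v):
    """Hypothetical external-attention layer.

    x   : (n_tokens, d) features of ONE sample
    m_k : (s, d) key memory shared by all samples (assumed learnable)
    m_v : (s, d) value memory shared by all samples (assumed learnable)
    """
    attn = x @ m_k.T                                   # (n_tokens, s) token-to-memory affinities
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))
    attn = attn / attn.sum(axis=0, keepdims=True)      # softmax over the token axis
    attn = attn / attn.sum(axis=1, keepdims=True)      # L1-normalize over the memory slots
    return attn @ m_v, attn                            # (n_tokens, d), (n_tokens, s)

rng = np.random.default_rng(0)
d, s = 16, 8                                           # feature and memory sizes (illustrative)
m_k = rng.normal(size=(s, d))
m_v = rng.normal(size=(s, d))

# Two different samples reuse the SAME memories, so the cost per sample
# is linear in the number of tokens rather than quadratic.
x_a = rng.normal(size=(32, d))
x_b = rng.normal(size=(20, d))
out_a, attn_a = external_attention(x_a, m_k, m_v)
out_b, attn_b = external_attention(x_b, m_k, m_v)
print(out_a.shape, out_b.shape)
```

Because the memories do not depend on the input, the affinities are computed against something that persists across samples, in contrast to self-attention, where keys and values come from the sample itself.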

Findings

1. Achieves superior segmentation performance on public datasets.
2. Effectively models long-range dependencies with lower computational cost.
3. Outperforms existing state-of-the-art methods.

Abstract

Though U-Net has achieved tremendous success in medical image segmentation tasks, it lacks the ability to explicitly model long-range dependencies. Vision Transformers have therefore recently emerged as alternative segmentation structures, owing to their innate ability to capture long-range correlations through Self-Attention (SA). However, Transformers usually rely on large-scale pre-training and have high computational complexity. Furthermore, SA can only model self-affinities within a single sample, ignoring the potential correlations of the overall dataset. To address these problems, we propose a novel Transformer module named Mixed Transformer Module (MTM) for simultaneous learning of inter- and intra-sample affinities. MTM first calculates self-affinities efficiently through our well-designed Local-Global Gaussian-Weighted Self-Attention (LGG-SA). Then, it mines inter-connections between…
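To make the idea of Gaussian-weighted self-attention more concrete, the sketch below adds a distance penalty to ordinary scaled dot-product attention so that nearby tokens receive more weight than distant ones. It is only one plausible reading, written for a 1D token sequence: the function name, the fixed `sigma`, and the choice to subtract d²/(2σ²) from the logits (equivalent to multiplying the unnormalized weights by a Gaussian) are assumptions for illustration. The abstract shown here does not specify the exact LGG-SA formulation, and the local-global split is not modeled.

```python
import numpy as np

def gaussian_weighted_self_attention(x, w_q, w_k, w_v, sigma=2.0):
    """Hypothetical single-head self-attention with a Gaussian distance prior.

    x             : (n_tokens, d) features of one sample, tokens in 1D order
    w_q, w_k, w_v : (d, d) projection matrices (assumed learnable)
    sigma         : width of the Gaussian prior (fixed here for simplicity)
    """
    n = x.shape[0]
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    logits = q @ k.T / np.sqrt(k.shape[1])             # content-based affinities
    pos = np.arange(n)
    dist2 = (pos[:, None] - pos[None, :]) ** 2         # squared token distances
    logits = logits - dist2 / (2.0 * sigma ** 2)       # log of a Gaussian weight
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ v, weights

rng = np.random.default_rng(0)
n, d = 8, 4
x = rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, weights = gaussian_weighted_self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Unlike the external-attention case, queries, keys, and values here all come from the same sample, so this step can only model intra-sample affinities.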

Code & Models

Repositories

dootmaan/mt-unet (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · AI in cancer detection

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Dropout · Layer Normalization · Residual Connection · Dense Connections · Softmax · Absolute Position Encodings