U-Net Transformer: Self and Cross Attention for Medical Image Segmentation

Olivier Petit; Nicolas Thome; Clément Rambour; Luc Soler

arXiv:2103.06104·eess.IV·March 15, 2021

TL;DR

The U-Transformer network enhances medical image segmentation by integrating self- and cross-attention mechanisms into a U-shaped architecture, effectively modeling long-range dependencies and improving accuracy over traditional U-Nets.

Contribution

This paper introduces a novel U-Transformer architecture that combines Transformers with U-Net for improved modeling of spatial dependencies in medical image segmentation.

Findings

01. Significant performance improvements over U-Net and Attention U-Nets.

02. Both self- and cross-attention are crucial for optimal results.

03. Enhanced interpretability of segmentation results.

Abstract

Medical image segmentation remains particularly challenging for complex and low-contrast anatomical structures. In this paper, we introduce the U-Transformer network, which combines a U-shaped architecture for image segmentation with self- and cross-attention from Transformers. U-Transformer overcomes the inability of U-Nets to model long-range contextual interactions and spatial dependencies, which are arguably crucial for accurate segmentation in challenging contexts. To this end, attention mechanisms are incorporated at two main levels: a self-attention module leverages global interactions between encoder features, while cross-attention in the skip connections allows a fine spatial recovery in the U-Net decoder by filtering out non-semantic features. Experiments on two abdominal CT-image datasets show the large performance gain brought out by U-Transformer compared to U-Net and local…
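The two attention levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes feature maps already flattened into lists of per-position vectors, and it omits learned projections, multiple heads, positional encodings, and resampling between resolutions. The function names (`attention`, `self_attention`, `cross_attention_gate`) and the choice of a sigmoid over the mean attended value as the gate are hypothetical and used only for illustration. Plain Python lists keep the example free of dependencies.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    d = len(keys[0])
    scores = matmul(queries, transpose(keys))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, values), weights


def self_attention(features):
    """Each position attends to every position of the same feature map,
    so distant positions can influence each other (global interactions)."""
    out, _ = attention(features, features, features)
    return out


def cross_attention_gate(skip, decoder):
    """Assumed simplification: skip positions query the decoder features,
    and the attended result is squashed to a gate in (0, 1) that scales
    each skip vector before it is passed on to the decoder."""
    attended, _ = attention(skip, decoder, decoder)
    gates = [1.0 / (1.0 + math.exp(-sum(a) / len(a))) for a in attended]
    return [[g * x for x in row] for g, row in zip(gates, skip)]


# Toy data: 3 encoder/decoder positions and 2 skip positions, 2 channels each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
skip = [[2.0, -1.0], [0.5, 0.5]]

mixed = self_attention(features)
gated = cross_attention_gate(skip, features)
```

Here `mixed` has one output vector per input position, each a weighted average of all positions, and `gated` is the skip feature list with every vector scaled down by its gate. A trained network would learn the query, key, and value projections rather than use the raw features as this sketch does.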

Taxonomy

Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net