ViT-V-Net: Vision Transformer for Unsupervised Volumetric Medical Image Registration

Junyu Chen; Yufan He; Eric C. Frey; Ye Li; Yong Du

arXiv:2104.06468 · eess.IV · April 15, 2021 · 140 cites

TL;DR

This paper introduces ViT-V-Net, a novel hybrid model combining Vision Transformers and ConvNets, to improve unsupervised volumetric medical image registration by capturing long-range spatial relations and detailed localization.

Contribution

The paper proposes a new hybrid architecture, ViT-V-Net, that integrates Vision Transformers with ConvNets for enhanced medical image registration performance.

Findings

01. Achieves superior registration accuracy compared to existing methods.
02. Effectively captures long-range spatial relations in volumetric images.
03. Improves localization detail in medical image registration.

Abstract

In the last decade, convolutional neural networks (ConvNets) have dominated and achieved state-of-the-art performance in a variety of medical imaging applications. However, the performance of ConvNets is still limited by a lack of understanding of long-range spatial relations in an image. The recently proposed Vision Transformer (ViT) for image classification uses a purely self-attention-based model that learns long-range spatial relations to focus on the relevant parts of an image. Nevertheless, ViT emphasizes low-resolution features because of its consecutive downsamplings, resulting in a lack of detailed localization information and making it unsuitable for image registration. Recently, several ViT-based image segmentation methods have been combined with ConvNets to improve the recovery of detailed localization information. Inspired by them, we present ViT-V-Net, which bridges ViT…
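The long-range behaviour that the abstract attributes to self-attention can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the toy 4×4×4 volume, the 2×2×2 patch size, the single attention head, and the use of identity query/key/value projections are all assumptions made to keep the example short, and the function names `extract_patches` and `self_attention` are hypothetical. The sketch only shows that every patch token receives a non-zero weight from every other patch, however far apart they are in the volume.

```python
import math


def extract_patches(volume, p):
    """Split a D x H x W volume (nested lists) into flattened p*p*p patches."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    patches = []
    for z in range(0, d, p):
        for y in range(0, h, p):
            for x in range(0, w, p):
                patches.append([
                    volume[z + dz][y + dy][x + dx]
                    for dz in range(p) for dy in range(p) for dx in range(p)
                ])
    return patches


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a real model would learn separate linear projections.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Toy 4x4x4 volume split into 2x2x2 patches: 8 tokens, each of length 8.
volume = [[[(z + y + x) / 10.0 for x in range(4)] for y in range(4)]
          for z in range(4)]
tokens = extract_patches(volume, 2)
outputs, weights = self_attention(tokens)
```

Each row of `weights` is a distribution over all eight patches, so a single attention layer already mixes information across the whole volume. A convolution with a small kernel would need several stacked layers to connect opposite corners, which is the limitation of ConvNets that the abstract describes.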

Code & Models

Repositories

junyuchen245/ViT-V-Net_for_3D_Image_Registration_Pytorch
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Dropout · Adam · Layer Normalization · Label Smoothing