ResViT: Residual vision transformers for multi-modal medical image synthesis

Onat Dalmaz; Mahmut Yurt; Tolga Çukur

arXiv:2106.16031 · eess.IV · July 21, 2022

TL;DR

ResViT is a generative adversarial model that combines vision transformers with CNNs for multi-modal medical image synthesis. The authors report that it outperforms competing CNN- and transformer-based methods.

Contribution

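The paper proposes ResViT, a GAN architecture whose generator is built around aggregated residual transformer (ART) blocks, together with a unified implementation that avoids rebuilding separate synthesis models for each source-target modality configuration.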

Findings

01

ResViT outperforms competing CNN- and transformer-based methods in both qualitative assessments and quantitative metrics.

02

ResViT effectively synthesizes missing MRI sequences and CT images from MRI.

03

The model reduces computational load with weight sharing among transformer blocks.
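To make the third finding concrete, here is a rough sketch of how tying weights across transformer blocks shrinks the number of learnable parameters. It assumes a standard transformer encoder block (four attention projections, a two-layer MLP, two layer norms). The block count and layer sizes are hypothetical and are not taken from the paper, and the helper names are illustrative only. The sketch counts parameters and does not model the cost of a forward pass.

```python
def transformer_block_params(d_model: int, d_ff: int) -> int:
    """Learnable parameters in one standard transformer encoder block.

    Assumes the common layout: Q, K, V and output projections, a two-layer
    MLP, and two layer norms. This is NOT taken from the ResViT code.
    """
    # Four d_model x d_model projections, each with a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # MLP: d_model -> d_ff -> d_model, both layers with biases.
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + mlp + layer_norms


def total_transformer_params(num_blocks: int, d_model: int, d_ff: int,
                             shared: bool) -> int:
    """Total parameters across num_blocks blocks, with or without tying."""
    per_block = transformer_block_params(d_model, d_ff)
    # With tied weights, every block reuses the same single parameter set.
    return per_block if shared else num_blocks * per_block


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    num_blocks, d_model, d_ff = 9, 768, 3072
    independent = total_transformer_params(num_blocks, d_model, d_ff, shared=False)
    tied = total_transformer_params(num_blocks, d_model, d_ff, shared=True)
    print(f"independent blocks: {independent:,} parameters")
    print(f"tied blocks:        {tied:,} parameters")
    print(f"reduction factor:   {independent / tied:.1f}x")
```

Under these assumptions the saving scales linearly with the number of blocks that share weights. Which modules ResViT actually ties, and by how much that reduces its cost, is specified in the paper itself and not in this summary.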

Abstract

Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning. ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant…

Taxonomy

Methods: Convolution