3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume

Jianye Pang; Cheng Jiang; Yihao Chen; Jianbo Chang; Ming Feng; Renzhi Wang; Jianhua Yao

arXiv:2204.06779·cs.CV·April 15, 2022

TL;DR

The paper introduces 3D Shuffle-Mixer, a novel efficient vision transformer-MLP hybrid model for dense medical volume prediction, addressing limitations of existing methods with a new local transformer and volume context integration.

Contribution

It proposes 3D Shuffle-Mixer, a network that combines a local vision transformer with an MLP to improve dense prediction in medical volumes.

Findings

01. Outperforms state-of-the-art methods in medical dense prediction tasks.

02. Efficiently captures full-view spatial and volume context.

03. Enhances feature learning with adaptive spatial and channel-wise shortcuts.

Abstract

Dense prediction in medical volumes provides enriched guidance for clinical analysis. CNN backbones have hit a bottleneck because they lack long-range dependency and global context modeling power. Recent works have proposed combining vision transformers with CNNs, owing to the transformer's strong global capture ability and learning capability. However, most of these works simply apply a pure transformer, which has several fatal flaws (i.e., lack of inductive bias, heavy computation, and little consideration for 3D data). Designing an elegant and efficient vision transformer learner for dense prediction in medical volumes is therefore both promising and challenging. In this paper, we propose a novel 3D Shuffle-Mixer network, built on a new Local Vision Transformer-MLP paradigm, for medical dense prediction. In our network, a local vision transformer block is utilized to shuffle and learn spatial context from full-view slices…

Taxonomy

Topics: AI in cancer detection · Machine Learning in Healthcare · Medical Image Segmentation Techniques

Methods: Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Multi-Head Attention · Layer Normalization · Residual Connection · Vision Transformer