EViT-Unet: U-Net Like Efficient Vision Transformer for Medical Image Segmentation on Mobile and Edge Devices

Xin Li; Wenhui Zhu; Xuanzhao Dong; Oana M. Dumitrascu; Yalin Wang

arXiv:2410.15036·eess.IV·October 22, 2024

TL;DR

EViT-UNet is a novel efficient Vision Transformer-based U-Net architecture designed for medical image segmentation on resource-limited devices, balancing high accuracy with reduced computational demands.

Contribution

This paper introduces EViT-UNet, a U-shaped network combining convolution and self-attention, optimized for low-resource medical image segmentation tasks.

Findings

01 Achieves high segmentation accuracy

02 Reduces computational complexity significantly

03 Suitable for mobile and edge medical devices

Abstract

With the rapid development of deep learning, CNN-based U-shaped networks have succeeded in medical image segmentation and are widely applied to various tasks. However, their limited ability to capture global features hinders their performance in complex segmentation tasks. The rise of the Vision Transformer (ViT) has effectively compensated for this deficiency of CNNs and promoted the application of ViT-based U-shaped networks in medical image segmentation. However, the high computational demands of ViT restrict its deployment on medical devices, mobile platforms, and other resource-constrained edge devices. To address this, we propose EViT-UNet, an efficient ViT-based segmentation network that reduces computational complexity while maintaining accuracy, making it well suited to resource-constrained medical devices. EViT-UNet is built on a U-shaped…
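
The abstract's point about ViT's computational demands can be made concrete with a rough operation count. The sketch below is illustrative only and is not taken from the paper: it compares approximate multiply-accumulate (MAC) counts for one global self-attention layer against one 3x3 convolution on the same feature map. The function names, the grid sizes, and the channel width of 64 are assumptions chosen for the example, and the estimate ignores softmax, normalization, and bias terms.

```python
def attention_macs(height: int, width: int, dim: int) -> int:
    """Approximate MACs for one global self-attention layer.

    Assumes every pixel of a height x width feature map is one token
    with `dim` channels (an illustrative setup, not the paper's).
    The total does not depend on the number of heads when `dim` is the
    combined width of all heads.
    """
    n = height * width                  # number of tokens
    projections = 4 * n * dim * dim     # Q, K, V and output projections
    scores = n * n * dim                # Q @ K^T
    weighted_sum = n * n * dim          # attention weights @ V
    return projections + scores + weighted_sum


def conv3x3_macs(height: int, width: int, dim: int) -> int:
    """Approximate MACs for one 3x3 convolution, stride 1, same padding,
    with `dim` input and `dim` output channels."""
    return height * width * 9 * dim * dim


if __name__ == "__main__":
    dim = 64  # hypothetical channel width
    for side in (14, 28, 56, 112):  # hypothetical feature-map sizes
        a = attention_macs(side, side, dim)
        c = conv3x3_macs(side, side, dim)
        print(f"{side}x{side}: attention={a:,} MACs, conv3x3={c:,} MACs, "
              f"ratio={a / c:.2f}")
```

The two `n * n * dim` terms grow with the square of the token count, while the convolution grows linearly with it, so doubling the side length of the feature map multiplies the attention score cost by 16 but the convolution cost by only 4. This is the general motivation for hybrid designs that mix convolution with self-attention; how EViT-UNet itself balances the two is described in the paper, not in this sketch.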

Code & Models

Repositories

Retinal-Research/EVIT-UNET (PyTorch, official)

Taxonomy

Topics: Brain Tumor Detection and Classification

Methods: Attention Is All You Need · Dense Connections · Layer Normalization · Residual Connection · Position-Wise Feed-Forward Layer · Adam · Linear Layer · Softmax · Multi-Head Attention · Vision Transformer