DeNAS-ViT: Data Efficient NAS-Optimized Vision Transformer for Ultrasound Image Segmentation
Renqi Chen, Xinzhe Zheng, Haoyang Su, Kehan Wu

TL;DR
DeNAS-ViT uses neural architecture search to optimize Vision Transformers for ultrasound image segmentation, targeting settings where labeled data is scarce.
Contribution
The paper presents what the authors describe as the first NAS-based approach for ultrasound image segmentation, combining token-level architecture search with a semi-supervised framework to improve performance on limited labeled data.
Findings
Achieves state-of-the-art segmentation accuracy on public datasets.
Maintains robustness with limited labeled ultrasound data.
Demonstrates generalization potential beyond ultrasound imaging.
Abstract
Accurate segmentation of ultrasound images is essential for reliable medical diagnoses but is challenged by poor image quality and scarce labeled data. Prior approaches have relied on manually designed, complex network architectures to improve multi-scale feature extraction. However, such handcrafted models offer limited gains when prior knowledge is inadequate and are prone to overfitting on small datasets. In this paper, we introduce DeNAS-ViT, a data-efficient NAS-optimized Vision Transformer, the first method to leverage neural architecture search (NAS) for ultrasound image segmentation by automatically optimizing model architecture through token-level search. Specifically, we propose an efficient NAS module that performs multi-scale token search prior to the ViT's attention mechanism, effectively capturing both contextual and local features while minimizing computational costs.…
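The abstract does not spell out how the token-level search is carried out, so the following is only a minimal sketch of one way a multi-scale token search could be relaxed into a differentiable choice, assuming a DARTS-style softmax mixture over candidate pooling windows. The function names (`pool_tokens`, `mixed_token_op`, `select_scale`), the candidate windows `(1, 2, 4)`, and the use of average pooling are all illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pool_tokens(tokens, window):
    """Average non-overlapping groups of `window` tokens, then repeat each
    pooled token so the output has the same length as the input."""
    dim = len(tokens[0])
    out = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        mean = [sum(t[d] for t in group) / len(group) for d in range(dim)]
        out.extend(list(mean) for _ in group)
    return out


def mixed_token_op(tokens, alphas, windows=(1, 2, 4)):
    """Hypothetical search-time operation: a softmax-weighted sum of the
    token sequence pooled at each candidate scale (assumed relaxation)."""
    weights = softmax(alphas)
    candidates = [pool_tokens(tokens, w) for w in windows]
    dim = len(tokens[0])
    return [
        [sum(w * cand[i][d] for w, cand in zip(weights, candidates))
         for d in range(dim)]
        for i in range(len(tokens))
    ]


def select_scale(alphas, windows=(1, 2, 4)):
    """After search, keep the window with the largest architecture weight."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return windows[best]


if __name__ == "__main__":
    tokens = [[1.0], [3.0], [5.0], [7.0]]  # four 1-dimensional tokens
    alphas = [0.0, 0.0, 0.0]               # untrained architecture weights
    print(mixed_token_op(tokens, alphas))
    print(select_scale([0.1, 2.0, -1.0]))
```

In this sketch the mixed tokens would be passed to the attention layer in place of the raw tokens during search, and the architecture weights `alphas` would be learned alongside the network weights. Once search finishes, `select_scale` collapses the mixture to a single window.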
Taxonomy
Topics: Medical Image Segmentation Techniques · AI in cancer detection · Advanced Neural Network Applications
Methods: Attention Is All You Need · Contrastive Learning · Softmax · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam
