DeNAS-ViT: Data Efficient NAS-Optimized Vision Transformer for Ultrasound Image Segmentation

Renqi Chen; Xinzhe Zheng; Haoyang Su; Kehan Wu

arXiv:2407.04203·cs.CV·November 11, 2025

TL;DR

DeNAS-ViT is a novel neural architecture search method that optimizes Vision Transformers for ultrasound image segmentation, effectively handling data scarcity and improving accuracy with minimal labeled data.

Contribution

It introduces the first NAS-based approach for ultrasound segmentation, incorporating token-level search and a semi-supervised framework to enhance performance on limited data.

Findings

01. Achieves state-of-the-art segmentation accuracy on public datasets.
02. Maintains robustness with limited labeled ultrasound data.
03. Demonstrates generalization potential beyond ultrasound imaging.

Abstract

Accurate segmentation of ultrasound images is essential for reliable medical diagnoses but is challenged by poor image quality and scarce labeled data. Prior approaches have relied on manually designed, complex network architectures to improve multi-scale feature extraction. However, such handcrafted models offer limited gains when prior knowledge is inadequate and are prone to overfitting on small datasets. In this paper, we introduce DeNAS-ViT, a data-efficient NAS-optimized Vision Transformer, the first method to leverage neural architecture search (NAS) for ultrasound image segmentation by automatically optimizing model architecture through token-level search. Specifically, we propose an efficient NAS module that performs multi-scale token search prior to the ViT's attention mechanism, effectively capturing both contextual and local features while minimizing computational costs.…
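The idea of searching over token scales before attention can be illustrated with a small sketch. This is not the paper's implementation: it assumes a DARTS-style continuous relaxation in which each candidate operation average-pools a 1-D token sequence at a different window size, and the outputs are mixed with softmax weights over architecture parameters. The names `pool_tokens`, `mixed_token_op`, `select_window`, and the window sizes `(1, 2, 4)` are all hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pool_tokens(tokens, window):
    """Average each run of `window` consecutive tokens, then repeat the
    average so the output keeps the input length (assumed candidate op)."""
    out = []
    for start in range(0, len(tokens), window):
        group = tokens[start:start + window]
        dim = len(group[0])
        mean = [sum(t[d] for t in group) / len(group) for d in range(dim)]
        out.extend([mean] * len(group))
    return out

def mixed_token_op(tokens, alphas, windows=(1, 2, 4)):
    """Softmax-weighted mixture of the multi-scale candidates; the result
    would be fed to the attention layer during the search phase."""
    weights = softmax(alphas)
    candidates = [pool_tokens(tokens, w) for w in windows]
    dim = len(tokens[0])
    return [
        [sum(wt * cand[i][d] for wt, cand in zip(weights, candidates))
         for d in range(dim)]
        for i in range(len(tokens))
    ]

def select_window(alphas, windows=(1, 2, 4)):
    # After the search, keep only the candidate with the largest parameter.
    return windows[max(range(len(alphas)), key=lambda i: alphas[i])]

if __name__ == "__main__":
    tokens = [[0.0], [2.0], [4.0], [6.0]]  # four 1-dimensional tokens
    print(mixed_token_op(tokens, alphas=[0.0, 0.0, 0.0]))
    print(select_window([0.1, 2.0, -1.0]))
```

In a real search the `alphas` would be learned jointly with the network weights; here they are plain numbers so that the mixing step can be inspected in isolation.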

Taxonomy

Topics: Medical Image Segmentation Techniques · AI in cancer detection · Advanced Neural Network Applications

Methods: Attention Is All You Need · Contrastive Learning · Softmax · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Adam