Self-adaptive vision-language model for 3D segmentation of pulmonary artery and vein
Xiaotong Guo, Deqian Yang, Dan Wang, Haochen Zhao, Yuan Li, Zhilin Sui, Tao Zhou, Lijun Zhang, and Yanda Meng

TL;DR
This paper introduces a self-adaptive vision-language model leveraging pre-trained CLIP for accurate 3D pulmonary artery and vein segmentation with limited labeled data, demonstrating superior performance on a large dataset.
Contribution
It proposes a novel language-guided cross-attention fusion framework with a self-adaptive learning strategy to enhance pulmonary vessel segmentation using pre-trained foundation models.
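As a rough illustration of what language-guided cross-attention fusion can look like in general, the sketch below lets image tokens (queries) attend over text-embedding tokens (keys and values) and adds the attended text context back onto each image token. This is a minimal sketch under stated assumptions, not the authors' architecture: the function names, the identity projections, the residual addition, and the toy 2-D embeddings are all hypothetical, and the self-adaptive learning strategy is not modeled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(image_tokens, text_tokens):
    """Fuse text information into image tokens with scaled dot-product attention.

    Assumption: identity projections are used for brevity; a trained model
    would apply learned query/key/value projection matrices instead.
    """
    dim = len(text_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in image_tokens:
        # Attention weights of this image token over all text tokens.
        weights = softmax([dot(query, key) * scale for key in text_tokens])
        # Weighted sum of text tokens (the values).
        context = [
            sum(w * value[i] for w, value in zip(weights, text_tokens))
            for i in range(dim)
        ]
        # Residual connection: keep the visual feature, add the text context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


# Hypothetical toy embeddings standing in for text prompts such as
# "artery" and "vein"; real CLIP text embeddings are much higher-dimensional.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical image tokens (e.g. flattened 3D feature-map positions).
image_tokens = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

fused = cross_attention(image_tokens, text_tokens)
for token in fused:
    print(token)
```

In this toy setup, an image token aligned with one text embedding pulls most of its context from that embedding, while a token equally similar to both receives an even mix.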
Findings
Outperforms state-of-the-art methods significantly
Validated on the largest pulmonary artery-vein CT dataset to date
Achieves high accuracy with limited labeled data
Abstract
Accurate segmentation of pulmonary structures is crucial in clinical diagnosis, disease study, and treatment planning. Significant progress has been made in deep learning-based segmentation techniques, but most require much labeled data for training. Consequently, developing precise segmentation methods that demand fewer labeled datasets is paramount in medical image analysis. The emergence of pre-trained vision-language foundation models, such as CLIP, recently opened the door for universal computer vision tasks. Exploiting the generalization ability of these pre-trained foundation models on downstream tasks, such as segmentation, leads to unexpected performance with a relatively small amount of labeled data. However, exploring these models for pulmonary artery-vein segmentation is still limited. This paper proposes a novel framework called Language-guided self-adaptive Cross-Attention…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging
Methods: Contrastive Language-Image Pre-training · Adapter
