Self-adaptive vision-language model for 3D segmentation of pulmonary artery and vein

Xiaotong Guo, Deqian Yang, Dan Wang, Haochen Zhao, Yuan Li, Zhilin Sui, Tao Zhou, Lijun Zhang, and Yanda Meng

arXiv:2501.03722·cs.CV·January 8, 2025

Open Access

TL;DR

This paper introduces a self-adaptive vision-language model leveraging pre-trained CLIP for accurate 3D pulmonary artery and vein segmentation with limited labeled data, demonstrating superior performance on a large dataset.

Contribution

It proposes a novel language-guided cross-attention fusion framework with a self-adaptive learning strategy to enhance pulmonary vessel segmentation using pre-trained foundation models.

Findings

01 Outperforms state-of-the-art methods significantly

02 Validated on the largest pulmonary artery-vein CT dataset to date

03 Achieves high accuracy with limited labeled data

Abstract

Accurate segmentation of pulmonary structures is crucial in clinical diagnosis, disease study, and treatment planning. Significant progress has been made in deep learning-based segmentation techniques, but most require large amounts of labeled data for training. Consequently, developing precise segmentation methods that demand less labeled data is paramount in medical image analysis. The emergence of pre-trained vision-language foundation models, such as CLIP, has recently opened the door for universal computer vision tasks. Exploiting the generalization ability of these pre-trained foundation models on downstream tasks, such as segmentation, leads to unexpected performance with a relatively small amount of labeled data. However, the exploration of these models for pulmonary artery-vein segmentation is still limited. This paper proposes a novel framework called Language-guided self-adaptive Cross-Attention…
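
The abstract is cut off before it describes the architecture, so the following is only a minimal sketch of the general idea of language-guided cross-attention fusion, not the authors' implementation. It assumes a 3D feature map flattened into tokens that attend to a small set of text embeddings. Random vectors stand in for the CLIP text encoder output, and every name here (`cross_attention_fuse`, `w_q`, `w_k`, `w_v`, the tensor sizes) is hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_fuse(image_tokens, text_tokens, w_q, w_k, w_v):
    """Single-head cross-attention: image tokens query the text tokens.

    image_tokens: (N, C) flattened 3D feature map
    text_tokens:  (T, D) text embeddings (assumed to come from a text encoder)
    Returns the fused tokens (N, C) and the attention weights (N, T).
    """
    q = image_tokens @ w_q                      # (N, H) queries from the image
    k = text_tokens @ w_k                       # (T, H) keys from the text
    v = text_tokens @ w_v                       # (T, C) values from the text
    scores = q @ k.T / np.sqrt(q.shape[-1])     # (N, T) scaled dot products
    attn = softmax(scores, axis=-1)             # each image token sums to 1 over text
    fused = image_tokens + attn @ v             # residual connection keeps image content
    return fused, attn


rng = np.random.default_rng(0)

# Hypothetical sizes: a small 3D feature map and two text prompts
# (for example one for "artery" and one for "vein").
depth, height, width, channels = 4, 8, 8, 16
text_dim, hidden = 32, 16

feature_map = rng.standard_normal((depth, height, width, channels))
text_embeddings = rng.standard_normal((2, text_dim))  # stand-in for encoder output

# Randomly initialised projections; in a trained model these would be learned.
w_q = rng.standard_normal((channels, hidden)) * 0.1
w_k = rng.standard_normal((text_dim, hidden)) * 0.1
w_v = rng.standard_normal((text_dim, channels)) * 0.1

image_tokens = feature_map.reshape(-1, channels)       # (D*H*W, C)
fused, attn = cross_attention_fuse(image_tokens, text_embeddings, w_q, w_k, w_v)
fused_map = fused.reshape(feature_map.shape)           # back to (D, H, W, C)
```

Using the image tokens as queries keeps the output on the voxel grid, so a segmentation decoder could consume `fused_map` directly. Whether the paper attends in this direction, and how its self-adaptive learning strategy enters, cannot be determined from the truncated abstract.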

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging

Methods: Contrastive Language-Image Pre-training · Adapter