Semantic-Topological Graph Reasoning for Language-Guided Pulmonary Screening

Chenyu Xue, Yiran Liu, Mian Zhou, Jionglong Su, and Zhixiang Lu

arXiv:2604.05620 · cs.CV · April 8, 2026

TL;DR

This paper introduces a framework that combines a large language model (LLaMA-3-V) with a vision foundation model (MedSAM) for language-guided pulmonary image segmentation, reporting an 81.5% Dice Similarity Coefficient on LIDC-IDRI and more stable cross-validation results.

Contribution

The paper proposes the Semantic-Topological Graph Reasoning (STGR) framework, together with a new distillation and fine-tuning strategy, for language-guided medical image segmentation.

Findings

01

Achieves an 81.5% Dice Similarity Coefficient (DSC) on LIDC-IDRI, surpassing previous methods.

02

Introduces a graph reasoning approach to resolve anatomical ambiguity.

03

The SAFT fine-tuning strategy improves cross-validation stability.
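
For readers unfamiliar with the metric in the first finding, the Dice Similarity Coefficient between a predicted mask P and a reference mask T is 2|P ∩ T| / (|P| + |T|). The sketch below illustrates that standard definition on small binary masks. It is not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks of equal shape."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels that are foreground in both masks
    pred_sum = 0      # foreground pixels in the prediction
    target_sum = 0    # foreground pixels in the reference
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p_on, t_on = bool(p), bool(t)
            intersection += p_on and t_on
            pred_sum += p_on
            target_sum += t_on

    denom = pred_sum + target_sum
    # Assumed convention: two empty masks count as perfect agreement.
    if denom == 0:
        return 1.0
    return 2.0 * intersection / denom


# Toy example: 2 overlapping pixels, 3 foreground pixels in each mask.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground pixels, so the reported 81.5% corresponds to a value of 0.815 on this scale.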

Abstract

Medical image segmentation driven by free-text clinical instructions is a critical frontier in computer-aided diagnosis. However, existing multimodal and foundation models struggle with the semantic ambiguity of clinical reports and fail to disambiguate complex anatomical overlaps in low-contrast scans. Furthermore, fully fine-tuning these massive architectures on limited medical datasets invariably leads to severe overfitting. To address these challenges, we propose a novel Semantic-Topological Graph Reasoning (STGR) framework for language-guided pulmonary screening. Our approach elegantly synergizes the reasoning capabilities of large language models (LLaMA-3-V) with the zero-shot delineation of vision foundation models (MedSAM). Specifically, we introduce a Text-to-Vision Intent Distillation (TVID) module to extract precise diagnostic guidance. To resolve anatomical ambiguity, we…
