Semantic-Topological Graph Reasoning for Language-Guided Pulmonary Screening
Chenyu Xue, Yiran Liu, Mian Zhou, Jionglong Su, and Zhixiang Lu

TL;DR
This paper introduces a framework that combines large language models with vision foundation models for language-guided pulmonary image segmentation, targeting better accuracy, stability, and efficiency, and reports state-of-the-art results.
Contribution
It proposes a Semantic-Topological Graph Reasoning (STGR) framework, together with a new distillation module and a new fine-tuning strategy, for language-guided medical image segmentation.
Findings
Achieves an 81.5% Dice Similarity Coefficient on LIDC-IDRI, surpassing previous methods.
Introduces a graph reasoning approach for resolving anatomical ambiguity.
Shows that the SAFT fine-tuning strategy improves cross-validation stability.
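For readers unfamiliar with the metric in the first finding, the Dice Similarity Coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The snippet below is a minimal sketch of that standard definition for binary masks. It is not the authors' evaluation code: the function name `dice_coefficient`, the empty-mask convention, and the toy masks are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred_mask = np.array([[0, 1, 1],
                      [0, 1, 0]])
true_mask = np.array([[0, 1, 0],
                      [0, 1, 1]])

score = dice_coefficient(pred_mask, true_mask)
print(f"Dice: {score:.4f}")
```

Here the two masks share 2 of their 3 foreground pixels each, so the score is 2·2 / (3 + 3) ≈ 0.667. A reported figure such as 81.5% would typically be an average of this per-case score over a test set, although the exact aggregation used in the paper is not stated in this summary.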
Abstract
Medical image segmentation driven by free-text clinical instructions is a critical frontier in computer-aided diagnosis. However, existing multimodal and foundation models struggle with the semantic ambiguity of clinical reports and fail to disambiguate complex anatomical overlaps in low-contrast scans. Furthermore, fully fine-tuning these massive architectures on limited medical datasets invariably leads to severe overfitting. To address these challenges, we propose a novel Semantic-Topological Graph Reasoning (STGR) framework for language-guided pulmonary screening. Our approach elegantly synergizes the reasoning capabilities of large language models (LLaMA-3-V) with the zero-shot delineation of vision foundation models (MedSAM). Specifically, we introduce a Text-to-Vision Intent Distillation (TVID) module to extract precise diagnostic guidance. To resolve anatomical ambiguity, we…
