GLeVE: Graph-Guided Lesion Grounding with Proposal Verification in 3D CT

Shuo Jiang; Yuhao Hong; Chunbo Jiang; Weihong Chen; Huangwei Chen; Shenghao Zhu; Beining Wu; Mingxuan Liu; Zhu Zhu; Feiwei Qin; Min Tan; and Yifei Chen

arXiv:2605.22619 · cs.CV · May 22, 2026

TL;DR

GLeVE is a framework for grounding radiology report descriptions of lesions in 3D CT volumes. It combines relation-aware graph reasoning, anatomy-aware proposal verification, and octree-based refinement to improve localization and segmentation accuracy.

Contribution

The paper introduces a graph-guided, anatomy-aware lesion grounding method with hierarchical octree refinement, in contrast to existing approaches that rely on phrase-level alignment or dense pixel supervision.

Findings

01. GLeVE outperforms classical multimodal models in lesion localization.

02. The method achieves higher segmentation accuracy on AbdomenAtlas 3.0.

03. Hierarchical octree refinement improves boundary delineation.
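
To make the octree idea in the last finding concrete, the following is a minimal sketch of generic octree subdivision over a binary 3D mask, not GLeVE's autoregressive refinement, whose details are not given on this page. It shows why a hierarchy concentrates resolution at the boundary: a cube is split into eight children only when it contains both lesion and background voxels. The names `octree_refine` and `in_sphere`, the 8-voxel grid, and the spherical "lesion" are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

# A cell is a cube: corner (x, y, z) and edge length s, in voxels.
Cell = Tuple[int, int, int, int]


def octree_refine(
    inside: Callable[[int, int, int], bool],
    size: int,
    min_size: int = 1,
) -> List[Tuple[Cell, bool]]:
    """Split a size^3 grid into octree leaves labelled lesion/background.

    Assumes `size` is a power of two. A cell is subdivided only while it is
    mixed (contains both labels) and larger than `min_size`; a mixed cell at
    `min_size` is labelled by majority vote.
    """
    leaves: List[Tuple[Cell, bool]] = []
    stack: List[Cell] = [(0, 0, 0, size)]
    while stack:
        x, y, z, s = stack.pop()
        voxels = [
            inside(i, j, k)
            for i in range(x, x + s)
            for j in range(y, y + s)
            for k in range(z, z + s)
        ]
        n_in = sum(voxels)
        homogeneous = n_in == 0 or n_in == len(voxels)
        if homogeneous or s <= min_size:
            leaves.append(((x, y, z, s), 2 * n_in >= len(voxels)))
            continue
        h = s // 2  # children have half the edge length
        for dx in (0, h):
            for dy in (0, h):
                for dz in (0, h):
                    stack.append((x + dx, y + dy, z + dz, h))
    return leaves


def in_sphere(i: int, j: int, k: int) -> bool:
    """Hypothetical lesion: a sphere of radius 3 centred in an 8^3 grid."""
    return (i - 3.5) ** 2 + (j - 3.5) ** 2 + (k - 3.5) ** 2 <= 9.0


leaves = octree_refine(in_sphere, size=8)
```

With `min_size=1` every leaf is homogeneous, so the leaves reproduce the mask exactly while using fewer cells than the full voxel grid; the small cells cluster around the sphere's surface, which is where boundary detail matters.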

Abstract

Grounding radiology report descriptions to 3D CT volumes is essential for verifiable clinical interpretation, yet remains challenging due to the semantic-spatial gap between free-text narratives and volumetric anatomy. Existing report-assisted and vision-language grounding methods typically rely on phrase-level alignment or dense pixel supervision, resulting in limited lesion-wise correspondence and suboptimal localization accuracy. We propose GLeVE, a graph-guided lesion grounding framework with anatomical prior verification and octree-based autoregressive refinement. GLeVE treats each lesion description as an atomic semantic unit and encodes organ attribution, attributes, and inter-lesion relations through relation-aware graph reasoning to produce discriminative lesion-wise queries. Anatomy-aware proposal generation with region-level verification enforces one-to-one text-lesion…
