Large Language Model with Region-guided Referring and Grounding for CT Report Generation
Zhixuan Chen, Yequan Bie, Haibo Jin, and Hao Chen

TL;DR
This paper introduces Reg2RG, a novel region-guided framework for CT report generation that improves focus on anatomical regions, enhances interpretability, and outperforms existing methods in clinical report quality.
Contribution
The paper presents the first region-guided referring and grounding framework for CT report generation, integrating local and global features with a novel training strategy and employing a large language model for improved report quality.
Findings
Outperforms state-of-the-art methods on natural language generation metrics
Enhances interpretability through region-specific grounding
Improves clinical efficacy in CT report generation
Abstract
Computed tomography (CT) report generation is crucial for assisting radiologists in interpreting CT volumes, a task that can be time-consuming and labor-intensive. Existing methods primarily consider only the global features of the entire volume, so they struggle to focus on specific regions and may miss abnormalities. To address this issue, we propose Reg2RG, the first region-guided referring and grounding framework for CT report generation, which enhances diagnostic performance by focusing on anatomical regions within the volume. Specifically, we utilize masks from a universal segmentation module to capture local features for each referring region. A local feature decoupling (LFD) strategy is proposed to preserve the local high-resolution details with little computational overhead. Then the local features are integrated with global features to capture inter-regional relationships…
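The mask-guided step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `crop_region`, the bounding-box cropping, and the toy array shapes are all assumptions made for illustration. It only shows one plausible way a binary segmentation mask can isolate a region of a CT volume at its native resolution before any feature encoder is applied.

```python
import numpy as np

def crop_region(volume, mask):
    """Crop a volume to the bounding box of a binary mask.

    Hypothetical helper: voxels inside the box but outside the mask are
    zeroed, so only the referred region's intensities remain.
    Returns None when the mask is empty.
    """
    if volume.shape != mask.shape:
        raise ValueError("volume and mask must have the same shape")
    coords = np.argwhere(mask)           # indices of all voxels in the region
    if coords.size == 0:
        return None
    lo = coords.min(axis=0)              # inclusive lower corner of the box
    hi = coords.max(axis=0) + 1          # exclusive upper corner of the box
    box = tuple(slice(a, b) for a, b in zip(lo, hi))
    return volume[box] * mask[box]       # keep native resolution, drop background

# Toy example: a 4x4x4 "volume" and a 2x2x2 region mask (assumed shapes).
volume = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4)
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 2:4] = True

region = crop_region(volume, mask)
print(region.shape)
```

Cropping to the mask's bounding box keeps the region's original voxel resolution while shrinking the array that a downstream encoder would have to process. That is one way to read "high-resolution details with little computational overhead", though the paper's actual LFD strategy may differ.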
Taxonomy
Topics: Topic Modeling · Radiomics and Machine Learning in Medical Imaging · Biomedical Text Mining and Ontologies
Methods: Focus
