Region-Grounded Report Generation for 3D Medical Imaging: A Fine-Grained Dataset and Graph-Enhanced Framework

Cong Huy Nguyen; Son Dinh Nguyen; Guanlin Li; Tuan Dung Nguyen; Aditya Narayan Sankaran; Mai Huy Thong; Thanh Trung Nguyen; Mai Hong Son; Reza Farahbakhsh; Phi Le Nguyen; Noel Crespi

arXiv:2604.18145 · cs.CV · May 18, 2026

TL;DR

This paper introduces VietPET-RoI, a large-scale 3D PET/CT dataset with detailed RoI annotations in a low-resource language, and proposes HiRRA, a graph-based framework that improves localized report generation by mimicking radiologist workflows.

Contribution

The paper presents the first fine-grained 3D PET/CT dataset with RoI annotations in a low-resource language and a novel graph-enhanced framework for localized report generation.

Findings

01. HiRRA achieves state-of-the-art performance, with a 19.7% higher BLEU score than existing models.

02. It surpasses existing models by 4.7% in ROUGE-L.

03. It improves clinical metrics by 45.8%, indicating more reliable reports.

Abstract

Automated medical report generation for 3D PET/CT imaging is fundamentally challenged by the high-dimensional nature of volumetric data and a critical scarcity of annotated datasets, particularly for low-resource languages. Current black-box methods map whole volumes to reports, ignoring the clinical workflow of analyzing localized Regions of Interest (RoIs) to derive diagnostic conclusions. In this paper, we bridge this gap by introducing VietPET-RoI, the first large-scale 3D PET/CT dataset with fine-grained RoI annotation for a low-resource language, comprising 600 PET/CT samples and 1,960 manually annotated RoIs, paired with corresponding clinical reports. Furthermore, to demonstrate the utility of this dataset, we propose HiRRA, a novel framework that mimics the professional radiologist diagnostic workflow by employing graph-based relational modules to capture dependencies between…

Code & Models

Repositories

https://github.com
