Vision-Language Models for Automated 3D PET/CT Report Generation
Wenpei Jiao, Kun Shang, Hui Li, Ke Yan, Jiajin Zhang, Guangjie Yang, Lijuan Guo, Yan Wan, Xing Yang, Dakai Jin, Zhaoheng Xie

TL;DR
This paper introduces PETRG-3D, a novel 3D dual-branch framework for automated PET/CT report generation, addressing challenges in functional imaging and variability across hospitals, and demonstrates its effectiveness on new datasets and evaluation protocols.
Contribution
The paper presents PETRG-3D, a new end-to-end 3D model for PET/CT report generation, along with new datasets and a clinical evaluation protocol, advancing disease-aware reasoning in medical AI.
Findings
PETRG-3D outperforms existing methods on natural language metrics.
The model improves clinical efficacy metrics by 8.18%.
Style-adaptive prompts enhance reporting consistency across hospitals.
Abstract
Positron emission tomography/computed tomography (PET/CT) is essential in oncology, yet the rapid expansion of scanners has outpaced the availability of trained specialists, making automated PET/CT report generation (PETRG) increasingly important for reducing clinical workload. Compared with structural imaging (e.g., X-ray, CT, and MRI), functional PET poses distinct challenges: metabolic patterns vary with tracer physiology, and whole-body 3D contextual information is required rather than local-region interpretation. To advance PETRG, we propose PETRG-3D, an end-to-end 3D dual-branch framework that separately encodes PET and CT volumes and incorporates style-adaptive prompts to mitigate inter-hospital variability in reporting practices. We construct PETRG-Lym, a multi-center lymphoma dataset collected from four hospitals (824 reports with 245,509 paired PET/CT slices), and construct…
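The abstract does not describe how the style-adaptive prompts are built, so the following is only a minimal sketch of one plausible reading: a per-hospital style instruction is placed ahead of a shared task instruction before the text is passed to the report decoder. The names `STYLE_PROMPTS`, `DEFAULT_STYLE`, `Study`, and `build_prompt`, the hospital keys, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical style instructions, one per contributing hospital.
# The paper's actual prompts are not given in this summary.
STYLE_PROMPTS = {
    "hospital_a": "Write concise findings organized by body region.",
    "hospital_b": "Write narrative findings followed by a numbered impression.",
}

# Fallback used when a study comes from a hospital without a known style.
DEFAULT_STYLE = "Write findings followed by an impression."


@dataclass
class Study:
    """Minimal study record (assumed fields, for illustration only)."""
    study_id: str
    hospital: str


def build_prompt(study: Study) -> str:
    """Prepend a hospital-specific style instruction to a shared task instruction."""
    style = STYLE_PROMPTS.get(study.hospital, DEFAULT_STYLE)
    task = "Generate a PET/CT report for the encoded PET and CT volumes."
    return f"[STYLE] {style}\n[TASK] {task}"


if __name__ == "__main__":
    print(build_prompt(Study(study_id="s1", hospital="hospital_a")))
```

In this sketch the image encoders are left out entirely; only the text side changes between hospitals, which is one simple way a single model could adapt to different reporting conventions.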
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Multimodal Machine Learning Applications · Topic Modeling
