Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation
Che Liu, Zhongwei Wan, Yuqi Wang, Hui Shen, Haozhe Wang, Kangyu Zheng, Mi Zhang, Rossella Arcucci

TL;DR
This paper introduces CT-3DRRG, a benchmark dataset for 3D radiology report generation, together with a training recipe for vision-language models that yields the Argus models for generating reports from CT scans.
Contribution
It creates the largest public 3D CT report dataset and develops a comprehensive training recipe, leading to the state-of-the-art Argus models for 3D radiology report generation.
Findings
Argus models outperform previous methods across various sizes and resolutions.
The benchmark enables robust evaluation of VLMs on 3D radiology data.
Optimal training strategies significantly enhance report generation quality.
Abstract
Automatic radiology report generation holds significant potential to streamline the labor-intensive process of report writing by radiologists, particularly for 3D radiographs such as CT scans. While CT scans are critical for clinical diagnostics, they remain less explored compared to 2D radiographs. To date, there has been no comprehensive benchmark for 3D radiograph report generation (3DRRG), nor sufficient investigation into the optimal training strategies for Vision Language Models (VLMs) in this context, particularly with respect to vision encoder choices, visual token compression, and model scaling. In this work, we make three key contributions. We curate **CT-3DRRG**, the largest **publicly** available 3D CT-report dataset, establishing a robust and diverse benchmark for evaluating VLM performance on 3DRRG. Furthermore, we propose a comprehensive training recipe for building…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Medical Imaging and Analysis · AI in cancer detection
