CAG-VLM: Fine-Tuning of a Large-Scale Model to Recognize Angiographic Images for Next-Generation Diagnostic Systems
Yuto Nakamura, Satoshi Kodera, Haruki Settai, Hiroki Shinohara, Masatsugu Tamura, Tomohiro Noguchi, Tatsuki Furusawa, Ryo Takizawa, Tempei Kabayama, Norihiko Takeda

TL;DR
This paper introduces CAG-VLM, a specialized vision-language model fine-tuned on a large, curated dataset of angiographic images and reports to assist cardiologists in diagnosis and treatment planning.
Contribution
It presents a novel two-stage pipeline for dataset creation and fine-tuning of open-source VLMs specifically for coronary angiography analysis.
Findings
Fine-tuned models achieve high clinician ratings
The dataset enables effective model training and evaluation
CAG-VLM assists in report generation and decision support
Abstract
Coronary angiography (CAG) is the gold-standard imaging modality for evaluating coronary artery disease, but its interpretation and subsequent treatment planning rely heavily on expert cardiologists. To enable AI-based decision support, we introduce a two-stage, physician-curated pipeline and a bilingual (Japanese/English) CAG image-report dataset. First, we sample 14,686 frames from 539 exams and annotate them for key-frame detection and left/right laterality; a ConvNeXt-Base CNN trained on this data achieves 0.96 F1 on laterality classification, even on low-contrast frames. Second, we apply the CNN to 243 independent exams, extract 1,114 key frames, and pair each with its pre-procedure report and expert-validated diagnostic and treatment summary, yielding a parallel corpus. We then fine-tune three open-source VLMs (PaliGemma2, Gemma3, and ConceptCLIP-enhanced Gemma3) via LoRA and…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsArtificial Intelligence in Healthcare and Education · Retinal Imaging and Analysis · Coronary Interventions and Diagnostics
MethodsHeatmap · Class activation guide
