RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment

Difei Gu; Yunhe Gao; Yang Zhou; Mu Zhou; Dimitris Metaxas

arXiv:2501.07525·cs.CV·July 23, 2025

TL;DR

RadAlign is a novel AI framework that combines vision-language alignment and large language models to improve accuracy, interpretability, and reliability in automated radiology report generation from chest X-rays.

Contribution

The paper introduces RadAlign, which integrates a specialized vision-language model with a large language model and a retrieval mechanism to improve both disease classification and report quality.

Findings

01 Achieved an average AUC of 0.885 in disease classification.

02 Delivered a GREEN score of 0.678 in report quality, outperforming previous methods.

03 Enhanced interpretability and reduced hallucinations in report generation.
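As a reading aid for the average AUC figure, the sketch below shows one common way such an average is computed: a per-disease AUC followed by an unweighted (macro) mean. Whether the paper uses macro averaging is an assumption here, and the disease names, labels, and scores are toy values that do not come from the paper.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(labels_by_disease, scores_by_disease):
    """Unweighted mean of per-disease AUCs (assumed averaging scheme)."""
    per_disease = {
        disease: auc(labels_by_disease[disease], scores_by_disease[disease])
        for disease in labels_by_disease
    }
    return sum(per_disease.values()) / len(per_disease), per_disease


# Toy data for illustration only; not results from the paper.
toy_labels = {"cardiomegaly": [1, 0, 1, 0], "edema": [1, 1, 0, 0]}
toy_scores = {"cardiomegaly": [0.9, 0.2, 0.4, 0.6], "edema": [0.8, 0.7, 0.3, 0.1]}

mean_auc, per_disease = macro_auc(toy_labels, toy_scores)
print(per_disease, mean_auc)
```

A macro mean weights every disease equally regardless of how often it occurs, which is why a rare class with a weak AUC can pull the average down noticeably.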

Abstract

Automated chest radiograph interpretation requires both accurate disease classification and detailed radiology report generation, presenting a significant challenge in the clinical workflow. Current approaches either focus on classification accuracy at the expense of interpretability or generate detailed but potentially unreliable reports through image captioning techniques. In this study, we present RadAlign, a novel framework that combines the predictive accuracy of vision-language models (VLMs) with the reasoning capabilities of large language models (LLMs). Inspired by the radiologist's workflow, RadAlign first employs a specialized VLM to align visual features with key medical concepts, achieving superior disease classification with an average AUC of 0.885 across multiple diseases. These recognized medical conditions, represented as text-based concepts in the aligned…
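The two-stage idea in the abstract (align image features with medical concepts, then hand the recognized concepts to a language model as text) can be illustrated with a minimal sketch. Everything below is hypothetical: the function names, the cosine-similarity scoring, the 0.5 threshold, the fallback wording, and the toy three-dimensional embeddings are assumptions for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recognize_concepts(image_vec, concept_vecs, threshold=0.5):
    """Return concept names whose embedding is close to the image embedding.

    Assumes image and concept embeddings already live in a shared space;
    the threshold is an arbitrary illustrative value.
    """
    sims = {name: cosine(image_vec, vec) for name, vec in concept_vecs.items()}
    kept = [name for name, sim in sims.items() if sim >= threshold]
    return sorted(kept, key=lambda name: -sims[name])


def build_prompt(concepts):
    """Turn recognized concepts into a text prompt for a language model."""
    findings = ", ".join(concepts) if concepts else "no acute findings"
    return f"Write a chest X-ray report consistent with these recognized concepts: {findings}."


# Toy embeddings for illustration only.
toy_concepts = {
    "pleural effusion": [1.0, 0.0, 0.0],
    "cardiomegaly": [0.0, 1.0, 0.0],
    "pneumothorax": [0.0, 0.0, 1.0],
}
toy_image = [0.9, 0.4, 0.0]

recognized = recognize_concepts(toy_image, toy_concepts)
prompt = build_prompt(recognized)
print(prompt)
```

Passing concepts to the language model as plain text keeps the intermediate step inspectable: a reader can check which conditions were recognized before any report text is generated.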

Code & Models

Repositories

difeigu/radalign
Official

Taxonomy

Methods: ALIGN · Focus