MedRepBench: A Comprehensive Benchmark for Medical Report Interpretation

Fangxin Shang; Yuan Xia; Dalu Yang; Yahui Wang; Binglin Yang

arXiv:2508.16674·cs.CV·August 26, 2025

TL;DR

MedRepBench is a comprehensive benchmark for evaluating models that interpret medical reports. It focuses on structured understanding, draws on diverse real-world Chinese reports, and supports multiple evaluation protocols.

Contribution

Introduces MedRepBench, a large-scale, multi-faceted benchmark for assessing vision-language models in medical report understanding, including objective and subjective evaluation methods.

Findings

01

Reward optimization improved VLM recall by 6%

02

OCR+LLM pipeline shows layout-blindness and latency issues

03

Benchmark covers diverse clinical reports and evaluation protocols
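
The recall figure in the first finding can be made concrete with a small sketch. This summary does not say how the benchmark computes recall, so everything here is an assumption for illustration: the function name `field_recall`, the representation of a report as a name-to-value mapping, the exact-match rule, and the sample lab values are all hypothetical and not taken from the paper.

```python
from typing import Dict


def field_recall(gold: Dict[str, str], pred: Dict[str, str]) -> float:
    """Fraction of gold fields whose value the prediction reproduces exactly.

    Assumption: a field counts as recalled only if its name is present in
    the prediction and the whitespace-stripped values are identical.
    """
    if not gold:
        return 0.0
    hits = sum(
        1
        for name, value in gold.items()
        if name in pred and pred[name].strip() == value.strip()
    )
    return hits / len(gold)


# Hypothetical structured items from one lab report (invented values).
gold = {
    "hemoglobin": "132 g/L",
    "wbc": "6.1 x10^9/L",
    "platelets": "210 x10^9/L",
    "glucose": "5.4 mmol/L",
}
# Hypothetical model output: one value misread, one field missing.
pred = {
    "hemoglobin": "132 g/L",
    "wbc": "6.1 x10^9/L",
    "platelets": "201 x10^9/L",
}

recall = field_recall(gold, pred)
print(f"recall = {recall:.2f}")
```

Under these assumptions, two of the four gold fields are recovered, so the recall is 0.50. A real scorer would likely need value normalization (units, number formatting) rather than strict string equality.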

Abstract

Medical report interpretation plays a crucial role in healthcare, enabling both patient-facing explanations and effective information flow across clinical systems. While recent vision-language models (VLMs) and large language models (LLMs) have demonstrated general document understanding capabilities, there remains a lack of standardized benchmarks to assess structured interpretation quality in medical reports. We introduce MedRepBench, a comprehensive benchmark built from 1,900 de-identified real-world Chinese medical reports spanning diverse departments, patient demographics, and acquisition formats. The benchmark is designed primarily to evaluate end-to-end VLMs for structured medical report understanding. To enable controlled comparisons, we also include a text-only evaluation setting using high-quality OCR outputs combined with LLMs, allowing us to estimate the upper-bound…
