MedLayBench-V: A Large-Scale Benchmark for Expert-Lay Semantic Alignment in Medical Vision Language Models
Han Jang, Junhyeok Lee, Heeseong Eum, Kyu Sung Choi

TL;DR
MedLayBench-V is a large-scale multimodal benchmark designed to improve medical vision-language models' ability to communicate diagnostic findings in lay language, addressing a critical resource gap.
Contribution
We introduce MedLayBench-V, the first large-scale benchmark for expert-lay semantic alignment in medical vision-language models, constructed via a novel Structured Concept-Grounded Refinement (SCGR) pipeline.
Findings
The dataset enforces strict semantic equivalence between expert and lay text using Unified Medical Language System (UMLS) Concept Unique Identifiers (CUIs).
It provides a verified foundation for training and evaluating lay-accessible medical models.
Addresses the lack of large-scale benchmarks for expert-lay medical image understanding.
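
To make the concept-level equivalence in the findings above concrete, here is a minimal sketch of one way such a check could work: map each caption to a set of concept identifiers and require the expert and lay sets to match. This is not the authors' SCGR pipeline. The lexicon, the substring matcher, the function names, and the `C-DEMO-*` identifiers are all hypothetical placeholders; they are not real UMLS CUIs, and a real system would use a proper UMLS concept extractor.

```python
# Toy lexicon: surface term -> placeholder concept ID.
# ASSUMPTION: these IDs are invented for illustration, NOT real UMLS CUIs.
TOY_LEXICON = {
    "myocardial infarction": "C-DEMO-001",
    "heart attack": "C-DEMO-001",
    "pulmonary edema": "C-DEMO-002",
    "fluid in the lungs": "C-DEMO-002",
    "pneumothorax": "C-DEMO-003",
    "collapsed lung": "C-DEMO-003",
}


def extract_concepts(text, lexicon=TOY_LEXICON):
    """Return the set of concept IDs whose surface term occurs in the text.

    Naive case-insensitive substring matching stands in for a real
    concept extractor.
    """
    lowered = text.lower()
    return {cui for term, cui in lexicon.items() if term in lowered}


def concept_alignment(expert, lay, lexicon=TOY_LEXICON):
    """Compare the concept sets of an expert caption and a lay caption."""
    expert_ids = extract_concepts(expert, lexicon)
    lay_ids = extract_concepts(lay, lexicon)
    return {
        "equivalent": expert_ids == lay_ids,
        "missing_in_lay": expert_ids - lay_ids,  # concepts dropped by the lay text
        "added_in_lay": lay_ids - expert_ids,    # concepts the lay text introduced
    }


expert = "Chest radiograph shows pulmonary edema and a small pneumothorax."
lay_ok = "The chest X-ray shows fluid in the lungs and a small collapsed lung."
lay_bad = "The chest X-ray shows fluid in the lungs."

result_ok = concept_alignment(expert, lay_ok)
result_bad = concept_alignment(expert, lay_bad)
```

Under these assumptions, `result_ok["equivalent"]` is true because both captions map to the same two placeholder concepts, while `result_bad["missing_in_lay"]` flags the concept that the shortened lay caption dropped. Comparing sets rather than strings is the point of the design: the wording can change freely as long as the underlying concepts are preserved.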
Abstract
Medical Vision-Language Models (Med-VLMs) have achieved expert-level proficiency in interpreting diagnostic imaging. However, current models are predominantly trained on professional literature, limiting their ability to communicate findings in the lay register required for patient-centered care. While text-centric research has actively developed resources for simplifying medical jargon, there is a critical absence of large-scale multimodal benchmarks designed to facilitate lay-accessible medical image understanding. To bridge this resource gap, we introduce MedLayBench-V, the first large-scale multimodal benchmark dedicated to expert-lay semantic alignment. Unlike naive simplification approaches that risk hallucination, our dataset is constructed via a Structured Concept-Grounded Refinement (SCGR) pipeline. This method enforces strict semantic equivalence by integrating Unified Medical…
