MultiMedEdit: A Scenario-Aware Benchmark for Evaluating Knowledge Editing in Medical VQA
Shengtao Wen, Haodong Chen, Yadong Wang, Zhongying Pan, Xiang Chen, Yu Tian, Bo Qian, Dong Liang, Sheng-Jun Huang

TL;DR
MultiMedEdit introduces a comprehensive benchmark for evaluating knowledge editing in multimodal medical visual question answering, addressing the unique challenges of integrating updated knowledge with visual reasoning in clinical scenarios.
Contribution
It is the first benchmark specifically designed for knowledge editing in clinical multimodal tasks, including a new metric suite and extensive experimental analysis.
Findings
Current methods struggle with generalization and long-tail reasoning.
Significant practical trade-offs exist in edit latency and memory footprint.
The benchmark reveals limitations of existing approaches in complex clinical workflows.
Abstract
Knowledge editing (KE) provides a scalable approach for updating factual knowledge in large language models without full retraining. While previous studies have demonstrated effectiveness in general domains and medical QA tasks, little attention has been paid to KE in multimodal medical scenarios. Unlike text-only settings, medical KE demands integrating updated knowledge with visual reasoning to support safe and interpretable clinical decisions. To address this gap, we propose MultiMedEdit, the first benchmark tailored to evaluating KE in clinical multimodal tasks. Our framework spans both understanding and reasoning task types, defines a three-dimensional metric suite (reliability, generality, and locality), and supports cross-paradigm comparisons across general and domain-specific models. We conduct extensive experiments under single-editing and lifelong-editing settings. Results…
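The abstract names three metric dimensions but this excerpt does not give their formulas. The sketch below uses the definitions that are common in the knowledge-editing literature, which may differ from the paper's exact ones: reliability is whether the edited model returns the new target on the edited prompt, generality is the share of rephrased prompts that also return the target, and locality is the share of unrelated prompts whose answer is unchanged by the edit. The names `EditCase`, `evaluate_edit`, the exact-match scoring, and the toy lookup-table "models" are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a VQA model is any callable mapping (image_id, question) to an answer string.
Model = Callable[[str, str], str]


@dataclass
class EditCase:
    """One hypothetical edit request plus the probes used to score it."""
    image: str
    question: str
    target: str                                   # answer the edit should install
    rephrased: List[str] = field(default_factory=list)            # paraphrases of the question
    unrelated: List[Tuple[str, str]] = field(default_factory=list)  # (image, question) pairs that must not change


def _norm(text: str) -> str:
    # Assumption: case-insensitive exact match; real benchmarks may score differently.
    return text.strip().lower()


def _mean(hits: List[bool]) -> float:
    return sum(hits) / len(hits) if hits else 0.0


def evaluate_edit(pre: Model, post: Model, case: EditCase) -> Dict[str, float]:
    """Score a single edit along reliability, generality, and locality."""
    target = _norm(case.target)
    reliability = float(_norm(post(case.image, case.question)) == target)
    generality = _mean([_norm(post(case.image, q)) == target for q in case.rephrased])
    locality = _mean([_norm(post(i, q)) == _norm(pre(i, q)) for i, q in case.unrelated])
    return {"reliability": reliability, "generality": generality, "locality": locality}


def table_model(table: Dict[Tuple[str, str], str]) -> Model:
    """Toy stand-in for a model: looks answers up in a dictionary."""
    return lambda image, question: table.get((image, question), "")


# Made-up toy data, not taken from the benchmark.
pre_edit = table_model({
    ("img_1", "What does the opacity indicate?"): "finding A",
    ("img_2", "Which organ is shown?"): "lung",
})
post_edit = table_model({
    ("img_1", "What does the opacity indicate?"): "finding B",   # edit applied
    ("img_1", "What is the opacity a sign of?"): "finding B",    # paraphrase follows the edit
    ("img_1", "The opacity suggests what?"): "finding A",        # paraphrase misses the edit
    ("img_2", "Which organ is shown?"): "lung",                  # unrelated answer preserved
})
case = EditCase(
    image="img_1",
    question="What does the opacity indicate?",
    target="finding B",
    rephrased=["What is the opacity a sign of?", "The opacity suggests what?"],
    unrelated=[("img_2", "Which organ is shown?")],
)
scores = evaluate_edit(pre_edit, post_edit, case)
print(scores)
```

Under the same assumptions, a lifelong-editing run would apply edits sequentially and re-score earlier cases after each new edit, whereas single editing scores each case against a freshly restored model.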
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Multimodal Machine Learning Applications
