RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning

Congyun Jin; Ming Zhang; Xiaowei Ma; Li Yujiao; Yingbo Wang; Yabo Jia; Yuliang Du; Tao Sun; Haowen Wang; Cong Fan; Jinjie Gu; Chenfei Chi; Xiangguo Lv; Fangzhou Li; Wei Xue; Yiran Huang

arXiv:2402.14840 · cs.CL · February 26, 2024 · 1 citation

Open Access

TL;DR

RJUA-MedDQA introduces a challenging multimodal benchmark for medical document question answering and clinical reasoning, emphasizing complex interpretation, numerical reasoning, and clinical inference, supported by an efficient annotation method and extensive evaluations.

Contribution

The paper presents a new comprehensive benchmark for medical document understanding, along with the ESRA annotation method that improves efficiency and accuracy, and evaluates current LMMs' capabilities and limitations.

Findings

01

Existing LMMs have limited overall performance.

02

LMMs are more robust to low-quality images than LLMs.

03

Reasoning across text and images remains challenging.

Abstract

Recent advancements in Large Language Models (LLMs) and Large Multi-modal Models (LMMs) have shown potential in various medical applications, such as Intelligent Medical Diagnosis. Although impressive results have been achieved, we find that existing benchmarks do not reflect the complexity of real medical reports or the specialized in-depth reasoning they require. In this work, we introduce RJUA-MedDQA, a comprehensive benchmark in the field of medical specialization, which poses several challenges: comprehensively interpreting image content across diverse, challenging layouts; applying numerical reasoning to identify abnormal indicators; and demonstrating clinical reasoning to provide statements of disease diagnosis, status, and advice based on medical contexts. We carefully design the data generation pipeline and propose the Efficient Structural Restoration Annotation…

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training