ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis
Xu Wang, Jiaju Kang, Puyu Han, Yubao Zhao, Qian Liu, Liwenfei He, Lingqiong Zhang, Lingyun Dai, Yongcheng Wang, Jie Tao

TL;DR
ECG-Expert-QA is a large, high-quality multimodal dataset designed to evaluate and develop advanced AI models for ECG interpretation, including diagnostic accuracy and conversational reasoning in clinical scenarios.
Contribution
The paper introduces a comprehensive ECG dataset with multi-turn dialogue support, combining real and synthetic data for realistic AI evaluation in heart disease diagnosis.
Findings
Supports multi-turn dialogues for clinical reasoning
Includes diverse diagnostic tasks and rare conditions
Ensures high data quality with strict validation
Abstract
We present ECG-Expert-QA, a comprehensive multimodal dataset for evaluating diagnostic capabilities in electrocardiogram (ECG) interpretation. It combines real-world clinical ECG data with systematically generated synthetic cases, covering 12 essential diagnostic tasks and totaling 47,211 expert-validated QA pairs. These encompass diverse clinical scenarios, from basic rhythm recognition to complex diagnoses involving rare conditions and temporal changes. A key innovation is the support for multi-turn dialogues, enabling the development of conversational medical AI systems that emulate clinician-patient or interprofessional interactions. This allows for more realistic assessment of AI models' clinical reasoning, diagnostic accuracy, and knowledge integration. Constructed through a knowledge-guided framework with strict quality control, ECG-Expert-QA ensures linguistic and clinical…
Taxonomy
Topics: ECG Monitoring and Analysis · Atrial Fibrillation Management and Outcomes · Machine Learning in Healthcare
