MeDiaQA: A Question Answering Dataset on Medical Dialogues
Huqun Suri, Qi Zhang, Wenhua Huo, Yan Liu, Chunsheng Guan

TL;DR
MeDiaQA is a new large-scale medical dialogue question answering dataset designed to evaluate reasoning and understanding in multi-turn medical conversations; its baseline model leaves significant room for improvement.
Contribution
The paper introduces MeDiaQA, the first dataset for reasoning over medical dialogues, and proposes MeDia-BERT, a baseline model for this challenging task.
Findings
MeDiaQA contains 22k questions from 11k dialogues across 150 specialties.
MeDia-BERT achieves 64.3% accuracy, below human performance of 93%.
The dataset enables testing of reasoning and understanding in medical dialogue QA.
Abstract
In this paper, we introduce MeDiaQA, a novel question answering (QA) dataset constructed from real online medical dialogues. It contains 22k multiple-choice questions annotated by humans for over 11k dialogues with 120k utterances between patients and doctors, covering 150 disease specialties, collected from haodf.com and dxy.com. MeDiaQA is the first QA dataset that requires reasoning over medical dialogues, especially their quantitative content. The dataset has the potential to test the computing, reasoning, and understanding abilities of models across multi-turn dialogues, which is challenging compared with existing datasets. To address these challenges, we design MeDia-BERT, which achieves 64.3% accuracy, while human performance is 93% accuracy, indicating that there remains considerable room for improvement.
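To make the task format concrete, the following is a minimal sketch of how one multiple-choice question over a patient-doctor dialogue might be represented and scored by accuracy. It is an illustration only: the class names, field names, speaker labels, and the toy dialogue are assumptions invented here, not the dataset's actual schema or content, and the scoring function is plain accuracy rather than anything specific to MeDia-BERT.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    speaker: str  # assumed labels: "patient" or "doctor"
    text: str


@dataclass
class MultipleChoiceItem:
    # Hypothetical record layout; the real MeDiaQA schema may differ.
    dialogue: List[Utterance]
    question: str
    options: List[str]
    answer_index: int  # position of the correct choice in `options`


def accuracy(items: List[MultipleChoiceItem],
             predict: Callable[[MultipleChoiceItem], int]) -> float:
    """Fraction of items for which `predict` returns the correct option index."""
    if not items:
        raise ValueError("no items to evaluate")
    correct = sum(1 for item in items if predict(item) == item.answer_index)
    return correct / len(items)


# Invented toy example of a quantitative question (not taken from the dataset).
toy_items = [
    MultipleChoiceItem(
        dialogue=[
            Utterance("patient", "How should I take the tablets?"),
            Utterance("doctor", "Take 2 tablets, three times a day."),
        ],
        question="How many tablets does the patient take per day?",
        options=["3", "5", "6", "8"],
        answer_index=2,
    )
]


def first_option_baseline(item: MultipleChoiceItem) -> int:
    # Trivial baseline that always picks the first option.
    return 0


if __name__ == "__main__":
    print(accuracy(toy_items, first_option_baseline))
```

Answering the toy question requires combining two quantities stated in the dialogue, which is the kind of computation over quantitative content the abstract highlights. Any model can be plugged in as `predict`, provided it returns an option index.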
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
