What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams

Di Jin; Eileen Pan; Nassim Oufattole; Wei-Hung Weng; Hanyi Fang; Peter Szolovits

arXiv:2009.13081·cs.CL·September 29, 2020·62 cites

Open Access 3 Repos 10 Models 5 Datasets

TL;DR

This paper introduces MedQA, a large-scale, multilingual open-domain medical question answering dataset from professional exams, highlighting the current performance limitations of existing models and encouraging future advancements.

Contribution

It provides the first free-form multiple-choice medical OpenQA dataset from professional exams in three languages, facilitating research and development of more advanced models.

Findings

01

Even the best method reaches only 36.7% test accuracy on the English questions, 42.0% on traditional Chinese, and 70.1% on simplified Chinese.

02

MedQA covers three languages: English, simplified Chinese, and traditional Chinese, with 12,723, 34,251, and 14,123 questions, respectively.

03

The dataset presents significant challenges for existing OpenQA systems.

Abstract

Open domain question answering (OpenQA) tasks have been recently attracting more and more attention from the natural language processing (NLP) community. In this work, we present the first free-form multiple-choice OpenQA dataset for solving medical problems, MedQA, collected from the professional medical board exams. It covers three languages: English, simplified Chinese, and traditional Chinese, and contains 12,723, 34,251, and 14,123 questions for the three languages, respectively. We implement both rule-based and popular neural methods by sequentially combining a document retriever and a machine comprehension model. Through experiments, we find that even the current best method can only achieve 36.7%, 42.0%, and 70.1% of test accuracy on the English, traditional Chinese, and simplified Chinese questions, respectively. We expect MedQA to present great challenges to existing OpenQA…
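The abstract describes a two-stage pipeline: a document retriever followed by a machine comprehension model. The sketch below illustrates only the general retrieve-then-score shape for a multiple-choice question. It is not the authors' implementation: the function names (`idf_weights`, `retrieve`, `answer`), the IDF-overlap scoring, and the three-sentence corpus are assumptions made up for illustration, and a plain lexical scorer stands in for the neural reader.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_weights(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((n + 1) / (count + 1)) + 1.0 for tok, count in df.items()}


def score(query, doc, idf):
    """Sum of IDF weights of the distinct query tokens that appear in doc."""
    doc_tokens = set(tokenize(doc))
    return sum(idf.get(tok, 0.0) for tok in set(tokenize(query)) if tok in doc_tokens)


def retrieve(query, docs, idf, k=1):
    """Stage 1 (retriever): return the k documents that best match the query."""
    return sorted(docs, key=lambda d: score(query, d, idf), reverse=True)[:k]


def answer(question, options, docs, k=1):
    """Stage 2 (stand-in reader): choose the option best supported by its evidence."""
    idf = idf_weights(docs)
    best_option, best_score = None, float("-inf")
    for option in options:
        query = f"{question} {option}"
        evidence = " ".join(retrieve(query, docs, idf, k))
        s = score(query, evidence, idf)
        if s > best_score:
            best_option, best_score = option, s
    return best_option


# Toy corpus and question, invented for illustration only (not taken from MedQA).
DOCS = [
    "Iron deficiency anemia presents with fatigue, pallor, and low ferritin.",
    "Vitamin B12 deficiency causes megaloblastic anemia and peripheral neuropathy.",
    "Hypothyroidism presents with fatigue, weight gain, and cold intolerance.",
]
QUESTION = "A patient has fatigue, pallor, and low ferritin. What is the most likely diagnosis?"
OPTIONS = ["Iron deficiency anemia", "Vitamin B12 deficiency", "Hypothyroidism"]

prediction = answer(QUESTION, OPTIONS, DOCS)
print(prediction)
```

Each option is appended to the question to form its own query, so the retriever can surface different evidence per option. A real system would replace the overlap score in `answer` with a trained reader over a much larger document collection.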

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining