Development of an Extractive Clinical Question Answering Dataset with Multi-Answer and Multi-Focus Questions

Sungrim Moon; Huan He; Hongfang Liu; Jungwei W. Fan

arXiv:2201.02517 · cs.CL · June 27, 2023 · 5 cites

Open Access

TL;DR

This paper introduces RxWhyQA, a large clinical question-answering dataset with multi-answer and multi-focus questions, to advance NLP systems in handling complex, realistic clinical inquiries.

Contribution

The creation of the RxWhyQA dataset, which incorporates complex multi-answer and multi-focus questions derived from annotated clinical relations and fills a gap in existing datasets for clinical EQA.

Findings

01

Baseline model achieved 0.72 F1 on the dataset.

02

25% of questions require multiple answers.

03

90% of the drug and reason terms occur within the same or an adjacent sentence.

Abstract

Background: Extractive question-answering (EQA) is a useful natural language processing (NLP) application for answering patient-specific questions by locating answers in their clinical notes. Realistic clinical EQA can have multiple answers to a single question and multiple focus points in one question, which are lacking in the existing datasets for development of artificial intelligence solutions.

Objective: Create a dataset for developing and evaluating clinical EQA systems that can handle natural multi-answer and multi-focus questions.

Methods: We leveraged the annotated relations from the 2018 National NLP Clinical Challenges (n2c2) corpus to generate an EQA dataset. Specifically, the 1-to-N, M-to-1, and M-to-N drug-reason relations were included to form the multi-answer and multi-focus QA entries, which represent more complex and natural challenges in addition to the basic…
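The relation cardinalities named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the grouping rule (connected components over drug-reason pairs), the question template, the function names, and the sample pairs are all assumptions made for illustration, and answers are shown as plain strings rather than spans located in a clinical note.

```python
from collections import defaultdict


def group_relations(pairs):
    """Group (drug, reason) pairs into connected components.

    Assumption: drugs and reasons linked through any chain of relations
    belong to the same QA entry. The paper may group them differently.
    """
    adj = defaultdict(set)
    for drug, reason in pairs:
        d, r = ("drug", drug), ("reason", reason)  # tag nodes to avoid name clashes
        adj[d].add(r)
        adj[r].add(d)

    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        drugs, reasons = set(), set()
        while stack:  # depth-first walk over one component
            node = stack.pop()
            kind, text = node
            (drugs if kind == "drug" else reasons).add(text)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append((sorted(drugs), sorted(reasons)))
    return groups


def relation_type(drugs, reasons):
    """Label a group as 1-to-1, 1-to-N, M-to-1, or M-to-N."""
    m = "1" if len(drugs) == 1 else "M"
    n = "1" if len(reasons) == 1 else "N"
    return f"{m}-to-{n}"


def to_qa_entry(drugs, reasons):
    """Build one QA entry; the question wording is a hypothetical template."""
    question = f"Why was the patient prescribed {' and '.join(drugs)}?"
    return {
        "question": question,             # multi-focus when several drugs are named
        "answers": reasons,               # multi-answer when several reasons apply
        "type": relation_type(drugs, reasons),
    }


# Invented sample relations, not taken from the n2c2 corpus.
pairs = [
    ("lisinopril", "hypertension"),
    ("tylenol", "pain"),
    ("tylenol", "fever"),
    ("lasix", "edema"),
    ("aldactone", "edema"),
]
entries = [to_qa_entry(d, r) for d, r in group_relations(pairs)]
for entry in entries:
    print(entry["type"], "|", entry["question"], "|", entry["answers"])
```

In this sketch a question naming more than one drug is multi-focus (the M side) and a question with more than one reason is multi-answer (the N side); a group with both would be labeled M-to-N.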

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies