Improving Health Question Answering with Reliable and Time-Aware Evidence Retrieval

Juraj Vladika; Florian Matthes

arXiv:2404.08359 · cs.CL · April 15, 2024 · 2 cites

TL;DR

This paper enhances health question answering by optimizing evidence retrieval from large medical databases, emphasizing recent and highly cited documents to improve accuracy in an open-domain setting.

Contribution

It introduces a retrieval strategy that improves health QA performance by focusing on recent and highly cited evidence, addressing limitations of pre-selected evidence approaches.

Findings

01. Reducing the number of retrieved documents improves the macro F1 score by up to 10%.

02. Favoring recent and highly cited articles improves answer accuracy.

03. Retrieval settings have a significant effect on the performance of the QA pipeline.

Abstract

In today's digital world, seeking answers to health questions on the Internet is a common practice. However, existing question answering (QA) systems often rely on using pre-selected and annotated evidence documents, thus making them inadequate for addressing novel questions. Our study focuses on the open-domain QA setting, where the key challenge is to first uncover relevant evidence in large knowledge bases. By utilizing the common retrieve-then-read QA pipeline and PubMed as a trustworthy collection of medical research documents, we answer health questions from three diverse datasets. We modify different retrieval settings to observe their influence on the QA pipeline's performance, including the number of retrieved documents, sentence selection process, the publication year of articles, and their number of citations. Our results reveal that cutting down on the amount of retrieved…
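The retrieval settings named in the abstract (number of retrieved documents, publication year, citation count) can be pictured as a filter-then-truncate step between the retriever and the reader. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Doc` class, the `select_evidence` function, the field names, and every threshold and score shown are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical retrieved document; field names are illustrative only."""
    doc_id: str
    score: float    # relevance score from any retriever (assumed, higher is better)
    year: int       # publication year
    citations: int  # citation count


def select_evidence(docs: List[Doc], k: int,
                    min_year: Optional[int] = None,
                    min_citations: int = 0) -> List[Doc]:
    """Drop documents that are too old or too rarely cited,
    then keep the k highest-scoring survivors."""
    kept = [
        d for d in docs
        if (min_year is None or d.year >= min_year)
        and d.citations >= min_citations
    ]
    return sorted(kept, key=lambda d: d.score, reverse=True)[:k]


# Made-up candidates; none of these values come from the paper.
docs = [
    Doc("a", 0.91, 2005, 300),
    Doc("b", 0.88, 2021, 12),
    Doc("c", 0.75, 2019, 150),
    Doc("d", 0.60, 2022, 2),
]

top = select_evidence(docs, k=2, min_year=2015, min_citations=10)
print([d.doc_id for d in top])
```

With these made-up values, the highest-scoring document is excluded because of its age, which shows how a recency constraint can change the evidence passed to the reader. Whether a hard filter or a soft re-ranking is the better fit is a design choice this sketch does not settle.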

Code & Models

Repositories

jvladika/improving-health-qa (Official)

Videos

Improving Health Question Answering with Reliable and Time-Aware Evidence Retrieval · Underline

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Expert Finding and Q&A Systems