Open Domain Question Answering with Conflicting Contexts

Siyi Liu; Qiang Ning; Kishaloy Halder; Wei Xiao; Zheng Qi; Phu Mon Htut; Yi Zhang; Neha Anna John; Bonan Min; Yassine Benajiba; Dan Roth

arXiv:2410.12311·cs.CL·April 29, 2025

TL;DR

This paper highlights the challenge of conflicting information in open domain question answering, introduces a new dataset to evaluate this issue, and explores finetuning LLMs to better reason with conflicting contexts.

Contribution

The paper presents QACC, a human-annotated dataset of questions with conflicting contexts, benchmarks three LLMs on it, and proposes finetuning models with explanations to improve their reasoning.

Findings

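01. As much as 25% of unambiguous open domain questions lead to conflicting contexts when retrieved using Google Search.

02. The three benchmarked LLMs show limitations in answering questions with conflicting information.

03. Finetuning LLMs with explanations improves their reasoning over conflicting contexts.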

Abstract

Open domain question answering systems frequently rely on information retrieved from large collections of text (such as the Web) to answer questions. However, such collections of text often contain conflicting information, and indiscriminately depending on this information may result in untruthful and inaccurate answers. To understand the gravity of this problem, we collect a human-annotated dataset, Question Answering with Conflicting Contexts (QACC), and find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search. We evaluate and benchmark three powerful Large Language Models (LLMs) with our dataset QACC and demonstrate their limitations in effectively addressing questions with conflicting information. To explore how humans reason through conflicting contexts, we request our annotators to provide explanations for…
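
The conflict rate quoted in the abstract is a fraction of questions whose retrieved contexts disagree. A minimal sketch of how such a rate could be computed follows. It assumes each retrieved context has already been annotated with the answer it supports. The function names, the string normalization, and the toy data are all hypothetical. They are not taken from QACC or from the authors' annotation pipeline.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(answer.lower().split())


def has_conflict(context_answers: list[str]) -> bool:
    """Return True when the contexts support more than one distinct answer."""
    distinct = {normalize(a) for a in context_answers if a.strip()}
    return len(distinct) > 1


def conflict_rate(questions: dict[str, list[str]]) -> float:
    """Fraction of questions whose per-context answers disagree."""
    if not questions:
        return 0.0
    conflicted = sum(has_conflict(answers) for answers in questions.values())
    return conflicted / len(questions)


# Made-up toy annotations (NOT drawn from QACC): question id -> answer per context.
toy = {
    "q1": ["Paris", "paris", "Paris"],
    "q2": ["1912", "1913"],
    "q3": ["blue"],
    "q4": ["42", " 42 "],
    "q5": ["yes", "no"],
}
rate = conflict_rate(toy)
print(f"{rate:.0%} of toy questions have conflicting contexts")
```

Exact string matching after normalization is a deliberate simplification. Human annotators can judge whether two differently worded answers actually conflict, which this sketch cannot.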

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems

Methods: Gravity