CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering

Zongxi Li; Yang Li; Haoran Xie; S. Joe Qin

arXiv:2502.01523 · cs.CL · September 12, 2025 · 2 cites

TL;DR

This paper introduces CondAmbigQA, a benchmark for evaluating how well language models handle ambiguous questions when contextual conditions are considered explicitly. Models that consider conditions answer more accurately, and their responses are less likely to be perceived as hallucinations.

Contribution

It presents a new dataset and evaluation framework for conditional ambiguous question answering, emphasizing the importance of explicit conditions in resolving query ambiguity.

Findings

01 Models that consider conditions before answering improve answer accuracy by 11.75%.

02 Providing conditions explicitly yields an additional 7.15% gain.

03 Addressing ambiguity reduces perceived hallucinations in LLM responses.

Abstract

Users often assume that large language models (LLMs) share their cognitive alignment of context and intent, leading them to omit critical information in question-answering (QA) and produce ambiguous queries. Responses based on misaligned assumptions may be perceived as hallucinations. Therefore, identifying possible implicit assumptions is crucial in QA. To address this fundamental challenge, we propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 2,000 ambiguous queries and condition-aware evaluation metrics. Our study pioneers "conditions" as explicit contextual constraints that resolve ambiguities in QA tasks through retrieval-based annotation, where retrieved Wikipedia fragments help identify possible interpretations for a given query and annotate answers accordingly. Experiments demonstrate that models considering conditions before answering…
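To make the idea of "conditions" as explicit contextual constraints more concrete, the following is a minimal sketch of how one ambiguous query with several condition-dependent answers might be represented. The class names, field names, and the example question are hypothetical illustrations; they are not the dataset's actual schema and the example is not drawn from CondAmbigQA.

```python
from dataclasses import dataclass, field


@dataclass
class ConditionedAnswer:
    """One interpretation of an ambiguous query (hypothetical schema)."""
    condition: str  # explicit contextual constraint that disambiguates the query
    answer: str     # answer that holds under that condition
    evidence: list[str] = field(default_factory=list)  # retrieved fragments, if any


@dataclass
class AmbiguousQuery:
    """An ambiguous question paired with its condition-dependent answers."""
    question: str
    interpretations: list[ConditionedAnswer]

    def answer_under(self, keyword: str) -> list[str]:
        """Return answers whose condition mentions the keyword (case-insensitive)."""
        key = keyword.lower()
        return [ca.answer for ca in self.interpretations if key in ca.condition.lower()]


# Illustrative record only; not taken from the benchmark.
query = AmbiguousQuery(
    question="Who won the World Cup in 2018?",
    interpretations=[
        ConditionedAnswer(
            condition="The question refers to the FIFA men's football World Cup.",
            answer="France",
        ),
        ConditionedAnswer(
            condition="The question refers to the men's field hockey World Cup.",
            answer="Belgium",
        ),
    ],
)

print(query.answer_under("FIFA"))
print(query.answer_under("hockey"))
```

Under this assumed layout, a model that states which condition it is answering under can be checked against the matching interpretation rather than against a single gold answer.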

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems