CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering
Zongxi Li, Yang Li, Haoran Xie, S. Joe Qin

TL;DR
This paper introduces CondAmbigQA, a benchmark for evaluating how well language models handle ambiguous questions. It shows that explicitly considering contextual conditions improves answer accuracy and reduces perceived hallucinations.
Contribution
It presents a new dataset and evaluation framework for conditional ambiguous question answering, emphasizing the importance of explicit conditions in resolving query ambiguity.
Findings
Models that consider conditions before answering improve answer accuracy by 11.75%.
Providing the conditions explicitly yields an additional 7.15% gain.
Addressing ambiguity reduces perceived hallucinations in LLM responses.
Abstract
Users often assume that large language models (LLMs) share their cognitive alignment of context and intent, leading them to omit critical information in question-answering (QA) and produce ambiguous queries. Responses based on misaligned assumptions may be perceived as hallucinations. Therefore, identifying possible implicit assumptions is crucial in QA. To address this fundamental challenge, we propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 2,000 ambiguous queries and condition-aware evaluation metrics. Our study pioneers "conditions" as explicit contextual constraints that resolve ambiguities in QA tasks through retrieval-based annotation, where retrieved Wikipedia fragments help identify possible interpretations for a given query and annotate answers accordingly. Experiments demonstrate that models considering conditions before answering…
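To make the idea of "conditions" concrete, here is a minimal sketch of how a condition-annotated record could be represented. The class names, field names, and the example question are all hypothetical illustrations and are not taken from the CondAmbigQA dataset or its released schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConditionedAnswer:
    """One interpretation of an ambiguous query (hypothetical schema)."""
    condition: str  # explicit contextual constraint that fixes the interpretation
    answer: str     # answer that holds under that condition
    evidence: List[str] = field(default_factory=list)  # supporting retrieved fragments


@dataclass
class AmbiguousQuery:
    """An ambiguous question with its condition-specific answers (hypothetical schema)."""
    question: str
    interpretations: List[ConditionedAnswer] = field(default_factory=list)

    def answer_under(self, condition: str) -> Optional[str]:
        """Return the answer annotated for an exact condition string, or None."""
        for item in self.interpretations:
            if item.condition == condition:
                return item.answer
        return None


# Illustrative example only; not a record from the benchmark.
example = AmbiguousQuery(
    question="What is the capital of Georgia?",
    interpretations=[
        ConditionedAnswer("Georgia refers to the country", "Tbilisi"),
        ConditionedAnswer("Georgia refers to the U.S. state", "Atlanta"),
    ],
)

print(example.answer_under("Georgia refers to the country"))
```

In this sketch the question alone has no single correct answer. Each answer becomes checkable only once it is paired with the condition under which it holds, which is the property a condition-aware evaluation would rely on.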
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
