CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models
Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua

TL;DR
CLAMBER introduces a comprehensive benchmark with a large dataset to evaluate and improve large language models' ability to identify and clarify ambiguous user queries, highlighting current limitations and guiding future research.
Contribution
This paper presents CLAMBER, a new benchmark and dataset for assessing LLMs' performance on ambiguity detection and clarification, addressing a gap in existing evaluation methods.
Findings
Current LLMs have limited ability to identify ambiguity in queries.
Chain-of-thought and few-shot prompting offer minimal improvements in ambiguity detection.
LLMs struggle to generate high-quality clarifying questions due to conflict resolution issues.
Abstract
Large language models (LLMs) are increasingly used to meet user information needs, but their effectiveness in dealing with user queries that contain various types of ambiguity remains unknown, ultimately risking user trust and satisfaction. To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy. Building upon the taxonomy, we construct ~12K high-quality data to assess the strengths, weaknesses, and potential risks of various off-the-shelf LLMs. Our findings indicate the limited practical utility of current LLMs in identifying and clarifying ambiguous user queries, even enhanced by chain-of-thought (CoT) and few-shot prompting. These techniques may result in overconfidence in LLMs and yield only marginal enhancements in identifying ambiguity. Furthermore, current LLMs fall short in generating high-quality clarifying questions due to a lack of…
Taxonomy
Topics: Topic Modeling · Data Quality and Management
