InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context
Bryan L. M. de Oliveira, Luana G. B. Martins, Bruno Brandão, and Luckeciano C. Melo

TL;DR
InfoQuest is a benchmark designed to evaluate dialogue agents' ability to handle ambiguous, open-ended requests by asking clarifying questions, revealing current limitations in models' information-seeking skills.
Contribution
The paper introduces a new multi-turn dialogue benchmark for assessing how models manage hidden context and ambiguous requests in open-ended conversations.
Findings
Proprietary models generally outperform open models but still struggle to gather critical information.
Models often require multiple turns to infer user intent.
Models often default to generic responses without proper clarification.
Abstract
Large language models excel at following explicit instructions, but they often struggle with ambiguous or incomplete user requests, defaulting to verbose, generic responses instead of seeking clarification. We introduce InfoQuest, a multi-turn chat benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests. This benchmark presents intentionally ambiguous scenarios that require models to engage in information-seeking dialogue by asking clarifying questions before providing appropriate responses. Our evaluation of both open and closed models reveals that, while proprietary models generally perform better, all current assistants struggle to gather critical information effectively. They often require multiple turns to infer user intent and frequently default to generic responses without proper clarification. We provide a systematic methodology for…
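The interaction pattern described in the abstract (an ambiguous request, context that stays hidden until the agent asks, and a final response) can be sketched as a small loop. This is only an illustrative toy, not the InfoQuest harness: `Scenario`, `simulated_user`, `run_dialogue`, the two agents, and the keyword-matching rule for revealing hidden facts are all hypothetical names and assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Message = Tuple[str, str]              # (speaker, text)
Agent = Callable[[List[Message]], str]


@dataclass
class Scenario:
    request: str                       # ambiguous opening message shown to the agent
    hidden_context: Dict[str, str]     # facts revealed only when the agent asks


def simulated_user(scenario: Scenario, agent_msg: str) -> Optional[str]:
    """Toy user (assumption): reveals a hidden fact if a question mentions its key.

    Returns None when the agent's message is not a question, which this
    sketch treats as the agent's final answer.
    """
    if not agent_msg.strip().endswith("?"):
        return None
    for key, value in scenario.hidden_context.items():
        if key in agent_msg.lower():
            return value
    return "I'm not sure what you mean."


def run_dialogue(scenario: Scenario, agent: Agent, max_turns: int = 5) -> dict:
    """Run one conversation and count clarifying questions before the final answer."""
    history: List[Message] = [("user", scenario.request)]
    questions = 0
    final = ""
    for _ in range(max_turns):
        final = agent(history)
        history.append(("agent", final))
        user_reply = simulated_user(scenario, final)
        if user_reply is None:         # agent committed to an answer
            break
        questions += 1
        history.append(("user", user_reply))
    return {"questions": questions, "final": final, "history": history}


def generic_agent(history: List[Message]) -> str:
    # Answers immediately, without asking anything.
    return "Here is a general guide to planning a trip."


def clarifying_agent(history: List[Message]) -> str:
    # Asks two questions, then answers using what the user revealed.
    user_msgs = [text for speaker, text in history if speaker == "user"]
    if len(user_msgs) == 1:
        return "What is your budget?"
    if len(user_msgs) == 2:
        return "What is your destination?"
    return f"Here is a plan for {user_msgs[2]} with a budget {user_msgs[1]}."


scenario = Scenario(
    request="Help me plan a trip.",
    hidden_context={"budget": "under $500", "destination": "Lisbon"},
)
generic_result = run_dialogue(scenario, generic_agent)
clarifying_result = run_dialogue(scenario, clarifying_agent)
print(generic_result["questions"], clarifying_result["questions"])
```

In this toy setup, an agent that answers right away never sees the hidden facts, while one that asks first can fold them into its response. A realistic evaluation would need a far richer user simulator and a scoring method for the final response, neither of which this sketch attempts.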
Taxonomy
Topics: Speech and dialogue systems · AI in Service Interactions · Topic Modeling
