YIELD: A Large-Scale Dataset and Evaluation Framework for Information Elicitation Agents
Victor De Lima, Grace Hui Yang

TL;DR
This paper introduces YIELD, a large-scale dataset and evaluation framework for developing and assessing information elicitation agents in institutional decision-making contexts, supported by experiments with foundation LLMs.
Contribution
The paper presents YIELD, a 26-million-token dataset, and formalizes information elicitation as a POMDP, enabling systematic research and improved alignment of language models.
Findings
Training on YIELD enhances model alignment with elicitation behavior.
Human evaluation supports the effectiveness of models trained on YIELD.
The dataset and tools are publicly available for research use.
Abstract
Most conversational agents (CAs) are designed to satisfy user needs through user-driven interactions. However, many real-world settings, such as academic interviewing, judicial proceedings, and journalistic investigations, involve broader institutional decision-making processes and require agents that can elicit information from users. In this paper, we introduce Information Elicitation Agents (IEAs), whose goal is to elicit information from users in support of the agent's institutional or task-oriented objectives. To enable systematic research on this setting, we present YIELD, a 26M-token dataset of 2,281 ethically sourced, human-to-human dialogues. Moreover, we formalize information elicitation as a finite-horizon POMDP and propose novel metrics tailored to IEAs. Pilot experiments on multiple foundation LLMs show that training on YIELD improves their alignment with real…
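For readers unfamiliar with the term, the block below sketches the textbook definition of a finite-horizon POMDP. It is not the paper's own formalization, which is not reproduced on this page. The symbols and the reading of states, actions, and observations in terms of elicitation are illustrative assumptions.

```latex
% Generic finite-horizon POMDP (standard definition, not taken from the paper).
% Assumed reading for elicitation: s = information the user holds (hidden),
% a = the agent's utterance or question, o = the user's reply.
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, \mathcal{O}, T, Z, R, H)
\]
% T(s' \mid s, a): state transition probability
% Z(o \mid s', a): observation probability
% R(s, a): reward, H: horizon (number of agent turns)

% The agent cannot see s_t directly, so it tracks a belief b_t over states:
\[
  b_{t+1}(s') \;\propto\; Z(o_{t+1} \mid s', a_t) \sum_{s \in \mathcal{S}} T(s' \mid s, a_t)\, b_t(s)
\]

% Objective: a policy \pi that maximizes expected total reward over H steps:
\[
  \pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{H-1} R(s_t, a_t) \right]
\]
```

Under this reading, the finite horizon reflects that an interview or hearing has a bounded number of turns, and partial observability reflects that the agent only learns what the user knows through the user's replies.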
