Flexible Agent Alignment with Goal Inference from Open-Ended Dialog
Rachel Ma, Jingyi Qu, Andreea Bobu, Dylan Hadfield-Menell

TL;DR
This paper presents a new framework and method for aligning large language model agents with human preferences expressed in open-ended dialogue, enabling dynamic goal inference and better collaboration.
Contribution
It introduces Open-Universe Assistance Games (OU-AGs), a formal framework for open-ended assistance, and GOOD (GOals from Open-ended Dialogue), an online goal inference method that improves alignment without large offline datasets.
Findings
GOOD produces coherent goal representations across domains.
It improves alignment with user intent in multi-turn interactions.
The framework handles evolving preferences expressed in natural language.
Abstract
We introduce Open-Universe Assistance Games (OU-AGs), a formal framework extending assistance games to LLM-based agents. Effective assistance requires reasoning over human preferences that are unbounded, underspecified, and evolving. Current LLM agents struggle in multi-turn interactions and with maintaining accurate models of user intent in collaborative settings. Existing assistance game formulations assume fixed, predefined preferences, an assumption that breaks down in open-ended dialogue where goals are revised incrementally and expressed in natural language. Grounded in cognitive science accounts of preference construction, we represent human preferences as a dynamically updated distribution over discrete natural-language goals. To operationalize OU-AGs, we introduce GOOD (GOals from Open-ended Dialogue), a data-efficient online method that extracts and ranks candidate goals…
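To make the idea of "a dynamically updated distribution over discrete natural-language goals" concrete, here is a minimal Python sketch. It is not the GOOD method from the paper: the `GoalBelief` class, the default prior weight, and the likelihood scores are all hypothetical, chosen only to illustrate how new goals could enter an open-ended goal set and how the distribution could be reweighted after each dialogue turn. In a real system the likelihood scores might come from an LLM, but that is an assumption here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GoalBelief:
    """Hypothetical container: a probability distribution over natural-language goals."""
    weights: Dict[str, float] = field(default_factory=dict)

    def add_goal(self, goal: str, prior: float = 0.1) -> None:
        # Open-universe step: an unseen goal enters with a small unnormalized
        # weight (assumed value), then the distribution is renormalized.
        if goal not in self.weights:
            self.weights[goal] = prior
            self._normalize()

    def update(self, likelihoods: Dict[str, float]) -> None:
        # Bayes-style reweighting: posterior is proportional to prior times
        # the likelihood of the latest utterance under each goal.
        # Goals without a score are treated as neutral (likelihood 1.0).
        for goal in self.weights:
            self.weights[goal] *= likelihoods.get(goal, 1.0)
        self._normalize()

    def ranked(self) -> List[Tuple[str, float]]:
        # Goals sorted from most to least probable.
        return sorted(self.weights.items(), key=lambda kv: kv[1], reverse=True)

    def _normalize(self) -> None:
        total = sum(self.weights.values())
        if total > 0:
            for goal in self.weights:
                self.weights[goal] /= total


# Usage: two goals surface over the dialogue, then a new utterance
# (with made-up likelihood scores) strongly supports the second one.
belief = GoalBelief()
belief.add_goal("plan a vegetarian dinner")
belief.add_goal("keep the budget under $30")
belief.update({"keep the budget under $30": 20.0})
top_goal, top_prob = belief.ranked()[0]
```

Keeping goals as plain strings reflects the summary's point that preferences are expressed in natural language rather than drawn from a fixed, predefined set; the multiplicative update is just one simple choice for illustration.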
