TO-GATE: Clarifying Questions and Summarizing Responses with Trajectory Optimization for Eliciting Human Preference
Yulin Dou, Jiangming Liu

TL;DR
TO-GATE introduces a trajectory optimization framework for LLMs to generate more effective clarification questions and summaries, significantly improving human preference elicitation accuracy.
Contribution
It presents a novel trajectory optimization approach with a clarification resolver and summarizer, enhancing question relevance and task alignment in preference elicitation.
Findings
Achieves a 9.32% improvement over baseline methods
Effectively generates task-specific clarification questions
Enhances final response relevance and accuracy
Abstract
Large language models (LLMs) can effectively elicit human preferences through multi-turn dialogue. Complex tasks can be accomplished through iterative clarifying questions and final responses generated by an LLM acting as a questioner (STaR-GATE; Andukuri et al., 2024). However, existing approaches based on self-taught reasoning struggle to identify optimal dialogue trajectories and to avoid questions that are irrelevant to the task. To address this limitation, we propose TO-GATE, a novel framework that enhances question generation through trajectory optimization. It consists of two key components: a clarification resolver that generates optimal questioning trajectories, and a summarizer that ensures task-aligned final responses. The trajectory optimization enables the model to produce effective elicitation questions and summary responses tailored to specific tasks. Experimental results…
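The abstract does not spell out how trajectories are scored or optimized, so the following is only a toy sketch of the general idea: choose the most task-relevant questioning trajectory among several candidates, then summarize it into a final response. Everything here is an assumption made for illustration and is not the paper's method: the names `Trajectory`, `relevance`, `select_trajectory`, and `summarize` are hypothetical, the word-overlap score stands in for whatever objective TO-GATE actually uses, and best-of-n selection stands in for its trajectory optimization.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Trajectory:
    """One candidate dialogue: a list of (question, answer) turns."""
    turns: List[Tuple[str, str]]


def relevance(task: str, trajectory: Trajectory) -> float:
    """Hypothetical score: fraction of questions sharing a word with the task."""
    task_words = set(task.lower().split())
    if not trajectory.turns:
        return 0.0
    hits = sum(
        1
        for question, _ in trajectory.turns
        if task_words & set(question.lower().rstrip("?").split())
    )
    return hits / len(trajectory.turns)


def select_trajectory(
    task: str,
    candidates: List[Trajectory],
    score: Callable[[str, Trajectory], float] = relevance,
) -> Trajectory:
    """Stand-in for a resolver: keep the highest-scoring candidate (best-of-n)."""
    return max(candidates, key=lambda t: score(task, t))


def summarize(task: str, trajectory: Trajectory) -> str:
    """Stand-in for a summarizer: fold the elicited answers into one response."""
    prefs = "; ".join(answer for _, answer in trajectory.turns)
    return f"Task: {task} | Elicited preferences: {prefs}"


# Toy data: one off-topic trajectory and one on-topic trajectory.
task = "recommend a pasta recipe"
candidates = [
    Trajectory([("What is your favorite color?", "blue")]),
    Trajectory([
        ("Which pasta shape do you prefer?", "penne"),
        ("Should the recipe be vegetarian?", "yes"),
    ]),
]

best = select_trajectory(task, candidates)
final_response = summarize(task, best)
print(final_response)
```

In this toy setup the off-topic trajectory scores zero and is discarded, which mirrors the abstract's goal of avoiding questions that are irrelevant to the task. A real system would replace both stand-ins with learned components.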
Taxonomy
Topics: Data Management and Algorithms · Speech and dialogue systems · Bayesian Modeling and Causal Inference
