PrefIx: Understand and Adapt to User Preference in Human-Agent Interaction
Jialin Li, Zhenhao Chen, Hanjun Luo, Hanan Salam

TL;DR
This paper introduces Prefix, a new evaluation environment for human-agent interaction that assesses both task performance and interaction quality, emphasizing preference inference and adaptation in LLM-based agents.
Contribution
It proposes the Interaction-as-a-Tool paradigm and formalizes user experience metrics, enabling comprehensive evaluation of agent interaction behaviors and preference alignment.
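As a rough illustration of the Interaction-as-a-Tool idea, in which interaction behaviors are represented as structured tool calls, the sketch below puts task actions and interaction actions in one trajectory format. All names here (`ToolCall`, `ask_confirmation`, `explain_reasoning`, the flight-booking calls) are hypothetical and chosen only for illustration. They are not taken from the paper or its codebase.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """One step in an agent trajectory: a tool name plus structured arguments."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


# Hypothetical set of tools that represent interaction behaviors
# rather than task actions (an assumption, not the paper's schema).
INTERACTION_TOOLS = {"ask_confirmation", "explain_reasoning"}


def interaction_calls(trajectory: list[ToolCall]) -> list[ToolCall]:
    """Return only the steps that are interaction behaviors."""
    return [call for call in trajectory if call.name in INTERACTION_TOOLS]


def confirmation_count(trajectory: list[ToolCall]) -> int:
    """Count how often the agent asked the user to confirm something."""
    return sum(1 for call in trajectory if call.name == "ask_confirmation")


# A toy trajectory mixing task tools and interaction tools.
trajectory = [
    ToolCall("search_flights", {"origin": "ZRH", "destination": "LIS"}),
    ToolCall("ask_confirmation", {"question": "Book the 09:40 flight?"}),
    ToolCall("book_flight", {"flight_id": "F123"}),
    ToolCall("explain_reasoning", {"summary": "Chose the cheapest direct flight."}),
]

print(confirmation_count(trajectory))
print([call.name for call in interaction_calls(trajectory)])
```

Because interaction steps share the format of ordinary tool calls, an evaluator could in principle inspect them with the same machinery it uses for task actions, for example to check whether the number of confirmations matches a user's stated or inferred preference.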
Findings
Preference-aware agents improve UX by 7.6%.
Preference alignment increases by 18.5%.
The UX evaluation shows high reliability and correlates well with human judgments.
Abstract
LLM-based agents can complete tasks correctly yet still frustrate users through poor interaction patterns, such as excessive confirmations, opaque reasoning, or misaligned pacing. Current benchmarks evaluate task accuracy but overlook how agents interact: whether they infer preferences from implicit cues, adapt dynamically, or maintain fine-grained interaction quality. We introduce Prefix, a configurable environment that evaluates both what agents accomplish and how they interact. Central to Prefix is the Interaction-as-a-Tool (IaaT) paradigm, which treats interaction behaviors as structured tool calls, unifying them with existing evaluation frameworks. We define 31 preference settings across 14 attributes and formalize user experience (UX) as a core metric alongside task accuracy. A composite LLM-as-a-Judge mechanism across seven UX dimensions achieves strong aggregate reliability (ICC…
Taxonomy
Topics: Social Robot Interaction and HRI · Explainable Artificial Intelligence (XAI) · Innovative Human-Technology Interaction
