TL;DR
This paper introduces PersonalAlign, a hierarchical implicit intent alignment task for personalized GUI agents, in which agents use long-term user records to resolve vague instructions and anticipate user routines.
Contribution
It proposes a new agent task (PersonalAlign), a benchmark dataset (AndroidIntent), and a hierarchical memory model (HIM-Agent) to improve personalization and proactive assistance in GUI agents.
Findings
HIM-Agent improves execution performance by 15.7%.
HIM-Agent enhances proactive suggestions by 7.3%.
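The AndroidIntent benchmark evaluates vague-instruction resolution and proactive suggestions, using 775 annotated user-specific preferences and 215 routines drawn from 20k long-term user records.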
Abstract
While GUI agents have shown strong performance under explicit and completion instructions, real-world deployment requires aligning with users' more complex implicit intents. In this work, we highlight Hierarchical Implicit Intent Alignment for Personalized GUI Agent (PersonalAlign), a new agent task that requires agents to leverage long-term user records as persistent context to resolve omitted preferences in vague instructions and anticipate latent routines by user state for proactive assistance. To facilitate this study, we introduce AndroidIntent, a benchmark designed to evaluate agents' ability in resolving vague instructions and providing proactive suggestions through reasoning over long-term user records. We annotated 775 user-specific preferences and 215 routines from 20k long-term records across different users for evaluation. Furthermore, we introduce Hierarchical Intent Memory…
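To make the two sub-tasks in the abstract concrete, the sketch below shows one toy way that long-term records could be used to fill in an omitted preference and to suggest a routine. It is an illustrative assumption only, not the paper's HIM-Agent or the AndroidIntent data format: the `Record` fields, the function names, the frequency heuristic, and the `min_count` threshold are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One hypothetical long-term user record (not the AndroidIntent schema)."""
    hour: int        # hour of day when the action happened (0-23)
    action: str      # e.g. "order_coffee"
    preference: str  # detail the user chose, e.g. "oat latte"


def resolve_preference(records, action):
    """Fill in an omitted preference with the user's most frequent past choice."""
    choices = Counter(r.preference for r in records if r.action == action)
    return choices.most_common(1)[0][0] if choices else None


def suggest_routine(records, hour, min_count=3):
    """Suggest the action most often repeated at this hour, if seen min_count times."""
    counts = Counter(r.action for r in records if r.hour == hour)
    if not counts:
        return None
    action, n = counts.most_common(1)[0]
    return action if n >= min_count else None


# Made-up records for illustration only.
records = [
    Record(8, "order_coffee", "oat latte"),
    Record(8, "order_coffee", "oat latte"),
    Record(8, "order_coffee", "espresso"),
    Record(19, "play_music", "jazz playlist"),
]

# Vague instruction "order my coffee": the preference is recovered from history.
coffee_choice = resolve_preference(records, "order_coffee")
# Proactive suggestion at 8:00, based on a repeated pattern at that hour.
morning_suggestion = suggest_routine(records, hour=8)
```

A simple frequency count is used here only because it is easy to follow; the abstract describes reasoning over long-term records with a hierarchical memory, which this sketch does not attempt to reproduce.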
