PAFFA: Premeditated Actions For Fast Agents
Shambhavi Krishna, Zheng Chen, Yuan Ling, Xiaojiang Huang, Yingjie Li, Fan Yang, Xiang Li

TL;DR
PAFFA introduces a novel inference-time technique that leverages pre-computed browser interaction patterns and strategic re-use of LLM inference to significantly speed up web task completion while maintaining accuracy.
Contribution
The paper presents PAFFA, a method that builds an 'Action Library' by re-using LLM inference across tasks, making web interaction faster and more accurate without task-specific training.
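The re-use idea can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ActionLibrary`, the `(site, action)` cache key, the `refresh` method, and the `fake_llm` stand-in are all assumptions made for illustration. It only shows how a plan computed once by a model might be looked up on later tasks instead of being regenerated.

```python
from typing import Callable, Dict, List, Tuple

Plan = List[str]  # hypothetical: a plan is a list of browser-step descriptions


class ActionLibrary:
    """Caches one plan per (site, action) so the model is queried once per pattern."""

    def __init__(self, generate_plan: Callable[[str, str], Plan]) -> None:
        self._generate_plan = generate_plan  # stand-in for an LLM call (assumption)
        self._plans: Dict[Tuple[str, str], Plan] = {}
        self.llm_calls = 0  # counts how often the model was actually invoked

    def get(self, site: str, action: str) -> Plan:
        key = (site, action)
        if key not in self._plans:
            # Cache miss: pay for one model call and store the result.
            self.llm_calls += 1
            self._plans[key] = self._generate_plan(site, action)
        return self._plans[key]

    def refresh(self, site: str, action: str) -> None:
        """Drop a stale entry so the next get() recomputes it."""
        self._plans.pop((site, action), None)


def fake_llm(site: str, action: str) -> Plan:
    # Placeholder for a real model; returns a fixed, made-up step list.
    return [f"open {site}", f"locate control for '{action}'", "fill parameters", "submit"]


library = ActionLibrary(fake_llm)
first = library.get("example.com", "search_product")
second = library.get("example.com", "search_product")  # served from the cache
```

In this sketch the second lookup costs no model call, and `refresh` gives a simple way to recompute an entry after a website changes.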
Findings
Reduces inference tokens by 87%
Maintains robust performance with 0.57 vs. 0.50 step accuracy
Generalizes to unseen websites through action library updates
Abstract
Modern AI assistants have made significant progress in natural language understanding and tool-use, with emerging efforts to interact with Web interfaces. However, current approaches that heavily rely on repeated LLM-driven HTML parsing are computationally expensive and error-prone, particularly when handling dynamic web interfaces and multi-step tasks. We introduce PAFFA (Premeditated Actions For Fast Agents), a method that makes LLMs faster and more accurate in completing tasks on the internet using a novel inference-time technique that requires no task-specific training. PAFFA constructs an 'Action Library', leveraging the parametric knowledge of the base LLM to pre-compute browser interaction patterns that generalize across tasks. By strategically re-using LLM inference across tasks - either via 'Dist-Map' for task-agnostic identification of key interactive web elements, or…
Taxonomy
Topics: Parallel Computing and Optimization Techniques
Methods: Lib
