Prune4Web: DOM Tree Pruning Programming for Web Agent
Jiayuan Zhang, Kaiquan Chen, Zhihao Lu, Enshen Zhou, Qian Yu, Jing Zhang

TL;DR
Prune4Web prunes the DOM tree with LLM-generated scripts that filter web page elements programmatically for each task, cutting the number of candidate elements and significantly improving web automation accuracy.
Contribution
This paper presents a novel programmatic pruning approach that shifts DOM filtering from LLMs to executable scripts, enhancing scalability and precision in web automation tasks.
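To make the idea of moving filtering out of the LLM and into an executable script concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the `Element` fields, the `score` and `prune` helpers, the keyword weights, and the `top_k` cutoff are all hypothetical names and values chosen for this example. The assumption is that the LLM only emits a small set of weighted keywords for the current sub-task, and ordinary code then scores every element and keeps the best few.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A flattened DOM node; the fields are illustrative, not the paper's schema."""
    node_id: int
    tag: str
    text: str = ""
    attrs: dict = field(default_factory=dict)


def score(element: Element, keywords: dict[str, float]) -> float:
    """Sum the weights of every keyword found in the tag, text, or attribute values."""
    haystack = " ".join([element.tag, element.text, *element.attrs.values()]).lower()
    return sum(weight for kw, weight in keywords.items() if kw.lower() in haystack)


def prune(elements: list[Element], keywords: dict[str, float], top_k: int = 20) -> list[Element]:
    """Keep at most top_k elements with a positive score, highest score first."""
    scored = [(score(e, keywords), e) for e in elements]
    kept = [pair for pair in scored if pair[0] > 0]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [e for _, e in kept[:top_k]]


# Hypothetical keyword weights an LLM might emit for the sub-task "click the search button".
KEYWORDS = {"search": 2.0, "button": 1.0, "submit": 1.0}

page = [
    Element(0, "div", "Welcome to the store"),
    Element(1, "input", "", {"placeholder": "Search products"}),
    Element(2, "button", "Search", {"type": "submit"}),
    Element(3, "a", "Privacy policy", {"href": "/privacy"}),
]

# Only the short candidate list would be handed back to the agent for grounding.
candidates = prune(page, KEYWORDS, top_k=2)
```

In this toy page the search button and the search input survive pruning while the unrelated elements are dropped. The design point is that the cost of scanning the page scales with cheap string matching rather than with LLM context length.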
Findings
Achieves 25x to 50x reduction in candidate DOM elements.
Improves low-level grounding accuracy from 46.8% to 88.28%.
Demonstrates state-of-the-art performance in web automation.
Abstract
Web automation employs intelligent agents to execute high-level tasks by mimicking human interactions with web interfaces. Despite the capabilities of recent Large Language Model (LLM)-based web agents, navigating complex, real-world webpages efficiently remains a significant hurdle due to the prohibitively large size of Document Object Model (DOM) structures, often ranging from 10,000 to 100,000 tokens. Existing strategies typically rely on crude DOM truncation -- risking the loss of critical information -- or employ inefficient heuristics and separate ranking models, failing to achieve an optimal balance between precision and scalability. To address these challenges, we introduce Prune4Web, a novel paradigm that shifts DOM processing from resource-intensive LLM reading to efficient programmatic pruning. Central to our approach is DOM Tree Pruning Programming, where an LLM generates…
Taxonomy
Topics: Multimodal Machine Learning Applications · Web Data Mining and Analysis · Natural Language Processing Techniques
