Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
Mathieu Andreux, Breno Baldas Skuk, Hamza Benchekroun, Emilien Biré, Antoine Bonnet, Riaz Bordie, Nathan Bout, Matthias Brunel, Pierre-Louis Cedoz, Antoine Chassang, Mickaël Chen, Alexandra D. Constantinou, Antoine d'Andigné, Hubert de La Jonquière, Aurélien Delfosse

TL;DR
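This paper introduces Surfer-H, a cost-efficient web agent powered by Holo1, a collection of open-weight vision-language models specialized in web navigation and information extraction. The pairing reaches 92.2% on WebVoyager, and the Holo1 weights and the WebClick evaluation dataset are released openly to support further research.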
Contribution
The paper presents Holo1, a new open-weight collection of vision-language models (VLMs) specialized in web navigation and information extraction, and pairs it with the Surfer-H web agent to improve both accuracy and cost-efficiency.
Findings
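Holo1 tops generalist UI benchmarks as well as WebClick, the authors' new web UI localization benchmark.
Surfer-H, when powered by Holo1, achieves 92.2% on WebVoyager.
The WebClick evaluation dataset and the Holo1 model weights are open-sourced to support research.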
Abstract
We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLMs) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 tops generalist User Interface (UI) benchmarks as well as our new web UI localization benchmark, WebClick. When powered by Holo1, Surfer-H achieves state-of-the-art performance of 92.2% on WebVoyager, striking a Pareto-optimal balance between accuracy and cost-efficiency. To accelerate research advancement in agentic systems, we are open-sourcing both our WebClick evaluation dataset and the Holo1 model weights.
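The abstract's "Pareto-optimal" claim can be made concrete with a minimal sketch: a configuration is Pareto-optimal when no other configuration is at least as cheap and at least as accurate while being strictly better on one of the two. The configuration names and numbers below are placeholders invented for illustration, not results from the paper, and the `Config`, `dominates`, and `pareto_front` names are hypothetical helpers.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str
    cost: float      # cost per task, lower is better (illustrative units)
    accuracy: float  # task success rate, higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.accuracy >= b.accuracy
    strictly_better = a.cost < b.cost or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep every configuration that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Placeholder points only; these are NOT measurements from the paper.
configs = [
    Config("A", cost=1.0, accuracy=0.70),
    Config("B", cost=2.0, accuracy=0.90),
    Config("C", cost=3.0, accuracy=0.85),  # dominated by B: pricier and less accurate
    Config("D", cost=0.5, accuracy=0.60),
]

front = pareto_front(configs)
print([c.name for c in front])
```

In this toy data, C falls off the front because B is both cheaper and more accurate, while A, B, and D each represent a different cost/accuracy trade-off that nothing else strictly beats.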
Taxonomy
Topics: Mobile Agent-Based Network Management · Peer-to-Peer Network Technologies · Web Data Mining and Analysis
