Weblica: Scalable and Reproducible Training Environments for Visual Web Agents
Oğuzhan Fatih Kar, Roman Bachmann, Yuanzheng Gong, Anders Boesen Lindbo Larsen, Afshin Dehghan

TL;DR
Weblica introduces a scalable, reproducible framework for training visual web agents using web environment synthesis and HTTP caching, enabling large-scale reinforcement learning and improved web navigation performance.
Contribution
The paper presents Weblica, a novel framework combining HTTP caching and LLM-based environment synthesis for scalable, reproducible web agent training environments.
Findings
Weblica-8B outperforms similar-sized open-weight models on web navigation benchmarks.
The framework scales RL training to thousands of diverse web environments.
Weblica achieves performance competitive with API models while using fewer inference steps.
Abstract
The web is complex, open-ended, and constantly changing, making it challenging to scale training data for visual web agents. Existing data collection attempts remain limited to offline trajectories for supervised fine-tuning or a handful of simulated environments for RL training, thus failing to capture web diversity. We propose Weblica (Web Replica), a framework for constructing reproducible and scalable web environments. Our framework leverages 1) HTTP-level caching to capture and replay stable visual states while preserving interactive behavior and 2) LLM-based environment synthesis grounded in real-world websites and core web navigation skills. Using this framework, we scale RL training to thousands of diverse environments and tasks. Our best model, Weblica-8B, outperforms open-weight baselines of similar size across multiple web navigation benchmarks while using fewer inference…
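The abstract does not say how the HTTP-level caching is implemented, so the following is only a minimal sketch of the general record-and-replay idea, not the authors' method. It assumes responses can be keyed on the request method, URL, and body. The names `ReplayCache`, `CachedResponse`, and `freeze` are hypothetical and chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class CachedResponse:
    """A stored HTTP response (hypothetical structure for illustration)."""
    status: int
    body: bytes


# A fetcher takes (method, url, body) and returns a response.
Fetcher = Callable[[str, str, bytes], CachedResponse]


class ReplayCache:
    """Record responses on first access, then replay them deterministically."""

    def __init__(self, fetch: Optional[Fetcher] = None) -> None:
        self._fetch = fetch  # live fetcher; None means replay-only
        self._store: Dict[str, CachedResponse] = {}

    @staticmethod
    def key(method: str, url: str, body: bytes = b"") -> str:
        # Assumption: method + URL + body identify a request well enough.
        digest = hashlib.sha256()
        for part in (method.upper().encode(), url.encode(), body):
            digest.update(part)
            digest.update(b"\0")  # separator so fields cannot run together
        return digest.hexdigest()

    def request(self, method: str, url: str, body: bytes = b"") -> CachedResponse:
        k = self.key(method, url, body)
        if k in self._store:
            return self._store[k]  # replay: the live site is not contacted
        if self._fetch is None:
            raise KeyError(f"no cached response for {method} {url}")
        response = self._fetch(method, url, body)  # record on first access
        self._store[k] = response
        return response

    def freeze(self) -> None:
        """Switch to replay-only mode so later episodes see a fixed snapshot."""
        self._fetch = None
```

In a sketch like this, an environment would be recorded once against the live site and then frozen, so that every training episode observes the same responses regardless of how the real site changes afterwards.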
