Skim: Speculative Execution for Fast and Efficient Web Agents

Mike Wong; Kevin Hsieh; Suman Nath; Ravi Netravali

arXiv:2605.16565·cs.AI·May 20, 2026

TL;DR

Skim is a framework that exploits stable website URL patterns and answer formats so that web agents can skip heavyweight computation (frontier-model inference, browser rendering, and ReAct-style planning) on most queries, reducing cost and latency with no observed loss in accuracy.

Contribution

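Skim introduces speculative execution for web agents: it exploits the predictable structure of websites to answer most queries on a lightweight fast path, and falls back to the full agent when a speculation fails verification.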

Findings

01 Median per-task cost reduced by 1.9x

02 Latency decreased by 33.4%

03 No accuracy loss observed

Abstract

Skim is a speculative execution framework for web agents that exploits the predictable structure of purpose-built websites. Today's web-agent expense is not intrinsic to the tasks but a property of how agents are composed: frontier-model inference, browser rendering, and ReAct-style planning are applied to every step of every task regardless of complexity. Skim's key observation is that websites enforce stable URL patterns, answer formats, and task-to-trajectory mappings across queries of the same type, so most queries can bypass these heavyweight components entirely. An offline profiler captures these patterns once per site. At runtime, Skim matches each query to a template, synthesizes the destination URL, and extracts the answer with a small model. A lightweight verifier gates each fast-path output against the query and schema; rare misspeculations cascade to the full agent,…
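The runtime flow described in the abstract (match a template, synthesize the URL, extract, verify, fall back) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Template` fields, the `fast_path` and `answer_query` functions, and the injected `fetch`, `extract`, and `full_agent` callables are all hypothetical names chosen for illustration. The verifier here is simplified to a regex check on the answer format, whereas the abstract describes a verifier that also checks the output against the query.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional
from urllib.parse import quote_plus


@dataclass
class Template:
    """Hypothetical per-site pattern, as an offline profiler might record it."""
    query_pattern: re.Pattern   # recognizes one type of query, with named slots
    url_format: str             # destination URL containing the same named slots
    answer_pattern: re.Pattern  # expected answer format (stand-in for a schema)


def fast_path(
    query: str,
    templates: List[Template],
    fetch: Callable[[str], str],
    extract: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Return a verified fast-path answer, or None on a miss or misspeculation."""
    for template in templates:
        match = template.query_pattern.fullmatch(query)
        if match is None:
            continue
        # Synthesize the destination URL directly from the query's slots.
        slots = {name: quote_plus(value) for name, value in match.groupdict().items()}
        url = template.url_format.format(**slots)
        # `extract` stands in for the small extraction model (an assumption).
        answer = extract(fetch(url), query)
        # Simplified verifier: accept only answers in the expected format.
        if answer is not None and template.answer_pattern.fullmatch(answer):
            return answer
        return None  # first matching template misspeculated
    return None  # no template matched this query


def answer_query(
    query: str,
    templates: List[Template],
    fetch: Callable[[str], str],
    extract: Callable[[str, str], Optional[str]],
    full_agent: Callable[[str], str],
) -> str:
    """Try the fast path first; cascade to the full agent if it yields nothing."""
    answer = fast_path(query, templates, fetch, extract)
    return answer if answer is not None else full_agent(query)
```

In this sketch the expensive `full_agent` callable runs only when no template matches or the verifier rejects the speculated answer, which mirrors the cascade the abstract describes.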
