SpecAgent: A Speculative Retrieval and Forecasting Agent for Code Completion

George Ma; Anurag Koul; Qi Chen; Yawen Wu; Sachit Kuhar; Yu Yu; Aritra Sengupta; Varun Kumar; Murali Krishna Ramanathan

arXiv:2510.17925·cs.SE·April 22, 2026

TL;DR

SpecAgent improves code-completion quality and reduces inference latency by proactively exploring repositories at indexing time. The paper also introduces a leakage-free benchmark for more realistic evaluation.

Contribution

The paper introduces SpecAgent, an agent that anticipates future code edits to improve retrieval and code generation, and constructs a leakage-free benchmark for fair evaluation.

Findings

01. SpecAgent achieves 9-11% absolute improvement over baselines.

02. It significantly reduces inference latency.

03. The new benchmark provides a more realistic evaluation environment.

Abstract

Large Language Models (LLMs) excel at code-related tasks but often struggle in realistic software repositories, where project-specific APIs and cross-file dependencies are crucial. Retrieval-augmented methods mitigate this by injecting repository context at inference time. However, the tight inference-time latency budget forces a trade-off: either retrieval quality suffers, or the added latency degrades the user experience. We address this limitation with SpecAgent, an agent that improves both latency and code-generation quality by proactively exploring repository files during indexing and constructing speculative context that anticipates future edits in each file. This indexing-time asynchrony allows thorough context computation while masking its latency, and the speculative nature of the context improves code-generation quality. Additionally, we identify the problem of future context leakage in existing benchmarks,…
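
The split between indexing-time work and inference-time lookup described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `SpeculativeIndex`, `naive_forecast`, and `build_prompt` are hypothetical, and the keyword heuristic stands in for the agent's repository exploration, which the abstract does not specify. The only point shown is that the expensive step runs once per file ahead of time, so the completion path pays for a dictionary lookup rather than for retrieval.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: maps (file path, whole repo) to context snippets.
Forecaster = Callable[[str, Dict[str, str]], List[str]]


def naive_forecast(path: str, repo: Dict[str, str]) -> List[str]:
    """Toy stand-in for the agent (an assumption, not the paper's method).

    Collects function signatures from other files whose module name
    is mentioned in the target file.
    """
    source = repo[path]
    snippets: List[str] = []
    for other, text in repo.items():
        stem = os.path.splitext(os.path.basename(other))[0]
        if other != path and stem in source:
            for line in text.splitlines():
                if line.lstrip().startswith("def "):
                    snippets.append(f"# {other}: {line.strip()}")
    return snippets


@dataclass
class SpeculativeIndex:
    forecast: Forecaster
    cache: Dict[str, List[str]] = field(default_factory=dict)

    def build(self, repo: Dict[str, str]) -> None:
        """Indexing time: may be slow, since no user is waiting."""
        for path in repo:
            self.cache[path] = self.forecast(path, repo)

    def context_for(self, path: str) -> List[str]:
        """Inference time: a lookup only, no retrieval on the hot path."""
        return self.cache.get(path, [])


def build_prompt(index: SpeculativeIndex, path: str, prefix: str) -> str:
    """Prepend the precomputed context to the code before the cursor."""
    return "\n".join(index.context_for(path) + [prefix])
```

A usage example under the same assumptions: after `index = SpeculativeIndex(naive_forecast)` and `index.build(repo)`, calling `build_prompt(index, "app.py", prefix)` assembles the prompt without touching any other file in the repository at completion time.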
