Sketch-and-Verify: Structured Inference-Time Scaling via Program Sketching

Shan Jiang; Zijian Yi; Chenguang Zhu

arXiv:2605.08658 · cs.LG · May 12, 2026

TL;DR

Sketch-and-Verify is a structured inference-time scaling method that uses program sketching to explore diverse algorithmic strategies, outperforming flat sampling at matched candidate counts.

Contribution

The paper presents a sketching-based approach to inference-time scaling in which each sketch explores a distinct algorithmic strategy, and demonstrates its effectiveness on the HumanEval+ benchmark.

Findings

01. Sketching outperforms flat sampling at matched candidate counts.

02. Cross-tier sketching complements tier upgrades but does not replace them.

03. Practitioners should use sketching when stronger tiers are unavailable.

Abstract

SKETCHVERIFY is a within-tier cost-performance policy, not a universal accuracy improvement. The operational question is how a practitioner who is stuck with a small, cheap code model (here, Gemini 3.1 Flash Lite) for latency, deployment, or budget reasons should spend a small amount of extra test-time compute. SKETCHVERIFY factorizes the search space: the LLM enumerates K distinct algorithmic strategies, writes a program sketch for each (a partial program with ?? holes), and fills each sketch M times, producing K × M structurally diverse candidates that are verified by execution and selected by fingerprint clustering. Each extra sketch is guaranteed to explore a different algorithm, whereas each extra flat sample likely duplicates an existing one. Our central evidence is a cost-quality Pareto plot on HumanEval+ across three Gemini tiers (Lite, Flash, Pro), and a reanalysis of the 19 problems…
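
The K × M loop described in the abstract can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation. The names `fingerprint`, `sketch_and_verify`, and `toy_fill` are hypothetical; the LLM is replaced by a deterministic stand-in that fills the `??` hole from a fixed list; and picking the largest fingerprint cluster is an assumed selection rule, since the abstract says only that candidates are "selected by fingerprint clustering".

```python
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple


def fingerprint(src: str, func_name: str,
                probe_inputs: Sequence[tuple]) -> Optional[Tuple[str, ...]]:
    """Execute a candidate and return its outputs on the probe inputs.

    Returns None if the source does not even define the function.
    NOTE: exec() on untrusted code is unsafe; a real system would sandbox this.
    """
    namespace: dict = {}
    try:
        exec(src, namespace)
        fn = namespace[func_name]
    except Exception:
        return None
    outputs = []
    for args in probe_inputs:
        try:
            outputs.append(repr(fn(*args)))
        except Exception as exc:
            outputs.append("error:" + type(exc).__name__)
    return tuple(outputs)


def sketch_and_verify(sketches: List[str],
                      fill: Callable[[str, int], str],
                      m: int,
                      func_name: str,
                      probe_inputs: Sequence[tuple]) -> Optional[str]:
    """Fill each of the K sketches M times, cluster candidates by behavior,
    and return one member of the largest cluster (assumed selection rule)."""
    clusters = defaultdict(list)
    for sketch in sketches:              # K strategies
        for attempt in range(m):         # M fills per strategy
            candidate = fill(sketch, attempt)
            fp = fingerprint(candidate, func_name, probe_inputs)
            if fp is not None:
                clusters[fp].append(candidate)
    if not clusters:
        return None
    return max(clusters.values(), key=len)[0]


# Two hypothetical sketches for "sum a list", each with one ?? hole.
SKETCHES = [
    "def total(xs):\n    return sum(??)\n",
    "def total(xs):\n    acc = 0\n    for x in xs:\n        acc = acc ?? x\n    return acc\n",
]


def toy_fill(sketch: str, attempt: int) -> str:
    """Stand-in for the LLM: cycles through fixed hole fillings, some wrong."""
    options = ["xs", "xs[1:]"] if "sum(" in sketch else ["+", "-"]
    return sketch.replace("??", options[attempt % len(options)])


PROBES = [([1, 2, 3],), ([],), ([5],)]
winner = sketch_and_verify(SKETCHES, toy_fill, m=3,
                           func_name="total", probe_inputs=PROBES)
```

In this toy run, four of the six candidates agree on every probe input, so the selected program comes from that cluster. The two incorrect fills each land in a singleton cluster.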
