Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference

Noah Golowich; Fan Chen; Dhruv Rohatgi; Raghav Singhal; Carles Domingo-Enrich; Dylan J. Foster; Akshay Krishnamurthy

arXiv:2603.07887·cs.LG·March 10, 2026

Open Access

TL;DR

This paper studies parallel reasoning in language model inference through the lens of particle filtering, giving theoretical guarantees, a fundamental limit, and empirical evidence on the sampling accuracy of such methods.

Contribution

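The paper introduces a rigorous framework for understanding inference-time sampling methods via particle filtering. It identifies simple criteria that enable non-asymptotic guarantees for SMC, proposes algorithmic improvements to SMC, and establishes a fundamental limit faced by all particle filtering methods.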

Findings

01. Theoretical criteria effectively predict sampling error in SMC.

02. Algorithmic improvements enhance SMC performance.

03. Fundamental limits exist for all particle filtering methods.

Abstract

Inference-time methods that aggregate and prune multiple samples have emerged as a powerful paradigm for steering large language models, yet we lack any principled understanding of their accuracy-cost tradeoffs. In this paper, we introduce a route to rigorously study such approaches using the lens of *particle filtering* algorithms such as Sequential Monte Carlo (SMC). Given a base language model and a *process reward model* estimating expected terminal rewards, we ask: *how accurately can we sample from a target distribution given some number of process reward evaluations?* Theoretically, we identify (1) simple criteria enabling non-asymptotic guarantees for SMC; (2) algorithmic improvements to SMC; and (3) a fundamental limit faced by all particle filtering methods. Empirically, we demonstrate that our theoretical criteria effectively govern the *sampling error* of SMC, though not…
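
To make the setting concrete, here is a minimal sketch of a generic, textbook-style SMC loop with multinomial resampling. It is not the paper's algorithm or any of its proposed improvements. The names `smc_sample`, `toy_propose`, and `toy_value` are hypothetical, and the toy base model (uniform binary tokens) and toy value function are assumptions chosen only so the example runs. The value function plays the role of a process reward model: it returns the expected terminal reward of a prefix under the base model, and the sketch counts how many times it is called.

```python
import random

HORIZON = 5  # assumed sequence length for the toy example


def toy_propose(prefix, rng):
    """Toy base model (assumption): each token is 0 or 1 with equal probability."""
    return rng.randint(0, 1)


def toy_value(prefix):
    """Toy process reward (assumption): expected terminal reward of a prefix.

    The terminal reward is a product of 0.9 per token equal to 1 and 0.1 per
    token equal to 0. Each unseen token contributes 0.5 in expectation under
    the uniform base model.
    """
    v = 0.5 ** (HORIZON - len(prefix))
    for tok in prefix:
        v *= 0.9 if tok == 1 else 0.1
    return v


def smc_sample(propose, value, horizon, num_particles, rng):
    """Generic SMC sketch: extend, weight by the value ratio, resample."""
    particles = [[] for _ in range(num_particles)]
    values = [value([])] * num_particles
    evaluations = 1  # one call for the empty prefix
    for _ in range(horizon):
        extended, new_values, weights = [], [], []
        for prefix, old_v in zip(particles, values):
            candidate = prefix + [propose(prefix, rng)]
            new_v = value(candidate)
            evaluations += 1
            extended.append(candidate)
            new_values.append(new_v)
            # Incremental weight: ratio of successive value estimates.
            weights.append(new_v / old_v)
        if sum(weights) == 0:
            raise ValueError("all particles have zero weight")
        # Multinomial resampling in proportion to the incremental weights.
        idx = rng.choices(range(num_particles), weights=weights, k=num_particles)
        particles = [list(extended[i]) for i in idx]
        values = [new_values[i] for i in idx]
    return particles, evaluations


if __name__ == "__main__":
    samples, calls = smc_sample(toy_propose, toy_value, HORIZON, 200, random.Random(0))
    frac_ones = sum(sum(p) for p in samples) / (HORIZON * len(samples))
    print(f"value-function calls: {calls}, fraction of 1-tokens: {frac_ones:.2f}")
```

In this toy setup the target distribution (base model reweighted by terminal reward) draws each token as 1 with probability 0.9, so the fraction of 1-tokens among the returned particles gives a rough sense of sampling accuracy for a given budget of value-function calls.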

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education