The Model Knows, the Decoder Finds: Future Value Guided Particle Power Sampling
Tu Nguyen, Matthieu Zimmer, Rasul Tutunov, Xiaotong Ji, and Haitham Bou Ammar

TL;DR
The paper introduces Auxiliary Particle Power Sampling (APPS), a particle-based decoding method that improves multi-step reasoning in large language models by efficiently approximating future-value signals during inference.
Contribution
APPS is a novel blockwise particle algorithm that enhances decoding by incorporating future-value guidance, improving reasoning accuracy without additional training.
Findings
APPS improves reasoning accuracy on benchmarks.
APPS offers better accuracy-runtime trade-offs.
Future-value-guided sampling recovers part of the gap to trained models.
Abstract
A recurring pattern in "reasoning without training" is that base LLMs already assign non-trivial probability mass to correct multi-step solutions; the bottleneck is locating these modes efficiently at inference time. Power sampling provides a principled way to bias decoding toward such modes by targeting p_theta(x)^alpha with alpha > 1, but practical approximations must account for future-dependent correction factors that determine which prefixes remain promising. We introduce Auxiliary Particle Power Sampling (APPS), a blockwise particle algorithm for approximating the sequence-level power target with a bounded population of partial solutions. APPS propagates hypotheses in parallel using proposal-corrected power reweighting and refines their survival through future-value-guided selection at resampling boundaries. This redistributes finite compute across competing prefixes rather than…
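The power-sampling idea described in the abstract can be illustrated with a toy sketch. This is not the authors' APPS implementation: it is a plain sequential Monte Carlo loop that proposes tokens from a base distribution and corrects with the weight p^alpha / q, which reduces to p^(alpha - 1) when the proposal q equals p. The future-value-guided selection step is deliberately left out. The three-token "model" in `toy_next_token_probs`, along with every function name and parameter below, is a hypothetical stand-in chosen for illustration.

```python
import math
import random


def power_distribution(probs, alpha):
    """Raise each probability to the power alpha and renormalise."""
    powered = [p ** alpha for p in probs]
    total = sum(powered)
    return [w / total for w in powered]


def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a base LM over a 3-token vocabulary.

    The next-token distribution depends only on the last token.
    """
    table = {
        None: [0.5, 0.3, 0.2],
        0: [0.6, 0.3, 0.1],
        1: [0.2, 0.5, 0.3],
        2: [0.1, 0.2, 0.7],
    }
    last = prefix[-1] if prefix else None
    return table[last]


def particle_power_sample(num_particles, length, alpha, rng):
    """Toy particle approximation of the sequence-level target p(x)^alpha.

    Assumption: the proposal is the base model itself (q = p), so the
    incremental importance weight is p^alpha / q = p^(alpha - 1).
    No future-value guidance is applied here.
    """
    particles = [[] for _ in range(num_particles)]
    log_w = [0.0] * num_particles
    for _ in range(length):
        for i, prefix in enumerate(particles):
            probs = toy_next_token_probs(prefix)
            tok = rng.choices(range(len(probs)), weights=probs)[0]
            log_w[i] += (alpha - 1.0) * math.log(probs[tok])
            prefix.append(tok)
        # Multinomial resampling: prefixes with higher weight are duplicated,
        # low-weight prefixes are dropped, and weights reset to uniform.
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        idx = rng.choices(range(num_particles), weights=weights, k=num_particles)
        particles = [list(particles[j]) for j in idx]
        log_w = [0.0] * num_particles
    return particles


if __name__ == "__main__":
    # alpha > 1 sharpens the distribution toward its mode.
    print(power_distribution([0.5, 0.3, 0.2], alpha=2.0))
    for seq in particle_power_sample(8, 4, alpha=2.0, rng=random.Random(0)):
        print(seq)
```

In this sketch the resampling step is where finite compute gets redistributed across competing prefixes. A method such as the one the abstract describes would additionally adjust which prefixes survive using an estimate of their future value, which this example does not attempt.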
