Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models
Zesheng Hong, Jiadong Yu, Hui Pan

TL;DR
This paper introduces ARTS, which decouples proposal (generation) from decision (verification) in large language models to counter the suppression of rare but correct reasoning paths and improve performance on complex reasoning tasks.
Contribution
The paper proposes ARTS, which separates generation from verification using a Flow Matching objective, enhancing reasoning robustness without altering the base model.
Findings
Achieves 74.6% on the MATH-500 benchmark.
Recovers performance on long-tail reasoning tasks where RL collapses.
Matches fully fine-tuned policies without modifying the generative backbone.
Abstract
Reinforcement Learning with Verifiable Rewards (RLVR) has established itself as the dominant paradigm for instilling rigorous reasoning capabilities in Large Language Models. While effective at amplifying dominant behaviors, we identify a critical pathology in this alignment process: the systematic suppression of valid but rare (low-likelihood under the base model distribution) reasoning paths. We theoretically characterize this phenomenon as a "Normalization Squeeze," where the interplay between mode-seeking policy gradients and finite sampling acts as a high-pass likelihood filter, driving the probability of rare correct traces to statistical extinction. To counteract this collapse without discarding the base model's latent diversity, we propose Amortized Reasoning Tree Search (ARTS). Unlike standard approaches that force internalization via parameter updates, ARTS prioritizes…
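The squeeze on rare correct paths described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's analysis or code: it assumes a three-outcome softmax policy (a common correct path, a rare correct path, and an incorrect path), a REINFORCE update with a batch-mean baseline, and hypothetical values for the starting probabilities, batch size, and learning rate. Under these assumptions the rare path's logit only moves when that path is sampled, while the common path's logit keeps growing, so renormalization shrinks the rare path's probability even though it earns full reward.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(steps=200, batch=8, lr=0.5, seed=0):
    """Toy REINFORCE run; all hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    # Outcome 0: common correct, 1: rare correct, 2: incorrect.
    rewards = [1.0, 1.0, 0.0]
    logits = [math.log(0.70), math.log(0.01), math.log(0.29)]
    initial = softmax(logits)
    for _ in range(steps):
        probs = softmax(logits)
        # Finite sampling: the rare outcome is usually absent from the batch.
        samples = rng.choices(range(3), weights=probs, k=batch)
        baseline = sum(rewards[a] for a in samples) / batch
        grad = [0.0, 0.0, 0.0]
        for a in samples:
            adv = rewards[a] - baseline
            for i in range(3):
                # Gradient of log softmax: indicator minus probability.
                indicator = 1.0 if i == a else 0.0
                grad[i] += adv * (indicator - probs[i]) / batch
        logits = [l + lr * g for l, g in zip(logits, grad)]
    return initial, softmax(logits)


if __name__ == "__main__":
    initial, final = simulate()
    print("initial probabilities:", initial)
    print("final probabilities:  ", final)
```

Comparing the second entry of the two returned lists shows how far the rare correct outcome has been pushed down; the exact amount depends on the assumed hyperparameters and seed.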
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
