Corrector Sampling in Language Models

Itai Gat; Neta Shaul; Uriel Singer; Yaron Lipman

arXiv:2506.06215·cs.LG·June 9, 2025

TL;DR

This paper introduces Resample-Previous-Tokens (RPT), a novel sampling method for autoregressive language models that reduces error accumulation by revisiting previous tokens, leading to improved reasoning and coding performance.

Contribution

The paper presents RPT, a new sampling technique that can be integrated into existing models to enhance their accuracy without sacrificing speed.

Findings

01

RPT improves reasoning and coding benchmark scores by roughly 10% in relative terms compared with standard sampling.

02

The reported gains come from fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens.

03

RPT preserves the original model's next-token-prediction quality and speed.

Abstract

Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. This method can be integrated into existing autoregressive models, preserving their next-token-prediction quality and speed. Fine-tuning a pretrained 8B parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared with standard sampling.
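The abstract describes the sampling loop only at a high level, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes a toy bigram model in place of a trained language model, and it resamples each token in the window with a Gibbs-style conditional on its left context and its right neighbor. The names `BIGRAM`, `next_token_probs`, `corrector_probs`, `generate_with_resampling`, `window`, and `sweeps` are all hypothetical and chosen for illustration.

```python
import random

VOCAB = ["a", "b", "c"]

# Toy bigram table: BIGRAM[prev][tok] = P(tok | prev). Numbers are made up.
BIGRAM = {
    "<s>": {"a": 0.6, "b": 0.3, "c": 0.1},
    "a": {"a": 0.1, "b": 0.7, "c": 0.2},
    "b": {"a": 0.2, "b": 0.1, "c": 0.7},
    "c": {"a": 0.7, "b": 0.2, "c": 0.1},
}


def sample(probs, rng):
    """Draw one token from a {token: probability} dict."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def next_token_probs(prefix):
    """Standard next-token distribution (here it depends on the last token only)."""
    prev = prefix[-1] if prefix else "<s>"
    return BIGRAM[prev]


def corrector_probs(prefix, following):
    """Assumed corrector: P(x_i | prefix, x_{i+1}) ∝ P(x_i | prefix) * P(x_{i+1} | x_i).

    A real system would use a trained model here; this closed form only
    holds for the toy bigram model above.
    """
    prior = next_token_probs(prefix)
    scores = {t: prior[t] * BIGRAM[t][following] for t in VOCAB}
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}


def generate_with_resampling(length, window=2, sweeps=1, seed=0):
    """Left-to-right sampling that revisits a window of earlier tokens."""
    rng = random.Random(seed)
    seq = []
    while len(seq) < length:
        # Ordinary autoregressive step: append one new token.
        seq.append(sample(next_token_probs(seq), rng))
        # Revisit up to `window` tokens that precede the newest one.
        start = max(0, len(seq) - 1 - window)
        for _ in range(sweeps):
            for i in range(start, len(seq) - 1):
                seq[i] = sample(corrector_probs(seq[:i], seq[i + 1]), rng)
    return seq


if __name__ == "__main__":
    print(generate_with_resampling(8, window=2, sweeps=1, seed=0))
```

Setting `window=0` turns the inner loop into a no-op and recovers plain left-to-right sampling, which makes the two behaviors easy to compare on the same toy model.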

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods