Corrector Sampling in Language Models
Itai Gat, Neta Shaul, Uriel Singer, Yaron Lipman

TL;DR
This paper introduces Resample-Previous-Tokens (RPT), a novel sampling method for autoregressive language models that reduces error accumulation by revisiting previous tokens, leading to improved reasoning and coding performance.
Contribution
The paper presents RPT, a new sampling technique that can be integrated into existing models to enhance their accuracy without sacrificing speed.
Findings
RPT improves reasoning and coding benchmark scores by roughly 10% relative to standard sampling.
These gains come from fine-tuning a pretrained 8B-parameter model with RPT.
RPT preserves the original model's next-token-prediction quality and speed.
Abstract
Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. This method can be integrated into existing autoregressive models, preserving their next-token-prediction quality and speed. Fine-tuning a pretrained 8B-parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
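The abstract's description of revisiting tokens in a window can be illustrated with a toy sampling loop. This is only a minimal sketch under stated assumptions, not the authors' algorithm: the two callables `next_dist` and `prev_dist`, the `window` and `sweeps` parameters, and the sweep order are all hypothetical, since this page does not give the training objective or the exact resampling schedule.

```python
import random
from typing import Callable, Dict, List, Sequence

Dist = Dict[str, float]  # maps token -> probability


def sample(dist: Dist, rng: random.Random) -> str:
    """Draw one token from a categorical distribution."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def rpt_style_sample(
    next_dist: Callable[[Sequence[str]], Dist],
    prev_dist: Callable[[Sequence[str], str], Dist],
    prompt: Sequence[str],
    num_new_tokens: int,
    window: int = 2,
    sweeps: int = 1,
    seed: int = 0,
) -> List[str]:
    """Toy generation loop that revisits recently generated tokens.

    next_dist(prefix)            -> distribution over the next token
    prev_dist(prefix, successor) -> distribution over the token that sits
                                    between `prefix` and `successor`
    Both callables are hypothetical stand-ins for a trained model.
    """
    rng = random.Random(seed)
    seq = list(prompt)
    for _ in range(num_new_tokens):
        # Ordinary left-to-right step.
        seq.append(sample(next_dist(seq), rng))
        # Revisit up to `window` generated tokens before the newest one.
        # Prompt tokens are never modified.
        lo = max(len(prompt), len(seq) - 1 - window)
        for _ in range(sweeps):
            for j in range(lo, len(seq) - 1):
                # Possibly replace token j given its prefix and its successor.
                seq[j] = sample(prev_dist(seq[:j], seq[j + 1]), rng)
    return seq
```

With `window=0` the inner loop never runs and the function reduces to plain autoregressive sampling, which mirrors the abstract's claim that the method can sit on top of ordinary next-token prediction.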
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
