Language Models can perform Single-Utterance Self-Correction of Perturbed Reasoning
Sam Silver, Jimin Sun, Ivan Zhang, Sara Hooker, Eddie Kim

TL;DR
This paper investigates whether large language models can correct reasoning errors within a single utterance. It reports robust intrinsic self-correction across a range of open-weight models and datasets, which challenges the assumption that their reasoning is brittle.
Contribution
It demonstrates that LLMs can self-correct reasoning errors within a single utterance, even without fine-tuning for this purpose, an ability that has been underappreciated.
Findings
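Models can correct synthetic perturbations introduced into their Chain of Thought (CoT) reasoning within a single utterance.
Self-correction occurs across a range of open-weight models and datasets, from subtle, implicit corrections to explicit acknowledgments of errors.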
Intrinsic correction ability is stronger than previously thought.
Abstract
Large Language Models (LLMs) have demonstrated impressive mathematical reasoning capabilities, yet their performance remains brittle to minor variations in problem description and prompting strategy. Furthermore, reasoning is vulnerable to sampling-induced errors which autoregressive models must primarily address using self-correction via additionally-generated tokens. To better understand self-correction capabilities of recent models, we conduct experiments measuring models' ability to self-correct synthetic perturbations introduced into their Chain of Thought (CoT) reasoning. We observe robust single-utterance intrinsic self-correction behavior across a range of open-weight models and datasets, ranging from subtle, implicit corrections to explicit acknowledgments and corrections of errors. Our findings suggest that LLMs, including those not finetuned for long CoT, may possess stronger…
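The perturbation setup described in the abstract can be illustrated with a small sketch. The abstract does not say how the perturbations are constructed, so everything below is an assumption made for illustration: the helper names `perturb_last_number` and `build_perturbed_prefix`, the choice of a numeric off-by-one error, and the example problem are all hypothetical and are not taken from the paper. The idea is to corrupt one step of a reasoning chain, drop the steps after it, and hand the resulting prefix to a model to continue, then check whether the continuation recovers.

```python
import re


def perturb_last_number(step: str, delta: int = 1) -> str:
    """Shift the final integer in `step` by `delta` (a hypothetical perturbation)."""
    matches = list(re.finditer(r"\d+", step))
    if not matches:
        return step  # nothing numeric to perturb, so leave the step unchanged
    last = matches[-1]
    new_value = int(last.group()) + delta
    return step[:last.start()] + str(new_value) + step[last.end():]


def build_perturbed_prefix(question: str, steps: list[str], index: int, delta: int = 1) -> str:
    """Keep the steps before `index`, perturb step `index`, and drop the rest.

    The returned text is what a model would be asked to continue from.
    """
    kept = steps[:index] + [perturb_last_number(steps[index], delta)]
    return question + "\n" + "\n".join(kept) + "\n"


# Illustrative reasoning chain (invented for this sketch, not from any dataset).
steps = [
    "Tom has 3 boxes with 4 apples each, so 3 * 4 = 12 apples.",
    "He gives away 5, so 12 - 5 = 7 apples remain.",
]
prefix = build_perturbed_prefix("How many apples does Tom have left?", steps, index=0)
print(prefix)
```

In this sketch the first step is corrupted to end in "3 * 4 = 13 apples." and the second step is removed. A model continuing from this prefix could either carry the error forward or notice and repair it, which is the kind of behavior the abstract describes measuring.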
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
