STaR: Bootstrapping Reasoning With Reasoning
Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman

TL;DR
STaR is a self-improving method that iteratively strengthens a language model's reasoning by bootstrapping from a few rationale examples and a large dataset without rationales. It outperforms models fine-tuned only on final answers.
Contribution
The paper introduces STaR, a novel iterative self-taught approach that improves reasoning in language models without requiring large rationale datasets.
Findings
STaR outperforms models trained only on final answers.
STaR achieves comparable results to much larger models on CommonsenseQA.
Iterative self-taught training enhances reasoning capabilities.
Abstract
Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering. However, inducing language model rationale generation currently requires either constructing massive rationale datasets or sacrificing accuracy by using only few-shot inference. We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales, to bootstrap the ability to perform successively more complex reasoning. This technique, the "Self-Taught Reasoner" (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat. We show that…
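The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `star_iteration`, `star`, `make_toy_generator`, and `toy_fine_tune` are hypothetical, the model is reduced to a callable that returns a (rationale, answer) pair, and the toy "fine-tuning" step just memorizes the kept examples so the control flow can be run end to end.

```python
from typing import Callable, Dict, List, Optional, Tuple

Example = Tuple[str, str, str]  # (question, rationale, answer)
# Hypothetical model interface: (question, optional answer hint) -> (rationale, answer)
Generate = Callable[[str, Optional[str]], Tuple[str, str]]
FineTune = Callable[[List[Example]], Generate]


def star_iteration(generate: Generate, dataset: List[Tuple[str, str]]) -> List[Example]:
    """One pass: keep only rationales that end in the correct answer."""
    kept: List[Example] = []
    for question, gold in dataset:
        rationale, answer = generate(question, None)
        if answer != gold:
            # Wrong answer: try again, this time given the correct answer.
            rationale, answer = generate(question, gold)
        if answer == gold:
            kept.append((question, rationale, gold))
    return kept


def star(generate: Generate, fine_tune: FineTune,
         dataset: List[Tuple[str, str]], n_iterations: int) -> Generate:
    """Repeat: generate, filter, fine-tune on the kept rationales."""
    for _ in range(n_iterations):
        kept = star_iteration(generate, dataset)
        # Which checkpoint fine-tuning starts from is left to this callable.
        generate = fine_tune(kept)
    return generate


# Toy stand-ins (assumptions for illustration only, not a real language model).
def make_toy_generator(known: Dict[str, str]) -> Generate:
    def generate(question: str, hint: Optional[str] = None) -> Tuple[str, str]:
        if question in known:
            return ("recalled", known[question])
        if hint is not None:
            return ("worked backward from hint", hint)
        return ("guess", "?")
    return generate


def toy_fine_tune(examples: List[Example]) -> Generate:
    # "Fine-tuning" here simply memorizes every kept (question, answer) pair.
    return make_toy_generator({q: a for q, _, a in examples})


dataset = [("2+2", "4"), ("3+3", "6")]
base = make_toy_generator({"2+2": "4"})
kept = star_iteration(base, dataset)
final = star(base, toy_fine_tune, dataset, n_iterations=2)
```

In this toy run the base generator answers only the first question unaided. The second is recovered through the retry with the correct answer supplied, so both examples are kept and the "fine-tuned" generator can then answer both without a hint.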
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · AI-based Problem Solving and Planning
