Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks

Abhranil Chandra; Ayush Agrawal; Arian Hosseini; Sebastian Fischmeister; Rishabh Agarwal; Navin Goyal; Aaron Courville

arXiv:2512.22255 · cs.AI · January 26, 2026

Open Access

TL;DR

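Training a language model on synthetic chain-of-thought traces from more capable models can improve its reasoning more than training on human-annotated data, even when every trace ends in an incorrect final answer. The authors attribute this to the synthetic traces lying closer to the model's own distribution and to the flawed traces still containing valid reasoning steps.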

Contribution

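The paper shows that training on synthetic reasoning traces whose distribution is close to the model's own can improve reasoning more than training on human-annotated traces, which suggests that the distribution of the training data can matter more than the correctness of its final answers.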

Findings

01

Synthetic data closer to the model's own distribution improves reasoning.

02

Partially flawed reasoning traces still provide valuable learning signals.

03

Distribution alignment improves performance across multiple reasoning tasks.

Abstract

We present the surprising finding that a language model's reasoning capabilities can be improved by training on synthetic datasets of chain-of-thought (CoT) traces from more capable models, even when all of those traces lead to an incorrect final answer. Our experiments show this approach can yield better performance on reasoning tasks than training on human-annotated datasets. We hypothesize that two key factors explain this phenomenon. First, the distribution of synthetic data is inherently closer to the language model's own distribution, making it more amenable to learning. Second, these 'incorrect' traces are often only partially flawed and contain valid reasoning steps from which the model can learn. To further test the first hypothesis, we use a language model to paraphrase human-annotated traces, shifting their distribution closer to the model's own distribution, and show…
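One way to make "closer to the model's own distribution" concrete is to compare the average per-token negative log-likelihood (NLL) that a model assigns to different traces: a lower value means the text is less surprising to the model. The abstract does not say which measure the authors use, so the sketch below is only an illustration under that assumption. It replaces the language model with a toy add-one-smoothed unigram model, and all data and names (`fit_unigram`, `mean_nll`, `model_text`, `synthetic_trace`, `human_trace`) are hypothetical.

```python
import math
from collections import Counter


def fit_unigram(corpus_tokens, vocab):
    """Add-one-smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens) + len(vocab)
    return {tok: (counts[tok] + 1) / total for tok in vocab}


def mean_nll(tokens, probs):
    """Average negative log-likelihood per token (lower = closer to the model)."""
    return -sum(math.log(probs[tok]) for tok in tokens) / len(tokens)


# Hypothetical toy data: text the stand-in "model" was fit on, plus two traces.
model_text = "so we add 2 and 3 to get 5 so the answer is 5".split()
synthetic_trace = "so we add 2 and 3 so the answer is 5".split()
human_trace = "summing both quantities yields five".split()

# Shared vocabulary so every token in every trace has a nonzero probability.
vocab = set(model_text) | set(synthetic_trace) | set(human_trace)
probs = fit_unigram(model_text, vocab)

synthetic_nll = mean_nll(synthetic_trace, probs)
human_nll = mean_nll(human_trace, probs)
print(f"synthetic: {synthetic_nll:.3f}  human: {human_nll:.3f}")
```

In this toy setup the trace written in the model's own phrasing gets the lower NLL. With a real language model, the same comparison would use the model's token log-probabilities in place of the unigram table.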

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques