Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn

TL;DR
This paper introduces Meta Chain-of-Thought (Meta-CoT), a framework that models the reasoning process behind a Chain-of-Thought to improve reasoning in large language models. It pairs empirical evidence of in-context search in current models with methods and a training pipeline for producing Meta-CoT.
Contribution
It proposes a new Meta-CoT framework, methods for generating Meta-CoT, and a training pipeline, advancing the understanding and development of reasoning in LLMs.
Findings
State-of-the-art models exhibit behaviors consistent with in-context search.
Methods for producing Meta-CoT include process supervision, synthetic data generation, and search algorithms.
A training pipeline is outlined that combines instruction tuning on linearized search traces with reinforcement learning post-training.
Abstract
We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT. We present empirical evidence from state-of-the-art models exhibiting behaviors consistent with in-context search, and explore methods for producing Meta-CoT via process supervision, synthetic data generation, and search algorithms. We then outline a concrete pipeline for training a model to produce Meta-CoTs, incorporating instruction tuning with linearized search traces and reinforcement learning post-training. Finally, we discuss open research questions, including scaling laws, verifier roles, and the potential for discovering novel reasoning algorithms. This work provides a theoretical and practical roadmap to enable Meta-CoT in LLMs, paving the way for more powerful and human-like…
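To make the phrase "linearized search traces" concrete, here is a minimal sketch of what such a trace might look like. It is a toy illustration, not the authors' pipeline: the puzzle (reach a target integer from a start value using +3 and *2), the function names `dfs_trace` and `linearize`, and the trace vocabulary ("visit", "try", "backtrack") are all assumptions made for this example. The idea it shows is that a tree search, including its dead ends and backtracking, can be flattened into one sequence of the kind a model could be instruction-tuned on.

```python
from typing import List


def dfs_trace(state: int, target: int, depth: int, trace: List[str]) -> bool:
    """Depth-first search that logs every step, including dead ends.

    Hypothetical toy problem: reach `target` from `state` using +3 and *2.
    """
    trace.append(f"visit {state}")
    if state == target:
        trace.append("solved")
        return True
    # Dead end: out of depth budget, or overshot the target.
    if depth == 0 or state > target:
        trace.append(f"backtrack from {state}")
        return False
    for name, nxt in (("+3", state + 3), ("*2", state * 2)):
        trace.append(f"try {name}")
        if dfs_trace(nxt, target, depth - 1, trace):
            return True
    # Every child failed, so this node is abandoned as well.
    trace.append(f"backtrack from {state}")
    return False


def linearize(start: int, target: int, max_depth: int = 3) -> str:
    """Flatten the whole search, failures included, into a single string."""
    trace: List[str] = []
    dfs_trace(start, target, max_depth, trace)
    return " ; ".join(trace)


if __name__ == "__main__":
    print(linearize(1, 8))
```

In this sketch the flattened string keeps the failed branches rather than only the final solution path, which is the distinction the abstract draws between a Meta-CoT and an ordinary CoT.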
Taxonomy
Topics: Business Process Modeling and Analysis · Semantic Web and Ontologies
