CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision
Awni Altabaa, Omar Montasser, John Lafferty

TL;DR
This paper provides a theoretical analysis showing that chain-of-thought supervision can significantly reduce the sample complexity needed for learning complex functions, by quantifying the additional information gained from reasoning steps.
Contribution
It introduces the CoT information measure and links it to sample complexity bounds, offering the first statistical theory for learning with CoT supervision.
Findings
CoT supervision yields faster learning rates than end-to-end supervision.
The sample complexity needed to reach a target end-to-end error scales inversely with the CoT information measure.
Theoretical lower bounds are established based on CoT information.
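As a rough sketch of the scaling described in the findings above, the comparison can be written as follows. The symbols here are shorthand assumed for illustration and are not defined in this summary: $m(\epsilon)$ is the number of samples needed to reach end-to-end error $\epsilon$, $d$ is a complexity measure of the hypothesis class, and $\mathcal{I}(\epsilon)$ abbreviates the CoT information measure. Constants and logarithmic factors are omitted.

$$
m_{\mathrm{CoT}}(\epsilon) \;\propto\; \frac{d}{\mathcal{I}(\epsilon)}
\qquad \text{versus} \qquad
m_{\mathrm{E2E}}(\epsilon) \;\propto\; \frac{d}{\epsilon}
$$

Under this reading, CoT supervision helps to the extent that $\mathcal{I}(\epsilon)$ exceeds $\epsilon$: the larger the ratio $\mathcal{I}(\epsilon)/\epsilon$, the fewer samples are needed relative to end-to-end supervision.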
Abstract
Learning complex functions that involve multi-step reasoning poses a significant challenge for standard supervised learning from input-output examples. Chain-of-thought (CoT) supervision, which provides intermediate reasoning steps together with the final output, has emerged as a powerful empirical technique, underpinning much of the recent progress in the reasoning capabilities of large language models. This paper develops a statistical theory of learning under CoT supervision. A key characteristic of the CoT setting, in contrast to standard supervision, is the mismatch between the training objective (CoT risk) and the test objective (end-to-end risk). A central part of our analysis, distinguished from prior work, is explicitly linking those two types of risk to achieve sharper sample complexity bounds. This is achieved via the *CoT information measure* $\mathcal{I}_{\mathcal{D},…
Taxonomy
Topics: Stock Market Forecasting Methods · Mental Health Research Topics · Quantum Computing Algorithms and Architecture
