Rethinking Chain-of-Thought from the Perspective of Self-Training
Zongqian Wu, Baoduo Xu, Ruochen Cui, Mengmeng Zhan, Xiaofeng Zhu, Lei Feng

TL;DR
This paper proposes a Chain-of-Thought (CoT) framework built on self-training principles to improve reasoning in large language models. It targets two limitations of earlier CoT methods, over-reasoning and high similarity between consecutive reasoning iterations, to improve both performance and computational efficiency.
Contribution
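It proposes a new CoT framework with two modules: a task-specific prompt module that optimizes the initial reasoning process, and an adaptive reasoning iteration module that dynamically refines it. Together they improve reasoning accuracy and computational efficiency.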
Findings
Significant performance improvements over existing CoT methods.
Enhanced reasoning accuracy with reduced over-reasoning.
Improved computational efficiency in reasoning processes.
Abstract
Chain-of-thought (CoT) reasoning has emerged as an effective approach for activating latent capabilities in LLMs. Interestingly, we observe that CoT reasoning and self-training share the same core objective: iteratively leveraging model-generated information to progressively reduce prediction uncertainty. Building on this insight, we propose a novel CoT framework to improve reasoning performance. Our framework integrates two key components: (i) a task-specific prompt module that optimizes the initial reasoning process, and (ii) an adaptive reasoning iteration module that dynamically refines the reasoning process and addresses the limitations of previous CoT approaches, i.e., over-reasoning and high similarity between consecutive reasoning iterations. Extensive experiments demonstrate that the proposed method achieves significant advantages in both performance and computational efficiency.
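The abstract does not specify how the adaptive reasoning iteration module works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that prediction uncertainty is measured as the entropy of sampled answers, that similarity between consecutive iterations is measured by token-level Jaccard overlap, and that the model is wrapped in a hypothetical `sample_fn` callback. The thresholds and all function names are illustrative choices.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: (question, previous_reasoning, diversify) -> list of
# (reasoning_text, answer) samples. It stands in for an LLM call.
SampleFn = Callable[[str, Optional[str], bool], List[Tuple[str, str]]]


def answer_entropy(answers: List[str]) -> float:
    """Shannon entropy (in nats) of the empirical answer distribution."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def jaccard_similarity(a: str, b: str) -> float:
    """Token-set overlap between two reasoning texts, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def adaptive_reasoning(
    sample_fn: SampleFn,
    question: str,
    max_iters: int = 5,
    entropy_threshold: float = 0.3,     # assumed value, not from the paper
    similarity_threshold: float = 0.8,  # assumed value, not from the paper
) -> Tuple[str, int]:
    """Iterate until the answer distribution is confident enough.

    Returns the majority answer and the number of iterations used.
    """
    previous: Optional[str] = None
    diversify = False
    majority = ""
    for step in range(1, max_iters + 1):
        samples = sample_fn(question, previous, diversify)
        answers = [answer for _, answer in samples]
        majority = Counter(answers).most_common(1)[0][0]

        # Stop early once uncertainty is low, to avoid over-reasoning.
        if answer_entropy(answers) <= entropy_threshold:
            return majority, step

        # If this iteration mostly repeats the previous one, ask the
        # sampler to diversify its next round of reasoning.
        reasoning = samples[0][0]
        diversify = (
            previous is not None
            and jaccard_similarity(reasoning, previous) >= similarity_threshold
        )
        previous = reasoning
    return majority, max_iters
```

In this sketch the early exit is what limits over-reasoning, and the `diversify` flag is one simple way to react when two consecutive iterations are nearly identical. How the sampler acts on that flag (for example, by changing the prompt) is left to the caller.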
Taxonomy
Topics: Information Systems Theories and Implementation · Educational Leadership and Innovation · Management and Organizational Studies
