From Reasoning to Super-Intelligence: A Search-Theoretic Perspective

Shai Shalev-Shwartz; Amnon Shashua

arXiv:2507.15865 · cs.AI · July 29, 2025

TL;DR

This paper introduces the Diligent Learner, a new framework for training large reasoning models that explicitly models reasoning as a guided depth-first search, overcoming key obstacles in current CoT learning methods.

Contribution

It proposes the Diligent Learner paradigm, providing theoretical guarantees for efficient learning from CoT data and addressing limitations of existing approaches.

Findings

1. Diligent Learner can learn efficiently from CoT data under mild assumptions.

2. Existing methods often fail due to distribution drift and exponential inference costs.

3. The framework enables scalable, reliable reasoning systems trained on incomplete data.

Abstract

Chain-of-Thought (CoT) reasoning has emerged as a powerful tool for enhancing the problem-solving capabilities of large language models (LLMs). However, the theoretical foundations of learning from CoT data remain underdeveloped, and existing approaches -- such as Supervised Fine-Tuning (SFT), Reinforcement Learning (RL), Tree-of-Thoughts (ToT), and Monte Carlo Tree Search (MCTS) -- often fail on complex reasoning tasks. In this work, we identify core obstacles that hinder effective CoT learning, including distribution drift, lack of embedded search, and exponential inference costs. We introduce the Diligent Learner, a new learning paradigm that explicitly models reasoning as a depth-first search guided by a validator and supports backtracking upon failure. Under two mild and realistic assumptions, we prove that the Diligent Learner can efficiently learn from CoT data while existing…
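
To make the phrase "depth-first search guided by a validator" with "backtracking upon failure" concrete, here is a minimal Python sketch on a toy arithmetic puzzle. It is an illustration only, not the paper's algorithm: the names `guided_dfs`, `propose`, `validate`, and `find_chain` are hypothetical, the step proposer is a fixed rule rather than a learned model, and the validator is a simple hand-written check.

```python
from typing import Callable, List, Optional


def guided_dfs(
    chain: List[int],
    propose: Callable[[int], List[int]],
    validate: Callable[[List[int], int], bool],
    is_goal: Callable[[int], bool],
    max_depth: int,
) -> Optional[List[int]]:
    """Depth-first search over reasoning chains (hypothetical sketch).

    `chain` is the sequence of states visited so far. Each candidate next
    step is checked by `validate` before it is explored; if every candidate
    below a state fails, the call returns None and the caller backtracks.
    """
    state = chain[-1]
    if is_goal(state):
        return chain
    if len(chain) - 1 >= max_depth:
        return None  # depth budget exhausted: backtrack
    for candidate in propose(state):
        if not validate(chain, candidate):
            continue  # validator rejects this step: try the next candidate
        result = guided_dfs(chain + [candidate], propose, validate, is_goal, max_depth)
        if result is not None:
            return result
    return None  # all candidates failed: backtrack to the parent state


def find_chain(start: int, target: int, max_depth: int = 6) -> Optional[List[int]]:
    """Toy task (an assumption for illustration): reach `target` from `start`
    using the steps 'double' and 'add 3', never overshooting the target."""
    return guided_dfs(
        chain=[start],
        propose=lambda x: [x * 2, x + 3],
        validate=lambda chain, nxt: nxt <= target,
        is_goal=lambda x: x == target,
        max_depth=max_depth,
    )


if __name__ == "__main__":
    print(find_chain(1, 11))
```

In this toy run the search first follows the doubling branch, has the step to 16 rejected by the validator, and then tries the alternative step from 8. When no chain exists within the depth budget, `find_chain` returns `None`.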

Taxonomy

Topics: Computability, Logic, AI Algorithms