Student Guides Teacher: Weak-to-Strong Inference via Spectral Orthogonal Exploration

Dayu Wang; Jiaye Yang; Weikang Li; Jiahui Liang; Yang Li; Deguo Xia; Jizhou Huang

arXiv:2601.06160 · cs.AI · April 30, 2026

TL;DR

This paper introduces Spectral Orthogonal Exploration (SOE), a geometric inference method that enhances reasoning diversity in Large Language Models, significantly improving accuracy and exploration efficiency on mathematical, logic, and code generation tasks.

Contribution

The paper proposes SOE, a novel geometric inference framework that uses orthogonal probes to diversify reasoning trajectories and mitigate reasoning collapse in LLMs.

Findings

01. SOE improves average accuracy by 62.4% on mathematical benchmarks.

02. SOE increases sampling efficiency by 113.7% over baseline methods.

03. Preliminary results show SOE's effectiveness on logic and code generation tasks.

Abstract

Large Language Models (LLMs) often suffer from "Reasoning Collapse" on challenging mathematical reasoning tasks, where stochastic sampling produces lexical variations of the same erroneous logic rather than genuine semantic exploration. We observe that failed reasoning traces are often associated with a low-rank bias manifold in the model's hidden-state geometry, which reduces exploration toward corrective solution directions. To address this, we propose Spectral Orthogonal Exploration (SOE), a geometric inference framework under a "Student Guides Teacher" paradigm. Instead of using a weak auxiliary agent for imitation, SOE uses it as an orthogonal probe that introduces semantically heterogeneous reasoning signals into the orthogonal complement of the teacher's dominant subspace. This intervention steers the teacher toward more diverse reasoning trajectories and improves exploration…
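
The geometric idea in the abstract, placing a signal in the orthogonal complement of a dominant subspace, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how the dominant subspace is estimated, so the use of a truncated SVD, the synthetic data, and every name here (`dominant_basis`, `project_to_complement`, `teacher_states`, `student_signal`) are assumptions made only for illustration.

```python
import numpy as np


def dominant_basis(hidden_states: np.ndarray, rank: int) -> np.ndarray:
    """Return an orthonormal basis (d x rank) for the top-`rank` directions.

    Assumption: the dominant subspace is estimated from the right singular
    vectors of the mean-centered hidden states (rows are samples).
    """
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def project_to_complement(signal: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Remove the component of `signal` that lies in span(basis)."""
    return signal - basis @ (basis.T @ signal)


rng = np.random.default_rng(0)
d, n, rank = 16, 64, 3

# Synthetic stand-in for teacher hidden states: low-rank structure plus noise.
low_rank = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, d))
teacher_states = low_rank + 0.01 * rng.normal(size=(n, d))

# Synthetic stand-in for a signal derived from the weaker student model.
student_signal = rng.normal(size=d)

U = dominant_basis(teacher_states, rank)
probe = project_to_complement(student_signal, U)

# The probe has (numerically) no component along the dominant directions.
print(np.abs(U.T @ probe).max())
```

The projection keeps only the part of the student signal that the teacher's dominant directions cannot express, which is one plausible reading of "orthogonal probe" in the abstract.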
