Structural Rationale Distillation via Reasoning Space Compression

Jialin Yang; Jiankun Wang; Jiajun Wu; Henry Leung; Jiayu Zhou; Steve Drew

arXiv:2605.07139 · cs.CL · May 11, 2026

TL;DR

This paper introduces Distillation through Reasoning Path Compression (D-RPC), which improves reasoning distillation from large to small language models by constraining the teacher to a compact, reusable bank of high-level reasoning paths, yielding more consistent and effective rationales.

Contribution

The paper proposes a novel reasoning path compression technique for distillation that balances coverage and supervision entropy, backed by PAC-Bayes analysis and extensive empirical validation.

Findings

01. D-RPC outperforms existing distillation methods across multiple benchmarks.

02. Smaller reasoning path banks can achieve optimal generalization performance.

03. D-RPC produces more consistent and diverse rationales with fewer tokens.

Abstract

When distilling reasoning from large language models (LLMs) into smaller ones, teacher rationales for similar problems often vary wildly in structure and strategy. Like a chef who makes the same dish differently each time, this inconsistency burdens the student with noisy supervision that is hard to internalize. We propose Distillation through Reasoning Path Compression (D-RPC), which constrains the teacher to follow a compact, dynamically maintained bank of reusable high-level reasoning paths. For each training question, D-RPC retrieves the most relevant path and conditions the teacher to follow it, producing rationales that are consistent across similar problems yet diverse enough to cover different problem types. A PAC-Bayes analysis formalizes the resulting trade-off between bank size and coverage: smaller banks reduce supervision entropy but risk coverage gaps, and the…
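
The retrieve-then-condition step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `PathBank` class, the bag-of-words cosine similarity used for retrieval, the fixed `max_size` cap, and the prompt template in `teacher_prompt` are all assumptions made for the example. The excerpt does not say how D-RPC scores relevance or how it maintains the bank.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass, field


def bow(text: str) -> Counter:
    """Lowercased bag-of-words vector (assumed stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class PathBank:
    """Hypothetical bank of reusable high-level reasoning paths."""

    max_size: int = 8  # assumed cap; a smaller bank means fewer distinct paths
    paths: list[str] = field(default_factory=list)

    def add(self, path: str) -> bool:
        """Add a path unless it is a duplicate or the bank is full."""
        if path in self.paths or len(self.paths) >= self.max_size:
            return False
        self.paths.append(path)
        return True

    def retrieve(self, question: str) -> tuple[str | None, float]:
        """Return the most similar path and its score, or (None, 0.0)."""
        q = bow(question)
        best, best_score = None, 0.0
        for path in self.paths:
            score = cosine(q, bow(path))
            if score > best_score:
                best, best_score = path, score
        return best, best_score


def teacher_prompt(question: str, path: str) -> str:
    """Build a prompt that asks the teacher to follow the retrieved path."""
    return (
        "Follow this high-level reasoning path when solving the problem.\n"
        f"Path: {path}\n"
        f"Question: {question}\n"
        "Rationale:"
    )


if __name__ == "__main__":
    bank = PathBank(max_size=2)
    bank.add("Set up an equation for the unknown quantity, solve the equation, then check the answer.")
    bank.add("Count the total outcomes, count the favorable outcomes, then divide to get the probability.")
    question = "What is the probability of rolling two sixes with two dice?"
    path, score = bank.retrieve(question)
    print(teacher_prompt(question, path))
```

Because every question that retrieves the same path is rendered with the same template, the teacher's rationales for similar problems share a structure. In this sketch `max_size` is the knob that would trade consistency against coverage.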
