TL;DR
TrigReason introduces a trigger-based collaborative framework that leverages small and large reasoning models to improve efficiency and reduce latency in complex reasoning tasks.
Contribution
The paper systematically analyzes the limitations of small reasoning models (SRMs) and proposes a trigger-based method for selective intervention by a large reasoning model (LRM), improving reasoning efficiency while preserving LRM-level accuracy.
Findings
TrigReason matches LRM accuracy while offloading more reasoning steps to SRMs.
It reduces latency by 43.9% and API cost by 73.3% under edge-cloud conditions.
It offloads 1.70x to 4.79x more reasoning steps to SRMs.
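To make the percentage reductions above easier to compare with multiplicative speedups, the short sketch below converts a reported reduction into the fraction that remains and the equivalent improvement factor. The function names are illustrative and not from the paper; the only inputs are the 43.9% and 73.3% figures quoted above.

```python
def remaining_fraction(reduction_pct: float) -> float:
    """Fraction of the baseline that remains after a percentage reduction."""
    return 1.0 - reduction_pct / 100.0


def improvement_factor(reduction_pct: float) -> float:
    """Equivalent 'N times lower' factor for a percentage reduction."""
    return 1.0 / remaining_fraction(reduction_pct)


# Figures reported in the findings above.
latency_left = remaining_fraction(43.9)   # share of baseline latency remaining
cost_left = remaining_fraction(73.3)      # share of baseline API cost remaining

print(f"latency: {latency_left:.3f} of baseline, {improvement_factor(43.9):.2f}x lower")
print(f"API cost: {cost_left:.3f} of baseline, {improvement_factor(73.3):.2f}x lower")
```

These factors describe latency and cost only. They are a different quantity from the 1.70x to 4.79x figure, which counts reasoning steps offloaded to SRMs.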
Abstract
Large Reasoning Models (LRMs) achieve strong performance on complex tasks through extended chains of thought but suffer from high inference latency due to autoregressive reasoning. Recent work explores using Small Reasoning Models (SRMs) to accelerate LRM inference. In this paper, we systematically characterize the capability boundaries of SRMs and identify three common types of reasoning risks: (1) path divergence, where SRMs lack the strategic ability to construct an initial plan, causing reasoning to deviate from the most probable path; (2) cognitive overload, where SRMs fail to solve particularly difficult steps; and (3) recovery inability, where SRMs lack robust self-reflection and error correction mechanisms. To address these challenges, we propose TrigReason, a trigger-based collaborative reasoning framework that replaces continuous polling with selective intervention. TrigReason…
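The idea of selective intervention can be sketched as a loop in which the SRM drafts every step and the LRM is called only when a trigger fires. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract is truncated before it describes the actual triggers, so the three trigger functions below (`needs_plan`, `looks_overloaded`, `is_stuck`) are hypothetical stand-ins loosely mapped to the three risks, and `toy_srm` and `toy_lrm` are placeholder models.

```python
from typing import Callable, List, Tuple

Step = str
Model = Callable[[List[Step]], Step]          # steps so far -> next step
Trigger = Callable[[List[Step], Step], bool]  # (history, SRM draft) -> escalate?


def trigger_reason(srm: Model, lrm: Model, triggers: List[Trigger],
                   max_steps: int) -> Tuple[List[Step], int]:
    """Let the SRM draft each step; call the LRM only when a trigger fires.

    Returns the accepted steps and the number of LRM calls.
    """
    steps: List[Step] = []
    lrm_calls = 0
    for _ in range(max_steps):
        draft = srm(steps)
        if any(trigger(steps, draft) for trigger in triggers):
            draft = lrm(steps)  # the LRM's step replaces the SRM draft
            lrm_calls += 1
        steps.append(draft)
        if draft.startswith("ANSWER"):
            break
    return steps, lrm_calls


# Hypothetical triggers (assumptions, one per risk named in the abstract).
def needs_plan(history: List[Step], draft: Step) -> bool:
    return len(history) == 0            # path divergence: LRM writes the first step


def looks_overloaded(history: List[Step], draft: Step) -> bool:
    return "unsure" in draft.lower()    # cognitive overload: SRM signals doubt


def is_stuck(history: List[Step], draft: Step) -> bool:
    return draft in history             # recovery inability: SRM repeats itself


# Placeholder models so the sketch runs without any real LLM.
def toy_srm(history: List[Step]) -> Step:
    script = ["guess a plan", "expand terms", "unsure how to factor",
              "expand terms", "ANSWER: 4"]
    return script[len(history)] if len(history) < len(script) else "ANSWER: 4"


def toy_lrm(history: List[Step]) -> Step:
    return f"LRM step {len(history)}"


steps, lrm_calls = trigger_reason(
    toy_srm, toy_lrm, [needs_plan, looks_overloaded, is_stuck], max_steps=10)
print(steps)
print("LRM calls:", lrm_calls, "of", len(steps), "steps")
```

In this toy run the LRM handles three of five steps. A polling design would instead consult the LRM at every step, so the saving depends on how rarely the triggers fire.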
