Adaptive GoGI-Skip: Coupling Goal-Gradient Importance with Dynamic Uncertainty for Efficient Reasoning
Ren Zhuang

TL;DR
Adaptive GoGI-Skip improves reasoning efficiency by dynamically coupling goal-gradient importance with uncertainty-based token skipping, reducing tokens by over 45% and speeding up inference by up to 2.0× without accuracy loss.
Contribution
It introduces a novel framework that non-linearly couples gradient-based token importance with adaptive dynamic skipping, improving reasoning efficiency in language models.
Findings
Reduces token volume by over 45%
Speeds up inference by up to 2.0×
Maintains accuracy across multiple reasoning benchmarks
Abstract
Chain-of-Thought (CoT) prompting trades inference speed for reasoning accuracy. Existing compressors force a compromise: static gradient techniques treat tokens independently, severing sequential logic, while uncertainty-based pruning ignores the final answer. We introduce Adaptive GoGI-Skip, a framework that resolves this tension by non-linearly coupling Goal-Gradient Importance (GoGI) with Adaptive Dynamic Skipping (ADS). GoGI quantifies each token's functional contribution to answer correctness via gradient sensitivity. ADS leverages runtime entropy to dynamically modulate the GoGI threshold, preserving low-gradient tokens essential for structural coherence at high-uncertainty junctions. Trained on 7,472 MATH traces, our policy transfers zero-shot to AIME, GPQA, and GSM8K, reducing token volume by 45% and accelerating inference by up to 2.0× without accuracy loss. These…
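To illustrate the idea of an entropy-modulated importance threshold, here is a minimal sketch. It is not the paper's implementation. The exponential coupling, the function names (`shannon_entropy`, `adaptive_threshold`, `keep_mask`), the parameters `base` and `k`, and the toy importance scores are all assumptions made for illustration. In the paper the importance scores would come from gradient sensitivity, which this sketch does not compute.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_threshold(base: float, entropy: float,
                       max_entropy: float, k: float = 2.0) -> float:
    """Hypothetical non-linear coupling (an assumption, not the paper's rule):
    the threshold shrinks exponentially as normalized entropy rises, so more
    tokens survive where the model is uncertain."""
    u = min(max(entropy / max_entropy, 0.0), 1.0)  # clamp to [0, 1]
    return base * math.exp(-k * u)


def keep_mask(importance: Sequence[float], entropies: Sequence[float],
              base: float = 0.5, max_entropy: float = math.log(4),
              k: float = 2.0) -> List[bool]:
    """Keep a token when its importance meets the entropy-adjusted threshold."""
    return [s >= adaptive_threshold(base, h, max_entropy, k)
            for s, h in zip(importance, entropies)]


# Toy data over a 4-symbol vocabulary (illustrative values only).
confident = [0.97, 0.01, 0.01, 0.01]   # low-entropy step
uncertain = [0.25, 0.25, 0.25, 0.25]   # maximum-entropy step

importance = [0.9, 0.2, 0.2]           # stand-ins for GoGI scores
entropies = [shannon_entropy(confident),
             shannon_entropy(confident),
             shannon_entropy(uncertain)]

mask = keep_mask(importance, entropies)
print(mask)  # [True, False, True]
```

The second and third tokens have the same low importance score. Only the one at the high-entropy step is kept, which mirrors the abstract's description of preserving low-gradient tokens at high-uncertainty junctions.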
