Relative Kinetic Utility for Reasoning-Aware Structural Pruning in Large Language Models

Tianhao Qian

arXiv:2605.09008 · cs.LG · May 12, 2026

TL;DR

This paper introduces RKU, a novel framework for structural pruning in large language models that preserves reasoning capabilities at high sparsity levels by focusing on critical structural pathways.

Contribution

The paper proposes RKU, a continuous kinetic integral approach with Fisher normalization, to improve reasoning accuracy in pruned LLMs at high sparsity, surpassing existing methods.

Findings

01

RKU improves reasoning accuracy at 40% sparsity on GSM8K.

02

RKU outperforms baseline methods in high-sparsity regimes.

03

RKU better preserves reasoning representations under out-of-distribution tests.

Abstract

Chain-of-Thought (CoT) prompting marked a major improvement in the reasoning capabilities of Large Language Models (LLMs). However, scaling up test-time computation yields long CoT sequences, introducing severe inference latency and key-value (KV) cache memory bottlenecks. While structural pruning offers a fundamental, hardware-aware way to reduce the static parameter burden, existing magnitude-based methods may cut off the neurons that support CoT reasoning. By over-indexing on discrete cross-entropy objectives, these heuristics fall into a "magnitude trap": they prioritize high-frequency, low-information syntactic tokens and trigger reasoning collapse at high sparsities (e.g., 40%). To overcome this topological phase transition, we propose Relative Kinetic Utility (RKU), a novel theoretical framework that elevates discrete pruning to a continuous kinetic integral…
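
For orientation, the kind of magnitude-based structural pruning that the abstract contrasts RKU against can be sketched as follows. This is a generic baseline under stated assumptions, not the paper's method: the abstract excerpt does not give RKU's scoring rule, so the function name `magnitude_prune_neurons`, the choice of L2 row norm as the importance score, and the toy weights are all hypothetical.

```python
import math

def magnitude_prune_neurons(w_in, w_out, sparsity):
    """Drop the hidden neurons with the smallest L2 weight norm.

    w_in:     list of rows, one per hidden neuron (hidden x d_in)
    w_out:    list of rows, one per output unit (d_out x hidden)
    sparsity: fraction of hidden neurons to remove, e.g. 0.4
    """
    hidden = len(w_in)
    n_prune = int(hidden * sparsity)
    # Assumed importance score: L2 norm of each neuron's incoming weights.
    scores = [math.sqrt(sum(x * x for x in row)) for row in w_in]
    # Sort neuron indices from least to most important, keep the top ones.
    order = sorted(range(hidden), key=lambda i: scores[i])
    keep = sorted(order[n_prune:])
    # Structural pruning removes whole rows of w_in and whole columns of w_out.
    pruned_in = [w_in[i] for i in keep]
    pruned_out = [[row[i] for i in keep] for row in w_out]
    return keep, pruned_in, pruned_out

# Toy layer: 5 hidden neurons, 2 inputs, 1 output (illustrative values only).
w_in = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2], [1.5, -1.5], [0.3, 0.4]]
w_out = [[1.0, 2.0, 3.0, 4.0, 5.0]]
keep, pruned_in, pruned_out = magnitude_prune_neurons(w_in, w_out, 0.4)
print(keep)        # indices of surviving neurons
print(pruned_out)  # output weights with pruned columns removed
```

Because the score depends only on weight magnitude, a small-norm neuron is removed regardless of how much it contributes to multi-step reasoning. That is the failure mode the abstract attributes to this family of heuristics.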
