LightThinker++: From Reasoning Compression to Memory Management

Yuqi Zhu, Jintian Zhang, Zhenjie Wan, Yujie Luo, Shuofei Qiao, Zhengke Gui, Da Zheng, Lei Liang, Huajun Chen, and Ningyu Zhang

arXiv:2604.03679 · cs.CL · April 7, 2026

TL;DR

LightThinker++ combines reasoning compression with explicit memory management to make large language models more efficient and scalable on complex, long-horizon reasoning tasks.

Contribution

The paper presents a framework that combines dynamic reasoning compression with explicit memory primitives, enabling scalable and efficient long-horizon reasoning in LLMs.

Findings

01

LightThinker reduces peak token usage by 70% and inference time by 26% with minimal accuracy loss.

02

Slashes peak token usage by 69.9% and gains +2.42% accuracy in standard reasoning.

03

Maintains stable memory footprint beyond 80 rounds with a 14.8% performance gain in complex scenarios.

Abstract

Large language models (LLMs) excel at complex reasoning, yet their efficiency is limited by the surging cognitive overhead of long thought traces. In this paper, we propose LightThinker, a method that enables LLMs to dynamically compress intermediate thoughts into compact semantic representations. However, static compression often struggles with complex reasoning where the irreversible loss of intermediate details can lead to logical bottlenecks. To address this, we evolve the framework into LightThinker++, introducing Explicit Adaptive Memory Management. This paradigm shifts to behavioral-level management by incorporating explicit memory primitives, supported by a specialized trajectory synthesis pipeline to train purposeful memory scheduling. Extensive experiments demonstrate the framework's versatility across three dimensions. (1) LightThinker reduces peak token usage by 70% and…
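
To make the "peak token usage" metric concrete, the following toy sketch compares the largest context held at any point when every reasoning step is kept in full against the case where each finished step is swapped for a short summary. It is only an illustration under stated assumptions: the function `peak_context`, the fixed `compressed_size`, and the step lengths are all hypothetical, and the numbers it produces are not the paper's results and do not reflect how LightThinker actually compresses thoughts.

```python
def peak_context(step_tokens, compressed_size=None):
    """Return the largest number of tokens held in context at any time.

    step_tokens: tokens generated at each reasoning step (toy values).
    compressed_size: hypothetical summary length that replaces a finished
        step; None means every step is kept in full.
    """
    context = 0
    peak = 0
    for n in step_tokens:
        context += n                 # the step is first generated in full
        peak = max(peak, context)    # peak is measured before compression
        if compressed_size is not None:
            kept = min(n, compressed_size)
            context = context - n + kept   # swap the step for its summary
    return peak


# Toy scenario (assumed): 80 rounds of 100 tokens each, 10-token summaries.
steps = [100] * 80
baseline = peak_context(steps)
compressed = peak_context(steps, compressed_size=10)
print(baseline, compressed)
```

In this toy setting the uncompressed context grows linearly with the number of rounds, while the compressed one grows only by the summary length per round, which is the intuition behind reporting peak token usage rather than total tokens generated.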
