Free(): Learning to Forget in Malloc-Only Reasoning Models
Yilun Zheng, Dongyang Ma, Tian Liang, Jiahao Xu, Xinting Huang, Lihui Chen, Haitao Mi, Yan Wang

TL;DR
Free()LM introduces a self-forgetting mechanism in reasoning models, enabling dynamic pruning of obsolete information to improve performance and stability across various model scales and tasks.
Contribution
The paper proposes the Free-Module, a plug-and-play LoRA adapter that lets a model identify and prune useless context, addressing the unchecked accumulation of information in standard LLMs.
Findings
Achieves a 3.3% average improvement over top-tier reasoning baselines.
Establishes a new SOTA on IMOanswerBench with DeepSeek V3.2-Speciale.
Restores accuracy from 0% to 50% on long-horizon tasks.
Abstract
Reasoning models enhance problem-solving by scaling test-time compute, yet they face a critical paradox: excessive thinking tokens often degrade performance rather than improve it. We attribute this to a fundamental architectural flaw: standard LLMs operate as "malloc-only" engines, continuously accumulating valid and redundant steps alike without a mechanism to prune obsolete information. To break this cycle, we propose Free()LM, a model that introduces an intrinsic self-forgetting capability via the Free-Module, a plug-and-play LoRA adapter. By iteratively switching between reasoning and cleaning modes, Free()LM dynamically identifies and prunes useless context chunks, maintaining a compact and noise-free state. Extensive experiments show that Free()LM provides consistent improvements across all model scales (8B to 685B). It achieves a 3.3% average improvement over top-tier…
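The alternation between reasoning and cleaning modes described in the abstract can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `reason_and_free`, `generate_chunk`, `select_obsolete`, the fixed `clean_every` schedule, and the `<done>` stop marker are all hypothetical names and choices introduced here. In the paper the pruning decision comes from the Free-Module (a LoRA adapter); below it is replaced by a toy rule so the example runs without a model.

```python
from typing import Callable, List


def reason_and_free(
    chunks: List[str],
    generate_chunk: Callable[[List[str]], str],
    select_obsolete: Callable[[List[str]], List[int]],
    max_steps: int = 8,
    clean_every: int = 2,      # assumed schedule; the excerpt does not specify one
    stop_token: str = "<done>",  # hypothetical end-of-reasoning marker
) -> List[str]:
    """Alternate between a reasoning step and a cleaning step over a chunked context."""
    context = list(chunks)
    for step in range(1, max_steps + 1):
        # Reasoning mode: extend the context with one new chunk ("malloc").
        new_chunk = generate_chunk(context)
        context.append(new_chunk)
        if new_chunk == stop_token:
            break
        # Cleaning mode: drop the chunks flagged as obsolete ("free").
        if step % clean_every == 0:
            drop = set(select_obsolete(context))
            context = [c for i, c in enumerate(context) if i not in drop]
    return context


def make_toy_generator(script: List[str]) -> Callable[[List[str]], str]:
    """Toy stand-in for the reasoning model: replays a fixed script of chunks."""
    it = iter(script)
    return lambda context: next(it, "<done>")


def toy_select(context: List[str]) -> List[int]:
    """Toy stand-in for the Free-Module: flags chunks labeled as dead ends."""
    return [i for i, c in enumerate(context) if c.startswith("dead end")]


script = [
    "dead end: try x=2",
    "lemma: x is even",
    "dead end: try x=4",
    "answer: x=6",
]
final = reason_and_free(["problem"], make_toy_generator(script), toy_select)
print(final)
```

In this toy run the two "dead end" chunks are pruned during the cleaning passes, so the final context holds only the problem, the lemma, the answer, and the stop marker. A real system would replace both callables with model calls and would need its own policy for when to switch modes.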
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Software System Performance and Reliability
