Adacc: An Adaptive Framework Unifying Compression and Activation Recomputation for LLM Training

Ping Chen; Zhuohong Deng; Ping Li; Shuibing He; Hongzi Zhu; Yi Zheng; Zhefeng Wang; Baoxing Huai; Minyi Guo

arXiv:2508.00806 · cs.LG · August 11, 2025

TL;DR

Adacc is an adaptive framework that unifies activation recomputation and data compression, dynamically optimizing memory usage during LLM training to enhance efficiency without sacrificing accuracy.

Contribution

Adacc introduces a fine-grained, tensor-level adaptive strategy that chooses among recomputation, retention, and compression for each tensor, combined with global scheduling and dynamic policy updates.

Findings

01. Improves training throughput by up to 1.37x

02. Maintains model accuracy comparable to the baseline

03. Effectively balances memory savings and computational overhead

Abstract

Training large language models (LLMs) is often constrained by GPU memory limitations. To alleviate memory pressure, activation recomputation and data compression have been proposed as two major strategies. However, both approaches have limitations: recomputation introduces significant training overhead, while compression can lead to accuracy degradation and computational inefficiency when applied naively. In this paper, we propose Adacc, the first adaptive memory optimization framework that unifies activation recomputation and data compression to improve training efficiency for LLMs while preserving model accuracy. Unlike existing methods that apply static, rule-based strategies or rely solely on one technique, Adacc makes fine-grained, tensor-level decisions, dynamically selecting between recomputation, retention, and compression based on tensor characteristics and runtime hardware…
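To make the idea of a per-tensor choice among retention, recomputation, and compression concrete, here is a minimal Python sketch. It is not Adacc's algorithm: the excerpt above does not describe the actual decision procedure, so a simple greedy heuristic stands in for it. Every name here (TensorInfo, plan, compress_safe, and the cost fields) is hypothetical, and the numbers are invented for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETAIN = "retain"
    RECOMPUTE = "recompute"
    COMPRESS = "compress"


@dataclass(frozen=True)
class TensorInfo:
    # All fields are hypothetical profiling inputs, not taken from the paper.
    name: str
    size_mb: float         # memory held if the activation is retained
    recompute_ms: float    # extra time to recompute it during backward
    compress_ms: float     # extra time to compress and later decompress it
    compress_ratio: float  # compressed size / original size, in (0, 1)
    compress_safe: bool    # assumed flag: compression judged accuracy-safe


def plan(tensors, budget_mb):
    """Greedy stand-in policy: retain everything, then apply the options that
    free the most memory per millisecond of overhead until the budget fits."""
    actions = {t.name: Action.RETAIN for t in tensors}
    used = sum(t.size_mb for t in tensors)

    candidates = []
    for t in tensors:
        # Recomputation frees the whole tensor; compression frees a fraction.
        options = [(t.size_mb, t.recompute_ms, Action.RECOMPUTE)]
        if t.compress_safe:
            saved = t.size_mb * (1.0 - t.compress_ratio)
            options.append((saved, t.compress_ms, Action.COMPRESS))
        saved, cost, action = max(options, key=lambda o: o[0] / max(o[1], 1e-9))
        candidates.append((saved / max(cost, 1e-9), saved, t.name, action))

    # Highest memory-saved-per-millisecond first.
    candidates.sort(key=lambda c: c[0], reverse=True)
    for _, saved, name, action in candidates:
        if used <= budget_mb:
            break
        actions[name] = action
        used -= saved
    return actions, used


if __name__ == "__main__":
    demo = [
        TensorInfo("attn_scores", 400.0, 2.0, 8.0, 0.25, True),
        TensorInfo("mlp_out", 300.0, 30.0, 3.0, 0.5, True),
        TensorInfo("layernorm_in", 100.0, 1.0, 1.0, 0.5, False),
    ]
    chosen, used_mb = plan(demo, budget_mb=350.0)
    for name, action in chosen.items():
        print(name, action.value)
    print("memory used (MB):", used_mb)
```

The sketch only illustrates the trade-off the abstract names: recomputation frees a whole activation at a time cost, compression frees part of it at a different cost and only where accuracy allows, and retention costs memory but no time. A real system would also need runtime hardware measurements and policy updates during training, which this toy omits.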

Taxonomy

Topics: Natural Language Processing Techniques · Intelligent Tutoring Systems and Adaptive Learning