Lossless Compression for LLM Tensor Incremental Snapshots
Daniel Waddington, Cornel Constantinescu

TL;DR
This paper introduces LMC, a lossless compression method tailored to LLM tensor checkpoints. It reduces both the volume of checkpoint data and the time spent compressing it, enabling faster and more frequent checkpoints during training.
Contribution
The paper presents an efficient lossless compression algorithm for LLM tensor data that outperforms existing methods in both speed and compression ratio, and demonstrates a practical implementation for high-throughput checkpointing.
Findings
LMC achieves better compression than BZ2.
LMC compresses an order of magnitude faster than BZ2.
Parallel implementation reaches 2.78 GiB/s compression throughput.
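To make the throughput figure concrete, the following is a minimal sketch of how chunk-parallel compression throughput might be measured in GiB/s. It is not LMC: it uses `zlib` from the Python standard library as a stand-in engine, and the chunk size, worker count, helper names, and test payload are all illustrative assumptions.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor

GIB = 1024 ** 3  # bytes per GiB


def split_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into independent fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_compress(data: bytes, chunk_size: int = 1 << 20, workers: int = 4) -> list[bytes]:
    """Compress each chunk independently on a thread pool.

    zlib is a stand-in engine here, not the paper's compressor. CPython's
    zlib releases the GIL while compressing, so threads can run in parallel.
    """
    chunks = split_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))


def parallel_decompress(blobs: list[bytes], workers: int = 4) -> bytes:
    """Decompress the chunks in order and join them back together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, blobs))


def throughput_gib_s(num_bytes: int, seconds: float) -> float:
    """Throughput in GiB/s, measured against the uncompressed size."""
    return num_bytes / GIB / seconds


# Illustrative 4 MiB payload; real checkpoint data will behave differently.
payload = bytes(range(256)) * 4096 * 4

start = time.perf_counter()
blobs = parallel_compress(payload)
elapsed = time.perf_counter() - start

# Lossless check: the round trip must reproduce the input exactly.
assert parallel_decompress(blobs) == payload

ratio = len(payload) / sum(len(b) for b in blobs)
print(f"ratio {ratio:.2f}x, {throughput_gib_s(len(payload), elapsed):.3f} GiB/s")
```

Compressing chunks independently costs some compression ratio, because no chunk can reference data in another, but it lets both compression and decompression scale across cores.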
Abstract
During the training of Large Language Models (LLMs), tensor data is periodically "checkpointed" to persistent storage to allow recovery of work done in the event of failure. The volume of data that must be copied during each checkpoint, even when using reduced-precision representations such as bfloat16, often reaches hundreds of gigabytes. Furthermore, the data must be moved across a network and written to a storage system before the next epoch occurs. With a view to ultimately building an optimized checkpointing solution, this paper presents experimental analysis of checkpoint data used to derive a design that maximizes the use of lossless compression to reduce the volume of data. We examine how tensor data and its compressibility evolve during model training and evaluate the efficacy of existing common off-the-shelf general purpose compression engines combined with known data…
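As a rough illustration of the evaluation the abstract describes, the sketch below compares a few off-the-shelf compressors on synthetic bfloat16 data. Everything in it is an assumption made for illustration: the weights are random Gaussian values rather than real checkpoint tensors, bfloat16 is approximated by truncating float32 to its upper 16 bits (real conversions typically round), and the set of engines is simply what the Python standard library provides.

```python
import bz2
import lzma
import random
import struct
import zlib


def to_bfloat16_bytes(values):
    """Encode floats as bfloat16 by keeping the top 2 bytes of a float32.

    In big-endian float32 those bytes hold the sign, the 8 exponent bits,
    and the top 7 mantissa bits. This truncates instead of rounding.
    """
    out = bytearray()
    for v in values:
        out += struct.pack(">f", v)[:2]
    return bytes(out)


def compression_ratios(data):
    """Return uncompressed/compressed size for each general-purpose engine."""
    engines = {
        "zlib": zlib.compress,
        "bz2": bz2.compress,
        "lzma": lzma.compress,
    }
    return {name: len(data) / len(fn(data)) for name, fn in engines.items()}


# Hypothetical stand-in for a weight tensor: small zero-centered values.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(100_000)]
tensor_bytes = to_bfloat16_bytes(weights)

for name, ratio in compression_ratios(tensor_bytes).items():
    print(f"{name}: {ratio:.2f}x")
```

Ratios measured on synthetic data say little about real checkpoints; the point is only the shape of the measurement, which can be repeated on tensors saved at successive training steps to see how compressibility changes over time.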
Taxonomy
Topics: Algorithms and Data Compression · Advanced Data Compression Techniques · Tensor decomposition and applications
