ZipCCL: Efficient Lossless Data Compression of Communication Collectives for Accelerating LLM Training
Wenxiang Lin, Xinglin Pan, Ruibo Fan, Shaohuai Shi, Xiaowen Chu

TL;DR
ZipCCL introduces a lossless communication compression library for LLM training, leveraging Gaussian data distribution and GPU optimizations to significantly reduce communication time and accelerate training.
Contribution
The paper presents ZipCCL, a novel lossless compression method with Gaussian-aware encoding, GPU-optimized kernels, and adaptive strategies for efficient LLM training communication.
Findings
Speeds up communication by up to 1.35× in LLM training.
Achieves end-to-end training speedups of up to 1.18×.
Maintains model quality while accelerating training.
Abstract
Communication has emerged as a critical bottleneck in the distributed training of large language models (LLMs). While numerous approaches have been proposed to reduce communication overhead, lossless compression has remained largely underexplored, because compression and decompression typically cost more time than the reduced communication traffic saves. We observe that the data communicated during training, including activations, gradients, and parameters, often follows a near-Gaussian distribution, a key property for data compression. Thus, we introduce ZipCCL, a lossless compressed communication library of collectives for LLM training. ZipCCL is equipped with our novel techniques: (1) theoretically grounded exponent coding that exploits the Gaussian distribution of LLM tensors to accelerate compression without expensive online statistics, (2)…
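The observation that near-Gaussian tensors are compressible can be checked with a small experiment. The sketch below is not ZipCCL's exponent coding; it only draws zero-mean Gaussian samples, extracts the 8-bit exponent field of each value as an IEEE 754 float32, and measures the Shannon entropy of those exponents. The function names, the standard deviation of 0.02, and the sample count are illustrative assumptions, not values from the paper.

```python
import math
import random
import struct
from collections import Counter


def float32_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x stored as IEEE 754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


def shannon_entropy_bits(symbols) -> float:
    """Empirical Shannon entropy, in bits per symbol, of a sequence of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def exponent_entropy(n: int = 100_000, sigma: float = 0.02, seed: int = 0) -> float:
    """Entropy of the float32 exponent field for n samples of N(0, sigma^2).

    sigma and n are hypothetical stand-ins for a real tensor's statistics.
    """
    rng = random.Random(seed)
    exponents = [float32_exponent(rng.gauss(0.0, sigma)) for _ in range(n)]
    return shannon_entropy_bits(exponents)


if __name__ == "__main__":
    h = exponent_entropy()
    print(f"exponent entropy: {h:.2f} bits (field width: 8 bits)")
```

Under these assumptions the measured entropy comes out well under the 8 bits the exponent field occupies, because a Gaussian concentrates most magnitudes in a handful of adjacent binades. That gap is the headroom an entropy coder on the exponent can exploit losslessly; the mantissa bits, by contrast, are close to uniformly distributed and offer little.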
