ZipLLM: Efficient LLM Storage via Model-Aware Synergistic Data Deduplication and Compression

Zirui Wang; Tingfeng Lan; Zhaoyuan Su; Juncheng Yang; Yue Cheng

arXiv:2505.06252 · cs.DB · November 11, 2025

TL;DR

ZipLLM combines model-aware deduplication with delta compression, reducing large language model storage by 54% and outperforming existing methods by 20%.

Contribution

The paper introduces ZipLLM, a novel storage pipeline that unifies tensor-level deduplication with a new delta compression algorithm, leveraging LLM family clustering.
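As a rough illustration of what tensor-level deduplication means, the sketch below stores each distinct tensor payload once, keyed by a content hash, and keeps a per-model manifest that maps tensor names to hashes. This is a minimal sketch under stated assumptions, not ZipLLM's implementation: the function names `dedup_tensors` and `restore`, the in-memory dictionaries, and the choice of SHA-256 are all hypothetical.

```python
import hashlib


def dedup_tensors(models):
    """Deduplicate tensors across models by content hash.

    `models` maps a model name to a dict of tensor name -> raw bytes.
    Returns (store, manifests): `store` holds each unique payload once,
    `manifests` records which payload every tensor of every model uses.
    All names here are hypothetical, chosen for illustration only.
    """
    store = {}      # digest -> unique tensor payload
    manifests = {}  # model name -> {tensor name: digest}
    for model_name, tensors in models.items():
        manifest = {}
        for tensor_name, payload in tensors.items():
            digest = hashlib.sha256(payload).hexdigest()
            # Keep the payload only the first time this content is seen.
            store.setdefault(digest, payload)
            manifest[tensor_name] = digest
        manifests[model_name] = manifest
    return store, manifests


def restore(store, manifest):
    """Rebuild one model's tensors from the shared store."""
    return {name: store[digest] for name, digest in manifest.items()}
```

Metadata in this scheme is one hash per tensor rather than one per fixed-size chunk, which is one plausible reading of why a tensor-granularity index stays small.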

Findings

1. Reduces storage by 54%, surpassing state-of-the-art methods.

2. Identifies structured, sparse parameter differences among fine-tuned models in the same family, which make them suitable for delta compression.

3. Shows that tensor-level deduplication achieves high data reduction with low metadata overhead.

Abstract

Modern model hubs, such as Hugging Face, store tens of petabytes of LLMs, with fine-tuned variants vastly outnumbering base models and dominating storage consumption. Existing storage reduction techniques -- such as deduplication and compression -- are either LLM-oblivious or not compatible with each other, limiting data reduction effectiveness. Our large-scale characterization study across all publicly available Hugging Face LLM repositories reveals several key insights: (1) fine-tuned models within the same family exhibit highly structured, sparse parameter differences suitable for delta compression; (2) bitwise similarity enables LLM family clustering; and (3) tensor-level deduplication is better aligned with model storage workloads, achieving high data reduction with low metadata overhead. Building on these insights, we design BitX, an effective, fast, lossless delta compression…
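To make the first insight concrete, here is a minimal sketch of lossless delta compression between a base model and a fine-tuned variant. It assumes a bitwise XOR delta followed by a general-purpose compressor (zlib is a stand-in). The abstract is truncated before it describes BitX, so this should not be read as the paper's algorithm; the function names and the synthetic weights are hypothetical.

```python
import random
import struct
import zlib


def xor_delta_compress(base: bytes, tuned: bytes) -> bytes:
    """Compress `tuned` as a zlib-compressed XOR delta against `base`."""
    if len(base) != len(tuned):
        raise ValueError("base and tuned tensors must have the same size")
    # Unchanged bytes XOR to zero, so a sparse difference yields long zero runs.
    delta = bytes(a ^ b for a, b in zip(base, tuned))
    return zlib.compress(delta, 9)


def xor_delta_decompress(base: bytes, blob: bytes) -> bytes:
    """Recover the fine-tuned bytes exactly from the base and the delta blob."""
    delta = zlib.decompress(blob)
    return bytes(a ^ b for a, b in zip(base, delta))


def make_demo_pair(seed: int = 0, n: int = 4096, changed: int = 40):
    """Build synthetic float32 'weights' where only a few entries differ."""
    rng = random.Random(seed)
    base_w = [rng.gauss(0.0, 0.02) for _ in range(n)]
    tuned_w = list(base_w)
    for i in rng.sample(range(n), changed):
        tuned_w[i] += 1e-3  # small perturbation standing in for fine-tuning
    pack = lambda ws: struct.pack(f"<{len(ws)}f", *ws)
    return pack(base_w), pack(tuned_w)


if __name__ == "__main__":
    base, tuned = make_demo_pair()
    blob = xor_delta_compress(base, tuned)
    assert xor_delta_decompress(base, blob) == tuned  # lossless round trip
    print(len(tuned), len(zlib.compress(tuned, 9)), len(blob))
```

The synthetic pair changes only a small fraction of weights, so the XOR delta is mostly zero bytes and compresses far better than the raw fine-tuned tensor. Real fine-tuned models may show a different difference structure than this toy data.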

Taxonomy

Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Scientific Computing and Data Management