TStore: Rethinking AI Model Hub with Tensor-Centric Compression
Tingfeng Lan, Zirui Wang, Yunjia Zheng, Zhaoyuan Su, Juncheng Yang, and Yue Cheng

TL;DR
TStore is a tensor-centric system that reduces AI model hub storage by detecting redundancy across models through tensor-level fingerprinting and clustering, then deduplicating and compressing at tensor granularity without sacrificing model usability or performance.
Contribution
The paper presents TStore, a tensor-centric approach to fine-grained deduplication and compression in AI model hubs that identifies redundancy across models without requiring annotations.
Findings
Achieves substantial storage savings on real-world model repositories with minimal overhead.
Preserves model usability and performance after compression.
Detects redundancy across models through tensor-level fingerprinting and clustering, without requiring annotations.
Abstract
Modern AI models are growing rapidly in size and redundancy, leading to significant storage and distribution challenges in model hubs. We present TStore, a tensor-centric system for reducing storage overhead through fine-grained deduplication and compression. TStore leverages tensor-level fingerprinting and clustering to identify redundancy across models without requiring annotations. Our design enables efficient storage reduction while preserving model usability and performance. Experiments on real-world model repositories demonstrate substantial storage savings with minimal overhead.
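To make the idea of tensor-level fingerprinting concrete, here is a minimal sketch of exact-match deduplication across models. It is not TStore's actual algorithm, which the abstract does not specify: the SHA-256 content hash, the function names (fingerprint, deduplicate, stored_bytes), and the toy "base" and "finetuned" models are all assumptions made for illustration. The sketch covers only byte-identical tensors and leaves out the clustering and compression stages.

```python
import hashlib

import numpy as np


def fingerprint(tensor: np.ndarray) -> str:
    """Hash dtype, shape, and raw bytes so identical tensors share one fingerprint.

    Assumption: a SHA-256 content hash stands in for whatever fingerprint TStore uses.
    """
    h = hashlib.sha256()
    h.update(str(tensor.dtype).encode())
    h.update(str(tensor.shape).encode())
    h.update(np.ascontiguousarray(tensor).tobytes())
    return h.hexdigest()


def deduplicate(models):
    """Store each unique tensor once.

    models: {model_name: {tensor_name: ndarray}}
    Returns (store, manifests) where store maps fingerprint -> tensor and
    manifests maps model_name -> {tensor_name: fingerprint}.
    """
    store = {}
    manifests = {}
    for model_name, tensors in models.items():
        manifest = {}
        for tensor_name, tensor in tensors.items():
            fp = fingerprint(tensor)
            store.setdefault(fp, tensor)  # keep only the first copy
            manifest[tensor_name] = fp
        manifests[model_name] = manifest
    return store, manifests


def stored_bytes(store) -> int:
    """Total payload size of the unique tensors."""
    return sum(t.nbytes for t in store.values())


# Hypothetical example: a fine-tuned model that reuses the base embedding unchanged.
rng = np.random.default_rng(0)
embed = rng.standard_normal((8, 4)).astype(np.float32)
models = {
    "base": {
        "embed": embed,
        "head": rng.standard_normal((4, 2)).astype(np.float32),
    },
    "finetuned": {
        "embed": embed.copy(),  # byte-identical to the base embedding
        "head": rng.standard_normal((4, 2)).astype(np.float32),
    },
}

store, manifests = deduplicate(models)
raw_bytes = sum(t.nbytes for m in models.values() for t in m.values())
print(f"raw: {raw_bytes} B, deduplicated: {stored_bytes(store)} B")
```

In this toy case the shared embedding is stored once, so four tensors collapse to three unique entries. Each model can be rebuilt from its manifest by looking up fingerprints in the store. Near-duplicate tensors, such as lightly fine-tuned weights, would not match under an exact hash, which is presumably where clustering and compression come in.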
