An Efficient Compression of Deep Neural Network Checkpoints Based on Prediction and Context Modeling

Yuriy Kim; Evgeny Belyaev

arXiv:2506.12000·cs.LG·June 16, 2025


TL;DR

This paper introduces a checkpoint compression method for deep neural networks that combines prediction from the previously saved checkpoint, pruning, and quantization to reduce storage size while preserving model performance.

Contribution

A compression technique for neural network checkpoints (weights and optimizer states) that uses values from the previously saved checkpoint as context for arithmetic coding, combined with pruning and quantization of the checkpoint values.

Findings

01  Achieves substantial bit size reduction.

02  Enables near-lossless recovery of training states.

03  Preserves model performance after compression.

Abstract

This paper addresses the efficient compression of weights and optimizer states (called checkpoints) saved at different stages of neural network training. First, we propose a prediction-based compression approach in which values from the previously saved checkpoint are used for context modeling in arithmetic coding. Second, to further improve compression performance, we propose applying pruning and quantization to the checkpoint values. Experimental results show that our approach achieves a substantial bit size reduction while enabling near-lossless training recovery from restored checkpoints, preserving the model's performance and making the approach suitable for storage-limited environments.
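
The idea of using the previous checkpoint as coding context can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the synthetic data, the threshold and step values, and the helper names `prune_and_quantize`, `entropy_bits`, and `conditional_entropy_bits` are all assumptions made for illustration. Instead of running a real arithmetic coder, it compares empirical entropies, which approximate the bits per symbol an ideal arithmetic coder would spend without and with the previous checkpoint's symbol as context.

```python
import numpy as np

def prune_and_quantize(values, prune_threshold, step):
    """Zero out small magnitudes, then uniformly quantize to integer symbols."""
    pruned = np.where(np.abs(values) < prune_threshold, 0.0, values)
    return np.round(pruned / step).astype(np.int64)

def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def conditional_entropy_bits(symbols, contexts):
    """Empirical H(symbols | contexts) in bits per symbol."""
    total = 0.0
    for c in np.unique(contexts):
        mask = contexts == c
        # Weight each context's entropy by how often that context occurs.
        total += mask.sum() / symbols.size * entropy_bits(symbols[mask])
    return total

# Synthetic stand-ins for two consecutive checkpoints (assumed data).
rng = np.random.default_rng(0)
prev = rng.normal(0.0, 0.05, size=50_000)
curr = prev + rng.normal(0.0, 0.005, size=prev.size)  # small training update

# Hypothetical pruning threshold and quantization step.
q_prev = prune_and_quantize(prev, prune_threshold=0.01, step=0.01)
q_curr = prune_and_quantize(curr, prune_threshold=0.01, step=0.01)

h_plain = entropy_bits(q_curr)                    # no context
h_ctx = conditional_entropy_bits(q_curr, q_prev)  # context = previous symbol

print(f"bits/symbol without context: {h_plain:.3f}")
print(f"bits/symbol with previous-checkpoint context: {h_ctx:.3f}")
```

Because consecutive checkpoints are strongly correlated in this toy setup, the conditional entropy is lower than the unconditional one. A practical adaptive coder would only approach these figures, since it must also learn the per-context probabilities as it codes.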


Taxonomy

Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Medical Imaging and Analysis