CoDeQ: End-to-End Joint Model Compression with Dead-Zone Quantizer for High-Sparsity and Low-Precision Networks
Jonathan Wenshøj, Tong Chen, Bob Pepin, Raghavendra Selvan

TL;DR
CoDeQ is a fully differentiable, end-to-end method for joint model compression that combines pruning and quantization using a dead-zone quantizer, achieving high sparsity and low bit operations with minimal complexity.
Contribution
Introduces CoDeQ, an approach that learns sparsity and quantization parameters directly within the training loop, removing the auxiliary procedures that existing joint methods run outside training and leaving a single global hyperparameter to regularize sparsity.
Findings
Reduces bit operations to approximately 5% of the full-precision baseline on ImageNet with ResNet-18.
Maintains near full-precision accuracy with high sparsity.
Supports both fixed-precision and mixed-precision quantization.
Abstract
While joint pruning-quantization is theoretically superior to sequential application, current joint methods rely on auxiliary procedures outside the training loop for finding compression parameters. This reliance adds engineering complexity and hyperparameter tuning, while also lacking a direct data-driven gradient signal, which might result in sub-optimal compression. In this paper, we introduce CoDeQ, a simple, fully differentiable method for joint pruning-quantization. Our approach builds on a key observation: the dead-zone of a scalar quantizer is equivalent to magnitude pruning, and can be used to induce sparsity directly within the quantization operator. Concretely, we parameterize the dead-zone width and learn it via backpropagation, alongside the quantization parameters. This design provides explicit control of sparsity, regularized by a single global hyperparameter, while…
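The key observation in the abstract, that a quantizer's dead-zone acts as magnitude pruning, can be illustrated with a small forward-pass sketch. This is not the paper's formulation: the function names, the symmetric uniform grid, the clipping rule, and the parameters `step`, `dead_zone`, and `bits` are all assumptions made for illustration, and the gradient-based learning of the dead-zone width is not shown.

```python
import math


def dead_zone_quantize(weights, step, dead_zone, bits):
    """Illustrative symmetric uniform quantizer with an explicit dead-zone.

    Assumed (hypothetical) parameterization, not taken from the paper:
      step      -- spacing between quantization levels
      dead_zone -- full width of the interval around zero that maps to 0
      bits      -- bit width; levels are clipped to +/- (2**(bits-1) - 1)
    """
    max_level = 2 ** (bits - 1) - 1
    half_width = dead_zone / 2.0
    quantized = []
    for w in weights:
        if abs(w) <= half_width:
            # Inside the dead-zone: the weight is zeroed, i.e. pruned.
            quantized.append(0.0)
        else:
            # Outside the dead-zone: round to the nearest level, keeping
            # at least level 1 so only the dead-zone produces zeros.
            level = max(1, min(round(abs(w) / step), max_level))
            quantized.append(math.copysign(level * step, w))
    return quantized


def magnitude_prune_mask(weights, threshold):
    """True where a weight survives magnitude pruning at `threshold`."""
    return [abs(w) > threshold for w in weights]


weights = [-0.9, -0.12, 0.03, 0.3, 0.55, 1.7]
quantized = dead_zone_quantize(weights, step=0.25, dead_zone=0.4, bits=3)
kept = magnitude_prune_mask(weights, threshold=0.4 / 2.0)

# The zero pattern of the quantizer matches the pruning mask exactly.
assert [q != 0.0 for q in quantized] == kept
sparsity = quantized.count(0.0) / len(quantized)
```

In this sketch, widening `dead_zone` zeroes more weights and so raises sparsity, while `step` and `bits` control the precision of the weights that remain. That is the sense in which one operator can cover both pruning and quantization.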
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Data Compression Techniques · Advanced Image Processing Techniques
