Quantization-Aware Regularizers for Deep Neural Networks Compression

Dario Malchiodi; Mattia Ferraretto; Marco Frasca

arXiv:2602.03614·cs.LG·February 4, 2026

Open Access

TL;DR

This paper introduces a regularization approach that encourages neural network weights to cluster during training. Quantization thereby becomes part of the learning process rather than a separate step, improving compression without significant accuracy loss.

Contribution

It presents a new regularization method that embeds quantization awareness into training, allowing weights to form clusters and quantization parameters to be learned during backpropagation.
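
One way to read this contribution, offered as a hedged sketch rather than the paper's actual formulation: the training objective adds a per-layer penalty to the task loss, and the quantization parameters are optimized jointly with the weights. The symbols below ($\lambda$ as a penalty weight, $R_\ell$ as the regularizer of layer $\ell$, $q_\ell$ as that layer's quantization parameters) are illustrative assumptions, not notation taken from the paper.

```latex
\min_{\{W_\ell\},\,\{q_\ell\}} \;
  \mathcal{L}_{\text{task}}\bigl(W_1,\dots,W_L\bigr)
  \;+\; \lambda \sum_{\ell=1}^{L} R_\ell\bigl(W_\ell;\, q_\ell\bigr)
```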

Findings

01. Effective on CIFAR-10 with AlexNet and VGG16

02. Reduces accuracy loss in quantization

03. Integrates quantization into training process

Abstract

Deep Neural Networks have reached state-of-the-art performance across numerous domains, but this progress has come at the cost of increasingly large and over-parameterized models, posing serious challenges for deployment on resource-constrained devices. As a result, model compression has become essential and, among compression techniques, weight quantization is widely used and particularly effective, yet it typically introduces a non-negligible accuracy drop. Moreover, it is usually applied to already trained models, without influencing how the parameter space is explored during the learning phase. In contrast, we introduce per-layer regularization terms that drive weights to naturally form clusters during training, integrating quantization awareness directly into the optimization process. This reduces the accuracy loss typically associated with quantization methods while preserving…
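
To make the idea of a cluster-promoting regularizer concrete, here is a minimal pure-Python sketch. It is not the paper's regularizer: the squared-distance-to-nearest-centroid penalty, the function names (`cluster_penalty`, `penalty_grads`, `train_step`, `quantize`), and the hyperparameters `lam` and `lr` are all assumptions chosen for illustration. It shows a penalty that pulls one layer's weights toward a small set of centroids, with the centroids themselves updated by gradient descent.

```python
# Hypothetical sketch of a cluster-promoting penalty for one layer's weights.
# The penalty form and all names are assumptions, not the paper's method.

def nearest(w, centroids):
    """Return the index of the centroid closest to the scalar weight w."""
    return min(range(len(centroids)), key=lambda k: (w - centroids[k]) ** 2)

def cluster_penalty(weights, centroids):
    """Mean squared distance from each weight to its nearest centroid."""
    total = sum((w - centroids[nearest(w, centroids)]) ** 2 for w in weights)
    return total / len(weights)

def penalty_grads(weights, centroids):
    """Gradients of cluster_penalty w.r.t. the weights and the centroids."""
    n = len(weights)
    g_w = [0.0] * n
    g_c = [0.0] * len(centroids)
    for i, w in enumerate(weights):
        k = nearest(w, centroids)
        d = 2.0 * (w - centroids[k]) / n
        g_w[i] = d      # pulls the weight toward its centroid
        g_c[k] -= d     # pulls the centroid toward its assigned weights
    return g_w, g_c

def train_step(weights, centroids, task_grads, lam=0.1, lr=0.1):
    """One gradient step on (task loss + lam * penalty); centroids are learned too."""
    g_w, g_c = penalty_grads(weights, centroids)
    new_w = [w - lr * (tg + lam * gw)
             for w, tg, gw in zip(weights, task_grads, g_w)]
    new_c = [c - lr * lam * gc for c, gc in zip(centroids, g_c)]
    return new_w, new_c

def quantize(weights, centroids):
    """Replace every weight with its nearest centroid."""
    return [centroids[nearest(w, centroids)] for w in weights]

if __name__ == "__main__":
    w, c = [0.9, 1.2, -1.1, -0.8], [-1.0, 1.0]
    for _ in range(100):
        # Task gradients are set to zero here to isolate the penalty's effect.
        w, c = train_step(w, c, task_grads=[0.0] * len(w))
    print(cluster_penalty(w, c), quantize(w, c))
```

In a real training loop the task gradients would come from backpropagation through the network, and the penalty would be summed over layers. The smaller the penalty at the end of training, the less the weights move when `quantize` snaps them to their centroids.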

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning