Optimizing Deep Neural Networks using Safety-Guided Self Compression

Mohammad Zbeeb; Mariam Salman; Mohammad Bazzi; Ammar Mohanna

arXiv:2505.00350·cs.LG·May 2, 2025

TL;DR

This paper presents a safety-guided quantization framework that shrinks neural networks to 60% of their original size while improving test accuracy by up to 2.5%. It is validated on both a CNN and an attention-based language model.

Contribution

Introduces a novel safety-driven quantization method using preservation sets to optimize neural network models without accuracy loss.

Findings

01

Achieves up to 2.5% accuracy improvement over original models.

02

Reduces model size to 60% of the original while maintaining performance.

03

Enhances generalization and reduces variance compared to traditional quantization.

Abstract

The deployment of deep neural networks on resource-constrained devices necessitates effective model compression strategies that judiciously balance the reduction of model size with the preservation of performance. This study introduces a novel safety-driven quantization framework that leverages preservation sets to systematically prune and quantize neural network weights, thereby optimizing model complexity without compromising accuracy. The proposed methodology is rigorously evaluated on both a convolutional neural network (CNN) and an attention-based language model, demonstrating its applicability across diverse architectural paradigms. Experimental results reveal that our framework achieves up to a 2.5% enhancement in test accuracy relative to the original unquantized models while maintaining 60% of the initial model size. In comparison to conventional quantization techniques, our…
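The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of using a preservation set as a guard while lowering precision. It is not the authors' implementation. It assumes a toy linear classifier, a greedy per-group search, and uniform symmetric quantization; the names `quantize`, `accuracy`, and `compress`, and the `start_bits` and `tol` parameters, are all hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, binary label)


def quantize(weights: Sequence[float], bits: int) -> List[float]:
    """Uniform symmetric quantization; bits == 0 prunes the group to zeros."""
    if bits == 0:
        return [0.0] * len(weights)
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # largest magnitude on the signed integer grid
    if levels == 0:  # 1 bit: keep only the sign
        return [max_abs if w >= 0 else -max_abs for w in weights]
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


def accuracy(groups: Sequence[Sequence[float]], data: Sequence[Example]) -> float:
    """Accuracy of a toy linear classifier whose weights are split into groups."""
    flat = [w for g in groups for w in g]
    correct = 0
    for x, y in data:
        score = sum(w * xi for w, xi in zip(flat, x))
        correct += int((score > 0) == bool(y))
    return correct / len(data)


def compress(groups, preservation_set, start_bits=8, tol=0.0):
    """Greedily lower each group's bit-width while the preservation set stays safe.

    Assumption: "safe" means accuracy on the preservation set does not fall
    more than `tol` below the full-precision baseline.
    """
    baseline = accuracy(groups, preservation_set)
    bits = [start_bits] * len(groups)
    current = [quantize(g, start_bits) for g in groups]
    for i, g in enumerate(groups):
        for b in range(start_bits - 1, -1, -1):
            trial = current[:i] + [quantize(g, b)] + current[i + 1:]
            if accuracy(trial, preservation_set) >= baseline - tol:
                bits[i], current = b, trial  # accept the cheaper setting
            else:
                break  # first unsafe bit-width ends the search for this group
    return current, bits


if __name__ == "__main__":
    # Hypothetical data: two weight groups, the second barely matters.
    groups = [[0.9, -0.8], [0.01, -0.02]]
    preservation_set = [
        ((1.0, 0.0, 1.0, 1.0), 1),
        ((0.0, 1.0, 1.0, 0.0), 0),
        ((1.0, 1.0, 0.0, 0.0), 1),
        ((0.5, 1.0, 0.0, 1.0), 0),
    ]
    compressed, bits = compress(groups, preservation_set)
    stored = sum(b * len(g) for b, g in zip(bits, groups))
    total = 8 * sum(len(g) for g in groups)
    print("bits per group:", bits, "size ratio:", stored / total)
```

In this sketch the preservation set acts purely as an acceptance test: a group's precision is lowered only while every guarded prediction remains correct, and a group that never affects those predictions is pruned outright.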

Code & Models

Repositories

Moe-Zbeeb/Optimizing-Deep-Neural-Networks-via-Safety-Guided-Self-Compression
PyTorch · Official
