Robust Quantization: One Model to Rule Them All
Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

TL;DR
This paper introduces a robust quantization method that creates a single neural network model capable of adapting to various bit-widths and quantization policies, enhancing flexibility and application scope.
Contribution
We propose a theoretically motivated approach that yields a single, versatile model robust to different quantization schemes, unlike traditional methods dependent on specific configurations.
Findings
Effective across multiple ImageNet models
Supports various bit-widths and policies
Improves model robustness to quantization variations
Abstract
Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and on the precise way quantization is performed. Robust quantization offers an alternative approach with improved tolerance to different classes of data-types and quantization policies. It opens up new exciting applications where the quantization process is not static and can vary to meet different circumstances and implementations. To address this issue, we propose a method that provides intrinsic robustness to the model against a broad range of quantization processes. Our method is motivated by theoretical arguments and enables us to store a single generic model capable of operating at various bit-widths and quantization policies. We validate our method's effectiveness on different ImageNet models.
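To make the setting concrete, the following is a minimal sketch of what it means to run one stored set of weights under several bit-widths. It is not the method proposed in the paper: it only applies a generic symmetric uniform quantizer to the same synthetic weights and reports the resulting error. The function names, the Gaussian weights, and the use of the clipping threshold as a stand-in for a "quantization policy" are all assumptions made for illustration.

```python
import random


def quantize_uniform(values, num_bits, clip):
    """Symmetric uniform quantizer (illustrative, not the paper's method).

    Clamps each value to [-clip, clip] and rounds it to the nearest point
    on a grid with 2**(num_bits - 1) - 1 positive levels. Requires
    num_bits >= 2.
    """
    levels = 2 ** (num_bits - 1) - 1
    step = clip / levels
    quantized = []
    for v in values:
        clamped = max(-clip, min(clip, v))
        quantized.append(round(clamped / step) * step)
    return quantized


def mean_squared_error(original, approximated):
    """Average squared difference between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, approximated)) / len(original)


# Hypothetical stand-in for a layer's weights: 10,000 Gaussian samples.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# One possible policy (an assumption): clip at the largest magnitude.
clip = max(abs(w) for w in weights)

# Quantize the same stored weights at several bit-widths.
errors = {
    bits: mean_squared_error(weights, quantize_uniform(weights, bits, clip))
    for bits in (8, 4, 2)
}

for bits, err in errors.items():
    print(f"{bits}-bit quantization MSE: {err:.6f}")
```

The error grows as the bit-width shrinks, and it would also change with a different `clip` value. A model trained against one such configuration can be sensitive to the others, which is the dependence the abstract describes.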
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
