Error Diffusion: Post Training Quantization with Block-Scaled Number Formats for Neural Networks
Alireza Khodamoradi, Kristof Denolf, Eric Dellinger

TL;DR
This paper introduces error diffusion, a post-training quantization method that supports block-scaled number formats, improving neural network deployment by reducing hardware costs while maintaining output quality.
Contribution
It presents a hyperparameter-free error diffusion approach for post-training quantization with block-scaled formats that does not rely on backpropagation or Hessian information.
Findings
Consistently improves quantization performance across various architectures.
Demonstrates robustness of block-scaled formats in neural network quantization.
Provides TensorCast, an open-source library for emulating number formats in PyTorch.
Abstract
Quantization reduces a model's hardware costs, such as data movement, storage, and operations like multiplication and addition. It also changes the model's behavior by degrading output quality. Methods are therefore needed that preserve the model's behavior when its parameters are quantized. More exotic numerical encodings, such as block-scaled number formats, have shown advantages in using a fixed bit budget to encode model parameters. This paper presents error diffusion (ED), a hyperparameter-free method for post-training quantization with support for block-scaled data formats. Our approach does not rely on backpropagation or Hessian information. We describe how to improve the quantization process by viewing the neural model as a composite function and diffusing the quantization error in every layer. In addition, we introduce TensorCast, an open-source library based on…
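To make the idea of a block-scaled number format concrete, the following is a minimal NumPy sketch of the general technique, not the paper's ED method and not the TensorCast API. It assumes a simple format in which each block of values shares one power-of-two scale and each element is stored as a signed integer; the function name `quantize_block_scaled` and the parameters `block_size` and `elem_bits` are hypothetical and chosen only for illustration.

```python
import numpy as np

def quantize_block_scaled(x, block_size=32, elem_bits=8):
    """Fake-quantize a 1-D array; each block shares one power-of-two scale.

    Illustrative assumption: signed integer elements with a symmetric range.
    Returns the dequantized values so the error can be inspected directly.
    """
    qmax = 2 ** (elem_bits - 1) - 1            # largest integer magnitude
    n = x.size
    pad = (-n) % block_size                     # zero-pad to a whole number of blocks
    blocks = np.pad(x.astype(np.float64), (0, pad)).reshape(-1, block_size)

    # Smallest power-of-two scale for which the block maximum fits in range.
    max_abs = np.abs(blocks).max(axis=1, keepdims=True)
    exponent = np.ceil(np.log2(np.maximum(max_abs, 1e-30) / qmax))
    scale = 2.0 ** exponent

    q = np.clip(np.rint(blocks / scale), -qmax, qmax)   # integer codes
    return (q * scale).reshape(-1)[:n]                   # dequantize, drop padding

# Weights with a few large outliers, a case where one shared scale is wasteful.
rng = np.random.default_rng(0)
w = rng.standard_normal(1024)
w[::128] *= 50.0

# One scale for the whole tensor versus one scale per block of 32 values.
w_tensor = quantize_block_scaled(w, block_size=w.size, elem_bits=4)
w_block = quantize_block_scaled(w, block_size=32, elem_bits=4)

err_tensor = np.mean((w - w_tensor) ** 2)
err_block = np.mean((w - w_block) ** 2)
print(f"per-tensor scale MSE: {err_tensor:.6f}")
print(f"per-block scale MSE:  {err_block:.6f}")
```

Because every scale here is a power of two, a per-block scale is never coarser than the per-tensor scale, so the per-block error cannot exceed the per-tensor error for the same element width. Real block-scaled formats differ in element type, block size, and scale encoding.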
Taxonomy
Topics: Neural Networks and Applications · Image Retrieval and Classification Techniques · Computational Physics and Python Applications
Methods: Lib · Diffusion
