Floating-Point Multiply-Add with Approximate Normalization for Low-Cost Matrix Engines
Kosmas Alexandridis, Christodoulos Peltekis, Dionysios Filippas, Giorgos Dimitrakopoulos

TL;DR
This paper introduces an approximate normalization technique for floating-point multiply-add units in matrix engines. For the Bfloat16 format it reduces area and power by roughly 16% and 13% on average, with an average accuracy loss of about 1% on transformer models.
Contribution
It presents an approximate normalization method that decreases the hardware area and power usage of floating-point multiply-add units in machine learning accelerators, at the cost of a small (about 1% on average) loss in model accuracy.
Findings
Roughly 16% reduction in area and 13% reduction in power consumption on average for Bfloat16 units
1% average accuracy loss in transformer models
Improved energy efficiency of the matrix engines used for machine learning acceleration
Abstract
The widespread adoption of machine learning algorithms necessitates hardware acceleration to ensure efficient performance. This acceleration relies on custom matrix engines that operate on full or reduced-precision floating-point arithmetic. However, conventional floating-point implementations can be power hungry. This paper proposes a method to improve the energy efficiency of the matrix engines used in machine learning algorithm acceleration. Our approach leverages approximate normalization within the floating-point multiply-add units as a means to reduce their hardware complexity, without sacrificing overall machine-learning model accuracy. Hardware synthesis results show that this technique reduces area and power consumption by roughly 16% and 13%, respectively, on average for the Bfloat16 format. Also, the error introduced in transformer model accuracy is 1% on average, for the most efficient…
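The abstract does not describe the normalization scheme itself, so the following is only an illustrative sketch of the general idea, not the authors' design. It contrasts exact normalization (shift the mantissa until its leading bit is set) with a hypothetical coarse variant that shifts only in multiples of a fixed step, which would need a simpler shifter but can leave leading zeros in the result. The function names, the 8-bit mantissa width (Bfloat16 has 7 stored mantissa bits plus a hidden bit), and the `step` parameter are all assumptions made for illustration.

```python
from typing import Tuple


def exact_normalize(mant: int, exp: int, width: int = 8) -> Tuple[int, int]:
    """Shift `mant` left until bit (width - 1) is set, adjusting `exp` to match.

    Assumes 0 <= mant < 2**width. The represented value mant * 2**exp is unchanged.
    """
    if mant == 0:
        return 0, exp
    shift = width - mant.bit_length()  # leading zeros within `width` bits
    return mant << shift, exp - shift


def coarse_normalize(mant: int, exp: int, width: int = 8, step: int = 2) -> Tuple[int, int]:
    """Hypothetical approximate normalization: shift only by multiples of `step`.

    Up to step - 1 leading zeros may remain, so later truncation to a fixed
    width can drop more low-order bits than with exact normalization.
    """
    if mant == 0:
        return 0, exp
    leading_zeros = width - mant.bit_length()
    shift = (leading_zeros // step) * step  # round the shift down to a multiple of step
    return mant << shift, exp - shift


# Example: an 8-bit intermediate mantissa with three leading zeros.
m, e = 0b00010110, 0
print(exact_normalize(m, e))           # (176, -3): fully normalized, 0b10110000
print(coarse_normalize(m, e, step=2))  # (88, -2): one leading zero remains, 0b01011000
```

Both results encode the same value (22); they differ only in how many leading zeros remain in the mantissa field, which is where a coarser shifter trades accuracy for hardware cost.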
Taxonomy
Topics: Real-time simulation and control systems
