The AetherFloat Family: Block-Scale-Free Quad-Radix Floating-Point Architectures for AI Accelerators
Keita Morisaki

TL;DR
The paper introduces AetherFloat, a novel hardware-efficient floating-point format for AI accelerators that reduces silicon area, power, and delay while improving dynamic range and eliminating the need for block-scaling logic.
Contribution
It presents a parameterizable architecture that combines Base-4 (quad-radix) exponent scaling with an explicit mantissa, enabling block-scale-free inference and improved hardware efficiency.
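To make the two ideas concrete, here is a minimal sketch of how a base-4 exponent with an explicit mantissa could be decoded, and how the exponent radix affects dynamic range. The field widths, the bias, and the function names `decode_quad_radix` and `exponent_span_log2` are illustrative assumptions; the summary does not give AetherFloat's actual bit layout, so this should not be read as the paper's encoding.

```python
import math

def decode_quad_radix(sign, exponent, mantissa, mantissa_bits=3, bias=7):
    """Decode a HYPOTHETICAL base-4 float with an explicit mantissa.

    Assumed value: (-1)^sign * (mantissa / 2^mantissa_bits) * 4^(exponent - bias).
    The mantissa is used as stored: no hidden leading 1 is prepended.
    """
    fraction = mantissa / (1 << mantissa_bits)
    return (-1.0) ** sign * fraction * 4.0 ** (exponent - bias)

def exponent_span_log2(exponent_bits, radix):
    """Width of the exponent's scaling range, measured in powers of two."""
    steps = (1 << exponent_bits) - 1          # distance from lowest to highest exponent code
    return steps * math.log2(radix)

# With the same 4 exponent bits, a base-4 exponent spans twice as many
# binary orders of magnitude as a base-2 exponent.
base2_span = exponent_span_log2(4, 2)
base4_span = exponent_span_log2(4, 4)
```

Under these assumptions each exponent step multiplies the value by 4 rather than 2, which is where the extra range comes from; the trade-off is coarser normalization, since the mantissa may carry a leading zero bit.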
Findings
33.17% area reduction in MAC units
21.99% total power savings
11.73% critical path delay reduction
Abstract
The IEEE 754 floating-point standard is the bedrock of modern computing, but its structural requirements -- a hidden leading bit, Base-2 bit-level normalization, and Sign-Magnitude encoding -- impose significant silicon area and power overhead in massively parallel Neural Processing Units (NPUs). Furthermore, the industry's recent shift to 8-bit formats (e.g., FP8 E4M3, OCP MX formats) has introduced a new hardware penalty: the strict necessity of Block-Scaling (AMAX) logic to prevent out-of-bound Large Language Model (LLM) activations from overflowing and degrading accuracy. The AetherFloat Family is a parameterizable architectural replacement designed from first principles for Hardware/Software Co-Design in AI acceleration. By synthesizing Lexicographic One's Complement Unpacking, Quad-Radix (Base-4) Scaling, and an Explicit Mantissa, AetherFloat achieves zero-cycle native integer…
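For context, the block-scaling (AMAX) step the abstract refers to can be sketched as follows. This is a simplified illustration, not the paper's logic or any vendor's implementation: the helper name `block_scale` is hypothetical, the scale is a plain real number (OCP MX formats use a shared power-of-two scale instead), and the rounding to 8 bits that would follow is omitted. The constant 448 is the largest finite magnitude in FP8 E4M3.

```python
E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def block_scale(block, fmt_max=E4M3_MAX):
    """Scale a block of values so its absolute maximum (amax) fits the format.

    Returns the scaled values and the scale needed to undo the mapping.
    Hypothetical helper for illustration only; quantization itself is omitted.
    """
    amax = max(abs(x) for x in block)
    scale = amax / fmt_max if amax > 0 else 1.0   # avoid dividing by zero on an all-zero block
    scaled = [x / scale for x in block]
    return scaled, scale

# An activation outlier of 1000.0 would overflow E4M3 without scaling.
activations = [1000.0, -2.0, 0.5]
scaled, scale = block_scale(activations)
restored = [v * scale for v in scaled]
```

The amax reduction and the per-block multiply are the extra work that a format with enough native dynamic range would not need, which is the overhead the abstract describes.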
Taxonomy
Topics: Numerical Methods and Algorithms · Low-power high-performance VLSI design · Advanced Neural Network Applications
