The AetherFloat Family: Block-Scale-Free Quad-Radix Floating-Point Architectures for AI Accelerators

Keita Morisaki

arXiv:2603.08741 · cs.AR · March 11, 2026

TL;DR

The paper introduces AetherFloat, a novel hardware-efficient floating-point format for AI accelerators that reduces silicon area, power, and delay while improving dynamic range and eliminating the need for block-scaling logic.

Contribution

The paper presents a parameterizable architecture that combines base-4 (quad-radix) exponent scaling with an explicit mantissa, enabling block-scale-free inference and improved hardware efficiency.
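As a rough illustration of why base-4 scaling widens dynamic range, the sketch below compares how many powers of two an exponent field spans when each exponent step multiplies by 2 versus by 4. The function name, the 4-bit exponent width, and the simplification of ignoring bias, reserved codes, and subnormals are all assumptions made for illustration; they are not taken from the AetherFloat specification.

```python
import math

def exponent_span_in_binades(exp_bits: int, radix: int) -> float:
    """Log2 of the ratio between the largest and smallest exponent scale factors.

    Hypothetical helper: treats every exponent code as usable and ignores
    bias, reserved encodings, and subnormals.
    """
    steps = (1 << exp_bits) - 1          # distance between lowest and highest code
    return steps * math.log2(radix)      # each step multiplies the value by `radix`

# Illustrative 4-bit exponent field (an assumption, not the paper's parameters).
base2_span = exponent_span_in_binades(4, 2)
base4_span = exponent_span_in_binades(4, 4)

print(f"base-2 span: {base2_span} binades")
print(f"base-4 span: {base4_span} binades")
```

Under these assumptions the same exponent width covers twice as many binades in base 4. The trade-off common to higher-radix formats is coarser normalization: the mantissa can no longer be assumed to start with a 1, which is consistent with the explicit mantissa mentioned above.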

Findings

01. 33.17% area reduction in MAC units
02. 21.99% total power savings
03. 11.73% critical path delay reduction

Abstract

The IEEE 754 floating-point standard is the bedrock of modern computing, but its structural requirements -- a hidden leading bit, Base-2 bit-level normalization, and Sign-Magnitude encoding -- impose significant silicon area and power overhead in massively parallel Neural Processing Units (NPUs). Furthermore, the industry's recent shift to 8-bit formats (e.g., FP8 E4M3, OCP MX formats) has introduced a new hardware penalty: the strict necessity of Block-Scaling (AMAX) logic to prevent out-of-bound Large Language Model (LLM) activations from overflowing and degrading accuracy. The AetherFloat Family is a parameterizable architectural replacement designed from first principles for Hardware/Software Co-Design in AI acceleration. By synthesizing Lexicographic One's Complement Unpacking, Quad-Radix (Base-4) Scaling, and an Explicit Mantissa, AetherFloat achieves zero-cycle native integer…
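For readers unfamiliar with the block-scaling (AMAX) step that the abstract describes as a hardware penalty of 8-bit formats, here is a minimal sketch of the idea: find the largest magnitude in a block and rescale so it fits within the FP8 E4M3 finite range (maximum 448). The function names and the sample block are hypothetical, the scale is a plain float rather than a hardware-friendly shared exponent, and no rounding to FP8 is performed. This is not the paper's method; it only illustrates the logic AetherFloat aims to make unnecessary.

```python
E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def amax_scale(block):
    """Return a per-block scale so that max(|x|) / scale equals E4M3_MAX."""
    amax = max(abs(x) for x in block)
    return amax / E4M3_MAX if amax > 0 else 1.0

def scale_block(block):
    """Divide every element by the block's AMAX-derived scale (no FP8 rounding)."""
    s = amax_scale(block)
    return [x / s for x in block], s

# Hypothetical activation block with one outlier beyond the E4M3 range.
block = [0.5, -3.0, 896.0, 12.0]
scaled, s = scale_block(block)

print("scale:", s)
print("scaled block:", scaled)
```

The reduction over the block (`max`) and the extra per-block scale that must be stored and applied are the kind of overhead that block-scaled formats carry in hardware.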

Taxonomy

Topics: Numerical Methods and Algorithms · Low-power high-performance VLSI design · Advanced Neural Network Applications