Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning

Stef Cuyckens; Xiaoling Yi; Nitish Satya Murthy; Chao Fang; Marian Verhelst

arXiv:2505.22404 · cs.AR · December 16, 2025

TL;DR

This paper introduces a hardware design that supports all six MX data types and square shared-exponent groups, reducing memory footprint by 51% and raising training throughput 4x for robotics learning at the edge.

Contribution

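The paper presents a precision-scalable arithmetic unit and support for square shared-exponent groups. Together they address two limitations of Dacapo, the prior state-of-the-art continuous learning processor: MXINT-only support and inefficient vector-based grouping during backpropagation.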

Findings

01. 51% lower memory footprint

02. 4x higher training throughput

03. Comparable energy efficiency

Abstract

Autonomous robots require efficient on-device learning to adapt to new environments without cloud dependency. For this edge training, Microscaling (MX) data types offer a promising solution by combining integer and floating-point representations with shared exponents, reducing energy consumption while maintaining accuracy. However, the state-of-the-art continuous learning processor, namely Dacapo, faces limitations with its MXINT-only support and inefficient vector-based grouping during backpropagation. In this paper, we present, to the best of our knowledge, the first work that addresses these limitations with two key innovations: (1) a precision-scalable arithmetic unit that supports all six MX data types by exploiting sub-word parallelism and unified integer and floating-point processing; and (2) support for square shared exponent groups to enable efficient weight handling during…
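To make the shared-exponent idea concrete, here is a minimal Python sketch, not the paper's hardware design. It assumes a simplified MXINT-like encoding: each group stores one power-of-two exponent plus signed 8-bit integers. The function names (`quantize_group`, `quantize_square`, `transpose`) and the 2x2 tile size are hypothetical, chosen only for readability. The sketch illustrates one plausible reading of why square groups help with weights: a square tile keeps the same elements, and therefore the same shared exponent, when the matrix is transposed.

```python
import math

def quantize_group(values, elem_bits=8):
    """Quantize a group to one shared power-of-two exponent plus signed integers.

    Simplified MXINT-like scheme (an assumption, not the exact MX encoding).
    """
    qmax = 2 ** (elem_bits - 1) - 1          # largest integer magnitude, 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    # Smallest exponent e such that max_abs / 2**e still fits within qmax.
    exp = 0 if max_abs == 0.0 else math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** exp
    # Clamp to guard against floating-point rounding at the boundary.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return exp, ints

def quantize_square(matrix, block=2, elem_bits=8):
    """Quantize a matrix with one shared exponent per block x block tile.

    Assumes both matrix dimensions are multiples of `block`.
    """
    rows, cols = len(matrix), len(matrix[0])
    exps = [[0] * (cols // block) for _ in range(rows // block)]
    ints = [[0] * cols for _ in range(rows)]
    for bi in range(rows // block):
        for bj in range(cols // block):
            tile = [matrix[bi * block + i][bj * block + j]
                    for i in range(block) for j in range(block)]
            exp, q = quantize_group(tile, elem_bits)
            exps[bi][bj] = exp
            for i in range(block):
                for j in range(block):
                    ints[bi * block + i][bj * block + j] = q[i * block + j]
    return exps, ints

def transpose(m):
    """Transpose a list-of-lists matrix."""
    return [list(row) for row in zip(*m)]

# Illustrative weights with very different magnitudes per tile.
W = [[0.5, -1.25, 3.0, 0.01],
     [2.0, 0.75, -0.2, 0.1],
     [10.0, -7.5, 0.001, 0.002],
     [0.3, 4.0, -0.004, 0.003]]

exps, ints = quantize_square(W)
exps_t, ints_t = quantize_square(transpose(W))

# Quantizing the transpose gives the transpose of the quantized matrix,
# so no regrouping or requantization is needed in this sketch.
print(exps_t == transpose(exps) and ints_t == transpose(ints))
```

In this sketch the quantized weights can be read in transposed order without recomputing any exponent. With one-dimensional vector groups, the groups of the transposed matrix would contain different elements, so their shared exponents would generally have to be recomputed.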

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications