da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs

Chang Sun; Zhiqiang Que; Vladimir Loncar; Wayne Luk; Maria Spiropulu

arXiv:2507.04535·cs.AR·April 27, 2026

TL;DR

This paper introduces a distributed arithmetic algorithm for constant matrix-vector multiplication in FPGA-based neural network inference. It reduces both resource usage and latency, which makes real-time deployment of larger networks practical.

Contribution

The authors present an efficient, open-source distributed arithmetic algorithm for constant matrix-vector multiplication (CMVM) on FPGAs. The algorithm optimizes for both area and latency and is integrated into the hls4ml library.

Findings

01. Resource reduction of up to one-third for quantized neural networks.

02. Simultaneous reduction in latency and area consumption.

03. Enabling implementation of neural networks previously infeasible on FPGAs.

Abstract

Neural networks with a latency requirement on the order of microseconds, like the ones used at the CERN Large Hadron Collider, are typically deployed on FPGAs fully unrolled and pipelined. A bottleneck for the deployment of such neural networks is area utilization, which is directly related to the required constant matrix-vector multiplication (CMVM) operations. In this work, we propose an efficient algorithm for implementing CMVM operations with distributed arithmetic on FPGAs that simultaneously optimizes for area consumption and latency. The algorithm achieves resource reduction similar to state-of-the-art algorithms while being significantly faster to compute. The proposed algorithm is open-sourced and integrated into the hls4ml library, a free and open-source library for running real-time neural network inference on FPGAs. We show that the proposed algorithm can reduce…
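To make the CMVM idea concrete, here is a minimal Python sketch of a multiplierless constant matrix-vector product. It is not the da4ml algorithm and does not use the hls4ml API. It only illustrates the general shift-and-add principle, assuming integer weights and inputs. The function names `naf_digits` and `cmvm_shift_add` are hypothetical, chosen for this example. Each constant is split into signed powers of two (non-adjacent form), so every product becomes a few shifts and additions, and the adder count serves as a rough stand-in for area.

```python
def naf_digits(c: int) -> list[tuple[int, int]]:
    """Split an integer constant into signed powers of two (non-adjacent form).

    Returns (shift, sign) pairs such that c == sum(sign * 2**shift).
    """
    digits = []
    sign_c = 1 if c >= 0 else -1
    n = abs(c)
    shift = 0
    while n:
        if n & 1:
            # Choose +1 when n % 4 == 1 and -1 when n % 4 == 3,
            # so the next bit is guaranteed to be zero.
            d = 2 - (n & 3)
            digits.append((shift, d * sign_c))
            n -= d
        n >>= 1
        shift += 1
    return digits


def cmvm_shift_add(matrix: list[list[int]], x: list[int]) -> tuple[list[int], int]:
    """Compute matrix @ x using only shifts and additions.

    Returns the output vector and the number of two-input adders a naive
    implementation would need (no sharing of common subexpressions).
    """
    outputs = []
    adders = 0
    for row in matrix:
        terms = []
        for c, xj in zip(row, x):
            for shift, sign in naf_digits(c):
                terms.append(sign * (xj << shift))
        outputs.append(sum(terms))
        adders += max(len(terms) - 1, 0)
    return outputs, adders


# Hypothetical 2x2 integer weight matrix and input vector.
W = [[7, -3], [5, 6]]
x = [3, -2]
y, n_adders = cmvm_shift_add(W, x)
print(y, n_adders)
```

This naive version treats every output row independently. Area-oriented CMVM optimizers typically go further by reusing partial sums that appear in more than one row. That reuse step is omitted here.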
