Data-Rate-Aware High-Speed CNN Inference on FPGAs

Tobias Habermann; Martin Kumm

arXiv:2603.08726·cs.AR·March 11, 2026

Open Access

TL;DR

This paper introduces a data-rate-aware FPGA-based CNN accelerator that optimizes hardware utilization by maintaining continuous data flow, enabling efficient processing of complex CNNs with reduced resource usage.

Contribution

It proposes a novel data-rate-aware architecture and design-space exploration method to improve FPGA CNN inference efficiency across varying data rates.

Findings

01 Significant reduction in arithmetic resource usage.

02 Enhanced hardware utilization and efficiency.

03 Effective processing of complex CNNs on a single FPGA.

Abstract

Dataflow-based CNN accelerators on FPGAs achieve low latency and high throughput by mapping the computations of each layer directly to corresponding hardware units. However, layers such as pooling and strided convolutions reduce the amount of data at their output relative to their input, strongly affecting the data rate of the following layers. This leads to underutilization in fully unrolled designs. While prior work introduced data-rate-aware layer-wise adaptation, determining the most efficient implementation remains challenging. This paper presents a data-rate-aware CNN accelerator architecture for multi-pixel processing. Building on existing analytical models, the proposed method performs design-space exploration to identify configurations that improve hardware utilization and resource efficiency while preserving a continuous flow of data, keeping all hardware units busy. Experimental…
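To illustrate the data-rate effect the abstract describes, here is a minimal sketch, not the paper's analytical model. It assumes each layer with spatial stride s emits 1/s² as many pixels as it receives, and that a fully unrolled unit could accept one pixel per clock cycle. The layer names, the strides, and the helpers `pixel_rates` and `unrolled_utilization` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def pixel_rates(layers, input_rate=Fraction(1)):
    """Return (name, input pixel rate) per layer, in pixels per clock cycle.

    layers: list of (name, stride) tuples. The stride is assumed to apply
    in both spatial dimensions, so a layer divides the pixel rate seen by
    the next layer by stride * stride.
    """
    rates = []
    rate = Fraction(input_rate)
    for name, stride in layers:
        rates.append((name, rate))
        rate = rate / (stride * stride)
    return rates


def unrolled_utilization(rates, unit_capacity=Fraction(1)):
    """Fraction of cycles a unit is busy if it could accept unit_capacity
    pixels per cycle (assumption: one pixel per cycle when fully unrolled)."""
    return [(name, min(Fraction(1), rate / unit_capacity)) for name, rate in rates]


# Hypothetical five-layer network, not taken from the paper.
layers = [
    ("conv1", 1),
    ("pool1", 2),
    ("conv2", 1),
    ("conv3_strided", 2),
    ("conv4", 1),
]

rates = pixel_rates(layers)
utilization = unrolled_utilization(rates)
for name, util in utilization:
    print(f"{name}: utilization {float(util):.4f}")
```

Under these assumptions, the unit for `conv4` sits behind two stride-2 layers and receives a pixel only once every 16 cycles, which is the kind of idle hardware a data-rate-aware design aims to avoid by sizing each layer's parallelism to its actual input rate.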

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks Stability and Synchronization · Embedded Systems Design Techniques