Data-Rate-Aware High-Speed CNN Inference on FPGAs
Tobias Habermann, Martin Kumm

TL;DR
This paper introduces a data-rate-aware FPGA-based CNN accelerator that optimizes hardware utilization by maintaining continuous data flow, enabling efficient processing of complex CNNs with reduced resource usage.
Contribution
It proposes a novel data-rate-aware architecture and design-space exploration method to improve FPGA CNN inference efficiency across varying data rates.
Findings
Significant reduction in arithmetic resource usage.
Enhanced hardware utilization and efficiency.
Effective processing of complex CNNs on a single FPGA.
Abstract
Dataflow-based CNN accelerators on FPGAs achieve low latency and high throughput by mapping the computations of each layer directly to corresponding hardware units. However, layers such as pooling and strided convolutions produce less data at their output than they receive at their input, strongly affecting the data rate of the following layers. This leads to underutilization in fully unrolled designs. While prior work introduced data-rate-aware layer-wise adaptation, determining the most efficient implementation remains challenging. This paper presents a data-rate-aware CNN accelerator architecture for multi-pixel processing. Building on existing analytical models, the proposed method performs design-space exploration to identify configurations that improve hardware utilization and resource efficiency while preserving a continuous flow of data, keeping all hardware units busy. Experimental…
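The effect of downsampling layers on the data rate can be illustrated with a small calculation. The following is a minimal sketch under simplifying assumptions and is not the paper's analytical model: each layer with stride s is assumed to divide the pixel rate by s², a convolution is assumed to need kernel² · c_in · c_out multiply-accumulates (MACs) per output pixel, and a "fully unrolled" layer is assumed to instantiate exactly that many MAC units. The names `Layer`, `analyze`, and the example network are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction
from math import ceil


@dataclass
class Layer:
    name: str
    kind: str        # "conv" or "pool" (assumed layer types for this sketch)
    kernel: int
    c_in: int
    c_out: int
    stride: int = 1

    def macs_per_output_pixel(self) -> int:
        # Pooling is assumed to need no multipliers in this simplified model.
        if self.kind == "pool":
            return 0
        return self.kernel ** 2 * self.c_in * self.c_out


def analyze(layers, input_rate):
    """Propagate the pixel rate (pixels per clock cycle) through the layers."""
    rate = Fraction(input_rate)
    report = []
    for layer in layers:
        # Assumption: a stride of s reduces the pixel rate by s * s.
        out_rate = rate / (layer.stride ** 2)
        unrolled = layer.macs_per_output_pixel()
        # MAC units needed to sustain out_rate without stalling.
        needed = ceil(out_rate * unrolled)
        # Fraction of cycles a fully unrolled unit would actually be busy.
        utilization = min(Fraction(1), out_rate)
        report.append({
            "name": layer.name,
            "out_rate": out_rate,
            "unrolled_macs": unrolled,
            "needed_macs": needed,
            "unrolled_utilization": utilization,
        })
        rate = out_rate
    return report


# Hypothetical four-layer network, fed with one pixel per clock cycle.
example_layers = [
    Layer("conv1", "conv", kernel=3, c_in=3, c_out=16),
    Layer("pool1", "pool", kernel=2, c_in=16, c_out=16, stride=2),
    Layer("conv2", "conv", kernel=3, c_in=16, c_out=32),
    Layer("conv3", "conv", kernel=3, c_in=32, c_out=64, stride=2),
]

report = analyze(example_layers, input_rate=1)
for row in report:
    print(row["name"], row["out_rate"], row["unrolled_macs"],
          row["needed_macs"], row["unrolled_utilization"])
```

Under these assumptions, the layers behind the pooling stage see only a quarter of the input rate or less, so a fully unrolled implementation of them would sit idle most of the time. Sizing each layer to its actual data rate is the kind of adaptation the abstract describes.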
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks Stability and Synchronization · Embedded Systems Design Techniques
