Continuous-Flow Data-Rate-Aware CNN Inference on FPGA
Tobias Habermann, Michael Mecik, Zhenyu Wang, César David Vera, Martin Kumm, Mario Garrido

TL;DR
This paper introduces a data-rate-aware FPGA architecture for CNN inference that keeps hardware utilization and throughput high by accounting for data-rate reductions between layers and sharing resources.
Contribution
It presents a new approach to designing continuous-flow CNN architectures on FPGA that adapt to data rate reductions, improving efficiency and enabling complex CNNs like MobileNet.
Findings
High hardware utilization close to 100% achieved
Significant logic savings enable complex CNNs on FPGA
High throughput maintained for various CNN architectures
Abstract
Among hardware accelerators for deep-learning inference, data flow implementations offer low latency and high throughput. In these architectures, each neuron is mapped to a dedicated hardware unit, which makes them well suited to field-programmable gate array (FPGA) implementation. Previous unrolled implementations mostly focus on fully connected networks because of their simplicity, although it is well known that convolutional neural networks (CNNs) require fewer computations for the same accuracy. In a CNN, pooling layers and convolutional layers with a stride larger than one produce fewer data at their output than they receive at their input. This data reduction strongly affects the data rate in a fully parallel implementation, leaving hardware units heavily underutilized unless it is handled properly. This work addresses this issue by…
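The data-rate effect described in the abstract can be illustrated with a small calculation. The sketch below is not the authors' method. It assumes 2-D layers in which a stride of s keeps roughly 1/s² of the input positions (padding and border effects are ignored), and it uses a made-up four-layer stride list. The function name `relative_data_rates` is hypothetical.

```python
from fractions import Fraction


def relative_data_rates(strides):
    """Return the data rate entering each layer, relative to the network input.

    Assumption: a 2-D layer with stride s emits about 1/s**2 as many output
    positions as it receives, so the rate seen by the next layer drops by s**2.
    """
    rate = Fraction(1)
    rates = []
    for s in strides:
        rates.append(rate)   # rate at the input of this layer
        rate /= s * s        # rate at the output of this layer
    return rates


# Hypothetical layer stack: two of the four layers use stride 2.
strides = [1, 2, 1, 2]
rates = relative_data_rates(strides)

for index, (s, r) in enumerate(zip(strides, rates)):
    # A unit sized for one input position per cycle is busy only a fraction r
    # of the time when data arrives at relative rate r.
    print(f"layer {index}: stride {s}, input rate {r}, utilization {float(r):.0%}")
```

Under these assumptions, a unit in the third or fourth layer that was sized for the full input rate would be busy only a quarter of the time, which is the kind of underutilization the abstract refers to.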
Taxonomy
Topics: Advanced Neural Network Applications · Embedded Systems Design Techniques · Adversarial Robustness in Machine Learning
