TL;DR
BARVINN is a configurable DNN accelerator, controlled by a RISC-V CPU, that supports arbitrary-precision inference. It enables scalable, runtime-programmable, low-precision neural network acceleration on FPGA platforms.
Contribution
The paper introduces a scalable DNN accelerator with bit-level configurability that is programmable at run time without hardware reconfiguration, together with a code generator that converts CNN models in ONNX format into a command stream for the RISC-V controller.
Findings
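Achieves a combined 8.2 TMACs with 8 Processing Elements on the Alveo U250 FPGA platform.
Supports multiple quantization levels with runtime programmability, regardless of the target FPGA size.
Demonstrates scalable throughput across different DNN kernels and models at different quantization levels.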
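As a rough illustration of how the headline figure relates to quantization level, the sketch below splits the 8.2 TMACs across the 8 Processing Elements and scales it by operand precision. The scaling rule is an assumption that this summary does not confirm: it supposes the peak figure counts 1-bit by 1-bit MACs and that a bit-serial engine needs `w_bits * a_bits` passes per full-precision MAC. The function names are hypothetical.

```python
def per_pe_tmacs(total_tmacs: float, num_pes: int) -> float:
    """Even split of the combined throughput across processing elements."""
    return total_tmacs / num_pes


def effective_tmacs(peak_tmacs: float, w_bits: int, a_bits: int) -> float:
    """Estimated throughput at a given weight/activation precision.

    ASSUMPTION (not stated in the source): the peak is measured in
    1-bit x 1-bit MACs, and one full-precision MAC costs
    w_bits * a_bits one-bit passes.
    """
    if w_bits < 1 or a_bits < 1:
        raise ValueError("bit widths must be at least 1")
    return peak_tmacs / (w_bits * a_bits)


if __name__ == "__main__":
    peak = 8.2  # combined TMACs reported for the Alveo U250 implementation
    print("per PE:", per_pe_tmacs(peak, 8))
    for w_bits, a_bits in [(1, 1), (2, 2), (4, 4), (8, 8)]:
        print(f"W{w_bits}A{a_bits}:", effective_tmacs(peak, w_bits, a_bits))
```

Under this assumed model, halving both operand widths quadruples the estimated throughput, which is one way to read the "scalable throughput" finding.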
Abstract
We present a DNN accelerator that allows inference at arbitrary precision with dedicated processing elements that are configurable at the bit level. Our DNN accelerator has 8 Processing Elements controlled by a RISC-V controller with a combined 8.2 TMACs of computational power when implemented with the recent Alveo U250 FPGA platform. We develop a code generator tool that ingests CNN models in ONNX format and generates an executable command stream for the RISC-V controller. We demonstrate the scalable throughput of our accelerator by running different DNN kernels and models when different quantization levels are selected. Compared to other low precision accelerators, our accelerator provides run time programmability without hardware reconfiguration and can accelerate DNNs with multiple quantization levels, regardless of the target FPGA size. BARVINN is an open source project and it is…
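To make "configurable at the bit level" concrete, here is a minimal sketch of a bit-serial dot product, a common technique for arbitrary-precision inference. It is an assumed illustration, not the accelerator's documented datapath: the abstract does not say how the Processing Elements are built, and the names `bit_planes`, `plane_weight`, and `bit_serial_dot` are hypothetical. Operands are treated as two's-complement integers of `w_bits` and `a_bits`.

```python
def bit_planes(values, bits):
    """Split two's-complement integers into `bits` one-bit planes, LSB first."""
    mask = (1 << bits) - 1
    return [[((v & mask) >> i) & 1 for v in values] for i in range(bits)]


def plane_weight(i, bits):
    """Place value of bit plane i; the top (sign) plane is negative."""
    return -(1 << i) if i == bits - 1 else (1 << i)


def bit_serial_dot(weights, acts, w_bits, a_bits):
    """Dot product built only from 1-bit AND operations and shifted adds.

    Each (weight plane, activation plane) pair needs one pass, so the
    same loop handles any precision by changing w_bits and a_bits.
    """
    acc = 0
    for i, w_plane in enumerate(bit_planes(weights, w_bits)):
        for j, a_plane in enumerate(bit_planes(acts, a_bits)):
            # 1-bit multiply is AND; the sum is a population count.
            partial = sum(w & a for w, a in zip(w_plane, a_plane))
            acc += plane_weight(i, w_bits) * plane_weight(j, a_bits) * partial
    return acc


if __name__ == "__main__":
    weights = [-3, 2, 1, -4]  # fits in 3-bit two's complement
    acts = [1, -2, 3, 0]      # fits in 3-bit two's complement
    reference = sum(w * a for w, a in zip(weights, acts))
    assert bit_serial_dot(weights, acts, 3, 3) == reference
    print(reference)
```

In this sketch, precision is a loop bound instead of a property of the arithmetic units, which is one way a single design could serve several quantization levels without hardware reconfiguration.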
