Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA
Junsong Wang, Qiuwen Lou, Xiaofan Zhang, Chao Zhu, Yonghua Lin, Deming Chen

TL;DR
This paper introduces a design flow for accelerating extremely low bit-width neural networks on embedded FPGAs, achieving high performance and energy efficiency suitable for edge devices.
Contribution
The paper presents a design flow that integrates hybrid quantization schemes into both network training and FPGA-based deployment, simplifying the tradeoff between network accuracy and computation efficiency on embedded FPGAs.
Findings
Achieves peak performance of 10.3 TOPS
Classifies up to 325.3 images per second per watt
Most energy-efficient solution compared to GPU and other FPGA implementations
Abstract
Neural network accelerators with low latency and low energy consumption are desirable for edge computing. To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) in embedded FPGAs with hybrid quantization schemes. This flow covers both network training and FPGA-based network deployment, which facilitates the design space exploration and simplifies the tradeoff between network accuracy and computation efficiency. Using this flow helps hardware designers to deliver a network accelerator in edge devices under strict resource and power constraints. We present the proposed flow by supporting hybrid ELB settings within a neural network. Results show that our design can deliver very high performance peaking at 10.3 TOPS and classify up to 325.3 image/s/watt while running large-scale neural networks for less than 5W using…
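To make the idea of hybrid ELB settings concrete, here is a minimal sketch of assigning a different bit-width to each layer and quantizing its weights accordingly. It is not the paper's method: the layer names, the bit-width assignment in `HYBRID_BITS`, and the functions `quantize_weights` and `quantize_network` are all hypothetical, and the scheme (sign-based binarization for 1 bit, symmetric uniform quantization otherwise) is only one common choice assumed for illustration.

```python
def quantize_weights(weights, bits):
    """Quantize a list of floats to `bits` bits (illustrative scheme, not the paper's)."""
    if bits == 1:
        # Binary: keep only the sign, scaled by the mean magnitude.
        scale = sum(abs(w) for w in weights) / len(weights)
        return [scale if w >= 0 else -scale for w in weights]
    # Symmetric uniform quantization for bits >= 2.
    # With bits == 2 this gives three levels {-max, 0, +max}, i.e. ternary weights.
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return [0.0] * len(weights)
    step = max_abs / levels
    return [round(w / step) * step for w in weights]


# Hypothetical per-layer bit-widths: more bits where accuracy is assumed to be
# sensitive, fewer bits elsewhere.
HYBRID_BITS = {"conv1": 8, "conv2": 2, "fc1": 1}


def quantize_network(layers, bit_config):
    """Apply the per-layer bit-width in `bit_config` to every layer's weights."""
    return {name: quantize_weights(w, bit_config[name]) for name, w in layers.items()}


if __name__ == "__main__":
    layers = {
        "conv1": [0.5, -0.25, 0.1],
        "conv2": [0.9, -0.2, -0.8],
        "fc1": [0.4, -0.2],
    }
    for name, q in quantize_network(layers, HYBRID_BITS).items():
        print(name, HYBRID_BITS[name], q)
```

Changing the entries of `HYBRID_BITS` is the kind of knob a design space exploration would sweep: fewer bits reduce computation and memory cost, usually at some cost in accuracy.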
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Machine Learning and ELM
