Development of embedded target detection system based on FPGA and YOLOv3-Tiny
Zihan Jiang, Fanghao Liu, Huawei Wang, Mamataziz Mattohti, Xiangquan Chen, Jingfu Guo, Xiaotian Wu, and Yongjun Dong

TL;DR
This paper develops an FPGA-based embedded target detection system using YOLOv3-Tiny, optimizing for efficiency, speed, and resource use in resource-constrained environments.
Contribution
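It introduces an FPGA hardware accelerator that pairs lightweight CNN optimizations (low-bit quantization, batch normalization fusion, and table lookup mapping) with a pipelined architecture to improve the computational efficiency and resource utilization of embedded inference.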
Findings
Inference latency of 0.211 seconds on ZYNQ-XC7Z035
Power efficiency of 10.11 GOPS/W, surpassing similar designs
Resource utilization reduced by up to 51.94%
Abstract
Computational complexity and storage requirements are crucial factors influencing the performance and efficiency of convolutional neural networks (CNNs) in resource-constrained environments. This paper presents a high-performance embedded target detection system based on FPGA and YOLOv3-Tiny, specifically designed for embedded artificial intelligence applications. By integrating lightweight CNN optimization techniques with hardware accelerator design, significant improvements are made in both computational efficiency and resource utilization. Key optimizations, including low-bit quantization, batch normalization fusion, and table lookup mapping, reduce model parameters and computational complexity. Additionally, an FPGA hardware accelerator with a pipelined architecture is developed to enhance the efficiency of convolution operations while minimizing off-chip data transmission through…
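Two of the optimizations named in the abstract, batch normalization fusion and low-bit quantization, can be illustrated with a short sketch. This is a generic textbook formulation and not the paper's implementation: the function names (`fuse_batch_norm`, `quantize_symmetric`, `conv_dot`, `batch_norm`), the per-channel flattened-kernel layout, the symmetric rounding scheme, and the 8-bit default are all assumptions made for illustration, since the abstract does not state the bit width or quantization scheme used.

```python
import math


def conv_dot(weights, inputs, bias):
    """One output value of a convolution: dot product of a flattened kernel and patch."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization for a single channel."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fuse_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into convolution weights and biases.

    weights: list of flattened kernels, one per output channel (assumed layout).
    Returns (fused_weights, fused_bias) so that conv_dot alone reproduces
    conv_dot followed by batch_norm.
    """
    fused_w, fused_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        fused_w.append([w * scale for w in w_c])
        fused_b.append((b_c - mu) * scale + bt)
    return fused_w, fused_b


def quantize_symmetric(values, bits=8):
    """Symmetric per-tensor quantization to signed integers (bit width is an assumption)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = max_abs / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


if __name__ == "__main__":
    # Hypothetical single-channel example values.
    w, b = [[0.5, -1.0, 0.25]], [0.1]
    gamma, beta, mean, var = [1.5], [-0.2], [0.3], [4.0]
    x = [1.0, 2.0, -1.0]

    fw, fb = fuse_batch_norm(w, b, gamma, beta, mean, var)
    print("conv + BN :", batch_norm(conv_dot(w[0], x, b[0]), gamma[0], beta[0], mean[0], var[0]))
    print("fused conv:", conv_dot(fw[0], x, fb[0]))

    q, scale = quantize_symmetric(fw[0])
    print("int8 weights:", q, "scale:", scale)
```

Fusing first and quantizing afterwards means the hardware only has to evaluate a single integer multiply-accumulate per layer at inference time, with no separate normalization step. Table lookup mapping is not sketched here because the abstract does not describe it in enough detail.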
