A GPU-Outperforming FPGA Accelerator Architecture for Binary Convolutional Neural Networks
Yixing Li, Zichuan Liu, Kai Xu, Hao Yu, Fengbo Ren

TL;DR
This paper presents an FPGA accelerator architecture for binary CNNs that outperforms GPUs in throughput and energy efficiency, especially for small batch sizes, by leveraging massive spatial parallelism and deep pipelining.
Contribution
The paper introduces an optimized FPGA architecture for binary CNNs that achieves higher throughput and energy efficiency than GPUs, with performance insensitive to batch size.
Findings
8.3x faster than Titan X GPU for small batch processing
75x more energy-efficient than Titan X GPU in small batch scenarios
Comparable throughput to GPU for large batch processing
Abstract
FPGA-based hardware accelerators for convolutional neural networks (CNNs) have attracted great attention due to their higher energy efficiency than GPUs. However, it is challenging for FPGA-based solutions to achieve a higher throughput than their GPU counterparts. In this paper, we demonstrate that FPGA acceleration can be a superior solution in terms of both throughput and energy efficiency when a CNN is trained with binary constraints on weights and activations. Specifically, we propose an optimized FPGA accelerator architecture tailored for bitwise convolution and normalization that features massive spatial parallelism with deep pipeline stages. A key advantage of the FPGA accelerator is that its performance is insensitive to data batch size, while the performance of GPU acceleration varies largely depending on the batch size of the data. Experimental results show that the proposed…
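To make "bitwise convolution" concrete, here is a minimal Python sketch of the inner product at the heart of a binary convolution. It assumes the common XNOR-plus-popcount formulation used for binarized networks, with weights and activations in {-1, +1} packed one bit per value. The abstract does not spell out the exact operation, so this formulation and the helper names `pack_bits` and `binary_dot` are illustrative assumptions, not the authors' design.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (hypothetical helper).

    Bit i of the result is 1 where values[i] == +1 and 0 where it is -1.
    """
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two length-n {-1, +1} vectors from their packed bits.

    Assumed formulation: XNOR marks positions where the two signs agree,
    popcount counts them, and 2 * matches - n recovers the signed sum.
    """
    mask = (1 << n) - 1                        # keep only the n valid bits
    agree = ~(a_bits ^ w_bits) & mask          # XNOR of the two bit vectors
    matches = bin(agree).count("1")            # popcount
    return 2 * matches - n


activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
result = binary_dot(pack_bits(activations), pack_bits(weights), len(weights))
reference = sum(a * w for a, w in zip(activations, weights))
print(result, reference)
```

Because each multiply-accumulate collapses to a one-bit logic operation and a bit count, many of these operations can plausibly be laid out side by side in FPGA logic, which fits the spatial parallelism the abstract describes.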
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Advanced Memory and Neural Computing
Methods: Convolution
