PipeCNN: An OpenCL-Based FPGA Accelerator for Large-Scale Convolution Neuron Networks

Dong Wang; Jianjing An; Ke Xu

arXiv:1611.02450 · cs.AR · November 9, 2016 · 32 cites

Open Access

TL;DR

This paper presents PipeCNN, an FPGA-based CNN accelerator utilizing deeply pipelined OpenCL kernels, achieving high performance and resource efficiency, and demonstrating its effectiveness on large-scale networks like AlexNet and VGG.

Contribution

It introduces a novel FPGA architecture with deeply pipelined kernels and data reuse techniques, improving performance and resource utilization over prior OpenCL-based CNN accelerators.

Findings

01. Achieved 33.9 GOPS peak performance on FPGA.

02. Reduced DSP resource usage by 34% compared to previous designs.

03. Successfully implemented large-scale CNNs like AlexNet and VGG.

Abstract

Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high power dissipation. Recently, studies have explored FPGAs as CNN accelerators because of their reconfigurability and energy-efficiency advantages over GPUs, especially now that OpenCL-based high-level synthesis tools are available, providing fast verification and implementation flows. Previous OpenCL-based designs focused only on creating a generic framework to identify performance-related hardware parameters, without utilizing the FPGA's special capability of pipelining kernel functions to minimize memory bandwidth requirements. In this work, we propose an FPGA accelerator with a new architecture of deeply pipelined OpenCL kernels. Data reuse and task…

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning