FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10

Ke He; Bo Liu; Yu Zhang; Andrew Ling; Dian Gu

arXiv:1911.08905·cs.DC·March 24, 2020

TL;DR

FeCaffe is a novel FPGA-enabled extension of the Caffe framework that supports deep learning training and inference on Intel Stratix 10, offering significant speedups and flexibility for CNN development.

Contribution

This work introduces FeCaffe, a hierarchical design methodology enabling FPGA support for mainline deep learning features in Caffe, including training and inference.

Findings

01

Supports nearly the full set of CNN training and inference features on FPGA

02

Achieves average speedups of 6.4x and 8.4x for the forward and backward passes, respectively, on LeNet

03

Provides a flexible, extensible platform for FPGA-based deep learning development

Abstract

Deep learning and Convolutional Neural Networks (CNNs) have become increasingly popular and important in both academia and industry in recent years because they provide better accuracy and results in classification, detection, and recognition tasks than traditional approaches. Currently, there are many popular frameworks for deep learning development, such as Caffe, TensorFlow, and PyTorch; most of them natively support CPUs and treat the GPU as the mainline accelerator by default. FPGA devices, viewed as a potential heterogeneous platform, still lack comprehensive support for CNN development in popular frameworks, particularly for the training phase. In this paper, we first propose FeCaffe, i.e. FPGA-enabled Caffe, a hierarchical software and hardware design methodology based on Caffe that enables FPGAs to support mainline…

Taxonomy

Methods: Convolution · Dense Connections · LeNet