FeCaffe: FPGA-enabled Caffe with OpenCL for Deep Learning Training and Inference on Intel Stratix 10
Ke He, Bo Liu, Yu Zhang, Andrew Ling, Dian Gu

TL;DR
FeCaffe is a novel FPGA-enabled extension of the Caffe framework that supports deep learning training and inference on Intel Stratix 10, offering significant speedups and flexibility for CNN development.
Contribution
This work introduces FeCaffe, a hierarchical design methodology enabling FPGA support for mainline deep learning features in Caffe, including training and inference.
Findings
Supports nearly the full set of CNN training and inference features on FPGA
Achieves average speedups of 6.4x and 8.4x for the forward and backward passes, respectively, on LeNet
Provides a flexible, extensible platform for FPGA-based deep learning development
Abstract
Deep learning and Convolutional Neural Networks (CNNs) have become increasingly popular and important in both academia and industry in recent years because they provide better accuracy and results in classification, detection, and recognition tasks than traditional approaches. Currently, there are many popular frameworks for deep learning development, such as Caffe, TensorFlow, and PyTorch; most of them natively support CPUs and treat GPUs as the mainline accelerator by default. FPGAs, viewed as a potential heterogeneous platform, still lack comprehensive support for CNN development in popular frameworks, particularly for the training phase. In this paper, we first propose FeCaffe, i.e. FPGA-enabled Caffe, a hierarchical software and hardware design methodology based on Caffe that enables FPGAs to support mainline…
Taxonomy
Methods: Convolution · Dense Connections · LeNet
