Sparse Winograd Convolutional neural networks on small-scale systolic arrays

Feng Shi; Haochen Li; Yuhe Gao; Benjamin Kuschner; Song-Chun Zhu

arXiv:1810.01973·cs.DC·October 5, 2018

Open Access

TL;DR

This paper presents an FPGA-based accelerator that combines sparse Winograd convolution, clusters of small-scale systolic arrays, and a tailored memory layout. On VGG16 it reports 20x to 30x energy efficiency and more than 5x speedup compared with the dense implementation.

Contribution

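It introduces an FPGA accelerator design that balances computational throughput against the memory subsystem's ability to sustain it, using sparse Winograd convolution and clusters of small-scale systolic arrays. It also provides an analytical model of the general Winograd convolution algorithm as a design reference.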

Findings

01. Achieves a 20x to 30x energy-efficiency improvement on VGG16.

02. Achieves more than 5x speedup over the dense implementation on VGG16.

03. Achieves very high computational resource utilization on the FPGA.

Abstract

The reconfigurability, energy efficiency, and massive parallelism of FPGAs make them one of the best choices for implementing efficient deep learning accelerators. However, state-of-the-art implementations seldom consider the balance between high computational throughput and the ability of the memory subsystem to support it. In this paper, we implement an accelerator on an FPGA by combining sparse Winograd convolution, clusters of small-scale systolic arrays, and a tailored memory layout design. We also provide an analytical model of the general Winograd convolution algorithm as a design reference. Experimental results on VGG16 show that it achieves very high computational resource utilization, 20x to 30x energy efficiency, and more than 5x speedup compared with the dense implementation.
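To make the term "sparse Winograd convolution" concrete, here is a minimal sketch of the standard 1D Winograd minimal-filtering case F(2,3), which computes two outputs of a 3-tap filter with four multiplications instead of six. It is not the paper's implementation. The function names are hypothetical, and skipping zero-valued transformed weights is an assumed, simplified stand-in for whatever sparsity scheme the accelerator actually uses.

```python
def transform_filter(g):
    """Map a 3-tap filter into the 4-element Winograd domain (done once, offline)."""
    g0, g1, g2 = g
    return [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]


def transform_input(d):
    """Map a 4-element input tile into the Winograd domain."""
    d0, d1, d2, d3 = d
    return [d0 - d2, d1 + d2, d2 - d1, d1 - d3]


def winograd_f23(d, u):
    """Compute 2 outputs from input tile d and transformed filter u.

    Assumption for illustration: a zero in u means the weight was pruned,
    so its multiplication is skipped.
    """
    v = transform_input(d)
    m = [ui * vi if ui != 0 else 0.0 for ui, vi in zip(u, v)]
    # Inverse transform back to the two spatial outputs.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def direct_conv(d, g):
    """Reference: direct sliding-window computation (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


tile = [1.0, 2.0, 3.0, 4.0]
taps = [1.0, 0.0, -1.0]
u = transform_filter(taps)       # two of the four entries are zero here
result = winograd_f23(tile, u)   # matches direct_conv(tile, taps)
```

With this particular filter, two of the four transformed weights are zero, so only two multiplications are actually performed. That illustrates why sparsity in the Winograd domain can reduce work beyond the dense Winograd saving.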

Taxonomy

Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing

Methods: Convolution