A 0.3-2.6 TOPS/W Precision-Scalable Processor for Real-Time Large-Scale ConvNets

Bert Moons; Marian Verhelst

arXiv:1606.05094·cs.AR·June 17, 2016

TL;DR

This paper presents a low-power, precision-scalable CNN processor in 40nm technology that exploits sparsity and dynamic precision to achieve high energy efficiency and real-time performance for large-scale ConvNets.

Contribution

It introduces the first processor to combine sparsity exploitation with dynamic precision scalability for energy-efficient CNN processing.

Findings

01

Achieves 0.3-2.6 TOPS/W efficiency range.

02

Consumes 25-288mW at 204MHz.

03

Outperforms state-of-the-art by up to 3.9x in energy efficiency.

Abstract

A low-power precision-scalable processor for ConvNets, or convolutional neural networks (CNNs), is implemented in a 40nm technology. Its 256 parallel processing units achieve a peak of 102 GOPS running at 204 MHz. To minimize energy consumption while maintaining throughput, this work is the first both to exploit the sparsity of convolutions and to implement dynamic precision scalability, enabling supply and energy scaling. The processor is fully C-programmable, consumes 25-288 mW at 204 MHz, and scales efficiency from 0.3 to 2.6 real TOPS/W. The system thereby outperforms the state of the art by up to 3.9x in energy efficiency.
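The headline figures can be sanity-checked with a little arithmetic. The sketch below is not from the paper. It assumes that each processing unit completes one multiply-accumulate per cycle and that a multiply-accumulate counts as two operations, a common convention the abstract does not confirm. It also assumes, purely for illustration, that peak throughput coincides with the highest reported power. The function names are hypothetical.

```python
def peak_gops(num_units: int, freq_hz: float, ops_per_mac: int = 2) -> float:
    """Peak throughput in GOPS, assuming one MAC per unit per cycle.

    ops_per_mac=2 (one multiply plus one add) is an assumed convention,
    not something stated in the abstract.
    """
    return num_units * freq_hz * ops_per_mac / 1e9


def tops_per_watt(gops: float, power_w: float) -> float:
    """Energy efficiency in TOPS/W for a given throughput and power."""
    return gops / 1e3 / power_w


# 256 units at 204 MHz, figures taken from the abstract.
estimate = peak_gops(256, 204e6)
print(f"estimated peak: {estimate:.1f} GOPS")

# Reported peak (102 GOPS) paired with the highest reported power (288 mW).
# The pairing is an assumption made for illustration only.
low_end = tops_per_watt(102, 0.288)
print(f"efficiency at full power: {low_end:.2f} TOPS/W")
```

Under these assumptions the estimate comes out slightly above the reported 102 GOPS, and the full-power efficiency is in line with the lower end of the reported 0.3-2.6 TOPS/W range. The abstract does not say which workload or precision setting yields the upper end, so the sketch does not attempt to reproduce it.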
