A 0.3-2.6 TOPS/W Precision-Scalable Processor for Real-Time Large-Scale ConvNets
Bert Moons, Marian Verhelst

TL;DR
This paper presents a low-power, precision-scalable CNN processor in 40nm technology that exploits sparsity and dynamic precision to achieve high energy efficiency and real-time performance for large-scale ConvNets.
Contribution
It introduces the first processor to combine sparsity exploitation with dynamic precision scalability for energy-efficient CNN processing.
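As a rough software illustration of the two ideas combined here, the sketch below skips multiply-accumulate operations whose operands are zero and truncates operands to a configurable number of bits. It is not the authors' hardware design. The function names, the 16-bit full precision, and the truncation scheme are all assumptions made for this example.

```python
def quantize(x: int, bits: int, full_bits: int = 16) -> int:
    """Keep the `bits` most significant bits of an integer by zeroing the LSBs.

    full_bits=16 is an assumed full precision, not a figure from the paper.
    """
    shift = full_bits - bits
    return (x >> shift) << shift


def sparse_mac(weights, activations, bits: int = 16):
    """Multiply-accumulate that skips zero operands (hypothetical helper).

    Returns the accumulated sum and the number of skipped multiplications.
    """
    acc = 0
    skipped = 0
    for w, a in zip(weights, activations):
        w, a = quantize(w, bits), quantize(a, bits)
        if w == 0 or a == 0:
            # A zero operand contributes nothing, so the multiply is skipped.
            skipped += 1
            continue
        acc += w * a
    return acc, skipped


full = sparse_mac([0, 256, 3], [5, 512, 7], bits=16)
reduced = sparse_mac([0, 256, 3], [5, 512, 7], bits=8)
```

In this toy model, lowering the precision turns small operands into zeros, so more multiplications are skipped. The same two mechanisms interact in the sketch, but how they are realized in the processor is not described in this summary.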
Findings
Achieves an energy efficiency of 0.3-2.6 TOPS/W.
Consumes 25-288mW at 204MHz.
Outperforms the state of the art by up to 3.9x in energy efficiency.
Abstract
A low-power, precision-scalable processor for convolutional neural networks (ConvNets or CNNs) is implemented in a 40nm technology. Its 256 parallel processing units achieve a peak of 102GOPS running at 204MHz. To minimize energy consumption while maintaining throughput, this work is the first to both exploit the sparsity of convolutions and implement dynamic precision scalability, enabling supply and energy scaling. The processor is fully C-programmable, consumes 25-288mW at 204MHz, and scales efficiency from 0.3 to 2.6 real TOPS/W. The system thereby outperforms the state of the art by up to 3.9x in energy efficiency.
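The efficiency figures can be sanity-checked with simple unit arithmetic: GOPS divided by mW is numerically equal to TOPS/W. The sketch below assumes, for illustration only, that the peak throughput and the maximum power occur at the same operating point. The abstract does not state this pairing, and the helper name is hypothetical.

```python
def tops_per_watt(gops: float, milliwatts: float) -> float:
    """Convert throughput (GOPS) and power (mW) to efficiency in TOPS/W.

    1 GOPS / 1 mW = 1e9 / 1e-3 ops per joule = 1e12 ops/W = 1 TOPS/W.
    """
    return gops / milliwatts


PEAK_GOPS = 102.0      # peak throughput from the abstract
MAX_POWER_MW = 288.0   # upper end of the reported power range

# Assumption: peak throughput coincides with maximum power.
low_end = tops_per_watt(PEAK_GOPS, MAX_POWER_MW)
print(f"{low_end:.2f} TOPS/W")  # prints "0.35 TOPS/W"
```

Under that assumption the result is in line with the reported lower bound of about 0.3 TOPS/W. The upper bound of 2.6 TOPS/W cannot be reproduced the same way, because the abstract does not give the throughput measured at the low-power operating point.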
