Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic

Michaela Blott; Thomas B. Preusser; Nicholas Fraser; Giulio Gambardella; Kenneth O'Brien; Yaman Umuroglu; Miriam Leeser

arXiv:1807.03123·cs.CV·July 10, 2018·1 cites

Open Access

TL;DR

This paper explores how customized reconfigurable hardware architectures can scale neural network performance by supporting various precisions and larger datasets, backed by formal models and experimental validation on ImageNet.

Contribution

It introduces a formalized approach and cost models to understand hardware scalability for neural networks with different precisions and dataset sizes, validated on AWS F1.

Findings

01

Scalability of hardware architectures depends on precision and dataset size.

02

Performance prediction models align with experimental results.

03

Supporting multiple precisions enhances neural network deployment flexibility.

Abstract

Convolutional Neural Networks have dramatically improved in recent years, surpassing human accuracy on certain problems and exceeding the performance of traditional computer vision algorithms. While the compute pattern itself is relatively simple, significant compute and memory challenges remain, as CNNs may contain millions of floating-point parameters and require billions of floating-point operations to process a single image. These computational requirements, combined with storage footprints that exceed typical cache sizes, pose a significant performance and power challenge for modern compute architectures. One promising opportunity to scale performance and power efficiency is to use reduced-precision representations for all activations and weights, as this makes it possible to scale compute capabilities and to reduce weight and feature map buffering requirements as well as energy…
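The abstract's point about weight buffering can be made concrete with simple arithmetic: storage for weights scales linearly with the number of bits per weight. The sketch below is illustrative only and is not taken from the paper. The function names and the parameter count of 10 million are hypothetical, chosen just to show the scaling.

```python
def weight_footprint_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store num_params weights, rounded up to a whole byte."""
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    total_bits = num_params * bits_per_weight
    return (total_bits + 7) // 8  # ceiling division by 8


def reduction_factor(baseline_bits: int, reduced_bits: int) -> float:
    """How many times smaller the footprint becomes at the reduced bit width."""
    if baseline_bits <= 0 or reduced_bits <= 0:
        raise ValueError("bit widths must be positive")
    return baseline_bits / reduced_bits


if __name__ == "__main__":
    # Hypothetical network size, used only for illustration.
    example_params = 10_000_000
    for bits in (32, 8, 2, 1):
        size_mb = weight_footprint_bytes(example_params, bits) / 1e6
        factor = reduction_factor(32, bits)
        print(f"{bits:>2}-bit weights: {size_mb:6.2f} MB ({factor:.0f}x smaller than 32-bit)")
```

This only models raw weight storage. It ignores feature map buffers, packing overhead, and any per-layer scaling factors, all of which a real cost model would need to account for.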

Taxonomy

Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing