Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
Michaela Blott, Thomas B. Preusser, Nicholas Fraser, Giulio Gambardella, Kenneth O'Brien, Yaman Umuroglu, Miriam Leeser

TL;DR
This paper explores how customized reconfigurable hardware architectures can scale neural network performance by supporting various precisions and larger datasets, backed by formal models and experimental validation on ImageNet.
Contribution
It introduces a formalized approach and cost models to understand hardware scalability for neural networks with different precisions and dataset sizes, validated on AWS F1.
Findings
Scalability of hardware architectures depends on precision and dataset size.
Performance prediction models align with experimental results.
Supporting multiple precisions enhances neural network deployment flexibility.
Abstract
Convolutional Neural Networks (CNNs) have improved dramatically in recent years, surpassing human accuracy on certain problems and exceeding the performance of traditional computer vision algorithms. While the compute pattern itself is relatively simple, significant compute and memory challenges remain, as CNNs may contain millions of floating-point parameters and require billions of floating-point operations to process a single image. These computational requirements, combined with storage footprints that exceed typical cache sizes, pose a significant performance and power challenge for modern compute architectures. One of the promising opportunities to scale performance and power efficiency is leveraging reduced-precision representations for all activations and weights, as this makes it possible to scale compute capabilities and reduce weight and feature map buffering requirements as well as energy…
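To make the buffering argument concrete, the following minimal sketch estimates how weight storage shrinks as precision is reduced. It is an illustration only and is not taken from the paper's cost models: the function name `weight_footprint_bytes` and the parameter count of 60 million are hypothetical, chosen only to match the "millions of parameters" scale mentioned above.

```python
def weight_footprint_bytes(num_params: int, bits_per_weight: int) -> float:
    """Return the bytes needed to store num_params weights at a given bit width.

    Hypothetical helper for illustration; it ignores padding, alignment,
    and any per-layer scaling factors a real design might need.
    """
    if num_params < 0 or bits_per_weight <= 0:
        raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
    return num_params * bits_per_weight / 8


# Assumed parameter count, not a figure from the paper.
NUM_PARAMS = 60_000_000

if __name__ == "__main__":
    for bits in (32, 16, 8, 2, 1):
        mib = weight_footprint_bytes(NUM_PARAMS, bits) / 2**20
        print(f"{bits:>2}-bit weights: {mib:8.2f} MiB")
```

Under these assumptions the footprint scales linearly with bit width, so moving from 32-bit to 1-bit weights reduces storage by a factor of 32. Whether the reduced footprint fits in on-chip memory depends on the target device, which this sketch does not model.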
Taxonomy
Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing
