Unrolling Ternary Neural Networks

Stephen Tridgell; Martin Kumm; Martin Hardieck; David Boland; Duncan Moss; Peter Zipf; Philip H.W. Leong

arXiv:1909.04509 · eess.SP · September 11, 2019

TL;DR

This paper presents an FPGA implementation of a ternary neural network for CIFAR10 that reaches 122k frames/sec at low latency by removing unnecessary computations through compile-time optimizations.

Contribution

It introduces a method to customize FPGA datapaths when the network architecture and ternary weights are known in advance, removing about 90% of convolution operations and increasing inference speed.

Findings

01

Achieved 122k frames/sec inference speed.

02

Removed 90% of convolution operations using sparsity and compile-time optimization.

03

Maintained 90.9% accuracy on CIFAR10.

Abstract

The computational complexity of neural networks for large scale or real-time applications necessitates hardware acceleration. Most approaches assume that the network architecture and parameters are unknown at design time, permitting usage in a large number of applications. This paper demonstrates, for the case where the neural network architecture and ternary weight values are known a priori, that extremely high throughput implementations of neural network inference can be made by customising the datapath and routing to remove unnecessary computations and data movement. This approach is ideally suited to FPGA implementations as a specialized implementation of a trained network improves efficiency while still retaining generality with the reconfigurability of an FPGA. A VGG style network with ternary weights and fixed point activations is implemented for the CIFAR10 dataset on Amazon's…
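The idea of removing unnecessary computation when ternary weights are known in advance can be illustrated with a small software sketch. This is not the authors' hardware flow; it is a minimal Python analogy, and the names `unroll_ternary_dot` and `dense_dot` are hypothetical. Given one fixed ternary weight vector, it builds a specialised dot product that skips zero weights entirely and replaces multiplications with additions and subtractions.

```python
def unroll_ternary_dot(weights):
    """Build a dot product specialised to one fixed ternary weight vector.

    Hypothetical helper for illustration only. Zero weights are dropped when
    the function is built, and +1/-1 weights become plain adds/subtracts.
    """
    if any(w not in (-1, 0, 1) for w in weights):
        raise ValueError("weights must be ternary (-1, 0, +1)")

    # Index lists are fixed at "compile time", before any input is seen.
    plus = [i for i, w in enumerate(weights) if w == 1]
    minus = [i for i, w in enumerate(weights) if w == -1]

    def specialised(x):
        # No multiplications and no work for zero weights.
        return sum(x[i] for i in plus) - sum(x[i] for i in minus)

    # Two-input add/subtract operations needed to combine the nonzero terms
    # (a possible leading negation is not counted).
    add_sub_ops = max(len(plus) + len(minus) - 1, 0)
    return specialised, add_sub_ops


def dense_dot(weights, x):
    """Generic reference: one multiply per weight, zeros included."""
    return sum(w * v for w, v in zip(weights, x))


# Example values are made up for illustration.
weights = [1, 0, -1, 0, 0, 1, 0, 0]
x = [3, 5, 2, 7, 1, 4, 9, 6]

dot, ops = unroll_ternary_dot(weights)
assert dot(x) == dense_dot(weights, x)
print(dot(x), ops)
```

In this toy case the generic version performs eight multiplications and seven additions, while the specialised version needs two add/subtract operations. The sketch shows only the effect of sparsity and multiplier-free arithmetic, not any further compile-time optimization the paper may apply.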

Taxonomy

Methods: Dropout · Dense Connections · Max Pooling · Softmax · Convolution