A MAC-less Neural Inference Processor Supporting Compressed, Variable Precision Weights

Vincenzo Liguori

arXiv:2012.06018·cs.CV·December 14, 2020

TL;DR

This paper presents two CNN inference architectures that exploit weight sparsity and compression to reduce computational complexity and bandwidth. The second replaces MACs with smaller BLMACs, which also allows variable precision weights.

Contribution

It introduces a MAC-less architecture that exploits weight sparsity at the bit level, replacing resource-intensive MACs with much smaller Bit Layer Multiply Accumulators (BLMACs) that support compressed, variable precision weights.

Findings

01 Exploiting weight sparsity and compression reduces computational complexity and bandwidth.

02 Implementation results are reported for a pathfinder design in various technologies.

03 BLMACs are much smaller than MACs and support variable precision weights, as variable size integers and even floating point.

Abstract

This paper introduces two architectures for the inference of convolutional neural networks (CNNs). Both architectures exploit weight sparsity and compression to reduce computational complexity and bandwidth. The first architecture uses multiply-accumulators (MACs) but avoids unnecessary multiplications by skipping zero weights. The second architecture exploits sparsity at the level of the weights' bit representation by substituting resource-intensive MACs with much smaller Bit Layer Multiply Accumulators (BLMACs). The use of BLMACs also allows variable precision weights, represented as variable size integers or even floating-point values. Some details of an implementation of the second architecture are given. Weight compression with arithmetic coding is also discussed, along with its bandwidth implications. Finally, some implementation results for a pathfinder design in various technologies are presented.
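The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's hardware design. The function names, the sign-magnitude treatment of weights, and the most-significant-bit-first layer order are assumptions made for illustration. The first function skips zero weights in an ordinary MAC loop. The second computes the same dot product one weight bit layer at a time, using only additions, subtractions, and a doubling of the accumulator, so the work scales with the number of non-zero weight bits rather than the number of weights.

```python
def mac_dot_skip_zeros(weights, inputs):
    """Dot product with ordinary multiply-accumulates, skipping zero weights.

    Returns (result, number_of_multiplications_performed).
    """
    acc = 0
    mults = 0
    for w, x in zip(weights, inputs):
        if w == 0:
            continue  # a zero weight contributes nothing, so skip the multiply
        acc += w * x
        mults += 1
    return acc, mults


def bit_layer_dot(weights, inputs, n_bits):
    """Dot product computed one weight bit layer at a time, without multiplies.

    Assumption (illustrative): integer weights are handled as sign plus
    magnitude, with magnitudes that fit in n_bits, and layers are processed
    from the most significant bit down.

    Returns (result, number_of_add_or_subtract_operations).
    """
    acc = 0
    adds = 0
    for bit in range(n_bits - 1, -1, -1):
        acc <<= 1  # doubling the accumulator moves to the next lower bit layer
        for w, x in zip(weights, inputs):
            if (abs(w) >> bit) & 1:
                # a non-zero bit in this layer: add or subtract the input
                acc += x if w > 0 else -x
                adds += 1
    return acc, adds


if __name__ == "__main__":
    weights = [3, 0, -5, 8]   # hypothetical example values
    inputs = [2, 7, 1, -4]
    print(mac_dot_skip_zeros(weights, inputs))
    print(bit_layer_dot(weights, inputs, n_bits=4))
```

Both functions return the same dot product. In the bit layer version, raising `n_bits` only adds layers in which no weight has a non-zero bit, which hints at how weights of different precision can share the same accumulator.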

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Human Pose and Action Recognition