Low Complexity Multiply Accumulate Unit for Weight-Sharing Convolutional Neural Networks

James Garland; David Gregg

arXiv:1609.05132·cs.NE·August 17, 2017


TL;DR

This paper introduces a low-complexity MAC unit for weight-sharing CNNs that replaces multipliers with adders, reducing gate count and power consumption while keeping the same clock speed as a conventional MAC unit.

Contribution

A novel MAC circuit design that leverages weight binning in CNNs to minimize hardware complexity and power use.

Findings

01

Fewer gates and smaller logic compared to traditional MAC units.

02

Reduced power consumption in the proposed MAC design.

03

Maintains the same clock speed as a conventional MAC unit.

Abstract

Convolutional Neural Networks (CNNs) are one of the most successful deep machine learning technologies for processing image, voice and video data. CNNs require large amounts of processing capacity and memory, which can exceed the resources of low power mobile and embedded systems. Several designs for hardware accelerators have been proposed for CNNs which typically contain large numbers of Multiply Accumulate (MAC) units. One approach to reducing data sizes and memory traffic in CNN accelerators is "weight sharing", where the full range of values in a trained CNN are put in bins and the bin index is stored instead of the original weight value. In this paper we propose a novel MAC circuit that exploits binning in weight-sharing CNNs. Rather than computing the MAC directly we instead count the frequency of each weight and place it in a bin. We then compute the accumulated value in a…
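
The binning idea in the abstract can be illustrated with a small software sketch. This is not the authors' circuit: the function names, the integer codebook, and the sample values are hypothetical. The abstract as shown here is cut off before it describes the second phase, so the sketch assumes that the per-bin sums are multiplied by the shared weight values and added together after the accumulation pass.

```python
def direct_mac(inputs, bin_indices, codebook):
    """Conventional MAC: one multiply per input value."""
    total = 0
    for x, b in zip(inputs, bin_indices):
        total += x * codebook[b]  # look up the shared weight, then multiply
    return total


def binned_mac(inputs, bin_indices, codebook):
    """Binned MAC sketch (assumed two-phase scheme, names are hypothetical)."""
    # Phase 1: add each input into the bin selected by its weight index.
    # Only additions and a bin select are needed here.
    bins = [0] * len(codebook)
    for x, b in zip(inputs, bin_indices):
        bins[b] += x
    # Phase 2 (assumption): one multiply per bin, not one per input.
    return sum(acc * w for acc, w in zip(bins, codebook))


# Hypothetical example: 4 shared weights, 6 inputs.
codebook = [-2, 1, 3, 5]
inputs = [4, 7, 1, 2, 6, 3]
bin_indices = [0, 2, 2, 1, 3, 0]

print(direct_mac(inputs, bin_indices, codebook))  # 42
print(binned_mac(inputs, bin_indices, codebook))  # 42
```

Both functions return the same result because multiplication distributes over addition: inputs that share a weight can be summed first and multiplied once. In this sketch the number of multiplies drops from one per input to one per bin, which is where a hardware saving would come from when the number of bins is small relative to the number of inputs.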

