DPRed: Making Typical Activation and Weight Values Matter In Deep Learning Computing

Alberto Delmas; Sayeh Sharify; Patrick Judd; Kevin Siu; Milos Nikolic; Andreas Moshovos

arXiv:1804.06732 · cs.NE · December 18, 2018 · 5 cites

Open Access

TL;DR

DPRed introduces a dynamic precision adjustment technique for deep neural networks, reducing memory traffic and improving speed and energy efficiency by tailoring precision at a group level for weights and activations.

Contribution

The paper proposes DPRed, which groups weights and activations and encodes each group at its own precision, selected statically for weights and dynamically by hardware for activations, in order to reduce memory traffic and execution time.

Findings

1. Reduces off-chip traffic to nearly 35% (16b models) and 33% (8b models) of the uncompressed volume on average.

2. Achieves 1.82x to 2.81x speedups on 8-bit networks.

3. Improves energy efficiency through precision-aware execution.

Abstract

We show that selecting a single data type (precision) for all values in Deep Neural Networks, even if that data type is different per layer, amounts to worst case design. Much shorter data types can be used if we target the common case by adjusting the precision at a much finer granularity. We propose Dynamic Precision Reduction (DPRed), where we group weights and activations and encode them using a precision specific to each group. The per group precisions are selected statically for the weights and dynamically by hardware for the activations. We exploit these precisions to reduce: 1) off-chip storage and off- and on-chip communication, and 2) execution time. DPRed compression reduces off-chip traffic to nearly 35% and 33% on average compared to no compression respectively for 16b and 8b models. This makes it possible to sustain higher performance for a given off-chip memory interface…
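The per-group encoding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's hardware scheme or bitstream format. It assumes non-negative integer values (for example, post-ReLU activations), a hypothetical group size of 4, and a hypothetical fixed-width header per group that stores the group's precision. The names `group_precision` and `encoded_bits` are invented for this illustration.

```python
from typing import List


def group_precision(values: List[int]) -> int:
    """Bits needed to represent the largest value in the group (minimum 1)."""
    if any(v < 0 for v in values):
        raise ValueError("this sketch assumes non-negative values")
    return max(1, max(values).bit_length())


def encoded_bits(values: List[int], group_size: int = 4, base_bits: int = 16) -> int:
    """Total bits when each group is stored at its own precision.

    Assumption: each group carries a header wide enough to hold
    (precision - 1) for any precision in 1..base_bits.
    """
    header_bits = (base_bits - 1).bit_length()
    total = 0
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        precision = group_precision(group)
        if precision > base_bits:
            raise ValueError("value does not fit in base_bits")
        total += header_bits + len(group) * precision
    return total


# Made-up data: most values are small, one group holds a large value.
values = [3, 0, 1, 2, 0, 0, 0, 0, 200, 5, 1, 0]
compressed = encoded_bits(values, group_size=4, base_bits=16)
baseline = len(values) * 16  # one fixed 16b data type for every value
print(f"compressed: {compressed} bits, baseline: {baseline} bits, "
      f"ratio: {compressed / baseline:.2f}")
```

In this toy input, only the group containing 200 pays for 8 bits per value; the other two groups need 2 bits and 1 bit per value. A single precision for all values would have to cover the largest value everywhere, which is the worst-case design the abstract argues against.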

Taxonomy

Topics: Advanced Neural Network Applications · Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis