Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators

Syed M. A. H. Jafri; Hasan Hassan; Ahmed Hemani; Onur Mutlu

arXiv:1910.06672·cs.AR·October 8, 2020

TL;DR

This paper introduces Refresh Triggered Computation (RTC), a mechanism that exploits the memory access patterns of CNN applications to reduce the number of DRAM refresh operations in CNN accelerators. It cuts DRAM energy by up to 61.3% with less than 0.2% area overhead.

Contribution

The paper proposes three RTC designs (min-RTC, mid-RTC, and full-RTC) that exploit memory access patterns to reduce DRAM refresh energy. Each has minimal area overhead, and the approach applies to CNNs as well as to other applications with predictable memory access patterns.

Findings

01. RTC reduces DRAM energy by up to 61.3% in CNNs.

02. All RTC designs have less than 0.2% area overhead.

03. RTC also benefits non-CNN workloads like Face Recognition and BCPNN.

Abstract

To employ a Convolutional Neural Network (CNN) in an energy-constrained embedded system, it is critical for the CNN implementation to be highly energy efficient. Many recent studies propose CNN accelerator architectures with custom computation units that try to improve the energy efficiency and performance of CNNs by minimizing data transfers from DRAM-based main memory. However, in these architectures, DRAM is still responsible for half of the overall energy consumption of the system, on average. A key factor in the high energy consumption of DRAM is the refresh overhead, which is estimated to consume 40% of the total DRAM energy. In this paper, we propose a new mechanism, Refresh Triggered Computation (RTC), that exploits the memory access patterns of CNN applications to reduce the number of refresh operations. We propose three RTC designs (min-RTC, mid-RTC, and full-RTC), each of which…
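The two figures quoted in the abstract (DRAM at about half of system energy, refresh at an estimated 40% of DRAM energy) can be combined into a rough system-level share. The sketch below is only that arithmetic. The function name and constants are illustrative and do not come from the paper, and it assumes the two averages can simply be multiplied.

```python
def refresh_share_of_system(dram_share: float, refresh_share_of_dram: float) -> float:
    """Return the fraction of total system energy spent on DRAM refresh.

    Both arguments are fractions in [0, 1]. Multiplying them assumes the two
    averages quoted in the abstract apply to the same system (an assumption).
    """
    for value in (dram_share, refresh_share_of_dram):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares must be fractions between 0 and 1")
    return dram_share * refresh_share_of_dram


# Figures quoted in the abstract (averages and estimates, not measurements here).
DRAM_SHARE = 0.50      # DRAM: "half of the overall energy consumption"
REFRESH_SHARE = 0.40   # refresh: "40% of the total DRAM energy"

share = refresh_share_of_system(DRAM_SHARE, REFRESH_SHARE)
print(f"Refresh accounts for roughly {share:.0%} of system energy")
```

Under these assumptions, refresh alone would account for roughly a fifth of system energy, which is why reducing the number of refresh operations is an attractive target.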
