Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators
Syed M. A. H. Jafri, Hasan Hassan, Ahmed Hemani, Onur Mutlu

TL;DR
This paper introduces Refresh Triggered Computation (RTC), a mechanism that exploits the predictable memory access patterns of CNN accelerators to reduce the number of DRAM refresh operations, cutting DRAM energy by up to 61.3% with less than 0.2% area overhead.
Contribution
The paper proposes three RTC designs (min-RTC, mid-RTC, and full-RTC) that leverage memory access patterns to reduce DRAM refresh energy. Each design incurs less than 0.2% area overhead, and the approach applies to CNNs as well as to other applications with predictable memory access patterns.
Findings
RTC reduces DRAM energy by up to 61.3% in CNNs.
All RTC designs have less than 0.2% area overhead.
RTC also benefits non-CNN workloads such as Face Recognition and BCPNN (Bayesian Confidence Propagation Neural Network).
Abstract
To employ a Convolutional Neural Network (CNN) in an energy-constrained embedded system, it is critical for the CNN implementation to be highly energy efficient. Many recent studies propose CNN accelerator architectures with custom computation units that try to improve energy-efficiency and performance of CNNs by minimizing data transfers from DRAM-based main memory. However, in these architectures, DRAM is still responsible for half of the overall energy consumption of the system, on average. A key factor of the high energy consumption of DRAM is the refresh overhead, which is estimated to consume 40% of the total DRAM energy. In this paper, we propose a new mechanism, Refresh Triggered Computation (RTC), that exploits the memory access patterns of CNN applications to reduce the number of refresh operations. We propose three RTC designs (min-RTC, mid-RTC, and full-RTC), each of which…
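The two averages quoted in the abstract (DRAM at roughly half of system energy, refresh at roughly 40% of DRAM energy) can be combined into a rough system-level estimate. The sketch below is only back-of-the-envelope arithmetic, not the paper's evaluation methodology: it assumes the two averages can simply be multiplied, and the function names and the `refresh_eliminated` parameter are hypothetical, introduced here for illustration.

```python
def refresh_share_of_system(dram_share: float, refresh_share_of_dram: float) -> float:
    """Fraction of total system energy spent on DRAM refresh.

    Assumption: the two average shares are independent and can be multiplied.
    """
    for value in (dram_share, refresh_share_of_dram):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares must lie in [0, 1]")
    return dram_share * refresh_share_of_dram


def system_saving(dram_share: float, refresh_share_of_dram: float,
                  refresh_eliminated: float) -> float:
    """System-level energy saving if a given fraction of refresh energy is removed.

    `refresh_eliminated` is a hypothetical input, not a figure from the paper.
    """
    if not 0.0 <= refresh_eliminated <= 1.0:
        raise ValueError("refresh_eliminated must lie in [0, 1]")
    return refresh_share_of_system(dram_share, refresh_share_of_dram) * refresh_eliminated


# Averages quoted in the abstract.
DRAM_SHARE = 0.5      # DRAM is about half of system energy
REFRESH_SHARE = 0.4   # refresh is about 40% of DRAM energy

if __name__ == "__main__":
    share = refresh_share_of_system(DRAM_SHARE, REFRESH_SHARE)
    print(f"Refresh share of system energy: {share:.0%}")
```

Under these assumptions, refresh accounts for about one fifth of total system energy, which is why reducing the number of refresh operations is a meaningful target for accelerator-level energy savings.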
