Improving Memory Utilization in Convolutional Neural Network Accelerators

Petar Jokic; Stephane Emery; Luca Benini

arXiv:2007.09963·eess.IV·April 7, 2021

TL;DR

This paper introduces a memory mapping technique for CNN accelerators that lets the memory regions of subsequent activation layers overlap, reducing activation memory by up to 32.9% and allowing larger networks to fit on hardware with limited on-chip memory.

Contribution

The paper presents a mathematical model for computing the maximum overlap between activation memory regions, which improves memory utilization over traditional ping-pong buffering, and validates the approach with experiments on real-world networks and an FPGA implementation.

Findings

01

Memory reduction of up to 32.9% for activations.

02

Overall network memory savings of up to 23.9%.

03

Activation memory savings of 48.8% for high-resolution networks.

Abstract

While convolutional neural networks have become far more accurate through larger and deeper architectures, the memory footprint for storing their parameters and activations has grown as well. This trend especially challenges power- and resource-limited accelerator designs, which are often restricted to storing all network data in on-chip memory to avoid interfacing with energy-hungry external memories. Maximizing the network size that fits on a given accelerator thus requires maximizing its memory utilization. While the traditional ping-pong buffering technique maps subsequent activation layers to disjoint memory regions, we propose a mapping method that allows these regions to overlap and thus uses the memory more efficiently. This work presents the mathematical model to compute the maximum activations memory overlap and thus the lower…
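
To make the contrast between disjoint and overlapping activation regions concrete, the following Python sketch compares the two for a single 1-D convolution layer. It is not the paper's mathematical model, which is not reproduced in this excerpt. It is a simplified illustration under stated assumptions: activations are stored pixel by pixel with channels interleaved, outputs are produced in order, each output pixel is fully computed before it is written, and the output region starts at address 0 while the input region is shifted up by an offset. The function names `pingpong_words` and `overlapped_words` are hypothetical.

```python
def output_length(n, k, stride):
    """Number of output pixels of an unpadded 1-D convolution."""
    return (n - k) // stride + 1


def pingpong_words(n, k, stride, c_in, c_out):
    """Memory words when input and output live in disjoint regions."""
    m = output_length(n, k, stride)
    return n * c_in + m * c_out


def overlapped_words(n, k, stride, c_in, c_out):
    """Memory words when the output may overwrite consumed inputs.

    Assumed layout (illustrative, not taken from the paper):
    output pixel j occupies [j * c_out, (j + 1) * c_out), and
    input pixel i occupies [offset + i * c_in, offset + (i + 1) * c_in).
    """
    m = output_length(n, k, stride)
    offset = 0
    # After output pixel j is written, input pixels from (j + 1) * stride
    # onward are still needed, so the write must end before they begin.
    for j in range(m - 1):
        out_end = (j + 1) * c_out
        next_needed_in = (j + 1) * stride * c_in
        offset = max(offset, out_end - next_needed_in)
    return max(offset + n * c_in, m * c_out)


if __name__ == "__main__":
    cfg = dict(n=8, k=3, stride=1, c_in=2, c_out=4)
    print("ping-pong :", pingpong_words(**cfg))
    print("overlapped:", overlapped_words(**cfg))
```

In this toy model, a layer whose output has no more words per pixel than its input needs no offset at all, so the shared buffer is only as large as the larger of the two activation maps. A layer that widens the channel count needs a nonzero offset but can still use less memory than two disjoint regions.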
