Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation

Jiaqi Gu; Hanqing Zhu; Chenghao Feng; Mingjie Liu; Zixuan Jiang; Ray T. Chen; David Z. Pan

arXiv:2108.11430·cs.LG·September 7, 2021

TL;DR

This paper introduces a multi-level in situ generation framework that reduces memory access cost in neural networks by regenerating parameters on chip, enabling more efficient deployment on resource-limited devices with accuracy comparable to state-of-the-art methods.

Contribution

It presents the first unified approach leveraging bit-level redundancy and intrinsic correlations in DNN kernels to enable on-the-fly high-resolution parameter recovery with minimal hardware overhead.
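The idea of recovering high-resolution parameters on the fly can be illustrated with a minimal sketch. It assumes, purely for illustration, that a weight matrix is stored as two small low-bit factors (a "basis" and a "coefficient" matrix) and rebuilt by multiplying them when needed. The function names, the factorized form, the quantizer, and the sizes used here are all hypothetical and are not taken from the paper or its repository.

```python
# Illustrative sketch only: every name and number below is an assumption,
# not the authors' implementation.

def quantize(matrix, bits):
    """Uniform symmetric quantization of a 2-D list to signed `bits`-bit codes."""
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for row in matrix for v in row) or 1.0
    scale = max_abs / levels
    codes = [[round(v / scale) for v in row] for row in matrix]
    return codes, scale


def generate_weights(basis_codes, basis_scale, coeff_codes, coeff_scale):
    """Rebuild the full matrix on the fly as (basis @ coeff) from stored codes."""
    rank = len(coeff_codes)
    cols = len(coeff_codes[0])
    scale = basis_scale * coeff_scale
    return [
        [scale * sum(b_row[k] * coeff_codes[k][j] for k in range(rank))
         for j in range(cols)]
        for b_row in basis_codes
    ]


def storage_ratio(rows, cols, rank, full_bits, stored_bits):
    """Bits needed for the dense matrix divided by bits for the two factors."""
    full = rows * cols * full_bits
    stored = (rows * rank + rank * cols) * stored_bits
    return full / stored


# Hypothetical sizes: a 64x64 layer kept as rank-8 factors at 8 bits each.
ratio = storage_ratio(rows=64, cols=64, rank=8, full_bits=32, stored_bits=8)
```

The sizes in the last line are chosen arbitrarily to show how the storage saving is computed; they are not a reproduction of the reported results. The design point is that only the small factors cross the memory interface, while the multiplication that rebuilds the weights runs on the compute cores.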

Findings

01 Boosts memory efficiency by 10-20x

02 Achieves comparable accuracy to state-of-the-art methods

03 Demonstrates effectiveness on multiple neural network architectures

Abstract

Deep neural networks (DNNs) have shown superior performance in a variety of tasks. As they rapidly evolve, their escalating computation and memory demands make it challenging to deploy them on resource-constrained edge devices. Although many efficient accelerator designs, from traditional electronics to emerging photonics, have been successfully demonstrated, they are still bottlenecked by expensive memory accesses due to the large gaps between the bandwidth, power, and latency of electrical memory and those of the computing cores. Previous solutions fail to fully leverage the ultra-fast computational speed of emerging DNN accelerators to break through this critical memory bound. In this work, we propose a general and unified framework that trades expensive memory transactions for ultra-fast on-chip computations, directly translating to performance improvement. We are the first to jointly explore the…

Code & Models

Repositories

JeremieMelo/Memory-Efficient-Multi-Level-Generation (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Reservoir Computing · Advanced Memory and Neural Computing