Echo: Compiler-based GPU Memory Footprint Reduction for LSTM RNN Training

Bojian Zheng; Abhishek Tiwari; Nandita Vijaykumar; Gennady Pekhimenko

arXiv:1805.08899·cs.LG·December 2, 2019·6 cites

TL;DR

Echo is a compiler-based optimization that reduces GPU memory usage during LSTM RNN training by recomputing selected feature maps instead of keeping them in GPU memory, enabling larger models and faster training without source code changes.

Contribution

It introduces a novel compiler scheme that accurately estimates and manages recomputation overhead to effectively reduce memory footprint during training.

Findings

01 Achieves an average memory reduction of 1.89X

02 Maximum reduction of 3.13X in experiments

03 Enables larger batch sizes and energy savings

Abstract

Long Short-Term Memory Recurrent Neural Networks (LSTM RNNs) are a popular class of machine learning models for analyzing sequential data. Their training on modern GPUs, however, is limited by the GPU memory capacity. Our profiling results of the LSTM RNN-based Neural Machine Translation (NMT) model reveal that feature maps of the attention and RNN layers form the memory bottleneck, and that runtime is unevenly distributed across different layers when training on GPUs. Based on these two observations, we propose to recompute the feature maps rather than stashing them persistently in the GPU memory. While the idea of feature map recomputation has been considered before, existing solutions fail to deliver satisfactory footprint reduction, as they do not address two key challenges. For each feature map recomputation to be effective and efficient, its effect on (1) the total memory…
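The general idea of recomputing feature maps instead of stashing them can be illustrated with a small sketch. This is not Echo's compiler pass: it is a generic segment-based recomputation on a toy chain of scalar tanh layers, and the function names, the `segment` parameter, and the "number of stored values" metric are all assumptions made for illustration.

```python
import math


def forward_stash(x, weights):
    """Run h_i = tanh(w_i * h_{i-1}) and keep every feature map."""
    acts = [x]
    for w in weights:
        acts.append(math.tanh(w * acts[-1]))
    return acts


def grad_stash(x, weights):
    """d(output)/d(x) with all feature maps stashed during the forward pass."""
    acts = forward_stash(x, weights)
    g = 1.0
    for w, out in zip(reversed(weights), reversed(acts[1:])):
        g *= (1.0 - out * out) * w  # derivative of tanh(w * h) w.r.t. h
    return g, len(acts)  # gradient, number of values held in memory


def grad_recompute(x, weights, segment):
    """Same gradient, but only segment inputs are kept (hypothetical scheme).

    During the backward pass each segment is re-run from its saved input,
    trading extra compute for a smaller number of live values.
    """
    checkpoints = []
    h = x
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints.append(h)  # keep only the input of each segment
        h = math.tanh(w * h)

    peak = len(checkpoints)
    g = 1.0
    for s in reversed(range(len(checkpoints))):
        seg_w = weights[s * segment:(s + 1) * segment]
        acts = forward_stash(checkpoints[s], seg_w)  # recompute this segment
        # acts[0] is the checkpoint itself, so it is not counted twice
        peak = max(peak, len(checkpoints) + len(acts) - 1)
        for w, out in zip(reversed(seg_w), reversed(acts[1:])):
            g *= (1.0 - out * out) * w
    return g, peak


if __name__ == "__main__":
    weights = [0.5 + 0.1 * i for i in range(16)]
    g_a, n_a = grad_stash(0.3, weights)
    g_b, n_b = grad_recompute(0.3, weights, segment=4)
    print("gradients match:", math.isclose(g_a, g_b, rel_tol=1e-12))
    print("values held (stash vs. recompute):", n_a, n_b)
```

Both functions return the same gradient; the recomputing variant holds fewer values at its peak but runs each segment's forward pass a second time. Deciding which feature maps are worth that extra runtime is the kind of trade-off the abstract refers to.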

Taxonomy

Topics: Advanced Neural Network Applications · Topic Modeling · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory