Memory-Efficient Backpropagation Through Time

Audrūnas Gruslys; Remi Munos; Ivo Danihelka; Marc Lanctot; Alex Graves

arXiv:1606.03401 · cs.NE · June 13, 2016 · 51 cites

TL;DR

This paper introduces a memory-efficient algorithm for backpropagation through time in RNNs that balances caching and recomputation, significantly reducing memory use while maintaining computational efficiency, especially for long sequences.

Contribution

A novel dynamic programming-based method that optimally manages memory and computation trade-offs in BPTT, adaptable to various memory constraints.

Findings

01 Reduces memory usage by 95% for sequences of length 1000

02 Uses only one third more time per iteration than standard BPTT at that sequence length

03 Particularly effective for training RNNs on long sequences

Abstract

We propose a novel approach to reduce the memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance the trade-off between caching intermediate results and recomputing them. The algorithm can fit tightly within almost any user-set memory budget while finding an optimal execution policy that minimizes computational cost. Computational devices have limited memory capacity, and maximizing computational performance under a fixed memory budget is a practical use case. We provide asymptotic computational upper bounds for various regimes. The algorithm is particularly effective for long sequences. For sequences of length 1000, our algorithm saves 95% of memory usage while using only one third more time per iteration than standard BPTT.
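The caching-versus-recomputation trade-off described in the abstract can be illustrated with a small dynamic program. The following is a minimal sketch under a simplified cost model, not the paper's exact formulation: cost is counted in forward steps only, each memory slot holds one hidden state, the starting state of a segment is assumed to be available already, and a slot is freed as soon as the segment to its right has been backpropagated. The names `bptt_cost_table`, `cost`, and `split` are hypothetical and chosen for illustration.

```python
def bptt_cost_table(max_t, max_m):
    """Toy dynamic program for the caching/recomputation trade-off.

    cost[m][t]  = forward steps needed to backpropagate through a
                  segment of t steps using m memory slots (toy model).
    split[m][t] = how many steps to run forward before storing the
                  next hidden state (0 when no split is chosen).

    Assumptions of this sketch (not taken from the paper):
      * one slot: recompute from the segment start for every step,
        costing t + (t - 1) + ... + 1 = t * (t + 1) / 2 forward steps;
      * a segment of length 1 costs a single forward step.
    """
    cost = [[0] * (max_t + 1) for _ in range(max_m + 1)]
    split = [[0] * (max_t + 1) for _ in range(max_m + 1)]

    for m in range(1, max_m + 1):
        for t in range(1, max_t + 1):
            if t == 1:
                cost[m][t] = 1
            elif m == 1:
                cost[m][t] = t * (t + 1) // 2
            else:
                # Run y steps forward, store the state in one slot,
                # handle the right part (t - y steps, m - 1 slots),
                # free the slot, then handle the left part (y steps, m slots).
                best_cost, best_y = min(
                    (y + cost[m - 1][t - y] + cost[m][y], y)
                    for y in range(1, t)
                )
                cost[m][t] = best_cost
                split[m][t] = best_y
    return cost, split


if __name__ == "__main__":
    cost, split = bptt_cost_table(max_t=30, max_m=4)
    for m in range(1, 5):
        print(f"slots={m}: forward steps for t=30 -> {cost[m][30]}")
```

In this toy model, a 3-step segment costs 6 forward steps with one slot and 5 with two, and the cost never increases as slots are added. The table takes time proportional to `max_m * max_t**2` to build, so it is meant for small illustrative sizes.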

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Machine Learning and Algorithms