TFLMS: Large Model Support in TensorFlow by Graph Rewriting

Tung D. Le; Haruki Imai; Yasushi Negishi; Kiyokuni Kawachiya

arXiv:1807.02037 · cs.LG · October 3, 2019 · 28 cites

TL;DR

This paper introduces TFLMS, a TensorFlow module that rewrites a network's computational graph to insert swap-out and swap-in operations. Intermediate results are temporarily stored in CPU memory, so networks that would otherwise exceed accelerator memory can be trained.

Contribution

The paper presents a formal graph rewriting method and a TensorFlow module, TFLMS, to support large model training through memory swapping without sacrificing accuracy.

Findings

01. Enabled training of larger models with increased batch sizes
02. Achieved 4.7x larger batch size for ResNet-50
03. Allowed training of 3DUnet on large images without splitting

Abstract

While accelerators such as GPUs have limited memory, deep neural networks are becoming larger and no longer fit within that memory during training. We propose an approach to tackle this problem by rewriting the computational graph of a neural network, in which swap-out and swap-in operations are inserted to temporarily store intermediate results in CPU memory. In particular, we first revise the concept of a computational graph by defining a concrete semantics for variables in a graph. We then formally show how to derive swap-out and swap-in operations from an existing graph and present rules to optimize the graph. To realize our approach, we developed a module in TensorFlow, named TFLMS. TFLMS is published as a pull request in the TensorFlow repository for contributing to the TensorFlow community. With TFLMS, we were able to train ResNet-50 and 3DUnet with 4.7x and…
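
To make the idea of inserting swap-out and swap-in operations concrete, here is a minimal sketch on a toy graph in plain Python. It is not the TFLMS implementation or its API, and it does not reproduce the paper's formal rules. The graph encoding, the names `insert_swaps` and `evaluate`, and the distance-based `threshold` criterion are all assumptions made for illustration. Swap operations are modeled as identity copies, which stands in for a device-to-host and host-to-device transfer.

```python
def topological_order(graph):
    """Return op names in an order where every op follows its inputs."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for src in graph[name]:
            visit(src)
        order.append(name)

    for name in graph:
        visit(name)
    return order


def insert_swaps(graph, threshold):
    """Return a rewritten copy of `graph` with swap ops on long-lived edges.

    `graph` maps each op name to the list of op names it consumes.
    Assumed criterion (hypothetical): an edge u -> v becomes
    u -> swap_out -> swap_in -> v when v runs more than `threshold`
    steps after u in the topological order.
    """
    position = {name: i for i, name in enumerate(topological_order(graph))}
    rewritten = {name: list(inputs) for name, inputs in graph.items()}
    for consumer, inputs in graph.items():
        for idx, producer in enumerate(inputs):
            if position[consumer] - position[producer] > threshold:
                out_name = f"swap_out/{producer}"
                in_name = f"swap_in/{producer}->{consumer}"
                rewritten.setdefault(out_name, [producer])  # one swap-out per tensor
                rewritten[in_name] = [out_name]             # one swap-in per consumer
                rewritten[consumer][idx] = in_name
    return rewritten


def evaluate(graph, kernels):
    """Run each op once in topological order and return all op values."""
    values = {}
    for name in topological_order(graph):
        args = [values[src] for src in graph[name]]
        if name.startswith("swap_"):
            values[name] = args[0]  # identity copy, stands in for a memory transfer
        else:
            values[name] = kernels[name](*args)
    return values


# Toy forward/backward chain: backward ops reuse early forward results.
graph = {
    "x": [],
    "f1": ["x"],
    "f2": ["f1"],
    "f3": ["f2"],
    "b3": ["f3", "f2"],
    "b2": ["b3", "f1"],
    "b1": ["b2", "x"],
}
kernels = {
    "x": lambda: 2.0,
    "f1": lambda a: a * 3,
    "f2": lambda a: a + 1,
    "f3": lambda a: a * a,
    "b3": lambda a, b: a - b,
    "b2": lambda a, b: a + b,
    "b1": lambda a, b: a * b,
}

rewritten = insert_swaps(graph, threshold=2)
for name in topological_order(rewritten):
    print(f"{name} <- {rewritten[name]}")

# The rewrite changes where tensors live, not what is computed.
assert evaluate(rewritten, kernels)["b1"] == evaluate(graph, kernels)["b1"]
```

In this sketch only the two edges that span many steps (`x` to `b1` and `f1` to `b2`) are rerouted through swap operations, while short-lived edges are left untouched. Because the swap ops are pure copies, the rewritten graph produces the same values as the original.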

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Graph Theory and Algorithms