Extending TensorFlow's Semantics with Pipelined Execution

Sam Whitlock; James Larus; Edouard Bugnion

arXiv:1908.09291·cs.DC·August 27, 2019

TL;DR

This paper introduces Pipelined TensorFlow (PTF), an extension to TensorFlow that enables pipelined execution of applications with concurrent data processing, improving throughput while maintaining compatibility with existing TensorFlow functions.

Contribution

PTF extends TensorFlow's semantics to support pipelined, concurrent data processing by partitioning dataflow graphs and adding metadata, without modifying the core runtime.
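The idea of attaching metadata so that items from concurrent requests stay distinguishable can be illustrated with a minimal sketch in plain Python. This is not PTF's API or implementation: the names `Tagged`, `request_id`, `stage`, and `run_pipeline` are hypothetical, and standard-library threads and queues stand in for the partitioned dataflow graph.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class Tagged:
    request_id: int  # hypothetical metadata identifying the originating request
    value: Any

_STOP = object()  # sentinel that shuts the pipeline down stage by stage

def stage(fn: Callable[[Any], Any], inbox: queue.Queue, outbox: queue.Queue) -> threading.Thread:
    """Run one pipeline stage in its own thread, preserving each item's tag."""
    def run() -> None:
        while True:
            item = inbox.get()
            if item is _STOP:
                outbox.put(_STOP)  # forward shutdown to the next stage
                return
            outbox.put(Tagged(item.request_id, fn(item.value)))
    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread

def run_pipeline(requests: Dict[int, List[Any]], fns: List[Callable[[Any], Any]]) -> Dict[int, List[Any]]:
    """Push every request's items through the same stages, then regroup by tag."""
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [stage(fn, queues[i], queues[i + 1]) for i, fn in enumerate(fns)]
    for request_id, batch in requests.items():
        for value in batch:
            queues[0].put(Tagged(request_id, value))
    queues[0].put(_STOP)
    results: Dict[int, List[Any]] = {request_id: [] for request_id in requests}
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results[item.request_id].append(item.value)  # the tag routes the result
    for thread in threads:
        thread.join()
    return results
```

Calling `run_pipeline({1: [1, 2, 3], 2: [10, 20]}, [lambda x: x + 1, lambda x: x * 2])` sends both requests through the same two stages, and the tag on each item is what lets the results be separated per request at the end.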

Findings

01 Increases bioinformatics application throughput by 4x

02 Maintains a low latency increase of 0.13x

03 Achieves a genome processing rate of 321 MB/sec

Abstract

TensorFlow is a popular cloud computing framework that targets machine learning applications. It separates the specification of application logic (in a dataflow graph) from the execution of the logic. TensorFlow's native runtime executes the application with low overhead across a diverse set of hardware including CPUs, GPUs, and ASICs. Although the underlying dataflow engine supporting these features could be applied to computations beyond machine learning, certain design decisions limit this broader application, such as the inability for an application to differentiate between data items across concurrent requests. This paper introduces Pipelined TensorFlow (PTF), a system that extends TensorFlow's semantics to provide support for a broader variety of application logic. In particular, PTF supports applications that concurrently process finite batches of data on a single…

Taxonomy

Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management · Advanced Data Storage Technologies