Extending TensorFlow's Semantics with Pipelined Execution
Sam Whitlock, James Larus, Edouard Bugnion

TL;DR
This paper introduces Pipelined TensorFlow (PTF), an extension to TensorFlow that enables pipelined execution of applications with concurrent data processing, improving throughput while maintaining compatibility with existing TensorFlow functions.
Contribution
PTF extends TensorFlow's semantics to support pipelined, concurrent data processing by partitioning dataflow graphs and adding metadata, without modifying the core runtime.
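The idea of tagging data with metadata so that concurrent requests can share one pipeline can be sketched in plain Python. This is only an illustration under stated assumptions, not PTF's implementation or API: the names `stage` and `run_pipeline`, the use of threads and queues, and the `(request_id, item)` tuple are all hypothetical stand-ins for the graph partitions and metadata the summary describes.

```python
import queue
import threading
from collections import defaultdict

_STOP = object()  # sentinel that tells a stage to shut down


def stage(fn, inbox, outbox):
    """Start one pipeline stage (hypothetical helper, not a PTF API).

    The stage applies fn to each item and forwards the result with the
    request tag unchanged, so later stages can tell requests apart.
    """
    def worker():
        while True:
            msg = inbox.get()
            if msg is _STOP:
                outbox.put(_STOP)  # propagate shutdown downstream
                return
            request_id, item = msg
            outbox.put((request_id, fn(item)))

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t


def run_pipeline(requests, fns):
    """Push finite batches from several requests through shared stages.

    requests: dict mapping a request id to a finite list of items.
    fns: list of one-argument functions, one per stage.
    """
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [stage(fn, queues[i], queues[i + 1]) for i, fn in enumerate(fns)]

    # Interleave items from all requests to mimic concurrent submission.
    pending = {rid: list(items) for rid, items in requests.items()}
    while any(pending.values()):
        for rid, items in pending.items():
            if items:
                queues[0].put((rid, items.pop(0)))
    queues[0].put(_STOP)

    # Use the metadata tag to route each output back to its request.
    results = defaultdict(list)
    while True:
        msg = queues[-1].get()
        if msg is _STOP:
            break
        rid, value = msg
        results[rid].append(value)

    for t in threads:
        t.join()
    return dict(results)
```

With one worker per stage and FIFO queues, each request's items come out in the order they went in, even though items from different requests are interleaved inside the stages.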
Findings
Increases bioinformatics application throughput by 4x
Increases latency by only 0.13x
Achieves 321 MB/sec genome processing rate
Abstract
TensorFlow is a popular cloud computing framework that targets machine learning applications. It separates the specification of application logic (in a dataflow graph) from the execution of the logic. TensorFlow's native runtime executes the application with low overhead across a diverse set of hardware including CPUs, GPUs, and ASICs. Although the underlying dataflow engine supporting these features could be applied to computations beyond machine learning, certain design decisions limit this broader application, such as the inability for an application to differentiate between data items across concurrent requests. This paper introduces Pipelined TensorFlow (PTF), a system that extends TensorFlow's semantics to provide support for a broader variety of application logic. In particular, PTF supports applications that concurrently process finite batches of data on a single…
Taxonomy
Topics: Scientific Computing and Data Management · Cloud Computing and Resource Management · Advanced Data Storage Technologies
