Auto-Vectorizing TensorFlow Graphs: Jacobians, Auto-Batching And Beyond

Ashish Agarwal; Igor Ganichev

arXiv:1903.04243·cs.DC·March 12, 2019·6 cites

Open Access

TL;DR

This paper introduces a static loop vectorization technique for TensorFlow that enables efficient auto-batching, Jacobian computation, and input pipeline optimization, with reported speedups over both loop-based implementations and DyNet-style run-time batching.

Contribution

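The paper presents a static loop vectorization optimization on a high-level dataflow IR, together with a statically vectorized parallel-for abstraction for TensorFlow, and applies both to auto-batching, per-example gradients, Jacobian computation, optimized map functions, and input pipeline optimization.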

Findings

01 Significant speedups over loop-based implementations.

02 Improved auto-batching and Jacobian computation efficiency.

03 Enhanced input pipeline performance.

Abstract

We propose a static loop vectorization optimization on top of the high-level dataflow IR used by frameworks like TensorFlow. A new statically vectorized parallel-for abstraction is provided on top of TensorFlow and used for applications ranging from auto-batching and per-example gradients to Jacobian computation, optimized map functions, and input pipeline optimization. We report huge speedups compared to both loop-based implementations and the run-time batching adopted by the DyNet framework.
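To make the idea of static loop vectorization concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation and uses no TensorFlow API: the expression IR (`Const`, `Gather`, `BinOp`) and the helpers `loop_for`, `vectorize`, and `pfor` are all hypothetical names invented for illustration. The sketch assumes a loop body whose iterations are independent, and it rewrites that body once, ahead of execution, so that loop-varying values become stacked values and loop-invariant values are computed a single time.

```python
from dataclasses import dataclass
from typing import Union


# --- Hypothetical toy IR for a loop body (illustration only) ---
@dataclass(frozen=True)
class Const:
    value: float          # loop-invariant scalar


@dataclass(frozen=True)
class Gather:
    data: tuple           # reads data[i], where i is the loop index


@dataclass(frozen=True)
class BinOp:
    op: str               # "add" or "mul"
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Gather, BinOp]

_SCALAR_OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def eval_at(expr, i):
    """Evaluate the loop body for a single iteration i."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Gather):
        return expr.data[i]
    return _SCALAR_OPS[expr.op](eval_at(expr.lhs, i), eval_at(expr.rhs, i))


def loop_for(expr, n):
    """Loop-based baseline: re-evaluates the whole body n times."""
    return [eval_at(expr, i) for i in range(n)]


def vectorize(expr, n):
    """Rewrite the body once; returns (value, is_stacked).

    A stacked value is a list holding one entry per iteration.
    An unstacked value is loop-invariant and computed only once.
    """
    if isinstance(expr, Const):
        return expr.value, False
    if isinstance(expr, Gather):
        # Gathering data[i] for every i is just the first n entries.
        return list(expr.data[:n]), True
    lhs, lhs_stacked = vectorize(expr.lhs, n)
    rhs, rhs_stacked = vectorize(expr.rhs, n)
    f = _SCALAR_OPS[expr.op]
    if not lhs_stacked and not rhs_stacked:
        return f(lhs, rhs), False          # stays loop-invariant
    if not lhs_stacked:
        lhs = [lhs] * n                    # broadcast the invariant side
    if not rhs_stacked:
        rhs = [rhs] * n
    # Stands in for a single batched kernel in a real framework.
    return [f(a, b) for a, b in zip(lhs, rhs)], True


def pfor(expr, n):
    """Toy parallel-for: one vectorized evaluation instead of n iterations."""
    value, stacked = vectorize(expr, n)
    return value if stacked else [value] * n


# Loop body: data[i] * 2 + (3 * 4)
body = BinOp("add",
             BinOp("mul", Gather((1.0, 2.0, 3.0)), Const(2.0)),
             BinOp("mul", Const(3.0), Const(4.0)))

print(loop_for(body, 3))  # [14.0, 16.0, 18.0]
print(pfor(body, 3))      # [14.0, 16.0, 18.0]
```

Both calls produce the same result, but `pfor` visits each node of the body once rather than once per iteration, and the invariant subexpression `3 * 4` is never stacked. In a real dataflow framework, each stacked operation would map to a batched kernel, which is where speedups over loop-based execution would be expected to come from.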

Taxonomy

Topics: Scientific Computing and Data Management · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems