ACRoBat: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time

Pratik Fegade; Tianqi Chen; Phillip B. Gibbons; Todd C. Mowry

arXiv:2305.10611 · cs.LG · May 20, 2024 · 1 cites

Open Access

TL;DR

ACRoBat is a compiler framework that significantly improves automatic batching efficiency for dynamic deep learning models with control flow, achieving up to 8.5 times better performance than existing solutions.

Contribution

It introduces a hybrid static-dynamic compilation approach for automatic batching in dynamic deep learning, enhancing performance and hardware utilization.

Findings

01 ACRoBat outperforms DyNet by up to 8.5X on GPU.

02 It effectively handles dynamic control flow in deep learning models.

03 The framework enables high-throughput, efficient tensor code generation.

Abstract

Dynamic control flow is an important technique often used to design expressive and efficient deep learning computations for applications such as text parsing, machine translation, exiting early out of deep models and so on. The control flow divergence resulting from dynamic control flow makes batching, an important optimization enabling high throughput and hardware utilization, difficult to perform manually. In this paper, we present ACRoBat, a framework that enables efficient automatic batching for dynamic deep learning computations by performing hybrid static+dynamic compiler optimizations and end-to-end tensor code generation. ACRoBat performs up to 8.5X better than DyNet, a state-of-the-art framework for automatic batching, on an Nvidia GeForce GPU.
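To make the batching problem concrete, the sketch below shows why control flow divergence complicates batching and how an automatic batcher can recover it. It is not ACRoBat's algorithm: it is a generic depth-based grouping heuristic, run entirely at runtime, over toy binary trees. The names `leaf_op`, `combine_op`, `flatten`, and `batched_eval` are hypothetical, and integer arithmetic on Python lists stands in for real tensor kernels.

```python
from collections import defaultdict

# Stand-ins for batched tensor kernels (assumed for illustration):
# each call processes a whole list of inputs at once.
def leaf_op(xs):
    return [2 * x for x in xs]

def combine_op(pairs):
    return [left + right for left, right in pairs]

def flatten(tree, nodes):
    """Append the nodes of `tree` to `nodes` in post-order.

    A tree is either an int (leaf) or a (left, right) tuple.
    Returns (node_id, depth), where leaves have depth 0.
    """
    if not isinstance(tree, tuple):
        nodes.append(("leaf", tree, 0))
        return len(nodes) - 1, 0
    left_id, left_depth = flatten(tree[0], nodes)
    right_id, right_depth = flatten(tree[1], nodes)
    depth = 1 + max(left_depth, right_depth)
    nodes.append(("combine", (left_id, right_id), depth))
    return len(nodes) - 1, depth

def batched_eval(trees):
    """Evaluate differently shaped trees together, batching by (depth, op).

    Returns (root values, number of batched calls, total node count).
    """
    nodes, roots = [], []
    for tree in trees:
        root_id, _ = flatten(tree, nodes)
        roots.append(root_id)

    # Nodes at the same depth with the same operator are independent
    # of each other, so they can share one batched call.
    groups = defaultdict(list)
    for node_id, (kind, _, depth) in enumerate(nodes):
        groups[(depth, kind)].append(node_id)

    values = [None] * len(nodes)
    launches = 0
    for depth, kind in sorted(groups):  # children always sit at a lower depth
        ids = groups[(depth, kind)]
        if kind == "leaf":
            outs = leaf_op([nodes[i][1] for i in ids])
        else:
            outs = combine_op(
                [(values[nodes[i][1][0]], values[nodes[i][1][1]]) for i in ids]
            )
        for i, out in zip(ids, outs):
            values[i] = out
        launches += 1
    return [values[r] for r in roots], launches, len(nodes)

results, launches, total = batched_eval([(1, 2), ((1, 2), 3), 4])
print(results, launches, total)
```

For the three sample trees, the 9 node-level operations are served by 3 batched calls. A runtime-only scheme like this one pays the grouping cost on every input; the abstract's hybrid static+dynamic optimizations are aimed at that kind of overhead, though the details are in the paper and are not reproduced here.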

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Multimodal Machine Learning Applications