Learning to Optimize Tensor Programs
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy

TL;DR
This paper presents a learning-based framework that automatically optimizes tensor programs for deep learning, reducing reliance on hardware-specific libraries and enabling efficient deployment across diverse hardware platforms.
Contribution
The authors develop a domain-specific learning approach to optimize tensor operators, improving performance portability and reducing engineering effort compared to traditional hand-tuned libraries.
Findings
Achieves performance comparable to state-of-the-art libraries on various hardware.
Reduces engineering costs by removing dependency on hardware-specific libraries.
Enables effective tensor program optimization across CPUs and GPUs.
Abstract
We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learning systems. However, existing systems rely on manually optimized libraries such as cuDNN where only a narrow range of server class GPUs are well-supported. The reliance on hardware-specific operator libraries limits the applicability of high-level graph optimizations and incurs significant engineering costs when deploying to new hardware targets. We use learning to remove this engineering burden. We learn domain-specific statistical cost models to guide the search of tensor operator implementations over billions of possible program variants. We further accelerate the search by effective model transfer across workloads. Experimental results…
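The loop the abstract describes (fit a statistical cost model on measured program variants, then use it to decide which variants to measure next) can be illustrated with a small sketch. This is not the authors' implementation: the search space (`TILES`, `UNROLLS`), the synthetic `measure` function that stands in for running code on hardware, and the nearest-neighbour `predict` model are all assumptions made up for illustration, and a real space would hold billions of variants rather than a few hundred.

```python
import itertools
import math
import random

# Hypothetical search space: two tile sizes and an unroll factor for one operator.
TILES = [1, 2, 4, 8, 16, 32, 64]
UNROLLS = [1, 2, 4, 8]
SPACE = list(itertools.product(TILES, TILES, UNROLLS))


def measure(config):
    """Synthetic stand-in for timing a program variant on hardware (lower is better)."""
    tile_x, tile_y, unroll = config
    # Made-up cost surface whose minimum sits at tile_x=16, tile_y=8, unroll=4.
    return (
        (math.log2(tile_x) - 4) ** 2
        + (math.log2(tile_y) - 3) ** 2
        + 0.5 * (math.log2(unroll) - 2) ** 2
        + 1.0
    )


def features(config):
    """Map a config to a feature vector (log2 of each knob)."""
    return tuple(math.log2(v) for v in config)


def predict(config, measured, k=3):
    """Toy cost model: mean cost of the k nearest already-measured configs."""
    f = features(config)
    nearest = sorted(
        measured.items(), key=lambda item: math.dist(f, features(item[0]))
    )[:k]
    return sum(cost for _, cost in nearest) / len(nearest)


def search(rounds=8, batch=8, seed=0):
    """Alternate between ranking candidates with the model and measuring the top ones."""
    rng = random.Random(seed)
    # Bootstrap the model with a random batch of measurements.
    measured = {c: measure(c) for c in rng.sample(SPACE, batch)}
    for _ in range(rounds):
        candidates = [c for c in SPACE if c not in measured]
        # Rank unmeasured variants by predicted cost; only the top batch is measured.
        candidates.sort(key=lambda c: predict(c, measured))
        for c in candidates[:batch]:
            measured[c] = measure(c)
    best = min(measured, key=measured.get)
    return best, measured
```

Calling `best, measured = search()` returns the cheapest configuration seen and the full measurement log. The point of the design is that `measure` is called on only a fraction of `SPACE`, while the cheap `predict` function is what ranks the rest.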
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Computational Physics and Python Applications · Model Reduction and Neural Networks
