Gensor: A Graph-based Construction Tensor Compilation Method for Deep Learning
Hangda Liu, Boyu Diao, Yu Yang, Wenxin Chen, Xiaohui Peng, Yongjun Xu

TL;DR
Gensor is a graph-based tensor compilation method that quickly explores a larger optimization space than tree-based approaches, yielding an average 20% speedup on GPUs for models such as ResNet-50 and GPT-2.
Contribution
Gensor introduces a novel graph-based approach for tensor program optimization, expanding the search space and improving performance with faster compilation times.
Findings
Achieves 18-30% performance improvement over state-of-the-art methods.
Generates operator kernels in seconds, enabling rapid optimization.
Provides an average 20% acceleration for models like ResNet-50 and GPT-2.
Abstract
High-performance deep learning depends on efficient tensor programs. In recent years, automatic tensor program optimization, also known as tensor compilation, has emerged as the primary approach to generating efficient tensor programs. However, how to generate kernels with higher performance in a shorter time is still the key challenge. In this paper, we present Gensor, a graph-based construction tensor compilation method for deep learning, to further improve the performance of construction tensor compilation. Unlike existing tree-based methods, Gensor abstracts construction space into a graph structure. Gensor then explores the construction space with Markov analysis. Gensor takes tensor programs as states and models scheduling primitives as transition actions between these states. Therefore, the process of tensor program construction optimization is abstracted as a graph traversal…
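To make the state-and-action framing in the abstract concrete, here is a minimal toy sketch. It is not Gensor's algorithm: the tile-size state, the double/halve actions, the cost formula, the capacity limit, and the Boltzmann-style transition probabilities are all assumptions chosen for illustration. It only shows the general idea of treating candidate programs as graph nodes, primitives as edges, and optimization as a probabilistic walk over that graph.

```python
import math
import random

# Hypothetical toy problem: choose tile sizes (tm, tn) for an M x N output.
# All constants and formulas below are illustrative assumptions.
M, N = 256, 256
MAX_TILE_ELEMS = 1024  # assumed on-chip capacity, in elements


def cost(state):
    """Made-up memory-traffic proxy for a tiling; lower is better."""
    tm, tn = state
    tiles = (M // tm) * (N // tn)
    # Each output tile is assumed to load a tm-long and a tn-long strip.
    return tiles * (tm + tn)


def neighbors(state):
    """States reachable by one action: double or halve one tile dimension."""
    tm, tn = state
    candidates = [(tm * 2, tn), (tm // 2, tn), (tm, tn * 2), (tm, tn // 2)]
    return [
        (a, b)
        for a, b in candidates
        if 1 <= a <= M and 1 <= b <= N and a * b <= MAX_TILE_ELEMS
    ]


def transition_probs(state, temperature=0.1):
    """Boltzmann-style weights: cheaper successor states are more likely."""
    nxt = neighbors(state)
    base = cost(state)
    weights = [math.exp((base - cost(s)) / (temperature * base)) for s in nxt]
    total = sum(weights)
    return nxt, [w / total for w in weights]


def walk(start, steps=50, seed=0):
    """Random walk over the state graph, remembering the cheapest state seen."""
    rng = random.Random(seed)
    state = best = start
    for _ in range(steps):
        nxt, probs = transition_probs(state)
        state = rng.choices(nxt, weights=probs, k=1)[0]
        if cost(state) < cost(best):
            best = state
    return best


if __name__ == "__main__":
    start = (1, 1)
    best = walk(start)
    print("start", start, "cost", cost(start))
    print("best ", best, "cost", cost(best))
```

Because the actions include both doubling and halving, the same state can be reached along many different paths and the walk can back out of a poor choice. That is the property that distinguishes a graph-shaped space from a tree, in which each decision is made once and never revisited.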
Taxonomy
Topics: Computational Physics and Python Applications · Tensor decomposition and applications · Parallel Computing and Optimization Techniques
