HARL: Hierarchical Adaptive Reinforcement Learning Based Auto Scheduler for Neural Networks

Zining Zhang; Bingsheng He; Zhenjie Zhang

arXiv:2211.11172·cs.LG·November 22, 2022

TL;DR

HARL is a hierarchical reinforcement learning auto-scheduler that shortens tensor program search time and improves the performance of the tuned programs used for neural network inference.

Contribution

It introduces a novel hierarchical RL architecture with adaptive exploration for tensor program search, outperforming existing auto-schedulers in speed and performance.

Findings

01. Tensor operator performance improves by 22% relative to existing auto-schedulers.

02. Search is 4.3 times faster.

03. End-to-end neural network inference also improves significantly.

Abstract

To efficiently perform inference with neural networks, the underlying tensor programs require sufficient tuning efforts before being deployed into production environments. Usually, enormous tensor program candidates need to be sufficiently explored to find the one with the best performance. This is necessary to make the neural network products meet the high demand of real-world applications such as natural language processing, auto-driving, etc. Auto-schedulers are being developed to avoid the need for human intervention. However, due to the gigantic search space and lack of intelligent search guidance, current auto-schedulers require hours to days of tuning time to find the best-performing tensor program for the entire neural network. In this paper, we propose HARL, a reinforcement learning (RL) based auto-scheduler specifically designed for efficient tensor program exploration. HARL…
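To make the phrase "tensor program candidates need to be sufficiently explored" concrete, the following is a minimal sketch of the brute-force baseline that auto-schedulers try to improve on. It is not HARL's method, and the truncated abstract does not describe it. The function names `tiled_matmul` and `exhaustive_search`, the tile-size list, and the matrix size are all hypothetical choices made for illustration. Each candidate is one loop-tiling configuration of the same matrix multiplication, and the search simply times every candidate.

```python
import itertools
import time


def tiled_matmul(a, b, n, tile_i, tile_j):
    """Multiply two n x n matrices (lists of lists) using loop tiling.

    Every (tile_i, tile_j) pair yields the same result but a different
    loop order, which is what makes it a distinct "program candidate".
    """
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile_i):
        for j0 in range(0, n, tile_j):
            for i in range(i0, min(i0 + tile_i, n)):
                row_a, row_c = a[i], c[i]
                for j in range(j0, min(j0 + tile_j, n)):
                    row_c[j] = sum(row_a[k] * b[k][j] for k in range(n))
    return c


def exhaustive_search(n=24, tile_sizes=(1, 2, 4, 8)):
    """Time every (tile_i, tile_j) candidate and return the fastest one.

    Assumption: a toy search space of len(tile_sizes) ** 2 candidates.
    Real tensor program spaces are far larger, which is why exhaustive
    measurement like this does not scale.
    """
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    results = {}
    for candidate in itertools.product(tile_sizes, repeat=2):
        start = time.perf_counter()
        tiled_matmul(a, b, n, *candidate)
        results[candidate] = time.perf_counter() - start
    best = min(results, key=results.get)
    return best, results
```

Calling `exhaustive_search()` returns the fastest tile pair together with the measured time of every candidate. The cost of this approach grows with the product of all tunable parameters, which is the "gigantic search space" problem the abstract refers to; a learned search policy aims to measure far fewer candidates.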

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Advanced Neural Network Applications · Tensor decomposition and applications