IOS: Inter-Operator Scheduler for CNN Acceleration

Yaoyao Ding; Ligeng Zhu; Zhihao Jia; Gennady Pekhimenko; Song Han

arXiv:2011.01302 · cs.LG · March 9, 2021 · 5 cites

TL;DR

This paper introduces IOS, a dynamic-programming-based scheduler that automatically parallelizes execution across multiple CNN operators. It outperforms libraries that rely on intra-operator optimization (e.g., TensorRT) by 1.1 to 1.5x, targeting a hardware utilization gap that is most severe at small batch sizes.

Contribution

The paper presents a novel inter-operator scheduling approach for CNN inference. It addresses the underutilization of hardware parallelism that arises when operators run one at a time, and it outperforms state-of-the-art libraries such as TensorRT.

Findings

01. IOS outperforms TensorRT by 1.1 to 1.5x on CNN benchmarks.

02. The scheduler effectively reduces the performance gap caused by limited intra-operator parallelism.

03. Dynamic programming enables optimal scheduling across operators for better hardware utilization.

Abstract

To accelerate CNN inference, existing deep learning frameworks focus on optimizing intra-operator parallelization. However, a single operator can no longer fully utilize the available parallelism given the rapid advances in high-performance hardware, resulting in a large gap between the peak performance and the real performance. This performance gap is more severe under smaller batch sizes. In this work, we extensively study the parallelism between operators and propose Inter-Operator Scheduler (IOS) to automatically schedule multiple operators' parallel execution through a novel dynamic programming algorithm. IOS consistently outperforms state-of-the-art libraries (e.g., TensorRT) by 1.1 to 1.5x on modern CNN benchmarks. The code to reproduce each experiment is available at: https://github.com/mit-han-lab/inter-operator-scheduler.
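
The dynamic-programming idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy graph `OPS`, the function names, and especially the cost model in `stage_latency` are assumptions made for illustration (a real scheduler would obtain stage latencies by measuring them on the target device). The sketch splits a small operator graph into stages of mutually independent operators and memoizes the best remaining cost for each set of already-executed operators.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy graph (not from the paper):
# name -> (latency in ms when run alone, fraction of the device it uses, predecessors)
OPS = {
    "conv_a": (1.0, 0.4, ()),
    "conv_b": (1.0, 0.4, ()),
    "conv_c": (2.0, 0.3, ("conv_a",)),
    "concat": (0.5, 1.0, ("conv_b", "conv_c")),
}


def stage_latency(ops, stage):
    """Assumed cost model: a stage of concurrent operators takes as long as
    its slowest operator, unless their combined work exceeds device capacity."""
    longest = max(ops[o][0] for o in stage)
    total_work = sum(ops[o][0] * ops[o][1] for o in stage)
    return max(longest, total_work)


def sequential_latency(ops):
    """Baseline: run every operator on its own, one after another."""
    return sum(latency for latency, _, _ in ops.values())


def schedule(ops):
    """Return (total latency, tuple of stages) minimizing the assumed cost."""
    names = tuple(ops)

    @lru_cache(maxsize=None)
    def best(done):
        # done: frozenset of operators that have already been executed.
        if len(done) == len(names):
            return 0.0, ()
        # Operators whose predecessors are all done are mutually independent.
        ready = [o for o in names
                 if o not in done and all(p in done for p in ops[o][2])]
        best_cost, best_plan = float("inf"), ()
        # Try every non-empty subset of ready operators as the next stage.
        for k in range(1, len(ready) + 1):
            for stage in combinations(ready, k):
                rest_cost, rest_plan = best(done | frozenset(stage))
                cost = stage_latency(ops, stage) + rest_cost
                if cost < best_cost:
                    best_cost, best_plan = cost, (stage,) + rest_plan
        return best_cost, best_plan

    return best(frozenset())


if __name__ == "__main__":
    cost, plan = schedule(OPS)
    print("sequential:", sequential_latency(OPS), "ms")
    print("scheduled: ", cost, "ms", plan)
```

On this toy graph the sketch finds a 3.5 ms schedule against a 4.5 ms sequential baseline. These numbers follow only from the assumed cost model and say nothing about the measured speedups reported in the paper. Enumerating every subset of ready operators is exponential in graph width, so this form is suitable only for small examples.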

Code & Models

Repositories

mit-han-lab/inter-operator-scheduler (official; tagged tf)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Adversarial Robustness in Machine Learning