DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator

Zihan Liu; Jingwen Leng; Quan Chen; Chao Li; Wenli Zheng; Li Li; Minyi Guo

arXiv:2011.05630·cs.DC·November 12, 2020·1 cites

TL;DR

This paper presents DLFusion, an auto-tuning compiler framework that optimizes layer fusion and core utilization on DNN accelerators, significantly improving performance with minimal search effort.

Contribution

The paper introduces a joint auto-tuning approach for layer fusion and core configuration that reduces the search space and improves DNN accelerator performance.

Findings

01. Achieves up to 7.9x speedup over the baseline.

02. Comes close to oracle performance with less search time.

03. Demonstrates effectiveness on a range of DNN models.

Abstract

Many hardware vendors have introduced specialized deep neural network (DNN) accelerators owing to their superior performance and efficiency. As such, how to generate and optimize code for these hardware accelerators becomes an important yet less explored problem. In this paper, we perform a compiler-stage optimization study using a novel and representative Cambricon DNN accelerator and demonstrate that the code optimization knobs play an important role in unleashing the potential of the hardware's computational horsepower. However, even the two code optimization knobs studied, namely the number of cores and the layer fusion scheme, present an enormous search space that rules out naive brute-force search. This work introduces a joint, auto-tuning optimization framework to address this challenge. We first use a set of synthesized DNN layers to study the interplay between the hardware…
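
To give a rough sense of why brute-force search over these two knobs is impractical, the sketch below counts configurations under simplifying assumptions that are not taken from the paper: the network is a linear chain of layers, a fusion scheme is any split of that chain into contiguous blocks, and each fused block independently picks one of a fixed number of core-count settings. The function name and its parameters are hypothetical.

```python
from math import comb


def search_space_size(num_layers: int, core_options: int) -> int:
    """Count (fusion scheme, per-block core setting) pairs for a linear chain.

    Assumption (illustrative only): a fusion scheme is a split of the chain
    into contiguous blocks, and every block chooses its core setting
    independently from `core_options` possibilities.
    """
    total = 0
    for blocks in range(1, num_layers + 1):
        # comb(n - 1, k - 1) ways to cut n chained layers into k contiguous blocks
        ways_to_fuse = comb(num_layers - 1, blocks - 1)
        # each of the k blocks picks one core setting
        total += ways_to_fuse * core_options ** blocks
    return total


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the growth trend.
    for n in (3, 10, 20, 50):
        print(n, search_space_size(n, core_options=2))
```

Under these assumptions the count equals c(1 + c)^(n - 1) for n layers and c core settings, so it grows exponentially with network depth.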

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Advanced Memory and Neural Computing