DLFusion: An Auto-Tuning Compiler for Layer Fusion on Deep Neural Network Accelerator
Zihan Liu, Jingwen Leng, Quan Chen, Chao Li, Wenli Zheng, Li Li, and Minyi Guo

TL;DR
This paper presents DLFusion, an auto-tuning compiler framework that optimizes layer fusion and core utilization on DNN accelerators, significantly improving performance with minimal search effort.
Contribution
It introduces a joint auto-tuning approach for layer fusion and core configuration, effectively reducing search space and enhancing DNN accelerator performance.
Findings
Achieves up to 7.9x speedup over baseline.
Close to oracle performance with less search time.
Demonstrates effectiveness on various DNN models.
Abstract
Many hardware vendors have introduced specialized deep neural network (DNN) accelerators owing to their superior performance and efficiency. As such, how to generate and optimize the code for the hardware accelerator becomes an important yet less explored problem. In this paper, we perform the compiler-stage optimization study using a novel and representative Cambricon DNN accelerator and demonstrate that the code optimization knobs play an important role in unleashing the potential of hardware computational horsepower. However, even only two studied code optimization knobs, namely the number of cores and layer fusion scheme, present an enormous search space that prevents the naive brute-force search. This work introduces a joint, auto-tuning optimization framework to address this challenge. We first use a set of synthesized DNN layers to study the interplay between the hardware…
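To make the abstract's point about the search space concrete, the following minimal sketch counts joint configurations under simplifying assumptions that are not taken from the paper: the network is a linear chain of layers, only contiguous layers may be fused into a block, and each fused block independently picks one core count from a fixed list. The function names and the core options `[1, 2, 4]` are hypothetical. Under these assumptions, a chain of n layers with C core choices has C · (C + 1)^(n − 1) configurations, which the brute-force enumerator confirms for small n.

```python
from itertools import product


def count_configs(num_layers: int, core_choices: int) -> int:
    """Closed-form count of (fusion partition, per-block core count) pairs.

    Assumes a linear chain with contiguous fusion only (an illustrative
    model, not necessarily the paper's exact search space).
    """
    if num_layers < 1:
        return 0
    return core_choices * (core_choices + 1) ** (num_layers - 1)


def enumerate_configs(num_layers: int, core_options: list):
    """Yield every configuration as a list of (start, end, cores) blocks.

    Each of the num_layers - 1 gaps between adjacent layers is either a
    cut (new block starts) or not (layers stay fused).
    """
    for cuts in product([False, True], repeat=num_layers - 1):
        starts = [0] + [i + 1 for i, cut in enumerate(cuts) if cut]
        ends = starts[1:] + [num_layers]
        blocks = list(zip(starts, ends))
        # Every fused block chooses its core count independently.
        for cores in product(core_options, repeat=len(blocks)):
            yield [(s, e, c) for (s, e), c in zip(blocks, cores)]


if __name__ == "__main__":
    options = [1, 2, 4]  # hypothetical core counts
    brute = sum(1 for _ in enumerate_configs(4, options))
    print(brute, count_configs(4, len(options)))  # both are 192
```

Because the count grows exponentially with the number of layers, exhaustive enumeration of this kind is only practical for very short chains, which is the motivation the abstract gives for a guided tuning strategy.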
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Advanced Memory and Neural Computing
