DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion

Wei Niu; Jiexiong Guan; Yanzhi Wang; Gagan Agrawal; Bin Ren

arXiv:2108.13342·cs.LG·December 2, 2021

TL;DR

DNNFusion is an operator fusion framework for deep neural network execution. It finds up to 8.8x more fusion opportunities, delivers a 9.3x speedup over state-of-the-art frameworks, and reduces inference memory requirements, which makes it suitable for mobile and real-time applications.

Contribution

DNNFusion introduces a new operator classification and a graph rewriting approach that together expand fusion opportunities beyond what existing pattern-based methods can cover.

Findings

01 Achieves up to 8.8x more fusion opportunities

02 Outperforms state-of-the-art frameworks with 9.3x speedup

03 Reduces memory requirements for DNN inference

Abstract

Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep, with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called…
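To make the idea of operator fusion concrete, the following is a minimal sketch in plain Python. It is not DNNFusion's algorithm and not the API of any framework named above. It only illustrates the general principle that chaining elementwise operators into a single pass avoids materializing one intermediate buffer per operator. The names `run_unfused`, `fuse`, and `run_fused`, as well as the example operator chain (scale, bias, ReLU), are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Tuple

# An elementwise operator maps one float to one float (illustrative assumption).
Op = Callable[[float], float]


def run_unfused(xs: List[float], ops: List[Op]) -> Tuple[List[float], int]:
    """Run each operator as its own pass over the data.

    Every pass materializes a new intermediate list, so the number of
    buffers written equals the number of operators.
    """
    current = list(xs)
    buffers_written = 0
    for op in ops:
        current = [op(v) for v in current]  # one full intermediate per operator
        buffers_written += 1
    return current, buffers_written


def fuse(ops: List[Op]) -> Op:
    """Compose a chain of elementwise operators into one per-element function."""
    def fused(v: float) -> float:
        for op in ops:
            v = op(v)
        return v
    return fused


def run_fused(xs: List[float], ops: List[Op]) -> Tuple[List[float], int]:
    """Run the whole chain in a single pass, writing only the output buffer."""
    kernel = fuse(ops)
    return [kernel(v) for v in xs], 1


# Hypothetical operator chain: scale, add bias, then ReLU.
ops: List[Op] = [
    lambda v: v * 2.0,
    lambda v: v + 1.0,
    lambda v: max(v, 0.0),
]
data = [-2.0, -0.5, 0.0, 1.5]

unfused_out, unfused_buffers = run_unfused(data, ops)
fused_out, fused_buffers = run_fused(data, ops)
```

Both paths compute the same values, but the fused path writes one buffer instead of three. Real frameworks apply the same principle at the kernel level, where the saved intermediates translate into less memory traffic. Which operator combinations can legally be fused is the harder question that the paper addresses.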
