Applying Graph Explanation to Operator Fusion
Keith G. Mills, Muhammad Fetrat Qharabagh, Weichen Qiu, Fred X. Han, Mohammad Salameh, Wei Lu, Shangling Jui, Di Niu

TL;DR
This paper introduces a novel method that uses Graph Explanation Techniques to improve layer fusion in deep neural networks, significantly reducing DRAM access and enhancing inference efficiency.
Contribution
It integrates explainable AI with layer fusion optimization, enabling recursive splitting of invalid fusion groups to minimize DRAM access in DNNs.
Findings
Over 20% DRAM access reduction on EfficientNet-B3
Effective fusion optimization on ResNets and MobileNets
Improved inference efficiency through explainable AI techniques
Abstract
Layer fusion techniques are critical to improving the inference efficiency of deep neural networks (DNNs) for deployment. Fusion aims to lower inference costs by reducing data transactions between an accelerator's on-chip buffer and DRAM. This is accomplished by executing multiple operations, such as convolutions and activations, together as single execution units called fusion groups. However, on-chip buffer capacity limits fusion group size, so optimizing fusion on whole DNNs requires partitioning them into multiple fusion groups. Finding the optimal groups is a complex problem where the presence of invalid solutions hampers traditional search algorithms and demands robust approaches. In this paper, we incorporate Explainable AI, specifically Graph Explanation Techniques (GET), into layer fusion. Given an invalid fusion group, we identify the operations most responsible for group…
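To make the idea of capacity-limited fusion groups concrete, the following is a minimal sketch, not the paper's method. It assumes a toy linear chain of operations, a simplified buffer and DRAM cost model, and a hypothetical `blame` score that stands in for the explanation-derived importance a Graph Explanation Technique would supply. All names (`Op`, `buffer_need`, `dram_traffic`, `blame`, `split_until_valid`) and all byte counts are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """One operation in a linear chain (hypothetical toy representation)."""
    name: str
    in_bytes: int      # size of the input activation
    out_bytes: int     # size of the output activation
    weight_bytes: int  # parameters that must be resident on chip


def buffer_need(group: List[Op]) -> int:
    # Assumed toy model: all weights of the group plus the largest
    # single activation must fit in the on-chip buffer at once.
    weights = sum(op.weight_bytes for op in group)
    largest_act = max(max(op.in_bytes, op.out_bytes) for op in group)
    return weights + largest_act


def dram_traffic(group: List[Op]) -> int:
    # A fused group reads its first input and its weights from DRAM and
    # writes only its final output; intermediates stay on chip.
    weights = sum(op.weight_bytes for op in group)
    return group[0].in_bytes + weights + group[-1].out_bytes


def is_valid(group: List[Op], capacity: int) -> bool:
    return buffer_need(group) <= capacity


def blame(op: Op) -> int:
    # Placeholder for an explanation score. The paper derives this from
    # graph explanation; here it is just the op's own memory footprint.
    return op.weight_bytes + max(op.in_bytes, op.out_bytes)


def split_until_valid(group: List[Op], capacity: int) -> List[List[Op]]:
    # Recursively cut an invalid group in front of its most-blamed op.
    # A single op that is still too large cannot be split and is returned as-is.
    if len(group) == 1 or is_valid(group, capacity):
        return [group]
    idx = max(range(len(group)), key=lambda i: blame(group[i]))
    cut = idx if idx > 0 else 1  # keep both halves non-empty
    return (split_until_valid(group[:cut], capacity)
            + split_until_valid(group[cut:], capacity))


ops = [
    Op("conv1", 100, 100, 20),
    Op("relu1", 100, 100, 0),
    Op("conv2", 100, 50, 60),
    Op("relu2", 50, 50, 0),
]
capacity = 170  # assumed on-chip buffer size, in the same toy units

groups = split_until_valid(ops, capacity)
fused_total = sum(dram_traffic(g) for g in groups)
unfused_total = sum(dram_traffic([op]) for op in ops)
print([[op.name for op in g] for g in groups], fused_total, unfused_total)
```

In this toy setting the whole chain exceeds the buffer, so it is cut before the most-blamed operation, and the resulting groups move less data to and from DRAM than running every operation on its own. A real accelerator cost model and a real explanation score would replace `buffer_need`, `dram_traffic`, and `blame`.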
Taxonomy
Topics: Semantic Web and Ontologies · Graph Theory and Algorithms · Advanced Graph Neural Networks
Methods: Convolution
