AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
Zhiying Xu, Hongding Peng, Wei Wang

TL;DR
AGO speeds up mobile AI inference by removing the constraints that heuristic subgraph generation places on graph optimization. This enables fusion of multiple complex operators and flexible subgraph partitioning, yielding up to 3.3x inference speedup.
Contribution
This paper introduces AGO, a graph optimization framework that removes the constraints imposed by heuristic subgraph generation (for example, at most one complex operator per subgraph), allowing subgraphs of arbitrary structure for improved inference performance.
Findings
Up to 3.3x inference speedup on various neural networks.
Effective operator fusion for complex operators.
Flexible subgraph partitioning with acyclic guarantees.
Abstract
Traditional deep learning compilers rely on heuristics for subgraph generation, which impose extra constraints on graph optimization, e.g., each subgraph can only contain at most one complex operator. In this paper, we propose AGO, a framework for graph optimization with arbitrary structures to boost the inference performance of deep models by removing such constraints. To create new optimization opportunities for complicated subgraphs, we propose intensive operator fusion, which can effectively stitch multiple complex operators together for better performance. Further, we design a graph partitioning scheme that allows an arbitrary structure for each subgraph while guaranteeing the acyclic property among all generated subgraphs. Additionally, to enable efficient performance tuning on complicated subgraphs, we devise a novel divide-and-conquer tuning mechanism to orchestrate different…
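The acyclic property mentioned in the abstract can be illustrated with a small sketch. This is not AGO's partitioning scheme, only a check of the property it guarantees: once operators are grouped into subgraphs, the dependencies between subgraphs must still form a DAG, otherwise no valid execution order exists. The function name `partition_is_acyclic`, the toy residual block, and the two groupings are hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def partition_is_acyclic(edges, assignment):
    """Return True if the subgraph-level dependency graph has no cycle.

    edges: iterable of (producer_op, consumer_op) pairs in the operator DAG.
    assignment: dict mapping each operator name to a subgraph id.
    Illustrative helper only; not taken from the AGO paper.
    """
    succ = defaultdict(set)
    indegree = {group: 0 for group in set(assignment.values())}

    # Collapse operator-level edges into edges between distinct subgraphs.
    for producer, consumer in edges:
        src, dst = assignment[producer], assignment[consumer]
        if src != dst and dst not in succ[src]:
            succ[src].add(dst)
            indegree[dst] += 1

    # Kahn's algorithm: every subgraph is scheduled only if there is no cycle.
    ready = deque(group for group, deg in indegree.items() if deg == 0)
    scheduled = 0
    while ready:
        group = ready.popleft()
        scheduled += 1
        for nxt in succ[group]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return scheduled == len(indegree)


# Toy residual block: conv1 -> relu -> add, plus a skip edge conv1 -> add.
EDGES = [("conv1", "relu"), ("relu", "add"), ("conv1", "add")]

# Grouping conv1 with relu leaves a single edge between the two subgraphs.
GOOD = {"conv1": 0, "relu": 0, "add": 1}

# Grouping conv1 with add makes subgraph 0 both feed and depend on subgraph 1.
BAD = {"conv1": 0, "add": 0, "relu": 1}

good_ok = partition_is_acyclic(EDGES, GOOD)
bad_ok = partition_is_acyclic(EDGES, BAD)
```

In the second grouping, subgraph 0 produces the input of `relu` and also consumes its output, so neither subgraph can run first. A partitioner that allows arbitrary subgraph structures has to rule out groupings of this kind.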
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Machine Learning in Materials Science
