ADE-HGNN: Accelerating HGNNs through Attention Disparity Exploitation
Dengke Han, Meng Wu, Runzhen Xue, Mingyu Yan, Xiaochun Ye, Dongrui Fan

TL;DR
This paper introduces ADE-HGNN, a specialized hardware accelerator for HGNNs that employs attention disparity-based pruning and operation fusion to significantly accelerate inference while maintaining accuracy.
Contribution
It proposes a novel runtime pruning method and execution framework for HGNNs, along with a dedicated hardware accelerator design to improve performance and energy efficiency.
Findings
Achieves 28.21x speedup over NVIDIA T4
Achieves 7.98x speedup over NVIDIA A100
Maintains inference accuracy loss within 1.47%
Abstract
Heterogeneous Graph Neural Networks (HGNNs) have recently demonstrated great power in handling heterogeneous graph data, rendering them widely applied in many critical real-world domains. Most HGNN models leverage attention mechanisms to significantly improve model accuracy, albeit at the cost of increased computational complexity and memory bandwidth requirements. Fortunately, the attention disparity from source vertices towards a common target vertex unveils an opportunity to boost the model execution performance by pruning unimportant source vertices during neighbor aggregation. In this study, we commence with a quantitative analysis of the attention disparity in HGNN models, where the importance of different source vertices varies for the same target vertex. To fully exploit this finding for inference acceleration, we propose a runtime pruning method based on min-heap and map it to a…
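To make the min-heap pruning idea concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes each target vertex has one raw attention score per source vertex, and that pruning keeps a fixed number `k` of the highest-scoring sources before aggregation. The function names `prune_top_k` and `aggregate_pruned`, the parameter `k`, and the choice to renormalize the softmax over the retained sources only are all illustrative assumptions; the abstract does not specify them.

```python
import heapq
import math


def prune_top_k(scores, k):
    """Return the indices of the k largest scores, in ascending index order.

    A min-heap of size k holds the retained candidates, so the smallest
    retained score sits at heap[0] and is evicted when a larger one arrives.
    (Hypothetical helper; k is an assumed pruning budget.)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    heap = []  # entries are (score, source_index)
    for idx, score in enumerate(scores):
        if len(heap) < k:
            heapq.heappush(heap, (score, idx))
        elif score > heap[0][0]:
            # Replace the weakest retained source with the stronger one.
            heapq.heapreplace(heap, (score, idx))
    return sorted(idx for _, idx in heap)


def aggregate_pruned(scores, features, k):
    """Aggregate source features for one target vertex after pruning.

    Assumption: softmax weights are renormalized over the kept sources.
    """
    kept = prune_top_k(scores, k)
    peak = max(scores[i] for i in kept)  # subtract max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in kept]
    total = sum(exps)
    dim = len(features[0])
    out = [0.0] * dim
    for weight, i in zip(exps, kept):
        for d in range(dim):
            out[d] += (weight / total) * features[i][d]
    return kept, out


if __name__ == "__main__":
    scores = [0.1, 2.0, -1.0, 1.5, 0.3]
    features = [[1.0], [2.0], [3.0], [4.0], [5.0]]
    kept, out = aggregate_pruned(scores, features, k=2)
    print(kept, out)
```

Because the heap never holds more than `k` entries, each incoming score costs at most O(log k) work, and low-attention sources are discarded as soon as they are seen rather than after a full sort.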
Taxonomy
Topics: Brain Tumor Detection and Classification · Machine Learning in Healthcare · COVID-19 diagnosis using AI
