TLV-HGNN: Thinking Like a Vertex for Memory-efficient HGNN Inference
Dengke Han, Duo Wang, Mingyu Yan, Xiaochun Ye, Dongrui Fan

TL;DR
This paper introduces TLV-HGNN, a hardware accelerator for HGNN inference that significantly improves speed and energy efficiency by eliminating redundant memory accesses and intermediate storage through a novel vertex-centric execution paradigm.
Contribution
It proposes a semantics-complete execution paradigm and a vertex grouping technique, enabling memory-efficient and high-performance HGNN inference on reconfigurable hardware.
Findings
Achieves 7.85x speedup over NVIDIA A100
Reduces energy consumption by 98.79%
Outperforms state-of-the-art HGNN accelerators
Abstract
Heterogeneous graph neural networks (HGNNs) excel at processing heterogeneous graph data and are widely applied in critical domains. In HGNN inference, the neighbor aggregation stage is the primary performance determinant, yet it suffers from two major sources of memory inefficiency. First, the commonly adopted per-semantic execution paradigm stores intermediate aggregation results for each semantic prior to semantic fusion, causing substantial memory expansion. Second, the aggregation process incurs extensive redundant memory accesses, including repeated loading of target vertex features across semantics and repeated accesses to shared neighbors due to cross-semantic neighborhood overlap. These inefficiencies severely limit scalability and reduce HGNN inference performance. In this work, we first propose a semantics-complete execution paradigm from a vertex perspective that…
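To make the contrast in the abstract concrete, here is a minimal sketch of the two execution orders on a toy graph. It is an illustration under stated assumptions, not the paper's implementation: the graph, the feature values, the use of a plain mean for both neighbor aggregation and semantic fusion, and the names `per_semantic`, `vertex_centric`, and `mean` are all hypothetical. The `peak` counter only tallies how many per-semantic aggregation results are held at once.

```python
# Toy heterogeneous graph (all values are made up for illustration).
# Each semantic maps a target vertex to its neighbor list under that semantic.
features = {0: [1.0, 2.0], 1: [3.0, 4.0], 2: [5.0, 6.0], 3: [7.0, 8.0]}
semantics = {
    "A": {0: [1, 2], 1: [2, 3]},
    "B": {0: [2, 3], 1: [0, 2]},  # vertex 2 is a shared neighbor across semantics
}
targets = [0, 1]


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def per_semantic(features, semantics, targets):
    """Aggregate one semantic at a time, keeping every result until fusion."""
    intermediate = {}
    for name, adj in semantics.items():
        for v in targets:
            intermediate[(name, v)] = mean([features[u] for u in adj[v]])
    peak = len(intermediate)  # one stored vector per (semantic, target) pair
    fused = {
        v: mean([intermediate[(name, v)] for name in semantics])
        for v in targets
    }
    return fused, peak


def vertex_centric(features, semantics, targets):
    """Finish all semantics of one target vertex, fuse, then move on."""
    fused, peak = {}, 0
    for v in targets:
        partial = [
            mean([features[u] for u in adj[v]]) for adj in semantics.values()
        ]
        peak = max(peak, len(partial))  # only this vertex's results are live
        fused[v] = mean(partial)
    return fused, peak


out_ps, peak_ps = per_semantic(features, semantics, targets)
out_vc, peak_vc = vertex_centric(features, semantics, targets)
print(out_ps == out_vc, peak_ps, peak_vc)
```

In this toy setting both orders yield the same fused features, while the per-semantic order holds one intermediate vector per (semantic, target) pair and the vertex-centric order holds only one per semantic at a time. The sketch does not model the redundant feature loads or the vertex grouping technique mentioned above.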
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Big Data and Digital Economy
