STAG: Enabling Low Latency and Low Staleness of GNN-based Services with Dynamic Graphs
Jiawen Wang, Quan Chen, Deze Zeng, Zhuo Song, Chen Chen, Minyi Guo

TL;DR
STAG is a framework that significantly reduces update latency and staleness in GNN-based services by using collaborative serving and incremental propagation strategies, enabling faster and more current node representations.
Contribution
The paper introduces STAG, a novel GNN serving framework that addresses neighbor explosion and duplicated computation, improving update speed and data freshness.
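To make the neighbor explosion problem concrete, here is a minimal sketch of how many nodes can be touched by a single change in a k-layer message-passing GNN. It is an illustration only, not code from the paper: the function name `affected_nodes`, the toy graph, and the assumption that a change at one node can affect every node within `num_layers` hops are all introduced here.

```python
from collections import deque


def affected_nodes(adj, changed, num_layers):
    """Return the set of nodes within `num_layers` hops of `changed`.

    Assumption for this sketch: in a message-passing GNN with `num_layers`
    layers, these are the nodes whose final embedding may need recomputation
    when `changed` is modified.
    """
    seen = {changed}
    frontier = deque([(changed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == num_layers:
            continue  # do not expand past the model's receptive field
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


# Hypothetical toy graph: hub 0 links to 1..4, each of which has two leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4),
         (1, 5), (1, 6), (2, 7), (2, 8),
         (3, 9), (3, 10), (4, 11), (4, 12)]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

# The affected set grows quickly with the number of layers.
sizes = [len(affected_nodes(adj, 5, k)) for k in (1, 2, 3)]
print(sizes)  # [2, 4, 7]
```

Even in this 13-node graph, a change at leaf node 5 reaches more than half of the graph after three layers, which is why recomputing every affected embedding on each update becomes expensive.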
Findings
Accelerates the update phase by up to 90.1x
Significantly reduces staleness time
Incurs a slight increase in response latency
Abstract
Many emerging user-facing services adopt Graph Neural Networks (GNNs) to improve serving accuracy. When the graph used by a GNN model changes, the representations (embeddings) of nodes in the graph should be updated accordingly. However, the node representation update is too slow, resulting in either long response latency for user queries (when inference is performed after the update completes) or high staleness (when inference is performed on stale data). Our in-depth analysis shows that the slow update is mainly due to the neighbor explosion problem in graphs and duplicated computation. Based on these findings, we propose STAG, a GNN serving framework that enables low latency and low staleness for GNN-based services. It comprises a collaborative serving mechanism and an additivity-based incremental propagation strategy. With the collaborative serving mechanism, only part of node…
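The abstract names an additivity-based incremental propagation strategy without detailing it here. As a hedged illustration of the general idea of additivity (not the paper's actual algorithm), the sketch below assumes a sum aggregator over neighbor features: because a sum is additive, an edge insertion or a feature change can be applied as a delta to the cached aggregate rather than re-summing all neighbors. All function names and data are hypothetical.

```python
def full_aggregate(features, neighbors):
    """Recompute the sum of neighbor feature vectors from scratch."""
    dim = len(next(iter(features.values())))
    total = [0.0] * dim
    for n in neighbors:
        total = [t + x for t, x in zip(total, features[n])]
    return total


def incremental_add(agg, new_feature):
    """Apply an edge insertion: add the new neighbor's feature to the cache."""
    return [a + x for a, x in zip(agg, new_feature)]


def incremental_update(agg, old_feature, new_feature):
    """Apply a neighbor feature change as a delta: subtract old, add new."""
    return [a - o + n for a, o, n in zip(agg, old_feature, new_feature)]


# Hypothetical features; values are chosen to be exact in floating point.
features = {1: [1.0, 2.0], 2: [0.5, -1.0], 3: [2.0, 0.0]}

# Cached aggregate for a node whose neighbors are 1 and 2.
agg = full_aggregate(features, [1, 2])

# New edge to node 3: one vector addition instead of a full re-sum.
agg_after_add = incremental_add(agg, features[3])
assert agg_after_add == full_aggregate(features, [1, 2, 3])

# Node 2's feature changes: apply only the delta.
old = features[2]
features[2] = [1.0, 1.0]
agg_after_update = incremental_update(agg_after_add, old, features[2])
assert agg_after_update == full_aggregate(features, [1, 2, 3])
```

The incremental path costs one vector operation per change regardless of the node's degree, whereas the full recomputation scales with the number of neighbors. Non-additive aggregators (for example, max) would not admit this shortcut without extra bookkeeping.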
Taxonomy
Topics: Advanced Graph Neural Networks · Recommender Systems and Techniques · Caching and Content Delivery
