An LLM-Guided Query-Aware Inference System for GNN Models on Large Knowledge Graphs

Waleed Afandi; Hussein Abdallah; Ashraf Aboulnaga; Essam Mansour

arXiv:2603.04545 · cs.LG · April 21, 2026

TL;DR

This paper introduces KG-WISE, an inference system for GNN models on large knowledge graphs. It uses LLMs to extract query-specific subgraphs and decomposes models into components, so that each inference query loads only the graph data and model parts relevant to its target nodes.

Contribution

KG-WISE is presented as the first system to decompose GNN models into fine-grained components and to use LLMs for query-aware subgraph extraction, which avoids loading a monolithic model and irrelevant neighbors for every query.

Findings

01. Achieves up to 28x faster inference than existing systems.

02. Reduces memory usage by up to 98%.

03. Maintains or improves accuracy across various large KGs.

Abstract

Efficient inference for graph neural networks (GNNs) on large knowledge graphs (KGs) is essential for many real-world applications. GNN inference queries are computationally expensive and vary in complexity, as each involves a different number of target nodes linked to subgraphs of diverse densities and structures. Existing acceleration methods, such as pruning, quantization, and knowledge distillation, instantiate smaller models but do not adapt them to the structure or semantics of individual queries. They also store models as monolithic files that must be fully loaded, and miss the opportunity to retrieve only the neighboring nodes and corresponding model components that are semantically relevant to the target nodes. These limitations lead to excessive data loading and redundant computation on large KGs. This paper presents KG-WISE, a task-driven inference paradigm for large KGs.…
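The idea of retrieving only the relevant neighbors and model components, rather than loading a monolithic model, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not KG-WISE's actual implementation: the toy triples, the `EMBEDDING_STORE` dictionary standing in for a model stored as separately addressable pieces, the helper names, and the hardcoded `relevant` relation set that stands in for whatever an LLM would select for a given query.

```python
from collections import deque

# Toy KG as (head, relation, tail) triples. All names are hypothetical.
TRIPLES = [
    ("paper1", "writtenBy", "alice"),
    ("paper1", "publishedIn", "venueA"),
    ("paper2", "writtenBy", "alice"),
    ("paper2", "cites", "paper3"),
    ("paper3", "publishedIn", "venueB"),
    ("alice", "livesIn", "cityX"),
]

# Stand-in for a decomposed model: one embedding per node, fetched by key.
# A monolithic model file would force loading all of these at once.
ALL_NODES = sorted({n for h, _, t in TRIPLES for n in (h, t)})
EMBEDDING_STORE = {node: [float(i), float(i) + 0.5] for i, node in enumerate(ALL_NODES)}


def extract_subgraph(triples, targets, relevant_relations, hops):
    """Breadth-first search from the target nodes, following only edges whose
    relation is in relevant_relations (edges are treated as undirected)."""
    adjacency = {}
    for head, relation, tail in triples:
        if relation in relevant_relations:
            adjacency.setdefault(head, []).append(tail)
            adjacency.setdefault(tail, []).append(head)

    seen = set(targets)
    frontier = deque((node, 0) for node in targets)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def load_components(store, nodes):
    """Fetch only the embeddings needed for the extracted subgraph."""
    return {node: store[node] for node in nodes}


# Assumption: an LLM judged these relations relevant to the query's task.
relevant = {"writtenBy", "publishedIn"}
nodes = extract_subgraph(TRIPLES, ["paper1"], relevant, hops=2)
loaded = load_components(EMBEDDING_STORE, nodes)
print(f"loaded {len(loaded)} of {len(EMBEDDING_STORE)} embeddings: {sorted(loaded)}")
```

In this toy setup the query touches four of the seven nodes, so only four embeddings are fetched. The per-node dictionary is just the simplest way to show partial loading; a real system would need a storage layout that supports such keyed access efficiently.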
