Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation
Shichang Zhang, Yozen Liu, Yizhou Sun, Neil Shah

TL;DR
This paper introduces Graph-less Neural Networks (GLNNs): MLPs trained with knowledge distillation from GNN teachers. GLNNs reach near-GNN accuracy without needing the graph at inference time, which makes them much faster and suitable for latency-sensitive applications.
Contribution
The paper proposes distilling the knowledge of a trained GNN into an MLP. The resulting graph-less model matches GNN accuracy on 6 of 7 datasets while removing the neighbor fetching that drives GNN inference latency.
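The distillation idea above can be sketched as a loss function. This page does not spell out the training objective, so the form below is an assumption: a common distillation setup in which the MLP student is fit to the true labels (cross-entropy, labeled nodes only) and to the GNN teacher's soft predictions (KL divergence, all nodes). The weight `lam` and every function name are illustrative, not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(student_probs[label])


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student); terms with zero teacher mass contribute 0."""
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )


def distillation_loss(student_logits, teacher_probs, labels, lam=0.5):
    """Weighted sum of a label term and a teacher term (assumed form).

    student_logits: per-node MLP outputs, computed from node features only.
    teacher_probs:  per-node class distributions from a trained GNN.
    labels:         dict mapping labeled node index -> class index.
    lam:            hypothetical trade-off weight between the two terms.
    """
    label_term = 0.0
    teacher_term = 0.0
    for v, logits in enumerate(student_logits):
        probs = softmax(logits)
        if v in labels:  # only labeled nodes contribute to the label term
            label_term += cross_entropy(probs, labels[v])
        teacher_term += kl_divergence(teacher_probs[v], probs)
    return lam * label_term + (1 - lam) * teacher_term


# Toy example: three nodes, two classes, only node 0 is labeled.
student_logits = [[2.0, 0.5], [0.1, 1.2], [0.0, 0.0]]
teacher_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
loss = distillation_loss(student_logits, teacher_probs, labels={0: 0})
```

Note that the student never sees the adjacency structure: graph information reaches it only through the teacher's soft predictions, which is why the trained model needs no graph at inference time.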
Findings
GLNNs infer 146X-273X faster than GNNs.
GLNNs improve MLP accuracy by 12.36% on average.
GLNNs match GNN performance on 6 out of 7 datasets.
Abstract
Graph Neural Networks (GNNs) are popular for graph machine learning and have shown great results on a wide range of node classification tasks. Yet, they are less popular for practical deployments in the industry owing to their scalability challenges incurred by data dependency. Namely, GNN inference depends on neighbor nodes multiple hops away from the target, and fetching them burdens latency-constrained applications. Existing inference acceleration methods like pruning and quantization can speed up GNNs by reducing Multiplication-and-ACcumulation (MAC) operations, but the improvements are limited given the data dependency is not resolved. Conversely, multi-layer perceptrons (MLPs) have no graph dependency and infer much faster than GNNs, even though they are less accurate than GNNs for node classification in general. Motivated by these complementary strengths and weaknesses, we bring GNNs and…
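The data dependency described in the abstract can be made concrete with a small sketch. It assumes a standard message-passing GNN without neighbor sampling, where a model with k layers needs features from every node within k hops of the target. The toy graph and the helper `nodes_to_fetch` are hypothetical and only illustrate the counting.

```python
def nodes_to_fetch(adjacency, target, num_hops):
    """Return the set of nodes within num_hops of target, target included."""
    seen = {target}
    frontier = {target}
    for _ in range(num_hops):
        # Expand one hop and keep only nodes not visited before.
        frontier = {n for v in frontier for n in adjacency[v]} - seen
        seen |= frontier
    return seen


# Hypothetical undirected toy graph as an adjacency list.
toy_graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 4],
    3: [1, 5],
    4: [2],
    5: [3],
}

gnn_fetch = nodes_to_fetch(toy_graph, target=0, num_hops=2)  # 2-layer GNN
mlp_fetch = nodes_to_fetch(toy_graph, target=0, num_hops=0)  # MLP: target only

print(len(gnn_fetch), len(mlp_fetch))
```

On real graphs with high-degree nodes the k-hop set can grow very quickly with k, whereas an MLP always reads a single feature vector per prediction.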
Taxonomy
Topics: Advanced Graph Neural Networks · Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing
Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Knowledge Distillation
