Frameless Graph Knowledge Distillation
Dai Shi, Zhiqi Shao, Yi Guo, Junbin Gao

TL;DR
This paper introduces a graph knowledge distillation framework using multi-scaled graph framelets, enabling simple student models to effectively learn complex graph representations and outperform or match teacher models in accuracy while maintaining high inference speed.
Contribution
The work proposes a novel KD framework with graph framelet decomposition, allowing student GNNs to utilize multi-scaled graph knowledge and address over-squashing issues.
Findings
Student models achieve equal or better accuracy than teacher models.
The framework effectively handles both homophilic and heterophilic graphs.
Inference speed remains high despite complex knowledge transfer.
Abstract
Knowledge distillation (KD) has shown great potential for transferring knowledge from a complex teacher model to a simple student model, so that the heavy learning task can be accomplished efficiently without losing too much prediction accuracy. Recently, many attempts have been made to apply the KD mechanism to graph representation learning models such as graph neural networks (GNNs), accelerating inference via student models. However, many existing KD-based GNNs use an MLP as a universal approximator in the student model to imitate the teacher model's process without considering the graph knowledge from the teacher model. In this work, we provide a KD-based framework on multi-scaled GNNs, known as graph framelets, and prove that by adequately utilizing the graph knowledge in a multi-scaled manner provided by graph framelet decomposition, the student model…
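For readers new to KD, the teacher-to-student transfer mentioned in the abstract can be illustrated with the generic soft-target loss commonly used in distillation. This is a minimal sketch of that standard formulation, not the framelet-based objective of this paper, which the summary here does not specify. The function names `softmax` and `kd_loss` and the parameters `temperature` and `alpha` are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Generic soft-target distillation loss for one node (assumed form, not the paper's).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between the softened teacher and student distributions.
    """
    # Hard-label term: cross-entropy of the student against the true class.
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # Soft-label term: KL(teacher || student) at the given temperature.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s)) for t, s in zip(p_t, p_s) if t > 0)

    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # made-up teacher logits for a 3-class node
    student = [1.0, 1.0, 0.0]    # made-up student logits for the same node
    print(kd_loss(student, teacher, label=0))
```

When the student's logits match the teacher's, the soft term vanishes and only the hard-label term remains, which is the sense in which the student "imitates" the teacher.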
Taxonomy
Topics: Advanced Graph Neural Networks · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
