Alignahead: Online Cross-Layer Knowledge Extraction on Graph Neural Networks
Jiongyu Guo, Defang Chen, Can Wang

TL;DR
This paper introduces an online cross-layer knowledge distillation method for GNNs that improves student model performance without pre-trained teachers, accelerates convergence, and enhances knowledge sharing across layers.
Contribution
Proposes a novel online distillation framework with cross-layer alignment for GNNs, eliminating the need for pre-trained teacher models and improving training efficiency.
Findings
Consistent performance boost across five datasets.
Accelerated convergence speed with the alignahead technique.
Effectiveness increases with more student models in training.
Abstract
Existing knowledge distillation methods on graph neural networks (GNNs) are almost all offline: the student model extracts knowledge from a powerful teacher model to improve its performance. However, a pre-trained teacher model is not always accessible due to training cost, privacy, etc. In this paper, we propose a novel online knowledge distillation framework to resolve this problem. Specifically, each student GNN model learns the extracted local structure from another simultaneously trained counterpart in an alternating training procedure. We further develop a cross-layer distillation strategy by aligning ahead one student layer with a layer at a different depth of another student model, which theoretically makes the structure information spread over all layers. Experimental results on five datasets including PPI, Coauthor-CS/Physics and Amazon-Computer/Photo demonstrate that the…
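The cross-layer idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: "local structure" is modeled as a softmax over each node's neighbors based on negative squared feature distance, the mismatch between two students is measured with a KL divergence, and layer l of the learning student is paired with layer l+1 of its peer, wrapping the last layer around to the first. All function names are hypothetical, and the code only computes a loss value on plain Python lists; it does not train a GNN.

```python
import math


def local_structure(features, neighbors):
    """Per-node softmax over neighbors (assumed definition of local structure).

    features:  list of feature vectors, one per node
    neighbors: dict mapping node index -> list of neighbor indices
    """
    structure = {}
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue  # isolated nodes carry no local structure
        # Similarity score: negative squared Euclidean distance to each neighbor.
        scores = [
            -sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            for j in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        structure[i] = [e / total for e in exps]
    return structure


def kl_structure_loss(learner, target, eps=1e-12):
    """Mean over nodes of KL(target || learner), an assumed choice of distance."""
    total = 0.0
    for i, t_dist in target.items():
        total += sum(
            t * math.log((t + eps) / (q + eps))
            for t, q in zip(t_dist, learner[i])
        )
    return total / len(target)


def alignahead_pairs(num_layers):
    """Pair learner layer l with peer layer l+1; the wrap-around is an assumption."""
    return [(l, (l + 1) % num_layers) for l in range(num_layers)]


def alignahead_loss(learner_layers, peer_layers, neighbors):
    """Average structure loss over all cross-layer pairs.

    The peer's structure acts as a fixed target for this step; in an
    alternating procedure the two students would swap roles on the next step.
    """
    pairs = alignahead_pairs(len(learner_layers))
    losses = [
        kl_structure_loss(
            local_structure(learner_layers[a], neighbors),
            local_structure(peer_layers[b], neighbors),
        )
        for a, b in pairs
    ]
    return sum(losses) / len(losses)


# Toy graph: node 0 is connected to nodes 1 and 2.
neighbors = {0: [1, 2], 1: [0], 2: [0]}
layer_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
layer_b = [[0.0, 0.0], [3.0, 0.0], [0.0, 1.0]]

# Student 1 has layers [a, b], student 2 has layers [a, b]: layer 0 of one is
# compared with layer 1 of the other, so the loss is positive.
loss = alignahead_loss([layer_a, layer_b], [layer_a, layer_b], neighbors)
```

With this pairing, each alternating step passes structure from a deeper peer layer to a shallower learner layer, so over repeated role swaps every layer's structure can reach every other layer. That is one way to read the abstract's claim that structure information spreads over all layers.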
Taxonomy
Topics: Advanced Graph Neural Networks · Brain Tumor Detection and Classification · Advanced Neural Network Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Knowledge Distillation
