Teach Harder, Learn Poorer: Rethinking Hard Sample Distillation for GNN-to-MLP Knowledge Distillation
Lirong Wu, Yunfan Liu, Haitao Lin, Yufei Huang, Stan Z. Li

TL;DR
This paper introduces a hardness-aware distillation framework for GNN-to-MLP knowledge transfer, effectively decoupling and estimating knowledge and distillation hardness to improve performance without extra learnable parameters.
Contribution
It proposes a novel HGMD framework that decouples knowledge hardness and distillation hardness, using a non-parametric approach for improved GNN-to-MLP knowledge distillation.
Findings
HGMD outperforms state-of-the-art methods on seven datasets.
HGMD-mixup improves over vanilla MLPs by 12.95%, averaged over the seven datasets.
The approach effectively decouples hardness, enhancing distillation quality.
Abstract
To bridge the gap between powerful Graph Neural Networks (GNNs) and lightweight Multi-Layer Perceptrons (MLPs), GNN-to-MLP Knowledge Distillation (KD) proposes to distill knowledge from a well-trained teacher GNN into a student MLP. In this paper, we revisit the knowledge samples (nodes) in teacher GNNs from the perspective of hardness, and identify that hard sample distillation may be a major performance bottleneck of existing graph KD algorithms. GNN-to-MLP KD involves two different types of hardness: a student-free knowledge hardness, which describes the inherent complexity of GNN knowledge, and a student-dependent distillation hardness, which describes the difficulty of teacher-to-student distillation. However, most existing work focuses on only one of these aspects or treats them as the same thing. This paper proposes a simple yet effective Hardness-aware GNN-to-MLP…
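The distinction between the two kinds of hardness can be made concrete with a small sketch. The truncated abstract does not say how either quantity is estimated, so the proxies below are assumptions chosen only for illustration: the entropy of the teacher's softmax output stands in for knowledge hardness (it needs no student), and the KL divergence from teacher to student stands in for distillation hardness (it changes as the student learns). The function names are hypothetical and are not taken from the paper's code.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of per-class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def knowledge_hardness(teacher_logits):
    """Student-free proxy (assumption): entropy of the teacher's prediction.

    A node the teacher is unsure about has a flatter distribution and
    therefore a higher entropy.
    """
    probs = softmax(teacher_logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def distillation_hardness(teacher_logits, student_logits, temperature=1.0):
    """Student-dependent proxy (assumption): KL(teacher || student).

    The value is zero when the student already matches the teacher on
    this node and grows as the two distributions diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Toy logits for one node with three classes (made-up values).
confident_teacher = [8.0, 0.0, 0.0]
uncertain_teacher = [1.0, 0.9, 1.1]
lagging_student = [0.5, 0.4, 0.6]

easy_knowledge = knowledge_hardness(confident_teacher)
hard_knowledge = knowledge_hardness(uncertain_teacher)
matched = distillation_hardness(confident_teacher, confident_teacher)
mismatched = distillation_hardness(confident_teacher, lagging_student)
```

In this toy setup the first quantity depends only on the teacher, while the second can be recomputed at every training step as the student's logits change, which is the sense in which the two notions are decoupled.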
Taxonomy
Topics: Natural Language Processing Techniques · Neural Networks and Applications · Geophysical Methods and Applications
Methods: Knowledge Distillation
