Efficient Graph Knowledge Distillation from GNNs to Kolmogorov-Arnold Networks via Self-Attention Dynamic Sampling

Can Cui; Zilong Fu; Penghe Huang; Yuanyuan Li; Wu Deng; Dongyan Li

arXiv:2509.00560 · cs.LG · February 10, 2026

TL;DR

This paper introduces SA-DSD, a novel self-attention-guided dynamic sampling framework for knowledge distillation from GNNs to Kolmogorov-Arnold Networks, significantly reducing computational costs while improving performance on graph tasks.

Contribution

The paper presents what the authors describe as the first use of an enhanced Kolmogorov-Arnold Network (KAN) as the student model, and introduces FR-KAN+, a Fourier KAN with learnable frequency bases and phase shifts, advancing lightweight graph learning methods.

Findings

01. SA-DSD outperforms three GNN teachers by up to 3.62% in accuracy.

02. SA-DSD achieves a 16.69x parameter reduction and 55.75% faster training per epoch.

03. SA-DSD improves FR-KAN+ performance by 15.61%.

Abstract

The recent success of graph neural networks (GNNs) in modeling complex graph-structured data has fueled interest in deploying them on resource-constrained edge devices. However, their substantial computational and memory demands remain a challenge. Knowledge distillation (KD) from GNNs to MLPs offers a lightweight alternative, but MLPs are limited by fixed activations and the absence of neighborhood aggregation, which constrains distilled performance. To tackle these intertwined limitations, we propose SA-DSD, a novel self-attention-guided dynamic sampling distillation framework. To the best of our knowledge, this is the first work to employ an enhanced Kolmogorov-Arnold Network (KAN) as the student model. We enhance Fourier KAN into FR-KAN+ with learnable frequency bases, phase shifts, and optimized algorithms, substantially improving nonlinear fitting capability over MLPs while…
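
To make the idea of a Fourier KAN with learnable frequency bases and phase shifts more concrete, the following is a minimal NumPy sketch of the forward pass of one such layer. It is an illustration under stated assumptions, not the authors' FR-KAN+ implementation: the function names (`init_fourier_kan`, `fourier_kan_forward`), the parameter layout, and the initialization are all hypothetical. The assumed form is that each input feature x_i passes through k sinusoids cos(w_ik * x_i + p_ik) and sin(w_ik * x_i + p_ik), where the frequencies w and phases p would be trained alongside the coefficients.

```python
import numpy as np


def init_fourier_kan(d_in, d_out, k, rng):
    """Create parameters for one Fourier-style KAN layer (hypothetical layout).

    freqs and phases have shape (d_in, k); in a trainable version they would
    be updated by gradient descent together with the coefficients.
    """
    scale = 1.0 / np.sqrt(d_in * k)
    return {
        # Start from the integer harmonics 1..k used by a plain Fourier basis.
        "freqs": np.tile(np.arange(1, k + 1, dtype=float), (d_in, 1)),
        # Zero phase shift initially.
        "phases": np.zeros((d_in, k)),
        "coef_cos": rng.normal(size=(d_out, d_in, k)) * scale,
        "coef_sin": rng.normal(size=(d_out, d_in, k)) * scale,
        "bias": np.zeros(d_out),
    }


def fourier_kan_forward(x, params):
    """Forward pass. x has shape (n, d_in); the result has shape (n, d_out)."""
    # angle[n, i, k] = freqs[i, k] * x[n, i] + phases[i, k]
    angle = x[:, :, None] * params["freqs"][None] + params["phases"][None]
    # Sum the per-edge univariate functions over input features i and basis index k.
    out = np.einsum("nik,oik->no", np.cos(angle), params["coef_cos"])
    out = out + np.einsum("nik,oik->no", np.sin(angle), params["coef_sin"])
    return out + params["bias"]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_fourier_kan(d_in=4, d_out=3, k=5, rng=rng)
    x = rng.normal(size=(8, 4))          # 8 nodes with 4 features each
    print(fourier_kan_forward(x, params).shape)
```

Unlike an MLP with a fixed activation, every input-output edge here carries its own learnable univariate function, which is the property the abstract credits for the improved nonlinear fitting capability. The sketch covers only the student layer; the self-attention-guided sampling and the distillation loss are not shown.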

Taxonomy

Topics: Advanced Graph Neural Networks · Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling