DLink: Distilling Layer-wise and Dominant Knowledge from EEG Foundation Models
Jingyuan Wang, Zhihao Jia, Chenyu Liu, Xinliang Zhou, Haoran Luo, Ziyu Jia, Yong Li, Fang Li, Junfeng Yao, Yi Ding

TL;DR
This paper introduces DLink, a spectrally guided distillation framework that transfers knowledge from EEG foundation models (EFMs) into compact students, reducing inference cost while keeping decoding performance high.
Contribution
DLink employs layer-wise routing and spectral alignment to distill rich intermediate representations from EEG foundation models into smaller, efficient models.
Findings
DLink improves compact student performance on EEG benchmarks.
Compared with retaining the full EFM as the inference backbone, it substantially reduces parameters, FLOPs, and inference latency.
DLink narrows the gap between lightweight models and full EFMs.
Abstract
EEG foundation models (EFMs) achieve strong cross-subject and cross-task generalization through large-scale pretraining and downstream fine-tuning. Through empirical analysis, we observe that (i) task-adapted EFMs provide strong decoding performance but incur substantial overhead when retained as inference backbones, making knowledge distillation a natural route for optimizing compact students; and (ii) direct distillation from a fixed teacher representation underutilizes EFM knowledge, as task-discriminative information is distributed across intermediate layers rather than concentrated in the final layer. These observations motivate DLink (Distilling Layer-wise and Dominant Knowledge), a spectrally guided distillation framework with input-conditioned layer routing for transferring EFM knowledge into compact students. DLink uses a lightweight router to aggregate teacher layers for each…
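The input-conditioned layer routing described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the router's architecture or the distillation loss, so everything concrete here is an assumption made for illustration: a linear scoring function per teacher layer, softmax gates, a weighted sum of layer features as the distillation target, and a mean-squared-error loss. The names `route_teacher_layers`, `router_weights`, and `distill_loss` are hypothetical, and the spectral guidance component is deliberately left out because its form is not given here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_teacher_layers(layer_feats, router_weights):
    """Aggregate teacher layer features for ONE input (hypothetical router).

    layer_feats:    list of L feature vectors, one per teacher layer, each of length D.
    router_weights: list of L weight vectors of length D (assumed linear scorer).
    Returns (gates, target): softmax gates over layers and the gated feature sum.
    """
    # Each layer is scored from its own features, so the gates depend on the input.
    scores = [
        sum(w * f for w, f in zip(w_l, f_l))
        for w_l, f_l in zip(router_weights, layer_feats)
    ]
    gates = softmax(scores)
    dim = len(layer_feats[0])
    # Distillation target: gate-weighted sum of the per-layer features.
    target = [sum(g * f[d] for g, f in zip(gates, layer_feats)) for d in range(dim)]
    return gates, target


def distill_loss(student_feat, target):
    """Mean squared error between student features and the routed target (assumed loss)."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, target)) / len(target)


# Toy example: two teacher layers, two-dimensional features.
layer_feats = [[1.0, 0.0], [0.0, 1.0]]
uniform_router = [[0.0, 0.0], [0.0, 0.0]]    # equal scores, so equal gates
first_layer_router = [[10.0, 0.0], [0.0, 0.0]]  # strongly favors layer 0 for this input

gates_u, target_u = route_teacher_layers(layer_feats, uniform_router)
gates_f, target_f = route_teacher_layers(layer_feats, first_layer_router)
```

With the uniform router the target is the plain average of the two layers; with the second router nearly all of the weight goes to the first layer. In a real system the router and student would be trained jointly and the features would be tensors rather than Python lists.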
