A Closer Look at Knowledge Distillation in Spiking Neural Network Training
Xu Liu, Na Xia, Jinxing Zhou, Jingyuan Xu, Dan Guo

TL;DR
This paper introduces two knowledge distillation strategies, SAMD and NLD, that improve the training of spiking neural networks by accounting for their intrinsic differences from artificial neural networks; the authors report better performance as a result.
Contribution
The paper proposes SAMD and NLD, two KD methods that align a student SNN with a pre-trained teacher ANN while accounting for the gap between the ANN's continuous outputs and the SNN's sparse, discrete outputs.
Findings
SAMD improves semantic alignment between SNNs and ANNs.
NLD facilitates smoother logits matching, boosting SNN accuracy.
Experimental results show significant performance gains across datasets.
Abstract
Spiking Neural Networks (SNNs) have become popular due to their excellent energy efficiency, yet they remain difficult to train effectively. Recent work addresses this by introducing knowledge distillation (KD) techniques, with pre-trained artificial neural networks (ANNs) used as teachers and the target SNNs as students. This is commonly accomplished through a straightforward element-wise alignment of intermediate features and prediction logits from ANNs and SNNs, which often neglects the intrinsic differences between their architectures. Specifically, ANN outputs exhibit a continuous distribution, whereas SNN outputs are characterized by sparsity and discreteness. To mitigate this issue, we introduce two innovative KD strategies. First, we propose the Saliency-scaled Activation Map Distillation (SAMD), which aligns the spike activation map of the student SNN with the class-aware…
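To make the baseline concrete, here is a minimal Python sketch of the straightforward element-wise alignment that the abstract describes as common practice. It is not the paper's SAMD or NLD. All function names, the temperature value, and the use of rate decoding (averaging binary spikes over time steps) are assumptions chosen for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logits_kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 scaling follows the usual KD convention; the temperature
    value here is an assumption, not taken from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


def feature_mse_loss(teacher_feat, student_feat):
    """Element-wise mean squared error between two feature vectors."""
    if len(teacher_feat) != len(student_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(t - s) ** 2 for t, s in zip(teacher_feat, student_feat)]
    return sum(squared) / len(squared)


def rate_decode(spike_trains):
    """Average binary spikes over time steps to get per-neuron firing rates.

    spike_trains is a list of T time steps, each a list of 0/1 spikes.
    (Rate decoding is an assumed readout for this sketch.)
    """
    steps = len(spike_trains)
    width = len(spike_trains[0])
    return [sum(step[i] for step in spike_trains) / steps for i in range(width)]


# Hypothetical teacher features (continuous) and student spikes (binary, T = 4).
teacher_feat = [0.73, 0.12, 0.05, 0.41]
student_spikes = [
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
student_feat = rate_decode(student_spikes)  # values limited to multiples of 1/4
feature_loss = feature_mse_loss(teacher_feat, student_feat)
logit_loss = logits_kd_loss([2.0, 0.5, -1.0], [3.0, 0.0, 0.0])
```

With four time steps the student's decoded features can only take values in {0, 0.25, 0.5, 0.75, 1} and are often exactly zero, while the teacher's features are continuous. That is the mismatch the abstract points to when it says element-wise alignment neglects the differences between the two model types.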
Taxonomy
Topics: Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing · Ferroelectric and Negative Capacitance Devices
