TL;DR
This paper introduces an interpretable knowledge distillation method that uses PCA and graph neural networks to improve lightweight neural networks and visualize the embedding process.
Contribution
It proposes a novel interpretable embedding knowledge transfer method combining PCA and message passing neural networks for better distillation.
Findings
Improves student network accuracy on CIFAR100 by 2.28%, outperforming the SOTA method
Provides visual interpretability of the embedding procedure
Demonstrates effectiveness of the proposed KD method
Abstract
Knowledge distillation (KD) is one of the most useful techniques for light-weight neural networks. Although neural networks have a clear purpose of embedding datasets into the low-dimensional space, the existing knowledge was quite far from this purpose and provided only limited information. We argue that good knowledge should be able to interpret the embedding procedure. This paper proposes a method of generating interpretable embedding procedure (IEP) knowledge based on principal component analysis, and distilling it based on a message passing neural network. Experimental results show that the student network trained by the proposed KD method improves 2.28% in the CIFAR100 dataset, which is higher performance than the state-of-the-art (SOTA) method. We also demonstrate that the embedding procedure knowledge is interpretable via visualization of the proposed KD process. The implemented…
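To make the PCA step mentioned in the abstract concrete, here is a minimal sketch of projecting a batch of feature vectors onto their top principal components. It is not the authors' implementation: the function name `pca_reduce`, the random stand-in features, and the choice of `k=8` are all assumptions made for illustration, and the message passing distillation step is not covered.

```python
import numpy as np

def pca_reduce(features, k):
    """Project rows of `features` (n_samples, n_dims) onto the top-k principal components."""
    # Center each dimension so the SVD captures variance rather than the mean.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    # Fraction of total variance carried by each kept component.
    explained = (s[:k] ** 2) / np.sum(s ** 2)
    return centered @ components.T, components, explained

# Hypothetical stand-in for one layer's features: 128 samples, 64 dimensions.
rng = np.random.default_rng(0)
feats = rng.normal(size=(128, 64))
reduced, components, explained = pca_reduce(feats, k=8)
```

The returned `explained` ratios give a rough sense of how much of the layer's variance the low-dimensional view retains, which is one way such an embedding could be inspected visually.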
