Deep geometric knowledge distillation with graphs
Carlos Lassance, Myriam Bontonou, Ghouthi Boukli Hacene, Vincent Gripon, Jian Tang, Antonio Ortega

TL;DR
This paper introduces a graph-based relative knowledge distillation method that effectively transfers geometric information between neural networks of different sizes, improving accuracy under resource constraints.
Contribution
It proposes a novel graph-based RKD approach that captures latent space geometry, enabling dimension-agnostic knowledge transfer in deep learning models.
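To make the idea of dimension-agnostic transfer concrete, here is a minimal sketch, not the authors' implementation. It assumes each network's latent geometry is summarized by a cosine-similarity graph over the examples in a batch, and that the distillation term is the mean squared difference between the teacher's and the student's adjacency matrices. The function names, the similarity measure, and the loss form are all illustrative assumptions; the page does not specify them.

```python
import numpy as np


def cosine_similarity_graph(features):
    """Build a batch-by-batch adjacency matrix of cosine similarities.

    `features` has shape (batch, dim). The result has shape (batch, batch),
    so it does not depend on the latent dimension `dim`.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # avoid division by zero
    adjacency = unit @ unit.T
    np.fill_diagonal(adjacency, 0.0)  # drop self-loops
    return adjacency


def graph_distillation_loss(teacher_features, student_features):
    """Mean squared difference between the two similarity graphs (assumed loss)."""
    teacher_graph = cosine_similarity_graph(teacher_features)
    student_graph = cosine_similarity_graph(student_features)
    return float(np.mean((teacher_graph - student_graph) ** 2))


# Hypothetical latent representations for one batch of 8 examples.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 64))  # wider teacher latent space
student = rng.normal(size=(8, 16))  # narrower student latent space
loss = graph_distillation_loss(teacher, student)
```

Because both graphs are batch-by-batch matrices, the teacher and student may use different latent widths (64 and 16 here). In training, a term like this would typically be added to the student's usual task loss.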
Findings
Outperforms existing RKD methods on vision benchmarks
Enables effective knowledge distillation across different network sizes
Improves accuracy for the same computational budget
Abstract
In most cases, deep learning architectures are trained without regard to the number of operations or the energy they consume. However, some applications, such as embedded systems, can be resource-constrained during inference. A popular approach to reducing the size of a deep learning architecture consists of distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically, we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the…
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Human Pose and Action Recognition
Methods: Knowledge Distillation
