[Re] Distilling Knowledge via Knowledge Review
Apoorva Verma, Pranjal Gulati, Sarthak Gupta

TL;DR
This paper reproduces the results of a knowledge distillation framework that uses cross-level connections, residual learning, and feature fusion to improve student model performance, and analyzes the framework's robustness.
Contribution
It validates the original framework's improvements and provides new insights through ablation studies on the novel modules introduced.
Findings
Consistent verification of test accuracy improvements
Demonstrated effectiveness of the fusion module
Confirmed robustness of the review framework
Abstract
This effort aims to reproduce the experimental results and analyze the robustness of the review framework for knowledge distillation introduced in the CVPR '21 paper 'Distilling Knowledge via Knowledge Review' by Chen et al. Previous work in knowledge distillation studied only connection paths between the same levels of the student and the teacher; cross-level connection paths had not been considered. Chen et al. propose a new residual learning framework to train a single student layer using multiple teacher layers. They also design a novel fusion module to condense feature maps across levels, and a loss function that compares feature information stored across different levels, to improve performance. In this work, we consistently verify the improvements in test accuracy across student models as reported in the original paper and study the effectiveness of the novel modules…
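The cross-level idea in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: plain addition stands in for the learned fusion module, a multi-scale pooled mean squared error stands in for the loss function, all levels are assumed to share one channel count, and every function name here is hypothetical.

```python
import numpy as np


def avg_pool(x, out_size):
    """Average-pool a (C, H, W) map to (C, out_size, out_size).

    Assumes H and W are divisible by out_size.
    """
    c, h, w = x.shape
    blocks = x.reshape(c, out_size, h // out_size, out_size, w // out_size)
    return blocks.mean(axis=(2, 4))


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def multi_level_loss(student, teacher, sizes=(4, 2, 1)):
    """Compare two (C, H, W) maps at full resolution and at pooled resolutions.

    Assumption: a halving weight per coarser level; this is an illustrative
    stand-in, not the loss defined in the paper.
    """
    loss = np.mean((student - teacher) ** 2)
    weight, total_weight = 1.0, 1.0
    for size in sizes:
        if size >= student.shape[-1]:
            continue  # skip scales that would not reduce the resolution
        weight /= 2.0
        diff = avg_pool(student, size) - avg_pool(teacher, size)
        loss += weight * np.mean(diff ** 2)
        total_weight += weight
    return loss / total_weight


def review_loss(student_feats, teacher_feats):
    """Residual-style review over feature lists ordered shallow to deep.

    Starting from the deepest level, each student map is fused with the
    upsampled result of the deeper levels (plain addition here), and the
    fused map is compared with the teacher map at the same level.
    """
    fused = None
    total = 0.0
    for s, t in zip(reversed(student_feats), reversed(teacher_feats)):
        if fused is not None:
            factor = s.shape[-1] // fused.shape[-1]
            s = s + upsample_nearest(fused, factor)
        fused = s
        total += multi_level_loss(fused, t)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = [rng.normal(size=(3, 8, 8)), rng.normal(size=(3, 4, 4))]
    student = [rng.normal(size=(3, 8, 8)), rng.normal(size=(3, 4, 4))]
    print(review_loss(student, teacher))
```

Because the fused map at a shallow level already contains the deeper student features, the teacher's shallow feature map contributes to the training signal of deeper student layers, which is one way to read "a single student layer using multiple teacher layers".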
Taxonomy
Topics: Machine Learning and Data Classification · Online Learning and Analytics · Adversarial Robustness in Machine Learning
Methods: Knowledge Distillation
