LumiNet: Perception-Driven Knowledge Distillation via Statistical Logit Calibration
Md. Ismail Hossain, M M Lutfe Elahi, Sameera Ramasinghe, Ali Cheraghian, Fuad Rahman, Nabeel Mohammed, Shafin Rahman

TL;DR
LumiNet introduces a perception-driven approach to logit-based knowledge distillation: it calibrates logits according to the model's representation capability, which addresses overconfidence and lets it outperform feature-based methods on major benchmarks.
Contribution
The paper presents LumiNet, a novel logit-based distillation method that incorporates perception for better calibration and knowledge transfer, surpassing existing feature-based approaches.
Findings
Outperforms feature-based methods on CIFAR-100, ImageNet, and MSCOCO.
Achieves improvements of 1.5% and 2.05% over KD with ResNet18 and MobileNetV2, respectively, on ImageNet.
Effectively addresses overconfidence in logit-based distillation.
Abstract
In the knowledge distillation literature, feature-based methods have dominated due to their ability to effectively tap into extensive teacher models. In contrast, logit-based approaches, which aim to distill "dark knowledge" from teachers, typically exhibit inferior performance compared to feature-based methods. To bridge this gap, we present LumiNet, a novel knowledge distillation algorithm designed to enhance logit-based distillation. We introduce the concept of "perception", aiming to calibrate logits based on the model's representation capability. This concept addresses overconfidence issues in logit-based distillation while also introducing a novel method to distill knowledge from the teacher. It reconstructs the logits of each sample by considering its relationships with the other samples in the batch. LumiNet excels on benchmarks like CIFAR-100, ImageNet, and MSCOCO,…
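The abstract does not spell out how logits are reconstructed from batch relationships. As a minimal sketch, assuming "perception" means standardizing each class's logit across the batch (zero mean, unit variance per class) before the usual softmax and KL step, the idea could look like the following. The function names `batch_calibrate`, `softmax`, and `distillation_loss`, the `eps` stabilizer, and the temperature default are all hypothetical and not taken from the paper; plain Python lists are used only to keep the example dependency-free.

```python
import math


def batch_calibrate(logits, eps=1e-6):
    """Standardize each class column across the batch.

    Assumption: this per-class z-scoring is one plausible reading of
    'perception'; the paper's exact formula may differ.
    """
    n = len(logits)
    num_classes = len(logits[0])
    means = [sum(row[j] for row in logits) / n for j in range(num_classes)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in logits) / n
        for j in range(num_classes)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps)
         for j in range(num_classes)]
        for row in logits
    ]


def softmax(row, temperature=1.0):
    """Numerically stable softmax over one sample's logits."""
    scaled = [v / temperature for v in row]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Mean KL(teacher || student) over the batch, on calibrated logits."""
    t_cal = batch_calibrate(teacher_logits)
    s_cal = batch_calibrate(student_logits)
    total = 0.0
    for t_row, s_row in zip(t_cal, s_cal):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return total / len(teacher_logits)
```

Under this assumption, a teacher whose logits are merely a rescaled and shifted copy of the student's (for example, a more confident version of the same ranking) yields a loss close to zero, because the per-class standardization removes scale and offset. That is one way to see how batch-level calibration could reduce the effect of teacher overconfidence.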
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution · Batch Normalization · Inverted Residual Block · Average Pooling · Convolution · 1x1 Convolution · Knowledge Distillation
