Instance Temperature Knowledge Distillation
Zhengbo Zhang, Yuxi Zhou, Jia Gong, Jun Liu, Zhigang Tu

TL;DR
This paper introduces RLKD, a reinforcement learning-based method that dynamically adjusts the temperature of each instance during knowledge distillation. By accounting for the future returns of each adjustment rather than only its immediate benefit, it improves student training, and it applies to both image classification and object detection.
Contribution
Proposes a novel reinforcement learning framework for instance temperature adjustment in knowledge distillation, incorporating a new state representation and reward calibration.
Findings
Effective temperature adjustment improves student network performance.
Applicable to various tasks like image classification and object detection.
Enhanced learning efficiency through an exploration strategy.
Abstract
Knowledge distillation (KD) enhances the performance of a student network by allowing it to learn the knowledge transferred from a teacher network incrementally. Existing methods dynamically adjust the temperature to enable the student network to adapt to the varying learning difficulties at different learning stages of KD. KD is a continuous process, but when adjusting the temperature, these methods consider only the immediate benefits of the operation in the current learning phase and fail to take into account its future returns. To address this issue, we formulate the adjustment of temperature as a sequential decision-making task and propose a method based on reinforcement learning, termed RLKD. Importantly, we design a novel state representation to enable the agent to take more informed actions (i.e., instance temperature adjustments). To handle the problem of delayed rewards in our…
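To make "instance temperature" concrete, here is a minimal sketch of a distillation loss in which every training instance carries its own temperature. It assumes the classic temperature-scaled KL formulation of KD with the usual T² scaling; the function names are hypothetical, and the temperatures are plain inputs here. In RLKD an agent would choose them, and that agent, its state representation, and its reward are not modeled in this sketch.

```python
import math


def log_softmax_with_temperature(logits, temperature):
    """Return log-softmax of logits / temperature, computed stably."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def instance_kd_loss(student_logits, teacher_logits, temperatures):
    """Mean over instances of T^2 * KL(teacher || student).

    Each instance uses its own temperature T (hypothetical interface:
    one list of logits and one temperature per instance).
    """
    losses = []
    for s_logits, t_logits, temp in zip(student_logits, teacher_logits, temperatures):
        log_p = log_softmax_with_temperature(t_logits, temp)  # teacher
        log_q = log_softmax_with_temperature(s_logits, temp)  # student
        kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
        losses.append(temp * temp * kl)
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two instances with different temperatures (illustrative values only).
    student = [[1.0, 2.0, 3.0], [0.5, 0.1, 0.2]]
    teacher = [[3.0, 2.0, 1.0], [2.0, 0.0, 0.1]]
    print(instance_kd_loss(student, teacher, temperatures=[1.0, 4.0]))
```

A higher temperature flattens both distributions, so the student sees more of the teacher's relative preferences among non-target classes; a lower one sharpens them toward the top prediction.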
Taxonomy
Topics: Neural Networks and Applications
