Knowledge Distillation with Training Wheels
Guanlin Liu, Anand Ramachandran, Tanmay Gangwani, Yan Fu, Abhinav Sethy

TL;DR
This paper introduces a generalized framework for knowledge distillation that allows models to learn from teachers during training and selectively seek help at test-time, improving performance and flexibility in language tasks.
Contribution
It formulates knowledge distillation as an entropy-regularized optimization problem and develops a new algorithm using Path Consistency Learning and constrained reinforcement learning for test-time assistance.
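For readers unfamiliar with Path Consistency Learning, the sketch below shows the generic soft path-consistency residual from the entropy-regularized reinforcement learning literature. It is not the paper's distillation algorithm: the function names, the temperature `tau`, the discount `gamma`, and the toy rewards are all assumptions chosen for illustration. The idea is that, under an optimal entropy-regularized policy and value function, the residual along any sub-path is zero, so its square can serve as a training loss on both on-policy and off-policy paths.

```python
import math
from typing import Sequence


def path_consistency_residual(
    values: Sequence[float],
    rewards: Sequence[float],
    log_probs: Sequence[float],
    tau: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """Soft consistency residual along one sub-path of length d.

    values    : V(s_t), ..., V(s_{t+d})          (length d + 1)
    rewards   : r_t, ..., r_{t+d-1}              (length d)
    log_probs : log pi(a_t | s_t), ...           (length d)
    tau       : entropy-regularization temperature (assumed hyperparameter)
    gamma     : discount factor (assumed hyperparameter)
    """
    d = len(rewards)
    if len(values) != d + 1 or len(log_probs) != d:
        raise ValueError("expected len(values) == len(rewards) + 1 == len(log_probs) + 1")

    # -V(s_t) + gamma^d * V(s_{t+d})
    residual = -values[0] + (gamma ** d) * values[-1]
    # + sum_j gamma^j * (r_{t+j} - tau * log pi(a_{t+j} | s_{t+j}))
    for j in range(d):
        residual += (gamma ** j) * (rewards[j] - tau * log_probs[j])
    return residual


def pcl_loss(values, rewards, log_probs, tau=1.0, gamma=1.0) -> float:
    """Squared residual; minimizing it pushes the path toward consistency."""
    c = path_consistency_residual(values, rewards, log_probs, tau, gamma)
    return 0.5 * c * c


# Toy check: one decision step with two actions and a terminal value of 0.
# The soft-optimal value is tau * logsumexp(r / tau), and the soft-optimal
# policy satisfies log pi(a) = (r_a - V) / tau, so the residual vanishes.
toy_rewards = [1.0, 2.0]
toy_tau = 1.0
v_star = toy_tau * math.log(sum(math.exp(r / toy_tau) for r in toy_rewards))
toy_residuals = [
    path_consistency_residual([v_star, 0.0], [r], [(r - v_star) / toy_tau], tau=toy_tau)
    for r in toy_rewards
]
```

In the toy check, both entries of `toy_residuals` are zero up to floating-point error, while a mismatched value estimate or policy yields a nonzero residual and therefore a positive loss.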
Findings
Improved translation and summarization accuracy.
Enhanced control over teacher assistance during inference.
Unlocks new operating points beyond existing decoding methods.
Abstract
In generative language modeling, knowledge distillation is used to train a smaller student model with the help of a larger teacher model, resulting in improved capabilities for the student model. In this paper, we formulate a more general framework for knowledge distillation in which the student learns from the teacher during training and also learns to ask for the teacher's help at test time, following rules that specify test-time restrictions. Towards this, we first formulate knowledge distillation as an entropy-regularized value optimization problem. Adopting Path Consistency Learning to solve this leads to a new knowledge distillation algorithm using on-policy and off-policy demonstrations. We extend this using constrained reinforcement learning to a framework that incorporates the use of the teacher model as a test-time reference, within constraints. In this situation, akin to a…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Games and Gamification
Methods: Knowledge Distillation
