Knowledge Distillation with Adapted Weight

Sirong Wu, Xi Luo, Junjie Liu, and Yuhui Deng

arXiv:2501.02705 · cs.LG · January 7, 2025

TL;DR

This paper introduces KD-AIF, a knowledge distillation framework that uses influence functions to weight training data, enhancing model transparency, robustness, and performance in semi-supervised learning across multiple benchmarks.

Contribution

It proposes a novel influence-based weighting method for knowledge distillation that improves transparency, robustness, and generalization of student models.

Findings

01. KD-AIF outperforms existing methods on multiple benchmarks.

02. The influence weighting improves learning efficiency and model interpretability.

03. Enhanced semi-supervised learning performance with better data utilization.

Abstract

Although large models have shown a strong capacity to solve large-scale problems in many areas including natural language and computer vision, their voluminous parameters are hard to deploy in a real-time system due to computational and energy constraints. Addressing this, knowledge distillation through Teacher-Student architecture offers a sustainable pathway to compress the knowledge of large models into more manageable sizes without significantly compromising performance. To enhance the robustness and interpretability of this framework, it is critical to understand how individual training data impact model performance, which is an area that remains underexplored. We propose the Knowledge Distillation with Adaptive Influence Weight (KD-AIF) framework which leverages influence functions from robust statistics to assign weights to training data, grounded in the four key SAFE…
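To make the idea of per-example weighting in a distillation objective concrete, here is a minimal sketch in plain Python. It is not the KD-AIF method: the abstract does not say how the influence-based weights are computed or applied, so `weights` is a hypothetical placeholder supplied by the caller. The loss form is also an assumption, namely the common temperature-scaled mix of teacher-student KL divergence and hard-label cross-entropy. The names `example_loss`, `weighted_distillation_loss`, `temperature`, and `alpha` are illustrative only.

```python
import math

TINY = 1e-12  # floor for probabilities before taking logs


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def example_loss(student_logits, teacher_logits, label,
                 temperature=2.0, alpha=0.5):
    """Assumed per-example loss: soft-target KL plus hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    kl = sum(
        pt * (math.log(max(pt, TINY)) - math.log(max(ps, TINY)))
        for pt, ps in zip(p_teacher, p_student)
    )
    # Cross-entropy of the student against the ground-truth label.
    ce = -math.log(max(softmax(student_logits)[label], TINY))
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


def weighted_distillation_loss(batch, weights, temperature=2.0, alpha=0.5):
    """Weighted mean of per-example losses.

    batch   : list of (student_logits, teacher_logits, label) tuples
    weights : hypothetical per-example weights, standing in for whatever
              an influence-based scheme would assign (assumption)
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        w * example_loss(s, t, y, temperature, alpha)
        for w, (s, t, y) in zip(weights, batch)
    )
    return weighted / total_weight
```

In this sketch an example with weight 0 contributes nothing to the objective, and larger weights pull the student harder toward the teacher on those examples. How weights should actually be derived from influence scores is left to the paper itself.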

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning

Methods: Knowledge Distillation