An Empirical Study of Knowledge Distillation for Code Understanding Tasks

Ruiqi Wang; Zezhou Yang; Cuiyun Gao; Xin Xia; Qing Liao

arXiv:2508.15423 · cs.SE · August 22, 2025

TL;DR

This study systematically evaluates knowledge distillation (KD) techniques for code understanding. It shows that feature-based methods significantly improve student model performance at a much smaller parameter count, and that code-specific PLMs are particularly effective.

Contribution

The paper provides a comprehensive empirical analysis of KD methods for code understanding, highlighting the effectiveness of feature-based KD and offering insights into model architecture choices.

Findings

01. KD boosts performance across various student models.

02. With feature-based KD, student models retain up to 98% of teacher performance with only 5% of the parameters.

03. A student architecture similar to the teacher's does not guarantee better results.
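
To make the term "feature-based KD" in the findings above concrete, here is a minimal sketch of one generic formulation: the student's hidden vector is projected into the teacher's hidden size and penalized by mean squared error. This is an illustration only; the specific feature-based methods evaluated in the paper are not described on this page, and the names `project`, `feature_kd_loss`, and `weight` are hypothetical.

```python
def project(vector, weight):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def feature_kd_loss(student_hidden, teacher_hidden, weight):
    """Mean squared error between the projected student features and the
    teacher features (a generic hidden-state matching loss, assumed here)."""
    projected = project(student_hidden, weight)
    if len(projected) != len(teacher_hidden):
        raise ValueError("projection must map to the teacher's hidden size")
    squared = [(p - t) ** 2 for p, t in zip(projected, teacher_hidden)]
    return sum(squared) / len(squared)


# Toy sizes: the student has 2 hidden units, the teacher has 3.
# In practice the projection would be learned jointly with the student.
toy_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_loss = feature_kd_loss([1.0, 2.0], [1.0, 2.0, 5.0], toy_weight)
```

The projection is what allows a narrow student to be compared with a wider teacher; without it, the two hidden vectors would have different lengths and no element-wise loss would be defined.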

Abstract

Pre-trained language models (PLMs) have emerged as powerful tools for code understanding. However, deploying these PLMs in large-scale applications faces practical challenges due to their computational intensity and inference latency. Knowledge distillation (KD), a promising model compression and acceleration technique, addresses these limitations by transferring knowledge from large teacher models to compact student models, enabling efficient inference while preserving most of the teacher models' capabilities. While this technique has shown remarkable success in natural language processing and computer vision domains, its potential for code understanding tasks remains largely underexplored. In this paper, we systematically investigate the effectiveness and usage of KD in code understanding tasks. Our study encompasses two popular types of KD methods, i.e., logit-based and…
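
As an illustration of the logit-based family named in the abstract, the following is a minimal sketch of the classic temperature-softened distillation loss. It assumes the standard formulation (KL divergence between softened teacher and student distributions, blended with cross-entropy on the true label); the paper's exact losses are not given in this excerpt, and the names `logit_kd_loss`, `temperature`, and `alpha` are assumptions made for the example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def logit_kd_loss(student_logits, teacher_logits, label,
                  temperature=2.0, alpha=0.5):
    """Blend of a soft-target term and a hard-label term.

    soft: KL(teacher || student) on temperature-softened distributions,
          scaled by temperature**2 as in the classic formulation.
    hard: cross-entropy of the student against the true label.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls)
               for lt, ls in zip(log_p_teacher, log_p_student))
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


# Toy example: a 3-class task where the teacher is confident in class 0.
example_loss = logit_kd_loss([2.0, 1.5, 0.3], [4.0, 1.0, 0.5], label=0)
```

A higher `temperature` flattens both distributions so the student also learns from the teacher's relative scores on wrong classes, while `alpha` controls how much weight the teacher signal gets against the ground-truth label.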

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)