Smaller but Better: Self-Paced Knowledge Distillation for Lightweight yet Effective LCMs

Yujia Chen; Yang Ye; Zhongqi Li; Yuchi Ma; Cuiyun Gao

arXiv:2408.03680·cs.SE·May 21, 2025

TL;DR

This paper introduces SODA, a self-paced knowledge distillation framework for building lightweight yet effective large code models (LCMs). The resulting SodaCoder models outperform 15 larger LCMs, and SodaCoder-DS-6.7B surpasses ChatGPT on average Pass@1.

Contribution

The paper proposes a novel self-paced knowledge distillation method for developing lightweight code models. It contributes a new framework (SODA) and a series of models (SodaCoder) that outperform larger counterparts.

Findings

01. SODA improves student models by 65.96% in Pass@1.

02. SodaCoder models outperform 15 larger LCMs.

03. SodaCoder-DS-6.7B surpasses ChatGPT on average Pass@1.
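
For readers unfamiliar with the metric, Pass@1 is the k = 1 case of pass@k: the probability that at least one of k generated solutions to a problem passes all of its tests, averaged over the benchmark. The sketch below uses the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c pass. This page does not say how the paper samples or scores generations, so the function names and the example counts here are illustrative assumptions, not the authors' evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples, c: samples that pass all tests,
    k: number of attempts the metric allows.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over a benchmark given (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts: four problems, one sample each (greedy decoding),
# two of which are solved.
example = [(1, 1), (1, 0), (1, 1), (1, 0)]
score = mean_pass_at_k(example, k=1)
```

With a single sample per problem, Pass@1 reduces to the fraction of problems solved, so `score` here is the share of solved problems in the hypothetical `example` list.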

Abstract

Large code models (LCMs) have remarkably advanced the field of code generation. Despite their impressive capabilities, they still face practical deployment issues, such as high inference costs, limited accessibility of proprietary LCMs, and adaptability issues of ultra-large LCMs. These issues highlight the critical need for more accessible, lightweight yet effective LCMs. Knowledge distillation (KD) offers a promising solution, which transfers the programming capabilities of larger, advanced LCMs to smaller, less powerful LCMs. In this paper, we propose a novel Self-Paced knOwledge DistillAtion framework, named SODA, aiming at developing lightweight yet effective student LCMs. SODA consists of three stages in one cycle: (1) the Correct-and-Fault Knowledge Delivery stage aims at improving the student model's capability to recognize errors while ensuring its basic programming skill during the…

Taxonomy

Topics: Innovative Teaching and Learning Methods