Curriculum Learning for Small Code Language Models

Marwa Naïr; Kamel Yamani; Lynda Said Lhadj; Riyadh Baghdadi

arXiv:2407.10194·cs.LG·July 16, 2024

TL;DR

This paper shows that a well-designed curriculum can significantly improve the accuracy of small code language models on code execution, with a less significant effect on code completion. The result contrasts with prior research suggesting that curriculum learning does not necessarily help language models.

Contribution

The paper introduces a novel curriculum learning schedule and a code difficulty assessment metric tailored to small code language models, and reports accuracy gains on code execution.

Findings

01. Curriculum learning improves code execution accuracy in small models.

02. Effect on code completion tasks is less significant.

03. Proposes a new code difficulty metric for curriculum design.

Abstract

Code language models have emerged as useful tools for various programming tasks, yet they often struggle when it comes to complex ones. In this paper, we explore the potential of curriculum learning in enhancing the performance of these models. While prior research has suggested that curriculum learning does not necessarily help in improving the performance of language models, our results surprisingly show that this may not be the case for code language models. We demonstrate that a well-designed curriculum learning approach significantly improves the accuracy of small decoder-only code language models on the task of code execution, while its effect on code completion is less significant. To explore the potential of curriculum learning, we train multiple GPT models with 1 million parameters each to predict the next token and evaluate them on code completion and execution tasks. Our…
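The abstract does not spell out the difficulty metric or the schedule, so the following is only a minimal sketch of the general idea of a curriculum: score each training snippet, sort from easy to hard, and split the data into stages that are presented in order. The `difficulty` function (a count of `if`/`for`/`while` nodes) and the `build_curriculum` helper are hypothetical stand-ins chosen for illustration, not the metric or schedule proposed in the paper.

```python
import ast


def difficulty(snippet: str) -> int:
    """Hypothetical difficulty proxy (an assumption, not the paper's metric):
    the number of if/for/while nodes in the parsed snippet."""
    tree = ast.parse(snippet)
    return sum(
        isinstance(node, (ast.If, ast.For, ast.While))
        for node in ast.walk(tree)
    )


def build_curriculum(snippets, n_stages=3):
    """Sort snippets from easy to hard and split them into ordered stages.

    Returns a list of stages; a training loop would consume stage 0 first,
    then stage 1, and so on. Fewer than n_stages stages are returned when
    there are fewer snippets than stages.
    """
    ranked = sorted(snippets, key=difficulty)  # stable sort, easy first
    size = max(1, -(-len(ranked) // n_stages))  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


if __name__ == "__main__":
    corpus = [
        "for i in range(3):\n    if i % 2:\n        print(i)",
        "x = 1\nprint(x)",
        "for i in range(3):\n    print(i)",
    ]
    for stage_index, stage in enumerate(build_curriculum(corpus)):
        print(f"stage {stage_index}: {len(stage)} snippet(s)")
```

In a real setup the stages would feed a next-token training loop, and the choice of how long to stay on each stage, or whether to mix earlier stages back in, is a design decision this sketch leaves open.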

Code & Models

Models

jblitzar/code-completion (Hugging Face model)

Videos

Curriculum Learning for Small Code Language Models · underline

Taxonomy

Topics: Model-Driven Software Engineering Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · Cosine Annealing · Layer Normalization · Linear Layer · Attention Dropout · Adam · Dropout · Weight Decay