CodeACT: Code Adaptive Compute-efficient Tuning Framework for Code LLMs

Weijie Lv; Xuan Xia; Sheng-Jun Huang

arXiv:2408.02193 · cs.CL · August 6, 2024

TL;DR

CodeACT is a framework that improves open-source code language models by selecting high-quality training data and optimizing training strategies, resulting in better performance and reduced computational costs.

Contribution

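The paper introduces the Complexity and Diversity Aware Sampling (CDAS) method for selecting training data and the Dynamic Pack padding strategy for reducing padding tokens. Together they improve training efficiency and model performance while using less data and fewer compute resources.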
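To make the idea of sampling by complexity and diversity concrete, here is a minimal sketch. It is not the authors' implementation: it assumes each sample already carries a cluster label (standing in for diversity) and a numeric complexity score, and how those are computed is not stated on this page. The function name, the field names, and the per-cluster `keep_fraction` are all hypothetical.

```python
import math
from collections import defaultdict


def select_by_complexity_and_diversity(samples, keep_fraction=0.4):
    """Keep the most complex samples from every cluster.

    Assumption: each sample is a dict with hypothetical keys
    'id', 'cluster' (diversity proxy) and 'complexity' (higher = harder).
    """
    # Group samples by cluster so every cluster is represented in the output.
    by_cluster = defaultdict(list)
    for sample in samples:
        by_cluster[sample["cluster"]].append(sample)

    selected = []
    for cluster_id in sorted(by_cluster):
        # Rank the cluster's members from most to least complex.
        members = sorted(
            by_cluster[cluster_id],
            key=lambda s: s["complexity"],
            reverse=True,
        )
        # Keep at least one sample per cluster to preserve coverage.
        keep = max(1, math.ceil(keep_fraction * len(members)))
        selected.extend(members[:keep])
    return selected


# Toy data: two clusters of five samples each, with made-up scores.
toy_samples = [
    {"id": f"c{c}-{i}", "cluster": c, "complexity": i / 10}
    for c in (0, 1)
    for i in range(1, 6)
]
chosen = select_by_complexity_and_diversity(toy_samples, keep_fraction=0.4)
```

Selecting within each cluster rather than globally is the design choice that keeps the subset diverse: a global top-k by complexity could draw every sample from a single cluster.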

Findings

01  8.6% performance increase on HumanEval

02  78% reduction in training time

03  27% decrease in peak GPU memory usage

Abstract

Large language models (LLMs) have shown great potential in code-related tasks, yet open-source models lag behind their closed-source counterparts. To bridge this performance gap, existing methods generate vast amounts of synthetic data for fine-tuning, leading to inefficiencies in training. Motivated by the need for more effective and efficient training, we propose the Code Adaptive Compute-efficient Tuning (CodeACT) framework. CodeACT introduces the Complexity and Diversity Aware Sampling (CDAS) method to select high-quality training data based on complexity and diversity, and the Dynamic Pack padding strategy to reduce computational resource usage by minimizing padding tokens during training. Experimental results demonstrate that CodeACT-DeepSeek-Coder-6.7B, fine-tuned on only 40% of the EVOL-Instruct data, achieves an 8.6% performance increase on HumanEval, reduces training time by…
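The padding idea in the abstract can be illustrated with a small sketch. This is only an approximation under stated assumptions, not the paper's Dynamic Pack algorithm: it packs several short sequences into one row with a greedy first-fit-decreasing heuristic, then pads each row only up to the longest packed row instead of a fixed maximum length. The function names and the token lengths are hypothetical.

```python
def pack_sequences(lengths, max_len):
    """Greedily pack sequence lengths into rows of at most max_len tokens.

    Assumption: first-fit-decreasing is used purely for illustration.
    """
    rows = []
    for n in sorted(lengths, reverse=True):
        if n > max_len:
            raise ValueError(f"sequence of length {n} exceeds max_len={max_len}")
        for row in rows:
            # Place the sequence in the first row that still has room.
            if sum(row) + n <= max_len:
                row.append(n)
                break
        else:
            # No existing row fits, so start a new one.
            rows.append([n])
    return rows


def padding_tokens(rows, pad_to):
    """Count the padding tokens needed to bring every row up to pad_to."""
    return sum(pad_to - sum(row) for row in rows)


# Hypothetical token lengths for five training samples.
lengths = [100, 900, 300, 700, 500]
max_len = 1024

# Baseline: one sample per row, each padded to the fixed maximum length.
naive_padding = padding_tokens([[n] for n in lengths], max_len)

# Packed: several samples per row, padded only to the longest packed row.
packed_rows = pack_sequences(lengths, max_len)
longest_row = max(sum(row) for row in packed_rows)
packed_padding = padding_tokens(packed_rows, longest_row)
```

Comparing `naive_padding` with `packed_padding` shows how many padding tokens the packed layout avoids on this toy batch. A real training pipeline would also need attention masks or position resets so that packed samples do not attend to each other, which this sketch omits.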

Code & Models

Repositories

kyle-lyu/codeact (PyTorch)

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies

Methods: Attentive Walk-Aggregating Graph Neural Network