MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning

Bingchang Liu; Chaoyu Chen; Cong Liao; Zi Gong; Huan Wang; Zhichao Lei; Ming Liang; Dajun Chen; Min Shen; Hailian Zhou; Hang Yu; Jianguo Li

arXiv:2311.02303·cs.LG·November 7, 2023·1 cites

Open Access · 1 Repo · 3 Models

TL;DR

MFTCoder is a multi-task fine-tuning framework for code LLMs that trains on multiple tasks simultaneously. It outperforms single-task fine-tuning and reports a HumanEval pass@1 score that surpasses GPT-4.

Contribution

The paper presents a multi-task fine-tuning method for code LLMs that improves performance and training efficiency. It uses several loss functions to handle data imbalance, varying task difficulty, and inconsistent convergence speeds, and it integrates with open-source models.

Findings

01

Outperforms single-task fine-tuning and ensemble methods.

02

Achieves 74.4% pass@1 on HumanEval, surpassing GPT-4.

03

Supports efficient training through data tokenization modes and parameter-efficient fine-tuning (PEFT).
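
For readers unfamiliar with the pass@1 metric cited in the findings, the sketch below shows the commonly used unbiased pass@k estimator for code-generation benchmarks. It is an illustration only, not code from the MFTCoder repository; the function names `pass_at_k` and `mean_pass_at_k` are hypothetical, and the page does not state how many samples per problem were drawn for the reported score.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass the unit tests
    k: budget of attempts being scored
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples are incorrect, every size-k subset
    # must contain at least one correct sample.
    if n - c < k:
        return 1.0
    # 1 minus the probability that k samples drawn without
    # replacement are all incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With one sample per problem (n = 1, k = 1), this reduces to the fraction of benchmark problems whose single generated solution passes its tests.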

Abstract

Code LLMs have emerged as a specialized research field, with remarkable studies dedicated to enhancing models' coding capabilities through fine-tuning on pre-trained models. Previous fine-tuning approaches were typically tailored to specific downstream tasks or scenarios, which meant separate fine-tuning for each task, requiring extensive training resources and posing challenges in terms of deployment and maintenance. Furthermore, these approaches failed to leverage the inherent interconnectedness among different code-related tasks. To overcome these limitations, we present a multi-task fine-tuning framework, MFTCoder, that enables simultaneous and parallel fine-tuning on multiple tasks. By incorporating various loss functions, we effectively address common challenges in multi-task learning, such as data imbalance, varying difficulty levels, and inconsistent convergence speeds.…
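
To make the data-imbalance point concrete, here is a minimal sketch of one way a loss can be balanced across tasks: token losses are averaged within each task first, and the per-task means are then averaged, so a task with many tokens cannot dominate the batch. This is an assumption-laden illustration in plain Python, not the exact loss used by MFTCoder; the names `task_balanced_loss` and `token_mean_loss` and the task labels are hypothetical.

```python
from collections import defaultdict


def token_mean_loss(token_losses: list[float]) -> float:
    """Baseline: plain mean over all tokens, regardless of task."""
    if not token_losses:
        raise ValueError("token_losses must not be empty")
    return sum(token_losses) / len(token_losses)


def task_balanced_loss(token_losses: list[float], task_ids: list[str]) -> float:
    """Average token losses within each task, then average across tasks."""
    if len(token_losses) != len(task_ids):
        raise ValueError("token_losses and task_ids must have equal length")
    if not token_losses:
        raise ValueError("token_losses must not be empty")
    sums: dict[str, float] = defaultdict(float)
    counts: dict[str, int] = defaultdict(int)
    for loss, task in zip(token_losses, task_ids):
        sums[task] += loss
        counts[task] += 1
    # Each task contributes one mean, so every task has equal weight.
    per_task_means = [sums[t] / counts[t] for t in sums]
    return sum(per_task_means) / len(per_task_means)


# Hypothetical batch: four tokens from a large task, one from a small task.
losses = [1.0, 1.0, 1.0, 1.0, 3.0]
tasks = ["completion"] * 4 + ["text2code"]
plain = token_mean_loss(losses)               # dominated by the large task
balanced = task_balanced_loss(losses, tasks)  # both tasks weighted equally
```

In this toy batch the plain token mean is pulled toward the larger task, while the balanced version gives the small task the same say in the gradient signal.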

Code & Models

Repositories

codefuse-ai/mftcoder
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Natural Language Processing Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Adam · Layer Normalization · Label Smoothing · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Linear Layer · Byte Pair Encoding