MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning
Bingchang Liu, Chaoyu Chen, Cong Liao, Zi Gong, Huan Wang, Zhichao Lei, Ming Liang, Dajun Chen, Min Shen, Hailian Zhou, Hang Yu, Jianguo Li

TL;DR
MFTCoder is a multi-task fine-tuning framework for code LLMs that trains on multiple tasks simultaneously rather than fine-tuning a separate model per task. It outperforms single-task fine-tuning and reports a HumanEval pass@1 score above GPT-4's.
Contribution
The paper presents a multi-task fine-tuning method for code LLMs that improves performance and training efficiency. It uses several loss functions to handle data imbalance, varying task difficulty, and inconsistent convergence speeds across tasks, and it integrates with open-source models.
Findings
Outperforms both single-task fine-tuning and fine-tuning on a mixed ensemble of tasks.
Achieves 74.4% pass@1 on HumanEval, surpassing GPT-4.
Offers efficient training through efficient data tokenization modes and parameter-efficient fine-tuning (PEFT).
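For readers unfamiliar with the pass@1 metric cited above, the following is a minimal sketch of the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass the unit tests. The function name `pass_at_k` is illustrative, and this page does not state how many samples the paper generated per problem, so treat the sketch as background on the metric rather than as the paper's evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: sample budget being evaluated (k <= n)
    """
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# For k = 1 the estimator reduces to the fraction of passing samples, c / n.
score = pass_at_k(n=10, c=3, k=1)
```

A benchmark-level score is the mean of this per-problem value over all problems in the benchmark.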
Abstract
Code LLMs have emerged as a specialized research field, with remarkable studies dedicated to enhancing models' coding capabilities through fine-tuning on pre-trained models. Previous fine-tuning approaches were typically tailored to specific downstream tasks or scenarios, which meant separate fine-tuning for each task, requiring extensive training resources and posing challenges in terms of deployment and maintenance. Furthermore, these approaches failed to leverage the inherent interconnectedness among different code-related tasks. To overcome these limitations, we present a multi-task fine-tuning framework, MFTCoder, that enables simultaneous and parallel fine-tuning on multiple tasks. By incorporating various loss functions, we effectively address common challenges in multi-task learning, such as data imbalance, varying difficulty levels, and inconsistent convergence speeds.…
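The abstract mentions loss functions that address data imbalance across tasks. One common way to do this, sketched below under stated assumptions, is to average the token losses within each task first and then average across tasks, so that a task with many tokens does not dominate the gradient. The function names, the batch layout, and the task labels are hypothetical, and the paper's exact formulation may differ.

```python
from collections import defaultdict


def token_mean_loss(samples):
    """Naive baseline: average over all tokens, regardless of task."""
    total = sum(sum(s["token_losses"]) for s in samples)
    count = sum(len(s["token_losses"]) for s in samples)
    return total / count


def task_balanced_loss(samples):
    """Average within each task first, then average across tasks.

    Every task contributes equally to the result, however many
    tokens it has in the batch (an illustrative balancing scheme).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for s in samples:
        sums[s["task"]] += sum(s["token_losses"])
        counts[s["task"]] += len(s["token_losses"])
    per_task = [sums[t] / counts[t] for t in sums]
    return sum(per_task) / len(per_task)


# Hypothetical batch: a token-heavy, easy task and a small, harder task.
batch = [
    {"task": "completion", "token_losses": [1.0] * 6},
    {"task": "unit_test", "token_losses": [3.0] * 2},
]

naive = token_mean_loss(batch)        # dominated by the larger task
balanced = task_balanced_loss(batch)  # both tasks weighted equally
```

In this toy batch the naive mean is pulled toward the larger task's loss, while the balanced variant gives the smaller, harder task the same weight.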
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Adam · Layer Normalization · Label Smoothing · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Linear Layer · Byte Pair Encoding
