CodeT5+: Open Code Large Language Models for Code Understanding and Generation
Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi

TL;DR
CodeT5+ introduces a flexible encoder-decoder architecture with diverse pretraining objectives, enabling superior performance across a wide range of code understanding and generation tasks, especially when instruction-tuned.
Contribution
The paper presents a novel encoder-decoder code LLM family with mixed pretraining objectives and efficient initialization, achieving state-of-the-art results on multiple benchmarks.
Findings
State-of-the-art performance on code generation and completion tasks.
Effective instruction-tuning improves task-specific results.
Flexible architecture benefits various downstream code tasks.
Abstract
Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence. However, existing code LLMs have two main limitations in terms of architecture and pretraining tasks. First, they often adopt a specific architecture (encoder-only or decoder-only) or rely on a unified encoder-decoder network for different downstream tasks. The former paradigm is limited by inflexibility in applications, while in the latter the model is treated as a single system for all tasks, leading to suboptimal performance on a subset of tasks. Second, they often employ a limited set of pretraining objectives which might not be relevant to some downstream tasks and hence result in substantial performance degradation. To address these limitations, we propose "CodeT5+", a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit…
Code & Models
- 🤗 Salesforce/codet5p-220m (model) · 15k dl · ♡ 33
- 🤗 Salesforce/codet5p-770m (model) · 3.1k dl · ♡ 20
- 🤗 Salesforce/codet5p-770m-py (model) · 95 dl · ♡ 20
- 🤗 Salesforce/codet5p-220m-py (model) · 234 dl · ♡ 15
- 🤗 Salesforce/instructcodet5p-16b (model) · 83 dl · ♡ 58
- 🤗 Salesforce/codet5p-6b (model) · 258 dl · ♡ 15
- 🤗 Salesforce/codet5p-16b (model) · 42 dl · ♡ 66
- 🤗 Salesforce/codet5p-2b (model) · 586 dl · ♡ 34
- 🤗 michaelfeil/ct2fast-codet5p-770m (model) · 23 dl · ♡ 4
- 🤗 michaelfeil/ct2fast-codet5p-770m-py (model) · 20 dl · ♡ 6
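The Salesforce checkpoints listed above can probably be loaded with the Hugging Face `transformers` library. The following is a minimal sketch, not the authors' reference usage. The checkpoint names come from the list; the helper names `loader_kwargs` and `complete` are hypothetical. It is an assumption here that the 2B, 6B, and 16B checkpoints need `trust_remote_code=True` while the 220M and 770M ones load as ordinary seq2seq models. The `ct2fast-*` conversions target a different runtime and are deliberately rejected.

```python
# Checkpoint names are copied from the list above; helper names are illustrative.
SMALL_CHECKPOINTS = {
    "Salesforce/codet5p-220m",
    "Salesforce/codet5p-770m",
    "Salesforce/codet5p-220m-py",
    "Salesforce/codet5p-770m-py",
}
LARGE_CHECKPOINTS = {
    "Salesforce/codet5p-2b",
    "Salesforce/codet5p-6b",
    "Salesforce/codet5p-16b",
    "Salesforce/instructcodet5p-16b",
}


def loader_kwargs(checkpoint: str) -> dict:
    """Return keyword arguments to pass to from_pretrained for a checkpoint.

    Assumption: the larger checkpoints ship custom modeling code and need
    trust_remote_code=True; the smaller ones need no extra arguments.
    """
    if checkpoint in SMALL_CHECKPOINTS:
        return {}
    if checkpoint in LARGE_CHECKPOINTS:
        return {"trust_remote_code": True}
    raise ValueError(f"not a checkpoint this sketch knows how to load: {checkpoint}")


def complete(checkpoint: str, prompt: str, max_new_tokens: int = 32) -> str:
    """Download a checkpoint and generate a continuation for `prompt`.

    Written with the small checkpoints in mind; the larger ones may need
    additional inputs or dtype settings that this sketch does not cover.
    """
    # Imported lazily so loader_kwargs works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(
        checkpoint, **loader_kwargs(checkpoint)
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Salesforce/codet5p-220m", "def print_hello_world():")` would download the smallest checkpoint and return its generated text. Output varies with the checkpoint and generation settings, so none is shown here.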
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: ALIGN
