Code Llama: Open Foundation Models for Code

Baptiste Rozière; Jonas Gehring; Fabian Gloeckle; Sten Sootla; Itai Gat; Xiaoqing Ellen Tan; Yossi Adi; Jingyu Liu; Romain Sauvestre; Tal Remez; Jérémy Rapin; Artyom Kozhevnikov; Ivan Evtimov; Joanna Bitton; Manish Bhatt; Cristian Canton Ferrer; Aaron Grattafiori; Wenhan Xiong; Alexandre Défossez; Jade Copet; Faisal Azhar; Hugo Touvron; Louis Martin; Nicolas Usunier; Thomas Scialom; Gabriel Synnaeve

arXiv:2308.12950·cs.CL·February 2, 2024·393 cites

Open Access · 2 Repos · 10 Models · 2 Datasets

TL;DR

Code Llama introduces a family of open large language models for code, offering state-of-the-art performance among open models, infilling, support for long input contexts, and instruction following across several sizes and specializations.

Contribution

The paper presents a new family of open models for code with improved performance, infilling, and large context handling, including Python-specific and instruction-following variants.

Findings

01

Achieves up to 67% on the HumanEval benchmark.

02

Code Llama - Python 7B outperforms Llama 2 70B on HumanEval and MBPP.

03

All Code Llama models outperform every other publicly available model on MultiPL-E.

Abstract

We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama - Python), and instruction-following models (Code Llama - Instruct) with 7B, 13B, 34B and 70B parameters each. All models are trained on sequences of 16k tokens and show improvements on inputs with up to 100k tokens. 7B, 13B and 70B Code Llama and Code Llama - Instruct variants support infilling based on surrounding content. Code Llama reaches state-of-the-art performance among open models on several code benchmarks, with scores of up to 67% and 65% on HumanEval and MBPP, respectively. Notably,…

Taxonomy

Topics: Model-Driven Software Engineering Techniques · Advanced Database Systems and Queries · Semantic Web and Ontologies