Qwen2.5-Coder Technical Report
Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, Kai Dang, Yang Fan, Yichang Zhang, An Yang, Rui Men, Fei Huang, Bo Zheng, Yibo Miao, Shanghaoran Quan, Yunlong Feng, Xingzhang Ren, Xuancheng Ren

TL;DR
The Qwen2.5-Coder series introduces six code-specific models (0.5B to 32B parameters) further pretrained on over 5.5 trillion tokens, achieving state-of-the-art performance on more than 10 code benchmarks and supporting broader research and application in code intelligence.
Contribution
This report presents a new series of six models with improved code generation capabilities, trained on a corpus of over 5.5 trillion tokens and evaluated on code generation, completion, reasoning, and repair benchmarks, where they outperform previous models of similar size.
Findings
Achieved state-of-the-art results on over 10 code-related benchmarks.
Consistently outperformed models of the same size, and even larger ones.
Maintained general and math skills while excelling in code tasks.
Abstract
In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its predecessor, CodeQwen1.5. This series includes six models: Qwen2.5-Coder-(0.5B/1.5B/3B/7B/14B/32B). As a code-specific model, Qwen2.5-Coder is built upon the Qwen2.5 architecture and is further pretrained on a vast corpus of over 5.5 trillion tokens. Through meticulous data cleaning, scalable synthetic data generation, and balanced data mixing, Qwen2.5-Coder demonstrates impressive code generation capabilities while retaining general and math skills. These models have been evaluated on a wide range of code-related tasks, achieving state-of-the-art (SOTA) performance across more than 10 benchmarks, including code generation, completion, reasoning, and repair, consistently outperforming models of the same size and even larger ones. We believe that the release of the Qwen2.5-Coder series will advance research in…
Code & Models
- 🤗 Qwen/Qwen2.5-Coder-7B-Instruct · model · 2.5M downloads · ♡ 676
- 🤗 Qwen/Qwen2.5-Coder-7B-Instruct-GGUF · model · 102k downloads · ♡ 220
- 🤗 Qwen/Qwen2.5-Coder-14B-Instruct-GGUF · model · 55k downloads · ♡ 106
- 🤗 Qwen/Qwen2.5-Coder-1.5B · model · 265k downloads · ♡ 87
- 🤗 Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF · model · 30k downloads · ♡ 44
- 🤗 Qwen/Qwen2.5-Coder-7B-Instruct-AWQ · model · 520k downloads · ♡ 21
- 🤗 Qwen/Qwen2.5-Coder-32B-Instruct · model · 918k downloads · ♡ 1997
- 🤗 Qwen/Qwen2.5-Coder-3B · model · 46k downloads · ♡ 44
- 🤗 Qwen/Qwen2.5-Coder-14B · model · 24k downloads · ♡ 69
- 🤗 Qwen/Qwen2.5-Coder-32B · model · 44k downloads · ♡ 145
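
The repository names in the list above appear to follow the size shorthand from the abstract, Qwen2.5-Coder-(0.5B/1.5B/3B/7B/14B/32B), with optional suffixes such as -Instruct, -GGUF, and -AWQ. The following is a minimal Python sketch of that naming pattern. The helper names `repo_id` and `base_models` are hypothetical, and the sketch only builds strings: it assumes, without checking, that a given size and suffix combination has actually been published.

```python
from typing import List, Optional

# Sizes taken from the abstract: Qwen2.5-Coder-(0.5B/1.5B/3B/7B/14B/32B).
SIZES = ["0.5B", "1.5B", "3B", "7B", "14B", "32B"]

# Assumption: the "Qwen/" namespace seen in the list applies to every model.
ORG = "Qwen"


def repo_id(size: str, instruct: bool = False, quant: Optional[str] = None) -> str:
    """Build a repository name such as 'Qwen/Qwen2.5-Coder-7B-Instruct'.

    Hypothetical helper: it does not verify that the repository exists.
    """
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    name = f"{ORG}/Qwen2.5-Coder-{size}"
    if instruct:
        name += "-Instruct"
    if quant is not None:
        # Suffixes observed in the list above: "GGUF" and "AWQ".
        name += f"-{quant}"
    return name


def base_models() -> List[str]:
    """Return the six base-model names, one per size in the abstract."""
    return [repo_id(size) for size in SIZES]


if __name__ == "__main__":
    for name in base_models():
        print(name)
    print(repo_id("7B", instruct=True, quant="AWQ"))
```

In the list above, the quantized variants (GGUF, AWQ) appear only alongside the -Instruct suffix, so passing `quant` without `instruct=True` may yield a name that does not correspond to a published model.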
