Qwen2 Technical Report

An Yang; Baosong Yang; Binyuan Hui; Bo Zheng; Bowen Yu; Chang Zhou; Chengpeng Li; Chengyuan Li; Dayiheng Liu; Fei Huang; Guanting Dong; Haoran Wei; Huan Lin; Jialong Tang; Jialin Wang; Jian Yang; Jianhong Tu; Jianwei Zhang; Jianxin Ma; Jianxin Yang; Jin Xu; Jingren Zhou; Jinze Bai; Jinzheng He; Junyang Lin; Kai Dang; Keming Lu; Keqin Chen; Kexin Yang; Mei Li; Mingfeng Xue; Na Ni; Pei Zhang; Peng Wang; Ru Peng; Rui Men; Ruize Gao; Runji Lin; Shijie Wang; Shuai Bai; Sinan Tan; Tianhang Zhu; Tianhao Li; Tianyu Liu; Wenbin Ge; Xiaodong Deng; Xiaohuan Zhou; Xingzhang Ren; Xinyu Zhang; Xipin Wei; Xuancheng Ren; Xuejing Liu; Yang Fan; Yang Yao; Yichang Zhang; Yu Wan; Yunfei Chu; Yuqiong Liu; Zeyu Cui; Zhenru Zhang; Zhifang Guo; Zhihao Fan

arXiv:2407.10671 · cs.CL · September 11, 2024 · 48 cites

Open Access · 5 Repos · 10 Models · 3 Datasets

TL;DR

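Qwen2 is an openly available suite of foundational and instruction-tuned language models spanning 0.5 to 72 billion parameters, comprising dense models and a Mixture-of-Experts model. It surpasses most prior open-weight models, including its predecessor Qwen1.5, and is competitive with proprietary models on benchmarks covering language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.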

Contribution

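Introduces the Qwen2 series: openly available foundational and instruction-tuned language models from 0.5 to 72 billion parameters, in dense and Mixture-of-Experts variants, evaluated on benchmarks for language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning.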

Findings

01

As a base language model, Qwen2-72B scores 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH.

02

Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and is competitive with proprietary models.

03

Qwen2 shows multilingual proficiency across 30 languages.

Abstract

This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover,…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Balanced Selection