Qwen Technical Report

Jinze Bai; Shuai Bai; Yunfei Chu; Zeyu Cui; Kai Dang; Xiaodong Deng; Yang Fan; Wenbin Ge; Yu Han; Fei Huang; Binyuan Hui; Luo Ji; Mei Li; Junyang Lin; Runji Lin; Dayiheng Liu; Gao Liu; Chengqiang Lu; Keming Lu; Jianxin Ma; Rui Men; Xingzhang Ren; Xuancheng Ren; Chuanqi Tan; Sinan Tan; Jianhong Tu; Peng Wang; Shijie Wang; Wei Wang; Shengguang Wu; Benfeng Xu; Jin Xu; An Yang; Hao Yang; Jian Yang; Shusheng Yang; Yang Yao; Bowen Yu; Hongyi Yuan; Zheng Yuan; Jianwei Zhang; Xingxuan Zhang; Yichang Zhang; Zhenru Zhang; Chang Zhou; Jingren Zhou; Xiaohuan Zhou; Tianhang Zhu

arXiv:2309.16609 · cs.CL · September 29, 2023 · 87 cites

Open Access · 2 Repos · 10 Models · 2 Datasets

TL;DR

Qwen is a comprehensive series of large language models including base, chat, coding, and math variants, demonstrating strong performance and advanced capabilities across various NLP tasks and applications.

Contribution

Introduction of the Qwen series, including models with human-aligned fine-tuning and specialized variants for coding and mathematics, advancing open-source LLM capabilities.

Findings

01. Base models outperform many open-source models on downstream tasks.

02. Chat models trained with RLHF are highly competitive and show advanced tool-use capabilities.

03. Specialized models such as Code-Qwen and Math-Qwen improve performance in their respective domains.

Abstract

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Balanced Selection