Qwen Technical Report
Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan

TL;DR
Qwen is a comprehensive series of large language models including base, chat, coding, and math variants, demonstrating strong performance and advanced capabilities across various NLP tasks and applications.
Contribution
Introduction of the Qwen series, including models with human-aligned fine-tuning and specialized variants for coding and mathematics, advancing open-source LLM capabilities.
Findings
Base models outperform many open-source models on downstream tasks.
Chat models, particularly those trained with RLHF, are highly competitive and show advanced tool-use and planning capabilities.
Specialized models like Code-Qwen and Math-Qwen improve performance in their domains.
Abstract
Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even…
Code & Models
- 🤗 Qwen/Qwen-7B · model · 89k dl · ♡ 398
- 🤗 Qwen/Qwen-72B · model · 2.4k dl · ♡ 362
- 🤗 Qwen/Qwen-7B-Chat · model · 122k dl · ♡ 788
- 🤗 Qwen/Qwen-7B-Chat-Int4 · model · 1.4k dl · ♡ 75
- 🤗 X-D-Lab/MindChat-Qwen-7B-v2 · model · 155 dl · ♡ 9
- 🤗 Qwen/Qwen-14B-Chat-Int4 · model · 285 dl · ♡ 100
- 🤗 Qwen/Qwen-14B-Chat · model · 10k dl · ♡ 373
- 🤗 Qwen/Qwen-14B · model · 5.4k dl · ♡ 213
- 🤗 TheBloke/Qwen-14B-Chat-GPTQ · model · 77 dl · ♡ 34
- 🤗 openerotica/Qwen-7b-GPTQ-ERP · model · 17 dl
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Balanced Selection
