Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
Ning Ding, Yulin Chen, Bokai Xu, Yujia Qin, Zhi Zheng, Shengding Hu, Zhiyuan Liu, Maosong Sun, Bowen Zhou

TL;DR
This paper introduces UltraChat, a large-scale, diverse dataset of instructional conversations, and uses it to fine-tune a LLaMA-based model, UltraLLaMA, which surpasses existing open-source chat models in performance.
Contribution
The paper presents UltraChat, a comprehensive dataset of 1.5 million multi-turn dialogues, and demonstrates its effectiveness by fine-tuning UltraLLaMA, achieving state-of-the-art results among open-source models.
Findings
UltraChat contains 1.5 million high-quality dialogues.
UltraLLaMA outperforms other open-source chat models like Vicuna.
UltraChat's diversity and scale improve model performance.
Abstract
Fine-tuning on instruction data has been widely validated as an effective practice for implementing chat language models like ChatGPT. Scaling the diversity and quality of such data, although straightforward, stands a great chance of leading to improved performance. This paper aims to improve the upper bound of open-source models further. We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat, which does not involve human queries. Our objective is to capture the breadth of interactions that a human might have with an AI assistant, and we employ a comprehensive framework to generate multi-turn conversations iteratively. UltraChat contains 1.5 million high-quality multi-turn dialogues and covers a wide range of topics and instructions. Our statistical analysis of UltraChat reveals its superiority in various key metrics,…
Code & Models
- 🤗 openbmb/UltraLM-13b · model · 849 dl · ♡ 74
- 🤗 poisson-fish/ultralm-13b-GPTQ · model · 6 dl · ♡ 1
- 🤗 TheBloke/UltraLM-13B-GPTQ · model · 48 dl · ♡ 13
- 🤗 TheBloke/UltraLM-13B-GGML · model · ♡ 14
- 🤗 TheBloke/UltraLM-13B-fp16 · model · 837 dl · ♡ 4
- 🤗 openbmb/UltraLM-65b · model · 906 dl · ♡ 9
- 🤗 HuggingFaceH4/zephyr-7b-alpha · model · 4.0k dl · ♡ 1120
- 🤗 HuggingFaceH4/zephyr-7b-beta · model · 136k dl · ♡ 1836
- 🤗 gradientai/Llama-3-8B-Instruct-262k · model · 1.7k dl · ♡ 261
- 🤗 gradientai/Llama-3-8B-Instruct-Gradient-1048k · model · 26k dl · ♡ 680
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
