360Zhinao Technical Report
360Zhinao Team

TL;DR
This paper introduces 360Zhinao, a 7-billion-parameter language model released with context windows of 4K, 32K, and 360K tokens. Its data cleaning, data composition, and alignment strategies yield performance competitive with models of similar size.
Contribution
The paper describes the end-to-end development of 360Zhinao, covering pretraining, data strategies, alignment, and context-window extension. It also establishes a stable and sensitive ablation environment that compares experiment runs using minimal model sizes.
Findings
Achieved 360K context length extension for 360Zhinao-7B.
Trained on 3.4T tokens with refined data filtering.
Demonstrated competitive performance among similar-sized models.
Abstract
We present 360Zhinao models with 7B parameter size and context lengths spanning 4K, 32K and 360K, all available at https://github.com/Qihoo360/360zhinao. For rapid development in pretraining, we establish a stable and sensitive ablation environment to evaluate and compare experiment runs with minimal model size. Under such guidance, we perfect our data cleaning and composition strategies to pretrain on 3.4T tokens. We also mainly emphasize data during alignment, where we strive to balance quantity and quality with filtering and reformatting. With tailored data, 360Zhinao-7B's context window is easily extended to 32K and 360K. RMs and RLHF are trained following SFT and credibly applied to specific tasks. All together these contributions lead to 360Zhinao-7B's competitive performance among models of similar size.
Code & Models
- 🤗 qihoo360/360Zhinao2-7B-Base (model, 7 downloads, 2 likes)
- 🤗 qihoo360/360Zhinao2-7B-Chat-4K (model, 2 downloads)
- 🤗 qihoo360/360Zhinao2-7B-Chat-4K-Int4 (model, 3 downloads)
- 🤗 qihoo360/360Zhinao2-7B-Chat-32K (model, 8 downloads)
- 🤗 qihoo360/360Zhinao2-7B-Chat-32K-Int4 (model, 4 downloads)
- 🤗 qihoo360/360Zhinao2-7B-Chat-360K (model, 6 downloads)
- 🤗 qihoo360/360Zhinao2-7B-Chat-360K-Int4 (model, 3 downloads)
- 🤗 qihoo360/360Zhinao3-7B (model, 16 downloads, 2 likes)
- 🤗 qihoo360/360Zhinao3-7B-Instruct (model, 5 downloads, 2 likes)
- 🤗 qihoo360/360Zhinao3-7B-O1.5 (model, 1 download, 3 likes)
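The 360Zhinao2-7B entries in the list above follow a regular naming pattern (base model, or chat model with a context length and an optional Int4 suffix). The sketch below builds those repository names and shows one plausible way to load a checkpoint with the Hugging Face `transformers` library. The helper names `repo_id` and `load` are hypothetical, and the use of `trust_remote_code=True` is an assumption about how the repositories are packaged, not something this page states.

```python
from typing import Optional

ORG = "qihoo360"
CHAT_CONTEXTS = ("4K", "32K", "360K")  # chat context variants in the list above


def repo_id(context: Optional[str] = None, int4: bool = False) -> str:
    """Build a repository name for the 360Zhinao2-7B family (hypothetical helper).

    context=None selects the base model; otherwise a chat variant is selected.
    """
    if context is None:
        if int4:
            raise ValueError("no Int4 base model appears in the list")
        return f"{ORG}/360Zhinao2-7B-Base"
    if context not in CHAT_CONTEXTS:
        raise ValueError(f"unknown context length: {context!r}")
    suffix = "-Int4" if int4 else ""
    return f"{ORG}/360Zhinao2-7B-Chat-{context}{suffix}"


def load(repo: str):
    """Download and load a checkpoint; needs network access and `transformers`."""
    # Imported lazily so repo_id() works without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repositories ship custom modeling code.
    tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
    return tokenizer, model
```

For example, `load(repo_id("32K"))` would fetch the 32K chat model. Quantized variants may need extra dependencies that are not shown here.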
Taxonomy
Topics: Nuclear physics research studies · Particle physics theoretical and experimental studies · Quantum Chromodynamics and Particle Interactions
Methods: Shrink and Fine-Tune
