Aquila2 Technical Report
Bo-Wen Zhang, Liangdong Wang, Jijie Li, Shuhao Gu, Xinya Wu, Zhengduo Zhang, Boyan Gao, Yulong Ao, Guang Liu

TL;DR
Aquila2 introduces a series of large bilingual models trained with the innovative HeuriMentor framework, which improves training efficiency and data management, achieving strong performance on English and Chinese benchmarks.
Contribution
The paper presents the HeuriMentor framework and the Aquila2 model series, enabling real-time training insights and efficient data optimization for large-scale bilingual models.
Findings
Aquila2 models perform well on English and Chinese benchmarks.
Quantized Aquila2-34B retains most of its performance.
The HeuriMentor system enhances training efficiency and monitoring.
Abstract
This paper introduces the Aquila2 series, which comprises a wide range of bilingual models with parameter sizes of 7, 34, and 70 billion. These models are trained based on an innovative framework named HeuriMentor (HM), which offers real-time insights into model convergence and enhances the training process and data management. The HM System, comprising the Adaptive Training Engine (ATE), Training State Monitor (TSM), and Data Management Unit (DMU), allows for precise monitoring of the model's training progress and enables efficient optimization of data distribution, thereby enhancing training effectiveness. Extensive evaluations show that the Aquila2 model series performs comparably well on both English and Chinese benchmarks. Specifically, Aquila2-34B demonstrates only a slight decrease in performance when quantized to Int4. Furthermore, we have made our training code…
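The abstract reports that Aquila2-34B loses only a little performance when quantized to Int4, but this excerpt does not say which quantization scheme was used. As a rough illustration of why 4-bit storage can preserve most of a weight's value, here is a minimal sketch of generic symmetric round-to-nearest quantization with one scale per group. The function names, the group size, and the sample weights are all hypothetical and are not taken from the paper.

```python
def quantize_int4(weights, group_size=4):
    """Quantize floats to signed 4-bit codes in [-8, 7], one scale per group.

    Generic round-to-nearest scheme for illustration only; not the
    paper's method.
    """
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to the code 7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest code and clamp to the 4-bit signed range.
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4(codes, scales, group_size=4):
    """Recover approximate floats from the 4-bit codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


# Hypothetical weights, chosen only to exercise the functions.
weights = [0.12, -0.50, 0.33, 0.07, 1.40, -0.90, 0.05, 0.60]
codes, scales = quantize_int4(weights)
restored = dequantize_int4(codes, scales)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scales, max_error)
```

In this scheme the reconstruction error for each weight is at most half of its group's scale, so smaller groups give tighter error bounds at the cost of storing more scales. Production Int4 methods typically add calibration or error compensation on top of this basic idea.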
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Machine Learning and Data Classification
