Yi-Lightning Technical Report
Alan Wake, Bei Chen, C.X. Lv, Chao Li, Chengen Huang, Chenglin Cai, Chujie Zheng, Daniel Cooper, Fan Zhou, Feng Hu, Ge Zhang, Guoyin Wang, Heng Ji, Howard Qiu, Jiangcheng Zhu, Jun Tian, Katherine Su, Lihuan Zhang, Liying Li, Ming Song, Mou Li, Peng Liu, Qicheng Hu, Shawn Wang

TL;DR
Yi-Lightning is a new large language model that achieves top-tier performance across various tasks using an advanced MoE architecture, multi-stage training, safety measures, and scalable infrastructure, while highlighting limitations of traditional benchmarks.
Contribution
The paper introduces Yi-Lightning, a large language model with innovative architecture, training strategies, safety framework, and insights into benchmark limitations, advancing practical AI development.
Findings
Ranks 6th overall on Chatbot Arena, placing 2nd to 4th in the Chinese, Math, Coding, and Hard Prompts categories.
Uses an enhanced MoE architecture with advanced expert segmentation and routing mechanisms and optimized KV-caching.
Demonstrates competitive performance on public benchmarks and reveals disparities between static benchmarks and human preferences.
Abstract
This technical report presents Yi-Lightning, our latest flagship large language model (LLM). It achieves exceptional performance, ranking 6th overall on Chatbot Arena, with particularly strong results (2nd to 4th place) in specialized categories including Chinese, Math, Coding, and Hard Prompts. Yi-Lightning leverages an enhanced Mixture-of-Experts (MoE) architecture, featuring advanced expert segmentation and routing mechanisms coupled with optimized KV-caching techniques. Our development process encompasses comprehensive pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF), where we devise deliberate strategies for multi-stage training, synthetic data construction, and reward modeling. Furthermore, we implement RAISE (Responsible AI Safety Engine), a four-component framework to address safety issues across pre-training, post-training, and…
