MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou

TL;DR
MAP-Neo is a fully open-sourced bilingual large language model with 7B parameters, trained on 4.5T tokens, achieving performance comparable to proprietary models while emphasizing transparency and reproducibility.
Contribution
This paper introduces MAP-Neo, the first fully open-sourced bilingual LLM with comprehensive training details and high performance, advancing transparency and scientific study in the field.
Findings
MAP-Neo achieves performance comparable to state-of-the-art LLMs.
All training data, code, and checkpoints are openly released.
The model demonstrates strong reasoning, knowledge, and coding capabilities.
Abstract
Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparable to existing closed-source LLMs. However, only the model's weights are provided with most details (e.g., intermediate checkpoints, pre-training corpus, and training code, etc.) being undisclosed. To improve the transparency of LLMs, the research community has formed to open-source truly open LLMs (e.g., Pythia, Amber, OLMo), where more details (e.g., pre-training corpus and training code) are being provided. These models have greatly advanced the scientific study of these large models…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Attention Is All You Need · Dropout · Dense Connections · Softmax · Layer Normalization · Cosine Annealing · Discriminative Fine-Tuning · Attention Dropout · Linear Layer
