MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

Ge Zhang; Scott Qu; Jiaheng Liu; Chenchen Zhang; Chenghua Lin; Chou Leuang Yu; Danny Pan; Esther Cheng; Jie Liu; Qunshu Lin; Raven Yuan; Tuney Zheng; Wei Pang; Xinrun Du; Yiming Liang; Yinghao Ma; Yizhi Li; Ziyang Ma; Bill Lin; Emmanouil Benetos; Huan Yang; Junting Zhou; Kaijing Ma; Minghao Liu; Morry Niu; Noah Wang; Quehry Que; Ruibo Liu; Sine Liu; Shawn Guo; Soren Gao; Wangchunshu Zhou; Xinyue Zhang; Yizhi Zhou; Yubo Wang; Yuelin Bai; Yuhan Zhang; Yuxiang Zhang; Zenith Wang; Zhenzhu Yang; Zijian Zhao; Jiajun Zhang; Wanli Ouyang; Wenhao Huang; Wenhu Chen

arXiv:2405.19327·cs.CL·July 11, 2024·1 cites

Open Access · 1 Repo · 4 Datasets

TL;DR

MAP-Neo is a fully open-sourced bilingual large language model with 7B parameters, trained on 4.5T tokens, achieving performance comparable to proprietary models while emphasizing transparency and reproducibility.

Contribution

This paper introduces MAP-Neo, the first fully open-sourced bilingual LLM with comprehensive training details and high performance, advancing transparency and scientific study in the field.

Findings

01 MAP-Neo achieves performance comparable to state-of-the-art LLMs.

02 All training data, code, and checkpoints are openly released.

03 The model demonstrates strong reasoning, knowledge, and coding capabilities.

Abstract

Large Language Models (LLMs) have made great strides in recent years, achieving unprecedented performance across a wide range of tasks. However, due to commercial interests, the most competitive models, such as GPT, Gemini, and Claude, are gated behind proprietary interfaces, and their training details are not disclosed. Recently, many institutions have open-sourced strong LLMs such as LLaMA-3 that are comparable to existing closed-source LLMs. However, only the model weights are provided; most other details (e.g., intermediate checkpoints, pre-training corpus, and training code) remain undisclosed. To improve the transparency of LLMs, the research community has come together to release truly open LLMs (e.g., Pythia, Amber, OLMo), which provide more details (e.g., pre-training corpus and training code). These models have greatly advanced the scientific study of these large models…

Code & Models

Repositories

multimodal-art-projection/map-neo
paddleOfficial

Datasets

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Attention Is All You Need · Dropout · Dense Connections · Softmax · Layer Normalization · Cosine Annealing · Discriminative Fine-Tuning · Attention Dropout · Linear Layer