Xmodel-LM Technical Report

Yichuan Wang; Yang Liu; Yu Yan; Qun Wang; Xucheng Huang; Ling Jiang

arXiv:2406.02856 · cs.CL · November 20, 2024

TL;DR

Xmodel-LM is a compact 1.1B-parameter language model pre-trained on around 2 trillion tokens of balanced Chinese and English data. It outperforms open-source models of similar scale, and its code and checkpoints are publicly available.

Contribution

The paper introduces Xmodel-LM, a compact 1.1B-parameter language model pre-trained on Xdata, a self-built dataset that balances Chinese and English corpora, and reports that it outperforms open-source models of similar scale.

Findings

01 Surpasses existing open-source models of similar scale

02 Trained on around 2 trillion tokens of balanced Chinese and English data

03 Model checkpoints and code are publicly available

Abstract

We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
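To put the two headline figures in the abstract side by side, the short sketch below computes the ratio of training tokens to model parameters. Both inputs are the approximate values quoted above ("1.1B" and "around 2 trillion"); the function name and constants are illustrative and do not come from the Xmodel-LM code.

```python
# Figures quoted in the abstract; both are approximate.
PARAMS = 1.1e9          # "1.1B" parameters
TRAIN_TOKENS = 2e12     # "around 2 trillion tokens"


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Return how many training tokens were seen per model parameter."""
    if params <= 0:
        raise ValueError("params must be positive")
    return tokens / params


ratio = tokens_per_parameter(TRAIN_TOKENS, PARAMS)
print(f"about {ratio:.0f} tokens per parameter")  # about 1818 tokens per parameter
```

Because both inputs are rounded, the result is only an order-of-magnitude indication: roughly 1,800 training tokens per parameter.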

Code & Models

Repositories

https://github.com/XiaoduoAILab/XmodelLM

Models

🤗 XiaoduoAILab/Xmodel_LM
model · 320 downloads · 9 likes

Taxonomy

Topics: Robotics and Automated Systems · Distributed and Parallel Computing Systems · Simulation Techniques and Applications