Xmodel-LM Technical Report
Yichuan Wang, Yang Liu, Yu Yan, Qun Wang, Xucheng Huang, Ling Jiang

TL;DR
Xmodel-LM is a compact 1.1B-parameter language model pre-trained on around 2 trillion tokens of balanced Chinese and English data. It outperforms open-source models of similar scale, and its code and checkpoints are publicly available.
Contribution
The paper introduces Xmodel-LM, a compact 1.1B-parameter language model pre-trained on Xdata, a self-built dataset that balances Chinese and English corpora. The model outperforms open-source models of comparable scale.
Findings
Surpasses existing open-source models of similar scale
Trained on around 2 trillion tokens of balanced Chinese and English data
Publicly available checkpoints and code
Abstract
We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.
Taxonomy
Topics: Robotics and Automated Systems · Distributed and Parallel Computing Systems · Simulation Techniques and Applications
