Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese
Zhuosheng Zhang, Hanqing Zhang, Keming Chen, Yuhang Guo, Jingyun Hua, Yulong Wang, Ming Zhou

TL;DR
Mengzi is a family of lightweight Chinese pre-trained models for language and vision tasks; its lightweight model sets a new state of the art on the CLUE benchmark without pursuing a larger model scale.
Contribution
We introduce Mengzi, a set of lightweight, powerful Chinese pre-trained models that outperform existing models on benchmark tasks using optimized training and fine-tuning techniques.
Findings
Achieved new state-of-the-art results on CLUE benchmark
Models are simpler yet more powerful than existing Chinese PLMs
No architecture modifications needed for deployment
Abstract
Although pre-trained models (PLMs) have achieved remarkable improvements in a wide range of NLP tasks, they are expensive in terms of time and resources. This calls for the study of training more efficient models with less computation but still ensures impressive performance. Instead of pursuing a larger scale, we are committed to developing lightweight yet more powerful models trained with equal or less computation and friendly to rapid deployment. This technical report releases our pre-trained model called Mengzi, which stands for a family of discriminative, generative, domain-specific, and multimodal pre-trained model variants, capable of a wide range of language and vision tasks. Compared with public Chinese PLMs, Mengzi is simple but more powerful. Our lightweight model has achieved new state-of-the-art results on the widely-used CLUE benchmark with our optimized pre-training and…
Code & Models
- 🤗 Langboat/mengzi-bert-base-fin (model) · 388 dl · ♡ 14
- 🤗 Langboat/mengzi-bert-base (model) · 32 dl · ♡ 40
- 🤗 Langboat/mengzi-oscar-base-caption (model) · 4 dl · ♡ 2
- 🤗 Langboat/mengzi-oscar-base-retrieval (model) · 2 dl · ♡ 3
- 🤗 Langboat/mengzi-oscar-base (model) · 3 dl · ♡ 5
- 🤗 Langboat/mengzi-t5-base (model) · 5.6k dl · ♡ 60
- 🤗 Langboat/mengzi-t5-base-mt (model) · 44 dl · ♡ 17
- 🤗 Langboat/mengzi-bert-L6-H768 (model) · 3 dl · ♡ 5
- 🤗 shibing624/t5-chinese-couplet (model) · 14 dl · ♡ 6
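
The checkpoints listed above are hosted on the Hugging Face Hub, so one plausible way to try the BERT-style model is through the `transformers` Auto classes. The following is a minimal sketch, not the authors' documented usage: it assumes `transformers` and `torch` are installed and the Hub is reachable, and the helper names `load_mengzi` and `encode` are hypothetical. Other variants (T5, Oscar) may need different model classes; their model cards are the authority.

```python
# Sketch only: assumes `transformers` and `torch` are installed and that
# the Hugging Face Hub is reachable. Helper names are illustrative.

DEFAULT_MODEL_ID = "Langboat/mengzi-bert-base"  # taken from the list above


def load_mengzi(model_id=DEFAULT_MODEL_ID):
    """Download (or read from cache) the tokenizer and encoder for a checkpoint."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def encode(text, model_id=DEFAULT_MODEL_ID):
    """Return the final-layer hidden states for a single piece of text."""
    import torch

    tokenizer, model = load_mengzi(model_id)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    return outputs.last_hidden_state
```

Calling `encode("孟子曰")` would download the checkpoint on first use and return a tensor with one row of hidden states per input token. Nothing runs at import time, so the network is only touched when one of the helpers is called.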
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
