LERT: A Linguistically-motivated Pre-trained Language Model
Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu

TL;DR
LERT is a pre-trained language model that incorporates multiple linguistic features during training, significantly improving performance on Chinese NLU tasks and demonstrating the effectiveness of linguistically-informed pre-training strategies.
Contribution
This paper introduces LERT, a pre-trained language model trained on three types of linguistic features alongside the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy.
Findings
LERT outperforms baseline models on ten Chinese NLU tasks.
LERT effectively captures linguistic features as shown by analytical experiments.
The analytical experiments also support the validity of LERT's design.
Abstract
Pre-trained language models (PLMs) have become representative foundation models in the natural language processing field. Most PLMs are trained with linguistic-agnostic pre-training tasks on the surface form of the text, such as the masked language model (MLM). To further empower PLMs with richer linguistic features, in this paper we propose a simple but effective way to learn linguistic features for pre-trained language models. We propose LERT, a pre-trained language model that is trained on three types of linguistic features along with the original MLM pre-training task, using a linguistically-informed pre-training (LIP) strategy. We carried out extensive experiments on ten Chinese NLU tasks, and the experimental results show that LERT brings significant improvements over various comparable baselines. Furthermore, we also conduct analytical experiments in various…
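The abstract describes training on the MLM objective together with three linguistic-feature tasks. As a rough illustration of that multi-task idea only, the sketch below adds weighted auxiliary losses to an MLM loss. The function names, the weighted-sum form, and the example weights are assumptions made for illustration; this summary does not state how LERT or its LIP strategy actually combines or schedules the losses.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one prediction: -log softmax(logits)[target]."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def mean_token_loss(logits_per_token, targets):
    """Average cross-entropy over a sequence of token-level predictions."""
    losses = [cross_entropy(l, t) for l, t in zip(logits_per_token, targets)]
    return sum(losses) / len(losses)


def multitask_loss(mlm_loss, linguistic_losses, weights):
    """MLM loss plus a weighted sum of auxiliary linguistic-task losses.

    Assumption: a plain weighted sum. The weights are hypothetical and
    are not taken from the paper.
    """
    if len(linguistic_losses) != len(weights):
        raise ValueError("one weight is required per linguistic task")
    return mlm_loss + sum(w * l for w, l in zip(weights, linguistic_losses))


# Toy example: one MLM loss and three auxiliary task losses (made-up numbers).
total = multitask_loss(1.0, [2.0, 3.0, 4.0], [0.5, 0.5, 0.5])
```

In a setup like this, each auxiliary loss would come from a separate token-level classification head sharing the same encoder as the MLM head, and the weights control how strongly each linguistic task influences the shared representation.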
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
