Instance Regularization for Discriminative Language Model Pre-training
Zhuosheng Zhang, Hai Zhao, Ming Zhou

TL;DR
This paper introduces an instance regularization method for discriminative language model pre-training that estimates the complexity of restoring corrupted texts, leading to improved efficiency, effectiveness, and robustness.
Contribution
It proposes a novel approach to model the contribution of individual training instances based on corruption degree and prediction confidence during pre-training.
Findings
Improves pre-training efficiency and effectiveness
Enhances robustness of language models
Achieves better performance on NLP benchmarks
Abstract
Discriminative pre-trained language models (PrLMs) can be generalized as denoising auto-encoders that work with two procedures, ennoising and denoising. First, an ennoising process corrupts texts with arbitrary noising functions to construct training instances. Then, a denoising language model is trained to restore the corrupted tokens. Existing studies have made progress by optimizing independent strategies of either ennoising or denoising. They treat training instances equally throughout the training process, paying little attention to the individual contribution of those instances. To model explicit signals of instance contribution, this work proposes to estimate the complexity of restoring the original sentences from corrupted ones in language model pre-training. The estimations involve the corruption degree in the ennoising data construction process and the prediction confidence in…
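The two signals named in the abstract can be illustrated with a minimal Python sketch. The abstract is truncated here and gives no formulas, so everything below is an assumption made for illustration: the function names, the use of the masked-token fraction as the corruption degree, the use of the mean gold-token probability as the prediction confidence, and the way the two are combined are hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (assumption)


def ennoise(tokens, mask_prob, rng):
    """Corrupt a token list by masking each token with probability mask_prob.

    Returns the corrupted tokens and the indices that were masked.
    """
    corrupted, positions = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            positions.append(i)
        else:
            corrupted.append(tok)
    return corrupted, positions


def corruption_degree(tokens, positions):
    """Fraction of tokens that were corrupted (assumed definition)."""
    return len(positions) / len(tokens) if tokens else 0.0


def prediction_confidence(gold_probs):
    """Mean probability the model assigns to the original tokens at the
    corrupted positions (assumed definition). With nothing to restore,
    confidence is treated as 1.0.
    """
    return sum(gold_probs) / len(gold_probs) if gold_probs else 1.0


def complexity_score(degree, confidence):
    """Hypothetical combination: an instance counts as harder when more of
    it is corrupted and the model is less confident about restoring it.
    """
    return degree * (1.0 - confidence)


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "the model restores the corrupted tokens".split()
    corrupted, positions = ennoise(tokens, mask_prob=0.3, rng=rng)
    degree = corruption_degree(tokens, positions)
    # Stand-in probabilities; a real denoising model would supply these.
    gold_probs = [0.6 for _ in positions]
    print(corrupted, degree, complexity_score(degree, prediction_confidence(gold_probs)))
```

A score of this kind could, for example, scale each instance's loss so that training instances are no longer treated equally. How the paper actually uses its estimates is not stated in the text available here.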
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
