Language Model Pre-training on True Negatives

Zhuosheng Zhang; Hai Zhao; Masao Utiyama; Eiichiro Sumita

arXiv:2212.00460·cs.CL·December 2, 2022

TL;DR

This paper identifies the false negative issue in discriminative pre-trained language models (PLMs) and proposes enhanced pre-training methods that focus on true negatives, improving performance on the GLUE and SQuAD benchmarks and making the resulting models more robust.

Contribution

The paper introduces an approach that mitigates false negatives in discriminative PLMs by correcting the harmful gradient updates applied during pre-training.

Findings

01. Improved performance on GLUE and SQuAD benchmarks.

02. Enhanced robustness of PLMs against false negative issues.

03. Effective correction of gradient updates during pre-training.

Abstract

Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones. Taking the former as positive samples and the latter as negative samples, the PLM can be trained effectively for contextualized representation. However, the training of this type of PLM relies heavily on the quality of the automatically constructed samples. Existing PLMs simply treat all corrupted texts as equally negative without any examination, so the resulting model inevitably suffers from the false negative issue: training is carried out on pseudo-negative data, which makes the resulting PLMs less efficient and less robust. In this work, on the basis of defining the false negative issue in discriminative PLMs that has been ignored for a long time, we design enhanced pre-training methods to counteract false negative predictions and encourage…
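
To make the false negative issue concrete, the toy sketch below labels every changed token in a corrupted sentence as a negative, then flags the changed positions whose replacement is still an acceptable word for that slot. It is an illustration under stated assumptions, not the method proposed in the paper: the `SYNONYMS` table, the function names, and the rule "a listed synonym counts as a false negative" are all hypothetical.

```python
# Toy illustration of false negatives in automatically constructed samples.
# SYNONYMS and every function name here are hypothetical, chosen for the example.
SYNONYMS = {
    "big": {"large", "huge"},
    "quick": {"fast", "rapid"},
}


def label_negatives(original, corrupted):
    """Return 1 at positions where the token was changed (treated as negative), else 0."""
    if len(original) != len(corrupted):
        raise ValueError("original and corrupted must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


def flag_false_negatives(original, corrupted, synonyms=SYNONYMS):
    """Return True at changed positions whose replacement is a listed synonym of the original."""
    return [
        o != c and c in synonyms.get(o, set())
        for o, c in zip(original, corrupted)
    ]


def true_negative_mask(original, corrupted, synonyms=SYNONYMS):
    """Return True only at changed positions that are not flagged as false negatives."""
    labels = label_negatives(original, corrupted)
    flags = flag_false_negatives(original, corrupted, synonyms)
    return [bool(label) and not flag for label, flag in zip(labels, flags)]


original = "the big dog ran quick".split()
corrupted = "the large cat ran fast".split()

labels = label_negatives(original, corrupted)
flags = flag_false_negatives(original, corrupted)
mask = true_negative_mask(original, corrupted)

print(labels)  # positions treated as negative when every change is trusted
print(flags)   # changed positions that are arguably still valid text
print(mask)    # positions that remain negative after excluding the flagged ones
```

In this example three tokens are changed, but only "cat" replacing "dog" alters the meaning. A training signal that treats "large" and "fast" as wrong penalizes the model for plausible text, which is the kind of pseudo-negative data the abstract describes.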

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification