PERT: Pre-training BERT with Permuted Language Model

Yiming Cui; Ziqing Yang; Ting Liu

arXiv:2203.06906·cs.CL·March 15, 2022

TL;DR

PERT pre-trains a BERT-style auto-encoding model by permuting a proportion of the input tokens and predicting the position of each original token. It improves over comparable baselines on some Chinese and English NLU tasks, but not on all of them.

Contribution

The paper proposes PERT, a new pre-training method using permuted language modeling, which diversifies training tasks beyond traditional masked language models.

Findings

01

PERT improves performance on several NLU benchmarks.

02

Permuted Language Model offers a new training paradigm.

03

Diverse pre-training tasks can enhance PLM capabilities.

Abstract

Pre-trained Language Models (PLMs) have been widely used in various natural language processing (NLP) tasks, owing to their powerful text representations trained on large-scale corpora. In this paper, we propose a new PLM called PERT for natural language understanding (NLU). PERT is an auto-encoding model (like BERT) trained with Permuted Language Model (PerLM). The formulation of the proposed PerLM is straightforward. We permute a proportion of the input text, and the training objective is to predict the position of the original token. Moreover, we also apply whole word masking and N-gram masking to improve the performance of PERT. We carried out extensive experiments on both Chinese and English NLU benchmarks. The experimental results show that PERT brings improvements over various comparable baselines on some of the tasks, but not on others. These results indicate that…
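The PerLM formulation in the abstract (permute a proportion of the input, then predict the position of the original token) can be illustrated with a small data-preparation sketch. This is not the authors' implementation. The function name `permute_tokens`, the default proportion of 0.15, the `IGNORE` marker, and the exact labelling convention (each selected original position is labelled with the index where its token ended up) are all assumptions made for illustration. Whole word masking and N-gram masking are left out.

```python
import random

# Assumed marker for positions that carry no training signal; not from the paper.
IGNORE = -100


def permute_tokens(tokens, proportion=0.15, seed=None):
    """Shuffle a proportion of the tokens and build position labels.

    Returns (permuted, labels). For every selected original position i,
    labels[i] is the index in `permuted` where tokens[i] now sits.
    Unselected positions keep their token and get the IGNORE label.
    """
    rng = random.Random(seed)
    n = len(tokens)
    # Select at least two positions so that a swap is possible (assumed choice).
    k = min(n, max(2, round(n * proportion))) if n >= 2 else 0

    chosen = sorted(rng.sample(range(n), k))  # positions taking part in the shuffle
    sources = chosen[:]
    rng.shuffle(sources)                      # sources[j] supplies the token for chosen[j]

    permuted = list(tokens)
    labels = [IGNORE] * n
    for dst, src in zip(chosen, sources):
        permuted[dst] = tokens[src]  # token from position src moves to position dst
        labels[src] = dst            # target for original position src
    return permuted, labels


if __name__ == "__main__":
    words = "we permute a proportion of the input text".split()
    permuted, labels = permute_tokens(words, proportion=0.5, seed=0)
    print(permuted)
    print(labels)
```

Because the input is only reordered, the corrupted sequence contains the same tokens as the original and no artificial mask symbol. A random shuffle can leave a selected token in its original place, in which case its label is simply its own index.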

Code & Models

Repositories

ymcui/pert · tf · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification