UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao; Li Dong; Furu Wei; Wenhui Wang; Nan Yang; Xiaodong Liu; Yu Wang; Songhao Piao; Jianfeng Gao; Ming Zhou; Hsiao-Wuen Hon

arXiv:2002.12804·cs.CL·March 2, 2020·225 cites

TL;DR

UniLMv2 introduces pseudo-masked language modeling, a pre-training method that unifies autoencoding and partially autoregressive language modeling in one model, and reports state-of-the-art performance on various NLP benchmarks.

Contribution

The paper presents a pre-training procedure, the pseudo-masked language model (PMLM), that combines autoencoding and partially autoregressive training within a single unified model.

Findings

01 Achieves new state-of-the-art results on multiple NLP benchmarks.

02 Effectively unifies bidirectional encoding and sequence-to-sequence decoding.

03 Reduces redundant computation through shared context encodings.

Abstract

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence…
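To make the idea of pseudo masks sharing position embeddings more concrete, here is a minimal sketch in Python. It is a simplified illustration under stated assumptions, not the authors' implementation: the function name `build_pseudo_masked_input`, the `[M]` and `[P]` token strings, and the exact layout (context first, then one pseudo mask per masked token, grouped by span) are all hypothetical choices made for this example. The abstract does not specify the input layout or the self-attention masks, so those are left out.

```python
from typing import List, Tuple

MASK = "[M]"    # conventional mask (assumed token string)
PSEUDO = "[P]"  # pseudo mask (assumed token string)


def build_pseudo_masked_input(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: build input tokens and position ids.

    `spans` are half-open (start, end) index ranges, listed in the
    order in which the spans would be predicted.
    """
    masked = {i for start, end in spans for i in range(start, end)}

    # Context part: masked tokens are replaced by conventional masks,
    # so every position id still appears exactly once in the context.
    input_tokens = [MASK if i in masked else tok for i, tok in enumerate(tokens)]
    position_ids = list(range(len(tokens)))

    # Pseudo-mask part: one [P] per masked token, appended after the
    # context and reusing the position id of the token it stands in for.
    for start, end in spans:
        for i in range(start, end):
            input_tokens.append(PSEUDO)
            position_ids.append(i)

    return input_tokens, position_ids


if __name__ == "__main__":
    toks = ["x1", "x2", "x3", "x4", "x5", "x6"]
    # Mask the span (x4, x5) first, then the single token x2.
    out_tokens, out_positions = build_pseudo_masked_input(toks, [(3, 5), (1, 2)])
    print(out_tokens)
    print(out_positions)
```

Because the appended pseudo masks reuse existing position ids, the context only needs to be encoded once. A full model would additionally need self-attention masks that control which appended tokens can see which others, and this sketch does not attempt that.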

Code & Models

Videos

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis