Train No Evil: Selective Masking for Task-Guided Pre-Training

Yuxian Gu; Zhengyan Zhang; Xiaozhi Wang; Zhiyuan Liu; Maosong Sun

arXiv:2004.09733·cs.CL·October 8, 2020·5 cites

TL;DR

This paper introduces a task-guided pre-training framework with selective masking to better capture domain- and task-specific patterns, improving efficiency and performance in downstream tasks.

Contribution

It proposes a novel three-stage training framework with a selective masking strategy for task-guided pre-training, enhancing domain and task-specific pattern learning.

Findings

01

Achieves comparable or better performance on sentiment analysis tasks.

02

Requires less than 50% of the training computation cost.

03

Demonstrates effectiveness and efficiency of the proposed method.

Abstract

Recently, pre-trained language models mostly follow the pre-train-then-fine-tune paradigm and have achieved strong performance on various downstream tasks. However, since the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always capture domain-specific and task-specific patterns well. In this paper, we propose a three-stage framework that adds a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns, and we propose a novel selective masking strategy to learn task-specific patterns. Specifically, we design a method to measure the importance of each token in sequences and selectively mask the important tokens.…
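
The selective masking idea in the abstract (score each token's importance, then mask the important ones) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how importance is measured, so the sketch assumes the scores are already given. The function name `selective_mask`, the `threshold` parameter, the `[MASK]` symbol, and the example scores are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

MASK_TOKEN = "[MASK]"  # BERT-style mask symbol; an assumption for this sketch


def selective_mask(
    tokens: Sequence[str],
    importance: Sequence[float],
    threshold: float = 0.5,
) -> List[str]:
    """Replace every token whose importance score exceeds `threshold`.

    `importance` is a hypothetical per-token score supplied by the caller;
    how the paper computes it is not specified in the abstract above.
    """
    if len(tokens) != len(importance):
        raise ValueError("tokens and importance must have the same length")
    return [
        MASK_TOKEN if score > threshold else token
        for token, score in zip(tokens, importance)
    ]


# Illustrative input: a short sentiment-style sentence with made-up scores.
tokens = ["the", "food", "was", "terrible"]
scores = [0.05, 0.40, 0.10, 0.95]
masked = selective_mask(tokens, scores)
print(masked)
```

In contrast to uniform random masking, only the high-scoring token is hidden here, so a masked language modeling objective would have to predict the word that matters most for the task. A practical version would presumably also cap the fraction of masked tokens, which this sketch leaves out.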

Code & Models

Repositories

thunlp/SelectiveMasking
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques