VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding

Dou Hu; Xiaolong Hou; Xiyang Du; Mengyuan Zhou; Lianxin Jiang; Yang Mo; Xiaofeng Shi

arXiv:2211.00430 · cs.CL · November 2, 2022 · 1 citation

Open Access

TL;DR

VarMAE is a novel Transformer-based model that improves domain-specific language understanding by encoding context uncertainty into a smooth latent distribution, enabling effective adaptation with limited data.

Contribution

Introduces VarMAE, a masked autoencoder with a context uncertainty module for better domain adaptation in language models.

Findings

01. Effective adaptation to science and finance domains

02. Outperforms existing methods with limited domain data

03. Produces diverse, well-formed contextual representations

Abstract

Pre-trained language models have achieved promising performance on general benchmarks, but underperform when migrated to a specific domain. Recent works perform pre-training from scratch or continual pre-training on domain corpora. However, in many specific domains, the limited corpus can hardly support obtaining precise representations. To address this issue, we propose a novel Transformer-based language model named VarMAE for domain-adaptive language understanding. Under the masked autoencoding objective, we design a context uncertainty learning module to encode the token's context into a smooth latent distribution. The module can produce diverse and well-formed contextual representations. Experiments on science- and finance-domain NLU tasks demonstrate that VarMAE can be efficiently adapted to new domains with limited resources.
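
The abstract does not spell out how the context uncertainty learning module is built. As an illustration only, the sketch below shows one common way to encode a token's context into a smooth latent distribution: a diagonal Gaussian per token, sampled with the reparameterization trick and regularized toward a standard normal prior by a KL term, then combined with a masked-token reconstruction loss. The function names, the weight matrices, the toy dimensions, and the `beta` weighting are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def context_uncertainty_layer(hidden, w_mu, w_logvar, rng):
    """Map each token's contextual vector to a diagonal Gaussian and sample from it.

    Generic VAE-style construction (an assumption, not the paper's exact module).
    """
    mu = hidden @ w_mu                    # per-token mean of the latent
    logvar = hidden @ w_logvar            # per-token log-variance of the latent
    eps = rng.standard_normal(mu.shape)   # noise for the reparameterization trick
    z = mu + np.exp(0.5 * logvar) * eps   # sampled latent contextual representation
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    kl = 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=-1)
    return z, mu, logvar, kl

def masked_cross_entropy(logits, targets, mask):
    """Mean negative log-likelihood over masked positions only."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    token_nll = -log_probs[np.arange(len(targets)), targets]
    return float((token_nll * mask).sum() / max(mask.sum(), 1.0))

# Toy setup: random stand-ins for encoder outputs and projection weights.
rng = np.random.default_rng(0)
n_tokens, d_model, d_latent, vocab = 6, 8, 4, 10
hidden = rng.standard_normal((n_tokens, d_model))
w_mu = rng.standard_normal((d_model, d_latent)) * 0.1
w_logvar = rng.standard_normal((d_model, d_latent)) * 0.1
w_out = rng.standard_normal((d_latent, vocab)) * 0.1
targets = rng.integers(0, vocab, size=n_tokens)
mask = np.array([0, 1, 0, 0, 1, 0], dtype=float)  # 1 marks a masked token

z, mu, logvar, kl = context_uncertainty_layer(hidden, w_mu, w_logvar, rng)
recon = masked_cross_entropy(z @ w_out, targets, mask)
beta = 0.1  # hypothetical weight on the KL regularizer
loss = recon + beta * float((kl * mask).sum() / mask.sum())
```

In this sketch the KL term is what makes the latent space smooth: it pulls each token's distribution toward a shared prior, so nearby contexts map to overlapping regions rather than isolated points, which is plausibly helpful when the domain corpus is small.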


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis