Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
Thuy-Trang Vu, Dinh Phung, Gholamreza Haffari

TL;DR
This paper introduces an adversarial masking strategy for unsupervised domain adaptation of masked language models, improving performance by concentrating the training signal on the tokens the model finds hardest to reconstruct.
Contribution
It proposes an adversarial masking approach for domain adaptation of masked language models, and makes the resulting optimisation over subsets of tokens tractable through relaxation to a variational lower bound and dynamic programming.
Findings
Improves F1 by up to 1.64 points on NER tasks.
Outperforms random masking strategies in domain adaptation.
Effective in six different unsupervised domain adaptation scenarios.
Abstract
Recent work has shown the importance of adaptation of broad-coverage contextualised embedding models on the domain of the target task of interest. Current self-supervised adaptation methods are simplistic, as the training signal comes from a small percentage of randomly masked-out tokens. In this paper, we show that careful masking strategies can bridge the knowledge gap of masked language models (MLMs) about the domains more effectively by allocating self-supervision where it is needed. Furthermore, we propose an effective training strategy by adversarially masking out those tokens which are harder to reconstruct by the underlying MLM. The adversarial objective leads to a challenging combinatorial optimisation problem over subsets of tokens, which we tackle efficiently through relaxation to a variational lower bound and dynamic programming. On six unsupervised domain…
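The contrast the abstract draws between random masking and masking the tokens that are harder to reconstruct can be illustrated with a small sketch. This is not the paper's algorithm: the paper optimises over subsets of tokens via a variational lower bound and dynamic programming, whereas the sketch below substitutes a simple greedy top-k selection. The function names, the 25% mask rate, and the per-token loss values are all hypothetical and chosen only for illustration.

```python
import random


def random_mask(num_tokens, mask_rate=0.15, seed=0):
    """Baseline: choose positions to mask uniformly at random."""
    k = max(1, round(num_tokens * mask_rate))
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_tokens), k))


def hardest_token_mask(token_losses, mask_rate=0.15):
    """Greedy stand-in for adversarial masking (an assumption, not the
    paper's method): mask the k positions with the highest loss."""
    k = max(1, round(len(token_losses) * mask_rate))
    ranked = sorted(
        range(len(token_losses)),
        key=lambda i: token_losses[i],
        reverse=True,
    )
    return sorted(ranked[:k])


tokens = ["the", "patient", "received", "5", "mg", "of", "warfarin", "daily"]
# Hypothetical reconstruction losses, e.g. -log p(token | context) from an MLM.
# These numbers are made up for the example.
losses = [0.1, 0.9, 0.6, 1.2, 2.3, 0.1, 4.7, 0.8]

masked = hardest_token_mask(losses, mask_rate=0.25)
print([tokens[i] for i in masked])  # ['mg', 'warfarin']
```

With these made-up losses, the selection lands on the two tokens the model predicts worst, whereas `random_mask` would spend the same masking budget on arbitrary positions, including easy function words.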
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
