Domain Adversarial Fine-Tuning as an Effective Regularizer

Giorgos Vernikos; Katerina Margatina; Alexandra Chronopoulou; Ion Androutsopoulos

arXiv:2009.13366·cs.LG·October 7, 2020

TL;DR

This paper introduces AFTER, a regularization method that adds a domain adversarial objective during fine-tuning of pretrained language models. It improves performance by keeping the model from overfitting to the domain of the task's labeled data.

Contribution

The paper proposes a novel adversarial regularization technique, AFTER, that enhances fine-tuning of pretrained language models by maintaining domain-invariant representations.

Findings

01. AFTER improves performance on NLP tasks compared to standard fine-tuning.

02. AFTER is effective in preventing overfitting to the task-specific domain.

03. AFTER enhances generalization across different domains.

Abstract

In Natural Language Processing (NLP), pretrained language models (LMs) that are transferred to downstream tasks have recently been shown to achieve state-of-the-art results. However, standard fine-tuning can degrade the general-domain representations captured during pretraining. To address this issue, we introduce a new regularization technique, AFTER: domain Adversarial Fine-Tuning as an Effective Regularizer. Specifically, we complement the task-specific loss used during fine-tuning with an adversarial objective. This additional loss term is related to an adversarial classifier that aims to discriminate between in-domain and out-of-domain text representations. In-domain refers to the labeled dataset of the task at hand, while out-of-domain refers to unlabeled data from a different domain. Intuitively, the adversarial classifier acts as a regularizer which prevents the model from…
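To make the idea of complementing the task loss with an adversarial objective concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`bce_with_logits`, `domain_loss`, `combined_loss`), the weight `lam`, the label convention (1 = in-domain, 0 = out-of-domain), and the choice to subtract the weighted domain loss from the task loss are all assumptions made for this example.

```python
import math

def bce_with_logits(logit, label):
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def domain_loss(domain_logits, domain_labels):
    """Mean loss of the domain classifier.

    Assumed convention: label 1 = in-domain (task data),
    label 0 = out-of-domain (unlabeled data from another domain).
    """
    losses = [bce_with_logits(z, y) for z, y in zip(domain_logits, domain_labels)]
    return sum(losses) / len(losses)

def combined_loss(task_loss, domain_logits, domain_labels, lam=0.1):
    """Hypothetical encoder objective: task loss minus weighted domain loss.

    Subtracting the domain term rewards representations that the domain
    classifier cannot tell apart. `lam` is an assumed trade-off weight.
    """
    return task_loss - lam * domain_loss(domain_logits, domain_labels)

# A domain classifier that is maximally unsure (logit 0 for every example)
confused = combined_loss(1.0, [0.0, 0.0], [1, 0], lam=0.5)

# A domain classifier that separates the two domains confidently
separated = combined_loss(1.0, [5.0, -5.0], [1, 0], lam=0.5)
```

Under these assumptions, `confused` is lower than `separated`: the encoder's objective is smaller when the domain classifier cannot distinguish in-domain from out-of-domain representations. In an autograd framework, this kind of objective is commonly realized with a gradient reversal layer, so that the domain classifier still minimizes its own loss while the encoder receives the negated gradient.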

Code & Models

Repositories

GeorgeVern/AFTERV1.0 (PyTorch, official)
