QAGAN: Adversarial Approach To Learning Domain Invariant Language Features

Shubham Shrivastava; Kaiyue Wang

arXiv:2206.12388 · cs.CL · June 27, 2022

TL;DR

This paper proposes an adversarial training method for learning domain-invariant language features in question-answering models, improving their generalization to out-of-domain data.

Contribution

It applies adversarial training, combined with data augmentation and additional training strategies, to improve the domain robustness of QA models, which it presents as a novel application.

Findings

01. 15.2% improvement in EM (exact match) score on out-of-domain data

02. 5.6% boost in F1 score on out-of-domain data

03. Visualization shows learned embeddings are domain-invariant

Abstract

Training models that are robust to data domain shift has attracted increasing interest in both academia and industry. Question answering, one of the typical problems in Natural Language Processing (NLP) research, has seen much success with the advent of large transformer models. However, existing approaches mostly assume that data is drawn from the same distribution during training and testing, which is unrealistic and does not scale in the wild. In this paper, we explore an adversarial training approach to learning domain-invariant features so that language models can generalize well to out-of-domain datasets. We also inspect various other ways to boost our model's performance, including data augmentation by paraphrasing sentences, conditioning the end-of-answer-span prediction on the start word, and a carefully designed annealing function. Our initial…
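The abstract does not spell out the adversarial objective, so the following is only a minimal sketch of one common formulation, assuming a domain discriminator that outputs one logit per training domain and a QA model that is penalized whenever the discriminator's prediction departs from uniform. The function names (`domain_confusion_loss`, `qa_total_loss`) and the `adv_weight` parameter are hypothetical and are not taken from the paper or its repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def domain_confusion_loss(domain_logits):
    """KL(uniform || predicted) over domains.

    Assumed formulation, not confirmed by the source: the loss is zero
    when the discriminator cannot tell domains apart, and grows as its
    prediction becomes more confident.
    """
    probs = softmax(domain_logits)
    u = 1.0 / len(probs)
    return sum(u * (math.log(u) - math.log(p)) for p in probs)


def qa_total_loss(span_loss, domain_logits, adv_weight):
    """Span-prediction loss plus a weighted domain-confusion penalty.

    adv_weight is a hypothetical scalar; a schedule such as the
    annealing function mentioned in the abstract could supply it.
    """
    return span_loss + adv_weight * domain_confusion_loss(domain_logits)


if __name__ == "__main__":
    # A confident discriminator (first domain strongly favored) is penalized.
    print(qa_total_loss(1.0, [5.0, 0.0, 0.0], adv_weight=0.5))
    # A maximally confused discriminator adds no penalty.
    print(qa_total_loss(1.0, [0.0, 0.0, 0.0], adv_weight=0.5))
```

Under this assumed objective, minimizing the penalty pushes the encoder toward features from which the domain cannot be recovered, which is the sense of "domain-invariant" used in the abstract.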

Code & Models

Repositories

towardsautonomy/qagan (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications