Learning to Perturb Word Embeddings for Out-of-distribution QA

Seanie Lee; Minki Kang; Juho Lee; Sung Ju Hwang

arXiv:2105.02692 · cs.CL · June 25, 2021

TL;DR

This paper introduces a data augmentation technique for question answering models that perturbs word embeddings with learned noise, improving out-of-distribution generalization without altering the semantics of the input.

Contribution

It proposes a stochastic noise generator that learns to perturb the word embeddings of the input question and context while preserving their semantics, making QA models more robust across diverse domains.

Findings

01. Outperforms baseline data augmentation methods.

02. Performs significantly better than models trained with extensive synthetic data.

03. Is effective across five different target domains when trained on a single source dataset.

Abstract

QA models based on pretrained language models have achieved remarkable performance on various benchmark datasets. However, QA models do not generalize well to unseen data that falls outside the training distribution, due to distributional shifts. Data augmentation (DA) techniques that drop or replace words have been shown to be effective in regularizing the model against overfitting to the training data. Yet they may adversely affect QA tasks, since they incur semantic changes that may lead to wrong answers. To tackle this problem, we propose a simple yet effective DA method based on a stochastic noise generator, which learns to perturb the word embeddings of the input questions and context without changing their semantics. We validate the performance of QA models trained with our word embedding perturbation on a single source dataset and evaluated on five different target domains. The…
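To make the idea of input-dependent embedding noise concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract only says that the noise is stochastic and learned, so the multiplicative Gaussian form, the function names `noise_params` and `perturb`, and the weights `a` and `b` are all assumptions made for illustration. In a real model the weights would be trained jointly with the QA model rather than fixed by hand.

```python
import math
import random


def noise_params(embedding, a, b, scale):
    """Toy stand-in for a learned noise generator (hypothetical parameterization).

    Returns a per-dimension mean and standard deviation that depend on the input.
    """
    # Assumption: the noise mean is centred at 1 so the perturbation is mild.
    mu = [1.0 + a_i * math.tanh(e_i) for a_i, e_i in zip(a, embedding)]
    # A sigmoid keeps each standard deviation between 0 and scale.
    sigma = [scale / (1.0 + math.exp(-b_i * e_i)) for b_i, e_i in zip(b, embedding)]
    return mu, sigma


def perturb(embedding, a, b, rng, scale=0.1):
    """Multiply each dimension by noise z = mu + sigma * eps, with eps ~ N(0, 1)."""
    mu, sigma = noise_params(embedding, a, b, scale)
    return [
        e_i * (m + s * rng.gauss(0.0, 1.0))
        for e_i, m, s in zip(embedding, mu, sigma)
    ]


rng = random.Random(0)
embedding = [0.5, -1.2, 0.3, 2.0]  # one made-up word embedding
a = [0.0] * 4  # zero weights: the noise mean is exactly 1 in every dimension
b = [0.0] * 4  # zero weights: sigma is scale / 2 in every dimension

noisy = perturb(embedding, a, b, rng)                  # slightly rescaled copy
unchanged = perturb(embedding, a, b, rng, scale=0.0)   # zero variance: identity
```

Unlike dropping or replacing words, this kind of perturbation leaves every token in place and only rescales its vector, which is one way to read the abstract's claim that the augmentation does not change the semantics of the input.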

Code & Models

Repositories

seanie12/SWEP (PyTorch, official)
