Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering

Wei Yang, Yuqing Xie, Luchen Tan, Kun Xiong, Ming Li, and Jimmy Lin

arXiv:1904.06652 · cs.CL · April 16, 2019 · 37 cites

TL;DR

This paper introduces a data augmentation method based on distant supervision to improve BERT fine-tuning for open-domain question answering, yielding large gains over previous approaches on English QA datasets and establishing new baselines on two recent Chinese QA datasets.

Contribution

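The paper proposes stage-wise fine-tuning of BERT on multiple datasets, starting with the data "furthest" from the test data and ending with the "closest", combined with distantly supervised data augmentation that exploits both positive and negative examples.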

Findings

01. Large improvements over previous methods on English QA datasets

02. Established new baselines on Chinese QA datasets

03. Effective use of distant supervision for data augmentation

Abstract

Recently, a simple combination of passage retrieval using off-the-shelf IR techniques and a BERT reader was found to be very effective for question answering directly on Wikipedia, yielding a large improvement over the previous state of the art on a standard benchmark dataset. In this paper, we present a data augmentation technique using distant supervision that exploits positive as well as negative examples. We apply a stage-wise approach to fine tuning BERT on multiple datasets, starting with data that is "furthest" from the test data and ending with the "closest". Experimental results show large gains in effectiveness over previous approaches on English QA datasets, and we establish new baselines on two recent Chinese QA datasets.
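The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that distant supervision here means labeling each retrieved passage by whether it contains the known answer string (positive if it does, negative if it does not), and that each dataset comes with a hand-assigned "distance" from the test data. The names `Example`, `label_passages`, `stage_order`, the dataset names, and the distance values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass
class Example:
    question: str
    passage: str
    # Character offset of the answer in the passage, or -1 for a negative example.
    answer_start: int


def label_passages(question: str, answer: str, passages: Iterable[str]) -> List[Example]:
    """Distantly label retrieved passages by exact answer-string match (assumed rule)."""
    examples = []
    for passage in passages:
        # str.find returns -1 when the answer is absent, which marks a negative example.
        start = passage.find(answer)
        examples.append(Example(question, passage, start))
    return examples


def stage_order(datasets: Sequence[Tuple[str, int]]) -> List[str]:
    """Order (name, distance) pairs so the "furthest" dataset is fine-tuned on first."""
    ranked = sorted(datasets, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked]


# Hypothetical retrieved passages for one question-answer pair.
examples = label_passages(
    "Who wrote Hamlet?",
    "Shakespeare",
    [
        "Hamlet is a tragedy by William Shakespeare.",
        "Hamlet is set in Denmark.",
    ],
)

# Hypothetical datasets with hand-assigned distances from the test data.
stages = stage_order([("target_qa", 0), ("distant_qa", 2), ("related_qa", 1)])
```

In this sketch the first passage becomes a positive example with a recoverable answer span, the second becomes a negative example, and `stages` lists the datasets in the order a fine-tuning loop would visit them. How "distance" from the test data is determined is not stated in the abstract, so the integer ranks are only a stand-in.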

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax