Contrastive Domain Adaptation for Question Answering using Limited Text Corpora

Zhenrui Yue; Bernhard Kratzwald; Stefan Feuerriegel

arXiv:2108.13854 · cs.CL · September 1, 2021

TL;DR

This paper introduces CAQA, a contrastive domain adaptation framework that enhances question answering in niche domains with limited text data by combining question generation and domain-invariant learning.

Contribution

The paper presents a novel contrastive domain adaptation method for QA that effectively leverages limited target domain data and synthetic question-answer pairs.

Findings

1. Significant performance improvements over baselines.
2. Effective in low-resource domain adaptation.
3. Combines question generation with domain-invariant learning.

Abstract

Question generation has recently shown impressive results in customizing question answering (QA) systems to new domains. These approaches circumvent the need for manually annotated training data from the new domain and instead generate synthetic question-answer pairs that are used for training. However, existing methods for question generation rely on large synthetically generated datasets and costly computational resources, which render these techniques widely inaccessible when the text corpus is of limited size. This is problematic because many niche domains rely on small text corpora, which naturally restricts the amount of synthetic data that can be generated. In this paper, we propose a novel framework for domain adaptation called contrastive domain adaptation for QA (CAQA). Specifically, CAQA combines techniques from question generation and domain-invariant learning to…
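
The abstract is truncated here and does not spell out the training objective, so the following is only a minimal sketch of the general idea behind domain-invariant learning, not CAQA's actual loss. It assumes a discrepancy penalty based on maximum mean discrepancy (MMD) with an RBF kernel, added to an ordinary QA loss. The function names, the `weight` and `bandwidth` parameters, and the toy feature vectors are all hypothetical.

```python
import math


def rbf_kernel_mean(xs, ys, bandwidth):
    """Mean RBF kernel value over all pairs (x, y) of feature vectors."""
    total = 0.0
    for x in xs:
        for y in ys:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors.

    Small values mean the source and target features are hard to tell
    apart, which is the goal of domain-invariant learning.
    """
    return (
        rbf_kernel_mean(source, source, bandwidth)
        + rbf_kernel_mean(target, target, bandwidth)
        - 2.0 * rbf_kernel_mean(source, target, bandwidth)
    )


def adapted_loss(qa_loss, source_feats, target_feats, weight=0.1, bandwidth=1.0):
    """Hypothetical combined objective: QA loss plus a weighted MMD penalty."""
    return qa_loss + weight * mmd_squared(source_feats, target_feats, bandwidth)


if __name__ == "__main__":
    # Toy 2-D features standing in for encoder outputs (assumed, not real data).
    source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    near_target = [[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]]
    far_target = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
    print("penalty, near target:", mmd_squared(source, near_target))
    print("penalty, far target: ", mmd_squared(source, far_target))
```

In this sketch, source features would come from labeled source-domain examples and target features from synthetic question-answer pairs generated on the target corpus. Minimizing the penalty pushes the encoder toward representations that look alike across the two domains.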

Code & Models

Repositories

yueeeeeeee/caqa (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques