Adversarial Domain Adaptation for Duplicate Question Detection

Darsh J Shah; Tao Lei; Alessandro Moschitti; Salvatore Romeo; Preslav Nakov

arXiv:1809.02255 · cs.CL · September 10, 2018 · 5 cites

TL;DR

This paper explores adversarial domain adaptation for detecting duplicate questions in forums that lack annotated data, reporting an average improvement of 5.6% over the best baseline across multiple pairs of StackExchange domains.

Contribution

It introduces an adversarial domain adaptation approach tailored for duplicate question detection and analyzes factors influencing its effectiveness.

Findings

01 Average 5.6% improvement over the best baseline across multiple domain pairs

02 Effectiveness depends on domain similarity and data properties

03 Provides insights into when adversarial adaptation works best

Abstract

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as duplicates, and thus a promising solution is to use domain adaptation from another forum that has such annotations. Here we focus on adversarial domain adaptation, deriving important findings about when it performs well and what properties of the domains are important in this regard. Our experiments with StackExchange data show an average improvement of 5.6% over the best baseline across multiple pairs of domains.
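
The adversarial idea mentioned in the abstract can be illustrated with a generic objective. This is a minimal sketch under stated assumptions, not necessarily the formulation used in the paper. It assumes a shared question encoder trained to minimize a duplicate-detection loss on the labeled source forum while maximizing the loss of a domain classifier that tries to tell source questions from target questions. The function names, the use of binary cross-entropy for both losses, and the weight `lam` are hypothetical choices made for illustration.

```python
import math


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def encoder_objective(dup_probs, dup_labels, domain_probs, domain_labels, lam=0.1):
    """Hypothetical encoder objective: task loss minus lam times domain loss.

    dup_probs / dup_labels: duplicate predictions and gold labels for
        source-forum question pairs (the only annotated data).
    domain_probs / domain_labels: domain classifier outputs and true domains
        (1 = source, 0 = target) for questions from both forums.
    lam: assumed trade-off weight between the two terms.
    """
    task_loss = binary_cross_entropy(dup_probs, dup_labels)
    domain_loss = binary_cross_entropy(domain_probs, domain_labels)
    # The encoder is rewarded when the domain classifier does badly,
    # which pushes it towards domain-invariant question representations.
    return task_loss - lam * domain_loss


# Same task predictions, two different states of the domain classifier.
dup_probs, dup_labels = [0.8, 0.3], [1, 0]
confused = encoder_objective(dup_probs, dup_labels, [0.5, 0.5], [1, 0])
confident = encoder_objective(dup_probs, dup_labels, [0.9, 0.1], [1, 0])
print(confused < confident)
```

In this sketch the encoder's objective is lower when the domain classifier is at chance than when it separates the two forums confidently, so minimizing it favors representations that look alike across domains. The domain classifier itself would be trained separately to minimize its own loss.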

Code & Models

Repositories

darsh10/qra_code
pytorch

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Sentiment Analysis and Opinion Mining