On Generality and Knowledge Transferability in Cross-Domain Duplicate Question Detection for Heterogeneous Community Question Answering
Mohomed Shazan Mohomed Jabbar, Luke Kumar, Hamman Samuel, Mi-Young Kim, Sankalp Prabhakar, Randy Goebel, Osmar Zaïane

TL;DR
This paper investigates duplicate question detection across domains. It compares deep neural networks with gradient tree boosting and tests whether transfer learning helps on three heterogeneous community question answering datasets: Quora, Ask Ubuntu, and English Stack Exchange.
Contribution
The paper shows that what counts as a duplicate depends on the domain, which limits how well transfer learning carries knowledge from one domain to another.
Findings
Transfer learning shows limited success across domains.
Domain-specific definitions of duplicates affect transferability.
Deep neural networks and gradient tree boosting are compared on the text-pair duplicate classification task.
Abstract
Duplicate question detection is an ongoing challenge in community question answering because semantically equivalent questions can have significantly different words and structures. In addition, the identification of duplicate questions can reduce the resources required for retrieval, when the same questions are not repeated. This study compares the performance of deep neural networks and gradient tree boosting, and explores the possibility of domain adaptation with transfer learning to improve the under-performing target domains for the text-pair duplicates classification task, using three heterogeneous datasets: general-purpose Quora, technical Ask Ubuntu, and academic English Stack Exchange. Ultimately, our study exposes the alternative hypothesis that the meaning of a "duplicate" is not inherently general-purpose, but rather is dependent on the domain of learning, hence reducing the…
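The cross-domain setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: instead of deep neural networks or gradient tree boosting, it uses a single word-overlap (Jaccard) feature and a threshold fitted on a source domain, then applies that threshold unchanged to a target domain. The function names and the toy question pairs are hypothetical and were invented only for illustration; they are not taken from Quora, Ask Ubuntu, or English Stack Exchange.

```python
import re

def tokens(text):
    """Return the set of lowercase alphanumeric word tokens in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(q1, q2):
    """Word-overlap similarity between two questions, in [0, 1]."""
    a, b = tokens(q1), tokens(q2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def accuracy(pairs, threshold):
    """Fraction of pairs where (similarity >= threshold) matches the label."""
    correct = sum((jaccard(q1, q2) >= threshold) == label
                  for q1, q2, label in pairs)
    return correct / len(pairs)

def fit_threshold(pairs):
    """Pick the threshold on a 0.05 grid that maximises accuracy on pairs."""
    candidates = [i / 20 for i in range(21)]
    return max(candidates, key=lambda t: accuracy(pairs, t))

# Hypothetical toy pairs (question 1, question 2, is_duplicate).
# Source domain: duplicates share most of their wording.
SOURCE = [
    ("How do I learn Python quickly?", "How can I learn Python quickly?", True),
    ("What is the capital of France?", "How do I bake bread?", False),
]
# Target domain: a duplicate with no shared words, and a non-duplicate
# that differs in a single, decisive word.
TARGET = [
    ("Wifi not working after upgrade",
     "No wireless connection since updating to the new release", True),
    ("Wifi not working after upgrade", "Sound not working after upgrade", False),
]

threshold = fit_threshold(SOURCE)
source_acc = accuracy(SOURCE, threshold)
target_acc = accuracy(TARGET, threshold)
print(f"threshold={threshold}, source={source_acc}, target={target_acc}")
```

On these toy pairs the decision rule learned from the source domain fits the source perfectly but does worse on the target, because the target's notion of a duplicate is not captured by word overlap. This is only a small-scale analogue of the domain dependence the abstract describes.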
Taxonomy
Topics: Expert finding and Q&A systems · Topic Modeling · Information Retrieval and Search Behavior
