Domain Divergences: a Survey and Empirical Analysis
Abhinav Ramesh Kashyap, Devamanyu Hazarika, Min-Yen Kan, Roger Zimmermann

TL;DR
This paper surveys divergence measures used in NLP, classifies them into three categories, and empirically analyzes their effectiveness across multiple domain adaptation scenarios to guide better measure selection.
Contribution
It provides a taxonomy of divergence measures, organizes literature around three novel NLP applications, and offers empirical insights into their effectiveness with contextual word representations.
Findings
Information-theoretic measures are prevalent for data selection and decision tasks.
Higher-order measures with contextual representations are effective for learning representations.
Traditional divergence measures remain strong baselines in various NLP domain adaptation scenarios.
Abstract
Domain divergence plays a significant role in estimating the performance of a model in new domains. While there is a significant literature on divergence measures, researchers find it hard to choose an appropriate divergence for a given NLP application. We address this shortcoming both by surveying the literature and through an empirical study. We develop a taxonomy of divergence measures consisting of three classes -- Information-theoretic, Geometric, and Higher-order measures -- and identify the relationships between them. Further, to understand the common use-cases of these measures, we recognise three novel applications -- 1) Data Selection, 2) Learning Representation, and 3) Decisions in the Wild -- and use them to organise our literature. From this, we identify that Information-theoretic measures are prevalent for 1) and 3), and Higher-order measures are more common for 2). To further…
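To make the three classes in the taxonomy concrete, here is a minimal Python sketch that computes one measure from each class. The choice of representatives is an assumption made for illustration: this summary does not say which measures the paper assigns to each class, so Jensen-Shannon divergence (Information-theoretic), cosine distance (Geometric), and a CORAL-style covariance distance (Higher-order) are used as common examples. The function names and the toy Gaussian "domains" are hypothetical and not taken from the paper.

```python
import numpy as np


def js_divergence(p, q):
    """Information-theoretic example: Jensen-Shannon divergence (base 2)
    between two discrete distributions, e.g. unigram frequencies."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the sum.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cosine_distance(x, y):
    """Geometric example: 1 - cosine similarity between two vectors,
    e.g. the mean representations of two domains."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))


def coral_distance(xs, xt):
    """Higher-order example: squared Frobenius norm of the difference
    between feature covariances (second-order moments), scaled by 1/(4 d^2)."""
    d = xs.shape[1]
    cs = np.cov(xs, rowvar=False)
    ct = np.cov(xt, rowvar=False)
    return np.sum((cs - ct) ** 2) / (4 * d * d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-ins for sentence representations from two domains (assumption).
    source = rng.normal(0.0, 1.0, size=(200, 8))
    target = rng.normal(0.5, 2.0, size=(200, 8))

    print("JS divergence  :", js_divergence([5, 3, 2, 0], [1, 2, 3, 4]))
    print("Cosine distance:", cosine_distance(source.mean(0), target.mean(0)))
    print("CORAL distance :", coral_distance(source, target))
```

In practice the inputs would be word distributions or contextual representations extracted from the two domains rather than synthetic Gaussians; the sketch only shows what kind of quantity each class compares.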
