On Correlating Factors for Domain Adaptation Performance
Goksenin Yuksel, Jaap Kamps

TL;DR
This paper investigates factors influencing the success of domain adaptation in dense retrievers, highlighting the importance of domain similarity in generated queries for improved zero-shot retrieval performance.
Contribution
The paper identifies the domain similarity of generated queries as a key factor in domain adaptation success and compares two domain adaptation techniques in a case study.
Findings
Generated query type distribution impacts domain adaptation success.
Domain-tailored generated queries improve retrieval performance.
Similarity proxies between queries and domains are useful for analysis.
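To make "query type distribution" concrete, here is a minimal Python sketch that labels each generated query by its leading question word and reports the share of each label. The labeling rule, the word list, and the function names are illustrative assumptions; the summary does not say how the paper defines or measures query types.

```python
from collections import Counter

# Assumed label set for illustration; not taken from the paper.
QUESTION_WORDS = ("what", "how", "why", "when", "where", "who", "which")


def query_type(query: str) -> str:
    """Label a query by its first token if it is a question word, else 'other'."""
    tokens = query.lower().split()
    if tokens and tokens[0] in QUESTION_WORDS:
        return tokens[0]
    return "other"


def type_distribution(queries):
    """Return the fraction of queries falling under each type label."""
    counts = Counter(query_type(q) for q in queries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

Comparing the distribution computed on generated queries with the one computed on a sample of real test queries would be one simple way to see how closely the two sets match in query type.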
Abstract
Dense retrievers have demonstrated significant potential for neural information retrieval; however, they lack robustness to domain shifts, limiting their efficacy in zero-shot settings across diverse domains. In this paper, we analyze the factors that may lead to successful domain adaptation of dense retrievers. We include domain similarity proxies between the generated queries and the test and source domains. Furthermore, we conduct a case study comparing two powerful domain adaptation techniques. We find that the type distribution of generated queries is an important factor, and that generating queries from a domain similar to that of the test documents improves the performance of domain adaptation methods. This study further emphasizes the importance of domain-tailored generated queries.
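The abstract does not name the similarity proxies it uses. As one possible illustration, the Python sketch below computes the Jaccard overlap between the vocabulary of a set of generated queries and the vocabulary of a set of domain documents. The choice of Jaccard overlap, the tokenizer, and the function names are assumptions made for this example, not the authors' method.

```python
import re


def vocabulary(texts):
    """Collect the set of lowercase alphanumeric tokens across all texts."""
    vocab = set()
    for text in texts:
        vocab.update(re.findall(r"[a-z0-9]+", text.lower()))
    return vocab


def jaccard_similarity(texts_a, texts_b):
    """Vocabulary overlap between two text collections, in [0, 1]."""
    vocab_a, vocab_b = vocabulary(texts_a), vocabulary(texts_b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0  # both collections empty: define similarity as 0
    return len(vocab_a & vocab_b) / len(union)
```

Under this assumed proxy, a higher score between the generated queries and the test documents than between the generated queries and the source domain would suggest the queries are closer to the target domain.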
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Sparse Evolutionary Training
