Evaluating the Robustness of Dense Retrievers in Interdisciplinary Domains
Sarthak Chaturvedi, Anurag Acharya, Rounak Meyur, Koby Hayashi, Sai Munikoti, Sameera Horawalavithana

TL;DR
This study shows that the perceived benefits of domain adaptation in retrieval models vary significantly depending on the evaluation benchmark's characteristics, affecting deployment decisions in specialized interdisciplinary domains.
Contribution
It demonstrates how benchmark features influence perceived domain adaptation benefits and highlights the importance of choosing appropriate evaluation methods for interdisciplinary retrieval tasks.
Findings
Different benchmarks yield vastly different perceived improvements from domain adaptation.
Higher semantic overlap in benchmarks correlates with larger observed benefits.
Benchmark selection critically impacts assessments of retrieval system effectiveness.
Abstract
Evaluation benchmark characteristics may distort the true benefits of domain adaptation in retrieval models. This creates misleading assessments that influence deployment decisions in specialized domains. We show that two benchmarks that differ drastically in features such as topic diversity, boundary overlap, and semantic complexity can influence the perceived benefits of fine-tuning. Using environmental regulatory document retrieval as a case study, we fine-tune a ColBERTv2 model on Environmental Impact Statements (EIS) from federal agencies. We evaluate these models across two benchmarks with different semantic structures. Our findings reveal that identical domain adaptation approaches show very different perceived benefits depending on evaluation methodology. On one benchmark, with clearly separated topic boundaries, domain adaptation shows small improvements (maximum 0.61% NDCG…
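The abstract reports improvements in NDCG. As a rough illustration of how such a figure could be computed, the sketch below implements NDCG@k with linear gain and a relative percentage change between a base and a fine-tuned model. It is not the authors' evaluation code: the function names are hypothetical, the ideal ranking is taken over the retrieved list only, and whether the paper reports relative or absolute gains is an assumption that cannot be confirmed from the abstract.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevances (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """NDCG@k; the ideal ordering is the same list sorted by relevance (an assumption)."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def relative_gain_pct(base_score, tuned_score):
    """Relative change of the fine-tuned score over the base score, in percent."""
    return 100.0 * (tuned_score - base_score) / base_score


# Hypothetical relevance labels for one query, in ranked order, for each model.
base_ranking = [0, 1, 0, 1]
tuned_ranking = [1, 1, 0, 0]

base = ndcg_at_k(base_ranking, k=4)
tuned = ndcg_at_k(tuned_ranking, k=4)
print(f"base={base:.4f} tuned={tuned:.4f} gain={relative_gain_pct(base, tuned):.2f}%")
```

Running the same comparison separately on each benchmark would make the gap in perceived benefit directly visible, since the metric and the models stay fixed and only the evaluation data changes.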
Taxonomy
Topics: Information Retrieval and Search Behavior · Expert finding and Q&A systems · Topic Modeling
