Improving Document Retrieval Coherence for Semantically Equivalent Queries
Stefano Campese, Alessandro Moschitti, Ivano Lauriola

TL;DR
This paper introduces a modified training loss for dense retrieval models that enhances their consistency in retrieving the same documents for semantically equivalent queries, leading to more stable and accurate retrieval.
Contribution
It proposes a variation of the Multi-Negative Ranking loss that improves retrieval coherence for semantically similar queries, reducing sensitivity to query variations.
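The idea of adding a coherence term to a Multi-Negative Ranking objective can be illustrated with a small NumPy sketch. This is not the paper's formulation, which is not given on this page. It assumes each query comes with one paraphrase, uses in-batch negatives, and takes a symmetric KL divergence between the two queries' score distributions as a stand-in for the coherence penalty. The names `mnr_loss`, `coherence_penalty`, `combined_loss`, `scale`, and `lam` are hypothetical.

```python
import numpy as np


def _normalize(x):
    # Unit-normalize rows so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def _log_softmax(x):
    # Numerically stable row-wise log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))


def mnr_loss(q, d, scale=20.0):
    """In-batch-negatives ranking loss: query i should rank document i first."""
    logp = _log_softmax(scale * _normalize(q) @ _normalize(d).T)
    return -np.mean(np.diag(logp))


def coherence_penalty(q, q_para, d, scale=20.0):
    """Assumed penalty: symmetric KL between the score distributions that a
    query and its paraphrase induce over the same batch of documents."""
    d = _normalize(d)
    lp = _log_softmax(scale * _normalize(q) @ d.T)
    lr = _log_softmax(scale * _normalize(q_para) @ d.T)
    kl_pr = (np.exp(lp) * (lp - lr)).sum(axis=1)
    kl_rp = (np.exp(lr) * (lr - lp)).sum(axis=1)
    return 0.5 * np.mean(kl_pr + kl_rp)


def combined_loss(q, q_para, d, lam=0.5, scale=20.0):
    # Relevance terms for both phrasings plus the weighted coherence term.
    relevance = mnr_loss(q, d, scale) + mnr_loss(q_para, d, scale)
    return relevance + lam * coherence_penalty(q, q_para, d, scale)
```

In this sketch the penalty is zero when a query and its paraphrase score the batch documents identically, and it grows as their rankings diverge, so `lam` trades off relevance against consistency.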
Findings
Models trained with the new loss show lower sensitivity to query variations.
The new loss leads to higher retrieval accuracy across multiple datasets.
Enhanced coherence improves robustness of document retrieval systems.
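Sensitivity to query variations can be made concrete by comparing the top-k documents returned for different phrasings of the same question. The sketch below uses Jaccard overlap as an assumed stand-in, since this page does not state the paper's exact coherence metric. The function names and document ids are hypothetical.

```python
from itertools import combinations


def topk_jaccard(ranking_a, ranking_b, k=10):
    """Jaccard overlap between the top-k document ids of two rankings."""
    a, b = set(ranking_a[:k]), set(ranking_b[:k])
    if not a and not b:
        return 1.0  # two empty result lists are trivially identical
    return len(a & b) / len(a | b)


def mean_coherence(rankings, k=10):
    """Average pairwise top-k overlap across rankings produced for
    paraphrases of a single query (1.0 means fully coherent)."""
    pairs = list(combinations(rankings, 2))
    if not pairs:
        return 1.0
    return sum(topk_jaccard(a, b, k) for a, b in pairs) / len(pairs)


# Two phrasings of one question that share two of their top-3 documents.
original = ["d1", "d2", "d3"]
paraphrase = ["d2", "d3", "d4"]
print(topk_jaccard(original, paraphrase, k=3))  # 0.5
```

A model that is less sensitive to rewording would push this score toward 1.0 across paraphrase sets.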
Abstract
Dense Retrieval (DR) models have proven to be effective for Document Retrieval and Information Grounding tasks. Usually, these models are trained and optimized to improve the relevance of top-ranked documents for a given query. Previous work has shown that popular DR models are sensitive to the query and document lexicon: small variations in wording may lead to a significant difference in the set of retrieved documents. In this paper, we propose a variation of the Multi-Negative Ranking loss for training DR models that improves their coherence in retrieving the same documents for semantically similar queries. The loss penalizes discrepancies between the top-k ranked documents retrieved for diverse but semantically equivalent queries. We conducted extensive experiments on various datasets: MS-MARCO, Natural Questions, BEIR, and TREC DL 19/20. The results show that (i) models…
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Advanced Database Systems and Queries
