$\texttt{MixGR}$: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity
Fengyu Cai, Xinran Zhao, Tong Chen, Sihao Chen, Hongming Zhang, Iryna Gurevych, Heinz Koeppl

TL;DR
MixGR is a zero-shot method that enhances scientific document retrieval by integrating multiple granularities of query-document matching, significantly improving retrieval accuracy and downstream question-answering performance.
Contribution
It introduces a novel granularity-aware fusion approach for dense retrievers, addressing domain-specific and complex query-document relationships in scientific retrieval.
Findings
Outperforms previous methods by up to 24.7% in nDCG@5
Improves retrieval for multi-subquery scientific queries
Boosts downstream scientific question-answering tasks
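For readers unfamiliar with the metric behind the headline figure, here is a minimal sketch of nDCG@k. It assumes linear gain (the relevance grade itself) and a log2 rank discount. Some evaluation toolkits use an exponential gain instead, and this page does not say which variant the paper reports. The function names are illustrative, not taken from the paper's code.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """nDCG@k for one query.

    relevances: graded relevance of every judged document, listed in the
    order the retriever ranked them (assumption: all judged documents
    appear in the list, so sorting it yields the ideal ranking).
    """
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents were judged for this query
    return dcg_at_k(relevances, k) / ideal_dcg

# A ranking that places the only relevant document second.
score = ndcg_at_k([0, 1, 0, 0, 0])
```

Because the score is normalized by the ideal ranking, a perfect ordering yields 1.0 no matter how many relevant documents a query has.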
Abstract
Recent studies show the growing significance of document retrieval in the generation of LLMs, i.e., RAG, within the scientific domain by bridging their knowledge gap. However, dense retrievers often struggle with domain-specific retrieval and complex query-document relationships, particularly when query segments correspond to various parts of a document. To alleviate such prevalent challenges, this paper introduces MixGR, which improves dense retrievers' awareness of query-document matching across various levels of granularity in queries and documents using a zero-shot approach. MixGR fuses various metrics based on these granularities to a united score that reflects a comprehensive query-document similarity. Our experiments demonstrate that MixGR outperforms previous document retrieval by 24.7%, 9.8%, and 6.9% on nDCG@5 with unsupervised, supervised, and…
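The idea of fusing granularity-specific metrics into one score can be illustrated with a small sketch. The abstract excerpt above does not name the fusion rule, so this example assumes reciprocal rank fusion (RRF), a common zero-shot way to merge ranked lists. The granularity labels, document ids, and the constant `k=60` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first rankings of document ids into one ranking.

    rankings: list of lists of document ids, each ordered best-first.
    k: smoothing constant (60 is a widely used default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical rankings of the same three documents, each produced by
# matching at a different query/document granularity.
whole_query_vs_whole_doc = ["d1", "d2", "d3"]
whole_query_vs_doc_part = ["d2", "d1", "d3"]
query_part_vs_doc_part = ["d2", "d3", "d1"]

fused = reciprocal_rank_fusion(
    [whole_query_vs_whole_doc, whole_query_vs_doc_part, query_part_vs_doc_part]
)
```

In this toy input, `d2` is ranked first by two of the three views and second by the other, so it comes out on top of the fused list. Rank-based fusion needs no training and no score calibration across views, which is in keeping with the zero-shot setting described in the abstract.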
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Attention Is All You Need · Residual Connection · Byte Pair Encoding · Layer Normalization · Linear Layer · Attention Dropout · Linear Warmup With Linear Decay · Adam · Dropout
