From Retrieval to Generation: Comparing Different Approaches
Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Mohammed Ali, Adam Jatowt

TL;DR
This paper systematically compares retrieval, generation, and hybrid models for knowledge-intensive tasks like ODQA, highlighting their strengths, weaknesses, and optimal use cases through extensive evaluation and analysis.
Contribution
It provides a comprehensive evaluation of different approaches, offering practical insights and detailed comparisons to guide future development in retrieval-augmented language modeling.
Findings
DPR achieves 50.17% top-1 accuracy on NQ in ODQA.
Hybrid models improve nDCG@10 on BEIR from 43.42 (BM25) to 52.59.
Retrieval-based methods like BM25 have lower perplexity in language modeling.
Abstract
Knowledge-intensive tasks, particularly open-domain question answering (ODQA), document reranking, and retrieval-augmented language modeling, require a balance between retrieval accuracy and generative flexibility. Traditional retrieval models such as BM25 and Dense Passage Retrieval (DPR) efficiently retrieve from large corpora but often lack semantic depth. Generative models like GPT-4-o provide richer contextual understanding but face challenges in maintaining factual consistency. In this work, we conduct a systematic evaluation of retrieval-based, generation-based, and hybrid models, with a primary focus on their performance in ODQA and related retrieval-augmented tasks. Our results show that dense retrievers, particularly DPR, achieve strong performance in ODQA with a top-1 accuracy of 50.17% on NQ, while hybrid models improve nDCG@10 scores on BEIR from 43.42 (BM25) to 52.59,…
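For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how top-1 accuracy and nDCG@10 are commonly computed. It is not the authors' evaluation code. The function names and the toy relevance labels are hypothetical, and the sketch assumes each input list holds the relevance labels of all judged documents for a query, in ranked order.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the ranking divided by the DCG of the ideal ordering.

    Assumption: `relevances` covers every judged document for the query,
    so sorting it in descending order gives the ideal ranking.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def top1_accuracy(rankings):
    """Fraction of queries whose first-ranked passage is relevant.

    `rankings` is a list of per-query label lists (1 = contains the answer).
    """
    if not rankings:
        return 0.0
    hits = sum(1 for labels in rankings if labels and labels[0] > 0)
    return hits / len(rankings)


# Hypothetical toy labels, for illustration only.
toy_ranking = [1, 0, 1]
toy_queries = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
toy_ndcg = ndcg_at_k(toy_ranking, k=10)
toy_top1 = top1_accuracy(toy_queries)
```

In this toy case two of the four queries have a relevant passage at rank 1, so the top-1 accuracy is 0.5. Benchmark scores such as those reported for NQ and BEIR are averages of these per-query values over the full query set.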
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Multimodal Machine Learning Applications
