Enhancing Retrieval-Augmented Generation with Two-Stage Retrieval: FlashRank Reranking and Query Expansion

Sherine George

arXiv:2601.03258 · cs.IR · January 8, 2026

TL;DR

This paper introduces a two-stage retrieval system for retrieval-augmented generation. It combines LLM-driven query expansion, which widens candidate recall, with a fast reranker (FlashRank) that selects evidence under a token budget, aiming for better factual accuracy at lower cost.

Contribution

The paper presents a novel two-stage retrieval pipeline with LLM-driven query expansion and FlashRank reranking, optimizing evidence selection for RAG systems.

Findings

01. Increased answer accuracy and faithfulness.

02. Improved retrieval recall and relevance.

03. Enhanced computational efficiency.

Abstract

Retrieval-Augmented Generation (RAG) couples a retriever with a large language model (LLM) to ground generated responses in external evidence. While this framework enhances factuality and domain adaptability, it faces a key bottleneck: balancing retrieval recall with limited LLM context. Retrieving too few passages risks missing critical context, while retrieving too many overwhelms the prompt window, diluting relevance and increasing cost. We propose a two-stage retrieval pipeline that integrates LLM-driven query expansion to improve candidate recall and FlashRank, a fast marginal-utility reranker that dynamically selects an optimal subset of evidence under a token budget. FlashRank models document utility as a weighted combination of relevance, novelty, brevity, and cross-encoder evidence. Together, these modules form a generalizable solution that increases answer accuracy,…
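The second stage described in the abstract can be illustrated with a small greedy selector. This is only a sketch under stated assumptions, not the paper's implementation: the `Candidate` fields, the weights in `utility`, the Jaccard-based `novelty`, the length-based `brevity`, and whitespace token counting are all hypothetical choices made for illustration. It shows the general idea of scoring each passage by a weighted combination of relevance, novelty, brevity, and cross-encoder evidence, then adding passages one at a time until the token budget is used up.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    relevance: float    # assumed first-stage retriever score in [0, 1]
    cross_score: float  # assumed cross-encoder score in [0, 1]


def tokens(text):
    # Whitespace split stands in for a real tokenizer (assumption).
    return text.lower().split()


def novelty(cand, selected):
    # 1 minus the highest Jaccard overlap with any already selected passage.
    words = set(tokens(cand.text))
    if not selected or not words:
        return 1.0
    best = 0.0
    for other in selected:
        other_words = set(tokens(other.text))
        union = words | other_words
        if union:
            best = max(best, len(words & other_words) / len(union))
    return 1.0 - best


def brevity(cand, budget):
    # Shorter passages score higher; clipped at zero.
    return max(0.0, 1.0 - len(tokens(cand.text)) / budget)


def utility(cand, selected, budget, w=(0.4, 0.2, 0.1, 0.3)):
    # Hypothetical weights for relevance, novelty, brevity, cross-encoder score.
    return (w[0] * cand.relevance
            + w[1] * novelty(cand, selected)
            + w[2] * brevity(cand, budget)
            + w[3] * cand.cross_score)


def select_under_budget(candidates, budget):
    # Greedy loop: re-score the remaining passages against the current
    # selection and take the best one that still fits the budget.
    selected, remaining, used = [], list(candidates), 0
    while remaining:
        fitting = [c for c in remaining
                   if used + len(tokens(c.text)) <= budget]
        if not fitting:
            break
        best = max(fitting, key=lambda c: utility(c, selected, budget))
        selected.append(best)
        used += len(tokens(best.text))
        remaining.remove(best)
    return selected
```

Because novelty is recomputed after every pick, a near-duplicate of an already selected passage loses most of its score, which is what lets a less relevant but complementary passage take the remaining budget.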

Taxonomy

Topics: Information Retrieval and Search Behavior · Topic Modeling · Multimodal Machine Learning Applications