RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

Yue Yu; Wei Ping; Zihan Liu; Boxin Wang; Jiaxuan You; Chao Zhang; Mohammad Shoeybi; Bryan Catanzaro

arXiv:2407.02485·cs.CL·July 3, 2024·5 cites

Open Access

TL;DR

RankRAG introduces a unified instruction fine-tuning approach for LLMs that enhances both context ranking and answer generation, outperforming existing models on multiple knowledge-intensive benchmarks and demonstrating strong domain generalization.

Contribution

The paper presents RankRAG, a novel fine-tuning framework that unifies context ranking and answer generation in LLMs, improving performance without extensive domain-specific data.

Findings

01. Outperforms expert ranking models while adding only a small fraction of ranking data to the training blend.

02. Surpasses strong baselines such as GPT-4 and ChatQA-1.5 on knowledge-intensive tasks.

03. Generalizes well to the biomedical domain without domain-specific fine-tuning.

Abstract

Large language models (LLMs) typically utilize the top-k contexts from a retriever in retrieval-augmented generation (RAG). In this work, we propose a novel instruction fine-tuning framework RankRAG, which instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. In particular, the instruction-tuned LLMs work surprisingly well by adding a small fraction of ranking data into the training blend, and outperform existing expert ranking models, including the same LLM exclusively fine-tuned on a large amount of ranking data. For generation, we compare our model with many strong baselines, including GPT-4-0613, GPT-4-turbo-2024-0409, and ChatQA-1.5, an open-sourced model with the state-of-the-art performance on RAG benchmarks. Specifically, our Llama3-RankRAG significantly outperforms Llama3-ChatQA-1.5 and GPT-4 models on nine knowledge-intensive…
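The abstract describes one model serving two roles: ranking retrieved contexts and generating the answer. A minimal sketch of how such a rank-then-generate loop might be wired is shown here. The inference flow itself is an assumption, since the abstract excerpt does not spell it out, and every name below (`overlap_score`, `rerank`, `build_prompt`, `rank_then_generate`) is hypothetical. The word-overlap scorer is only a toy stand-in for a relevance score produced by the LLM, and `generate_fn` is a placeholder for the same LLM's generation call.

```python
import re
from typing import Callable, List


def overlap_score(question: str, context: str) -> float:
    """Toy stand-in for an LLM relevance score (assumption, not the paper's scorer):
    the fraction of distinct question words that also appear in the context."""
    q_words = set(re.findall(r"\w+", question.lower()))
    c_words = set(re.findall(r"\w+", context.lower()))
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


def rerank(question: str, contexts: List[str],
           score_fn: Callable[[str, str], float], k: int) -> List[str]:
    """Score every retrieved context and keep the k highest-scoring ones."""
    scored = [(score_fn(question, c), c) for c in contexts]
    # Sorting on the score only; Python's sort is stable, so ties keep retriever order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(question: str, contexts: List[str]) -> str:
    """Concatenate the kept contexts and the question into one generation prompt."""
    lines = [f"Context {i}: {c}" for i, c in enumerate(contexts, start=1)]
    return "\n".join(lines) + f"\nQuestion: {question}\nAnswer:"


def rank_then_generate(question: str, retrieved: List[str],
                       score_fn: Callable[[str, str], float],
                       generate_fn: Callable[[str], str], k: int = 2) -> str:
    """Rerank the retriever output, then generate from the reranked top-k."""
    top_contexts = rerank(question, retrieved, score_fn, k)
    return generate_fn(build_prompt(question, top_contexts))


if __name__ == "__main__":
    retrieved = [
        "Bananas are yellow.",
        "France borders Spain.",
        "Paris is the capital of France.",
    ]
    # The echo lambda stands in for a real generation call.
    print(rank_then_generate("What is the capital of France?",
                             retrieved, overlap_score, lambda prompt: prompt))
```

In a setup like the one the abstract describes, `score_fn` and `generate_fn` would both be backed by the same instruction-tuned model; they are kept as separate callables here only so the sketch runs without any model.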

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Data Mining Algorithms and Applications

Methods: Attention Is All You Need · Linear Layer · Weight Decay · Multi-Head Attention · WordPiece · Softmax · Byte Pair Encoding · Layer Normalization · Label Smoothing