From RAG to RICHES: Retrieval Interlaced with Sequence Generation

Palak Jain; Livio Baldini Soares; Tom Kwiatkowski

arXiv:2407.00361·cs.CL·July 2, 2024

TL;DR

RICHES introduces a unified retrieval and sequence generation method that enables LLMs to retrieve and generate content in a single decoding pass, improving flexibility and performance in open-domain question answering tasks.

Contribution

It proposes a retrieval-interlaced sequence generation approach that eliminates the need for separate retriever and generator modules, adaptable to various tasks via prompting.

Findings

01 Strong performance on ODQA tasks

02 Supports multi-hop retrievals and attributed evidence

03 Operates without additional training

Abstract

We present RICHES, a novel approach that interleaves retrieval with sequence generation tasks. RICHES offers an alternative to conventional RAG systems by eliminating the need for separate retriever and generator. It retrieves documents by directly decoding their contents, constrained on the corpus. Unifying retrieval with generation allows us to adapt to diverse new tasks via prompting alone. RICHES can work with any Instruction-tuned model, without additional training. It provides attributed evidence, supports multi-hop retrievals and interleaves thoughts to plan on what to retrieve next, all within a single decoding pass of the LLM. We demonstrate the strong performance of RICHES across ODQA tasks including attributed and multi-hop QA.
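
The idea of retrieving "by directly decoding their contents, constrained on the corpus" can be illustrated with a small sketch. What follows is not the authors' implementation: it assumes whitespace tokens, a prefix trie as the corpus index, greedy decoding, and a toy word-overlap scorer standing in for LLM next-token scores. The names `build_trie`, `constrained_decode`, `overlap_scorer`, and `END` are hypothetical.

```python
from typing import Callable, List, Sequence

END = "<end>"  # hypothetical end-of-passage marker (assumption)

Scorer = Callable[[List[str], str], float]


def build_trie(passages: Sequence[str]) -> dict:
    """Index each passage as a path of whitespace tokens in a nested dict."""
    root: dict = {}
    for passage in passages:
        node = root
        for token in passage.split():
            node = node.setdefault(token, {})
        node[END] = {}  # mark that a full passage ends here
    return root


def constrained_decode(trie: dict, score: Scorer, max_len: int = 32) -> List[str]:
    """Greedy decoding restricted to tokens that continue some corpus passage.

    If max_len is reached first, the result is a prefix of a passage.
    """
    node = trie
    output: List[str] = []
    for _ in range(max_len):
        allowed = sorted(node)  # the only tokens the corpus permits next
        if not allowed:
            break
        # Pick the best-scoring allowed token; ties go to the first in sorted order.
        best = max(allowed, key=lambda tok: score(output, tok))
        if best == END:
            break
        output.append(best)
        node = node[best]
    return output


def overlap_scorer(question: str) -> Scorer:
    """Toy stand-in for a language model: favors tokens that occur in the question."""
    words = set(question.lower().split())

    def score(prefix: List[str], token: str) -> float:
        return 1.0 if token.lower() in words else 0.0

    return score


if __name__ == "__main__":
    corpus = ["France has capital Paris", "Germany has capital Berlin"]
    trie = build_trie(corpus)
    print(" ".join(constrained_decode(trie, overlap_scorer("capital of France"))))
```

Because every step chooses only among tokens the index allows, the decoded text is always a passage (or passage prefix) from the corpus, which is what makes the output usable as attributed evidence. A real system would replace the toy scorer with model logits and would likely use beam search rather than greedy selection.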

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Weight Decay · Multi-Head Attention · Residual Connection · WordPiece · Softmax · Byte Pair Encoding · Layer Normalization