Bayesian inference to improve quality of Retrieval Augmented Generation

Dattaraj Rao

arXiv:2408.08901·cs.IR·August 20, 2024

Open Access

TL;DR

This paper introduces a Bayesian method to evaluate and improve the quality of retrieved text chunks in Retrieval Augmented Generation, enhancing the relevance and accuracy of LLM responses.

Contribution

It proposes a Bayesian framework for post-search verification of text chunks, incorporating relevance likelihood and prior quality estimates to improve RAG system outputs.
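
Combining a relevance likelihood with a prior quality estimate can be illustrated as a Bayes-rule re-weighting of retrieved chunks. The sketch below is only an illustration under stated assumptions, not the paper's implementation: the `Chunk` type, the `posterior_scores` and `rerank` functions, and the numeric scores are hypothetical. It assumes the likelihood comes from some LLM-based relevance judgment and the prior is attached to the chunk's source page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    text: str
    likelihood: float  # assumed: LLM-judged relevance of the chunk to the query, in [0, 1]
    prior: float       # assumed: prior quality estimate, e.g. derived from the source page


def posterior_scores(chunks: List[Chunk]) -> List[float]:
    """Posterior for each chunk: likelihood * prior, normalized over all chunks."""
    unnormalized = [c.likelihood * c.prior for c in chunks]
    total = sum(unnormalized)
    if total == 0:
        # No chunk has any support; return zeros rather than divide by zero.
        return [0.0 for _ in chunks]
    return [u / total for u in unnormalized]


def rerank(chunks: List[Chunk], keep: int = 2) -> List[Chunk]:
    """Keep the `keep` chunks with the highest posterior, best first."""
    scored = sorted(
        zip(posterior_scores(chunks), chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [chunk for _, chunk in scored[:keep]]


# Hypothetical scores for three retrieved chunks.
candidates = [
    Chunk("chunk A", likelihood=0.9, prior=0.2),
    Chunk("chunk B", likelihood=0.6, prior=0.5),
    Chunk("chunk C", likelihood=0.1, prior=0.3),
]
selected = rerank(candidates, keep=2)
```

In this example, chunk A has the highest likelihood but a low prior, so chunk B ranks first once the prior is taken into account. That is the kind of prioritization between competing chunks that a post-search verification step is meant to provide.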

Findings

01. The Bayesian approach improves answer quality in RAG systems.

02. Likelihood estimation using an LLM improves the assessment of chunk relevance.

03. A page-based prior probability helps prioritize the more relevant text segments.

Abstract

Retrieval Augmented Generation (RAG) is the most popular pattern for modern Large Language Model (LLM) applications. RAG takes a user query and finds relevant paragraphs of context in a large corpus, typically stored in a vector database. After this first-level search over the vector database, the top n chunks of relevant text are included directly in the context and sent as a prompt to the LLM. The problem with this approach is that the quality of the text chunks depends on the effectiveness of the search. There is no strong post-processing after search to determine whether a chunk holds enough information to include in the prompt. In addition, chunks often contain conflicting information on the same subject, and the model has no prior experience to tell it which chunk to prioritize. Often, this leads to the model providing a statement that there are conflicting…
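
The baseline retrieval step described in the abstract (embed the query, take the top n chunks, place them in the prompt) can be sketched roughly as follows. This is a simplified illustration and not code from the paper: the function names, the toy two-dimensional embeddings, and the prompt layout are all assumptions, and a real system would query a vector database rather than scan a list in memory.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(query_vec: Sequence[float],
          index: List[Tuple[str, Sequence[float]]],
          n: int = 2) -> List[str]:
    """Return the texts of the n chunks most similar to the query embedding."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:n]]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Place the retrieved chunks directly in the context, with no post-search checks."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Toy index of (chunk text, embedding) pairs; the embeddings are made up.
toy_index = [
    ("chunk x", [1.0, 0.0]),
    ("chunk y", [0.0, 1.0]),
    ("chunk z", [1.0, 1.0]),
]
retrieved = top_n([1.0, 0.1], toy_index, n=2)
prompt = build_prompt("What does the document say?", retrieved)
```

Nothing in this pipeline checks whether a retrieved chunk actually contains enough information to answer the question, which is the gap the abstract points to.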

Taxonomy

Topics: Caching and Content Delivery · Advanced Data Storage Technologies · Algorithms and Data Compression

Methods: Linear Layer · Attention Dropout · WordPiece · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay · Weight Decay · Adam · Attention Is All You Need