Enhancing Retrieval in QA Systems with Derived Feature Association
Keyush Shah, Abhishek Goyal, Isaac Wasserman

TL;DR
This paper introduces RAIDD, an extension to RAG systems that uses LLM-derived features like summaries and questions to improve retrieval accuracy in long-context QA tasks.
Contribution
The paper proposes RAIDD, a novel retrieval method that incorporates inferred features from documents to enhance retrieval relevance in QA systems.
Findings
RAIDD outperforms traditional RAG in long-context QA tasks.
Inferred features improve retrieval relevance.
Enhanced retrieval leads to better answer accuracy.
Abstract
Retrieval-augmented generation (RAG) has become the standard in long-context question answering (QA) systems. However, typical implementations of RAG rely on a rather naive retrieval mechanism, in which the texts whose embeddings are most similar to that of the query are deemed most relevant. This has consequences in subjective QA tasks, where the most relevant text may not directly contain the answer. In this work, we propose a novel extension to RAG systems, which we call Retrieval from AI Derived Documents (RAIDD). RAIDD leverages the full power of the LLM in the retrieval process by deriving inferred features, such as summaries and example questions, from the documents at ingest. We demonstrate that this approach significantly improves the performance of RAG systems on long-context QA tasks.
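The idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the `CANNED` summaries and questions are hand-written placeholders for what an LLM might derive at ingest, and mapping each derived feature back to its source document is an assumption about how the association works. All names here are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(docs, derive):
    """Embed every derived feature and remember which document it came from."""
    return [(embed(feature), doc_id)
            for doc_id, text in docs.items()
            for feature in derive(doc_id, text)]

def retrieve(query, index, k=1):
    """Rank documents by their best-matching indexed feature."""
    q = embed(query)
    best = {}
    for vec, doc_id in index:
        best[doc_id] = max(cosine(q, vec), best.get(doc_id, 0.0))
    return sorted(best, key=best.get, reverse=True)[:k]

DOCS = {
    "d1": "Members voted 7 to 2 to push the work on the river crossing "
          "to spring, citing frost damage to fresh concrete.",
    "d2": "The bridge cafe reopened last week with a new menu.",
}

# Hand-written placeholders for LLM-derived summaries and example questions
# (assumption: in a real system an LLM would generate these at ingest).
CANNED = {
    "d1": ["The council delayed the bridge repair until spring because of frost risk.",
           "Why was the bridge repair delayed?"],
    "d2": ["A cafe near the bridge reopened with a new menu.",
           "What is new at the bridge cafe?"],
}

query = "Why was the bridge repair delayed?"

# Baseline: index the raw document text only.
naive_index = build_index(DOCS, lambda doc_id, text: [text])
naive_top = retrieve(query, naive_index)

# Derived-feature variant: index summaries and questions, return source ids.
derived_index = build_index(DOCS, lambda doc_id, text: CANNED[doc_id])
raidd_top = retrieve(query, derived_index)

print("raw-text index:", naive_top)
print("derived-feature index:", raidd_top)
```

In this toy corpus the raw-text index ranks the cafe note first, because it shares the word "bridge" with the query, while the derived-feature index ranks the council minutes first, because a derived question matches the query. A real system would swap in an actual embedding model and LLM-generated features, and the outcome would depend on both.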
Taxonomy
Topics: Blind Source Separation Techniques · Machine Learning and ELM · Neural Networks and Applications
Methods: Attention Is All You Need · WordPiece · Attention Dropout · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Dropout · Byte Pair Encoding · BERT
