Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey

Sourav Verma

arXiv:2409.13385 · cs.CL · October 3, 2024 · 3 cites

TL;DR

This survey reviews the evolution of Contextual Compression techniques in Retrieval-Augmented Generation for large language models, highlighting their benefits, limitations, and future research directions to enhance model performance.

Contribution

It provides an in-depth overview of the development, challenges, and potential future directions of Contextual Compression in RAG for LLMs.

Findings

01 RAG improves LLM coherence and knowledge integration.

02 Contextual Compression addresses RAG's context window limitations.

03 Future research directions include reducing overhead and relevance filtering.

Abstract

Large Language Models (LLMs) showcase remarkable abilities, yet they struggle with limitations such as hallucinations, outdated knowledge, opacity, and inexplicable reasoning. To address these challenges, Retrieval-Augmented Generation (RAG) has proven to be a viable solution: it leverages external databases to improve the consistency and coherence of generated content, which is especially valuable for complex, knowledge-rich tasks, and it facilitates continuous improvement by drawing on domain-specific insights. By combining the intrinsic knowledge of LLMs with the vast, dynamic repositories of external databases, RAG achieves a synergistic effect. However, RAG is not without its limitations, including a limited context window, irrelevant information, and high processing overhead for extensive contextual data. In this comprehensive work, we explore the evolution of Contextual Compression…

Code & Models

Repositories

srgrace/contextual-compression (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Recommender Systems and Techniques

Methods: Attention Is All You Need · Linear Layer · Attention Dropout · Dense Connections · Multi-Head Attention · Linear Warmup With Linear Decay · Weight Decay · Adam · WordPiece