ParetoRAG: Leveraging Sentence-Context Attention for Robust and Efficient Retrieval-Augmented Generation

Ruobing Yao; Yifei Zhang; Shuang Song; Yuhua Liu; Neng Gao; Chenyang Tu

arXiv:2502.08178·cs.CL·February 13, 2025

Open Access

TL;DR

ParetoRAG introduces a sentence-level refinement framework for RAG systems that improves retrieval accuracy and generation quality by dynamically re-weighting core content, validated across multiple datasets and models.

Contribution

It proposes an unsupervised, sentence-level optimization method for RAG systems that enhances efficiency and relevance without extra training or API costs.

Findings

01 Improves retrieval precision and generation quality

02 Validated across various datasets and models

03 Does not require additional training or API resources

Abstract

While Retrieval-Augmented Generation (RAG) systems enhance Large Language Models (LLMs) by incorporating external knowledge, they still face persistent challenges in retrieval inefficiency and the inability of LLMs to filter out irrelevant information. We present ParetoRAG, an unsupervised framework that optimizes RAG systems through sentence-level refinement guided by the Pareto principle. By decomposing paragraphs into sentences and dynamically re-weighting core content while preserving contextual coherence, ParetoRAG achieves dual improvements in both retrieval precision and generation quality without requiring additional training or API resources. This framework has been empirically validated across various datasets, LLMs, and retrievers.
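The abstract describes the pipeline only at a high level, so the following is a minimal sketch of what sentence-level refinement with context-aware re-weighting might look like, not the authors' implementation. Everything specific here is an assumption made for illustration: the function names, the bag-of-words cosine similarity (a stand-in for a real sentence encoder), the mixing weight `alpha` between a sentence and its neighbours, and the `keep_ratio` of 0.2 chosen to echo the Pareto principle.

```python
import math
import re
from collections import Counter


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def bow(text):
    """Bag-of-words vector; a stand-in for a real sentence encoder (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(query, paragraphs, alpha=0.7, keep_ratio=0.2):
    """Score every sentence and keep the top fraction.

    score = alpha * sim(query, sentence) + (1 - alpha) * sim(query, neighbours)
    Both alpha and keep_ratio are hypothetical values, not taken from the paper.
    """
    q = bow(query)
    scored = []
    for p_idx, paragraph in enumerate(paragraphs):
        sentences = split_sentences(paragraph)
        for i, sent in enumerate(sentences):
            # Context = the immediately preceding and following sentence, if any.
            context = " ".join(sentences[max(0, i - 1):i] + sentences[i + 1:i + 2])
            score = alpha * cosine(q, bow(sent)) + (1 - alpha) * cosine(q, bow(context))
            scored.append((score, p_idx, i, sent))
    scored.sort(key=lambda item: item[0], reverse=True)
    keep = max(1, math.ceil(keep_ratio * len(scored)))
    return scored[:keep]


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France. The weather was nice today. I like tea.",
        "Bananas are yellow. Cats sleep a lot.",
    ]
    for score, p_idx, s_idx, sent in rank_sentences("What is the capital of France?", docs):
        print(f"{score:.3f} (paragraph {p_idx}, sentence {s_idx}): {sent}")
```

Mixing in the neighbouring sentences is one simple way to reward a sentence for sitting in relevant surroundings, which is the spirit of "preserving contextual coherence"; a real system would likely use dense embeddings and a tuned weight.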

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece