Provence: efficient and robust context pruning for retrieval-augmented generation

Nadezhda Chirkova; Thibault Formal; Vassilina Nikoulina; Stéphane Clinchant

arXiv:2501.16214 · cs.CL · January 31, 2025 · 2 cites

TL;DR

Provence is a context pruner for retrieval-augmented question answering. It removes irrelevant parts of retrieved contexts before LLM generation and detects how much pruning each context needs, shortening the input with minimal loss in answer quality across multiple domains.

Contribution

Provence formulates context pruning as a sequence labeling task, unifies pruning and reranking in a single model, and is trained on diverse data so that one pruner applies across a wide range of scenarios.
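To make the sequence labeling framing concrete, here is a minimal sketch, not the authors' implementation. It assumes a labeling model has already assigned each token a "keep" probability; the `ScoredSentence` type, the `prune_context` function, the 0.5 threshold, and the majority-vote rule for turning token labels into sentence decisions are all illustrative assumptions. The point it shows is that the number of kept sentences is not fixed in advance: it depends on the labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredSentence:
    """A context sentence with hypothetical per-token 'keep' probabilities."""
    text: str
    token_probs: List[float]


def prune_context(sentences: List[ScoredSentence], threshold: float = 0.5) -> str:
    """Keep a sentence when a strict majority of its tokens are labeled 'keep'.

    The majority rule and the threshold are assumptions made for illustration.
    """
    kept = []
    for sentence in sentences:
        if not sentence.token_probs:
            continue  # no tokens, so nothing to vote on
        keep_votes = sum(p > threshold for p in sentence.token_probs)
        if keep_votes * 2 > len(sentence.token_probs):
            kept.append(sentence.text)
    return " ".join(kept)


# Toy probabilities written by hand; a real pruner would predict them.
context = [
    ScoredSentence("Paris is the capital of France.", [0.9, 0.8, 0.7, 0.9, 0.6, 0.8]),
    ScoredSentence("The weather was sunny that day.", [0.1, 0.2, 0.1, 0.3, 0.2, 0.1]),
    ScoredSentence("It lies on the Seine.", [0.7, 0.4, 0.8, 0.9, 0.6]),
]
pruned = prune_context(context)
print(pruned)
```

With these toy scores, the first and third sentences are kept and the second is dropped. A context with no relevant tokens would be pruned to an empty string, and a fully relevant one would be kept whole.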

Findings

01. Enables effective context pruning with minimal performance loss

02. Operates efficiently across various domains and settings

03. Requires negligible additional computational cost

Abstract

Retrieval-augmented generation improves various aspects of large language model (LLM) generation, but suffers from computational overhead caused by long contexts, as well as from the propagation of irrelevant retrieved information into generated responses. Context pruning addresses both problems by removing irrelevant parts of retrieved contexts before LLM generation. Existing context pruning approaches are, however, limited: they do not provide a universal model that is both efficient and robust in a wide range of scenarios, e.g., when contexts contain a variable amount of relevant information or vary in length, or when evaluated on various domains. In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given…

Videos

Provence: efficient and robust context pruning for retrieval-augmented generation (SlidesLive)

Taxonomy

Topics: Data Management and Algorithms · Web Data Mining and Analysis · Natural Language Processing Techniques

Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Softmax · Linear Warmup With Linear Decay · Adam · Residual Connection · Dropout · Byte Pair Encoding