Provence: efficient and robust context pruning for retrieval-augmented generation
Nadezhda Chirkova, Thibault Formal, Vassilina Nikoulina, Stéphane Clinchant

TL;DR
Provence is a novel, efficient context pruning method for retrieval-augmented generation that dynamically adapts to different scenarios, improving model efficiency without sacrificing performance across multiple domains.
Contribution
Provence introduces a universal, robust context pruning approach formulated as sequence labeling, unifying pruning and reranking, trained on diverse data for broad applicability.
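To make the sequence-labeling formulation concrete, here is a minimal sketch of how per-token keep/drop predictions could be turned into a pruned context. It is not the authors' implementation: the function `prune_context`, the `PrunedContext` container, the per-token probabilities, and the rule "keep a sentence when more than half of its tokens are labeled keep" are all assumptions made for illustration. The rerank score is simply passed through, to show that a single model call could return both outputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrunedContext:
    text: str             # pruned context passed on to the LLM
    kept: List[bool]      # one flag per input sentence
    rerank_score: float   # passage-level relevance score (passed through)


def prune_context(
    sentences: List[str],
    token_probs: List[List[float]],
    rerank_score: float,
    threshold: float = 0.5,
) -> PrunedContext:
    """Sentence-level pruning from token-level keep probabilities.

    Hypothetical inputs: `token_probs[i]` holds one keep probability per
    token of `sentences[i]`, as a sequence-labeling model might produce.
    Assumed rule: a sentence is kept when more than half of its tokens
    have a keep probability above `threshold`.
    """
    if len(sentences) != len(token_probs):
        raise ValueError("need one list of token probabilities per sentence")

    kept = []
    for probs in token_probs:
        if not probs:
            # A sentence with no tokens carries no evidence, so drop it.
            kept.append(False)
            continue
        positive = sum(1 for p in probs if p > threshold)
        kept.append(positive / len(probs) > 0.5)

    text = " ".join(s for s, k in zip(sentences, kept) if k)
    return PrunedContext(text=text, kept=kept, rerank_score=rerank_score)


example = prune_context(
    sentences=[
        "Paris is the capital of France.",
        "The weather was nice.",
        "It hosts the Louvre.",
    ],
    token_probs=[
        [0.9, 0.8, 0.95, 0.7, 0.9, 0.85],
        [0.1, 0.2, 0.05, 0.3],
        [0.6, 0.4, 0.7, 0.8],
    ],
    rerank_score=0.87,
)
print(example.text)
```

Because the number of kept sentences depends only on the predicted labels, the amount of pruning varies per context instead of being fixed in advance, which is the behavior the summary describes as dynamic.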
Findings
- Enables effective context pruning with minimal performance loss
- Operates efficiently across various domains and settings
- Requires negligible additional computational cost
Abstract
Retrieval-augmented generation improves various aspects of large language models (LLMs) generation, but suffers from computational overhead caused by long contexts as well as the propagation of irrelevant retrieved information into generated responses. Context pruning deals with both aspects, by removing irrelevant parts of retrieved contexts before LLM generation. Existing context pruning approaches are however limited, and do not provide a universal model that would be both efficient and robust in a wide range of scenarios, e.g., when contexts contain a variable amount of relevant information or vary in length, or when evaluated on various domains. In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given…
Code & Models
- 🤗 naver/provence-reranker-debertav3-v1 (model) · 1.0k dl · ♡ 68
- 🤗 hotchpotch/query-context-pruner-multilingual-Qwen3-4B (model) · 50 dl · ♡ 2
- 🤗 hotchpotch/open-provence-reranker-large-v1 (model) · 233 dl
- 🤗 hotchpotch/open-provence-reranker-v1 (model) · 45 dl
- 🤗 hotchpotch/open-provence-reranker-v1-gte-modernbert-base (model) · 342 dl · ♡ 1
- 🤗 hotchpotch/open-provence-reranker-xsmall-v1 (model) · 920 dl · ♡ 1
- 🤗 zilliz/semantic-highlight-bilingual-v1 (model) · 4.0k dl · ♡ 91
- 🤗 KRLabsOrg/squeez-2b (model) · 1.2k dl · ♡ 1
Taxonomy
Topics: Data Management and Algorithms · Web Data Mining and Analysis · Natural Language Processing Techniques
Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Softmax · Linear Warmup With Linear Decay · Adam · Residual Connection · Dropout · Byte Pair Encoding
