Principled Context Engineering for RAG: Statistical Guarantees via Conformal Prediction

Debashish Chakraborty, Eugene Yang, Daniel Khashabi, Dawn Lawrie, and Kevin Duh

arXiv:2511.17908 · cs.CL · March 25, 2026

Open Access

TL;DR

This paper introduces a conformal prediction-based framework for context engineering in Retrieval-Augmented Generation, providing statistical guarantees for evidence retention and significantly reducing context size while maintaining factual accuracy.

Contribution

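It presents a coverage-controlled filtering method, based on conformal prediction, that removes irrelevant retrieved content and shrinks the context passed to the generator in RAG systems while giving a statistical guarantee on how much relevant evidence is retained.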

Findings

01

Conformal filtering reliably meets target coverage levels.

02

Conformal filtering reduces retained context by 2-3x compared to unfiltered retrieval.

03

Factual accuracy remains stable or improves under strict filtering.

Abstract

Retrieval-Augmented Generation (RAG) enhances factual grounding in large language models (LLMs) by incorporating retrieved evidence, but LLM accuracy declines when long or noisy contexts exceed the model's effective attention span. Existing pre-generation filters rely on heuristics or uncalibrated LLM confidence scores, offering no statistical control over retained evidence. We evaluate and demonstrate context engineering through conformal prediction, a coverage-controlled filtering framework that removes irrelevant content while preserving recall of supporting evidence. Using both embedding- and LLM-based scoring functions, we test this approach on the NeuCLIR and RAGTIME collections. Conformal filtering consistently meets its target coverage, ensuring that a specified fraction of relevant snippets are retained, and reduces retained context by 2-3x relative to unfiltered retrieval. On…
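
The coverage-controlled filtering idea in the abstract can be illustrated with a minimal sketch of standard split conformal calibration. This is not taken from the paper, whose exact procedure is not shown here. It assumes that each snippet has a scalar relevance score (higher means more relevant), that a held-out calibration set of scores for known-relevant snippets is available, and that calibration and test snippets are exchangeable. The function names `calibrate_threshold` and `filter_snippets` and all scores below are hypothetical.

```python
import math


def calibrate_threshold(relevant_scores, alpha):
    """Pick a score threshold from calibration scores of known-relevant snippets.

    Keeping snippets with score >= threshold retains a new relevant snippet
    with probability at least 1 - alpha, assuming exchangeability.
    (Hypothetical helper, standard split conformal quantile.)
    """
    n = len(relevant_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Nonconformity is the negated relevance score: low relevance = high nonconformity.
    nonconformity = sorted(-s for s in relevant_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep everything.
        return float("-inf")
    return -nonconformity[k - 1]


def filter_snippets(snippets, scores, threshold):
    """Keep only snippets whose score reaches the calibrated threshold."""
    return [snip for snip, sc in zip(snippets, scores) if sc >= threshold]


# Made-up calibration scores for nine relevant snippets.
calibration = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
threshold = calibrate_threshold(calibration, alpha=0.2)

# Made-up retrieved snippets with scores from the same (assumed) scorer.
kept = filter_snippets(["a", "b", "c"], [0.95, 0.15, 0.2], threshold)
print(threshold, kept)
```

A smaller `alpha` demands higher coverage and therefore a lower threshold, so fewer snippets are filtered out. When the calibration set is too small for the requested `alpha`, this sketch falls back to keeping everything rather than giving an unsupported guarantee.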

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education