RAG-E: Quantifying Retriever-Generator Alignment and Failure Modes

Korbinian Randl; Guido Rocchietti; Aron Henriksson; Ziawasch Abedjan; Tony Lindgren; John Pavlopoulos

arXiv:2601.21803 · cs.CL · January 30, 2026

TL;DR

RAG-E is a framework that explains and measures how well the retrieval and generation components of a RAG system align. It reveals significant failure modes in which generators ignore top-ranked documents or rely on less relevant ones, which affects output quality.

Contribution

The paper introduces RAG-E, a novel explainability framework with new attribution methods and metrics to analyze retriever-generator interactions in RAG systems.

Findings

01

Generators frequently ignore the retriever's top-ranked documents (47.4%-66.7% of queries)

02

Generators rely significantly on less relevant documents (48.1%-65.9%)

03

Alignment issues affect output quality beyond individual component performance
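
To make the first finding concrete, here is a minimal sketch of how an "ignored top-ranked document" rate could be computed from per-document attribution scores. The function name `top_doc_ignored_rate`, the `threshold` parameter, and the rule that a document counts as ignored when its absolute attribution is at or below that threshold are all assumptions made for illustration; the paper's own criterion may differ.

```python
from typing import Sequence


def top_doc_ignored_rate(
    attributions: Sequence[Sequence[float]],
    threshold: float = 0.0,
) -> float:
    """Fraction of queries whose rank-1 document receives negligible attribution.

    attributions[q][r] is the generator attribution assigned to the document
    that the retriever placed at rank r (0 = top) for query q.
    ASSUMPTION: "ignored" means |attribution| <= threshold.
    """
    if not attributions:
        raise ValueError("need at least one query")
    ignored = sum(1 for scores in attributions if abs(scores[0]) <= threshold)
    return ignored / len(attributions)


# Hypothetical attribution scores for four queries with three documents each.
example_scores = [
    [0.0, 0.7, 0.3],  # top document ignored
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],  # top document ignored
    [0.2, 0.1, 0.7],
]
rate = top_doc_ignored_rate(example_scores)
```

In this toy input, two of the four queries give the top-ranked document zero attribution, so the rate is 0.5.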

Abstract

Retrieval-Augmented Generation (RAG) systems combine dense retrievers and language models to ground LLM outputs in retrieved documents. However, the opacity of how these components interact creates challenges for deployment in high-stakes domains. We present RAG-E, an end-to-end explainability framework that quantifies retriever-generator alignment through mathematically grounded attribution methods. Our approach adapts Integrated Gradients for retriever analysis, introduces PMCSHAP, a Monte Carlo-stabilized Shapley Value approximation, for generator attribution, and introduces the Weighted Attribution-Relevance Gap (WARG) metric to measure how well a generator's document usage aligns with a retriever's ranking. Empirical analysis on TREC CAsT and FoodSafeSum reveals critical misalignments: for 47.4% to 66.7% of queries, generators ignore the retriever's top-ranked documents, while…
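
The abstract describes generator attribution through a Monte Carlo approximation of Shapley Values. The sketch below shows the generic permutation-sampling estimator, not PMCSHAP itself, whose stabilization details are not given here. The names `mc_shapley` and `toy_value`, and the toy value function standing in for a generator-quality score over a subset of retrieved documents, are hypothetical.

```python
import random
from typing import Callable, FrozenSet, List


def mc_shapley(
    n_docs: int,
    value_fn: Callable[[FrozenSet[int]], float],
    n_permutations: int = 200,
    seed: int = 0,
) -> List[float]:
    """Estimate per-document Shapley values by sampling random orderings."""
    rng = random.Random(seed)
    totals = [0.0] * n_docs
    order = list(range(n_docs))
    for _ in range(n_permutations):
        rng.shuffle(order)
        included = set()
        prev = value_fn(frozenset(included))
        for doc in order:
            # Marginal contribution of `doc` given the documents before it.
            included.add(doc)
            cur = value_fn(frozenset(included))
            totals[doc] += cur - prev
            prev = cur
    return [t / n_permutations for t in totals]


# ASSUMPTION: toy value function in which only the best document in the
# subset matters, so documents 1 and 2 are redundant with each other.
WEIGHTS = [0.6, 0.3, 0.3]


def toy_value(subset: FrozenSet[int]) -> float:
    return max((WEIGHTS[i] for i in subset), default=0.0)


phi = mc_shapley(3, toy_value)
```

For this toy game the exact Shapley values are 0.4, 0.1, and 0.1, so the estimates should land close to those. The estimates always sum to the value of the full document set minus the value of the empty set, because the marginal contributions telescope within each sampled ordering.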

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Information Retrieval and Search Behavior