Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias
Mahdi Dehghan, Graham McDonald

TL;DR
This paper investigates how retrieval-augmented generation (RAG) systems impact fairness across different query groups, revealing that RAG can amplify accuracy disparities and that factors like exposure, utility, and attribution significantly influence fairness outcomes.
Contribution
It provides the first comprehensive analysis of query group fairness in RAG systems, highlighting the roles of exposure, utility, and attribution in fairness disparities.
Findings
RAG systems amplify accuracy disparities across query groups.
Group utility, exposure, and attribution strongly influence fairness outcomes.
RAG can worsen fairness compared to LLM-only systems.
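One way to make "accuracy disparity across query groups" concrete is the gap between the best-served and worst-served group, computed once for the LLM-only system and once for the RAG system. The sketch below is only an illustration: the max-minus-min gap, the function names, and the toy records are assumptions, not the metric or data used in the paper.

```python
def group_accuracy(records):
    """Map each query group to its accuracy.

    records: iterable of (group, is_correct) pairs, one per query.
    """
    totals, correct = {}, {}
    for group, is_correct in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(is_correct)
    return {g: correct[g] / totals[g] for g in totals}


def accuracy_gap(records):
    """Assumed disparity measure: best group accuracy minus worst."""
    acc = group_accuracy(records)
    return max(acc.values()) - min(acc.values())


# Made-up toy outcomes for two hypothetical query groups "A" and "B".
llm_only = [("A", True), ("A", False), ("B", True), ("B", False)]
rag = [("A", True), ("A", True), ("B", True), ("B", False)]

print("LLM-only gap:", accuracy_gap(llm_only))
print("RAG gap:", accuracy_gap(rag))
```

In this toy case RAG raises accuracy for group A only, so overall accuracy improves while the gap between groups widens, which is the kind of pattern the findings describe.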
Abstract
Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have achieved substantial improvements in accuracy by grounding their responses in external documents that are relevant to the user's query. However, relatively little work has investigated the impact of RAG in terms of fairness. In particular, it is not yet known whether queries associated with certain groups within a fairness category systematically receive higher accuracy, or accuracy improvements, in RAG systems compared to LLM-only systems, a phenomenon we refer to as query group fairness. In this work, we conduct extensive experiments to investigate the impact of three key factors on query group fairness in RAG, namely: Group exposure, i.e., the proportion of documents from each group appearing in the retrieved set, determined by the retriever; Group utility, i.e., the degree to which documents from each…
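The abstract defines group exposure as the proportion of documents from each group appearing in the retrieved set. A minimal sketch of that definition follows, assuming each retrieved document carries a single group label; the function name and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def group_exposure(retrieved_groups):
    """Return the share of retrieved documents that belong to each group.

    retrieved_groups: list of group labels, one per retrieved document.
    """
    if not retrieved_groups:
        return {}
    counts = Counter(retrieved_groups)
    total = len(retrieved_groups)
    return {group: n / total for group, n in counts.items()}


# Hypothetical group labels for a top-5 retrieved set (illustrative only).
example = ["A", "A", "B", "A", "B"]
exposure = group_exposure(example)
print(exposure)
```

Because the shares are computed over one retrieved set, they sum to 1 for each query; averaging them over all queries in a group would give a per-group summary.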
Taxonomy
Topics: Topic Modeling · Ethics and Social Impacts of AI · Information Retrieval and Search Behavior
