Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems

Xuyang Wu; Shuowei Li; Hsin-Tai Wu; Zhiqiang Tao; Yi Fang

arXiv:2409.19804 · cs.CL · March 28, 2025 · 2 citations

TL;DR

This paper empirically evaluates fairness in Retrieval-Augmented Generation (RAG) systems, revealing persistent biases across demographic attributes despite improvements in utility, and highlights the need for targeted fairness interventions.

Contribution

The paper introduces a fairness evaluation framework for RAG models and provides empirical evidence of fairness issues in current RAG systems.

Findings

1. Fairness disparities exist in both retrieval and generation stages of RAG.
2. Recent utility-focused optimization does not eliminate fairness concerns.
3. The study offers a publicly available dataset and code for further research.

Abstract

Retrieval-Augmented Generation (RAG) has recently gained significant attention for its enhanced ability to integrate external knowledge sources into open-domain question answering (QA) tasks. However, it remains unclear how these models address fairness concerns, particularly with respect to sensitive attributes such as gender, geographic location, and other demographic factors. First, as language models evolve to prioritize utility, like improving exact match accuracy, fairness considerations may have been largely overlooked. Second, the complex, multi-component architecture of RAG methods poses challenges in identifying and mitigating biases, as each component is optimized for distinct objectives. In this paper, we aim to empirically evaluate fairness in several RAG methods. We propose a fairness evaluation framework tailored to RAG, using scenario-based questions and analyzing…
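
To make the tension between utility and fairness concrete, here is a minimal sketch of how exact match (the utility metric named in the abstract) could be broken down by demographic group. The normalization follows the common SQuAD-style convention. The per-group gap (`em_gap`), the helper names, and the toy records are all assumptions for illustration; the abstract does not specify the paper's own fairness metrics, so this should not be read as the authors' method.

```python
import re
import string
from collections import defaultdict


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize(prediction)
    return float(any(pred == normalize(gold) for gold in gold_answers))


def em_by_group(records):
    """Average exact match per group.

    `records` is an iterable of (group, prediction, gold_answers) tuples.
    The group label is a hypothetical stand-in for a sensitive attribute.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, prediction, gold_answers in records:
        totals[group] += exact_match(prediction, gold_answers)
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in counts}


def em_gap(per_group):
    """Illustrative disparity measure (an assumption): best minus worst group EM."""
    return max(per_group.values()) - min(per_group.values())


# Toy data, invented purely for illustration.
records = [
    ("group_a", "The Eiffel Tower", ["Eiffel Tower"]),
    ("group_a", "Paris", ["Lyon"]),
    ("group_b", "Marie Curie", ["Marie Curie"]),
    ("group_b", "1903.", ["1903"]),
]
per_group = em_by_group(records)
gap = em_gap(per_group)
print(per_group, gap)
```

A system can score well on overall exact match while still showing a large gap between groups, which is why an aggregate utility number alone does not settle the fairness question.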

Code & Models

Repositories

elviswxy/rag_fairness (PyTorch, official)

Taxonomy

Topics: Auction Theory and Applications

Methods: Attention Is All You Need · Attention Dropout · WordPiece · Linear Warmup With Linear Decay · Linear Layer · Weight Decay · Byte Pair Encoding · BERT · Softmax · Dropout