No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu, Hongyi Wu, Zihan Guan, Ronghang Zhu, Dongliang Guo, Daiqing Qi, Sheng Li

TL;DR
Retrieval-Augmented Generation improves LLMs but can inadvertently cause fairness issues, undermining bias mitigation efforts even with external datasets designed to be unbiased.
Contribution
This paper reveals the fairness risks of RAG in LLMs and introduces a three-level threat model to analyze how a user's awareness of fairness determines the degree of fairness censorship applied to the external dataset.
Findings
RAG can lead to biased outputs even with unbiased datasets
RAG alone can undermine fairness alignment, with no fine-tuning or retraining required
Current methods are insufficient to guarantee fairness in RAG-based LLMs
Abstract
Retrieval-Augmented Generation (RAG) is widely adopted for its effectiveness and cost-efficiency in mitigating hallucinations and enhancing the domain-specific generation capabilities of large language models (LLMs). However, is this effectiveness and cost-efficiency truly a free lunch? In this study, we comprehensively investigate the fairness costs associated with RAG by proposing a practical three-level threat model from the perspective of user awareness of fairness. Specifically, varying levels of user fairness awareness result in different degrees of fairness censorship on the external dataset. We examine the fairness implications of RAG using uncensored, partially censored, and fully censored datasets. Our experiments demonstrate that fairness alignment can be easily undermined through RAG without the need for fine-tuning or retraining. Even with fully censored and supposedly…
Taxonomy
Topics: Access Control and Trust · IoT and Edge/Fog Computing · Privacy-Preserving Technologies in Data
Methods: WordPiece · Attention Dropout · Attention Is All You Need · Linear Layer · Weight Decay · Linear Warmup With Linear Decay · Dropout · Byte Pair Encoding · BERT
