Evaluating the Effect of Retrieval Augmentation on Social Biases
Tianhui Zhang, Yi Zhou, Danushka Bollegala

TL;DR
This study investigates how Retrieval Augmented Generation (RAG) influences social biases in language models across multiple languages and bias types, revealing that RAG can amplify existing biases in generated text.
Contribution
The paper systematically analyzes the impact of RAG components on social biases in multilingual NLG, highlighting potential amplification of biases and raising awareness for responsible deployment.
Findings
Biases in document collections are often amplified in RAG-generated responses.
Even low-bias LLMs can produce responses with significant social biases when using biased document collections.
RAG's influence on social biases varies across languages and bias types.
Abstract
Retrieval Augmented Generation (RAG) has gained popularity as a method for conveniently incorporating novel facts that were not seen during the pre-training stage in Large Language Model (LLM)-based Natural Language Generation (NLG) systems. However, LLMs are known to encode significant levels of unfair social biases. The modulation of these biases by RAG in NLG systems is not well understood. In this paper, we systematically study the relationship between the different components of a RAG system and the social biases present in the generated text across three languages (i.e. English, Japanese and Chinese) and four social bias types (i.e. gender, race, age and religion). Specifically, using the Bias Benchmark for Question Answering (BBQ) datasets, we evaluate the social biases in RAG responses from document collections with varying levels of stereotypical biases, employing multiple LLMs…
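The setup described in the abstract, document collections with varying levels of stereotypical bias and a bias measurement on the generated responses, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names `build_collection` and `bias_score`, the response labels `"stereo"`, `"anti"` and `"unknown"`, and the scoring formula are hypothetical and are not taken from the paper, whose actual metric and pipeline are not given in this summary.

```python
import random
from collections import Counter


def build_collection(stereo_docs, anti_docs, stereo_ratio, size, seed=0):
    """Sample a retrieval collection with a chosen share of stereotypical documents.

    Hypothetical helper: stereo_ratio=1.0 gives a fully stereotypical
    collection, 0.0 a fully anti-stereotypical one.
    """
    rng = random.Random(seed)
    n_stereo = round(size * stereo_ratio)
    # Sample with replacement so small source pools still fill the collection.
    docs = rng.choices(stereo_docs, k=n_stereo) + rng.choices(anti_docs, k=size - n_stereo)
    rng.shuffle(docs)
    return docs


def bias_score(labels):
    """Illustrative score in [-1, 1], not the paper's metric.

    Each label marks one response as 'stereo', 'anti' or 'unknown'.
    Positive values mean stereotype-aligned answers dominate.
    """
    if not labels:
        return 0.0
    counts = Counter(labels)
    return (counts["stereo"] - counts["anti"]) / len(labels)


if __name__ == "__main__":
    collection = build_collection(["s1", "s2"], ["a1"], stereo_ratio=0.75, size=8)
    print(len(collection))
    print(bias_score(["stereo", "stereo", "anti", "unknown"]))
```

Under these assumptions, comparing `bias_score` for the same generator across collections built with different `stereo_ratio` values would show whether bias in the collection carries over into the responses.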
Taxonomy
Topics: Technology Adoption and User Behaviour · Decision-Making and Behavioral Economics · Psychological and Educational Research Studies
Methods: Weight Decay · Dense Connections · Attention Dropout · Linear Layer · Layer Normalization · Byte Pair Encoding · Residual Connection · Attention Is All You Need · Linear Warmup With Linear Decay
