Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation

To Eun Kim; Fernando Diaz

arXiv:2409.11598·cs.IR·July 8, 2025

TL;DR

This paper systematically evaluates the impact of fairness-aware ranking in retrieval-augmented generation (RAG) systems, showing that fairness can be integrated without sacrificing performance while promoting more equitable source attribution.

Contribution

It introduces the first comprehensive evaluation of fairness-aware retrieval in RAG, highlighting its benefits for system effectiveness and source attribution.

Findings

01

Fairness-aware retrieval maintains or improves ranking and generation quality.

02

Fair retrieval practices lead to more balanced source attribution.

03

Incorporating fairness does not compromise system performance.

Abstract

Despite the central role of retrieval in retrieval-augmented generation (RAG) systems, much of the existing research on RAG overlooks the well-established field of fair ranking and fails to account for the interests of all stakeholders involved. In this paper, we conduct the first systematic evaluation of RAG systems that integrate fairness-aware rankings, addressing both ranking fairness and attribution fairness, which ensures equitable exposure of the sources cited in the generated content. Our evaluation focuses on measuring item-side fairness, specifically the fair exposure of relevant items retrieved by RAG systems, and investigates how this fairness impacts both the effectiveness of the systems and the attribution of sources in the generated output that users ultimately see. By experimenting with twelve RAG models across seven distinct tasks, we show that incorporating…
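The notion of fair exposure among relevant items mentioned in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: rankings are sampled from a Plackett-Luce distribution controlled by a hypothetical `temperature` parameter, exposure follows a logarithmic position discount, and the function names (`sample_ranking`, `expected_exposure`, `exposure_gap`) are invented for this example.

```python
import math
import random
from collections import defaultdict


def sample_ranking(scores, temperature, rng):
    """Sample one ranking from a Plackett-Luce distribution (assumed sampler)."""
    remaining = dict(scores)
    ranking = []
    while remaining:
        items = list(remaining)
        top = max(remaining.values())
        # Subtract the max score before exponentiating for numerical stability.
        weights = [math.exp((remaining[i] - top) / temperature) for i in items]
        chosen = rng.choices(items, weights=weights, k=1)[0]
        ranking.append(chosen)
        del remaining[chosen]
    return ranking


def expected_exposure(scores, temperature, top_k, n_samples=2000, seed=0):
    """Monte Carlo estimate of each item's average exposure in the top-k.

    Exposure at 0-based position p is 1 / log2(p + 2), an assumed browsing model.
    """
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_samples):
        ranking = sample_ranking(scores, temperature, rng)
        for pos, item in enumerate(ranking[:top_k]):
            totals[item] += 1.0 / math.log2(pos + 2)
    return {item: totals[item] / n_samples for item in scores}


def exposure_gap(exposure, relevant_items):
    """Largest difference in expected exposure among the relevant items."""
    values = [exposure[i] for i in relevant_items]
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical retriever scores: a, b, c are relevant; d is not.
    scores = {"a": 3.0, "b": 2.9, "c": 2.8, "d": 0.5}
    relevant = ["a", "b", "c"]
    for temperature in (0.01, 10.0):
        exposure = expected_exposure(scores, temperature, top_k=2)
        print(temperature, round(exposure_gap(exposure, relevant), 3))
```

In this toy setup, a near-deterministic ranking (low temperature) always places the same items in the top two positions passed to the generator, so one relevant item gets almost no exposure. A more stochastic ranking spreads exposure more evenly across relevant items, at the cost of sometimes surfacing the non-relevant one.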

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 8 · Confidence 4

Strengths

Pros:
- The problem discussed in the paper is interesting and important.
- The experimental studies and findings are useful to the research community, although the experiments still have some limitations; for example, only fair exposure is considered.
- The paper is well written and easy to follow.

Weaknesses

Cons:
- Given that long-context modeling is now widely applied in many LLMs, it would be great if the discussion in the paper could be extended to such models. I believe that if more results can be fed into LLMs, the fairness problem would differ from the one studied in the paper.
- More advanced settings should also be considered. For example, current RAG systems have refiner components, and fairness in these components deserves more discussion.

Reviewer 02 · Rating 3 · Confidence 4

Strengths

1. The motivation is clear and the experimental setup is introduced in detail. 2. The paper is well written.

Weaknesses

1. This paper is a trivial work with incremental contributions, exploring the fairness impact of LLM-based RAG systems. In fact, it offers few valuable findings compared to previous studies. Many previous studies found that retrieval diversity (akin to fairness) and position bias (e.g., the lost-in-the-middle phenomenon) strongly influence RAG performance. 2. The experiments are not sufficiently thorough. There are only two main discussions about the relationships among ranking fairness, r

Reviewer 03 · Rating 3 · Confidence 3

Strengths

1. The research problem is interesting and important, and addressing it is essential for the responsible deployment of RAG systems. 2. Experimental results show that fair rankings can maintain or even improve the generation quality of RAG.

Weaknesses

1. Section 3 introduces extensive notation and terminology that may be unnecessary, making the experimental settings difficult to follow. Simplifying this section by clearly explaining the evaluation settings and metrics without excessive symbols would enhance readability and comprehension. 2. The paper evaluates only one fair ranking method, which limits the generalizability of the findings. Incorporating other item-side fairness ranking methods (e.g., refer to [1]) would strengthe

Reviewer 04 · Rating 5 · Confidence 4

Strengths

S1: This paper presents an interesting scenario: providing more equitable exposure for different items in RAG leads to improved performance outcomes. S2: The authors conduct extensive experiments to show a general trend of a tradeoff between ensuring fairness and maintaining system effectiveness.

Weaknesses

W1: Does this qualify as a definition of fairness? An equitable outcome is not necessarily a fair one. Fairness [1,2] leans more toward a subjective goal: for instance, even if retrieval achieves 100% accuracy, it may still conflict with human values, such as when certain categories receive less exposure. Bias, by contrast, mainly concerns the final utility (an objective goal). In this setting, it seems to be a form of bias because the final RAG goal is to gain more

Code & Models

Repositories

kimdanny/fair-rag
Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Machine Learning and ELM

Methods: Attention Is All You Need · Attention Dropout · WordPiece · Dense Connections · Residual Connection · Linear Layer · Multi-Head Attention · Linear Warmup With Linear Decay · Adam