TL;DR
This paper introduces PEER, a fairness metric for multilingual information retrieval (MLIR) systems that tests whether documents in different languages are ranked fairly, exposing language biases in existing approaches.
Contribution
The work proposes PEER, the first language fairness metric for MLIR, and demonstrates its effectiveness through experiments on artificial and real benchmark data.
Findings
PEER scores reveal biases in existing MLIR systems.
PEER aligns with prior fairness analyses in MLIR.
Implementation is compatible with ir-measures.
Abstract
Multilingual information retrieval (MLIR) considers the problem of ranking documents in several languages for a query expressed in a language that may differ from any of those languages. Recent work has observed that approaches such as merging ranked lists that each cover a single document language, or using multilingual pretrained language models, demonstrate a preference for one language over others. This results in systematically unfair treatment of documents in different languages. This work proposes a language fairness metric to evaluate whether documents across different languages are fairly ranked through statistical equivalence testing using the Kruskal-Wallis test. In contrast to most prior work in group fairness, we do not consider any language to be an unprotected group. Thus our proposed measure, PEER (Probability of Equal Expected Rank), is the first fairness metric…
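The abstract names the Kruskal-Wallis test as the statistical tool behind the metric. The sketch below illustrates only that underlying test on a single ranked list, not the paper's PEER definition, which is not given in full here. It assumes each document's rank position is distinct, uses the chi-square approximation for the p-value (rough for short lists), and the helper names `chi2_sf` and `kruskal_wallis` as well as the language codes are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Chi-square survival function for integer df >= 1 (closed form)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite series over odd double factorials.
    term, total = 1.0, 0.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * math.sqrt(2 * x / math.pi) * total


def kruskal_wallis(groups):
    """Return (H, p) for a dict mapping language -> distinct rank positions.

    Assumption: no ties, since each document holds one position in the list.
    """
    if len(groups) < 2:
        raise ValueError("need at least two language groups")
    pooled = sorted(pos for positions in groups.values() for pos in positions)
    rank_of = {pos: r for r, pos in enumerate(pooled, start=1)}
    n_total = len(pooled)
    weighted = 0.0
    for positions in groups.values():
        rank_sum = sum(rank_of[p] for p in positions)
        weighted += rank_sum ** 2 / len(positions)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, chi2_sf(h, len(groups) - 1)


# Hypothetical ranked lists over three document languages.
interleaved = {"zh": [1, 4, 7, 10], "fa": [2, 5, 8, 11], "ru": [3, 6, 9, 12]}
blocked = {"zh": [1, 2, 3, 4], "fa": [5, 6, 7, 8], "ru": [9, 10, 11, 12]}

h_fair, p_fair = kruskal_wallis(interleaved)
h_biased, p_biased = kruskal_wallis(blocked)
print(f"interleaved: H={h_fair:.3f}, p={p_fair:.3f}")
print(f"blocked:     H={h_biased:.3f}, p={p_biased:.4f}")
```

In this toy setup the interleaved list yields a high p-value (no evidence that any language is ranked systematically higher), while the blocked list, which places all documents of one language first, yields a low one. A pure-Python chi-square tail is used only to keep the sketch dependency-free; a statistics library would be the more usual choice in practice.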
Taxonomy
Methods: ALIGN
