Can We Trust Recommender System Fairness Evaluation? The Role of Fairness and Relevance

Theresia Veronika Rampisela; Tuukka Ruotsalo; Maria Maistro; Christina Lioma

arXiv:2405.18276 · cs.IR · May 29, 2024

TL;DR

This paper evaluates the reliability of joint fairness and relevance measures in recommender systems, revealing their weak correlations, insensitivity to rank changes, and limited expressiveness, thus urging cautious use.

Contribution

It provides the first empirical analysis of joint fairness-relevance measures across multiple datasets and recommenders, highlighting their limitations and offering guidelines for proper usage.

Findings

01

Most measures correlate weakly and sometimes contradict each other.

02

They are less sensitive to rank position changes than traditional measures.

03

They tend to compress scores at the low end, limiting expressiveness.
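
The first finding is about agreement between measures. A minimal sketch of how such agreement could be quantified is a rank correlation between the scores that two measures assign to the same set of recommender runs. The summary above does not say which correlation coefficient the authors use, so Kendall's tau is an assumption here, and the measure names and scores are made up purely for illustration; they are not results from the paper.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four recommender runs (illustrative only,
# not taken from the paper). Each list is ordered by the same four runs.
scores = {
    "relevance_only": [0.30, 0.25, 0.20, 0.10],
    "fairness_only": [0.10, 0.20, 0.30, 0.40],
    "joint_measure": [0.15, 0.35, 0.25, 0.05],
}

# Pairwise agreement between every two measures.
agreement = {
    (a, b): kendall_tau(scores[a], scores[b])
    for a, b in combinations(scores, 2)
}

for (a, b), tau in agreement.items():
    print(f"{a} vs {b}: tau = {tau:+.3f}")
```

A tau near +1 means two measures rank the runs the same way, a value near 0 means weak agreement, and a negative value means they tend to contradict each other, which is the kind of behaviour the finding describes.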

Abstract

Relevance and fairness are two major objectives of recommender systems (RSs). Recent work proposes measures of RS fairness that are either independent from relevance (fairness-only) or conditioned on relevance (joint measures). While fairness-only measures have been studied extensively, we look into whether joint measures can be trusted. We collect all joint evaluation measures of RS relevance and fairness, and ask: How much do they agree with each other? To what extent do they agree with relevance/fairness measures? How sensitive are they to changes in rank position, or to increasingly fair and relevant recommendations? We empirically study for the first time the behaviour of these measures across 4 real-world datasets and 4 recommenders. We find that most of these measures: i) correlate weakly with one another and even contradict each other at times; ii) are less sensitive to rank…

Code & Models

Repositories

theresiavr/can-we-trust-recsys-fairness-evaluation
PyTorch · Official

Taxonomy

Methods: Neighborhood Contrastive Learning