Data Generation via Latent Factor Simulation for Fairness-aware Re-ranking

Elena Stefancova; Cassidy All; Joshua Paup; Martin Homola; Nicholas Mattei; Robin Burke

arXiv:2409.14078 · cs.IR · September 24, 2024

TL;DR

This paper introduces a synthetic data generation method for fairness-aware recommender systems. The generated data lets researchers study re-ranking algorithms and protected groups without relying on sensitive data that needs privacy protection.

Contribution

It proposes a new type of synthetic data for fairness research: simulated recommender system outputs that can serve as input to re-ranking algorithms, in contrast to earlier work that concentrated on generating user-item interaction data.

Findings

01. Synthetic data effectively simulates protected group interactions.

02. The method facilitates fairness evaluation without real sensitive data.

03. It supports testing of re-ranking algorithms in controlled scenarios.

Abstract

Synthetic data is a useful resource for algorithmic research. It allows for the evaluation of systems under a range of conditions that might be difficult to achieve in real-world settings. In recommender systems, the use of synthetic data is somewhat limited; some work has concentrated on building user-item interaction data at large scale. We believe that fairness-aware recommendation research can benefit from simulated data, as it allows the study of protected groups and their interactions without depending on sensitive data that needs privacy protection. In this paper, we propose a novel type of data for fairness-aware recommendation: synthetic recommender system outputs that can be used to study re-ranking algorithms.
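
To make the idea of "synthetic recommender system outputs" concrete, the following is a minimal sketch of one way such data could be simulated from latent factors. It is not the authors' procedure, which the abstract does not detail. The function name `generate_synthetic_lists`, all parameters, the standard-normal factor distribution, the dot-product scoring, and the random protected-item flag are assumptions made purely for illustration.

```python
import random


def generate_synthetic_lists(n_users=5, n_items=20, n_factors=3,
                             protected_share=0.3, list_len=10, seed=0):
    """Toy latent-factor simulation of recommender outputs (hypothetical helper)."""
    rng = random.Random(seed)

    # Assumption: user and item latent factors are i.i.d. standard normal draws.
    users = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_users)]
    items = [[rng.gauss(0, 1) for _ in range(n_factors)] for _ in range(n_items)]

    # Assumption: each item is flagged as protected independently at random.
    protected = [rng.random() < protected_share for _ in range(n_items)]

    lists = []
    for user in users:
        # Score every item by the dot product of user and item factors.
        scored = [(sum(a * b for a, b in zip(user, item)), idx)
                  for idx, item in enumerate(items)]
        # Keep the top-scoring items as this user's recommendation list.
        scored.sort(reverse=True)
        lists.append([(idx, score, protected[idx])
                      for score, idx in scored[:list_len]])
    return lists


if __name__ == "__main__":
    for user_id, rec_list in enumerate(generate_synthetic_lists()):
        n_protected = sum(1 for _, _, is_protected in rec_list if is_protected)
        print(f"user {user_id}: {n_protected}/{len(rec_list)} protected items")
```

Each returned list holds `(item_id, score, is_protected)` tuples sorted by descending score, which is the kind of scored candidate list a fairness-aware re-ranker would take as input. Varying `protected_share` or the factor distribution would give the controlled conditions the abstract refers to.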

Taxonomy

Topics: Data Quality and Management · Bayesian Modeling and Causal Inference · Data Management and Algorithms