The 2021 RecSys Challenge Dataset: Fairness is not optional

Luca Belli; Alykhan Tejani; Frank Portman; Alexandre Lung-Yut-Fong; Ben Chamberlain; Yuanpu Xie; Kristian Lum; Jonathan Hunt; Michael Bronstein; Vito Walter Anelli; Saikishore Kalloori; Bruce Ferwerda; Wenzhe Shi

arXiv:2109.08245 · cs.SI · September 23, 2021

Open Access

TL;DR

This paper introduces a large, Twitter-synced dataset for the 2021 RecSys Challenge, emphasizing fairness considerations and dynamic data updates to better reflect real-world recommender system challenges.

Contribution

It presents a significantly larger, fairness-aware dataset that is synchronized with Twitter platform changes, addressing challenges of real-time data updates in recommender systems.

Findings

01 Dataset size increased fivefold to ~1 billion data points.

02 Incorporation of fairness considerations into dataset design.

03 Dynamic synchronization with Twitter platform updates.

Abstract

After the success of the RecSys 2020 Challenge, we describe a novel and bigger dataset that was released in conjunction with the ACM RecSys Challenge 2021. This year's dataset is not only bigger (~1B data points, a 5-fold increase), but for the first time it takes into consideration fairness aspects of the challenge. Unlike many static datasets, a lot of effort went into making sure that the dataset was synced with the Twitter platform: if a user deleted their content, the same content would be promptly removed from the dataset too. In this paper, we introduce the dataset and challenge, highlighting some of the issues that arise when creating recommender systems at Twitter scale.

Taxonomy

Topics: Ethics and Social Impacts of AI · Advanced Graph Neural Networks · Mobile Crowdsensing and Crowdsourcing