Is Non-IID Data a Threat in Federated Online Learning to Rank?
Shuyi Wang, Guido Zuccon

TL;DR
This paper investigates how non-IID data distributions across clients affect federated online learning to rank (FOLTR), highlighting potential challenges and future research directions in this emerging area of information retrieval.
Contribution
It systematically analyzes the impact of various non-IID data settings on FOLTR performance and identifies research gaps for improving federated learning approaches in ranking tasks.
Findings
Non-IID data can significantly impair FOLTR performance.
Certain data distribution settings pose challenges to current FOLTR methods.
Existing federated learning solutions may not fully address non-IID issues in FOLTR.
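To make the idea of non-IID client data concrete, here is a minimal sketch contrasting an IID split of queries across clients with a skewed split. It is not taken from the paper: the function names, the toy "query type" labels, and the skew scheme (each type assigned to exactly one client) are all assumptions chosen for illustration only.

```python
import random
from collections import Counter

# Hypothetical toy data: 300 queries, each tagged with an illustrative "type".
QUERY_TYPES = ["navigational", "informational", "transactional"]
queries = [(f"q{i}", QUERY_TYPES[i % 3]) for i in range(300)]


def partition_iid(items, n_clients, seed=0):
    """Shuffle the items and deal them round-robin, so each client sees a similar mix."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i::n_clients] for i in range(n_clients)]


def partition_by_type(items, n_clients):
    """Assign every item of a given type to one client (an extreme, assumed form of skew)."""
    types = sorted({t for _, t in items})
    owner = {t: i % n_clients for i, t in enumerate(types)}
    clients = [[] for _ in range(n_clients)]
    for item in items:
        clients[owner[item[1]]].append(item)
    return clients


def type_shares(client_items):
    """Return the fraction of each query type held by one client."""
    counts = Counter(t for _, t in client_items)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


iid_clients = partition_iid(queries, n_clients=3)
skewed_clients = partition_by_type(queries, n_clients=3)

for name, clients in [("IID", iid_clients), ("skewed", skewed_clients)]:
    for idx, client in enumerate(clients):
        print(name, idx, type_shares(client))
```

Under the IID split every client holds a mix of all three types, whereas under the skewed split each client holds a single type. A ranker trained locally on such a client would only ever see one kind of query, which is the sort of distribution mismatch the findings above refer to.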
Abstract
In this perspective paper, we study the effect of non-independent and identically distributed (non-IID) data on federated online learning to rank (FOLTR) and chart directions for future work in this new and largely unexplored research area of Information Retrieval. In the FOLTR process, clients participate in a federation to jointly create an effective ranker from the implicit click signal originating in each client, without the need to share data (documents, queries, clicks). A well-known factor that affects the performance of federated learning systems, and that poses serious challenges to these approaches, is that there may be some type of bias in the way data is distributed across clients. While FOLTR systems are in their own right a type of federated learning system, the presence and effect of non-IID data in FOLTR have not been studied. To this end, we first enumerate possible data…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Machine Learning and Algorithms
Methods: Network On Network
