Overcoming Prior Misspecification in Online Learning to Rank
Javad Azizi, Ofer Meshi, Masrour Zoghi, Maryam Karimzadehgan

TL;DR
This paper introduces adaptive online learning to rank (LTR) algorithms that do not require the algorithm's prior to match the true prior. The results extend to linear and generalized linear models, cover scalar relevance feedback in addition to click feedback, and are evaluated on synthetic and real-world data.
Contribution
The paper proposes and analyzes adaptive algorithms that overcome prior misspecification in online learning to rank, and extends them to linear and generalized linear models with both click and scalar relevance feedback.
Findings
Algorithms perform well despite prior misspecification
Effective on synthetic and real-world datasets
Extend to scalar relevance and click feedback
Abstract
The recent literature on online learning to rank (LTR) has established the utility of prior knowledge to Bayesian ranking bandit algorithms. However, a major limitation of existing work is the requirement for the prior used by the algorithm to match the true prior. In this paper, we propose and analyze adaptive algorithms that address this issue and additionally extend these results to the linear and generalized linear models. We also consider scalar relevance feedback on top of click feedback. Moreover, we demonstrate the efficacy of our algorithms using both synthetic and real-world experiments.
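To make the notion of prior misspecification concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a single item with Bernoulli click feedback and a Beta prior on its click probability; the function name, the prior parameters, and the click counts are all hypothetical and chosen only for illustration. It shows how a confident but wrong prior distorts the posterior estimate when little feedback is available.

```python
def posterior_mean(alpha, beta, clicks, impressions):
    """Posterior mean of a Bernoulli click probability under a Beta(alpha, beta) prior."""
    return (alpha + clicks) / (alpha + beta + impressions)

# Hypothetical feedback for one item: 2 clicks in 10 impressions.
clicks, impressions = 2, 10

# Assumed prior whose mean (0.2) agrees with the observed click rate.
well_specified = posterior_mean(2.0, 8.0, clicks, impressions)

# Assumed misspecified prior that is confident the click rate is high (mean 0.9).
misspecified = posterior_mean(18.0, 2.0, clicks, impressions)

# With 100x more feedback at the same click rate, the wrong prior matters far less.
misspecified_more_data = posterior_mean(18.0, 2.0, 200, 1000)

print(well_specified, misspecified, misspecified_more_data)
```

In this toy setting the misspecified prior pulls the estimate well above the observed click rate after 10 impressions, and the gap shrinks only as feedback accumulates. A Bayesian ranking bandit that relies on such a prior would keep over-ranking the item in the meantime, which is the kind of failure the adaptive algorithms described in the abstract aim to avoid.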
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
