On the Minimax Regret in Online Ranking with Top-k Feedback
Mingyuan Zhang, Ambuj Tewari

TL;DR
This paper characterizes the minimax regret rates in online ranking with top-k feedback for various performance measures and provides an efficient algorithm for Precision@n.
Contribution
It gives a complete characterization of minimax regret rates under top-k feedback for all k and for Pairwise Loss, Discounted Cumulative Gain, and Precision@n, solving open problems posed by Chaudhuri and Tewari [2017].
Findings
Full regret rate characterization for Pairwise Loss, DCG, and P@n.
Efficient algorithm achieving minimax regret for P@n.
Analysis extends understanding of partial feedback in online ranking.
Abstract
In online ranking, a learning algorithm sequentially ranks a set of items and receives feedback on its ranking in the form of relevance scores. Since obtaining relevance scores typically involves human annotation, it is of great interest to consider a partial feedback setting where feedback is restricted to the top-k items in the rankings. Chaudhuri and Tewari [2017] developed a framework to analyze online ranking algorithms with top-k feedback. A key element in their work was the use of techniques from partial monitoring. In this paper, we further investigate online ranking with top-k feedback and solve some open problems posed by Chaudhuri and Tewari [2017]. We provide a full characterization of minimax regret rates with the top-k feedback model for all k and for the following ranking performance measures: Pairwise Loss, Discounted Cumulative Gain, and Precision@n. In…
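To make the setting concrete, here is a minimal sketch of top-k feedback and the three performance measures named in the abstract. It uses common textbook definitions (unnormalized Precision@n, DCG with gain 2^r - 1 and a log2 discount, pairwise loss as a count of misordered pairs); the paper's exact definitions and normalizations may differ. All function names and the list-based ranking representation are illustrative assumptions, not taken from the paper.

```python
import math

# Assumption: a ranking is a list of item indices, most preferred first,
# and relevance[i] is the (hidden) relevance score of item i.

def top_k_feedback(ranking, relevance, k):
    """Relevance scores revealed to the learner: only those of the top-k ranked items."""
    return [relevance[item] for item in ranking[:k]]

def precision_at_n(ranking, relevance, n):
    """Unnormalized Precision@n: total (binary) relevance among the top n items."""
    return sum(relevance[item] for item in ranking[:n])

def dcg(ranking, relevance):
    """Discounted Cumulative Gain: gain 2^r - 1, discounted by log2(position + 1)."""
    return sum(
        (2 ** relevance[item] - 1) / math.log2(pos + 2)  # pos is 0-based
        for pos, item in enumerate(ranking)
    )

def pairwise_loss(ranking, relevance):
    """Number of pairs in which a less relevant item is ranked above a more relevant one."""
    loss = 0
    for a in range(len(ranking)):
        for b in range(a + 1, len(ranking)):
            if relevance[ranking[a]] < relevance[ranking[b]]:
                loss += 1
    return loss

relevance = [1, 0, 1, 0]      # hidden binary relevance of items 0..3
good_ranking = [2, 0, 1, 3]   # both relevant items on top
bad_ranking = [1, 3, 0, 2]    # both relevant items at the bottom

observed = top_k_feedback(good_ranking, relevance, k=2)
```

With k = 2 the learner sees only the two scores in `observed`, while the loss or gain of a round depends on the full relevance vector; this gap between what is observed and what is scored is what makes the problem a partial monitoring one.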
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
