Online Ranking with Top-1 Feedback

Sougata Chaudhuri; Ambuj Tewari

arXiv:1410.1103 · cs.LG · August 24, 2016 · 2 cites

Open Access

TL;DR

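This paper studies online ranking when only the relevance of the top-ranked item is revealed after each round. It establishes Θ(T^{2/3}) minimax regret for PairwiseLoss and DCG and shows that the normalized measures AUC, NDCG, and MAP admit no sublinear-regret algorithm in this setting.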

Contribution

It introduces a novel top-1 feedback model for online ranking, proves tight regret bounds for several ranking measures, and gives an efficient algorithm that attains those bounds for some of them.

Findings

01

Minimax regret for PairwiseLoss and DCG is Θ(T^{2/3}).

02

An efficient strategy that spends O(m log m) time per round achieves O(T^{2/3}) regret for PairwiseLoss, DCG, and Precision@k.

03

Under top-1 feedback, the normalized measures AUC, NDCG, and MAP cannot be learned with sublinear regret.

Abstract

We consider a setting where a system learns to rank a fixed set of $m$ items. The goal is to produce good item rankings for users with diverse interests who interact online with the system for $T$ rounds. We consider a novel top-$1$ feedback model: at the end of each round, the relevance score for only the top-ranked object is revealed. However, the performance of the system is judged on the entire ranked list. We provide a comprehensive set of results regarding learnability under this challenging setting. For PairwiseLoss and DCG, two popular ranking measures, we prove that the minimax regret is $\Theta(T^{2/3})$. Moreover, the minimax regret is achievable using an efficient strategy that only spends $O(m \log m)$ time per round. The same efficient strategy achieves $O(T^{2/3})$ regret for Precision@$k$. Surprisingly, we show that for normalized versions of these ranking measures, i.e.,…
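
To make the feedback model in the abstract concrete, the following is a minimal Python sketch of a single round. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`dcg`, `pairwise_loss`, `play_round`) are hypothetical, the DCG formula is the commonly used gain/discount form, and PairwiseLoss is taken to be the count of item pairs ranked in the wrong order. The paper's exact definitions may differ in detail.

```python
import math
from typing import Sequence, Tuple


def dcg(ranking: Sequence[int], relevance: Sequence[int]) -> float:
    """DCG of a ranked list (assumed common form, not taken from the paper).

    ranking[p] is the item placed at position p, with p = 0 the top slot.
    """
    return sum(
        (2 ** relevance[item] - 1) / math.log2(pos + 2)
        for pos, item in enumerate(ranking)
    )


def pairwise_loss(ranking: Sequence[int], relevance: Sequence[int]) -> int:
    """Count pairs where a less relevant item is ranked above a more relevant one."""
    loss = 0
    for upper in range(len(ranking)):
        for lower in range(upper + 1, len(ranking)):
            if relevance[ranking[upper]] < relevance[ranking[lower]]:
                loss += 1
    return loss


def play_round(ranking: Sequence[int], relevance: Sequence[int]) -> Tuple[int, int]:
    """Simulate one round of top-1 feedback.

    Returns (feedback, hidden_loss). The learner observes only `feedback`,
    the relevance of the top-ranked item; `hidden_loss` is computed on the
    whole list and is what the learner is judged on.
    """
    feedback = relevance[ranking[0]]
    hidden_loss = pairwise_loss(ranking, relevance)
    return feedback, hidden_loss


if __name__ == "__main__":
    relevance = [1, 0, 1]   # chosen by the user/adversary, hidden from the learner
    ranking = [1, 0, 2]     # learner's ranking: item 1 on top
    feedback, hidden_loss = play_round(ranking, relevance)
    print("observed feedback:", feedback)
    print("unobserved PairwiseLoss:", hidden_loss)
    print("unobserved DCG:", dcg(ranking, relevance))
```

In this example the learner sees only that its top item has relevance 0, yet it is charged for both misordered pairs further down the list. That gap between what is observed and what is scored is what makes the setting a partial-feedback problem.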

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems