Contextual Recommendations and Low-Regret Cutting-Plane Algorithms

Sreenivas Gollapudi; Guru Guruganesh; Kostas Kollias; Pasin Manurangsi; Renato Paes Leme; Jon Schneider

arXiv:2106.04819 · cs.LG · June 10, 2021 · 1 citation

TL;DR

This paper introduces low-regret algorithms for a variant of contextual linear bandits, built on cutting-plane methods and convex geometry, with applications to routing and recommendation systems.

Contribution

The paper presents new low-regret algorithms for a variant of contextual linear bandits using cutting-plane methods and convex geometry. It also covers a variant with list recommendations and gives nearly tight algorithms for weaker feedback models.

Findings

01

Achieves regret bounds of $O(d \log T)$ and $\exp(O(d \log d))$

02

For the list-recommendation variant, provides algorithms with $O(d^2 \log d)$ regret and polynomial list size

03

Develops nearly tight algorithms for weaker feedback models

Abstract

We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^{*}$. Every round, we are presented with a subset $X_{t} \subseteq \mathbb{R}^{d}$ of possible actions. If we choose (i.e., recommend to the user) action $x_{t}$, we obtain utility $\langle x_{t}, w^{*} \rangle$ but only learn the identity of the best action $\arg\max_{x \in X_{t}} \langle x, w^{*} \rangle$. We design algorithms for this problem which achieve regret $O(d \log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low "regret": the total distance between the true point $w^{*}$ and the hyperplanes the separation oracle returns. We also consider the variant where we are allowed to provide a list of several recommendations. In this…
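The interaction protocol in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the learner is a simple perceptron-style stand-in chosen only to make the loop runnable, and the names `simulate`, `best_action`, and `dot`, the uniform action sampling, and all parameter values are assumptions made for illustration. What it does show is the feedback model: the learner never observes utilities, only which action was best. Whenever the learner is wrong, the difference between the best and the chosen action is a vector $g$ with $\langle g, w^{*} \rangle \ge 0$, which is a halfspace constraint (a "cut") on $w^{*}$, and the loss incurred in that round equals $\langle g, w^{*} \rangle$.

```python
import random


def dot(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def best_action(actions, w):
    """Index of the action maximizing <x, w> (first index wins on ties)."""
    return max(range(len(actions)), key=lambda i: dot(actions[i], w))


def simulate(w_star, rounds, num_actions, seed=0):
    """Run the feedback loop with a hypothetical perceptron-style learner.

    Returns the cumulative regret and the list of cuts g, each of which
    satisfies <g, w_star> >= 0 by construction.
    """
    rng = random.Random(seed)
    d = len(w_star)
    w_hat = [0.0] * d  # learner's estimate of the hidden vector (assumption)
    cuts = []
    regret = 0.0
    for _ in range(rounds):
        # Context X_t: a fresh random action set each round (assumption).
        actions = [[rng.uniform(-1, 1) for _ in range(d)]
                   for _ in range(num_actions)]
        chosen = best_action(actions, w_hat)
        # Feedback: only the identity of the best action, never the utility.
        best = best_action(actions, w_star)
        regret += dot(actions[best], w_star) - dot(actions[chosen], w_star)
        if best != chosen:
            # The revealed preference gives a halfspace containing w_star.
            g = [a - b for a, b in zip(actions[best], actions[chosen])]
            cuts.append(g)
            # Stand-in update: move the estimate toward the cut direction.
            w_hat = [w + gi for w, gi in zip(w_hat, g)]
    return regret, cuts


if __name__ == "__main__":
    total_regret, cuts = simulate([0.6, -0.8, 0.0], rounds=200, num_actions=5)
    print(f"regret={total_regret:.3f}, cuts collected={len(cuts)}")
```

In this sketch the cumulative regret is exactly the sum of $\langle g, w^{*} \rangle$ over the collected cuts, which is one way to read the abstract's link between bandit regret and the "regret" of a cutting-plane method.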

Videos

Contextual Recommendations and Low-Regret Cutting-Plane Algorithms · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems