Efficient Algorithms for Adversarial Contextual Learning
Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire

TL;DR
This paper introduces the first oracle-efficient algorithms with sublinear regret for adversarial contextual bandit problems, addressing both transductive and small separator settings, and extends to semi-bandit and combinatorial optimization.
Contribution
It presents oracle-efficient algorithms for adversarial contextual bandits with sublinear regret bounds in the transductive and small separator settings, and extends them to semi-bandit and combinatorial problems.
Findings
Achieves regret $O(T^{3/4}\sqrt{K\log(N)})$ in the transductive setting
Achieves regret $O(T^{2/3} d^{3/4} K\sqrt{\log(N)})$ in the separator setting
Extends to semi-bandit linear optimization and contextual combinatorial optimization
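To get a feel for how the two bounds above scale, the sketch below evaluates them with all constants dropped. The function names are hypothetical, and the numbers it produces are only the leading-order terms from the stated bounds, not results reported in the paper.

```python
import math


def transductive_bound(T, K, N):
    """Leading term T^{3/4} * sqrt(K * log N), with constants dropped."""
    return T ** 0.75 * math.sqrt(K * math.log(N))


def separator_bound(T, K, N, d):
    """Leading term T^{2/3} * d^{3/4} * K * sqrt(log N), with constants dropped."""
    return T ** (2 / 3) * d ** 0.75 * K * math.sqrt(math.log(N))


def average_regret(bound, T):
    """Per-round regret; it shrinks as T grows whenever the bound is sublinear in T."""
    return bound / T


if __name__ == "__main__":
    # Illustrative parameter values only (assumptions, not from the source).
    K, N, d = 10, 1000, 5
    for T in (10**3, 10**5, 10**7):
        a = average_regret(transductive_bound(T, K, N), T)
        b = average_regret(separator_bound(T, K, N, d), T)
        print(f"T={T:>9}  transductive={a:.4f}  separator={b:.4f}")
```

Both per-round values fall as $T$ grows, which is what "sublinear regret" means here. The separator bound has the better exponent in $T$ but a worse dependence on $K$ and an extra factor in $d$.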
Abstract
We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem. In this problem, the learner repeatedly makes an action on the basis of a context and receives reward for the chosen action, with the goal of achieving reward competitive with a large class of policies. We analyze two settings: i) in the transductive setting the learner knows the set of contexts a priori, ii) in the small separator setting, there exists a small set of contexts such that any two policies behave differently in one of the contexts in the set. Our algorithms fall into the follow the perturbed leader family (Kalai and Vempala, 2005) and achieve regret $O(T^{3/4}\sqrt{K\log(N)})$ in the transductive setting and $O(T^{2/3} d^{3/4} K\sqrt{\log(N)})$ in the separator setting, where $K$ is the number of actions, $N$ is the number of baseline policies, and $d$ is the…
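The follow-the-perturbed-leader idea mentioned in the abstract can be illustrated with a toy sketch: add random fake examples on a known set of contexts, then ask an offline oracle for the best policy on the perturbed history. Everything below is an assumption made for illustration, not the paper's algorithm: the brute-force `offline_oracle`, the Laplace noise and its `scale`, and full-information feedback (the learner sees the whole reward vector each round, whereas a bandit learner sees only the chosen action's reward).

```python
import random


def offline_oracle(policies, examples):
    """Hypothetical stand-in for an optimization oracle: brute-force search
    for the policy with the highest total reward on (context, rewards) pairs."""
    def total_reward(policy):
        return sum(rewards[policy(context)] for context, rewards in examples)
    return max(policies, key=total_reward)


def laplace(rng, scale):
    """Laplace(0, scale) sample, built as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ftpl_choose(policies, history, known_contexts, num_actions, scale, rng):
    """Perturb the history with one fake example per known context,
    then follow the leader on the perturbed data."""
    fake = [(x, [laplace(rng, scale) for _ in range(num_actions)])
            for x in known_contexts]
    return offline_oracle(policies, history + fake)


def run(policies, rounds, known_contexts, num_actions, scale=1.0, seed=0):
    """Play the sketch over a sequence of (context, rewards) rounds and
    return the total reward collected."""
    rng = random.Random(seed)
    history, total = [], 0.0
    for context, rewards in rounds:
        policy = ftpl_choose(policies, history, known_contexts,
                             num_actions, scale, rng)
        total += rewards[policy(context)]
        history.append((context, rewards))  # full-information feedback (assumption)
    return total
```

The policy class is touched only through `offline_oracle`, which is the sense in which such a method is oracle efficient: with a real optimization oracle in place of the brute-force search, the learner never enumerates the $N$ policies itself.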
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
