Minimax Regret for Bandit Convex Optimisation of Ridge Functions
Tor Lattimore

TL;DR
This paper studies adversarial bandit convex optimization where functions are ridge functions, providing an information-theoretic bound on the minimax regret that depends on dimension, number of rounds, and set diameter.
Contribution
It introduces a novel analysis of bandit convex optimization with ridge functions, establishing a regret bound using an information-theoretic approach.
Findings
Minimax regret is bounded by O(d√n log(n diam(K)))
The analysis applies to functions of the form g_t(⟨x, θ⟩) with unknown θ
Provides a short, elegant proof of regret bounds in this setting
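To make the findings above concrete, here is a minimal Python sketch of what a ridge function looks like and how the stated rate scales. The names `make_ridge_loss` and `regret_rate` are hypothetical helpers introduced for illustration only; they do not come from the paper. The rate function drops the unspecified constant hidden by the O(·) notation, so its value is only meaningful for comparing growth in d, n and diam(K).

```python
import math

def make_ridge_loss(g, theta):
    """Return the ridge function f(x) = g(<x, theta>) (hypothetical helper)."""
    def f(x):
        # The loss depends on x only through the one-dimensional projection <x, theta>.
        return g(sum(xi * ti for xi, ti in zip(x, theta)))
    return f

def regret_rate(d, n, diam):
    """d * sqrt(n) * log(n * diam): the stated rate with the O(.) constant omitted."""
    return d * math.sqrt(n) * math.log(n * diam)

# The direction theta is shared by every round; only the convex link g_t changes.
theta = (0.6, 0.8)
f1 = make_ridge_loss(lambda u: (u - 0.5) ** 2, theta)  # convex link for round 1
f2 = make_ridge_loss(lambda u: abs(u + 0.25), theta)   # convex link for round 2

x = (1.0, 0.0)
y = (1.8, -0.6)  # x shifted along (0.8, -0.6), a direction orthogonal to theta

# Moving orthogonally to theta leaves every loss unchanged.
same_f1 = math.isclose(f1(x), f1(y))
same_f2 = math.isclose(f2(x), f2(y))
```

Because each loss is constant along directions orthogonal to θ, the problem is one-dimensional in structure once θ is known; the learner's difficulty comes from θ being unknown.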
Abstract
We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form g_t(⟨x, θ⟩) for convex g_t and unknown θ that is homogeneous over time. We provide a short information-theoretic proof that the minimax regret is at most O(d√n log(n diam(K))), where n is the number of interactions, d the dimension and diam(K) is the diameter of the constraint set.
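For readers who want the quantity being bounded written out, the following is a sketch using the standard definition of regret in adversarial bandit convex optimisation. The symbols f_t (the loss in round t), x_t (the learner's action in round t) and R_n are assumptions introduced here for illustration; the abstract itself does not define them.

```latex
% Assumed notation: f_t(x) = g_t(\langle x, \theta \rangle) is the loss in round t,
% x_t \in K is the learner's action, and K is the constraint set.
\[
  R_n \;=\; \max_{x \in K} \, \mathbb{E}\!\left[ \sum_{t=1}^{n} \bigl( f_t(x_t) - f_t(x) \bigr) \right],
  \qquad
  R_n \;=\; O\!\bigl( d \sqrt{n} \, \log( n \operatorname{diam}(K) ) \bigr).
\]
```

The second expression restates the bound from the abstract: it holds for the minimax regret, that is, for the best learner against the worst admissible sequence of ridge functions.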
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics
