Minimax Regret for Bandit Convex Optimisation of Ridge Functions

Tor Lattimore

arXiv:2106.00444 · cs.LG · June 8, 2021 · 1 citation

TL;DR

This paper studies adversarial bandit convex optimisation where the loss functions are ridge functions, giving an information-theoretic upper bound on the minimax regret in terms of the dimension, the number of rounds, and the diameter of the constraint set.

Contribution

It gives a short information-theoretic analysis of adversarial bandit convex optimisation when the adversary is restricted to ridge functions, yielding an upper bound on the minimax regret.

Findings

01

Minimax regret is bounded by O(d√n log(n diam(K)))

02

The analysis applies to functions of the form g_t(⟨x, θ⟩) with convex g_t and an unknown θ that is the same in every round

03

Provides a short information-theoretic proof of the regret bound in this setting

Abstract

We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f_t(x) = g_t(\langle x, \theta \rangle)$ for convex $g_t : \mathbb{R} \to \mathbb{R}$ and unknown $\theta \in \mathbb{R}^d$ that is homogeneous over time. We provide a short information-theoretic proof that the minimax regret is at most $O(d \sqrt{n} \log(n \operatorname{diam}(K)))$ where $n$ is the number of interactions, $d$ the dimension and $\operatorname{diam}(K)$ is the diameter of the constraint set.
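
To make the objects in the abstract concrete, the following is a minimal Python sketch of a ridge loss, of regret against a fixed comparator, and of the rate $d \sqrt{n} \log(n \operatorname{diam}(K))$ with its unspecified constant dropped. It is not the paper's algorithm or proof. The helper names (`ridge`, `regret`, `bound_rate`), the quadratic choice of $g_t$, the unit-ball constraint set and the naive learner are all assumptions made purely for illustration.

```python
import math
import random


def ridge(g, theta):
    """Return f(x) = g(<x, theta>) for a scalar function g (assumed convex)."""
    def f(x):
        return g(sum(xi * ti for xi, ti in zip(x, theta)))
    return f


def regret(losses, plays, comparator):
    """Cumulative loss of the played points minus that of one fixed point."""
    return sum(f(x) - f(comparator) for f, x in zip(losses, plays))


def bound_rate(d, n, diam):
    """d * sqrt(n) * log(n * diam); the constant hidden by O(.) is omitted."""
    return d * math.sqrt(n) * math.log(n * diam)


# Hypothetical instance: d = 3, K = unit Euclidean ball (diameter 2).
rng = random.Random(0)
d, n = 3, 100
theta = [0.6, 0.0, 0.8]  # the same in every round; hidden from the learner

# Illustrative convex choice g_t(u) = (u - c_t)^2 with shifts c_t in [-0.5, 0.5].
shifts = [rng.uniform(-0.5, 0.5) for _ in range(n)]
losses = [ridge(lambda u, c=c: (u - c) ** 2, theta) for c in shifts]

# A naive learner that ignores feedback and always plays the origin.
plays = [[0.0] * d for _ in range(n)]

# Best fixed point in hindsight for these quadratics: <x, theta> = mean shift.
# Since ||theta|| = 1 and |mean shift| <= 0.5, this point lies inside K.
mean_shift = sum(shifts) / n
comparator = [mean_shift * t for t in theta]

print("regret of the naive learner:", regret(losses, plays, comparator))
print("rate d*sqrt(n)*log(n*diam): ", bound_rate(d, n, diam=2.0))
```

In this toy instance the learner never uses the bandit feedback, so the example only illustrates how the quantities are defined; the printed rate is not a certified bound because the constant in the $O(\cdot)$ is not specified here.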

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics