Continuum armed bandit problem of few variables in high dimensions

Hemant Tyagi; Bernd Gärtner

arXiv:1304.5793 · cs.LG · August 25, 2014 · 2 cites

Open Access

TL;DR

This paper studies high-dimensional continuum armed bandit problems where the reward depends on a small subset of variables, proposing algorithms with regret bounds that adapt to the intrinsic low-dimensional structure.

Contribution

The paper introduces a modified CAB1 algorithm for the case where the k relevant variables stay fixed across time and extends it to the case where they change, with regret bounds governed by the intrinsic dimension k.

Findings

01

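Regret bound of O(n^((alpha+k)/(2*alpha+k)) (log n)^((alpha)/(2*alpha+k)) C(k,d)) for fixed relevant variables, where C(k,d) depends at most polynomially on k and sub-logarithmically on d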

02

Probabilistic construction of the sampling points, so the regret bound holds with high probability

03

Extension to changing relevant variables with similar regret bounds
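
To make the exponent (alpha+k)/(2*alpha+k) concrete, the short Python sketch below evaluates it for a few values of k. The function name `regret_exponent` is hypothetical and chosen only for illustration. The sketch covers the power of n alone and ignores the logarithmic factor and C(k,d).

```python
def regret_exponent(alpha: float, k: int) -> float:
    """Exponent of n in a bound of the form n^((alpha+k)/(2*alpha+k)).

    alpha is the Holder exponent (0 < alpha <= 1) and k is the number of
    relevant coordinates. The ambient dimension d does not appear here.
    """
    if not (0 < alpha <= 1):
        raise ValueError("alpha must lie in (0, 1]")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (alpha + k) / (2 * alpha + k)


# The exponent grows toward 1 as k increases, for any fixed alpha.
for k in (1, 2, 5, 20):
    print(k, round(regret_exponent(1.0, k), 4))
```

The exponent is always below 1, so the power of n is sublinear, and it is a function of k and alpha only.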

Abstract

We consider the stochastic and adversarial settings of continuum armed bandits where the arms are indexed by [0,1]^d. The reward functions r:[0,1]^d -> R are assumed to intrinsically depend on at most k coordinate variables implying r(x_1,..,x_d) = g(x_{i_1},..,x_{i_k}) for distinct and unknown i_1,..,i_k from {1,..,d} and some locally Hölder continuous g:[0,1]^k -> R with exponent 0 < alpha <= 1. Firstly, assuming (i_1,..,i_k) to be fixed across time, we propose a simple modification of the CAB1 algorithm where we construct the discrete set of sampling points to obtain a bound of O(n^((alpha+k)/(2*alpha+k)) (log n)^((alpha)/(2*alpha+k)) C(k,d)) on the regret, with C(k,d) depending at most polynomially in k and sub-logarithmically in d. The construction is based on creating partitions of {1,..,d} into k disjoint subsets and is probabilistic, hence our result holds with high probability.…
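
The abstract says only that the sampling points come from random partitions of {1,..,d} into k disjoint subsets. The Python sketch below shows one way such a construction could look. It is an illustration under stated assumptions, not the authors' procedure. The assumptions are that each coordinate is assigned to a subset uniformly at random (so some subsets may be empty), that coordinates in the same subset share one value from a grid of m points per axis, and that the names `random_partition`, `separates`, and `sampling_points` are hypothetical.

```python
import itertools
import random


def random_partition(d, k, rng):
    """Label each coordinate 0..d-1 with one of k subsets, uniformly at random.

    Assumption: a uniform random labeling; some subsets may end up empty.
    """
    return [rng.randrange(k) for _ in range(d)]


def separates(labels, coords):
    """True if the given coordinates all land in distinct subsets."""
    return len({labels[i] for i in coords}) == len(coords)


def sampling_points(labels, k, m):
    """Yield points in [0,1]^d in which coordinates of one subset share a value.

    Assumption: an evenly spaced grid of m midpoints per subset.
    """
    grid = [(j + 0.5) / m for j in range(m)]
    for values in itertools.product(grid, repeat=k):
        yield tuple(values[label] for label in labels)


rng = random.Random(0)
d, k, m = 8, 2, 3
relevant = (1, 6)  # hypothetical relevant coordinates, unknown to the learner

partitions = [random_partition(d, k, rng) for _ in range(10)]
# The guarantee is probabilistic: more partitions make separation more likely.
covered = any(separates(p, relevant) for p in partitions)
points = {pt for p in partitions for pt in sampling_points(p, k, m)}
print(covered, len(points))
```

When a partition separates the relevant coordinates, the points built from it cover the full m^k grid on those coordinates, while each partition contributes only m^k points instead of m^d.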

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms