A Time and Space Efficient Algorithm for Contextual Linear Bandits

José Bento; Stratis Ioannidis; S. Muthukrishnan; Jinyun Yan

arXiv:1207.3024·cs.DS·July 8, 2014

TL;DR

This paper introduces a computationally efficient algorithm for contextual linear bandits that achieves logarithmic regret with constant per-iteration complexity and fixed space requirements, even with exponentially many contexts.

Contribution

It presents an $\epsilon$-greedy algorithm that overcomes previous scalability issues in contextual linear bandits by maintaining low computation and space complexity.

Findings

01

Achieves $O(\text{poly}(d) \log T)$ regret.

02

Per-iteration complexity is $O(\text{poly}(d))$, independent of $T$.

03

Space complexity is $O(Kd^2)$, independent of total time steps.

Abstract

We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve $O(\log T)$ regret after $T$ time steps. However, proposed methods either have a computation complexity per iteration that scales linearly with $T$ or achieve regrets that grow linearly with the number of contexts $|\mathcal{X}|$. We propose an $\epsilon$-greedy type of algorithm that solves both limitations. In particular, when contexts are variables in $\mathbb{R}^{d}$, we prove that our algorithm has a constant computation complexity per iteration of $O(\text{poly}(d))$ and can achieve a regret of $O(\text{poly}(d) \log T)$ even when $|\mathcal{X}| = \Omega(2^{d})$. In addition, unlike previous algorithms, its space complexity scales like $O(K d^{2})$ and…
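To make the $O(K d^{2})$ space claim concrete, here is a minimal Python sketch of a generic $\epsilon$-greedy linear bandit that stores only one $d \times d$ matrix and one $d$-vector per arm. It is an illustration under stated assumptions, not the authors' exact procedure: the class name `EpsilonGreedyLinearBandit`, the ridge term, the exploration schedule $\epsilon_t = \min(1, c/t)$, and the choice to update on every pull are all hypothetical choices made for this example.

```python
import numpy as np


class EpsilonGreedyLinearBandit:
    """Hypothetical epsilon-greedy learner with per-arm least-squares estimates.

    Assumes the expected reward of arm a under context x is theta_a . x.
    Memory is K matrices of size d x d plus K vectors of size d, so it does
    not grow with the number of rounds played.
    """

    def __init__(self, n_arms, dim, explore_scale=1.0, ridge=1.0, seed=0):
        self.n_arms = n_arms
        self.explore_scale = explore_scale  # assumed constant c in eps_t = min(1, c/t)
        self.rng = np.random.default_rng(seed)
        # Per-arm sufficient statistics: sum of x x^T (plus ridge) and sum of r x.
        self.gram = np.stack([ridge * np.eye(dim) for _ in range(n_arms)])
        self.moment = np.zeros((n_arms, dim))
        self.t = 0

    def estimates(self):
        # Solve the K regularized normal equations; cost is O(K d^3) per call,
        # independent of how many rounds have been played.
        return np.linalg.solve(self.gram, self.moment[..., None])[..., 0]

    def select(self, x):
        self.t += 1
        eps = min(1.0, self.explore_scale / self.t)
        if self.rng.random() < eps:
            # Explore: pick an arm uniformly at random.
            return int(self.rng.integers(self.n_arms))
        # Exploit: pick the arm with the highest estimated payoff for x.
        return int(np.argmax(self.estimates() @ x))

    def update(self, arm, x, reward):
        # Rank-one update of the chosen arm's statistics.
        self.gram[arm] += np.outer(x, x)
        self.moment[arm] += reward * x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_theta = rng.normal(size=(3, 4))  # synthetic parameters for the demo
    bandit = EpsilonGreedyLinearBandit(n_arms=3, dim=4, explore_scale=50.0)
    for _ in range(1000):
        x = rng.normal(size=4)
        arm = bandit.select(x)
        reward = true_theta[arm] @ x + 0.1 * rng.normal()
        bandit.update(arm, x, reward)
    print(bandit.estimates().round(2))
```

Because each round touches only fixed-size arrays, both the per-round work and the memory footprint in this sketch depend on $K$ and $d$ but not on $T$, which is the property the abstract emphasizes.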

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms