Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits

Jie Bian; Vincent Y. F. Tan

arXiv:2405.15200·cs.LG·May 27, 2024

TL;DR

This paper introduces the LinIMED algorithms, extending the IMED approach to linear contextual bandits, achieving strong theoretical guarantees and outperforming existing algorithms in empirical tests.

Contribution

It develops the first linear versions of IMED, providing near-optimal regret bounds and demonstrating superior empirical performance over standard linear bandit algorithms.

Findings

01

LinIMED achieves a $\tilde{O}(d\sqrt{T})$ regret bound.

02

LinIMED outperforms LinUCB and Linear Thompson Sampling in various regimes.

03

The algorithms are effective in high-dimensional contextual bandit settings.

Abstract

The Indexed Minimum Empirical Divergence (IMED) algorithm is a highly effective approach that offers a stronger theoretical guarantee of asymptotic optimality than the Kullback-Leibler Upper Confidence Bound (KL-UCB) algorithm for the multi-armed bandit problem. Additionally, it has been observed to empirically outperform UCB-based algorithms and Thompson Sampling. Despite its effectiveness, the generalization of this algorithm to contextual bandits with linear payoffs has remained elusive. In this paper, we present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms. We demonstrate that LinIMED provides a $\tilde{O}(d\sqrt{T})$ upper regret bound, where $d$ is the dimension of the context and $T$ is the time horizon. Furthermore, extensive empirical studies reveal that LinIMED and its variants outperform widely-used linear bandit…
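For readers unfamiliar with the multi-armed starting point the abstract refers to, the following is a minimal sketch of the classical IMED rule, not of LinIMED itself, whose index is not given on this page. It assumes unit-variance Gaussian rewards, for which the index is commonly written as $N_a(\hat{\mu}^* - \hat{\mu}_a)^2/2 + \log N_a$, and it pulls the arm with the smallest index. The function names `imed_indices` and `run_imed`, the seed, and the example arm means are illustrative assumptions.

```python
import math
import random


def imed_indices(counts, means):
    """IMED index per arm, ASSUMING unit-variance Gaussian rewards.

    index_a = N_a * (best_mean - mean_a)^2 / 2 + log(N_a)
    The first term is N_a times the Gaussian KL divergence between the
    arm's empirical mean and the best empirical mean.
    """
    best = max(means)
    return [
        n * (best - m) ** 2 / 2.0 + math.log(n)
        for n, m in zip(counts, means)
    ]


def run_imed(true_means, horizon, seed=0):
    """Simulate IMED on a toy Gaussian bandit (hypothetical helper).

    Returns the pull counts per arm and the cumulative pseudo-regret.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    regret = 0.0
    best_mean = max(true_means)
    for t in range(horizon):
        if t < k:
            arm = t  # initialisation: pull every arm once
        else:
            means = [s / n for s, n in zip(sums, counts)]
            idx = imed_indices(counts, means)
            # IMED plays the arm with the MINIMUM index
            arm = min(range(k), key=idx.__getitem__)
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        regret += best_mean - true_means[arm]
    return counts, regret


if __name__ == "__main__":
    pulls, total_regret = run_imed([0.0, 1.0], horizon=500)
    print("pulls per arm:", pulls)
    print("cumulative pseudo-regret:", total_regret)
```

Note that the empirically best arm has zero divergence, so its index reduces to $\log N_a$; a suboptimal arm is therefore revisited only once the best arm's pull count has grown enough to outweigh that arm's divergence term.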

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Advanced Adaptive Filtering Techniques