Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards

Bo Xue; Yimu Wang; Yuanyu Wan; Jinfeng Yi; Lijun Zhang

arXiv:2310.18701·cs.LG·October 31, 2023·1 cites

TL;DR

This paper introduces two algorithms for generalized linear bandits with heavy-tailed rewards, one based on truncation and one on mean of medians. Both achieve an almost optimal regret bound of $O(d T^{\frac{1}{1+\epsilon}})$, and the truncation-based algorithm supports online learning, addressing the limitation that most existing methods assume bounded or sub-Gaussian rewards.

Contribution

The paper proposes a truncation-based algorithm and a mean-of-medians-based algorithm for heavy-tailed rewards. Both attain an almost optimal regret bound, and the truncation-based algorithm supports online learning.

Findings

01

Achieve an almost optimal regret bound of O(dT^{1/(1+ϵ)})

02

Support online learning with truncation-based algorithm

03

Require only O(log T) rewards and one estimator per epoch for the mean-of-medians algorithm

Abstract

This paper investigates the problem of generalized linear bandits with heavy-tailed rewards, whose $(1+\epsilon)$-th moment is bounded for some $\epsilon \in (0, 1]$. Although there exist methods for generalized linear bandits, most of them focus on bounded or sub-Gaussian rewards and are not well-suited for many real-world scenarios, such as financial markets and web-advertising. To address this issue, we propose two novel algorithms based on truncation and mean of medians. These algorithms achieve an almost optimal regret bound of $O(d T^{\frac{1}{1+\epsilon}})$, where $d$ is the dimension of contextual information and $T$ is the time horizon. Our truncation-based algorithm supports online learning, distinguishing it from existing truncation-based approaches. Additionally, our mean-of-medians-based algorithm requires only $O(\log T)$ rewards and one estimator per epoch,…
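
The two robust-estimation ideas named in the abstract, truncating large rewards and taking medians over groups, can be illustrated on plain scalar mean estimation. The sketch below is not the paper's bandit algorithms: it assumes i.i.d. scalar samples, and the function names `truncated_mean` and `median_of_means`, the threshold, and the group count are hypothetical choices made for illustration. The second function is the classic median-of-means estimator, used here only as a stand-in for the general group-then-median idea.

```python
import numpy as np


def truncated_mean(samples, threshold):
    """Average the samples after zeroing any whose magnitude exceeds `threshold`.

    Illustrative only: the threshold is a free parameter here, not the
    schedule used in the paper.
    """
    x = np.asarray(samples, dtype=float)
    clipped = np.where(np.abs(x) <= threshold, x, 0.0)
    return float(clipped.mean())


def median_of_means(samples, num_groups):
    """Split the samples into `num_groups` groups, average each group,
    and return the median of those group averages."""
    x = np.asarray(samples, dtype=float)
    groups = np.array_split(x, num_groups)
    return float(np.median([g.mean() for g in groups]))


if __name__ == "__main__":
    # Heavy-tailed draws (assumption: Lomax/Pareto II with shape 1.5, which
    # has a finite mean but infinite variance).
    rng = np.random.default_rng(0)
    rewards = rng.pareto(1.5, size=1000)
    print("plain mean     :", rewards.mean())
    print("truncated mean :", truncated_mean(rewards, threshold=50.0))
    print("median of means:", median_of_means(rewards, num_groups=10))
```

Both estimators trade a small bias for much lighter sensitivity to rare, very large samples, which is the property that matters when only a $(1+\epsilon)$-th moment is bounded.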

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Age of Information Optimization
