Self-Concordant Analysis of Generalized Linear Bandits with Forgetting

Yoan Russac (DI-ENS, CNRS, PSL, VALDA); Louis Faury; Olivier Cappé (DI-ENS, VALDA); Aurélien Garivier (UMPA-ENSL)

arXiv:2011.00819 · cs.LG · March 5, 2021 · 1 citation

TL;DR

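This paper introduces a confidence-based algorithm for self-concordant generalized linear bandits (GLB), a class that includes logistic and Poisson regression. The algorithm handles non-stationarity through forgetting, implemented with either a sliding window or exponential weights, and is supported by theoretical guarantees and numerical simulations.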

Contribution

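The paper proposes a novel algorithm, with an accompanying theoretical analysis, for non-stationary self-concordant GLB. It targets two limitations of earlier GLB methods that worsen when the parameter changes over time: overly pessimistic concentration bounds caused by the non-linearity of the model, and the need for non-convex projection steps or burn-in phases to keep the estimators bounded.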

Findings

01. Effective handling of non-stationarity in GLB models.
02. Theoretical guarantees for the proposed algorithm.
03. Numerical simulations demonstrating improved performance.

Abstract

Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them. In contrast to the case of linear bandits, existing algorithms for GLB have two drawbacks undermining their applicability. First, they rely on excessively pessimistic concentration bounds due to the non-linear nature of the model. Second, they require either non-convex projection steps or burn-in phases to enforce boundedness of the estimators. Both of these issues are worsened when considering non-stationary models, in which the GLB parameter may vary with time. In this work, we focus on self-concordant GLB (which include logistic and Poisson regression) with forgetting achieved either by the use of a sliding window or exponential weights. We propose a novel confidence-based algorithm for the…
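As a rough illustration of the two forgetting schemes named in the abstract, the sketch below computes per-round weights for a sliding window and for exponential discounting, then fits a weighted, L2-penalized logistic regression with them. It is not the paper's algorithm: the function names, the parameters `gamma`, `window`, and `lam`, and the plain Newton solver are assumptions made for illustration, and the confidence sets and action selection that a bandit algorithm needs are left out.

```python
import numpy as np


def forgetting_weights(t, mode="exponential", gamma=0.95, window=50):
    """Weights for past rounds 0..t-1 when estimating at round t.

    Hypothetical helper: `gamma` and `window` are illustrative parameters.
    """
    ages = t - 1 - np.arange(t)  # age 0 is the most recent round
    if mode == "exponential":
        return gamma ** ages  # older rounds are discounted geometrically
    if mode == "window":
        return (ages < window).astype(float)  # only the last `window` rounds count
    raise ValueError(f"unknown mode: {mode}")


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def weighted_logistic_mle(X, y, w, lam=1.0, n_iter=50):
    """Newton iterations on the weighted, L2-penalized logistic log-loss.

    X: (t, d) features, y: (t,) binary rewards, w: (t,) forgetting weights.
    """
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (w * (p - y)) + lam * theta
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X + lam * np.eye(d)
        theta = theta - np.linalg.solve(hess, grad)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t, d = 400, 3
    X = rng.normal(size=(t, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm actions
    # The parameter changes abruptly halfway through (a toy non-stationary setting).
    theta_star = np.where(np.arange(t)[:, None] < t // 2, [2.0, 0.0, 0.0], [0.0, 2.0, 0.0])
    y = rng.binomial(1, sigmoid(np.sum(X * theta_star, axis=1))).astype(float)

    for mode in ("exponential", "window"):
        w = forgetting_weights(t, mode=mode, gamma=0.98, window=100)
        print(mode, weighted_logistic_mle(X, y, w))
```

With forgetting, the estimate is driven mostly by recent rounds, so after a change in the parameter it is pulled toward the new value rather than toward an average over the whole history.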

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics