Dynamical Linear Bandits

Marco Mussi; Alberto Maria Metelli; Marcello Restelli

arXiv:2211.08997·cs.LG·May 31, 2023

TL;DR

This paper introduces the Dynamical Linear Bandits framework, modeling sequential decision problems with delayed, evolving effects using a hidden state and linear dynamics, and proposes an algorithm with provable regret bounds.

Contribution

It extends linear bandits to include hidden states with linear dynamics, providing a new setting and an optimistic algorithm with theoretical regret guarantees.
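To make "hidden states with linear dynamics" concrete, here is a minimal noise-free sketch. This page does not give the model equations, so the form used here is an assumption: a hidden state updated as x_{t+1} = A x_t + B u_t and a reward y_t = <omega, x_t> + <theta, u_t>. The names `A`, `B`, `omega`, `theta` and both helper functions are hypothetical, chosen only for illustration. The sketch shows how the reward of a repeated action builds up over time instead of appearing immediately.

```python
import numpy as np


def simulate_constant_action(A, B, omega, theta, u, horizon):
    """Roll out an ASSUMED noise-free linear system under a fixed action u.

    Hidden state: x_{t+1} = A x_t + B u
    Reward:       y_t     = <omega, x_t> + <theta, u>
    """
    x = np.zeros(A.shape[0])  # hidden state starts at zero
    rewards = []
    for _ in range(horizon):
        rewards.append(float(omega @ x + theta @ u))
        x = A @ x + B @ u  # past actions keep acting through the state
    return rewards


def steady_state_reward(A, B, omega, theta, u):
    """Long-run reward of playing u forever (needs spectral radius of A < 1)."""
    n = A.shape[0]
    x_star = np.linalg.solve(np.eye(n) - A, B @ u)  # fixed point of the state update
    return float(omega @ x_star + theta @ u)


# Hypothetical parameters; eigenvalues of A are 0.5 and 0.8, so the system is stable.
A = np.array([[0.5, 0.1], [0.0, 0.8]])
B = np.eye(2)
omega = np.array([1.0, 0.5])
theta = np.array([0.2, 0.1])
u = np.array([1.0, 0.0])

rewards = simulate_constant_action(A, B, omega, theta, u, horizon=200)
long_run = steady_state_reward(A, B, omega, theta, u)
print(rewards[0], rewards[-1], long_run)
```

Under these assumed parameters the first reward reflects only the immediate term <theta, u>, while later rewards approach the steady-state value as the hidden state accumulates the effect of earlier actions.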

Findings

01

Regret bound of order $\tilde{O}(d \frac{\sqrt{T}}{(1-\bar{\rho})^{3/2}})$ for DynLin-UCB

02

DynLin-UCB performs well against baselines in both synthetic and real-world environments

03

Introduces a novel dynamical structure in linear bandit models
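The regret bound in finding 01 can be read as a scaling law. The sketch below evaluates only the leading term d·√T / (1 − ρ̄)^{3/2}, dropping the constants and logarithmic factors hidden by the tilde. It assumes ρ̄ lies in [0, 1); this page does not define ρ̄, and the function name `regret_rate` is hypothetical.

```python
import math


def regret_rate(d, T, rho_bar):
    """Leading term d * sqrt(T) / (1 - rho_bar)^(3/2); constants and logs omitted."""
    if not 0.0 <= rho_bar < 1.0:
        raise ValueError("rho_bar is assumed to lie in [0, 1)")
    return d * math.sqrt(T) / (1.0 - rho_bar) ** 1.5


fast = regret_rate(d=2, T=10_000, rho_bar=0.5)
slow = regret_rate(d=2, T=10_000, rho_bar=0.9)
ratio = slow / fast  # (0.5 / 0.1) ** 1.5, about 11.18

longer = regret_rate(d=2, T=40_000, rho_bar=0.5)  # 4x the horizon
print(ratio, longer / fast)
```

Moving ρ̄ from 0.5 to 0.9 inflates the leading term by roughly a factor of 11, whereas quadrupling the horizon T only doubles it, so the bound is far more sensitive to ρ̄ approaching 1 than to the length of the interaction.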

Abstract

In many real-world sequential decision-making problems, an action does not immediately reflect on the feedback and spreads its effects over a long time frame. For instance, in online advertising, investing in a platform produces an instantaneous increase of awareness, but the actual reward, i.e., a conversion, might occur far in the future. Furthermore, whether a conversion takes place depends on: how fast the awareness grows, its vanishing effects, and the synergy or interference with other advertising platforms. Previous work has investigated the Multi-Armed Bandit framework with the possibility of delayed and aggregated feedback, without a particular structure on how an action propagates in the future, disregarding possible dynamical effects. In this paper, we introduce a novel setting, the Dynamical Linear Bandits (DLB), an extension of the linear bandits characterized by a hidden…

Code & Models

Repositories

marcomussi/dlb
Official

Videos

Dynamical Linear Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management