Lifelong Bandit Optimization: No Prior and No Regret

Felix Schur; Parnian Kassraie; Jonas Rothfuss; Andreas Krause

arXiv:2210.15513 · stat.ML · June 21, 2023

TL;DR

This paper introduces LIBO, a lifelong bandit optimization algorithm that learns shared kernels across tasks to improve sample efficiency and achieve oracle optimal regret, even without prior knowledge of the kernel.

Contribution

The paper presents LIBO, a novel meta-learning approach for kernelized bandits that adapts across tasks and guarantees sublinear lifelong regret without prior kernel knowledge.

Findings

01

As the number of solved tasks grows, LIBO's regret on each task converges to that of the base bandit algorithm run with oracle knowledge of the true kernel.

02

LIBO can be paired with any kernelized or linear bandit algorithm while retaining the oracle-optimal guarantee.

03

F-LIBO extends LIBO to federated settings without access to individual task data.

Abstract

Machine learning algorithms are often applied repeatedly to problems with similar structure. We focus on solving a sequence of bandit optimization tasks and develop LIBO, an algorithm which adapts to the environment by learning from past experience and becomes more sample-efficient in the process. We assume a kernelized structure where the kernel is unknown but shared across all tasks. LIBO sequentially meta-learns a kernel that approximates the true kernel and solves the incoming tasks with the latest kernel estimate. Our algorithm can be paired with any kernelized or linear bandit algorithm and guarantees oracle optimal performance, meaning that as more tasks are solved, the regret of LIBO on each task converges to the regret of the bandit algorithm with oracle knowledge of the true kernel. Naturally, if paired with a sublinear bandit algorithm, LIBO yields a…
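The loop described in the abstract (solve each incoming task with the latest kernel estimate, then refine the estimate from all data seen so far) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the paper's estimator: the domain is a finite 1-D grid, the unknown kernel is one of three candidate RBF lengthscales, the base bandit is a plain GP-UCB rule, and the kernel is re-selected by pooled GP marginal likelihood as a simple stand-in for the meta-learning step. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale):
    """Squared-exponential kernel matrix between two 1-D point sets."""
    d2 = (X[:, None] - Y[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(K_full, idx, y, noise):
    """Posterior mean and std over the whole finite domain, given observations at idx."""
    if len(idx) == 0:
        return np.zeros(K_full.shape[0]), np.sqrt(np.diag(K_full))
    K_oo = K_full[np.ix_(idx, idx)] + noise**2 * np.eye(len(idx))
    K_do = K_full[:, idx]
    mean = K_do @ np.linalg.solve(K_oo, y)
    var = np.diag(K_full) - np.sum(K_do * np.linalg.solve(K_oo, K_do.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def log_marginal_likelihood(K_full, idx, y, noise):
    """Gaussian log evidence of one task's observations under a candidate kernel."""
    K_oo = K_full[np.ix_(idx, idx)] + noise**2 * np.eye(len(idx))
    _, logdet = np.linalg.slogdet(K_oo)
    return -0.5 * (y @ np.linalg.solve(K_oo, y) + logdet + len(idx) * np.log(2 * np.pi))

def gp_ucb_task(f, K_full, horizon, noise, beta, rng):
    """Run a basic GP-UCB rule on one task; return queried indices, rewards, regret."""
    idx, y, regret = [], [], 0.0
    for _ in range(horizon):
        mean, std = gp_posterior(K_full, idx, np.array(y), noise)
        x = int(np.argmax(mean + beta * std))
        idx.append(x)
        y.append(f[x] + noise * rng.standard_normal())
        regret += f.max() - f[x]
    return idx, np.array(y), regret

def lifelong_loop(n_tasks=8, horizon=15, noise=0.1, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    domain = np.linspace(0.0, 1.0, 40)
    # Assumption: the shared kernel is one of these three candidates.
    candidates = {ls: rbf_kernel(domain, domain, ls) for ls in (0.02, 0.2, 1.0)}
    true_ls = 0.2
    chol = np.linalg.cholesky(candidates[true_ls] + 1e-6 * np.eye(len(domain)))
    current = 0.02  # deliberately poor initial kernel guess
    history, chosen, regrets = [], [], []
    for _ in range(n_tasks):
        f = chol @ rng.standard_normal(len(domain))  # new task drawn from the true kernel
        idx, y, regret = gp_ucb_task(f, candidates[current], horizon, noise, beta, rng)
        history.append((idx, y))
        regrets.append(regret)
        # Stand-in for meta-learning: pick the candidate with the best pooled evidence.
        scores = {ls: sum(log_marginal_likelihood(K, i, v, noise) for i, v in history)
                  for ls, K in candidates.items()}
        current = max(scores, key=scores.get)
        chosen.append(current)
    return chosen, regrets

if __name__ == "__main__":
    chosen, regrets = lifelong_loop()
    print("kernel estimate after each task:", chosen)
    print("per-task regret:", np.round(regrets, 2))
```

The sketch only shows the control flow: data from every finished task is kept, and the kernel used for the next task is whichever estimate the accumulated data currently supports. It makes no claim about the regret rates proved in the paper.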

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Age of Information Optimization