Hypothesis Transfer in Bandits by Weighted Models

Steven Bilaj; Sofien Dhouib; Setareh Maghsudi

arXiv:2211.07387 · cs.LG · November 15, 2022

TL;DR

This paper introduces hypothesis transfer techniques for contextual bandits: previously learned source models are re-weighted to accelerate exploration on a new bandit problem, with extensions to multiple source models.

Contribution

The paper proposes a re-weighting scheme for transfer in contextual bandits, extends it to multiple source models and to dynamic convex combinations of them, and supports it with theoretical guarantees and empirical validation.

Findings

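01 Reduced regret compared to classic Linear UCB when the source and target tasks are related, with the classic regret rate recovered when they are unrelated

02 Effective handling of multiple source models, with the algorithm choosing the preferred model at each time step

03 Empirical results on simulated and real data confirm benefits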

Abstract

We consider the problem of contextual multi-armed bandits in the setting of hypothesis transfer learning. That is, we assume access to a model previously learned on an unobserved set of contexts, and we leverage it to accelerate exploration on a new bandit problem. Our transfer strategy is based on a re-weighting scheme, for which we show a reduction in regret over the classic Linear UCB when transfer is desired, while recovering the classic regret rate when the two tasks are unrelated. We further extend this method to an arbitrary number of source models, where the algorithm decides which model is preferred at each time step. Additionally, we discuss an approach where a dynamic convex combination of source models is given in terms of a biased regularization term in the classic LinUCB algorithm. The algorithms and the theoretical analysis of our proposed methods…
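The biased regularization idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is the textbook ridge estimate used by LinUCB, with the penalty centred on a source parameter instead of zero, so that the estimate is theta_hat = A^{-1}(b + lam * theta_src) with A = lam * I + sum of x x^T. The names `BiasedLinUCB`, `convex_combination`, `theta_src`, `lam`, and `alpha` are hypothetical, the single shared parameter vector is an assumption, and the convex weights are fixed here, whereas the abstract describes a dynamic combination whose update rule is not given on this page.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def convex_combination(sources, weights):
    """Fixed convex combination of source parameter vectors (assumed weights)."""
    assert abs(sum(weights) - 1.0) < 1e-9 and all(w >= 0 for w in weights)
    dim = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(dim)]


class BiasedLinUCB:
    """LinUCB with the ridge penalty lam * ||theta - theta_src||^2 (a sketch)."""

    def __init__(self, dim, theta_src, lam=1.0, alpha=1.0):
        self.lam = lam
        self.alpha = alpha  # exploration width multiplier (assumed constant)
        self.theta_src = list(theta_src)
        # A starts as lam * I, so its inverse is I / lam.
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def theta_hat(self):
        # Closed form of the biased ridge problem: A^{-1} (b + lam * theta_src).
        rhs = [bi + self.lam * si for bi, si in zip(self.b, self.theta_src)]
        return matvec(self.A_inv, rhs)

    def select(self, contexts):
        """Return the index of the arm context with the largest UCB score."""
        theta = self.theta_hat()

        def ucb(x):
            width = math.sqrt(dot(x, matvec(self.A_inv, x)))
            return dot(x, theta) + self.alpha * width

        return max(range(len(contexts)), key=lambda i: ucb(contexts[i]))

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^{-1} after adding x x^T to A.
        Ax = matvec(self.A_inv, x)
        denom = 1.0 + dot(x, Ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

Before any reward is observed the estimate equals `theta_src`, so a good source model steers early arm choices, and setting `theta_src` to the zero vector gives back the usual LinUCB ridge estimate. As data accumulate, the observed rewards dominate the penalty term, which is one informal way to see why an unrelated source need not hurt the asymptotic behaviour.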

Taxonomy

Topics: Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms