Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

Runzhe Wan; Lin Ge; Rui Song

arXiv:2202.13227·cs.LG·March 1, 2022·1 cites

TL;DR

This paper introduces a scalable, robust meta-learning framework for structured bandits that leverages a Bayesian hierarchical model and meta Thompson sampling to handle large parameter spaces effectively.

Contribution

It proposes a unified meta-learning approach with a Bayesian hierarchical model for structured bandits, improving scalability and robustness over existing methods.

Findings

01

The framework is applicable to many structured bandit problems.

02

The proposed algorithm is scalable to large parameter and action spaces.

03

Theoretical analysis and numerical results validate the method's effectiveness.

Abstract

Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a general class of structured bandit problems where the parameter space can be factorized to the item level. The novel bandit algorithm is general enough to be applied to many popular problems, scalable to huge parameter and action spaces, and robust to the specification of the generalization model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Both theoretical analysis and numerical results support the usefulness of the proposed method.
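
To make the idea of a feature-based hierarchical model with Thompson sampling concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a Gaussian multi-armed bandit in which each item mean theta_i is drawn from N(x_i^T gamma, sigma1^2) and rewards are N(theta_i, sigma^2). The function names (sample_item_means, run_bandit) and all variance parameters are hypothetical choices made for illustration.

```python
import numpy as np

def sample_item_means(X, counts, sums, rng, sigma=1.0, sigma1=0.5, sigma_gamma=1.0):
    """Draw one joint posterior sample of all item means (assumed Gaussian model).

    X: (n_items, d) item features; counts/sums: pulls and reward totals per item.
    """
    n_items, d = X.shape
    seen = counts > 0
    Xs = X[seen]
    ybar = sums[seen] / counts[seen]
    # Given gamma, an observed item's average reward has variance
    # sigma1^2 + sigma^2 / n_i once theta_i is integrated out.
    v = sigma1**2 + sigma**2 / counts[seen]
    Xw = Xs / v[:, None]
    precision = np.eye(d) / sigma_gamma**2 + Xw.T @ Xs
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2  # guard against numerical asymmetry
    mean = cov @ (Xw.T @ ybar)
    # Step 1: sample the shared (meta-level) parameter gamma.
    gamma = rng.multivariate_normal(mean, cov)
    # Step 2: sample each theta_i from its conditional posterior given gamma.
    # Unseen items (n_i = 0) fall back to the feature-based prior.
    post_prec = 1.0 / sigma1**2 + counts / sigma**2
    post_mean = (X @ gamma / sigma1**2 + sums / sigma**2) / post_prec
    return rng.normal(post_mean, 1.0 / np.sqrt(post_prec))

def run_bandit(X, true_theta, horizon, seed=0):
    """Thompson sampling loop: act greedily on each posterior sample."""
    rng = np.random.default_rng(seed)
    n_items = X.shape[0]
    counts = np.zeros(n_items)
    sums = np.zeros(n_items)
    for _ in range(horizon):
        theta_tilde = sample_item_means(X, counts, sums, rng)
        a = int(np.argmax(theta_tilde))
        reward = true_theta[a] + rng.normal()  # unit-variance reward noise
        counts[a] += 1
        sums[a] += reward
    return counts

if __name__ == "__main__":
    X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
    true_theta = X @ np.array([0.0, 0.5])  # hypothetical ground truth
    print(run_bandit(X, true_theta, horizon=200))
```

Sampling gamma first and then each theta_i given gamma yields a draw from the joint posterior under this assumed model, which is how observations on one item can inform items that have not yet been tried: they share gamma through their features.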

Taxonomy

Topics: Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms