Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework
Runzhe Wan, Lin Ge, Rui Song

TL;DR
This paper introduces a scalable, robust meta-learning framework for structured bandits that leverages a Bayesian hierarchical model and meta Thompson sampling to handle large parameter spaces effectively.
Contribution
It proposes a unified meta-learning approach with a Bayesian hierarchical model for structured bandits, improving scalability and robustness over existing methods.
Findings
The framework is applicable to many structured bandit problems.
The proposed algorithm is scalable to large parameter and action spaces.
Theoretical analysis and numerical results validate the method's effectiveness.
Abstract
Online learning in large-scale structured bandits is known to be challenging due to the curse of dimensionality. In this paper, we propose a unified meta-learning framework for a general class of structured bandit problems where the parameter space can be factorized to the item level. The novel bandit algorithm is general enough to be applied to many popular problems, scalable to huge parameter and action spaces, and robust to the specification of the generalization model. At the core of this framework is a Bayesian hierarchical model that allows information sharing among items via their features, upon which we design a meta Thompson sampling algorithm. Three representative examples are discussed thoroughly. Both theoretical analysis and numerical results support the usefulness of the proposed method.
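To make the idea of sharing information among items via their features concrete, here is a minimal sketch of Thompson sampling under a toy hierarchical Gaussian model. It is not the authors' algorithm. It assumes each item's mean reward is theta_i = x_i^T gamma + delta_i with gamma ~ N(0, I) and delta_i ~ N(0, sigma1^2), and that rewards are theta_i plus N(0, sigma^2) noise. The function names, the Gaussian form, and the values of sigma1 and sigma are all illustrative assumptions.

```python
import numpy as np


def posterior_theta(X, counts, sums, sigma1=0.5, sigma=1.0):
    """Exact Gaussian posterior over item means theta (illustrative model).

    Assumed model (not from the paper): theta = X @ gamma + delta,
    gamma ~ N(0, I), delta_i ~ N(0, sigma1^2), reward ~ N(theta_i, sigma^2).
    counts[i] and sums[i] are the number and total of rewards seen for item i.
    """
    K = X.shape[0]
    # Marginal prior over theta: features induce correlation between items.
    prior_cov = X @ X.T + sigma1 ** 2 * np.eye(K)
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + np.diag(counts) / sigma ** 2)
    post_cov = (post_cov + post_cov.T) / 2.0  # guard against round-off asymmetry
    post_mean = post_cov @ (sums / sigma ** 2)
    return post_mean, post_cov


def run_thompson_sampling(X, theta_true, horizon=200, sigma1=0.5, sigma=1.0, seed=0):
    """Pull one item per round by sampling theta from the posterior."""
    rng = np.random.default_rng(seed)
    K = X.shape[0]
    counts = np.zeros(K)
    sums = np.zeros(K)
    for _ in range(horizon):
        mean, cov = posterior_theta(X, counts, sums, sigma1, sigma)
        sampled = rng.multivariate_normal(mean, cov)
        arm = int(np.argmax(sampled))  # act greedily on the posterior sample
        reward = theta_true[arm] + sigma * rng.normal()
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums
```

In this sketch, observing rewards for one item shifts the posterior mean of any other item with similar features, while items with unrelated features are unaffected. The per-item noise term delta_i lets individual items deviate from what the features alone predict, which is one simple way to picture robustness to a misspecified feature model.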
Taxonomy
Topics: Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
