Representative Action Selection for Large Action Space Bandit Families
Quan Zhou, Mark Kozdoba, Shie Mannor

TL;DR
This paper introduces a simple algorithm that selects a small set of representative actions from a large, correlated action space shared by a family of bandits, shrinking the action set while aiming to perform nearly as well as the full one.
Contribution
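The paper proposes a sampling-based method: repeatedly sample a bandit instance at random, solve it, and collect its optimal action. The method exploits correlations between actions without knowing the correlation structure in advance, and it comes with theoretical guarantees and empirical validation.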
Findings
The algorithm can significantly reduce the action space when actions are highly correlated across the bandit family.
It performs comparably to or better than existing baselines such as Combinatorial Bandit and Meta Learning Bandit.
Theoretical analysis confirms the algorithm's near-optimal performance.
Abstract
We study the problem of selecting a subset from a large action space shared by a family of bandits. In many natural situations, while the nominal set of actions is large, actions are highly correlated: many yield similar rewards across environments, making it wasteful to maintain the full set. Our aim is to understand whether it is possible -- and how -- to select a smaller set of representative actions that performs nearly as well as the full action space. Our main contribution is a surprisingly simple algorithm: repeatedly sample a bandit instance at random, solve it, and collect the optimal action. This algorithm can significantly reduce the action space when such correlations are present, without the need to know a-priori the correlation structure. We provide theoretical guarantees on the performance of the algorithm and demonstrate its practical effectiveness through empirical…
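The sampling procedure described in the abstract (sample a bandit instance at random, solve it, collect the optimal action) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy bandit family with five hypothetical reward peaks, and the assumption that "solving" an instance means taking the argmax over known mean rewards are all invented here for the example.

```python
import random

def select_representative_actions(sample_instance, solve, num_samples, rng):
    """Collect the optimal action of each randomly drawn bandit instance."""
    representatives = set()
    for _ in range(num_samples):
        instance = sample_instance(rng)        # draw one bandit from the family
        representatives.add(solve(instance))   # keep that instance's optimal action
    return representatives

# Hypothetical toy family (an assumption for illustration only):
# 1000 nominal actions, but every instance peaks at one of 5 actions,
# so rewards are highly correlated across the family.
NUM_ACTIONS = 1000
PEAKS = [40, 310, 555, 720, 980]

def sample_toy_instance(rng):
    """Return the mean-reward vector of one randomly chosen instance."""
    peak = rng.choice(PEAKS)
    return [-abs(a - peak) for a in range(NUM_ACTIONS)]

def solve_by_argmax(mean_rewards):
    """Assumes mean rewards are known; returns the index of the best action."""
    return max(range(len(mean_rewards)), key=mean_rewards.__getitem__)

rng = random.Random(0)
reps = select_representative_actions(sample_toy_instance, solve_by_argmax, 50, rng)
print(sorted(reps))  # a small subset of the 1000 nominal actions
```

In this toy setting the selected set can never contain more than five actions, no matter how many instances are sampled, which mirrors the intuition that correlated action spaces admit small representative sets. In a realistic setting the solve step would itself be a bandit or optimization routine rather than an exact argmax.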
