Protocols for Verifying Smooth Strategies in Bandits and Games
Miranda Christ, Daniel Reichman, Jonathan Shafer

TL;DR
This paper develops query-efficient protocols for verifying that a strategy in a multi-armed bandit or normal-form game is approximately optimal. For smooth strategies, verification requires provably fewer queries than learning.
Contribution
It introduces sublinear-query protocols for verifying the approximate optimality of smooth strategies in bandits and games, proves a nearly-tight lower bound on the query complexity of verification, and applies the bandit protocols to Nash equilibrium verification.
Findings
Verification protocols require provably fewer arm queries than learning.
The protocols apply to smooth strategies, meaning strategies that do not put too much probability mass on any single action.
For such strategies, the number of queries to the utility oracle is sublinear in the number of actions.
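To make the notion of a smooth strategy concrete, here is a minimal Python sketch. It assumes smoothness is parameterized by a value `sigma` such that no action receives more than `1 / (sigma * n)` probability; that threshold, and the names `is_smooth`, `estimate_value`, and `pull_arm`, are illustrative assumptions and are not taken from the paper. The second function is plain Monte Carlo estimation of a policy's expected reward, shown only to illustrate what an "arm query" is; it is not the paper's verification protocol.

```python
import random

def is_smooth(policy, sigma):
    """Return True if no action has probability above 1 / (sigma * n).

    The 1 / (sigma * n) cap is an assumed parameterization of smoothness.
    """
    n = len(policy)
    cap = 1.0 / (sigma * n)
    return all(p <= cap + 1e-12 for p in policy)

def estimate_value(policy, pull_arm, num_queries, rng):
    """Monte Carlo estimate of a policy's expected reward.

    Samples num_queries arms from the policy and averages the observed
    rewards; pull_arm is a hypothetical oracle returning one reward sample.
    """
    arms = rng.choices(range(len(policy)), weights=policy, k=num_queries)
    return sum(pull_arm(a) for a in arms) / num_queries

# Usage: a uniform policy over 1000 arms with Bernoulli rewards.
rng = random.Random(0)
n = 1000
means = [i / n for i in range(n)]          # hypothetical arm means
uniform = [1.0 / n] * n
pull = lambda a: 1.0 if rng.random() < means[a] else 0.0
print(is_smooth(uniform, sigma=1.0))
print(estimate_value(uniform, pull, num_queries=200, rng=rng))
```

Note that the estimate uses 200 queries even though there are 1000 arms; the number of samples needed for a given accuracy depends on the reward range and the target error, not on the number of arms.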
Abstract
We study protocols for verifying approximate optimality of strategies in multi-armed bandits and normal-form games. As the number of actions available to each player is often large, we seek protocols where the number of queries to the utility oracle is sublinear in the number of actions. We prove that such verification is possible for sufficiently smooth strategies that do not put too much probability mass on any specific action. We provide protocols for verifying that a smooth policy for a multi-armed bandit is ε-optimal. Our verification protocols require provably fewer arm queries than learning. Furthermore, we establish a nearly-tight lower bound on the query complexity of verification in our settings. As an application, we show how to use verification for bandits to achieve verification in normal-form games. This gives a protocol for verifying whether a given strategy…
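One way the terms in the abstract could be formalized is sketched below. The notation is an assumption made for illustration and may differ from the paper's own definitions, in particular the 1/(σn) cap and the choice of smooth policies as the comparison class for optimality.

```latex
% Assumed notation: n arms with mean rewards \mu_1, \dots, \mu_n \in [0,1],
% a policy \pi that is a distribution over arms, and a smoothness
% parameter \sigma \in (0,1]. None of this is quoted from the paper.
\[
  \pi \text{ is } \sigma\text{-smooth}
  \iff \max_{i \in [n]} \pi(i) \le \frac{1}{\sigma n},
  \qquad
  V(\pi) = \sum_{i=1}^{n} \pi(i)\,\mu_i ,
\]
\[
  \pi \text{ is } \varepsilon\text{-optimal}
  \iff V(\pi) \ge \max_{\pi' \text{ is } \sigma\text{-smooth}} V(\pi') - \varepsilon .
\]
```

Under this reading, a uniform policy is 1-smooth, while a policy concentrated on a single arm is not σ-smooth for any σ > 1/n.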
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques
