Contextual Bandits with Large Action Spaces: Made Practical
Yinglun Zhu, Dylan J. Foster, John Langford, Paul Mineiro

TL;DR
This paper introduces a practical, efficient algorithm for contextual bandits with large, continuous action spaces, bridging the gap between theory and real-world applications.
Contribution
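It presents the first general-purpose, computationally efficient algorithm for contextual bandits with continuous, linearly structured action spaces. The algorithm relies on two computational oracles, one for supervised learning and one for optimization over the action space, and comes with theoretical guarantees and practical performance.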
Findings
Achieves sample complexity, runtime, and memory that are independent of the size of the action space.
Outperforms standard baselines in large-scale empirical evaluations.
Provides a simple, practical approach for complex decision-making scenarios.
Abstract
A central problem in sequential decision making is to develop algorithms that are practical and computationally efficient, yet support the use of flexible, general-purpose models. Focusing on the contextual bandit problem, recent progress provides provably efficient algorithms with strong empirical performance when the number of possible alternatives ("actions") is small, but guarantees for decision making in large, continuous action spaces have remained elusive, leading to a significant gap between theory and practice. We present the first efficient, general-purpose algorithm for contextual bandits with continuous, linearly structured action spaces. Our algorithm makes use of computational oracles for (i) supervised learning, and (ii) optimization over the action space, and achieves sample complexity, runtime, and memory independent of the size of the action space. In addition, it is…
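The two oracles named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it shows only a greedy step with no exploration, and every name in it (`LinearRegressionOracle`, `unit_ball_argmax`, `greedy_action`) is hypothetical. It assumes the action set is the unit Euclidean ball and the reward model is f(x, a) = <a, W x>, so the optimization oracle has a closed form and no enumeration of actions is ever needed.

```python
import math
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def unit_ball_argmax(theta: Sequence[float]) -> Vector:
    """Hypothetical action-optimization oracle.

    Returns argmax over {a : ||a|| <= 1} of <a, theta>, which is
    theta / ||theta||. The cost depends on the dimension only, not on
    the (infinite) number of actions.
    """
    norm = math.sqrt(dot(theta, theta))
    if norm == 0.0:
        return [0.0] * len(theta)
    return [t / norm for t in theta]


class LinearRegressionOracle:
    """Hypothetical supervised-learning oracle for f(x, a) = <a, W x>,
    trained online with squared-loss gradient steps."""

    def __init__(self, action_dim: int, context_dim: int, lr: float = 0.1):
        self.W = [[0.0] * context_dim for _ in range(action_dim)]
        self.lr = lr

    def direction(self, x: Sequence[float]) -> Vector:
        # g(x) = W x: the vector the action is scored against.
        return [dot(row, x) for row in self.W]

    def predict(self, x: Sequence[float], a: Sequence[float]) -> float:
        return dot(a, self.direction(x))

    def update(self, x: Sequence[float], a: Sequence[float], reward: float) -> None:
        # Gradient of 0.5 * (f(x, a) - reward)^2 with respect to W[i][j]
        # is err * a[i] * x[j].
        err = self.predict(x, a) - reward
        for i, row in enumerate(self.W):
            for j in range(len(row)):
                row[j] -= self.lr * err * a[i] * x[j]


def greedy_action(oracle: LinearRegressionOracle, x: Sequence[float]) -> Vector:
    # One call to each oracle; memory and runtime scale with the
    # dimensions, not with the size of the action space.
    return unit_ball_argmax(oracle.direction(x))
```

A practical contextual bandit algorithm would add an exploration scheme on top of the greedy step; this sketch leaves that out.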
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management
