Practical Contextual Bandits with Feedback Graphs
Mengxiao Zhang, Yuheng Zhang, Olga Vrousgou, Haipeng Luo, Paul Mineiro

TL;DR
This paper introduces a practical approach to contextual bandits with feedback graphs, leveraging regression reduction to improve learning efficiency while maintaining optimal theoretical performance.
Contribution
It proposes a novel regression-based reduction method for contextual bandits with feedback graphs, achieving minimax rates and practical computational efficiency.
Findings
The proposed algorithms are computationally practical.
They achieve established minimax rates.
They reduce statistical complexity in real-world scenarios.
Abstract
While contextual bandits have a mature theory, how to effectively leverage different feedback patterns to speed up learning remains unclear. Bandits with feedback graphs, which interpolate between the full-information and bandit regimes, provide a promising framework for mitigating the statistical complexity of learning. In this paper, we propose and analyze an approach to contextual bandits with feedback graphs based on reduction to regression. The resulting algorithms are computationally practical and achieve established minimax rates, thereby reducing the statistical complexity in real-world applications.
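To make the interpolation between the bandit and full-information regimes concrete, here is a minimal Python sketch of the feedback model only. It is not the paper's algorithm. The adjacency-matrix convention (entry `graph[i, j] == 1` means playing action `i` reveals the loss of action `j`), the helper name `observed_losses`, and the loss values are all assumptions made for illustration.

```python
import numpy as np


def observed_losses(graph, action, losses):
    """Return {action_index: loss} for every action revealed by playing `action`.

    Assumed convention: graph[i, j] == 1 means playing i reveals the loss of j.
    """
    revealed = np.flatnonzero(graph[action])
    return {int(j): float(losses[j]) for j in revealed}


num_actions = 4
losses = np.array([0.9, 0.2, 0.5, 0.7])  # hypothetical losses for one round

# Bandit regime: self-loops only, so only the played action's loss is seen.
bandit_graph = np.eye(num_actions, dtype=int)

# Full-information regime: complete graph, so every loss is seen.
full_info_graph = np.ones((num_actions, num_actions), dtype=int)

# An in-between graph: each action also reveals its right-hand neighbour.
ring_graph = np.eye(num_actions, dtype=int)
idx = np.arange(num_actions)
ring_graph[idx, (idx + 1) % num_actions] = 1

bandit_obs = observed_losses(bandit_graph, 1, losses)
full_obs = observed_losses(full_info_graph, 1, losses)
ring_obs = observed_losses(ring_graph, 1, losses)
```

In a regression-based approach, each revealed (action, loss) pair could serve as an extra training example for the loss predictor, which is one intuitive reason richer graphs can speed up learning.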
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Smart Grid Energy Management
