Open Problem: Model Selection for Contextual Bandits
Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

TL;DR
This paper poses model selection for contextual bandits as an open problem, asking whether adaptive guarantees similar to those in supervised learning can be achieved in this setting.
Contribution
It introduces the open problem of extending model selection guarantees to contextual bandits, highlighting a key gap in current theoretical understanding.
Findings
Identifies the gap in model selection for contextual bandits
Proposes the problem as an open question in learning theory
Highlights the need for new algorithms with adaptive guarantees
Abstract
In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence. We ask whether similar guarantees are possible for contextual bandit learning.
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Machine Learning and Algorithms
