Open Problem: Model Selection for Contextual Bandits

Dylan J. Foster; Akshay Krishnamurthy; Haipeng Luo

arXiv:2006.10940 · cs.LG · June 22, 2020 · 1 citation

TL;DR

This paper asks whether contextual bandit algorithms can achieve model selection guarantees like those known in supervised learning, where the learner adapts to the complexity of the best hypothesis class in a sequence.

Contribution

It introduces the open problem of extending model selection guarantees to contextual bandits, highlighting a key gap in current theoretical understanding.

Findings

01. Identifies the gap in model selection for contextual bandits
02. Proposes the problem as an open question in learning theory
03. Highlights the need for new algorithms with adaptive guarantees

Abstract

In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence. We ask whether similar guarantees are possible for contextual bandit learning.

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Machine Learning and Algorithms