Tractable contextual bandits beyond realizability
Sanath Kumar Krishnamurthy, Vitor Hadad, and Susan Athey

TL;DR
This paper introduces a computationally efficient contextual bandit algorithm that remains effective even when the true reward model is not within the assumed class, addressing the limitations of realizability assumptions.
Contribution
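The paper presents a tractable contextual bandit algorithm that is not sensitive to the realizability assumption. When realizability fails, it retains the regret guarantees that realizability-based methods achieve under realizability, up to an additive term that accounts for the misspecification error.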
Findings
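The regret bound includes an additive misspecification term, proportional to T times a function of the mean squared error between the best model in the class and the true model, where T is the total number of time-steps.
The method is computationally tractable: in every epoch it reduces to solving a constrained regression problem.
The analysis sheds light on the bias-variance trade-off for tractable contextual bandits.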
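To make the misspecification error concrete, here is a minimal sketch of the quantity the findings refer to: the mean squared error between the best model in a class and the true expected reward. It is an illustration only and not the paper's algorithm. The single-arm setup, the linear class w0 + w1*x, the two reward functions, and the helper name best_in_class_mse are all assumptions made for this example.

```python
import numpy as np


def best_in_class_mse(features, true_rewards):
    """Fit the best linear model by least squares and return (weights, MSE).

    Hypothetical helper for illustration; the MSE is measured against the
    noise-free expected reward, which is what "misspecification error" means here.
    """
    weights, *_ = np.linalg.lstsq(features, true_rewards, rcond=None)
    residuals = features @ weights - true_rewards
    return weights, float(np.mean(residuals ** 2))


rng = np.random.default_rng(0)
contexts = rng.uniform(-1.0, 1.0, size=5000)

# Assumed model class: linear functions of the context, w0 + w1 * x.
features = np.column_stack([np.ones_like(contexts), contexts])

# Case 1: the true expected reward is linear, so realizability holds.
realizable_reward = 0.5 + 0.3 * contexts

# Case 2: the true expected reward has a quadratic part the class cannot express.
misspecified_reward = 0.5 + 0.3 * contexts + 0.4 * contexts ** 2

_, mse_realizable = best_in_class_mse(features, realizable_reward)
_, mse_misspecified = best_in_class_mse(features, misspecified_reward)

print(f"best-in-class MSE, realizable:   {mse_realizable:.3e}")
print(f"best-in-class MSE, misspecified: {mse_misspecified:.3e}")
```

In the realizable case the best-in-class error is zero up to floating-point precision, so the additive term vanishes. In the misspecified case it is strictly positive, and this is the quantity that enters the extra regret term.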
Abstract
Tractable contextual bandit algorithms often rely on the realizability assumption - i.e., that the true expected reward model belongs to a known class, such as linear functions. In this work, we present a tractable bandit algorithm that is not sensitive to the realizability assumption and computationally reduces to solving a constrained regression problem in every epoch. When realizability does not hold, our algorithm ensures the same guarantees on regret achieved by realizability-based algorithms under realizability, up to an additive term that accounts for the misspecification error. This extra term is proportional to T times a function of the mean squared error between the best model in the class and the true model, where T is the total number of time-steps. Our work sheds light on the bias-variance trade-off for tractable contextual bandits. This trade-off is not captured by…
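The shape of the guarantee described in the abstract can be written schematically as follows. This is a sketch of the structure only: the symbols B, g, C, f^\star, \mathcal{F}, and \mathrm{Reg} are notation introduced here rather than taken from the paper, and the distribution over context-action pairs under which the error is measured is left unspecified. The abstract says only "proportional to T times a function of the mean squared error", so C and g are placeholders for the paper's exact constants and functional form.

```latex
% B: mean squared error between the best model in the class F and the true model f*
B \;=\; \min_{f \in \mathcal{F}} \; \mathbb{E}\!\left[\big(f(x,a) - f^{\star}(x,a)\big)^{2}\right],
\qquad
% Regret under misspecification: the realizable-case bound plus an additive term
\mathrm{Reg}(T) \;\le\; \mathrm{Reg}_{\mathrm{realizable}}(T) \;+\; C \cdot T \cdot g(B).
```

Because the extra term scales with T, a richer model class that reduces B lowers this term at the cost of a larger realizable-case bound, which is the bias-variance trade-off the abstract mentions.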
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Smart Grid Energy Management
