Action Centered Contextual Bandits
Kristjan Greenewald, Ambuj Tewari, Predrag Klasnja, Susan Murphy

TL;DR
This paper introduces an extended contextual bandit model tailored for mobile health, combining complex baseline reward modeling with simple treatment effects, supported by theoretical guarantees and experimental validation.
Contribution
It proposes a novel model extension for contextual bandits that separates baseline reward and treatment effect, suitable for mobile health applications, with strong theoretical guarantees.
Findings
The model allows a complex baseline reward while keeping the treatment effect simple.
The proposed algorithms carry performance guarantees comparable to those available for linear models.
Experimental results validate the model on real mobile health data.
Abstract
Contextual bandits have become popular as they offer a middle ground between very simple approaches based on multi-armed bandits and very complex approaches using the full power of reinforcement learning. They have demonstrated success in web applications and have a rich body of associated theoretical guarantees. Linear models are well understood theoretically and preferred by practitioners because they are not only easily interpretable but also simple to implement and debug. Furthermore, if the linear model is true, we get very strong performance guarantees. Unfortunately, in emerging applications in mobile health, the time-invariant linear model assumption is untenable. We provide an extension of the linear model for contextual bandits that has two parts: baseline reward and treatment effect. We allow the former to be complex but keep the latter simple. We argue that this model is…
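The two-part reward structure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the shape of the `baseline` function, the single non-null action, the parameter vector `theta`, and the convention that action 0 means "no treatment". The point it shows is that the gap between treating and not treating stays linear in the context, however complicated the baseline is.

```python
import numpy as np

def baseline(t, context):
    # Hypothetical baseline: nonlinear in the context and changing over time.
    # The model places no simplicity requirement on this part.
    return np.sin(0.1 * t) + np.tanh(context).sum()

def expected_reward(t, context, action, theta):
    # Assumed convention: action 0 is "no treatment", so only the baseline
    # contributes. Any other action adds a linear treatment effect.
    effect = float(context @ theta) if action > 0 else 0.0
    return baseline(t, context) + effect

rng = np.random.default_rng(0)
theta = np.array([0.5, -0.2, 0.1])   # illustrative treatment-effect parameters
context = rng.normal(size=3)
t = 7

# The baseline cancels in the difference, leaving only the linear term.
gap = expected_reward(t, context, 1, theta) - expected_reward(t, context, 0, theta)
```

Under these assumptions, `gap` equals `context @ theta` up to floating-point rounding, which is why an algorithm can target the simple treatment effect without having to model the baseline.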
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Advanced Wireless Network Optimization
