Effects of Model Misspecification on Bayesian Bandits: Case Studies in UX Optimization
Mack Sweeney, Matthew van Adelsberg, Kathryn Laskey, Carlotta Domeniconi

TL;DR
This paper investigates how model misspecification affects Bayesian bandit algorithms in UX optimization, demonstrating the impact of overdispersion and cointegration, and proposing extensions for improved performance.
Contribution
It introduces novel formulations for UXO as a restless, sleeping bandit with unobserved confounders, and presents new models addressing overdispersion and cointegration effects.
Findings
Misspecification leads to sub-optimal rewards.
Model extensions improve exploration and exploitation.
Cointegration exploitation achieves finite regret and efficient stopping.
Abstract
Bayesian bandits using Thompson Sampling have seen increasing success in recent years. Yet existing value models (of rewards) are misspecified on many real-world problems. We demonstrate this on the User Experience Optimization (UXO) problem, providing a novel formulation as a restless, sleeping bandit with unobserved confounders plus optional stopping. Our case studies show how common misspecifications can lead to sub-optimal rewards, and we provide model extensions to address these, along with a scientific model building process practitioners can adopt or adapt to solve their own unique problems. To our knowledge, this is the first study showing the effects of overdispersion on bandit explore/exploit efficacy, tying the common notions of under- and over-confidence to over- and under-exploration, respectively. We also present the first model to exploit cointegration in a restless…
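For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of Thompson Sampling with a standard Beta-Bernoulli value model. It is not the paper's method or any of its proposed extensions. The function name `thompson_sampling`, the conversion rates, and the round count are illustrative assumptions. The sketch assumes each arm's rewards are independent Bernoulli draws with a fixed rate, which is the kind of simple value model that the abstract argues is misspecified on real UXO data. When real rewards are overdispersed relative to this assumption, the Beta posteriors are narrower than they should be, which corresponds to the over-confidence and under-exploration pairing described above.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Beta-Bernoulli Thompson Sampling on a simulated stationary bandit.

    true_rates: hypothetical per-arm conversion rates (illustrative only).
    Returns per-arm pull counts and the final Beta posterior parameters.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    alpha = [1.0] * n_arms  # Beta(1, 1) uniform prior: 1 + observed successes
    beta = [1.0] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one plausible conversion rate per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Play the arm whose sampled rate is highest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulate a Bernoulli reward from the (assumed) true rate.
        reward = 1 if rng.random() < true_rates[arm] else 0
        # Conjugate update of the chosen arm's posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls, alpha, beta


pulls, alpha, beta = thompson_sampling([0.05, 0.10, 0.30], rounds=2000)
print(pulls)
```

In this simulation the model is correctly specified, so play should concentrate on the best arm. The paper's case studies concern what happens when that assumption fails.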
