Causal Bandits: Online Decision-Making in Endogenous Settings
Jingwen Zhang, Yifang Chen, Amandeep Singh

TL;DR
This paper introduces the $\epsilon$-BanditIV algorithm for online decision-making in endogenous linear bandit problems. It uses instrumental variables to correct for endogeneity bias, comes with proven regret bounds, and outperforms existing methods in simulations and on real data.
Contribution
The paper proposes the $\epsilon$-BanditIV algorithm, which addresses endogenous covariates in linear bandits using instrumental variables, with theoretical guarantees and practical validation.
Findings
$\epsilon$-BanditIV outperforms existing methods in endogenous settings.
The algorithm achieves a $\tilde{O}(k\sqrt{T})$ regret bound.
Demonstrated effectiveness on real-time bidding data.
Abstract
The deployment of Multi-Armed Bandits (MAB) has become commonplace in many economic applications. However, regret guarantees for even state-of-the-art linear bandit algorithms (such as Optimism in the Face of Uncertainty Linear bandit (OFUL)) make strong exogeneity assumptions w.r.t. arm covariates. This assumption is very often violated in many economic contexts and using such algorithms can lead to sub-optimal decisions. Further, in social science analysis, it is also important to understand the asymptotic distribution of estimated parameters. To this end, in this paper, we consider the problem of online learning in linear stochastic contextual bandit problems with endogenous covariates. We propose an algorithm we term $\epsilon$-BanditIV, that uses instrumental variables to correct for this bias, and prove an upper bound for the expected regret of the…
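To make the bias that the abstract refers to concrete, here is a minimal sketch of instrumental-variable correction in an offline, one-dimensional setting. It is not the $\epsilon$-BanditIV algorithm itself: the data-generating process, the coefficient values, and the function names (`simulate`, `ols`, `tsls`) are all assumptions chosen for illustration. The covariate `x` is endogenous because it shares an unobserved confounder `u` with the reward `y`, while the instrument `z` shifts `x` but affects `y` only through `x`.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw data where x is endogenous (illustrative, assumed DGP)."""
    rng = np.random.default_rng(seed)
    theta = 2.0                                   # true effect of x on y
    z = rng.normal(size=n)                        # instrument
    u = rng.normal(size=n)                        # unobserved confounder
    x = z + u + 0.5 * rng.normal(size=n)          # covariate correlated with u
    y = theta * x + 1.5 * u + 0.5 * rng.normal(size=n)
    return x, y, z, theta


def ols(x, y):
    """Ordinary least squares slope (no intercept; data are zero-mean)."""
    return float(x @ y / (x @ x))


def tsls(x, y, z):
    """Two-stage least squares with a single instrument."""
    gamma = (z @ x) / (z @ z)     # first stage: regress x on z
    x_hat = gamma * z             # fitted values depend only on z
    return float(x_hat @ y / (x_hat @ x_hat))  # second stage: y on x_hat


if __name__ == "__main__":
    x, y, z, theta = simulate()
    print("true theta:", theta)
    print("OLS estimate:", ols(x, y))
    print("2SLS estimate:", tsls(x, y, z))
```

Under these assumptions the OLS slope absorbs part of the confounder's effect and overstates `theta`, while the two-stage estimate stays close to it. A bandit algorithm that fits rewards by plain least squares on endogenous covariates inherits the same bias, which is the motivation for bringing instruments into the online setting.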
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Advanced Causal Inference Techniques
