Causal Bandits: Online Decision-Making in Endogenous Settings

Jingwen Zhang; Yifang Chen; Amandeep Singh

arXiv:2211.08649 · econ.EM · February 28, 2023

Open Access

TL;DR

This paper introduces the $\epsilon$-BanditIV algorithm for online decision-making in endogenous linear bandit problems. The algorithm uses instrumental variables to correct endogeneity bias, comes with proven regret bounds, and shows superior performance in simulations and on real data.

Contribution

The paper proposes the $\epsilon$-BanditIV algorithm, which addresses endogenous covariates in linear bandits using instrumental variables, with theoretical guarantees and practical validation.
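The bias that instrumental variables correct can be illustrated with a small offline simulation. The sketch below is not the $\epsilon$-BanditIV algorithm itself; it is a generic two-stage least squares (2SLS) estimator on synthetic data, under assumed values (a single covariate, a single instrument, and the coefficients 0.8, 2.0, and 0.5 chosen purely for illustration). The function names `simulate_endogenous`, `ordinary_least_squares`, and `two_stage_least_squares` are hypothetical and do not come from the paper.

```python
import numpy as np


def simulate_endogenous(n, theta, rng):
    """Synthetic data where the covariate x is correlated with the reward noise.

    All coefficients here are illustrative assumptions, not values from the paper.
    """
    z = rng.normal(size=n)                       # instrument: affects y only through x
    u = rng.normal(size=n)                       # unobserved confounder
    x = 0.8 * z + u + 0.5 * rng.normal(size=n)   # covariate depends on z and on u
    y = theta * x + 2.0 * u + 0.5 * rng.normal(size=n)  # u also enters the reward
    return z.reshape(-1, 1), x.reshape(-1, 1), y


def ordinary_least_squares(X, y):
    """Plain least squares; biased here because X is correlated with the noise."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def two_stage_least_squares(Z, X, y):
    """Generic 2SLS: project X onto the instruments, then regress y on the projection."""
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)  # stage 1: X regressed on Z
    X_hat = Z @ first_stage                              # fitted, exogenous part of X
    theta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)    # stage 2: y regressed on X_hat
    return theta


rng = np.random.default_rng(0)
TRUE_THETA = 1.0
Z, X, y = simulate_endogenous(20_000, TRUE_THETA, rng)

theta_ols = ordinary_least_squares(X, y)[0]
theta_iv = two_stage_least_squares(Z, X, y)[0]
print(f"true={TRUE_THETA:.2f}  ols={theta_ols:.3f}  iv={theta_iv:.3f}")
```

In this setup the least squares estimate drifts well away from the true coefficient because the confounder `u` enters both the covariate and the reward, while the 2SLS estimate stays close to it. A bandit algorithm that ranks arms using the biased estimate can therefore favor the wrong arm, which is the failure mode the contribution above targets.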

Findings

01

$\epsilon$-BanditIV outperforms existing methods in endogenous settings.

02

The algorithm achieves an $\tilde{O}(k\sqrt{T})$ upper bound on expected regret.

03

Demonstrated effectiveness on real-time bidding data.

Abstract

The deployment of Multi-Armed Bandits (MAB) has become commonplace in many economic applications. However, regret guarantees for even state-of-the-art linear bandit algorithms (such as Optimism in the Face of Uncertainty Linear bandit (OFUL)) make strong exogeneity assumptions w.r.t. arm covariates. This assumption is very often violated in many economic contexts and using such algorithms can lead to sub-optimal decisions. Further, in social science analysis, it is also important to understand the asymptotic distribution of estimated parameters. To this end, in this paper, we consider the problem of online learning in linear stochastic contextual bandit problems with endogenous covariates. We propose an algorithm we term $\epsilon$-BanditIV, that uses instrumental variables to correct for this bias, and prove an $\tilde{O}(k\sqrt{T})$ upper bound for the expected regret of the…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Advanced Causal Inference Techniques