Combinatorial Pure Exploration of Causal Bandits
Nuoya Xiong, Wei Chen

TL;DR
This paper introduces the first gap-dependent, fully adaptive algorithms for combinatorial pure exploration in causal bandits, achieving polynomial sample complexity for binary generalized linear models and significantly improved sample complexity for general graphs.
Contribution
It presents novel gap-dependent, fully adaptive algorithms for causal bandit exploration in BGLM and general graphs, with improved sample complexity bounds.
Findings
First polynomial sample complexity algorithm for BGLM causal bandits
Significant sample complexity improvement for general graphs
Nearly matches the theoretical lower bound for general graphs
Abstract
The combinatorial pure exploration of causal bandits is the following online learning task: given a causal graph with unknown causal inference distributions, in each round we choose a subset of variables to intervene on (or perform no intervention) and observe the random outcomes of all random variables. The goal is to use as few rounds as possible and output an intervention that gives the best (or almost best) expected outcome on the reward variable with probability at least 1 − δ, where δ is a given confidence level. We provide the first gap-dependent and fully adaptive pure exploration algorithms on two types of causal models -- the binary generalized linear model (BGLM) and general graphs. For BGLM, our algorithm is the first to be designed specifically for this setting and achieves polynomial sample complexity, while all existing algorithms for general graphs have…
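To make the task in the abstract concrete, here is a minimal Python sketch of fixed-confidence pure exploration over a small set of interventions. It is not the paper's algorithm: it is a generic successive-elimination baseline with Hoeffding confidence bounds that treats each intervention as an independent arm and ignores the causal structure the paper exploits. The toy model (variables X1 and X2, the reward probabilities, and the names ARMS, sample_reward, and successive_elimination) is hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy model (an assumption, not from the paper):
# binary X1, X2 and a binary reward Y.
# Without intervention, X1 and X2 are fair coin flips, and
# P(Y = 1 | X1, X2) = 0.1 + 0.5 * X1 + 0.3 * X2.
ARMS = [
    {},                   # null intervention (observe only)
    {"X1": 1},            # do(X1 = 1)
    {"X2": 1},            # do(X2 = 1)
    {"X1": 1, "X2": 1},   # do(X1 = 1, X2 = 1)
]


def sample_reward(intervention, rng):
    """Draw one reward Y under the given intervention on the toy model."""
    values = {}
    for name in ("X1", "X2"):
        if name in intervention:
            values[name] = intervention[name]        # forced by do()
        else:
            values[name] = int(rng.random() < 0.5)   # natural distribution
    p_reward = 0.1 + 0.5 * values["X1"] + 0.3 * values["X2"]
    return int(rng.random() < p_reward)


def successive_elimination(arms, delta, rng, max_rounds=100_000):
    """Return (best intervention, samples used) with confidence 1 - delta.

    Generic baseline: every surviving arm is pulled once per round, and an
    arm is dropped once its upper confidence bound falls below the best
    lower confidence bound.
    """
    active = list(range(len(arms)))
    sums = [0.0] * len(arms)
    total_samples = 0
    for t in range(1, max_rounds + 1):
        for i in active:
            sums[i] += sample_reward(arms[i], rng)
            total_samples += 1
        # Hoeffding radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4 * len(arms) * t * t / delta) / (2 * t))
        means = {i: sums[i] / t for i in active}
        best_lcb = max(means.values()) - radius
        active = [i for i in active if means[i] + radius >= best_lcb]
        if len(active) == 1:
            break
    # If the round budget ran out, fall back to the best empirical mean.
    best = max(active, key=lambda i: sums[i] / t)
    return arms[best], total_samples


if __name__ == "__main__":
    best, used = successive_elimination(ARMS, delta=0.05, rng=random.Random(0))
    print("best intervention:", best, "samples used:", used)
```

The number of samples this baseline needs grows with the number of candidate interventions, which is exponential in the number of variables; that cost is what motivates algorithms that share information across interventions through the causal graph.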
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning
