Combinatorial Pure Exploration of Causal Bandits

Nuoya Xiong; Wei Chen

arXiv:2206.07883·cs.LG·March 15, 2023

TL;DR

This paper introduces the first gap-dependent, fully adaptive algorithms for combinatorial pure exploration in causal bandits: one that achieves polynomial sample complexity for the binary generalized linear model (BGLM), and one that improves sample complexity on general graphs and nearly matches the lower bound.

Contribution

The paper presents gap-dependent, fully adaptive pure exploration algorithms for causal bandits under both the BGLM and general graphs, with improved sample complexity bounds in each setting.
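To make "gap-dependent" concrete, the definitions below may help. The notation ($\mathcal{S}$ for the set of candidate interventions, $\mu_S$, $\mu^*$, $\Delta_S$) is assumed here for illustration and is not taken from the paper.

$$
\mu_S = \mathbb{E}[Y \mid do(S)], \qquad
\mu^* = \max_{S' \in \mathcal{S}} \mu_{S'}, \qquad
\Delta_S = \mu^* - \mu_S .
$$

In classical best-arm identification, gap-dependent bounds scale roughly as $\sum_{S : \Delta_S > 0} \Delta_S^{-2} \log(1/\delta)$, so easy instances (large gaps) need fewer rounds. The exact form of the paper's bounds is not given in this summary.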

Findings

01. First polynomial sample complexity algorithm for BGLM causal bandits

02. Significant sample complexity improvement for general graphs

03. Nearly matches the theoretical lower bound for general graphs

Abstract

The combinatorial pure exploration of causal bandits is the following online learning task: given a causal graph with unknown causal inference distributions, in each round we choose a subset of variables to intervene on, or make no intervention, and observe the random outcomes of all random variables. The goal is to use as few rounds as possible to output an intervention that gives the best (or almost best) expected outcome on the reward variable $Y$ with probability at least $1 - \delta$, where $\delta$ is a given confidence level. We provide the first gap-dependent and fully adaptive pure exploration algorithms on two types of causal models -- the binary generalized linear model (BGLM) and general graphs. For BGLM, our algorithm is the first to be designed specifically for this setting and achieves polynomial sample complexity, while all existing algorithms for general graphs have…
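The interaction protocol described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's algorithm: it uses plain successive elimination with Hoeffding confidence intervals, treats each candidate intervention as an independent arm, and ignores the observations of non-reward variables that the paper's methods exploit. The toy causal model (X1 -> X2 -> Y, X1 -> Y), its probabilities, and all function names are assumptions made up for this example.

```python
import math
import random

# Hypothetical candidate interventions: the empty dict means "no intervention".
CANDIDATES = [{}, {"X1": 1}, {"X2": 1}, {"X1": 1, "X2": 1}]


def sample_outcome(intervention, rng):
    """Draw one round from a toy causal model X1 -> X2 -> Y, X1 -> Y.

    Variables named in `intervention` are forced to the given value;
    all others are sampled from their (assumed) conditional distributions.
    """
    if "X1" in intervention:
        x1 = intervention["X1"]
    else:
        x1 = int(rng.random() < 0.3)
    if "X2" in intervention:
        x2 = intervention["X2"]
    else:
        x2 = int(rng.random() < 0.2 + 0.5 * x1)
    y = int(rng.random() < 0.1 + 0.3 * x1 + 0.5 * x2)
    return {"X1": x1, "X2": x2, "Y": y}


def successive_elimination(arms, delta, rng, max_rounds=200_000):
    """Return (index of the chosen intervention, total samples used).

    Every surviving arm is played once per round; an arm is dropped once
    its upper confidence bound falls below the best lower confidence bound.
    """
    n = len(arms)
    active = list(range(n))
    totals = [0.0] * n
    samples = 0
    t = 0
    while len(active) > 1 and t < max_rounds:
        t += 1
        for i in active:
            totals[i] += sample_outcome(arms[i], rng)["Y"]
            samples += 1
        # Hoeffding radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4 * n * t * t / delta) / (2 * t))
        means = {i: totals[i] / t for i in active}
        best_lcb = max(means.values()) - radius
        active = [i for i in active if means[i] + radius >= best_lcb]
    # All surviving arms have been played t times, so totals are comparable.
    best = max(active, key=lambda i: totals[i])
    return best, samples


if __name__ == "__main__":
    rng = random.Random(0)
    best, samples = successive_elimination(CANDIDATES, delta=0.05, rng=rng)
    print("chosen intervention:", CANDIDATES[best], "after", samples, "samples")
```

Because the confidence radius shrinks until the gaps between interventions are resolved, the number of rounds adapts to how far apart the expected rewards are, which is the sense in which such a procedure is gap-dependent. A baseline like this enumerates every candidate intervention, so it does not scale to large graphs.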

Videos

Combinatorial Pure Exploration of Causal Bandits · SlidesLive

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Domain Adaptation and Few-Shot Learning