Causal Bandit Over Unknown Graphs: Upper Confidence Bounds With Backdoor Adjustment

Yijia Zhao; Qing Zhou

arXiv:2502.02020·cs.LG·April 7, 2026

TL;DR

This paper introduces BA-UCB, a new algorithm for causal bandit problems with unknown DAGs, leveraging backdoor adjustment to improve intervention selection and regret bounds.

Contribution

The paper develops a method that combines observational and experimental data to identify backdoor adjustment sets, enabling causal effect estimation when the causal graph is unknown.

Findings

1. BA-UCB achieves lower cumulative regret than existing methods.

2. Theoretical regret bounds are established with relaxed dependency on intervention arms.

3. Simulation results show improved efficiency and accuracy in unknown causal graph settings.

Abstract

The causal bandit problem seeks to identify, through sequential experimentation, an intervention that maximizes the expected reward in a causal system modeled by a directed acyclic graph (DAG). Existing methods typically assume that the causal graph is known or impose restrictive structural assumptions. In this paper, we study causal bandit problems when the causal graph is unknown. We first consider Gaussian DAG models without latent confounders. By combining observational and experimental data collected sequentially during the bandit process, we identify candidate backdoor adjustment sets for each intervention arm. These sets enable estimation of causal effects and construction of upper confidence bounds that integrate information from both data sources. Based on these estimates, we propose a new algorithm, termed backdoor-adjustment upper confidence bound (BA-UCB), for sequential…
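
To make the role of a backdoor adjustment set concrete, here is a minimal sketch in a linear Gaussian model with a single observed confounder. It is not the paper's BA-UCB algorithm. The variable names, coefficients, sample size, and the helper `backdoor_effect` are all assumptions chosen for illustration. In this setting, regressing the reward on the intervened variable together with a valid adjustment set recovers the causal coefficient, whereas the unadjusted regression is biased by confounding.

```python
import numpy as np

def backdoor_effect(x, y, z):
    """Hypothetical helper: OLS coefficient of x when y is regressed on (1, x, z).

    In a linear Gaussian model, if z is a valid backdoor adjustment set
    for (x, y), this coefficient estimates the causal effect of x on y.
    """
    design = np.column_stack([np.ones_like(x), x, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# Simulated observational data (illustrative assumption): z -> x, z -> y, x -> y.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                       # observed confounder
x = 0.8 * z + rng.normal(size=n)             # candidate intervention variable
y = 1.5 * x + 2.0 * z + rng.normal(size=n)   # reward; true causal effect of x is 1.5

naive = np.polyfit(x, y, 1)[0]       # slope of y on x alone, biased by z
adjusted = backdoor_effect(x, y, z)  # slope of x after adjusting for z

print(f"naive slope:    {naive:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

With these assumed coefficients, the naive slope converges to roughly 1.5 + 1.6/1.64, well above the true effect of 1.5, while the adjusted slope concentrates near 1.5. When the graph is unknown, the adjustment set itself has to be learned from data, which is the problem the abstract describes.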
