Asymmetric Graph Error Control with Low Complexity in Causal Bandits
Chen Peng, Di Zhang, Urbashi Mitra

TL;DR
This paper introduces a low-complexity method for causal graph learning and intervention in causal bandits, achieving significant sample efficiency and reward improvements in both stationary and non-stationary environments.
Contribution
It proposes a novel causal graph learning algorithm based on error types, a tailored uncertainty bound for intervention, and a change detection mechanism for non-stationary settings.
Findings
Achieves 85% reward gain over existing methods.
Requires fewer samples to learn causal structure.
Performs well in both stationary and non-stationary environments.
Abstract
In this paper, the causal bandit problem is investigated, with the objective of maximizing the long-term reward by selecting an optimal sequence of interventions on nodes in an unknown causal graph. It is assumed that both the causal topology and the distribution of interventions are unknown. First, based on the difference between the two types of graph identification errors (false positives and false negatives), a causal graph learning method is proposed. Numerical results suggest that, by learning sub-graphs, this method has a much lower sample complexity than the prior art. However, a sample complexity analysis for the new algorithm has not yet been undertaken. Under the assumption of minimum-mean squared error weight estimation, a new uncertainty bound tailored to the causal bandit problem is derived. This uncertainty bound drives an upper confidence bound-based…
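For readers unfamiliar with the upper confidence bound idea the abstract refers to, the sketch below shows the generic textbook UCB1 rule on a plain multi-armed bandit. It is not the paper's algorithm: the paper's uncertainty bound is tailored to the causal structure, whereas this uses the standard sqrt(2 ln t / n) bonus. The function name `ucb1`, the `reward_fns` interface, and the Bernoulli arm probabilities are illustrative assumptions.

```python
import math
import random


def ucb1(reward_fns, horizon, seed=0):
    """Generic UCB1 (not the paper's tailored bound).

    reward_fns: list of callables, each taking a random.Random and
                returning a reward in [0, 1] (hypothetical interface).
    Returns (pull counts per arm, empirical mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so counts are nonzero.
            arm = t - 1
        else:
            # Pick the arm with the largest mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


def bernoulli(p):
    """Return a reward function that pays 1 with probability p, else 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    # Illustrative arm probabilities, not taken from the paper.
    counts, means = ucb1([bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)], horizon=2000)
    print("pulls per arm:", counts)
    print("estimated means:", [round(m, 3) for m in means])
```

In a causal bandit each "arm" corresponds to an intervention on the graph, and the exploration bonus would be replaced by a bound that accounts for uncertainty in the estimated edge weights; the sketch only conveys the optimism-under-uncertainty selection step.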
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Distributed Sensor Networks and Detection Algorithms
