Adversarial Causal Bayesian Optimization
Scott Sussex, Pier Giuseppe Sessa, Anastasiia Makarova, Andreas Krause

TL;DR
This paper introduces Adversarial Causal Bayesian Optimization (ACBO), a framework that accounts for external interventions and adversarial influences, with an algorithm that adapts to non-stationary environments using causal modeling and online learning.
Contribution
It formalizes ACBO, develops CBO-MW, the first algorithm for ACBO with bounded regret, and demonstrates scalability and effectiveness in synthetic and real-world scenarios.
Findings
CBO-MW outperforms non-causal methods in experiments.
The approach effectively models external interventions.
Regret bounds depend on causal graph properties.
Abstract
In Causal Bayesian Optimization (CBO), an agent intervenes on an unknown structural causal model to maximize a downstream reward variable. In this paper, we consider the generalization where other agents or external events also intervene on the system, which is key for enabling adaptiveness to non-stationarities such as weather changes, market forces, or adversaries. We formalize this generalization of CBO as Adversarial Causal Bayesian Optimization (ACBO) and introduce the first algorithm for ACBO with bounded regret: Causal Bayesian Optimization with Multiplicative Weights (CBO-MW). Our approach combines a classical online learning strategy with causal modeling of the rewards. To achieve this, it computes optimistic counterfactual reward estimates by propagating uncertainty through the causal graph. We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.…
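The multiplicative-weights component mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook Hedge update over a finite action set, not the authors' implementation. The names `mw_update`, `run_mw`, and `reward_ucb` are hypothetical, and the optimistic reward estimates are simply supplied by a callback here, whereas the abstract describes obtaining them by propagating uncertainty through the causal graph.

```python
import numpy as np


def mw_update(weights, optimistic_rewards, eta):
    """One multiplicative-weights (Hedge) step; rewards are assumed to lie in [0, 1]."""
    new_weights = weights * np.exp(eta * optimistic_rewards)
    return new_weights / new_weights.sum()


def run_mw(reward_ucb, n_actions, n_rounds, eta, rng):
    """Play n_rounds of Hedge.

    reward_ucb(t) is a hypothetical callback returning one optimistic
    reward estimate per action for round t (for example, computed after
    the other agents' interventions in that round have been observed).
    """
    weights = np.full(n_actions, 1.0 / n_actions)  # start from the uniform distribution
    actions = []
    for t in range(n_rounds):
        # Sample the agent's action from the current weights.
        actions.append(int(rng.choice(n_actions, p=weights)))
        # Full-information update: every action is scored, not only the played one.
        estimates = np.clip(reward_ucb(t), 0.0, 1.0)
        weights = mw_update(weights, estimates, eta)
    return weights, actions


# Toy usage with fixed (assumed) estimates; action 2 looks best in every round.
rng = np.random.default_rng(0)
final_weights, played = run_mw(lambda t: np.array([0.1, 0.2, 0.9]), 3, 200, 0.5, rng)
```

Because every action receives an estimate each round, the update is a full-information one even though only the played action's outcome is actually observed; the learning rate `eta` is left as a free parameter in this sketch.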
Peer Reviews
Decision: ICLR 2024 poster
- Interesting setting of doing BO with causal relationships among variables and external interventions
- Interesting applied problem in the experiments
- The main weakness is the naming of the method, which refers to CBO (Aglietti et al. 2020), and hence its relationship with the CBO setting. The paper claims to present a generalization of CBO, which seems to be an *algorithm* proposed in Aglietti et al. 2020 as a solution to the "Causal Global Optimization" (CGO) problem. Here, in the abstract but also in the main paper, (1) no mention of the CGO problem is made, (2) CBO seems to refer to the "setting" (somehow as a replacement for CGO), while t
- Experiments show that the algorithm is strong for the use cases considered.
- Well-written problem statement.
- For the model chosen, the analysis is sound.
- The related work ignores causal bandit literature?
- The problem is not well motivated. Why is SMS adversarial and not stochastic?
- The graph notations are confusing. The typical graph has 1 root node. But in causal graphs, we may have multiple nodes without parents.
- The adversary cannot see the action taken by the agent before taking its action. Is this adversary weak? You have considered that the agent can see the adversary's action before choosing their own action set, but what about vice versa?
The idea of studying an online causal model looks interesting.
I don't see any fundamental difference between the studied model and a standard bandit or Bayesian optimization problem, where part of the model is stochastic and part of it is controlled by an adversary. Therefore, apart from having a causal model in the story, the novelty of the contribution seems limited.
Taxonomy
Topics: Advanced Bandit Algorithms Research · Bayesian Modeling and Causal Inference · Vaccine Coverage and Hesitancy
