Causal Bandits: Learning Good Interventions via Causal Inference

Finnian Lattimore; Tor Lattimore; Mark D. Reid

arXiv:1606.03203 · stat.ML · June 13, 2016 · 38 cites

Open Access

TL;DR

This paper introduces a causal bandit framework that uses causal models to speed up the online learning of good interventions in stochastic environments, and proves a simple-regret bound that improves on algorithms that ignore the causal information.

Contribution

It presents a new algorithm that exploits causal feedback in bandit problems and proves a bound on its simple regret that is strictly better than that of algorithms that do not use the additional causal information.

Findings

01. The proposed algorithm achieves lower simple regret than non-causal methods.

02. Theoretical regret bounds demonstrate the advantage of using causal information.

03. Empirical results confirm the effectiveness of the causal bandit approach.

Abstract

We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
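
To make the idea of causal feedback concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. It assumes a hypothetical setting in which several binary variables are independent causes of a binary reward with no confounding, so the reward observed alongside X_i = j is a valid sample for the intervention do(X_i = j). All names and numbers (`N_VARS`, `Q`, `reward_prob`, the horizon of 200) are illustrative assumptions. The point it shows is that one purely observational round updates one arm estimate per variable, whereas a standard bandit learner updates a single arm per round. Simple regret is computed as the gap between the best arm's expected reward and that of the arm finally recommended.

```python
import random

N_VARS = 5                      # number of binary variables X_0..X_4 (assumed)
Q = [0.5] * N_VARS              # P(X_i = 1) when nobody intervenes (assumed)
ARMS = [(i, j) for i in range(N_VARS) for j in (0, 1)]  # arm (i, j) = do(X_i = j)


def reward_prob(x):
    """Hypothetical reward model: only X_0 influences Y."""
    return 0.3 + 0.4 * x[0]


def true_mean(arm):
    """Expected reward E[Y | do(X_i = j)] under the toy model."""
    i, j = arm
    p_x0 = j if i == 0 else Q[0]
    return 0.3 + 0.4 * p_x0


def sample(rng, arm=None):
    """Draw (x, y); if an arm is given, force X_i = j, otherwise just observe."""
    x = [int(rng.random() < q) for q in Q]
    if arm is not None:
        x[arm[0]] = arm[1]
    y = int(rng.random() < reward_prob(x))
    return x, y


def pick_best(sums, counts):
    """Recommend the arm with the highest empirical mean reward."""
    return max(ARMS, key=lambda a: sums[a] / max(counts[a], 1))


def causal_learner(rng, horizon):
    """Observe only; each round updates one arm estimate per variable."""
    sums = {a: 0 for a in ARMS}
    counts = {a: 0 for a in ARMS}
    for _ in range(horizon):
        x, y = sample(rng)
        for i, xi in enumerate(x):
            sums[(i, xi)] += y
            counts[(i, xi)] += 1
    return pick_best(sums, counts), counts


def standard_learner(rng, horizon):
    """Non-causal baseline: cycle through the arms, one update per round."""
    sums = {a: 0 for a in ARMS}
    counts = {a: 0 for a in ARMS}
    for t in range(horizon):
        arm = ARMS[t % len(ARMS)]
        _, y = sample(rng, arm)
        sums[arm] += y
        counts[arm] += 1
    return pick_best(sums, counts), counts


def simple_regret(arm):
    """Gap between the best arm's mean and the recommended arm's mean."""
    return max(true_mean(a) for a in ARMS) - true_mean(arm)


rng = random.Random(0)
causal_arm, causal_counts = causal_learner(rng, 200)
standard_arm, standard_counts = standard_learner(rng, 200)
print("causal:  ", causal_arm, simple_regret(causal_arm))
print("standard:", standard_arm, simple_regret(standard_arm))
```

With the same budget of 200 rounds, the observational learner accumulates 1000 arm updates in total, while the baseline accumulates 200. The shortcut relies entirely on the no-confounding assumption stated above; with confounded variables, observed and interventional rewards would generally differ.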

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics

Methods: Causal inference