Bias-Robust Bayesian Optimization via Dueling Bandits

Johannes Kirschner; Andreas Krause

arXiv:2105.11802·stat.ML·June 10, 2021

TL;DR

This paper reduces Bayesian optimization with adversarially biased observations to the dueling bandit model and introduces a kernelized dueling bandit algorithm based on information-directed sampling (IDS) with cumulative regret guarantees. The analysis also generalizes a semi-parametric linear bandit model to non-linear reward functions.

Contribution

The paper reduces confounded Bayesian optimization to dueling bandits and proposes the first efficient kernelized dueling bandit algorithm with cumulative regret guarantees.

Findings

01. First kernelized dueling bandit algorithm with regret bounds

02. Extension to non-linear reward functions

03. Links to doubly-robust estimation

Abstract

We consider Bayesian optimization in settings where observations can be adversarially biased, for example by an uncontrolled hidden confounder. Our first contribution is a reduction of the confounded setting to the dueling bandit model. Then we propose a novel approach for dueling bandits based on information-directed sampling (IDS). Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees. Our analysis further generalizes a previously proposed semi-parametric linear bandit model to non-linear reward functions, and uncovers interesting links to doubly-robust estimation.
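
The abstract does not spell out the reduction, so the following is only a toy sketch of the intuition behind comparing two points: if two evaluations share the same additive bias, their difference no longer depends on that bias. The reward function `f`, the bias sequence, the noise level, and the helper names `biased_observation` and `duel` are all hypothetical choices made for illustration, and the shared-bias assumption is a simplification that may differ from the construction used in the paper.

```python
import random

def f(x):
    # Hypothetical reward function, chosen only for illustration.
    return -(x - 0.3) ** 2

def biased_observation(x, bias, noise_std, rng):
    # Observation model assumed here: reward + additive bias + Gaussian noise.
    return f(x) + bias + rng.gauss(0.0, noise_std)

def duel(x1, x2, bias, noise_std, rng):
    # Assumption: both evaluations in a round share the same bias,
    # so the bias cancels in the difference.
    y1 = biased_observation(x1, bias, noise_std, rng)
    y2 = biased_observation(x2, bias, noise_std, rng)
    return y1 - y2

rng = random.Random(0)
x1, x2 = 0.3, 0.8
true_gap = f(x1) - f(x2)

# A large bias that changes every round (a stand-in for an adversary).
biases = [100.0 * (-1) ** t + 5.0 * t for t in range(200)]

single = [biased_observation(x1, b, 0.01, rng) for b in biases]
diffs = [duel(x1, x2, b, 0.01, rng) for b in biases]

mean_diff = sum(diffs) / len(diffs)
spread_single = max(single) - min(single)

print("true gap f(x1) - f(x2):", true_gap)
print("mean of pairwise differences:", mean_diff)
print("spread of single biased observations:", spread_single)
```

In this sketch the single observations swing by far more than the reward gap, while the averaged pairwise differences stay close to `f(x1) - f(x2)`. This is why relative feedback, as in the dueling bandit model, is a natural target for the reduction.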

Videos

Bias-Robust Bayesian Optimization via Dueling Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Reinforcement Learning in Robotics