Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces

Yanan Sui; Yisong Yue; Joel W. Burdick

arXiv:1707.02375 · cs.LG · July 11, 2017 · 6 cites

TL;DR

This paper introduces CorrDuel, an algorithm for dueling bandits with a large number of correlated arms. The authors derive regret bounds for it and apply it to clinical treatment optimization in spinal cord injury therapy.

Contribution

The paper presents CorrDuel, a novel algorithm for large, correlated decision spaces, with theoretical regret bounds and practical validation in clinical treatment settings.

Findings

01

CorrDuel outperforms existing algorithms in large decision spaces.

02

The approach achieves low regret in simulations and in a clinical trial.

03

First application of online learning to spinal cord injury treatments.

Abstract

We consider sequential decision making under uncertainty, where the goal is to optimize over a large decision space using noisy comparative feedback. This problem can be formulated as a $K$-armed Dueling Bandits problem, where $K$ is the total number of decisions. When $K$ is very large, existing dueling bandits algorithms suffer huge cumulative regret before converging on the optimal arm. This paper studies the dueling bandits problem with a large number of arms that exhibit a low-dimensional correlation structure. Our problem is motivated by a clinical decision-making process in a large decision space. We propose an efficient algorithm, CorrDuel, which optimizes the exploration/exploitation tradeoff in this large decision space of clinical treatments. More broadly, our approach can be applied to other sequential decision problems with large and structured decision spaces. We derive regret…
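To make the setting in the abstract concrete, the following is a minimal sketch of a $K$-armed dueling bandit with noisy comparative feedback. It is not CorrDuel: it only simulates the feedback model and tallies cumulative regret for a uniformly random pair-selection baseline. The latent `utilities`, the logistic link, the function names, and the regret definition (average preference gap to the best arm, one common convention) are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def win_probability(u_a, u_b):
    # Assumption: logistic (Bradley-Terry style) link on hypothetical latent utilities.
    return 1.0 / (1.0 + math.exp(-(u_a - u_b)))


def duel(utilities, i, j, rng):
    """Return True if arm i beats arm j in one noisy comparison."""
    return rng.random() < win_probability(utilities[i], utilities[j])


def pair_regret(utilities, i, j):
    """Regret of dueling (i, j): mean of P(best beats arm) - 1/2 over both arms."""
    best = max(utilities)
    gap_i = win_probability(best, utilities[i]) - 0.5
    gap_j = win_probability(best, utilities[j]) - 0.5
    return 0.5 * (gap_i + gap_j)


def random_baseline(utilities, rounds, seed=0):
    """Duel uniformly random pairs; return per-arm win counts and cumulative regret."""
    rng = random.Random(seed)
    k = len(utilities)
    wins = [0] * k
    total_regret = 0.0
    for _ in range(rounds):
        i, j = rng.sample(range(k), 2)  # two distinct arms
        winner = i if duel(utilities, i, j, rng) else j
        wins[winner] += 1
        total_regret += pair_regret(utilities, i, j)
    return wins, total_regret


if __name__ == "__main__":
    # Hypothetical utilities for K = 4 arms; the last arm is the best.
    wins, regret = random_baseline([0.0, 0.5, 1.0, 2.0], rounds=200)
    print(wins, regret)
```

Because the baseline never concentrates on good arms, its cumulative regret grows linearly in the number of rounds. The abstract's point is that when $K$ is very large, even algorithms that do concentrate pay a heavy price before converging unless they exploit correlation among arms.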

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms