Efficient Contextual Bandits with Continuous Actions

Maryam Majzoubi; Chicheng Zhang; Rajan Chari; Akshay Krishnamurthy; John Langford; Aleksandrs Slivkins

arXiv:2006.06040 · cs.LG · December 7, 2020 · 6 cites

TL;DR

This paper introduces a computationally tractable algorithm for contextual bandits with continuous actions whose structure is unknown. The reduction-style algorithm composes with most supervised learning representations and is supported by theoretical analysis and large-scale experiments.

Contribution

The paper presents a reduction-style algorithm for contextual bandits with continuous actions that is computationally tractable and composes with most supervised learning representations.

Findings

01  Proves that the algorithm works in a general setting

02  Demonstrates scalability with large-scale experiments

03  Shows compatibility with most supervised learning representations

Abstract

We create a computationally tractable algorithm for contextual bandits with continuous actions having unknown structure. Our reduction-style algorithm composes with most supervised learning representations. We prove that it works in a general sense and verify the new functionality with large-scale experiments.
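To make the problem setting concrete, here is a minimal sketch of one generic way to run a contextual bandit over a continuous action range: discretize [0, 1] into K bins, play a randomized action near the chosen bin center, and feed importance-weighted cost estimates to a learner. This is not the paper's algorithm. The toy cost function, the two-valued context, the table-based learner, and the parameters `K`, `h`, and `epsilon` are all assumptions chosen for illustration.

```python
import random

K = 4                                      # number of bins (assumption)
H = 0.5 / K                                # smoothing half-width (assumption)
CENTERS = [(k + 0.5) / K for k in range(K)]
TARGET = {0: 0.125, 1: 0.875}              # toy best action per context (assumption)


def smoothed_density(center, action, h=H):
    """Density at `action` of Uniform[center - h, center + h], clipped to [0, 1]."""
    lo, hi = max(0.0, center - h), min(1.0, center + h)
    return 1.0 / (hi - lo) if lo <= action <= hi else 0.0


def sample_action(greedy_center, epsilon, rng, h=H):
    """Epsilon-greedy over a continuous range; returns (action, density of action)."""
    lo, hi = max(0.0, greedy_center - h), min(1.0, greedy_center + h)
    if rng.random() < epsilon:
        action = rng.random()              # explore uniformly on [0, 1)
    else:
        action = rng.uniform(lo, hi)       # exploit near the greedy center
    density = epsilon + (1.0 - epsilon) * smoothed_density(greedy_center, action, h)
    return action, density


def run(rounds=6000, epsilon=0.2, seed=0):
    """Learn one bin center per context from bandit feedback only."""
    rng = random.Random(seed)
    # Stand-in for a supervised learner: cumulative estimated cost per (context, bin).
    est = {x: [0.0] * K for x in TARGET}
    for _ in range(rounds):
        x = rng.choice([0, 1])
        greedy = CENTERS[min(range(K), key=lambda k: est[x][k])]
        action, density = sample_action(greedy, epsilon, rng)
        cost = abs(action - TARGET[x])     # only the played action's cost is observed
        for k, center in enumerate(CENTERS):
            # Importance-weighted estimate of the cost of playing near `center`.
            est[x][k] += cost * smoothed_density(center, action) / density
    return {x: CENTERS[min(range(K), key=lambda k: est[x][k])] for x in TARGET}


learned = run()
print(learned)
```

The table `est` plays the role that a supervised learner would play in a reduction: it only ever sees per-bin cost estimates, so in principle any cost-sensitive learner over contexts could be substituted for it. The uniform exploration term keeps the action density bounded away from zero, which is what keeps the importance weights finite.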

Code & Models

Repositories

instadeepai/catx (JAX)

Videos

Efficient Contextual Bandits with Continuous Actions · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics