Multi-armed Bandits with Cost Subsidy

Deeksha Sinha; Karthik Abinav Sankararaman; Abbas Kazerouni; Vashist Avadhanula

arXiv:2011.01488·cs.LG·March 16, 2021

TL;DR

This paper introduces a variant of the multi-armed bandit problem, MAB with cost subsidy, for real-world settings where the learning agent pays to select an arm and must optimize both cumulative cost and cumulative reward.

Contribution

It formulates the MAB with cost subsidy problem, establishes fundamental lower bounds, and proposes near-optimal algorithms with practical recommendations.

Findings

01

Naive generalizations of classical algorithms such as Upper Confidence Bound and Thompson Sampling perform poorly on this problem.

02

A fundamental lower bound on performance is established.

03

A simple explore-then-commit algorithm achieves near-optimal regret.
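To make the explore-then-commit idea concrete, here is a minimal Python sketch. It is not the authors' algorithm as specified in the paper. It assumes, for illustration only, that each arm has a known fixed cost, that rewards lie in [0, 1], and that a subsidy factor `alpha` defines acceptability: an arm is acceptable if its estimated mean reward is at least `(1 - alpha)` times the best estimate. The names `cost_subsidy_etc`, `pull`, `costs`, `explore_per_arm`, and `alpha` are hypothetical.

```python
import random


def cost_subsidy_etc(pull, costs, horizon, explore_per_arm, alpha):
    """Explore-then-commit sketch for a cost-aware bandit (illustrative only).

    pull(i) returns a reward in [0, 1] for arm i (hypothetical callback).
    costs[i] is the assumed known, fixed cost of arm i.
    alpha in [0, 1] is the assumed subsidy factor.
    """
    k = len(costs)
    totals = [0.0] * k
    history = []

    # Exploration phase: pull every arm the same number of times.
    for arm in range(k):
        for _ in range(explore_per_arm):
            totals[arm] += pull(arm)
            history.append(arm)

    # Empirical mean reward of each arm after exploration.
    means = [total / explore_per_arm for total in totals]

    # An arm is acceptable if it keeps at least a (1 - alpha) fraction
    # of the best estimated reward (assumed acceptability rule).
    threshold = (1.0 - alpha) * max(means)
    feasible = [i for i in range(k) if means[i] >= threshold]

    # Commit to the cheapest acceptable arm.
    chosen = min(feasible, key=lambda i: costs[i])

    # Exploitation phase: play the chosen arm for the remaining rounds.
    for _ in range(horizon - k * explore_per_arm):
        pull(chosen)
        history.append(chosen)

    return chosen, means, history


if __name__ == "__main__":
    # Hypothetical Bernoulli arms; these numbers are made up for the demo.
    true_means = [0.9, 0.8, 0.3]
    arm_costs = [1.0, 0.5, 0.1]
    rng = random.Random(0)
    chosen, means, _ = cost_subsidy_etc(
        lambda i: 1.0 if rng.random() < true_means[i] else 0.0,
        arm_costs, horizon=3000, explore_per_arm=200, alpha=0.2,
    )
    print("committed arm:", chosen, "estimated means:", means)
```

The sketch commits on point estimates alone. A more careful version would use confidence intervals when deciding which arms are acceptable, and the exploration length would be tuned to the horizon.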

Abstract

In this paper, we consider a novel variant of the multi-armed bandit (MAB) problem, MAB with cost subsidy, which models many real-life applications where the learning agent has to pay to select an arm and is concerned about optimizing cumulative costs and rewards. We present two applications, the intelligent SMS routing problem and the ad audience optimization problem faced by several businesses (especially online platforms), and show how our problem uniquely captures key features of these applications. We show that naive generalizations of existing MAB algorithms like Upper Confidence Bound and Thompson Sampling do not perform well for this problem. We then establish a fundamental lower bound on the performance of any online learning algorithm for this problem, highlighting the hardness of our problem in comparison to the classical MAB problem. We also present a simple variant of…

Code & Models

Repositories

PlaytikaOSS/pybandits

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms