Greedy Algorithm almost Dominates in Smoothed Contextual Bandits

Manish Raghavan; Aleksandrs Slivkins; Jennifer Wortman Vaughan; Zhiwei Steven Wu

arXiv:2005.10624 · cs.LG · December 28, 2021 · 1 citation

Open Access

TL;DR

This paper shows that in smoothed linear contextual bandits, the greedy algorithm almost matches the best possible Bayesian regret rate of any algorithm on the same problem instance whenever the diversity conditions hold, reducing the need for explicit exploration.

Contribution

It improves on prior smoothed analyses of the greedy algorithm in linear contextual bandits, showing that greedy almost matches the best possible Bayesian regret rate on the same problem instance when the diversity conditions hold.

Findings

01

The greedy algorithm achieves near-optimal Bayesian regret in smoothed linear contextual bandits.

02

The regret is bounded by approximately O(T^{1/3}) when the diversity conditions hold.

03

Explicit exploration may be unnecessary when the data is inherently diverse.

Abstract

Online learning algorithms, widely used to power search and content optimization on the web, must balance exploration and exploitation, potentially sacrificing the experience of current users in order to gain information that will lead to better decisions in the future. While necessary in the worst case, explicit exploration has a number of disadvantages compared to the greedy algorithm that always "exploits" by choosing an action that currently looks optimal. We ask under what conditions inherent diversity in the data makes explicit exploration unnecessary. We build on a recent line of work on the smoothed analysis of the greedy algorithm in the linear contextual bandits model. We improve on prior results to show that a greedy approach almost matches the best possible Bayesian regret rate of any other algorithm on the same problem instance whenever the diversity conditions hold, and…
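To make the setting in the abstract concrete, the following is a minimal sketch of a greedy learner in a linear contextual bandit with perturbed ("smoothed") contexts. It is an illustration under stated assumptions, not the authors' algorithm or experimental setup: the function names `solve`, `dot`, and `run_greedy`, the ridge-regularized least-squares estimate, the uniform base contexts, and all parameter values (`rho`, `noise`, `lam`, `K`, `d`) are hypothetical choices made for this example.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def run_greedy(T=2000, d=3, K=5, rho=0.5, noise=0.1, lam=1.0, seed=0):
    """Run a greedy learner for T rounds; return (regret, theta, theta_hat)."""
    rng = random.Random(seed)
    # Assumption: the hidden parameter is drawn from a standard Gaussian prior.
    theta = [rng.gauss(0, 1) for _ in range(d)]
    # Ridge statistics: A = lam * I + sum of x x^T, b = sum of reward * x.
    A = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    regret = 0.0
    for _ in range(T):
        # Assumption: base contexts are uniform in [-1, 1]; each coordinate
        # then receives an independent Gaussian perturbation of scale rho.
        contexts = [
            [rng.uniform(-1, 1) + rng.gauss(0, rho) for _ in range(d)]
            for _ in range(K)
        ]
        theta_hat = solve(A, b)  # current ridge estimate
        # Greedy step: pick the arm that looks best now, with no exploration bonus.
        chosen = max(contexts, key=lambda x: dot(theta_hat, x))
        reward = dot(theta, chosen) + rng.gauss(0, noise)
        for i in range(d):
            b[i] += reward * chosen[i]
            for j in range(d):
                A[i][j] += chosen[i] * chosen[j]
        # Per-round regret against the best arm for the true parameter.
        regret += max(dot(theta, x) for x in contexts) - dot(theta, chosen)
    return regret, theta, solve(A, b)


if __name__ == "__main__":
    total_regret, theta, theta_hat = run_greedy()
    print("cumulative regret:", total_regret)
    print("true theta:", theta)
    print("estimated theta:", theta_hat)
```

The only source of exploration in this sketch is the random perturbation of the contexts; the learner itself never deviates from the arm its current estimate prefers. Setting `rho` close to zero removes that diversity and gives a rough way to see how the greedy rule depends on it.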

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems