Online Bandit Linear Optimization: A Study

Vikram Mullachery; Samarth Tiwari

arXiv:1805.05773 · cs.LG · May 16, 2018

TL;DR

This paper introduces online bandit linear optimization and studies the SCRiBLe algorithm, which achieves an $O(\sqrt{T})$ regret bound with run time polynomial in the dimension of the input space.

Contribution

It builds up to the bandit linear optimization setting and studies the SCRiBLe algorithm of Abernethy et al., covering its regret bound and computational complexity.

Findings

01

SCRiBLe achieves $O(\sqrt{T})$ regret bound.

02

The algorithm's run time is polynomial in the dimension of the input space.

03

The article builds up from the general online setting to the bandit linear optimization case before studying SCRiBLe.

Abstract

This article introduces the concepts around Online Bandit Linear Optimization and explores an efficient setup called SCRiBLe (Self-Concordant Regularization in Bandit Learning) created by Abernethy et al. \cite{abernethy}. The SCRiBLe setup and algorithm yield an $O(\sqrt{T})$ regret bound and a run time complexity bound that is polynomial in the dimension of the input space. In this article we build up to the bandit linear optimization case and study SCRiBLe.
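
To make the setup in the abstract concrete, the following is a minimal sketch of a SCRiBLe-style loop, not the authors' implementation. It assumes a decision set that the abstract does not specify: the box $[-1, 1]^n$ with the log-barrier $R(x) = -\sum_i \log(1 - x_i^2)$, chosen here only because the regularized minimizer and the Hessian eigen-decomposition then have closed forms. The function name `scrible_hypercube`, the step size `eta`, and the `seed` parameter are hypothetical. Each round takes a regularized-leader step, perturbs it along one random eigen-direction of the barrier's Hessian, observes a single scalar loss, and builds a one-point estimate of the loss vector.

```python
import math
import random

def scrible_hypercube(losses, eta, seed=0):
    """SCRiBLe-style loop on the box [-1, 1]^n (an illustrative assumption).

    losses: list of loss vectors f_t; only the scalar f_t . y_t is used as feedback.
    Returns (list of played points, total loss incurred).
    """
    rng = random.Random(seed)
    n = len(losses[0])
    cum_est = [0.0] * n  # running sum of loss-vector estimates
    played, total = [], 0.0
    for f in losses:
        # Regularized-leader step with barrier R(x) = -sum(log(1 - x_i^2)):
        # minimizing eta*c*x - log(1 - x^2) per coordinate has this closed form.
        x = [-(eta * c) / (1.0 + math.sqrt(1.0 + (eta * c) ** 2)) for c in cum_est]
        # The Hessian of R is diagonal here, so its eigenvectors are the axes.
        lam = [2.0 * (1.0 + xi * xi) / (1.0 - xi * xi) ** 2 for xi in x]
        i = rng.randrange(n)            # random eigen-direction
        eps = rng.choice((-1.0, 1.0))   # random sign
        y = list(x)
        y[i] += eps / math.sqrt(lam[i])  # stays in the Dikin ellipsoid, hence in the box
        loss = sum(fj * yj for fj, yj in zip(f, y))  # bandit feedback: one scalar
        # One-point estimate of f, unbiased in expectation over (i, eps).
        cum_est[i] += n * loss * eps * math.sqrt(lam[i])
        played.append(y)
        total += loss
    return played, total
```

A call such as `scrible_hypercube([[1.0, -0.5, 0.25]] * 50, eta=0.01)` returns the 50 played points and their cumulative loss. Regret would then be this total minus the loss of the best fixed point in the box, which for a linear loss is attained at a corner.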

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Data Stream Mining Techniques