Bandit Online Optimization Over the Permutahedron

Nir Ailon; Kohei Hatano; Eiji Takimoto

arXiv:1312.1530·cs.LG·July 8, 2014·1 cites

TL;DR

This paper introduces an efficient algorithm for bandit online optimization over the permutahedron. It attains regret $O(n^{3/2} \sqrt{T})$ with total running time $O(n^{3} T)$, which is linear in the time horizon $T$, whereas the earlier CombBand algorithm has a total running time that is superlinear in $T$.

Contribution

The paper presents a computationally efficient algorithm for bandit optimization over the permutahedron with regret $O(n^{3/2} \sqrt{T})$, combining existing approaches with a new variance bound for the Plackett-Luce noisy sorting process.

Findings

01

Achieves regret of $O(n^{3/2} \sqrt{T})$ with $O(n^{3} T)$ total time complexity.

02

Provides a variance bound for the Plackett-Luce noisy sorting process.

03

Improves practicality of bandit optimization over the permutahedron for large T.
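
For readers unfamiliar with the Plackett-Luce process named in the second finding, the following is a minimal sketch of the textbook model: items are drawn one at a time without replacement, each with probability proportional to its weight. The function name `sample_plackett_luce` and the example weights are illustrative assumptions; how the paper chooses the weights, and the variance bound itself, are not reproduced here.

```python
import random


def sample_plackett_luce(weights, rng):
    """Draw an ordering of item indices under the textbook Plackett-Luce model.

    At each step one remaining item is picked with probability proportional
    to its (positive) weight, then removed from the pool.
    """
    remaining = list(range(len(weights)))
    order = []
    while remaining:
        total = sum(weights[i] for i in remaining)
        r = rng.random() * total
        acc = 0.0
        # Walk the cumulative weights; if rounding prevents a break,
        # k is left pointing at the last remaining item.
        for k, i in enumerate(remaining):
            acc += weights[i]
            if r < acc:
                break
        order.append(remaining.pop(k))
    return order


rng = random.Random(0)
# Hypothetical weights: item 0 is nine times as likely as item 1 to come first.
print(sample_plackett_luce([9.0, 1.0, 3.0], rng))
```

Heavier items tend to appear earlier in the ordering, but every ordering has positive probability, which is what makes the process a "noisy" sort.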

Abstract

The permutahedron is the convex polytope with vertex set consisting of the vectors $(\pi(1), \dots, \pi(n))$ for all permutations (bijections) $\pi$ over $\{1, \dots, n\}$. We study a bandit game in which, at each step $t$, an adversary chooses a hidden weight vector $s_{t}$, a player chooses a vertex $\pi_{t}$ of the permutahedron and suffers an observed loss of $\sum_{i = 1}^{n} \pi_{t}(i) s_{t}(i)$. A previous algorithm CombBand of Cesa-Bianchi et al (2009) guarantees a regret of $O(n \sqrt{T \log n})$ for a time horizon of $T$. Unfortunately, CombBand requires at each step an $n$-by-$n$ matrix permanent approximation to within improved accuracy as $T$ grows, resulting in a total running time that is superlinear in $T$, making it impractical for large time horizons. We provide an algorithm of regret $O(n^{3/2} \sqrt{T})$ with total time complexity $O(n^{3} T)$. The ideas are a combination of…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms