Practical Adversarial Combinatorial Bandit Algorithm via Compression of Decision Sets

Shinsaku Sakaue; Masakazu Ishihata; Shin-ichi Minato

arXiv:1707.08300·cs.DS·July 27, 2017·1 cites

Open Access

TL;DR

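This paper introduces an algorithm for adversarial combinatorial multi-armed bandit (CMAB) problems that runs on a zero-suppressed binary decision diagram (ZDD), a compressed representation of the decision set. It comes with sublinear regret guarantees and works efficiently on large, network-based instances.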

Contribution

The paper presents an algorithm that operates on a ZDD instead of directly on the exponentially large decision set of an adversarial CMAB problem, keeping computation tractable while retaining sublinear regret guarantees.

Findings

01

Achieves $O(T^{2/3})$ regret with high probability

02

Achieves $O(\sqrt{T})$ expected regret as an anytime guarantee

03

Demonstrates effectiveness on real-world network routing problems

Abstract

We consider the adversarial combinatorial multi-armed bandit (CMAB) problem, whose decision set can be exponentially large with respect to the number of given arms. To avoid dealing with such large decision sets directly, we propose an algorithm performed on a zero-suppressed binary decision diagram (ZDD), which is a compressed representation of the decision set. The proposed algorithm achieves either $O(T^{2/3})$ regret with high probability or $O(\sqrt{T})$ expected regret as the any-time guarantee, where $T$ is the number of past rounds. Typically, our algorithm works efficiently for CMAB problems defined on networks. Experimental results show that our algorithm is applicable to various large adversarial CMAB instances including adaptive routing problems on real-world networks.
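To make the idea of working on a compressed decision set concrete, here is a minimal sketch of one standard technique that a ZDD enables: sampling a decision (a set of arms) with probability proportional to the product of its per-arm weights, without enumerating the decision set. It is an illustration under stated assumptions, not the authors' implementation. The toy diagram, the node names, and the functions `backward_weights` and `sample_set` are all hypothetical. In an exponential-weights style bandit algorithm, the per-arm weights would typically be derived from estimated losses, which this sketch does not attempt.

```python
import random

# Hypothetical toy ZDD encoding the family {{a,b}, {a,c}, {b,c}}.
# Each node maps to (arm, lo_child, hi_child); "T" and "F" are the
# 1-terminal and 0-terminal. Taking the hi edge includes the arm.
ZDD = {
    "n1": ("a", "n2", "n3"),
    "n2": ("b", "F", "n4"),
    "n3": ("b", "n4", "T"),
    "n4": ("c", "F", "T"),  # shared by two different prefixes
}
ROOT = "n1"


def backward_weights(zdd, weights):
    """For each node, sum the weight products of all paths to the 1-terminal."""
    memo = {"T": 1.0, "F": 0.0}

    def visit(node):
        if node not in memo:
            arm, lo, hi = zdd[node]
            memo[node] = visit(lo) + weights[arm] * visit(hi)
        return memo[node]

    for node in zdd:
        visit(node)
    return memo


def sample_set(zdd, root, weights, rng):
    """Draw a set S with probability proportional to the product of weights[i], i in S."""
    b = backward_weights(zdd, weights)
    chosen, node = [], root
    while node not in ("T", "F"):
        arm, lo, hi = zdd[node]
        # Probability of including this arm, given the path taken so far.
        p_hi = weights[arm] * b[hi] / b[node]
        if rng.random() < p_hi:
            chosen.append(arm)
            node = hi
        else:
            node = lo
    return frozenset(chosen)


w = {"a": 2.0, "b": 1.0, "c": 1.0}
total = backward_weights(ZDD, w)[ROOT]  # 2*1 + 2*1 + 1*1 = 5.0
decision = sample_set(ZDD, ROOT, w, random.Random(0))
```

Both functions visit each node a bounded number of times, so the cost scales with the size of the diagram rather than with the number of decisions it encodes. This is the kind of saving that matters when the decision set is, for example, all paths between two vertices of a network.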

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics