Efficient Learning in Large-Scale Combinatorial Semi-Bandits

Zheng Wen, Branislav Kveton, and Azin Ashkan

arXiv:1406.7443·cs.LG·February 1, 2017·48 cites

TL;DR

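This paper introduces two algorithms, CombLinTS and CombLinUCB, for large-scale combinatorial semi-bandits with linear generalization. Both are computationally efficient whenever the offline version of the combinatorial problem can be solved efficiently, and both come with regret bounds that are sublinear in time and independent of the number of items. In experiments, CombLinTS scales well and outperforms baselines.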

Contribution

The paper proposes two algorithms, CombLinTS and CombLinUCB, that are both computationally and statistically efficient for large-scale combinatorial semi-bandits with linear generalization.

Findings

01. CombLinTS outperforms baselines in large-scale experiments.

02. Both algorithms have regret bounds independent of the number of items.

03. CombLinTS is scalable and robust to parameter choices.

Abstract

A stochastic combinatorial semi-bandit is an online learning problem where at each step a learning agent chooses a subset of ground items subject to combinatorial constraints, and then observes stochastic weights of these items and receives their sum as a payoff. In this paper, we consider efficient learning in large-scale combinatorial semi-bandits with linear generalization, and as a solution, propose two learning algorithms called Combinatorial Linear Thompson Sampling (CombLinTS) and Combinatorial Linear UCB (CombLinUCB). Both algorithms are computationally efficient as long as the offline version of the combinatorial problem can be solved efficiently. We establish that CombLinTS and CombLinUCB are also provably statistically efficient under reasonable assumptions, by developing regret bounds that are independent of the problem scale (number of items) and sublinear in time. We also…
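
The abstract does not spell out either algorithm, so the following is only a minimal sketch of the Thompson sampling idea behind CombLinTS, not the authors' implementation. It assumes a Gaussian prior on the weight vector, Gaussian observation noise with known standard deviation, and a simple cardinality constraint (choose exactly k items) whose offline problem is solved by picking the k largest weights. The names `top_k_oracle`, `comb_lin_ts`, `noise_sd`, and `prior_var` are hypothetical and chosen for illustration.

```python
import numpy as np


def top_k_oracle(weights, k):
    """Hypothetical offline oracle: indices of the k largest weights."""
    return np.argsort(weights)[-k:]


def comb_lin_ts(features, true_theta, k, n_steps,
                noise_sd=0.1, prior_var=1.0, seed=0):
    """Sketch of linear Thompson sampling for a semi-bandit.

    Assumption: the expected weight of item e is features[e] @ true_theta.
    Returns the cumulative regret per step and the posterior mean of theta.
    """
    rng = np.random.default_rng(seed)
    n_items, d = features.shape

    # Gaussian posterior over theta, stored as precision matrix and
    # precision-weighted mean vector b (posterior mean = inv(precision) @ b).
    precision = np.eye(d) / prior_var
    b = np.zeros(d)

    mean_weights = features @ true_theta
    opt_value = mean_weights[top_k_oracle(mean_weights, k)].sum()
    regret = np.zeros(n_steps)

    for t in range(n_steps):
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2.0  # guard against round-off asymmetry
        theta_sample = rng.multivariate_normal(cov @ b, cov)

        # Act greedily with respect to the sampled model via the oracle.
        chosen = top_k_oracle(features @ theta_sample, k)

        # Semi-bandit feedback: one noisy weight per chosen item.
        x = features[chosen]
        obs = x @ true_theta + noise_sd * rng.standard_normal(k)

        # Bayesian linear regression update with the k new observations.
        precision += x.T @ x / noise_sd**2
        b += x.T @ obs / noise_sd**2

        regret[t] = opt_value - mean_weights[chosen].sum()

    return np.cumsum(regret), np.linalg.inv(precision) @ b


# Small synthetic instance: 200 items described by 5 features each.
rng = np.random.default_rng(1)
features = rng.standard_normal((200, 5))
true_theta = rng.standard_normal(5)
cum_regret, theta_hat = comb_lin_ts(features, true_theta, k=5, n_steps=300)
print(cum_regret[-1])
```

In this sketch the learner estimates only the 5-dimensional vector `theta`, not 200 separate item weights, which is the sense in which linear generalization lets learning cost depend on the feature dimension rather than on the number of items. Swapping `top_k_oracle` for another offline solver (for example a shortest-path or matching routine) would change the constraint without touching the learning loop.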

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems