Matroid Bandits: Fast Combinatorial Optimization with Learning
Branislav Kveton, Zheng Wen, Azin Ashkan, Hoda Eydgahi, Brian Eriksson

TL;DR
This paper introduces matroid bandits, a new class combining bandit learning with combinatorial optimization under matroid constraints, and proposes an algorithm with proven regret bounds.
Contribution
It presents the first algorithm for learning to maximize stochastic modular functions on matroids, with theoretical regret bounds and practical evaluation.
Findings
The OMM algorithm achieves sublinear regret in time.
The regret bounds are tight and proven to be optimal.
The method is effective on real-world problems.
Abstract
A matroid is a notion of independence in combinatorial optimization which is closely related to computational efficiency. In particular, it is well known that the maximum of a constrained modular function can be found greedily if and only if the constraints are associated with a matroid. In this paper, we bring together the ideas of bandits and matroids, and propose a new class of combinatorial bandits, matroid bandits. The objective in these problems is to learn how to maximize a modular function on a matroid. This function is stochastic and initially unknown. We propose a practical algorithm for solving our problem, Optimistic Matroid Maximization (OMM); and prove two upper bounds, gap-dependent and gap-free, on its regret. Both bounds are sublinear in time and at most linear in all other quantities of interest. The gap-dependent upper bound is tight and we prove a matching lower…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems
