Oracle-Efficient Combinatorial Semi-Bandits

Jung-hun Kim; Milan Vojnović; Min-hwan Oh

arXiv:2510.21431 · stat.ML · October 27, 2025

TL;DR

This paper introduces oracle-efficient algorithms for combinatorial semi-bandits that reduce the number of oracle calls from linear to (doubly) logarithmic in the time horizon $T$ while maintaining tight regret guarantees.

Contribution

The paper presents algorithms that substantially reduce the number of oracle queries needed in combinatorial semi-bandit problems while retaining tight regret guarantees.

Findings

01

Achieves $\tilde{O}(\sqrt{T})$ regret with $O(\log\log T)$ oracle calls.

02

Develops covariance-adaptive algorithms for better regret in structured noise settings.

03

Extends methods to handle general non-linear reward functions.

Abstract

We study the combinatorial semi-bandit problem where an agent selects a subset of base arms and receives individual feedback. While this generalizes the classical multi-armed bandit and has broad applicability, its scalability is limited by the high cost of combinatorial optimization, requiring oracle queries at every round. To tackle this, we propose oracle-efficient frameworks that significantly reduce oracle calls while maintaining tight regret guarantees. For the worst-case linear reward setting, our algorithms achieve $\tilde{O}(\sqrt{T})$ regret using only $O(\log\log T)$ oracle queries. We also propose covariance-adaptive algorithms that leverage noise structure for improved regret, and extend our approach to general (non-linear) rewards. Overall, our methods reduce oracle usage from linear to (doubly) logarithmic in time, with strong theoretical guarantees.
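To give a feel for how a doubly logarithmic number of oracle calls can arise, the sketch below queries the oracle only at the start of each batch on the grid $t_i = \lceil T^{1-2^{-i}} \rceil$, a schedule common in the batched-bandit literature. This is an illustration under stated assumptions, not the paper's algorithm: the grid choice, the UCB index, the top-$k$ oracle, and the names `doubly_log_grid`, `top_k_oracle`, and `run_batched` are all hypothetical, and no regret guarantee is claimed for this code.

```python
import math
import random


def doubly_log_grid(T):
    """Batch end times t_1 < ... < t_M = T with t_i ~ T^(1 - 2^-i).

    The number of batches grows like log2(log2(T)). (Assumed schedule.)
    """
    M = math.ceil(math.log2(max(2.0, math.log2(max(T, 2))))) + 1
    grid = []
    for i in range(1, M + 1):
        t = min(T, math.ceil(T ** (1.0 - 2.0 ** (-i))))
        if not grid or t > grid[-1]:  # keep the grid strictly increasing
            grid.append(t)
    if grid[-1] < T:  # make sure the last batch ends at the horizon
        grid.append(T)
    return grid


def top_k_oracle(scores, k):
    """Hypothetical linear-reward oracle: the k arms with the largest scores."""
    return sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)[:k]


def run_batched(means, k, T, seed=0):
    """Play Bernoulli base arms, calling the oracle once per batch."""
    rng = random.Random(seed)
    d = len(means)
    counts = [0] * d   # number of observations per base arm
    sums = [0.0] * d   # sum of observed rewards per base arm
    oracle_calls = 0
    start = 0
    for end in doubly_log_grid(T):
        # Optimistic index per arm; arms never observed get +infinity.
        ucb = [
            sums[a] / counts[a] + math.sqrt(2.0 * math.log(T) / counts[a])
            if counts[a] > 0 else math.inf
            for a in range(d)
        ]
        action = top_k_oracle(ucb, k)  # the only oracle call in this batch
        oracle_calls += 1
        for _ in range(start, end):
            for a in action:  # semi-bandit feedback: one reward per chosen arm
                reward = 1.0 if rng.random() < means[a] else 0.0
                counts[a] += 1
                sums[a] += reward
        start = end
    return oracle_calls, counts


if __name__ == "__main__":
    calls, counts = run_batched([0.9, 0.8, 0.2, 0.1, 0.5], k=2, T=10_000)
    print("oracle calls:", calls, "pulls per arm:", counts)
```

A per-round scheme would call the oracle $T$ times, whereas here the number of calls equals the number of batches, which stays in single digits even for horizons in the millions.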

Videos

Oracle-Efficient Combinatorial Semi-Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Stochastic Gradient Optimization Techniques