Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit

Julien Zhou (Thoth; STATIFY); Pierre Gaillard (Thoth); Thibaud Rahier; Julyan Arbel (STATIFY)

arXiv:2410.08578 · cs.LG · February 13, 2025

TL;DR

This paper introduces a new algorithm for online unconstrained submodular maximization with stochastic bandit feedback, achieving improved regret bounds and characterizing the problem's hardness transition.

Contribution

It proposes the DG-ETC algorithm, combining Double-Greedy with explore-then-commit, and provides new regret bounds along with a hardness measure for the problem.
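
As background for the Double-Greedy component, the following is a minimal sketch of the classical offline Double-Greedy procedure with exact (noise-free) function evaluations. It is not the DG-ETC algorithm itself, whose exploration and commitment rules are not described on this page. The deterministic variant is used here for reproducibility (it is known to give a 1/3-approximation for non-negative submodular functions, while the randomized variant gives 1/2 in expectation). The names `double_greedy`, `cut_value`, and `EDGES` are hypothetical, chosen for illustration only.

```python
from typing import Callable, FrozenSet, Set

# Hypothetical toy instance: the cut function of the path graph 0 - 1 - 2.
# Graph cut functions are a standard example of non-monotone submodular functions.
EDGES = [(0, 1), (1, 2)]


def cut_value(subset: FrozenSet[int]) -> int:
    """Number of edges with exactly one endpoint in `subset`."""
    return sum((u in subset) != (v in subset) for u, v in EDGES)


def double_greedy(f: Callable[[FrozenSet[int]], float], d: int) -> Set[int]:
    """Deterministic Double-Greedy over the ground set {0, ..., d-1}.

    Maintains a growing set `lower` and a shrinking set `upper`; each element
    is either added to `lower` or removed from `upper`, so the two coincide
    once every element has been processed.
    """
    lower: Set[int] = set()
    upper: Set[int] = set(range(d))
    for i in range(d):
        # Marginal gain of adding i to the lower set.
        gain_add = f(frozenset(lower | {i})) - f(frozenset(lower))
        # Marginal gain of removing i from the upper set.
        gain_remove = f(frozenset(upper - {i})) - f(frozenset(upper))
        if gain_add >= gain_remove:
            lower.add(i)
        else:
            upper.discard(i)
    return lower


if __name__ == "__main__":
    chosen = double_greedy(cut_value, 3)
    print(sorted(chosen), cut_value(frozenset(chosen)))
```

In the bandit setting described in the abstract, the exact calls to `f` would not be available; only noisy rewards are observed, which is why an exploration phase is needed before committing to each add-or-remove decision.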

Findings

01

Achieves an $O(d \log(dT))$ problem-dependent regret bound.

02

Achieves an $O(d T^{2/3} \log(dT)^{1/3})$ problem-free regret bound.

03

Introduces a hardness measure for the transition between regret regimes.

Abstract

We address the online unconstrained submodular maximization problem (Online USM) in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a non-monotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy - Explore-then-Commit (DG-ETC), adapting the Double-Greedy approach from the offline and online full-information settings. DG-ETC satisfies an $O(d \log(dT))$ problem-dependent upper bound for the $1/2$-approximate pseudo-regret and, simultaneously, an $O(d T^{2/3} \log(dT)^{1/3})$ problem-free one, outperforming existing approaches. In particular, we introduce a problem-dependent notion of hardness characterizing the transition between the logarithmic and polynomial regimes of the upper bounds.
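
For orientation, one standard way to write an $\alpha$-approximate pseudo-regret with $\alpha = 1/2$ is given below. This is a sketch under assumptions, since the abstract does not define the notation: $d$ is read as the size of the ground set, $T$ as the time horizon, $f$ as the expected reward function, and $S_t$ as the subset played at round $t$. The paper's exact definition may differ.

$$
R_T \;=\; \sum_{t=1}^{T} \left( \frac{1}{2} \max_{S \subseteq \{1, \dots, d\}} f(S) \;-\; \mathbb{E}\big[ f(S_t) \big] \right)
$$

Under this reading, the benchmark is half of the optimal value rather than the optimum itself, matching the $1/2$ approximation guarantee of randomized Double-Greedy in the offline setting.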

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms

Methods: Network On Network