Online Learning with Probing for Sequential User-Centric Selection

Tianyi Xu; Yiting Chen; Henger Li; Zheyong Bian; Emiliano Dall'Anese; Zizhan Zheng

arXiv:2507.20112·cs.LG·August 19, 2025

TL;DR

This paper introduces the PUCS framework for sequential decision-making with costly probing, providing algorithms with provable guarantees for both offline and online settings, and demonstrating effectiveness on real data.

Contribution

The paper formalizes the PUCS framework, proposes a greedy probing algorithm with a constant-factor approximation guarantee for the offline setting, and develops OLPA, an online learning algorithm with provable regret bounds, addressing resource-aware sequential decision-making in which probing is costly.

Findings

01 The greedy probing algorithm achieves a constant-factor approximation in the offline setting.

02 OLPA attains a regret bound that matches the lower bound up to logarithmic factors.

03 Experiments on real data validate the effectiveness of the proposed methods.
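
To make the first finding concrete, the sketch below evaluates the constant quoted in the abstract, (e - 1) / (2e - 1), and illustrates the general shape of a greedy marginal-gain selection. It is only an illustration under stated assumptions, not the paper's algorithm: the function names `zeta` and `greedy_probe`, the toy coverage utility, and the simple cardinality budget are all hypothetical, and the paper's actual objective and probing-cost model are not reproduced here.

```python
import math


def zeta() -> float:
    """Approximation factor (e - 1) / (2e - 1) quoted in the abstract."""
    return (math.e - 1) / (2 * math.e - 1)


def greedy_probe(arms, gain, budget):
    """Generic greedy subset selection (hypothetical sketch).

    Repeatedly adds the arm with the largest marginal gain until `budget`
    arms are chosen or no remaining arm improves the utility.
    `gain(S)` is an assumed set utility, not the paper's objective.
    """
    chosen = []
    remaining = list(arms)
    while remaining and len(chosen) < budget:
        base = gain(chosen)
        # Arm with the largest marginal gain; ties go to the earliest arm.
        best = max(remaining, key=lambda a: gain(chosen + [a]) - base)
        if gain(chosen + [best]) - base <= 0:
            break  # no arm adds value, so stop probing
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy coverage utility (assumption): each arm "reveals" a set of items,
# and the utility of a probing set is the number of distinct items revealed.
COVERAGE = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2}}


def coverage_gain(selected):
    covered = set()
    for arm in selected:
        covered |= COVERAGE[arm]
    return len(covered)


probe_set = greedy_probe(COVERAGE.keys(), coverage_gain, budget=2)
print(f"zeta = {zeta():.4f}")
print("greedy probing set:", probe_set)
```

The constant evaluates to roughly 0.387, so under the abstract's guarantee the greedy solution retains at least about 38.7% of the optimal offline value. The coverage utility is chosen only because it is monotone and submodular, the standard setting in which greedy selection admits constant-factor guarantees.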

Abstract

We formalize sequential decision-making with information acquisition as the probing-augmented user-centric selection (PUCS) framework, where a learner first probes a subset of arms to obtain side information on resources and rewards, and then assigns $K$ plays to $M$ arms. PUCS covers applications such as ridesharing, wireless scheduling, and content recommendation, in which both resources and payoffs are initially unknown and probing is costly. For the offline setting with known distributions, we present a greedy probing algorithm with a constant-factor approximation guarantee $\zeta = (e - 1)/(2e - 1)$. For the online setting with unknown distributions, we introduce OLPA, a stochastic combinatorial bandit algorithm that achieves a regret bound $O(\sqrt{T} + \ln^{2} T)$. We also prove a lower bound $\Omega(\sqrt{T})$, showing that the upper bound is tight up to logarithmic…
