Beyond $\mathcal{O}(\sqrt{T})$ Regret: Decoupling Learning and Decision-making in Online Linear Programming

Wenzhi Gao; Dongdong Ge; Chenyu Xue; Chunlin Sun; Yinyu Ye

arXiv:2501.02761·stat.ML·January 7, 2025

Open Access

TL;DR

This paper introduces a framework for online linear programming in which first-order algorithms break the traditional $\mathcal{O}(\sqrt{T})$ regret barrier, reaching $o(\sqrt{T})$ or $\mathcal{O}(\log T)$ regret when the LP dual problem satisfies certain error bound conditions.

Contribution

The paper decouples learning from decision-making, enabling first-order algorithms to attain $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting, improving on the $\mathcal{O}(\sqrt{T})$ rate typical of such methods.

Findings

01. Achieves $o(\sqrt{T})$ regret in the continuous support setting.

02. Attains $\mathcal{O}(\log T)$ regret in the finite support setting.

03. Provides new theoretical insights into online LP algorithms.

Abstract

Online linear programming plays an important role in both revenue management and resource allocation, and recent research has focused on developing efficient first-order online learning algorithms. Despite the empirical success of first-order methods, they typically achieve a regret no better than $\mathcal{O}(\sqrt{T})$, which is suboptimal compared to the $\mathcal{O}(\log T)$ bound guaranteed by the state-of-the-art linear programming (LP)-based online algorithms. This paper establishes a general framework that improves upon the $\mathcal{O}(\sqrt{T})$ result when the LP dual problem exhibits certain error bound conditions. For the first time, we show that first-order learning algorithms achieve $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting beyond the non-degeneracy assumption. Our results significantly…
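
For readers unfamiliar with the first-order baseline the abstract refers to, the following is a minimal sketch of a standard dual subgradient method for online LP, not the decoupled framework proposed in the paper. It assumes the usual formulation: maximize $\sum_t r_t x_t$ subject to $\sum_t a_t x_t \le b$ with $x_t \in \{0, 1\}$, where $(r_t, a_t)$ arrive one at a time. The function name `dual_subgradient_olp`, the $1/\sqrt{T}$ step size, and the explicit remaining-budget check are illustrative assumptions.

```python
import numpy as np


def dual_subgradient_olp(r, A, b, step=None):
    """Baseline first-order online LP sketch (hypothetical helper, not the paper's method).

    r: (T,) rewards, A: (T, m) nonnegative resource usage, b: (m,) total budget.
    Returns the 0/1 decisions x and the final dual price vector p.
    """
    r = np.asarray(r, dtype=float)
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    T, m = A.shape
    d = b / T                        # average budget available per period
    if step is None:
        step = 1.0 / np.sqrt(T)      # assumed step size; a common choice for sqrt(T)-type regret
    p = np.zeros(m)                  # dual prices, one per resource
    remaining = b.copy()             # budget left (assumed safeguard against overspending)
    x = np.zeros(T)
    for t in range(T):
        # Decision: accept when the reward exceeds the priced resource cost
        # and the order still fits in the remaining budget.
        if r[t] > A[t] @ p and np.all(A[t] <= remaining):
            x[t] = 1.0
            remaining -= A[t]
        # Learning: projected subgradient step on the dual prices.
        p = np.maximum(p + step * (A[t] * x[t] - d), 0.0)
    return x, p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, m = 500, 3
    r = rng.uniform(0.0, 1.0, T)
    A = rng.uniform(0.0, 1.0, (T, m))
    b = 0.25 * T * np.ones(m)
    x, p = dual_subgradient_olp(r, A, b)
    print("accepted orders:", int(x.sum()), "final prices:", p)
```

In this baseline the same price vector `p` is used both to make the decision and to take the learning step; the paper's title suggests that its framework separates these two roles, which this sketch does not attempt to reproduce.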

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Machine Learning and Algorithms