A Two-Layer Framework for Joint Online Configuration Selection and Admission Control

Owen Shen; Haoran Xu; Yinyu Ye; Peter Glynn; Patrick Jaillet

arXiv:2602.07663 · math.OC · February 10, 2026

TL;DR

This paper introduces a two-layer online framework for configuration selection and admission control, providing a new benchmark and an algorithm with sublinear regret for applications like LLM serving and GPU scheduling.

Contribution

It proposes a novel two-layer framework with a switching-aware fluid oracle and develops an algorithm with provable regret bounds for joint configuration and admission decisions.

Findings

01

The switching-aware fluid oracle provably upper-bounds the performance of any online policy.

02

The max-min formulation characterizes saddle points for benchmarking.

03

The SP-UCB--OLP algorithm achieves $\tilde{O}(\sqrt{KT})$ regret.

Abstract

We study the online configuration selection with admission control problem, which arises in LLM serving, GPU scheduling, and revenue management. In a planning horizon with $T$ periods, we consider a two-layer framework for the decisions made within each time period. In the first layer, the decision maker selects one of the $K$ configurations (e.g., quantization, parallelism, fare class), which induces a distribution over the reward-resource pair of the incoming request. In the second layer, the decision maker observes the request and then decides whether to accept it. Benchmarking this framework requires care. We introduce a switching-aware fluid oracle that accounts for the value of mixing configurations over time, provably upper-bounding any online policy. We derive a max-min formulation for evaluating the benchmark, and we characterize saddle points of the max-min problem…
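The two-layer interaction described in the abstract can be sketched as a short simulation loop. This is only an illustration of the protocol (select a configuration, observe the request, then accept or reject), not the paper's SP-UCB--OLP algorithm. The single scalar resource budget, the two sampling distributions, the round-robin selector, and the reward-per-resource threshold rule are all assumptions made for the example; none of them come from the paper.

```python
import random


def simulate(T, budget, configs, choose_config, admit, seed=0):
    """Run the two-layer protocol for T periods (illustrative sketch).

    configs:       list of K samplers; configs[k](rng) returns (reward, resource).
    choose_config: (t, remaining) -> index k, called before the request is seen.
    admit:         (reward, resource, remaining) -> bool, called after it is seen.
    Assumes a single scalar resource budget, which is a simplification.
    """
    rng = random.Random(seed)
    remaining = budget
    total_reward = 0.0
    for t in range(T):
        # Layer 1: pick a configuration; it determines the request distribution.
        k = choose_config(t, remaining)
        reward, resource = configs[k](rng)
        # Layer 2: observe the request, then accept only if it fits the budget.
        if resource <= remaining and admit(reward, resource, remaining):
            remaining -= resource
            total_reward += reward
    return total_reward, remaining


# Hypothetical configurations: (reward, resource) distributions are made up.
def expensive_config(rng):
    return rng.uniform(0.0, 1.0), rng.uniform(0.5, 1.0)


def cheap_config(rng):
    return rng.uniform(0.0, 0.6), rng.uniform(0.1, 0.4)


configs = [expensive_config, cheap_config]


def round_robin(t, remaining):
    # Naive layer-1 rule: alternate configurations, ignoring all feedback.
    return t % len(configs)


def threshold(reward, resource, remaining):
    # Naive layer-2 rule: accept if reward per unit of resource is at least 0.5.
    return reward >= 0.5 * resource


total, left = simulate(
    T=1000, budget=100.0, configs=configs,
    choose_config=round_robin, admit=threshold,
)
print(f"collected reward: {total:.2f}, unused budget: {left:.2f}")
```

A learning policy would replace `round_robin` and `threshold` with rules that adapt to observed rewards and resource consumption; the loop structure stays the same.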

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Age of Information Optimization