The Value-of-Information in Matching with Queues
Longbo Huang

TL;DR
This paper develops online algorithms for optimal matching in queue-based systems. The algorithms learn the unknown matching rewards and system dynamics, and achieve near-optimal utility with faster convergence and lower delay.
Contribution
The paper introduces two novel algorithms, LRAM and DRAM, that incorporate learning modules to handle reward and system dynamics uncertainties in queue matching problems.
Findings
Both algorithms achieve near-optimal utility within $O(\text{error})$ bounds.
DRAM converges faster and has lower delay than LRAM.
The role of different system information in algorithm performance is systematically analyzed.
Abstract
We consider the problem of optimal matching with queues in dynamic systems and investigate the value-of-information. In such systems, the operators match tasks and resources stored in queues, with the objective of maximizing the system utility of the matching reward profile, minus the average matching cost. This problem appears in many practical systems, and the main challenges are the no-underflow constraints and the lack of matching-reward information and system dynamics statistics. We develop two online matching algorithms, Learning-aided Reward optimAl Matching (LRAM) and Dual-LRAM (DRAM), to effectively resolve both challenges. Both algorithms are equipped with a learning module for estimating the matching-reward information, while DRAM incorporates an additional module for learning the system dynamics. We show that both algorithms…
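The two ingredients named in the abstract, the no-underflow constraint and the estimation of unknown matching rewards, can be illustrated with a small sketch. This is not LRAM or DRAM: the names `RewardEstimate`, `feasible_matches`, and `step` are hypothetical, the system is assumed to hold one task queue and one resource queue, and each match is assumed to consume one item from each.

```python
from dataclasses import dataclass


@dataclass
class RewardEstimate:
    """Running mean of observed rewards for one matching option (hypothetical helper)."""
    total: float = 0.0
    count: int = 0

    def update(self, reward: float) -> None:
        self.total += reward
        self.count += 1

    @property
    def mean(self) -> float:
        # With no observations yet, fall back to 0.0 (an arbitrary choice for this sketch).
        return self.total / self.count if self.count else 0.0


def feasible_matches(task_queue: int, resource_queue: int, requested: int) -> int:
    """Cap the number of matches so that neither queue goes negative (no-underflow)."""
    return max(0, min(requested, task_queue, resource_queue))


def step(task_queue, resource_queue, task_arrivals, resource_arrivals, requested):
    """One time slot: match first, then admit arrivals.

    Returns the new task queue, the new resource queue, and the matches made.
    """
    matched = feasible_matches(task_queue, resource_queue, requested)
    return (task_queue - matched + task_arrivals,
            resource_queue - matched + resource_arrivals,
            matched)
```

In this sketch a request for five matches against queues of sizes 3 and 1 is cut down to a single match, which is the kind of restriction a no-underflow constraint imposes on an online policy.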
Taxonomy
Topics: Optimization and Search Problems · Advanced Wireless Network Optimization · Wireless Networks and Protocols
