Strategic Behavior and No-Regret Learning in Queueing Systems
Lucas Baudin, Marco Scarsini, Xavier Venel

TL;DR
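This paper analyzes a dynamic discrete-time queueing model in which jobs that miss a fixed deadline incur a penalty and rejoin the queue. It shows that the system is stable for myopically strategic players when the penalty is high enough, and for players using an exponential-weight no-regret learning algorithm when penalties exceed a bound that depends on the total number of jobs in the system.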
Contribution
It introduces a model combining strategic behavior and no-regret learning in queueing systems, providing stability conditions based on penalties and learning algorithms.
Findings
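If players are myopically strategic, the system is stable when the penalty is high enough.
If players use a learning algorithm derived from exponential weights (a typical no-regret algorithm), the system is stable when penalties exceed a bound that depends on the total number of jobs in the system.
Stability, i.e. boundedness of the number of jobs in the system, is not guaranteed in general; in both settings it hinges on the penalty level.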
Abstract
This paper studies a dynamic discrete-time queueing model where at every period players get a new job and must send all their jobs to a queue that has a limited capacity. Players have an incentive to send their jobs as late as possible; however, if a job does not exit the queue by a fixed deadline, the owner of the job incurs a penalty and this job is sent back to the player and joins the queue at the next period. Therefore, stability, i.e., the boundedness of the number of jobs in the system, is not guaranteed. We show that if players are myopically strategic, then the system is stable when the penalty is high enough. Moreover, if players use a learning algorithm derived from a typical no-regret algorithm (exponential weight), then the system is stable when penalties are greater than a bound that depends on the total number of jobs in the system.
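Illustrative sketch
The abstract refers to a learning algorithm derived from exponential weights. As a point of reference, the following is a minimal sketch of the standard exponential-weights (Hedge) update, not the paper's derived variant. The function name, the learning rate `eta`, and the assumption that losses lie in [0, 1] are illustrative choices that do not come from the paper.

```python
import math

def exponential_weights(loss_rounds, eta):
    """Standard exponential-weights (Hedge) update over a fixed action set.

    loss_rounds: list of per-round loss vectors (assumed to lie in [0, 1]).
    eta: learning rate, a positive float (hypothetical tuning parameter).
    Returns the probability distribution played in each round.
    """
    n_actions = len(loss_rounds[0])
    cumulative = [0.0] * n_actions  # cumulative loss per action so far
    history = []
    for losses in loss_rounds:
        # Weight each action by exp(-eta * cumulative loss).
        # Subtracting the minimum avoids underflow and leaves the
        # normalized distribution unchanged.
        base = min(cumulative)
        weights = [math.exp(-eta * (c - base)) for c in cumulative]
        total = sum(weights)
        history.append([w / total for w in weights])
        # Losses are revealed after the distribution is played.
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return history
```

For instance, with two hypothetical actions and loss vectors `[[1.0, 0.0]] * 5`, the first distribution is uniform and probability mass then shifts toward the second action. How the paper maps sending decisions and penalties to actions and losses is not specified in the abstract and is not modeled here.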
Taxonomy
Topics: Advanced Queuing Theory Analysis · Advanced Bandit Algorithms Research · Smart Grid Energy Management
