An Online Learning Approach to Optimizing Time-Varying Costs of AoI
Vishrant Tripathi, Eytan Modiano

TL;DR
This paper develops online learning algorithms for optimizing Age of Information (AoI) costs in dynamic, possibly adversarial environments, demonstrating low regret and applicability to mobility tracking.
Contribution
The paper introduces novel online algorithms with regret guarantees for single-source and multiple-source AoI scheduling, including the Follow-the-Perturbed-Whittle-Leader method.
Findings
The algorithms achieve sublinear regret in non-stationary environments.
The proposed methods outperform oblivious policies in mobility tracking.
The multi-source scheduling algorithm has low computational complexity.
Abstract
We consider systems that require timely monitoring of sources over a communication network, where the cost of delayed information is unknown, time-varying and possibly adversarial. For the single source monitoring problem, we design algorithms that achieve sublinear regret compared to the best fixed policy in hindsight. For the multiple source scheduling problem, we design a new online learning algorithm called Follow-the-Perturbed-Whittle-Leader and show that it has low regret compared to the best fixed scheduling policy in hindsight, while remaining computationally feasible. The algorithm and its regret analysis are novel and of independent interest to the study of online restless multi-armed bandit problems. We further design algorithms that achieve sublinear regret compared to the best dynamic policy when the environment is slowly varying. Finally, we apply our algorithms to a…
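To make "regret compared to the best fixed policy in hindsight" concrete, here is a minimal sketch of the classical Follow-the-Perturbed-Leader idea over a finite set of candidate policies. It is not the paper's Follow-the-Perturbed-Whittle-Leader algorithm, which the abstract does not specify. The function name, the `noise_rate` parameter, the exponential perturbation, and the assumption that the full cost vector is revealed after each round are all illustrative choices.

```python
import random


def follow_the_perturbed_leader(cost_rounds, num_policies, noise_rate=1.0, seed=0):
    """Generic Follow-the-Perturbed-Leader over a finite policy set (illustrative).

    cost_rounds:  list of per-round cost vectors (one cost per policy); each
                  vector is assumed to be revealed only after that round's choice.
    noise_rate:   rate of the exponential perturbation (hypothetical parameter);
                  a smaller rate means larger noise and more exploration.
    Returns (choices, regret), where regret is measured against the best
    single fixed policy in hindsight.
    """
    rng = random.Random(seed)
    cumulative = [0.0] * num_policies  # total past cost of each policy
    choices = []
    incurred = 0.0

    for costs in cost_rounds:
        # Subtract fresh random noise from each policy's past cost, then
        # follow the policy that looks cheapest after perturbation.
        perturbed = [
            cumulative[i] - rng.expovariate(noise_rate) for i in range(num_policies)
        ]
        choice = min(range(num_policies), key=perturbed.__getitem__)
        choices.append(choice)
        incurred += costs[choice]

        # Costs become known after the decision; update the running totals.
        for i in range(num_policies):
            cumulative[i] += costs[i]

    best_fixed = min(cumulative)  # cost of the best fixed policy in hindsight
    return choices, incurred - best_fixed


# Usage: two candidate policies, where policy 0 is always cheaper.
rounds = [[0.0, 1.0] for _ in range(200)]
choices, regret = follow_the_perturbed_leader(rounds, num_policies=2)
```

In this toy run the learner quickly locks onto the cheaper policy, so the regret stays small relative to the number of rounds. That is what sublinear regret means informally: the average per-round gap to the best fixed policy shrinks as the horizon grows.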
