Stochastic Shortest Path with Sparse Adversarial Costs
Emmeran Johnson, Alberto Rumi, Ciara Pike-Burke, Patrick Rebeschini

TL;DR
This paper investigates the adversarial SSP problem with sparse costs, proposing adaptive algorithms that improve regret bounds by exploiting sparsity, and establishes fundamental limits in the unknown transition setting.
Contribution
It introduces $\ell_r$-norm regularizers that adapt to sparsity, achieving regret bounds depending on the effective dimension $M$, and proves these bounds are optimal.
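For orientation, the objects named above can be written out in their textbook forms. This is a sketch under assumptions, not the paper's exact definitions: the decision set $K$, step size $\eta$, cost vector $c_t$, and the normalization $1/(2(r-1))$ are standard choices from the online-learning literature and may differ from what the authors use.

```latex
% Generic OMD step over a convex decision set K
% (for SSP, K would be a set of occupancy measures; assumed notation)
q_{t+1} = \arg\min_{q \in K} \; \eta \langle q, c_t \rangle + D_\psi(q, q_t),
\qquad
D_\psi(q, q') = \psi(q) - \psi(q') - \langle \nabla \psi(q'), q - q' \rangle

% Negative-entropy regularizer
\psi_{\mathrm{ent}}(q) = \sum_{i} q_i \log q_i

% Textbook squared \ell_r-norm regularizer (assumed form, r \in (1, 2])
\psi_r(q) = \frac{1}{2(r-1)} \|q\|_r^2,
\qquad
\|q\|_r = \Big( \sum_i |q_i|^r \Big)^{1/r}
```

Only the choice of $\psi$ changes between the two methods; the update rule itself is the same.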
Findings
Adaptive algorithms achieve regret scaling with $\log M$.
Negative-entropy regularization fails to adapt to sparsity.
In the unknown transition setting, regret scales polynomially with $SA$.
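To make the negative-entropy baseline concrete, here is a minimal Python sketch of OMD with the negative-entropy regularizer. It is a deliberate simplification: it runs on the probability simplex (the experts setting) rather than on SSP occupancy measures, and the sizes `T`, `d`, `m`, the step size `eta`, and the function names are hypothetical choices for illustration. It shows the closed-form update only and does not reproduce the paper's bounds.

```python
import numpy as np

def omd_negative_entropy(costs, eta):
    """OMD with negative entropy on the simplex (exponential weights).

    costs: array of shape (T, d) with entries in [0, 1], revealed in full
    after each round. Returns the (T, d) array of points played.
    """
    T, d = costs.shape
    x = np.full(d, 1.0 / d)  # uniform point minimizes negative entropy
    played = np.empty((T, d))
    for t in range(T):
        played[t] = x
        # Closed-form mirror step: multiplicative update, then renormalize.
        w = x * np.exp(-eta * costs[t])
        x = w / w.sum()
    return played

def regret(costs, played):
    """Learner's total expected cost minus that of the best fixed coordinate."""
    learner = np.sum(costs * played)
    best_fixed = costs.sum(axis=0).min()
    return learner - best_fixed

# Hypothetical sparse instance: d coordinates, only the first m ever incur cost.
rng = np.random.default_rng(0)
T, d, m = 2000, 500, 5
costs = np.zeros((T, d))
costs[:, :m] = rng.random((T, m))

eta = np.sqrt(np.log(d) / T)  # standard tuning, which depends on log d
played = omd_negative_entropy(costs, eta)
print("regret:", regret(costs, played))
```

Note that the step size is tuned with the full dimension `d`, not the number `m` of coordinates that actually incur cost; this is the kind of dimension dependence that a sparsity-adaptive regularizer aims to replace.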
Abstract
We study the adversarial Stochastic Shortest Path (SSP) problem with sparse costs under full-information feedback. In the known transition setting, existing bounds based on Online Mirror Descent (OMD) with negative-entropy regularization scale with $\log SA$, where $SA$ is the size of the state-action space. While we show that this is optimal in the worst-case, this bound fails to capture the benefits of sparsity when only a small number of state-action pairs incur cost. In fact, we also show that the negative-entropy is inherently non-adaptive to sparsity: it provably incurs regret scaling with on sparse problems. Instead, we propose a family of $\ell_r$-norm regularizers () that adapts to the sparsity and achieves regret scaling with $\log M$ instead of $\log SA$. We show this is optimal via a matching lower bound,…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Age of Information Optimization
