The on-line shortest path problem under partial monitoring
Andras Gyorgy, Tamas Linder, Gabor Lugosi, Gyorgy Ottucsak

TL;DR
This paper introduces an online algorithm for the shortest path problem under partial monitoring. Its average regret against the best fixed path decays in proportion to 1/√n and depends only polynomially on the number of edges. The paper also extends the algorithm to label-efficient and tracking settings.
Contribution
It presents a new online learning algorithm for shortest path under partial feedback with polynomial complexity and extends it to label-efficient and dynamic path tracking settings.
Findings
Achieves average (per-round) regret proportional to 1/√n against the best fixed path, with only polynomial dependence on the number of edges.
Extends to label-efficient settings with limited feedback.
Demonstrates effectiveness in routing applications with simulation results.
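The first finding can be written out as a formula. This is a sketch in notation assumed here for illustration, not taken from the paper: \mathcal{P} is the set of paths between the two distinguished vertices, \ell_{e,t} is the weight of edge e in round t, I_t is the path chosen in round t, and C(|E|) stands for an unspecified factor that is polynomial in the number of edges. The summary does not say in what probabilistic sense the bound holds, so that is left open.

\[
\frac{1}{n}\left(\sum_{t=1}^{n}\sum_{e\in I_t}\ell_{e,t}\;-\;\min_{i\in\mathcal{P}}\sum_{t=1}^{n}\sum_{e\in i}\ell_{e,t}\right)\;\le\;\frac{C(|E|)}{\sqrt{n}}
\]

The left-hand side is the average excess loss over the best path chosen in hindsight. Multiplying by n gives a cumulative regret of order \sqrt{n}.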
Abstract
The on-line shortest path problem is considered under various models of partial monitoring. Given a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way, a decision maker has to choose in each round of a game a path between two distinguished vertices such that the loss of the chosen path (defined as the sum of the weights of its composing edges) be as small as possible. In a setting generalizing the multi-armed bandit problem, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this problem, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to 1/\sqrt{n} and depends only polynomially on the number of edges of the graph. The algorithm can be…
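The bandit-style setting in the abstract, where only the weights of edges on the chosen path are revealed, can be illustrated with a small simulation. This is a minimal sketch and not the paper's algorithm: it assumes an Exp3-style exponentially weighted forecaster over explicitly enumerated paths, with importance-weighted edge gain estimates and uniform exploration over all paths. Explicit enumeration is exponential in general and is used only for readability. The names `enumerate_paths`, `run_path_bandit`, `loss_fn`, `eta`, and `gamma` are hypothetical.

```python
import math
import random


def enumerate_paths(edges, source, sink):
    """Return all source->sink paths of a DAG as tuples of edge indices."""
    out = {}
    for idx, (u, v) in enumerate(edges):
        out.setdefault(u, []).append((idx, v))
    paths = []

    def walk(node, prefix):
        if node == sink:
            paths.append(tuple(prefix))
            return
        for idx, nxt in out.get(node, []):
            walk(nxt, prefix + [idx])

    walk(source, [])
    return paths


def run_path_bandit(edges, source, sink, loss_fn, n_rounds,
                    eta=0.1, gamma=0.1, seed=0):
    """Simplified Exp3-style learner with edge-level feedback (assumed design).

    loss_fn(t) returns a list of edge losses in [0, 1] for round t; the
    learner only reads the entries for edges on the path it chose.
    """
    rng = random.Random(seed)
    paths = enumerate_paths(edges, source, sink)
    est_gain = [0.0] * len(edges)  # cumulative estimated gain per edge
    total_loss = 0.0
    for t in range(n_rounds):
        # Path score = eta * (sum of estimated gains of its edges).
        scores = [eta * sum(est_gain[e] for e in p) for p in paths]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        z = sum(weights)
        # Mix exponential weights with uniform exploration over paths.
        probs = [(1 - gamma) * w / z + gamma / len(paths) for w in weights]
        chosen = rng.choices(range(len(paths)), weights=probs)[0]
        # Probability that each edge lies on the sampled path.
        edge_prob = [0.0] * len(edges)
        for p, pr in zip(paths, probs):
            for e in p:
                edge_prob[e] += pr
        losses = loss_fn(t)
        for e in paths[chosen]:  # only these weights are observed
            total_loss += losses[e]
            # Importance-weighted gain estimate (gain = 1 - loss).
            est_gain[e] += (1.0 - losses[e]) / edge_prob[e]
    return total_loss, paths


# Usage: two disjoint routes from vertex 0 to vertex 3, one cheap, one costly.
demo_edges = [(0, 1), (1, 3), (0, 2), (2, 3)]
demo_losses = [0.1, 0.1, 0.9, 0.9]
demo_total, demo_paths = run_path_bandit(
    demo_edges, 0, 3, lambda t: demo_losses, n_rounds=2000)
```

In this toy run the losses are fixed rather than adversarial, so the learner should concentrate on the cheap route after a short exploration phase. An efficient implementation would avoid enumerating paths, which this sketch does not attempt.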
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
