A Constrained Randomized Shortest-Paths Framework for Optimal Exploration

Bertrand Lebichot; Guillaume Guex; Ilkka Kivimäki; Marco Saerens

arXiv:1807.04551 · cs.LG · July 13, 2018 · 5 cites

Open Access

TL;DR

This paper extends the randomized shortest-paths framework to handle equality constraints, yielding algorithms that balance exploration and exploitation in networks and Markov decision processes.

Contribution

It introduces a generic algorithm for the constrained randomized shortest-paths (RSP) problem, based on Lagrangian duality, and a simple iterative method for computing optimal randomized policies.

Findings

01 Algorithms effectively balance exploration and exploitation.

02 The approach generalizes soft Bellman-Ford and value iteration.

03 Simulation confirms the model's expected behavior.

Abstract

The present work extends the randomized shortest-paths framework (RSP), interpolating between shortest-path and random-walk routing in a network, in three directions. First, it shows how to deal with equality constraints on a subset of transition probabilities and develops a generic algorithm for solving this constrained RSP problem using Lagrangian duality. Second, it derives a surprisingly simple iterative procedure to compute the optimal, randomized, routing policy generalizing the previously developed "soft" Bellman-Ford algorithm. The resulting algorithm allows balancing exploitation and exploration in an optimal way by interpolating between a pure random behavior and the deterministic, optimal, policy (least-cost paths) while satisfying the constraints. Finally, the two algorithms are applied to Markov decision problems by considering the process as a constrained RSP on a…
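The interpolation between random-walk and least-cost routing described in the abstract can be illustrated with the unconstrained "soft" Bellman-Ford recurrence as it is commonly stated in the RSP literature: φ_i = -(1/θ) log Σ_j p_ij^ref exp(-θ (c_ij + φ_j)), with φ = 0 at the target. The following is a minimal sketch under that assumption, not the paper's constrained algorithm; the toy graph, the function names, and the θ values are hypothetical and chosen only for illustration.

```python
import math

INF = float("inf")

# Hypothetical toy graph (not from the paper): 0 -> 1 -> 2 costs 1 + 1,
# the direct edge 0 -> 2 costs 5. Node 2 is the absorbing target.
COST = [[INF, 1.0, 5.0],
        [INF, INF, 1.0],
        [INF, INF, INF]]
# Reference (random-walk) transition probabilities; 0 means "no edge".
P_REF = [[0.0, 0.5, 0.5],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0]]
TARGET = 2


def soft_bellman_ford(cost, p_ref, target, theta, n_iter=50):
    """Fixed-point iteration of the soft-minimum recurrence for phi."""
    n = len(cost)
    phi = [0.0] * n
    for _ in range(n_iter):
        new_phi = [0.0] * n  # phi stays 0 at the target
        for i in range(n):
            if i == target:
                continue
            total = sum(p_ref[i][j] * math.exp(-theta * (cost[i][j] + phi[j]))
                        for j in range(n) if p_ref[i][j] > 0)
            new_phi[i] = -math.log(total) / theta
        phi = new_phi
    return phi


def randomized_policy(cost, p_ref, phi, target, theta):
    """Reweight the reference probabilities by exp(-theta * cost-to-go)."""
    n = len(cost)
    policy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i == target:
            continue
        weights = [p_ref[i][j] * math.exp(-theta * (cost[i][j] + phi[j]))
                   if p_ref[i][j] > 0 else 0.0
                   for j in range(n)]
        z = sum(weights)
        policy[i] = [w / z for w in weights]
    return policy


for theta in (0.001, 20.0):
    phi = soft_bellman_ford(COST, P_REF, TARGET, theta)
    policy = randomized_policy(COST, P_REF, phi, TARGET, theta)
    print(theta, [round(p, 4) for p in policy[0]])
```

In this sketch, a small θ leaves the policy at node 0 close to the reference probabilities (exploration), while a large θ concentrates almost all probability on the cheaper route through node 1 (exploitation). Handling the equality constraints discussed in the abstract would require additional machinery that is not shown here.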

Taxonomy

Topics: Traffic control and management · Reinforcement Learning in Robotics · Optimization and Search Problems