On the convergence of cycle detection for navigational reinforcement learning

Tom J. Ameloot; Jan Van den Bussche

arXiv:1511.08724 · cs.LG · January 6, 2016

TL;DR

This paper proves that a simple cycle-detection learning algorithm converges on a class of navigation tasks called reducible, each of which has an acyclic solution. The result shows that even simple algorithms can succeed on nontrivial reinforcement learning tasks.

Contribution

It provides a formal proof of convergence for a cycle-detection algorithm in reducible tasks and characterizes the final policy structure.

Findings

1. Convergence is proved for reducible tasks, which have an acyclic solution.
2. The form of the final policy is characterized syntactically, which allows the convergence point to be detected precisely in a simulation.
3. Even simple algorithms can learn a large class of nontrivial navigation tasks.

Abstract

We consider a reinforcement learning framework where agents have to navigate from start states to goal states. We prove convergence of a cycle-detection learning algorithm on a class of tasks that we call reducible. Reducible tasks have an acyclic solution. We also syntactically characterize the form of the final policy. This characterization can be used to precisely detect the convergence point in a simulation. Our result demonstrates that even simple algorithms can be successful in learning a large class of nontrivial tasks. In addition, our framework is elementary in the sense that we only use basic concepts to formally prove convergence.
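
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of cycle-detection learning, not the authors' formal definition. It assumes a hypothetical deterministic toy task (a line of cells 0 to 4 with the goal at cell 4), a hypothetical rule that re-draws a state's action at random whenever that state is revisited within a trial, and a stand-in convergence check (a trial with no revisions) in place of the paper's syntactic characterization. All names (`ACTIONS`, `GOAL`, `step`, `run_trial`, `learn`) are illustrative.

```python
import random

ACTIONS = ("left", "right")  # hypothetical action set
GOAL = 4                     # hypothetical goal state on a line of cells 0..4


def step(state, action):
    """Deterministic toy transition: move one cell, clamped to [0, GOAL]."""
    return min(GOAL, state + 1) if action == "right" else max(0, state - 1)


def run_trial(policy, start, rng, max_steps=10_000):
    """Follow the policy from start to the goal; return the number of revisions."""
    visited = set()
    state, revisions = start, 0
    for _ in range(max_steps):
        if state == GOAL:
            return revisions
        if state in visited:
            # Cycle detected: this state was already seen in the current trial,
            # so its action is re-drawn at random (assumed revision rule).
            policy[state] = rng.choice(ACTIONS)
            revisions += 1
        visited.add(state)
        state = step(state, policy[state])
    raise RuntimeError("trial did not reach the goal within max_steps")


def learn(policy, start=0, seed=0, max_trials=100):
    """Repeat trials until one finishes without any revision; return its index."""
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        if run_trial(policy, start, rng) == 0:
            return trial
    raise RuntimeError("no revision-free trial within max_trials")


# Start from a deliberately bad policy that always moves away from the goal.
policy = {s: "left" for s in range(GOAL)}
trials = learn(policy)
print(trials, policy)
```

In this toy task the only revision-free path from cell 0 to the goal moves right at every cell, so the policy left behind by `learn` is acyclic on the states it visits. The paper's setting is more general, and this sketch makes no claim about how its proof or its convergence test works.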

Taxonomy

Topics: Reinforcement Learning in Robotics · Optimization and Search Problems · Game Theory and Applications