Induction of Subgoal Automata for Reinforcement Learning

Daniel Furelos-Blanco, Mark Law, Alessandra Russo, Krysia Broda, and Anders Jonsson

arXiv:1911.13152 · cs.LG · December 2, 2019

TL;DR

This paper introduces ISA, a method that uses inductive logic programming to learn subgoal automata during reinforcement learning. The learned automata can be exploited to speed up convergence through reward shaping and transfer learning across tasks.

Contribution

ISA is a novel approach that induces subgoal automata from the observation traces perceived by an RL agent. Automaton learning and reinforcement learning are interleaved: the automaton is refined whenever the agent generates a trace that the current automaton does not recognize.

Findings

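01 In several gridworld problems, ISA performs similarly to a method for which the automata are given in advance.

02 The learned automata can be exploited to speed up convergence through reward shaping and transfer learning across multiple tasks.

03 The approach's efficiency depends on the number of observable events.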

Abstract

In this work we present ISA, a novel approach for learning and exploiting subgoals in reinforcement learning (RL). Our method relies on inducing an automaton whose transitions are subgoals expressed as propositional formulas over a set of observable events. A state-of-the-art inductive logic programming system is used to learn the automaton from observation traces perceived by the RL agent. The reinforcement learning and automaton learning processes are interleaved: a new refined automaton is learned whenever the RL agent generates a trace not recognized by the current automaton. We evaluate ISA in several gridworld problems and show that it performs similarly to a method for which automata are given in advance. We also show that the learned automata can be exploited to speed up convergence through reward shaping and transfer learning across multiple tasks. Finally, we analyze the…
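
To make the abstract's description more concrete, here is a minimal sketch of an automaton whose transitions are propositional formulas over observable events, together with the check that would trigger relearning when a trace is not recognized. This is an illustration under stated assumptions, not the authors' implementation: the class and function names, the event names (`coffee`, `office`, `plant`), the state names, and the simplified notion of "not recognized" (automaton acceptance disagreeing with whether the agent reached the goal) are all hypothetical. The inductive logic programming step that would produce a refined automaton is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

# A transition condition is a conjunction of literals over observable events:
# (events that must be observed, events that must NOT be observed).
Condition = Tuple[FrozenSet[str], FrozenSet[str]]


def cond(required: Iterable[str] = (), forbidden: Iterable[str] = ()) -> Condition:
    """Build a conjunction such as `office AND NOT plant` (hypothetical helper)."""
    return frozenset(required), frozenset(forbidden)


@dataclass
class SubgoalAutomaton:
    initial: str
    accepting: str
    rejecting: Optional[str] = None
    # state -> list of (condition, next_state). Several entries leading to the
    # same next_state act as a disjunction of conjunctions.
    edges: Dict[str, List[Tuple[Condition, str]]] = field(default_factory=dict)

    def step(self, state: str, observation: Set[str]) -> str:
        """Follow the first outgoing edge whose formula holds; otherwise stay."""
        for (required, forbidden), next_state in self.edges.get(state, []):
            if required <= observation and not (forbidden & observation):
                return next_state
        return state

    def run(self, trace: Iterable[Set[str]]) -> str:
        """Return the automaton state reached after processing a whole trace."""
        state = self.initial
        for observation in trace:
            state = self.step(state, observation)
        return state


def is_counterexample(automaton: SubgoalAutomaton,
                      trace: List[Set[str]],
                      reached_goal: bool) -> bool:
    """Simplified assumption: a trace is unrecognized when the automaton's
    verdict (accepting or not) disagrees with what the environment reported."""
    return (automaton.run(trace) == automaton.accepting) != reached_goal


# Hypothetical task: observe `coffee`, then `office` without observing `plant`.
automaton = SubgoalAutomaton(
    initial="u0",
    accepting="u_acc",
    edges={
        "u0": [(cond(required=["coffee"]), "u1")],
        "u1": [(cond(required=["office"], forbidden=["plant"]), "u_acc")],
    },
)

if __name__ == "__main__":
    goal_trace = [set(), {"coffee"}, set(), {"office"}]
    print(automaton.run(goal_trace))
    print(is_counterexample(automaton, [{"office"}], reached_goal=True))
```

In an interleaved loop of the kind the abstract describes, a trace flagged by `is_counterexample` would be added to the set of examples and a refined automaton would be learned from them before reinforcement learning continues.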
