Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning

Beyazit Yalcinkaya; Niklas Lauffer; Marcell Vazquez-Chanlatte; Sanjit A. Seshia

arXiv:2411.00205 · cs.LG · January 16, 2025

TL;DR

This paper introduces a novel approach to goal-conditioned reinforcement learning by representing goals with compositional deterministic finite automata (cDFAs), enabling better interpretability and zero-shot generalization.

Contribution

It proposes pre-training graph neural network embeddings on reach-avoid derived cDFAs to improve generalization and policy learning in goal-conditioned RL.

Findings

01. Enables zero-shot generalization to various cDFA task classes

02. Accelerates policy specialization without hierarchical suboptimality

03. Balances formal semantics with interpretability of automata

Abstract

Goal-conditioned reinforcement learning is a powerful way to control an AI agent's behavior at runtime. That said, popular goal representations, e.g., target states or natural language, are either limited to Markovian tasks or rely on ambiguous task semantics. We propose representing temporal goals using compositions of deterministic finite automata (cDFAs) and use cDFAs to guide RL agents. cDFAs balance the need for formal temporal semantics with ease of interpretation: if one can understand a flow chart, one can understand a cDFA. On the other hand, cDFAs form a countably infinite concept class with Boolean semantics, and subtle changes to the automaton can result in very different tasks, making them difficult to condition agent behavior on. To address this, we observe that all paths through a DFA correspond to a series of reach-avoid tasks and propose pre-training graph neural…
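To make the terms in the abstract concrete, the following is a minimal sketch of a DFA, a reach-avoid task expressed as a DFA, and a cDFA evaluated over a sequence of observed symbols. It is an illustration rather than the authors' implementation: the names `DFA`, `reach_avoid_chain`, and `cdfa_accepts` are hypothetical, missing transitions are assumed to be self-loops, and composition is assumed to be Boolean conjunction (the cDFA is satisfied only when every component DFA accepts).

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """A deterministic finite automaton over arbitrary hashable symbols."""
    start: str
    accepting: set
    transitions: dict  # (state, symbol) -> next state

    def run(self, word):
        # Assumption: any (state, symbol) pair not listed is a self-loop.
        state = self.start
        for symbol in word:
            state = self.transitions.get((state, symbol), state)
        return state

    def accepts(self, word):
        return self.run(word) in self.accepting


def reach_avoid_chain(steps):
    """Build a DFA for a sequence of (reach, avoid) sub-tasks.

    In state q_i the agent must see `reach` before `avoid`; seeing
    `avoid` first moves to the rejecting sink "rej".
    """
    transitions = {}
    for i, (reach, avoid) in enumerate(steps):
        transitions[(f"q{i}", reach)] = f"q{i + 1}"
        transitions[(f"q{i}", avoid)] = "rej"
    return DFA(start="q0", accepting={f"q{len(steps)}"}, transitions=transitions)


def cdfa_accepts(dfas, word):
    # Assumed conjunctive semantics: every component DFA must accept.
    return all(dfa.accepts(word) for dfa in dfas)


# One temporal goal: reach red (avoiding blue), then reach green (avoiding yellow).
chain = reach_avoid_chain([("red", "blue"), ("green", "yellow")])

# A cDFA made of two single-step reach-avoid DFAs.
task = [
    reach_avoid_chain([("red", "blue")]),
    reach_avoid_chain([("green", "blue")]),
]

print(chain.accepts(["red", "green"]))
print(cdfa_accepts(task, ["red", "empty", "green"]))
```

Under these assumptions, a small change to a single transition (for example swapping which symbol leads to the rejecting sink) yields a different accepted language, which is the sensitivity the abstract points to when it says cDFAs are difficult to condition agent behavior on.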

Videos

Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning · SlidesLive

Taxonomy

Topics: Optimization and Search Problems · Reinforcement Learning in Robotics · Modular Robots and Swarm Intelligence

Methods: Graph Neural Network · Deterministic Finite Automata