Synthesizing Programmatic Policies with Actor-Critic Algorithms and ReLU Networks

Spyros Orfanos; Levi H. S. Lelis

arXiv:2308.02729 · cs.LG · August 8, 2023 · 1 cites

TL;DR

This paper shows that, depending on the language used to encode the policies, actor-critic algorithms combined with ReLU networks can directly synthesize interpretable programmatic policies without PIRL-specific algorithms, producing effective, human-readable control policies.

Contribution

The paper translates ReLU-network policies trained with actor-critic algorithms into programmatic policies, using the connection between ReLU networks and oblique decision trees, and so avoids PIRL-specific search algorithms.

Findings

01. Translated policies are short and effective.

02. The approach outperforms PIRL algorithms in several control tasks.

03. Policies are human-readable with if-then-else and linear structures.

Abstract

Programmatically Interpretable Reinforcement Learning (PIRL) encodes policies in human-readable computer programs. Novel algorithms were recently introduced with the goal of handling the lack of gradient signal to guide the search in the space of programmatic policies. Most such PIRL algorithms first train a neural policy that is used as an oracle to guide the search in the programmatic space. In this paper, we show that such PIRL-specific algorithms are not needed, depending on the language used to encode the programmatic policies. This is because one can use actor-critic algorithms to directly obtain a programmatic policy. We use a connection between ReLU neural networks and oblique decision trees to translate the policy learned with actor-critic algorithms into programmatic policies. This translation from ReLU networks allows us to synthesize policies encoded in programs with…
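
The connection between ReLU networks and oblique decision trees mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single hidden layer, and the weights `W1`, `b1`, `W2`, `b2` and all function names are hypothetical values chosen for illustration. The idea is that each hidden unit contributes one oblique test (a linear inequality over the inputs), and once the on/off pattern of the units is fixed, the network reduces to a linear function of the input, which becomes the leaf.

```python
# Hypothetical tiny policy network: 2 inputs, 2 hidden ReLU units, 1 output.
# All weights are made-up illustration values, not taken from the paper.
W1 = [[1.0, -1.0], [0.5, 2.0]]  # one row of input weights per hidden unit
b1 = [0.0, -1.0]                # hidden biases
W2 = [2.0, -3.0]                # output weights, one per hidden unit
b2 = 0.5                        # output bias


def pre_activation(x, i):
    """Linear input to hidden unit i, before the ReLU."""
    return sum(w * xi for w, xi in zip(W1[i], x)) + b1[i]


def relu_net(x):
    """Forward pass of the original network."""
    return sum(W2[i] * max(0.0, pre_activation(x, i))
               for i in range(len(W1))) + b2


def leaf_linear(pattern):
    """Linear function (coeffs, bias) computed by the network when
    hidden unit i is active exactly when pattern[i] is True."""
    coeffs = [0.0] * len(W1[0])
    bias = b2
    for i, active in enumerate(pattern):
        if active:
            for j in range(len(coeffs)):
                coeffs[j] += W2[i] * W1[i][j]
            bias += W2[i] * b1[i]
    return coeffs, bias


def tree_policy(x):
    """Evaluate the equivalent oblique tree: test each unit, then apply the leaf."""
    pattern = tuple(pre_activation(x, i) > 0 for i in range(len(W1)))
    coeffs, bias = leaf_linear(pattern)
    return sum(c * xi for c, xi in zip(coeffs, x)) + bias


def to_program(depth=0, pattern=()):
    """Render the tree as nested if-then-else text with linear leaves."""
    indent = "    " * depth
    if depth == len(W1):
        coeffs, bias = leaf_linear(pattern)
        terms = " + ".join(f"{c:g}*x[{j}]" for j, c in enumerate(coeffs))
        return f"{indent}return {terms} + {bias:g}\n"
    cond = " + ".join(f"{w:g}*x[{j}]" for j, w in enumerate(W1[depth]))
    return (f"{indent}if {cond} + {b1[depth]:g} > 0:\n"
            + to_program(depth + 1, pattern + (True,))
            + f"{indent}else:\n"
            + to_program(depth + 1, pattern + (False,)))


if __name__ == "__main__":
    print(to_program())
```

In this sketch the tree has one level per hidden unit, so the number of leaves doubles with each unit added. That is one reason a translation of this kind is most readable for small networks, consistent with the finding that the translated policies are short.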

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics

Methods: Jigsaw · PIRL