Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

Jiaming Guo; Rui Zhang; Shaohui Peng; Qi Yi; Xing Hu; Ruizhi Chen; Zidong Du; Xishan Zhang; Ling Li; Qi Guo; Yunji Chen

arXiv:2311.02104 · cs.LG · November 7, 2023 · 1 citation

TL;DR

This paper introduces ESPL, a gradient-based method for learning compact, interpretable symbolic policies from scratch in reinforcement learning, applicable to both single-task and meta-RL, with improved efficiency and performance.

Contribution

It presents a novel end-to-end differentiable symbolic policy learning approach that efficiently generates symbolic policies without pre-trained models, extending to meta-RL tasks.

Findings

01. Symbolic policies outperform neural networks in data efficiency.

02. ESPL achieves higher performance in single-task RL.

03. Symbolic policies demonstrate potential for interpretability.

Abstract

Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes them difficult to understand and to deploy under limited computational resources. Employing compact symbolic expressions as policies is a promising strategy for obtaining simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which makes them inefficient and limits the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way. We introduce a symbolic network as the search space and employ a path selector to find the compact symbolic policy. By doing so we represent the…
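
To make the idea of a symbolic network with a path selector more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the operator set, the function names (`symbolic_layer`, `policy`, `expression`), the single-layer layout, and the hard 0/1 mask are all assumptions made for illustration. The sketch only shows how gating paths in a small operator network leaves a short closed-form expression; it does not show the gradient-based training that the abstract describes.

```python
import math

# Hypothetical operator set for one symbolic layer (an assumption, not taken
# from the paper). Each unit applies one operator to a weighted input sum.
OPERATORS = [
    ("identity", lambda z: z),
    ("sin", math.sin),
    ("cos", math.cos),
    ("square", lambda z: z * z),
]


def symbolic_layer(x, weights, mask):
    """Apply each operator to a masked linear combination of the inputs.

    weights[i][j] scales input j on the way into operator i. mask[i][j] is a
    0/1 gate standing in for a path selector: a gated-off path contributes
    nothing and therefore drops out of the final expression.
    """
    outputs = []
    for (_, op), w_row, m_row in zip(OPERATORS, weights, mask):
        z = sum(w * m * xj for w, m, xj in zip(w_row, m_row, x))
        outputs.append(op(z))
    return outputs


def policy(x, weights, mask, out_weights, out_mask):
    """Compute the action as a masked weighted sum of the operator outputs."""
    hidden = symbolic_layer(x, weights, mask)
    return sum(w * m * h for w, m, h in zip(out_weights, out_mask, hidden))


def expression(weights, mask, out_weights, out_mask, names):
    """Render the paths that survive the mask as a readable formula."""
    terms = []
    for (op_name, _), w_row, m_row, ow, om in zip(
        OPERATORS, weights, mask, out_weights, out_mask
    ):
        if not om:
            continue  # the whole operator unit is gated off
        inner = [f"{w:g}*{n}" for w, m, n in zip(w_row, m_row, names) if m]
        body = " + ".join(inner) or "0"
        terms.append(f"{ow:g}*{op_name}({body})")
    return " + ".join(terms) if terms else "0"


# Illustrative values only: two observations, four operator units.
names = ["pos", "vel"]
weights = [[0.5, -1.2], [2.0, 0.3], [0.7, 0.7], [1.0, 1.0]]
mask = [[1, 0], [0, 1], [0, 0], [0, 0]]
out_weights = [1.5, -0.8, 0.4, 0.2]
out_mask = [1, 1, 0, 0]

formula = expression(weights, mask, out_weights, out_mask, names)
action = policy([2.0, 1.0], weights, mask, out_weights, out_mask)
print(formula)  # 1.5*identity(0.5*pos) + -0.8*sin(0.3*vel)
print(action)
```

With these made-up values, only two of the eight input paths and two of the four operator units stay active, so the policy reduces to a two-term formula that can be read directly. In a gradient-based setting the hard mask would presumably be replaced by a differentiable relaxation; that detail is outside this sketch.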

Code & Models

Repositories

guojm14/ESPL · PyTorch · Official

Videos

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Evolutionary Algorithms and Applications