Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

Sascha Marton; Tim Grams; Florian Vogt; Stefan Lüdtke; Christian Bartelt; Heiner Stuckenschmidt

arXiv:2408.08761 · cs.LG · March 12, 2025

TL;DR

SYMPOL is an interpretable, tree-based policy gradient method for reinforcement learning. It learns decision-tree policies end to end and outperforms alternative tree-based RL approaches in both performance and interpretability.

Contribution

The paper presents SYMPOL, a new method for learning symbolic, tree-based policies directly within on-policy RL, combining interpretability with gradient-based optimization.

Findings

01

SYMPOL outperforms alternative tree-based RL methods on a set of benchmark RL tasks.

02

It enables end-to-end differentiable learning of decision trees.

03

SYMPOL maintains high interpretability while achieving competitive performance.

Abstract

Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based,…
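
To make the idea of a symbolic, tree-based policy concrete, here is a minimal sketch of what such a policy could look like at inference time. It is an illustration only, not SYMPOL's implementation or training procedure: the axis-aligned splits, the `Leaf`/`Split` classes, the `action_probs` and `describe` helpers, and the feature names are all assumptions made for this example, and the gradient-based learning of the tree is not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    # Hypothetical leaf: one unnormalized score (logit) per discrete action.
    logits: List[float]


@dataclass
class Split:
    # Hypothetical axis-aligned test: go left if obs[feature] <= threshold.
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def action_probs(tree: Node, obs: List[float]) -> List[float]:
    """Route the observation to a leaf, then apply a softmax to its logits."""
    node = tree
    while isinstance(node, Split):
        node = node.left if obs[node.feature] <= node.threshold else node.right
    m = max(node.logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in node.logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(node: Node, names: List[str], depth: int = 0) -> List[str]:
    """Render the tree as nested if/else rules, one string per line."""
    pad = "  " * depth
    if isinstance(node, Leaf):
        return [f"{pad}action logits: {node.logits}"]
    return (
        [f"{pad}if {names[node.feature]} <= {node.threshold}:"]
        + describe(node.left, names, depth + 1)
        + [f"{pad}else:"]
        + describe(node.right, names, depth + 1)
    )


# Made-up two-feature, two-action example (not taken from the paper).
FEATURES = ["pole_angle", "pole_velocity"]
tree = Split(0, 0.0, left=Leaf([2.0, 0.0]), right=Leaf([0.0, 2.0]))

probs = action_probs(tree, [-0.1, 0.3])
print("\n".join(describe(tree, FEATURES)))
print(probs)
```

The point of the sketch is the interpretability argument: the whole policy can be printed as a handful of threshold rules, which is not possible for a neural network policy of comparable performance.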

Code & Models

Repositories

s-marton/sympol (JAX, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training