Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization
Sascha Marton, Tim Grams, Florian Vogt, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

TL;DR
SYMPOL introduces a novel, interpretable, tree-based policy gradient method for reinforcement learning, enabling end-to-end learning of decision trees that outperform existing approaches in interpretability and performance.
Contribution
The paper presents SYMPOL, a new method for learning symbolic, tree-based policies directly within on-policy RL, combining interpretability with gradient-based optimization.
Findings
SYMPOL outperforms alternative tree-based RL approaches on a set of benchmark RL tasks.
It enables end-to-end differentiable learning of decision trees.
SYMPOL maintains high interpretability while achieving competitive performance.
Abstract
Reinforcement learning (RL) has seen significant success across various domains, but its adoption is often limited by the black-box nature of neural network policies, making them difficult to interpret. In contrast, symbolic policies allow representing decision-making strategies in a compact and interpretable way. However, learning symbolic policies directly within on-policy methods remains challenging. In this paper, we introduce SYMPOL, a novel method for SYMbolic tree-based on-POLicy RL. SYMPOL employs a tree-based model integrated with a policy gradient method, enabling the agent to learn and adapt its actions while maintaining a high level of interpretability. We evaluate SYMPOL on a set of benchmark RL tasks, demonstrating its superiority over alternative tree-based RL approaches in terms of performance and interpretability. Unlike existing methods, it enables gradient-based,…
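To make the idea of a tree-based policy trained with a policy gradient concrete, the following is a minimal sketch and not the SYMPOL implementation. It assumes a hypothetical depth-1, axis-aligned tree whose split is fixed by hand, and it applies a plain REINFORCE update to the leaf action logits only. The class name `TreePolicy`, the toy one-step task, and all hyperparameters are illustrative assumptions that do not come from the paper; learning the splits themselves by gradient, as the abstract describes, is not covered here.

```python
import math
import random


class TreePolicy:
    """Hypothetical depth-1 axis-aligned tree: one split, two leaves of action logits."""

    def __init__(self, feature, threshold, n_actions):
        self.feature = feature        # index of the observation feature to test
        self.threshold = threshold    # fixed split threshold (not learned in this sketch)
        self.leaf_logits = [[0.0] * n_actions for _ in range(2)]

    def leaf(self, obs):
        # Left leaf (0) if the tested feature is <= threshold, else right leaf (1).
        return 0 if obs[self.feature] <= self.threshold else 1

    def probs(self, obs):
        # Softmax over the logits stored in the leaf that the observation reaches.
        logits = self.leaf_logits[self.leaf(obs)]
        m = max(logits)
        exps = [math.exp(v - m) for v in logits]
        z = sum(exps)
        return [e / z for e in exps]

    def act(self, obs, rng):
        p = self.probs(obs)
        return rng.choices(range(len(p)), weights=p)[0]

    def greedy(self, obs):
        p = self.probs(obs)
        return p.index(max(p))

    def reinforce_step(self, obs, action, ret, lr):
        # REINFORCE: the gradient of log softmax w.r.t. the logits is onehot(action) - probs.
        p = self.probs(obs)
        logits = self.leaf_logits[self.leaf(obs)]
        for a in range(len(logits)):
            grad_logp = (1.0 if a == action else 0.0) - p[a]
            logits[a] += lr * ret * grad_logp

    def rules(self):
        # Render the learned policy as a human-readable if/else rule.
        left = self.leaf_logits[0].index(max(self.leaf_logits[0]))
        right = self.leaf_logits[1].index(max(self.leaf_logits[1]))
        return (f"if obs[{self.feature}] <= {self.threshold}: "
                f"action {left} else: action {right}")


def train(steps=500, lr=0.5, seed=0):
    # Toy one-step task (an assumption): reward 1 if the action matches the sign of obs[0].
    rng = random.Random(seed)
    policy = TreePolicy(feature=0, threshold=0.0, n_actions=2)
    for _ in range(steps):
        obs = [rng.uniform(-1.0, 1.0)]
        action = policy.act(obs, rng)
        reward = 1.0 if action == int(obs[0] > 0.0) else 0.0
        policy.reinforce_step(obs, action, reward, lr)
    return policy


if __name__ == "__main__":
    print(train().rules())
```

Because the trained policy is a single if/else rule over one observation feature, it can be read directly, which is the kind of interpretability the abstract contrasts with black-box neural network policies.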
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Sparse Evolutionary Training
