There is no Accuracy-Interpretability Tradeoff in Reinforcement Learning for Mazes
Yishay Mansour, Michal Moshkovitz, Cynthia Rudin

TL;DR
This paper demonstrates that, for maze reinforcement learning problems under certain conditions, optimal policies can be represented by small, interpretable decision trees, challenging the commonly assumed accuracy-interpretability tradeoff.
Contribution
The paper proves the existence of small, interpretable decision trees representing optimal policies in maze RL problems, introducing a new compressing technique.
Findings
Optimal policies can be represented by small decision trees.
In constant dimension, interpretability does not reduce accuracy.
A new compressing technique is introduced for policy representation.
Abstract
Interpretability is an essential building block for trustworthiness in reinforcement learning systems. However, interpretability might come at the cost of deteriorated performance, leading many researchers to build complex models. Our goal is to analyze the cost of interpretability. We show that in certain cases, one can achieve policy interpretability while maintaining its optimality. We focus on a classical problem from reinforcement learning: mazes with $k$ obstacles in $\mathbb{R}^d$. We prove the existence of a small decision tree with a linear function at each inner node and depth $O(\log k + 2^d)$ that represents an optimal policy. Note that for the interesting case of a constant $d$, we have $O(\log k)$ depth. Thus, in this setting, there is no accuracy-interpretability tradeoff. To prove this result, we use a new "compressing" technique that might be useful in additional…
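To make the policy representation concrete, here is a minimal sketch of a decision tree whose inner nodes each test a linear function of the state and whose leaves each return an action. The class names (`Leaf`, `Split`), the action labels, and the toy tree itself are illustrative assumptions, not taken from the paper; the tree below is not claimed to be optimal for any particular maze.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    """A leaf of the policy tree: always returns one action."""
    action: str


@dataclass
class Split:
    """An inner node testing the linear function weights . state <= threshold."""
    weights: Sequence[float]
    threshold: float
    low: "Tree"   # subtree taken when weights . state <= threshold
    high: "Tree"  # subtree taken otherwise


Tree = Union[Leaf, Split]


def act(tree: Tree, state: Sequence[float]) -> str:
    """Walk from the root to a leaf and return the action stored there."""
    while isinstance(tree, Split):
        score = sum(w * x for w, x in zip(tree.weights, state))
        tree = tree.low if score <= tree.threshold else tree.high
    return tree.action


def depth(tree: Tree) -> int:
    """Number of linear tests on the longest root-to-leaf path."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(tree.low), depth(tree.high))


# Hypothetical toy policy over 2-D states (x, y); purely illustrative.
toy_policy = Split(
    weights=(1.0, 0.0), threshold=2.0,        # test: x <= 2
    low=Leaf("right"),
    high=Split(
        weights=(1.0, -1.0), threshold=0.0,   # test: x - y <= 0
        low=Leaf("down"),
        high=Leaf("up"),
    ),
)
```

Calling `act(toy_policy, state)` evaluates at most `depth(toy_policy)` linear tests, which is why a shallow tree of this form is easy to read: each decision is explained by a short chain of linear inequalities on the state.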
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning
