Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning

Hector Kohler; Quentin Delfosse; Riad Akrour; Kristian Kersting; Philippe Preux

arXiv:2405.14956·cs.AI·May 27, 2024

Open Access · 1 Repo

TL;DR

This paper introduces INTERPRETER, a fast distillation method that produces interpretable, editable tree programs as reinforcement learning policies, so that misaligned agent behavior can be detected and corrected.

Contribution

The paper presents a fast distillation approach for generating interpretable and editable tree policies in reinforcement learning. It targets two limitations of earlier interpretable-policy methods: inefficiency and a heavy reliance on human priors.

Findings

01 Compact tree policies match oracles across a diverse set of sequential decision tasks

02 Policies can be interpreted and edited to correct misalignments on Atari games

03 Tree policies can be used to explain real farming strategies
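
To make the second finding concrete, here is a minimal sketch of what "editing" a tree policy can look like once the policy is a small readable program. It is not taken from the paper or its repository: the nested-dict representation, the feature names, the threshold and the actions are all hypothetical, and the misalignment (tracking the opponent rather than the ball) is assumed for illustration only.

```python
# Toy tree policy stored as nested dicts. Feature names, thresholds and
# actions are hypothetical and chosen only for illustration.

def act(node, obs):
    """Follow branches until a leaf (a plain action string) is reached."""
    while isinstance(node, dict):
        branch = "then" if obs[node["feature"]] <= node["threshold"] else "else"
        node = node[branch]
    return node


def to_program(node, indent=0):
    """Render the tree as readable if/else text."""
    pad = "    " * indent
    if not isinstance(node, dict):
        return f"{pad}return {node!r}\n"
    return (
        f"{pad}if {node['feature']} <= {node['threshold']}:\n"
        + to_program(node["then"], indent + 1)
        + f"{pad}else:\n"
        + to_program(node["else"], indent + 1)
    )


# A policy that tracks the opponent instead of the ball (assumed misalignment).
policy = {
    "feature": "enemy_y_minus_player_y",
    "threshold": 0.0,
    "then": "UP",
    "else": "DOWN",
}

obs = {"enemy_y_minus_player_y": -3.0, "ball_y_minus_player_y": 5.0}
before = act(policy, obs)

# The edit: make the test depend on the ball instead of the opponent.
policy["feature"] = "ball_y_minus_player_y"
after = act(policy, obs)

print(to_program(policy))
print(before, "->", after)
```

Because the whole policy is a few lines of if/else logic, the edit is a single assignment, and its effect on any observation can be checked directly by calling `act` before and after the change.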

Abstract

Deep reinforcement learning agents are prone to goal misalignments. The black-box nature of their policies hinders the detection and correction of such misalignments, and the trust necessary for real-world deployment. So far, solutions learning interpretable policies are inefficient or require many human priors. We propose INTERPRETER, a fast distillation method producing INTerpretable Editable tRee Programs for ReinforcEmenT lEaRning. We empirically demonstrate that INTERPRETER's compact tree programs match oracles across a diverse set of sequential decision tasks and evaluate the impact of our design choices on interpretability and performance. We show that our policies can be interpreted and edited to correct misalignments on Atari games and to explain real farming strategies.
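
The general idea of distilling an oracle into a tree can be sketched in a few lines. The example below is a generic behavior-cloning illustration, not INTERPRETER's actual algorithm: the `oracle` function is a hypothetical stand-in for a trained black-box policy, the two-dimensional state is invented, and the greedy Gini-based tree learner is a textbook substitute for whatever tree learner the authors use.

```python
import random
from collections import Counter


def oracle(state):
    """Hypothetical stand-in for a trained black-box policy."""
    position, velocity = state
    return "LEFT" if position + 0.5 * velocity > 0 else "RIGHT"


def gini(labels):
    """Gini impurity of a non-empty list of action labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_tree(states, actions, depth):
    """Greedy axis-aligned tree. Leaves are actions; inner nodes are
    (feature_index, threshold, left_subtree, right_subtree) tuples."""
    majority = Counter(actions).most_common(1)[0][0]
    if depth == 0 or len(set(actions)) == 1:
        return majority
    best = None
    for f in range(len(states[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted({s[f] for s in states})[:-1]:
            left = [a for s, a in zip(states, actions) if s[f] <= t]
            right = [a for s, a in zip(states, actions) if s[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(actions)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority
    _, f, t = best
    lo = [(s, a) for s, a in zip(states, actions) if s[f] <= t]
    hi = [(s, a) for s, a in zip(states, actions) if s[f] > t]
    return (
        f,
        t,
        fit_tree([s for s, _ in lo], [a for _, a in lo], depth - 1),
        fit_tree([s for s, _ in hi], [a for _, a in hi], depth - 1),
    )


def predict(tree, state):
    """Descend from the root until a leaf action is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if state[f] <= t else right
    return tree


rng = random.Random(0)


def sample(n):
    """Draw n random states; a real setup would roll out the policy instead."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


# Distillation: label states with the oracle's actions, then fit a small tree.
train = sample(200)
tree = fit_tree(train, [oracle(s) for s in train], depth=3)

# Fidelity: how often the tree agrees with the oracle on fresh states.
held_out = sample(200)
agreement = sum(predict(tree, s) == oracle(s) for s in held_out) / len(held_out)
print(tree)
print(f"agreement with oracle: {agreement:.2f}")
```

The depth limit is the knob that trades interpretability against fidelity here: a shallower tree is easier to read but agrees with the oracle less often. Sampling states uniformly is a simplification; in a sequential task the states would normally come from rollouts in the environment.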

Code & Models

Repositories

KohlerHECTOR/interpreter-py (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification

Methods: Sparse Evolutionary Training