Limits of Actor-Critic Algorithms for Decision Tree Policies Learning in IBMDPs

Hector Kohler; Riad Akrour; Philippe Preux

arXiv:2309.13365 · cs.LG · January 23, 2024

Open Access

TL;DR

This paper investigates the limits of actor-critic algorithms for learning decision tree policies in Iterative Bounding MDPs (IBMDPs). It shows that deep RL approaches can fail in this setting and proposes an efficient alternative for the case where the underlying problem is a supervised classification task.

Contribution

The paper demonstrates failure modes of deep RL in partially observable settings and introduces an approach that learns optimal decision trees by formulating the learning problem as a fully observable MDP.

Findings

01 Deep RL can fail on simple toy tasks for decision tree (DT) learning.

02 Optimal decision trees can be learned efficiently when the problem is formulated as a fully observable MDP.

03 The resulting algorithms outperform classical greedy methods for DT learning.

Abstract

Interpretability of AI models allows for user safety checks to build trust in such AIs. In particular, Decision Trees (DTs) provide a global look at the learned model and transparently reveal which features of the input are critical for making a decision. However, interpretability is hindered if the DT is too large. To learn compact trees, a recent Reinforcement Learning (RL) framework has been proposed to explore the space of DTs using deep RL. This framework augments a decision problem (e.g. a supervised classification task) with additional actions that gather information about the features of an otherwise hidden input. By appropriately penalizing these actions, the agent learns to optimally trade off size and performance of DTs. In practice, a reactive policy for a partially observable Markov decision process (MDP) needs to be learned, which is still an open problem. We show in this…
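
The augmentation described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the class `ToyFeatureQueryEnv`, the function `depth_one_tree`, the penalty value, and the assumption that features lie in [0, 1] are all hypothetical choices made for illustration. The agent never sees the input directly. It can either pay a penalty to test one feature against a threshold, which tightens the bounds it observes, or predict a class and end the episode.

```python
class ToyFeatureQueryEnv:
    """Hypothetical toy episode: classify a hidden input using penalized feature tests."""

    def __init__(self, x, label, penalty=0.1):
        self.x = list(x)          # hidden input features, assumed to lie in [0, 1]
        self.label = label        # true class of the hidden input
        self.penalty = penalty    # assumed cost of one information-gathering action
        # The observation is one (low, high) interval per feature.
        self.bounds = [(0.0, 1.0) for _ in self.x]

    def test(self, feature, threshold):
        """Information-gathering action: compare one feature to a threshold."""
        lo, hi = self.bounds[feature]
        if self.x[feature] <= threshold:
            hi = min(hi, threshold)
        else:
            lo = max(lo, threshold)
        self.bounds[feature] = (lo, hi)
        return list(self.bounds), -self.penalty, False

    def predict(self, guess):
        """Base action: commit to a class, which ends the episode."""
        reward = 1.0 if guess == self.label else 0.0
        return list(self.bounds), reward, True


def depth_one_tree(env):
    """Reactive policy that depends only on the current bounds.

    It corresponds to a tree with a single split on feature 0 at 0.5.
    """
    total = 0.0
    obs, reward, _ = env.test(0, 0.5)
    total += reward
    _, hi = obs[0]
    guess = 0 if hi <= 0.5 else 1
    _, reward, _ = env.predict(guess)
    total += reward
    return total


if __name__ == "__main__":
    env = ToyFeatureQueryEnv(x=[0.2, 0.9], label=0)
    print(depth_one_tree(env))
```

In this sketch each feature test costs the same fixed penalty, so a policy that needs more tests (a deeper tree) collects less return. That is the size versus performance trade-off the abstract refers to.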

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics

Methods: fail