Optimal Decision Tree Policies for Markov Decision Processes

Daniël Vos; Sicco Verwer

arXiv:2301.13185·cs.AI·February 15, 2024

TL;DR

This paper introduces OMDTs, a method that uses Mixed-Integer Linear Programming to directly optimize size-limited decision trees for Markov Decision Processes, yielding interpretable policies that are near-optimal.

Contribution

It proposes OMDTs, the first approach to directly maximize expected return of decision trees in MDPs under size constraints, addressing limitations of imitation learning.

Findings

1. OMDTs often outperform imitation learning in policy optimality.

2. OMDTs limited to depth 3 perform close to optimal.

3. Imitation learning struggles when a complex policy must fit in a size-limited tree.

Abstract

Interpretability of reinforcement learning policies is essential for many real-world tasks, but learning such interpretable policies is a hard problem. In particular, rule-based policies such as decision trees and rule lists are difficult to optimize due to their non-differentiability. While existing techniques can learn verifiable decision tree policies, there is no guarantee that the learners generate a decision tree that performs optimally. In this work, we study the optimization of size-limited decision trees for Markov Decision Processes (MDPs) and propose OMDTs: Optimal MDP Decision Trees. Given a user-defined size limit and MDP formulation, OMDT directly maximizes the expected discounted return for the decision tree using Mixed-Integer Linear Programming. By training optimal decision tree policies for different MDPs, we empirically study the optimality gap for existing imitation learning…
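
To make the objective concrete, the following is a minimal Python sketch of what "directly maximizing the expected discounted return of a size-limited tree" means. It is not the authors' Mixed-Integer Linear Programming formulation: it swaps the MILP for an exhaustive search over depth-1 trees, which is only feasible on toy problems. The function names (policy_return, best_stump) and the five-state chain MDP are hypothetical and chosen purely for illustration.

```python
import itertools
import numpy as np

def policy_return(P, R, policy, gamma, p0):
    """Exact expected discounted return of a deterministic policy.

    P[s, a, s2] are transition probabilities, R[s, a, s2] are rewards,
    policy[s] is the action index chosen in state s, p0 is the start distribution.
    """
    n = P.shape[0]
    idx = np.arange(n)
    P_pi = P[idx, policy]                       # (n, n) transitions under the policy
    r_pi = (P_pi * R[idx, policy]).sum(axis=1)  # expected immediate reward per state
    # Solve the Bellman equations v = r_pi + gamma * P_pi v as a linear system.
    v = np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)
    return float(p0 @ v)

def best_stump(P, R, X, gamma, p0):
    """Exhaustive search over depth-1 trees (assumption: stands in for the MILP).

    Each candidate tree reads: if X[s, f] <= t then a_left else a_right.
    """
    n_actions = P.shape[1]
    best_ret, best_tree = -np.inf, None
    for f in range(X.shape[1]):
        values = np.unique(X[:, f])
        thresholds = (values[:-1] + values[1:]) / 2  # midpoints between feature values
        for t in thresholds:
            go_left = X[:, f] <= t
            for a_l, a_r in itertools.product(range(n_actions), repeat=2):
                policy = np.where(go_left, a_l, a_r)
                ret = policy_return(P, R, policy, gamma, p0)
                if ret > best_ret:
                    best_ret, best_tree = ret, (f, float(t), a_l, a_r)
    return best_ret, best_tree

# Hypothetical toy MDP: a 5-state chain with an absorbing goal in the middle.
# Action 0 moves left, action 1 moves right; entering the goal pays reward 1.
n_states, n_actions, goal = 5, 2, 2
P = np.zeros((n_states, n_actions, n_states))
R = np.zeros_like(P)
for s in range(n_states):
    for a, step in enumerate((-1, +1)):
        nxt = goal if s == goal else min(max(s + step, 0), n_states - 1)
        P[s, a, nxt] = 1.0
        if s != goal and nxt == goal:
            R[s, a, nxt] = 1.0

X = np.arange(n_states, dtype=float).reshape(-1, 1)  # one feature: position on the chain
p0 = np.full(n_states, 1.0 / n_states)               # uniform start distribution

ret, stump = best_stump(P, R, X, gamma=0.9, p0=p0)
print(ret, stump)
```

In this toy setting a single split on position suffices, because states left of the goal should move right and states right of it should move left. The search scores every candidate tree by its return in the MDP itself rather than by how well it mimics a teacher policy, which is the distinction from imitation learning that the abstract draws.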

Code & Models

Repositories

tudelft-cda-lab/omdt (official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning