Continuous Action Reinforcement Learning from a Mixture of Interpretable Experts

Riad Akrour; Davide Tateo; Jan Peters

arXiv:2006.05911 · cs.LG · November 19, 2021

TL;DR

This paper introduces a reinforcement learning method that combines complex value functions with a transparent, hierarchical policy structure based on interpretable experts, enabling effective learning while maintaining human interpretability.

Contribution

The paper proposes a novel policy iteration scheme that integrates interpretable experts with non-differentiable prototype selection for continuous action RL.

Findings

01. Achieves competitive performance on continuous action benchmarks.

02. Produces policies more transparent and interpretable than neural network policies.

03. Maintains high performance while enhancing policy interpretability.

Abstract

Reinforcement learning (RL) has demonstrated its ability to solve high dimensional tasks by leveraging non-linear function approximators. However, these successes are mostly achieved by 'black-box' policies in simulated domains. When deploying RL to the real world, several concerns regarding the use of a 'black-box' policy might be raised. In order to make the learned policies more transparent, we propose in this paper a policy iteration scheme that retains a complex function approximator for its internal value predictions but constrains the policy to have a concise, hierarchical, and human-readable structure, based on a mixture of interpretable experts. Each expert selects a primitive action according to a distance to a prototypical state. A key design decision to keep such experts interpretable is to select the prototypical states from trajectory data. The main technical contribution…
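
The idea of experts that act according to a distance to a prototypical state can be illustrated with a small sketch. This is not the authors' implementation: the squared Euclidean distance, the softmax weighting, the `temperature` parameter, and the function names `expert_weights` and `mixture_action` are all assumptions made for illustration. Each expert is a pair of one prototype state and one primitive action, and the policy output is the weighted blend of those actions.

```python
import math


def expert_weights(state, prototypes, temperature=1.0):
    """Return one weight per expert (hypothetical helper).

    Assumption: weights are a softmax over negative squared Euclidean
    distances between the state and each prototype state.
    """
    scores = [
        -sum((s - p) ** 2 for s, p in zip(state, proto)) / temperature
        for proto in prototypes
    ]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_action(state, prototypes, actions, temperature=1.0):
    """Blend each expert's primitive action by its weight (hypothetical helper)."""
    weights = expert_weights(state, prototypes, temperature)
    dim = len(actions[0])
    return [sum(w * a[d] for w, a in zip(weights, actions)) for d in range(dim)]


# Two illustrative experts: in a setup like the one described above, the
# prototypes would be states picked from trajectory data; here they are made up.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
actions = [[-1.0], [1.0]]

near_first = mixture_action([0.1, 0.0], prototypes, actions)
midpoint = mixture_action([0.5, 0.5], prototypes, actions)
```

Because every expert is tied to a concrete state, a reader can inspect the policy by looking at the prototype list and the action attached to each entry. A state close to the first prototype yields an action close to that expert's action, and a state equidistant from both yields an even blend.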

Code & Models

Repositories

akrouriad/tpami_metricrl (PyTorch, official)
