"So, Tell Me About Your Policy...": Distillation of interpretable policies from Deep Reinforcement Learning agents

Giovanni Dispoto; Paolo Bonetti; Marcello Restelli

arXiv:2507.07848·cs.LG·July 30, 2025

Open Access

TL;DR

This paper introduces a novel algorithm that extracts interpretable policies from deep reinforcement learning agents, supported by theoretical guarantees, enabling understanding of complex policies without sacrificing performance.

Contribution

The paper presents a new method that derives interpretable policies from DRL agents using the advantage function. The interpretable policy is trained on previously collected experience, and the method comes with theoretical support and empirical validation.
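The summary above does not spell out the algorithm, so the following is only an illustrative sketch and not the authors' method. It assumes discrete actions and a table of logged advantage estimates, fits one linear score per action by ordinary least squares, and then acts greedily on those scores. The names `fit_linear_policy` and `act`, and the synthetic data, are hypothetical and chosen only for this example.

```python
import numpy as np

def fit_linear_policy(states, advantages):
    """Least-squares fit of per-action linear scores to advantage estimates."""
    # Append a constant column so each action also gets a bias term.
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    # coef has shape (n_features + 1, n_actions).
    coef, *_ = np.linalg.lstsq(X, advantages, rcond=None)
    return coef

def act(coef, states):
    """Greedy action under the fitted linear scores."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])
    return np.argmax(X @ coef, axis=1)

# Synthetic stand-in for previously collected experience (not data from the paper).
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 3))
q_values = states @ rng.normal(size=(3, 2)) + rng.normal(size=2)
# Advantage of each action relative to the greedy expert's state value.
advantages = q_values - q_values.max(axis=1, keepdims=True)

coef = fit_linear_policy(states, advantages)
agreement = np.mean(act(coef, states) == np.argmax(advantages, axis=1))
print("weights per action (last row is the bias):")
print(coef)
print("agreement with the expert's greedy action:", agreement)
```

In a sketch like this, each column of `coef` can be read directly as the weight every state feature carries for one action, which is the kind of inspection a linear policy allows. Real advantage estimates from a DRL agent would generally not be generated by a linear model, so agreement with the expert would typically be lower than on this synthetic data.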

Findings

1. Successfully extracts interpretable policies in control environments.
2. Effectively captures expert behavior in financial trading scenarios.
3. Provides theoretical guarantees for the extracted policies.

Abstract

Recent advances in Reinforcement Learning (RL) largely benefit from the inclusion of Deep Neural Networks, boosting the number of novel approaches proposed in the field of Deep Reinforcement Learning (DRL). These techniques demonstrate the ability to tackle complex games such as Atari, Go, and other real-world applications, including financial trading. Nevertheless, a significant challenge emerges from the lack of interpretability, particularly when attempting to comprehend the underlying patterns learned, the relative importance of the state features, and how they are integrated to generate the policy's output. For this reason, in mission-critical and real-world settings, it is often preferred to deploy a simpler and more interpretable algorithm, although at the cost of performance. In this paper, we propose a novel algorithm, supported by theoretical guarantees, that can extract an…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Stock Market Forecasting Methods