Explainable Deep Reinforcement Learning: State of the Art and Challenges

George A. Vouros

arXiv:2301.09937·cs.LG·January 25, 2023

TL;DR

This paper reviews current methods for making deep reinforcement learning more explainable, addressing the need for transparency and trust in critical real-world applications, and discusses challenges and future directions.

Contribution

The paper provides a formal framework for explainable deep reinforcement learning and categorizes existing methods, highlighting open challenges and research gaps.

Findings

01 Categorization of explainable DRL methods by paradigm and explanation surface

02 Identification of key components for a general explainable DRL framework

03 Discussion of open challenges and future research directions

Abstract

Interpretability, explainability and transparency are key issues in introducing Artificial Intelligence methods into many critical domains. They matter because of ethical concerns and trust issues strongly connected to reliability, robustness, auditability and fairness, and they have important consequences for keeping the human in the loop at high levels of automation, especially in critical decision-making cases where both the human and the machine play important roles. While the research community has given much attention to the explainability of closed (or black) prediction boxes, there is a pressing need for explainability of closed-box methods that enable agents to act autonomously in the real world. Reinforcement learning methods, and especially their deep versions, are such closed-box methods. In this article we aim to provide a review of state-of-the-art methods for explainable…
