IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness
Pedro Sequeira, Melinda Gervasio

TL;DR
IxDRL is an explainable deep reinforcement learning (xDRL) framework that analyzes the interestingness of an agent's behavior to assess its competence, offering insights into behavior patterns and limitations for improved human-agent collaboration.
Contribution
The paper presents a novel framework for explainable deep RL based on interestingness analysis. It supports a range of RL algorithms and gives human operators a holistic view of agent competence.
Findings
Effective identification of agent behavior patterns
Capability to determine competence-controlling conditions
Insights into agent strengths and limitations
Abstract
In recent years, advances in deep learning have resulted in a plethora of successes in the use of reinforcement learning (RL) to solve complex sequential decision tasks with high-dimensional inputs. However, existing RL-based systems are essentially competency-unaware: they lack the interpretation mechanisms needed to give human operators an insightful, holistic view of their competence. This impedes their adoption, particularly in critical applications where the decisions an agent makes can have significant consequences. Towards more explainable Deep RL (xDRL), we propose a new framework based on analyses of interestingness. Our tool provides various measures of RL agent competence stemming from interestingness analysis and is applicable to a…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics
