Using Part-based Representations for Explainable Deep Reinforcement Learning

Manos Kirtas; Konstantinos Tsampazis; Loukia Avramelou; Nikolaos Passalis; Anastasios Tefas

arXiv:2408.11455·cs.LG·August 23, 2024

Open Access

TL;DR

This paper introduces a non-negative training method for deep reinforcement learning that facilitates interpretable part-based representations, demonstrated on the Cartpole benchmark.

Contribution

It proposes a novel non-negative initialization and sign-preserving training approach to improve interpretability in RL models with part-based representations.

Findings

01. Enhanced interpretability of RL models through part-based representations.

02. Improved training stability and convergence with the proposed method.

03. Successful application on the Cartpole benchmark.

Abstract

Utilizing deep learning models to learn part-based representations holds significant potential for interpretable-by-design approaches, as these models incorporate latent causes obtained from feature representations through simple addition. However, training a part-based learning model presents challenges, particularly in enforcing non-negative constraints on the model's parameters, which can result in training difficulties such as instability and convergence issues. Moreover, applying such approaches in Deep Reinforcement Learning (RL) is even more demanding due to the inherent instabilities that impact many optimization methods. In this paper, we propose a non-negative training approach for actor models in RL, enabling the extraction of part-based representations that enhance interpretability while adhering to non-negative constraints. To this end, we employ a non-negative…
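
The abstract is cut off before it describes the method, so the following is only a minimal sketch of the general idea of non-negative, additive (part-based) models, not the authors' algorithm. It assumes a single linear actor layer, non-negative input features, an absolute-value Gaussian initializer, and a projected gradient step that clips weights at zero. The function names `init_nonnegative`, `projected_step`, and `actor_logits` are hypothetical and chosen for illustration; the 4-by-2 shape mirrors Cartpole's four observations and two actions.

```python
import numpy as np

def init_nonnegative(shape, rng, scale=0.1):
    """Hypothetical initializer: absolute value of Gaussian draws, so all weights start >= 0."""
    return np.abs(rng.normal(0.0, scale, size=shape))

def projected_step(w, grad, lr=0.01):
    """Gradient step followed by clipping at zero (projection onto the non-negative orthant).
    This is a generic stand-in, not the paper's sign-preserving update."""
    return np.maximum(w - lr * grad, 0.0)

def actor_logits(features, w):
    """With non-negative features and weights, each logit is a sum of non-negative parts."""
    return features @ w

rng = np.random.default_rng(0)
w = init_nonnegative((4, 2), rng)            # 4 input features, 2 actions
features = np.abs(rng.normal(size=(1, 4)))   # assumption: features are already non-negative
grad = rng.normal(size=w.shape)              # placeholder gradient, for illustration only

w_new = projected_step(w, grad, lr=0.5)

# Per-feature additive contribution to each action logit, shape (4, 2).
contributions = features.T * w_new
logits = actor_logits(features, w_new)
```

Because every entry of `contributions` is non-negative and each column sums to the corresponding logit, a logit can be read as a simple addition of parts, which is the property the abstract ties to interpretability.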

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare