Counterfactual Explanation Policies in RL

Shripad V. Deshmukh, Srivatsan R, Supriti Vijay, Jayakumar Subramanian, Chirag Agarwal

arXiv:2307.13192·cs.AI·July 26, 2023

TL;DR

This paper introduces COUNTERPOL, a framework for generating counterfactual explanations of reinforcement learning policies. It improves interpretability by identifying the minimal change to a policy that produces a desired outcome, specified as a target return.

Contribution

COUNTERPOL is the first method to systematically analyze RL policies using counterfactual explanations, linking them to trust region optimization techniques.

Findings

01. Effective in explaining skill (un)learning across diverse environments
02. Produces minimal policy modifications for targeted outcomes
03. Demonstrates utility in multiple RL settings

Abstract

As Reinforcement Learning (RL) agents are increasingly employed in diverse decision-making problems using reward preferences, it becomes important to ensure that policies learned by these frameworks in mapping observations to a probability distribution of the possible actions are explainable. However, there is little to no work in the systematic understanding of these complex policies in a contrastive manner, i.e., what minimal changes to the policy would improve/worsen its performance to a desired level. In this work, we present COUNTERPOL, the first framework to analyze RL policies using counterfactual explanations in the form of minimal changes to the policy that lead to the desired outcome. We do so by incorporating counterfactuals in supervised learning in RL with the target outcome regulated using desired return. We establish a theoretical connection between COUNTERPOL and widely…
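The idea of a "minimal change to the policy that reaches a desired return" can be illustrated with a toy sketch. The code below is not the paper's algorithm. It assumes a one-step bandit with a softmax policy, measures policy change with KL divergence, and penalizes the squared gap between expected return and the target. The function names, the weight `k`, and the reward vector are all hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def expected_return(logits, rewards):
    """Expected one-step return of a softmax policy (toy bandit assumption)."""
    return float(softmax(logits) @ rewards)

def kl(p, q):
    """KL(p || q) for two strictly positive discrete distributions."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def counterfactual_policy(logits0, rewards, target, k=0.05, lr=0.5, steps=2000):
    """Gradient descent on (J(pi') - target)^2 + k * KL(pi0 || pi').

    Assumed objective for illustration: the first term pulls the return
    toward the target, the second keeps the new policy close to the original.
    """
    pi0 = softmax(logits0)
    logits = np.array(logits0, dtype=float)
    for _ in range(steps):
        pi = softmax(logits)
        J = pi @ rewards
        grad_J = pi * (rewards - J)   # dJ/dlogits for a softmax policy
        grad_kl = pi - pi0            # dKL(pi0 || pi)/dlogits
        logits -= lr * (2.0 * (J - target) * grad_J + k * grad_kl)
    return logits

# Hypothetical example: three actions, uniform starting policy.
rewards = np.array([0.0, 1.0, 2.0])
logits0 = np.zeros(3)
new_logits = counterfactual_policy(logits0, rewards, target=1.5)
print(expected_return(logits0, rewards), expected_return(new_logits, rewards))
print(kl(softmax(logits0), softmax(new_logits)))
```

Raising `k` in this sketch keeps the counterfactual policy closer to the original at the cost of a larger gap to the target return; lowering it does the opposite.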

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)

Methods: Counterfactual Explanations