Policy Optimization with Sparse Global Contrastive Explanations
Jiayu Yao, Sonali Parbhoo, Weiwei Pan, Finale Doshi-Velez

TL;DR
This paper introduces a reinforcement learning framework that optimizes policies by making minimal, interpretable changes constrained by sparse, global contrastive explanations, demonstrated in discrete and continuous domains.
Contribution
The paper presents an RL approach that constrains policy improvement so that the difference between the original and the proposed policy admits a sparse, global contrastive explanation, keeping each update small and interpretable.
Findings
Effective policy improvements with minimal changes
Demonstrated in discrete MDP and continuous navigation domains
Framework enhances interpretability of policy updates
Abstract
We develop a Reinforcement Learning (RL) framework for improving an existing behavior policy via sparse, user-interpretable changes. Our goal is to make minimal changes while gaining as much benefit as possible. We define a minimal change as having a sparse, global contrastive explanation between the original and proposed policy. We improve the current policy with the constraint of keeping that global contrastive explanation short. We demonstrate our framework with a discrete MDP and a continuous 2D navigation domain.
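To make the idea of "improving a policy under a short contrastive explanation" concrete, here is a minimal sketch on a toy problem. It is not the authors' algorithm: it assumes a hypothetical five-state chain MDP with deterministic moves, measures explanation length as the number of states whose action changes, and uses a simple greedy search as a stand-in for the paper's optimization. All names (`step`, `evaluate`, `sparse_improve`, `max_changes`) are illustrative assumptions.

```python
# Toy sketch (assumed setup, not the paper's method): greedy policy
# improvement where at most `max_changes` states may change their action.

def step(state, action, n_states):
    """Deterministic chain dynamics: action 1 moves right, action 0 moves left."""
    if action == 1:
        return min(state + 1, n_states - 1)
    return max(state - 1, 0)


def evaluate(policy, n_states, gamma=0.9, sweeps=200):
    """Average discounted return over a uniform start state.

    The last state is terminal with value 0; entering it yields reward 1.
    """
    goal = n_states - 1
    values = [0.0] * n_states
    for _ in range(sweeps):
        for s in range(goal):
            nxt = step(s, policy[s], n_states)
            reward = 1.0 if nxt == goal else 0.0
            values[s] = reward + gamma * values[nxt]
    return sum(values) / n_states


def sparse_improve(base_policy, n_states, max_changes):
    """Greedily apply the single-state edit with the largest gain, up to a budget.

    Returns the new policy and a contrastive explanation: a list of
    (state, old_action, new_action) triples describing every change.
    """
    policy = list(base_policy)
    explanation = []
    for _ in range(max_changes):
        current = evaluate(policy, n_states)
        best_gain, best_edit = 1e-9, None
        for s in range(n_states - 1):          # terminal state needs no action
            if policy[s] != base_policy[s]:
                continue                        # this state was already edited
            for a in (0, 1):
                if a == policy[s]:
                    continue
                candidate = list(policy)
                candidate[s] = a
                gain = evaluate(candidate, n_states) - current
                if gain > best_gain:
                    best_gain, best_edit = gain, (s, a)
        if best_edit is None:
            break                               # no single edit helps any more
        s, a = best_edit
        explanation.append((s, base_policy[s], a))
        policy[s] = a
    return policy, explanation


base = [0, 0, 0, 0, 0]                          # behavior policy: always move left
new_policy, explanation = sparse_improve(base, n_states=5, max_changes=2)
print(new_policy)    # [0, 0, 1, 1, 0]
print(explanation)   # [(3, 0, 1), (2, 0, 1)]
```

With a budget of two changes, the sketch flips the two states nearest the goal, and the returned list of triples plays the role of the contrastive explanation: it states exactly where the proposed policy departs from the original. Raising `max_changes` trades a longer explanation for a higher return, which is the tension the abstract describes.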
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Data Stream Mining Techniques
