Explaining Reinforcement Learning with Shapley Values

Daniel Beechey; Thomas M. S. Smith; Özgür Şimşek

arXiv:2306.05810 · cs.LG · June 12, 2023 · 6 cites

TL;DR

This paper introduces SVERL, a principled framework that uses Shapley values to explain reinforcement learning agents. The analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning, and the resulting explanations of agent performance match and supplement human intuition in a variety of domains.

Contribution

The paper develops SVERL, a theoretical framework for explaining reinforcement learning with Shapley values, and applies it to explain agent performance in a variety of domains.

Findings

01. SVERL produces explanations that match and supplement human intuition.

02. The analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning.

03. SVERL uses Shapley values to explain agent performance.

Abstract

For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
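The abstract refers to Shapley values as a way of attributing the outcome of a cooperative game to its individual players. As a minimal sketch of that generic game-theoretic definition (not the SVERL framework itself, whose choice of players and value functions is given in the paper), the Python below computes exact Shapley values by averaging each player's marginal contribution over all coalitions of the other players. The function name `shapley_values` and the two-player toy game `toy_value` are hypothetical and exist only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player labels.
    value:   function mapping a frozenset of players to a real payoff.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of player i when joining coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


# Hypothetical toy game (an assumption for illustration, not from the paper).
def toy_value(coalition):
    payoffs = {
        frozenset(): 0.0,
        frozenset({"x"}): 1.0,
        frozenset({"y"}): 2.0,
        frozenset({"x", "y"}): 4.0,
    }
    return payoffs[coalition]


phi = shapley_values(["x", "y"], toy_value)
print(phi)  # {'x': 1.5, 'y': 2.5}
```

In this toy game the two attributions sum to 4.0, the payoff of the full coalition minus that of the empty one, which is the efficiency property of Shapley values. The exact computation enumerates every coalition, so its cost grows exponentially with the number of players; larger games usually require sampling-based approximations.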

Code & Models

Repositories

bath-reinforcement-learning-lab/sverl_icml_2023
none · Official

Videos

Explaining Reinforcement Learning with Shapley Values · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)