HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning

Michael T. Lash

arXiv:2206.01343 · cs.LG · June 6, 2022

TL;DR

HEX is a human-in-the-loop deep reinforcement learning framework for machine learning explainability that explicitly accounts for the model's decision boundary and is designed to work with limited or federated data.

Contribution

The paper presents HEX, an approach that synthesizes explanation-providing policies that take the decision boundary into account, targeting limited-data settings and decision-maker distrust in machine learning explainability (MLX).

Findings

01 Effective in limited-data scenarios

02 Captures the decision boundary explicitly

03 Operates in federated learning settings

Abstract

The use of machine learning (ML) models in decision-making contexts, particularly high-stakes ones, is fraught with issues and peril, since a person, not a machine, must ultimately be held accountable for the consequences of decisions made using such systems. Machine learning explainability (MLX) promises to provide decision-makers with prediction-specific rationale, assuring them that the model-elicited predictions are made for the right reasons and are thus reliable. Few works explicitly consider this key human-in-the-loop (HITL) component, however. In this work we propose HEX, a human-in-the-loop deep reinforcement learning approach to MLX. HEX incorporates 0-distrust projection to synthesize decider-specific explanation-providing policies from an arbitrary classification model. HEX is also constructed to operate in limited or reduced training data…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning