Margin-distancing for safe model explanation

Tom Yan; Chicheng Zhang

arXiv:2202.11266 · cs.LG · February 24, 2022

TL;DR

This paper studies the tension between making machine learning models transparent and leaving them vulnerable to gaming. It proposes a formulation that traces gaming to points near the decision boundary and empirically evaluates the resulting tradeoff.

Contribution

The paper introduces a formal framework for the transparency-vulnerability tradeoff and investigates example-based explanation methods that limit gaming near the decision boundary.

Findings

01 Identifies points close to the model's decision boundary as the source of gaming.

02 Investigates example-based explanations that are expansive yet leave the labels of boundary points sufficiently uncertain.

03 Empirically examines the tradeoff on real-world datasets.

Abstract

The growing use of machine learning models in consequential settings has highlighted an important and seemingly irreconcilable tension between transparency and vulnerability to gaming. While this has sparked sizable debate in the legal literature, there has been comparatively little technical study of the tension. In this work, we propose a clean-cut formulation of this tension and a way to make the tradeoff between transparency and gaming. We identify the source of gaming as points close to the decision boundary of the model, and we initiate an investigation into how to provide example-based explanations that are expansive yet consistent with a version space that is sufficiently uncertain with respect to the boundary points' labels. Finally, we complement our theoretical results with empirical investigations of this tradeoff on real-world datasets.
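
The idea in the abstract can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a one-dimensional threshold classifier as the hypothesis class, and every name in it (`label`, `margin_distanced_examples`, `version_space`, `is_uncertain`) is hypothetical. The sketch releases labeled examples only when they lie at least `margin` away from the decision boundary, then checks that the version space (the set of thresholds consistent with the released examples) still disagrees on the labels of points near the boundary.

```python
from typing import List, Tuple

Example = Tuple[float, int]


def label(x: float, threshold: float) -> int:
    """Toy model (assumption): predict 1 when x is at or above the threshold."""
    return 1 if x >= threshold else 0


def margin_distanced_examples(
    points: List[float], threshold: float, margin: float
) -> List[Example]:
    """Release only labeled points at least `margin` away from the boundary."""
    return [
        (x, label(x, threshold))
        for x in points
        if abs(x - threshold) >= margin
    ]


def version_space(
    examples: List[Example], lo: float = 0.0, hi: float = 1.0
) -> Tuple[float, float]:
    """Return (low, high): thresholds t with low < t <= high fit all examples."""
    negatives = [x for x, y in examples if y == 0]
    positives = [x for x, y in examples if y == 1]
    return max(negatives, default=lo), min(positives, default=hi)


def is_uncertain(x: float, vs: Tuple[float, float]) -> bool:
    """True when consistent thresholds disagree on the label of x."""
    low, high = vs
    return low < x < high


# Hypothetical data: the true threshold is 0.5.
points = [0.1, 0.3, 0.45, 0.48, 0.52, 0.65, 0.9]

# Margin-distanced release keeps the near-boundary labels ambiguous.
released = margin_distanced_examples(points, threshold=0.5, margin=0.1)
safe_vs = version_space(released)

# Releasing everything (margin 0) pins the boundary down much more tightly.
full_vs = version_space(margin_distanced_examples(points, 0.5, margin=0.0))

print(released)
print(safe_vs, full_vs)
```

In this toy setting, a larger `margin` withholds more examples (less transparency) but widens the interval of consistent thresholds, so an agent near the boundary cannot infer from the explanation alone which side it falls on.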

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data