Models That Are Interpretable But Not Transparent

Chudi Zhong; Panyu Chen; Cynthia Rudin

arXiv:2502.19502·cs.LG·February 28, 2025

TL;DR

This paper introduces FaithfulDefense, a method for creating interpretable models that provide fully faithful explanations while minimizing the exposure of the model's decision boundary to protect proprietary information.

Contribution

FaithfulDefense offers a novel approach using set cover formulations and submodularity to generate faithful explanations without fully revealing the model's decision boundary.
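To make the two named ingredients concrete, here is a minimal sketch of the classic greedy heuristic for set cover, together with a check of the diminishing-returns (submodular) property of the coverage function. This is the generic textbook technique, not the FaithfulDefense algorithm itself; the function names, the toy universe, and the subsets `a` to `d` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Hashable, List, Set


def coverage(chosen: List[str], subsets: Dict[str, Set[Hashable]]) -> int:
    """Number of distinct elements covered by the chosen subsets."""
    covered: Set[Hashable] = set()
    for name in chosen:
        covered |= subsets[name]
    return len(covered)


def greedy_set_cover(
    universe: Set[Hashable], subsets: Dict[str, Set[Hashable]]
) -> List[str]:
    """Repeatedly pick the subset that covers the most still-uncovered elements."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Iterating in sorted order makes ties deterministic (alphabetical).
        best = max(sorted(subsets), key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Hypothetical toy instance (illustrative data only).
universe = {1, 2, 3, 4, 5, 6}
subsets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}

print(greedy_set_cover(universe, subsets))  # ['a', 'c']

# Diminishing returns: adding "b" helps less once "a" is already chosen.
gain_alone = coverage(["b"], subsets) - coverage([], subsets)
gain_after_a = coverage(["a", "b"], subsets) - coverage(["a"], subsets)
print(gain_alone, gain_after_a)  # 2 1
```

The coverage function is monotone and submodular, which is why greedy selection is the standard approximation strategy for problems of this shape. How the paper maps explanations and decision-boundary exposure onto sets and elements is not specified in this summary.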

Findings

1. FaithfulDefense achieves fully faithful explanations.

2. The method balances interpretability against model protection: explanations stay faithful while exposing as little of the decision boundary as possible.

3. It employs set cover and submodularity techniques.

Abstract

Faithful explanations are essential for machine learning models in high-stakes applications. Inherently interpretable models are well-suited for these applications because they naturally provide faithful explanations by revealing their decision logic. However, model designers often need to keep these models proprietary to maintain their value. This creates a tension: we need models that are interpretable, allowing human decision-makers to understand and justify predictions, but not transparent, so that attackers cannot easily replicate the model's decision boundary. Shielding the decision boundary is particularly challenging alongside the requirement of completely faithful explanations, since such explanations reveal the true logic of the model for an entire subspace around each query point. This work provides an approach, FaithfulDefense, that creates model explanations for…

Code & Models

Repositories

chudizhong/faithfuldefense (official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Graph Neural Networks

Methods: Sparse Evolutionary Training